On Mon, 19 Dec 2005 17:48:31 +0000, Matthew Toseland wrote:

> On Mon, Dec 19, 2005 at 02:22:36PM +0200, Jusa Saari wrote:
>> 
>> A simple solution is to have FProxy parse the HTML and identify links
>> (which it must do anyway to filter images loaded from the Web and
>> whatever) and add images and other page requisites to the download queue
>> without the browser needing to ask for each of them separately. It still
>> won't help for getting multiple pages at once, but at least getting an
>> image-heavy page becomes faster. It would also combat the effect where
>> all but the topmost images drop off the network because no one has the
>> patience to wait for them.
> 
> Wouldn't help much.

Would you please either be a bit more verbose than that, or not reply at
all? Anyone who already knew _why_ it won't help got nothing from your
answer, and anyone who didn't know (such as me) still doesn't, making your
answer useless to everyone and therefore a waste of both your and your
readers' time, as well as disk space and bandwidth on whatever server(s)
this list is stored on.

Now, the theory is that an image-heavy site is slow to load because the
browser only requests two items simultaneously, meaning that the high
latencies involved with each request add up; and FProxy making requests
for content it knows the browser will soon ask for helps, because that way
the content beyond the first few images will already be cached by the time
the browser gets there, eliminating any noticeable latency.

Please explain why this theory is wrong?
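
To make the theory concrete, here is a rough sketch, in Java since that is
what the node is written in, of what "FProxy queues the requisites itself"
could look like. It is purely illustrative: RequisitePrefetcher and
fetchKey() are names I made up, not real FProxy code, and a real
implementation would hook into the existing HTML filter rather than a
regex.

// Illustrative sketch only: prefetch page requisites in the proxy so they
// are cached before the browser asks for them.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RequisitePrefetcher {
    private static final Pattern IMG_SRC =
        Pattern.compile("<img[^>]+src=[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    // A small pool so prefetches run in the background instead of being
    // serialised behind the browser's two connections.
    private final ExecutorService queue = Executors.newFixedThreadPool(4);

    public void prefetch(String html) {
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            final String uri = m.group(1);
            queue.submit(() -> fetchKey(uri)); // warm the local cache
        }
    }

    private void fetchKey(String uri) {
        // placeholder: request the key so it is already cached when the
        // browser gets around to asking for it
    }
}

The point is only that the requisite fetches start as soon as the page
itself has been fetched, instead of trickling in over the browser's two
connections.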

The theory about the bitrot-combatting effect is directly linked to the
high latency and to the tendency of browsers to request the images on a
page in the order they appear in the page source. The user simply hits the
stop button (or the browser times out the page) before the bottom images
are loaded; because of this they are never requested, and consequently
fall off the network. FProxy automatically queuing the images for download
would mean that even the bottommost images are requested every time the
page is loaded, so they stay in the network for as long as the page itself
does.

Please explain why this theory is wrong?

>> > Now... here is a suggestion of how to change both of these.
>> > 
>> > Use the "Redirect" header of the HTTP protocol. A simple algorithm:
>> > 
>> > 1) For a connection C, check whether data is already being downloaded.
>> >       If not, start the download / add it to the queue.
>> > 2) If the data has not been retrieved within 30 seconds, redirect to the
>> >       same URL and close the socket. Continue fetching the data in the
>> >       background.
>> > 
>> > Now, let's assume that the web browser uses a maximum of one connection
>> > and round-robin to alternate between requests. Then only one
>> > connection to FProxy is used at once and all images get an equal
>> > amount of time.
> 
> IMHO adding a suitable time-refresh header is very sensible; in theory it
> ought to get browsers to retry non-iframed images automatically.
>> > 
>> > If, however, the web browser just queues the requests and hammers away
>> > until the first ones are done or it gives up, the difference will be
>> > that new connections for new retries will be made every 30 seconds.
>> > So: no great loss.
>> > 
>> > The main downside is that the browser may never finish loading the
>> > data if the key is lost forever. How should this be solved? A retry
>> > counter appended to the URL the redirection goes to, so that after a
>> > set number of retries the status page is shown instead?
> 
> That's how we do it now.
>> > 
>> > Also, for HTML pages, it's better to use the status page (but with
>> > more information). Some kind of auto-sensing maybe? But this is a
>> > pain for download managers.


