On Jul 21, 2007, at 11:58, Joakim Krassman wrote:

> Is there anyone that do have a code snippet that actually tells me the
> exactly weight of all the content that the httpclient downloaded?

There is no way for any client to know the total weight of all content 
without requesting each piece individually.  As Arno said in his 
response, you can send a HEAD request for each item and check the 
Content-Length header returned (if the server supplies it) without 
having to actually download everything.  You _do_ need to download the 
main document first in order to parse it.
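
To make the HEAD idea concrete, here is a minimal sketch in Python using 
the standard urllib module (not ICS; the function names are my own 
invention).  It assumes the server answers HEAD at all, and it returns 
None when no usable Content-Length comes back, which does happen, for 
instance with chunked responses.

```python
import urllib.request
from typing import Mapping, Optional


def parse_content_length(headers: Mapping[str, str]) -> Optional[int]:
    """Return Content-Length as an int, or None if absent or malformed."""
    value = headers.get("Content-Length")
    if value is None:
        return None
    try:
        length = int(value)
    except ValueError:
        return None
    # A negative length is meaningless, so treat it as "unknown".
    return length if length >= 0 else None


def head_content_length(url: str, timeout: float = 10.0) -> Optional[int]:
    """Send a HEAD request and report the advertised body size, if any."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_content_length(response.headers)
```

Calling head_content_length() on a resource URL gives you its advertised 
size without transferring the body; a None result means you would have to 
GET the item to learn its real size.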

This is how the HTTP protocol works; the fact that Web browsers make it 
look like a single, seamless request and operation is mostly 
behind-the-scenes "magic".  If you pay close attention, particularly on 
a slow connection, you'll see the "Progress Bar" in your browser 
stretch and contract constantly as it updates the "total size" when it 
finds images and other media while parsing the document (that's why 
the text is usually rendered before the images start loading).

The HttpCli component in ICS does not provide this feature 
automatically, but it shouldn't be too hard to implement.  For parsing 
the document and extracting all URLs for images, CSS, JS, and other 
external resources, you can use any one of the many open source HTML 
parsers out there (I believe the JEDI project even has a couple).  
Implementing your own parser shouldn't be that hard either: recall that 
you are not interested in understanding all the layout information; you 
are only looking for things that look like <img src=...> and the like.
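
As a rough illustration of how little parsing is needed, here is a 
sketch using Python's standard html.parser module.  The class and 
function names are hypothetical, and it only looks at <img src>, 
<script src> and <link rel="stylesheet" href>; a real page can pull in 
content in other ways (CSS url(...), frames, scripts) that this ignores.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collect absolute URLs of images, scripts and stylesheets."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("img", "script"):
            target = attributes.get("src")
        elif tag == "link":
            # Only stylesheet links are counted; rel may hold several tokens.
            rel = (attributes.get("rel") or "").lower().split()
            target = attributes.get("href") if "stylesheet" in rel else None
        else:
            target = None
        if target:
            url = urljoin(self.base_url, target)  # resolve relative paths
            if url not in self.resources:         # skip duplicates
                self.resources.append(url)


def extract_resources(html, base_url):
    """Return the resource URLs found in html, in document order."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.resources
```

Duplicates are dropped on the assumption that a client fetches each URL 
only once, so a repeated image should not be counted twice.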

So, to recap, here's a simple algorithm for doing what you requested:
1. GET document;
2. TOTAL_BYTES := Content-Length of document;
3. ResourceList := extract_resources(document);
4. For i := 0 To (ResourceList.Count - 1)
        4.a. HEAD ResourceList[i];
        4.b. TOTAL_BYTES := TOTAL_BYTES + Content-Length of ResourceList[i];
5. Next;
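
The same steps, sketched in Python rather than Delphi.  The three 
callables passed in are placeholders of my own (not ICS APIs) standing 
for "GET the document", "extract its resource URLs" and "HEAD one 
resource", so that the loop itself can be shown without any network 
code.  Resources whose size is unknown are reported separately instead 
of being silently counted as zero.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def total_page_weight(
    url: str,
    get_document: Callable[[str], Tuple[str, int]],
    extract_resources: Callable[[str, str], Iterable[str]],
    head_size: Callable[[str], Optional[int]],
) -> Tuple[int, List[str]]:
    """Return (total bytes, resources with no advertised size)."""
    # Steps 1 and 2: fetch the main document and start with its size.
    document, total_bytes = get_document(url)
    unknown: List[str] = []
    # Steps 3 and 4: walk every resource the document refers to.
    for resource in extract_resources(document, url):
        size = head_size(resource)        # step 4.a: HEAD the resource
        if size is None:
            unknown.append(resource)      # no Content-Length was returned
        else:
            total_bytes += size           # step 4.b: add to the total
    return total_bytes, unknown
```

If the second value comes back non-empty, the total is only a lower 
bound; the remaining items would have to be downloaded to get an exact 
figure.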

        dZ.

-- 
        DZ-Jay [TeamICS]
        http://www.overbyte.be/eng/overbyte/teamics.html
