1st -- I want to make it quite clear that I am *not* a masochist
2nd -- I have cable modem so I don't have to worry about slow downloads
       from my end, nor tying up my phone line for hours.
3rd -- I did not spend the entire day glued to the computer screen.
       Today was big-time racing day ... After linking to the site with
       1.70r3 I watched the last 125 laps of the Indianapolis 500; I
       then switched to PBS and watched 2 presentations of the British
       version of Antiques Roadshow.  Somewhere between the two I made
       and ate some breakfast.  I also took the dog down to the park on
       the corner so that she could do what she will not do in "her"
       yard.
Next I tuned into something or other until the NASCAR CocaCola 600
started, and also did some major pruning on the weeping cherry tree I've
been nursing.  After the 600 started I got around to grabbing something
for a lunch snack, and took time out for a quick "dog trip" and to
hopefully seal the bark on one of the main trunks of the same weeping
cherry which was showing dead leaves at the top.  I watched some more of
the Coke 600, and periodically checked the website being downloaded. 
They have more problems than just page design, btw ... some of the
transfers occurred so quickly the status line was a blur, but the
majority of the transactions were sliced by their server into segments
ranging from 240 to 328 bytes -- that's bytes per slice, with long waits
between each slice.  After the sun had gone down in North Carolina and
the lights had come on at the race track, with well over 100 laps still
to go, it was another walk to the park and around the park until dog had
done all that was necessary.  Then back to the race on TV, and the
website images were still being downloaded by Arachne; the download had
set a new record a couple of hours before that. <G>  But I watched the
race to the end, got some dinner down, started on the NBC reprise of the
attack on Pearl Harbor, and waited .... and then waited some more. 
Then, after more than 6 hours of downloading, I said F-It!  1.70r3, of
course, did its "blank screen" act when I terminated the download. 
Attempts to get the page to load from disk were OK, but not the images. 
They started downloading again from scratch.  I checked the cache, and
there were 257 files.  That didn't sink in at the time.  I tried to get
1.70r3 to show me *any* of the pictures in the cache, and it was a no-go
for all half-dozen tries.

4th -- Only as I typed this did I realize part of what the problem is
with the combination of "that URL" + Arachne.  There were 1029 HTML
Atoms on the page alone; there were supposedly 257 pictures, but I
didn't count the blocks and I think Clarence is wrong there.  Read on
and see
why ...  Arachne is set to clear cache automatically whenever 256 items
are in it; that means if there are more than 256 items on a page to
download, things get deleted from the top down as each new thing is
added to the bottom.  And that means that every time Arachne is
successful in downloading a few files, and goes to verify images, it
discovers that a bunch of images at the front of the list aren't there
in cache!  So it downloads those missing items to cache, and in doing so
erases an equal number of items off the top of the list again... and so
it goes.  And that is why after more than 6 hours of constant
downloading, I never *did* get the page downloaded.  And that is why I
think that there are more than 257 pictures on the site ... it's just
that only 256 of anything plus the index file can be kept in the cache,
so that is all that can ever be found in the cache, regardless of how
long one has allowed Arachne to download. :<
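
For anybody who wants to see the arithmetic, here is a rough sketch in
Python (purely an illustration; the names and the oldest-first eviction
rule are my guesses, not Arachne's actual code) of why a 257-image page
can never finish against a 256-item cache:

    # A minimal sketch, assuming oldest-first eviction at a 256-item
    # cap and a verify pass that re-fetches whatever is missing.
    # All names are invented for illustration, not taken from Arachne.
    CACHE_LIMIT = 256

    def fetch_page(image_count):
        cache = []          # oldest entries at the front of the list
        downloads = 0
        while True:
            missing = [i for i in range(image_count) if i not in cache]
            if not missing:
                return downloads            # page complete
            for item in missing:
                cache.append(item)          # added to the bottom ...
                downloads += 1
                if len(cache) > CACHE_LIMIT:
                    cache.pop(0)            # ... evicted from the top
            if downloads > 10 * image_count:
                return downloads            # give up; it never ends

    # 257 images against a 256-item cache: every verify pass finds that
    # the previous pass just evicted something.
    print(fetch_page(257))

With 256 images or fewer it finishes on the first pass; at 257, every
verify pass finds at least one item missing, and each re-fetch evicts
another, round and round forever.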

5th -- I'm glad I'm not a drinker, or I would have gone through *at
least* a fifth as the eternal download progressed today.

Now I have some SUGGESTIONS!  First, would the closest person in the
Czech Republic to that server and/or that domain office please go visit
them and blow them away for their absolute stupidity!  Second, and most
MOST important:  Arachne needs to have some sort of routine added where
it stops downloading from a page when the max number of cache items is
reached.  I cannot think of a decent way to do it.  If I've been to more
than one site, the cache will have bunches of stuff in it, and I'd hate
to have a download of a site stop simply because the cache is full.  And
most of us would not want the cache to automatically clear each time we
go to a different URL ... that could be terrible.  My only hope is that
the "HTML ATOMS" number means the number of 'pieces' or 'files' which
will have to be downloaded to render the page; if that is the case, then
Arachne could be "taught" to download only 256 items on any page where
the HTML Atoms number is in excess of 256. If there could be a warning
issued prior to starting the download, something like "this page has too
much crap and cannot be completely downloaded" would be nice.  If
Arachne could be additionally "trained" to take instances like that and
accept some "N"ext hotkey or whatever, and then download as many more
files as cache can hold, we could at least see a page like that in "bits
and pieces" ...

I don't feel like cranking up NetScrape to see if it could handle that
page; I get the feeling it probably could, or at least could until I
ran out of drive space, since Windows software has never learned that
limits are necessary and good things.

And thus ends today's saga of Arachne in SW Ohio...

l.d.
====

On Sat, 26 May 2001 04:34:38 -0400, Clarence Verge wrote:

> Hello All;
> Some totally amazing stuff is coming out of various points in czland
> but web page design isn't part of it. :((

> I stumbled onto this collection of images from the International Space
> Station group.   There are at least five real beauties here, although
> they are cropped versions of the originals - which I understand are
> about four times the size in both X and Y directions.

> It's hard to believe that the wacko(s) who assembled the collection
> are dumb enuff to think everyone has a T1 connection, but it must be
> true - THEY PUT TWO HUNDRED & FIFTY-SEVEN IMAGES ON THE PAGE !!

> If you dare to visit:
> http://iwebs.upol.cz/iss/galerie_sts98.htm
> even *WITH* a T1 connection, don't expect much from Arachne.

> She will happily waste your time downloading the thumbnails, and then
> just sit there and do NOTHING when finished. I suppose that's better
> than crashing. If you surf your cache and ask for an image gallery,
> she will have to be poked a few times to start, and then will convert
> only a small part of the images in cache.

> Arachne 1.62 will convert (on my P90) 98 out of 257, present red boxes
> for about 100 more (not counted) and then fill the rest of the page with
> the message "Too many images converted" for each of the remaining files.
> At this point only about 25% of my available TEMP space on Ramdisk has
> been used.

> Think I should use a later version ?
> I went straight to A1.70r3. I found it much more difficult to prod her
> into doing it, but when activated this version would only convert 47
> images. Switched to A1.66. I found this version to be a little easier
> to bring to life, but I only got 47 images out of 257 here also.
> Back to A1.62 (with the .BMPs in TEMP cleared). A recheck confirmed
> that A1.62 would do 98 of 257, and only required a manual conversion
> of one file to teach her what to do.

> Seems A1.62 is the best in this area, but I've run into several pages
> lately while on Hollywood patrol with more than 98 images on them.

> *

> Here's a little comparison for you:

> Using Netscape 2.02 on a 33Mhz '486 (NO ramdisk) that page downloaded
> in 8 min 36 sec @ 31200 and the screen displayed IMMEDIATELY although
> NS cranked the disk for about another minute while it created all the
> off-screen images.

> Arachne on my *P90* with the same ISP and at the same baudrate took
> 9 min 21 sec to finish downloading the images and deciding to do no
> more. A later experiment with a shortened page indicates she would
> have taken almost two minutes more to display the page if she had
> been up to it.

> As a matter of interest, she will get the page with images OFF
> in only 17 seconds.
> I've had a little luck downloading with images off, editing the page
> into thirds and then selecting one image for display.  Arachne will
> download the shortened list of images after displaying that image.

> - Clarence Verge
> - Back to using Arachne V1.62 ....

-- Arachne V1.70;rev.3, NON-COMMERCIAL copy, http://arachne.cz/
