At 08:20 PM 9/29/2002 -0400, David A. Desrosiers wrote:
>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
> > David, what did you see? What I saw on my system was that it held the
> > pluck to the host, which is what I would expect given that --stayonhost is
> > more rigid than --stayondomain. Other than that, it functioned normally.
>
>        Look even closer. images.slashdot.org is not fetched (though still
>is part of the --stayondomain construct of slashdot.ort), when --stayonhost
>is used. If you omit --stayonhost, you will properly get all the pages from
>images.slashdot.org as well as slashdot.org content.

Yes, this is precisely what I expected. "images.slashdot.org" is not the
host "slashdot.org".

Perhaps I'm not understanding... when I do --stayonhost without
--stayondomain, specifying "http://slashdot.org", I do not get
"images.slashdot.org". This is pre-existing behavior.

If I do --stayondomain without --stayonhost, I do get
"images.slashdot.org", which is what I figured was the desired behavior.

If I do both, I've told it to stay not just on the "slashdot.org" domain,
but on the "slashdot.org" host, correct? What behavior would you expect
under that circumstance? I'm willing to change it, but intuitively it seems
to me that --stayonhost should trump --stayondomain, as it currently does.

> "I don't see why you can't simply load http://slashdot.org/palm/.."
>
>        So basically, don't pound the main page. Use the lean page, always,
>or use the RSS content they make available (or use the slashpluck script,
>which does the same thing).

I haven't been, except for a few tests. I pound my own website, about three
meters from me.

> > I just overlooked that possibility because it's a radio button (i.e.
> > mutually exclusive) in Desktop.
>
>        Bleh, radio button. My shell script doesn't include radio buttons,
>so remember that what you might use as your interface, may not be the same
>that others use for theirs. Your changes affect more than your GUI
>interface. Great work overall, keep it up.

Like I said, I'm happy to put an exclusion in there... I'm just not clear
on how the current behavior differs from what you'd expect.

How's this for a solution... if both are specified, I'll print out a
warning that --stayonhost overrides --stayondomain and spidering will be
limited accordingly. Sound good?

Regards
Tony McNamara
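
To make the precedence concrete, here is a rough sketch of the rule as described above. It is not the actual Plucker spider code: the function names (`registered_domain`, `override_warning`, `should_follow`) are made up for illustration, and the "domain" is guessed naively as the last two labels of the hostname, which is an assumption that will be wrong for hosts like "example.co.uk".

```python
from urllib.parse import urlsplit


def registered_domain(host):
    """Naive guess at the domain: the last two dot-separated labels.

    Assumption for illustration only; not how Plucker necessarily does it.
    """
    return ".".join(host.lower().split(".")[-2:])


def override_warning(stay_on_host, stay_on_domain):
    """Return the proposed warning text if both flags are set, else None."""
    if stay_on_host and stay_on_domain:
        return ("Warning: --stayonhost overrides --stayondomain; "
                "spidering will be limited to the starting host.")
    return None


def should_follow(url, start_url, stay_on_host=False, stay_on_domain=False):
    """Decide whether a link may be spidered, relative to the start URL."""
    host = (urlsplit(url).hostname or "").lower()
    start_host = (urlsplit(start_url).hostname or "").lower()
    if stay_on_host:
        # The stricter flag wins: the hostname must match exactly.
        return host == start_host
    if stay_on_domain:
        # Sibling hosts in the same domain are allowed.
        return registered_domain(host) == registered_domain(start_host)
    # Neither flag given: no host-based restriction.
    return True
```

With "http://slashdot.org" as the start URL, a link to "http://images.slashdot.org/..." is followed under --stayondomain alone, and skipped whenever --stayonhost is given, with or without --stayondomain.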

