On Wed, 3 Oct 2001, Gilles Detillieux wrote:

> Date: Wed, 3 Oct 2001 09:51:03 -0500 (CDT)
> From: Gilles Detillieux <[EMAIL PROTECTED]>
> To: Joe R. Jah <[EMAIL PROTECTED]>
> Cc: [EMAIL PROTECTED]
> Subject: Re: [htdig-dev] Re: URL Rewrite patch for 3.1.6 snapshots
> 
> > > > > If you get a chance to run old and new snapshots of htdig with -vvv and
> > > > > compare the outputs, you may be able to track down the source of the
> > > > > different URLs that are parsed in both cases.  To do this in a meaningful
> > > > > way, though, you'll need to try a static site, or perhaps a snapshot of
> > > > > your site, so you don't get thrown off in your comparisons by updates
> > > > > to the site between digs.
> > > > 
> > > > Yes, I have kept that snapshot for a happy occasion like that;)
> > > 
> > > Keep me posted if you get a chance to run this test with both snapshots.
> > > I can't think of any changes to 3.1.6 that would cause it to lose valid
> > > URLs, but it would be good to confirm without a doubt that the lost URLs
> > > on your system are all indeed URLs that should not have been indexed.
> > 
> > In the happy hour;)))
> 
> It might be best if you're sober when you do this test.  ;-)

The happy hour turned into a couple of unhappy weeks:(

-r--r--r--  1 jjah  www    24621528 Oct  2 13:20 rundig_vvv.082901
-r--r--r--  1 jjah  www    20266702 Oct  2 14:15 rundig_vvv.093001

I found 82 links from one document with a META robots noindex tag;)  I
could not find an efficient way of hunting down the other 138 links that
were unaccounted for in the two 20 MB+ files; however, I must assume that
they are some sort of duplicates;-/

Regards,

Joe
-- 
     _/   _/_/_/       _/              ____________    __o
     _/   _/   _/      _/         ______________     _-\<,_
 _/  _/   _/_/_/   _/  _/                     ......(_)/ (_)
  _/_/ oe _/   _/.  _/_/ ah        [EMAIL PROTECTED]


_______________________________________________
htdig-dev mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/htdig-dev
