- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: OK
Subject: Re: More indexing questions

1. The new regexps work much better, I think. I managed to destroy my database by accidentally running indexer -O instead of indexer -I. Nothing worked right after that, so I had to clear the whole DB. You may want to add that switch to the man page _and_ put an "Are you sure?" check on that one. :-) But when indexing from scratch, the sids are now removed as they should be. I'm not sure why the alpha/alnum thing didn't work; the man page for regexec on my system (Gentoo Linux) claims to support it...

2. Ah, pretty useless as a randomizer, in other words. How about sorting the URLs into one bucket per host and then indexing one URL from each bucket in turn, for as long as possible, instead?

- - - - - - - - - - - - - - - - - - - - - - - - - - - -
Read the full topic here: http://www.dataparksearch.org/cgi-bin/simpleforum.cgi?fid=02;topic_id=1108663453
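
To make the per-host bucket idea from point 2 concrete, here is a rough sketch in Python. It is only an illustration of the ordering I have in mind, not how indexer actually schedules URLs; the function name interleave_by_host and the example URLs are made up for this post.

```python
from collections import OrderedDict, deque
from urllib.parse import urlsplit


def interleave_by_host(urls):
    """Yield URLs round-robin across hosts: one URL per host bucket per pass.

    Hypothetical helper for illustration only; not part of any indexer.
    """
    # Group URLs into one FIFO bucket per host, keeping first-seen host order.
    buckets = OrderedDict()
    for url in urls:
        host = urlsplit(url).hostname or ""
        buckets.setdefault(host, deque()).append(url)

    # Cycle over the buckets, taking one URL from each until all are empty.
    pending = deque(buckets.values())
    while pending:
        bucket = pending.popleft()
        yield bucket.popleft()
        if bucket:
            # This host still has URLs left, so it goes to the back of the line.
            pending.append(bucket)
```

Once the smaller hosts run out, the remaining URLs of the largest host still come out back to back, so this spreads the load only "as long as possible", as said above.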
