> 1) Recognize that regex is the wrong tool for the job, and go back to
> using StringMatch for robots.txt handling as in 3.2. However, keep
> the test of the URL's path in Retriever::IsValidURL(), and make sure
> Server::push() tests the path only and not the full URL.
>
> 2) Stick to using regex, and Jim's patch, but fix Server::push() to
> test the path. Also, it would be a good idea to add the meta character
> escaping code from my patch.
>
> 3) Back out Jim's patch and apply mine, which addresses all the concerns
> I brought up, and is the simplest fix to the existing code (pre-Jim's
> patch). My fix adds a pattern to the regex to skip over the protocol
> and server name parts of the URL, so the match is effectively anchored to
> the path component of the URL, even though the pattern matching is done
> (consistently) on the whole URL.
>
> Whichever of the 3 you choose, it's going to need testing in any case.
> Fix 1 would be the ideal one, I think, but I didn't (and still don't)
> have time to do that, so I opted for the simple fix in the above
> mentioned e-mail. Given that fix 2 is now partially implemented,
> maybe the best course of action would be to follow through and address
> the 2 missing pieces. My fix (no. 3) has the disadvantage that future
> developers will be scratching their heads trying to figure out my regular
> expression for skipping to the path portion of the whole URL.
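To make the idea behind fix 3 concrete, here is a minimal sketch of what "skip the protocol and server name, then match the path" could look like, together with the meta character escaping mentioned in fix 2. It is only an illustration under assumptions: it uses std::regex as a stand-in for ht://Dig's own regex handling, the names escapeRegexMeta and DisallowMatcher are hypothetical and come from neither patch, and the prefix pattern is one guess at how the skip could be written, not the expression from the actual fix.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from either patch): backslash-escape regex
// metacharacters so a robots.txt path is matched literally.
std::string escapeRegexMeta(const std::string& path) {
    static const std::string meta = R"(\^$.|?*+()[]{})";
    std::string out;
    for (char c : path) {
        if (meta.find(c) != std::string::npos) {
            out += '\\';
        }
        out += c;
    }
    return out;
}

// Hypothetical matcher: it tests whole URLs, but the pattern first skips
// "scheme://host[:port]", so the disallowed prefixes are effectively
// anchored at the start of the path.
class DisallowMatcher {
public:
    explicit DisallowMatcher(const std::vector<std::string>& disallowedPaths) {
        std::string alternatives;
        for (const std::string& p : disallowedPaths) {
            if (p.empty()) continue;  // an empty Disallow excludes nothing
            if (!alternatives.empty()) alternatives += '|';
            alternatives += escapeRegexMeta(p);
        }
        hasPatterns_ = !alternatives.empty();
        if (hasPatterns_) {
            // "^[^:/]+://" skips the protocol, "[^/]*" skips the server name.
            re_.assign("^[^:/]+://[^/]*(" + alternatives + ")");
        }
    }

    // True if the URL's path starts with one of the disallowed prefixes.
    bool matches(const std::string& url) const {
        return hasPatterns_ && std::regex_search(url, re_);
    }

private:
    std::regex re_;
    bool hasPatterns_ = false;
};
```

With a matcher built from "/cgi-bin/", a URL such as http://example.com/cgi-bin/foo is rejected, while http://cgi-bin.example.com/index.html is not, because the server name is consumed before the path alternatives are tried. The head-scratching concern in the quoted text applies here too: the prefix pattern needs a comment to stay readable.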
If your fix #3 is fully implemented, while Jim's still needs more work, I'd rather go with #3.

Jim: What is your estimate for finishing #2? I don't want to step on toes, just get 3.2 out. After 3.2 we all need to have a pow-wow about what the next step is.

Thanks.

Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485

_______________________________________________
ht://Dig Developer mailing list: [EMAIL PROTECTED]
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-dev