> 1) Recognize that regex is the wrong tool for the job, and go back to
> using StringMatch for robots.txt handling as in 3.2.  However, keep
> the test of the URL's path in Retriever::IsValidURL(), and make sure
> Server::push() tests the path only and not the full URL.
>
> 2) Stick to using regex, and Jim's patch, but fix Server::push() to
> test the path.  Also, it would be a good idea to add the metacharacter
> escaping code from my patch.
>
> 3) Back out Jim's patch and apply mine, which addresses all the concerns
> I brought up, and is the simplest fix to the existing code (pre-Jim's
> patch).  My fix adds a pattern to the regex to skip over the protocol
> and server name parts of the URL, so the match is effectively anchored to
> the path component of the URL, even though the pattern matching is done
> (consistently) on the whole URL.
>
> Whichever of the 3 you choose, it's going to need testing in any case.
> Fix 1 would be the ideal one, I think, but I didn't (and still don't)
> have time to do that, so I opted for the simple fix in the
> above-mentioned e-mail.  Given that fix 2 is now partially implemented,
> maybe the best course of action would be to follow through and address
> the 2 missing pieces.  My fix (no. 3) has the disadvantage that future
> developers will be scratching their heads trying to figure out my regular
> expression for skipping to the path portion of the whole URL.
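
For reference, the idea behind fix #3 (escape the metacharacters in each
robots.txt path, then prefix a pattern that skips the protocol and server
name) can be sketched roughly as below. This is not the code from either
patch: EscapeMeta, BuildDisallowPattern and IsDisallowed are hypothetical
names, it uses std::regex instead of ht://Dig's own regex wrapper, and it
assumes the list of disallowed paths is non-empty.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: backslash-escape regex metacharacters so that a
// robots.txt path such as "/private.html" is matched literally.
std::string EscapeMeta(const std::string &path)
{
    static const std::string meta = "\\^$.|?*+()[]{}";
    std::string out;
    for (char c : path) {
        if (meta.find(c) != std::string::npos)
            out += '\\';
        out += c;
    }
    return out;
}

// Hypothetical helper: build one pattern that skips "scheme://host[:port]"
// and then requires the path to begin with one of the disallowed prefixes.
// Assumes `paths` is non-empty; an empty list would match every URL.
std::string BuildDisallowPattern(const std::vector<std::string> &paths)
{
    std::string alternatives;
    for (const std::string &p : paths) {
        if (!alternatives.empty())
            alternatives += '|';
        alternatives += EscapeMeta(p);
    }
    // "[^/]*" cannot cross a slash, so the alternatives are effectively
    // anchored to the start of the path even though the whole URL is tested.
    return "^[^:/]+://[^/]*(" + alternatives + ")";
}

// True if the full URL falls under one of the disallowed paths.
bool IsDisallowed(const std::string &url, const std::string &pattern)
{
    return std::regex_search(url, std::regex(pattern));
}
```

With the paths "/cgi-bin/" and "/private.html", a URL whose path starts
with /cgi-bin/ is rejected, while one that merely contains /cgi-bin/
deeper in the path, or in the host name, is not.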

  If your fix #3 is fully implemented, while Jim's still needs more work,
I'd rather go with #3.

  Jim:  What is your estimate of finishing #2?

I don't want to step on anyone's toes; I just want to get 3.2 out.

After 3.2 we all need to have a pow-wow about what the next step is.

Thanks.

Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485




_______________________________________________
ht://Dig Developer mailing list:
[EMAIL PROTECTED]
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-dev
