Jukka,
>On 8/5/06, Chris Schneider <[EMAIL PROTECTED]> wrote:
>>Given this, shouldn't the default URL normalizer just add a slash to
>>the end of a URL that doesn't have a file extension?
At 8:41 AM +0300 8/5/06, Jukka Zitting wrote:
>Section 6.2.4 of RFC 3986 suggests that a crawler could do suc
Hi,
On 8/5/06, Jukka Zitting <[EMAIL PROTECTED]> wrote:
>Section 6.2.4 of RFC 3986 suggests that a crawler could do such a
>normalization if it detects that
>http://mail.python.org/mailman/listinfo redirects to
>http://mail.python.org/mailman/listinfo/.
Which it of course doesn't... :-) Another re
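For illustration only, here is a minimal Python sketch of the redirect-based rule quoted above. It assumes the crawler has already recorded the redirect target (if any) it observed for a URL. The function names and the redirect_target parameter are hypothetical and not part of any existing normalizer.

```python
from urllib.parse import urlsplit, urlunsplit


def with_trailing_slash(url):
    """Return url with '/' appended to its path; query and fragment are kept."""
    parts = urlsplit(url)
    if parts.path.endswith("/"):
        return url
    return urlunsplit(parts._replace(path=parts.path + "/"))


def normalize_on_redirect(url, redirect_target):
    """Switch to the slash form only if the server was seen redirecting to it.

    redirect_target is the redirect location the crawler observed for url,
    or None if the server did not redirect (hypothetical input).
    """
    if redirect_target is not None and redirect_target == with_trailing_slash(url):
        return redirect_target
    return url
```

With no observed redirect (redirect_target is None) the URL is returned unchanged, so the two forms stay distinct unless the server itself says they are the same resource. Blindly appending a slash to every extensionless URL would skip that check.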
Chris Schneider wrote:
>Gang,
>Pardon my ignorance, but I noticed recently that some URLs were
>duplicated in my crawldb, once with a terminating slash and once
>without it. For example, both of the following URLs were found in the
>same crawldb:
>http://mail.python.org/mailman/listinfo/
>http://ma
Hi,
On 8/5/06, Chris Schneider <[EMAIL PROTECTED]> wrote:
>Given this, shouldn't the default URL normalizer just add a slash to
>the end of a URL that doesn't have a file extension?
Section 6.2.4 of RFC 3986 suggests that a crawler could do such a
normalization if it detects that
http://mail.pyt
Gang,
Pardon my ignorance, but I noticed recently that some URLs were
duplicated in my crawldb, once with a terminating slash and once
without it. For example, both of the following URLs were found in the
same crawldb:
http://mail.python.org/mailman/listinfo/
http://mail.python.org/mailman/l