[
https://issues.apache.org/jira/browse/NUTCH-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964069#comment-13964069
]
Alex McLintock commented on NUTCH-1748:
---------------------------------------
FYI
"The similarity to unix and other disk operating system filename conventions
should be taken as purely coincidental, and should not be taken to indicate
that URIs should be interpreted as file names."
(quoted from http://www.w3.org/Addressing/URL/4_URI_Recommentations.html)
That page also says:
"The slash ("/", ASCII 2F hex) character is reserved for the delimiting of
substrings whose relationship is hierarchical. This enables partial forms of
the URI. Substrings consisting of single or double dots ("." or "..") are
similarly reserved."
So if we take "substring" to mean a segment delimited by slashes, then a
path containing "/../" is NOT allowed, but ".." with other characters next
to it inside a segment (as in "abc..xyz.txt") should be allowed.
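
To make that reading concrete, here is a minimal sketch in Java. It is not
the actual urlfilter-validator code; the class name DotSegmentCheck and the
method hasDoubleDotSegment are hypothetical, and it assumes the rule is
simply "reject a path only when a whole slash-delimited segment is exactly
'..'".

```java
// Hypothetical illustration only; not taken from Nutch's urlfilter-validator.
class DotSegmentCheck {

    /**
     * Returns true if any '/'-delimited segment of the path is exactly "..".
     * A ".." that is merely part of a longer segment does not count.
     */
    public static boolean hasDoubleDotSegment(String path) {
        // A limit of -1 keeps trailing empty strings, so a path that ends
        // in "/" is split the same way as any other path.
        for (String segment : path.split("/", -1)) {
            if (segment.equals("..")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] paths = {"/abc..xyz.txt", "/a/../b", "/a/..", "/a/..b/c"};
        for (String p : paths) {
            System.out.println(p + " -> "
                + (hasDoubleDotSegment(p) ? "reject" : "accept"));
        }
    }
}
```

Under this assumption "/abc..xyz.txt" and "/a/..b/c" pass, while "/a/../b"
and "/a/.." are rejected. Comparing whole segments rather than searching for
the substring ".." is what separates the two cases.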
> Although Unix systems accept file names containing two dots, urlfilter-validator
> rejects such path names.
> --------------------------------------------------------------------------------------------------
>
> Key: NUTCH-1748
> URL: https://issues.apache.org/jira/browse/NUTCH-1748
> Project: Nutch
> Issue Type: Bug
> Affects Versions: 2.2.1
> Reporter: Sertac TURKEL
> Priority: Minor
> Fix For: 2.3
>
>
> Unix systems accept file names containing two consecutive dots, such as
> "abc..xyz.txt", so urlfilter-validator should not reject URLs of this kind.
> Paths containing "/../", or ending in "/..", should still be rejected.