[ https://issues.apache.org/jira/browse/HADOOP-8977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495372#comment-13495372 ]
Daryn Sharp commented on HADOOP-8977:
-------------------------------------
Here's a copy-n-paste from an offline discussion:
The problem here is that Hadoop is designed around URIs, and we are trying to
support something (Windows paths) that is NOT a URI. Earlier, I bent over
backwards trying to temporarily handle Windows paths, but Suresh convinced me
it was the wrong direction.
The article I linked is about how to properly reference Windows paths as URIs,
and it says Windows-style \ paths are deprecated in IE, which I think
essentially means the file browser. The Windows shell supports / paths, so I'm
grappling with why we should perpetuate deprecated Windows paths as pseudo-URIs
when real URIs appear to be fully supported on Windows.
I'd be a bit happier (or less unhappy! :) ) if \ support were more
context-specific, limited to just Windows local path names. As it stands, all
URIs on Windows are subject to \ to / conversion, which prevents Windows from
accessing valid filenames in HDFS and other supported filesystems. I can
understand/sympathize with the motivation to support c:\path, but I don't agree
that hdfs:\\host\path or hdfs:\/path/path2\path3 should be supported at all.
This bizarre behavior creates compatibility issues: jobs accessing paths in
that way are not cross-platform compatible, i.e., they "work" on Hadoop for
Windows but fail on every other OS. Once we "let the cat out of the bag" by
adding more pseudo-support for non-URIs on Windows, it's going to be that much
harder to take it away.
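To make the concern concrete, here is a minimal Java sketch. The helper blanketNormalize is hypothetical: it only imitates the unconditional \ to / conversion described above and is not Hadoop's actual code. The host and file names are made up for illustration.

```java
public class BlanketConversionDemo {
    // Hypothetical stand-in for the unconditional conversion:
    // every backslash becomes a forward slash, regardless of scheme.
    public static String blanketNormalize(String path) {
        return path.replace('\\', '/');
    }

    public static void main(String[] args) {
        // Per the comment above, a name containing a backslash can be a
        // valid filename on HDFS. (In Java source, "\\" is one backslash.)
        String hdfsFile = "hdfs://host/dir/odd\\name";

        // After conversion the single file "odd\name" is addressed as
        // directory "odd" containing "name", so the original is unreachable.
        System.out.println(blanketNormalize(hdfsFile));
        // prints hdfs://host/dir/odd/name
    }
}
```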
What if we did something a bit more selective:
# [a-z]:\
#* considered a Windows non-URI
#* implicitly deemed to have a "file" scheme if not already declared
#* all \ are converted to /, which means no quoting of metachars is available, or we support ^ as the escape
#* throw an exception if / already exists in the path
# [a-z]:/
#* considered a standard URI
#* implicitly deemed to have a "file" scheme if not already declared
#* no \ conversion - quoting of metachars is supported
# all other URI schemes and relative paths
#* no change
# add ctor Path(File)
#* allow users to create Paths from non-URIs
#* will eventually be the only supported way to access non-URI paths
# eliminate treating ":" as an invalid path character to allow drive letters
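The first three rules of the proposal could be sketched roughly as follows. This is an illustration under stated assumptions, not a patch: the class WindowsPathSketch and its normalize method are hypothetical names, the drive letter match accepts upper case as well as [a-z], only the case where no scheme is declared is handled, and rendering the implicit "file" scheme as a file:/// prefix is one possible choice. The ^ escape, the Path(File) ctor, and the ":" change are not covered.

```java
import java.util.regex.Pattern;

public class WindowsPathSketch {
    // Drive letter followed by a backslash: the "Windows non-URI" case.
    // Accepting upper-case letters is an assumption beyond [a-z].
    private static final Pattern DRIVE_BACKSLASH =
        Pattern.compile("^[a-zA-Z]:\\\\.*");

    // Drive letter followed by a slash: the "standard URI" case.
    private static final Pattern DRIVE_SLASH =
        Pattern.compile("^[a-zA-Z]:/.*");

    // Returns the string that would be handed on to URI parsing.
    public static String normalize(String input) {
        if (DRIVE_BACKSLASH.matcher(input).matches()) {
            // Rule 1: reject mixed separators, then convert every \ to /
            // and attach the implicit "file" scheme.
            if (input.indexOf('/') >= 0) {
                throw new IllegalArgumentException(
                    "'/' not allowed in a Windows non-URI path: " + input);
            }
            return "file:///" + input.replace('\\', '/');
        }
        if (DRIVE_SLASH.matcher(input).matches()) {
            // Rule 2: already URI-style, so no \ conversion is applied.
            return "file:///" + input;
        }
        // Rule 3: all other schemes and relative paths pass through unchanged.
        return input;
    }
}
```

Under these assumptions, c:\dir\file and c:/dir/file both come out as file:///c:/dir/file, while hdfs://host/a\b is left alone so the \ stays part of the filename, and c:\dir/file is rejected with an exception.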
I'm curious what serious breakage we'd have if we just required standard URIs,
i.e., if we either changed little to nothing or implemented the above proposal?
> multiple FsShell test failures on Windows
> -----------------------------------------
>
> Key: HADOOP-8977
> URL: https://issues.apache.org/jira/browse/HADOOP-8977
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Affects Versions: trunk-win
> Reporter: Chris Nauroth
> Assignee: Chris Nauroth
> Fix For: trunk-win
>
> Attachments: HADOOP-8977-branch-trunk-win.patch,
> HADOOP-8977-branch-trunk-win.patch, HADOOP-8977-branch-trunk-win.patch,
> HADOOP-8977.patch
>
>
> Multiple FsShell-related tests fail on Windows. Commands are returning
> non-zero exit status.