Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19885
Hi.
If the comparison is isolated to a method testing URIs, rather than
filesystems, it should be straightforward to write a suite of tests for this,
with a list of URI pairs expected to match and a separate list of pairs
expected to fail. That way we can review the combinations which people expect
to match/not match & see that they meet our expectations, plus have somewhere
to put new variants over time.
So: do that test, then we can see if the code does what's needed. Once
that's done I'll use it as a basis for defining what Path is meant to do in the
Hadoop FS spec & tests.
Things to check. Expected to match:
```
file:///file1 file:///file2 : match; no auth
file:///c:file1 file://c:file2 match, windows cruft. This is the bit of
    Path which is most trouble
file://host/file1 file://host/file2
wasb://bucket1@user wasb://bucket1@user/
hdfs:/path1 hdfs:/path2 -- "default" FS; may be patched by the time you
    get to FileSystem.getUri()
hdfs://namenode1/path1 hdfs://namenode1:8020/path2 -- using default port.
    I think by the time you ask the filesystem for this (FileSystem.getUri())
    this may have been patched up
```
no match:
```
file:///file1 file://host/file2 : no auth in src URI (Sean's problem)
file://host/file1 file:///file2
file://host/file1 file://host2/file2
wasb://bucket1@user wasb://bucket2@user/
wasb://bucket1@user wasb://bucket1@user2/
s3a://user@pass:bucket1/ s3a://user2@pass2:bucket1/ (we do a bit of
    secret stripping in S3A, so this may end up working in real life. Could
    relax that to retaining user@ though, if we retain it at all)
hdfs:/path1 hdfs:/path2
hdfs://namenode1/path1 hdfs://namenode1:8080/path2
hdfs://namenode1:8020/path1 hdfs://namenode1:8080/path2
```
See? It's complex. Add the parameterised test and it becomes easier to
review and maintain, and to be confident those corner cases are being handled.
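
To make the shape of such a test concrete, here is a minimal table-driven sketch in plain Java using only `java.net.URI`. It is not the code in the PR: `UriMatchSketch`, `sameFileSystem` and `effectivePort` are hypothetical names, and the rule that a missing hdfs port means 8020 is an assumption taken from the examples above. The Windows and S3A pairs are left out, as is the `hdfs:/path1 hdfs:/path2` pair, which appears in both lists.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Objects;

public class UriMatchSketch {

    // Pairs expected to refer to the same filesystem.
    static final String[][] MATCH = {
        {"file:///file1", "file:///file2"},
        {"file://host/file1", "file://host/file2"},
        {"wasb://bucket1@user", "wasb://bucket1@user/"},
        {"hdfs:/path1", "hdfs:/path2"},
        {"hdfs://namenode1/path1", "hdfs://namenode1:8020/path2"},
    };

    // Pairs expected to refer to different filesystems.
    static final String[][] NO_MATCH = {
        {"file:///file1", "file://host/file2"},
        {"file://host/file1", "file:///file2"},
        {"file://host/file1", "file://host2/file2"},
        {"wasb://bucket1@user", "wasb://bucket2@user/"},
        {"wasb://bucket1@user", "wasb://bucket1@user2/"},
        {"hdfs://namenode1/path1", "hdfs://namenode1:8080/path2"},
        {"hdfs://namenode1:8020/path1", "hdfs://namenode1:8080/path2"},
    };

    // Assumption for this sketch only: hdfs without an explicit port means 8020.
    static int effectivePort(URI u) {
        if (u.getPort() != -1) {
            return u.getPort();
        }
        return "hdfs".equalsIgnoreCase(u.getScheme()) ? 8020 : -1;
    }

    static String lower(String s) {
        return s == null ? null : s.toLowerCase(Locale.ROOT);
    }

    // Hypothetical stand-in for the method under review: compares scheme,
    // user info, host and effective port, and ignores the path entirely.
    static boolean sameFileSystem(URI a, URI b) {
        if (a.getScheme() == null || !a.getScheme().equalsIgnoreCase(b.getScheme())) {
            return false;
        }
        return Objects.equals(a.getUserInfo(), b.getUserInfo())
            && Objects.equals(lower(a.getHost()), lower(b.getHost()))
            && effectivePort(a) == effectivePort(b);
    }

    // Records every pair whose actual result differs from the expected one.
    static void check(String[][] pairs, boolean expected, List<String> failed) {
        for (String[] p : pairs) {
            boolean actual = sameFileSystem(URI.create(p[0]), URI.create(p[1]));
            if (actual != expected) {
                failed.add(p[0] + " vs " + p[1] + ": expected match=" + expected);
            }
        }
    }

    static List<String> failures() {
        List<String> failed = new ArrayList<>();
        check(MATCH, true, failed);
        check(NO_MATCH, false, failed);
        return failed;
    }

    public static void main(String[] args) {
        List<String> failed = failures();
        failed.forEach(System.out::println);
        System.out.println(failed.size() + " unexpected result(s)");
    }
}
```

In a real suite the two tables would feed a parameterised test against the actual comparison method, so a new variant is one more row rather than a new test.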