Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/19885

Hi.

If the comparison is isolated to a method testing URIs, rather than filesystems, it should be straightforward to write a suite of tests for this, with one list of URI pairs expected to match and a separate list of those expected to fail. That way we can review the combinations which people expect to match/not match & see they meet our expectations, plus have somewhere to put new variants over time.

So: do that test, then we can see if the code does what's needed. Once that's done I'll use it as a basis for defining what Path is meant to do in the Hadoop FS spec & tests.

Things to check

match:

```
file:///file1 file:///file2 : match; no auth
file:///c:file1 file://c:file2 : match, Windows cruft. This is the bit of Path which is most trouble
file://host/file1 file://host/file2
wasb://bucket1@user wasb://bucket1@user/
hdfs:/path1 hdfs:/path2 -- "default" FS; may be patched by the time you get to FileSystem.getUri()
hdfs://namenode1/path1 hdfs://namenode1:8020/path2 -- using default port. I think by the time you ask the filesystem for this (FileSystem.getUri()) this may have been patched up
```

no match:

```
file:///file1 file://host/file2 : no auth in src URI (sean's problem)
file://host/file1 file:///file2
file://host/file1 file://host2/file2
wasb://bucket1@user wasb://bucket2@user/
wasb://bucket1@user wasb://bucket1@user2/
s3a://user@pass:bucket1/ s3a://user2@pass2:bucket1/ (we do a bit of secret stripping in S3A, so this may end up working in real life. Could relax that to retaining user@ though, if we retain it at all)
hdfs:/path1 hdfs:/path2
hdfs://namenode1/path1 hdfs://namenode1:8080/path2
hdfs://namenode1:8020/path1 hdfs://namenode1:8080/path2
```

See? It's complex. Add the parameterised test and then it becomes easier to review/maintain & be confident those corner cases are being handled.
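
To make the shape of that test concrete, here is a minimal table-driven sketch in plain Java. It is only an illustration under stated assumptions: `sameStore`, `effectivePort` and the `DEFAULT_PORTS` map (hdfs on 8020) are hypothetical names and values, not anything in Spark or Hadoop, and the comparison uses `java.net.URI` alone rather than Hadoop's `Path` or `FileSystem`. The Windows `c:` pairs, the S3A secret-stripping pair and the bare `hdfs:/path` pair are left out, since their expected outcome depends on behaviour this sketch does not model.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;
import java.util.Objects;

class UriMatchSketch {
    // Assumed default ports, for illustration only; real values come from filesystem config.
    static final Map<String, Integer> DEFAULT_PORTS = Map.of("hdfs", 8020);

    // Port from the URI, or the assumed scheme default when none is given (-1 if unknown).
    static int effectivePort(URI u) {
        if (u.getPort() != -1) {
            return u.getPort();
        }
        return DEFAULT_PORTS.getOrDefault(u.getScheme().toLowerCase(Locale.ROOT), -1);
    }

    // Hypothetical comparison: same scheme, user info, host and effective port. Paths are ignored.
    static boolean sameStore(URI a, URI b) {
        if (a.getScheme() == null || !a.getScheme().equalsIgnoreCase(b.getScheme())) {
            return false;
        }
        if (a.getAuthority() == null || b.getAuthority() == null) {
            // No authority on one side only means no match; none on both sides means match.
            return a.getAuthority() == null && b.getAuthority() == null;
        }
        if (a.getHost() == null || b.getHost() == null) {
            // Authority that java.net.URI could not split into host and port: compare it raw.
            return a.getAuthority().equals(b.getAuthority());
        }
        return a.getHost().equalsIgnoreCase(b.getHost())
            && Objects.equals(a.getUserInfo(), b.getUserInfo())
            && effectivePort(a) == effectivePort(b);
    }

    // Pairs taken from the "match" list above.
    static final String[][] MATCH = {
        {"file:///file1", "file:///file2"},
        {"file://host/file1", "file://host/file2"},
        {"wasb://bucket1@user", "wasb://bucket1@user/"},
        {"hdfs://namenode1/path1", "hdfs://namenode1:8020/path2"},
    };

    // Pairs taken from the "no match" list above.
    static final String[][] NO_MATCH = {
        {"file:///file1", "file://host/file2"},
        {"file://host/file1", "file:///file2"},
        {"file://host/file1", "file://host2/file2"},
        {"wasb://bucket1@user", "wasb://bucket2@user/"},
        {"wasb://bucket1@user", "wasb://bucket1@user2/"},
        {"hdfs://namenode1/path1", "hdfs://namenode1:8080/path2"},
        {"hdfs://namenode1:8020/path1", "hdfs://namenode1:8080/path2"},
    };

    // Counts the pairs whose result differs from the expectation, reporting each one.
    static int failures(String[][] pairs, boolean expected) {
        int failed = 0;
        for (String[] p : pairs) {
            if (sameStore(URI.create(p[0]), URI.create(p[1])) != expected) {
                System.out.println("unexpected result for " + p[0] + " vs " + p[1]);
                failed++;
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        int failed = failures(MATCH, true) + failures(NO_MATCH, false);
        System.out.println(failed + " unexpected result(s)");
    }
}
```

In a real suite each row would presumably become one case of a parameterised test, so a failure names the exact pair, and new variants are added by appending a row to one of the two tables.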