attilapiros commented on pull request #28967:
URL: https://github.com/apache/spark/pull/28967#issuecomment-654047643
> I wonder we can link the spot on the JDK code (if it was from OpenJDK) so
that we feel confident to remove the relevant test.
>
> I usually see the concern of removing tests, as the Spark codebase is
quite complicated and in many cases UTs can only find the broken part. It tends
to make us comfortable if the code is from JDK (or what we get rid of is
actually what JDK is doing for us) but still be ideal which/where code it is.
I see your point but here the old `createNormalizedInternedPathname` was as
good as it could imitate the `java.io.FileSystem#normalize()` because of the
interning of the result String.
This is kind a mentioned in the old method javadoc:
https://github.com/apache/spark/blob/dea7bc464db6724d89e8b31264e054f82ceb4e5f/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExecutorDiskUtils.java#L59-L60
In the old tests we should have to test how close these two transformation
are but there was the same problem: the `java.io.FileSystem#normalize()` cannot
be called there too.
Now with this trick (reading back the path) we would test whether the result
of `java.io.FileSystem#normalize()` is the same
`java.io.FileSystem#normalize()`.
For the same reason I do not see the value by giving a link to the exact
code behind.
They are here:
-
https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/windows/classes/java/io/WinNTFileSystem.java#L206-L223
-
https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/solaris/classes/java/io/UnixFileSystem.java#L96-L99
Still those are just implementation details which are not really relevant as
it does not matter what the exact transformation behind the normalisation is
(depending on the OS type) as in the final `File` constructor the JDK one will
be executed and if we differ from it then interning the String does not help at
all.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]