GitHub user srowen commented on the pull request:
https://github.com/apache/spark/pull/546#issuecomment-42484142
@pwendell @nsuthar
First, note SPARK-1527 (https://issues.apache.org/jira/browse/SPARK-1527)
and its PR (https://github.com/apache/spark/pull/436).
@advancedxy says that was a real issue, and I think it is fixed by the PR,
which uses `getAbsolutePath()`. FWIW I agree with that change.
I think that is slightly preferable to the change in this PR.
`getCanonicalPath()` is probably just fine too, and also solves the issue, but
may be unnecessary.
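
To illustrate the difference between the two calls, here is a minimal
standalone sketch. The class name `PathForms`, the helper `bothPaths`, and the
sample path are made up for this example and are not from either PR. It only
shows that `getAbsolutePath()` prefixes the working directory without touching
`..`, while `getCanonicalPath()` also resolves `..` and symlinks and can throw
`IOException`.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

class PathForms {
    // Hypothetical helper: returns { absolute path, canonical path } for the
    // given path string.
    static String[] bothPaths(String path) {
        File f = new File(path);
        try {
            return new String[] { f.getAbsolutePath(), f.getCanonicalPath() };
        } catch (IOException e) {
            // getCanonicalPath() may need to query the file system.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] p = bothPaths("a/../b");
        // The absolute form keeps the "a/../b" suffix as written.
        System.out.println("absolute:  " + p[0]);
        // The canonical form collapses it to ".../b".
        System.out.println("canonical: " + p[1]);
    }
}
```

Either form gives a stable key as long as the same one is used everywhere,
which is why the extra file system work in `getCanonicalPath()` may not be
needed here.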
There is a separate issue addressed in this PR / SPARK-1623: the set `files`,
which holds `String`s, is populated with the result of
`File.getAbsolutePath()`, but entries are later removed using the result of
`File.toString()`, which is `File.getPath()` and is *not* necessarily the same
string for the same file.
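
A small sketch of how the mismatch plays out, outside of Spark. The class
`PathMismatch`, the helper `leaksEntry`, and the sample path are hypothetical;
only the add/remove pattern mirrors what is described above.

```java
import java.io.File;
import java.util.HashSet;
import java.util.Set;

class PathMismatch {
    // Returns true if the entry added via getAbsolutePath() is still in the
    // set after trying to remove it via toString().
    static boolean leaksEntry(File file) {
        Set<String> files = new HashSet<>();
        files.add(file.getAbsolutePath());
        files.remove(file.toString()); // toString() returns getPath()
        return !files.isEmpty();
    }

    public static void main(String[] args) {
        File relative = new File("data/part-00000");
        // Relative path: the removed string differs from the added one.
        System.out.println(leaksEntry(relative)); // prints "true"

        File absolute = relative.getAbsoluteFile();
        // Absolute path: both strings match, so the entry is removed.
        System.out.println(leaksEntry(absolute)); // prints "false"
    }
}
```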
Even if it all happens to work out because the arguments are always absolute,
I think it's nicer to resolve this by a) calling
`files.remove(file.getAbsolutePath)` or b) just making `files` a set of `File`
to begin with.
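
Roughly, the two options look like this in isolation. This is a sketch with
made-up names (`ConsistentKeys`, `removeByAbsolutePath`, `removeByFile`), not
the actual Spark code.

```java
import java.io.File;
import java.util.HashSet;
import java.util.Set;

class ConsistentKeys {
    // Option a): keep a Set<String>, but use getAbsolutePath() for both
    // add and remove. Returns true if the entry was found and removed.
    static boolean removeByAbsolutePath(File file) {
        Set<String> files = new HashSet<>();
        files.add(file.getAbsolutePath());
        return files.remove(file.getAbsolutePath());
    }

    // Option b): store File objects, so no string conversion is involved.
    // Returns true if an equal File was found and removed.
    static boolean removeByFile(File file) {
        Set<File> files = new HashSet<>();
        files.add(file);
        return files.remove(new File(file.getPath()));
    }

    public static void main(String[] args) {
        File relative = new File("data/part-00000");
        System.out.println(removeByAbsolutePath(relative)); // prints "true"
        System.out.println(removeByFile(relative));         // prints "true"
    }
}
```

One caveat with b): `File.equals` compares abstract pathnames, so a relative
`File` and its absolute counterpart are still different set elements. It
removes the `String` conversion mismatch, not the need to construct the `File`
the same way on both sides.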
So I suggest:
- commit the PR for SPARK-1527 and close it
- abandon this PR and make a new one that implements b) or a) above to
resolve SPARK-1623