Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/546#issuecomment-41503498
It goes back to the problem we are trying to solve.
If the set/map can contain arbitrary paths, then file.getCanonical is
unavoidable.
But then the overhead of (multiple) IO and native calls will be there.
If it is in the context of a specific use case, then it becomes a function of
that: do we need canonical names, or are the paths always constructed using
common patterns which are consistent (always relative to cwd, or specified w.r.t.
some root, which can be a symlink)?
IMO Spark does the latter, unless there are recent changes I am missing.
In any case, getAbsolutePath/File does not buy us much.
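
To illustrate the point, here is a rough sketch of how the two calls behave when used as set keys. The class name `PathKeyDemo`, its helper methods, and the sample paths are made up for this example and are not taken from the Spark code. As far as I know, getAbsolutePath only resolves the path against cwd and leaves `.` and `..` segments alone, while getCanonicalPath collapses them (and resolves symlinks on most platforms) at the cost of going to the filesystem.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; names and paths are illustrative only.
public class PathKeyDemo {

    // Key each path by getAbsolutePath(): resolved against cwd,
    // but "." and ".." segments are left as they are.
    static int distinctAbsolute(String... paths) {
        Set<String> keys = new HashSet<>();
        for (String p : paths) {
            keys.add(new File(p).getAbsolutePath());
        }
        return keys.size();
    }

    // Key each path by getCanonicalPath(): "." and ".." are collapsed,
    // which may involve filesystem access and native calls.
    static int distinctCanonical(String... paths) {
        Set<String> keys = new HashSet<>();
        for (String p : paths) {
            try {
                keys.add(new File(p).getCanonicalPath());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return keys.size();
    }

    public static void main(String[] args) {
        // Three spellings of the same relative location.
        String[] paths = {"work/app.jar", "./work/app.jar", "work/../work/app.jar"};
        System.out.println("absolute keys:  " + distinctAbsolute(paths));
        System.out.println("canonical keys: " + distinctCanonical(paths));
    }
}
```

If the paths are always built the same way (say, always relative to cwd with no `.` or `..` segments), the raw strings already compare equal and neither call is needed as a key.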