GitHub user trystanleftwich commented on the pull request:
https://github.com/apache/spark/pull/4881#issuecomment-77208611
So to confirm, I think this function needs to be able to handle 5 states:

1. Path is a dir which has subdirs
   (structure is hdfs://foo/foo1/foo2.jar)
   path = hdfs://foo -> local_dir://foo
   (where local_dir://foo looks like local_dir://foo/foo1/foo2.jar)
2. Path is a file which you want to copy with the same name
   path = hdfs://foo.jar -> local_dir/foo.jar
3. Path is a file which you want to copy with a different name
   path = hdfs://foo.jar -> local_dir/bar.jar
4. Path is a dir which contains multiple files that you want to copy with
   the same names
   (structure is hdfs://foo/foo1.jar, hdfs://foo/foo2.jar)
   path = hdfs://foo -> local_dir/foo/
   (where local_dir/foo looks like local_dir/foo/foo1.jar and
   local_dir/foo/foo2.jar)
5. Path is a dir which contains multiple files and subdirs you want to copy
   (similar to above)

Anything I have missed? I have some code in testing now that achieves this;
I'll post it when I've run it on my cluster.
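
For illustration only, the five cases listed above could be covered by one small recursive helper. This is a rough sketch and not the code from this pull request: it uses the local filesystem as a stand-in for HDFS, and the names `copy_to_local` and `new_name` are made up for the example.

```python
import shutil
from pathlib import Path
from typing import Optional


def copy_to_local(src: Path, local_dir: Path, new_name: Optional[str] = None) -> Path:
    """Copy src (a file or a directory) under local_dir and return the new path.

    Hypothetical helper: a local path stands in for the hdfs:// source.
    """
    # Keep the source name unless the caller asks for a different one.
    target = local_dir / (new_name or src.name)
    if src.is_dir():
        # Directory cases: flat files, subdirs, or a mix of both.
        # Children are copied recursively and keep their own names.
        target.mkdir(parents=True, exist_ok=True)
        for child in sorted(src.iterdir()):
            copy_to_local(child, target)
    elif src.is_file():
        # Single-file cases: same name, or renamed via new_name.
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, target)
    else:
        raise FileNotFoundError(f"source path does not exist: {src}")
    return target
```

With this shape, `copy_to_local(Path("foo.jar"), Path("local_dir"), new_name="bar.jar")` would correspond to the rename case, and passing a directory covers the dir cases without any special handling for how deep the tree is.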