Github user HeartSaVioR commented on the issue:
https://github.com/apache/spark/pull/22952
@zsxwing @gaborgsomogyi
What we were trying to do is enforce an archive path such that moved files
can never overlap with the source path. Files with the same name may exist in
different directories, so I'm also preserving each file's own path within its
final archived path, which means archived files will not all be placed in the
same directory.
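
To illustrate the path-preserving layout described above, here is a minimal Python sketch. It is not Spark's actual implementation: `archived_path` and the example paths are hypothetical, and it assumes plain absolute POSIX-style paths without a URI scheme.

```python
from pathlib import PurePosixPath


def archived_path(archive_dir: str, source_file: str) -> str:
    """Hypothetical helper: nest the source file's full path under archive_dir."""
    # Drop the leading "/" so the source path can be appended to archive_dir.
    relative = PurePosixPath(source_file).relative_to("/")
    return str(PurePosixPath(archive_dir) / relative)


# Two files with the same name in different directories keep distinct targets.
a = archived_path("/archive", "/data/2019/01/part-0.csv")
b = archived_path("/archive", "/data/2019/02/part-0.csv")
print(a)
print(b)
```

Because the directory portion of each source file is kept, files that share a name never collide in the archive directory.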
Given the above, I thought enforcing the archive path by checking it against
the glob path would not be easy, because without knowing the final archive path
(per file) we can't check whether it matches the glob pattern. That's why I
would rather just restrict all subdirectories instead of finding a way to check
against the glob pattern.
Actually, I'm a bit afraid that we might be putting too much complexity into
enforcing the archive path. If we are OK with not enforcing the archive path and
instead just verifying, for each source file, that its final archive path
doesn't overlap the source path, it would be simple to do. When an overlap is
detected, Spark would not move the file and would log a warning message so that
end users can specify a different directory.
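
The per-file check could look roughly like the following Python sketch. This is only an illustration under stated assumptions, not Spark code: `final_archive_path` and `should_archive` are hypothetical names, and Python's `fnmatch` stands in for Hadoop's glob matching. Note that in `fnmatch` a `*` also matches `/`, so this check is more conservative than a real Hadoop glob would be.

```python
import fnmatch
import logging
from pathlib import PurePosixPath

logger = logging.getLogger("archive_check")


def final_archive_path(archive_dir: str, source_file: str) -> str:
    """Hypothetical helper: nest the source file's full path under archive_dir."""
    relative = PurePosixPath(source_file).relative_to("/")
    return str(PurePosixPath(archive_dir) / relative)


def should_archive(source_glob: str, archive_dir: str, source_file: str) -> bool:
    """Return False (and log a warning) if the archived file would match the source glob."""
    target = final_archive_path(archive_dir, source_file)
    if fnmatch.fnmatchcase(target, source_glob):
        logger.warning(
            "Not moving %s: archive path %s overlaps source pattern %s; "
            "please specify a different archive directory.",
            source_file, target, source_glob,
        )
        return False
    return True


# Archive directory inside the source tree: the move is skipped.
overlapping = should_archive("/data/*/*.csv", "/data/archive", "/data/in/a.csv")
# Archive directory outside the source tree: the move is allowed.
safe = should_archive("/data/*/*.csv", "/archive", "/data/in/a.csv")
```

The check only runs once the final archive path of a given file is known, so nothing has to be validated against the glob pattern up front.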
I'd like to hear everyone's thoughts and ideas. Thanks in advance!