[GitHub] spark pull request: [SPARK-8406] [SQL] Adding UUID to output file ...

liancheng Sat, 20 Jun 2015 01:12:00 -0700

Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/6864#issuecomment-113728195
  
    @lianhuiwang Yeah, thanks for the reminder. We are also working on this 
issue, and it will be addressed in another PR. Appending jobs that use output 
committers like `DirectParquetOutputCommitter` are tricky to handle because 
they write directly to the target directory without using any temporary folder 
(which is very useful on S3, where file metadata operations and directory 
operations can be very slow). With this PR, however, the job-level UUID can be 
used to distinguish files written by different jobs.
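
    A minimal sketch of the idea, not the code in this PR: if every job embeds 
its own UUID in the names of the part files it writes, files from different 
append jobs can sit in the same target directory and still be told apart. The 
file name pattern and the helpers `part_file_name` and `group_by_job` are 
hypothetical and chosen only for illustration.

```python
import re
import uuid
from collections import defaultdict

# Hypothetical naming scheme for illustration only:
#   part-r-<zero-padded task id>-<job UUID><extension>
_PART_RE = re.compile(r"^part-r-(\d+)-([0-9a-f-]{36})\.")


def part_file_name(task_id, job_uuid, ext=".parquet"):
    """Build a part file name that carries the job-level UUID."""
    return f"part-r-{task_id:05d}-{job_uuid}{ext}"


def group_by_job(file_names):
    """Group part files by the job UUID embedded in their names.

    Names that do not match the hypothetical pattern are ignored.
    """
    groups = defaultdict(list)
    for name in file_names:
        match = _PART_RE.match(name)
        if match:
            groups[match.group(2)].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Two append jobs writing straight into the same directory.
    first_job = str(uuid.uuid4())
    second_job = str(uuid.uuid4())
    listing = [
        part_file_name(0, first_job),
        part_file_name(1, first_job),
        part_file_name(0, second_job),
    ]
    for job, files in group_by_job(listing).items():
        print(job, files)
```

    Both jobs write a task 0 file here, yet the names do not collide because 
the UUID differs per job.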



