[ https://issues.apache.org/jira/browse/CRUNCH-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309392#comment-14309392 ]
Tom White commented on CRUNCH-481:
----------------------------------

I'm not sure that changing the job ID is enough - for example FileOutputCommitter uses the task attempt ID to create directories, and I think it may be constructed from a different Job instance (so it wouldn't get the "decorated" ID). It would be good if the OutputCommitter implementations could take account of the output name, but I'm not sure how to do that for FileOutputCommitter - have a Crunch-specific one? (Doing it for Kite is not a problem.)

CompositeOutputCommitter already has some special casing for FileOutputCommitter, which shows that there is a problem here...

bq. I'm not very familiar with the committer logic, but for some reason this wasn't exposed when running against Hadoop 1.

The committer logic for the new MR API in Hadoop 1 has some limitations. For example, it is not called properly from the local job runner. For this reason, we don't use an output committer in Kite under Hadoop 1:
https://github.com/kite-sdk/kite/blob/master/kite-data/kite-data-mapreduce/src/main/java/org/kitesdk/data/mapreduce/DatasetKeyOutputFormat.java#L478-L481

> Support independent output committers for multiple outputs
> ----------------------------------------------------------
>
>                 Key: CRUNCH-481
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-481
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aniket Kulkarni
>            Assignee: Josh Wills
>            Priority: Minor
>             Fix For: 0.12.0
>
>         Attachments: CRUNCH-481-hadoop-2-compat.patch, CRUNCH-481.patch,
> CRUNCH-481.patch, CRUNCH-481.patch, CRUNCH-481c.patch
>
>
> I faced this issue while trying to write to Kite and HDFS in the same
> pipeline. A similar issue was logged for Kite[1][2].
> I was attempting to write a PCollection to Kite and a different PTable to
> HDFS as a text file. The write to Kite succeeded, however the write to HDFS
> only produced a _SUCCESS file with no text file.
> Commenting out the write to Kite seems to solve the issue and I can see the
> text file being written.
> [1] - https://issues.cloudera.org/browse/CDK-756
> [2] -
> http://mail-archives.apache.org/mod_mbox/crunch-dev/201401.mbox/%3ccaf-wd4qcue0toh3qewpdnnom3u786pvjlgh7t6go_abctpl...@mail.gmail.com%3E

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
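
For illustration only, the concern raised in the comment (that a committer built from a different Job instance would not see the "decorated" ID) can be sketched with a toy model. Everything here is an assumption made for the example: the class AttemptDirSketch, the helpers workDir and decorate, the "_temporary/<attempt id>" layout, and the sample ID are hypothetical and are not Hadoop's or Crunch's actual code or directory structure.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Toy model only: not Hadoop's or Crunch's actual classes or path layout. */
public class AttemptDirSketch {

    /** Hypothetical helper: derives a task's working directory from its attempt ID. */
    public static Path workDir(Path outputDir, String taskAttemptId) {
        return outputDir.resolve("_temporary").resolve(taskAttemptId);
    }

    /** Hypothetical "decoration": tags an ID with the name of the output it belongs to. */
    public static String decorate(String taskAttemptId, String outputName) {
        return taskAttemptId + "_" + outputName;
    }

    public static void main(String[] args) {
        Path out = Paths.get("/tmp/job-output");
        String attemptId = "attempt_0001_m_000000_0"; // made-up ID for the example

        // The writer is handed the decorated ID and writes under it...
        Path written = workDir(out, decorate(attemptId, "text"));
        // ...while a committer built from a different Job instance only sees the plain ID.
        Path committed = workDir(out, attemptId);

        System.out.println("writer dir:    " + written);
        System.out.println("committer dir: " + committed);
        System.out.println("same dir? " + written.equals(committed));
    }
}
```

In this model the two components derive different directories from what they each believe the attempt ID to be, so the committer would have nothing to promote. Whether that is what actually happens in Crunch depends on how the real committer obtains its ID, which the comment leaves open.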