[
https://issues.apache.org/jira/browse/AVRO-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ashish Nagavaram updated AVRO-1215:
-----------------------------------
Attachment: AVRO-1215-v3.patch
Removed an out.println line from the code.
> AvroMultipleOutputs not working when specifying baseOutputPath
> --------------------------------------------------------------
>
> Key: AVRO-1215
> URL: https://issues.apache.org/jira/browse/AVRO-1215
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.7.2
> Reporter: Matthew Hayes
> Assignee: Ashish Nagavaram
> Labels: avro, mapreduce
> Attachments: avro-1215.patch, AVRO-1215.patch, AVRO-1215-v2.patch,
> AVRO-1215-v3.patch
>
>
> I'm calling the write() method of AvroMultipleOutputs that takes a
> baseOutputPath. The reducer appears to hang as soon as it tries writing
> to a baseOutputPath value it has not already encountered. It then fails with:
> org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to
> create file ... because current leaseholder is trying to recreate file.
> I think the problem has to do with this line in AvroMultipleOutputs:
> {code}
> // get the record writer from context output format
> //FileOutputFormat.setOutputName(taskContext, baseFileName);
> {code}
> The corresponding line in the similar Hadoop code is not commented out,
> so I think the baseOutputPath is being ignored here. As a result, every
> record writer is created with the same path, which leads to the exception.
> Uncommenting this line does not work because the method is not visible
> from AvroMultipleOutputs. All the method does is set
> "mapreduce.output.basename", but setting that property directly doesn't
> work either.
> After digging through the Avro code I found that AvroOutputFormatBase uses
> "avro.mo.config.namedOutput" to create the path. If I replace the
> commented-out line with the following, it seems to work:
> {code}
> taskContext.getConfiguration().set("avro.mo.config.namedOutput",
> baseFileName);
> {code}
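
A minimal sketch of the behaviour described above, to make the collision
concrete. It does not use Hadoop or Avro: a plain Map stands in for the job
Configuration, and the class name, resolvePath(), pathFor(), the "part"
default, and the "-r-00000.avro" suffix are all hypothetical names chosen for
illustration. Only the "avro.mo.config.namedOutput" key comes from the report.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: none of these methods are Avro or Hadoop APIs.
final class NamedOutputPathSketch {
    // Key that, per the report, AvroOutputFormatBase reads to build the path.
    static final String NAMED_OUTPUT_KEY = "avro.mo.config.namedOutput";

    // Hypothetical stand-in for the output format's path construction.
    // The "part" default and the file suffix are assumptions.
    static String resolvePath(Map<String, String> conf, String outputDir) {
        String name = conf.getOrDefault(NAMED_OUTPUT_KEY, "part");
        return outputDir + "/" + name + "-r-00000.avro";
    }

    // Hypothetical stand-in for creating a record writer for one
    // baseOutputPath. With applyFix == false the base path is never put
    // into the configuration, mirroring the commented-out line.
    static String pathFor(String baseOutputPath, String outputDir, boolean applyFix) {
        Map<String, String> conf = new HashMap<>();
        if (applyFix) {
            conf.put(NAMED_OUTPUT_KEY, baseOutputPath);
        }
        return resolvePath(conf, outputDir);
    }
}
```

Without the fix, two different baseOutputPath values resolve to the same
file, which is consistent with the second create attempt failing while the
first writer still holds the lease. With the key set per writer, each
baseOutputPath gets its own file.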