[ https://issues.apache.org/jira/browse/CRUNCH-228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dave Beech updated CRUNCH-228:
------------------------------

    Attachment: CRUNCH-228.patch

Here's a patch. Basically I've replaced the special case addition of the '.avro' extension in FileTargetImpl with a more general one.

This did cause an integration test failure regarding Trevni files. I'm not too familiar with Trevni and didn't want to take a risk, so I added a change in TrevniKeyTarget to preserve the original behaviour rather than update the filenames in the test. I think changing the test would be cleaner, so let me know if you agree this is a better option.

> FileTargetImpl cuts off extensions of output files
> --------------------------------------------------
>
>                 Key: CRUNCH-228
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-228
>             Project: Crunch
>          Issue Type: Bug
>            Reporter: Dave Beech
>         Attachments: CRUNCH-228.patch
>
>
> Compressed files written by mapreduce often have extensions, e.g. '.deflate',
> '.gz' or '.snappy'. Crunch currently cuts off these extensions during the
> move of output files to their final destination, which is fine in some
> circumstances but causes problems in others.
> For example, running 'hadoop fs -text myfile.deflate' will show the
> decompressed text on screen but running 'hadoop fs -text myfile' on a
> deflate-compressed file with no extension prints unreadable compressed data
> instead.
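
For readers without the patch to hand, the idea of replacing the '.avro' special case with a general rule can be sketched roughly as below. This is only an illustration and not the code in CRUNCH-228.patch: the class name, the method names and the 'out0-...' / 'part-...' file names are all hypothetical, and the real FileTargetImpl works with Hadoop Path objects rather than plain strings.

```java
// Hypothetical sketch: carry whatever extension the source file has over to
// the destination name, instead of re-adding only '.avro' as a special case.
final class ExtensionPreservingNames {

    // Returns the extension of a file name including the leading dot,
    // or an empty string when the name has no extension.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        // A dot at index 0 marks a hidden file such as ".crc", not an extension.
        return dot > 0 ? fileName.substring(dot) : "";
    }

    // Builds the final file name from an assumed destination base name
    // plus the extension found on the source file.
    static String destName(String sourceName, String destBase) {
        return destBase + extensionOf(sourceName);
    }

    public static void main(String[] args) {
        // prints "part-m-00000.deflate"
        System.out.println(destName("out0-m-00000.deflate", "part-m-00000"));
        // prints "part-r-00000.gz"
        System.out.println(destName("out0-r-00000.gz", "part-r-00000"));
        // prints "part-m-00000" (no extension to preserve)
        System.out.println(destName("out0-m-00000", "part-m-00000"));
    }
}
```

With a rule like this, a deflate-compressed output keeps its '.deflate' suffix after the move, so 'hadoop fs -text' can still pick the right codec from the file name.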