[ https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651619#comment-13651619 ]
Arun C Murthy commented on MAPREDUCE-5038:
------------------------------------------
[~hagleitn] Thanks for pointing this out, along with the test. I verified Hive
breaks only with this patch too ([~sandyr] see the test Gunther pointed to).
[~sandyr] I'm mystified too - but Hive has used
o.a.h.mapred.CombineFileInputFormat forever and that test (archive_multi.q) has
been around since 2011 (HIVE-2278). Also, Hive has had support for HAR for a
long while too (ARCHIVE syntax). Looks like we need to investigate this further
to see why this broke Hive... can you please look?
Since this is not in 1.2 (currently under vote), I'm not too worried. We can
revert this patch if we don't come up with a fix quickly.
> old API CombineFileInputFormat missing fixes that are in new API
> -----------------------------------------------------------------
>
> Key: MAPREDUCE-5038
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 1.1.1
> Reporter: Sandy Ryza
> Assignee: Sandy Ryza
> Fix For: 1.3.0
>
> Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch,
> MAPREDUCE-5038-revised-1.patch, MAPREDUCE-5038-revised-1.patch,
> MAPREDUCE-5038-revised.patch
>
>
> The following changes patched the CombineFileInputFormat in mapreduce, but
> neglected the one in mapred:
> MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files
> MAPREDUCE-2021 solved returning duplicate hostnames in split locations
> MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default FS
> In trunk this is not an issue as the one in mapred extends the one in
> mapreduce.