[ https://issues.apache.org/jira/browse/MAPREDUCE-5537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
alex.lv updated MAPREDUCE-5537:
-------------------------------
Tags: lvxin.patch
Affects Version/s: 0.20.205.0
Release Note: I use hadoop-0.20.2-cdh3u6; the patch is in the content of the comment.
Hadoop Flags: Reviewed
Status: Patch Available (was: Open)
--- ./src/mapred/org/apache/hadoop/mapred/lib/CombineFileInputFormat.java 2013-03-21 02:48:42.000000000 +0800
+++ ./CombineFileInputFormat.java 2013-09-26 11:57:11.958730087 +0800
@@ -201,6 +201,11 @@
     // times, one time each for each pool in the next loop.
     List<Path> newpaths = new LinkedList<Path>();
     for (int i = 0; i < paths.length; i++) {
+      if (paths[i].getName().endsWith("lzo.index"))
+      {
+        LOG.warn("paths[" + i + "] = " + paths[i] + " is the lzo index file!");
+        continue;
+      }
       FileSystem fs = paths[i].getFileSystem(job);
       Path p = fs.makeQualified(paths[i]);
       newpaths.add(p);
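
The guard in the patch above can be tried in isolation. The following is only a minimal sketch, not the patched Hadoop class: it works on plain path strings instead of org.apache.hadoop.fs.Path, and the class name LzoIndexFilter, its method names, and the example paths are hypothetical. It also assumes the stricter suffix ".lzo.index" (with the leading dot), whereas the patch matches "lzo.index".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone illustration of the filtering idea in the patch.
public class LzoIndexFilter {
    // Assumed suffix for LZO index files; the patch itself checks "lzo.index".
    static final String LZO_INDEX_SUFFIX = ".lzo.index";

    // Returns true when the last path component ends with the index suffix.
    static boolean isLzoIndexFile(String path) {
        String name = path.substring(path.lastIndexOf('/') + 1);
        return name.endsWith(LZO_INDEX_SUFFIX);
    }

    // Copies the input paths, skipping index files so that they are
    // never treated as data when splits are built.
    static List<String> dropLzoIndexFiles(List<String> paths) {
        List<String> kept = new ArrayList<>();
        for (String p : paths) {
            if (isLzoIndexFile(p)) {
                continue; // same effect as the "continue" in the patch
            }
            kept.add(p);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical table directory listing.
        List<String> input = Arrays.asList(
                "/warehouse/t1/part-00000.lzo",
                "/warehouse/t1/part-00000.lzo.index",
                "/warehouse/t1/part-00001.lzo");
        System.out.println(dropLzoIndexFiles(input));
    }
}
```

Checking only the file name, as the patch does with getName(), keeps a directory whose name happens to end in the suffix from excluding every file beneath it.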
> Hive returns different results with and without index when
> hive.hadoop.supports.splittable.combineinputformat=true
> ------------------------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5537
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5537
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.20.205.0
> Reporter: alex.lv
> Assignee: alex.lv
>
> Environment:
> hive-0.8.1
> hadoop-0.20.2-cdh3u6
> Symptom:
> I use hive-0.8.1 to run the query:
> select count(*) from t1;
> The table t1 is stored in LZO format. Its storage information is as follows:
> # Storage Information
>
> SerDe Library:
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>
> InputFormat:
> com.hadoop.mapred.DeprecatedLzoTextInputFormat
>
> OutputFormat:
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> hive.hadoop.supports.splittable.combineinputformat is set to true.
> When I index the table t1, the result is 265329.
> When I remove the index from t1, the result is 265325.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira