[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

alex.lv updated MAPREDUCE-5537:
-------------------------------

                 Tags: lvxin.patch
    Affects Version/s: 0.20.205.0
         Release Note: I use hadoop-0.20.2-cdh3u6, and the content of the comment is the patch
         Hadoop Flags: Reviewed
               Status: Patch Available  (was: Open)

--- ./src/mapred/org/apache/hadoop/mapred/lib/CombineFileInputFormat.java       2013-03-21 02:48:42.000000000 +0800
+++ ./CombineFileInputFormat.java       2013-09-26 11:57:11.958730087 +0800
@@ -201,6 +201,11 @@
     // times, one time each for each pool in the next loop.
     List<Path> newpaths = new LinkedList<Path>();
     for (int i = 0; i < paths.length; i++) {
+      if (paths[i].getName().endsWith("lzo.index")) {
+        // Skip LZO index files so they are not read as input data.
+        LOG.warn("paths[" + i + "] = " + paths[i] + " is an LZO index file; skipping");
+        continue;
+      }
       FileSystem fs = paths[i].getFileSystem(job);
       Path p = fs.makeQualified(paths[i]);
       newpaths.add(p);

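The filter added by the patch can be tried outside Hadoop with a small
standalone sketch. It is only an illustration, not the patched class: the
class name LzoIndexFilter, its helper methods, and the sample paths are
hypothetical, and plain strings stand in for org.apache.hadoop.fs.Path. Only
the "lzo.index" suffix test is taken from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LzoIndexFilter {
    // Same suffix the patch tests for.
    public static final String INDEX_SUFFIX = "lzo.index";

    // Final component of a slash-separated path; a stand-in (assumption)
    // for what Path.getName() returns in the patched code.
    public static String fileName(String path) {
        int slash = path.lastIndexOf('/');
        return slash < 0 ? path : path.substring(slash + 1);
    }

    // True when the file name ends with the index suffix.
    public static boolean isLzoIndexFile(String path) {
        return fileName(path).endsWith(INDEX_SUFFIX);
    }

    // Copies the input list, leaving out index files, as the patched loop does
    // when it builds newpaths.
    public static List<String> dropIndexFiles(List<String> paths) {
        List<String> kept = new ArrayList<String>();
        for (String p : paths) {
            if (isLzoIndexFile(p)) {
                continue; // index file: not table data
            }
            kept.add(p);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical paths, for illustration only.
        List<String> input = Arrays.asList(
            "/warehouse/t1/part-00000.lzo",
            "/warehouse/t1/part-00000.lzo.index",
            "/warehouse/t1/part-00001.lzo");
        System.out.println(dropIndexFiles(input));
    }
}
```

Matching on the file name rather than the whole path string means a
directory whose name happens to end in "lzo.index" does not cause the data
files under it to be skipped.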
                
> hive return different results with and without index when 
> hive.hadoop.supports.splittable.combineinputformat =true
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5537
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5537
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.205.0
>            Reporter: alex.lv
>            Assignee: alex.lv
>
> Environment:
> hive-0.8.1
> hadoop-0.20.2-cdh3u6
> Symptom:
> I use hive-0.8.1 to execute the query:
> select count(*) from table t1;
> The table t1 is stored in LZO format, with the following storage information:
> # Storage Information
> SerDe Library:  org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat:    com.hadoop.mapred.DeprecatedLzoTextInputFormat
> OutputFormat:   org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> hive.hadoop.supports.splittable.combineinputformat is set to true.
> When I index table t1, the result is 265329.
> When I remove the index from t1, the result is 265325.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
