Qinghui Xu created HIVE-20254:
---------------------------------

             Summary: CheckNonCombinablePathCallable is buggy
                 Key: HIVE-20254
                 URL: https://issues.apache.org/jira/browse/HIVE-20254
             Project: Hive
          Issue Type: Bug
            Reporter: Qinghui Xu


CombineHiveInputFormat lets users prevent some of their inputs from being 
combined (by implementing AvoidSplitCombination).

We spotted a problem with this when our query tried to read a lot of partitions 
(more than 100). When there are more than 100 input paths, the combinability 
check is run in parallel:
 * the input path array is divided into several chunks (each chunk with no more 
than 100 paths)
 * each chunk is submitted to a CheckNonCombinablePathCallable
 * each CheckNonCombinablePathCallable returns the set of indices of the paths 
that must not be combined

The problem is that CheckNonCombinablePathCallable returns a set of relative 
indices (the index inside the chunk) instead of absolute indices (the position 
in the full input path array). As a result, the returned indices are always 
smaller than 100, so any path at position 100 or beyond in the array is never 
taken into account when deciding which inputs must not be combined.
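
To illustrate the difference between relative and absolute indices, here is a 
minimal, self-contained sketch. It is not the Hive code: the class 
ChunkedCheckSketch, the methods chunkTask and check, the thread pool size and 
the "absolute" flag are all hypothetical names chosen for this example, and a 
plain Predicate stands in for the AvoidSplitCombination check. Only the chunk 
size of 100 comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical stand-in for the chunked combinability check; not the Hive class.
class ChunkedCheckSketch {
    static final int CHUNK_SIZE = 100;

    // Checks paths[start, start + length) and returns the indices of the
    // paths that the predicate marks as non-combinable.
    static Callable<Set<Integer>> chunkTask(String[] paths, int start, int length,
                                            Predicate<String> nonCombinable,
                                            boolean absolute) {
        return () -> {
            Set<Integer> result = new TreeSet<>();
            for (int i = 0; i < length; i++) {
                if (nonCombinable.test(paths[start + i])) {
                    // i alone is the index inside the chunk (the reported bug);
                    // start + i is the position in the full array.
                    result.add(absolute ? start + i : i);
                }
            }
            return result;
        };
    }

    // Splits the array into chunks, runs one task per chunk, merges the results.
    static Set<Integer> check(String[] paths, Predicate<String> nonCombinable,
                              boolean absolute) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // assumed pool size
        try {
            List<Future<Set<Integer>>> futures = new ArrayList<>();
            for (int start = 0; start < paths.length; start += CHUNK_SIZE) {
                int length = Math.min(CHUNK_SIZE, paths.length - start);
                futures.add(pool.submit(
                        chunkTask(paths, start, length, nonCombinable, absolute)));
            }
            Set<Integer> merged = new TreeSet<>();
            for (Future<Set<Integer>> f : futures) {
                merged.addAll(f.get());
            }
            return merged;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With 250 paths of which only those at positions 150 and 249 are 
non-combinable, calling check with absolute set to false yields the indices 49 
and 50, which point at unrelated paths in the first chunk, while absolute set 
to true yields 150 and 249.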



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
