Qinghui Xu created HIVE-20254:
---------------------------------

             Summary: CheckNonCombinablePathCallable is buggy
                 Key: HIVE-20254
                 URL: https://issues.apache.org/jira/browse/HIVE-20254
             Project: Hive
          Issue Type: Bug
            Reporter: Qinghui Xu

CombineHiveInputFormat lets users prevent some of their inputs from being combined (by implementing AvoidSplitCombination).

We spotted a problem with this when a query tries to read a lot of partitions (more than 100). When there are more than 100 input paths, the combinability check is run in parallel:
* the input path array is divided into several chunks (each chunk with no more than 100 paths)
* each chunk is submitted to a CheckNonCombinablePathCallable
* each CheckNonCombinablePathCallable returns the set of indices of the paths that must not be combined

The problem is that CheckNonCombinablePathCallable returns relative indices (positions inside the chunk) instead of absolute indices (positions in the full path array). The returned indices are therefore always smaller than 100, so paths at position 100 or beyond in the array are never taken into account when deciding which inputs to avoid combining.
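
To make the index mix-up concrete, here is a minimal, self-contained sketch of the chunked check described above. It is not the Hive source: the class ChunkIndexSketch, the CheckChunk callable, the Predicate standing in for AvoidSplitCombination, and the example paths are all hypothetical names chosen for illustration. The `absolute` flag switches between the reported behaviour (chunk-relative index) and what is presumably intended (index into the full array).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical illustration only; names and structure are assumptions, not Hive code.
public class ChunkIndexSketch {
    static final int CHUNK_SIZE = 100;

    /** Checks paths[start, start + length) and reports which ones must not be combined. */
    static class CheckChunk implements Callable<Set<Integer>> {
        private final String[] paths;
        private final int start;
        private final int length;
        private final Predicate<String> avoidCombine; // stand-in for AvoidSplitCombination
        private final boolean absolute;

        CheckChunk(String[] paths, int start, int length,
                   Predicate<String> avoidCombine, boolean absolute) {
            this.paths = paths;
            this.start = start;
            this.length = length;
            this.avoidCombine = avoidCombine;
            this.absolute = absolute;
        }

        @Override
        public Set<Integer> call() {
            Set<Integer> indices = new HashSet<>();
            for (int i = 0; i < length; i++) {
                if (avoidCombine.test(paths[start + i])) {
                    // Relative: i (always < CHUNK_SIZE). Absolute: start + i.
                    indices.add(absolute ? start + i : i);
                }
            }
            return indices;
        }
    }

    /** Splits the paths into chunks, checks them in parallel, and merges the index sets. */
    static Set<Integer> nonCombinable(String[] paths, Predicate<String> avoidCombine,
                                      boolean absolute) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Set<Integer>>> futures = new ArrayList<>();
            for (int start = 0; start < paths.length; start += CHUNK_SIZE) {
                int length = Math.min(CHUNK_SIZE, paths.length - start);
                futures.add(pool.submit(
                        new CheckChunk(paths, start, length, avoidCombine, absolute)));
            }
            Set<Integer> merged = new HashSet<>();
            for (Future<Set<Integer>> future : futures) {
                merged.addAll(future.get());
            }
            return merged;
        } catch (Exception e) {
            throw new IllegalStateException("combinability check failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] paths = new String[250];
        for (int i = 0; i < paths.length; i++) {
            paths[i] = "/warehouse/t/part=" + i; // made-up partition paths
        }
        // Made-up rule: only partition 150 must not be combined.
        Predicate<String> avoid = p -> p.endsWith("part=150");

        System.out.println(nonCombinable(paths, avoid, false)); // prints [50]
        System.out.println(nonCombinable(paths, avoid, true));  // prints [150]
    }
}
```

With relative indices, the path at position 150 is reported as 50, so the caller treats the wrong path as non-combinable and the real one is combined anyway. Adding the chunk's start offset before returning the index is one way to avoid this in the sketch; whether the actual fix should live in the callable or in the code that merges the results is left open here.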