Re: Review Request 16747: Add file pruning into Hive
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16747/
---

(Updated July 4, 2014, 12:13 a.m.)

Review request for hive.

Changes
---

Fixed test failures.

Bugs: HIVE-1662
    https://issues.apache.org/jira/browse/HIVE-1662

Repository: hive-git

Description
---

Hive now supports a filename virtual column. If a file name filter is present in a query, Hive should add only the files that pass the filter to the input paths.

Diffs (updated)
---

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 0256ec9
  itests/qtest/testconfiguration.properties 1462ecd
  itests/util/src/main/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPathName.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/Context.java abc4290
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java a80feb9
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 622ee45
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 29d59a4
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 1095173
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java a9869f7
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java 949bcfb
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 683618f
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java c3a83d4
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 61cc874
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/ColumnStatisticsImpl.java 409de7c
  ql/src/java/org/apache/hadoop/hive/ql/metadata/FilePruningPredicateHandler.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStoragePredicateHandler.java 7d7c764
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/AbstractJoinTaskDispatcher.java 33ef581
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 5c6751c
  ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 703c9d1
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 399f92a
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java f293c43
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 9945dea
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java f3203bf
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 699b476
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java e7db370
  ql/src/test/org/apache/hadoop/hive/ql/io/TestSymlinkTextInputFormat.java 41243fe
  ql/src/test/queries/clientpositive/file_pruning.q PRE-CREATION
  ql/src/test/results/clientnegative/index_compact_entry_limit.q.out 85614ca
  ql/src/test/results/clientnegative/index_compact_size_limit.q.out 7c6bb0a
  ql/src/test/results/clientpositive/file_pruning.q.out PRE-CREATION
  ql/src/test/results/clientpositive/tez/file_pruning.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/16747/diff/

Testing
---

Thanks,

Navis Ryu
Re: Review Request 16747: Add file pruning into Hive
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16747/
---

(Updated June 24, 2014, 9:33 a.m.)

Review request for hive.

Changes
---

Rebased to trunk.

Bugs: HIVE-1662
    https://issues.apache.org/jira/browse/HIVE-1662

Repository: hive-git

Description
---

Hive now supports a filename virtual column. If a file name filter is present in a query, Hive should add only the files that pass the filter to the input paths.

Diffs (updated)
---

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7932a3d
  itests/util/src/main/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPathName.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java a80feb9
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 5e5cf97
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 179ad29
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java 2ce4dbd
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 683618f
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java c3a83d4
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 61cc874
  ql/src/java/org/apache/hadoop/hive/ql/metadata/FilePruningPredicateHandler.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStoragePredicateHandler.java 7d7c764
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/AbstractJoinTaskDispatcher.java 33ef581
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 5c6751c
  ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 703c9d1
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cb284d7
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java f293c43
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 9945dea
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java f3203bf
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 699b476
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 7aaf455
  ql/src/test/queries/clientpositive/file_pruning.q PRE-CREATION
  ql/src/test/results/clientpositive/file_pruning.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/16747/diff/

Testing
---

Thanks,

Navis Ryu
Re: Review Request 16747: Add file pruning into Hive
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16747/
---

(Updated June 25, 2014, 1:27 a.m.)

Review request for hive.

Changes
---

Fixed Tez test failures.

Bugs: HIVE-1662
    https://issues.apache.org/jira/browse/HIVE-1662

Repository: hive-git

Description
---

Hive now supports a filename virtual column. If a file name filter is present in a query, Hive should add only the files that pass the filter to the input paths.

Diffs (updated)
---

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7932a3d
  itests/qtest/pom.xml dc4519a
  itests/util/src/main/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPathName.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java a80feb9
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 622ee45
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 5e5cf97
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 179ad29
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java 2ce4dbd
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java 949bcfb
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 683618f
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java c3a83d4
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 61cc874
  ql/src/java/org/apache/hadoop/hive/ql/metadata/FilePruningPredicateHandler.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStoragePredicateHandler.java 7d7c764
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/AbstractJoinTaskDispatcher.java 33ef581
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 5c6751c
  ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 703c9d1
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 83d09c0
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java f293c43
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 9945dea
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java f3203bf
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 699b476
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 7aaf455
  ql/src/test/org/apache/hadoop/hive/ql/io/TestSymlinkTextInputFormat.java 41243fe
  ql/src/test/queries/clientpositive/file_pruning.q PRE-CREATION
  ql/src/test/results/clientpositive/file_pruning.q.out PRE-CREATION
  ql/src/test/results/clientpositive/tez/file_pruning.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/16747/diff/

Testing
---

Thanks,

Navis Ryu
Re: Review Request 16747: Add file pruning into Hive
> On Jan. 10, 2014, 6:02 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java, line 2868
> > https://reviews.apache.org/r/16747/diff/1/?file=419383#file419383line2868
> >
> >     Why make it a HashSet now? Or should it always have been one?
>
> Navis Ryu wrote:
>     I'm a little confused by this. Is it not possible to have multiple paths for an alias? I think that kind of scenario is not supported by current Hive. Reverting to a list.

Just checking... if it makes sense, it's OK.

> On Jan. 10, 2014, 6:02 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java, line 2921
> > https://reviews.apache.org/r/16747/diff/1/?file=419383#file419383line2921
> >
> >     Nit: could return Collection from the method if it's not hard to change.
>
> Navis Ryu wrote:
>     It's used by other parts of the code, including Tez. Would it be better to leave it as-is?

Probably better to keep it as is, then... thanks.

> On Jan. 10, 2014, 6:02 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java, line 559
> > https://reviews.apache.org/r/16747/diff/1/?file=419395#file419395line559
> >
> >     Why is it recreating the list? Maybe use addAll if it is needed?
>
> Navis Ryu wrote:
>     To convert String to Path?

Ah, I see. Thanks.

- Sergey

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16747/#review31518
---

On Jan. 13, 2014, 4:33 a.m., Navis Ryu wrote:
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16747/
> ---
>
> (Updated Jan. 13, 2014, 4:33 a.m.)
>
> Review request for hive.
>
> Bugs: HIVE-1662
>     https://issues.apache.org/jira/browse/HIVE-1662
>
> Repository: hive-git
>
> Description
> ---
>
> Hive now supports a filename virtual column. If a file name filter is present in a query, Hive should add only the files that pass the filter to the input paths.
>
> Diffs
> ---
>
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 16d54c6
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 96a78fc
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java fccea89
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 5511bca
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java a7e2253
>   ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java e66c22c
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 4be56f3
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 99172d4
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/FilePrunningPredicateHandler.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStoragePredicateHandler.java 9f35575
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/AbstractJoinTaskDispatcher.java 33ef581
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 5c6751c
>   ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 76f5a31
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java 96c8d89
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 9929275
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java f3203bf
>   ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java 6ee6bee
>   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 9c35890
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1
>   ql/src/test/queries/clientpositive/file_pruning.q PRE-CREATION
>   ql/src/test/results/clientpositive/file_pruning.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/16747/diff/
>
> Testing
> ---
>
> Thanks,
>
> Navis Ryu
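
The exchange about MapWork.java (rebuilding the list in order to convert String to Path) can be pictured with a small sketch. This is not the code from the patch: the class and method names are hypothetical, and `java.nio.file.Path` stands in for Hadoop's `org.apache.hadoop.fs.Path` so that the example compiles without Hadoop on the classpath. It only shows why `addAll` is not enough when the element type changes.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not taken from MapWork.java. */
final class PathConversionSketch {
    private PathConversionSketch() {}

    /**
     * Builds a new list because each element changes type (String to Path).
     * addAll cannot be used here: a List<Path> does not accept String elements.
     * java.nio.file.Path is an assumed stand-in for org.apache.hadoop.fs.Path.
     */
    static List<Path> toPaths(List<String> rawPaths) {
        List<Path> paths = new ArrayList<>(rawPaths.size());
        for (String raw : rawPaths) {
            paths.add(Paths.get(raw));
        }
        return paths;
    }
}
```

If both lists held the same element type, a single `addAll` call would do, which is presumably what the review comment had in mind.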
Re: Review Request 16747: Add file pruning into Hive
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16747/
---

(Updated Jan. 14, 2014, 1:19 a.m.)

Review request for hive.

Changes
---

Fixed test failures (should return a null summary for a non-existent path).
Check all files in the path recursively.

Bugs: HIVE-1662
    https://issues.apache.org/jira/browse/HIVE-1662

Repository: hive-git

Description
---

Hive now supports a filename virtual column. If a file name filter is present in a query, Hive should add only the files that pass the filter to the input paths.

Diffs (updated)
---

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 16d54c6
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 96a78fc
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java fccea89
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 5511bca
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java a7e2253
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java e66c22c
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 4be56f3
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 99172d4
  ql/src/java/org/apache/hadoop/hive/ql/metadata/FilePrunningPredicateHandler.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStoragePredicateHandler.java 9f35575
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/AbstractJoinTaskDispatcher.java 33ef581
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 5c6751c
  ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 76f5a31
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java 96c8d89
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 9929275
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java f3203bf
  ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java 6ee6bee
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 9c35890
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1
  ql/src/test/queries/clientpositive/file_pruning.q PRE-CREATION
  ql/src/test/results/clientpositive/file_pruning.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/16747/diff/

Testing
---

Thanks,

Navis Ryu
Re: Review Request 16747: Add file pruning into Hive
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16747/
---

(Updated Jan. 13, 2014, 4:33 a.m.)

Review request for hive.

Changes
---

Fixed test failures.
Addressed comments.

Bugs: HIVE-1662
    https://issues.apache.org/jira/browse/HIVE-1662

Repository: hive-git

Description
---

Hive now supports a filename virtual column. If a file name filter is present in a query, Hive should add only the files that pass the filter to the input paths.

Diffs (updated)
---

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 16d54c6
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 96a78fc
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java fccea89
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 5511bca
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java a7e2253
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java e66c22c
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 4be56f3
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 99172d4
  ql/src/java/org/apache/hadoop/hive/ql/metadata/FilePrunningPredicateHandler.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStoragePredicateHandler.java 9f35575
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/AbstractJoinTaskDispatcher.java 33ef581
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 5c6751c
  ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 76f5a31
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java 96c8d89
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 9929275
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java f3203bf
  ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java 6ee6bee
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 9c35890
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1
  ql/src/test/queries/clientpositive/file_pruning.q PRE-CREATION
  ql/src/test/results/clientpositive/file_pruning.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/16747/diff/

Testing
---

Thanks,

Navis Ryu
Re: Review Request 16747: Add file pruning into Hive
> On Jan. 10, 2014, 6:02 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java, line 2868
> > https://reviews.apache.org/r/16747/diff/1/?file=419383#file419383line2868
> >
> >     Why make it a HashSet now? Or should it always have been one?

I'm a little confused by this. Is it not possible to have multiple paths for an alias? I think that kind of scenario is not supported by current Hive. Reverting to a list.

> On Jan. 10, 2014, 6:02 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java, line 2921
> > https://reviews.apache.org/r/16747/diff/1/?file=419383#file419383line2921
> >
> >     Nit: could return Collection from the method if it's not hard to change.

It's used by other parts of the code, including Tez. Would it be better to leave it as-is?

> On Jan. 10, 2014, 6:02 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java, line 559
> > https://reviews.apache.org/r/16747/diff/1/?file=419395#file419395line559
> >
> >     Why is it recreating the list? Maybe use addAll if it is needed?

To convert String to Path?

> On Jan. 10, 2014, 6:02 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java, line 695
> > https://reviews.apache.org/r/16747/diff/1/?file=419395#file419395line695
> >
> >     Is it possible to use 3 proper fields?

Done.

> On Jan. 10, 2014, 6:02 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java, line 452
> > https://reviews.apache.org/r/16747/diff/1/?file=419387#file419387line452
> >
> >     What is the form of the path here? Just checking, in case it contains a protocol prefix, or may start with //.

I cannot remember the exact context of this method, but it was for supporting an old CDH Hadoop (CDH3v1?), which replaced a directory with the files in it. Let's remove this and see what happens in the tests.

> On Jan. 10, 2014, 6:02 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java, line 583
> > https://reviews.apache.org/r/16747/diff/1/?file=419395#file419395line583
> >
> >     Javadoc for these methods? Here and above/below, at least a little description if not params :)

Done.

- Navis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16747/#review31518
---

On Jan. 9, 2014, 2:19 a.m., Navis Ryu wrote:
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16747/
> ---
>
> (Updated Jan. 9, 2014, 2:19 a.m.)
>
> Review request for hive.
>
> Bugs: HIVE-1662
>     https://issues.apache.org/jira/browse/HIVE-1662
>
> Repository: hive-git
>
> Description
> ---
>
> Hive now supports a filename virtual column. If a file name filter is present in a query, Hive should add only the files that pass the filter to the input paths.
>
> Diffs
> ---
>
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfd539
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 96a78fc
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 7dc3d59
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 42d764d
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java a7e2253
>   ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java e66c22c
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 4be56f3
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 99172d4
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/FilePrunningPredicateHandler.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStoragePredicateHandler.java 9f35575
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/AbstractJoinTaskDispatcher.java 33ef581
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 5c6751c
>   ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 76f5a31
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java 96c8d89
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 9929275
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java f3203bf
>   ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java da1437c
>   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 9c35890
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1
>   ql/src/test/queries/clientpositive/file_pruning.q PRE-CREATION
>   ql/src/test/results/clientpositive/file_pruning.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/16747/diff/
>
> Testing
> ---
>
> Thanks,
>
> Navis Ryu
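
The HashSet-versus-list question in this reply comes down to whether repeated paths should be collapsed. The following is a generic Java sketch of that difference, not code from Utilities.java; the class and method names are hypothetical, and plain strings stand in for paths.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; not taken from Utilities.java. */
final class PathCollectionSketch {
    private PathCollectionSketch() {}

    /** A list keeps every entry, including repeated paths, in insertion order. */
    static List<String> collectAsList(String... paths) {
        return new ArrayList<>(Arrays.asList(paths));
    }

    /** A set drops repeated paths; LinkedHashSet also keeps first-seen order. */
    static Set<String> collectAsSet(String... paths) {
        return new LinkedHashSet<>(Arrays.asList(paths));
    }
}
```

If the same path can never be collected twice, as the reply suggests for the current code, both collections hold the same elements and the list is the simpler choice.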
Re: Review Request 16747: Add file pruning into Hive
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16747/#review31518
---

ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
    https://reviews.apache.org/r/16747/#comment60028

    Why make it a HashSet now? Or should it always have been one?

ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
    https://reviews.apache.org/r/16747/#comment60029

    Nit: could return Collection from the method if it's not hard to change.

ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java
    https://reviews.apache.org/r/16747/#comment60030

    What is the form of the path here? Just checking, in case it contains a protocol prefix, or may start with //.

ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java
    https://reviews.apache.org/r/16747/#comment60031

    Why is it recreating the list? Maybe use addAll if it is needed?

ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java
    https://reviews.apache.org/r/16747/#comment60032

    Javadoc for these methods? Here and above/below, at least a little description if not params :)

ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java
    https://reviews.apache.org/r/16747/#comment60033

    Is it possible to use 3 proper fields?

- Sergey Shelukhin

On Jan. 9, 2014, 2:19 a.m., Navis Ryu wrote:
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16747/
> ---
>
> (Updated Jan. 9, 2014, 2:19 a.m.)
>
> Review request for hive.
>
> Bugs: HIVE-1662
>     https://issues.apache.org/jira/browse/HIVE-1662
>
> Repository: hive-git
>
> Description
> ---
>
> Hive now supports a filename virtual column. If a file name filter is present in a query, Hive should add only the files that pass the filter to the input paths.
>
> Diffs
> ---
>
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfd539
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 96a78fc
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 7dc3d59
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 42d764d
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java a7e2253
>   ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java e66c22c
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 4be56f3
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 99172d4
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/FilePrunningPredicateHandler.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStoragePredicateHandler.java 9f35575
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/AbstractJoinTaskDispatcher.java 33ef581
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 5c6751c
>   ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 76f5a31
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java 96c8d89
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 9929275
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java f3203bf
>   ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java da1437c
>   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 9c35890
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1
>   ql/src/test/queries/clientpositive/file_pruning.q PRE-CREATION
>   ql/src/test/results/clientpositive/file_pruning.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/16747/diff/
>
> Testing
> ---
>
> Thanks,
>
> Navis Ryu
Review Request 16747: Add file pruning into Hive
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16747/
---

Review request for hive.

Bugs: HIVE-1662
    https://issues.apache.org/jira/browse/HIVE-1662

Repository: hive-git

Description
---

Hive now supports a filename virtual column. If a file name filter is present in a query, Hive should add only the files that pass the filter to the input paths.

Diffs
---

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfd539
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 96a78fc
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 7dc3d59
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 42d764d
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java a7e2253
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java e66c22c
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 4be56f3
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 99172d4
  ql/src/java/org/apache/hadoop/hive/ql/metadata/FilePrunningPredicateHandler.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStoragePredicateHandler.java 9f35575
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/AbstractJoinTaskDispatcher.java 33ef581
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 5c6751c
  ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 76f5a31
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java 96c8d89
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 9929275
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java f3203bf
  ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java da1437c
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 9c35890
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1
  ql/src/test/queries/clientpositive/file_pruning.q PRE-CREATION
  ql/src/test/results/clientpositive/file_pruning.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/16747/diff/

Testing
---

Thanks,

Navis Ryu
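
The idea in the description (add only the files whose names pass the query's filter to the input paths) can be sketched in a few lines of plain Java. This is a minimal illustration under stated assumptions, not the patch itself: `FilePruningSketch`, `fileName`, and `prune` are hypothetical names, paths are plain strings, and the filter is an ordinary `Predicate<String>` rather than a Hive predicate expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical illustration only; not the FilePruningPredicateHandler from the patch. */
final class FilePruningSketch {
    private FilePruningSketch() {}

    /** Returns the last path component, e.g. "000000_0" for "/warehouse/t/000000_0". */
    static String fileName(String path) {
        int slash = path.lastIndexOf('/');
        return slash < 0 ? path : path.substring(slash + 1);
    }

    /**
     * Keeps only the candidate paths whose file name satisfies the filter.
     * The surviving paths are what would be handed on as input paths.
     */
    static List<String> prune(List<String> candidates, Predicate<String> nameFilter) {
        List<String> kept = new ArrayList<>();
        for (String path : candidates) {
            if (nameFilter.test(fileName(path))) {
                kept.add(path);
            }
        }
        return kept;
    }
}
```

For example, calling `prune` with a filter such as `name -> name.startsWith("000000")` would leave only the files with that prefix, so the remaining files are never opened by the job.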