----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/38796/#review100855 -----------------------------------------------------------
I mainly looked through the change in planning side. You may have another person to look at the execution side change. The planning side overall looks good to me. Have couple of comments below. contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/ConvertHiveParquetScanToDrillParquetScan.java (line 89) <https://reviews.apache.org/r/38796/#comment158101> You may consider put the checking of option when adding this rule into Drill's rule set. See [1]. Doing this will save the overhead to matching this rule. [1] https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java#L229 contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/ConvertHiveParquetScanToDrillParquetScan.java (line 102) <https://reviews.apache.org/r/38796/#comment158102> partitions.size() ==0 would be covered by line 114 (the block 107-112 will be skipped)? contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/ConvertHiveParquetScanToDrillParquetScan.java (line 128) <https://reviews.apache.org/r/38796/#comment158104> the format seems to expect 3 inputs; is "e" unused? contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/ConvertHiveParquetScanToDrillParquetScan.java (line 176) <https://reviews.apache.org/r/38796/#comment158108> I feel the block of checking "*" seems unnecesary. If I understand correctly, Hive table is regarded as table with schema. Therefore, we should not see "*" in hiveScanRel.getRowType; * column should have been expanded into the list of columns in Hive table. contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/ConvertHiveParquetScanToDrillParquetScan.java (line 195) <https://reviews.apache.org/r/38796/#comment158110> Seems like during the convertion, you only maintain the column names. What about the column type? Do we need to maintain the column type as well? DrillFixedRelDataTypeImpl assume every column has "any" type; meaning the column type in hive table is lost? contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/ConvertHiveParquetScanToDrillParquetScan.java (line 199) <https://reviews.apache.org/r/38796/#comment158111> I feel probably there is a bug in existing code HiveScan.getColumn() should not contain "*", if Drill treats Hive table as schemaed table. contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/ConvertHiveParquetScanToDrillParquetScan.java (line 259) <https://reviews.apache.org/r/38796/#comment158123> Again, if the column type is maintained during the conversion, seems it's necessary to cast for every column; if the hive type happens to be same as parquet type, we do not have to cast. - Jinfeng Ni On Sept. 27, 2015, 7:50 a.m., Venki Korukanti wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/38796/ > ----------------------------------------------------------- > > (Updated Sept. 27, 2015, 7:50 a.m.) > > > Review request for drill and Jinfeng Ni. > > > Repository: drill-git > > > Description > ------- > > Please jira DRILL-3209 for details. > > > Diffs > ----- > > > contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionDescriptor.java > 11c6455 > > contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/ConvertHiveParquetScanToDrillParquetScan.java > PRE-CREATION > > contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDrillNativeParquetScan.java > PRE-CREATION > > contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDrillNativeParquetSubScan.java > PRE-CREATION > > contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDrillNativeScanBatchCreator.java > PRE-CREATION > > contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveScan.java > 9ada569 > > contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveStoragePlugin.java > 23aa37f > > contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveSubScan.java > 2181c2a > > contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/schema/DrillHiveTable.java > b459ee4 > > contrib/storage-hive/core/src/test/java/org/apache/drill/exec/TestHivePartitionPruning.java > f0b4bdc > > contrib/storage-hive/core/src/test/java/org/apache/drill/exec/TestHiveProjectPushDown.java > 6423a36 > > contrib/storage-hive/core/src/test/java/org/apache/drill/exec/hive/TestHiveStorage.java > 9211af6 > > contrib/storage-hive/core/src/test/java/org/apache/drill/exec/hive/TestInfoSchemaOnHiveStorage.java > 6118be5 > > contrib/storage-hive/core/src/test/java/org/apache/drill/exec/store/hive/HiveTestDataGenerator.java > 34a7ed6 > exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java > 0f6a5bb > > exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java > 118f7ad > > Diff: https://reviews.apache.org/r/38796/diff/ > > > Testing > ------- > > Added unittests to test reading all supported types, project pushdown and > partition pruning. Manually tested with Hive tables containing large amount > of data (these tests will become part of the regression suite). > > > Thanks, > > Venki Korukanti > >
