> On Sept. 28, 2015, 10:18 p.m., Aman Sinha wrote:
> > contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDrillNativeParquetScan.java, line 63
> > <https://reviews.apache.org/r/38796/diff/2/?file=1085485#file1085485line63>
> >
> >     Since the RecordCount is the same regardless of the type of the reader, we should not divide it by the factor. Dividing the cpu cost and disk cost seems ok.
> 
> Venki Korukanti wrote:
>     If I understand correctly, we are using only the rowcount while calculating the self cost of the scan in ScanPrel.computeSelfCost. So we need to alter the rowcount here.
> 
> Aman Sinha wrote:
>     True.. the current cost model for Scans computes cpuCost as a function of rowCount and columnCount. I will open an enhancement JIRA to change that, so that two different scan methods (such as Hive scan vs. Drill native scan) that produce the same row count but differ in cpu cost and I/O cost can be modeled accurately.
> 
>     Given that, you don't have to change the cost here... my only other suggestion would be to use a static constant as a factor, e.g. HIVE_COST_FACTOR (or something similar).
Added HIVE_SERDE_SCAN_OVERHEAD_FACTOR constant.

- Venki


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38796/#review100878
-----------------------------------------------------------


On Sept. 29, 2015, 9:23 a.m., Venki Korukanti wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38796/
> -----------------------------------------------------------
> 
> (Updated Sept. 29, 2015, 9:23 a.m.)
> 
> 
> Review request for drill and Jinfeng Ni.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> Please see jira DRILL-3209 for details.
> 
> 
> Diffs
> -----
> 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/HivePartitionDescriptor.java 11c6455 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/planner/sql/logical/ConvertHiveParquetScanToDrillParquetScan.java PRE-CREATION 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDrillNativeParquetScan.java PRE-CREATION 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDrillNativeParquetSubScan.java PRE-CREATION 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDrillNativeScanBatchCreator.java PRE-CREATION 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveScan.java 9ada569 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveStoragePlugin.java 23aa37f 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveSubScan.java 2181c2a 
>   contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/schema/DrillHiveTable.java b459ee4 
>   contrib/storage-hive/core/src/test/java/org/apache/drill/exec/TestHivePartitionPruning.java f0b4bdc 
>   contrib/storage-hive/core/src/test/java/org/apache/drill/exec/TestHiveProjectPushDown.java 6423a36 
>   contrib/storage-hive/core/src/test/java/org/apache/drill/exec/hive/TestHiveStorage.java 9211af6 
>   contrib/storage-hive/core/src/test/java/org/apache/drill/exec/hive/TestInfoSchemaOnHiveStorage.java 6118be5 
>   contrib/storage-hive/core/src/test/java/org/apache/drill/exec/store/hive/HiveTestDataGenerator.java 34a7ed6 
>   exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 66f9f03 
>   exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java 5838bd1 
> 
> Diff: https://reviews.apache.org/r/38796/diff/
> 
> 
> Testing
> -------
> 
> Added unit tests covering reads of all supported types, project pushdown, and partition pruning. Manually tested with Hive tables containing large amounts of data (these tests will become part of the regression suite).
> 
> 
> Thanks,
> 
> Venki Korukanti
> 
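The cost adjustment discussed in the thread above can be sketched roughly as follows. This is a minimal illustration, not Drill's actual code: only the constant name HIVE_SERDE_SCAN_OVERHEAD_FACTOR comes from the patch, and its value, the class name, and the method names here are hypothetical. Per Aman's comment, the idea is to leave the row count untouched and scale only the cpu and disk cost components so the planner prefers the native Parquet reader over the Hive SerDe reader:

```java
// Hypothetical sketch of the scan-cost adjustment under review.
// The factor value and all names except HIVE_SERDE_SCAN_OVERHEAD_FACTOR
// are illustrative assumptions, not Drill's real ScanStats API.
public class NativeScanCostSketch {

    // Assumed overhead of the Hive SerDe reader relative to the native
    // Parquet reader (the actual value in the patch may differ).
    static final int HIVE_SERDE_SCAN_OVERHEAD_FACTOR = 10;

    // Row count stays the same for both readers; only cost components
    // are divided by the factor for the native scan.
    static double nativeCpuCost(double serdeCpuCost) {
        return serdeCpuCost / HIVE_SERDE_SCAN_OVERHEAD_FACTOR;
    }

    static double nativeDiskCost(double serdeDiskCost) {
        return serdeDiskCost / HIVE_SERDE_SCAN_OVERHEAD_FACTOR;
    }

    public static void main(String[] args) {
        // The native scan's cpu and disk costs come out cheaper, so the
        // planner's cost comparison favors it over the SerDe-based scan.
        System.out.println("cpu=" + nativeCpuCost(1000.0)
                + " disk=" + nativeDiskCost(500.0));
        // prints: cpu=100.0 disk=50.0
    }
}
```

A named static constant (rather than a bare literal) is what Aman suggested, since the factor is a planner heuristic that may later move into a proper cost model where the two scan methods report identical row counts but distinct cpu and I/O costs.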
