arina-ielchiieva commented on a change in pull request #1955: DRILL-7491: Incorrect count() returned for complex types in parquet
URL: https://github.com/apache/drill/pull/1955#discussion_r366250930

##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractGroupScanWithMetadata.java
##########
@@ -180,7 +180,7 @@ public long getColumnValueCount(SchemaPath column) {
     } else if (nonInterestingColStats != null) {
       tableRowCount = TableStatisticsKind.ROW_COUNT.getValue(getNonInterestingColumnsMetadata());
     } else {
-      return 0; // returns 0 if the column doesn't exist in the table.
+      return Statistic.NO_COLUMN_STATS;

Review comment:
@ihuzenko I am not sure this change is correct. As the Javadoc and the comment you removed indicate, 0 is returned deliberately to avoid a full scan when the column does not exist.

Here is a unit test showing that your change breaks existing functionality; you can add it to the `org.apache.drill.exec.planner.logical.TestConvertCountToDirectScan` class:

```
@Test
public void textConvertAbsentColumn() throws Exception {
  String sql = "select count(abc) as cnt from cp.`tpch/nation.parquet`";

  queryBuilder()
      .sql(sql)
      .planMatcher()
      .include("DynamicPojoRecordReader")
      .match();

  testBuilder()
      .sqlQuery(sql)
      .unOrdered()
      .baselineColumns("cnt")
      .baselineValues(0L)
      .go();
}
```

After your changes, this test will fail. I think you need to find a way to determine whether the column is absent or complex.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services