[ https://issues.apache.org/jira/browse/DRILL-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16490867#comment-16490867 ]
ASF GitHub Bot commented on DRILL-6353:
---------------------------------------

arina-ielchiieva commented on a change in pull request #1259: DRILL-6353: Upgrade Parquet MR dependencies
URL: https://github.com/apache/drill/pull/1259#discussion_r190929789

 ##########
 File path: exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetMetadataCache.java
 ##########
 @@ -737,6 +738,7 @@ public void testBooleanPartitionPruning() throws Exception {
     }
   }

+  @Ignore

 Review comment:
   I have investigated why these tests fail. For example, let's take `testIntervalDayPartitionPruning`.

   First, the test creates a partitioned table using Drill. Since the table is created at runtime, the new Parquet library is used. The created table contains 4 files, one of which contains only nulls. For this all-null file, the statistics for all types except binary are `num_nulls: 3, min/max not defined`. For the binary type they are `no stats for this column`. With the previous Parquet version, the statistics were written correctly. Maybe this is a bug in Parquet, maybe in the Drill writer.

   Another problem is with the metadata file. We do write metadata for binary columns into it successfully. Example:
   ```
   "columnTypeInfo" : {
       "`col_intrvl_day`" : {
         "name" : [ "col_intrvl_day" ],
         "primitiveType" : "FIXED_LEN_BYTE_ARRAY",
         "originalType" : "INTERVAL",
         "precision" : 0,
         "scale" : 0,
         "repetitionLevel" : 0,
         "definitionLevel" : 1
       },
   "name" : [ "col_intrvl_day" ],
   "minValue" : "AAAAABoAAACQ4KEB",
   "maxValue" : "AAAAABoAAACQ4KEB",
   "nulls" : 0
   ```
   But when reading it back from the file, we read empty strings. This one looks like a Drill bug.

   @vrozov I have also noticed the call `ParquetFileReader.readFooter(conf, path, NO_FILTER);`. If you have a chance, please replace it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Upgrade Parquet MR dependencies
> -------------------------------
>
>                 Key: DRILL-6353
>                 URL: https://issues.apache.org/jira/browse/DRILL-6353
>             Project: Apache Drill
>          Issue Type: Task
>            Reporter: Vlad Rozov
>            Assignee: Vlad Rozov
>            Priority: Major
>             Fix For: 1.14.0
>
>
> Upgrade from a custom build {{1.8.1-drill-r0}} to Apache release {{1.10.0}}.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
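
Editor's note on the `minValue` / `maxValue` strings in the metadata example: the review comment does not say how these values are encoded. The sketch below assumes they are Base64 text wrapping the 12-byte Parquet INTERVAL layout (three little-endian unsigned 32-bit integers: months, days, milliseconds). The class name `IntervalStatDecoder` and its `decode` helper are hypothetical and are not part of Drill or parquet-mr; the code uses only the Java standard library.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;

// Hypothetical helper, not Drill code: decodes a Base64 string that is
// ASSUMED to hold a Parquet INTERVAL value (12 bytes, little-endian).
class IntervalStatDecoder {

    /** Returns {months, days, millis} read from the decoded 12 bytes. */
    static long[] decode(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        if (raw.length != 12) {
            throw new IllegalArgumentException(
                "INTERVAL must be 12 bytes, got " + raw.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        // Each field is an unsigned 32-bit integer, so widen to long.
        long months = Integer.toUnsignedLong(buf.getInt());
        long days = Integer.toUnsignedLong(buf.getInt());
        long millis = Integer.toUnsignedLong(buf.getInt());
        return new long[] {months, days, millis};
    }

    public static void main(String[] args) {
        // Value copied from the "minValue" field in the comment above.
        long[] parts = decode("AAAAABoAAACQ4KEB");
        System.out.println("months=" + parts[0]
            + " days=" + parts[1]
            + " millis=" + parts[2]);
    }
}
```

Under these assumptions the quoted value decodes to 0 months, 26 days and 27386000 ms (7 h 36 min 26 s). That is a plausible day interval, which would be consistent with the observation that the metadata is written correctly and that the empty strings appear only on the read path.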