[
https://issues.apache.org/jira/browse/DRILL-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16490872#comment-16490872
]
ASF GitHub Bot commented on DRILL-6353:
---------------------------------------
arina-ielchiieva commented on a change in pull request #1259: DRILL-6353:
Upgrade Parquet MR dependencies
URL: https://github.com/apache/drill/pull/1259#discussion_r190929789
##########
File path:
exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetMetadataCache.java
##########
@@ -737,6 +738,7 @@ public void testBooleanPartitionPruning() throws Exception
{
}
}
+ @Ignore
Review comment:
I have investigated why these tests fail, taking
`testIntervalDayPartitionPruning` as an example.
The test first creates a partitioned table using Drill. Since the table is
created at runtime, the new Parquet library is used. The created table contains
4 files, one of which contains only nulls. For this all-null file, the
statistics for every type except binary are `num_nulls: 3, min/max not defined`.
For binary types they are `no stats for this column`. For binary columns
without nulls, statistics are written correctly. I did not check the mixed case
(nulls and non-nulls in the same column), but I think it should be fine. With
the previous Parquet version, statistics were written correctly. This may be a
bug in Parquet, or it may be in the Drill writer.
Another problem is with the metadata file. We do write metadata for binary
columns into it successfully. Example:
```
"columnTypeInfo" : {
"`col_intrvl_day`" : {
"name" : [ "col_intrvl_day" ],
"primitiveType" : "FIXED_LEN_BYTE_ARRAY",
"originalType" : "INTERVAL",
"precision" : 0,
"scale" : 0,
"repetitionLevel" : 0,
"definitionLevel" : 1
},
"name" : [ "col_intrvl_day" ],
"minValue" : "AAAAABoAAACQ4KEB",
"maxValue" : "AAAAABoAAACQ4KEB",
"nulls" : 0
```
But when reading it back from the file, we get empty strings. This one looks
like a Drill bug.
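To sanity-check that the written value is meaningful, the `minValue` from the
excerpt above can be decoded by hand. This is only a sketch under two
assumptions: that the metadata file stores the raw bytes Base64-encoded, and
that the bytes follow the Parquet `INTERVAL` layout (12 bytes: months, days and
milliseconds, each an unsigned little-endian 32-bit integer). The class
`IntervalMinMaxDecoder` is a hypothetical helper, not part of Drill or
parquet-mr.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;

// Hypothetical helper for illustration only; not Drill or parquet-mr code.
class IntervalMinMaxDecoder {

  // Returns {months, days, millis}.
  // Assumes the input is Base64 text wrapping a 12-byte Parquet INTERVAL value.
  static long[] decode(String base64) {
    byte[] raw = Base64.getDecoder().decode(base64);
    if (raw.length != 12) {
      throw new IllegalArgumentException("expected 12 bytes, got " + raw.length);
    }
    // INTERVAL stores three unsigned 32-bit integers in little-endian order.
    ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
    long months = Integer.toUnsignedLong(buf.getInt());
    long days = Integer.toUnsignedLong(buf.getInt());
    long millis = Integer.toUnsignedLong(buf.getInt());
    return new long[] {months, days, millis};
  }

  public static void main(String[] args) {
    // Value copied from the metadata excerpt above.
    long[] v = decode("AAAAABoAAACQ4KEB");
    System.out.println("months=" + v[0] + " days=" + v[1] + " millis=" + v[2]);
  }
}
```

Under those assumptions the value decodes to 0 months, 26 days and 27386000
milliseconds (07:36:26), which is a plausible day interval. That is consistent
with the write path being fine and the problem being on the read side.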
@vrozov I also noticed that `ParquetFileReader.readFooter(conf, path,
NO_FILTER);` is deprecated. If you get a chance, please replace it.
> Upgrade Parquet MR dependencies
> -------------------------------
>
> Key: DRILL-6353
> URL: https://issues.apache.org/jira/browse/DRILL-6353
> Project: Apache Drill
> Issue Type: Task
> Reporter: Vlad Rozov
> Assignee: Vlad Rozov
> Priority: Major
> Fix For: 1.14.0
>
>
> Upgrade from a custom build {{1.8.1-drill-r0}} to Apache release {{1.10.0}}.