[ 
https://issues.apache.org/jira/browse/DRILL-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva closed DRILL-7629.
-----------------------------------
    Resolution: Fixed

Fixed in the scope of DRILL-7361.

> Parquet MAP field support missing in recent stable release (?)
> --------------------------------------------------------------
>
>                 Key: DRILL-7629
>                 URL: https://issues.apache.org/jira/browse/DRILL-7629
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 1.17.0
>         Environment: Drill 1.17
> Zulu OpenJDK 8 build 1.8.0_232
> Debian Buster 10.3
> Kernel version 4.19.98-1
> EC2 c5.2xlarge instances (8 Cores, 16GB RAM)
>            Reporter: Idan Sheinberg
>            Priority: Major
>             Fix For: 1.18.0
>
>
> Encountered this issue when lowering {{planner.slice_target}} (to, say, 100) 
> in order to make Drill generate more fragments. Queries then started crashing 
> with the following error:
> {code:java}
> Caused by: java.io.IOException: Unable to parse column [`currencyPair` STRUCT<`bfix` MAP<`map` STRUCT<`key` ARRAY<VARCHAR>, `value` ARRAY<DOUBLE>>>> not null]: Line [1], position [29], offending symbol [@4,29:31='MAP',<26>,1:29]: no viable alternative at input '`bfix`MAP'
>       at org.apache.drill.exec.record.metadata.schema.parser.SchemaExprParser.parseColumn(SchemaExprParser.java:80)
>       at org.apache.drill.exec.record.metadata.schema.parser.SchemaExprParser.parseColumn(SchemaExprParser.java:61)
>       at org.apache.drill.exec.record.metadata.AbstractColumnMetadata.createColumnMetadata(AbstractColumnMetadata.java:75)
>       at sun.reflect.GeneratedMethodAccessor67.invoke(Unknown Source)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at com.fasterxml.jackson.databind.introspect.AnnotatedMethod.call(AnnotatedMethod.java:109)
>       at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:283)
>       ... 72 common frames omitted
> Caused by: org.apache.drill.exec.record.metadata.schema.parser.SchemaParsingException: Line [1], position [29], offending symbol [@4,29:31='MAP',<26>,1:29]: no viable alternative at input '`bfix`MAP'
>       at org.apache.drill.exec.record.metadata.schema.parser.SchemaExprParser$ErrorListener.syntaxError(SchemaExprParser.java:120)
>       at org.antlr.v4.runtime.ProxyErrorListener.syntaxError(ProxyErrorListener.java:41)
>       at org.antlr.v4.runtime.Parser.notifyErrorListeners(Parser.java:544)
>       at org.antlr.v4.runtime.DefaultErrorStrategy.reportNoViableAlternative(DefaultErrorStrategy.java:310)
>       at org.antlr.v4.runtime.DefaultErrorStrategy.reportError(DefaultErrorStrategy.java:136)
>       at org.apache.drill.exec.record.metadata.schema.parser.SchemaParser.column(SchemaParser.java:403)
>       at org.apache.drill.exec.record.metadata.schema.parser.SchemaParser.column_def(SchemaParser.java:317)
>       at org.apache.drill.exec.record.metadata.schema.parser.SchemaParser.columns(SchemaParser.java:262)
>       at org.apache.drill.exec.record.metadata.schema.parser.SchemaParser.struct_type(SchemaParser.java:1395)
>       at org.apache.drill.exec.record.metadata.schema.parser.SchemaParser.struct_column(SchemaParser.java:579)
>       at org.apache.drill.exec.record.metadata.schema.parser.SchemaParser.column(SchemaParser.java:383)
>       at org.apache.drill.exec.record.metadata.schema.parser.SchemaExprParser.parseColumn(SchemaExprParser.java:78){code}
> All files in the queried directory are parquet files that share the same 
> schema, just to be clear.
> Looking into the stack trace, this seems like an {{antlr}} error. Assuming 
> {{SchemaParser}} is generated from 
> [this|https://github.com/apache/drill/blob/drill-1.17.0/exec/vector/src/main/antlr4/org/apache/drill/exec/record/metadata/schema/parser/SchemaParser.g4]
>  {{g4}} file, you can see that {{MAP}} support is lacking.
> Looking around a bit in Jira/GitHub, I noticed that this issue had already 
> been fixed in DRILL-7361. I can also confirm that upgrading to the latest 
> SNAPSHOT version (built from source today) resolved the issue.
> A few questions:
>  * Did you intentionally drop Parquet MAP field support in Drill 1.17 as 
> part of the Antlr lexer refactoring, or was it never present to begin with (I 
> see 1.16 is not using antlr parsing for the Parquet schema)?
>  * Can we safely assume the (newly added) MAP field support will persist from 
> here on out, or at least as part of the 1.18 release?
>  * Probably not the best place to ask, but as for 1.18, is there a 
> timeline/plan for that already, or is there a possibility of a hot-fix 
> release? I would much rather work on a stable version than on a 
> self-built one.
> I'd be able to provide parquet files and guidance towards re-creating this 
> issue in 1.17, should the need arise.
> Thanks in advance!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
