[
https://issues.apache.org/jira/browse/ASTERIXDB-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17402643#comment-17402643
]
ASF subversion and git services commented on ASTERIXDB-2918:
------------------------------------------------------------
Commit 1bddb400be3ee2aeb589a0221f415c4a802ff3e5 in asterixdb's branch
refs/heads/master from Wail Alkowaileet
[ https://gitbox.apache.org/repos/asf?p=asterixdb.git;h=1bddb40 ]
[ASTERIXDB-2918][EXT] Validate the type when creating Parquet external dataset
- user model changes: no
- storage format changes: no
- interface changes: no
Details:
Ensure that the type used when creating an external dataset in
Parquet format does not contain declared fields.
Change-Id: I4870a91ecf41b41996b862704b767e04abc14569
Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/12905
Reviewed-by: Hussain Towaileb <[email protected]>
Tested-by: Jenkins <[email protected]>
Integration-Tests: Jenkins <[email protected]>
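For illustration, the check described in the commit message above can be
sketched in SQL++ roughly as follows. This is only a sketch under
assumptions: the type name typedEvent, its field run, and the dataset name
typedDataset are hypothetical, the adapter parameters are copied from the
issue description quoted below, and the exact error text produced by the
new validation is not stated in this message.

```sql
-- An open type with no declared fields: the form the issue reports as working.
CREATE TYPE anyType IF NOT EXISTS AS OPEN {};

CREATE EXTERNAL DATASET untypedDataset(anyType)
USING hdfs
(("hdfs"="hdfs://namenode:8020"),
 ("path"="/test/*.parquet"),
 ("input-format"="parquet-input-format"));

-- Hypothetical type with one declared field. Per the commit description,
-- using such a type for a Parquet external dataset should now be rejected
-- when the dataset is created, instead of failing later at query time.
CREATE TYPE typedEvent IF NOT EXISTS AS CLOSED { run: int };

CREATE EXTERNAL DATASET typedDataset(typedEvent)
USING hdfs
(("hdfs"="hdfs://namenode:8020"),
 ("path"="/test/*.parquet"),
 ("input-format"="parquet-input-format"));
```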
> IndexOutOfBoundsException when querying Parquet files
> -----------------------------------------------------
>
> Key: ASTERIXDB-2918
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2918
> Project: Apache AsterixDB
> Issue Type: Bug
> Components: EXT - External data
> Reporter: Ingo Müller
> Assignee: Wail Y. Alkowaileet
> Priority: Major
> Attachments: Run2012B_SingleMu_restructured_1000.parquet,
> create_event_type.sqlpp, stacktrace.log
>
>
> I am getting an IndexOutOfBoundsException when creating an external table
> based on Parquet files on HDFS or loading them into an existing table if I
> specify a closed type for the table. If I specify an empty open type as
> follows, all works fine:
> {{CREATE TYPE anyType IF NOT EXISTS AS OPEN {};}}
> Then I create an external table as follows:
> CREATE EXTERNAL DATASET untypedDataset(anyType)
> USING hdfs
> (("hdfs"="hdfs://namenode:8020"),
> ("path"="/test/*.parquet"),
> ("input-format"="parquet-input-format"))
> With {{anyType}}, I can query the table just fine. However, if I use the
> {{eventType}} created as shown in the attachment, running any query against
> the dataset produces an error about an exception. In cc.log, I find the
> output as attached in {{stacktrace.log}}.
> I do not know how to debug this further.
> For your reference, I am using a self-compiled development build from master
> from a few days ago (rev. 5120106e) running on AdoptOpenJDK 15. I am also
> attaching the Parquet file that caused the problem.
>