[ https://issues.apache.org/jira/browse/FLINK-27025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517800#comment-17517800 ]
Dian Fu commented on FLINK-27025:
---------------------------------
[~martijnvisser] From the attachment, I think [~marsupialtail] is using Flink
SQL via the Python Table API. This should be supported, as it shares the
connector/format support provided by the Java Table API & SQL.
[~marsupialtail] Regarding this issue, could you double-check that the
versions of all the jars are consistent and that the Java version is the same
on all nodes of the cluster?
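
The jar consistency check suggested above could be sketched roughly as
follows. This is only a minimal illustration, not an official Flink tool: the
function names and the file name pattern are assumptions, and the pattern is a
simple heuristic based on jar names such as flink-sql-parquet_2.12-1.14.0.jar.
It groups the Flink jars found in a lib directory by the version encoded in
their file names, so that more than one group points to a mismatch.

```python
import re
from collections import defaultdict
from pathlib import Path

# Heuristic (assumed) pattern for names like
#   flink-sql-parquet_2.12-1.14.0.jar  or  flink-csv-1.14.0.jar
# i.e. artifact, optional Scala suffix, then the Flink version.
JAR_PATTERN = re.compile(
    r"^(?P<artifact>flink-[A-Za-z0-9-]+?)"
    r"(?:_(?P<scala>\d+\.\d+))?"
    r"-(?P<version>\d+\.\d+(?:\.\d+)?(?:-[A-Za-z0-9.]+)?)\.jar$"
)


def list_jars(lib_dir):
    """Return the file names of all jars directly inside lib_dir."""
    return [path.name for path in Path(lib_dir).glob("*.jar")]


def group_flink_jars_by_version(jar_names):
    """Map each version string to the sorted Flink jar names carrying it.

    Jars that do not match the pattern (for example log4j jars) are ignored.
    flink-shaded-* jars are skipped as well, because their file names carry
    the version of the bundled library rather than the Flink version.
    """
    versions = defaultdict(list)
    for name in sorted(jar_names):
        match = JAR_PATTERN.match(name)
        if match is None:
            continue
        if match.group("artifact").startswith("flink-shaded-"):
            continue
        versions[match.group("version")].append(name)
    return dict(versions)
```

Calling group_flink_jars_by_version(list_jars("/usr/lib/flink/lib")) on each
node and comparing the results would show whether any node carries jars from a
different Flink version. The Java version would still need to be compared
separately, for example by running "java -version" on every node.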
PS: [~marsupialtail] It would be great to ask this kind of question on the user
mailing list instead of opening a ticket.
> Cannot read parquet file, after putting the jar in the right place with right
> permissions
> -----------------------------------------------------------------------------------------
>
> Key: FLINK-27025
> URL: https://issues.apache.org/jira/browse/FLINK-27025
> Project: Flink
> Issue Type: Bug
> Components: API / Python, Table SQL / API
> Affects Versions: 1.14.0
> Reporter: Ziheng Wang
> Priority: Major
> Attachments: tpch-12-parquet.py
>
>
> I am using Flink with the SQL API on AWS EMR. I can run queries on CSV files,
> no problem.
> However, when I try to run queries on Parquet files, I get this error: Caused
> by: java.io.StreamCorruptedException: unexpected block data
> I have put flink-sql-parquet_2.12-1.14.0.jar under /usr/lib/flink/lib on the
> master node of the EMR cluster. Flink does seem to pick it up, because if the
> jar is not there the error is different (it says it can't understand the
> parquet source). The jar has full 777 permissions and the same owner as all
> the other jars in that directory.
> I tried passing a folder name as the Parquet source as well as a single
> Parquet file; neither works.