[ https://issues.apache.org/jira/browse/FLINK-27025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17516694#comment-17516694 ]
Martijn Visser commented on FLINK-27025:
----------------------------------------
[~marsupialtail] Based on your attachment, you're using Python (PyFlink)
rather than Flink SQL in this case, right? I'm not sure that Parquet is
supported when using PyFlink. [~dianfu] [~hxbks2ks] any feedback on this?
> Cannot read parquet file, after putting the jar in the right place with right
> permissions
> -----------------------------------------------------------------------------------------
>
> Key: FLINK-27025
> URL: https://issues.apache.org/jira/browse/FLINK-27025
> Project: Flink
> Issue Type: Bug
> Components: API / Python, Table SQL / API
> Affects Versions: 1.14.0
> Reporter: Ziheng Wang
> Priority: Major
> Attachments: tpch-12-parquet.py
>
>
> I am using Flink with the SQL API on AWS EMR. I can run queries on CSV files
> without any problem.
> However, when I try to run queries on Parquet files, I get this error:
> Caused by: java.io.StreamCorruptedException: unexpected block data
> I have put flink-sql-parquet_2.12-1.14.0.jar under /usr/lib/flink/lib on the
> master node of the EMR cluster. Flink does seem to pick it up, because
> without the jar the error is different (it says it can't understand the
> parquet source). The jar has full 777 permissions and is owned by the same
> user as all the other jars in that directory.
> I tried passing a folder name as the Parquet source as well as a single
> Parquet file; neither works.
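
For reference, a minimal sketch of how a Parquet source is typically declared
when the SQL API is driven from Python. This is not the content of
tpch-12-parquet.py: the table name, columns, path, and the helper
parquet_source_ddl are hypothetical and only illustrate the filesystem
connector with 'format' = 'parquet'. It is not claimed to resolve the
StreamCorruptedException.

```python
def parquet_source_ddl(table_name, columns, path):
    """Build a CREATE TABLE statement for a filesystem-backed Parquet source.

    columns is a list of (column_name, sql_type) pairs.
    """
    cols = ",\n  ".join(f"`{name}` {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE `{table_name}` (\n  {cols}\n) WITH (\n"
        "  'connector' = 'filesystem',\n"
        f"  'path' = '{path}',\n"
        "  'format' = 'parquet'\n"
        ")"
    )


# Hypothetical table, columns, and path, for illustration only.
ddl = parquet_source_ddl(
    "lineitem",
    [("l_orderkey", "BIGINT"), ("l_shipmode", "STRING")],
    "s3://example-bucket/tpch/lineitem/",
)
print(ddl)

# With PyFlink installed, the statement would be submitted like this:
#   from pyflink.table import EnvironmentSettings, TableEnvironment
#   t_env = TableEnvironment.create(EnvironmentSettings.in_batch_mode())
#   t_env.execute_sql(ddl)
```

The path may point at either a directory or a single file, matching the two
variants tried in the report.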