Xiaomeng Zhang created IMPALA-9201:
--------------------------------------
Summary: Impala can't read zstd file compressed by zstd command
Key: IMPALA-9201
URL: https://issues.apache.org/jira/browse/IMPALA-9201
Project: IMPALA
Issue Type: Bug
Components: Backend
Affects Versions: Impala 3.4.0
Reporter: Xiaomeng Zhang
Assignee: Abhishek Rawat
To reproduce:
# Get a Parquet file written by Impala.
# Use "hadoop fs -get" to download it locally.
# Run "zstd -i parquetfile -o zstdfile" to produce a zstd-compressed file, parquet.zst.
# Use "hadoop fs -put" to upload the zstd file to the directory "/test-warehouse/par_zstd".
# In Impala, create a table with its location set to "/test-warehouse/par_zstd".
# Run "select * from par_zstd" and get this error:
{code:java}
[localhost:21000] default> select * from par_zstd;
Query: select * from par_zstd
Query submitted at: 2019-11-25 14:59:07 (Coordinator:
http://xiaomeng-OptiPlex-9020:25000)
Query progress can be monitored at:
http://xiaomeng-OptiPlex-9020:25000/query_plan?query_id=b0411d5136965e30:549208ad00000000
ERROR: File 'hdfs://localhost:20500/test-warehouse/par_zstd/parquet.zst' has an
invalid Parquet version number: ����
. Please check that it is a valid Parquet file. This error can also occur due
to stale metadata. If you believe this is a valid Parquet file, try running
"refresh default.par_zstd".
{code}
# In Hive, run "select * from par_zstd" and get this error:
{code:java}
Error: java.io.IOException: java.lang.RuntimeException:
hdfs://localhost:20500/test-warehouse/par_zstd/parquet.zstd is not a Parquet
file. expected magic number at tail [80, 65, 82, 49] but found [-2, -72, -113,
-90] (state=,code=0)
{code}
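
Both errors come from the same check: a Parquet reader expects the 4-byte magic "PAR1" (bytes [80, 65, 82, 49]) at the start and end of the file, but a file wrapped by the zstd command begins with the zstd frame magic instead. The following is a minimal sketch of that check, not Impala's or Hive's actual code; the names classify and java_bytes_to_bytes are hypothetical helpers introduced only for illustration.

```python
import os

PARQUET_MAGIC = b"PAR1"                 # bytes [80, 65, 82, 49]
ZSTD_FRAME_MAGIC = b"\x28\xb5\x2f\xfd"  # 0xFD2FB528 stored little-endian


def java_bytes_to_bytes(values):
    """Convert signed Java byte values (as printed in the Hive error) to bytes."""
    return bytes(v & 0xFF for v in values)


def classify(path):
    """Return 'parquet', 'zstd-frame', or 'unknown' based on magic bytes only."""
    if os.path.getsize(path) < 4:
        return "unknown"
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)  # last 4 bytes of the file
        tail = f.read(4)
    if head == PARQUET_MAGIC and tail == PARQUET_MAGIC:
        return "parquet"
    if head == ZSTD_FRAME_MAGIC:
        return "zstd-frame"
    return "unknown"
```

Running classify on the original Parquet file and on parquet.zst should show the difference: the second file is a zstd frame whose trailing bytes are not "PAR1", which is what both readers report.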