Hi Team,
Have you ever met an exception like this?
My production job writes a DataFrame to HDFS (the Hive external table path) in
Parquet format, then runs "msck repair table xxxxx".
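Roughly, the write path looks like this (just a sketch, not the exact code; the
path is taken from the error message below and the table name is a placeholder):

df.write
  .mode("overwrite")
  .partitionBy("etl_date")
  .parquet("hdfs://stampy/apps/dt/gops/pp_cs_db/hold_txn_featuresForTest")  // external table location

// then register the new partition in the metastore
spark.sql("msck repair table xxxxx")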


After the job finishes, I select the data using Spark SQL and it works fine.
However, when I select it with Hive, it shows: "Failed with exception
java.io.IOException:org.apache.parquet.io.ParquetDecodingException: Can not
read value at 1 in block 0 in file
hdfs://stampy/apps/dt/gops/pp_cs_db/hold_txn_featuresForTest/etl_date=2021-10-22/part-00000-9954a8af-d694-47e1-8b0b-1a29534ea659-c000.snappy.parquet"
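To illustrate the symptom (assuming the external table is registered as
pp_cs_db.hold_txn_featuresForTest; the real table name may differ):

// this works from spark-shell / Spark SQL
spark.sql("select * from pp_cs_db.hold_txn_featuresForTest where etl_date = '2021-10-22'").show()

// the same select from the Hive CLI fails with the ParquetDecodingException above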

I checked my Spark job's running conf:

--conf spark.shuffle.service.enabled=true \
--conf spark.dynamicAllocation.enabled=true \
--conf spark.dynamicAllocation.initialExecutors=20 \
--conf spark.dynamicAllocation.minExecutors=10 \
--conf spark.dynamicAllocation.maxExecutors=100 \
--conf spark.driver.memory=10g \
--conf spark.driver.cores=8 \
--conf spark.executor.cores=5 \
--conf spark.executor.memory=18g \
--conf spark.sql.autoBroadcastJoinThreshold=-1 \
--conf spark.network.timeout=1200s \
--conf spark.executor.heartbeatInterval=600s \
--conf spark.speculation=true \
--conf spark.speculation.interval=600s \
--conf spark.speculation.quantile=0.9 \
--conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
--conf spark.sql.parquet.writeLegacyFormat=true   \
--conf spark.port.maxRetries=100 \
--conf spark.blacklist.enabled=false \
--conf spark.sql.shuffle.partitions=1000 \
--conf spark.default.parallelism=1000 \
--conf spark.pipeline.splitStage.mode=repartition \
--conf spark.sql.hive.convertMetastoreParquet=false \


I checked everything I could find on the web, but nothing helped. However, if I
re-run the job, the data can be selected in Hive. So strange.

Spark version: 2.3.0
Hive: 2.6.5
Hive table stored as Parquet

There is one batch job every day. Out of roughly 10 days of jobs, maybe 2 days
hit this issue. If I re-run, the data is fine again.
