Hi Team, have you ever met an exception like this? In my production job, a DataFrame is written to HDFS (a Hive external table path) in Parquet format, and then I run "msck repair table xxxxx".
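For context, the write step looks roughly like this (a minimal Scala sketch, not the exact production code; the DataFrame name df is a placeholder, while the base path, the partition column etl_date, and the masked table name xxxxx come from the post):

// Write the daily partition to the external table's HDFS location as Parquet
df.write
  .mode("overwrite")
  .format("parquet")
  .partitionBy("etl_date")
  .save("hdfs://stampy/apps/dt/gops/pp_cs_db/hold_txn_featuresForTest")

// Register the new partition in the Hive metastore
spark.sql("MSCK REPAIR TABLE xxxxx")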
After the job finishes, I can select the data with Spark SQL and it works fine. However, when I query it with Hive, it fails with:

"Failed with exception java.io.IOException:org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file hdfs://stampy/apps/dt/gops/pp_cs_db/hold_txn_featuresForTest/etl_date=2021-10-22/part-00000-9954a8af-d694-47e1-8b0b-1a29534ea659-c000.snappy.parquet"

I checked my Spark job's configuration:

--conf spark.shuffle.service.enabled=true \
--conf spark.dynamicAllocation.enabled=true \
--conf spark.dynamicAllocation.initialExecutors=20 \
--conf spark.dynamicAllocation.minExecutors=10 \
--conf spark.dynamicAllocation.maxExecutors=100 \
--conf spark.driver.memory=10g \
--conf spark.driver.cores=8 \
--conf spark.executor.cores=5 \
--conf spark.executor.memory=18g \
--conf spark.sql.autoBroadcastJoinThreshold=-1 \
--conf spark.network.timeout=1200s \
--conf spark.executor.heartbeatInterval=600s \
--conf spark.speculation=true \
--conf spark.speculation.interval=600s \
--conf spark.speculation.quantile=0.9 \
--conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
--conf spark.sql.parquet.writeLegacyFormat=true \
--conf spark.port.maxRetries=100 \
--conf spark.blacklist.enabled=false \
--conf spark.sql.shuffle.partitions=1000 \
--conf spark.default.parallelism=1000 \
--conf spark.pipeline.splitStage.mode=repartition \
--conf spark.sql.hive.convertMetastoreParquet=false \

I tried everything I could find online, but nothing helped. The strange part is that if I simply re-run the job, the data can be selected in Hive again.

Spark version: 2.3.0
Hive: 2.6.5
The Hive table is stored as Parquet. There is one batch job per day; out of roughly 10 daily runs, about 2 hit this issue. After a re-run, the data is fine again.
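For reference, this is roughly how I check the data on the Spark side (a minimal sketch; the table name xxxxx is masked as in the rest of the post, and the direct file read is just an extra check one can run, not something required by the job):

// Reads fine through Spark SQL against the Hive table
spark.sql("SELECT * FROM xxxxx WHERE etl_date = '2021-10-22'").show(5)

// Additional check: point Spark directly at the partition path that Hive complains about
spark.read
  .parquet("hdfs://stampy/apps/dt/gops/pp_cs_db/hold_txn_featuresForTest/etl_date=2021-10-22")
  .show(5)

The same SELECT from Hive throws the ParquetDecodingException shown above.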