from:"Dong Jiang"

Corrupt parquet file

2018-02-05 Thread Dong Jiang

Hi, we are running Spark 2.2.1 and generating parquet files with the following pseudocode: df.write.parquet(...) We have recently noticed parquet file corruption when reading the files in Spark or Presto, as follows: Caused by: org.apache.parquet.io.ParquetDecodingException: Can not

Re: Corrupt parquet file

2018-02-05 Thread Dong Jiang

before, what do you do to prevent a recurrence? Thanks, Dong From: Ryan Blue <rb...@netflix.com> Reply-To: "rb...@netflix.com" <rb...@netflix.com> Date: Monday, February 5, 2018 at 12:46 PM To: Dong Jiang <dji...@dataxu.com> Cc: Spark Dev List <dev@spark.apache.or

Re: Corrupt parquet file

2018-02-12 Thread Dong Jiang

back the entire data set, and then copy from HDFS to S3. Any other thoughts? From: Steve Loughran <ste...@hortonworks.com> Date: Monday, February 12, 2018 at 2:27 PM To: "rb...@netflix.com" <rb...@netflix.com> Cc: Dong Jiang <dji...@dataxu.com>, Apache Spark Dev <de

Re: Corrupt parquet file

2018-02-05 Thread Dong Jiang

o: "rb...@netflix.com" <rb...@netflix.com> Date: Monday, February 5, 2018 at 1:34 PM To: Dong Jiang <dji...@dataxu.com> Cc: Spark Dev List <dev@spark.apache.org> Subject: Re: Corrupt parquet file We ensure the bad node is removed from our cluster and reprocess to replac

Re: Corrupt parquet file

2018-02-05 Thread Dong Jiang

a recurrence? Can you share your experience? Thanks, Dong From: Ryan Blue <rb...@netflix.com> Reply-To: "rb...@netflix.com" <rb...@netflix.com> Date: Monday, February 5, 2018 at 12:38 PM To: Dong Jiang <dji...@dataxu.com> Cc: Spark Dev List <dev@spark.apache.or

Spark SQL unexpected behavior when comparing timestamp to date

2018-03-02 Thread Dong Jiang

Hi, I opened a JIRA ticket, https://issues.apache.org/jira/browse/SPARK-23549. Could anyone take a look? Spark SQL unexpected behavior when comparing timestamp to date scala> spark.version res1: String = 2.2.1 scala> spark.sql("select cast('2017-03-01 00:00:00' as timestamp)

6 matches
