We are using Parquet and doing a simple write and read.
For writing - *ds.write().parquet(outputPath); // this is writing 40K part files*
For reading - *sqlContext.read().parquet(inputPath).javaRDD() // here we are trying to read the same 40K part files*
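For reference, a minimal self-contained sketch of that write/read round trip (the Dataset ds, the SQLContext, and the two paths are the names used above; the wrapping class and method are just placeholders):

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SQLContext;

public class ParquetRoundTrip {
    public static void roundTrip(Dataset<Row> ds, SQLContext sqlContext,
                                 String outputPath, String inputPath) {
        // Write: each task writes its own part file, so a heavily partitioned
        // dataset can easily produce tens of thousands of part files (40K here).
        ds.write().parquet(outputPath);

        // Read the same part files back and drop down to a JavaRDD<Row>.
        JavaRDD<Row> rows = sqlContext.read().parquet(inputPath).javaRDD();
    }
}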
*Regards,*
*Prateek Rajput*
Hi all,
Please share if anyone has faced the same problem. There are many similar
issues on the web, but I did not find any solution or a reason why this happens.
It would be really helpful.
Regards,
Prateek
On Mon, Apr 29, 2019 at 3:18 PM Prateek Rajput wrote:
> I checked and removed 0 sized files
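A hedged sketch of one way to do that zero-size check with the Hadoop FileSystem API (the EmptyFileCheck class and inputPath below are illustrative placeholders, not code from this thread):

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class EmptyFileCheck {
    // Returns only the part files under inputPath that actually contain data.
    public static List<Path> nonEmptyParts(String inputPath) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        List<Path> nonEmpty = new ArrayList<>();
        for (FileStatus status : fs.listStatus(new Path(inputPath))) {
            // Zero-length part files are a common suspect for read-side failures.
            if (status.isFile() && status.getLen() > 0) {
                nonEmpty.add(status.getPath());
            }
        }
        return nonEmpty;
    }
}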
On Tue, Apr 30, 2019 at 6:48 PM Vatsal Patel wrote:
> *Issue:*
>
> When I am reading a sequence file in Spark, I can specify the number of
> partitions as an argument to the API; below is the signature:
> *public <K, V> JavaPairRDD<K, V> sequenceFile(String path, Class<K> keyClass, Class<V> valueClass, int minPartitions)*
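A small usage sketch of that sequenceFile API (the path, the Text/LongWritable key and value classes, and the minPartitions value of 64 below are illustrative assumptions, not details from this thread):

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SequenceFileRead {
    public static void main(String[] args) {
        JavaSparkContext sc =
                new JavaSparkContext(new SparkConf().setAppName("seq-read"));
        // minPartitions is only a lower bound on the number of input splits.
        JavaPairRDD<Text, LongWritable> pairs =
                sc.sequenceFile("/path/to/sequence/files", Text.class,
                                LongWritable.class, 64);
        System.out.println("partitions = " + pairs.getNumPartitions());
        sc.stop();
    }
}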
When this issue is coming, it is happening in the case of Spark only.
On Mon, Apr 29, 2019 at 2:50 PM Deepak Sharma wrote:
> This can happen if the file size is 0
>
> On Mon, Apr 29, 2019 at 2:28 PM Prateek Rajput wrote:
>
>> Hi guys,
>> I am getting this strange error again and again while reading from a
>> sequence file in Spark.
Hi guys,
I am getting this strange error again and again while reading from a
sequence file in Spark.
User class threw exception: org.apache.spark.SparkException: Job aborted.
at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:100)
at org.apache.spark.rdd.PairRDDFunctions