We noticed a similar performance degradation using Parquet (outside of Spark), and it
was caused by merging multiple schemas. It would be good to know whether
disabling schema merging (when the schemas are identical), as Michael suggested,
helps in your case.
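For reference, schema merging can be turned off via a Spark SQL configuration; a minimal sketch, assuming a Spark version where the `spark.sql.parquet.mergeSchema` flag is available (it was introduced around Spark 1.5, and merging defaults vary by version):

```
# spark-defaults.conf entry: skip merging Parquet schemas across files
spark.sql.parquet.mergeSchema  false
```

The same flag can be passed at submit time with `--conf spark.sql.parquet.mergeSchema=false`.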
On Wed, Apr 8, 2015 at 11:43 AM, Michael Armbrust wrote:
Hello folks,
Newbie here! Just a quick question: is there a job submission API, such as Hadoop's
https://hadoop.apache.org/docs/r2.3.0/api/org/apache/hadoop/mapreduce/Job.html#submit()
for submitting Spark jobs to a YARN cluster? I see in the examples that
bin/spark-submit is what's out there.
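For context, the standard way to submit to YARN is indeed bin/spark-submit; a minimal sketch, where the class name, jar path, and resource settings are placeholders for your own application:

```shell
# Submit an application to a YARN cluster (cluster deploy mode).
# com.example.MyApp and my-app.jar are placeholders.
./bin/spark-submit \
  --master yarn-cluster \
  --class com.example.MyApp \
  --num-executors 4 \
  --executor-memory 2G \
  path/to/my-app.jar arg1 arg2
```

As a programmatic alternative, later Spark releases (1.4+) added org.apache.spark.launcher.SparkLauncher, which wraps spark-submit behind a Java API.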