Hi All -
I'm having an issue detecting a failed Spark application state when
using the startApplication method and SparkAppHandle with the
SparkLauncher in Spark 2.0.1.
Previously I had used a Java Process, waiting for it to return a
non-zero exit code to detect failure, which worked. But when t[…]
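For reference, here is a minimal sketch of the two patterns involved
(the master, jar path, and main class below are placeholders, not my
actual config):

import java.util.concurrent.CountDownLatch;

import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class LauncherFailureCheck {
  public static void main(String[] args) throws Exception {
    // Old approach: launch() hands back a plain java.lang.Process, and
    // a non-zero exit code from waitFor() signals failure.
    Process proc = newLauncher().launch();
    if (proc.waitFor() != 0) {
      System.err.println("spark-submit exited with a non-zero code");
    }

    // New approach: startApplication() returns a SparkAppHandle and
    // reports state transitions through the listener.
    final CountDownLatch done = new CountDownLatch(1);
    SparkAppHandle handle = newLauncher().startApplication(
        new SparkAppHandle.Listener() {
          @Override
          public void stateChanged(SparkAppHandle h) {
            if (h.getState().isFinal()) { // FINISHED, FAILED, KILLED...
              done.countDown();
            }
          }
          @Override
          public void infoChanged(SparkAppHandle h) { }
        });
    done.await();
    if (handle.getState() == SparkAppHandle.State.FAILED) {
      System.err.println("application ended in FAILED state");
    }
  }

  private static SparkLauncher newLauncher() {
    return new SparkLauncher()
        .setMaster("yarn")                   // placeholder settings
        .setAppResource("/path/to/app.jar")  // placeholder jar
        .setMainClass("com.example.MyApp");  // placeholder main class
  }
}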
[…]ing and testing anything from trunk to get it working.
On Sat, Oct 29, 2016 at 6:08 AM, Steve Loughran wrote:
>
> On 27 Oct 2016, at 23:04, adam kramer wrote:
>
> Is the version of Spark built for Hadoop 2.7 and later only for 2.x
> releases?
>
> Is there any reason why Hadoop 3.0 is a non-starter for use with Spark
> 2.0? The version of aws-sdk in 3.0 actually works for DynamoDB, which
> would resolve our driver dependency issues.
>
> Thanks,
> Adam
[…] bucketed columns?
On Tue, Oct 18, 2016 at 10:59 PM, adam kramer wrote:
> Hello All,
>
> I’m trying to improve join efficiency within (self-join) and across
> data sets loaded from different parquet files, primarily due to a
> multi-stage data ingestion environment.
>
> Are there specific benefits to shuffling efficiency (e.g. no network
> transmission) if the parquet files are writt[…]
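What I have in mind is roughly the following sketch (the path, join
key, table names, and bucket count are made up, and it assumes a
second stage written the same way); as I understand it, if both sides
of a join are bucketed on the join key with the same bucket count,
Spark can plan the join without a shuffle exchange:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class BucketedIngest {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("bucketing-sketch")
        .enableHiveSupport() // for a persistent catalog across sessions
        .getOrCreate();

    // Made-up input path for one ingestion stage.
    Dataset<Row> stage1 = spark.read().parquet("/data/ingest/stage1");

    // Write the stage bucketed (and sorted) on the join key. Note that
    // bucketBy() only works with saveAsTable(), not save().
    stage1.write()
        .bucketBy(64, "device_id")  // made-up key and bucket count
        .sortBy("device_id")
        .format("parquet")
        .saveAsTable("stage1_bucketed");

    // Assuming stage2_bucketed was written the same way, explain()
    // should show no Exchange under the join when the buckets match.
    spark.sql("SELECT a.device_id FROM stage1_bucketed a "
        + "JOIN stage2_bucketed b ON a.device_id = b.device_id")
        .explain();
  }
}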