Dear Spark users,
My team is working on a small library that builds on PySpark and is organized
like PySpark as well -- it has a JVM component (that runs in the Spark driver
and executor) and a Python component (that runs in the PySpark driver and
executor processes). What's a good approach to packaging and distributing
such a library?
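For concreteness, the JVM half could be as small as a plain object reachable
over Py4J (the package and class names below are made up; PySpark exposes the
gateway on the driver as sc._jvm, so the Python half could call something like
sc._jvm.com.example.mylib.Helper.greet("world")):

package com.example.mylib

// Hypothetical JVM half of such a library. Methods on a Scala object
// compile to static forwarders, so Py4J can invoke them directly
// through the gateway that PySpark's SparkContext exposes.
object Helper {
  def greet(name: String): String = s"hello, $name"
}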
Hi all,
I am thinking of starting work on a profiler for Spark clusters. The current
idea is that it would collect jstacks from executor nodes and put them into
a central index (either a database or Elasticsearch), and it would present
them in a UI that would let people slice and dice the data.
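For example, the per-node collector might be nothing more than a shell-out to
the jstack CLI (a sketch; discovering executor PIDs and shipping the record to
the central index are left out):

import scala.sys.process._
import java.net.InetAddress

// Grab one thread dump from a JVM on this node and wrap it in the
// fields the central index would want.
def collectJstack(pid: Int): Map[String, String] = Map(
  "host"      -> InetAddress.getLocalHost.getHostName,
  "pid"       -> pid.toString,
  "timestamp" -> System.currentTimeMillis.toString,
  "stack"     -> Seq("jstack", pid.toString).!!
)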
sc2.addJar("extras-v2.jar")
println(sc2.filter(/* fn that depends on jar */).count())
}
... even if classes in extras-v1.jar and extras-v2.jar have name collisions.
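At the plain-JVM level, the isolation I have in mind looks like the sketch
below (com.example.Fn is a made-up class name):

import java.io.File
import java.net.URLClassLoader

// Two sibling classloaders, each seeing one jar, so the same class name
// can resolve to different bytecode in each loader: no collision.
val parent = getClass.getClassLoader
def loaderFor(jar: String) =
  new URLClassLoader(Array(new File(jar).toURI.toURL), parent)

val fnV1 = loaderFor("extras-v1.jar").loadClass("com.example.Fn")
val fnV2 = loaderFor("extras-v2.jar").loadClass("com.example.Fn")
assert(fnV1 ne fnV2) // distinct classes despite identical names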
Punya
From: Punya Biswal <pbis...@palantir.com>
Reply-To: user@spark.apache.org
Date: Sunday, March 16, 2014 at 11:09 AM
To: user
Hi all,
The Maven Central repo contains an artifact for Spark 0.9.0 built with
unmodified Hadoop, and the Cloudera repo contains an artifact for Spark
0.9.0 built with CDH 5 beta. Is there a repo that contains spark-core built
against a non-beta version of CDH (such as 4.4.0)?
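If such an artifact exists, I'd expect to consume it from sbt roughly like
this (the resolver URL is Cloudera's public repo; the version string
"0.9.0-cdh4.4.0" is my guess at what it would be called, and it only resolves
if someone has actually published it):

// build.sbt sketch
resolvers += "cloudera-repos" at "https://repository.cloudera.com/artifactory/cloudera-repos/"

libraryDependencies += "org.apache.spark" %% "spark-core" % "0.9.0-cdh4.4.0" // hypothetical version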
Punya