I ran into this recently. Turned out we had an old
org-xerial-snappy.properties file in one of our conf directories that
had the setting:
# Disables loading Snappy-Java native library bundled in the
# snappy-java-*.jar file forcing to load the Snappy-Java native
# library from the java.library.path.
#
org.xerial.snappy.disable.bundled.libs=true
When I switched that to false, it made the problem go away.
May or may not be your problem of course, but worth a look.
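If you want to check whether a similar stale file is lurking on your end, a quick scan of the usual conf directories does the trick. This is just a sketch: the search_roots list is my guess, so point it at wherever your Hadoop/Spark conf actually lives.

```python
import os

# Hypothetical search roots -- adjust to your own conf locations.
search_roots = ["/etc/hadoop/conf", "/etc/spark/conf", os.path.expanduser("~/conf")]

for root in search_roots:
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name == "org-xerial-snappy.properties":
                path = os.path.join(dirpath, name)
                with open(path) as f:
                    contents = f.read()
                # Flag the file if it disables the bundled native library.
                flagged = "disable.bundled.libs=true" in contents
                print(path, "-> disables bundled libs" if flagged else "-> ok")
```

Directories that don't exist are silently skipped by os.walk, so it is safe to leave guesses in the list.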
HTH,
DR
On 11/17/2015 05:22 PM, Andy Davidson wrote:
I started a Spark POC. I created an EC2 cluster on AWS using spark-ec2, with
3 slaves. In general I am running into trouble even with small workloads. I
am using IPython notebooks running on my Spark cluster, and everything is
painfully slow. I am using the standalone cluster manager. I noticed that I
am getting the following warnings on my driver console. Any idea what the
problem might be?
15/11/17 22:01:59 WARN MetricsSystem: Using default name DAGScheduler for
source because spark.app.id is not set.
15/11/17 22:03:05 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
15/11/17 22:03:05 WARN LoadSnappy: Snappy native library not loaded
Here is an overview of my POC app. I have a file on HDFS containing about
5000 Twitter status strings.
tweetStrings = sc.textFile(dataURL)
jTweets = (tweetStrings.map(lambda x: json.loads(x)).take(10))
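For what it's worth, that pipeline is just line-by-line JSON parsing with an early stop; the pure-Python equivalent (no cluster involved, function and variable names are mine) is roughly:

```python
import json
from itertools import islice

def first_n_tweets(lines, n=10):
    # Same shape as tweetStrings.map(lambda x: json.loads(x)).take(n):
    # parse each line as JSON and keep only the first n results.
    return list(islice((json.loads(line) for line in lines), n))

# Stand-in for the 5000 status strings on HDFS.
sample = ['{"id": %d}' % i for i in range(5000)]
print(first_n_tweets(sample, 3))  # -> [{'id': 0}, {'id': 1}, {'id': 2}]
```

On 5000 short strings this finishes in well under a second, which suggests the error below is about JVM sizing rather than data volume.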
Generated the following error: "error occurred while calling o78.partitions.:
java.lang.OutOfMemoryError: GC overhead limit exceeded"
Any idea what we need to do to improve a new Spark user's out-of-the-box
experience?
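One thing worth trying: "GC overhead limit exceeded" on a 5000-line file usually means the JVM heaps are undersized rather than the data being big, and the stock defaults on small EC2 nodes are fairly small. A sketch of conf/spark-defaults.conf entries — the 2g sizes are guesses, tune them to your instance type:

```
# conf/spark-defaults.conf -- sketch only; sizes are guesses for small EC2 nodes
spark.driver.memory     2g
spark.executor.memory   2g
```

The driver size can also be passed at launch with --driver-memory 2g on the pyspark command below. The spark.app.id warning, for its part, is harmless: the metrics system starts before the standalone master assigns an application id, so the DAGScheduler source falls back to a default name.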
Kind regards
Andy
export PYSPARK_PYTHON=python3.4
export PYSPARK_DRIVER_PYTHON=python3.4
export IPYTHON_OPTS="notebook --no-browser --port=7000 --log-level=WARN"
MASTER_URL=spark://ec2-55-218-207-122.us-west-1.compute.amazonaws.com:7077
numCores=2
$SPARK_ROOT/bin/pyspark --master $MASTER_URL --total-executor-cores $numCores $*