Sven Krasser created SPARK-5209:
-----------------------------------
Summary: Jobs fail with "unexpected value" exception in certain
environments
Key: SPARK-5209
URL: https://issues.apache.org/jira/browse/SPARK-5209
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 1.2.0
Environment: Amazon Elastic Map Reduce
Reporter: Sven Krasser
Jobs fail consistently and reproducibly with exceptions of the following type
in PySpark using Spark 1.2.0:
{noformat}
2015-01-13 00:14:05,898 ERROR [Executor task launch worker-1] executor.Executor
(Logging.scala:logError(96)) - Exception in task 27.0 in stage 0.0 (TID 28)
org.apache.spark.SparkException: PairwiseRDD: unexpected value: List([B@4c09f3e0)
{noformat}
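For context, the message comes from the JVM side when the stream of serialized objects returned by the Python worker cannot be grouped into (key, value) pairs. The following is a minimal sketch of that pairing logic, assuming the {{grouped(2)}} behavior of {{PairwiseRDD.compute}} in Spark 1.2's {{PythonRDD.scala}}; the function name {{pairwise}} below is hypothetical and the sketch is illustration, not the actual Scala implementation:

```python
from itertools import islice

class SparkException(Exception):
    pass

def pairwise(stream):
    """Group a flat stream [k1, v1, k2, v2, ...] into (k, v) tuples,
    mirroring Scala's .grouped(2) followed by a match on Seq(a, b).
    A trailing unpaired element yields a one-element group, which
    trips the 'unexpected value' branch seen in the log above."""
    it = iter(stream)
    pairs = []
    while True:
        group = list(islice(it, 2))
        if not group:
            return pairs
        if len(group) == 2:
            pairs.append((group[0], group[1]))
        else:
            raise SparkException("PairwiseRDD: unexpected value: %r" % (group,))
```

Read this way, the exception suggests the executor saw an odd number of elements in the shuffled stream, i.e. a key without its pickled value (or vice versa).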
The issue first appeared in Spark 1.2.0 and is sensitive to the environment
(configuration, cluster size); some changes to the environment make the error
disappear.
The following steps yield a reproduction on Amazon Elastic Map Reduce. Launch
an EMR cluster with the following parameters (this will bootstrap Spark 1.2.0
onto it):
{code}
aws emr create-cluster --region us-west-1 --no-auto-terminate \
  --ec2-attributes KeyName=your-key-here,SubnetId=your-subnet-here \
  --bootstrap-actions Path=s3://support.elasticmapreduce/spark/install-spark,Args='["-g","-v","1.2.0.a"]' \
  --ami-version 3.3 \
  --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m3.xlarge \
                    InstanceGroupType=CORE,InstanceCount=3,InstanceType=r3.xlarge \
  --name "Spark Issue Repro" \
  --visible-to-all-users --applications Name=Ganglia
{code}
Next, copy the attached {{spark-defaults.conf}} to {{~/spark/conf/}}.
Run {{~/spark/bin/spark-submit gen_test_data.py}} to generate a test data set
on HDFS. Finally, run {{~/spark/bin/spark-submit repro.py}} to reproduce the
error.
Driver and executor logs are attached. For reference, a spark-user thread on
the topic is here:
http://mail-archives.us.apache.org/mod_mbox/spark-user/201501.mbox/%[email protected]%3E