Re: Unable to ship external Python libraries in PYSPARK
Is there some way to ship a plain text file along with the job, just like shipping Python libraries? Thanks in advance.

Daijia
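[Editor's note: one common way to do this is SparkContext.addFile plus SparkFiles.get, which distribute an arbitrary file to every executor. A minimal sketch follows; the file name "lookup.txt" and the filter logic are illustrative, not from this thread.]

#!/usr/bin/env python
# Sketch: ship a plain text file to executors alongside the job.
# Assumes "lookup.txt" exists on the driver; the name is illustrative.
from pyspark import SparkContext, SparkFiles

sc = SparkContext()
sc.addFile("/path/to/lookup.txt")  # copied to every node that runs tasks

def in_shipped_file(word):
    # SparkFiles.get resolves the local copy on whichever node runs the task
    with open(SparkFiles.get("lookup.txt")) as f:
        keywords = set(line.strip() for line in f)
    return word in keywords

matches = sc.parallelize(["spark", "mesos"]).filter(in_shipped_file).count()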
Re: How to submit Pyspark job in mesos?
I0730 16:57:21.963773 20042 status_update_manager.cpp:368] Forwarding status update TASK_LOST (UUID: 84107fc4-d997-4e9c-a256-00d30e5eb4f4) for task 5 of framework 20140730-165621-1526966464-5050-23977- to master@192.168.3.91:5050
I0730 16:57:21.966195 20042 status_update_manager.cpp:393] Received status update acknowledgement (UUID: 84107fc4-d997-4e9c-a256-00d30e5eb4f4) for task 5 of framework 20140730-165621-1526966464-5050-23977-
I0730 16:57:21.966434 20042 slave.cpp:2198] Cleaning up executor '5' of framework 20140730-165621-1526966464-5050-23977-
I0730 16:57:21.966717 20049 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20140730-154530-1526966464-5050-22832-1/frameworks/20140730-165621-1526966464-5050-23977-/executors/5/runs/33dedf06-507b-4f0f-b59b-7890f876d3b4' for gc 6.8881231704days in the future
I0730 16:57:21.966872 20042 slave.cpp:2273] Cleaning up framework 20140730-165621-1526966464-5050-23977-
I0730 16:57:21.967042 20049 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20140730-154530-1526966464-5050-22832-1/frameworks/20140730-165621-1526966464-5050-23977-/executors/5' for gc 6.8880958518days in the future
I0730 16:57:21.967258 20049 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20140730-154530-1526966464-5050-22832-1/frameworks/20140730-165621-1526966464-5050-23977-' for gc 6.8880614519days in the future
I0730 16:57:21.967341 20042 status_update_manager.cpp:277] Closing status update streams for framework 20140730-165621-1526966464-5050-23977-

Spark console output:

14/07/30 16:56:48 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4041
14/07/30 16:56:48 INFO ui.SparkUI: Started SparkUI at http://CentOS-19:4041
14/07/30 16:56:48 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/07/30 16:56:49 INFO scheduler.EventLoggingListener: Logging events to /tmp/spark-events/my_test.py-1406710609033
14/07/30 16:56:49 INFO util.Utils: Copying /home/daijia/deal_three_word/my_test.py to /tmp/spark-c8e9af2f-32b5-4bf0-9f57-c46dc82a4450/my_test.py
14/07/30 16:56:49 INFO spark.SparkContext: Added file file:/home/daijia/deal_three_word/my_test.py at http://192.168.3.91:42379/files/my_test.py with timestamp 1406710609772
I0730 16:56:49.882772 24123 sched.cpp:121] Version: 0.18.1
I0730 16:56:49.884660 24131 sched.cpp:217] New master detected at master@192.168.3.91:5050
I0730 16:56:49.884770 24131 sched.cpp:225] No credentials provided. Attempting to register without authentication
I0730 16:56:49.885520 24131 sched.cpp:391] Framework registered with 20140730-165621-1526966464-5050-23977-
14/07/30 16:56:49 INFO mesos.CoarseMesosSchedulerBackend: Registered as framework ID 20140730-165621-1526966464-5050-23977-
14/07/30 16:56:50 INFO spark.SparkContext: Starting job: count at /home/daijia/deal_three_word/my_test.py:27
14/07/30 16:56:50 INFO scheduler.DAGScheduler: Got job 0 (count at /home/daijia/deal_three_word/my_test.py:27) with 2 output partitions (allowLocal=false)
14/07/30 16:56:50 INFO scheduler.DAGScheduler: Final stage: Stage 0(count at /home/daijia/deal_three_word/my_test.py:27)
14/07/30 16:56:50 INFO scheduler.DAGScheduler: Parents of final stage: List()
14/07/30 16:56:50 INFO scheduler.DAGScheduler: Missing parents: List()
14/07/30 16:56:50 INFO scheduler.DAGScheduler: Submitting Stage 0 (PythonRDD[1] at RDD at PythonRDD.scala:37), which has no missing parents
14/07/30 16:56:50 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from Stage 0 (PythonRDD[1] at RDD at PythonRDD.scala:37)
14/07/30 16:56:50 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
14/07/30 16:56:55 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 0 is now TASK_LOST
14/07/30 16:57:00 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 1 is now TASK_LOST
14/07/30 16:57:00 INFO mesos.CoarseMesosSchedulerBackend: Blacklisting Mesos slave value: 20140730-154530-1526966464-5050-22832-2 due to too many failures; is Spark installed on it?
14/07/30 16:57:05 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 2 is now TASK_LOST
14/07/30 16:57:05 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/07/30 16:57:11 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 3 is now TASK_LOST
14/07/30 16:57:11 INFO mesos.CoarseMesosSchedulerBackend: Blacklisting Mesos slave value: 20140730-154530-1526966464-5050-22832-0 due to too many failures; is Spark installed on it?
14/07/30 16:57:16 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 4 is now TASK_LOST
14/07/30 16:57:20 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/07/30 16:57:21 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 5 is now TASK_LOST
14/07/30 16:57:21 INFO mesos.CoarseMesosSchedulerBackend
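[Editor's note: the "Blacklisting Mesos slave ... is Spark installed on it?" messages suggest the Mesos slaves cannot find or fetch a Spark distribution, which in coarse-grained Mesos mode is typically fixed by pointing executors at a downloadable Spark package via spark.executor.uri (or the SPARK_EXECUTOR_URI environment variable in spark-env.sh). A minimal sketch, assuming an HDFS URL that is illustrative only; any URI reachable from every slave works.]

# Sketch: tell Mesos executors where to fetch Spark from.
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setAppName("my_test")
        .set("spark.executor.uri",
             "hdfs://192.168.3.91:9000/dist/spark-1.0.0-bin-hadoop1.tgz"))  # assumed location
sc = SparkContext(conf=conf)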
How to submit Pyspark job in mesos?
Dear all,

I have Spark 1.0.0 and Mesos 0.18.1. After configuring Mesos and Spark and starting the Mesos cluster, I try to run a PySpark job with the command below:

spark-submit /path/to/my_pyspark_job.py --master mesos://192.168.0.21:5050

It fails with the errors below:

14/07/29 18:40:49 INFO server.Server: jetty-8.y.z-SNAPSHOT
14/07/29 18:40:49 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4041
14/07/29 18:40:49 INFO ui.SparkUI: Started SparkUI at http://CentOS-19:4041
14/07/29 18:40:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/07/29 18:40:50 INFO scheduler.EventLoggingListener: Logging events to /tmp/spark-events/my_test.py-1406630449771
14/07/29 18:40:50 INFO util.Utils: Copying /home/daijia/deal_three_word/my_test.py to /tmp/spark-4365b01d-b57a-4abb-b39c-cb57b83a28ce/my_test.py
14/07/29 18:40:50 INFO spark.SparkContext: Added file file:/home/daijia/deal_three_word/my_test.py at http://192.168.3.91:51188/files/my_test.py with timestamp 1406630450333
I0729 18:40:50.440551 15033 sched.cpp:121] Version: 0.18.1
I0729 18:40:50.442450 15035 sched.cpp:217] New master detected at master@192.168.3.91:5050
I0729 18:40:50.442570 15035 sched.cpp:225] No credentials provided. Attempting to register without authentication
I0729 18:40:50.443234 15036 sched.cpp:391] Framework registered with 20140729-174911-1526966464-5050-13758-0006
14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Registered as framework ID 20140729-174911-1526966464-5050-13758-0006
14/07/29 18:40:50 INFO spark.SparkContext: Starting job: count at /home/daijia/deal_three_word/my_test.py:27
14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 0 is now TASK_LOST
14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 1 is now TASK_LOST
14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 3 is now TASK_LOST
14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Blacklisting Mesos slave value: 20140729-163345-1526966464-5050-10913-0 due to too many failures; is Spark installed on it?
14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 2 is now TASK_LOST
14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Blacklisting Mesos slave value: 20140729-163345-1526966464-5050-10913-2 due to too many failures; is Spark installed on it?
14/07/29 18:40:50 INFO scheduler.DAGScheduler: Got job 0 (count at /home/daijia/deal_three_word/my_test.py:27) with 2 output partitions (allowLocal=false)
14/07/29 18:40:50 INFO scheduler.DAGScheduler: Final stage: Stage 0(count at /home/daijia/deal_three_word/my_test.py:27)
14/07/29 18:40:50 INFO scheduler.DAGScheduler: Parents of final stage: List()
14/07/29 18:40:50 INFO scheduler.DAGScheduler: Missing parents: List()
14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 4 is now TASK_LOST
14/07/29 18:40:50 INFO scheduler.DAGScheduler: Submitting Stage 0 (PythonRDD[1] at RDD at PythonRDD.scala:37), which has no missing parents
14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 5 is now TASK_LOST
14/07/29 18:40:50 INFO mesos.CoarseMesosSchedulerBackend: Blacklisting Mesos slave value: 20140729-163345-1526966464-5050-10913-1 due to too many failures; is Spark installed on it?
14/07/29 18:40:50 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from Stage 0 (PythonRDD[1] at RDD at PythonRDD.scala:37)
14/07/29 18:40:50 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
14/07/29 18:41:05 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/07/29 18:41:20 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/07/29 18:41:20 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

It just repeats the last message. Here is my Python script:

#!/usr/bin/env python
#coding=utf-8
from pyspark import SparkContext

sc = SparkContext()
temp = []
for index in range(1000):
    temp.append(index)
sc.parallelize(temp).count()

So, is the command correct? Or is something else causing the problem?

Thanks in advance,
Daijia
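[Editor's note: one thing worth checking here is argument order. spark-submit stops parsing its own flags at the first positional argument (the application file), so a --master placed after the .py script is passed to the script itself rather than to Spark. The usual form puts the flags first:]

spark-submit --master mesos://192.168.0.21:5050 /path/to/my_pyspark_job.py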
Re: How to submit Pyspark job in mesos?
Actually, it runs okay on my slaves when they are deployed in standalone mode. The error only occurs when I switch to Mesos. Anyway, thanks for your reply, and any ideas will help.