Nothing appears to be running on hivecluster2:8080. 'sudo jps' does show:

[hivedata@hivecluster2 ~]$ sudo jps
9953 PepAgent
13797 JournalNode
7618 NameNode
6574 Jps
12716 Worker
16671 RunJar
18675 Main
18177 JobTracker
10918 Master
18139 TaskTracker
7674 DataNode

I kill all of the processes listed. I restart the Spark master on hivecluster2:

[hivedata@hivecluster2 ~]$ sudo /opt/cloudera/parcels/SPARK/lib/spark/sbin/start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /var/log/spark/spark-root-org.apache.spark.deploy.master.Master-1-hivecluster2.out

I run the Spark shell again:

[hivedata@hivecluster2 ~]$ spark-shell -usejavacp -classpath "*.jar"
14/06/02 13:52:13 INFO spark.HttpServer: Starting HTTP Server
14/06/02 13:52:13 INFO server.Server: jetty-7.6.8.v20121106
14/06/02 13:52:13 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:52814
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 0.9.0
      /_/

Using Scala version 2.10.3 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_31)
Type in expressions to have them evaluated.
Type :help for more information.
14/06/02 13:52:19 INFO slf4j.Slf4jLogger: Slf4jLogger started
14/06/02 13:52:19 INFO Remoting: Starting remoting
14/06/02 13:52:19 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark@hivecluster2:46033]
14/06/02 13:52:19 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark@hivecluster2:46033]
14/06/02 13:52:19 INFO spark.SparkEnv: Registering BlockManagerMaster
14/06/02 13:52:19 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20140602135219-bd8a
14/06/02 13:52:19 INFO storage.MemoryStore: MemoryStore started with capacity 294.4 MB.
14/06/02 13:52:19 INFO network.ConnectionManager: Bound socket to port 50645 with id = ConnectionManagerId(hivecluster2,50645)
14/06/02 13:52:19 INFO storage.BlockManagerMaster: Trying to register BlockManager
14/06/02 13:52:19 INFO storage.BlockManagerMasterActor$BlockManagerInfo: Registering block manager hivecluster2:50645 with 294.4 MB RAM
14/06/02 13:52:19 INFO storage.BlockManagerMaster: Registered BlockManager
14/06/02 13:52:19 INFO spark.HttpServer: Starting HTTP Server
14/06/02 13:52:19 INFO server.Server: jetty-7.6.8.v20121106
14/06/02 13:52:19 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:36103
14/06/02 13:52:19 INFO broadcast.HttpBroadcast: Broadcast server started at http://10.10.30.211:36103
14/06/02 13:52:19 INFO spark.SparkEnv: Registering MapOutputTracker
14/06/02 13:52:19 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-ecce4c62-fef6-4369-a3d5-e3d7cbd1e00c
14/06/02 13:52:19 INFO spark.HttpServer: Starting HTTP Server
14/06/02 13:52:19 INFO server.Server: jetty-7.6.8.v20121106
14/06/02 13:52:19 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:37662
14/06/02 13:52:19 INFO server.Server: jetty-7.6.8.v20121106
14/06/02 13:52:19 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/storage/rdd,null}
14/06/02 13:52:19 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/storage,null}
14/06/02 13:52:19 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/stages/stage,null}
14/06/02 13:52:19 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/stages/pool,null}
14/06/02 13:52:19 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/stages,null}
14/06/02 13:52:19 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/environment,null}
14/06/02 13:52:19 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/executors,null}
14/06/02 13:52:19 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/metrics/json,null}
14/06/02 13:52:19 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/static,null}
14/06/02 13:52:19 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/,null}
14/06/02 13:52:19 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
14/06/02 13:52:19 INFO ui.SparkUI: Started Spark Web UI at http://hivecluster2:4040
14/06/02 13:52:19 INFO client.AppClient$ClientActor: Connecting to master spark://hivecluster2:7077...
14/06/02 13:52:20 INFO cluster.SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20140602135220-0000
Created spark context..
Spark context available as sc.

Note that the Spark Web UI is running at hivecluster2:4040; I get the UI when I go there. I verify again that nothing exists at hivecluster2:8080. I try to run my code:

...
val sparkConf = new SparkConf()
sparkConf.setMaster("spark://hivecluster2:7077")
sparkConf.setAppName("Test Spark App")
sparkConf.setJars(Array("avro-1.7.6.jar", "avro-mapred-1.7.6.jar"))
val sc = new SparkContext(sparkConf)

This produces a new Spark web server(!) at port 4041:

14/06/02 13:55:31 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4041
14/06/02 13:55:31 INFO ui.SparkUI: Started Spark Web UI at http://hivecluster2:4041
14/06/02 13:55:31 INFO spark.SparkContext: Added JAR avro-1.7.6.jar at http://10.10.30.211:49845/jars/avro-1.7.6.jar with timestamp 1401742531616
14/06/02 13:55:31 INFO spark.SparkContext: Added JAR avro-mapred-1.7.6.jar at http://10.10.30.211:49845/jars/avro-mapred-1.7.6.jar with timestamp 1401742531617
14/06/02 13:55:31 INFO client.AppClient$ClientActor: Connecting to master spark://hivecluster2:7077...
14/06/02 13:55:31 INFO cluster.SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20140602135531-0001

sc: org.apache.spark.SparkContext = org.apache.spark.SparkContext@2e9329e9

I run the rest of my code...
val input = "hdfs://hivecluster2/securityx/web_proxy_mef/2014/05/29/22/*.avro"//part-m-000{15,16}.avro" val jobConf= new JobConf(sc.hadoopConfiguration) jobConf.setJobName("Test Scala Job") FileInputFormat.setInputPaths(jobConf, input) val rdd = sc.hadoopRDD( //confBroadcast.value.value, jobConf, classOf[org.apache.avro.mapred.AvroInputFormat[GenericRecord]], classOf[org.apache.avro.mapred.AvroWrapper[GenericRecord]], classOf[org.apache.hadoop.io.NullWritable], 1) val f1 = rdd.first I get this: 14/06/02 14:00:36 INFO mapred.FileInputFormat: Total input paths to process : 17 14/06/02 14:00:36 INFO spark.SparkContext: Starting job: first at <console>:47 14/06/02 14:00:36 INFO scheduler.DAGScheduler: Got job 0 (first at <console>:47) with 1 output partitions (allowLocal=true) 14/06/02 14:00:36 INFO scheduler.DAGScheduler: Final stage: Stage 0 (first at <console>:47) 14/06/02 14:00:36 INFO scheduler.DAGScheduler: Parents of final stage: List() 14/06/02 14:00:36 INFO scheduler.DAGScheduler: Missing parents: List() 14/06/02 14:00:36 INFO scheduler.DAGScheduler: Computing the requested partition locally 14/06/02 14:00:36 INFO rdd.HadoopRDD: Input split: hdfs://hivecluster2/securityx/web_proxy_mef/2014/05/29/22/part-m-00000.avro:0+3864 14/06/02 14:00:36 INFO spark.SparkContext: Job finished: first at <console>:47, took 0.374416468 s 14/06/02 14:00:36 INFO spark.SparkContext: Starting job: first at <console>:47 14/06/02 14:00:36 INFO scheduler.DAGScheduler: Got job 1 (first at <console>:47) with 16 output partitions (allowLocal=true) 14/06/02 14:00:36 INFO scheduler.DAGScheduler: Final stage: Stage 1 (first at <console>:47) 14/06/02 14:00:36 INFO scheduler.DAGScheduler: Parents of final stage: List() 14/06/02 14:00:36 INFO scheduler.DAGScheduler: Missing parents: List() 14/06/02 14:00:36 INFO scheduler.DAGScheduler: Submitting Stage 1 (HadoopRDD[0] at hadoopRDD at <console>:45), which has no missing parents 14/06/02 14:00:36 INFO scheduler.DAGScheduler: Submitting 16 missing tasks from Stage 1 (HadoopRDD[0] at hadoopRDD at <console>:45) 14/06/02 14:00:36 INFO scheduler.TaskSchedulerImpl: Adding task set 1.0 with 16 tasks 14/06/02 14:00:51 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory I see my job at http://hivecluster2:4041, but not at hivecluster2:4040. Task succeeded, 0/16. How do I instantiate a new SparkContext without creating a new web server thing? That seems to be the issue. Russ On Mon, Jun 2, 2014 at 1:19 PM, Aaron Davidson <ilike...@gmail.com> wrote: > You may have to do "sudo jps", because it should definitely list your > processes. > > What does hivecluster2:8080 look like? My guess is it says there are 2 > applications registered, and one has taken all the executors. There must be > two applications running, as those are the only things that keep open those > 4040/4041 ports. > > > On Mon, Jun 2, 2014 at 11:32 AM, Russell Jurney <russell.jur...@gmail.com> > wrote: > >> If it matters, I have servers running at >> http://hivecluster2:4040/stages/ and http://hivecluster2:4041/stages/ >> >> When I run rdd.first, I see an item at >> http://hivecluster2:4041/stages/ but no tasks are running. Stage ID 1, >> first at <console>:46, Tasks: Succeeded/Total 0/16. 
>>
>> On Mon, Jun 2, 2014 at 10:09 AM, Russell Jurney
>> <russell.jur...@gmail.com> wrote:
>> > Looks like just worker and master processes are running:
>> >
>> > [hivedata@hivecluster2 ~]$ jps
>> > 10425 Jps
>> >
>> > [hivedata@hivecluster2 ~]$ ps aux|grep spark
>> >
>> > hivedata 10424 0.0 0.0 103248 820 pts/3 S+ 10:05 0:00 grep spark
>> >
>> > root 10918 0.5 1.4 4752880 230512 ? Sl May27 41:43 java -cp
>> > :/opt/cloudera/parcels/SPARK-0.9.0-1.cdh4.6.0.p0.98/lib/spark/conf:/opt/cloudera/parcels/SPARK-0.9.0-1.cdh4.6.0.p0.98/lib/spark/core/lib/*:/opt/cloudera/parcels/SPARK-0.9.0-1.cdh4.6.0.p0.98/lib/spark/repl/lib/*:/opt/cloudera/parcels/SPARK-0.9.0-1.cdh4.6.0.p0.98/lib/spark/examples/lib/*:/opt/cloudera/parcels/SPARK-0.9.0-1.cdh4.6.0.p0.98/lib/spark/bagel/lib/*:/opt/cloudera/parcels/SPARK-0.9.0-1.cdh4.6.0.p0.98/lib/spark/mllib/lib/*:/opt/cloudera/parcels/SPARK-0.9.0-1.cdh4.6.0.p0.98/lib/spark/streaming/lib/*:/opt/cloudera/parcels/SPARK-0.9.0-1.cdh4.6.0.p0.98/lib/spark/lib/*:/etc/hadoop/conf:/opt/cloudera/parcels/CDH/lib/hadoop/*:/opt/cloudera/parcels/CDH/lib/hadoop/../hadoop-hdfs/*:/opt/cloudera/parcels/CDH/lib/hadoop/../hadoop-yarn/*:/opt/cloudera/parcels/CDH/lib/hadoop/../hadoop-mapreduce/*:/opt/cloudera/parcels/SPARK-0.9.0-1.cdh4.6.0.p0.98/lib/spark/lib/scala-library.jar:/opt/cloudera/parcels/SPARK-0.9.0-1.cdh4.6.0.p0.98/lib/spark/lib/scala-compiler.jar:/opt/cloudera/parcels/SPARK-0.9.0-1.cdh4.6.0.p0.98/lib/spark/lib/jline.jar
>> > -Dspark.akka.logLifecycleEvents=true
>> > -Djava.library.path=/opt/cloudera/parcels/SPARK-0.9.0-1.cdh4.6.0.p0.98/lib/spark/lib:/opt/cloudera/parcels/CDH/lib/hadoop/lib/native
>> > -Xms512m -Xmx512m org.apache.spark.deploy.master.Master --ip hivecluster2
>> > --port 7077 --webui-port 18080
>> >
>> > root 12715 0.0 0.0 148028 656 ? S May27 0:00 sudo
>> > /opt/cloudera/parcels/SPARK/lib/spark/bin/spark-class
>> > org.apache.spark.deploy.worker.Worker spark://hivecluster2:7077
>> >
>> > root 12716 0.3 1.1 4155884 191340 ? Sl May27 30:21 java -cp
>> > :/opt/cloudera/parcels/SPARK/lib/spark/conf:/opt/cloudera/parcels/SPARK/lib/spark/core/lib/*:/opt/cloudera/parcels/SPARK/lib/spark/repl/lib/*:/opt/cloudera/parcels/SPARK/lib/spark/examples/lib/*:/opt/cloudera/parcels/SPARK/lib/spark/bagel/lib/*:/opt/cloudera/parcels/SPARK/lib/spark/mllib/lib/*:/opt/cloudera/parcels/SPARK/lib/spark/streaming/lib/*:/opt/cloudera/parcels/SPARK/lib/spark/lib/*:/etc/hadoop/conf:/opt/cloudera/parcels/CDH/lib/hadoop/*:/opt/cloudera/parcels/CDH/lib/hadoop/../hadoop-hdfs/*:/opt/cloudera/parcels/CDH/lib/hadoop/../hadoop-yarn/*:/opt/cloudera/parcels/CDH/lib/hadoop/../hadoop-mapreduce/*:/opt/cloudera/parcels/SPARK/lib/spark/lib/scala-library.jar:/opt/cloudera/parcels/SPARK/lib/spark/lib/scala-compiler.jar:/opt/cloudera/parcels/SPARK/lib/spark/lib/jline.jar
>> > -Dspark.akka.logLifecycleEvents=true
>> > -Djava.library.path=/opt/cloudera/parcels/SPARK/lib/spark/lib:/opt/cloudera/parcels/CDH/lib/hadoop/lib/native
>> > -Xms512m -Xmx512m org.apache.spark.deploy.worker.Worker
>> > spark://hivecluster2:7077
>> >
>> > On Sun, Jun 1, 2014 at 7:41 PM, Aaron Davidson <ilike...@gmail.com>
>> > wrote:
>> >>
>> >> Sounds like you have two shells running, and the first one is taking
>> >> all your resources. Do a "jps" and kill the other guy, then try again.
>> >>
>> >> By the way, you can look at http://localhost:8080 (replace localhost
>> >> with the server your Spark master is running on) to see what
>> >> applications are currently started, and what resource allocations they
>> >> have.
>> >>
>> >> On Sun, Jun 1, 2014 at 6:47 PM, Russell Jurney
>> >> <russell.jur...@gmail.com> wrote:
>> >>>
>> >>> Thanks again. Run results here:
>> >>> https://gist.github.com/rjurney/dc0efae486ba7d55b7d5
>> >>>
>> >>> This time I get a "port already in use" exception on 4040, but it
>> >>> isn't fatal. Then when I run rdd.first, I get this over and over:
>> >>>
>> >>> 14/06/01 18:35:40 WARN scheduler.TaskSchedulerImpl: Initial job has
>> >>> not accepted any resources; check your cluster UI to ensure that
>> >>> workers are registered and have sufficient memory
>> >>>
>> >>> On Sun, Jun 1, 2014 at 3:09 PM, Aaron Davidson <ilike...@gmail.com>
>> >>> wrote:
>> >>>>
>> >>>> You can avoid that by using the constructor that takes a SparkConf,
>> >>>> a la
>> >>>>
>> >>>> val conf = new SparkConf()
>> >>>> conf.setJars(Seq("avro.jar", ...))
>> >>>> val sc = new SparkContext(conf)
>> >>>>
>> >>>> On Sun, Jun 1, 2014 at 2:32 PM, Russell Jurney
>> >>>> <russell.jur...@gmail.com> wrote:
>> >>>>>
>> >>>>> Followup question: the docs for making a new SparkContext require
>> >>>>> that I know where $SPARK_HOME is. However, I have no idea. Any idea
>> >>>>> where that might be?
>> >>>>>
>> >>>>> On Sun, Jun 1, 2014 at 10:28 AM, Aaron Davidson <ilike...@gmail.com>
>> >>>>> wrote:
>> >>>>>>
>> >>>>>> Gotcha. The easiest way to get your dependencies to your Executors
>> >>>>>> would probably be to construct your SparkContext with all necessary
>> >>>>>> jars passed in (as the "jars" parameter), or inside a SparkConf
>> >>>>>> with setJars(). Avro is a "necessary jar", but it's possible your
>> >>>>>> application also needs to distribute other ones to the cluster.
>> >>>>>>
>> >>>>>> An easy way to make sure all your dependencies get shipped to the
>> >>>>>> cluster is to create an assembly jar of your application, which
>> >>>>>> includes all your application's transitive dependencies; then you
>> >>>>>> just need to tell Spark about that one jar. Maven and sbt both have
>> >>>>>> pretty straightforward ways of producing assembly jars. (A minimal
>> >>>>>> sbt sketch follows just below.)
>> >>>>>>
>> >>>>>> On Sat, May 31, 2014 at 11:23 PM, Russell Jurney
>> >>>>>> <russell.jur...@gmail.com> wrote:
>> >>>>>>>
>> >>>>>>> Thanks for the fast reply.
>> >>>>>>>
>> >>>>>>> I am running CDH 4.4 with the Cloudera Parcel of Spark 0.9.0, in
>> >>>>>>> standalone mode.
>> >>>>>>>
>> >>>>>>> On Saturday, May 31, 2014, Aaron Davidson <ilike...@gmail.com>
>> >>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>> The first issue was because your cluster was configured
>> >>>>>>>> incorrectly. You could probably read 1 file because that was done
>> >>>>>>>> on the driver node, but when it tried to run a job on the
>> >>>>>>>> cluster, it failed.
>> >>>>>>>>
>> >>>>>>>> Second issue: it seems that the jar containing Avro is not
>> >>>>>>>> getting propagated to the Executors. What version of Spark are
>> >>>>>>>> you running? What deployment mode (YARN, standalone, Mesos)?
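[A minimal sketch of the sbt route Aaron mentions above, hedged for a 2014-era build: the sbt-assembly plugin version, the project name, and the exact settings idiom are illustrative and should be checked against the plugin's README. Spark itself is marked "provided" so the assembly carries only the application's own dependencies, Avro here:]

// project/plugins.sbt (plugin version illustrative)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2")

// build.sbt
import AssemblyKeys._

assemblySettings

name := "test-spark-app"

scalaVersion := "2.10.3"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "0.9.0-incubating" % "provided",
  "org.apache.avro"  %  "avro"        % "1.7.6",
  "org.apache.avro"  %  "avro-mapred" % "1.7.6"
)

[Running "sbt assembly" then emits a single jar under target/scala-2.10/, and that one jar can be the sole entry passed to setJars().]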
>> >>>>>>>>
>> >>>>>>>> On Sat, May 31, 2014 at 9:37 PM, Russell Jurney
>> >>>>>>>> <russell.jur...@gmail.com> wrote:
>> >>>>>>>>
>> >>>>>>>> Now I get this:
>> >>>>>>>>
>> >>>>>>>> scala> rdd.first
>> >>>>>>>>
>> >>>>>>>> 14/05/31 21:36:28 INFO spark.SparkContext: Starting job: first at <console>:41
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.DAGScheduler: Got job 4 (first at <console>:41) with 1 output partitions (allowLocal=true)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.DAGScheduler: Final stage: Stage 4 (first at <console>:41)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.DAGScheduler: Parents of final stage: List()
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.DAGScheduler: Missing parents: List()
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.DAGScheduler: Computing the requested partition locally
>> >>>>>>>> 14/05/31 21:36:28 INFO rdd.HadoopRDD: Input split: hdfs://hivecluster2/securityx/web_proxy_mef/2014/05/29/22/part-m-00000.avro:0+3864
>> >>>>>>>> 14/05/31 21:36:28 INFO spark.SparkContext: Job finished: first at <console>:41, took 0.037371256 s
>> >>>>>>>> 14/05/31 21:36:28 INFO spark.SparkContext: Starting job: first at <console>:41
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.DAGScheduler: Got job 5 (first at <console>:41) with 16 output partitions (allowLocal=true)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.DAGScheduler: Final stage: Stage 5 (first at <console>:41)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.DAGScheduler: Parents of final stage: List()
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.DAGScheduler: Missing parents: List()
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.DAGScheduler: Submitting Stage 5 (HadoopRDD[0] at hadoopRDD at <console>:37), which has no missing parents
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.DAGScheduler: Submitting 16 missing tasks from Stage 5 (HadoopRDD[0] at hadoopRDD at <console>:37)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSchedulerImpl: Adding task set 5.0 with 16 tasks
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Starting task 5.0:0 as TID 92 on executor 2: hivecluster3 (NODE_LOCAL)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Serialized task 5.0:0 as 1294 bytes in 1 ms
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Starting task 5.0:3 as TID 93 on executor 1: hivecluster5.labs.lan (NODE_LOCAL)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Serialized task 5.0:3 as 1294 bytes in 0 ms
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Starting task 5.0:1 as TID 94 on executor 4: hivecluster4 (NODE_LOCAL)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Serialized task 5.0:1 as 1294 bytes in 1 ms
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Starting task 5.0:2 as TID 95 on executor 0: hivecluster6.labs.lan (NODE_LOCAL)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Serialized task 5.0:2 as 1294 bytes in 0 ms
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Starting task 5.0:4 as TID 96 on executor 3: hivecluster1.labs.lan (NODE_LOCAL)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Serialized task 5.0:4 as 1294 bytes in 0 ms
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Starting task 5.0:6 as TID 97 on executor 2: hivecluster3 (NODE_LOCAL)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Serialized task 5.0:6 as 1294 bytes in 0 ms
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Starting task 5.0:5 as TID 98 on executor 1: hivecluster5.labs.lan (NODE_LOCAL)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Serialized task 5.0:5 as 1294 bytes in 0 ms
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Starting task 5.0:8 as TID 99 on executor 4: hivecluster4 (NODE_LOCAL)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Serialized task 5.0:8 as 1294 bytes in 0 ms
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Starting task 5.0:7 as TID 100 on executor 0: hivecluster6.labs.lan (NODE_LOCAL)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Serialized task 5.0:7 as 1294 bytes in 0 ms
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Starting task 5.0:10 as TID 101 on executor 3: hivecluster1.labs.lan (NODE_LOCAL)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Serialized task 5.0:10 as 1294 bytes in 0 ms
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Starting task 5.0:14 as TID 102 on executor 2: hivecluster3 (NODE_LOCAL)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Serialized task 5.0:14 as 1294 bytes in 0 ms
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Starting task 5.0:9 as TID 103 on executor 1: hivecluster5.labs.lan (NODE_LOCAL)
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Serialized task 5.0:9 as 1294 bytes in 0 ms
>> >>>>>>>> 14/05/31 21:36:28 INFO scheduler.TaskSetManager: Starting task 5.0:11 as TID 104 on executor 4: hivecluster4 (N
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Russell Jurney twitter.com/rjurney russell.jur...@gmail.com
>> >>>>>>> datasyndrome.com
>> >>>>>
>> >>>>> --
>> >>>>> Russell Jurney twitter.com/rjurney russell.jur...@gmail.com
>> >>>>> datasyndrome.com
>> >>>
>> >>> --
>> >>> Russell Jurney twitter.com/rjurney russell.jur...@gmail.com
>> >>> datasyndrome.com
>> >
>> > --
>> > Russell Jurney twitter.com/rjurney russell.jur...@gmail.com
>> > datasyndrome.com
>>
>> --
>> Russell Jurney twitter.com/rjurney russell.jur...@gmail.com
>> datasyndrome.com
>
>

--
Russell Jurney twitter.com/rjurney russell.jur...@gmail.com
datasyndrome.com
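[One detail worth pulling out of the ps output quoted above: the master was started with "--webui-port 18080", which would explain why nothing answers at hivecluster2:8080; the applications-and-executors view Aaron describes should be at hivecluster2:18080 instead. A quick check, sketched on the assumption (not verified against this exact Spark version) that the standalone master serves a JSON status document at /json on its web UI port:]

// Fetch and print the standalone master's status page. Host and port come
// from the ps output above (--webui-port 18080); the /json path is an
// assumption about this master's web UI.
import scala.io.Source

object MasterStatus {
  def main(args: Array[String]): Unit = {
    val status = Source.fromURL("http://hivecluster2:18080/json").mkString
    println(status) // workers, running applications, cores and memory in use
  }
}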