Re: Accessing Web UI
Try http://localhost:4040

On Mon, Feb 22, 2016 at 8:23 AM, Vasanth Bhat wrote:
> Thanks Gourav, Eduardo
>
> I tried http://localhost:8080 and http://OAhtvJ5MCA:8080/. In both
> cases Firefox just hangs.
>
> I also tried with the lynx text-based browser. I get the message "HTTP
> request sent; waiting for response." and it hangs as well.
>
> Is there a way to enable debug logs in the Spark master service, to
> understand what's going wrong?
>
> Thanks
> Vasanth
>
> On Fri, Feb 19, 2016 at 5:46 PM, Gourav Sengupta
> <gourav.sengu...@gmail.com> wrote:
>
>> can you please try localhost:8080?
>>
>> Regards,
>> Gourav Sengupta
>>
>> On Fri, Feb 19, 2016 at 11:18 AM, vasbhat wrote:
>>
>>> Hi,
>>>
>>> I have installed Spark 1.6 and am trying to start the master
>>> (start-master.sh) and access the web UI.
>>>
>>> I get the following logs on running start-master.sh:
>>>
>>> Spark Command: /usr/jdk/instances/jdk1.8.0/jre/bin/java -cp
>>> /usr/local/spark-1.6.0-bin-hadoop2.6/conf/:/usr/local/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar:/usr/local/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/usr/local/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/usr/local/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar
>>> -Xms4g -Xmx4g org.apache.spark.deploy.master.Master --ip OAhtvJ5MCA
>>> --port 7077 --webui-port 8080
>>>
>>> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
>>> 16/02/19 03:07:30 INFO Master: Registered signal handlers for [TERM, HUP, INT]
>>> 16/02/19 03:07:30 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>> 16/02/19 03:07:31 INFO SecurityManager: Changing view acls to: sluser
>>> 16/02/19 03:07:31 INFO SecurityManager: Changing modify acls to: sluser
>>> 16/02/19 03:07:31 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(sluser); users with modify permissions: Set(sluser)
>>> 16/02/19 03:07:32 INFO Utils: Successfully started service 'sparkMaster' on port 7077.
>>> 16/02/19 03:07:32 INFO Master: Starting Spark master at spark://OAhtvJ5MCA:7077
>>> 16/02/19 03:07:32 INFO Master: Running Spark version 1.6.0
>>> 16/02/19 03:07:32 WARN AbstractConnector: insufficient threads configured for SelectChannelConnector@0.0.0.0:8080
>>> 16/02/19 03:07:32 INFO Utils: Successfully started service 'MasterUI' on port 8080.
>>> 16/02/19 03:07:32 INFO MasterWebUI: Started MasterWebUI at http://127.0.0.1:8080
>>> 16/02/19 03:07:32 WARN AbstractConnector: insufficient threads configured for SelectChannelConnector@OAhtvJ5MCA:6066
>>> 16/02/19 03:07:32 INFO Utils: Successfully started service on port 6066.
>>> 16/02/19 03:07:32 INFO StandaloneRestServer: Started REST server for submitting applications on port 6066
>>> 16/02/19 03:07:33 INFO Master: I have been elected leader! New state: ALIVE
>>>
>>> Through netstat I can see that port 8080 is listening.
>>>
>>> Now when I start Firefox and access http://127.0.0.1:8080, Firefox just
>>> hangs with the message "Waiting for 127.0.0.1" and does not connect to
>>> the UI.
>>>
>>> How do I enable debug for the Spark master daemon, to understand what's
>>> happening?
>>>
>>> Thanks
>>> Vasanth
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Re-Accessing-Web-UI-tp23029p26276.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
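When a browser hangs on the master UI, it can help to take the browser out of the loop and probe the port directly from the master host. The repeated "insufficient threads configured for SelectChannelConnector" warnings in the log above suggest the embedded Jetty server is accepting connections but has no worker thread free to answer them, which would produce exactly this hang. A quick diagnostic sketch, assuming curl is available on the host:

```shell
# Does anything answer on the UI port at all? A thread-starved server
# will accept the TCP connection but never send an HTTP response.
curl -v --max-time 10 http://127.0.0.1:8080/

# The REST submission port logged the same warning; check it too.
curl -v --max-time 10 http://127.0.0.1:6066/
```

If curl connects but times out waiting for a response, that points at the server-side thread shortage in the log rather than at the browser or any proxy settings.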
Re: spark-submit stuck and no output in console
Sonal, SparkPi couldn't run either. It stuck to the screen with no output:

hadoop-user@yks-hadoop-m01:/usr/local/spark$ ./bin/run-example SparkPi

On Tue, Nov 17, 2015 at 12:22 PM, Steve Loughran wrote:
> 48 hours is one of those kerberos warning times (as is 24h, 72h and 7
> days)

Does this mean I need to restart the whole Hadoop YARN cluster to reset
Kerberos?

--
Odeyemi 'Kayode O.
http://ng.linkedin.com/in/kayodeodeyemi. t: @charyorde
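Restarting the cluster shouldn't be necessary just to refresh Kerberos credentials; a ticket is normally inspected and renewed from the client side. A sketch of the usual checks, assuming MIT Kerberos tools on the submitting host (the principal name below is illustrative):

```shell
# Show the current ticket cache with issue/expiry times -- if the
# ticket expired ~48h after kinit, that matches the warning interval.
klist

# Renew in place, if still within the renewable lifetime...
kinit -R

# ...or obtain a fresh ticket outright (principal name is an example)
kinit hadoop-user@EXAMPLE.COM
```

Long-running jobs generally need either a renewable ticket plus periodic renewal, or a keytab, rather than a cluster restart.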
Re: spark-submit stuck and no output in console
Our Hadoop NFS Gateway seems to have been malfunctioning. I restarted it, and Spark jobs have now resumed successfully. Problem solved.
Re: spark-submit stuck and no output in console
Anyone experienced this issue as well?

On Mon, Nov 16, 2015 at 8:06 PM, Kayode Odeyemi <drey...@gmail.com> wrote:
>> Or are you saying that the Java process never even starts?
>
> Exactly.
>
> Here's what I got back from jstack, as expected:
>
> hadoop-user@yks-hadoop-m01:/usr/local/spark/bin$ jstack 31316
> 31316: Unable to open socket file: target process not responding or
> HotSpot VM not loaded
> The -F option can be used when the target process is not responding
> hadoop-user@yks-hadoop-m01:/usr/local/spark/bin$ jstack 31316 -F
> Attaching to core -F from executable 31316, please wait...
> Error attaching to core file: Can't attach to the core file

--
Odeyemi 'Kayode O.
http://ng.linkedin.com/in/kayodeodeyemi. t: @charyorde
Re: spark-submit stuck and no output in console
Thanks for the reply, Sonal. I'm on JDK 7 (/usr/lib/jvm/java-7-oracle).

My env is a YARN cluster made of 7 nodes (6 datanode/node-manager nodes,
1 namenode/resource-manager node). I executed the spark-submit job on the
namenode, while on one of the datanodes I executed
'hadoop fs -put /binstore /user/hadoop-user/' to dump 1TB of data into all
the datanodes. That process is still running without hassle and is only
using 1.3 GB of its 1.7 GB heap.

Initially, I submitted 2 jobs to the YARN cluster, which ran for 2 days and
then suddenly stopped. Nothing in the logs shows the root cause.

On Tue, Nov 17, 2015 at 11:42 AM, Sonal Goyal <sonalgoy...@gmail.com> wrote:
> Could it be jdk related? Which version are you on?
>
> Best Regards,
> Sonal
> Founder, Nube Technologies <http://www.nubetech.co>
> Reifier at Strata Hadoop World
> <http://strataconf.com/big-data-conference-sg-2015/public/schedule/detail/44606>
> Reifier at Spark Summit 2015
> <https://spark-summit.org/2015/events/real-time-fuzzy-matching-with-spark-and-elastic-search/>
> <http://in.linkedin.com/in/sonalgoyal>
>
> On Tue, Nov 17, 2015 at 2:48 PM, Kayode Odeyemi <drey...@gmail.com> wrote:
>> Anyone experienced this issue as well?
>>
>> [earlier messages and jstack output elided]

--
Odeyemi 'Kayode O.
http://ng.linkedin.com/in/kayodeodeyemi. t: @charyorde
Re: spark-submit stuck and no output in console
Spark 1.5.1.

The fact is that there's no stack trace: no output from that command at all
to the console. This is all I get:

hadoop-user@yks-hadoop-m01:/usr/local/spark/bin$ tail -1 /tmp/spark-profile-job.log
nohup: ignoring input
/usr/local/spark/bin/spark-class: line 76: 29516 Killed "$RUNNER" -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main "$@"

On Mon, Nov 16, 2015 at 5:22 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> Which release of Spark are you using?
>
> Can you take a stack trace and pastebin it?
>
> Thanks
>
> On Mon, Nov 16, 2015 at 5:50 AM, Kayode Odeyemi <drey...@gmail.com> wrote:
>> ./spark-submit --class com.migration.UpdateProfiles --executor-memory 8g
>> ~/migration-profiles-0.1-SNAPSHOT.jar
>>
>> is stuck and outputs nothing to the console.
>>
>> What could be the cause of this? Current max heap size is 1.75g and it's
>> only using 1g.

--
Odeyemi 'Kayode O.
http://ng.linkedin.com/in/kayodeodeyemi. t: @charyorde
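The "29516 Killed" line in that log means something outside the JVM terminated the launcher process, and on a loaded Linux box that is commonly the kernel OOM killer reacting to a large memory request. A quick way to check, assuming a Linux host with readable kernel logs (log paths vary by distribution):

```shell
# Look for OOM-killer activity around the time the job died
dmesg | grep -iE 'killed process|out of memory' | tail

# On syslog-based systems the same records usually land here
grep -i 'oom' /var/log/syslog | tail
```

If the OOM killer shows up, the fix is to lower the requested heap (or free memory on the host) rather than to debug Spark itself.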
spark-submit stuck and no output in console
./spark-submit --class com.migration.UpdateProfiles --executor-memory 8g ~/migration-profiles-0.1-SNAPSHOT.jar

is stuck and outputs nothing to the console. What could be the cause of this? Current max heap size is 1.75g and it's only using 1g.
Re: spark-submit stuck and no output in console
> Or are you saying that the Java process never even starts?

Exactly.

Here's what I got back from jstack, as expected:

hadoop-user@yks-hadoop-m01:/usr/local/spark/bin$ jstack 31316
31316: Unable to open socket file: target process not responding or HotSpot VM not loaded
The -F option can be used when the target process is not responding
hadoop-user@yks-hadoop-m01:/usr/local/spark/bin$ jstack 31316 -F
Attaching to core -F from executable 31316, please wait...
Error attaching to core file: Can't attach to the core file
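One note on the transcript above: jstack expects its flags before the PID, so `jstack 31316 -F` is parsed as an attach to a core file named `-F`, which explains the "Attaching to core" error. A sketch of the usual sequence (PID 31316 is the one from this thread; run as the same user that owns the JVM):

```shell
# List JVM processes owned by the current user to confirm the PID exists
jps -l

# Take a thread dump; flags come before the PID
jstack 31316

# Force-attach when the process is hung and not responding
jstack -F 31316
```

If `jps` doesn't show the PID at all, the Java process really never started (or already died), which matches the "target process not responding" message.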
Re: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
Thank you. That seems to have resolved it.

On Fri, Nov 6, 2015 at 11:46 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> You mentioned the resourcemanager but not the nodemanagers.
>
> I think you need to install Spark on the nodes running nodemanagers.
>
> Cheers
>
> On Fri, Nov 6, 2015 at 1:32 PM, Kayode Odeyemi <drey...@gmail.com> wrote:
>> Hi,
>>
>> I have a YARN Hadoop setup of 8 nodes (7 datanodes, 1 namenode and
>> resourcemanager). I have Spark set up only on the namenode/resource
>> manager.
>>
>> Do I need to have Spark installed on the datanodes?
>>
>> I ask because I'm getting the error below when I run a Spark job through
>> spark-submit:
>>
>> Error: Could not find or load main class
>> org.apache.spark.deploy.yarn.ExecutorLauncher
>>
>> I appreciate your help.
>>
>> Many thanks
Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
Hi,

I have a YARN Hadoop setup of 8 nodes (7 datanodes, 1 namenode and
resourcemanager). I have Spark set up only on the namenode/resource manager.

Do I need to have Spark installed on the datanodes?

I ask because I'm getting the error below when I run a Spark job through
spark-submit:

Error: Could not find or load main class
org.apache.spark.deploy.yarn.ExecutorLauncher

I appreciate your help.

Many thanks
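This error means YARN containers on the worker nodes can't find the Spark classes. Besides installing Spark on every nodemanager node, Spark 1.x can also ship its assembly jar to containers from HDFS via the `spark.yarn.jar` property, so it only needs to live in one place. A sketch under that assumption (jar and HDFS paths below are illustrative; match them to your actual Spark build):

```shell
# Upload the assembly once to a location every node can read
hdfs dfs -mkdir -p /user/spark/share
hdfs dfs -put /usr/local/spark/lib/spark-assembly-1.5.1-hadoop2.6.0.jar /user/spark/share/

# Then point YARN containers at it, e.g. in conf/spark-defaults.conf:
#   spark.yarn.jar  hdfs:///user/spark/share/spark-assembly-1.5.1-hadoop2.6.0.jar
```

With that set, spark-submit no longer depends on a local Spark install on each datanode for launching ExecutorLauncher.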
Futures timed out after [120 seconds].
Hi,

I'm running Spark standalone in cluster mode (1 master, 2 workers).
Everything has failed, including spark-submit, with errors such as:

Caused by: java.lang.ClassNotFoundException: com.migration.App$$anonfun$upsert$1

Now I've reverted to submitting jobs through Scala apps. Any ideas on how
to solve this error?

15/11/05 01:46:37 INFO util.Utils: Successfully started service 'driverPropsFetcher' on port 58748.
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:68)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:149)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:250)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.lookupTimeout
    at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcEnv.scala:214)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:229)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:225)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:242)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:98)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:162)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:69)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:68)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    ... 4 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:107)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:241)
    ... 11 more
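The exception itself names the controlling setting, `spark.rpc.lookupTimeout`. Timeouts like this usually indicate the executor can't reach the driver at all (a connectivity or binding problem) rather than a genuinely slow lookup, but raising the timeout is a cheap first test. A sketch using spark-submit's `--conf` flag (the class and jar names here are the ones from this thread, and the path is illustrative):

```shell
./bin/spark-submit \
  --conf spark.rpc.lookupTimeout=300s \
  --class com.migration.App \
  ~/migration-app.jar
```

If a five-minute timeout still fails identically, focus on whether the worker hosts can open a TCP connection back to the driver's advertised address and port.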
Re: Executor app-20151104202102-0000 finished with state EXITED
I've tried that once. No job was executed on the workers; that is, the
workers weren't used.

What I want to achieve is to have the SparkContext use a remote Spark
standalone master at 192.168.2.11 (this is where I started the master with
./start-master.sh and all the slaves with ./start-slaves.sh).

On Wed, Nov 4, 2015 at 9:28 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> Something like this:
> conf.setMaster("local[3]")
>
> On Wed, Nov 4, 2015 at 11:08 AM, Kayode Odeyemi <drey...@gmail.com> wrote:
>> Thanks Ted.
>>
>> Where would you suggest I add that? I'm creating a SparkContext from a
>> Spark app. My conf setup looks like this:
>>
>> conf.setMaster("spark://192.168.2.11:7077")
>> conf.set("spark.logConf", "true")
>> conf.set("spark.akka.logLifecycleEvents", "true")
>> conf.set("spark.executor.memory", "5g")
>>
>> [remainder of quoted thread, including the akka "Association failed"
>> stack trace, elided]
Re: Executor app-20151104202102-0000 finished with state EXITED
Thanks Ted.

Where would you suggest I add that? I'm creating a SparkContext from a
Spark app. My conf setup looks like this:

conf.setMaster("spark://192.168.2.11:7077")
conf.set("spark.logConf", "true")
conf.set("spark.akka.logLifecycleEvents", "true")
conf.set("spark.executor.memory", "5g")

On Wed, Nov 4, 2015 at 9:04 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> Have you tried using -Dspark.master=local ?
>
> Cheers
>
> On Wed, Nov 4, 2015 at 10:47 AM, Kayode Odeyemi <drey...@gmail.com> wrote:
>> Hi,
>>
>> I can't seem to understand why all created executors always fail.
>>
>> I have a Spark standalone cluster made up of 2 workers and 1 master.
>> My spark-env looks like this:
>>
>> SPARK_MASTER_IP=192.168.2.11
>> SPARK_LOCAL_IP=192.168.2.11
>> SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=4"
>> SPARK_WORKER_CORES=4
>> SPARK_WORKER_MEMORY=6g
>>
>> From the Spark logs, I get this:
>>
>> [akka "Association failed" / ActorNotFound stack trace elided]
>>
>> I appreciate any kind of help.
SPARK_SSH_FOREGROUND format
From
http://spark.apache.org/docs/latest/spark-standalone.html#cluster-launch-scripts :

> If you do not have a password-less setup, you can set the environment
> variable SPARK_SSH_FOREGROUND and serially provide a password for each
> worker.

What does "serially provide a password for each worker" mean? Can it be
something like this (assume I have 3 workers)?

SPARK_SSH_FOREGROUND=worker1pass,worker2pass,worker3pass
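For what it's worth, my reading of sbin/slaves.sh is that SPARK_SSH_FOREGROUND is only checked for being set, not parsed for values: when it is non-empty, the script runs each ssh connection in the foreground instead of backgrounding it, so you type each worker's password at the ssh prompt, one worker after another. The variable never carries passwords. A sketch under that assumption:

```shell
# Any non-empty value should do; the prompts come from ssh itself,
# serially: worker1's password, then worker2's, then worker3's.
SPARK_SSH_FOREGROUND=yes ./sbin/start-slaves.sh
```

So the comma-separated password list above would not work; setting up password-less ssh keys remains the more practical option for clusters of any size.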
Executor app-20151104202102-0000 finished with state EXITED
Hi,

I can't seem to understand why all created executors always fail.

I have a Spark standalone cluster made up of 2 workers and 1 master. My
spark-env looks like this:

SPARK_MASTER_IP=192.168.2.11
SPARK_LOCAL_IP=192.168.2.11
SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=4"
SPARK_WORKER_CORES=4
SPARK_WORKER_MEMORY=6g

From the Spark logs, I get this:

15/11/04 20:36:35 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@172.26.71.5:61094] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://sparkDriver@172.26.71.5:61094]] Caused by: [Operation timed out: /172.26.71.5:61094]
Exception in thread "main" akka.actor.ActorNotFound: Actor not found for: ActorSelection[Anchor(akka.tcp://sparkDriver@172.26.71.5:61094/), Path(/user/CoarseGrainedScheduler)]
    at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:65)
    at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:63)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
    at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
    at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:73)
    at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.unbatchedExecute(Future.scala:74)
    at akka.dispatch.BatchingExecutor$class.execute(BatchingExecutor.scala:120)
    at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.execute(Future.scala:73)
    at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
    at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
    at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:266)
    at akka.actor.EmptyLocalActorRef.specialHandle(ActorRef.scala:533)
    at akka.actor.DeadLetterActorRef.specialHandle(ActorRef.scala:569)
    at akka.actor.DeadLetterActorRef.$bang(ActorRef.scala:559)
    at akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef.$bang(RemoteActorRefProvider.scala:87)
    at akka.remote.EndpointWriter.postStop(Endpoint.scala:557)
    at akka.actor.Actor$class.aroundPostStop(Actor.scala:477)
    at akka.remote.EndpointActor.aroundPostStop(Endpoint.scala:411)
    at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
    at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
    at akka.actor.ActorCell.terminate(ActorCell.scala:369)
    at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
    at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
    at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
15/11/04 20:36:35 INFO actor.LocalActorRef: Message [akka.remote.EndpointWriter$AckIdleCheckTimer$] from Actor[akka://driverPropsFetcher/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FsparkDriver%40172.26.71.5%3A61094-0/endpointWriter#-1769599826] to Actor[akka://driverPropsFetcher/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FsparkDriver%40172.26.71.5%3A61094-0/endpointWriter#-1769599826] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

I appreciate any kind of help.
Re: Maven build failed (Spark master)
Thank you. But I'm getting the same warnings and it's still preventing the
archive from being generated. I've run this on both OS X Lion and Ubuntu 12.
Same error; no .gz file.

On Mon, Oct 26, 2015 at 9:10 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> Looks like '-Pyarn' was missing in your command.
>
> On Mon, Oct 26, 2015 at 12:06 PM, Kayode Odeyemi <drey...@gmail.com> wrote:
>> I used this command, which is synonymous to what you have:
>>
>> ./make-distribution.sh --name spark-latest --tgz --mvn mvn
>> -Dhadoop.version=2.6.0 -Phadoop-2.6 -Phive -Phive-thriftserver -DskipTests
>> clean package -U
>>
>> But I still see WARNINGs like this in the output, and no .gz file is created:
>>
>> cp: /usr/local/spark-latest/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=26/.part-r-5.gz.parquet.crc: No such file or directory
>> cp: /usr/local/spark-latest/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=26/part-r-5.gz.parquet: No such file or directory
>> cp: /usr/local/spark-latest/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9: No such file or directory
>> cp: /usr/local/spark-latest/dist/python/test_support/sql/parquet_partitioned/year=2015/month=9: unable to copy extended attributes to /usr/local/spark-latest/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9: No such file or directory
>> cp: /usr/local/spark-latest/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1: No such file or directory
>> cp: /usr/local/spark-latest/dist/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1: unable to copy extended attributes to /usr/local/spark-latest/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1: No such file or directory
>> cp: /usr/local/spark-latest/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1/.part-r-7.gz.parquet.crc: No such file or directory
>>
>> On Mon, Oct 26, 2015 at 8:58 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>> If you use the command shown in:
>>> https://github.com/apache/spark/pull/9281
>>>
>>> You should have got the following:
>>>
>>> ./dist/python/test_support/sql/parquet_partitioned/year=2014/month=9/day=1/part-r-8.gz.parquet
>>> ./dist/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1/part-r-7.gz.parquet
>>> ./dist/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=25/part-r-4.gz.parquet
>>> ./dist/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=25/part-r-2.gz.parquet
>>> ./dist/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=26/part-r-5.gz.parquet
>>>
>>> On Mon, Oct 26, 2015 at 11:47 AM, Kayode Odeyemi <drey...@gmail.com> wrote:
>>>> I see a lot of stuff like this after a successful Maven build:
>>>>
>>>> cp: /usr/local/spark-latest/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2014/month=9/day=1/part-r-00008.gz.parquet: No such file or directory
>>>>
>>>> It seems to fail when it tries to package the build as an archive.
>>>>
>>>> I'm using the latest code on github master.
>>>>
>>>> Any ideas please?
>>>>
>>>> On Mon, Oct 26, 2015 at 6:20 PM, Yana Kadiyska <yana.kadiy...@gmail.com> wrote:
>>>>> In 1.4, ./make_distribution produces a .tgz file in the root directory
>>>>> (same directory that make_distribution is in)
>>>>>
>>>>> On Mon, Oct 26, 2015 at 8:46 AM, Kayode Odeyemi <drey...@gmail.com>
Re: Maven build failed (Spark master)
Thanks gents. Removal of 'clean package -U' made the difference.

On Tue, Oct 27, 2015 at 6:39 PM, Todd Nist <tsind...@gmail.com> wrote:
> I issued the same basic command and it worked fine.
>
> RADTech-MBP:spark $ ./make-distribution.sh --name hadoop-2.6 --tgz -Pyarn
> -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver -DskipTests
>
> Which created spark-1.6.0-SNAPSHOT-bin-hadoop-2.6.tgz in the root
> directory of the project.
>
> FWIW, the environment was an MBP with OS X 10.10.5 and Java:
>
> java version "1.8.0_51"
> Java(TM) SE Runtime Environment (build 1.8.0_51-b16)
> Java HotSpot(TM) 64-Bit Server VM (build 25.51-b03, mixed mode)
>
> -Todd
>
> On Tue, Oct 27, 2015 at 12:17 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>> I used the following command:
>>
>> make-distribution.sh --name custom-spark --tgz -Phadoop-2.4 -Phive
>> -Phive-thriftserver -Pyarn
>>
>> spark-1.6.0-SNAPSHOT-bin-custom-spark.tgz was generated (with patch from
>> SPARK-11348)
>>
>> Can you try the above command?
>>
>> Thanks
>>
>> On Tue, Oct 27, 2015 at 7:03 AM, Kayode Odeyemi <drey...@gmail.com> wrote:
>>> Ted, I switched to this:
>>>
>>> ./make-distribution.sh --name spark-latest --tgz -Dhadoop.version=2.6.0
>>> -Phadoop-2.6 -Phive -Phive-thriftserver -Pyarn -DskipTests clean package -U
>>>
>>> Same error. No .gz file. Here's the bottom of the output log:
>>>
>>> + rm -rf /home/emperor/javaprojects/spark/dist
>>> + mkdir -p /home/emperor/javaprojects/spark/dist/lib
>>> + echo 'Spark [WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin (git revision 3689beb) built for Hadoop [WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Pl
>>> + echo 'Build flags: -Dhadoop.version=2.6.0' -Phadoop-2.6 -Phive -Phive-thriftserver -Pyarn -DskipTests clean package -U
>>> + cp /home/emperor/javaprojects/spark/assembly/target/scala-2.10/spark-assembly-1.6.0-SNAPSHOT-hadoop2.6.0.jar /home/emperor/javaprojects/spark/dist/lib/
>>> + cp /home/emperor/javaprojects/spark/examples/target/scala-2.10/spark-examples-1.6.0-SNAPSHOT-hadoop2.6.0.jar /home/emperor/javaprojects/spark/dist/lib/
>>> + cp /home/emperor/javaprojects/spark/network/yarn/target/scala-2.10/spark-1.6.0-SNAPSHOT-yarn-shuffle.jar /home/emperor/javaprojects/spark/dist/lib/
>>> + mkdir -p /home/emperor/javaprojects/spark/dist/examples/src/main
>>> + cp -r /home/emperor/javaprojects/spark/examples/src/main /home/emperor/javaprojects/spark/dist/examples/src/
>>> + '[' 1 == 1 ']'
>>> + cp /home/emperor/javaprojects/spark/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar /home/emperor/javaprojects/spark/lib_managed/jars/datanucleus-core-3.2.10.jar /home/emperor/javaprojects ed/jars/datanucleus-rdbms-3.2.9.jar /home/emperor/javaprojects/spark/dist/lib/
>>> + cp /home/emperor/javaprojects/spark/LICENSE /home/emperor/javaprojects/spark/dist
>>> + cp -r /home/emperor/javaprojects/spark/licenses /home/emperor/javaprojects/spark/dist
>>> + cp /home/emperor/javaprojects/spark/NOTICE /home/emperor/javaprojects/spark/dist
>>> + '[' -e /home/emperor/javaprojects/spark/CHANGES.txt ']'
>>> + cp -r /home/emperor/javaprojects/spark/data /home/emperor/javaprojects/spark/dist
>>> + mkdir /home/emperor/javaprojects/spark/dist/conf
>>> + cp /home/emperor/javaprojects/spark/conf/docker.properties.template /home/emperor/javaprojects/spark/conf/fairscheduler.xml.template /home/emperor/javaprojects/spark/conf/log4j.properties emperor/javaprojects/spark/conf/metrics.properties.template /home/emperor/javaprojects/spark/conf/slaves.template /home/emperor/javaprojects/spark/conf/spark-defaults.conf.template /home/em ts/spark/conf/spark-env.sh.template /home/emperor/javaprojects/spark/dist/conf
>>> + cp /home/emperor/javaprojects/spark/README.md /home/emperor/javaprojects/spark/dist
>>> + cp -r /home/emperor/javaprojects/spark/bin /home/emperor/javaprojects/spark/dist
>>> + cp -r /home/emperor/javaprojects/spark/python /home/emperor/javaprojects/spark/dist
>>> + cp -r /home/emperor/javaprojects/spark/sbin /home/emperor/javaprojects/spark/dist
>>> + cp -r /home/emperor/javaprojects/spark/ec2 /home/emperor/javaprojects/spark/dist
>>> + '[' -d
Re: Maven build failed (Spark master)
Seems the build and directory structure in dist is similar to the .gz file downloaded from the downloads page. Can the dist directory be used as is? On Tue, Oct 27, 2015 at 4:03 PM, Kayode Odeyemi <drey...@gmail.com> wrote: > Ted, I switched to this: > > ./make-distribution.sh --name spark-latest --tgz -Dhadoop.version=2.6.0 > -Phadoop-2.6 -Phive -Phive-thriftserver -Pyarn -DskipTests clean package -U > > Same error. No .gz file. Here's the bottom output log: > > + rm -rf /home/emperor/javaprojects/spark/dist > + mkdir -p /home/emperor/javaprojects/spark/dist/lib > + echo 'Spark [WARNING] See > http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin (git revision > 3689beb) built for Hadoop [WARNING] See > http://docs.codehaus.org/display/MAVENUSER/Shade+Pl > + echo 'Build flags: -Dhadoop.version=2.6.0' -Phadoop-2.6 -Phive > -Phive-thriftserver -Pyarn -DskipTests clean package -U > + cp > /home/emperor/javaprojects/spark/assembly/target/scala-2.10/spark-assembly-1.6.0-SNAPSHOT-hadoop2.6.0.jar > /home/emperor/javaprojects/spark/dist/lib/ > + cp > /home/emperor/javaprojects/spark/examples/target/scala-2.10/spark-examples-1.6.0-SNAPSHOT-hadoop2.6.0.jar > /home/emperor/javaprojects/spark/dist/lib/ > + cp > /home/emperor/javaprojects/spark/network/yarn/target/scala-2.10/spark-1.6.0-SNAPSHOT-yarn-shuffle.jar > /home/emperor/javaprojects/spark/dist/lib/ > + mkdir -p /home/emperor/javaprojects/spark/dist/examples/src/main > + cp -r /home/emperor/javaprojects/spark/examples/src/main > /home/emperor/javaprojects/spark/dist/examples/src/ > + '[' 1 == 1 ']' > + cp > /home/emperor/javaprojects/spark/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar > /home/emperor/javaprojects/spark/lib_managed/jars/datanucleus-core-3.2.10.jar > /home/emperor/javaprojects > ed/jars/datanucleus-rdbms-3.2.9.jar > /home/emperor/javaprojects/spark/dist/lib/ > + cp /home/emperor/javaprojects/spark/LICENSE > /home/emperor/javaprojects/spark/dist > + cp -r /home/emperor/javaprojects/spark/licenses > 
/home/emperor/javaprojects/spark/dist > + cp /home/emperor/javaprojects/spark/NOTICE > /home/emperor/javaprojects/spark/dist > + '[' -e /home/emperor/javaprojects/spark/CHANGES.txt ']' > + cp -r /home/emperor/javaprojects/spark/data > /home/emperor/javaprojects/spark/dist > + mkdir /home/emperor/javaprojects/spark/dist/conf > + cp /home/emperor/javaprojects/spark/conf/docker.properties.template > /home/emperor/javaprojects/spark/conf/fairscheduler.xml.template > /home/emperor/javaprojects/spark/conf/log4j.properties > emperor/javaprojects/spark/conf/metrics.properties.template > /home/emperor/javaprojects/spark/conf/slaves.template > /home/emperor/javaprojects/spark/conf/spark-defaults.conf.template /home/em > ts/spark/conf/spark-env.sh.template > /home/emperor/javaprojects/spark/dist/conf > + cp /home/emperor/javaprojects/spark/README.md > /home/emperor/javaprojects/spark/dist > + cp -r /home/emperor/javaprojects/spark/bin > /home/emperor/javaprojects/spark/dist > + cp -r /home/emperor/javaprojects/spark/python > /home/emperor/javaprojects/spark/dist > + cp -r /home/emperor/javaprojects/spark/sbin > /home/emperor/javaprojects/spark/dist > + cp -r /home/emperor/javaprojects/spark/ec2 > /home/emperor/javaprojects/spark/dist > + '[' -d /home/emperor/javaprojects/spark/R/lib/SparkR ']' > + '[' false == true ']' > + '[' true == true ']' > + TARDIR_NAME='spark-[WARNING] See > http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest' > + TARDIR='/home/emperor/javaprojects/spark/spark-[WARNING] See > http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest' > + rm -rf '/home/emperor/javaprojects/spark/spark-[WARNING] See > http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest' > + cp -r /home/emperor/javaprojects/spark/dist > '/home/emperor/javaprojects/spark/spark-[WARNING] See > http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest' > cp: cannot create directory > 
`/home/emperor/javaprojects/spark/spark-[WARNING] See > http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest': > No such file or directory > > > On Tue, Oct 27, 2015 at 2:14 PM, Ted Yu <yuzhih...@gmail.com> wrote: > >> Can you try the same command shown in the pull request ? >> >> Thanks >> >> On Oct 27, 2015, at 12:40 AM, Kayode Odeyemi <drey...@gmail.com> wrote: >> >> Thank you. >> >> But I'm getting same warnings and it's still preventing the archive from >> being generated. >> >> I've ran this on both OSX Lion and Ubuntu 12. Same error. No .gz file >> >> On Mon, Oct 26, 2015 at 9:10 PM, Ted Yu <yuzhih...@gmail.com> wrote: >> >>> Looks
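The mangled directory name above ("spark-[WARNING] See ...") points at the likely cause of the missing .tgz: the trace later in this digest shows make-distribution.sh computing the Spark version via `mvn help:evaluate ... | grep -v INFO | tail -n 1`, and that filter drops [INFO] lines but not [WARNING] or [ERROR] lines, so stray Maven output becomes part of the tar directory name. A small sketch of the failure mode and a stricter filter (variable names here are illustrative, not the script's own):

```shell
# Simulated `mvn help:evaluate -Dexpression=project.version` output in
# which a [WARNING] line happens to come last, as in the traces above.
mvn_output='[INFO] Scanning for projects...
1.6.0-SNAPSHOT
[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin'

# The script's filter: `grep -v INFO | tail -n 1` lets WARNING through.
broken=$(printf '%s\n' "$mvn_output" | grep -v INFO | tail -n 1)

# A stricter filter: keep only lines that look like a version number.
fixed=$(printf '%s\n' "$mvn_output" | grep -E '^[0-9]+\.[0-9]+' | tail -n 1)

echo "broken: $broken"   # the WARNING line leaks into the version
echo "fixed:  $fixed"    # the real version string survives
```

This is why the build itself can succeed while the packaging step fails: the cp/tar targets are built from the polluted version string, so they point at a directory that was never created.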
Re: Maven build failed (Spark master)
Ted, I switched to this: ./make-distribution.sh --name spark-latest --tgz -Dhadoop.version=2.6.0 -Phadoop-2.6 -Phive -Phive-thriftserver -Pyarn -DskipTests clean package -U Same error. No .gz file. Here's the bottom output log: + rm -rf /home/emperor/javaprojects/spark/dist + mkdir -p /home/emperor/javaprojects/spark/dist/lib + echo 'Spark [WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin (git revision 3689beb) built for Hadoop [WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Pl + echo 'Build flags: -Dhadoop.version=2.6.0' -Phadoop-2.6 -Phive -Phive-thriftserver -Pyarn -DskipTests clean package -U + cp /home/emperor/javaprojects/spark/assembly/target/scala-2.10/spark-assembly-1.6.0-SNAPSHOT-hadoop2.6.0.jar /home/emperor/javaprojects/spark/dist/lib/ + cp /home/emperor/javaprojects/spark/examples/target/scala-2.10/spark-examples-1.6.0-SNAPSHOT-hadoop2.6.0.jar /home/emperor/javaprojects/spark/dist/lib/ + cp /home/emperor/javaprojects/spark/network/yarn/target/scala-2.10/spark-1.6.0-SNAPSHOT-yarn-shuffle.jar /home/emperor/javaprojects/spark/dist/lib/ + mkdir -p /home/emperor/javaprojects/spark/dist/examples/src/main + cp -r /home/emperor/javaprojects/spark/examples/src/main /home/emperor/javaprojects/spark/dist/examples/src/ + '[' 1 == 1 ']' + cp /home/emperor/javaprojects/spark/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar /home/emperor/javaprojects/spark/lib_managed/jars/datanucleus-core-3.2.10.jar /home/emperor/javaprojects ed/jars/datanucleus-rdbms-3.2.9.jar /home/emperor/javaprojects/spark/dist/lib/ + cp /home/emperor/javaprojects/spark/LICENSE /home/emperor/javaprojects/spark/dist + cp -r /home/emperor/javaprojects/spark/licenses /home/emperor/javaprojects/spark/dist + cp /home/emperor/javaprojects/spark/NOTICE /home/emperor/javaprojects/spark/dist + '[' -e /home/emperor/javaprojects/spark/CHANGES.txt ']' + cp -r /home/emperor/javaprojects/spark/data /home/emperor/javaprojects/spark/dist + mkdir 
/home/emperor/javaprojects/spark/dist/conf + cp /home/emperor/javaprojects/spark/conf/docker.properties.template /home/emperor/javaprojects/spark/conf/fairscheduler.xml.template /home/emperor/javaprojects/spark/conf/log4j.properties emperor/javaprojects/spark/conf/metrics.properties.template /home/emperor/javaprojects/spark/conf/slaves.template /home/emperor/javaprojects/spark/conf/spark-defaults.conf.template /home/em ts/spark/conf/spark-env.sh.template /home/emperor/javaprojects/spark/dist/conf + cp /home/emperor/javaprojects/spark/README.md /home/emperor/javaprojects/spark/dist + cp -r /home/emperor/javaprojects/spark/bin /home/emperor/javaprojects/spark/dist + cp -r /home/emperor/javaprojects/spark/python /home/emperor/javaprojects/spark/dist + cp -r /home/emperor/javaprojects/spark/sbin /home/emperor/javaprojects/spark/dist + cp -r /home/emperor/javaprojects/spark/ec2 /home/emperor/javaprojects/spark/dist + '[' -d /home/emperor/javaprojects/spark/R/lib/SparkR ']' + '[' false == true ']' + '[' true == true ']' + TARDIR_NAME='spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest' + TARDIR='/home/emperor/javaprojects/spark/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest' + rm -rf '/home/emperor/javaprojects/spark/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest' + cp -r /home/emperor/javaprojects/spark/dist '/home/emperor/javaprojects/spark/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest' cp: cannot create directory `/home/emperor/javaprojects/spark/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest': No such file or directory On Tue, Oct 27, 2015 at 2:14 PM, Ted Yu <yuzhih...@gmail.com> wrote: > Can you try the same command shown in the pull request ? > > Thanks > > On Oct 27, 2015, at 12:40 AM, Kayode Odeyemi <drey...@gmail.com> wrote: > > Thank you. 
> > But I'm getting same warnings and it's still preventing the archive from > being generated. > > I've ran this on both OSX Lion and Ubuntu 12. Same error. No .gz file > > On Mon, Oct 26, 2015 at 9:10 PM, Ted Yu <yuzhih...@gmail.com> wrote: > >> Looks like '-Pyarn' was missing in your command. >> >> On Mon, Oct 26, 2015 at 12:06 PM, Kayode Odeyemi <drey...@gmail.com> >> wrote: >> >>> I used this command which is synonymous to what you have: >>> >>> ./make-distribution.sh --name spark-latest --tgz --mvn mvn >>> -Dhadoop.version=2.6.0 -Phadoop-2.6 -Phive -Phive-thriftserver -DskipTests >>> clean package -U >>> >>> But I still see WARNINGS like this in the output and no .gz file created: >>> >>> cp: /usr/local/spark-latest/spark-[WARNING] See >>> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/p
Re: Maven build failed (Spark master)
I used this command which is synonymous to what you have: ./make-distribution.sh --name spark-latest --tgz --mvn mvn -Dhadoop.version=2.6.0 -Phadoop-2.6 -Phive -Phive-thriftserver -DskipTests clean package -U But I still see WARNINGS like this in the output and no .gz file created: cp: /usr/local/spark-latest/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=26/.part-r-5.gz.parquet.crc: No such file or directory cp: /usr/local/spark-latest/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=26/part-r-5.gz.parquet: No such file or directory cp: /usr/local/spark-latest/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9: No such file or directory cp: /usr/local/spark-latest/dist/python/test_support/sql/parquet_partitioned/year=2015/month=9: unable to copy extended attributes to /usr/local/spark-latest/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9: No such file or directory cp: /usr/local/spark-latest/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1: No such file or directory cp: /usr/local/spark-latest/dist/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1: unable to copy extended attributes to /usr/local/spark-latest/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1: No such file or directory cp: /usr/local/spark-latest/spark-[WARNING] See 
http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1/.part-r-7.gz.parquet.crc: No such file or directory On Mon, Oct 26, 2015 at 8:58 PM, Ted Yu <yuzhih...@gmail.com> wrote: > If you use the command shown in: > https://github.com/apache/spark/pull/9281 > > You should have got the following: > > > ./dist/python/test_support/sql/parquet_partitioned/year=2014/month=9/day=1/part-r-8.gz.parquet > > ./dist/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1/part-r-7.gz.parquet > > ./dist/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=25/part-r-4.gz.parquet > > ./dist/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=25/part-r-2.gz.parquet > > ./dist/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=26/part-r-5.gz.parquet > > On Mon, Oct 26, 2015 at 11:47 AM, Kayode Odeyemi <drey...@gmail.com> > wrote: > >> I see a lot of stuffs like this after the a successful maven build: >> >> cp: /usr/local/spark-latest/spark-[WARNING] See >> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2014/month=9/day=1/ >> part-r-8.gz.parquet: No such file or directory >> >> Seems it fails when it tries to package the build as an archive. >> >> I'm using the latest code on github master. >> >> Any ideas please? >> >> On Mon, Oct 26, 2015 at 6:20 PM, Yana Kadiyska <yana.kadiy...@gmail.com> >> wrote: >> >>> In 1.4 ./make_distribution produces a .tgz file in the root directory >>> (same directory that make_distribution is in) >>> >>> >>> >>> On Mon, Oct 26, 2015 at 8:46 AM, Kayode Odeyemi <drey...@gmail.com> >>> wrote: >>> >>>> Hi, >>>> >>>> The ./make_distribution task completed. However, I can't seem to locate >>>> the >>>> .tar.gz file. >>>> >>>> Where does Spark save this? or should I just work with the dist >>>> directory? 
>>>> >>>> On Fri, Oct 23, 2015 at 4:23 PM, Kayode Odeyemi <drey...@gmail.com> >>>> wrote: >>>> >>>>> I saw this when I tested manually (without ./make-distribution) >>>>> >>>>> Detected Maven Version: 3.2.2 is not in the allowed range 3.3.3. >>>>> >>>>> So I simply upgraded maven to 3.3.3. >>>>> >>>>> Resolved. Thanks >>>>> >>>>> On Fri, Oct 23, 2015 at 3:17 PM, Sean Owen <so...@cloudera.com> wrote: >>>>> >>>>>> This doesn't show the actual error output from Maven. I have a strong >
Re: Maven build failed (Spark master)
I see a lot of output like this after an otherwise successful maven build: cp: /usr/local/spark-latest/spark-[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2014/month=9/day=1/ part-r-8.gz.parquet: No such file or directory Seems it fails when it tries to package the build as an archive. I'm using the latest code on github master. Any ideas please? On Mon, Oct 26, 2015 at 6:20 PM, Yana Kadiyska <yana.kadiy...@gmail.com> wrote: > In 1.4 ./make_distribution produces a .tgz file in the root directory > (same directory that make_distribution is in) > > > > On Mon, Oct 26, 2015 at 8:46 AM, Kayode Odeyemi <drey...@gmail.com> wrote: > >> Hi, >> >> The ./make_distribution task completed. However, I can't seem to locate >> the >> .tar.gz file. >> >> Where does Spark save this? or should I just work with the dist directory? >> >> On Fri, Oct 23, 2015 at 4:23 PM, Kayode Odeyemi <drey...@gmail.com> >> wrote: >> >>> I saw this when I tested manually (without ./make-distribution) >>> >>> Detected Maven Version: 3.2.2 is not in the allowed range 3.3.3. >>> >>> So I simply upgraded maven to 3.3.3. >>> >>> Resolved. Thanks >>> >>> On Fri, Oct 23, 2015 at 3:17 PM, Sean Owen <so...@cloudera.com> wrote: >>> >>>> This doesn't show the actual error output from Maven. I have a strong >>>> guess that you haven't set MAVEN_OPTS to increase the memory Maven can >>>> use. >>>> >>>> On Fri, Oct 23, 2015 at 6:14 AM, Kayode Odeyemi <drey...@gmail.com> >>>> wrote: >>>> > Hi, >>>> > >>>> > I can't seem to get a successful maven build. Please see command >>>> output >>>> > below: >>>> > >>>> > bash-3.2$ ./make-distribution.sh --name spark-latest --tgz --mvn mvn >>>> > -Dhadoop.version=2.7.0 -Phadoop-2.7 -Phive -Phive-thriftserver >>>> -DskipTests >>>> > clean package >>>> > +++ dirname ./make-distribution.sh >>>> > ++ cd . 
>>>> > ++ pwd >>>> > + SPARK_HOME=/usr/local/spark-latest >>>> > + DISTDIR=/usr/local/spark-latest/dist >>>> > + SPARK_TACHYON=false >>>> > + TACHYON_VERSION=0.7.1 >>>> > + TACHYON_TGZ=tachyon-0.7.1-bin.tar.gz >>>> > + >>>> > TACHYON_URL= >>>> https://github.com/amplab/tachyon/releases/download/v0.7.1/tachyon-0.7.1-bin.tar.gz >>>> > + MAKE_TGZ=false >>>> > + NAME=none >>>> > + MVN=/usr/local/spark-latest/build/mvn >>>> > + (( 12 )) >>>> > + case $1 in >>>> > + NAME=spark-latest >>>> > + shift >>>> > + shift >>>> > + (( 10 )) >>>> > + case $1 in >>>> > + MAKE_TGZ=true >>>> > + shift >>>> > + (( 9 )) >>>> > + case $1 in >>>> > + MVN=mvn >>>> > + shift >>>> > + shift >>>> > + (( 7 )) >>>> > + case $1 in >>>> > + break >>>> > + '[' -z >>>> /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home ']' >>>> > + '[' -z >>>> /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home ']' >>>> > ++ command -v git >>>> > + '[' /usr/bin/git ']' >>>> > ++ git rev-parse --short HEAD >>>> > + GITREV=487d409 >>>> > + '[' '!' -z 487d409 ']' >>>> > + GITREVSTRING=' (git revision 487d409)' >>>> > + unset GITREV >>>> > ++ command -v mvn >>>> > + '[' '!' /usr/bin/mvn ']' >>>> > ++ mvn help:evaluate -Dexpression=project.version >>>> -Dhadoop.version=2.7.0 >>>> > -Phadoop-2.7 -Phive -Phive-thriftserver -DskipTests clean package >>>> > ++ grep -v INFO >>>> > ++ tail -n 1 >>>> > + VERSION='[ERROR] [Help 1] >>>> > >>>> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException >>>> ' >>>> > >>>> > Same output error with JDK 7 >>>> > >>>> > Appreciate your help. >>>> > >>>> > >>>> >>> >>> >>> >> >
Loading binary files from NFS share
Hi, Is it possible to load binary files from an NFS share like this: sc.binaryFiles("nfs://host/mountpath") I understand that it takes a path, but I'd like to know whether it accepts a protocol scheme. Appreciate your help.
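For what it's worth, sc.binaryFiles resolves paths through the Hadoop FileSystem API, and a stock Hadoop build registers no "nfs" scheme, so that URI would most likely fail with "No FileSystem for scheme: nfs". The usual approach is to mount the share at the same local path on the driver and every worker, then point Spark at the mount with a file:// URI. A hedged sketch (the mount point /mnt/data and the app name are illustrative, not from the original question):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object NfsBinaryFiles {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("nfs-binary-files"))

    // The NFS share must be mounted at the same local path on the driver
    // and on every worker node, e.g.: mount -t nfs host:/export /mnt/data
    val files = sc.binaryFiles("file:///mnt/data")

    // Each element is (path, PortableDataStream); toArray() pulls the bytes
    files.take(1).foreach { case (path, stream) =>
      println(s"$path: ${stream.toArray().length} bytes")
    }

    sc.stop()
  }
}
```

The trade-off is that a file:// path makes the mount a deployment requirement on every node, whereas an hdfs:// or s3 path does not.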
Re: Maven build failed (Spark master)
Hi, The ./make_distribution task completed. However, I can't seem to locate the .tar.gz file. Where does Spark save this? or should I just work with the dist directory? On Fri, Oct 23, 2015 at 4:23 PM, Kayode Odeyemi <drey...@gmail.com> wrote: > I saw this when I tested manually (without ./make-distribution) > > Detected Maven Version: 3.2.2 is not in the allowed range 3.3.3. > > So I simply upgraded maven to 3.3.3. > > Resolved. Thanks > > On Fri, Oct 23, 2015 at 3:17 PM, Sean Owen <so...@cloudera.com> wrote: > >> This doesn't show the actual error output from Maven. I have a strong >> guess that you haven't set MAVEN_OPTS to increase the memory Maven can >> use. >> >> On Fri, Oct 23, 2015 at 6:14 AM, Kayode Odeyemi <drey...@gmail.com> >> wrote: >> > Hi, >> > >> > I can't seem to get a successful maven build. Please see command output >> > below: >> > >> > bash-3.2$ ./make-distribution.sh --name spark-latest --tgz --mvn mvn >> > -Dhadoop.version=2.7.0 -Phadoop-2.7 -Phive -Phive-thriftserver >> -DskipTests >> > clean package >> > +++ dirname ./make-distribution.sh >> > ++ cd . 
>> > ++ pwd >> > + SPARK_HOME=/usr/local/spark-latest >> > + DISTDIR=/usr/local/spark-latest/dist >> > + SPARK_TACHYON=false >> > + TACHYON_VERSION=0.7.1 >> > + TACHYON_TGZ=tachyon-0.7.1-bin.tar.gz >> > + >> > TACHYON_URL= >> https://github.com/amplab/tachyon/releases/download/v0.7.1/tachyon-0.7.1-bin.tar.gz >> > + MAKE_TGZ=false >> > + NAME=none >> > + MVN=/usr/local/spark-latest/build/mvn >> > + (( 12 )) >> > + case $1 in >> > + NAME=spark-latest >> > + shift >> > + shift >> > + (( 10 )) >> > + case $1 in >> > + MAKE_TGZ=true >> > + shift >> > + (( 9 )) >> > + case $1 in >> > + MVN=mvn >> > + shift >> > + shift >> > + (( 7 )) >> > + case $1 in >> > + break >> > + '[' -z >> /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home ']' >> > + '[' -z >> /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home ']' >> > ++ command -v git >> > + '[' /usr/bin/git ']' >> > ++ git rev-parse --short HEAD >> > + GITREV=487d409 >> > + '[' '!' -z 487d409 ']' >> > + GITREVSTRING=' (git revision 487d409)' >> > + unset GITREV >> > ++ command -v mvn >> > + '[' '!' /usr/bin/mvn ']' >> > ++ mvn help:evaluate -Dexpression=project.version -Dhadoop.version=2.7.0 >> > -Phadoop-2.7 -Phive -Phive-thriftserver -DskipTests clean package >> > ++ grep -v INFO >> > ++ tail -n 1 >> > + VERSION='[ERROR] [Help 1] >> > http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException >> ' >> > >> > Same output error with JDK 7 >> > >> > Appreciate your help. >> > >> > >> > > >
Maven build failed (Spark master)
Hi, I can't seem to get a successful maven build. Please see command output below: bash-3.2$ ./make-distribution.sh --name spark-latest --tgz --mvn mvn -Dhadoop.version=2.7.0 -Phadoop-2.7 -Phive -Phive-thriftserver -DskipTests clean package +++ dirname ./make-distribution.sh ++ cd . ++ pwd + SPARK_HOME=/usr/local/spark-latest + DISTDIR=/usr/local/spark-latest/dist + SPARK_TACHYON=false + TACHYON_VERSION=0.7.1 + TACHYON_TGZ=tachyon-0.7.1-bin.tar.gz + TACHYON_URL= https://github.com/amplab/tachyon/releases/download/v0.7.1/tachyon-0.7.1-bin.tar.gz + MAKE_TGZ=false + NAME=none + MVN=/usr/local/spark-latest/build/mvn + (( 12 )) + case $1 in + NAME=spark-latest + shift + shift + (( 10 )) + case $1 in + MAKE_TGZ=true + shift + (( 9 )) + case $1 in + MVN=mvn + shift + shift + (( 7 )) + case $1 in + break + '[' -z /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home ']' + '[' -z /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home ']' ++ command -v git + '[' /usr/bin/git ']' ++ git rev-parse --short HEAD + GITREV=487d409 + '[' '!' -z 487d409 ']' + GITREVSTRING=' (git revision 487d409)' + unset GITREV ++ command -v mvn + '[' '!' /usr/bin/mvn ']' ++ mvn help:evaluate -Dexpression=project.version -Dhadoop.version=2.7.0 -Phadoop-2.7 -Phive -Phive-thriftserver -DskipTests clean package ++ grep -v INFO ++ tail -n 1 + VERSION='[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException' Same output error with JDK 7 Appreciate your help.
Re: Maven build failed (Spark master)
I saw this when I tested manually (without ./make-distribution) Detected Maven Version: 3.2.2 is not in the allowed range 3.3.3. So I simply upgraded maven to 3.3.3. Resolved. Thanks On Fri, Oct 23, 2015 at 3:17 PM, Sean Owen <so...@cloudera.com> wrote: > This doesn't show the actual error output from Maven. I have a strong > guess that you haven't set MAVEN_OPTS to increase the memory Maven can > use. > > On Fri, Oct 23, 2015 at 6:14 AM, Kayode Odeyemi <drey...@gmail.com> wrote: > > Hi, > > > > I can't seem to get a successful maven build. Please see command output > > below: > > > > bash-3.2$ ./make-distribution.sh --name spark-latest --tgz --mvn mvn > > -Dhadoop.version=2.7.0 -Phadoop-2.7 -Phive -Phive-thriftserver > -DskipTests > > clean package > > +++ dirname ./make-distribution.sh > > ++ cd . > > ++ pwd > > + SPARK_HOME=/usr/local/spark-latest > > + DISTDIR=/usr/local/spark-latest/dist > > + SPARK_TACHYON=false > > + TACHYON_VERSION=0.7.1 > > + TACHYON_TGZ=tachyon-0.7.1-bin.tar.gz > > + > > TACHYON_URL= > https://github.com/amplab/tachyon/releases/download/v0.7.1/tachyon-0.7.1-bin.tar.gz > > + MAKE_TGZ=false > > + NAME=none > > + MVN=/usr/local/spark-latest/build/mvn > > + (( 12 )) > > + case $1 in > > + NAME=spark-latest > > + shift > > + shift > > + (( 10 )) > > + case $1 in > > + MAKE_TGZ=true > > + shift > > + (( 9 )) > > + case $1 in > > + MVN=mvn > > + shift > > + shift > > + (( 7 )) > > + case $1 in > > + break > > + '[' -z /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home > ']' > > + '[' -z /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home > ']' > > ++ command -v git > > + '[' /usr/bin/git ']' > > ++ git rev-parse --short HEAD > > + GITREV=487d409 > > + '[' '!' -z 487d409 ']' > > + GITREVSTRING=' (git revision 487d409)' > > + unset GITREV > > ++ command -v mvn > > + '[' '!' 
/usr/bin/mvn ']' > > ++ mvn help:evaluate -Dexpression=project.version -Dhadoop.version=2.7.0 > > -Phadoop-2.7 -Phive -Phive-thriftserver -DskipTests clean package > > ++ grep -v INFO > > ++ tail -n 1 > > + VERSION='[ERROR] [Help 1] > > http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException' > > > > Same output error with JDK 7 > > > > Appreciate your help. > > > > >
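Regarding Sean's MAVEN_OPTS hint in the thread above: the Spark 1.x build documentation recommended raising Maven's heap and code cache before building. The exact values below are recalled from those docs rather than from this thread, so treat them as a starting point (the MaxPermSize flag only applies on JDK 7 and earlier):

```shell
# Give Maven enough memory before invoking the Spark build.
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
./make-distribution.sh --name spark-latest --tgz \
  -Dhadoop.version=2.6.0 -Phadoop-2.6 -Phive -Phive-thriftserver -Pyarn
```

In this particular case the resolved fix was upgrading Maven to 3.3.3 (the enforcer's allowed range), but an undersized MAVEN_OPTS produces similarly opaque failures, so both are worth checking.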
Re: sqlContext load by offset
When I use that, I get "Caused by: org.postgresql.util.PSQLException: ERROR: column "none" does not exist" On Thu, Oct 22, 2015 at 9:31 PM, Kayode Odeyemi <drey...@gmail.com> wrote: > Hi, > > I've trying to load a postgres table using the following expression: > > val cachedIndex = cache.get("latest_legacy_group_index") > val mappingsDF = sqlContext.load("jdbc", Map( > "url" -> Config.dataSourceUrl(mode, Some("mappings")), > "dbtable" -> s"(select userid, yid, username from legacyusers offset > $cachedIndex ) as legacyusers") > ) > > I'll like to know if this expression is correct: > > "dbtable" -> s"(select userid, yid, username from legacyusers offset > $cachedIndex ) as legacyusers") > > As you can see. I'm trying to load the table records by offset > > I appreciate your help. > > -- Odeyemi 'Kayode O. http://ng.linkedin.com/in/kayodeodeyemi. t: @charyorde
Fwd: sqlContext load by offset
Hi, I'm trying to load a postgres table using the following expression: val cachedIndex = cache.get("latest_legacy_group_index") val mappingsDF = sqlContext.load("jdbc", Map( "url" -> Config.dataSourceUrl(mode, Some("mappings")), "dbtable" -> s"(select userid, yid, username from legacyusers offset $cachedIndex ) as legacyusers") ) I'd like to know if this expression is correct: "dbtable" -> s"(select userid, yid, username from legacyusers offset $cachedIndex ) as legacyusers") As you can see, I'm trying to load the table records by offset. I appreciate your help.
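One caveat worth noting about the expression itself: in PostgreSQL, OFFSET without ORDER BY is applied to an unspecified row order, so paging this way can silently skip or repeat rows between runs. A hedged sketch of the same Spark 1.x load with a deterministic ordering — the names cache, Config, mode, and sqlContext are taken from the message above, and using userid as the sort key is an assumption:

```scala
val cachedIndex = cache.get("latest_legacy_group_index")

// Order before offsetting so each page is deterministic; the offset is
// interpolated into the derived-table subquery as in the original code.
val mappingsDF = sqlContext.load("jdbc", Map(
  "url"     -> Config.dataSourceUrl(mode, Some("mappings")),
  "dbtable" -> s"""(select userid, yid, username
                    from legacyusers
                    order by userid
                    offset $cachedIndex) as legacyusers"""
))
```

If cachedIndex can ever come from untrusted input, prefer validating it as a number before interpolation, since string interpolation into "dbtable" is effectively raw SQL.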