Re: Accessing Web UI

2016-02-22 Thread Kayode Odeyemi
Try http://localhost:4040

On Mon, Feb 22, 2016 at 8:23 AM, Vasanth Bhat  wrote:

> Thanks Gourav, Eduardo
>
> I tried http://localhost:8080 and http://OAhtvJ5MCA:8080/. In both cases
> Firefox just hangs.
>
> I also tried the lynx text-based browser. I get the message "HTTP request
> sent; waiting for response." and it hangs as well.
>
> Is there a way to enable debug logs in the Spark master service, to
> understand what's going wrong?
>
>
> Thanks
> Vasanth
>
>
>
> On Fri, Feb 19, 2016 at 5:46 PM, Gourav Sengupta <
> gourav.sengu...@gmail.com> wrote:
>
>> can you please try localhost:8080?
>>
>> Regards,
>> Gourav Sengupta
>>
>> On Fri, Feb 19, 2016 at 11:18 AM, vasbhat  wrote:
>>
>>> Hi,
>>>
>>> I have installed Spark 1.6 and am trying to start the master
>>> (start-master.sh) and access the web UI.
>>>
>>> I get the following logs when running start-master.sh:
>>>
>>> Spark Command: /usr/jdk/instances/jdk1.8.0/jre/bin/java -cp
>>>
>>> /usr/local/spark-1.6.0-bin-hadoop2.6/conf/:/usr/local/spark-1.6.0-bin-hadoop2.6/lib/spark-assembly-1.6.0-hadoop2.6.0.jar:/usr/local/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/usr/local/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/usr/local/spark-1.6.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar
>>> -Xms4g -Xmx4g org.apache.spark.deploy.master.Master --ip OAhtvJ5MCA
>>> --port
>>> 7077 --webui-port 8080
>>> 
>>> Using Spark's default log4j profile:
>>> org/apache/spark/log4j-defaults.properties
>>> 16/02/19 03:07:30 INFO Master: Registered signal handlers for [TERM, HUP,
>>> INT]
>>> 16/02/19 03:07:30 WARN NativeCodeLoader: Unable to load native-hadoop
>>> library for your platform... using builtin-java classes where applicable
>>> 16/02/19 03:07:31 INFO SecurityManager: Changing view acls to: sluser
>>> 16/02/19 03:07:31 INFO SecurityManager: Changing modify acls to: sluser
>>> 16/02/19 03:07:31 INFO SecurityManager: SecurityManager: authentication
>>> disabled; ui acls disabled; users with view permissions: Set(sluser);
>>> users
>>> with modify permissions: Set(sluser)
>>> 16/02/19 03:07:32 INFO Utils: Successfully started service 'sparkMaster'
>>> on
>>> port 7077.
>>> 16/02/19 03:07:32 INFO Master: Starting Spark master at
>>> spark://OAhtvJ5MCA:7077
>>> 16/02/19 03:07:32 INFO Master: Running Spark version 1.6.0
>>> 16/02/19 03:07:32 WARN AbstractConnector: insufficient threads configured
>>> for SelectChannelConnector@0.0.0.0:8080
>>> 16/02/19 03:07:32 INFO Utils: Successfully started service 'MasterUI' on
>>> port 8080.
>>> 16/02/19 03:07:32 INFO MasterWebUI: Started MasterWebUI at
>>> http://127.0.0.1:8080
>>> 16/02/19 03:07:32 WARN AbstractConnector: insufficient threads configured
>>> for SelectChannelConnector@OAhtvJ5MCA:6066
>>> 16/02/19 03:07:32 INFO Utils: Successfully started service on port 6066.
>>> 16/02/19 03:07:32 INFO StandaloneRestServer: Started REST server for
>>> submitting applications on port 6066
>>> 16/02/19 03:07:33 INFO Master: I have been elected leader! New state:
>>> ALIVE
>>>
>>> --
>>> Through netstat I can see that port 8080 is listening.
>>> Now when I start Firefox and access http://127.0.0.1:8080, Firefox just
>>> hangs with the message
>>>
>>> Waiting for "127.0.0.1", and it does not connect to the UI.
>>>
>>> How do I enable debug logging for the Spark master daemon, to understand
>>> what's happening?
>>>
>>> Thanks
>>> Vasanth
>>>
>>>
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Re-Accessing-Web-UI-tp23029p26276.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org
>>>
>>>
>>
>
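
On the debug-logging question above: one approach that should work with a
stock Spark 1.6 binary distribution is to raise the log4j root level for the
master daemon. The log shows the master is still using Spark's default log4j
profile, so creating conf/log4j.properties overrides it. A minimal sketch;
paths assume the /usr/local/spark-1.6.0-bin-hadoop2.6 layout from the log, so
adjust to your install:

# Create a log4j.properties from the shipped template.
cd /usr/local/spark-1.6.0-bin-hadoop2.6
cp conf/log4j.properties.template conf/log4j.properties
# Edit conf/log4j.properties and change the root logger line from
#   log4j.rootCategory=INFO, console
# to
#   log4j.rootCategory=DEBUG, console
# then restart the master; debug output ends up in the master's .out file under logs/.
./sbin/stop-master.sh
./sbin/start-master.sh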


Re: spark-submit stuck and no output in console

2015-11-17 Thread Kayode Odeyemi
Sonal, SparkPi couldn't run either. It got stuck on the screen with no output:

hadoop-user@yks-hadoop-m01:/usr/local/spark$ ./bin/run-example SparkPi

On Tue, Nov 17, 2015 at 12:22 PM, Steve Loughran 
wrote:

> 48 hours is one of those kerberos warning times (as is 24h, 72h and 7
> days) 


Does this mean I need to restart the whole Hadoop YARN cluster to reset
kerberos?
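
Generally a cluster restart shouldn't be needed just because a ticket has aged
out; renewing or re-obtaining the Kerberos ticket for the user that submits the
jobs is usually enough. A rough sketch (the keytab path and principal below are
made-up placeholders, and whether this applies at all depends on how your
cluster's credentials are managed):

# Inspect the current ticket cache; check the "Expires" / "renew until" times.
klist
# Re-obtain a ticket for the submitting user. Keytab path and principal are
# placeholders; run plain kinit and type the password if you don't use a keytab.
kinit -kt /etc/security/keytabs/hadoop-user.keytab hadoop-user@EXAMPLE.COM
# Verify, then re-run the spark-submit job.
klist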




-- 
Odeyemi 'Kayode O.
http://ng.linkedin.com/in/kayodeodeyemi. t: @charyorde


Re: spark-submit stuck and no output in console

2015-11-17 Thread Kayode Odeyemi
Our Hadoop NFS gateway seems to have been malfunctioning.

I basically restarted it, and Spark jobs have now resumed successfully.

Problem solved.


Re: spark-submit stuck and no output in console

2015-11-17 Thread Kayode Odeyemi
Anyone experienced this issue as well?

On Mon, Nov 16, 2015 at 8:06 PM, Kayode Odeyemi <drey...@gmail.com> wrote:

>
> Or are you saying that the Java process never even starts?
>
>
> Exactly.
>
> Here's what I got back from jstack as expected:
>
> hadoop-user@yks-hadoop-m01:/usr/local/spark/bin$ jstack 31316
> 31316: Unable to open socket file: target process not responding or
> HotSpot VM not loaded
> The -F option can be used when the target process is not responding
> hadoop-user@yks-hadoop-m01:/usr/local/spark/bin$ jstack 31316 -F
> Attaching to core -F from executable 31316, please wait...
> Error attaching to core file: Can't attach to the core file
>
>
>


-- 
Odeyemi 'Kayode O.
http://ng.linkedin.com/in/kayodeodeyemi. t: @charyorde


Re: spark-submit stuck and no output in console

2015-11-17 Thread Kayode Odeyemi
Thanks for the reply Sonal.

I'm on JDK 7 (/usr/lib/jvm/java-7-oracle)

My env is a YARN cluster made up of 7 nodes (6 datanodes/node managers,
1 namenode/resource manager).

The namenode is where I executed the spark-submit job, while on one of
the datanodes I executed 'hadoop fs -put /binstore /user/hadoop-user/' to
dump 1 TB of data into the datanodes. That process is still running
without hassle and is only using 1.3 GB of its 1.7 GB heap space.

Initially, I submitted 2 jobs to the YARN cluster, which ran for 2
days and then suddenly stopped. Nothing in the logs shows the root cause.


On Tue, Nov 17, 2015 at 11:42 AM, Sonal Goyal <sonalgoy...@gmail.com> wrote:

> Could it be jdk related ? Which version are you on?
>
> Best Regards,
> Sonal
> Founder, Nube Technologies <http://www.nubetech.co>
> Reifier at Strata Hadoop World
> <http://strataconf.com/big-data-conference-sg-2015/public/schedule/detail/44606>
> Reifier at Spark Summit 2015
> <https://spark-summit.org/2015/events/real-time-fuzzy-matching-with-spark-and-elastic-search/>
>
> <http://in.linkedin.com/in/sonalgoyal>
>
>
>
> On Tue, Nov 17, 2015 at 2:48 PM, Kayode Odeyemi <drey...@gmail.com> wrote:
>
>> Anyone experienced this issue as well?
>>
>> On Mon, Nov 16, 2015 at 8:06 PM, Kayode Odeyemi <drey...@gmail.com>
>> wrote:
>>
>>>
>>> Or are you saying that the Java process never even starts?
>>>
>>>
>>> Exactly.
>>>
>>> Here's what I got back from jstack as expected:
>>>
>>> hadoop-user@yks-hadoop-m01:/usr/local/spark/bin$ jstack 31316
>>> 31316: Unable to open socket file: target process not responding or
>>> HotSpot VM not loaded
>>> The -F option can be used when the target process is not responding
>>> hadoop-user@yks-hadoop-m01:/usr/local/spark/bin$ jstack 31316 -F
>>> Attaching to core -F from executable 31316, please wait...
>>> Error attaching to core file: Can't attach to the core file
>>>
>>>
>>>
>>
>>
>> --
>> Odeyemi 'Kayode O.
>> http://ng.linkedin.com/in/kayodeodeyemi. t: @charyorde
>>
>
>


-- 
Odeyemi 'Kayode O.
http://ng.linkedin.com/in/kayodeodeyemi. t: @charyorde


Re: spark-submit stuck and no output in console

2015-11-16 Thread Kayode Odeyemi
Spark 1.5.1

The fact is that there's no stack trace. No output from that command at all
to the console.

This is all I get:

hadoop-user@yks-hadoop-m01:/usr/local/spark/bin$ tail -1
/tmp/spark-profile-job.log
nohup: ignoring input
/usr/local/spark/bin/spark-class: line 76: 29516 Killed
 "$RUNNER" -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main "$@"


On Mon, Nov 16, 2015 at 5:22 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Which release of Spark are you using ?
>
> Can you take stack trace and pastebin it ?
>
> Thanks
>
> On Mon, Nov 16, 2015 at 5:50 AM, Kayode Odeyemi <drey...@gmail.com> wrote:
>
>> ./spark-submit --class com.migration.UpdateProfiles --executor-memory 8g
>> ~/migration-profiles-0.1-SNAPSHOT.jar
>>
>> is stuck and outputs nothing to the console.
>>
>> What could be the cause of this? Current max heap size is 1.75g and it's
>> only using 1g.
>>
>>
>


-- 
Odeyemi 'Kayode O.
http://ng.linkedin.com/in/kayodeodeyemi. t: @charyorde


spark-submit stuck and no output in console

2015-11-16 Thread Kayode Odeyemi
./spark-submit --class com.migration.UpdateProfiles --executor-memory 8g
~/migration-profiles-0.1-SNAPSHOT.jar

is stuck and outputs nothing to the console.

What could be the cause of this? Current max heap size is 1.75g and it's
only using 1g.


Re: spark-submit stuck and no output in console

2015-11-16 Thread Kayode Odeyemi
> Or are you saying that the Java process never even starts?


Exactly.

Here's what I got back from jstack as expected:

hadoop-user@yks-hadoop-m01:/usr/local/spark/bin$ jstack 31316
31316: Unable to open socket file: target process not responding or HotSpot
VM not loaded
The -F option can be used when the target process is not responding
hadoop-user@yks-hadoop-m01:/usr/local/spark/bin$ jstack 31316 -F
Attaching to core -F from executable 31316, please wait...
Error attaching to core file: Can't attach to the core file
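
A side note on that jstack invocation: the "Attaching to core -F from
executable 31316" line suggests jstack treated -F as a core-file name because
the flag came after the pid; the force flag has to precede the pid. A small
sketch:

# Force-attach to the unresponsive JVM; the flag goes before the pid.
jstack -F 31316
# If that still fails, confirm the process actually exists and which user owns it.
ps -fp 31316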


Re: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher

2015-11-06 Thread Kayode Odeyemi
Thank you. That seems to resolve it.

On Fri, Nov 6, 2015 at 11:46 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> You mentioned resourcemanager but not nodemanagers.
>
> I think you need to install Spark on nodes running nodemanagers.
>
> Cheers
>
> On Fri, Nov 6, 2015 at 1:32 PM, Kayode Odeyemi <drey...@gmail.com> wrote:
>
>> Hi,
>>
>> I have a YARN Hadoop setup of 8 nodes (7 datanodes, 1 namenode and
>> resourcemanager). I have Spark set up only on the namenode/resource manager.
>>
>> Do I need to have Spark installed on the datanodes?
>>
>> I asked because I'm getting below error when I run a Spark job through
>> spark-submit:
>>
>> Error: Could not find or load main class 
>> org.apache.spark.deploy.yarn.ExecutorLauncher
>>
>> I appreciate your help.
>>
>> Many thanks
>>
>>
>


Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher

2015-11-06 Thread Kayode Odeyemi
Hi,

I have a YARN Hadoop setup of 8 nodes (7 datanodes, 1 namenode and
resourcemanager). I have Spark set up only on the namenode/resource manager.

Do I need to have Spark installed on the datanodes?

I ask because I'm getting the error below when I run a Spark job through
spark-submit:

Error: Could not find or load main class
org.apache.spark.deploy.yarn.ExecutorLauncher

I appreciate your help.

Many thanks


Futures timed out after [120 seconds].

2015-11-04 Thread Kayode Odeyemi
Hi,

I'm running a Spark standalone cluster (1 master, 2 workers).

Everything has failed, including spark-submit, with errors such as "Caused
by: java.lang.ClassNotFoundException: com.migration.App$$anonfun$upsert$1".

Now I've reverted to submitting jobs through Scala apps.

Any ideas on how to solve this error?

15/11/05 01:46:37 INFO util.Utils: Successfully started service
'driverPropsFetcher' on port 58748.
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643)
at 
org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:68)
at 
org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:149)
at 
org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:250)
at 
org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.rpc.RpcTimeoutException: Futures timed out
after [120 seconds]. This timeout is controlled by
spark.rpc.lookupTimeout
at 
org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcEnv.scala:214)
at 
org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:229)
at 
org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:225)
at 
scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:242)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:98)
at 
org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:162)
at 
org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:69)
at 
org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:68)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
... 4 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out
after [120 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at 
scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at 
scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:241)
... 11 more
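
Two things may be worth trying, sketched below with placeholders rather than
as a confirmed fix. The ClassNotFoundException for
com.migration.App$$anonfun$upsert$1 usually means the application jar is not
on the executors' classpath; submitting through spark-submit (which ships the
primary jar to the executors) or listing extra jars with --jars normally
covers that. And since the trace itself says the timeout is controlled by
spark.rpc.lookupTimeout, that setting can be raised with --conf while
investigating:

# Sketch only: class name and jar path are taken from the messages in these
# threads, and the master URL from the standalone setup described elsewhere;
# adjust everything to your environment.
./bin/spark-submit \
  --class com.migration.App \
  --master spark://192.168.2.11:7077 \
  --conf spark.rpc.lookupTimeout=600s \
  --conf spark.executor.memory=5g \
  ~/migration-profiles-0.1-SNAPSHOT.jar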


Re: Executor app-20151104202102-0000 finished with state EXITED

2015-11-04 Thread Kayode Odeyemi
I've tried that once. No job was executed on the workers. That is, the
workers weren't used.

What I want to achieve is to have the SparkContext use a remote Spark
standalone master at 192.168.2.11 (this is where I started the master with
./start-master.sh and all the slaves with ./start-slaves.sh).
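
One detail that stands out in the log further down: the executors try to call
back to the driver at akka.tcp://sparkDriver@172.26.71.5:61094 and time out,
which often means the driver advertised an address the workers cannot reach.
If the driver machine also has an interface on the 192.168.2.x network, pinning
the driver's advertised address to it is one thing to try. A sketch (the
spark.driver.host value is an assumption about your network, and the class and
jar names are reused from the other threads purely as placeholders):

# spark.driver.host below is a guess at the driver's reachable address.
./bin/spark-submit \
  --master spark://192.168.2.11:7077 \
  --conf spark.driver.host=192.168.2.10 \
  --class com.migration.UpdateProfiles \
  ~/migration-profiles-0.1-SNAPSHOT.jar

The same property can also be set from the app with conf.set("spark.driver.host", "...").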

On Wed, Nov 4, 2015 at 9:28 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Something like this:
> conf.setMaster("local[3]")
>
> On Wed, Nov 4, 2015 at 11:08 AM, Kayode Odeyemi <drey...@gmail.com> wrote:
>
>> Thanks Ted.
>>
>> Where would you suggest I add that? I'm creating a SparkContext from a
>> Spark app. My conf setup looks like this:
>>
>> conf.setMaster("spark://192.168.2.11:7077")
>> conf.set("spark.logConf", "true")
>> conf.set("spark.akka.logLifecycleEvents", "true")
>> conf.set("spark.executor.memory", "5g")
>>
>> On Wed, Nov 4, 2015 at 9:04 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>>> Have you tried using -Dspark.master=local ?
>>>
>>> Cheers
>>>
>>> On Wed, Nov 4, 2015 at 10:47 AM, Kayode Odeyemi <drey...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I can't seem to understand why all created executors always fail.
>>>>
>>>> I have a Spark standalone cluster setup make up of 2 workers and 1
>>>> master. My spark-env looks like this:
>>>>
>>>> SPARK_MASTER_IP=192.168.2.11
>>>> SPARK_LOCAL_IP=192.168.2.11
>>>> SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=4"
>>>> SPARK_WORKER_CORES=4
>>>> SPARK_WORKER_MEMORY=6g
>>>>
>>>> From the Spark logs, I get this:
>>>>
>>>> 15/11/04 20:36:35 WARN remote.ReliableDeliverySupervisor: Association with 
>>>> remote system [akka.tcp://sparkDriver@172.26.71.5:61094] has failed, 
>>>> address is now gated for [5000] ms. Reason: [Association failed with 
>>>> [akka.tcp://sparkDriver@172.26.71.5:61094]] Caused by: [Operation timed 
>>>> out: /172.26.71.5:61094]
>>>> Exception in thread "main" akka.actor.ActorNotFound: Actor not found for: 
>>>> ActorSelection[Anchor(akka.tcp://sparkDriver@172.26.71.5:61094/), 
>>>> Path(/user/CoarseGrainedScheduler)]
>>>>at 
>>>> akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:65)
>>>>at 
>>>> akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:63)
>>>>at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
>>>>at 
>>>> akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
>>>>at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:73)
>>>>at 
>>>> akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.unbatchedExecute(Future.scala:74)
>>>>at 
>>>> akka.dispatch.BatchingExecutor$class.execute(BatchingExecutor.scala:120)
>>>>at 
>>>> akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.execute(Future.scala:73)
>>>>at 
>>>> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
>>>>at 
>>>> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
>>>>at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:266)
>>>>at akka.actor.EmptyLocalActorRef.specialHandle(ActorRef.scala:533)
>>>>at akka.actor.DeadLetterActorRef.specialHandle(ActorRef.scala:569)
>>>>at akka.actor.DeadLetterActorRef.$bang(ActorRef.scala:559)
>>>>at 
>>>> akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef.$bang(RemoteActorRefProvider.scala:87)
>>>>at akka.remote.EndpointWriter.postStop(Endpoint.scala:557)
>>>>at akka.actor.Actor$class.aroundPostStop(Actor.scala:477)
>>>>at akka.remote.EndpointActor.aroundPostStop(Endpoint.scala:411)
>>>>at 
>>>> akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>>>>at 
>>>> akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
>>>>at akka.actor.ActorCell.terminate(ActorCell.scala:369)
>>>>at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
>>>>at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>>>>at akka.dispatch.Mailbox.processAllS

Re: Executor app-20151104202102-0000 finished with state EXITED

2015-11-04 Thread Kayode Odeyemi
Thanks Ted.

Where would you suggest I add that? I'm creating a SparkContext from a
Spark app. My conf setup looks like this:

conf.setMaster("spark://192.168.2.11:7077")
conf.set("spark.logConf", "true")
conf.set("spark.akka.logLifecycleEvents", "true")
conf.set("spark.executor.memory", "5g")

On Wed, Nov 4, 2015 at 9:04 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Have you tried using -Dspark.master=local ?
>
> Cheers
>
> On Wed, Nov 4, 2015 at 10:47 AM, Kayode Odeyemi <drey...@gmail.com> wrote:
>
>> Hi,
>>
>> I can't seem to understand why all created executors always fail.
>>
>> I have a Spark standalone cluster setup make up of 2 workers and 1
>> master. My spark-env looks like this:
>>
>> SPARK_MASTER_IP=192.168.2.11
>> SPARK_LOCAL_IP=192.168.2.11
>> SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=4"
>> SPARK_WORKER_CORES=4
>> SPARK_WORKER_MEMORY=6g
>>
>> From the Spark logs, I get this:
>>
>> 15/11/04 20:36:35 WARN remote.ReliableDeliverySupervisor: Association with 
>> remote system [akka.tcp://sparkDriver@172.26.71.5:61094] has failed, address 
>> is now gated for [5000] ms. Reason: [Association failed with 
>> [akka.tcp://sparkDriver@172.26.71.5:61094]] Caused by: [Operation timed out: 
>> /172.26.71.5:61094]
>> Exception in thread "main" akka.actor.ActorNotFound: Actor not found for: 
>> ActorSelection[Anchor(akka.tcp://sparkDriver@172.26.71.5:61094/), 
>> Path(/user/CoarseGrainedScheduler)]
>>  at 
>> akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:65)
>>  at 
>> akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:63)
>>  at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
>>  at 
>> akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
>>  at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:73)
>>  at 
>> akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.unbatchedExecute(Future.scala:74)
>>  at 
>> akka.dispatch.BatchingExecutor$class.execute(BatchingExecutor.scala:120)
>>  at 
>> akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.execute(Future.scala:73)
>>  at 
>> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
>>  at 
>> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
>>  at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:266)
>>  at akka.actor.EmptyLocalActorRef.specialHandle(ActorRef.scala:533)
>>  at akka.actor.DeadLetterActorRef.specialHandle(ActorRef.scala:569)
>>  at akka.actor.DeadLetterActorRef.$bang(ActorRef.scala:559)
>>  at 
>> akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef.$bang(RemoteActorRefProvider.scala:87)
>>  at akka.remote.EndpointWriter.postStop(Endpoint.scala:557)
>>  at akka.actor.Actor$class.aroundPostStop(Actor.scala:477)
>>  at akka.remote.EndpointActor.aroundPostStop(Endpoint.scala:411)
>>  at 
>> akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>>  at 
>> akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
>>  at akka.actor.ActorCell.terminate(ActorCell.scala:369)
>>  at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
>>  at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>>  at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263)
>>  at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>>  at 
>> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
>>  at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>  at 
>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>  at 
>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>  at 
>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>> 15/11/04 20:36:35 INFO actor.LocalActorRef: Message 
>> [akka.remote.EndpointWriter$AckIdleCheckTimer$] from 
>> Actor[akka://driverPropsFetcher/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FsparkDriver%40172.26.71.5%3A61094-0/endpointWriter#-1769599826]
>>  to 
>> Actor[akka://driverPropsFetcher/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FsparkDriver%40172.26.71.5%3A61094-0/endpointWriter#-1769599826]
>>  was not delivered. [1] dead letters encountered. This logging can be turned 
>> off or adjusted with configuration settings 'akka.log-dead-letters' and 
>> 'akka.log-dead-letters-during-shutdown'.
>>
>> I appreciate any kind of help.
>>
>
>


SPARK_SSH_FOREGROUND format

2015-11-04 Thread Kayode Odeyemi
From
http://spark.apache.org/docs/latest/spark-standalone.html#cluster-launch-scripts
:

If you do not have a password-less setup, you can set the environment
> variable SPARK_SSH_FOREGROUND and serially provide a password for each
> worker.
>

What does "serially provide a password for each worker" mean?

Can it be something like this (assuming I have 3 workers)?
SPARK_SSH_FOREGROUND=worker1pass,worker2pass,worker3pass
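
For what it's worth, reading sbin/slaves.sh suggests SPARK_SSH_FOREGROUND is a
flag rather than a list of passwords: when it is set to any non-empty value,
the launch scripts run ssh to each worker in the foreground, one worker after
another, so you get prompted for each worker's password in turn and type it
interactively. A sketch of that reading (treat the exact semantics as my
interpretation of the script, not something the docs spell out):

# Any non-empty value appears to enable foreground (serial) ssh.
export SPARK_SSH_FOREGROUND=yes
./sbin/start-slaves.sh
# Expect to be prompted serially, roughly:
#   worker1's password:
#   worker2's password:
#   worker3's password: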


Executor app-20151104202102-0000 finished with state EXITED

2015-11-04 Thread Kayode Odeyemi
Hi,

I can't seem to understand why all created executors always fail.

I have a Spark standalone cluster made up of 2 workers and 1 master.
My spark-env looks like this:

SPARK_MASTER_IP=192.168.2.11
SPARK_LOCAL_IP=192.168.2.11
SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=4"
SPARK_WORKER_CORES=4
SPARK_WORKER_MEMORY=6g

From the Spark logs, I get this:

15/11/04 20:36:35 WARN remote.ReliableDeliverySupervisor: Association
with remote system [akka.tcp://sparkDriver@172.26.71.5:61094] has
failed, address is now gated for [5000] ms. Reason: [Association
failed with [akka.tcp://sparkDriver@172.26.71.5:61094]] Caused by:
[Operation timed out: /172.26.71.5:61094]
Exception in thread "main" akka.actor.ActorNotFound: Actor not found
for: ActorSelection[Anchor(akka.tcp://sparkDriver@172.26.71.5:61094/),
Path(/user/CoarseGrainedScheduler)]
at 
akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:65)
at 
akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:63)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at 
akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:73)
at 
akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.unbatchedExecute(Future.scala:74)
at 
akka.dispatch.BatchingExecutor$class.execute(BatchingExecutor.scala:120)
at 
akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.execute(Future.scala:73)
at 
scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
at 
scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:266)
at akka.actor.EmptyLocalActorRef.specialHandle(ActorRef.scala:533)
at akka.actor.DeadLetterActorRef.specialHandle(ActorRef.scala:569)
at akka.actor.DeadLetterActorRef.$bang(ActorRef.scala:559)
at 
akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef.$bang(RemoteActorRefProvider.scala:87)
at akka.remote.EndpointWriter.postStop(Endpoint.scala:557)
at akka.actor.Actor$class.aroundPostStop(Actor.scala:477)
at akka.remote.EndpointActor.aroundPostStop(Endpoint.scala:411)
at 
akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
at 
akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
at akka.actor.ActorCell.terminate(ActorCell.scala:369)
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
15/11/04 20:36:35 INFO actor.LocalActorRef: Message
[akka.remote.EndpointWriter$AckIdleCheckTimer$] from
Actor[akka://driverPropsFetcher/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FsparkDriver%40172.26.71.5%3A61094-0/endpointWriter#-1769599826]
to 
Actor[akka://driverPropsFetcher/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FsparkDriver%40172.26.71.5%3A61094-0/endpointWriter#-1769599826]
was not delivered. [1] dead letters encountered. This logging can be
turned off or adjusted with configuration settings
'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

I appreciate any kind of help.


Re: Maven build failed (Spark master)

2015-10-27 Thread Kayode Odeyemi
Thank you.

But I'm getting the same warnings and they're still preventing the archive
from being generated.

I've run this on both OS X Lion and Ubuntu 12. Same error. No .gz file.

On Mon, Oct 26, 2015 at 9:10 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Looks like '-Pyarn' was missing in your command.
>
> On Mon, Oct 26, 2015 at 12:06 PM, Kayode Odeyemi <drey...@gmail.com>
> wrote:
>
>> I used this command which is synonymous to what you have:
>>
>> ./make-distribution.sh --name spark-latest --tgz --mvn mvn
>> -Dhadoop.version=2.6.0 -Phadoop-2.6 -Phive -Phive-thriftserver -DskipTests
>> clean package -U
>>
>> But I still see WARNINGS like this in the output and no .gz file created:
>>
>> cp: /usr/local/spark-latest/spark-[WARNING] See
>> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=26/.part-r-5.gz.parquet.crc:
>> No such file or directory
>> cp: /usr/local/spark-latest/spark-[WARNING] See
>> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=26/part-r-5.gz.parquet:
>> No such file or directory
>> cp: /usr/local/spark-latest/spark-[WARNING] See
>> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9:
>> No such file or directory
>> cp:
>> /usr/local/spark-latest/dist/python/test_support/sql/parquet_partitioned/year=2015/month=9:
>> unable to copy extended attributes to
>> /usr/local/spark-latest/spark-[WARNING] See
>> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9:
>> No such file or directory
>> cp: /usr/local/spark-latest/spark-[WARNING] See
>> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1:
>> No such file or directory
>> cp:
>> /usr/local/spark-latest/dist/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1:
>> unable to copy extended attributes to
>> /usr/local/spark-latest/spark-[WARNING] See
>> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1:
>> No such file or directory
>> cp: /usr/local/spark-latest/spark-[WARNING] See
>> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1/.part-r-7.gz.parquet.crc:
>> No such file or directory
>>
>> On Mon, Oct 26, 2015 at 8:58 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>>> If you use the command shown in:
>>> https://github.com/apache/spark/pull/9281
>>>
>>> You should have got the following:
>>>
>>>
>>> ./dist/python/test_support/sql/parquet_partitioned/year=2014/month=9/day=1/part-r-8.gz.parquet
>>>
>>> ./dist/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1/part-r-7.gz.parquet
>>>
>>> ./dist/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=25/part-r-4.gz.parquet
>>>
>>> ./dist/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=25/part-r-2.gz.parquet
>>>
>>> ./dist/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=26/part-r-5.gz.parquet
>>>
>>> On Mon, Oct 26, 2015 at 11:47 AM, Kayode Odeyemi <drey...@gmail.com>
>>> wrote:
>>>
>>>> I see a lot of stuffs like this after the a successful maven build:
>>>>
>>>> cp: /usr/local/spark-latest/spark-[WARNING] See
>>>> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2014/month=9/day=1/
>>>> part-r-00008.gz.parquet: No such file or directory
>>>>
>>>> Seems it fails when it tries to package the build as an archive.
>>>>
>>>> I'm using the latest code on github master.
>>>>
>>>> Any ideas please?
>>>>
>>>> On Mon, Oct 26, 2015 at 6:20 PM, Yana Kadiyska <yana.kadiy...@gmail.com
>>>> > wrote:
>>>>
>>>>> In 1.4 ./make_distribution produces a .tgz file in the root directory
>>>>> (same directory that make_distribution is in)
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Oct 26, 2015 at 8:46 AM, Kayode Odeyemi <drey...@gmail.com>
>&g

Re: Maven build failed (Spark master)

2015-10-27 Thread Kayode Odeyemi
Thanks gents.

Removal of 'clean package -U' made the difference.

On Tue, Oct 27, 2015 at 6:39 PM, Todd Nist <tsind...@gmail.com> wrote:

> I issued the same basic command and it worked fine.
>
> RADTech-MBP:spark $ ./make-distribution.sh --name hadoop-2.6 --tgz -Pyarn
> -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver -DskipTests
>
> Which created: spark-1.6.0-SNAPSHOT-bin-hadoop-2.6.tgz in the root
> directory of the project.
>
> FWIW, the environment was an MBP with OS X 10.10.5 and Java:
>
> java version "1.8.0_51"
> Java(TM) SE Runtime Environment (build 1.8.0_51-b16)
> Java HotSpot(TM) 64-Bit Server VM (build 25.51-b03, mixed mode)
>
> -Todd
>
> On Tue, Oct 27, 2015 at 12:17 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> I used the following command:
>> make-distribution.sh --name custom-spark --tgz -Phadoop-2.4 -Phive
>> -Phive-thriftserver -Pyarn
>>
>> spark-1.6.0-SNAPSHOT-bin-custom-spark.tgz was generated (with patch from
>> SPARK-11348)
>>
>> Can you try above command ?
>>
>> Thanks
>>
>> On Tue, Oct 27, 2015 at 7:03 AM, Kayode Odeyemi <drey...@gmail.com>
>> wrote:
>>
>>> Ted, I switched to this:
>>>
>>> ./make-distribution.sh --name spark-latest --tgz -Dhadoop.version=2.6.0
>>> -Phadoop-2.6 -Phive -Phive-thriftserver -Pyarn -DskipTests clean package -U
>>>
>>> Same error. No .gz file. Here's the bottom output log:
>>>
>>> + rm -rf /home/emperor/javaprojects/spark/dist
>>> + mkdir -p /home/emperor/javaprojects/spark/dist/lib
>>> + echo 'Spark [WARNING] See
>>> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin (git revision
>>> 3689beb) built for Hadoop [WARNING] See
>>> http://docs.codehaus.org/display/MAVENUSER/Shade+Pl
>>> + echo 'Build flags: -Dhadoop.version=2.6.0' -Phadoop-2.6 -Phive
>>> -Phive-thriftserver -Pyarn -DskipTests clean package -U
>>> + cp
>>> /home/emperor/javaprojects/spark/assembly/target/scala-2.10/spark-assembly-1.6.0-SNAPSHOT-hadoop2.6.0.jar
>>> /home/emperor/javaprojects/spark/dist/lib/
>>> + cp
>>> /home/emperor/javaprojects/spark/examples/target/scala-2.10/spark-examples-1.6.0-SNAPSHOT-hadoop2.6.0.jar
>>> /home/emperor/javaprojects/spark/dist/lib/
>>> + cp
>>> /home/emperor/javaprojects/spark/network/yarn/target/scala-2.10/spark-1.6.0-SNAPSHOT-yarn-shuffle.jar
>>> /home/emperor/javaprojects/spark/dist/lib/
>>> + mkdir -p /home/emperor/javaprojects/spark/dist/examples/src/main
>>> + cp -r /home/emperor/javaprojects/spark/examples/src/main
>>> /home/emperor/javaprojects/spark/dist/examples/src/
>>> + '[' 1 == 1 ']'
>>> + cp
>>> /home/emperor/javaprojects/spark/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar
>>> /home/emperor/javaprojects/spark/lib_managed/jars/datanucleus-core-3.2.10.jar
>>> /home/emperor/javaprojects
>>> ed/jars/datanucleus-rdbms-3.2.9.jar
>>> /home/emperor/javaprojects/spark/dist/lib/
>>> + cp /home/emperor/javaprojects/spark/LICENSE
>>> /home/emperor/javaprojects/spark/dist
>>> + cp -r /home/emperor/javaprojects/spark/licenses
>>> /home/emperor/javaprojects/spark/dist
>>> + cp /home/emperor/javaprojects/spark/NOTICE
>>> /home/emperor/javaprojects/spark/dist
>>> + '[' -e /home/emperor/javaprojects/spark/CHANGES.txt ']'
>>> + cp -r /home/emperor/javaprojects/spark/data
>>> /home/emperor/javaprojects/spark/dist
>>> + mkdir /home/emperor/javaprojects/spark/dist/conf
>>> + cp /home/emperor/javaprojects/spark/conf/docker.properties.template
>>> /home/emperor/javaprojects/spark/conf/fairscheduler.xml.template
>>> /home/emperor/javaprojects/spark/conf/log4j.properties
>>> emperor/javaprojects/spark/conf/metrics.properties.template
>>> /home/emperor/javaprojects/spark/conf/slaves.template
>>> /home/emperor/javaprojects/spark/conf/spark-defaults.conf.template /home/em
>>> ts/spark/conf/spark-env.sh.template
>>> /home/emperor/javaprojects/spark/dist/conf
>>> + cp /home/emperor/javaprojects/spark/README.md
>>> /home/emperor/javaprojects/spark/dist
>>> + cp -r /home/emperor/javaprojects/spark/bin
>>> /home/emperor/javaprojects/spark/dist
>>> + cp -r /home/emperor/javaprojects/spark/python
>>> /home/emperor/javaprojects/spark/dist
>>> + cp -r /home/emperor/javaprojects/spark/sbin
>>> /home/emperor/javaprojects/spark/dist
>>> + cp -r /home/emperor/javaprojects/spark/ec2
>>> /home/emperor/javaprojects/spark/dist
>>> + '[' -d

Re: Maven build failed (Spark master)

2015-10-27 Thread Kayode Odeyemi
It seems the build and directory structure in dist is similar to what's in
the .gz file downloaded from the downloads page. Can the dist directory be
used as is?

On Tue, Oct 27, 2015 at 4:03 PM, Kayode Odeyemi <drey...@gmail.com> wrote:

> Ted, I switched to this:
>
> ./make-distribution.sh --name spark-latest --tgz -Dhadoop.version=2.6.0
> -Phadoop-2.6 -Phive -Phive-thriftserver -Pyarn -DskipTests clean package -U
>
> Same error. No .gz file. Here's the bottom output log:
>
> + rm -rf /home/emperor/javaprojects/spark/dist
> + mkdir -p /home/emperor/javaprojects/spark/dist/lib
> + echo 'Spark [WARNING] See
> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin (git revision
> 3689beb) built for Hadoop [WARNING] See
> http://docs.codehaus.org/display/MAVENUSER/Shade+Pl
> + echo 'Build flags: -Dhadoop.version=2.6.0' -Phadoop-2.6 -Phive
> -Phive-thriftserver -Pyarn -DskipTests clean package -U
> + cp
> /home/emperor/javaprojects/spark/assembly/target/scala-2.10/spark-assembly-1.6.0-SNAPSHOT-hadoop2.6.0.jar
> /home/emperor/javaprojects/spark/dist/lib/
> + cp
> /home/emperor/javaprojects/spark/examples/target/scala-2.10/spark-examples-1.6.0-SNAPSHOT-hadoop2.6.0.jar
> /home/emperor/javaprojects/spark/dist/lib/
> + cp
> /home/emperor/javaprojects/spark/network/yarn/target/scala-2.10/spark-1.6.0-SNAPSHOT-yarn-shuffle.jar
> /home/emperor/javaprojects/spark/dist/lib/
> + mkdir -p /home/emperor/javaprojects/spark/dist/examples/src/main
> + cp -r /home/emperor/javaprojects/spark/examples/src/main
> /home/emperor/javaprojects/spark/dist/examples/src/
> + '[' 1 == 1 ']'
> + cp
> /home/emperor/javaprojects/spark/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar
> /home/emperor/javaprojects/spark/lib_managed/jars/datanucleus-core-3.2.10.jar
> /home/emperor/javaprojects
> ed/jars/datanucleus-rdbms-3.2.9.jar
> /home/emperor/javaprojects/spark/dist/lib/
> + cp /home/emperor/javaprojects/spark/LICENSE
> /home/emperor/javaprojects/spark/dist
> + cp -r /home/emperor/javaprojects/spark/licenses
> /home/emperor/javaprojects/spark/dist
> + cp /home/emperor/javaprojects/spark/NOTICE
> /home/emperor/javaprojects/spark/dist
> + '[' -e /home/emperor/javaprojects/spark/CHANGES.txt ']'
> + cp -r /home/emperor/javaprojects/spark/data
> /home/emperor/javaprojects/spark/dist
> + mkdir /home/emperor/javaprojects/spark/dist/conf
> + cp /home/emperor/javaprojects/spark/conf/docker.properties.template
> /home/emperor/javaprojects/spark/conf/fairscheduler.xml.template
> /home/emperor/javaprojects/spark/conf/log4j.properties
> emperor/javaprojects/spark/conf/metrics.properties.template
> /home/emperor/javaprojects/spark/conf/slaves.template
> /home/emperor/javaprojects/spark/conf/spark-defaults.conf.template /home/em
> ts/spark/conf/spark-env.sh.template
> /home/emperor/javaprojects/spark/dist/conf
> + cp /home/emperor/javaprojects/spark/README.md
> /home/emperor/javaprojects/spark/dist
> + cp -r /home/emperor/javaprojects/spark/bin
> /home/emperor/javaprojects/spark/dist
> + cp -r /home/emperor/javaprojects/spark/python
> /home/emperor/javaprojects/spark/dist
> + cp -r /home/emperor/javaprojects/spark/sbin
> /home/emperor/javaprojects/spark/dist
> + cp -r /home/emperor/javaprojects/spark/ec2
> /home/emperor/javaprojects/spark/dist
> + '[' -d /home/emperor/javaprojects/spark/R/lib/SparkR ']'
> + '[' false == true ']'
> + '[' true == true ']'
> + TARDIR_NAME='spark-[WARNING] See
> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest'
> + TARDIR='/home/emperor/javaprojects/spark/spark-[WARNING] See
> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest'
> + rm -rf '/home/emperor/javaprojects/spark/spark-[WARNING] See
> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest'
> + cp -r /home/emperor/javaprojects/spark/dist
> '/home/emperor/javaprojects/spark/spark-[WARNING] See
> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest'
> cp: cannot create directory
> `/home/emperor/javaprojects/spark/spark-[WARNING] See
> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest':
> No such file or directory
>
>
> On Tue, Oct 27, 2015 at 2:14 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Can you try the same command shown in the pull request ?
>>
>> Thanks
>>
>> On Oct 27, 2015, at 12:40 AM, Kayode Odeyemi <drey...@gmail.com> wrote:
>>
>> Thank you.
>>
>> But I'm getting same warnings and it's still preventing the archive from
>> being generated.
>>
>> I've ran this on both OSX Lion and Ubuntu 12. Same error. No .gz file
>>
>> On Mon, Oct 26, 2015 at 9:10 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>>> Looks

Re: Maven build failed (Spark master)

2015-10-27 Thread Kayode Odeyemi
Ted, I switched to this:

./make-distribution.sh --name spark-latest --tgz -Dhadoop.version=2.6.0
-Phadoop-2.6 -Phive -Phive-thriftserver -Pyarn -DskipTests clean package -U

Same error. No .gz file. Here's the bottom of the output log:

+ rm -rf /home/emperor/javaprojects/spark/dist
+ mkdir -p /home/emperor/javaprojects/spark/dist/lib
+ echo 'Spark [WARNING] See
http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin (git revision
3689beb) built for Hadoop [WARNING] See
http://docs.codehaus.org/display/MAVENUSER/Shade+Pl
+ echo 'Build flags: -Dhadoop.version=2.6.0' -Phadoop-2.6 -Phive
-Phive-thriftserver -Pyarn -DskipTests clean package -U
+ cp
/home/emperor/javaprojects/spark/assembly/target/scala-2.10/spark-assembly-1.6.0-SNAPSHOT-hadoop2.6.0.jar
/home/emperor/javaprojects/spark/dist/lib/
+ cp
/home/emperor/javaprojects/spark/examples/target/scala-2.10/spark-examples-1.6.0-SNAPSHOT-hadoop2.6.0.jar
/home/emperor/javaprojects/spark/dist/lib/
+ cp
/home/emperor/javaprojects/spark/network/yarn/target/scala-2.10/spark-1.6.0-SNAPSHOT-yarn-shuffle.jar
/home/emperor/javaprojects/spark/dist/lib/
+ mkdir -p /home/emperor/javaprojects/spark/dist/examples/src/main
+ cp -r /home/emperor/javaprojects/spark/examples/src/main
/home/emperor/javaprojects/spark/dist/examples/src/
+ '[' 1 == 1 ']'
+ cp
/home/emperor/javaprojects/spark/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar
/home/emperor/javaprojects/spark/lib_managed/jars/datanucleus-core-3.2.10.jar
/home/emperor/javaprojects
ed/jars/datanucleus-rdbms-3.2.9.jar
/home/emperor/javaprojects/spark/dist/lib/
+ cp /home/emperor/javaprojects/spark/LICENSE
/home/emperor/javaprojects/spark/dist
+ cp -r /home/emperor/javaprojects/spark/licenses
/home/emperor/javaprojects/spark/dist
+ cp /home/emperor/javaprojects/spark/NOTICE
/home/emperor/javaprojects/spark/dist
+ '[' -e /home/emperor/javaprojects/spark/CHANGES.txt ']'
+ cp -r /home/emperor/javaprojects/spark/data
/home/emperor/javaprojects/spark/dist
+ mkdir /home/emperor/javaprojects/spark/dist/conf
+ cp /home/emperor/javaprojects/spark/conf/docker.properties.template
/home/emperor/javaprojects/spark/conf/fairscheduler.xml.template
/home/emperor/javaprojects/spark/conf/log4j.properties
emperor/javaprojects/spark/conf/metrics.properties.template
/home/emperor/javaprojects/spark/conf/slaves.template
/home/emperor/javaprojects/spark/conf/spark-defaults.conf.template /home/em
ts/spark/conf/spark-env.sh.template
/home/emperor/javaprojects/spark/dist/conf
+ cp /home/emperor/javaprojects/spark/README.md
/home/emperor/javaprojects/spark/dist
+ cp -r /home/emperor/javaprojects/spark/bin
/home/emperor/javaprojects/spark/dist
+ cp -r /home/emperor/javaprojects/spark/python
/home/emperor/javaprojects/spark/dist
+ cp -r /home/emperor/javaprojects/spark/sbin
/home/emperor/javaprojects/spark/dist
+ cp -r /home/emperor/javaprojects/spark/ec2
/home/emperor/javaprojects/spark/dist
+ '[' -d /home/emperor/javaprojects/spark/R/lib/SparkR ']'
+ '[' false == true ']'
+ '[' true == true ']'
+ TARDIR_NAME='spark-[WARNING] See
http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest'
+ TARDIR='/home/emperor/javaprojects/spark/spark-[WARNING] See
http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest'
+ rm -rf '/home/emperor/javaprojects/spark/spark-[WARNING] See
http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest'
+ cp -r /home/emperor/javaprojects/spark/dist
'/home/emperor/javaprojects/spark/spark-[WARNING] See
http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest'
cp: cannot create directory
`/home/emperor/javaprojects/spark/spark-[WARNING] See
http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest':
No such file or directory


On Tue, Oct 27, 2015 at 2:14 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Can you try the same command shown in the pull request ?
>
> Thanks
>
> On Oct 27, 2015, at 12:40 AM, Kayode Odeyemi <drey...@gmail.com> wrote:
>
> Thank you.
>
> But I'm getting same warnings and it's still preventing the archive from
> being generated.
>
> I've ran this on both OSX Lion and Ubuntu 12. Same error. No .gz file
>
> On Mon, Oct 26, 2015 at 9:10 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Looks like '-Pyarn' was missing in your command.
>>
>> On Mon, Oct 26, 2015 at 12:06 PM, Kayode Odeyemi <drey...@gmail.com>
>> wrote:
>>
>>> I used this command which is synonymous to what you have:
>>>
>>> ./make-distribution.sh --name spark-latest --tgz --mvn mvn
>>> -Dhadoop.version=2.6.0 -Phadoop-2.6 -Phive -Phive-thriftserver -DskipTests
>>> clean package -U
>>>
>>> But I still see WARNINGS like this in the output and no .gz file created:
>>>
>>> cp: /usr/local/spark-latest/spark-[WARNING] See
>>> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/p

Re: Maven build failed (Spark master)

2015-10-26 Thread Kayode Odeyemi
I used this command, which is equivalent to what you have:

./make-distribution.sh --name spark-latest --tgz --mvn mvn
-Dhadoop.version=2.6.0 -Phadoop-2.6 -Phive -Phive-thriftserver -DskipTests
clean package -U

But I still see WARNINGS like this in the output and no .gz file created:

cp: /usr/local/spark-latest/spark-[WARNING] See
http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=26/.part-r-5.gz.parquet.crc:
No such file or directory
cp: /usr/local/spark-latest/spark-[WARNING] See
http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=26/part-r-5.gz.parquet:
No such file or directory
cp: /usr/local/spark-latest/spark-[WARNING] See
http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9:
No such file or directory
cp:
/usr/local/spark-latest/dist/python/test_support/sql/parquet_partitioned/year=2015/month=9:
unable to copy extended attributes to
/usr/local/spark-latest/spark-[WARNING] See
http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9:
No such file or directory
cp: /usr/local/spark-latest/spark-[WARNING] See
http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1:
No such file or directory
cp:
/usr/local/spark-latest/dist/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1:
unable to copy extended attributes to
/usr/local/spark-latest/spark-[WARNING] See
http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1:
No such file or directory
cp: /usr/local/spark-latest/spark-[WARNING] See
http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1/.part-r-7.gz.parquet.crc:
No such file or directory

On Mon, Oct 26, 2015 at 8:58 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> If you use the command shown in:
> https://github.com/apache/spark/pull/9281
>
> You should have got the following:
>
>
> ./dist/python/test_support/sql/parquet_partitioned/year=2014/month=9/day=1/part-r-8.gz.parquet
>
> ./dist/python/test_support/sql/parquet_partitioned/year=2015/month=9/day=1/part-r-7.gz.parquet
>
> ./dist/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=25/part-r-4.gz.parquet
>
> ./dist/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=25/part-r-2.gz.parquet
>
> ./dist/python/test_support/sql/parquet_partitioned/year=2015/month=10/day=26/part-r-5.gz.parquet
>
> On Mon, Oct 26, 2015 at 11:47 AM, Kayode Odeyemi <drey...@gmail.com>
> wrote:
>
>> I see a lot of stuffs like this after the a successful maven build:
>>
>> cp: /usr/local/spark-latest/spark-[WARNING] See
>> http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2014/month=9/day=1/
>> part-r-8.gz.parquet: No such file or directory
>>
>> Seems it fails when it tries to package the build as an archive.
>>
>> I'm using the latest code on github master.
>>
>> Any ideas please?
>>
>> On Mon, Oct 26, 2015 at 6:20 PM, Yana Kadiyska <yana.kadiy...@gmail.com>
>> wrote:
>>
>>> In 1.4 ./make_distribution produces a .tgz file in the root directory
>>> (same directory that make_distribution is in)
>>>
>>>
>>>
>>> On Mon, Oct 26, 2015 at 8:46 AM, Kayode Odeyemi <drey...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> The ./make_distribution task completed. However, I can't seem to locate
>>>> the
>>>> .tar.gz file.
>>>>
>>>> Where does Spark save this? or should I just work with the dist
>>>> directory?
>>>>
>>>> On Fri, Oct 23, 2015 at 4:23 PM, Kayode Odeyemi <drey...@gmail.com>
>>>> wrote:
>>>>
>>>>> I saw this when I tested manually (without ./make-distribution)
>>>>>
>>>>> Detected Maven Version: 3.2.2 is not in the allowed range 3.3.3.
>>>>>
>>>>> So I simply upgraded maven to 3.3.3.
>>>>>
>>>>> Resolved. Thanks
>>>>>
>>>>> On Fri, Oct 23, 2015 at 3:17 PM, Sean Owen <so...@cloudera.com> wrote:
>>>>>
>>>>>> This doesn't show the actual error output from Maven. I have a strong
>

Re: Maven build failed (Spark master)

2015-10-26 Thread Kayode Odeyemi
I see a lot of output like this after an otherwise successful Maven build:

cp: /usr/local/spark-latest/spark-[WARNING] See
http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin-bin-spark-latest/python/test_support/sql/parquet_partitioned/year=2014/month=9/day=1/
part-r-8.gz.parquet: No such file or directory

It seems to fail when it tries to package the build as an archive.

I'm using the latest code from GitHub master.

Any ideas please?

On Mon, Oct 26, 2015 at 6:20 PM, Yana Kadiyska <yana.kadiy...@gmail.com>
wrote:

> In 1.4 ./make_distribution produces a .tgz file in the root directory
> (same directory that make_distribution is in)
>
>
>
> On Mon, Oct 26, 2015 at 8:46 AM, Kayode Odeyemi <drey...@gmail.com> wrote:
>
>> Hi,
>>
>> The ./make_distribution task completed. However, I can't seem to locate
>> the
>> .tar.gz file.
>>
>> Where does Spark save this? or should I just work with the dist directory?
>>
>> On Fri, Oct 23, 2015 at 4:23 PM, Kayode Odeyemi <drey...@gmail.com>
>> wrote:
>>
>>> I saw this when I tested manually (without ./make-distribution)
>>>
>>> Detected Maven Version: 3.2.2 is not in the allowed range 3.3.3.
>>>
>>> So I simply upgraded maven to 3.3.3.
>>>
>>> Resolved. Thanks
>>>
>>> On Fri, Oct 23, 2015 at 3:17 PM, Sean Owen <so...@cloudera.com> wrote:
>>>
>>>> This doesn't show the actual error output from Maven. I have a strong
>>>> guess that you haven't set MAVEN_OPTS to increase the memory Maven can
>>>> use.
>>>>
>>>> On Fri, Oct 23, 2015 at 6:14 AM, Kayode Odeyemi <drey...@gmail.com>
>>>> wrote:
>>>> > Hi,
>>>> >
>>>> > I can't seem to get a successful maven build. Please see command
>>>> output
>>>> > below:
>>>> >
>>>> > bash-3.2$ ./make-distribution.sh --name spark-latest --tgz --mvn mvn
>>>> > -Dhadoop.version=2.7.0 -Phadoop-2.7 -Phive -Phive-thriftserver
>>>> -DskipTests
>>>> > clean package
>>>> > +++ dirname ./make-distribution.sh
>>>> > ++ cd .
>>>> > ++ pwd
>>>> > + SPARK_HOME=/usr/local/spark-latest
>>>> > + DISTDIR=/usr/local/spark-latest/dist
>>>> > + SPARK_TACHYON=false
>>>> > + TACHYON_VERSION=0.7.1
>>>> > + TACHYON_TGZ=tachyon-0.7.1-bin.tar.gz
>>>> > +
>>>> > TACHYON_URL=
>>>> https://github.com/amplab/tachyon/releases/download/v0.7.1/tachyon-0.7.1-bin.tar.gz
>>>> > + MAKE_TGZ=false
>>>> > + NAME=none
>>>> > + MVN=/usr/local/spark-latest/build/mvn
>>>> > + ((  12  ))
>>>> > + case $1 in
>>>> > + NAME=spark-latest
>>>> > + shift
>>>> > + shift
>>>> > + ((  10  ))
>>>> > + case $1 in
>>>> > + MAKE_TGZ=true
>>>> > + shift
>>>> > + ((  9  ))
>>>> > + case $1 in
>>>> > + MVN=mvn
>>>> > + shift
>>>> > + shift
>>>> > + ((  7  ))
>>>> > + case $1 in
>>>> > + break
>>>> > + '[' -z
>>>> /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home ']'
>>>> > + '[' -z
>>>> /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home ']'
>>>> > ++ command -v git
>>>> > + '[' /usr/bin/git ']'
>>>> > ++ git rev-parse --short HEAD
>>>> > + GITREV=487d409
>>>> > + '[' '!' -z 487d409 ']'
>>>> > + GITREVSTRING=' (git revision 487d409)'
>>>> > + unset GITREV
>>>> > ++ command -v mvn
>>>> > + '[' '!' /usr/bin/mvn ']'
>>>> > ++ mvn help:evaluate -Dexpression=project.version
>>>> -Dhadoop.version=2.7.0
>>>> > -Phadoop-2.7 -Phive -Phive-thriftserver -DskipTests clean package
>>>> > ++ grep -v INFO
>>>> > ++ tail -n 1
>>>> > + VERSION='[ERROR] [Help 1]
>>>> >
>>>> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
>>>> '
>>>> >
>>>> > Same output error with JDK 7
>>>> >
>>>> > Appreciate your help.
>>>> >
>>>> >
>>>>
>>>
>>>
>>>
>>
>


Loading binary files from NFS share

2015-10-26 Thread Kayode Odeyemi
Hi,

Is it possible to load binary files from an NFS share like this:

sc.binaryFiles("nfs://host/mountpath")

I understand that it takes a path, but I want to know whether it accepts a
protocol/scheme like nfs://.

Appreciate your help.
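
As far as I know there is no nfs:// scheme registered with Hadoop's FileSystem
API out of the box, so passing a protocol like that is unlikely to work. The
usual approach is to mount the share at the same path on the driver and on
every worker, then point binaryFiles at a plain file:// path. A sketch with
made-up host, export and mount-point names:

# Run on the driver AND on every worker node, using the same mount point.
sudo mkdir -p /mnt/nfsdata
sudo mount -t nfs nfs-host:/export/data /mnt/nfsdata
# Then load through the local filesystem scheme, for example from spark-shell:
echo 'println(sc.binaryFiles("file:///mnt/nfsdata").count())' | ./bin/spark-shell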


Re: Maven build failed (Spark master)

2015-10-26 Thread Kayode Odeyemi
Hi,

The ./make-distribution.sh run completed. However, I can't seem to locate
the .tar.gz file.

Where does Spark save this, or should I just work with the dist directory?

On Fri, Oct 23, 2015 at 4:23 PM, Kayode Odeyemi <drey...@gmail.com> wrote:

> I saw this when I tested manually (without ./make-distribution)
>
> Detected Maven Version: 3.2.2 is not in the allowed range 3.3.3.
>
> So I simply upgraded maven to 3.3.3.
>
> Resolved. Thanks
>
> On Fri, Oct 23, 2015 at 3:17 PM, Sean Owen <so...@cloudera.com> wrote:
>
>> This doesn't show the actual error output from Maven. I have a strong
>> guess that you haven't set MAVEN_OPTS to increase the memory Maven can
>> use.
>>
>> On Fri, Oct 23, 2015 at 6:14 AM, Kayode Odeyemi <drey...@gmail.com>
>> wrote:
>> > Hi,
>> >
>> > I can't seem to get a successful maven build. Please see command output
>> > below:
>> >
>> > bash-3.2$ ./make-distribution.sh --name spark-latest --tgz --mvn mvn
>> > -Dhadoop.version=2.7.0 -Phadoop-2.7 -Phive -Phive-thriftserver
>> -DskipTests
>> > clean package
>> > +++ dirname ./make-distribution.sh
>> > ++ cd .
>> > ++ pwd
>> > + SPARK_HOME=/usr/local/spark-latest
>> > + DISTDIR=/usr/local/spark-latest/dist
>> > + SPARK_TACHYON=false
>> > + TACHYON_VERSION=0.7.1
>> > + TACHYON_TGZ=tachyon-0.7.1-bin.tar.gz
>> > +
>> > TACHYON_URL=
>> https://github.com/amplab/tachyon/releases/download/v0.7.1/tachyon-0.7.1-bin.tar.gz
>> > + MAKE_TGZ=false
>> > + NAME=none
>> > + MVN=/usr/local/spark-latest/build/mvn
>> > + ((  12  ))
>> > + case $1 in
>> > + NAME=spark-latest
>> > + shift
>> > + shift
>> > + ((  10  ))
>> > + case $1 in
>> > + MAKE_TGZ=true
>> > + shift
>> > + ((  9  ))
>> > + case $1 in
>> > + MVN=mvn
>> > + shift
>> > + shift
>> > + ((  7  ))
>> > + case $1 in
>> > + break
>> > + '[' -z
>> /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home ']'
>> > + '[' -z
>> /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home ']'
>> > ++ command -v git
>> > + '[' /usr/bin/git ']'
>> > ++ git rev-parse --short HEAD
>> > + GITREV=487d409
>> > + '[' '!' -z 487d409 ']'
>> > + GITREVSTRING=' (git revision 487d409)'
>> > + unset GITREV
>> > ++ command -v mvn
>> > + '[' '!' /usr/bin/mvn ']'
>> > ++ mvn help:evaluate -Dexpression=project.version -Dhadoop.version=2.7.0
>> > -Phadoop-2.7 -Phive -Phive-thriftserver -DskipTests clean package
>> > ++ grep -v INFO
>> > ++ tail -n 1
>> > + VERSION='[ERROR] [Help 1]
>> > http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
>> '
>> >
>> > Same output error with JDK 7
>> >
>> > Appreciate your help.
>> >
>> >
>>
>
>
>


Maven build failed (Spark master)

2015-10-23 Thread Kayode Odeyemi
Hi,

I can't seem to get a successful Maven build. Please see the command output
below:

bash-3.2$ ./make-distribution.sh --name spark-latest --tgz --mvn mvn
-Dhadoop.version=2.7.0 -Phadoop-2.7 -Phive -Phive-thriftserver -DskipTests
clean package
+++ dirname ./make-distribution.sh
++ cd .
++ pwd
+ SPARK_HOME=/usr/local/spark-latest
+ DISTDIR=/usr/local/spark-latest/dist
+ SPARK_TACHYON=false
+ TACHYON_VERSION=0.7.1
+ TACHYON_TGZ=tachyon-0.7.1-bin.tar.gz
+ TACHYON_URL=
https://github.com/amplab/tachyon/releases/download/v0.7.1/tachyon-0.7.1-bin.tar.gz
+ MAKE_TGZ=false
+ NAME=none
+ MVN=/usr/local/spark-latest/build/mvn
+ ((  12  ))
+ case $1 in
+ NAME=spark-latest
+ shift
+ shift
+ ((  10  ))
+ case $1 in
+ MAKE_TGZ=true
+ shift
+ ((  9  ))
+ case $1 in
+ MVN=mvn
+ shift
+ shift
+ ((  7  ))
+ case $1 in
+ break
+ '[' -z /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home ']'
+ '[' -z /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home ']'
++ command -v git
+ '[' /usr/bin/git ']'
++ git rev-parse --short HEAD
+ GITREV=487d409
+ '[' '!' -z 487d409 ']'
+ GITREVSTRING=' (git revision 487d409)'
+ unset GITREV
++ command -v mvn
+ '[' '!' /usr/bin/mvn ']'
++ mvn help:evaluate -Dexpression=project.version -Dhadoop.version=2.7.0
-Phadoop-2.7 -Phive -Phive-thriftserver -DskipTests clean package
++ grep -v INFO
++ tail -n 1
+ VERSION='[ERROR] [Help 1]
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException'

Same output error with JDK 7

Appreciate your help.
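
For what it's worth, Sean's reply in the follow-up thread points at MAVEN_OPTS;
the Spark 1.x build documentation suggests giving Maven more memory before
building, roughly like this (values are the documented suggestion for Java 7,
adjust as needed):

# Give Maven enough heap/permgen for the Spark build before invoking it.
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
./make-distribution.sh --name spark-latest --tgz --mvn mvn \
  -Dhadoop.version=2.7.0 -Phadoop-2.7 -Phive -Phive-thriftserver -DskipTests clean package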


Re: Maven build failed (Spark master)

2015-10-23 Thread Kayode Odeyemi
I saw this when I tested manually (without ./make-distribution)

Detected Maven Version: 3.2.2 is not in the allowed range 3.3.3.

So I simply upgraded maven to 3.3.3.

Resolved. Thanks

On Fri, Oct 23, 2015 at 3:17 PM, Sean Owen <so...@cloudera.com> wrote:

> This doesn't show the actual error output from Maven. I have a strong
> guess that you haven't set MAVEN_OPTS to increase the memory Maven can
> use.
>
> On Fri, Oct 23, 2015 at 6:14 AM, Kayode Odeyemi <drey...@gmail.com> wrote:
> > Hi,
> >
> > I can't seem to get a successful maven build. Please see command output
> > below:
> >
> > bash-3.2$ ./make-distribution.sh --name spark-latest --tgz --mvn mvn
> > -Dhadoop.version=2.7.0 -Phadoop-2.7 -Phive -Phive-thriftserver
> -DskipTests
> > clean package
> > +++ dirname ./make-distribution.sh
> > ++ cd .
> > ++ pwd
> > + SPARK_HOME=/usr/local/spark-latest
> > + DISTDIR=/usr/local/spark-latest/dist
> > + SPARK_TACHYON=false
> > + TACHYON_VERSION=0.7.1
> > + TACHYON_TGZ=tachyon-0.7.1-bin.tar.gz
> > +
> > TACHYON_URL=
> https://github.com/amplab/tachyon/releases/download/v0.7.1/tachyon-0.7.1-bin.tar.gz
> > + MAKE_TGZ=false
> > + NAME=none
> > + MVN=/usr/local/spark-latest/build/mvn
> > + ((  12  ))
> > + case $1 in
> > + NAME=spark-latest
> > + shift
> > + shift
> > + ((  10  ))
> > + case $1 in
> > + MAKE_TGZ=true
> > + shift
> > + ((  9  ))
> > + case $1 in
> > + MVN=mvn
> > + shift
> > + shift
> > + ((  7  ))
> > + case $1 in
> > + break
> > + '[' -z /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home
> ']'
> > + '[' -z /Library/Java/JavaVirtualMachines/jdk1.8.0_20.jdk/Contents/Home
> ']'
> > ++ command -v git
> > + '[' /usr/bin/git ']'
> > ++ git rev-parse --short HEAD
> > + GITREV=487d409
> > + '[' '!' -z 487d409 ']'
> > + GITREVSTRING=' (git revision 487d409)'
> > + unset GITREV
> > ++ command -v mvn
> > + '[' '!' /usr/bin/mvn ']'
> > ++ mvn help:evaluate -Dexpression=project.version -Dhadoop.version=2.7.0
> > -Phadoop-2.7 -Phive -Phive-thriftserver -DskipTests clean package
> > ++ grep -v INFO
> > ++ tail -n 1
> > + VERSION='[ERROR] [Help 1]
> > http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException'
> >
> > Same output error with JDK 7
> >
> > Appreciate your help.
> >
> >
>


Re: sqlContext load by offset

2015-10-23 Thread Kayode Odeyemi
When I use that, I get "Caused by: org.postgresql.util.PSQLException:
ERROR: column "none" does not exist".
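
That error reads as if the interpolated $cachedIndex value was not a number,
so Postgres parses the word after OFFSET as a column name. One quick sanity
check is to run the generated subquery directly with psql, substituting a
literal number for the cache value (host, database and user below are
placeholders):

# If this succeeds with a literal offset but the Spark job fails with
# 'column "none" does not exist', the cache lookup is likely returning
# something non-numeric.
psql -h db-host -U dbuser -d mappings \
  -c "select count(*) from (select userid, yid, username from legacyusers offset 100) as legacyusers;"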

On Thu, Oct 22, 2015 at 9:31 PM, Kayode Odeyemi <drey...@gmail.com> wrote:

> Hi,
>
> I've trying to load a postgres table using the following expression:
>
> val cachedIndex = cache.get("latest_legacy_group_index")
> val mappingsDF = sqlContext.load("jdbc", Map(
>   "url" -> Config.dataSourceUrl(mode, Some("mappings")),
>   "dbtable" -> s"(select userid, yid, username from legacyusers offset
> $cachedIndex ) as legacyusers")
> )
>
> I'll like to know if this expression is correct:
>
> "dbtable" -> s"(select userid, yid, username from legacyusers offset
> $cachedIndex ) as legacyusers")
>
> As you can see. I'm trying to load the table records by offset
>
> I appreciate your help.
>
>


-- 
Odeyemi 'Kayode O.
http://ng.linkedin.com/in/kayodeodeyemi. t: @charyorde


Fwd: sqlContext load by offset

2015-10-22 Thread Kayode Odeyemi
Hi,

I'm trying to load a Postgres table using the following expression:

val cachedIndex = cache.get("latest_legacy_group_index")
val mappingsDF = sqlContext.load("jdbc", Map(
  "url" -> Config.dataSourceUrl(mode, Some("mappings")),
  "dbtable" -> s"(select userid, yid, username from legacyusers offset
$cachedIndex ) as legacyusers")
)

I'd like to know whether this expression is correct:

"dbtable" -> s"(select userid, yid, username from legacyusers offset
$cachedIndex ) as legacyusers")

As you can see, I'm trying to load the table records by offset.

I appreciate your help.