Re: Invalid checkpoint url

2015-09-22 Thread Tathagata Das
>>> at org.apache.spark.rdd.RDD.doCheckpoint(RDD.scala:1417)
>>> at org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply(RDD.scala:1417)
>>> at org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply(RDD.scala:1417)
>>> at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>> at org.apache.spark.rdd.RDD.doCheckpoint(RDD.scala:1417)
>>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1468)
>>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1483)
>>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1504)
>>> at com.datastax.spark.connector.streaming.DStreamFunctions$$anonfun$saveToCassandra$1.apply(DStreamFunctions.scala:33)
>>> at com.datastax.spark.connector.streaming.DStreamFunctions$$anonfun$saveToCassandra$1.apply(DStreamFunctions.scala:33)
>>> at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1.apply(DStream.scala:534)
>>> at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1.apply(DStream.scala:534)
>>> at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:42)
>>> at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
>>> at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
>>> at scala.util.Try$.apply(Try.scala:161)
>>> at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
>>> at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:176)
>>> at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176)
>>> at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176)
>>> at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>>> at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:175)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> at java.lang.Thread.run(Thread.java:745)
>>> On Tue, Sep 22, 2015 at 6:49 PM, Adrian Tanase wrote:
>>> [...]


Re: Invalid checkpoint url

2015-09-22 Thread srungarapu vamsi


-- 
/Vamsi


Re: Invalid checkpoint url

2015-09-22 Thread Tathagata Das


Re: Invalid checkpoint url

2015-09-22 Thread srungarapu vamsi



-- 
/Vamsi


Re: Invalid checkpoint url

2015-09-22 Thread Adrian Tanase
Have you tried simply ssc.checkpoint("checkpoint")? This should create it in the local working directory; it has always worked for me when developing in local mode.

For the other paths (/tmp/...), make sure you have the rights to write there.

-adrian
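
A minimal sketch of that suggestion, assuming the Spark Streaming 1.x API (the master, app name, and batch interval below are illustrative placeholders, not values from the thread):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Local-mode setup: a relative checkpoint path is resolved against the
// driver's working directory through the local ("file") filesystem.
val conf = new SparkConf().setMaster("local[2]").setAppName("checkpoint-demo")
val ssc = new StreamingContext(conf, Seconds(10))

// Creates ./checkpoint under the current working directory.
ssc.checkpoint("checkpoint")
```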

From: srungarapu vamsi
Date: Tuesday, September 22, 2015 at 7:59 AM
To: user
Subject: Invalid checkpoint url

[...]



Invalid checkpoint url

2015-09-21 Thread srungarapu vamsi
I am using reduceByKeyAndWindow (with an inverse reduce function) in my code.
To use this, it seems the checkpoint directory I specify should be on a
Hadoop-compatible file system.
Does that mean I should set up Hadoop on my system?
I googled this and found in a Stack Overflow answer that I need not set up
HDFS, but that the checkpoint directory should be HDFS-compatible.

I am a beginner in this area. I am running my Spark Streaming application
on Ubuntu 14.04 with Spark 1.3.1.
If I need not set up HDFS and ext4 is HDFS-compatible, then what should my
checkpoint directory look like?

I tried all of these:
ssc.checkpoint("/tmp/checkpoint")
ssc.checkpoint("hdfs:///tmp/checkpoint")
ssc.checkpoint("file:///tmp/checkpoint")

But none of them worked for me.

-- 
/Vamsi
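

For reference, Spark's checkpointing goes through Hadoop's FileSystem API, which includes a local-filesystem implementation, so an explicit file: URI on ext4 should resolve without any HDFS installation. A sketch against the Spark 1.3 streaming API — the socket source, port, and window durations are placeholders standing in for whatever the real application uses:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setMaster("local[2]").setAppName("window-demo")
val ssc = new StreamingContext(conf, Seconds(5))

// The "file" scheme maps to Hadoop's LocalFileSystem, so a plain ext4
// directory works; no HDFS daemon is needed to resolve this URI.
ssc.checkpoint("file:///tmp/checkpoint")

// reduceByKeyAndWindow with an inverse reduce function is what forces
// checkpointing; `pairs` stands in for any DStream of key/count pairs.
val pairs = ssc.socketTextStream("localhost", 9999).map(w => (w, 1L))
val counts = pairs.reduceByKeyAndWindow(
  (a: Long, b: Long) => a + b,   // reduce new values entering the window
  (a: Long, b: Long) => a - b,   // inverse-reduce values leaving it
  Seconds(60), Seconds(5))
counts.print()
```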