Re: using spark context in map function Task not serializable error

2016-01-20 Thread Giri P
method1 looks like this:

reRDD.map(row => method1(row, sc)).saveAsTextFile(outputDir)

reRDD holds userIds.

def method1(userId: String, sc: SparkContext): String = {
  sc.cassandraTable("Keyspace", "Table2").where("userid = ?", userId)
  // ...do something
  "Test"
}

On Wed, Jan 20, 2016 at 11:00 AM, Shixiong(Ryan) Zhu <
shixi...@databricks.com> wrote:

> You should not use SparkContext or RDD directly in your closures.
>
> Could you show the codes of "method1"? Maybe you only needs join or
> something else. E.g.,
>
> val cassandraRDD = sc.cassandraTable("keySpace", "tableName")
> reRDD.join(cassandraRDD).map().saveAsTextFile(outputDir)
>
>


Re: using spark context in map function Task not serializable error

2016-01-20 Thread Shixiong(Ryan) Zhu
You should not use SparkContext or RDD directly in your closures.

Could you show the code of "method1"? Maybe you only need a join or
something else, e.g.:

val cassandraRDD = sc.cassandraTable("keySpace", "tableName")
reRDD.join(cassandraRDD).map().saveAsTextFile(outputDir)
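
A sketch of what that join could look like with the Spark Cassandra connector's joinWithCassandraTable (a connector-provided bulk join; the keyspace, table, and column names here are illustrative, and it assumes "userid" is the partition key of the table):

```scala
// Sketch only: assumes the spark-cassandra-connector is on the classpath
// and that "userid" is the partition key of table2. No SparkContext is
// captured inside any closure -- the lookup happens at the RDD level.
import com.datastax.spark.connector._

val results = reRDD
  .map(userId => Tuple1(userId))                // key each element by userid
  .joinWithCassandraTable("keyspace", "table2") // fetch the matching rows
  .map { case (Tuple1(userId), row) => s"$userId\t$row" }

results.saveAsTextFile(outputDir)
```

Compared with calling sc.cassandraTable(...).where(...) once per element, this issues the Cassandra reads from the executors in bulk and never needs to serialize the driver-side context.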


On Tue, Jan 19, 2016 at 4:12 AM, Ricardo Paiva <ricardo.pa...@corp.globo.com
> wrote:

> Did you try SparkContext.getOrCreate() ?
>
> You don't need to pass the sparkContext to the map function, you can
> retrieve it from the SparkContext singleton.
>
> Regards,
>
> Ricardo
>
>


Re: using spark context in map function Task not serializable error

2016-01-19 Thread Ricardo Paiva
Did you try SparkContext.getOrCreate()?

You don't need to pass the SparkContext to the map function; you can
retrieve it from the SparkContext singleton.

Regards,

Ricardo


On Mon, Jan 18, 2016 at 6:29 PM, gpatcham [via Apache Spark User List] <
ml-node+s1001560n25998...@n3.nabble.com> wrote:

> Hi,
>
> I have a use case where I need to pass sparkcontext in map function
>
> reRDD.map(row =>method1(row,sc)).saveAsTextFile(outputDir)
>
> Method1 needs spark context to query cassandra. But I see below error
>
> java.io.NotSerializableException: org.apache.spark.SparkContext
>
> Is there a way we can fix this ?
>
> Thanks
>



-- 
Ricardo Paiva
Big Data
*globo.com* <http://www.globo.com>




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/using-spark-context-in-map-funciton-TASk-not-serilizable-error-tp25998p26006.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

using spark context in map function Task not serializable error

2016-01-18 Thread gpatcham
Hi,

I have a use case where I need to pass the SparkContext into a map function:

reRDD.map(row => method1(row, sc)).saveAsTextFile(outputDir)

method1 needs the Spark context to query Cassandra, but I see the error below:

java.io.NotSerializableException: org.apache.spark.SparkContext

Is there a way to fix this?

Thanks



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/using-spark-context-in-map-funciton-TASk-not-serilizable-error-tp25998.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: using spark context in map function Task not serializable error

2016-01-18 Thread Ted Yu
Can you pass the properties which are needed for accessing Cassandra
without going through SparkContext ?

SparkContext isn't designed to be used in the way illustrated below.

Cheers

On Mon, Jan 18, 2016 at 12:29 PM, gpatcham <gpatc...@gmail.com> wrote:

> Hi,
>
> I have a use case where I need to pass sparkcontext in map function
>
> reRDD.map(row =>method1(row,sc)).saveAsTextFile(outputDir)
>
> Method1 needs spark context to query cassandra. But I see below error
>
> java.io.NotSerializableException: org.apache.spark.SparkContext
>
> Is there a way we can fix this ?
>
> Thanks
>
>
>
>
>


Re: using spark context in map function Task not serializable error

2016-01-18 Thread Giri P
I'm using the Spark Cassandra connector to do this, and the way we access a
Cassandra table is

sc.cassandraTable("keySpace", "tableName")

Thanks
Giri

On Mon, Jan 18, 2016 at 12:37 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Can you pass the properties which are needed for accessing Cassandra
> without going through SparkContext ?
>
> SparkContext isn't designed to be used in the way illustrated below.
>
> Cheers
>


Re: using spark context in map function Task not serializable error

2016-01-18 Thread Giri P
Can we use @transient?
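
For reference, @transient only tells Java serialization to skip the field, so it would suppress the NotSerializableException but leave the reference null after deserialization on a worker. A minimal, self-contained sketch (plain Scala, no Spark involved):

```scala
import java.io._

// A class holding a non-serializable object (Thread) behind @transient.
class Holder(@transient val heavy: Thread, val name: String) extends Serializable

val out = new ByteArrayOutputStream()
// Serialization succeeds even though Thread is not serializable,
// because the transient field is simply skipped.
new ObjectOutputStream(out).writeObject(new Holder(new Thread(), "demo"))

val in   = new ObjectInputStream(new ByteArrayInputStream(out.toByteArray))
val copy = in.readObject().asInstanceOf[Holder]
// copy.name survives, but copy.heavy comes back null -- which is what
// would happen to a @transient SparkContext reference on an executor.
```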


On Mon, Jan 18, 2016 at 12:44 PM, Giri P <gpatc...@gmail.com> wrote:

> I'm using spark cassandra connector to do this and the way we access
> cassandra table is
>
> sc.cassandraTable("keySpace", "tableName")
>
> Thanks
> Giri
>


Re: using spark context in map function Task not serializable error

2016-01-18 Thread Ted Yu
Did you mean constructing a SparkContext on the worker nodes?

Not sure whether that would work; it doesn't seem to be good practice.

On Mon, Jan 18, 2016 at 1:27 PM, Giri P <gpatc...@gmail.com> wrote:

> Can we use @transient ?
>
>


Re: using spark context in map function Task not serializable error

2016-01-18 Thread Giri P
Yes, I tried doing that, but it doesn't work.

I'm looking at using SQLContext and DataFrames. Is SQLContext serializable?

On Mon, Jan 18, 2016 at 1:29 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Did you mean constructing SparkContext on the worker nodes ?
>
> Not sure whether that would work.
>
> Doesn't seem to be good practice.
>


Re: using spark context in map function Task not serializable error

2016-01-18 Thread Ted Yu
class SQLContext private[sql](
    @transient val sparkContext: SparkContext,
    @transient protected[sql] val cacheManager: CacheManager,
    @transient private[sql] val listener: SQLListener,
    val isRootContext: Boolean)
  extends org.apache.spark.Logging with Serializable {

FYI
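
So SQLContext is Serializable precisely because its SparkContext and other driver-side machinery are marked @transient. A hedged sketch of the DataFrame route, using the connector's DataFrame source (the format string, keyspace, table, and column names are assumptions, not taken from the thread):

```scala
// Sketch only: assumes spark-cassandra-connector with DataFrame support
// and that table2 has a "userid" column.
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)

// Read the Cassandra table once on the driver as a DataFrame...
val table2 = sqlContext.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "keyspace", "table" -> "table2"))
  .load()

// ...and join instead of querying per element inside a closure.
val reDF = sqlContext.createDataFrame(reRDD.map(Tuple1(_))).toDF("userid")
reDF.join(table2, "userid").rdd.map(_.mkString("\t")).saveAsTextFile(outputDir)
```

As with the RDD-level join, no context object is referenced inside any closure, so nothing non-serializable gets captured.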

On Mon, Jan 18, 2016 at 1:44 PM, Giri P <gpatc...@gmail.com> wrote:

> yes I tried doing that but that doesn't work.
>
> I'm looking at using SQLContext and dataframes. Is SQLCOntext serializable?
>