I have a Spark Streaming job and need to do some logging inside the function
passed to `foreachRDD`, but I'm having trouble accessing the logger as a
variable from within that function.

I would like to do the following:

    import logging

    myLogger = logging.getLogger(LOGGER_NAME)
    ...
    ...
    someData = <STREAMING DATA>

    someData.foreachRDD(lambda now, rdds: myLogger.info(<SOMETHING ABOUT RDD>))

The lambda cannot access `myLogger`. I get a giant stack trace; here is a
snippet:


      File "/juicero/press-mgmt/spark-1.5.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 537, in save_reduce
        save(state)
      File "/usr/lib/python2.7/pickle.py", line 286, in save
        f(self, obj) # Call unbound method with explicit self
      File "/usr/lib/python2.7/pickle.py", line 548, in save_tuple
        save(element)
      File "/usr/lib/python2.7/pickle.py", line 286, in save
        f(self, obj) # Call unbound method with explicit self
      File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
        self._batch_setitems(obj.iteritems())
      File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
        save(v)
      File "/usr/lib/python2.7/pickle.py", line 286, in save
        f(self, obj) # Call unbound method with explicit self
      File "/juicero/press-mgmt/spark-1.5.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 315, in save_builtin_function
        return self.save_function(obj)
      File "/juicero/press-mgmt/spark-1.5.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 191, in save_function
        if islambda(obj) or obj.__code__.co_filename == '<stdin>' or themodule is None:
    AttributeError: 'builtin_function_or_method' object has no attribute '__code__'



I don't understand why I can't access `myLogger`. Is it because Spark cannot
serialize the logger object?
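
If serialization is indeed the cause (the trace ends inside `cloudpickle`,
which suggests the lambda's closure is being pickled together with the logger
it references), one thing I could try is looking the logger up by name inside
the function, so that the function refers only to a string. This is a minimal
sketch under that assumption, not a confirmed fix: `log_rdd` and the value
given to `LOGGER_NAME` are hypothetical, and I have not verified it against
Spark 1.5.0.

```python
import logging

# Hypothetical value; the real LOGGER_NAME is not shown in the snippet above.
LOGGER_NAME = "streaming_job"


def log_rdd(now, rdd):
    """Log a short summary of one batch.

    The logger is fetched by name here instead of being captured from the
    enclosing scope, so the function itself holds no Logger object.
    """
    logger = logging.getLogger(LOGGER_NAME)
    logger.info("batch at %s has %d records", now, rdd.count())


# Intended use with the DStream from the question (not executed here):
# someData.foreachRDD(log_rdd)
```

`logging.getLogger` returns the same logger instance for a given name within
one Python process, so handlers and levels configured at startup should still
apply when `log_rdd` runs in that process.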



