I have a Spark Streaming job and I need to do some logging in the `foreachRDD` function, but I'm having trouble accessing the logger as a variable inside that function.
I would like to do the following:

```python
import logging

myLogger = logging.getLogger(LOGGER_NAME)
...
someData = <STREAMING DATA>
someData.foreachRDD(lambda now, rdds: myLogger.info(<SOMETHING ABOUT RDD>))
```

Inside the lambda, it cannot access `myLogger`. I get a giant stack trace - here is a snippet:

```
File "/juicero/press-mgmt/spark-1.5.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 537, in save_reduce
    save(state)
File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj)  # Call unbound method with explicit self
File "/usr/lib/python2.7/pickle.py", line 548, in save_tuple
    save(element)
File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj)  # Call unbound method with explicit self
File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
    save(v)
File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj)  # Call unbound method with explicit self
File "/juicero/press-mgmt/spark-1.5.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 315, in save_builtin_function
    return self.save_function(obj)
File "/juicero/press-mgmt/spark-1.5.0-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 191, in save_function
    if islambda(obj) or obj.__code__.co_filename == '<stdin>' or themodule is None:
AttributeError: 'builtin_function_or_method' object has no attribute '__code__'
```

I don't understand why I can't access `myLogger`. Does it have something to do with Spark not being able to serialize the logger object?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Get-variable-into-Spark-s-foreachRDD-function-tp24852.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
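For reference, the error comes from Spark trying to pickle the closure, and the captured `Logger` object (with its handlers) is what trips up cloudpickle. A common workaround is to capture only the logger *name* (a plain string) and call `logging.getLogger` lazily inside the function that `foreachRDD` executes. This is a minimal sketch, assuming a hypothetical logger name `"my_app"` and message format; the `FakeRDD` in the usage note stands in for a real RDD so the sketch runs without Spark:

```python
import logging

def make_batch_handler(logger_name):
    """Build a function suitable for foreachRDD. The closure captures only
    the logger_name string, which pickles cleanly; the Logger object itself
    is looked up at call time, never serialized."""
    def handle(time, rdd):
        logger = logging.getLogger(logger_name)  # resolved lazily on each call
        logger.info("batch %s: %d records", time, rdd.count())
    return handle

# With Spark (assumed, not run here):
#   someData.foreachRDD(make_batch_handler("my_app"))
```

As a quick sanity check outside Spark, any object with a `count()` method can play the role of the RDD:

```python
class FakeRDD:
    def count(self):
        return 3

make_batch_handler("my_app")("2015-10-01 12:00:00", FakeRDD())
# logs: "batch 2015-10-01 12:00:00: 3 records"
```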