Chao Gao created LIVY-774:
-----------------------------

             Summary: Logging does not print to stdout or stderr correctly on PySpark through Livy
                 Key: LIVY-774
                 URL: https://issues.apache.org/jira/browse/LIVY-774
             Project: Livy
          Issue Type: Bug
          Components: API
    Affects Versions: 0.7.0
            Reporter: Chao Gao
         Attachments: JupyterNotebook_use_Livy_bug.png, LinuxCurl_use_Livy_error.png, Works_on_PySpark_CLI.png, Zeppelin_use_Livy_bug.png
h2. Summary

When using PySpark through Livy (from Zeppelin, a Jupyter Notebook, or a Linux curl command), the first statement that prints a log message to stdout or stderr succeeds. The second and every later attempt fails with the error stack:

{color:#FF0000}{{ValueError: I/O operation on closed file}}{color}

Using the PySpark CLI directly on the master node works fine; see the attachment +_Works_on_PySpark_CLI.png_+.

h2. Reproduce Steps

In Zeppelin, with Livy as the interpreter:

{code:python}
%pyspark
import sys
import logging

# OUTPUT
# Spark Application Id: application_1591899500515_0002
{code}

The first time we print a log message to stdout or stderr, it works as expected:

{code:python}
%pyspark
logger = logging.getLogger("log_example")
logger.setLevel(logging.ERROR)
ch = logging.StreamHandler(sys.stderr)
ch.setLevel(logging.ERROR)
logger.addHandler(ch)
logger.error("test error!")

# OUTPUT (expected)
# test error!
{code}

The second time and afterwards, printing a log message fails with the error stack:

{code:python}
%pyspark
logger.error("test error again!")

# OUTPUT (error stack)
--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib64/python3.7/logging/__init__.py", line 1028, in emit
    stream.write(msg + self.terminator)
  File "/tmp/1262710270598062870", line 534, in write
    super(UnicodeDecodingStringIO, self).write(s)
ValueError: I/O operation on closed file
Call stack:
  File "/tmp/1262710270598062870", line 714, in <module>
    sys.exit(main())
  File "/tmp/1262710270598062870", line 686, in main
    response = handler(content)
  File "/tmp/1262710270598062870", line 318, in execute_request
    result = node.execute()
  File "/tmp/1262710270598062870", line 229, in execute
    exec(code, global_dict)
  File "<stdin>", line 1, in <module>
Message: 'test error again!'
{code}

Jupyter Notebook and the Linux curl command fail the same way. You can check the attachments:
 +_1. Zeppelin_use_Livy_bug.png_+
 +_2. JupyterNotebook_use_Livy_bug.png_+
 +_3. LinuxCurl_use_Livy_error.png_+


--
This message was sent by Atlassian Jira
(v8.3.4#803005)
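The traceback suggests the handler caches the stream object ({{sys.stderr}} as it existed when the handler was created), and that object is later closed. As a sketch of a possible workaround, not something from the report ({{LazyStderrHandler}} is a hypothetical name, not part of Livy), one can resolve {{sys.stderr}} at emit time instead of caching it, so the handler keeps working even if the interpreter closes or replaces the wrapped stream between statements:

{code:python}
import logging
import sys

class LazyStderrHandler(logging.StreamHandler):
    """Hypothetical workaround: look up sys.stderr on every emit
    instead of caching the stream object at construction time, so a
    stream closed/replaced between statements does not break logging."""

    def __init__(self):
        super().__init__(stream=sys.stderr)

    @property
    def stream(self):
        # Resolved fresh each time StreamHandler.emit()/flush() touches it.
        return sys.stderr

    @stream.setter
    def stream(self, value):
        # Ignore the stream cached by StreamHandler.__init__/setStream().
        pass

# Used exactly like the StreamHandler in the reproduce steps:
logger = logging.getLogger("log_example")
logger.setLevel(logging.ERROR)
ch = LazyStderrHandler()
ch.setLevel(logging.ERROR)
logger.addHandler(ch)
logger.error("test error!")
{code}

Because the property delegates every access to the current {{sys.stderr}}, repeated {{logger.error(...)}} calls no longer touch the stale, closed stream.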