jeff-xu-z opened a new pull request, #358:
URL: https://github.com/apache/incubator-livy/pull/358

   
   ## What changes were proposed in this pull request?
   
   Proposed code fix for 
[https://issues.apache.org/jira/browse/LIVY-896](https://issues.apache.org/jira/browse/LIVY-896).
   
   ## How was this patch tested?
   
   
   ### Unit tests
   
   Two new unit tests are included in the PR. Their output is attached as 
[unit-tests.txt](https://github.com/apache/incubator-livy/files/9793910/unit-tests.txt).
   
   ### System tests
   
   I ran system tests manually on an EMR cluster in AWS to verify the fix.
   
   1. Installed open source Livy 0.7.1 and configured it to run on port 8999.
   2. Uploaded my PySpark program run_sql.py to the cluster's HDFS (see the 
artifact below).
   3. Ran my test_livy.py (see the artifact below) 10 times in a loop. I hit the 
issue in 5 of the 10 runs (Livy reported SessionState.SUCCESS even though 
spark-submit failed). See 
[reproduced.txt](https://github.com/apache/incubator-livy/files/9793894/reproduced.txt)
 for details.
   4. Replaced the livy-server jar with the fixed version.
   5. Ran test_livy.py 100 times in a loop. The issue did not occur again. See 
[fixed.txt](https://github.com/apache/incubator-livy/files/9793943/fixed.txt) 
for details.
   6. Also verified that a Livy session does not have the issue. See 
[livy_session.txt](https://github.com/apache/incubator-livy/files/9793963/livy_session.txt)
 for details.
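
The repeat-and-count procedure in the steps above could be scripted roughly as follows. This is only a sketch, not the script that was actually used: `tally_states`, `false_successes` and `run_batch_once` are hypothetical helper names, and it assumes that every run is expected to fail, so any reported `SUCCESS` counts as a misreport. `run_batch_once` wraps the same `LivyBatch` calls as test_livy.py and reads the enum member's `name`.

```python
from collections import Counter
from typing import Callable


def tally_states(run_once: Callable[[], str], attempts: int) -> Counter:
    """Call run_once `attempts` times and count the final state each call reports."""
    counts: Counter = Counter()
    for _ in range(attempts):
        counts[run_once()] += 1
    return counts


def false_successes(counts: Counter) -> int:
    """Number of runs reported as SUCCESS.

    Assumption: the submitted job always fails, so each SUCCESS is a misreport.
    """
    return counts["SUCCESS"]


def run_batch_once(url: str) -> str:
    """Hypothetical wrapper around the body of test_livy.py."""
    # Imported lazily so the tallying helpers can be used without pylivy installed.
    from livy import LivyBatch

    batch = LivyBatch.create(
        url=url,
        file="/tmp/run_sql.py",
        args=["-s", "select * from abc"],
    )
    batch.wait()
    # batch.state is an enum member such as SessionState.SUCCESS; return its name.
    return batch.state.name
```

With a reachable Livy server, `false_successes(tally_states(lambda: run_batch_once(url), 10))` would give the number of misreported runs out of 10.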
   
   ### Artifact: test_livy.py (the verification program)
   ```
   #!/usr/bin/env python3
   from livy import LivyBatch
   import sys
   
   import logging
   logging.basicConfig(format='%(asctime)s %(levelname)s %(message)s', 
stream=sys.stderr, level=logging.INFO)
   
   
   if __name__ == "__main__":
       batch = LivyBatch.create(
               url="http://ip-100-64-129-199.us-west-2.compute.internal:8999",
               file="/tmp/run_sql.py", args=["-s", "select * from abc"],
               )
       logging.info(f"batch id={batch.batch_id} created ...")
       batch.wait()
       logging.info(f"batch id={batch.batch_id}, state={batch.state}")
   ```
   
   ### Artifact: run_sql.py (the Spark program to run a given SQL)
   
   ```
   from pyspark.sql import SparkSession
   import sys
   import argparse
   
   if __name__ == "__main__":
       parser = argparse.ArgumentParser(
           formatter_class = argparse.ArgumentDefaultsHelpFormatter,
       )
       parser.add_argument("-s", action="store", dest="sql")
       args = parser.parse_args()
   
       spark = SparkSession.builder.\
               appName("PySpark SparkSQL").\
               enableHiveSupport().\
               config("spark.ui.enabled", "false").\
               getOrCreate()
       try:
           spark.sql(args.sql).show()
       finally:
           spark.stop()
   ```
   
   ### Artifact: test_session.py (verify Livy session does not have the issue)
   ```
   #!/usr/bin/env python3
   from livy import LivySession, SessionKind
   import sys
   import logging
   logging.basicConfig(format='%(asctime)s %(levelname)s %(message)s', 
stream=sys.stderr, level=logging.INFO)
   
   
   if __name__ == "__main__":
       sess = LivySession.create(
               url="http://ip-100-64-129-199.us-west-2.compute.internal:8999",
               kind=SessionKind.SQL)
       logging.info(f"session id={sess.session_id} created ...")
       sess.wait()
       logging.info(f"session id={sess.session_id} is ready")
       try:
         sess.download_sql("SELECT * from xxyyzz")
       except Exception as e:
         logging.info(str(e))
   ```
   

