Hi Boris,

Interesting. Multiple queries are supported by the spark-sql operator, so this should work using Airflow.

Executing SQL from a file:
Fokkos-MBP:~ fokkodriesprong$ spark-sql --driver-java-options "-Dlog4j.configuration=file:///tmp/log4j.properties" -f query.sql
1
Time taken: 1.976 seconds, Fetched 1 row(s)
1
Time taken: 0.034 seconds, Fetched 1 row(s)

Executing SQL from the command line:

Fokkos-MBP:~ fokkodriesprong$ spark-sql --driver-java-options "-Dlog4j.configuration=file:///tmp/log4j.properties" -e "SELECT 1; SELECT 1;"
1
Time taken: 1.947 seconds, Fetched 1 row(s)
1
Time taken: 0.032 seconds, Fetched 1 row(s)

Can you share the exception that you are seeing? What version of Spark are you using?

Cheers, Fokko

2017-10-11 18:01 GMT+02:00 Boris Tyukin <[email protected]>:

> Hi guys,
>
> I tried spark_sql_hook to run a multi-statement query (two queries separated
> by a semicolon) and it hangs forever. If I comment out the second query,
> it runs fine.
>
> Has anyone had the same issue? I do not see anything in the code preventing
> more than one statement.
>
> sql = """
> select * from .... ;
> select * from .... ;
> """
>
> spark = SparkSqlHook(sql, conn_id='spark_default', master='yarn',
> num_executors=4)
> spark.run_query()
>
> Boris
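
One way to narrow down which statement hangs is to run the statements one at a time. Below is a minimal sketch of that idea, not a fix for the hook itself. The helper name split_statements is hypothetical, and it only handles semicolons inside single- or double-quoted strings; SQL comments and escaped quotes are not handled. The commented-out hook call simply mirrors the arguments from the original message.

```python
def split_statements(sql):
    """Split SQL text on semicolons that sit outside quoted strings.

    Hypothetical helper for debugging; it does not understand SQL
    comments or escaped quotes.
    """
    statements, current, quote = [], [], None
    for ch in sql:
        if quote:
            # Inside a quoted string: copy characters until the closing quote.
            current.append(ch)
            if ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
            current.append(ch)
        elif ch == ";":
            # Statement boundary: keep the statement if it is not empty.
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
    # Keep a trailing statement that has no terminating semicolon.
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


if __name__ == "__main__":
    for stmt in split_statements("SELECT 1; SELECT 1;"):
        print(stmt)
        # Assumed usage, with the same arguments as in the original message:
        # SparkSqlHook(stmt, conn_id='spark_default', master='yarn',
        #              num_executors=4).run_query()
```

If each statement completes on its own, the problem is more likely in how the combined query is passed to spark-sql than in the queries themselves.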
