Hi Aleksander,

What version of Airflow are you using?

As mentioned in the ticket, the problem was as follows. In the old situation the first buffer (STDOUT) was consumed first, and the second buffer (STDERR) was only consumed after STDOUT was exhausted. This was problematic because the STDERR buffer kept filling up in the meantime, and once it reached its maximum size the child process would block. Therefore we now pipe STDERR into STDOUT and consume only STDOUT.
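In case it helps, here is a minimal sketch of the idea in plain subprocess terms (not the actual hook code; the command and the print-based logging are placeholders):

    import subprocess

    # Placeholder for the real spark-sql invocation.
    cmd = ["spark-sql", "-e", "SELECT 1"]

    # Redirect STDERR into STDOUT so there is only one pipe to drain.
    # With two separate pipes, reading STDOUT to exhaustion first means
    # a chatty STDERR can fill its OS buffer and block the child process.
    sp = subprocess.Popen(cmd,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT)

    # Consume the merged stream line by line. Note that the sentinel must
    # be the empty byte string b'', since readline() returns bytes here;
    # a str sentinel '' would never compare equal, and the loop would keep
    # yielding b'' forever after EOF.
    for line in iter(sp.stdout.readline, b''):
        print(line.decode('utf-8', errors='replace').rstrip())

    returncode = sp.wait()

That last point about the sentinel may also be relevant to the empty byte strings you are seeing, but without the logs I can only guess.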
Does this make sense? Please share the logs of the SparkSqlOperator job.

Cheers,
Fokko

2018-03-07 9:57 GMT+01:00 Aleksander Sabitov <[email protected]>:

> Hi Fokko!
> Maybe you can advise something. I have a small issue while using the
> Spark SQL operator in Airflow. Most probably it's connected to the fact
> that I use Docker to run Airflow. The issue is that when a Spark SQL job
> fails or succeeds (regardless), the logging loop keeps continuously
> consuming empty byte strings. Before the last commits to the Spark SQL
> hook, it was reading byte strings and decoding them on the fly:
> https://github.com/apache/incubator-airflow/commit/32750601ad0a422283613bf7fccff8eb5407bc9c#diff-16c0ecc7c4b60bfe6e66592bb70e17cf
> I think my issue can somehow be connected to both the Docker usage and
> the byte strings.
> Any ideas?
>
> Thanks in advance!
>
> Aleksandr
