Hi Aleksander,

What version of Airflow are you using?

As mentioned in the ticket, the problem was as follows. In the old situation the first buffer (STDOUT) was consumed first, and the second buffer (STDERR) was only consumed after STDOUT was exhausted. This was problematic because the STDERR buffer kept filling up in the meantime, and once it reached its maximum size the child process would block. Therefore we now pipe STDERR into STDOUT and consume only STDOUT.
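In case it helps, here is a minimal sketch of the idea in plain subprocess terms (not the actual hook code; the command and the print-based logging are placeholders):

    import subprocess

    # Placeholder for the real spark-sql invocation.
    cmd = ["spark-sql", "-e", "SELECT 1"]

    # Redirect STDERR into STDOUT so there is only one pipe to drain.
    # With two separate pipes, reading STDOUT to exhaustion first means
    # a chatty STDERR can fill its OS buffer and block the child process.
    sp = subprocess.Popen(cmd,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT)

    # Consume the merged stream line by line. Note that the sentinel must
    # be the empty byte string b'', since readline() returns bytes here;
    # a str sentinel '' would never compare equal, and the loop would keep
    # yielding b'' forever after EOF.
    for line in iter(sp.stdout.readline, b''):
        print(line.decode('utf-8', errors='replace').rstrip())

    returncode = sp.wait()

That last point about the sentinel may also be relevant to the empty byte strings you are seeing, but without the logs I can only guess.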
Does this make sense? Please share the logs of the SparkSqlOperator job.

Cheers,
Fokko

2018-03-07 9:57 GMT+01:00 Aleksander Sabitov <[email protected]>:

> Hi Fokko!
> Maybe you can advise something. I have a small issue while using the
> Spark SQL operator in Airflow. Most probably it's connected to the fact
> that I use Docker to run Airflow. The issue is that when a Spark SQL job
> fails or succeeds (regardless), the logging loop keeps continuously
> consuming empty byte strings. Before the last commits to the Spark SQL
> hook, it was reading byte strings and decoding them on the fly:
> https://github.com/apache/incubator-airflow/commit/32750601ad0a422283613bf7fccff8eb5407bc9c#diff-16c0ecc7c4b60bfe6e66592bb70e17cf
> I think my issue can somehow be connected to both the Docker usage and
> the byte strings.
> Any ideas?
>
> Thanks in advance!
>
> Aleksandr
