[
https://issues.apache.org/jira/browse/AIRFLOW-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Bandy updated AIRFLOW-2108:
---------------------------------
Description:
When the BashOperator logs every line of output from the executing process, it
strips leading whitespace which makes it difficult to interpret output that was
formatted with indentation.
For example, I'm executing [PGLoader|http://pgloader.readthedocs.io/] through
this operator. When it finishes, it prints a summary which appears in the logs
like so:
{noformat}
[2018-02-14 07:31:44,524] {bash_operator.py:101} INFO -
2018-02-14T07:31:44.115000Z LOG report summary reset
[2018-02-14 07:31:44,564] {bash_operator.py:101} INFO - table name errors
read imported bytes total time read write
[2018-02-14 07:31:44,564] {bash_operator.py:101} INFO - ----------------------
--------- --------- --------- --------- -------------- --------- ---------
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - fetch meta data
0 524 524 1.438s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Schemas
0 0 0 0.161s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create SQL Types
0 19 19 20.413s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create tables
0 310 310 3m2.316s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Set Table OIDs
0 155 155 0.458s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - ----------------------
--------- --------- --------- --------- -------------- --------- ---------
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - ----------------------
--------- --------- --------- --------- -------------- --------- ---------
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Index Build Completion
0 353 353 1m37.323s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Indexes
0 353 353 3m25.929s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Reset Sequences
0 0 0 2.677s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Primary Keys 0
147 147 1m21.091s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Foreign Keys
0 16 16 8.283s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Triggers
0 0 0 0.339s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Install Comments
0 0 0 0.000s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - ----------------------
--------- --------- --------- --------- -------------- --------- ---------
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Total import time
∞ 0 0 6m35.642s
{noformat}
Ideally, the leading whitespace would be retained, so the logs look like this:
{noformat}
[2018-02-14 07:31:44,524] {bash_operator.py:101} INFO -
2018-02-14T07:31:44.115000Z LOG report summary reset
[2018-02-14 07:31:44,564] {bash_operator.py:101} INFO - table name
errors read imported bytes total time read write
[2018-02-14 07:31:44,564] {bash_operator.py:101} INFO - ----------------------
--------- --------- --------- --------- -------------- --------- ---------
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - fetch meta data
0 524 524 1.438s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Schemas
0 0 0 0.161s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create SQL Types
0 19 19 20.413s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create tables
0 310 310 3m2.316s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Set Table OIDs
0 155 155 0.458s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - ----------------------
--------- --------- --------- --------- -------------- --------- ---------
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - ----------------------
--------- --------- --------- --------- -------------- --------- ---------
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Index Build Completion
0 353 353 1m37.323s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Indexes
0 353 353 3m25.929s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Reset Sequences
0 0 0 2.677s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Primary Keys
0 147 147 1m21.091s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Foreign Keys
0 16 16 8.283s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Triggers
0 0 0 0.339s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Install Comments
0 0 0 0.000s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - ----------------------
--------- --------- --------- --------- -------------- --------- ---------
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Total import time
∞ 0 0 6m35.642s
{noformat}
was:
When the BashOperator logs every line of output from the executing process, it
strips leading whitespace which makes it difficult to interpret output that was
formatted with indentation.
For example, I'm executing [PGLoader|http://pgloader.readthedocs.io/] through
this operator. When it finishes, it prints a summary which appears in the logs
like so:
{noformat}
[2018-02-14 07:31:44,524] {bash_operator.py:101} INFO -
2018-02-14T07:31:44.115000Z LOG report summary reset
[2018-02-14 07:31:44,564] {bash_operator.py:101} INFO - table name errors
read imported bytes total time read write
[2018-02-14 07:31:44,564] {bash_operator.py:101} INFO - ----------------------
--------- --------- --------- --------- -------------- --------- ---------
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - fetch meta data
0 524 524 1.438s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Schemas
0 0 0 0.161s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create SQL Types
0 19 19 20.413s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create tables
0 310 310 3m2.316s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Set Table OIDs
0 155 155 0.458s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - ----------------------
--------- --------- --------- --------- -------------- --------- ---------
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - ----------------------
--------- --------- --------- --------- -------------- --------- ---------
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Index Build Completion
0 353 353 1m37.323s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Indexes
0 353 353 3m25.929s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Reset Sequences
0 0 0 2.677s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Primary Keys 0
147 147 1m21.091s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Foreign Keys
0 16 16 8.283s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Triggers
0 0 0 0.339s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Install Comments
0 0 0 0.000s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - ----------------------
--------- --------- --------- --------- -------------- --------- ---------
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Total import time
∞ 0 0 6m35.642s
{noformat}
Ideally, the leading whitespace would be retained, so the output looks like
this:
{noformat}
[2018-02-14 07:31:44,524] {bash_operator.py:101} INFO -
2018-02-14T07:31:44.115000Z LOG report summary reset
[2018-02-14 07:31:44,564] {bash_operator.py:101} INFO - table name
errors read imported bytes total time read write
[2018-02-14 07:31:44,564] {bash_operator.py:101} INFO - ----------------------
--------- --------- --------- --------- -------------- --------- ---------
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - fetch meta data
0 524 524 1.438s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Schemas
0 0 0 0.161s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create SQL Types
0 19 19 20.413s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create tables
0 310 310 3m2.316s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Set Table OIDs
0 155 155 0.458s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - ----------------------
--------- --------- --------- --------- -------------- --------- ---------
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - ----------------------
--------- --------- --------- --------- -------------- --------- ---------
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Index Build Completion
0 353 353 1m37.323s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Indexes
0 353 353 3m25.929s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Reset Sequences
0 0 0 2.677s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Primary Keys
0 147 147 1m21.091s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Foreign Keys
0 16 16 8.283s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Triggers
0 0 0 0.339s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Install Comments
0 0 0 0.000s
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - ----------------------
--------- --------- --------- --------- -------------- --------- ---------
[2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Total import time
∞ 0 0 6m35.642s
{noformat}
> BashOperator discards process indentation
> -----------------------------------------
>
> Key: AIRFLOW-2108
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2108
> Project: Apache Airflow
> Issue Type: Bug
> Components: operators
> Affects Versions: 1.9.0
> Reporter: Chris Bandy
> Priority: Minor
>
> When the BashOperator logs every line of output from the executing process,
> it strips leading whitespace which makes it difficult to interpret output
> that was formatted with indentation.
> For example, I'm executing [PGLoader|http://pgloader.readthedocs.io/] through
> this operator. When it finishes, it prints a summary which appears in the
> logs like so:
> {noformat}
> [2018-02-14 07:31:44,524] {bash_operator.py:101} INFO -
> 2018-02-14T07:31:44.115000Z LOG report summary reset
> [2018-02-14 07:31:44,564] {bash_operator.py:101} INFO - table name errors
> read imported bytes total time read write
> [2018-02-14 07:31:44,564] {bash_operator.py:101} INFO -
> ---------------------- --------- --------- --------- ---------
> -------------- --------- ---------
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - fetch meta data
> 0 524 524 1.438s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Schemas
> 0 0 0 0.161s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create SQL Types
> 0 19 19 20.413s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create tables
> 0 310 310 3m2.316s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Set Table OIDs
> 0 155 155 0.458s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO -
> ---------------------- --------- --------- --------- ---------
> -------------- --------- ---------
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO -
> ---------------------- --------- --------- --------- ---------
> -------------- --------- ---------
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Index Build
> Completion 0 353 353 1m37.323s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Indexes
> 0 353 353 3m25.929s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Reset Sequences
> 0 0 0 2.677s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Primary Keys
> 0 147 147 1m21.091s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Foreign Keys
> 0 16 16 8.283s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Triggers
> 0 0 0 0.339s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Install Comments
> 0 0 0 0.000s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO -
> ---------------------- --------- --------- --------- ---------
> -------------- --------- ---------
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Total import time
> ∞ 0 0 6m35.642s
> {noformat}
> Ideally, the leading whitespace would be retained, so the logs look like this:
> {noformat}
> [2018-02-14 07:31:44,524] {bash_operator.py:101} INFO -
> 2018-02-14T07:31:44.115000Z LOG report summary reset
> [2018-02-14 07:31:44,564] {bash_operator.py:101} INFO - table
> name errors read imported bytes total time read
> write
> [2018-02-14 07:31:44,564] {bash_operator.py:101} INFO -
> ---------------------- --------- --------- --------- ---------
> -------------- --------- ---------
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - fetch meta
> data 0 524 524 1.438s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create
> Schemas 0 0 0 0.161s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create SQL
> Types 0 19 19 20.413s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create
> tables 0 310 310 3m2.316s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Set Table
> OIDs 0 155 155 0.458s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO -
> ---------------------- --------- --------- --------- ---------
> -------------- --------- ---------
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO -
> ---------------------- --------- --------- --------- ---------
> -------------- --------- ---------
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Index Build
> Completion 0 353 353 1m37.323s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create
> Indexes 0 353 353 3m25.929s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Reset
> Sequences 0 0 0 2.677s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Primary
> Keys 0 147 147 1m21.091s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create Foreign
> Keys 0 16 16 8.283s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Create
> Triggers 0 0 0 0.339s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Install
> Comments 0 0 0 0.000s
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO -
> ---------------------- --------- --------- --------- ---------
> -------------- --------- ---------
> [2018-02-14 07:31:44,567] {bash_operator.py:101} INFO - Total import
> time ∞ 0 0 6m35.642s
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)