blag opened a new issue, #28280:
URL: https://github.com/apache/airflow/issues/28280

   ### Apache Airflow version
   
   main (development)
   
   ### What happened
   
   Variables with newlines in them are not always reproduced exactly as they 
are pasted into the web UI. Depending on how they are rendered, Unix newlines 
(`\n`) are silently converted to Windows newlines (`\r\n`) when rendered.
   
   Additionally, when users try to specify `\r` CR/"carriage return" in Python 
code, that gets rendered as a `\n` LF/"line feed" character.
   
   Tested on both astro and Airflow with Breeze.
   
   ### What you think should happen instead
   
   Unix newlines (`\n`) should always be rendered as Unix newlines (`\n`); 
Windows newlines (`\r\n`) should always be rendered as Windows newlines.
   
   ### How to reproduce
   
   Steps to reproduce:
   
   1. Spin up Airflow with Breeze (`breeze start-airflow`) - I tested with the 
PostgreSQL database backend
   2. Install `dos2unix` within the Airflow container: `apt update; apt install 
--yes dos2unix`
   3. Create a Variable in the Airflow UI named `newlines`, and ensure that its 
content contains newlines. Note that the non-newline content doesn't really 
matter. I used:
      ```
      Line 1
      Line 2
      Line 3
      ```
   4. Run this dag:
      ```python
      from datetime import datetime
      
      from airflow import DAG
      from airflow.operators.bash import BashOperator
      from airflow.operators.smooth import SmoothOperator
      
      
      with DAG("test_newlines", schedule=None) as dag:
          test_bash_operator_multiline_env_var = BashOperator(
              dag=dag,
              task_id="test_bash_operator_multiline_env_var",
              start_date=datetime.now(),
              env={
                  "MULTILINE_ENV_VAR": """Line 1
      Line 2
      Line 3"""
              },
              append_env=True,
              bash_command='''
      if [[ "$(echo $MULTILINE_ENV_VAR)" != "$(echo $MULTILINE_ENV_VAR | 
dos2unix)" ]]; then
          echo >&2 "Multiline environment variable contains newlines 
incorrectly converted to Windows CRLF"
          exit 1
      fi''',
          )
      
          test_bash_operator_heredoc_contains_newlines = BashOperator(
              dag=dag,
              task_id="test_bash_operator_heredoc_contains_newlines",
              start_date=datetime.now(),
              bash_command="""
      diff <(
      cat <<EOF
      Line 1
      Line 2
      Line 3
      EOF
      ) <(
      cat | dos2unix <<EOF
      Line 1
      Line 2
      Line 3
      EOF
      ) || {
          echo >&2 "Bash heredoc contains newlines incorrectly converted to 
Windows CRLF"
          exit 1
      }
      """,
          )
      
          test_bash_operator_env_var_from_variable_jinja_interpolation = 
BashOperator(
              dag=dag,
              
task_id="test_bash_operator_env_var_from_variable_jinja_interpolation",
              start_date=datetime.now(),
              env={
                  "ENV_VAR_AIRFLOW_VARIABLE_WITH_NEWLINES": "{{ 
var.value['newlines'] }}",
              },
              append_env=True,
              bash_command='''
      diff <(echo "$ENV_VAR_AIRFLOW_VARIABLE_WITH_NEWLINES") <(echo 
"$ENV_VAR_AIRFLOW_VARIABLE_WITH_NEWLINES" | dos2unix) || {
          echo >&2 "Environment variable contains newlines incorrectly 
converted to Windows CRLF"
          exit 1
      }
      ''',
          )
      
          test_bash_operator_from_variable_jinja_interpolation = BashOperator(
              dag=dag,
              task_id="test_bash_operator_from_variable_jinja_interpolation",
              start_date=datetime.now(),
              bash_command='''
      diff <(echo "{{ var.value['newlines'] }}") <(echo "{{ 
var.value['newlines'] }}" | dos2unix) || {
          echo >&2 "Jinja interpolated string contains newlines incorrectly 
converted to Windows CRLF"
          exit 1
      }
      ''',
          )
      
          test_bash_operator_backslash_n_not_equals_backslash_r = BashOperator(
              dag=dag,
              task_id="test_bash_operator_backslash_n_not_equals_backslash_r",
              start_date=datetime.now(),
              bash_command='''
      if [[ "\r" == "\n" ]]; then
          echo >&2 "Backslash-r has been incorrectly converted into backslash-n"
          exit 1
      fi''',
          )
      
          passing = SmoothOperator(dag=dag, task_id="passing", 
start_date=datetime.now())
          failing = SmoothOperator(dag=dag, task_id="failing", 
start_date=datetime.now())
      
          [
              test_bash_operator_multiline_env_var,
              test_bash_operator_heredoc_contains_newlines,
          ] >> passing
          [
              test_bash_operator_env_var_from_variable_jinja_interpolation,
              test_bash_operator_from_variable_jinja_interpolation,
              test_bash_operator_backslash_n_not_equals_backslash_r,
          ] >> failing
      ```
   
   Warning: The `test_bash_operator_heredoc_contains_newlines` may not actually 
ever complete in Breeze. It does complete successfully when using `astro`. That 
might be another bug.
   
   ### Operating System
   
   macOS 11.6.8 "Big Sur"
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Astronomer
   
   ### Deployment details
   
   Local breeze environment
   Local `astro` environment
   
   ### Anything else
   
   Every time.
   
   I'm still trying to narrow down on a root cause so I don't have a PR yet. 
But somebody else might have a better idea of where to look and how to fix it, 
so don't let me stop you.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to