potiuk commented on issue #28280:
URL: https://github.com/apache/airflow/issues/28280#issuecomment-1359916847
Good catch.
It is because this is what Jinja2 does by default - it replaces any new line
it detects in the template (not in the values) with the value of
newline_sequence passed as parameter of the Template (which is `\n` by
default). This is nothing specific to Airflow - it's just how Jinja2 works.
You can test it yourself:
When you run this In a regular terminal this one (same with no
newline_sequence):
```
import jinja2
jinjatemplate = jinja2.Template('Hello \r{{ name }}!', newline_sequence="\n")
print(jinjatemplate.render(name="World"))
```
You will get:
```
Hello
World!
```
Where this one:
```
import jinja2
jinjatemplate = jinja2.Template('Hello \r{{ name }}!', newline_sequence="\r")
print(jinjatemplate.render(name="World"))
```
Will print just:
```
World!
```
The thing is that Jinja2 treats the template as multi-line text file. It
reads the temple line by line and later joins the line using "newline_sequence"
as separator. I think this has a lot to do with the ways how some formatting of
jinja template and whitespace is done. For example it would be rather difficult
to achieve whitespace manipulation that Jinja allows you to do with whitespace
control:
https://jinja.palletsprojects.com/en/3.1.x/templates/#whitespace-control - so
JINJA authors actually opted to define what is the "canonical newline" they
always use in the rendered template. Which makes a lot of sense - imagine the
complexity if you have a template that mixes both \r and 'n and Jinja
adding/removing those - with some level of consistency.
Additionally, the `test_bash_operator_heredoc_contains_newlines` does not
complete successfuly for you because it has a bug (in the task definition).
Proper command should be:
```
bash_command="""
diff <(
cat <<EOF
Line 1
Line 2
Line 3
EOF
) <(
cat <<EOF | dos2unix # <- Here EOF was in a wrong place
Line 1
Line 2
Line 3
EOF
) || {
echo >&2 "Bash heredoc contains newlines incorrectly converted to
Windows CRLF"
exit 1
}
""",
```
The problem was that heredoc was pased as input to dos2unix command, not to
the `cat` command and the cat command simply read from the stdin. Apparently
astro sdk somehow replaces the stdin when it runs airflow tasks and cat simply
gets empty string or smth like that.
Solution:
I am not sure if we want to do anything with it. This seems to be standard
Jinja2 behaviour, and even if it kinda unexpected, I think this is the
behaviour our users already likely rely on and changing it would be
backwards-incompatible. Maybe we could specify newline_sequence somewhere in
DAG definition (same as render_template_as_native_object?)
In any case, It would be great however if this behaviour is documented.
WDYT?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]