Pedrinhonitz commented on issue #65379:
URL: https://github.com/apache/airflow/issues/65379#issuecomment-4537497549
Hello, I'm curious about this problem and would like to understand it better.
It may be possible to work around it without modifying Airflow. I might be
wrong, but if you build the delimiter inside the task instead of defining it in
the DAG body or `default_args`, I think you won't hit this problem.
I ran a local test and it worked; maybe this will help.
I created this DAG, which assigns the delimiters to variables in the DAG body,
as you described:
```python
from datetime import datetime

from airflow import DAG
from airflow.providers.standard.operators.python import PythonOperator

# Delimiters defined in the DAG body; '\0' is a NUL character.
field_del = '@@\0@@'
record_del = '^^\0^^'


def process_csv(**context):
    print(f"field_del repr: {field_del!r}")
    print(f"record_del repr: {record_del!r}")
    return {"ok": True}


with DAG(
    dag_id="old_csv_delimiter_dag",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
    tags=["testing", "issue-65379"],
) as dag:
    PythonOperator(
        task_id="process_csv",
        python_callable=process_csv,
        # The NUL-containing delimiters are also passed as operator arguments.
        op_kwargs={"field_del": field_del, "record_del": record_del},
    )
```
Then I created this second DAG, which worked locally:
```python
from datetime import datetime

from airflow import DAG
from airflow.providers.standard.operators.python import PythonOperator


def _delimiters():
    # Build the delimiters at call time instead of in the DAG body.
    nul = chr(0)
    return f"@@{nul}@@", f"^^{nul}^^{nul}"


def process_csv(**context):
    field_del, record_del = _delimiters()
    print(f"field_del repr: {field_del!r}")
    print(f"record_del repr: {record_del!r}")
    # Confirm the NUL character is present inside the task.
    assert '\x00' in field_del
    assert '\x00' in record_del
    return {"ok": True}


with DAG(
    dag_id="fix_csv_delimiter_dag",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
    tags=["testing", "issue-65379"],
) as dag:
    # No delimiter is passed to the operator here.
    PythonOperator(
        task_id="process_csv",
        python_callable=process_csv,
    )
```
I might be wrong, but this keeps the delimiters out of the DAG-level parameters
and builds them inside the task at run time, so no null bytes end up in the
values the DAG passes to the operator; at least it worked here. If it doesn't
work for you, please provide more details about the error.
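If the delimiters still need to be chosen in the DAG body, one possible variant
is to pass a NUL-free escaped form through `op_kwargs` and decode it inside the
task. The sketch below is only an illustration in plain Python:
`encode_delimiter` and `decode_delimiter` are hypothetical helper names (not
Airflow APIs), and it has not been verified against the failure in this issue.

```python
def encode_delimiter(raw: str) -> str:
    # Produce an ASCII-only form in which NUL is written as backslash-x-0-0.
    return raw.encode("unicode_escape").decode("ascii")


def decode_delimiter(escaped: str) -> str:
    # Reverse encode_delimiter(); meant to be called inside the task callable.
    return escaped.encode("ascii").decode("unicode_escape")


# In the DAG body: only the escaped text would go into op_kwargs.
safe_field_del = encode_delimiter("@@" + chr(0) + "@@")

# Inside the task: rebuild the real delimiter just before using it.
field_del = decode_delimiter(safe_field_del)

assert chr(0) not in safe_field_del
assert field_del == "@@" + chr(0) + "@@"
```

The escaped value contains only printable ASCII, so whatever handles the
operator arguments never sees a null byte; the task restores it on its own side.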
**This may not be the best solution, but it might let you migrate to the new
version sooner than waiting for a fix.**
**_Cursor AI was used to assist in reviewing your issue with the
claude-4.6-sonnet-medium model; there may be defects._**