brki opened a new issue, #51686:
URL: https://github.com/apache/airflow/issues/51686
### Apache Airflow Provider(s)
http
### Versions of Apache Airflow Providers
apache-airflow-providers-http==5.3.0
### Apache Airflow version
2.11.0
### Operating System
debian bookworm
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
not relevant
### What happened
A HttpHook was instantiated: `hook = HttpHook(...)`
A value for `extra_options` was created dynamically, based on some condition.
In this case, it looked like:
```
extra_options_val = {"proxies": {"http": "redacted", "https": "redacted"}}
```
`hook.run(..., extra_options=extra_options_val))`
This runs OK.
A second call is made (to get a second page of results):
`hook.run(..., extra_options=extra_options_val))`
This fails (because the "proxies" value was popped out of the
`extra_options_val` dict on the first call to `hook.run()`.
What should happen:
The `extra_options` dict passed in to `run()` should not be modified.
`HttpHook.run()` should make a deepcopy of `extra_options`, or
`HttpHook._configure_session_from_extra()` should not remove values from the
`extra_options` parameter.
### What you think should happen instead
The `extra_options` dict passed in to `run()` should not be modified.
`HttpHook.run()` should make a deepcopy of `extra_options`, or
`HttpHook._configure_session_from_extra()` should not remove values from the
`extra_options` parameter.
### How to reproduce
```
from copy import deepcopy
import airflow
from airflow.providers.standard.operators.python import PythonOperator
from airflow.providers.http.hooks.http import HttpHook
def http_hook_test():
extra_options = {"verify_ssl": False}
original_extra_options = deepcopy(extra_options)
# jsonplaceholder Connection has
# * Host = https://jsonplaceholder.typicode.com
# * Extra = {"foo": "bar"}
# Note: some Extra value is necessary, otherwise the extra_options
passed to run are not used, see https://github.com/apache/airflow/issues/51685
http_hook = HttpHook(method='GET', http_conn_id='jsonplaceholder')
http_hook.run(
endpoint=f"/posts/1",
extra_options=extra_options,
)
assert extra_options == original_extra_options, f"extra_options:
{extra_options}, original_extra_options: {original_extra_options}"
with airflow.DAG(
dag_id='http_hook_test_dag',
schedule_interval=None,
start_date=airflow.utils.dates.days_ago(1),
catchup=False,
) as dag:
test_http_hook = PythonOperator(
task_id='test_http_hook',
python_callable=http_hook_test,
)
```
### Anything else
Modifying a mutable value passed in should, imho, be avoided in framework
libraries. Sometimes it can make sense, for example in
`update_these_values(values_to_update)`. But in general, the caller of a
framework library function will not expect it to modified, unless the function
documentation explicitly states that it will be modified.
This also seems to apply to the Airflow 3.0.2 HttpHook.
The provided reproducible example uses `verify_ssl` instead of `proxies` in
extra_options because it's easier to clearly reproduce, but the same behaviour
is seen with `proxies`.
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]