shivaam commented on issue #55678:
URL: https://github.com/apache/airflow/issues/55678#issuecomment-3869575628

   I was able to reproduce this. Here's the analysis. Can someone assign it to 
me?
   
   **Root Cause**
   
   `BaseSQLOperator.get_hook()` copies all `connection.extra_dejson` fields 
into `hook_params`, which get passed as kwargs to `AthenaSQLHook.__init__()`. 
The hook forwards everything to `super().__init__()` → 
`AwsGenericHook.__init__()`, which has a strict signature — no `**kwargs`.
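
   The failure mode can be sketched without Airflow installed. This is only an illustration: `StrictBaseHook`, `ForwardingHook`, `build_hook`, and the `work_group` extra are hypothetical stand-ins, not the provider's actual classes or a confirmed connection field.

```python
# Stand-in classes only; none of this is Airflow code.
class StrictBaseHook:
    """Plays the role of a base hook whose __init__ has no **kwargs."""

    def __init__(self, aws_conn_id="aws_default", region_name=None):
        self.aws_conn_id = aws_conn_id
        self.region_name = region_name


class ForwardingHook(StrictBaseHook):
    """Plays the role of a hook that forwards every kwarg to its parent."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)


def build_hook(hook_params):
    """Return (hook, None) on success or (None, error message) on failure."""
    try:
        return ForwardingHook(**hook_params), None
    except TypeError as exc:
        return None, str(exc)


# Hypothetical connection extras, copied wholesale into hook_params.
hook_params = {"region_name": "us-east-1", "work_group": "primary"}
hook, error = build_hook(hook_params)
# hook is None here: the strict parent signature rejects 'work_group'.
```

   Any extra the parent does not name as a parameter triggers the same `TypeError`, which is why the problem only appears once the connection carries extras.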
   
   These params aren't even needed in `__init__`. They're already read from 
`self.conn.extra_dejson` later in `get_conn()` when PyAthena actually connects. 
The operator passes them redundantly.
   
   `AthenaSQLHook` is the only hook in the Amazon provider that inherits from 
both `AwsBaseHook` and `DbApiHook`. `RedshiftSQLHook` avoids the problem by 
extending only `DbApiHook`.
   
   **Fix options I considered**
   
   1. **Filter kwargs in `AthenaSQLHook.__init__()`**: use an allowlist of 
params that `AwsGenericHook` accepts and pass only those to 
`super().__init__()`. This does not affect any other class or functionality. 
   2. **Add `**kwargs` to `AwsGenericHook.__init__()`**: would fix this, but 
would silently swallow typos and invalid params for every AWS hook. I think 
the strict signature is intentional here.
   
   I'm leaning toward option 1, since the change stays local to 
`AthenaSQLHook`.
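
   A minimal sketch of what option 1 could look like, using stand-in classes rather than the real hooks. `StrictBaseHook`, `FilteringHook`, and `accepted_kwargs` are hypothetical names, and deriving the allowlist from the parent signature with `inspect.signature` is my assumption; a hard-coded tuple of names would work the same way.

```python
import inspect


class StrictBaseHook:
    """Stand-in for a base hook with a strict __init__ (no **kwargs)."""

    def __init__(self, aws_conn_id="aws_default", region_name=None, verify=None):
        self.aws_conn_id = aws_conn_id
        self.region_name = region_name
        self.verify = verify


def accepted_kwargs(func):
    """Return the names that func accepts as keyword arguments."""
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    return frozenset(
        name
        for name, param in inspect.signature(func).parameters.items()
        if name != "self" and param.kind in keyword_kinds
    )


class FilteringHook(StrictBaseHook):
    """Stand-in for a hook that forwards only allowlisted kwargs."""

    # Computed once at class creation from the parent's signature.
    _ALLOWED = accepted_kwargs(StrictBaseHook.__init__)

    def __init__(self, *args, **kwargs):
        allowed = {k: v for k, v in kwargs.items() if k in self._ALLOWED}
        super().__init__(*args, **allowed)


# 'work_group' is a hypothetical connection extra; it is dropped, not raised on.
hook = FilteringHook(region_name="us-east-1", work_group="primary")
```

   The dropped extras are not lost in this design, because the hook still reads them from the connection when it actually connects. The trade-off is that a misspelled kwarg passed directly to this one hook is also dropped silently, so logging the ignored keys may be worth considering.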

