wavewater opened a new issue #17135: URL: https://github.com/apache/airflow/issues/17135
When calling the Exasol hook's `get_pandas_df` function (https://github.com/apache/airflow/blob/main/airflow/providers/exasol/hooks/exasol.py), I noticed that it does not return a pandas DataFrame. It returns `None`. In fact, the return type hint in the function definition explicitly states `None`. The name `get_pandas_df`, however, implies that a DataFrame is returned. I think it would make more sense if `get_pandas_df` did return a DataFrame, as the name alludes to.

So the code should look like this:

    def get_pandas_df(self, sql: Union[str, list], parameters: Optional[dict] = None, **kwargs) -> pd.DataFrame:
        # ... some code ...
        with closing(self.get_conn()) as conn:
            df = conn.export_to_pandas(sql, query_params=parameters, **kwargs)
            return df

instead of:

    def get_pandas_df(self, sql: Union[str, list], parameters: Optional[dict] = None, **kwargs) -> None:
        # ... some code ...
        with closing(self.get_conn()) as conn:
            conn.export_to_pandas(sql, query_params=parameters, **kwargs)

**Apache Airflow version**: 2.1.0

**Kubernetes version (if you are using kubernetes)** (use `kubectl version`): Not using Kubernetes

**Environment**: Official Airflow Docker image

- **Cloud provider or hardware configuration**: no cloud; Docker host (DELL server with 48 cores, 512 GB RAM and many TB of storage)
- **OS** (e.g. from /etc/os-release): Official Airflow Docker image on a CentOS 7 host
- **Kernel** (e.g. `uname -a`): Linux cad18b35be00 3.10.0-1160.21.1.el7.x86_64 #1 SMP Tue Mar 16 18:28:22 UTC 2021 x86_64 GNU/Linux
- **Install tools**: only Docker
- **Others**:

**What happened**:

You can replicate the findings with the following DAG file:

    import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator
    from airflow.providers.exasol.hooks.exasol import ExasolHook
    import pandas as pd

    default_args = {"owner": "airflow"}


    def call_exasol_hook(**kwargs):
        # Make connection to Exasol
        hook = ExasolHook(exasol_conn_id='Exasol QA')
        sql = 'select 42;'
        # Expected to be a DataFrame, but None is returned
        df = hook.get_pandas_df(sql=sql)
        return df


    with DAG(
        dag_id="exasol_hook_problem",
        start_date=datetime.datetime(2021, 5, 5),
        schedule_interval="@once",
        default_args=default_args,
        catchup=False,
    ) as dag:
        set_variable = PythonOperator(
            task_id='call_exasol_hook',
            python_callable=call_exasol_hook,
        )

Sorry in case I missed something.

When testing or executing the task via the CLI:

    airflow tasks test exasol_hook_problem call_exasol_hook 2021-07-20

the logs show:

    [2021-07-21 12:53:19,775] {python.py:151} INFO - Done. Returned value was: None

`None` was returned although `get_pandas_df` was called. A pandas DataFrame should have been returned instead.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
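
The difference between the two versions of `get_pandas_df` can be illustrated without an Exasol database. The following is a minimal sketch, not the provider's code: `FakeExasolConnection`, `get_pandas_df_current` and `get_pandas_df_proposed` are hypothetical names introduced here, and the stand-in connection returns a plain list of rows instead of a DataFrame so that the example needs nothing beyond the standard library.

```python
from contextlib import closing


class FakeExasolConnection:
    """Hypothetical stand-in for the object returned by get_conn()."""

    def __init__(self):
        self.closed = False

    def export_to_pandas(self, sql, query_params=None, **kwargs):
        # A real connection would return a pandas DataFrame here; a list of
        # rows is used so the sketch has no third-party dependencies.
        return [{"result": 42}]

    def close(self):
        self.closed = True


def get_pandas_df_current(conn, sql, parameters=None, **kwargs):
    """Mirrors the reported behavior: the exported result is discarded."""
    with closing(conn) as c:
        c.export_to_pandas(sql, query_params=parameters, **kwargs)


def get_pandas_df_proposed(conn, sql, parameters=None, **kwargs):
    """Mirrors the proposed change: the exported result is returned."""
    with closing(conn) as c:
        df = c.export_to_pandas(sql, query_params=parameters, **kwargs)
        return df


current_result = get_pandas_df_current(FakeExasolConnection(), "select 42;")
proposed_result = get_pandas_df_proposed(FakeExasolConnection(), "select 42;")

print(current_result)   # None
print(proposed_result)  # [{'result': 42}]
```

In both variants the connection is still closed by `closing(...)`, so returning from inside the `with` block does not leak the connection.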
