skolchin opened a new issue, #24618: URL: https://github.com/apache/airflow/issues/24618
### Apache Airflow Provider(s) oracle ### Versions of Apache Airflow Providers apache-airflow-providers-oracle==3.1.0 ### Apache Airflow version 2.3.2 (latest released) ### Operating System Linux 4.19.79-1.el7.x86_64 ### Deployment Docker-Compose ### Deployment details Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 Python: 3.8 Oracle database charset: UTF-8 (returned by `SELECT value FROM nls_database_parameters WHERE parameter = 'NLS_NCHAR_CHARACTERSET'`) Oracle's client environment: - LC_CTYPE=C.UTF-8 - NLS_LANG=AMERICAN_AMERICA.CL8MSWIN1251 - LC_ALL=C.UTF-8 ### What happened Any query to Oracle database with UTF8 charset failed with error: > oracledb.exceptions.NotSupportedError: DPY-3012: national character set id 871 is not supported by python-oracledb in thin mode ### What you think should happen instead Definetelly, it should work, as it was in previous Oracle provider version (3.0.0). Quick search shows that `python-oracledb` package, which replaces `cx_Oracle` in 3.1.0, uses **thin** driver mode by default, and it seems that UTF-8 codepage is not supported in that mode ( [see this issue](https://stackoverflow.com/questions/72465536/python-oracledb-new-cx-oracle-connection-generating-notsupportederror-dpy-3012) ). In order to get to thick mode, a call to `oracledb.init_oracle_client()` is required before any connection made ( [see here](https://python-oracledb.readthedocs.io/en/latest/api_manual/module.html#oracledb.init_oracle_client) ). Indeed, if I add this call to `airflow/providers/oracle/hooks/oracle.py`, everything works fine. Resulting code looks like this: ``` import math import warnings from datetime import datetime from typing import Dict, List, Optional, Union import oracledb oracledb.init_oracle_client() ... ``` Downgrade to version 3.0.0 also helps, but I suppose it should be some permanent solution, like adding a configuration parameter or so. ### How to reproduce - Setup an Oracle database with UTF8 charset - Setup an Airflow connection with `oracle` type - Create an operator which issues a `SELECT` statement against the database ### Anything else Task execution log as follows: > [2022-06-23, 17:35:36 MSK] {task_command.py:370} INFO - Running <TaskInstance: nip-stage-load2.load-dict.load-sa_user scheduled__2022-06-22T00:00:00+00:00 [running]> on host dwh_develop_scheduler > [2022-06-23, 17:35:37 MSK] {taskinstance.py:1569} INFO - Exporting the following env vars: > [email protected] > AIRFLOW_CTX_DAG_OWNER=airflow > AIRFLOW_CTX_DAG_ID=nip-stage-load2 > AIRFLOW_CTX_TASK_ID=load-dict.load-sa_user > AIRFLOW_CTX_EXECUTION_DATE=2022-06-22T00:00:00+00:00 > AIRFLOW_CTX_TRY_NUMBER=1 > AIRFLOW_CTX_DAG_RUN_ID=scheduled__2022-06-22T00:00:00+00:00 > [2022-06-23, 17:35:37 MSK] {base.py:68} INFO - Using connection ID 'nip_standby' for task execution. > [2022-06-23, 17:35:37 MSK] {base.py:68} INFO - Using connection ID 'stage' for task execution. > [2022-06-23, 17:35:37 MSK] {data_transfer.py:198} INFO - Executing: > SELECT * FROM GMP.SA_USER > [2022-06-23, 17:35:37 MSK] {base.py:68} INFO - Using connection ID 'nip_standby' for task execution. > [2022-06-23, 17:35:37 MSK] {taskinstance.py:1889} ERROR - Task failed with exception > Traceback (most recent call last): > File "/home/airflow/.local/lib/python3.8/site-packages/dwh_etl/operators/data_transfer.py", line 265, in execute > if not self.no_check and self.compare_datasets(self.object_name, src, dest): > File "/home/airflow/.local/lib/python3.8/site-packages/dwh_etl/operators/data_transfer.py", line 199, in compare_datasets > src_df = src.get_pandas_df(sql) > File "/home/airflow/.local/lib/python3.8/site-packages/airflow/hooks/dbapi.py", line 128, in get_pandas_df > with closing(self.get_conn()) as conn: > File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/oracle/hooks/oracle.py", line 149, in get_conn > conn = oracledb.connect(**conn_config) > File "/home/airflow/.local/lib/python3.8/site-packages/oracledb/connection.py", line 1000, in connect > return conn_class(dsn=dsn, pool=pool, params=params, **kwargs) > File "/home/airflow/.local/lib/python3.8/site-packages/oracledb/connection.py", line 128, in __init__ > impl.connect(params_impl) > File "src/oracledb/impl/thin/connection.pyx", line 345, in oracledb.thin_impl.ThinConnImpl.connect > File "src/oracledb/impl/thin/connection.pyx", line 163, in oracledb.thin_impl.ThinConnImpl._connect_with_params > File "src/oracledb/impl/thin/connection.pyx", line 129, in oracledb.thin_impl.ThinConnImpl._connect_with_description > File "src/oracledb/impl/thin/connection.pyx", line 250, in oracledb.thin_impl.ThinConnImpl._connect_with_address > File "src/oracledb/impl/thin/protocol.pyx", line 197, in oracledb.thin_impl.Protocol._connect_phase_two > File "src/oracledb/impl/thin/protocol.pyx", line 263, in oracledb.thin_impl.Protocol._process_message > File "src/oracledb/impl/thin/protocol.pyx", line 242, in oracledb.thin_impl.Protocol._process_message > File "src/oracledb/impl/thin/messages.pyx", line 280, in oracledb.thin_impl.Message.process > File "src/oracledb/impl/thin/messages.pyx", line 2094, in oracledb.thin_impl.ProtocolMessage._process_message > File "/home/airflow/.local/lib/python3.8/site-packages/oracledb/errors.py", line 103, in _raise_err > raise exc_type(_Error(message)) from cause > oracledb.exceptions.NotSupportedError: DPY-3012: national character set id 871 is not supported by python-oracledb in thin mode ### Are you willing to submit PR? - [ ] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
