NathanFarmer opened a new issue #9918:
URL: https://github.com/apache/airflow/issues/9918


   **Description**
   
   Allow a custom `outputtypehandler` to be set through the OracleHook so that 
corrupted Oracle data can be queried, as described in the cx_Oracle documentation 
[here](https://cx-oracle.readthedocs.io/en/latest/user_guide/sql_execution.html?highlight=outputtypehandler#querying-corrupt-data).
   
   **Use case / motivation**
   
   I am querying an entire table that is very old. It contains some corrupted 
data, and I do not have the privileges to change that data or to create a view 
over it. Whenever I select the data using the OracleHook, I get the error message 
"ValueError: year -4712 is out of range". I would like to handle this by 
returning None for the affected records, as shown in this cx_Oracle solution: 
[https://github.com/oracle/python-cx_Oracle/issues/347#issuecomment-525253126](https://github.com/oracle/python-cx_Oracle/issues/347#issuecomment-525253126).
   
   **What I tried**
   
   I tried the following code, which raised no errors other than the same 
ValueError, but it never called either my OutputHandler or my 
DateTimeConverter function.
   
   ```
   from airflow.hooks.oracle_hook import OracleHook
   import cx_Oracle
   
   from datetime import datetime
   import os
   os.environ['NLS_DATE_FORMAT'] = 'YYYY-MM-DD HH24:MI:SS'
   
   # Dealing with invalid years in the database
   def DateTimeConverter(value):
       print('DateTimeConverter was called')
       if value.startswith('4712'):
           return None
       return datetime.strptime(value, '%Y-%m-%d %H:%M:%S')
   
   def OutputHandler(cursor, name, defaulttype, length, precision, scale):
       print('OutputHandler was called')
       if defaulttype == cx_Oracle.DATETIME:
           return cursor.var(cx_Oracle.STRING, arraysize=cursor.arraysize,
                             outconverter=DateTimeConverter)
   
   def extract(extract_connection):
       # Return the extracted records
       extract_records_query = 'SELECT col1, col2, col3 FROM table'
       o_extract_hook = OracleHook(oracle_conn_id=extract_connection)
       o_extract_hook.outputtypehandler = OutputHandler
       print('Extract started')
       extract_records = o_extract_hook.get_records(sql=extract_records_query)
       return extract_records
   ```
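
One likely reason neither function ran: cx_Oracle looks up `outputtypehandler` on the connection (or cursor) object, while the code above sets it on the hook, which does not appear to pass it along. Until the hook supports this directly, a possible workaround is to set the handler on the connection returned by `get_conn()`. The sketch below assumes the hook exposes `get_conn()` returning a DB-API connection, as OracleHook does; the helper name `get_records_with_handler` is hypothetical and not part of Airflow.

```python
from contextlib import closing


def get_records_with_handler(hook, sql, output_type_handler):
    """Run `sql` with a cx_Oracle output type handler attached.

    Hypothetical helper: `hook` is assumed to expose `get_conn()` returning
    a DB-API connection, as OracleHook does.
    """
    with closing(hook.get_conn()) as conn:
        # cx_Oracle reads the handler from the connection (or cursor),
        # so it is set here rather than on the hook object.
        conn.outputtypehandler = output_type_handler
        with closing(conn.cursor()) as cursor:
            cursor.execute(sql)
            return cursor.fetchall()
```

With the functions from the snippet above, the call inside `extract` would become `get_records_with_handler(o_extract_hook, extract_records_query, OutputHandler)`.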
   

