Richard Lee created AIRFLOW-1171:
------------------------------------

             Summary: Encoding error for non latin-1 Postgres database
                 Key: AIRFLOW-1171
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1171
             Project: Apache Airflow
          Issue Type: Bug
          Components: db, hooks
    Affects Versions: Airflow 1.8
         Environment: macOS 10.12.5
Python 2.7.12
Postgres 9.6.1

However, these are irrelevant to this issue.
            Reporter: Richard Lee
            Assignee: Richard Lee


There's [a known issue|https://github.com/psycopg/psycopg2/issues/331] in 
psycopg2: by default, Airflow ignores the database's encoding settings, which 
results in an encoding error if any database cell contains non-latin-1 
content.

Reference stack trace:
{code}
  File "dags/recipe_hourly_pageviews.py", line 73, in <module>
    dag.cli()
  File "/Users/dlackty/.pyenv/versions/2.7.12/lib/python2.7/site-packages/airflow/models.py", line 3339, in cli
    args.func(args, self)
  File "/Users/dlackty/.pyenv/versions/2.7.12/lib/python2.7/site-packages/airflow/bin/cli.py", line 585, in test
    ti.run(ignore_task_deps=True, ignore_ti_state=True, test_mode=True)
  File "/Users/dlackty/.pyenv/versions/2.7.12/lib/python2.7/site-packages/airflow/utils/db.py", line 53, in wrapper
    result = func(*args, **kwargs)
  File "/Users/dlackty/.pyenv/versions/2.7.12/lib/python2.7/site-packages/airflow/models.py", line 1374, in run
    result = task_copy.execute(context=context)
  File "/Users/dlackty/.pyenv/versions/2.7.12/lib/python2.7/site-packages/airflow/operators/generic_transfer.py", line 78, in execute
    destination_hook.insert_rows(table=self.destination_table, rows=results)
  File "/Users/dlackty/.pyenv/versions/2.7.12/lib/python2.7/site-packages/airflow/hooks/dbapi_hook.py", line 215, in insert_rows
    l.append(self._serialize_cell(cell, conn))
  File "/Users/dlackty/.pyenv/versions/2.7.12/lib/python2.7/site-packages/airflow/hooks/postgres_hook.py", line 70, in _serialize_cell
    return psycopg2.extensions.adapt(cell).getquoted().decode('utf-8')
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 6-10: ordinal not in range(256)
{code}
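
The failure mode in the trace can be reproduced without a database. The sketch below is only an illustration: `quote_text` is a hypothetical stand-in for a text adapter whose encoding defaults to latin-1, not psycopg2's actual implementation, and the sample cell value and the assumption that the database uses UTF-8 are made up for the example. As far as I can tell, psycopg2 adapters pick up the connection's encoding only once `prepare(conn)` has been called on them, which is the step the hook appears to skip.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for a text adapter. This is NOT psycopg2 code; it only
# mimics "encode the value, then wrap it in quotes" with a latin-1 default.


def quote_text(value, encoding="latin-1"):
    """Double embedded single quotes, encode the text, and wrap it in quotes."""
    escaped = value.replace(u"'", u"''")
    return b"'" + escaped.encode(encoding) + b"'"


# Made-up cell content: a Latin-1 character followed by three CJK characters.
cell = u"caf\u00e9 \u65e5\u672c\u8a9e"

# With the latin-1 default, the CJK characters cannot be encoded.
try:
    quote_text(cell)
    failed = False
    error_message = ""
except UnicodeEncodeError as exc:
    failed = True
    error_message = str(exc)

# With the database's real encoding (assumed here to be UTF-8), quoting works
# and the result decodes cleanly, as the hook's .decode('utf-8') expects.
quoted = quote_text(cell, encoding="utf-8").decode("utf-8")
```

The point of the sketch is that the error comes from the encoding step inside the adapter, before the hook's own `.decode('utf-8')` ever runs, so changing the decode call alone would not help.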



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
