[
https://issues.apache.org/jira/browse/AIRFLOW-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Fokko Driesprong resolved AIRFLOW-2500.
---------------------------------------
Resolution: Fixed
> Fix MySqlToHiveTransfer to transfer unsigned type properly
> ----------------------------------------------------------
>
> Key: AIRFLOW-2500
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2500
> Project: Apache Airflow
> Issue Type: Bug
> Components: operators
> Reporter: Kengo Seki
> Assignee: Kengo Seki
> Priority: Major
>
> Given the following table,
> {code}
> mysql> USE airflow_ci
> Database changed
> mysql> DESC users;
> +-------+------------------+------+-----+---------+-------+
> | Field | Type | Null | Key | Default | Extra |
> +-------+------------------+------+-----+---------+-------+
> | id | int(10) unsigned | YES | | NULL | |
> +-------+------------------+------+-----+---------+-------+
> 1 row in set (0.00 sec)
> mysql> SELECT * FROM users;
> +------------+
> | id |
> +------------+
> | 2147483647 |
> | 2147483648 |
> +------------+
> 2 rows in set (0.00 sec)
> {code}
> executing MySqlToHiveTransfer:
> {code}
> In [1]: from airflow.operators.mysql_to_hive import MySqlToHiveTransfer
> ...: t = MySqlToHiveTransfer(sql="SELECT * FROM airflow_ci.users",
> hive_table="users", recreate=True, task_id="t")
> ...: t.execute(None)
> ...:
> [2018-05-21 12:14:09,137] {base_hook.py:83} INFO - Using connection to:
> localhost
> [2018-05-21 12:14:09,140] {base_hook.py:83} INFO - Using connection to:
> localhost
> [2018-05-21 12:14:09,146] {hive_hooks.py:427} INFO - DROP TABLE IF EXISTS
> users;
> CREATE TABLE IF NOT EXISTS users (
> id INT)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ''
> STORED AS textfile
> ;
> (snip)
> [2018-05-21 12:14:31,667] {hive_hooks.py:233} INFO - Loading data to table
> default.users
> [2018-05-21 12:14:32,364] {hive_hooks.py:233} INFO - Table default.users
> stats: [numFiles=1, numRows=0, totalSize=24, rawDataSize=0]
> [2018-05-21 12:14:32,365] {hive_hooks.py:233} INFO - OK
> [2018-05-21 12:14:32,366] {hive_hooks.py:233} INFO - Time taken: 1.299 seconds
> {code}
> ... then the value greater than the upper bound for signed integer is not
> properly fetched from Hive.
> {code}
> hive> SELECT * FROM users;
> OK
> 2147483647
> NULL
> Time taken: 2.461 seconds, Fetched: 2 row(s)
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)