[ https://issues.apache.org/jira/browse/AIRFLOW-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16786836#comment-16786836 ]
ASF subversion and git services commented on AIRFLOW-2641:
----------------------------------------------------------
Commit e7343a3db108f5aea8467c73ca9856819e130d73 in airflow's branch
refs/heads/v1-10-stable from OmerJog
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=e7343a3 ]
[AIRFLOW-2641] Fix MySqlToHiveTransfer to handle MySQL DECIMAL correctly
> Fix MySqlToHiveTransfer to handle MySQL DECIMAL correctly
> ---------------------------------------------------------
>
> Key: AIRFLOW-2641
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2641
> Project: Apache Airflow
> Issue Type: Bug
> Reporter: Kengo Seki
> Priority: Major
> Fix For: 1.10.3
>
>
> The {{t.DECIMAL}} entry in this mapping
> {code:title=airflow/operators/mysql_to_hive.py}
> @classmethod
> def type_map(cls, mysql_type):
>     t = MySQLdb.constants.FIELD_TYPE
>     d = {
>         t.BIT: 'INT',
>         t.DECIMAL: 'DOUBLE',
> {code}
> appears intended to map MySQL DECIMAL to Hive DOUBLE, but the Hive column is
> created as STRING instead.
> {code}
> mysql> DESC t;
> +-------+---------------+------+-----+---------+-------+
> | Field | Type | Null | Key | Default | Extra |
> +-------+---------------+------+-----+---------+-------+
> | c | decimal(10,0) | YES | | NULL | |
> +-------+---------------+------+-----+---------+-------+
> 1 row in set (0.00 sec)
> {code}
> {code}
> In [1]: from airflow.operators.mysql_to_hive import MySqlToHiveTransfer
> In [2]: t = MySqlToHiveTransfer(mysql_conn_id="airflow_db", sql="SELECT *
> FROM t", hive_table="t", recreate=True, task_id="t", ignore_ti_state=True)
> In [3]: t.execute(None)
> [2018-06-18 23:37:53,193] {base_hook.py:83} INFO - Using connection to:
> localhost
> [2018-06-18 23:37:53,199] {hive_hooks.py:429} INFO - DROP TABLE IF EXISTS t;
> CREATE TABLE IF NOT EXISTS t (
> c STRING)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ''
> STORED AS textfile
> ;
> (snip)
> [2018-06-18 23:38:25,048] {hive_hooks.py:235} INFO - Loading data to table
> default.t
> [2018-06-18 23:38:25,866] {hive_hooks.py:235} INFO - Table default.t stats:
> [numFiles=1, numRows=0, totalSize=0, rawDataSize=0]
> [2018-06-18 23:38:25,868] {hive_hooks.py:235} INFO - OK
> [2018-06-18 23:38:25,871] {hive_hooks.py:235} INFO - Time taken: 1.498 seconds
> {code}
> {code}
> $ hive -e 'DESC t'
> Logging initialized using configuration in
> file:/etc/hive/conf.dist/hive-log4j.properties
> OK
> c string
> Time taken: 2.342 seconds, Fetched: 1 row(s)
> {code}
> This is because {{MySQLdb.constants.FIELD_TYPE.DECIMAL}} is not the type code
> reported for DECIMAL columns on MySQL 5.0+. Those columns are reported as
> {{MySQLdb.constants.FIELD_TYPE.NEWDECIMAL}}, so that constant should be mapped here.
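
The mapping described in the issue can be sketched as follows. This is a minimal illustration, not the committed patch: `FieldType` is a stand-in for `MySQLdb.constants.FIELD_TYPE` (numeric values taken from MySQL's field type enumeration) so the snippet runs without a MySQL client, and the module-level `type_map` with its STRING fallback is an assumption based on the `c STRING` column seen in the log above.

```python
# Stand-in for MySQLdb.constants.FIELD_TYPE (assumption: used only so this
# sketch runs without the MySQLdb package installed).
class FieldType:
    DECIMAL = 0       # legacy DECIMAL type code
    BIT = 16
    NEWDECIMAL = 246  # type code reported for DECIMAL columns on MySQL 5.0+


def type_map(mysql_type):
    """Map a MySQL field type code to a Hive column type (illustrative helper)."""
    d = {
        FieldType.BIT: 'INT',
        FieldType.DECIMAL: 'DOUBLE',
        # Without this entry, a decimal(10,0) column is not found in the map.
        FieldType.NEWDECIMAL: 'DOUBLE',
    }
    # Assumed fallback: unmapped type codes become STRING.
    return d.get(mysql_type, 'STRING')


if __name__ == '__main__':
    print(type_map(FieldType.NEWDECIMAL))
```

Keeping the legacy `DECIMAL` entry alongside `NEWDECIMAL` is a design choice in this sketch, so that either type code maps to the same Hive type.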