Kengo Seki created AIRFLOW-2412:
-----------------------------------

             Summary: Fix HiveCliHook.load_file to address HIVE-10541
                 Key: AIRFLOW-2412
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2412
             Project: Apache Airflow
          Issue Type: Improvement
          Components: hive_hooks, hooks
            Reporter: Kengo Seki
            Assignee: Kengo Seki

HiveCliHook.load_file generates a query file and executes it with the {{-f}} option, but that file has no newline at the end. In that case, the beeline bundled with Hive versions earlier than 1.3 does not execute the last query, due to [a bug|https://issues.apache.org/jira/browse/HIVE-10541].

Example:

Register a connection and prepare the file to be loaded:

{code}
$ airflow connections -a --conn_id hive_cli --conn_type hive_cli --conn_host localhost --conn_port 10000 --conn_schema default --conn_extra '{"use_beeline": true, "auth": "none"}'
[2018-05-02 18:38:48,208] {__init__.py:48} INFO - Using executor SequentialExecutor
Successfully added `conn_id`=hive_cli : hive_cli://:@localhost:10000/default
$ cat /tmp/t
0
1
2
3
4
5
6
7
8
9
{code}

Execute load_file via ipython:

{code}
In [1]: from airflow.hooks.hive_hooks import HiveCliHook
In [2]: hook = HiveCliHook("hive_cli")
[2018-05-02 18:50:42,161] {base_hook.py:85} INFO - Using connection to: localhost
In [3]: hook.load_file(field_dict={"c": "int"}, filepath="/tmp/t", table="foo")
(snip)
[2018-05-02 18:51:06,043] {hive_hooks.py:216} INFO - beeline -u jdbc:hive2://localhost:10000/default;auth=none -f /tmp/airflow_hiveop_75jxXU/tmpmvhi0M
[2018-05-02 18:51:07,397] {hive_hooks.py:231} INFO - Connecting to jdbc:hive2://localhost:10000/default;auth=none
[2018-05-02 18:51:07,598] {hive_hooks.py:231} INFO - Connected to: Apache Hive (version 1.2.1)
[2018-05-02 18:51:07,600] {hive_hooks.py:231} INFO - Driver: Hive JDBC (version 1.2.1)
[2018-05-02 18:51:07,600] {hive_hooks.py:231} INFO - Transaction isolation: TRANSACTION_REPEATABLE_READ
[2018-05-02 18:51:07,644] {hive_hooks.py:231} INFO - 0: jdbc:hive2://localhost:10000/default> USE default;
[2018-05-02 18:51:07,749] {hive_hooks.py:231} INFO - No rows affected (0.104 seconds)
[2018-05-02 18:51:07,773] {hive_hooks.py:231} INFO - 0: jdbc:hive2://localhost:10000/defTABLE fooD DATA LOCAL INPATH '/tmp/t' OVERWRITE INTO
[2018-05-02 18:51:07,773] {hive_hooks.py:231} INFO - Closing: 0: jdbc:hive2://localhost:10000/default;auth=none
{code}

The Hive table is created, but no data is loaded:

{code}
0: jdbc:hive2://localhost:10000/default> SHOW TABLES;
+-----------+--+
| tab_name  |
+-----------+--+
| foo       |
+-----------+--+
1 row selected (0.037 seconds)
0: jdbc:hive2://localhost:10000/default> SELECT * FROM foo;
+--------+--+
| foo.c  |
+--------+--+
+--------+--+
No rows selected (0.1 seconds)
{code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
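
One possible way to work around the behaviour described in this issue is to make sure the generated query file always ends with a newline before it is handed to {{beeline -f}}. The following is only a minimal sketch of that idea, not the actual Airflow patch: {{write_query_file}} is a hypothetical helper, and the HQL string is an illustrative stand-in for the query that load_file generates.

```python
import os
import tempfile


def write_query_file(hql: str, directory: str) -> str:
    """Write hql to a temporary file that is guaranteed to end with a newline.

    Hypothetical helper for illustration; it is not part of Airflow.
    """
    # Affected beeline versions skip the last statement when the file
    # lacks a trailing newline (HIVE-10541), so append one if needed.
    if not hql.endswith("\n"):
        hql += "\n"
    fd, path = tempfile.mkstemp(dir=directory, suffix=".hql")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(hql)
    return path


with tempfile.TemporaryDirectory() as tmp_dir:
    # Illustrative query text only; the real query is built by load_file.
    query = "USE default;\nLOAD DATA LOCAL INPATH '/tmp/t' OVERWRITE INTO TABLE foo;"
    path = write_query_file(query, tmp_dir)
    with open(path, encoding="utf-8") as f:
        content = f.read()
```

Appending the newline only when it is missing keeps the file unchanged for queries that already end with one.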