with BashOperator ,i use beeline to insert data into hive ,hql with chinese
characters ,after dag run success,hive data contain unreadable code.
python :
# -*- coding: utf-8 -*-
import airflow
from airflow.models import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.operators.python_operator import BranchPythonOperator
from datetime import datetime
import time
from datetime import timedelta
import sys
import pendulum
local_tz = pendulum.timezone("Asia/Shanghai")
reload(sys)
sys.setdefaultencoding('utf-8')
default_args = {
'owner': 'airflow',
'depends_on_past':False,
'start_date':datetime(2018,10,9,19,22,20,tzinfo=local_tz),
'retries':0
}
dag = DAG(
'inserthiveutf8',
default_args=default_args,
description='null',
catchup=False,
schedule_interval=None
)
adf37 = r"""
beeline -u "jdbc:hive2://10.138.***.***:30010/di_zz" -n "*****" -p
"*****" -e 'insert into di_zz.tt_wms_inout_detail_new(fac_id)
values ("中")'
"""
abcd8491539084126613 =BashOperator(
task_id='abcd8491539084126613',
bash_command=adf37,
dag=dag)
i have tried this:
abcd8491539084126613 =BashOperator(
task_id='abcd8491539084126613',
bash_command="sh ~/insert.sh ",
dag=dag)
this:
export LANG=en_US.UTF-8
beeline -u "jdbc:hive2://10.138.***.***:30010/di_zz" -n "*****" -p
"*****" -e 'insert into di_zz.tt_wms_inout_detail_new(fac_id)
values ("中")'
this:
beeline -u "jdbc:hive2://10.138.***.***:30010/di_zz" -n "*****" -p
"*****" -f ~/hql.sql
log:
[2018-10-09 21:00:58,485] {bash_operator.py:110} INFO - INFO :
Compiling
command(queryId=hive_20181009210000_89390a92-c4de-413f-9958-4d7da1065ef9):
insert into di_zz.tt_wms_inout_detail_new(fac_id) values ("???")
[2018-10-09 21:00:58,485] {bash_operator.py:110} INFO - INFO :
Semantic Analysis Completed
[2018-10-09 21:00:58,486] {bash_operator.py:110} INFO - INFO :
Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_col0,
type:timestamp, comment:null), FieldSchema(name:_col1, type:string,
comment:null), FieldSchema(name:_col2, type:void, comment:null),
FieldSchema(name:_col3, type:void, comment:null),
FieldSchema(name:_col4, type:void, comment:null),
FieldSchema(name:_col5, type:void, comment:null),
FieldSchema(name:_col6, type:void, comment:null),
FieldSchema(name:_col7, type:bigint, comment:null),
FieldSchema(name:_col8, type:bigint, comment:null),
FieldSchema(name:_col9, type:bigint, comment:null),
FieldSchema(name:_col10, type:void, comment:null),
FieldSchema(name:_col11, type:timestamp, comment:null)],
properties:null)
[2018-10-09 21:00:58,486] {bash_operator.py:110} INFO - INFO :
Completed compiling
command(queryId=hive_20181009210000_89390a92-c4de-413f-9958-4d7da1065ef9);
Time taken: 0.291 seconds
[2018-10-09 21:00:58,486] {bash_operator.py:110} INFO - INFO :
Executing
command(queryId=hive_20181009210000_89390a92-c4de-413f-9958-4d7da1065ef9):
insert into di_zz.tt_wms_inout_detail_new(fac_id) values ("???")
[2018-10-09 21:00:58,486] {bash_operator.py:110} INFO - INFO : Query
ID = hive_20181009210000_89390a92-c4de-413f-9958-4d7da1065ef9
data:
+------------------------------------+---------------------------------+-----------------------------------+----------------------------------+------------------------------------+------------------------------------+---------------------------------------+-----------------------------------+----------------------------------+-----------------------------------+-------------------------------------------+--------------------------------------+--+
| tt_wms_inout_detail_new.stat_date | tt_wms_inout_detail_new.fac_id
| tt_wms_inout_detail_new.fac_name | tt_wms_inout_detail_new.ware_id
| tt_wms_inout_detail_new.ware_name |
tt_wms_inout_detail_new.ware_type |
tt_wms_inout_detail_new.product_code |
tt_wms_inout_detail_new.ware_cnt | tt_wms_inout_detail_new.ware_in |
tt_wms_inout_detail_new.ware_out |
tt_wms_inout_detail_new.sap_factory_name |
tt_wms_inout_detail_new.di_etl_date |
+------------------------------------+---------------------------------+-----------------------------------+----------------------------------+------------------------------------+------------------------------------+---------------------------------------+-----------------------------------+----------------------------------+-----------------------------------+-------------------------------------------+--------------------------------------+--+
| NULL | ���
| NULL | NULL
| NULL | NULL
| NULL | NULL
| NULL | NULL
| NULL | NULL
|
I am searching for a long time on net. But no use. Please help or try to
give some ideas how to achieve this.
Thanks in advance.