[ 
https://issues.apache.org/jira/browse/AIRFLOW-3180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liwei li closed AIRFLOW-3180.
-----------------------------
    Resolution: Not A Bug

> Chinese characters become gibberish when a BashOperator runs a beeline 
> command to insert data into a Hive table
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-3180
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3180
>             Project: Apache Airflow
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: liwei li
>            Priority: Blocker
>
> With BashOperator, I use beeline to insert data into Hive. The HQL contains 
> Chinese characters. After the DAG run succeeds, the Hive data contains 
> unreadable characters.
> Python:
> {code:java}
> # -*- coding: utf-8 -*-
> import airflow
> from airflow.models import DAG
> from airflow.operators.bash_operator import BashOperator
> from airflow.operators.python_operator import BranchPythonOperator
> from datetime import datetime
> import time
> from datetime import timedelta
> import sys
> import pendulum
> local_tz = pendulum.timezone("Asia/Shanghai")
> reload(sys)
> sys.setdefaultencoding('utf-8')
> default_args = {
>     'owner': 'airflow',
>     'depends_on_past': False,
>     'start_date': datetime(2018, 10, 9, 19, 22, 20, tzinfo=local_tz),
>     'retries': 0,
> }
> dag = DAG(
>     'inserthiveutf8',
>     default_args=default_args,
>     description='null',
>     catchup=False,
>     schedule_interval=None,
> )
> # The trailing backslash keeps the beeline invocation as one shell command.
> adf37 = r"""
> beeline -u "jdbc:hive2://10.138.***.***:30010/di_zz" -n "*****" -p "*****" \
>   -e "insert into   di_zz.tt_wms_inout_detail_new(fac_id) values ('中')"
>   """
> abcd8491539084126613 = BashOperator(
>     task_id='abcd8491539084126613',
>     bash_command=adf37,
>     dag=dag){code}
> I have tried this:
> {code:java}
> abcd8491539084126613 = BashOperator(
>     task_id='abcd8491539084126613',
>     bash_command="sh ~/insert.sh ",
>     dag=dag){code}
> this:
> {code:java}
> export LANG=en_US.UTF-8
> beeline -u "jdbc:hive2://10.138.***.***:30010/di_zz" -n "*****" -p "*****" \
>   -e 'insert into di_zz.tt_wms_inout_detail_new(fac_id) values ("中")'{code}
> this:
> {code:java}
> beeline -u "jdbc:hive2://10.138.***.***:30010/di_zz" -n "*****" -p "*****" \
>   -f ~/hql.sql{code}
> log:
> {code:java}
> [2018-10-09 21:00:58,485] {bash_operator.py:110} INFO - INFO  : Compiling command(queryId=hive_20181009210000_89390a92-c4de-413f-9958-4d7da1065ef9): insert into   di_zz.tt_wms_inout_detail_new(fac_id) values ("???")
> [2018-10-09 21:00:58,485] {bash_operator.py:110} INFO - INFO  : Semantic Analysis Completed
> [2018-10-09 21:00:58,486] {bash_operator.py:110} INFO - INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_col0, type:timestamp, comment:null), FieldSchema(name:_col1, type:string, comment:null), FieldSchema(name:_col2, type:void, comment:null), FieldSchema(name:_col3, type:void, comment:null), FieldSchema(name:_col4, type:void, comment:null), FieldSchema(name:_col5, type:void, comment:null), FieldSchema(name:_col6, type:void, comment:null), FieldSchema(name:_col7, type:bigint, comment:null), FieldSchema(name:_col8, type:bigint, comment:null), FieldSchema(name:_col9, type:bigint, comment:null), FieldSchema(name:_col10, type:void, comment:null), FieldSchema(name:_col11, type:timestamp, comment:null)], properties:null)
> [2018-10-09 21:00:58,486] {bash_operator.py:110} INFO - INFO  : Completed compiling command(queryId=hive_20181009210000_89390a92-c4de-413f-9958-4d7da1065ef9); Time taken: 0.291 seconds
> [2018-10-09 21:00:58,486] {bash_operator.py:110} INFO - INFO  : Executing command(queryId=hive_20181009210000_89390a92-c4de-413f-9958-4d7da1065ef9): insert into   di_zz.tt_wms_inout_detail_new(fac_id) values ("???")
> [2018-10-09 21:00:58,486] {bash_operator.py:110} INFO - INFO  : Query ID = hive_20181009210000_89390a92-c4de-413f-9958-4d7da1065ef9
> {code}
> data:
> {code:java}
> +------------------------------------+---------------------------------+-----------------------------------+----------------------------------+------------------------------------+------------------------------------+---------------------------------------+-----------------------------------+----------------------------------+-----------------------------------+-------------------------------------------+--------------------------------------+--+
> | tt_wms_inout_detail_new.stat_date  | tt_wms_inout_detail_new.fac_id  | tt_wms_inout_detail_new.fac_name  | tt_wms_inout_detail_new.ware_id  | tt_wms_inout_detail_new.ware_name  | tt_wms_inout_detail_new.ware_type  | tt_wms_inout_detail_new.product_code  | tt_wms_inout_detail_new.ware_cnt  | tt_wms_inout_detail_new.ware_in  | tt_wms_inout_detail_new.ware_out  | tt_wms_inout_detail_new.sap_factory_name  | tt_wms_inout_detail_new.di_etl_date  |
> +------------------------------------+---------------------------------+-----------------------------------+----------------------------------+------------------------------------+------------------------------------+---------------------------------------+-----------------------------------+----------------------------------+-----------------------------------+-------------------------------------------+--------------------------------------+--+
> | NULL                               | ���                             | NULL                              | NULL                             | NULL                               | NULL                               | NULL                                  | NULL                              | NULL                             | NULL                              | NULL                                      | NULL                                 |
> {code}
> This is a simple example; the production script is more complex.
> Thanks in advance.
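
A note on the symptom, offered as one possible reading and not as the confirmed cause (the ticket only records the resolution as Not A Bug): the log shows "???" and the table shows three replacement characters for the single character 中. That pattern is what one would expect if the character's three UTF-8 bytes were interpreted by an ASCII-only process somewhere between the worker and beeline. The Python sketch below only reproduces that byte-level effect. The function name garble_as_ascii is hypothetical, and nothing here touches Airflow, beeline, or Hive.

```python
# -*- coding: utf-8 -*-
# Illustration only: assumes the garbling comes from an ASCII-only decode
# step, which the ticket itself does not confirm.


def garble_as_ascii(text):
    """Simulate UTF-8 text being read and re-emitted by an ASCII-only process."""
    utf8_bytes = text.encode("utf-8")
    # Each non-ASCII byte is replaced by U+FFFD when decoded as ASCII.
    decoded = utf8_bytes.decode("ascii", errors="replace")
    # Writing that text back out as ASCII turns each U+FFFD into "?".
    reencoded = decoded.encode("ascii", errors="replace").decode("ascii")
    return utf8_bytes, decoded, reencoded


utf8_bytes, decoded, reencoded = garble_as_ascii(u"中")
print(len(utf8_bytes))  # number of UTF-8 bytes for the one character
print(reencoded)        # what an ASCII-only log line would show
```

If this reading applies, the locale of the environment that launches the task (LANG or LC_ALL, which the second attempt above sets inside the script) would be the first thing to check.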



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
