Rahul Singh created AIRFLOW-1496:
------------------------------------

             Summary: Druid hook unable to load data from hdfs
                 Key: AIRFLOW-1496
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1496
             Project: Apache Airflow
          Issue Type: Bug
          Components: hooks
    Affects Versions: 1.8.0
         Environment: RHEL 6.7, Python 2.7.13
            Reporter: Rahul Singh


Hi,

I am trying to use the Druid hook to load data from HDFS into Druid. Below is
my DAG script:

from datetime import datetime, timedelta
import json
from airflow.hooks import HttpHook, DruidHook
from airflow.operators import PythonOperator
from airflow.models import DAG

def check_druid_con():
    dr_hook = DruidHook(druid_ingest_conn_id='DRUID_INDEX',
                        druid_query_conn_id='DRUID_QUERY')
    dr_hook.load_from_hdfs(
        "druid_airflow",
        "hdfs://10.55.26.71/demanddata/demand2.tsv",
        "stay_date",
        ["channel", "rate"],
        "2016-12-11/2017-12-13",
        1,
        -1,
        metric_spec=[{"name": "count", "type": "count"}],
        hadoop_dependency_coordinates="org.apache.hadoop:hadoop-client:2.7.3")

default_args = {
    'owner': 'TC',
    'start_date': datetime(2017, 8, 7),
    'retries': 1,
    'retry_delay': timedelta(minutes=5)
}
dag = DAG('druid_data_load', default_args=default_args)
druid_task1 = PythonOperator(task_id='check_druid',
                             python_callable=check_druid_con,
                             dag=dag)


I keep getting the error TypeError: load_from_hdfs() takes at least 10 arguments
(10 given). However, I have passed 10 arguments to load_from_hdfs, and it still
errors out. Please help.
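
For reference, in Python 2 that message counts the bound self in both numbers,
and keyword arguments count towards "given" without satisfying missing
positional parameters. A minimal standalone snippet (the names below are made
up for illustration, not taken from the hook) reproduces the exact wording:

class FakeHook(object):
    # nine required parameters besides self, plus two optional keyword ones
    def load(self, a, b, c, d, e, f, g, h, i, spec=None, coords=None):
        pass

# self + 7 positional + 2 keyword = 10 "given", but h and i are still missing:
# TypeError: load() takes at least 10 arguments (10 given)
FakeHook().load(1, 2, 3, 4, 5, 6, 7, spec=[], coords="x")

If load_from_hdfs in 1.8.0 follows the same pattern, it would be expecting two
more positional arguments (presumably query_granularity and
segment_granularity) ahead of the metric_spec and
hadoop_dependency_coordinates keywords, though I have not confirmed that
against the hook's source.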

Regards
Rahul



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
