Rahul Singh created AIRFLOW-1496:
------------------------------------

             Summary: Druid hook unable to load data from hdfs
                 Key: AIRFLOW-1496
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1496
             Project: Apache Airflow
          Issue Type: Bug
          Components: hooks
    Affects Versions: 1.8.0
         Environment: RHEL 6.7, Python 2.7.13
            Reporter: Rahul Singh
Hi, I am trying to use the Druid hook to load data from HDFS into Druid. Below is my DAG script:

    from datetime import datetime, timedelta
    import json

    from airflow.hooks import HttpHook, DruidHook
    from airflow.operators import PythonOperator
    from airflow.models import DAG


    def check_druid_con():
        dr_hook = DruidHook(druid_ingest_conn_id='DRUID_INDEX',
                            druid_query_conn_id='DRUID_QUERY')
        dr_hook.load_from_hdfs(
            "druid_airflow",
            "hdfs://10.55.26.71/demanddata/demand2.tsv",
            "stay_date",
            ["channel", "rate"],
            "2016-12-11/2017-12-13",
            1,
            -1,
            metric_spec=[{"name": "count", "type": "count"}],
            hadoop_dependency_coordinates="org.apache.hadoop:hadoop-client:2.7.3")


    default_args = {
        'owner': 'TC',
        'start_date': datetime(2017, 8, 7),
        'retries': 1,
        'retry_delay': timedelta(minutes=5)
    }

    dag = DAG('druid_data_load', default_args=default_args)

    druid_task1 = PythonOperator(task_id='check_druid',
                                 python_callable=check_druid_con,
                                 dag=dag)

I keep getting this error:

    TypeError: load_from_hdfs() takes at least 10 arguments (10 given)

However, I have given 10 arguments to load_from_hdfs, yet it still errors out. Please help.

Regards,
Rahul

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
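A note on reading that error message: on Python 2, the TypeError for a bound-method call counts the implicit self among both the "takes at least N" and the "(N given)" totals, and keyword arguments also count toward "given". So "takes at least 10 arguments (10 given)" does not mean the call was correct; it can mean one required positional argument is still missing once keywords are excluded. The sketch below illustrates the counting with a hypothetical stand-in class (FakeHook is not the real DruidHook signature, which should be checked against the installed airflow/hooks/druid_hook.py):

```python
import inspect


class FakeHook(object):
    # Hypothetical stand-in -- NOT the real DruidHook signature.
    # Three required positional parameters plus one keyword default.
    def load_from_hdfs(self, datasource, static_path, ts_dim,
                       metric_spec=None):
        return datasource


# Inspecting the unbound function shows that `self` is part of the
# parameter list, which is why Python 2's arity message includes it.
required = [
    name
    for name, p in inspect.signature(FakeHook.load_from_hdfs).parameters.items()
    if p.default is inspect.Parameter.empty
]
print(required)  # ['self', 'datasource', 'static_path', 'ts_dim']

# Omitting one required positional argument still raises TypeError,
# even though a keyword argument was "given":
try:
    FakeHook().load_from_hdfs("ds", "hdfs://path", metric_spec=[])
except TypeError as exc:
    print("TypeError raised:", exc)
```

Comparing the positional arguments actually passed in the DAG against the required parameters of load_from_hdfs in the installed Airflow 1.8.0 source is the quickest way to find which one is missing.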