arifpratama398 opened a new issue #10456:
URL: https://github.com/apache/druid/issues/10456

"Index datasource from Hadoop 3.1.1 hdfs failed in kerberized cluster"

### Affected Version

0.18.1

### Description

I am trying to index data from HDFS into Druid, but the task fails.

Command:

```
curl --negotiate -u:[email protected] -b /tmp/krb5cc_1008 -X 'POST' -H 'Content-Type:application/json' -d @/home/druid/wikipedia-index-hadoop.json http://XXX.XXX.com:8390/druid/indexer/v1/task
```

JSON spec:

```
{
  "type" : "index_hadoop",
  "spec" : {
    "dataSchema" : {
      "dataSource" : "wikipedia_hadoop_29092020",
      "parser" : {
        "type" : "hadoopyString",
        "parseSpec" : {
          "format" : "json",
          "dimensionsSpec" : {
            "dimensions" : [
              "channel",
              "cityName",
              "comment",
              "countryIsoCode",
              "countryName",
              "isAnonymous",
              "isMinor",
              "isNew",
              "isRobot",
              "isUnpatrolled",
              "metroCode",
              "namespace",
              "page",
              "regionIsoCode",
              "regionName",
              "user",
              { "name": "added", "type": "long" },
              { "name": "deleted", "type": "long" },
              { "name": "delta", "type": "long" }
            ]
          },
          "timestampSpec" : {
            "format" : "auto",
            "column" : "time"
          }
        }
      },
      "metricsSpec" : [],
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "day",
        "queryGranularity" : "none",
        "intervals" : ["2015-09-12/2015-09-13"],
        "rollup" : false
      }
    },
    "ioConfig" : {
      "type" : "hadoop",
      "inputSpec" : {
        "type" : "static",
        "paths" : "/user/druid/quickstart/wikiticker-2015-09-12-sampled.json.gz"
      }
    },
    "tuningConfig" : {
      "type" : "hadoop",
      "partitionsSpec" : {
        "type" : "hashed",
        "targetPartitionSize" : 5000000
      },
      "forceExtendableShardSpecs" : true,
      "jobProperties" : {
        "fs.default.name" : "hdfs://nn",
        "fs.defaultFS" : "hdfs://nn/user/druid",
        "dfs.datanode.address" : "0.0.0.0:50010",
        "dfs.client.use.datanode.hostname" : "true",
        "dfs.datanode.use.datanode.hostname" : "true",
        "yarn.resourcemanager.hostname" : "xxx.xxx.com",
        "yarn.nodemanager.vmem-check-enabled" : "false",
        "mapreduce.map.java.opts" : "-Duser.timezone=UTC -Dfile.encoding=UTF-8",
        "mapreduce.job.user.classpath.first" : "true",
        "mapreduce.reduce.java.opts" : "-Duser.timezone=UTC -Dfile.encoding=UTF-8",
        "mapreduce.map.memory.mb" : 1024,
        "mapreduce.reduce.memory.mb" : 1024
      }
    }
  },
  "hadoopDependencyCoordinates": ["org.apache.hadoop:hadoop-client:3.1.1"]
}
```

Errors in the task log while the index task is processing:

```
org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
2020-09-30T03:27:20,417 WARN [task-runner-0-priority-0] org.apache.hadoop.ipc.Client - Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
2020-09-30T03:27:20,531 WARN [task-runner-0-priority-0] org.apache.hadoop.ipc.Client - Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
Error: com.google.inject.internal.Errors.checkNotNull(Ljava/lang/Object;Ljava/lang/String;)Ljava/lang/Object;
Error: com.google.inject.internal.Errors.checkNotNull(Ljava/lang/Object;Ljava/lang/String;)Ljava/lang/Object;
```

I have already configured Druid for the kerberized Hadoop cluster by setting the following in `_common`:

```
druid.security.extensions.loadList=["druid-kerberos"]
druid.hadoop.security.kerberos.keytab=/etc/security/keytabs/druid.headless.keytab
[email protected]
```

I also followed the documentation at https://druid.apache.org/docs/0.18.1/tutorials/tutorial-kerberos-hadoop.html and copied the Hadoop configuration files (`*-site.xml`) to the Druid conf directory, but I still get the same error.

Am I missing something? Thanks in advance.

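As a side check on the `jobProperties` in the spec above, here is a minimal Python sketch. It is not part of Druid or Hadoop, and the function name `check_job_properties` is made up for illustration. It flags a deprecated Hadoop key that is set alongside its replacement with a different value. In the spec, `fs.default.name` (the deprecated name of `fs.defaultFS`) is `hdfs://nn`, while `fs.defaultFS` is `hdfs://nn/user/druid`. Whether this mismatch is related to the Kerberos error is not known; the sketch only makes the inconsistency visible.

```python
import json
from typing import List

# fs.default.name is the deprecated Hadoop name for fs.defaultFS.
DEPRECATED_KEYS = {"fs.default.name": "fs.defaultFS"}


def check_job_properties(spec: dict) -> List[str]:
    """Return warnings for deprecated keys that disagree with their replacement.

    Hypothetical helper for illustration; not a Druid or Hadoop API.
    """
    # Missing sections are treated as empty rather than as errors.
    props = spec.get("spec", {}).get("tuningConfig", {}).get("jobProperties", {})
    warnings = []
    for old, new in DEPRECATED_KEYS.items():
        if old in props and new in props and props[old] != props[new]:
            warnings.append(f"{old}={props[old]!r} but {new}={props[new]!r}")
    return warnings


if __name__ == "__main__":
    # Only the two relevant jobProperties from the spec are reproduced here.
    spec = json.loads("""
    {"spec": {"tuningConfig": {"jobProperties": {
        "fs.default.name": "hdfs://nn",
        "fs.defaultFS": "hdfs://nn/user/druid"
    }}}}
    """)
    for warning in check_job_properties(spec):
        print(warning)
```

To run it against the full spec file instead, the `json.loads(...)` call could be replaced with `json.load(open(path))`, where `path` points at the file passed to `curl -d`.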