[
https://issues.apache.org/jira/browse/METRON-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957360#comment-16957360
]
Nick Allen commented on METRON-2297:
------------------------------------
* When using Kerberos authentication, no topology is able to access HDFS, which
primarily impacts Enrichment and Batch Indexing.
* If the local ticket cache on a Storm worker contains a ticket for the 'metron'
user, the topology is able to access HDFS.
* When using Kerberos authentication, all topologies are able to access Kafka.
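To verify the second point on a worker node, inspect the 'metron' user's ticket
cache (a minimal sketch; the keytab path matches the one referenced in the
comments below, but the principal will vary by cluster):
{code:bash}
# Show any cached tickets for the 'metron' user; missing or expired output
# lines up with the HDFS authentication failures described below.
sudo -u metron klist

# List the principals contained in the Metron keytab.
klist -kt /etc/security/keytabs/metron.headless.keytab
{code}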
> Enrichment Topology Unable to Load Geo IP Data from HDFS
> --------------------------------------------------------
>
> Key: METRON-2297
> URL: https://issues.apache.org/jira/browse/METRON-2297
> Project: Metron
> Issue Type: Sub-task
> Reporter: Nick Allen
> Assignee: Nick Allen
> Priority: Major
>
> On the `feature/METRON-2088-support-hdp-3.1` feature branch, the Enrichment
> topology is unable to load the GeoIP data from HDFS when using Kerberos
> authentication. The Enrichment topology shows this error.
> {code:java}
> 2019-10-03 18:23:18.545 o.a.h.i.Client Curator-TreeCache-0 [WARN] Exception
> encountered while connecting to the server :
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate
> via:[TOKEN, KERBEROS]
> 2019-10-03 18:23:18.552 o.a.m.e.a.m.MaxMindDbUtilities Curator-TreeCache-0
> [ERROR] Unable to open new database file
> /apps/metron/geo/default/GeoLite2-City.tar.gz
> java.io.IOException: DestHost:destPort metrong-1.openstacklocal:8020 ,
> LocalHost:localPort metrong-7/172.22.74.121:0. Failed on local exception:
> java.io.IOException: org.apache.hadoop.security.AccessControlException:
> Client cannot authenticate via:[TOKEN, KERBEROS]
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method) ~[?:1.8.0_112]
> at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> ~[?:1.8.0_112]
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> ~[?:1.8.0_112]
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> ~[?:1.8.0_112]
> at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:806)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1502)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at org.apache.hadoop.ipc.Client.call(Client.java:1444)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at org.apache.hadoop.ipc.Client.call(Client.java:1354)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at com.sun.proxy.$Proxy55.getBlockLocations(Unknown Source) ~[?:?]
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:317)
> ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> ~[?:1.8.0_112]
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> ~[?:1.8.0_112]
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> ~[?:1.8.0_112]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at com.sun.proxy.$Proxy56.getBlockLocations(Unknown Source) ~[?:?]
> at
> org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:862)
> ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:851)
> ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:840)
> ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
> at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1004)
> ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:320)
> ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:316)
> ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:328)
> ~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:899)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.metron.enrichment.adapters.maxmind.MaxMindDatabase.update(MaxMindDatabase.java:113)
> ~[stormjar.jar:?]
> at
> org.apache.metron.enrichment.adapters.maxmind.geo.GeoLiteCityDatabase.updateIfNecessary(GeoLiteCityDatabase.java:142)
> ~[stormjar.jar:?]
> at
> org.apache.metron.enrichment.adapters.geo.GeoAdapter.updateAdapter(GeoAdapter.java:64)
> ~[stormjar.jar:?]
> at
> org.apache.metron.enrichment.bolt.UnifiedEnrichmentBolt.reloadCallback(UnifiedEnrichmentBolt.java:239)
> ~[stormjar.jar:?]
> at
> org.apache.metron.common.zookeeper.configurations.ConfigurationsUpdater.reloadCallback(ConfigurationsUpdater.java:148)
> ~[stormjar.jar:?]
> at
> org.apache.metron.common.zookeeper.configurations.ConfigurationsUpdater.update(ConfigurationsUpdater.java:77)
> ~[stormjar.jar:?]
> at
> org.apache.metron.zookeeper.SimpleEventListener.childEvent(SimpleEventListener.java:120)
> [stormjar.jar:?]
> at
> org.apache.curator.framework.recipes.cache.TreeCache$2.apply(TreeCache.java:685)
> [stormjar.jar:?]
> at
> org.apache.curator.framework.recipes.cache.TreeCache$2.apply(TreeCache.java:679)
> [stormjar.jar:?]
> at
> org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:92)
> [stormjar.jar:?]
> at
> org.apache.metron.guava.enrichment.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
> [stormjar.jar:?]
> at
> org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:84)
> [stormjar.jar:?]
> at
> org.apache.curator.framework.recipes.cache.TreeCache.callListeners(TreeCache.java:678)
> [stormjar.jar:?]
> at
> org.apache.curator.framework.recipes.cache.TreeCache.access$1400(TreeCache.java:69)
> [stormjar.jar:?]
> at
> org.apache.curator.framework.recipes.cache.TreeCache$4.run(TreeCache.java:790)
> [stormjar.jar:?]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> [?:1.8.0_112]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [?:1.8.0_112]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> [?:1.8.0_112]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [?:1.8.0_112]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [?:1.8.0_112]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [?:1.8.0_112]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
> Caused by: java.io.IOException:
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate
> via:[TOKEN, KERBEROS]
> at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:758)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at java.security.AccessController.doPrivileged(Native Method)
> ~[?:1.8.0_112]
> at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:721)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:814)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at org.apache.hadoop.ipc.Client$Connection.access$3600(Client.java:411)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1559)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at org.apache.hadoop.ipc.Client.call(Client.java:1390)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> ... 46 more
> Caused by: org.apache.hadoop.security.AccessControlException: Client cannot
> authenticate via:[TOKEN, KERBEROS]
> at
> org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:173)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:390)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:615)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:411)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:801)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:797)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at java.security.AccessController.doPrivileged(Native Method)
> ~[?:1.8.0_112]
> at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:797)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at org.apache.hadoop.ipc.Client$Connection.access$3600(Client.java:411)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1559)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> at org.apache.hadoop.ipc.Client.call(Client.java:1390)
> ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
> ... 46 more
> 2019-10-03 18:23:18.558 o.a.c.f.r.c.TreeCache Curator-TreeCache-0 [ERROR]
> java.lang.IllegalStateException: Unable to update MaxMind database
> at
> org.apache.metron.enrichment.adapters.maxmind.MaxMindDbUtilities.handleDatabaseIOException(MaxMindDbUtilities.java:81)
> ~[stormjar.jar:?]
> at
> org.apache.metron.enrichment.adapters.maxmind.MaxMindDatabase.update(MaxMindDatabase.java:127)
> ~[stormjar.jar:?]
> at
> org.apache.metron.enrichment.adapters.maxmind.geo.GeoLiteCityDatabase.updateIfNecessary(GeoLiteCityDatabase.java:142)
> ~[stormjar.jar:?]
> at
> org.apache.metron.enrichment.adapters.geo.GeoAdapter.updateAdapter(GeoAdapter.java:64)
> ~[stormjar.jar:?]
> at
> org.apache.metron.enrichment.bolt.UnifiedEnrichmentBolt.reloadCallback(UnifiedEnrichmentBolt.java:239)
> ~[stormjar.jar:?]
> at
> org.apache.metron.common.zookeeper.configurations.ConfigurationsUpdater.reloadCallback(ConfigurationsUpdater.java:148)
> ~[stormjar.jar:?]
> at
> org.apache.metron.common.zookeeper.configurations.ConfigurationsUpdater.update(ConfigurationsUpdater.java:77)
> ~[stormjar.jar:?]
> at
> org.apache.metron.zookeeper.SimpleEventListener.childEvent(SimpleEventListener.java:120)
> ~[stormjar.jar:?]
> at
> org.apache.curator.framework.recipes.cache.TreeCache$2.apply(TreeCache.java:685)
> [stormjar.jar:?]
> at
> org.apache.curator.framework.recipes.cache.TreeCache$2.apply(TreeCache.java:679)
> [stormjar.jar:?]
> at
> org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:92)
> [stormjar.jar:?]
> at
> org.apache.metron.guava.enrichment.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
> [stormjar.jar:?]
> at
> org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:84)
> [stormjar.jar:?]
> at
> org.apache.curator.framework.recipes.cache.TreeCache.callListeners(TreeCache.java:678)
> [stormjar.jar:?]
> at
> org.apache.curator.framework.recipes.cache.TreeCache.access$1400(TreeCache.java:69)
> [stormjar.jar:?]
> at
> org.apache.curator.framework.recipes.cache.TreeCache$4.run(TreeCache.java:790)
> [stormjar.jar:?]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> [?:1.8.0_112]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [?:1.8.0_112]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> [?:1.8.0_112]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [?:1.8.0_112]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [?:1.8.0_112]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [?:1.8.0_112]
> at java.lang.Thread.run(Thread.java:745)
> [?:1.8.0_112]
> {code}
> Environment: Cluster details below.
> 172.22.75.252 metronh-5.openstacklocal metronh-5 metronh-5.openstacklocal.
> 172.22.75.250 metronh-4.openstacklocal metronh-4 metronh-4.openstacklocal.
> 172.22.75.248 metronh-3.openstacklocal metronh-3 metronh-3.openstacklocal.
> 172.22.75.246 metronh-2.openstacklocal metronh-2 metronh-2.openstacklocal.
> 172.22.75.244 metronh-1.openstacklocal metronh-1 metronh-1.openstacklocal.
> Ambari node - metronh-1.openstacklocal
> Metron node - metronh-5.openstacklocal
> PEM file for SSH - attached
>
> Comments:
> Nicholas Allen - October 10, 2019, 12:18 PM:
> After more testing, the tgt_renew script workaround is not going to work. The
> workaround looks like the following.
> Kerberos authentication against HDFS from Metron's Storm topologies can fail.
> The Storm worker is unable to present a valid Kerberos ticket to authenticate
> against HDFS. This impacts the Enrichment and Batch Indexing topologies, which
> each interact with HDFS.
> To mitigate this problem, an additional installation step is required before
> starting the Metron topologies in a cluster secured with Kerberos
> authentication: a periodic job should be scheduled to obtain and cache a
> Kerberos ticket. A sketch of such a job follows the list below.
> * The job should be scheduled on each node hosting a Storm Supervisor.
> * The job should run as the user 'metron'.
> * The job should kinit using the Metron keytab, often located at
> /etc/security/keytabs/metron.headless.keytab.
> * The job should run at least as frequently as the ticket lifetime, to ensure
> that a valid ticket is always cached and available to the topologies.
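> A minimal sketch of such a job as a cron entry (the realm EXAMPLE.COM, the
> principal name, and the 12-hour schedule are assumptions; check the actual
> principal with 'klist -kt /etc/security/keytabs/metron.headless.keytab'):
> {code:bash}
> # /etc/cron.d/metron-kinit (hypothetical file name)
> # Re-acquire a ticket for 'metron' every 12 hours, which is more frequent
> # than a typical 24-hour ticket lifetime.
> 0 */12 * * * metron kinit -kt /etc/security/keytabs/metron.headless.keytab metron@EXAMPLE.COM
> {code}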
> Nicholas Allen - October 10, 2019, 12:42 PM:
> It is especially interesting that authentication only seems to fail against
> HDFS, not against other systems like Kafka. We have only seen failures in
> Enrichment and Batch Indexing against HDFS. In the case of Batch Indexing,
> the topology was able to consume messages from Kafka, but it was unable to
> write them to HDFS.
>
> Nicholas Allen - October 10, 2019, 1:09 PM:
> I am attaching a worker log where the client JAAS is set to debug and the
> worker failed to authenticate with HDFS. It seems to show that the worker
> used the metron ticket. See worker.log attached.
>
> Nicholas Allen - October 10, 2019, 4:44 PM:
> I was able to capture the error with DEBUG logs enabled in Storm, which is
> surprisingly difficult to do (long story). See the attached worker.debug.log.
>
> Nicholas Allen - October 10, 2019, 5:04 PM:
> Based on these logs, it appears that the client is trying to do simple
> (non-authenticated) authentication when, of course, the server is kerberized.
> That is based on messages like "PrivilegedActionException as:metron
> (auth:SIMPLE)" and "client isn't using kerberos".
> {code:none}
> 2019-10-10 20:30:26.084 o.a.h.s.SaslRpcClient
> Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] Get token info
> proto:interface org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB
> info:@org.apache.hadoop.security.token.TokenInfo(value=class
> org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSelector)
> 2019-10-10 20:30:26.084 o.a.h.s.SaslRpcClient
> Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] tokens aren't supported for
> this protocol or user doesn't have one
> 2019-10-10 20:30:26.084 o.a.h.s.SaslRpcClient
> Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] client isn't using kerberos
> 2019-10-10 20:30:26.084 o.a.h.s.UserGroupInformation
> Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] PrivilegedActionException
> as:metron (auth:SIMPLE)
> cause:org.apache.hadoop.security.AccessControlException: Client cannot
> authenticate via:[TOKEN, KERBEROS]
> 2019-10-10 20:30:26.084 o.a.h.s.UserGroupInformation
> Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] PrivilegedAction as:metron
> (auth:SIMPLE)
> from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:721)
> 2019-10-10 20:30:26.085 o.a.h.i.Client Thread-7-hdfsIndexingBolt-executor[3
> 3] [WARN] Exception encountered while connecting to the server :
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate
> via:[TOKEN, KERBEROS]
> 2019-10-10 20:30:26.086 o.a.h.s.UserGroupInformation
> Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] PrivilegedActionException
> as:metron (auth:SIMPLE) cause:java.io.IOException:
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate
> via:[TOKEN, KERBEROS]
> 2019-10-10 20:30:26.086 o.a.h.i.Client Thread-7-hdfsIndexingBolt-executor[3
> 3] [DEBUG] closing ipc connection to
> nicksolr-1.openstacklocal/172.22.76.204:8020:
> org.apache.hadoop.security.AccessControlException: Client cannot authenticate
> via:[TOKEN, KERBEROS]
> java.io.IOException: org.apache.hadoop.security.AccessControlException:
> Client cannot authenticate via:[TOKEN, KERBEROS]
> {code}
> These messages can be matched up to the source code here:
> https://github.com/apache/hadoop/blob/release-3.1.3-RC0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
> https://github.com/apache/hadoop/blob/release-3.1.3-RC0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java
> https://github.com/apache/hadoop/blob/release-3.1.3-RC0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java
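> The auth:SIMPLE in these messages suggests the client login never switched to
> Kerberos. For comparison, a minimal sketch of how a Hadoop 3.x client can log
> in explicitly from a keytab, so that authentication does not depend on a
> ticket cache (the class name, principal, realm, and keytab path are
> assumptions, and this is not necessarily how Metron's topologies perform the
> login):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.security.UserGroupInformation;
>
> public class KerberosHdfsLogin {
>   public static void main(String[] args) throws Exception {
>     // Force Kerberos authentication instead of the SIMPLE fallback.
>     Configuration conf = new Configuration();
>     conf.set("hadoop.security.authentication", "kerberos");
>     UserGroupInformation.setConfiguration(conf);
>
>     // Log in from the keytab; the principal and path are assumptions.
>     UserGroupInformation.loginUserFromKeytab(
>         "metron@EXAMPLE.COM", "/etc/security/keytabs/metron.headless.keytab");
>
>     // Should print KERBEROS rather than SIMPLE for the login user.
>     System.out.println(
>         UserGroupInformation.getLoginUser().getAuthenticationMethod());
>
>     // Any subsequent HDFS access then uses the Kerberos login.
>     FileSystem fs = FileSystem.get(conf);
>     System.out.println(fs.exists(new Path("/apps/metron/geo")));
>   }
> }
> {code}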
--
This message was sent by Atlassian Jira
(v8.3.4#803005)