Nick Allen created METRON-2297:
----------------------------------
Summary: Enrichment Topology Unable to Load Geo IP Data from HDFS
Key: METRON-2297
URL: https://issues.apache.org/jira/browse/METRON-2297
Project: Metron
Issue Type: Sub-task
Reporter: Nick Allen
Assignee: Nick Allen
On the `feature/METRON-2088-support-hdp-3.1` feature branch, the Enrichment
topology is unable to load the GeoIP data from HDFS when using Kerberos
authentication. The Enrichment topology logs the following error.
{code:java}
2019-10-03 18:23:18.545 o.a.h.i.Client Curator-TreeCache-0 [WARN] Exception
encountered while connecting to the server :
org.apache.hadoop.security.AccessControlException: Client cannot authenticate
via:[TOKEN, KERBEROS]
2019-10-03 18:23:18.552 o.a.m.e.a.m.MaxMindDbUtilities Curator-TreeCache-0
[ERROR] Unable to open new database file
/apps/metron/geo/default/GeoLite2-City.tar.gz
java.io.IOException: DestHost:destPort metrong-1.openstacklocal:8020 ,
LocalHost:localPort metrong-7/172.22.74.121:0. Failed on local exception:
java.io.IOException: org.apache.hadoop.security.AccessControlException: Client
cannot authenticate via:[TOKEN, KERBEROS]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method) ~[?:1.8.0_112]
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
~[?:1.8.0_112]
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
~[?:1.8.0_112]
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
~[?:1.8.0_112]
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:806)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1502)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1444)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1354)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at com.sun.proxy.$Proxy55.getBlockLocations(Unknown Source) ~[?:?]
at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:317)
~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
~[?:1.8.0_112]
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
~[?:1.8.0_112]
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
~[?:1.8.0_112]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at com.sun.proxy.$Proxy56.getBlockLocations(Unknown Source) ~[?:?]
at
org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:862)
~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:851)
~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:840)
~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1004)
~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:320)
~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:316)
~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:328)
~[hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:899)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.metron.enrichment.adapters.maxmind.MaxMindDatabase.update(MaxMindDatabase.java:113)
~[stormjar.jar:?]
at
org.apache.metron.enrichment.adapters.maxmind.geo.GeoLiteCityDatabase.updateIfNecessary(GeoLiteCityDatabase.java:142)
~[stormjar.jar:?]
at
org.apache.metron.enrichment.adapters.geo.GeoAdapter.updateAdapter(GeoAdapter.java:64)
~[stormjar.jar:?]
at
org.apache.metron.enrichment.bolt.UnifiedEnrichmentBolt.reloadCallback(UnifiedEnrichmentBolt.java:239)
~[stormjar.jar:?]
at
org.apache.metron.common.zookeeper.configurations.ConfigurationsUpdater.reloadCallback(ConfigurationsUpdater.java:148)
~[stormjar.jar:?]
at
org.apache.metron.common.zookeeper.configurations.ConfigurationsUpdater.update(ConfigurationsUpdater.java:77)
~[stormjar.jar:?]
at
org.apache.metron.zookeeper.SimpleEventListener.childEvent(SimpleEventListener.java:120)
[stormjar.jar:?]
at
org.apache.curator.framework.recipes.cache.TreeCache$2.apply(TreeCache.java:685)
[stormjar.jar:?]
at
org.apache.curator.framework.recipes.cache.TreeCache$2.apply(TreeCache.java:679)
[stormjar.jar:?]
at
org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:92)
[stormjar.jar:?]
at
org.apache.metron.guava.enrichment.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
[stormjar.jar:?]
at
org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:84)
[stormjar.jar:?]
at
org.apache.curator.framework.recipes.cache.TreeCache.callListeners(TreeCache.java:678)
[stormjar.jar:?]
at
org.apache.curator.framework.recipes.cache.TreeCache.access$1400(TreeCache.java:69)
[stormjar.jar:?]
at
org.apache.curator.framework.recipes.cache.TreeCache$4.run(TreeCache.java:790)
[stormjar.jar:?]
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[?:1.8.0_112]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[?:1.8.0_112]
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[?:1.8.0_112]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[?:1.8.0_112]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[?:1.8.0_112]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[?:1.8.0_112]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
Caused by: java.io.IOException:
org.apache.hadoop.security.AccessControlException: Client cannot authenticate
via:[TOKEN, KERBEROS]
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:758)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at java.security.AccessController.doPrivileged(Native Method)
~[?:1.8.0_112]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:721)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:814)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hadoop.ipc.Client$Connection.access$3600(Client.java:411)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1559)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1390)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
... 46 more
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot
authenticate via:[TOKEN, KERBEROS]
at
org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:173)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:390)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:615)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:411)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:801)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:797)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at java.security.AccessController.doPrivileged(Native Method)
~[?:1.8.0_112]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:797)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hadoop.ipc.Client$Connection.access$3600(Client.java:411)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1559)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1390)
~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
... 46 more
2019-10-03 18:23:18.558 o.a.c.f.r.c.TreeCache Curator-TreeCache-0 [ERROR]
java.lang.IllegalStateException: Unable to update MaxMind database
at
org.apache.metron.enrichment.adapters.maxmind.MaxMindDbUtilities.handleDatabaseIOException(MaxMindDbUtilities.java:81)
~[stormjar.jar:?]
at
org.apache.metron.enrichment.adapters.maxmind.MaxMindDatabase.update(MaxMindDatabase.java:127)
~[stormjar.jar:?]
at
org.apache.metron.enrichment.adapters.maxmind.geo.GeoLiteCityDatabase.updateIfNecessary(GeoLiteCityDatabase.java:142)
~[stormjar.jar:?]
at
org.apache.metron.enrichment.adapters.geo.GeoAdapter.updateAdapter(GeoAdapter.java:64)
~[stormjar.jar:?]
at
org.apache.metron.enrichment.bolt.UnifiedEnrichmentBolt.reloadCallback(UnifiedEnrichmentBolt.java:239)
~[stormjar.jar:?]
at
org.apache.metron.common.zookeeper.configurations.ConfigurationsUpdater.reloadCallback(ConfigurationsUpdater.java:148)
~[stormjar.jar:?]
at
org.apache.metron.common.zookeeper.configurations.ConfigurationsUpdater.update(ConfigurationsUpdater.java:77)
~[stormjar.jar:?]
at
org.apache.metron.zookeeper.SimpleEventListener.childEvent(SimpleEventListener.java:120)
~[stormjar.jar:?]
at
org.apache.curator.framework.recipes.cache.TreeCache$2.apply(TreeCache.java:685)
[stormjar.jar:?]
at
org.apache.curator.framework.recipes.cache.TreeCache$2.apply(TreeCache.java:679)
[stormjar.jar:?]
at
org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:92)
[stormjar.jar:?]
at
org.apache.metron.guava.enrichment.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
[stormjar.jar:?]
at
org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:84)
[stormjar.jar:?]
at
org.apache.curator.framework.recipes.cache.TreeCache.callListeners(TreeCache.java:678)
[stormjar.jar:?]
at
org.apache.curator.framework.recipes.cache.TreeCache.access$1400(TreeCache.java:69)
[stormjar.jar:?]
at
org.apache.curator.framework.recipes.cache.TreeCache$4.run(TreeCache.java:790)
[stormjar.jar:?]
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[?:1.8.0_112]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[?:1.8.0_112]
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[?:1.8.0_112]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[?:1.8.0_112]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[?:1.8.0_112]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[?:1.8.0_112]
at java.lang.Thread.run(Thread.java:745)
[?:1.8.0_112]
{code}
Environment: Cluster details below.
{code}
172.22.75.252 metronh-5.openstacklocal metronh-5 metronh-5.openstacklocal.
172.22.75.250 metronh-4.openstacklocal metronh-4 metronh-4.openstacklocal.
172.22.75.248 metronh-3.openstacklocal metronh-3 metronh-3.openstacklocal.
172.22.75.246 metronh-2.openstacklocal metronh-2 metronh-2.openstacklocal.
172.22.75.244 metronh-1.openstacklocal metronh-1 metronh-1.openstacklocal.
{code}
Ambari node - metronh-1.openstacklocal
Metron node - metronh-5.openstacklocal
PEM file for SSH - attached
Known Issue Description: None
Testing Procedure: None

Comments:

Nicholas Allen - October 10, 2019, 12:18 PM (edited):
After more testing, the tgt_renew script workaround is not going to work. The
workaround looks like the following.

Kerberos authentication against HDFS from Metron's Storm topologies can fail:
the Storm worker is unable to present a valid Kerberos ticket to authenticate
against HDFS. This impacts the Enrichment and Batch Indexing topologies, which
each interact with HDFS.

To mitigate this problem, before starting the Metron topologies in a secured
cluster using Kerberos authentication, an additional installation step is
required: a periodic job should be scheduled to obtain and cache a Kerberos
ticket.
- The job should be scheduled on each node hosting a Storm Supervisor.
- The job should run as the user 'metron'.
- The job should kinit using the Metron keytab, often located at
  /etc/security/keytabs/metron.headless.keytab.
- The job should be scheduled to run at least as frequently as the ticket
  lifetime, to ensure that a ticket is always cached and available for the
  topologies.

Nicholas Allen - October 10, 2019, 12:42 PM:
It is especially interesting that authentication only seems to fail against
HDFS, not against other systems like Kafka. We have only seen failures in
Enrichment and Batch Indexing against HDFS. In the case of Batch Indexing, it
was able to consume messages from Kafka, but unable to write them to HDFS.

Nicholas Allen - October 10, 2019, 1:09 PM (edited):
I am attaching a worker log where the client JAAS is set to debug and the
worker failed to authenticate with HDFS. It seems to show that the worker used
the metron ticket. See worker.log, attached.

Nicholas Allen - October 10, 2019, 4:44 PM:
I was able to capture the error with DEBUG logs on in Storm, which is
surprisingly difficult to do (long story). See attached worker.debug.log.

Nicholas Allen - October 10, 2019, 5:04 PM (edited):
Based on these logs, it appears that the client is trying to do simple
(non-authenticated) authentication when, of course, the server is kerberized.
That is based on messages like "PrivilegedActionException as:metron
(auth:SIMPLE)" and "client isn't using kerberos".
{code}
2019-10-10 20:30:26.084 o.a.h.s.SaslRpcClient Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] Get token info proto:interface org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB info:@org.apache.hadoop.security.token.TokenInfo(value=class org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSelector)
2019-10-10 20:30:26.084 o.a.h.s.SaslRpcClient Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] tokens aren't supported for this protocol or user doesn't have one
2019-10-10 20:30:26.084 o.a.h.s.SaslRpcClient Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] client isn't using kerberos
2019-10-10 20:30:26.084 o.a.h.s.UserGroupInformation Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] PrivilegedActionException as:metron (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
2019-10-10 20:30:26.084 o.a.h.s.UserGroupInformation Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] PrivilegedAction as:metron (auth:SIMPLE) from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:721)
2019-10-10 20:30:26.085 o.a.h.i.Client Thread-7-hdfsIndexingBolt-executor[3 3] [WARN] Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
2019-10-10 20:30:26.086 o.a.h.s.UserGroupInformation Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] PrivilegedActionException as:metron (auth:SIMPLE) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
2019-10-10 20:30:26.086 o.a.h.i.Client Thread-7-hdfsIndexingBolt-executor[3 3] [DEBUG] closing ipc connection to nicksolr-1.openstacklocal/172.22.76.204:8020: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
{code}
These messages can be matched up to the source code here:
https://github.com/apache/hadoop/blob/release-3.1.3-RC0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SaslRpcClient.java
https://github.com/apache/hadoop/blob/release-3.1.3-RC0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java
https://github.com/apache/hadoop/blob/release-3.1.3-RC0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java
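The periodic ticket-caching job described in the workaround comment above can be sketched as a small script that builds the cron entry. The principal name (`metron`) and the hourly schedule are assumptions for illustration; the keytab path is the default location mentioned in the comment. Adjust the schedule to your cluster's actual ticket lifetime.

```shell
# Sketch (assumption, not a tested Metron procedure): build a cron entry
# that re-runs kinit hourly, i.e. at least as often as the ticket lifetime,
# so a valid ticket is always in the credential cache. This would be
# installed for the 'metron' user on every node hosting a Storm Supervisor.
KEYTAB=/etc/security/keytabs/metron.headless.keytab
PRINCIPAL=metron   # illustrative principal name; match your KDC's principal

# "0 * * * *" = at minute 0 of every hour; kinit -kt obtains a ticket
# non-interactively from the keytab.
CRON_ENTRY="0 * * * * kinit -kt ${KEYTAB} ${PRINCIPAL}"
echo "${CRON_ENTRY}"
```

The entry would then be added to the metron user's crontab (e.g. via `crontab -e`); scheduling it on every Supervisor node matters because the topology's workers can be placed on any of them.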
--
This message was sent by Atlassian Jira
(v8.3.4#803005)