[ 
https://issues.apache.org/jira/browse/HDDS-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng reassigned HDDS-10734:
---------------------------------

    Assignee: Siyao Meng

> [Hbase Ozone] ImportTSV fails during OM Rolling Restart with 
> "SecretManager$InvalidToken: Tampered/Invalid token."
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-10734
>                 URL: https://issues.apache.org/jira/browse/HDDS-10734
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: OM
>            Reporter: Pratyush Bhatt
>            Assignee: Siyao Meng
>            Priority: Major
>
> Triggering ImportTSV during an OM rolling restart fails.
> After debugging, the issue is reproducible every time ImportTSV is running 
> its "reducers" while an OM rolling restart stage is in progress.
> {code:java}
> 2024-04-22 10:15:41,159|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|24/04/22 10:15:41 INFO 
> mapreduce.Job:  map 100% reduce 69%
> 2024-04-22 10:15:43,169|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|24/04/22 10:15:43 INFO 
> mapreduce.Job:  map 100% reduce 70%
> 2024-04-22 10:15:49,198|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|24/04/22 10:15:49 INFO 
> mapreduce.Job:  map 100% reduce 71%
> 2024-04-22 10:16:29,396|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|24/04/22 10:16:29 INFO 
> mapreduce.Job: Task Id : attempt_1713778160624_0007_r_000072_0, Status : 
> FAILED
> 2024-04-22 10:16:29,434|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|Error: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: Tampered/Invalid 
> token.
> 2024-04-22 10:16:29,434|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 2024-04-22 10:16:29,434|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> 2024-04-22 10:16:29,434|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 2024-04-22 10:16:29,435|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> 2024-04-22 10:16:29,435|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
> 2024-04-22 10:16:29,435|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:110)
> 2024-04-22 10:16:29,435|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:253)
> 2024-04-22 10:16:29,435|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:115)
> 2024-04-22 10:16:29,436|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.fs.ozone.BasicRootedOzoneClientAdapterImpl.<init>(BasicRootedOzoneClientAdapterImpl.java:201)
> 2024-04-22 10:16:29,436|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.fs.ozone.RootedOzoneClientAdapterImpl.<init>(RootedOzoneClientAdapterImpl.java:51)
> 2024-04-22 10:16:29,436|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.fs.ozone.RootedOzoneFileSystem.createAdapter(RootedOzoneFileSystem.java:111)
> 2024-04-22 10:16:29,436|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem.initialize(BasicRootedOzoneFileSystem.java:189)
> 2024-04-22 10:16:29,436|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3451)
> 2024-04-22 10:16:29,437|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:161)
> 2024-04-22 10:16:29,437|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3556)
> 2024-04-22 10:16:29,437|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3503)
> 2024-04-22 10:16:29,437|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.fs.FileSystem.get(FileSystem.java:521)
> 2024-04-22 10:16:29,437|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.fs.FileSystem.get(FileSystem.java:269)
> 2024-04-22 10:16:29,438|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:173)
> 2024-04-22 10:16:29,438|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> java.security.AccessController.doPrivileged(Native Method)
> 2024-04-22 10:16:29,438|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> javax.security.auth.Subject.doAs(Subject.java:422)
> 2024-04-22 10:16:29,438|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
> 2024-04-22 10:16:29,438|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> 2024-04-22 10:16:29,438|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  Tampered/Invalid token.
> 2024-04-22 10:16:29,439|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1616)
> 2024-04-22 10:16:29,439|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.ipc.Client.call(Client.java:1562)
> 2024-04-22 10:16:29,439|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.ipc.Client.call(Client.java:1459)
> 2024-04-22 10:16:29,439|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
> 2024-04-22 10:16:29,439|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
> 2024-04-22 10:16:29,440|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> com.sun.proxy.$Proxy17.submitRequest(Unknown Source)
> 2024-04-22 10:16:29,440|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
> 2024-04-22 10:16:29,440|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2024-04-22 10:16:29,440|INFO|Thread-37|machine.py:205 - 
> run()||GUID=51e988c6-6805-43d1-9290-eb6f667ac2dd|at 
> java.lang.reflect.Method.invoke(Method.java:498) {code}
> The leader OM logs show the following:
> {code:java}
> 2024-04-22 10:16:24,671 WARN [Socket Reader #1 for port 
> 9862]-SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for 
> 10.140.133.64:46032:null (DIGEST-MD5: IO error acquiring password) with true 
> cause: (OM:om102 is not the leader. Could not determine the leader node.)
> 2024-04-22 10:16:24,671 WARN [Socket Reader #1 for port 
> 9862]-SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for 
> 10.140.68.1:43592:null (DIGEST-MD5: IO error acquiring password) with true 
> cause: (OM:om102 is not the leader. Could not determine the leader node.)
> 2024-04-22 10:16:24,672 WARN [Socket Reader #1 for port 
> 9862]-SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for 
> 10.140.170.2:41974:null (DIGEST-MD5: IO error acquiring password) with true 
> cause: (OM:om102 is not the leader. Could not determine the leader node.)
> 2024-04-22 10:16:24,672 WARN [Socket Reader #1 for port 
> 9862]-SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for 
> 10.140.133.64:46020:null (DIGEST-MD5: IO error acquiring password) with true 
> cause: (OM:om102 is not the leader. Could not determine the leader node.)
> 2024-04-22 10:16:24,672 WARN [Socket Reader #1 for port 
> 9862]-SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for 
> 10.140.11.131:50274:null (DIGEST-MD5: IO error acquiring password) with true 
> cause: (OM:om102 is not the leader. Could not determine the leader node.)
> 2024-04-22 10:16:24,675 WARN [Socket Reader #1 for port 
> 9862]-SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for 
> 10.140.11.131:50290:null (DIGEST-MD5: IO error acquiring password) with true 
> cause: (OM:om102 is not the leader. Could not determine the leader node.) 
> {code}


