[
https://issues.apache.org/jira/browse/HDDS-6763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539049#comment-17539049
]
Attila Doroszlai commented on HDDS-6763:
----------------------------------------
During startup, SCM waits a few seconds for the {{addSCM}} call to complete.
Meanwhile, the DB is being replaced with a checkpoint in another thread. SCM
start then continues and tries to access certificates in the old, stale DB.
This seems to be very similar to HDDS-6732.
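
The race described above can be sketched in plain Java. This is only an
illustration under stated assumptions, not the SCM code: {{Store}},
{{startup}} and the 50 ms timeout are hypothetical names and values, and a
closed in-memory map stands in for the stale RocksDB handle. An access to the
stale handle is modeled as an {{IllegalStateException}}, whereas the report
shows a crash inside native code.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;

final class StaleHandleSketch {

  /** Hypothetical stand-in for a DB handle; close() models releasing the old DB. */
  static final class Store {
    private final Map<String, String> data;
    private volatile boolean closed;

    Store(Map<String, String> data) {
      this.data = data;
    }

    String get(String key) {
      if (closed) {
        throw new IllegalStateException("store is closed");
      }
      return data.get(key);
    }

    void close() {
      closed = true;
    }
  }

  /**
   * Simulates startup. With waitForSwap=false the caller gives up after a
   * bounded wait and later reads through a handle captured before the swap.
   * With waitForSwap=true it waits for the swap and re-resolves the handle.
   */
  static String startup(boolean waitForSwap) throws Exception {
    AtomicReference<Store> current =
        new AtomicReference<>(new Store(Map.of("cert", "old")));
    Store captured = current.get(); // handle taken before the swap
    CountDownLatch release = new CountDownLatch(1);

    // Another thread replaces the store (the "checkpoint install") once released.
    CompletableFuture<Void> swap = CompletableFuture.runAsync(() -> {
      try {
        release.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      Store old = current.getAndSet(new Store(Map.of("cert", "from-checkpoint")));
      old.close();
    });

    if (waitForSwap) {
      release.countDown();
      swap.get();                       // wait until the swap has finished
      return current.get().get("cert"); // resolve the handle after the swap
    }

    try {
      swap.get(50, TimeUnit.MILLISECONDS); // bounded wait, expires first
    } catch (TimeoutException e) {
      // startup continues although the swap has not completed
    }
    release.countDown();
    swap.get(); // the swap completes after startup has moved on

    try {
      return captured.get("cert"); // read through the stale handle
    } catch (IllegalStateException e) {
      return "stale: " + e.getMessage();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(startup(false));
    System.out.println(startup(true));
  }
}
```

In this sketch the read succeeds only when startup waits for the swap to
finish and looks up the current store afterwards. Whether the actual fix
should take that shape is an open question for the SCM code.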
> SCM crashed while getting certificate from SCMCertStore
> -------------------------------------------------------
>
> Key: HDDS-6763
> URL: https://issues.apache.org/jira/browse/HDDS-6763
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Neil Joshi
> Priority: Major
> Attachments: docker-ozonesecure-ha.log, hs_err_pid8.log
>
>
> With a secure HA cluster, SCM crashes in safe mode while retrieving a
> certificate from SCMCertStore.
>
>
> {code:java}
> Current thread (0x00007fda6c017800): JavaThread "Listener at 0.0.0.0/9860"
> [_thread_in_native, id=171, stack(0x00007fda7399b000,0x00007fda73a9c000)]
> Stack: [0x00007fda7399b000,0x00007fda73a9c000], sp=0x00007fda73a99508, free
> space=1017k
> Native frames: (J=compiled Java code, A=aot compiled Java code,
> j=interpreted, Vv=VM code, C=native code)
> C 0x00007fda6d2426a0
> C [librocksdbjni6738631227401849863.so+0x28d8a2]
> Java_org_rocksdb_RocksDB_get__J_3BIIJ+0x62
> j org.rocksdb.RocksDB.get(J[BIIJ)[B+0
> j org.rocksdb.RocksDB.get(Lorg/rocksdb/ColumnFamilyHandle;[B)[B+13
> j org.apache.hadoop.hdds.utils.db.RDBTable.get([B)[B+9
> j
> org.apache.hadoop.hdds.utils.db.RDBTable.get(Ljava/lang/Object;)Ljava/lang/Object;+5
> j
> org.apache.hadoop.hdds.utils.db.TypedTable.getFromTable(Ljava/lang/Object;)Ljava/lang/Object;+14
> j
> org.apache.hadoop.hdds.utils.db.TypedTable.get(Ljava/lang/Object;)Ljava/lang/Object;+61
> j
> org.apache.hadoop.hdds.scm.server.SCMCertStore.getCertificateByID(Ljava/math/BigInteger;Lorg/apache/hadoop/hdds/security/x509/certificate/authority/CertificateStore$CertType;)Ljava/security/cert/X509Certificate;+17
> v ~StubRoutines::call_stub
> V [libjvm.so+0x8ce625] JavaCalls::call_helper(JavaValue*, methodHandle
> const&, JavaCallArguments*, Thread*)+0x395
> V [libjvm.so+0xced613] invoke(InstanceKlass*, methodHandle const&, Handle,
> bool, objArrayHandle, BasicType, objArrayHandle, bool, Thread*) [clone
> .isra.199]+0x603
> V [libjvm.so+0xcee367] Reflection::invoke_method(oopDesc*, Handle,
> objArrayHandle, Thread*)+0x137
> V [libjvm.so+0x97d953] JVM_InvokeMethod+0x1a3
> j
> jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Ljava/lang/reflect/Method;Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+0
> [email protected]
> j
> jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+100
> [email protected]
> J 3798 c1
> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
> [email protected] (10 bytes) @ 0x00007fda5556b674
> [0x00007fda5556b540+0x0000000000000134]
> J 3797 c1
> java.lang.reflect.Method.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
> [email protected] (65 bytes) @ 0x00007fda5556b10c
> [0x00007fda5556ada0+0x000000000000036c]
> j
> org.apache.hadoop.hdds.scm.ha.SCMHAInvocationHandler.invokeLocal(Ljava/lang/reflect/Method;[Ljava/lang/Object;)Ljava/lang/Object;+35
> j
> org.apache.hadoop.hdds.scm.ha.SCMHAInvocationHandler.invoke(Ljava/lang/Object;Ljava/lang/reflect/Method;[Ljava/lang/Object;)Ljava/lang/Object;+33
> j
> com.sun.proxy.$Proxy21.getCertificateByID(Ljava/math/BigInteger;Lorg/apache/hadoop/hdds/security/x509/certificate/authority/CertificateStore$CertType;)Ljava/security/cert/X509Certificate;+20
> j
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.persistSCMCertificates()V+76
> j org.apache.hadoop.hdds.scm.server.StorageContainerManager.start()V+155
> j
> org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter$SCMStarterHelper.start(Lorg/apache/hadoop/hdds/conf/OzoneConfiguration;)V+6
> j
> org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.startScm()V+8
> {code}
> [^hs_err_pid8.log]
>