Sammi Chen created HDDS-13886:
---------------------------------
Summary: Improve log message for SCM bootstrap failure
Key: HDDS-13886
URL: https://issues.apache.org/jira/browse/HDDS-13886
Project: Apache Ozone
Issue Type: Improvement
Reporter: Sammi Chen
Recently hit a SCM bootstrap failure case, due to a SCM property
"ozone.scm.service.ids " is mis-spelled as "ozone.scm.service.id", here is the
"ozone scm --bootstrap" output, "--loglevel DEBUG" also didn't show more info.
{code:java}
************************************************************/
2025-11-04 01:54:05,805 [main] INFO server.StorageContainerManagerStarter:
registered UNIX signal handlers for [SIGHUP, SIGINT, SIGTERM]
2025-11-04 01:54:05,935 [main] INFO reflections.Reflections: Reflections took
88 ms to scan 3 urls, producing 131 keys and 298 values
2025-11-04 01:54:05,978 [main] INFO ha.SCMHANodeDetails: ServiceID for
StorageContainerManager is null
2025-11-04 01:54:05,978 [main] INFO ha.SCMHANodeDetails:
ozone.scm.default.service.id is not defined, falling back to
ozone.scm.service.ids to find serviceID for StorageContainerManager if it is HA
enabled cluster
2025-11-04 01:54:05,989 [main] DEBUG security.SecurityUtil: Setting
hadoop.security.token.service.use_ip to true
2025-11-04 01:54:06,212 [main] DEBUG lib.MutableMetricsFactory: field
org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with
annotation @org.apache.hadoop.metrics2.annotation.Metric(sampleName=Ops,
always=false, valueName=Time, about=, interval=10, type=DEFAULT,
value=[GetGroups])
2025-11-04 01:54:06,215 [main] DEBUG lib.MutableMetricsFactory: field
org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with
annotation @org.apache.hadoop.metrics2.annotation.Metric(sampleName=Ops,
always=false, valueName=Time, about=, interval=10, type=DEFAULT, value=[Rate of
failed kerberos logins and latency (milliseconds)])
2025-11-04 01:54:06,215 [main] DEBUG lib.MutableMetricsFactory: field
org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with
annotation @org.apache.hadoop.metrics2.annotation.Metric(sampleName=Ops,
always=false, valueName=Time, about=, interval=10, type=DEFAULT, value=[Rate of
successful kerberos logins and latency (milliseconds)])
2025-11-04 01:54:06,215 [main] DEBUG lib.MutableMetricsFactory: field private
org.apache.hadoop.metrics2.lib.MutableGaugeInt
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.renewalFailures with
annotation @org.apache.hadoop.metrics2.annotation.Metric(sampleName=Ops,
always=false, valueName=Time, about=, interval=10, type=DEFAULT, value=[Renewal
failures since last successful login])
2025-11-04 01:54:06,215 [main] DEBUG lib.MutableMetricsFactory: field private
org.apache.hadoop.metrics2.lib.MutableGaugeLong
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.renewalFailuresTotal
with annotation @org.apache.hadoop.metrics2.annotation.Metric(sampleName=Ops,
always=false, valueName=Time, about=, interval=10, type=DEFAULT, value=[Renewal
failures since startup])
2025-11-04 01:54:06,216 [main] DEBUG impl.MetricsSystemImpl: UgiMetrics, User
and group related metrics
2025-11-04 01:54:06,228 [main] DEBUG security.Groups: Creating new Groups
object
2025-11-04 01:54:06,229 [main] DEBUG util.PerformanceAdvisory: Falling back to
shell based
2025-11-04 01:54:06,229 [main] DEBUG
security.JniBasedUnixGroupsMappingWithFallback: Group mapping
impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
2025-11-04 01:54:06,230 [main] DEBUG security.Groups: Group mapping
impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback;
cacheTimeout=300000; warningDeltaMs=5000
2025-11-04 01:54:06,232 [main] DEBUG security.UserGroupInformation: Hadoop login
2025-11-04 01:54:06,233 [main] DEBUG security.UserGroupInformation: hadoop
login commit
2025-11-04 01:54:06,234 [main] DEBUG security.UserGroupInformation: Using local
user: UnixPrincipal: root
2025-11-04 01:54:06,234 [main] DEBUG security.UserGroupInformation: Using user:
"UnixPrincipal: root" with name: root
2025-11-04 01:54:06,234 [main] DEBUG security.UserGroupInformation: User entry:
"root"
2025-11-04 01:54:06,234 [main] DEBUG security.UserGroupInformation: UGI
loginUser: root (auth:SIMPLE)
Failed to get SCM info
2025-11-04 01:54:06,239 [shutdown-hook-0] INFO
server.StorageContainerManagerStarter: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down StorageContainerManager at
ozone-210-2.vpc.cloudera.com/10.65.55.1
************************************************************/
2025-11-04 01:54:06,240 [Thread-2] DEBUG util.ShutdownHookManager: Completed
shutdown in 0.001 seconds; Timeouts: 0
2025-11-04 01:54:06,249 [Thread-2] DEBUG util.ShutdownHookManager:
ShutdownHookManager completed shutdown.
{code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]