[
https://issues.apache.org/jira/browse/HDDS-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17505177#comment-17505177
]
Shawn commented on HDDS-6441:
-----------------------------
This is a datanode's log from startup. It reports the missing .container file
error:
************************************************************/
2022-03-10 03:55:30 INFO main HddsDatanodeService:90 - registered UNIX signal
handlers for [TERM, HUP, INT]
2022-03-10 03:55:30 INFO main MetricRegistries:64 - Loaded MetricRegistries
class org.apache.ratis.metrics.impl.MetricRegistriesImpl
2022-03-10 03:55:30 INFO main MetricsConfig:120 - Loaded properties from
hadoop-metrics2.properties
2022-03-10 03:55:30 INFO main MetricsSystemImpl:378 - Scheduled Metric
snapshot period at 10 second(s).
2022-03-10 03:55:30 INFO main MetricsSystemImpl:191 - HddsDatanode metrics
system started
2022-03-10 03:55:30 INFO main HddsDatanodeService:235 - HddsDatanodeService
host:ozone-dn-prod-0.ozone-dn.ozone-prod.prod.k8s.cloud.abc.com
ip:100.114.104.16
2022-03-10 03:55:30 INFO main DNCertificateClient:127 - Loading certificate
from location:/data/metadata/dn/certs.
2022-03-10 03:55:30 INFO main HddsDatanodeService:245 - Ozone security is
enabled. Attempting login for Hdds Datanode user. Principal:
dn/[email protected],keytab:
/etc/security/keytabs/dn-dn.keytab
2022-03-10 03:55:30 WARN main NativeCodeLoader:60 - Unable to load
native-hadoop library for your platform... using builtin-java classes where
applicable
2022-03-10 03:55:31 INFO main UserGroupInformation:1129 - Login successful for
user dn/[email protected] using keytab file dn-dn.keytab.
Keytab auto renewal enabled : false
2022-03-10 03:55:31 INFO main HddsDatanodeService:261 - Hdds Datanode login
successful.
2022-03-10 03:55:31 INFO main HddsDatanodeService:336 - Initializing secure
Datanode.
2022-03-10 03:55:31 INFO main DNCertificateClient:127 - Loading certificate
from location:/data/metadata/dn/certs.
2022-03-10 03:55:31 INFO main DNCertificateClient:704 - Certificate client
init case: 6
2022-03-10 03:55:31 INFO main DNCertificateClient:750 - Found private and
public key but certificate is missing.
2022-03-10 03:55:31 INFO main HddsDatanodeService:339 - Init response: GETCERT
2022-03-10 03:55:31 INFO main OzoneSecurityUtil:119 -
ip:2620:149:106a:1712:0:0:0:43%eth0 not returned.
2022-03-10 03:55:31 INFO main OzoneSecurityUtil:119 -
ip:fe80:0:0:0:c3c:43ff:fe1f:bd75%eth0 not returned.
2022-03-10 03:55:31 INFO main OzoneSecurityUtil:116 - Adding
ip:100.114.104.16,host:ozone-dn-prod-0.ozone-dn.ozone-prod.prod.k8s.cloud.abc.com
2022-03-10 03:55:31 INFO main OzoneSecurityUtil:119 - ip:0:0:0:0:0:0:0:1%lo
not returned.
2022-03-10 03:55:31 INFO main OzoneSecurityUtil:119 - ip:127.0.0.1 not
returned.
2022-03-10 03:55:31 INFO main HddsDatanodeService:443 - Creating csr for DN->
subject:[email protected]
2022-03-10 03:55:31 INFO main DNCertificateClient:127 - Loading certificate
from location:/data/metadata/dn/certs.
2022-03-10 03:55:31 INFO main DNCertificateClient:168 - Added certificate from
file:/data/metadata/dn/certs/3811657258662350.crt.
2022-03-10 03:55:31 INFO main DNCertificateClient:168 - Added certificate from
file:/data/metadata/dn/certs/CA-18382205469584.crt.
2022-03-10 03:55:31 INFO main DNCertificateClient:168 - Added certificate from
file:/data/metadata/dn/certs/ROOTCA-1.crt.
2022-03-10 03:55:31 INFO main HddsDatanodeService:346 - Successfully stored
SCM signed certificate, case:GETCERT.
2022-03-10 03:55:31 INFO main AbstractLayoutVersionManager:79 - Initializing
Layout version manager with metadata layout = SCM_HA (version = 2), software
layout = SCM_HA (version = 2)
2022-03-10 03:55:31 INFO main Reflections:232 - Reflections took 68 ms to scan
2 urls, producing 84 keys and 167 values
2022-03-10 03:55:31 INFO main SaveSpaceUsageToFile:94 - Cached usage info file
/data/storage/scmUsed not found
2022-03-10 03:55:31 INFO main HddsVolume:120 - Creating HddsVolume:
/data/storage/hdds of storage type : DISK capacity : 9232042688512
2022-03-10 03:55:31 INFO main MutableVolumeSet:168 - Added Volume :
/data/storage/hdds to VolumeSet
2022-03-10 03:55:31 INFO main ThrottledAsyncChecker:141 - Scheduling a check
for /data/storage/hdds
2022-03-10 03:55:31 INFO main StorageVolumeChecker:202 - Scheduled health
check for volume /data/storage/hdds
2022-03-10 03:55:31 WARN main ServerUtils:237 - Storage directory for Ratis is
not configured. It is a good idea to map this to an SSD disk. Falling back to
ozone.metadata.dirs
2022-03-10 03:55:31 INFO main SaveSpaceUsageToFile:94 - Cached usage info file
/data/metadata/ratis/scmUsed not found
2022-03-10 03:55:31 INFO main MutableVolumeSet:168 - Added Volume :
/data/metadata/ratis to VolumeSet
2022-03-10 03:55:31 INFO main ThrottledAsyncChecker:141 - Scheduling a check
for /data/metadata/ratis
2022-03-10 03:55:31 INFO main StorageVolumeChecker:202 - Scheduled health
check for volume /data/metadata/ratis
2022-03-10 03:55:32 INFO Thread-6 ContainerReader:142 - Start to verify
containers on volume /data/storage/hdds
2022-03-10 03:56:02 ERROR Thread-6 ContainerReader:159 - Missing .container
file for ContainerID: 15221
2022-03-10 03:56:05 INFO Thread-6 ContainerReader:172 - Finish verifying
containers on volume /data/storage/hdds
2022-03-10 03:56:05 INFO main OzoneContainer:236 - Build ContainerSet costs 33s
2022-03-10 03:56:05 WARN main ServerUtils:237 - Storage directory for Ratis is
not configured. It is a good idea to map this to an SSD disk. Falling back to
ozone.metadata.dirs
2022-03-10 03:56:05 INFO main RaftServer:44 - raft.rpc.type = GRPC (default)
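
Because the log above is hard-wrapped, a plain grep for the full error text can
miss records that break across two lines. The following is a minimal sketch,
not part of Ozone, that rejoins wrapped records and collects the container IDs
reported as missing. The function name and regular expressions are
illustrative assumptions; the only thing taken from the log is the message
format.

```python
import re

# Assumption: every log record starts with a timestamp like "2022-03-10 03:56:02".
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ")
# Message format copied from the ContainerReader error in the log above.
MISSING = re.compile(r"ERROR .*Missing \.container file for ContainerID: (\d+)")


def missing_container_ids(lines):
    """Return the container IDs reported missing, in log order."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            # Continuation of a wrapped record: rejoin with a single space.
            records[-1] += " " + line.strip()
    ids = []
    for record in records:
        match = MISSING.search(record)
        if match:
            ids.append(int(match.group(1)))
    return ids


sample = [
    "2022-03-10 03:55:32 INFO Thread-6 ContainerReader:142 - Start to verify",
    "containers on volume /data/storage/hdds",
    "2022-03-10 03:56:02 ERROR Thread-6 ContainerReader:159 - Missing .container",
    "file for ContainerID: 15221",
    "2022-03-10 03:56:05 INFO Thread-6 ContainerReader:172 - Finish verifying",
    "containers on volume /data/storage/hdds",
]
print(missing_container_ids(sample))
```

To scan a real log file, pass the open file object instead of the sample list.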
> Ozone metadata does not align with underlying blocks when there are many
> incomplete uploads happens
> ---------------------------------------------------------------------------------------------------
>
> Key: HDDS-6441
> URL: https://issues.apache.org/jira/browse/HDDS-6441
> Project: Apache Ozone
> Issue Type: Bug
> Components: Ozone Datanode
> Affects Versions: 1.2.0
> Reporter: Shawn
> Priority: Major
>
> Ozone metadata does not align with the underlying blocks when many uploads
> are left incomplete. I have a cluster that holds very few objects, but the
> datanode usage report says it has almost run out of space.
> Usage info for datanode with UUID f50108f1-d8bf-44e3-abed-6e77c91f994d:
> Capacity : 8802545958912B
> SCMUsed : 8802128257024B (99.99525%)
> Remaining : 74715136B (0.00085%)
> Usage info for datanode with UUID 2bdb3198-b71f-4153-9663-e3b349c6f82a:
> Capacity : 8802545958912B
> SCMUsed : 8802133102592B (99.99531%)
> Remaining : 76824576B (0.00087%)
> Usage info for datanode with UUID d5644a36-b967-44a6-a736-4bd2013c2b86:
> Capacity : 8793955991552B
> SCMUsed : 8793311227904B (99.99267%)
> Remaining : 291676160B (0.00332%)
> ...
>
> I also see many errors in the logs complaining about running out of disk
> space, and reports of missing .container files such as the one below:
> 2022-03-10 03:56:02 ERROR Thread-6 ContainerReader:159 - Missing .container
> file for ContainerID: 15221
>
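
The percentages in the quoted usage report can be re-derived from the raw byte
counts. The sketch below does so for the three datanodes listed, and also
computes the bytes that are neither SCMUsed nor Remaining. The helper names
are illustrative, and what that leftover amount represents is not stated in
the report, so it is shown only as a number and not interpreted.

```python
# Figures copied from the quoted usage report: (capacity, scm_used, remaining) in bytes.
USAGE = {
    "f50108f1-d8bf-44e3-abed-6e77c91f994d": (8802545958912, 8802128257024, 74715136),
    "2bdb3198-b71f-4153-9663-e3b349c6f82a": (8802545958912, 8802133102592, 76824576),
    "d5644a36-b967-44a6-a736-4bd2013c2b86": (8793955991552, 8793311227904, 291676160),
}


def percent(part, capacity):
    """Share of capacity, as a percentage."""
    return 100.0 * part / capacity


def unaccounted(capacity, scm_used, remaining):
    """Bytes that are neither SCMUsed nor Remaining (meaning not given in the report)."""
    return capacity - scm_used - remaining


for uuid, (capacity, scm_used, remaining) in USAGE.items():
    print(
        uuid,
        f"used={percent(scm_used, capacity):.5f}%",
        f"remaining={percent(remaining, capacity):.5f}%",
        f"unaccounted={unaccounted(capacity, scm_used, remaining)}B",
    )
```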