[ 
https://issues.apache.org/jira/browse/HDDS-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813413#comment-17813413
 ] 

Hemant Kumar edited comment on HDDS-10275 at 2/1/24 10:23 PM:
--------------------------------------------------------------

Similar issue:
Snapshot creation request:
{code:java}
2024-02-01 20:53:55,305 INFO [OM StateMachine ApplyTransaction Thread - 
0]-org.apache.hadoop.ozone.om.request.snapshot.OMSnapshotCreateRequest: Created 
snapshot: 'snapshot-05' with snapshotId: 'adfce936-2a08-4fb9-bf70-b7ba156a97b0' 
under path 'vol1/bucket1' {code}
Double buffer flush thread:
{code:java}
...
2024-02-01 20:14:32,633 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager:
 Created checkpoint in rocksDB at 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-d57f780b-66b5-4a91-9b23-c20f19d4cb86
 in 146 milliseconds
2024-02-01 20:14:32,634 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointUtils: 
Waited for 1 milliseconds for checkpoint directory 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-d57f780b-66b5-4a91-9b23-c20f19d4cb86
 availability.
2024-02-01 20:14:32,635 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: 
Created checkpoint : 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-d57f780b-66b5-4a91-9b23-c20f19d4cb86
 for snapshot snapshot-95554815-4005-4da9-865a-4b6af255f0de
2024-02-01 20:14:46,837 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager:
 Created checkpoint in rocksDB at 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-5f004ec9-4318-4ff4-b276-915a81c88bcb
 in 104 milliseconds
2024-02-01 20:14:46,838 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointUtils: 
Waited for 1 milliseconds for checkpoint directory 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-5f004ec9-4318-4ff4-b276-915a81c88bcb
 availability.
2024-02-01 20:14:46,839 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: 
Created checkpoint : 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-5f004ec9-4318-4ff4-b276-915a81c88bcb
 for snapshot snapshot-9a56ac18-8696-49ee-ab4d-70de293889b0
2024-02-01 20:15:01,037 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager:
 Created checkpoint in rocksDB at 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-89471c5f-f06f-4a82-b3be-01030f5fcb79
 in 93 milliseconds
2024-02-01 20:15:01,039 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointUtils: 
Waited for 1 milliseconds for checkpoint directory 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-89471c5f-f06f-4a82-b3be-01030f5fcb79
 availability.
2024-02-01 20:15:01,040 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: 
Created checkpoint : 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-89471c5f-f06f-4a82-b3be-01030f5fcb79
 for snapshot snapshot-0988bc35-880a-4824-a283-077781bb4513
2024-02-01 20:15:15,305 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager:
 Created checkpoint in rocksDB at 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-4fb23ffb-2939-4594-bc0f-cd6c656727a6
 in 90 milliseconds
2024-02-01 20:15:15,305 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointUtils: 
Waited for 0 milliseconds for checkpoint directory 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-4fb23ffb-2939-4594-bc0f-cd6c656727a6
 availability.
2024-02-01 20:15:15,307 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: 
Created checkpoint : 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-4fb23ffb-2939-4594-bc0f-cd6c656727a6
 for snapshot snapshot-f87bb691-7f01-4622-8652-0d5364c7bc34
2024-02-01 20:15:29,382 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager:
 Created checkpoint in rocksDB at 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-f94591f8-a702-41a8-8ad0-ab8713886c12
 in 87 milliseconds
2024-02-01 20:15:29,383 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointUtils: 
Waited for 1 milliseconds for checkpoint directory 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-f94591f8-a702-41a8-8ad0-ab8713886c12
 availability.
2024-02-01 20:15:29,384 INFO 
[OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: 
Created checkpoint : 
/var/lib/hadoop-ozone/om/data/db.snapshots/checkpointState/om.db-f94591f8-a702-41a8-8ad0-ab8713886c12
 for snapshot snapshot-ddd0615e-07a3-4d9e-9e47-2a2800a6936e
...{code}
No double buffer flush occurred after 2024-02-01 20:15:29.

There was no bootstrapping request during that time:
{code:java}
[root@host hadoop-ozone]# grep "Received GET request to obtain DB checkpoint 
snapshot" ozone-om.log
2024-01-24 00:14:29,329 INFO 
[qtp1994100808-78]-org.apache.hadoop.hdds.utils.DBCheckpointServlet: Received 
GET request to obtain DB checkpoint snapshot
[root@host hadoop-ozone]# grep "Received POST request to obtain DB checkpoint 
snapshot" ozone-om.log
[root@host hadoop-ozone]# {code}
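
The comparison above (snapshot create requests versus checkpoints logged by the double buffer flush thread) can be scripted. The following is only a rough sketch, not part of Ozone: the function name `find_unflushed_snapshots` and the regular expressions are hypothetical, and it assumes that each record occupies a single line in ozone-om.log (the lines quoted in this message were wrapped by e-mail) and that the checkpoint directory name ends with the snapshotId, as in the logs quoted in the issue description.

```python
import re
from datetime import datetime

# Assumption: one record per line, shaped like
# "YYYY-MM-DD HH:MM:SS,mmm LEVEL [thread]-logger: message".
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} \w+ \[(.+?)\]-(\S+): (.*)$"
)
# Patterns taken from the log messages quoted in this issue.
SNAPSHOT_ID_RE = re.compile(r"with snapshotId: '([0-9a-f-]+)'")
CHECKPOINT_RE = re.compile(r"Created checkpoint : \S*om\.db-([0-9a-f-]+)")


def find_unflushed_snapshots(lines):
    """Return (time of last flush-thread log line, {snapshotId: request time})
    for snapshot create requests that have no matching checkpoint log line."""
    last_flush = None
    requested = {}        # snapshotId -> time of the create request
    checkpointed = set()  # snapshotIds with a "Created checkpoint" line
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip continuation lines and anything unparseable
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        thread, message = m.group(2), m.group(4)
        if thread == "OMDoubleBufferFlushThread":
            last_flush = ts if last_flush is None else max(last_flush, ts)
            cp = CHECKPOINT_RE.search(message)
            if cp:
                checkpointed.add(cp.group(1))
        elif "Created snapshot" in message:
            sid = SNAPSHOT_ID_RE.search(message)
            if sid:
                requested[sid.group(1)] = ts
    missing = {s: t for s, t in requested.items() if s not in checkpointed}
    return last_flush, missing
```

Calling `find_unflushed_snapshots(open("ozone-om.log"))` would return the timestamp of the last flush-thread record together with the snapshot IDs that were requested but never checkpointed; any request time later than that timestamp points at the same symptom described here.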


> Double buffer not flushing DB transactions
> ------------------------------------------
>
>                 Key: HDDS-10275
>                 URL: https://issues.apache.org/jira/browse/HDDS-10275
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Hemant Kumar
>            Priority: Major
>
> While looking into a snapshot diff failure, I found that it could not load the 
> snapshot because the checkpoint directory doesn’t exist. Snapshot creation 
> succeeded, but the checkpoint directory doesn’t exist because it is created 
> inside the double buffer flush.
> I looked at the logs and there were no double buffer flush logs during that time.
> Snapshot creation request:
> {code:java}
> 2023-11-27 00:40:23,345 INFO [OM StateMachine ApplyTransaction Thread - 
> 0]-org.apache.hadoop.ozone.om.request.snapshot.OMSnapshotCreateRequest: 
> Created snapshot: 'snap-ay36z' with snapshotId: 
> 'bf0c6141-4185-4361-b15f-c4aa71c5c6d8' under path 'vol-2xd36/buck-id806'
> {code}
> Double Buffer flush logs:
> {code:java}
> ...
> 2023-11-27 00:10:23,826 INFO 
> [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager:
>  Created checkpoint in rocksDB at 
> /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-b2e9acb3-fee2-4190-8272-0649edca8d93
>  in 30 milliseconds
> 2023-11-27 00:10:23,827 INFO 
> [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointUtils:
>  Waited for 1 milliseconds for checkpoint directory 
> /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-b2e9acb3-fee2-4190-8272-0649edca8d93
>  availability.
> 2023-11-27 00:10:23,828 INFO 
> [OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: 
> Created checkpoint : 
> /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-b2e9acb3-fee2-4190-8272-0649edca8d93
>  for snapshot snap-mswq9
> 2023-11-27 00:10:39,586 INFO 
> [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager:
>  Created checkpoint in rocksDB at 
> /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-3369ac3a-61e1-4eca-b3cf-eb2de0b2d688
>  in 30 milliseconds
> 2023-11-27 00:10:39,586 INFO 
> [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointUtils:
>  Waited for 0 milliseconds for checkpoint directory 
> /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-3369ac3a-61e1-4eca-b3cf-eb2de0b2d688
>  availability.
> 2023-11-27 00:10:39,587 INFO 
> [OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: 
> Created checkpoint : 
> /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-3369ac3a-61e1-4eca-b3cf-eb2de0b2d688
>  for snapshot snap-f5u3t
> 2023-11-27 00:10:55,949 INFO 
> [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager:
>  Created checkpoint in rocksDB at 
> /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-3a690c8f-f3ef-415d-b25c-3aaf763c9507
>  in 22 milliseconds
> 2023-11-27 00:10:55,950 INFO 
> [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointUtils:
>  Waited for 1 milliseconds for checkpoint directory 
> /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-3a690c8f-f3ef-415d-b25c-3aaf763c9507
>  availability.
> 2023-11-27 00:10:55,950 INFO 
> [OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: 
> Created checkpoint : 
> /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-3a690c8f-f3ef-415d-b25c-3aaf763c9507
>  for snapshot snap-jfktn
> 2023-11-29 08:52:24,698 INFO 
> [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager:
>  Created checkpoint in rocksDB at 
> /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-c3ba17ef-d947-454e-9c4f-b9063ae65650
>  in 15 milliseconds
> 2023-11-29 08:52:24,715 INFO 
> [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointUtils:
>  Waited for 16 milliseconds for checkpoint directory 
> /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-c3ba17ef-d947-454e-9c4f-b9063ae65650
>  availability.
> 2023-11-29 08:52:24,717 WARN 
> [OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: 
> Took 614733 ns to find endKey. Caller is 
> deleteKeysFromDelKeyTableInSnapshotScope
> 2023-11-29 08:52:24,718 INFO 
> [OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: 
> Created checkpoint : 
> /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-c3ba17ef-d947-454e-9c4f-b9063ae65650
>  for snapshot snap-ay36z
> 2023-11-29 08:52:24,745 INFO 
> [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointManager:
>  Created checkpoint in rocksDB at 
> /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-bf0c6141-4185-4361-b15f-c4aa71c5c6d8
>  in 12 milliseconds
> 2023-11-29 08:52:24,746 INFO 
> [OMDoubleBufferFlushThread]-org.apache.hadoop.hdds.utils.db.RDBCheckpointUtils:
>  Waited for 0 milliseconds for checkpoint directory 
> /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-bf0c6141-4185-4361-b15f-c4aa71c5c6d8
>  availability.
> 2023-11-29 08:52:24,747 INFO 
> [OMDoubleBufferFlushThread]-org.apache.hadoop.ozone.om.OmSnapshotManager: 
> Created checkpoint : 
> /var/lib/hadoop-ozone/om/data035525/db.snapshots/checkpointState/om.db-bf0c6141-4185-4361-b15f-c4aa71c5c6d8
>  for snapshot snap-ay36z
> ...
> {code}
> I also checked whether the double buffer thread was terminated or paused, but 
> no log exists for that either. I looked at the logs for the whole hour between 
> the last double buffer flush and the point where the checkpoint was not 
> created, and couldn’t find any issue there either.
> On follower nodes, the double buffer was working properly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
