haodiao commented on issue #3208: URL: https://github.com/apache/kvrocks/issues/3208#issuecomment-3422423161
**The latest log of the master server:**
```
[2025-10-20T11:03:42.228626+08:00][I][cmd_replication.cc:61] Slave 10.8.8.13:33270, listening port: 6666, announce ip: 10.8.8.13 asks for synchronization with next sequence: 1, replication id: not supported, and local sequence: 10347
[2025-10-20T11:03:42.697981+08:00][E][replication.cc:114] Ping slave [10.8.8.13:55732] err: Broken pipe, would stop the thread
[2025-10-20T11:03:42.698101+08:00][W][replication.cc:100] Slave thread was terminated, would stop feeding the slave: 10.8.8.13:55732
[2025-10-20T11:03:42.833349+08:00][I][storage.cc:1143] [storage] Create checkpoint successfully
[2025-10-20T11:03:42.833795+08:00][I][cmd_replication.cc:224] [replication] Succeed sending full data file info to 10.8.8.13
[2025-10-20T11:04:10.751592+08:00][I][cmd_replication.cc:281] [replication] Succeed sending file 000095.log to 10.8.8.13
[2025-10-20T11:04:10.752006+08:00][I][cmd_replication.cc:281] [replication] Succeed sending file 000097.sst to 10.8.8.13
[2025-10-20T11:04:10.752494+08:00][I][cmd_replication.cc:281] [replication] Succeed sending file MANIFEST-000091 to 10.8.8.13
[2025-10-20T11:04:10.752730+08:00][I][cmd_replication.cc:281] [replication] Succeed sending file CURRENT to 10.8.8.13
[2025-10-20T11:04:10.753005+08:00][I][cmd_replication.cc:281] [replication] Succeed sending file 000090.log to 10.8.8.13
[2025-10-20T11:04:11.374264+08:00][I][cmd_replication.cc:281] [replication] Succeed sending file 000094.sst to 10.8.8.13
[2025-10-20T11:04:11.374751+08:00][I][cmd_replication.cc:281] [replication] Succeed sending file OPTIONS-000093 to 10.8.8.13
[2025-10-20T11:04:13.261882+08:00][I][cmd_replication.cc:61] Slave 10.8.8.13:41128, listening port: 6666, announce ip: 10.8.8.13 asks for synchronization with next sequence: 10348, replication id: not supported, and local sequence: 10347
[2025-10-20T11:04:13.262201+08:00][I][cmd_replication.cc:113] New replica: 10.8.8.13:41128 was added, start incremental syncing
```

**The latest log
of the slave server:**
```
[2025-10-20T03:03:41.218142+00:00][I][main.cc:168] kvrocks unstable (commit 2215279)
[2025-10-20T03:03:41.353900+00:00][I][storage.cc:407] [storage] Success to load the data from disk: 82 ms
[2025-10-20T03:03:41.379658+00:00][I][worker.cc:76] [worker] Listening on: 0.0.0.0:6666
[2025-10-20T03:03:41.381334+00:00][I][worker.cc:76] [worker] Listening on: 0.0.0.0:6666
[2025-10-20T03:03:41.381730+00:00][I][worker.cc:76] [worker] Listening on: 0.0.0.0:6666
[2025-10-20T03:03:41.382108+00:00][I][worker.cc:76] [worker] Listening on: 0.0.0.0:6666
[2025-10-20T03:03:41.382459+00:00][I][worker.cc:76] [worker] Listening on: 0.0.0.0:6666
[2025-10-20T03:03:41.382836+00:00][I][worker.cc:76] [worker] Listening on: 0.0.0.0:6666
[2025-10-20T03:03:41.383209+00:00][I][worker.cc:76] [worker] Listening on: 0.0.0.0:6666
[2025-10-20T03:03:41.383604+00:00][I][worker.cc:76] [worker] Listening on: 0.0.0.0:6666
[2025-10-20T03:03:41.384461+00:00][W][replication.cc:430] Clean old synced checkpoint successfully
[2025-10-20T03:03:41.384802+00:00][I][worker.cc:596] [worker] Thread #140388142089920 started
[2025-10-20T03:03:41.384949+00:00][I][worker.cc:596] [worker] Thread #140388150482624 started
[2025-10-20T03:03:41.385115+00:00][I][worker.cc:596] [worker] Thread #140388225754816 started
[2025-10-20T03:03:41.385285+00:00][I][worker.cc:596] [worker] Thread #140388217362112 started
[2025-10-20T03:03:41.385452+00:00][I][worker.cc:596] [worker] Thread #140388208969408 started
[2025-10-20T03:03:41.385619+00:00][I][worker.cc:596] [worker] Thread #140388200576704 started
[2025-10-20T03:03:41.385813+00:00][I][worker.cc:596] [worker] Thread #140388192184000 started
[2025-10-20T03:03:41.385981+00:00][I][worker.cc:596] [worker] Thread #140388183791296 started
[2025-10-20T03:03:41.386295+00:00][I][server.cc:245] [server] Ready to accept connections
[2025-10-20T03:03:41.552330+00:00][I][replication.cc:482] [replication] Auth request was sent, waiting for response
[2025-10-20T03:03:41.715270+00:00][I][replication.cc:498] [replication] Auth response was received, continue...
[2025-10-20T03:03:41.715440+00:00][I][replication.cc:505] [replication] Check db name request was sent, waiting for response
[2025-10-20T03:03:41.876086+00:00][I][replication.cc:525] [replication] DB name is valid, continue...
[2025-10-20T03:03:41.876181+00:00][I][replication.cc:543] [replication] replconf request was sent, waiting for response
[2025-10-20T03:03:42.210064+00:00][I][replication.cc:568] [replication] replconf is ok, start psync
[2025-10-20T03:03:42.210959+00:00][I][replication.cc:598] [replication] Try to use psync, next seq: 1
[2025-10-20T03:03:42.533106+00:00][I][replication.cc:629] [replication] Failed to psync, error: -ERR sequence out of range, please use fullsync, switch to fullsync
[2025-10-20T03:03:42.533222+00:00][I][replication.cc:482] [replication] Auth request was sent, waiting for response
[2025-10-20T03:03:42.694377+00:00][I][replication.cc:498] [replication] Auth response was received, continue...
[2025-10-20T03:03:42.694478+00:00][I][replication.cc:752] [replication] Start syncing data with fullsync
[2025-10-20T03:03:42.977450+00:00][I][replication.cc:839] [replication] Succeeded fetching full data files info, fetching files in parallel
[2025-10-20T03:04:11.355202+00:00][I][replication.cc:953] [fetch] Fetched 000095.log, crc32 0, skip count: 0, fetch count: 1, progress: 1 / 7
[2025-10-20T03:04:11.367702+00:00][I][replication.cc:953] [fetch] Fetched 000097.sst, crc32 0, skip count: 0, fetch count: 2, progress: 2 / 7
[2025-10-20T03:04:11.368381+00:00][I][replication.cc:953] [fetch] Fetched MANIFEST-000091, crc32 0, skip count: 0, fetch count: 3, progress: 3 / 7
[2025-10-20T03:04:11.368541+00:00][I][replication.cc:953] [fetch] Fetched CURRENT, crc32 0, skip count: 0, fetch count: 4, progress: 4 / 7
[2025-10-20T03:04:11.369380+00:00][I][replication.cc:953] [fetch] Fetched 000090.log, crc32 0, skip count: 0, fetch count: 5, progress: 5 / 7
[2025-10-20T03:04:12.240074+00:00][I][replication.cc:953] [fetch] Fetched 000094.sst, crc32 0, skip count: 0, fetch count: 6, progress: 6 / 7
[2025-10-20T03:04:12.248616+00:00][I][replication.cc:953] [fetch] Fetched OPTIONS-000093, crc32 0, skip count: 0, fetch count: 7, progress: 7 / 7
[2025-10-20T03:04:12.249103+00:00][I][replication.cc:858] [replication] Succeeded fetching files in parallel, restoring the backup
[2025-10-20T03:04:12.249128+00:00][I][server.cc:1506] [server] Disconnecting slaves...
[2025-10-20T03:04:12.249156+00:00][I][server.cc:1522] [server] Waiting workers for finishing executing commands...
[2025-10-20T03:04:12.249280+00:00][I][server.cc:1533] [server] Stopping the task runner and clear task queue...
[2025-10-20T03:04:12.257134+00:00][I][server.cc:1541] [server] Waiting for closing DB...
[2025-10-20T03:04:12.385834+00:00][I][event_listener.cc:187] [event_listener/table_file_created] column family: default, file path: /var/lib/kvrocks/db/000100.sst, file size: 2066844, job_id: 1, reason: recovery, status: OK
[2025-10-20T03:04:12.387995+00:00][I][event_listener.cc:187] [event_listener/table_file_created] column family: metadata, file path: /var/lib/kvrocks/db/000101.sst, file size: 17932, job_id: 1, reason: recovery, status: OK
[2025-10-20T03:04:12.400665+00:00][I][event_listener.cc:92] [event_listener/compaction_begin] column family: default, job_id: 3, compaction reason: FilesMarkedForCompaction, output compression type: no, base input level(files): 0(2), output level(files): 6(0), input bytes: 4114724, output bytes: 0, is_manual_compaction: no
[2025-10-20T03:04:12.401004+00:00][I][event_listener.cc:116] [event_listener/subcompaction_begin] column family: default, job_id: 3, compaction reason: FilesMarkedForCompaction, output compression type: no
[2025-10-20T03:04:12.401631+00:00][E][compact_filter.cc:131] [compact_filter/subkey] Failed to get metadata, namespace: __namespace, key: site:1, err: storage is closed
[2025-10-20T03:04:12.401916+00:00][E][compact_filter.cc:131] [compact_filter/subkey] Failed to get metadata, namespace: __namespace, key: site:1, err: storage is closed
[2025-10-20T03:04:12.402194+00:00][E][compact_filter.cc:131] [compact_filter/subkey] Failed to get metadata, namespace: __namespace, key: site:1, err: storage is closed
[2025-10-20T03:04:12.402507+00:00][E][compact_filter.cc:131] [compact_filter/subkey] Failed to get metadata, namespace: __namespace, key: site:1, err: storage is closed
[2025-10-20T03:04:12.403451+00:00][E][compact_filter.cc:131] [compact_filter/subkey] Failed to get metadata, namespace: __namespace, key: site:1, err: storage is closed
[2025-10-20T03:04:12.404077+00:00][E][compact_filter.cc:131] [compact_filter/subkey] Failed to get metadata, namespace: __namespace, key: site:1, err: storage is closed
[2025-10-20T03:04:12.404292+00:00][I][storage.cc:407] [storage] Success to load the data from disk: 137 ms
[2025-10-20T03:04:12.406077+00:00][I][replication.cc:874] [replication] Succeeded restoring the backup, fullsync was finish
[2025-10-20T03:04:12.413198+00:00][I][event_listener.cc:187] [event_listener/table_file_created] column family: default, file path: /var/lib/kvrocks/db/000106.sst, file size: 2075735, job_id: 3, reason: compaction, status: OK
[2025-10-20T03:04:12.413467+00:00][I][event_listener.cc:124] [event_listener/subcompaction_completed] column family: default, job_id: 3, compaction reason: FilesMarkedForCompaction, output compression type: no, base input level(files): 0, output level(files): 6, input bytes: 0, output bytes: 0, is_manual_compaction: no, elapsed(micro): 0
[2025-10-20T03:04:12.415358+00:00][I][event_listener.cc:103] [event_listener/compaction_completed] column family: default, job_id: 3, compaction reason: FilesMarkedForCompaction, output compression type: no, base input level(files): 0(2), output level(files): 6(1), input bytes: 4114724, output bytes: 2075735, is_manual_compaction: no, elapsed(micro): 12579
[2025-10-20T03:04:12.415986+00:00][I][event_listener.cc:194] [event_listener/table_file_deleted] db: /var/lib/kvrocks/db, sst file: /var/lib/kvrocks/db/000094.sst, status: OK
[2025-10-20T03:04:12.416515+00:00][I][event_listener.cc:194] [event_listener/table_file_deleted] db: /var/lib/kvrocks/db, sst file: /var/lib/kvrocks/db/000100.sst, status: OK
[2025-10-20T03:04:12.417135+00:00][I][event_listener.cc:92] [event_listener/compaction_begin] column family: metadata, job_id: 4, compaction reason: FilesMarkedForCompaction, output compression type: no, base input level(files): 0(2), output level(files): 6(0), input bytes: 31235, output bytes: 0, is_manual_compaction: no
[2025-10-20T03:04:12.417314+00:00][I][event_listener.cc:116] [event_listener/subcompaction_begin] column family: metadata, job_id: 4, compaction reason: FilesMarkedForCompaction, output compression type: no
[2025-10-20T03:04:12.419093+00:00][I][event_listener.cc:187] [event_listener/table_file_created] column family: metadata, file path: /var/lib/kvrocks/db/000107.sst, file size: 15258, job_id: 4, reason: compaction, status: OK
[2025-10-20T03:04:12.419281+00:00][I][event_listener.cc:124] [event_listener/subcompaction_completed] column family: metadata, job_id: 4, compaction reason: FilesMarkedForCompaction, output compression type: no, base input level(files): 0, output level(files): 6, input bytes: 0, output bytes: 0, is_manual_compaction: no, elapsed(micro): 0
[2025-10-20T03:04:12.420877+00:00][I][event_listener.cc:103] [event_listener/compaction_completed] column family: metadata, job_id: 4, compaction reason: FilesMarkedForCompaction, output compression type: no, base input level(files): 0(2), output level(files): 6(1), input bytes: 31235, output bytes: 15258, is_manual_compaction: no, elapsed(micro): 2076
[2025-10-20T03:04:12.421158+00:00][I][event_listener.cc:194] [event_listener/table_file_deleted] db: /var/lib/kvrocks/db, sst file: /var/lib/kvrocks/db/000097.sst, status: OK
[2025-10-20T03:04:12.421339+00:00][I][event_listener.cc:194] [event_listener/table_file_deleted] db: /var/lib/kvrocks/db, sst file: /var/lib/kvrocks/db/000101.sst, status: OK
[2025-10-20T03:04:12.570575+00:00][I][replication.cc:482] [replication] Auth request was sent, waiting for response
[2025-10-20T03:04:12.730508+00:00][I][replication.cc:498] [replication] Auth response was received, continue...
[2025-10-20T03:04:12.730842+00:00][I][replication.cc:482] [replication] Auth request was sent, waiting for response
[2025-10-20T03:04:12.892167+00:00][I][replication.cc:498] [replication] Auth response was received, continue...
[2025-10-20T03:04:12.892575+00:00][I][replication.cc:505] [replication] Check db name request was sent, waiting for response
[2025-10-20T03:04:13.056552+00:00][I][replication.cc:525] [replication] DB name is valid, continue...
[2025-10-20T03:04:13.056884+00:00][I][replication.cc:543] [replication] replconf request was sent, waiting for response
[2025-10-20T03:04:13.220553+00:00][I][replication.cc:568] [replication] replconf is ok, start psync
[2025-10-20T03:04:13.244140+00:00][I][replication.cc:598] [replication] Try to use psync, next seq: 10348
[2025-10-20T03:04:13.406248+00:00][I][replication.cc:633] [replication] PSync is ok, start increment batch loop
[2025-10-20T03:04:41.644379+00:00][I][compaction_checker.cc:35] [compaction checker] Start to compact the column family: pubsub
[2025-10-20T03:04:41.644519+00:00][I][compaction_checker.cc:38] [compaction checker] Compact the column family: pubsub finished, result: OK
[2025-10-20T03:04:41.644539+00:00][I][compaction_checker.cc:35] [compaction checker] Start to compact the column family: propagate
[2025-10-20T03:04:41.644568+00:00][I][compaction_checker.cc:38] [compaction checker] Compact the column family: propagate finished, result: OK
```

Both the master and its replica are stuck in a zombie state with long-term data inconsistency. A restart of either node is required to re-establish synchronization.
