haodiao commented on issue #3208:
URL: https://github.com/apache/kvrocks/issues/3208#issuecomment-3422423161

   **The latest log of the master server:**
   ```
   [2025-10-20T11:03:42.228626+08:00][I][cmd_replication.cc:61] Slave 
10.8.8.13:33270, listening port: 6666, announce ip: 10.8.8.13 asks for 
synchronization with next sequence: 1, replication id: not supported, and local 
sequence: 10347
   [2025-10-20T11:03:42.697981+08:00][E][replication.cc:114] Ping slave 
[10.8.8.13:55732] err: Broken pipe, would stop the thread
   [2025-10-20T11:03:42.698101+08:00][W][replication.cc:100] Slave thread was 
terminated, would stop feeding the slave: 10.8.8.13:55732
   [2025-10-20T11:03:42.833349+08:00][I][storage.cc:1143] [storage] Create 
checkpoint successfully
   [2025-10-20T11:03:42.833795+08:00][I][cmd_replication.cc:224] [replication] 
Succeed sending full data file info to 10.8.8.13
   [2025-10-20T11:04:10.751592+08:00][I][cmd_replication.cc:281] [replication] 
Succeed sending file 000095.log to 10.8.8.13
   [2025-10-20T11:04:10.752006+08:00][I][cmd_replication.cc:281] [replication] 
Succeed sending file 000097.sst to 10.8.8.13
   [2025-10-20T11:04:10.752494+08:00][I][cmd_replication.cc:281] [replication] 
Succeed sending file MANIFEST-000091 to 10.8.8.13
   [2025-10-20T11:04:10.752730+08:00][I][cmd_replication.cc:281] [replication] 
Succeed sending file CURRENT to 10.8.8.13
   [2025-10-20T11:04:10.753005+08:00][I][cmd_replication.cc:281] [replication] 
Succeed sending file 000090.log to 10.8.8.13
   [2025-10-20T11:04:11.374264+08:00][I][cmd_replication.cc:281] [replication] 
Succeed sending file 000094.sst to 10.8.8.13
   [2025-10-20T11:04:11.374751+08:00][I][cmd_replication.cc:281] [replication] 
Succeed sending file OPTIONS-000093 to 10.8.8.13
   [2025-10-20T11:04:13.261882+08:00][I][cmd_replication.cc:61] Slave 
10.8.8.13:41128, listening port: 6666, announce ip: 10.8.8.13 asks for 
synchronization with next sequence: 10348, replication id: not supported, and 
local sequence: 10347
   [2025-10-20T11:04:13.262201+08:00][I][cmd_replication.cc:113] New replica: 
10.8.8.13:41128 was added, start incremental syncing
   ```
   
   **The latest log of the slave server:**
   ```
   [2025-10-20T03:03:41.218142+00:00][I][main.cc:168] kvrocks unstable (commit 
2215279)
   [2025-10-20T03:03:41.353900+00:00][I][storage.cc:407] [storage] Success to 
load the data from disk: 82 ms
   [2025-10-20T03:03:41.379658+00:00][I][worker.cc:76] [worker] Listening on: 
0.0.0.0:6666
   [2025-10-20T03:03:41.381334+00:00][I][worker.cc:76] [worker] Listening on: 
0.0.0.0:6666
   [2025-10-20T03:03:41.381730+00:00][I][worker.cc:76] [worker] Listening on: 
0.0.0.0:6666
   [2025-10-20T03:03:41.382108+00:00][I][worker.cc:76] [worker] Listening on: 
0.0.0.0:6666
   [2025-10-20T03:03:41.382459+00:00][I][worker.cc:76] [worker] Listening on: 
0.0.0.0:6666
   [2025-10-20T03:03:41.382836+00:00][I][worker.cc:76] [worker] Listening on: 
0.0.0.0:6666
   [2025-10-20T03:03:41.383209+00:00][I][worker.cc:76] [worker] Listening on: 
0.0.0.0:6666
   [2025-10-20T03:03:41.383604+00:00][I][worker.cc:76] [worker] Listening on: 
0.0.0.0:6666
   [2025-10-20T03:03:41.384461+00:00][W][replication.cc:430] Clean old synced 
checkpoint successfully
   [2025-10-20T03:03:41.384802+00:00][I][worker.cc:596] [worker] Thread 
#140388142089920 started
   [2025-10-20T03:03:41.384949+00:00][I][worker.cc:596] [worker] Thread 
#140388150482624 started
   [2025-10-20T03:03:41.385115+00:00][I][worker.cc:596] [worker] Thread 
#140388225754816 started
   [2025-10-20T03:03:41.385285+00:00][I][worker.cc:596] [worker] Thread 
#140388217362112 started
   [2025-10-20T03:03:41.385452+00:00][I][worker.cc:596] [worker] Thread 
#140388208969408 started
   [2025-10-20T03:03:41.385619+00:00][I][worker.cc:596] [worker] Thread 
#140388200576704 started
   [2025-10-20T03:03:41.385813+00:00][I][worker.cc:596] [worker] Thread 
#140388192184000 started
   [2025-10-20T03:03:41.385981+00:00][I][worker.cc:596] [worker] Thread 
#140388183791296 started
   [2025-10-20T03:03:41.386295+00:00][I][server.cc:245] [server] Ready to 
accept connections
   [2025-10-20T03:03:41.552330+00:00][I][replication.cc:482] [replication] Auth 
request was sent, waiting for response
   [2025-10-20T03:03:41.715270+00:00][I][replication.cc:498] [replication] Auth 
response was received, continue...
   [2025-10-20T03:03:41.715440+00:00][I][replication.cc:505] [replication] 
Check db name request was sent, waiting for response
   [2025-10-20T03:03:41.876086+00:00][I][replication.cc:525] [replication] DB 
name is valid, continue...
   [2025-10-20T03:03:41.876181+00:00][I][replication.cc:543] [replication] 
replconf request was sent, waiting for response
   [2025-10-20T03:03:42.210064+00:00][I][replication.cc:568] [replication] 
replconf is ok, start psync
   [2025-10-20T03:03:42.210959+00:00][I][replication.cc:598] [replication] Try 
to use psync, next seq: 1
   [2025-10-20T03:03:42.533106+00:00][I][replication.cc:629] [replication] 
Failed to psync, error: -ERR sequence out of range, please use fullsync, switch 
to fullsync
   [2025-10-20T03:03:42.533222+00:00][I][replication.cc:482] [replication] Auth 
request was sent, waiting for response
   [2025-10-20T03:03:42.694377+00:00][I][replication.cc:498] [replication] Auth 
response was received, continue...
   [2025-10-20T03:03:42.694478+00:00][I][replication.cc:752] [replication] 
Start syncing data with fullsync
   [2025-10-20T03:03:42.977450+00:00][I][replication.cc:839] [replication] 
Succeeded fetching full data files info, fetching files in parallel
   [2025-10-20T03:04:11.355202+00:00][I][replication.cc:953] [fetch] Fetched 
000095.log, crc32 0, skip count: 0, fetch count: 1, progress: 1 / 7
   [2025-10-20T03:04:11.367702+00:00][I][replication.cc:953] [fetch] Fetched 
000097.sst, crc32 0, skip count: 0, fetch count: 2, progress: 2 / 7
   [2025-10-20T03:04:11.368381+00:00][I][replication.cc:953] [fetch] Fetched 
MANIFEST-000091, crc32 0, skip count: 0, fetch count: 3, progress: 3 / 7
   [2025-10-20T03:04:11.368541+00:00][I][replication.cc:953] [fetch] Fetched 
CURRENT, crc32 0, skip count: 0, fetch count: 4, progress: 4 / 7
   [2025-10-20T03:04:11.369380+00:00][I][replication.cc:953] [fetch] Fetched 
000090.log, crc32 0, skip count: 0, fetch count: 5, progress: 5 / 7
   [2025-10-20T03:04:12.240074+00:00][I][replication.cc:953] [fetch] Fetched 
000094.sst, crc32 0, skip count: 0, fetch count: 6, progress: 6 / 7
   [2025-10-20T03:04:12.248616+00:00][I][replication.cc:953] [fetch] Fetched 
OPTIONS-000093, crc32 0, skip count: 0, fetch count: 7, progress: 7 / 7
   [2025-10-20T03:04:12.249103+00:00][I][replication.cc:858] [replication] 
Succeeded fetching files in parallel, restoring the backup
   [2025-10-20T03:04:12.249128+00:00][I][server.cc:1506] [server] Disconnecting 
slaves...
   [2025-10-20T03:04:12.249156+00:00][I][server.cc:1522] [server] Waiting 
workers for finishing executing commands...
   [2025-10-20T03:04:12.249280+00:00][I][server.cc:1533] [server] Stopping the 
task runner and clear task queue...
   [2025-10-20T03:04:12.257134+00:00][I][server.cc:1541] [server] Waiting for 
closing DB...
   [2025-10-20T03:04:12.385834+00:00][I][event_listener.cc:187] 
[event_listener/table_file_created] column family: default, file path: 
/var/lib/kvrocks/db/000100.sst, file size: 2066844, job_id: 1, reason: 
recovery, status: OK
   [2025-10-20T03:04:12.387995+00:00][I][event_listener.cc:187] 
[event_listener/table_file_created] column family: metadata, file path: 
/var/lib/kvrocks/db/000101.sst, file size: 17932, job_id: 1, reason: recovery, 
status: OK
   [2025-10-20T03:04:12.400665+00:00][I][event_listener.cc:92] 
[event_listener/compaction_begin] column family: default, job_id: 3, compaction 
reason: FilesMarkedForCompaction, output compression type: no, base input 
level(files): 0(2), output level(files): 6(0), input bytes: 4114724, output 
bytes: 0, is_manual_compaction: no
   [2025-10-20T03:04:12.401004+00:00][I][event_listener.cc:116] 
[event_listener/subcompaction_begin] column family: default, job_id: 3, 
compaction reason: FilesMarkedForCompaction, output compression type: no
   [2025-10-20T03:04:12.401631+00:00][E][compact_filter.cc:131] 
[compact_filter/subkey] Failed to get metadata, namespace: __namespace, key: 
site:1, err: storage is closed
   [2025-10-20T03:04:12.401916+00:00][E][compact_filter.cc:131] 
[compact_filter/subkey] Failed to get metadata, namespace: __namespace, key: 
site:1, err: storage is closed
   [2025-10-20T03:04:12.402194+00:00][E][compact_filter.cc:131] 
[compact_filter/subkey] Failed to get metadata, namespace: __namespace, key: 
site:1, err: storage is closed
   [2025-10-20T03:04:12.402507+00:00][E][compact_filter.cc:131] 
[compact_filter/subkey] Failed to get metadata, namespace: __namespace, key: 
site:1, err: storage is closed
   [2025-10-20T03:04:12.403451+00:00][E][compact_filter.cc:131] 
[compact_filter/subkey] Failed to get metadata, namespace: __namespace, key: 
site:1, err: storage is closed
   [2025-10-20T03:04:12.404077+00:00][E][compact_filter.cc:131] 
[compact_filter/subkey] Failed to get metadata, namespace: __namespace, key: 
site:1, err: storage is closed
   [2025-10-20T03:04:12.404292+00:00][I][storage.cc:407] [storage] Success to 
load the data from disk: 137 ms
   [2025-10-20T03:04:12.406077+00:00][I][replication.cc:874] [replication] 
Succeeded restoring the backup, fullsync was finish
   [2025-10-20T03:04:12.413198+00:00][I][event_listener.cc:187] 
[event_listener/table_file_created] column family: default, file path: 
/var/lib/kvrocks/db/000106.sst, file size: 2075735, job_id: 3, reason: 
compaction, status: OK
   [2025-10-20T03:04:12.413467+00:00][I][event_listener.cc:124] 
[event_listener/subcompaction_completed] column family: default, job_id: 3, 
compaction reason: FilesMarkedForCompaction, output compression type: no, base 
input level(files): 0, output level(files): 6, input bytes: 0, output bytes: 0, 
is_manual_compaction: no, elapsed(micro): 0
   [2025-10-20T03:04:12.415358+00:00][I][event_listener.cc:103] 
[event_listener/compaction_completed] column family: default, job_id: 3, 
compaction reason: FilesMarkedForCompaction, output compression type: no, base 
input level(files): 0(2), output level(files): 6(1), input bytes: 4114724, 
output bytes: 2075735, is_manual_compaction: no, elapsed(micro): 12579
   [2025-10-20T03:04:12.415986+00:00][I][event_listener.cc:194] 
[event_listener/table_file_deleted] db: /var/lib/kvrocks/db, sst file: 
/var/lib/kvrocks/db/000094.sst, status: OK
   [2025-10-20T03:04:12.416515+00:00][I][event_listener.cc:194] 
[event_listener/table_file_deleted] db: /var/lib/kvrocks/db, sst file: 
/var/lib/kvrocks/db/000100.sst, status: OK
   [2025-10-20T03:04:12.417135+00:00][I][event_listener.cc:92] 
[event_listener/compaction_begin] column family: metadata, job_id: 4, 
compaction reason: FilesMarkedForCompaction, output compression type: no, base 
input level(files): 0(2), output level(files): 6(0), input bytes: 31235, output 
bytes: 0, is_manual_compaction: no
   [2025-10-20T03:04:12.417314+00:00][I][event_listener.cc:116] 
[event_listener/subcompaction_begin] column family: metadata, job_id: 4, 
compaction reason: FilesMarkedForCompaction, output compression type: no
   [2025-10-20T03:04:12.419093+00:00][I][event_listener.cc:187] 
[event_listener/table_file_created] column family: metadata, file path: 
/var/lib/kvrocks/db/000107.sst, file size: 15258, job_id: 4, reason: 
compaction, status: OK
   [2025-10-20T03:04:12.419281+00:00][I][event_listener.cc:124] 
[event_listener/subcompaction_completed] column family: metadata, job_id: 4, 
compaction reason: FilesMarkedForCompaction, output compression type: no, base 
input level(files): 0, output level(files): 6, input bytes: 0, output bytes: 0, 
is_manual_compaction: no, elapsed(micro): 0
   [2025-10-20T03:04:12.420877+00:00][I][event_listener.cc:103] 
[event_listener/compaction_completed] column family: metadata, job_id: 4, 
compaction reason: FilesMarkedForCompaction, output compression type: no, base 
input level(files): 0(2), output level(files): 6(1), input bytes: 31235, output 
bytes: 15258, is_manual_compaction: no, elapsed(micro): 2076
   [2025-10-20T03:04:12.421158+00:00][I][event_listener.cc:194] 
[event_listener/table_file_deleted] db: /var/lib/kvrocks/db, sst file: 
/var/lib/kvrocks/db/000097.sst, status: OK
   [2025-10-20T03:04:12.421339+00:00][I][event_listener.cc:194] 
[event_listener/table_file_deleted] db: /var/lib/kvrocks/db, sst file: 
/var/lib/kvrocks/db/000101.sst, status: OK
   [2025-10-20T03:04:12.570575+00:00][I][replication.cc:482] [replication] Auth 
request was sent, waiting for response
   [2025-10-20T03:04:12.730508+00:00][I][replication.cc:498] [replication] Auth 
response was received, continue...
   [2025-10-20T03:04:12.730842+00:00][I][replication.cc:482] [replication] Auth 
request was sent, waiting for response
   [2025-10-20T03:04:12.892167+00:00][I][replication.cc:498] [replication] Auth 
response was received, continue...
   [2025-10-20T03:04:12.892575+00:00][I][replication.cc:505] [replication] 
Check db name request was sent, waiting for response
   [2025-10-20T03:04:13.056552+00:00][I][replication.cc:525] [replication] DB 
name is valid, continue...
   [2025-10-20T03:04:13.056884+00:00][I][replication.cc:543] [replication] 
replconf request was sent, waiting for response
   [2025-10-20T03:04:13.220553+00:00][I][replication.cc:568] [replication] 
replconf is ok, start psync
   [2025-10-20T03:04:13.244140+00:00][I][replication.cc:598] [replication] Try 
to use psync, next seq: 10348
   [2025-10-20T03:04:13.406248+00:00][I][replication.cc:633] [replication] 
PSync is ok, start increment batch loop
   [2025-10-20T03:04:41.644379+00:00][I][compaction_checker.cc:35] [compaction 
checker] Start to compact the column family: pubsub
   [2025-10-20T03:04:41.644519+00:00][I][compaction_checker.cc:38] [compaction 
checker] Compact the column family: pubsub finished, result: OK
   [2025-10-20T03:04:41.644539+00:00][I][compaction_checker.cc:35] [compaction 
checker] Start to compact the column family: propagate
   [2025-10-20T03:04:41.644568+00:00][I][compaction_checker.cc:38] [compaction 
checker] Compact the column family: propagate finished, result: OK
   ```
   
   After this, both the master and its replica remain stuck in a zombie state, 
and their data stays inconsistent over the long term. Synchronization is 
re-established only after restarting either node.
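
The two logs use different UTC offsets (the master logs in `+08:00`, the replica in `+00:00`), so it can help to normalize both to UTC before comparing events. The following is only a minimal sketch for doing that. It assumes each log entry has first been rejoined into a single line (the entries above are hard-wrapped) and that every entry starts with the `[timestamp][level][file:line]` prefix seen above. The names `parse_line` and `merge_logs` are hypothetical helpers for illustration, not part of kvrocks.

```python
import re
from datetime import datetime, timezone

# Matches the "[timestamp][level][file:line] message" prefix used in the logs above.
PREFIX = re.compile(
    r"\[(?P<ts>[^\]]+)\]\[(?P<level>[A-Z])\]\[(?P<src>[^\]]+)\]\s*(?P<msg>.*)"
)


def parse_line(line):
    """Return (utc_timestamp, level, source, message), or None if the line has no prefix."""
    m = PREFIX.match(line.strip())
    if m is None:
        return None
    # fromisoformat understands the "+08:00" / "+00:00" offsets; convert both to UTC.
    ts = datetime.fromisoformat(m.group("ts")).astimezone(timezone.utc)
    return ts, m.group("level"), m.group("src"), m.group("msg")


def merge_logs(master_lines, replica_lines):
    """Interleave entries from both nodes in UTC order as (timestamp, role, message)."""
    events = []
    for role, lines in (("master", master_lines), ("replica", replica_lines)):
        for line in lines:
            parsed = parse_line(line)
            if parsed is not None:
                events.append((parsed[0], role, parsed[3]))
    return sorted(events, key=lambda event: event[0])


if __name__ == "__main__":
    master = [
        "[2025-10-20T11:03:42.697981+08:00][E][replication.cc:114] Ping slave "
        "[10.8.8.13:55732] err: Broken pipe, would stop the thread",
    ]
    replica = [
        "[2025-10-20T03:03:42.694478+00:00][I][replication.cc:752] [replication] "
        "Start syncing data with fullsync",
    ]
    for ts, role, msg in merge_logs(master, replica):
        print(ts.isoformat(), role, msg)
```

Sorting by the normalized timestamp puts entries from both nodes on one timeline, which makes it easier to see which side acted first around the fullsync. This assumes the two hosts' clocks are reasonably in sync, which the logs alone do not confirm.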

