ruojieranyishen commented on issue #2006: URL: https://github.com/apache/incubator-pegasus/issues/2006#issuecomment-2126874200
The aforementioned phenomenon is one of issues triggered by bulkload download. # Phenomenon 1. Doing bulkload (download sst file stage) with any action which need to **restart ONE node**,may cause ALL nodes coredump. 2. **bulkload file missing** also cause many nodes coredump 3. Phenomenon 1 and 2 both cause tcmalloc report `large alloc 2560917504` # Reason After execute the `clear_bulk_load_states` function, `download_sst_file` tasks still remain, which causes above phenomenon. ## Case 1 With Phenomenon 1 **operation**:we restart one node. ballot increase,function `clear_bulk_load_states_if_needed()` clear 88.5 replica **_metadata.files** at 15:17:56.753 ``` D2024-05-20 15:17:56.753 (1716189476753079718 146668) replica.replica13.0404000d0000005d: replica_config.cpp:819:update_local_configuration(): [email protected]:27101: update ballot to init file from 3 to 4 OK D2024-05-20 15:17:56.753 (1716189476753147052 146668) replica.replica13.0404000d0000005d:clear_bulk_load_states_if_needed(): [[email protected]:27101] prepare to clear bulk load states, current status = replication::bulk_load_status::BLS_DOWNLOADING D2024-05-20 15:17:56.753 (1716189476753464144 146668) replica.replica13.0404000d0000005d: replica_config.cpp:1045:update_local_configuration(): [email protected]:27101: status change replication::partition_status::PS_INACTIVE @ 3 => replication::partition_status::PS_PRIMARY @ 4, pre(1, 0), app(0, 0), duration = 3 ms, replica_configuration(pid=88.5, ballot=4, primary=10.142.98.52:27101, status=3, learner_signature=0, pop_all=0, split_sync_to_child=0) ``` **But at 15:17:56.873, the 88.5 replicais still downloading sst file, cause core. ** D2024-05-20 15:17:56.873 (1716189476873362400 146626) replica.default7.04010007000000ca: block_service_manager.cpp:181:download_file(): download file(/home/work/ssd2/pegasus/c3tst-performance2/replica/reps/88.5.pegasus/bulk_load/33.sst) succeed, file_size = 65930882, md5 = 7a4d3da9250f52b4e31095c1d7042c2f D2024-05-20 15:17:58.348 (1716189478348326864 146626) replica.default7.04010007000000ca: replica_bulk_loader.cpp:479:**download_sst_file**(): [[email protected]:27101] download_sst_file remote_dir /user/s_pegasus/lpfsplit/c3tst-performance2/ingest_p32_10G/5 ,local_dir /home/work/ssd2/pegasus/c3tst-performance2/replica/reps/88.5.pegasus/bulk_load,f_meta.name 33.sst ## Case 2 With Phenomenon 2 **operation**:app ingest_p4_10G partition 1,bulkload file missing 88.sst,89.sst,90.sst,93.sst ``` [general] app_name : ingest_p4_10G app_id : 95 partition_count : 4 max_replica_count : 3 [replicas] pidx ballot replica_count primary secondaries 0 8 3/3 c3-hadoop-pegasus-tst-st01.bj:27101 [c3-hadoop-pegasus-tst-st03.bj:27101,c3-hadoop-pegasus-tst-st05.bj:27101] 1 7 3/3 c3-hadoop-pegasus-tst-st03.bj:27101 [c3-hadoop-pegasus-tst-st01.bj:27101,c3-hadoop-pegasus-tst-st02.bj:27101] 2 8 3/3 c3-hadoop-pegasus-tst-st04.bj:27101 [c3-hadoop-pegasus-tst-st03.bj:27101,c3-hadoop-pegasus-tst-st02.bj:27101] 3 3 3/3 c3-hadoop-pegasus-tst-st01.bj:27101 [c3-hadoop-pegasus-tst-st03.bj:27101,c3-hadoop-pegasus-tst-st04.bj:27101] ``` ### primary replica primary replica failed to download file(88.sst) ,and stop downloading all sst file. ``` log.1.txt:E2024-05-22 14:28:11.231 (1716359291231595252 102084) replica.default1.040100090000072b: replica_bulk_loader.cpp:520:download_sst_file(): [[email protected]:27101] failed to download file(88.sst), error = ERR_CORRUPTION ``` But meta says continue downloading. 
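Both cases reduce to the same hazard: an asynchronous `download_sst_file` task keeps an index into `_metadata.files` after `clear_bulk_load_states` has emptied it. The standalone sketch below (simplified stand-in types, not the actual Pegasus code) shows the pattern and the bounds guard a stale task would need; without the guard, `files[file_index]` reads a `file_meta` out of freed vector storage, which is the undefined behavior behind the garbage name lengths reported later.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Simplified stand-in for Pegasus' file_meta (illustrative only).
struct file_meta {
    std::string name;
    int64_t size;
};

struct bulk_loader {
    std::vector<file_meta> files; // stand-in for _metadata.files

    // Stand-in for clear_bulk_load_states(): wipes the metadata while
    // previously scheduled download tasks may still be in flight.
    void clear_bulk_load_states() { files.clear(); }

    // Stand-in for download_sst_file(): runs asynchronously, so it can
    // fire after the clear. Indexing the cleared vector is undefined
    // behavior; in the crash logs it produced a garbage file_meta whose
    // name length was hundreds of megabytes.
    void download_sst_file(size_t file_index) {
        if (file_index >= files.size()) {
            // Guard a stale task would need to survive the clear.
            std::printf("stale task: metadata already cleared\n");
            return;
        }
        const file_meta &f_meta = files[file_index];
        std::printf("downloading %s (%lld bytes)\n",
                    f_meta.name.c_str(), (long long)f_meta.size);
    }
};

int main() {
    bulk_loader loader;
    loader.files = {{"33.sst", 65930882}};

    loader.clear_bulk_load_states(); // ballot change / BLS_FAILED path
    loader.download_sst_file(0);     // in-flight task fires afterwards
    return 0;
}
```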
## Case 2 (Phenomenon 2)

**Operation**: bulkload app `ingest_p4_10G`; partition 1 is missing bulkload files 88.sst, 89.sst, 90.sst, and 93.sst.

```
[general]
app_name          : ingest_p4_10G
app_id            : 95
partition_count   : 4
max_replica_count : 3

[replicas]
pidx  ballot  replica_count  primary                              secondaries
0     8       3/3            c3-hadoop-pegasus-tst-st01.bj:27101  [c3-hadoop-pegasus-tst-st03.bj:27101,c3-hadoop-pegasus-tst-st05.bj:27101]
1     7       3/3            c3-hadoop-pegasus-tst-st03.bj:27101  [c3-hadoop-pegasus-tst-st01.bj:27101,c3-hadoop-pegasus-tst-st02.bj:27101]
2     8       3/3            c3-hadoop-pegasus-tst-st04.bj:27101  [c3-hadoop-pegasus-tst-st03.bj:27101,c3-hadoop-pegasus-tst-st02.bj:27101]
3     3       3/3            c3-hadoop-pegasus-tst-st01.bj:27101  [c3-hadoop-pegasus-tst-st03.bj:27101,c3-hadoop-pegasus-tst-st04.bj:27101]
```

### Primary replica

The primary replica failed to download file 88.sst and stopped downloading all sst files:

```
log.1.txt:E2024-05-22 14:28:11.231 (1716359291231595252 102084) replica.default1.040100090000072b: replica_bulk_loader.cpp:520:download_sst_file(): [[email protected]:27101] failed to download file(88.sst), error = ERR_CORRUPTION
```

But meta told it to continue downloading:

```
D2024-05-22 14:28:18.983 (1716359298983653491 102121) replica.replica2.04008ebc00010f3f: replica_bulk_loader.cpp:71:on_bulk_load(): [[email protected]:27101] receive bulk load request, remote provider = hdfs_zjy, remote_root_path = /user/s_pegasus/lpfsplit, cluster_name = c3tst-performance2, app_name = ingest_p4_10G, meta_bulk_load_status = replication::bulk_load_status::BLS_DOWNLOADING, local bulk_load_status = replication::bulk_load_status::BLS_DOWNLOADING
```

The primary replica reported the group's download progress to meta:

```
D2024-05-22 14:28:18.983 (1716359298983689828 102121) replica.replica2.04008ebc00010f3f: replica_bulk_loader.cpp:879:report_group_download_progress(): [[email protected]:27101] primary = 10.142.102.47:27101, download progress = 89%, status = ERR_CORRUPTION
D2024-05-22 14:28:18.983 (1716359298983703147 102121) replica.replica2.04008ebc00010f3f: replica_bulk_loader.cpp:892:report_group_download_progress(): [[email protected]:27101] secondary = 10.142.98.52:27101, download progress = 88%, status=ERR_OK
D2024-05-22 14:28:18.983 (1716359298983714700 102121) replica.replica2.04008ebc00010f3f: replica_bulk_loader.cpp:892:report_group_download_progress(): [[email protected]:27101] secondary = 10.142.97.9:27101, download progress = 88%, status=ERR_OK
```

Meta then said to stop downloading (BLS_FAILED), and the replica **cleared _metadata.files**. However, not all download tasks were terminated:

```
D2024-05-22 14:28:28.988 (1716359308988487559 102121) replica.replica2.04008ebc00010f46: replica_bulk_loader.cpp:71:on_bulk_load(): [[email protected]:27101] receive bulk load request, remote provider = hdfs_zjy, remote_root_path = /user/s_pegasus/lpfsplit, cluster_name = c3tst-performance2, app_name = ingest_p4_10G, meta_bulk_load_status = replication::bulk_load_status::BLS_FAILED, local bulk_load_status = replication::bulk_load_status::BLS_DOWNLOADING
```

At 14:28:29, a `download_sst_file` task still existed; it accessed `_metadata.files` and caused the core:

```
D2024-05-22 14:28:29.529 (1716359309529341231 102089) replica.default6.04010000000007b5: replica_bulk_loader.cpp:479:download_sst_file(): [[email protected]:27101] download_sst_file remote_dir /user/s_pegasus/lpfsplit/c3tst-performance2/ingest_p4_10G/0 ,local_dir /home/work/ssd1/pegasus/c3tst-performance2/replica/reps/95.0.pegasus/bulk_load,f_meta.name 92.sst
F2024-05-22 14:28:29.536 (1716359309536349665 102089) replica.default6.04010000000007b5: filesystem.cpp:111:get_normalized_path(): assertion expression: len <= 4086
```
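The ordering bug is visible in these logs: the state is cleared first, and the in-flight tasks fire afterwards. One fix direction, sketched below with plain `std::async`/`std::atomic` as a hypothetical stand-in (not the actual rDSN tasking API Pegasus uses), is to signal cancellation and drain every download task before touching `_metadata.files`:

```cpp
#include <atomic>
#include <cstdio>
#include <future>
#include <vector>

struct downloader {
    std::atomic<bool> cancelled{false};
    std::vector<std::future<void>> tasks;

    // Stand-in for download_sst_file(): checks cancellation first, so a
    // task scheduled before the clear exits without touching metadata.
    void download_sst_file(int idx) {
        if (cancelled.load()) {
            return;
        }
        std::printf("downloading file %d\n", idx);
    }

    void start() {
        for (int i = 0; i < 4; ++i) {
            tasks.emplace_back(std::async(
                std::launch::async, [this, i] { download_sst_file(i); }));
        }
    }

    // Stand-in for clear_bulk_load_states(): signal cancellation, wait
    // for every task to finish, and only then release the metadata.
    void clear_bulk_load_states() {
        cancelled.store(true);
        for (auto &t : tasks) {
            t.wait(); // drain in-flight downloads
        }
        tasks.clear();
        // ... only now is it safe to clear _metadata.files ...
    }
};

int main() {
    downloader d;
    d.start();
    d.clear_bulk_load_states();
    return 0;
}
```

If the loader keeps handles to its download tasks, the equivalent fix in Pegasus would be to cancel and wait on them inside `clear_bulk_load_states` before `_metadata.files` is released.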
### Secondary replica

A secondary received the primary's group_bulk_load message to cancel the bulkload task and cleared `_metadata.files`, but it did not terminate all download tasks either, so it core dumped as well. Other replicas core dumped for the same reason, which is why many replica servers went down at once.

```
D2024-05-22 14:28:28.992 (1716359308992917139 159129) replica.replica2.04006d6800010139: replica_bulk_loader.cpp:183:on_group_bulk_load(): [[email protected]:27101] receive group_bulk_load request, primary address = 10.142.102.47:27101, ballot = 7, meta bulk_load_status = replication::bulk_load_status::BLS_FAILED, local bulk_load_status = replication::bulk_load_status::BLS_DOWNLOADING
```

```
D2024-05-22 14:28:30.384 (1716359310384983585 159094) replica.default5.040100080000056a: replica_bulk_loader.cpp:479:download_sst_file(): [[email protected]:27101] download_sst_file remote_dir /user/s_pegasus/lpfsplit/c3tst-performance2/ingest_p4_10G/0 ,local_dir /home/work/ssd2/pegasus/c3tst-performance2/replica/reps/95.0.pegasus/bulk_load,f_meta.name 20
F2024-05-22 14:28:30.452 (1716359310452229248 159094) replica.default5.040100080000056a: filesystem.cpp:111:get_normalized_path(): assertion expression: len <= 4086
```

## tcmalloc reports large alloc

Because `_metadata.files` has already been cleared, the `f_meta.name` read in `download_sst_file` is garbage, and its length can be enormous:

```
const file_meta &f_meta = _metadata.files[file_index];
const std::string &file_name = utils::filesystem::path_combine(local_dir, f_meta.name);
```

```
log.1.txt:F2024-05-20 17:22:49.621 (1716196969621641630 170503) replica.default11.0401000b000000de: filesystem.cpp:111:get_normalized_path(): lpf path chao chu 4096, get_normalized_path LEN 410828079
log.2.txt:F2024-05-20 17:23:38.595 (1716197018595730772 192879) replica.default10.040100040000002e: filesystem.cpp:111:get_normalized_path(): lpf path chao chu 4096, get_normalized_path LEN 532715376
log.1.txt:F2024-05-20 17:22:50.77 (1716196970077285996 164438) replica.default11.0401000b000000c6: filesystem.cpp:111:get_normalized_path(): lpf path chao chu 4096, get_normalized_path LEN 383022703
```

(The "lpf path chao chu 4096" lines come from an instrumented assertion message; "chao chu" is pinyin for "exceeds", i.e. the computed path length exceeds 4096.)
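The "large alloc" report follows directly from that garbage length: `path_combine` concatenates `local_dir` and `f_meta.name`, so it asks the allocator for one buffer of roughly the garbage length in a single request. A tiny sketch of the arithmetic (the 410828079 value is copied from the log above; the directory path and everything else are illustrative):

```cpp
#include <cstdio>
#include <string>

int main() {
    // Garbage string length read from a cleared _metadata.files entry
    // (value copied from the crash log above).
    size_t garbage_len = 410828079;

    std::string local_dir =
        "/home/work/ssd1/pegasus/c3tst-performance2/replica/reps/95.0.pegasus/bulk_load";

    // The equivalent of path_combine(local_dir, f_meta.name): the result
    // must hold both parts, so this is one ~400 MB allocation request.
    // When the garbage length exceeds tcmalloc's reporting threshold (as
    // with the 2560917504-byte case in the logs), tcmalloc prints the
    // "large alloc" warning; get_normalized_path then aborts because the
    // length is far beyond its len <= 4086 assertion.
    std::string file_name;
    file_name.reserve(local_dir.size() + 1 + garbage_len);
    std::printf("one allocation request of at least %zu bytes\n",
                file_name.capacity());
    return 0;
}
```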
