ruojieranyishen commented on issue #2006: URL: https://github.com/apache/incubator-pegasus/issues/2006#issuecomment-2126874200
The aforementioned phenomenon is one of issues triggered by bulkload download. # Phenomenon 1. Doing bulkload (download sst file stage) with any action which need to **restart ONE node**,may cause ALL nodes coredump. 2. **bulkload file missing** also cause many nodes coredump 3. Phenomenon 1 and 2 both cause tcmalloc report `large alloc 2560917504` # Reason After execute the `clear_bulk_load_states` function, `download_sst_file` tasks still remain, which causes above phenomenon. ## Case 1 With Phenomenon 1 **operation**:we restart one node. ballot increase,function `clear_bulk_load_states_if_needed()` clear 88.5 replica **_metadata.files** at 15:17:56.753 ``` D2024-05-20 15:17:56.753 (1716189476753079718 146668) replica.replica13.0404000d0000005d: replica_config.cpp:819:update_local_configuration(): [email protected]:27101: update ballot to init file from 3 to 4 OK D2024-05-20 15:17:56.753 (1716189476753147052 146668) replica.replica13.0404000d0000005d:clear_bulk_load_states_if_needed(): [[email protected]:27101] prepare to clear bulk load states, current status = replication::bulk_load_status::BLS_DOWNLOADING D2024-05-20 15:17:56.753 (1716189476753464144 146668) replica.replica13.0404000d0000005d: replica_config.cpp:1045:update_local_configuration(): [email protected]:27101: status change replication::partition_status::PS_INACTIVE @ 3 => replication::partition_status::PS_PRIMARY @ 4, pre(1, 0), app(0, 0), duration = 3 ms, replica_configuration(pid=88.5, ballot=4, primary=10.142.98.52:27101, status=3, learner_signature=0, pop_all=0, split_sync_to_child=0) ``` **But at 15:17:56.873, the 88.5 replicais still downloading sst file, cause core. ** D2024-05-20 15:17:56.873 (1716189476873362400 146626) replica.default7.04010007000000ca: block_service_manager.cpp:181:download_file(): download file(/home/work/ssd2/pegasus/c3tst-performance2/replica/reps/88.5.pegasus/bulk_load/33.sst) succeed, file_size = 65930882, md5 = 7a4d3da9250f52b4e31095c1d7042c2f D2024-05-20 15:17:58.348 (1716189478348326864 146626) replica.default7.04010007000000ca: replica_bulk_loader.cpp:479:**download_sst_file**(): [[email protected]:27101] download_sst_file remote_dir /user/s_pegasus/lpfsplit/c3tst-performance2/ingest_p32_10G/5 ,local_dir /home/work/ssd2/pegasus/c3tst-performance2/replica/reps/88.5.pegasus/bulk_load,f_meta.name 33.sst ## Case 2 With Phenomenon 2 **operation**:app ingest_p4_10G partition 1,bulkload file missing 88.sst,89.sst,90.sst,93.sst ``` [general] app_name : ingest_p4_10G app_id : 95 partition_count : 4 max_replica_count : 3 [replicas] pidx ballot replica_count primary secondaries 0 8 3/3 c3-hadoop-pegasus-tst-st01.bj:27101 [c3-hadoop-pegasus-tst-st03.bj:27101,c3-hadoop-pegasus-tst-st05.bj:27101] 1 7 3/3 c3-hadoop-pegasus-tst-st03.bj:27101 [c3-hadoop-pegasus-tst-st01.bj:27101,c3-hadoop-pegasus-tst-st02.bj:27101] 2 8 3/3 c3-hadoop-pegasus-tst-st04.bj:27101 [c3-hadoop-pegasus-tst-st03.bj:27101,c3-hadoop-pegasus-tst-st02.bj:27101] 3 3 3/3 c3-hadoop-pegasus-tst-st01.bj:27101 [c3-hadoop-pegasus-tst-st03.bj:27101,c3-hadoop-pegasus-tst-st04.bj:27101] ``` ### primary replica primary replica failed to download file(88.sst) ,and stop downloading all sst file. ``` log.1.txt:E2024-05-22 14:28:11.231 (1716359291231595252 102084) replica.default1.040100090000072b: replica_bulk_loader.cpp:520:download_sst_file(): [[email protected]:27101] failed to download file(88.sst), error = ERR_CORRUPTION ``` But meta says continue downloading. 
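Both cases reduce to the same hazard: an asynchronous `download_sst_file` task keeps an index into `_metadata.files` after `clear_bulk_load_states` has emptied it. The standalone sketch below (simplified stand-in types, not the actual Pegasus code) shows the pattern and the bounds guard a stale task would need; without the guard, `files[file_index]` reads a `file_meta` out of freed vector storage, which is the undefined behavior behind the garbage name lengths reported later.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Simplified stand-in for Pegasus' file_meta (illustrative only).
struct file_meta {
    std::string name;
    int64_t size;
};

struct bulk_loader {
    std::vector<file_meta> files; // stand-in for _metadata.files

    // Stand-in for clear_bulk_load_states(): wipes the metadata while
    // previously scheduled download tasks may still be in flight.
    void clear_bulk_load_states() { files.clear(); }

    // Stand-in for download_sst_file(): runs asynchronously, so it can
    // fire after the clear. Indexing the cleared vector is undefined
    // behavior; in the crash logs it produced a garbage file_meta whose
    // name length was hundreds of megabytes.
    void download_sst_file(size_t file_index) {
        if (file_index >= files.size()) {
            // Guard a stale task would need to survive the clear.
            std::printf("stale task: metadata already cleared\n");
            return;
        }
        const file_meta &f_meta = files[file_index];
        std::printf("downloading %s (%lld bytes)\n",
                    f_meta.name.c_str(), (long long)f_meta.size);
    }
};

int main() {
    bulk_loader loader;
    loader.files = {{"33.sst", 65930882}};

    loader.clear_bulk_load_states(); // ballot change / BLS_FAILED path
    loader.download_sst_file(0);     // in-flight task fires afterwards
    return 0;
}
```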
## Case 2 (Phenomenon 2)

**Operation**: bulkload app `ingest_p4_10G`; partition 1 is missing bulkload files 88.sst, 89.sst, 90.sst, and 93.sst.

```
[general]
app_name          : ingest_p4_10G
app_id            : 95
partition_count   : 4
max_replica_count : 3

[replicas]
pidx  ballot  replica_count  primary                              secondaries
0     8       3/3            c3-hadoop-pegasus-tst-st01.bj:27101  [c3-hadoop-pegasus-tst-st03.bj:27101,c3-hadoop-pegasus-tst-st05.bj:27101]
1     7       3/3            c3-hadoop-pegasus-tst-st03.bj:27101  [c3-hadoop-pegasus-tst-st01.bj:27101,c3-hadoop-pegasus-tst-st02.bj:27101]
2     8       3/3            c3-hadoop-pegasus-tst-st04.bj:27101  [c3-hadoop-pegasus-tst-st03.bj:27101,c3-hadoop-pegasus-tst-st02.bj:27101]
3     3       3/3            c3-hadoop-pegasus-tst-st01.bj:27101  [c3-hadoop-pegasus-tst-st03.bj:27101,c3-hadoop-pegasus-tst-st04.bj:27101]
```

### Primary replica

The primary replica failed to download file 88.sst and stopped downloading all sst files:

```
log.1.txt:E2024-05-22 14:28:11.231 (1716359291231595252 102084) replica.default1.040100090000072b: replica_bulk_loader.cpp:520:download_sst_file(): [[email protected]:27101] failed to download file(88.sst), error = ERR_CORRUPTION
```

But meta told it to continue downloading:

```
D2024-05-22 14:28:18.983 (1716359298983653491 102121) replica.replica2.04008ebc00010f3f: replica_bulk_loader.cpp:71:on_bulk_load(): [[email protected]:27101] receive bulk load request, remote provider = hdfs_zjy, remote_root_path = /user/s_pegasus/lpfsplit, cluster_name = c3tst-performance2, app_name = ingest_p4_10G, meta_bulk_load_status = replication::bulk_load_status::BLS_DOWNLOADING, local bulk_load_status = replication::bulk_load_status::BLS_DOWNLOADING
```

The primary replica reported the group's download progress to meta:

```
D2024-05-22 14:28:18.983 (1716359298983689828 102121) replica.replica2.04008ebc00010f3f: replica_bulk_loader.cpp:879:report_group_download_progress(): [[email protected]:27101] primary = 10.142.102.47:27101, download progress = 89%, status = ERR_CORRUPTION
D2024-05-22 14:28:18.983 (1716359298983703147 102121) replica.replica2.04008ebc00010f3f: replica_bulk_loader.cpp:892:report_group_download_progress(): [[email protected]:27101] secondary = 10.142.98.52:27101, download progress = 88%, status=ERR_OK
D2024-05-22 14:28:18.983 (1716359298983714700 102121) replica.replica2.04008ebc00010f3f: replica_bulk_loader.cpp:892:report_group_download_progress(): [[email protected]:27101] secondary = 10.142.97.9:27101, download progress = 88%, status=ERR_OK
```

Meta then said to stop downloading (BLS_FAILED), and the replica **cleared _metadata.files**. However, not all download tasks were terminated:

```
D2024-05-22 14:28:28.988 (1716359308988487559 102121) replica.replica2.04008ebc00010f46: replica_bulk_loader.cpp:71:on_bulk_load(): [[email protected]:27101] receive bulk load request, remote provider = hdfs_zjy, remote_root_path = /user/s_pegasus/lpfsplit, cluster_name = c3tst-performance2, app_name = ingest_p4_10G, meta_bulk_load_status = replication::bulk_load_status::BLS_FAILED, local bulk_load_status = replication::bulk_load_status::BLS_DOWNLOADING
```

At 14:28:29, a `download_sst_file` task still existed; it accessed `_metadata.files` and caused the core:

```
D2024-05-22 14:28:29.529 (1716359309529341231 102089) replica.default6.04010000000007b5: replica_bulk_loader.cpp:479:download_sst_file(): [[email protected]:27101] download_sst_file remote_dir /user/s_pegasus/lpfsplit/c3tst-performance2/ingest_p4_10G/0 ,local_dir /home/work/ssd1/pegasus/c3tst-performance2/replica/reps/95.0.pegasus/bulk_load,f_meta.name 92.sst
F2024-05-22 14:28:29.536 (1716359309536349665 102089) replica.default6.04010000000007b5: filesystem.cpp:111:get_normalized_path(): assertion expression: len <= 4086
```
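The ordering bug is visible in these logs: the state is cleared first, and the in-flight tasks fire afterwards. One fix direction, sketched below with plain `std::async`/`std::atomic` as a hypothetical stand-in (not the actual rDSN tasking API Pegasus uses), is to signal cancellation and drain every download task before touching `_metadata.files`:

```cpp
#include <atomic>
#include <cstdio>
#include <future>
#include <vector>

struct downloader {
    std::atomic<bool> cancelled{false};
    std::vector<std::future<void>> tasks;

    // Stand-in for download_sst_file(): checks cancellation first, so a
    // task scheduled before the clear exits without touching metadata.
    void download_sst_file(int idx) {
        if (cancelled.load()) {
            return;
        }
        std::printf("downloading file %d\n", idx);
    }

    void start() {
        for (int i = 0; i < 4; ++i) {
            tasks.emplace_back(std::async(
                std::launch::async, [this, i] { download_sst_file(i); }));
        }
    }

    // Stand-in for clear_bulk_load_states(): signal cancellation, wait
    // for every task to finish, and only then release the metadata.
    void clear_bulk_load_states() {
        cancelled.store(true);
        for (auto &t : tasks) {
            t.wait(); // drain in-flight downloads
        }
        tasks.clear();
        // ... only now is it safe to clear _metadata.files ...
    }
};

int main() {
    downloader d;
    d.start();
    d.clear_bulk_load_states();
    return 0;
}
```

If the loader keeps handles to its download tasks, the equivalent fix in Pegasus would be to cancel and wait on them inside `clear_bulk_load_states` before `_metadata.files` is released.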
### Secondary replica

A secondary received the primary's group_bulk_load message to cancel the bulkload task and cleared `_metadata.files`, but it did not terminate all download tasks either, so it core dumped as well. Other replicas core dumped for the same reason, which is why many replica servers went down at once.

```
D2024-05-22 14:28:28.992 (1716359308992917139 159129) replica.replica2.04006d6800010139: replica_bulk_loader.cpp:183:on_group_bulk_load(): [[email protected]:27101] receive group_bulk_load request, primary address = 10.142.102.47:27101, ballot = 7, meta bulk_load_status = replication::bulk_load_status::BLS_FAILED, local bulk_load_status = replication::bulk_load_status::BLS_DOWNLOADING
```

```
D2024-05-22 14:28:30.384 (1716359310384983585 159094) replica.default5.040100080000056a: replica_bulk_loader.cpp:479:download_sst_file(): [[email protected]:27101] download_sst_file remote_dir /user/s_pegasus/lpfsplit/c3tst-performance2/ingest_p4_10G/0 ,local_dir /home/work/ssd2/pegasus/c3tst-performance2/replica/reps/95.0.pegasus/bulk_load,f_meta.name 20
F2024-05-22 14:28:30.452 (1716359310452229248 159094) replica.default5.040100080000056a: filesystem.cpp:111:get_normalized_path(): assertion expression: len <= 4086
```

## tcmalloc reports large alloc

Because `_metadata.files` has already been cleared, the `f_meta.name` read in `download_sst_file` is garbage, and its length can be enormous:

```
const file_meta &f_meta = _metadata.files[file_index];
const std::string &file_name = utils::filesystem::path_combine(local_dir, f_meta.name);
```

```
log.1.txt:F2024-05-20 17:22:49.621 (1716196969621641630 170503) replica.default11.0401000b000000de: filesystem.cpp:111:get_normalized_path(): lpf path chao chu 4096, get_normalized_path LEN 410828079
log.2.txt:F2024-05-20 17:23:38.595 (1716197018595730772 192879) replica.default10.040100040000002e: filesystem.cpp:111:get_normalized_path(): lpf path chao chu 4096, get_normalized_path LEN 532715376
log.1.txt:F2024-05-20 17:22:50.77 (1716196970077285996 164438) replica.default11.0401000b000000c6: filesystem.cpp:111:get_normalized_path(): lpf path chao chu 4096, get_normalized_path LEN 383022703
```

(The "lpf path chao chu 4096" lines come from an instrumented assertion message; "chao chu" is pinyin for "exceeds", i.e. the computed path length exceeds 4096.)
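The "large alloc" report follows directly from that garbage length: `path_combine` concatenates `local_dir` and `f_meta.name`, so it asks the allocator for one buffer of roughly the garbage length in a single request. A tiny sketch of the arithmetic (the 410828079 value is copied from the log above; the directory path and everything else are illustrative):

```cpp
#include <cstdio>
#include <string>

int main() {
    // Garbage string length read from a cleared _metadata.files entry
    // (value copied from the crash log above).
    size_t garbage_len = 410828079;

    std::string local_dir =
        "/home/work/ssd1/pegasus/c3tst-performance2/replica/reps/95.0.pegasus/bulk_load";

    // The equivalent of path_combine(local_dir, f_meta.name): the result
    // must hold both parts, so this is one ~400 MB allocation request.
    // When the garbage length exceeds tcmalloc's reporting threshold (as
    // with the 2560917504-byte case in the logs), tcmalloc prints the
    // "large alloc" warning; get_normalized_path then aborts because the
    // length is far beyond its len <= 4086 assertion.
    std::string file_name;
    file_name.reserve(local_dir.size() + 1 + garbage_len);
    std::printf("one allocation request of at least %zu bytes\n",
                file_name.capacity());
    return 0;
}
```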
