ruojieranyishen opened a new pull request, #2036:
URL: https://github.com/apache/incubator-pegasus/pull/2036

   # What problem does this PR solve?
   Related issue:
   https://github.com/apache/incubator-pegasus/issues/2006
   ### What is changed and how does it work?
   Avoid holding a reference to `_metadata.files`, and add a read lock around access to it.
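
The change described above can be illustrated with a minimal sketch. It is not the code from this PR: apart from `_metadata.files`, every name here (`file_meta`, `bulk_load_metadata`, `bulk_loader_sketch`, `snapshot_files`, `example_usage`) is hypothetical, and `std::shared_mutex` merely stands in for whatever read/write lock the real code uses. The idea is that readers copy the file list under a shared (read) lock instead of keeping a reference that a concurrent clear could invalidate.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real metadata types.
struct file_meta
{
    std::string name;
    int64_t size;
};

struct bulk_load_metadata
{
    std::vector<file_meta> files;
};

class bulk_loader_sketch
{
public:
    // Writer: replace the file list under an exclusive lock.
    void set_files(std::vector<file_meta> files)
    {
        std::unique_lock<std::shared_mutex> guard(_lock);
        _metadata.files = std::move(files);
    }

    // Writer: clear the file list, e.g. when bulk load state is reset.
    void clear_files()
    {
        std::unique_lock<std::shared_mutex> guard(_lock);
        _metadata.files.clear();
    }

    // Reader: take a shared lock and return a copy rather than a reference,
    // so the caller never observes the vector while it is being modified.
    std::vector<file_meta> snapshot_files() const
    {
        std::shared_lock<std::shared_mutex> guard(_lock);
        return _metadata.files;
    }

private:
    mutable std::shared_mutex _lock;
    bulk_load_metadata _metadata;
};

// Shows that a snapshot stays valid after the original list is cleared.
int64_t example_usage()
{
    bulk_loader_sketch loader;
    loader.set_files({file_meta{"1.sst", 10}, file_meta{"2.sst", 20}});
    const std::vector<file_meta> files = loader.snapshot_files();
    loader.clear_files();
    assert(files.size() == 2);
    int64_t total = 0;
    for (const auto &f : files) {
        total += f.size;
    }
    return total;
}
```

Copying under the lock costs an extra allocation per read but keeps the critical section short, which is one way to limit the locking overhead mentioned under side effects.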
   
   # Tests
   - Manual test, because bulk load imports a large amount of data.
   ## Test 1: Bulk load with four SST files missing
   
![image](https://github.com/apache/incubator-pegasus/assets/93246280/71c03c48-7451-4fb6-9029-5ae8808504df)
   After the fix: partition 1 of table `ingest_p4_10G` is missing files, the bulk load fails, and the table's ballots do not increase.
   ```c++
   [2024/5/23 15:11:10] [general]
   [2024/5/23 15:11:10] app_name           : ingest_p4_10G
   [2024/5/23 15:11:10] app_id             : 100          
   [2024/5/23 15:11:10] partition_count    : 4            
   [2024/5/23 15:11:10] max_replica_count  : 3            
   [2024/5/23 15:11:10] 
   [2024/5/23 15:11:10] [replicas]
   [2024/5/23 15:11:10] pidx  ballot  replica_count  primary                              secondaries
   [2024/5/23 15:11:10] 0     3       3/3            c3-hadoop-pegasus-tst-st05.bj:31101  [c3-hadoop-pegasus-tst-st03.bj:31101,c3-hadoop-pegasus-tst-st01.bj:31101]
   [2024/5/23 15:11:10] 1     3       3/3            c3-hadoop-pegasus-tst-st03.bj:31101  [c3-hadoop-pegasus-tst-st05.bj:31101,c3-hadoop-pegasus-tst-st04.bj:31101]
   [2024/5/23 15:11:10] 2     3       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st04.bj:31101,c3-hadoop-pegasus-tst-st03.bj:31101]
   [2024/5/23 15:11:10] 3     3       3/3            c3-hadoop-pegasus-tst-st01.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 15:11:10] 
   [2024/5/23 15:11:10] [nodes]
   [2024/5/23 15:11:10] node                                 primary  secondary  total
   [2024/5/23 15:11:10] c3-hadoop-pegasus-tst-st02.bj:31101  1        1          2
   [2024/5/23 15:11:10] c3-hadoop-pegasus-tst-st01.bj:31101  1        1          2
   [2024/5/23 15:11:10] c3-hadoop-pegasus-tst-st05.bj:31101  1        2          3
   [2024/5/23 15:11:10] c3-hadoop-pegasus-tst-st04.bj:31101  0        2          2
   [2024/5/23 15:11:10] c3-hadoop-pegasus-tst-st03.bj:31101  1        2          3
   
   
   
   [2024/5/23 15:13:12] >>> start_bulk_load -a ingest_p4_10G -c c3tst-performance2 -p hdfs_zjy -r /user/s_pegasus/lpfsplit
   
   [2024/5/23 15:15:58] >>> query_bulk_load_status -a ingest_p4_10G -d
   [2024/5/23 15:15:58] [all partitions]
   [2024/5/23 15:15:58] partition_index  partition_status  is_cleaned_up  
   [2024/5/23 15:15:58] 0                BLS_FAILED        NO             
   [2024/5/23 15:15:58] 1                BLS_FAILED        NO             
   [2024/5/23 15:15:58] 2                BLS_FAILED        NO             
   [2024/5/23 15:15:58] 3                BLS_FAILED        NO    
   
   
   [2024/5/23 15:16:13] [general]
   [2024/5/23 15:16:13] app_name           : ingest_p4_10G
   [2024/5/23 15:16:13] app_id             : 100          
   [2024/5/23 15:16:13] partition_count    : 4            
   [2024/5/23 15:16:13] max_replica_count  : 3            
   [2024/5/23 15:16:13] 
   [2024/5/23 15:16:13] [replicas]
   [2024/5/23 15:16:13] pidx  ballot  replica_count  primary                              secondaries
   [2024/5/23 15:16:13] 0     3       3/3            c3-hadoop-pegasus-tst-st05.bj:31101  [c3-hadoop-pegasus-tst-st03.bj:31101,c3-hadoop-pegasus-tst-st01.bj:31101]
   [2024/5/23 15:16:13] 1     3       3/3            c3-hadoop-pegasus-tst-st03.bj:31101  [c3-hadoop-pegasus-tst-st05.bj:31101,c3-hadoop-pegasus-tst-st04.bj:31101]
   [2024/5/23 15:16:13] 2     3       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st04.bj:31101,c3-hadoop-pegasus-tst-st03.bj:31101]
   [2024/5/23 15:16:13] 3     3       3/3            c3-hadoop-pegasus-tst-st01.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 15:16:13] 
   [2024/5/23 15:16:13] [nodes]
   [2024/5/23 15:16:13] node                                 primary  secondary  total
   [2024/5/23 15:16:13] c3-hadoop-pegasus-tst-st02.bj:31101  1        1          2
   [2024/5/23 15:16:13] c3-hadoop-pegasus-tst-st01.bj:31101  1        1          2
   [2024/5/23 15:16:13] c3-hadoop-pegasus-tst-st05.bj:31101  1        2          3
   [2024/5/23 15:16:13] c3-hadoop-pegasus-tst-st04.bj:31101  0        2          2
   [2024/5/23 15:16:13] c3-hadoop-pegasus-tst-st03.bj:3
   ``` 
   ## Test 2: Restart a node during the bulk load download stage
   After the fix: no repeated core dumps occur across multiple nodes.
   ```c++
   [2024/5/23 15:59:18] >>> app ingest_p32_10G -dr
   [2024/5/23 15:59:18] [parameters]
   [2024/5/23 15:59:18] app_name  : ingest_p32_10G
   [2024/5/23 15:59:18] detailed  : true          
   [2024/5/23 15:59:18] 
   [2024/5/23 15:59:18] [general]
   [2024/5/23 15:59:18] app_name           : ingest_p32_10G
   [2024/5/23 15:59:18] app_id             : 101           
   [2024/5/23 15:59:18] partition_count    : 32            
   [2024/5/23 15:59:18] max_replica_count  : 3             
   [2024/5/23 15:59:18] 
   [2024/5/23 15:59:18] [replicas]
   [2024/5/23 15:59:18] pidx  ballot  replica_count  primary                              secondaries
   [2024/5/23 15:59:18] 0     3       3/3            c3-hadoop-pegasus-tst-st04.bj:31101  [c3-hadoop-pegasus-tst-st05.bj:31101,c3-hadoop-pegasus-tst-st02.bj:31101]
   [2024/5/23 15:59:18] 1     3       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st03.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 15:59:18] 2     3       3/3            c3-hadoop-pegasus-tst-st01.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st03.bj:31101]
   [2024/5/23 15:59:18] 3     3       3/3            c3-hadoop-pegasus-tst-st05.bj:31101  [c3-hadoop-pegasus-tst-st04.bj:31101,c3-hadoop-pegasus-tst-st02.bj:31101]
   [2024/5/23 15:59:18] 4     3       3/3            c3-hadoop-pegasus-tst-st03.bj:31101  [c3-hadoop-pegasus-tst-st05.bj:31101,c3-hadoop-pegasus-tst-st02.bj:31101]
   [2024/5/23 15:59:18] 5     3       3/3            c3-hadoop-pegasus-tst-st04.bj:31101  [c3-hadoop-pegasus-tst-st01.bj:31101,c3-hadoop-pegasus-tst-st03.bj:31101]
   [2024/5/23 15:59:18] 6     3       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st01.bj:31101,c3-hadoop-pegasus-tst-st04.bj:31101]
   [2024/5/23 15:59:18] 7     3       3/3            c3-hadoop-pegasus-tst-st01.bj:31101  [c3-hadoop-pegasus-tst-st05.bj:31101,c3-hadoop-pegasus-tst-st04.bj:31101]
   [2024/5/23 15:59:18] 8     3       3/3            c3-hadoop-pegasus-tst-st05.bj:31101  [c3-hadoop-pegasus-tst-st01.bj:31101,c3-hadoop-pegasus-tst-st02.bj:31101]
   [2024/5/23 15:59:18] 9     3       3/3            c3-hadoop-pegasus-tst-st03.bj:31101  [c3-hadoop-pegasus-tst-st04.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 15:59:18] 10    3       3/3            c3-hadoop-pegasus-tst-st04.bj:31101  [c3-hadoop-pegasus-tst-st03.bj:31101,c3-hadoop-pegasus-tst-st02.bj:31101]
   [2024/5/23 15:59:18] 11    3       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st05.bj:31101,c3-hadoop-pegasus-tst-st01.bj:31101]
   [2024/5/23 15:59:18] 12    3       3/3            c3-hadoop-pegasus-tst-st01.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 15:59:18] 13    3       3/3            c3-hadoop-pegasus-tst-st05.bj:31101  [c3-hadoop-pegasus-tst-st03.bj:31101,c3-hadoop-pegasus-tst-st02.bj:31101]
   [2024/5/23 15:59:18] 14    3       3/3            c3-hadoop-pegasus-tst-st03.bj:31101  [c3-hadoop-pegasus-tst-st04.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 15:59:18] 15    3       3/3            c3-hadoop-pegasus-tst-st04.bj:31101  [c3-hadoop-pegasus-tst-st05.bj:31101,c3-hadoop-pegasus-tst-st02.bj:31101]
   [2024/5/23 15:59:18] 16    3       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st03.bj:31101,c3-hadoop-pegasus-tst-st01.bj:31101]
   [2024/5/23 15:59:18] 17    3       3/3            c3-hadoop-pegasus-tst-st01.bj:31101  [c3-hadoop-pegasus-tst-st04.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 15:59:18] 18    3       3/3            c3-hadoop-pegasus-tst-st05.bj:31101  [c3-hadoop-pegasus-tst-st03.bj:31101,c3-hadoop-pegasus-tst-st01.bj:31101]
   [2024/5/23 15:59:18] 19    3       3/3            c3-hadoop-pegasus-tst-st03.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st04.bj:31101]
   [2024/5/23 15:59:18] 20    3       3/3            c3-hadoop-pegasus-tst-st04.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st01.bj:31101]
   [2024/5/23 15:59:18] 21    3       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st04.bj:31101,c3-hadoop-pegasus-tst-st03.bj:31101]
   [2024/5/23 15:59:18] 22    3       3/3            c3-hadoop-pegasus-tst-st01.bj:31101  [c3-hadoop-pegasus-tst-st03.bj:31101,c3-hadoop-pegasus-tst-st04.bj:31101]
   [2024/5/23 15:59:18] 23    3       3/3            c3-hadoop-pegasus-tst-st05.bj:31101  [c3-hadoop-pegasus-tst-st01.bj:31101,c3-hadoop-pegasus-tst-st04.bj:31101]
   [2024/5/23 15:59:18] 24    3       3/3            c3-hadoop-pegasus-tst-st03.bj:31101  [c3-hadoop-pegasus-tst-st01.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 15:59:18] 25    3       3/3            c3-hadoop-pegasus-tst-st04.bj:31101  [c3-hadoop-pegasus-tst-st01.bj:31101,c3-hadoop-pegasus-tst-st03.bj:31101]
   [2024/5/23 15:59:18] 26    3       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st04.bj:31101,c3-hadoop-pegasus-tst-st01.bj:31101]
   [2024/5/23 15:59:18] 27    3       3/3            c3-hadoop-pegasus-tst-st01.bj:31101  [c3-hadoop-pegasus-tst-st03.bj:31101,c3-hadoop-pegasus-tst-st02.bj:31101]
   [2024/5/23 15:59:18] 28    3       3/3            c3-hadoop-pegasus-tst-st05.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st04.bj:31101]
   [2024/5/23 15:59:18] 29    3       3/3            c3-hadoop-pegasus-tst-st03.bj:31101  [c3-hadoop-pegasus-tst-st05.bj:31101,c3-hadoop-pegasus-tst-st01.bj:31101]
   [2024/5/23 15:59:18] 30    3       3/3            c3-hadoop-pegasus-tst-st04.bj:31101  [c3-hadoop-pegasus-tst-st01.bj:31101,c3-hadoop-pegasus-tst-st03.bj:31101]
   [2024/5/23 15:59:18] 31    3       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st05.bj:31101,c3-hadoop-pegasus-tst-st03.bj:31101]
   
   
   
   [2024/5/23 15:59:39] >>> start_bulk_load -a ingest_p32_10G -c c3tst-performance2 -p hdfs_zjy -r /user/s_pegasus/lpfsplit
   
   [2024/5/23 16:01:18] 2024-05-23 16:01:18 Stop task 4 of replica on 10.142.100.15(0) success
   [2024/5/23 16:01:50] 2024-05-23 16:01:50 Start task 4 of replica on 10.142.100.15(0) success
   
   [2024/5/23 16:03:17] [general]
   [2024/5/23 16:03:17] app_name           : ingest_p32_10G
   [2024/5/23 16:03:17] app_id             : 101           
   [2024/5/23 16:03:17] partition_count    : 32            
   [2024/5/23 16:03:17] max_replica_count  : 3             
   [2024/5/23 16:03:17] 
   [2024/5/23 16:03:17] [replicas]
   [2024/5/23 16:03:17] pidx  ballot  replica_count  primary                              secondaries
   [2024/5/23 16:03:17] 0     5       3/3            c3-hadoop-pegasus-tst-st04.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 1     5       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st03.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 2     3       3/3            c3-hadoop-pegasus-tst-st01.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st03.bj:31101]
   [2024/5/23 16:03:17] 3     6       3/3            c3-hadoop-pegasus-tst-st04.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 4     5       3/3            c3-hadoop-pegasus-tst-st03.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 5     3       3/3            c3-hadoop-pegasus-tst-st04.bj:31101  [c3-hadoop-pegasus-tst-st01.bj:31101,c3-hadoop-pegasus-tst-st03.bj:31101]
   [2024/5/23 16:03:17] 6     3       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st01.bj:31101,c3-hadoop-pegasus-tst-st04.bj:31101]
   [2024/5/23 16:03:17] 7     4       3/3            c3-hadoop-pegasus-tst-st01.bj:31101  [c3-hadoop-pegasus-tst-st05.bj:31101,c3-hadoop-pegasus-tst-st04.bj:31101]
   [2024/5/23 16:03:17] 8     6       3/3            c3-hadoop-pegasus-tst-st01.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 9     5       3/3            c3-hadoop-pegasus-tst-st03.bj:31101  [c3-hadoop-pegasus-tst-st04.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 10    3       3/3            c3-hadoop-pegasus-tst-st04.bj:31101  [c3-hadoop-pegasus-tst-st03.bj:31101,c3-hadoop-pegasus-tst-st02.bj:31101]
   [2024/5/23 16:03:17] 11    5       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st01.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 12    4       3/3            c3-hadoop-pegasus-tst-st01.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 13    6       3/3            c3-hadoop-pegasus-tst-st03.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 14    5       3/3            c3-hadoop-pegasus-tst-st03.bj:31101  [c3-hadoop-pegasus-tst-st04.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 15    5       3/3            c3-hadoop-pegasus-tst-st04.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 16    3       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st03.bj:31101,c3-hadoop-pegasus-tst-st01.bj:31101]
   [2024/5/23 16:03:17] 17    4       3/3            c3-hadoop-pegasus-tst-st01.bj:31101  [c3-hadoop-pegasus-tst-st04.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 18    6       3/3            c3-hadoop-pegasus-tst-st03.bj:31101  [c3-hadoop-pegasus-tst-st01.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 19    3       3/3            c3-hadoop-pegasus-tst-st03.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st04.bj:31101]
   [2024/5/23 16:03:17] 20    3       3/3            c3-hadoop-pegasus-tst-st04.bj:31101  [c3-hadoop-pegasus-tst-st02.bj:31101,c3-hadoop-pegasus-tst-st01.bj:31101]
   [2024/5/23 16:03:17] 21    3       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st04.bj:31101,c3-hadoop-pegasus-tst-st03.bj:31101]
   [2024/5/23 16:03:17] 22    3       3/3            c3-hadoop-pegasus-tst-st01.bj:31101  [c3-hadoop-pegasus-tst-st03.bj:31101,c3-hadoop-pegasus-tst-st04.bj:31101]
   [2024/5/23 16:03:17] 23    6       3/3            c3-hadoop-pegasus-tst-st01.bj:31101  [c3-hadoop-pegasus-tst-st04.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 24    5       3/3            c3-hadoop-pegasus-tst-st03.bj:31101  [c3-hadoop-pegasus-tst-st01.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 25    3       3/3            c3-hadoop-pegasus-tst-st04.bj:31101  [c3-hadoop-pegasus-tst-st01.bj:31101,c3-hadoop-pegasus-tst-st03.bj:31101]
   [2024/5/23 16:03:17] 26    3       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st04.bj:31101,c3-hadoop-pegasus-tst-st01.bj:31101]
   [2024/5/23 16:03:17] 27    3       3/3            c3-hadoop-pegasus-tst-st01.bj:31101  [c3-hadoop-pegasus-tst-st03.bj:31101,c3-hadoop-pegasus-tst-st02.bj:31101]
   [2024/5/23 16:03:17] 28    6       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st04.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 29    5       3/3            c3-hadoop-pegasus-tst-st03.bj:31101  [c3-hadoop-pegasus-tst-st01.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   [2024/5/23 16:03:17] 30    3       3/3            c3-hadoop-pegasus-tst-st04.bj:31101  [c3-hadoop-pegasus-tst-st01.bj:31101,c3-hadoop-pegasus-tst-st03.bj:31101]
   [2024/5/23 16:03:17] 31    5       3/3            c3-hadoop-pegasus-tst-st02.bj:31101  [c3-hadoop-pegasus-tst-st03.bj:31101,c3-hadoop-pegasus-tst-st05.bj:31101]
   ``` 
   
   # Side effects
   Locking `_metadata.files` may incur a performance penalty.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

