sajjad-moradi opened a new issue, #13398:
URL: https://github.com/apache/pinot/issues/13398

   ## Summary
   A race condition occurs when the following two threads try to build or 
download the same segment:
   1. the consuming thread (building the segment in RETAINING state)
   2. the helix thread handling the CONSUMING->ONLINE transition message (after 
downloading is finished)
   
   The race condition leads to SIGSEGV on realtime servers.
   
   ## Details
   In the recent PR https://github.com/apache/pinot/pull/12886, a lock (on 
the segment name) was added to the consuming thread at the point where it 
builds the segment in RETAINING state:
   
   ![Pasted Graphic 
2](https://github.com/apache/pinot/assets/8548220/7e4006ad-57dc-4f0d-88da-3c28ea976782)
   
   The helix thread also has a lock on the segment name right at the beginning 
of processing the transition message:
   
   ![Pasted Graphic 
1](https://github.com/apache/pinot/assets/8548220/05c58367-2cba-4d05-bd8a-407317831adb)
   
   The problem occurs if the helix thread acquires the lock first and the 
consuming thread then tries to acquire it to build the segment. Since the 
lock is already held, the consuming thread blocks on that line of code. 
The helix thread then reaches a point where it calls a “stop” function to kill 
the consuming thread, but because the consuming thread is waiting for the lock, 
it cannot be killed. The stop function waits up to 10 minutes to join the 
consuming thread, and then proceeds anyway while that thread is still blocked.
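
The blocking behavior described above can be reproduced in isolation. The following is a minimal sketch, not Pinot code. It assumes the segment lock behaves like a `java.util.concurrent.locks.ReentrantLock` acquired with `lock()`, and that the stop function interrupts the thread before a timed `join`. The class and method names are hypothetical, and the 10-minute wait is shortened to a caller-supplied timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the two threads; none of these names come from Pinot.
class BlockedJoinSketch {

  // Returns true if the "consuming" thread is still alive after the timed join.
  // joinTimeoutMs must be > 0 (join(0) would wait forever while the lock is held).
  static boolean consumerStillAliveAfterJoin(long joinTimeoutMs) {
    ReentrantLock segmentLock = new ReentrantLock();
    CountDownLatch consumerStarted = new CountDownLatch(1);

    Thread consumer = new Thread(() -> {
      consumerStarted.countDown();
      segmentLock.lock();   // blocks here; lock() keeps waiting even if interrupted
      segmentLock.unlock();
    }, "consumer");
    consumer.setDaemon(true);

    segmentLock.lock();     // the "helix" thread takes the segment lock first
    try {
      consumer.start();
      consumerStarted.await();
      consumer.interrupt();          // assumed stop request; it cannot unblock lock()
      consumer.join(joinTimeoutMs);  // returns after the timeout without any error
      return consumer.isAlive();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      segmentLock.unlock();
    }
  }

  public static void main(String[] args) {
    // prints "consumer still alive after join: true"
    System.out.println("consumer still alive after join: " + consumerStillAliveAfterJoin(200));
  }
}
```

The timed `join` returns normally whether or not the target thread has exited, so the caller has to check `isAlive()` to tell the two cases apart.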
   
   ![Pasted 
Graphic](https://github.com/apache/pinot/assets/8548220/78bed789-c4db-45f6-8b4f-a560b805c395)
   
   After the stop method returns, the helix thread downloads the segment from 
the deep store, releases the off-heap memory associated with the consuming 
segment, and finally releases the lock.
   The consuming thread now acquires the lock and tries to convert the 
consuming segment to an immutable segment using off-heap memory that has 
already been released. That causes a segmentation fault (SIGSEGV).
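
The full ordering can be sketched the same way. This is an illustration under assumptions, not the Pinot implementation. The off-heap buffer is modeled by a boolean flag (`bufferReleased`) so that the late access can be observed safely instead of crashing the JVM. All names are hypothetical, and the join timeout is shortened to 100 ms.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the sequence in this issue; no Pinot classes are used.
class UseAfterReleaseSketch {

  // Returns true if the "consuming" thread ran its build step after the
  // "helix" thread had already released the (simulated) off-heap buffer.
  static boolean consumerSawReleasedBuffer() {
    ReentrantLock segmentLock = new ReentrantLock();
    AtomicBoolean bufferReleased = new AtomicBoolean(false);
    AtomicBoolean sawReleased = new AtomicBoolean(false);
    CountDownLatch consumerStarted = new CountDownLatch(1);

    Thread consumer = new Thread(() -> {
      consumerStarted.countDown();
      segmentLock.lock();            // blocks until the helix thread unlocks
      try {
        // In the real server this step would read the off-heap memory.
        sawReleased.set(bufferReleased.get());
      } finally {
        segmentLock.unlock();
      }
    }, "consumer");

    try {
      segmentLock.lock();            // helix thread acquires the lock first
      try {
        consumer.start();
        consumerStarted.await();
        consumer.join(100);          // "stop": timed join expires, consumer still blocked
        bufferReleased.set(true);    // download done, off-heap memory released
      } finally {
        segmentLock.unlock();        // consumer can now proceed
      }
      consumer.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return sawReleased.get();
  }

  public static void main(String[] args) {
    // prints "consumer used released buffer: true"
    System.out.println("consumer used released buffer: " + consumerSawReleasedBuffer());
  }
}
```

Because the consumer is only started after the lock is held, its build step always runs after the release, which is the ordering that ends in the SIGSEGV described above.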
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

