satish-mittal commented on issue #10861:
URL: https://github.com/apache/pinot/issues/10861#issuecomment-1580194353

   Looking at the Helix path, 
`SegmentOnlineOfflineStateModel.onBecomeOnlineFromConsuming` calls 
`LLRealtimeSegmentDataManager.goOnlineFromConsuming()`:
   
   ```
     public void goOnlineFromConsuming(SegmentZKMetadata segmentZKMetadata)
         throws InterruptedException {
       _serverMetrics.setValueOfTableGauge(_metricKeyName, ServerGauge.LLC_PARTITION_CONSUMING, 0);
       try {
         // Remove the segment file before we do anything else.
         removeSegmentFile();
         _leaseExtender.removeSegment(_segmentNameStr);
         final StreamPartitionMsgOffset endOffset =
             _streamPartitionMsgOffsetFactory.create(segmentZKMetadata.getEndOffset());
         _segmentLogger
             .info("State: {}, transitioning from CONSUMING to ONLINE (startOffset: {}, endOffset: {})", _state.toString(),
                 _startOffset, endOffset);
         stop();
         _segmentLogger.info("Consumer thread stopped in state {}", _state.toString());
   
         switch (_state) {
           case COMMITTED:
           case RETAINED:
             // Nothing to do. we already built local segment and swapped it with in-memory data.
             _segmentLogger.info("State {}. Nothing to do", _state.toString());
             break;
           ....// other cases
           ....// other cases
           default:
             _segmentLogger.info("Downloading to replace segment while in state {}", _state.toString());
             downloadSegmentAndReplace(segmentZKMetadata);
             break;
   ```
   
   Currently, the Helix state transition falls through to the `default` case 
and calls `downloadSegmentAndReplace`, which deletes the segment while the 
index creator is still reading it, leading to a SIGSEGV.
   
   Here is the flow of the `PartitionConsumer` thread when it receives a 
`KEEP` response from the controller (as per `run()`):
   
   ```
               case KEEP:
                 _state = State.RETAINING;
                 CompletionMode segmentCompletionMode = getSegmentCompletionMode();
                 switch (segmentCompletionMode) {
                   case DOWNLOAD:
                     _state = State.DISCARDED;
                     break;
                   case DEFAULT:
                     success = buildSegmentAndReplace();
                     if (success) {
                       _state = State.RETAINED;
                     } else {
                       // Could not build segment for some reason. We can only download it.
                       _state = State.ERROR;
                       _segmentLogger.error("Could not build segment for {}", _segmentNameStr);
                     }
                     break;
                     }
                     break;
                   default:
                     break;
                 }
                 break;
   ```
   
   So the `PartitionConsumer` thread temporarily sets the state to 
`RETAINING` and then calls `buildSegmentAndReplace()`, which is currently 
taking more than 10 minutes.
   
   Ideally, the Helix state transition should detect that the 
`PartitionConsumer` thread is already building the segment. That thread will 
eventually either load the segment successfully (moving to `RETAINED`) or 
move to `ERROR`.
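
   One way to picture "detecting that the build is in progress" is a handle 
that the consumer thread completes when `buildSegmentAndReplace()` returns, 
which the transition handler checks before deciding to download. This is a 
minimal sketch and not Pinot code: the class `BuildAwareTransitionSketch`, 
the `Outcome` enum, `awaitBuild`, and the use of a `CompletableFuture` are all 
hypothetical names and choices made for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; none of these names exist in Pinot.
class BuildAwareTransitionSketch {
    enum Outcome { KEEP_LOCAL_SEGMENT, DOWNLOAD_AND_REPLACE, STILL_BUILDING }

    // 'build' stands in for an in-flight buildSegmentAndReplace() call:
    // it completes with true on success and false on failure.
    static Outcome awaitBuild(CompletableFuture<Boolean> build, long timeoutMs) {
        try {
            boolean success = build.get(timeoutMs, TimeUnit.MILLISECONDS);
            // Build finished: keep the local segment, or fall back to download.
            return success ? Outcome.KEEP_LOCAL_SEGMENT : Outcome.DOWNLOAD_AND_REPLACE;
        } catch (TimeoutException e) {
            // Build still running: do not delete files it may be reading.
            return Outcome.STILL_BUILDING;
        } catch (ExecutionException e) {
            // Build threw: nothing usable locally, so download.
            return Outcome.DOWNLOAD_AND_REPLACE;
        } catch (InterruptedException e) {
            // Preserve the interrupt flag and report that we did not get a result.
            Thread.currentThread().interrupt();
            return Outcome.STILL_BUILDING;
        }
    }

    public static void main(String[] args) {
        System.out.println(awaitBuild(CompletableFuture.completedFuture(true), 10));
        System.out.println(awaitBuild(new CompletableFuture<>(), 10));
    }
}
```

   The point of the third outcome is that a timeout is not treated as 
permission to download and replace; what the caller should do in that case 
is left open here.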
   
   If the controller has a mechanism to later detect that a segment is in 
ERROR state and rectify it, then the Helix CONSUMING -> ONLINE transition 
could simply treat `RETAINING` the same way as `RETAINED` or `COMMITTED` and 
do nothing.
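
   The proposed change can be sketched in isolation as below. This is a 
simplified model and not the actual `LLRealtimeSegmentDataManager` code: the 
`State` enum lists only the states mentioned in this comment, the `Action` 
enum and `onOnlineFromConsuming` are hypothetical, and the elided "other 
cases" of the real switch are collapsed into `default`.

```java
// Hypothetical sketch; a reduced model of the switch in goOnlineFromConsuming().
class OnlineTransitionSketch {
    // Only the states named in this comment, not the full Pinot enum.
    enum State { CONSUMING, RETAINING, RETAINED, COMMITTED, DISCARDED, ERROR }

    enum Action { NOTHING, DOWNLOAD_AND_REPLACE }

    static Action onOnlineFromConsuming(State state) {
        switch (state) {
            case COMMITTED:
            case RETAINED:
                // Local segment already built and swapped in.
                return Action.NOTHING;
            case RETAINING:
                // Proposed: the consumer thread is still building the segment,
                // so leave its files alone; it ends in RETAINED or ERROR.
                return Action.NOTHING;
            default:
                // Stand-in for the elided cases plus the real default branch.
                return Action.DOWNLOAD_AND_REPLACE;
        }
    }

    public static void main(String[] args) {
        for (State s : State.values()) {
            System.out.println(s + " -> " + onOnlineFromConsuming(s));
        }
    }
}
```

   This only holds up if something else later repairs a segment that ends in 
`ERROR`, which is the condition stated above.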


