chrajeshbabu opened a new issue, #14571:
URL: https://github.com/apache/pinot/issues/14571

   A segment gets stuck in a bad state when its download from deep store fails 
with an EOFException while adding a new segment.
   
   ```
   {
     "segmentName": <segment_name>,
     "serverState": {
       "Server_<host>_<port>": {
         "idealState": "ONLINE",
         "externalView": "ERROR",
         "segmentSize": "0 bytes",
         "consumerInfo": null,
         "errorInfo": {
           "timestamp": "2024-11-28 19:24:50 GMT",
           "errorMessage": "Caught exception while adding ONLINE segment",
           "stackTrace": "java.io.EOFException\n\tat 
org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.read(GzipCompressorInputStream.java:316)\n\tat
 
org.apache.commons.compress.archivers.tar.TarArchiveInputStream.read(TarArchiveInputStream.java:634)\n\tat
 java.base/java.io.FilterInputStream.read(FilterInputStream.java:106)\n\tat 
org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1483)\n\tat 
org.apache.commons.io.IOUtils.copy(IOUtils.java:1107)\n\tat 
org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1456)\n\tat 
org.apache.commons.io.IOUtils.copy(IOUtils.java:1085)\n\tat 
org.apache.pinot.common.utils.TarGzCompressionUtils.untarWithRateLimiter(TarGzCompressionUtils.java:202)\n\tat
 
org.apache.pinot.common.utils.TarGzCompressionUtils.untar(TarGzCompressionUtils.java:148)\n\tat
 
org.apache.pinot.common.utils.TarGzCompressionUtils.untar(TarGzCompressionUtils.java:138)\n\tat
 
org.apache.pinot.core.data.manager.BaseTableDataManager.untarSegment(BaseTableDataManager.java:835)\n\tat 
org.apache.pinot.core.data.manager.BaseTableDataManager.downloadSegmentFromDeepStore(BaseTableDataManager.java:783)\n\tat
 
org.apache.pinot.core.data.manager.BaseTableDataManager.downloadSegment(BaseTableDataManager.java:730)\n\tat
 
org.apache.pinot.core.data.manager.BaseTableDataManager.downloadAndLoadSegment(BaseTableDataManager.java:389)\n\tat
 
org.apache.pinot.core.data.manager.BaseTableDataManager.addNewOnlineSegment(BaseTableDataManager.java:360)\n\tat
 
org.apache.pinot.core.data.manager.offline.OfflineTableDataManager.doAddOnlineSegment(OfflineTableDataManager.java:54)\n\tat
 
org.apache.pinot.core.data.manager.BaseTableDataManager.addOnlineSegment(BaseTableDataManager.java:313)\n\tat
 
org.apache.pinot.server.starter.helix.HelixInstanceDataManager.addOnlineSegment(HelixInstanceDataManager.java:275)\n\tat
 
org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:131)\n\tat 
jdk.internal.reflect.GeneratedMethodAccessor147.invoke(Unknown Source)\n\tat 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat
 java.base/java.lang.reflect.Method.invoke(Method.java:569)\n\tat 
org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:350)\n\tat
 
org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:278)\n\tat
 org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97)\n\tat 
org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49)\n\tat 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)\n\tat 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)\n\tat
 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)\n\tat
 java.base/java.lang.Thread.run(Thread.java:840)\n"
         }
       }
     }
   }
   ```
   
   Tried to reload the segment through the REST API, but the reload does not 
load the segment because segment registration has not happened yet and 
_segmentDataManagerMap has no entry for the segment.
   
   `2024/11/30 00:21:11.702 WARN [HelixInstanceDataManager] 
[HelixTaskExecutor-message_handle_thread_54] Failed to get segment data manager 
for segments: [<segment_name>] of table: 
org.apache.pinot.core.data.manager.offline.OfflineTableDataManager@52a09a91, 
skipping reloading them
   `
   
   ```
     public void downloadAndLoadSegment(SegmentZKMetadata zkMetadata, IndexLoadingConfig indexLoadingConfig)
         throws Exception {
       String segmentName = zkMetadata.getSegmentName();
       _logger.info("Downloading and loading segment: {}", segmentName);
       File indexDir = downloadSegment(zkMetadata);
       addSegment(ImmutableSegmentLoader.load(indexDir, indexLoadingConfig));
       _logger.info("Downloaded and loaded segment: {} with CRC: {} on tier: {}", segmentName, zkMetadata.getCrc(),
           TierConfigUtils.normalizeTierName(zkMetadata.getTier()));
     }
   ```
   
   
   It would be better to register the segment first and then download it, so 
that any unexpected failure can be fixed using the reload segment or reload 
table API.
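
A minimal sketch of the proposed ordering, to make the idea concrete. It is not Pinot code: `RegisterThenDownloadSketch`, `Downloader`, `State`, and the method names are hypothetical stand-ins, and the map only plays the role that `_segmentDataManagerMap` plays in the description above. The assumption is that a registered but not yet loaded entry is enough for a later reload to retry the download.

```java
import java.io.EOFException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch (not Pinot's API): register the segment first, then download it. */
public class RegisterThenDownloadSketch {

  /** Stand-in for a deep store download that may fail, for example with an EOFException. */
  interface Downloader {
    String download(String segmentName) throws Exception;
  }

  /** Hypothetical states: registered but not loaded, and fully loaded. */
  enum State { PENDING, LOADED }

  // Hypothetical analogue of _segmentDataManagerMap: segment name -> state.
  private final Map<String, State> _segments = new ConcurrentHashMap<>();
  private final Downloader _downloader;

  RegisterThenDownloadSketch(Downloader downloader) {
    _downloader = downloader;
  }

  /** Registers before downloading, so a failed download still leaves a PENDING entry. */
  void addSegment(String segmentName) throws Exception {
    _segments.put(segmentName, State.PENDING);
    _downloader.download(segmentName);
    _segments.put(segmentName, State.LOADED);
  }

  /** Retries the download for a registered segment; unknown segments are skipped. */
  boolean reloadSegment(String segmentName) throws Exception {
    if (!_segments.containsKey(segmentName)) {
      return false; // nothing registered, so there is nothing to reload
    }
    _downloader.download(segmentName);
    _segments.put(segmentName, State.LOADED);
    return true;
  }

  State getState(String segmentName) {
    return _segments.get(segmentName);
  }

  public static void main(String[] args) throws Exception {
    AtomicInteger attempts = new AtomicInteger();
    // The first attempt fails like a truncated archive; later attempts succeed.
    Downloader flaky = name -> {
      if (attempts.incrementAndGet() == 1) {
        throw new EOFException("truncated archive");
      }
      return "/tmp/" + name;
    };
    RegisterThenDownloadSketch manager = new RegisterThenDownloadSketch(flaky);
    try {
      manager.addSegment("seg_0");
    } catch (EOFException e) {
      System.out.println("add failed, state = " + manager.getState("seg_0"));
    }
    System.out.println("reload ok = " + manager.reloadSegment("seg_0"));
    System.out.println("state = " + manager.getState("seg_0"));
  }
}
```

With the ordering shown in `downloadAndLoadSegment` above, the equivalent of `_segments` would still be empty after the failed download, so the reload would take the skip branch.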

