jihoonson edited a comment on issue #6124: KafkaIndexTask can delete published segments on restart
URL: https://github.com/apache/incubator-druid/issues/6124#issuecomment-411279909
 
 
> This doesn't seem right: there is code specifically to handle the case where a task tries to publish a segment that some task already published. It happens all the time with replicas, and they just ignore that segment and move on to the next one.
   
Yes, correct. The task doesn't fail at that point; it just fails to update the metadata store.
   
> I wonder if the real reason for publish failure is that the startMetadata doesn't match up. I bet it wouldn't match up: it sounds like the task is trying to publish from the point it originally started from rather than from the point it last published.
   
Here is the [code](https://github.com/apache/incubator-druid/blob/master/server/src/main/java/io/druid/segment/realtime/appenderator/BaseAppenderatorDriver.java#L555-L581) where this error happens.
   
```java
final boolean published = publisher.publishSegments(
    ImmutableSet.copyOf(segmentsAndMetadata.getSegments()),
    metadata == null ? null : ((AppenderatorDriverMetadata) metadata).getCallerMetadata()
);

if (published) {
  log.info("Published segments.");
} else {
  log.info("Transaction failure while publishing segments, checking if someone else beat us to it.");
  final Set<SegmentIdentifier> segmentsIdentifiers = segmentsAndMetadata
      .getSegments()
      .stream()
      .map(SegmentIdentifier::fromDataSegment)
      .collect(Collectors.toSet());
  if (usedSegmentChecker.findUsedSegments(segmentsIdentifiers)
                        .equals(Sets.newHashSet(segmentsAndMetadata.getSegments()))) {
    log.info(
        "Removing our segments from deep storage because someone else already published them: %s",
        segmentsAndMetadata.getSegments()
    );
    segmentsAndMetadata.getSegments().forEach(dataSegmentKiller::killQuietly);

    log.info("Our segments really do exist, awaiting handoff.");
  } else {
    throw new ISE("Failed to publish segments[%s]", segmentsAndMetadata.getSegments());
  }
}
```
   
`published` is false, so the task checks whether the segments it failed to publish are already used, via `usedSegmentChecker`, which is an `ActionBasedUsedSegmentChecker` in this case. `usedSegmentChecker.findUsedSegments(segmentsIdentifiers).equals(Sets.newHashSet(segmentsAndMetadata.getSegments()))` returns false, so the task throws an exception. This is because the task generated more segments after restarting; I've verified this by comparing the segments being published before and after the restart.
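
To make the failure mode concrete, here is a minimal, self-contained sketch of that comparison (the segment names are hypothetical, not taken from the task logs): the used-segment check only finds the segments that were published before the restart, while the restarted task is now trying to publish a strictly larger set, so the exact-equality check can never succeed.

```java
import java.util.Set;
import com.google.common.collect.ImmutableSet;

public class ExactEqualityCheckSketch
{
  public static void main(String[] args)
  {
    // Segments already published before the restart (what the used-segment check
    // would find); the identifiers are made up for illustration.
    Set<String> usedSegments = ImmutableSet.of("ds_2018-08-01_v1_part0", "ds_2018-08-01_v1_part1");

    // Segments the restarted task is trying to publish: the old ones plus a new
    // one generated after restoring from the original start offsets.
    Set<String> publishingSegments = ImmutableSet.of(
        "ds_2018-08-01_v1_part0",
        "ds_2018-08-01_v1_part1",
        "ds_2018-08-01_v1_part2"
    );

    // The current check requires the two sets to be exactly equal, so it fails
    // and the task throws an ISE.
    System.out.println(usedSegments.equals(publishingSegments)); // false
  }
}
```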
   
I didn't see any logs related to a metadata mismatch.
   
> In (7) the task should not have removed the published segments (this is the biggest bug).
   
I'm not sure about this. It could make segment publishing non-atomic and could also leave potential garbage segments behind. Probably solving (3) would be enough, because this case should never happen then?
   
> In (3) the task should have done something smarter instead of restoring a setup that couldn't possibly work out.
   
Yes, I think we should keep the task's state on local disk so that it represents what the task was doing. Also, for `usedSegmentChecker.findUsedSegments(segmentsIdentifiers).equals(Sets.newHashSet(segmentsAndMetadata.getSegments()))`, we could relax the check: instead of requiring exact equality, check which of `segmentsAndMetadata.getSegments()` are already in `usedSegments` and continue publishing the segments that are not in `usedSegments`.
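
As a rough, self-contained sketch of that idea (this is not actual Druid code; the segment names and the notion of publishing only the remainder are my own assumptions), the check could compute the set difference instead of requiring exact equality:

```java
import java.util.Set;
import com.google.common.collect.ImmutableSet;
import com.google.common.collect.Sets;

public class RelaxedPublishCheckSketch
{
  /**
   * Returns the segments that still need to be published, i.e. those the
   * used-segment check did not find in the metadata store. An empty result
   * means everything already exists and the task can simply await handoff.
   */
  static Set<String> segmentsStillToPublish(Set<String> publishingSegments, Set<String> usedSegments)
  {
    return Sets.difference(publishingSegments, usedSegments).immutableCopy();
  }

  public static void main(String[] args)
  {
    // Hypothetical identifiers: two segments were published before the restart,
    // one more was generated afterwards.
    Set<String> usedSegments = ImmutableSet.of("ds_part0", "ds_part1");
    Set<String> publishingSegments = ImmutableSet.of("ds_part0", "ds_part1", "ds_part2");

    // Instead of failing because the sets are not equal, the task would continue
    // publishing only the remaining segment.
    System.out.println(segmentsStillToPublish(publishingSegments, usedSegments)); // [ds_part2]
  }
}
```

This sketch doesn't address whether the already-used segments could then be safely removed from deep storage, which seems to be the crux of (7).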
