gianm commented on a change in pull request #8154: Fix `is_realtime` column behavior in sys.segments table
URL: https://github.com/apache/incubator-druid/pull/8154#discussion_r309484150
 
 

 ##########
 File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/DruidSchema.java
 ##########
 @@ -431,19 +431,29 @@ void removeSegment(final DataSegment segment)
     }
   }
 
-  private void removeServerSegment(final DruidServerMetadata server, final DataSegment segment)
+  @VisibleForTesting
+  void removeServerSegment(final DruidServerMetadata server, final DataSegment segment)
   {
     synchronized (lock) {
       log.debug("Segment[%s] is gone from server[%s]", segment.getId(), server.getName());
       final Map<SegmentId, AvailableSegmentMetadata> knownSegments = segmentMetadataInfo.get(segment.getDataSource());
       final AvailableSegmentMetadata segmentMetadata = knownSegments.get(segment.getId());
-      final Set<String> segmentServers = segmentMetadata.getReplicas();
-      final ImmutableSet<String> servers = FluentIterable.from(segmentServers)
-                                                         .filter(Predicates.not(Predicates.equalTo(server.getName())))
-                                                         .toSet();
+      final Set<DruidServerMetadata> segmentServers = segmentMetadata.getReplicas();
+      final ImmutableSet<DruidServerMetadata> servers = FluentIterable
+          .from(segmentServers)
+          .filter(Predicates.not(Predicates.equalTo(server)))
+          .toSet();
+      final Optional<DruidServerMetadata> realtimeServer = servers
+          .stream()
+          .filter(metadata -> metadata.getType().equals(ServerType.REALTIME))
+          .findAny();
+
+      // if there is no realtime server in the replicas, isRealtime flag should be unset
+      long isRealtime = realtimeServer.isPresent() ? 1 : 0;
 
 Review comment:
   I think the idea I had in the previous conversation was motivated by 
`is_realtime` being false meaning that the segment has completely flushed out 
of the realtime system. I thought it'd be useful for finding segments where 
handoff is lingering for some reason.
   
   If we change it to what you are proposing, @clintropolis, then it'd be 
somewhat harder to find segments where handoff is lingering (but still possible 
by joining `segments` and `servers`). But it'd be easier to find segments where 
handoff has progressed far enough that a historical has loaded the segment.
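   
   To make the two options concrete, here is a minimal sketch of how I read them. Everything in it is a hypothetical stand-in (`IsRealtimeSemantics`, `ServerKind`, and both method names are made up for illustration and are not Druid classes), and treating an empty replica set as 0 in the second option is my assumption:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Illustration only: hypothetical names, not part of DruidSchema.
final class IsRealtimeSemantics
{
  // Hypothetical stand-in for the server types that matter here.
  enum ServerKind { REALTIME, HISTORICAL }

  // Option A (what the diff above computes): the flag stays set while
  // any replica is still served by a realtime server.
  static long anyRealtime(Collection<ServerKind> replicas)
  {
    return replicas.stream().anyMatch(k -> k == ServerKind.REALTIME) ? 1L : 0L;
  }

  // Option B (my reading of the alternative): the flag is set only while
  // every replica is on a realtime server, so it clears as soon as a
  // historical has loaded the segment. Empty set -> 0 is an assumption.
  static long onlyRealtime(Collection<ServerKind> replicas)
  {
    boolean allRealtime = replicas.stream().allMatch(k -> k == ServerKind.REALTIME);
    return !replicas.isEmpty() && allRealtime ? 1L : 0L;
  }

  public static void main(String[] args)
  {
    // Mid-handoff: a historical has loaded the segment, the realtime task still serves it.
    List<ServerKind> midHandoff = Arrays.asList(ServerKind.REALTIME, ServerKind.HISTORICAL);
    System.out.println("anyRealtime:  " + anyRealtime(midHandoff));  // 1
    System.out.println("onlyRealtime: " + onlyRealtime(midHandoff)); // 0
  }
}
```

   The two only disagree in the mid-handoff state, which is exactly the window where handoff could be lingering.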
   
   Six of one, half dozen of the other.
   
   I am okay with either approach, so long as it's clearly documented what the 
behavior should be. I don't think the fact that the change would be 
theoretically incompatible is a big deal, since the column is buggy enough that 
it doesn't work anyway right now.
