clintropolis commented on a change in pull request #10660:
URL: https://github.com/apache/druid/pull/10660#discussion_r540509717



##########
File path: 
sql/src/main/java/org/apache/druid/sql/calcite/schema/DruidSchema.java
##########
@@ -448,25 +449,27 @@ void removeServerSegment(final DruidServerMetadata 
server, final DataSegment seg
   {
     synchronized (lock) {
       log.debug("Segment[%s] is gone from server[%s]", segment.getId(), 
server.getName());
-      if (server.getType().equals(ServerType.BROKER)) {
-        // a segment on a broker means a broadcast datasource, skip metadata 
because we'll also see this segment on the
-        // historical, however mark the datasource for refresh because it 
might no longer be broadcast or something
-        dataSourcesNeedingRebuild.add(segment.getDataSource());
-      } else {
-        final Map<SegmentId, AvailableSegmentMetadata> knownSegments = 
segmentMetadataInfo.get(segment.getDataSource());
-        final AvailableSegmentMetadata segmentMetadata = 
knownSegments.get(segment.getId());
-        final Set<DruidServerMetadata> segmentServers = 
segmentMetadata.getReplicas();
-        final ImmutableSet<DruidServerMetadata> servers = FluentIterable
-            .from(segmentServers)
-            .filter(Predicates.not(Predicates.equalTo(server)))
-            .toSet();
-
-        final AvailableSegmentMetadata metadataWithNumReplicas = 
AvailableSegmentMetadata
-            .from(segmentMetadata)
-            .withReplicas(servers)
-            .withRealtime(recomputeIsRealtime(servers))
-            .build();
-        knownSegments.put(segment.getId(), metadataWithNumReplicas);
+      final Map<SegmentId, AvailableSegmentMetadata> knownSegments = 
segmentMetadataInfo.get(segment.getDataSource());
+      if (knownSegments != null && !knownSegments.isEmpty()) {

Review comment:
       >Do you know why that's not the case? I mean, why not add segments 
served by Brokers to segmentMetadataInfo?
   
   Hmm, trying to remember exactly. I know `BrokerServerView` doesn't track 
broker served segments on the timeline because it would require changes to 
query server selection logic to either filter out brokers, or, make it smart 
enough to query locally through the brokers segment manager directly whenever 
it picked itself (so that it doesn't make an infinite query loop by submitting 
another query to itself and so on). Side note, fixing this would be a cool 
improvement because it would allow for brokers to query other brokers, which 
with some other work could potentially allow for cool broker trees where not 
every broker has to track segments of every historical, but could still query 
every segment.
   
    I think that `DruidSchema` actually could probably track them though, since 
it doesn't use the broker server views timelines, just the events fired from 
the timeline (which still do happen for segments served by brokers to allow 
things like `DruidSchema` to act on them). I suspect that I skipped broker 
segment tracking simply because it wasn't really necessary to compute the 
schema itself, only whether the datasource is globally available, but it 
doesn't seem like it would be harmful to add them. I'll look into making the 
change, because that would be less magical behavior.
   
   >At any rate, if it's not the case, then I think knownSegments could be null 
here if it's dropped from historicals before the broker, so it makes sense to 
check for it. There should be a comment about why this can happen.
   
   I guess the null check really only belongs inside of `if 
(server.getType().equals(ServerType.BROKER))` clause since that is the only 
case it can be null. If changing to just track broker segments is problematic 
for some reason, I'll make this change and add a comment.
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to