clintropolis commented on a change in pull request #10660:
URL: https://github.com/apache/druid/pull/10660#discussion_r540509717
##########
File path:
sql/src/main/java/org/apache/druid/sql/calcite/schema/DruidSchema.java
##########
@@ -448,25 +449,27 @@ void removeServerSegment(final DruidServerMetadata
server, final DataSegment seg
{
synchronized (lock) {
log.debug("Segment[%s] is gone from server[%s]", segment.getId(),
server.getName());
- if (server.getType().equals(ServerType.BROKER)) {
- // a segment on a broker means a broadcast datasource, skip metadata
because we'll also see this segment on the
- // historical, however mark the datasource for refresh because it
might no longer be broadcast or something
- dataSourcesNeedingRebuild.add(segment.getDataSource());
- } else {
- final Map<SegmentId, AvailableSegmentMetadata> knownSegments =
segmentMetadataInfo.get(segment.getDataSource());
- final AvailableSegmentMetadata segmentMetadata =
knownSegments.get(segment.getId());
- final Set<DruidServerMetadata> segmentServers =
segmentMetadata.getReplicas();
- final ImmutableSet<DruidServerMetadata> servers = FluentIterable
- .from(segmentServers)
- .filter(Predicates.not(Predicates.equalTo(server)))
- .toSet();
-
- final AvailableSegmentMetadata metadataWithNumReplicas =
AvailableSegmentMetadata
- .from(segmentMetadata)
- .withReplicas(servers)
- .withRealtime(recomputeIsRealtime(servers))
- .build();
- knownSegments.put(segment.getId(), metadataWithNumReplicas);
+ final Map<SegmentId, AvailableSegmentMetadata> knownSegments =
segmentMetadataInfo.get(segment.getDataSource());
+ if (knownSegments != null && !knownSegments.isEmpty()) {
Review comment:
>Do you know why that's not the case? I mean, why not add segments
served by Brokers to segmentMetadataInfo?
Hmm, trying to remember exactly. I know `BrokerServerView` doesn't track
broker served segments on the timeline because it would require changes to
query server selection logic to either filter out brokers, or, make it smart
enough to query locally through the brokers segment manager directly whenever
it picked itself (so that it doesn't make an infinite query loop by submitting
another query to itself and so on). Side note, fixing this would be a cool
improvement because it would allow for brokers to query other brokers, which
with some other work could potentially allow for cool broker trees where not
every broker has to track segments of every historical, but could still query
every segment.
I think that `DruidSchema` actually could probably track them though, since
it doesn't use the broker server views timelines, just the events fired from
the timeline (which still do happen for segments served by brokers to allow
things like `DruidSchema` to act on them). I suspect that I skipped broker
segment tracking simply because it wasn't really necessary to compute the
schema itself, only whether the datasource is globally available, but it
doesn't seem like it would be harmful to add them. I'll look into making the
change, because that would be less magical behavior.
>At any rate, if it's not the case, then I think knownSegments could be null
here if it's dropped from historicals before the broker, so it makes sense to
check for it. There should be a comment about why this can happen.
I guess the null check really only belongs inside of `if
(server.getType().equals(ServerType.BROKER))` clause since that is the only
case it can be null. If changing to just track broker segments is problematic
for some reason, I'll make this change and add a comment.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]