dclim commented on a change in pull request #6451: Fix inconsistent segment size(#6448)
URL: https://github.com/apache/incubator-druid/pull/6451#discussion_r224682676
 
 

 ##########
 File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/SystemSchema.java
 ##########
 @@ -211,14 +210,64 @@ public TableType getJdbcTableType()
       final Iterator<Entry<DataSegment, SegmentMetadataHolder>> availableSegmentEntries = availableSegmentMetadata.entrySet()
                                                                                                                   .iterator();
 
+      // in memory map to store segment data from available segments
+      final Map<String, PartialSegmentData> partialSegmentDataMap = new HashMap<>();
+      for (SegmentMetadataHolder holder : availableSegmentMetadata.values()) {
+        final PartialSegmentData data = new PartialSegmentData(
+            holder.isAvailable(),
+            holder.isRealtime(),
+            holder.getNumReplicas(),
+            holder.getNumRows()
+        );
+        partialSegmentDataMap.put(holder.getSegmentId(), data);
+      }
+
       //get published segments from coordinator
       final JsonParserIterator<DataSegment> metadataSegments = getMetadataSegments(
           druidLeaderClient,
           jsonMapper,
           responseHandler
       );
 
-      Set<String> availableSegmentIds = new HashSet<>();
+      final Set<String> segmentsAlreadySeen = new HashSet<>();
+
+      //auth check for published segments
+      final CloseableIterator<DataSegment> authorizedPublishedSegments = getAuthorizedPublishedSegments(
+          metadataSegments,
+          root
+      );
+      final FluentIterable<Object[]> publishedSegments = FluentIterable
+          .from(() -> authorizedPublishedSegments)
+          .transform(val -> {
+            try {
+              if (!segmentsAlreadySeen.contains(val.getIdentifier())) {
+                segmentsAlreadySeen.add(val.getIdentifier());
+              }
+              final PartialSegmentData partialSegmentData = partialSegmentDataMap.get(val.getIdentifier());
+              return new Object[]{
+                  val.getIdentifier(),
+                  val.getDataSource(),
+                  val.getInterval().getStart(),
+                  val.getInterval().getEnd(),
+                  val.getSize(),
+                  val.getVersion(),
+                  val.getShardSpec().getPartitionNum(),
+                  partialSegmentData == null ? 0L : partialSegmentData.getNumReplicas(),
+                  partialSegmentData == null ? -1L : partialSegmentData.getNumRows(),
 
 Review comment:
   I know this came from the original review, which I wasn't involved in, but
is -1 the right value to use here? Could it produce an incorrect result when
someone sums the number of rows across a bunch of segments, some of which
return real values while others return -1?
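
   To make the concern concrete, here is a minimal sketch of how a -1 sentinel
skews a total. The class and method names (`RowCountSums`, `naiveSum`,
`sumKnown`, `countUnknown`) are hypothetical and not part of Druid; only the
-1 value is taken from the diff above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not Druid code: illustrates summing row counts
// when some segments report the -1 "unknown" sentinel from the diff above.
final class RowCountSums {
  static final long UNKNOWN_ROWS = -1L;

  // Naive sum: each -1 silently subtracts one row from the total.
  static long naiveSum(List<Long> numRows) {
    long total = 0L;
    for (long n : numRows) {
      total += n;
    }
    return total;
  }

  // Guarded sum: skips segments whose row count is unknown.
  static long sumKnown(List<Long> numRows) {
    long total = 0L;
    for (long n : numRows) {
      if (n != UNKNOWN_ROWS) {
        total += n;
      }
    }
    return total;
  }

  // Number of segments that reported the sentinel instead of a real count.
  static long countUnknown(List<Long> numRows) {
    long unknown = 0L;
    for (long n : numRows) {
      if (n == UNKNOWN_ROWS) {
        unknown++;
      }
    }
    return unknown;
  }

  public static void main(String[] args) {
    List<Long> numRows = Arrays.asList(100L, UNKNOWN_ROWS, 150L, UNKNOWN_ROWS);
    System.out.println("naive sum:   " + naiveSum(numRows));
    System.out.println("known sum:   " + sumKnown(numRows));
    System.out.println("unknown cnt: " + countUnknown(numRows));
  }
}
```

   With row counts of 100, -1, 150 and -1, the naive sum is 248 instead of
250, and nothing in the result signals that two segments had no known count.
Every consumer would need a guard like `sumKnown` to get a meaningful total.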

