gianm commented on code in PR #17774:
URL: https://github.com/apache/druid/pull/17774#discussion_r2056445526


##########
processing/src/main/java/org/apache/druid/segment/SegmentReference.java:
##########
@@ -19,12 +19,31 @@
 
 package org.apache.druid.segment;
 
+import org.apache.druid.query.policy.NoopPolicyEnforcer;
+import org.apache.druid.query.policy.PolicyEnforcer;
+
 /**
- * A {@link Segment} with a associated references, such as {@link 
ReferenceCountingSegment} where the reference is
+ * A {@link Segment} with an associated references, such as {@link 
ReferenceCountingSegment} where the reference is
  * the segment itself, and {@link 
org.apache.druid.segment.join.HashJoinSegment} which wraps a
  * {@link ReferenceCountingSegment} and also includes the associated list of
  * {@link org.apache.druid.segment.join.JoinableClause}
  */
 public interface SegmentReference extends Segment, ReferenceCountedObject
 {
+
+  /**
+   * Validates if the segment complies with the policy restrictions on tables.
+   * <p>
+   * This should be called right before the segment is about to be processed 
by the query stack, and after
+   * {@link 
org.apache.druid.query.planning.ExecutionVertex#createSegmentMapFunction(PolicyEnforcer)}.
+   */
+  default void validateOrElseThrow(PolicyEnforcer policyEnforcer)

Review Comment:
   > To solidify, I'm thinking, can we eliminate the use cases when 
`ReferenceCountingSegment` is wrapped with another `SegmentReference`? I think 
it would be simpler if we know the structure is `RestrictedSegment` wraps a 
`ReferenceCountingSegment`, which wraps a basic `segment` 
(`QueryableIndexSegment`, `LookupSegment`, etc).
   
   I did see double-wrapping (`ReferenceCountingSegment` on top of 
`ReferenceCountingSegment`) with Dart with the query `select page, q.c from 
wikipedia, (select count(*) c from "kttm-v2-2019-08-25") q`. However, I doubt 
this is "necessary". The Dart code could likely be adjusted to not do this. So, 
if it's helpful, go ahead and restrict things so `ReferenceCountingSegment` 
cannot wrap another `SegmentReference`.
   
   > To have equally robust validation for `Segment`, I think we would need to 
add `Segment#getChildren` to address the first point. For the second point I 
think we can use a check like this for leaf segments to see if they correspond 
to regular tables: `as(PhysicalSegmentInspector.class) != null`.
   
   About this (using `PhysicalSegmentInspector` as a proxy for "Segment 
represents a regular table"), I thought about it some more and IMO this 
approach is still not ideal. Nothing in the interface says that it can't be 
implemented for a non-table, and nothing requires that it is implemented for a 
table.
   
   I think a nicer idea would be to make `Segment#getId` nullable, and spec it 
so that it should return nonnull for regular tables backed by actual segments, 
null for anything else. This to me would make more sense than the current way 
`getId()` works, because currently dummy IDs are needed for `Segment` that 
aren't backed by actual segments, which seems odd. I skimmed usages of 
`getId()`; most seem to either be running in scenarios where it would always be 
nonnull anyway, or else are using it to interpolate into log messages (could 
use `toString` or `asString` instead).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to