FrankChen021 commented on code in PR #19548:
URL: https://github.com/apache/druid/pull/19548#discussion_r3356163368


##########
server/src/main/java/org/apache/druid/segment/join/LookupJoinableFactory.java:
##########
@@ -57,7 +57,7 @@ public Optional<Joinable> build(final DataSource dataSource, 
final JoinCondition
     if (condition.canHashJoin()) {
       final String lookupName = lookupDataSource.getLookupName();
       return lookupProvider.get(lookupName)
-                           .map(c -> 
LookupJoinable.wrap(c.getLookupExtractorFactory().get()));
+                           .map(c -> 
LookupJoinable.wrap(c.getLookupExtractorFactory()));

Review Comment:
   [P1] Pin one lookup snapshot for each joinable
   
   Wrapping the factory means a single LookupJoinable can observe different 
cache versions across getMatchableColumnValues/getCorrelatedColumnValues during 
join filter conversion or preanalysis, then makeJoinMatcher during execution. 
If a namespace refresh lands between those calls, pushed-down left filters can 
be based on version N while the matcher uses version N+1, which can drop rows 
that should match the execution-time lookup. The lookup version needs to be 
retained once for the query/joinable lifecycle and released through the 
existing reference path, not reacquired independently per method.



##########
processing/src/main/java/org/apache/druid/query/dimension/LookupDimensionSpec.java:
##########
@@ -136,18 +139,31 @@ public String getName()
   @Override
   public ExtractionFn getExtractionFn()
   {
-    final LookupExtractor lookupExtractor;
+    final LookupExtractor lookupExtractor = getLookupExtractor();
+    return makeLookupExtractionFn(lookupExtractor);
+  }
 
+  private LookupExtractor getLookupExtractor()
+  {
     if (Strings.isNullOrEmpty(name)) {
-      lookupExtractor = this.lookup;
-    } else {
-      lookupExtractor = lookupExtractorFactoryContainerProvider
-          .get(name)
-          .orElseThrow(() -> new ISE("Lookup [%s] not found", name))
-          .getLookupExtractorFactory()
-          .get();
+      return this.lookup;
     }
 
+    final LookupExtractorFactory lookupExtractorFactory = 
lookupExtractorFactoryContainerProvider
+        .get(name)
+        .orElseThrow(() -> new ISE("Lookup [%s] not found", name))
+        .getLookupExtractorFactory();
+
+    final Optional<RetainedLookupExtractor> retainedLookupExtractor =

Review Comment:
   [P1] Avoid retaining lookup caches for metadata-only calls
   
   getExtractionFn now acquires a retained lookup, but callers such as 
preservesOrdering, ColumnProcessors.computeDimensionSpecCapabilities, and 
DimensionHandlerUtils call getExtractionFn just to inspect metadata and then 
drop the returned function without a close hook. Those retained references are 
released only by Cleaner/GC, so normal capability checks can keep retired cache 
versions referenced and block later namespace refreshes until GC or the 15 
minute timeout. Metadata paths should avoid the retained acquisition or close 
it deterministically after inspection.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to