FrankChen021 commented on code in PR #19548:
URL: https://github.com/apache/druid/pull/19548#discussion_r3413392514
##########
server/src/main/java/org/apache/druid/segment/join/LookupJoinableFactory.java:
##########
@@ -57,9 +61,20 @@ public Optional<Joinable> build(final DataSource dataSource,
final JoinCondition
if (condition.canHashJoin()) {
final String lookupName = lookupDataSource.getLookupName();
return lookupProvider.get(lookupName)
- .map(c ->
LookupJoinable.wrap(c.getLookupExtractorFactory().get()));
+ .map(this::buildLookupJoinable);
} else {
return Optional.empty();
}
}
+
+ private LookupJoinable buildLookupJoinable(final
LookupExtractorFactoryContainer container)
+ {
+ final LookupExtractorFactory lookupExtractorFactory =
container.getLookupExtractorFactory();
+ final Optional<RetainedLookupExtractor> retainedLookupExtractor =
+ lookupExtractorFactory.acquireRetainedLookupExtractor();
Review Comment:
Thanks, this addresses the joinable lifecycle issue I was worried about. The
retained lookup is now owned by the query-scoped SegmentMapFunction, and the
server, realtime, and MSQ callers I checked close that function after the
mapped segments/segment references, so the join path no longer appears to rely
on the Cleaner for normal completion.
Reviewed 32 of 32 changed files.
---
This is an automated review by Codex GPT-5.5
##########
processing/src/main/java/org/apache/druid/query/dimension/LookupDimensionSpec.java:
##########
@@ -136,18 +139,51 @@ public String getName()
@Override
public ExtractionFn getExtractionFn()
{
- final LookupExtractor lookupExtractor;
+ final LookupExtractor lookupExtractor = getLookupExtractor();
+ return makeLookupExtractionFn(lookupExtractor);
+ }
+
+ @Override
+ public ExtractionFn getExtractionFnForMetadata()
+ {
+ return makeLookupExtractionFn(getLookupExtractorForMetadata());
+ }
+
+ private LookupExtractor getLookupExtractor()
+ {
+ if (Strings.isNullOrEmpty(name)) {
+ return this.lookup;
+ }
+
+ final LookupExtractorFactory lookupExtractorFactory =
getLookupExtractorFactory();
+ final Optional<RetainedLookupExtractor> retainedLookupExtractor =
+ lookupExtractorFactory.acquireRetainedLookupExtractor();
+
+ // ExtractionFn has no close hook. The RetainedLookupExtractor cleaner
releases this reference when the
+ // LookupExtractionFn that owns it becomes unreachable.
+ return retainedLookupExtractor.<LookupExtractor>map(retained ->
retained).orElseGet(lookupExtractorFactory);
Review Comment:
[P1] Do not retain lookup dimension extraction functions without a close path
getExtractionFn() now acquires a RetainedLookupExtractor and returns it
inside a plain LookupExtractionFn, but that API has no close hook. Several
callers do not cache or close the returned function, for example
TopNQueryQueryToolChest calls
topNQuery.getDimensionSpec().getExtractionFn().apply(...) while post-processing
each result value, and GroupByQueryQueryToolChest.extractionsToRewrite calls it
just to inspect getExtractionType(). For named lookup dimensions, each call
increments the namespace cache reference and then depends on Cleaner/GC to
release it, so with the new default maxRetiredCacheEntries=1 a query can make
later namespace refreshes skip until GC or the 15 minute timeout. Please avoid
retaining from LookupDimensionSpec.getExtractionFn() unless the returned handle
has a deterministic query/result lifecycle, or update these call sites to use a
scoped retained wrapper that is explicitly closed.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]