Aklakan commented on code in PR #2405:
URL: https://github.com/apache/jena/pull/2405#discussion_r1703989751
##########
jena-arq/src/main/java/org/apache/jena/sparql/engine/join/AbstractIterHashJoin.java:
##########
@@ -42,47 +42,52 @@ public abstract class AbstractIterHashJoin extends QueryIter2 {
     protected long s_countResults = 0 ; // Overall result size.
     protected long s_trailerResults = 0 ; // Results from the trailer iterator.
     // See also stats in the probe table.
-
+
     protected final JoinKey joinKey ;
-    protected final HashProbeTable hashTable ;
+    protected final MultiHashProbeTable hashTable ;
     private QueryIterator iterStream ;
     private Binding rowStream = null ;
     private Iterator<Binding> iterCurrent ;
-    private boolean yielded ; // Flag to note when current probe causes a result.
+    private boolean yielded ; // Flag to note when current probe causes a result.
     // Hanlde any "post join" additions.
     private Iterator<Binding> iterTail = null ;
-
+
     enum Phase { INIT, HASH , STREAM, TRAILER, DONE }
     Phase state = Phase.INIT ;
-
+
     private Binding slot = null ;
-    protected AbstractIterHashJoin(JoinKey joinKey, QueryIterator probeIter, QueryIterator streamIter, ExecutionContext execCxt) {
+    protected AbstractIterHashJoin(JoinKey initialJoinKey, QueryIterator probeIter, QueryIterator streamIter, ExecutionContext execCxt) {
         super(probeIter, streamIter, execCxt) ;
-
-        if ( joinKey == null ) {
+
+        if ( initialJoinKey == null ) {
+            // This block computes an initial join key from the common variables of each iterator's first binding.
+
             QueryIterPeek pProbe = QueryIterPeek.create(probeIter, execCxt) ;
             QueryIterPeek pStream = QueryIterPeek.create(streamIter, execCxt) ;
-
+
             Binding bLeft = pProbe.peek() ;
             Binding bRight = pStream.peek() ;
-
+
             List<Var> varsLeft = Iter.toList(bLeft.vars()) ;
             List<Var> varsRight = Iter.toList(bRight.vars()) ;
-            joinKey = JoinKey.createVarKey(varsLeft, varsRight) ;
+            // joinKey = JoinKey.createVarKey(varsLeft, varsRight) ;
+            initialJoinKey = JoinKey.create(varsLeft, varsRight) ;
+
             probeIter = pProbe ;
             streamIter = pStream ;
         }
-
-        this.joinKey = joinKey ;
+
+        JoinKey maxJoinKey = null;

Review Comment:
The max join key was meant to prevent the creation of `JoinIndex` instances for variables other than the given ones. For example, when it is set to [x, y], an attempt to create a `JoinIndex` for [x, y, z] would still only create one for [x, y] and thus omit z. Most likely such a feature is not needed.

`initialJoinKey` improves on the original `joinKey`: originally, only the first variable common to the peeked bindings was used for indexing; now all common variables are used. The `initialJoinKey` is used to store the probe bindings directly in an indexed structure (rather than, e.g., a list), but the `MultiHashProbeTable` can create further `JoinIndex` instances depending on the lookup requests.
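
To make the two ideas in the review comment concrete, here is a minimal sketch, not the PR's code. It assumes plain strings in place of Jena's `Var`, and `JoinKeySketch`, `commonVars`, and `clampToMax` are hypothetical names that do not exist in Jena. The first helper derives an initial key from all variables common to two peeked bindings; the second shows how a max join key of [x, y] would reduce a request for [x, y, z] to [x, y].

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: strings stand in for Jena's Var, and none of
// these helpers are part of the JoinKey or MultiHashProbeTable API.
class JoinKeySketch {

    // Initial join key: every variable that occurs in both peeked bindings,
    // kept in the order of the left (probe) binding.
    static List<String> commonVars(List<String> varsLeft, List<String> varsRight) {
        Set<String> right = new HashSet<>(varsRight);
        List<String> key = new ArrayList<>();
        for (String v : varsLeft) {
            if (right.contains(v) && !key.contains(v)) {
                key.add(v);
            }
        }
        return key;
    }

    // Max join key: drop requested variables that the max key does not allow.
    // A null max key is treated here as "no restriction" (an assumption).
    static List<String> clampToMax(List<String> requested, List<String> maxJoinKey) {
        if (maxJoinKey == null) {
            return new ArrayList<>(requested);
        }
        List<String> key = new ArrayList<>();
        for (String v : requested) {
            if (maxJoinKey.contains(v)) {
                key.add(v);
            }
        }
        return key;
    }

    public static void main(String[] args) {
        // Both bindings share x and y, so both become part of the initial key.
        System.out.println(commonVars(List.of("x", "y", "a"), List.of("y", "x", "b"))); // [x, y]
        // A request for [x, y, z] under max key [x, y] omits z.
        System.out.println(clampToMax(List.of("x", "y", "z"), List.of("x", "y")));      // [x, y]
    }
}
```

Under these assumptions, the pre-PR behaviour described in the comment would correspond to keeping only the first element of `commonVars`.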