afs opened a new issue, #1629: URL: https://github.com/apache/jena/issues/1629
This issue proposes a roadmap for changes to path evaluation. 4.7.0 is December-ish and is already a significant release because of other, unrelated changes. There are several strands in progress. 1. Iterators, not materialized lists so LIMIT and timeout have an effect. 1. Better starting points for P_Alt in the case of `_ (:p1|:p2) _`. 1. Further P_Alt evaluation - tranform or `PathLib`. 1. Triple pattern reordering across paths. #### Iterators Item 1 can be done now. We then have a starting point for the rest of the items. It needs to ensure `PathLib` iterators are closed eventually (caveat: cancellation is not on the iterator execution thread). The iterators in `PathLib` don't need to be QueryIterators. They get wrapped in a QueryIterator so cancellation will still happen. Cancellation needs to close (Iter.close) the iterators carefully (caveat: different threads). (functional decomposition) is better is not clear. Proposal: * Do this item now as a separate step #### P_Alt starting points `PathLib.determineUngroundedStartingSet` could handle the `(:p1|:p2)` case instead of relying on OpUnion expansion. There are more cases that just `(:p1|:p2)` that are the same general mechanism of "seeding by property" -- `:p+`, `(:p+|:q)` -- to find starting points but cardinality rules start to come into play. Proposal: * do point 2 in PathLib because it applies to other forms. The test coverage needs to be verified. Up to here is a target for 4.7.0 as 2 changes. It gives a common base for further changes. Otherwise we have one big change that all has to be assessed so it will take longer before any change happens. #### Further Path and Transform work The objective is to have one set of active code in the codebase. This needs an investigation is to run tests to perform both transforms and compare the output to look for their optimization coverage. The `PathTransform` (`PathVisitor` based) one may be a better base because it naturally handles `P_Alt`, but it may be harder to consider triple pattern reordering across paths and harder for some other cases like `?x (:p1+|...) ?z` which involve multiple path operators to be examined. The `PathCompiler` approach is easier for handling state across different path operators and keeps what is the original path together but isn't yet flexible enough. #### Process Process-wise, discussing architecture on PRs does not work well. The PR UX is focues on keeping code-related comments together but foir longer running PRs when broader points are made, the time-ordering discussion of the overall architecture is lost. Proposal: * Use this github issue for the general and architecture discussions * Use the PR for discussion on specific code -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
