Aklakan opened a new issue, #2535: URL: https://github.com/apache/jena/issues/2535
### Version

5.1.0-SNAPSHOT

### Feature

This is one more follow-up towards support for multi-variable join keys, triggered by the mail thread around https://www.mail-archive.com/[email protected]/msg20755.html

The mail mentions example SPARQL queries, such as the one below, that use values blocks to exemplify the issue with multi-variable joins.

```sparql
# test.rq
select (count(*) as ?C) where {
  {
    select ?X ?Y (struuid() as ?UUID) where {
      values ?X_i { 0 1 2 3 4 5 6 7 8 9 }
      values ?X_j { 0 1 2 3 4 5 6 7 8 9 }
      bind ( ?X_i + 10 * ?X_j as ?X)
      values ?Y_i { 0 1 2 3 4 5 6 7 8 9 }
      values ?Y_j { 0 1 2 3 4 5 6 7 8 9 }
      bind ( ?Y_i + 10 * ?Y_j as ?Y)
    }
  }
  {
    select ?X ?Y where {
      {
        select ?X ?Y (rand() as ?RAND) where {
          values ?X_i { 0 1 2 3 4 5 6 7 8 9 }
          values ?X_j { 0 1 2 3 4 5 6 7 8 9 }
          bind ( ?X_i + 10 * ?X_j as ?X)
          values ?Y_i { 0 1 2 3 4 5 6 7 8 9 }
          values ?Y_j { 0 1 2 3 4 5 6 7 8 9 }
          bind ( ?Y_i + 10 * ?Y_j as ?Y)
        }
      }
      filter (?RAND < 0.95)
    }
  }
}
```

Jena's JoinClassifier so far linearizes joins between values blocks such as those in the SPARQL example above, which gets very slow for larger values blocks. An extra flag is necessary to force hash joins:

```bash
arq --explain --time --set arq:optIndexJoinStrategy=false --query test.rq
```

The goal of this issue is to modify JoinClassifier such that joins between table-based operands are processed as hash joins rather than linear joins.

### Are you interested in contributing a solution yourself?

Yes
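For readers unfamiliar with the two strategies contrasted in this issue, here is a minimal Python sketch of a linear (nested-loop) join and a hash join on a multi-variable key. It is an illustration only and not Jena's implementation: the function names `nested_loop_join` and `hash_join` and the small sample tables are hypothetical, and it assumes every join variable is bound in every row.

```python
from itertools import product


def nested_loop_join(left, right, keys):
    """Compare every left row with every right row: O(len(left) * len(right))."""
    out = []
    for l in left:
        for r in right:
            if all(l[k] == r[k] for k in keys):
                out.append({**l, **r})
    return out


def hash_join(left, right, keys):
    """Index one side by the tuple of join-key values, then probe with the other.

    Building and probing cost O(len(left) + len(right)) plus the output size.
    Assumes all join variables are bound in every row (hypothetical simplification).
    """
    index = {}
    for r in right:
        index.setdefault(tuple(r[k] for k in keys), []).append(r)
    out = []
    for l in left:
        for r in index.get(tuple(l[k] for k in keys), []):
            out.append({**l, **r})
    return out


# Two small tables keyed on (X, Y), loosely mirroring the shape of test.rq.
left = [{"X": x, "Y": y, "UUID": f"u{x}-{y}"} for x, y in product(range(20), repeat=2)]
right = [{"X": x, "Y": y} for x, y in product(range(20), repeat=2) if (x + y) % 7 != 0]

joined = hash_join(left, right, ("X", "Y"))
```

Both functions return the same rows. With the two roughly 10,000-row operands of `test.rq`, the nested-loop variant would perform on the order of 10^8 comparisons, whereas the hash variant touches each row a constant number of times.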
