Aklakan commented on code in PR #2925:
URL: https://github.com/apache/jena/pull/2925#discussion_r1918441638


##########
jena-arq/src/main/java/org/apache/jena/sparql/engine/iterator/QueryIterLateral.java:
##########
@@ -292,17 +295,33 @@ public Op transform(OpTable opTable) {
             // By the assignment restriction, the binding only needs to be 
added to each row of the table.
             Table table = opTable.getTable();
             // Table vars.
-            List<Var> vars = new ArrayList<>(table.getVars());
-            binding.vars().forEachRemaining(vars::add);
-            TableN table2 = new TableN(vars);
+            List<Var> tableVars = table.getVars();
+            List<Var> vars = new ArrayList<>(tableVars);
+
+            // Track variables that appear both in the table and the binding.
+            List<Var> commonVars = new ArrayList<>();
+
+            // Index variables in a set if there are more than a few of them.
+            Collection<Var> tableVarsIndex = tableVars.size() > 4 ? new 
HashSet<>(tableVars) : tableVars;
+            binding.vars().forEachRemaining(v -> {
+                if (tableVarsIndex.contains(v)) {
+                    commonVars.add(v);
+                } else {
+                    vars.add(v);
+                }
+            });
+
+            List<Binding> bindings = new ArrayList<>(table.size());
             BindingBuilder builder = BindingFactory.builder();
             table.iterator(null).forEachRemaining(row->{
-                builder.reset();
-                builder.addAll(row);
-                builder.addAll(binding);
-                table2.addBinding(builder.build());
+                if (Algebra.compatible(row, binding, commonVars.iterator())) {

Review Comment:
   > `SELECT *` is effectively `SELECT ?employee` which works.
   
   Hm, depending on the viewpoint it is the effective query *after* 
substitution, but before that, it's `SELECT ?department ?employee` as to make 
`?department` in-scope such that the substitution could affect the table and 
discard its incompatible rows.
   
   I think the scope error is in-line with the SEP7 statement:
   
   > "Disallow syntactic forms that set variables that may already be present 
in the current row."
   
   But I think that this is too restrictive, because it prevents several useful 
cases for inlining data using VALUES blocks. IMO if substitution leads to 
incompatible bindings then they should just be discarded as usual - no? Perhaps 
you have corner cases in mind where under this logic the substitution could 
become ambiguous and that this might be the reason for the restrictions.
   
   Removing the restrictions on `Element_Bind` and `Element_Data` in 
`checkLATERAL` in order to make the above "department-employee" query work 
causes the following test cases to fail:
   
    
syntax-lateral-bad-{[04](https://github.com/apache/jena/blob/main/jena-arq/testing/ARQ/Syntax/Syntax-Lateral/syntax-lateral-bad-04),
 
[05](https://github.com/apache/jena/blob/main/jena-arq/testing/ARQ/Syntax/Syntax-Lateral/syntax-lateral-bad-05),
 
[06](https://github.com/apache/jena/blob/main/jena-arq/testing/ARQ/Syntax/Syntax-Lateral/syntax-lateral-bad-06),
 
[08](https://github.com/apache/jena/blob/main/jena-arq/testing/ARQ/Syntax/Syntax-Lateral/syntax-lateral-bad-08)}.arq
   
   For example, `syntax-lateral-bad-04`:
   ```sparql
   SELECT * {
      ?s ?p ?o
      LATERAL { BIND( 123 AS ?o) }
   }
   ```
   
   This is certainly not an efficient way to write a query, but I don't 
understand why the query should be syntactically illegal in the first place. To 
me it looks like it would only retain bindings where ?o = 123. So it would be 
the task of an optimizer to rewrite the `Lateral/Bind` to a plain `FILTER(?o = 
123)`.
   
   When disabling the SyntaxVarScope check for ElementBind, each ?o gets 
injected into the unit table beneath `(extend ((?o 123))  (table unit))` such 
as `(extend ((?o 123))  (table (row (?o 456))))` and the evaluation of OpExtend 
discards the incompatible binding - which seems like the natural way it should 
be to me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@jena.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscr...@jena.apache.org
For additional commands, e-mail: pr-h...@jena.apache.org

Reply via email to