Copilot commented on code in PR #45:
URL: https://github.com/apache/asterixdb/pull/45#discussion_r2997985806


##########
asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/pushdown/processor/ColumnFilterPushdownProcessor.java:
##########
@@ -159,10 +159,51 @@ protected boolean pushdownFilter(ScanDefineDescriptor 
scanDescriptor) throws Alg
     protected void putFilterInformation(ScanDefineDescriptor 
scanDefineDescriptor, ILogicalExpression inlinedExpr)
             throws AlgebricksException {
         if (checkerVisitor.containsMultipleArrayPaths(paths.values())) {
-            // Cannot pushdown a filter with multiple unnest
-            // TODO allow rewindable column readers for filters
-            // TODO this is a bit conservative (maybe too conservative) as we 
can push part of expression down
-            return;
+            // Conservative fallback: attempt to push a subset of top-level 
AND operands
+            // that do not collectively introduce multiple array paths.
+            if (inlinedExpr instanceof AbstractFunctionCallExpression
+                    && 
BuiltinFunctions.AND.equals(((AbstractFunctionCallExpression) 
inlinedExpr).getFunctionIdentifier())) {
+                List<Mutable<ILogicalExpression>> args = 
((AbstractFunctionCallExpression) inlinedExpr)
+                        .getArguments();
+                List<Mutable<ILogicalExpression>> selectedArgs = new 
ArrayList<>();
+                List<ARecordType> selectedPaths = new ArrayList<>();
+
+                for (Mutable<ILogicalExpression> argRef : args) {
+                    ILogicalExpression arg = argRef.getValue();
+                    List<ARecordType> argPaths = 
collectPathsForExpression(arg);
+                    // Test if adding this arg's paths would cause multiple 
array paths
+                    checkerVisitor.beforeVisit();
+                    List<ARecordType> merged = new ArrayList<>(selectedPaths);
+                    merged.addAll(argPaths);
+                    if (!checkerVisitor.containsMultipleArrayPaths(merged)) {
+                        selectedArgs.add(new MutableObject<>(arg));
+                        selectedPaths.addAll(argPaths);
+                    }
+                }
+
+                if (selectedArgs.isEmpty()) {
+                    // nothing pushable
+                    return;
+                }
+
+                // Build new inlined expression from selectedArgs
+                AbstractFunctionCallExpression newExpr;
+                if (selectedArgs.size() == 1) {
+                    // single argument
+                    ILogicalExpression single = selectedArgs.get(0).getValue();
+                    putFilterInformation(scanDefineDescriptor, single);
+                    return;

Review Comment:
   When `selectedArgs.size() == 1`, this calls 
`putFilterInformation(scanDefineDescriptor, single)` recursively while leaving 
`paths` unchanged. Since `paths.values()` still contains the full set of paths 
(including the ones that triggered `containsMultipleArrayPaths`), the recursive 
call will typically hit the same `containsMultipleArrayPaths(...)` branch and 
then return early (because `single` is not an AND), resulting in *no* predicate 
being pushed down. Instead, avoid recursion here: set `inlinedExpr` to the 
single selected predicate and also restrict `paths` to only the entries needed 
by that selected predicate (and re-check/reset the visitor state if needed).
   ```suggestion
                       // Single argument: avoid recursion. Use it directly as 
the inlined expression
                       // and restrict 'paths' to only those needed by this 
predicate.
                       ILogicalExpression single = 
selectedArgs.get(0).getValue();
                       inlinedExpr = single;
                       // Keep only entries whose ARecordType is among the 
selected paths
                       paths.entrySet().removeIf(e -> 
!selectedPaths.contains(e.getValue()));
   ```



##########
asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/pushdown/processor/ColumnFilterPushdownProcessor.java:
##########
@@ -159,10 +159,51 @@ protected boolean pushdownFilter(ScanDefineDescriptor 
scanDescriptor) throws Alg
     protected void putFilterInformation(ScanDefineDescriptor 
scanDefineDescriptor, ILogicalExpression inlinedExpr)
             throws AlgebricksException {
         if (checkerVisitor.containsMultipleArrayPaths(paths.values())) {
-            // Cannot pushdown a filter with multiple unnest
-            // TODO allow rewindable column readers for filters
-            // TODO this is a bit conservative (maybe too conservative) as we 
can push part of expression down
-            return;
+            // Conservative fallback: attempt to push a subset of top-level 
AND operands
+            // that do not collectively introduce multiple array paths.
+            if (inlinedExpr instanceof AbstractFunctionCallExpression
+                    && 
BuiltinFunctions.AND.equals(((AbstractFunctionCallExpression) 
inlinedExpr).getFunctionIdentifier())) {
+                List<Mutable<ILogicalExpression>> args = 
((AbstractFunctionCallExpression) inlinedExpr)
+                        .getArguments();
+                List<Mutable<ILogicalExpression>> selectedArgs = new 
ArrayList<>();
+                List<ARecordType> selectedPaths = new ArrayList<>();
+
+                for (Mutable<ILogicalExpression> argRef : args) {
+                    ILogicalExpression arg = argRef.getValue();
+                    List<ARecordType> argPaths = 
collectPathsForExpression(arg);
+                    // Test if adding this arg's paths would cause multiple 
array paths
+                    checkerVisitor.beforeVisit();
+                    List<ARecordType> merged = new ArrayList<>(selectedPaths);
+                    merged.addAll(argPaths);
+                    if (!checkerVisitor.containsMultipleArrayPaths(merged)) {
+                        selectedArgs.add(new MutableObject<>(arg));
+                        selectedPaths.addAll(argPaths);
+                    }

Review Comment:
   The subset-selection logic checks `containsMultipleArrayPaths(merged)` using 
only `selectedPaths` + `argPaths`, but it doesn't include any paths that may 
already have been pushed down earlier for this scan (i.e., 
`scanDefineDescriptor.getFilterPaths()` from prior `putFilterInformation` 
calls). Since `paths` is accumulated across multiple candidate filters during 
`pushdownFilter(...)`, this can still select a conjunct that introduces a 
second array path when combined with previously accepted predicates, defeating 
the constraint you’re trying to preserve. Consider seeding `selectedPaths` (or 
the `merged` list) with the already-pushed paths from the descriptor before 
evaluating each new conjunct.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to