[ 
https://issues.apache.org/jira/browse/ASTERIXDB-3582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17944620#comment-17944620
 ] 

ASF subversion and git services commented on ASTERIXDB-3582:
------------------------------------------------------------

Commit 32627a475154e51edcd7257e4c100aca7641dfa4 in asterixdb's branch 
refs/heads/master from Ritik Raj
[ https://gitbox.apache.org/repos/asf?p=asterixdb.git;h=32627a4751 ]

[ASTERIXDB-3582][COMP] Fix expected schema tree generation

- user model changes: no
- storage format changes: no
- interface changes: no

Ext-ref: MB-65792
Change-Id: Ic04a618c7aa182af4b1f4b7ade64d687147ef705
Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19532
Reviewed-by: Ali Alsuliman <[email protected]>
Tested-by: Ali Alsuliman <[email protected]>


> Issues in values and filter pushdown for column collections
> -----------------------------------------------------------
>
>                 Key: ASTERIXDB-3582
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-3582
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: COMP - Compiler, STO - Storage
>            Reporter: Ritik Raj
>            Assignee: Ritik Raj
>            Priority: Critical
>              Labels: triaged
>             Fix For: 0.9.10
>
>
> A few issues have been identified in value and filter pushdown for
> column collections.
> 1. Consider the following query:
> {code:java}
> USE commerce.marketing;
> CREATE FUNCTION sent(txt) {
> LET pos = ["bomb","needs"], neg=["shrinks","shrunk","smaller"], exp = 
> split(txt, " ")
> SELECT CASE 
>          WHEN (
>                (SOME w IN pos SATISFIES (w IN exp))
>                AND
>                (EVERY w IN neg SATISFIES (w NOT IN exp))
>               )     
>          THEN "positive" 
>          WHEN (
>                (SOME w IN neg SATISFIES (w IN exp)) 
>                AND 
>                (EVERY w IN pos SATISFIES (w NOT IN exp))
>               ) 
>          THEN "negative"
>          ELSE "neutral"
>        END
> }; 
> USE commerce.marketing;
> SELECT r.*, sent(r.text) FROM reviews r;{code}
> The query fails with an Internal Error and the following stack trace:
> {code:java}
> 2025-03-17T17:13:50.133+00:00 INFO CBAS.translator.QueryTranslator 
> [QueryTranslator:4ba4e252-4fde-4a6e-8b97-039b11366919] null
> java.util.ConcurrentModificationException: null
>         at 
> java.base/java.util.HashMap$HashIterator.nextNode(HashMap.java:1597) ~[?:?]
>         at java.base/java.util.HashMap$EntryIterator.next(HashMap.java:1630) 
> ~[?:?]
>         at java.base/java.util.HashMap$EntryIterator.next(HashMap.java:1628) 
> ~[?:?]
>         at 
> org.apache.asterix.optimizer.rules.pushdown.processor.AbstractFilterPushdownProcessor.putPotentialSelects(AbstractFilterPushdownProcessor.java:194)
>  ~[asterix-algebra-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.asterix.optimizer.rules.pushdown.processor.AbstractFilterPushdownProcessor.process(AbstractFilterPushdownProcessor.java:79)
>  ~[asterix-algebra-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.asterix.optimizer.rules.pushdown.PushdownProcessorsExecutor.execute(PushdownProcessorsExecutor.java:63)
>  ~[asterix-algebra-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.asterix.optimizer.rules.PushValueAccessAndFilterDownRule.rewritePre(PushValueAccessAndFilterDownRule.java:102)
>  ~[asterix-algebra-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.hyracks.algebricks.core.rewriter.base.AbstractRuleController.rewriteOperatorRef(AbstractRuleController.java:79)
>  ~[algebricks-core-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.hyracks.algebricks.compiler.rewriter.rulecontrollers.SequentialOnceRuleController.rewriteWithRuleCollection(SequentialOnceRuleController.java:43)
>  ~[algebricks-compiler-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.hyracks.algebricks.core.rewriter.base.HeuristicOptimizer.runOptimizationSets(HeuristicOptimizer.java:92)
>  ~[algebricks-core-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.hyracks.algebricks.core.rewriter.base.HeuristicOptimizer.runPhysicalOptimizationSets(HeuristicOptimizer.java:122)
>  ~[algebricks-core-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.hyracks.algebricks.core.rewriter.base.HeuristicOptimizer.optimize(HeuristicOptimizer.java:66)
>  ~[algebricks-core-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.hyracks.algebricks.compiler.api.HeuristicCompilerFactoryBuilder$CompilerImpl.optimize(HeuristicCompilerFactoryBuilder.java:165)
>  ~[algebricks-compiler-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.asterix.api.common.APIFramework.compileQuery(APIFramework.java:289)
>  ~[asterix-app-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.asterix.app.translator.QueryTranslator.rewriteCompileQuery(QueryTranslator.java:4322)
>  ~[asterix-app-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.asterix.app.translator.QueryTranslator.lambda$handleQuery$3(QueryTranslator.java:5280)
>  ~[asterix-app-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.asterix.app.translator.QueryTranslator.createAndRunJob(QueryTranslator.java:5433)
>  ~[asterix-app-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.asterix.app.translator.QueryTranslator.deliverResult(QueryTranslator.java:5326)
>  ~[asterix-app-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.asterix.app.translator.QueryTranslator.handleQuery(QueryTranslator.java:5296)
>  ~[asterix-app-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.asterix.app.translator.QueryTranslator.compileAndExecute(QueryTranslator.java:534)
>  ~[asterix-app-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.asterix.app.message.ExecuteStatementRequestMessage.handle(ExecuteStatementRequestMessage.java:181)
>  ~[asterix-app-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.asterix.messaging.CCMessageBroker.receivedMessage(CCMessageBroker.java:64)
>  ~[asterix-app-1.0.3-2467.jar:1.0.3-2467]
>         at 
> org.apache.hyracks.control.cc.work.ApplicationMessageWork.lambda$notifyMessageBroker$0(ApplicationMessageWork.java:74)
>  ~[hyracks-control-cc-1.0.3-2467.jar:1.0.3-2467]
>         at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>  [?:?]
>         at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>  [?:?]
>         at java.base/java.lang.Thread.run(Thread.java:840) [?:?] {code}
> The failure occurs because we modify the map that tracks the subplan
> operators while iterating over that same map.
> Consider the plan below:
> {code:java}
> distribute result [$$163] [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
> -- DISTRIBUTE_RESULT  |PARTITIONED|  
>   project ([$$163]) [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>   -- STREAM_PROJECT  |PARTITIONED|  
>     assign [$$163] <- [{"$1": $$162}] [cardinality: 0.0, op-cost: 0.0, 
> total-cost: 0.0]  
>     -- ASSIGN  |PARTITIONED|  
>       project ([$$162]) [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0] -- 
> |UNPARTITIONED|  
>         subplan {  
>                   aggregate [$$162] <- [listify($$161)] [cardinality: 0.0, 
> op-cost: 0.0, total-cost: 0.0]  
>                   -- AGGREGATE  |LOCAL|  
>                     assign [$$161] <- [{"$2": switch-case(true, and($$140, 
> $$146), "positive", and($$153, $$159), "negative", "neutral")}] [cardinality: 
> 0.0, op-cost: 0.0, total-cost: 0.0]  
>                     -- ASSIGN  |LOCAL|  
>                       subplan {  
>                                 aggregate [$$159] <- [empty-stream()] 
> [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                 -- AGGREGATE  |LOCAL|  
>                                   select (not(if-missing-or-null($$158, 
> false))) [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                   -- STREAM_SELECT  |LOCAL|  
>                                     subplan {  
>                                               aggregate [$$158] <- 
> [empty-stream()] [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                               -- AGGREGATE  |LOCAL|  
>                                                 select 
> (not(if-missing-or-null(neq($$w, $#6), false))) [cardinality: 0.0, op-cost: 
> 0.0, total-cost: 0.0]  
>                                                 -- STREAM_SELECT  |LOCAL|  
>                                                   unnest $#6 <- 
> scan-collection(split($$166, " ")) [cardinality: 0.0, op-cost: 0.0, 
> total-cost: 0.0]  
>                                                   -- UNNEST  |LOCAL|  
>                                                     nested tuple source 
> [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                                     -- NESTED_TUPLE_SOURCE  
> |LOCAL|  
>                                            } [cardinality: 0.0, op-cost: 0.0, 
> total-cost: 0.0]  
>                                     -- SUBPLAN  |LOCAL|  
>                                       unnest $$w <- scan-collection(array: [ 
> "bomb", "needs" ]) [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                       -- UNNEST  |LOCAL|  
>                                         nested tuple source [cardinality: 
> 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                         -- NESTED_TUPLE_SOURCE  |LOCAL|  
>                              } [cardinality: 0.0, op-cost: 0.0, total-cost: 
> 0.0]  
>                       -- SUBPLAN  |LOCAL|  
>                         subplan {  
>                                   aggregate [$$153] <- [non-empty-stream()] 
> [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                   -- AGGREGATE  |LOCAL|  
>                                     select ($$152) [cardinality: 0.0, 
> op-cost: 0.0, total-cost: 0.0]  
>                                     -- STREAM_SELECT  |LOCAL|  
>                                       subplan {  
>                                                 aggregate [$$152] <- 
> [non-empty-stream()] [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                                 -- AGGREGATE  |LOCAL|  
>                                                   select (eq($$w, $#5)) 
> [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                                   -- STREAM_SELECT  |LOCAL|  
>                                                     unnest $#5 <- 
> scan-collection(split($$166, " ")) [cardinality: 0.0, op-cost: 0.0, 
> total-cost: 0.0]  
>                                                     -- UNNEST  |LOCAL|  
>                                                       nested tuple source 
> [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                                       -- NESTED_TUPLE_SOURCE  
> |LOCAL|  
>                                              } [cardinality: 0.0, op-cost: 
> 0.0, total-cost: 0.0]  
>                                       -- SUBPLAN  |LOCAL|  
>                                         unnest $$w <- scan-collection(array: 
> [ "shrinks", "shrunk", "smaller" ]) [cardinality: 0.0, op-cost: 0.0, 
> total-cost: 0.0]  
>                                         -- UNNEST  |LOCAL|  
>                                           nested tuple source [cardinality: 
> 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                           -- NESTED_TUPLE_SOURCE  |LOCAL|  
>                                } [cardinality: 0.0, op-cost: 0.0, total-cost: 
> 0.0]  
>                         -- SUBPLAN  |LOCAL|  
>                           subplan {  
>                                     aggregate [$$146] <- [empty-stream()] 
> [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                     -- AGGREGATE  |LOCAL|  
>                                       select (not(if-missing-or-null($$145, 
> false))) [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                       -- STREAM_SELECT  |LOCAL|  
>                                         subplan {  
>                                                   aggregate [$$145] <- 
> [empty-stream()] [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                                   -- AGGREGATE  |LOCAL|  
>                                                     select 
> (not(if-missing-or-null(neq($$w, $#4), false))) [cardinality: 0.0, op-cost: 
> 0.0, total-cost: 0.0]  
>                                                     -- STREAM_SELECT  |LOCAL| 
>  
>                                                       unnest $#4 <- 
> scan-collection(split($$166, " ")) [cardinality: 0.0, op-cost: 0.0, 
> total-cost: 0.0]  
>                                                       -- UNNEST  |LOCAL|  
>                                                         nested tuple source 
> [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                                         -- 
> NESTED_TUPLE_SOURCE  |LOCAL|  
>                                                } [cardinality: 0.0, op-cost: 
> 0.0, total-cost: 0.0]  
>                                         -- SUBPLAN  |LOCAL|  
>                                           unnest $$w <- 
> scan-collection(array: [ "shrinks", "shrunk", "smaller" ]) [cardinality: 0.0, 
> op-cost: 0.0, total-cost: 0.0]  
>                                           -- UNNEST  |LOCAL|  
>                                             nested tuple source [cardinality: 
> 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                             -- NESTED_TUPLE_SOURCE  |LOCAL|  
>                                  } [cardinality: 0.0, op-cost: 0.0, 
> total-cost: 0.0]  
>                           -- SUBPLAN  |LOCAL|  
>                             subplan {  
>                                       aggregate [$$140] <- 
> [non-empty-stream()] [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                       -- AGGREGATE  |LOCAL|  
>                                         select ($$139) [cardinality: 0.0, 
> op-cost: 0.0, total-cost: 0.0]  
>                                         -- STREAM_SELECT  |LOCAL|  
>                                           subplan {  
>                                                     aggregate [$$139] <- 
> [non-empty-stream()] [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                                     -- AGGREGATE  |LOCAL|  
>                                                       select (eq($$w, $#3)) 
> [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                                       -- STREAM_SELECT  
> |LOCAL|  
>                                                         unnest $#3 <- 
> scan-collection(split($$166, " ")) [cardinality: 0.0, op-cost: 0.0, 
> total-cost: 0.0]  
>                                                         -- UNNEST  |LOCAL|  
>                                                           nested tuple source 
> [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                                           -- 
> NESTED_TUPLE_SOURCE  |LOCAL|  
>                                                  } [cardinality: 0.0, 
> op-cost: 0.0, total-cost: 0.0]  
>                                           -- SUBPLAN  |LOCAL|  
>                                             unnest $$w <- 
> scan-collection(array: [ "bomb", "needs" ]) [cardinality: 0.0, op-cost: 0.0, 
> total-cost: 0.0]  
>                                             -- UNNEST  |LOCAL|  
>                                               nested tuple source 
> [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                                               -- NESTED_TUPLE_SOURCE  |LOCAL| 
>  
>                                    } [cardinality: 0.0, op-cost: 0.0, 
> total-cost: 0.0]  
>                             -- SUBPLAN  |LOCAL|  
>                               nested tuple source [cardinality: 0.0, op-cost: 
> 0.0, total-cost: 0.0]  
>                               -- NESTED_TUPLE_SOURCE  |LOCAL|  
>                } [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>         -- SUBPLAN  |PARTITIONED|  
>           project ([$$166]) [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0] 
>  
>           -- STREAM_PROJECT  |PARTITIONED|  
>             assign [$$166] <- [$$r.getField("text")] [cardinality: 0.0, 
> op-cost: 0.0, total-cost: 0.0]  
>             -- ASSIGN  |PARTITIONED|  
>               project ([$$r]) [cardinality: 0.0, op-cost: 0.0, total-cost: 
> 0.0] -- |UNPARTITIONED|  
>                 data-scan []<-[$$164, $$165, $$r] <- marketing.reviews 
> [cardinality: 0.0, op-cost: 0.0, total-cost: 0.0]  
>                 -- DATASOURCE_SCAN  |PARTITIONED|  
>                   empty-tuple-source [cardinality: 0.0, op-cost: 0.0, 
> total-cost: 0.0]  
>                   -- EMPTY_TUPLE_SOURCE  |PARTITIONED| {code}
>  
> The issue arises because the plan contains subplans nested within subplans.
> While iterating over the subplan map to identify filters that can be pushed
> down, we encounter new subplans that consume the output of the current
> subplan. To account for these newly discovered subplans, we attempt to add
> them to the map during iteration. However, the map is a regular HashMap,
> which does not support structural modification while it is being iterated,
> so this leads to a ConcurrentModificationException.
>  
>  
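
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the code from the commit: the class SubplanMapSketch, the CONSUMERS table, and the subplan names are hypothetical stand-ins for the optimizer's real data structures. The worklist in collectWithWorklist is one common way to avoid the exception and is not necessarily the fix that was applied.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SubplanMapSketch {
    // Hypothetical stand-in: subplan name -> subplans that consume its output.
    static final Map<String, List<String>> CONSUMERS = Map.of(
            "s1", List.of("s1.a"),
            "s1.a", List.of("s1.b"),
            "s2", List.of("s2.a"));

    /** Reproduces the bug: adds new keys to a HashMap while iterating over it. */
    static boolean failsWithCme() {
        Map<String, List<String>> tracked = new HashMap<>();
        tracked.put("s1", CONSUMERS.get("s1"));
        tracked.put("s2", CONSUMERS.get("s2"));
        try {
            for (Map.Entry<String, List<String>> entry : tracked.entrySet()) {
                for (String consumer : entry.getValue()) {
                    // Structural modification while the entry iterator is live.
                    tracked.put(consumer, CONSUMERS.getOrDefault(consumer, List.of()));
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Alternative: drive the traversal from a queue, so the map is never iterated. */
    static List<String> collectWithWorklist(List<String> roots) {
        Map<String, List<String>> tracked = new LinkedHashMap<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String subplan = pending.poll();
            List<String> consumers = CONSUMERS.getOrDefault(subplan, List.of());
            if (tracked.putIfAbsent(subplan, consumers) != null) {
                continue; // already tracked
            }
            pending.addAll(consumers); // newly discovered subplans are queued, not iterated
        }
        return new ArrayList<>(tracked.keySet());
    }

    public static void main(String[] args) {
        System.out.println("HashMap iteration fails: " + failsWithCme());
        System.out.println("Worklist order: " + collectWithWorklist(List.of("s1", "s2")));
    }
}
```

In this sketch, newly discovered subplans go onto the pending queue rather than into a collection that is currently being iterated, so nested subplans at any depth are picked up without invalidating an iterator.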



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
