[
https://issues.apache.org/jira/browse/ASTERIXDB-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yingyi Bu resolved ASTERIXDB-1267.
----------------------------------
Resolution: Fixed
I think the issue has been fixed after Murtadha's fix for ASTERIXDB-1340. I
tried a 9 NC cluster (3 partitions each NC) on my local machine and it works.
(The trace is independent of the amount of data to process.)
[~pouria][~javierjia], please re-open it if you still see the issue.
> TPCH query failure with runtime error at scale
> ----------------------------------------------
>
> Key: ASTERIXDB-1267
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-1267
> Project: Apache AsterixDB
> Issue Type: Bug
> Reporter: Pouria
> Assignee: Yingyi Bu
>
> - TPC-H query 4 (AQL) fails with a internal/runtime error, only at scale.
> - This query works fine against an asterixdb instance with a simple
> configuration (it passes our build tests for example) - but it fails on
> cluster (I am using 9 NCs, 3 partitions (IO device) each).
> - This query was working fine back in Summer 2015, and it seems a recent
> check-in broke it.
> - Here is the query:
> {noformat}
> declare function tmp()
> {
> for $l in dataset('LineItem')
> where $l.l_commitdate < $l.l_receiptdate
> distinct by $l.l_orderkey
> return { "o_orderkey": $l.l_orderkey }
> }
> for $o in dataset('Orders')
> for $t in tmp()
> where $o.o_orderkey = $t.o_orderkey and
> $o.o_orderdate >= '1993-07-01' and $o.o_orderdate < '1993-10-01'
> group by $o_orderpriority := $o.o_orderpriority with $o
> order by $o_orderpriority
> return {
> "order_priority": $o_orderpriority,
> "count": count($o)
> }
> {noformat}
> - Here is the optimized plan:
> {noformat}
> distribute result [%0->$$25]
> -- DISTRIBUTE_RESULT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$25])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$25] <- [function-call: asterix:closed-record-constructor,
> Args:[AString: {order_priority}, %0->$$3, AString: {count}, %0->$$35]]
> -- ASSIGN |PARTITIONED|
> exchange
> -- SORT_MERGE_EXCHANGE [$$3(ASC) ] |PARTITIONED|
> group by ([$$3 := %0->$$39]) decor ([]) {
> aggregate [$$35] <- [function-call: asterix:agg-sum,
> Args:[%0->$$38]]
> -- AGGREGATE |LOCAL|
> nested tuple source
> -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- PRE_CLUSTERED_GROUP_BY[$$39] |PARTITIONED|
> exchange
> -- HASH_PARTITION_MERGE_EXCHANGE MERGE:[$$39(ASC)] HASH:[$$39]
> |PARTITIONED|
> group by ([$$39 := %0->$$27]) decor ([]) {
> aggregate [$$38] <- [function-call:
> asterix:agg-count, Args:[AInt64: {1}]]
> -- AGGREGATE |LOCAL|
> nested tuple source
> -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- SORT_GROUP_BY[$$27] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$27])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> join (function-call: algebricks:eq, Args:[%0->$$30,
> %0->$$31])
> -- HYBRID_HASH_JOIN [$$30][$$31] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$27, $$30])
> -- STREAM_PROJECT |PARTITIONED|
> select (function-call: algebricks:and,
> Args:[function-call: algebricks:ge, Args:[%0->$$29, AString: {1993-07-01}],
> function-call: algebricks:lt, Args:[%0->$$29, AString: {1993-10-01}]])
> -- STREAM_SELECT |PARTITIONED|
> project ([$$27, $$29, $$30])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$27, $$29] <- [function-call:
> asterix:field-access-by-index, Args:[%0->$$4, AInt32: {5}], function-call:
> asterix:field-access-by-index, Args:[%0->$$4, AInt32: {4}]]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> unnest-map [$$30, $$4] <- function-call:
> asterix:index-search, Args:[AString: {Orders}, AInt32: {0}, AString:
> {tpch_450g}, AString: {Orders}, ABoolean: {false}, ABoolean: {false},
> ABoolean: {false}, AInt32: {1}, %0->$$43, AInt32: {1}, %0->$$43, TRUE, TRUE,
> TRUE]
> -- BTREE_SEARCH |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> order (ASC, %0->$$43)
> -- STABLE_SORT [$$43(ASC)]
> |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE
> |PARTITIONED|
> project ([$$43])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE
> |PARTITIONED|
> unnest-map [$$42, $$43] <-
> function-call: asterix:index-search, Args:[AString: {orders_orderdateIx},
> AInt32: {0}, AString: {tpch_450g}, AString: {Orders}, ABoolean: {false},
> ABoolean: {false}, ABoolean: {false}, AInt32: {1}, %0->$$40, AInt32: {1},
> %0->$$41, TRUE, FALSE, FALSE]
> -- BTREE_SEARCH |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE
> |PARTITIONED|
> assign [$$40, $$41] <-
> [AString: {1993-07-01}, AString: {1993-10-01}]
> -- ASSIGN |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE
> |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> distinct ([%0->$$31])
> -- PRE_SORTED_DISTINCT_BY |PARTITIONED|
> exchange
> -- HASH_PARTITION_MERGE_EXCHANGE
> MERGE:[$$31(ASC)] HASH:[$$31] |PARTITIONED|
> project ([$$31])
> -- STREAM_PROJECT |PARTITIONED|
> select (function-call: algebricks:lt,
> Args:[function-call: asterix:field-access-by-index, Args:[%0->$$5, AInt32:
> {11}], function-call: asterix:field-access-by-index, Args:[%0->$$5, AInt32:
> {12}]])
> -- STREAM_SELECT |PARTITIONED|
> project ([$$5, $$31])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$31, $$32, $$5] <-
> tpch_450g:LineItem
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> {noformat}
> - Here are error logs from two NCs:
> from NC1:
> {noformat}
> INFO: Initializing TAID:TID:ANID:ODID:1:1:0:0 ->
> [org.apache.hyracks.dataflow.std.sort.ExternalSortOperatorDescriptor$2@6b5870a9,
>
> org.apache.hyracks.storage.am.btree.dataflow.BTreeSearchOperatorDescriptor@1306c1d9,
>
> org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoinOperatorDescriptor$ProbeAndJoinActivityNode@a31e46f,
>
> org.apache.hyracks.dataflow.std.group.sort.SortGroupByOperatorDescriptor$1@619ff47e,
> Asterix {
> assign [2, 3] :=
> [org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@496a0ddc,
>
> org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@6c471521];
> stream-project [2, 3, 0];
> stream-select
> org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@69d0865a;
> stream-project [0, 2];
> }, Asterix {
> stream-project [0];
> }]
> Jan 15, 2016 10:35:07 AM org.apache.hyracks.control.nc.work.StartTasksWork run
> INFO: Initializing TAID:TID:ANID:ODID:1:1:1:0 ->
> [org.apache.hyracks.dataflow.std.sort.ExternalSortOperatorDescriptor$2@6b5870a9,
>
> org.apache.hyracks.storage.am.btree.dataflow.BTreeSearchOperatorDescriptor@1306c1d9,
>
> org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoinOperatorDescriptor$ProbeAndJoinActivityNode@a31e46f,
>
> org.apache.hyracks.dataflow.std.group.sort.SortGroupByOperatorDescriptor$1@619ff47e,
> Asterix {
> assign [2, 3] :=
> [org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@496a0ddc,
>
> org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@6c471521];
> stream-project [2, 3, 0];
> stream-select
> org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@69d0865a;
> stream-project [0, 2];
> }, Asterix {
> stream-project [0];
> }]
> Jan 15, 2016 10:35:07 AM org.apache.hyracks.control.nc.work.StartTasksWork run
> INFO: Initializing TAID:TID:ANID:ODID:1:1:2:0 ->
> [org.apache.hyracks.dataflow.std.sort.ExternalSortOperatorDescriptor$2@6b5870a9,
>
> org.apache.hyracks.storage.am.btree.dataflow.BTreeSearchOperatorDescriptor@1306c1d9,
>
> org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoinOperatorDescriptor$ProbeAndJoinActivityNode@a31e46f,
>
> org.apache.hyracks.dataflow.std.group.sort.SortGroupByOperatorDescriptor$1@619ff47e,
> Asterix {
> assign [2, 3] :=
> [org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@496a0ddc,
>
> org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@6c471521];
> stream-project [2, 3, 0];
> stream-select
> org.apache.hyracks.algebricks.core.algebra.expressions.LogicalExpressionJobGenToExpressionRuntimeProviderAdapter$ScalarEvaluatorFactoryAdapter@69d0865a;
> stream-project [0, 2];
> }, Asterix {
> stream-project [0];
> }]
> org.apache.hyracks.api.exceptions.HyracksDataException:
> java.util.concurrent.ExecutionException: java.lang.NullPointerException
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:215)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
> at org.apache.hyracks.control.nc.Task.run(Task.java:258)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException:
> java.lang.NullPointerException
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:213)
> ... 5 more
> Caused by: java.lang.NullPointerException
> at
> org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$MergeActivity$1.initialize(AbstractSorterOperatorDescriptor.java:183)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:204)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:201)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ... 3 more
> org.apache.hyracks.api.exceptions.HyracksDataException:
> java.util.concurrent.ExecutionException: java.lang.NullPointerException
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:215)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
> at org.apache.hyracks.control.nc.Task.run(Task.java:258)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException:
> java.lang.NullPointerException
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:213)
> ... 5 more
> Caused by: java.lang.NullPointerException
> at
> org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$MergeActivity$1.initialize(AbstractSorterOperatorDescriptor.java:183)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:204)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:201)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ... 3 more
> org.apache.hyracks.api.exceptions.HyracksDataException:
> java.util.concurrent.ExecutionException: java.lang.NullPointerException
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:215)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
> at org.apache.hyracks.control.nc.Task.run(Task.java:258)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException:
> java.lang.NullPointerException
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:213)
> ... 5 more
> Caused by: java.lang.NullPointerException
> at
> org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$MergeActivity$1.initialize(AbstractSorterOperatorDescriptor.java:183)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:204)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:201)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ... 3 more
> {noformat}
> From NC2:
> {noformat}
> org.apache.hyracks.api.exceptions.HyracksDataException:
> java.util.concurrent.ExecutionException: java.lang.NullPointerException
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:215)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
> at org.apache.hyracks.control.nc.Task.run(Task.java:258)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException:
> java.lang.NullPointerException
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:213)
> ... 5 more
> Caused by: java.lang.NullPointerException
> at
> org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$MergeActivity$1.initialize(AbstractSorterOperatorDescriptor.java:183)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:204)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:201)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ... 3 more
> org.apache.hyracks.api.exceptions.HyracksDataException:
> java.util.concurrent.ExecutionException: java.lang.NullPointerException
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:215)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
> at org.apache.hyracks.control.nc.Task.run(Task.java:258)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException:
> java.lang.NullPointerException
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:213)
> ... 5 more
> Caused by: java.lang.NullPointerException
> at
> org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$MergeActivity$1.initialize(AbstractSorterOperatorDescriptor.java:183)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:204)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:201)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ... 3 more
> Jan 15, 2016 10:35:08 AM
> org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
> INFO: Executing: NotifyTaskFailure
> org.apache.hyracks.api.exceptions.HyracksDataException:
> java.util.concurrent.ExecutionException: java.lang.NullPointerException
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:215)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
> at org.apache.hyracks.control.nc.Task.run(Task.java:258)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException:
> java.lang.NullPointerException
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:213)
> ... 5 more
> Caused by: java.lang.NullPointerException
> at
> org.apache.hyracks.dataflow.std.sort.AbstractSorterOperatorDescriptor$MergeActivity$1.initialize(AbstractSorterOperatorDescriptor.java:183)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:204)
> at
> org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:201)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ... 3 more
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)