[jira] [Commented] (ASTERIXDB-2481) Out of Memory error doing aggregation - need a bound

2019-05-17 Thread Gift Sinthong (JIRA)


[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16842004#comment-16842004
 ] 

Gift Sinthong commented on ASTERIXDB-2481:
--

[~alsuliman], I just checked by running the same set of queries on the recent 
build. They returned the expected results, and the error no longer occurs.

> Out of Memory error doing aggregation - need a bound
> 
>
> Key: ASTERIXDB-2481
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2481
> Project: Apache AsterixDB
>  Issue Type: Bug
>  Components: COMP - Compiler, RT - Runtime, SQL - Translator SQL++
>Affects Versions: 0.9.4
> Environment: Linux
>Reporter: Gift Sinthong
>Assignee: Ali Alsuliman
>Priority: Critical
> Fix For: 0.9.4.2
>
> Attachments: Screen Shot 2018-11-14 at 3.12.31 PM.png, cc.log, 
> nc-1.log
>
>
> This is the schema:
> {noformat}
> CREATE TYPE Test AS open { unique2: int64 };
> CREATE DATASET wisconsin_5gb(Test) PRIMARY KEY unique2;
> {noformat}
> This is the query:
> {noformat}
> SELECT
> min(t.oddOnePercent) as min, 
> max(t.oddOnePercent) as max, 
> count(distinct t.oddOnePercent) as cnt
> FROM wisconsin_5gb t;
> {noformat}
> The plan for this query:
> {noformat}
> distribute result [$$46]
> -- DISTRIBUTE_RESULT  |UNPARTITIONED|
>   exchange
>   -- ONE_TO_ONE_EXCHANGE  |UNPARTITIONED|
>     project ([$$46])
>     -- STREAM_PROJECT  |UNPARTITIONED|
>       assign [$$46] <- [{"min": $$48, "max": $$49, "cnt": $$50}]
>       -- ASSIGN  |UNPARTITIONED|
>         project ([$$48, $$49, $$50])
>         -- STREAM_PROJECT  |UNPARTITIONED|
>           subplan {
>                     aggregate [$$50] <- [agg-sql-sum($$53)]
>                     -- AGGREGATE  |LOCAL|
>                       aggregate [$$53] <- [agg-sql-count($$43)]
>                       -- AGGREGATE  |LOCAL|
>                         distinct ([$$43])
>                         -- MICRO_PRE_SORTED_DISTINCT_BY  |LOCAL|
>                           order (ASC, $$43)
>                           -- IN_MEMORY_STABLE_SORT [$$43(ASC)]  |LOCAL|
>                             assign [$$43] <- [$$52.getField("oddOnePercent")]
>                             -- ASSIGN  |UNPARTITIONED|
>                               assign [$$52] <- [$#4.getField(0)]
>                               -- ASSIGN  |UNPARTITIONED|
>                                 unnest $#4 <- scan-collection($$28)
>                                 -- UNNEST  |UNPARTITIONED|
>                                   nested tuple source
>                                   -- NESTED_TUPLE_SOURCE  |UNPARTITIONED|
>                  }
>           -- SUBPLAN  |UNPARTITIONED|
>             aggregate [$$28, $$48, $$49] <- [listify($$27), agg-sql-min($$33), agg-sql-max($$33)]
>             -- AGGREGATE  |UNPARTITIONED|
>               exchange
>               -- RANDOM_MERGE_EXCHANGE  |PARTITIONED|
>                 project ([$$27, $$33])
>                 -- STREAM_PROJECT  |PARTITIONED|
>                   assign [$$33, $$27] <- [$$t.getField("oddOnePercent"), {"t": $$t}]
>                   -- ASSIGN  |PARTITIONED|
>                     project ([$$t])
>                     -- STREAM_PROJECT  |PARTITIONED|
>                       exchange
>                       -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>                         data-scan []<-[$$47, $$t] <- Default.wisconsin_5gb
>                         -- DATASOURCE_SCAN  |PARTITIONED|
>                           exchange
>                           -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
>                             empty-tuple-source
>                             -- EMPTY_TUPLE_SOURCE  |PARTITIONED|
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ASTERIXDB-2481) Out of Memory error doing aggregation - need a bound

2019-05-15 Thread Ali Alsuliman (JIRA)


[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840155#comment-16840155
 ] 

Ali Alsuliman commented on ASTERIXDB-2481:
--

[~psinthon]

Hi Gift,

A few changes related to listify, in-memory sort, and the out-of-memory issue 
have been merged to master. One change eliminates listify from the plan when 
possible. The second change makes the in-memory sort spill to disk when the 
memory budget is hit.
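
To illustrate the second change, here is a minimal sketch of a sort that spills sorted runs to disk once a budget is hit and then merges them, followed by a distinct count over the sorted stream. It is written in Python for brevity and is not the AsterixDB implementation. The names `external_sort`, `count_distinct` and `max_in_memory` are hypothetical, and the budget is counted in values rather than bytes to keep the example short.

```python
import heapq
import pickle
import tempfile
from itertools import groupby


def _write_run(sorted_values):
    """Write one sorted run to a temporary file and return the open file."""
    run = tempfile.TemporaryFile()
    for value in sorted_values:
        pickle.dump(value, run)
    run.seek(0)
    return run


def _read_run(run):
    """Yield values back from a run file in the order they were written."""
    while True:
        try:
            yield pickle.load(run)
        except EOFError:
            return


def external_sort(values, max_in_memory):
    """Yield values in ascending order while buffering at most
    max_in_memory of them (an assumed stand-in for a byte budget)."""
    runs, buffer = [], []
    for value in values:
        buffer.append(value)
        if len(buffer) >= max_in_memory:  # budget hit: spill a sorted run
            buffer.sort()
            runs.append(_write_run(buffer))
            buffer = []
    buffer.sort()
    try:
        # Merge the spilled runs with whatever is still in memory.
        yield from heapq.merge(*(_read_run(r) for r in runs), buffer)
    finally:
        for run in runs:
            run.close()


def count_distinct(values, max_in_memory=1000):
    """Count distinct values by sorting, then counting runs of equal values."""
    return sum(1 for _ in groupby(external_sort(values, max_in_memory)))
```

With this shape, the sort buffer stays bounded regardless of input size; only the number of run files grows with the data.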

Could you please check whether you are still facing the same issue? It would 
be nice to post the plan, too.






[jira] [Commented] (ASTERIXDB-2481) Out of Memory error doing aggregation

2018-11-18 Thread Michael J. Carey (JIRA)


[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16690986#comment-16690986
 ] 

Michael J. Carey commented on ASTERIXDB-2481:
-

There are actually two issues here:
 1. We should have a bound on the amount of memory used by an operator, so the 
cheating that's done by listify needs to be changed so that it accounts for its 
use and errors out when its aggregate operation hits its allotment of memory.
 2. We should rewrite the query to eliminate the listify.
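
The two points above can be sketched as follows. This is an illustration in Python under stated assumptions, not AsterixDB code: `BoundedListify`, `MemoryBudgetExceeded` and `min_max_count_distinct` are hypothetical names, and `sys.getsizeof` is only a rough stand-in for real memory accounting. The first class charges every collected value against a budget and errors out when the allotment is exceeded. The function shows the listify-free alternative, which computes the three aggregates in one pass without materializing the records.

```python
import sys


class MemoryBudgetExceeded(RuntimeError):
    """Raised when an operator would exceed its memory allotment."""


class BoundedListify:
    """Collect values into a list while charging each one against a budget."""

    def __init__(self, budget_bytes):
        self.budget_bytes = budget_bytes
        self.used_bytes = 0
        self.items = []

    def add(self, value):
        cost = sys.getsizeof(value)  # rough estimate, for illustration only
        if self.used_bytes + cost > self.budget_bytes:
            raise MemoryBudgetExceeded(
                f"listify needs {self.used_bytes + cost} bytes, "
                f"budget is {self.budget_bytes}"
            )
        self.used_bytes += cost
        self.items.append(value)


def min_max_count_distinct(values):
    """Compute min, max and count(distinct) in one pass, with no listify.

    Only the distinct field values are retained, never whole records.
    None is skipped, mirroring how SQL aggregates ignore NULL.
    """
    lo = hi = None
    seen = set()
    for value in values:
        if value is None:
            continue
        if lo is None or value < lo:
            lo = value
        if hi is None or value > hi:
            hi = value
        seen.add(value)
    return {"min": lo, "max": hi, "cnt": len(seen)}
```

Note that the set of distinct values still uses memory proportional to the number of distinct values, so a large-cardinality field would still need a bounded or spilling distinct step.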

> Out of Memory error doing aggregation
> -
>
> Key: ASTERIXDB-2481
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2481
> Project: Apache AsterixDB
>  Issue Type: Bug
>  Components: COMP - Compiler, RT - Runtime, SQL - Translator SQL++
>Affects Versions: 0.9.5
> Environment: Linux
>Reporter: Gift Sinthong
>Assignee: Dmitry Lychagin
>Priority: Critical
> Attachments: Screen Shot 2018-11-14 at 3.12.31 PM.png, cc.log, 
> nc-1.log
>
>





[jira] [Commented] (ASTERIXDB-2481) Out of Memory error doing aggregation

2018-11-16 Thread Gift Sinthong (JIRA)


[jira] [Commented] (ASTERIXDB-2481) Out of Memory error doing aggregation

2018-11-14 Thread Till (JIRA)


[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687392#comment-16687392
 ] 

Till commented on ASTERIXDB-2481:
-

Just formatted the description for readability.

> Out of Memory error doing aggregation
> -
>
> Key: ASTERIXDB-2481
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2481
> Project: Apache AsterixDB
>  Issue Type: Bug
>  Components: COMP - Compiler, RT - Runtime, SQL - Translator SQL++
>Affects Versions: 0.9.5
> Environment: Linux
>Reporter: Gift Sinthong
>Priority: Critical
> Attachments: Screen Shot 2018-11-14 at 3.12.31 PM.png
>
>





[jira] [Commented] (ASTERIXDB-2481) Out of Memory error doing aggregation

2018-11-14 Thread Taewoo Kim (JIRA)


[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687291#comment-16687291
 ] 

Taewoo Kim commented on ASTERIXDB-2481:
---

Can you also attach the log records?

> Out of Memory error doing aggregation
> -
>
> Key: ASTERIXDB-2481
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2481
> Project: Apache AsterixDB
>  Issue Type: Bug
>  Components: COMP - Compiler, RT - Runtime, SQL - Translator SQL++
>Affects Versions: 0.9.5
> Environment: Linux
>Reporter: Gift Sinthong
>Priority: Critical
> Attachments: Screen Shot 2018-11-14 at 3.12.31 PM.png
>
>
> This is the schema for this query:
> CREATE TYPE Test AS open{
>  unique2: int64
> };
> CREATE DATASET wisconsin_1gb(Test)
>  PRIMARY KEY unique2;
> This is the query:
> SELECT min( t.oddOnePercent) as min, max(t.oddOnePercent) as max, 
> count(distinct t.oddOnePercent) as cnt
>  FROM wisconsin_5gb t ;
>  
> The plan for this query:
> distribute result [$$46]
> -- DISTRIBUTE_RESULT |UNPARTITIONED|
>  exchange
>  -- ONE_TO_ONE_EXCHANGE |UNPARTITIONED|
>  project ([$$46])
>  -- STREAM_PROJECT |UNPARTITIONED|
>  assign [$$46] <- [\{"min": $$48, "max": $$49, "cnt": $$50}]
>  -- ASSIGN |UNPARTITIONED|
>  project ([$$48, $$49, $$50])
>  -- STREAM_PROJECT |UNPARTITIONED|
>  subplan {
>  aggregate [$$50] <- [agg-sql-sum($$53)]
>  -- AGGREGATE |LOCAL|
>  aggregate [$$53] <- [agg-sql-count($$43)]
>  -- AGGREGATE |LOCAL|
>  distinct ([$$43])
>  -- MICRO_PRE_SORTED_DISTINCT_BY |LOCAL|
>  order (ASC, $$43) 
>  -- IN_MEMORY_STABLE_SORT [$$43(ASC)] |LOCAL|
>  assign [$$43] <- [$$52.getField("oddOnePercent")]
>  -- ASSIGN |UNPARTITIONED|
>  assign [$$52] <- [$#4.getField(0)]
>  -- ASSIGN |UNPARTITIONED|
>  unnest $#4 <- scan-collection($$28)
>  -- UNNEST |UNPARTITIONED|
>  nested tuple source
>  -- NESTED_TUPLE_SOURCE |UNPARTITIONED|
>  }
>  -- SUBPLAN |UNPARTITIONED|
>  aggregate [$$28, $$48, $$49] <- [listify($$27), agg-sql-min($$33), 
> agg-sql-max($$33)]
>  -- AGGREGATE |UNPARTITIONED|
>  exchange
>  -- RANDOM_MERGE_EXCHANGE |PARTITIONED|
>  project ([$$27, $$33])
>  -- STREAM_PROJECT |PARTITIONED|
>  assign [$$33, $$27] <- [$$t.getField("oddOnePercent"), \{"t": $$t}]
>  -- ASSIGN |PARTITIONED|
>  project ([$$t])
>  -- STREAM_PROJECT |PARTITIONED|
>  exchange
>  -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
>  data-scan []<-[$$47, $$t] <- benchmark.wisconsin_5gb
>  -- DATASOURCE_SCAN |PARTITIONED|
>  exchange
>  -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
>  empty-tuple-source
>  -- EMPTY_TUPLE_SOURCE |PARTITIONED|


