Wail - I went ahead and filed ASTERIXDB-1670 <https://issues.apache.org/jira/browse/ASTERIXDB-1670> for you. I tried to change the reporter to your name, but I don't have permission to edit the reporter field.
Thanks,
Khurram

On Sat, Oct 1, 2016 at 10:44 PM, Yingyi Bu <[email protected]> wrote:

> PS, if you still have the OOM instance, can you do a Yourkit memory
> profile?
> Thanks!
>
> Best,
> Yingyi
>
> On Sat, Oct 1, 2016 at 9:43 AM, Yingyi Bu <[email protected]> wrote:
>
> > Wail,
> >
> > Can you attach the query plan for query 1?
> > I tried
> > count( for $x in dataset beers
> > return $x
> > )
> >
> > and got the following plan, which seems OK:
> > -- DISTRIBUTE_RESULT |UNPARTITIONED|
> > exchange
> > -- ONE_TO_ONE_EXCHANGE |UNPARTITIONED|
> > aggregate [$$5] <- [function-call: asterix:agg-sum, Args:[%0->$$8]]
> > -- AGGREGATE |UNPARTITIONED|
> > exchange
> > -- RANDOM_MERGE_EXCHANGE |PARTITIONED|
> > aggregate [$$8] <- [function-call: asterix:agg-count, Args:[%0->$$0]]
> > -- AGGREGATE |PARTITIONED|
> > project ([$$0])
> > -- STREAM_PROJECT |PARTITIONED|
> > exchange
> > -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> > data-scan []<-[$$6, $$0, $$7] <- Default:beers
> > -- DATASOURCE_SCAN |PARTITIONED|
> > exchange
> > -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> > empty-tuple-source
> > -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> >
> > Best,
> > Yingyi
> >
> > On Sat, Oct 1, 2016 at 9:25 AM, Mike Carey <[email protected]> wrote:
> >
> >> Sounds like there is a new materialization bug there..... Please file a
> >> JIRA issue (and we'll need a query plan test case to keep it from breaking
> >> again).
> >>
> >> On 10/1/16 2:01 AM, Wail Alkowaileet wrote:
> >>
> >>> Hi,
> >>>
> >>> I know that early projections will enhance the performance.
> >>> I just noticed something:
> >>>
> >>> 1- returning the whole tuple
> >>> count( for $x in dataset Tweets
> >>> return $x
> >>> )
> >>>
> >>> => Throws an exception Java heap exceeded. (The heap-size is less than the
> >>> sum of AsterixDB configured memory ... so it's not a problem).
> >>>
> >>> 2- However, returning one field
> >>> count( for $x in dataset Tweets
> >>> return $x.id
> >>> )
> >>>
> >>> => Worked just fine.
> >>>
> >>> I'm just wondering, does the projection in count() affects its
> >>> performance ?
