Ok, so I took your advice and created a new data set with a filter on
timestamp and only a fraction of the indexes I had previously created. I
only kept the indexes based on the fields building, floor and id.
Attached you will find:
- The DDL for the data set
- The optimized logical plan for each query (excluding query number six). I
used a timestamp range of 30 days; I'll test on 7 days and 1 day tomorrow.
- The optimized logical plan for a full table scan on the data set.
The results were unfortunately disappointing. The attached diagram shows the
query results compared to the other two tests I have performed. To be
clear, the gray bars represent the data set with the DDL attached in this
e-mail. I tried to decipher the logical plan for each query, but I did not
get anything reasonable out of it.
It's worth mentioning that AsterixDB currently does not support creating
filters on a data set with an auto-generated UID field, unless there is
some magic that I am not aware of. This means I had to create a UUID for
each record in my data set before loading it into AsterixDB. This was done
in Java 7 with UUID.randomUUID().
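For reference, the UUID assignment was roughly along these lines (the
record-splicing helper below is a simplified illustration, not my exact
preprocessing code):

```java
import java.util.UUID;

public class UuidAssigner {

    // Illustration only: splice a "uid" field into one ADM record line
    // before the data set is loaded into AsterixDB.
    static String withUuid(String admRecord) {
        String uid = UUID.randomUUID().toString();
        // Insert the field right after the opening brace of the record.
        return admRecord.replaceFirst("\\{", "{ \"uid\": \"" + uid + "\", ");
    }

    public static void main(String[] args) {
        String rec = "{ \"building\": \"IT-Vest\", \"floor\": \"2. etasje\" }";
        System.out.println(withUuid(rec));
    }
}
```

Each record gets its own randomly generated UUID, which then serves as the
primary key field required by the filtered dataset.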
Any thoughts guys? Am I missing something?
BG,
Magnus
On Sun, May 29, 2016 at 8:32 AM, Sattam Alsubaiee <[email protected]>
wrote:
> Creating indexes on fields with high selectivities (such as hourOfDay
> and dayOfWeek) is not encouraged at all. Each secondary index lookup will
> have to probe the primary index to fetch the other fields in the record. It
> would be much more efficient to just perform scans, as opposed to
> accessing secondary indexes, when querying such fields.
>
> I would recommend that you drop at least the following indexes:
> drop index posdata.hour;
> drop index posdata.day;
>
> Also, I would highly recommend that you utilize AsterixDB filters, which
> are a very good optimization (they can save up to 99% of query time) when
> you deal with time-correlated fields such as timestamps:
> https://asterixdb.apache.org/docs/0.8.8-incubating/aql/filters.html
> http://dl.acm.org/citation.cfm?id=2786007
>
> Cheers,
> Sattam
>
> On Sun, May 29, 2016 at 8:58 AM, Michael Carey <[email protected]>
> wrote:
>
>> @Pouria: Please share your findings here when you check this out - this
>> is quite strange, since none of the other performance results that have
>> been obtained on the system have looked anything like this. (I will try to
>> look at this too at some point, but will unfortunately be MIA from June
>> 1-15 first.) Weird....
>>
>> On 5/26/16 9:20 AM, Pouria Pirzadeh wrote:
>>
>> Hi Magnus,
>>
>> Thanks for your email and sharing the information.
>> If it is OK with you, would you please share with us the exact DDL
>> (including type definitions, dataset and index definitions) and the exact
>> AQL queries that you ran against AsterixDB?
>> I am just interested in checking the query plans and seeing what ended up
>> being run as jobs.
>>
>> Thanks.
>> Pouria
>>
>> On Thu, May 26, 2016 at 4:59 AM, Magnus Kongshem <[email protected]
>> > wrote:
>>
>>> Hi,
>>>
>>> There have been a lot of questions from me regarding AsterixDB, and I
>>> thank all of you who have answered me. So it is time for me to contribute
>>> with some observations. I am writing my master's thesis, in which I test
>>> multiple databases on a large data set. I should also mention that I have
>>> installed AsterixDB on a single machine.
>>>
>>> What I have observed is that AsterixDB has poorer read performance
>>> when I specify indexes on the data set compared to not creating any
>>> indexes. See the attachment for details; it's an excerpt of my thesis
>>> explaining and describing the queries, the indexes and the test results.
>>> Any thoughts on these test results?
>>>
>>> I also cannot help but notice that the read performance for queries
>>> touching a small, medium and large portion of the data set is very
>>> similar. The largest query finds 75 million records and the smallest
>>> query finds 3.5 million records, yet they have almost the same read
>>> performance. How can this be?
>>>
>>> Perhaps you can use these test results in the future development of
>>> AsterixDB.
>>>
>>> If you would like, I can send you my final thesis when it's done.
>>>
>>> --
>>>
>>> Best regards,
>>>
>>> Magnus Alderslyst Kongshem
>>> +47 415 65 906
>>>
>>
>>
>>
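The tradeoff Sattam describes above (each secondary-index match paying a
random primary-index probe, while a scan reads everything sequentially once)
can be sketched with a toy cost model. The unit costs below are assumptions
for illustration, not measurements of AsterixDB:

```java
public class IndexCostSketch {

    // Toy cost model, not AsterixDB's actual optimizer costing:
    // a full scan reads every record sequentially once.
    static double scanCost(long records, double seqReadCost) {
        return records * seqReadCost;
    }

    // A secondary-index plan pays one random primary-index probe
    // per matching record.
    static double secondaryIndexCost(long records, double selectivity,
                                     double randomProbeCost) {
        return records * selectivity * randomProbeCost;
    }

    public static void main(String[] args) {
        long records = 100_000_000L;
        double seqRead = 1.0;    // assumed cheap sequential read
        double probe = 20.0;     // assumed expensive random probe
        double sel = 5.0 / 24.0; // hourOfDay in (9, 15) matches ~5/24

        // Low-cardinality predicate: the scan is cheaper.
        System.out.printf("scan: %.0f  index: %.0f%n",
                scanCost(records, seqRead),
                secondaryIndexCost(records, sel, probe));
        // A selective enough predicate flips the comparison.
        System.out.printf("scan: %.0f  index: %.0f%n",
                scanCost(records, seqRead),
                secondaryIndexCost(records, 0.001, probe));
    }
}
```

With hourOfDay-style predicates that match around a fifth of the data, the
per-match probes cost more than simply scanning everything, which matches
the advice to drop those indexes.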
use dataverse bigd;
create type table2 as open {
uid: string,
campus: string,
building: string,
floor: string,
timestamp: int32,
dayOfWeek: int32,
hourOfDay: int32,
latitude: double,
salt_timestamp: int32,
longitude: double,
id: string,
accuracy: double
};
create dataset posdata3(table2)
primary key uid with filter on timestamp;
create index id on posdata3(id);
create index building on posdata3(building);
create index floor on posdata3(floor);
load dataset posdata3 using localfs
(("path"="localhost:///path/to/dataset/all.adm"),("format"="adm"));
Full table scan)
count(for $obj in dataset posdata3 return $obj);
Duration: 348 seconds
-----------------------------------------
distribute result [%0->$$5]
-- DISTRIBUTE_RESULT |UNPARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |UNPARTITIONED|
aggregate [$$5] <- [function-call: asterix:agg-sum, Args:[%0->$$7]]
-- AGGREGATE |UNPARTITIONED|
exchange
-- RANDOM_MERGE_EXCHANGE |PARTITIONED|
aggregate [$$7] <- [function-call: asterix:agg-count, Args:[%0->$$0]]
-- AGGREGATE |PARTITIONED|
project ([$$0])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
data-scan []<-[$$6, $$0] <- bigd:posdata3
-- DATASOURCE_SCAN |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
empty-tuple-source
-- EMPTY_TUPLE_SOURCE |PARTITIONED|
Query number 2)
count(for $obj in dataset posdata3
where $obj.timestamp >= 1412121600 and $obj.timestamp <= 1414800000 and
$obj.hourOfDay > 9 and $obj.hourOfDay < 15
return $obj);
------------------------------------
distribute result [%0->$$14]
-- DISTRIBUTE_RESULT |UNPARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |UNPARTITIONED|
aggregate [$$14] <- [function-call: asterix:agg-sum, Args:[%0->$$18]]
-- AGGREGATE |UNPARTITIONED|
exchange
-- RANDOM_MERGE_EXCHANGE |PARTITIONED|
aggregate [$$18] <- [function-call: asterix:agg-count, Args:[%0->$$0]]
-- AGGREGATE |PARTITIONED|
project ([$$0])
-- STREAM_PROJECT |PARTITIONED|
select (function-call: algebricks:and, Args:[function-call:
algebricks:gt, Args:[%0->$$15, AInt64: {9}], function-call: algebricks:lt,
Args:[%0->$$15, AInt64: {15}], function-call: algebricks:le, Args:[%0->$$16,
AInt64: {1414800000}], function-call: algebricks:ge, Args:[%0->$$16, AInt64:
{1412121600}]])
-- STREAM_SELECT |PARTITIONED|
assign [$$16, $$15] <- [function-call:
asterix:field-access-by-index, Args:[%0->$$0, AInt32: {4}], function-call:
asterix:field-access-by-index, Args:[%0->$$0, AInt32: {6}]]
-- ASSIGN |PARTITIONED|
project ([$$0])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
data-scan []<-[$$17, $$0] <- bigd:posdata3
-- DATASOURCE_SCAN |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
assign [$$19, $$20] <- [AInt64: {1414800000}, AInt64:
{1412121600}]
-- ASSIGN |PARTITIONED|
empty-tuple-source
-- EMPTY_TUPLE_SOURCE |PARTITIONED|
Query number 3)
count(for $obj in dataset posdata3
where $obj.timestamp >= 1412121600 and $obj.timestamp <= 1414800000 and
$obj.hourOfDay > 9 and $obj.hourOfDay < 15 and ($obj.dayOfWeek = 1 or
$obj.dayOfWeek = 3)
return $obj);
------------------------------------
distribute result [%0->$$19]
-- DISTRIBUTE_RESULT |UNPARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |UNPARTITIONED|
aggregate [$$19] <- [function-call: asterix:agg-sum, Args:[%0->$$25]]
-- AGGREGATE |UNPARTITIONED|
exchange
-- RANDOM_MERGE_EXCHANGE |PARTITIONED|
aggregate [$$25] <- [function-call: asterix:agg-count, Args:[%0->$$0]]
-- AGGREGATE |PARTITIONED|
project ([$$0])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
join (function-call: algebricks:eq, Args:[%0->$$24, %0->$$20])
-- HYBRID_HASH_JOIN [$$24][$$20] |PARTITIONED|
exchange
-- HASH_PARTITION_EXCHANGE [$$24] |PARTITIONED|
unnest $$24 <- function-call: asterix:scan-collection,
Args:[AOrderedList: [ AInt64: {1}, AInt64: {3} ]]
-- UNNEST |UNPARTITIONED|
empty-tuple-source
-- EMPTY_TUPLE_SOURCE |UNPARTITIONED|
exchange
-- HASH_PARTITION_EXCHANGE [$$20] |PARTITIONED|
project ([$$0, $$20])
-- STREAM_PROJECT |PARTITIONED|
select (function-call: algebricks:and, Args:[function-call:
algebricks:lt, Args:[%0->$$21, AInt64: {15}], function-call: algebricks:gt,
Args:[%0->$$21, AInt64: {9}], function-call: algebricks:ge, Args:[%0->$$22,
AInt64: {1412121600}], function-call: algebricks:le, Args:[%0->$$22, AInt64:
{1414800000}]])
-- STREAM_SELECT |PARTITIONED|
assign [$$22, $$21, $$20] <- [function-call:
asterix:field-access-by-index, Args:[%0->$$0, AInt32: {4}], function-call:
asterix:field-access-by-index, Args:[%0->$$0, AInt32: {6}], function-call:
asterix:field-access-by-index, Args:[%0->$$0, AInt32: {5}]]
-- ASSIGN |PARTITIONED|
project ([$$0])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
data-scan []<-[$$23, $$0] <- bigd:posdata3
-- DATASOURCE_SCAN |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
assign [$$26, $$27] <- [AInt64: {1412121600},
AInt64: {1414800000}]
-- ASSIGN |PARTITIONED|
empty-tuple-source
-- EMPTY_TUPLE_SOURCE |PARTITIONED|
Query number 4)
count(for $obj in dataset posdata3
where $obj.timestamp >= 1412121600 and $obj.timestamp <= 1414800000 and
$obj.hourOfDay > 9 and $obj.hourOfDay < 15 and $obj.building = "IT-Vest"
return $obj);
------------------------------------
distribute result [%0->$$16]
-- DISTRIBUTE_RESULT |UNPARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |UNPARTITIONED|
aggregate [$$16] <- [function-call: asterix:agg-sum, Args:[%0->$$21]]
-- AGGREGATE |UNPARTITIONED|
exchange
-- RANDOM_MERGE_EXCHANGE |PARTITIONED|
aggregate [$$21] <- [function-call: asterix:agg-count, Args:[%0->$$0]]
-- AGGREGATE |PARTITIONED|
project ([$$0])
-- STREAM_PROJECT |PARTITIONED|
select (function-call: algebricks:and, Args:[function-call:
algebricks:eq, Args:[function-call: asterix:field-access-by-index,
Args:[%0->$$0, AInt32: {2}], AString: {IT-Vest}], function-call: algebricks:lt,
Args:[%0->$$17, AInt64: {15}], function-call: algebricks:gt, Args:[%0->$$17,
AInt64: {9}], function-call: algebricks:le, Args:[%0->$$18, AInt64:
{1414800000}], function-call: algebricks:ge, Args:[%0->$$18, AInt64:
{1412121600}]])
-- STREAM_SELECT |PARTITIONED|
assign [$$18, $$17] <- [function-call:
asterix:field-access-by-index, Args:[%0->$$0, AInt32: {4}], function-call:
asterix:field-access-by-index, Args:[%0->$$0, AInt32: {6}]]
-- ASSIGN |PARTITIONED|
project ([$$0])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
unnest-map [$$19, $$0] <- function-call:
asterix:index-search, Args:[AString: {posdata3}, AInt32: {0}, AString: {bigd},
AString: {posdata3}, ABoolean: {false}, ABoolean: {false}, ABoolean: {false},
AInt32: {1}, %0->$$25, AInt32: {1}, %0->$$25, TRUE, TRUE, TRUE]
-- BTREE_SEARCH |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
assign [$$26, $$27] <- [AInt64: {1414800000}, AInt64:
{1412121600}]
-- ASSIGN |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
order (ASC, %0->$$25)
-- STABLE_SORT [$$25(ASC)] |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
project ([$$25])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
unnest-map [$$24, $$25] <- function-call:
asterix:index-search, Args:[AString: {building}, AInt32: {0}, AString: {bigd},
AString: {posdata3}, ABoolean: {false}, ABoolean: {false}, ABoolean: {false},
AInt32: {1}, %0->$$22, AInt32: {1}, %0->$$23, TRUE, TRUE, TRUE]
-- BTREE_SEARCH |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
assign [$$28, $$29, $$22, $$23] <-
[AInt64: {1414800000}, AInt64: {1412121600}, AString: {IT-Vest}, AString:
{IT-Vest}]
-- ASSIGN |PARTITIONED|
empty-tuple-source
-- EMPTY_TUPLE_SOURCE |PARTITIONED|
Query number 5)
count(for $obj in dataset posdata3
where $obj.timestamp >= 1412121600 and $obj.timestamp <= 1414800000 and
$obj.accuracy < 10
return $obj);
------------------------------------
distribute result [%0->$$12]
-- DISTRIBUTE_RESULT |UNPARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |UNPARTITIONED|
aggregate [$$12] <- [function-call: asterix:agg-sum, Args:[%0->$$16]]
-- AGGREGATE |UNPARTITIONED|
exchange
-- RANDOM_MERGE_EXCHANGE |PARTITIONED|
aggregate [$$16] <- [function-call: asterix:agg-count, Args:[%0->$$0]]
-- AGGREGATE |PARTITIONED|
project ([$$0])
-- STREAM_PROJECT |PARTITIONED|
select (function-call: algebricks:and, Args:[function-call:
algebricks:lt, Args:[function-call: asterix:field-access-by-index,
Args:[%0->$$0, AInt32: {11}], AInt64: {10}], function-call: algebricks:le,
Args:[%0->$$13, AInt64: {1414800000}], function-call: algebricks:ge,
Args:[%0->$$13, AInt64: {1412121600}]])
-- STREAM_SELECT |PARTITIONED|
assign [$$13] <- [function-call: asterix:field-access-by-index,
Args:[%0->$$0, AInt32: {4}]]
-- ASSIGN |PARTITIONED|
project ([$$0])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
data-scan []<-[$$14, $$0] <- bigd:posdata3
-- DATASOURCE_SCAN |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
assign [$$17, $$18] <- [AInt64: {1414800000}, AInt64:
{1412121600}]
-- ASSIGN |PARTITIONED|
empty-tuple-source
-- EMPTY_TUPLE_SOURCE |PARTITIONED|
Query number 7)
count(for $obj in dataset posdata3
where $obj.timestamp >= 1412121600 and $obj.timestamp <= 1414800000 and
$obj.hourOfDay > 9 and $obj.hourOfDay < 15 and ($obj.dayOfWeek = 1 or
$obj.dayOfWeek = 3) and $obj.building = "Realfagbygget" and $obj.floor = "4.
etasje"
return $obj);
------------------------------------
distribute result [%0->$$23]
-- DISTRIBUTE_RESULT |UNPARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |UNPARTITIONED|
aggregate [$$23] <- [function-call: asterix:agg-sum, Args:[%0->$$31]]
-- AGGREGATE |UNPARTITIONED|
exchange
-- RANDOM_MERGE_EXCHANGE |PARTITIONED|
aggregate [$$31] <- [function-call: asterix:agg-count, Args:[%0->$$0]]
-- AGGREGATE |PARTITIONED|
project ([$$0])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
join (function-call: algebricks:eq, Args:[%0->$$28, %0->$$24])
-- HYBRID_HASH_JOIN [$$28][$$24] |PARTITIONED|
exchange
-- HASH_PARTITION_EXCHANGE [$$28] |PARTITIONED|
unnest $$28 <- function-call: asterix:scan-collection,
Args:[AOrderedList: [ AInt64: {1}, AInt64: {3} ]]
-- UNNEST |UNPARTITIONED|
empty-tuple-source
-- EMPTY_TUPLE_SOURCE |UNPARTITIONED|
exchange
-- HASH_PARTITION_EXCHANGE [$$24] |PARTITIONED|
project ([$$0, $$24])
-- STREAM_PROJECT |PARTITIONED|
select (function-call: algebricks:and, Args:[function-call:
algebricks:eq, Args:[function-call: asterix:field-access-by-index,
Args:[%0->$$0, AInt32: {2}], AString: {Realfagbygget}], function-call:
algebricks:eq, Args:[function-call: asterix:field-access-by-index,
Args:[%0->$$0, AInt32: {3}], AString: {4. etasje}], function-call:
algebricks:lt, Args:[%0->$$25, AInt64: {15}], function-call: algebricks:gt,
Args:[%0->$$25, AInt64: {9}], function-call: algebricks:ge, Args:[%0->$$26,
AInt64: {1412121600}], function-call: algebricks:le, Args:[%0->$$26, AInt64:
{1414800000}]])
-- STREAM_SELECT |PARTITIONED|
assign [$$26, $$25, $$24] <- [function-call:
asterix:field-access-by-index, Args:[%0->$$0, AInt32: {4}], function-call:
asterix:field-access-by-index, Args:[%0->$$0, AInt32: {6}], function-call:
asterix:field-access-by-index, Args:[%0->$$0, AInt32: {5}]]
-- ASSIGN |PARTITIONED|
project ([$$0])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
unnest-map [$$27, $$0] <- function-call:
asterix:index-search, Args:[AString: {posdata3}, AInt32: {0}, AString: {bigd},
AString: {posdata3}, ABoolean: {false}, ABoolean: {false}, ABoolean: {false},
AInt32: {1}, %0->$$35, AInt32: {1}, %0->$$35, TRUE, TRUE, TRUE]
-- BTREE_SEARCH |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
assign [$$36, $$37] <- [AInt64: {1412121600},
AInt64: {1414800000}]
-- ASSIGN |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
order (ASC, %0->$$35)
-- STABLE_SORT [$$35(ASC)] |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
project ([$$35])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
unnest-map [$$34, $$35] <-
function-call: asterix:index-search, Args:[AString: {building}, AInt32: {0},
AString: {bigd}, AString: {posdata3}, ABoolean: {false}, ABoolean: {false},
ABoolean: {false}, AInt32: {1}, %0->$$32, AInt32: {1}, %0->$$33, TRUE, TRUE,
TRUE]
-- BTREE_SEARCH |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE
|PARTITIONED|
assign [$$38, $$39, $$32, $$33]
<- [AInt64: {1412121600}, AInt64: {1414800000}, AString: {Realfagbygget},
AString: {Realfagbygget}]
-- ASSIGN |PARTITIONED|
empty-tuple-source
-- EMPTY_TUPLE_SOURCE
|PARTITIONED|
Query number 8)
count(for $obj in dataset posdata3
where $obj.timestamp >= 1412121600 and $obj.timestamp <= 1414800000 and
$obj.building = "Realfagbygget"
return $obj);
------------------------------------
distribute result [%0->$$12]
-- DISTRIBUTE_RESULT |UNPARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |UNPARTITIONED|
aggregate [$$12] <- [function-call: asterix:agg-sum, Args:[%0->$$16]]
-- AGGREGATE |UNPARTITIONED|
exchange
-- RANDOM_MERGE_EXCHANGE |PARTITIONED|
aggregate [$$16] <- [function-call: asterix:agg-count, Args:[%0->$$0]]
-- AGGREGATE |PARTITIONED|
project ([$$0])
-- STREAM_PROJECT |PARTITIONED|
select (function-call: algebricks:and, Args:[function-call:
algebricks:eq, Args:[function-call: asterix:field-access-by-index,
Args:[%0->$$0, AInt32: {2}], AString: {Realfagbygget}], function-call:
algebricks:le, Args:[%0->$$13, AInt64: {1414800000}], function-call:
algebricks:ge, Args:[%0->$$13, AInt64: {1412121600}]])
-- STREAM_SELECT |PARTITIONED|
assign [$$13] <- [function-call: asterix:field-access-by-index,
Args:[%0->$$0, AInt32: {4}]]
-- ASSIGN |PARTITIONED|
project ([$$0])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
unnest-map [$$14, $$0] <- function-call:
asterix:index-search, Args:[AString: {posdata3}, AInt32: {0}, AString: {bigd},
AString: {posdata3}, ABoolean: {false}, ABoolean: {false}, ABoolean: {false},
AInt32: {1}, %0->$$20, AInt32: {1}, %0->$$20, TRUE, TRUE, TRUE]
-- BTREE_SEARCH |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
assign [$$21, $$22] <- [AInt64: {1414800000}, AInt64:
{1412121600}]
-- ASSIGN |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
order (ASC, %0->$$20)
-- STABLE_SORT [$$20(ASC)] |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
project ([$$20])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
unnest-map [$$19, $$20] <- function-call:
asterix:index-search, Args:[AString: {building}, AInt32: {0}, AString: {bigd},
AString: {posdata3}, ABoolean: {false}, ABoolean: {false}, ABoolean: {false},
AInt32: {1}, %0->$$17, AInt32: {1}, %0->$$18, TRUE, TRUE, TRUE]
-- BTREE_SEARCH |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
assign [$$23, $$24, $$17, $$18] <-
[AInt64: {1414800000}, AInt64: {1412121600}, AString: {Realfagbygget}, AString:
{Realfagbygget}]
-- ASSIGN |PARTITIONED|
empty-tuple-source
-- EMPTY_TUPLE_SOURCE |PARTITIONED|
Query number 1)
count(for $obj in dataset posdata3
where $obj.timestamp >= 1412121600 and $obj.timestamp <= 1414800000
return $obj);
------------------------------------
distribute result [%0->$$10]
-- DISTRIBUTE_RESULT |UNPARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |UNPARTITIONED|
aggregate [$$10] <- [function-call: asterix:agg-sum, Args:[%0->$$13]]
-- AGGREGATE |UNPARTITIONED|
exchange
-- RANDOM_MERGE_EXCHANGE |PARTITIONED|
aggregate [$$13] <- [function-call: asterix:agg-count, Args:[%0->$$0]]
-- AGGREGATE |PARTITIONED|
project ([$$0])
-- STREAM_PROJECT |PARTITIONED|
select (function-call: algebricks:and, Args:[function-call:
algebricks:ge, Args:[%0->$$11, AInt64: {1412121600}], function-call:
algebricks:le, Args:[%0->$$11, AInt64: {1414800000}]])
-- STREAM_SELECT |PARTITIONED|
assign [$$11] <- [function-call: asterix:field-access-by-index,
Args:[%0->$$0, AInt32: {4}]]
-- ASSIGN |PARTITIONED|
project ([$$0])
-- STREAM_PROJECT |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
data-scan []<-[$$12, $$0] <- bigd:posdata3
-- DATASOURCE_SCAN |PARTITIONED|
exchange
-- ONE_TO_ONE_EXCHANGE |PARTITIONED|
assign [$$14, $$15] <- [AInt64: {1412121600}, AInt64:
{1414800000}]
-- ASSIGN |PARTITIONED|
empty-tuple-source
-- EMPTY_TUPLE_SOURCE |PARTITIONED|