[ 
https://issues.apache.org/jira/browse/ASTERIXDB-3180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721823#comment-17721823
 ] 

ASF subversion and git services commented on ASTERIXDB-3180:
------------------------------------------------------------

Commit fa8a284f41ffdcdd8f2a6576f6333efc51fcbd4e in asterixdb's branch 
refs/heads/master from Wail Alkowaileet
[ https://gitbox.apache.org/repos/asf?p=asterixdb.git;h=fa8a284f41 ]

[ASTERIXDB-3180][COMP][RT] Apply filter before assembling columnar datasets

- user model changes: no
- storage format changes: no
- interface changes: yes

Details:
This patch implements an idea from Mike Carey: use the
columns as a "poor man's index". The condition expression
of SELECT is pushed down to the data-scan, and the
following is performed for each mega-leaf node:

1- Read all the columns involved in the SELECT condition expression.
2- Look for a tuple that satisfies the condition
  - If none exists, skip reading the rest of the columns
  - If at least one exists, read the rest of the columns
3- For each subsequent call to next() in the LSM cursor,
   check whether the returned tuple satisfies the condition
  - If yes, assemble and return the tuple
  - If no, skip and go to the next tuple and repeat
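
The steps above can be sketched roughly as follows. This is only an
illustration under simplifying assumptions, not the code from the patch:
MegaLeaf, readColumn, scan, and the int-array columns are hypothetical
stand-ins, a single filter column is assumed, and the scan returns a list
rather than going through an LSM cursor.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntPredicate;

public class FilterBeforeAssemblySketch {

    /** Hypothetical stand-in for a mega-leaf node: each column is an int array. */
    static final class MegaLeaf {
        private final Map<String, int[]> columns;
        int columnReads = 0; // counts simulated column reads (a proxy for I/O)

        MegaLeaf(Map<String, int[]> columns) {
            this.columns = columns;
        }

        int[] readColumn(String name) {
            columnReads++;
            return columns.get(name);
        }
    }

    /** Applies the condition to the filter column before assembling any tuple. */
    static List<Map<String, Integer>> scan(List<MegaLeaf> leaves, String filterColumn,
            IntPredicate condition, List<String> otherColumns) {
        List<Map<String, Integer>> result = new ArrayList<>();
        for (MegaLeaf leaf : leaves) {
            // Step 1: read only the column used by the condition.
            int[] filterValues = leaf.readColumn(filterColumn);

            // Step 2: look for at least one tuple that satisfies the condition.
            boolean anyMatch = false;
            for (int value : filterValues) {
                if (condition.test(value)) {
                    anyMatch = true;
                    break;
                }
            }
            if (!anyMatch) {
                continue; // none exists: skip reading the rest of the columns
            }
            Map<String, int[]> rest = new LinkedHashMap<>();
            for (String name : otherColumns) {
                rest.put(name, leaf.readColumn(name));
            }

            // Step 3: assemble only the tuples that satisfy the condition.
            for (int i = 0; i < filterValues.length; i++) {
                if (!condition.test(filterValues[i])) {
                    continue; // skip and go to the next tuple
                }
                Map<String, Integer> tuple = new LinkedHashMap<>();
                tuple.put(filterColumn, filterValues[i]);
                for (Map.Entry<String, int[]> e : rest.entrySet()) {
                    tuple.put(e.getKey(), e.getValue()[i]);
                }
                result.add(tuple);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        MegaLeaf a = new MegaLeaf(Map.of("age", new int[] {10, 20}, "id", new int[] {1, 2}));
        MegaLeaf b = new MegaLeaf(Map.of("age", new int[] {40, 15}, "id", new int[] {3, 4}));
        List<Map<String, Integer>> out = scan(List.of(a, b), "age", v -> v > 30, List.of("id"));
        System.out.println(out + " reads: " + a.columnReads + ", " + b.columnReads);
    }
}
```

In this toy run, leaf a has no tuple with age > 30, so only its filter
column is read; leaf b has one match, so its remaining column is read and
a single tuple is assembled.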

Change-Id: Ia83b839633d83ac6e3ffb4340a1d144daa0b299d
Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/17510
Integration-Tests: Jenkins <[email protected]>
Tested-by: Jenkins <[email protected]>
Reviewed-by: Wail Alkowaileet <[email protected]>
Reviewed-by: Ali Alsuliman <[email protected]>


> Apply filter before assembling columnar datasets
> ------------------------------------------------
>
>                 Key: ASTERIXDB-3180
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-3180
>             Project: Apache AsterixDB
>          Issue Type: Improvement
>          Components: COMP - Compiler, RT - Runtime
>    Affects Versions: 0.9.9
>            Reporter: Wail Y. Alkowaileet
>            Assignee: Wail Y. Alkowaileet
>            Priority: Major
>             Fix For: 0.9.9
>
>
> The idea here is to examine the column(s) in the WHERE clause before record 
> assembly (Mike Carey refers to this approach as a "poor man's index"). The 
> sequence can be summarized as follows:
>  * We first read the filtering columns (i.e., the columns in the WHERE clause)
>  * If the filtering column(s)
>  ** satisfy the query predicate, we read the rest of the requested columns 
> and assemble the record
>  ** do not satisfy it, we simply fetch the next tuple
> This approach can reduce I/O (by skipping column reads where possible) and 
> also avoids assembling records that would be filtered out anyway, which is 
> wasted CPU work.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
