[jira] [Comment Edited] (TAJO-1430) Implement Query Parsing Result Caching

Hyunsik Choi (JIRA) Fri, 20 Mar 2015 02:00:14 -0700

    [ 
https://issues.apache.org/jira/browse/TAJO-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371005#comment-14371005
 ]


Hyunsik Choi edited comment on TAJO-1430 at 3/20/15 8:59 AM:
-------------------------------------------------------------

The problem definition looks good, but the solution is naive for the 
following reasons:
 1) the same query with different parameter values must be parsed again every 
time (low cache hit ratio).
 2) every variant of the same query with different parameters is kept in the 
cache.
 3) there is no cache invalidation method.
A better solution may be an approach like PreparedStatement.
http://docs.oracle.com/javase/tutorial/jdbc/basics/prepared.html

This approach allows the DB system to prepare many things before execution. It 
also separates parameters, represented by '?', from the SQL statement, so we 
will have more opportunities to increase the cache hit ratio.
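
To illustrate the hit-ratio argument, here is a minimal sketch of a parse cache 
keyed by SQL text. The class ParsedStatementCache and its methods are 
hypothetical names for this example, not part of Tajo, and the parser is passed 
in as a plain function. When literals are embedded in the text, every parameter 
value produces a new key; with '?' placeholders, all executions share one entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical parse cache keyed by SQL text (illustration only, not Tajo's API). */
class ParsedStatementCache<P> {
    private final Map<String, P> parsed = new HashMap<>();
    private final Function<String, P> parser; // stand-in for the real SQL parser
    private int hits = 0;
    private int misses = 0;

    ParsedStatementCache(Function<String, P> parser) {
        this.parser = parser;
    }

    /** Returns the cached parse result, parsing only on the first request for this text. */
    P get(String sql) {
        P result = parsed.get(sql);
        if (result == null) {
            misses++;
            result = parser.apply(sql);
            parsed.put(sql, result);
        } else {
            hits++;
        }
        return result;
    }

    int hits() { return hits; }
    int misses() { return misses; }
    int size() { return parsed.size(); }
}
```

With this sketch, three lookups of "SELECT * FROM t WHERE id = ?" parse once and 
hit twice, whereas "... id = 1", "... id = 2" and "... id = 3" each miss and 
each occupy their own cache entry.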

In addition, these parsed statements should be kept per session, which would 
be better in terms of security and cache management.
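
One possible shape for per-session storage is sketched below, assuming a string 
session id and a fixed per-session capacity. SessionStatementCache and all of 
its method names are hypothetical, not Tajo classes. Each session gets its own 
bounded LRU map, so one session cannot read another's statements, and closing a 
session drops its entries in one step.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical per-session statement cache (illustration only, not Tajo's API). */
class SessionStatementCache {
    private final int maxPerSession; // assumed fixed capacity per session
    private final Map<String, Map<String, Object>> bySession = new HashMap<>();

    SessionStatementCache(int maxPerSession) {
        this.maxPerSession = maxPerSession;
    }

    /** Lazily creates an access-ordered map that evicts the least recently used entry. */
    private Map<String, Object> forSession(String sessionId) {
        return bySession.computeIfAbsent(sessionId, id ->
            new LinkedHashMap<String, Object>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, Object> eldest) {
                    return size() > maxPerSession;
                }
            });
    }

    void put(String sessionId, String sql, Object parsed) {
        forSession(sessionId).put(sql, parsed);
    }

    /** Returns null when the session or the statement is unknown. */
    Object get(String sessionId, String sql) {
        Map<String, Object> statements = bySession.get(sessionId);
        return statements == null ? null : statements.get(sql);
    }

    /** Explicit invalidation of a single statement. */
    void invalidate(String sessionId, String sql) {
        Map<String, Object> statements = bySession.get(sessionId);
        if (statements != null) {
            statements.remove(sql);
        }
    }

    /** Drops everything cached for a session, e.g. when it is closed. */
    void closeSession(String sessionId) {
        bySession.remove(sessionId);
    }
}
```

This sketch is not thread-safe; a real Master-side implementation would need 
synchronization or concurrent maps.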


was (Author: hyunsik):
The problem definition looks good, but the solution is naive because the patch 
keeps all SQL statements in the Master. The same query with different 
parameters must be parsed every time, and every variant is kept in the cache.

A better solution may be an approach like PreparedStatement.
http://docs.oracle.com/javase/tutorial/jdbc/basics/prepared.html

This approach allows the DB system to prepare many things before execution. It 
also separates parameters, represented by '?', from the SQL statement, so we 
will have more opportunities to increase the cache hit ratio.

In addition, these parsed statements should be kept per session, which would 
be better in terms of security and cache management.

> Implement Query Parsing Result Caching
> --------------------------------------
>
>                 Key: TAJO-1430
>                 URL: https://issues.apache.org/jira/browse/TAJO-1430
>             Project: Tajo
>          Issue Type: New Feature
>          Components: parser
>    Affects Versions: 0.10.0
>            Reporter: Dongjoon Hyun
>            Assignee: Dongjoon Hyun
>             Fix For: 0.10.1
>
>         Attachments: TAJO-1430.patch, long.sql, middle.sql, wide_table.sql
>
>
> There are wide tables with a great many columns. Moreover, BI tools generate 
> very complex queries that are several MB in size. Although Tajo executes those 
> queries in a few seconds, the total time the user experiences is slow.
> To become the fastest Hadoop DW, we need the following feature. 
> {code:sql}
> time tsql -f middle.sql > /dev/null
> real  0m19.058s
> user  0m2.148s
> sys   0m0.268s
> time tsql -f ~/tajo/middle.sql > /dev/null 
> real  0m18.496s
> user  0m2.119s
> sys   0m0.240s
> $ time ./tsql -f ~/tajo/long.sql > /dev/null 
> real  0m36.974s
> user  0m2.305s
> sys   0m0.272s
> $ time ./tsql -f ~/tajo/long.sql > /dev/null 
> real  0m4.103s
> user  0m2.237s
> sys   0m0.249s
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
