[ https://issues.apache.org/jira/browse/TAJO-838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163237#comment-14163237 ]
ASF GitHub Bot commented on TAJO-838:
-------------------------------------
GitHub user jihoonson opened a pull request:
https://github.com/apache/tajo/pull/192
TAJO-838: Improve query planner to utilize index
Hi guys.
This is ongoing work. Although critical problems remain for practical use,
I'd like to share the progress on this issue.
I have finally succeeded in utilizing the index for query processing.
To show the effectiveness of the index, I carried out a simple performance
test as follows.
* Environment: an in-house cluster consisting of one master and 32
workers.
* Data: TPC-DS store_sales table at scale factor 100 (41 GB).
* DDL for index creation: create index ss_item_sk_idx on store_sales
(ss_item_sk asc null first);
* Test query: select ss_item_sk from store_sales where ss_item_sk = 1;
(selectivity = 0.000045139%)
* Result
| | Without disk cache | With disk cache |
|--- | --- | ---|
| Without index | 23.917 | 19.154 |
| With index | 4.207 | 3.995 |
Although this query is extremely selective (it matches only a tiny fraction
of the rows), I think this result shows the potential benefit of the index.
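To put the table in relative terms, the short Python sketch below computes the ratio of the time without the index to the time with it, for both cache settings. The numbers are copied from the result table; the unit is not stated in this post, so only the ratios are meaningful here. The names `timings` and `speedup` are just for illustration.

```python
# Timings copied from the result table above; the unit is not stated in the post.
timings = {
    "without disk cache": {"without index": 23.917, "with index": 4.207},
    "with disk cache": {"without index": 19.154, "with index": 3.995},
}


def speedup(without_index, with_index):
    """Return how many times faster the indexed run is than the unindexed run."""
    return without_index / with_index


speedups = {
    cache: speedup(t["without index"], t["with index"])
    for cache, t in timings.items()
}

for cache, ratio in speedups.items():
    print(f"{cache}: {ratio:.2f}x faster with the index")
```

By this measure the index gives roughly a 5.7x improvement without the disk cache and roughly a 4.8x improvement with it.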
Here are some remaining issues.
* Selectivity estimation. In the current patch, the index is always used
whenever one exists. I'll improve this so that the index is used only when
it is beneficial.
* Support indexes for partitioned tables.
* Handle the case where the query predicate includes two or more columns.
* Refactor the code and fix potential bugs.
* Add more tests.
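As a rough illustration of the selectivity estimation item above, here is a minimal Python sketch of how a planner might decide whether to use an index for an equality predicate. This is not the code in this patch and not Tajo's API: `ColumnStats`, `estimate_equality_selectivity`, `should_use_index`, the uniform-distribution assumption, and the 1% threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ColumnStats:
    """Hypothetical per-column statistics; not Tajo's actual catalog API."""
    num_rows: int
    num_distinct: int


def estimate_equality_selectivity(stats: ColumnStats) -> float:
    """Estimate the fraction of rows matching `col = constant`.

    Assumes values are uniformly distributed over the distinct values,
    which is a simplification made for this sketch.
    """
    if stats.num_rows <= 0 or stats.num_distinct <= 0:
        # No usable statistics: assume every row matches, so a scan is chosen.
        return 1.0
    return 1.0 / stats.num_distinct


def should_use_index(stats: ColumnStats, threshold: float = 0.01) -> bool:
    """Use the index only when the predicate is expected to match few rows.

    The threshold is an arbitrary illustrative value, not a tuned constant.
    """
    return estimate_equality_selectivity(stats) <= threshold


# Made-up statistics, for illustration only.
many_distinct = ColumnStats(num_rows=1_000_000, num_distinct=100_000)
few_distinct = ColumnStats(num_rows=1_000_000, num_distinct=2)

print(should_use_index(many_distinct))  # index scan expected to pay off
print(should_use_index(few_distinct))   # full scan expected to be cheaper
```

A real implementation would presumably compare estimated I/O costs of the index scan and the full scan rather than use a fixed threshold, and would need histograms or similar statistics to handle skewed data.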
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jihoonson/tajo-2 TAJO-838
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/tajo/pull/192.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #192
----
commit 7c98709f0fcb06dfb675acae3d6489a6126f55b5
Author: jinossy <[email protected]>
Date: 2014-08-06T08:43:35Z
TAJO-995: HiveMetaStoreClient wrapper should retry the connection
commit 415d0867ae4a4543f47360294bead1fc7f41e292
Author: Jaehwa Jung <[email protected]>
Date: 2014-08-10T06:07:24Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo
commit 7a7b4fd26f61df89cacdb4fc41faf9c2abe456b2
Author: Jaehwa Jung <[email protected]>
Date: 2014-08-11T02:28:48Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo
commit 45f5ed3adba931f4706f26dda1d3c03240ee11d3
Author: Jaehwa Jung <[email protected]>
Date: 2014-08-11T05:40:25Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo
commit aa01e83859ef553ac4eb90c1678e3bc6be20c6c9
Author: Jaehwa Jung <[email protected]>
Date: 2014-08-18T09:56:24Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo
commit b33a94509c1a007b56785435c8e16640ffde91b7
Author: Jaehwa Jung <[email protected]>
Date: 2014-09-04T02:14:19Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo
commit 05c892448113db40daef54d2e06dad463dbae9c8
Author: Jaehwa Jung <[email protected]>
Date: 2014-09-11T02:33:54Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo
commit 42a6c4ebeebe36aad6f7dc5f92c83baee398c85e
Author: Jaehwa Jung <[email protected]>
Date: 2014-09-11T03:25:05Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo
commit 52a942136c549f197d6d1c3d1a13717e6f14a83f
Author: Jaehwa Jung <[email protected]>
Date: 2014-09-24T01:26:36Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo
commit 0e63abc71723c4c22f2c591ae60d892a9707973f
Author: Jaehwa Jung <[email protected]>
Date: 2014-09-24T01:48:09Z
TAJO-1062: Update TSQL documentation
commit e59fe460cc888bfebe968619405edd9b9e57e410
Author: Jaehwa Jung <[email protected]>
Date: 2014-09-27T07:50:08Z
Update tsql documentation.
commit a5402249a2f7df85aa01280ed359f1d1d3489281
Author: Jaehwa Jung <[email protected]>
Date: 2014-09-27T07:52:05Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into
TAJO-1062
commit 706be644389f1223f5ea19d1418c6b6aa8b9bc96
Author: Jaehwa Jung <[email protected]>
Date: 2014-10-01T03:31:51Z
Update some typos
commit 327a9c4edd5426521b86afac12dbab4640a7164a
Author: Jaehwa Jung <[email protected]>
Date: 2014-10-01T05:51:24Z
Rename back_command.rst
commit e1f2b6b437fdb842166f8cf7a8c8fbf4bce19041
Author: Jaehwa Jung <[email protected]>
Date: 2014-10-01T05:54:36Z
Use "Tajo" instead of "tajo"
commit b6c06138e756823daebf188174e63bead43c0c05
Author: Jaehwa Jung <[email protected]>
Date: 2014-10-01T06:25:43Z
Update some comments.
commit 4e203f03fb533e04577f5c2947823fedd8680b8a
Author: Hyunsik Choi <[email protected]>
Date: 2014-10-04T16:11:59Z
TAJO-1072: CLI gets stuck when wrong host/port is provided. (Jihun Kang via
hyunsik)
Closes #169
commit 68b44da57e53f53e88bedc0bb7ac763c97f069a9
Author: Hyunsik Choi <[email protected]>
Date: 2014-10-04T16:36:25Z
TAJO-1065: The \admin -cluster argument doesn't run as expected. (Jongyoung
Park via hyunsik)
Closes #173
commit 029054b45c158159325a68ac1491256e3abe71f4
Author: Hyunsik Choi <[email protected]>
Date: 2014-10-05T00:56:12Z
TAJO-1030: Not supported JDBC APIs should return empty results instead of
Exception. (Hyoungjun Kim via hyunsik)
Closes #145
commit ecc2b05af60d9540c758839c1f5d691850ac772b
Author: Hyunsik Choi <[email protected]>
Date: 2014-10-05T01:04:27Z
TAJO-668: Add datetime function documentation. (Jongyoung Park via hyunsik)
Closes #160
commit a282fc1059c2489804fe08e02234a7d09dba2a10
Author: Jihoon Son <[email protected]>
Date: 2014-10-05T07:43:40Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into
TAJO-838
commit 28b4cbc036b05a109694e1dcbaeacd802d0c9f71
Author: Jaehwa Jung <[email protected]>
Date: 2014-10-05T13:15:27Z
Rename cli.rst to tsql.rst
commit 03847cf497779d18d323bf94a9e2f0d79dadcb96
Author: Jaehwa Jung <[email protected]>
Date: 2014-10-05T13:16:12Z
Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into
TAJO-1062
commit c86d7ade7cde7de60a79d44196eb8c401b9f2a68
Author: Jihoon Son <[email protected]>
Date: 2014-10-05T14:40:31Z
TAJO-838
commit ca187bcf68ce81b984ccb7c2e2b5adc25ebff237
Author: Jaehwa Jung <[email protected]>
Date: 2014-10-06T01:43:27Z
Updated Change Note.
commit 4f987d967aa3a68e2c17cd8472120ec4316e0fc0
Author: Jihoon Son <[email protected]>
Date: 2014-10-06T02:56:55Z
TAJO-838
commit ca5fb301bff4b38d80a523d5bece9eaf74f64ec3
Author: Jaehwa Jung <[email protected]>
Date: 2014-10-06T05:12:42Z
TAJO-1067: INSERT OVERWRITE INTO should not remove all partitions. (jaehwa)
commit 44e6fe595da28c6c06e5c82741478a8ff3031fa9
Author: Jaehwa Jung <[email protected]>
Date: 2014-10-06T07:28:45Z
TAJO-1096: Update download source documentation (Mai Hai Thanh via jaehwa)
Closes #182
commit 67541c48aaa577848023e2e7cedab727f33b8a52
Author: Jihoon Son <[email protected]>
Date: 2014-10-06T08:52:37Z
TAJO-838
commit 3d630f93be0c50f09abf62aa00e69c0be5dabe7e
Author: Jihoon Son <[email protected]>
Date: 2014-10-06T08:52:46Z
Merge branch 'master' of http://git-wip-us.apache.org/repos/asf/tajo into
TAJO-838
----
> Improve query planner to utilize index
> --------------------------------------
>
> Key: TAJO-838
> URL: https://issues.apache.org/jira/browse/TAJO-838
> Project: Tajo
> Issue Type: Sub-task
> Components: planner/optimizer
> Reporter: Jihoon Son
> Assignee: Jihoon Son
> Priority: Minor
>
> An index can improve query performance when the query is highly selective.
> Thus, the query planner should decide whether or not to use an index for a
> given query.
> The selectivity can be estimated using statistics.