[
https://issues.apache.org/jira/browse/PHOENIX-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thomas D'Silva updated PHOENIX-3516:
------------------------------------
Attachment: PHOENIX-3516-addendum-v2.patch
Attaching final addendum patch.
> Performance Issues with queries that have compound filters and specify
> phoenix.query.force.rowkeyorder=true
> -----------------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-3516
> URL: https://issues.apache.org/jira/browse/PHOENIX-3516
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.9.0
> Reporter: Jan Fernando
> Assignee: Thomas D'Silva
> Fix For: 4.9.1, 4.10.0
>
> Attachments: PHOENIX-3516-addendum-v2.patch,
> PHOENIX-3516-addendum.patch, PHOENIX-3516.patch
>
>
> On all our connections we specify the phoenix.query.force.rowkeyorder=true
> property to force serial scans so that we only support queries that will
> scale horizontally with data size.
> In running performance tests, we found that queries with multiple AND'ed
> range filters were slow and not performing not as expected. We looked at the
> query plan and noticed that, in the slow query case, the query plan is doing
> a PARALLEL 1-WAY SCAN and doing a SERVER FILTER BY whereas the fast query is
> simply doing a SERIAL 1-WAY RANGE SCAN.
> We expect these queries to both have the same plan as we are specifying
> phoenix.query.force.rowkeyorder=true.
> You can repro as follows:
> 1. Use non-tenant specific connection to create the table:
> CREATE TABLE IF NOT EXISTS MY_MT_TABLE.TEST_TABLE (
> ORGANIZATION_ID CHAR(15) NOT NULL,
> PARENT_TYPE CHAR(3) NOT NULL,
> PARENT_ID CHAR(15) NOT NULL,
> CREATED_DATE DATE NOT NULL
> CONSTRAINT PK PRIMARY KEY
> (
> ORGANIZATION_ID,
> PARENT_TYPE,
> PARENT_ID,
> CREATED_DATE DESC
> )
> ) VERSIONS=1,MULTI_TENANT=true,REPLICATION_SCOPE=1;
> 2. Use non-tenant specific connection to execute index:
> CREATE INDEX IF NOT EXISTS MY_TEST_TABLE_INDEX
> ON MY_MT_TABLE.TEST_TABLE (PARENT_TYPE, CREATED_DATE, PARENT_ID);
> 3. Use a tenant-specific connection to create the View:
> CREATE VIEW IF NOT EXISTS MY_TEST_TABLE_VIEW AS SELECT * FROM
> MY_MT_TABLE.TEST_TABLE;
> 4. Run queries below with tenant-specific connection:
> Query with expected plan:
> EXPLAIN SELECT PARENT_ID
> FROM MY_TEST_TABLE_VIEW
> WHERE PARENT_TYPE='001'
> AND (CREATED_DATE > to_date('2012-10-21') AND CREATED_DATE <
> to_date('2016-10-31'))
> ORDER BY PARENT_TYPE,CREATED_DATE LIMIT 501;
> +------------------------------------------+
> | PLAN |
> +------------------------------------------+
> | CLIENT 1-CHUNK SERIAL 1-WAY RANGE SCAN OVER MY_MT_TABLE.MY_TEST_TABLE_INDEX
> ['00Dxx0000001gFA','001','2012-10-21 00:00:00.001'] -
> ['00Dxx0000001gFA','001',' |
> | SERVER FILTER BY FIRST KEY ONLY |
> | SERVER 501 ROW LIMIT |
> | CLIENT 501 ROW LIMIT |
> +------------------------------------------+
> Slow query with unexpected plan. Since the date range are overlapping we
> expected Phoenix to consolidate this into a the smallest matching range and
> do a range scan. It does seem to do the consolidation but then do a parallel
> and not a range scan.
> EXPLAIN SELECT PARENT_ID
> FROM MY_TEST_TABLE_VIEW
> WHERE PARENT_TYPE='001'
> AND (CREATED_DATE >= to_date('2011-01-01') AND CREATED_DATE <=
> to_date('2016-01-01'))
> AND (CREATED_DATE > to_date('2012-10-21') AND CREATED_DATE <
> to_date('2016-10-31'))
> ORDER BY PARENT_TYPE,CREATED_DATE LIMIT 501;
> +------------------------------------------+
> | PLAN |
> +------------------------------------------+
> | CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER
> MY_MT_TABLE.MY_TEST_TABLE_INDEX ['00Dxx0000001gFA','001','2012-10-21
> 00:00:00.001'] - ['00Dxx0000001gFA','001' |
> | SERVER FILTER BY FIRST KEY ONLY AND (true AND true) |
> | SERVER 501 ROW LIMIT |
> | CLIENT 501 ROW LIMIT |
> +------------------------------------------+
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)