[ https://issues.apache.org/jira/browse/SPARK-12161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jiang Chen updated SPARK-12161:
-------------------------------
Description:
Right now the SparkSQL CacheManager can only match an incoming query to a cached
query when the two are exactly the same. That is to say, for the following query pattern:
- Query 1: CACHE TABLE first AS SELECT * FROM table A where A.id > 100 AND A.id < 200;
- Query 2: SELECT * FROM table A where A.id < 200 AND A.id > 100;
Query 2 cannot use the cached result of query 1, although the two queries are
equivalent if the order of the predicates is ignored.
Ideally, for all incoming queries, we'd like to ignore the order of predicates
when matching them against the cached queries.
was:
Right now the SparkSQL CacheManager can only match an incoming query to a cached
query when the two are exactly the same. That is to say, for the following query pattern:
- Query 1: CACHE TABLE first AS SELECT * FROM table A where A.id > 100 AND A.id < 200;
- Query 2: SELECT * FROM table A where A.id < 200 AND A.id > 100;
Query 2 cannot use the cached result of query 1, although the two queries are
equivalent if the order of the filters is ignored.
Ideally, for all incoming queries, we'd like to ignore the order of predicates
when matching them against the cached queries.
> SparkSQL Cache Matching Improvement
> -----------------------------------
>
> Key: SPARK-12161
> URL: https://issues.apache.org/jira/browse/SPARK-12161
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Reporter: Jiang Chen
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Right now the SparkSQL CacheManager can only match an incoming query to a
> cached query when the two are exactly the same. That is to say, for the
> following query pattern:
> - Query 1: CACHE TABLE first AS SELECT * FROM table A where A.id > 100 AND A.id < 200;
> - Query 2: SELECT * FROM table A where A.id < 200 AND A.id > 100;
> Query 2 cannot use the cached result of query 1, although the two queries
> are equivalent if the order of the predicates is ignored.
> Ideally, for all incoming queries, we'd like to ignore the order of
> predicates when matching them against the cached queries.
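
The idea of order-insensitive matching can be illustrated with a minimal sketch. This is not how Spark's CacheManager or Catalyst is implemented; the predicate tuple shape, the cache_key helper, and the dictionary cache are all hypothetical names chosen for illustration. It assumes the filter is a flat conjunction (AND) of simple comparisons, and models that conjunction as a set so that the order of the conjuncts does not affect the lookup key.

```python
from typing import FrozenSet, List, Tuple

# A predicate is modeled as (column, operator, literal). This shape is an
# assumption made for illustration, not Spark's internal representation.
Predicate = Tuple[str, str, int]
CacheKey = Tuple[str, FrozenSet[Predicate]]


def cache_key(table: str, conjuncts: List[Predicate]) -> CacheKey:
    """Build a key that ignores the order of AND-ed predicates."""
    # A frozenset compares equal regardless of insertion order and is hashable,
    # so it can be used directly as part of a dictionary key.
    return (table, frozenset(conjuncts))


# Hypothetical stand-in for a cache keyed by the normalized query.
cache = {}

# Query 1: ... WHERE A.id > 100 AND A.id < 200
query1 = cache_key("A", [("id", ">", 100), ("id", "<", 200)])
# Query 2: ... WHERE A.id < 200 AND A.id > 100
query2 = cache_key("A", [("id", "<", 200), ("id", ">", 100)])

cache[query1] = "cached result of query 1"
hit = cache.get(query2)  # found, because both keys are equal
```

A set-based key only covers the reordering case described in this issue; it does not recognize other equivalences, such as a predicate rewritten in a different but logically identical form.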