Jiang Chen created SPARK-12161:
----------------------------------
Summary: SparkSQL Cache Match improvement
Key: SPARK-12161
URL: https://issues.apache.org/jira/browse/SPARK-12161
Project: Spark
Issue Type: Improvement
Components: SQL
Reporter: Jiang Chen
Right now the SparkSQL CacheManager can only match an incoming query against a
cached query when the two are exactly the same. For example, given the
following query pattern:
- Query 1: CACHE TABLE first AS SELECT * FROM table A where A.id > 100 AND A.id < 200;
- Query 2: SELECT * FROM table A where A.id < 200 AND A.id > 100;
Query 2 cannot use the cached result of Query 1, although the two queries are
equivalent once the order of the filter predicates is ignored.
Ideally, when matching an incoming query against the cached queries, we'd like
to ignore the order of the predicates.
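
To illustrate the idea, here is a minimal sketch in Python of an
order-insensitive match. It is not Spark's CacheManager and uses no Spark API:
ToyCacheManager, plan_key, and the (column, operator, literal) predicate tuples
are hypothetical names made up for this example. It assumes the filter is a
flat conjunction (AND) of simple predicates, and it compares them as a set
rather than as an ordered list.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# Hypothetical model: one AND-ed predicate is (column, operator, literal),
# e.g. ("A.id", ">", 100).
Predicate = Tuple[str, str, int]
# Hypothetical cache key: the scanned table plus the set of predicates.
PlanKey = Tuple[str, FrozenSet[Predicate]]


def plan_key(table: str, conjuncts: List[Predicate]) -> PlanKey:
    """Build a key that does not depend on the order of the AND-ed predicates."""
    return (table, frozenset(conjuncts))


class ToyCacheManager:
    """Toy stand-in for a cache lookup; an assumption, not Spark's CacheManager."""

    def __init__(self) -> None:
        self._cached: Dict[PlanKey, str] = {}

    def cache(self, name: str, table: str, conjuncts: List[Predicate]) -> None:
        # Register a cached result under its order-insensitive key.
        self._cached[plan_key(table, conjuncts)] = name

    def lookup(self, table: str, conjuncts: List[Predicate]) -> Optional[str]:
        # Return the cached name if an equivalent conjunction was cached.
        return self._cached.get(plan_key(table, conjuncts))


manager = ToyCacheManager()
# Query 1: A.id > 100 AND A.id < 200, cached as "first".
manager.cache("first", "A", [("A.id", ">", 100), ("A.id", "<", 200)])
# Query 2: same predicates in the opposite order.
hit = manager.lookup("A", [("A.id", "<", 200), ("A.id", ">", 100)])
# A query with a different bound should not match.
miss = manager.lookup("A", [("A.id", "<", 300), ("A.id", ">", 100)])
```

In this sketch Query 2 finds the entry cached by Query 1 because both reduce to
the same set of predicates. A real implementation would have to normalize the
logical plan's filter expressions instead of simple tuples.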
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)