[
https://issues.apache.org/jira/browse/PHOENIX-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17780557#comment-17780557
]
ASF GitHub Bot commented on PHOENIX-7032:
-----------------------------------------
tkhurana commented on code in PR #1701:
URL: https://github.com/apache/phoenix/pull/1701#discussion_r1375107341
##########
phoenix-core/src/test/java/org/apache/phoenix/compile/WhereCompilerTest.java:
##########
@@ -1026,4 +1029,81 @@ public void testScanCaching_CustomFetchSizeOnStatement()
throws SQLException {
assertEquals(FETCH_SIZE, pstmt.getFetchSize());
assertEquals(FETCH_SIZE, scan.getCaching());
}
+ private Expression getDNF(PhoenixConnection pconn, String query) throws
SQLException {
+ //SQLParser parser = new SQLParser("where ID = 'i1' or (ID = 'i2' and
A > 1)");
+ // ParseNode where = parser.parseWhereClause()
+ PhoenixPreparedStatement pstmt = newPreparedStatement(pconn, query);
+ QueryPlan plan = pstmt.compileQuery();
+ ParseNode where = plan.getStatement().getWhere();
+
+ return transformDNF(where, plan.getContext());
+ }
+ @Test
+ public void testWhereInclusion() throws SQLException {
+ PhoenixConnection pconn = DriverManager.getConnection(getUrl(),
+
PropertiesUtil.deepCopy(TEST_PROPERTIES)).unwrap(PhoenixConnection.class);
+ String ddl = "create table myTable(ID varchar primary key, A integer,
B varchar, " +
+ "C date, D double, E integer)";
+ pconn.createStatement().execute(ddl);
+ ddl = "create table myTableDesc(ID varchar primary key DESC, A
integer, B varchar, " +
+ "C date, D double, E integer)";
+ pconn.createStatement().execute(ddl);
+
+ final int NUM = 13;
+ String[] containingQueries = new String[NUM];
+ String[] containedQueries = new String[NUM];
+
+ containingQueries[0] = "select * from myTable where ID = 'i1' or (ID =
'i2' and A > 1)";
+ containedQueries[0] = "select * from myTableDesc where ID = 'i1' or
(ID = 'i2' and " +
+ "A > 2 + 2)";
+
+ containingQueries[1] = "select * from myTable where ID > 'i3' and A >
1";
+ containedQueries[1] = "select * from myTableDesc where (ID > 'i7' or
ID = 'i4') and " +
+ "A > 2 * 10";
+
+ containingQueries[2] = "select * from myTable where ID IN ('i3', 'i7',
'i1') and A < 10";
+ containedQueries[2] = "select * from myTableDesc where ID IN ('i1',
'i7') and A < 10 / 2";
+
+ containingQueries[3] = "select * from myTableDesc where (ID, B) >
('i3', 'a') and A >= 10";
+ containedQueries[3] = "select * from myTable where ID = 'i3' and B =
'c' and A = 10";
+
+ containingQueries[4] = "select * from myTable where ID >= 'i3' and A
between 5 and 15";
+ containedQueries[4] = "select * from myTableDesc where ID = 'i3' and A
between 5 and 10";
+
+ containingQueries[5] = "select * from myTable where (A between 5 and
15) and " +
+ "(D < 10.67 or C <= CURRENT_DATE())";
+ containedQueries[5] = "select * from myTable where (A = 5 and D
between 1.5 and 9.99) or " +
+ "(A = 6 and C <= CURRENT_DATE() - 1000)";
+
+ containingQueries[6] = "select * from myTable where A is not null";
+ containedQueries[6] = "select * from myTable where A > 0";
+
+ containingQueries[7] = "select * from myTable where NOT (B is null)";
+ containedQueries[7] = "select * from myTable where (B > 'abc')";
+
+ containingQueries[8] = "select * from myTable where A >= E and D <= A";
+ containedQueries[8] = "select * from myTable where (A > E and D = A)";
+
+ containingQueries[9] = "select * from myTable where A > E";
+ containedQueries[9] = "select * from myTable where (A > E and B is
not null)";
+
+ containingQueries[10] = "select * from myTable where B like '%abc'";
+ containedQueries[10] = "select * from myTable where (B like '%abc' and
ID > 'i1')";
+
+ containingQueries[11] = "select * from myTable where " +
+ "PHOENIX_ROW_TIMESTAMP() < CURRENT_TIME()";
+ containedQueries[11] = "select * from myTable where " +
+ "(PHOENIX_ROW_TIMESTAMP() < CURRENT_TIME() - 1)";
+
+ containingQueries[12] = "select * from myTable where (A, E) IN ((2,3),
(7,8), (10,11))";
+ containedQueries[12] = "select * from myTable where (A, E) IN ((2,3),
(7,8))";
Review Comment:
@kadirozde It is working but it is not completely rewriting the expression.
This is getting translated to `((A, E) = (2, 3) OR (A, E) = (7, 8) OR (A, E) =
(10, 11))` . The individual RVC's are not getting expanded.
> Partial Global Secondary Indexes
> --------------------------------
>
> Key: PHOENIX-7032
> URL: https://issues.apache.org/jira/browse/PHOENIX-7032
> Project: Phoenix
> Issue Type: New Feature
> Reporter: Kadir Ozdemir
> Assignee: Kadir Ozdemir
> Priority: Major
>
> The secondary indexes supported in Phoenix have been full indexes such that
> for every data table row there is an index row. Generating an index row for
> every data table row is not always required. For example, some use cases do
> not require index rows for the data table rows in which indexed column values
> are null. Such indexes are called sparse indexes. Partial indexes generalize
> the concept of sparse indexing and allow users to specify the subset of the
> data table rows for which index rows will be maintained. This subset is
> specified using a WHERE clause added to the CREATE INDEX DDL statement.
> Partial secondary indexes were first proposed by Michael Stonebraker
> [here|https://dsf.berkeley.edu/papers/ERL-M89-17.pdf]. Since then several SQL
> databases (e.g.,
> [Postgres|https://www.postgresql.org/docs/current/indexes-partial.html] and
> [SQLite|https://www.sqlite.org/partialindex.html]) and NoSQL databases
> (e.g., [MongoDB|https://www.mongodb.com/docs/manual/core/index-partial/])
> have supported some form of partial indexes. It is challenging to allow
> arbitrary WHERE clauses in DDL statements. For example, Postgres does not
> allow subqueries in these where clauses and SQLite supports much more
> restrictive where clauses.
> Supporting arbitrary where clauses creates challenges for query optimizers in
> deciding the usability of a partial index for a given query. If the set of
> data table rows that satisfy the query is a subset of the data table rows
> that the partial index points back, then the query can use the index. Thus,
> the query optimizer has to decide if the WHERE clause of the query implies
> the WHERE clause of the index.
> Michael Stonebraker [here|https://dsf.berkeley.edu/papers/ERL-M89-17.pdf]
> suggests that an index WHERE clause is a conjunct of simple terms, i.e:
> i-clause-1 and i-clause-2 and ... and i-clause-m where each clause is of the
> form <column> <operator> <constant>. Hence, the qualification can be
> evaluated for each tuple in the indicated relation without consulting
> additional tuples.
> Phoenix partial indexes will initially support a more general set of index
> WHERE clauses that can be evaluated on a single row with the following
> exceptions
> * Subqueries are not allowed.
> * Like expressions are allowed with very limited support such that an index
> WHERE clause with like expressions can imply/contain a query if the query has
> the same like expressions that the index WHERE clause has.
> * Comparison between columns are allowed without supporting transitivity,
> for example, a > b and b > c does not imply a > c.
> Partial indexes will be supported initially for global secondary indexes,
> i.e., covered global indexes and uncovered global indexes. The local
> secondary indexes will be supported in future.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)