James Taylor created PHOENIX-3739:
-------------------------------------
Summary: Turn pure point lookups into HBase Gets
Key: PHOENIX-3739
URL: https://issues.apache.org/jira/browse/PHOENIX-3739
Project: Phoenix
Issue Type: Improvement
Reporter: James Taylor
Priority: Minor
HBase provides a means of isolating resources based on read/write and further
based on the operation (Scan versus Get). To leverage this, Phoenix could turn
a pure point lookup scan into a series of Gets (or a MultiGet when that's
available).
Best to look at an example query to outline some potential issues:
{code}
SELECT * FROM MY.TABLE
WHERE ID IN ('001','123','002', '456') AND COL1 > 10
{code}
Phoenix turns this into a scan per region, pushing the IDs through our skip scan
filter, which seeks to each row. The {{COL1 > 10}} predicate becomes a filter as
well, which is ANDed with the skip scan filter.
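To make the comparison concrete, here is a toy sketch in plain Java. It is not Phoenix or HBase code: a {{TreeMap}} stands in for a sorted table mapping ID to COL1, and the class and method names ({{PointLookupModel}}, {{skipScan}}, {{gets}}) are hypothetical. It only illustrates that both approaches have to apply the residual {{COL1 > 10}} predicate to every candidate row.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only: a TreeMap stands in for a sorted table (ID -> COL1).
// All names here are hypothetical and not part of Phoenix or HBase.
final class PointLookupModel {
    private PointLookupModel() {}

    // Scan-style: visit the requested IDs in key order, seek to each one,
    // and apply the residual predicate (COL1 > minExclusive) to the row found.
    static List<String> skipScan(TreeMap<String, Integer> table,
                                 Collection<String> ids, int minExclusive) {
        List<String> hits = new ArrayList<>();
        for (String id : new TreeSet<>(ids)) {
            Map.Entry<String, Integer> row = table.ceilingEntry(id); // the "seek"
            if (row != null && row.getKey().equals(id) && row.getValue() > minExclusive) {
                hits.add(id);
            }
        }
        return hits;
    }

    // Get-style: one independent lookup per requested ID, in request order.
    // The same residual predicate has to be applied to every lookup.
    static List<String> gets(Map<String, Integer> table,
                             Collection<String> ids, int minExclusive) {
        List<String> hits = new ArrayList<>();
        for (String id : ids) {
            Integer col1 = table.get(id);
            if (col1 != null && col1 > minExclusive) {
                hits.add(id);
            }
        }
        return hits;
    }
}
```

In this toy model both methods return the same set of IDs. The scan-style walk yields them in key order, while the per-key lookups yield them in request order.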
Some potential issues to overcome are:
- Extra RPC calls. We've had use cases in which 250K keys are pushed through
the skip scan filter. We wouldn't want to turn this into 250K RPCs. Perhaps
there's some kind of multi/batch operation that could be leveraged for
currently supported HBase versions since it looks like MultiGet is targeted for
HBase 2.0.
- Extra payload per Get call. With the Scan approach, the extra filter for
{{COL1 > 10}} is passed once. With this approach, we'd need to pass this for
every Get operation.
- Code consistency. It's nice to have a single code path in Phoenix that's
consistent across all queries. Phoenix knows when a scan becomes a pure point
lookup, though, so this can be overcome - it just adds a little complexity.
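One way to bound the number of RPCs is to group the point-lookup keys into fixed-size batches and send each batch as a single request. The sketch below is plain Java with no HBase dependency; {{PointLookupBatcher}} and the batch size are hypothetical illustrations, not existing Phoenix code. Each resulting batch could then be handed to a batched client call such as {{Table.get(List<Get>)}}, assuming that call is suitable for the HBase versions being targeted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Phoenix or HBase: splits point-lookup
// row keys into fixed-size batches so each batch can go out as one request.
final class PointLookupBatcher {
    private PointLookupBatcher() {}

    static <T> List<List<T>> partition(List<T> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < keys.size(); start += batchSize) {
            // The last batch may be shorter than batchSize.
            int end = Math.min(start + batchSize, keys.size());
            batches.add(new ArrayList<>(keys.subList(start, end)));
        }
        return batches;
    }
}
```

With an assumed batch size of 1,000, the 250K keys mentioned above would become 250 batched requests rather than 250K individual ones. The residual filter would still need to accompany each request, so the batch size trades RPC count against payload per call.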
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)