[
https://issues.apache.org/jira/browse/PHOENIX-6318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kadir OZDEMIR reassigned PHOENIX-6318:
--------------------------------------
Assignee: Abhishek Singh Chouhan
> Phoenix client to set maxTimestamp on scans
> -------------------------------------------
>
> Key: PHOENIX-6318
> URL: https://issues.apache.org/jira/browse/PHOENIX-6318
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Kadir OZDEMIR
> Assignee: Abhishek Singh Chouhan
> Priority: Major
>
> On regular (non-SCN) connections, Phoenix clients do not set the time range
> for scans. This means that a region server will include all the mutations
> that have been applied to its table region by the time the scan is opened on
> the region server. This creates some consistency issues if (1) a single
> Phoenix query needs to be executed on multiple table regions, (2) a region
> scanner implemented by Phoenix, e.g., indexing or paging region scanners,
> closes or reopens the underlying HBase scanner, or (3) HBase itself needs to
> close and reopen the scanner due to its internal activities, e.g., region
> movement, split or merge.
> The consistency issue for the data tables is that the rows returned by the
> query would not accurately represent a point-in-time image of a table. The
> consistency issue for index tables can be even more severe as the results may
> include more than one index row (with different row keys) for the same data
> table row. In other words, the result set of a query on an index table may
> include stale index rows.
> A simple approach to address this issue is to let the Phoenix client set the
> max timestamp for scans and set the same timestamp for all scans generated
> for the same Phoenix query (instance). If the clock skew between clients and
> servers is not large, this approach will greatly improve the consistency for
> Phoenix queries.
> The side effect of this approach is that a scan can miss a recent mutation
> if (1) the clock skew between clients and servers is more than the time
> between the start of processing the mutation on a server and the start of
> the scan that reads it on a client, and (2) the client wall clock is behind
> the server clock. We assume that this side effect will rarely happen and
> that the benefit of improving the consistency of Phoenix queries will
> outweigh it.
> In the future, we can consider better approaches to set the scan max
> timestamp more accurately.
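
The approach and its side effect described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the Phoenix implementation: the class `ScanMaxTimestampSketch` and its nested `Cell` and `Region` types are hypothetical and use no HBase or Phoenix classes. The exclusive upper bound mirrors the half-open range `[minStamp, maxStamp)` of HBase's `Scan.setTimeRange`; how Phoenix would actually pick and propagate the timestamp is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model: no HBase or Phoenix classes are involved.
final class ScanMaxTimestampSketch {

    /** One cell version: row key, value, and the server-assigned write timestamp. */
    static final class Cell {
        final String rowKey;
        final String value;
        final long timestamp;

        Cell(String rowKey, String value, long timestamp) {
            this.rowKey = rowKey;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** A table region that keeps every cell version written to it. */
    static final class Region {
        private final List<Cell> cells = new ArrayList<>();

        void put(String rowKey, String value, long serverTimestamp) {
            cells.add(new Cell(rowKey, value, serverTimestamp));
        }

        /** Returns only cells with timestamp < maxTimestamp (upper bound exclusive). */
        List<Cell> scan(long maxTimestamp) {
            List<Cell> visible = new ArrayList<>();
            for (Cell cell : cells) {
                if (cell.timestamp < maxTimestamp) {
                    visible.add(cell);
                }
            }
            return visible;
        }
    }

    public static void main(String[] args) {
        Region region1 = new Region();
        Region region2 = new Region();
        region1.put("a", "v1", 100L);
        region2.put("b", "v1", 100L);

        // The client picks one max timestamp per query and reuses it for every scan.
        long queryMaxTimestamp = 150L;

        List<Cell> fromRegion1 = region1.scan(queryMaxTimestamp);
        // A mutation lands on region 2 after the query started but before its scan opens.
        region2.put("b", "v2", 200L);
        List<Cell> fromRegion2 = region2.scan(queryMaxTimestamp);
        List<Cell> unbounded = region2.scan(Long.MAX_VALUE);

        System.out.println("bounded scan, region 1: " + fromRegion1.size() + " cell(s)");
        System.out.println("bounded scan, region 2: " + fromRegion2.size() + " cell(s)");
        System.out.println("unbounded scan, region 2: " + unbounded.size() + " cell(s)");

        // Side effect: the server stamps a mutation at 1000, but the client clock
        // is behind and reads 990 when it starts the scan, so the mutation is missed.
        region1.put("c", "v1", 1000L);
        long laggingClientClock = 990L;
        System.out.println("scan with lagging client clock: "
                + region1.scan(laggingClientClock).size() + " cell(s)");
    }
}
```

In this model the bounded scan on region 2 ignores the late mutation, so both regions are read as of the same instant, while the unbounded scan picks it up. The last scan shows the trade-off: a client whose clock lags the server by more than the write-to-read gap does not see the newest write.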