[jira] [Commented] (PHOENIX-7106) Data Integrity issues due to invalid rowkeys returned by various coprocessors

ASF GitHub Bot (Jira) Fri, 12 Jan 2024 00:59:05 -0800


    [ 
https://issues.apache.org/jira/browse/PHOENIX-7106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17805944#comment-17805944
 ]


ASF GitHub Bot commented on PHOENIX-7106:
-----------------------------------------

kadirozde commented on code in PR #1736:
URL: https://github.com/apache/phoenix/pull/1736#discussion_r1450082686


##########
phoenix-core-client/src/main/java/org/apache/phoenix/iterate/OffsetResultIterator.java:
##########
@@ -32,32 +34,49 @@
  */
 public class OffsetResultIterator extends DelegateResultIterator {
     private int rowCount;
-    private int offset;
+    private final int offset;
+    private Tuple lastScannedTuple;
     private long pageSizeMs = Long.MAX_VALUE;
+    private boolean isIncompatibleClient = false;
 
     public OffsetResultIterator(ResultIterator delegate, Integer offset) {
         super(delegate);
         this.offset = offset == null ? -1 : offset;
+        this.lastScannedTuple = null;
     }
 
-    public OffsetResultIterator(ResultIterator delegate, Integer offset, long pageSizeMs) {
+    public OffsetResultIterator(ResultIterator delegate, Integer offset, long pageSizeMs,
+                                boolean isIncompatibleClient) {
         this(delegate, offset);
         this.pageSizeMs = pageSizeMs;
+        this.isIncompatibleClient = isIncompatibleClient;
     }
+
     @Override
     public Tuple next() throws SQLException {
+        long startTime = EnvironmentEdgeManager.currentTimeMillis();
         while (rowCount < offset) {
             Tuple tuple = super.next();
-            if (tuple == null) { return null; }
+            if (tuple == null) {
+                return null;
+            }
             if (tuple.size() == 0 || isDummy(tuple)) {
                 // while rowCount < offset absorb the dummy and call next on the underlying scanner
                 continue;
             }
             rowCount++;
-            // no page timeout check at this level because we cannot correctly resume
-            // scans for OFFSET queries until the offset is reached
+            lastScannedTuple = tuple;
+            if (!isIncompatibleClient) {
+                if (EnvironmentEdgeManager.currentTimeMillis() - startTime >= pageSizeMs) {

Review Comment:
   Since we cannot handle paging before skipping offset number of rows, we 
should not time out here either. We should also lock the region to prevent 
region movement.
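
A minimal sketch of the skip loop under discussion, with no page-timeout check until the offset has been reached. This is a simplified model, not Phoenix code: `OffsetSkipSketch`, `OffsetIterator`, the `DUMMY` sentinel, and the use of plain strings in place of `Tuple` are all assumptions made for illustration.

```java
import java.util.Iterator;
import java.util.List;

class OffsetSkipSketch {
    // Hypothetical stand-in for a dummy tuple returned by a paged scanner.
    static final String DUMMY = "";

    static final class OffsetIterator {
        private final Iterator<String> delegate;
        private final int offset;
        private int rowCount;
        private String lastSkipped;

        OffsetIterator(Iterator<String> delegate, int offset) {
            this.delegate = delegate;
            this.offset = offset;
        }

        // Returns the next row after the offset, or null when the delegate is exhausted.
        String next() {
            // No timeout check inside this loop: the scan cannot be resumed
            // correctly until `offset` real rows have been skipped.
            while (rowCount < offset) {
                if (!delegate.hasNext()) {
                    return null;
                }
                String row = delegate.next();
                if (row.equals(DUMMY)) {
                    continue; // absorb the dummy; it does not count toward the offset
                }
                rowCount++;
                lastSkipped = row; // remember the last real row that was skipped
            }
            return delegate.hasNext() ? delegate.next() : null;
        }

        String lastSkipped() {
            return lastSkipped;
        }
    }

    public static void main(String[] args) {
        OffsetIterator it = new OffsetIterator(List.of("a", DUMMY, "b", "c").iterator(), 2);
        System.out.println(it.next());        // first row past the offset
        System.out.println(it.lastSkipped()); // last row consumed while skipping
    }
}
```

In this model the dummy row is consumed while skipping but never increments the count, which mirrors the `continue` branch in the diff above.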





> Data Integrity issues due to invalid rowkeys returned by various coprocessors
> -----------------------------------------------------------------------------
>
>                 Key: PHOENIX-7106
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7106
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 5.2.0, 5.1.4
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Blocker
>             Fix For: 5.2.0, 5.1.4
>
>
> The HBase scanner interface expects the server to scan cells from the HFile 
> or block cache and return consistent data, i.e. the rowkeys of the returned 
> cells should stay within the scan boundaries. When a region moves and the 
> scanner needs a reset, or when the current row is too large and the server 
> returns a partial row, the subsequent scanner#next is supposed to return the 
> remaining cells. When this happens, the cell rowkeys returned by the server, 
> i.e. by any coprocessor, are expected to be within the scan boundaries so 
> that the server can reliably perform its validation and return the remaining 
> cells as expected.
> The Phoenix client initiates serial or parallel scans from the aggregators 
> based on the region boundaries, and the scan boundaries are sometimes 
> adjusted based on the key ranges provided by the optimizer, to include tenant 
> boundaries, salt boundaries etc. After the client opens the scanner and 
> performs the scan operation, some of the coprocs return an invalid rowkey in 
> the following cases:
>  # Grouped aggregate queries
>  # Some Ungrouped aggregate queries
>  # Offset queries
>  # Dummy cells returned with empty rowkey
>  # Update statistics queries
>  # Uncovered Index queries
>  # Ordered results at server side
>  # ORDER BY DESC on rowkey
>  # Global Index read-repair
>  # Paging region scanner with HBase scanner reopen
>  # ORDER BY on non-pk column(s) with/without paging
>  # GROUP BY on non-pk column(s) with/without paging
> Since many of these cases return reserved rowkeys, they are unlikely to match 
> the scan or region boundaries. This can cause data integrity issues in 
> certain scenarios, as explained above. An empty rowkey returned by the server 
> can be treated as the end of the region scan by the HBase client.
> With the paging feature enabled, a low page size makes it more likely that 
> scanners return a dummy cell, which increases the number of RPC calls made in 
> exchange for better latency and fewer timeouts. We should return only valid 
> rowkeys within the scan range in all cases where we perform the operations 
> listed above, such as complex aggregate or offset queries.
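
The boundary requirement described in the issue can be sketched as a range check. This is an illustrative sketch, not Phoenix or HBase code: `RowKeyRangeCheck` and `isInScanRange` are hypothetical names. It assumes the usual HBase conventions of an inclusive start row, an exclusive stop row, unsigned lexicographic byte ordering, and an empty boundary meaning "unbounded on that side". It also treats an empty rowkey as out of range, on the assumption that such a key is never a valid row.

```java
import java.util.Arrays;

class RowKeyRangeCheck {
    // Returns true if rowKey lies in [startRow, stopRow).
    // An empty startRow or stopRow is treated as unbounded on that side.
    static boolean isInScanRange(byte[] rowKey, byte[] startRow, byte[] stopRow) {
        if (rowKey.length == 0) {
            // Assumption: an empty rowkey (as in a dummy cell) is never a valid row.
            return false;
        }
        boolean atOrAfterStart = startRow.length == 0
                || Arrays.compareUnsigned(rowKey, startRow) >= 0;
        boolean beforeStop = stopRow.length == 0
                || Arrays.compareUnsigned(rowKey, stopRow) < 0;
        return atOrAfterStart && beforeStop;
    }

    public static void main(String[] args) {
        byte[] start = {10};
        byte[] stop = {20};
        System.out.println(isInScanRange(new byte[] {15}, start, stop));
        System.out.println(isInScanRange(new byte[0], start, stop));
    }
}
```

Under these assumptions, a coprocessor that returns a reserved or empty rowkey fails the check, which is the situation the issue asks to eliminate.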



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
