[ 
https://issues.apache.org/jira/browse/PHOENIX-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17182190#comment-17182190
 ] 

Lars Hofhansl edited comment on PHOENIX-6090 at 8/22/20, 1:26 AM:
------------------------------------------------------------------

Here's what happens in this code:
{code}
        // Acquire the locks again before letting the region proceed with data table updates
        List<RowLock> rowLocks = Lists.newArrayListWithExpectedSize(context.rowLocks.size());
        for (RowLock rowLock : context.rowLocks) {
            rowLocks.add(lockManager.lockRow(rowLock.getRowKey(), rowLockWaitDuration));
        }
        context.rowLocks.clear();
        context.rowLocks = rowLocks;

{code}
Now assume we fail to acquire one of the locks. We throw an exception, so 
context.rowLocks is never cleared and still holds the old locks. 
postBatchMutateIndispensably then attempts to release those old locks a second 
time, which aborts the RegionServer. That one is pretty bad.
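
One way to avoid the double release, as a minimal sketch only: it is not necessarily what the attached patches do. RowLock, LockManager, Context and CountingLock below are hypothetical stand-ins for the types in the snippet above, not the HBase/Phoenix classes. The idea is to drop the references to the old locks before re-acquiring, and on failure to release whatever was newly acquired, so that a later cleanup hook finds nothing stale in the context.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class RelockSketch {
    /** Hypothetical stand-in for a row lock handle. */
    public interface RowLock {
        byte[] getRowKey();
        void release();
    }

    /** Hypothetical stand-in for the lock manager. */
    public interface LockManager {
        RowLock lockRow(byte[] rowKey, long waitMillis) throws IOException;
    }

    /** Hypothetical stand-in for the batch context. */
    public static class Context {
        public List<RowLock> rowLocks = new ArrayList<>();
    }

    /** Test double that counts how often it has been released. */
    public static class CountingLock implements RowLock {
        private final byte[] rowKey;
        public int releaseCount = 0;

        public CountingLock(String key) {
            this.rowKey = key.getBytes(StandardCharsets.UTF_8);
        }

        @Override public byte[] getRowKey() { return rowKey; }
        @Override public void release() { releaseCount++; }
    }

    /**
     * Re-acquires the row locks named in context.rowLocks. The old entries are
     * assumed to have been released already, so they are only used for their keys.
     */
    public static void reacquire(Context context, LockManager lockManager, long waitMillis)
            throws IOException {
        List<RowLock> oldLocks = context.rowLocks;
        // Forget the old locks first, so no later cleanup can release them again.
        context.rowLocks = new ArrayList<>(oldLocks.size());
        try {
            for (RowLock old : oldLocks) {
                context.rowLocks.add(lockManager.lockRow(old.getRowKey(), waitMillis));
            }
        } catch (IOException | RuntimeException e) {
            // Release only the locks acquired in this call, then leave the context empty.
            for (RowLock acquired : context.rowLocks) {
                acquired.release();
            }
            context.rowLocks.clear();
            throw e;
        }
    }

    public static void main(String[] args) {
        Context context = new Context();
        context.rowLocks.add(new CountingLock("row1"));
        context.rowLocks.add(new CountingLock("row2"));

        // Lock manager that fails for "row2", simulating a lock wait timeout.
        LockManager failing = (rowKey, waitMillis) -> {
            String key = new String(rowKey, StandardCharsets.UTF_8);
            if (key.equals("row2")) {
                throw new IOException("timed out waiting for lock on " + key);
            }
            return new CountingLock(key);
        };

        try {
            reacquire(context, failing, 100L);
        } catch (IOException e) {
            System.out.println("re-acquire failed: " + e.getMessage());
            System.out.println("locks left in context: " + context.rowLocks.size());
        }
    }
}
```

An alternative would be to leave the partially acquired locks in the context and let the cleanup hook release them. Either way, the point is that the context must never hold locks that were already released.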


> Local indexes get out of sync after changes for global consistent indexes
> -------------------------------------------------------------------------
>
>                 Key: PHOENIX-6090
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6090
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.15.0, 5.1.0, 4.16.0
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Blocker
>             Fix For: 5.1.0, 4.15.1, 4.16.0
>
>         Attachments: 6090-fix-4.x.txt, 6090-fix-v2-4.x.txt, 
> 6090-fix-v3-4.x.txt, 6090-test-4.x.txt, 6090-test-v2-4.x.txt
>
>
> {code:java}
>  > select /*+ NO_INDEX */ count(*) from test;
> +----------+
> | COUNT(1) |
> +----------+
> | 522244   |
> +----------+
> 1 row selected (1.213 seconds)
> > select count(*) from test;
> +----------+
> | COUNT(1) |
> +----------+
> | 522245   |
> +----------+
> 1 row selected (1.23 seconds)
> {code}
>  
> This was after I did some inserts and a bunch of splits (but not in parallel).
> It's not yet clear under exactly what circumstances this happens, just that 
> it does after a while.
> This is Phoenix built from master and HBase built from branch-2.3. (Client 
> and server versions of HBase are matching).
> I've since tried with Phoenix 4.x and see the same issue - also see attached 
> tests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
