[
https://issues.apache.org/jira/browse/PHOENIX-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17182130#comment-17182130
]
Lars Hofhansl commented on PHOENIX-6090:
----------------------------------------
I think the problem is that
* global indexes should not hold the lock while the remote writes are
happening (since that can cause transitive stalling of region servers)
* if there are concurrent writes to the same rows, global consistent indexes
just accept the "corruption" (the rows are unverified anyway) and then let the
read-repair handle it
* local indexes do not have read-repair (nor should they), so they need that
lock to be held. Holding the locks for local indexes is OK since they do not
perform any remote operations.
So we are a bit at odds here. Solutions:
* add read-repair to local indexes (I would strongly advise against that)
* keep holding the locks if there is at least one local index involved (but
now the global remote operations are done inside of a lock), perhaps document
that mixing the two on the same table should be avoided...?
* (somehow) separate the write paths.
* or figure out some other way to make local indexes consistent without
holding locks.
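
To make the trade-off in the second option concrete, here is a minimal sketch of the decision rule only. It is not Phoenix code: the class, enum, and method names are hypothetical and only model "keep the locks if at least one local index is involved", plus the side effect that the global (remote) writes would then run under the lock.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the locking decision discussed above; not Phoenix's API.
public class IndexLockPolicySketch {

    // The two index kinds the comment distinguishes.
    public enum IndexType { GLOBAL, LOCAL }

    // Option 2: keep holding the row locks during index writes
    // if at least one local index is defined on the table.
    public static boolean holdLocksDuringIndexWrites(List<IndexType> indexes) {
        return indexes.contains(IndexType.LOCAL);
    }

    // The downside of option 2: on a table that mixes both kinds,
    // the remote writes for global indexes now happen inside the lock.
    public static boolean remoteWritesUnderLock(List<IndexType> indexes) {
        return holdLocksDuringIndexWrites(indexes)
                && indexes.contains(IndexType.GLOBAL);
    }

    public static void main(String[] args) {
        List<IndexType> mixed = Arrays.asList(IndexType.GLOBAL, IndexType.LOCAL);
        System.out.println("hold locks: " + holdLocksDuringIndexWrites(mixed));
        System.out.println("remote writes under lock: " + remoteWritesUnderLock(mixed));
    }
}
```

Under this rule a table with only global indexes never holds the lock across remote writes, and a table with only local indexes holds it safely. Only the mixed case gets the stalling risk back, which is why documenting against mixing the two might be enough.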
> Local indexes get out of sync after changes for global consistent indexes
> -------------------------------------------------------------------------
>
> Key: PHOENIX-6090
> URL: https://issues.apache.org/jira/browse/PHOENIX-6090
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.15.0, 5.1.0, 4.16.0
> Reporter: Lars Hofhansl
> Assignee: Kadir OZDEMIR
> Priority: Blocker
> Fix For: 5.1.0, 4.15.1, 4.16.0
>
> Attachments: 6090-test-4.x.txt, 6090-test-v2-4.x.txt
>
>
> {code:java}
> > select /*+ NO_INDEX */ count(*) from test;
> +----------+
> | COUNT(1) |
> +----------+
> | 522244 |
> +----------+
> 1 row selected (1.213 seconds)
> > select count(*) from test;
> +----------+
> | COUNT(1) |
> +----------+
> | 522245 |
> +----------+
> 1 row selected (1.23 seconds)
> {code}
>
> This was after I did some inserts and a bunch of splits (but not in parallel).
> It's not yet clear under exactly what circumstances this happens, just that
> after a while it does.
> This is Phoenix built from master and HBase built from branch-2.3. (Client
> and server versions of HBase are matching).
> I've since tried with Phoenix 4.x and see the same issue - also see attached
> tests.