[ https://issues.apache.org/jira/browse/PHOENIX-950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979995#comment-13979995 ]
James Taylor commented on PHOENIX-950:
--------------------------------------

I like the idea of improving the update failure handling. However, I think there are two other solutions that skirt this issue completely:
- local indexing. [~rajesh23] has been working away at getting local indexing into Phoenix and I don't think it's too far off. There are no cross region server calls in this solution, so the problem goes away. I still think we'll want to support global indexes, though, as these will provide better performance for some use cases (see below).
- transactions. [~jesse_yates], [~lhofhansl], and I have been talking about how to provide lightweight transactions through snapshot isolation in HBase. The folks at Continuity have done this, as have the folks at Splice Machine. It's not as much work as it sounds. With transaction support, this problem also goes away, as the data and index updates would be done in the same transaction and fail or succeed together.

Short term, though, I think we can do something simpler than what you're proposing and have a failure mode that doesn't kill the region server. As you've said, we first try to disable the index. If that fails, we can throw an exception, and the client can disable the index locally and attempt to disable it globally as well. We can also optionally skip writing the index updates to the WAL if that's problematic (a kind of "do you feel lucky" mode). :-)

I'm curious, though: are you seeing situations where index updates fail, but the region server is functioning fine and the index cannot be disabled? That's the only time we kill the region server.
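To make the short-term failure mode above concrete, here is a rough sketch of the control flow only. Everything in it is an assumption for illustration: `IndexCatalog`, `IndexDisableFailedException`, `onIndexWriteFailure`, and `Client` are hypothetical names, not Phoenix or HBase APIs, and a plain method call stands in for the RPC to the region server.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: none of these types are Phoenix or HBase APIs.
final class IndexFailurePolicySketch {

    /** How far the index got disabled after a failed index update. */
    enum Outcome { DISABLED_GLOBALLY, DISABLED_LOCALLY_ONLY }

    /** Assumed stand-in for whatever marks an index as disabled cluster-wide. */
    interface IndexCatalog {
        void disable(String indexName) throws Exception;
    }

    /** Raised by the server side instead of aborting the region server. */
    static final class IndexDisableFailedException extends Exception {
        final String indexName;

        IndexDisableFailedException(String indexName, Throwable cause) {
            super("could not disable index " + indexName, cause);
            this.indexName = indexName;
        }
    }

    /** Server side: after an index write fails, try to disable the index. */
    static void onIndexWriteFailure(String indexName, IndexCatalog catalog)
            throws IndexDisableFailedException {
        try {
            catalog.disable(indexName);
        } catch (Exception e) {
            // Surface the problem to the client rather than killing the server.
            throw new IndexDisableFailedException(indexName, e);
        }
    }

    /** Client side: reacts to the exception raised above. */
    static final class Client {
        final Set<String> locallyDisabled = new HashSet<>();

        Outcome handleFailedUpdate(String indexName,
                                   IndexCatalog serverCatalog,
                                   IndexCatalog clientCatalog) {
            try {
                // A direct call stands in for the RPC to the region server.
                onIndexWriteFailure(indexName, serverCatalog);
                return Outcome.DISABLED_GLOBALLY;
            } catch (IndexDisableFailedException e) {
                // Stop using the index on this client right away.
                locallyDisabled.add(e.indexName);
                try {
                    // Then attempt the global disable from the client.
                    clientCatalog.disable(e.indexName);
                    return Outcome.DISABLED_GLOBALLY;
                } catch (Exception retryFailure) {
                    return Outcome.DISABLED_LOCALLY_ONLY;
                }
            }
        }
    }

    public static void main(String[] args) {
        // Simulate a catalog that cannot be reached from either side.
        IndexCatalog unreachable = name -> { throw new Exception("catalog unreachable"); };
        Client client = new Client();
        System.out.println(client.handleFailedUpdate("my_index", unreachable, unreachable));
    }
}
```

The point of the sketch is that no path ends in a server abort: the worst case leaves the index disabled on one client only, which is where the open question about reconciling other clients would come in.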
> Improve Secondary Index Update Failure Handling
> -----------------------------------------------
>
>                 Key: PHOENIX-950
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-950
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Jeffrey Zhong
>         Attachments: Improve Phoenix Secondary Index Update Failure Handling.pdf
>
>
> Currently, a secondary index update can trigger chained region server failures.
> This isn't friendly to end users. Even if we disable the index after index update
> failures, before aborting, it will still require a lot of human involvement, because
> index update failures aren't rare.
> In this JIRA, I propose a 2PC-like protocol. "Like" means it's not a real 2PC:
> there is no indefinite blocking, but it requires reconciling inconsistencies
> between index and data at read (query) time. Since I'm not familiar with
> the query-time logic, please let me know if the proposal could fly.
> Thanks.

--
This message was sent by Atlassian JIRA
(v6.2#6252)