[ 
https://issues.apache.org/jira/browse/PHOENIX-950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980039#comment-13980039
 ] 

Jeffrey Zhong commented on PHOENIX-950:
---------------------------------------

[~giacomotaylor] Thanks for the comments. Good to know the transaction work is 
underway; I'd like to see the design or talk on this topic. With transaction 
support, secondary index updates could be handled nicely.

Disabling the index when updates fail is better than aborting the region server 
directly, but it also makes the index unusable. A region server hosting the 
index could go away for a while (not a frequent situation, but not a rare one 
either). So after a while the index is likely to be in a disabled state, which 
defeats the purpose of having such an index in the first place. It also 
requires human involvement to re-enable the index afterwards. I'm hoping to 
keep the index online all the time until we can use transactions to handle the 
updates smoothly.
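
The disable-on-failure behaviour discussed above can be modelled roughly as 
follows. This is only a toy sketch in Python, not Phoenix code: ToyTable, 
IndexState, index_reachable and can_use_index are hypothetical names made up 
for illustration, and stale index entries for overwritten values are not 
cleaned up. It shows why a short outage leaves the index unusable until 
someone re-enables it.

```python
from enum import Enum


class IndexState(Enum):
    ACTIVE = "ACTIVE"
    DISABLED = "DISABLED"


class ToyTable:
    """Hypothetical stand-in for a data table with one secondary index."""

    def __init__(self):
        self.data = {}                # row key -> value
        self.index = {}               # value -> set of row keys
        self.index_state = IndexState.ACTIVE
        self.index_reachable = True   # set to False to simulate an outage

    def _write_index(self, value, row_key):
        if not self.index_reachable:
            raise ConnectionError("index region server unreachable")
        self.index.setdefault(value, set()).add(row_key)

    def upsert(self, row_key, value):
        # The data write always goes through in this model.
        self.data[row_key] = value
        if self.index_state is IndexState.DISABLED:
            return  # a disabled index is no longer maintained
        try:
            self._write_index(value, row_key)
        except ConnectionError:
            # Disable the index instead of aborting the server.
            self.index_state = IndexState.DISABLED

    def can_use_index(self):
        return self.index_state is IndexState.ACTIVE
```

In this model, one failed index write while index_reachable is False flips 
the state to DISABLED, and the index stays disabled even after 
index_reachable is set back to True, because nothing re-enables it 
automatically.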

{quote}
are you seeing situations where index updates fail, but the region server is 
functioning fine and the index cannot be disabled
{quote}
This could happen because the data, the index and system.catalog are likely 
hosted on three different RSs. It will be a rare situation, but it could still 
happen when one of them can't talk to the other two for a moment.


> Improve Secondary Index Update Failure Handling
> -----------------------------------------------
>
>                 Key: PHOENIX-950
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-950
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Jeffrey Zhong
>         Attachments: Improve Phoenix Secondary Index Update Failure 
> Handling.pdf
>
>
> The current secondary index update handling could trigger chained region 
> server failures. This isn't friendly to end users. Even if we disable the 
> index after index update failures before aborting, it will require a lot of 
> human involvement because index update failure isn't a rare situation.
> In this JIRA, I propose a 2PC-like protocol. The "like" means it's not a 
> real 2PC because there is no infinite blocking, but it requires query time 
> (the read path) to reconcile inconsistencies between the index and the data. 
> Since I'm not familiar with the query-time logic, please let me know if the 
> proposal could fly.
> Thanks.
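
As a rough illustration of what read-time reconciliation could look like, 
here is a minimal Python sketch. It is not the protocol from the attached PDF 
and not Phoenix code: it is plain read-time verification of index hits against 
the data, under the assumption that index entries are written before the data 
row, so a failure can leave a stray index entry but never a missing one. 
query_by_value and the dict-based tables are hypothetical.

```python
def query_by_value(data, index, value):
    """Return the row keys for `value`, verifying each index hit against data.

    data:  dict mapping row key -> current value (toy data table)
    index: dict mapping value -> set of row keys (toy index table)
    """
    verified = []
    for row_key in sorted(index.get(value, set())):
        # Drop index entries whose data row is missing or holds another value.
        if data.get(row_key) == value:
            verified.append(row_key)
    return verified
```

With data = {"r1": "a"} and index = {"a": {"r1", "r3"}}, the stray entry for 
"r3" is filtered out at query time, so the query does not block or fail on the 
inconsistency.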



--
This message was sent by Atlassian JIRA
(v6.2#6252)
