[ https://issues.apache.org/jira/browse/PHOENIX-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344274#comment-16344274 ]
Vincent Poon commented on PHOENIX-4130:
---------------------------------------

> if an index is PENDING_DISABLE and elapsedSinceDisable <= pendingDisableThreshold, just continue before even checking if all table regions are online (since that check is somewhat expensive).

There is already a check. The logic for checking that all table regions are online is:
{code:java}
if ((indexState == PIndexState.DISABLE
        || indexState == PIndexState.PENDING_ACTIVE
        || (indexState == PIndexState.PENDING_DISABLE
            && elapsedSinceDisable > pendingDisableThreshold))
    && !MetaDataUtil.tableRegionsOnline(this.env.getConfiguration(), indexPTable)) {
    LOG.debug("Index rebuild has been skipped because not all regions of index table="
        + indexPTable.getName() + " are online.");
    continue;
}
{code}
So, if elapsedSinceDisable <= pendingDisableThreshold, it should already do what you said: skip the check for whether all regions are online, and just continue.

> Since index maintenance was never stopped when we go into PENDING_DISABLE

Is that true after the change in QueryOptimizer, where we only return the index if we are under pendingDisableThreshold? I was thinking that if we're over pendingDisableThreshold, index maintenance might have stopped, but I'm actually not sure. If index maintenance still happens, then I can do as you suggested and switch directly to INACTIVE.
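To make the short-circuit explicit, here is a minimal standalone sketch of just the state/threshold part of that condition. The class name, the `IndexState` enum (a stand-in for `PIndexState` limited to the states discussed here), the `needsRegionCheck` helper, and the 30-second threshold are all assumptions for illustration, not Phoenix code. It only shows when the regions-online call would be evaluated at all, not what the rebuild loop does afterwards.

```java
final class RegionCheckGate {
    // Hypothetical stand-in for PIndexState; only the states discussed in this thread.
    enum IndexState { ACTIVE, DISABLE, PENDING_ACTIVE, PENDING_DISABLE }

    /**
     * Returns true when the left-hand side of the && in the quoted condition holds,
     * i.e. when the (somewhat expensive) regions-online check would actually be evaluated.
     */
    static boolean needsRegionCheck(IndexState state,
                                    long elapsedSinceDisable,
                                    long pendingDisableThreshold) {
        return state == IndexState.DISABLE
            || state == IndexState.PENDING_ACTIVE
            || (state == IndexState.PENDING_DISABLE
                && elapsedSinceDisable > pendingDisableThreshold);
    }

    public static void main(String[] args) {
        long threshold = 30_000L; // illustrative value only, in milliseconds

        // PENDING_DISABLE and still under the threshold: the check is short-circuited.
        System.out.println(needsRegionCheck(IndexState.PENDING_DISABLE, 10_000L, threshold)); // false

        // PENDING_DISABLE and over the threshold: the check is evaluated.
        System.out.println(needsRegionCheck(IndexState.PENDING_DISABLE, 45_000L, threshold)); // true

        // DISABLE is checked regardless of elapsed time.
        System.out.println(needsRegionCheck(IndexState.DISABLE, 0L, threshold)); // true
    }
}
```

Because Java's `&&` does not evaluate its right operand when the left one is false, the regions-online call is never made in the first case.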
> Avoid server retries for mutable indexes
> ----------------------------------------
>
>                 Key: PHOENIX-4130
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4130
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Lars Hofhansl
>            Assignee: Vincent Poon
>            Priority: Major
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-4130.v1.master.patch, PHOENIX-4130.v2.master.patch, PHOENIX-4130.v3.master.patch, PHOENIX-4130.v4.master.patch, PHOENIX-4130.v5.master.patch
>
>
> Had some discussions with [~jamestaylor], [~samarthjain], and [~vincentpoon], during which I suggested that we can possibly eliminate the retry loops happening at the server, which cause the handler threads to be stuck potentially for quite a while (at least multiple seconds, to ride over common scenarios like splits).
> Instead we can do the retries at the Phoenix client.
> So:
> # The index updates are not retried on the server. (retries = 0)
> # A failed index update would set the failed index timestamp but leave the index enabled.
> # Now that the handler thread is done, it throws an appropriate exception back to the client.
> # The Phoenix client can now retry. When those retries fail, the index is disabled (if the policy dictates that) and the exception is thrown back to the caller.
> So no more waiting is needed on the server; handler threads are freed immediately.
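
The four steps in the issue description can be sketched roughly as below. This is only an illustration of the control flow, not the patch: `IndexWriteFailedException`, `IndexWrite`, `FailurePolicy`, and `writeWithClientRetries` are hypothetical names, the server side is reduced to a single attempt that either succeeds or throws, and backoff between retries is left out.

```java
final class ClientRetrySketch {
    /** Hypothetical exception the server throws after one failed index update (no server retries). */
    static class IndexWriteFailedException extends Exception {
        IndexWriteFailedException(String message) { super(message); }
    }

    /** Hypothetical stand-in for one client-to-server index update attempt. */
    interface IndexWrite {
        void attempt() throws IndexWriteFailedException;
    }

    /** Hypothetical hook for the failure policy, e.g. disabling the index. */
    interface FailurePolicy {
        void onRetriesExhausted();
    }

    /**
     * Tries the write once plus up to maxRetries more times on the client side.
     * If every attempt fails, applies the policy and rethrows the last failure to the caller.
     */
    static void writeWithClientRetries(IndexWrite write, int maxRetries, FailurePolicy policy)
            throws IndexWriteFailedException {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        IndexWriteFailedException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                write.attempt(); // the server does not retry; it fails fast
                return;          // success: nothing more to do
            } catch (IndexWriteFailedException e) {
                last = e;        // remember the failure and retry from the client
            }
        }
        policy.onRetriesExhausted(); // e.g. disable the index, if the policy dictates that
        throw last;                  // surface the failure to the caller
    }
}
```

The point of the design is that each server call returns immediately, so any waiting happens in the client loop rather than in a server handler thread.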