James Taylor created PHOENIX-4027:
-------------------------------------

             Summary: Mark index as disabled during partial rebuild after configurable amount of time
                 Key: PHOENIX-4027
                 URL: https://issues.apache.org/jira/browse/PHOENIX-4027
             Project: Phoenix
          Issue Type: Bug
            Reporter: James Taylor

Instead of marking an index as permanently disabled in the partial index rebuilder as soon as a failure occurs, we should let it keep retrying for up to a configurable amount of time. The reason is that the fail-fast approach with the lower RPC timeout will continue to cause a failure until the index region can be written to. Retrying will allow us to ride out region moves without a long RPC timeout and thus without holding handler threads for long periods of time.

We can base the failure on the INDEX_DISABLE_TIMESTAMP value of an index as we walk through the scan results here in MetaDataRegionObserver:

{code}
do {
    results.clear();
    hasMore = scanner.next(results);
    if (results.isEmpty()) break;

    Result r = Result.create(results);
    byte[] disabledTimeStamp = r.getValue(PhoenixDatabaseMetaData.TABLE_FAMILY_BYTES,
            PhoenixDatabaseMetaData.INDEX_DISABLE_TIMESTAMP_BYTES);
    byte[] indexState = r.getValue(PhoenixDatabaseMetaData.TABLE_FAMILY_BYTES,
            PhoenixDatabaseMetaData.INDEX_STATE_BYTES);

    // Skip rows with no disable timestamp recorded.
    if (disabledTimeStamp == null || disabledTimeStamp.length == 0) {
        continue;
    }

    // TODO: if System.currentTimeMillis() - (decoded disabledTimeStamp) > configurableAmount
    // then disable the index.
{code}

I'd propose we allow 30 minutes to get an index back online.
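
The elapsed-time check described in the TODO could be sketched roughly as follows. This is only an illustration, not the Phoenix implementation: the class name IndexDisablePolicy, the method shouldDisable, and the constant DEFAULT_MAX_REBUILD_MILLIS are hypothetical, and the sketch assumes the INDEX_DISABLE_TIMESTAMP bytes have already been decoded into epoch milliseconds. The default is the 30 minutes proposed in this issue.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Phoenix: decides whether an index that is
// being partially rebuilt has been failing for too long and should be disabled.
final class IndexDisablePolicy {
    // Assumed default: the 30 minutes proposed in this issue.
    static final long DEFAULT_MAX_REBUILD_MILLIS = TimeUnit.MINUTES.toMillis(30);

    private IndexDisablePolicy() {}

    // disableTimestampMillis: decoded INDEX_DISABLE_TIMESTAMP (epoch millis).
    // nowMillis: current time, passed in so the check is easy to test.
    // maxMillis: the configurable retry window.
    static boolean shouldDisable(long disableTimestampMillis, long nowMillis, long maxMillis) {
        // Assumption: a non-positive value means no disable timestamp is recorded.
        if (disableTimestampMillis <= 0) {
            return false;
        }
        // Elapsed time is "now minus disable time"; disable only once it
        // exceeds the configured window.
        return nowMillis - disableTimestampMillis > maxMillis;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long tenMinutesAgo = now - TimeUnit.MINUTES.toMillis(10);
        long fortyMinutesAgo = now - TimeUnit.MINUTES.toMillis(40);
        // Still inside the retry window, so keep trying.
        System.out.println(shouldDisable(tenMinutesAgo, now, DEFAULT_MAX_REBUILD_MILLIS));
        // Past the retry window, so give up and disable.
        System.out.println(shouldDisable(fortyMinutesAgo, now, DEFAULT_MAX_REBUILD_MILLIS));
    }
}
```

Passing the current time in as a parameter, rather than calling System.currentTimeMillis() inside the check, keeps the comparison deterministic and makes the boundary (exactly 30 minutes elapsed is still within the window) straightforward to verify.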