[jira] [Updated] (PHOENIX-4027) Mark index as disabled during partial rebuild after configurable amount of time
     [ https://issues.apache.org/jira/browse/PHOENIX-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor updated PHOENIX-4027:
----------------------------------
    Labels: secondary_index  (was: )

> Mark index as disabled during partial rebuild after configurable amount of time
> -------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4027
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4027
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: Samarth Jain
>              Labels: secondary_index
>             Fix For: 4.12.0
>
>         Attachments: PHOENIX-4027.patch, PHOENIX-4027_addendum.patch, PHOENIX-4027_addendum_2.patch
>
>
> Instead of marking an index as permanently disabled in the partial index
> rebuilder when a failure occurs, we should let it try again for up to a
> configurable amount of time. The reason is that the fail-fast approach with
> the lower RPC timeout will continue to cause a failure until the index region
> can be written to. This will allow us to ride out region moves without a long
> RPC timeout and thus without holding handler threads for long periods of
> time. We can base the failure on the INDEX_DISABLE_TIMESTAMP value of an
> index as we walk through the scan results here in MetaDataRegionObserver:
> {code}
> do {
>     results.clear();
>     hasMore = scanner.next(results);
>     if (results.isEmpty()) break;
>     Result r = Result.create(results);
>     byte[] disabledTimeStamp = r.getValue(PhoenixDatabaseMetaData.TABLE_FAMILY_BYTES,
>             PhoenixDatabaseMetaData.INDEX_DISABLE_TIMESTAMP_BYTES);
>     byte[] indexState = r.getValue(PhoenixDatabaseMetaData.TABLE_FAMILY_BYTES,
>             PhoenixDatabaseMetaData.INDEX_STATE_BYTES);
>     if (disabledTimeStamp == null || disabledTimeStamp.length == 0) {
>         continue;
>     }
>     // TODO: if System.currentTimeMillis() - disabledTimeStamp > configurableAmount
>     // then disable the index.
> {code}
> I'd propose we allow 30 minutes to get an index back online.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
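The elapsed-time check described in the TODO above could look roughly like the following. This is only a minimal sketch, not the code from the attached patches: the class name IndexRebuildDeadline, the method shouldDisable, and the convention that a timestamp of 0 means "no disable timestamp recorded" are all assumptions made for illustration. The 30-minute default is taken from the proposal in the description.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not part of Phoenix) that decides whether a partially
 * rebuilt index has been failing for longer than the allowed window.
 */
final class IndexRebuildDeadline {

    /** Assumed default window: the 30 minutes proposed in the issue description. */
    static final long DEFAULT_MAX_REBUILD_MS = TimeUnit.MINUTES.toMillis(30);

    private IndexRebuildDeadline() {
    }

    /**
     * @param disabledTimestampMs time at which the index was first marked as failing;
     *                            0 is assumed here to mean "no timestamp recorded"
     * @param nowMs               current wall-clock time in milliseconds
     * @param maxRebuildMs        configurable window during which rebuild retries are allowed
     * @return true if the window has elapsed and the index should be disabled
     */
    static boolean shouldDisable(long disabledTimestampMs, long nowMs, long maxRebuildMs) {
        if (disabledTimestampMs == 0) {
            // Mirrors the "continue" branch above: nothing to evaluate for this index.
            return false;
        }
        // Elapsed time is "now minus disable timestamp", not the other way around.
        long elapsedMs = nowMs - disabledTimestampMs;
        return elapsedMs > maxRebuildMs;
    }
}
```

A caller in the scan loop would decode the timestamp bytes into a long, pass System.currentTimeMillis() as nowMs, and keep retrying the rebuild for as long as shouldDisable returns false. Passing the clock value in as a parameter keeps the check easy to test.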
[jira] [Updated] (PHOENIX-4027) Mark index as disabled during partial rebuild after configurable amount of time
     [ https://issues.apache.org/jira/browse/PHOENIX-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor updated PHOENIX-4027:
----------------------------------
    Fix Version/s:     (was: 4.11.1)

> Mark index as disabled during partial rebuild after configurable amount of time
> -------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4027
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4027
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: Samarth Jain
>              Labels: secondary_index
>             Fix For: 4.12.0
>
>         Attachments: PHOENIX-4027.patch, PHOENIX-4027_addendum.patch, PHOENIX-4027_addendum_2.patch
[jira] [Updated] (PHOENIX-4027) Mark index as disabled during partial rebuild after configurable amount of time
     [ https://issues.apache.org/jira/browse/PHOENIX-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajeshbabu Chintaguntla updated PHOENIX-4027:
---------------------------------------------
    Attachment: PHOENIX-4027_addendum_2.patch

Here is the addendum, which uses the timestamp at which the index state changed, rather than the index disable timestamp, to check how long the index rebuild has been running.

> Mark index as disabled during partial rebuild after configurable amount of time
> -------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4027
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4027
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: Samarth Jain
>             Fix For: 4.12.0, 4.11.1
>
>         Attachments: PHOENIX-4027_addendum_2.patch, PHOENIX-4027_addendum.patch, PHOENIX-4027.patch
[jira] [Updated] (PHOENIX-4027) Mark index as disabled during partial rebuild after configurable amount of time
     [ https://issues.apache.org/jira/browse/PHOENIX-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Samarth Jain updated PHOENIX-4027:
----------------------------------
    Attachment: PHOENIX-4027_addendum.patch

Addendum patch. It turned out that the test added in PhoenixRuntimeIT needs to use the real Phoenix driver, so I had to move it into its own test class. Also, there was a typo in the number of index RPC retries for index rebuilds: instead of 0 I had it as 1. For tests, though, I need to set the RPC retry value to 1; otherwise they fail.

> Mark index as disabled during partial rebuild after configurable amount of time
> -------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4027
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4027
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: Samarth Jain
>             Fix For: 4.12.0, 4.11.1
>
>         Attachments: PHOENIX-4027_addendum.patch, PHOENIX-4027.patch
[jira] [Updated] (PHOENIX-4027) Mark index as disabled during partial rebuild after configurable amount of time
     [ https://issues.apache.org/jira/browse/PHOENIX-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Samarth Jain updated PHOENIX-4027:
----------------------------------
    Attachment: PHOENIX-4027.patch

[~jamestaylor], [~vincentpoon] - please review.

> Mark index as disabled during partial rebuild after configurable amount of time
> -------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4027
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4027
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: Samarth Jain
>         Attachments: PHOENIX-4027.patch