[jira] [Commented] (PHOENIX-4027) Mark index as disabled during partial rebuild after configurable amount of time

2017-09-19 Thread Rajeshbabu Chintaguntla (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16171208#comment-16171208
 ] 

Rajeshbabu Chintaguntla commented on PHOENIX-4027:
--

[~jamestaylor] I ran the tests and they pass with the addendum. What about 
increasing the default threshold from the current 30 minutes to 1 hour or 
more? Fixing HBase inconsistencies can sometimes take longer than that, and 
rebuilding the index also takes time.

> Mark index as disabled during partial rebuild after configurable amount of 
> time
> ---
>
> Key: PHOENIX-4027
> URL: https://issues.apache.org/jira/browse/PHOENIX-4027
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Samarth Jain
> Fix For: 4.12.0, 4.11.1
>
> Attachments: PHOENIX-4027_addendum_2.patch, 
> PHOENIX-4027_addendum.patch, PHOENIX-4027.patch
>
>
> Instead of marking an index as permanently disabled in the partial index 
> rebuilder when a failure occurs, we should let it try again up to a 
> configurable amount of time. The reason is that the fail-fast approach with 
> the lower RPC timeout will continue to cause a failure until the index region 
> can be written to. This will allow us to ride out region moves without a long 
> RPC time out and thus without holding handler threads for long periods of 
> time. We can base the failure on the INDEX_DISABLE_TIMESTAMP value of an 
> index as we walk through the scan results here in MetaDataRegionObserver:
> {code}
> do {
>     results.clear();
>     hasMore = scanner.next(results);
>     if (results.isEmpty()) break;
>     Result r = Result.create(results);
>     byte[] disabledTimeStamp = r.getValue(PhoenixDatabaseMetaData.TABLE_FAMILY_BYTES,
>             PhoenixDatabaseMetaData.INDEX_DISABLE_TIMESTAMP_BYTES);
>     byte[] indexState = r.getValue(PhoenixDatabaseMetaData.TABLE_FAMILY_BYTES,
>             PhoenixDatabaseMetaData.INDEX_STATE_BYTES);
>     if (disabledTimeStamp == null || disabledTimeStamp.length == 0) {
>         continue;
>     }
>     // TODO: if System.currentTimeMillis() - disabledTimeStamp > configurableAmount
>     // then disable the index.
> {code}
> I'd propose we allow 30 minutes to get an index back online.
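
The TODO in the description above amounts to an elapsed-time comparison. A minimal sketch of that check follows. `IndexDisableCheck` and `shouldDisable` are hypothetical names used only for illustration and are not part of Phoenix; taking the absolute value of the timestamp mirrors the patch excerpt quoted elsewhere in this thread.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not part of Phoenix: decides when to stop retrying a rebuild. */
public final class IndexDisableCheck {
    private IndexDisableCheck() {}

    /**
     * Returns true when the index has been disabled for longer than the threshold.
     * The absolute value is taken because the patch excerpt in this thread does the same.
     */
    public static boolean shouldDisable(long nowMillis, long indexDisableTimestamp,
            long thresholdMillis) {
        return nowMillis - Math.abs(indexDisableTimestamp) > thresholdMillis;
    }

    public static void main(String[] args) {
        long threshold = TimeUnit.MINUTES.toMillis(30); // the 30 minutes proposed above
        long now = System.currentTimeMillis();
        long disabledTenMinutesAgo = now - TimeUnit.MINUTES.toMillis(10);
        long disabledAnHourAgo = now - TimeUnit.HOURS.toMillis(1);

        // Still inside the window: keep retrying the partial rebuild.
        System.out.println(shouldDisable(now, disabledTenMinutesAgo, threshold)); // false
        // Past the window: mark the index as disabled.
        System.out.println(shouldDisable(now, disabledAnHourAgo, threshold));     // true
    }
}
```

Passing the current time in as a parameter, rather than reading the clock inside the method, keeps the check easy to test with fixed values.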



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PHOENIX-4027) Mark index as disabled during partial rebuild after configurable amount of time

2017-09-15 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168347#comment-16168347
 ] 

James Taylor commented on PHOENIX-4027:
---

That sounds reasonable, [~rajeshbabu]. How about kicking off a pre-commit run 
to make sure we don't need to update any tests?



[jira] [Commented] (PHOENIX-4027) Mark index as disabled during partial rebuild after configurable amount of time

2017-09-15 Thread Rajeshbabu Chintaguntla (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167715#comment-16167715
 ] 

Rajeshbabu Chintaguntla commented on PHOENIX-4027:
--

With this patch, a partial index rebuild can easily leave the index disabled 
forever in the following situations:
1) When we write past data with row timestamp columns
2) When region inconsistencies take more than 30 minutes to fix

When the data is huge, creating an index or completely rebuilding one might 
take hours or days. In such cases it's better to rebuild the index in 
intervals or batches than to disable it completely. [~samarthjain] 
[~jamestaylor] wdyt?
{noformat}
if (EnvironmentEdgeManager.currentTimeMillis() - Math.abs(indexDisableTimestamp) > indexDisableTimestampThreshold) {
    /*
     * It has been too long since the index has been disabled and any future
     * attempts to reenable it likely will fail. So we are going to mark the
     * index as disabled and set the index disable timestamp to 0 so that the
     * rebuild task won't pick up this index again for rebuild.
     */
    try {
        IndexUtil.updateIndexState(conn, indexTableFullName, PIndexState.DISABLE, 0l);
        LOG.error("Unable to rebuild index " + indexTableFullName
                + ". Won't attempt again since index disable timestamp is older than current time by "
                + indexDisableTimestampThreshold
                + " milliseconds. Manual intervention needed to re-build the index");
    } catch (Throwable ex) {
        LOG.error("Unable to mark index " + indexTableFullName + " as disabled.", ex);
    }
    continue; // don't attempt another rebuild irrespective of whether
              // updateIndexState worked or not
}
{noformat}
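
To illustrate the "intervals or batches" suggestion above, here is a rough sketch that splits the time range to be replayed into fixed-size windows. This is not what the patch does. `RebuildBatches`, `Window`, and `split` are hypothetical names, and how each window would actually be replayed against the index is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Phoenix: splits a rebuild range into batches. */
public final class RebuildBatches {
    private RebuildBatches() {}

    /** Half-open time window [start, end) in epoch milliseconds. */
    public static final class Window {
        public final long start;
        public final long end;

        Window(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ")";
        }
    }

    /** Splits [fromMillis, toMillis) into consecutive windows of at most batchMillis each. */
    public static List<Window> split(long fromMillis, long toMillis, long batchMillis) {
        if (batchMillis <= 0) {
            throw new IllegalArgumentException("batchMillis must be positive");
        }
        List<Window> windows = new ArrayList<>();
        long start = fromMillis;
        while (start < toMillis) {
            // The last window is clipped so that it never runs past toMillis.
            long end = (toMillis - start > batchMillis) ? start + batchMillis : toMillis;
            windows.add(new Window(start, end));
            start = end;
        }
        return windows;
    }

    public static void main(String[] args) {
        // Replay 250 ms worth of writes in batches of 100 ms.
        for (Window w : split(0L, 250L, 100L)) {
            System.out.println(w); // [0, 100) then [100, 200) then [200, 250)
        }
    }
}
```

With this shape, a rebuild task could record the end of the last completed window and resume from there on the next run instead of giving up once the threshold elapses.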

> Mark index as disabled during partial rebuild after configurable amount of 
> time
> ---
>
> Key: PHOENIX-4027
> URL: https://issues.apache.org/jira/browse/PHOENIX-4027
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Samarth Jain
> Fix For: 4.12.0, 4.11.1
>
> Attachments: PHOENIX-4027_addendum.patch, PHOENIX-4027.patch


[jira] [Commented] (PHOENIX-4027) Mark index as disabled during partial rebuild after configurable amount of time

2017-08-21 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16135040#comment-16135040
 ] 

Ankit Singhal commented on PHOENIX-4027:


[~samarthjain], can this be resolved?

> Mark index as disabled during partial rebuild after configurable amount of 
> time
> ---
>
> Key: PHOENIX-4027
> URL: https://issues.apache.org/jira/browse/PHOENIX-4027
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Samarth Jain
> Fix For: 4.12.0, 4.11.1
>
> Attachments: PHOENIX-4027_addendum.patch, PHOENIX-4027.patch


[jira] [Commented] (PHOENIX-4027) Mark index as disabled during partial rebuild after configurable amount of time

2017-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090639#comment-16090639
 ] 

Hudson commented on PHOENIX-4027:
-

FAILURE: Integrated in Jenkins build Phoenix-master #1692 (See 
[https://builds.apache.org/job/Phoenix-master/1692/])
PHOENIX-4027 Addendum - move testRebuildIndexConnectionProperties to its 
(samarth: rev 48341ae3fcc645aa7f559ae98606c522c563268d)
* (edit) 
phoenix-core/src/it/java/org/apache/phoenix/end2end/PhoenixRuntimeIT.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/util/QueryUtil.java
* (add) 
phoenix-core/src/it/java/org/apache/phoenix/end2end/RebuildIndexConnectionPropsIT.java
* (edit) 
phoenix-core/src/main/java/org/apache/phoenix/query/QueryServicesOptions.java


> Mark index as disabled during partial rebuild after configurable amount of 
> time
> ---
>
> Key: PHOENIX-4027
> URL: https://issues.apache.org/jira/browse/PHOENIX-4027
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Samarth Jain
> Fix For: 4.12.0, 4.11.1
>
> Attachments: PHOENIX-4027_addendum.patch, PHOENIX-4027.patch


[jira] [Commented] (PHOENIX-4027) Mark index as disabled during partial rebuild after configurable amount of time

2017-07-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089551#comment-16089551
 ] 

Hadoop QA commented on PHOENIX-4027:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12877530/PHOENIX-4027_addendum.patch
  against master branch at commit 18ea6edc00029e7e900ad95562fa73da0e5ccf51.
  ATTACHMENT ID: 12877530

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
51 warning messages.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+serverProps.put(QueryServices.EXTRA_JDBC_ARGUMENTS_ATTRIB, QueryServicesOptions.DEFAULT_EXTRA_JDBC_ARGUMENTS);
+serverProps.put(QueryServices.INDEX_REBUILD_RPC_RETRIES_COUNTER, Long.toString(NUM_RPC_RETRIES));
+MetaDataRegionObserver.getRebuildIndexConnection(hbaseTestUtil.getMiniHBaseCluster().getConfiguration())) {
+ConnectionQueryServices rebuildQueryServices = rebuildIndexConnection.getQueryServices();
+public static final int DEFAULT_INDEX_REBUILD_RPC_RETRIES_COUNTER = 0; // no retries at rpc level

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.MutableIndexFailureIT

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1220//testReport/
Javadoc warnings: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1220//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/1220//console

This message is automatically generated.

> Mark index as disabled during partial rebuild after configurable amount of 
> time
> ---
>
> Key: PHOENIX-4027
> URL: https://issues.apache.org/jira/browse/PHOENIX-4027
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Samarth Jain
> Fix For: 4.12.0, 4.11.1
>
> Attachments: PHOENIX-4027_addendum.patch, PHOENIX-4027.patch


[jira] [Commented] (PHOENIX-4027) Mark index as disabled during partial rebuild after configurable amount of time

2017-07-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1600#comment-1600
 ] 

Hudson commented on PHOENIX-4027:
-

FAILURE: Integrated in Jenkins build Phoenix-master #1689 (See 
[https://builds.apache.org/job/Phoenix-master/1689/])
PHOENIX-4027 Mark index as disabled during partial rebuild after (samarth: rev 
d541d6f2875a590580e8ccf05f26795083b06658)
* (edit) 
phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataRegionObserver.java
* (edit) 
phoenix-core/src/it/java/org/apache/phoenix/end2end/PhoenixRuntimeIT.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/query/QueryServices.java
* (edit) 
phoenix-core/src/it/java/org/apache/phoenix/end2end/index/MutableIndexFailureIT.java
* (edit) 
phoenix-core/src/main/java/org/apache/phoenix/query/QueryServicesOptions.java


> Mark index as disabled during partial rebuild after configurable amount of 
> time
> ---
>
> Key: PHOENIX-4027
> URL: https://issues.apache.org/jira/browse/PHOENIX-4027
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Samarth Jain
> Fix For: 4.12.0, 4.11.1
>
> Attachments: PHOENIX-4027.patch


[jira] [Commented] (PHOENIX-4027) Mark index as disabled during partial rebuild after configurable amount of time

2017-07-14 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088341#comment-16088341
 ] 

James Taylor commented on PHOENIX-4027:
---

+1. Thanks, [~samarthjain].

> Mark index as disabled during partial rebuild after configurable amount of 
> time
> ---
>
> Key: PHOENIX-4027
> URL: https://issues.apache.org/jira/browse/PHOENIX-4027
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Samarth Jain
> Attachments: PHOENIX-4027.patch