[jira] [Updated] (HBASE-8229) Replication code logs like crazy if a target table cannot be found.
[ https://issues.apache.org/jira/browse/HBASE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-8229:
---------------------------------
    Assignee: Lars Hofhansl

Replication code logs like crazy if a target table cannot be found.
-------------------------------------------------------------------

                Key: HBASE-8229
                URL: https://issues.apache.org/jira/browse/HBASE-8229
            Project: HBase
         Issue Type: Bug
         Components: Replication
           Reporter: Lars Hofhansl
           Assignee: Lars Hofhansl
            Fix For: 0.98.0, 0.94.7, 0.95.1
        Attachments: 8229-0.94.txt

One of our RS/DN machines ran out of disk space on the partition to which we write the log files.
It turns out we still had a table in our source cluster with REPLICATION_SCOPE=1 that did not have a matching table in the remote cluster.
It then logged a long stack trace every 50ms or so, which over a few days filled up our log partition.
Since ReplicationSource cannot make any progress in this case anyway, it should probably sleep a bit before retrying (or at least limit the rate at which it spews out these exceptions to the log).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
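The two remedies suggested in the description (sleeping before a retry, and limiting how often the exception is logged) could be sketched roughly as follows. This is only an illustration, not the attached patch: the class `RetryThrottle` and all of its method and parameter names are hypothetical and are not part of HBase.

```java
/**
 * Hypothetical helper (not an HBase class): grows the sleep between failed
 * attempts up to a cap, and allows a full log message only once per interval.
 */
public class RetryThrottle {
    private final long baseSleepMs;    // sleep after the first failure
    private final int maxMultiplier;   // cap on how far the sleep may grow
    private final long logIntervalMs;  // minimum gap between logged failures
    private int failures = 0;
    private long lastLogMs = 0;
    private boolean loggedOnce = false;

    public RetryThrottle(long baseSleepMs, int maxMultiplier, long logIntervalMs) {
        this.baseSleepMs = baseSleepMs;
        this.maxMultiplier = maxMultiplier;
        this.logIntervalMs = logIntervalMs;
    }

    /** Records one more failure and returns how long to sleep before retrying. */
    public long nextSleepMs() {
        failures = Math.min(failures + 1, maxMultiplier);
        return baseSleepMs * failures;
    }

    /** Returns true at most once per logIntervalMs; the caller logs only then. */
    public boolean shouldLog(long nowMs) {
        if (!loggedOnce || nowMs - lastLogMs >= logIntervalMs) {
            loggedOnce = true;
            lastLogMs = nowMs;
            return true;
        }
        return false;
    }

    /** Call after a successful attempt so the next failure starts from the base sleep. */
    public void reset() {
        failures = 0;
    }
}
```

A caller would, on each failed shipment, log the stack trace only if `shouldLog(System.currentTimeMillis())` returns true, then `Thread.sleep(nextSleepMs())` before the next attempt. With a 50ms failure cycle, even a fixed one-second sleep cuts the log volume by well over an order of magnitude.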
[jira] [Updated] (HBASE-8229) Replication code logs like crazy if a target table cannot be found.
[ https://issues.apache.org/jira/browse/HBASE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-8229:
-------------------------
    Fix Version/s:     (was: 0.95.0)
                   0.95.1

Replication code logs like crazy if a target table cannot be found.
-------------------------------------------------------------------

                Key: HBASE-8229
                URL: https://issues.apache.org/jira/browse/HBASE-8229
            Project: HBase
         Issue Type: Bug
         Components: Replication
           Reporter: Lars Hofhansl
            Fix For: 0.95.1, 0.98.0, 0.94.7
        Attachments: 8229-0.94.txt, 8229-0.94-V2.txt
[jira] [Updated] (HBASE-8229) Replication code logs like crazy if a target table cannot be found.
[ https://issues.apache.org/jira/browse/HBASE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-8229:
---------------------------------
    Attachment:     (was: 8229-0.94-V2.txt)

Replication code logs like crazy if a target table cannot be found.
-------------------------------------------------------------------

                Key: HBASE-8229
                URL: https://issues.apache.org/jira/browse/HBASE-8229
            Project: HBase
         Issue Type: Bug
         Components: Replication
           Reporter: Lars Hofhansl
            Fix For: 0.95.1, 0.98.0, 0.94.7
        Attachments: 8229-0.94.txt
[jira] [Updated] (HBASE-8229) Replication code logs like crazy if a target table cannot be found.
[ https://issues.apache.org/jira/browse/HBASE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-8229:
---------------------------------
    Attachment: 8229-0.94.txt

Here's a simple 0.94 fix.
While looking at the code I also realized that there is no way to get out of this situation in the source cluster. Removing REPLICATION_SCOPE, disabling, or even dropping the table will not end this endless loop. Replication will never make any progress until the RS is restarted or the table is added to the peer cluster.

Replication code logs like crazy if a target table cannot be found.
-------------------------------------------------------------------

                Key: HBASE-8229
                URL: https://issues.apache.org/jira/browse/HBASE-8229
            Project: HBase
         Issue Type: Bug
         Components: Replication
           Reporter: Lars Hofhansl
            Fix For: 0.95.0, 0.98.0, 0.94.7
        Attachments: 8229-0.94.txt
[jira] [Updated] (HBASE-8229) Replication code logs like crazy if a target table cannot be found.
[ https://issues.apache.org/jira/browse/HBASE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-8229:
---------------------------------
    Attachment: 8229-0.94-V2.txt

Here's a version that MIGHT fix that (untested).
The idea is to return to the run() loop of ReplicationSource, so that the edits are rechecked (and not shipped to the peer if the local table's status has changed).

Replication code logs like crazy if a target table cannot be found.
-------------------------------------------------------------------

                Key: HBASE-8229
                URL: https://issues.apache.org/jira/browse/HBASE-8229
            Project: HBase
         Issue Type: Bug
         Components: Replication
           Reporter: Lars Hofhansl
            Fix For: 0.95.0, 0.98.0, 0.94.7
        Attachments: 8229-0.94.txt, 8229-0.94-V2.txt
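The control flow described in this comment (go back to the outer loop after a failed shipment so that pending edits are rechecked against the current local table state) might look roughly like the sketch below. It is not the V2 patch and does not use any HBase API: `RecheckLoopSketch`, `Sink`, `drain`, and the use of plain table-name strings to stand in for edits are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch (not HBase code) of rechecking edits on every pass. */
public class RecheckLoopSketch {

    /** Stand-in for the peer cluster; returns false when shipping fails. */
    public interface Sink {
        boolean ship(List<String> editTables);
    }

    /**
     * Each entry in pending is the table name of one edit. On every pass the
     * pending edits are filtered against the current local table state before
     * shipping, so edits for a table that is no longer replicated are dropped
     * rather than retried forever. Returns the number of ship attempts made.
     */
    public static int drain(List<String> pending, Predicate<String> stillReplicated,
                            Sink sink, int maxAttempts) {
        int attempts = 0;
        while (!pending.isEmpty() && attempts < maxAttempts) {
            // Recheck: drop edits whose table is no longer replicated locally.
            pending.removeIf(stillReplicated.negate());
            if (pending.isEmpty()) {
                break;
            }
            attempts++;
            if (sink.ship(new ArrayList<>(pending))) {
                pending.clear();
            }
            // On failure, fall through to the top of the loop and recheck.
        }
        return attempts;
    }
}
```

In this sketch, removing REPLICATION_SCOPE from the table (modeled here by `stillReplicated` starting to return false) ends the loop on the next pass, which is the escape route the earlier comment found to be missing. A real loop would also sleep between passes.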
[jira] [Updated] (HBASE-8229) Replication code logs like crazy if a target table cannot be found.
[ https://issues.apache.org/jira/browse/HBASE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jieshan Bean updated HBASE-8229:
--------------------------------
    Component/s: Replication

Replication code logs like crazy if a target table cannot be found.
-------------------------------------------------------------------

                Key: HBASE-8229
                URL: https://issues.apache.org/jira/browse/HBASE-8229
            Project: HBase
         Issue Type: Bug
         Components: Replication
           Reporter: Lars Hofhansl
            Fix For: 0.95.0, 0.98.0, 0.94.7