[jira] [Updated] (HBASE-8229) Replication code logs like crazy if a target table cannot be found.

2013-04-17 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-8229:
-

Assignee: Lars Hofhansl

 Replication code logs like crazy if a target table cannot be found.
 ---

 Key: HBASE-8229
 URL: https://issues.apache.org/jira/browse/HBASE-8229
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.94.7, 0.95.1

 Attachments: 8229-0.94.txt


 One of our RS/DN machines ran out of disk space on the partition to which we 
 write the log files.
 It turns out we still had a table in our source cluster with 
 REPLICATION_SCOPE=1 that did not have a matching table in the remote cluster.
 It then logged a long stack trace every 50ms or so, which over a few days 
 filled up our log partition.
 Since ReplicationSource cannot make any progress in this case anyway, it 
 should probably sleep a bit before retrying (or at least limit the rate at 
 which it spews out these exceptions to the log).
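
A minimal sketch of the "sleep a bit before retrying" idea, assuming a fixed
base delay that is scaled by the number of consecutive failures and capped by
a maximum multiplier. The class RetryBackoff, its method names, and its
parameters are hypothetical; they are not taken from the attached patch or
from ReplicationSource itself.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not HBase code): capped, linearly growing retry delay. */
final class RetryBackoff {
    private final long baseMillis;
    private final int maxMultiplier;

    RetryBackoff(long baseMillis, int maxMultiplier) {
        if (baseMillis <= 0 || maxMultiplier < 1) {
            throw new IllegalArgumentException(
                "baseMillis must be > 0 and maxMultiplier >= 1");
        }
        this.baseMillis = baseMillis;
        this.maxMultiplier = maxMultiplier;
    }

    /** Delay for the given failure count: base * clamp(failures, 1, maxMultiplier). */
    long delayMillis(int consecutiveFailures) {
        int multiplier = Math.max(1, Math.min(consecutiveFailures, maxMultiplier));
        return baseMillis * multiplier;
    }

    /** Sleeps for the computed delay; returns false if the thread was interrupted. */
    boolean sleepBeforeRetry(int consecutiveFailures) {
        try {
            TimeUnit.MILLISECONDS.sleep(delayMillis(consecutiveFailures));
            return true;
        } catch (InterruptedException e) {
            // Restore the interrupt flag so the caller can shut down cleanly.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With a base of 1000 ms and a cap of 10, the tenth and every later consecutive
failure waits 10 s. If one stack trace is logged per retry, that is at most one
trace every 10 s instead of roughly twenty per second at a 50 ms interval.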

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-8229) Replication code logs like crazy if a target table cannot be found.

2013-04-03 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-8229:
-

Fix Version/s: (was: 0.95.0)
   0.95.1

 Replication code logs like crazy if a target table cannot be found.
 ---

 Key: HBASE-8229
 URL: https://issues.apache.org/jira/browse/HBASE-8229
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Lars Hofhansl
 Fix For: 0.95.1, 0.98.0, 0.94.7

 Attachments: 8229-0.94.txt, 8229-0.94-V2.txt




[jira] [Updated] (HBASE-8229) Replication code logs like crazy if a target table cannot be found.

2013-04-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-8229:
-

Attachment: (was: 8229-0.94-V2.txt)

 Replication code logs like crazy if a target table cannot be found.
 ---

 Key: HBASE-8229
 URL: https://issues.apache.org/jira/browse/HBASE-8229
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Lars Hofhansl
 Fix For: 0.95.1, 0.98.0, 0.94.7

 Attachments: 8229-0.94.txt




[jira] [Updated] (HBASE-8229) Replication code logs like crazy if a target table cannot be found.

2013-04-01 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-8229:
-

Attachment: 8229-0.94.txt

Here's a simple 0.94 fix.
While looking at the code I also realized that there is no way to get out of 
this situation in the source cluster. Removing REPLICATION_SCOPE, disabling, or 
even dropping the table will not end this endless loop. Replication will never 
make any progress until the RS is restarted or the table is added to the peer 
cluster.

 Replication code logs like crazy if a target table cannot be found.
 ---

 Key: HBASE-8229
 URL: https://issues.apache.org/jira/browse/HBASE-8229
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Lars Hofhansl
 Fix For: 0.95.0, 0.98.0, 0.94.7

 Attachments: 8229-0.94.txt




[jira] [Updated] (HBASE-8229) Replication code logs like crazy if a target table cannot be found.

2013-04-01 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-8229:
-

Attachment: 8229-0.94-V2.txt

Here's a version that MIGHT fix that (untested). The idea is to return to 
the run() loop of ReplicationSource, so that the edits are rechecked (and 
not shipped to the peer if the local table's status has changed).
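
A rough sketch of that control flow, under heavy simplification: each pending
edit is modeled as just the name of its table, and "status has changed" is
modeled as a predicate that says whether the table is still in replication
scope. RecheckingShipper, Sink, and every parameter name are hypothetical;
this is not the V2 patch and does not use any HBase API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical model (not HBase code) of rechecking edits after a failed ship. */
final class RecheckingShipper {

    /** Stand-in for the call that ships a batch to the peer cluster. */
    interface Sink {
        void ship(List<String> batch) throws Exception;
    }

    /**
     * Runs until the batch is shipped, nothing is left to ship, or maxPasses
     * is reached. Returns the number of passes through the loop.
     */
    static int run(List<String> pending, Predicate<String> stillReplicated,
                   Sink sink, int maxPasses) {
        int passes = 0;
        while (passes < maxPasses) {
            passes++;
            // Recheck on every pass: drop edits whose table left replication scope.
            List<String> batch = new ArrayList<>();
            for (String table : pending) {
                if (stillReplicated.test(table)) {
                    batch.add(table);
                }
            }
            if (batch.isEmpty()) {
                return passes; // nothing left to ship, so the loop can move on
            }
            try {
                sink.ship(batch);
                return passes;
            } catch (Exception e) {
                // Do not retry the same batch in place; go back to the top of
                // the loop so the edits are filtered again first.
            }
        }
        return passes;
    }
}
```

The point of the design is that a failed ship does not spin on a stale batch:
once the local table is no longer in scope, the next pass finds nothing to
ship and the loop ends without contacting the peer again. A real
implementation would also sleep between passes.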


 Replication code logs like crazy if a target table cannot be found.
 ---

 Key: HBASE-8229
 URL: https://issues.apache.org/jira/browse/HBASE-8229
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Lars Hofhansl
 Fix For: 0.95.0, 0.98.0, 0.94.7

 Attachments: 8229-0.94.txt, 8229-0.94-V2.txt




[jira] [Updated] (HBASE-8229) Replication code logs like crazy if a target table cannot be found.

2013-03-31 Thread Jieshan Bean (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jieshan Bean updated HBASE-8229:


Component/s: Replication

 Replication code logs like crazy if a target table cannot be found.
 ---

 Key: HBASE-8229
 URL: https://issues.apache.org/jira/browse/HBASE-8229
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Lars Hofhansl
 Fix For: 0.95.0, 0.98.0, 0.94.7

