[ https://issues.apache.org/jira/browse/HBASE-13885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586449#comment-14586449 ]
Andrew Purtell commented on HBASE-13885:
----------------------------------------
+1
bq. FYI for 0.98.14. Or if we want, since this is pretty bad we can pull it
into the RC (0.98.14 seems fine, though)
0.98.13 is super late and we've been through a couple of RCs already. This
issue has been there with snapshots/procedure v1 since the beginning. It's
important but we can get to it when 0.98.14 goes out next month more or less on
the normal cadence.
HBASE-13901 adds a check in ZKUtil#isEmpty for this type of thing:
{code}
diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureMemberRpcs.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureMemberRpcs.java
index 8620558..114d735 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureMemberRpcs.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/procedure/ZKProcedureMemberRpcs.java
@@ -309,7 +309,10 @@ public class ZKProcedureMemberRpcs implements ProcedureMemberRpcs {
     // figure out the data we need to pass
     ForeignException ee;
     try {
-      if (!ProtobufUtil.isPBMagicPrefix(data)) {
+      if (data == null || data.length == 0) {
+        // ignore
+        return;
+      } else if (!ProtobufUtil.isPBMagicPrefix(data)) {
         String msg = "Illegally formatted data in abort node for proc " + opName
             + ". Killing the procedure.";
         LOG.error(msg);
{code}
I'll commit HBASE-13901 right after leaving this comment here. Could the patch
use ZKUtil#isEmpty instead of the inline if (data == null ...) check? Just a nit.
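
To make the nit concrete, here is a minimal, self-contained sketch of the guard ordering in the diff above: treat null or empty abort-node data as "nothing to do" before checking for the protobuf magic prefix. Everything in it is a stand-in, not HBase code. The class AbortDataGuard, the Action enum, and both helpers are hypothetical; the local isEmpty only mimics what ZKUtil#isEmpty is expected to do (its real signature is not shown in this thread), and the "PBUF" magic bytes are an assumption.

```java
import java.nio.charset.StandardCharsets;

public class AbortDataGuard {
    // Assumption: stand-in for the protobuf magic prefix that
    // ProtobufUtil.isPBMagicPrefix looks for.
    private static final byte[] PB_MAGIC = "PBUF".getBytes(StandardCharsets.US_ASCII);

    // Hypothetical outcomes for the data read from the abort znode.
    enum Action { IGNORE, ABORT_ILLEGAL_FORMAT, PARSE }

    // Hypothetical stand-in for ZKUtil#isEmpty: true for null or zero-length data.
    static boolean isEmpty(byte[] data) {
        return data == null || data.length == 0;
    }

    // Hypothetical stand-in for ProtobufUtil.isPBMagicPrefix.
    static boolean hasMagicPrefix(byte[] data) {
        if (data == null || data.length < PB_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PB_MAGIC.length; i++) {
            if (data[i] != PB_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // The empty check must come first; otherwise an empty node is
    // misreported as illegally formatted data.
    static Action classify(byte[] data) {
        if (isEmpty(data)) {
            return Action.IGNORE;
        }
        if (!hasMagicPrefix(data)) {
            return Action.ABORT_ILLEGAL_FORMAT;
        }
        return Action.PARSE;
    }

    public static void main(String[] args) {
        System.out.println(classify(null));
        System.out.println(classify(new byte[0]));
        System.out.println(classify("junk".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(classify("PBUFpayload".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

Routing the null/empty test through a single helper keeps the check consistent with other callers, which is the point of the nit.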
> ZK watches leaks during snapshots
> ---------------------------------
>
> Key: HBASE-13885
> URL: https://issues.apache.org/jira/browse/HBASE-13885
> Project: HBase
> Issue Type: Bug
> Components: snapshots
> Affects Versions: 0.98.12
> Reporter: Abhishek Singh Chouhan
> Assignee: Lars Hofhansl
> Priority: Critical
> Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.1
>
> Attachments: 13885-0.98-v2.txt, 13885-0.98-v3.txt, 13885-master.txt
>
>
> When taking a snapshot of a table, a watcher on
> /hbase/online-snapshot/abort/snapshot-name is created and is never cleared
> when the snapshot succeeds. If we use snapshots to take daily backups, we
> accumulate a lot of watches.
> Steps to reproduce:
> 1) Take a snapshot of a table: snapshot 'table_1', 'abc'
> 2) Run the following on a ZK node, or alternatively observe the ZK watches metric:
> echo "wchc" | nc localhost 2181
> /hbase/online-snapshot/abort/abc can be found in the output.