Apparent data loss on 90.4 rc2 after partial zookeeper network partition (on MapR)

Jacques Tue, 02 Aug 2011 15:44:39 -0700

Given the hearty reviews and timing, we recently shifted from 90.3 (apache)
to 90.4rc2 (the July 24th one that Stack posted -- 0.90.4, r1150278).


We had a network switch go down last night, which caused an apparent network
partition between two of our region servers and one or more zk nodes.
(We're still piecing together the situation.)  Anyway, things *seemed* to
recover fine.  However, this morning we realized that we lost some data that
was generated just before the problems occurred.

It looks like h002 went down nearly immediately at around 8pm while h001
didn't go down until around 8:10pm (somewhat confused by this).  We're
thinking that this may have contributed to the problem.  The particular
table that had data issues is a very small table with a single region that
was running on h002 when it went down.

We know the corruption/lack of edits affected two tables.  It extended
across a number of rows and appears to reach back to data inserted up to
roughly 6 hours earlier (estimate).  The two tables we can verify errors on
each probably hold at most 10-20k rows, each under 1k in size.  In some
places, rows that were added are completely missing; in others, rows are
just missing cell edits.  As an aside, I was thinking there was a time-based
memstore flush in addition to the size-based one.  But upon reviewing the
hbase default configuration, I don't see mention of it.  Is flushing purely
size based?

We don't have the tools in place to verify exactly what other data or tables
may have been impacted.

The log files are at the pastebin links below.  The whole cluster is 8
nodes + master, with 3 zk nodes running on separate machines.  We run with
mostly standard settings but do have the following settings:
heap: 12gb
regionsize 4gb, (due to lots of cold data and not enough servers, avg 300
regions/server)
mslab: 4m/512k (due to somewhat frequent updates to larger objects in the
200-500k size range)

We've been using hbase for about a year now and have been nothing but happy
with it.  The failure state that we had last night (where only some region
servers cannot talk to some zk servers) seems like a strange one.

Any thoughts? (beyond chiding for switching to an rc)  Any opinions on whether
we should roll back to 90.3 (or 90.3+cloudera)?

Thanks for any help,
Jacques

master: http://pastebin.com/aG8fm2KZ
h001: http://pastebin.com/nLLk06EC
h002: http://pastebin.com/0wPFuZDx
h003: http://pastebin.com/3ZMV01mA
h004: http://pastebin.com/0YVefuqS
h005: http://pastebin.com/N90LDjvs
h006: http://pastebin.com/gM8umekW
h007: http://pastebin.com/0TVvX68d
h008: http://pastebin.com/mV968Cem
