Eric, I'm currently digging through the logs and will report back. Keep in mind that I'm using Accumulo 1.5.0 on a Hadoop 2.2.0 stack. To check for data loss, I run a 'ConsistencyCheckingIterator' that verifies each row has the expected data (it takes a long time to scan the whole table); a minimal sketch of the idea follows this summary. Below is a quick summary of what happened. The tablet in question is "d;72~gcm~201304". Notice that it is assigned to 192.168.2.233:9997[343bc1fa155242c] at 2014-01-25 09:49:36,233. At 2014-01-25 09:49:54,141, that tserver goes away. The tablet then gets assigned to 192.168.2.223:9997[143bc1f14412432], and shortly after that I see the BadLocationStateException. The master never recovers from the BLSE; I have to manually delete one of the offending locations (I've included a sketch of how I find them after the log excerpts).
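
For reference, the check boils down to something like the following. This is only a minimal client-side sketch of the idea, not the actual iterator (which runs server-side): the instance name, ZooKeeper hosts, credentials, table name, and the expectedValueFor() helper are all placeholders, and the real checker also knows which rows must exist, so it can flag missing rows as well as wrong values.

import java.util.Arrays;
import java.util.Map;

import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;

public class ConsistencyCheck {
  public static void main(String[] args) throws Exception {
    // Placeholder connection details (Accumulo 1.5 client API).
    Connector conn = new ZooKeeperInstance("instance", "zk1:2181,zk2:2181")
        .getConnector("root", new PasswordToken("secret"));

    Scanner scanner = conn.createScanner("mytable", Authorizations.EMPTY);
    long entries = 0, bad = 0;
    for (Map.Entry<Key,Value> e : scanner) {
      entries++;
      // expectedValueFor() stands in for whatever derives the expected cell
      // contents from the key; in my case it is deterministic per row.
      byte[] expected = expectedValueFor(e.getKey());
      if (!Arrays.equals(expected, e.getValue().get())) {
        bad++;
        System.err.println("unexpected data at " + e.getKey());
      }
    }
    System.out.println(entries + " entries scanned, " + bad + " inconsistent");
  }

  private static byte[] expectedValueFor(Key key) {
    // Placeholder: derive the expected value from the row id.
    return key.getRow().copyBytes();
  }
}
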
2014-01-25 09:49:36,233 [master.Master] DEBUG: Normal Tablets assigning tablet d;72~gcm~201304;72=192.168.2.233:9997[343bc1fa155242c]
2014-01-25 09:49:36,233 [master.Master] DEBUG: Normal Tablets assigning tablet p;18~thm~2012101;18=192.168.2.233:9997[343bc1fa155242c]
2014-01-25 09:49:54,141 [master.Master] WARN : Lost servers [192.168.2.233:9997[343bc1fa155242c]]
2014-01-25 09:49:56,866 [master.Master] DEBUG: 42 assigned to dead servers: [d;03~u36~201302;03~thm~2012091@(null,192.168.2.233:9997[343bc1fa155242c],null), d;06~u36~2013;06~thm~2012083@(null,192.168.2.233:9997[343bc1fa155242c],null), d;25;24~u36~2013@(null,192.168.2.233:9997[343bc1fa155242c],null), d;25~u36~201303;25~thm~201209@(null,192.168.2.233:9997[343bc1fa155242c],null), d;27~gcm~2013041;27@(null,192.168.2.233:9997[343bc1fa155242c],null), d;30~u36~2013031;30~thm~2012082@(null,192.168.2.233:9997[343bc1fa155242c],null), d;34~thm;34~gcm~2013022@(null,192.168.2.233:9997[343bc1fa155242c],null), d;39~thm~20121;39~gcm~20130418@(null,192.168.2.233:9997[343bc1fa155242c],null), d;41~thm;41~gcm~2013041@(null,192.168.2.233:9997[343bc1fa155242c],null), d;42~u36~201304;42~thm~20121@(null,192.168.2.233:9997[343bc1fa155242c],null), d;45~thm~201208;45~gcm~201303@(null,192.168.2.233:9997[343bc1fa155242c],null), d;48~gcm~2013052;48@(null,192.168.2.233:9997[343bc1fa155242c],null), d;60~u36~2013021;60~thm~20121@(null,192.168.2.233:9997[343bc1fa155242c],null), d;68~gcm~2013041;68@(null,192.168.2.233:9997[343bc1fa155242c],null), d;72;71~u36~2013@(null,192.168.2.233:9997[343bc1fa155242c],null), d;72~gcm~201304;72@(192.168.2.233:9997[343bc1fa155242c],null,null), d;75~thm~2012101;75~gcm~2013032@(null,192.168.2.233:9997[343bc1fa155242c],null), d;78;77~u36~201305@(null,192.168.2.233:9997[343bc1fa155242c],null), d;90~u36~2013032;90~thm~2012092@(null,192.168.2.233:9997[343bc1fa155242c],null), d;91~thm;91~gcm~201304@(null,192.168.2.233:9997[343bc1fa155242c],null), d;93~u36~2013012;93~thm~20121@(null,192.168.2.233:9997[343bc1fa155242c],null), m;20;19@(null,192.168.2.233:9997[343bc1fa155242c],null), m;38;37@(null,192.168.2.233:9997[343bc1fa155242c],null), m;51;50@(null,192.168.2.233:9997[343bc1fa155242c],null), m;60;59@(null,192.168.2.233:9997[343bc1fa155242c],null), m;92;91@(null,192.168.2.233:9997[343bc1fa155242c],null), o;01<@(null,192.168.2.233:9997[343bc1fa155242c],null), o;04;03@(null,192.168.2.233:9997[343bc1fa155242c],null), o;50;49@(null,192.168.2.233:9997[343bc1fa155242c],null), o;63;62@(null,192.168.2.233:9997[343bc1fa155242c],null), o;74;73@(null,192.168.2.233:9997[343bc1fa155242c],null), o;97;96@(null,192.168.2.233:9997[343bc1fa155242c],null), p;08~thm~20121;08@(null,192.168.2.233:9997[343bc1fa155242c],null), p;09~thm~20121;09@(null,192.168.2.233:9997[343bc1fa155242c],null), p;10;09~thm~20121@(null,192.168.2.233:9997[343bc1fa155242c],null), p;18~thm~2012101;18@(192.168.2.233:9997[343bc1fa155242c],null,null), p;21;20~thm~201209@(null,192.168.2.233:9997[343bc1fa155242c],null), p;22~thm~2012091;22@(null,192.168.2.233:9997[343bc1fa155242c],null), p;23;22~thm~2012091@(null,192.168.2.233:9997[343bc1fa155242c],null), p;41~thm~2012111;41@(null,192.168.2.233:9997[343bc1fa155242c],null), p;42;41~thm~2012111@(null,192.168.2.233:9997[343bc1fa155242c],null), p;58~thm~201208;58@(null,192.168.2.233:9997[343bc1fa155242c],null)]...
2014-01-25 09:49:59,706 [master.Master] DEBUG: Normal Tablets assigning tablet d;72~gcm~201304;72=192.168.2.223:9997[143bc1f14412432]
2014-01-25 09:50:13,515 [master.EventCoordinator] INFO : tablet d;72~gcm~201304;72 was loaded on 192.168.2.223:9997
2014-01-25 09:51:20,058 [state.MetaDataTableScanner] ERROR: java.lang.RuntimeException: org.apache.accumulo.server.master.state.TabletLocationState$BadLocationStateException: found two locations for the same extent d;72~gcm~201304: 192.168.2.223:9997[143bc1f14412432] and 192.168.2.233:9997[343bc1fa155242c]
java.lang.RuntimeException: org.apache.accumulo.server.master.state.TabletLocationState$BadLocationStateException: found two locations for the same extent d;72~gcm~201304: 192.168.2.223:9997[143bc1f14412432] and 192.168.2.233:9997[343bc1fa155242c]
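
For what it's worth, this is roughly how I spot the offending entries before deleting one by hand. It is only a sketch: the connection details are placeholders, and it just reports extents that have more than one entry in the current-location ("loc") column family; I still remove the stale entry (the one pointing at the dead tserver) by hand.

import java.util.HashMap;
import java.util.Map;

import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.hadoop.io.Text;

public class FindDuplicateLocations {
  public static void main(String[] args) throws Exception {
    // Placeholder connection details.
    Connector conn = new ZooKeeperInstance("instance", "zk1:2181,zk2:2181")
        .getConnector("root", new PasswordToken("secret"));

    // In 1.5.0 the metadata table is "!METADATA" and "loc" is the
    // current-location family (METADATA_CURRENT_LOCATION_COLUMN_FAMILY).
    Scanner scanner = conn.createScanner("!METADATA", Authorizations.EMPTY);
    scanner.fetchColumnFamily(new Text("loc"));

    Map<String,Integer> locCounts = new HashMap<String,Integer>();
    for (Map.Entry<Key,Value> e : scanner) {
      String extent = e.getKey().getRow().toString();
      Integer n = locCounts.get(extent);
      locCounts.put(extent, n == null ? 1 : n + 1);
      System.out.println(e.getKey() + " -> " + e.getValue());
    }

    for (Map.Entry<String,Integer> e : locCounts.entrySet()) {
      if (e.getValue() > 1) {
        System.err.println("extent " + e.getKey() + " has " + e.getValue()
            + " current locations");
      }
    }
  }
}
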
On Mon, Jan 27, 2014 at 8:53 AM, Eric Newton <[email protected]> wrote:

> Having two "last" locations... is annoying, and useless. Having two "loc" locations is disastrous. We do a *lot* of testing that verifies that data is not lost, with live ingest and with bulk ingest, and just about every other condition you can imagine. Presently, this testing is being done by me for 1.6.0 on Hadoop 2.2.0 and ZK 3.4.5.
>
> If you can provide any of the following, it would be helpful:
>
> * an automated test case that demonstrates the problem
> * logs that document what happened
> * a description of the *exact* things you did to detect data loss
>
> Please don't use the approximate counts displayed on the monitor pages to confirm ingest. These are known to be incorrect with both bulk ingested data and right after splits. The data is there, but the counts are just estimates.
>
> If you find you have verified data loss, please open a ticket, and provide as many details as you can, even if it does not happen consistently.
>
> Thanks!
>
> -Eric
>
> On Mon, Jan 27, 2014 at 7:57 AM, Anthony F <[email protected]> wrote:
>
>> I took a look in the code... the stack trace is not quite the same. In 1.6.0, the fixed issue was related to METADATA_LAST_LOCATION_COLUMN_FAMILY. The issue I am seeing (in 1.5.0) is related to METADATA_CURRENT_LOCATION_COLUMN_FAMILY (line 144).
>>
>> On Sun, Jan 26, 2014 at 7:00 PM, Anthony F <[email protected]> wrote:
>>
>>> The stack trace is pretty close and the steps to reproduce match the scenario in which I observed the issue. But there's no fix (in Jira) against 1.5.0, just 1.6.0.
>>>
>>> On Sun, Jan 26, 2014 at 5:56 PM, Josh Elser <[email protected]> wrote:
>>>
>>>> Just because the error message is the same doesn't mean that the root cause is also the same.
>>>>
>>>> Without looking more into Eric's changes, I'm not sure if ACCUMULO-2057 would also affect 1.5.0. We're usually pretty good about checking backwards when bugs are found in newer versions, but things slip through the cracks, too.
>>>>
>>>> On 1/26/2014 5:09 PM, Anthony F wrote:
>>>>
>>>>> This is pretty much the issue:
>>>>>
>>>>> https://issues.apache.org/jira/browse/ACCUMULO-2057
>>>>>
>>>>> Slightly different error message, but it's a different version. Looks like it's fixed in 1.6.0. I'll probably need to upgrade.
>>>>>
>>>>> On Sun, Jan 26, 2014 at 4:47 PM, Anthony F <[email protected]> wrote:
>>>>>
>>>>> Thanks, I'll check Jira. As for versions: Hadoop 2.2.0, ZK 3.4.5, CentOS 64-bit (kernel 2.6.32-431.el6.x86_64). Has much testing been done using Hadoop 2.2.0? I tried Hadoop 2.0.0 (CDH 4.5.0) but ran into HDFS-5225/5031, which basically makes it a non-starter.
>>>>>
>>>>> On Sun, Jan 26, 2014 at 4:29 PM, Josh Elser <[email protected]> wrote:
>>>>>
>>>>> I meant to reply to your original email, but I didn't yet, sorry.
>>>>>
>>>>> First off, if Accumulo is reporting that it found multiple locations for the same extent, this is a (very bad) bug in Accumulo. It might be worth looking at tickets that are marked as "affects 1.5.0" and "fixed in 1.5.1" on Jira. It's likely that we've already encountered and fixed the issue, but, if you can't find a fix that was already made, we don't want to overlook the potential need for one.
>>>>>
>>>>> For both "live" and "bulk" ingest, *neither* should lose any data. This is one thing that Accumulo should never be doing. If you have multiple locations for an extent, it seems plausible to me that you would run into data loss. However, you should focus on trying to determine why you keep running into multiple locations for a tablet.
>>>>>
>>>>> After you take a look at Jira, I would likely go ahead and file a jira to track this since it's easier to follow than an email thread. Be sure to note if there is anything notable about your installation (did you download it directly from the accumulo.apache.org site?). You should also include what OS and version and what Hadoop and ZooKeeper versions you are running.
>>>>>
>>>>> On 1/26/2014 4:10 PM, Anthony F wrote:
>>>>>
>>>>> I have observed a loss of data when tservers fail during bulk ingest. The keys that are missing are right around the table's splits, indicating that data was lost when a tserver died during a split. I am using Accumulo 1.5.0. At around the same time, I observe the master logging a message about "Found two locations for the same extent". Can anyone shed light on this behavior? Are tserver failures during bulk ingest supposed to be fault tolerant?
