Re: data loss around splits when tserver goes down

Josh Elser Sun, 26 Jan 2014 15:02:49 -0800

Hadoop 2.2.0 wasn't released before Accumulo 1.5.0, so it's impossible to have tested that then :)

I've personally done extensive testing of 1.5.1-SNAPSHOT and 1.6.0-SNAPSHOT with Hadoop 2.2.0. I know others have also been doing the same.


On 1/26/2014 4:47 PM, Anthony F wrote:

Thanks, I'll check Jira.  As for versions, Hadoop 2.2.0, Zk 3.4.5,
CentOS 64bit (kernel 2.6.32-431.el6.x86_64).  Has much testing been done
using Hadoop 2.2.0?  I tried Hadoop 2.0.0 (CDH 4.5.0) but ran into
HDFS-5225/5031 which basically makes it a non-starter.


On Sun, Jan 26, 2014 at 4:29 PM, Josh Elser wrote:

    I meant to reply to your original email, but I didn't yet, sorry.

    First off, if Accumulo is reporting that it found multiple locations
    for the same extent, this is a (very bad) bug in Accumulo. It might
    be worth looking at tickets that are marked as "affects 1.5.0" and
    "fixed in 1.5.1" on Jira. It's likely that we've already encountered
    and fixed the issue, but, if you can't find a fix that was already
    made, we don't want to overlook the potential need for one.

    For both "live" and "bulk" ingest, *neither* should lose any data.
    This is one thing that Accumulo should never be doing. If you have
    multiple locations for an extent, it seems plausible to me that you
    would run into data loss. However, you should focus on trying to
    determine why you keep running into multiple locations for a tablet.

    After you take a look at Jira, I would likely go ahead and file a
    jira to track this since it's easier to follow than an email thread.
    Be sure to note if there is anything notable about your installation
    (did you download it directly from the accumulo.apache.org site?).
    You should also include your OS and version, and the Hadoop and
    ZooKeeper versions you are running.


    On 1/26/2014 4:10 PM, Anthony F wrote:

        I have observed a loss of data when tservers fail during bulk
        ingest. The keys that are missing are right around the table's
        splits, indicating that data was lost when a tserver died during
        a split.  I am using Accumulo 1.5.0.  At around the same time, I
        observe the master logging a message about "Found two locations
        for the same extent".  Can anyone shed light on this behavior?
        Are tserver failures during bulk ingest supposed to be fault
        tolerant?
