Re: WAL - rate limiting factor x4.67

Josh Elser Wed, 04 Dec 2013 08:06:39 -0800

Peter --

I don't know if this was made entirely clear.

The reason that things are much slower when you have the WAL turned on is that you're suddenly writing N extra copies of your data to disk. When you don't have the WAL turned on, you're simply writing to Accumulo's in-memory data structures, which are fast.

Keith's suggestion would amortize the number of times you write to disk. Adam's suggestion will, like he said, reduce the number of copies you write to disk. There is no configuration that you can make that will make writing data with WALs as fast as writing without the WAL in normal situations. Writing one copy to memory will likely always be faster than writing to memory and writing multiple copies to disk.
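
To make the trade-off concrete, here is a toy cost model in Python. It is only a sketch: the function name, the cost units, and every number in it are made up for illustration and are not measurements from Accumulo or HDFS. It assumes the WAL cost scales with the number of copies and is shared by the mutations that go out in one sync.

```python
def cost_per_mutation(mem_cost, disk_cost_per_copy, copies, mutations_per_sync):
    """Toy per-mutation cost: one in-memory write plus the WAL disk cost,
    spread over the mutations that share a single sync."""
    wal_cost = copies * disk_cost_per_copy / mutations_per_sync
    return mem_cost + wal_cost


# Hypothetical cost units, chosen only to show the direction of each change.
MEM_COST = 1.0
DISK_COST = 50.0

no_wal = cost_per_mutation(MEM_COST, DISK_COST, copies=0, mutations_per_sync=1)
baseline = cost_per_mutation(MEM_COST, DISK_COST, copies=3, mutations_per_sync=10)
# Keith's suggestion: more mutations per sync (a larger mutation queue).
bigger_batches = cost_per_mutation(MEM_COST, DISK_COST, copies=3, mutations_per_sync=100)
# Adam's suggestion: fewer WAL copies (lower replication).
fewer_copies = cost_per_mutation(MEM_COST, DISK_COST, copies=2, mutations_per_sync=10)

for name, cost in [
    ("no WAL", no_wal),
    ("WAL baseline", baseline),
    ("bigger batches", bigger_batches),
    ("fewer copies", fewer_copies),
]:
    print(f"{name:15s} {cost:6.2f}")
```

Both changes lower the WAL term, but as long as there is at least one copy that term never reaches zero, which is why the no-WAL case stays cheapest in this model.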


- Josh

On 12/4/13, 10:53 AM, Peter Tillotson wrote:

Keith

I tried tserver.mutation.queue.max=4M and it improved, but by nowhere
near a significant difference. In my app, records get turned into multiple
Accumulo rows.

So, in terms of my record write rate:

wal=true  & mutation.queue.max = 256K  |  ~8K records/s
wal=true  & mutation.queue.max = 4M    |  ~14K records/s
wal=false                              |  ~25K records/s
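
For reference, the relative slowdowns implied by those three figures can be worked out with a few lines of Python. This is just arithmetic on the approximate rates quoted above; the dictionary keys are labels made up for this sketch.

```python
# Approximate record write rates from the table above (records/s).
rates = {
    "wal=true, queue.max=256K": 8_000,
    "wal=true, queue.max=4M": 14_000,
    "wal=false": 25_000,
}

no_wal_rate = rates["wal=false"]

# How many times slower each configuration is than running without the WAL.
slowdown = {name: no_wal_rate / rate for name, rate in rates.items()}

# Gain from raising the mutation queue size while keeping the WAL on.
queue_gain = rates["wal=true, queue.max=4M"] / rates["wal=true, queue.max=256K"]

for name, factor in slowdown.items():
    print(f"{name:26s} {rates[name]:>6d} rec/s  {factor:.2f}x slower than wal=false")
print(f"queue.max 256K -> 4M: {queue_gain:.2f}x faster")
```

So the larger queue recovers a good part of the gap, but the WAL run is still well short of the no-WAL rate.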

Adam,

It's one box so replication is off; good thought, tnx.

BTW - I've been playing around with ZFS compression vs Accumulo Snappy.
What I've found was quite interesting. The idea was that with ZFS dedup
and being in charge of compression I'd get a boost later on when blocks
merge. What I've found is that after a while with ZFS LZ4 the CPU and
disk all tail off, as though timeouts are elapsing somewhere whereas
SNAPPY maintains an average ~20k+.

Anyway, tnx, and if I get a chance I may try the 1.7 branch for the fix.


On Wednesday, 4 December 2013, 14:56, Adam Fuchs <[email protected]> wrote:
One thing you can do is reduce the replication factor for the WAL. We
have found that makes a pretty significant difference in write
performance. That can be modified with the tserver.wal.replication
property. Setting it to 2 instead of the default (probably 3) should
give you some performance improvement, of course at some cost to
durability.

Adam


On Wed, Dec 4, 2013 at 5:14 AM, Peter Tillotson <[email protected]> wrote:

    I've been trying to get the most out of streaming data into Accumulo
    1.5 (Hadoop Cloudera CDH4). Having tried a number of settings,
    re-writing client code etc I finally switched off the Write Ahead
    Log (table.walog.enabled=false) and saw a huge leap in ingest
    performance.

    Ingest with table.walog.enabled= true:   ~6 MB/s
    Ingest with table.walog.enabled= false:  ~28 MB/s

    That is a factor of about x4.67 speed improvement.

    Now my use case could probably live without or work around not
    having a WAL, but I wondered if this was a known issue
    (didn't see anything in JIRA). The WAL seems to be a significant rate
    limiter; this is either endemic to Accumulo or an HDFS / setup issue.
    Though given everything is in HDFS these days and otherwise IO flies,
    it looks like the Accumulo WAL is the most likely culprit.

    I don't believe this to be an IO issue on the box: with wal off there
    is significantly more IO (up to 80M/s reported by dstat) than with wal
    on (up to 12M/s reported by dstat). Testing the box with FIO,
    sequential write is 160M/s.
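
    Put as a fraction of the FIO figure, the dstat numbers above look like this. The Python below is only arithmetic on the quoted figures; it assumes "M/s" means MB/s throughout, and the constant and function names are made up for this sketch.

```python
# Figures quoted in the message above; "M/s" is assumed to mean MB/s.
FIO_SEQ_WRITE = 160.0  # fio sequential write result
DSTAT_WAL_OFF = 80.0   # peak disk IO reported by dstat with the WAL off
DSTAT_WAL_ON = 12.0    # peak disk IO reported by dstat with the WAL on


def fraction_of_peak(observed, peak=FIO_SEQ_WRITE):
    """Share of the measured sequential write bandwidth that was in use."""
    return observed / peak


wal_off_util = fraction_of_peak(DSTAT_WAL_OFF)
wal_on_util = fraction_of_peak(DSTAT_WAL_ON)

print(f"wal off: {wal_off_util:.1%} of sequential write bandwidth")
print(f"wal on:  {wal_on_util:.1%} of sequential write bandwidth")
```

    This only compares raw bandwidth. It says nothing about the latency of each sync to disk, which a sequential write benchmark does not capture.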

    Further info:
    Hadoop 2.00 (Cloudera cdh4)
    Accumulo (1.5.0)
    Zookeeper ( with Netty, minor improvement of <1MB/s  )
    Filesystem ( HDFS is ZFS, compression=on, dedup=on, otherwise ext4 )

    With large imports from scratch now I start off CPU bound and as
    more shuffling is needed this becomes Disk bound later in the import
    as expected. So I know pre-splitting would probably sort it.

    Tnx

    P
