Re: WAL - rate limiting factor x4.67

Keith Turner Wed, 04 Dec 2013 08:10:37 -0800

How many concurrent writers do you have?  I made some other comments below
inline.



On Wed, Dec 4, 2013 at 10:53 AM, Peter Tillotson <[email protected]> wrote:

> Keith
>
> I tried tserver.mutation.queue.max=4M and it improved, but by nowhere near
> a significant amount. In my app, records get turned into multiple
> Accumulo rows.
>
> So, in terms of my record write rate:
>
> wal=true  & mutation.queue.max = 256K    |   ~8K records/s
> wal=true  & mutation.queue.max = 4M      |   ~14K records/s
>

Do you know if it has plateaued?  If you increase this further (like 8M), is
the rate the same?


> wal=false                                |   ~25K records/s
>
> Adam,
>
> It's one box, so replication is off. Good thought, tnx.
>
> BTW - I've been playing around with ZFS compression vs Accumulo Snappy.
> What I've found was quite interesting. The idea was that, with ZFS doing
> dedup and being in charge of compression, I'd get a boost later on when
> blocks merge. What I've found is that after a while with ZFS LZ4 the CPU
> and disk all tail off, as though timeouts are elapsing somewhere, whereas
> SNAPPY maintains an average of ~20k+.
>

With this strategy, the data will not be compressed when going between the
tserver and the datanode, or between the datanode and the OS.


>
> Anyway, tnx, and if I get a chance I may try the 1.7 branch for the fix.
>

Nothing was done in 1.7 for this issue yet.


>
>
>
>   On Wednesday, 4 December 2013, 14:56, Adam Fuchs <[email protected]>
> wrote:
>  One thing you can do is reduce the replication factor for the WAL. We
> have found that makes a pretty significant difference in write performance.
> That can be modified with the tserver.wal.replication property. Setting it
> to 2 instead of the default (probably 3) should give you some performance
> improvement, of course at some cost to durability.
>
> Adam
>
>
> On Wed, Dec 4, 2013 at 5:14 AM, Peter Tillotson <[email protected]> wrote:
>
> I've been trying to get the most out of streaming data into Accumulo 1.5
> (Hadoop Cloudera CDH4). Having tried a number of settings, re-writing
> client code, etc., I finally switched off the Write Ahead Log
> (table.walog.enabled=false) and saw a huge leap in ingest performance.
>
> Ingest with table.walog.enabled=true:    ~6 MB/s
> Ingest with table.walog.enabled=false:   ~28 MB/s
>
> That is a speed improvement of about x4.67.
>
> Now my use case could probably live without, or work around not having, a
> WAL, but I wondered if this was a known issue (I didn't see anything in
> JIRA). The WAL seems to be a significant rate limiter; this is either
> endemic to Accumulo or an HDFS / setup issue. Though given everything is in
> HDFS these days and otherwise IO flies, it looks like the Accumulo WAL is
> the most likely culprit.
>
> I don't believe this to be an IO issue on the box. With the WAL off there
> is significantly more IO (up to 80M/s reported by dstat) than with the WAL
> on (up to 12M/s reported by dstat). Testing the box with FIO, sequential
> write is 160M/s.
>
> Further info:
> Hadoop 2.00 (Cloudera cdh4)
> Accumulo (1.5.0)
> Zookeeper ( with Netty, minor improvement of <1MB/s  )
> Filesystem ( HDFS is ZFS, compression=on, dedup=on, otherwise ext4 )
>
> With large imports from scratch, I now start off CPU bound and, as more
> shuffling is needed, this becomes disk bound later in the import, as
> expected. So I know pre-splitting would probably sort it.
>
> Tnx
>
> P
>
>
>
>
>
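
For reference, the ratios behind the figures quoted in this thread can be
rechecked with a few lines of Python. This is only a sketch: the helper names
(speedup, utilization) and the dictionary keys are made up for illustration,
and the numbers are the approximate values reported above, not new
measurements.

```python
# Approximate figures as quoted in the thread (all values are "~" estimates).
INGEST_MB_S = {"wal_on": 6.0, "wal_off": 28.0}   # table.walog.enabled=true/false
RECORDS_K_S = {
    "wal_on_queue_256K": 8.0,    # tserver.mutation.queue.max=256K
    "wal_on_queue_4M": 14.0,     # tserver.mutation.queue.max=4M
    "wal_off": 25.0,
}
DISK_MB_S = {"wal_on": 12.0, "wal_off": 80.0, "fio_sequential": 160.0}


def speedup(fast, slow):
    """Return how many times larger `fast` is than `slow`."""
    return fast / slow


def utilization(observed, ceiling):
    """Return observed throughput as a fraction of the measured ceiling."""
    return observed / ceiling


if __name__ == "__main__":
    # Headline factor from the subject line: WAL off vs WAL on, in MB/s.
    print("WAL off vs on (MB/s): x%.2f"
          % speedup(INGEST_MB_S["wal_off"], INGEST_MB_S["wal_on"]))

    # Effect of raising the mutation queue from 256K to 4M with the WAL on.
    print("queue 4M vs 256K: x%.2f"
          % speedup(RECORDS_K_S["wal_on_queue_4M"],
                    RECORDS_K_S["wal_on_queue_256K"]))

    # Remaining gap between WAL on (4M queue) and WAL off.
    print("WAL off vs WAL on with 4M queue: x%.2f"
          % speedup(RECORDS_K_S["wal_off"], RECORDS_K_S["wal_on_queue_4M"]))

    # Disk throughput seen by dstat relative to the FIO sequential-write rate.
    for mode in ("wal_on", "wal_off"):
        share = utilization(DISK_MB_S[mode], DISK_MB_S["fio_sequential"])
        print("%s disk use: %.1f%% of FIO rate" % (mode, share * 100))
```

With the WAL on, the disk rate reported by dstat is a small fraction of what
FIO measured, which fits Peter's reading that raw disk bandwidth is not the
bottleneck in that configuration.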
