MapReduce is currently outside my skill set.  I have forwarded the question to 
others on the team.

I have also asked the team whether they can give me the key specifications used 
for auto-delete in Bitcask.  Maybe, just maybe, I can slide the same logic into 
LevelDB's compaction algorithm.
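To make the idea concrete: Bitcask-style expiry could in principle be applied as a filter step during LevelDB compaction, dropping any entry older than its TTL instead of copying it into the new table file. A minimal sketch of that logic, assuming each value carries a write timestamp (the `written_at` field and 24-hour TTL are illustrative assumptions, not actual LevelDB internals):

```python
import time

TTL_SECONDS = 24 * 60 * 60  # hypothetical 24-hour lifetime, matching the use case below


def keep_during_compaction(entry, now=None):
    """Compaction-filter sketch: keep an entry only while it is younger than the TTL.

    `entry` is assumed to be a dict with a 'written_at' UNIX timestamp; a real
    LevelDB compaction hook would operate on raw key/value bytes instead.
    """
    now = time.time() if now is None else now
    return (now - entry["written_at"]) < TTL_SECONDS


# Applying the filter over an in-memory stand-in for one table file:
table = [
    {"key": "fresh", "written_at": 1_000_000},
    {"key": "stale", "written_at": 1_000_000 - 2 * TTL_SECONDS},
]
survivors = [e["key"] for e in table if keep_during_compaction(e, now=1_000_000 + 60)]
# survivors == ["fresh"]; "stale" would simply not be rewritten, reclaiming its space
```

The appeal of doing expiry here is that compaction already rewrites the data, so dropping expired entries costs no extra I/O.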

Matthew


On Nov 21, 2012, at 10:52 AM, <[email protected]> wrote:

> ---------- Original message ----------
>> From: Matthew Von-Maszewski 
>> Date: 22. 10. 2012
>> Subject: Re: Riak performance problems when LevelDB database grows beyond 
>> 16GB
>> Jan,
>> 
>> ...
>> The next question from me is whether the drive / disk array problems are 
>> your only problem at this point.  The data in log_jan.txt looks ok until 
>> the failures start.  I am willing to work more, but I need to better 
>> understand your next level of problems.
>> 
>> Matthew
> 
> Hi Matthew,
> 
> thanks again for helping me. It was actually bad RAM. It took me some time to 
> convince the hosting provider since the problem did not show up in their 
> hardware tests. :-/
> 
> I ran a 4-day test once the two bad nodes got fixed, and the original issue 
> (a Riak node getting stuck) did not appear again.
> 
> There are however other problems which seem to be caused by my "misuse" of 
> LevelDB for storing short-lived data (the data is valid only for 24 hours).
> 
> Here is the application throughput during a 4-day test with 5 Riak nodes:
> 
> http://janevangelista.rajce.idnes.cz/nastenka#5Riak_4d_edited.jpg
> 
> This graph shows memory use on a Riak node:
> 
> http://janevangelista.rajce.idnes.cz/nastenka/#MemNode3-edited.jpg
> 
> (Memory use on the other nodes looks similar, but the OOM killer was not 
> invoked there.)
> 
> And this graph shows disk space consumption on a Riak node:
> 
> http://janevangelista.rajce.idnes.cz/nastenka#DiskSpace-4d-edited.jpg
> 
> The OOM condition which killed one Riak node (and slowed down the other ones) 
> seems to be caused by the map-reduce jobs which periodically delete old data 
> from the database. The entries are deleted with a mapred job querying the 
> secondary index and using the reduce function published at 
> http://contrib.basho.com/delete_keys.html .
> 
> I wish LevelDB could expire old entries in the same way as Bitcask does. :-)
> 
> In an older 3-day test I had only a 5-minute timeout for the mapred jobs (a 
> bug). It caused premature cancellation of the jobs deleting the old data - but 
> the throughput was better:
> 
> http://janevangelista.rajce.idnes.cz/nastenka#4Riak_3d_8K_edited.jpg
> 
> The memory use looked reasonable as well:
> 
> http://janevangelista.rajce.idnes.cz/nastenka/#Memory-3d-edited.jpg
> 
> The disk use in this case was:
> 
> http://janevangelista.rajce.idnes.cz/nastenka#DiskSpace.jpg
> 
> The databases were approximately 85 GB.
> 
> So the only problem now seems to be how to get rid of the old data. Any hints?
> 
> Thanks, Jan
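For reference, the cleanup pattern Jan describes above (query a secondary index for entries older than a cutoff, then delete each matching key) can be sketched against an in-memory stand-in for the store. This simulates the flow only; the real job runs as a Riak MapReduce over a 2i range query, and the index/field names here are illustrative:

```python
# In-memory stand-in for a bucket: key -> object with an integer timestamp index.
store = {
    "k1": {"value": "old", "ts_int": 100},
    "k2": {"value": "new", "ts_int": 900},
}


def index_range_query(store, field, lo, hi):
    """Stand-in for a Riak secondary-index (2i) range query on an integer field."""
    return [k for k, obj in store.items() if lo <= obj[field] <= hi]


def delete_expired(store, cutoff):
    """Find keys whose timestamp is at or below the cutoff, then delete them."""
    expired = index_range_query(store, "ts_int", 0, cutoff)
    for key in expired:
        del store[key]  # in Riak this would be one delete request per key
    return expired


deleted = delete_expired(store, cutoff=500)
# deleted == ["k1"]; only "k2" remains in the store
```

The memory pressure Jan observed is consistent with this pattern: every expired key flows through the MapReduce pipeline before it can be deleted, which is far heavier than expiring data in place during compaction.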


_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
