In its current state, I think rbh v3 consumes more CPU than v2.5, because it relies much more on dynamic memory allocation where v2.5 used stack-allocated buffers. Other memory allocators such as jemalloc may help, but we still have to evaluate this for v3.

Can you specify which tests you are referring to, so I can try to find in my benchmark archives what hardware was used?

On 15/12/2015 21:10, Colin Faber wrote:
Hi Thomas,

Thanks for the reply. I think my problem is CPU related. For the benchmark testing you've done, can you detail the hardware platform you chose to achieve the numbers you saw?

On Tue, Dec 15, 2015 at 12:48 PM, Thomas LEIBOVICI (mail perso) <[email protected]> wrote:

    Hi Colin,

    Not sure what the limiting factor is in your case: do you think it
    is the DB access or the filesystem access?
    This doc can help you determine which one is the bottleneck:
    https://github.com/cea-hpc/robinhood/wiki/pipeline_tuning

    If it is the DB: I suggest disabling the "accounting" feature, which
    is a DB performance killer (and should be improved in future versions).
    Once it is disabled, you don't have to choose between bulk inserts
    and high thread counts: robinhood can do both, which commonly
    increases DB ingest performance by 3x-4x.
    (Consider removing your previous tunings of batch size and
    pipeline stage threads, so that you only keep a minimal tuning
    of the pipeline's nb_threads.)

    If the FS operation latency is the limiting factor, one option is to
    increase the number of operations that the Lustre client can
    process in parallel, by increasing "max_rpcs_in_flight" and the
    ko2iblnd peer_credits on the robinhood host, and on the MDS
    accordingly.
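
As a rough illustration of why allowing more operations in flight can help when latency dominates: by Little's law, sustained throughput is bounded by concurrency divided by per-operation latency. The sketch below is only a back-of-the-envelope estimate. The function name, the 2 ms latency and the concurrency values are hypothetical and are not measurements from this thread.

```python
def max_ops_per_second(in_flight: int, latency_s: float) -> float:
    """Upper bound on throughput from Little's law: concurrency / latency."""
    if in_flight <= 0 or latency_s <= 0:
        raise ValueError("in_flight and latency_s must be positive")
    return in_flight / latency_s


# Hypothetical numbers: 2 ms per metadata operation, three concurrency limits.
latency_s = 0.002
for in_flight in (8, 16, 32):
    bound = max_ops_per_second(in_flight, latency_s)
    print(f"{in_flight} in flight: at most {bound:.0f} ops/s")
```

The bound scales linearly with the number of requests in flight only as long as the server side (here, the MDS) can absorb the extra load, which is why the advice above is to raise the limits on both ends.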

    Thanks in advance for any update,
    Thomas


    On 09/12/2015 23:31, Colin Faber wrote:
    So I'm playing around with RBH on some reasonably powerful
    hardware. I've tried various database configurations of robinhood
    3 against MariaDB, with both MyISAM and InnoDB tables, but keep
    running into the same roadblocks performance-wise.

    My test is simple: on a quiet file system, generate 1 million
    files with 1 million records in changelogs, then fire off
    robinhood -r --once and time it (along with its internal timing).
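
The timing described above can be turned into a records-per-second figure, which is easier to compare between runs than a raw duration. This is a minimal sketch: the helper names are hypothetical, and only the `robinhood -r --once` command line and the 1 million record count come from the message.

```python
import shlex
import subprocess
import time


def records_per_second(records: int, elapsed_s: float) -> float:
    """Convert a record count and a wall-clock duration into a rate."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return records / elapsed_s


def time_command(command: str) -> float:
    """Run a command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(shlex.split(command), check=True)
    return time.perf_counter() - start


# Example use on a host where robinhood is installed and configured:
#   elapsed = time_command("robinhood -r --once")
#   print(f"{records_per_second(1_000_000, elapsed):.0f} records/s")
```

The external wall-clock time includes process startup and shutdown, so it is expected to be slightly higher than robinhood's internal timing.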

    The database runs on an E5-2609 based system with 32 GB of memory.
    I'm allowing MariaDB to eat up 26 GB of it and have sliced a few
    gigs off for a memory-backed file system which I run the tables
    off of (to eliminate possible array slowness).

    I've tried both strategies, high thread count and bulk inserts
    to the DB, but seem to be generally limited performance-wise.

    I'm wondering what tunings I should be focusing on here to
    achieve the results posted in the documentation.

    Thanks!

    -cf


    


    _______________________________________________
    robinhood-support mailing list
    [email protected]
    https://lists.sourceforge.net/lists/listinfo/robinhood-support








