Hi Sean,

this still seems to be nearly two orders of magnitude faster than what I observe. For us, 100 million keys (per node, obtained with riak_kv_bitcask_backend:key_counts()) take something like 500-600 seconds to load. In your example it's 1.5 seconds for (assuming N=3) 30 million * 3 / 5 = 18 million keys per node. That's 12 million keys/s vs. 0.2 million keys/s.
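
To spell out the arithmetic above, here is a small Python sketch. The helper names are made up for illustration, N=3 is my assumption about the replication factor, and the timings are the rough figures quoted in this thread, not new measurements.

```python
# Back-of-the-envelope comparison of Bitcask startup load rates.
# All inputs are the approximate numbers mentioned in this thread.

def keys_per_node(total_keys, n_val, nodes):
    """Replicated keys held by each node, assuming an even distribution."""
    return total_keys * n_val / nodes

def load_rate(keys, seconds):
    """Keys loaded per second during startup."""
    return keys / seconds

# Sean's example: 30 million keys, 5 nodes, N=3 (assumed), about 1.5 s per node.
sean_keys = keys_per_node(30_000_000, 3, 5)   # 18,000,000 keys per node
sean_rate = load_rate(sean_keys, 1.5)         # 12,000,000 keys/s

# Our cluster: about 100 million keys per node, 500-600 s until requests are served.
our_rate_best = load_rate(100_000_000, 500)   # 200,000 keys/s
our_rate_worst = load_rate(100_000_000, 600)  # roughly 167,000 keys/s

# Gap between the two observations.
ratio_low = sean_rate / our_rate_best         # 60x
ratio_high = sean_rate / our_rate_worst       # roughly 72x

print(f"Sean: {sean_rate:,.0f} keys/s, ours: {our_rate_worst:,.0f}-{our_rate_best:,.0f} keys/s")
print(f"Difference: {ratio_low:.0f}x to {ratio_high:.0f}x")
```

So the gap is somewhere between 60x and 72x, which is what I mean by "nearly two orders of magnitude".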

Cheers,
Nico

On Friday, 01.04.2011, at 10:27 -0400, Sean Cribbs wrote:
> Nico,
>
> That's fair, I was probably too rosy-eyed about it. However, the difference
> in startup times between Bitcask and Inno is still orders of magnitude for
> the same keyspace size. Now that I reflect on it, I remember a customer who
> had 30MM keys in a 5-node cluster; each node would take about 1.5 seconds for
> the node watcher to report riak_kv was available. Before they switched off of
> Innostore, it would take upwards of a minute to load and repair tables. YMMV
>
> Sean Cribbs <[email protected]>
> Developer Advocate
> Basho Technologies, Inc.
> http://basho.com/
>
> On Apr 1, 2011, at 10:11 AM, Nico Meyer wrote:
>
> > Hi Sean,
> >
> > I have to object here. We have a cluster of 8-core/64GB nodes with an
> > SSD drive for the bitcask dir. Each node holds on the order of 100 million
> > keys. The complete bitcask directory is only about 60GB big, so it fits
> > almost completely.
> > The time from starting the node until it starts handling requests, which
> > means all hint files have been read, is on the order of 10 minutes.
> > During this time the beam process is completely CPU bound; the disk is
> > hardly breaking a sweat.
> > Only 1 core is used at all, since there is only a single Erlang process
> > starting all the vnodes sequentially. Sometimes a second core is also
> > utilized, but that is due to merges on the already started partitions.
> >
> > Cheers,
> > Nico
> >
> > On Friday, 01.04.2011, at 08:45 -0400, Sean Cribbs wrote:
> >> Santhosh,
> >>
> >>
> >> Bitcask has a crash-proof design and so, unlike Inno, it will not read
> >> the entire keyspace and try to correct it at startup time. It will
> >> simply load the existing hint files and then scan the files it doesn't
> >> have hints for to discover the extant keys. This takes milliseconds
> >> or less per partition; you will hardly notice it.
> >>
> >> Sean Cribbs <[email protected]>
> >> Developer Advocate
> >> Basho Technologies, Inc.
> >> http://basho.com/
> >>
> >> On Apr 1, 2011, at 2:49 AM, santhosh venkat wrote:
> >>
> >>> Hi,
> >>> I am trying to experiment with the recovery time of a Riak
> >>> node using Bitcask storage after a crash.
> >>>
> >>> I was able to find some information about that on this page
> >>> (which is for InnoDB though):
> >>>
> >>> http://wiki.basho.com/Recovering-a-failed-node.html which is
> >>> more about InnoDB.
> >>>
> >>> Upon reading the Bitcask paper I found it uses hint files to
> >>> construct the in-memory mapping, so it should not ideally take more
> >>> than a few minutes to reconstruct data after a crash. Please throw some
> >>> light on this.
> >>>
> >>> I got this thread dump when I tried the steps outlined in the
> >>> above link.
> >>>
> >>> =INFO REPORT==== 1-Apr-2011::12:06:50 ===
> >>> [{alarm_handler,{set,{{disk_almost_full,"/var/lib/mysql"},[]}}}]
> >>> =INFO REPORT==== 1-Apr-2011::12:06:50 ===
> >>> [{alarm_handler,{set,{{disk_almost_full,"/var/lib/riak"},[]}}}]**
> >>> Found 0 name clashes in code paths
> >>>
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_map) host starting (<0.141.0>)
> >>>
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_map) host starting (<0.142.0>)
> >>>
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_map) host starting (<0.143.0>)
> >>>
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_map) host starting (<0.144.0>)
> >>>
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_map) host starting (<0.145.0>)
> >>>
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_map) host starting (<0.146.0>)
> >>>
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_map) host starting (<0.147.0>)
> >>>
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_map) host starting (<0.148.0>)
> >>>
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_reduce) host starting (<0.150.0>)
> >>>
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_reduce) host starting (<0.151.0>)
> >>>
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_reduce) host starting (<0.152.0>)
> >>>
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_reduce) host starting (<0.153.0>)
> >>>
> >>> =INFO REPORT==== 1-Apr-2011::12:06:51 ===
> >>> Spidermonkey VM (thread stack: 16MB, max heap: 8MB, pool:
> >>> riak_kv_js_reduce) host starting (<0.154.0>)
> >>>
> >>> Please help.
> >>>
> >>> --
> >>> Santhosh
> >>>
> >>> _______________________________________________
> >>> riak-users mailing list
> >>> [email protected]
> >>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
