Sorry, I forgot half of it.

seffenberg@kriak46-1:~$ free -m
             total       used       free     shared    buffers     cached
Mem:         23999      23759        239          0        184      16183
-/+ buffers/cache:       7391      16607
Swap:            0          0          0

We have 12 servers. The datadir on the compacted servers (1.4.2) is ~765 GB.
AAE is enabled.

I attached app.config and vm.args.

Cheers
Simon

On Wed, 11 Dec 2013 07:33:31 -0500
Matthew Von-Maszewski <[email protected]> wrote:

> Ok, I am now suspecting that your servers are either using swap space (which
> is slow) or your leveldb file cache is thrashing (opening and closing
> multiple files per request).
>
> How many servers do you have and do you use Riak's active anti-entropy
> feature? I am going to plug all of this into a spreadsheet.
>
> Matthew Von-Maszewski
>
>
> On Dec 11, 2013, at 7:09, Simon Effenberg <[email protected]> wrote:
>
> > Hi Matthew
> >
> > Memory: 23999 MB
> >
> > ring_creation_size, 256
> > max_open_files, 100
> >
> > riak-admin status:
> >
> > memory_total : 276001360
> > memory_processes : 191506322
> > memory_processes_used : 191439568
> > memory_system : 84495038
> > memory_atom : 686993
> > memory_atom_used : 686560
> > memory_binary : 21965352
> > memory_code : 11332732
> > memory_ets : 10823528
> >
> > Thanks for looking!
> >
> > Cheers
> > Simon
> >
> >
> >
> > On Wed, 11 Dec 2013 06:44:42 -0500
> > Matthew Von-Maszewski <[email protected]> wrote:
> >
> >> I need to ask other developers as they arrive for the new day. Does not
> >> make sense to me.
> >>
> >> How many nodes do you have? How much RAM do you have in each node? What
> >> are your settings for max_open_files and cache_size in the app.config
> >> file? Maybe this is as simple as leveldb using too much RAM in 1.4. The
> >> memory accounting for max_open_files changed in 1.4.
> >>
> >> Matthew Von-Maszewski
> >>
> >>
> >> On Dec 11, 2013, at 6:28, Simon Effenberg <[email protected]>
> >> wrote:
> >>
> >>> Hi Matthew,
> >>>
> >>> it took around 11 hours for the first node to finish the compaction. The
> >>> second node has already been running for 12 hours and is still compacting.
> >>>
> >>> Besides that, I wonder why the fsm_put time on the new 1.4.2 host is
> >>> much higher (after the compaction) than on an old 1.3.1 host (both are
> >>> running in the cluster right now, and another one is doing the
> >>> compaction/upgrade while it is in the cluster but not directly
> >>> accessible because it is out of the load balancer):
> >>>
> >>> 1.4.2:
> >>>
> >>> node_put_fsm_time_mean : 2208050
> >>> node_put_fsm_time_median : 39231
> >>> node_put_fsm_time_95 : 17400382
> >>> node_put_fsm_time_99 : 50965752
> >>> node_put_fsm_time_100 : 59537762
> >>> node_put_fsm_active : 5
> >>> node_put_fsm_active_60s : 364
> >>> node_put_fsm_in_rate : 5
> >>> node_put_fsm_out_rate : 3
> >>> node_put_fsm_rejected : 0
> >>> node_put_fsm_rejected_60s : 0
> >>> node_put_fsm_rejected_total : 0
> >>>
> >>>
> >>> 1.3.1:
> >>>
> >>> node_put_fsm_time_mean : 5036
> >>> node_put_fsm_time_median : 1614
> >>> node_put_fsm_time_95 : 8789
> >>> node_put_fsm_time_99 : 38258
> >>> node_put_fsm_time_100 : 384372
> >>>
> >>>
> >>> any clue why this could/should be?
> >>>
> >>> Cheers
> >>> Simon
> >>>
> >>> On Tue, 10 Dec 2013 17:21:07 +0100
> >>> Simon Effenberg <[email protected]> wrote:
> >>>
> >>>> Hi Matthew,
> >>>>
> >>>> thanks!.. that answers my questions!
> >>>>
> >>>> Cheers
> >>>> Simon
> >>>>
> >>>> On Tue, 10 Dec 2013 11:08:32 -0500
> >>>> Matthew Von-Maszewski <[email protected]> wrote:
> >>>>
> >>>>> 2i is not my expertise, so I had to discuss your concerns with another
> >>>>> Basho developer. He says:
> >>>>>
> >>>>> Between 1.3 and 1.4, the 2i query did change but not the 2i on-disk
> >>>>> format. You must wait for all nodes to update if you desire to use the
> >>>>> new 2i query. The 2i data will properly write/update on both 1.3 and
> >>>>> 1.4 machines during the migration.
> >>>>>
> >>>>> Does that answer your question?
> >>>>>
> >>>>>
> >>>>> And yes, you might see available disk space increase during the upgrade
> >>>>> compactions if your dataset contains numerous delete "tombstones". The
> >>>>> Riak 2.0 code includes a new feature called "aggressive delete" for
> >>>>> leveldb. This feature is more proactive in pushing delete tombstones
> >>>>> through the levels to free up disk space much more quickly (especially
> >>>>> if you perform block deletes every now and then).
> >>>>>
> >>>>> Matthew
> >>>>>
> >>>>>
> >>>>> On Dec 10, 2013, at 10:44 AM, Simon Effenberg
> >>>>> <[email protected]> wrote:
> >>>>>
> >>>>>> Hi Matthew,
> >>>>>>
> >>>>>> see inline..
> >>>>>>
> >>>>>> On Tue, 10 Dec 2013 10:38:03 -0500
> >>>>>> Matthew Von-Maszewski <[email protected]> wrote:
> >>>>>>
> >>>>>>> The sad truth is that you are not the first to see this problem. And
> >>>>>>> yes, it has to do with your 950GB per node dataset. And no, nothing
> >>>>>>> to do but sit through it at this time.
> >>>>>>>
> >>>>>>> While I did extensive testing around upgrade times before shipping
> >>>>>>> 1.4, apparently there are data configurations I did not anticipate.
> >>>>>>> You are likely seeing a cascade where a shift of one file from
> >>>>>>> level-1 to level-2 is causing a shift of another file from level-2 to
> >>>>>>> level-3, which causes a level-3 file to shift to level-4, etc … then
> >>>>>>> the next file shifts from level-1.
> >>>>>>>
> >>>>>>> The bright side of this pain is that you will end up with better
> >>>>>>> write throughput once all the compaction ends.
> >>>>>>
> >>>>>> I have to deal with that.. but my problem is now, if I'm doing this
> >>>>>> node by node it looks like 2i searches aren't possible while 1.3 and
> >>>>>> 1.4 nodes exist in the cluster. Is there any problem which leads me to
> >>>>>> a 2i repair marathon or could I easily wait for some hours for each
> >>>>>> node until all merges are done before I upgrade the next one?
> >>>>>> (2i searches can fail for some time.. the app isn't having problems
> >>>>>> with that, but are new inserts with 2i indices processed successfully
> >>>>>> or do I have to do the 2i repair?)
> >>>>>>
> >>>>>> /s
> >>>>>>
> >>>>>> one other good thing: saving disk space is one advantage ;)..
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> Riak 2.0's leveldb has code to prevent/reduce compaction cascades,
> >>>>>>> but that is not going to help you today.
> >>>>>>>
> >>>>>>> Matthew
> >>>>>>>
> >>>>>>> On Dec 10, 2013, at 10:26 AM, Simon Effenberg
> >>>>>>> <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> Hi @list,
> >>>>>>>>
> >>>>>>>> I'm trying to upgrade our Riak cluster from 1.3.1 to 1.4.2 .. after
> >>>>>>>> upgrading the first node (out of 12) this node seems to do many
> >>>>>>>> merges. The sst_* directories change in size "rapidly" and the node
> >>>>>>>> has a disk utilization of 100% all the time.
> >>>>>>>>
> >>>>>>>> I know that there is something like that:
> >>>>>>>>
> >>>>>>>> "The first execution of 1.4.0 leveldb using a 1.3.x or 1.2.x dataset
> >>>>>>>> will initiate an automatic conversion that could pause the startup of
> >>>>>>>> each node by 3 to 7 minutes. The leveldb data in "level #1" is being
> >>>>>>>> adjusted such that "level #1" can operate as an overlapped data level
> >>>>>>>> instead of as a sorted data level. The conversion is simply the
> >>>>>>>> reduction of the number of files in "level #1" to being less than
> >>>>>>>> eight via normal compaction of data from "level #1" into "level #2".
> >>>>>>>> This is a one time conversion."
> >>>>>>>>
> >>>>>>>> but it looks much more invasive than explained here, or it doesn't
> >>>>>>>> have anything to do with the (probably seen) merges.
> >>>>>>>>
> >>>>>>>> Is this "normal" behavior or could I do anything about it?
> >>>>>>>>
> >>>>>>>> At the moment I'm stuck with the upgrade procedure because this
> >>>>>>>> high IO load would probably lead to high response times.
> >>>>>>>>
> >>>>>>>> Also we have a lot of data (per node ~950 GB).
> >>>>>>>>
> >>>>>>>> Cheers
> >>>>>>>> Simon

--
Simon Effenberg | Site Ops Engineer | mobile.international GmbH
Phone: + 49-(0)30-8109 - 7173
Fax: + 49-(0)30-8109 - 7131

Mail: [email protected]
Web: www.mobile.de

Marktplatz 1 | 14532 Europarc Dreilinden | Germany

Managing Director: Malte Krüger
Commercial register (HRB) no.: 18517 P, Amtsgericht Potsdam
Registered office: Kleinmachnow

Attachments:
  app.config (binary data)
  vm.args (binary data)

_______________________________________________ riak-users mailing list [email protected] http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
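
For reference, the figures quoted in this thread can be cross-checked with a short script. This is only a sketch and not part of the original exchange: every value is copied from the messages above, the variable names are made up for illustration, it assumes the node_put_fsm_time_* stats are reported in microseconds, and it assumes max_open_files applies per vnode.

```python
import math

MIB = 1024 * 1024

# `free -m` on a compacted 1.4.2 node (values in MiB, copied from the thread).
mem_total, mem_used, mem_free = 23999, 23759, 239
buffers, cached = 184, 16183
swap_total = 0

# Memory held by applications once buffers and page cache are excluded,
# and memory the kernel could hand back on demand.
app_used = mem_used - buffers - cached
reclaimable = mem_free + buffers + cached

# Erlang VM footprint from `riak-admin status` (memory_total, in bytes).
erlang_memory_total = 276001360
erlang_mib = erlang_memory_total / MIB

# Ring layout: 256 partitions spread over 12 nodes.
ring_creation_size, nodes, max_open_files = 256, 12, 100
vnodes_per_node = ring_creation_size / nodes
# ASSUMPTION: max_open_files is a per-vnode limit, so a node hosting
# ceil(vnodes_per_node) vnodes may keep this many files open at most.
open_file_slots = math.ceil(vnodes_per_node) * max_open_files

# Put FSM timings (ASSUMPTION: microseconds) for the two hosts.
put_142 = {"mean": 2208050, "median": 39231, "95": 17400382,
           "99": 50965752, "100": 59537762}
put_131 = {"mean": 5036, "median": 1614, "95": 8789,
           "99": 38258, "100": 384372}
ratios = {k: put_142[k] / put_131[k] for k in put_142}

print(f"swap configured: {swap_total} MiB")
print(f"application memory: {app_used} MiB of {mem_total} MiB")
print(f"reclaimable (free + buffers + cache): {reclaimable} MiB")
print(f"Erlang VM memory_total: {erlang_mib:.1f} MiB")
print(f"vnodes per node: {vnodes_per_node:.1f}")
print(f"open-file slots per node (assumed per-vnode limit): {open_file_slots}")
for k in put_142:
    print(f"put time {k:>6}: 1.4.2 {put_142[k] / 1000:.1f} ms, "
          f"1.3.1 {put_131[k] / 1000:.1f} ms, ratio {ratios[k]:.0f}x")
```

Run as-is, it shows that the node has no swap configured, that most of the "used" memory is page cache rather than application memory, and how far apart the put latencies of the 1.4.2 and 1.3.1 hosts are. The computed application memory can differ from the `-/+ buffers/cache` row by 1 MiB because `free -m` rounds each column separately.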
