Cool.. it gave me an exception (** exception error: undefined shell command profit/0),
but it worked and now I have new data.. thanks a lot!

Cheers
Simon
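A note on that profit/0 exception: profit() is just a tongue-in-cheek placeholder, not a real shell function, which is why the shell complains after the kill has already succeeded. A minimal sketch of the part that does the actual work, run from a console attached to the node (assuming riak attach):

    %% $ riak attach
    %% Kill the stats calculation supervisor; it is restarted
    %% automatically, without the stuck data.
    exit(whereis(riak_core_stat_calc_sup), kill).
    %% Detach with Ctrl-D; do NOT use q(), which stops the whole node.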
On Wed, 11 Dec 2013 17:05:29 -0500
Matthew Von-Maszewski <[email protected]> wrote:

> One of the core developers says that the following line should stop the stats process. It will then be automatically restarted, without the stuck data.
>
> exit(whereis(riak_core_stat_calc_sup), kill), profit().
>
> Matthew
>
> On Dec 11, 2013, at 4:50 PM, Simon Effenberg <[email protected]> wrote:
>
> > So I think I have no real chance to get good numbers. I can see a little bit through the app monitoring, but I'm not sure I can see real differences from the 100 -> 170 open_files increase.
> >
> > I will try to change the value on the already migrated nodes as well, to see if this improves the things I can see..
> >
> > Any other ideas?
> >
> > Cheers
> > Simon
> >
> > On Wed, 11 Dec 2013 15:37:03 -0500
> > Matthew Von-Maszewski <[email protected]> wrote:
> >
> >> The real Riak developers have suggested this might be your problem with stats being stuck:
> >>
> >> https://github.com/basho/riak_core/pull/467
> >>
> >> The fix is included in the upcoming 1.4.4 maintenance release (which is overdue, so I am not going to bother guessing when it will actually arrive).
> >>
> >> Matthew
> >>
> >> On Dec 11, 2013, at 2:47 PM, Simon Effenberg <[email protected]> wrote:
> >>
> >>> I will do..
> >>>
> >>> but one other thing:
> >>>
> >>> Every 10.0s: sudo riak-admin status | grep put_fsm
> >>>
> >>> node_put_fsm_time_mean : 2208050
> >>> node_put_fsm_time_median : 39231
> >>> node_put_fsm_time_95 : 17400382
> >>> node_put_fsm_time_99 : 50965752
> >>> node_put_fsm_time_100 : 59537762
> >>> node_put_fsm_active : 5
> >>> node_put_fsm_active_60s : 364
> >>> node_put_fsm_in_rate : 5
> >>> node_put_fsm_out_rate : 3
> >>> node_put_fsm_rejected : 0
> >>> node_put_fsm_rejected_60s : 0
> >>> node_put_fsm_rejected_total : 0
> >>>
> >>> this is not changing at all (watched every 10s).. so maybe my expectations are _wrong_?! I will start searching around for a "status" bug, or whether I'm looking in the wrong place... maybe there is no problem while I'm searching for one?! But I see that at least the app has some issues on GET and PUT (more on PUT).. so I would like to know how fast things are.. but "status" isn't working.. aaaaargh...
> >>>
> >>> Cheers
> >>> Simon
> >>>
> >>> On Wed, 11 Dec 2013 14:32:07 -0500
> >>> Matthew Von-Maszewski <[email protected]> wrote:
> >>>
> >>>> An additional thought: if increasing max_open_files does NOT help, try removing +S 4:4 from vm.args. Typically the +S setting helps leveldb, but one other user mentioned that the new sorted 2i queries needed more CPU in the Erlang layer.
> >>>>
> >>>> Summary:
> >>>> - try increasing max_open_files to 170
> >>>> - helps: try setting sst_block_size to 32768 in app.config
> >>>> - does not help: try removing +S from vm.args
> >>>>
> >>>> Matthew
> >>>>
> >>>> On Dec 11, 2013, at 1:58 PM, Simon Effenberg <[email protected]> wrote:
> >>>>
> >>>>> Hi Matthew,
> >>>>>
> >>>>> On Wed, 11 Dec 2013 18:38:49 +0100
> >>>>> Matthew Von-Maszewski <[email protected]> wrote:
> >>>>>
> >>>>>> Simon,
> >>>>>>
> >>>>>> I have plugged your various values into the attached spreadsheet. I assumed a vnode count that allows for one of your twelve servers to die (256 ring size / 11 servers).
> >>>>>
> >>>>> Great, thanks!
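A side note on the numbers above: the node_put_fsm_time_* statistics are reported in microseconds, which puts the scale of the problem into perspective. A quick conversion, as an Erlang-shell sketch using the values from the output above:

    %% node_put_fsm_time_* values are microseconds
    Mean142 = 2208050 / 1000000.   %% ~2.2 s mean PUT time on the 1.4.2 node
    P95142  = 17400382 / 1000000.  %% ~17.4 s 95th percentile
    Mean131 = 5036 / 1000000.      %% ~5 ms mean on a 1.3.1 node (see further down the thread)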
> >>>>>
> >>>>>> The spreadsheet suggests you can safely raise your max_open_files from 100 to 170. I would suggest doing this for the next server you upgrade. If part of your problem is file cache thrashing, you should see an improvement.
> >>>>>
> >>>>> I will try this out.. starting the next server in 3-4 hours.
> >>>>>
> >>>>>> Only if max_open_files helps should you then consider adding {sst_block_size, 32768} to the eleveldb portion of app.config. This setting, given your value sizes, would likely halve the size of the metadata held in the file cache. It only impacts the files newly compacted during the upgrade, and would gradually free up space in the file cache while slowing down the file cache thrashing.
> >>>>>
> >>>>> So I'll do this on the node after next, if the next server is fine.
> >>>>>
> >>>>>> What build/packaging of Riak do you use, or do you build from source?
> >>>>>
> >>>>> Using the Debian packages from the Basho site..
> >>>>>
> >>>>> I'm really wondering why the "put" performance is that bad. Here are the changes which were introduced/changed only on the newly upgraded servers:
> >>>>>
> >>>>> + fsm_limit => 50000,
> >>>>> --- our '+P' is set to 262144, so more than 3x fsm_limit, which was stated somewhere
> >>>>> + # after finishing the upgrade this should be switched to v1 !!!
> >>>>> + object_format => '__atom_v0',
> >>>>>
> >>>>> - '-env ERL_MAX_ETS_TABLES' => 8192,
> >>>>> + '-env ERL_MAX_ETS_TABLES' => 256000, # old package used 8192 but 1.4.2 raised it to this high number
> >>>>> + '-env ERL_MAX_PORTS' => 64000,
> >>>>> + # Treat error_logger warnings as warnings
> >>>>> + '+W' => 'w',
> >>>>> + # Tweak GC to run more often
> >>>>> + '-env ERL_FULLSWEEP_AFTER' => 0,
> >>>>> + # Force the erlang VM to use SMP
> >>>>> + '-smp' => 'enable',
> >>>>> + #################################
> >>>>>
> >>>>> Cheers
> >>>>> Simon
> >>>>>
> >>>>>> Matthew
> >>>>>>
> >>>>>> On Dec 11, 2013, at 9:48 AM, Simon Effenberg <[email protected]> wrote:
> >>>>>>
> >>>>>>> Hi Matthew,
> >>>>>>>
> >>>>>>> thanks for all your time and work.. see inline for answers..
> >>>>>>>
> >>>>>>> On Wed, 11 Dec 2013 09:17:32 -0500
> >>>>>>> Matthew Von-Maszewski <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> The real Riak developers have arrived on-line for the day. They are telling me that all of your problems are likely due to the extended upgrade times, and yes, there is a known issue with handoff between 1.3 and 1.4. They also say everything should calm down after all nodes are upgraded.
> >>>>>>>>
> >>>>>>>> I will review your system settings now and see if there is something that might make the other machines upgrade quicker. So three more questions:
> >>>>>>>>
> >>>>>>>> - what is the average size of your keys?
> >>>>>>>
> >>>>>>> bucket names are between 5 and 15 characters (only ~10 buckets).. key names are normally something like 26iesj:hovh7egz
> >>>>>>>
> >>>>>>>> - what is the average size of your values (data stored)?
> >>>>>>>
> >>>>>>> I have to guess..
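For reference, the two leveldb suggestions combined would look roughly like this in the eleveldb section of app.config. This is a sketch assuming otherwise default settings; the data_root path is just the usual Debian-package location:

    %% app.config -- eleveldb section (sketch), sits inside the
    %% top-level list of the config file
    {eleveldb, [
        {data_root, "/var/lib/riak/leveldb"},
        %% step 1: raise the per-vnode file cache limit (was 100)
        {max_open_files, 170},
        %% step 2, only if step 1 helps: halves the metadata kept
        %% in the file cache for newly compacted .sst files
        {sst_block_size, 32768}
    ]},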
> >>>>>>> but the mean is (from Riak) 12kb, the 95th percentile is at 75kb, and in theory we have a limit of 1MB (then it will be split up). But sometimes, thanks to siblings (we have two buckets with allow_mult), we also have some 7MB at MAX; this will be reduced again (it's caused by a new feature in our app which has too many parallel writes within 15ms).
> >>>>>>>
> >>>>>>>> - in regular use, are your keys accessed randomly across their entire range, or do they contain a date component which clusters older, less used keys?
> >>>>>>>
> >>>>>>> normally we don't search but retrieve keys by key name.. and we have data which is up to 6 months old, and normally we access mostly new/active/hot data and not all the old ones.. besides this, we have a job doing a 2i query every 5 mins and another one doing this maybe once an hour.. both don't work while the upgrade is ongoing (2i isn't working).
> >>>>>>>
> >>>>>>> Cheers
> >>>>>>> Simon
> >>>>>>>
> >>>>>>>> Matthew
> >>>>>>>>
> >>>>>>>> On Dec 11, 2013, at 8:43 AM, Simon Effenberg <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> Oh, and at the moment they are waiting for some handoffs, and I see errors in the logfiles:
> >>>>>>>>>
> >>>>>>>>> 2013-12-11 13:41:47.948 UTC [error] <0.7157.24>@riak_core_handoff_sender:start_fold:269 hinted_handoff transfer of riak_kv_vnode from '[email protected]' 468137243207554840987117797979434404733540892672
> >>>>>>>>>
> >>>>>>>>> but I remember that somebody else had this as well, and if I recall correctly it disappeared after the full upgrade was done.. but at the moment it's hard to think about upgrading everything at once.. (~12 hours of 100% disk utilization on all 12 nodes would lead to really slow puts/gets)
> >>>>>>>>>
> >>>>>>>>> What can I do?
> >>>>>>>>>
> >>>>>>>>> Cheers
> >>>>>>>>> Simon
> >>>>>>>>>
> >>>>>>>>> PS: transfers output:
> >>>>>>>>>
> >>>>>>>>> '[email protected]' waiting to handoff 17 partitions
> >>>>>>>>> '[email protected]' waiting to handoff 19 partitions
> >>>>>>>>>
> >>>>>>>>> (these are the 1.4.2 nodes)
> >>>>>>>>>
> >>>>>>>>> On Wed, 11 Dec 2013 14:39:58 +0100
> >>>>>>>>> Simon Effenberg <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>>> Also some side notes:
> >>>>>>>>>>
> >>>>>>>>>> "top" looks even better on the new 1.4.2 machines than on the 1.3.1 ones.. IO utilization of the disk is mostly the same (about 33%)..
> >>>>>>>>>>
> >>>>>>>>>> but
> >>>>>>>>>>
> >>>>>>>>>> 95th percentile of response time for GET (avg over all nodes):
> >>>>>>>>>> before upgrade: 29ms
> >>>>>>>>>> after upgrade: almost the same
> >>>>>>>>>>
> >>>>>>>>>> 95th percentile of response time for PUT (avg over all nodes):
> >>>>>>>>>> before upgrade: 60ms
> >>>>>>>>>> after upgrade: 1548ms, but only because 2 of the 12 nodes are on 1.4.2 and are really slow (17000ms)
> >>>>>>>>>>
> >>>>>>>>>> Cheers,
> >>>>>>>>>> Simon
> >>>>>>>>>>
> >>>>>>>>>> On Wed, 11 Dec 2013 13:45:56 +0100
> >>>>>>>>>> Simon Effenberg <[email protected]> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Sorry, I forgot half of it..
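On the stuck handoffs: Riak 1.4 also grew a riak-admin transfer-limit command, which can be used to throttle handoff concurrency while a node is still busy compacting. A sketch, reusing the node name from the transfers output above (check your own node names; this only applies on the already-upgraded 1.4.2 nodes):

    # show the current handoff concurrency limits
    riak-admin transfer-limit
    # throttle a single busy node to one concurrent transfer
    riak-admin transfer-limit [email protected] 1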
> >>>>>>>>>>>
> >>>>>>>>>>> seffenberg@kriak46-1:~$ free -m
> >>>>>>>>>>>              total    used    free  shared  buffers  cached
> >>>>>>>>>>> Mem:         23999   23759     239       0      184   16183
> >>>>>>>>>>> -/+ buffers/cache:    7391   16607
> >>>>>>>>>>> Swap:            0       0       0
> >>>>>>>>>>>
> >>>>>>>>>>> We have 12 servers..
> >>>>>>>>>>> datadir on the compacted servers (1.4.2): ~765 GB
> >>>>>>>>>>>
> >>>>>>>>>>> AAE is enabled.
> >>>>>>>>>>>
> >>>>>>>>>>> I attached app.config and vm.args.
> >>>>>>>>>>>
> >>>>>>>>>>> Cheers
> >>>>>>>>>>> Simon
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, 11 Dec 2013 07:33:31 -0500
> >>>>>>>>>>> Matthew Von-Maszewski <[email protected]> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Ok, I am now suspecting that your servers are either using swap space (which is slow) or your leveldb file cache is thrashing (opening and closing multiple files per request).
> >>>>>>>>>>>>
> >>>>>>>>>>>> How many servers do you have, and do you use Riak's active anti-entropy feature? I am going to plug all of this into a spreadsheet.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Matthew Von-Maszewski
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Dec 11, 2013, at 7:09, Simon Effenberg <[email protected]> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Matthew,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Memory: 23999 MB
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> ring_creation_size: 256
> >>>>>>>>>>>>> max_open_files: 100
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> riak-admin status:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> memory_total : 276001360
> >>>>>>>>>>>>> memory_processes : 191506322
> >>>>>>>>>>>>> memory_processes_used : 191439568
> >>>>>>>>>>>>> memory_system : 84495038
> >>>>>>>>>>>>> memory_atom : 686993
> >>>>>>>>>>>>> memory_atom_used : 686560
> >>>>>>>>>>>>> memory_binary : 21965352
> >>>>>>>>>>>>> memory_code : 11332732
> >>>>>>>>>>>>> memory_ets : 10823528
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks for looking!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Cheers
> >>>>>>>>>>>>> Simon
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, 11 Dec 2013 06:44:42 -0500
> >>>>>>>>>>>>> Matthew Von-Maszewski <[email protected]> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> I need to ask other developers as they arrive for the new day. It does not make sense to me.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> How many nodes do you have? How much RAM do you have in each node? What are your settings for max_open_files and cache_size in the app.config file? Maybe this is as simple as leveldb using too much RAM in 1.4. The memory accounting for max_open_files changed in 1.4.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Matthew Von-Maszewski
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Dec 11, 2013, at 6:28, Simon Effenberg <[email protected]> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi Matthew,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> it took around 11 hours for the first node to finish the compaction. The second node has already been running for 12 hours and is still compacting.
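Since max_open_files and RAM keep coming up: the spreadsheet math can be approximated by hand. A rough Erlang-shell sketch; the ~4 MB of file-cache memory per open .sst file is an assumed rule of thumb for Basho's leveldb here, not a measured value:

    %% Rough per-node leveldb file-cache estimate (sketch only).
    RingSize = 256,
    Servers = 11,                       %% 12 servers, planning for 1 down
    VnodesPerNode = (RingSize + Servers - 1) div Servers,   %% = 24
    MaxOpenFiles = 170,
    FileCacheMB = VnodesPerNode * MaxOpenFiles * 4.
    %% 24 * 170 * 4 = 16320 MB, against ~24 GB RAM per server,
    %% which is roughly why 170 still fits while leaving headroom.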
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Besides that, I wonder why the put_fsm time on the new 1.4.2 host is much higher (after the compaction) than on an old 1.3.1 host (both are running in the cluster right now, and another one is doing the compaction/upgrade while it is in the cluster but not directly accessible because it is out of the load balancer):
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> 1.4.2:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> node_put_fsm_time_mean : 2208050
> >>>>>>>>>>>>>>> node_put_fsm_time_median : 39231
> >>>>>>>>>>>>>>> node_put_fsm_time_95 : 17400382
> >>>>>>>>>>>>>>> node_put_fsm_time_99 : 50965752
> >>>>>>>>>>>>>>> node_put_fsm_time_100 : 59537762
> >>>>>>>>>>>>>>> node_put_fsm_active : 5
> >>>>>>>>>>>>>>> node_put_fsm_active_60s : 364
> >>>>>>>>>>>>>>> node_put_fsm_in_rate : 5
> >>>>>>>>>>>>>>> node_put_fsm_out_rate : 3
> >>>>>>>>>>>>>>> node_put_fsm_rejected : 0
> >>>>>>>>>>>>>>> node_put_fsm_rejected_60s : 0
> >>>>>>>>>>>>>>> node_put_fsm_rejected_total : 0
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> 1.3.1:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> node_put_fsm_time_mean : 5036
> >>>>>>>>>>>>>>> node_put_fsm_time_median : 1614
> >>>>>>>>>>>>>>> node_put_fsm_time_95 : 8789
> >>>>>>>>>>>>>>> node_put_fsm_time_99 : 38258
> >>>>>>>>>>>>>>> node_put_fsm_time_100 : 384372
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> any clue why this could/should be?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Cheers
> >>>>>>>>>>>>>>> Simon
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Tue, 10 Dec 2013 17:21:07 +0100
> >>>>>>>>>>>>>>> Simon Effenberg <[email protected]> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hi Matthew,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> thanks!.. that answers my questions!
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Cheers
> >>>>>>>>>>>>>>>> Simon
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Tue, 10 Dec 2013 11:08:32 -0500
> >>>>>>>>>>>>>>>> Matthew Von-Maszewski <[email protected]> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 2i is not my expertise, so I had to discuss your concerns with another Basho developer. He says:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Between 1.3 and 1.4, the 2i query did change but not the 2i on-disk format. You must wait for all nodes to update if you desire to use the new 2i query. The 2i data will properly write/update on both 1.3 and 1.4 machines during the migration.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Does that answer your question?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> And yes, you might see available disk space increase during the upgrade compactions if your dataset contains numerous delete "tombstones". The Riak 2.0 code includes a new feature called "aggressive delete" for leveldb. This feature is more proactive in pushing delete tombstones through the levels to free up disk space much more quickly (especially if you perform block deletes every now and then).
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Matthew
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Dec 10, 2013, at 10:44 AM, Simon Effenberg <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Hi Matthew,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> see inline..
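To make the mixed-version caveat concrete: during the migration only the 2i reads break; writes keep indexing correctly on both 1.3 and 1.4 nodes. A sketch of the kind of 2i HTTP query the periodic jobs presumably run (hostname, bucket, and index name are made up for illustration; 8098 is the default HTTP port):

    # exact-match 2i query
    curl 'http://kriak46-1:8098/buckets/mybucket/index/field1_bin/someval'
    # range query; 1.4 additionally brings sorted results and
    # pagination (max_results) for these
    curl 'http://kriak46-1:8098/buckets/mybucket/index/field1_bin/aaa/mmm'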
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Tue, 10 Dec 2013 10:38:03 -0500
> >>>>>>>>>>>>>>>>>> Matthew Von-Maszewski <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> The sad truth is that you are not the first to see this problem. And yes, it has to do with your 950GB per node dataset. And no, there is nothing to do but sit through it at this time.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> While I did extensive testing around upgrade times before shipping 1.4, apparently there are data configurations I did not anticipate. You are likely seeing a cascade where a shift of one file from level-1 to level-2 is causing a shift of another file from level-2 to level-3, which causes a level-3 file to shift to level-4, etc ... then the next file shifts from level-1.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> The bright side of this pain is that you will end up with better write throughput once all the compaction ends.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I have to deal with that.. but my problem is now: if I'm doing this node by node, it looks like 2i searches aren't possible while 1.3 and 1.4 nodes coexist in the cluster. Is there any problem which leads me to a 2i repair marathon, or could I simply wait for some hours for each node until all merges are done before I upgrade the next one? (2i searches can fail for some time.. the app isn't having problems with that, but are new inserts with 2i indices processed successfully, or do I have to do the 2i repair?)
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> /s
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> one other good thing: saving disk space is one advantage ;)..
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Riak 2.0's leveldb has code to prevent/reduce compaction cascades, but that is not going to help you today.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Matthew
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Dec 10, 2013, at 10:26 AM, Simon Effenberg <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Hi @list,
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I'm trying to upgrade our Riak cluster from 1.3.1 to 1.4.2 .. after upgrading the first node (out of 12), this node seems to do many merges. The sst_* directories change in size "rapidly", and the node has a disk utilization of 100% all the time.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I know that there is something like this:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> "The first execution of 1.4.0 leveldb using a 1.3.x or 1.2.x dataset will initiate an automatic conversion that could pause the startup of each node by 3 to 7 minutes.
> >>>>>>>>>>>>>>>>>>>> The leveldb data in "level #1" is being adjusted such that "level #1" can operate as an overlapped data level instead of as a sorted data level. The conversion is simply the reduction of the number of files in "level #1" to fewer than eight, via normal compaction of data from "level #1" into "level #2". This is a one time conversion."
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> but it looks much more invasive than explained here, or it doesn't have anything to do with the (probably seen) merges.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Is this "normal" behavior, or could I do anything about it?
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> At the moment I'm stuck with the upgrade procedure, because this high IO load would probably lead to high response times.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Also, we have a lot of data (~950 GB per node).
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Cheers
> >>>>>>>>>>>>>>>>>>>> Simon
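On the 2i repair question further up: since 2i data keeps being written correctly on both versions during the migration, a repair marathon should not normally be needed. If an index does end up inconsistent, 1.4's riak-admin offers a repair-2i subcommand that rebuilds 2i data from the stored objects. A sketch (the partition ID is just the one from the handoff log above, used as an example):

    # repair 2i data on all partitions owned by this node
    riak-admin repair-2i
    # or limit the repair to specific partitions
    riak-admin repair-2i 468137243207554840987117797979434404733540892672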
--
Simon Effenberg | Site Ops Engineer | mobile.international GmbH
Fon: +49-(0)30-8109-7173
Fax: +49-(0)30-8109-7131

Mail: [email protected]
Web: www.mobile.de

Marktplatz 1 | 14532 Europarc Dreilinden | Germany

Geschäftsführer: Malte Krüger
HRB Nr.: 18517 P, Amtsgericht Potsdam
Sitz der Gesellschaft: Kleinmachnow

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
