One of the core developers says that the following line should stop the stats process; it will then be restarted automatically, without the stuck data:

exit(whereis(riak_core_stat_calc_sup), kill), profit().
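For anyone who wants a slightly more defensive variant, here is a sketch (it assumes the supervisor is locally registered under riak_core_stat_calc_sup, as the whereis/1 call above implies, and drops the joking profit() at the end). It can be pasted into a console attached to the affected node, e.g. via riak attach:

    %% Kill the stat calculation supervisor; its parent supervisor should
    %% restart it without the stuck stats state.
    case whereis(riak_core_stat_calc_sup) of
        undefined            -> stat_calc_sup_not_running;
        Pid when is_pid(Pid) -> exit(Pid, kill)
    end.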
Matthew

On Dec 11, 2013, at 4:50 PM, Simon Effenberg <[email protected]> wrote:

> So I think I have no real chance to get good numbers. I can see a little bit through the app monitoring, but I'm not sure whether I can see real differences from the 100 -> 170 open_files increase.
>
> I will try to change the value on the already migrated nodes as well, to see if this improves the things I can see..
>
> Any other ideas?
>
> Cheers
> Simon
>
> On Wed, 11 Dec 2013 15:37:03 -0500 Matthew Von-Maszewski <[email protected]> wrote:
>
>> The real Riak developers have suggested this might be your problem with stats being stuck:
>>
>> https://github.com/basho/riak_core/pull/467
>>
>> The fix is included in the upcoming 1.4.4 maintenance release (which is overdue, so I am not going to bother guessing when it will actually arrive).
>>
>> Matthew
>>
>> On Dec 11, 2013, at 2:47 PM, Simon Effenberg <[email protected]> wrote:
>>
>>> I will do..
>>>
>>> but one other thing:
>>>
>>> watch Every 10.0s: sudo riak-admin status | grep put_fsm
>>> node_put_fsm_time_mean : 2208050
>>> node_put_fsm_time_median : 39231
>>> node_put_fsm_time_95 : 17400382
>>> node_put_fsm_time_99 : 50965752
>>> node_put_fsm_time_100 : 59537762
>>> node_put_fsm_active : 5
>>> node_put_fsm_active_60s : 364
>>> node_put_fsm_in_rate : 5
>>> node_put_fsm_out_rate : 3
>>> node_put_fsm_rejected : 0
>>> node_put_fsm_rejected_60s : 0
>>> node_put_fsm_rejected_total : 0
>>>
>>> this is not changing at all.. so maybe my expectations are _wrong_?! So I will start searching around to see if there is a "status" bug or whether I'm looking in the wrong place... maybe there is no problem while I'm searching for one?! But I see that at least the app has some issues on GET and PUT (more on PUT).. so I would like to know how fast things are.. but "status" isn't working.. aaaaargh...
>>>
>>> Cheers
>>> Simon
>>>
>>> On Wed, 11 Dec 2013 14:32:07 -0500 Matthew Von-Maszewski <[email protected]> wrote:
>>>
>>>> An additional thought: if increasing max_open_files does NOT help, try removing +S 4:4 from the vm.args. Typically the +S setting helps leveldb, but one other user mentioned that the new sorted 2i queries needed more CPU in the Erlang layer.
>>>>
>>>> Summary:
>>>> - try increasing max_open_files to 170
>>>> - helps: try setting sst_block_size to 32768 in app.config
>>>> - does not help: try removing +S from vm.args
>>>>
>>>> Matthew
>>>>
>>>> On Dec 11, 2013, at 1:58 PM, Simon Effenberg <[email protected]> wrote:
>>>>
>>>>> Hi Matthew,
>>>>>
>>>>> On Wed, 11 Dec 2013 18:38:49 +0100 Matthew Von-Maszewski <[email protected]> wrote:
>>>>>
>>>>>> Simon,
>>>>>>
>>>>>> I have plugged your various values into the attached spreadsheet. I assumed a vnode count that allows for one of your twelve servers to die (256 ring size / 11 servers).
>>>>>
>>>>> Great, thanks!
>>>>>
>>>>>> The spreadsheet suggests you can safely raise your max_open_files from 100 to 170. I would suggest doing this for the next server you upgrade. If part of your problem is file cache thrashing, you should see an improvement.
>>>>>
>>>>> I will try this out.. starting the next server in 3-4 hours.
>>>>>
>>>>>> Only if max_open_files helps should you then consider adding {sst_block_size, 32767} to the eleveldb portion of app.config. This setting, given your value sizes, would likely halve the size of the metadata held in the file cache. It only impacts the files newly compacted in the upgrade, and would gradually increase space in the file cache while slowing down the file cache thrashing.
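For clarity, here is a sketch of how those two knobs might sit in the eleveldb section of app.config (the surrounding entries and path are illustrative only; the thread mentions both 32767 and 32768 for sst_block_size, i.e. roughly 32 KB blocks, and the change only takes effect after a node restart):

    %% app.config (excerpt) -- illustrative sketch, to be merged into the existing eleveldb list
    {eleveldb, [
        {data_root, "/var/lib/riak/leveldb"},  %% existing setting, shown for context (Debian package default)
        {max_open_files, 170},                 %% raised from 100 per the spreadsheet suggestion
        {sst_block_size, 32768}                %% only if the max_open_files change helps
    ]},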
>>>>>
>>>>> So I'll do this at the over-next server if the next server is fine.
>>>>>
>>>>>> What build/packaging of Riak do you use, or do you build from source?
>>>>>
>>>>> Using the debian packages from the basho site..
>>>>>
>>>>> I'm really wondering why the "put" performance is that bad. Here are the changes which were introduced/changed only on the new upgraded servers:
>>>>>
>>>>> + fsm_limit => 50000,
>>>>> --- our '+P' is set to 262144 so more than 3x fsm_limit which was
>>>>> --- stated somewhere
>>>>> + # after finishing the upgrade this should be switched to v1 !!!
>>>>> + object_format => '__atom_v0',
>>>>>
>>>>> - '-env ERL_MAX_ETS_TABLES' => 8192,
>>>>> + '-env ERL_MAX_ETS_TABLES' => 256000, # old package used 8192 but 1.4.2 raised it to this high number
>>>>> + '-env ERL_MAX_PORTS' => 64000,
>>>>> + # Treat error_logger warnings as warnings
>>>>> + '+W' => 'w',
>>>>> + # Tweak GC to run more often
>>>>> + '-env ERL_FULLSWEEP_AFTER' => 0,
>>>>> + # Force the erlang VM to use SMP
>>>>> + '-smp' => 'enable',
>>>>> + #################################
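Since the diff above is written in configuration-management syntax, here is a sketch of how the Erlang VM flags it refers to would typically appear in vm.args (fsm_limit and object_format appear to be riak_kv app.config settings rather than VM flags, so they are omitted; +S 4:4 is the scheduler flag Matthew suggests experimenting with removing):

    ## vm.args (excerpt) -- illustrative sketch of the flags mentioned above
    ## max Erlang processes; keep well above fsm_limit
    +P 262144
    ## treat error_logger warnings as warnings
    +W w
    ## scheduler threads -- the flag to try removing if max_open_files does not help
    +S 4:4
    ## force the VM to use SMP
    -smp enable
    ## 1.4.2 raised this from the old default of 8192
    -env ERL_MAX_ETS_TABLES 256000
    -env ERL_MAX_PORTS 64000
    ## tweak GC to run full sweeps more often
    -env ERL_FULLSWEEP_AFTER 0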
>>>>> Cheers
>>>>> Simon
>>>>>
>>>>>> Matthew
>>>>>>
>>>>>> On Dec 11, 2013, at 9:48 AM, Simon Effenberg <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Matthew,
>>>>>>>
>>>>>>> thanks for all your time and work.. see inline for answers..
>>>>>>>
>>>>>>> On Wed, 11 Dec 2013 09:17:32 -0500 Matthew Von-Maszewski <[email protected]> wrote:
>>>>>>>
>>>>>>>> The real Riak developers have arrived on-line for the day. They are telling me that all of your problems are likely due to the extended upgrade times, and yes, there is a known issue with handoff between 1.3 and 1.4. They also say everything should calm down after all nodes are upgraded.
>>>>>>>>
>>>>>>>> I will review your system settings now and see if there is something that might make the other machines upgrade quicker. So three more questions:
>>>>>>>>
>>>>>>>> - what is the average size of your keys
>>>>>>>
>>>>>>> bucket names are between 5 and 15 characters (only ~10 buckets).. key names are normally something like 26iesj:hovh7egz
>>>>>>>
>>>>>>>> - what is the average size of your value (data stored)
>>>>>>>
>>>>>>> I have to guess.. but the mean is (from Riak) 12kb, the 95th percentile is at 75kb, and in theory we have a limit of 1MB (then it will be split up). But sometimes, thanks to siblings (we have two buckets with allow_mult), we also have some 7MB in MAX; this will be reduced again (it's a new feature in our app which has too many parallel writes within 15ms).
>>>>>>>
>>>>>>>> - in regular use, are your keys accessed randomly across their entire range, or do they contain a date component which clusters older, less used keys
>>>>>>>
>>>>>>> normally we don't search but retrieve keys by key name.. and we have data which is up to 6 months old, and normally we access mostly new/active/hot data and not all the old ones.. besides this we have a job doing a 2i query every 5 mins and another one doing this maybe once an hour.. both don't work while the upgrade is ongoing (2i isn't working).
>>>>>>>
>>>>>>> Cheers
>>>>>>> Simon
>>>>>>>
>>>>>>>> Matthew
>>>>>>>>
>>>>>>>> On Dec 11, 2013, at 8:43 AM, Simon Effenberg <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Oh, and at the moment they are waiting for some handoffs and I see errors in the logfiles:
>>>>>>>>>
>>>>>>>>> 2013-12-11 13:41:47.948 UTC [error] <0.7157.24>@riak_core_handoff_sender:start_fold:269 hinted_handoff transfer of riak_kv_vnode from '[email protected]' 468137243207554840987117797979434404733540892672
>>>>>>>>>
>>>>>>>>> but I remember that somebody else had this as well, and if I recall correctly it disappeared after the full upgrade was done.. but at the moment it's hard to think about upgrading everything at once.. (~12 hours of 100% disk utilization on all 12 nodes will lead to really slow puts/gets)
>>>>>>>>>
>>>>>>>>> What can I do?
>>>>>>>>>
>>>>>>>>> Cheers
>>>>>>>>> Simon
>>>>>>>>>
>>>>>>>>> PS: transfers output:
>>>>>>>>>
>>>>>>>>> '[email protected]' waiting to handoff 17 partitions
>>>>>>>>> '[email protected]' waiting to handoff 19 partitions
>>>>>>>>>
>>>>>>>>> (these are the 1.4.2 nodes)
>>>>>>>>>
>>>>>>>>> On Wed, 11 Dec 2013 14:39:58 +0100 Simon Effenberg <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Also some side notes:
>>>>>>>>>>
>>>>>>>>>> "top" is even better on the new 1.4.2 than on the 1.3.1 machines.. IO utilization of the disk is mostly the same (round about 33%)..
>>>>>>>>>>
>>>>>>>>>> but
>>>>>>>>>>
>>>>>>>>>> 95th percentile of response time for get (avg over all nodes):
>>>>>>>>>> before upgrade: 29ms
>>>>>>>>>> after upgrade: almost the same
>>>>>>>>>>
>>>>>>>>>> 95th percentile of response time for put (avg over all nodes):
>>>>>>>>>> before upgrade: 60ms
>>>>>>>>>> after upgrade: 1548ms, but this is only because 2 of 12 nodes are on 1.4.2 and are really slow (17000ms)
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Simon
>>>>>>>>>>
>>>>>>>>>> On Wed, 11 Dec 2013 13:45:56 +0100 Simon Effenberg <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Sorry, I forgot half of it..
>>>>>>>>>>>
>>>>>>>>>>> seffenberg@kriak46-1:~$ free -m
>>>>>>>>>>>              total       used       free     shared    buffers     cached
>>>>>>>>>>> Mem:         23999      23759        239          0        184      16183
>>>>>>>>>>> -/+ buffers/cache:       7391      16607
>>>>>>>>>>> Swap:            0          0          0
>>>>>>>>>>>
>>>>>>>>>>> We have 12 servers..
>>>>>>>>>>> datadir on the compacted servers (1.4.2): ~765 GB
>>>>>>>>>>>
>>>>>>>>>>> AAE is enabled.
>>>>>>>>>>>
>>>>>>>>>>> I attached app.config and vm.args.
>>>>>>>>>>>
>>>>>>>>>>> Cheers
>>>>>>>>>>> Simon
>>>>>>>>>>>
>>>>>>>>>>> On Wed, 11 Dec 2013 07:33:31 -0500 Matthew Von-Maszewski <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Ok, I am now suspecting that your servers are either using swap space (which is slow) or your leveldb file cache is thrashing (opening and closing multiple files per request).
>>>>>>>>>>>>
>>>>>>>>>>>> How many servers do you have, and do you use Riak's active anti-entropy feature? I am going to plug all of this into a spreadsheet.
>>>>>>>>>>>>
>>>>>>>>>>>> Matthew Von-Maszewski
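A rough way to check those two suspicions (swapping vs. file cache thrashing) from the shell; this is only a sketch, and the process name and data path assume the default Debian package layout:

    # Is the node swapping at all? si/so in vmstat should stay at 0.
    free -m | awk 'NR==1 || /Swap/'
    vmstat 1 5

    # How many leveldb .sst files does the Riak beam process hold open?
    # Compare roughly against max_open_files times the number of vnodes on this node.
    lsof -p "$(pgrep -f beam.smp | head -1)" | grep -c '\.sst'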
>>>>>>>>>>>>
>>>>>>>>>>>> On Dec 11, 2013, at 7:09, Simon Effenberg <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Matthew,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Memory: 23999 MB
>>>>>>>>>>>>>
>>>>>>>>>>>>> ring_creation_size, 256
>>>>>>>>>>>>> max_open_files, 100
>>>>>>>>>>>>>
>>>>>>>>>>>>> riak-admin status:
>>>>>>>>>>>>>
>>>>>>>>>>>>> memory_total : 276001360
>>>>>>>>>>>>> memory_processes : 191506322
>>>>>>>>>>>>> memory_processes_used : 191439568
>>>>>>>>>>>>> memory_system : 84495038
>>>>>>>>>>>>> memory_atom : 686993
>>>>>>>>>>>>> memory_atom_used : 686560
>>>>>>>>>>>>> memory_binary : 21965352
>>>>>>>>>>>>> memory_code : 11332732
>>>>>>>>>>>>> memory_ets : 10823528
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for looking!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>> Simon
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, 11 Dec 2013 06:44:42 -0500 Matthew Von-Maszewski <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I need to ask the other developers as they arrive for the new day. It does not make sense to me.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> How many nodes do you have? How much RAM do you have in each node? What are your settings for max_open_files and cache_size in the app.config file? Maybe this is as simple as leveldb using too much RAM in 1.4. The memory accounting for max_open_files changed in 1.4.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Matthew Von-Maszewski
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Dec 11, 2013, at 6:28, Simon Effenberg <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Matthew,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> it took around 11 hours for the first node to finish the compaction. The second node has already been running for 12 hours and is still doing compaction.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Besides that, I wonder why the put_fsm time on the new 1.4.2 host is much higher (after the compaction) than on an old 1.3.1 host (both are running in the cluster right now, and another one is doing the compaction/upgrade while it is in the cluster but not directly accessible because it is out of the loadbalancer):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1.4.2:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> node_put_fsm_time_mean : 2208050
>>>>>>>>>>>>>>> node_put_fsm_time_median : 39231
>>>>>>>>>>>>>>> node_put_fsm_time_95 : 17400382
>>>>>>>>>>>>>>> node_put_fsm_time_99 : 50965752
>>>>>>>>>>>>>>> node_put_fsm_time_100 : 59537762
>>>>>>>>>>>>>>> node_put_fsm_active : 5
>>>>>>>>>>>>>>> node_put_fsm_active_60s : 364
>>>>>>>>>>>>>>> node_put_fsm_in_rate : 5
>>>>>>>>>>>>>>> node_put_fsm_out_rate : 3
>>>>>>>>>>>>>>> node_put_fsm_rejected : 0
>>>>>>>>>>>>>>> node_put_fsm_rejected_60s : 0
>>>>>>>>>>>>>>> node_put_fsm_rejected_total : 0
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1.3.1:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> node_put_fsm_time_mean : 5036
>>>>>>>>>>>>>>> node_put_fsm_time_median : 1614
>>>>>>>>>>>>>>> node_put_fsm_time_95 : 8789
>>>>>>>>>>>>>>> node_put_fsm_time_99 : 38258
>>>>>>>>>>>>>>> node_put_fsm_time_100 : 384372
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> any clue why this could/should be?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>>>> Simon
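Two notes that may help when reading those numbers: the node_put_fsm_time_* values are reported in microseconds, so a mean of 2208050 is about 2.2 s on the 1.4.2 node versus roughly 5 ms on the 1.3.1 node; and the same statistics can also be read over HTTP if riak-admin status seems stuck. A sketch, assuming the HTTP listener is on 127.0.0.1:8098 and jq is installed:

    # pull the put FSM timings (microseconds) straight from the stats endpoint
    curl -s http://127.0.0.1:8098/stats | \
      jq '{mean_us: .node_put_fsm_time_mean, p95_us: .node_put_fsm_time_95, p99_us: .node_put_fsm_time_99}'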
>>>>>>>>>>>>>>> On Tue, 10 Dec 2013 17:21:07 +0100 Simon Effenberg <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Matthew,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> thanks!.. that answers my questions!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>>>>> Simon
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, 10 Dec 2013 11:08:32 -0500 Matthew Von-Maszewski <[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2i is not my expertise, so I had to discuss your concerns with another Basho developer. He says:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Between 1.3 and 1.4, the 2i query did change but not the 2i on-disk format. You must wait for all nodes to update if you desire to use the new 2i query. The 2i data will properly write/update on both 1.3 and 1.4 machines during the migration.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Does that answer your question?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> And yes, you might see available disk space increase during the upgrade compactions if your dataset contains numerous delete "tombstones". The Riak 2.0 code includes a new feature called "aggressive delete" for leveldb. This feature is more proactive in pushing delete tombstones through the levels to free up disk space much more quickly (especially if you perform block deletes every now and then).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Matthew
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Dec 10, 2013, at 10:44 AM, Simon Effenberg <[email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Matthew,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> see inline..
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, 10 Dec 2013 10:38:03 -0500 Matthew Von-Maszewski <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The sad truth is that you are not the first to see this problem. And yes, it has to do with your 950GB per node dataset. And no, there is nothing to do but sit through it at this time.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> While I did extensive testing around upgrade times before shipping 1.4, apparently there are data configurations I did not anticipate. You are likely seeing a cascade where a shift of one file from level-1 to level-2 is causing a shift of another file from level-2 to level-3, which causes a level-3 file to shift to level-4, etc … then the next file shifts from level-1.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The bright side of this pain is that you will end up with better write throughput once all the compaction ends.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I have to deal with that.. but my problem now is that if I'm doing this node by node, it looks like 2i searches aren't possible while 1.3 and 1.4 nodes exist in the cluster. Is there any problem which leads me to a 2i repair marathon, or could I simply wait a few hours for each node until all merges are done before I upgrade the next one? (2i searches can fail for some time.. the APP doesn't have problems with that, but are new inserts with 2i indices processed successfully, or do I have to do the 2i repair?)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> /s
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> one other good thing: saving disk space is one advantage ;)..
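For readers unfamiliar with the 2i queries being discussed, this is roughly the shape of a 2i lookup over HTTP in the 1.3/1.4 line, plus the 1.4-only pagination option that requires the whole cluster to be upgraded. The bucket, index, and value names are made up for illustration, and the host/port assume the default HTTP listener:

    # exact-match 2i lookup
    curl -s 'http://127.0.0.1:8098/buckets/mybucket/index/field1_bin/somevalue'

    # 1.4 addition: paginated results (needs all nodes on 1.4)
    curl -s 'http://127.0.0.1:8098/buckets/mybucket/index/field1_bin/somevalue?max_results=100'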
>>>>>>>>>>>>>>>>>>> Riak 2.0's leveldb has code to prevent/reduce compaction cascades, but that is not going to help you today.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Matthew
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Dec 10, 2013, at 10:26 AM, Simon Effenberg <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi @list,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I'm trying to upgrade our Riak cluster from 1.3.1 to 1.4.2.. after upgrading the first node (out of 12), this node seems to do many merges. The sst_* directories change in size "rapidly" and the node has a disk utilization of 100% all the time.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I know that there is something like that:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> "The first execution of 1.4.0 leveldb using a 1.3.x or 1.2.x dataset will initiate an automatic conversion that could pause the startup of each node by 3 to 7 minutes. The leveldb data in "level #1" is being adjusted such that "level #1" can operate as an overlapped data level instead of as a sorted data level. The conversion is simply the reduction of the number of files in "level #1" to being less than eight via normal compaction of data from "level #1" into "level #2". This is a one time conversion."
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> but it looks much more invasive than explained here, or it doesn't have anything to do with the (probably seen) merges.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Is this "normal" behavior or could I do anything about it?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> At the moment I'm stuck with the upgrade procedure because this high IO load would probably lead to high response times.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Also, we have a lot of data (~950 GB per node).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>>>>>>>>> Simon
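For anyone wanting to see how far the post-upgrade compaction cascade has progressed on a node, the per-vnode leveldb LOG files record compaction activity (stock leveldb writes "Compacting ..." / "Compacted ..." lines there). A sketch, assuming the default Debian data path of /var/lib/riak/leveldb:

    # per-vnode compaction activity recorded so far, busiest vnodes last
    grep -c 'Compact' /var/lib/riak/leveldb/*/LOG | sort -t: -k2 -n | tail

    # follow the most recently touched vnode LOG for ongoing compactions
    tail -f "$(ls -t /var/lib/riak/leveldb/*/LOG | head -1)"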
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
