Sorry, I forgot to add your results, using bitcask: https://dl.dropbox.com/u/183971/summary.png
On Mon, Nov 5, 2012 at 1:17 PM, Uruka Dark <[email protected]> wrote:
> Just an update.
>
> I ran the benchmark again, but this time using the Memory backend:
> https://dl.dropbox.com/u/308392/memory_summary.png
>
> This was the result using the Bitcask backend:
> https://dl.dropbox.com/u/308392/bitcask_summary.png
>
> The difference is not that big in my environment. I was expecting much
> better results, but I don't know whether that is to be expected.
> Anyway, your results are still much better, even when I'm using the
> memory-only backend (I get 50% of yours).
>
> Maybe this can help to understand what is happening.
>
> On Sat, Nov 3, 2012 at 7:31 PM, Uruka Dark <[email protected]> wrote:
>
>> Jared,
>>
>> Again, thank you very much.
>> You helped me a lot.
>>
>> I perfectly understand your point. I'm just getting to know Riak and I
>> want to go much deeper. But before I keep going, I want to make sure that
>> I'm starting off on the right foot :)
>> I double/triple-checked and I still have no additional clues about what
>> is happening.
>>
>> You've reached much better results than mine using your default settings,
>> and, given my numbers, I'm still missing something. I would like at least
>> to get closer to your results. If you think that I won't do any better
>> than this with my default settings, please let me know.
>>
>> Anyway, this is my app.config:
>>
>> %% -*- mode: erlang;erlang-indent-level: 4;indent-tabs-mode: nil -*-
>> %% ex: ft=erlang ts=4 sw=4 et
>> [
>>     %% Riak Client APIs config
>>     {riak_api, [
>>         %% pb_backlog is the maximum length to which the queue of pending
>>         %% connections may grow. If set, it must be an integer >= 0.
>>         %% By default the value is 5. If you anticipate a huge number of
>>         %% connections being initialised *simultaneously*, set this number
>>         %% higher.
>>         %% {pb_backlog, 64},
>>
>>         %% pb_ip is the IP address that the Riak Protocol Buffers interface
>>         %% will bind to. If this is undefined, the interface will not run.
>>         {pb_ip, "10.1.1.221" },
>>
>>         %% pb_port is the TCP port that the Riak Protocol Buffers interface
>>         %% will bind to
>>         {pb_port, 8087 }
>>     ]},
>>
>>     %% Riak Core config
>>     {riak_core, [
>>         %% Default location of ringstate
>>         {ring_state_dir, "/var/lib/riak/ring"},
>>
>>         %% Default ring creation size. Make sure it is a power of 2,
>>         %% e.g. 16, 32, 64, 128, 256, 512 etc
>>         %{ring_creation_size, 64},
>>
>>         %% http is a list of IP addresses and TCP ports that the Riak
>>         %% HTTP interface will bind.
>>         {http, [ {"10.1.1.221", 8098 } ]},
>>
>>         %% https is a list of IP addresses and TCP ports that the Riak
>>         %% HTTPS interface will bind.
>>         {https, [{ "10.1.1.221", 8069 }]},
>>
>>         %% Default cert and key locations for https can be overridden
>>         %% with the ssl config variable, for example:
>>         {ssl, [
>>             {certfile, "/etc/riak/server.crt"},
>>             {keyfile, "/etc/riak/server.key"}
>>         ]},
>>
>>         %% riak_handoff_port is the TCP port that Riak uses for
>>         %% intra-cluster data handoff.
>>         {handoff_port, 8099 },
>>
>>         %% To encrypt riak_core intra-cluster data handoff traffic,
>>         %% uncomment the following line and edit its path to an
>>         %% appropriate certfile and keyfile. (This example uses a
>>         %% single file with both items concatenated together.)
>>         %{handoff_ssl_options, [{certfile, "/tmp/erlserver.pem"}]},
>>
>>         %% DTrace support
>>         %% Do not enable 'dtrace_support' unless your Erlang/OTP
>>         %% runtime is compiled to support DTrace. DTrace is
>>         %% available in R15B01 (supported by the Erlang/OTP
>>         %% official source package) and in R14B04 via a custom
>>         %% source repository & branch.
>>         {dtrace_support, false},
>>
>>         %% Platform-specific installation paths (substituted by rebar)
>>         {platform_bin_dir, "/usr/sbin"},
>>         {platform_data_dir, "/var/lib/riak"},
>>         {platform_etc_dir, "/etc/riak"},
>>         {platform_lib_dir, "/usr/lib/riak/lib"},
>>         {platform_log_dir, "/var/log/riak"}
>>     ]},
>>
>>     %% Riak KV config
>>     {riak_kv, [
>>         %% Storage_backend specifies the Erlang module defining the storage
>>         %% mechanism that will be used on this node.
>>         %{storage_backend, riak_kv_memory_backend},
>>         {storage_backend, riak_kv_bitcask_backend},
>>         %{storage_backend, riak_kv_eleveldb_backend},
>>
>>         %% raw_name is the first part of all URLS used by the Riak raw HTTP
>>         %% interface. See riak_web.erl and raw_http_resource.erl for
>>         %% details.
>>         %{raw_name, "riak"},
>>
>>         %% mapred_name is URL used to submit map/reduce requests to Riak.
>>         {mapred_name, "mapred"},
>>
>>         %% mapred_system indicates which version of the MapReduce
>>         %% system should be used: 'pipe' means riak_pipe will
>>         %% power MapReduce queries, while 'legacy' means that luke
>>         %% will be used
>>         {mapred_system, pipe},
>>
>>         %% mapred_2i_pipe indicates whether secondary-index
>>         %% MapReduce inputs are queued in parallel via their own
>>         %% pipe ('true'), or serially via a helper process
>>         %% ('false' or undefined). Set to 'false' or leave
>>         %% undefined during a rolling upgrade from 1.0.
>>         {mapred_2i_pipe, true},
>>
>>         %% directory used to store a transient queue for pending
>>         %% map tasks
>>         %% Only valid when mapred_system == legacy
>>         %% {mapred_queue_dir, "/var/lib/riak/mr_queue" },
>>
>>         %% Each of the following entries control how many Javascript
>>         %% virtual machines are available for executing map, reduce,
>>         %% pre- and post-commit hook functions.
>>         {map_js_vm_count, 8 },
>>         {reduce_js_vm_count, 6 },
>>         {hook_js_vm_count, 2 },
>>
>>         %% Number of items the mapper will fetch in one request.
>>         %% Larger values can impact read/write performance for
>>         %% non-MapReduce requests.
>>         %% Only valid when mapred_system == legacy
>>         %% {mapper_batch_size, 5},
>>
>>         %% js_max_vm_mem is the maximum amount of memory, in megabytes,
>>         %% allocated to the Javascript VMs. If unset, the default is
>>         %% 8MB.
>>         {js_max_vm_mem, 8},
>>
>>         %% js_thread_stack is the maximum amount of thread stack, in megabyes,
>>         %% allocate to the Javascript VMs. If unset, the default is 16MB.
>>         %% NOTE: This is not the same as the C thread stack.
>>         {js_thread_stack, 16},
>>
>>         %% Number of objects held in the MapReduce cache. These will be
>>         %% ejected when the cache runs out of room or the bucket/key
>>         %% pair for that entry changes
>>         %% Only valid when mapred_system == legacy
>>         %% {map_cache_size, 10000},
>>
>>         %% js_source_dir should point to a directory containing Javascript
>>         %% source files which will be loaded by Riak when it initializes
>>         %% Javascript VMs.
>>         %{js_source_dir, "/tmp/js_source"},
>>
>>         %% http_url_encoding determines how Riak treats URL encoded
>>         %% buckets, keys, and links over the REST API. When set to 'on'
>>         %% Riak always decodes encoded values sent as URLs and Headers.
>>         %% Otherwise, Riak defaults to compatibility mode where links
>>         %% are decoded, but buckets and keys are not. The compatibility
>>         %% mode will be removed in a future release.
>>         {http_url_encoding, on},
>>
>>         %% Switch to vnode-based vclocks rather than client ids. This
>>         %% significantly reduces the number of vclock entries.
>>         %% Only set true if *all* nodes in the cluster are upgraded to 1.0
>>         {vnode_vclocks, true},
>>
>>         %% This option enables compatability of bucket and key listing
>>         %% with 0.14 and earlier versions. Once a rolling upgrade to
>>         %% a version > 0.14 is completed for a cluster, this should be
>>         %% set to false for improved performance for bucket and key
>>         %% listing operations.
>>         {legacy_keylisting, false},
>>
>>         %% This option toggles compatibility of keylisting with 1.0
>>         %% and earlier versions. Once a rolling upgrade to a version
>>         %% > 1.0 is completed for a cluster, this should be set to
>>         %% true for better control of memory usage during key listing
>>         %% operations
>>         {listkeys_backpressure, true}
>>     ]},
>>
>>     %% Riak Search Config
>>     {riak_search, [
>>         %% To enable Search functionality set this 'true'.
>>         {enabled, false}
>>     ]},
>>
>>     %% Merge Index Config
>>     {merge_index, [
>>         %% The root dir to store search merge_index data
>>         {data_root, "/var/lib/riak/merge_index"},
>>
>>         %% Size, in bytes, of the in-memory buffer. When this
>>         %% threshold has been reached the data is transformed
>>         %% into a segment file which resides on disk.
>>         {buffer_rollover_size, 1048576},
>>
>>         %% Overtime the segment files need to be compacted.
>>         %% This is the maximum number of segments that will be
>>         %% compacted at once. A lower value will lead to
>>         %% quicker but more frequent compactions.
>>         {max_compact_segments, 20}
>>     ]},
>>
>>     %% Bitcask Config
>>     {bitcask, [
>>         {data_root, "/var/lib/riak/bitcask"}
>>     ]},
>>
>>     %% eLevelDB Config
>>     {eleveldb, [
>>         {data_root, "/var/lib/riak/leveldb"},
>>         {write_buffer_size_min, 31457280}, %% 30 MB in bytes
>>         {write_buffer_size_max, 62914560} %% 60 MB in bytes
>>     ]},
>>
>>     %% Lager Config
>>     {lager, [
>>         %% What handlers to install with what arguments
>>         %% The defaults for the logfiles are to rotate the files when
>>         %% they reach 10Mb or at midnight, whichever comes first, and keep
>>         %% the last 5 rotations. See the lager README for a description of
>>         %% the time rotation format:
>>         %% https://github.com/basho/lager/blob/master/README.org
>>         %%
>>         %% If you wish to disable rotation, you can either set the size to 0
>>         %% and the rotation time to "", or instead specify a 2-tuple that only
>>         %% consists of {Logfile, Level}.
>>         {handlers, [
>>             {lager_console_backend, info},
>>             {lager_file_backend, [
>>                 {"/var/log/riak/error.log", error, 10485760, "$D0", 5},
>>                 {"/var/log/riak/console.log", info, 10485760, "$D0", 5}
>>             ]}
>>         ]},
>>
>>         %% Whether to write a crash log, and where.
>>         %% Commented/omitted/undefined means no crash logger.
>>         {crash_log, "/var/log/riak/crash.log"},
>>
>>         %% Maximum size in bytes of events in the crash log - defaults to 65536
>>         {crash_log_msg_size, 65536},
>>
>>         %% Maximum size of the crash log in bytes, before its rotated, set
>>         %% to 0 to disable rotation - default is 0
>>         {crash_log_size, 10485760},
>>
>>         %% What time to rotate the crash log - default is no time
>>         %% rotation. See the lager README for a description of this format:
>>         %% https://github.com/basho/lager/blob/master/README.org
>>         {crash_log_date, "$D0"},
>>
>>         %% Number of rotated crash logs to keep, 0 means keep only the
>>         %% current one - default is 0
>>         {crash_log_count, 5},
>>
>>         %% Whether to redirect error_logger messages into lager - defaults to true
>>         {error_logger_redirect, true}
>>     ]},
>>
>>     %% riak_sysmon config
>>     {riak_sysmon, [
>>         %% To disable forwarding events of a particular type, use a
>>         %% limit of 0.
>>         {process_limit, 30},
>>         {port_limit, 2},
>>
>>         %% Finding reasonable limits for a given workload is a matter
>>         %% of experimentation.
>>         {gc_ms_limit, 100},
>>         {heap_word_limit, 40111000},
>>
>>         %% Configure the following items to 'false' to disable logging
>>         %% of that event type.
>>         {busy_port, true},
>>         {busy_dist_port, true}
>>     ]},
>>
>>     %% SASL config
>>     {sasl, [
>>         {sasl_error_logger, false}
>>     ]},
>>
>>     %% riak_control config
>>     {riak_control, [
>>         %% Set to false to disable the admin panel.
>>         {enabled, true},
>>
>>         %% Authentication style used for access to the admin
>>         %% panel. Valid styles are 'userlist' <TODO>.
>>         {auth, none},
>>
>>         %% If auth is set to 'userlist' then this is the
>>         %% list of usernames and passwords for access to the
>>         %% admin panel.
>>         {userlist, [{"user", "pass"}
>>         ]},
>>
>>         %% The admin panel is broken up into multiple
>>         %% components, each of which is enabled or disabled
>>         %% by one of these settings.
>>         {admin, true}
>>     ]}
>> ].
>>
>> ------
>> I have two machines with those settings: 10.1.1.221 and 10.1.1.222. They
>> are working together.
>> Do you see any problem with that?
>>
>> Again, if you think I can't go any further with those default settings
>> (without tuning the FS, etc.), please let me know.
>>
>> Thank you.
>>
>> On Sat, Nov 3, 2012 at 4:43 PM, Jared Morrow <[email protected]> wrote:
>>
>>> Uruka,
>>>
>>> Now that you have some somewhat reasonable numbers, it is probably time
>>> to discuss what you are trying to get out of Riak. We typically recommend
>>> 4 or 5 nodes minimum for a Riak install because that is the point where the
>>> distribution becomes a performance benefit rather than a hindrance. I know
>>> you were just load testing, but I'd recommend considering a test with 4 or
>>> 5 nodes, with default N values. During the test, remove a node (power it
>>> off, or 'riak stop' it). Or, as someone else mentioned, start with a 3 or
>>> 4 node cluster and add a node to see how the performance goes up while no
>>> further operations work is needed to rebalance the data around the cluster.
>>> This is really where Riak shines over some alternative databases: the ease
>>> of scaling and of dealing with failures. Single node performance, although
>>> fun to try and tune to get the most out of it, isn't as interesting on a
>>> long timeline when trying to scale the system. Obviously single node
>>> performance is still important, don't get me wrong. Riak isn't always the
>>> best choice, but when it comes to staying available and performant while
>>> systems are failing, no other system has a better real-world story than Riak.
>>>
>>> If you still want to get your single node performance up, we have
>>> several pages in our docs about tuning. A good place to start
>>> is the file system tuning page
>>> http://docs.basho.com/riak/latest/cookbooks/File-System-Tuning/ .
>>> Reading that and other pages in the Operations section might be helpful in
>>> squeezing out those last bits of speed.
>>>
>>> I am glad to see your initial 60 writes/sec has gone up to 800
>>> writes/sec, but we definitely can do better once you start utilizing our
>>> strengths.
>>>
>>> Hope my rambling helped,
>>> -Jared
>>>
>>> On Sat, Nov 3, 2012 at 4:55 AM, Uruka Dark <[email protected]> wrote:
>>>
>>>> Jared,
>>>>
>>>> Thank you for your time and reply.
>>>>
>>>> I was impressed by your numbers and I started to double-check my
>>>> settings. I found a big problem here: my third machine (the one outside
>>>> the cluster, generating the load) was not talking to Riak at gigabit
>>>> speed; it was at 100 Mb/s. I changed the network cable and it's working
>>>> fine now.
>>>> I ran my Python script again and I could already see better results:
>>>> 252 ops/sec (before the fix it was 175 ops/sec).
>>>>
>>>> I also ran your benchmark .config, and these are my numbers:
>>>> https://dl.dropbox.com/u/308392/summary.png
>>>>
>>>> As you can see, even so, I'm still far from your results, not even
>>>> close, and now I'm using Bitcask.
>>>> Anyway, my current position is much better than at the beginning. I'll
>>>> double-check everything all over again, because now I have confirmation
>>>> that there is something wrong.
>>>>
>>>> If you have any suggestions, please let me know.
>>>> Once again, thank you.
>>>>
>>>> On Sat, Nov 3, 2012 at 3:08 AM, Jared Morrow <[email protected]> wrote:
>>>>
>>>>> I forgot to mention that 2000 ops/sec was on bitcask, not memory. I
>>>>> didn't bother with the memory backend.
>>>>>
>>>>> -Jared
>>>>>
>>>>>
>>>>> On Sat, Nov 3, 2012 at 12:05 AM, Jared Morrow <[email protected]> wrote:
>>>>>
>>>>>> Uruka,
>>>>>>
>>>>>> So looking at your results, something is really wrong with your setup.
>>>>>> I was surprised by your numbers, so I made two VMs, each with only 1 GB
>>>>>> of RAM, on two different boxes, also on a 1 Gb switch.
>>>>>>
>>>>>> I ran a put of 100,000 keys at 10 KB in size.
>>>>>>
>>>>>> I didn't do any tuning at all on the VMs, and these were quick Ubuntu
>>>>>> 10.04 VMs with 2 virtual CPUs and 1 GB of RAM. I also didn't change any
>>>>>> settings in Riak, except for the IP address and listening ports.
>>>>>>
>>>>>> Here is the summary of the results, showing around 2000 ops/sec:
>>>>>> https://dl.dropbox.com/u/183971/summary.png
>>>>>>
>>>>>> So my main thought is that you weren't actually using N=1 for your
>>>>>> puts and you were using the default N value of 3, meaning you were
>>>>>> writing each key/value 3 times, and with 2 nodes this is doing a lot of
>>>>>> writes to the same disk multiple times.
>>>>>>
>>>>>> To be sure you have N=1, you can use 'riak attach' on each node and
>>>>>> enter the following command:
>>>>>>
>>>>>> riak_core_bucket:set_bucket(<<"pop1">>,[{n_val,1}]).
>>>>>>
>>>>>>
>>>>>> That is, if your bucket name is "pop1", as in my case. That name is
>>>>>> completely arbitrary.
>>>>>>
>>>>>> Sorry I'm late to this thread, I had to find some time to set up the
>>>>>> test.
>>>>>>
>>>>>> For reference, I used https://github.com/basho/basho_bench for the
>>>>>> benchmark, with the following .config file:
>>>>>> https://gist.github.com/e630b63f4a025a0fb634
>>>>>>
>>>>>> Hope this helps,
>>>>>> Jared
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
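For anyone following along: the N=1 check Jared describes could probably also be done over HTTP instead of through 'riak attach'. The sketch below is not from the thread. It assumes Riak's HTTP bucket-properties resource at /buckets/<bucket>/props, reachable on the HTTP port 8098 from the app.config quoted above, and all function names are made up for illustration. It also works out how many bytes the 100,000 x 10 KB run stores cluster-wide for a given N.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical helpers: the names are illustrative, not part of any Riak client.

def props_url(host, port, bucket):
    """URL of the bucket-properties resource (assumed: /buckets/<bucket>/props)."""
    return "http://%s:%d/buckets/%s/props" % (
        host, port, urllib.parse.quote(bucket, safe=""))

def n_val_body(n_val):
    """JSON request body asking Riak to set n_val on a bucket."""
    return json.dumps({"props": {"n_val": n_val}}).encode("utf-8")

def bytes_written(keys, value_bytes, n_val):
    """Value bytes stored cluster-wide: each put is written n_val times."""
    return keys * value_bytes * n_val

def get_n_val(host, port, bucket):
    """Read the bucket's current n_val from a live node (needs network access)."""
    with urllib.request.urlopen(props_url(host, port, bucket)) as resp:
        return json.loads(resp.read().decode("utf-8"))["props"]["n_val"]

def set_n_val(host, port, bucket, n_val):
    """PUT a new n_val to a live node (needs network access)."""
    req = urllib.request.Request(
        props_url(host, port, bucket),
        data=n_val_body(n_val),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    urllib.request.urlopen(req).close()

# The benchmark from this thread: 100,000 keys of 10 KB each.
for n in (1, 3):
    gib = bytes_written(100000, 10 * 1024, n) / float(1024 ** 3)
    print("N=%d -> %.2f GiB of values written" % (n, gib))
```

Against a running node, get_n_val("10.1.1.221", 8098, "pop1") would report the value actually in effect before a benchmark run. The arithmetic at the end is the point Jared makes: the same run stores about 1 GB of values at N=1 and about 3 GB at N=3, and with only two nodes some of those replicas have to share a disk.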
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
