Sorry, I forgot to add your results, using bitcask:
https://dl.dropbox.com/u/183971/summary.png

On Mon, Nov 5, 2012 at 1:17 PM, Uruka Dark <[email protected]> wrote:

> Just an update.
>
> I ran the benchmark again, this time using the Memory backend:
> https://dl.dropbox.com/u/308392/memory_summary.png
>
> This was the result using Bitcask backend:
> https://dl.dropbox.com/u/308392/bitcask_summary.png
>
> The difference is not that big in my environment. I was expecting much
> better results, but I don't know whether that is expected.
> Anyway, your results are still much better, even when I use the
> memory-only backend (I get about 50% of yours).
>
> Maybe this can help in understanding what is happening.
>
> On Sat, Nov 3, 2012 at 7:31 PM, Uruka Dark <[email protected]> wrote:
>
>> Jared,
>>
>> Again, thank you very much.
>> You helped me a lot.
>>
>> I perfectly understand your point. I'm just getting to know Riak and I
>> want to go much deeper. But before I keep going, I want to make sure that
>> I'm starting off on the right foot :)
>> I double- and triple-checked, and I still have no additional clues about
>> what is happening.
>>
>> You've reached much better results than mine using your default settings,
>> and, given my numbers, I'm still missing something. I would like at least
>> to get closer to your results. If you think that I won't do any better
>> than this with my default settings, please let me know.
>>
>> Anyway, this is my app.config:
>>
>> %% -*- mode: erlang;erlang-indent-level: 4;indent-tabs-mode: nil -*-
>> %% ex: ft=erlang ts=4 sw=4 et
>> [
>>  %% Riak Client APIs config
>>  {riak_api, [
>>             %% pb_backlog is the maximum length to which the queue of pending
>>             %% connections may grow. If set, it must be an integer >= 0.
>>             %% By default the value is 5. If you anticipate a huge number of
>>             %% connections being initialised *simultaneously*, set this number
>>             %% higher.
>>             %% {pb_backlog, 64},
>>
>>             %% pb_ip is the IP address that the Riak Protocol Buffers interface
>>             %% will bind to.  If this is undefined, the interface will not run.
>>             {pb_ip,   "10.1.1.221" },
>>
>>             %% pb_port is the TCP port that the Riak Protocol Buffers interface
>>             %% will bind to
>>             {pb_port, 8087 }
>>             ]},
>>
>>  %% Riak Core config
>>  {riak_core, [
>>               %% Default location of ringstate
>>               {ring_state_dir, "/var/lib/riak/ring"},
>>
>>               %% Default ring creation size.  Make sure it is a power of 2,
>>               %% e.g. 16, 32, 64, 128, 256, 512 etc
>>               %{ring_creation_size, 64},
>>
>>               %% http is a list of IP addresses and TCP ports that the Riak
>>               %% HTTP interface will bind.
>>               {http, [ {"10.1.1.221", 8098 } ]},
>>
>>               %% https is a list of IP addresses and TCP ports that the Riak
>>               %% HTTPS interface will bind.
>>               {https, [{ "10.1.1.221", 8069 }]},
>>
>>               %% Default cert and key locations for https can be overridden
>>               %% with the ssl config variable, for example:
>>               {ssl, [
>>                      {certfile, "/etc/riak/server.crt"},
>>                      {keyfile, "/etc/riak/server.key"}
>>                     ]},
>>
>>               %% riak_handoff_port is the TCP port that Riak uses for
>>               %% intra-cluster data handoff.
>>               {handoff_port, 8099 },
>>
>>               %% To encrypt riak_core intra-cluster data handoff traffic,
>>               %% uncomment the following line and edit its path to an
>>               %% appropriate certfile and keyfile.  (This example uses a
>>               %% single file with both items concatenated together.)
>>               %{handoff_ssl_options, [{certfile, "/tmp/erlserver.pem"}]},
>>
>>               %% DTrace support
>>               %% Do not enable 'dtrace_support' unless your Erlang/OTP
>>               %% runtime is compiled to support DTrace.  DTrace is
>>               %% available in R15B01 (supported by the Erlang/OTP
>>               %% official source package) and in R14B04 via a custom
>>               %% source repository & branch.
>>               {dtrace_support, false},
>>
>>               %% Platform-specific installation paths (substituted by rebar)
>>               {platform_bin_dir, "/usr/sbin"},
>>               {platform_data_dir, "/var/lib/riak"},
>>               {platform_etc_dir, "/etc/riak"},
>>               {platform_lib_dir, "/usr/lib/riak/lib"},
>>               {platform_log_dir, "/var/log/riak"}
>>              ]},
>>
>>  %% Riak KV config
>>  {riak_kv, [
>>             %% Storage_backend specifies the Erlang module defining the storage
>>             %% mechanism that will be used on this node.
>>             %{storage_backend, riak_kv_memory_backend},
>>             {storage_backend, riak_kv_bitcask_backend},
>>             %{storage_backend, riak_kv_eleveldb_backend},
>>
>>             %% raw_name is the first part of all URLS used by the Riak raw HTTP
>>             %% interface.  See riak_web.erl and raw_http_resource.erl for
>>             %% details.
>>             %{raw_name, "riak"},
>>
>>             %% mapred_name is URL used to submit map/reduce requests to Riak.
>>             {mapred_name, "mapred"},
>>
>>             %% mapred_system indicates which version of the MapReduce
>>             %% system should be used: 'pipe' means riak_pipe will
>>             %% power MapReduce queries, while 'legacy' means that luke
>>             %% will be used
>>             {mapred_system, pipe},
>>
>>             %% mapred_2i_pipe indicates whether secondary-index
>>             %% MapReduce inputs are queued in parallel via their own
>>             %% pipe ('true'), or serially via a helper process
>>             %% ('false' or undefined).  Set to 'false' or leave
>>             %% undefined during a rolling upgrade from 1.0.
>>             {mapred_2i_pipe, true},
>>
>>             %% directory used to store a transient queue for pending
>>             %% map tasks
>>             %% Only valid when mapred_system == legacy
>>             %% {mapred_queue_dir, "/var/lib/riak/mr_queue" },
>>
>>             %% Each of the following entries control how many Javascript
>>             %% virtual machines are available for executing map, reduce,
>>             %% pre- and post-commit hook functions.
>>             {map_js_vm_count, 8 },
>>             {reduce_js_vm_count, 6 },
>>             {hook_js_vm_count, 2 },
>>
>>             %% Number of items the mapper will fetch in one request.
>>             %% Larger values can impact read/write performance for
>>             %% non-MapReduce requests.
>>             %% Only valid when mapred_system == legacy
>>             %% {mapper_batch_size, 5},
>>
>>             %% js_max_vm_mem is the maximum amount of memory, in megabytes,
>>             %% allocated to the Javascript VMs. If unset, the default is
>>             %% 8MB.
>>             {js_max_vm_mem, 8},
>>
>>             %% js_thread_stack is the maximum amount of thread stack, in megabytes,
>>             %% allocated to the Javascript VMs. If unset, the default is 16MB.
>>             %% NOTE: This is not the same as the C thread stack.
>>             {js_thread_stack, 16},
>>
>>             %% Number of objects held in the MapReduce cache. These will be
>>             %% ejected when the cache runs out of room or the bucket/key
>>             %% pair for that entry changes
>>             %% Only valid when mapred_system == legacy
>>             %% {map_cache_size, 10000},
>>
>>             %% js_source_dir should point to a directory containing Javascript
>>             %% source files which will be loaded by Riak when it initializes
>>             %% Javascript VMs.
>>             %{js_source_dir, "/tmp/js_source"},
>>
>>             %% http_url_encoding determines how Riak treats URL encoded
>>             %% buckets, keys, and links over the REST API. When set to 'on'
>>             %% Riak always decodes encoded values sent as URLs and Headers.
>>             %% Otherwise, Riak defaults to compatibility mode where links
>>             %% are decoded, but buckets and keys are not. The compatibility
>>             %% mode will be removed in a future release.
>>             {http_url_encoding, on},
>>
>>             %% Switch to vnode-based vclocks rather than client ids.  This
>>             %% significantly reduces the number of vclock entries.
>>             %% Only set true if *all* nodes in the cluster are upgraded to 1.0
>>             {vnode_vclocks, true},
>>
>>             %% This option enables compatibility of bucket and key listing
>>             %% with 0.14 and earlier versions. Once a rolling upgrade to
>>             %% a version > 0.14 is completed for a cluster, this should be
>>             %% set to false for improved performance for bucket and key
>>             %% listing operations.
>>             {legacy_keylisting, false},
>>
>>             %% This option toggles compatibility of keylisting with 1.0
>>             %% and earlier versions.  Once a rolling upgrade to a version
>>             %% > 1.0 is completed for a cluster, this should be set to
>>             %% true for better control of memory usage during key listing
>>             %% operations
>>             {listkeys_backpressure, true}
>>            ]},
>>
>>  %% Riak Search Config
>>  {riak_search, [
>>                 %% To enable Search functionality set this 'true'.
>>                 {enabled, false}
>>                ]},
>>
>>  %% Merge Index Config
>>  {merge_index, [
>>                 %% The root dir to store search merge_index data
>>                 {data_root, "/var/lib/riak/merge_index"},
>>
>>                 %% Size, in bytes, of the in-memory buffer.  When this
>>                 %% threshold has been reached the data is transformed
>>                 %% into a segment file which resides on disk.
>>                 {buffer_rollover_size, 1048576},
>>
>>                 %% Over time the segment files need to be compacted.
>>                 %% This is the maximum number of segments that will be
>>                 %% compacted at once.  A lower value will lead to
>>                 %% quicker but more frequent compactions.
>>                 {max_compact_segments, 20}
>>                ]},
>>
>>  %% Bitcask Config
>>  {bitcask, [
>>              {data_root, "/var/lib/riak/bitcask"}
>>            ]},
>>
>>  %% eLevelDB Config
>>  {eleveldb, [
>>              {data_root, "/var/lib/riak/leveldb"},
>>              {write_buffer_size_min, 31457280}, %% 30 MB in bytes
>>              {write_buffer_size_max, 62914560} %% 60 MB in bytes
>>             ]},
>>
>>  %% Lager Config
>>  {lager, [
>>             %% What handlers to install with what arguments
>>             %% The defaults for the logfiles are to rotate the files when
>>             %% they reach 10Mb or at midnight, whichever comes first, and keep
>>             %% the last 5 rotations. See the lager README for a description of
>>             %% the time rotation format:
>>             %% https://github.com/basho/lager/blob/master/README.org
>>             %%
>>             %% If you wish to disable rotation, you can either set the size to 0
>>             %% and the rotation time to "", or instead specify a 2-tuple that only
>>             %% consists of {Logfile, Level}.
>>             {handlers, [
>>                 {lager_console_backend, info},
>>                 {lager_file_backend, [
>>                     {"/var/log/riak/error.log", error, 10485760, "$D0", 5},
>>                     {"/var/log/riak/console.log", info, 10485760, "$D0", 5}
>>                 ]}
>>             ]},
>>
>>             %% Whether to write a crash log, and where.
>>             %% Commented/omitted/undefined means no crash logger.
>>             {crash_log, "/var/log/riak/crash.log"},
>>
>>             %% Maximum size in bytes of events in the crash log - defaults to 65536
>>             {crash_log_msg_size, 65536},
>>
>>             %% Maximum size of the crash log in bytes, before it's rotated, set
>>             %% to 0 to disable rotation - default is 0
>>             {crash_log_size, 10485760},
>>
>>             %% What time to rotate the crash log - default is no time
>>             %% rotation. See the lager README for a description of this format:
>>             %% https://github.com/basho/lager/blob/master/README.org
>>             {crash_log_date, "$D0"},
>>
>>             %% Number of rotated crash logs to keep, 0 means keep only the
>>             %% current one - default is 0
>>             {crash_log_count, 5},
>>
>>             %% Whether to redirect error_logger messages into lager - defaults to true
>>             {error_logger_redirect, true}
>>         ]},
>>
>>  %% riak_sysmon config
>>  {riak_sysmon, [
>>          %% To disable forwarding events of a particular type, use a
>>          %% limit of 0.
>>          {process_limit, 30},
>>          {port_limit, 2},
>>
>>          %% Finding reasonable limits for a given workload is a matter
>>          %% of experimentation.
>>          {gc_ms_limit, 100},
>>          {heap_word_limit, 40111000},
>>
>>          %% Configure the following items to 'false' to disable logging
>>          %% of that event type.
>>          {busy_port, true},
>>          {busy_dist_port, true}
>>         ]},
>>
>>  %% SASL config
>>  {sasl, [
>>          {sasl_error_logger, false}
>>         ]},
>>
>>  %% riak_control config
>>  {riak_control, [
>>                 %% Set to false to disable the admin panel.
>>                 {enabled, true},
>>
>>                 %% Authentication style used for access to the admin
>>                 %% panel. Valid styles are 'userlist' <TODO>.
>>                 {auth, none},
>>
>>                 %% If auth is set to 'userlist' then this is the
>>                 %% list of usernames and passwords for access to the
>>                 %% admin panel.
>>                 {userlist, [{"user", "pass"}
>>                            ]},
>>
>>                 %% The admin panel is broken up into multiple
>>                 %% components, each of which is enabled or disabled
>>                 %% by one of these settings.
>>                  {admin, true}
>>                 ]}
>> ].
>>
>> ------
>> I have two machines with those settings: 10.1.1.221 and 10.1.1.222. They
>> are working together.
>> Do you see any problem with that?
>>
>> Again, if you think I can't go any further with those default settings
>> (without tuning FS, etc), please, let me know.
>>
>> Thank you.
>>
>> On Sat, Nov 3, 2012 at 4:43 PM, Jared Morrow <[email protected]> wrote:
>>
>>> Uruka,
>>>
>>> Now that you have somewhat reasonable numbers, it is probably time
>>> to discuss what you are trying to get out of Riak.  We typically recommend
>>> 4 or 5 nodes minimum for a Riak install because that is the point where the
>>> distribution becomes a performance benefit rather than a hindrance.  I know
>>> you were just load testing, but I'd recommend considering a test with 4 or
>>> 5 nodes, with default N values.  During the test, remove a node (power it
>>> off, or 'riak stop' it).  Or, as someone else mentioned, start with a 3 or
>>> 4 node cluster and add a node to see how the performance goes up while no
>>> further operations work is needed to rebalance the data around the cluster.
>>> This is really where Riak shines over some alternative databases: the ease
>>> of scaling and dealing with failures.  Single node performance, although fun
>>> to try and tune to get the most out of it, isn't as interesting on a long
>>> timeline when trying to scale the system.  Obviously single node
>>> performance is still important, don't get me wrong.  Riak isn't always the
>>> best choice, but when it comes to staying available and performant while
>>> systems are failing, no other system has a better real-world story than Riak.
>>>
>>> If you still want to get your single node performance up, we have
>>> several pages in our docs about tuning.  A good place to start
>>> is the file system tuning page
>>> http://docs.basho.com/riak/latest/cookbooks/File-System-Tuning/ .
>>> Reading that and other pages in the Operations section might be helpful in
>>> squeezing out those last bits of speed.
>>>
>>> I am glad to see your initial 60 writes/sec has gone up to 800
>>> writes/sec, but we definitely can do better once you start utilizing our
>>> strengths.
>>>
>>> Hope my rambling helped,
>>> -Jared
>>>
>>> On Sat, Nov 3, 2012 at 4:55 AM, Uruka Dark <[email protected]> wrote:
>>>
>>>> Jared,
>>>>
>>>> Thank you for your time and reply.
>>>>
>>>> I was impressed by your numbers and started to double-check my
>>>> settings. I found a big problem here: my third machine (the one outside the
>>>> cluster, generating the load) was not talking to Riak at gigabit speed; it
>>>> was running at 100 Mbps. I changed the network cable and it's working fine now.
>>>> I ran my Python script again and could already see better results:
>>>> 252 ops/sec (before the fix it was 175 ops/sec).
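
A rough way to see how much a 100 Mbps link can cap a put benchmark. This is only a sketch: it assumes 10 KiB values, as in Jared's basho_bench run (the object size used by the Python script is not stated in the thread), and it ignores protocol overhead, so real ceilings are lower. The function name is made up for illustration.

```python
def max_ops_per_sec(link_mbps, value_bytes):
    """Upper bound on puts/sec if the link carried nothing but payload bytes."""
    bytes_per_sec = link_mbps * 1_000_000 / 8
    return bytes_per_sec / value_bytes


VALUE_BYTES = 10 * 1024  # assumption: "10kb" read as 10 KiB

for mbps in (100, 1000):
    ceiling = max_ops_per_sec(mbps, VALUE_BYTES)
    print(f"{mbps} Mbps link: at most {ceiling:.0f} puts/sec of {VALUE_BYTES} bytes")
```

Under these assumptions, 2000 puts/sec of 10 KiB values needs about 164 Mbps of payload alone, so a load generator on a 100 Mbps link could not reach that figure regardless of Riak tuning.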
>>>>
>>>> I also ran your benchmark .config, and these are my numbers:
>>>> https://dl.dropbox.com/u/308392/summary.png
>>>>
>>>> As you can see, even so, I'm still far from your results, not even
>>>> close, and now I'm using Bitcask.
>>>> Anyway, my current position is much better than at the beginning. I'll
>>>> double-check everything again, because now I have confirmation that
>>>> something is wrong.
>>>>
>>>> If you have any suggestions, please let me know.
>>>> Once again, thank you.
>>>>
>>>> On Sat, Nov 3, 2012 at 3:08 AM, Jared Morrow <[email protected]> wrote:
>>>>
>>>>> I forgot to mention that 2000 ops/sec was on bitcask, not memory.  I
>>>>> didn't bother with the memory backend.
>>>>>
>>>>> -Jared
>>>>>
>>>>>
>>>>> On Sat, Nov 3, 2012 at 12:05 AM, Jared Morrow <[email protected]> wrote:
>>>>>
>>>>>> Uruka,
>>>>>>
>>>>>> So looking at your results, something is really wrong with your setup.
>>>>>> I was surprised by your numbers, so I made two VMs, each with only 1 GB
>>>>>> of RAM, on two different boxes, also on a gigabit switch.
>>>>>>
>>>>>> I ran a put of 100,000 keys at 10kb in size.
>>>>>>
>>>>>> I didn't do any tuning at all on the VMs, and these were quick Ubuntu
>>>>>> 10.04 VMs with 2 virtual CPUs and 1 GB of RAM.  I also didn't change any
>>>>>> settings in Riak, except for the IP address and listening ports.
>>>>>>
>>>>>> Here is the summary of the results showing around 2000 ops/sec
>>>>>> https://dl.dropbox.com/u/183971/summary.png
>>>>>>
>>>>>> So my main thought is that you weren't actually using N=1 for your
>>>>>> puts and were using the default N value of 3, meaning you were writing
>>>>>> each key/value 3 times, and with 2 nodes this means multiple writes of
>>>>>> the same data to the same disk.
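
To put numbers on the N=1 versus N=3 point, here is a minimal sketch of the arithmetic for the benchmark load described above (100,000 keys of 10kb). It counts payload bytes only and ignores per-object metadata and backend overhead; the 10 KiB reading of "10kb" and the function name are assumptions for illustration.

```python
def physical_writes(keys, value_bytes, n_val):
    """Return (replica writes, payload bytes) the cluster stores for a load."""
    replica_writes = keys * n_val
    return replica_writes, replica_writes * value_bytes


KEYS = 100_000
VALUE_BYTES = 10 * 1024  # assumption: "10kb" read as 10 KiB

for n_val in (1, 3):
    writes, total_bytes = physical_writes(KEYS, VALUE_BYTES, n_val)
    print(f"n_val={n_val}: {writes} replica writes, "
          f"{total_bytes / 2**20:.0f} MiB of payload")
```

With only two nodes there are three replicas to place on two machines, so at least one node has to hold two copies of every object, which is the repeated writing to the same disk mentioned above.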
>>>>>>
>>>>>> To be sure you have N=1, you can use 'riak attach' on each node and
>>>>>> enter the following command:
>>>>>>
>>>>>> riak_core_bucket:set_bucket(<<"pop1">>,[{n_val,1}]).
>>>>>>
>>>>>>
>>>>>> That is, if your bucket name is "pop1" as in my case.  The name is
>>>>>> completely arbitrary.
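
One way to double-check the setting from the load machine might be to read the bucket properties over HTTP. The sketch below assumes the Riak 1.x endpoint GET /buckets/<bucket>/props, which returns JSON of the form {"props": {"n_val": ...}}, and uses the HTTP listener from the app.config in this thread (10.1.1.221:8098). The helper names are made up for illustration; verify the endpoint against the docs for your Riak version.

```python
import json
import urllib.request


def n_val_from_props(body):
    """Extract n_val from a bucket-properties JSON document."""
    return json.loads(body)["props"]["n_val"]


def fetch_n_val(host, port, bucket):
    """Fetch a bucket's properties over HTTP and return its n_val."""
    # Assumed endpoint: /buckets/<bucket>/props (Riak 1.x HTTP API).
    url = f"http://{host}:{port}/buckets/{bucket}/props"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return n_val_from_props(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Offline demonstration with a hand-written sample response.
    sample = '{"props": {"name": "pop1", "n_val": 1}}'
    print("n_val =", n_val_from_props(sample))
```

Against a live node this would be called as fetch_n_val("10.1.1.221", 8098, "pop1"); anything other than 1 means the puts are still being replicated.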
>>>>>>
>>>>>> Sorry I'm late to this thread, I had to find some time to set up the
>>>>>> test.
>>>>>>
>>>>>> For reference, I used https://github.com/basho/basho_bench for the
>>>>>> benchmark, with the following .config file:
>>>>>> https://gist.github.com/e630b63f4a025a0fb634
>>>>>>
>>>>>> Hope this helps,
>>>>>> Jared
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
