Jared,

Again, thank you very much.
You helped me a lot.

I understand your point perfectly. I'm just getting to know Riak and I
want to go much deeper. But before I keep going, I want to make sure that I'm
starting off on the right foot :)
I double/triple-checked and I still have no additional clues about what is
happening.

You got much better results than mine using your default settings,
and, given my numbers, I'm still missing something. I would like at least
to get closer to your results. If you think I can't do any better
than this with my default settings, please let me know.
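
To check that I understood the N value point from your earlier message (quoted
below), here is a rough Python sketch of the arithmetic. It assumes replicas
are spread evenly across the nodes, which is a simplification of how Riak
places them, and the function name is only mine for illustration.

```python
def replica_writes_per_node(client_ops_per_sec, n_val, node_count):
    """Estimate the backend writes per second that each node absorbs.

    Assumption (not from Riak itself): the n_val replicas of every
    client write are distributed evenly over node_count nodes.
    """
    if client_ops_per_sec < 0:
        raise ValueError("client_ops_per_sec must be >= 0")
    if n_val <= 0 or node_count <= 0:
        raise ValueError("n_val and node_count must be positive")
    # Every client write turns into n_val replica writes in total,
    # shared across the nodes of the cluster.
    return client_ops_per_sec * n_val / node_count
```

Under that assumption, 800 client writes/sec on two nodes would mean about
1200 disk writes/sec per node with N=3, against about 400 with N=1.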

Anyway, this is my app.config:

%% -*- mode: erlang;erlang-indent-level: 4;indent-tabs-mode: nil -*-
%% ex: ft=erlang ts=4 sw=4 et
[
 %% Riak Client APIs config
 {riak_api, [
            %% pb_backlog is the maximum length to which the queue of pending
            %% connections may grow. If set, it must be an integer >= 0.
            %% By default the value is 5. If you anticipate a huge number of
            %% connections being initialised *simultaneously*, set this number
            %% higher.
            %% {pb_backlog, 64},

            %% pb_ip is the IP address that the Riak Protocol Buffers interface
            %% will bind to.  If this is undefined, the interface will not run.
            {pb_ip,   "10.1.1.221" },

            %% pb_port is the TCP port that the Riak Protocol Buffers interface
            %% will bind to
            {pb_port, 8087 }
            ]},

 %% Riak Core config
 {riak_core, [
              %% Default location of ringstate
              {ring_state_dir, "/var/lib/riak/ring"},

              %% Default ring creation size.  Make sure it is a power of 2,
              %% e.g. 16, 32, 64, 128, 256, 512 etc
              %{ring_creation_size, 64},

              %% http is a list of IP addresses and TCP ports that the Riak
              %% HTTP interface will bind.
              {http, [ {"10.1.1.221", 8098 } ]},

              %% https is a list of IP addresses and TCP ports that the Riak
              %% HTTPS interface will bind.
              {https, [{ "10.1.1.221", 8069 }]},

              %% Default cert and key locations for https can be overridden
              %% with the ssl config variable, for example:
              {ssl, [
                     {certfile, "/etc/riak/server.crt"},
                     {keyfile, "/etc/riak/server.key"}
                    ]},

              %% riak_handoff_port is the TCP port that Riak uses for
              %% intra-cluster data handoff.
              {handoff_port, 8099 },

              %% To encrypt riak_core intra-cluster data handoff traffic,
              %% uncomment the following line and edit its path to an
              %% appropriate certfile and keyfile.  (This example uses a
              %% single file with both items concatenated together.)
              %{handoff_ssl_options, [{certfile, "/tmp/erlserver.pem"}]},

              %% DTrace support
              %% Do not enable 'dtrace_support' unless your Erlang/OTP
              %% runtime is compiled to support DTrace.  DTrace is
              %% available in R15B01 (supported by the Erlang/OTP
              %% official source package) and in R14B04 via a custom
              %% source repository & branch.
              {dtrace_support, false},

              %% Platform-specific installation paths (substituted by rebar)
              {platform_bin_dir, "/usr/sbin"},
              {platform_data_dir, "/var/lib/riak"},
              {platform_etc_dir, "/etc/riak"},
              {platform_lib_dir, "/usr/lib/riak/lib"},
              {platform_log_dir, "/var/log/riak"}
             ]},

 %% Riak KV config
 {riak_kv, [
            %% Storage_backend specifies the Erlang module defining the storage
            %% mechanism that will be used on this node.
            %{storage_backend, riak_kv_memory_backend},
            {storage_backend, riak_kv_bitcask_backend},
            %{storage_backend, riak_kv_eleveldb_backend},

            %% raw_name is the first part of all URLS used by the Riak raw HTTP
            %% interface.  See riak_web.erl and raw_http_resource.erl for
            %% details.
            %{raw_name, "riak"},

            %% mapred_name is URL used to submit map/reduce requests to Riak.
            {mapred_name, "mapred"},

            %% mapred_system indicates which version of the MapReduce
            %% system should be used: 'pipe' means riak_pipe will
            %% power MapReduce queries, while 'legacy' means that luke
            %% will be used
            {mapred_system, pipe},

            %% mapred_2i_pipe indicates whether secondary-index
            %% MapReduce inputs are queued in parallel via their own
            %% pipe ('true'), or serially via a helper process
            %% ('false' or undefined).  Set to 'false' or leave
            %% undefined during a rolling upgrade from 1.0.
            {mapred_2i_pipe, true},

            %% directory used to store a transient queue for pending
            %% map tasks
            %% Only valid when mapred_system == legacy
            %% {mapred_queue_dir, "/var/lib/riak/mr_queue" },

            %% Each of the following entries control how many Javascript
            %% virtual machines are available for executing map, reduce,
            %% pre- and post-commit hook functions.
            {map_js_vm_count, 8 },
            {reduce_js_vm_count, 6 },
            {hook_js_vm_count, 2 },

            %% Number of items the mapper will fetch in one request.
            %% Larger values can impact read/write performance for
            %% non-MapReduce requests.
            %% Only valid when mapred_system == legacy
            %% {mapper_batch_size, 5},

            %% js_max_vm_mem is the maximum amount of memory, in megabytes,
            %% allocated to the Javascript VMs. If unset, the default is
            %% 8MB.
            {js_max_vm_mem, 8},

            %% js_thread_stack is the maximum amount of thread stack, in megabytes,
            %% allocated to the Javascript VMs. If unset, the default is 16MB.
            %% NOTE: This is not the same as the C thread stack.
            {js_thread_stack, 16},

            %% Number of objects held in the MapReduce cache. These will be
            %% ejected when the cache runs out of room or the bucket/key
            %% pair for that entry changes
            %% Only valid when mapred_system == legacy
            %% {map_cache_size, 10000},

            %% js_source_dir should point to a directory containing Javascript
            %% source files which will be loaded by Riak when it initializes
            %% Javascript VMs.
            %{js_source_dir, "/tmp/js_source"},

            %% http_url_encoding determines how Riak treats URL encoded
            %% buckets, keys, and links over the REST API. When set to 'on'
            %% Riak always decodes encoded values sent as URLs and Headers.
            %% Otherwise, Riak defaults to compatibility mode where links
            %% are decoded, but buckets and keys are not. The compatibility
            %% mode will be removed in a future release.
            {http_url_encoding, on},

            %% Switch to vnode-based vclocks rather than client ids.  This
            %% significantly reduces the number of vclock entries.
            %% Only set true if *all* nodes in the cluster are upgraded to 1.0
            {vnode_vclocks, true},

            %% This option enables compatibility of bucket and key listing
            %% with 0.14 and earlier versions. Once a rolling upgrade to
            %% a version > 0.14 is completed for a cluster, this should be
            %% set to false for improved performance for bucket and key
            %% listing operations.
            {legacy_keylisting, false},

            %% This option toggles compatibility of keylisting with 1.0
            %% and earlier versions.  Once a rolling upgrade to a version
            %% > 1.0 is completed for a cluster, this should be set to
            %% true for better control of memory usage during key listing
            %% operations
            {listkeys_backpressure, true}
           ]},

 %% Riak Search Config
 {riak_search, [
                %% To enable Search functionality set this 'true'.
                {enabled, false}
               ]},

 %% Merge Index Config
 {merge_index, [
                %% The root dir to store search merge_index data
                {data_root, "/var/lib/riak/merge_index"},

                %% Size, in bytes, of the in-memory buffer.  When this
                %% threshold has been reached the data is transformed
                %% into a segment file which resides on disk.
                {buffer_rollover_size, 1048576},

                %% Over time the segment files need to be compacted.
                %% This is the maximum number of segments that will be
                %% compacted at once.  A lower value will lead to
                %% quicker but more frequent compactions.
                {max_compact_segments, 20}
               ]},

 %% Bitcask Config
 {bitcask, [
             {data_root, "/var/lib/riak/bitcask"}
           ]},

 %% eLevelDB Config
 {eleveldb, [
             {data_root, "/var/lib/riak/leveldb"},
             {write_buffer_size_min, 31457280}, %% 30 MB in bytes
             {write_buffer_size_max, 62914560} %% 60 MB in bytes
            ]},

 %% Lager Config
 {lager, [
            %% What handlers to install with what arguments
            %% The defaults for the logfiles are to rotate the files when
            %% they reach 10Mb or at midnight, whichever comes first, and keep
            %% the last 5 rotations. See the lager README for a description of
            %% the time rotation format:
            %% https://github.com/basho/lager/blob/master/README.org
            %%
            %% If you wish to disable rotation, you can either set the size to 0
            %% and the rotation time to "", or instead specify a 2-tuple that only
            %% consists of {Logfile, Level}.
            {handlers, [
                {lager_console_backend, info},
                {lager_file_backend, [
                    {"/var/log/riak/error.log", error, 10485760, "$D0", 5},
                    {"/var/log/riak/console.log", info, 10485760, "$D0", 5}
                ]}
            ]},

            %% Whether to write a crash log, and where.
            %% Commented/omitted/undefined means no crash logger.
            {crash_log, "/var/log/riak/crash.log"},

            %% Maximum size in bytes of events in the crash log - defaults to 65536
            {crash_log_msg_size, 65536},

            %% Maximum size of the crash log in bytes, before it's rotated, set
            %% to 0 to disable rotation - default is 0
            {crash_log_size, 10485760},

            %% What time to rotate the crash log - default is no time
            %% rotation. See the lager README for a description of this format:
            %% https://github.com/basho/lager/blob/master/README.org
            {crash_log_date, "$D0"},

            %% Number of rotated crash logs to keep, 0 means keep only the
            %% current one - default is 0
            {crash_log_count, 5},

            %% Whether to redirect error_logger messages into lager - defaults to true
            {error_logger_redirect, true}
        ]},

 %% riak_sysmon config
 {riak_sysmon, [
         %% To disable forwarding events of a particular type, use a
         %% limit of 0.
         {process_limit, 30},
         {port_limit, 2},

         %% Finding reasonable limits for a given workload is a matter
         %% of experimentation.
         {gc_ms_limit, 100},
         {heap_word_limit, 40111000},

         %% Configure the following items to 'false' to disable logging
         %% of that event type.
         {busy_port, true},
         {busy_dist_port, true}
        ]},

 %% SASL config
 {sasl, [
         {sasl_error_logger, false}
        ]},

 %% riak_control config
 {riak_control, [
                %% Set to false to disable the admin panel.
                {enabled, true},

                %% Authentication style used for access to the admin
                %% panel. Valid styles are 'userlist' <TODO>.
                {auth, none},

                %% If auth is set to 'userlist' then this is the
                %% list of usernames and passwords for access to the
                %% admin panel.
                {userlist, [{"user", "pass"}
                           ]},

                %% The admin panel is broken up into multiple
                %% components, each of which is enabled or disabled
                %% by one of these settings.
                {admin, true}
                ]}
].

------
I have two machines with those settings: 10.1.1.221 and 10.1.1.222. They
are working together.
Do you see any problem with that?
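
In case it helps to rule out the N value on both machines, this is a minimal
Python sketch for reading a bucket's n_val over the HTTP interface. I am
assuming that the /buckets/<bucket>/props resource is available on this Riak
version and that the bucket is called "pop1" as in your example; the helper
names are made up for illustration.

```python
import json
from urllib.request import urlopen

NODES = ["10.1.1.221", "10.1.1.222"]  # the two machines in this message
HTTP_PORT = 8098                       # from the {http, ...} entry above
BUCKET = "pop1"                        # assumed bucket name, as in Jared's example


def props_url(host, bucket, port=HTTP_PORT):
    """Build the (assumed) bucket-properties URL for one node."""
    return "http://%s:%d/buckets/%s/props" % (host, port, bucket)


def n_val_from_body(body):
    """Extract n_val from a JSON body shaped like {"props": {"n_val": 3}}."""
    return json.loads(body)["props"]["n_val"]


def fetch_n_val(host, bucket=BUCKET, timeout=5):
    """Ask one node over HTTP for the bucket's n_val."""
    with urlopen(props_url(host, bucket), timeout=timeout) as resp:
        return n_val_from_body(resp.read().decode("utf-8"))
```

Calling fetch_n_val(host) for each entry in NODES should report the same
value on both machines; anything other than 1 would mean the benchmark is
not really running with N=1.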

Again, if you think I can't go any further with those default settings
(without tuning the file system, etc.), please let me know.

Thank you.

On Sat, Nov 3, 2012 at 4:43 PM, Jared Morrow <[email protected]> wrote:

> Uruka,
>
> Now that you got some somewhat reasonable numbers, it is probably time to
> discuss what you are trying to get out of Riak.  We typically recommend 4
> or 5 nodes minimum for a Riak install because that is the point where the
> distribution becomes a performance benefit rather than a hindrance.  I know
> you were just load testing, but I'd recommend considering a test with 4 or
> 5 nodes, with default N values.  During the test, remove a node (power it
> off, or 'riak stop' it).  Or like someone else mentioned start with a 3 or
> 4 node cluster and add a node to see how the performance goes up and no
> further operations work is needed to rebalance the data around the cluster.
>  This is really where Riak shines over some alternative databases: the ease
> of scaling and dealing with failures.  Single-node performance, although fun
> to try and tune to get the most out of it, isn't as interesting on a long
> timeline when trying to scale the system.  Obviously single-node
> performance is still important, don't get me wrong.  Riak isn't always the
> best choice, but when it comes to staying available and performing while
> systems are failing, no other system has a better real-world story than Riak.
>
> If you still want to get your single node performance up, we have several
> pages on our docs page based around tuning.  A good place to start is the
> file system tuning page
> http://docs.basho.com/riak/latest/cookbooks/File-System-Tuning/ .
>  Reading that and other pages in the Operations section might be helpful in
> squeezing out those last bits of speed.
>
> I am glad to see your initial 60 writes/sec has gone up to 800 writes/sec,
> but we definitely can do better once you start utilizing our strengths.
>
> Hope my rambling helped,
> -Jared
>
> On Sat, Nov 3, 2012 at 4:55 AM, Uruka Dark <[email protected]> wrote:
>
>> Jared,
>>
>> Thank you for your time and reply.
>>
>> I was impressed by your numbers, so I started to double-check my
>> settings. I found a big problem here: my third machine (the one outside the
>> cluster, generating the load) was not talking to Riak at gigabit speed; it was
>> running at 100 Mb/s. I changed the network cable and it's working fine now.
>> I ran my Python script again and could already see better results: 252
>> ops/sec (before the fix it was 175 ops/sec).
>>
>> I also ran your benchmark .config, and these are my numbers:
>> https://dl.dropbox.com/u/308392/summary.png
>>
>> As you can see, even so, I'm still far from your results, not even
>> close, and now I'm using Bitcask.
>> Anyway, my current position is much better than at the beginning. I'll
>> double-check everything all over again, because now I have confirmation that
>> something is wrong.
>>
>> If you have any suggestion, please, let me know.
>> Once again, thank you.
>>
>> On Sat, Nov 3, 2012 at 3:08 AM, Jared Morrow <[email protected]> wrote:
>>
>>> I forgot to mention that 2000 ops/sec was on bitcask, not memory.  I
>>> didn't bother with the memory backend.
>>>
>>> -Jared
>>>
>>>
>>> On Sat, Nov 3, 2012 at 12:05 AM, Jared Morrow <[email protected]> wrote:
>>>
>>>> Uruka,
>>>>
>>>> So looking at your results something is really wrong with your setup.
>>>>  I was surprised by your numbers, so I made two VM's each with only 1gb of
>>>> RAM on two different boxes also on a 1gb switch.
>>>>
>>>> I ran a put of 100,000 keys at 10kb in size.
>>>>
>>>> I didn't do any tuning at all on the VM's and these were quick Ubuntu
>>>> 10.04 VM's with 2 virtual CPU's and 1 gig of ram.  I also didn't change any
>>>> settings in Riak, except for the IP address and listening ports.
>>>>
>>>> Here is the summary of the results showing around 2000 ops/sec
>>>> https://dl.dropbox.com/u/183971/summary.png
>>>>
>>>> So my main thought is that you weren't actually using N=1 for your puts
>>>> and you were using the default N value of 3, meaning you were writing each
>>>> key/value 3 times, and with 2 nodes this is doing a lot of writes to the
>>>> same disk multiple times.
>>>>
>>>> To be sure you have N=1, you can use 'riak attach' on each node and
>>>> enter the following command:
>>>>
>>>> riak_core_bucket:set_bucket(<<"pop1">>,[{n_val,1}]).
>>>>
>>>>
>>>> if your bucket name is "pop1", as in my case.  That name is completely
>>>> arbitrary.
>>>>
>>>> Sorry I'm late to this thread, I had to find some time to setup the
>>>> test.
>>>>
>>>> For reference I used https://github.com/basho/basho_bench for the
>>>> benchmark.  With the following .config file
>>>> https://gist.github.com/e630b63f4a025a0fb634
>>>>
>>>> Hope this helps,
>>>> Jared
>>>>
>>>>
>>>
>>
>
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
