Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-27 Thread Alexander Sicular
Awesome! Ya, Solr likes resources. If you're on 3 nodes now, consider adjusting your n_val from the default of 3 to 2. With the default ring_size of 64, an n_val of 3, and a cluster size of less than 5, you are not guaranteed to have all copies of your data on distinct physical nodes. Some nodes will receive 2

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-27 Thread Guillaume Boddaert
A little follow-up for you guys, since I went offline for quite some time. As suggested, it was a Solr performance issue; we were able to prove that my old 5 hosts were able to handle the load without Solr/Yokozuna. The fact was that my hosts lacked CPU, as well as RAM. Since Solr is pretty

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-04 Thread Matthew Von-Maszewski
Guillaume, Two points: 1. You can send the “riak debug” from one server and I will verify that 2.0.18 is indicated in the LOG file. 2. Your previous “riak debug” from server “riak1” indicated that only two CPU cores existed. We performance test with eight, twelve, and twenty-four core

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-04 Thread Guillaume Boddaert
Thanks, I've installed the new library as stated in the documentation using the 2.0.18 files. I was unable to find the vnode LOG file from the documentation, as my vnodes look like files, not directories. So I can't verify that I am running the proper version of the library after my riak restart.

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-03 Thread Matthew Von-Maszewski
Guillaume, A prebuilt eleveldb 2.0.18 for Debian 7 is found here: https://s3.amazonaws.com/downloads.basho.com/patches/eleveldb/2.0.18/eleveldb_2.0.18_debian7.tgz There are good instructions

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-03 Thread Fred Dushin
Hi Guillaume, From your bucket properties it looks like you are using search, and I assume that is search 2.0 (i.e., yokozuna), and not the legacy Riak Search. It is true that in the current 2.0 and 2.1 trunks the indexing into Solr via Yokozuna is synchronous with the vnode -- very long times

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-03 Thread Matthew Von-Maszewski
Guillaume, I have reviewed the debug package for your riak1 server. There are two potential areas of follow-up: 1. You are running our most recent Riak 2.1.4 which has eleveldb 2.0.17. We have seen a case where a recent feature in eleveldb 2.0.17 caused too much cache flushing, impacting

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-03 Thread Luke Bakken
Guillaume - You said earlier "My data are stored on an openstack volume that support up to 3000IOPS". There is a likelihood that your write load is exceeding the capacity of your virtual environment, especially if some Riak nodes are sharing physical disk or server infrastructure. Some

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-03 Thread Guillaume Boddaert
Hi, Sorry for the delay, I've spent a lot of time trying to understand if the problem was elsewhere. I've simplified my infrastructure and got a simple layout that no longer relies on a load balancer, and also corrected some minor performance issues on my workers. At the moment, I have up to

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-02 Thread Alexander Sicular
I believe you should be looking for the get_fsm_objsize stats listed here: http://docs.basho.com/riak/kv/2.1.4/using/cluster-operations/inspecting-node/#get-fsm-objsize . Unless you are using consistent bucket types or write once bucket types. -Alexander Alexander Sicular Solutions Architect

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-02 Thread Luke Bakken
Guillaume - Some colleagues had me carefully re-read those stats. You'll notice that those "put" stats are only for consistent or write_once operations, so they don't apply to you. Your read stats show objects well within Riak's recommended object size: node_get_fsm_objsize_100 : 10916

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-02 Thread Guillaume Boddaert
Here we go for a complete round of my hosts; all report objsize : 0. Here is a sample response (headers only; they are followed by the full set of JSON content) from the RIAK5 host: HTTP/1.1 200 OK X-Riak-Vclock: a85hYGBgzGDKBVI8xTxKnGbpn7QYuPafyWBKZMxjZXjyYfYFviwA Vary: Accept-Encoding Server:

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-02 Thread Luke Bakken
Could you please check the objsize stats on every Riak node? If they are all zero then ... -- Luke Bakken Engineer lbak...@basho.com On Mon, May 2, 2016 at 8:26 AM, Guillaume Boddaert wrote: > My clients are working through an haproxy box configured on

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-02 Thread Guillaume Boddaert
My clients are working through an haproxy box configured on round-robin. I've switched from PBC to HTTP to provide you with this: May 2 15:24:12 intrabalancer haproxy[29677]: my_daemon_box:53456 [02/May/2016:15:24:12.390] riak_rest riak_rest_backend/riak2 6/0/1/54/61 503 222 - - 5/4/2/1/0
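For readers following the haproxy log line above, here is a minimal sketch of how its slash-separated timing field (`6/0/1/54/61`) can be split into named timers. It assumes the HTTP log format of HAProxy 1.5/1.6, where the field is Tq/Tw/Tc/Tr/Tt in milliseconds; later HAProxy versions name these timers differently. The function name `parse_haproxy_timers` is hypothetical.

```python
from typing import Dict

# Timer names as used by the HAProxy 1.5/1.6 HTTP log format (assumed version):
#   Tq = time to receive the full client request
#   Tw = time spent waiting in queues
#   Tc = time to establish the TCP connection to the server
#   Tr = time the server took to send its response headers
#   Tt = total session duration
TIMER_NAMES = ("Tq", "Tw", "Tc", "Tr", "Tt")


def parse_haproxy_timers(field: str) -> Dict[str, int]:
    """Split a 'Tq/Tw/Tc/Tr/Tt' field into a dict of millisecond values."""
    parts = field.split("/")
    if len(parts) != len(TIMER_NAMES):
        raise ValueError(f"expected {len(TIMER_NAMES)} timers, got {len(parts)}")
    return {name: int(value) for name, value in zip(TIMER_NAMES, parts)}


if __name__ == "__main__":
    # Timing field copied from the log line quoted in the message above.
    timers = parse_haproxy_timers("6/0/1/54/61")
    for name, ms in timers.items():
        print(f"{name}: {ms} ms")
```

Under that assumption, the quoted request spent 54 ms waiting for riak2 to answer (Tr) out of 61 ms in total (Tt), and was returned with a 503 status.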

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-02 Thread Luke Bakken
Which Riak client are you using? Do you have it configured to connect to all nodes in your cluster or just one? -- Luke Bakken Engineer lbak...@basho.com On Mon, May 2, 2016 at 7:40 AM, Guillaume Boddaert wrote: > Hi Luke, > > Well objsize seems to be 0,

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-02 Thread Guillaume Boddaert
Hi Luke, Well objsize seems to be 0, that's very troubling. I can assure you that I am writing 75 items per second at the moment and that I can pull data from the cluster. admin@riak3:~$ sudo riak-admin status | grep -e 'objsize' consistent_get_objsize_100 : 0 consistent_get_objsize_95 : 0

Re: Very slow acquisition time (99 percentile) while fast median times

2016-05-02 Thread Luke Bakken
Hi Guillaume - What are the "objsize" stats for your cluster? -- Luke Bakken Engineer lbak...@basho.com On Mon, May 2, 2016 at 4:45 AM, Guillaume Boddaert wrote: > Hi, > > I'm trying to setup a production environment with Riak as backend. > Unfortunately I

Very slow acquisition time (99 percentile) while fast median times

2016-05-02 Thread Guillaume Boddaert
Hi, I'm trying to set up a production environment with Riak as backend. Unfortunately I have very slow write times that bottleneck my whole system. Here is a sample from one of my nodes (riak-admin status | grep -e '^node_put_fsm_time'): node_put_fsm_time_100 : 3305516 node_put_fsm_time_95 :
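The `node_put_fsm_time_*` values that Riak reports are in microseconds, so the figures are easier to judge once converted. Below is a minimal sketch that parses `name : value` lines as printed by `riak-admin status` and converts them to seconds. The helper names `parse_riak_stats` and `micros_to_seconds` are hypothetical, and the sample input contains only the one value visible in the message above.

```python
from typing import Dict


def parse_riak_stats(text: str, prefix: str = "node_put_fsm_time") -> Dict[str, int]:
    """Collect integer 'name : value' stats whose name starts with prefix."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        name, value = name.strip(), value.strip()
        # Skip lines without a separator, other stats, and non-integer values.
        if sep and name.startswith(prefix) and value.isdigit():
            stats[name] = int(value)
    return stats


def micros_to_seconds(microseconds: int) -> float:
    """Convert a microsecond latency figure to seconds."""
    return microseconds / 1_000_000


if __name__ == "__main__":
    # Only the value quoted in the original message; other lines are omitted.
    sample = "node_put_fsm_time_100 : 3305516\n"
    for name, micros in parse_riak_stats(sample).items():
        print(f"{name}: {micros_to_seconds(micros):.3f} s")
```

Read this way, the worst-case put reported above took roughly 3.3 seconds.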