Thanks again, Kelly, for your response.
I continued playing with things last night. I set
fold_objects_for_list_keys to true across the nodes and performed a full
roll of the stack, which seems to have sped up the replies. However, the
lol bucket was still returning differing results.
I then added the second node to my nginx config, so it now load balances
between the two nodes, with the same results. However, I just performed 5-10
recursive gets on the lol bucket, and that seems to have done the trick. I'm
now getting consistent values in response to the ls:
(10:51:14) [andrew/desktop] ~ $ s3cmd ls s3://lol
DIR s3://lol/kitties/
2013-11-13 14:59 93107 s3://lol/_1
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(10:51:39) [andrew/desktop] ~ $ s3cmd ls s3://lol
DIR s3://lol/kitties/
2013-11-13 14:59 93107 s3://lol/_1
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(10:51:40) [andrew/desktop] ~ $ s3cmd ls s3://lol
DIR s3://lol/kitties/
2013-11-13 14:59 93107 s3://lol/_1
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(10:51:41) [andrew/desktop] ~ $ s3cmd ls s3://lol
DIR s3://lol/kitties/
2013-11-13 14:59 93107 s3://lol/_1
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(10:51:42) [andrew/desktop] ~ $ s3cmd ls s3://lol
DIR s3://lol/kitties/
2013-11-13 14:59 93107 s3://lol/_1
2013-11-13 22:20 84513 s3://lol/kitty.jpg
(10:51:43) [andrew/desktop] ~ $ s3cmd ls s3://lol
DIR s3://lol/kitties/
2013-11-13 14:59 93107 s3://lol/_1
2013-11-13 22:20 84513 s3://lol/kitty.jpg
How can I tune against issues like this while keeping the performance
that's currently in place? I'm not very familiar with performing
read-repairs, so I look forward to any assistance with this.
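
For reference, here is a minimal Python sketch of the two manual steps
discussed in this thread: comparing repeated `s3cmd ls` outputs, and reading
every object once so that read-repair has a chance to run (per Kelly's note
below that a read of each object should resolve the listing differences).
The function names (listed_uris, missing_per_run, read_every_object) are
hypothetical, and the sketch assumes s3cmd is on PATH and already configured
for the cluster. It is an illustration, not a tested tool for Riak CS.

```python
import subprocess
import tempfile
from pathlib import Path


def listed_uris(listing):
    """Collect the s3:// URI at the end of each `s3cmd ls` output line.

    Assumption: keys contain no whitespace, so the URI is the last field.
    """
    uris = set()
    for line in listing.splitlines():
        fields = line.split()
        if fields and fields[-1].startswith("s3://"):
            uris.add(fields[-1])
    return uris


def missing_per_run(listings):
    """For each listing, return the entries seen in any run but absent here."""
    runs = [listed_uris(text) for text in listings]
    everything = set().union(*runs)
    return [everything - run for run in runs]


def read_every_object(bucket):
    """GET each listed object once into a scratch directory.

    Returns the number of objects fetched. Only objects that the listing
    actually returned are read, so an incomplete listing means an
    incomplete pass.
    """
    listing = subprocess.run(
        ["s3cmd", "ls", "--recursive", bucket],
        check=True, capture_output=True, text=True,
    ).stdout
    # Entries ending in "/" are prefixes (the DIR lines), not objects.
    objects = sorted(u for u in listed_uris(listing) if not u.endswith("/"))
    with tempfile.TemporaryDirectory() as scratch:
        for number, uri in enumerate(objects):
            target = Path(scratch) / f"object-{number}"
            subprocess.run(
                ["s3cmd", "get", "--force", uri, str(target)], check=True
            )
    return len(objects)
```

Calling missing_per_run() with the captured output of several ls runs gives
one set per run; all sets being empty means the runs agree. Because a single
listing may itself be incomplete, read_every_object("s3://lol") may need to
be repeated a few times, which matches the 5-10 recursive gets it took here.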
Thanks,
Andrew
On Thu, Nov 14, 2013 at 10:37 AM, Kelly McLaughlin <[email protected]> wrote:
> Andrew,
>
> Are you able to successfully download all the files in the lol bucket? It
> seems like you have some replicas in Riak that do not have copies of all of
> the data from that bucket. That can be resolved in two ways: read-repair or
> active anti-entropy. So I would expect doing a read of each object would
> resolve the issue with differences in bucket listing attempts.
>
> Another thing I would recommend since you are on the Riak 1.4 series is to
> set fold_objects_for_list_keys to true in your Riak CS app.config. It
> enables improvements to bucket listing that rely on features in Riak 1.4
> and should give you better results.
>
> I am not certain about the upload issue with the proxy. It does seem like
> a proxy issue since the upload succeeds when performed directly against the
> node, but I am not familiar enough with nginx configuration to spot any
> issues.
>
> Cheers,
>
> Kelly
>
> On November 13, 2013 at 10:24:42 PM, Andrew Tynefield
> ([email protected]) wrote:
>
> I appreciate the help, Kelly! (And sorry for the double mail you're going
> to get; I accidentally didn't reply to all.) I've provided the requested
> information below.
>
> app.config:
> http://pastebin.centos.org/5716/
>
> app.config for riak-cs is managed by puppet, installing the same file on
> both nodes.
>
> riak-admin data:
>
> (10:58:46) [riak] ~ $ riak-admin ring-status
> ================================== Claimant ===================================
> Claimant: '[email protected]'
> Status: up
> Ring Ready: true
>
> ============================== Ownership Handoff ==============================
> No pending changes.
>
> ============================== Unreachable Nodes ==============================
> All nodes are up and reachable
>
> (10:58:54) [riak] ~ $ riak-admin member-status
> ================================= Membership ==================================
> Status     Ring    Pending    Node
> -------------------------------------------------------------------------------
> valid      50.0%      --      '[email protected]'
> valid      50.0%      --      '[email protected]'
> -------------------------------------------------------------------------------
> Valid:2 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
>
> network data:
>
> (11:05:51) [riak] ~ $ ip a | grep /24
> inet 192.168.122.90/24 brd 192.168.122.255 scope global eth0
> inet 192.168.1.19/24 brd 192.168.1.255 scope global eth1
> (10:59:43) [riak] ~ $ netstat -tunap | grep :8080
> tcp   0   0 0.0.0.0:8080   0.0.0.0:*   LISTEN   29437/beam.smp
> (11:03:20) [riak] ~ $ ps auxf | grep 29437
> root 1507 0.0 0.0 103236 820 pts/4 S+ 23:03 0:00
> \_ grep 29437
> riakcs 29437 0.9 1.8 768480 35364 pts/3 Ssl+ 17:23 3:08 \_
> /usr/lib64/riak-cs/erts-5.9.1/bin/beam.smp -K true -A 64 -W w -- -root
> /usr/lib64/riak-cs -progname riak-cs -- -home /var/lib/riak-cs-control --
> -boot /usr/lib64/riak-cs/releases/1.4.2/riak-cs -config
> /etc/riak-cs/app.config -pa /usr/lib64/riak-cs/lib/basho-patches -name
> [email protected] -setcookie [redacted] -- console
>
> nginx config: [added the proxy_pass_header to ensure I was reaching
> riak-cs]
>
> upstream riak-cs {
>     server 192.168.1.19:8080;
> }
> server {
>     listen 80;
>     server_name cs.domain.com *.cs.domain.com;
>     location / {
>         proxy_pass http://riak-cs;
>         proxy_set_header Host $host;
>         proxy_connect_timeout 59s;
>         proxy_send_timeout 600;
>         proxy_read_timeout 600;
>         #proxy_buffering off;
>         proxy_buffers 16 32k;
>         proxy_buffer_size 64k;
>         proxy_pass_header Server;
>         #return 403;
>     }
> }
>
> (11:10:36) [andrew/desktop] ~ $ curl -I cs.domain.com/buckets
> HTTP/1.1 404 Object Not Found
> Date: Thu, 14 Nov 2013 05:10:38 GMT
> Content-Type: application/xml
> Connection: keep-alive
> Server: Riak CS
> Content-Length: 185
>
>
> Please let me know if there's anything else I can provide; I'm more than
> willing to do so. It may also be worth noting that domain.com here stands
> in for an actual registered and resolving domain that I've sed'd out
> because this list is archived.
>
> Thank you so much,
> Andrew
>
>
> On Wed, Nov 13, 2013 at 10:57 PM, Kelly McLaughlin <[email protected]> wrote:
>
>> Andy,
>>
>> To try to get a better idea of what might be going on it would be
>> helpful to see what your riak and riak cs app.config files look like. Also
>> the output of riak-admin ring-status and riak-admin member-status could be
>> useful. For the upload issue I am curious if you have changed the port that
>> riak cs is listening on? The default is 8080 and I don't see from your
>> nginx config where you are sending requests to that port.
>>
>> Kelly
>>
>> On November 13, 2013 at 6:49:59 PM, Andrew Tynefield
>> ([email protected]) wrote:
>>
>> Hey guys,
>>
>> I'm working on a dev environment for a riak-cs setup.
>>
>> 2 vms and an external proxy
>>
>> Config of the riak/riak-cs nodes appears to be all complete. I'm
>> encountering two issues I'd like some pointers on where to begin diagnosing
>> before I go around stracing everything.
>>
>> Firstly:
>> When using s3cmd to query riak-cs, I'm receiving differing results on the
>> same commands in succession. Here are the results when going through a
>> proxy:
>>
>> (07:06:09) [andrew/desktop] ~ $ s3cmd -c .s3cfg-riak ls s3://lol
>> 2013-11-13 22:20 84513 s3://lol/kitty.jpg
>> (07:06:10) [andrew/desktop] ~ $ s3cmd -c .s3cfg-riak ls s3://lol
>> 2013-11-13 22:20 84513 s3://lol/kitty.jpg
>> (07:06:11) [andrew/desktop] ~ $ s3cmd -c .s3cfg-riak ls s3://lol
>> 2013-11-13 14:59 93107 s3://lol/_1
>> 2013-11-13 22:20 84513 s3://lol/kitty.jpg
>> (07:06:12) [andrew/desktop] ~ $ s3cmd -c .s3cfg-riak ls s3://lol
>> DIR s3://lol/kitties/
>> 2013-11-13 14:59 93107 s3://lol/_1
>> (07:06:13) [andrew/desktop] ~ $ s3cmd -c .s3cfg-riak ls s3://lol
>> 2013-11-13 22:20 84513 s3://lol/kitty.jpg
>> (07:06:14) [andrew/desktop] ~ $ s3cmd -c .s3cfg-riak ls s3://lol
>> DIR s3://lol/kitties/
>> 2013-11-13 14:59 93107 s3://lol/_1
>>
>> And here they are querying one of the nodes directly:
>> (07:05:59) [andrew/desktop] ~ $ s3cmd ls s3://lol
>> DIR s3://lol/kitties/
>> 2013-11-13 14:59 93107 s3://lol/_1
>> (07:06:00) [andrew/desktop] ~ $ s3cmd ls s3://lol
>> 2013-11-13 14:59 93107 s3://lol/_1
>> 2013-11-13 22:20 84513 s3://lol/kitty.jpg
>> (07:06:01) [andrew/desktop] ~ $ s3cmd ls s3://lol
>> 2013-11-13 22:20 84513 s3://lol/kitty.jpg
>> (07:06:02) [andrew/desktop] ~ $ s3cmd ls s3://lol
>> 2013-11-13 22:20 84513 s3://lol/kitty.jpg
>> (07:06:02) [andrew/desktop] ~ $ s3cmd ls s3://lol
>> DIR s3://lol/kitties/
>> 2013-11-13 14:59 93107 s3://lol/_1
>> (07:06:03) [andrew/desktop] ~ $ s3cmd ls s3://lol
>> 2013-11-13 22:20 84513 s3://lol/kitty.jpg
>> (07:06:04) [andrew/desktop] ~ $ s3cmd -c .s3cfg-riak ls s3://lol
>>
>> The same thing happens regardless of which node I query directly: within
>> 1-2 seconds of executing the command, a repeat execution returns
>> different results. (The same few result sets keep recurring, each missing
>> some of the objects.)
>>
>> The other issue I'm encountering is with puts. If I put directly to the
>> node, I see something like:
>>
>> (07:09:09) [andrew/desktop] ~ $ s3cmd put
>> Downloads/CentOS-6.4-x86_64-minimal.iso s3://big
>> Downloads/CentOS-6.4-x86_64-minimal.iso ->
>> s3://big/CentOS-6.4-x86_64-minimal.iso [part 1 of 23, 15MB]
>> 15728640 of 15728640 100% in 5s 2.62 MB/s done
>> Downloads/CentOS-6.4-x86_64-minimal.iso ->
>> s3://big/CentOS-6.4-x86_64-minimal.iso [part 2 of 23, 15MB]
>> 15728640 of 15728640 100% in 5s 2.86 MB/s done
>> (... Truncated some of the values for brevity ...)
>> Downloads/CentOS-6.4-x86_64-minimal.iso ->
>> s3://big/CentOS-6.4-x86_64-minimal.iso [part 22 of 23, 15MB]
>> 15728640 of 15728640 100% in 1s 12.06 MB/s done
>> Downloads/CentOS-6.4-x86_64-minimal.iso ->
>> s3://big/CentOS-6.4-x86_64-minimal.iso [part 23 of 23, 12MB]
>> 12929024 of 12929024 100% in 1s 11.70 MB/s done
>>
>> Which is ideally what should occur. However, when I go through the
>> proxy:
>>
>> It starts great for the first chunk, but hangs:
>>
>> Start:
>> Downloads/CentOS-6.4-x86_64-minimal.iso -> s3://big/cent6.minimal.iso
>> [part 19 of 23, 15MB]
>> 8675328 of 15728640 55% in 1s 8.26 MB/s
>>
>> Finish:
>> Downloads/CentOS-6.4-x86_64-minimal.iso -> s3://big/cent6.minimal.iso
>> [part 19 of 23, 15MB]
>> 15728640 of 15728640 100% in 22s 683.57 kB/s done
>>
>> It immediately jumps to 55% (the percentage varies), then pauses,
>> sometimes for up to 30 seconds, and then jumps to [done].
>>
>> I assume the cause is somewhere in my nginx configuration. I thought it
>> was a proxy buffer issue, but I've since raised those limits and also
>> tried disabling proxy_buffering entirely, and neither made a difference.
>>
>> server {
>>     listen 80;
>>     server_name cs.domain.com *.cs.domain.com;
>>     location / {
>>         proxy_pass http://riak-cs;
>>         proxy_set_header Host $host;
>>         proxy_connect_timeout 59s;
>>         proxy_send_timeout 600;
>>         proxy_read_timeout 600;
>>         #proxy_buffering off;
>>         proxy_buffers 16 32k;
>>         proxy_buffer_size 64k;
>>         #return 403;
>>     }
>> }
>>
>> (The two nodes are identical in versions)
>>
>> (07:34:47) [riak] ~ $ cat /etc/redhat-release
>> CentOS release 6.4 (Final)
>> (07:45:53) [riak] ~ $ uname -a
>> Linux riak.tyne.io 2.6.32-358.23.2.el6.x86_64 #1 SMP Wed Oct 16 18:37:12
>> UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
>> (07:46:09) [riak] ~ $ riak version
>> 1.4.2
>> (07:46:13) [riak] ~ $ riak-cs version
>> 1.4.2
>> (07:46:25) [riak] ~ $ rpm -qa | grep riak
>> riak-cs-1.4.2-1.el6.x86_64
>> riak-1.4.2-1.el6.x86_64
>>
>> All recommended sysctl and ulimit values have been set as described in
>> the docs.
>>
>> I look forward to any assistance with further tracking this down.
>>
>> --
>> [Andy Tynefield]
>> _______________________________________________
>> riak-users mailing list
>> [email protected]
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
--
[Andy Tynefield]