Thanks Jon.
I thought I had already raised the ulimit, so I'll need to do some more
reading on this.
Is it possible the node had dangling file descriptor references? (That is,
descriptors were never being released, so this was just the tipping point.)
I’m assuming the more likely case is that I never raised the limit from the
OS default, so the node hit it and everything crashed.
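One way to tell those two cases apart is to watch the node's descriptor count over time. Below is a minimal sketch, assuming a Linux host where /proc/<pid>/fd is readable; the function names and the "steady growth" heuristic are illustrative assumptions, not anything Riak provides.

```python
import os


def count_open_fds(pid="self", proc_root="/proc"):
    """Count open file descriptors for a process (Linux only, reads /proc)."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def fd_usage(open_fds, soft_limit):
    """Return the fraction of the soft open-files limit currently in use."""
    return open_fds / soft_limit


def looks_like_leak(samples, min_growth=1):
    """Crude heuristic: True if every successive sample grew by min_growth or more.

    A count that only ever climbs hints at descriptors not being released;
    a count that rises and falls but sits near the limit hints at a limit
    that is simply too low for the workload.
    """
    if len(samples) < 2:
        return False
    return all(b - a >= min_growth for a, b in zip(samples, samples[1:]))
```

Sampling count_open_fds(<riak pid>) every few minutes and passing the list to looks_like_leak would give a rough answer. Reading another user's /proc/<pid>/fd usually needs root or the same user as the process.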
Alex Millar, CTO
Office: 1-800-354-8010 ext. 704
Mobile: 519-729-2539
GoBonfire.com
On December 3, 2014 at 9:31:22 AM, Jon Meredith ([email protected]) wrote:
Hi Alex.
It looks like you exceeded the open files ulimit. Information on how to fix
it is here:
http://docs.basho.com/riak/latest/ops/tuning/open-files-limit/#Changing-the-limit
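The limit that matters is the one the running node actually inherited, which can differ from what `ulimit -n` reports in a login shell. A minimal sketch for checking it, assuming Linux and its /proc/<pid>/limits file; the function names are hypothetical:

```python
def parse_max_open_files(limits_text):
    """Extract (soft, hard) open-files limits from /proc/<pid>/limits text.

    A value of None means the kernel reported "unlimited".
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max open files", soft limit, hard limit, units
            fields = line.split()
            soft, hard = fields[3], fields[4]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' line found")


def read_max_open_files(pid):
    """Read the open-files limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_max_open_files(f.read())
```

Calling read_max_open_files with the pid of the node's Erlang VM shows whether the raised limit really took effect for that process.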
Jon
On Dec 3, 2014, at 7:15 AM, Alex Millar <[email protected]> wrote:
Good morning Riak-Users,
Last night one of the nodes in my 5-node RiakCS cluster went haywire and shot
up to over 90% disk I/O utilization, seemingly out of the blue.
Looking at the Riak error.log, I saw the following being written continuously:
2014-12-02 21:57:13.220 [error] <0.29210.3089> CRASH REPORT Process
<0.29210.3089> with 0 neighbours exited with reason: no match of right hand
value {error,{db_open,"IO error:
/var/lib/riak/anti_entropy/570899077082383952423314387779798054553098649600/CURRENT:
Too many open files"}} in hashtree:new_segment_store/2 line 505 in
gen_server:init_it/6 line 328
2014-12-02 21:57:13.226 [error] <0.29211.3089> CRASH REPORT Process
<0.29211.3089> with 0 neighbours exited with reason: no match of right hand
value {error,{db_open,"IO error:
/var/lib/riak/anti_entropy/776422744832042175295707567380525354192214163456/LOCK:
Too many open files"}} in hashtree:new_segment_store/2 line 505 in
gen_server:init_it/6 line 328
2014-12-02 21:57:13.226 [error] <0.29212.3089> CRASH REPORT Process
<0.29212.3089> with 0 neighbours exited with reason: no match of right hand
value {error,{db_open,"IO error:
/var/lib/riak/anti_entropy/570899077082383952423314387779798054553098649600/CURRENT:
Too many open files"}} in hashtree:new_segment_store/2 line 505 in
gen_server:init_it/6 line 328
2014-12-02 21:57:13.226 [error] <0.29213.3089> CRASH REPORT Process
<0.29213.3089> with 0 neighbours exited with reason: no match of right hand
value {error,{db_open,"IO error:
/var/lib/riak/anti_entropy/776422744832042175295707567380525354192214163456/CURRENT:
Too many open files"}} in hashtree:new_segment_store/2 line 505 in
gen_server:init_it/6 line 328
2014-12-02 21:57:13.286 [error] <0.29215.3089> CRASH REPORT Process
<0.29215.3089> with 0 neighbours exited with reason: no match of right hand
value {error,{db_open,"IO error:
/var/lib/riak/anti_entropy/776422744832042175295707567380525354192214163456/LOCK:
Too many open files"}} in hashtree:new_segment_store/2 line 505 in
gen_server:init_it/6 line 328
2014-12-02 21:57:13.286 [error] <0.29214.3089> CRASH REPORT Process
<0.29214.3089> with 0 neighbours exited with reason: no match of right hand
value {error,{db_open,"IO error:
/var/lib/riak/anti_entropy/570899077082383952423314387779798054553098649600/LOCK:
Too many open files"}} in hashtree:new_segment_store/2 line 505 in
gen_server:init_it/6 line 328
2014-12-02 21:57:13.286 [error] <0.29217.3089> CRASH REPORT Process
<0.29217.3089> with 0 neighbours exited with reason: no match of right hand
value {error,{db_open,"IO error:
/var/lib/riak/anti_entropy/570899077082383952423314387779798054553098649600/LOCK:
Too many open files"}} in hashtree:new_segment_store/2 line 505 in
gen_server:init_it/6 line 328
2014-12-02 21:57:13.287 [error] <0.29216.3089> CRASH REPORT Process
<0.29216.3089> with 0 neighbours exited with reason: no match of right hand
value {error,{db_open,"IO error:
/var/lib/riak/anti_entropy/776422744832042175295707567380525354192214163456/LOCK:
Too many open files"}} in hashtree:new_segment_store/2 line 505 in
gen_server:init_it/6 line 328
2014-12-02 21:57:13.312 [error] <0.29219.3089> CRASH REPORT Process
<0.29219.3089> with 0 neighbours exited with reason: no match of right hand
value {error,{db_open,"IO error:
/var/lib/riak/anti_entropy/570899077082383952423314387779798054553098649600/LOCK:
Too many open files"}} in hashtree:new_segment_store/2 line 505 in
gen_server:init_it/6 line 328
2014-12-02 21:57:15.634 [error] <0.29218.3089> CRASH REPORT Process
<0.29218.3089> with 0 neighbours exited with reason: no match of right hand
value {error,{db_open,"IO error:
/var/lib/riak/anti_entropy/776422744832042175295707567380525354192214163456/CURRENT:
Too many open files"}} in hashtree:new_segment_store/2 line 505 in
gen_server:init_it/6 line 328
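To see which files the failures cluster on, the "Too many open files" entries can be tallied by path. This is a rough sketch that assumes the log text looks like the excerpt above (wrapped or unwrapped); the function names are made up for illustration.

```python
import posixpath
import re
from collections import Counter

# Matches the path between "IO error:" and ": Too many open files",
# tolerating the line wrapping seen in the pasted excerpt.
PATTERN = re.compile(r"IO error:\s*(\S+):\s*Too many open files")


def tally_paths(log_text):
    """Count 'Too many open files' errors per file path."""
    return Counter(PATTERN.findall(log_text))


def tally_partitions(log_text):
    """Count the same errors per parent directory name (the partition ID)."""
    return Counter(
        posixpath.basename(posixpath.dirname(path))
        for path in PATTERN.findall(log_text)
    )
```

Feeding it the contents of error.log gives a quick view of whether the errors are spread across many anti_entropy partitions or concentrated on a few.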
Leading up to this, there didn’t appear to be any significant load on our
cluster.
I simply restarted the node and the issue went away, but I wanted to reach out
for help understanding why and how this arose in the first place.
Regards,
Alex Millar, CTO
Office: 1-800-354-8010 ext. 704
Mobile: 519-729-2539
GoBonfire.com
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com