Hello SmartOS folks. I have been running SmartOS at home for years, and the only issue I have hit was with a GZ upgrade way back in 2014, so thank you for a rock-solid system! Plex in an LX zone is my media server.
Recently I deployed a new SmartOS server in a remote location to host an instance of the application we develop at work. We run and develop on Ubuntu in a vSphere environment. The application has prerequisites like Java/Jetty, CouchDB, and ffmpeg. Everything seemed happy in an LX zone at first, but as it turns out, Apache CouchDB in an Ubuntu LX zone has trouble remaining responsive. I have the same application set up on many VMware Ubuntu guests and have no stability problems with CouchDB there.

I am using this Ubuntu 16.04 image (with apt update and full-upgrade): https://docs.joyent.com/public-cloud/instances/infrastructure/images/ubuntu#ubuntu-1604-20170403

The rest of the setup is identical to my other VMware guest setups, with CouchDB 1.6.0 from the default repositories plus our application and its dependencies such as OpenJDK, ffmpeg, etc.

The tricky thing is that everything works for a while, and then at some point within hours or days CouchDB becomes unresponsive. Looking at top shows its beam.smp process locked at 100%. There are various errors in the logs that seem to point to resource problems encountered somewhere in the Erlang code. An example snippet:

    {error_info,
     {exit,
      {timeout,
       {gen_server,call,
        [<0.2715.5>,{open_ref_count,<0.4245.5>}]}},
      [{gen_server,terminate,7,
        [{file,"gen_server.erl"},{line,826}]},
       {proc_lib,init_p_do_apply,3,
        [{file,"proc_lib.erl"},{line,240}]}]}},

I'm not informed enough to know how to debug this, although I did try some basics: making sure the zone was assigned sufficient RAM (I upped it to 16GB out of 32GB total on the host) and that quotas were set to 0 (although this was adjusted after creating the zone). Restarting CouchDB brings it back up, and it behaves normally again for a while with no errors or warnings during use.

I haven't taken this up with the CouchDB folks yet because the only thing different in the environment is that I'm running it in an Ubuntu LX zone instead of an Ubuntu VMware guest.
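As a first step toward catching the hang in the act, I am considering a small watchdog inside the zone along these lines. It is only a rough sketch, not something I have run against the failing setup: the URL assumes CouchDB's default 127.0.0.1:5984 binding, and the log path, timeout, polling interval, and function names are all placeholders I made up. It polls CouchDB's root URL and, the first time it stops answering, appends a timestamp plus top and ps output to a log file.

```python
"""Poll CouchDB and record process state when it stops answering (sketch)."""
import http.client
import subprocess
import time
import urllib.request

COUCH_URL = "http://127.0.0.1:5984/"        # assumption: default CouchDB address/port
TIMEOUT_S = 10                              # assumption: no reply in 10 s counts as hung
INTERVAL_S = 30                             # assumption: poll every 30 s
LOG_PATH = "/var/tmp/couch-watchdog.log"    # hypothetical log location
SNAPSHOT_COMMANDS = [                       # assumption: procps top/ps exist in the zone
    ["top", "-b", "-n", "1"],
    ["ps", "-eo", "pid,pcpu,pmem,nlwp,comm"],
]


def couch_is_responsive(url=COUCH_URL, timeout=TIMEOUT_S):
    """Return True if the URL answers with HTTP 200 before the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, socket timeouts and refused connections are OSError subclasses.
        return False


def snapshot(commands=SNAPSHOT_COMMANDS):
    """Run each diagnostic command and return the combined output as text."""
    chunks = []
    for cmd in commands:
        try:
            out = subprocess.check_output(cmd, stderr=subprocess.STDOUT, timeout=30)
            text = out.decode("utf-8", "replace")
        except (OSError, subprocess.SubprocessError) as exc:
            text = "failed: {}".format(exc)
        chunks.append("$ {}\n{}".format(" ".join(cmd), text))
    return "\n".join(chunks)


def watch(url=COUCH_URL, log_path=LOG_PATH, interval=INTERVAL_S, max_checks=None):
    """Poll forever (or max_checks times); log a snapshot on each up -> down change."""
    was_up = True
    checks = 0
    while max_checks is None or checks < max_checks:
        up = couch_is_responsive(url)
        if was_up and not up:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            with open(log_path, "a") as log:
                log.write("{} CouchDB unresponsive at {}\n{}\n".format(
                    stamp, url, snapshot()))
        was_up = up
        checks += 1
        time.sleep(interval)
```

To use it I would add a final `watch()` call and leave it running under nohup or similar. It only logs on the transition from responsive to unresponsive, so a long hang does not flood the file.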
If anyone could point me to some things to try or something I can set up to try to catch more details when it happens again, I'd appreciate it. I am completely unfamiliar with dtrace but I gather this type of problem is where it shines. I have changed the couchdb logging to "warning" so it will only log warnings and errors so the logs will be a bit more manageable. I imagine it will break again within a day and then maybe I'll have some focused logs to post somewhere. Thanks for reading through. I welcome any thoughts.