Hello SmartOS folks. I have been running SmartOS at home for years now with
only one issue with a GZ upgrade way back in 2014 so thank you for a rock
solid system! Plex on an LX zone is my media server.

Recently I deployed a new SmartOS server in a remote location to host an
instance of the application we develop at work. We run and develop on Ubuntu
in a vSphere environment. The application has prerequisites like Java/Jetty,
CouchDB, and ffmpeg.

Everything seemed happy in an LX zone at first. But as it turns out, Apache
CouchDB in an Ubuntu LX zone has trouble remaining responsive. I have the
same application setup on many VMware Ubuntu guests and do not have any
stability problems with CouchDB there.

I am using this Ubuntu 16.04 image (with apt update and full-upgrade):
https://docs.joyent.com/public-cloud/instances/infrastructure/images/ubuntu#ubuntu-1604-20170403

The rest of the setup is identical to my other VMware guest setups, with
CouchDB 1.6.0 from the default repositories and our application and its
dependencies such as openjdk, ffmpeg, etc.

The tricky thing is that everything works for a while and then, at some
point within hours or days, couchdb becomes unresponsive. Looking at top
shows its beam.smp process pegged at 100% CPU. There are various errors in
the logs that seem to point to resource problems encountered somewhere in
the Erlang code. An example snippet would be:

{error_info,
                          {exit,
                           {timeout,
                            {gen_server,call,
                             [<0.2715.5>,{open_ref_count,<0.4245.5>}]}},
                           [{gen_server,terminate,7,
                             [{file,"gen_server.erl"},{line,826}]},
                            {proc_lib,init_p_do_apply,3,
                             [{file,"proc_lib.erl"},{line,240}]}]}},


I'm not informed enough to know how to debug this, although I did try some
basics like making sure the zone was assigned sufficient RAM (I upped it to
16GB out of 32GB total on the host) and that quotas were set to 0 (although
this was adjusted after creating the zone). Restarting couchdb brings it
back up and it behaves normally again for a while, with no errors or
warnings during use.

I haven't taken this up with the couchdb folks yet because the only thing
different in the environment is that I'm running it in an Ubuntu LX zone
vs. an Ubuntu VMware guest.

If anyone could point me to some things to try or something I can set up to
try to catch more details when it happens again, I'd appreciate it. I am
completely unfamiliar with dtrace but I gather this type of problem is
where it shines.
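
In the meantime, to at least pin down when it stops answering, I could run
something like the rough Python sketch below inside the zone. It assumes
CouchDB is listening on its default port 5984 on localhost. The function
names, the log file name and the intervals are my own placeholders, not
anything that ships with CouchDB or SmartOS.

```python
import time
import urllib.request
from datetime import datetime, timezone

# Assumption: CouchDB answers on its default port 5984 on localhost.
COUCH_URL = "http://127.0.0.1:5984/"


def probe(url=COUCH_URL, timeout=5.0, fetch=urllib.request.urlopen):
    """Return (ok, detail); ok is True if the server answered within timeout."""
    start = time.monotonic()
    try:
        with fetch(url, timeout=timeout) as resp:
            resp.read()
        return True, "answered in %.2fs" % (time.monotonic() - start)
    except Exception as exc:  # timeouts, refused connections, HTTP errors
        return False, "%s: %s" % (type(exc).__name__, exc)


def watch(interval=30.0, log_path="couch-watch.log", rounds=None,
          probe_fn=probe, sleep=time.sleep):
    """Poll forever (or `rounds` times) and log a timestamped line per failure.

    Returns the number of failed probes seen.
    """
    failures = 0
    done = 0
    while rounds is None or done < rounds:
        ok, detail = probe_fn()
        if not ok:
            failures += 1
            stamp = datetime.now(timezone.utc).isoformat()
            with open(log_path, "a") as log:
                log.write("%s UNRESPONSIVE %s\n" % (stamp, detail))
        done += 1
        if rounds is None or done < rounds:
            sleep(interval)
    return failures
```

Calling watch() from a small script would leave a timestamp for each failed
probe, which I could then line up against the couchdb log and whatever
tracing people suggest.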

I have changed the couchdb logging to "warning" so it will only log
warnings and errors so the logs will be a bit more manageable. I imagine it
will break again within a day and then maybe I'll have some focused logs to
post somewhere.

Thanks for reading through. I welcome any thoughts.
