Hi Adam,

If possible, could you please mail me directly your app.config, vm.args, and an archive of the log files (/var/log/riak) from one node, along with your HAProxy configuration?

Have you tried having HAProxy as the SSL endpoint and then using unencrypted HTTP traffic between HAProxy and Riak? I believe it's typical to use HAProxy as the HTTPS endpoint rather than to proxy the SSL connection to the back-end services. I'm wondering if there's something about using it in this manner that is contributing to this scenario. Just a thought.

Thanks.

--
Luke Bakken
CSE
[email protected]

On Fri, Apr 18, 2014 at 9:22 AM, Adam Leko <[email protected]> wrote:
> We have a 5-node Riak cluster and we're having problems keeping the HTTPS
> listener running properly. The problem typically manifests itself a few
> hours after Riak is started. When it happens, the HTTPS listener on a Riak
> node will accept new connections but will never respond to them. Connections
> made via curl or OpenSSL's s_client show the client sending the SSL hello
> but never getting a response. When this happens, the OS does show pending
> data for the socket that isn't being processed (trimmed output):
>
> # ss -lt
> State      Recv-Q Send-Q   Local Address:Port   Peer Address:Port
> LISTEN     129    128      1.2.3.4:8098         *:*
>
> One of the times the Erlang VM was in this state I grabbed a crash dump via
> SIGUSR1.
> The Mochiweb process shows up in a "Waiting" state:
>
> =proc:<0.190.0>
> State: Waiting
> Name: 'https_1.2.3.4:8098_mochiweb'
> Spawned as: proc_lib:init_p/5
> Spawned by: <0.150.0>
> Started: Thu Apr 17 21:01:12 2014
> Message queue length: 0
> Number of heap fragments: 0
> Heap fragment data: 0
> Link list: [#Port<0.3873>, <0.4203.0>, <0.150.0>, <0.5788.48>, <0.12559.49>,
> <0.4819.45>, <0.17031.51>, <0.19186.51>, <0.18428.51>, <0.25106.51>,
> <0.20568.51>, <0.16399.51>, <0.25307.51>, <0.25382.51>, <0.31884.51>,
> <0.30289.51>, <0.29247.51>, <0.25168.51>]
> Reductions: 50203
> Stack+heap: 1597
> OldHeap: 0
> Heap unused: 495
> OldHeap unused: 0
> Program counter: 0x00007fb03fea4de8 (gen_server:loop/6 + 264)
> CP: 0x0000000000000000 (invalid)
> arity = 0
>
> All the processes linked from the main Mochiweb process are also in a
> "Waiting" state. If I connect to the riak console and manually kill the
> mochiweb process (via exit(pid(...), kill)), its supervisor restarts it and
> the node starts servicing HTTPS requests again.
>
> We do have the Erlang cluster behind haproxy but the SSL connections hang
> even if you try to connect locally from the machine running the Riak
> service. We're using a lightly modified config from what is suggested in the
> docs
> (http://docs.basho.com/riak/1.3.1/cookbooks/Load-Balancing-and-Proxy-Configuration/)
> with a much lower max connections setting. When the hangs happen, netstat
> only shows a handful of open connections to the haproxy front end.
>
> It's also worth pointing out that when the hangs happen, there are no
> messages that show up in the log files that indicate any errors. The rest of
> the services on the Riak node don't appear to be affected either - we still
> get periodic anti-entropy exchange log messages and all the usual suspects
> in riak-admin status check out.
>
> We are using a pretty standard OS configuration - Ubuntu 12.04 LTS with the
> Basho apt repo, riak 1.4.8-1, erts-5.9.1 that comes bundled with the Riak
> packages.
>
> Are there any known issues with accessing Riak over its HTTPS interface or
> any known problems with erts' SSL implementation? As of now we're forced to
> use periodic rolling restarts on the nodes in our production cluster to keep
> the HTTPS listeners functional, which is a pretty disgusting workaround.
>
> Thanks for taking the time to read this. I'd appreciate any insight or
> guidance on how to address/track down this problem.
>
> -Adam Leko
>
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
