Hi Tim and other Gurus,

I am using the auth-request lua script from here (thanks Tim, you rock for publishing this):

https://github.com/TimWolla/haproxy-auth-request

Haproxy is sitting in front of two nodejs apps: etherpad and ethercalc. It is also sitting in front of apache, which serves standard websites such as static HTML, Joomla, Wordpress, Drupal, etc. I am using the lua auth script to employ apache's authnz external authentication program, which authenticates against an IMAP server, and I have haproxy configured so that it only fires the lua script when accessing the nodejs apps.

Things are mostly working, but there is one problem I am having with etherpad that can be fixed by disabling the lua script. So even though I am not 100% sure the problem is not on the etherpad side, I am starting from here. To date I have not been able to replicate this issue on ethercalc, and the lua auth script seems to work fine everywhere else.

Specifically, I can load the etherpad landing page reliably. To load a pad, one changes the path of the URL; for example, one could have two pads with the following URLs: http://pad.domain.tld/p/nameofpad and http://pad.domain.tld/p/otherpadname. If the lua auth script is enabled in the haproxy config, these pads eventually time out and etherpad produces an error regarding a null variable. Etherpad has other modules that can be accessed at other URL paths, such as /metrics and /stats. These work fine; the error seems to be limited to loading the pads.

The error I am seeing in etherpad is described quite accurately here:

https://github.com/ether/etherpad-lite/issues/3047

The short version of that is that a variable or variables are empty and should not be. According to the guy who got things working there, he "redid" his proxy setup and it solved the problem. Unfortunately, etherpad doesn't have any documentation for using haproxy as of yet, so I am trying to puzzle out what is different. This is my one and only indication that the problem might need to be solved on the etherpad side.

I figured there must be some difference with a cookie or header or something, so I stripped all SSL configs and did a tcpdump to capture the full text of the traffic. I expected and found some differences, but everything looks pretty much the same to me. I have spent quite a few hours searching those dumps; if the answer is there, it is too slippery for my eye to land on.
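
A minimal sketch of how two captured requests could be compared mechanically rather than by eye. It assumes the raw request text for the same URL has already been pulled out of the tcpdump output, once with the lua script enabled and once without. The function names, header values, and sample requests are made up for illustration and are not taken from the actual dumps.

```python
def parse_headers(raw: str) -> dict[str, str]:
    """Parse the header block of one raw HTTP message into a dict.

    Header names are lowercased so the comparison is case-insensitive.
    """
    lines = raw.replace("\r\n", "\n").split("\n")
    headers = {}
    for line in lines[1:]:  # skip the request line or status line
        if not line.strip():
            break  # a blank line ends the header block
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def diff_headers(a: dict[str, str], b: dict[str, str]) -> dict[str, tuple]:
    """Return headers that are missing on one side or differ in value."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Hypothetical sample requests; real ones would come from the captures.
with_auth = (
    "GET /p/nameofpad HTTP/1.1\r\n"
    "Host: pad.domain.tld\r\n"
    "Cookie: sid=abc\r\n"
    "\r\n"
)
without_auth = (
    "GET /p/nameofpad HTTP/1.1\r\n"
    "Host: pad.domain.tld\r\n"
    "Cookie: sid=abc\r\n"
    "X-Forwarded-For: 10.0.0.1\r\n"
    "\r\n"
)

for name, (left, right) in diff_headers(
    parse_headers(with_auth), parse_headers(without_auth)
).items():
    print(f"{name}: with auth={left!r}, without auth={right!r}")
```

Running the same comparison on the responses, and on the websocket upgrade request in particular, would show whether any header or cookie goes missing when the lua script is in the path.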

Log files are equally unhelpful. The etherpad log shows the null variable, but I can find no clues in the logs for haproxy, etherpad, or the apache instance that is doing the authnz auth.

In my investigation, I found that the apache authentication instance would sometimes return a 304 or 408. I altered the lua script to return an auth_response_successful on these codes, but it didn't fix the problem.

Through my tcpdump, I discovered that etherpad is using websockets. I found this page:

https://www.haproxy.com/blog/websockets-load-balancing-with-haproxy/

which indicates that haproxy will automatically change to tunnel mode and support websockets. I can find no reason why the lua script might interfere with this, and ethercalc also uses websockets, but so far it is the only thing I can find that *might* be causing data to not be passed for the variable(s) to be populated.

If you are still with me, thank you so much for reading this far. I would truly appreciate any thoughts you might have on how to diagnose what is causing this issue.

--
Bob Miller
Cell: 867-334-7117
Office: 867-633-3760
www.computerisms.ca
