Re: [jupyter] Shibboleth + apache2 reverse proxy + .sparkmagic + spark + cookies = major headache

2018-10-10 Thread Bryn Smith
The wss: vs ws: difference is an artifact of trying a bunch of different 
variations. When you say you moved the websocket proxy higher, higher than 
what: the shib config lines? I've just tried mod_rewrite instead of 
mod_proxy (with both ws: and wss: for the websockets), and it made no 
difference. The kernel still wouldn't connect. I'm agnostic about where SSL 
terminates.
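
On the ws: vs wss: question, one way to keep the two proxy targets consistent is to derive the websocket target from the same URL the hub binds to: a backend serving https needs wss, and a backend serving plain http needs ws. The sketch below only illustrates that mapping. The helper name websocket_backend_url is hypothetical and is not part of JupyterHub or Apache.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed mapping: a TLS backend (https) pairs with wss, a plain one (http) with ws.
_WS_SCHEME = {"http": "ws", "https": "wss"}


def websocket_backend_url(bind_url: str) -> str:
    """Return the websocket URL whose scheme matches bind_url (hypothetical helper)."""
    parts = urlsplit(bind_url)
    if parts.scheme not in _WS_SCHEME:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # Keep host, port and path; only the scheme changes.
    return urlunsplit((_WS_SCHEME[parts.scheme], parts.netloc, parts.path, "", ""))


if __name__ == "__main__":
    # bind_url value taken from the config quoted in this thread
    print(websocket_backend_url("https://127.0.0.1:8000"))
```

With the bind_url from the quoted config (https, because the hub has ssl_cert and ssl_key set), the matching websocket target would be wss. If SSL were terminated at Apache and the hub bound to plain http, it would be ws.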


On Wednesday, October 10, 2018 at 1:36:22 PM UTC-4, Evan Clark wrote:
>
> Any reason why you are proxying to wss instead of ws? In this config I’d 
> expect SSL termination at Apache and not at JupyterHub. I had a 
> similar problem and it was caused by websockets not being properly proxied. 
> I resolved it by moving the websocket proxy higher and, I believe, using 
> mod_rewrite instead.
>
> —
> Regards,
> Evan Clark
>  
> --
> *From:* 30051004020n behalf of 
> *Sent:* Wednesday, October 10, 2018 1:02 PM
> *To:* Project Jupyter
> *Subject:* [jupyter] Shibboleth + apache2 reverse proxy + .sparkmagic + 
> spark + cookies = major headache 
>  
> Hi, 
> I promise I have searched extensively in/for jupyterhub gitlab, 
> jhub_shibboleth_auth, jhub_remote_user_authenticator, the Shibboleth docs, 
> the Jupyter docs, reverse proxy setups, even websockets + httpd.
>
> tl;dr problem: With the proxy turned on, everything works up to the kernel 
> trying to connect to spark, but with the proxy off, users can only connect 
> if they have a previous cookie.
>
> My setup:
>
> Ubuntu 16.04
> Jupyterhub 0.9.2
> Shibboleth 2.6.1-1
> Apache2 (httpd) 2.4.18
> I've tried both jhub_remote_user_authenticator-0.0.2 and 
> jhub_shibboleth_auth-1.3.0
>
> The goal is to have a shibbed jhub instance to connect to a spark instance 
> on a separate hadoop cluster.  
>
> If I have the proxy turned on, users can log in no problem, their notebook 
> starts up and everything is fine until they try to start up a kernel.  I've 
> tried pyspark, sparkmagic, plain old python3, and probably one or two 
> others, and none of them start.
>
> If the proxy is turned off after the user has the notebook cookie, they 
> can reconnect to the non-proxied URL and launch their notebook and the 
> kernel is fine.
>
> If the user tries to log in when the proxy is turned off, and connects to 
> the non-proxied URL, if they do not have a cookie they get a 403 Forbidden 
> error and the REMOTE_USER is not passed from Shibboleth to the hub, so they 
> get no notebook and no anything.
>
>
> My configs:
> Jupyterhub with proxy on: (with the proxy turned off you would swap the 
> bind_url lines)
>
> c.JupyterHub.admin_access = True
>
> c.JupyterHub.hub_ip = '10.138.20.98'
>
> c.JupyterHub.bind_url = 'https://127.0.0.1:8000'
> #c.JupyterHub.bind_url = 'https://jupyter-dev.$UNIV.edu:8000'
>
> c.Application.log_level = 'DEBUG'
> c.JupyterHub.extra_log_file = '/var/log/jupyterhub.log'
>
> c.JupyterHub.ssl_cert = '/etc/jupyterhub/ssl.crt'
>
> c.JupyterHub.ssl_key = '/etc/jupyterhub/ssl.key'
>
> c.Authenticator.admin_users = {'bryn'}
>
> #c.Spawner.notebook_dir = '~/notebooks'
>
> c.JupyterHub.authenticator_class = 'jhub_shibboleth_auth.shibboleth_auth.ShibbolethAuthenticator'
> #c.JupyterHub.authenticator_class = 'jhub_remote_user_authenticator.remote_user_auth.RemoteUserAuthenticator'
>
> Apache with proxy on and websockets:
>
> 
>  
> 
> ServerAdmin help@$UNIV.edu
> ServerName jupyter-dev.$UNIV.edu
>
>
> ErrorLog ${APACHE_LOG_DIR}/error-ssl.log
> CustomLog ${APACHE_LOG_DIR}/access-ssl.log combined
>
> SSLEngine on
>
> SSLCertificateFile /etc/ssl/certs/ssl-cert-jupyter-dev.crt
> SSLCertificateKeyFile /etc/ssl/private/ssl-cert-jupyter-dev.key
>
> SSLCACertificatePath /etc/ssl/certs/
> SSLCACertificateFile /etc/ssl/certs/incommon-2015.crt
>
> ProxyVia On
> ProxyRequests Off
> ProxyPreserveHost on
> SSLProxyEngine on
>
>
> 
> Authtype shibboleth
> ShibRequireSession On
> ShibUseHeaders On
> require shibboleth
> RequestHeader set REMOTE_USER %{REMOTE_USER}s
> ProxyPass wss://127.0.0.1:8000
> ProxyPassReverse wss://127.0.0.1:8000
> ProxyPass https://127.0.0.1:8000/
> ProxyPassReverse https://127.0.0.1:8000/
>
>   
>
> # 
> # "/jupyter/(user/[^/]*)/(api/kernels/[^/]+/channels|terminals/websocket)/?"\ 
> >
> #  
>
> # 
>
> 
>  
> 
>
> Apache2 without proxy info:
>
> 
> 
> ServerAdmin help@$UNIV.edu
> ServerName jupyter-dev.$UNIV.edu
>
>
> ErrorLog ${APACHE_LOG_DIR}/error-ssl.log
> CustomLog ${APACHE_LOG_DIR}/access-ssl.log combined
>
> SSLEngine on
>
> SSLCertificateFile /etc/ssl/certs/ssl-cert-jupyter-dev.crt
> SSLCertificateKeyFile /etc/ssl/private/ssl-cert-jupyter-dev.key
>
> SSLCACertificatePath /etc/ssl/certs/
> SSLCACertificateFile /etc/ssl/certs/incommon-2015.crt
>
>
>
> 
> Authtype shibboleth
> ShibRequireSession On
> ShibUseHeaders On
> require shibboleth
> RequestHeader set REMOTE_USER %{REMOTE_USER}s
> 
>
>   
>
>
> 
>  
>
>
> I have tried approximately 30 variations on the websockets proxy 
> configuration, the main proxy configuration, 

Re: [jupyter] Shibboleth + apache2 reverse proxy + .sparkmagic + spark + cookies = major headache

2018-10-10 Thread Evan Clark
Any reason why you are proxying to wss instead of ws? In this config I’d expect 
SSL termination at Apache and not at JupyterHub. I had a similar problem 
and it was caused by websockets not being properly proxied. I resolved it by 
moving the websocket proxy higher and, I believe, using mod_rewrite 
instead.

—
Regards,
Evan Clark
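
To make "websockets not being properly proxied" concrete: a websocket connection starts as an ordinary HTTP request carrying Upgrade and Connection headers, and a rewrite-based proxy setup usually routes on those headers. The sketch below shows that check in Python, under the assumption that the proxy looks only at these two headers. The function name is_websocket_upgrade is hypothetical and this is not Apache's or JupyterHub's actual code.

```python
from typing import Mapping


def is_websocket_upgrade(headers: Mapping[str, str]) -> bool:
    """Return True if the request headers ask for a websocket upgrade.

    Header names are matched case-insensitively. The Connection header may
    list several comma-separated tokens, for example "keep-alive, Upgrade".
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    upgrade = lowered.get("upgrade", "").strip().lower()
    connection_tokens = [
        token.strip().lower() for token in lowered.get("connection", "").split(",")
    ]
    return upgrade == "websocket" and "upgrade" in connection_tokens


if __name__ == "__main__":
    kernel_request = {"Upgrade": "websocket", "Connection": "keep-alive, Upgrade"}
    page_request = {"Connection": "keep-alive"}
    print(is_websocket_upgrade(kernel_request), is_websocket_upgrade(page_request))
```

If the proxy forwards a request like kernel_request as plain HTTP, or drops these headers along the way, the kernel channel cannot be established even though normal pages load.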


From: 30051004020n behalf of
Sent: Wednesday, October 10, 2018 1:02 PM
To: Project Jupyter
Subject: [jupyter] Shibboleth + apache2 reverse proxy + .sparkmagic + spark + 
cookies = major headache

Hi,
I promise I have searched extensively in/for jupyterhub gitlab, 
jhub_shibboleth_auth, jhub_remote_user_authenticator, the Shibboleth docs, 
the Jupyter docs, reverse proxy setups, even websockets + httpd.

tl;dr problem: With the proxy turned on, everything works up to the kernel 
trying to connect to spark, but with the proxy off, users can only connect if 
they have a previous cookie.

My setup:

Ubuntu 16.04
Jupyterhub 0.9.2
Shibboleth 2.6.1-1
Apache2 (httpd) 2.4.18
I've tried both jhub_remote_user_authenticator-0.0.2 and 
jhub_shibboleth_auth-1.3.0

The goal is to have a shibbed jhub instance to connect to a spark instance on a 
separate hadoop cluster.

If I have the proxy turned on, users can log in no problem, their notebook 
starts up and everything is fine until they try to start up a kernel.  I've 
tried pyspark, sparkmagic, plain old python3, and probably one or two others, 
and none of them start.

If the proxy is turned off after the user has the notebook cookie, they can 
reconnect to the non-proxied URL and launch their notebook and the kernel is 
fine.

If the user tries to log in when the proxy is turned off, and connects to the 
non-proxied URL, if they do not have a cookie they get a 403 Forbidden error 
and the REMOTE_USER is not passed from Shibboleth to the hub, so they get no 
notebook and no anything.


My configs:
Jupyterhub with proxy on: (with the proxy turned off you would swap the 
bind_url lines)

c.JupyterHub.admin_access = True

c.JupyterHub.hub_ip = '10.138.20.98'

c.JupyterHub.bind_url = 'https://127.0.0.1:8000'
#c.JupyterHub.bind_url = 'https://jupyter-dev.$UNIV.edu:8000'

c.Application.log_level = 'DEBUG'
c.JupyterHub.extra_log_file = '/var/log/jupyterhub.log'

c.JupyterHub.ssl_cert = '/etc/jupyterhub/ssl.crt'

c.JupyterHub.ssl_key = '/etc/jupyterhub/ssl.key'

c.Authenticator.admin_users = {'bryn'}

#c.Spawner.notebook_dir = '~/notebooks'

c.JupyterHub.authenticator_class = 'jhub_shibboleth_auth.shibboleth_auth.ShibbolethAuthenticator'
#c.JupyterHub.authenticator_class = 'jhub_remote_user_authenticator.remote_user_auth.RemoteUserAuthenticator'

Apache with proxy on and websockets:


 

ServerAdmin help@$UNIV.edu
ServerName jupyter-dev.$UNIV.edu


ErrorLog ${APACHE_LOG_DIR}/error-ssl.log
CustomLog ${APACHE_LOG_DIR}/access-ssl.log combined

SSLEngine on

SSLCertificateFile /etc/ssl/certs/ssl-cert-jupyter-dev.crt
SSLCertificateKeyFile /etc/ssl/private/ssl-cert-jupyter-dev.key

SSLCACertificatePath /etc/ssl/certs/
SSLCACertificateFile /etc/ssl/certs/incommon-2015.crt

ProxyVia On
ProxyRequests Off
ProxyPreserveHost on
SSLProxyEngine on



Authtype shibboleth
ShibRequireSession On
ShibUseHeaders On
require shibboleth
RequestHeader set REMOTE_USER %{REMOTE_USER}s
ProxyPass wss://127.0.0.1:8000
ProxyPassReverse wss://127.0.0.1:8000
ProxyPass https://127.0.0.1:8000/
ProxyPassReverse https://127.0.0.1:8000/

  

# 
#
#  

# 


 


Apache2 without proxy info:



ServerAdmin help@$UNIV.edu
ServerName jupyter-dev.$UNIV.edu


ErrorLog ${APACHE_LOG_DIR}/error-ssl.log
CustomLog ${APACHE_LOG_DIR}/access-ssl.log combined

SSLEngine on

SSLCertificateFile /etc/ssl/certs/ssl-cert-jupyter-dev.crt
SSLCertificateKeyFile /etc/ssl/private/ssl-cert-jupyter-dev.key

SSLCACertificatePath /etc/ssl/certs/
SSLCACertificateFile /etc/ssl/certs/incommon-2015.crt




Authtype shibboleth
ShibRequireSession On
ShibUseHeaders On
require shibboleth
RequestHeader set REMOTE_USER %{REMOTE_USER}s


  



 


I have tried approximately 30 variations on the websockets proxy configuration, 
the main proxy configuration, with shib turned on, with it off, new sparkmagic 
configs, etc.

When the proxy is turned off, the user goes straight to tornado and bypasses 
shibboleth so of course their user info is not passed in.  If they had already 
logged in, their cookie is good, of course, but I can't go through this song 
and dance for each new user.
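
The behavior described above can be sketched as follows: a remote-user style login step reads the username from a request header, and when the request never passed through Apache and Shibboleth there is no such header, so the login is refused. This is a minimal illustration only. The function name resolve_remote_user is hypothetical and this is not the actual code of jhub_remote_user_authenticator or jhub_shibboleth_auth.

```python
from typing import Mapping, Optional


def resolve_remote_user(
    headers: Mapping[str, str], header_name: str = "REMOTE_USER"
) -> Optional[str]:
    """Return the username from the trusted header, or None if it is absent.

    Header names are compared case-insensitively; an empty or
    whitespace-only value counts as missing.
    """
    wanted = header_name.lower()
    for name, value in headers.items():
        if name.lower() == wanted and value.strip():
            return value.strip()
    return None


if __name__ == "__main__":
    via_proxy = {"Host": "jupyter-dev.example.edu", "REMOTE_USER": "bryn"}
    direct = {"Host": "jupyter-dev.example.edu:8000"}  # request went straight to tornado
    print(resolve_remote_user(via_proxy))  # username set by the proxy
    print(resolve_remote_user(direct))     # nothing to authenticate with
```

The hostnames in the usage example are placeholders. The point is that only a request that passed through the Shibboleth-protected proxy carries the header, so a new user connecting directly has nothing to authenticate with.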


This is what is in jupyterhub.log when the proxy is turned off, so you can see 
that it is definitely not getting REMOTE_USER from shib:
==> /var/log/jupyterhub.log <==
[I 2018-10-10 12:33:19.664 JupyterHub log:158] 302 GET / -> /hub (@10.237.5.144) 0.91ms
[I 2018-10-10 12:33:19.691 JupyterHub log:158] 302 GET /hub -> /hub/ (@10.237.5.144) 0.69ms
[I 2018-10-10 12:33:19.708 JupyterHub log:158] 302 GET /hub/ -> /hub/login (@10.237.5.144) 0.80ms
[D 2018-10-10 12:33:19.727