Well, I'm not convinced it is fixed now :(  I hardcoded the /etc/hosts file
on all my docker containers and swarm nodes for the ad.uni.edu to an ip
address that is known to be working for ldap ssl.  The NPE took more login
attempts to come up, but it still arrived.  At least I'm not seeing
additional errors in the logs about no route to host though...  Sorry, I
had deleted some of my PR comments about the issue, since I thought it was
just an error on my end.  But now I'm not certain what the cause might be at
this point.

/etc/nsswitch.conf has files as the source of truth before dns, so I don't
think the handshake issue is routing based anymore.  I've also run the
script against the server quite extensively and I'm not seeing any failed
connections now...  I think it must be something to do with this NPE.
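
A minimal sketch of how a long run of that check loop could be summarized per
resolved address, so that a single bad server behind the DNS name stands out.
It assumes nc prints status lines shaped like the failure quoted further down
in this thread; the "open" success line is an assumption about traditional
netcat, and 10.10.0.20 in the comment is a hypothetical second address, not
one taken from the logs.  tally_nc_log is a made-up helper name.

```shell
#!/bin/sh
# Tally nc -v results per resolved address, reading nc output on stdin.
# Assumed failure line (as quoted in this thread):
#   ad.uni.edu [10.10.0.19] 636 (?) : No route to host
# Assumed success line (traditional netcat; the address is hypothetical):
#   ad.uni.edu [10.10.0.20] 636 (ldaps) open
tally_nc_log() {
    awk '
        # Only lines with a bracketed address are counted; the
        # "inverse host lookup failed" warnings are skipped.
        match($0, /\[[0-9a-fA-F.:]+\]/) {
            ip = substr($0, RSTART + 1, RLENGTH - 2)
            if ($0 ~ / open[ \t]*$/) ok[ip]++; else fail[ip]++
            seen[ip] = 1
        }
        END {
            for (ip in seen)
                printf "%s ok=%d fail=%d\n", ip, ok[ip], fail[ip]
        }
    ' | sort
}
```

Since nc -v usually writes its status lines to stderr, a bounded run would be
piped in with something like
`for i in $(seq 1 100); do nc -w 3 -z -v ad.uni.edu 636; sleep 1; done 2>&1 | tally_nc_log`.
With ad.uni.edu pinned in /etc/hosts, only one address should show up in the
summary.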

Michael Barkdoll


On Fri, Apr 26, 2019 at 11:50 AM Emmanuel Lecharny <elecha...@apache.org>
wrote:

> Good to see you have found the root cause of your issue. May I ask you to
> file a JIRA for the NPE so that we don’t forget to fix it?
>
> Many thanks!
>
> On Fri, Apr 26, 2019 at 17:55, Michael Barkdoll <mabarkd...@gmail.com>
> wrote:
>
> > I tried removing the valid=10s from the docker swarm dns resolver to see if
> > it makes a difference, but I still received an error [1] after several
> > successful ldap logins.  I noticed this error states:
> >
> > org.apache.mina.core.RuntimeIoException: Failed to get the session.
> > Caused by: java.net.NoRouteToHostException: No route to host
> >
> > So, I made a bash script to check if there were any routing issues.
> >
> > ```
> > # Probe the LDAPS port once per second and print nc's exit status.
> > while true; do
> >   nc -w 3 -z -v ad.uni.edu 636; echo $?
> >   sleep 1
> > done
> > ```
> > Output:
> > Warning: inverse host lookup failed for 10.10.0.19: Unknown host
> > ad.uni.edu [10.10.0.19] 636 (?) : No route to host
> >
> > I think one of the servers in the DNS entry is bad! I had hard coded Apache
> > Guacamole to only connect to a good one, but I think the Apache LDAP API is
> > doing a bind with the DNS entry provided by the ldap-user-base-dn:
> > dc=ad,dc=uni,dc=edu in apache guacamole.  I'm going to email our windows
> > folks and see if they can get that server out of the DNS entry since I
> > think it is the cause.
> >
> > [1]
> > https://gist.github.com/michaelbarkdoll/bc8ae3b13b1a20dd4ac259d6c20c011c
> >
> > Michael Barkdoll
> >
> >
> > On Fri, Apr 26, 2019 at 10:06 AM Michael Barkdoll <mabarkd...@gmail.com>
> > wrote:
> >
> > > The ldap server is active directory 2016.
> > >
> > > The code that is using the directory ldap api is from a tomcat .WAR
> > > (apache guacamole) [1].  I forked [1] and customized the jira/234 PR to
> > > support ldap and nginx websocket load balancing in this repo [2] according
> > > to apache guacamole's documentation.  I'm using docker swarm to set up an
> > > overlay network between an nginx reverse proxy and two separate apache
> > > guacamole tomcat servlets.  The nginx reverse proxy nginx.conf file is
> > > provided here [3].
> > >
> > > You're correct that the userX log entries are successful ldap login
> > > attempts that I make against the tomcat .WAR; I then immediately log out
> > > and back in again, repeating until the error occurs.  What would be causing
> > > the handshake to not end?
> > >
> > > [1] https://github.com/apache/guacamole-client
> > > [2] https://github.com/michaelbarkdoll/guacamole-client/tree/jira/234
> > > [3] https://gist.github.com/michaelbarkdoll/d78614635fa0432ab08100d05f1a4919
> > >
> > > Michael Barkdoll
> > >
> > >
> > >
> > > On Fri, Apr 26, 2019 at 12:26 AM Stefan Seelmann <m...@stefan-seelmann.de>
> > > wrote:
> > >
> > >> On 4/26/19 7:09 AM, Emmanuel Lecharny wrote:
> > >> >> ERR_04122_SSL_CONTEXT_INIT_FAILURE Failed to initialize the SSL context
> > >> >>
> > >> >> java.lang.NullPointerException: null
> > >> >> at org.apache.directory.ldap.client.api.LdapNetworkConnection.connect(LdapNetworkConnection.java:689)
> > >> >
> > >> >
> > >> > It seems, from the code, that the connection times out. The NPE is
> > >> > unfortunate (and we will fix it) but it’s just masking the real cause:
> > >> > the handshake never ends.
> > >> >
> > >> > What is the scenario you are running?
> > >>
> > >> Especially, which LDAP server do you use?
> > >>
> > >> In error3.txt and error4.txt I see multiple log messages "User "userX"
> > >> successfully authenticated". Does that mean in those cases the
> > >> connection to LDAP worked and it only fails randomly? It seems there are
> > >> multiple threads involved, so maybe it's a concurrency issue...
> > >>
> > >>
> > >>
> > >>
> > >>
> >
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>
