I've attached a patch to the ticket so even MCF 0.3 users should be able to apply it.
Karl

On Wed, Nov 23, 2011 at 5:40 AM, Karl Wright <daddy...@gmail.com> wrote:
> To clarify, what I think may be happening is this.
>
> (1) The Java LDAP context is keeping a socket connection to the AD controller.
> (2) The AD controller must be configured to close connections forcibly
> after a certain period of time.
> (3) The LDAP context's reconnect() operation doesn't recover from a
> socket that was closed by the server.
> (4) The authority code won't release the LDAP context until 5 idle
> minutes go by.
>
> So basically, a connection winds up in a busted state and doesn't
> recover, if the server closes the socket out from under the ldap
> connection.
>
> It's easy to fix, so I've opened a ticket (CONNECTORS-291), and will
> commit code changes to trunk shortly. What version of MCF are you
> using?
>
> Karl
>
> On Wed, Nov 23, 2011 at 5:23 AM, Karl Wright <daddy...@gmail.com> wrote:
>> Hi Swapna,
>>
>> There should be manifoldcf log output that contains the actual stack
>> trace of the exception. That would be very helpful; I need the line
>> numbers.
>>
>> The code is quite simple, and indicates that the LDAP server is
>> refusing a connection:
>>
>>   protected void getSession()
>>     throws ManifoldCFException
>>   {
>>     if (ctx == null)
>>     {
>>       // Calculate the ldap url first
>>       String ldapURL = "ldap://" + domainControllerName + ":389";
>>
>>       Hashtable env = new Hashtable();
>>
>>       env.put(Context.INITIAL_CONTEXT_FACTORY,"com.sun.jndi.ldap.LdapCtxFactory");
>>       env.put(Context.SECURITY_AUTHENTICATION,authentication);
>>       env.put(Context.SECURITY_PRINCIPAL,userName);
>>       env.put(Context.SECURITY_CREDENTIALS,password);
>>
>>       //connect to my domain controller
>>       env.put(Context.PROVIDER_URL,ldapURL);
>>
>>       //specify attributes to be returned in binary format
>>       env.put("java.naming.ldap.attributes.binary","tokenGroups objectSid");
>>
>>       // Now, try the connection...
>>       try
>>       {
>>         ctx = new InitialLdapContext(env,null);
>>       }
>>       catch (AuthenticationException e)
>>       {
>>         // This means we couldn't authenticate!
>>         throw new ManifoldCFException("Authentication problem authenticating admin user '"+userName+"': "+e.getMessage(),e);
>>       }
>>       catch (CommunicationException e)
>>       {
>>         // This means we couldn't connect, most likely
>>         throw new ManifoldCFException("Couldn't communicate with domain controller '"+domainControllerName+"': "+e.getMessage(),e);
>>       }
>>       catch (NamingException e)
>>       {
>>         throw new ManifoldCFException(e.getMessage(),e);
>>       }
>>     }
>>     else
>>     {
>>       // Attempt to reconnect. I *hope* this is efficient and doesn't do unnecessary work.
>>       try
>>       {
>>         ctx.reconnect(null);
>>       }
>>       catch (AuthenticationException e)
>>       {
>>         // This means we couldn't authenticate!
>>         throw new ManifoldCFException("Authentication problem authenticating admin user '"+userName+"': "+e.getMessage(),e);
>>       }
>>       catch (CommunicationException e)
>>       {
>>         // This means we couldn't connect, most likely
>>         throw new ManifoldCFException("Couldn't communicate with domain controller '"+domainControllerName+"': "+e.getMessage(),e);
>>       }
>>       catch (NamingException e)
>>       {
>>         throw new ManifoldCFException(e.getMessage(),e);
>>       }
>>     }
>>
>>     expiration = System.currentTimeMillis() + expirationInterval;
>>
>>     try
>>     {
>>       responseLifetime = Long.parseLong(this.cacheLifetime) * 60L * 1000L;
>>       LRUsize = Integer.parseInt(this.cacheLRUsize);
>>     }
>>     catch (NumberFormatException e)
>>     {
>>       throw new ManifoldCFException("Cache lifetime or Cache LRU size must be an integer: "+e.getMessage(),e);
>>     }
>>
>>   }
>>
>>
>> Your problem description indicates that it is possible that the
>> ctx.reconnect() call is failing to reconnect, but a new connection
>> works OK on your setup. A stack trace should tell me everything.
>>
>> Thanks,
>> Karl
>>
>>
>> On Wed, Nov 23, 2011 at 12:58 AM, Swapna Vuppala
>> <swapna.kollip...@gmail.com> wrote:
>>> Hi Karl,
>>>
>>> Even after reducing the max connections to 3, the connection fails abruptly
>>> for me.
>>>
>>> Currently, the domain controller I am using is mapped to only one IP address,
>>> and that responds on ping, and the max connections are 3. It was working
>>> yesterday and it fails suddenly throwing different exceptions like below:
>>>
>>> Threw exception: 'Couldn't communicate with domain controller 'globalad1': null'
>>> Threw exception: 'Couldn't communicate with domain controller 'globalad1.global.arup.com': null'
>>> Threw exception: 'globalad1.global.arup.com:389; socket closed'
>>>
>>> Sometimes, it works when I change the cache lifetime parameter. What other
>>> factors do you think can cause this to fail?
>>>
>>> Thanks and Regards,
>>> Swapna.
>>>
>>> On Tue, Nov 22, 2011 at 11:56 AM, Swapna Vuppala
>>> <swapna.kollip...@gmail.com> wrote:
>>>>
>>>> OK.. Thanks for the information
>>>>
>>>> On Mon, Nov 21, 2011 at 6:31 PM, Karl Wright <daddy...@gmail.com> wrote:
>>>>>
>>>>> The sAMAccountName and UserPrincipalName LDAP fields were used by
>>>>> different versions of Windows at different points in time. Some
>>>>> backwards compatibility was maintained, however Microsoft has
>>>>> apparently decided to deprecate one of them (can't remember which),
>>>>> and thus you need support for both.
>>>>>
>>>>> Karl
>>>>>
>>>>> On Mon, Nov 21, 2011 at 6:39 AM, Swapna Vuppala
>>>>> <swapna.kollip...@gmail.com> wrote:
>>>>> > Hi Karl,
>>>>> >
>>>>> > Yes, my Active Directory authority connection is configured to talk
>>>>> > to only one IP address and that particular one is responding to ping
>>>>> > always.
>>>>> >
>>>>> > Earlier, the max connections parameter was set to 10, now I reduced
>>>>> > it to 3. It's working as of now and I'll keep checking if it's going
>>>>> > to throw an exception. Thanks a lot for the inputs.
>>>>> >
>>>>> > Also, I was wondering what the difference is between the two options
>>>>> > for the Login name AD attribute, sAMAccountName and UserPrincipalName?
>>>>> >
>>>>> > Thanks and Regards,
>>>>> > Swapna.
>>>>> >
>>>>> > On Mon, Nov 21, 2011 at 4:57 PM, Karl Wright <daddy...@gmail.com>
>>>>> > wrote:
>>>>> >>
>>>>> >> So let me get this straight - your Active Directory authority
>>>>> >> connection is configured to talk to only one IP address? And that IP
>>>>> >> address responds to ping even when you are receiving an error back
>>>>> >> from the authority connection?
>>>>> >>
>>>>> >> Another possibility is that the DC can only accept a limited number of
>>>>> >> connections at a time. What is the max connections parameter for your
>>>>> >> authority connection? Try reducing it to no more than 3-4 and see if
>>>>> >> that helps.
>>>>> >>
>>>>> >> Karl
>>>>> >>
>>>>> >>
>>>>> >> On Mon, Nov 21, 2011 at 5:34 AM, Swapna Vuppala
>>>>> >> <swapna.kollip...@gmail.com> wrote:
>>>>> >> > Hi Karl,
>>>>> >> >
>>>>> >> > I think I see many domain controllers for the domain I am using.
>>>>> >> > But I see only one IP address mapped to the domain controller name
>>>>> >> > that I am using in the credentials form.
>>>>> >> >
>>>>> >> > As I told you, it's working sometimes and throwing an exception
>>>>> >> > sometimes. But ping always works fine on the domain controller name
>>>>> >> > that I am using, from which I assume that it is not unreachable.
>>>>> >> >
>>>>> >> > Can you tell me what else I should be checking or what other
>>>>> >> > factors could be causing this to fail?
>>>>> >> >
>>>>> >> > Thanks and Regards,
>>>>> >> > Swapna.
>>>>> >> >
>>>>> >> > On Thu, Nov 17, 2011 at 1:18 PM, Karl Wright <daddy...@gmail.com>
>>>>> >> > wrote:
>>>>> >> >>
>>>>> >> >> Try doing nslookup on the domain controller. In some larger
>>>>> >> >> companies there are many domain controllers all with the same name
>>>>> >> >> but different IP's. These *should* all be in sync but it may be
>>>>> >> >> the case that they are not - or some of them are unreachable or
>>>>> >> >> offline. This can also be the cause of intermittent authorization
>>>>> >> >> failures during crawling.
>>>>> >> >>
>>>>> >> >> If that is the case you have the option of setting the local
>>>>> >> >> machine's /etc/hosts file to point to a couple of domain
>>>>> >> >> controller instances that are local and in good working order,
>>>>> >> >> rather than rely on DNS to find one.
>>>>> >> >>
>>>>> >> >> Karl
>>>>> >> >>
>>>>> >> >> On Thu, Nov 17, 2011 at 1:32 AM, Swapna Vuppala
>>>>> >> >> <swapna.kollip...@gmail.com> wrote:
>>>>> >> >> > Hi,
>>>>> >> >> >
>>>>> >> >> > I seem to have some problem with Authority Connection. When I
>>>>> >> >> > define an Authority Connection specifying all the parameters
>>>>> >> >> > like Domain Controller, username, password etc, the connection
>>>>> >> >> > status shows "Connection Working" and everything works fine,
>>>>> >> >> > crawling and sending docs to Solr, using mcf-authority-service
>>>>> >> >> > to get only those docs that a user has got permission to see
>>>>> >> >> > etc.
>>>>> >> >> >
>>>>> >> >> > But suddenly, the connection status for the Authority Connection
>>>>> >> >> > throws an exception, and when I play around with the credentials
>>>>> >> >> > form toggling Login name AD attribute, or changing domain
>>>>> >> >> > controller name, or authentication, or sometimes even with the
>>>>> >> >> > same settings that threw an exception earlier, the status shows
>>>>> >> >> > "Connection working" again. I cannot define when it fails and
>>>>> >> >> > when it works and for what settings it works.
>>>>> >> >> >
>>>>> >> >> > Can someone help me in understanding why this is happening and
>>>>> >> >> > what needs to be done to make it work always?
>>>>> >> >> >
>>>>> >> >> > Thanks and Regards,
>>>>> >> >> > Swapna.
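
To illustrate the failure mode diagnosed in this thread (reconnect() not
recovering once the server has closed the socket), here is a minimal Java
sketch of a refresh-or-recreate pattern. It is not the CONNECTORS-291 patch,
and it may differ from what was actually committed. RecoveringSession and
Connector are hypothetical names introduced only for this example, and the
generic type T stands in for the LDAP context so the sketch runs without a
directory server.

```java
/**
 * Hypothetical helper, not part of ManifoldCF: holds one connection-like
 * resource and rebuilds it when refreshing the existing one fails.
 */
final class RecoveringSession<T> {

  /**
   * Hypothetical callbacks. For the LDAP case discussed above, these could
   * wrap new InitialLdapContext(env, null), ctx.reconnect(null) and
   * ctx.close() respectively (an assumption, not the committed fix).
   */
  interface Connector<T> {
    T open() throws Exception;
    void refresh(T session) throws Exception;
    void close(T session);
  }

  private final Connector<T> connector;
  private T session;

  RecoveringSession(Connector<T> connector) {
    this.connector = connector;
  }

  /** Returns a usable session, recreating it if the refresh attempt fails. */
  synchronized T get() throws Exception {
    if (session != null) {
      try {
        connector.refresh(session);
        return session;
      } catch (Exception e) {
        // The server may have closed the socket out from under us:
        // discard the stale session instead of staying in a broken state.
        connector.close(session);
        session = null;
      }
    }
    // Either there was no session yet, or the old one was discarded.
    session = connector.open();
    return session;
  }
}
```

The design choice here is simply that a failed refresh is treated as "this
session is dead" rather than as a fatal error, so the next step is always a
fresh open. A failure in open() itself still propagates to the caller, which
matches the case where the domain controller really is unreachable.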