Your explanation makes perfect sense; I hadn't considered that the call actually gets through and the remote method just hangs. Glad you found the source of the problem, and thanks for the update.
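
For anyone who finds this thread later: the failure mode described in the quoted comment (a pool of 8 connections that are borrowed but never handed back) can be reproduced with a small JDK-only sketch. This is not the Apache LDAP API; `PoolLeakDemo`, `DemoPool`, `borrow`, `release` and `successfulCalls` are hypothetical names, and the 50 ms borrow timeout is an assumption that stands in for the infinite default wait so the demo terminates instead of hanging.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a connection pool; NOT the Apache LDAP API.
final class PoolLeakDemo {

    /** Fixed-size pool of dummy "connections" (plain Integer ids). */
    static final class DemoPool {
        private final BlockingQueue<Integer> idle;

        DemoPool(int size) {
            idle = new ArrayBlockingQueue<>(size);
            for (int i = 0; i < size; i++) {
                idle.add(i);
            }
        }

        /** Borrows a connection, waiting at most maxWaitMillis; null on timeout. */
        Integer borrow(long maxWaitMillis) {
            try {
                return idle.poll(maxWaitMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }

        /** Hands a borrowed connection back so later callers can reuse it. */
        void release(Integer connection) {
            idle.offer(connection);
        }
    }

    /**
     * Runs up to `calls` simulated jobs against a pool of `poolSize` connections
     * and returns how many obtained a connection before a borrow timed out.
     */
    static int successfulCalls(int poolSize, int calls, boolean releaseAfterUse) {
        DemoPool pool = new DemoPool(poolSize);
        int succeeded = 0;
        for (int i = 0; i < calls; i++) {
            Integer connection = pool.borrow(50); // assumed short timeout for the demo
            if (connection == null) {
                break; // pool exhausted; with an infinite wait this call would hang
            }
            try {
                succeeded++; // the LDAP work would happen here
            } finally {
                if (releaseAfterUse) {
                    pool.release(connection); // the step that was missing
                }
            }
        }
        return succeeded;
    }

    public static void main(String[] args) {
        System.out.println("without release: " + successfulCalls(8, 20, false));
        System.out.println("with release:    " + successfulCalls(8, 20, true));
    }
}
```

Without the release, the ninth borrow finds the pool empty, which matches the "works 8 times, then hangs" behaviour. Putting the release in a `finally` block is one way to make sure the connection goes back to the pool even when the LDAP work throws.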

Alex Weirig (JIRA) <[email protected]> wrote on Sat, 2 June 2018, 15:26:

> [ https://issues.apache.org/jira/browse/ARIES-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499064#comment-16499064 ]
>
> Alex Weirig commented on ARIES-1804:
> ------------------------------------
>
> I think I finally found the source of the problem...
>
> My LDAP service that does the LDAP processing is using the Apache LDAP
> API. I'm creating a pool to get the LDAP connections. When I'm done with
> the LDAP connection I'm unbinding it and closing it ... but it seems there
> is a need to release the LDAP connection back to the pool.
>
> The default number of LDAP connections in the pool is 8, so after 8 calls,
> no LDAP connection was left in the pool. From what I see, there is no
> stacktrace being produced and, if I read correctly, the default wait time
> for an LDAP connection is infinite.
>
> I was able to reproduce the problem by calling my scheduled job 8 times.
> I've now added a release connection before I unbind and close it and so far
> I haven't seen the problem show up.
>
> I'll let my job run over the weekend but I'm confident that's the cause of
> the problem and obviously not karaf, zookeeper or fastbin. The timeout that
> occurred further on up the road was just a consequence of the LDAP API not
> receiving a connection from the pool and sitting there waiting forever.
> I will close the ticket on Monday if no surprise shows up over the weekend.
>
> I suppose I will fine-tune the default values with a little more attention.
>
> Thank you very much for your support
>
> Alex
>
> > Timeout due to connection loss in RSA fastbin provider?
> > -------------------------------------------------------
> >
> >                 Key: ARIES-1804
> >                 URL: https://issues.apache.org/jira/browse/ARIES-1804
> >             Project: Aries
> >          Issue Type: Bug
> >          Components: Remote Service Admin
> >    Affects Versions: rsa-1.12.0
> >         Environment: Karaf 4.2.0
> >                      RSA 1.12.0
> >                      zookeeper 3.4.12
> >                      java 1.8.0_172-b11
> >                      RHEL 7.5
> >            Reporter: Alex Weirig
> >            Priority: Critical
> >         Attachments: AuthenticationServiceImpl.txt, LoginView.txt, stacktrace.txt, zoo.cfg.txt
> >
> > Hello,
> > I'm running two karaf (4.2.0) servers, one is running the frontend of my
> > application, the second one is running the backend.
> > The backend services are published to 3 clustered zookeeper (3.4.12)
> > servers. In karaf I have deployed the following RSA features:
> > karaf@appsrvtlk()> feature:list | grep rsa
> > aries-rsa-core                       │ 1.12.0 │   │ Started     │ aries-rsa-1.12.0 │
> > aries-rsa-provider-tcp               │ 1.12.0 │   │ Uninstalled │ aries-rsa-1.12.0 │
> > aries-rsa-provider-fastbin           │ 1.12.0 │ x │ Started     │ aries-rsa-1.12.0 │
> > aries-rsa-discovery-local            │ 1.12.0 │   │ Uninstalled │ aries-rsa-1.12.0 │
> > aries-rsa-discovery-config           │ 1.12.0 │   │ Uninstalled │ aries-rsa-1.12.0 │
> > aries-rsa-discovery-zookeeper        │ 1.12.0 │ x │ Started     │ aries-rsa-1.12.0 │
> > aries-rsa-discovery-zookeeper-server │ 1.12.0 │   │ Uninstalled │ aries-rsa-1.12.0 │
> > When I start my karaf servers everything is working fine and my frontend
> > can call my backend service and gets the result. But after some time (I
> > can't figure out when) it seems that the connections between the karaf and
> > zookeeper get lost and I'm getting a timeout when I call my remote service
> > even though all the servers (karaf and zookeepers) are still available and
> > responding. Exhibitor shows no apparent issues with the zookeepers.
> > I have attached the
> > * relevant parts of my LoginView UI where I declared the @Reference to
> >   my service and where I call the remote service
> > * relevant parts of my AuthenticationService implementation that should
> >   be called on the remote karaf
> > * the stacktrace that I'm getting on the frontend karaf when the
> >   timeout occurs
> > * my zoo.cfg file
> > From the stacktrace one can see that the LoginView has a non-null
> > fastbin proxy handler for the authentication service but that after 5
> > minutes a timeout occurs and there is no line in the log that shows that
> > the remote service was actually called.
> > Many thanks in advance for your support.
> > Kind regards,
> > Alex
>
> --
> This message was sent by Atlassian JIRA
> (v7.6.3#76005)
