We are developing a Jira Cloud app that uses Apache Ignite both as data storage and as a job scheduler. This is done via a standard Ignite client node. But we also need Atlassian Connect Spring Boot to be able to communicate with Jira. In short, everything is done exactly as in our article Boosting Jira Cloud app development with Apache Ignite <https://medium.com/alliedium/boosting-jira-cloud-app-development-with-apache-ignite-7eebc7bb3d48>.

At first we used the simple Ignite JDBC driver <https://apacheignite-sql.readme.io/docs/jdbc-driver> just for Atlassian Connect Spring Boot, along with a separate Ignite client node for our own purposes. But this turned out to be very unstable when deployed in our local Kubernetes cluster (built via Kubespray), due to exceptions that kept occurring from time to time (in fact, this showed up only in our local cluster; in AWS EKS everything worked fine). To make this more stable we switched to the Ignite JDBC Client driver <https://apacheignite-sql.readme.io/docs/jdbc-client-driver>, exactly as described in the article mentioned above. Thus, our backend now uses two Ignite client nodes per single JVM: the first one for JDBC used by Atlassian Connect Spring Boot, the second one for our own purposes.

This solution turned out to be good enough: the app now works very stably both in our local cluster and in AWS EKS. But when we deploy the app in Docker for testing and development purposes, our Ignite client nodes hang from time to time. After some investigation we were able to see that this happens exactly at the instant when an IgniteAtomicLong object is created. Below are logs both for a successful initialization of the app and for a case when the Ignite client nodes hung.

Logs when all is OK

ignite-appclientnode-successful.log <http://apache-ignite-users.70518.x6.nabble.com/file/t2262/ignite-appclientnode-successful.log>
ignite-jdbcclientnode-successful.log <http://apache-ignite-users.70518.x6.nabble.com/file/t2262/ignite-jdbcclientnode-successful.log>

Logs when both client nodes hang

ignite-appclientnode-failed.log <http://apache-ignite-users.70518.x6.nabble.com/file/t2262/ignite-appclientnode-failed.log>
ignite-jdbcclientnode-failed.log <http://apache-ignite-users.70518.x6.nabble.com/file/t2262/ignite-jdbcclientnode-failed.log>

Some analysis and questions

From the logs one can see that the caches default, tenants, atlassian_host_audit and SQL_PUBLIC_ATLASSIAN_HOST are manipulated. In fact, default is given in the client configuration: client.xml <http://apache-ignite-users.70518.x6.nabble.com/file/t2262/client.xml>. The cache SQL_PUBLIC_ATLASSIAN_HOST contains the atlassian_host table mentioned in Boosting Jira Cloud app development with Apache Ignite <https://medium.com/alliedium/boosting-jira-cloud-app-development-with-apache-ignite-7eebc7bb3d48> and is created in advance, even before the app starts. Further, atlassian_host_audit is a copy of atlassian_host; in any case it is not yet created when the app hangs.

As for the other entities processed by Ignite, they are created by the code sketched below, and from the logs of the app itself it is clear that the app hangs exactly on the last of those calls, the creation of the IgniteAtomicLong.
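A simplified sketch of that code (not the exact snippet from our app; the wrapper class is only for illustration, while the cache name tenants and the atomic name idGen are the real ones):

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteAtomicLong;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

public class AppStartup {
    public static void main(String[] args) {
        // our own ("app") client node, started from the Spring XML configuration
        Ignite ignite = Ignition.start("client.xml");

        // the tenants cache seen in the logs
        IgniteCache<Object, Object> tenants = ignite.getOrCreateCache("tenants");

        // distributed id generator; the hang happens exactly on this call
        IgniteAtomicLong idGen = ignite.atomicLong("idGen", 0L, true);
    }
}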
This is confirmed by the fact that in ignite-jdbcclientnode-successful.log we have the following lines:

while in ignite-jdbcclientnode-failed.log all the lines from the first mention of the cache ignite-sys-atomic-cache@default-ds-group (the cache used for atomics) onwards are as follows:

In particular, the following line from ignite-jdbcclientnode-successful.log is absent from ignite-jdbcclientnode-failed.log:

It should be noted, though, that in the failure case there are other client nodes running in separate containers simultaneously with the backend app and executing the same code that creates the cache tenants and the IgniteAtomicLong idGen (see above for the code). As far as the logs below are concerned, their node ids are 653143b2-6e80-49ff-9e9a-ae10237b32e8 and 30e24e06-ab76-4053-a36e-548e87ffe5d1, respectively (and it is easy to see that all the lines in ignite-jdbcclientnode-failed.log mentioning ignite-sys-atomic-cache@default-ds-group relate exactly to these nodes). The logs for the time segment when the code with tenants and idGen is executed are as follows:

And that code creating tenants and idGen is executed successfully. But is it possible that this simultaneous creation of idGen hangs some nodes? (In the successful case we also have two separate containers, but they are started strictly after everything is done in the main app, so could the simultaneous execution of the same code in several client nodes be the reason for the hang?) And if the answer is yes, what should we do? Certainly we could introduce a delay for those separate containers, but that does not look like a particularly safe solution...

And we have another small question: when we have two separate client nodes in our app, both configured for logging, why, starting from some instant, is only the JDBC client node's log used, and not both?
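For completeness, the JDBC-side client node is the one started by the JDBC Client driver itself. Roughly, the datasource is wired as in the sketch below (simplified; the bean, the cache parameter and the file path are illustrative, not our exact configuration):

import javax.sql.DataSource;
import org.apache.ignite.IgniteJdbcDriver;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.datasource.SimpleDriverDataSource;

@Configuration
public class IgniteJdbcConfig {
    @Bean
    public DataSource igniteDataSource() {
        SimpleDriverDataSource ds = new SimpleDriverDataSource();
        // the JDBC Client driver starts its own Ignite client node from the given Spring config
        ds.setDriverClass(IgniteJdbcDriver.class);
        ds.setUrl("jdbc:ignite:cfg://cache=default@file:///opt/ignite/config/client.xml");
        return ds;
    }
}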
-- Sent from: http://apache-ignite-users.70518.x6.nabble.com/
