We are developing a Jira Cloud app that uses Apache Ignite both as data storage and as a job scheduler. This is done via a standard Ignite client node. But we also need Atlassian Connect Spring Boot to be able to communicate with Jira. In short, everything is done exactly as in our article Boosting Jira Cloud app development with Apache Ignite <https://medium.com/alliedium/boosting-jira-cloud-app-development-with-apache-ignite-7eebc7bb3d48>.

At first we used the simple Ignite JDBC driver <https://apacheignite-sql.readme.io/docs/jdbc-driver> just for Atlassian Connect Spring Boot, along with a separate Ignite client node for our own purposes. But this turned out to be very unstable when deployed in our local Kubernetes cluster (built via Kubespray), due to exceptions that kept occurring from time to time (in fact, this showed up only in our local cluster; in AWS EKS everything worked fine). To make this more stable we switched to the Ignite JDBC Client driver <https://apacheignite-sql.readme.io/docs/jdbc-client-driver>, exactly as described in the article mentioned above. Thus, our backend now uses two Ignite client nodes per single JVM: the first one for JDBC used by Atlassian Connect Spring Boot, the second one for our own purposes.

This solution turned out to be good enough: the app now works very stably both in our local cluster and in AWS EKS. But when we deploy the app in Docker for testing and development purposes, our Ignite client nodes hang from time to time. After some investigation we were able to see that this happens exactly at the instant when an IgniteAtomicLong object is created. Below are logs both for a successful initialization of the app and for a case when the Ignite client nodes hung.

Logs when all is OK

ignite-appclientnode-successful.log <http://apache-ignite-users.70518.x6.nabble.com/file/t2262/ignite-appclientnode-successful.log>
ignite-jdbcclientnode-successful.log <http://apache-ignite-users.70518.x6.nabble.com/file/t2262/ignite-jdbcclientnode-successful.log>

Logs when both client nodes hang

ignite-appclientnode-failed.log <http://apache-ignite-users.70518.x6.nabble.com/file/t2262/ignite-appclientnode-failed.log>
ignite-jdbcclientnode-failed.log <http://apache-ignite-users.70518.x6.nabble.com/file/t2262/ignite-jdbcclientnode-failed.log>

Some analysis and questions

From the logs one can see that the caches default, tenants, atlassian_host_audit and SQL_PUBLIC_ATLASSIAN_HOST are manipulated. In fact, default is given in the client configuration: client.xml <http://apache-ignite-users.70518.x6.nabble.com/file/t2262/client.xml>. The cache SQL_PUBLIC_ATLASSIAN_HOST contains the atlassian_host table mentioned in Boosting Jira Cloud app development with Apache Ignite <https://medium.com/alliedium/boosting-jira-cloud-app-development-with-apache-ignite-7eebc7bb3d48> and is created in advance, even before the app starts. Further, atlassian_host_audit is a copy of atlassian_host; in any case it is not yet created when the app hangs.

As for the other entities processed by Ignite, they are created by the code sketched below, and from the logs of the app itself it is clear that the app hangs exactly on the last of those calls, the creation of the IgniteAtomicLong.
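A simplified sketch of that code (not the exact snippet from our app; the wrapper class is only for illustration, while the cache name tenants and the atomic name idGen are the real ones):

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteAtomicLong;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

public class AppStartup {
    public static void main(String[] args) {
        // our own ("app") client node, started from the Spring XML configuration
        Ignite ignite = Ignition.start("client.xml");

        // the tenants cache seen in the logs
        IgniteCache<Object, Object> tenants = ignite.getOrCreateCache("tenants");

        // distributed id generator; the hang happens exactly on this call
        IgniteAtomicLong idGen = ignite.atomicLong("idGen", 0L, true);
    }
}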
This is confirmed by the fact that in ignite-jdbcclientnode-successful.log we have the following lines:

while in ignite-jdbcclientnode-failed.log all the lines from the first mention of the cache ignite-sys-atomic-cache@default-ds-group (the cache used for atomics) onwards are as follows:

In particular, the following line from ignite-jdbcclientnode-successful.log is absent from ignite-jdbcclientnode-failed.log:

It should be noted, though, that in the failure case there are other client nodes running in separate containers simultaneously with the backend app and executing the same code that creates the cache tenants and the IgniteAtomicLong idGen (see above for the code). As far as the logs below are concerned, their node ids are 653143b2-6e80-49ff-9e9a-ae10237b32e8 and 30e24e06-ab76-4053-a36e-548e87ffe5d1, respectively (and it is easy to see that all the lines in ignite-jdbcclientnode-failed.log mentioning ignite-sys-atomic-cache@default-ds-group relate exactly to these nodes). The logs for the time segment when the code with tenants and idGen is executed are as follows:

And that code creating tenants and idGen is executed successfully. But is it possible that this simultaneous creation of idGen hangs some nodes? (In the successful case we also have two separate containers, but they are started strictly after everything is done in the main app, so could the simultaneous execution of the same code in several client nodes be the reason for the hang?) And if the answer is yes, what should we do? Certainly we could introduce a delay for those separate containers, but that does not look like a particularly safe solution...

And we have another small question: when we have two separate client nodes in our app, both configured for logging, why, starting from some instant, is only the JDBC client node's log used, and not both?
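For completeness, the JDBC-side client node is the one started by the JDBC Client driver itself. Roughly, the datasource is wired as in the sketch below (simplified; the bean, the cache parameter and the file path are illustrative, not our exact configuration):

import javax.sql.DataSource;
import org.apache.ignite.IgniteJdbcDriver;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.datasource.SimpleDriverDataSource;

@Configuration
public class IgniteJdbcConfig {
    @Bean
    public DataSource igniteDataSource() {
        SimpleDriverDataSource ds = new SimpleDriverDataSource();
        // the JDBC Client driver starts its own Ignite client node from the given Spring config
        ds.setDriverClass(IgniteJdbcDriver.class);
        ds.setUrl("jdbc:ignite:cfg://cache=default@file:///opt/ignite/config/client.xml");
        return ds;
    }
}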
-- Sent from: http://apache-ignite-users.70518.x6.nabble.com/
