I thought that the behaviour described in this post might be the problem:
https://marc.info/?l=tomcat-user&m=119376798217922&w=2
*"The uniqueId is used to be able to differentiate between the same node
joining a cluster, then crashing and then rejoining again. if the uniqueId
didn't change in between this, there is no way to tell the difference
between a node going down, or just leaving the cluster and rejoining."*
So, I tried creating a session while one of the nodes was down, but that
session did not sync either when the other node came back online.
In that case, I would also expect
org.apache.catalina.ha.session.DeltaManager.waitForSendAllSessions to
proceed with no state sync rather than timing out.

I have also checked the time on both the servers using the Linux date
command and they seem to be in sync. The timezone flag passed to the
JAVA_OPTS argument in catalina.sh is also the same. Please let me know if
any more information is required to help debug this issue.
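
For reference, the Manager tweak described in the quoted message below
would look roughly like this sketch. sendAllSessions and
stateTransferTimeout are the DeltaManager attributes named there; the value
120 (seconds) is only an assumed example of a larger window, not
necessarily the value that was actually tried.

```xml
<!-- Sketch only: stateTransferTimeout="120" is an assumed illustrative value -->
<Manager className="org.apache.catalina.ha.session.DeltaManager"
         sendAllSessions="false"
         stateTransferTimeout="120"/>
```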

Sincerely,
Manak Bisht

On Sun, Jan 14, 2024 at 11:09 PM Manak Bisht <manak18...@iiitd.ac.in> wrote:

> Hi,
> I am using DeltaManager (static membership) with non-sticky load balancing
> on two nodes. I have observed even load, and requests with the same
> JSESSIONID being served successfully by both tomcats. This leads me to
> conclude that session replication is working as expected when both nodes
> are up.
>
> However, when I restart either one of them, the newly restarted tomcat is
> unable to serve requests from old sessions. The logs indicate that node
> discovery is working but the session sync times out. New logins/sessions
> work just fine though, implying that replication is working successfully
> again.
>
> *tomcat1.log*
> 13-Jan-2024 14:16:35.713 INFO [GroupChannel-Heartbeat-1]
> org.apache.catalina.ha.tcp.SimpleTcpCluster.memberDisappeared Received
> member
> disappeared:org.apache.catalina.tribes.membership.StaticMember[tcp://tomcat2:8090,tomcat2,8090,
> alive=0, securePort=-1, UDP Port=-1, id={0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 },
> payload={}, command={}, domain={}, ]
> 13-Jan-2024 14:44:16.457 INFO [GroupChannel-Heartbeat-1]
> org.apache.catalina.ha.tcp.SimpleTcpCluster.memberAdded Replication member
> added:org.apache.catalina.tribes.membership.StaticMember[tcp://tomcat2:8090,tomcat2,8090,
> alive=0, securePort=-1, UDP Port=-1, id={0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 },
> payload={}, command={}, domain={}, ]
> 13-Jan-2024 14:44:16.457 INFO [GroupChannel-Heartbeat-1]
> org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.performBasicCheck
> Suspect member, confirmed
> alive.[org.apache.catalina.tribes.membership.StaticMember[tcp://tomcat2:8090,tomcat2,8090,
> alive=0, securePort=-1, UDP Port=-1, id={0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 },
> payload={}, command={}, domain={}, ]]
> *13-Jan-2024 14:45:24.354 WARNING [Tribes-Task-Receiver-4]
> org.apache.catalina.ha.session.DeltaManager.deserializeSessions overload
> existing session XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX*
>
>
> *tomcat2.log*
> 13-Jan-2024 14:45:24.290 INFO [localhost-startStop-1]
> org.apache.catalina.ha.session.DeltaManager.startInternal Register manager
> localhost# to cluster element Engine with name Catalina
> 13-Jan-2024 14:45:24.291 INFO [localhost-startStop-1]
> org.apache.catalina.ha.session.DeltaManager.startInternal Starting
> clustering manager at localhost#
> 13-Jan-2024 14:45:24.363 INFO [localhost-startStop-1]
> org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.report
> ThroughputInterceptor Report[
> Tx Msg:1 messages
> Sent:0.00 MB (total)
> Sent:0.00 MB (application)
> Time:0.06 seconds
> Tx Speed:0.01 MB/sec (total)
> TxSpeed:0.01 MB/sec (application)
> Error Msg:0
> Rx Msg:15 messages
> Rx Speed:0.00 MB/sec (since 1st msg)
> Received:0.00 MB]
>
> 13-Jan-2024 14:45:24.368 INFO [localhost-startStop-1]
> org.apache.catalina.ha.session.DeltaManager.getAllClusterSessions Manager
> [localhost#], requesting session state from
> org.apache.catalina.tribes.membership.StaticMember[tcp://tomcat1:8090,tomcat1,8090,
> alive=0, securePort=-1, UDP Port=-1, id={0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 },
> payload={}, command={}, domain={}, ]. This operation will timeout if no
> session state has been received within 60 seconds.
> *13-Jan-2024 14:46:24.459 SEVERE [localhost-startStop-1]
> org.apache.catalina.ha.session.DeltaManager.waitForSendAllSessions Manager
> [localhost#]: No session state send at 1/13/24 2:45 PM received, timing out
> after 60,167 ms.*
>
> There is also a warning (the deserializeSessions "overload existing
> session" line in tomcat1.log), but I am unsure of its significance.
> I have tried setting sendAllSessions to false and increasing the
> stateTransferTimeout window, to no avail.
>
> This is my clustering config for tomcat1 (the config is the same for
> tomcat2 with the host as tomcat1 and uniqueId
> {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1}) -
>
> <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
>     channelSendOptions="6" channelStartOptions="3">
>
>     <Manager className="org.apache.catalina.ha.session.DeltaManager"/>
>
>     <Channel className="org.apache.catalina.tribes.group.GroupChannel">
>         <Receiver
>             className="org.apache.catalina.tribes.transport.nio.NioReceiver"
>             address="0.0.0.0"
>             port="8090"
>             autoBind="0"/>
>
>         <Sender
>             className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
>             <Transport
>                 className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
>         </Sender>
>
>         <Interceptor
>             className="org.apache.catalina.tribes.group.interceptors.TcpPingInterceptor"/>
>         <Interceptor
>             className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
>         <Interceptor
>             className="org.apache.catalina.tribes.group.interceptors.StaticMembershipInterceptor">
>             <Member
>                 className="org.apache.catalina.tribes.membership.StaticMember"
>                 port="8090"
>                 host="tomcat2"
>                 uniqueId="{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2}"/>
>         </Interceptor>
>         <Interceptor
>             className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
>     </Channel>
>
>     <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
>         filter=""/>
>
>     <ClusterListener
>         className="org.apache.catalina.ha.session.ClusterSessionListener"/>
> </Cluster>
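>
> For comparison, a sketch of the matching interceptor entry on tomcat2,
> assuming it differs only in the host and uniqueId as described above
> (nothing beyond those two values is confirmed here):
>
> ```xml
> <Interceptor
>     className="org.apache.catalina.tribes.group.interceptors.StaticMembershipInterceptor">
>     <!-- Assumed from the description: tomcat2 lists tomcat1 as its static peer -->
>     <Member
>         className="org.apache.catalina.tribes.membership.StaticMember"
>         port="8090"
>         host="tomcat1"
>         uniqueId="{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1}"/>
> </Interceptor>
> ```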
>
> Any help would be greatly appreciated.
>
> Sincerely,
> Manak Bisht
>
