Re: Operation has timed out
Hi all,

As you know, we are using the multicast IP 224.0.0.4 for the Tomcat cluster (Node1: 10.114.43.102 / Node2: 10.114.43.103). When I ping the multicast IP, I get a reply from 10.114.43.51! Also, when I run nslookup for 224.0.0.4, I get the DC IP (10.114.43.7) and the mcast.net domain:

C:\Users\Administrator>nslookup 224.0.0.4
Server:  hq-dc02.albaraka.com.sd
Address:  10.114.43.7

Name:    dvmrp.mcast.net
Address:  224.0.0.4

On Wed, Feb 8, 2017 at 8:59 AM, Fady Haikal wrote:
> Ashwin,
> I'm using the configuration below. Please let me know how I can check
> whether I'm using a unique multicast address and port.
>
> <Channel className="org.apache.catalina.tribes.group.GroupChannel">
>   <Membership className="org.apache.catalina.tribes.membership.McastService"
>               address="228.0.0.4"
>               port="45564"
>               frequency="500"
>               dropTime="9000"/>
>   <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
>             address="auto"
>             port="4000"
>             autoBind="100"
>             selectorTimeout="5000"
>             maxThreads="6"/>
>
> On Wed, Feb 8, 2017 at 6:39 AM, ashwin rajput wrote:
>> I am not sure if anyone has verified the below.
>>
>> Have you verified that clustering is using a unique multicast address and port?
>> The cluster multicast address should be unique and not used by any other
>> cluster.
>>
>> Regards,
>> Ashwin
>> On 07-Feb-2017 10:38 pm, "André Warnier (tomcat)" wrote:
>>
>>> On 07.02.2017 17:20, Fady Haikal wrote:
>>>> Christopher,
>>>> For the first time
>>>
>>> @Christopher: just to make sure you got that bit, buried below: the
>>> actual replication seems to work fine. The problem is only these
>>> "unsuccessful ping" messages in the log, which fill the log, and which so
>>> far nobody has managed to find an explanation for.
>>> >>> On Tue, Feb 7, 2017 at 6:19 PM, Christopher Schultz wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > Fady, > > On 2/7/17 10:53 AM, Fady Haikal wrote: > >> ProcessPID Protocol local address local port >> Remote Address State Tomcat8.exe 8160 TCP >> imal14-app24000 imal14-app1.albaraka.com.sdESTABLISHED >> > > Stupid question: was this working in the past, and it stopped working? > Or are you trying to get this working for the first time? > > - -chris > > On Tue, Feb 7, 2017 at 5:46 PM, Fady Haikal >> wrote: >> >>> Yes there is a ESTABLISHED connection, the replication of >>> sessions is working fine (port 4000 is for tomcat cluster) but we >>> also faced this error on the log file >>> >>> On Tue, Feb 7, 2017 at 5:44 PM, André Warnier (tomcat) >>> wrote: >>> On 07.02.2017 16:24, Fady Haikal wrote: > > Hi, telnet IP 4000 is working fine, i installed a tool for > network monitoring at the level of IP and Port and i didnt > see any disconnection, > but did you see a *connection* ? I mean, on the pinging node, if you use the Windows "netstat" program, for example as netstat -aon -p TCP you should see a list of connections in the ESTABLISHED state, of which one of the IP/ports should be your target IP:4000 (in the "remote" column). And on the pinged node, this port :4000 should be in the "local" column, in LISTEN mode (and also probably one in the ESTABLISHED state, if they agree.) Is that the case ? and yes i'm sure that no firewall is enabled. > > > I saw some strange think on the server that I have tried to > ping the multicast IP (228.0.0.4) and i get reply from > different IPs in the network, i don't know why and how i get > those IPs, after checking with the network team they told me > that those IPs are related to the SAN storage taking into > consideration that the Tomcat servers are not connected in > anyway to that SUN storage. > > > On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat) > wrote: > >> >> Hi. 
>> >> This is for the Tomcat/Tribes experts on the list. >> >> I know nothing of Tribes, but the on-line documentation >> seems to say that the communication happens over TCP and >> that the protocol used is not encrypted. Fady previously >> tried a standard "ping" and a "telnet" between the two >> nodes, and that is the base for him mentioning that "there >> is no network disconnection" between the nodes. >> Nevertheless, the calling pinging node seems to say that it >
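A side note on the nslookup result at the top of this message, added as an illustration and not part of the original thread: 224.0.0.4 lies in the 224.0.0.0/24 link-local block that IANA reserves for network control traffic (224.0.0.4 itself is registered to DVMRP routers, which is why it resolves to dvmrp.mcast.net), whereas the McastService configuration quoted above uses 228.0.0.4. The minimal Java sketch below classifies the addresses mentioned in the thread. The class name McastAddressCheck and the method describe are hypothetical; only standard java.net.InetAddress methods are used.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Tomcat or Tribes.
class McastAddressCheck {

    // Classifies an IP literal using only java.net.InetAddress predicates.
    static String describe(String ipLiteral) {
        InetAddress addr;
        try {
            // For a literal IP address getByName() only parses the text; no DNS lookup is made.
            addr = InetAddress.getByName(ipLiteral);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid address: " + ipLiteral, e);
        }
        if (!addr.isMulticastAddress()) {
            return "not multicast";
        }
        // For IPv4, isMCLinkLocal() is true for 224.0.0.0/24, the reserved control block.
        if (addr.isMCLinkLocal()) {
            return "multicast, link-local control block";
        }
        return "multicast";
    }

    public static void main(String[] args) {
        // Addresses taken from the thread: the pinged address, the configured one, and Node1.
        for (String ip : new String[] {"224.0.0.4", "228.0.0.4", "10.114.43.102"}) {
            System.out.println(ip + " -> " + describe(ip));
        }
    }
}
```

If the sketch reports the link-local control block for an address, replies to a ping of that address may come from routers or other devices on the subnet and say nothing about the Tomcat cluster.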
Re: Operation has timed out
Ashwin,
I'm using the configuration below. Please let me know how I can check whether I'm using a unique multicast address and port.

<Channel className="org.apache.catalina.tribes.group.GroupChannel">
  <Membership className="org.apache.catalina.tribes.membership.McastService"
              address="228.0.0.4"
              port="45564"
              frequency="500"
              dropTime="9000"/>
  <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
            address="auto"
            port="4000"
            autoBind="100"
            selectorTimeout="5000"
            maxThreads="6"/>

On Wed, Feb 8, 2017 at 6:39 AM, ashwin rajput wrote:
> I am not sure if anyone has verified the below.
>
> Have you verified that clustering is using a unique multicast address and port?
> The cluster multicast address should be unique and not used by any other
> cluster.
>
> Regards,
> Ashwin
> On 07-Feb-2017 10:38 pm, "André Warnier (tomcat)" wrote:
>
>> On 07.02.2017 17:20, Fady Haikal wrote:
>>> Christopher,
>>> For the first time
>>
>> @Christopher: just to make sure you got that bit, buried below: the
>> actual replication seems to work fine. The problem is only these
>> "unsuccessful ping" messages in the log, which fill the log, and which so
>> far nobody has managed to find an explanation for.
>>
>>> On Tue, Feb 7, 2017 at 6:19 PM, Christopher Schultz wrote:
>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>> Hash: SHA256
>>>>
>>>> Fady,
>>>>
>>>> On 2/7/17 10:53 AM, Fady Haikal wrote:
>>>>> Process      PID   Protocol  Local address  Local port  Remote address               State
>>>>> Tomcat8.exe  8160  TCP       imal14-app2    4000        imal14-app1.albaraka.com.sd  ESTABLISHED
>>>>
>>>> Stupid question: was this working in the past, and it stopped working?
>>>> Or are you trying to get this working for the first time?
>>>>
>>>> - -chris
>>>>
>>>>> On Tue, Feb 7, 2017 at 5:46 PM, Fady Haikal wrote:
>>>>>> Yes, there is an ESTABLISHED connection. The replication of
>>>>>> sessions is working fine (port 4000 is for the Tomcat cluster), but we
>>>>>> also see this error in the log file.
>>>>>>
>>>>>> On Tue, Feb 7, 2017 at 5:44 PM, André Warnier (tomcat) wrote:
>>>>>>> On 07.02.2017 16:24, Fady Haikal wrote:
>>>>>>>> Hi, telnet IP 4000 is working fine. I installed a tool for
>>>>>>>> network monitoring at the level of IP and port and I didn't
>>>>>>>> see any disconnection,
>>>>>>>
>>>>>>> but did you see a *connection* ?
I mean, on the pinging node, >>> if you use the Windows "netstat" program, for example as >>> netstat -aon -p TCP you should see a list of connections in the >>> ESTABLISHED state, of which one of the IP/ports should be your >>> target IP:4000 (in the "remote" column). And on the pinged >>> node, this port :4000 should be in the "local" column, in >>> LISTEN mode (and also probably one in the ESTABLISHED state, if >>> they agree.) >>> >>> Is that the case ? >>> >>> >>> >>> and yes i'm sure that no firewall is enabled. >>> I saw some strange think on the server that I have tried to ping the multicast IP (228.0.0.4) and i get reply from different IPs in the network, i don't know why and how i get those IPs, after checking with the network team they told me that those IPs are related to the SAN storage taking into consideration that the Tomcat servers are not connected in anyway to that SUN storage. On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat) wrote: > > Hi. > > This is for the Tomcat/Tribes experts on the list. > > I know nothing of Tribes, but the on-line documentation > seems to say that the communication happens over TCP and > that the protocol used is not encrypted. Fady previously > tried a standard "ping" and a "telnet" between the two > nodes, and that is the base for him mentioning that "there > is no network disconnection" between the nodes. > Nevertheless, the calling pinging node seems to say that it > times out without getting a response fom the target node. > There is evidently a contradiction there. So this could > still be some kind of network issue. > > Considering that the protocol command for this "ping" > should be known by someone here, would it not be possible > to imagine a little program in some scripting language (or > even java, God forbid), which would open a TCP channel with > the target node IP/port, send such a "ping" message, wait > for a response and report the result ? 
That would at least > confirm/deny that the problem is with the network. > > The log below does not for example say if the error happens > when opening the TCP communication channel, or after > sending the ping message on it, (Of course, testing the TCP > open could be done with "telnet IP 4000", but I don't know > if Fady tried this). Maybe tribes also already contains > some löw-level debugging options ? wireshark maybe another > option, but it has quite a learning curve. And this is on > Windows. > >>>
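The "little program" suggested in the quoted message above could look roughly like the Java sketch below. It is an assumption-laden illustration, not something posted in the thread: it does not speak the Tribes protocol (the format of the Tribes "ping" is not given here), so it only checks whether a TCP connection to the target IP/port can be opened within a timeout, which is the programmatic equivalent of "telnet IP 4000". The class name TcpProbe and the method canConnect are hypothetical; the default host, port and 3000 ms timeout are taken from values that appear in the thread.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic helper, not part of Tomcat or Tribes.
class TcpProbe {

    // Returns true if a TCP connection to host:port is accepted within timeoutMs.
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers refused connections, unreachable hosts and connect timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are the node address and receiver port mentioned in the thread.
        String host = args.length > 0 ? args[0] : "10.114.43.102";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 4000;
        long start = System.nanoTime();
        boolean ok = canConnect(host, port, 3000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(host + ":" + port + (ok ? " accepted the connection" : " could not be reached")
                + " after " + elapsedMs + " ms");
    }
}
```

A successful connect only shows that the TCP open works; it cannot tell whether a message sent afterwards on that channel would be answered in time.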
Re: Operation has timed out
I am not sure if anyone has verified the below.

Have you verified that clustering is using a unique multicast address and port? The cluster multicast address should be unique and not used by any other cluster.

Regards,
Ashwin

On 07-Feb-2017 10:38 pm, "André Warnier (tomcat)" wrote:
> On 07.02.2017 17:20, Fady Haikal wrote:
>> Christopher,
>> For the first time
>
> @Christopher: just to make sure you got that bit, buried below: the
> actual replication seems to work fine. The problem is only these
> "unsuccessful ping" messages in the log, which fill the log, and which so
> far nobody has managed to find an explanation for.
>
>> On Tue, Feb 7, 2017 at 6:19 PM, Christopher Schultz wrote:
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA256
>>>
>>> Fady,
>>>
>>> On 2/7/17 10:53 AM, Fady Haikal wrote:
>>>> Process      PID   Protocol  Local address  Local port  Remote address               State
>>>> Tomcat8.exe  8160  TCP       imal14-app2    4000        imal14-app1.albaraka.com.sd  ESTABLISHED
>>>
>>> Stupid question: was this working in the past, and it stopped working?
>>> Or are you trying to get this working for the first time?
>>>
>>> - -chris
>>>
>>>> On Tue, Feb 7, 2017 at 5:46 PM, Fady Haikal wrote:
>>>>> Yes, there is an ESTABLISHED connection. The replication of
>>>>> sessions is working fine (port 4000 is for the Tomcat cluster), but we
>>>>> also see this error in the log file.
>>>>>
>>>>> On Tue, Feb 7, 2017 at 5:44 PM, André Warnier (tomcat) wrote:
>>>>>> On 07.02.2017 16:24, Fady Haikal wrote:
>>>>>>> Hi, telnet IP 4000 is working fine. I installed a tool for
>>>>>>> network monitoring at the level of IP and port and I didn't
>>>>>>> see any disconnection,
>>>>>>
>>>>>> but did you see a *connection* ? I mean, on the pinging node,
>>>>>> if you use the Windows "netstat" program, for example as
>>>>>>     netstat -aon -p TCP
>>>>>> you should see a list of connections in the ESTABLISHED state, of which one
>>>>>> of the IP/ports should be your target IP:4000 (in the "remote" column).
And on the pinged >> node, this port :4000 should be in the "local" column, in >> LISTEN mode (and also probably one in the ESTABLISHED state, if >> they agree.) >> >> Is that the case ? >> >> >> >> and yes i'm sure that no firewall is enabled. >> >>> >>> >>> I saw some strange think on the server that I have tried to >>> ping the multicast IP (228.0.0.4) and i get reply from >>> different IPs in the network, i don't know why and how i get >>> those IPs, after checking with the network team they told me >>> that those IPs are related to the SAN storage taking into >>> consideration that the Tomcat servers are not connected in >>> anyway to that SUN storage. >>> >>> >>> On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat) >>> wrote: >>> Hi. This is for the Tomcat/Tribes experts on the list. I know nothing of Tribes, but the on-line documentation seems to say that the communication happens over TCP and that the protocol used is not encrypted. Fady previously tried a standard "ping" and a "telnet" between the two nodes, and that is the base for him mentioning that "there is no network disconnection" between the nodes. Nevertheless, the calling pinging node seems to say that it times out without getting a response fom the target node. There is evidently a contradiction there. So this could still be some kind of network issue. Considering that the protocol command for this "ping" should be known by someone here, would it not be possible to imagine a little program in some scripting language (or even java, God forbid), which would open a TCP channel with the target node IP/port, send such a "ping" message, wait for a response and report the result ? That would at least confirm/deny that the problem is with the network. The log below does not for example say if the error happens when opening the TCP communication channel, or after sending the ping message on it, (Of course, testing the TCP open could be done with "telnet IP 4000", but I don't know if Fady tried this). 
Maybe tribes also already contains some löw-level debugging options ? wireshark maybe another option, but it has quite a learning curve. And this is on Windows. By the way Fady, are you sure that your "Windows Firewall with Enhanced Security" is not just dropping TCP packets to/from port 40xx (or from "java.exe") ? There are some "network policies" there which can have wide-ranging side-effects. On 07.02.2017 14:42, Fady Ha
Tomcat 7.0.xx under Java 7?
Ladies and Gentlemen of the Tomcat List: To date, the overwhelming bulk of our own Tomcat experience has been under Java 6 JVMs. And we have a customer who will likely be losing that JVM soon. Are there any "gotchas" running 7.0.47 or later under Java 7? -- James H. H. Lampert Touchtone Corporation - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: Apache Tomcat 7.0.59 - Even if a ws certificate stored in the WSkeystore expires, any webclient request is still accepted by server and not refused
On 07/02/17 19:33, George Stanchev wrote:
> Mark,
>
> Apologies for top posting. We have our own trust manager that is attached to
> the connector because we want client certificates to be passed to the
> application for validation and authentication rather than the connector. If we
> switch to the OpenSSL/APR based certificate processing, would the trust manager
> still work? I presume not, but wanted to ask and if not, what are the options?

If the application is validating the client certs, just add valid to/from date checking to that validation.

Mark

> -----Original Message-----
> From: Mark Thomas [mailto:ma...@apache.org]
> Sent: Monday, February 06, 2017 7:20 AM
> To: Tomcat Users List
> Subject: Re: Apache Tomcat 7.0.59 - Even if a ws certificate stored in the WSkeystore expires, any webclient request is still accepted by server and not refused
>
> On 06/02/17 13:49, Francesco Leone wrote:
>> Dear Sirs,
>>
>> We would like to report a behaviour we see with Apache Tomcat 7.0.59.
>>
>> Apache Tomcat 7.0.59 is running with:
>> - RHEL6.6
>> - java jdk 1.8.0.74
>> - OpenSSL 1.0.2g
>>
>> We have a client - server communication. The client certificate is
>> produced via keytool and we have the same problem highlighted here
>>
>> http://stackoverflow.com/questions/33688020/configuring-apache-tomcat-7-0-to-reject-connections-with-expired-client-certific
>>
>> and
>>
>> http://stackoverflow.com/questions/5206859/java-trustmanager-behavior-on-expired-certificates
>>
>> What we understood from reading those threads is that, to solve our problem,
>> we should implement a new X509TrustManager which creates our original instance
>> in its constructor, implements all methods as calls to the original
>> instance, and adds a call to checkValidity for each certificate in
>> certs[] inside checkServerTrusted.
>>
>> Did we understand correctly? If yes, it looks to us like a security hole
>> and so a bug in Tomcat. Is there any chance to have this behaviour
>> (refusing connections with expired certificates) as standard in a later
>> Apache Tomcat 7.0.x release? Can anyone in this community support us?
>
> This is not a Tomcat bug. If you tell Java to trust a certificate, it will do
> so and ignore the validity period.
>
> I've looked into this in the past and short of implementing your own
> X509TrustManager I haven't yet found an API Tomcat could use to add an
> additional check on the trusted cert's validity.
>
> A better general solution is to trust the CA(s) issuing the client
> certificates rather than the client certificates. Then, because the client
> cert is not in the trust store, Java checks it more thoroughly - including
> the validity dates.
>
> It is also worth looking at using an OpenSSL based TLS connector. From what I
> recall of my previous testing OpenSSL did check the validity dates of trusted
> certs.
>
> Mark
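The wrapping approach discussed in this thread (delegate every call to the original trust manager and add a checkValidity call per certificate) could be sketched in Java as below. This is an illustration under assumptions, not code posted by anyone in the thread: the class name ValidityCheckingTrustManager is hypothetical, and how the wrapper gets attached to the connector is left out. One detail worth hedging: the quoted description mentions checkServerTrusted, but on a server that validates client certificates the JSSE call involved is checkClientTrusted, so the sketch applies the date check in both methods.

```java
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import javax.net.ssl.X509TrustManager;

// Hypothetical wrapper, not a Tomcat class: adds a validity-period check
// on top of whatever the wrapped trust manager already verifies.
class ValidityCheckingTrustManager implements X509TrustManager {

    private final X509TrustManager delegate;

    ValidityCheckingTrustManager(X509TrustManager delegate) {
        this.delegate = delegate;
    }

    @Override
    public void checkClientTrusted(X509Certificate[] chain, String authType) throws CertificateException {
        checkDates(chain);
        delegate.checkClientTrusted(chain, authType);
    }

    @Override
    public void checkServerTrusted(X509Certificate[] chain, String authType) throws CertificateException {
        checkDates(chain);
        delegate.checkServerTrusted(chain, authType);
    }

    @Override
    public X509Certificate[] getAcceptedIssuers() {
        return delegate.getAcceptedIssuers();
    }

    // Rejects the chain if any certificate is expired or not yet valid.
    private static void checkDates(X509Certificate[] chain) throws CertificateException {
        if (chain == null) {
            return; // a null chain is left for the delegate to reject
        }
        for (X509Certificate cert : chain) {
            // Throws CertificateExpiredException or CertificateNotYetValidException.
            cert.checkValidity();
        }
    }
}
```

The same per-certificate checkValidity call is what the advice above amounts to when the application, rather than the connector, validates the client certificates.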
RE: Apache Tomcat 7.0.59 - Even if a ws certificate stored in the WSkeystore expires, any webclient request is still accepted by server and not refused
Mark,

Apologies for top posting. We have our own trust manager that is attached to the connector because we want client certificates to be passed to the application for validation and authentication rather than the connector. If we switch to the OpenSSL/APR based certificate processing, would the trust manager still work? I presume not, but wanted to ask and if not, what are the options?

-----Original Message-----
From: Mark Thomas [mailto:ma...@apache.org]
Sent: Monday, February 06, 2017 7:20 AM
To: Tomcat Users List
Subject: Re: Apache Tomcat 7.0.59 - Even if a ws certificate stored in the WSkeystore expires, any webclient request is still accepted by server and not refused

On 06/02/17 13:49, Francesco Leone wrote:
> Dear Sirs,
>
> We would like to report a behaviour we see with Apache Tomcat 7.0.59.
>
> Apache Tomcat 7.0.59 is running with:
> - RHEL6.6
> - java jdk 1.8.0.74
> - OpenSSL 1.0.2g
>
> We have a client - server communication. The client certificate is
> produced via keytool and we have the same problem highlighted here
>
> http://stackoverflow.com/questions/33688020/configuring-apache-tomcat-7-0-to-reject-connections-with-expired-client-certific
>
> and
>
> http://stackoverflow.com/questions/5206859/java-trustmanager-behavior-on-expired-certificates
>
> What we understood from reading those threads is that, to solve our problem,
> we should implement a new X509TrustManager which creates our original instance
> in its constructor, implements all methods as calls to the original
> instance, and adds a call to checkValidity for each certificate in
> certs[] inside checkServerTrusted.
>
> Did we understand correctly? If yes, it looks to us like a security hole
> and so a bug in Tomcat. Is there any chance to have this behaviour
> (refusing connections with expired certificates) as standard in a later
> Apache Tomcat 7.0.x release? Can anyone in this community support us?

This is not a Tomcat bug. If you tell Java to trust a certificate, it will do so and ignore the validity period.

I've looked into this in the past and short of implementing your own X509TrustManager I haven't yet found an API Tomcat could use to add an additional check on the trusted cert's validity.

A better general solution is to trust the CA(s) issuing the client certificates rather than the client certificates. Then, because the client cert is not in the trust store, Java checks it more thoroughly - including the validity dates.

It is also worth looking at using an OpenSSL based TLS connector. From what I recall of my previous testing OpenSSL did check the validity dates of trusted certs.

Mark
Re: Operation has timed out
On 07.02.2017 17:20, Fady Haikal wrote: Christopher, For the first time @Christopher : just to make sure you got that bit, buried below : the actual replication seems to work fine. The problem is only these "unsuccesful ping" messages in the log, which fill the log, and which so far nobody has managed to find an explanation for. On Tue, Feb 7, 2017 at 6:19 PM, Christopher Schultz wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Fady, On 2/7/17 10:53 AM, Fady Haikal wrote: ProcessPID Protocol local address local port Remote Address State Tomcat8.exe 8160 TCP imal14-app24000 imal14-app1.albaraka.com.sdESTABLISHED Stupid question: was this working in the past, and it stopped working? Or are you trying to get this working for the first time? - -chris On Tue, Feb 7, 2017 at 5:46 PM, Fady Haikal wrote: Yes there is a ESTABLISHED connection, the replication of sessions is working fine (port 4000 is for tomcat cluster) but we also faced this error on the log file On Tue, Feb 7, 2017 at 5:44 PM, André Warnier (tomcat) wrote: On 07.02.2017 16:24, Fady Haikal wrote: Hi, telnet IP 4000 is working fine, i installed a tool for network monitoring at the level of IP and Port and i didnt see any disconnection, but did you see a *connection* ? I mean, on the pinging node, if you use the Windows "netstat" program, for example as netstat -aon -p TCP you should see a list of connections in the ESTABLISHED state, of which one of the IP/ports should be your target IP:4000 (in the "remote" column). And on the pinged node, this port :4000 should be in the "local" column, in LISTEN mode (and also probably one in the ESTABLISHED state, if they agree.) Is that the case ? and yes i'm sure that no firewall is enabled. 
I saw some strange think on the server that I have tried to ping the multicast IP (228.0.0.4) and i get reply from different IPs in the network, i don't know why and how i get those IPs, after checking with the network team they told me that those IPs are related to the SAN storage taking into consideration that the Tomcat servers are not connected in anyway to that SUN storage. On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat) wrote: Hi. This is for the Tomcat/Tribes experts on the list. I know nothing of Tribes, but the on-line documentation seems to say that the communication happens over TCP and that the protocol used is not encrypted. Fady previously tried a standard "ping" and a "telnet" between the two nodes, and that is the base for him mentioning that "there is no network disconnection" between the nodes. Nevertheless, the calling pinging node seems to say that it times out without getting a response fom the target node. There is evidently a contradiction there. So this could still be some kind of network issue. Considering that the protocol command for this "ping" should be known by someone here, would it not be possible to imagine a little program in some scripting language (or even java, God forbid), which would open a TCP channel with the target node IP/port, send such a "ping" message, wait for a response and report the result ? That would at least confirm/deny that the problem is with the network. The log below does not for example say if the error happens when opening the TCP communication channel, or after sending the ping message on it, (Of course, testing the TCP open could be done with "telnet IP 4000", but I don't know if Fady tried this). Maybe tribes also already contains some löw-level debugging options ? wireshark maybe another option, but it has quite a learning curve. And this is on Windows. 
By the way Fady, are you sure that your "Windows Firewall with Enhanced Security" is not just dropping TCP packets to/from port 40xx (or from "java.exe") ? There are some "network policies" there which can have wide-ranging side-effects. On 07.02.2017 14:42, Fady Haikal wrote: Hi, issue still not fixed. Tomcat session replication is not able to replicate the key from node to node, please find below the error, taking into consideration that there is no network disconnection between 2 nodes 07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8] org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryIn fo Unable to replicate backup key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2 to backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10 , 114, 43, 102}:4000,{10, 114, 43, 102},4000, alive=68841350, securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64 12 -102 -14 -85 -87 15 9 -51 -112 }, payload={}, command={}, domain={}, ]. Reason:Operation has timed out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000; org.apache.catalina.tribes.ChannelException: Operation has timed out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000; at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMe ssage(ParallelNioSender.java:108) at org.apache.catalina.trib
Re: Operation has timed out
Christopher, For the first time On Tue, Feb 7, 2017 at 6:19 PM, Christopher Schultz wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > Fady, > > On 2/7/17 10:53 AM, Fady Haikal wrote: >> ProcessPID Protocol local address local port >> Remote Address State Tomcat8.exe 8160 TCP >> imal14-app24000 imal14-app1.albaraka.com.sdESTABLISHED > > Stupid question: was this working in the past, and it stopped working? > Or are you trying to get this working for the first time? > > - -chris > >> On Tue, Feb 7, 2017 at 5:46 PM, Fady Haikal >> wrote: >>> Yes there is a ESTABLISHED connection, the replication of >>> sessions is working fine (port 4000 is for tomcat cluster) but we >>> also faced this error on the log file >>> >>> On Tue, Feb 7, 2017 at 5:44 PM, André Warnier (tomcat) >>> wrote: On 07.02.2017 16:24, Fady Haikal wrote: > > Hi, telnet IP 4000 is working fine, i installed a tool for > network monitoring at the level of IP and Port and i didnt > see any disconnection, but did you see a *connection* ? I mean, on the pinging node, if you use the Windows "netstat" program, for example as netstat -aon -p TCP you should see a list of connections in the ESTABLISHED state, of which one of the IP/ports should be your target IP:4000 (in the "remote" column). And on the pinged node, this port :4000 should be in the "local" column, in LISTEN mode (and also probably one in the ESTABLISHED state, if they agree.) Is that the case ? and yes i'm sure that no firewall is enabled. > > > I saw some strange think on the server that I have tried to > ping the multicast IP (228.0.0.4) and i get reply from > different IPs in the network, i don't know why and how i get > those IPs, after checking with the network team they told me > that those IPs are related to the SAN storage taking into > consideration that the Tomcat servers are not connected in > anyway to that SUN storage. > > > On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat) > wrote: >> >> Hi. 
>> >> This is for the Tomcat/Tribes experts on the list. >> >> I know nothing of Tribes, but the on-line documentation >> seems to say that the communication happens over TCP and >> that the protocol used is not encrypted. Fady previously >> tried a standard "ping" and a "telnet" between the two >> nodes, and that is the base for him mentioning that "there >> is no network disconnection" between the nodes. >> Nevertheless, the calling pinging node seems to say that it >> times out without getting a response fom the target node. >> There is evidently a contradiction there. So this could >> still be some kind of network issue. >> >> Considering that the protocol command for this "ping" >> should be known by someone here, would it not be possible >> to imagine a little program in some scripting language (or >> even java, God forbid), which would open a TCP channel with >> the target node IP/port, send such a "ping" message, wait >> for a response and report the result ? That would at least >> confirm/deny that the problem is with the network. >> >> The log below does not for example say if the error happens >> when opening the TCP communication channel, or after >> sending the ping message on it, (Of course, testing the TCP >> open could be done with "telnet IP 4000", but I don't know >> if Fady tried this). Maybe tribes also already contains >> some löw-level debugging options ? wireshark maybe another >> option, but it has quite a learning curve. And this is on >> Windows. >> >> By the way Fady, are you sure that your "Windows Firewall >> with Enhanced Security" is not just dropping TCP packets >> to/from port 40xx (or from "java.exe") ? There are some >> "network policies" there which can have wide-ranging >> side-effects. >> >> >> >> >> On 07.02.2017 14:42, Fady Haikal wrote: >>> >>> >>> Hi, issue still not fixed. 
Tomcat session replication is >>> not able to replicate the key from node to node, please >>> find below the error, taking into consideration that >>> there is no network disconnection between 2 nodes >>> >>> >>> 07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8] >>> org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryIn > fo >>> >>> > Unable to replicate backup >>> key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2 to >>> backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10 > , >>> >>> > 114, 43, 102}:4000,{10, 114, 43, 102},4000, alive=68841350, >>> securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64 >>> 12 -102 -14 -85 -87 15 9 -51 -112 }, payload={}, >>> command={}, domain={}, ]. Reaso
Re: Operation has timed out
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Fady, On 2/7/17 10:53 AM, Fady Haikal wrote: > ProcessPID Protocol local address local port > Remote Address State Tomcat8.exe 8160 TCP > imal14-app24000 imal14-app1.albaraka.com.sdESTABLISHED Stupid question: was this working in the past, and it stopped working? Or are you trying to get this working for the first time? - -chris > On Tue, Feb 7, 2017 at 5:46 PM, Fady Haikal > wrote: >> Yes there is a ESTABLISHED connection, the replication of >> sessions is working fine (port 4000 is for tomcat cluster) but we >> also faced this error on the log file >> >> On Tue, Feb 7, 2017 at 5:44 PM, André Warnier (tomcat) >> wrote: >>> On 07.02.2017 16:24, Fady Haikal wrote: Hi, telnet IP 4000 is working fine, i installed a tool for network monitoring at the level of IP and Port and i didnt see any disconnection, >>> >>> >>> but did you see a *connection* ? I mean, on the pinging node, >>> if you use the Windows "netstat" program, for example as >>> netstat -aon -p TCP you should see a list of connections in the >>> ESTABLISHED state, of which one of the IP/ports should be your >>> target IP:4000 (in the "remote" column). And on the pinged >>> node, this port :4000 should be in the "local" column, in >>> LISTEN mode (and also probably one in the ESTABLISHED state, if >>> they agree.) >>> >>> Is that the case ? >>> >>> >>> >>> and yes i'm sure that no firewall is enabled. I saw some strange think on the server that I have tried to ping the multicast IP (228.0.0.4) and i get reply from different IPs in the network, i don't know why and how i get those IPs, after checking with the network team they told me that those IPs are related to the SAN storage taking into consideration that the Tomcat servers are not connected in anyway to that SUN storage. On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat) wrote: > > Hi. > > This is for the Tomcat/Tribes experts on the list. 
> > I know nothing of Tribes, but the on-line documentation > seems to say that the communication happens over TCP and > that the protocol used is not encrypted. Fady previously > tried a standard "ping" and a "telnet" between the two > nodes, and that is the base for him mentioning that "there > is no network disconnection" between the nodes. > Nevertheless, the calling pinging node seems to say that it > times out without getting a response fom the target node. > There is evidently a contradiction there. So this could > still be some kind of network issue. > > Considering that the protocol command for this "ping" > should be known by someone here, would it not be possible > to imagine a little program in some scripting language (or > even java, God forbid), which would open a TCP channel with > the target node IP/port, send such a "ping" message, wait > for a response and report the result ? That would at least > confirm/deny that the problem is with the network. > > The log below does not for example say if the error happens > when opening the TCP communication channel, or after > sending the ping message on it, (Of course, testing the TCP > open could be done with "telnet IP 4000", but I don't know > if Fady tried this). Maybe tribes also already contains > some löw-level debugging options ? wireshark maybe another > option, but it has quite a learning curve. And this is on > Windows. > > By the way Fady, are you sure that your "Windows Firewall > with Enhanced Security" is not just dropping TCP packets > to/from port 40xx (or from "java.exe") ? There are some > "network policies" there which can have wide-ranging > side-effects. > > > > > On 07.02.2017 14:42, Fady Haikal wrote: >> >> >> Hi, issue still not fixed. 
Tomcat session replication is >> not able to replicate the key from node to node, please >> find below the error, taking into consideration that >> there is no network disconnection between 2 nodes >> >> >> 07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8] >> org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryIn fo >> >> Unable to replicate backup >> key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2 to >> backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10 , >> >> 114, 43, 102}:4000,{10, 114, 43, 102},4000, alive=68841350, >> securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64 >> 12 -102 -14 -85 -87 15 9 -51 -112 }, payload={}, >> command={}, domain={}, ]. Reason:Operation has timed >> out(3000 ms.).; Faulty members:tcp://{10, 114, 43, >> 102}:4000; org.apache.catalina.tribes.ChannelException: >> Operation has timed out(3000 ms.).; Faulty
Re: Operation has timed out
Process      PID   Protocol  Local address  Local port  Remote address               State
Tomcat8.exe  8160  TCP       imal14-app2    4000        imal14-app1.albaraka.com.sd  ESTABLISHED

On Tue, Feb 7, 2017 at 5:46 PM, Fady Haikal wrote:
> Yes, there is an ESTABLISHED connection, the replication of sessions is
> working fine (port 4000 is for the Tomcat cluster), but we also see this
> error in the log file.
>
> On Tue, Feb 7, 2017 at 5:44 PM, André Warnier (tomcat) wrote:
>> On 07.02.2017 16:24, Fady Haikal wrote:
>>> Hi,
>>> telnet IP 4000 is working fine. I installed a tool for network
>>> monitoring at the level of IP and port and I didn't see any
>>> disconnection,
>>
>> but did you see a *connection* ?
>> I mean, on the pinging node, if you use the Windows "netstat" program, for
>> example as
>>     netstat -aon -p TCP
>> you should see a list of connections in the ESTABLISHED state, of which one
>> of the IP/ports should be your target IP:4000 (in the "remote" column).
>> And on the pinged node, this port :4000 should be in the "local" column, in
>> LISTEN mode
>> (and also probably one in the ESTABLISHED state, if they agree.)
>>
>> Is that the case ?
>>
>>> and yes I'm sure that no firewall is enabled.
>>>
>>> I saw something strange on the server: I tried to ping the
>>> multicast IP (228.0.0.4) and I get replies from different IPs in the
>>> network. I don't know why or how I get those IPs. After checking with
>>> the network team, they told me that those IPs are related to the SAN
>>> storage, taking into consideration that the Tomcat servers are not
>>> connected in any way to that SAN storage.
>>>
>>> On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat) wrote:
>>>> Hi.
>>>>
>>>> This is for the Tomcat/Tribes experts on the list.
>>>>
>>>> I know nothing of Tribes, but the on-line documentation seems to say
>>>> that the communication happens over TCP and that the protocol used is
>>>> not encrypted.
Fady previously tried a standard "ping" and a "telnet" between the two nodes, and that is the base for him mentioning that "there is no network disconnection" between the nodes. Nevertheless, the calling pinging node seems to say that it times out without getting a response fom the target node. There is evidently a contradiction there. So this could still be some kind of network issue. Considering that the protocol command for this "ping" should be known by someone here, would it not be possible to imagine a little program in some scripting language (or even java, God forbid), which would open a TCP channel with the target node IP/port, send such a "ping" message, wait for a response and report the result ? That would at least confirm/deny that the problem is with the network. The log below does not for example say if the error happens when opening the TCP communication channel, or after sending the ping message on it, (Of course, testing the TCP open could be done with "telnet IP 4000", but I don't know if Fady tried this). Maybe tribes also already contains some löw-level debugging options ? wireshark maybe another option, but it has quite a learning curve. And this is on Windows. By the way Fady, are you sure that your "Windows Firewall with Enhanced Security" is not just dropping TCP packets to/from port 40xx (or from "java.exe") ? There are some "network policies" there which can have wide-ranging side-effects. On 07.02.2017 14:42, Fady Haikal wrote: > > > Hi, issue still not fixed. 
Tomcat session replication is not able to > replicate the key from node to node, please find below the error, > taking into consideration that there is no network disconnection > between 2 nodes > > > 07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8] > org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo > Unable to replicate backup > key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2 to > backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, > 114, 43, 102}:4000,{10, 114, 43, 102},4000, alive=68841350, > securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64 12 -102 -14 > -85 -87 15 9 -51 -112 }, payload={}, command={}, domain={}, ]. > Reason:Operation has timed out(3000 ms.).; Faulty members:tcp://{10, > 114, 43, 102}:4000; >org.apache.catalina.tribes.ChannelException: Operation has timed > out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000; > at > > org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:108) > at > > org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:48) > at > > org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMess
Re: Operation has timed out
Yes there is a ESTABLISHED connection, the replication of sessions is working fine (port 4000 is for tomcat cluster) but we also faced this error on the log file On Tue, Feb 7, 2017 at 5:44 PM, André Warnier (tomcat) wrote: > On 07.02.2017 16:24, Fady Haikal wrote: >> >> Hi, >> telnet IP 4000 is working fine, i installed a tool for network >> monitoring at the level of IP and Port and i didnt see any >> disconnection, > > > but did you see a *connection* ? > I mean, on the pinging node, if you use the Windows "netstat" program, for > example as > netstat -aon -p TCP > you should see a list of connections in the ESTABLISHED state, of which one > of the IP/ports should be your target IP:4000 (in the "remote" column). > And on the pinged node, this port :4000 should be in the "local" column, in > LISTEN mode > (and also probably one in the ESTABLISHED state, if they agree.) > > Is that the case ? > > > > and yes i'm sure that no firewall is enabled. >> >> >> I saw some strange think on the server that I have tried to ping the >> multicast IP (228.0.0.4) and i get reply from different IPs in the >> network, i don't know why and how i get those IPs, after checking with >> the network team they told me that those IPs are related to the SAN >> storage taking into consideration that the Tomcat servers are not >> connected in anyway to that SUN storage. >> >> >> On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat) >> wrote: >>> >>> Hi. >>> >>> This is for the Tomcat/Tribes experts on the list. >>> >>> I know nothing of Tribes, but the on-line documentation seems to say that >>> the communication happens over TCP and that the protocol used is not >>> encrypted. >>> Fady previously tried a standard "ping" and a "telnet" between the two >>> nodes, and that is the base for him mentioning that "there is no network >>> disconnection" between the nodes. >>> Nevertheless, the calling pinging node seems to say that it times out >>> without getting a response fom the target node. 
There is evidently a >>> contradiction there. >>> So this could still be some kind of network issue. >>> >>> Considering that the protocol command for this "ping" should be known by >>> someone here, would it not be possible to imagine a little program in >>> some >>> scripting language (or even java, God forbid), which would open a TCP >>> channel with the target node IP/port, send such a "ping" message, wait >>> for a >>> response and report the result ? >>> That would at least confirm/deny that the problem is with the network. >>> >>> The log below does not for example say if the error happens when opening >>> the >>> TCP communication channel, or after sending the ping message on it, >>> (Of course, testing the TCP open could be done with "telnet IP 4000", but >>> I >>> don't know if Fady tried this). >>> Maybe tribes also already contains some löw-level debugging options ? >>> wireshark maybe another option, but it has quite a learning curve. >>> And this is on Windows. >>> >>> By the way Fady, are you sure that your "Windows Firewall with Enhanced >>> Security" is not just dropping TCP packets to/from port 40xx (or from >>> "java.exe") ? There are some "network policies" there which can have >>> wide-ranging side-effects. >>> >>> >>> >>> >>> On 07.02.2017 14:42, Fady Haikal wrote: Hi, issue still not fixed. Tomcat session replication is not able to replicate the key from node to node, please find below the error, taking into consideration that there is no network disconnection between 2 nodes 07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8] org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo Unable to replicate backup key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2 to backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, 114, 43, 102}:4000,{10, 114, 43, 102},4000, alive=68841350, securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64 12 -102 -14 -85 -87 15 9 -51 -112 }, payload={}, command={}, domain={}, ]. 
Reason:Operation has timed out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000; org.apache.catalina.tribes.ChannelException: Operation has timed out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000; at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:108) at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:48) at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:54) at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:82) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76) at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(Me
Re: Operation has timed out
On 07.02.2017 16:24, Fady Haikal wrote:
> Hi,
> telnet IP 4000 is working fine. I installed a tool for network
> monitoring at the level of IP and port and I didn't see any
> disconnection,

but did you see a *connection* ?
I mean, on the pinging node, if you use the Windows "netstat" program, for example as
    netstat -aon -p TCP
you should see a list of connections in the ESTABLISHED state, of which one of the IP/ports should be your target IP:4000 (in the "remote" column).
And on the pinged node, this port :4000 should be in the "local" column, in LISTEN mode
(and also probably one in the ESTABLISHED state, if they agree.)

Is that the case ?

> and yes I'm sure that no firewall is enabled.
>
> I saw something strange on the server: I tried to ping the
> multicast IP (228.0.0.4) and I get replies from different IPs in the
> network. I don't know why or how I get those IPs. After checking with
> the network team, they told me that those IPs are related to the SAN
> storage, taking into consideration that the Tomcat servers are not
> connected in any way to that SAN storage.
>
> On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat) wrote:
>> Hi.
>>
>> This is for the Tomcat/Tribes experts on the list.
>>
>> I know nothing of Tribes, but the on-line documentation seems to say that
>> the communication happens over TCP and that the protocol used is not
>> encrypted.
>> Fady previously tried a standard "ping" and a "telnet" between the two
>> nodes, and that is the basis for him mentioning that "there is no network
>> disconnection" between the nodes.
>> Nevertheless, the calling pinging node seems to say that it times out
>> without getting a response from the target node. There is evidently a
>> contradiction there.
>> So this could still be some kind of network issue.
Considering that the protocol command for this "ping" should be known by someone here, would it not be possible to imagine a little program in some scripting language (or even java, God forbid), which would open a TCP channel with the target node IP/port, send such a "ping" message, wait for a response and report the result ? That would at least confirm/deny that the problem is with the network. The log below does not for example say if the error happens when opening the TCP communication channel, or after sending the ping message on it, (Of course, testing the TCP open could be done with "telnet IP 4000", but I don't know if Fady tried this). Maybe tribes also already contains some löw-level debugging options ? wireshark maybe another option, but it has quite a learning curve. And this is on Windows. By the way Fady, are you sure that your "Windows Firewall with Enhanced Security" is not just dropping TCP packets to/from port 40xx (or from "java.exe") ? There are some "network policies" there which can have wide-ranging side-effects. On 07.02.2017 14:42, Fady Haikal wrote: Hi, issue still not fixed. Tomcat session replication is not able to replicate the key from node to node, please find below the error, taking into consideration that there is no network disconnection between 2 nodes 07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8] org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo Unable to replicate backup key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2 to backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, 114, 43, 102}:4000,{10, 114, 43, 102},4000, alive=68841350, securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64 12 -102 -14 -85 -87 15 9 -51 -112 }, payload={}, command={}, domain={}, ]. 
Reason:Operation has timed out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000; org.apache.catalina.tribes.ChannelException: Operation has timed out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000; at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:108) at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:48) at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:54) at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:82) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76) at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:81) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76) at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:93) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76) at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:233) at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:186) at org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo(LazyReplicatedMap.java:170) at org.apache.catalina.tribes.tipis.AbstractRepli
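The netstat check described in this message can also be scripted. The following is only a sketch, assuming the usual Windows column order for `netstat -aon -p TCP` (Proto, Local Address, Foreign Address, State, PID); the class name `NetstatPortFilter` and the helper `rowsForPort` are made up for illustration. It keeps the TCP rows in which port 4000 appears as either the local or the remote port.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

class NetstatPortFilter {

    /** Keeps TCP rows whose local or foreign address ends with ":port". */
    static List<String> rowsForPort(List<String> lines, int port) {
        String suffix = ":" + port;
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            String[] cols = line.trim().split("\\s+");
            // Assumed layout: Proto, Local Address, Foreign Address, State, PID
            if (cols.length >= 4 && cols[0].equals("TCP")
                    && (cols[1].endsWith(suffix) || cols[2].endsWith(suffix))) {
                hits.add(line.trim());
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        // Runs the same command as suggested in the message above.
        Process p = new ProcessBuilder("netstat", "-aon", "-p", "TCP")
                .redirectErrorStream(true)
                .start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                lines.add(line);
            }
        }
        // 4000 is the cluster receiver port mentioned in this thread.
        rowsForPort(lines, 4000).forEach(System.out::println);
    }
}
```

On the pinged node one would expect a listening row for local port 4000, and on the pinging node an ESTABLISHED row with :4000 in the remote column, as the message explains.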
Re: Operation has timed out
Hi,
telnet IP 4000 is working fine. I installed a tool for network monitoring at the level of IP and port and I didn't see any disconnection, and yes I'm sure that no firewall is enabled.

I saw something strange on the server: I tried to ping the multicast IP (228.0.0.4) and I get replies from different IPs in the network. I don't know why or how I get those IPs. After checking with the network team, they told me that those IPs are related to the SAN storage, taking into consideration that the Tomcat servers are not connected in any way to that SAN storage.

On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat) wrote:
> Hi.
>
> This is for the Tomcat/Tribes experts on the list.
>
> I know nothing of Tribes, but the on-line documentation seems to say that
> the communication happens over TCP and that the protocol used is not
> encrypted.
> Fady previously tried a standard "ping" and a "telnet" between the two
> nodes, and that is the basis for him mentioning that "there is no network
> disconnection" between the nodes.
> Nevertheless, the calling pinging node seems to say that it times out
> without getting a response from the target node. There is evidently a
> contradiction there.
> So this could still be some kind of network issue.
>
> Considering that the protocol command for this "ping" should be known by
> someone here, would it not be possible to imagine a little program in some
> scripting language (or even java, God forbid), which would open a TCP
> channel with the target node IP/port, send such a "ping" message, wait for a
> response and report the result ?
> That would at least confirm/deny that the problem is with the network.
>
> The log below does not for example say if the error happens when opening the
> TCP communication channel, or after sending the ping message on it,
> (Of course, testing the TCP open could be done with "telnet IP 4000", but I
> don't know if Fady tried this).
> Maybe tribes also already contains some löw-level debugging options ? > wireshark maybe another option, but it has quite a learning curve. > And this is on Windows. > > By the way Fady, are you sure that your "Windows Firewall with Enhanced > Security" is not just dropping TCP packets to/from port 40xx (or from > "java.exe") ? There are some "network policies" there which can have > wide-ranging side-effects. > > > > > On 07.02.2017 14:42, Fady Haikal wrote: >> >> Hi, issue still not fixed. Tomcat session replication is not able to >> replicate the key from node to node, please find below the error, >> taking into consideration that there is no network disconnection >> between 2 nodes >> >> >> 07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8] >> org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo >> Unable to replicate backup >> key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2 to >> backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, >> 114, 43, 102}:4000,{10, 114, 43, 102},4000, alive=68841350, >> securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64 12 -102 -14 >> -85 -87 15 9 -51 -112 }, payload={}, command={}, domain={}, ]. 
>> Reason:Operation has timed out(3000 ms.).; Faulty members:tcp://{10, >> 114, 43, 102}:4000; >> org.apache.catalina.tribes.ChannelException: Operation has timed >> out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000; >> at >> org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:108) >> at >> org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:48) >> at >> org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:54) >> at >> org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:82) >> at >> org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76) >> at >> org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:81) >> at >> org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76) >> at >> org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:93) >> at >> org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76) >> at >> org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:233) >> at >> org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:186) >> at >> org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo(LazyReplicatedMap.java:170) >> at >> org.apache.catalina.tribes.tipis.AbstractReplicatedMap.put(AbstractReplicatedMap.java:1040) >> at >> org.apache.catalina.tribes.tipis.AbstractReplicatedMap.put(AbstractReplicatedMap.java:1024) >> at org.apache.catalina.session.ManagerBase.add(ManagerBase.java:647) >> at >> org.apache.catalina.session.StandardSession.setId(StandardSession.java:374) >> at >>
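One way to see which hosts actually send traffic to the cluster's multicast group is to join it and record the sender addresses. This is a rough sketch, not a Tribes tool: it assumes the membership address 228.0.0.4 and port 45564 from the configuration posted in this thread, and the class name `McastSniffer` and method `listen` are invented for illustration. A reply to an ICMP ping does not by itself show that a host sends datagrams on that port, so this looks at the UDP traffic instead.

```java
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;
import java.net.SocketTimeoutException;
import java.util.LinkedHashSet;
import java.util.Set;

class McastSniffer {

    /** Joins the group and returns the distinct sender IPs seen within the given time. */
    static Set<String> listen(String group, int port, int millis) throws Exception {
        InetAddress addr = InetAddress.getByName(group);
        if (!addr.isMulticastAddress()) {
            throw new IllegalArgumentException(group + " is not a multicast address");
        }
        Set<String> senders = new LinkedHashSet<>();
        try (MulticastSocket socket = new MulticastSocket(port)) {
            // joinGroup(InetAddress) is deprecated in newer JDKs but still available.
            socket.joinGroup(addr);
            long deadline = System.currentTimeMillis() + millis;
            byte[] buf = new byte[2048];
            while (true) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    break;
                }
                socket.setSoTimeout((int) remaining);
                DatagramPacket packet = new DatagramPacket(buf, buf.length);
                try {
                    socket.receive(packet);
                } catch (SocketTimeoutException e) {
                    break; // listening window is over
                }
                senders.add(packet.getAddress().getHostAddress());
            }
            socket.leaveGroup(addr);
        }
        return senders;
    }

    public static void main(String[] args) throws Exception {
        // Address and port taken from the McastService settings quoted in this thread.
        Set<String> senders = listen("228.0.0.4", 45564, 5000);
        System.out.println("Senders seen on 228.0.0.4:45564: " + senders);
    }
}
```

If only the two Tomcat nodes show up, the address/port pair is probably not shared with another cluster; any other sender would be worth asking the network team about.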
Re: Operation has timed out
Hi.

This is for the Tomcat/Tribes experts on the list.

I know nothing of Tribes, but the on-line documentation seems to say that the communication happens over TCP and that the protocol used is not encrypted.
Fady previously tried a standard "ping" and a "telnet" between the two nodes, and that is the basis for him mentioning that "there is no network disconnection" between the nodes.
Nevertheless, the calling pinging node seems to say that it times out without getting a response from the target node. There is evidently a contradiction there.
So this could still be some kind of network issue.

Considering that the protocol command for this "ping" should be known by someone here, would it not be possible to imagine a little program in some scripting language (or even java, God forbid), which would open a TCP channel with the target node IP/port, send such a "ping" message, wait for a response and report the result ?
That would at least confirm/deny that the problem is with the network.

The log below does not for example say if the error happens when opening the TCP communication channel, or after sending the ping message on it.
(Of course, testing the TCP open could be done with "telnet IP 4000", but I don't know if Fady tried this).
Maybe Tribes also already contains some low-level debugging options ?
Wireshark may be another option, but it has quite a learning curve.
And this is on Windows.

By the way Fady, are you sure that your "Windows Firewall with Enhanced Security" is not just dropping TCP packets to/from port 40xx (or from "java.exe") ? There are some "network policies" there which can have wide-ranging side-effects.

On 07.02.2017 14:42, Fady Haikal wrote:
> Hi, issue still not fixed.
Tomcat session replication is not able to replicate the key from node to node, please find below the error, taking into consideration that there is no network disconnection between 2 nodes 07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8] org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo Unable to replicate backup key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2 to backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, 114, 43, 102}:4000,{10, 114, 43, 102},4000, alive=68841350, securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64 12 -102 -14 -85 -87 15 9 -51 -112 }, payload={}, command={}, domain={}, ]. Reason:Operation has timed out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000; org.apache.catalina.tribes.ChannelException: Operation has timed out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000; at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:108) at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:48) at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:54) at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:82) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76) at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:81) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76) at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:93) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76) at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:233) at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:186) at 
org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo(LazyReplicatedMap.java:170) at org.apache.catalina.tribes.tipis.AbstractReplicatedMap.put(AbstractReplicatedMap.java:1040) at org.apache.catalina.tribes.tipis.AbstractReplicatedMap.put(AbstractReplicatedMap.java:1024) at org.apache.catalina.session.ManagerBase.add(ManagerBase.java:647) at org.apache.catalina.session.StandardSession.setId(StandardSession.java:374) at org.apache.catalina.ha.session.DeltaSession.setId(DeltaSession.java:279) at org.apache.catalina.session.ManagerBase.createSession(ManagerBase.java:708) at org.apache.catalina.connector.Request.doGetSession(Request.java:2936) at org.apache.catalina.connector.Request.getSession(Request.java:2260) at org.apache.catalina.connector.RequestFacade.getSession(RequestFacade.java:895) at javax.servlet.http.HttpServletRequestWrapper.getSession(HttpServletRequestWrapper.java:231) at org.apache.catalina.core.ApplicationHttpRequest.getSession(ApplicationHttpRequest.java:568) at org.apache.catalina.core.ApplicationHttpRequest.getSession(ApplicationHttpRequest.java:513) at org.apache.jasper.runtime.PageContextImpl.initialize(PageContextImpl.java:137) at org.apache.jasper.runtime.JspFactoryImpl.internalGetPageContext(JspFactoryImp
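The "little program" suggested in this message could start out as small as the sketch below. It only covers the first half of the idea, opening a TCP channel to the target node and reporting whether that succeeds within a timeout; it does not send a Tribes "ping" message, because the message format is not given anywhere in this thread. The class name `TcpProbe`, the method `canConnect`, and the defaults (10.114.43.102, port 4000 and 3000 ms, copied from the log above) are assumptions for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class TcpProbe {

    /** Returns true if a TCP connection to host:port can be opened within the timeout. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, unreachable hosts and connect timeouts.
            System.out.println("connect to " + host + ":" + port + " failed: " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults mirror the faulty member and the timeout shown in the log.
        String host = args.length > 0 ? args[0] : "10.114.43.102";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 4000;
        long start = System.nanoTime();
        boolean ok = canConnect(host, port, 3000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(host + ":" + port + (ok ? " reachable" : " NOT reachable")
                + " after " + elapsedMs + " ms");
    }
}
```

Run in a loop from the node that logs the error, this would at least show whether the TCP open itself ever takes close to the 3000 ms limit, which is the part the log does not tell us.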
Re: Operation has timed out
Hi, issue still not fixed. Tomcat session replication is not able to replicate the key from node to node, please find below the error, taking into consideration that there is no network disconnection between 2 nodes 07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8] org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo Unable to replicate backup key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2 to backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, 114, 43, 102}:4000,{10, 114, 43, 102},4000, alive=68841350, securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64 12 -102 -14 -85 -87 15 9 -51 -112 }, payload={}, command={}, domain={}, ]. Reason:Operation has timed out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000; org.apache.catalina.tribes.ChannelException: Operation has timed out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000; at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:108) at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:48) at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:54) at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:82) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76) at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:81) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76) at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:93) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76) at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:233) at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:186) 
at org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo(LazyReplicatedMap.java:170) at org.apache.catalina.tribes.tipis.AbstractReplicatedMap.put(AbstractReplicatedMap.java:1040) at org.apache.catalina.tribes.tipis.AbstractReplicatedMap.put(AbstractReplicatedMap.java:1024) at org.apache.catalina.session.ManagerBase.add(ManagerBase.java:647) at org.apache.catalina.session.StandardSession.setId(StandardSession.java:374) at org.apache.catalina.ha.session.DeltaSession.setId(DeltaSession.java:279) at org.apache.catalina.session.ManagerBase.createSession(ManagerBase.java:708) at org.apache.catalina.connector.Request.doGetSession(Request.java:2936) at org.apache.catalina.connector.Request.getSession(Request.java:2260) at org.apache.catalina.connector.RequestFacade.getSession(RequestFacade.java:895) at javax.servlet.http.HttpServletRequestWrapper.getSession(HttpServletRequestWrapper.java:231) at org.apache.catalina.core.ApplicationHttpRequest.getSession(ApplicationHttpRequest.java:568) at org.apache.catalina.core.ApplicationHttpRequest.getSession(ApplicationHttpRequest.java:513) at org.apache.jasper.runtime.PageContextImpl.initialize(PageContextImpl.java:137) at org.apache.jasper.runtime.JspFactoryImpl.internalGetPageContext(JspFactoryImpl.java:109) at org.apache.jasper.runtime.JspFactoryImpl.getPageContext(JspFactoryImpl.java:60) at org.apache.jsp.WEB_002dINF.jsp._401_jsp._jspService(_401_jsp.java:100) at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70) at javax.servlet.http.HttpServlet.service(HttpServlet.java:729) at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:438) at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:396) at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:340) at javax.servlet.http.HttpServlet.service(HttpServlet.java:729) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:291) at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:719) at org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:467) at org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:390) at org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:317) at org.apache.catalina.core.StandardHostValve.custom(StandardHostValve.java:445) at org.apache.catalina.core.StandardHostValve.status(StandardHostValve.java:304) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:181) at org.apache.catalina.ha.session.JvmRouteBinderValve.invoke(JvmRouteBinderValve.java:194) at org.apache.catalina.ha.tcp.ReplicationValve.invoke(ReplicationValve.java:318) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79) at org.apache.catalina.valves.StuckThreadDetectionVal
AW: Operation has timed out
Fady,

Sorry for top posting.

If I remember correctly, the Cluster element goes into the Container and not the Host. Also, in our (working) configuration I see a DeltaManager and a JvmRouteSessionIDBinderListener ... Besides this, only ports, limits and values are different.

You may want to exclude static resources such as GIF, JPG or CSS files from replication.

Best regards
Peter

> below is the server.xml configuration, as mentioned earlier the issue
> is related to the cluster configuration, and as per my research i can
> see that some users are facing the same issue but i didn't find the
> solution for it
>
> SSLEngine="on" />
>
> className="org.apache.catalina.core.JreMemoryLeakPreventionListener"
> />
> className="org.apache.catalina.mbeans.GlobalResourcesLifecycleListener"
> />
> className="org.apache.catalina.core.ThreadLocalLeakPreventionListener"
> />
>
> type="org.apache.catalina.UserDatabase"
> description="User database that can be updated and saved"
> factory="org.apache.catalina.users.MemoryUserDatabaseFactory"
> pathname="conf/tomcat-users.xml" />
>
> connectionTimeout="6" maxThreads="500"
> minSpareThreads="25" maxSpareThreads="75" enableLookups="false"
> disableUploadTimeout="true" acceptCount="100" redirectPort="8443" />
>
> resourceName="UserDatabase"/>
>
> unpackWARs="true" autoDeploy="true" startStopThreads="0">
>
> channelSendOptions="4">
>
> className="org.apache.catalina.tribes.membership.McastService"
> address="228.0.0.4"
> port="45564"
> frequency="500"
> dropTime="9000"/>
> className="org.apache.catalina.tribes.transport.nio.NioReceiver"
> address="auto"
> port="4000"
> autoBind="100"
> selectorTimeout="5000"
> maxThreads="6"/>
>
> className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
> className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
>
> className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
> className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
>
> filter=""/>
> className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>
>
> tempDir="D:/imaljava/TomcatNode1/tmp/war-temp/"
> deployDir="D:/imaljava/TomcatNode1/tmp/war-deploy/"
> watchDir="D:/imaljava/TomcatNode1/tmp/war-listen/"
> watchEnabled="false"/>
>
> className="org.apache.catalina.ha.session.ClusterSessionListener"/>
>
> directory="logs"
> prefix="localhost_access_log" suffix=".txt"
> pattern="%h %l %u %t &quot;%r&quot; %s %b" />
> threshold="900" />
>
> On Mon, Feb 6, 2017 at 6:51 PM, André Warnier (tomcat) wrote:
> > On 06.02.2017 17:45, Fady Haikal wrote:
> >>
> >> Hi,
> >> What is the host OS ? Windows Server 2012
> >> What is the Tomcat version ? Apache Tomcat/8.0.30
> >>
> >> Is this problem new ? was this working before ? how long ? Since
> >> cluster implementation
> >>
> >
> > I still don't know tribes, but then my non-educated guess at this point
> > would be that there is something wrong in your configuration.
> > Can you copy/paste it here ? (remove sensitive things like passwords, public
> > IP addresses etc..)(but not to the point of making it uncheckable).
> >
> > Then maybe some tribes-specialist can take over ?
> >
> >>
> >> Is there actually something listening on that address/port ? Tomcat
> >> cluster
> >>
> >> the Port 4000 is listening and there is no disconnection between 2
> >> nodes ping and telnet are OK
> >>
> >> On Mon, Feb 6, 2017 at 6:42 PM, André Warnier (tomcat) wrote:
> >>>
> >>> On 06.02.2017 17:24, Fady Haikal wrote:
> >>>>
> >>>> Plz can i get some help here?
> >>>> This issue is still occurring and it's filling the log file in the
> >>>> Production server
> >>>>
> >>>> Regards,
> >>>> Fady
> >>>
> >>> Hi.
> >>> If you want quick answers, you should provide more information.
> >>> What is the host OS ?
> >>> What is the Tomcat version ?
> >>> Is this problem new ? was this working before ? how long ?
> >>>
> >>> I do not know tribes at all, but according to the logfile