Re: Operation has timed out

2017-02-07 Thread Fady Haikal
Hi All,
As you know, we are using the 224.0.0.4 multicast IP for the Tomcat cluster
(Node1: 10.114.43.102 / Node2: 10.114.43.103). When I tried to ping the
multicast IP, I got a reply from 10.114.43.51!

I also ran nslookup for 224.0.0.4; it answered from the DC IP
(10.114.43.7) with a name in the mcast.net domain:

C:\Users\Administrator>nslookup 224.0.0.4
Server:  hq-dc02.albaraka.com.sd
Address:  10.114.43.7

Name:    dvmrp.mcast.net
Address:  224.0.0.4
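
The nslookup answer is expected: 224.0.0.4 lies inside 224.0.0.0/24, the local network control block that IANA reserves for routing protocols, and 224.0.0.4 itself is assigned to DVMRP routers. The 228.0.0.4 address in the McastService configuration quoted below is outside that block. A minimal sketch of that check, using only Python's standard `ipaddress` module (the function name `describe_multicast` is illustrative, not part of any Tomcat tooling):

```python
import ipaddress

# 224.0.0.0/24 is the IANA "Local Network Control Block"; addresses in it
# are reserved for protocols such as DVMRP (224.0.0.4).
LOCAL_NETWORK_CONTROL = ipaddress.ip_network("224.0.0.0/24")


def describe_multicast(address: str) -> str:
    """Classify an IPv4 address as a candidate cluster multicast group."""
    ip = ipaddress.ip_address(address)
    if not ip.is_multicast:
        return "not multicast"
    if ip in LOCAL_NETWORK_CONTROL:
        return "reserved (local network control)"
    return "multicast, outside the reserved control block"


if __name__ == "__main__":
    for candidate in ("224.0.0.4", "228.0.0.4", "10.114.43.51"):
        print(candidate, "->", describe_multicast(candidate))
```

A reply to a ping of a reserved group only says that some device on the segment listens to that group; it says nothing about the Tomcat cluster.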

On Wed, Feb 8, 2017 at 8:59 AM, Fady Haikal  wrote:
> Ashwin,
> I'm using the configuration below. Please let me know how I can check
> whether I'm using a unique multicast address and port.
>
>
> <Channel className="org.apache.catalina.tribes.group.GroupChannel">
>   <Membership className="org.apache.catalina.tribes.membership.McastService"
>               address="228.0.0.4"
>               port="45564"
>               frequency="500"
>               dropTime="9000"/>
>   <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
>             address="auto"
>             port="4000"
>             autoBind="100"
>             selectorTimeout="5000"
>             maxThreads="6"/>
>
> On Wed, Feb 8, 2017 at 6:39 AM, ashwin rajput  wrote:
>> I am not sure if anyone has verified below.
>>
>> Have you verifyed clustering is using unique multicast address and port.
>> Cluster multicast address should be unique and not used by any other
>> cluster.
>>
>> Regards,
>> Ashwin
>> On 07-Feb-2017 10:38 pm, "André Warnier (tomcat)"  wrote:
>>
>>> On 07.02.2017 17:20, Fady Haikal wrote:
>>>
 Christopher,
 For the first time

>>>
>>> @Christopher : just to make sure you got that bit, buried below : the
>>> actual replication seems to work fine. The problem is only these
>>> "unsuccesful ping" messages in the log, which fill the log, and which so
>>> far nobody has managed to find an explanation for.
>>>
>>>
 On Tue, Feb 7, 2017 at 6:19 PM, Christopher Schultz
  wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
> Fady,
>
> On 2/7/17 10:53 AM, Fady Haikal wrote:
>
>> ProcessPID   Protocol   local address  local port
>> Remote Address  State Tomcat8.exe 8160 TCP
>> imal14-app24000 imal14-app1.albaraka.com.sdESTABLISHED
>>
>
> Stupid question: was this working in the past, and it stopped working?
> Or are you trying to get this working for the first time?
>
> - -chris
>
> On Tue, Feb 7, 2017 at 5:46 PM, Fady Haikal 
>> wrote:
>>
>>> Yes there is a ESTABLISHED connection, the replication of
>>> sessions is working fine (port 4000 is for tomcat cluster) but we
>>> also faced this error on the log file
>>>
>>> On Tue, Feb 7, 2017 at 5:44 PM, André Warnier (tomcat)
>>>  wrote:
>>>
 On 07.02.2017 16:24, Fady Haikal wrote:

>
> Hi, telnet IP 4000 is working fine, i installed a tool for
> network monitoring at the level of IP and Port and i didnt
> see any disconnection,
>


 but did you see a *connection* ? I mean, on the pinging node,
 if you use the Windows "netstat" program, for example as
 netstat -aon -p TCP you should see a list of connections in the
 ESTABLISHED state, of which one of the IP/ports should be your
 target IP:4000 (in the "remote" column). And on the pinged
 node, this port :4000 should be in the "local" column, in
 LISTEN mode (and also probably one in the ESTABLISHED state, if
 they agree.)

 Is that the case ?



 and yes i'm sure that no firewall is enabled.

>
>
> I saw some strange think on the server that I have tried to
> ping the multicast IP (228.0.0.4) and i get reply from
> different IPs in the network, i don't know why and how i get
> those IPs, after checking with the network team they told me
> that those IPs are related to the SAN storage taking into
> consideration that the Tomcat servers are not connected in
> anyway to that SUN storage.
>
>
> On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat)
>  wrote:
>
>>
>> Hi.
>>
>> This is for the Tomcat/Tribes experts on the list.
>>
>> I know nothing of Tribes, but the on-line documentation
>> seems to say that the communication happens over TCP and
>> that the protocol used is not encrypted. Fady previously
>> tried a standard "ping" and a "telnet" between the two
>> nodes, and that is the base for him mentioning that 

Re: Operation has timed out

2017-02-07 Thread Fady Haikal
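Ashwin,
I'm using the configuration below. Please let me know how I can check
whether I'm using a unique multicast address and port.

<Channel className="org.apache.catalina.tribes.group.GroupChannel">
  <Membership className="org.apache.catalina.tribes.membership.McastService"
              address="228.0.0.4"
              port="45564"
              frequency="500"
              dropTime="9000"/>
  <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
            address="auto"
            port="4000"
            autoBind="100"
            selectorTimeout="5000"
            maxThreads="6"/>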

On Wed, Feb 8, 2017 at 6:39 AM, ashwin rajput  wrote:
> I am not sure if anyone has verified below.
>
> Have you verifyed clustering is using unique multicast address and port.
> Cluster multicast address should be unique and not used by any other
> cluster.
>
> Regards,
> Ashwin
> On 07-Feb-2017 10:38 pm, "André Warnier (tomcat)"  wrote:
>
>> On 07.02.2017 17:20, Fady Haikal wrote:
>>
>>> Christopher,
>>> For the first time
>>>
>>
>> @Christopher : just to make sure you got that bit, buried below : the
>> actual replication seems to work fine. The problem is only these
>> "unsuccesful ping" messages in the log, which fill the log, and which so
>> far nobody has managed to find an explanation for.
>>
>>
>>> On Tue, Feb 7, 2017 at 6:19 PM, Christopher Schultz
>>>  wrote:
>>>
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA256

 Fady,

 On 2/7/17 10:53 AM, Fady Haikal wrote:

> ProcessPID   Protocol   local address  local port
> Remote Address  State Tomcat8.exe 8160 TCP
> imal14-app24000 imal14-app1.albaraka.com.sdESTABLISHED
>

 Stupid question: was this working in the past, and it stopped working?
 Or are you trying to get this working for the first time?

 - -chris

 On Tue, Feb 7, 2017 at 5:46 PM, Fady Haikal 
> wrote:
>
>> Yes there is a ESTABLISHED connection, the replication of
>> sessions is working fine (port 4000 is for tomcat cluster) but we
>> also faced this error on the log file
>>
>> On Tue, Feb 7, 2017 at 5:44 PM, André Warnier (tomcat)
>>  wrote:
>>
>>> On 07.02.2017 16:24, Fady Haikal wrote:
>>>

 Hi, telnet IP 4000 is working fine, i installed a tool for
 network monitoring at the level of IP and Port and i didnt
 see any disconnection,

>>>
>>>
>>> but did you see a *connection* ? I mean, on the pinging node,
>>> if you use the Windows "netstat" program, for example as
>>> netstat -aon -p TCP you should see a list of connections in the
>>> ESTABLISHED state, of which one of the IP/ports should be your
>>> target IP:4000 (in the "remote" column). And on the pinged
>>> node, this port :4000 should be in the "local" column, in
>>> LISTEN mode (and also probably one in the ESTABLISHED state, if
>>> they agree.)
>>>
>>> Is that the case ?
>>>
>>>
>>>
>>> and yes i'm sure that no firewall is enabled.
>>>


 I saw some strange think on the server that I have tried to
 ping the multicast IP (228.0.0.4) and i get reply from
 different IPs in the network, i don't know why and how i get
 those IPs, after checking with the network team they told me
 that those IPs are related to the SAN storage taking into
 consideration that the Tomcat servers are not connected in
 anyway to that SUN storage.


 On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat)
  wrote:

>
> Hi.
>
> This is for the Tomcat/Tribes experts on the list.
>
> I know nothing of Tribes, but the on-line documentation
> seems to say that the communication happens over TCP and
> that the protocol used is not encrypted. Fady previously
> tried a standard "ping" and a "telnet" between the two
> nodes, and that is the base for him mentioning that "there
> is no network disconnection" between the nodes.
> Nevertheless, the calling pinging node seems to say that it
> times out without getting a response fom the target node.
> There is evidently a contradiction there. So this could
> still be some kind of network issue.
>
> Considering that the protocol command for this "ping"
> should be known by someone here, would it not be possible
> to imagine a little program in some scripting language (or
> even java, God forbid), which would open a TCP channel with
> the target node IP/port, send such a "ping" message, wait
> for a response and report the result ? That would at least
> confirm/deny that the problem is with the network.
>
> The log below does not for example say if the error happens
> when opening the TCP communication channel, or after
> sending the ping message on it, (Of course, testing the TCP
> open could be done with "telnet IP 4000", but I don't know
> if Fady tried this). Maybe tribes also already contains
> some löw-level debugging options ? 

Re: Operation has timed out

2017-02-07 Thread ashwin rajput
I am not sure whether anyone has verified the following.

Have you verified that the cluster is using a unique multicast address and
port? The cluster multicast address should be unique and not used by any
other cluster.
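
One way to test uniqueness is to join the multicast group and record which hosts send to it. The sketch below is only an illustration in Python, not a Tribes tool: the function names are made up, and it assumes the group and port from the McastService configuration in this thread (228.0.0.4, port 45564). Depending on the platform it may have to run on a third machine, or on a node while Tomcat is stopped, if the port cannot be shared.

```python
import socket
import struct
import time


def build_membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack the ip_mreq structure expected by IP_ADD_MEMBERSHIP."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def listen_for_senders(group: str, port: int, seconds: float = 10.0) -> set:
    """Join the group and collect the source IP of every datagram received."""
    senders = set()
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        # Allow the port to be shared with other listeners where the OS permits.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        build_membership_request(group))
        sock.settimeout(1.0)
        deadline = time.monotonic() + seconds
        while time.monotonic() < deadline:
            try:
                _data, (source_ip, _source_port) = sock.recvfrom(65535)
            except socket.timeout:
                continue  # nothing arrived in this second; keep waiting
            senders.add(source_ip)
    finally:
        sock.close()
    return senders
```

Calling `listen_for_senders("228.0.0.4", 45564)` should return only the two cluster nodes (10.114.43.102 and 10.114.43.103) if the group is private to this cluster; any other source address would suggest that another sender uses the same address and port.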

Regards,
Ashwin
On 07-Feb-2017 10:38 pm, "André Warnier (tomcat)"  wrote:

> On 07.02.2017 17:20, Fady Haikal wrote:
>
>> Christopher,
>> For the first time
>>
>
> @Christopher : just to make sure you got that bit, buried below : the
> actual replication seems to work fine. The problem is only these
> "unsuccesful ping" messages in the log, which fill the log, and which so
> far nobody has managed to find an explanation for.
>
>
>> On Tue, Feb 7, 2017 at 6:19 PM, Christopher Schultz
>>  wrote:
>>
>>> -BEGIN PGP SIGNED MESSAGE-
>>> Hash: SHA256
>>>
>>> Fady,
>>>
>>> On 2/7/17 10:53 AM, Fady Haikal wrote:
>>>
 ProcessPID   Protocol   local address  local port
 Remote Address  State Tomcat8.exe 8160 TCP
 imal14-app24000 imal14-app1.albaraka.com.sdESTABLISHED

>>>
>>> Stupid question: was this working in the past, and it stopped working?
>>> Or are you trying to get this working for the first time?
>>>
>>> - -chris
>>>
>>> On Tue, Feb 7, 2017 at 5:46 PM, Fady Haikal 
 wrote:

> Yes there is a ESTABLISHED connection, the replication of
> sessions is working fine (port 4000 is for tomcat cluster) but we
> also faced this error on the log file
>
> On Tue, Feb 7, 2017 at 5:44 PM, André Warnier (tomcat)
>  wrote:
>
>> On 07.02.2017 16:24, Fady Haikal wrote:
>>
>>>
>>> Hi, telnet IP 4000 is working fine, i installed a tool for
>>> network monitoring at the level of IP and Port and i didnt
>>> see any disconnection,
>>>
>>
>>
>> but did you see a *connection* ? I mean, on the pinging node,
>> if you use the Windows "netstat" program, for example as
>> netstat -aon -p TCP you should see a list of connections in the
>> ESTABLISHED state, of which one of the IP/ports should be your
>> target IP:4000 (in the "remote" column). And on the pinged
>> node, this port :4000 should be in the "local" column, in
>> LISTEN mode (and also probably one in the ESTABLISHED state, if
>> they agree.)
>>
>> Is that the case ?
>>
>>
>>
>> and yes i'm sure that no firewall is enabled.
>>
>>>
>>>
>>> I saw some strange think on the server that I have tried to
>>> ping the multicast IP (228.0.0.4) and i get reply from
>>> different IPs in the network, i don't know why and how i get
>>> those IPs, after checking with the network team they told me
>>> that those IPs are related to the SAN storage taking into
>>> consideration that the Tomcat servers are not connected in
>>> anyway to that SUN storage.
>>>
>>>
>>> On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat)
>>>  wrote:
>>>

 Hi.

 This is for the Tomcat/Tribes experts on the list.

 I know nothing of Tribes, but the on-line documentation
 seems to say that the communication happens over TCP and
 that the protocol used is not encrypted. Fady previously
 tried a standard "ping" and a "telnet" between the two
 nodes, and that is the base for him mentioning that "there
 is no network disconnection" between the nodes.
 Nevertheless, the calling pinging node seems to say that it
 times out without getting a response fom the target node.
 There is evidently a contradiction there. So this could
 still be some kind of network issue.

 Considering that the protocol command for this "ping"
 should be known by someone here, would it not be possible
 to imagine a little program in some scripting language (or
 even java, God forbid), which would open a TCP channel with
 the target node IP/port, send such a "ping" message, wait
 for a response and report the result ? That would at least
 confirm/deny that the problem is with the network.

 The log below does not for example say if the error happens
 when opening the TCP communication channel, or after
 sending the ping message on it, (Of course, testing the TCP
 open could be done with "telnet IP 4000", but I don't know
 if Fady tried this). Maybe tribes also already contains
 some löw-level debugging options ? wireshark maybe another
 option, but it has quite a learning curve. And this is on
 Windows.

 By the way Fady, are you sure that your "Windows Firewall
 with Enhanced Security" is not just dropping TCP packets
 to/from port 40xx (or from "java.exe") ? There are some
 "network policies" there which can have 

Tomcat 7.0.xx under Java 7?

2017-02-07 Thread James H. H. Lampert

Ladies and Gentlemen of the Tomcat List:

To date, the overwhelming bulk of our own Tomcat experience has been 
under Java 6 JVMs. And we have a customer who will likely be losing that 
JVM soon.


Are there any "gotchas" running 7.0.47 or later under Java 7?

--
James H. H. Lampert
Touchtone Corporation

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: Apache Tomcat 7.0.59 - Even if a ws certificate stored in the WSkeystore expires, any webclient request is still accepted by server and not refused

2017-02-07 Thread Mark Thomas

On 07/02/17 19:33, George Stanchev wrote:

Mark,

Apologies for top posting. We have our own trust manager that is
attached to the connector because we want client certificates to be
passed in the application for validation and authentication rather
than the connector. If we switch to the OpenSSL/APR based certificate
processing, would the trust manager still work? I presume not, but
wanted to ask and if not, what are the options?


If the application is validating the client certs, just add valid 
to/from date checking to that validation.
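
The date check itself is small. In the Java application it would normally be a call to `X509Certificate.checkValidity()` on each client certificate, which throws if the certificate is expired or not yet valid. The Python sketch below only illustrates the comparison; the function name `is_within_validity` is hypothetical, and the date strings are assumed to be in the form that Python's `ssl` module reports for the notBefore/notAfter fields.

```python
import ssl
import time
from typing import Optional


def is_within_validity(not_before: str, not_after: str,
                       now: Optional[float] = None) -> bool:
    """Return True if `now` lies inside the certificate's validity window.

    Dates use the "Mon DD HH:MM:SS YYYY GMT" form, for example
    "Jan 15 00:00:00 2016 GMT".
    """
    if now is None:
        now = time.time()
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    return start <= now <= end
```

An application-level trust decision would reject the request whenever this returns False, in addition to whatever other validation it already performs.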


Mark





-Original Message-
From: Mark Thomas [mailto:ma...@apache.org]
Sent: Monday, February 06, 2017 7:20 AM
To: Tomcat Users List
Subject: Re: Apache Tomcat 7.0.59 - Even if a ws certificate stored in the
WSkeystore expires, any webclient request is still accepted by server and
not refused

On 06/02/17 13:49, Francesco Leone wrote:

Dear Sirs, we would like to report a behaviour of Apache Tomcat
7.0.59.

Apache Tomcat 7.0.59 is running with:
- RHEL 6.6
- Java JDK 1.8.0.74
- OpenSSL 1.0.2g

We have a client-server setup. The client certificate is
produced via keytool, and we have the same problem described here:

http://stackoverflow.com/questions/33688020/configuring-apache-tomcat-7-0-to-reject-connections-with-expired-client-certific

and

http://stackoverflow.com/questions/5206859/java-trustmanager-behavior-on-expired-certificates




What we understood from reading those threads is that, to solve our
problem, we should implement a new X509TrustManager which creates our
original instance in its constructor, implements all methods as calls to
the original instance, and adds a call to checkValidity for each
certificate in certs[] inside checkServerTrusted.

Did we understand correctly? If so, this looks to us like a security hole
and therefore a bug in Tomcat. Is there any chance of having this
behaviour (refusing connections with expired certificates) as standard in
a later Apache Tomcat 7.0.x release? Can anyone in this community support
us?


This is not a Tomcat bug.

If you tell Java to trust a certificate, it will do so and ignore the
validity period.

I've looked into this in the past and short of implementing your own
X509TrustManager I haven't yet found an API Tomcat could use to add
an additional check on the trusted cert's validity.

A better general solution is to trust the CA(s) issuing the client
certificates rather than the client certificates. Then, because the
client cert is not in the trust store, Java checks it more thoroughly
- including the validity dates.

It is also worth looking at using an OpenSSL based TLS connector.
From what I recall of my previous testing OpenSSL did check the
validity dates of trusted certs.

Mark




RE: Apache Tomcat 7.0.59 - Even if a ws certificate stored in the WSkeystore expires, any webclient request is still accepted by server and not refused

2017-02-07 Thread George Stanchev
Mark, 

Apologies for top posting. We have our own trust manager that is attached to 
the connector because we want client certificates to be passed in the 
application for validation and authentication rather than the connector. If we 
switch to the OpenSSL/APR based certificate processing, would the trust manager 
still work? I presume not, but wanted to ask and if not, what are the options?


-Original Message-
From: Mark Thomas [mailto:ma...@apache.org] 
Sent: Monday, February 06, 2017 7:20 AM
To: Tomcat Users List 
Subject: Re: Apache Tomcat 7.0.59 - Even if a ws certificate stored in the 
WSkeystore expires, any webclient request is still accepted by server and not 
refused

On 06/02/17 13:49, Francesco Leone wrote:
> Dear Sirs, To communicate you a behaviour with Apache Tomcat 7.0.59
>
> Apache Tomcat 7.0.59 is running with: - RHEL6.6 - java jdk 1.8.0.74 - 
> OpenSSL 1.0.2g
>
> We have a client - server communication. The Client certificate is 
> produced via keytool  and we have same problem highlighted here
>
> http://stackoverflow.com/questions/33688020/configuring-apache-tomcat-7-0-to-reject-connections-with-expired-client-certific
>
>  and
>
> http://stackoverflow.com/questions/5206859/java-trustmanager-behavior-on-expired-certificates
>
>
>
> What we understood from reading those threads is that, to solve our
> problem, we should implement a new X509TrustManager which creates our
> original instance in its constructor, implements all methods as calls
> to the original instance, and adds a call to checkValidity for each
> certificate in certs[] inside checkServerTrusted.
>
> Did we understand correctly? If so, this looks to us like a security
> hole and therefore a bug in Tomcat. Is there any chance of having this
> behaviour (refusing connections with expired certificates) as standard
> in a later Apache Tomcat 7.0.x release? Can anyone in this community
> support us?

This is not a Tomcat bug.

If you tell Java to trust a certificate, it will do so and ignore the validity 
period.

I've looked into this in the past and short of implementing your own 
X509TrustManager I haven't yet found an API Tomcat could use to add an 
additional check on the trusted cert's validity.

A better general solution is to trust the CA(s) issuing the client certificates 
rather than the client certificates. Then, because the client cert is not in 
the trust store, Java checks it more thoroughly - including the validity dates.

It is also worth looking at using an OpenSSL based TLS connector. From what I 
recall of my previous testing OpenSSL did check the validity dates of trusted 
certs.

Mark




Re: Operation has timed out

2017-02-07 Thread tomcat

On 07.02.2017 17:20, Fady Haikal wrote:

Christopher,
For the first time


@Christopher : just to make sure you got that bit, buried below : the actual replication
seems to work fine. The problem is only these "unsuccessful ping" messages in the log,
which fill the log, and for which nobody has so far managed to find an explanation.




On Tue, Feb 7, 2017 at 6:19 PM, Christopher Schultz
 wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Fady,

On 2/7/17 10:53 AM, Fady Haikal wrote:

Process      PID   Protocol  Local address  Local port  Remote address               State
Tomcat8.exe  8160  TCP       imal14-app2    4000        imal14-app1.albaraka.com.sd  ESTABLISHED


Stupid question: was this working in the past, and it stopped working?
Or are you trying to get this working for the first time?

- -chris


On Tue, Feb 7, 2017 at 5:46 PM, Fady Haikal 
wrote:

Yes, there is an ESTABLISHED connection. Session replication is
working fine (port 4000 is for the Tomcat cluster), but we
still see this error in the log file.

On Tue, Feb 7, 2017 at 5:44 PM, André Warnier (tomcat)
 wrote:

On 07.02.2017 16:24, Fady Haikal wrote:


Hi, telnet IP 4000 is working fine. I installed a network monitoring
tool that works at the IP and port level, and I didn't
see any disconnection,



but did you see a *connection* ? I mean, on the pinging node,
if you use the Windows "netstat" program, for example as
netstat -aon -p TCP you should see a list of connections in the
ESTABLISHED state, of which one of the IP/ports should be your
target IP:4000 (in the "remote" column). And on the pinged
node, this port :4000 should be in the "local" column, in
LISTEN mode (and also probably one in the ESTABLISHED state, if
they agree.)

Is that the case ?
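
The netstat check described above can be automated by filtering the output of `netstat -aon -p TCP` for the cluster port. This is a rough sketch in Python; the function name and the sample rows are invented for illustration (only the PID 8160 and port 4000 come from this thread), and note that Windows prints the listening state as LISTENING.

```python
# Hypothetical sample of "netstat -aon -p TCP" output on the pinged node.
SAMPLE = """
  Proto  Local Address          Foreign Address        State           PID
  TCP    10.114.43.103:4000     0.0.0.0:0              LISTENING       8160
  TCP    10.114.43.103:4000     10.114.43.102:52011    ESTABLISHED     8160
  TCP    10.114.43.103:8080     0.0.0.0:0              LISTENING       8160
"""


def connections_on_port(netstat_output: str, port: int) -> list:
    """Return (local, remote, state, pid) tuples whose local or remote port matches."""
    suffix = ":" + str(port)
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows have exactly five fields: TCP <local> <remote> <state> <pid>
        if len(fields) != 5 or fields[0] != "TCP":
            continue
        _proto, local, remote, state, pid = fields
        if local.endswith(suffix) or remote.endswith(suffix):
            matches.append((local, remote, state, pid))
    return matches


if __name__ == "__main__":
    for row in connections_on_port(SAMPLE, 4000):
        print(row)
```

On the pinging node the port should show up in the remote column of an ESTABLISHED row; on the pinged node it should show up in the local column.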



and yes i'm sure that no firewall is enabled.



I saw something strange on the server: when I tried to
ping the multicast IP (228.0.0.4), I got replies from
different IPs on the network. I don't know why or how I get
those IPs. After checking with the network team, they told me
that those IPs belong to the SAN storage, even though
the Tomcat servers are not connected in
any way to that SAN storage.


On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat)
 wrote:


Hi.

This is for the Tomcat/Tribes experts on the list.

I know nothing of Tribes, but the on-line documentation
seems to say that the communication happens over TCP and
that the protocol used is not encrypted. Fady previously
tried a standard "ping" and a "telnet" between the two
nodes, and that is the basis for his statement that "there
is no network disconnection" between the nodes.
Nevertheless, the pinging node seems to say that it
times out without getting a response from the target node.
There is evidently a contradiction there. So this could
still be some kind of network issue.

Considering that the protocol command for this "ping"
should be known by someone here, would it not be possible
to imagine a little program in some scripting language (or
even java, God forbid), which would open a TCP channel with
the target node IP/port, send such a "ping" message, wait
for a response and report the result ? That would at least
confirm/deny that the problem is with the network.
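
A first step toward such a program could look like the Python sketch below. It is deliberately limited: it does not speak the Tribes ping protocol, which is not described in this thread, and only times the TCP open to the receiver port. The function name `probe_tcp` is made up, and the 3-second default simply mirrors the 3000 ms timeout reported in the log.

```python
import socket
import time


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> dict:
    """Try to open a TCP connection; report the outcome and the elapsed time."""
    started = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except socket.timeout:
        outcome = "timed out"
    except ConnectionRefusedError:
        outcome = "refused"
    except OSError as exc:
        outcome = "error: %s" % exc
    return {"outcome": outcome, "seconds": time.monotonic() - started}
```

Running `probe_tcp("10.114.43.102", 4000)` from the other node in a loop and logging the result would show whether the TCP open ever fails or comes close to the timeout. If it always connects quickly, the delay is more likely after the connection is established.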

The log below does not, for example, say whether the error happens
when opening the TCP communication channel or after
sending the ping message on it. (Of course, testing the TCP
open could be done with "telnet IP 4000", but I don't know
if Fady tried this.) Maybe Tribes also already contains
some low-level debugging options ? Wireshark may be another
option, but it has quite a learning curve. And this is on
Windows.

By the way Fady, are you sure that your "Windows Firewall
with Enhanced Security" is not just dropping TCP packets
to/from port 40xx (or from "java.exe") ? There are some
"network policies" there which can have wide-ranging
side-effects.




On 07.02.2017 14:42, Fady Haikal wrote:



Hi, the issue is still not fixed. Tomcat session replication is
not able to replicate the key from node to node. Please
find the error below; note that
there is no network disconnection between the 2 nodes.


07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8]
org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo
Unable to replicate backup key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2
to backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, 114, 43, 102}:4000,{10, 114, 43, 102},4000,
alive=68841350, securePort=-1, UDP Port=-1,
id={85 5 -62 -66 106 -12 64 12 -102 -14 -85 -87 15 9 -51 -112 },
payload={}, command={}, domain={}, ].
Reason:Operation has timed out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000;
org.apache.catalina.tribes.ChannelException: Operation has timed out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000;
at
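
The faulty member in this log is printed as a byte array: `tcp://{10, 114, 43, 102}:4000` is 10.114.43.102 port 4000, that is, Node1. When scanning a long log for which members time out, a small parser helps. The Python sketch below is illustrative only; the function name is invented, and treating negative numbers as signed bytes is an assumption about how octets above 127 would be printed.

```python
import re

# Matches e.g. "Faulty members:tcp://{10, 114, 43, 102}:4000", tolerating
# line breaks inside the entry as they appear in wrapped mail text.
FAULTY_MEMBER = re.compile(
    r"Faulty\s+members:\s*tcp://"
    r"\{(-?\d+),\s*(-?\d+),\s*(-?\d+),\s*(-?\d+)\}:(\d+)"
)


def faulty_members(log_text: str) -> list:
    """Extract unique (ip, port) pairs from 'Faulty members:' log entries."""
    found = []
    for match in FAULTY_MEMBER.finditer(log_text):
        *octets, port = match.groups()
        # Assumption: octets above 127 would be printed as signed bytes.
        ip = ".".join(str(int(octet) & 0xFF) for octet in octets)
        pair = (ip, int(port))
        if pair not in found:
            found.append(pair)
    return found
```

Applied to the entry above it yields `("10.114.43.102", 4000)`, which can then be fed to a connectivity check against that node.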


Re: Operation has timed out

2017-02-07 Thread Fady Haikal
Christopher,
For the first time

On Tue, Feb 7, 2017 at 6:19 PM, Christopher Schultz
 wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
> Fady,
>
> On 2/7/17 10:53 AM, Fady Haikal wrote:
>> ProcessPID   Protocol   local address  local port
>> Remote Address  State Tomcat8.exe 8160 TCP
>> imal14-app24000 imal14-app1.albaraka.com.sdESTABLISHED
>
> Stupid question: was this working in the past, and it stopped working?
> Or are you trying to get this working for the first time?
>
> - -chris
>
>> On Tue, Feb 7, 2017 at 5:46 PM, Fady Haikal 
>> wrote:
>>> Yes there is a ESTABLISHED connection, the replication of
>>> sessions is working fine (port 4000 is for tomcat cluster) but we
>>> also faced this error on the log file
>>>
>>> On Tue, Feb 7, 2017 at 5:44 PM, André Warnier (tomcat)
>>>  wrote:
 On 07.02.2017 16:24, Fady Haikal wrote:
>
> Hi, telnet IP 4000 is working fine, i installed a tool for
> network monitoring at the level of IP and Port and i didnt
> see any disconnection,


 but did you see a *connection* ? I mean, on the pinging node,
 if you use the Windows "netstat" program, for example as
 netstat -aon -p TCP you should see a list of connections in the
 ESTABLISHED state, of which one of the IP/ports should be your
 target IP:4000 (in the "remote" column). And on the pinged
 node, this port :4000 should be in the "local" column, in
 LISTEN mode (and also probably one in the ESTABLISHED state, if
 they agree.)

 Is that the case ?



 and yes i'm sure that no firewall is enabled.
>
>
> I saw some strange think on the server that I have tried to
> ping the multicast IP (228.0.0.4) and i get reply from
> different IPs in the network, i don't know why and how i get
> those IPs, after checking with the network team they told me
> that those IPs are related to the SAN storage taking into
> consideration that the Tomcat servers are not connected in
> anyway to that SUN storage.
>
>
> On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat)
>  wrote:
>>
>> Hi.
>>
>> This is for the Tomcat/Tribes experts on the list.
>>
>> I know nothing of Tribes, but the on-line documentation
>> seems to say that the communication happens over TCP and
>> that the protocol used is not encrypted. Fady previously
>> tried a standard "ping" and a "telnet" between the two
>> nodes, and that is the base for him mentioning that "there
>> is no network disconnection" between the nodes.
>> Nevertheless, the calling pinging node seems to say that it
>> times out without getting a response fom the target node.
>> There is evidently a contradiction there. So this could
>> still be some kind of network issue.
>>
>> Considering that the protocol command for this "ping"
>> should be known by someone here, would it not be possible
>> to imagine a little program in some scripting language (or
>> even java, God forbid), which would open a TCP channel with
>> the target node IP/port, send such a "ping" message, wait
>> for a response and report the result ? That would at least
>> confirm/deny that the problem is with the network.
>>
>> The log below does not for example say if the error happens
>> when opening the TCP communication channel, or after
>> sending the ping message on it, (Of course, testing the TCP
>> open could be done with "telnet IP 4000", but I don't know
>> if Fady tried this). Maybe tribes also already contains
>> some löw-level debugging options ? wireshark maybe another
>> option, but it has quite a learning curve. And this is on
>> Windows.
>>
>> By the way Fady, are you sure that your "Windows Firewall
>> with Enhanced Security" is not just dropping TCP packets
>> to/from port 40xx (or from "java.exe") ? There are some
>> "network policies" there which can have wide-ranging
>> side-effects.
>>
>>
>>
>>
>> On 07.02.2017 14:42, Fady Haikal wrote:
>>>
>>>
>>> Hi, issue still not fixed. Tomcat session replication is
>>> not able to replicate the key from node to node, please
>>> find below the error, taking into consideration that
>>> there is no network disconnection between 2 nodes
>>>
>>>
>>> 07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8]
>>> org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo
>>> Unable to replicate backup key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2
>>> to backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, 114, 43, 102}:4000,{10, 114, 43, 102},4000,
>>> alive=68841350, securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64

Re: Operation has timed out

2017-02-07 Thread Christopher Schultz

Fady,

On 2/7/17 10:53 AM, Fady Haikal wrote:
> Process      PID   Protocol  Local address  Local port  Remote address               State
> Tomcat8.exe  8160  TCP       imal14-app2    4000        imal14-app1.albaraka.com.sd  ESTABLISHED

Stupid question: was this working in the past, and it stopped working?
Or are you trying to get this working for the first time?

- -chris

> On Tue, Feb 7, 2017 at 5:46 PM, Fady Haikal 
> wrote:
>> Yes there is a ESTABLISHED connection, the replication of
>> sessions is working fine (port 4000 is for tomcat cluster) but we
>> also faced this error on the log file
>> 
>> On Tue, Feb 7, 2017 at 5:44 PM, André Warnier (tomcat)
>>  wrote:
>>> On 07.02.2017 16:24, Fady Haikal wrote:
 
 Hi, telnet IP 4000 is working fine, i installed a tool for
 network monitoring at the level of IP and Port and i didnt
 see any disconnection,
>>> 
>>> 
>>> but did you see a *connection* ? I mean, on the pinging node,
>>> if you use the Windows "netstat" program, for example as 
>>> netstat -aon -p TCP you should see a list of connections in the
>>> ESTABLISHED state, of which one of the IP/ports should be your
>>> target IP:4000 (in the "remote" column). And on the pinged
>>> node, this port :4000 should be in the "local" column, in 
>>> LISTEN mode (and also probably one in the ESTABLISHED state, if
>>> they agree.)
>>> 
>>> Is that the case ?
>>> 
>>> 
>>> 
>>> and yes i'm sure that no firewall is enabled.
 
 
 I saw some strange think on the server that I have tried to
 ping the multicast IP (228.0.0.4) and i get reply from
 different IPs in the network, i don't know why and how i get
 those IPs, after checking with the network team they told me
 that those IPs are related to the SAN storage taking into
 consideration that the Tomcat servers are not connected in
 anyway to that SUN storage.
 
 
 On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat)
  wrote:
> 
> Hi.
> 
> This is for the Tomcat/Tribes experts on the list.
> 
> I know nothing of Tribes, but the on-line documentation
> seems to say that the communication happens over TCP and
> that the protocol used is not encrypted. Fady previously
> tried a standard "ping" and a "telnet" between the two 
> nodes, and that is the base for him mentioning that "there
> is no network disconnection" between the nodes. 
> Nevertheless, the calling pinging node seems to say that it
> times out without getting a response fom the target node.
> There is evidently a contradiction there. So this could
> still be some kind of network issue.
> 
> Considering that the protocol command for this "ping"
> should be known by someone here, would it not be possible
> to imagine a little program in some scripting language (or
> even java, God forbid), which would open a TCP channel with
> the target node IP/port, send such a "ping" message, wait 
> for a response and report the result ? That would at least
> confirm/deny that the problem is with the network.
> 
> The log below does not for example say if the error happens
> when opening the TCP communication channel, or after
> sending the ping message on it, (Of course, testing the TCP
> open could be done with "telnet IP 4000", but I don't know
> if Fady tried this). Maybe tribes also already contains
> some löw-level debugging options ? wireshark maybe another
> option, but it has quite a learning curve. And this is on
> Windows.
> 
> By the way Fady, are you sure that your "Windows Firewall
> with Enhanced Security" is not just dropping TCP packets
> to/from port 40xx (or from "java.exe") ? There are some
> "network policies" there which can have wide-ranging
> side-effects.
> 
> 
> 
> 
> On 07.02.2017 14:42, Fady Haikal wrote:
>> 
>> 
>> Hi, issue still not fixed. Tomcat session replication is
>> not able to replicate the key from node to node, please
>> find below the error, taking into consideration that
>> there is no network disconnection between 2 nodes
>> 
>> 
>> 07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8]
>> org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo
>> Unable to replicate backup
>> key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2 to
>> backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10,
>> 114, 43, 102}:4000,{10, 114, 43, 102},4000, alive=68841350,
>> securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64
>> 12 -102 -14 -85 -87 15 9 -51 -112 }, payload={},
>> command={}, domain={}, ]. Reason:Operation has timed
>> out(3000 ms.).; Faulty members:tcp://{10, 114, 43,
>> 102}:4000;

Re: Operation has timed out

2017-02-07 Thread Fady Haikal
Process      PID   Protocol  Local address  Local port  Remote address               State
Tomcat8.exe  8160  TCP       imal14-app2    4000        imal14-app1.albaraka.com.sd  ESTABLISHED

On Tue, Feb 7, 2017 at 5:46 PM, Fady Haikal  wrote:
> Yes there is a ESTABLISHED connection, the replication of sessions is
> working fine (port 4000 is for tomcat cluster) but we also faced this
> error on the log file
>
> On Tue, Feb 7, 2017 at 5:44 PM, André Warnier (tomcat)  
> wrote:
>> On 07.02.2017 16:24, Fady Haikal wrote:
>>>
>>> Hi,
>>> telnet IP 4000 is working fine, i installed a tool for network
>>> monitoring at the level of IP and Port and i didnt see any
>>> disconnection,
>>
>>
>> but did you see a *connection* ?
>> I mean, on the pinging node, if you use the Windows "netstat" program, for
>> example as
>> netstat -aon -p TCP
>> you should see a list of connections in the ESTABLISHED state, of which one
>> of the IP/ports should be your target IP:4000 (in the "remote" column).
>> And on the pinged node, this port :4000 should be in the "local" column, in
>> LISTEN mode
>> (and also probably one in the ESTABLISHED state, if they agree.)
>>
>> Is that the case ?
>>
>>
>>
>> and yes i'm sure that no firewall is enabled.
>>>
>>>
>>> I saw some strange think on the server that I have tried to ping the
>>> multicast IP (228.0.0.4) and i get reply from different IPs in the
>>> network, i don't know why and how i get those IPs, after checking with
>>> the network team they told me that those IPs are related to the SAN
>>> storage taking into consideration that the Tomcat servers are not
>>> connected in anyway to that SUN storage.
>>>
>>>
>>> On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat) 
>>> wrote:

 Hi.

 This is for the Tomcat/Tribes experts on the list.

 I know nothing of Tribes, but the on-line documentation seems to say that
 the communication happens over TCP and that the protocol used is not
 encrypted.
 Fady previously tried a standard "ping" and a "telnet" between the two
 nodes, and that is the base for him mentioning that "there is no network
 disconnection" between the nodes.
 Nevertheless, the calling pinging node seems to say that it times out
 without getting a response fom the target node.  There is evidently a
 contradiction there.
 So this could still be some kind of network issue.

 Considering that the protocol command for this "ping" should be known by
 someone here, would it not be possible to imagine a little program in
 some
 scripting language (or even java, God forbid), which would open a TCP
 channel with the target node IP/port, send such a "ping" message, wait
 for a
 response and report the result ?
 That would at least confirm/deny that the problem is with the network.

 The log below does not for example say if the error happens when opening
 the
 TCP communication channel, or after sending the ping message on it,
 (Of course, testing the TCP open could be done with "telnet IP 4000", but
 I
 don't know if Fady tried this).
 Maybe tribes also already contains some löw-level debugging options ?
 wireshark maybe another option, but it has quite a learning curve.
 And this is on Windows.

 By the way Fady, are you sure that your "Windows Firewall with Enhanced
 Security" is not just dropping TCP packets to/from port 40xx (or from
 "java.exe") ? There are some "network policies" there which can have
 wide-ranging side-effects.




 On 07.02.2017 14:42, Fady Haikal wrote:
>
>
> Hi, issue still not fixed. Tomcat session replication is not able to
> replicate the key from node to node, please find below the error,
> taking into consideration that there is no network disconnection
> between 2 nodes
>
>
> 07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8]
> org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo
> Unable to replicate backup
> key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2 to
> backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10,
> 114, 43, 102}:4000,{10, 114, 43, 102},4000, alive=68841350,
> securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64 12 -102 -14
> -85 -87 15 9 -51 -112 }, payload={}, command={}, domain={}, ].
> Reason:Operation has timed out(3000 ms.).; Faulty members:tcp://{10,
> 114, 43, 102}:4000;
>org.apache.catalina.tribes.ChannelException: Operation has timed
> out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000;
> at
>
> org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:108)
> at
>
> org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:48)
> at
>
> 

Re: Operation has timed out

2017-02-07 Thread Fady Haikal
Yes, there is an ESTABLISHED connection. The replication of sessions is
working fine (port 4000 is for the Tomcat cluster), but we still see this
error in the log file.

On Tue, Feb 7, 2017 at 5:44 PM, André Warnier (tomcat)  wrote:
> On 07.02.2017 16:24, Fady Haikal wrote:
>>
>> Hi,
>> telnet IP 4000 is working fine, i installed a tool for network
>> monitoring at the level of IP and Port and i didnt see any
>> disconnection,
>
>
> but did you see a *connection* ?
> I mean, on the pinging node, if you use the Windows "netstat" program, for
> example as
> netstat -aon -p TCP
> you should see a list of connections in the ESTABLISHED state, of which one
> of the IP/ports should be your target IP:4000 (in the "remote" column).
> And on the pinged node, this port :4000 should be in the "local" column, in
> LISTEN mode
> (and also probably one in the ESTABLISHED state, if they agree.)
>
> Is that the case ?
>
>
>
> and yes i'm sure that no firewall is enabled.
>>
>>
>> I saw some strange think on the server that I have tried to ping the
>> multicast IP (228.0.0.4) and i get reply from different IPs in the
>> network, i don't know why and how i get those IPs, after checking with
>> the network team they told me that those IPs are related to the SAN
>> storage taking into consideration that the Tomcat servers are not
>> connected in anyway to that SUN storage.
>>
>>
>> On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat) 
>> wrote:
>>>
>>> Hi.
>>>
>>> This is for the Tomcat/Tribes experts on the list.
>>>
>>> I know nothing of Tribes, but the on-line documentation seems to say that
>>> the communication happens over TCP and that the protocol used is not
>>> encrypted.
>>> Fady previously tried a standard "ping" and a "telnet" between the two
>>> nodes, and that is the base for him mentioning that "there is no network
>>> disconnection" between the nodes.
>>> Nevertheless, the calling pinging node seems to say that it times out
>>> without getting a response fom the target node.  There is evidently a
>>> contradiction there.
>>> So this could still be some kind of network issue.
>>>
>>> Considering that the protocol command for this "ping" should be known by
>>> someone here, would it not be possible to imagine a little program in
>>> some
>>> scripting language (or even java, God forbid), which would open a TCP
>>> channel with the target node IP/port, send such a "ping" message, wait
>>> for a
>>> response and report the result ?
>>> That would at least confirm/deny that the problem is with the network.
>>>
>>> The log below does not for example say if the error happens when opening
>>> the
>>> TCP communication channel, or after sending the ping message on it,
>>> (Of course, testing the TCP open could be done with "telnet IP 4000", but
>>> I
>>> don't know if Fady tried this).
>>> Maybe tribes also already contains some löw-level debugging options ?
>>> wireshark maybe another option, but it has quite a learning curve.
>>> And this is on Windows.
>>>
>>> By the way Fady, are you sure that your "Windows Firewall with Enhanced
>>> Security" is not just dropping TCP packets to/from port 40xx (or from
>>> "java.exe") ? There are some "network policies" there which can have
>>> wide-ranging side-effects.
>>>
>>>
>>>
>>>
>>> On 07.02.2017 14:42, Fady Haikal wrote:


 Hi, issue still not fixed. Tomcat session replication is not able to
 replicate the key from node to node, please find below the error,
 taking into consideration that there is no network disconnection
 between 2 nodes


 07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8]
 org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo
 Unable to replicate backup
 key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2 to
 backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10,
 114, 43, 102}:4000,{10, 114, 43, 102},4000, alive=68841350,
 securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64 12 -102 -14
 -85 -87 15 9 -51 -112 }, payload={}, command={}, domain={}, ].
 Reason:Operation has timed out(3000 ms.).; Faulty members:tcp://{10,
 114, 43, 102}:4000;
org.apache.catalina.tribes.ChannelException: Operation has timed
 out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000;
 at

 org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:108)
 at

 org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:48)
 at

 org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:54)
 at

 org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:82)
 at

 org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76)
 at

 

Re: Operation has timed out

2017-02-07 Thread tomcat

On 07.02.2017 16:24, Fady Haikal wrote:

Hi,
telnet IP 4000 is working fine, i installed a tool for network
monitoring at the level of IP and Port and i didnt see any
disconnection,


but did you see a *connection* ?
I mean, on the pinging node, if you use the Windows "netstat" program, for 
example as
netstat -aon -p TCP
you should see a list of connections in the ESTABLISHED state, of which one of the 
IP/ports should be your target IP:4000 (in the "remote" column).

And on the pinged node, this port :4000 should be in the "local" column, in 
LISTEN mode
(and also probably one in the ESTABLISHED state, if they agree.)

Is that the case ?
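
For anyone repeating this check, here is a minimal sketch (not part of the original exchange) that filters "netstat -aon -p TCP" output down to the rows involving the cluster port. The function name find_port_entries and the sample rows are made up for illustration; the column layout assumed is the usual Windows one (Proto, Local Address, Foreign Address, State, PID).

```python
def find_port_entries(netstat_output, port):
    """Return (local, remote, state) tuples for TCP rows that involve the given port."""
    suffix = ":" + str(port)
    entries = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected row shape: TCP <local addr:port> <remote addr:port> <state> <pid>
        if len(fields) < 4 or fields[0] != "TCP":
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        if local.endswith(suffix) or remote.endswith(suffix):
            entries.append((local, remote, state))
    return entries


# Illustrative sample only; the addresses and ports below are invented.
SAMPLE = """
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:4000           0.0.0.0:0              LISTENING       8160
  TCP    10.114.43.103:4000     10.114.43.102:51234    ESTABLISHED     8160
  TCP    10.114.43.103:8080     10.114.43.50:40001     ESTABLISHED     8160
"""

for local, remote, state in find_port_entries(SAMPLE, 4000):
    print(local, remote, state)
```

On a real node, the text passed in would be the captured output of the netstat command above. On the pinged node one would expect a listening row for :4000 in the local column; on the pinging node, an ESTABLISHED row with :4000 in the remote column.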


and yes i'm sure that no firewall is enabled.


I saw something strange on the server: when I tried to ping the
multicast IP (228.0.0.4), I got replies from different IPs in the
network. I don't know why or how I get those IPs. After checking with
the network team, they told me that those IPs belong to the SAN
storage, even though the Tomcat servers are not connected in any way
to that SAN storage.


On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat)  wrote:

Hi.

This is for the Tomcat/Tribes experts on the list.

I know nothing of Tribes, but the on-line documentation seems to say that
the communication happens over TCP and that the protocol used is not
encrypted.
Fady previously tried a standard "ping" and a "telnet" between the two
nodes, and that is the base for him mentioning that "there is no network
disconnection" between the nodes.
Nevertheless, the pinging node seems to say that it times out
without getting a response from the target node.  There is evidently a
contradiction there.
So this could still be some kind of network issue.

Considering that the protocol command for this "ping" should be known by
someone here, would it not be possible to imagine a little program in some
scripting language (or even java, God forbid), which would open a TCP
channel with the target node IP/port, send such a "ping" message, wait for a
response and report the result ?
That would at least confirm/deny that the problem is with the network.

The log below does not for example say if the error happens when opening the
TCP communication channel, or after sending the ping message on it,
(Of course, testing the TCP open could be done with "telnet IP 4000", but I
don't know if Fady tried this).
Maybe Tribes also already contains some low-level debugging options ?
Wireshark may be another option, but it has quite a learning curve.
And this is on Windows.

By the way Fady, are you sure that your "Windows Firewall with Enhanced
Security" is not just dropping TCP packets to/from port 40xx (or from
"java.exe") ? There are some "network policies" there which can have
wide-ranging side-effects.




On 07.02.2017 14:42, Fady Haikal wrote:


Hi, issue still not fixed. Tomcat session replication is not able to
replicate the key from node to node, please find below the error,
taking into consideration that there is no network disconnection
between 2 nodes


07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8]
org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo
Unable to replicate backup
key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2 to
backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10,
114, 43, 102}:4000,{10, 114, 43, 102},4000, alive=68841350,
securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64 12 -102 -14
-85 -87 15 9 -51 -112 }, payload={}, command={}, domain={}, ].
Reason:Operation has timed out(3000 ms.).; Faulty members:tcp://{10,
114, 43, 102}:4000;
   org.apache.catalina.tribes.ChannelException: Operation has timed
out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000;
at
org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:108)
at
org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:48)
at
org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:54)
at
org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:82)
at
org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76)
at
org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:81)
at
org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76)
at
org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:93)
at
org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76)
at
org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:233)
at
org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:186)
at
org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo(LazyReplicatedMap.java:170)
at

Re: Operation has timed out

2017-02-07 Thread Fady Haikal
Hi,
telnet IP 4000 is working fine. I installed a tool for network
monitoring at the level of IP and port and I didn't see any
disconnection, and yes, I'm sure that no firewall is enabled.

I saw something strange on the server: when I tried to ping the
multicast IP (228.0.0.4), I got replies from different IPs in the
network. I don't know why or how I get those IPs. After checking with
the network team, they told me that those IPs belong to the SAN
storage, even though the Tomcat servers are not connected in any way
to that SAN storage.
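
One way to see which hosts actually send datagrams to the membership group, rather than which hosts answer an ICMP ping, is to join the group briefly and record the sender addresses. The following is only a sketch under assumptions: the group 228.0.0.4 and port 45564 are taken from the McastService settings posted in this thread, the helper names are invented, and whether the port can be shared with a running Tomcat depends on the OS and socket options.

```python
import ipaddress
import socket
import struct
import time

GROUP = "228.0.0.4"  # McastService address from the posted configuration
PORT = 45564         # McastService port from the posted configuration


def is_multicast(address):
    """True if the address lies in the multicast range (224.0.0.0/4 for IPv4)."""
    return ipaddress.ip_address(address).is_multicast


def membership_request(group, interface="0.0.0.0"):
    """Build the 8-byte ip_mreq structure used with IP_ADD_MEMBERSHIP."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def list_senders(group=GROUP, port=PORT, seconds=5.0):
    """Join the group for a few seconds and return the set of sender IPs seen."""
    senders = set()
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        membership_request(group))
        deadline = time.monotonic() + seconds
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            try:
                _data, (ip, _port) = sock.recvfrom(65535)
            except socket.timeout:
                break
            senders.add(ip)
    finally:
        sock.close()
    return senders
```

Calling list_senders() on one of the nodes should, if only the two Tomcat nodes use this group and port, return just their two addresses. Any other address in the result would suggest that the group/port pair is shared with something else on the network.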


On Tue, Feb 7, 2017 at 4:51 PM, André Warnier (tomcat)  wrote:
> Hi.
>
> This is for the Tomcat/Tribes experts on the list.
>
> I know nothing of Tribes, but the on-line documentation seems to say that
> the communication happens over TCP and that the protocol used is not
> encrypted.
> Fady previously tried a standard "ping" and a "telnet" between the two
> nodes, and that is the base for him mentioning that "there is no network
> disconnection" between the nodes.
> Nevertheless, the calling pinging node seems to say that it times out
> without getting a response fom the target node.  There is evidently a
> contradiction there.
> So this could still be some kind of network issue.
>
> Considering that the protocol command for this "ping" should be known by
> someone here, would it not be possible to imagine a little program in some
> scripting language (or even java, God forbid), which would open a TCP
> channel with the target node IP/port, send such a "ping" message, wait for a
> response and report the result ?
> That would at least confirm/deny that the problem is with the network.
>
> The log below does not for example say if the error happens when opening the
> TCP communication channel, or after sending the ping message on it,
> (Of course, testing the TCP open could be done with "telnet IP 4000", but I
> don't know if Fady tried this).
> Maybe tribes also already contains some löw-level debugging options ?
> wireshark maybe another option, but it has quite a learning curve.
> And this is on Windows.
>
> By the way Fady, are you sure that your "Windows Firewall with Enhanced
> Security" is not just dropping TCP packets to/from port 40xx (or from
> "java.exe") ? There are some "network policies" there which can have
> wide-ranging side-effects.
>
>
>
>
> On 07.02.2017 14:42, Fady Haikal wrote:
>>
>> Hi, issue still not fixed. Tomcat session replication is not able to
>> replicate the key from node to node, please find below the error,
>> taking into consideration that there is no network disconnection
>> between 2 nodes
>>
>>
>> 07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8]
>> org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo
>> Unable to replicate backup
>> key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2 to
>> backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10,
>> 114, 43, 102}:4000,{10, 114, 43, 102},4000, alive=68841350,
>> securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64 12 -102 -14
>> -85 -87 15 9 -51 -112 }, payload={}, command={}, domain={}, ].
>> Reason:Operation has timed out(3000 ms.).; Faulty members:tcp://{10,
>> 114, 43, 102}:4000;
>>   org.apache.catalina.tribes.ChannelException: Operation has timed
>> out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000;
>> at
>> org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:108)
>> at
>> org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:48)
>> at
>> org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:54)
>> at
>> org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:82)
>> at
>> org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76)
>> at
>> org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:81)
>> at
>> org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76)
>> at
>> org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:93)
>> at
>> org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76)
>> at
>> org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:233)
>> at
>> org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:186)
>> at
>> org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo(LazyReplicatedMap.java:170)
>> at
>> org.apache.catalina.tribes.tipis.AbstractReplicatedMap.put(AbstractReplicatedMap.java:1040)
>> at
>> org.apache.catalina.tribes.tipis.AbstractReplicatedMap.put(AbstractReplicatedMap.java:1024)
>> at org.apache.catalina.session.ManagerBase.add(ManagerBase.java:647)
>> at
>> 

Re: Operation has timed out

2017-02-07 Thread tomcat

Hi.

This is for the Tomcat/Tribes experts on the list.

I know nothing of Tribes, but the on-line documentation seems to say that the 
communication happens over TCP and that the protocol used is not encrypted.
Fady previously tried a standard "ping" and a "telnet" between the two nodes, and that is 
the basis for him mentioning that "there is no network disconnection" between the nodes.
Nevertheless, the pinging node seems to say that it times out without getting a 
response from the target node.  There is evidently a contradiction there.

So this could still be some kind of network issue.

Considering that the protocol command for this "ping" should be known by someone here, 
would it not be possible to imagine a little program in some scripting language (or even 
java, God forbid), which would open a TCP channel with the target node IP/port, send such 
a "ping" message, wait for a response and report the result ?

That would at least confirm/deny that the problem is with the network.

The log below does not for example say if the error happens when opening the TCP 
communication channel, or after sending the ping message on it,
(Of course, testing the TCP open could be done with "telnet IP 4000", but I don't know if 
Fady tried this).
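
As a starting point for such a little program, here is a minimal sketch that only covers the first half of the question: whether the TCP channel to the target IP/port can be opened within a timeout. It does not send a Tribes-level "ping", because the message format is not given anywhere in this thread. The function name probe_tcp is invented, and the 3-second default merely mirrors the "3000 ms" in the log.

```python
import socket
import time


def probe_tcp(host, port, timeout=3.0):
    """Try to open a TCP connection; return (ok, elapsed_seconds, detail)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and applies the timeout to connect()
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, "connected"
    except OSError as exc:
        # Covers refused connections, timeouts and name resolution failures
        return False, time.monotonic() - start, f"{type(exc).__name__}: {exc}"
```

For example, probe_tcp("10.114.43.102", 4000) run on the other node would show whether the open succeeds and how long it takes. If the open is consistently fast and successful, the timeout is more likely to occur after the connection is established, which is the distinction the log does not make.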

Maybe Tribes also already contains some low-level debugging options ?
Wireshark may be another option, but it has quite a learning curve.
And this is on Windows.

By the way Fady, are you sure that your "Windows Firewall with Enhanced Security" is not 
just dropping TCP packets to/from port 40xx (or from "java.exe") ? There are some "network 
policies" there which can have wide-ranging side-effects.




On 07.02.2017 14:42, Fady Haikal wrote:

Hi, issue still not fixed. Tomcat session replication is not able to
replicate the key from node to node, please find below the error,
taking into consideration that there is no network disconnection
between 2 nodes


07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8]
org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo
Unable to replicate backup
key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2 to
backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10,
114, 43, 102}:4000,{10, 114, 43, 102},4000, alive=68841350,
securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64 12 -102 -14
-85 -87 15 9 -51 -112 }, payload={}, command={}, domain={}, ].
Reason:Operation has timed out(3000 ms.).; Faulty members:tcp://{10,
114, 43, 102}:4000;
  org.apache.catalina.tribes.ChannelException: Operation has timed
out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000;
at 
org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:108)
at 
org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:48)
at 
org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:54)
at 
org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:82)
at 
org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76)
at 
org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:81)
at 
org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76)
at 
org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:93)
at 
org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76)
at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:233)
at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:186)
at 
org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo(LazyReplicatedMap.java:170)
at 
org.apache.catalina.tribes.tipis.AbstractReplicatedMap.put(AbstractReplicatedMap.java:1040)
at 
org.apache.catalina.tribes.tipis.AbstractReplicatedMap.put(AbstractReplicatedMap.java:1024)
at org.apache.catalina.session.ManagerBase.add(ManagerBase.java:647)
at org.apache.catalina.session.StandardSession.setId(StandardSession.java:374)
at org.apache.catalina.ha.session.DeltaSession.setId(DeltaSession.java:279)
at org.apache.catalina.session.ManagerBase.createSession(ManagerBase.java:708)
at org.apache.catalina.connector.Request.doGetSession(Request.java:2936)
at org.apache.catalina.connector.Request.getSession(Request.java:2260)
at 
org.apache.catalina.connector.RequestFacade.getSession(RequestFacade.java:895)
at 
javax.servlet.http.HttpServletRequestWrapper.getSession(HttpServletRequestWrapper.java:231)
at 
org.apache.catalina.core.ApplicationHttpRequest.getSession(ApplicationHttpRequest.java:568)
at 
org.apache.catalina.core.ApplicationHttpRequest.getSession(ApplicationHttpRequest.java:513)
at 
org.apache.jasper.runtime.PageContextImpl.initialize(PageContextImpl.java:137)
at 

Re: Operation has timed out

2017-02-07 Thread Fady Haikal
Hi, the issue is still not fixed. Tomcat session replication is not able
to replicate the key from node to node. Please find the error below;
note that there is no network disconnection between the 2 nodes.


07-Feb-2017 16:36:06.186 SEVERE [http-nio-8080-exec-8]
org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo
Unable to replicate backup
key:58291D242C742A8A4B1657BA42C831A4.TomcatNode2 to
backup:org.apache.catalina.tribes.membership.MemberImpl[tcp://{10,
114, 43, 102}:4000,{10, 114, 43, 102},4000, alive=68841350,
securePort=-1, UDP Port=-1, id={85 5 -62 -66 106 -12 64 12 -102 -14
-85 -87 15 9 -51 -112 }, payload={}, command={}, domain={}, ].
Reason:Operation has timed out(3000 ms.).; Faulty members:tcp://{10,
114, 43, 102}:4000;
 org.apache.catalina.tribes.ChannelException: Operation has timed
out(3000 ms.).; Faulty members:tcp://{10, 114, 43, 102}:4000;
at 
org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:108)
at 
org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:48)
at 
org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:54)
at 
org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:82)
at 
org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76)
at 
org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:81)
at 
org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76)
at 
org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:93)
at 
org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:76)
at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:233)
at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:186)
at 
org.apache.catalina.tribes.tipis.LazyReplicatedMap.publishEntryInfo(LazyReplicatedMap.java:170)
at 
org.apache.catalina.tribes.tipis.AbstractReplicatedMap.put(AbstractReplicatedMap.java:1040)
at 
org.apache.catalina.tribes.tipis.AbstractReplicatedMap.put(AbstractReplicatedMap.java:1024)
at org.apache.catalina.session.ManagerBase.add(ManagerBase.java:647)
at org.apache.catalina.session.StandardSession.setId(StandardSession.java:374)
at org.apache.catalina.ha.session.DeltaSession.setId(DeltaSession.java:279)
at org.apache.catalina.session.ManagerBase.createSession(ManagerBase.java:708)
at org.apache.catalina.connector.Request.doGetSession(Request.java:2936)
at org.apache.catalina.connector.Request.getSession(Request.java:2260)
at 
org.apache.catalina.connector.RequestFacade.getSession(RequestFacade.java:895)
at 
javax.servlet.http.HttpServletRequestWrapper.getSession(HttpServletRequestWrapper.java:231)
at 
org.apache.catalina.core.ApplicationHttpRequest.getSession(ApplicationHttpRequest.java:568)
at 
org.apache.catalina.core.ApplicationHttpRequest.getSession(ApplicationHttpRequest.java:513)
at 
org.apache.jasper.runtime.PageContextImpl.initialize(PageContextImpl.java:137)
at 
org.apache.jasper.runtime.JspFactoryImpl.internalGetPageContext(JspFactoryImpl.java:109)
at 
org.apache.jasper.runtime.JspFactoryImpl.getPageContext(JspFactoryImpl.java:60)
at org.apache.jsp.WEB_002dINF.jsp._401_jsp._jspService(_401_jsp.java:100)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
at 
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:438)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:396)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:340)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:291)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at 
org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:719)
at 
org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:467)
at 
org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:390)
at 
org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:317)
at org.apache.catalina.core.StandardHostValve.custom(StandardHostValve.java:445)
at org.apache.catalina.core.StandardHostValve.status(StandardHostValve.java:304)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:181)
at 
org.apache.catalina.ha.session.JvmRouteBinderValve.invoke(JvmRouteBinderValve.java:194)
at org.apache.catalina.ha.tcp.ReplicationValve.invoke(ReplicationValve.java:318)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)
at 

AW: Operation has timed out

2017-02-07 Thread Kreuser, Peter
Fady,

Sorry for top posting.

If I remember correctly, the Cluster Element goes into the Container and not 
the Host.
Plus I see in our (working) case, a DeltaManager and a 
JvmRouteSessionIDBinderListener


...


Besides this, only ports, limits and values are different.

You may want to filter out the replication for static resources such as GIF, 
JPG or CSS files.
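
To illustrate what such a filter does, here is a small sketch. It assumes that the filter attribute of the ReplicationValve (empty in the configuration quoted below) is interpreted as a regular expression that must match the whole request URI; please check the cluster valve documentation for your Tomcat version before relying on that. The pattern and the function name are illustrative and are not taken from our configuration.

```python
import re

# Assumed example pattern; the list of extensions is illustrative only.
STATIC_FILTER = re.compile(r".*\.gif|.*\.jpg|.*\.css")


def skips_replication(request_uri):
    """True if the whole URI matches the filter, i.e. it is treated as a static resource."""
    return STATIC_FILTER.fullmatch(request_uri) is not None


for uri in ["/app/images/logo.gif", "/app/css/site.css", "/app/login.jsp"]:
    print(uri, skips_replication(uri))
```

Requests whose URI matches would not trigger session replication, which reduces cluster traffic for pages that pull in many images and stylesheets.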


Best regards

Peter

> Below is the server.xml configuration. As mentioned earlier, the issue
> is related to the cluster configuration, and from my research I can
> see that some users are facing the same issue, but I didn't find a
> solution for it.
> 
> 
> 
> 
> 
> 
>   
>   
>   
>SSLEngine="on" />
>   
>className="org.apache.catalina.core.JreMemoryLeakPreventionListener"
> />
>className="org.apache.catalina.mbeans.GlobalResourcesLifecycleListener"
> />
>className="org.apache.catalina.core.ThreadLocalLeakPreventionListener"
> />
> 
>   
>   
> 
>type="org.apache.catalina.UserDatabase"
>   description="User database that can be updated and saved"
>   factory="org.apache.catalina.users.MemoryUserDatabaseFactory"
>   pathname="conf/tomcat-users.xml" />
>   
> 
>   
>   
> 
> 
> 
> 
> 
> 
> connectionTimeout="6" maxThreads="500"
> minSpareThreads="25" maxSpareThreads="75" enableLookups="false"
> disableUploadTimeout="true" acceptCount="100" redirectPort="8443" />
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
>   
>   
> 
> resourceName="UserDatabase"/>
>   
> 
>unpackWARs="true" autoDeploy="true" startStopThreads="0">
>   
>   
> 
>channelSendOptions="4">
>   
>  className="org.apache.catalina.tribes.membership.McastService"
> address="228.0.0.4"
> port="45564"
> frequency="500"
> dropTime="9000"/>
>  className="org.apache.catalina.tribes.transport.nio.NioReceiver"
>   address="auto"
>   port="4000"
>   autoBind="100"
>   selectorTimeout="5000"
>   maxThreads="6"/>
> 
>  className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
>className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
> 
>  className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
>  className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
>   
> 
> filter=""/>
>className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>
> 
>tempDir="D:/imaljava/TomcatNode1/tmp/war-temp/"
> deployDir="D:/imaljava/TomcatNode1/tmp/war-deploy/"
> watchDir="D:/imaljava/TomcatNode1/tmp/war-listen/"
> watchEnabled="false"/>
> 
>className="org.apache.catalina.ha.session.ClusterSessionListener"/>
> 
> 
> 
> 
> 
>  directory="logs"
>prefix="localhost_access_log" suffix=".txt"
>pattern="%h %l %u %t %r %s %b" />
>   threshold="900" />
> 
>   
> 
>   
> 
> 
> On Mon, Feb 6, 2017 at 6:51 PM, André Warnier (tomcat)  
> wrote:
> > On 06.02.2017 17:45, Fady Haikal wrote:
> >>
> >> Hi,
> >> What is the host OS ? Windows Server 2012
> >> What is the Tomcat version ? Apache Tomcat/8.0.30
> >>
> >> Is this problem new ? was this working before ? how long ? Since
> >> cluster implementation
> >>
> >
> > I still don't know tribes, but my uneducated guess at this point
> > would be that there is something wrong in your configuration.
> > Can you copy/paste it here? (Remove sensitive things like passwords, public
> > IP addresses, etc., but not to the point of making it uncheckable.)
> >
> > Then maybe some tribes-specialist can take over ?
> >
> >
> >>
> >> Is there actually something listening on that address/port ? Tomcat
> >> cluster
> >>
> >> Port 4000 is listening and there is no disconnection between the 2
> >> nodes; ping and telnet are OK.
> >>
> >> On Mon, Feb 6, 2017 at 6:42 PM, André Warnier (tomcat) 
> >> wrote:
> >>>
> >>> On 06.02.2017 17:24, Fady Haikal wrote:
> 
> 
>  Please, can I get some help here?
>  This issue is still occurring and it's filling the log file on the
>  production server.
> 
>  Regards,
>  Fady
> >>>
> >>>
> >>>
> >>> Hi.
> >>> If you want quick answers, you should provide more information.
> >>> What is the host OS ?
> >>> What is the Tomcat version ?
> >>> Is this problem new ? was this working before ? how long ?
> >>>
> >>> I do not know tribes at