DO NOT REPLY [Bug 28161] - Replication messages get lost with AsyncSocketSender

2004-04-07 Thread bugzilla
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
http://issues.apache.org/bugzilla/show_bug.cgi?id=28161.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=28161

Replication messages get lost with AsyncSocketSender

[EMAIL PROTECTED] changed:

       What|Removed |Added
-----------+--------+---------
     Status|REOPENED|RESOLVED
 Resolution|        |FIXED



--- Additional Comments From [EMAIL PROTECTED]  2004-04-07 20:41 ---
The DeltaManager now assigns a different ID to its messages, which makes 
them unique. This way all the messages will get through.
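
To illustrate the idea, here is a minimal sketch of a per-message ID. It is not 
the actual DeltaManager change; the class name UniqueMessageIds, the method 
next() and the "sessionId-counter" format are assumptions made up for this 
example. The point is only that two messages for the same session no longer 
share a key, so a queue that replaces entries by key keeps both.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not Tomcat code: builds a key that is unique per message.
class UniqueMessageIds {
    // Monotonic counter shared by all messages of this manager instance.
    private final AtomicLong counter = new AtomicLong();

    // Combines the session ID with the counter so no two messages share a key.
    String next(String sessionId) {
        return sessionId + "-" + counter.incrementAndGet();
    }

    public static void main(String[] args) {
        UniqueMessageIds ids = new UniqueMessageIds();
        String created = ids.next("ABC123"); // key for the "session created" message
        String delta = ids.next("ABC123");   // key for the following delta message
        System.out.println(created + " / " + delta); // prints ABC123-1 / ABC123-2
    }
}
```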

However, async could be pooled in the same way that synchronous is in 
pooled mode, which would increase performance a lot :)

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



DO NOT REPLY [Bug 28161] - Replication messages get lost with AsyncSocketSender

2004-04-05 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-04-05 16:39 ---
Maybe interesting: a simple Hello World JSP does NOT trigger the problem. But 
once I add the scriptlet <% session.setAttribute(COUNT,new Integer(10)); %>
the problem arises.

I can now reproduce this without any Apache or mod_jk involvement. I just 
start a 2-node cluster and call the JSP via direct HTTP connections to 
node #1 200 times, with a delay of 100ms between calls.
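
A test client along these lines could look like the sketch below. It is not 
the client used for the report; the class name SessionLoadClient and the 
default URL are placeholders. It relies on the fact that HttpURLConnection 
sends no cookies unless a CookieHandler is installed, so every call should 
start a new session on the node.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical reproduction client: calls one URL repeatedly without cookies.
class SessionLoadClient {
    // Returns how many of the calls were answered with HTTP 200.
    static int callRepeatedly(String url, int calls, long delayMillis) {
        int ok = 0;
        for (int i = 0; i < calls; i++) {
            try {
                HttpURLConnection conn =
                        (HttpURLConnection) URI.create(url).toURL().openConnection();
                if (conn.getResponseCode() == HttpURLConnection.HTTP_OK) {
                    ok++;
                    try (InputStream in = conn.getInputStream()) {
                        while (in.read() != -1) { /* drain the response body */ }
                    }
                }
                conn.disconnect();
                Thread.sleep(delayMillis); // pause between calls
            } catch (IOException e) {
                System.err.println("request " + i + " failed: " + e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        // Placeholder URL; pass the real JSP address of node #1 as first argument.
        String url = args.length > 0 ? args[0] : "http://localhost:8080/test/count.jsp";
        System.out.println("successful calls: " + callRepeatedly(url, 200, 100));
    }
}
```

Afterwards the session counts of the two nodes can be compared, for example 
via /manager/html.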

After calling it 200 times, I find 200 sessions on node #1, but only between 
170 and 195 sessions on node #2. I check the session count via /manager/html, 
but I also added debug output to confirm that some sessions are indeed missing.

I will try to dig deeper into the cluster messages and the queue handling.




DO NOT REPLY [Bug 28161] - Replication messages get lost with AsyncSocketSender

2004-04-05 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-04-05 17:26 ---
Next info: if I use this JSP, then synchronous and pooled are both EXTREMELY 
slow, with response times between 1000ms and 5000ms. As soon as I reduce 
tcpSelectorTimeout from 1000 to 10, I get more reasonable response times 
(10-50ms). Any idea why tcpSelectorTimeout shows such a tremendous effect?

Then, when I use multiple parallel clients, synchronous again gets too slow, 
so only pooled is an alternative. Synchronous once showed a freeze (no more 
answers) for 15 seconds.

Neither synchronous nor pooled shows the problem of missing sessions.

Nevertheless, I much prefer the idea of having one or a few dedicated 
replication connections fed by a queue of work and not directly coupled to 
the completion of the original response (asynchronous).
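
The design described in the last paragraph can be sketched roughly as follows. 
This is an illustration only, not the AsyncSocketSender code: the class 
AsyncReplicationSender and its methods are made up, and an in-memory list 
stands in for the socket. Request threads only enqueue; one dedicated thread 
takes messages off a FIFO queue and sends them in the original order.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: request threads enqueue, one dedicated thread sends.
class AsyncReplicationSender {
    // Sentinel instance that tells the worker to stop (compared by identity).
    private static final String STOP = new String("stop");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    // Stand-in for the replication socket: records what was "sent".
    private final List<String> sent = new CopyOnWriteArrayList<>();
    private final Thread worker = new Thread(this::drainLoop, "replication-sender");

    void start() {
        worker.start();
    }

    // Called from the request thread; returns without waiting for the send.
    void enqueue(String message) {
        queue.add(message);
    }

    // Lets the worker finish everything queued so far, then waits for it.
    void stopAndWait() {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    List<String> sent() {
        return sent;
    }

    private void drainLoop() {
        try {
            while (true) {
                String message = queue.take(); // blocks until work arrives
                if (message == STOP) {
                    return;
                }
                sent.add(message); // a real sender would write to the socket here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because the queue is strictly first-in, first-out and nothing is replaced, 
every message is delivered, and in the order it was enqueued.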




DO NOT REPLY [Bug 28161] - Replication messages get lost with AsyncSocketSender

2004-04-05 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-04-05 19:11 ---
First: there are approx. 6 System.out.print/ln calls in the cluster code. One 
of these (line 71 in SmartQueue.java) prevented me from finding the solution 
earlier.

Here is the SOLUTION: What happens is that the smart feature of the smart 
queue gets us into trouble. For my JSP two session messages are being sent. One 
is of type 1 (EVT_SESSION_CREATED), and the second one is of type 13 
(EVT_SESSION_DELTA). Both are sent very close to each other during the only 
request in a session.

Most of the time the system is fast enough to handle each message individually 
before the next message is put into the queue. Every now and then the message 
of type 1 is not read from the queue before type 13 is generated. Then the 
queue replaces the type 1 message in the queue with the type 13 message, and 
only the type 13 message is sent out. The receiving side then seems not to 
create the session, since the type 1 message is missing. I didn't check this 
last point, because I think this is much clearer for you.
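
The effect can be shown with a small stand-in for such a queue. This is not 
the SmartQueue source; KeyedReplacingQueue is a made-up class, and the 
assumption is only that pending entries are keyed by session ID and a newer 
entry replaces an older one with the same key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for a "smart" queue; NOT the actual SmartQueue source.
class KeyedReplacingQueue {
    // Insertion-ordered map holding at most one pending entry per key.
    private final Map<String, String> pending = new LinkedHashMap<>();

    // Adding under an existing key silently replaces the pending entry.
    synchronized void add(String key, String message) {
        pending.put(key, message);
    }

    // Removes and returns everything that is currently pending.
    synchronized List<String> drain() {
        List<String> out = new ArrayList<>(pending.values());
        pending.clear();
        return out;
    }

    public static void main(String[] args) {
        KeyedReplacingQueue queue = new KeyedReplacingQueue();
        // Both messages of the single request use the session ID as key
        // and arrive before the sender thread has drained the queue.
        queue.add("ABC123", "EVT_SESSION_CREATED");
        queue.add("ABC123", "EVT_SESSION_DELTA");
        System.out.println(queue.drain()); // prints [EVT_SESSION_DELTA]
    }
}
```

If the sender drains the queue between the two add() calls, both messages go 
out, which matches the observation that only some sessions are lost.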

Isn't there a general problem in using the DeltaManager together with the 
smart queue? Since you only send out delta messages, it doesn't look like a 
good idea to replace pending messages with newer ones. In fact, isn't it 
necessary to send all deltas and furthermore to make sure that they are sent 
in the original order?

At least this makes clear why the problem only shows up in asynchronous 
mode. In synchronous mode you will always send all messages (and in the right 
order).

Maybe it suffices to strip the smart feature from the smart queue?




DO NOT REPLY [Bug 28161] - Replication messages get lost with AsyncSocketSender

2004-04-02 Thread bugzilla

[EMAIL PROTECTED] changed:

       What|Removed |Added
-----------+--------+---------
     Status|NEW     |RESOLVED
 Resolution|        |WONTFIX



--- Additional Comments From [EMAIL PROTECTED]  2004-04-02 15:57 ---
Async data send means that there is no time guarantee for when the session is 
delivered. The session should not get lost without any error trace in the logs.
I am still debating whether to remove this feature altogether, but I left it 
in for people to play with. I have not found a case where async is useful, but 
I am sure there is one, which is why it is still there. Most of the time people 
want to be sure that the session gets replicated; that is where pooled mode 
comes in.

Also, from experience, using mod_jk under high load can result in lost 
sessions, because it sometimes messes up the request and loses the session ID.
In my experience, pen (siag.nu/pen) works better as a load balancer.

I strongly suggest retrying the same test with replicationMode=pooled to 
see if you get better results. 
Pooled means that replication is synchronous, but on concurrent channels.

Filip




DO NOT REPLY [Bug 28161] - Replication messages get lost with AsyncSocketSender

2004-04-02 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-04-02 16:45 ---
I respect your suggestion not to use asynchronous, although it looked to me
like the right way to do it.

Just for your information: the messages really get lost. Even after we stop the
load, the missing messages don't get replicated. So it's not just a problem of
messages getting replicated too late.

There are definitely only debug log statements all the time, except for a few
info messages giving mean values for replication data size. There are no other
non-debug log statements on any cluster node. Also, from what I see, I'm pretty
sure that the replication data is written to the socket.

Concerning mod_jk: for this test case we used each session only once, so the
correctness of the response through mod_jk didn't really matter. We could
easily reproduce the same situation using the built-in Tomcat HTTP Connector
(although we haven't done so until now).

We will retry using pooled, although I don't like the idea of having up to 25
connections (code constant) and threads for each pair of nodes in the cluster.
Also, I had the impression that in pooled mode TCP connections are only used
for a very short time (I think I remember only 100 messages?). This application
will be under heavy load in production.

Why do I think asynchronous fits better?

In any synchronous situation, if the replication is not fast enough I
immediately get negative consequences for the application from the user's point
of view, because the request blocks resources needed for accepting new requests
as long as the replication hasn't finished. So if replication is slow for a few
seconds, I'm in danger of losing all free Apache slots or Tomcat worker threads
for incoming requests.

When I do asynchronous replication, I only lose timely replication of the
session changes. If I route my requests to the primary container, then I still
profit from the cluster with respect to availability and serviceability (I can
shut down one of the containers without users losing sessions). For these
features it doesn't really matter whether all requests are replicated within
milliseconds all the time.

I'm sorry to bother you, but I think it's an important discussion.




DO NOT REPLY [Bug 28161] - Replication messages get lost with AsyncSocketSender

2004-04-02 Thread bugzilla

[EMAIL PROTECTED] changed:

       What|Removed |Added
-----------+--------+---------
     Status|RESOLVED|REOPENED
 Resolution|WONTFIX |



--- Additional Comments From [EMAIL PROTECTED]  2004-04-02 17:01 ---
> We will retry using pooled, although I don't like the idea of having up to 25

This is not really a big resource issue, since no threads are holding on to 
these connections; they just grab one from the queue when it is available, 
then return it.
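
The borrow-and-return pattern described here can be sketched as follows. It is 
not the pooled sender implementation; SenderPool and its nested Connection 
class are placeholders, and the sketch creates all connections up front, which 
the real code may not do. The size of 25 is the constant mentioned earlier in 
this thread.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical pool; "Connection" is a placeholder, not a Tomcat class.
class SenderPool {
    static final class Connection {
        final int id;

        Connection(int id) {
            this.id = id;
        }
    }

    // Idle connections wait here; no thread is tied to any of them.
    private final BlockingQueue<Connection> idle;

    SenderPool(int size) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(new Connection(i));
        }
    }

    // Takes an idle connection, waiting if all of them are in use.
    Connection borrow() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a connection", e);
        }
    }

    // Hands the connection back so the next request thread can use it.
    void giveBack(Connection connection) {
        idle.add(connection);
    }

    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        SenderPool pool = new SenderPool(25);
        Connection c = pool.borrow();
        System.out.println("idle while sending: " + pool.idleCount()); // prints 24
        pool.giveBack(c);
        System.out.println("idle afterwards: " + pool.idleCount());    // prints 25
    }
}
```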

Let's reopen this bug re: async. Once I get all moved in and have my computers 
set up, I can start testing this again.

Filip




DO NOT REPLY [Bug 28161] - Replication messages get lost with AsyncSocketSender

2004-04-02 Thread bugzilla

--- Additional Comments From [EMAIL PROTECTED]  2004-04-02 17:04 ---
Also, the problem with async is that it uses only one channel; hence, 
during heavy load, you will not get millisecond throughput, because it queues 
all the messages.

The solution would be to make an async pooled mode.
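
One possible shape for such a mode is sketched below. Nothing like this exists 
in the code discussed here; PooledAsyncSender, its transport callback and the 
routing by session ID hash are all assumptions of this example. Each channel 
has its own queue and thread, and all messages of one session go to the same 
channel, so they stay in order while different sessions are sent in parallel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

// Hypothetical "async pooled" sender: several queues, each with one sender thread.
class PooledAsyncSender {
    private final List<ExecutorService> channels = new ArrayList<>();
    // Stand-in for the socket write: receives (sessionId, message).
    private final BiConsumer<String, String> transport;

    PooledAsyncSender(int channelCount, BiConsumer<String, String> transport) {
        this.transport = transport;
        for (int i = 0; i < channelCount; i++) {
            // A single-thread executor is a FIFO queue plus one worker thread.
            channels.add(Executors.newSingleThreadExecutor());
        }
    }

    // Returns immediately; the message is sent later on the session's channel.
    void send(String sessionId, String message) {
        int index = Math.floorMod(sessionId.hashCode(), channels.size());
        channels.get(index).execute(() -> transport.accept(sessionId, message));
    }

    // Stops accepting work and waits until the queued messages are sent.
    void shutdownAndWait() {
        for (ExecutorService channel : channels) {
            channel.shutdown();
        }
        try {
            for (ExecutorService channel : channels) {
                channel.awaitTermination(10, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        PooledAsyncSender sender = new PooledAsyncSender(
                4, (sessionId, message) -> System.out.println(sessionId + ": " + message));
        sender.send("ABC123", "EVT_SESSION_CREATED");
        sender.send("ABC123", "EVT_SESSION_DELTA");
        sender.shutdownAndWait();
    }
}
```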
