Re: Cluster fixes - Need Coordination of work

2005-04-16 Thread Peter Rossbach
Hey Filip,
your help is very welcome.
Filip Hanik - Dev lists wrote:
I ran some load tests with the pooled mode and the clustering stats 
are looking good.
next week I am expecting to dig a little bit deeper into the code, but 
so far it is looking pretty good,

Well, that is very good news.
I am getting an increased number of incomplete responses, such as 302 
redirects from tomcat, but that can also be the load balancer or the 
client scrambling the headers making an incomplete request.

I have tested with mod_jk 1.2.10 load balancing and Apache 2.0.52/53 
(Windows XP, SuSE 9.1), and next week I will start some tests with a 
Cisco LB in combination with a lot of Apaches/Tomcats (8 Apaches, and 
on every host a cluster domain of 3 Tomcats).

I don't see those 302s.
I am glad you removed the compress flag, I am not sure what that was 
to begin with as if I remember it correctly, messages were already 
being compressed, and during profiling, this had little impact on 
performance

In my profiling, the compress mode is only useful when you have large 
replication messages (8k bytes),
but it uses more CPU (20-30% more). I did not remove the compress 
flag; I have disabled it by default. It is a sender/receiver 
attribute. The attributes waitForAck and compress were transferred to 
the Receiver:

   <Receiver
      className="org.apache.catalina.cluster.tcp.SocketReplicationListener"
      tcpListenAddress="@node.clustertcp.address@"
      tcpListenPort="@node.clustertcp.port@"
      doReceivedProcessingStats="true" />
   <Sender
      className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
      replicationMode="fastasyncqueue"
      compress="true"
      doTransmitterProcessingStats="true"
      doProcessingStats="true"
      doWaitAckStats="true"
      queueTimeWait="true"
      queueDoStats="true"
      queueCheckLock="true"
      ackTimeout="15000"
      waitForAck="true"
      autoConnect="false"
      keepAliveTimeout="8"
      keepAliveMaxRequestCount="-1" />

One of my ideas is:
Change the cluster protocol so that developers can add their own data 
serialization/deserialization format (high risk).

 Currently:
   header        6 bytes (FLT2002)
   data.length   4 bytes
   data
   end header    6 bytes (TLF2003)

 Optimized to:
   header        2 bytes (TC)
   type          1 byte
   compressflag  1 byte
   data.length   4 bytes
   data, or (when compressed) the real uncompressed data.length 
   (4 bytes) followed by the data

   type: a user-defined type; the receiver extracts the bytes and the 
   type and sends them to a callback (see ObjectReader or 
   SocketObjectReader).
   compressflag 1: the first 4 data bytes are the real uncompressed 
   data length (this is for better memory management at the receiver 
   side, see XByteBuffer).
   Override the ClusterSender and ClusterReceiver 
   serialize/deserialize methods.

- Then we can set up a flag on ClusterMessage or make an on-the-fly 
decision to compress data.
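
To make the proposed layout concrete, here is a minimal Java sketch of 
how such a frame could be written and read. It is only an illustration 
of the idea above, not code from the cluster module: the class name 
ClusterFrameSketch, the Frame holder, the 8 KB threshold used for the 
on-the-fly compress decision, the big-endian length fields and the use 
of java.util.zip.Deflater are all assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical sketch of the proposed "TC" frame; not Tomcat code. */
final class ClusterFrameSketch {
    static final byte[] HEADER = {'T', 'C'};        // 2-byte header from the proposal
    static final int COMPRESS_THRESHOLD = 8 * 1024; // assumption: compress only large payloads

    /** Decoded frame: user-defined type plus the uncompressed payload. */
    static final class Frame {
        final byte type;
        final byte[] payload;
        Frame(byte type, byte[] payload) { this.type = type; this.payload = payload; }
    }

    /** header(2) | type(1) | compressflag(1) | data.length(4) | data */
    static byte[] encode(byte type, byte[] payload) {
        boolean compress = payload.length > COMPRESS_THRESHOLD; // on-the-fly decision
        byte[] data = payload;
        if (compress) {
            byte[] packed = deflate(payload);
            // Compressed data is prefixed with the real uncompressed length.
            data = ByteBuffer.allocate(4 + packed.length)
                    .putInt(payload.length).put(packed).array();
        }
        return ByteBuffer.allocate(HEADER.length + 2 + 4 + data.length)
                .put(HEADER).put(type).put((byte) (compress ? 1 : 0))
                .putInt(data.length).put(data).array();
    }

    static Frame decode(byte[] frame) {
        ByteBuffer in = ByteBuffer.wrap(frame);
        if (in.remaining() < 8 || in.get() != HEADER[0] || in.get() != HEADER[1]) {
            throw new IllegalArgumentException("not a TC frame");
        }
        byte type = in.get();
        boolean compressed = in.get() == 1;
        int length = in.getInt();
        if (length < 0 || length > in.remaining() || (compressed && length < 4)) {
            throw new IllegalArgumentException("bad data.length: " + length);
        }
        if (!compressed) {
            byte[] payload = new byte[length];
            in.get(payload);
            return new Frame(type, payload);
        }
        int realLength = in.getInt(); // lets the receiver size its buffer up front
        if (realLength < 0) {
            throw new IllegalArgumentException("bad uncompressed length: " + realLength);
        }
        byte[] packed = new byte[length - 4];
        in.get(packed);
        return new Frame(type, inflate(packed, realLength));
    }

    private static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    private static byte[] inflate(byte[] packed, int realLength) {
        Inflater inflater = new Inflater();
        inflater.setInput(packed);
        byte[] out = new byte[realLength];
        try {
            int off = 0;
            while (off < realLength && !inflater.finished()) {
                int n = inflater.inflate(out, off, realLength - off);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) break;
                off += n;
            }
            if (off != realLength) throw new IllegalArgumentException("length mismatch");
            return out;
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed payload", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] big = new byte[20000]; // above the threshold, so it is compressed
        byte[] frame = encode((byte) 1, big);
        Frame back = decode(frame);
        System.out.println("frame bytes: " + frame.length
                + ", round trip ok: " + Arrays.equals(back.payload, big));
    }
}
```

With the uncompressed length stored in front of the compressed bytes, 
the receiver can allocate the target buffer once instead of growing it 
while inflating, which is the memory-management point made above.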

when changing the code, I was wondering if we can stick to method 
names that make sense and are logical

public int getTimeoutAllSession()
If this means return the count of all sessions that have timed out, I 
would suggest public int getSessionTimeoutCount()

No, it is the timeout value in seconds that DeltaManager waits after 
sending the all-sessions event to one other cluster member.

protected ClusterMessage createRecevierObject(byte[] data)
do you mean deserialize? as in protected ClusterMessage 
deserialize(byte[] data)

Yes, I have changed the names in ClusterReceiverBase and 
ReplicationTransmitter.
Those are also my favorite names, but time is limited when you 
refactor code.

I must admit that I am having a little bit of a hard time reading the 
code because of the funky naming conventions, do you mind me cleaning 
up some when I go in and add changes?

Yes, feel free to find better names. Please also change the names 
inside the mbeans descriptors and the test code.
I think we must coordinate the work. If you announce the name-change 
step, then I can pause my redesign and refactorings.

I will be pushing for stabilization as opposed to new features and so 
called refactoring.
As an example, to customers stability and speed is more important than 
features, take MySQL for example.

Yes, you are right. But my code changes are important for better 
understanding and give clearer semantics to a
lot of classes. The other thing is: I want to make the cluster faster 
and easier to extend. I hope we can also port Remy's/Mladen's APR
sockets to the clustering module.

The following cases/classes need help:
- SimpleTcpCluster
  pause/resume senders
Do you also mean that pausing the Receiver would help?
 Then you must also stop the Membership, and that is dangerous.
 = pause: We can send a message to all other nodes that we are 
member 

Re: Cluster fixes

2005-04-15 Thread Peter Rossbach
Yes, I have changed a lot and it is time to test and stabilize the code.
   See to-do.txt for more :-)
The current cluster code with the 5.5.9 fix pack works very well. I 
tested the fix under very high
load last week.

Peter
- Great that you are also starting to look inside the code.
Filip Hanik - Dev Lists wrote:
I am going through the cluster code right now and will be adding fixes 
along the way.
I think the development of this code has focused more on features than 
stability, so I would like to ask that for the next period, let's focus 
on the stability and get this beast back in shape again.

Filip
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Cluster fixes

2005-04-14 Thread Filip Hanik - Dev Lists
I am going through the cluster code right now and will be adding fixes 
along the way.
I think the development of this code has focused more on features than 
stability, so I would like to ask that for the next period, let's focus 
on the stability and get this beast back in shape again.

Filip