Patrick and Ted.
Unless the ZooKeeper clients add this feature, it is not easy for us to
implement.
We only provide a platform for many services within our org.
Their batch servers will fire off whatever clients they want.
We have no control over it.
But 8 second latency during stampede is
2011. 4. 14., 10:30 AM, Patrick Hunt wrote:
2011/4/13 Chang Song tru64...@me.com:
Patrick.
Thank you for the reply.
We are very aware of all the things you mentioned below.
None of those.
Not GC (we monitor every possible resource in JVM and system)
No IO. No Swapping.
No VM guest
2011. 4. 14., 1:53 PM, Patrick Hunt wrote:
two additional thoughts come to mind:
1) try running the ensemble with a single zk server, does this help at
all? (it might provide a short term workaround, it also might provide
some insight into what's causing the issue)
We are going to try this
2011/4/14 Chang Song tru64...@me.com:
2) regarding IO, if you run 'iostat -x 2' on the zk servers while your
issue is happening, what's the %util of the disk? what's the iowait
look like?
Again, no I/O at all. 0%
This is simply not possible.
Sessions are persistent. Each time a session
chang,
if the problem is on client startup, then it isn't the heartbeat
stampede, it is session establishment. the heartbeats are very
lightweight, so i can't imagine them causing any issues.
the two key issues we need to know are: 1) the version of the server
you are running, and 2) if you are
2011. 4. 15., 1:04 AM, Patrick Hunt wrote:
2011/4/14 Chang Song tru64...@me.com:
2) regarding IO, if you run 'iostat -x 2' on the zk servers while your
issue is happening, what's the %util of the disk? what's the iowait
look like?
Again, no I/O at all. 0%
This is simply not
when you file the jira can you also note the logging level you are using?
thanx
ben
2011/4/14 Chang Song tru64...@me.com:
Yes, Ben.
If you read my emails carefully, I already said it is not the heartbeat;
it is session establishment / closing that gets stampeded.
Since all the requests' response gets
sure I will
thank you.
Chang
2011. 4. 15., 7:16 AM, Benjamin Reed wrote:
when you file the jira can you also note the logging level you are using?
thanx
ben
2011/4/14 Chang Song tru64...@me.com:
Yes, Ben.
If you read my emails carefully, I already said it is not heartbeat,
it is
2011/4/14 Chang Song tru64...@me.com
You need to understand that most apps can tolerate a delay in connect/close,
but we cannot tolerate ping delay since we are using the ZK heartbeat TO
(timeout) as our sole failure detection.
What about using multiple ZK clusters for this, then?
But it really sounds like
You said that, but there was some skepticism from others about this.
You need to try the monitoring that was suggested. 5 minute averages are
not useful.
What does the stat four letter command return? (
http://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html#sc_zkCommands )
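
For readers who want to poll the stat command more often than a 5 minute average allows, here is a minimal Python sketch. It assumes the server accepts four letter commands as plain text on its client port (2181 is only the usual default, not something stated in this thread) and that the reply contains a "Latency min/avg/max:" line; the helper names four_letter_word and parse_latency are hypothetical, chosen for illustration.

```python
import socket


def four_letter_word(host: str, port: int, cmd: str = "stat",
                     timeout: float = 5.0) -> str:
    """Send a four letter command to a ZooKeeper server and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection once it has replied
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_latency(stat_output: str):
    """Return (min, avg, max) from the 'Latency min/avg/max:' line, or None.

    Assumption: the line looks like 'Latency min/avg/max: 0/2/8000'.
    """
    for line in stat_output.splitlines():
        if line.startswith("Latency min/avg/max:"):
            lo, avg, hi = line.split(":", 1)[1].strip().split("/")
            return float(lo), float(avg), float(hi)
    return None


# Example (needs a reachable server; host and port are placeholders):
#   reply = four_letter_word("localhost", 2181)
#   print(parse_latency(reply))
```

Calling this every second or two during a stampede, and logging the max latency each time, gives a much finer picture than a 5 minute average.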
2011/4/14 Chang