so for a 200KB value you are seeing >200ms of latency in MessagingService? How much for a 1KB value?
-Jonathan

On Mon, Oct 5, 2009 at 1:46 PM, Igor Katkov <[email protected]> wrote:
> In case anyone is following this, here is an update:
>
> I was able to narrow it down to the Cassandra-Cassandra link. Storage proxy
> latency depends on the amount of data per key: the more data is transferred,
> the higher the latency. No surprise here.
> The client connects to daemon "A" and sends a key-value pair. "A" accepts the
> thrift message, deserializes it into an object, sees that the key belongs to
> daemon "B", serializes it to bytes once again (internal format now) and
> invokes MessagingService, which in turn writes to a socket. As soon as "B"
> delivers a write acknowledgment over a different connection, the client call
> is let go.
> Cassandra's MessagingService uses java nio to connect to the other cassandra
> daemons, and all connections are uni-directional. So in theory it should be
> very fast. But it's not.
>
> What does look suspicious is what seems to be a cap on network usage: only
> ~4% of the 1Gbps link is used regardless of the value size. With smaller
> values I get better throughput, with larger ones (200KB) - worse.
>
> As a temporary workaround, the client might be held responsible for
> identifying which cassandra instance it should send a key to. With 200KB
> values it's ~10 times faster.
>
>
> On Thu, Oct 1, 2009 at 6:51 PM, Igor Katkov <[email protected]> wrote:
>>
>> Hi,
>>
>> I have the following puzzle:
>> Storage proxy write latency ~235ms
>> CF write latency <1ms
>>
>> I have 3 nodes in the cluster, Cassandra v0.4, with tokens evenly
>> distributed. The client connects to a node and inserts a key with
>> ConsistencyLevel.ONE.
>> If it happens to be a local write, the operation is fast, the same speed as
>> in a single-node setup. JMX shows write latency <1ms.
>> If it happens to be a remote insert, StorageProxy sends it to the proper
>> node. This operation is slow: JMX shows write latency ~235ms.
>> At the same time, JMX on the remote node shows the same <1ms write latency,
>> so it's not the remote node being sluggish, it's something else.
>> There are no pending tasks on the remote node - the JMX counters are always
>> zero - and the network is 1Gb and idle, so I can't blame either of those.
>>
>>
>> I profiled the Cassandra server in JProfiler and could not find a thing.
>> All this extra time is spent inside QuorumResponseHandler waiting for the
>> condition to signal, which should happen as soon as a response is received.
>>
>> There is one pooled TCP connection open to the remote host. Hardly a
>> bottleneck, and the ThreadPoolExecutors look OK.
>>
>> Any ideas why the write latency is so high?
>
>
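For readers following the thread, here is a minimal sketch of the forwarded-write pattern Igor describes above, i.e. the point where the ~235ms is actually spent. The class and interface names are made up for illustration (they are not Cassandra's internals), and the real QuorumResponseHandler waits on a condition rather than a latch, but the blocking point is the same: the coordinator cannot release the client call until the remote node's acknowledgment arrives on the return connection.

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/**
 * Simplified model of the forwarded write: coordinator "A" sends the
 * serialized mutation to the owning node "B" and parks the client call
 * until B's acknowledgment arrives on a separate connection.
 * All names here are illustrative, not Cassandra's actual code.
 */
public class ForwardedWrite {
    private final CountDownLatch ackReceived = new CountDownLatch(1);

    /** Called by the coordinator thread handling the thrift insert. */
    public void send(byte[] serializedMutation, MessageSender sender,
                     long timeoutMs) throws TimeoutException, InterruptedException {
        // hand the serialized bytes to the outbound connection (nio in Cassandra)
        sender.sendTo("B", serializedMutation);
        // this is where the reported ~235ms is spent: the client call
        // cannot return until the ack flips the latch
        if (!ackReceived.await(timeoutMs, TimeUnit.MILLISECONDS)) {
            throw new TimeoutException("no ack from replica within " + timeoutMs + "ms");
        }
    }

    /** Called by the connection reading acknowledgments back from "B". */
    public void onAck() {
        ackReceived.countDown();
    }

    /** Hypothetical transport interface standing in for MessagingService. */
    public interface MessageSender {
        void sendTo(String node, byte[] payload);
    }
}

With a single connection and this wait on every remote insert, per-write latency, not link bandwidth, bounds the throughput, which would be consistent with the low link utilization Igor observes.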

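And a rough sketch of the client-side routing workaround mentioned above, assuming the RandomPartitioner's MD5-derived BigInteger tokens. The token-to-host map is a placeholder that would have to mirror the tokens actually assigned to the three nodes; this only removes the extra hop for writes the client can send straight to the owning node.

import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

/**
 * Rough sketch of client-side key routing. The token-to-host map is a
 * placeholder; the MD5-to-BigInteger token derivation assumes the
 * RandomPartitioner is in use.
 */
public class KeyRouter {
    private final TreeMap<BigInteger, String> ring = new TreeMap<BigInteger, String>();

    public KeyRouter(Map<BigInteger, String> tokenToHost) {
        ring.putAll(tokenToHost);
    }

    /** MD5-derived token for the key. */
    static BigInteger token(String key) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("MD5").digest(key.getBytes(StandardCharsets.UTF_8));
        return new BigInteger(digest).abs();
    }

    /** Owning node = first node clockwise on the ring whose token >= the key's token. */
    public String hostFor(String key) throws NoSuchAlgorithmException {
        BigInteger t = token(key);
        Map.Entry<BigInteger, String> owner = ring.ceilingEntry(t);
        return owner != null ? owner.getValue() : ring.firstEntry().getValue();
    }
}

The client would then open (or pick from a pool) a thrift connection to hostFor(key) and do the insert there, which is presumably where the ~10x difference Igor measured for 200KB values comes from.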