Client/Server deadlock in aggressive bulk upload (maybe a thrift bug?)
(2010-06-07_12-31-16)
--------------------------------------------------------------------------------------------
Key: CASSANDRA-1175
URL: https://issues.apache.org/jira/browse/CASSANDRA-1175
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: apache-cassandra-2010-06-07_12-31-16
Reporter: Jesse Hallio
I was testing how long it takes to upload some 222M lines into Cassandra,
using a single machine (4-core, 8G of memory, OpenSolaris, Java 1.6.0_10,
64-bit and 32-bit) to run both the server and the client.
The client creates a single keyspace with a single column family. Each line is
inserted as an 8-byte key with 26 name/value pairs (3-5 ASCII bytes per name,
8 bytes per value), using RackUnawareStrategy and replication factor 1. The
server install is pretty much out-of-the-box, running with -d64 -server -Xmx6G
(3G for the 32-bit VM). The client writes the changes in batches of 1000
lines with batch_mutate, and outputs a logging line every 50k lines.
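For reference, the batching loop described above can be sketched roughly as
follows. This is not the reporter's actual client: BatchSink is a hypothetical
stand-in for the real batch_mutate call, lines are plain strings rather than
Thrift mutations, and the code uses newer Java syntax than the 1.6 VM mentioned
above.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchUploadSketch {
    /** Hypothetical stand-in for the real Thrift call (e.g. batch_mutate). */
    interface BatchSink {
        void send(List<String> batch) throws Exception;
    }

    static final int BATCH_SIZE = 1000;    // lines per batch, as in the report
    static final int LOG_EVERY = 50_000;   // progress line every 50k lines

    /** Sends lines in batches of BATCH_SIZE; returns the number of batches sent. */
    static int upload(Iterable<String> lines, BatchSink sink) throws Exception {
        List<String> batch = new ArrayList<>(BATCH_SIZE);
        long sent = 0;
        int batches = 0;
        for (String line : lines) {
            batch.add(line);
            if (batch.size() == BATCH_SIZE) {
                sink.send(batch);          // blocks until the server replies
                batches++;
                sent += batch.size();
                batch = new ArrayList<>(BATCH_SIZE);
                if (sent % LOG_EVERY == 0) {
                    System.out.println((sent / 1000) + "K lines sent");
                }
            }
        }
        if (!batch.isEmpty()) {            // flush the final partial batch
            sink.send(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) throws Exception {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            lines.add("line-" + i);
        }
        // A no-op sink; a real client would issue the Thrift call here.
        int batches = upload(lines, batch -> { });
        System.out.println("batches: " + batches);
    }
}
```

Because each send is synchronous, a single reply that never arrives stalls the
whole import, which matches the symptom described below.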
The import hangs at random points: sometimes after the 6900K mark (I think I
even saw >10M yesterday, but I lost the window and its scrollback buffer),
sometimes after only 1750K. kill -QUIT on the server gives:
---
"pool-1-thread-1" prio=3 tid=0x0000000000b68800 nid=0x5c runnable [0xfffffd7e676f5000..0xfffffd7e676f5920]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    - locked <0xfffffd7e7ac80fb8> (a java.io.BufferedInputStream)
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TBinaryProtocol.readBinary(TBinaryProtocol.java:363)
    at org.apache.cassandra.thrift.Cassandra$batch_mutate_args.read(Cassandra.java:12840)
    at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:1743)
    at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:1317)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
---
and for the client:
---
"main" prio=3 tid=0x08070000 nid=0x2 runnable [0xfe38e000..0xfe38ed38]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    - locked <0xbad8a840> (a java.io.BufferedInputStream)
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:126)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:314)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:262)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:192)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:745)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:729)
    at mycode.MyClass.main(MyClass.java:169)
---
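It looks like both ends are blocked reading from each other at the same time:
the server is still waiting for the rest of the batch_mutate arguments
(readBinary), while the client is already waiting for the reply
(recv_batch_mutate). That looks like a Thrift bug, but I don't have a clear
idea of what happens in org.apache.cassandra.thrift.Cassandra.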
I tried running with Thrift r948492, but it didn't help (I did not recompile
the interface classes; I only switched the runtime jar).
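One way to make a hang like this surface as an error instead of blocking
forever is a socket read timeout. The sketch below uses only plain java.net
sockets, not Thrift or Cassandra, and is an illustration rather than a fix:
the "silent server" is a hypothetical stand-in for a peer that never replies,
and the 200 ms timeout is an arbitrary value chosen for the demo.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutSketch {
    /**
     * Connects to a local server that never sends anything and attempts a
     * blocking read. Returns true if the read timed out instead of hanging.
     */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, silentServer.getLocalPort())) {
            // Without this call, read() below would block indefinitely.
            client.setSoTimeout(timeoutMillis);
            try {
                client.getInputStream().read();
                return false;              // data or EOF arrived (not expected)
            } catch (SocketTimeoutException e) {
                return true;               // the stalled read was detected
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

A timeout only turns the stall into a visible exception on the side that sets
it; it does not explain why the server is still waiting for request bytes.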