Glad you tracked that down!

On Wed, Jun 23, 2010 at 6:14 PM, AJ Slater <a...@zuno.com> wrote:
> This issue is caused by my network.
>
> Cassandra maintains multiple gossip connections per node pair. One of
> these connections is used for heartbeat and load-broadcasting traffic.
> It's quite talky. Another one is used for distributed key reads. It's
> idle unless distributed keys are actively being sought.
>
> So if your switches are configured to drop idle connections in an
> impolite way as a defense against flooding, and maybe the default
> timeout is 3 hours or 4 hours 15 minutes, then you can run into the
> behavior I saw.
>
> Thanks all for your attention and advice.
>
> AJ
>
> On Sat, Jun 19, 2010 at 8:16 PM, AJ Slater <a...@zuno.com> wrote:
>> The only indication I have that Cassandra realized something was wrong
>> during this period was this INFO message:
>>
>> 10.33.2.70:/var/log/cassandra/output.log
>>
>> DEBUG 20:00:35,841 get_slice
>> DEBUG 20:00:35,841 weakreadremote reading SliceFromReadCommand(table='jolitics.com', key='4c43228354b38f14a1a015dd722cdf4b', column_parent='QueryPath(columnFamilyName='Images', superColumnName='null', columnName='null')', start='', finish='', reversed=false, count=100)
>> DEBUG 20:00:35,841 weakreadremote reading SliceFromReadCommand(table='jolitics.com', key='4c43228354b38f14a1a015dd722cdf4b', column_parent='QueryPath(columnFamilyName='Images', superColumnName='null', columnName='null')', start='', finish='', reversed=false, count=100) from 60999@/10.33.3.10
>> INFO 20:00:35,842 error writing to /10.33.3.10
>> TRACE 20:00:36,267 Received a GossipDigestSynMessage from /10.33.3.10
>> TRACE 20:00:36,267 reporting /10.33.3.10
>> TRACE 20:00:36,267 reporting /10.33.3.10
>>
>> Note the "error writing" line. The next read I did, at 20:02 while
>> writing my last mail to this list, succeeded.
>>
>> So it's timing out, but all the while it's sending heartbeats and
>> GossipDigestSyns and Acks back and forth, and maybe not really
>> querying its peers when it should, or timing out trying to do so. When
>> it finally realizes there's an error, it resets something? And then
>> we're back in business?
>>
>> I'm going to be offline for 48 hours.
>>
>> AJ
>>
>> On Sat, Jun 19, 2010 at 8:09 PM, AJ Slater <a...@zuno.com> wrote:
>>> Agreed. But those connection errors were happening at a sort of
>>> random time, not the time when I was seeing the problem. Now I am
>>> seeing the problem, and here are some logs without
>>> ConnectionExceptions.
>>>
>>> Here we're asking 10.33.2.70 for key 52e86817a577f75e545cdecd174d8b17,
>>> which resides only on 10.33.3.10 and 10.33.3.20. Note all the
>>> apparently normal communication, except that no mention of a request
>>> for key 52e86817a577f75e545cdecd174d8b17 ever comes up in
>>> 10.33.3.10's log, despite 10.33.2.70 saying it was reading from
>>> 10.33.3.10.
>>>
>>> The problem resolved itself again at 20:02, maybe 20 minutes later,
>>> where all of a sudden I get my columns returned in milliseconds and I
>>> see good stuff like:
>>>
>>> DEBUG 20:06:35,238 Reading consistency digest for 52e86817a577f75e545cdecd174d8b17 from 59...@[/10.33.3.10, /10.33.3.20]
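For the archive: when a middlebox silently reaps idle TCP connections, the usual mitigation is TCP keepalive with a probe interval shorter than the switch's idle timeout, so the rarely-used data connection never looks abandoned. Below is a minimal sketch of the idea in Java; it is not Cassandra's actual OutboundTcpConnection code, and the peer address, port, and sysctl value are placeholders.

    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class KeepAliveSketch {
        public static void main(String[] args) throws Exception {
            Socket socket = new Socket();
            // Ask the kernel to probe the connection while it sits idle, so
            // the switch sees periodic traffic and a dead peer is noticed
            // proactively instead of at the next write.
            socket.setKeepAlive(true);
            // Placeholder peer; stands in for another node's storage port.
            socket.connect(new InetSocketAddress("10.33.3.10", 7000));
            // The probe interval is an OS knob, not a Java one. On Linux:
            //   sysctl -w net.ipv4.tcp_keepalive_time=1800
            // Set it below the switch's idle-drop timeout (3-4 hours above).
        }
    }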
>>> Here's some logs from the problem period:
>>>
>>> 10.33.2.70:/var/log/cassandra/output.log
>>>
>>> DEBUG 19:42:03,230 get_slice
>>> DEBUG 19:42:03,231 weakreadremote reading SliceFromReadCommand(table='jolitics.com', key='52e86817a577f75e545cdecd174d8b17', column_parent='QueryPath(columnFamilyName='Images', superColumnName='null', columnName='null')', start='', finish='', reversed=false, count=100)
>>> DEBUG 19:42:03,231 weakreadremote reading SliceFromReadCommand(table='jolitics.com', key='52e86817a577f75e545cdecd174d8b17', column_parent='QueryPath(columnFamilyName='Images', superColumnName='null', columnName='null')', start='', finish='', reversed=false, count=100) from 57663@/10.33.3.10
>>> TRACE 19:42:03,619 Gossip Digests are : /10.33.2.70:1276981671:20386 /10.33.3.10:1276983719:18303 /10.33.3.20:1276983726:18295 /10.33.2.70:1276981671:20386
>>> TRACE 19:42:03,619 Sending a GossipDigestSynMessage to /10.33.3.20 ...
>>> TRACE 19:42:03,619 Performing status check ...
>>> TRACE 19:42:03,619 PHI for /10.33.3.10 : 0.95343619570936
>>> TRACE 19:42:03,619 PHI for /10.33.3.20 : 0.8635116192106644
>>> TRACE 19:42:03,621 Received a GossipDigestAckMessage from /10.33.3.20
>>> TRACE 19:42:03,621 reporting /10.33.3.10
>>> TRACE 19:42:03,621 reporting /10.33.3.20
>>> TRACE 19:42:03,621 marking as alive /10.33.3.10
>>> TRACE 19:42:03,621 Updating heartbeat state version to 18304 from 18303 for /10.33.3.10 ...
>>> TRACE 19:42:03,621 marking as alive /10.33.3.20
>>> TRACE 19:42:03,621 Updating heartbeat state version to 18296 from 18295 for /10.33.3.20 ...
>>> TRACE 19:42:03,622 Scanning for state greater than 20385 for /10.33.2.70
>>> TRACE 19:42:03,622 Scanning for state greater than 20385 for /10.33.2.70
>>> TRACE 19:42:03,622 Sending a GossipDigestAck2Message to /10.33.3.20
>>> TRACE 19:42:04,172 Received a GossipDigestSynMessage from /10.33.3.10
>>> TRACE 19:42:04,172 reporting /10.33.3.10
>>> TRACE 19:42:04,172 reporting /10.33.3.10
>>> TRACE 19:42:04,172 Scanning for state greater than 20385 for /10.33.2.70
>>> TRACE 19:42:04,172 @@@@ Size of GossipDigestAckMessage is 52
>>> TRACE 19:42:04,172 Sending a GossipDigestAckMessage to /10.33.3.10
>>> TRACE 19:42:04,174 Received a GossipDigestAck2Message from /10.33.3.10
>>> TRACE 19:42:04,174 reporting /10.33.3.10
>>> TRACE 19:42:04,174 marking as alive /10.33.3.10
>>> TRACE 19:42:04,174 Updating heartbeat state version to 18305 from 18304 for /10.33.3.10 ...
>>>
>>>
>>> 10.33.3.10:/var/log/cassandra/output.log
>>>
>>> TRACE 19:42:03,174 Sending a GossipDigestSynMessage to /10.33.3.20 ...
>>> TRACE 19:42:03,174 Performing status check ...
>>> TRACE 19:42:03,174 PHI for /10.33.2.70 : 1.3363463863632534
>>> TRACE 19:42:03,174 PHI for /10.33.3.20 : 0.9297110501502452
>>> TRACE 19:42:03,175 Received a GossipDigestAckMessage from /10.33.3.20
>>> TRACE 19:42:03,176 reporting /10.33.2.70
>>> TRACE 19:42:03,176 marking as alive /10.33.2.70
>>> TRACE 19:42:03,176 Updating heartbeat state version to 20385 from 20384 for /10.33.2.70 ...
>>> TRACE 19:42:03,176 Scanning for state greater than 18303 for /10.33.3.10
>>> TRACE 19:42:03,176 Scanning for state greater than 18303 for /10.33.3.10
>>> TRACE 19:42:03,176 Sending a GossipDigestAck2Message to /10.33.3.20
>>> TRACE 19:42:03,230 Received a GossipDigestSynMessage from /10.33.3.20
>>> TRACE 19:42:03,230 reporting /10.33.3.20
>>> TRACE 19:42:03,231 reporting /10.33.3.20
>>> TRACE 19:42:03,231 @@@@ Size of GossipDigestAckMessage is 35
>>> TRACE 19:42:03,231 Sending a GossipDigestAckMessage to /10.33.3.20
>>> TRACE 19:42:03,232 Received a GossipDigestAck2Message from /10.33.3.20
>>> TRACE 19:42:03,232 reporting /10.33.3.20
>>> TRACE 19:42:03,232 marking as alive /10.33.3.20
>>> TRACE 19:42:03,232 Updating heartbeat state version to 18296 from 18295 for /10.33.3.20 ...
>>> TRACE 19:42:04,173 Gossip Digests are : /10.33.3.10:1276983719:18305 /10.33.2.70
>>>
>>>
>>> 10.33.3.20:/var/log/cassandra/output.log
>>>
>>> TRACE 19:42:03,174 Received a GossipDigestSynMessage from /10.33.3.10
>>> TRACE 19:42:03,174 reporting /10.33.3.10
>>> TRACE 19:42:03,174 reporting /10.33.3.10
>>> TRACE 19:42:03,174 Scanning for state greater than 20384 for /10.33.2.70
>>> TRACE 19:42:03,175 @@@@ Size of GossipDigestAckMessage is 52
>>> TRACE 19:42:03,175 Sending a GossipDigestAckMessage to /10.33.3.10
>>> TRACE 19:42:03,176 Received a GossipDigestAck2Message from /10.33.3.10
>>> TRACE 19:42:03,176 reporting /10.33.3.10
>>> TRACE 19:42:03,177 marking as alive /10.33.3.10
>>> TRACE 19:42:03,177 Updating heartbeat state version to 18304 from 18303 for /10.33.3.10 ...
>>> TRACE 19:42:03,229 Gossip Digests are : /10.33.3.20:1276983726:18296 /10.33.3.10:1276983719:18304 /10.33.3.20:1276983726:18296 /10.33.2.70:1276981671:20385
>>> TRACE 19:42:03,229 Sending a GossipDigestSynMessage to /10.33.3.10 ...
>>> TRACE 19:42:03,229 Performing status check ...
>>> TRACE 19:42:03,229 PHI for /10.33.2.70 : 0.5938079948279411
>>> TRACE 19:42:03,229 PHI for /10.33.3.10 : 0.045531699282787594
>>> TRACE 19:42:03,231 Received a GossipDigestAckMessage from /10.33.3.10
>>> TRACE 19:42:03,231 Scanning for state greater than 18295 for /10.33.3.20
>>> TRACE 19:42:03,231 Scanning for state greater than 18295 for /10.33.3.20
>>> TRACE 19:42:03,232 Sending a GossipDigestAck2Message to /10.33.3.10
>>> TRACE 19:42:03,622 Received a GossipDigestSynMessage from /10.33.2.70
>>> TRACE 19:42:03,622 reporting /10.33.2.70
>>> TRACE 19:42:03,622 reporting /10.33.2.70
>>> TRACE 19:42:03,622 Scanning for state greater than 18295 for /10.33.3.20
>>> TRACE 19:42:03,623 Scanning for state greater than 18303 for /10.33.3.10
>>> TRACE 19:42:03,623 @@@@ Size of GossipDigestAckMessage is 69
>>> TRACE 19:42:03,623 Sending a GossipDigestAckMessage to /10.33.2.70
>>> TRACE 19:42:03,625 Received a GossipDigestAck2Message from /10.33.2.70
>>> TRACE 19:42:03,625 reporting /10.33.2.70
>>> TRACE 19:42:03,625 marking as alive /10.33.2.70
>>> TRACE 19:42:03,625 Updating heartbeat state version to 20386 from 20385 for /10.33.2.70 ...
>>> TRACE 19:42:04,229 Gossip Digests are : /10.33.3.20:1276983726:18297 /10.33.2.70:1276981671:20386 /10.33.3.10:1276983719:18304 /10.33.3.20:1276983726:18297
>>> TRACE 19:42:04,229 Sending a GossipDigestSynMessage to /10.33.3.10 ...
>>> TRACE 19:42:04,229 Performing status check ...
>>>
>>>
>>> AJ
>>>
>>> On Sat, Jun 19, 2010 at 7:02 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>>> This is definitely not a Cassandra bug; something external is causing
>>>> those connection failures.
>>>>
>>>> On Sat, Jun 19, 2010 at 3:12 PM, AJ Slater <a...@zuno.com> wrote:
>>>>> Logging at TRACE reveals problems even with no client requests
>>>>> coming in to the servers. The problem was immediate and persisted
>>>>> over the course of half an hour:
>>>>>
>>>>> 10.33.2.70 lpc03
>>>>> 10.33.3.10 fs01
>>>>> 10.33.3.20 fs02
>>>>>
>>>>> a...@lpc03:~$ grep unable /var/log/cassandra/output.log
>>>>> TRACE 14:07:52,104 unable to connect to /10.33.3.10
>>>>> ...
>>>>> TRACE 14:42:00,008 unable to connect to /10.33.3.20
>>>>> ...
>>>>> TRACE 14:42:06,751 unable to connect to /10.33.3.20
>>>>>
>>>>> Note that lpc03 has trouble talking to fs01 and fs02. After seeing
>>>>> this I started logging TRACE on fs01 and fs02.
>>>>>
>>>>> During the six seconds before I restarted fs02:
>>>>>
>>>>> a...@fs01:~/logs$ grep unable /var/log/cassandra/output.log | grep unable
>>>>> Bad configuration; unable to start server
>>>>> TRACE 14:42:00,865 unable to connect to /10.33.3.20
>>>>> ...
>>>>> TRACE 14:42:06,730 unable to connect to /10.33.3.20
>>>>>
>>>>> Restarted fs02 and no issues in any of the logs.
>>>>>
>>>>> a...@fs02:~$ grep unable /var/log/cassandra/output.log
>>>>> a...@fs02:~$
>>>>>
>>>>>
>>>>> The unfiltered log messages all look more like:
>>>>>
>>>>> TRACE 14:42:06,248 unable to connect to /10.33.3.20
>>>>> java.net.ConnectException: Connection refused
>>>>>        at java.net.PlainSocketImpl.socketConnect(Native Method)
>>>>>        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>>>>>        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>>>>>        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>>>>>        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>>>>>        at java.net.Socket.connect(Socket.java:529)
>>>>>        at java.net.Socket.connect(Socket.java:478)
>>>>>        at java.net.Socket.<init>(Socket.java:375)
>>>>>        at java.net.Socket.<init>(Socket.java:276)
>>>>>        at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:149)
>>>>>        at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:85)
>>>>>
>>>>>
>>>>> On Sat, Jun 19, 2010 at 2:19 PM, AJ Slater <a...@zuno.com> wrote:
>>>>>> I shall do just that. I did a bunch of tests this morning and the
>>>>>> situation appears to be this:
>>>>>>
>>>>>> I have three nodes A, B and C, with RF=2. I understand now why this
>>>>>> issue wasn't apparent with RF=3.
>>>>>>
>>>>>> If there are regular internode column requests going on (e.g. I set
>>>>>> up a pinger to get remote columns), the cluster functions normally.
>>>>>> However, if no internode column requests happen for a few hours (3
>>>>>> hours is the minimum I've seen, but sometimes it takes longer),
>>>>>> things go wrong. Using node A as the point of contact from the
>>>>>> client, all columns that live on A are returned in a timely fashion,
>>>>>> but for columns that only live on B & C, the retrieval times out,
>>>>>> with this in the log:
>>>>>>
>>>>>> INFO 13:13:28,345 error writing to /10.33.3.20
>>>>>>
>>>>>> No request for replicas or consistency checks is seen in the logs
>>>>>> of B & C at this time. Using 'nodetool ring' from each of the three
>>>>>> nodes shows all nodes as Up. Telnet from A to B on port 7000
>>>>>> connects. At first glance the tcpdump logs suggest that gossip
>>>>>> communication, perhaps heartbeats, is proceeding normally, but I
>>>>>> haven't really analyzed them.
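That lone "error writing to /10.33.3.20" followed by spontaneous recovery fits a connection the network killed long ago: the first write after the idle window fails, the sender drops the dead socket and reconnects, and subsequent requests work. Here is a rough sketch of that shape; this is not the actual OutboundTcpConnection logic, just an illustration of why one failed write "resets something".

    import java.io.IOException;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.net.Socket;

    // Sketch: a sender that only discovers a silently dropped connection
    // when a write fails, then recovers by reconnecting.
    public class ReconnectingSender {
        private final InetSocketAddress peer;
        private Socket socket;

        public ReconnectingSender(InetSocketAddress peer) { this.peer = peer; }

        public void send(byte[] message) throws IOException {
            if (socket == null || socket.isClosed()) connect();
            try {
                write(message);
            } catch (IOException e) {
                // The first write after the idle window fails ("error writing
                // to ..."); drop the dead socket and retry once on a fresh one.
                close();
                connect();
                write(message);
            }
        }

        private void write(byte[] message) throws IOException {
            OutputStream out = socket.getOutputStream();
            out.write(message);
            out.flush();
        }

        private void connect() throws IOException {
            socket = new Socket(peer.getAddress(), peer.getPort());
        }

        private void close() {
            try { socket.close(); } catch (IOException ignored) {}
            socket = null;
        }
    }

That would also explain the ten-second client timeouts: a write into a silently dropped connection can sit in the kernel's retransmit queue for a long while before it is reported as failed.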
>>>>>>
>>>>>> Fifteen minutes later, the cluster decided to behave normally again.
>>>>>> Everyone talks to each other like buddies and delivers columns fast
>>>>>> and regularly.
>>>>>>
>>>>>> This is really looking like a Cassandra bug. I'll report back with
>>>>>> my TRACE log later and I expect I'll be opening a ticket. The
>>>>>> confidence level of my employer in my Cassandra solution to their
>>>>>> petabyte data storage project is... uh... well... it could be better.
>>>>>>
>>>>>> AJ
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 18, 2010 at 8:16 PM, Jonathan Ellis <jbel...@gmail.com>
>>>>>> wrote:
>>>>>>> Set the log level to TRACE and see if the OutboundTcpConnection is
>>>>>>> going bad. That would explain the message never arriving.
>>>>>>>
>>>>>>> On Fri, Jun 18, 2010 at 10:39 AM, AJ Slater <a...@zuno.com> wrote:
>>>>>>>> To summarize:
>>>>>>>>
>>>>>>>> If a request for a column comes in *after a period of several
>>>>>>>> hours with no requests*, then the node servicing the request hangs
>>>>>>>> while looking for its peer rather than servicing the request like
>>>>>>>> it should. It then throws either a TimedOutException or a (wrong)
>>>>>>>> NotFoundException.
>>>>>>>>
>>>>>>>> And it doesn't appear to actually send the message it says it does
>>>>>>>> to its peer. Or at least its peer doesn't report the request being
>>>>>>>> received.
>>>>>>>>
>>>>>>>> And then the situation magically clears up after approximately 2
>>>>>>>> minutes.
>>>>>>>>
>>>>>>>> However, if the idle period never occurs, then the problem does
>>>>>>>> not manifest. If I run a cron job with wget against my server
>>>>>>>> every minute, I do not see the problem.
>>>>>>>>
>>>>>>>> I'll be looking at some tcpdump logs to see if I can suss out
>>>>>>>> what's really happening, and perhaps file this as a bug. The
>>>>>>>> several hours between reproducible events make this whole thing
>>>>>>>> aggravating to detect, debug and, I assume, fix, if it is indeed a
>>>>>>>> Cassandra problem.
>>>>>>>>
>>>>>>>> It was suggested on IRC that it may be my network. But gossip is
>>>>>>>> continually sending heartbeats, and nodetool and the logs show the
>>>>>>>> nodes as up and available. If my network were flaking out, I'd
>>>>>>>> think it would be dropping heartbeats and I'd see that.
>>>>>>>>
>>>>>>>> AJ
>>>>>>>>
>>>>>>>> On Thu, Jun 17, 2010 at 2:26 PM, AJ Slater <a...@zuno.com> wrote:
>>>>>>>>> These are physical machines.
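The cron-plus-wget workaround works because it never lets the internode read connections idle long enough to be reaped. The same keep-warm ping straight against Thrift might look like the sketch below. It assumes the 0.6-era generated Thrift API; the host, port, keyspace, column family, and key are taken from the logs in this thread, and the one-minute period mirrors the cron job.

    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.ColumnParent;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.apache.cassandra.thrift.SlicePredicate;
    import org.apache.cassandra.thrift.SliceRange;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;

    public class KeepWarmPinger {
        public static void main(String[] args) throws Exception {
            ColumnParent parent = new ColumnParent();
            parent.column_family = "Images";
            SlicePredicate predicate = new SlicePredicate();
            predicate.slice_range = new SliceRange(new byte[0], new byte[0], false, 1);
            while (true) {
                TSocket transport = new TSocket("10.33.2.70", 9160);
                transport.open();
                Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
                // Read a key whose replicas live on the other nodes, forcing a
                // remote read so the internode data connection carries traffic.
                client.get_slice("jolitics.com", "52e86817a577f75e545cdecd174d8b17",
                                 parent, predicate, ConsistencyLevel.ONE);
                transport.close();
                Thread.sleep(60000);
            }
        }
    }

Obviously a band-aid; fixing the switch's idle timeout (or keepalive, per the sketch near the top) is the real cure.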
>>>>>>>>>
>>>>>>>>> storage-conf.xml.fs03 is here:
>>>>>>>>>
>>>>>>>>> http://pastebin.com/weL41NB1
>>>>>>>>>
>>>>>>>>> Diffs from that for the other two storage-confs are inline here:
>>>>>>>>>
>>>>>>>>> a...@worm:../Z3/cassandra/conf/dev$ diff storage-conf.xml.lpc03 storage-conf.xml.fs01
>>>>>>>>> 185c185
>>>>>>>>> < <InitialToken>0</InitialToken>
>>>>>>>>> ---
>>>>>>>>>> <InitialToken>71603818521973537678586548668074777838</InitialToken>
>>>>>>>>> 229c229
>>>>>>>>> < <ListenAddress>10.33.2.70</ListenAddress>
>>>>>>>>> ---
>>>>>>>>>> <ListenAddress>10.33.3.10</ListenAddress>
>>>>>>>>> 241c241
>>>>>>>>> < <ThriftAddress>10.33.2.70</ThriftAddress>
>>>>>>>>> ---
>>>>>>>>>> <ThriftAddress>10.33.3.10</ThriftAddress>
>>>>>>>>> 341c341
>>>>>>>>> < <ConcurrentReads>16</ConcurrentReads>
>>>>>>>>> ---
>>>>>>>>>> <ConcurrentReads>4</ConcurrentReads>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> a...@worm:../Z3/cassandra/conf/dev$ diff storage-conf.xml.lpc03 storage-conf.xml.fs02
>>>>>>>>> 185c185
>>>>>>>>> < <InitialToken>0</InitialToken>
>>>>>>>>> ---
>>>>>>>>>> <InitialToken>120215585224964746744782921158327379306</InitialToken>
>>>>>>>>> 206d205
>>>>>>>>> < <Seed>10.33.3.20</Seed>
>>>>>>>>> 229c228
>>>>>>>>> < <ListenAddress>10.33.2.70</ListenAddress>
>>>>>>>>> ---
>>>>>>>>>> <ListenAddress>10.33.3.20</ListenAddress>
>>>>>>>>> 241c240
>>>>>>>>> < <ThriftAddress>10.33.2.70</ThriftAddress>
>>>>>>>>> ---
>>>>>>>>>> <ThriftAddress>10.33.3.20</ThriftAddress>
>>>>>>>>> 341c340
>>>>>>>>> < <ConcurrentReads>16</ConcurrentReads>
>>>>>>>>> ---
>>>>>>>>>> <ConcurrentReads>4</ConcurrentReads>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thank you for your attention,
>>>>>>>>>
>>>>>>>>> AJ
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Jun 17, 2010 at 2:09 PM, Benjamin Black <b...@b3k.us> wrote:
>>>>>>>>>> Are these physical machines or virtuals? Did you post your
>>>>>>>>>> cassandra.in.sh and storage-conf.xml someplace?
>>>>>>>>>>
>>>>>>>>>> On Thu, Jun 17, 2010 at 10:31 AM, AJ Slater <a...@zuno.com> wrote:
>>>>>>>>>>> Total data size in the entire cluster is about twenty 12k
>>>>>>>>>>> images, with no other load on the system. I just ask for one
>>>>>>>>>>> column and I get these timeouts. Performing multiple gets on the
>>>>>>>>>>> columns leads to multiple timeouts for a period of a few seconds
>>>>>>>>>>> or minutes, and then the situation magically resolves itself and
>>>>>>>>>>> response times are down to single-digit milliseconds for a
>>>>>>>>>>> column get.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jun 17, 2010 at 10:24 AM, AJ Slater <a...@zuno.com> wrote:
>>>>>>>>>>>> Cassandra 0.6.2 from the Apache Debian source.
>>>>>>>>>>>> Ubuntu Jaunty. Sun Java 6 JVM.
>>>>>>>>>>>>
>>>>>>>>>>>> All nodes in separate racks at 365 Main.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jun 17, 2010 at 10:12 AM, AJ Slater <a...@zuno.com> wrote:
>>>>>>>>>>>>> I'm seeing 10s timeouts on reads a few times a day. It's hard
>>>>>>>>>>>>> to reproduce consistently, but it seems to happen most often
>>>>>>>>>>>>> after it's been a long time between reads. After presenting
>>>>>>>>>>>>> itself for a couple of minutes, the problem then goes away.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've got a three-node cluster with replication factor 2,
>>>>>>>>>>>>> reading at consistency level ONE. The columns being read are
>>>>>>>>>>>>> around 12k each. The nodes are 8GB multicore boxes with the
>>>>>>>>>>>>> JVM limits between 4GB and 6GB.
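Those InitialTokens plus RF=2 are also why some keys are remote from lpc03 at all: on a three-node ring, RF=3 puts every key on every node, so any node can answer locally, while RF=2 leaves each key on only two of the three. Below is a toy version of the placement walk (RandomPartitioner/SimpleStrategy flavored, heavily simplified, not Cassandra's actual replication code); the tokens are the ones from the diffs above and the key is the one from the logs.

    import java.math.BigInteger;
    import java.security.MessageDigest;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.TreeMap;

    public class ReplicaSketch {
        public static void main(String[] args) throws Exception {
            // The three InitialTokens from the storage-confs above.
            TreeMap<BigInteger, String> ring = new TreeMap<BigInteger, String>();
            ring.put(BigInteger.ZERO, "lpc03/10.33.2.70");
            ring.put(new BigInteger("71603818521973537678586548668074777838"), "fs01/10.33.3.10");
            ring.put(new BigInteger("120215585224964746744782921158327379306"), "fs02/10.33.3.20");

            // RandomPartitioner-style token: abs(md5(key)) as a big integer.
            String key = "52e86817a577f75e545cdecd174d8b17";
            BigInteger token = new BigInteger(
                    MessageDigest.getInstance("MD5").digest(key.getBytes())).abs();

            // SimpleStrategy-style placement: walk clockwise from the key's
            // token, collecting the next RF distinct nodes, wrapping at zero.
            int rf = 2;
            List<String> replicas = new ArrayList<String>();
            BigInteger at = token;
            while (replicas.size() < rf) {
                Map.Entry<BigInteger, String> owner = ring.ceilingEntry(at);
                if (owner == null) owner = ring.firstEntry();
                if (!replicas.contains(owner.getValue())) replicas.add(owner.getValue());
                at = owner.getKey().add(BigInteger.ONE);
            }
            System.out.println("token " + token + " -> " + replicas);
        }
    }

AJ reports that this key lives only on 10.33.3.10 and 10.33.3.20, which is exactly the bad case: the node the client contacts has to cross the reaped internode connection to serve the read.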
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here's an application log from early this morning when a
>>>>>>>>>>>>> developer in Belgrade accessed the system:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jun 17 03:54:17 lpc03 pinhole[5736]: MainThread:pinhole.py:61 | Requested image_id: 5827067133c3d670071c17d9144f0b49
>>>>>>>>>>>>> Jun 17 03:54:27 lpc03 pinhole[5736]: MainThread:pinhole.py:76 | TimedOutException for Image 5827067133c3d670071c17d9144f0b49
>>>>>>>>>>>>> Jun 17 03:54:27 lpc03 pinhole[5736]: MainThread:zlog.py:105 | Image Get took 10005.388975 ms
>>>>>>>>>>>>> Jun 17 03:54:27 lpc03 pinhole[5736]: MainThread:pinhole.py:61 | Requested image_id: af8caf3b76ce97d13812ddf795104a5c
>>>>>>>>>>>>> Jun 17 03:54:27 lpc03 pinhole[5736]: MainThread:zlog.py:105 | Image Get took 3.658056 ms
>>>>>>>>>>>>> Jun 17 03:54:27 lpc03 pinhole[5736]: MainThread:zlog.py:105 | Image Transform took 0.978947 ms
>>>>>>>>>>>>>
>>>>>>>>>>>>> That's a timeout and then a successful get of another column.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here's the Cassandra log for 10.33.2.70:
>>>>>>>>>>>>>
>>>>>>>>>>>>> DEBUG 03:54:17,070 get_slice
>>>>>>>>>>>>> DEBUG 03:54:17,071 weakreadremote reading SliceFromReadCommand(table='jolitics.com', key='5827067133c3d670071c17d9144f0b49', column_parent='QueryPath(columnFamilyName='Images', superColumnName='null', columnName='null')', start='', finish='', reversed=false, count=100)
>>>>>>>>>>>>> DEBUG 03:54:17,071 weakreadremote reading SliceFromReadCommand(table='jolitics.com', key='5827067133c3d670071c17d9144f0b49', column_parent='QueryPath(columnFamilyName='Images', superColumnName='null', columnName='null')', start='', finish='', reversed=false, count=100) from 45138@/10.33.3.10
>>>>>>>>>>>>> DEBUG 03:54:27,077 get_slice
>>>>>>>>>>>>> DEBUG 03:54:27,078 weakreadlocal reading SliceFromReadCommand(table='jolitics.com', key='af8caf3b76ce97d13812ddf795104a5c', column_parent='QueryPath(columnFamilyName='Images', superColumnName='null', columnName='null')', start='', finish='', reversed=false, count=100)
>>>>>>>>>>>>> DEBUG 03:54:27,079 collecting body:false:1...@1275951327610885
>>>>>>>>>>>>> DEBUG 03:54:27,080 collecting body:false:1...@1275951327610885
>>>>>>>>>>>>> DEBUG 03:54:27,080 Reading consistency digest for af8caf3b76ce97d13812ddf795104a5c from 45...@[/10.33.2.70, /10.33.3.10]
>>>>>>>>>>>>> DEBUG 03:54:50,779 Disseminating load info ...
>>>>>>>>>>>>>
>>>>>>>>>>>>> It looks like it asks for key '5827067133c3d670071c17d9144f0b49'
>>>>>>>>>>>>> from the local host and also queries 10.33.3.10 for it, while
>>>>>>>>>>>>> for 'af8caf3b76ce97d13812ddf795104a5c' it only queries the
>>>>>>>>>>>>> local host and then returns appropriately.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here's the log for 10.33.3.10 around that time:
>>>>>>>>>>>>>
>>>>>>>>>>>>> DEBUG 03:54:19,645 Disseminating load info ...
>>>>>>>>>>>>> DEBUG 03:55:19,645 Disseminating load info ...
>>>>>>>>>>>>> DEBUG 03:56:19,646 Disseminating load info ...
>>>>>>>>>>>>> DEBUG 03:57:19,645 Disseminating load info ...
>>>>>>>>>>>>> DEBUG 03:58:19,645 Disseminating load info ...
>>>>>>>>>>>>> DEBUG 03:59:19,646 Disseminating load info ...
>>>>>>>>>>>>> DEBUG 04:00:18,635 GC for ParNew: 4 ms, 21443128 reclaimed leaving 55875144 used; max is 6580535296
>>>>>>>>>>>>>
>>>>>>>>>>>>> No record of communication from 10.33.2.70.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Does this ring any bells for anyone? I can of course attach
>>>>>>>>>>>>> storage-confs for all nodes if that sounds useful, and I'll be
>>>>>>>>>>>>> on #cassandra as ajslater.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Much thanks for taking a look and any suggestions. We fear
>>>>>>>>>>>>> we'll have to abandon Cassandra if this bug cannot be resolved.
>>>>>>>>>>>>>
>>>>>>>>>>>>> AJ
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Jonathan Ellis
>>>>>>> Project Chair, Apache Cassandra
>>>>>>> co-founder of Riptano, the source for professional Cassandra support
>>>>>>> http://riptano.com
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Jonathan Ellis
>>>> Project Chair, Apache Cassandra
>>>> co-founder of Riptano, the source for professional Cassandra support
>>>> http://riptano.com
>>>>
>>>
>>
>
-- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com