Okay, so we have 2 problems:

 - the read is simply too slow
 - we're doing unnecessary(?) read repairs

For the first I think it's going to be easiest if you can export the
row in question so we can do some profiling.  (Eric will have a tool
to do this soon.)

For the second, how sure are you that the repair is unnecessary?  Does
it repeat the repair when you do the query again after letting the
callbacks complete?

-Jonathan

On Mon, Oct 19, 2009 at 11:38 PM, Edmond Lau <[email protected]> wrote:
> A single local read with debug logging takes 3-4 secs on the node with
> 3 data files.  It actually takes roughly as long on the other nodes as
> well (I was incorrect in saying that they were fast).
>
> DEBUG [pool-1-thread-22] 2009-10-20 04:26:06,953 CassandraServer.java
> (line 219) get_slice
> DEBUG [pool-1-thread-22] 2009-10-20 04:26:06,955 StorageProxy.java
> (line 468) weakreadlocal reading
> SliceFromReadCommand(table='Analytics', key='o:movie',
> column_parent='QueryPath(columnFamilyName='movie',
> superColumnName='[...@26e00675', columnName='null')', start='',
> finish='', reversed=false, count=1)
> DEBUG [pool-1-thread-22] 2009-10-20 04:26:10,382 SliceQueryFilter.java
> (line 115) collecting 1000036:false:5...@1255978226411394
> DEBUG [CONSISTENCY-MANAGER:2] 2009-10-20 04:26:10,383
> ConsistencyManager.java (line 168) Reading consistency digest for
> o:movie from 2...@[172.16.129.71:7000, 172.16.129.72:7000]
> DEBUG [RESPONSE-STAGE:1] 2009-10-20 04:26:13,332
> ResponseVerbHandler.java (line 34) Processing response on a callback
> from [email protected]:7000
> DEBUG [RESPONSE-STAGE:3] 2009-10-20 04:26:13,675
> ResponseVerbHandler.java (line 34) Processing response on a callback
> from [email protected]:7000
> DEBUG [RESPONSE-STAGE:3] 2009-10-20 04:26:13,677
> ConsistencyManager.java (line 96) Performing read repair for o:movie
> to 2...@[172.16.129.71:7000, 172.16.129.72:7000, 172.16.129.75:7000]
>
> Also, it's unclear why there's a read repair going on above, since I
> haven't made any updates.
>
> Doing a quorum read on the node with 3 data files takes 7 seconds w/
> debug logging and leads to a DigestMismatchException, again even
> though I haven't made any updates.
>
> DEBUG [pool-1-thread-25] 2009-10-20 04:30:00,892 CassandraServer.java
> (line 219) get_slice
> DEBUG [pool-1-thread-25] 2009-10-20 04:30:00,894 StorageProxy.java
> (line 374) strongread reading data for
> SliceFromReadCommand(table='Analytics', key='o:movie',
> column_parent='QueryPath(columnFamilyName='movie',
> superColumnName='[...@2cf50c5f', columnName='null')', start='',
> finish='', reversed=false, count=1) from [email protected]:7000
> DEBUG [pool-1-thread-25] 2009-10-20 04:30:00,894 StorageProxy.java
> (line 381) strongread reading digest for
> SliceFromReadCommand(table='Analytics', key='o:movie',
> column_parent='QueryPath(columnFamilyName='movie',
> superColumnName='[...@2cf50c5f', columnName='null')', start='',
> finish='', reversed=false, count=1) from [email protected]:7000
> DEBUG [pool-1-thread-25] 2009-10-20 04:30:00,895 StorageProxy.java
> (line 381) strongread reading digest for
> SliceFromReadCommand(table='Analytics', key='o:movie',
> column_parent='QueryPath(columnFamilyName='movie',
> superColumnName='[...@2cf50c5f', columnName='null')', start='',
> finish='', reversed=false, count=1) from [email protected]:7000
> DEBUG [RESPONSE-STAGE:3] 2009-10-20 04:30:03,994
> ResponseVerbHandler.java (line 34) Processing response on a callback
> from [email protected]:7000
> DEBUG [RESPONSE-STAGE:2] 2009-10-20 04:30:04,107
> ResponseVerbHandler.java (line 34) Processing response on a callback
> from [email protected]:7000
> DEBUG [ROW-READ-STAGE:9] 2009-10-20 04:30:04,787 SliceQueryFilter.java
> (line 115) collecting 1000036:false:5...@1255978226411394
> DEBUG [ROW-READ-STAGE:9] 2009-10-20 04:30:04,788 ReadVerbHandler.java
> (line 100) Read key o:movie; sending response to
> [email protected]:7000
> DEBUG [RESPONSE-STAGE:4] 2009-10-20 04:30:04,788
> ResponseVerbHandler.java (line 34) Processing response on a callback
> from [email protected]:7000
> DEBUG [pool-1-thread-25] 2009-10-20 04:30:04,789
> ReadResponseResolver.java (line 84) Response deserialization time : 0
> ms.
> DEBUG [pool-1-thread-25] 2009-10-20 04:30:04,790
> ReadResponseResolver.java (line 84) Response deserialization time : 0
> ms.
> DEBUG [pool-1-thread-25] 2009-10-20 04:30:04,791
> ReadResponseResolver.java (line 84) Response deserialization time : 1
> ms.
>  INFO [pool-1-thread-25] 2009-10-20 04:30:04,792 StorageProxy.java
> (line 410) DigestMismatchException: o:movie
> DEBUG [RESPONSE-STAGE:1] 2009-10-20 04:30:07,953
> ResponseVerbHandler.java (line 34) Processing response on a callback
> from [email protected]:7000
> DEBUG [RESPONSE-STAGE:3] 2009-10-20 04:30:08,130
> ResponseVerbHandler.java (line 34) Processing response on a callback
> from [email protected]:7000
> DEBUG [pool-1-thread-25] 2009-10-20 04:30:08,132
> ReadResponseResolver.java (line 84) Response deserialization time : 1
> ms.
> DEBUG [pool-1-thread-25] 2009-10-20 04:30:08,132
> ReadResponseResolver.java (line 84) Response deserialization time : 0
> ms.
> DEBUG [pool-1-thread-25] 2009-10-20 04:30:08,132
> ReadResponseResolver.java (line 149) resolve: 1 ms.
> DEBUG [ROW-READ-STAGE:10] 2009-10-20 04:30:08,235
> SliceQueryFilter.java (line 115) collecting
> 1000036:false:5...@1255978226411394
> DEBUG [ROW-READ-STAGE:10] 2009-10-20 04:30:08,235 ReadVerbHandler.java
> (line 100) Read key o:movie; sending response to
> [email protected]:7000
>
> Edmond
>
> On Mon, Oct 19, 2009 at 8:04 PM, Jonathan Ellis <[email protected]> wrote:
>> On Mon, Oct 19, 2009 at 9:48 PM, Edmond Lau <[email protected]> wrote:
>>> 8 data files total.  3 nodes have 1, 1 has 2, the 3rd has 3.
>>>
>>>> Does it still take ~8s if you direct a CL.ONE query at one of the
>>>> nodes you know has the data (i.e., a local read)?
>>>
>>> Local reads return quickly, but if you look at the logs below, there's
>>> 6 secs from when cassandra prints that it's reading a consistency
>>> digest to when it processes a callback.
>>
>> That's from when it sends the digest command, to when it gets a callback.
>>
>> Something doesn't add up.  If computing &sending a digest response
>> takes 6s, either reading the row or computing the digest must be
>> taking substantial time.  But if local reads are fast, it's not
>> reading the row that is slow, and if resolve is fast (from your logs,
>> it is 2ms) then it's not computing the digest that is slow.
>>
>> Try doing a local read from the node with 3 data files, see if that is 
>> slower.
>>
>> If you still see a fast read there, add some more debug logging in
>> ReadVerbHandler.java, where it says something like (I'm going from
>> trunk here but 0.4.1 is similar)
>>
>>            Row row = readCommand.getRow(table);
>>            ReadResponse readResponse;
>>            if (readCommand.isDigestQuery())
>>            {
>>                readResponse = new ReadResponse(row.digest());
>>
>> time the getRow and the row.digest() calls and log how long those take
>> in the quorum scenario.
>>
>> -Jonathan
>>
>

Reply via email to