<ColumnFamily Name="movie" ColumnType="Super"
CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" />
On Thu, Oct 22, 2009 at 10:05 AM, Jonathan Ellis <[email protected]> wrote:
> What is the CF definition for movie? the part from your config file like
> <ColumnFamily CompareWith="BytesType" Name="Movie"/>
>
> On Thu, Oct 22, 2009 at 12:04 PM, Jonathan Ellis <[email protected]> wrote:
>> On Thu, Oct 22, 2009 at 10:59 AM, Edmond Lau <[email protected]> wrote:
>>> Try:
>>>
>>> keyspace: "Analytics"
>>> key: "o:movie"
>>> column family: "movie"
>>> super column: "all"
>>>
>>> I was able to get timeouts with a few concurrent get_slice() calls
>>> (3-4 with debug logging, 6 without) on the super column, with start =
>>> "", finish = "", and count = 1.
>>>
>>> Edmond
>>>
>>> On Thu, Oct 22, 2009 at 7:42 AM, Jonathan Ellis <[email protected]> wrote:
>>>> What is your columnfamily definition? What query should I test with?
>>>>
>>>> On Wed, Oct 21, 2009 at 7:33 PM, Edmond Lau <[email protected]> wrote:
>>>>> On Tue, Oct 20, 2009 at 2:03 PM, Jonathan Ellis <[email protected]> wrote:
>>>>>> Okay, so we have 2 problems:
>>>>>>
>>>>>> - the read is simply too slow
>>>>>> - we're doing unnecessary(?) read repairs
>>>>>>
>>>>>> For the first I think it's going to be easiest if you can export the
>>>>>> row in question so we can do some profiling. (Eric will have a tool
>>>>>> to do this soon.)
>>>>>
>>>>> Ok - I've created a cassandra setup with just the row in question and
>>>>> tarred up the data + storage-conf.xml for the three nodes:
>>>>>
>>>>> http://demo.ooyala.com/cassandra/cassandra-1.tgz
>>>>> http://demo.ooyala.com/cassandra/cassandra-2.tgz
>>>>> http://demo.ooyala.com/cassandra/cassandra-3.tgz
>>>>>
>>>>>>
>>>>>> For the second, how sure are you that the repair is unnecessary? Does
>>>>>> it repeat the repair when you do the query again after letting the
>>>>>> callbacks complete?
>>>>>
>>>>> I didn't do any writes as I was debugging the issue whereas I was
>>>>> reading the same data a whole bunch of times, so the repair should've
>>>>> been unnecessary.
>>>>>
>>>>> Thanks for your help,
>>>>> Edmond
>>>>>
>>>>>>
>>>>>> -Jonathan
>>>>>>
>>>>>> On Mon, Oct 19, 2009 at 11:38 PM, Edmond Lau <[email protected]> wrote:
>>>>>>> A single local read with debug logging takes 3-4 secs on the node with
>>>>>>> 3 data files. It actually takes roughly as long on the other nodes as
>>>>>>> well (I was incorrect in saying that they were fast).
>>>>>>>
>>>>>>> DEBUG [pool-1-thread-22] 2009-10-20 04:26:06,953 CassandraServer.java
>>>>>>> (line 219) get_slice
>>>>>>> DEBUG [pool-1-thread-22] 2009-10-20 04:26:06,955 StorageProxy.java
>>>>>>> (line 468) weakreadlocal reading
>>>>>>> SliceFromReadCommand(table='Analytics', key='o:movie',
>>>>>>> column_parent='QueryPath(columnFamilyName='movie',
>>>>>>> superColumnName='[...@26e00675', columnName='null')', start='',
>>>>>>> finish='', reversed=false, count=1)
>>>>>>> DEBUG [pool-1-thread-22] 2009-10-20 04:26:10,382 SliceQueryFilter.java
>>>>>>> (line 115) collecting 1000036:false:5...@1255978226411394
>>>>>>> DEBUG [CONSISTENCY-MANAGER:2] 2009-10-20 04:26:10,383
>>>>>>> ConsistencyManager.java (line 168) Reading consistency digest for
>>>>>>> o:movie from 2...@[172.16.129.71:7000, 172.16.129.72:7000]
>>>>>>> DEBUG [RESPONSE-STAGE:1] 2009-10-20 04:26:13,332
>>>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>>>> from [email protected]:7000
>>>>>>> DEBUG [RESPONSE-STAGE:3] 2009-10-20 04:26:13,675
>>>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>>>> from [email protected]:7000
>>>>>>> DEBUG [RESPONSE-STAGE:3] 2009-10-20 04:26:13,677
>>>>>>> ConsistencyManager.java (line 96) Performing read repair for o:movie
>>>>>>> to 2...@[172.16.129.71:7000, 172.16.129.72:7000, 172.16.129.75:7000]
>>>>>>>
>>>>>>> Also, it's unclear why there's a read repair going on above, since I
>>>>>>> haven't made any updates.
>>>>>>>
>>>>>>> Doing a quorum read on the node with 3 data files takes 7 seconds w/
>>>>>>> debug logging and leads to a DigestMismatchException, again even
>>>>>>> though I haven't made any updates.
>>>>>>>
>>>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:00,892 CassandraServer.java
>>>>>>> (line 219) get_slice
>>>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:00,894 StorageProxy.java
>>>>>>> (line 374) strongread reading data for
>>>>>>> SliceFromReadCommand(table='Analytics', key='o:movie',
>>>>>>> column_parent='QueryPath(columnFamilyName='movie',
>>>>>>> superColumnName='[...@2cf50c5f', columnName='null')', start='',
>>>>>>> finish='', reversed=false, count=1) from [email protected]:7000
>>>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:00,894 StorageProxy.java
>>>>>>> (line 381) strongread reading digest for
>>>>>>> SliceFromReadCommand(table='Analytics', key='o:movie',
>>>>>>> column_parent='QueryPath(columnFamilyName='movie',
>>>>>>> superColumnName='[...@2cf50c5f', columnName='null')', start='',
>>>>>>> finish='', reversed=false, count=1) from [email protected]:7000
>>>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:00,895 StorageProxy.java
>>>>>>> (line 381) strongread reading digest for
>>>>>>> SliceFromReadCommand(table='Analytics', key='o:movie',
>>>>>>> column_parent='QueryPath(columnFamilyName='movie',
>>>>>>> superColumnName='[...@2cf50c5f', columnName='null')', start='',
>>>>>>> finish='', reversed=false, count=1) from [email protected]:7000
>>>>>>> DEBUG [RESPONSE-STAGE:3] 2009-10-20 04:30:03,994
>>>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>>>> from [email protected]:7000
>>>>>>> DEBUG [RESPONSE-STAGE:2] 2009-10-20 04:30:04,107
>>>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>>>> from [email protected]:7000
>>>>>>> DEBUG [ROW-READ-STAGE:9] 2009-10-20 04:30:04,787 SliceQueryFilter.java
>>>>>>> (line 115) collecting 1000036:false:5...@1255978226411394
>>>>>>> DEBUG [ROW-READ-STAGE:9] 2009-10-20 04:30:04,788 ReadVerbHandler.java
>>>>>>> (line 100) Read key o:movie; sending response to
>>>>>>> [email protected]:7000
>>>>>>> DEBUG [RESPONSE-STAGE:4] 2009-10-20 04:30:04,788
>>>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>>>> from [email protected]:7000
>>>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:04,789
>>>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 0
>>>>>>> ms.
>>>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:04,790
>>>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 0
>>>>>>> ms.
>>>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:04,791
>>>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 1
>>>>>>> ms.
>>>>>>> INFO [pool-1-thread-25] 2009-10-20 04:30:04,792 StorageProxy.java
>>>>>>> (line 410) DigestMismatchException: o:movie
>>>>>>> DEBUG [RESPONSE-STAGE:1] 2009-10-20 04:30:07,953
>>>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>>>> from [email protected]:7000
>>>>>>> DEBUG [RESPONSE-STAGE:3] 2009-10-20 04:30:08,130
>>>>>>> ResponseVerbHandler.java (line 34) Processing response on a callback
>>>>>>> from [email protected]:7000
>>>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:08,132
>>>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 1
>>>>>>> ms.
>>>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:08,132
>>>>>>> ReadResponseResolver.java (line 84) Response deserialization time : 0
>>>>>>> ms.
>>>>>>> DEBUG [pool-1-thread-25] 2009-10-20 04:30:08,132
>>>>>>> ReadResponseResolver.java (line 149) resolve: 1 ms.
>>>>>>> DEBUG [ROW-READ-STAGE:10] 2009-10-20 04:30:08,235
>>>>>>> SliceQueryFilter.java (line 115) collecting
>>>>>>> 1000036:false:5...@1255978226411394
>>>>>>> DEBUG [ROW-READ-STAGE:10] 2009-10-20 04:30:08,235 ReadVerbHandler.java
>>>>>>> (line 100) Read key o:movie; sending response to
>>>>>>> [email protected]:7000
>>>>>>>
>>>>>>> Edmond
>>>>>>>
>>>>>>> On Mon, Oct 19, 2009 at 8:04 PM, Jonathan Ellis <[email protected]>
>>>>>>> wrote:
>>>>>>>> On Mon, Oct 19, 2009 at 9:48 PM, Edmond Lau <[email protected]> wrote:
>>>>>>>>> 8 data files total. 3 nodes have 1, 1 has 2, the 3rd has 3.
>>>>>>>>>
>>>>>>>>>> Does it still take ~8s if you direct a CL.ONE query at one of the
>>>>>>>>>> nodes you know has the data (i.e., a local read)?
>>>>>>>>>
>>>>>>>>> Local reads return quickly, but if you look at the logs below, there's
>>>>>>>>> 6 secs from when cassandra prints that it's reading a consistency
>>>>>>>>> digest to when it processes a callback.
>>>>>>>>
>>>>>>>> That's from when it sends the digest command, to when it gets a
>>>>>>>> callback.
>>>>>>>>
>>>>>>>> Something doesn't add up. If computing &sending a digest response
>>>>>>>> takes 6s, either reading the row or computing the digest must be
>>>>>>>> taking substantial time. But if local reads are fast, it's not
>>>>>>>> reading the row that is slow, and if resolve is fast (from your logs,
>>>>>>>> it is 2ms) then it's not computing the digest that is slow.
>>>>>>>>
>>>>>>>> Try doing a local read from the node with 3 data files, see if that is
>>>>>>>> slower.
>>>>>>>>
>>>>>>>> If you still see a fast read there, add some more debug logging in
>>>>>>>> ReadVerbHandler.java, where it says something like (I'm going from
>>>>>>>> trunk here but 0.4.1 is similar)
>>>>>>>>
>>>>>>>> Row row = readCommand.getRow(table);
>>>>>>>> ReadResponse readResponse;
>>>>>>>> if (readCommand.isDigestQuery())
>>>>>>>> {
>>>>>>>> readResponse = new ReadResponse(row.digest());
>>>>>>>>
>>>>>>>> time the getRow and the row.digest() calls and log how long those take
>>>>>>>> in the quorum scenario.
>>>>>>>>
>>>>>>>> -Jonathan
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>