I'm just curious: which partitioner are you using?

On Thu, Nov 17, 2011 at 4:30 PM, Philippe <watche...@gmail.com> wrote:

> Hi Todd
> Yes all equal hardware. Nearly no CPU usage and no memory issues.
> Repairs are running in tens of minutes so I don't understand why
> replication would be backed up.
>
> Any other ideas?
> On Nov 17, 2011 02:33, "Todd Burruss" <bburr...@expedia.com> wrote:
>
>> Are all of your machines equal hardware?  Since those machines are sending
>> data somewhere, maybe they are behind in replicating and are continuously
>> catching up?
>>
>> Use a tool like tcpdump to find out where the data is going.
>>
>> From: Philippe <watche...@gmail.com>
>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Date: Tue, 15 Nov 2011 13:22:38 -0800
>> To: user <user@cassandra.apache.org>
>> Subject: Re: Network traffic patterns
>>
>> Sorry about the previous message, I've enabled keyboard shortcuts on
>> gmail...*sigh*...
>>
>> Hello,
>> I'm trying to understand the network usage I am seeing in my cluster, can
>> anyone shed some light?
>> It's an RF=3, 12-node, cassandra 0.8.6 cluster. repair is performed on
>> each node once a week, with a rolling schedule.
>> The nodes are p13,p14,p15...p24 and are consecutive in that order on the
>> ring. Each node is only a cassandra database. I am hitting the cluster from
>> another server (p4).
>>
>> p4 is doing this with 20 threads in parallel
>>
>>    1. read a lot of data (some columns for hundreds to tens of thousands
>>    of keys, split into 512-key multigets)
>>    2. process the data
>>    3. write back a byte array to cassandra (average size is 400 bytes)
>>    4. go back to 1
>>
>> According to my munin graphs, network usage is about as follows. I am not
>> surprised at the bias towards p13-p15 as p4 is getting & storing data
>> mainly for keys located on one of those nodes.
>>
>>    - p4 : 1.5Mb/s in and out
>>    - p13-p15 : 15Mb/s in and 80Mb/s out
>>    - p16-p24 : 45Mb/s in and 5Mb/s out
>>
>> What I don't understand is why p4 is only seeing 1.5Mb/s while I see
>> 80Mb/s on p13 & p15.
>>
>> The way I understand this:
>>
>>    - p4 makes a multiget to the cluster, electing to use any node in the
>>    cluster (IN traffic describing the query)
>>    - coordinator node replays the query on all 3 replicas (so 3 servers
>>    each get the IN traffic, mostly p13-p15)
>>    - each server replies to coordinator
>>    - coordinator chooses matching values and sends back data to p4
>>
>> So if p13-p15 are outputting 80Mb/s why am I not seeing 80Mb/s coming
>> into p4, which is on the receiving end?
>>
>> Thanks
>>
>> 2011/11/15 Philippe <watche...@gmail.com>
>>
>>> Hello,
>>> I'm trying to understand the network usage I am seeing in my cluster,
>>> can anyone shed some light?
>>> It's an RF=3, 12-node, cassandra 0.8.6 cluster. The nodes are
>>> p13,p14,p15...p24 and are consecutive in that order on the ring.
>>> Each node is only a cassandra database. I am hitting the cluster from
>>> another server (p4).
>>>
>>> The pattern on p4 is to
>>>
>>>    1. read a lot of data (some columns for hundreds to tens of
>>>    thousands of keys, split into 512-key multigets)
>>>    2. process the data
>>>    3. write back a byte array to cassandra (average size is 400 bytes)
>>>
>>>
>>> p4 reads as
>>>
>>
>>
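
A rough sanity check on the munin figures quoted in the thread can be done by adding them up across the cluster. The sketch below is only illustrative: it assumes the quoted rates are per node (the thread does not say so explicitly), that the three groups are p4 (1 host), p13-p15 (3 nodes) and p16-p24 (9 nodes), and that no other hosts exchange traffic with the cluster. The names `groups` and `totals` are made up for this example.

```python
# Traffic figures quoted in the thread (munin graphs), in Mb/s.
# ASSUMPTION: each figure is a per-node rate, not a group total.
groups = {
    "p4":      {"nodes": 1, "in": 1.5,  "out": 1.5},
    "p13-p15": {"nodes": 3, "in": 15.0, "out": 80.0},
    "p16-p24": {"nodes": 9, "in": 45.0, "out": 5.0},
}


def totals(groups):
    """Return (total_in, total_out) summed over every host in every group."""
    total_in = sum(g["nodes"] * g["in"] for g in groups.values())
    total_out = sum(g["nodes"] * g["out"] for g in groups.values())
    return total_in, total_out


total_in, total_out = totals(groups)

# Aggregate sent by the three busy nodes and received by the other nine.
hot_out = groups["p13-p15"]["nodes"] * groups["p13-p15"]["out"]
cold_in = groups["p16-p24"]["nodes"] * groups["p16-p24"]["in"]

# Fraction of the busy nodes' output that the client could account for.
client_share = groups["p4"]["in"] / hot_out

print(f"p13-p15 send    : {hot_out:.1f} Mb/s in total")
print(f"p16-p24 receive : {cold_in:.1f} Mb/s in total")
print(f"p4 receives     : {groups['p4']['in']:.1f} Mb/s ({client_share:.1%} of p13-p15 output)")
print(f"cluster-wide    : {total_in:.1f} Mb/s in vs {total_out:.1f} Mb/s out")
```

Under those assumptions, p4 can account for well under 1% of what p13-p15 send, so nearly all of that output would have to be going to other nodes rather than to the client, which is what a tcpdump capture on p13 should be able to confirm or rule out. The cluster-wide "in" and "out" sums also do not match, which suggests either that the graph readings are approximate or that the per-node assumption is wrong.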
