Hi,

Below is a sample trace for a LOCAL_QUORUM query. I've changed the table/column
names in the query and replaced the actual node IP addresses with IP.1 and
IP.coord (for the coordinator node). RF=3 and we have 2 DCs. Shouldn't we
expect to see an "IP.2", since LOCAL_QUORUM requires the coordinator to
receive at least 2 responses? What am I missing here?

activity | timestamp | source | source_elapsed
---------+-----------+--------+----------------
Execute CQL3 query | 2016-09-15 04:17:55.401000 | IP.coord | 0
Parsing SELECT A,B,C from T WHERE key1='K1' and key2='K2' and key3='K3' and key4='K4'; [SharedPool-Worker-2] | 2016-09-15 04:17:55.402000 | IP.coord | 57
Preparing statement [SharedPool-Worker-2] | 2016-09-15 04:17:55.403000 | IP.coord | 140
reading data from /IP.1 [SharedPool-Worker-2] | 2016-09-15 04:17:55.403000 | IP.coord | 1343
Sending READ message to /IP.1 [MessagingService-Outgoing-/IP.1] | 2016-09-15 04:17:55.404000 | IP.coord | 1388
REQUEST_RESPONSE message received from /IP.1 [MessagingService-Incoming-/IP.1] | 2016-09-15 04:17:55.404000 | IP.coord | 2953
Processing response from /IP.1 [SharedPool-Worker-3] | 2016-09-15 04:17:55.404000 | IP.coord | 3001
READ message received from /IP.coord [MessagingService-Incoming-/IP.coord] | 2016-09-15 04:17:55.405000 | IP.1 | 117
Executing single-partition query on user_carts [SharedPool-Worker-1] | 2016-09-15 04:17:55.405000 | IP.1 | 253
Acquiring sstable references [SharedPool-Worker-1] | 2016-09-15 04:17:55.406000 | IP.1 | 262
Merging memtable tombstones [SharedPool-Worker-1] | 2016-09-15 04:17:55.406000 | IP.1 | 295
Bloom filter allows skipping sstable 729 [SharedPool-Worker-1] | 2016-09-15 04:17:55.406000 | IP.1 | 341
Partition index with 0 entries found for sstable 713 [SharedPool-Worker-1] | 2016-09-15 04:17:55.407000 | IP.1 | 411
Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2016-09-15 04:17:55.407000 | IP.1 | 414
Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-1] | 2016-09-15 04:17:55.407000 | IP.1 | 854
Merging data from memtables and 1 sstables [SharedPool-Worker-1] | 2016-09-15 04:17:55.408000 | IP.1 | 860
Read 1 live and 1 tombstone cells [SharedPool-Worker-1] | 2016-09-15 04:17:55.408000 | IP.1 | 910
Enqueuing response to /IP.coord [SharedPool-Worker-1] | 2016-09-15 04:17:55.408000 | IP.1 | 1051
Sending REQUEST_RESPONSE message to /IP.coord [MessagingService-Outgoing-/IP.coord] | 2016-09-15 04:17:55.409000 | IP.1 | 1110
Request complete | 2016-09-15 04:17:55.404067 | IP.coord | 3067
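
For what it's worth, a trace like the one above can also be captured
programmatically. Below is a minimal sketch using the DataStax Python driver;
the contact point, keyspace, and table/column names are placeholders, matching
the anonymized names above:

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# Placeholder contact point and keyspace; substitute real values.
cluster = Cluster(['IP.coord'])
session = cluster.connect('my_keyspace')

# Same shape as the anonymized query above, issued at LOCAL_QUORUM.
stmt = SimpleStatement(
    "SELECT A, B, C FROM T "
    "WHERE key1='K1' AND key2='K2' AND key3='K3' AND key4='K4'",
    consistency_level=ConsistencyLevel.LOCAL_QUORUM,
)

# trace=True asks the cluster to record a query trace for this request.
rows = session.execute(stmt, trace=True)

# Dump the trace events: source node, elapsed microseconds, activity.
trace = rows.get_query_trace()
for event in trace.events:
    print(event.source, event.source_elapsed, event.description)

cluster.shutdown()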

Thanks,
Joseph




On Tue, Sep 20, 2016 at 3:07 AM, Nicolas Douillet <
nicolas.douil...@gmail.com> wrote:

> Hi Pranay,
>
> I'll try to answer as precisely as I can.
>
> Note that what I'm going to explain is valid only for reads; write
> requests work differently.
> I'm assuming you have only one DC.
>
>    1. The coordinator gets a list of sorted live replicas. Replicas are
>    sorted by proximity (I'm not sure enough of how that works to explain
>    it here; by the snitch, I guess).
>
>    2. By default *the coordinator keeps only the exact list of nodes
>    necessary* to ensure the desired consistency (2 nodes for RF=3),
>    but, according to the read repair chance configured on each column
>    family (10% of the requests by default), *it might keep all the
>    replicas* (if one DC). A small example of this option follows the list.
>
>    3. The coordinator checks whether enough nodes are alive before trying
>    any request. If not, there's no need to go further, and you'll get a
>    slightly different error message:
>    *Live nodes <list> do not satisfy ConsistencyLevel (2 required)*
>
>    4. In substance, the coordinator waits for the exact number of
>    responses needed to achieve the consistency level.
>    To be more specific, the coordinator does not request the same thing
>    from each involved replica (one or two of the closest get a full data
>    read, the others only a digest), and it waits for the exact number of
>    responses needed to achieve the consistency level, with at least one
>    full data response present. A rough sketch follows the list.
>    (There is of course more to explain, e.g. if the digests do not match
>    ...)
>
>    So you're right when you talk about the fastest responses, but only
>    under certain conditions and if additional replicas are requested.
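>
>    For point 2, the read repair chance is a per-table (column family)
>    option in Cassandra 2.x/3.x. Here is a minimal sketch of changing it
>    through the Python driver; the contact point and keyspace/table names
>    are placeholders:
>
> from cassandra.cluster import Cluster
>
> cluster = Cluster(['IP.coord'])      # placeholder contact point
> session = cluster.connect()
>
> # dclocal_read_repair_chance covers replicas in the local DC (0.1, i.e.
> # 10% of requests, by default); read_repair_chance covers all DCs (0.0
> # by default).
> session.execute("""
>     ALTER TABLE my_ks.my_table
>     WITH dclocal_read_repair_chance = 0.1
>      AND read_repair_chance = 0.0
> """)
>
> cluster.shutdown()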
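>
>    To make point 4 a bit more concrete, here is a deliberately simplified
>    Python sketch of the idea (just an illustration of the behaviour
>    described above, not Cassandra's actual code; names and numbers are
>    made up):
>
> import random
>
> def replicas_to_contact(sorted_live_replicas, required, read_repair_chance):
>     """Pick the replicas the coordinator will actually query.
>
>     By default only the `required` closest replicas are kept; with
>     probability `read_repair_chance` all live replicas are kept instead.
>     """
>     if random.random() < read_repair_chance:
>         return list(sorted_live_replicas)
>     return sorted_live_replicas[:required]
>
> def plan_read(sorted_live_replicas, required=2, read_repair_chance=0.1):
>     # Not enough live replicas: fail fast, as in
>     # "Live nodes <list> do not satisfy ConsistencyLevel (2 required)".
>     if len(sorted_live_replicas) < required:
>         raise RuntimeError("Live nodes %s do not satisfy ConsistencyLevel "
>                            "(%d required)" % (sorted_live_replicas, required))
>
>     targets = replicas_to_contact(sorted_live_replicas, required,
>                                   read_repair_chance)
>
>     # The closest replica gets a full data read, the others only a digest;
>     # the coordinator then waits for `required` responses, at least one of
>     # them being the full data (digest mismatches trigger another round,
>     # not shown here).
>     return {"data": targets[0], "digest": targets[1:], "wait_for": required}
>
> # Example: RF=3 in the local DC, LOCAL_QUORUM (2 responses needed).
> print(plan_read(["IP.1", "IP.2", "IP.3"]))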
>
>
> I'm certainly missing some points.
> Is that clear enough?
>
> --
> Nicolas
>
>
>
> On Mon, Sep 19, 2016 at 22:16, Pranay akula <pranay.akula2...@gmail.com>
> wrote:
>
>>
>>
>> I've always had this doubt: when a Cassandra node gets a read request at
>> LOCAL_QUORUM consistency, does the coordinator ask all nodes with
>> replicas in that DC for a response, or only the fastest-responding nodes
>> whose count satisfies the local quorum?
>>
>> In this case RF is 3 and I got "Cassandra timeout during read query at
>> consistency LOCAL_QUORUM (2 responses were required but only 1 replica
>> responded)". Does this mean the coordinator asked only the two
>> fastest-responding replicas for data and 1 out of 2 timed out, or did the
>> coordinator ask all nodes with replicas, i.e. all three (3), and 2 out of
>> 3 timed out, since I only got a single response back?
>>
>>
>>
>> Thanks
>>
>> Pranay
>>
>
