Just to be sure: can this bug result in a 0-row result when it should be > 0?

On Sep 8, 2015 6:29 PM, "Tyler Hobbs" <ty...@datastax.com> wrote:
> See https://issues.apache.org/jira/browse/CASSANDRA-9753
>
> On Tue, Sep 8, 2015 at 10:22 AM, Tom van den Berge <tom.vandenbe...@gmail.com> wrote:
>
>> I've been bugging you a few times, but now I've got trace data for a
>> query with LOCAL_QUORUM that is being sent to a remote data center.
>>
>> The setup is as follows:
>> NetworkTopologyStrategy: {"DC1":"1","DC2":"2"}
>> Both DC1 and DC2 have 2 nodes.
>> In DC2, one node is currently being rebuilt, and therefore does not
>> contain all data (yet).
>>
>> The client app connects to a node in DC1 and sends a SELECT query with
>> CL LOCAL_QUORUM, which in this case means (1/2)+1 = 1.
>> If all is OK, the query always produces a result, because the requested
>> rows are guaranteed to be available in DC1.
>>
>> However, the query sometimes produces no result. I've been able to record
>> the traces of these queries, and it turns out that the coordinator node in
>> DC1 sometimes sends the query to DC2, to the node that is being rebuilt
>> and does not have the requested rows. I've included an example trace below.
>>
>> The coordinator node is 10.55.156.67, which is in DC1. The 10.88.4.194 node
>> is in DC2.
>> I've verified that the CL is LOCAL_QUORUM by printing it when the query is
>> sent (I'm using the DataStax Java driver).
>>
>> activity                                                                  | source      | source_elapsed | thread
>> --------------------------------------------------------------------------+-------------+----------------+-----------------------------------------
>> Message received from /10.55.156.67                                       | 10.88.4.194 |             48 | MessagingService-Incoming-/10.55.156.67
>> Executing single-partition query on aggregate                             | 10.88.4.194 |            286 | SharedPool-Worker-2
>> Acquiring sstable references                                              | 10.88.4.194 |            306 | SharedPool-Worker-2
>> Merging memtable tombstones                                               | 10.88.4.194 |            321 | SharedPool-Worker-2
>> Partition index lookup allows skipping sstable 107                        | 10.88.4.194 |            458 | SharedPool-Worker-2
>> Bloom filter allows skipping sstable 1                                    | 10.88.4.194 |            489 | SharedPool-Worker-2
>> Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | 10.88.4.194 |            496 | SharedPool-Worker-2
>> Merging data from memtables and 0 sstables                                | 10.88.4.194 |            500 | SharedPool-Worker-2
>> Read 0 live and 0 tombstone cells                                         | 10.88.4.194 |            513 | SharedPool-Worker-2
>> Enqueuing response to /10.55.156.67                                       | 10.88.4.194 |            613 | SharedPool-Worker-2
>> Sending message to /10.55.156.67                                          | 10.88.4.194 |            672 | MessagingService-Outgoing-/10.55.156.67
>> Parsing SELECT * FROM Aggregate WHERE type=? AND typeId=?;                | 10.55.156.67 |            10 | SharedPool-Worker-4
>> Sending message to /10.88.4.194                                           | 10.55.156.67 |          4335 | MessagingService-Outgoing-/10.88.4.194
>> Message received from /10.88.4.194                                        | 10.55.156.67 |          6328 | MessagingService-Incoming-/10.88.4.194
>> Seeking to partition beginning in data file                               | 10.55.156.67 |         10417 | SharedPool-Worker-3
>> Key cache hit for sstable 389                                             | 10.55.156.67 |         10586 | SharedPool-Worker-3
>>
>> My question is: how is it possible that the query is sent to a node in DC2?
>> Since DC1 has 2 nodes and RF 1, the query should always be sent to the
>> other node in DC1 if the coordinator does not have a replica, right?
>>
>> Thanks,
>> Tom
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
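The quorum arithmetic in Tom's message can be sketched as follows. Cassandra derives the LOCAL_QUORUM threshold from the local data center's replication factor using integer division, (rf / 2) + 1; the class and method names below are illustrative, not part of any Cassandra API:

```java
// Minimal sketch of Cassandra's quorum calculation: quorum = (rf / 2) + 1,
// using Java integer division (so RF 1 gives a quorum of 1, not 1.5).
public class LocalQuorum {

    // Number of local replicas that must respond for LOCAL_QUORUM.
    static int localQuorum(int rf) {
        return rf / 2 + 1;
    }

    public static void main(String[] args) {
        // DC1 has RF 1, so LOCAL_QUORUM there is (1/2)+1 = 1.
        System.out.println("DC1 (RF 1): " + localQuorum(1));
        // DC2 has RF 2, so LOCAL_QUORUM there would be (2/2)+1 = 2.
        System.out.println("DC2 (RF 2): " + localQuorum(2));
    }
}
```

With RF 1 in DC1, a single in-DC replica satisfies the consistency level, which is why the query should never need to leave DC1 at all.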