Just to be sure: can this bug result in a 0-row result when it should be > 0?

On Sep 8, 2015 6:29 PM, "Tyler Hobbs" <ty...@datastax.com> wrote:
> See https://issues.apache.org/jira/browse/CASSANDRA-9753
>
> On Tue, Sep 8, 2015 at 10:22 AM, Tom van den Berge <tom.vandenbe...@gmail.com> wrote:
>
>> I've been bugging you a few times, but now I've got trace data for a
>> query with LOCAL_QUORUM that is being sent to a remote data center.
>>
>> The setup is as follows:
>> NetworkTopologyStrategy: {"DC1":"1","DC2":"2"}
>> Both DC1 and DC2 have 2 nodes.
>> In DC2, one node is currently being rebuilt, and therefore does not
>> contain all data (yet).
>>
>> The client app connects to a node in DC1 and sends a SELECT query with
>> CL LOCAL_QUORUM, which in this case means (1/2)+1 = 1.
>> If all is OK, the query always produces a result, because the requested
>> rows are guaranteed to be available in DC1.
>>
>> However, the query sometimes produces no result. I've been able to record
>> the traces of these queries, and it turns out that the coordinator node in
>> DC1 sometimes sends the query to DC2, to the node that is being rebuilt
>> and does not have the requested rows. I've included an example trace below.
>>
>> The coordinator node is 10.55.156.67, which is in DC1. The 10.88.4.194 node
>> is in DC2.
>> I've verified that the CL is LOCAL_QUORUM by printing it when the query is
>> sent (I'm using the DataStax Java driver).
>>
>> activity                                                                  | source      | source_elapsed | thread
>> --------------------------------------------------------------------------+-------------+----------------+-----------------------------------------
>> Message received from /10.55.156.67                                       | 10.88.4.194 |             48 | MessagingService-Incoming-/10.55.156.67
>> Executing single-partition query on aggregate                             | 10.88.4.194 |            286 | SharedPool-Worker-2
>> Acquiring sstable references                                              | 10.88.4.194 |            306 | SharedPool-Worker-2
>> Merging memtable tombstones                                               | 10.88.4.194 |            321 | SharedPool-Worker-2
>> Partition index lookup allows skipping sstable 107                        | 10.88.4.194 |            458 | SharedPool-Worker-2
>> Bloom filter allows skipping sstable 1                                    | 10.88.4.194 |            489 | SharedPool-Worker-2
>> Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | 10.88.4.194 |            496 | SharedPool-Worker-2
>> Merging data from memtables and 0 sstables                                | 10.88.4.194 |            500 | SharedPool-Worker-2
>> Read 0 live and 0 tombstone cells                                         | 10.88.4.194 |            513 | SharedPool-Worker-2
>> Enqueuing response to /10.55.156.67                                       | 10.88.4.194 |            613 | SharedPool-Worker-2
>> Sending message to /10.55.156.67                                          | 10.88.4.194 |            672 | MessagingService-Outgoing-/10.55.156.67
>> Parsing SELECT * FROM Aggregate WHERE type=? AND typeId=?;                | 10.55.156.67 |            10 | SharedPool-Worker-4
>> Sending message to /10.88.4.194                                           | 10.55.156.67 |          4335 | MessagingService-Outgoing-/10.88.4.194
>> Message received from /10.88.4.194                                        | 10.55.156.67 |          6328 | MessagingService-Incoming-/10.88.4.194
>> Seeking to partition beginning in data file                               | 10.55.156.67 |         10417 | SharedPool-Worker-3
>> Key cache hit for sstable 389                                             | 10.55.156.67 |         10586 | SharedPool-Worker-3
>>
>> My question is: how is it possible that the query is sent to a node in DC2?
>> Since DC1 has 2 nodes and RF 1, the query should always be sent to the
>> other node in DC1 if the coordinator does not have a replica, right?
>>
>> Thanks,
>> Tom
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
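The quorum arithmetic in Tom's message can be sketched as follows. Cassandra derives the LOCAL_QUORUM threshold from the local data center's replication factor using integer division, (rf / 2) + 1; the class and method names below are illustrative, not part of any Cassandra API:

```java
// Minimal sketch of Cassandra's quorum calculation: quorum = (rf / 2) + 1,
// using Java integer division (so RF 1 gives a quorum of 1, not 1.5).
public class LocalQuorum {

    // Number of local replicas that must respond for LOCAL_QUORUM.
    static int localQuorum(int rf) {
        return rf / 2 + 1;
    }

    public static void main(String[] args) {
        // DC1 has RF 1, so LOCAL_QUORUM there is (1/2)+1 = 1.
        System.out.println("DC1 (RF 1): " + localQuorum(1));
        // DC2 has RF 2, so LOCAL_QUORUM there would be (2/2)+1 = 2.
        System.out.println("DC2 (RF 2): " + localQuorum(2));
    }
}
```

With RF 1 in DC1, a single in-DC replica satisfies the consistency level, which is why the query should never need to leave DC1 at all.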