Sorry, the bug was in our snitch. We are using getHostName() instead of
getCanonicalHostName() to determine the DC and rack. For the local node,
getHostName() returns the alias rather than the reverse DNS name, so the DC
and rack numbers are not what we expected.
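For anyone who wants to check the same thing on their own nodes, here is a minimal sketch. The two InetAddress calls are the standard Java ones; the naming scheme ("dc<D>-rack<R>-...") and the parseDcRack helper are hypothetical, made up for illustration only, and are not our actual snitch code.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HostNameCheck {
    // Hypothetical naming scheme, assumed only for this sketch:
    // "dc<D>-rack<R>-<anything>", e.g. "dc1-rack2-node3.example.com".
    private static final Pattern DC_RACK = Pattern.compile("^dc(\\d+)-rack(\\d+)-");

    /** Returns {dc, rack}, or null if the name does not follow the assumed scheme. */
    public static int[] parseDcRack(String hostName) {
        Matcher m = DC_RACK.matcher(hostName);
        if (!m.find()) {
            return null;
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    private static String describe(int[] dcRack) {
        return dcRack == null ? "no DC/rack in name" : "DC=" + dcRack[0] + ", Rack=" + dcRack[1];
    }

    public static void main(String[] args) throws UnknownHostException {
        InetAddress local = InetAddress.getLocalHost();
        // May be a short name or alias, depending on how the address was resolved.
        String plain = local.getHostName();
        // Best-effort fully qualified domain name for the address.
        String canonical = local.getCanonicalHostName();
        System.out.println("getHostName():          " + plain + " -> " + describe(parseDcRack(plain)));
        System.out.println("getCanonicalHostName(): " + canonical + " -> " + describe(parseDcRack(canonical)));
    }
}
```

If the two lines printed on a node differ, a snitch that derives DC and rack from the host name can place the local node differently from how its peers place it.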
Best regards/ Pagarbiai
Viktor Jevdokimov
Senior Developer
Email: viktor.jevdoki...@adform.com
Phone: +370 5 212 3063. Fax: +370 5 261 0453
J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that the
information remains the property of the sender. You must not use, disclose,
distribute, copy, print or rely on this e-mail. If you have received this
message in error, please contact the sender immediately and irrevocably delete
this message and any copies.
From: Viktor Jevdokimov [mailto:viktor.jevdoki...@adform.com]
Sent: Thursday, December 01, 2011 14:05
To: user@cassandra.apache.org
Subject: NetworkTopologyStrategy bug?
Assume for now we have 1 DC and 1 rack with 3 nodes (we use our own snitch,
which returns DC=0, Rack=0 for this case). The ring looks like this:
Address   DC  Rack  Token
                    113427455640312821154458202477256070484
10.0.0.1  0   0     0
10.0.0.2  0   0     56713727820156410577229101238628035242
10.0.0.3  0   0     113427455640312821154458202477256070484
Schema: ReplicaPlacementStrategy=NetworkTopologyStrategy, options: [0:2] (2
replicas in DC 0).
When trying to run cleanup (same problem with repair), Cassandra reports:
From 10.0.0.1:
DEBUG [time] 10.0.0.2,10.0.0.3 endpoints in datacenter 0 for token 0
DEBUG [time] 10.0.0.2,10.0.0.3 endpoints in datacenter 0 for token 56713727820156410577229101238628035242
DEBUG [time] 10.0.0.3,10.0.0.2 endpoints in datacenter 0 for token 113427455640312821154458202477256070484
INFO [time] Cleanup cannot run before a node has joined the ring

From 10.0.0.2:
DEBUG [time] 10.0.0.1,10.0.0.3 endpoints in datacenter 0 for token 0
DEBUG [time] 10.0.0.1,10.0.0.3 endpoints in datacenter 0 for token 56713727820156410577229101238628035242
DEBUG [time] 10.0.0.3,10.0.0.1 endpoints in datacenter 0 for token 113427455640312821154458202477256070484
INFO [time] Cleanup cannot run before a node has joined the ring

From 10.0.0.3:
DEBUG [time] 10.0.0.1,10.0.0.2 endpoints in datacenter 0 for token 0
DEBUG [time] 10.0.0.1,10.0.0.2 endpoints in datacenter 0 for token 56713727820156410577229101238628035242
DEBUG [time] 10.0.0.2,10.0.0.1 endpoints in datacenter 0 for token 113427455640312821154458202477256070484
INFO [time] Cleanup cannot run before a node has joined the ring
To me this means that every node thinks the whole data range is on the other
two nodes.
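The log output above is consistent with a simplified model in which each node's snitch places the node itself outside DC 0 while placing its two peers inside it, which fits the explanation in the reply at the top of this thread. The sketch below is not Cassandra's NetworkTopologyStrategy code: the class, method names, and the "unexpected" DC label are assumptions for illustration, rack awareness is left out (there is only one rack here), and the order in which endpoints are logged is not modelled.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class EndpointsInDcModel {
    /**
     * Walks the ring clockwise, starting at the node that owns the token,
     * and collects up to `replicas` nodes that `dcView` places in `dc`.
     */
    public static Set<String> endpointsInDc(List<String> ring, int ownerIndex,
                                            Map<String, String> dcView, String dc, int replicas) {
        Set<String> result = new TreeSet<>();
        for (int i = 0; i < ring.size() && result.size() < replicas; i++) {
            String node = ring.get((ownerIndex + i) % ring.size());
            if (dc.equals(dcView.get(node))) {
                result.add(node);
            }
        }
        return result;
    }

    /** Snitch view as seen from `self`: peers are in DC "0", `self` is misplaced. */
    public static Map<String, String> viewFrom(List<String> ring, String self) {
        Map<String, String> view = new HashMap<>();
        for (String node : ring) {
            // "unexpected" is a placeholder for whatever DC the alias resolved to.
            view.put(node, node.equals(self) ? "unexpected" : "0");
        }
        return view;
    }

    public static void main(String[] args) {
        // Nodes listed in token order: 0, 5671..., 11342...
        List<String> ring = Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3");
        for (String self : ring) {
            System.out.println("From " + self + ":");
            Map<String, String> view = viewFrom(ring, self);
            for (int owner = 0; owner < ring.size(); owner++) {
                System.out.println("  token owned by " + ring.get(owner) + " -> "
                        + endpointsInDc(ring, owner, view, "0", 2));
            }
        }
    }
}
```

In this model every token maps to the two peers of the node doing the calculation, so no node ever counts itself as a replica. With a consistent view (all three nodes in DC 0), the same walk returns the owner and its successor instead.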
As a result:
A WRITE request with any key/any token sent to 10.0.0.1 as coordinator will be
forwarded to and saved on 10.0.0.2 and 10.0.0.3.
A READ request at CL.ONE with any key/any token sent to 10.0.0.2 as coordinator
will be forwarded to 10.0.0.1 or 10.0.0.3. Since 10.0.0.1 cannot have the data
from the write above, some requests fail and some succeed (when 10.0.0.3
answers).
Moreover, every READ request to any node will be forwarded to another node.
That is what we have right now with 0.8.6 and up to 1.0.5, both with 3 nodes
in 1 DC and with 8x2 nodes.