Thanks for the explanation, Sylvain!
2013/1/16 Sylvain Lebresne <[email protected]>:
>> I mean if a node is down, then we get that node up and running again,
>> wouldn't it be synchronized automatically?
>
> It will, thanks to hinted handoff (not gossip; gossip only handles the
> ring topology and a bunch of metadata, it doesn't deal with data
> synchronization at all). But hinted handoffs are not bulletproof (if
> only because hints expire after some time if they are not delivered).
> And you're right, that's probably why Carlos' example worked as he
> observed it, especially since he didn't mention reads between his
> stop/erase/restart steps. Anyway, my description of read_repair_chance
> is still correct if someone wonders about that :)
>
> --
> Sylvain
>
>> Thanks!
>>
>> Renato M.
>>
>> 2013/1/16 Carlos Pérez Miguel <[email protected]>:
>> > ahhhh, ok. Now I understand where the data came from. When using
>> > CL.ALL, read_repair always repairs inconsistent data.
>> >
>> > Thanks a lot, Sylvain.
>> >
>> > Carlos Pérez Miguel
>> >
>> > 2013/1/17 Sylvain Lebresne <[email protected]>
>> >>
>> >> You're missing the correct definition of read_repair_chance.
>> >>
>> >> When you do a read at CL.ALL, all replicas are waited upon and the
>> >> results from all those replicas are compared. From that, we can
>> >> extract which nodes are not up to date, i.e. which ones can be
>> >> read repaired. And if some node needs to be repaired, we do it.
>> >> Always, whatever the value of read_repair_chance is.
>> >>
>> >> Now if you do a read at CL.ONE and you only end up querying 1
>> >> replica, you will never be able to do read repair. That's where
>> >> read_repair_chance comes into play. What it really controls is how
>> >> often we query *more* replicas than strictly required by the
>> >> consistency level. And it happens that the reason you would want
>> >> to do that is read repair, hence the option name. But read repair
>> >> potentially kicks in any time more than one replica answers a
>> >> query. One corollary is that read_repair_chance has no impact
>> >> whatsoever at CL.ALL.
>> >>
>> >> --
>> >> Sylvain
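Both knobs Sylvain describes are visible from cassandra-cli:
read_repair_chance is a per column family attribute, and the
consistency level is set per cli session. A minimal sketch using the
KS1/CF1 names from Carlos' mail below (1.x-era cli syntax, so
double-check it against your version):

    use KS1;
    update column family CF1 with read_repair_chance=0.1;
    consistencylevel as ONE;
    get CF1['data1'];
    consistencylevel as ALL;
    get CF1['data1'];

With the CL.ONE read, the 0.1 means roughly 1 read in 10 queries extra
replicas and may repair them as a side effect; the CL.ALL read compares
all replicas and repairs any stale ones regardless of the setting.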
>> >> On Wed, Jan 16, 2013 at 1:55 PM, Carlos Pérez Miguel
>> >> <[email protected]> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> I am trying to understand the read path in Cassandra. I've read
>> >>> Cassandra's documentation and it seems that the read path is like
>> >>> this:
>> >>>
>> >>> - The client contacts a proxy node, which performs the operation
>> >>> on a certain object
>> >>> - The proxy node sends requests to every replica of that object
>> >>> - Replica nodes answer eventually, if they are up
>> >>> - After the first R replicas answer, the proxy node returns the
>> >>> value to the client
>> >>> - If some of the replicas are not up to date and read repair is
>> >>> active, the proxy node updates those replicas
>> >>>
>> >>> Ok, so far so good.
>> >>>
>> >>> But now I found some inconsistencies that I don't understand:
>> >>>
>> >>> Let's suppose that we have a 5 node cluster, x1, x2, x3, x4 and
>> >>> x5, with replication factor 3, read_repair_chance=0.0,
>> >>> autobootstrap=false and caching=NONE.
>> >>> We have keyspace KS1 and column family CF1.
>> >>>
>> >>> With this configuration, we know that if any node crashes and
>> >>> erases its data directories, it will be necessary to run nodetool
>> >>> repair on that node in order to repair it and gather information
>> >>> from its replica companions.
>> >>>
>> >>> So, let's suppose that x1, x2 and x3 are the endpoints which
>> >>> store the data KS1.CF1['data1'].
>> >>> If x1 crashes (losing all its data) and we execute get
>> >>> KS1.CF1['data1'] with consistency level ALL, the operation will
>> >>> fail. That is ok to my understanding.
>> >>>
>> >>> If we restart node x1 without executing nodetool repair and
>> >>> repeat the operation get KS1.CF1['data1'] using consistency ALL,
>> >>> we will obtain the original data! Why? One of the nodes doesn't
>> >>> have any data about KS1.CF1['data1']. Ok, let's suppose that, as
>> >>> all the required nodes answer, even if one doesn't have data, the
>> >>> operation ends correctly.
>> >>>
>> >>> Now let's repeat the same procedure with the rest of the nodes,
>> >>> that is:
>> >>>
>> >>> 1- stop x1, erase data, logs, caches and commitlog from x1
>> >>> 2- restart x1 and don't repair it
>> >>> 3- stop x2, erase data, logs, caches and commitlog from x2
>> >>> 4- restart x2 and don't repair it
>> >>> 5- stop x3, erase data, logs, caches and commitlog from x3
>> >>> 6- restart x3 and don't repair it
>> >>> 7- execute get KS1.CF1['data1'] with consistency level ALL -> it
>> >>> still returns the correct data!
>> >>>
>> >>> Where did that data come from? The endpoint is supposed to be
>> >>> empty of data. I tried this using cassandra-cli and Cassandra's
>> >>> ruby client and the result is always the same. What did I miss?
>> >>>
>> >>> Thank you for reading until the end ;)
>> >>>
>> >>> Bye
>> >>>
>> >>> Carlos Pérez Miguel
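For completeness, steps 1-7 above map to something like the following
shell session, run on each of x1, x2 and x3 in turn. The paths are the
defaults of a packaged install (data_file_directories,
commitlog_directory and saved_caches_directory in cassandra.yaml);
adjust them and the service command to your setup:

    sudo service cassandra stop
    sudo rm -rf /var/lib/cassandra/data/KS1 \
                /var/lib/cassandra/commitlog \
                /var/lib/cassandra/saved_caches
    sudo service cassandra start   # deliberately no "nodetool repair" here
    # step 7, from any node:
    #   cassandra-cli -h x1
    #   use KS1; consistencylevel as ALL; get CF1['data1'];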

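And on the hints expiring that Sylvain mentions at the top of the
thread: how long a down node keeps accumulating hints is a per-node
setting in cassandra.yaml. The stock defaults from that era look like
this (check your own file before relying on them):

    hinted_handoff_enabled: true
    # stop creating hints for a node once it has been down longer than
    # this window (3 hours by default); this is why hints alone are not
    # bulletproof after a long outage
    max_hint_window_in_ms: 10800000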