Re: 'not found' after join

Greg Nelson Mon, 02 May 2011 21:12:44 -0700

Actually, I just found this bug: https://issues.basho.com/show_bug.cgi?id=992


So it looks like the r=1 idea won't work.
On Monday, May 2, 2011 at 8:36 PM, Greg Nelson wrote: 
> Ok, thank you Ryan. I'm glad that what I'm seeing is expected behavior, 
> although it is a little surprising.
> 
> As Kyle asked in a parallel reply on this thread: is it possible for the 
> client to distinguish between these different scenarios where a notfound can 
> be returned? My initial thought is that when a node is being added, clients 
> can do reads with r=1 and retry GETs that 404...?
> On Monday, May 2, 2011 at 8:14 PM, Ryan Zezeski wrote:
> > Greg,
> > 
> > Your expectations are fair; just because you added a node doesn't mean Riak 
> > should return notfounds. Unfortunately, we aren't quite there yet. This is 
> > a side effect of how Riak currently implements handoff in that it 
> > immediately updates/gossips the ring causing many partitions to handoff 
> > immediately. If a request comes in that relies on these partitions then it 
> > will get a notfound and perform read repair. Your situation is multiplied 
> > by the fact that you are going from 3 nodes to 4. More vnode shuffling 
> > occurs because of the small cluster size. 
> > 
> > We're well aware of this and have it on our radar for improvement in a 
> > future release.
> > 
> > All this said, your data will be eventually consistent. That is, all your 
> > data will eventually be handed off and things will work as normal. It's 
> > only during the handoff that you _may_ encounter notfounds. In this case it 
> > would be best to add a new node to your cluster at lowest load times, and 
> > if you can spare additional hardware, starting with a few more nodes is an 
> > even easier option. 
> > 
> > -Ryan
> > 
> > On Mon, May 2, 2011 at 9:48 PM, Greg Nelson <[email protected]> wrote:
> > > Hello riak users! 
> > > 
> > > I have a 4 node cluster that started out as 3 nodes. ring_creation_size = 
> > > 2048, target_n_val is default (4), and all buckets have n_val = 3. 
> > > 
> > > When I joined the 4th node, for a few minutes some GETs were returning 
> > > 'not found' for data that was already in riak. Eventually the data was 
> > > returned, due to read repair I would assume. Is this expected? It seems 
> > > that 'not found' and read repairs should only happen when something goes 
> > > wrong, like a node goes down. Not when adding a node to the cluster, 
> > > which is supposed to be part of normal operation! 
> > > 
> > > Any help or insight is appreciated!
> > > 
> > > Greg 
> > > _______________________________________________
> > >  riak-users mailing list
> > > [email protected]
> > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> > > 
> > 
> > 
>
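
The "retry GETs that 404" idea from the thread can be sketched on the client side roughly as follows. This is only an illustration, not an official Riak client feature: the function names, the retry count, and the backoff values are all made up here. It relies on the behaviour described in the thread, where a notfound during handoff triggers read repair, so a later GET may succeed. It does not pass r=1, since the bug linked above suggests that would not help, and it cannot tell a transient notfound from a key that really does not exist.

```python
import time
import urllib.error
import urllib.request


class NotFound(Exception):
    """Raised when the key is still reported missing after all attempts."""


def http_fetch(url):
    """Plain HTTP GET returning (status, body); a 404 is returned, not raised."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return 404, b""
        raise


def get_with_retry(url, fetch=http_fetch, attempts=5, delay=0.2, sleep=time.sleep):
    """Retry a GET that answers 404, backing off between attempts.

    attempts and delay are illustrative defaults (assumptions), not tuned
    values. Any non-404 response is handed back to the caller unchanged.
    """
    for attempt in range(attempts):
        status, body = fetch(url)
        if status != 404:
            return status, body
        if attempt < attempts - 1:
            # Exponential backoff: delay, 2*delay, 4*delay, ...
            sleep(delay * (2 ** attempt))
    # Still missing: either the key never existed or handoff is not done yet.
    raise NotFound(url)
```

With a node listening on the default HTTP port, a call might look like get_with_retry("http://127.0.0.1:8098/riak/mybucket/mykey"), where the bucket and key names are placeholders. Every truly missing key now costs several round trips, so this is probably only worth enabling while a node is joining.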
