Re: Forcing Siblings to Occur

Olav Frengstad Tue, 12 Nov 2013 21:44:10 -0800

Forgot the link!

[1]
https://github.com/basho/riak_kv/commit/6981450c5ffc18207b3a1dc057fd3840a0906c42



2013/11/13 Olav Frengstad <[email protected]>

> @John, I'm definitely looking forward to CRDTs, but at the same time I'm
> looking into alternative approaches for achieving the same thing.
>
> @Jason, your description is close to what I had in mind. The only real
> difference is that the merge would happen on read. I did some testing, and
> m/r (MapReduce) seems to work when the initial map phase calls
> `riak_object:get_values`.
>
>
> There's also the addition of a maximum number of siblings in Riak 2.0 [1].
>
>
>
>
> 2013/11/13 John Daily <[email protected]>
>
>> Jason, I don’t see any inherent problems, given reasonable management of
>> the situation as you describe. I’d have to chase the code path to see what
>> overhead you’re introducing to Riak’s processing, but if it’s working well
>> for you, then who am I to object?
>>
>> Perhaps someone who’s more familiar with the sibling management code
>> could chime in.
>>
>> -John
>>
>>
>> On Nov 12, 2013, at 5:10 PM, Jason Campbell <[email protected]> wrote:
>>
>> I am currently forcing siblings for time series data. The maximum bucket
>> sizes are very predictable due to the nature of the data. I originally used
>> the get/update/set cycle, but as I approach the end of the interval,
>> reading and writing 1MB+ objects at a high frequency kills network
>> bandwidth. So now, I append siblings, and I have a cron that merges the
>> previous siblings (a simple set union works for me, only entire objects are
>> ever deleted).
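
The append-then-merge approach described above can be sketched roughly as follows. This is only an illustration, not Jason's actual cron job: the class name `SiblingUnion` and the choice of `String` entries are assumptions, and fetching the sibling values from Riak and writing the merged value back are deliberately left out.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: resolves siblings by taking the union of their entries.
final class SiblingUnion {
    private SiblingUnion() {}

    // Each element of `siblings` is the decoded content of one sibling,
    // i.e. one appended batch of entries. Entries are never removed, so
    // the union of all batches is the fully merged value.
    static Set<String> merge(List<Set<String>> siblings) {
        Set<String> merged = new TreeSet<>(); // sorted only for stable output
        for (Set<String> sibling : siblings) {
            merged.addAll(sibling);
        }
        return merged;
    }
}
```

Because a union ignores duplicates and ordering, it should not matter whether the merge runs from a cron job or on read, or whether the same sibling is merged more than once.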
>>
>> I can see how it can be dangerous to insert siblings, but if you have
>> some other method of knowing how much data is in one, I don't see size
>> being an issue. I have also considered using a counter to know how large an
>> object is without fetching it, which shouldn't be off by more than a few
>> siblings unless there is a network partition.
>>
>> So aside from size issues, which can be roughly predicted or worked
>> around, is there any reason to not create hundreds or thousands of siblings
>> and resolve them later? I realise sets could work well for my use case, but
>> they seem overkill for simple append operations when I don't need delete
>> functionality. Creating your own CRDT is trivial if you never need to
>> delete.
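
To make the "trivial if you never delete" point concrete, here is a minimal sketch of a grow-only set. The class `GSet` and its method names are hypothetical and not part of Riak or any client library. What matters is that `merge` is commutative, associative and idempotent, so replicas converge regardless of the order in which they are combined.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical grow-only set: elements can be added but never removed.
final class GSet<T> {
    private final Set<T> elements = new HashSet<>();

    void add(T element) {
        elements.add(element);
    }

    // Returns a new GSet holding the union of both replicas.
    // Neither input is modified.
    GSet<T> merge(GSet<T> other) {
        GSet<T> result = new GSet<>();
        result.elements.addAll(this.elements);
        result.elements.addAll(other.elements);
        return result;
    }

    // Read-only view of the current contents.
    Set<T> value() {
        return Collections.unmodifiableSet(elements);
    }
}
```

Supporting deletes is where this stops being simple, since a removed element would reappear as soon as an older replica is merged back in.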
>>
>> Thoughts are welcome,
>> Jason
>> From: John Daily
>> Sent: Wednesday, 13 November 2013 3:10 AM
>> To: Olav Frengstad
>> Cc: riak-users
>> Subject: Re: Forcing Siblings to Occur
>>
>> Forcing siblings other than for testing purposes is not typically a good
>> idea; as you indicate, the object size can easily become a problem as all
>> siblings will live inside the same Riak value.
>>
>> Your counter-example sounds a lot like a use case for server-side CRDTs;
>> data structures that allow the application to add values without retrieving
>> the server-side content first, and siblings are resolved by Riak.
>>
>> These will arrive with Riak 2.0; see
>> https://gist.github.com/russelldb/f92f44bdfb619e089a4d for an overview.
>>
>> -John
>>
>> On Nov 12, 2013, at 7:13 AM, Olav Frengstad <[email protected]> wrote:
>>
>> Do you consider forcing siblings a good idea? I would like to get some
>> input on possible use cases and pitfalls.
>> For instance, I have considered forcing siblings and then merging them on
>> read instead of fetching an object every time I want to update it
>> (especially with larger objects).
>>
>> It's not clear from the docs whether there are any limitations. Will the
>> maximum object size be the limiting factor?
>>
>> A section of the docs [1] comes to mind:
>>
>> "Having an enormous object in your node can cause reads of that object to
>> crash the entire node. Other issues are increased cluster latency as the
>> object is replicated and out of memory errors."
>>
>> [1]
>> http://docs.basho.com/riak/latest/theory/concepts/Vector-Clocks/#Siblings
>>
>> 2013/11/9 Brian Roach <[email protected]>
>>
>>> On Fri, Nov 8, 2013 at 11:38 AM, Russell Brown <[email protected]>
>>> wrote:
>>>
>>> > If you’re using a well behaved client like the Riak-Java-Client, or
>>> > any other that gets a vclock before doing a put, use whatever option
>>> > stops that.
>>>
>>> for (int i = 0; i < numReplicasWanted; i++) {
>>>     bucket.store("key", "value").withoutFetch().execute();
>>> }
>>>
>>> :)
>>>
>>> - Roach
>>>
>>> _______________________________________________
>>> riak-users mailing list
>>> [email protected]
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>
>
> --
> Med Vennlig Hilsen
> Olav Frengstad
>
> Systemutvikler // FWT
> +47 920 42 090
>



-- 
Med Vennlig Hilsen
Olav Frengstad

Systemutvikler // FWT
+47 920 42 090

