The `put_index` snippet in the following blog post actually forces the
creation of siblings (while `get_index` resolves them by doing a set
union):

http://basho.com/index-for-fun-and-for-profit/

As John said, you definitely want to be careful not to create too many
siblings because that'll impact the overall Riak object size.
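
To make the set-union idea concrete, here is a minimal sketch of the
read-side merge. It is not the code from the blog post: the class and
method names are hypothetical, and it assumes each sibling has already
been decoded into a collection of strings. It uses no Riak client calls.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: resolves siblings of a grow-only set by set union.
class SiblingUnion {
    // Union is commutative, associative and idempotent, so the result is
    // the same regardless of sibling order or of repeated merges.
    public static Set<String> resolve(
            Collection<? extends Collection<String>> siblings) {
        Set<String> merged = new TreeSet<>();
        for (Collection<String> sibling : siblings) {
            merged.addAll(sibling);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Three siblings written without a prior fetch (assumed data).
        List<List<String>> siblings = Arrays.asList(
            Arrays.asList("a", "b"),
            Arrays.asList("b", "c"),
            Arrays.asList("a"));
        System.out.println(resolve(siblings)); // prints [a, b, c]
    }
}
```

Writing the merged value back with the fetched vclock would collapse the
siblings into one value. This only works for data that is never deleted
element by element, since a union cannot express a removal.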

--
Hector


On Wed, Nov 13, 2013 at 5:25 AM, Russell Brown <[email protected]> wrote:
>
> On 13 Nov 2013, at 10:03, Carlos Baquero <[email protected]> wrote:
>
>>
>> It's interesting to see a use case where a grow-only set is sufficient. I 
>> believe Riak 2.0 will offer optimized OR-Sets that allow item removal at the 
>> expense of some extra complexity in element storage and logarithmic metadata 
>> growth per operation. But for your case a simple direct set of elements with 
>> server-side merge by set union looks perfect. It's not efficient at all to 
>> keep all those siblings if a simple server-side merge can reduce them.
>>
>> Maybe it is a good idea not to overlook the potential usefulness of simple 
>> grow-only sets and to add that datatype to the 2.0 server-side CRDT library. 
>> Maybe even 2P-Sets, which allow each element to be deleted only once, might 
>> be useful for some cases.
>
> We plan to add more data types in future, but I don’t think they’ll make it 
> into 2.0. You can use an ORSet as a G-Set, though: just only ever add to it. 
> The overhead is pretty small.
>
> The difficulty is exposing different “flavours” of CRDTs in a non-confusing 
> way. We chose to go with the name “data type” and name the implementations 
> generically (set, map, counter.) I wonder if we painted ourselves into a 
> corner.
>
> Cheers
>
> Russell
>
>>
>> Regards,
>> Carlos
>>
>> -----
>> Carlos Baquero
>> HASLab / INESC TEC &
>> Universidade do Minho,
>> Portugal
>>
>> [email protected]
>> http://gsd.di.uminho.pt/cbm
>>
>>
>>
>>
>>
>> On 12/11/2013, at 22:10, Jason Campbell wrote:
>>
>>> I am currently forcing siblings for time series data. The maximum bucket 
>>> sizes are very predictable due to the nature of the data. I originally used 
>>> the get/update/set cycle, but as I approach the end of the interval, 
>>> reading and writing 1MB+ objects at a high frequency kills network 
>>> bandwidth. So now, I append siblings, and I have a cron that merges the 
>>> previous siblings (a simple set union works for me, only entire objects are 
>>> ever deleted).
>>>
>>> I can see how it can be dangerous to insert siblings, but if you have some 
>>> other method of knowing how much data is in one, I don't see size being an 
>>> issue. I have also considered using a counter to know how large an object 
>>> is without fetching it, which shouldn't be off by more than a few siblings 
>>> unless there is a network partition.
>>>
>>> So aside from size issues, which can be roughly predicted or worked around, 
>>> is there any reason to not create hundreds or thousands of siblings and 
>>> resolve them later? I realise sets could work well for my use case, but 
>>> they seem overkill for simple append operations when I don't need delete 
>>> functionality. Creating your own CRDTs is trivial if you never need to 
>>> delete.
>>>
>>> Thoughts are welcome,
>>> Jason
>>> From: John Daily
>>> Sent: Wednesday, 13 November 2013 3:10 AM
>>> To: Olav Frengstad
>>> Cc: riak-users
>>> Subject: Re: Forcing Siblings to Occur
>>>
>>> Forcing siblings other than for testing purposes is not typically a good 
>>> idea; as you indicate, the object size can easily become a problem as all 
>>> siblings will live inside the same Riak value.
>>>
>>> Your counter-example sounds a lot like a use case for server-side CRDTs; 
>>> data structures that allow the application to add values without retrieving 
>>> the server-side content first, and siblings are resolved by Riak.
>>>
>>> These will arrive with Riak 2.0; see 
>>> https://gist.github.com/russelldb/f92f44bdfb619e089a4d for an overview.
>>>
>>> -John
>>>
>>> On Nov 12, 2013, at 7:13 AM, Olav Frengstad <[email protected]> wrote:
>>>
>>>> Do you consider forcing siblings a good idea? I would like to get some 
>>>> input on possible use cases and pitfalls.
>>>> For instance, I have considered forcing siblings and then merging them on 
>>>> read instead of fetching an object every time I want to update it 
>>>> (especially with larger objects).
>>>>
>>>> It's not clear from the docs whether there are any limitations. Will the 
>>>> maximum object size be the limit?
>>>>
>>>> A section of the docs[1] comes to mind:
>>>>
>>>> "Having an enormous object in your node can cause reads of that object to 
>>>> crash the entire node. Other issues are increased cluster latency as the 
>>>> object is replicated and out of memory errors."
>>>>
>>>> [1] 
>>>> http://docs.basho.com/riak/latest/theory/concepts/Vector-Clocks/#Siblings
>>>>
>>>> 2013/11/9 Brian Roach <[email protected]>
>>>> On Fri, Nov 8, 2013 at 11:38 AM, Russell Brown <[email protected]> 
>>>> wrote:
>>>>
>>>>> If you’re using a well behaved client like the Riak-Java-Client, or any 
>>>>> other that gets a vclock before doing a put, use whatever option stops 
>>>>> that.
>>>>
>>>> for (int i = 0; i < numReplicasWanted; i++) {
>>>>    bucket.store("key", "value").withoutFetch().execute();
>>>> }
>>>>
>>>> :)
>>>>
>>>> - Roach
>>>>
>>>> _______________________________________________
>>>> riak-users mailing list
>>>> [email protected]
>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>
>

