The `put_index` snippet in the following blog post actually forces the creation of siblings (while `get_index` resolves them by doing a set union):
http://basho.com/index-for-fun-and-for-profit/

As John said, you definitely want to be careful not to create too many
siblings because that'll impact the overall Riak object size.

--
Hector

On Wed, Nov 13, 2013 at 5:25 AM, Russell Brown <[email protected]> wrote:
>
> On 13 Nov 2013, at 10:03, Carlos Baquero <[email protected]> wrote:
>
>>
>> It's interesting to see a use case where a grow-only set is sufficient. I
>> believe Riak 2.0 will offer optimized OR-Sets that allow item removal at the
>> expense of some extra complexity in element storage and logarithmic metadata
>> growth per operation. But for your case a simple direct set of elements with
>> server side merge by set union looks perfect. It's not efficient at all to
>> keep all those siblings if a simple server side merge can reduce them.
>>
>> Maybe it is a good idea to not overlook the potential usefulness of simple
>> grow-only sets and add that datatype to the 2.0 server side CRDTs library.
>> And maybe even 2P-Sets, which only allow deleting once, might be useful for
>> some cases.
>
> We plan to add more data types in future, but I don’t think they’ll make it
> into 2.0. You can use an ORSet as a G-Set, though, just only ever add to it.
> The overhead is pretty small.
>
> The difficulty is exposing different “flavours” of CRDTs in a non-confusing
> way. We chose to go with the name “data type” and name the implementations
> generically (set, map, counter). I wonder if we painted ourselves into a
> corner.
>
> Cheers
>
> Russell
>
>>
>> Regards,
>> Carlos
>>
>> -----
>> Carlos Baquero
>> HASLab / INESC TEC &
>> Universidade do Minho,
>> Portugal
>>
>> [email protected]
>> http://gsd.di.uminho.pt/cbm
>>
>> On 12/11/2013, at 22:10, Jason Campbell wrote:
>>
>>> I am currently forcing siblings for time series data. The maximum bucket
>>> sizes are very predictable due to the nature of the data.
>>> I originally used the get/update/set cycle, but as I approach the end of
>>> the interval, reading and writing 1MB+ objects at a high frequency kills
>>> network bandwidth. So now, I append siblings, and I have a cron job that
>>> merges the previous siblings (a simple set union works for me, only entire
>>> objects are ever deleted).
>>>
>>> I can see how it can be dangerous to insert siblings, but if you have some
>>> other method of knowing how much data is in one, I don't see size being an
>>> issue. I have also considered using a counter to know how large an object
>>> is without fetching it, which shouldn't be off by more than a few siblings
>>> unless there is a network partition.
>>>
>>> So aside from size issues, which can be roughly predicted or worked around,
>>> is there any reason to not create hundreds or thousands of siblings and
>>> resolve them later? I realise sets could work well for my use case, but
>>> they seem like overkill for simple append operations when I don't need
>>> delete functionality. Creating your own CRDTs is trivial if you never need
>>> to delete.
>>>
>>> Thoughts are welcome,
>>> Jason
>>>
>>> From: John Daily
>>> Sent: Wednesday, 13 November 2013 3:10 AM
>>> To: Olav Frengstad
>>> Cc: riak-users
>>> Subject: Re: Forcing Siblings to Occur
>>>
>>> Forcing siblings other than for testing purposes is not typically a good
>>> idea; as you indicate, the object size can easily become a problem as all
>>> siblings will live inside the same Riak value.
>>>
>>> Your counter-example sounds a lot like a use case for server-side CRDTs:
>>> data structures that allow the application to add values without retrieving
>>> the server-side content first, and whose siblings are resolved by Riak.
>>>
>>> These will arrive with Riak 2.0; see
>>> https://gist.github.com/russelldb/f92f44bdfb619e089a4d for an overview.
>>>
>>> -John
>>>
>>> On Nov 12, 2013, at 7:13 AM, Olav Frengstad <[email protected]> wrote:
>>>
>>>> Do you consider forcing siblings a good idea?
>>>> I would like to get some input on possible use cases and pitfalls.
>>>> For instance, I have considered forcing siblings and then merging them on
>>>> read instead of fetching an object every time I want to update it
>>>> (especially with larger objects).
>>>>
>>>> It's not clear from the docs if there are any limitations; will the
>>>> maximum object size be the limitation?
>>>>
>>>> A section of the docs[1] comes to mind:
>>>>
>>>> "Having an enormous object in your node can cause reads of that object to
>>>> crash the entire node. Other issues are increased cluster latency as the
>>>> object is replicated and out of memory errors."
>>>>
>>>> [1]
>>>> http://docs.basho.com/riak/latest/theory/concepts/Vector-Clocks/#Siblings
>>>>
>>>> 2013/11/9 Brian Roach <[email protected]>
>>>> On Fri, Nov 8, 2013 at 11:38 AM, Russell Brown <[email protected]>
>>>> wrote:
>>>>
>>>>> If you’re using a well behaved client like the Riak-Java-Client, or any
>>>>> other that gets a vclock before doing a put, use whatever option stops
>>>>> that.
>>>>
>>>> for (int i = 0; i < numReplicasWanted; i++) {
>>>>     bucket.store("key", "value").withoutFetch().execute();
>>>> }
>>>>
>>>> :)
>>>>
>>>> - Roach
>>>>
>>>> _______________________________________________
>>>> riak-users mailing list
>>>> [email protected]
>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
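
For reference, the "merge by set union" resolution that Carlos and Jason
describe can be sketched roughly as below. This is only an illustration under
stated assumptions: it does not touch the Riak client at all, each sibling is
modelled as an in-memory collection of already-decoded elements, and the class
and method names (SiblingMerge, union) are made up for this example.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper (not part of any Riak client): grow-only set merge. */
final class SiblingMerge {
    private SiblingMerge() {}

    /**
     * Resolves siblings by taking the union of their elements.
     * Assumption: every sibling has already been decoded into a collection.
     */
    static <T extends Comparable<T>> Set<T> union(
            Collection<? extends Collection<T>> siblings) {
        Set<T> merged = new TreeSet<>(); // sorted, duplicates collapse
        for (Collection<T> sibling : siblings) {
            merged.addAll(sibling);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Three siblings appended independently, with one overlapping element.
        List<Set<String>> siblings = List.of(
                Set.of("t1", "t2"),
                Set.of("t2", "t3"),
                Set.of("t4"));
        System.out.println(union(siblings)); // prints [t1, t2, t3, t4]
    }
}
```

Because set union is commutative, associative and idempotent, it should not
matter in which order the siblings are merged, or whether the same sibling is
merged twice. That property only holds while elements are never removed, which
matches the grow-only case discussed in the thread.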
