Thanks for the input. If I understand correctly, the only size overhead would be the extra metadata added by all the siblings?
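
To make the merge-on-read idea concrete, here is a minimal sketch of resolving siblings by set union, as described in the quoted messages below. It is plain Java and deliberately makes no Riak client calls; the class name `SiblingUnion`, the `merge` helper, and the modelling of each sibling as a set of strings are assumptions for illustration only, not part of any Riak API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SiblingUnion {

    // Merge sibling values by set union (grow-only set semantics).
    // Each sibling is modelled as a set of entries; this representation is
    // hypothetical, a real client would first decode the fetched sibling values.
    public static <T extends Comparable<T>> Set<T> merge(Collection<? extends Set<T>> siblings) {
        Set<T> merged = new TreeSet<>(); // sorted, so the result is deterministic
        for (Set<T> sibling : siblings) {
            merged.addAll(sibling);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Three siblings written without a prior fetch, with overlapping entries.
        List<Set<String>> siblings = new ArrayList<>();
        siblings.add(new TreeSet<>(Arrays.asList("t1", "t2")));
        siblings.add(new TreeSet<>(Arrays.asList("t2", "t3")));
        siblings.add(new TreeSet<>(Arrays.asList("t1")));

        System.out.println(merge(siblings)); // prints [t1, t2, t3]
    }
}
```

Because set union is commutative, associative and idempotent, the order in which siblings are merged does not matter and merging the same sibling twice is harmless. That property only holds while entries are never deleted.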

2013/11/13 Hector Castro <[email protected]>

> The `put_index` snippet in the following blog post actually forces the
> creation of siblings (while `get_index` resolves them by doing a set
> union):
>
> http://basho.com/index-for-fun-and-for-profit/
>
> As John said, you definitely want to be careful not to create too many
> siblings because that'll impact the overall Riak object size.
>
> --
> Hector
>
>
> On Wed, Nov 13, 2013 at 5:25 AM, Russell Brown <[email protected]> wrote:
> >
> > On 13 Nov 2013, at 10:03, Carlos Baquero <[email protected]> wrote:
> >
> >>
> >> It's interesting to see a use case where a grow-only set is
> >> sufficient. I believe Riak 2.0 will offer optimized OR-Sets that
> >> allow item removal at the expense of some extra complexity in
> >> element storage and logarithmic metadata growth per operation. But
> >> for your case a simple direct set of elements with server-side merge
> >> by set union looks perfect. It's not efficient at all to keep all
> >> those siblings if a simple server-side merge can reduce them.
> >>
> >> Maybe it is a good idea not to overlook the potential usefulness of
> >> simple grow-only sets and add that datatype to the 2.0 server-side
> >> CRDTs library. And maybe even 2P-Sets, which only allow deleting
> >> once, might be useful for some cases.
> >
> > We plan to add more data types in future, but I don’t think they’ll
> > make it into 2.0. You can use an ORSet as a G-Set, though: just only
> > ever add to it. The overhead is pretty small.
> >
> > The difficulty is exposing different “flavours” of CRDTs in a
> > non-confusing way. We chose to go with the name “data type” and name
> > the implementations generically (set, map, counter). I wonder if we
> > painted ourselves into a corner.
> >
> > Cheers
> >
> > Russell
> >
> >>
> >> Regards,
> >> Carlos
> >>
> >> -----
> >> Carlos Baquero
> >> HASLab / INESC TEC &
> >> Universidade do Minho,
> >> Portugal
> >>
> >> [email protected]
> >> http://gsd.di.uminho.pt/cbm
> >>
> >>
> >> On 12/11/2013, at 22:10, Jason Campbell wrote:
> >>
> >>> I am currently forcing siblings for time series data. The maximum
> >>> bucket sizes are very predictable due to the nature of the data. I
> >>> originally used the get/update/set cycle, but as I approach the end
> >>> of the interval, reading and writing 1MB+ objects at a high
> >>> frequency kills network bandwidth. So now, I append siblings, and I
> >>> have a cron that merges the previous siblings (a simple set union
> >>> works for me; only entire objects are ever deleted).
> >>>
> >>> I can see how it can be dangerous to insert siblings, but if you
> >>> have some other method of knowing how much data is in one, I don't
> >>> see size being an issue. I have also considered using a counter to
> >>> know how large an object is without fetching it, which shouldn't be
> >>> off by more than a few siblings unless there is a network partition.
> >>>
> >>> So aside from size issues, which can be roughly predicted or worked
> >>> around, is there any reason to not create hundreds or thousands of
> >>> siblings and resolve them later? I realise sets could work well for
> >>> my use case, but they seem like overkill for simple append
> >>> operations when I don't need delete functionality. Creating your own
> >>> CRDTs is trivial if you never need to delete.
> >>>
> >>> Thoughts are welcome,
> >>> Jason
> >>>
> >>> From: John Daily
> >>> Sent: Wednesday, 13 November 2013 3:10 AM
> >>> To: Olav Frengstad
> >>> Cc: riak-users
> >>> Subject: Re: Forcing Siblings to Occur
> >>>
> >>> Forcing siblings other than for testing purposes is not typically a
> >>> good idea; as you indicate, the object size can easily become a
> >>> problem as all siblings will live inside the same Riak value.
> >>>
> >>> Your counter-example sounds a lot like a use case for server-side
> >>> CRDTs: data structures that allow the application to add values
> >>> without retrieving the server-side content first, with siblings
> >>> resolved by Riak.
> >>>
> >>> These will arrive with Riak 2.0; see
> >>> https://gist.github.com/russelldb/f92f44bdfb619e089a4d for an overview.
> >>>
> >>> -John
> >>>
> >>> On Nov 12, 2013, at 7:13 AM, Olav Frengstad <[email protected]> wrote:
> >>>
> >>>> Do you consider forcing siblings a good idea? I would like to get
> >>>> some input on possible use cases and pitfalls.
> >>>> For instance, I have considered forcing siblings and then merging
> >>>> them on read instead of fetching an object every time I want to
> >>>> update it (especially with larger objects).
> >>>>
> >>>> It's not clear from the docs if there are any limitations. Will the
> >>>> maximum object size be the limitation?
> >>>>
> >>>> A section of the docs[1] comes to mind:
> >>>>
> >>>> "Having an enormous object in your node can cause reads of that
> >>>> object to crash the entire node. Other issues are increased cluster
> >>>> latency as the object is replicated and out of memory errors."
> >>>>
> >>>> [1] http://docs.basho.com/riak/latest/theory/concepts/Vector-Clocks/#Siblings
> >>>>
> >>>> 2013/11/9 Brian Roach <[email protected]>
> >>>> On Fri, Nov 8, 2013 at 11:38 AM, Russell Brown <[email protected]> wrote:
> >>>>
> >>>>> If you’re using a well-behaved client like the Riak-Java-Client, or
> >>>>> any other that gets a vclock before doing a put, use whatever
> >>>>> option stops that.
> >>>>
> >>>> for (int i = 0; i < numReplicasWanted; i++) {
> >>>>     bucket.store("key", "value").withoutFetch().execute();
> >>>> }
> >>>>
> >>>> :)
> >>>>
> >>>> - Roach
> >>>>
> >>>> _______________________________________________
> >>>> riak-users mailing list
> >>>> [email protected]
> >>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>

--
Kind regards,
Olav Frengstad

Systems Developer // FWT
+47 920 42 090
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
