Thanks for the input.

If I understand correctly, the only size overhead would be the extra
metadata added by all the siblings?
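
For reference, here is a minimal sketch of the pattern under discussion,
in Python with plain in-memory sets rather than a Riak client. The names
merge_siblings and stored_size are hypothetical, and the per-sibling
metadata figure is a placeholder assumption, not a measured Riak value:

```python
import json


def merge_siblings(siblings):
    """Resolve siblings of a grow-only set by taking their union."""
    merged = set()
    for sibling in siblings:
        merged |= set(sibling)
    return merged


def stored_size(siblings, per_sibling_metadata_bytes):
    """Rough size of an unresolved object.

    Assumption: each sibling carries its own value plus a fixed amount of
    metadata. The metadata figure is a made-up parameter for illustration.
    """
    payload = sum(len(json.dumps(sorted(s)).encode("utf-8")) for s in siblings)
    return payload + per_sibling_metadata_bytes * len(siblings)


# Three writes made without fetching first, each producing one sibling.
siblings = [{"a"}, {"b"}, {"a", "c"}]
merged = merge_siblings(siblings)

unresolved_bytes = stored_size(siblings, per_sibling_metadata_bytes=100)
resolved_bytes = stored_size([merged], per_sibling_metadata_bytes=100)
```

In this sketch the unresolved object pays the metadata cost once per
sibling, and any element that appears in more than one sibling (here "a")
is also stored more than once until the merge happens.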


2013/11/13 Hector Castro <[email protected]>

> The `put_index` snippet in the following blog post actually forces the
> creation of siblings (while `get_index` resolves them by doing a set
> union):
>
> http://basho.com/index-for-fun-and-for-profit/
>
> As John said, you definitely want to be careful not to create too many
> siblings because that'll impact the overall Riak object size.
>
> --
> Hector
>
>
> On Wed, Nov 13, 2013 at 5:25 AM, Russell Brown <[email protected]>
> wrote:
> >
> > On 13 Nov 2013, at 10:03, Carlos Baquero <[email protected]> wrote:
> >
> >>
> >> It's interesting to see a use case where a grow-only set is sufficient.
> I believe Riak 2.0 will offer optimized OR-Sets that allow item removal at
> the expense of some extra complexity in element storage and logarithmic
> metadata growth per operation. But for your case a simple direct set of
> elements with server-side merge by set union looks perfect. It's not
> efficient at all to keep all those siblings if a simple server-side merge
> can reduce them.
> >>
> >> Maybe it is a good idea to not overlook the potential usefulness of
> simple grow-only sets and add that datatype to the 2.0 server-side CRDTs
> library. And maybe even 2P-Sets, which only allow deleting an element once,
> might be useful for some cases.
> >
> > We plan to add more data types in future, but I don’t think they’ll make
> it into 2.0. You can use an ORSet as a G-Set, though: just only ever add
> to it. The overhead is pretty small.
> >
> > The difficulty is exposing different “flavours” of CRDTs in a
> non-confusing way. We chose to go with the name “data type” and name the
> implementations generically (set, map, counter). I wonder if we painted
> ourselves into a corner.
> >
> > Cheers
> >
> > Russell
> >
> >>
> >> Regards,
> >> Carlos
> >>
> >> -----
> >> Carlos Baquero
> >> HASLab / INESC TEC &
> >> Universidade do Minho,
> >> Portugal
> >>
> >> [email protected]
> >> http://gsd.di.uminho.pt/cbm
> >>
> >>
> >>
> >>
> >>
> >> On 12/11/2013, at 22:10, Jason Campbell wrote:
> >>
> >>> I am currently forcing siblings for time series data. The maximum
> bucket sizes are very predictable due to the nature of the data. I
> originally used the get/update/set cycle, but as I approach the end of the
> interval, reading and writing 1MB+ objects at a high frequency kills
> network bandwidth. So now, I append siblings, and I have a cron that merges
> the previous siblings (a simple set union works for me, only entire objects
> are ever deleted).
> >>>
> >>> I can see how it can be dangerous to insert siblings, but if you have
> some other method of knowing how much data is in one, I don't see size
> being an issue. I have also considered using a counter to know how large an
> object is without fetching it, which shouldn't be off by more than a few
> siblings unless there is a network partition.
> >>>
> >>> So aside from size issues, which can be roughly predicted or worked
> around, is there any reason to not create hundreds or thousands of siblings
> and resolve them later? I realise sets could work well for my use case, but
> they seem overkill for simple append operations when I don't need delete
> functionality. Creating your own CRDTs is trivial if you never need to
> delete.
> >>>
> >>> Thoughts are welcome,
> >>> Jason
> >>>
> >>> From: John Daily
> >>> Sent: Wednesday, 13 November 2013 3:10 AM
> >>> To: Olav Frengstad
> >>> Cc: riak-users
> >>> Subject: Re: Forcing Siblings to Occur
> >>>
> >>> Forcing siblings other than for testing purposes is not typically a
> good idea; as you indicate, the object size can easily become a problem as
> all siblings will live inside the same Riak value.
> >>>
> >>> Your counter-example sounds a lot like a use case for server-side
> CRDTs: data structures that allow the application to add values without
> retrieving the server-side content first, with siblings resolved by Riak.
> >>>
> >>> These will arrive with Riak 2.0; see
> https://gist.github.com/russelldb/f92f44bdfb619e089a4d for an overview.
> >>>
> >>> -John
> >>>
> >>> On Nov 12, 2013, at 7:13 AM, Olav Frengstad <[email protected]> wrote:
> >>>
> >>>> Do you consider forcing siblings a good idea? I would like to get
> some input on possible use cases and pitfalls.
> >>>> For instance, I have considered forcing siblings and then merging them
> on read instead of fetching an object every time I want to update it
> (especially with larger objects).
> >>>>
> >>>> It's not clear from the docs if there are any limitations. Will the
> maximum object size be the limit?
> >>>>
> >>>> A section of the docs[1] comes to mind:
> >>>>
> >>>> "Having an enormous object in your node can cause reads of that
> object to crash the entire node. Other issues are increased cluster latency
> as the object is replicated and out of memory errors."
> >>>>
> >>>> [1]
> http://docs.basho.com/riak/latest/theory/concepts/Vector-Clocks/#Siblings
> >>>>
> >>>> 2013/11/9 Brian Roach <[email protected]>
> >>>> On Fri, Nov 8, 2013 at 11:38 AM, Russell Brown <[email protected]>
> wrote:
> >>>>
> >>>>> If you’re using a well behaved client like the Riak-Java-Client, or
> any other that gets a vclock before doing a put, use whatever option stops
> that.
> >>>>
> >>>> for (int i = 0; i < numReplicasWanted; i++) {
> >>>>    bucket.store("key", "value").withoutFetch().execute();
> >>>> }
> >>>>
> >>>> :)
> >>>>
> >>>> - Roach
> >>>>
> >>>> _______________________________________________
> >>>> riak-users mailing list
> >>>> [email protected]
> >>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >>>
> >>>
> >>
> >
> >
>
>



-- 
Kind regards,
Olav Frengstad

Systems Developer // FWT
+47 920 42 090
