Re: Getting a value: get vs map

Sean Cribbs Fri, 29 Jul 2011 10:46:47 -0700

A few things that should be mentioned as well:

1) MapReduce amounts to R=1, or reading only one replica. If you have
divergent replicas (siblings, e.g.) on different nodes, they might not
appear in your MapReduce results.
2) MapReduce does not invoke read-repair, so divergent replicas will not
converge.
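
To make the contrast concrete, here is a minimal sketch of a get that reads
more than one replica and reports siblings to the caller. It assumes the
riak-erlang-client (riakc) PB API and the default n_val of 3; the module name
get_with_quorum and the function fetch/3 are hypothetical names used only
for illustration.

```erlang
%% Sketch only: assumes riakc is on the code path and that Conn is an
%% open riakc_pb_socket connection. Module and function names are
%% hypothetical.
-module(get_with_quorum).
-export([fetch/3]).

%% Read from two replicas instead of one. Assumes n_val = 3, so r = 2 is
%% a majority. A get goes through the normal read path, so read-repair
%% can run and divergent values come back as siblings.
fetch(Conn, Bucket, Key) ->
    case riakc_pb_socket:get(Conn, Bucket, Key, [{r, 2}]) of
        {ok, Object} ->
            case riakc_obj:value_count(Object) of
                %% Exactly one value: no siblings to resolve.
                1 -> {ok, riakc_obj:get_value(Object)};
                %% Otherwise hand every value back for the caller to resolve.
                _ -> {siblings, riakc_obj:get_values(Object)}
            end;
        {error, notfound} ->
            not_found;
        {error, Reason} ->
            {error, Reason}
    end.
```

A map phase over the same {Bucket, Key} input would return whatever the one
replica it reads happens to hold, with no repair afterwards.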


On Fri, Jul 29, 2011 at 1:30 PM, Justin Sheehy <[email protected]> wrote:

> Jeremiah,
>
> You were essentially correct. A "targeted" MR does not have to search
> for the data, and does not slow down with database size. It is a
> bucket-sweeping MR that currently has that behavior.
>
> -Justin
>
>
>
> On Fri, Jul 29, 2011 at 10:27 AM, Jeremiah Peschka
> <[email protected]> wrote:
> > I would have suspected that an MR job where you supply a Bucket, Key pair
> > would be just as fast as a Get request. Shows what I know.
> > ---
> > Jeremiah Peschka
> > Founder, Brent Ozar PLF, LLC
> >
> > On Jul 29, 2011, at 1:37 AM, Antonio Rohman Fernandez wrote:
> >
> >> MapReduce (or simply a Map) gets really slow when the database holds a
> >> significant amount of data (or is distributed over several servers). Get
> >> is always faster, as Riak doesn't have to search for the key (you
> >> tell Riak exactly where to GET the data in your URL).
> >>
> >> Rohman
> >>
> >> On Thu, 28 Jul 2011 23:43:06 +0400, [email protected] wrote:
> >>
> >>> Hi,
> >>>
> >>> (I looked at various places for the information, however I could not
> >>> find anything that would answer the question.  It's not completely ruled
> >>> out that not all places were checked though :))
> >>>
> >>> I use PB erlang interface to access the database.  Given a bucket name
> >>> and a key, the value can easily be extracted using:
> >>>
> >>>     {ok, Object} = riakc_pb_socket:get(Conn, Bucket, Key),
> >>>     Value = riakc_obj:get_value(Object)
> >>>
> >>> Alternatively, a mapred (actually, just map) request could be issued:
> >>>
> >>>     {ok, [{_, [Value]}]} = riakc_pb_socket:mapred(Conn, [
> >>>         {Bucket, Key}
> >>>     ], [
> >>>         {map, {modfun, riak_kv_mapreduce, map_object_value}, none, true}
> >>>     ])
> >>>
> >>> I would expect that the result is the same, while in the second case the
> >>> amount of data transferred to the client is smaller (which might be good
> >>> for certain situations).
> >>>
> >>> So the [open] question is: are there any reasons for using the first
> >>> approach over the second?
> >>>
> >>> --
> >>> Misha
> >>>
> >> --
> >>
> >>               Antonio Rohman Fernandez
> >> CEO, Founder & Lead Engineer
> >> [email protected]               Projects
> >> MaruBatsu.es
> >> PupCloud.com
> >> Wedding Album
> >> _______________________________________________
> >> riak-users mailing list
> >> [email protected]
> >> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
> >
> >
>
>



-- 
Sean Cribbs <[email protected]>
Developer Advocate
Basho Technologies, Inc.
http://www.basho.com/

