Hi Keith,

In 0.14 the cache parameter was renamed to "map_cache_size". Setting this
parameter to 0 will disable the cache.
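
For example (a sketch, assuming the parameter lives in the riak_kv section
of app.config, as vnode_cache_entries did; your other settings are elided):

{riak_kv, [
    %% ... existing riak_kv settings ...
    {map_cache_size, 0}
]},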

Regarding the empty MapReduce results, I'll try to reproduce the issue
locally and narrow down the cause.

Thanks,
Dan

Daniel Reverri
Developer Advocate
Basho Technologies, Inc.
[email protected]


On Tue, Mar 29, 2011 at 6:16 PM, Keith Dreibelbis <[email protected]> wrote:

> Followup to this (somewhat old) thread...
>
> I had resolved my problem by putting vnode_cache_entries=0 in app.config,
> as Grant suggested.  But some time later it began failing again: map reduce
> was missing 25%-50% of the records it should have found.  At that point I
> tried Rohman's suggestion of using a random seed, and that worked around
> the problem.  But it isn't a very satisfying fix.
>
> So the vnode_cache_entries=0 thing doesn't really fix it after all?  Is
> there something else to put in the config that would make this work
> properly, without the random seed hack?  BTW since the original thread I
> have upgraded from 0.13 to 0.14, and the bug is still there.
>
>
> Keith
>
>
> On Thu, Mar 10, 2011 at 6:56 PM, Antonio Rohman Fernandez <
> [email protected]> wrote:
>
>> If you want to avoid caching (without changing the configuration), you can
>> put a random variable in your map or reduce function, or both.  That does
>> the trick for me, as the query will always be different:
>>
>> $seed = uniqid(); // any random string will do
>>
>> {"map":{"language":"javascript","source":"function(v,k,a) {
>> seed='.$seed.'; x=Riak.mapValuesJson(v)[0]; return [v.values[0].data]; }"}
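>>
>> For instance, with the PHP interpolation expanded, the query posted to
>> Riak's /mapred HTTP endpoint would look something like this (the bucket
>> name and seed value are just placeholders):
>>
>> {"inputs":"mybucket","query":[{"map":{"language":"javascript","source":"function(v,k,a) { seed='x81jq2'; return [v.values[0].data]; }"}}]}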
>>
>> Rohman
>>
>> On Thu, 10 Mar 2011 17:47:49 -0800, Keith Dreibelbis <[email protected]>
>> wrote:
>>
>> Thanks for the prompt response, Grant.  I made the configuration change
>> you suggested, and it fixed my problem.  Some followup questions:
>>
>> - is it possible to configure this dynamically on a per-bucket basis, or
>> just per-server like it is now?
>> - is this fixed in a newer version?
>>
>> On Thu, Mar 10, 2011 at 2:56 PM, Grant Schofield <[email protected]> wrote:
>>
>>> There are currently some bugs in the MapReduce caching system.  The best
>>> thing to do is to disable the feature; on 0.13 you can do this by editing
>>> or adding the vnode_cache_entries entry in the riak_kv section of your
>>> app.config.  The entry would look like:
>>> {vnode_cache_entries, 0},
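>>>
>>> In context, the riak_kv section of app.config would look something like
>>> this (a sketch; your other riak_kv settings are elided):
>>>
>>> {riak_kv, [
>>>     %% ... existing riak_kv settings ...
>>>     {vnode_cache_entries, 0}
>>> ]},
>>>
>>> The node needs to be restarted for app.config changes to take effect.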
>>>
>>>  Grant Schofield
>>> Developer Advocate
>>> Basho Technologies
>>>
>>> On Mar 10, 2011, at 4:16 PM, Keith Dreibelbis wrote:
>>>
>>> Hi riak-users,
>>>
>>> I'm trying to do a map/reduce query from Java against a 0.13 server, and I
>>> get inconsistent results.  What I'm doing should be pretty simple.  I'm
>>> hoping someone will notice an obvious error in here, or have some insight.
>>>
>>> This is an automated test.  I'm doing a simple query where I'm trying to
>>> get the keys for records with a certain field value.  In SQL it would look
>>> like "SELECT id FROM table WHERE age = '32'".  In Java I'm invoking it
>>> like this:
>>> MapReduceResponse r = riak.mapReduceOverBucket(getBucket())
>>>     .map(JavascriptFunction.anon(func), true)
>>>     .submit();
>>>  where riak is a RiakClient, getBucket() returns the name of the bucket,
>>> and func is a string that looks like:
>>> function(value, keyData, arg) {
>>>     var data = Riak.mapValuesJson(value)[0];
>>>     if (data.age == "32")
>>>         return [value.key];
>>>     else
>>>         return [];
>>> }
>>> No reduce phase.  All entries in the example bucket are JSON and have an
>>> age field.  This initially works correctly: it gets back the matching
>>> records as expected.  It also works in curl.  It's an automated test, so
>>> each run uses a different bucket.  After about a dozen queries, this
>>> starts to fail: it returns an empty result when it should have found
>>> records.  It fails in curl at the same time.
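>>>
>>> For reference, here's roughly the equivalent curl request (assuming the
>>> default HTTP port; the bucket name is a placeholder):
>>>
>>> curl -X POST http://127.0.0.1:8098/mapred \
>>>   -H "Content-Type: application/json" \
>>>   -d '{"inputs":"test_bucket","query":[{"map":{"language":"javascript","source":"function(value, keyData, arg) { var data = Riak.mapValuesJson(value)[0]; if (data.age == \"32\") return [value.key]; else return []; }"}}]}'
>>>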
>>> I initially suspected this might have something to do with running map
>>> reduce too soon after writing, before the write was available on all
>>> nodes.  However, I changed the bucket schema entries for w, r, rw, and dw
>>> from "quorum" to "all", and this still happens (is there another bucket
>>> setting I missed?).  In addition, I only have 3 nodes (I'm using the
>>> dev123 example), and I'm running curl long enough after the writes.
>>> Here's the strange part that makes me suspicious.  If I make
>>> insignificant changes to the query (for example, changing double quotes to
>>> single quotes, or adding whitespace or extra parentheses), it suddenly
>>> works again.  It will work on an existing bucket and on subsequent tests,
>>> but again only for about a dozen queries before it starts failing.  Same
>>> behavior in curl.  This makes me suspect that the server is doing some
>>> incorrect caching of this JS function, keyed on the function string.
>>>
>>> Any explanation of what's going on?
>>>
>>> Keith
>>>
>> --
>> Antonio Rohman Fernandez <http://mahalostudio.com>
>> CEO, Founder & Lead Engineer
>> [email protected]
>>
>> Projects:
>> MaruBatsu.es <http://marubatsu.es>
>> PupCloud.com <http://pupcloud.com>
>> Wedding Album <http://wedding.mahalostudio.com>
>>
>>
>
>
>
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
