On 2 Jun 2011, at 19:43, Jacques wrote:
> So the bug is-- using a map reduce job that 'includes' nonexistent bucket/key
> causes an error. Correct?
Yes: if you attempt a map/reduce via the PB interface (from any client) and
the inputs include a [bucket, key] pair that doesn't exist, you will get the
error you are seeing (a closed socket, an EOF on the client, and the JSON
encoding error in the riak log).
>
> I assumed that we would get a "not found" response back, same as with the rest
> interface, and thus assumed my encoding was what was causing the problem. My
> mistake.
Our mistake too; you should have got a not_found. The patch fixes it; it'll be
merged soon.
>
> Upon further reflection, I realized that base64 encoding wouldn't work unless
> I stored the values that way, since it reads the UTF-8 as strings correctly.
> My plan is to store my values directly as byte values. This is easy with the
> protobuf bytes type. However, how would I then encode them in the inputs
> section of the map reduce json block?
I'm confused here: I think you are asking how you can add the bucket/key
inputs to an m/r job when the PB client MapReduceBuilder only allows Strings in
the
addRiakObject(String bucket, String key)
method. I guess you are asking this since you created the objects as byte[]
values with
public RiakObject(ByteString bucket, ByteString key, ByteString content)
And stored them with
public void store(RiakObject value)
but I'm guessing.
If that is the case then I think the best we can do with the current API is to
generate a String from your bytes. I guess the ByteString class that the PB
client uses sends your bytes unmolested, so if you want a String representation
of your bytes you want to encode them with ISO-8859-1. Try
addRiakObject(new String(yourBucketBytes, "ISO-8859-1"), new String(yourKeyBytes, "ISO-8859-1"))
when you create the m/r job with the PB MapReduceBuilder.
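To see why ISO-8859-1 is safe here: it maps every byte value 0x00-0xFF to the character with the same code point, so encoding and decoding round-trips arbitrary bytes losslessly (unlike UTF-8, which mangles invalid sequences). A minimal sketch; the class and helper names below are mine, not part of the client API:

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class Latin1RoundTrip {

    private static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    // ISO-8859-1 maps byte 0xNN to character U+00NN, so every byte
    // sequence is a valid string and no byte is altered.
    static String toLatin1String(byte[] raw) {
        return new String(raw, LATIN1);
    }

    static byte[] fromLatin1String(String s) {
        return s.getBytes(LATIN1);
    }

    public static void main(String[] args) {
        byte[] key = new byte[256];
        for (int i = 0; i < 256; i++) key[i] = (byte) i; // every possible byte value
        String asString = toLatin1String(key);
        System.out.println(Arrays.equals(fromLatin1String(asString), key)); // prints "true"
    }
}
```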
Does that solve your problem?
> I've noticed that there is a secondary erlang format that can be passed for
> map reduce jobs, must I use that? If so, does anyone have an example of a
> generating one of these from within Java?
The java PB client doesn't currently support the application/x-erlang-binary
content-type for map/reduce jobs. I think that only the erlang pb client does.
>
> Two other questions:
> 1. I noticed some earlier discussions about the protobuf client being alpha
> state (I think that was from a number of months ago). Is it generally safe
> to use these days?
Yes, generally safe. I'd recommend it over the java http client for speed, less
so for features.
> 2. Are the request and response threads in Riak separate or sequential? For
> example, if I send 5 normal PbcGetReq requests in quick succession on a
> single socket, does Riak finish the first one before starting on requests 2-5?
> Or does it rather thread the requests out as they come in so it will get 2-5
> simultaneously? I'm asking this because I'm trying to figure out how much I
> should try to reuse a single socket connection.
Talking about the java pb client? Each thread gets its own socket. If you want
to do 5 concurrent gets, create 5 threads, pass a pb riak client to each, and
have each thread do a get: you will get 5 open sockets and 5 concurrent gets.
If those threads do more operations within a second they will reuse the same
connections (there is a Timer thread that reaps connections unused for 1
second).
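The one-get-per-thread pattern might look roughly like this. A sketch only: `RiakFetcher` is a hypothetical stand-in for the PB client, since the real client's class and method names aren't shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentGets {

    // Hypothetical stand-in for the PB client's get operation; the real
    // client's class and method names may differ.
    interface RiakFetcher {
        byte[] fetch(String bucket, String key);
    }

    // One task per key: with the real PB client each worker thread gets its
    // own socket, so the gets run concurrently instead of queueing on one
    // connection.
    static List<byte[]> fetchConcurrently(final RiakFetcher client,
                                          final String bucket,
                                          List<String> keys) {
        ExecutorService pool = Executors.newFixedThreadPool(keys.size());
        List<Future<byte[]>> futures = new ArrayList<Future<byte[]>>();
        for (final String key : keys) {
            futures.add(pool.submit(new Callable<byte[]>() {
                public byte[] call() {
                    return client.fetch(bucket, key);
                }
            }));
        }
        List<byte[]> results = new ArrayList<byte[]>();
        try {
            for (Future<byte[]> f : futures) {
                results.add(f.get()); // blocks until that get completes
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return results;
    }
}
```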
I hope I've gone some way toward answering your questions.
I'd like to get your map/reduce query working, and it seems you've hit a genuine
blind spot in the current API (storing a value with a byte[] bucket/key while
the MapReduceBuilder requires Strings for bucket/key), so I want to find a
workaround and make sure it is fixed in the next version of the API. Thanks for
your patience. If you could send me some code that reproduces your problem (a
github gist is ideal for this), that would make it easier.
Cheers
Russell
>
> Thanks for your help,
> Jacques
>
>
>
>
>
> On Thu, Jun 2, 2011 at 1:00 AM, Russell Brown <[email protected]> wrote:
> Hi Jacques,
>
> Where to start...hm.
>
> On 2 Jun 2011, at 02:47, Jacques wrote:
>
>> I'm using Java and looking to replicate multi-get using a map-only job per
>> everyone's recommendation.
>>
>> My bucket and key names are binary values, not strings.
>>
>> I attempted to run a map reduce job using a json input object, base 64
>> encoding the values. This failed.
>>
>> What is the correct way to submit a pbc map reduce job where the inputs info
>> is binary?
>>
>> Thanks,
>> Jacques
>>
>>
>>
>> Error when trying base 64 values:
>> ** Reason for termination ==
>> ** {json_encode,{bad_term,{not_found,{<<"dGVzdDI=">>,<<"dGVzdEtleQ==">>},
>> undefined}}}
>
> This is a bug in the pb interface of Riak that I have patched; it will be in
> master soon. The error occurs because the {not_found} term you see in the log
> cannot be serialised to JSON. The fix is here:
> https://github.com/basho/riak_kv/pull/103
>
> However, this error just means that your bucket/key combo
> <<"dGVzdDI=">>,<<"dGVzdEtleQ==">> was not found, which is another problem
> altogether.
>
> Are your bucket and key names actual binary values, or base64 encoded
> binary values?
>
> Cheers
>
> Russell
>
>>
>>
>> JSON:
>> {
>>   "inputs": [
>>     ["dGVzdDI=", "dGVzdEtleQ=="],
>>     ["dGVzdDI=", "dGVzdEtleQ=="]
>>   ],
>>   "query": [
>>     {"map": {
>>       "keep": true,
>>       "language": "javascript",
>>       "source": "function(v) { return [v]; }"
>>     }}
>>   ]
>> }
>>
>>
>> _______________________________________________
>> riak-users mailing list
>> [email protected]
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>