Worth noting here: the current Java client is entirely UTF-8 centric
and explicitly converts those bytes to UTF-8 strings, so yes, that's
probably the issue here, if I'm understanding things correctly.

Almost everything is copied between the protocol buffer messages and
Java Strings using the ByteString.copyFromUtf8() and
ByteString.toStringUtf8() methods.
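
To make the concern concrete, here is a minimal sketch that uses only the
JDK's own UTF-8 decoder and encoder as a stand-in for
ByteString.toStringUtf8() and ByteString.copyFromUtf8(). It assumes the
protobuf methods treat malformed input the same way the JDK does
(substituting U+FFFD), and that int_to_bin emits each integer as 4 raw
bytes. The little-endian byte order and the class and method names are
illustrative assumptions, not taken from the client or from basho_bench.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class Utf8KeyRoundTrip {

    // ASSUMPTION: int_to_bin writes the integer as 4 raw bytes; little-endian
    // is chosen here purely for illustration.
    static byte[] intToBin(int n) {
        return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(n).array();
    }

    // Decode the key as UTF-8, then encode the String again. This stands in
    // for a toStringUtf8() / copyFromUtf8() round trip (assumed equivalent).
    static byte[] roundTrip(byte[] key) {
        String asString = new String(key, StandardCharsets.UTF_8);
        return asString.getBytes(StandardCharsets.UTF_8);
    }

    // True when the key comes back byte-for-byte identical.
    static boolean survives(byte[] key) {
        return Arrays.equals(key, roundTrip(key));
    }

    // Number of keys in 1..maxKey that survive the round trip.
    static int countIntact(int maxKey) {
        int intact = 0;
        for (int n = 1; n <= maxKey; n++) {
            if (survives(intToBin(n))) {
                intact++;
            }
        }
        return intact;
    }

    // Distinct Strings produced by the keys that do NOT survive. Different
    // binary keys that decode to the same String are indistinguishable
    // to any caller that only sees Strings.
    static Set<String> distinctMangled(int maxKey) {
        Set<String> mangled = new HashSet<>();
        for (int n = 1; n <= maxKey; n++) {
            byte[] key = intToBin(n);
            if (!survives(key)) {
                mangled.add(new String(key, StandardCharsets.UTF_8));
            }
        }
        return mangled;
    }

    public static void main(String[] args) {
        System.out.println("keys intact after round trip: " + countIntact(10000));
        System.out.println("distinct strings among mangled keys: "
                + distinctMangled(10000).size());
    }
}
```

If those assumptions hold, roughly half of the 10,000 keys (the ones whose
bytes are all below 0x80) come back unchanged, and the rest collapse into a
few dozen distinct strings, which would be consistent with the repeated keys
Toby reported.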

This is actually something that is addressed in the new 2.0 Java
client Dave and I are working on.

Thanks,
- Roach

On Tue, Nov 5, 2013 at 5:40 PM, Toby Corkindale
<[email protected]> wrote:
> On 06/11/13 11:30, Evan Vigil-McClanahan wrote:
>>
>> You can replace int_to_bin with int_to_str to make it easier to debug
>> in the future, I suppose.  I am not sure how to get them to be fetched
>> as bytes, without maybe altering the client.
>>
>> You could just attach to the console and run whatever listing command
>> you're running there, which would give you the answer as unfiltered
>> erlang binaries, which are easy to understand.
>
>
> Ah, I'm really not familiar enough with Erlang and Riak to be doing that.
> Which API applies to console commands? I'll take a look. (Is it just the
> same as the Erlang client?)
>
>
>
>> Is this easily replicable on a new cluster?
>
>
> I think it should be -- the only difference over default configuration is
> that LevelDB is configured as the default backend.
> Run basho_bench with the pbc-client test to generate the initial keys and
> you should be set.
>
>
> T
>
>> On Tue, Nov 5, 2013 at 4:17 PM, Toby Corkindale
>> <[email protected]> wrote:
>>>
>>> Hi Evan,
>>> These keys were originally created by basho-bench, using:
>>> {key_generator, {int_to_bin, {uniform_int, 10000}}}.
>>>
>>> Of the 10k keys, it seems half could be removed, but not the other half.
>>>
>>> Now I've tried storing keys with the same key as the un-deleteable ones,
>>> waiting a minute, and then deleting them again... this doesn't seem to
>>> help!
>>>
>>> I don't know if it's significant, but I'm working with the Java client
>>> here (protocol buffers). I note that the bad keys are basically just
>>> bytes, not actual ascii strings, and they do contain nulls.
>>>
>>> Actually, here's something I just noticed -- the keys I'm getting
>>> from the index are repeating! It's the same 39 keys, repeated 128
>>> times.
>>>
>>> O.o
>>>
>>> Are there any known bugs in the PBC interface when it comes to binary
>>> keys? I know the HTTP interface just crashes out completely.
>>>
>>> I'm fetching the keys in a manner that returns strings; is there a way
>>> to fetch them as bytes? Maybe that would work better; I'm wondering if
>>> the client is attempting to convert the bytes into unicode strings and
>>> dropping invalid characters?
>>>
>>>
>>> On 05/11/13 03:44, Evan Vigil-McClanahan wrote:
>>>>
>>>>
>>>> Hi Toby.
>>>>
>>>> It's possible, since they're stored separately, that the objects were
>>>> deleted but the indices were left in place because of some error (e.g.
>>>> the operation failed for some reason between the object removal and
>>>> the index removal).  One of the things on the feature list for the
>>>> next release is AAE of index values, which should take care of this
>>>> case.  This is really rare, but not unknown.  It'd be interesting to
>>>> know how you ended up with so many.
>>>>
>>>> In the meantime, the only way I can think of to get rid of them
>>>> (other than deleting them from the console, which would require taking
>>>> nodes down and a lot of manual effort), would be to write another
>>>> value that would have the same index, then delete it, which should
>>>> normally succeed.
>>>>
>>>> I'll ask around to see if there is anything that might work better.
>
>
>
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

