I suspect you will need to explicitly encode your keys to UTF-8 before writing, then. (And decode when reading.)
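
To make that concrete, here is a rough sketch in Python (not your Ruby client, and not anything from Thrift or Cassandra itself). The helper names mangle, encode_key and decode_key are made up for illustration. The first one only simulates what I assume is happening somewhere on the path: the raw bytes get decoded as UTF-8, and every invalid byte is replaced with U+FFFD, whose UTF-8 form is exactly the "\357\277\275" you are seeing. The other two show one possible workaround, assuming the keys are arbitrary bytes: map each byte to the code point of the same value, then UTF-8 encode, and reverse that on read.

```python
# Sketch only: these helpers are hypothetical, not part of Thrift or Cassandra.

def mangle(raw: bytes) -> bytes:
    """Simulate a lossy UTF-8 round trip (assumed cause of the mangling).

    Bytes that are not valid UTF-8 are replaced with U+FFFD.
    """
    return raw.decode("utf-8", errors="replace").encode("utf-8")


def encode_key(raw: bytes) -> bytes:
    """Turn arbitrary bytes into valid UTF-8 before sending.

    Each byte becomes the code point of the same value (Latin-1),
    which is then UTF-8 encoded.
    """
    return raw.decode("latin-1").encode("utf-8")


def decode_key(wire: bytes) -> bytes:
    """Inverse of encode_key: recover the original bytes when reading."""
    return wire.decode("utf-8").encode("latin-1")


if __name__ == "__main__":
    # Two different one-byte keys collapse to the same replacement character.
    print(mangle(b"\205") == mangle(b"\300") == b"\357\277\275")

    # Encoded first, the keys stay distinct and survive the same round trip.
    for key in (b"\205", b"\300"):
        wire = encode_key(key)
        print(mangle(wire) == wire, decode_key(wire) == key)
```

The cost is that keys with bytes above \177 take two bytes each on the wire, and anything that reads the data has to apply the same decoding.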

My reading of the relevant issues (https://issues.apache.org/jira/browse/THRIFT-395, https://issues.apache.org/jira/browse/THRIFT-414) is that this won't be fixed any time soon.

-Jonathan

On Mon, Dec 7, 2009 at 4:56 PM, Edmond Lau <[email protected]> wrote:
> This particular client was in Ruby.
>
> On Mon, Dec 7, 2009 at 2:49 PM, Jonathan Ellis <[email protected]> wrote:
>> (bugs in thrift, that is)
>>
>> On Mon, Dec 7, 2009 at 4:49 PM, Jonathan Ellis <[email protected]> wrote:
>>> what language are your clients in? there are definitely some bugs
>>> there when communicating b/t client and server of different languages.
>>> :(
>>>
>>> On Mon, Dec 7, 2009 at 4:43 PM, Edmond Lau <[email protected]> wrote:
>>>> I'm using non-ascii keys on Cassandra, relatively close to trunk at
>>>> r880926, and some of my keys get mangled.
>>>>
>>>> As a simple test case, if I insert a one-byte key anywhere between
>>>> \200 and \377 (octal for 128 to 255) through the thrift interface, and
>>>> then query back my data with multi get, I get a hash back that has
>>>> "\357\277\275" as the key. All those one-byte keys get mapped to the
>>>> same bucket, so if I insert with the key \205, I get the data back
>>>> when querying for \300. So either a) there's a bug in thrift, b)
>>>> Cassandra doesn't support non-ascii keys, or c) Cassandra is mangling
>>>> my key somewhere.
>>>>
>>>> Has anyone else run into this issue?
>>>>
>>>> Edmond
>>>>
>>>
>>
>
