Hi @byungwok. I'm not sure I fully understand your question, but Accumulo 
manages lexicographically ordered arbitrary byte arrays. The APIs that take 
String (or CharSequence) parameters are for convenience, and if a character 
encoding is not provided, they will assume UTF-8 (unless there's a bug that 
we've overlooked, in which case they *may* assume whatever default encoding 
your local JVM is using).
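
To show why the encoding matters, here is a minimal sketch using only the JDK. The class and method names (`ExplicitEncoding`, `toRowBytes`) are made up for illustration and are not part of Accumulo. It encodes the same String with two charsets and shows that the resulting byte arrays differ, which is why it is safer to pick the charset yourself than to rely on a default.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ExplicitEncoding {
    // Hypothetical helper: encode a row ID with an explicit charset so the
    // result does not depend on the JVM's default encoding.
    static byte[] toRowBytes(String row) {
        return row.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String row = "caf\u00e9"; // "café", the last character is non-ASCII

        byte[] utf8 = toRowBytes(row);
        byte[] latin1 = row.getBytes(StandardCharsets.ISO_8859_1);

        // The same String becomes different bytes under different charsets.
        System.out.println(utf8.length);                  // 5
        System.out.println(latin1.length);                // 4
        System.out.println(Arrays.equals(utf8, latin1));  // false
    }
}
```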

If you need to scan/delete/insert something with non-printable (binary) 
characters, you should be able to provide the exact byte array, rather than 
going through a String with a particular encoding.
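
As a rough illustration of why the exact byte array is safer, the JDK-only sketch below checks whether a byte array survives a round trip through a UTF-8 String. The names (`BinaryRowBytes`, `survivesUtf8RoundTrip`) are hypothetical. Arbitrary binary data often does not survive, because invalid byte sequences are replaced during decoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BinaryRowBytes {
    // Hypothetical helper: true if decoding the bytes as UTF-8 and encoding
    // them again yields the original bytes unchanged.
    static boolean survivesUtf8RoundTrip(byte[] original) {
        String decoded = new String(original, StandardCharsets.UTF_8);
        byte[] reencoded = decoded.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(original, reencoded);
    }

    public static void main(String[] args) {
        // Plain ASCII text round-trips without loss.
        byte[] text = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(survivesUtf8RoundTrip(text));   // true

        // 0xff is never valid in UTF-8, so decoding replaces it and the
        // original bytes are lost.
        byte[] binary = new byte[] {(byte) 0xff, 0x00, 0x01};
        System.out.println(survivesUtf8RoundTrip(binary)); // false
    }
}
```

If the data is binary, keeping it as a `byte[]` from end to end and using whichever `byte[]`-accepting methods your Accumulo version offers avoids this kind of loss.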

For convenience, the shell attempts to identify characters which aren't 
printable and displays them in a hex-encoded format. However, the algorithm 
isn't perfect: it treats some characters as non-printable even if your console 
is capable of printing them, and its output can be ambiguous. For example, it 
may be hard to distinguish between the bytes for the literal string `\x00` (a 
literal backslash, followed by a literal 'x', followed by two literal '0' 
characters) and the encoded form of a single null (0) byte, which is also shown 
as `\x00`. The shell has limitations like this, but you should be able to 
distinguish between these cases easily using the Java API directly.
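
The ambiguity can be reproduced with a small JDK-only sketch. The `escape` method below is a simplified stand-in that I wrote for illustration. It is an assumption about the general approach and not the shell's actual algorithm. It prints printable ASCII as-is and everything else as `\xNN`, so two different byte arrays end up with the same display form.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ShellEscapeAmbiguity {
    // Hypothetical escaping routine (not Accumulo's implementation):
    // printable ASCII is kept, every other byte becomes \xNN.
    static String escape(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            int v = b & 0xff; // treat the byte as unsigned
            if (v >= 0x20 && v <= 0x7e) {
                sb.append((char) v);
            } else {
                sb.append(String.format("\\x%02x", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Four bytes: backslash, 'x', '0', '0'.
        byte[] literal = "\\x00".getBytes(StandardCharsets.UTF_8);
        // One byte: the null byte.
        byte[] nullByte = new byte[] {0};

        // Both display identically...
        System.out.println(escape(literal));
        System.out.println(escape(nullByte));

        // ...but the underlying arrays are clearly different.
        System.out.println(literal.length);                    // 4
        System.out.println(nullByte.length);                   // 1
        System.out.println(Arrays.equals(literal, nullByte));  // false
    }
}
```

Comparing lengths or raw byte values, as in the last three lines of `main`, is the kind of check that is easy in Java but not possible from the escaped display alone.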

[ Full content available at: https://github.com/apache/accumulo/issues/668 ]