That has worked well for me when I don't care about using printable characters.
-chris

On Aug 22, 2011, at 6:06 PM, Mark <[email protected]> wrote:

> How about an empty byte (0x00)?
>
> On 8/22/11 6:03 PM, Chris Tarnas wrote:
>> Generally you want your delimiters to be less than any valid character. For
>> normal character data I've found tab (0x09) works well, it's pretty much the
>> first option. Forward slash (0x2f) is less reliable depending on what other
>> non-alphanumeric characters are allowed.
>>
>> -chris
>>
>> On Aug 22, 2011, at 5:04 PM, Mark wrote:
>>
>>> I have another question though ;)
>>>
>>> Is there a better separator I could use to accomplish natural sorting? Also
>>> what is the preferred way to use start and stop keys when scanning? For
>>> example: STARTROW => "foo", ENDROW => "foo#{what should go here?}".
>>>
>>> Thanks
>>>
>>> On 8/22/11 4:59 PM, Mark wrote:
>>>> After further investigation it turns out it is my use case.
>>>>
>>>> My keys are actually in the form of:
>>>> "idx_query/foo bar/9223372035540718511"
>>>> "idx_query/foo/9223372035540718648"
>>>>
>>>> Now that I look at it, it makes perfect sense why "foo bar" comes before
>>>> "foo/"
>>>>
>>>> Sorry for the confusion.
>>>>
>>>> On 8/22/11 9:16 AM, Chris Tarnas wrote:
>>>>> Good point on the sorting issues with thrift - what client language are
>>>>> you using? Using perl I have not seen inconsistencies in ordering.
>>>>>
>>>>> Do your strings have any particular terminator that is being included but
>>>>> not seen in your output? Can you send out the rowkeys from scans in the
>>>>> HBase shell? That would help narrow it down.
>>>>>
>>>>> -chris
>>>>>
>>>>> On Aug 22, 2011, at 10:55 AM, Jesse Hutton wrote:
>>>>>
>>>>>> I don't use the thrift API, but my suspicion is that it doesn't return
>>>>>> results in the correct order. You're not the only one I've seen report
>>>>>> strange things about results ordering recently, and IIRC they were using
>>>>>> thrift as well.
>>>>>>
>>>>>> Can you verify that the results sort the same using the Java API or even
>>>>>> by looking at it in the HBase shell?
>>>>>>
>>>>>> Jesse
>>>>>>
>>>>>> On Mon, Aug 22, 2011 at 11:28 AM, Mark<[email protected]> wrote:
>>>>>>
>>>>>>> I'm still also confused on how "foo " is less than "foo". Aren't their
>>>>>>> respective bytes [102, 111, 111, 32] and [102, 111, 111]?
>>>>>>>
>>>>>>> On 8/22/11 7:33 AM, Mark wrote:
>>>>>>>
>>>>>>>> Is there any way around this to achieve natural ordering? Thanks
>>>>>>>>
>>>>>>>> On 8/21/11 10:17 PM, Chris Tarnas wrote:
>>>>>>>>
>>>>>>>>> HBase doesn't use the localized sorting rules, it sorts on the byte
>>>>>>>>> value. Space is ASCII 32, a value less than the alphanumeric
>>>>>>>>> characters.
>>>>>>>>>
>>>>>>>>> -chris
>>>>>>>>>
>>>>>>>>> On Aug 21, 2011, at 8:11 PM, Mark<[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> FYI I am using openScannerWithPrefix thrift api call
>>>>>>>>>>
>>>>>>>>>> On 8/21/11 6:47 PM, Mark wrote:
>>>>>>>>>>
>>>>>>>>>>> Why when scanning do I see the following sort order?
>>>>>>>>>>>
>>>>>>>>>>> "foo bar"
>>>>>>>>>>> "foo bar"
>>>>>>>>>>> "foo"
>>>>>>>>>>>
>>>>>>>>>>> I thought that "foo" would be sorted before "foo bar" since this is
>>>>>>>>>>> natural ordering. Why am I seeing these results?
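
For anyone following the thread, here is a rough sketch of the two points discussed above: how the choice of delimiter changes the byte-wise ordering, and one way to fill in the "what should go here?" part of the stop key. It uses plain Python bytes, which compare byte by byte in the same way the thread describes HBase ordering row keys, and it makes no HBase calls. The helper names make_key and stop_key_for_prefix are made up for illustration, and treating an empty stop key as "no upper bound" is an assumption to check against your client.

```python
# Sketch only: Python bytes compare lexicographically by byte value,
# which mirrors the byte-wise row key ordering described in the thread.

def make_key(sep: bytes, *parts: bytes) -> bytes:
    """Join key parts with a delimiter (hypothetical helper)."""
    return sep.join(parts)

def stop_key_for_prefix(prefix: bytes) -> bytes:
    """Return the smallest key that sorts after every key starting with prefix.

    Trailing 0xff bytes cannot be incremented, so they are dropped first.
    An empty result is assumed here to mean "no upper bound".
    """
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return b""
    return trimmed[:-1] + bytes([trimmed[-1] + 1])

terms = [b"foo bar", b"foo"]
suffix = b"9223372035540718511"

# With "/" (0x2f) as delimiter, space (0x20) sorts first, so "foo bar" precedes "foo".
slash_keys = sorted(make_key(b"/", b"idx_query", t, suffix) for t in terms)

# With 0x00 as delimiter, the delimiter sorts before space, so "foo" precedes "foo bar".
nul_keys = sorted(make_key(b"\x00", b"idx_query", t, suffix) for t in terms)

# Scan range covering only the term "foo": start inclusive, stop exclusive.
start_row = make_key(b"\x00", b"idx_query", b"foo", b"")
stop_row = stop_key_for_prefix(start_row)

if __name__ == "__main__":
    print(slash_keys)
    print(nul_keys)
    print(start_row, stop_row)
```

The stop key is the prefix with its last byte incremented, so with a 0x00 delimiter the range for "foo" ends at the prefix followed by 0x01 and never reaches "foo bar". This only holds if 0x00 cannot appear inside the key parts themselves.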
