On Fri, Oct 28, 2011 at 4:30 AM, Rita <[email protected]> wrote:
> Couple of questions:
> What is the best delimiter for a key? Does it even matter? I read somewhere
> that using a \t is optimal for a reason.
>

Do without a delimiter if you can. Just make the row key elements a fixed
size. It looks, though, as if your key schema would require a delimiter
(I'm guessing 'server' can be anything -- or can it be constrained so all
server names are the same size?). If you have to have a delimiter, choose
one that is illegal in both a server name and a user name, so you can be
sure it never shows up in either and throws off your parse.

> For these types of queries I have been using filters particularly,
> RegexStringComparator
> (w/start&stop) and things seem to work to an extent. I was wondering is this
> the correct way to query or is there a more optimal way?
>

Regex'ing over keys will be expensive. HBase is all bytes. To regex, you
need to change the bytes into a String. Java Strings are internationalized
and multi-byte natively, so creating them has a cost. Can you build your
key from raw bytes and do byte comparisons in your filtering?

> I also couldnt find any examples using filters for timeseries data, is there
> a place I should be looking at?
>

I thought tsdb used filters?

St.Ack
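
To make the fixed-size suggestion concrete, here is a rough sketch in plain Java (no HBase classes) of a delimiter-free key and a byte-level comparison. Everything specific in it is an assumption for illustration only: the class name FixedWidthRowKey, the 16- and 8-byte field widths, the zero-byte padding, the ASCII-only names, and the trailing 8-byte timestamp are not taken from the original schema.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical layout: [server: 16 bytes][user: 8 bytes][timestamp: 8 bytes].
public class FixedWidthRowKey {
    static final int SERVER_WIDTH = 16; // assumed maximum server-name length
    static final int USER_WIDTH = 8;    // assumed maximum user-name length

    // Encode an ASCII name and right-pad it with zero bytes to a fixed width.
    static byte[] pad(String s, int width) {
        byte[] raw = s.getBytes(StandardCharsets.US_ASCII);
        if (raw.length > width) {
            throw new IllegalArgumentException("too long for field: " + s);
        }
        return Arrays.copyOf(raw, width); // copyOf fills the tail with zeros
    }

    // Concatenate the fixed-width fields; no delimiter is needed to parse them.
    static byte[] rowKey(String server, String user, long timestampMillis) {
        return ByteBuffer.allocate(SERVER_WIDTH + USER_WIDTH + Long.BYTES)
                .put(pad(server, SERVER_WIDTH))
                .put(pad(user, USER_WIDTH))
                .putLong(timestampMillis) // big-endian by default
                .array();
    }

    // Byte-for-byte prefix test; no String is ever created from the key.
    static boolean hasPrefix(byte[] key, byte[] prefix) {
        if (key.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (key[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Compare only the user field, which always starts at a known offset.
    static boolean userEquals(byte[] key, String user) {
        byte[] want = pad(user, USER_WIDTH);
        for (int i = 0; i < USER_WIDTH; i++) {
            if (key[SERVER_WIDTH + i] != want[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] key = rowKey("web01", "rita", 1319772600000L);
        System.out.println(hasPrefix(key, pad("web01", SERVER_WIDTH)));
        System.out.println(userEquals(key, "rita"));
    }
}
```

Because every field sits at a known offset, a filter can compare the relevant byte range directly instead of decoding the key and running a regex over it. The padded server bytes could also serve as a scan start row or as the argument to a prefix-style filter such as HBase's PrefixFilter, though whether that fits depends on your actual schema.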
