Re: querying questions

Stack Fri, 28 Oct 2011 09:23:44 -0700

On Fri, Oct 28, 2011 at 4:30 AM, Rita <[email protected]> wrote:
> Couple of questions:
> What is the best delimiter for a key? Does it even matter? I read somewhere
> that using a \t is optimal for a reason.
>


Do without a delimiter if you can.  Just make the row key elements of
fixed size.
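
A rough sketch of what a fixed-size key could look like, in plain Java
with no HBase classes. The field names, the widths (16 and 8 bytes) and
the trailing timestamp are assumptions for illustration only, not your
actual schema.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class FixedWidthKey {
    // Hypothetical widths; pick them from the longest value you must support.
    static final int SERVER_WIDTH = 16;
    static final int USER_WIDTH = 8;

    // Encode a field and zero-pad it on the right to exactly `width` bytes.
    static byte[] pad(String field, int width) {
        byte[] raw = field.getBytes(StandardCharsets.UTF_8);
        if (raw.length > width) {
            throw new IllegalArgumentException("field too long: " + field);
        }
        return Arrays.copyOf(raw, width);
    }

    // server | user | timestamp, each at a fixed offset, so no delimiter
    // is needed to parse the key back apart.
    static byte[] rowKey(String server, String user, long timestampMillis) {
        return ByteBuffer.allocate(SERVER_WIDTH + USER_WIDTH + Long.BYTES)
                .put(pad(server, SERVER_WIDTH))
                .put(pad(user, USER_WIDTH))
                .putLong(timestampMillis) // big-endian by default
                .array();
    }
}
```

Every key is then 32 bytes, and the big-endian long keeps non-negative
timestamps in chronological order under a byte-wise sort.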

It looks, though, like your key schema would require you to have
a delimiter (I'm guessing 'server' can be anything -- or can it be
constrained so all servers have same-sized names?)

If you have to have a delimiter, choose one that is illegal in a
server name or user name so you can be sure it doesn't show up in
either ever and throw off your parse.
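
If you do go with a delimiter, something along these lines would reject
bad fields at write time. The NUL byte is only an assumed choice here
(I'm assuming it can never appear in your server or user names); swap in
whatever is actually illegal in your data.

```java
import java.nio.charset.StandardCharsets;

final class DelimitedKey {
    // Hypothetical delimiter: assumed illegal in both server and user names.
    static final char DELIM = '\u0000';

    // Refuse any field that contains the delimiter, so a later parse
    // can split on it without ambiguity.
    static String check(String field) {
        if (field.indexOf(DELIM) >= 0) {
            throw new IllegalArgumentException("field contains delimiter");
        }
        return field;
    }

    static byte[] rowKey(String server, String user) {
        String key = check(server) + DELIM + check(user);
        return key.getBytes(StandardCharsets.UTF_8);
    }
}
```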

> For these types of queries I have been using filters particularly,
> RegexStringComparator
> (w/start&stop) and things seem to work to an extent. I was wondering is this
> the correct way to query or is there a more optimal way?
>

Regex'ing over keys will be expensive.  HBase is all bytes.  To regex,
you need to change the bytes into a String.  Java Strings are Unicode
and multi-byte natively, so every key has to be decoded into a new
String before the regex can run, and that costs.  Can you make your key
as raw bytes and do byte compares in your filtering?
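
To show what I mean by byte compares, here is a minimal sketch in plain
Java. The class and method names are made up for illustration. On the
HBase side, PrefixFilter and BinaryPrefixComparator are the stock pieces
that do this kind of comparison, but check them against your version.

```java
final class BytePrefix {
    // True if rowKey begins with prefix; no String is ever created.
    static boolean hasPrefix(byte[] rowKey, byte[] prefix) {
        if (rowKey.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (rowKey[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Lexicographic compare treating each byte as unsigned (0..255),
    // which is how row keys are ordered.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}
```

With keys laid out so the part you query on comes first, a start/stop
row plus a prefix check like this covers a lot of what the regex was
doing.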

> I also couldnt find any examples using filters for timeseries data, is there
> a place I should be looking at?
>
>

I thought tsdb used filters?

St.Ack
