Re: hbase key design to efficient query on base of 2 or more column

2014-05-19 Thread Michael Segel
Whoa! BAD BOY. This isn’t a good idea for a secondary index. You have a row key (primary index) which is time. The secondary is a filter… with 3 choices. HINT: Do you really want a secondary index based on a field that only has 3 choices for a value? What are they teaching in school these

Re: hbase key design to efficient query on base of 2 or more column

2014-05-19 Thread Shushant Arora
I cannot apply a server-side filter. The second requirement is not just to get users in the Supreme category, but the distribution of users by category: 1. How many Supreme, how many Normal, and how many Medium users there are to date. On Mon, May 19, 2014 at 12:58 PM, Michael Segel michael_se...@hotmail.com wrote:

Re: hbase key design to efficient query on base of 2 or more column

2014-05-19 Thread Michael Segel
The point is that a field with a small, finite set of values is not a good candidate for indexing using an inverted table, a b-tree, etc. I’d say that you’re actually going to be better off using a scan with a start and stop row, then doing the counts on the client side. So as
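A minimal sketch of the "scan with a start and stop row, then count on the client" idea. It does not use the HBase client API: a `TreeMap` stands in for a table whose row keys are kept in sorted order, and the key layout (`<yyyy-MM-dd>|<userId>`), the sample rows, and the class and method names are all assumptions made for illustration. The range is start-inclusive and stop-exclusive, which mirrors the default behaviour of an HBase scan.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

final class RangeScanCount {
    // Stand-in for an HBase table: row keys are kept in lexicographic order.
    // Hypothetical key layout: "<yyyy-MM-dd>|<userId>"; the value is the user's category.
    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("2013-12-30|u01", "Normal");
        table.put("2014-01-01|u02", "Supreme");
        table.put("2014-02-10|u03", "Normal");
        table.put("2014-03-05|u04", "Medium");
        table.put("2014-05-16|u05", "Supreme");
        table.put("2014-05-17|u06", "Medium");
        return table;
    }

    // Walk only the rows in [startRow, stopRow) and tally categories on the client side.
    static Map<String, Integer> countByCategory(NavigableMap<String, String> table,
                                                String startRow, String stopRow) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String category : table.subMap(startRow, true, stopRow, false).values()) {
            counts.merge(category, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // The stop row is exclusive, so pass the day after the last wanted date.
        System.out.println(countByCategory(sampleTable(), "2014-01-01", "2014-05-17"));
    }
}
```

Because only three categories exist, the client-side map stays tiny no matter how many rows fall inside the date range; the cost is dominated by the range scan itself.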

Re: hbase key design to efficient query on base of 2 or more column

2014-05-19 Thread Shushant Arora
OK, but what if I have two multi-valued dimensions on which I have to analyse the number of users? Say category can have 50 values and another dimension is the country of the user (say 100+ values). I need a weekly count by category and country, plus an overall distinct user count by category and country. How to

Re: hbase key design to efficient query on base of 2 or more column

2014-05-19 Thread James Taylor
If you use Phoenix, queries would leverage our Skip Scan: http://phoenix-hbase.blogspot.com/2013/05/demystifying-skip-scan-in-phoenix.html Assuming a row key made up of a low cardinality first value (like a byte representing an enum), followed by a high cardinality second value (like a date/time
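A rough sketch of the idea behind skipping over a low-cardinality leading key part. This is not Phoenix's implementation and uses no Phoenix or HBase API: a `TreeMap` stands in for the sorted table, and the key layout (`<category>|<yyyy-MM-dd>|<userId>`), the one-letter category codes, and all names are hypothetical. For each category prefix, the code seeks directly to the start date and reads up to the stop date, so rows outside the date range are never visited.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

final class SkipScanSketch {
    // Hypothetical one-letter codes standing in for the enum byte in the row key.
    static final String[] CATEGORIES = {"M", "N", "S"};

    // Stand-in for a sorted table. Hypothetical key layout: "<category>|<yyyy-MM-dd>|<userId>".
    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("M|2014-03-05|u04", "");
        table.put("M|2014-05-17|u06", "");
        table.put("N|2013-12-30|u01", "");
        table.put("N|2014-02-10|u03", "");
        table.put("S|2014-01-01|u02", "");
        table.put("S|2014-05-16|u05", "");
        return table;
    }

    // One bounded seek per category: [category|startDate, category|stopDate).
    static Map<String, Integer> countPerCategory(NavigableMap<String, String> table,
                                                 String startDate, String stopDate) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String category : CATEGORIES) {
            String from = category + "|" + startDate;
            String to = category + "|" + stopDate;
            counts.put(category, table.subMap(from, true, to, false).size());
        }
        return counts;
    }

    public static void main(String[] args) {
        // The stop date is exclusive, so pass the day after the last wanted date.
        System.out.println(countPerCategory(sampleTable(), "2014-01-01", "2014-05-17"));
    }
}
```

The number of seeks grows with the number of distinct leading values, which is why this layout suits a first key part with only a handful of values.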

RE: hbase key design to efficient query on base of 2 or more column

2014-05-19 Thread Vladimir Rodionov
"I cannot apply server side filter." Why is that? Are you using stock HBase or some other API-compatible product? Best regards, Vladimir Rodionov Principal Platform Engineer Carrier IQ, www.carrieriq.com e-mail: vrodio...@carrieriq.com From: Shushant

Re: hbase key design to efficient query on base of 2 or more column

2014-05-19 Thread Shushant Arora
By server-side filter, do you mean partitioning the data across multiple HBase tables, one for each category, or something else? On Mon, May 19, 2014 at 11:05 PM, Vladimir Rodionov vrodio...@carrieriq.com wrote: I cannot apply server side filter. Why is that? Are you using stock HBase or some

RE: hbase key design to efficient query on base of 2 or more column

2014-05-19 Thread Vladimir Rodionov
Nope. A Filter allows you to customize a Scan or Get operation. See the HBase Javadoc for the org.apache.hadoop.hbase.filter.Filter class. Best regards, Vladimir Rodionov Principal Platform Engineer Carrier IQ, www.carrieriq.com e-mail: vrodio...@carrieriq.com From:

hbase key design to efficient query on base of 2 or more column

2014-05-17 Thread Shushant Arora
Hi, I have a requirement to query my data based on date and user category. User category can be Supreme, Normal, or Medium. I want to query how many new users there are in my table in the date range 2014-01-01 to 2014-05-16, by category. Another requirement is to query how many users of Supreme