On Tue, Jul 5, 2011 at 1:02 PM, Alt Control <[email protected]> wrote:
> Question is - how can I do that efficiently? I don't know if HBase allows me
> to set multiple filters in a single Scan object,
> but I can do that with regex (for example (GOOG|IBM|DELL|.......|n|)), but
> is this the right way?
>
You can pass lists of filters. See
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FilterList.html

To scan a certain time range, make your Scan start (and optionally end)
within the period you are interested in by passing the appropriate start and
stop keys. See setStartRow and setStopRow in
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html

FYI, avoid regexes if you can. They are costly. HBase is all about bytes, so
to do the check it needs to go from bytes to String, then run the regex, and
do this for each compare of all values. It adds up.

St.Ack
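
To make the point about regex cost concrete, here is a minimal sketch in plain Java (no HBase classes) that contrasts a byte-level prefix check with the decode-then-regex route. The row-key layout "<SYMBOL>|<timestamp>", the class name SymbolMatch, and the three symbols are assumptions for illustration only; they are not taken from the original question, and this is not how HBase implements its filters internally.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class SymbolMatch {
    // Assumed (hypothetical) row-key layout: "<SYMBOL>|<timestamp>".
    static final List<byte[]> PREFIXES = Arrays.asList(
            "GOOG|".getBytes(StandardCharsets.UTF_8),
            "IBM|".getBytes(StandardCharsets.UTF_8),
            "DELL|".getBytes(StandardCharsets.UTF_8));

    // Same symbols expressed as a regex over the decoded key.
    static final Pattern REGEX = Pattern.compile("^(GOOG|IBM|DELL)\\|.*");

    // Byte-level check: no String is allocated, the key is compared in place.
    static boolean startsWith(byte[] key, byte[] prefix) {
        if (key.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (key[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // True if the key starts with any of the wanted symbol prefixes.
    static boolean matchesBytes(byte[] rowKey) {
        for (byte[] prefix : PREFIXES) {
            if (startsWith(rowKey, prefix)) {
                return true;
            }
        }
        return false;
    }

    // Regex route: every call first decodes the bytes into a new String,
    // then runs the matcher over it.
    static boolean matchesRegex(byte[] rowKey) {
        String decoded = new String(rowKey, StandardCharsets.UTF_8);
        return REGEX.matcher(decoded).matches();
    }

    public static void main(String[] args) {
        byte[] key = "IBM|20110705".getBytes(StandardCharsets.UTF_8);
        System.out.println("bytes: " + matchesBytes(key));
        System.out.println("regex: " + matchesRegex(key));
    }
}
```

Both methods give the same answer, but the regex version pays for a decode and a matcher run on every key it inspects. If the symbol really is the leading part of the row key, the HBase-side equivalent of matchesBytes would probably be a FilterList built with FilterList.Operator.MUST_PASS_ONE that holds one PrefixFilter per symbol, set on the Scan with setFilter. Whether that fits depends on the actual key design.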
