Overhead of Bloomfilters

2011-01-25 Thread Lars George
Hi, (Probably aimed at Nicolas.) Do we have a rough formula for the overhead, i.e. the size of the Bloom filters at row and column granularity, as a function of, for example, the KV count and average sizes (as reported by the HFile main() helper)? Thanks, Lars

Re: Overhead of Bloomfilters

2011-01-25 Thread Nicolas Spiegelberg
A great article for Bloom filter rules of thumb: http://corte.si/posts/code/bloom-filter-rules-of-thumb/ Note that only rules #1 and #2 apply for our use case. Rule #3, while true, isn't as big a worry because we use combinatorial generation for hashes, so the number of 'expensive' hash
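As a rough answer to the sizing question, the textbook Bloom filter formulas can be sketched as below. This is a minimal sketch, not HBase's actual implementation: the class and method names are hypothetical, and it assumes the number of keys n is the count of distinct rows (row granularity) or distinct row/column pairs (column granularity) in the file. The formulas are m = -n ln(p) / (ln 2)^2 bits for a target false-positive rate p, and k = (m/n) ln 2 hash functions.

```java
// Hypothetical helper for estimating Bloom filter size; not part of HBase.
final class BloomSizing {

    /** Bits needed for n keys at false-positive rate p: m = -n ln p / (ln 2)^2. */
    static long optimalBits(long n, double p) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-n * Math.log(p) / (ln2 * ln2));
    }

    /** Hash count that minimises false positives: k = (m / n) ln 2, at least 1. */
    static int optimalHashes(long bits, long n) {
        return Math.max(1, (int) Math.round((double) bits / n * Math.log(2)));
    }

    /** Expected false-positive rate for m bits, n keys, k hashes: (1 - e^(-kn/m))^k. */
    static double falsePositiveRate(long bits, long n, int k) {
        return Math.pow(1 - Math.exp(-(double) k * n / bits), k);
    }

    public static void main(String[] args) {
        long keys = 1_000_000L;          // assumed number of distinct keys in one file
        long bits = optimalBits(keys, 0.01);
        int k = optimalHashes(bits, keys);
        System.out.printf("%d keys -> %d bits (%.1f KB), k=%d, fp=%.4f%n",
                keys, bits, bits / 8.0 / 1024.0, k, falsePositiveRate(bits, keys, k));
    }
}
```

Under these formulas the filter size depends only on the key count and the target error rate, not on the average key or value size, so the KV count (or row count) is the figure that matters for the estimate.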

Re: Overhead of Bloomfilters

2011-01-25 Thread Ted Dunning
See http://en.wikipedia.org/wiki/Double_hashing for information on double hashing. On Tue, Jan 25, 2011 at 8:11 AM, Nicolas Spiegelberg nspiegelb...@fb.com wrote: A great article for Bloom Filter rules of thumb: http://corte.si/posts/code/bloom-filter-rules-of-thumb/ Note that only rules #1
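To illustrate the double hashing idea, here is a minimal sketch in which only two base hashes are computed per key and the i-th probe position is derived as (h1 + i * h2) mod m. Everything here is an assumption for illustration: the class name, the seeded FNV-1a base hash, and the way the second hash is seeded with the first are not taken from HBase's code.

```java
import java.util.BitSet;

// Illustrative Bloom filter using double hashing; hypothetical, not HBase's class.
final class DoubleHashBloom {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    DoubleHashBloom(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** Seeded 32-bit FNV-1a, used here only as a simple stand-in base hash. */
    static int fnv1a(byte[] data, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 0x01000193;
        }
        return h;
    }

    /** Position of the i-th probe: (h1 + i * h2) mod numBits, always non-negative. */
    private int probe(int h1, int h2, int i) {
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void add(byte[] key) {
        int h1 = fnv1a(key, 0);
        int h2 = fnv1a(key, h1) | 1;   // force the stride to be non-zero
        for (int i = 0; i < numHashes; i++) {
            bits.set(probe(h1, h2, i));
        }
    }

    /** False means definitely absent; true means possibly present. */
    boolean mightContain(byte[] key) {
        int h1 = fnv1a(key, 0);
        int h2 = fnv1a(key, h1) | 1;
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(probe(h1, h2, i))) {
                return false;
            }
        }
        return true;
    }
}
```

The point of the scheme is that the cost per lookup is two real hash computations plus cheap integer arithmetic, regardless of how many probe positions the filter uses.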

Missing filter documentation for Stargate

2011-01-25 Thread Lars George
Hi, Am I wrong or is there a lack of documentation for the FilterModel for the filters in Stargate? The wiki page http://wiki.apache.org/hadoop/Hbase/HbaseRest points to old 0.20.4 documentation (although it says that is the new location), and the other page we have is void of details on the filters for

Re: quick method for removing all rows in a table

2011-01-25 Thread Jean-Daniel Cryans
There are slower ways, but no quicker ways as far as I know. J-D On Tue, Jan 25, 2011 at 4:37 PM, Ted Yu yuzhih...@gmail.com wrote: HBase shell provides this command: truncate: Disables, drops and recreates the specified table. Is there a quicker way of removing all rows? Thanks

Re: Looks like duplicate in MemoryStoreFlusher flushSomeRegions()

2011-01-25 Thread Ryan Rawson
The call to compactionRequested() only puts the region on a queue to be compacted, so if there is unintended duplication, it won't actually hold anything up. -ryan On Tue, Jan 25, 2011 at 6:05 PM, mac fang mac.had...@gmail.com wrote: Guys, since flushCache will suspend writes/reads, I

Re: Items to contribute (plan)

2011-01-25 Thread Tatsuya Kawano
Hi Yifeng, #4. Writing Japanese books and documents: I would be glad to work on this one with you. Thanks for your offer. Let me explain a bit more about them. -- Currently I'm authoring a book chapter about HBase for a Japanese NoSQL book. This one is a commercial book from a Japanese

Re: Looks like duplicate in MemoryStoreFlusher flushSomeRegions()

2011-01-25 Thread mac fang
Oh, yes, I checked the code. The method protected CompactionRequest addToRegionsInQueue(HRegion r, int p) contains the check: if (queuedRequest == null || newRequest.getPriority() < queuedRequest.getPriority()) { LOG.trace("Inserting region in queue. " + newRequest);
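The effect of that kind of check can be sketched in isolation: a request for a region is recorded only if the region is not queued yet, or if the new request is more urgent than the one already queued, so a duplicate call is effectively a no-op. This is a simplified, hypothetical sketch rather than the HBase code. It keys regions by name instead of HRegion, omits the actual queue, and assumes that a lower priority number means more urgent.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a de-duplicating compaction request registry.
final class DedupingRequestQueue {
    // region name -> priority of the request currently queued for it
    private final Map<String, Integer> queued = new HashMap<>();

    /**
     * Records the request unless an equally or more urgent one is already queued.
     * Assumption: a lower number means a more urgent request.
     * Returns true if the request was inserted or replaced an existing one.
     */
    synchronized boolean offer(String region, int priority) {
        Integer existing = queued.get(region);
        if (existing == null || priority < existing) {
            queued.put(region, priority);
            return true;
        }
        return false; // duplicate or less urgent request: nothing changes
    }

    synchronized int size() {
        return queued.size();
    }

    synchronized Integer priorityOf(String region) {
        return queued.get(region);
    }
}
```

For example, offering the same region twice at the same priority leaves a single entry, which matches the observation in this thread that an unintended duplicate request does not hold anything up.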