Re: HBase balancer policy issue

2011-06-02 Thread Anty
If region A is compacting and the load balancer asks the region server to close it, then with hbase.hstore.close.check.interval = 0 region A can't be closed until the compaction completes. During this period, can region A still accept mutation requests? Am I right? Does this have bad

Re: Web UI Reload?

2011-06-02 Thread Joey Echeverria
+1 on forwarding after 5 seconds. On Thu, Jun 2, 2011 at 8:10 AM, Lars George lars.geo...@gmail.com wrote: Hi, When you click on split/compact in the table.jsp you get: Split request accepted. Reload. Reloading is braindead as it triggers the action over and over again. We should have a

RegionServer Page

2011-06-02 Thread Lars George
Hi, We have: Region names are made of the containing table's name, a comma, the start key, a comma, and a randomly generated region id. To illustrate, the region named domains,apache.org,5464829424211263407 is party to the table domains, has an id of 5464829424211263407 and the
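
The naming scheme quoted above (table name, comma, start key, comma, region id) can be sketched as a small parser. This is only an illustration: RegionNameSketch is a hypothetical helper, not an HBase class. It assumes the table name contains no comma and the region id is everything after the last comma, and it reads the example name as domains,apache.org,5464829424211263407.

```java
// Hypothetical helper for illustration only; this is not an HBase API.
final class RegionNameSketch {
    final String table;     // text before the first comma
    final String startKey;  // text between the first and last comma
    final String regionId;  // text after the last comma

    RegionNameSketch(String table, String startKey, String regionId) {
        this.table = table;
        this.startKey = startKey;
        this.regionId = regionId;
    }

    // Splits on the first and last comma, so a start key that itself
    // contains commas is kept whole (an assumption of this sketch).
    static RegionNameSketch parse(String regionName) {
        int first = regionName.indexOf(',');
        int last = regionName.lastIndexOf(',');
        if (first < 0 || first == last) {
            throw new IllegalArgumentException(
                "expected table,startKey,regionId but got: " + regionName);
        }
        return new RegionNameSketch(
            regionName.substring(0, first),
            regionName.substring(first + 1, last),
            regionName.substring(last + 1));
    }

    public static void main(String[] args) {
        RegionNameSketch r = parse("domains,apache.org,5464829424211263407");
        System.out.println(r.table + " | " + r.startKey + " | " + r.regionId);
    }
}
```

Splitting on the first and last comma rather than on every comma is a design choice of the sketch: it tolerates an empty start key and a start key with embedded commas.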

Re: HBase balancer policy issue

2011-06-02 Thread Schubert Zhang
Thanks Ted. On Thu, Jun 2, 2011 at 4:19 PM, Anty anty@gmail.com wrote: If region A is compacting and the load balancer asks the region server to close it, then with hbase.hstore.close.check.interval = 0 region A can't be closed until the compaction completes; during this period,

Lucene's FST for the block index

2011-06-02 Thread Jason Rutherglen
Lucene has a compact FST (Finite State Transducer) that's used for the sorted terms index. I think this is the same type of functionality as the HBase block index, eg, a sorted index of row ids? The FST is more compact keeping every Nth row id in RAM. Does the HFile format allow pluggable block
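
For readers unfamiliar with the baseline being compared against, "keeping every Nth row id in RAM" can be sketched as a sparse sorted index. This is a rough illustration only, not HFile's actual block index and not Lucene's FST; the class name SparseIndexSketch and its methods are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sparse index for illustration; not HBase or Lucene code.
final class SparseIndexSketch {
    // Sampled row id -> its position in the full sorted list.
    private final TreeMap<String, Integer> sampled = new TreeMap<>();

    // Keeps every Nth row id from an already sorted list.
    SparseIndexSketch(List<String> sortedRowIds, int everyNth) {
        for (int i = 0; i < sortedRowIds.size(); i += everyNth) {
            sampled.put(sortedRowIds.get(i), i);
        }
    }

    // Position of the sampled row id at or before rowId, i.e. where a
    // scan would start; -1 if rowId sorts before the first sampled entry.
    int seekStart(String rowId) {
        Map.Entry<String, Integer> e = sampled.floorEntry(rowId);
        return e == null ? -1 : e.getValue();
    }

    // Number of row ids actually held in memory.
    int size() {
        return sampled.size();
    }
}
```

The memory cost here grows with the number of sampled keys and their full length, since each sampled key is stored whole.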

Re: Lucene's FST for the block index

2011-06-02 Thread Ted Yu
Currently BlockIndex is an inner class of HFile. It would be nice to support pluggable block index implementations. FYI On Thu, Jun 2, 2011 at 9:09 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Lucene has a compact FST (Finite State Transducer) that's used for the sorted terms

Re: Lucene's FST for the block index

2011-06-02 Thread Andrew Purtell
It would be nice to support pluggable block index implementations. +1 Perhaps we do this in the scope of HFile v2? https://issues.apache.org/jira/browse/HBASE-3857 - Andy --- On Thu, 6/2/11, Ted Yu yuzhih...@gmail.com wrote: From: Ted Yu yuzhih...@gmail.com Subject: Re: Lucene's FST

Re: Lucene's FST for the block index

2011-06-02 Thread Jason Rutherglen
The FST is more compact [than] keeping every Nth row id in RAM. It would be nice to support pluggable block index implementations Maybe we should try to support this prior to the HFile v2, which instead uses a tree structure to layout the blocks? Eg, a pluggable block index then becomes more

Re: Web UI Reload?

2011-06-02 Thread Rakesh Aggarwal
+1 on forwarding after 5 seconds. This part of the user interface is very confusing. -Rakesh On Thu, Jun 2, 2011 at 5:30 AM, Joey Echeverria j...@cloudera.com wrote: +1 on forwarding after 5 seconds. On Thu, Jun 2, 2011 at 8:10 AM, Lars George lars.geo...@gmail.com wrote: Hi, When you

Re: Web UI Reload?

2011-06-02 Thread Stack
Make an issue Lars. Add to it the two +1s below! St.Ack On Thu, Jun 2, 2011 at 5:10 AM, Lars George lars.geo...@gmail.com wrote: Hi, When you click on split/compact in the table.jsp you get: Split request accepted. Reload. Reloading is braindead as it triggers the action over and over

Re: prefix compression

2011-06-02 Thread Todd Lipcon
Hey Matt, Interesting email, and also something I've been thinking about recently. Unfortunately, I think one of the big prerequisites before we can start thinking about actual compression algorithms is some refactoring around how KeyValue is used. Currently, KeyValue exposes byte arrays in a

Re: prefix compression

2011-06-02 Thread Jason Rutherglen
Memstore dynamic row key compaction This is interesting though I think extremely difficult. We need this for Lucene realtime search with the RAM terms dictionary, which is slated to use a ConcurrentSkipListMap, eg it's lock free and it works. As Todd mentions, the KeyValue's need for access to

RE: prefix compression

2011-06-02 Thread Jonathan Gray
I'm here! Still parsing through the stuff in this e-mail but I agree that many approaches will require rejiggering of KeyValue in a significant way. I have made some attempts at this but nothing is very far. It's definitely something that we are hoping to put some resources into over the

RE: Lucene's FST for the block index

2011-06-02 Thread Abinash Karana (Bizosys)
Hi, I developed HSearch, a NoSQL search engine (http://bizosyshsearch.sourceforge.net/) that uses HBase for data storage. We indexed roughly 10 times the volume of Wikipedia, distributed across 10 Amazon EC2 machines, and served searches for 100 concurrent users. It works wonderfully. I presented my findings in

Re: prefix compression

2011-06-02 Thread Jason Rutherglen
I agree about starting with the block index since it's a relatively isolated piece of code. Could possibly be done without KeyValue modifications. Right. I don't think any KeyValue changes will be required. I'm trying to get more info about whether we can MMap the FST, then store all the keys

Re: prefix compression

2011-06-02 Thread Stack
At a high level this sounds great. Inline below is some feedback and a bit of history on how we got here in case it helps: On Thu, Jun 2, 2011 at 3:28 PM, Matt Corgan mcor...@hotpads.com wrote: * refer to prefix compression as compaction to avoid interfering with traditional compression

Re: prefix compression

2011-06-02 Thread Stack
On Thu, Jun 2, 2011 at 8:17 PM, Matt Corgan mcor...@hotpads.com wrote: What about turning KeyValue into an interface with only the essential getXX() methods?  Whatever is using the KeyValue shouldn't *have* to know what sort of structure it's backed by, but could figure out the implementation
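
The idea quoted above, an interface with only essential getters so callers need not know the backing structure, might look roughly like the sketch below. All names here (CellView, SimpleCell, PrefixedCell) are hypothetical and chosen for illustration; none of this is HBase's KeyValue API, and only a row getter and a value getter are shown.

```java
import java.util.Arrays;

// Hypothetical interface for illustration only; not HBase's KeyValue.
interface CellView {
    byte[] getRow();
    byte[] getValue();
}

// Backed by plain, separate byte arrays.
final class SimpleCell implements CellView {
    private final byte[] row;
    private final byte[] value;

    SimpleCell(byte[] row, byte[] value) {
        this.row = row;
        this.value = value;
    }

    public byte[] getRow() { return row; }
    public byte[] getValue() { return value; }
}

// Backed by a prefix shared with neighboring cells plus a per-cell suffix;
// the full row is materialized only when a caller asks for it.
final class PrefixedCell implements CellView {
    private final byte[] sharedPrefix;
    private final byte[] rowSuffix;
    private final byte[] value;

    PrefixedCell(byte[] sharedPrefix, byte[] rowSuffix, byte[] value) {
        this.sharedPrefix = sharedPrefix;
        this.rowSuffix = rowSuffix;
        this.value = value;
    }

    public byte[] getRow() {
        // Copy the prefix into a new array sized for prefix + suffix,
        // then append the suffix after it.
        byte[] row = Arrays.copyOf(sharedPrefix, sharedPrefix.length + rowSuffix.length);
        System.arraycopy(rowSuffix, 0, row, sharedPrefix.length, rowSuffix.length);
        return row;
    }

    public byte[] getValue() { return value; }
}
```

Code written against CellView sees the same row bytes from either implementation, which is the property the proposal relies on. The cost is an allocation per getRow() call in the prefixed variant.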

Re: prefix compression

2011-06-02 Thread Todd Lipcon
On Thu, Jun 2, 2011 at 9:51 PM, Stack st...@duboce.net wrote: On Thu, Jun 2, 2011 at 8:17 PM, Matt Corgan mcor...@hotpads.com wrote: What about turning KeyValue into an interface with only the essential getXX() methods? Whatever is using the KeyValue shouldn't *have* to know what sort of