Finding the latest updated rows

2014-01-20 Thread William Kang
Hi, In HBase, the timestamp is set for each column, not for the entire row. If I want to find the most recently updated rows (a new row was put, only certain columns of some rows were updated, etc.), is there an efficient way to do it? Many thanks. William
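
Since timestamps live on individual cells, one way to picture the question is to treat a row's "last updated" time as the newest timestamp among its cells. The following is a minimal in-memory sketch of that idea in plain Java, not HBase client code; the class LatestRowsSketch, the helper rowsUpdatedSince, and the sample data are all hypothetical. Against a real table, a time-bounded scan (for example via Scan.setTimeRange) would be a natural starting point, though whether it is efficient enough depends on the data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model: row key -> (column name -> cell timestamp in ms).
class LatestRowsSketch {

    // Returns the row keys whose newest cell timestamp is >= sinceMillis.
    static List<String> rowsUpdatedSince(Map<String, Map<String, Long>> table, long sinceMillis) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Map<String, Long>> row : table.entrySet()) {
            // A row's "last updated" time is taken as the max over its cell timestamps.
            long newest = Long.MIN_VALUE;
            for (long ts : row.getValue().values()) {
                newest = Math.max(newest, ts);
            }
            if (newest >= sinceMillis) {
                result.add(row.getKey());
            }
        }
        return result;
    }

    // Builds a small made-up table for illustration only.
    static Map<String, Map<String, Long>> sampleTable() {
        Map<String, Map<String, Long>> table = new TreeMap<>();
        Map<String, Long> row1 = new TreeMap<>();
        row1.put("cf:a", 100L);
        row1.put("cf:b", 250L);
        Map<String, Long> row2 = new TreeMap<>();
        row2.put("cf:a", 120L);
        Map<String, Long> row3 = new TreeMap<>();
        row3.put("cf:a", 90L);
        row3.put("cf:b", 300L);
        table.put("row1", row1);
        table.put("row2", row2);
        table.put("row3", row3);
        return table;
    }

    public static void main(String[] args) {
        // Lists rows with at least one cell written at or after t = 200.
        System.out.println(rowsUpdatedSince(sampleTable(), 200L));
    }
}
```

This walks every row, so it only illustrates the definition of "latest updated"; it says nothing about how to avoid a full scan.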

Hbase 0.90.2 problems

2011-04-09 Thread William Kang
Hi folks, I recently upgraded to hbase 0.90.2 that runs with hadoop 0.20.1. And I got the following errors in the hbase logs: 2011-04-09 02:28:02,429 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181 2011-04-09 02:28:02,430 WARN

Re: Hbase 0.90.2 problems

2011-04-09 Thread William Kang
...@apache.org wrote: Looks like your zookeeper isn't running for some reason, that's usually what connection refused means. Also notice it connects on localhost. J-D On Fri, Apr 8, 2011 at 11:54 PM, William Kang weliam.cl...@gmail.com wrote: Hi folks, I recently upgraded to hbase 0.90.2 that runs

Re: Hbase 0.90.2 problems

2011-04-09 Thread William Kang
recommended by HBase (Apache hadoop-0.20-append branch / CDH3b4) Is your ZooKeeper service running? You'll need to either let HBase run it for you, or run one yourself. On port 2181. On Sat, Apr 9, 2011 at 12:24 PM, William Kang weliam.cl...@gmail.com wrote: Hi folks, I recently upgraded to hbase

Re: More on Column Family versus Column

2010-10-18 Thread William Kang
row and one column family. A row may contain multiple hbase blocks but an hbase block may only contain one row. Thanks, Jacques On Fri, Oct 15, 2010 at 8:54 PM, William Kang weliam.cl...@gmail.com wrote: Hi Jacques, If I understand correctly, it depends on several factors. First

HBase random access in HDFS and block indices

2010-10-18 Thread William Kang
Hi, Recently I have spent some effort trying to understand the mechanisms of HBase in order to exploit possible performance tuning options. Many thanks to the folks in this community who helped with my questions; I have sent a report. But there are still a few questions left. 1. If a HFile block

Re: HBase cluster with heterogeneous resources

2010-10-16 Thread William Kang
HDFS blocks are streaming files, which means you cannot randomly access those HDFS blocks as quickly as in other file systems. So that means if your HBase block is in the middle of an HDFS block, you have to traverse inside it to get to the middle. Right? Can somebody explain how HBase manages to fetch

Re: More on Column Family versus Column

2010-10-15 Thread William Kang
Hi Jacques, If I understand correctly, it depends on several factors. First is the configured block size; second is the typical cell size. A block may have multiple keyvalue pairs. If the block size is bigger than the cell size, a block may have multiple cells, which are stored in block as

Re: Help needed! Performance related questions

2010-10-14 Thread William Kang
Hi guys, Thanks so much for answering my questions. I really appreciate that. They help a lot! I have a few more follow-up questions though. 1. About the row searching mechanism, I understand the part up to where HBase locates the region in which the row resides. I am confused after that. So, I

Re: Help needed! Performance related questions

2010-10-14 Thread William Kang
Hey J-D, Thanks a lot! That has cleared up a lot of my confusion. :) I really appreciate it. William On Thu, Oct 14, 2010 at 2:51 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: 1. about the row searching mechanism, I understand the part before the HBase locate where the row resides in which

Re: coprocessors WAS - Re: Parallel computing on HBase

2010-10-10 Thread William Kang
. Thanks, Mingjie On 10/07/2010 12:44 PM, William Kang wrote: Hi St. Ack, Thanks a lot for your information. I will look them up. If the coprocessors can work with the 0.90 manual balanced hbase, that would be really nice. William On Thu, Oct 7, 2010 at 2:31 PM, Stack st...@duboce.net wrote

Hbase internally row location mechanism

2010-10-10 Thread William Kang
Hi, Can somebody briefly explain how HBase locates a row internally when using Get g = new Get(Bytes.toBytes(RowID)); table.get(g); What type of search algorithm does HBase use to locate rows that are ordered lexicographically? Many thanks! William
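
As a rough illustration of the question above: because row keys are kept in sorted order, finding the region that holds a row can be reduced to finding the greatest region start key that is less than or equal to the row key, which a sorted map answers in logarithmic time. The sketch below is plain Java with made-up region boundaries and server names; RegionLookupSketch and locate are hypothetical and are not part of the HBase API. Real HBase involves more than this (a catalog table lookup, then block indexes inside the store files), and it compares raw bytes, whereas this sketch compares Strings for brevity.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of "which region holds this row key?" using a sorted map.
class RegionLookupSketch {

    // Region start key -> made-up server name. "" is the start key of the first region.
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    void addRegion(String startKey, String server) {
        regionsByStartKey.put(startKey, server);
    }

    // Finds the region whose start key is the greatest key <= rowKey.
    String locate(String rowKey) {
        Map.Entry<String, String> entry = regionsByStartKey.floorEntry(rowKey);
        return entry == null ? null : entry.getValue();
    }

    // Three made-up regions: ["", "g"), ["g", "p"), ["p", end).
    static RegionLookupSketch sample() {
        RegionLookupSketch lookup = new RegionLookupSketch();
        lookup.addRegion("", "server-a");
        lookup.addRegion("g", "server-b");
        lookup.addRegion("p", "server-c");
        return lookup;
    }

    public static void main(String[] args) {
        RegionLookupSketch lookup = sample();
        System.out.println(lookup.locate("apple"));
        System.out.println(lookup.locate("monkey"));
        System.out.println(lookup.locate("zebra"));
    }
}
```

The same floor-style lookup over sorted keys is the general idea behind index-based searches on ordered data; it is shown here only to make the "lexicographic ordering" part of the question concrete.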

Re: coprocessors WAS - Re: Parallel computing on HBase

2010-10-07 Thread William Kang
example uses including examples that resemble strongly that which you would like to do, described below. St.Ack On Wed, Oct 6, 2010 at 11:08 PM, William Kang weliam.cl...@gmail.com wrote: Ryan, thanks for your explanation. It is very clear and helpful. Andy, I think Hbase-2000 is exactly

Re: Can I pick which region server to store my row?

2010-10-06 Thread William Kang
at 8:38 PM, William Kang weliam.cl...@gmail.com wrote: So, I can use the 'move' command to manually balance the load? Is this available in 0.20.6? Was there any automatic balancing mechanism in HBase before, if the replicated blocks are not for load distribution purposes? Thanks. William

Re: Can I pick which region server to store my row?

2010-10-06 Thread William Kang
-storage.html J-D On Tue, Oct 5, 2010 at 8:38 PM, William Kang weliam.cl...@gmail.com wrote: So, I can use the 'move' command to manually balance the load? Is this available in 0.20.6? Was there any automatic balancing mechanism in HBase before, if the replicated blocks are not for load distribution

Can I pick which region server to store my row?

2010-10-05 Thread William Kang
Hi folks, I have a general question about HBase. Can we pick which region server stores a particular row? I am asking because sometimes we want to manually balance the region servers' load. If we could assign particular rows to particular region servers, we can have that

Parallel computing on HBase

2010-10-05 Thread William Kang
Hi guys, Is there any ongoing project on co-processing on region servers? Right now, we have to transfer all the data from the region servers to the client after a query, is that right? This can be slow. Furthermore, the CPUs on the region servers are not fully used. If we could distribute the computation

Re: Can I pick which region server to store my row?

2010-10-05 Thread William Kang
to distribute load, what happens is each block is replicated 3 times by HDFS and this is invisible to HBase. This is done for data safety rather than distributing the load. J-D On Tue, Oct 5, 2010 at 8:05 PM, William Kang weliam.cl...@gmail.com wrote: Hi folks, I have a general question

Re: Parallel computing on HBase

2010-10-05 Thread William Kang
with lots of overhead. Ideally, we want something lightweight that can get results fast. Many thanks. William On Wed, Oct 6, 2010 at 12:01 AM, Jeff Zhang zjf...@gmail.com wrote: You can incorporate map reduce with hbase for parallel computing. On Wed, Oct 6, 2010 at 11:24 AM, William Kang

Re: Limits on HBase

2010-09-07 Thread William Kang
a given row can only be in one region and thus be hosted on one server at a time. JG -Original Message- From: William Kang [mailto:weliam.cl...@gmail.com] Sent: Monday, September 06, 2010 1:57 PM To: hbase-user Subject: Limits on HBase Hi folks, I know

Re: Limits on HBase

2010-09-07 Thread William Kang
are approaching the default block size on HDFS (64MB), you should consider putting the data directly into HDFS rather than HBase. JG -Original Message- From: William Kang [mailto:weliam.cl...@gmail.com] Sent: Tuesday, September 07, 2010 7:36 PM To: user@hbase.apache.org; apurt

Limits on HBase

2010-09-06 Thread William Kang
Hi folks, I know this question may have been asked many times, but I am wondering if there is any update on the optimized cell size (in megabytes) and row size (in megabytes)? Many thanks. William

Re: Limits on HBase

2010-09-06 Thread William Kang
be hosted on one server at a time. JG -Original Message- From: William Kang [mailto:weliam.cl...@gmail.com] Sent: Monday, September 06, 2010 1:57 PM To: hbase-user Subject: Limits on HBase Hi folks, I know this question may have been asked many times, but I am wondering