Since we have lived so long without this information, I guess we can hold out a while longer :-)

Another issue I am working on is reducing the memory footprint. See the following discussion thread:
"One of the regionserver aborted, then the master shut down itself"
We have to bear in mind that there would be around 10K regions or more in production.

Cheers

On Wed, Mar 16, 2011 at 1:46 PM, Jeff Whiting <je...@qualtrics.com> wrote:

> Just a random thought. What about keeping a per region row count? Then if
> you needed to get a row count for a table you'd just have to query each
> region once and sum. Seems like it wouldn't be too expensive because you'd
> just have a row counter variable. It may be more complicated than I'm making
> it out to be though...
>
> ~Jeff
>
>
> On 3/16/2011 2:40 PM, Stack wrote:
>
>> On Wed, Mar 16, 2011 at 1:35 PM, Vivek Krishna<vivekris...@gmail.com>
>> wrote:
>>
>>> 1. How do I count rows fast in hbase?
>>>
>>> First I tried count 'test' , takes ages.
>>>
>>> Saw that I could use RowCounter, but looks like it is deprecated.
>>>
>> It is not. Make sure you are using the one from mapreduce package as
>> opposed to mapred package.
>>
>>
>>> I just need to verify the total counts. Is it possible to see somewhere in
>>> the web interface or ganglia or by any other means?
>>>
>> We don't keep a current count on a table. Too expensive. Run the
>> rowcounter MR job. This page may be of help:
>>
>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#package_description
>>
>> Good luck,
>> St.Ack
>>
>
> --
> Jeff Whiting
> Qualtrics Senior Software Engineer
> je...@qualtrics.com
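
For reference, the per-region counter idea quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions: `RegionRowCountSketch`, `rowAdded`, `rowDeleted` and `tableRowCount` are hypothetical names, not HBase classes or APIs, and the sketch assumes the caller already knows whether a write created a new row or a delete removed one, which is probably where the "more complicated" part comes in.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: these are illustrative names, not HBase classes.
final class RegionRowCountSketch {
    // One counter per region, keyed by a made-up region name.
    private final Map<String, AtomicLong> rowsPerRegion = new ConcurrentHashMap<>();

    // Assumes the caller already knows the write created a brand-new row.
    void rowAdded(String regionName) {
        rowsPerRegion.computeIfAbsent(regionName, k -> new AtomicLong()).incrementAndGet();
    }

    // Assumes the caller already knows the delete removed the whole row.
    void rowDeleted(String regionName) {
        AtomicLong counter = rowsPerRegion.get(regionName);
        if (counter != null) {
            counter.decrementAndGet();
        }
    }

    // Table row count: one lookup per region, then a sum.
    long tableRowCount() {
        long total = 0;
        for (AtomicLong counter : rowsPerRegion.values()) {
            total += counter.get();
        }
        return total;
    }

    public static void main(String[] args) {
        RegionRowCountSketch table = new RegionRowCountSketch();
        table.rowAdded("region-a");
        table.rowAdded("region-a");
        table.rowAdded("region-b");
        table.rowDeleted("region-a");
        System.out.println(table.tableRowCount());
    }
}
```

The sum itself is cheap; the open question is keeping each counter accurate, since a write may land on a row that already exists. The sketch says nothing about the memory cost across 10K or more regions either.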