In the future, please describe your environment a bit.
The way I approach this is:
First I find the correct command line in
src/main/java/org/apache/hadoop/hbase/mapreduce/package-info.java
Then I issue:
[hadoop@us01-ciqps1-name01 hbase]$ HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` \
    ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.90.1.jar rowcounter packageindex
Then I check the map/reduce tasks at the JobTracker URL.
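
The command above can be wrapped in a small shell function. This is only a sketch under stated assumptions, not something that ships with HBase: the function name run_rowcounter and the DRY_RUN variable are hypothetical, the default jar name is simply the one used in the command above, and HBASE_HOME and HADOOP_HOME are assumed to be set.

```shell
#!/bin/sh
# Hypothetical helper (not part of HBase): wraps the rowcounter command above.
# Assumes HBASE_HOME and HADOOP_HOME point at the HBase and Hadoop installs.

run_rowcounter() {
    # $1 = table name; $2 = HBase jar file name (defaults to the 0.90.1 jar)
    table="$1"
    jar="${2:-hbase-0.90.1.jar}"

    if [ -z "$table" ]; then
        echo "usage: run_rowcounter <table> [hbase-jar-name]" >&2
        return 2
    fi

    # With DRY_RUN set, only print the hadoop command that would be run.
    if [ -n "$DRY_RUN" ]; then
        echo "$HADOOP_HOME/bin/hadoop jar $HBASE_HOME/$jar rowcounter $table"
        return 0
    fi

    # Put the HBase jars and configuration on Hadoop's classpath for this
    # one invocation, then launch the rowcounter MapReduce job.
    HADOOP_CLASSPATH="$("$HBASE_HOME/bin/hbase" classpath)" \
        "$HADOOP_HOME/bin/hadoop" jar "$HBASE_HOME/$jar" rowcounter "$table"
}
```

For example, `DRY_RUN=1 run_rowcounter packageindex` prints the hadoop command without submitting a job, and `run_rowcounter packageindex` submits it.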
On Wed, Mar 16, 2011 at 1:59 PM, Vivek Krishna <[email protected]> wrote:
> I guess it is using the mapred class
>
> 11/03/16 20:58:27 INFO mapred.JobClient: Task Id :
> attempt_201103161245_0005_m_000004_0, Status : FAILED
> java.io.IOException: Cannot create a record reader because of a previous
> error. Please look at the previous logs lines from the task's full log for
> more details.
> at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.createRecordReader(TableInputFormatBase.java:98)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:613)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
> at org.apache.hadoop.mapred.Child.main(Child.java:234)
>
> How do I use the mapreduce class?
> Viv
>
>
>
> On Wed, Mar 16, 2011 at 4:52 PM, Ted Yu <[email protected]> wrote:
>
> > Since we have lived so long without this information, I guess we can hold
> > for longer :-)
> > Another issue I am working on is to reduce memory footprint. See the
> > following discussion thread:
> > "One of the regionserver aborted, then the master shut down itself"
> >
> > We have to bear in mind that there would be around 10K regions or more in
> > production.
> >
> > Cheers
> >
> > On Wed, Mar 16, 2011 at 1:46 PM, Jeff Whiting <[email protected]> wrote:
> >
> > > Just a random thought. What about keeping a per-region row count? Then if
> > > you needed to get a row count for a table, you'd just have to query each
> > > region once and sum. Seems like it wouldn't be too expensive because you'd
> > > just have a row counter variable. It may be more complicated than I'm making
> > > it out to be though...
> > >
> > > ~Jeff
> > >
> > >
> > > On 3/16/2011 2:40 PM, Stack wrote:
> > >
> > >> On Wed, Mar 16, 2011 at 1:35 PM, Vivek Krishna <[email protected]> wrote:
> > >>
> > >>> 1. How do I count rows fast in hbase?
> > >>>
> > >>> First I tried count 'test', which takes ages.
> > >>>
> > >>> Saw that I could use RowCounter, but looks like it is deprecated.
> > >>>
> > >> It is not. Make sure you are using the one from the mapreduce package as
> > >> opposed to the mapred package.
> > >>
> > >>
> > >>> I just need to verify the total counts. Is it possible to see somewhere in
> > >>> the web interface or ganglia or by any other means?
> > >>>
> > >> We don't keep a current count on a table. Too expensive. Run the
> > >> rowcounter MR job. This page may be of help:
> > >>
> > >> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#package_description
> > >>
> > >> Good luck,
> > >> St.Ack
> > >>
> > >
> > > --
> > > Jeff Whiting
> > > Qualtrics Senior Software Engineer
> > > [email protected]
> > >
> > >
> >
>