Re: Row Counters

Ted Yu Wed, 16 Mar 2011 15:21:52 -0700

The connection loss occurred because the job could not find the ZooKeeper quorum.
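
One way to see which quorum a client would be pointed at is to read hbase.zookeeper.quorum out of hbase-site.xml. What follows is a minimal sketch, not a definitive check: the helper name quorum_from_site is made up for illustration, the conf path in the usage line is an assumed default, and the sed expression assumes one XML element per line. If nothing is printed, the property is unset in that file and HBase falls back to its default of localhost.

```shell
#!/bin/sh
# Hypothetical helper: print the value of hbase.zookeeper.quorum from the
# hbase-site.xml file given as $1. Assumes <name> and <value> sit on
# separate lines inside the same <property> element.
quorum_from_site() {
    sed -n '/<name>hbase\.zookeeper\.quorum<\/name>/,/<\/property>/ s/.*<value>\(.*\)<\/value>.*/\1/p' "$1"
}
```

Usage, with an assumed conf location: quorum_from_site /usr/lib/hbase/conf/hbase-site.xml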

Use the command line from my previous email (quoted below).
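
For reference, here is that command adapted to the CDH3B4 jar path from the quoted message. It is a sketch under assumptions: the /usr/lib/hbase and /usr/lib/hadoop defaults and the run_rowcounter helper name are illustrative, not taken from a verified install. The relevant part is that `hbase classpath` is passed as HADOOP_CLASSPATH, which should put the HBase conf directory (and so hbase-site.xml) on the classpath of the client submitting the job.

```shell
#!/bin/sh
# Sketch only: these defaults are assumptions; override them for your install.
HBASE_HOME="${HBASE_HOME:-/usr/lib/hbase}"
HADOOP_HOME="${HADOOP_HOME:-/usr/lib/hadoop}"
HBASE_JAR="${HBASE_JAR:-${HBASE_HOME}/hbase-0.90.1-CDH3B4.jar}"

# Hypothetical helper: run the rowcounter MapReduce job for table $1, with the
# output of `hbase classpath` exported as HADOOP_CLASSPATH for this one command.
run_rowcounter() {
    table="$1"
    HADOOP_CLASSPATH="$("${HBASE_HOME}/bin/hbase" classpath)" \
        "${HADOOP_HOME}/bin/hadoop" jar "${HBASE_JAR}" rowcounter "${table}"
}
```

Usage: run_rowcounter test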


On Wed, Mar 16, 2011 at 3:18 PM, Vivek Krishna <vivekris...@gmail.com> wrote:

> Oops. sorry about the environment.
>
> I am using hadoop-0.20.2-CDH3B4, and hbase-0.90.1-CDH3B4
> and zookeeper-3.3.2-CDH3B4.
>
> I was able to configure jars and run the command,
>
> hadoop jar /usr/lib/hbase/hbase-0.90.1-CDH3B4.jar rowcounter test,
>
> but I get
>
> java.io.IOException: Cannot create a record reader because of a previous
> error. Please look at the previous logs lines from the task's full log for
> more details.
>       at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.createRecordReader(TableInputFormatBase.java:98)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:613)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>       at org.apache.hadoop.mapred.Child.main(Child.java:234)
>
>
> The previous error in the task's full log is ..
>
>
> 2011-03-16 21:41:03,367 ERROR org.apache.hadoop.hbase.mapreduce.TableInputFormat:
> org.apache.hadoop.hbase.ZooKeeperConnectionException:
> org.apache.hadoop.hbase.ZooKeeperConnectionException:
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /hbase
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:988)
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:301)
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:292)
>       at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:155)
>       at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:167)
>       at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:145)
>       at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:91)
>       at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
>       at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:605)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>       at org.apache.hadoop.mapred.Child.main(Child.java:234)
> Caused by: org.apache.hadoop.hbase.ZooKeeperConnectionException:
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /hbase
>       at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:147)
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:986)
>       ... 15 more
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /hbase
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>       at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
>       at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:902)
>       at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:133)
>       ... 16 more
>
>
> I am pretty sure the ZooKeeper master is running on the same machine at
> port 2181.  Not sure why the connection loss occurs.  Do I need
> HBASE-3578 <https://issues.apache.org/jira/browse/HBASE-3578> by any chance?
>
> Viv
>
>
>
>
> On Wed, Mar 16, 2011 at 5:36 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> In the future, describe your environment a bit.
>>
>> The way I approach this is:
>> find the correct commandline from
>> src/main/java/org/apache/hadoop/hbase/mapreduce/package-info.java
>>
>> Then I issue:
>> [hadoop@us01-ciqps1-name01 hbase]$ HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` \
>>     ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-0.90.1.jar rowcounter packageindex
>>
>> Then I check the map/reduce tasks at the job tracker URL.
>>
>> On Wed, Mar 16, 2011 at 1:59 PM, Vivek Krishna <vivekris...@gmail.com> wrote:
>>
>> > I guess it is using the mapred class
>> >
>> > 11/03/16 20:58:27 INFO mapred.JobClient: Task Id :
>> > attempt_201103161245_0005_m_000004_0, Status : FAILED
>> > java.io.IOException: Cannot create a record reader because of a previous
>> > error. Please look at the previous logs lines from the task's full log for
>> > more details.
>> >  at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.createRecordReader(TableInputFormatBase.java:98)
>> >  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:613)
>> >  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
>> >  at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
>> >  at java.security.AccessController.doPrivileged(Native Method)
>> >  at javax.security.auth.Subject.doAs(Subject.java:396)
>> >  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>> >  at org.apache.hadoop.mapred.Child.main(Child.java:234)
>> >
>> > How do I use mapreduce class?
>> > Viv
>> >
>> >
>> >
>> > On Wed, Mar 16, 2011 at 4:52 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>> >
>> > > Since we have lived so long without this information, I guess we can
>> > > hold for longer :-)
>> > > Another issue I am working on is to reduce memory footprint. See the
>> > > following discussion thread:
>> > > One of the regionserver aborted, then the master shut down itself
>> > >
>> > > We have to bear in mind that there would be around 10K regions or more
>> > > in production.
>> > >
>> > > Cheers
>> > >
>> > > On Wed, Mar 16, 2011 at 1:46 PM, Jeff Whiting <je...@qualtrics.com> wrote:
>> > >
>> > > > Just a random thought.  What about keeping a per-region row count?
>> > > > Then if you needed to get a row count for a table you'd just have to
>> > > > query each region once and sum.  Seems like it wouldn't be too expensive
>> > > > because you'd just have a row counter variable.  It may be more
>> > > > complicated than I'm making it out to be though...
>> > > >
>> > > > ~Jeff
>> > > >
>> > > >
>> > > > On 3/16/2011 2:40 PM, Stack wrote:
>> > > >
>> > > >> On Wed, Mar 16, 2011 at 1:35 PM, Vivek Krishna <vivekris...@gmail.com> wrote:
>> > > >>
>> > > >>> 1.  How do I count rows fast in hbase?
>> > > >>>
>> > > >>> First I tried count 'test', which takes ages.
>> > > >>>
>> > > >>> Saw that I could use RowCounter, but looks like it is deprecated.
>> > > >>>
>> > > >> It is not.  Make sure you are using the one from mapreduce package as
>> > > >> opposed to mapred package.
>> > > >>
>> > > >>
>> > > >>> I just need to verify the total counts.  Is it possible to see
>> > > >>> somewhere in the web interface or ganglia or by any other means?
>> > > >>>
>> > > >> We don't keep a current count on a table.  Too expensive.  Run the
>> > > >> rowcounter MR job.  This page may be of help:
>> > > >>
>> > > >>
>> > > >> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#package_description
>> > > >>
>> > > >> Good luck,
>> > > >> St.Ack
>> > > >>
>> > > >
>> > > > --
>> > > > Jeff Whiting
>> > > > Qualtrics Senior Software Engineer
>> > > > je...@qualtrics.com
>> > > >
>> > > >
>> > >
>> >
>>
>
>
