That's it. That's the end of the regionserver log. In the master's web UI, the regionserver appears in the table labeled "Dead Region Servers."
In the master's log there is:

2012-07-19 07:02:04,016 INFO org.apache.hadoop.hbase.zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, processing expiration [slave2,60020,1342694622535]
2012-07-19 07:02:04,025 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: based on AM, current region=-ROOT-,,0.70236052 is on server=slave2,60020,1342694622535 server being checked: slave2,60020,1342694622535
2012-07-19 07:02:04,025 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: based on AM, current region=.META.,,1.1028785192 is on server=slave2,60020,1342694622535 server being checked: slave2,60020,1342694622535
2012-07-19 07:02:04,027 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=slave2,60020,1342694622535 to dead servers, submitted shutdown handler to be executed, root=true, meta=true
2012-07-19 07:02:04,027 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs for slave2,60020,1342694622535

The timestamps are different from the ones above, but it's from the same application, just at a different time than what I sent before. The messages would look the same, only with the timestamps being around 2012-07-19 09:49.

On Thu, Jul 19, 2012 at 4:50 PM, Ted Yu <[email protected]> wrote:

> Can you paste more of the region server log after 09:49:18,551 (till the
> region server died)?
>
> Thanks
>
> On Thu, Jul 19, 2012 at 1:46 PM, Kevin <[email protected]> wrote:
>
> > The log snippet just before the regionservers die looks like this:
> >
> > 2012-07-19 09:49:18,551 INFO project.coproc.IndexEndpoint: putting new rowkey
> > 2012-07-19 09:49:18,551 INFO project.coproc.IndexEndpoint: new rowkey put
> > 2012-07-19 09:49:18,551 INFO project.coproc.IndexEndpoint: coproc time: 1227 ms
> > 2012-07-19 09:49:18,551 INFO project.coproc.IndexEndpoint: closing scanner
> > 2012-07-19 09:49:18,551 INFO project.coproc.IndexEndpoint: scanner closed
> > <after this log statement in the endpoint code is the return statement>
> >
> > A coprocessorExec call may come 3-20 seconds after the previous one (it
> > depends on how long the last call took). But I know the endpoints are
> > finishing their code fast, because throughout the log each "coproc time:"
> > statement is under 5 seconds.
> >
> > I am using CDH4b2, which uses HBase 0.92.1.
> >
> > On Thu, Jul 19, 2012 at 4:35 PM, Ted Yu <[email protected]> wrote:
> >
> > > Kevin:
> > > Can you pastebin the log snippet from the region server just before it died?
> > >
> > > How frequent were your coprocessorExec() calls?
> > > What HBase version were you using?
> > >
> > > Thanks
> > >
> > > On Thu, Jul 19, 2012 at 12:44 PM, Kevin <[email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm using endpoint coprocessors to do intense scans in parallel on some
> > > > tables. I log the time it takes for each coprocessor to finish its job
> > > > on the region. Each coprocessor rarely takes longer than a few seconds,
> > > > with a maximum of 5 seconds (there are only 5 regions on the tables
> > > > right now). As my cluster grows with data, the HTable.coprocessorExec
> > > > call takes longer and longer, but the coprocessors themselves finish
> > > > quickly (under 5 seconds). Eventually I see all my regionservers die
> > > > because the coprocessorExec call timed out and ZooKeeper kills the
> > > > connection, which makes the regionserver die.
> > > >
> > > > In terms of code structure, the coprocessorExec call is done inside a
> > > > for-loop. The for-loop iterates over a List of objects to help form
> > > > filters for the endpoint, and then calls coprocessorExec once per
> > > > object processed.
> > > >
> > > > What would be the bottleneck? Is calling the coprocessor like this in
> > > > a for-loop loading the regions down and not allowing them time to do
> > > > GC? Is there a way to ping a table and judge if it'll be ready for the
> > > > endpoint call?
> > > >
> > > > Thanks,
> > > > -Kevin
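[Editor's note: to make the failure mode in this thread concrete, here is a minimal, self-contained Java sketch. It is not Kevin's actual code and does not use the HBase API; the numbers and names (`oneCallMillis`, `serialLoopMillis`) are illustrative. It models why client-side wall time for a serial for-loop of endpoint calls grows with the size of the object list, even though every individual endpoint invocation stays under 5 seconds.]

```java
import java.util.Collections;
import java.util.List;

public class SerialExecCost {
    // Models one coprocessorExec(...) round trip: the client blocks until
    // every region has responded, so the slowest region dominates the call.
    static long oneCallMillis(long[] perRegionMillis) {
        long max = 0;
        for (long m : perRegionMillis) max = Math.max(max, m);
        return max;
    }

    // Models the serial for-loop over the object list described in the
    // thread: total wall time is the SUM of the per-call maxima.
    static long serialLoopMillis(List<long[]> callsPerObject) {
        long total = 0;
        for (long[] call : callsPerObject) total += oneCallMillis(call);
        return total;
    }

    public static void main(String[] args) {
        // 5 regions, each endpoint finishing in under 5 s, as in the logs...
        long[] onePass = {1227, 900, 4300, 2100, 3050};
        // ...but 20 objects in the list means 20 serial round trips.
        List<long[]> calls = Collections.nCopies(20, onePass);
        System.out.println("total wall time ms: " + serialLoopMillis(calls));
        // 20 calls x 4300 ms = 86 s of client wall time: enough to blow past
        // a typical 60 s RPC timeout even though no endpoint exceeded 5 s.
    }
}
```

Under this (assumed) model, the bottleneck is not the endpoints but the serialization of the calls: latency accumulates linearly with the list size, so issuing the calls concurrently, or batching the filters into fewer endpoint invocations, keeps wall time close to one call's maximum instead of the sum.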
