I've seen TNFE (TableNotFoundException) when a region in the middle of an online table is offline. Shouldn't ever happen, but....

What I've seen is in the shell, you can do 'show tables;' and it will list all tables including the one reporting TNFE.

You then attempt a get or a scan against the table and you get the TNFE exception.

Is this what you are seeing?

Try doing a 'select info:regioninfo from .META.;' and look for a region marked offline. It might be easier if you run the query like this:

% echo 'select info:regioninfo from .META.;' | ./bin/hbase --html &> /tmp/query.out

...because then you can grep around in the /tmp/query.out file, or just send it to us off-list and we'll take a look.
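If you'd rather do the same hunt from code, the idea is roughly the following. This is only a sketch, assuming the 0.1-era client API from memory (obtainScanner, HScannerInterface, the Writables helper), so treat the details as approximate:

import java.util.TreeMap;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.HScannerInterface;
import org.apache.hadoop.hbase.HStoreKey;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Writables;
import org.apache.hadoop.io.Text;

public class FindOfflineRegions {
  public static void main(String[] args) throws Exception {
    // Scan the catalog table and print any region marked offline.
    HTable meta = new HTable(new HBaseConfiguration(), new Text(".META."));
    HScannerInterface scanner = meta.obtainScanner(
        new Text[] { new Text("info:regioninfo") }, new Text());
    try {
      HStoreKey key = new HStoreKey();
      TreeMap<Text, byte[]> row = new TreeMap<Text, byte[]>();
      while (scanner.next(key, row)) {
        HRegionInfo info = (HRegionInfo) Writables.getWritable(
            row.get(new Text("info:regioninfo")), new HRegionInfo());
        if (info.isOffline()) {          // the region we are hunting for
          System.out.println("offline: " + info.getRegionName());
        }
        row.clear();
      }
    } finally {
      scanner.close();
    }
  }
}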

For sure this is 0.1.0?

Thanks,
St.Ack


David Alves wrote:
Hi all

        I think we can consider that the test has passed: previous error
logs told me the M/R job failed around 35,000 records, and this job has
reached 42,000, failing for a whole other reason:

Caused by: org.apache.hadoop.hbase.TableNotFoundException: Table 'XXXXX' was not found.
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:415)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:346)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:308)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:89)

Is there a known workaround for this problem? I know for sure the table
exists, as it has been used in the previous 25 M/R jobs. Should I make
my code wait and retry until the table is up again?
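Something like the following is what I have in mind; just a rough sketch
(the table name, retry count and delay are illustrative, and I'm
assuming the HTable constructor that throws the exception, as in the
trace above):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableNotFoundException;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.io.Text;

public class OpenTableWithRetry {
  // Keep retrying HTable creation until the table's regions can be
  // located again, instead of failing the whole M/R job.
  public static HTable open(HBaseConfiguration conf, Text tableName)
      throws Exception {
    for (int attempt = 0; attempt < 10; attempt++) {  // illustrative cap
      try {
        return new HTable(conf, tableName); // throws TableNotFoundException
      } catch (TableNotFoundException e) {  // table briefly unlocatable
        Thread.sleep(5000);                 // back off, then retry
      }
    }
    throw new Exception("table still not found after retries");
  }

  public static void main(String[] args) throws Exception {
    HTable table = open(new HBaseConfiguration(), new Text("mytable"));
    System.out.println("opened: " + table);
  }
}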


Regards
David Alves

-----Original Message-----
From: Jim Kellerman [mailto:[EMAIL PROTECTED]
Sent: Monday, April 07, 2008 5:09 PM
To: [email protected]
Subject: RE: StackOverFlow Error in HBase

Yes, trunk is fine since there are no changes in filters between 0.1 and
trunk.

---
Jim Kellerman, Senior Engineer; Powerset


-----Original Message-----
From: David Alves [mailto:[EMAIL PROTECTED]
Sent: Monday, April 07, 2008 8:44 AM
To: [email protected]
Subject: RE: StackOverFlow Error in HBase

Hi Jim
        The job I left running before the weekend had some (other)
problems, mainly due to a Hadoop API change.
        Anyway, I'm running it again right now and at first glance it's
working (I'll know for sure in about 1 hour). On a different note,
there was a problem with RegExpRowFilter where, if it received more
than one conditional in the constructor map, it would filter out
records it shouldn't; that problem is solved.
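        For reference, this is roughly how the filter was being set up;
a sketch from memory, so the constructor details are approximate and
the column names and values are illustrative:

import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.hbase.filter.RegExpRowFilter;
import org.apache.hadoop.io.Text;

public class FilterSetup {
  public static RegExpRowFilter buildFilter() {
    // More than one entry in this map used to make the filter drop
    // rows it should have kept.
    Map<Text, byte[]> conditionals = new HashMap<Text, byte[]>();
    conditionals.put(new Text("meta:status"), "pending".getBytes());
    conditionals.put(new Text("meta:type"), "article".getBytes());
    // Match any row key, but require both column values to match.
    return new RegExpRowFilter(".*", conditionals);
  }
}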
        On Friday, before I got your response, I had already upgraded
the cluster to the Hadoop and HBase trunk versions, so I'm currently
testing with these instead of 0.1; I hope there is no problem there.
        I'll send another email soon.

Regards
David Alves

On Mon, 2008-04-07 at 08:31 -0700, Jim Kellerman wrote:
David,

Any luck running this patch either against head or against
the 0.1 branch?
Thanks.

---
Jim Kellerman, Senior Engineer; Powerset


-----Original Message-----
From: David Alves [mailto:[EMAIL PROTECTED]
Sent: Friday, April 04, 2008 10:05 AM
To: [email protected]
Subject: RE: StackOverFlow Error in HBase

Hi Jim

        Of course; my question was regarding whether I should use HEAD
or some branch or tag.
        Anyway, I'm currently running HBase HEAD patched, against
Hadoop HEAD; I'll know if it's OK soon.

Regards
David Alves
On Fri, 2008-04-04 at 09:18 -0700, Jim Kellerman wrote:
After applying the patch, you have to rebuild and deploy on your
cluster, run your test that was failing, and verify that it now works.
See

http://hadoop.apache.org/hbase/docs/current/api/overview-summary.html#overview_description



---
Jim Kellerman, Senior Engineer; Powerset


-----Original Message-----
From: David Alves [mailto:[EMAIL PROTECTED]
Sent: Friday, April 04, 2008 6:29 AM
To: [email protected]
Subject: RE: StackOverFlow Error in HBase

Hi all again

        I've never used the patch system you guys use, so in order to
test the patch submitted by Clint, what do I have to do? I mean, I've
updated HEAD and applied the patch; is this it?

Regards
David Alves



On Thu, 2008-04-03 at 10:18 -0700, Jim Kellerman wrote:
Thanks David. I'll add 554 as a blocker for 0.1.1

---
Jim Kellerman, Senior Engineer; Powerset


-----Original Message-----
From: David Alves [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 03, 2008 9:21 AM
To: [EMAIL PROTECTED]
Subject: RE: StackOverFlow Error in HBase

Hi Jim and all

        I'll commit to testing the patch under the same conditions as
it failed before (with around 36,000 records), but at this precise
moment I'm preparing my next development iteration, which means a lot
of meetings.
        By the end of the day tomorrow (Friday) I should have
confirmation of whether the patch worked (or not).

Regards
David Alves

On Thu, 2008-04-03 at 09:12 -0700, Jim Kellerman wrote:
David,

Have you had a chance to try this patch? We are about to release
hbase-0.1.1, and until we receive confirmation in HBASE-554 from
another person who has tried it and verified that it works, we cannot
include it in this release. If it is not in this release, there will be
a significant wait for it to appear in an hbase release. hbase-0.1.2
will not happen anytime soon unless critical issues arise that have not
been fixed in 0.1.1. hbase-0.2.0 is also some time in the future; there
are a significant number of issues to address before that release is
ready.

Frankly, I'd like to see this patch in 0.1.1, because it is an issue
for people that use filters. The alternative would be for Clint to
supply a test case that fails without the patch but passes with the
patch.

We will hold up the release, but we need a commitment either from David
to test the patch or from Clint to supply a test. We need that
commitment by the end of the day today, 2008/04/03, along with an ETA
as to when it will be completed.
---
Jim Kellerman, Senior Engineer; Powerset


-----Original Message-----
From: David Alves
[mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 01, 2008 2:36 PM
To: [EMAIL PROTECTED]
Subject: RE: StackOverFlow Error in HBase

Hi

        I just deployed the unpatched version.
        Tomorrow I'll rebuild the system with the patch and try it out.
        Thanks again.

Regards
David Alves

-----Original Message-----
From: Jim Kellerman [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 01, 2008 10:04 PM
To: [EMAIL PROTECTED]
Subject: RE: StackOverFlow Error in HBase

David,

Have you tried this patch, and does it work for you? If so, we'll
include it in hbase-0.1.1.

---
Jim Kellerman, Senior Engineer; Powerset


-----Original Message-----
From: David Alves
[mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 01, 2008 10:44 AM
To: [EMAIL PROTECTED]
Subject: RE: StackOverFlow Error in HBase

Hi
        Thanks for the prompt patch, Clint, St.Ack and all you guys.
Regards
David Alves

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
On Behalf
Of Clint Morgan
Sent: Tuesday, April 01, 2008 2:04 AM
To: [EMAIL PROTECTED]
Subject: Re: StackOverFlow Error in HBase

Try the patch at https://issues.apache.org/jira/browse/HBASE-554.
cheers,
-clint

On Mon, Mar 31, 2008 at 5:39 AM, David Alves
<[EMAIL PROTECTED]> wrote:
Hi ... again

        In my previous mail I stated that increasing the stack size
solved the problem; well, I jumped a little bit to a conclusion. In
fact it didn't: the StackOverflowError always occurs at the end of the
cycle, when no more records match the filter. Anyway I've rewritten my
application to use a normal scanner and do the "filtering" afterwards,
which is not optimal but it works.
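        Roughly like this; a sketch from memory of the 0.1-era scanner
API, so the details are approximate, and the table, column and value
names are illustrative:

import java.util.TreeMap;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HScannerInterface;
import org.apache.hadoop.hbase.HStoreKey;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.io.Text;

public class ClientSideFilter {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(new HBaseConfiguration(), new Text("docs"));
    // Plain scanner over everything; no server-side RegExpRowFilter.
    HScannerInterface scanner = table.obtainScanner(
        new Text[] { new Text("meta:status") }, new Text());
    try {
      HStoreKey key = new HStoreKey();
      TreeMap<Text, byte[]> row = new TreeMap<Text, byte[]>();
      while (scanner.next(key, row)) {       // iterative, so no deep stack
        byte[] status = row.get(new Text("meta:status"));
        // The "filtering" now happens here on the client instead.
        if (status != null && "pending".equals(new String(status))) {
          // index this record ...
        }
        row.clear();
      }
    } finally {
      scanner.close();
    }
  }
}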
        I'm just saying this because it might be a clue: in previous
versions (!= 0.1.0), even though a more serious problem happened
(regionservers became unresponsive after so many records), this didn't
happen. Btw, in the current version I notice no, or a very small,
decrease of throughput with time. Great work!

 Regards
 David Alves







 On Mon, 2008-03-31 at 05:18 +0100, David Alves wrote:
 > Hi again
 >
 >       As I was almost at the end (80%) of indexable docs, for the time
 > being I simply increased the stack size, which seemed to work.
 >       Thanks for your input St.Ack, it really helped me solve the
 > problem, at least for the moment.
 >       On another note, in the same method I changed the way the
 > scanner was obtained when htable.getStartKeys() would return more than
 > 1 key, so that I could limit the records read each time to a single
 > region, and the scanning would start at the last region. Strangely,
 > the number of keys obtained by htable.getStartKeys() was always 1,
 > even though by the end there are already 21 regions.
 >       Any thoughts?
 >
 > Regards
 > David Alves
 >
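 > (For concreteness, the per-region splitting I refer to above was
 > roughly the following; a sketch from memory, so the API details are
 > approximate and the table name is illustrative:)
 >
 > import org.apache.hadoop.hbase.HBaseConfiguration;
 > import org.apache.hadoop.hbase.client.HTable;
 > import org.apache.hadoop.io.Text;
 >
 > public class RegionStartKeys {
 >   public static void main(String[] args) throws Exception {
 >     HTable table = new HTable(new HBaseConfiguration(), new Text("docs"));
 >     // One start key per region; with 21 regions I expected 21 entries,
 >     // but this always returned 1.
 >     Text[] startKeys = table.getStartKeys();
 >     System.out.println("regions: " + startKeys.length);
 >     // Limit each pass to the last region by starting the scan at its
 >     // first row key.
 >     Text lastRegionStart = startKeys[startKeys.length - 1];
 >     System.out.println("would scan from: " + lastRegionStart);
 >   }
 > }
 >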
 > > -----Original Message-----
 > > From: stack [mailto:[EMAIL PROTECTED]
 > > Sent: Sunday, March 30, 2008 9:36 PM
 > > To: [EMAIL PROTECTED]
 > > Subject: Re: StackOverFlow Error in HBase
 > >
 > > You're doing nothing wrong.
 > >
 > > The filters as written recurse until they find a match. If there are
 > > long stretches between matching rows, then you will get a
 > > StackOverflowError. Filters need to be changed. Thanks for pointing
 > > this out. Can you do without them for the moment, until we get a
 > > chance to fix it? (HBASE-554)
 > >
 > > Thanks,
 > > St.Ack
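 > >
 > > (To make the failure mode concrete, here is a minimal self-contained
 > > sketch of the pattern, not the actual filter code; the row count and
 > > match predicate are illustrative:)
 > >
 > > // One stack frame is spent per skipped row, so a long stretch of
 > > // non-matching rows overflows the stack.
 > > class FilterSkipSketch {
 > >   static boolean matches(long row) { return row % 5000000 == 0; }
 > >
 > >   static long nextRecursive(long row) {
 > >     if (row > 10000000) return -1;
 > >     if (matches(row)) return row;
 > >     return nextRecursive(row + 1);     // recursion: one frame per miss
 > >   }
 > >
 > >   static long nextIterative(long row) { // the fix: loop, constant stack
 > >     for (; row <= 10000000; row++) {
 > >       if (matches(row)) return row;
 > >     }
 > >     return -1;
 > >   }
 > >
 > >   public static void main(String[] args) {
 > >     System.out.println(nextIterative(1));  // prints 5000000
 > >     System.out.println(nextRecursive(1));  // throws StackOverflowError
 > >   }
 > > }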
 > >
 > > David Alves wrote:
 > > > Hi St.Ack and all
 > > >
 > > >   The error always occurs when trying to see if there are more
 > > > rows to process.
 > > >   Yes, I'm using a filter (RegExpRowFilter) to select only the rows
 > > > (any row key) that match a specific value in one of the columns.
 > > >   Then I obtain the scanner, just test the hasNext method, close
 > > > the scanner and return.
 > > >   Am I doing something wrong?
 > > >   Still, StackOverflowError is not supposed to happen, right?
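 > > >
 > > >   (Roughly like this; a sketch from memory, so the API details are
 > > >   approximate and the names illustrative:)
 > > >
 > > >   import java.io.IOException;
 > > >   import org.apache.hadoop.hbase.HScannerInterface;
 > > >   import org.apache.hadoop.hbase.client.HTable;
 > > >   import org.apache.hadoop.hbase.filter.RowFilterInterface;
 > > >   import org.apache.hadoop.io.Text;
 > > >
 > > >   public class HasMoreRecords {
 > > >     // True if any remaining row matches the filter; the hasNext
 > > >     // call is where the deep recursion past non-matching rows
 > > >     // happens.
 > > >     static boolean hasMore(HTable table, Text[] columns,
 > > >         Text startRow, RowFilterInterface filter) throws IOException {
 > > >       HScannerInterface scanner =
 > > >           table.obtainScanner(columns, startRow, filter);
 > > >       try {
 > > >         return scanner.iterator().hasNext();
 > > >       } finally {
 > > >         scanner.close();
 > > >       }
 > > >     }
 > > >   }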
 > > >
 > > > Regards
 > > > David Alves
 > > >
 > > > On Thu, 2008-03-27 at 12:36 -0700, stack wrote:
 > > >
 > > >> You are using a filter?  If so, tell us more about it.
 > > >> St.Ack
 > > >>
 > > >> David Alves wrote:
 > > >>
 > > >>> Hi guys
 > > >>>
 > > >>>         I'm using HBase to keep data that is later indexed.
 > > >>>         The data is indexed in chunks, so the cycle is: get XXXX
 > > >>> records, index them, check for more records, etc...
 > > >>>         When I tried the candidate-2 instead of the old 0.16.0
 > > >>> (which I switched to due to the regionservers becoming
 > > >>> unresponsive) I got the error at the end of this email, well into
 > > >>> an indexing job.
 > > >>>         So do you have any idea why? Am I doing something wrong?
 > > >>>
 > > >>> David Alves
 > > >>>
 > > >>> java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException:
 > > >>> java.io.IOException: java.lang.StackOverflowError
 > > >>>         at java.io.DataInputStream.readFully(DataInputStream.java:178)
 > > >>>         at java.io.DataInputStream.readLong(DataInputStream.java:399)
 > > >>>         at org.apache.hadoop.dfs.DFSClient$BlockReader.readChunk(DFSClient.java:735)
 > > >>>         at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:234)
 > > >>>         at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:176)
 > > >>>         at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:193)
 > > >>>         at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:157)
 > > >>>         at org.apache.hadoop.dfs.DFSClient$BlockReader.read(DFSClient.java:658)
 > > >>>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:1130)
 > > >>>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1166)
 > > >>>         at java.io.DataInputStream.readFully(DataInputStream.java:178)
 > > >>>         at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:56)
 > > >>>         at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:90)
 > > >>>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1829)
 > > >>>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1729)
 > > >>>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1775)
 > > >>>         at org.apache.hadoop.io.MapFile$Reader.next(MapFile.java:461)
 > > >>>         at org.apache.hadoop.hbase.HStore$StoreFileScanner.getNext(HStore.java:2350)
 > > >>>         at org.apache.hadoop.hbase.HAbstractScanner.next(HAbstractScanner.java:256)
 > > >>>         at org.apache.hadoop.hbase.HStore$HStoreScanner.next(HStore.java:2561)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1807)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         at org.apache.hadoop.hbase.HRegion$HScanner.next(HRegion.java:1843)
 > > >>>         ...
 > > >>>
 > > >
 >

