Just to follow up - I ran add_table as I had done when I lost a table before - and it fixed the error.
Thanks!

Take care,
-stu

--- On Fri, 8/6/10, Stuart Smith <stu24m...@yahoo.com> wrote:

> From: Stuart Smith <stu24m...@yahoo.com>
> Subject: Re: Batch puts interrupted ... Requested row out of range for HRegion filestore ...org.apache.hadoop.hbase.client.RetriesExhaustedException:
> To: user@hbase.apache.org
> Date: Friday, August 6, 2010, 6:50 PM
>
> Hello Ryan,
>
> Yup. There's a hole, exactly where it should be.
>
> I used add_table.rb once before, and am no expert on it.
> All I have is a note written down:
>
> To recover lost tables:
> ./hbase org.jruby.Main add_table.rb /hbase/filestore
>
> Anything else I need to know? Do I just run the script like so?
> Does anything need to be shut down before I do?
>
> Thanks!
>
> Take care,
> -stu
>
> --- On Fri, 8/6/10, Ryan Rawson <ryano...@gmail.com> wrote:
>
> > From: Ryan Rawson <ryano...@gmail.com>
> > Subject: Re: Batch puts interrupted ... Requested row out of range for HRegion filestore ...org.apache.hadoop.hbase.client.RetriesExhaustedException:
> > To: user@hbase.apache.org
> > Date: Friday, August 6, 2010, 6:08 PM
> >
> > Hi,
> >
> > When you run into this problem, it's usually a sign of a META problem,
> > specifically you have a 'hole' in the META table.
> >
> > The META table contains a series of keys like so:
> > table,start_row1,<timestamp> [data]
> > table,start_row2,<timestamp> [data]
> >
> > etc
> >
> > When we search for a region for a given row, we build a key like so:
> > 'table,my_row,9*19' and do a search called 'closestRowBefore'. This
> > finds the region that contains this row.
> >
> > Now notice that we only put the start row in the key. Each region
> > has a start_row and an end_row, and all the regions are mutually exclusive
> > and form complete coverage. If the META row for a region were missing,
> > we'd consistently find the wrong region, and the regionserver would
> > reject the request (correctly so).
> >
> > That is what is probably happening here. Check the table dump in the
> > master web-ui and see if you can find a 'hole', where one region's end-key
> > doesn't match up with the next region's start-key.
> >
> > If that is the case, there is a script add_table.rb which is used to
> > fix these things.
> >
> > -ryan
> >
> > On Fri, Aug 6, 2010 at 2:59 PM, Stuart Smith <stu24m...@yahoo.com> wrote:
> > > Hello,
> > >
> > > I'm running hbase 0.20.5, and seeing Puts() fail repeatedly when trying to insert a specific item into the database.
> > >
> > > Client side I see:
> > >
> > > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server, retryOnlyOne=true, index=0, islastrow=true, tries=9, numtries=10, i=0, listsize=1, region=filestore,bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b,1279604506836 for region filestore,
> > >
> > > I then looked up which node was hosting the given region
> > > (filestore,bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b)
> > > on the gui, and found the following debug message in the regionserver log:
> > >
> > > 2010-08-06 14:23:47,414 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Batch puts interrupted at index=0 because:Requested row out of range for HRegion filestore,bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b,1279604506836,
> > > startKey='bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b',
> > > getEndKey()='be0bc7b3f8bc2a30910b9c758b47cdb730a4691e93f92abb857a2dcc7aefa633',
> > > row='be1681910b02db5da061659c2cb08f501a135c2f065559a37a1761bf6e203d1d'
> > >
> > > Which appears to be coming from:
> > >
> > > /regionserver/HRegionServer.java:1786:
> > >   LOG.debug("Batch puts interrupted at index=" + i + " because:" +
> > >
> > > Which is coming from:
> > >
> > > ./java/org/apache/hadoop/hbase/regionserver/HRegion.java:1658:
> > >   throw new WrongRegionException("Requested row out of range for " +
> > >
> > > This happens repeatedly on a specific item over at least a day or so, even when not much is happening with the cluster.
> > >
> > > As far as I can tell, it looks like the logic to select the correct region for a given row is wrong. The row is indeed not in the correct range (at least from what I can tell of the exception thrown), and the check in HRegion.java:1658:
> > >
> > >   /** Make sure this is a valid row for the HRegion */
> > >   private void checkRow(final byte [] row) throws IOException {
> > >     if(!rowIsInRange(regionInfo, row)) {
> > >
> > > is correctly rejecting the Put().
> > >
> > > So it appears the error would be somewhere in:
> > > HRegion.java:1550:
> > >   private void put(final Map<byte [],List<KeyValue>> familyMap,
> > >       boolean writeToWAL) throws IOException {
> > >
> > > Which appears to be the actual guts of the insert operation.
> > > However, I don't know enough about the design of HRegions to really decipher this method. I'll dig into it more, but I thought it might be more efficient just to ask you guys first.
> > >
> > > Any ideas?
> > >
> > > I can update to 0.20.6, but I don't see any fixed JIRAs on 0.20.6 that seem related. I could be wrong. I'm not sure what I should do next. Any more information you guys need?
> > >
> > > Note that I am inserting files into the database, and using each file's sha256sum as the key. And the file that is failing does indeed have a sha that corresponds to the key in the message above (and is out of range).
> > >
> > > Take care,
> > > -stu
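
For reference, here is a minimal sketch of the lookup Ryan describes above and of why a hole in META produces the "Requested row out of range" rejection. It is a toy model, not HBase code: the class `MetaHoleSketch`, its `Region` type, and the methods `locate`, `putAccepted`, and `findHoles` are all hypothetical names. It assumes a sorted map keyed by region start key, with `TreeMap.floorEntry` standing in for 'closestRowBefore', and it uses plain strings where HBase compares byte arrays. The keys are shortened prefixes of the ones in the log above, and the end key "c000" is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: real HBase compares byte[] keys and stores META rows as
// "table,start_row,timestamp". Every name in this class is hypothetical.
class MetaHoleSketch {

    /** A region serving rows in [startKey, endKey); an empty endKey means "to the end of the table". */
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }

        /** Stand-in for the regionserver's range check on an incoming row. */
        boolean contains(String row) {
            return row.compareTo(startKey) >= 0
                    && (endKey.isEmpty() || row.compareTo(endKey) < 0);
        }
    }

    /** Builds a toy META table keyed by region start key. */
    static TreeMap<String, Region> meta(Region... regions) {
        TreeMap<String, Region> meta = new TreeMap<>();
        for (Region r : regions) {
            meta.put(r.startKey, r);
        }
        return meta;
    }

    /** Stand-in for 'closestRowBefore': the entry with the greatest start key <= row. */
    static Region locate(TreeMap<String, Region> meta, String row) {
        Map.Entry<String, Region> entry = meta.floorEntry(row);
        return entry == null ? null : entry.getValue();
    }

    /** True if the region the client gets routed to would accept the row. */
    static boolean putAccepted(TreeMap<String, Region> meta, String row) {
        Region region = locate(meta, row);
        return region != null && region.contains(row);
    }

    /** Lists every place where one region's end key differs from the next region's start key. */
    static List<String> findHoles(TreeMap<String, Region> meta) {
        List<String> holes = new ArrayList<>();
        Region previous = null;
        for (Region current : meta.values()) {
            if (previous != null && !previous.endKey.equals(current.startKey)) {
                holes.add(previous.endKey + " -> " + current.startKey);
            }
            previous = current;
        }
        return holes;
    }

    public static void main(String[] args) {
        Region a = new Region("bdfa", "be0b");
        Region b = new Region("be0b", "c000"); // end key "c000" is made up
        Region c = new Region("c000", "");

        TreeMap<String, Region> complete = meta(a, b, c);
        TreeMap<String, Region> withHole = meta(a, c); // the META entry for b is missing

        System.out.println(putAccepted(complete, "be16")); // routed to b, which holds the row
        System.out.println(putAccepted(withHole, "be16")); // routed to a, which rejects the row
        System.out.println(findHoles(withHole));
    }
}
```

With the entry for the region starting at "be0b" missing, the lookup for row "be16" falls back to the region starting at "bdfa", whose end key is "be0b", so the range check fails every time. That matches the startKey, getEndKey() and row values in the regionserver log above, and it is why retrying never helps until the META entry is restored.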