Hello Ryan,

Yup. There's a hole, exactly where it should be.

I used add_table.rb once before, and am no expert on it. All I have is
a note written down:

To recover lost tables:
./hbase org.jruby.Main add_table.rb /hbase/filestore

Anything else I need to know? Do I just run the script like so? Does
anything need to be shut down before I do?

Thanks!

Take care,
  -stu

--- On Fri, 8/6/10, Ryan Rawson <ryano...@gmail.com> wrote:

> From: Ryan Rawson <ryano...@gmail.com>
> Subject: Re: Batch puts interrupted ... Requested row out of range for HRegion filestore ...org.apache.hadoop.hbase.client.RetriesExhaustedException:
> To: user@hbase.apache.org
> Date: Friday, August 6, 2010, 6:08 PM
>
> Hi,
>
> When you run into this problem, it's usually a sign of a META problem,
> specifically you have a 'hole' in the META table.
>
> The META table contains a series of keys like so:
> table,start_row1,<timestamp> [data]
> table,start_row2,<timestamp> [data]
>
> etc
>
> When we search for a region for a given row, we build a key like so:
> 'table,my_row,9*19' and do a search called 'closestRowBefore'. This
> finds the region that contains this row.
>
> Now notice that we only put the start row in the key.... each region
> has a start_row,end_row, and all the regions are mutually exclusive
> and form complete coverage. Imagine a row for a region was missing,
> we'd consistently find the wrong region and the regionserver would
> reject the request (correctly so).
>
> That is what is probably happening here. Check the table dump in the
> master web-ui and see if you can find a 'hole'... where the end-key
> doesn't match up with the start-key.
>
> If that is the case, there is a script add_table.rb which is used to
> fix these things.
>
> -ryan
>
> On Fri, Aug 6, 2010 at 2:59 PM, Stuart Smith <stu24m...@yahoo.com> wrote:
> > Hello,
> >
> > I'm running hbase 0.20.5, and seeing Puts() fail repeatedly when
> > trying to insert a specific item into the database.
> >
> > Client side I see:
> >
> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server, retryOnlyOne=true, index=0, islastrow=true, tries=9, numtries=10, i=0, listsize=1, region=filestore,bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b,1279604506836 for region filestore,
> >
> > I then looked up which node was hosting the given region
> > (filestore,bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b)
> > on the gui, found the following debug message in the regionserver log:
> >
> > 2010-08-06 14:23:47,414 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Batch puts interrupted at index=0 because:Requested row out of range for HRegion filestore,bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b,1279604506836, startKey='bdfa9f2173033330cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b', getEndKey()='be0bc7b3f8bc2a30910b9c758b47cdb730a4691e93f92abb857a2dcc7aefa633', row='be1681910b02db5da061659c2cb08f501a135c2f065559a37a1761bf6e203d1d'
> >
> > Which appears to be coming from:
> >
> > /regionserver/HRegionServer.java:1786: LOG.debug("Batch puts interrupted at index=" + i + " because:" +
> >
> > Which is coming from:
> >
> > ./java/org/apache/hadoop/hbase/regionserver/HRegion.java:1658: throw new WrongRegionException("Requested row out of range for " +
> >
> > This happens repeatedly on a specific item over at least a day or
> > so, even when not much is happening with the cluster.
> >
> > As far as I can tell, it looks like the logic to select the correct
> > region for a given row is wrong.
> > The row is indeed not in the correct range (at least from what I
> > can tell of the exception thrown), and the check in
> > HRegion.java:1658:
> >
> >   /** Make sure this is a valid row for the HRegion */
> >   private void checkRow(final byte [] row) throws IOException {
> >     if(!rowIsInRange(regionInfo, row)) {
> >
> > Is correctly rejecting the Put().
> >
> > So it appears the error would be somewhere in:
> >
> > HRegion.java:1550:
> >   private void put(final Map<byte [],List<KeyValue>> familyMap,
> >       boolean writeToWAL) throws IOException {
> >
> > Which appears to be the actual guts of the insert operation.
> >
> > However, I don't know enough about the design of HRegions to really
> > decipher this method. I'll dig into it more, but I thought it might
> > be more efficient just to ask you guys first.
> >
> > Any ideas?
> >
> > I can update to 0.20.6, but I don't see any fixed JIRAs on 0.20.6
> > that seem related.. I could be wrong. I'm not sure what I should do
> > next. Any more information you guys need?
> >
> > Note that I am inserting a file into the database, and using its
> > sha256sum as the key. And the file that is failing does indeed have
> > a sha that corresponds to the key in the message above (and is out
> > of range).
> >
> > Take care,
> > -stu
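
For anyone reading this thread later, here is a rough Python sketch of the lookup Ryan describes and of why a hole in META sends a Put to the wrong region. It is only a toy model, not HBase code: the region list, the single-letter keys, and the function names closest_row_before, row_in_range, and find_holes are all made up for illustration. It assumes regions are stored as (start_key, end_key) pairs sorted by start key, with an empty string meaning "open-ended".

```python
from bisect import bisect_right


def closest_row_before(regions, row):
    """Toy META lookup: return the region with the greatest start key <= row.

    Only start keys are consulted, mirroring the fact that META keys
    contain the start row but not the end row.
    """
    starts = [start for start, _ in regions]
    idx = bisect_right(starts, row) - 1
    return regions[idx] if idx >= 0 else None


def row_in_range(region, row):
    """Toy version of the regionserver-side check that a row belongs to a region."""
    start, end = region
    return start <= row and (end == "" or row < end)


def find_holes(regions):
    """Return (end_key, next_start_key) pairs where adjacent regions do not meet."""
    holes = []
    for (_, end), (next_start, _) in zip(regions, regions[1:]):
        if end != next_start:
            holes.append((end, next_start))
    return holes


# Hypothetical tables; keys are single letters to keep the example readable.
healthy = [("", "b"), ("b", "d"), ("d", "f"), ("f", "")]
broken = [("", "b"), ("b", "d"), ("f", "")]  # the ("d", "f") entry is missing

row = "e"
good_region = closest_row_before(healthy, row)
bad_region = closest_row_before(broken, row)

print(good_region, row_in_range(good_region, row))
print(bad_region, row_in_range(bad_region, row))
print(find_holes(broken))
```

With the missing entry, the lookup for row "e" lands on the region that ends at "d", so the range check fails every time, which matches the repeated "Requested row out of range" rejection in the regionserver log above. find_holes does the same comparison you would do by eye in the master web-ui table dump: each region's end key should equal the next region's start key.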