Ceph as an alternative to HDFS

2010-08-06 Thread Amandeep Khurana
Article published in this month's Usenix ;login: magazine:

http://www.usenix.org/publications/login/2010-08/openpdfs/maltzahn.pdf

-ak


How to delete rows in a FIFO manner

2010-08-06 Thread Thomas Downing

Hi,

Continuing with testing HBase suitability in a high ingest rate
environment, I've come up with a new stumbling block, likely
due to my inexperience with HBase.

We want to keep and purge records on a time basis: i.e., when
a record is older than, say, 24 hours, we want to purge it from
the database.

The problem I am encountering is that the only way I've found
to delete records whose row ids are arbitrary but strongly ordered
in time is to scan for rows from the lower bound to the upper
bound, then build an array of Delete objects using

for Result in ResultScanner
add new Delete( Result.getRow( ) ) to Delete array.

This method is far too slow to keep up with our ingest rate; the
iteration over the Results in the ResultScanner is the bottleneck,
even though the Scan is limited to a single small column in the
column family.
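
For reference, a minimal sketch of that scan-then-delete pattern against the
0.20-era client API. The table, family, qualifier names, row-key bounds, and
batch size are illustrative placeholders, and it assumes your client version
has the batch delete overload:

import java.util.ArrayList;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class PurgeByScan {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(new HBaseConfiguration(), "mytable");

    // Scan only the expired row-key range, and only one small column.
    Scan scan = new Scan(Bytes.toBytes("lowBound"), Bytes.toBytes("highBound"));
    scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"));
    scan.setCaching(1000);              // fetch many rows per RPC

    ResultScanner scanner = table.getScanner(scan);
    ArrayList<Delete> deletes = new ArrayList<Delete>();
    try {
      for (Result r : scanner) {
        deletes.add(new Delete(r.getRow()));
        if (deletes.size() >= 1000) {   // issue deletes in batches
          table.delete(deletes);
          deletes.clear();
        }
      }
      if (!deletes.isEmpty()) {
        table.delete(deletes);
      }
    } finally {
      scanner.close();
    }
  }
}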

The obvious but naive solution is to use a sequential row id
where the lower and upper bounds can be known.  This would
allow building the array of Delete objects without a scan
step.  The problem with this approach is how to guarantee a
sequential, non-colliding row id across more than one Put'ing
process, and to do it efficiently.  As it happens, I can do this, but
given the details of my operational requirements, it's not a simple
thing to do.

So I was hoping that I had just missed something.  The ideal
would be a Delete object that would take row id bounds in the
same way that Scan does, allowing the work to be done entirely
on the server side.  Does this exist somewhere?  Or is there
some other way to skin this cat?

Thanks

Thomas Downing


Re: How to delete rows in a FIFO manner

2010-08-06 Thread Jean-Daniel Cryans
If the inserts are coming from more than 1 client, and you are trying
to delete from only 1 client, then likely it won't work. You could try
using a pool of deleters (multiple threads that delete rows) that you
feed from the scanner. Or you could run a MapReduce job that would
parallelize that for you, one that takes your table as input and
outputs Delete objects.

J-D
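
A rough sketch of that MapReduce idea, under some assumptions: the 0.20-era
org.apache.hadoop.hbase.mapreduce API, a TableOutputFormat that accepts Delete
values (newer versions do; check yours), and illustrative table name, class
names, and row-key bounds:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;

public class FifoPurge {

  // Emits one Delete per row the input scan hands us.
  static class PurgeMapper extends TableMapper<ImmutableBytesWritable, Delete> {
    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context context)
        throws IOException, InterruptedException {
      context.write(row, new Delete(row.get()));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new HBaseConfiguration();
    Job job = new Job(conf, "fifo-purge");
    job.setJarByClass(FifoPurge.class);

    // Only scan the row-key range that has expired (placeholder bounds).
    Scan scan = new Scan(Bytes.toBytes("lowBound"), Bytes.toBytes("highBound"));
    scan.setCaching(1000);

    TableMapReduceUtil.initTableMapperJob("mytable", scan, PurgeMapper.class,
        ImmutableBytesWritable.class, Delete.class, job);
    // No reducer: mapper output goes straight to TableOutputFormat as Deletes.
    TableMapReduceUtil.initTableReducerJob("mytable", null, job);
    job.setNumReduceTasks(0);
    job.waitForCompletion(true);
  }
}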



Re: How to delete rows in a FIFO manner

2010-08-06 Thread Venkatesh

 I wrestled with that idea of time-bounded tables.. Would it make it harder to
write code/run map reduce on multiple tables? Also, how do you decide when to do
the cutover (start of a new day, week/month..)? And if you do, how do you process
data that crosses those time boundaries efficiently?
Guess that is not your requirement..

If it is a fixed-time cutover, isn't it enough to set the TTL timestamp?
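
For concreteness, a hedged sketch of setting a column-family TTL at
table-creation time; the table and family names are placeholders, and the TTL
value is in seconds (expired cells are filtered on read and removed during
compactions):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateTableWithTtl {
  public static void main(String[] args) throws Exception {
    HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());

    HTableDescriptor desc = new HTableDescriptor("events");
    HColumnDescriptor family = new HColumnDescriptor("cf");
    family.setTimeToLive(24 * 60 * 60);   // purge cells older than 24 hours
    desc.addFamily(family);

    admin.createTable(desc);
  }
}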


 Interesting thread..thanks


 

 

-Original Message-
From: Thomas Downing tdown...@proteus-technologies.com
To: user@hbase.apache.org
Sent: Fri, Aug 6, 2010 11:39 am
Subject: Re: How to delete rows in a FIFO manner


Thanks for the suggestions.  The problem isn't generating the 
Delete objects, or the delete operation itself - both are fast 
enough.  The problem is generating the list of row keys from 
which the Delete objects are created. 
 
For now, the obvious work-around is to create and drop 
tables on the fly, using HBaseAdmin, with the tables being 
time-bounded. When the high end of a table passes the expiry 
time, just drop the table. When a table is written with the first 
record greater than the low bound, create a new table for the 
next time interval. 
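
A rough sketch of that rolling, time-bounded table idea with HBaseAdmin,
suitable for running from a cron job; the naming scheme, hourly interval, and
24-hour retention are illustrative assumptions:

import java.text.SimpleDateFormat;
import java.util.Date;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class RollingTables {
  static final SimpleDateFormat FMT = new SimpleDateFormat("yyyyMMddHH");

  public static void main(String[] args) throws Exception {
    HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());
    long now = System.currentTimeMillis();

    // Create the table for the next hourly interval if it is not there yet.
    String next = "events_" + FMT.format(new Date(now + 3600 * 1000L));
    if (!admin.tableExists(next)) {
      HTableDescriptor desc = new HTableDescriptor(next);
      desc.addFamily(new HColumnDescriptor("cf"));
      admin.createTable(desc);
    }

    // Drop the table whose entire interval is past the 24-hour expiry.
    String expired = "events_" + FMT.format(new Date(now - 25 * 3600 * 1000L));
    if (admin.tableExists(expired)) {
      admin.disableTable(expired);   // tables must be disabled before deletion
      admin.deleteTable(expired);
    }
  }
}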
 
As I am having other problems related to high ingest rates,
it may simply be that I am using the wrong tool for the job.
 
Thanks 
 
td 
 

Re: How to delete rows in a FIFO manner

2010-08-06 Thread Thomas Downing

Our problem does not require significant map/reduce ops, and
queries tend to be for sequential rows, with the timeframe being
the primary consideration.  So time-bounded tables are not a
big hurdle, as they might be if other columns were primary keys
or considerations for query or map/reduce ops.

TTL timestamp - that may be just the magic I was looking for...
thanks, I'll look at that.

td




Using HBase's export/import function...

2010-08-06 Thread Michael Segel

Ok,

Silly question...

Inside the /usr/lib/hbase/*.jar (the base jar for HBase) there's an export/import
tool.

If you supply the number of versions and the start and end times, you can timebox
your scan so your map/reduce job will let you do daily, weekly, etc. incremental
backups.

So here are my questions:


1) Is anyone using this?
2) There isn't any documentation; I'm assuming that the start and end times are
timestamps (long values representing the number of milliseconds since the epoch,
which is what is being stored in HBase).
3) Is there an easy way to convert a date into a timestamp? (Not in ksh, and
I'm struggling to find a way to reverse the datetime object in Python.)
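
On question 3, one way to do it in Java (shown in Java since that is what the
rest of the thread uses); the date strings and format are illustrative:

import java.text.SimpleDateFormat;

public class ToTimestamp {
  public static void main(String[] args) throws Exception {
    SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    long startTime = fmt.parse("2010-08-05 00:00:00").getTime();  // ms since the epoch
    long endTime   = fmt.parse("2010-08-06 00:00:00").getTime();
    System.out.println(startTime + " " + endTime);
  }
}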

Thx

-Mike

  

Re: Using HBase's export/import function...

2010-08-06 Thread Stack
On Fri, Aug 6, 2010 at 11:13 AM, Michael Segel
michael_se...@hotmail.com wrote:
 2) There isn't any documentation; I'm assuming that the start and end times are
 timestamps (long values representing the number of milliseconds since the epoch,
 which is what is being stored in HBase).

Yes.

What kinda doc. do you need?  The javadoc on the class is minimal:
http://hbase.apache.org/docs/r0.20.6/api/org/apache/hadoop/hbase/mapreduce/Export.html


 3) Is there an easy way to convert a date into a timestamp? (Not in ksh, and
 I'm struggling to find a way to reverse the datetime object in Python.)


At the end of this page it shows you how to do date conversions inside
the hbase shell: http://wiki.apache.org/hadoop/Hbase/Shell

St.Ack


 Thx

 -Mike




RE: Using HBase's export/import function...

2010-08-06 Thread Michael Segel

StAck...

LOL...

The idea is to automate the use of the export function to be run within a cron 
job. 
(And yes, there are some use cases where we want to actually back data up.. ;-)
I originally wanted to do this in ksh (yeah I'm that old. :-) but ended up 
looking at Python because I couldn't figure out how to create the time stamp in 
ksh.

As to documentation... just something which tells us what is meant by start
time and end time. (Like that it's in ms from the epoch, instead of making us
assume that.)
[And you know what they say about assumptions.]

As to converting the date / time to a timestamp...

In Python:
You build up a date object and then do the following:

import datetime
import time

mytime = datetime.datetime(year, month, day, hour, minute, second)  # hour, minute, second are optional
mytimestamp = time.mktime(mytime.timetuple())
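
One caveat on that: time.mktime() returns seconds since the epoch (as a float),
while the Export start/end times are milliseconds, so the value needs scaling
before it is handed to the tool:

# Export expects epoch milliseconds; mktime() gives seconds
mytimestamp_ms = long(mytimestamp * 1000)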

I'm in the process of testing this... I think it will work.

Thx

-Mike





Re: HBase storage sizing

2010-08-06 Thread Andrew Nguyen
With respect to the comment below, I'm trying to determine what the minimum IO 
requirements are for us...

For any given value being stored into HBase, is it accurate to calculate the size
of the row key, family, qualifier, timestamp, and value, and use their sum as
the amount of data that needs to be written for every insert?

Thanks,
Andre

On Jul 8, 2010, at 5:44 PM, Jean-Daniel Cryans wrote:

 keep in mind that every value is also
 stored with its full key (row key + family + qualifier + timestamp).
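
A rough way to put numbers on that, based on the KeyValue layout (key length and
value length prefixes, row length, row, family length, family, qualifier, 8-byte
timestamp, 1-byte type, value). This is a back-of-the-envelope sketch only; it
ignores HFile block indexes, compression, compactions, and HDFS replication:

public class CellSizeEstimate {
  // Approximate serialized size of one cell:
  // <keylen:4><vallen:4><rowlen:2><row><famlen:1><family><qualifier><timestamp:8><type:1><value>
  static long estimate(int rowKeyLen, int familyLen, int qualifierLen, int valueLen) {
    int keyLen = 2 + rowKeyLen + 1 + familyLen + qualifierLen + 8 + 1;
    return 4 + 4 + keyLen + valueLen;
  }

  public static void main(String[] args) {
    // e.g. 16-byte row key, 1-byte family name, 8-byte qualifier, 8-byte value
    System.out.println(estimate(16, 1, 8, 8) + " bytes per cell before compression");
  }
}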



Batch puts interrupted ... Requested row out of range for HRegion filestore ...org.apache.hadoop.hbase.client.RetriesExhaustedException:

2010-08-06 Thread Stuart Smith
Hello,

  I'm running hbase 0.20.5, and seeing Puts() fail repeatedly when trying to 
insert a specific item into the database.

Client side I see:

org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact 
region server Some server, retryOnlyOne=true, index=0, islastrow=true, tries=9, 
numtries=10, i=0, listsize=1, 
region=filestore,bdfa9f217300cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b,1279604506836
 for region filestore,

I then looked up which node was hosting the given region
(filestore,bdfa9f217300cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b) on
the GUI, and found the following debug message in the regionserver log:

2010-08-06 14:23:47,414 DEBUG 
org.apache.hadoop.hbase.regionserver.HRegionServer: Batch puts interrupted at 
index=0 because:Requested row out of range for HRegion 
filestore,bdfa9f217300cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b,1279604506836,
 startKey='bdfa9f217300cfae81ece08f75f0002bf3f3a54cde6bbf9192f0187e275b', 
getEndKey()='be0bc7b3f8bc2a30910b9c758b47cdb730a4691e93f92abb857a2dcc7aefa633', 
row='be1681910b02db5da061659c2cb08f501a135c2f065559a37a1761bf6e203d1d'


Which appears to be coming from:

/regionserver/HRegionServer.java:1786:  LOG.debug("Batch puts interrupted
at index=" + i + " because:" +

Which is coming from:

./java/org/apache/hadoop/hbase/regionserver/HRegion.java:1658:  throw new
WrongRegionException("Requested row out of range for " +

This happens repeatedly on a specific item over at least a day or so, even when 
not much is happening with the cluster.

As far as I can tell, it looks like the logic to select the correct region for 
a given row is wrong. The row is indeed not in the correct range (at least from 
what I can tell of the exception thrown), and the check in HRegion.java:1658:

  /** Make sure this is a valid row for the HRegion */
  private void checkRow(final byte [] row) throws IOException {
if(!rowIsInRange(regionInfo, row)) {

Is correctly rejecting the Put().

So it appears the error would be somewhere in: 
HRegion.java:1550: 
  private void put(final Map<byte[], List<KeyValue>> familyMap,
      boolean writeToWAL) throws IOException {

Which appears to be the actual guts of the insert operation.
However, I don't know enough about the design of HRegions to really decipher 
this method. I'll dig into it more, but I thought it might be more efficient 
just to ask you guys first.

Any ideas? 

I can update to 0.20.6, but I don't see any fixed JIRAs on 0.20.6 that seem
related.. I could be wrong. I'm not sure what I should do next. Any more
information you guys need?

Note that I am inserting files into the database, and using their sha256sum as
the key. And the file that is failing does indeed have a sha that corresponds
to the key in the message above (and is out of range).

Take care,
  -stu




  


Re: Batch puts interrupted ... Requested row out of range for HRegion filestore ...org.apache.hadoop.hbase.client.RetriesExhaustedException:

2010-08-06 Thread Ryan Rawson
Hi,

When you run into this problem, it's usually a sign of a META problem,
specifically you have a 'hole' in the META table.

The META table contains a series of keys like so:
table,start_row1,timestamp    [data]
table,start_row2,timestamp    [data]

etc

When we search for a region for a given row, we build a key like so:
'table,my_row,9*19' (a run of 9s standing in for a maximal timestamp) and do a
search called 'closestRowBefore'.  This finds the region that contains this row.

Now notice that we only put the start row in the key. Each region
has a start_row and an end_row, and all the regions are mutually exclusive
and form complete coverage.  If a row for a region were missing,
we'd consistently find the wrong region and the regionserver would
reject the request (correctly so).

That is what is probably happening here.  Check the table dump in the
master web-ui and see if you can find a 'hole'... where one region's end-key
doesn't match up with the next region's start-key.

If that is the case, there is a script add_table.rb which is used to
fix these things.

-ryan
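
A hedged diagnostic sketch of that check against the 0.20-era API: scan .META.,
pull each region's HRegionInfo out of the info:regioninfo column, and flag
places where an end key does not line up with the next start key. Treat this as
a sketch, not a supported tool; the table name "filestore" is taken from the
thread above:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.Writables;

public class FindMetaHoles {
  public static void main(String[] args) throws Exception {
    HTable meta = new HTable(new HBaseConfiguration(), ".META.");
    Scan scan = new Scan();
    scan.addColumn(Bytes.toBytes("info"), Bytes.toBytes("regioninfo"));
    ResultScanner scanner = meta.getScanner(scan);

    byte[] prevEndKey = null;
    for (Result r : scanner) {
      HRegionInfo info = Writables.getHRegionInfo(
          r.getValue(Bytes.toBytes("info"), Bytes.toBytes("regioninfo")));
      if (!info.getTableDesc().getNameAsString().equals("filestore")) {
        continue;  // only checking the one table
      }
      if (info.isOffline()) {
        continue;  // skip split parents / offline regions
      }
      if (prevEndKey != null && Bytes.compareTo(prevEndKey, info.getStartKey()) != 0) {
        System.out.println("Hole before region " + info.getRegionNameAsString());
      }
      prevEndKey = info.getEndKey();
    }
    scanner.close();
  }
}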



Re: Batch puts interrupted ... Requested row out of range for HRegion filestore ...org.apache.hadoop.hbase.client.RetriesExhaustedException:

2010-08-06 Thread Stuart Smith
Hello Ryan,

  Yup. There's a hole, exactly where it should be.

I used add_table.rb once before, and am no expert on it.
All I have is a note written down:

To recover lost tables:
./hbase org.jruby.Main add_table.rb /hbase/filestore

Anything else I need to know? Do I just run the script like so?
Does anything need to be shut down before I do?

Thanks!

Take care,
  -stu



Re: Batch puts interrupted ... Requested row out of range for HRegion filestore ...org.apache.hadoop.hbase.client.RetriesExhaustedException:

2010-08-06 Thread Stuart Smith
Just to follow up - I ran add_table as I had done when I lost a table before - 
and it fixed the error.

Thanks!

Take care,
  -stu
