I still see the same behaviour in the patched 0.2.0 (though not in the
0.1.2 release), so I guess it wasn't 478.

Short summary of the problem:
- create table: everything ok
- admin.disableTable(), then admin.addColumn(), then admin.enableTable()
  (where admin is an HBaseAdmin): disableTable() doesn't always succeed.

I've taken a look at HBaseAdmin.disableTable().
The first thing I notice is that after waiting "numRetries" times for
the first region to be disabled, the method leaves the loop and proceeds
as if everything were all right, and in the log I get
"INFO org.apache.hadoop.hbase.client.HBaseAdmin: Disabled table xy".
That's quite misleading, since the table may still be enabled.
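
To make the first point concrete, here is a minimal sketch of a wait loop
that tells the caller when it ran out of retries instead of falling
through. It is not the HBaseAdmin code: the class, the method names and
the BooleanSupplier standing in for the "first region is offline" check
are all hypothetical, and it assumes Java 8 or later.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch, not the actual HBaseAdmin implementation.
final class DisableRetrySketch {

    // Polls the condition up to numRetries times.
    // Returns true as soon as the condition holds, false if retries run out
    // or the thread is interrupted, so a timeout is never mistaken for success.
    static boolean waitUntil(BooleanSupplier condition, int numRetries, long pauseMillis) {
        for (int tries = 0; tries < numRetries; tries++) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return false;
    }

    // Logs success only when the wait really succeeded; otherwise throws.
    static void disableTable(String tableName, BooleanSupplier firstRegionOffline,
                             int numRetries, long pauseMillis) {
        if (!waitUntil(firstRegionOffline, numRetries, pauseMillis)) {
            throw new IllegalStateException(
                "Table " + tableName + " not disabled after " + numRetries + " retries");
        }
        System.out.println("Disabled table " + tableName);
    }
}
```

With something like this, a table that never goes offline produces an
exception at the disable step rather than a "Disabled table" log line
followed by a TableNotDisabledException later on.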

Second, the scanner it creates returns the "region:regioninfo" cells for
all row keys (in this case the table name, right?) from the requested
table onward, not just for that table. Therefore the boolean "disabled"
turns true if a table that sorts after the intended one is disabled, and
again the method leaves the loop. If I disable, say, table4 and then call
disableTable() for table24, I immediately get the message
"Disabled table24" and then (not surprisingly) the
TableNotDisabledException. This is confusing - am I using the method in
the wrong way?
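
Here is a small self-contained sketch of what I think is happening in the
table4/table24 case. It does not touch HBase at all: a TreeMap stands in
for the catalog table, the row key layout "<table>,<startKey>,<id>" is my
assumption, and looseCheck/strictCheck are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of the scan; the map value is the region's offline flag.
final class MetaScanSketch {

    // Mimics the behaviour described above: scan from the table name onward
    // and report "disabled" as soon as any offline region is seen,
    // without checking which table the row belongs to.
    static boolean looseCheck(TreeMap<String, Boolean> meta, String tableName) {
        for (Map.Entry<String, Boolean> e : meta.tailMap(tableName, true).entrySet()) {
            if (e.getValue()) {
                return true;
            }
        }
        return false;
    }

    // Only looks at rows of the requested table and stops at the first row
    // of another table. Disabled means: at least one region, all offline.
    static boolean strictCheck(TreeMap<String, Boolean> meta, String tableName) {
        String prefix = tableName + ",";
        boolean sawRegion = false;
        for (Map.Entry<String, Boolean> e : meta.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // rows are sorted, so we have left the requested table
            }
            sawRegion = true;
            if (!e.getValue()) {
                return false;
            }
        }
        return sawRegion;
    }
}
```

With the rows "table24,,1" (online) and "table4,,1" (offline) in the map,
looseCheck(meta, "table24") reports true because "table4" sorts after
"table24", while strictCheck(meta, "table24") reports false.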

These issues aside, the question remains why sometimes a table cannot be
disabled. @Bryan: if I create the table along with the column family, I
don't get a problem - but I don't call disableTable() in that case. And
you're right about the oscillation. It's not unusual to watch a few
regions being deassigned from and reassigned again to the same node a
few times in a row. As I mentioned: I currently have only two nodes,
which admittedly might not be a typical setting. But still it should be
possible to work with.

Anyway, it seems that the table in question is often (usually?) being
reassigned to another RegionServer around the time that I want to add
the column. But maybe that's just coincidence.
I've tried raising hbase.client.retries.number to 20 (the "numRetries")
to give the process enough time, but the problem remains.
Can you give me any pointers on where to start digging?

micha

[EMAIL PROTECTED]: No short term need for stability guarantees, just checking
HBase out - I figured that TRUNK is for the adventurous-minded :)


Jim Kellerman wrote:
> Most likely it is 478. A patch is available for trunk, but has not been 
> committed yet because it is waiting for review.

>> It seems to me like it could be a case of HBASE-478. Usually it takes
>> a bigger table to cause issues though. If you create the table with
>> the column family to start with instead of adding after the table is
>> created, do you still have this problem?
>>
>> Another possible issue here is HBASE-615. It's unlikely that the
>> oscillation Jim saw is only during startup, so perhaps with the right
>> arrangement of number of nodes and number of regions, you're also
>> seeing oscillation.
>>
>> -Bryan
