For 3, we have the following check in createTable():
      if (lastKey != null && Bytes.equals(splitKey, lastKey)) {
        throw new IllegalArgumentException("All split keys must be unique, " +
          "found duplicate: " + Bytes.toStringBinary(splitKey) +
I wonder why you called createTableAsync() directly, since it doesn't perform
the split key check.
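
If you need to keep calling createTableAsync() directly, a similar guard can be run on the client before the call. The following is only a minimal sketch in plain Java, not the HBase code itself: the class name SplitKeyCheck, the method name requireUniqueSplitKeys, and the hand-written comparator are all made up for illustration. Sorting first is my assumption, so that duplicates end up next to each other.

```java
import java.util.Arrays;
import java.util.Comparator;

public class SplitKeyCheck {
    // Unsigned lexicographic byte order (hypothetical helper, written out
    // here so the sketch has no HBase dependency).
    public static final Comparator<byte[]> UNSIGNED_LEX = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return a.length - b.length;
    };

    // Throws IllegalArgumentException if any two split keys are equal.
    public static void requireUniqueSplitKeys(byte[][] splitKeys) {
        byte[][] sorted = splitKeys.clone(); // shallow copy; caller's order is untouched
        Arrays.sort(sorted, UNSIGNED_LEX);   // duplicates become adjacent
        byte[] lastKey = null;
        for (byte[] splitKey : sorted) {
            if (lastKey != null && Arrays.equals(splitKey, lastKey)) {
                throw new IllegalArgumentException(
                    "All split keys must be unique, found duplicate: "
                        + Arrays.toString(splitKey));
            }
            lastKey = splitKey;
        }
    }
}
```

Calling SplitKeyCheck.requireUniqueSplitKeys(splitKeys) right before createTableAsync() would reject the bad input on the client instead of letting it reach the master.
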
On Tue, May 17, 2011 at 4:59 PM, Ted Yu <[email protected]> wrote:
> For 1, the check in HCM.isTableAvailable() is:
> return available.get() && (regionCount.get() > 0);
> This explains why some regions aren't available.
>
> For 3, can you provide a unit test so that we can investigate further ?
>
> Thanks
>
>
> On Tue, May 17, 2011 at 4:25 PM, Vidhyashankar Venkataraman <
> [email protected]> wrote:
>
>> (Running HBase 0.90.0 on 700+ nodes.)
>>
>> You may have seen many (or mostly all) of the following issues already:
>> 1. HConnection.isTableAvailable: This doesn't seem to be working all the
>> time. In particular, I had this code after creating a table asynchronously:
>>
>> do {
>>   LOG.info("Table " + tableName + " not yet available... Sleeping for "
>>       + sleepTime + " milliseconds...");
>>   Thread.sleep(sleepTime);
>> } while (!conn.isTableAvailable(table.getTableName()));
>> LOG.info("Table is available!! : " + tableName + " Available? "
>>     + conn.isTableAvailable(table.getTableName()));
>>
>> It comes out of the loop but then I see this:
>> Table is available!! : <TABLE> Available? false
>>
>> And then I see that not all the regions are yet available.
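
Since the loop above shows the answer flipping from true to false between two consecutive calls, one possible workaround is to call the check once per iteration and only trust it after it has held for several polls in a row. This is a rough sketch, not an HBase API: AvailabilityPoller, waitUntilStable, and all of its parameters are hypothetical names, and the check is passed in as a plain BooleanSupplier so that nothing here depends on HBase.

```java
import java.util.function.BooleanSupplier;

public class AvailabilityPoller {
    // Polls `check` until it has returned true `requiredConsecutive` times in
    // a row, or until `timeoutMillis` has elapsed. Returns true only in the
    // first case. The check is evaluated exactly once per iteration.
    public static boolean waitUntilStable(BooleanSupplier check,
                                          int requiredConsecutive,
                                          long sleepMillis,
                                          long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        int consecutive = 0;
        while (true) {
            boolean ok = check.getAsBoolean();
            consecutive = ok ? consecutive + 1 : 0; // a single false resets the streak
            if (consecutive >= requiredConsecutive) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

The supplier would wrap the conn.isTableAvailable(...) call; any checked exception that call declares has to be handled inside the lambda. Note that this only smooths over a flapping answer. It does not prove that every region is actually online.
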
>>
>>
>> 2. The master getting stuck, unable to delete a WAL (I have seen this
>> before on this forum, along with a related JIRA): We had worked around it
>> by manually deleting the WAL. But when the master crashed during
>> table creation (with split key boundaries), the node that took over next as
>> the master (failover) started getting stuck for around 25% of the cluster. I
>> had to wipe out all the logs so that the master could start up right.
>>
>> But even then, the regionservers which had suffered the log issue couldn't
>> recognize the failed over master. (Is this something that has been observed
>> before?)
>>
>>
>> 3. createTableAsync with incorrect split keys: By mistake, I had some
>> duplicate keys in the split key byte array while calling the
>> createTableAsync function. The master crashed throwing a KeeperException
>> (thanks to the duplicate keys I guess?)
>>
>>
>> Also, can you let me know why createTableAsync blocks for some time and
>> throws a socket timeout exception when I try creating a table with a large
>> number of regions?
>>
>> Thank you
>> Vidhya
>>
>
>