[
https://issues.apache.org/jira/browse/HBASE-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966033#action_12966033
]
Lars George commented on HBASE-3295:
------------------------------------
I like the async approach plus proper communication to the user that they have
to poll for the status - some sort of queued or async task list. I have found
many reasons to press CTRL+C while waiting for a disable or drop, only to mess
things up. Ideally that could be built into the shell, i.e. it sends the command
asynchronously and then polls the state while printing "." on the command line
to report that it is still waiting. Once the async command has succeeded, it
reports so and exits the loop.
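
A minimal sketch of what such a shell loop could look like, written against the
Java client API rather than the shell's JRuby. HBaseAdmin, disableTable and
isTableDisabled are existing client calls; running the blocking disable on a
separate thread here only stands in for the real async submission this issue is
about:
{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class PollingDisable {
  public static void main(String[] args) throws Exception {
    final String table = args[0];
    final Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    // Kick the blocking disable off the shell's thread; in the real fix the
    // master would accept the request and return immediately instead.
    Thread worker = new Thread(new Runnable() {
      public void run() {
        try {
          // separate admin instance for the blocking call
          new HBaseAdmin(conf).disableTable(table);
        } catch (Exception e) {
          e.printStackTrace();
        }
      }
    });
    worker.start();

    // Poll the master and print "." so the user sees progress instead of a
    // seemingly hung shell, then report completion and exit the loop.
    while (!admin.isTableDisabled(table)) {
      System.out.print(".");
      System.out.flush();
      Thread.sleep(1000);
    }
    System.out.println(" disabled " + table);
  }
}
{noformat}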
> Dropping a 1k+ regions table likely ends in a client socket timeout and it's
> very confusing
> -------------------------------------------------------------------------------------------
>
> Key: HBASE-3295
> URL: https://issues.apache.org/jira/browse/HBASE-3295
> Project: HBase
> Issue Type: Bug
> Reporter: Jean-Daniel Cryans
> Fix For: 0.90.0
>
> Attachments: 3295-v2.txt, 3295.txt
>
>
> I tried truncating a 1.6k-region table from the shell and, after the usual
> disabling timeout, I got a socket timeout on the second invocation while it
> was dropping. It looked like this:
> {noformat}
> ERROR: java.net.SocketTimeoutException: Call to sv2borg180/10.20.20.180:61000
> failed on socket timeout exception:
> java.net.SocketTimeoutException: 60000 millis timeout while waiting for
> channel to be ready for read. ch :
> java.nio.channels.SocketChannel[connected local=/10.20.20.180:59153
> remote=sv2borg180/10.20.20.180:61000]
> {noformat}
> At first I thought that was coming from the master because HDFS was somehow
> slow, but then I understood that it was my socket that timed out, meaning that
> the master was still dropping the table. Calling truncate again, I got:
> {noformat}
> ERROR: Unknown table TestTable!
> {noformat}
> Which means that the table was deleted... except I learned later, after I shut
> down the cluster, that it wasn't totally deleted. That leaves me in a situation
> where I have to manually delete the files on the FS and the remaining .META.
> entries.
> Since I expect a few people will hit this issue rather soon, I propose that
> for 0.90.0 we just set the socket timeout really high in the shell. For 0.90.1
> or 0.92, we should do for drop what we do for disabling.
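> A minimal sketch of that stop-gap on the client side, assuming the socket
> timeout hit above is governed by the "hbase.rpc.timeout" property (that key is
> an assumption and may be named differently in 0.90):
> {noformat}
> // Raise the client-side timeout before creating the admin the shell uses,
> // so long-running disable/drop calls are not cut off after 60000 ms.
> // "hbase.rpc.timeout" is an assumed property name, not confirmed for 0.90.
> Configuration conf = HBaseConfiguration.create();
> conf.setInt("hbase.rpc.timeout", 600000); // 10 minutes instead of 60 seconds
> HBaseAdmin admin = new HBaseAdmin(conf);
> {noformat}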
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.