[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-19 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257271#comment-13257271
 ] 

Hudson commented on HBASE-5545:
---

Integrated in HBase-TRUNK-security #175 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/175/])
HBASE-5545 region can't be opened for a long time. Because the creating 
File failed. (Ram) (Revision 1327677)

 Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestFSUtils.java


 region can't be opened for a long time. Because the creating File failed.
 -

 Key: HBASE-5545
 URL: https://issues.apache.org/jira/browse/HBASE-5545
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.6
Reporter: gaojinchao
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.7, 0.92.2, 0.94.0

 Attachments: HBASE-5545.patch, HBASE-5545.patch


 Scenario:
 
 1. A file is created.
 2. While data is being written, all datanodes may crash, so the write fails.
 3. Even if close is called in a finally block, close will also fail and 
 throw an exception, because writing the data failed.
 4. If the RS then tries to create the same file again, an 
 AlreadyBeingCreatedException is thrown.
 
 Suggestion to handle this scenario (see the sketch after the overwrite notes 
 below):
 ---
 1. Check for the existence of the file; if it exists, delete the file and 
 create a new one. The delete call does not check whether the file is open 
 or closed.
 
 Overwrite option:
 
 1. The overwrite option applies only when you are overwriting a closed file.
 2. If the file is not closed, the same AlreadyBeingCreatedException is 
 thrown even with the overwrite option. This is the expected behaviour, to 
 keep multiple clients from writing to the same file.
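 A minimal sketch of this suggestion, assuming only the public Hadoop
 FileSystem API; the class and method names here are hypothetical, not the
 committed patch:
 
   import java.io.IOException;
 
   import org.apache.hadoop.fs.FSDataOutputStream;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;
 
   public class RecreateAfterFailedWrite {
     // Delete any half-written leftover before creating the file again, so a
     // dangling HDFS lease from a crashed write cannot block the new create().
     static FSDataOutputStream createFresh(FileSystem fs, Path file)
         throws IOException {
       if (fs.exists(file)) {
         fs.delete(file, false); // non-recursive: the target is a plain file
       }
       // overwrite=false; overwrite=true would not help here, because an
       // open (unclosed) file still fails with AlreadyBeingCreatedException.
       return fs.create(file, false);
     }
   }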
 Region server logs:
 org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo for DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 on client 158.1.132.19 because current leaseholder is trying to recreate file.
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658)
 at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131)
 at org.apache.hadoop.ipc.Client.call(Client.java:961)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245)
 at $Proxy6.create(Unknown Source)
 at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at $Proxy6.create(Unknown Source)
 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.init(DFSClient.java:3643)
 at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778)
 at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518)
 at org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424)
 at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
 at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672)
 at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658)
 at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330)
 at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116)
 at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 [2012-03-07 20:51:45,858] [WARN ] [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-23] [com.huawei.isap.ump.ha.client.RPCRetryAndSwitchInvoker 131] Retrying the method call: public abstract void

[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-18 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256224#comment-13256224
 ] 

Hadoop QA commented on HBASE-5545:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12523075/HBASE-5545.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 6 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.io.hfile.TestLruBlockCache

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1561//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1561//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1561//console

This message is automatically generated.


[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-18 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256238#comment-13256238
 ] 

ramkrishna.s.vasudevan commented on HBASE-5545:
---

@Lars
Thanks for your review.
Just checked: if the file does not exist, delete just returns false and no 
error is thrown.
bq. Do you want to set recursive to false (just in case somebody changes this 
around and it ends up pointing to a directory)...
Here .regioninfo is a file, and even if we pass true, delete will remove only 
the file.
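For reference, a tiny standalone check of the delete() semantics described 
above; the path and configuration are illustrative only:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class DeleteSemanticsCheck {
    public static void main(String[] args) throws Exception {
      FileSystem fs = FileSystem.get(new Configuration());
      Path regioninfo = new Path("/tmp/some-region/.tmp/.regioninfo");
      // On a missing path, delete() returns false instead of throwing.
      // For a plain file the recursive flag changes nothing: passing true
      // still removes only that file, never a surrounding directory.
      System.out.println(fs.delete(regioninfo, true));
    }
  }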


[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256757#comment-13256757
 ] 

Lars Hofhansl commented on HBASE-5545:
--

@Ram: yes, that is what I meant to say (recursive is not needed, and somebody 
might misunderstand later and use this to delete the tmp directory). Not important.

+1 on commit as is.


[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256938#comment-13256938
 ] 

Lars Hofhansl commented on HBASE-5545:
--

I'm going to commit this to 0.94 and 0.96 in the next few minutes.


[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256950#comment-13256950
 ] 

stack commented on HBASE-5545:
--

The additions to FSUtils are over the top, but +1 on patch -- deleting tmp 
content on open seems useful.
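
A rough idea of what deleting tmp content on open could look like; this is a 
hypothetical helper, and the actual FSUtils/HRegion change may differ:

  import java.io.IOException;

  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  final class TmpCleaner {
    // Clear a region's .tmp directory while opening the region, so stale
    // half-created files (e.g. .regioninfo) cannot block later creates.
    static void clearRegionTmp(FileSystem fs, Path regionDir) throws IOException {
      Path tmp = new Path(regionDir, ".tmp");
      if (fs.exists(tmp)) {
        fs.delete(tmp, true); // recursive: .tmp is a directory of leftovers
      }
    }
  }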


[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256981#comment-13256981
 ] 

Hudson commented on HBASE-5545:
---

Integrated in HBase-0.94-security #15 (See 
[https://builds.apache.org/job/HBase-0.94-security/15/])
HBASE-5545 region can't be opened for a long time. Because the creating 
File failed. (Ram) (Revision 1327676)

 Result = SUCCESS
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/TestFSUtils.java



[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256996#comment-13256996
 ] 

Hudson commented on HBASE-5545:
---

Integrated in HBase-TRUNK #2783 (See 
[https://builds.apache.org/job/HBase-TRUNK/2783/])
HBASE-5545 region can't be opened for a long time. Because the creating 
File failed. (Ram) (Revision 1327677)

 Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestFSUtils.java



[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257007#comment-13257007
 ] 

Hudson commented on HBASE-5545:
---

Integrated in HBase-0.94 #129 (See 
[https://builds.apache.org/job/HBase-0.94/129/])
HBASE-5545 region can't be opened for a long time. Because the creating 
File failed. (Ram) (Revision 1327676)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/TestFSUtils.java



[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255441#comment-13255441
 ] 

ramkrishna.s.vasudevan commented on HBASE-5545:
---

@Gao
Are you working on this patch? 


[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255642#comment-13255642
 ] 

ramkrishna.s.vasudevan commented on HBASE-5545:
---

@Lars
Do you mind taking this into 0.94.0? Though it exists in previous versions!!


[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255714#comment-13255714
 ] 

Lars Hofhansl commented on HBASE-5545:
--

If this gets done before HBASE-5782, I'll pull it in; otherwise I don't think I 
should hold up the release for this. Sounds good?



[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255779#comment-13255779
 ] 

Lars Hofhansl commented on HBASE-5545:
--

+1 on patch. I assume there are no strange race conditions in this part of the code.

[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255780#comment-13255780
 ] 

Zhihong Yu commented on HBASE-5545:
---

{code}
+// in the .tmp directory then on next time creation we will be getting
{code}
Should read 'then the next creation ...'
{code}
+  fs.delete(tmpPath, true);
{code}
Please check the return value from fs.delete().
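
For illustration, a minimal sketch (assuming hypothetical fs/tmpPath names; 
this is not the committed patch) of a delete guarded on its return value 
before the recreate:
{code}
// Hedged sketch, not the committed patch: check the return value of
// fs.delete() before recreating the .regioninfo file under .tmp.
// Assumes org.apache.hadoop.fs.{FileSystem, Path, FSDataOutputStream}.
private static FSDataOutputStream recreateTmpFile(FileSystem fs, Path tmpPath)
    throws IOException {
  if (fs.exists(tmpPath) && !fs.delete(tmpPath, true)) {
    // Fail fast here instead of letting the subsequent create() run
    // into AlreadyBeingCreatedException again.
    throw new IOException("Failed to delete existing " + tmpPath);
  }
  return fs.create(tmpPath, true); // overwrite=true covers the closed-file case
}
{code}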

[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255781#comment-13255781
 ] 

Uma Maheswara Rao G commented on HBASE-5545:


If that is the case, why can't we create the file with the overwrite flag set 
to true? It will automatically delete the file if it exists, right?
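
For reference, a minimal sketch of the overwrite-based create being suggested 
(variable names are illustrative only):
{code}
// FileSystem.create(Path, boolean overwrite): with overwrite=true the
// NameNode replaces an existing *closed* file. If the old file is still
// open (its lease is held), AlreadyBeingCreatedException is thrown
// anyway, which is the limitation ramkrishna describes below.
FSDataOutputStream out = fs.create(tmpPath, true);
{code}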

[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255824#comment-13255824
 ] 

Hadoop QA commented on HBASE-5545:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12522989/HBASE-5545.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 4 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1553//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1553//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1553//console

This message is automatically generated.

[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256174#comment-13256174
 ] 

ramkrishna.s.vasudevan commented on HBASE-5545:
---

@Uma
1. The overwrite option applies only when you are trying to overwrite a closed 
file.
2. If the file is not closed, then even with the overwrite option the same 
AlreadyBeingCreatedException will be thrown.
This is the expected behaviour, to avoid multiple clients writing to the same 
file.

The problem can arise if the close after the first create failed; in that case 
even the overwrite option does not work.
@Ted
I feel the return value in this case may not help us. Even if the delete 
fails, the create will still succeed in the normal case where the file was 
closed properly the previous time.
What do you think, Ted?

[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256181#comment-13256181
 ] 

ramkrishna.s.vasudevan commented on HBASE-5545:
---

@Lars
If you will be taking one more day to cut the RC, then by this evening I will 
give you an updated patch. Maybe the delete and exists APIs can be added in 
FSUtils.java and used from there, the same way create is done?
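
As a hedged illustration of that idea, one possible shape for such a helper; 
the name deleteFileIfExists is an assumption, not the actual patch:
{code}
// Hypothetical FSUtils-style wrapper, mirroring how create() is already
// routed through FSUtils; not the committed implementation.
public static boolean deleteFileIfExists(final FileSystem fs, final Path path)
    throws IOException {
  // Returns true if there was nothing to delete or the delete succeeded.
  return !fs.exists(path) || fs.delete(path, true);
}
{code}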


[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256186#comment-13256186
 ] 

ramkrishna.s.vasudevan commented on HBASE-5545:
---

@Zhihong
The creation will succeed because we pass overwrite as 'true' when creating.

[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256191#comment-13256191
 ] 

Uma Maheswara Rao G commented on HBASE-5545:


@Ram, OK. In this case the file was not closed because the DNs crashed, and 
the overwrite flag was already being passed as true from FSUtils; that can 
only overwrite closed files. So your proposal of deleting the file if it 
exists should be fine. Delete will succeed in any case.

[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256198#comment-13256198
 ] 

Lars Hofhansl commented on HBASE-5545:
--

@Ram: Isn't this a pretty remote corner case?
If we feel strongly that this should go into 0.94.0, let's get it in.


[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256204#comment-13256204
 ] 

ramkrishna.s.vasudevan commented on HBASE-5545:
---

@Lars
I will upload the patch now. Even if the RS goes down just before closing the 
.tmp file, we will hit the same problem. I accept it is not a regular one.

I just thought we could give this a try since HBASE-5782 is not yet in.

[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256206#comment-13256206
 ] 

Lars Hofhansl commented on HBASE-5545:
--

I see. Yep, that is more likely to happen than a DN crash (I think).
No hurry Ram :)


[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256210#comment-13256210
 ] 

Lars Hofhansl commented on HBASE-5545:
--

2nd patch looks good. I don't think you needed to add the FSUtils code, but 
it cannot hurt.
fs.delete does not throw if the file does not exist, right?
Do you want to set recursive to false (just in case somebody changes this 
around and it ends up pointing at a directory)? This is a super minor nit.

+1 on commit if delete does not throw on a non-existent file.
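
As a hedged illustration of the recursive-flag nit (reusing the assumed 
fs/tmpPath names from the sketches above):
{code}
// With recursive=false, delete() still removes a file (or an empty
// directory) but throws IOException if the path is a non-empty directory,
// which would catch a future refactor pointing this at a directory.
// For a missing path, fs.delete() returns false rather than throwing.
boolean deleted = fs.delete(tmpPath, false);
{code}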

[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-03-08 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13225815#comment-13225815
 ] 

ramkrishna.s.vasudevan commented on HBASE-5545:
---

@Gao
Thanks for bringing this in. We also need to check whether other places, such 
as compaction and flush, can hit a similar scenario, because everywhere we 
first create a file in .tmp and then rename it (see the sketch below).
Also changing the fix version to 0.92.2.
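
For context, here is a hedged sketch of the create-in-.tmp-then-rename pattern 
the comment refers to; writeAtomically and the argument names are made up for 
illustration and are not the actual HBase flush/compaction code:

import java.io.IOException;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TmpThenRenameSketch {
  /**
   * Write to a scratch file under .tmp, then rename it into place so
   * readers never observe a partially written file. If the writer dies
   * mid-write, though, the half-written .tmp file and its HDFS lease can
   * linger -- exactly the AlreadyBeingCreatedException scenario above.
   */
  public static void writeAtomically(FileSystem fs, Path regionDir,
      String name, byte[] content) throws IOException {
    Path tmp = new Path(new Path(regionDir, ".tmp"), name);
    Path dst = new Path(regionDir, name);
    // overwrite=true clobbers a *closed* leftover; an unclosed one would
    // still fail, which is why the fix deletes the stale file first
    FSDataOutputStream out = fs.create(tmp, true);
    try {
      out.write(content);
    } finally {
      out.close();
    }
    if (!fs.rename(tmp, dst)) {
      throw new IOException("Rename failed: " + tmp + " -> " + dst);
    }
  }
}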
