[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257271#comment-13257271 ]

Hudson commented on HBASE-5545:
-------------------------------

Integrated in HBase-TRUNK-security #175 (See [https://builds.apache.org/job/HBase-TRUNK-security/175/])
HBASE-5545 region can't be opened for a long time. Because the creating File failed. (Ram) (Revision 1327677)

Result = FAILURE
larsh :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestFSUtils.java

region can't be opened for a long time. Because the creating File failed.
-------------------------------------------------------------------------

                Key: HBASE-5545
                URL: https://issues.apache.org/jira/browse/HBASE-5545
            Project: HBase
         Issue Type: Bug
         Components: regionserver
   Affects Versions: 0.90.6
           Reporter: gaojinchao
           Assignee: ramkrishna.s.vasudevan
            Fix For: 0.90.7, 0.92.2, 0.94.0
        Attachments: HBASE-5545.patch, HBASE-5545.patch

Scenario:
1. A file is created.
2. While data is being written, all datanodes may crash, so the write fails.
3. Now, even though close is called in a finally block, close also fails and throws an exception, because writing the data failed.
4. If the region server then tries to create the same file again, AlreadyBeingCreatedException is thrown.

Suggestion to handle this scenario:
1. Check for the existence of the file; if it exists, delete it and create a new file. The delete call does not check whether the file is open or closed.

Overwrite option:
1. The overwrite option is applicable only when overwriting a closed file.
2. If the file is not closed, the same AlreadyBeingCreatedException is thrown even with the overwrite option. This is the expected behaviour, to prevent multiple clients from writing to the same file.

Region server logs:
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo for DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 on client 158.1.132.19 because current leaseholder is trying to recreate file.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658)
    at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131)
    at org.apache.hadoop.ipc.Client.call(Client.java:961)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245)
    at $Proxy6.create(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at $Proxy6.create(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.init(DFSClient.java:3643)
    at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518)
    at org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424)
    at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
[2012-03-07 20:51:45,858] [WARN ] [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-23] [com.huawei.isap.ump.ha.client.RPCRetryAndSwitchInvoker 131] Retrying the method call:
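The suggested fix above (check for the stale file, delete it, then create a fresh one) can be sketched without a Hadoop cluster. The sketch below uses java.nio.file as a stand-in for HDFS's FileSystem API, and the `recreate` helper name is illustrative, not the method the patch actually adds; the point is only the delete-before-create pattern that sidesteps the half-written file left by the failed open.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleteBeforeCreate {
    // Hypothetical helper mirroring the suggestion: remove any stale,
    // half-written file left by a previous failed attempt, then create anew.
    // deleteIfExists does not care whether a writer ever closed the file.
    static Path recreate(Path file, byte[] contents) throws IOException {
        Files.deleteIfExists(file);
        return Files.write(file, contents);  // fresh create, no stale file in the way
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("regioninfo-demo");
        Path info = tmp.resolve(".regioninfo");
        Files.write(info, new byte[]{1});     // simulate the stale file from the failed open
        recreate(info, "fresh".getBytes());
        System.out.println(new String(Files.readAllBytes(info)));  // prints "fresh"
    }
}
```

On real HDFS the delete additionally releases the NameNode lease held for the old file, which is what clears the AlreadyBeingCreatedException on the retry.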
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256224#comment-13256224 ]

Hadoop QA commented on HBASE-5545:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12523075/HBASE-5545.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
    org.apache.hadoop.hbase.io.hfile.TestLruBlockCache

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1561//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1561//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1561//console

This message is automatically generated.
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256238#comment-13256238 ]

ramkrishna.s.vasudevan commented on HBASE-5545:
-----------------------------------------------

@Lars
Thanks for your review. Just checked: if the file does not exist, delete just returns false and no error is thrown.

bq. Do you want to set recursive to false (just in case somebody changes this around ends up pointing to a directory)...

Here the regioninfo is a file, and even if we pass true it will delete only the file.
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256757#comment-13256757 ]

Lars Hofhansl commented on HBASE-5545:
--------------------------------------

@Ram: yes, that is what I meant to say (recursive is not needed, and somebody might misunderstand later and use this to delete the tmp directory). Not important. +1 on commit as is.
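The exchange above turns on two points: delete on a missing file reports false rather than throwing, and the recursive flag only matters for directories, since deleting a plain file removes just that file. Both can be demonstrated with java.nio.file as a local stand-in for HDFS's FileSystem.delete; the paths here are illustrative.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleteFileVsDir {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("tmp-demo");
        Path file = Files.createFile(dir.resolve(".regioninfo"));

        // Deleting a plain file removes only that file; the parent directory survives.
        Files.delete(file);
        System.out.println(Files.exists(file));  // false
        System.out.println(Files.exists(dir));   // true

        // Deleting a path that does not exist reports false instead of throwing.
        System.out.println(Files.deleteIfExists(file));  // false
    }
}
```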
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256938#comment-13256938 ]

Lars Hofhansl commented on HBASE-5545:
--------------------------------------

I'm going to commit this to 0.94 and 0.96 in the next few minutes.
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256950#comment-13256950 ]

stack commented on HBASE-5545:
------------------------------

The additions to FSUtils are over the top, but +1 on patch -- deleting tmp content on open seems useful.
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256981#comment-13256981 ]

Hudson commented on HBASE-5545:
-------------------------------

Integrated in HBase-0.94-security #15 (See [https://builds.apache.org/job/HBase-0.94-security/15/])
HBASE-5545 region can't be opened for a long time. Because the creating File failed. (Ram) (Revision 1327676)

Result = SUCCESS
larsh :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/TestFSUtils.java
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256996#comment-13256996 ] Hudson commented on HBASE-5545: --- Integrated in HBase-TRUNK #2783 (See [https://builds.apache.org/job/HBase-TRUNK/2783/]) HBASE-5545 region can't be opened for a long time. Because the creating File failed. (Ram) (Revision 1327677) Result = FAILURE larsh : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestFSUtils.java
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257007#comment-13257007 ] Hudson commented on HBASE-5545: --- Integrated in HBase-0.94 #129 (See [https://builds.apache.org/job/HBase-0.94/129/]) HBASE-5545 region can't be opened for a long time. Because the creating File failed. (Ram) (Revision 1327676) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/TestFSUtils.java
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255441#comment-13255441 ] ramkrishna.s.vasudevan commented on HBASE-5545: --- @Gao Are you working on this patch?
[2012-03-07 20:51:45,858] [WARN ] [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-23] [com.huawei.isap.ump.ha.client.RPCRetryAndSwitchInvoker 131] Retrying the method call: public abstract void
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255642#comment-13255642 ] ramkrishna.s.vasudevan commented on HBASE-5545: --- @Lars Do you mind taking this into 0.94.0? Though it exists in previous versions!!
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255714#comment-13255714 ] Lars Hofhansl commented on HBASE-5545: -- If this gets done before HBASE-5782 I'll pull it in, otherwise I don't think I should hold up the release for this. Sounds good?
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255779#comment-13255779 ] Lars Hofhansl commented on HBASE-5545: -- +1 on patch. I assume there are no strange race conditions in this part of the code.
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255780#comment-13255780 ] Zhihong Yu commented on HBASE-5545: ---
{code}
+// in the .tmp directory then on next time creation we will be getting
{code}
Should read 'then the next creation ...'
{code}
+ fs.delete(tmpPath, true);
{code}
Please check the return value from fs.delete().
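The review point above — checking the return value of the delete call rather than ignoring it — can be sketched as follows. This is only an illustrative pattern using `java.nio.file` against the local filesystem; the class name, method name, and exception message are hypothetical, not the actual FSUtils/HRegion code. The real fix operates on `org.apache.hadoop.fs.FileSystem`, whose `delete(path, true)` likewise returns a boolean that should be checked.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class RecreateTmpFile {
    // Before recreating a .tmp file, delete any stale copy left by a
    // failed earlier open attempt, and fail loudly if the delete reports
    // failure instead of silently proceeding to the create.
    public static Path recreate(Path tmpPath) throws IOException {
        if (Files.exists(tmpPath)) {
            boolean deleted = Files.deleteIfExists(tmpPath);
            if (!deleted) {
                // Mirrors the review request: do not ignore the boolean
                // returned by the delete call.
                throw new IOException("could not delete stale file " + tmpPath);
            }
        }
        return Files.createFile(tmpPath);
    }
}
```

The point of the pattern is that a silently failed delete would reproduce the original symptom: the subsequent create would collide with the stale file again on every retry.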
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382) at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658) at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131) at org.apache.hadoop.ipc.Client.call(Client.java:961) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245) at $Proxy6.create(Unknown Source) at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at $Proxy6.create(Unknown Source) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.init(DFSClient.java:3643) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518) at org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672) at 
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) [2012-03-07 20:51:45,858] [WARN ]
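The review suggestion to act on fs.delete()'s return value can be sketched with a local-filesystem stand-in. This is only an illustration: java.nio.file plays the role of Hadoop's FileSystem API here, and the helper name deleteStaleTmpFile and its WARN message are hypothetical, not from the patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the review suggestion: act on the delete call's boolean result
// instead of discarding it. On HDFS the call would be fs.delete(tmpPath, true);
// like Files.deleteIfExists, it returns false when nothing was deleted.
public class CheckedDelete {
    static boolean deleteStaleTmpFile(Path tmpPath) throws IOException {
        boolean deleted = Files.deleteIfExists(tmpPath);
        if (!deleted) {
            // Nothing was removed; worth at least a WARN before retrying create.
            System.err.println("WARN: nothing deleted at " + tmpPath);
        }
        return deleted;
    }

    // Self-check on the local filesystem: the first delete removes the stale
    // file, the second finds nothing to remove.
    static String demo() throws IOException {
        Path tmp = Files.createTempDirectory("hbase-tmp").resolve(".regioninfo");
        Files.write(tmp, new byte[]{1});
        return deleteStaleTmpFile(tmp) + "," + deleteStaleTmpFile(tmp);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // prints true,false
    }
}
```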
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255781#comment-13255781 ] Uma Maheswara Rao G commented on HBASE-5545: If that is the case, why can't we create the file with the overwrite flag set to true? It will automatically delete the file if it exists, right?
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255824#comment-13255824 ] Hadoop QA commented on HBASE-5545: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12522989/HBASE-5545.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1553//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1553//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1553//console This message is automatically generated.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256174#comment-13256174 ] ramkrishna.s.vasudevan commented on HBASE-5545: --- @Uma 1. The overwrite option is applicable only if you are trying to overwrite a closed file. 2. If the file is not closed, then even with the overwrite option the same AlreadyBeingCreatedException will be thrown. This is the expected behaviour, to avoid multiple clients writing to the same file. The problem can come if the close of the first create failed, so even if we use the overwrite option it does not work. @Ted The return value in this case may not help us, I feel. If the delete fails in the normal case, the create will still be successful as long as the file was closed normally the previous time. What do you feel, Ted?
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256181#comment-13256181 ] ramkrishna.s.vasudevan commented on HBASE-5545: --- @Lars If you will be taking one more day to cut the RC, then by this evening I will give you an updated patch. Maybe the delete and exists APIs can be added in FSUtils.java and used from there, the same way create is done?
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256186#comment-13256186 ] ramkrishna.s.vasudevan commented on HBASE-5545: --- @Zhihong The creation will succeed because while creating we pass overwrite as 'true'.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256191#comment-13256191 ] Uma Maheswara Rao G commented on HBASE-5545: @Ram, OK, in this case the file was not closed because the DNs crashed. The overwrite flag was already passed as true from FSUtils, but it can overwrite only closed files. So your proposal of deleting the file if it exists should be fine. Delete will succeed in any case.
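The fix agreed on above (delete any stale tmp file first, then create fresh) can be sketched on the local filesystem. This is a hypothetical java.nio.file analogue of the pattern, not the actual FSUtils/HRegion code; on HDFS the delete call would be fs.delete(tmpPath, true), which ignores the lease state that makes create-with-overwrite fail.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical local-filesystem analogue of the agreed fix: remove any stale
// .tmp/.regioninfo before recreating it. Files.delete stands in for HDFS's
// fs.delete(tmpPath, true), which succeeds even while a dead client's lease
// is still held on the file.
public class RecreateTmpFile {
    static void rewriteRegioninfo(Path tmpPath, byte[] content) throws IOException {
        if (Files.exists(tmpPath)) {
            Files.delete(tmpPath); // delete does not care whether the file was closed
        }
        // CREATE_NEW makes any lingering copy fail loudly rather than being
        // silently truncated.
        Files.write(tmpPath, content,
            StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
    }

    // Self-check: a stale one-byte leftover is replaced by fresh two-byte content.
    static long demo() throws IOException {
        Path tmp = Files.createTempDirectory("hbase-tmp").resolve(".regioninfo");
        Files.write(tmp, new byte[]{1});
        rewriteRegioninfo(tmp, new byte[]{2, 3});
        return Files.size(tmp);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // prints 2
    }
}
```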
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256198#comment-13256198 ] Lars Hofhansl commented on HBASE-5545: -- @Ram: Isn't this a pretty remote corner case? If we feel strongly that this should go into 0.94.0, let's get it in.
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256204#comment-13256204 ] ramkrishna.s.vasudevan commented on HBASE-5545: --- @Lars I will upload the patch now. Even if the RS goes down just before closing the .tmp file, we will hit the same problem. I accept it is not a regular one; I just thought I could give this a try since HBASE-5782 was not yet in.
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256206#comment-13256206 ] Lars Hofhansl commented on HBASE-5545: -- I see. Yep, that is more likely to happen than a DN crash (I think). No hurry Ram :)
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256210#comment-13256210 ] Lars Hofhansl commented on HBASE-5545: -- 2nd patch looks good. I don't think you needed to add the FSUtil code, but it cannot hurt. fs.delete does not throw if the file does not exist, right? Do you want to set recursive to false (just in case somebody changes this around and it ends up pointing to a directory)? This is a super minor nit. +1 on commit if delete does not throw on a non-existent file.
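The delete-then-create pattern under review can be sketched as follows. This is a hypothetical illustration using `java.nio.file` on the local filesystem rather than Hadoop's `FileSystem` API (in the patch itself the calls would be `fs.delete(path, recursive)` followed by `fs.create(path)`); the class and method names here are made up for the sketch. On HDFS, `delete` returns false rather than throwing when the path is absent, which is the behaviour Lars is asking about above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of the delete-if-exists-then-create fix, with
// java.nio.file standing in for Hadoop's FileSystem. Deleting first clears
// any half-written file left by a crashed writer, so the subsequent create
// cannot fail with AlreadyBeingCreatedException for a stale lease holder.
public class RecreateFile {
    // Remove a possibly half-written file, then create it fresh with the data.
    public static Path recreate(Path p, byte[] data) throws IOException {
        Files.deleteIfExists(p);     // no-op if the file is not there, like fs.delete
        return Files.write(p, data); // create and write the new content
    }
}
```

Note the local analogue of Lars's nit: `Files.deleteIfExists` refuses to delete a non-empty directory, which matches passing `recursive=false` to `fs.delete`.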
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13225815#comment-13225815 ] ramkrishna.s.vasudevan commented on HBASE-5545: --- @Gao Thanks for bringing this in. We also need to check whether other places such as compaction and flush have similar scenarios, because everywhere we first create a file in .tmp and then rename it. Also changing the version to 0.92.2.
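The create-in-.tmp-then-rename convention mentioned above can be sketched like this. Again this is a hypothetical local-filesystem illustration with `java.nio.file` standing in for HDFS, and the names are invented for the sketch. The value of the convention is that a crash mid-write leaves only debris under the temporary name, while the final name only ever appears fully written.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch of the write-to-.tmp-then-rename convention, using
// java.nio.file in place of Hadoop's FileSystem API. A failure while writing
// leaves only the temporary file; the target appears only once complete.
public class TmpThenRename {
    public static Path writeViaTmp(Path target, byte[] data) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.write(tmp, data);                                   // may fail midway; target untouched
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);  // publish in one step
        return target;
    }
}
```

The failure mode in this issue is exactly the leftover `.tmp` file: a retry that recreates the same temporary path must first deal with the stale copy, which is what the delete-then-create fix addresses.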