Hi Jon, I've discovered another issue with snapshot export. If a region has recently split and you take a snapshot of that table, then try to export it while the child regions still hold references to the files of the split parent, those files will not be transferred and will be counted in the missing total. You end up with error messages like:
java.io.FileNotFoundException: Unable to open link: org.apache.hadoop.hbase.io.HLogLink

Please let me know if you would like any additional information.

Thanks and have a great day,

Sean

On Wednesday, 24 April, 2013 at 9:19 AM, Sean MacDonald wrote:

> Hi Jon,
>
> No problem. We do have snapshots enabled on the target cluster, and we are
> using the default hfile archiver settings on both clusters.
>
> Thanks,
>
> Sean
>
> On Tuesday, 23 April, 2013 at 1:54 PM, Jonathan Hsieh wrote:
>
> > Sean,
> >
> > Thanks for finding this problem. Can you provide some more information so
> > that we can try to duplicate and fix this problem?
> >
> > Are snapshots enabled on the target cluster?
> > What are the hfile archiver settings in your hbase-site.xml on both
> > clusters?
> >
> > Thanks,
> > Jon.
> >
> > On Mon, Apr 22, 2013 at 4:47 PM, Sean MacDonald <[email protected]> wrote:
> >
> > > It looks like you can't export a snapshot to a running cluster or it will
> > > start cleaning up files from the archive after a period of time. I have
> > > turned off HBase on the destination cluster and the export is working as
> > > expected now.
> > >
> > > Sean
> > >
> > > On Monday, 22 April, 2013 at 9:22 AM, Sean MacDonald wrote:
> > >
> > > > Hello,
> > > >
> > > > I am using HBase 0.94.6 on CDH 4.2 and trying to export a snapshot to
> > > > another cluster (also CDH 4.2), but this is failing repeatedly. The
> > > > table I am trying to export is approximately 4TB in size and has 10GB
> > > > regions. Each of the map jobs runs for about 6 minutes and appears to
> > > > be running properly, but then fails with a message like the following:
> > > >
> > > > 2013-04-22 16:12:50,699 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception
> > > > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
> > > > No lease on /hbase/.archive/queries/533fcbb7858ef34b103a4f8804fa8719/d/651e974dafb64eefb9c49032aec4a35b
> > > > File does not exist. Holder DFSClient_NONMAPREDUCE_-192704511_1 does not have any open files.
> > > >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2396)
> > > >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2387)
> > > >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2183)
> > > >     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:481)
> > > >     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297)
> > > >     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080)
> > > >     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
> > > >     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
> > > >     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1695)
> > > >     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1691)
> > > >     at java.security.AccessController.doPrivileged(Native Method)
> > > >     at javax.security.auth.Subject.doAs(Subject.java:396)
> > > >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> > > >     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1689)
> > > >
> > > > I was able to see the file that the LeaseExpiredException mentions on
> > > > the destination cluster before the exception happened (it is gone
> > > > afterwards).
> > > >
> > > > Any help that could be provided in resolving this would be greatly
> > > > appreciated.
> > > >
> > > > Thanks and have a great day,
> > > >
> > > > Sean
> >
> > --
> > // Jonathan Hsieh (shay)
> > // Software Engineer, Cloudera
> > // [email protected]
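For readers following along, the workflow discussed in this thread can be sketched as shell commands. This is a hedged sketch only: the snapshot name, destination NameNode URI, and mapper count are placeholders I've introduced for illustration, not values from the thread.

```shell
# Hypothetical names throughout -- nothing here is taken from the thread.
SNAPSHOT="queries-snap"                  # placeholder snapshot name
DEST_ROOT="hdfs://dest-nn:8020/hbase"    # placeholder destination HBase root dir

# 1) Inspect the snapshot's file references before exporting. SnapshotInfo
#    reports a count of missing files; per Sean's note, files still referenced
#    through a recently split parent region can land in that missing total.
hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo \
  -snapshot "$SNAPSHOT" -files -stats

# 2) Export the snapshot to the other cluster. Per the thread, if HBase is
#    running on the destination its archive cleaner may delete files
#    mid-export (surfacing as the LeaseExpiredException above); Sean's
#    workaround was to stop HBase on the destination during the export.
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
  -snapshot "$SNAPSHOT" -copy-to "$DEST_ROOT" -mappers 16
```

The stop-the-destination workaround trades availability for a consistent copy; the underlying cleaner/exporter race is the bug being reported here.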
