Have you checked the NameNode audit log to see which client deleted the files?
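If it helps, here is a rough sketch of that check in Python. It assumes audit logging is enabled on the NameNode and that entries use the tab-separated key=value layout I remember from Hadoop 1.x (ugi=, ip=, cmd=, src=, dst=), so please verify it against a real line from your log first. The function name find_removals, the sample lines and the paths in them are made up for illustration. I included rename as well as delete, since a file that was moved away from its source path also shows up as "File does not exist".

```python
import re

# Assumed layout of a Hadoop 1.x audit entry (tab-separated key=value pairs):
# <timestamp> INFO FSNamesystem.audit: ugi=<user> ip=/<addr> cmd=<op> src=<path> dst=<path> perm=<perm>
FIELD = re.compile(r"\b(ugi|ip|cmd|src|dst)=(\S+)")


def find_removals(lines, prefix, commands=("delete", "rename")):
    """Return parsed fields of audit entries that deleted or moved paths under prefix."""
    hits = []
    for line in lines:
        fields = dict(FIELD.findall(line))
        if fields.get("cmd") in commands and fields.get("src", "").startswith(prefix):
            fields["raw"] = line.rstrip("\n")  # keep the full entry for the timestamp
            hits.append(fields)
    return hits


# Made-up sample entries, only to show the expected shape of the input.
SAMPLE_LINES = [
    "2014-07-06 03:40:01,000 INFO FSNamesystem.audit: ugi=etl\tip=/10.0.0.5\t"
    "cmd=rename\tsrc=/data/example/HFiles_x/gen/aaa\tdst=/hbase/example/gen/aaa\tperm=etl:hadoop:rw-r--r--",
    "2014-07-06 03:40:02,000 INFO FSNamesystem.audit: ugi=etl\tip=/10.0.0.5\t"
    "cmd=open\tsrc=/data/example/HFiles_x/gen/bbb\tdst=null\tperm=null",
    "2014-07-06 03:40:03,000 INFO FSNamesystem.audit: ugi=ops\tip=/10.0.0.9\t"
    "cmd=delete\tsrc=/tmp/other/file\tdst=null\tperm=null",
]

if __name__ == "__main__":
    for hit in find_removals(SAMPLE_LINES, "/data/example/HFiles_x"):
        print(hit.get("ugi"), hit.get("ip"), hit["cmd"], hit["src"], "->", hit.get("dst"))
```

To run it against the real log, pass an open file handle instead of SAMPLE_LINES and use the HFiles_20140705 output directory from your trace as the prefix. The ugi and ip fields of any hit should tell you which user and host removed or moved the files.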
Thanks

On Jul 6, 2014, at 4:19 AM, Amit Sela <[email protected]> wrote:

> I have a bulk load job running daily for months, when suddenly I got
> a FileNotFoundException.
>
> Googling it I found HBASE-4030 and I noticed someone reporting it started
> to re-appear at 0.94.8.
>
> I'm running with Hadoop 1.0.4 and 0.94.12.
>
> Anyone else encountered this problem lately ?
>
> Re-open the Jira ?
>
> Thanks,
>
> Amit.
>
> *On the client side this is the Exception:*
>
> java.net.SocketTimeoutException: Call to node.xxx.com/xxx.xxx.xxx.xxx:PORT
> failed on socket timeout exception: java.net.SocketTimeoutException: 60000
> millis timeout while waiting for channel to be ready for read. ch :
> java.nio.channels.SocketChannel[connected
> local=/xxx.xxx.xxx.xxx:PORT remote=node.xxx.com/xxx.xxx.xxx.xxx:PORT]
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$3@29f2a6e3,
> org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.io.MultipleIOException: 6 exceptions
> [java.io.FileNotFoundException: File does not exist:
> /data/output_jobs/output_websites/HFiles_20140705/metadata/88fd743853cf4f8a862fb19646027a48,
> java.io.FileNotFoundException: File does not exist:
> /data/output_jobs/output_websites/HFiles_20140705/gen/31c4c5cea9b348dbb6bb94115a483877,
> java.io.FileNotFoundException: File does not exist:
> /data/output_jobs/output_websites/HFiles_20140705/gen/5762c45aaf4f408ba748a989f7be9647,
> java.io.FileNotFoundException: File does not exist:
> /data/output_jobs/output_websites/HFiles_20140705/gen1/2ee02a005b654704a092d16c5c713373,
> java.io.FileNotFoundException: File does not exist:
> /data/output_jobs/output_websites/HFiles_20140705/gen1/618251330a1842a797de4b304d341a02,
> java.io.FileNotFoundException: File does not exist:
> /data/output_jobs/output_websites/HFiles_20140705/metadata/3955039392ce4f49aee5f58218a61be1]
>     at org.apache.hadoop.io.MultipleIOException.createIOException(MultipleIOException.java:47)
>     at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3673)
>     at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3622)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFiles(HRegionServer.java:2930)
>     at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
>
> *On the regionserver:*
>
> ERROR org.apache.hadoop.hbase.regionserver.HRegion: There were one or more
> IO errors when checking if the bulk load is ok.
> org.apache.hadoop.io.MultipleIOException: 6 exceptions
> [java.io.FileNotFoundException: File does not exist:
> /data/output_jobs/output_websites/HFiles_20140705/metadata/88fd743853cf4f8a862fb19646027a48,
> java.io.FileNotFoundException: File does not exist:
> /data/output_jobs/output_websites/HFiles_20140705/gen/31c4c5cea9b348dbb6bb94115a483877,
> java.io.FileNotFoundException: File does not exist:
> /data/output_jobs/output_websites/HFiles_20140705/gen/5762c45aaf4f408ba748a989f7be9647,
> java.io.FileNotFoundException: File does not exist:
> /data/output_jobs/output_websites/HFiles_20140705/gen1/2ee02a005b654704a092d16c5c713373,
> java.io.FileNotFoundException: File does not exist:
> /data/output_jobs/output_websites/HFiles_20140705/gen1/618251330a1842a797de4b304d341a02,
> java.io.FileNotFoundException: File does not exist:
> /data/output_jobs/output_websites/HFiles_20140705/metadata/3955039392ce4f49aee5f58218a61be1]
>     at org.apache.hadoop.io.MultipleIOException.createIOException(MultipleIOException.java:47)
>     at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3673)
>     at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:3622)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFiles(HRegionServer.java:2930)
>     at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
>
> followed by:
>
> ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting call
> next(4522610431482097770, 250), rpc version=1, client version=29,
> methodsFingerPrint=-1368823753 from xxx.xxx.xxx.xxx
> after 12507 ms, since caller disconnected
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:436)
>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3980)
>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3890)
>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3880)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2648)
>     at sun.reflect.GeneratedMethodAccessor60.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> 2014-07-06 03:52:14,278 [IPC Server handler 28 on 8041] ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting call
> next(7354511084312054096, 250), rpc version=1, client version=29,
> methodsFingerPrint=-1368823753 from xxx.xxx.xxx.xxx after
> 9476 ms, since caller disconnected
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:436)
>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3980)
>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3890)
>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3880)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2648)
>     at sun.reflect.GeneratedMethodAccessor60.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
