Himanshu et al., I see this error in one region. The .meta WAL file is having a problem, I guess. Thanks for all the help.
13/09/16 02:13:39 WARN wal.HLogSplitter: Could not open hdfs://ssp-master:8020/apps/hbase/data/WALs/ssp-region3,60020,1379105062356-splitting/ssp-region3%2C60020%2C1379105062356.1379105067148.meta for reading. File is empty
java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:180)
        at java.io.DataInputStream.readFully(DataInputStream.java:152)
        at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1512)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1490)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1479)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1474)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:69)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.reset(SequenceFileLogReader.java:174)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.initReader(SequenceFileLogReader.java:183)
        at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
        at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createReader(HLogFactory.java:126)
        at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createReader(HLogFactory.java:89)

and in the master I see

2013-09-16 02:33:28,755 [RpcServer.handler=27,port=60000] ERROR org.apache.hadoop.ipc.RpcServer.call - Unexpected throwable object
java.lang.NullPointerException
        at org.apache.hadoop.hbase.master.TableNamespaceManager.get(TableNamespaceManager.java:111)
        at org.apache.hadoop.hbase.master.HMaster.getNamespaceDescriptor(HMaster.java:3116)
        at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1779)
        at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1820)
        at org.apache.hadoop.hbase.protobuf.generated.MasterAdminProtos$MasterAdminService$2.callBlockingMethod(MasterAdminProtos.java:27720)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2156)
        at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1861)
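In case it is useful for confirming the first error, here is a minimal sketch that just lists zero-length files under the -splitting directory with the FileSystem API. The directory is copied from the HLogSplitter warning above; the class name and the approach are only an illustration, not a recommended fix.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class FindEmptyWals {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Directory taken from the HLogSplitter warning; adjust as needed.
        Path splitting = new Path(
            "hdfs://ssp-master:8020/apps/hbase/data/WALs/"
            + "ssp-region3,60020,1379105062356-splitting");
        FileSystem fs = FileSystem.get(splitting.toUri(), conf);
        for (FileStatus st : fs.listStatus(splitting)) {
          if (st.getLen() == 0) {
            System.out.println("zero-length WAL: " + st.getPath());
          }
        }
      }
    }

If that .meta log really is zero bytes, it is presumably just an empty log left behind when the region server went down, but that is only a guess from the warning text.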
Regards,
- kiru

________________________________
From: Himanshu Vashishtha <[email protected]>
To: [email protected]; Kiru Pakkirisamy <[email protected]>
Cc: Ted Yu <[email protected]>
Sent: Friday, September 13, 2013 4:18 PM
Subject: Re: 0.95.2 upgrade errors

Is there something odd (exceptions on starting) in the Master logs? Does it start normally?

You removed the above jar from the HMaster node too? Could you restart the master and see if that changes anything?

On Fri, Sep 13, 2013 at 1:47 PM, Kiru Pakkirisamy <[email protected]> wrote:
> I see a hadoop-core-1.1.2 jar in the hbase/lib dir. Is this the same as the
> hadoop 'core' jar in the hadoop directory?
> If I remove it, the regionservers seem to stay up, but create table fails.
>
> ERROR: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=7, exceptions:
> Fri Sep 13 16:44:48 EDT 2013, org.apache.hadoop.hbase.client.RpcRetryingCaller@734893da, org.apache.hadoop.hbase.ipc.RemoteWithExtrasException: java.io.IOException
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2194)
>         at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1861)
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.hbase.master.TableNamespaceManager.get(TableNamespaceManager.java:111)
>
> Regards,
> - kiru
>
> ________________________________
> From: Himanshu Vashishtha <[email protected]>
> To: [email protected]; Kiru Pakkirisamy <[email protected]>
> Cc: Ted Yu <[email protected]>
> Sent: Friday, September 13, 2013 10:22 AM
> Subject: Re: 0.95.2 upgrade errors
>
> Sounds odd, but just to be sure you are picking up the right hadoop jars, you
> could look at the classpath dump in the logs when the regionserver starts.
>
> On Fri, Sep 13, 2013 at 9:33 AM, Kiru Pakkirisamy <[email protected]> wrote:
>> We are using hadoop-core-1.2.0.1.3.0.0-107.jar.
>> The versions are the same on the other cluster and it works there, but this
>> one is on EC2. That's the only difference. Beats me for now.
>>
>> Regards,
>> - kiru
>>
>> ________________________________
>> From: Ted Yu <[email protected]>
>> To: "[email protected]" <[email protected]>; Kiru Pakkirisamy <[email protected]>
>> Sent: Friday, September 13, 2013 9:30 AM
>> Subject: Re: 0.95.2 upgrade errors
>>
>> java.io.IOException: IOException flush:org.apache.hadoop.ipc.RemoteException: java.io.IOException:
>> java.lang.NoSuchMethodException:
>> org.apache.hadoop.hdfs.protocol.ClientProtocol.fsync(java.lang.String, java.lang.String)
>>
>> What version of hadoop is in use on this machine?
>>
>> Thanks
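To make the classpath check suggested above concrete, here is a small sketch that prints which Hadoop version is actually picked up and which jar it was loaded from when run with the same classpath as the daemons. The class name and the `hbase classpath` invocation are my assumptions, not something from the thread.

    // Prints the Hadoop version on the current classpath and the jar it came from.
    // Run with the daemon's classpath, e.g. `java -cp "$(hbase classpath)" WhichHadoop`
    // (assuming the `hbase classpath` helper is available in your install).
    import java.net.URL;
    import org.apache.hadoop.util.VersionInfo;

    public class WhichHadoop {
      public static void main(String[] args) {
        System.out.println("Hadoop version: " + VersionInfo.getVersion());
        URL source = VersionInfo.class.getProtectionDomain()
            .getCodeSource().getLocation();
        System.out.println("Loaded from:    " + source);
      }
    }

If what this prints does not match what the HDFS cluster itself is running, that would be consistent with the ClientProtocol.fsync NoSuchMethodException quoted just above.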
>> On Fri, Sep 13, 2013 at 9:27 AM, Kiru Pakkirisamy <[email protected]> wrote:
>> > We were able to migrate one cluster to 0.95.2 (from 0.94.11) without any problems.
>> > But in another cluster, we have run into a few issues. Any pointers?
>> >
>> > 13/09/13 07:57:09 FATAL wal.FSHLog: Could not sync. Requesting roll of hlog
>> > java.io.IOException: DFSOutputStream is closed
>> >         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3845)
>> >         at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
>> >         at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:135)
>> >         at org.apache.hadoop.hbase.regionserver.wal.FSHLog.syncer(FSHLog.java:1075)
>> >         at org.apache.hadoop.hbase.regionserver.wal.FSHLog.syncer(FSHLog.java:1013)
>> >         at org.apache.hadoop.hbase.regionserver.wal.FSHLog.sync(FSHLog.java:1186)
>> >         at org.apache.hadoop.hbase.regionserver.wal.FSHLog$LogSyncer.run(FSHLog.java:964)
>> >         at java.lang.Thread.run(Thread.java:662)
>> > 13/09/13 07:57:09 ERROR wal.FSHLog: Error while syncing, requesting close of hlog
>> > java.io.IOException: DFSOutputStream is closed
>> >         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3845)
>> >         at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
>> >         at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:135)
>> >         at org.apache.hadoop.hbase.regionserver.wal.FSHLog.syncer(FSHLog.java:1075)
>> >         at org.apache.hadoop.hbase.regionserver.wal.FSHLog.syncer(FSHLog.java:1013)
>> >         at org.apache.hadoop.hbase.regionserver.wal.FSHLog.sync(FSHLog.java:1186)
>> >         at org.apache.hadoop.hbase.regionserver.wal.FSHLog$LogSyncer.run(FSHLog.java:964)
>> >         at java.lang.Thread.run(Thread.java:662)
>> > 13/09/13 07:57:09 INFO regionserver.HRegionServer: stopping server ssp-region1,60020,1379073423978; all regions closed.
>> > 13/09/13 07:57:10 INFO wal.FSHLog: RS_OPEN_META-ssp-region1:60020-0.logSyncer exiting
>> > 13/09/13 07:57:10 ERROR regionserver.HRegionServer: Metalog close and delete failed
>> > java.io.IOException: IOException flush:org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.NoSuchMethodException: org.apache.hadoop.hdfs.protocol.ClientProtocol.fsync(java.lang.String, java.lang.String)
>> >         at java.lang.Class.getMethod(Class.java:1605)
>> >         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:581)
>> >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1444)
>> >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1440)
>> >
>> > and finally the region server shuts down
>> >
>> > 13/09/13 07:57:10 INFO wal.FSHLog: regionserver60020.logSyncer exiting
>> > 13/09/13 07:57:10 INFO regionserver.Leases: regionserver60020 closing leases
>> > 13/09/13 07:57:10 INFO regionserver.Leases: regionserver60020 closed leases
>> > 13/09/13 07:57:15 INFO regionserver.HRegionServer$PeriodicMemstoreFlusher: regionserver60020.periodicFlusher exiting
>> > 13/09/13 07:57:15 INFO regionserver.Leases: regionserver60020.leaseChecker closing leases
>> > 13/09/13 07:57:15 INFO regionserver.Leases: regionserver60020.leaseChecker closed leases
>> > 13/09/13 07:57:15 INFO zookeeper.ZooKeeper: Session: 0x141172e5a590008 closed
>> > 13/09/13 07:57:15 INFO zookeeper.ClientCnxn: EventThread shut down
>> > 13/09/13 07:57:15 INFO regionserver.HRegionServer: stopping server ssp-region1,60020,1379073423978; zookeeper connection closed.
>> > 13/09/13 07:57:15 INFO regionserver.HRegionServer: regionserver60020 exiting
>> > 13/09/13 07:57:15 ERROR regionserver.HRegionServerCommandLine: Region server exiting
>> > java.lang.RuntimeException: HRegionServer Aborted
>> >         at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:66)
>> >         at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:85)
>> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> >         at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:78)
>> >         at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2309)
>> > 13/09/13 07:57:15 INFO regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=Thread[Thread-4,5,main]
>> >
>> > Regards,
>> > - kiru
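As a footnote to the ClientProtocol.fsync NoSuchMethodException quoted in the original message above, here is a purely diagnostic sketch that checks whether the ClientProtocol on a given classpath declares that exact method. The class name and usage are mine, not anything the thread or HBase ships; it only helps compare the jar HBase is using against the one the HDFS cluster is running.

    // Checks whether org.apache.hadoop.hdfs.protocol.ClientProtocol on the current
    // classpath declares fsync(String, String), the method named in the
    // NoSuchMethodException above. Run it once against the HBase classpath and
    // once against the Hadoop classpath; if the answers differ, the jars disagree.
    public class CheckFsync {
      public static void main(String[] args) throws Exception {
        Class<?> proto =
            Class.forName("org.apache.hadoop.hdfs.protocol.ClientProtocol");
        System.out.println("ClientProtocol loaded from: "
            + proto.getProtectionDomain().getCodeSource());
        try {
          proto.getMethod("fsync", String.class, String.class);
          System.out.println("fsync(String, String) is present");
        } catch (NoSuchMethodException e) {
          System.out.println("fsync(String, String) is MISSING");
        }
      }
    }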
