Himanshu,
We did not have much data to move, so we did not use the migration script. After using the (same) hadoop core jar in HBase and cleaning out the data (including ZK), I think I have a working cluster on EC2 now. Proceeding with loading data now. Thanks so much.

Regards,
- kiru
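For anyone who lands on this thread later, the cleanup above boils down to a few shell steps. This is only a sketch: the jar versions and the /apps/hbase/data path are the ones mentioned further down in this thread, HBASE_HOME/HADOOP_HOME stand in for wherever the installs live, /hbase is the default HBase znode, and wiping the data directory is of course only an option when, as here, there is nothing worth keeping.

    # Stop HBase, then make it load the same hadoop core jar as the cluster
    rm $HBASE_HOME/lib/hadoop-core-1.1.2.jar
    cp $HADOOP_HOME/hadoop-core-1.2.0.1.3.0.0-107.jar $HBASE_HOME/lib/

    # Wipe the old HBase root dir in HDFS (only because no data needs to survive)
    hadoop fs -rmr /apps/hbase/data

    # Clear HBase's znodes (rmr is ZooKeeper's recursive delete), then restart HBase
    hbase zkcli rmr /hbase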
________________________________
From: Himanshu Vashishtha <[email protected]>
To: Kiru Pakkirisamy <[email protected]>
Cc: "[email protected]" <[email protected]>; Ted Yu <[email protected]>
Sent: Monday, September 16, 2013 10:22 AM
Subject: Re: 0.95.2 upgrade errors

"13/09/16 02:13:39 WARN wal.HLogSplitter: Could not open
hdfs://ssp-master:8020/apps/hbase/data/WALs/ssp-region3,60020,1379105062356-splitting/ssp-region3%2C60020%2C1379105062356.1379105067148.meta
for reading. File is empty"

This is okay, as the file is empty. You can do an ls -l on the above file to confirm.

The NPE needs to be looked into. How is the master startup? Could you pastebin the master log at startup? Did you upgrade from a 0.94 installation? And did you use the migration script?

On Sun, Sep 15, 2013 at 11:47 PM, Kiru Pakkirisamy <[email protected]> wrote:
> Himanshu et al,
> I see this error in one region. The .meta file is having a problem, I
> guess. Thanks for all the help.
>
> 13/09/16 02:13:39 WARN wal.HLogSplitter: Could not open
> hdfs://ssp-master:8020/apps/hbase/data/WALs/ssp-region3,60020,1379105062356-splitting/ssp-region3%2C60020%2C1379105062356.1379105067148.meta
> for reading. File is empty
> java.io.EOFException
>         at java.io.DataInputStream.readFully(DataInputStream.java:180)
>         at java.io.DataInputStream.readFully(DataInputStream.java:152)
>         at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1512)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1490)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1479)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1474)
>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:69)
>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.reset(SequenceFileLogReader.java:174)
>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.initReader(SequenceFileLogReader.java:183)
>         at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
>         at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createReader(HLogFactory.java:126)
>         at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createReader(HLogFactory.java:89)
>
> and in the master I see
>
> 2013-09-16 02:33:28,755 [RpcServer.handler=27,port=60000] ERROR org.apache.hadoop.ipc.RpcServer.call - Unexpected throwable object
> java.lang.NullPointerException
>         at org.apache.hadoop.hbase.master.TableNamespaceManager.get(TableNamespaceManager.java:111)
>         at org.apache.hadoop.hbase.master.HMaster.getNamespaceDescriptor(HMaster.java:3116)
>         at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1779)
>         at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1820)
>         at org.apache.hadoop.hbase.protobuf.generated.MasterAdminProtos$MasterAdminService$2.callBlockingMethod(MasterAdminProtos.java:27720)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2156)
>         at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1861)
>
> Regards,
> - kiru
>
> ------------------------------
> From: Himanshu Vashishtha <[email protected]>
> To: [email protected]; Kiru Pakkirisamy <[email protected]>
> Cc: Ted Yu <[email protected]>
> Sent: Friday, September 13, 2013 4:18 PM
> Subject: Re: 0.95.2 upgrade errors
>
> Is there something odd (exceptions on starting) in the Master logs? Does it
> start normally? You removed the above jar from the HMaster node too?
> Could you restart the master and see if that changes anything?
>
> On Fri, Sep 13, 2013 at 1:47 PM, Kiru Pakkirisamy <[email protected]> wrote:
>
> I see a hadoop-core-1.1.2 jar in the hbase/lib dir. Is this the same as
> the hadoop 'core' jar in the hadoop directory?
> If I remove this, the regionservers seem to stay up. But create table
> fails.
>
> ERROR: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed
> after attempts=7, exceptions:
> Fri Sep 13 16:44:48 EDT 2013,
> org.apache.hadoop.hbase.client.RpcRetryingCaller@734893da,
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException: java.io.IOException
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2194)
>         at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1861)
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.hbase.master.TableNamespaceManager.get(TableNamespaceManager.java:111)
>
> Regards,
> - kiru
>
> ________________________________
> From: Himanshu Vashishtha <[email protected]>
> To: [email protected]; Kiru Pakkirisamy <[email protected]>
> Cc: Ted Yu <[email protected]>
> Sent: Friday, September 13, 2013 10:22 AM
> Subject: Re: 0.95.2 upgrade errors
>
> Sounds odd, but just to be sure you are picking up the right hadoop jars, you
> could look at the classpath dump in the logs when the regionserver starts.
>
> On Fri, Sep 13, 2013 at 9:33 AM, Kiru Pakkirisamy <[email protected]> wrote:
>
> > We are using hadoop-core-1.2.0.1.3.0.0-107.jar
> > The versions are the same on the other cluster and it works there. But this
> > one is on EC2. That's the only difference. Beats me for now.
> >
> > Regards,
> > - kiru
> >
> > ________________________________
> > From: Ted Yu <[email protected]>
> > To: "[email protected]" <[email protected]>; Kiru Pakkirisamy <[email protected]>
> > Sent: Friday, September 13, 2013 9:30 AM
> > Subject: Re: 0.95.2 upgrade errors
> >
> > java.io.IOException: IOException
> > flush:org.apache.hadoop.ipc.RemoteException: java.io.IOException:
> > java.lang.NoSuchMethodException:
> > org.apache.hadoop.hdfs.protocol.ClientProtocol.fsync(java.lang.String,
> > java.lang.String)
> >
> > What version of hadoop is in use on this machine?
> >
> > Thanks
> >
> > On Fri, Sep 13, 2013 at 9:27 AM, Kiru Pakkirisamy <[email protected]> wrote:
> >
> > We were able to migrate one cluster to 0.95.2 (from 0.94.11) without any
> > problems.
> > But in another cluster, we have run into a few issues. Any pointers?
> >
> > 13/09/13 07:57:09 FATAL wal.FSHLog: Could not sync. Requesting roll of hlog
> > java.io.IOException: DFSOutputStream is closed
> >         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3845)
> >         at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
> >         at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:135)
> >         at org.apache.hadoop.hbase.regionserver.wal.FSHLog.syncer(FSHLog.java:1075)
> >         at org.apache.hadoop.hbase.regionserver.wal.FSHLog.syncer(FSHLog.java:1013)
> >         at org.apache.hadoop.hbase.regionserver.wal.FSHLog.sync(FSHLog.java:1186)
> >         at org.apache.hadoop.hbase.regionserver.wal.FSHLog$LogSyncer.run(FSHLog.java:964)
> >         at java.lang.Thread.run(Thread.java:662)
> > 13/09/13 07:57:09 ERROR wal.FSHLog: Error while syncing, requesting close of hlog
> > java.io.IOException: DFSOutputStream is closed
> >         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3845)
> >         at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
> >         at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:135)
> >         at org.apache.hadoop.hbase.regionserver.wal.FSHLog.syncer(FSHLog.java:1075)
> >         at org.apache.hadoop.hbase.regionserver.wal.FSHLog.syncer(FSHLog.java:1013)
> >         at org.apache.hadoop.hbase.regionserver.wal.FSHLog.sync(FSHLog.java:1186)
> >         at org.apache.hadoop.hbase.regionserver.wal.FSHLog$LogSyncer.run(FSHLog.java:964)
> >         at java.lang.Thread.run(Thread.java:662)
> > 13/09/13 07:57:09 INFO regionserver.HRegionServer: stopping server ssp-region1,60020,1379073423978; all regions closed.
> > 13/09/13 07:57:10 INFO wal.FSHLog: RS_OPEN_META-ssp-region1:60020-0.logSyncer exiting
> > 13/09/13 07:57:10 ERROR regionserver.HRegionServer: Metalog close and delete failed
> > java.io.IOException: IOException flush:org.apache.hadoop.ipc.RemoteException: java.io.IOException:
> > java.lang.NoSuchMethodException:
> > org.apache.hadoop.hdfs.protocol.ClientProtocol.fsync(java.lang.String, java.lang.String)
> >         at java.lang.Class.getMethod(Class.java:1605)
> >         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:581)
> >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1444)
> >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1440)
> >
> > and finally the region server shuts down
> >
> > 13/09/13 07:57:10 INFO wal.FSHLog: regionserver60020.logSyncer exiting
> > 13/09/13 07:57:10 INFO regionserver.Leases: regionserver60020 closing leases
> > 13/09/13 07:57:10 INFO regionserver.Leases: regionserver60020 closed leases
> > 13/09/13 07:57:15 INFO regionserver.HRegionServer$PeriodicMemstoreFlusher: regionserver60020.periodicFlusher exiting
> > 13/09/13 07:57:15 INFO regionserver.Leases: regionserver60020.leaseChecker closing leases
> > 13/09/13 07:57:15 INFO regionserver.Leases: regionserver60020.leaseChecker closed leases
> > 13/09/13 07:57:15 INFO zookeeper.ZooKeeper: Session: 0x141172e5a590008 closed
> > 13/09/13 07:57:15 INFO zookeeper.ClientCnxn: EventThread shut down
> > 13/09/13 07:57:15 INFO regionserver.HRegionServer: stopping server ssp-region1,60020,1379073423978; zookeeper connection closed.
> > 13/09/13 07:57:15 INFO regionserver.HRegionServer: regionserver60020 exiting
> > 13/09/13 07:57:15 ERROR regionserver.HRegionServerCommandLine: Region server exiting
> > java.lang.RuntimeException: HRegionServer Aborted
> >         at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:66)
> >         at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:85)
> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >         at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:78)
> >         at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2309)
> > 13/09/13 07:57:15 INFO regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=Thread[Thread-4,5,main]
> >
> > Regards,
> > - kiru
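A footnote for the archive: the two checks suggested earlier in the thread, confirming which hadoop core jar the region server actually loads and confirming that the unreadable .meta WAL file really is empty, look roughly like this. The log path and grep pattern are illustrative and depend on the install; the HDFS path is the one from the HLogSplitter warning above.

    # Which hadoop-core jar is on the region server's classpath?
    # (HBase dumps the full classpath near the top of the regionserver log.)
    grep -o 'hadoop-core-[^:]*\.jar' /var/log/hbase/hbase-*-regionserver-*.log | sort -u
    ls $HBASE_HOME/lib/hadoop-core-*.jar

    # Is the .meta WAL file that HLogSplitter complained about really zero bytes?
    hadoop fs -ls hdfs://ssp-master:8020/apps/hbase/data/WALs/ssp-region3,60020,1379105062356-splitting/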
