Here are the log files you asked for:

http://pastebin.com/xRBuQdNS <---- hbase-master.log
http://pastebin.com/u6WYQT6R <---- hdfs-namenode.log

If you find the fix for this damn issue, I'll be delighted!

Thanks

Cyril SCETBON

On Jul 5, 2012, at 11:44 PM, Jean-Daniel Cryans wrote:

> Interesting... Can you read the file? Try a "hadoop dfs -cat" on it
> and see if it goes to the end of it.
>
> It could also be useful to see a bigger portion of the master log, for
> all I know maybe it handles it somehow and there's a problem
> elsewhere.
>
> Finally, which Hadoop version are you using?
>
> Thx,
>
> J-D
>
> On Thu, Jul 5, 2012 at 1:58 PM, Cyril Scetbon <[email protected]> wrote:
>> Yes:
>>
>> /hbase/.logs/hb-d12,60020,1341429679981-splitting/hb-d12%2C60020%2C1341429679981.134143064971
>>
>> I ran an fsck and here is the report:
>>
>> Status: HEALTHY
>> Total size: 618827621255 B (Total open files size: 868 B)
>> Total dirs: 4801
>> Total files: 2825 (Files currently being written: 42)
>> Total blocks (validated): 11479 (avg. block size 53909541 B) (Total open file blocks (not validated): 41)
>> Minimally replicated blocks: 11479 (100.0 %)
>> Over-replicated blocks: 1 (0.008711561 %)
>> Under-replicated blocks: 0 (0.0 %)
>> Mis-replicated blocks: 0 (0.0 %)
>> Default replication factor: 4
>> Average block replication: 4.0000873
>> Corrupt blocks: 0
>> Missing replicas: 0 (0.0 %)
>> Number of data-nodes: 12
>> Number of racks: 1
>> FSCK ended at Thu Jul 05 20:56:35 UTC 2012 in 795 milliseconds
>>
>>
>> The filesystem under path '/hbase' is HEALTHY
>>
>> Cyril SCETBON
>>
>> On Jul 5, 2012, at 7:59 PM, Jean-Daniel Cryans wrote:
>>
>>> Does this file really exist in HDFS?
>>>
>>> hdfs://hb-zk1:54310/hbase/.logs/hb-d12,60020,1341429679981-splitting/hb-d12%2C60020%2C1341429679981.1341430649711
>>>
>>> If so, did you run fsck in HDFS?
>>>
>>> It would be weird if HDFS doesn't report anything bad but somehow the
>>> clients (like HBase) can't read it.
>>>
>>> J-D
>>>
>>> On Thu, Jul 5, 2012 at 12:45 AM, Cyril Scetbon <[email protected]>
>>> wrote:
>>>> Hi,
>>>>
>>>> I can no longer start my cluster correctly and get messages like
>>>> http://pastebin.com/T56wrJxE (taken on one region server)
>>>>
>>>> I suppose HBase is not designed to be stopped completely, only to
>>>> tolerate some nodes going down? HDFS is not complaining, it's only
>>>> HBase that can't start correctly :(
>>>>
>>>> I suppose some data has not been flushed, and that's not really
>>>> important to me. Is there a way to fix these errors even if I lose data?
>>>>
>>>> Thanks
>>>>
>>>> Cyril SCETBON
>>>>
>>
