Hello, did you find any anomalies?

Michal

On Thu, Mar 10, 2016 at 2:50 PM, Michal Medvecky <[email protected]> wrote:

> Hello,
>
> the log you pasted is from later time when I was playing with all kinds of
> knobs to make things work.
>
> Here are logs from freshly deployed cluster:
>
> http://michal.medvecky.net/hbase/hbase-master.log
> http://michal.medvecky.net/hbase/hbase-regionserver.log
> http://michal.medvecky.net/status.jpg
> http://michal.medvecky.net/hbase/debugdump-master.txt
> http://michal.medvecky.net/hbase/debugdump-rs.txt
>
> Thank you for your interest!
>
> Michal
>
> On Wed, Mar 9, 2016 at 4:25 PM, Ted Yu <[email protected]> wrote:
>
>> From the region server log, did you notice this:
>>
>> 2016-03-09 11:59:49,159 WARN [regionserver/staging-aws-hbase-data-1.aws.px/10.231.16.30:16020] wal.ProtobufLogWriter: Failed to write trailer, non-fatal, continuing...
>> java.io.IOException: All datanodes DatanodeInfoWithStorage[10.231.16.30:50010,DS-a49fd123-fefc-46e2-83ca-6ae462081702,DISK] are bad. Aborting...
>>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1084)
>>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:876)
>>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:402)
>>
>> Was hdfs stable around that time ?
>>
>> I am more interested in the failure of fresh 1.2.0 installation.
>>
>> Please pastebin server logs for that incident.
>>
>> On Wed, Mar 9, 2016 at 6:19 AM, Michal Medvecky <[email protected]> wrote:
>>
>> > You can check both logs at
>> >
>> > http://michal.medvecky.net/log-master.txt
>> > http://michal.medvecky.net/log-rs.txt
>> >
>> > First restart after upgrade happened at 10:29.
>> >
>> > I did not find anything useful.
>> >
>> > Michal
>> >
>> > On Wed, Mar 9, 2016 at 2:58 PM, Ted Yu <[email protected]> wrote:
>> >
>> > > Can you take a look at data-1 server log around this time frame to see
>> > > what happened ?
>> > >
>> > > Thanks
>> > >
>> > > > On Mar 9, 2016, at 3:44 AM, Michal Medvecky <[email protected]> wrote:
>> > > >
>> > > > Hello,
>> > > >
>> > > > I upgraded my single hbase master and single hbase regionserver from
>> > > > 1.1.3 to 1.2.0, by simply stopping both, upgrading packages (i download
>> > > > binary packages from hbase.org) and starting them again.
>> > > >
>> > > > It did not come up; master is stuck in assigning hbase:meta:
>> > > >
>> > > > 2016-03-09 11:28:11,491 INFO [staging-aws-hbase-3:16000.activeMasterManager] master.AssignmentManager: Processing 1588230740 in state: M_ZK_REGION_OFFLINE
>> > > > 2016-03-09 11:28:11,491 INFO [staging-aws-hbase-3:16000.activeMasterManager] master.RegionStates: Transition {1588230740 state=OFFLINE, ts=1457522891475, server=null} to {1588230740 state=OFFLINE, ts=1457522891491, server=staging-aws-hbase-data-1.aws.px,16020,1457522034036}
>> > > > 2016-03-09 11:28:11,492 INFO [staging-aws-hbase-3:16000.activeMasterManager] zookeeper.MetaTableLocator: Setting hbase:meta region location in ZooKeeper as staging-aws-hbase-data-1.aws.px,16020,1457522034036
>> > > > 2016-03-09 11:42:10,611 ERROR [Thread-65] master.HMaster: Master failed to complete initialization after 900000ms. Please consider submitting a bug report including a thread dump of this process.
>> > > >
>> > > > Did I miss something?
>> > > >
>> > > > I tried downgrading back to 1.1.3 but later realized this is not
>> > > > supported (and does not work of course).
>> > > >
>> > > > Thank you
>> > > >
>> > > > Michal
