distcp should work. Try the online snapshot case - I think it might work. But ideally here the table is in the DISABLING state, so it is again tricky.

Take a raw backup of the table data in HDFS and put it back again after restoring the table?
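Something along these lines for the snapshot route, using the 0.98-era client API - assuming a snapshot can be taken at all while the table is stuck, which is exactly the open question here. The snapshot and clone names, and the ExportSnapshot invocation in the comment, are only illustrative:

    // Hypothetical rescue sketch: snapshot the stuck table and clone the
    // snapshot to a new table name, so the data stays reachable even if the
    // original table can never be enabled again.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class SnapshotRescue {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
          // May fail while the table state is inconsistent.
          admin.snapshot("TestTable_snap", "TestTable");
          // New table with a sane descriptor, data restored from the snapshot.
          admin.cloneSnapshot("TestTable_snap", "TestTable_rescue");
          // Or ship the snapshot to another cluster instead of cloning locally:
          //   hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
          //     -snapshot TestTable_snap -copy-to hdfs://backup-cluster:8020/hbase
        } finally {
          admin.close();
        }
      }
    }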
On Wed, Feb 4, 2015 at 12:41 PM, Weichen YE <[email protected]> wrote:

Hi Ted, Ram,

Thank you for your attention to this bug.

I met this bug in a production environment and the table contains important data. If we are not able to enable this table in the current cluster, do you have any idea how to get the table data back some other way? Maybe export, snapshot, copytable, or distcp of all the table files to another cluster?

2015-02-04 13:17 GMT+08:00 Ted Yu <[email protected]>:

Looks like the NPE was caused by the following method in BaseLoadBalancer returning null:

    protected Map<ServerName, List<HRegionInfo>> assignMasterRegions(
        Collection<HRegionInfo> regions, List<ServerName> servers) {
      if (servers == null || regions == null || regions.isEmpty()) {
        return null;

Since bulkPlan is null, calling BulkAssigner seems unnecessary.
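Roughly the guard being suggested, as an illustrative fragment - the variable names and the GeneralBulkAssigner call are paraphrased rather than copied from EnableTableHandler, and the real change would go around EnableTableHandler.java:210:

    // Skip the bulk assign when the balancer could not produce a plan,
    // instead of passing a null plan along and hitting the NPE at startup.
    Map<ServerName, List<HRegionInfo>> bulkPlan =
        balancer.roundRobinAssignment(regionsToAssign, onlineServers);
    if (bulkPlan == null || bulkPlan.isEmpty()) {
      // assignMasterRegions() returns null when there are no servers or
      // regions, so there is nothing to assign here.
      LOG.warn("No assignment plan for table " + tableName + ", skipping bulk assign");
      return;
    }
    BulkAssigner assigner = new GeneralBulkAssigner(server, bulkPlan, assignmentManager, true);
    assigner.bulkAssign();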
On Tue, Feb 3, 2015 at 9:01 PM, ramkrishna vasudevan <[email protected]> wrote:

It is not only about the state in the table descriptor but also the in-memory state in the AM. I remember some time back Rajeshbabu worked on an HBCK-like tool which would forcefully change the state of these tables in such cases; I don't remember the JIRA now. I thought of restarting the master, thinking the in-memory state would change, and I got this:

    java.lang.NullPointerException
        at org.apache.hadoop.hbase.master.handler.EnableTableHandler.handleEnableTable(EnableTableHandler.java:210)
        at org.apache.hadoop.hbase.master.handler.EnableTableHandler.process(EnableTableHandler.java:142)
        at org.apache.hadoop.hbase.master.AssignmentManager.recoverTableInEnablingState(AssignmentManager.java:1695)
        at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:416)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:720)
        at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:170)
        at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1459)
        at java.lang.Thread.run(Thread.java:745)
    2015-02-04 16:11:45,932 FATAL [stobdtserver3:16040.activeMasterManager] master.HMaster: Master server abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint]
    2015-02-04 16:11:45,933 FATAL [stobdtserver3:16040.activeMasterManager] master.HMaster: Unhandled exception. Starting shutdown.
    java.lang.NullPointerException
        at org.apache.hadoop.hbase.master.handler.EnableTableHandler.handleEnableTable(EnableTableHandler.java:210)
        at org.apache.hadoop.hbase.master.handler.EnableTableHandler.process(EnableTableHandler.java:142)
        at org.apache.hadoop.hbase.master.AssignmentManager.recoverTableInEnablingState(AssignmentManager.java:1695)
        at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:416)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:720)
        at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:170)
        at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1459)

Regards
Ram

On Wed, Feb 4, 2015 at 10:25 AM, Ted Yu <[email protected]> wrote:

What about creating an offline tool which can modify the table descriptor so that the table goes to the designated state?

Cheers

On Tue, Feb 3, 2015 at 8:51 PM, ramkrishna vasudevan <[email protected]> wrote:

I tried reproducing this scenario on trunk. The same problem exists. Currently in master the table state is noted in the table descriptor and not in ZK; in the 0.98.xx versions it should be on ZK.

When we tried to enable the table, the region assignment failed due to ClassNotFound and the state was already ENABLING, but a describe on the table still shows it as DISABLED.

Though we could alter in the correct configuration by specifying another alter table command, we are still not able to enable the table.

Moving this to dev to see if there is any workaround for this issue. If not, we may have to solve this issue across branches until we have the Procedure V2 implementation ready on trunk.

Any suggestions?

Regards
Ram

On Wed, Feb 4, 2015 at 4:05 AM, 叶炜晨 <[email protected]> wrote:

My version is 0.98.6-cdh5.2.0, and the problem is in my production environment.

So should I first delete the znode? And then how do I disable this table? My goal is to fix the wrong table configuration to get my data back.

(from my mobile phone)

On 2015-02-04 at 12:46 AM, ramkrishna vasudevan <[email protected]> wrote:

I think the only way out here is to clear the zookeeper node, but I am not sure of the ramifications of that.

Which version are you using? The newer versions are 'protobuf'fed.

Are you running this in production?

Regards
Ram
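Roughly what "clear the zookeeper node" would mean for 0.98, where the table state should live under /hbase/table/<tablename>. The znode path is an assumption to verify first (for example with `ls /hbase/table` in hbase zkcli), the state stored there is protobuf-serialized, and as said above the ramifications of deleting it are unclear:

    // Hypothetical sketch only: delete the per-table state znode so the master
    // no longer sees the stuck ENABLING/DISABLING state. Verify the path and
    // back up the ZK data before trying anything like this.
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;
    import org.apache.zookeeper.data.Stat;

    public class ClearTableStateZnode {
      public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk-quorum-host:2181", 30000, new Watcher() {
          public void process(WatchedEvent event) { }
        });
        try {
          String znode = "/hbase/table/TestTable"; // path assumed, not verified
          Stat stat = zk.exists(znode, false);
          if (stat != null) {
            zk.delete(znode, stat.getVersion());
          }
        } finally {
          zk.close();
        }
      }
    }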
On Tue, Feb 3, 2015 at 5:00 PM, [email protected] <[email protected]> wrote:

I tried HBCK, but it doesn't help.

I want to disable the table so that I can use "alter" to fix the wrong configuration. But now the table stays in a state where no matter whether I use "is_enabled" or "is_disabled", it returns false.

[email protected]

From: ramkrishna vasudevan
Date: 2015-02-03 19:55
To: [email protected]
CC: yeweichen
Subject: Re: Wrong Configuration lead to a failure when enabling table

Can you try HBCK? Did it help in any way? I remember something was done related to failures in ENABLE/DISABLE of tables some time back.

Regards
Ram

On Tue, Feb 3, 2015 at 3:38 PM, [email protected] <[email protected]> wrote:

Hi all,

I ran the following commands in the hbase shell:

    disable 'TestTable'
    alter 'TestTable', CONFIGURATION => {'hbase.regionserver.region.split.policy' => 'xxxxxxxxx'}
    enable 'TestTable'

At first I wanted to put "org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy" in the place of "xxxxxxxxx", but because of a spelling error this configuration is now wrong. After I enabled the table, it failed because of ClassNotFound.

Now here is the problem: the table failed to enable and is stuck in an intermediate state. The table is neither enabled nor disabled now. How can I save my table and fix the wrong configuration?

[email protected]
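For reference, what the reporter intended, expressed with the Java client API and the split policy class name spelled out fully. This is equivalent to the shell commands above, is only illustrative, and only helps once the table can actually be disabled and altered again:

    // Hypothetical sketch: re-apply the split policy configuration with the
    // correct fully qualified class name, then re-enable the table.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class FixSplitPolicy {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
          TableName table = TableName.valueOf("TestTable");
          admin.disableTable(table);
          HTableDescriptor desc = admin.getTableDescriptor(table);
          desc.setConfiguration("hbase.regionserver.region.split.policy",
              "org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy");
          admin.modifyTable(table, desc);
          admin.enableTable(table);
        } finally {
          admin.close();
        }
      }
    }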
