Re: Please congratulate our new PMC Chair Misty Stanley-Jones
Congrats! Misty!! On Fri, Sep 22, 2017 at 7:16 AM, Pankaj kr wrote: > Congratulations Misty..!! :) > > > -Pankaj- > > > -Original Message- > From: Andrew Purtell [mailto:apurt...@apache.org] > Sent: Friday, September 22, 2017 3:08 AM > To: d...@hbase.apache.org; user@hbase.apache.org > Subject: Please congratulate our new PMC Chair Misty Stanley-Jones > > At today's meeting of the Board, Special Resolution B changing the HBase > project Chair to Misty Stanley-Jones was passed unanimously. > > Please join me in congratulating Misty on her new role! > > (If you need any help or advice please don't hesitate to ping me, Misty, > but I suspect you'll do just fine and won't need it.) > > > -- > Best regards, > Andrew >
Re: [ANNOUNCE] Dima Spivak joins the Apache HBase PMC
Congrats!! On Wed, Aug 31, 2016 at 1:33 PM, Aleksandr Shulman wrote: > Congrats Dima!! > > On Wed, Aug 31, 2016 at 12:37 PM, Andrew Purtell > wrote: > >> On behalf of the Apache HBase PMC I am pleased to announce that Dima Spivak >> has accepted our invitation to become a committer and PMC member on the >> Apache HBase project. Dima has been an active contributor for some time, >> particularly in development and contribution of release tooling that all of >> our RMs now use, such as the API compatibility checker. Dima has also been >> active in testing and voting on release candidates. Release voting is >> important to project health and momentum and demonstrates interest and >> capability above and beyond just committing. We wish to recognize this and >> make those release votes binding. Please join me in thanking Dima for his >> contributions to date and in anticipation of many more contributions. >> >> Welcome to the HBase project, Dima! >> >> -- >> Best regards, >> >>- Andy >> >> Problems worthy of attack prove their worth by hitting back. - Piet Hein >> (via Tom White) >> > > > > -- > Best Regards, > > Aleks Shulman > 847.814.5804 > Cloudera
Re: Nice blog post on coming zk-less assignment by our Jimmy Xiang
Thanks! Jimmy On Fri, Mar 6, 2015 at 9:39 AM, Nick Dimiduk ndimi...@gmail.com wrote: Great work Jimmy! Excellent explanation. On Thursday, March 5, 2015, Stack st...@duboce.net wrote: See https://blogs.apache.org/hbase/entry/hbase_zk_less_region_assignment St.Ack
Re: numberOfOnlineRegions available in just one regionServer
Do you see other region servers from your master web UI? If so, if you run balancer/balance_switch from the shell, what happens? On Wed, Sep 10, 2014 at 10:24 AM, Ivan Fernandez ivan.fernandez.pe...@gmail.com wrote: Hi, I'm trying to run an HBase cluster in a development environment with several nodes. One thing I don't understand when checking the HBase Master is why all online regions are assigned to a single region server instead of being balanced among all the region servers. I have 6 different tables, so I assumed they would be distributed across all the region servers, but that doesn't seem to be the case. I've also tried setting hbase.master.startup.retainassign to false and hbase.master.loadbalance.bytable to true as this post http://answers.mapr.com/questions/7049/table-only-on-single-region-server suggested. Any ideas? HBase Version 0.94.6.1.3.8.0-3, Hadoop Version 1.2.0.1.3.8.0-3, -- View this message in context: http://apache-hbase.679495.n3.nabble.com/numberOfOnlineRegions-available-in-just-one-regionServer-tp4063775.html Sent from the HBase User mailing list archive at Nabble.com.
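For reference, the balancer commands mentioned above can be run from the hbase shell like so (a sketch; exact output varies per cluster and version):

```
hbase> balance_switch true   # enable the balancer; prints the previous state
hbase> balancer              # ask the master to run a balance pass now
hbase> status 'simple'       # check how many regions each region server holds
```

Note that the balancer will not move regions while any region is in transition.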
Re: Region stuck in Transition FAILED_OPEN after upgrade to HBase0.98
What's your setting for hbase.assignment.timeout.management? A region in FAILED_OPEN state usually needs a manual fix, such as an assign from the hbase shell. On Tue, Jul 29, 2014 at 6:06 PM, anil gupta anilgupt...@gmail.com wrote: Hi Ted, It seems like the problem self healed. Is there any timeout for region_in_transition that led to this fix? Now, the UI shows everything is good. Thanks, Anil Gupta On Tue, Jul 29, 2014 at 6:01 PM, Ted Yu yuzhih...@gmail.com wrote: Can you pastebin a snippet of the master log pertaining to this region ? Cheers On Tue, Jul 29, 2014 at 5:51 PM, anil gupta anilgupt...@gmail.com wrote: Hi All, We recently upgraded our cluster from 0.94 to HBase0.98(cdh5.1). All the tables are working fine except one table from 0.94. One of its regions has been stuck in transition since our upgrade. I see the following in the HMaster UI: 6c12ff0021f80eea22666e4ae625b150SYSTEM.CATALOG,,1397780246020.6c12ff0021f80eea22666e4ae625b150. state=FAILED_OPEN, ts=Tue Jul 29 16:45:20 PDT 2014 (1469s ago), server=host_name,60020,1406677501761 How can I fix or debug this problem? -- Thanks Regards, Anil Gupta -- Thanks Regards, Anil Gupta
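For reference, the manual fix mentioned above can be issued from the hbase shell using the encoded region name shown in the master UI (here the one quoted in this thread):

```
hbase> assign '6c12ff0021f80eea22666e4ae625b150'
```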
Re: how to let hmaster read zookeeper's /hbase/unassigned
Yes, that will work. On Tue, Jul 22, 2014 at 12:39 PM, Libo Yu yu_l...@hotmail.com wrote: My cluster has a standby master process. If I restart it, will it reload data from zookeeper? If it does, I can shut down the current master and let standby master become the active one. Will that work? Libo From: yu_l...@hotmail.com To: user@hbase.apache.org Subject: how to let hmaster read zookeeper's /hbase/unassigned Date: Mon, 21 Jul 2014 19:20:33 -0400 Hi all, I removed regions in transition zookeeper path /hbase/unassigned. But I don't want to bring down master node (the whole cluster will be down). Is there a way to force the master node to load data from zookeeper? Thanks. Libo
Re: how to let hmaster read zookeeper's /hbase/unassigned
There is no way to force the master to reload such data from zookeeper currently. Thanks, Jimmy On Mon, Jul 21, 2014 at 4:20 PM, Libo Yu yu_l...@hotmail.com wrote: Hi all, I removed regions in transition zookeeper path /hbase/unassigned. But I don't want to bring down master node (the whole cluster will be down). Is there a way to force the master node to load data from zookeeper? Thanks. Libo
Re: HBase REST using ssl
Hi Demai, You need to set these configurations for your REST server: hbase.rest.ssl.enabled hbase.rest.ssl.keystore.store hbase.rest.ssl.keystore.password hbase.rest.ssl.keystore.keypassword Thanks, Jimmy On Mon, Jun 2, 2014 at 3:23 PM, Demai Ni nid...@gmail.com wrote: hi, folks, I am wondering how to use ssl with HBase REST. I looked at http://wiki.apache.org/hadoop/Hbase/Stargate and googled a bit, but couldn't find examples or instructions. Can someone give me a couple of pointers? thanks Demai
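Put together as an hbase-site.xml fragment for the REST server host, those four settings might look like the following (keystore path and passwords are placeholders):

```xml
<property>
  <name>hbase.rest.ssl.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hbase.rest.ssl.keystore.store</name>
  <value>/path/to/keystore.jks</value>
</property>
<property>
  <name>hbase.rest.ssl.keystore.password</name>
  <value>changeit</value>
</property>
<property>
  <name>hbase.rest.ssl.keystore.keypassword</name>
  <value>changeit</value>
</property>
```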
Re: [ANNOUNCE] Apache Phoenix has graduated as a top level project
Cool! Congrats! On Fri, May 23, 2014 at 12:18 AM, JungHo Lee hbell.al...@gmail.com wrote: Wow! Congratulations!! - hbell On Friday, May 23, 2014, Job Thomas j...@suntecgroup.com wrote: Congratulations. From: James Taylor [mailto:jamestay...@apache.org] Sent: Fri 5/23/2014 3:16 AM To: HBase Dev; HBase User Subject: [ANNOUNCE] Apache Phoenix has graduated as a top level project I'm pleased to announce that Apache Phoenix has graduated from the incubator to become a top level project. Thanks so much for all your help and support - we couldn't have done it without the fantastic HBase community! We're looking forward to continued collaboration. Regards, The Apache Phoenix team -- -- JungHo Lee (hbell.al...@gmail.com)
Re: Hbase API questions
HBaseAdmin#getClusterStatus can be used to list all the region servers. For each region server, you can use HBaseAdmin#getOnlineRegions to list the regions on it. As for store files, keep in mind they change constantly due to compactions and memstore flushes. On Thu, Mar 27, 2014 at 8:41 AM, Libo Yu yu_l...@hotmail.com wrote: Hi All, I am frustrated with the HBase API. I want to list all the region servers, the regions on each region server, and the store files in each region. What classes should I use for that? Libo
Re: Hbase API questions
If you really need to get the store files info, you can take a look at ProtobufUtil#getStoreFiles if you use 0.96+. On Thu, Mar 27, 2014 at 9:31 AM, Jimmy Xiang jxi...@cloudera.com wrote: HBaseAdmin#getClusterStatus can be used to list all the region servers. For each regionserver, you can use HBaseAdmin#getOnlineRegions to list the regions on it. For store files, they are changing due to compaction/memstore flush. On Thu, Mar 27, 2014 at 8:41 AM, Libo Yu yu_l...@hotmail.com wrote: Hi All, I am frustrated with Hbase API. I want to list all the region servers: regions on the region server and store files in the region. What classes should I use for that? Libo
Re: Hbase data loss scenario
Hi Kiran, Can you check your table's TTL setting? Is it possible that the data expired and was purged? Thanks, Jimmy On Thu, Feb 27, 2014 at 10:11 AM, Stack st...@duboce.net wrote: Anything in your logs that might give you a clue? Master logs? HDFS NameNode logs? St.Ack On Thu, Feb 27, 2014 at 7:53 AM, kiran kiran.sarvabho...@gmail.com wrote: Hi All, We have been experiencing severe data loss issues for a few hours. There are some weird things going on in the cluster. We were unable to locate the data even in HDFS. HBase version 0.94.1. Here are the weird things that are going on: 1) A table which was once 1 TB has now become 170 GB, with many of the regions that were once 7 GB now shrinking to a few MBs. We have no clue what is happening at all. 2) The table is splitting (100 regions have become 200 regions), and ours is ConstantSizeRegionSplitPolicy with a region size of 20 GB. I don't know why it is even splitting. 3) The HDFS namenode dump which we periodically back up is decreasing in size. 4) And there is a region chain with start keys and end keys as follows (I can't copy-paste the exact thing), for example: K1.xxx K2.xyz K2.xyz K3.xyz,13879801.xyp K3.xyz,13879801.xyp K4.xyq I have never seen weird start and end keys like this. We also suspect a failed split of a region around 20 GB. We have looked at the logs many times but are unable to get any sense out of them. Please help us out; we can't afford data loss. Yesterday there was a cluster crash of the root region, but we thought we successfully restored it. Things didn't go that way; there was consistent data loss after that. -- Thank you Kiran Sarvabhotla -Even a correct decision is wrong when it is taken late
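As an aside for anyone debugging a boundary chain like the one in item 4: in a healthy table, the sorted region boundaries form a contiguous chain where each region's end key equals the next region's start key. A minimal, illustrative Python sketch (the key values are made up, not taken from Kiran's cluster) that flags gaps and overlaps:

```python
def check_region_chain(regions):
    """Given (start_key, end_key) pairs sorted by start key, report
    gaps, overlaps, and inverted boundaries. An empty start key marks
    the first region and an empty end key marks the last one."""
    problems = []
    for i, (start, end) in enumerate(regions):
        # An end key at or before the start key is never valid.
        if start and end and end <= start:
            problems.append("region %d has end key <= start key" % i)
        if i + 1 < len(regions):
            next_start = regions[i + 1][0]
            if end != next_start:
                kind = "gap" if end < next_start else "overlap"
                problems.append("%s between region %d and %d" % (kind, i, i + 1))
    return problems

# Made-up boundaries loosely resembling the broken chain described above:
chain = [("", "K1"), ("K1", "K2"), ("K2", "K4"), ("K3", "")]
print(check_region_chain(chain))  # ['overlap between region 2 and 3']
```

An overlap like this, where one region's end key reaches past the next region's start key, is consistent with the failed split suspected in the thread.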
Re: Only map getting created for 100000 rows
Do you have just one region for this table? On Tue, Feb 11, 2014 at 2:59 AM, Tousif tousif.pa...@gmail.com wrote: I would like to know what configuration causes MapReduce to create only one map while an input split of 1 and 1000 lines per map are set in the job configuration. It's a 2-node cluster, and I tried a scan with startRow and endRow. I want to have at least 2 maps, one on each machine. http://stackoverflow.com/questions/21697055/what-causes-mapreduce-job-to-create-only-one-map-for-10-rows-in-hbase -- Regards Tousif Khazi
Re: hbase 0.96 stop master receive ERROR ipc.RPC: RPC.stopProxy called on non proxy.
Which version of Hadoop do you use? On Wed, Nov 20, 2013 at 5:43 PM, Henry Hung ythu...@winbond.com wrote: Hi All, When stopping the master or a regionserver, I found some ERROR and WARN entries in the log files. Can these errors cause problems in HBase? 13/11/21 09:31:16 INFO zookeeper.ClientCnxn: EventThread shut down 13/11/21 09:35:36 ERROR ipc.RPC: RPC.stopProxy called on non proxy. java.lang.IllegalArgumentException: object is not an instance of declaring class at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266) at $Proxy18.close(Unknown Source) at org.apache.hadoop.ipc.RPC.stopProxy(RPC.java:621) at org.apache.hadoop.hdfs.DFSClient.closeConnectionToNamenode(DFSClient.java:738) at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:794) at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:847) at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2524) at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2541) at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54) 13/11/21 09:35:36 WARN util.ShutdownHookManager: ShutdownHook 'ClientFinalizer' failed, org.apache.hadoop.HadoopIllegalArgumentException: Cannot close proxy - is not Closeable or does not provide closeable invocation handler class $Proxy18 org.apache.hadoop.HadoopIllegalArgumentException: Cannot close proxy - is not Closeable or does not provide closeable invocation handler class $Proxy18 at org.apache.hadoop.ipc.RPC.stopProxy(RPC.java:639) at org.apache.hadoop.hdfs.DFSClient.closeConnectionToNamenode(DFSClient.java:738) at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:794) at 
org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:847) at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2524) at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2541) at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54) Best regards, Henry
Re: SplitLogManager issue
In the region server log, you should see the details about the failure. On Wed, Sep 25, 2013 at 7:31 PM, kun yan yankunhad...@gmail.com wrote: I used importtsv to import data into HDFS, but a power outage happened during the import. Horrible. Then I re-imported the data (hbase 0.94). The exception: org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find region for data2,04684015,99 after 10 tries. The HMaster logs look as follows: 2013-09-26 10:21:17,874 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0 2013-09-26 10:21:18,875 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0 2013-09-26 10:21:19,875 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0 2013-09-26 10:21:20,875 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0 2013-09-26 10:21:21,876 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0 2013-09-26 10:21:22,875 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0 2013-09-26 10:21:23,875 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0 2013-09-26 10:21:24,875 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0 2013-09-26 10:21:25,436 INFO org.apache.hadoop.hbase.master.SplitLogManager: task /hbase/splitlog/hdfs%3A%2F%2Fhydra0001%3A8020%2Fhbase%2F.logs%2Fhydra0006%2C60020%2C1379926437471-splitting%2Fhydra0006%252C60020%252C1379926437471.1380157500804 entered state err hydra0004,60020,1380159614688 2013-09-26 10:21:25,436 WARN org.apache.hadoop.hbase.master.SplitLogManager: Error splitting /hbase/splitlog/hdfs%3A%2F%2Fhydra0001%3A8020%2Fhbase%2F.logs%2Fhydra0006%2C60020%2C1379926437471-splitting%2Fhydra0006%252C60020%252C1379926437471.1380157500804 2013-09-26 10:21:25,436 WARN org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs in 
[hdfs://hydra0001:8020/hbase/.logs/hydra0003,60020,1379926447350-splitting, hdfs://hydra0001:8020/hbase/.logs/hydra0004,60020,1379926440171-splitting, hdfs://hydra0001:8020/hbase/.logs/hydra0006,60020,1379926437471-splitting] installed = 2 but only 0 done 2013-09-26 10:21:25,436 WARN org.apache.hadoop.hbase.master.MasterFileSystem: Failed splitting of [hydra0003,60020,1379926447350, hydra0004,60020,1379926440171, hydra0006,60020,1379926437471] java.io.IOException: error or interrupted while splitting logs in [hdfs://hydra0001:8020/hbase/.logs/hydra0003,60020,1379926447350-splitting, hdfs://hydra0001:8020/hbase/.logs/hydra0004,60020,1379926440171-splitting, hdfs://hydra0001:8020/hbase/.logs/hydra0006,60020,1379926437471-splitting] Task = installed = 2 done = 0 error = 2 at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:282) at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:300) at org.apache.hadoop.hbase.master.MasterFileSystem.splitLogAfterStartup(MasterFileSystem.java:242) at org.apache.hadoop.hbase.master.HMaster.splitLogAfterStartup(HMaster.java:661) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:580) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:396) at java.lang.Thread.run(Thread.java:722) -- In the Hadoop world, I am just a novice, explore the entire Hadoop ecosystem, I hope one day I can contribute their own code YanBit yankunhad...@gmail.com
Re: Please welcome our newest committer, Rajeshbabu Chintaguntla
Congrats! On Wed, Sep 11, 2013 at 9:54 AM, Stack st...@duboce.net wrote: Hurray for Rajesh! On Wed, Sep 11, 2013 at 9:17 AM, ramkrishna vasudevan ramkrishna.s.vasude...@gmail.com wrote: Hi All, Please join me in welcoming Rajeshbabu (Rajesh) as our new HBase committer. Rajesh has been around for more than a year and has been solving some very good bugs in the Assignment Manager area. He has been working on other things like HBase-MapReduce performance improvements, migration scripts, and of late the secondary index related work. Rajesh has made his first commit to the pom.xml already. Once again, congratulations and welcome to this new role (smile). Cheers Ram
Re: ERROR: org.apache.hadoop.hbase.client.NoServerForRegionException:
You can use hbck to fix it: hbase hbck -fixMeta -fixAssignments If that doesn't work, can you restart the region server that holds the ROOT region? Thanks, Jimmy On Fri, Sep 6, 2013 at 3:56 AM, enes yücer enes...@gmail.com wrote: Hi, I'm using cdh 4.2.0 and hbase 0.94.2. our cluster working on 12 nodes. when I scan table in hbase shell or java api, it return: ERROR: org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find region for table_name,,99 after 7 tries. I am getting same exception when runnig sudo -u hbase hbase hbck -repair. One region looks unassigned in web ui and hbase zkcli. my hbase master log: 1:54:07.226 PMINFOorg.apache.hadoop.hbase.master.AssignmentManager Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again 1:54:17.227 PMINFOorg.apache.hadoop.hbase.master.AssignmentManager Regions in transition timed out: -ROOT-,,0.70236052 state=CLOSING, ts=1378296338838, server=bda1node01.etiya.com,6,1378296027149 1:54:17.227 PMINFOorg.apache.hadoop.hbase.master.AssignmentManager Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again 1:54:27.226 PMINFOorg.apache.hadoop.hbase.master.AssignmentManager Regions in transition timed out: -ROOT-,,0.70236052 state=CLOSING, ts=1378296338838, server=bda1node01.etiya.com,6,1378296027149 1:54:27.226 PMINFOorg.apache.hadoop.hbase.master.AssignmentManager Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again 1:54:37.226 PMINFOorg.apache.hadoop.hbase.master.AssignmentManager Regions in transition timed out: -ROOT-,,0.70236052 state=CLOSING, ts=1378296338838, server=bda1node01.etiya.com,6,1378296027149 1:54:37.227 PMINFOorg.apache.hadoop.hbase.master.AssignmentManager Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again how to fix this problem? Thanks.
Re: Too many Compaction Complete messages
Here d should be the column family being compacted. Do you have 3-5 column families of the same region being compacted? On Wed, Sep 4, 2013 at 8:36 AM, Tom Brown tombrow...@gmail.com wrote: Is it normal to receive 3-5 distinct Compaction Complete statuses for the same region each second? For any individual region, it continuously generates Compacting d in {theregion}... Compaction Complete statuses for minutes or hours. In that status message, what is d? --Tom On Wed, Sep 4, 2013 at 6:21 AM, Frank Chow zhoushuaif...@gmail.com wrote: Hi Tom, The below parameter may help to reduce the number of compactions:

<property>
  <name>hbase.hstore.compactionThreshold</name>
  <value>3</value>
  <description>If more than this number of HStoreFiles exist in any one HStore (one HStoreFile is written per flush of memstore), then a compaction is run to rewrite all HStoreFiles as one. Larger numbers put off compaction, but when it runs, it takes longer to complete.</description>
</property>

Make the value larger; if there are too many small files in the region dir, this may help a lot. Frank Chow
Re: LocalHBaseCluster exception
Find the hbase-default.xml file in your target folder and check the value of the parameter hbase.defaults.for.version. Is its value still the placeholder @@@VERSION@@@? It should be replaced by mvn with the ${project.version}. However, if you do a clean build from eclipse or something else, the replaced value may be gone. You can always manually set hbase.defaults.for.version.skip to true to move on. Thanks, Jimmy On Thu, Aug 22, 2013 at 11:58 PM, 闫昆 yankunhad...@gmail.com wrote: hi all, I compiled the HBase source code with maven and then ran LocalHBaseCluster from the src/main/java directory, but the following exception occurred. I did not modify any configuration and did not replace any files. Thank you for your help. Exception in thread main java.lang.RuntimeException: hbase-default.xml file seems to be for and old version of HBase at org.apache.hadoop.hbase.HBaseConfiguration.checkDefaultsVersion(HBaseConfiguration.java:68) at org.apache.hadoop.hbase.HBaseConfiguration.addHbaseResources(HBaseConfiguration.java:100) at org.apache.hadoop.hbase.HBaseConfiguration.create(HBaseConfiguration.java:111) at org.apache.hadoop.hbase.LocalHBaseCluster.main(LocalHBaseCluster.java:445) -- In the Hadoop world, I am just a novice, explore the entire Hadoop ecosystem, I hope one day I can contribute their own code YanBit yankunhad...@gmail.com
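If you just need to unblock a local development build, the skip flag mentioned above can go into hbase-site.xml (a workaround for the unsubstituted placeholder, not a fix for the broken maven filtering):

```xml
<!-- hbase-site.xml: skip the hbase-default.xml version check.
     Only for local dev builds where maven filtering did not
     replace the @@@VERSION@@@ placeholder. -->
<property>
  <name>hbase.defaults.for.version.skip</name>
  <value>true</value>
</property>
```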
Re: getting splitmanager debug logs continuously
Can you start the master as well (besides region servers)? On Thu, Aug 8, 2013 at 2:41 PM, oc tsdb oc.t...@gmail.com wrote: I am using hbase-0.92 Region server was not running on any of the nodes. Restarted the cluster. It started region server on all nodes except HMaster but still unresponsive. processes running on master are TSDMain HMaster SecondaryNameNode NameNode JobTracker HQuorumPeer processes running on all other nodes are DataNode TaskTracker RegionServer TSDMain This time, I see the error messages in the attached log. Could you please suggest if I can recover/restore the data and get the cluster up. Thanks Regards, VSR On Thu, Aug 8, 2013 at 1:40 PM, Ted Yu yuzhih...@gmail.com wrote: Can you tell us the version of HBase you're using ? Do you find something in region server logs on the 4 remaining nodes ? Cheers On Thu, Aug 8, 2013 at 1:36 PM, oc tsdb oc.t...@gmail.com wrote: Hi, I am running a cluster with 6 nodes; Two of 6 nodes in my cluster went down (due to other application failure) and came back after some time (had to do a power reboot). When these nodes are back I use to get WARN org.apache.hadoop.DFSClient: Failed to connect to , add to deadnodes and continue. Now these messages are stopped and getting continuous debug message as follows. 2013-08-08 12:57:36,628 DEBUG org.apache.hadoop.hbase. 
master.SplitLogManager: total tasks = 14 unassigned = 14 2013-08-08 12:57:37,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs% 2Fmb-3.corp.oc.com%2C60020%2C1375466447768-splitting%2Fmb-3.corp.oc.com %252C60020%252C1375466447768.1375631802971 ver = 0 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs% 2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com %252C60020%252C1375466460755.1375623787557 ver = 0 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs% 2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com %252C60020%252C1375466460755.1375619231059 ver = 3 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs% 2Fmb-2.corp.oc.com%2C60020%2C1375466479427-splitting%2Fmb-2.corp.oc.com %252C60020%252C1375466479427.1375639017535 ver = 0 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs% 2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com %252C60020%252C1375466460755.1375623021175 ver = 0 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs% 2Fmb-3.corp.oc.com%2C60020%2C1375466447768-splitting%2Fmb-3.corp.oc.com %252C60020%252C1375466447768.1375630425141 ver = 0 2013-08-08 12:57:37,629 DEBUG 
org.apache.hadoop.hbase.master.SplitLogManager: resubmitting unassigned task(s) after timeout 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs% 2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com %252C60020%252C1375466460755.1375620714514 ver = 3 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs% 2Fmb-6.corp.oc.com%2C60020%2C1375924525310-splitting%2Fmb-6.corp.oc.com %252C60020%252C1375924525310.1375924529658 ver = 0 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs% 2Fmb-4.corp.oc.com%2C60020%2C1375466551673-splitting%2Fmb-4.corp.oc.com %252C60020%252C1375466551673.1375641592581 ver = 0 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs% 2Fmb-5.corp.oc.com%2C60020%2C1375924528073-splitting%2Fmb-5.corp.oc.com %252C60020%252C1375924528073.1375924532442 ver = 0 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%
Re: AssignmentManager looping?
Something went wrong with split. It should be easy to fix your cluster. However, it will be more interesting to find out how it happened. Do you remember what has happened since it was good previously? Do you have all the logs? On Thu, Aug 1, 2013 at 7:08 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: I tried to remove the znodes but got the same result. So I shutted down all the RS and restarted HBase, and now I have 0 regions for this table. Running HBCK. Seems that it has a lot to do... 2013/8/1 Kevin O'dell kevin.od...@cloudera.com Yes you can if HBase is down, first I would copy .META out of HDFS local and then you can search it for split issues. Deleting those znodes should clear this up though. On Aug 1, 2013 8:52 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: I can't check the meta since HBase is down. Regarding HDFS, I took few random lines like: 2013-08-01 08:45:57,260 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 28328fdb7181cbd9cc4d6814775e8895 not found on server node4,60020,1375319042033; failed processing 2013-08-01 08:45:57,260 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 28328fdb7181cbd9cc4d6814775e8895 from server node4,60020,1375319042033 but it doesn't exist anymore, probably already processed its split And each time, there is nothing like that. hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -lsr / | grep 28328fdb7181cbd9cc4d6814775e8895 On ZK side: [zk: localhost:2181(CONNECTED) 3] ls /hbase/splitlog [zk: localhost:2181(CONNECTED) 10] ls /hbase/unassigned [28328fdb7181cbd9cc4d6814775e8895, a8781a598c46f19723a2405345b58470, b7ebfeb63b10997736fd12920fde2bb8, d95bb27cc026511c2a8c8ad155e79bf6, 270a9c371fcbe9cd9a04986e0b77d16b, aff4d1d8bf470458bb19525e8aef0759] Can I just delete those zknodes? Worst case hbck will find them back from HDFS if required? JM 2013/8/1 Kevin O'dell kevin.od...@cloudera.com Does it exist in meta or hdfs? 
On Aug 1, 2013 8:24 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: My master keep logging that: 2013-07-31 21:52:59,201 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server node7,60020,1375319044055; failed processing 2013-07-31 21:52:59,201 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but it doesn't exist anymore, probably already processed its split 2013-07-31 21:52:59,339 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server node7,60020,1375319044055; failed processing 2013-07-31 21:52:59,339 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but it doesn't exist anymore, probably already processed its split 2013-07-31 21:52:59,461 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server node7,60020,1375319044055; failed processing 2013-07-31 21:52:59,461 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but it doesn't exist anymore, probably already processed its split 2013-07-31 21:52:59,636 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server node7,60020,1375319044055; failed processing 2013-07-31 21:52:59,636 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but it doesn't exist anymore, probably already processed its split 2013-07-31 21:53:00,074 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server node7,60020,1375319044055; failed processing 
2013-07-31 21:53:00,074 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but it doesn't exist anymore, probably already processed its split 2013-07-31 21:53:00,261 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server node7,60020,1375319044055; failed processing 2013-07-31 21:53:00,261 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but
Re: AssignmentManager looping?
It will be great if you can reproduce this issue. One thing to keep in mind is not to run hbck(repair) in this case since hbck may have some problem to handle the split parent properly. By the way, in trunk, region split uses multi row mutate to update meta, which is more reliable. So I think the issue should have been fixed in trunk. On Thu, Aug 1, 2013 at 11:07 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: No,it's a HBase 0.94.10 cluster with Hadoop 1.0.3, everything installed manually from JARs ;) It's a mess to monitor and I would have loved to have it under CM now, but I have to deal with that ;) I'm building a 2nd cluster at home so I will be able to replicate this one to the other one, which might allow me to play even further with it... I will try to reproduce the issue, give me just couple of hours... JM 2013/8/1 Kevin O'dell kevin.od...@cloudera.com Jimmy, Sounds like our dreaded reference file issue again. I spoke with JM and he is going to try to reproduce this My gut tells me our point of no return may be in the wrong place due to some code change along the way, but hbck could also just be doing something wonky. JM, This cluster is not CM managed correct? On Aug 1, 2013 1:49 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: So I had to remove few reference files and run few hbck to get everything back online. Summary: don't stop your cluster while it's major compacting huge tables ;) Thanks all! JM 2013/8/1 Kevin O'dell kevin.od...@cloudera.com If that doesn't work you probably have an invalid reference file and you will find that in RS logs for the HLog split that is never finishing. On Aug 1, 2013 1:38 PM, Kevin O'dell kevin.od...@cloudera.com wrote: JM, Stop HBase rmr /hbase from zkcli Sideline META Run offline meta repair Start HBase On Aug 1, 2013 1:01 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Jimmy, I should still have all the logs. What I did is pretty simple. 
I tried to turn the cluster off while a single-region 250GB table was under major compaction so it would get split. I will tar.gz all the logs for the last few days and make them available. On the other side, I'm still not able to bring it back up... JM

2013/8/1 Jimmy Xiang jxi...@cloudera.com:
Something went wrong with the split. It should be easy to fix your cluster. However, it will be more interesting to find out how it happened. Do you remember what has happened since it was last good? Do you have all the logs?

On Thu, Aug 1, 2013 at 7:08 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
I tried to remove the znodes but got the same result. So I shut down all the RS and restarted HBase, and now I have 0 regions for this table. Running HBCK. Seems that it has a lot to do...

2013/8/1 Kevin O'dell kevin.od...@cloudera.com:
Yes you can if HBase is down; first I would copy .META. out of HDFS locally, and then you can search it for split issues. Deleting those znodes should clear this up though.

On Aug 1, 2013 8:52 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
I can't check the meta since HBase is down. Regarding HDFS, I took a few random lines like:

2013-08-01 08:45:57,260 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 28328fdb7181cbd9cc4d6814775e8895 not found on server node4,60020,1375319042033; failed processing
2013-08-01 08:45:57,260 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 28328fdb7181cbd9cc4d6814775e8895 from server node4,60020,1375319042033 but it doesn't exist anymore, probably already processed its split

And each time, there is nothing like that.
hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -lsr / | grep 28328fdb7181cbd9cc4d6814775e8895

On the ZK side:

[zk: localhost:2181(CONNECTED) 3] ls /hbase/splitlog
[zk: localhost:2181(CONNECTED) 10] ls /hbase/unassigned
[28328fdb7181cbd9cc4d6814775e8895, a8781a598c46f19723a2405345b58470, b7ebfeb63b10997736fd12920fde2bb8, d95bb27cc026511c2a8c8ad155e79bf6, 270a9c371fcbe9cd9a04986e0b77d16b, aff4d1d8bf470458bb19525e8aef0759]

Can I just delete those znodes? Worst case, hbck will find them back from HDFS if required? JM

2013/8/1 Kevin O'dell kevin.od...@cloudera.com:
Does it exist in meta
Re: Delete all data before a given timestamp
When you set up the MR job, does it help to set a proper timestamp filter or time range on the Scan object?

On Tue, Jul 16, 2013 at 5:59 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
Another option might be to set up a proper TTL on the table? You alter the table to set the TTL to reflect your timestamp, then you run a compaction? The issue is that you have to disable the table while you alter it. JM

2013/7/16 Ted Yu yuzhih...@gmail.com:
Would this method (of Delete) serve your need?
public Delete deleteFamily(byte [] family, long timestamp)
From its Javadoc:
* Delete all columns of the specified family with a timestamp less than
* or equal to the specified timestamp.

On Mon, Jul 15, 2013 at 8:07 PM, Chao Shi stepi...@live.com wrote:
Jean-Marc Spaggiari jean-marc@... writes:
When you send a delete command to the server, you can specify a timestamp. So as the result of your MR job, just emit this delete with the specific timestamp to remove any previous version? JM

2013/7/15 Chao Shi stepinto@...:
Hi HBase users, We have created an index table (say T2) of another table (say T1). The clients who write to T1 also write an index record to T2 with the same timestamp. There may be accumulated inconsistency as time goes by, so we run an MR job periodically, which fully scans T1, builds an index, and bulk-loads the result to T2. Because the MR job may run for a while, during which all new data written to T2 must be kept and not overridden, the MR job creates puts using the timestamp at which the job starts. Then we want all data in T2 before a given timestamp to become invisible for reads after the index builds successfully, and to get deleted eventually (e.g. during major compaction). We prefer setting this explicitly rather than using the TTL feature, for safety: we want old data deleted only once the new data is written. Does HBase support this kind of operation for now? Thanks, Chao

Hi Jean-Marc, Thanks for the reply.
I see a delete can specify a timestamp, but I don't think that is what I need. To clarify, in my scenario I don't want to issue deletes for every key (because I don't know exactly what to delete unless I do another full scan). I'd like to see if this is possible: set a min_timestamp on the ColumnDescriptor. Once done, KVs before this timestamp become invisible to reads; during major compaction, these KVs are deleted. It is an absolute version of TTL.
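For reference, the two API-level options discussed in this thread can be sketched against the 0.94-era client API. This is a hedged sketch, not Chao's actual job: the table name, family, row key, and timestamp are illustrative, it needs a running cluster plus the HBase client jars, and it does not implement the per-family min_timestamp Chao is asking for (which HBase did not support at the time).

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class TimestampVisibilitySketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "T2");  // index table from the thread
        long cutoff = 1373900000000L;           // illustrative cutoff timestamp

        // Read-side option: hide cells older than the cutoff without deleting
        // them. setTimeRange is [min, max), so this returns only cells whose
        // timestamp is at or after the cutoff.
        Scan scan = new Scan();
        scan.setTimeRange(cutoff, Long.MAX_VALUE);
        ResultScanner scanner = table.getScanner(scan);
        scanner.close();

        // Delete-side option (Ted's suggestion): per row, drop all cells in a
        // family with a timestamp <= cutoff. Requires knowing the row keys.
        Delete d = new Delete(Bytes.toBytes("some-row"));
        d.deleteFamily(Bytes.toBytes("cf"), cutoff);
        table.delete(d);

        table.close();
    }
}
```

Note the trade-off: the delete-side option still needs one Delete per row, which is exactly the full-scan cost Chao wants to avoid, while the read-side option avoids deletes but leaves the old cells on disk until something removes them.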
Re: Scanner problem after bulk load hfile
Do you see any exception/logging on the region server side?

On Tue, Jul 16, 2013 at 1:15 PM, Rohit Kelkar rohitkel...@gmail.com wrote:
Yes. I tried everything from myTable.flushCommits() to myTable.clearRegionCache() before and after the LoadIncrementalHFiles.doBulkLoad(), but it doesn't seem to work. This is what I am doing right now to get things moving, although I think this may not be the recommended approach:
HBaseAdmin hbaseAdmin = new HBaseAdmin(hbaseConf);
hbaseAdmin.majorCompact(myTableName.getBytes());
myTable.close();
hbaseAdmin.close();
- R

On Mon, Jul 15, 2013 at 9:14 AM, Amit Sela am...@infolinks.com wrote:
Well, I know it's kind of voodoo, but try it once before the pre-split and once after. Worked for me.

On Mon, Jul 15, 2013 at 7:27 AM, Rohit Kelkar rohitkel...@gmail.com wrote:
Thanks Amit, I am also using 0.94.2. I am also pre-splitting, and I tried table.clearRegionCache() but it still doesn't work. - R

On Sun, Jul 14, 2013 at 3:45 AM, Amit Sela am...@infolinks.com wrote:
If new regions are created during the bulk load (are you pre-splitting?), maybe try myTable.clearRegionCache() after the bulk load (or even after the pre-splitting, if you do pre-split). This should clear the region cache. I needed to use this because I am pre-splitting my tables for bulk load. BTW I'm using HBase 0.94.2. Good luck!

On Fri, Jul 12, 2013 at 6:50 PM, Rohit Kelkar rohitkel...@gmail.com wrote:
I am having problems while scanning a table created using HFiles.
This is what I am doing. Once the HFiles are created, I use the following code to bulk load:
LoadIncrementalHFiles loadTool = new LoadIncrementalHFiles(conf);
HTable myTable = new HTable(conf, mytablename.getBytes());
loadTool.doBulkLoad(new Path(outputHFileBaseDir + "/" + mytablename), myTable);

Then I scan the table using:
HTable table = new HTable(conf, "mytable");
Scan scan = new Scan();
scan.addColumn("cf".getBytes(), "q".getBytes());
ResultScanner scanner = table.getScanner(scan);
for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
  numRowsScanned += 1;
}

This code crashes with the following error: http://pastebin.com/SeKAeAST
If I remove the scan.addColumn from the code, then the code works. Similarly, on the hbase shell:
- A simple count 'mytable' gives the correct count.
- A scan 'mytable' gives correct results.
- get 'mytable', 'myrow', 'cf:q' crashes

hadoop dfs -ls /hbase/mytable shows the .tableinfo, .tmp, the directory for the region, etc. Now if I do a major_compact 'mytable' and then execute my code with the scan.addColumn statement, it works. The get 'mytable', 'myrow', 'cf:q' also works. My question is: what is major_compact doing to enable the scanner that the LoadIncrementalHFiles tool is not? I am sure I am missing a step after LoadIncrementalHFiles. - R
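Putting the pieces of this thread together, a minimal bulk-load-then-fix-up sequence looks roughly like the following (0.94-era API; needs a running cluster and the HBase client jars, and the path and table name are illustrative). The major_compact workaround from the thread is included as an optional last step; the underlying bug was later addressed by HBASE-8055.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class BulkLoadSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        String tableName = "mytable";

        // Move the prepared HFiles into the table's regions.
        LoadIncrementalHFiles loadTool = new LoadIncrementalHFiles(conf);
        HTable table = new HTable(conf, tableName);
        loadTool.doBulkLoad(new Path("/bulk/out/" + tableName), table);

        // If regions were created or split during the load, drop the stale
        // client-side region cache (Amit's suggestion in this thread).
        table.clearRegionCache();

        // Workaround used in this thread when scans with addColumn still
        // fail: force a major compaction of the table.
        HBaseAdmin admin = new HBaseAdmin(conf);
        admin.majorCompact(tableName);
        admin.close();
        table.close();
    }
}
```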
Re: Scanner problem after bulk load hfile
HBASE-8055 should have fixed it.

On Tue, Jul 16, 2013 at 2:33 PM, Rohit Kelkar rohitkel...@gmail.com wrote:
This ( http://pastebin.com/yhx4apCG ) is the error on the region server side when I execute the following on the shell - get 'mytable', 'myrow', 'cf:q' - R

On Tue, Jul 16, 2013 at 3:28 PM, Jimmy Xiang jxi...@cloudera.com wrote:
Do you see any exception/logging on the region server side?
Re: Hbase ConnectionLoss for /hbase/master
Is this exception seen on the client side? Was your HBase still running fine at that moment? How many connections do you open in your client? Which HBase version is it? Thanks, Jimmy

On Wed, Apr 3, 2013 at 1:47 AM, Dhanasekaran Anbalagan bugcy...@gmail.com wrote:
Hi Guys, I'm trying to run an HBase-based parser program. Basically, the parsed output is written to an HBase table. When I ran my program, the tables were created and some data was inserted, but the table was not fully updated. This exception was seen in the console output. Googling suggests configuring the ZooKeeper session; I have configured the hbase.zookeeper.quorum property correctly with all my ZooKeeper machine names.

2013-04-02 17:50:13,888 ERROR RecoverableZooKeeper [pool-1-thread-19-EventThread]: ZooKeeper exists failed after 3 retries
2013-04-02 17:50:13,900 ERROR ZooKeeperWatcher [pool-1-thread-19-EventThread]: hconnection Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:226)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:82)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:580)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.resetZooKeeperTrackers(HConnectionManager.java:597)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.abort(HConnectionManager.java:1720)
at
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:374) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:271) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:521) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497) 2013-04-02 17:50:46,601 FATAL HConnectionManager$HConnectionImplementation [pool-1-thread-19-EventThread]: Unexpected exception during initialization, aborting org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154) at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:226) at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:82) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:580) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.resetZooKeeperTrackers(HConnectionManager.java:597) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.abort(HConnectionManager.java:1720) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:374) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:271) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:521) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497) Please guide me. how to fix this. -Dhanasekaran. Did I learn something today? If not, I wasted it. --
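For completeness, the client-side ZooKeeper settings mentioned above are typically set either in hbase-site.xml on the client classpath or programmatically. A rough sketch (hostnames and values are illustrative, not taken from the thread; requires the HBase client jars):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

Configuration conf = HBaseConfiguration.create();
// Comma-separated list of ZooKeeper hosts the client should contact.
conf.set("hbase.zookeeper.quorum", "zk1.example.com,zk2.example.com,zk3.example.com");
conf.set("hbase.zookeeper.property.clientPort", "2181");
// The "ZooKeeper exists failed after 3 retries" message in the log above
// corresponds to this client retry count.
conf.setInt("zookeeper.recovery.retry", 3);
```

If the quorum is right and the error persists, the usual suspects are firewalled ZooKeeper ports, an exhausted per-client connection limit on the ZooKeeper servers, or long client-side GC pauses expiring the session.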
Re: Welcome our newest Committer Anoop
Congratulations! On Wed, Mar 20, 2013 at 6:11 AM, Jonathan Hsieh j...@cloudera.com wrote: welcome welcome! On Wed, Mar 13, 2013 at 10:23 AM, Sergey Shelukhin ser...@hortonworks.com wrote: Congrats! On Tue, Mar 12, 2013 at 10:38 PM, xkwang bruce bruce.xkwa...@gmail.com wrote: Congratulations, Anoop! 2013/3/13 Devaraj Das d...@hortonworks.com Hey Anoop, Congratulations! Devaraj. On Mon, Mar 11, 2013 at 10:50 AM, Enis Söztutar enis@gmail.com wrote: Congrats and welcome. On Mon, Mar 11, 2013 at 2:21 AM, Nicolas Liochon nkey...@gmail.com wrote: Congrats, Anoop! On Mon, Mar 11, 2013 at 5:35 AM, rajeshbabu chintaguntla rajeshbabu.chintagun...@huawei.com wrote: Congratulations Anoop! From: Anoop Sam John [anoo...@huawei.com] Sent: Monday, March 11, 2013 9:00 AM To: user@hbase.apache.org Subject: RE: Welcome our newest Committer Anoop Thanks to all.. Hope to work more and more for HBase! -Anoop- From: Andrew Purtell [apurt...@apache.org] Sent: Monday, March 11, 2013 7:33 AM To: user@hbase.apache.org Subject: Re: Welcome our newest Committer Anoop Congratulations Anoop. Welcome! On Mon, Mar 11, 2013 at 12:42 AM, ramkrishna vasudevan ramkrishna.s.vasude...@gmail.com wrote: Hi All, Please welcome Anoop, our newest committer. Anoop's work in HBase has been great and he has helped a lot of users on the mailing list. He has contributed features related to Endpoints and CPs. Welcome Anoop, and best wishes for your future work. Hope to see your continuing efforts in the community. Regards Ram -- Best regards, - Andy Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White) -- // Jonathan Hsieh (shay) // Software Engineer, Cloudera // j...@cloudera.com
Re: [ANNOUNCE] New Apache HBase Committer - Devaraj Das
Congratulations! On Thu, Feb 7, 2013 at 9:54 AM, Andrew Purtell apurt...@apache.org wrote: Congratulations Devaraj! On Wed, Feb 6, 2013 at 9:19 PM, Ted Yu yuzhih...@gmail.com wrote: Hi, We've brought in one new Apache HBase Committer: Devaraj Das. On behalf of the Apache HBase PMC, I am excited to welcome Devaraj as committer. He has played a key role in unifying RPC engines for 0.96. He fixed some tricky replication-related bugs. There are 30 resolved HBase JIRAs under his name. Please join me in congratulating Devaraj on his new role. -- Best regards, - Andy Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
Re: What does Found lingering reference file mean?
RECOVERED_EDITS is not a column family. It should be ignored by hbck. Filed a jira: https://issues.apache.org/jira/browse/HBASE-7640 Thanks, Jimmy

On Mon, Jan 21, 2013 at 2:36 PM, Stack st...@duboce.net wrote:
Did you get the name of the broken reference? I'd trace its life in the namenode logs and in the regionserver log by searching for its name (you might have to find the region in the master logs to see where the region landed over time). The reference name includes the encoded region name as a suffix. This is the region that the reference 'references', so we need to figure out what happened to it. Did it get cleaned up before the reference was cleared? (Something that should not happen.) St.Ack

On Mon, Jan 21, 2013 at 2:20 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
Hum. It's still a bit obscure to me how this happened to my cluster... -repair helped to fix it, so I'm now fine. I will re-run the job I ran and see if this happens again. Thanks, JM

2013/1/21, Stack st...@duboce.net:
On Mon, Jan 21, 2013 at 12:01 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Found lingering reference file
The comment on the method that is finding the lingering reference files is pretty good: http://hbase.apache.org/xref/org/apache/hadoop/hbase/util/HBaseFsck.html#604
It looks like a reference file that lost its referencee. If you pass this arg., does it help? http://hbase.apache.org/xref/org/apache/hadoop/hbase/util/HBaseFsck.html#3391 St.Ack
Re: Re: always assinging root
That region server is stopping, per the log below. If it is not supposed to be, you may need to restart it.

2012-12-07 13:18:53,947 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Processing open of -ROOT-,,0.70236052
2012-12-07 13:18:53,947 INFO org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Server stopping or stopped, skipping open of -ROOT-,,0.70236052

Jimmy

On Thu, Dec 6, 2012 at 9:53 PM, happygodwithwang happygodwithw...@gmail.com wrote:
Yi, Thanks for your reply. Following is the link for the regionserver 'dn004' log: http://pastebin.com/224CKwsu I can't find anything useful. Thanks, Jing Wang

At 2012-12-07 13:16:14, Yi Liang white...@gmail.com wrote:
Jing, I think you could follow this kind of log entry in the rs log to find why it could not open the root region.
2012-12-07 10:48:48,708 INFO org.apache.hadoop.hbase.master.AssignmentManager: Assigning region -ROOT-,,0.70236052 to dn004,60020,1353922884530
Regards, Yi

On Fri, Dec 7, 2012 at 11:49 AM, jing wang happygodwithw...@gmail.com wrote:
Thanks for your reply. Here is the link for the full log since starting the master: http://pastebin.com/EjRvbk0g
PS: svr36 is the master, the others are regionservers. Thanks, Jing Wang

2012/12/7 ramkrishna vasudevan ramkrishna.s.vasude...@gmail.com:
Check out your master logs. They may give us more clues. Regards, Ram

On Fri, Dec 7, 2012 at 8:59 AM, jing wang happygodwithw...@gmail.com wrote:
Hi there, Something is wrong with our HBase cluster. The master:60010 web UI shows the following after we restarted the master using bin/hbase-daemon.sh start master. What's wrong?
Recent tasks: Master startup - Start Time: Fri Dec 07 10:06:57 CST 2012 - State: RUNNING (since 1hrs, 15mins, 29sec ago) - Status: Assigning ROOT region (since 1hrs, 3mins, 41sec ago)
HBase Version: 0.90.6-cdh3u5, r
Hadoop Version: 0.20.2-cdh3u5, rde14a95e895a72e7b2501bbe628c1e23578aae29
Thanks, Jing Wang
Re: Does HBase combine multiple Puts against the same region server?
If auto flush is off, multiple puts can be combined into a batch and sent to the region server in one RPC call if they are for the same region server. Thanks, Jimmy

On Thu, Dec 6, 2012 at 10:34 AM, yun peng pengyunm...@gmail.com wrote:
Hi, I have a question about how multiple Puts are executed when they are issued against the same region server. For example, in the case of asynchronously executing Put with setAutoFlush(false), there will be multiple Puts in the writeBuffer. Or one can use the HTable API put(List puts), which directly issues multiple Puts. In either case, would two Puts in the list that are issued against the same HRegionServer be combined into a single RPC before being sent to that RegionServer? I'd appreciate a pointer to the relevant code in HBase. Thanks... Regards, Yun
Re: Does HBase combine multiple Puts against the same region server?
This has been built into HBase for quite some time; no application change is needed.

On Thu, Dec 6, 2012 at 11:08 AM, yun peng pengyunm...@gmail.com wrote:
Is that done in the current HBase implementation (say, 0.94.2 or more recent) or does it require applications to handle it? Thanks for your note, Yun

On Thu, Dec 6, 2012 at 1:42 PM, Jimmy Xiang jxi...@cloudera.com wrote:
If auto flush is off, multiple puts can be combined into a batch and sent to the region server in one RPC call if they are for the same region server. Thanks, Jimmy
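Jimmy's point can be made concrete with a small sketch (0.94-era API; needs a running cluster, and the table name, buffer size, and row counts are illustrative): with auto-flush off, puts accumulate in the client write buffer, and on flush the buffered edits are grouped per region server and sent in one multi-put RPC per server.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BatchedPutsSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");

        table.setAutoFlush(false);                 // buffer puts client-side
        table.setWriteBufferSize(2 * 1024 * 1024); // flush roughly every 2 MB

        List<Put> puts = new ArrayList<Put>();
        for (int i = 0; i < 1000; i++) {
            Put p = new Put(Bytes.toBytes("row-" + i));
            p.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v" + i));
            puts.add(p);
        }
        table.put(puts);       // goes into the write buffer, not one RPC per put

        // On flush, buffered puts are grouped by region server and each group
        // is sent as a single batched RPC.
        table.flushCommits();
        table.close();
    }
}
```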
Re: regions not balanced, CDH4.1.2
In CDH4.1.2, per-table region balancing is turned off by default. You can change the configuration to turn it on. Thanks, Jimmy

On Tue, Dec 4, 2012 at 11:10 AM, Ted Yu yuzhih...@gmail.com wrote:
Can you give us a little more detail on how much deviation the region counts on region servers have? There is a parameter, hbase.regions.slop, with a default value of 0.2. This parameter allows the region count to deviate by a certain percentage from the average region count. You can tighten the value of this parameter and see if you get better results. I will also put the above summary on HBASE-3373. Thanks

On Tue, Dec 4, 2012 at 8:42 AM, Norbert Burger norbert.bur...@gmail.com wrote:
We upgraded to CDH4.1.2 (contains HBASE-3373) in one of our environments. After filling that environment with data, I was surprised to see that regions were not balanced across regionservers at the table level. We have restarted all regionservers at least once here. In [1], I see Stack's reference to temporarily adding hbase.master.startup.retainassign=false and restarting nodes. Is this a necessary step on the path to region balancing nirvana? Norbert [1] http://search-hadoop.com/m/MQSPEyUQIv1
Re: regions not balanced, CDH4.1.2
Right, that's the config. You can turn it on and restart the cluster. Upstream it is on by default; however, it is turned off by default in CDH4.1.2 to be backward compatible. Thanks, Jimmy

On Tue, Dec 4, 2012 at 11:28 AM, Norbert Burger norbert.bur...@gmail.com wrote:
Thanks, Jimmy. Do you mean the config hbase.master.loadbalance.bytable? According to [1] and [2], it is true by default. [1] https://issues.apache.org/jira/secure/attachment/12509174/3373.txt [2] http://search-hadoop.com/m/M6z7G1PKejw Norbert

On Tue, Dec 4, 2012 at 2:23 PM, Jimmy Xiang jxi...@cloudera.com wrote:
In CDH4.1.2, per-table region balancing is turned off by default. You can change the configuration to turn it on. Thanks, Jimmy
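The two settings discussed in this thread both go in hbase-site.xml on the master, followed by a master restart. A sketch (the slop value here is illustrative, not a recommendation):

```xml
<!-- hbase-site.xml on the master; restart the master after changing these -->
<property>
  <!-- Re-enable per-table region balancing (off by default in CDH4.1.2) -->
  <name>hbase.master.loadbalance.bytable</name>
  <value>true</value>
</property>
<property>
  <!-- Allowed deviation from the average region count; default is 0.2 -->
  <name>hbase.regions.slop</name>
  <value>0.1</value>
</property>
```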
Re: could not start HMaster
Is your /tmp folder cleaned up automatically, with some files now gone? Thanks, Jimmy

On Mon, Oct 15, 2012 at 12:26 PM, yulin...@dell.com wrote:
Hi, I set up a single-node HBase server on top of Hadoop and it has been working fine with most of my testing scenarios, such as creating tables and inserting data. Over the weekend, I accidentally left a testing script running that inserts about 67 rows every minute, for three days. Today when I looked at the environment, I found that the HBase master could not be started anymore. Digging into the logs, I could see that starting from the second day, HBase first got an exception as follows:

2012-10-13 13:05:07,367 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /tmp/hbase-root/hbase/.logs/sflow-linux02.santanet.dell.com,47137,1348606516541/sflow-linux02.santanet.dell.com%2C47137%2C1348606516541.1350155105992, entries=7981, filesize=3754556. for /tmp/hbase-root/hbase/.logs/sflow-linux02.santanet.dell.com,47137,1348606516541/sflow-linux02.santanet.dell.com%2C47137%2C1348606516541.1350158707364
2012-10-13 13:05:07,367 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: moving old hlog file /tmp/hbase-root/hbase/.logs/sflow-linux02.santanet.dell.com,47137,1348606516541/sflow-linux02.santanet.dell.com%2C47137%2C1348606516541.1348606520442 whose highest sequenceid is 4 to /tmp/hbase-root/hbase/.oldlogs/sflow-linux02.santanet.dell.com%2C47137%2C1348606516541.1348606520442
2012-10-13 13:05:07,379 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server sflow-linux02.santanet.dell.com,47137,1348606516541: IOE in log roller java.io.FileNotFoundException: File file:/tmp/hbase-root/hbase/.logs/sflow-linux02.santanet.dell.com,47137,1348606516541/sflow-linux02.santanet.dell.com%2C47137%2C1348606516541.1348606520442 does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397) at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:213) at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:163) at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:287) at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:428) at org.apache.hadoop.hbase.regionserver.wal.HLog.archiveLogFile(HLog.java:825) at org.apache.hadoop.hbase.regionserver.wal.HLog.cleanOldLogs(HLog.java:708) at org.apache.hadoop.hbase.regionserver.wal.HLog.rollWriter(HLog.java:603) at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:94) at java.lang.Thread.run(Thread.java:662) Then SplitLogManager kept splitting the logs for about two days: 2012-10-13 13:05:09,061 WARN org.apache.zookeeper.server.NIOServerCnxn: caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x139ff3656b30003, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:224) at java.lang.Thread.run(Thread.java:662) 2012-10-13 13:05:09,061 INFO org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection for client /127.0.0.1:52573 which had sessionid 0x139ff3656b30003 2012-10-13 13:05:09,082 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down 2012-10-13 13:05:09,085 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs for sflow-linux02.santanet.dell.com,47137,1348606516541 2012-10-13 13:05:09,086 INFO org.apache.hadoop.hbase.master.SplitLogManager: dead splitlog worker sflow-linux02.santanet.dell.com,47137,1348606516541 2012-10-13 13:05:09,101 INFO org.apache.hadoop.hbase.master.SplitLogManager: started splitting logs in [file:/tmp/hbase-root/hbase/.logs/sflow-linux02.santanet.dell.com,47137,1348606516541-splitting] 2012-10-13 13:05:14,545 INFO 
org.apache.hadoop.hbase.regionserver.Leases: RegionServer:0;sflow-linux02.santanet.dell.com,47137,1348606516541.leaseChecker closing leases 2012-10-13 13:05:14,545 INFO org.apache.hadoop.hbase.regionserver.Leases: RegionServer:0;sflow-linux02.santanet.dell.com,47137,1348606516541.leaseChecker closed leases 2012-10-13 13:08:09,275 INFO org.apache.hadoop.hbase.master.SplitLogManager: task /hbase/splitlog/RESCAN28 entered state done sflow-linux02.santanet.dell.com,37015,1348606516151 2012-10-13 13:11:09,730 INFO org.apache.hadoop.hbase.master.SplitLogManager: task /hbase/splitlog/RESCAN29 entered state done sflow-linux02.santanet.dell.com,37015,1348606516151 2012-10-13 13:14:10,171 INFO org.apache.hadoop.hbase.master.SplitLogManager: task /hbase/splitlog/RESCAN30 entered state done sflow-linux02.santanet.dell.com,37015,1348606516151 When I tried to re-start HBase server today, the following exception occurs: 2012-10-15
Re: Question about WAL writes after region server soft failures
Hi Nick, When the dead region server comes back, it won't be able to write data to the WAL any more. As the first step of log splitting, the WAL folder for the dead region server is renamed. When the dead region server then tries to write to the WAL, it will find the file is no longer there. Thanks, Jimmy

On Fri, Sep 7, 2012 at 12:19 PM, Nick Puz npuz...@me.com wrote:
I'm new to HBase and HDFS and have a question about what happens when a failure is detected and a new region server takes over a region. If the old region server hasn't really failed and comes back, will it still accept writes? Here's a specific sequence of events:
1) region R is currently being served by region server RS1.
2) RS1 hangs for some reason (long GC, network hiccup, etc).
3) the master gets notified that RS1 is down, so it splits logs and reassigns. Looking at the code, splitting logs renames the log directory, so if RS1 tries to create a new log file it will fail.
4) region server RS2 is assigned the region, replays the log, and all is well.
5) RS1 comes back to life.
After 5 happens:
- if it had in-flight requests, will it write them to the WAL and eventually flush the memstores?
- if it gets new requests, will it service them as long as it is still appending to the same block in the WAL file?
One way to prevent the clients from getting acks would be to set the client timeout to be less than the ZooKeeper session timeout (zookeeper.session.timeout), which seems like a logical thing to do. But even if the timeouts were set that way and the client got a timeout, are there scenarios where the edits would be readable by other clients? (Say, if that log file were rescanned.) Thanks, -Nick
Re: MR hbase export is failing
It could also be caused by the MR job taking too long to process a batch of data before coming back for another batch. Thanks, Jimmy

On Tue, Jul 24, 2012 at 11:52 AM, Jeff Whiting je...@qualtrics.com wrote:
What would cause a scanner timeout exception? Is hdfs too slow? Do I just increase the scanner timeout, or is there a better approach? Thanks, ~Jeff

Running:
hadoop jar /usr/lib/hbase/hbase-0.90.1-CDH3B4.jar export -D dfs.replication=2 -D mapred.output.compress=true -D mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec -D mapred.output.compression.type=BLOCK response /hbaseBackup/partial/2012-07-20_2012-07-21/response 1 134274240 134282880

org.apache.hadoop.hbase.client.ScannerTimeoutException: 61294ms passed since the last invocation, timeout is currently set to 6
at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1114)
at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:143)
at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:142)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:455)
at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:646)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
at org.apache.hadoop.mapred.Child.main(Child.java:234)
Caused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Name: -3564751891935236449
at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1795)
at
sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96) at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:83) at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:38) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1000) at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1100) ... 12 more -- Jeff Whiting Qualtrics Senior Software Engineer je...@qualtrics.com
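For reference, the server-side scanner lease in 0.90.x is governed by hbase.regionserver.lease.period (milliseconds). A hedged sketch of the tuning Jeff asks about -- the 120000 value below is only illustrative, and lowering the scan caching so each next() call returns sooner is the complementary client-side fix:

```xml
<!-- hbase-site.xml (0.90.x) fragment: raise the scanner lease so a slow
     map task can come back for the next batch before the lease expires.
     120000 ms is an example value, not a recommendation. -->
<property>
  <name>hbase.regionserver.lease.period</name>
  <value>120000</value>
</property>
```

Note this is a region-server-side setting, so it needs a region server restart to take effect.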
Re: Is this exception caused by an overloaded node?
This exception means the scanner has expired on the region server side. You can increase the scanner lease expiration setting, or make your client faster. Thanks, Jimmy On Fri, Jul 20, 2012 at 9:27 AM, Jonathan Bishop jbishop@gmail.com wrote: Hi, I am running on a cluster where some of the machines are loaded for other purposes. Occasionally an HBase scan fails with the message below, and I suspect this is caused by one or more of the region servers being overloaded with other processes (not hadoop/hbase) and not being able to respond in time. Is this possible? Thanks, Jon Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hbase.regionserver.LeaseException: org.apache.hadoop.hbase.regionserver.LeaseException: lease '-3573171992963675348' does not exist at org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:231) at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2117) at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at 
org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:532) at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96) at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:84) at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:39) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1325) at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1293) at org.apache.hadoop.hbase.client.HTable$ClientScanner$1.hasNext(HTable.java:1399) ... 2 more
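The arithmetic behind "make your client faster" can be sketched: the client must call next() again before the lease period lapses, so the scan caching value times the per-row processing time has to stay under the lease. A toy calculation with assumed numbers (60000 ms is the 0.90-era default lease; 50 ms/row is a made-up client processing cost):

```java
public class ScannerLeaseBudget {
    // Largest scan caching value that keeps one batch's processing time
    // under the scanner lease, given the per-row client cost.
    static long maxSafeCaching(long leasePeriodMs, long perRowMs) {
        return leasePeriodMs / perRowMs;
    }

    public static void main(String[] args) {
        long lease = 60_000;   // hbase.regionserver.lease.period (assumed default)
        long perRow = 50;      // assumed client processing time per row
        System.out.println("max safe caching ~ " + maxSafeCaching(lease, perRow));
    }
}
```

In practice you would also leave headroom for GC pauses and load spikes on the overloaded machines Jon describes, so a much smaller caching value is safer.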
Re: Question about regions splitting
At the very beginning, the two daughter regions are on the same region server as the parent region. But they can be moved to other region servers by the region balancer. The HFiles of a region may not be on the same host as the region. Thanks, Jimmy On Fri, Jul 20, 2012 at 2:44 PM, Haijia Zhou leons...@gmail.com wrote: I have a question about how regions split. Let's say we have a 2GB region, and upon splitting the region will be split into two 1GB sub-regions. My first question is: will the two 1GB sub-regions always be on the same host as the parent region? My second question is: let's say the HDFS block size is 64MB; then one 1GB region will contain about 16 HDFS blocks. Will all HDFS blocks always be on the same host as the region? Thanks
Re: Scan only talks to a single region server
Hi Whitney, The scanner will automatically move on to the next region server once the current one has been scanned. In the client, can HTable.getStartEndKeys() see all the regions and region servers? Thanks, Jimmy On Tue, Jul 17, 2012 at 10:47 AM, Whitney Sorenson wsoren...@hubspot.com wrote: The code is pasted above, here it is again: ResultScanner rs = table.getScanner(family, qualifier); for (Result r : rs) { // do something } ResultScanners are iterable, which means you can for-each them. In addition, the debug logs indicate that the scanner only ever retrieves rows from the first region server. On Tue, Jul 17, 2012 at 12:02 PM, Alex Baranau alex.barano...@gmail.com wrote: How do you create your scan(ner)? Could you paste the code here? Sorry, meant to ask how you instantiate the HTable and configuration objects. Alex Baranau -- Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch - Solr On Tue, Jul 17, 2012 at 11:37 AM, Alex Baranau alex.barano...@gmail.com wrote: this scan is running inside a map task How do you create your scan(ner)? Could you paste the code here? You know that when an HBase table is used as a source for a MapReduce job (via standard configuration), each Map task consumes data from one region (apart from other things, it tries to benefit from data locality). I.e. it creates one Map task per region. I wonder if this can be related. Sorry for the obvious check... Alex Baranau -- Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch - Solr On Tue, Jul 17, 2012 at 11:11 AM, Whitney Sorenson wsoren...@hubspot.com wrote: I'm trying to scan across an entire table (using only a specific family or family + qualifier). I've tried various methods but I can only get this scan to touch the first region server. Afterwards, it stops processing. Issuing the same scan in the shell works (returns 50,000 rows) whereas the Scan made from Java only returns ~4000 rows. 
I've tried adding/removing start/stop rows, using getScanner(family, column) vs getScanner(scan), and restarting the region servers which host the 1st and 2nd regions. The debug output from the scan shows that it knows about locations for each region; however, it calls close after the first region. In the simplest case, the code looks like: ResultScanner rs = table.getScanner(family, qualifier); for (Result r : rs) { // do something } Any ideas or known issues? (0.90.4-cdh3u2 - this scan is running inside a map task) I figure the next step is to walk through the client scanner code locally in a java main but haven't done this yet. -- Alex Baranau -- Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch - Solr -- Alex Baranau -- Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch - Solr
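For readers debugging the same symptom, this is roughly how the client scanner is supposed to advance: when one region's scanner is exhausted, the next one is opened at that region's end key, and scanning stops only when the end key is empty. A simplified stand-in sketch (string keys instead of byte[], hypothetical region boundaries), useful for contrasting with what the debug logs show:

```java
import java.util.ArrayList;
import java.util.List;

public class RegionAdvanceSketch {
    // Walks an ordered list of (startKey, endKey) regions the way the
    // client scanner should: "scan" a region, and when it is exhausted,
    // reopen at its end key; an empty end key marks the last region.
    static List<String> visit(String[][] regions) {
        List<String> visited = new ArrayList<>();
        int i = 0;
        while (true) {
            String start = regions[i][0], end = regions[i][1];
            visited.add(start + ".." + end);  // pretend we scanned this region
            if (end.isEmpty()) break;         // last region of the table: stop
            i++;                              // next region starts at 'end'
        }
        return visited;
    }

    public static void main(String[] args) {
        // Hypothetical table with three regions.
        String[][] regions = { {"", "m"}, {"m", "t"}, {"t", ""} };
        System.out.println(visit(regions));
    }
}
```

A client showing Whitney's symptom would stop after the first element of this walk, which is why checking what getStartEndKeys() returns (as Jimmy suggests) is a good first step.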
Re: HMaster not failing over dead RegionServers
Bryan, The master could not detect that the region server was dead. How do you set the zookeeper session timeout? Thanks, Jimmy On Sat, Jun 30, 2012 at 8:09 AM, Stack st...@duboce.net wrote: On Sat, Jun 30, 2012 at 7:04 AM, Bryan Beaudreault bbeaudrea...@hubspot.com wrote: 12/06/30 00:07:22 INFO ipc.Client: Retrying connect to server: /10.125.18.129:50020. Already tried 14 time(s). This was one of the servers that went down? It was not following through on the splitting of HLog files and didn't appear to be moving regions off failed hosts. After giving it about 20 minutes to try to right itself, I tried restarting the service. The restart script just hung for a while printing dots, and nothing apparent was happening in the logs at the time. Can we see the log, Bryan? You might thread dump when it's hung up the next time, Bryan (would be something for us to do a looksee on). Finally I kill -9'd the process, so that another master could take over. The new master seemed to start splitting logs, but eventually got into the same state of printing the above message. You think it's a particular log? Eventually it all worked out, but it took WAY too long (almost an hour, all said). Is this something that is tunable? Have RSes carry fewer WALs? It's a configuration. They should have instantly been removed from the list instead of retrying so many times. Each server was retried upwards of 30-40 times. Yeah, that's a bit silly. We're working on the MTTR in general. Your logs would be of interest to a few of us, if it's ok that someone else takes a look. St.Ack I am running cdh3u2 (0.90.4). Thanks, Bryan
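The session timeout Jimmy asks about is set via zookeeper.session.timeout in hbase-site.xml. A hedged example follows -- the 60000 ms value is illustrative only. Shorter timeouts let the master notice dead region servers sooner, at the cost of false positives when a region server merely pauses for a long GC:

```xml
<!-- hbase-site.xml fragment: how long ZooKeeper waits before declaring a
     region server's session expired. Example value; tune against your
     worst-case GC pause, not below it. -->
<property>
  <name>zookeeper.session.timeout</name>
  <value>60000</value>
</property>
```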
Re: Schedule major compaction programmatically
I am thinking of adding a function to check whether a table or region is in compaction (major or minor). I filed HBASE-6033. It won't show the status of a specific compaction request. Will this help? Thanks, Jimmy On Thu, May 17, 2012 at 11:11 AM, Chen Song chen.song...@gmail.com wrote: I would like to schedule major compaction on a region programmatically. I found the API call below which can properly achieve my goal: HBaseAdmin.majorCompact(String tableOrRegionName) It turns out to be an asynchronous call, and there seems to be no callback parameter that can be specified. How can I validate the compaction result (e.g., success or failure)? Thanks Chen
Re: Schedule major compaction programmatically
It is an async call to the region server to request a compaction. Once the request is accepted, the call returns. There is no sync call here. The request is queued and processed by a pool of threads. Currently, there is a metric to show the queue size, but it doesn't tell how many requests are for major compactions and how many are for minor. The queue size is the number of store files pending compaction. As far as I know, there is no workaround for now. Jimmy On Thu, May 17, 2012 at 11:42 AM, Chen Song chen.song...@gmail.com wrote: Thanks Jimmy. Meanwhile, is there a workaround for this? How does compact/major_compact issued from the hbase shell handle this under the hood? Is it eventually calling the HBaseAdmin API or an HRegion synchronous API call? Thanks Chen On Thu, May 17, 2012 at 2:24 PM, Jimmy Xiang jxi...@cloudera.com wrote: I am thinking of adding a function to check whether a table or region is in compaction (major or minor). I filed HBASE-6033. It won't show the status of a specific compaction request. Will this help? Thanks, Jimmy On Thu, May 17, 2012 at 11:11 AM, Chen Song chen.song...@gmail.com wrote: I would like to schedule major compaction on a region programmatically. I found the API call below which can properly achieve my goal: HBaseAdmin.majorCompact(String tableOrRegionName) It turns out to be an asynchronous call, and there seems to be no callback parameter that can be specified. How can I validate the compaction result (e.g., success or failure)? Thanks Chen -- Chen Song Mobile: 518-445-5096
Re: Schedule major compaction programmatically
HRegionServer.java: this.metrics.compactionQueueSize.set(compactSplitThread.getCompactionQueueSize()); On Thu, May 17, 2012 at 12:00 PM, Chen Song chen.song...@gmail.com wrote: Can you direct me to the API call to get the queue size metric? On Thu, May 17, 2012 at 2:58 PM, Jimmy Xiang jxi...@cloudera.com wrote: It is an async call to the region server to request a compaction. Once the request is accepted, the call returns. There is no sync call here. The request is queued and processed by a pool of threads. Currently, there is a metric to show the queue size, but it doesn't tell how many requests are for major compactions and how many are for minor. The queue size is the number of store files pending compaction. As far as I know, there is no workaround for now. Jimmy On Thu, May 17, 2012 at 11:42 AM, Chen Song chen.song...@gmail.com wrote: Thanks Jimmy. Meanwhile, is there a workaround for this? How does compact/major_compact issued from the hbase shell handle this under the hood? Is it eventually calling the HBaseAdmin API or an HRegion synchronous API call? Thanks Chen On Thu, May 17, 2012 at 2:24 PM, Jimmy Xiang jxi...@cloudera.com wrote: I am thinking of adding a function to check whether a table or region is in compaction (major or minor). I filed HBASE-6033. It won't show the status of a specific compaction request. Will this help? Thanks, Jimmy On Thu, May 17, 2012 at 11:11 AM, Chen Song chen.song...@gmail.com wrote: I would like to schedule major compaction on a region programmatically. I found the API call below which can properly achieve my goal: HBaseAdmin.majorCompact(String tableOrRegionName) It turns out to be an asynchronous call, and there seems to be no callback parameter that can be specified. How can I validate the compaction result (e.g., success or failure)? Thanks Chen -- Chen Song Mobile: 518-445-5096 -- Chen Song Mobile: 518-445-5096
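If a check like the one proposed in HBASE-6033 becomes available, the usual pattern is: issue the async majorCompact, then poll the check until it reports no compaction in progress. Below is a stand-alone sketch of that polling loop; the BooleanSupplier stands in for the hypothetical per-region "is compacting" check and is not a real HBase API.

```java
import java.util.function.BooleanSupplier;

public class CompactionPoll {
    // Poll until isCompacting reports false or maxPolls is reached.
    // Returns the number of polls made while compaction was in progress.
    static int waitForCompaction(BooleanSupplier isCompacting, int maxPolls) {
        int polls = 0;
        while (isCompacting.getAsBoolean() && polls < maxPolls) {
            polls++;  // in real code: Thread.sleep(...) between polls
        }
        return polls;
    }

    public static void main(String[] args) {
        // Fake check: pretend the compaction finishes after 3 polls.
        int[] remaining = {3};
        int polls = waitForCompaction(() -> remaining[0]-- > 0, 10);
        System.out.println("polled " + polls + " times");
    }
}
```

Even with such a check, note it only tells you the compaction finished, not whether it succeeded; as the thread says, the API gives no per-request status.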
Re: Coprocessor Aggregation supposed to be ~20x slower than Scans?
Yes, it is fixed in CDH4. It will be in the coming release. Thanks, Jimmy On Tue, May 15, 2012 at 5:34 PM, Ted Yu yuzhih...@gmail.com wrote: Hopefully this gets fixed in https://repository.cloudera.com/artifactory/public/org/apache/hbase/hbase/0.92.0-cdh4b2-SNAPSHOT/ A developer from Cloudera would be able to better help you. On Tue, May 15, 2012 at 5:30 PM, anil gupta anilgupt...@gmail.com wrote: Hi Ted, I looked into hbase-0.92.0-cdh4b1-20120206.193413-23-sources.jar and it also doesn't have it. On Tue, May 15, 2012 at 5:07 PM, Ted Yu yuzhih...@gmail.com wrote: Why did you need to decompile? Here is the source code: https://repository.cloudera.com/artifactory/public/org/apache/hbase/hbase/0.92.0-cdh4b1-SNAPSHOT/ On Tue, May 15, 2012 at 4:58 PM, anil gupta anilgupt...@gmail.com wrote: Hi Ted, I decompiled the hbase-0.92.0-cdh4b1.jar using JD-GUI, and in the validateParameters method I don't find that condition. Thanks, Anil On Tue, May 15, 2012 at 1:37 PM, Ted Yu yuzhih...@gmail.com wrote: I checked the code in Apache HBase 0.92 and trunk. I see the following line in validateParameters(): !Bytes.equals(scan.getStopRow(), HConstants.EMPTY_END_ROW))) { Can you confirm that the bug is in cdh4b1 only? Sorry for not doing the validation earlier. On Tue, May 15, 2012 at 12:09 PM, anil gupta anilgupt...@gmail.com wrote: Oh, I see. Now if I look closely at your gmail id I can see your name. I was totally confused. So, you want to force the user to specify stopRow if the filter is not used? What if the user just wants to scan the table from startRow till the end of the table? In your solution the user will have to explicitly set the stopRow as HConstants.EMPTY_END_ROW. Do we really want to force this? 
As per your solution the code would look like this: if (scan.hasFilter()) { if (scan == null || (Bytes.equals(scan.getStartRow(), scan.getStopRow()) && !Bytes.equals(scan.getStartRow(), HConstants.EMPTY_START_ROW)) || (Bytes.compareTo(scan.getStartRow(), scan.getStopRow()) > 0 && !Bytes.equals(scan.getStopRow(), HConstants.EMPTY_END_ROW))) { throw new IOException("Agg client Exception: Startrow should be smaller than Stoprow"); } else if (scan.getFamilyMap().size() != 1) { throw new IOException("There must be only one family."); } } else { if (scan == null || (Bytes.equals(scan.getStartRow(), scan.getStopRow()) && !Bytes.equals(scan.getStartRow(), HConstants.EMPTY_START_ROW)) || Bytes.compareTo(scan.getStartRow(), scan.getStopRow()) > 0) { throw new IOException("Agg client Exception: Startrow should be smaller than Stoprow"); } else if (scan.getFamilyMap().size() != 1) { throw new IOException("There must be only one family."); } } Let me know your thoughts. Thanks, Anil On Tue, May 15, 2012 at 11:46 AM, Ted Yu yuzhih...@gmail.com wrote: Anil: I am having trouble accessing JIRA. Ted Yu and Zhihong Yu are the same person :-) I think it would be good to remind users of the aggregation client to narrow the range of the scan. That's why I proposed adding the check of hasFilter(). Cheers On Tue, May 15, 2012 at 10:47 AM, Ted Yu yuzhih...@gmail.com wrote: Take your time. Once you complete your first submission, subsequent contributions will be easier. On Tue, May 15, 2012 at 10:34 AM, anil gupta anilgupt...@gmail.com wrote: Hi Ted, I created the jira: https://issues.apache.org/jira/browse/HBASE-5999 for fixing this. Creating the patch might take me some time (due to the learning curve) as this is the first time I would be creating a patch. Thanks, Anil Gupta On Mon, May 14, 2012 at 4:00 PM, Ted Yu yuzhih...@gmail.com wrote: I was aware of the following change. Can you log a JIRA and attach the patch to it? Thanks for trying out and improving the aggregation client. 
On Mon, May 14, 2012 at 3:31 PM, anil gupta anilgupt...@gmail.com wrote: Hi Ted, If we change the if statement condition in validateParameters method in AggregationClient.java to: if (scan == null || (Bytes.equals(scan.getStartRow(), scan.getStopRow())
Re: Failed to apply HBASE-5128 on v0.90.6
You don't need to apply this patch to fix your problem. To use the enhanced hbck on 0.90, check out the latest 0.90 branch and build the hbase jar, then make a copy of your hbase home directory and replace the hbase jar in the copy with the one you just built. You can then run the enhanced hbck from the copy while your installed code stays unchanged. Thanks, Jimmy On Sun, May 6, 2012 at 7:14 AM, Ted Yu yuzhih...@gmail.com wrote: Can you pastebin the contents of the .rej files? Cheers On Sat, May 5, 2012 at 10:13 PM, Yabo Xu arber.resea...@gmail.com wrote: Hi all: We are trying to repair some of the corrupted regions from an earlier data migration. To be specific, we have many regions with .regioninfo corrupted. We manually removed those corrupted regions, and attempted to fix ROOT/META by running OfflineMetaRepair. It reports that there are many holes discovered, with messages like: p|2b8ad6458052eca8bd78012b41bc6533|165148928|web|f3a13d30730ef68fae0f1355ff436872 and p|3e8997c536a88039549cb8a2963d5d83|232368128|web|6776c73f7429fe85ef7932221bf60a07. You need to create a new regioninfo and region dir in hdfs to plug the hole. Another search led us to HBASE-5128 https://issues.apache.org/jira/browse/HBASE-5128, an HBaseFsck patch that seems to be exactly what we want. When we tried to apply the patch, it had the following conflicts. FYI: our hadoop/hbase versions are 0.20.2/0.90.6. Did we do anything wrong, or is there just no patch for 0.90.6 and we have to make our own? 
patching file src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java
Hunk #1 FAILED at 175.
Hunk #2 succeeded at 185 with fuzz 1 (offset -12 lines).
1 out of 2 hunks FAILED -- saving rejects to file src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java.rej
patching file src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
Hunk #1 FAILED at 133.
1 out of 2 hunks FAILED -- saving rejects to file src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java.rej
patching file src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Hunk #1 succeeded at 1085 (offset -1 lines).
Hunk #2 FAILED at 1123.
1 out of 2 hunks FAILED -- saving rejects to file src/main/java/org/apache/hadoop/hbase/master/HMaster.java.rej
patching file src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
patching file src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java
patching file src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java
patching file src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandler.java
patching file src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandlerImpl.java
patching file src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
patching file src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java
patching file src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsckComparator.java
patching file src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java
patching file src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java
Best, Arber
Re: Region server shutting down due to HDFS error
Which versions of HDFS and HBase are you using? When the problem happens, can you access HDFS, for example via hadoop dfs? Thanks, Jimmy On Wed, Mar 28, 2012 at 4:28 AM, Eran Kutner e...@gigya.com wrote: Hi, We have region servers sporadically stopping under load, supposedly due to errors writing to HDFS. Things like: 2012-03-28 00:37:11,210 WARN org.apache.hadoop.hdfs.DFSClient: Error while syncing java.io.IOException: All datanodes 10.1.104.10:50010 are bad. Aborting.. It's happening with a different region server and data node every time, so it's not a problem with one specific server, and there doesn't seem to be anything really wrong with either of them. I've already increased the file descriptor limit, datanode xceivers, and data node handler count. Any idea what could be causing these errors? A more complete log is here: http://pastebin.com/wC90xU2x Thanks. -eran
Re: ERROR: org.apache.hadoop.hbase.MasterNotRunningException: Retried 7 Times
What's your hbase.zookeeper.quorum configuration? You can check out this quick start guide: http://hbase.apache.org/book/quickstart.html Thanks, Jimmy On Mon, Feb 13, 2012 at 10:09 AM, Bing Li lbl...@gmail.com wrote: Dear all, After searching the Web and asking friends for help, I noticed that the pseudo-distributed configuration in the book, HBase: The Definitive Guide, was not complete. Now the ZooKeeper-related exception is fixed. However, I got another error when typing status in the HBase shell. ERROR: org.apache.hadoop.hbase.MasterNotRunningException: Retried 7 Times I am trying to fix it myself. Your help is highly appreciated. Thanks so much! Bing Li On Mon, Feb 13, 2012 at 5:00 AM, Bing Li lbl...@gmail.com wrote: Dear all, I am a new learner of HBase. I tried to set up HBase on a pseudo-distributed HDFS. After starting HDFS by running ./start-dfs.sh and ./start-hbase.sh, I started the HBase shell. ./hbase shell It started properly. However, when I typed the command status as follows: hbase(main):001:0> status It got the following exception. Since I had very limited experience with HBase, I could not figure out what the problem was. SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/hbase-0.92.0/lib/slf4j-log4j12-1.5.8.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/hadoop-1.0.0/lib/slf4j-log4j12-1.4.3.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
12/02/13 04:34:01 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 3 retries 12/02/13 04:34:01 WARN zookeeper.ZKUtil: hconnection Unable to set watcher on znode /hbase/master org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1003) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154) at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:226) at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:76) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:580) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.init(HConnectionManager.java:569) at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:186) at org.apache.hadoop.hbase.client.HBaseAdmin.init(HBaseAdmin.java:98) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.jruby.javasupport.JavaConstructor.newInstanceDirect(JavaConstructor.java:275) at org.jruby.java.invokers.ConstructorInvoker.call(ConstructorInvoker.java:91) at org.jruby.java.invokers.ConstructorInvoker.call(ConstructorInvoker.java:178) at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:322) at org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:178) at 
org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:182) at org.jruby.java.proxies.ConcreteJavaProxy$2.call(ConcreteJavaProxy.java:47) at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:322) Could you please give me a hand? Thanks so much! Best regards, Bing
Re: ERROR: org.apache.hadoop.hbase.MasterNotRunningException: Retried 7 Times
In this case, you may just use the standalone mode. You can follow the quick start step by step. The default zookeeper port is 2181; you don't need to configure it. On Mon, Feb 13, 2012 at 11:28 AM, Bing Li lbl...@gmail.com wrote: Dear Jimmy, I am a new user of HBase. My experience with HBase and Hadoop is very limited. I just tried to follow some books, such as Hadoop/HBase: The Definitive Guide. However, I still got some problems. What I am trying to do is just to set up a pseudo-distributed HBase environment on a single node. After that, I will start my system programming in Java. I hope I can deploy the system in fully distributed mode when my system is done. So what I am configuring is very simple. Do I need to set up the zookeeper port in hbase-site.xml? Thanks so much! Best, Bing On Tue, Feb 14, 2012 at 3:16 AM, Jimmy Xiang jxi...@cloudera.com wrote: Have you restarted your HBase after the change? Which zookeeper port does your HMaster use? Can you run the following to check where your HMaster is? hbase zkcli then: get /hbase/master It should show you the master location. It seems you have a distributed installation. How many regionservers do you have? Can you check your master web UI to make sure all looks fine? Thanks, Jimmy On Mon, Feb 13, 2012 at 10:51 AM, Bing Li lbl...@gmail.com wrote: Dear Jimmy, Thanks so much for your reply! I didn't set up the zookeeper.quorum. After getting your email, I made a change. Now my hbase-site.xml is as follows. <configuration> <property> <name>hbase.rootdir</name> <value>hdfs://localhost:9000/hbase</value> </property> <property> <name>dfs.replication</name> <value>1</value> </property> <property> <name>hbase.cluster.distributed</name> <value>true</value> </property> <property> <name>hbase.zookeeper.quorum</name> <value>localhost</value> </property> </configuration> The previous error still exists. I find it strange that the HBase developers cannot provide a reliable description of their work. 
Best, Bing On Tue, Feb 14, 2012 at 2:16 AM, Jimmy Xiang jxi...@cloudera.com wrote: What's your hbase.zookeeper.quorum configuration? You can check out this quick start guide: http://hbase.apache.org/book/quickstart.html Thanks, Jimmy On Mon, Feb 13, 2012 at 10:09 AM, Bing Li lbl...@gmail.com wrote: Dear all, After searching the Web and asking friends for help, I noticed that the pseudo-distributed configuration in the book, HBase: The Definitive Guide, was not complete. Now the ZooKeeper-related exception is fixed. However, I got another error when typing status in the HBase shell. ERROR: org.apache.hadoop.hbase.MasterNotRunningException: Retried 7 Times I am trying to fix it myself. Your help is highly appreciated. Thanks so much! Bing Li On Mon, Feb 13, 2012 at 5:00 AM, Bing Li lbl...@gmail.com wrote: Dear all, I am a new learner of HBase. I tried to set up HBase on a pseudo-distributed HDFS. After starting HDFS by running ./start-dfs.sh and ./start-hbase.sh, I started the HBase shell. ./hbase shell It started properly. However, when I typed the command status as follows: hbase(main):001:0> status It got the following exception. Since I had very limited experience with HBase, I could not figure out what the problem was. SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/hbase-0.92.0/lib/slf4j-log4j12-1.5.8.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/hadoop-1.0.0/lib/slf4j-log4j12-1.4.3.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
12/02/13 04:34:01 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 3 retries 12/02/13 04:34:01 WARN zookeeper.ZKUtil: hconnection Unable to set watcher on znode /hbase/master org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1003) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154) at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:226) at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:76) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:580) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.init(HConnectionManager.java:569
Re: ERROR: org.apache.hadoop.hbase.MasterNotRunningException: Retried 7 Times
Which port does your HDFS listen to? It is not 9000, right? <name>hbase.rootdir</name> <value>hdfs://localhost:9000/hbase</value> You need to fix this and make sure your HDFS is working; for example, the following command should work for you: hadoop fs -ls / On Mon, Feb 13, 2012 at 11:44 AM, Bing Li lbl...@gmail.com wrote: Dear Jimmy, I configured the standalone mode successfully. But I wonder why the pseudo-distributed one doesn't work. I checked the logs and got the following exceptions. Does this information give you some hints? Thanks so much for your help again! Best, Bing 2012-02-13 18:25:49,782 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on connection exception: java.net.ConnectException: Connection refused at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095) at org.apache.hadoop.ipc.Client.call(Client.java:1071) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225) at $Proxy10.getProtocolVersion(Unknown Source) at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396) at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379) at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119) at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:238) at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:203) at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187) at org.apache.hadoop.hbase.util.FSUtils.getRootDir(FSUtils.java:471) at org.apache.hadoop.hbase.master.MasterFileSystem.init(MasterFileSystem.java:94) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:448) at 
org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:326) at java.lang.Thread.run(Thread.java:662) Caused by: java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:656) at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560) at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202) at org.apache.hadoop.ipc.Client.call(Client.java:1046) ... 18 more 2012-02-13 18:25:49,787 INFO org.apache.hadoop.hbase.master.HMaster: Aborting 2012-02-13 18:25:49,787 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service threads Thanks so much! Bing On Tue, Feb 14, 2012 at 3:35 AM, Jimmy Xiang jxi...@cloudera.com wrote: In this case, you may just use the standalone mode. You can follow the quick start step by step. The default zookeeper port is 2181, you don't need to configure it. On Mon, Feb 13, 2012 at 11:28 AM, Bing Li lbl...@gmail.com wrote: Dear Jimmy, I am a new user of HBase. My experiences in HBase and Hadoop is very limited. I just tried to follow some books, such as Hadoop/HBase the Definitive Guide. However, I still got some problems. What I am trying to do is just to set up a pseudo distributed HBase environment on a single node. After that, I will start my system programming in Java. I hope I could deploy the system in fully distributed mode when my system is done. So what I am configuring is very simple. Do I need to set up the zookeeper port in hbase-site.xml? Thanks so much! Best, Bing On Tue, Feb 14, 2012 at 3:16 AM, Jimmy Xiang jxi...@cloudera.comwrote: Have you restarted your HBase after the change? 
What's the zookeeper port does your HMaster use? Can you run the following to checkout where is your HMaster as below? hbase zkcli then: get /hbase/master It should show you master location. It seems you have a distributed installation. How many regionservers do you have? Can you check your master web UI to make sure all look fine. Thanks, Jimmy On Mon, Feb 13, 2012 at 10:51 AM, Bing Li lbl...@gmail.com wrote: Dear Jimmy, Thanks so much for your reply! I didn't set up the zookeeper.quorom. After getting your email, I made a change. Now my hbase-site.xml is as follows. configuration property namehbase.rootdir/name valuehdfs://localhost:9000/hbase/value /property property namedfs.replication/name value1/value /property property namehbase.cluster.distributed/name valuetrue/value /property property
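For reference, a minimal pseudo-distributed hbase-site.xml along the lines discussed in this thread might look like the sketch below. The localhost:9000 address is an assumption: it must match whatever host and port your NameNode actually listens on (check fs.default.name in your Hadoop core-site.xml first), which is exactly the mismatch behind the ConnectException above.

    <configuration>
      <!-- Must match the NameNode host:port from core-site.xml (assumed here). -->
      <property>
        <name>hbase.rootdir</name>
        <value>hdfs://localhost:9000/hbase</value>
      </property>
      <!-- Run the daemons as separate processes rather than one JVM. -->
      <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
      </property>
      <!-- Single-node setup: one copy of each HDFS block is enough. -->
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>

Verify HDFS is reachable with `hadoop fs -ls /` before restarting HBase; if that command hangs or is refused, HBase cannot start either.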
Re: Problem accessing /master-status
It may not be null actually. It is most likely because the hostname cannot be resolved to an IP address.

Thanks,
Jimmy

On Mon, Feb 6, 2012 at 10:10 AM, devrant evan.con...@gmail.com wrote:

I received the error below... does anyone know why the hostname is null?

HTTP ERROR 500

Problem accessing /master-status. Reason:

    hostname can't be null

Caused by: java.lang.IllegalArgumentException: hostname can't be null
    at java.net.InetSocketAddress.<init>(InetSocketAddress.java:121)
    at org.apache.hadoop.hbase.HServerAddress.getResolvedAddress(HServerAddress.java:108)
    at org.apache.hadoop.hbase.HServerAddress.<init>(HServerAddress.java:64)
    at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.dataToHServerAddress(RootRegionTracker.java:82)
    at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.getRootRegionLocation(RootRegionTracker.java:61)
    at org.apache.hadoop.hbase.catalog.CatalogTracker.getRootLocation(CatalogTracker.java:163)
    at org.apache.hadoop.hbase.master.MasterStatusServlet.getRootLocationOrNull(MasterStatusServlet.java:82)
    at org.apache.hadoop.hbase.master.MasterStatusServlet.doGet(MasterStatusServlet.java:59)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
    at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:829)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)

--
View this message in context: http://old.nabble.com/Problem-accessing--master-status-tp33273366p33273366.html
Sent from the HBase User mailing list archive at Nabble.com.
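As a quick sanity check along the lines of the diagnosis above, the commands below (generic tools only, nothing HBase-specific assumed) confirm whether the machine's own hostname resolves to an IP address, since "hostname can't be null" typically means name resolution failed somewhere in the cluster:

```shell
# Verify this machine's hostname resolves to an IP address.
# getent consults the same sources (files, DNS) that Java resolution uses.
getent hosts "$(hostname)" \
  || echo "hostname does not resolve; map it in /etc/hosts or DNS"
```

Run the same check on every node, using each hostname that appears in the cluster configuration, not just the local one.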
Re: Problem accessing /master-status
First, check how you configured the region servers, i.e. the host names of your region servers; they are in the file regionservers. Then check which host name is not properly configured in /etc/hosts or DNS.

On Mon, Feb 6, 2012 at 10:30 AM, devrant evan.con...@gmail.com wrote:

Thanks for the response, Jimmy. Do you know if this is an error on the server side (/etc/hosts etc.) or in the config files for HBase (i.e. conf/hbase-site.xml etc.)?

Jimmy Xiang wrote:
> It may not be null actually. It is most likely because the hostname
> cannot be resolved to an IP address.
>
> Thanks,
> Jimmy
>
> On Mon, Feb 6, 2012 at 10:10 AM, devrant evan.con...@gmail.com wrote:
> I received the error below... does anyone know why the hostname is null?
>
> HTTP ERROR 500
> Problem accessing /master-status. Reason: hostname can't be null
> Caused by: java.lang.IllegalArgumentException: hostname can't be null
> [stack trace snipped; quoted in full in the previous message]

--
View this message in context: http://old.nabble.com/Problem-accessing--master-status-tp33273366p33273484.html
Sent from the HBase User mailing list archive at Nabble.com.
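To make the cross-check concrete, here is a sketch of the two files involved. The hostname rs1.example.com and the IP address are purely illustrative: every name listed in conf/regionservers must resolve, either through /etc/hosts or DNS, on the master and on the region servers themselves.

    # conf/regionservers -- one region-server hostname per line
    rs1.example.com

    # /etc/hosts -- each listed name must map to a real, reachable IP
    192.168.1.10  rs1.example.com rs1

A common pitfall is a host whose own name maps only to 127.0.0.1 in /etc/hosts; other nodes then receive an address they cannot use, so prefer a real interface address for cluster hostnames.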