Re: HRegionInfo was null or empty in Meta
The warning came from:

    try {
      // pre-fetch certain number of regions info at region cache.
      MetaScanner.metaScan(conf, this, visitor, tableName, row,
          this.prefetchRegionLimit, HConstants.META_TABLE_NAME);
    } catch (IOException e) {
      LOG.warn("Encountered problems when prefetch META table: ", e);
    }

Can you scan / write to vc2.out_link?

Cheers

On Tue, May 6, 2014 at 6:07 PM, Li Li fancye...@gmail.com wrote:

    hbase hbck vc2.out_link
    Summary:
      -ROOT- is okay.
        Number of regions: 1
        Deployed on: app-hbase-1,60020,1398226921318
      .META. is okay.
        Number of regions: 1
        Deployed on: app-hbase-4,60020,1398226920856
      vc2.out_link is okay.
        Number of regions: 9
        Deployed on: app-hbase-1,60020,1398226921318 app-hbase-2,60020,1398226921328 app-hbase-4,60020,1398226920856 app-hbase-5,60020,1398226920317
    0 inconsistencies detected.
    Status: OK

On Tue, May 6, 2014 at 9:40 PM, Ted Yu yuzhih...@gmail.com wrote:

Have you run hbck on vc2.out_link?

Cheers

On May 6, 2014, at 6:33 AM, Li Li fancye...@gmail.com wrote:

I am using 0.94.11.

    org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
    Encountered problems when prefetch META table:
    java.io.IOException: HRegionInfo was null or empty in Meta for vc2.out_link, row=vc2.out_link,,99
        at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:157)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:1062)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1124)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1004)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:961)
        at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:227)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:219)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getTable(HConnectionManager.java:671)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getTable(HConnectionManager.java:658)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getTable(HConnectionManager.java:653)
phoenix setup issue
Hi,

I am trying to set up Phoenix and test queries on HBase, but I am getting the error below. Any clue what the issue might be? I have added the core jar to the classpath on the HBase region servers by using the dynamic jar loading setting in hbase-site.xml, and I have added the Phoenix client jar on the client side. I get the same error with sqlline as well.

    ./performance.py testhost.gs.com 100
    Phoenix Performance Evaluation Script 1.0
    - Creating performance table...
    java.lang.IllegalArgumentException: Not a host:port pair: PBUF testhost.gs.com??(
        at org.apache.hadoop.hbase.util.Addressing.parseHostname(Addressing.java:60)
        at org.apache.hadoop.hbase.ServerName.<init>(ServerName.java:101)
        at org.apache.hadoop.hbase.ServerName.parseVersionedServerName(ServerName.java:283)
        at org.apache.hadoop.hbase.MasterAddressTracker.bytesToServerName(MasterAddressTracker.java:77)

Thanks,
Chandra
Re: hbase shell - how to get the key with oldest timestamp
Have you tried something like this?

    get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 1}

where ts1 is a very old date and ts2 is today. Does it give you the most recent version, or the oldest one? I have not tried it...

2014-05-13 7:59 GMT-04:00 john guthrie graf...@gmail.com:

Can't you just pick an old date - January 1, 1970 maybe?

On Tue, May 13, 2014 at 4:58 AM, Hansi Klose hansi.kl...@web.de wrote:

Hi,

because of the issue https://issues.apache.org/jira/browse/HBASE-10395 I want to start my verification job with a starttime. To verify the whole time range I need the oldest timestamp in that table. Is it possible with the HBase shell to get the key with the oldest timestamp, so that I can use this timestamp as the starttime of my verifyrep job?

Regards
Hansi
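[Editor's note: not from the thread - if the shell does not get you there, a client-side sweep is another option. A minimal Java sketch against the 0.94-era API, with a hypothetical table name, that scans every version of every cell and tracks the minimum timestamp seen, which could then be passed as the verifyrep starttime. Note this is itself a full scan, so it is only practical on smaller tables.]

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;

    public class OldestTimestamp {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "t1");   // hypothetical table name
        Scan scan = new Scan();
        scan.setMaxVersions();                   // consider every stored version
        scan.setCacheBlocks(false);              // don't pollute the block cache
        long oldest = Long.MAX_VALUE;
        ResultScanner scanner = table.getScanner(scan);
        for (Result r : scanner) {
          for (KeyValue kv : r.raw()) {          // raw() exposes each cell
            oldest = Math.min(oldest, kv.getTimestamp());
          }
        }
        scanner.close();
        table.close();
        System.out.println("Oldest timestamp: " + oldest);
      }
    }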
Re: custom filter which extends Filter directly
+hbase-user

On Tue, May 13, 2014 at 7:57 PM, Ted Yu yuzhih...@gmail.com wrote:

To be a bit more specific (Filter is an interface in 0.94): if you use 0.96+ releases and your filter extends Filter directly, I would be curious to know your use case.

Thanks

On Tue, May 6, 2014 at 11:25 AM, Ted Yu yuzhih...@gmail.com wrote:

Hi,

Filter is an abstract class. If your filter extends Filter directly, I would be curious to know your use case.

Thanks
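[Editor's note: for reference, custom filters are normally derived from FilterBase, which stubs out the Filter contract so you only override what you need. A minimal, hypothetical sketch - the class name and logic are illustrative, and the serialization plumbing a deployable filter also needs (Writable methods in 0.94, toByteArray/parseFrom in 0.96+) is omitted for brevity.]

    import org.apache.hadoop.hbase.filter.FilterBase;

    // Hypothetical example: skip any row whose key is longer than maxLen bytes.
    public class RowLengthFilter extends FilterBase {
      private final int maxLen;

      public RowLengthFilter(int maxLen) {
        this.maxLen = maxLen;
      }

      @Override
      public boolean filterRowKey(byte[] buffer, int offset, int length) {
        // Returning true tells the scanner to skip the entire row.
        return length > maxLen;
      }
    }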
MapReduce becomes much slower when upgrading from 0.94.11 to 0.96.2-hadoop1
Today I upgraded HBase 0.94.11 to 0.96.2-hadoop1. I have not changed any client code except replacing the 0.94.11 client jar with the 0.96.2 one. With the old version, the requests per second during the MapReduce task were about 10,000; with the new one, the value is 300. What's wrong with it? Plain HBase put and get are fast, with requests per second above 5,000.

My code:

    List<Scan> scans = new ArrayList<Scan>();

    Scan urldbScan = new Scan();
    urldbScan.setCaching(5000);
    urldbScan.setCacheBlocks(false);
    urldbScan.setAttribute(Scan.SCAN_ATTRIBUTES_TABLE_NAME, HbaseTools.TB_URL_DB_BT);
    urldbScan.addFamily(HbaseTools.CF_BT);
    scans.add(urldbScan);

    Scan outLinkScan = new Scan();
    outLinkScan.setCaching(5000);
    outLinkScan.setCacheBlocks(false);
    outLinkScan.setAttribute(Scan.SCAN_ATTRIBUTES_TABLE_NAME, HbaseTools.TB_OUT_LINK_BT);
    outLinkScan.addFamily(HbaseTools.CF_BT);
    scans.add(outLinkScan);

    TableMapReduceUtil.initTableMapperJob(scans, Step1Mapper.class,
        BytesWritable.class, ScheduleData.class, job);
Re: mapred/mapreduce classes in hbase-server rather than hbase-client
Hi Keegan,

Unfortunately, at the time of the module split in 0.96, we could not completely decouple the mapreduce classes from the server dependencies. I think we actually need two modules to be extracted: an hbase-mapreduce module (probably a separate module from the client module) and an hbase-storage module for the storage bits. I am sure that will happen at some point.

Enis

On Tue, May 13, 2014 at 9:09 AM, Keegan Witt keeganw...@gmail.com wrote:

Possibly this was due to HBASE-7186 or HBASE-7188. It's especially odd since I don't see usages outside the mapreduce package (at least for the classes that were of interest to me), so there shouldn't be any issue with changing the artifact the package is deployed in. Is this more a question for the dev list?

-Keegan

On Thu, May 1, 2014 at 10:59 AM, Keegan Witt keeganw...@gmail.com wrote:

It looks like, maybe as part of HBASE-4336, classes under the mapred and mapreduce packages are now deployed in the hbase-server artifact. Wouldn't it make more sense to have these deployed in hbase-client? hbase-server is a pretty big artifact to pull down just to get access to TableOutputFormat, for example. If this makes sense, I can open a Jira; I just thought I'd see if someone could explain the rationale first. Thanks!

-Keegan
Region Size == Size of Compressed Store file or Actual Size of Data in Store?
Hi All,

In one of my test clusters, I have set the region size to 1 GB and I am using Snappy compression. The combined size of the store files under that table is 50 GB, yet I see around 100 regions for that table. Assuming the compression ratio is 50%, the uncompressed data size would be 100 GB. I would like to know what the hbase.hregion.max.filesize property looks at: the actual size of the store file(s), or the actual size of the data in that region? I only created 10 pre-split regions for this table and then did bulk loading. Why am I seeing many more regions than the roughly 50 I would expect? Is there any other setting I need to look at?

--
Thanks & Regards,
Anil Gupta
Fwd: Retrieving nth qualifier in hbase using java
Can anyone please reply?

-- Forwarded message --
From: Vivekanand Ittigi vi...@biginfolabs.com
Date: Wed, May 14, 2014 at 6:21 PM
Subject: Retrieving nth qualifier in hbase using java
To: user@hbase.apache.org

Hi,

This question is a bit out of the box, but I need it. In a List (collection), we can retrieve the nth element with list.get(i). Similarly, is there any method in HBase, using the Java API, where I can get the nth qualifier given the row id and column family name?

NOTE: I have a million qualifiers in a single row in a single column family.

Thanks,
Vivek
Re: Call for Lightning Talks, Hadoop Summit HBase BoF
Just to be clear, this is not a call for vendor pitches. This is a venue for HBase users, operators, and developers to intermingle, share stories, and storm new ideas.

On Tue, May 13, 2014 at 11:40 AM, Nick Dimiduk ndimi...@gmail.com wrote:

Hi HBasers!

Subash and I are organizing the HBase Birds of a Feather (BoF) session at Hadoop Summit San Jose this year. We're looking for 4-5 brave souls willing to stand up for 15 minutes and tell the community what's working for them and what isn't. Have a story about how a particular feature saved the day? Great! Really wish something was implemented differently and have a plan for fixing it? Step up and recruit folks to provide/review patches! Either way, send me a note off-list and we'll get you queued up.

The event is on Thursday, June 5, 3:30p - 5:00p at the San Jose Convention Center, room 230C. RSVP at the meetup page [0]. Please note that this event is NOT exclusive to conference attendees. Come, come, one and all! See you at the convention center!

Nick & Subash

[0]: http://www.meetup.com/Hadoop-Summit-Community-San-Jose/events/179081342/
Hadoop-2.4.0 HBase-0.19.18 Region server startup failed
I have recently installed hadoop-2.4.0 and hbase-0.95.18, compiled with

    sudo mvn clean package assembly:assembly -DskipTests -Dhadoop.profile=2.4

using the pom.xml options below:

    <protobuf.version>2.5.0</protobuf.version>

and

    <id>hadoop-2.4</id>
    <activation>
      <property>
        <name>hadoop.profile</name>
        <value>2.4</value>
      </property>
    </activation>
    <properties>
      <hadoop.version>2.4.0</hadoop.version>
      <slf4j.version>1.7.5</slf4j.version>

Here is my hbase-site.xml:

    <configuration>
      <property>
        <name>hbase.rootdir</name>
        <value>hdfs://big7:54310/hbase</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>3</value>
      </property>
      <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
      </property>
      <property>
        <name>hbase.zookeeper.quorum</name>
        <value>big11,big1,big4</value>
      </property>
      <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value></value>
      </property>
      <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/home/hduser/hadoop/zookeeper</value>
      </property>
      <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
      </property>
    </configuration>

The HMaster and the ZooKeeper quorum start successfully, but the region servers do not. I got the error below in the master log:

    2014-05-09 16:11:59,030 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=big11,60020,1399632116991
    2014-05-09 16:11:59,038 INFO org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
    2014-05-09 16:11:59,041 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 1, slept for 100 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
    2014-05-09 16:11:59,058 ERROR org.apache.hadoop.hbase.master.HMaster: Region server \00\00big11,60020,1399632116991 reported a fatal error:
    ABORTING region server big11,60020,1399632116991: Unhandled exception: Region server startup failed
    Cause:
    java.io.IOException: Region server startup failed
        at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1279)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1136)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:758)
        at java.lang.Thread.run(Thread.java:744)
    Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
        at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
        at org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionServerInfo.getSerializedSize(HBaseProtos.java:883)
        at com.google.protobuf.AbstractMessageLite.toByteArray(AbstractMessageLite.java:62)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.createMyEphemeralNode(HRegionServer.java:1148)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1109)
        ... 2 more
    2014-05-09 16:11:59,768 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=big1,60020,1399632116879
    2014-05-09 16:11:59,777 ERROR org.apache.hadoop.hbase.master.HMaster: Region server \00\00big9,60020,1399632116855 reported a fatal error:
    ABORTING region server big9,60020,1399632116855: Unhandled exception: Region server startup failed
    Cause:
    java.io.IOException: Region server startup failed
        at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1279)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1136)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:758)
        at java.lang.Thread.run(Thread.java:744)
    Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
        at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
        at org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionServerInfo.getSerializedSize(HBaseProtos.java:883)
        at com.google.protobuf.AbstractMessageLite.toByteArray(AbstractMessageLite.java:62)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.createMyEphemeralNode(HRegionServer.java:1148)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1109)
        ... 2 more

Could someone help me resolve this issue?

Regards,
Raja
Retrieving nth qualifier in hbase using java
Hi,

This question is a bit out of the box, but I need it. In a List (collection), we can retrieve the nth element with list.get(i). Similarly, is there any method in HBase, using the Java API, where I can get the nth qualifier given the row id and column family name?

NOTE: I have a million qualifiers in a single row in a single column family.

Thanks,
Vivek
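[Editor's note: one approach, sketched below with placeholder table, row, and family names. HBase's ColumnPaginationFilter takes a (limit, offset) pair, so asking for one column at offset n returns the nth qualifier of the row without pulling the other million columns over the wire.]

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.filter.ColumnPaginationFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class NthQualifier {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");     // hypothetical names
        int n = 42;                                     // 0-based position wanted
        Get get = new Get(Bytes.toBytes("myrow"));
        get.addFamily(Bytes.toBytes("cf"));
        // limit = 1 column, offset = n: the server returns only the nth qualifier
        get.setFilter(new ColumnPaginationFilter(1, n));
        Result result = table.get(get);
        if (!result.isEmpty()) {
          byte[] qualifier = result.raw()[0].getQualifier();
          System.out.println("nth qualifier: " + Bytes.toString(qualifier));
        }
        table.close();
      }
    }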
Re: preGetOp being called on the Coprocessor when issued with delete request
Looking at RegionCoprocessorHost.java, preGetOp() is called in:

    public boolean preGet(final Get get, final List<Cell> results)

Do you have a stack trace showing that preGetOp() is called for a delete request?

Thanks

On Mon, May 12, 2014 at 9:31 PM, Vinay Kashyap vinay_kash...@ymail.com wrote:

Dear all,

I am using HBase 0.96.1.1-hadoop2 with CDH-5.0.0. I have an application where I have registered a coprocessor on my table to gather a few statistics on read/write/delete requests. I have implemented preGetOp, prePut, and preDelete accordingly, and it works as expected for read/write requests. But when I issue a delete request on the table, the coprocessor's preGetOp is called, which skews the read-request statistics. I want to understand why preGetOp is called when a delete request is issued.

Thanks and regards,
Vinay Kashyap
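[Editor's note: for context, a stripped-down version of the kind of observer described above. Class and counter names are hypothetical; the method signatures follow the 0.96 RegionObserver API.]

    import java.io.IOException;
    import java.util.List;
    import java.util.concurrent.atomic.AtomicLong;
    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.client.Delete;
    import org.apache.hadoop.hbase.client.Durability;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
    import org.apache.hadoop.hbase.coprocessor.ObserverContext;
    import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
    import org.apache.hadoop.hbase.regionserver.wal.WALEdit;

    // Hypothetical statistics coprocessor counting request types per region.
    public class StatsObserver extends BaseRegionObserver {
      private final AtomicLong gets = new AtomicLong();
      private final AtomicLong puts = new AtomicLong();
      private final AtomicLong deletes = new AtomicLong();

      @Override
      public void preGetOp(ObserverContext<RegionCoprocessorEnvironment> e,
          Get get, List<Cell> results) throws IOException {
        gets.incrementAndGet();
      }

      @Override
      public void prePut(ObserverContext<RegionCoprocessorEnvironment> e,
          Put put, WALEdit edit, Durability durability) throws IOException {
        puts.incrementAndGet();
      }

      @Override
      public void preDelete(ObserverContext<RegionCoprocessorEnvironment> e,
          Delete delete, WALEdit edit, Durability durability) throws IOException {
        deletes.incrementAndGet();
      }
    }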
Re: Hbase full scan
RegexFilter extends AbstractPatternFilter. Among the filters shipped with HBase, here are the ones which do row skipping:

ColumnPaginationFilter
ColumnPrefixFilter
ColumnRangeFilter
FuzzyRowFilter
MultipleColumnPrefixFilter

On Wed, May 14, 2014 at 8:20 AM, Mike Axiak m...@axiak.net wrote:

Just to clarify - filters can only skip rows when the filter is operating on the row keys, and even then only some filters can take advantage of this (notably, FuzzyRowFilter and RegexFilter).

Best,
Mike

On Mon, May 12, 2014 at 11:17 PM, mu yu win@gmail.com wrote:

Hi JM,

Thanks for your reply. OK, so that means when a filter, or a start row and stop row, are used, the scan skips the other rows. Thank you so much.

On Tue, May 13, 2014 at 2:15 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:

Hi Mu,

For a scan you can give a start row and a stop row. If you do so, it's only a partial scan. Also, if you add filters, rows are skipped on the server side. So you need to design your key to match your access pattern to avoid huge scans.

JM

2014-05-07 5:06 GMT-04:00 mu yu win@gmail.com:

Hi,

We deployed an HBase/Hadoop cluster for log storage. It's known that HBase has no secondary index, so I want to know whether all scans, including filter scans, are full table scans, or whether there are other kinds. For example, if I implement a row-key scan with a row-key filter, would HBase execute a full table scan? Any reply is appreciated. Thanks in advance.
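[Editor's note: to illustrate JM's point about partial scans, a minimal sketch with a hypothetical table and key prefix. Bounding the scan with a start and stop row makes the server read only the rows inside that range, whereas a filter alone is still evaluated against every row the scan covers.]

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PartialScan {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "logs");             // hypothetical table
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes("host01|20140501"));  // inclusive
        scan.setStopRow(Bytes.toBytes("host01|20140508"));   // exclusive
        ResultScanner scanner = table.getScanner(scan);
        for (Result r : scanner) {
          System.out.println(Bytes.toString(r.getRow()));
        }
        scanner.close();
        table.close();
      }
    }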
RE: How to get complete row?
My queries are always on a single column family, but multiple columns from the same family will be returned. In that case I don't think the issue below impacts our results. Let me know if I am wrong.

    scan 'test4', {STARTROW => '11645|1395288900', ENDROW => '11645|1398699000', COLUMNS => ['cf1:foo','cf1:bar'], FILTER => SingleColumnValueFilter('cf1', 'foo', =, 'binary:\xxx\')}

    scan.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("foo"));
    scan.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("bar"));

Thanks,
Chandra

-----Original Message-----
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Tuesday, May 06, 2014 5:50 PM
To: user@hbase.apache.org
Cc: user@hbase.apache.org
Subject: Re: How to get complete row?

For 0.96+, HBASE-10850 is needed for SingleColumnValueFilter to function correctly. This fix is in 0.98.2, whose RC is under vote.

Cheers

On May 6, 2014, at 2:34 AM, Bogala, Chandra Reddy chandra.bog...@gs.com wrote:

I was able to solve it by using SingleColumnValueFilter. Tx

From: Bogala, Chandra Reddy [Tech]
Sent: Tuesday, May 06, 2014 12:35 PM
To: 'user@hbase.apache.org'
Subject: How to get complete row?

Hi,

I have a requirement similar to the one in the thread posted below: I need to get the complete row after applying a value filter on a single column value. Let me know if anyone knows the solution.

http://stackoverflow.com/questions/21636787/hbase-how-to-get-complete-rows-when-scanning-with-filters-by-qualifier-value

Thanks,
Chandra
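[Editor's note: for reference, a minimal Java sketch of the SingleColumnValueFilter approach; the compared value is a placeholder. Two details matter for getting complete rows: do not restrict the scan with addColumn, or only the named columns come back, and call setFilterIfMissing(true) if rows lacking the tested column should be excluded rather than returned whole.]

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.CompareFilter;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class CompleteRowScan {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "test4");
        SingleColumnValueFilter filter = new SingleColumnValueFilter(
            Bytes.toBytes("cf1"), Bytes.toBytes("foo"),
            CompareFilter.CompareOp.EQUAL, Bytes.toBytes("someValue")); // placeholder
        filter.setFilterIfMissing(true);   // skip rows that have no cf1:foo at all
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes("11645|1395288900"));
        scan.setStopRow(Bytes.toBytes("11645|1398699000"));
        scan.setFilter(filter);            // no addColumn(): full rows come back
        ResultScanner scanner = table.getScanner(scan);
        for (Result r : scanner) {
          System.out.println(Bytes.toString(r.getRow()));
        }
        scanner.close();
        table.close();
      }
    }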
Re: Re: meta server hangs?
I found lots of these in gc.log. It seems like the CMS GC runs many times, but the old generation stays large. I'm confused. Any suggestions? Thanks.

    2014-04-29T13:40:36.081+0800: 2143586.787: [CMS-concurrent-sweep-start]
    2014-04-29T13:40:36.447+0800: 2143587.154: [GC 2143587.154: [ParNew: 471872K->52416K(471872K), 0.0587370 secs] 11893986K->11506108K(16724800K), 0.0590390 secs] [Times: user=0.00 sys=0.00, real=0.06 secs]
    2014-04-29T13:40:37.382+0800: 2143588.089: [GC 2143588.089: [ParNew: 471872K->52416K(471872K), 0.0805690 secs] 11812475K->11439145K(16724800K), 0.0807940 secs] [Times: user=0.00 sys=0.00, real=0.08 secs]
    2014-04-29T13:40:37.660+0800: 2143588.367: [CMS-concurrent-sweep: 1.435/1.579 secs] [Times: user=0.00 sys=0.00, real=1.58 secs]
    2014-04-29T13:56:39.780+0800: 2144550.486: [CMS-concurrent-sweep-start]
    2014-04-29T13:56:41.007+0800: 2144551.714: [CMS-concurrent-sweep: 1.228/1.228 secs] [Times: user=0.00 sys=0.00, real=1.23 secs]
    2014-04-29T13:56:48.231+0800: 2144558.938: [CMS-concurrent-sweep-start]
    2014-04-29T13:56:49.490+0800: 2144560.196: [CMS-concurrent-sweep: 1.258/1.258 secs] [Times: user=0.00 sys=0.00, real=1.26 secs]

-----Original Message-----
From: sunweiwei [mailto:su...@asiainfo-linkage.com]
Sent: May 6, 2014 9:27
To: user@hbase.apache.org
Subject: Re: Re: meta server hangs?

Hi Samir,

I think the master declared hadoop77/192.168.1.87:60020 a dead server because of "Failed verification of hbase:meta,,1 at address=hadoop77,60020,1396606457005, exception=java.net.SocketTimeoutException". I pasted the master log in the first mail. I'm not sure, but here is the whole sequence of events:

At 2014-04-29 13:53:57,271, a client threw a SocketTimeoutException: "Call to hadoop77/192.168.1.87:60020 failed because java.net.SocketTimeoutException: 6 millis timeout", and other clients hung.

Around 2014-04-29 15:30, I visited the HBase web UI and found the HMaster hung, so I stopped it and started a new HMaster.

At 2014-04-29 15:32:21,530, the new HMaster logged "Failed verification of hbase:meta,,1 at address=hadoop77,60020,1396606457005, exception=java.net.SocketTimeoutException: Call to hadoop77/192.168.1.87:60020 failed because java.net.SocketTimeoutException".

At 2014-04-29 15:32:28,364, the meta server received the HMaster's message and shut itself down.

After this, clients came back to normal.

-----Original Message-----
From: Samir Ahmic [mailto:ahmic.sa...@gmail.com]
Sent: May 5, 2014 19:25
To: user@hbase.apache.org
Subject: Re: Re: meta server hangs?

There should be an exception in the regionserver log on hadoop77/192.168.1.87:60020 above this one:

    2014-04-29 15:32:28,364 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server hadoop77,60020,1396606457005: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing hadoop77,60020,1396606457005 as dead server
        at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:339)

Can you find it and paste it? That exception should explain why the master declared hadoop77/192.168.1.87:60020 a dead server.

Regards,
Samir

On Mon, May 5, 2014 at 11:39 AM, sunweiwei su...@asiainfo-linkage.com wrote:

And this is the client log.

    2014-04-29 13:53:57,271 WARN [main] org.apache.hadoop.hbase.client.ScannerCallable: Ignore, probably already closed
    java.net.SocketTimeoutException: Call to hadoop77/192.168.1.87:60020 failed because java.net.SocketTimeoutException: 6 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.1.102:56473 remote=hadoop77/192.168.1.87:60020]
        at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1475)
        at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1450)
        at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1650)
        at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1708)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:27332)
        at org.apache.hadoop.hbase.client.ScannerCallable.close(ScannerCallable.java:284)
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:152)
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:57)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:116)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:94)
        at org.apache.hadoop.hbase.client.ClientScanner.close(ClientScanner.java:462)
        at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:187)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:1095)
Parallel Scan with TableMapReduceUtil
I am processing data from HBase with MapReduce. The input of my MapReduce job is a full scan of a table. When I execute a full scan with TableMapReduceUtil, is this scan executed in parallel, so that all mappers get the data in parallel, the same way as if I executed many range scans with threads?
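[Editor's note: for context, a minimal sketch of the standard single-table setup; the table, job, and mapper names are hypothetical. With TableInputFormat, which TableMapReduceUtil wires in, the framework creates one input split per region, so each mapper scans its own region's key range in parallel rather than one mapper streaming the whole table.]

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.mapreduce.Job;

    public class FullScanJob {
      // Hypothetical identity mapper: emits each row unchanged.
      public static class MyMapper extends TableMapper<ImmutableBytesWritable, Result> {
        @Override
        protected void map(ImmutableBytesWritable key, Result value, Context ctx)
            throws IOException, InterruptedException {
          ctx.write(key, value);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "full-scan");   // hypothetical job name
        job.setJarByClass(FullScanJob.class);
        Scan scan = new Scan();
        scan.setCaching(500);                   // rows fetched per RPC round trip
        scan.setCacheBlocks(false);             // recommended for MR scans
        TableMapReduceUtil.initTableMapperJob("mytable", scan, MyMapper.class,
            ImmutableBytesWritable.class, Result.class, job);
        job.waitForCompletion(true);
      }
    }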
how to optimize my cluster setting?
I have a small six-node cluster: one node runs the master and the namenode, another runs the secondary namenode, and the other four nodes are datanodes and region servers. Each node has 16 GB of memory and a 4-core CPU.

My application is very simple: I use HBase to store data for a web spider. The tables are:

1. url_db - row key MD5(url), plus other columns about the url. The average length of a row is about 1 KB.
2. out_link - row key MD5(url1)+MD5(url2), plus anchor text and other columns. The average length is also less than 1 KB.
3. in_link - row key MD5(url2)+MD5(url1).
4. Other tables with very few rows.

When a url is fetched by the fetcher, a link extractor extracts all the urls in the page, so for each url I need to insert the newly found urls into url_db, url+childurl into out_link, and childurl+url into in_link. On the read side, a few MapReduce tasks select priority urls from url_db using full table scans of url_db and out_link. The MapReduce job runs every hour and takes tens of minutes to complete.

At the beginning it was fast, but when url_db grew to tens of millions of urls it slowed down. I also found that two of the four nodes have very high load while the other two have low load: using top, the load average on two nodes is above 50 while on the other two it is below 1. I tried to split the regions and move them manually, but after some time the cluster becomes unbalanced again. I am using HBase 0.94.11 with Hadoop 1.0.0. Is HBase 0.96/0.98's balancer better for my case, or should I adjust some settings?
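[Editor's note: as an aside, region splits and balancing can be driven from the Java client as well as the shell. A minimal sketch using the 0.94-era HBaseAdmin API; the table name comes from the post above, and the balancer only runs while the balance switch is on.]

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class RebalanceSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        // Manually split a hot table (or a single region by region name).
        admin.split("url_db");
        // Enable the balancer switch, then ask the master to run a balance pass.
        admin.setBalancerRunning(true, true);
        boolean ran = admin.balancer();
        System.out.println("balancer ran: " + ran);
        admin.close();
      }
    }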
Re: Hadoop-2.4.0 HBase-0.19.18 Region server startup failed
See https://code.google.com/p/protobuf/issues/detail?id=493 for background information on the exception you got.

On the computer where you did the build, can you run the following command?

    $ protoc --version

It should print something like:

    libprotoc 2.5.0

If you see libprotoc 2.4.1, you need to upgrade to 2.5.0. The following file would be regenerated by the protoc command:

    src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java

Cheers

On Sat, May 10, 2014 at 3:36 AM, raja kbv raja...@yahoo.com wrote:

Hi Ted,

I have recently installed hadoop-2.2.0 and hbase-0.95.19, compiled using

    sudo mvn clean package assembly:single -DskipTests -Dhadoop.profile=2.0

as in the HBASE-11076 update. However, my region server startup fails with the error "UnsupportedOperationException: This is supposed to be overridden by subclasses." I have attached all files for your reference. I also read the post at http://apache-hbase.679495.n3.nabble.com/HBase-0-94-on-hadoop-2-2-0-2-4-0-td4058517.html where you mentioned rebuilding the classes for the .proto files. The command

    protoc -Isrc/main/protobuf --java_out=src/main/java

created Protobuf.java. I also tried

    mvn compile -Dcompile-protobuf -Dprotoc.path=/user/bin/protoc

which created the hbase/target directory, but I don't know how to rebuild the tarball from it. I have been stuck here for the last 5 days. Could you please help me resolve this issue?

Thank you in advance.

Regards,
Raja
Re: RPC Client OutOfMemoryError Java Heap Space
Is this an expectation problem or a legitimate concern? I have been studying the memory configurations in Cloudera Manager and I don't see where I can improve my situation.

On Thu, May 8, 2014 at 5:35 PM, Geovanie Marquez geovanie.marq...@gmail.com wrote:

Sorry, I didn't include the version. CDH5 version: CDH-5.0.0-1.cdh5.0.0.p0.47

On Thu, May 8, 2014 at 5:32 PM, Geovanie Marquez geovanie.marq...@gmail.com wrote:

Hey group,

There is one job that scans HBase contents and is really resource intensive, using all resources available to YARN (under the Resource Manager) - in my case, that is 8 GB. My expectation is that a properly configured cluster would kill the application or degrade the application's performance, but never take a region server down. This is intended to be a multi-tenant environment where developers may submit jobs at will, and I want a configuration where cluster services are never taken down this way because of memory. The simple solution here is to change the way the job consumes resources so that it is not so greedy when run. But I want to understand how I can mitigate this situation in general.

It FAILS with the following config:

- The RPC client has 30 handlers
- Write buffer of 2 MiB
- The RegionServer heap is 4 GiB
- Max size of all memstores is 0.40 of total heap
- HFile block cache size is 0.40
- Low watermark for memstore flush is 0.38
- HBase memstore size is 128 MiB

The job still FAILS with the following config: everything else the same, except the RPC client has 10 handlers.

The job still FAILS with the following config: everything else the same, except the HFile block cache size is 0.10.

When this runs I get the following error stack trace:

    # How do I avoid this via configuration?
    java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.hbase.ipc.RpcClient$Connection.readResponse(RpcClient.java:1100)
        at org.apache.hadoop.hbase.ipc.RpcClient$Connection.run(RpcClient.java:721)
    2014-05-08 16:23:54,705 WARN [IPC Client (1242056950) connection to c1d001.in.wellcentive.com/10.2.4.21:60020 from hbase] org.apache.hadoop.ipc.RpcClient: IPC Client (1242056950) connection to c1d001.in.wellcentive.com/10.2.4.21:60020 from hbase: unexpected exception receiving call responses

    # Yes, there was an RPC timeout; this is what is killing the server, because the timeout is eventually (1 minute later) reached.
    java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.hbase.ipc.RpcClient$Connection.readResponse(RpcClient.java:1100)
        at org.apache.hadoop.hbase.ipc.RpcClient$Connection.run(RpcClient.java:721)
    2014-05-08 16:23:55,319 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: recovered from org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?
        at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:384)
        at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:194)
        at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:138)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
        at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

    # Probably caused by the OOME above
    Caused by: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 5612205039322936440 number_of_rows: 1 close_scanner: false next_call_seq: 0
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3018)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:26929)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175)
        at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1879)
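[Editor's note: not from the thread - since the OutOfMemoryError is thrown in the RPC client's readResponse, one job-side mitigation is to shrink how much data each scanner RPC carries so individual responses fit comfortably in the task heap. A minimal sketch, assuming the job currently uses a high scanner caching value; the numbers are illustrative, not tuned.]

    import org.apache.hadoop.hbase.client.Scan;

    public class BoundedScanConfig {
      // Hypothetical scan setup for the greedy job: trade round trips for memory.
      public static Scan makeScan() {
        Scan scan = new Scan();
        scan.setCaching(100);        // fewer rows per scanner RPC -> smaller responses
        scan.setBatch(1000);         // cap cells per Result for very wide rows
        scan.setCacheBlocks(false);  // full scans shouldn't churn the block cache
        return scan;
      }
    }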