Re: HRegionInfo was null or empty in Meta

2014-05-15 Thread Ted Yu
The warning came from:

  try {
// pre-fetch certain number of regions info at region cache.
MetaScanner.metaScan(conf, this, visitor, tableName, row,
this.prefetchRegionLimit, HConstants.META_TABLE_NAME);
  } catch (IOException e) {
    LOG.warn("Encountered problems when prefetch META table: ", e);
  }

Can you scan / write to vc2.out_link ?

Cheers
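
A quick client-side check along those lines might look roughly like the sketch below (0.94-era API; the column family, qualifier and probe values are placeholders, not taken from the thread):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class OutLinkCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "vc2.out_link");   // table name from the thread
    // Read check: pull back a handful of rows.
    Scan scan = new Scan();
    scan.setCaching(10);
    ResultScanner scanner = table.getScanner(scan);
    int count = 0;
    for (Result r : scanner) {
      System.out.println(Bytes.toStringBinary(r.getRow()));
      if (++count >= 10) break;
    }
    scanner.close();
    // Write check: insert one probe cell ("cf"/"q"/"v" are placeholders).
    Put put = new Put(Bytes.toBytes("probe-row"));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
    table.put(put);
    table.close();
  }
}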


On Tue, May 6, 2014 at 6:07 PM, Li Li fancye...@gmail.com wrote:

 hbase hbck vc2.out_link

 Summary:
   -ROOT- is okay.
 Number of regions: 1
 Deployed on:  app-hbase-1,60020,1398226921318
   .META. is okay.
 Number of regions: 1
 Deployed on:  app-hbase-4,60020,1398226920856
   vc2.out_link is okay.
 Number of regions: 9
 Deployed on:  app-hbase-1,60020,1398226921318
 app-hbase-2,60020,1398226921328 app-hbase-4,60020,1398226920856
 app-hbase-5,60020,1398226920317
 0 inconsistencies detected.
 Status: OK

 On Tue, May 6, 2014 at 9:40 PM, Ted Yu yuzhih...@gmail.com wrote:
  Have you run hbck on vc2.out_link ?
 
  Cheers
 
  On May 6, 2014, at 6:33 AM, Li Li fancye...@gmail.com wrote:
 
  I am using 0.94.11
 
 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
  Encountered problems when prefetch META table:
  java.io.IOException: HRegionInfo was null or empty in Meta for
  vc2.out_link, row=vc2.out_link,,99
 at
 org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:157)
 at
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:1062)
 at
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1124)
 at
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1004)
 at
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:961)
 at
 org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:227)
 at org.apache.hadoop.hbase.client.HTable.&lt;init&gt;(HTable.java:219)
 at
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getTable(HConnectionManager.java:671)
 at
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getTable(HConnectionManager.java:658)
 at
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getTable(HConnectionManager.java:653)



phoenix setup issue

2014-05-15 Thread Bogala, Chandra Reddy
Hi,
   I am trying to set up Phoenix and test queries on HBase, but I am getting the below 
error. Any clue what the issue might be? I have added the core jar to the classpath on the 
HBase region servers by using the dynamic jar loading setting in 
hbase-site.xml, and also added the Phoenix client jar on the client side.
I am getting the same error with sqlline as well.

./performance.py testhost.gs.com 100
Phoenix Performance Evaluation Script 1.0
-

Creating performance table...
java.lang.IllegalArgumentException: Not a host:port pair: PBUF

testhost.gs.com??(
at 
org.apache.hadoop.hbase.util.Addressing.parseHostname(Addressing.java:60)
at org.apache.hadoop.hbase.ServerName.&lt;init&gt;(ServerName.java:101)
at 
org.apache.hadoop.hbase.ServerName.parseVersionedServerName(ServerName.java:283)
at 
org.apache.hadoop.hbase.MasterAddressTracker.bytesToServerName(MasterAddressTracker.java:77)

Thanks,
Chandra


Re: hbase shell - how to get the key with oldest timestamp

2014-05-15 Thread Jean-Marc Spaggiari
Have you tried something like this?

get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 1}

Where ts1 is a very old date and ts2 is today?

Does it give you the most recent one? Or the oldest one? I have not tried...
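
If the shell route does not pan out, a brute-force alternative is to scan from Java and keep the smallest timestamp seen. A sketch (not from the thread; the table name is a placeholder, and this is a full scan, so it can be slow on a big table):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class OldestTimestamp {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "t1");   // placeholder table name
    Scan scan = new Scan();
    scan.setMaxVersions();       // look at every stored version, not just the newest
    scan.setCacheBlocks(false);  // don't churn the block cache for a one-off job
    scan.setCaching(500);
    long oldest = Long.MAX_VALUE;
    byte[] oldestRow = null;
    ResultScanner scanner = table.getScanner(scan);
    for (Result r : scanner) {
      for (KeyValue kv : r.raw()) {
        if (kv.getTimestamp() < oldest) {
          oldest = kv.getTimestamp();
          oldestRow = kv.getRow();
        }
      }
    }
    scanner.close();
    table.close();
    System.out.println("oldest ts=" + oldest + " row=" + Bytes.toStringBinary(oldestRow));
  }
}

The timestamp it prints could then be used as the starttime for the verifyrep job.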


2014-05-13 7:59 GMT-04:00 john guthrie graf...@gmail.com:

 can't you just pick an old date - January 1, 1970 maybe?


 On Tue, May 13, 2014 at 4:58 AM, Hansi Klose hansi.kl...@web.de wrote:

  Hi, because of the issue
 
  https://issues.apache.org/jira/browse/HBASE-10395
 
  I want to start my verification job with a starttime.
 
  To verify the whole time range I need the oldest timestamp in that table.
 
  Is it possible with the hbase shell to get the key with the oldest
  timestamp?
 
  So that I can use this timestamp as the starttime in my verifyrep job.
 
  Regards Hansi
 



Re: custom filter which extends Filter directly

2014-05-15 Thread Nick Dimiduk
+hbase-user


On Tue, May 13, 2014 at 7:57 PM, Ted Yu yuzhih...@gmail.com wrote:

 To be a bit more specific (Filter is an interface in 0.94):
 If you use 0.96+ releases and your filter extends Filter directly, I would
 be curious to know your use case.

 Thanks


 On Tue, May 6, 2014 at 11:25 AM, Ted Yu yuzhih...@gmail.com wrote:

  Hi,
  Filter is an abstract class.
 
  If your filter extends Filter directly, I would be curious to know your
  use case.
 
  Thanks
 



map reduce become much slower when upgrading from 0.94.11 to 0.96.2-hadoop1

2014-05-15 Thread Li Li
Today I upgraded HBase 0.94.11 to 0.96.2-hadoop1. I have not changed
any client code except replacing the 0.94.11 client jar with the 0.96.2 one.
With the old version, the requests per second during the mapreduce task
were about 10,000. With the new one, the value is 300. What's
wrong with it?
Plain HBase put and get are still fast, with requests per second larger than 5,000.

my codes:
// Multi-table scan setup (generics restored from the original post).
List<Scan> scans = new ArrayList<Scan>();

Scan urldbScan = new Scan();
urldbScan.setCaching(5000);
urldbScan.setCacheBlocks(false);
// Tell the multi-table input format which table this Scan targets.
urldbScan.setAttribute(Scan.SCAN_ATTRIBUTES_TABLE_NAME, HbaseTools.TB_URL_DB_BT);
urldbScan.addFamily(HbaseTools.CF_BT);
scans.add(urldbScan);

Scan outLinkScan = new Scan();
outLinkScan.setCaching(5000);
outLinkScan.setCacheBlocks(false);
outLinkScan.setAttribute(Scan.SCAN_ATTRIBUTES_TABLE_NAME, HbaseTools.TB_OUT_LINK_BT);
outLinkScan.addFamily(HbaseTools.CF_BT);
scans.add(outLinkScan);

TableMapReduceUtil.initTableMapperJob(scans, Step1Mapper.class,
    BytesWritable.class, ScheduleData.class, job);


Re: mapred/mapreduce classes in hbase-server rather than hbase-client

2014-05-15 Thread Enis Söztutar
Hi Keegan,

Unfortunately, at the time of the module split in 0.96, we could not
completely decouple mapreduce classes from the server dependencies. I think
we actually need two modules to be extracted out: hbase-mapreduce
(probably a separate module from the client module) and hbase-storage for the
storage bits. I am sure that will happen at some point.

Enis


On Tue, May 13, 2014 at 9:09 AM, Keegan Witt keeganw...@gmail.com wrote:

 Possibly this was due to HBASE-7186 or HBASE-7188.  It's especially odd
 since I don't see usages outside the mapreduce package (at least for the
 classes that were of interest to me), so there shouldn't be any issue with
 changing the artifact the package is deployed in.
 Is this more a question for the dev list?

 -Keegan


 On Thu, May 1, 2014 at 10:59 AM, Keegan Witt keeganw...@gmail.com wrote:

  It looks like maybe as part of HBASE-4336, classes under the mapred and
  mapreduce package are now deployed in the hbase-server artifact.
  Wouldn't
  it make more sense to have these deployed in hbase-client?  hbase-server
 is
  a pretty big artifact to pull down to get access to TableOutputFormat,
 for
  example.
 
  If this makes sense, I can open a Jira.  I just thought I'd see if
 someone
  could explain the rationale first.  Thanks!
 
  -Keegan
 



Region Size == Size of Compressed Store file or Actual Size of Data in Store?

2014-05-15 Thread anil gupta
Hi All,

In one of my test clusters, I have set the region size to 1 GB and I am using
Snappy compression.
The combined size of the store files under that table is 50 GB, yet I see
around 100 regions for that table. I am assuming that the compression ratio
is 50%, so the uncompressed data size would be 100 GB.
I would like to know what the hbase.hregion.max.filesize property looks at:
the actual size of the store file(s) or the actual size of the data in that region?

I only created 10 pre-split regions for this table and then did
bulk loading. Why am I seeing many more regions than the roughly 50 I expected?
Is there any other setting I need to look at?

-- 
Thanks & Regards,
Anil Gupta


Fwd: Retrieving nth qualifier in hbase using java

2014-05-15 Thread Vivekanand Ittigi
Can anyone please reply?

-- Forwarded message --
From: Vivekanand Ittigi vi...@biginfolabs.com
Date: Wed, May 14, 2014 at 6:21 PM
Subject: Retrieving nth qualifier in hbase using java
To: user@hbase.apache.org


Hi,

This question is a bit out of the box, but I need it.

In a List (collection), we can retrieve the nth element by
list.get(i);

Similarly, is there any method in HBase, using the Java API, where I can get
the nth qualifier given the row id and the column family name?

NOTE: I have a million qualifiers in a single row in a single column family.
Thanks,
Vivek
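
One approach sometimes used for this kind of positional access is ColumnPaginationFilter(limit, offset), which skips the first offset columns of a row and returns the next limit. A sketch (table, row, family and n are placeholders; note that qualifiers come back in lexicographic order, so "nth" here means nth in sorted order):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.filter.ColumnPaginationFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class NthQualifier {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");        // placeholder table name
    int n = 42;                                        // 0-based position of the wanted qualifier
    Get get = new Get(Bytes.toBytes("rowid"));         // placeholder row key
    get.addFamily(Bytes.toBytes("cf"));                // placeholder column family
    get.setFilter(new ColumnPaginationFilter(1, n));   // return 1 column after skipping the first n
    Result r = table.get(get);
    for (KeyValue kv : r.raw()) {
      System.out.println(Bytes.toString(kv.getQualifier()));
    }
    table.close();
  }
}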


Re: Call for Lightning Talks, Hadoop Summit HBase BoF

2014-05-15 Thread Nick Dimiduk
Just to be clear, this is not a call for vendor pitches. This is a venue
for HBase users, operators, and developers to intermingle, share stories,
and storm new ideas.


On Tue, May 13, 2014 at 11:40 AM, Nick Dimiduk ndimi...@gmail.com wrote:

 Hi HBasers!

 Subash and I are organizing the HBase Birds of a Feather (BoF) session at
 Hadoop Summit San Jose this year. We're looking for 4-5 brave souls willing
 to standup for 15 minutes and tell the community what's working for them
 and what isn't. Have a story about how this particular feature saved the
 day? Great! Really wish something was implemented differently and have a
 plan for fixing it? Step up and recruit folks to provide/review patches!

 Either way, send me a note off-list and we'll get you queued up.

 The event is on Thursday, June 5, 3:30p - 5:00p at the San Jose Convention
 Center, room 230C. RSVP at the meetup page [0]. Please note that this event
 is NOT exclusive to conference attendees. Come, come, one and all!

 See you at the convention center!
 Nick & Subash

 [0]:
 http://www.meetup.com/Hadoop-Summit-Community-San-Jose/events/179081342/



Hadoop-2.4.0 HBase-0.19.18 Region server startup failed

2014-05-15 Thread raja kbv
I have recently installed hadoop-2.4.0 and hbase-0.95.18 compiled with
sudo mvn clean package assembly:assembly -DskipTests -Dhadoop.profile=2.4
using the below pom.xml options:

<protobuf.version>2.5.0</protobuf.version>

and

<id>hadoop-2.4</id>
<activation> <property> <name>hadoop.profile</name> <value>2.4</value> </property> </activation>
<properties>
  <hadoop.version>2.4.0</hadoop.version>
  <slf4j.version>1.7.5</slf4j.version>

Here is my hbase-site.xml info:

<configuration>
  <property> <name>hbase.rootdir</name> <value>hdfs://big7:54310/hbase</value> </property>
  <property> <name>dfs.replication</name> <value>3</value> </property>
  <property> <name>hbase.cluster.distributed</name> <value>true</value> </property>
  <property> <name>hbase.zookeeper.quorum</name> <value>big11,big1,big4</value> </property>
  <property> <name>hbase.zookeeper.property.clientPort</name> <value></value> </property>
  <property> <name>hbase.zookeeper.property.dataDir</name> <value>/home/hduser/hadoop/zookeeper</value> </property>
  <property> <name>hbase.cluster.distributed</name> <value>true</value> </property>
</configuration>
HMaster & Zookeepers are starting successfully but the regionservers are not 
starting. I got the below error in the master log.
2014-05-09 16:11:59,030 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=big11,60020,1399632116991
2014-05-09 16:11:59,038 INFO org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
2014-05-09 16:11:59,041 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 1, slept for 100 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2014-05-09 16:11:59,058 **ERROR org.apache.hadoop.hbase.master.HMaster: Region 
server \00\00big11,60020,1399632116991 reported a fatal error**: **ABORTING 
region server big11,60020,1399632116991: Unhandled exception: Region server 
startup failed Cause: java.io.IOException: Region server startup failed** at 
org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1279)
 at 
org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1136)
 at 
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:758) 
at java.lang.Thread.run(Thread.java:744) Caused by: 
java.lang.UnsupportedOperationException: **This is supposed to be overridden by 
subclasses.** at 
com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
 at 
org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionServerInfo.getSerializedSize(HBaseProtos.java:883)
 at
 
com.google.protobuf.AbstractMessageLite.toByteArray(AbstractMessageLite.java:62)
 at 
org.apache.hadoop.hbase.regionserver.HRegionServer.createMyEphemeralNode(HRegionServer.java:1148)
 at 
org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1109)
 ... 2 more
2014-05-09 16:11:59,768 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=big1,60020,1399632116879
2014-05-09 16:11:59,777 ERROR org.apache.hadoop.hbase.master.HMaster: Region 
server \00\00big9,60020,1399632116855 reported a fatal error: ABORTING region 
server big9,60020,1399632116855: Unhandled exception: Region server startup 
failed Cause: java.io.IOException: Region server startup failed at 
org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1279)
 at 
org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1136)
 at 
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:758) 
at java.lang.Thread.run(Thread.java:744) Caused by: 
java.lang.UnsupportedOperationException: This is supposed to be overridden by 
subclasses. at 
com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
 at 
org.apache.hadoop.hbase.protobuf.generated.HBaseProtos$RegionServerInfo.getSerializedSize(HBaseProtos.java:883)
 at
 
com.google.protobuf.AbstractMessageLite.toByteArray(AbstractMessageLite.java:62)
 at 
org.apache.hadoop.hbase.regionserver.HRegionServer.createMyEphemeralNode(HRegionServer.java:1148)
 at 
org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1109)
 ... 2 more 
Could someone help me to resolve this issue?

Regards,
Raja


Retrieving nth qualifier in hbase using java

2014-05-15 Thread Vivekanand Ittigi
Hi,

This question is a bit out of the box, but I need it.

In a List (collection), we can retrieve the nth element by
list.get(i);

Similarly, is there any method in HBase, using the Java API, where I can get
the nth qualifier given the row id and the column family name?

NOTE: I have a million qualifiers in a single row in a single column family.
Thanks,
Vivek


Re: preGetOp being called on the Coprocessor when issued with delete request

2014-05-15 Thread Ted Yu
Looking at RegionCoprocessorHost.java, preGetOp() is called in:
  public boolean preGet(final Get get, final List<Cell> results)

Do you have a stack trace showing that preGetOp() is called for a delete
request?

Thanks


On Mon, May 12, 2014 at 9:31 PM, Vinay Kashyap vinay_kash...@ymail.com wrote:

 Dear all,

 I am using HBase 0.96.1.1-hadoop2 with CDH-5.0.0.
 I have an application where I have registered a coprocessor to my table to
 get few statistics on the read/write/delete requests.
 I have implemented preGetOp, prePut and preDelete accordingly and it is
 working as expected in case of read/write requests.
  But when I issue a delete request on the table, the coprocessor's preGetOp is
  being called, which is skewing the read request statistics.
  I wanted to understand why preGetOp is being called when a delete
  request is issued.



 Thanks and regards
 Vinay Kashyap
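
For reference, a bare-bones sketch of the kind of observer described above (class and counter names are made up for illustration; this is not Vinay's actual code):

import java.io.IOException;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;

public class RequestStatsObserver extends BaseRegionObserver {
  private final AtomicLong reads = new AtomicLong();
  private final AtomicLong writes = new AtomicLong();
  private final AtomicLong deletes = new AtomicLong();

  @Override
  public void preGetOp(ObserverContext<RegionCoprocessorEnvironment> e,
      Get get, List<Cell> results) throws IOException {
    reads.incrementAndGet();     // counts every Get reaching this region
  }

  @Override
  public void prePut(ObserverContext<RegionCoprocessorEnvironment> e,
      Put put, WALEdit edit, Durability durability) throws IOException {
    writes.incrementAndGet();
  }

  @Override
  public void preDelete(ObserverContext<RegionCoprocessorEnvironment> e,
      Delete delete, WALEdit edit, Durability durability) throws IOException {
    deletes.incrementAndGet();
  }
}

Whether the extra preGetOp calls come from an internal read done while resolving the delete (for example when the Delete does not pin timestamps) is only a guess; the stack trace Ted asks for would settle it.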


Re: Hbase full scan

2014-05-15 Thread Ted Yu
RegexFilter extends AbstractPatternFilter.

Among the filters shipped with HBase, here're the ones which do row
skipping:

ColumnPaginationFilter
ColumnPrefixFilter
ColumnRangeFilter
FuzzyRowFilter
MultipleColumnPrefixFilter
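
To illustrate the difference in practice, here is a sketch of a range-restricted scan (table name and keys are placeholders): the start/stop rows bound which keys are read at all, while a plain filter is still evaluated against every row the scan visits unless it can supply seek hints like the filters listed above.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class BoundedScan {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "logs");              // placeholder table name
    Scan scan = new Scan();
    scan.setStartRow(Bytes.toBytes("2014-05-01"));        // inclusive lower bound on row keys
    scan.setStopRow(Bytes.toBytes("2014-05-08"));         // exclusive upper bound
    scan.setFilter(new PrefixFilter(Bytes.toBytes("2014-05-0")));  // optional extra filtering
    scan.setCaching(100);
    ResultScanner scanner = table.getScanner(scan);
    for (Result r : scanner) {
      System.out.println(Bytes.toStringBinary(r.getRow()));
    }
    scanner.close();
    table.close();
  }
}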


On Wed, May 14, 2014 at 8:20 AM, Mike Axiak m...@axiak.net wrote:

 Just to clarify - filters can only skip rows when the filter is
 operating on the row keys, and even then only some filters can take
 advantage of this. (Notably, FuzzyRowFilter and RegexFilter)

 Best,
 Mike

 On Mon, May 12, 2014 at 11:17 PM, mu yu win@gmail.com wrote:
  Hi JM,
  Thanks for your reply.
  OK, that means when a filter or start row and stop row are used, the scan
  would skip the other rows.
  Thank you so much.
 
 
  On Tue, May 13, 2014 at 2:15 AM, Jean-Marc Spaggiari 
  jean-m...@spaggiari.org wrote:
 
  Hi Mu,
 
  For a scan you can give start row and stop row. If you do so, it's only
 a
  partial scan. Also, if you add filters, rows are skipped on the server
  side.
 
   So you need to design your row key to match your access pattern and avoid huge
   scans.
 
  JM
 
 
  2014-05-07 5:06 GMT-04:00 mu yu win@gmail.com:
 
   Hi
    We deployed an HBase/Hadoop cluster for log storage. It's known that HBase has
    no index. I want to know: are all scans, including filtered scans,
    full table scans, or are there other kinds of scans?
    For example, if I implement a row-key scan with a row-key filter, would HBase
    execute a full table scan?

    Any reply is appreciated. Thanks in advance.
  
 



RE: How to get complete row?

2014-05-15 Thread Bogala, Chandra Reddy
My queries are always on a single column family, but multiple columns will be 
returned from the same family. In this case I don't think the below issue impacts our 
results. Let me know if I am wrong. 

scan 'test4', {STARTROW => '11645|1395288900', ENDROW => '11645|1398699000', 
COLUMNS => ['cf1:foo','cf1:bar'], FILTER => 
SingleColumnValueFilter('cf1','foo',=, 'binary:\xxx\')} 

scan.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("foo"));
scan.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("bar"));
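
For comparison, a rough Java equivalent of the shell scan above (the compared value is a placeholder); setFilterIfMissing(true) is the switch that drops rows which have no cf1:foo at all instead of returning them unfiltered:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class ScvfScan {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "test4");
    Scan scan = new Scan(Bytes.toBytes("11645|1395288900"), Bytes.toBytes("11645|1398699000"));
    scan.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("foo"));
    scan.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("bar"));
    SingleColumnValueFilter filter = new SingleColumnValueFilter(
        Bytes.toBytes("cf1"), Bytes.toBytes("foo"),
        CompareOp.EQUAL, Bytes.toBytes("some-value"));   // placeholder comparison value
    filter.setFilterIfMissing(true);  // skip rows that have no cf1:foo at all
    scan.setFilter(filter);
    ResultScanner scanner = table.getScanner(scan);
    for (Result r : scanner) {
      System.out.println(r);
    }
    scanner.close();
    table.close();
  }
}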

Thanks,
Chandra

-Original Message-
From: Ted Yu [mailto:yuzhih...@gmail.com] 
Sent: Tuesday, May 06, 2014 5:50 PM
To: user@hbase.apache.org
Cc: user@hbase.apache.org
Subject: Re: How to get complete row?

For 0.96+, HBASE-10850 is needed for SingleColumnValueFilter to function 
correctly. 
This fix is in 0.98.2 whose RC is under vote. 

Cheers

On May 6, 2014, at 2:34 AM, Bogala, Chandra Reddy chandra.bog...@gs.com 
wrote:

 I was able to solve it by using SingleColumnValueFilter.
 Tx
 
 From: Bogala, Chandra Reddy [Tech]
 Sent: Tuesday, May 06, 2014 12:35 PM
 To: 'user@hbase.apache.org'
 Subject: How to get complete row?
 
 Hi,
    I have a similar requirement to the thread posted below. I need to get the 
  complete row after applying a value filter on a single column value. Let me know if 
  anyone knows the solution.
 http://stackoverflow.com/questions/21636787/hbase-how-to-get-complete-rows-when-scanning-with-filters-by-qualifier-value
 
 Thanks,
 Chandra
 


Re: Re: meta server hangs?

2014-05-15 Thread sunweiwei
I found lots of these in gc.log. It seems like the CMS GC runs many times but the old 
generation is always large. 
I'm confused. Any suggestions? Thanks

2014-04-29T13:40:36.081+0800: 2143586.787: [CMS-concurrent-sweep-start]
2014-04-29T13:40:36.447+0800: 2143587.154: [GC 2143587.154: [ParNew: 471872K->52416K(471872K), 0.0587370 secs] 11893986K->11506108K(16724800K), 0.0590390 secs] [Times: user=0.00 sys=0.00, real=0.06 secs]
2014-04-29T13:40:37.382+0800: 2143588.089: [GC 2143588.089: [ParNew: 471872K->52416K(471872K), 0.0805690 secs] 11812475K->11439145K(16724800K), 0.0807940 secs] [Times: user=0.00 sys=0.00, real=0.08 secs]
2014-04-29T13:40:37.660+0800: 2143588.367: [CMS-concurrent-sweep: 1.435/1.579 secs] [Times: user=0.00 sys=0.00, real=1.58 secs]

2014-04-29T13:56:39.780+0800: 2144550.486: [CMS-concurrent-sweep-start]
2014-04-29T13:56:41.007+0800: 2144551.714: [CMS-concurrent-sweep: 1.228/1.228 secs] [Times: user=0.00 sys=0.00, real=1.23 secs]

2014-04-29T13:56:48.231+0800: 2144558.938: [CMS-concurrent-sweep-start]
2014-04-29T13:56:49.490+0800: 2144560.196: [CMS-concurrent-sweep: 1.258/1.258 secs] [Times: user=0.00 sys=0.00, real=1.26 secs]


-Original Message-
From: sunweiwei [mailto:su...@asiainfo-linkage.com] 
Sent: May 6, 2014 9:27
To: user@hbase.apache.org
Subject: Re: Re: meta server hangs?

Hi Samir,
I think the master declared hadoop77/192.168.1.87:60020 as a dead server 
because of "Failed verification of hbase:meta,,1 at 
address=hadoop77,60020,1396606457005, exception=java.net.SocketTimeoutException".
I have pasted the master log in the first mail.

I'm not sure, but here is the whole process:
At 2014-04-29 13:53:57,271 the client threw a SocketTimeoutException: Call 
to hadoop77/192.168.1.87:60020 failed because java.net.SocketTimeoutException: 
6 millis timeout, and other clients hung.
At 2014-04-29 15:30:** I visited the hbase web UI and found the hmaster hung, 
then I stopped it and started a new hmaster.
At 2014-04-29 15:32:21,530 the new hmaster logged "Failed verification of 
hbase:meta,,1 at address=hadoop77,60020,1396606457005, 
exception=java.net.SocketTimeoutException: 
  Call to hadoop77/192.168.1.87:60020 failed 
because java.net.SocketTimeoutException".
At 2014-04-29 15:32:28,364 the meta server received the hmaster's message 
and shut itself down.

After this, clients came back to normal.

-Original Message-
From: Samir Ahmic [mailto:ahmic.sa...@gmail.com] 
Sent: May 5, 2014 19:25
To: user@hbase.apache.org
Subject: Re: Re: meta server hangs?

There should be an exception in the regionserver log on hadoop77/
192.168.1.87:60020 above this one:

*
2014-04-29 15:32:28,364 FATAL [regionserver60020]
regionserver.HRegionServer: ABORTING region server
hadoop77,60020,1396606457005: org.apache.hadoop.hbase.YouAreDeadException:
Server REPORT rejected; currently processing hadoop77,60020,1396606457005
as dead server
at org.apache.hadoop.hbase.master.ServerManager.
checkIsDead(ServerManager.java:339)
*

Can you find it and paste it? That exception should explain why the
master declared hadoop77/192.168.1.87:60020 as a dead server.

Regards
Samir


On Mon, May 5, 2014 at 11:39 AM, sunweiwei su...@asiainfo-linkage.com wrote:

 And this is the client log.

 2014-04-29 13:53:57,271 WARN [main]
 org.apache.hadoop.hbase.client.ScannerCallable: Ignore, probably already
 closed
 java.net.SocketTimeoutException: Call to hadoop77/192.168.1.87:60020 failed 
 because java.net.SocketTimeoutException: 6 millis timeout while
 waiting for channel to be ready for read. ch :
 java.nio.channels.SocketChannel[connected 
 local=/192.168.1.102:56473 remote=hadoop77/
 192.168.1.87:60020]
 at
 org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1475)
 at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1450)
 at
 org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1650)
 at
 org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1708)
 at
 org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:27332)
 at
 org.apache.hadoop.hbase.client.ScannerCallable.close(ScannerCallable.java:284)
 at
 org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:152)
 at
 org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:57)
 at
 org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:116)
 at
 org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:94)
 at
 org.apache.hadoop.hbase.client.ClientScanner.close(ClientScanner.java:462)
 at
 org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:187)
 at
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:1095)
 

Parallel Scan with TableMapReduceUtil

2014-05-15 Thread Guillermo Ortiz
I am processing data from HBase with a MapReduce job. The input of my MapReduce
job is a full scan of a table.

When I execute a full scan with TableMapReduceUtil, is this scan executed
in parallel, so all mappers get the data in parallel? The same way as if I
executed many range scans with threads?


how to optimize my cluster setting?

2014-05-15 Thread Li Li
I have a small six-node cluster. One node runs the master and namenode,
another runs the secondary namenode, and the other 4 nodes are datanodes and
region servers.
Each node has 16GB memory and a 4-core CPU.

My application is very simple. I use HBase to store data for a web spider.
the table is:
1. url_db
 row key MD5(url). and there are other columns of the url. average
length of a row is about 1k
2. out_link
 row key MD5(url1)+MD5(url2). and there are anchor text and other
columns. average length is also less than 1K
3. in_link
  row key MD5(url2)+MD5(url1).
4. other tables with very few rows

When a url is fetched by the fetcher, a link extractor will extract
all the urls in this web page.
So for each url, I need to insert newly found urls into url_db,
url+childurl into out_link, and childurl+url into in_link.

As for reading, there are a few map reduce tasks to select priority
urls from url_db. They use full table scans of url_db and out_link.
Map reduce runs every hour and takes tens of minutes to complete.

At the beginning it's fast, but when url_db expands to tens of
millions of urls it slows down. And I found two of the 4 nodes have
very high load while the other two have low load. Using top I see that two
nodes' load average is larger than 50 and the other two's is less than
1.
I tried to split the regions and move them manually, but after some
time it is unbalanced again.

I am using hbase 0.94.11 with hadoop 1.0.0.
Is hbase 0.96/0.98's balancer better for me, or should I adjust some settings?


Re: Hadoop-2.4.0 HBase-0.19.18 Region server startup failed

2014-05-15 Thread Ted Yu
See https://code.google.com/p/protobuf/issues/detail?id=493 for background
information on the exception you got.

On the computer where you did the build, can you run the following command ?

$ protoc --version

It should print something like:

libprotoc 2.5.0

If you see libprotoc 2.4.1, you need to upgrade to 2.5.0
The following file would be regenerated from protoc command :

src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java

Cheers


On Sat, May 10, 2014 at 3:36 AM, raja kbv raja...@yahoo.com wrote:

 Hi Ted,

 I have recently installed hadoop-2.2.0 and hbase-0.95.19 compiled using
 sudo mvn clean package assembly:single -DskipTests -Dhadoop.profile=2.0
 as in HBASE-11076 update.

  However my region server startup fails with the error
  "UnsupportedOperationException: This is supposed to be overridden by
  subclasses.". I have attached all files for your reference.

  I also read the post at
  http://apache-hbase.679495.n3.nabble.com/HBase-0-94-on-hadoop-2-2-0-2-4-0-td4058517.html
  where you mentioned rebuilding the classes for the .proto files. The command
  protoc -Isrc/main/protobuf --java_out=src/main/java created
  Protobuf.java. I also tried mvn compile -Dcompile-protobuf
  -Dprotoc.path=/user/bin/protoc, which created the hbase/target directory, but I
  don't know how to rebuild the tarball from it.

  I have been stuck here for the last 5 days. Could you please help me resolve this
  issue? Thank you in advance.

 Regards,
 Raja




Re: RPC Client OutOfMemoryError Java Heap Space

2014-05-15 Thread Geovanie Marquez
Is this an expectation problem or a legitimate concern? I have been
studying the memory configurations in Cloudera Manager and I don't seem to
see where I can improve my situation.
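
One client-side knob that is sometimes tried when the scanning map tasks themselves run out of heap is to shrink the scan caching handed to TableMapReduceUtil, so that each scanner RPC response is smaller. A sketch only (placeholder table name, IdentityTableMapper just to keep it self-contained), not a confirmed fix for this particular job:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.IdentityTableMapper;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class LowCachingScanJob {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "low-caching full scan");
    job.setJarByClass(LowCachingScanJob.class);
    Scan scan = new Scan();
    scan.setCaching(100);        // fewer rows per scanner RPC -> smaller responses to buffer
    scan.setCacheBlocks(false);  // MR scans should not churn the region server block cache
    TableMapReduceUtil.initTableMapperJob("mytable", scan,            // "mytable" is a placeholder
        IdentityTableMapper.class, ImmutableBytesWritable.class, Result.class, job);
    job.setNumReduceTasks(0);
    job.setOutputFormatClass(NullOutputFormat.class);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}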




On Thu, May 8, 2014 at 5:35 PM, Geovanie Marquez geovanie.marq...@gmail.com
 wrote:

 sorry didn't include version

 CDH5 version - CDH-5.0.0-1.cdh5.0.0.p0.47


 On Thu, May 8, 2014 at 5:32 PM, Geovanie Marquez 
 geovanie.marq...@gmail.com wrote:

 Hey group,

  There is one job that scans HBase contents and is really resource
  intensive, using all resources available to YARN (under the Resource Manager).
  In my case, that is 8GB. My expectation here is that a properly configured
  cluster would kill the application or degrade the application's performance,
  but never ever take a region server down. This is intended to be a
  multi-tenant environment where developers may submit jobs at will, and I
  would want a configuration where the cluster services do not exit in
  this way because of memory.

  The simple solution here is to change the way the job consumes resources
  so that, when run, it is not so resource-greedy. I want to understand how I
  can mitigate this situation in general.

 **It FAILS with the following config:**
 The RPC client has 30 handlers
 write buffer of 2MiB
 The RegionServer heap is 4GiB
 Max Size of all memstores is 0.40 of total heap
 HFile Block Cache Size is 0.40
 Low watermark for memstore flush is 0.38
 HBase Memstore size is 128MiB

 **Job still FAILS with the following config:**
 Everything else the same except
 The RPC client has 10 handlers

 **Job still FAILS with the following config:**
 Everything else the same except
 HFile Block Cache Size is 0.10


 When this runs I get the following error stacktrace:
 #
  #How do I avoid this via configuration?
 #

 java.lang.OutOfMemoryError: Java heap space
  at 
 org.apache.hadoop.hbase.ipc.RpcClient$Connection.readResponse(RpcClient.java:1100)
  at 
 org.apache.hadoop.hbase.ipc.RpcClient$Connection.run(RpcClient.java:721)
 2014-05-08 16:23:54,705 WARN [IPC Client (1242056950) connection to 
 c1d001.in.wellcentive.com/10.2.4.21:60020 from hbase] 
 org.apache.hadoop.ipc.RpcClient: IPC Client (1242056950) connection to 
 c1d001.in.wellcentive.com/10.2.4.21:60020 from hbase: unexpected exception 
 receiving call responses
 #

  ### Yes, there was an RPC timeout; this is what is killing the server because 
  the timeout is eventually (1 minute later) reached.

 #

 java.lang.OutOfMemoryError: Java heap space
  at 
 org.apache.hadoop.hbase.ipc.RpcClient$Connection.readResponse(RpcClient.java:1100)
  at 
 org.apache.hadoop.hbase.ipc.RpcClient$Connection.run(RpcClient.java:721)
 2014-05-08 16:23:55,319 INFO [main] 
 org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: recovered from 
 org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of 
 OutOfOrderScannerNextException: was there a rpc timeout?
  at 
 org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:384)
  at 
 org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:194)
  at 
 org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:138)
  at 
 org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
  at 
 org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
  at 
 org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
  at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

 #

 ## Probably caused by the OOME above

 #

 Caused by: 
 org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: 
 org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected 
 nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 
 5612205039322936440 number_of_rows: 1 close_scanner: false 
 next_call_seq: 0
  at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3018)
  at 
 org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:26929)
  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175)
  at 
 org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1879)