Hi everyone,
We would like to use HBase and Hadoop, but when we tried to load real data into our test setup, we saw a lot of crashes and did not manage to insert the amount of data we are trying to put into an HBase table.
Our goal is to have about 100 million rows in one table, with each row holding about 100 bytes of raw data (so roughly 10 GB of raw data in total).
Our test setup consists of the following servers:
3 x HP DL385 with 4 GB RAM, 2 x 2.8 GHz Opterons and a Smart Array RAID5 with a capacity of 400 GB (all used as datanodes, and one of them also as the namenode).
1 x HP DL380 with 3 GB RAM, 2 x 3.4 GHz dual-core Xeons and a Smart Array RAID5 with a capacity of 320 GB for HBase (master and regionserver).
We used Hadoop 0.16.4 with a replication level of 2 and HBase 0.1.3. HBase is configured to use 2 GB of heap space.
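(For completeness: we set the heap the usual way via HBASE_HEAPSIZE in conf/hbase-env.sh, i.e. something like:

  export HBASE_HEAPSIZE=2000

in case the way we set it matters.)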
The table was created with the following HQL statement:

create table logdata (
  logtype MAX_VERSIONS=1 COMPRESSION=BLOCK,
  banner_id MAX_VERSIONS=1,
  contentunit_id MAX_VERSIONS=1,
  campaign_id MAX_VERSIONS=1,
  network MAX_VERSIONS=1,
  geodata MAX_VERSIONS=1 COMPRESSION=BLOCK,
  client_data MAX_VERSIONS=1 COMPRESSION=BLOCK,
  profile_data MAX_VERSIONS=1 COMPRESSION=BLOCK,
  keyword MAX_VERSIONS=1 COMPRESSION=BLOCK,
  tstamp MAX_VERSIONS=1,
  time MAX_VERSIONS=1
);
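In case it helps with diagnosing: our insert client is a standalone Java program. Boiled down (class name, row keys and cell values below are placeholders, the real ones come from our log records), the write path is essentially:

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HTable;
  import org.apache.hadoop.io.Text;

  public class LogInserter {
    public static void main(String[] args) throws Exception {
      HBaseConfiguration conf = new HBaseConfiguration();
      HTable table = new HTable(conf, new Text("logdata"));
      for (long i = 0; i < 100000000L; i++) {
        // one update per log record: lock the row, put the cells, commit
        long lockid = table.startUpdate(new Text("row-" + i));
        table.put(lockid, new Text("logtype:"), "impression".getBytes());
        table.put(lockid, new Text("tstamp:"), Long.toString(i).getBytes());
        // ... the other nine columns are written the same way ...
        table.commit(lockid);
      }
    }
  }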
The problem is that the regionserver runs out of heap space and throws the following exception after inserting a few million rows (not always the same number of rows, ranging from about 3 to 10 million):
Exception in thread "org.apache.hadoop.dfs.DFSClient [EMAIL PROTECTED]" java.lang.OutOfMemoryError: Java heap space
        at java.io.DataInputStream.<init>(DataInputStream.java:42)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:186)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:578)
        at org.apache.hadoop.ipc.Client.call(Client.java:501)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:198)
        at org.apache.hadoop.dfs.$Proxy1.renewLease(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at org.apache.hadoop.dfs.$Proxy1.renewLease(Unknown Source)
        at org.apache.hadoop.dfs.DFSClient$LeaseChecker.run(DFSClient.java:596)
        at java.lang.Thread.run(Thread.java:619)
Exception in thread "ResponseProcessor for block blk_7988192980299756280" java.lang.OutOfMemoryError: Java heap space
Exception in thread "IPC Server Responder" Exception in thread "org.apache.hadoop.io.ObjectWritable Connection Culler" Exception in thread "IPC Client connection to /192.168.1.117:54310"
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
Any ideas why we keep seeing these crashes, and whether HBase should be able to handle this amount of data on the setup we use?
On a side note, we also observe that HBase seems to have a large storage overhead: when we insert about 1 GB of raw data into HBase, it uses about 8 GB of HDFS space, and that figure already takes the replication into account. Even with the replication level of 2 factored out, that is still about 4 GB per copy, i.e. roughly a 4x overhead over the raw data.
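(For what it's worth, we read those numbers off HDFS with the usual disk usage summary, something like the following; the path is the HBase root directory from our config:

  bin/hadoop dfs -dus /hbase
)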
Is this large overhead expected?
/Marcus