[ https://issues.apache.org/jira/browse/HBASE-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970498#action_12970498 ]
stack commented on HBASE-3323:
------------------------------
I think I pulled the right patch -- the latest one -- 62820 bytes in size
(Todd, add a version to your patches?). I was trying it and got this at the
very end:
{code}
2010-12-11 18:39:00,914 INFO org.apache.hadoop.hbase.util.FSUtils: Recovering file hdfs://sv2borg180:10000/hbase/.logs/sv2borg188,60020,1291841481545/sv2borg188%3A60020.1291993339759
2010-12-11 18:39:00,915 ERROR org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Error in log splitting write thread
java.util.ConcurrentModificationException
    at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:761)
    at java.util.LinkedList$ListItr.next(LinkedList.java:696)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:669)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:649)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:621)
{code}
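The shape of that trace is the classic iterator/appender race on a LinkedList: the writer thread walks the buffered edits while another thread appends to the same list. A minimal standalone sketch of the failure mode (hypothetical code with made-up names, not the actual HLogSplitter internals):
{code}
import java.util.ConcurrentModificationException;
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch, not HBase code: reproduces the same exception as the
// trace above by iterating a LinkedList while another thread appends to it.
public class ComodificationSketch {
  public static void main(String[] args) throws InterruptedException {
    final List<String> buffer = new LinkedList<>();
    for (int i = 0; i < 100000; i++) {
      buffer.add("edit-" + i);
    }

    // Stands in for the reader side, which keeps buffering new edits.
    Thread appender = new Thread(() -> {
      for (int i = 0; i < 100000; i++) {
        buffer.add("late-edit-" + i);
      }
    });
    appender.start();

    // Stands in for the writer thread: iterating the shared list without a
    // common lock trips the iterator's (best-effort) modCount check, which
    // almost always surfaces as ConcurrentModificationException.
    try {
      long total = 0;
      for (String edit : buffer) {
        total += edit.length();
      }
      System.out.println("iterated cleanly: " + total + " chars");
    } catch (ConcurrentModificationException e) {
      System.out.println("writer thread hit: " + e);
    }
    appender.join();
  }
}
{code}
The usual cure is for the writer to swap the shared list for a fresh one under the buffer's lock and then iterate its private snapshot; I haven't checked whether that's what the latest patch does here.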
> OOME in master splitting logs
> -----------------------------
>
> Key: HBASE-3323
> URL: https://issues.apache.org/jira/browse/HBASE-3323
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.90.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Blocker
> Fix For: 0.90.0
>
> Attachments: hbase-3323.txt, hbase-3323.txt, hbase-3323.txt, sizes.png
>
>
> In testing an RS failure under a heavy increment workload, I ran into an OOME
> while the master was splitting the logs.
> In this test case, I have exactly 136 bytes per log entry in all the logs,
> and the logs are all around 66-74MB. With a batch size of 3 logs, this means
> the master is loading about 500K-600K edits per log file. Each edit ends up
> creating 3 byte[] objects, the references for which are each 8 bytes of RAM,
> so we have 160 (136+8*3) bytes per edit used by the byte[]. For each edit we
> also allocate a bunch of other objects: one HLog$Entry, one WALEdit, one
> ArrayList, one LinkedList$Entry, one HLogKey, and one KeyValue. Overall this
> works out to about 400 bytes of overhead per edit. So, with the default
> settings on this fairly average workload, the 1.5M log entries take about
> 770MB of RAM (see the back-of-the-envelope sketch after this description).
> Since I had a few log files that were a bit larger (around 90MB) it exceeded
> 1GB of RAM and I got an OOME.
> For one, the 400-bytes-per-edit overhead is pretty bad, and we could probably
> be a lot more efficient. For two, we should actually account for this memory
> rather than simply having a configurable "batch size" in the master.
> I think this is a blocker because I'm running with fairly default configs
> here, and just killing one RS made the cluster fall over due to a master OOME.
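Sanity-checking the arithmetic in the description with its own figures (a hypothetical back-of-the-envelope class, not HBase code; the 400-byte object overhead is Todd's estimate, not a measurement):
{code}
// Hypothetical sketch: multiplies out the per-edit figures quoted above.
public class EditMemoryEstimate {
  public static void main(String[] args) {
    long payload = 136;     // bytes of log data per entry
    long refs = 3 * 8;      // three byte[] references at 8 bytes each
    long overhead = 400;    // HLog$Entry, WALEdit, ArrayList,
                            // LinkedList$Entry, HLogKey, KeyValue
    long perEdit = payload + refs + overhead;   // 560 bytes per edit
    long edits = 1500000;   // 3-log batch at ~500-600K edits per log

    System.out.printf("~%d bytes/edit -> ~%d MB for %d edits%n",
        perEdit, edits * perEdit / (1024 * 1024), edits);
    // Prints roughly 800MB, the same ballpark as the ~770MB above; a few
    // larger (~90MB) logs in the batch are enough to blow a 1GB heap.
  }
}
{code}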