[ 
https://issues.apache.org/jira/browse/HBASE-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652667#action_12652667
 ] 

Andrew Purtell commented on HBASE-1040:
---------------------------------------

See 
https://issues.apache.org/jira/browse/HBASE-1038?focusedCommentId=12652658#action_12652658

> OOME does not cause graceful shutdown under some failure scenarios
> ------------------------------------------------------------------
>
>                 Key: HBASE-1040
>                 URL: https://issues.apache.org/jira/browse/HBASE-1040
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.18.1
>            Reporter: Andrew Purtell
>
> The OOME-related updates on trunk should probably be backported to the 0.18 
> branch. I am seeing these exceptions on our cluster in the output of 
> tablemap/tablereduce jobs:
> > java.io.IOException: java.lang.OutOfMemoryError: Java heap space
> > at java.io.DataInputStream.readFully(DataInputStream.java:175)
> > at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:64)
> > at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:102)
> > at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1933)
> > at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1833)
> > at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
> > at org.apache.hadoop.io.MapFile$Reader.next(MapFile.java:516)
> > at org.apache.hadoop.hbase.regionserver.StoreFileScanner.getNext(StoreFileScanner.java:312)
> When OOMEs like the one above occur, the cluster does not recover without 
> manual intervention. The regionservers sometimes go down afterwards, and 
> sometimes stay up in an unhealthy state for a while. Regions go offline and 
> remain unavailable.
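
The trace above shows the OutOfMemoryError arriving wrapped in an IOException, so a plain catch of OutOfMemoryError would likely miss it. The following is a minimal sketch, not the HBase fix, of how such a wrapped OOME might be detected so that a server could decide to shut down. The class name OomeCheck and the method isOutOfMemory are hypothetical and are not HBase APIs. The message check assumes the wrapper carries the OOME's string form in its message, as the first line of the trace suggests.

```java
import java.io.IOException;

public class OomeCheck {
    /**
     * Returns true if t, or anything in its cause chain, is an
     * OutOfMemoryError or carries one's string form in its message.
     * (Hypothetical helper for illustration only.)
     */
    public static boolean isOutOfMemory(Throwable t) {
        int depth = 0;
        // Bound the walk so a cyclic cause chain cannot loop forever.
        for (Throwable cur = t; cur != null && depth < 32; cur = cur.getCause(), depth++) {
            if (cur instanceof OutOfMemoryError) {
                return true;
            }
            String msg = cur.getMessage();
            // Assumption: a wrapper without a cause still embeds the class name.
            if (msg != null && msg.contains("java.lang.OutOfMemoryError")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // OOME present only as text in the message, as in the trace.
        System.out.println(isOutOfMemory(
            new IOException("java.lang.OutOfMemoryError: Java heap space")));
        // OOME attached as the cause.
        System.out.println(isOutOfMemory(
            new IOException("read failed", new OutOfMemoryError("Java heap space"))));
        // Unrelated I/O failure.
        System.out.println(isOutOfMemory(new IOException("disk full")));
    }
}
```

Run as written, main prints true, true, false. What a server does once the check returns true (abort, stop serving regions, and so on) is left open here, since that is the subject of the issue.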

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
