[
https://issues.apache.org/jira/browse/HBASE-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206026#comment-13206026
]
stack commented on HBASE-5387:
------------------------------
Any reason for hardcoding 32K for the buffer size here:
+ ((Configurable)codec).getConf().setInt("io.file.buffer.size", 32 * 1024);
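Could pull it from the conf instead, e.g. (just a sketch, assuming the writer has a Configuration handy; the key name below is made up):

  int bufSize = conf.getInt("hbase.io.compress.buffer.size", 32 * 1024);  // made-up key, 32K default
  ((Configurable) codec).getConf().setInt("io.file.buffer.size", bufSize);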
Give this a reasonable initial size?
+ compressedByteStream = new ByteArrayOutputStream();
So, we'll keep around the largest thing we ever wrote into this
ByteArrayOutputStream? Should we resize it or something from time to time? Or
I suppose we can just wait until it's actually a problem?
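Something along these lines maybe (sketch only; the initial size, the threshold, and the doneWithBlock hook are all made up):

  // Reuse the stream across blocks, but start over if a block ever makes it balloon.
  private ByteArrayOutputStream compressedByteStream =
      new ByteArrayOutputStream(64 * 1024);

  // Called after each block's compressed bytes have been written out.
  private void doneWithBlock() {
    if (compressedByteStream.size() > 4 * 1024 * 1024) {
      // Drop the oversized buffer rather than keep it around forever.
      compressedByteStream = new ByteArrayOutputStream(64 * 1024);
    } else {
      compressedByteStream.reset();
    }
  }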
Is the gzip stuff brittle? The header can be bigger than 10 bytes (the spec
allows extensions, IIRC), but I suppose it's safe because we presume Java's or
the underlying native compression.
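If we ever wanted a cheap sanity check (sketch only; headerBuf is a made-up name for the header bytes the codec wrote): per RFC 1952 the FLG byte is the 4th byte of the gzip header, and any set bit means optional fields follow the fixed 10 bytes.

  if (headerBuf[3] != 0) {
    throw new IOException("gzip header has optional fields; expected the plain 10-byte header");
  }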
Good stuff Mikhail. +1 on patch.
> Reuse compression streams in HFileBlock.Writer
> ----------------------------------------------
>
> Key: HBASE-5387
> URL: https://issues.apache.org/jira/browse/HBASE-5387
> Project: HBase
> Issue Type: Bug
> Reporter: Mikhail Bautin
> Assignee: Mikhail Bautin
> Attachments: Fix-deflater-leak-2012-02-10_18_48_45.patch
>
>
> We need to reuse compression streams in HFileBlock.Writer instead of
> allocating them every time. The motivation is that when using Java's built-in
> implementation of Gzip, we allocate a new GZIPOutputStream object and an
> associated native data structure every time we create a compression stream.
> The native data structure is only deallocated in the finalizer. This is one
> suspected cause of recent TestHFileBlock failures on Hadoop QA:
> https://builds.apache.org/job/HBase-TRUNK/2658/testReport/org.apache.hadoop.hbase.io.hfile/TestHFileBlock/testPreviousOffset_1_/.
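For context, a minimal sketch of the reuse pattern the description is getting at, using Hadoop's CodecPool (not necessarily what the patch itself does): borrow a Compressor once per writer and reuse it for every block, instead of letting each new GZIPOutputStream allocate native state that only its finalizer frees. uncompressedBytes is a made-up name.

  // Held for the lifetime of the writer:
  Compressor compressor = CodecPool.getCompressor(codec);

  // Per block:
  compressor.reset();
  CompressionOutputStream cos =
      codec.createOutputStream(compressedByteStream, compressor);
  cos.write(uncompressedBytes);
  cos.finish();

  // When the writer is closed:
  CodecPool.returnCompressor(compressor);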