[
https://issues.apache.org/jira/browse/HBASE-20197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16400754#comment-16400754
]
Anoop Sam John commented on HBASE-20197:
----------------------------------------
Good analysis and find on the issue with Unsafe#copy.. Ideally we will reach
this method iff we deal with DBB. So the 1st if itself will get executed.
For this, reason no need to exclude the BBUtil usage IMHO.. We can fix the
Unsafe issue by another jira. If you see the code in DBB that is also doing
the Unsafe way only I believe. From Java 8, the usage of BB APIs directly for
the copy is not that bad. (prior 1.8 it was not the case). We can avoid some
garbage creation because of dup, slice calls. What I can say is if it is just
because of the mentioned issue with Unsafe#copy, that you want the change, then
we will never meet that issue as per our usage (And we can solve that also)..
> Review of ByteBufferWriterOutputStream.java
> -------------------------------------------
>
> Key: HBASE-20197
> URL: https://issues.apache.org/jira/browse/HBASE-20197
> Project: HBase
> Issue Type: Improvement
> Components: hbase
> Affects Versions: 2.0.0
> Reporter: BELUGA BEHR
> Assignee: BELUGA BEHR
> Priority: Minor
> Attachments: HBASE-20197.1.patch, HBASE-20197.2.patch,
> HBASE-20197.3.patch, HBASE-20197.4.patch
>
>
> In looking at this class, two things caught my eye.
> # Default buffer size of 4K
> # Re-sizing of buffer on demand
>
> Java's {{BufferedOutputStream}} uses an internal buffer size of 8K on modern
> JVMs. This is due to various bench-marking that showed optimal performance
> at this level.
> The Re-sizing buffer looks a bit "unsafe":
>
> {code:java}
> public void write(ByteBuffer b, int off, int len) throws IOException {
> byte[] buf = null;
> if (len > TEMP_BUF_LENGTH) {
> buf = new byte[len];
> } else {
> if (this.tempBuf == null) {
> this.tempBuf = new byte[TEMP_BUF_LENGTH];
> }
> buf = this.tempBuf;
> }
> ...
> }
> {code}
> If this method gets one call with a 'len' of 4000, then 4001, then 4002, then
> 4003, etc. then the 'tempBuf' will be re-created many times. Also, it seems
> unsafe to create a buffer as large as the 'len' input. This could
> theoretically lead to an internal buffer of 2GB for each instance of this
> class.
> I propose:
> # Increase the default buffer size to 8K
> # Create the buffer once and chunk the output instead of loading data into a
> single array and writing it to the output stream.
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)