[GitHub] spark pull request #21327: [SPARK-24107][CORE][followup] ChunkedByteBuffer.w...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21327

---
- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21327: [SPARK-24107][CORE][followup] ChunkedByteBuffer.w...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21327#discussion_r188689311

--- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala ---
@@ -63,15 +63,18 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) {
    */
   def writeFully(channel: WritableByteChannel): Unit = {
     for (bytes <- getChunks()) {
-      val curChunkLimit = bytes.limit()
+      val originalLimit = bytes.limit()
       while (bytes.hasRemaining) {
-        try {
-          val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize)
-          bytes.limit(bytes.position() + ioSize)
-          channel.write(bytes)
-        } finally {
-          bytes.limit(curChunkLimit)
-        }
+        // If `bytes` is an on-heap ByteBuffer, the JDK will copy it to a temporary direct
--- End diff --

the caching happens in the JDK library code, not some magic inside the JVM.
[GitHub] spark pull request #21327: [SPARK-24107][CORE][followup] ChunkedByteBuffer.w...
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21327#discussion_r188532109

--- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala ---
@@ -63,15 +63,18 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) {
    */
   def writeFully(channel: WritableByteChannel): Unit = {
     for (bytes <- getChunks()) {
-      val curChunkLimit = bytes.limit()
+      val originalLimit = bytes.limit()
       while (bytes.hasRemaining) {
-        try {
-          val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize)
-          bytes.limit(bytes.position() + ioSize)
-          channel.write(bytes)
-        } finally {
-          bytes.limit(curChunkLimit)
-        }
+        // If `bytes` is an on-heap ByteBuffer, the JDK will copy it to a temporary direct
+        // ByteBuffer when writing it out. The JDK caches one temporary buffer per thread, and we
--- End diff --

> The JDK caches one temporary buffer per thread

I don't think this statement is correct. According to [Util.java](http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/sun/nio/ch/Util.java#l48), the number of cached temporary direct buffers per thread is up to `IOUtil.IOV_MAX`.

> if the cached temp buffer gets created and freed frequently

The problem is that varied-sized heap buffers can force allocation of a new temporary direct buffer (and a free of the old one) whenever the requested size is larger than the cached buffer.
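The churn advancedxy describes can be illustrated with a toy model of a per-thread temporary-buffer cache. This is a minimal sketch, not the JDK's actual implementation: the class and field names are illustrative, and the real logic (which keeps several buffers, up to `IOUtil.IOV_MAX`) lives in `sun.nio.ch.Util`. The point it demonstrates is that a cached buffer can only be reused if it is large enough, so a sequence of growing write sizes defeats the cache and allocates a fresh direct buffer every time.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Illustrative model of a per-thread temporary direct buffer cache
// (simplified from the idea in sun.nio.ch.Util; names are hypothetical).
public class BufferCacheModel {
    private final ArrayDeque<ByteBuffer> cache = new ArrayDeque<>();
    int allocations = 0;

    ByteBuffer get(int size) {
        // Reuse a cached buffer only if it is large enough; otherwise the
        // old buffer is dropped (freed) and a new direct buffer is allocated.
        ByteBuffer buf = cache.poll();
        if (buf != null && buf.capacity() >= size) {
            buf.clear();
            buf.limit(size);
            return buf;
        }
        allocations++;
        return ByteBuffer.allocateDirect(size);
    }

    void release(ByteBuffer buf) {
        cache.offer(buf);
    }

    public static void main(String[] args) {
        BufferCacheModel c = new BufferCacheModel();
        // Monotonically growing write sizes: every request is larger than
        // the cached buffer, so each one allocates afresh.
        for (int size : new int[] {1 << 10, 1 << 12, 1 << 14, 1 << 16}) {
            ByteBuffer b = c.get(size);
            c.release(b);
        }
        System.out.println("allocations=" + c.allocations);
    }
}
```

Capping each write at a fixed chunk size, as the PR's loop does, keeps every request the same bounded size, so after the first allocation the cached buffer is always reusable.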
[GitHub] spark pull request #21327: [SPARK-24107][CORE][followup] ChunkedByteBuffer.w...
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/21327#discussion_r188525716

--- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala ---
@@ -63,15 +63,18 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) {
    */
   def writeFully(channel: WritableByteChannel): Unit = {
     for (bytes <- getChunks()) {
-      val curChunkLimit = bytes.limit()
+      val originalLimit = bytes.limit()
       while (bytes.hasRemaining) {
-        try {
-          val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize)
-          bytes.limit(bytes.position() + ioSize)
-          channel.write(bytes)
-        } finally {
-          bytes.limit(curChunkLimit)
-        }
+        // If `bytes` is an on-heap ByteBuffer, the JDK will copy it to a temporary direct
--- End diff --

how about `the JDK` -> `the JVM`?
[GitHub] spark pull request #21327: [SPARK-24107][CORE][followup] ChunkedByteBuffer.w...
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/21327

[SPARK-24107][CORE][followup] ChunkedByteBuffer.writeFully method has not reset the limit value

## What changes were proposed in this pull request?

According to the discussion in https://github.com/apache/spark/pull/21175, this PR proposes 2 improvements:
1. add comments to explain why we call `limit` to write out `ByteBuffer` with slices.
2. remove the `try ... finally`

## How was this patch tested?

existing tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark minor

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21327.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21327

commit cd2f0e3658964818b076e6de150f15db32f3c455
Author: Wenchen Fan
Date: 2018-05-15T04:29:56Z

    improve
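The slicing loop the PR comments describe can be sketched as a standalone Java version (the real method is the Scala one in `ChunkedByteBuffer.scala`; `WRITE_CHUNK_SIZE` here is a hypothetical stand-in for Spark's configurable `bufferWriteChunkSize`, and the class name is illustrative). Each `channel.write()` is capped by shrinking the buffer's limit, then the limit is restored before carving the next slice, with no `try ... finally`:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

public class ChunkedWrite {
    // Hypothetical stand-in for Spark's configurable `bufferWriteChunkSize`.
    static final int WRITE_CHUNK_SIZE = 64 * 1024;

    // Cap each channel.write() at WRITE_CHUNK_SIZE by shrinking the limit,
    // then restore it. With a real file or socket channel this bounds the
    // size of the temporary direct buffer the JDK uses for on-heap sources.
    static void writeFully(WritableByteChannel channel, ByteBuffer bytes) throws IOException {
        int originalLimit = bytes.limit();
        while (bytes.hasRemaining()) {
            int ioSize = Math.min(bytes.remaining(), WRITE_CHUNK_SIZE);
            bytes.limit(bytes.position() + ioSize);
            channel.write(bytes);
            bytes.limit(originalLimit); // restore before carving the next slice
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = new byte[200_000];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // An in-memory channel keeps the demo self-contained; the temporary
        // direct buffer behavior under discussion applies to file/socket channels.
        try (WritableByteChannel ch = Channels.newChannel(out)) {
            writeFully(ch, ByteBuffer.wrap(data));
        }
        System.out.println(out.size()); // total bytes written
    }
}
```

Because `channel.write` advances the buffer's position, restoring only the limit at the end of each iteration is sufficient; the next iteration re-derives the slice from the new position, which is why the `try ... finally` could be dropped.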