[GitHub] spark pull request #21327: [SPARK-24107][CORE][followup] ChunkedByteBuffer.w...

2018-05-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21327


---




[GitHub] spark pull request #21327: [SPARK-24107][CORE][followup] ChunkedByteBuffer.w...

2018-05-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21327#discussion_r188689311
  
--- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala ---
@@ -63,15 +63,18 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) {
    */
   def writeFully(channel: WritableByteChannel): Unit = {
     for (bytes <- getChunks()) {
-      val curChunkLimit = bytes.limit()
+      val originalLimit = bytes.limit()
       while (bytes.hasRemaining) {
-        try {
-          val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize)
-          bytes.limit(bytes.position() + ioSize)
-          channel.write(bytes)
-        } finally {
-          bytes.limit(curChunkLimit)
-        }
+        // If `bytes` is an on-heap ByteBuffer, the JDK will copy it to a temporary direct
--- End diff --

The caching happens in the JDK library code, not via some magic inside the JVM.
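
For what it's worth, the copy is easy to observe from user code. Below is a small, self-contained Scala sketch; the `HeapWriteCopyDemo` name and the 64 MB size are made up for illustration, and the measured growth assumes a HotSpot/OpenJDK runtime, where the copy happens inside the JDK's `sun.nio.ch.IOUtil`:

import java.io.File
import java.lang.management.{BufferPoolMXBean, ManagementFactory}
import java.nio.ByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.StandardOpenOption
import scala.collection.JavaConverters._

// Hypothetical demo object, not part of Spark.
object HeapWriteCopyDemo {
  def main(args: Array[String]): Unit = {
    // The "direct" buffer pool MXBean tracks direct ByteBuffer memory.
    val direct = ManagementFactory.getPlatformMXBeans(classOf[BufferPoolMXBean])
      .asScala.find(_.getName == "direct").get

    val file = File.createTempFile("heap-write", ".bin")
    file.deleteOnExit()
    val channel = FileChannel.open(file.toPath, StandardOpenOption.WRITE)

    val before = direct.getMemoryUsed
    // A 64 MB *heap* buffer: the channel cannot write it directly, so the
    // JDK copies it into a temporary *direct* buffer before the real write.
    channel.write(ByteBuffer.allocate(64 * 1024 * 1024))
    println(s"direct pool grew by ${direct.getMemoryUsed - before} bytes")
    channel.close()
  }
}

On such a runtime the printed growth is roughly the full 64 MB, which is exactly the allocation this PR's bounded slices avoid.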


---




[GitHub] spark pull request #21327: [SPARK-24107][CORE][followup] ChunkedByteBuffer.w...

2018-05-16 Thread advancedxy
Github user advancedxy commented on a diff in the pull request:

https://github.com/apache/spark/pull/21327#discussion_r188532109
  
--- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala ---
@@ -63,15 +63,18 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) {
    */
   def writeFully(channel: WritableByteChannel): Unit = {
     for (bytes <- getChunks()) {
-      val curChunkLimit = bytes.limit()
+      val originalLimit = bytes.limit()
       while (bytes.hasRemaining) {
-        try {
-          val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize)
-          bytes.limit(bytes.position() + ioSize)
-          channel.write(bytes)
-        } finally {
-          bytes.limit(curChunkLimit)
-        }
+        // If `bytes` is an on-heap ByteBuffer, the JDK will copy it to a temporary direct
+        // ByteBuffer when writing it out. The JDK caches one temporary buffer per thread, and we
--- End diff --

> The JDK caches one temporary buffer per thread

I don't think this statement is correct. According to [Util.java](http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/sun/nio/ch/Util.java#l48), the number of cached temporary direct buffers per thread is up to `IOUtil.IOV_MAX`.

> if the cached temp buffer gets created and freed frequently

The real problem is that heap buffers of varying sizes can force a new temporary direct buffer to be allocated (and an old cached one to be freed) whenever the requested size is larger than anything already in the cache.
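
To make the failure mode concrete, here is a simplified, runnable Scala model of that per-thread cache logic (the `TempDirectBufferCache` name and `maxEntries` parameter are stand-ins; the real logic lives in `sun.nio.ch.Util`, with the entry limit taken from `IOUtil.IOV_MAX`):

import java.nio.ByteBuffer
import scala.collection.mutable

// Simplified model of JDK 8's per-thread temporary-direct-buffer cache.
class TempDirectBufferCache(maxEntries: Int) {
  private val cache = mutable.Queue.empty[ByteBuffer]

  def get(size: Int): ByteBuffer =
    cache.dequeueFirst(_.capacity >= size) match {
      case Some(buf) =>
        // Cache hit: reuse an existing direct buffer that is big enough.
        buf.clear()
        buf.limit(size)
        buf
      case None =>
        // Cache miss: JDK 8 evicts one cached buffer (freeing it) and
        // allocates a fresh direct buffer. With steadily growing write
        // sizes this branch fires on *every* write: one free plus one
        // allocateDirect per call.
        if (cache.nonEmpty) cache.dequeue() // the real code frees it here
        ByteBuffer.allocateDirect(size)
    }

  def offer(buf: ByteBuffer): Unit =
    if (cache.size < maxEntries) cache.enqueue(buf) // else it would be freed
}

With monotonically growing request sizes every `get` misses, freeing one direct buffer and allocating a larger one. Capping each write at `bufferWriteChunkSize` keeps all requests the same bounded size, so after the first write the cached buffer always fits.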



---




[GitHub] spark pull request #21327: [SPARK-24107][CORE][followup] ChunkedByteBuffer.w...

2018-05-16 Thread advancedxy
Github user advancedxy commented on a diff in the pull request:

https://github.com/apache/spark/pull/21327#discussion_r188525716
  
--- Diff: core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala ---
@@ -63,15 +63,18 @@ private[spark] class ChunkedByteBuffer(var chunks: Array[ByteBuffer]) {
    */
   def writeFully(channel: WritableByteChannel): Unit = {
     for (bytes <- getChunks()) {
-      val curChunkLimit = bytes.limit()
+      val originalLimit = bytes.limit()
       while (bytes.hasRemaining) {
-        try {
-          val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize)
-          bytes.limit(bytes.position() + ioSize)
-          channel.write(bytes)
-        } finally {
-          bytes.limit(curChunkLimit)
-        }
+        // If `bytes` is an on-heap ByteBuffer, the JDK will copy it to a temporary direct
--- End diff --

how about `the JDK` -> `the JVM`?


---




[GitHub] spark pull request #21327: [SPARK-24107][CORE][followup] ChunkedByteBuffer.w...

2018-05-14 Thread cloud-fan
GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/21327

[SPARK-24107][CORE][followup] ChunkedByteBuffer.writeFully method has not reset the limit value

## What changes were proposed in this pull request?

According to the discussion in https://github.com/apache/spark/pull/21175, this PR proposes two improvements:
1. add comments to explain why we call `limit` to write out the `ByteBuffer` in slices.
2. remove the `try ... finally` (see the sketch below).
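
For reference, a sketch of the resulting method, reconstructed from the diff under review (the inline comment is abridged here; see the merged file for the exact wording):

def writeFully(channel: WritableByteChannel): Unit = {
  for (bytes <- getChunks()) {
    val originalLimit = bytes.limit()
    while (bytes.hasRemaining) {
      // If `bytes` is an on-heap ByteBuffer, the JDK copies it into a
      // temporary direct buffer on write; capping each slice at
      // bufferWriteChunkSize bounds the size of that temporary buffer.
      val ioSize = Math.min(bytes.remaining(), bufferWriteChunkSize)
      bytes.limit(bytes.position() + ioSize)
      channel.write(bytes)
      // Restore the limit without try/finally: if write() throws, the
      // buffer's state no longer matters.
      bytes.limit(originalLimit)
    }
  }
}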

## How was this patch tested?

existing tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark minor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21327.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21327


commit cd2f0e3658964818b076e6de150f15db32f3c455
Author: Wenchen Fan 
Date:   2018-05-15T04:29:56Z

improve




---
