[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2018-02-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19077


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2018-02-07 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r166836868
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -46,9 +46,12 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int numWords = (int) ((size + 7) / 8);
--- End diff --

L51: `assert (alignedSize >= size);`
I think `assert (alignedSize >= size)` can already make sure `(size + 7) / 8` doesn't exceed max int: if the cast to `int` truncated, the recomputed `alignedSize` could only come out smaller than `size`, so the assertion would fail.
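As a self-contained, hypothetical illustration of that reasoning (not code from the PR; run with `java -ea` so the assertion is enabled):

```
public class AlignedSizeOverflowDemo {
  public static void main(String[] args) {
    long size = ((long) Integer.MAX_VALUE + 1) * 8;  // makes (size + 7) / 8 exceed Integer.MAX_VALUE
    int numWords = (int) ((size + 7) / 8);           // the narrowing cast truncates
    long alignedSize = numWords * 8L;
    // Truncation can only make alignedSize smaller than the requested size,
    // so the `assert (alignedSize >= size)` from the PR fires and flags the overflow.
    assert alignedSize >= size : "overflow: alignedSize=" + alignedSize + " < size=" + size;
  }
}
```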


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2018-02-07 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r166836613
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -46,9 +46,12 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int numWords = (int) ((size + 7) / 8);
+    long alignedSize = numWords * 8;
--- End diff --

Yeah, thanks.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2018-02-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r166834552
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -46,9 +46,12 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int numWords = (int) ((size + 7) / 8);
--- End diff --

Let's add a check to make sure `(size + 7) / 8` doesn't exceed max int.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2018-02-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r166834595
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -46,9 +46,12 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int numWords = (int) ((size + 7) / 8);
+    long alignedSize = numWords * 8;
--- End diff --

Use `numWords * 8L`, to avoid int overflow.
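A quick, hypothetical illustration of why the `8L` suffix matters (the word count below is made up):

```
public class WidenBeforeMultiplyDemo {
  public static void main(String[] args) {
    int numWords = 300_000_000;   // hypothetical word count, about 2.2 GiB of data
    long wrong = numWords * 8;    // int * int overflows first, then widens: -1894967296
    long right = numWords * 8L;   // one operand is long, so the multiply happens in long: 2400000000
    System.out.println(wrong + " vs " + right);
  }
}
```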


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2018-02-07 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r166832418
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -46,9 +47,10 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
--- End diff --

I agree with you.
I have updated it, thanks.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2018-02-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r166829018
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -46,9 +47,10 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
--- End diff --

I think we don't need to create a new method:
```
int numWords = (int) ((size + 7) / 8); // add some overflow check.
int alignedSize = numWords * 8;
if (shouldPool(alignedSize)) {
  ...
}
long[] array = new long[numWords];
```
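For orientation, here is a minimal sketch of that shape, folding in the overflow points raised elsewhere in the thread. It assumes the pool is keyed by the aligned size and holds weak references to the backing `long[]` arrays (so a `MemoryBlock` with the requested size is constructed either way); `shouldPool`, `bufferPoolsBySize`, `Platform.LONG_ARRAY_OFFSET` and the `MemoryBlock` constructor are taken from the surrounding code. A sketch only, not necessarily the code that was merged:

```
@Override
public MemoryBlock allocate(long size) throws OutOfMemoryError {
  int numWords = (int) ((size + 7) / 8);
  long alignedSize = numWords * 8L;   // 8L: multiply in long to avoid int overflow
  assert (alignedSize >= size);       // also catches truncation of the cast above
  if (shouldPool(alignedSize)) {
    synchronized (this) {
      final LinkedList<WeakReference<long[]>> pool = bufferPoolsBySize.get(alignedSize);
      if (pool != null) {
        while (!pool.isEmpty()) {
          final long[] array = pool.pop().get();
          if (array != null) {
            assert (array.length * 8L >= size);
            // Reuse the pooled backing array, but report the size that was actually requested.
            return new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, size);
          }
        }
        bufferPoolsBySize.remove(alignedSize);
      }
    }
  }
  long[] array = new long[numWords];
  return new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, size);
}
```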


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2018-02-07 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r166817332
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/array/ByteArrayMethods.java 
---
@@ -40,6 +40,15 @@ public static int roundNumberOfBytesToNearestWord(int numBytes) {
     }
   }
 
+  public static long roundNumberOfBytesToNearestWord(long numBytes) {
+    long remainder = numBytes & 0x07;  // This is equivalent to `numBytes % 8`
--- End diff --

Yes, it's 8 * ((numBytes + 7) / 8).


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2018-02-07 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r166815823
  
--- Diff: 
common/unsafe/src/test/java/org/apache/spark/unsafe/PlatformUtilSuite.java ---
@@ -134,4 +135,24 @@ public void memoryDebugFillEnabledInTest() {
   MemoryAllocator.MEMORY_DEBUG_FILL_CLEAN_VALUE);
 MemoryAllocator.UNSAFE.free(offheap);
   }
+
+  @Test
+  public void heapMemoryReuse() {
+    MemoryAllocator heapMem = new HeapMemoryAllocator();
+    // The size is less than 1M, allocate new memory every time.
--- End diff --

Yeah, I agree with you, thanks.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2018-02-07 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r166815274
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -20,6 +20,7 @@
 import javax.annotation.Nullable;
 
 import org.apache.spark.unsafe.Platform;
+import org.apache.spark.unsafe.array.ByteArrayMethods;
--- End diff --

OK, thanks.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2018-02-06 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r166368336
  
--- Diff: 
common/unsafe/src/test/java/org/apache/spark/unsafe/PlatformUtilSuite.java ---
@@ -134,4 +135,24 @@ public void memoryDebugFillEnabledInTest() {
   MemoryAllocator.MEMORY_DEBUG_FILL_CLEAN_VALUE);
 MemoryAllocator.UNSAFE.free(offheap);
   }
+
+  @Test
+  public void heapMemoryReuse() {
+    MemoryAllocator heapMem = new HeapMemoryAllocator();
+    // The size is less than 1M, allocate new memory every time.
+    MemoryBlock onheap1 = heapMem.allocate(513);
+    Object obj1 = onheap1.getBaseObject();
+    heapMem.free(onheap1);
+    MemoryBlock onheap2 = heapMem.allocate(514);
+    Assert.assertNotEquals(obj1, onheap2.getBaseObject());
+
+    // The size is greater than 1M, reuse the previous memory which has been released.
--- End diff --

ditto


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2018-02-06 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r166368258
  
--- Diff: 
common/unsafe/src/test/java/org/apache/spark/unsafe/PlatformUtilSuite.java ---
@@ -134,4 +135,24 @@ public void memoryDebugFillEnabledInTest() {
   MemoryAllocator.MEMORY_DEBUG_FILL_CLEAN_VALUE);
 MemoryAllocator.UNSAFE.free(offheap);
   }
+
+  @Test
+  public void heapMemoryReuse() {
+    MemoryAllocator heapMem = new HeapMemoryAllocator();
+    // The size is less than 1M, allocate new memory every time.
--- End diff --

Would it be better to refer to `HeapMemoryAllocator.POOLING_THRESHOLD_BYTES` instead of `1M`?
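For instance, the comment in the test could name the constant rather than the literal (a sketch; `POOLING_THRESHOLD_BYTES` is assumed to be 1024 * 1024 here, and whether the test can reference the field directly depends on its visibility):

```
// Sizes below HeapMemoryAllocator.POOLING_THRESHOLD_BYTES (1 MB) are not pooled,
// so each allocate() call gets a fresh backing array.
MemoryBlock onheap1 = heapMem.allocate(513);
```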


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2018-02-06 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r166300538
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/array/ByteArrayMethods.java 
---
@@ -40,6 +40,15 @@ public static int roundNumberOfBytesToNearestWord(int numBytes) {
     }
   }
 
+  public static long roundNumberOfBytesToNearestWord(long numBytes) {
+    long remainder = numBytes & 0x07;  // This is equivalent to `numBytes % 8`
--- End diff --

It's `8 * ((numBytes + 7) / 8)` but yeah, and the bit operation is 
unnecessarily complex IMHO. This is inherently a numeric operation, not 
bit-wise.
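Purely for illustration, an arithmetic version of the `long` overload could read as follows (a sketch, not necessarily what the PR ended up with):

```
/** Rounds numBytes up to the nearest multiple of the 8-byte word size. */
public static long roundNumberOfBytesToNearestWord(long numBytes) {
  return 8 * ((numBytes + 7) / 8);
}
```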


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2018-02-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r166280829
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -98,12 +100,13 @@ public void free(MemoryBlock memory) {
     long[] array = (long[]) memory.obj;
     memory.setObjAndOffset(null, 0);
 
-    if (shouldPool(size)) {
+    long alignedSize = ByteArrayMethods.roundNumberOfBytesToNearestWord(size);
--- End diff --

Do we need to do it again here? I think the `size` is guaranteed to be word-aligned.
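Elsewhere in the thread it is pointed out that `MemoryBlock.size()` keeps the requested size rather than the aligned size, so `free()` would indeed have to recompute the pool key. A minimal sketch of that shape, reusing `shouldPool`, `bufferPoolsBySize`, `memory.obj` and `setObjAndOffset` from the diff and assuming the pool stores the backing arrays:

```
@Override
public void free(MemoryBlock memory) {
  final long size = memory.size();            // the requested size; not necessarily word-aligned
  final long[] array = (long[]) memory.obj;
  memory.setObjAndOffset(null, 0);

  long alignedSize = ByteArrayMethods.roundNumberOfBytesToNearestWord(size);
  if (shouldPool(alignedSize)) {
    synchronized (this) {
      LinkedList<WeakReference<long[]>> pool = bufferPoolsBySize.get(alignedSize);
      if (pool == null) {
        pool = new LinkedList<>();
        bufferPoolsBySize.put(alignedSize, pool);
      }
      pool.add(new WeakReference<>(array));   // pool the backing array, keyed by its aligned size
    }
  }
}
```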


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2018-02-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r166280889
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -20,6 +20,7 @@
 import javax.annotation.Nullable;
 
 import org.apache.spark.unsafe.Platform;
+import org.apache.spark.unsafe.array.ByteArrayMethods;
--- End diff --

unnecessary change


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2018-02-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r166280387
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/array/ByteArrayMethods.java 
---
@@ -40,6 +40,15 @@ public static int roundNumberOfBytesToNearestWord(int numBytes) {
     }
   }
 
+  public static long roundNumberOfBytesToNearestWord(long numBytes) {
+    long remainder = numBytes & 0x07;  // This is equivalent to `numBytes % 8`
--- End diff --

is it just `return (numBytes + 7) / 8;`?


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-11-12 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r150442437
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +49,15 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
--- End diff --

The size of the memory block is not the aligned size; resetting the size is just to keep the behavior consistent with what it was before.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-11-11 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r150381892
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +49,15 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
--- End diff --

I think it's a really strange and dangerous interface. Why do we have to 
set the size? Does the memory consumer complain when he gets a memory block 
larger than he requested?


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-11-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r150240149
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/array/ByteArrayMethods.java 
---
@@ -31,8 +31,8 @@ public static long nextPowerOf2(long num) {
     return (highBit == num) ? num : highBit << 1;
   }
 
-  public static int roundNumberOfBytesToNearestWord(int numBytes) {
-    int remainder = numBytes & 0x07;  // This is equivalent to `numBytes % 8`
+  public static long roundNumberOfBytesToNearestWord(long numBytes) {
--- End diff --

I think we should create a new method instead of changing this method. The added `int` casts look scary.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-10-26 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r147179694
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -118,7 +118,8 @@ private [sql] object GenArrayData {
     } else {
       val unsafeArraySizeInBytes =
         UnsafeArrayData.calculateHeaderPortionInBytes(numElements) +
-          ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * numElements)
+          ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize *
--- End diff --

nit:
```
ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * 
numElements)
.toInt
```


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-10-12 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r144216267
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +49,15 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
+   */
+  public void resetSize(long len) {
+    assert (ByteArrayMethods.roundNumberOfBytesToNearestWord(length) ==
--- End diff --

It seems that `require` cannot be used in Java code @jiangxb1987
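Indeed, `require` is Scala's `Predef.require`, so from Java the usual equivalent is an explicit check (or an `assert` with a message). A hypothetical sketch, assuming the intended invariant is that the new length rounds to the same word-aligned capacity as the current `length` field:

```
public void resetSize(long len) {
  if (ByteArrayMethods.roundNumberOfBytesToNearestWord(len) !=
      ByteArrayMethods.roundNumberOfBytesToNearestWord(length)) {
    throw new IllegalArgumentException(
      "Cannot reset size to " + len + ": it does not match this block's word-aligned capacity");
  }
  length = len;
}
```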


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-10-11 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r144038007
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -116,9 +116,10 @@ private [sql] object GenArrayData {
         s"final ArrayData $arrayDataName = new $genericArrayClass($arrayName);",
         arrayDataName)
     } else {
+      val numBytes = elementType.defaultSize * numElements
       val unsafeArraySizeInBytes =
         UnsafeArrayData.calculateHeaderPortionInBytes(numElements) +
-          ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * numElements)
+          ByteArrayMethods.roundNumberOfBytesToNearestWord(numBytes).toInt
--- End diff --

We should really inline that.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-10-11 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r144037194
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +49,15 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
+   */
+  public void resetSize(long len) {
+    assert (ByteArrayMethods.roundNumberOfBytesToNearestWord(length) ==
--- End diff --

Also, leave a message in case the check fails.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-10-11 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r144037771
  
--- Diff: 
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeArrayWriter.java
 ---
@@ -57,7 +57,7 @@ public void initialize(BufferHolder holder, int numElements, int elementSize) {
 
     // Grows the global buffer ahead for header and fixed size data.
     int fixedPartInBytes =
-      ByteArrayMethods.roundNumberOfBytesToNearestWord(elementSize * numElements);
+      (int)ByteArrayMethods.roundNumberOfBytesToNearestWord(elementSize * numElements);
--- End diff --

nit: extra space after `(int) `, also please update the other similar 
changes.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-10-11 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r144037069
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +49,15 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
+   */
+  public void resetSize(long len) {
+    assert (ByteArrayMethods.roundNumberOfBytesToNearestWord(length) ==
--- End diff --

We'd better use `require` instead of `assert`.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-10-09 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r143439369
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -116,9 +116,10 @@ private [sql] object GenArrayData {
         s"final ArrayData $arrayDataName = new $genericArrayClass($arrayName);",
         arrayDataName)
     } else {
+      val numBytes = elementType.defaultSize * numElements
       val unsafeArraySizeInBytes =
         UnsafeArrayData.calculateHeaderPortionInBytes(numElements) +
-          ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * numElements)
+          ByteArrayMethods.roundNumberOfBytesToNearestWord(numBytes).toInt
--- End diff --

 The size of this line is larger than 200 bytes


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-10-08 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r143380706
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -116,9 +116,10 @@ private [sql] object GenArrayData {
         s"final ArrayData $arrayDataName = new $genericArrayClass($arrayName);",
         arrayDataName)
     } else {
+      val numBytes = elementType.defaultSize * numElements
       val unsafeArraySizeInBytes =
         UnsafeArrayData.calculateHeaderPortionInBytes(numElements) +
-          ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * numElements)
+          ByteArrayMethods.roundNumberOfBytesToNearestWord(numBytes).toInt
--- End diff --

Minor: why don't we inline this instead of creating a new variable?


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-09-06 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r137442821
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +48,13 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
+   */
--- End diff --

Thanks, I will add a check.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-09-06 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r137442763
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -47,23 +48,29 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    long alignedSize = ByteArrayMethods.roundNumberOfBytesToNearestWord(size);
--- End diff --

Yeah, I think it's acceptable.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-09-06 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r137441814
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +48,13 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
+   */
--- End diff --

It is dangerous to reset to an invalid size. We should add a check here or put a WARNING in the method comment.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-09-06 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r137439954
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -47,23 +48,29 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    long alignedSize = ByteArrayMethods.roundNumberOfBytesToNearestWord(size);
--- End diff --

Maybe minor, but some allocations will now be counted toward the pooling mechanism that were not before, e.g. `POOLING_THRESHOLD_BYTES` - 1.
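A concrete walk-through of that boundary case (hypothetical numbers; `POOLING_THRESHOLD_BYTES` is assumed to be 1024 * 1024):

```
long size = 1024 * 1024 - 1;                     // previously: shouldPool(size) was false
long alignedSize = ((size + 7) / 8) * 8;         // rounds up to exactly 1024 * 1024
boolean pooledNow = alignedSize >= 1024 * 1024;  // true: this allocation is now eligible for pooling
```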


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-09-06 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r137361142
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int arraySize = (int)((size + 7) / 8);
--- End diff --

Maybe we should make the method handle `long` values.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-08-31 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r136489896
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int arraySize = (int)((size + 7) / 8);
--- End diff --

But the type of the input parameter for `roundNumberOfBytesToNearestWord` is `int`.



---



[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-08-31 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r136487281
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int arraySize = (int)((size + 7) / 8);
--- End diff --

You might be able to use `ByteArrayMethods.roundNumberOfBytesToNearestWord` for this, which we've done for similar rounding elsewhere. It makes it a bit easier to spot what's happening.


---



[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-08-31 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r136332974
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int arraySize = (int)((size + 7) / 8);
+    long alignedSize = arraySize * 8;
+    if (shouldPool(alignedSize)) {
       synchronized (this) {
-        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(size);
+        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(alignedSize);
         if (pool != null) {
           while (!pool.isEmpty()) {
             final WeakReference<MemoryBlock> blockReference = pool.pop();
             final MemoryBlock memory = blockReference.get();
             if (memory != null) {
-              assert (memory.size() == size);
+              assert ((int)((memory.size() + 7) / 8) == arraySize);
+              memory.resetSize(size);
--- End diff --

I got it, thanks for the explanation.


---



[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-08-30 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r136223648
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int arraySize = (int)((size + 7) / 8);
+    long alignedSize = arraySize * 8;
+    if (shouldPool(alignedSize)) {
       synchronized (this) {
-        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(size);
+        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(alignedSize);
         if (pool != null) {
           while (!pool.isEmpty()) {
             final WeakReference<MemoryBlock> blockReference = pool.pop();
             final MemoryBlock memory = blockReference.get();
             if (memory != null) {
-              assert (memory.size() == size);
+              assert ((int)((memory.size() + 7) / 8) == arraySize);
+              memory.resetSize(size);
--- End diff --

Yes, the size of `MemoryBlock` is always the actual size; if we reuse previously released memory, we should modify the size of the `MemoryBlock`.

---



[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-08-30 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r136069800
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int arraySize = (int)((size + 7) / 8);
+    long alignedSize = arraySize * 8;
+    if (shouldPool(alignedSize)) {
       synchronized (this) {
-        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(size);
+        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(alignedSize);
         if (pool != null) {
           while (!pool.isEmpty()) {
             final WeakReference<MemoryBlock> blockReference = pool.pop();
             final MemoryBlock memory = blockReference.get();
             if (memory != null) {
-              assert (memory.size() == size);
+              assert ((int)((memory.size() + 7) / 8) == arraySize);
+              memory.resetSize(size);
--- End diff --

Hmm, from my understanding the size of `MemoryBlock` is always the actual size, not the aligned size, so it looks like we don't need to reset the size here.


---