[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19077 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r166836868

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -46,9 +46,12 @@ private boolean shouldPool(long size) {
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int numWords = (int) ((size + 7) / 8);
--- End diff --

L51 has `assert (alignedSize >= size);`. I think `assert (alignedSize >= size)` can make sure `(size + 7) / 8` doesn't exceed max int.
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r166836613

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -46,9 +46,12 @@ private boolean shouldPool(long size) {
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int numWords = (int) ((size + 7) / 8);
+    long alignedSize = numWords * 8;
--- End diff --

Yeah, thanks.
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r166834552

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -46,9 +46,12 @@ private boolean shouldPool(long size) {
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int numWords = (int) ((size + 7) / 8);
--- End diff --

Let's add some check to make sure `(size + 7) / 8` doesn't exceed max int.
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r166834595

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -46,9 +46,12 @@ private boolean shouldPool(long size) {
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int numWords = (int) ((size + 7) / 8);
+    long alignedSize = numWords * 8;
--- End diff --

`numWords * 8L`, to avoid overflow.
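Putting the two review points above together (an int-overflow guard on `(size + 7) / 8`, and multiplying by `8L` so the product is computed in long arithmetic), a minimal, self-contained sketch of the aligned-size computation might look like this. The class and method names here are illustrative, not Spark's actual code:

```java
// Hypothetical sketch of overflow-safe word alignment; names are illustrative.
public class AlignedSizeSketch {
  static long alignToWord(long size) {
    // Guard: (size + 7) / 8 must fit in an int, since it becomes a long[] length.
    long numWordsLong = (size + 7) / 8;
    if (numWordsLong > Integer.MAX_VALUE) {
      throw new IllegalArgumentException("requested size too large: " + size);
    }
    int numWords = (int) numWordsLong;
    // Multiply by 8L (not 8) so the multiplication happens in long arithmetic.
    long alignedSize = numWords * 8L;
    // The assert discussed in the thread: the aligned size never shrinks the request.
    assert alignedSize >= size;
    return alignedSize;
  }

  public static void main(String[] args) {
    System.out.println(alignToWord(513)); // prints 520
  }
}
```

A usage note: with a plain `int` multiplication, `numWords * 8` can wrap around for very large requests, which is exactly what the `8L` suggestion avoids.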
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r166832418

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -46,9 +47,10 @@ private boolean shouldPool(long size) {
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
--- End diff --

I agree with you. I have updated, thanks.
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r166829018

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -46,9 +47,10 @@ private boolean shouldPool(long size) {
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
--- End diff --

I think we don't need to create a new method:
```
int numWords = (int) ((size + 7) / 8);
// add some overflow check.
int alignedSize = numWords * 8;
if (shouldPool(alignedSize)) {
  ...
}
long[] array = new long[numWords];
```
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r166817332

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/array/ByteArrayMethods.java ---
@@ -40,6 +40,15 @@ public static int roundNumberOfBytesToNearestWord(int numBytes) {
     }
   }
 
+  public static long roundNumberOfBytesToNearestWord(long numBytes) {
+    long remainder = numBytes & 0x07; // This is equivalent to `numBytes % 8`
--- End diff --

Yes, it's `8 * ((numBytes + 7) / 8)`.
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r166815823

--- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/PlatformUtilSuite.java ---
@@ -134,4 +135,24 @@ public void memoryDebugFillEnabledInTest() {
       MemoryAllocator.MEMORY_DEBUG_FILL_CLEAN_VALUE);
     MemoryAllocator.UNSAFE.free(offheap);
   }
+
+  @Test
+  public void heapMemoryReuse() {
+    MemoryAllocator heapMem = new HeapMemoryAllocator();
+    // The size is less than 1M, allocate new memory every time.
--- End diff --

Yeah, I agree with you, thanks.
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r166815274

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -20,6 +20,7 @@
 import javax.annotation.Nullable;
 
 import org.apache.spark.unsafe.Platform;
+import org.apache.spark.unsafe.array.ByteArrayMethods;
--- End diff --

OK, thanks.
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r166368336

--- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/PlatformUtilSuite.java ---
@@ -134,4 +135,24 @@ public void memoryDebugFillEnabledInTest() {
       MemoryAllocator.MEMORY_DEBUG_FILL_CLEAN_VALUE);
     MemoryAllocator.UNSAFE.free(offheap);
   }
+
+  @Test
+  public void heapMemoryReuse() {
+    MemoryAllocator heapMem = new HeapMemoryAllocator();
+    // The size is less than 1M, allocate new memory every time.
+    MemoryBlock onheap1 = heapMem.allocate(513);
+    Object obj1 = onheap1.getBaseObject();
+    heapMem.free(onheap1);
+    MemoryBlock onheap2 = heapMem.allocate(514);
+    Assert.assertNotEquals(obj1, onheap2.getBaseObject());
+
+    // The size is greater than 1M, reuse the previous memory which has released.
--- End diff --

ditto
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r166368258

--- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/PlatformUtilSuite.java ---
@@ -134,4 +135,24 @@ public void memoryDebugFillEnabledInTest() {
       MemoryAllocator.MEMORY_DEBUG_FILL_CLEAN_VALUE);
     MemoryAllocator.UNSAFE.free(offheap);
   }
+
+  @Test
+  public void heapMemoryReuse() {
+    MemoryAllocator heapMem = new HeapMemoryAllocator();
+    // The size is less than 1M, allocate new memory every time.
--- End diff --

Is it better to refer to `HeapMemoryAllocator.POOLING_THRESHOLD_BYTES` instead of the hard-coded `1M`?
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r166300538

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/array/ByteArrayMethods.java ---
@@ -40,6 +40,15 @@ public static int roundNumberOfBytesToNearestWord(int numBytes) {
     }
   }
 
+  public static long roundNumberOfBytesToNearestWord(long numBytes) {
+    long remainder = numBytes & 0x07; // This is equivalent to `numBytes % 8`
--- End diff --

It's `8 * ((numBytes + 7) / 8)`, but yeah, and the bit operation is unnecessarily complex IMHO. This is inherently a numeric operation, not bit-wise.
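For illustration, the two formulations being compared here, the arithmetic `8 * ((numBytes + 7) / 8)` srowen prefers and the mask-and-remainder variant from the diff, agree for every non-negative input. This hypothetical snippet (names are illustrative, not Spark's) shows both side by side:

```java
// Both methods round up to the next multiple of 8 (the word size).
public class RoundToWordSketch {
  static long roundArithmetic(long numBytes) {
    // The numeric form: reads as plain arithmetic.
    return 8 * ((numBytes + 7) / 8);
  }

  static long roundBitwise(long numBytes) {
    // The bit form: `& 0x07` is numBytes % 8 for non-negative values.
    long remainder = numBytes & 0x07;
    return remainder == 0 ? numBytes : numBytes + (8 - remainder);
  }

  public static void main(String[] args) {
    for (long n = 0; n <= 64; n++) {
      assert roundArithmetic(n) == roundBitwise(n);
    }
    System.out.println(roundArithmetic(9)); // prints 16
  }
}
```

The behavior is identical; the disagreement in the thread is purely about readability, since rounding a byte count is a numeric operation even when a mask happens to compute it.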
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r166280829

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -98,12 +100,13 @@ public void free(MemoryBlock memory) {
     long[] array = (long[]) memory.obj;
     memory.setObjAndOffset(null, 0);
-    if (shouldPool(size)) {
+    long alignedSize = ByteArrayMethods.roundNumberOfBytesToNearestWord(size);
--- End diff --

Do we need to do it again here? I think the `size` is guaranteed to be word-aligned.
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r166280889

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -20,6 +20,7 @@
 import javax.annotation.Nullable;
 
 import org.apache.spark.unsafe.Platform;
+import org.apache.spark.unsafe.array.ByteArrayMethods;
--- End diff --

Unnecessary change.
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r166280387

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/array/ByteArrayMethods.java ---
@@ -40,6 +40,15 @@ public static int roundNumberOfBytesToNearestWord(int numBytes) {
     }
   }
 
+  public static long roundNumberOfBytesToNearestWord(long numBytes) {
+    long remainder = numBytes & 0x07; // This is equivalent to `numBytes % 8`
--- End diff --

Is it just `return (numBytes + 7) / 8;`?
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r150442437

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +49,15 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
--- End diff --

The size of the memory block is not the aligned size; resetting the size is just to keep the behavior consistent with before.
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r150381892

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +49,15 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
--- End diff --

I think it's a really strange and dangerous interface. Why do we have to set the size? Does the memory consumer complain when it gets a memory block larger than it requested?
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r150240149

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/array/ByteArrayMethods.java ---
@@ -31,8 +31,8 @@ public static long nextPowerOf2(long num) {
     return (highBit == num) ? num : highBit << 1;
   }
 
-  public static int roundNumberOfBytesToNearestWord(int numBytes) {
-    int remainder = numBytes & 0x07; // This is equivalent to `numBytes % 8`
+  public static long roundNumberOfBytesToNearestWord(long numBytes) {
--- End diff --

I think we should create a new method instead of changing this method. The added int cast looks scary.
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r147179694

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala ---
@@ -118,7 +118,8 @@
   } else {
     val unsafeArraySizeInBytes =
       UnsafeArrayData.calculateHeaderPortionInBytes(numElements) +
-      ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * numElements)
+      ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize *
--- End diff --

nit:
```
ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * numElements)
  .toInt
```
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r144216267

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +49,15 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
+   */
+  public void resetSize(long len) {
+    assert (ByteArrayMethods.roundNumberOfBytesToNearestWord(length) ==
--- End diff --

It seems that `require` cannot be used in Java code @jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r144038007

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala ---
@@ -116,9 +116,10 @@
     s"final ArrayData $arrayDataName = new $genericArrayClass($arrayName);",
     arrayDataName)
   } else {
+    val numBytes = elementType.defaultSize * numElements
     val unsafeArraySizeInBytes =
       UnsafeArrayData.calculateHeaderPortionInBytes(numElements) +
-      ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * numElements)
+      ByteArrayMethods.roundNumberOfBytesToNearestWord(numBytes).toInt
--- End diff --

We should really inline that.
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r144037194

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +49,15 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
+   */
+  public void resetSize(long len) {
+    assert (ByteArrayMethods.roundNumberOfBytesToNearestWord(length) ==
--- End diff --

Also leave some message if the check fails.
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r144037771

--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeArrayWriter.java ---
@@ -57,7 +57,7 @@ public void initialize(BufferHolder holder, int numElements, int elementSize) {
     // Grows the global buffer ahead for header and fixed size data.
     int fixedPartInBytes =
-      ByteArrayMethods.roundNumberOfBytesToNearestWord(elementSize * numElements);
+      (int)ByteArrayMethods.roundNumberOfBytesToNearestWord(elementSize * numElements);
--- End diff --

nit: add a space after `(int)`; also please update the other similar changes.
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r144037069

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +49,15 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
+   */
+  public void resetSize(long len) {
+    assert (ByteArrayMethods.roundNumberOfBytesToNearestWord(length) ==
--- End diff --

We'd better use `require` instead of `assert`.
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r143439369

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala ---
@@ -116,9 +116,10 @@
     s"final ArrayData $arrayDataName = new $genericArrayClass($arrayName);",
     arrayDataName)
   } else {
+    val numBytes = elementType.defaultSize * numElements
     val unsafeArraySizeInBytes =
       UnsafeArrayData.calculateHeaderPortionInBytes(numElements) +
-      ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * numElements)
+      ByteArrayMethods.roundNumberOfBytesToNearestWord(numBytes).toInt
--- End diff --

Because inlining would make this line larger than 200 bytes.
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r143380706

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala ---
@@ -116,9 +116,10 @@
     s"final ArrayData $arrayDataName = new $genericArrayClass($arrayName);",
     arrayDataName)
   } else {
+    val numBytes = elementType.defaultSize * numElements
     val unsafeArraySizeInBytes =
       UnsafeArrayData.calculateHeaderPortionInBytes(numElements) +
-      ByteArrayMethods.roundNumberOfBytesToNearestWord(elementType.defaultSize * numElements)
+      ByteArrayMethods.roundNumberOfBytesToNearestWord(numBytes).toInt
--- End diff --

Minor: why don't we inline this instead of creating a new variable?
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r137442821

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +48,13 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
+   */
--- End diff --

Thanks, I will add a check.
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r137442763

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -47,23 +48,29 @@ private boolean shouldPool(long size) {
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    long alignedSize = ByteArrayMethods.roundNumberOfBytesToNearestWord(size);
--- End diff --

Yeah, I think it's acceptable.
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r137441814

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +48,13 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
+   */
--- End diff --

It is dangerous to reset to an invalid size. We should add a check here or put a WARNING in the method comment.
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r137439954

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -47,23 +48,29 @@ private boolean shouldPool(long size) {
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    long alignedSize = ByteArrayMethods.roundNumberOfBytesToNearestWord(size);
--- End diff --

Maybe minor, but some small allocations will now be counted for the pooling mechanism that were not before, e.g. `POOLING_THRESHOLD_BYTES` - 1.
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r137361142

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int arraySize = (int) ((size + 7) / 8);
--- End diff --

Maybe we should make the method handle long values.
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r136489896

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int arraySize = (int) ((size + 7) / 8);
--- End diff --

But the type of the input parameter for `roundNumberOfBytesToNearestWord` is `int`.

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r136487281

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int arraySize = (int) ((size + 7) / 8);
--- End diff --

You might be able to use `ByteArrayMethods.roundNumberOfBytesToNearestWord` for this, which we've done for similar rounding elsewhere. Makes it a bit easier to spot what's happening.
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r136332974

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int arraySize = (int) ((size + 7) / 8);
+    long alignedSize = arraySize * 8;
+    if (shouldPool(alignedSize)) {
       synchronized (this) {
-        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(size);
+        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(alignedSize);
         if (pool != null) {
           while (!pool.isEmpty()) {
             final WeakReference<MemoryBlock> blockReference = pool.pop();
             final MemoryBlock memory = blockReference.get();
             if (memory != null) {
-              assert (memory.size() == size);
+              assert ((int) ((memory.size() + 7) / 8) == arraySize);
+              memory.resetSize(size);
--- End diff --

I got it, thanks for the explanation.
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r136223648

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int arraySize = (int) ((size + 7) / 8);
+    long alignedSize = arraySize * 8;
+    if (shouldPool(alignedSize)) {
       synchronized (this) {
-        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(size);
+        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(alignedSize);
         if (pool != null) {
           while (!pool.isEmpty()) {
             final WeakReference<MemoryBlock> blockReference = pool.pop();
             final MemoryBlock memory = blockReference.get();
             if (memory != null) {
-              assert (memory.size() == size);
+              assert ((int) ((memory.size() + 7) / 8) == arraySize);
+              memory.resetSize(size);
--- End diff --

Yes, the size of a `MemoryBlock` is always the actual size. If we reuse the previous memory, we should modify the size of the `MemoryBlock`.
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r136069800

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -47,23 +47,29 @@ private boolean shouldPool(long size) {
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    int arraySize = (int) ((size + 7) / 8);
+    long alignedSize = arraySize * 8;
+    if (shouldPool(alignedSize)) {
       synchronized (this) {
-        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(size);
+        final LinkedList<WeakReference<MemoryBlock>> pool = bufferPoolsBySize.get(alignedSize);
         if (pool != null) {
           while (!pool.isEmpty()) {
             final WeakReference<MemoryBlock> blockReference = pool.pop();
             final MemoryBlock memory = blockReference.get();
             if (memory != null) {
-              assert (memory.size() == size);
+              assert ((int) ((memory.size() + 7) / 8) == arraySize);
+              memory.resetSize(size);
--- End diff --

Hmm, from my understanding the size of a `MemoryBlock` is always the actual size, not the aligned size, so it looks like we don't need to reset the size here.
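The thread above settles on keeping the block's recorded size as the caller's actual request while pooling by the aligned size, which is exactly why a reused block needs `resetSize`: the backing array fits any request with the same aligned size, but `size()` must report the new request. A hedged sketch of that idea follows; the class and field names are illustrative, not Spark's `MemoryBlock`:

```java
// Illustrative MemoryBlock-like holder: the backing long[] has the aligned
// capacity, while `length` records the size the caller actually requested.
public class BlockSketch {
  private final long[] array; // capacity in 8-byte words (aligned size / 8)
  private long length;        // the caller's requested size in bytes

  BlockSketch(long[] array, long length) {
    this.array = array;
    this.length = length;
  }

  public long size() {
    return length;
  }

  /** Reset the recorded size when the backing array is reused for a new request. */
  public void resetSize(long len) {
    // The new request, rounded up to a word, must fit the existing capacity.
    assert 8 * ((len + 7) / 8) <= array.length * 8L : "new size exceeds aligned capacity";
    this.length = len;
  }

  public static void main(String[] args) {
    BlockSketch block = new BlockSketch(new long[65], 513); // 65 words = 520 aligned bytes
    block.resetSize(514); // reuse the same backing array for a 514-byte request
    System.out.println(block.size()); // prints 514
  }
}
```

This matches 10110346's explanation above: pooling keys on the aligned size (520 here), so both a 513-byte and a 514-byte request can share the array, and only the recorded size changes on reuse.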