Github user Ngone51 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19285#discussion_r162848405
--- Diff:
core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -162,26 +162,29 @@ private[spark] class MemoryStore(
}
/**
- * Attempt to put the given block in memory store as values.
+ * Attempt to put the given block in memory store as values or bytes.
*
* It's possible that the iterator is too large to materialize and store
in memory. To avoid
* OOM exceptions, this method will gradually unroll the iterator while
periodically checking
* whether there is enough free memory. If the block is successfully
materialized, then the
* temporary unroll memory used during the materialization is
"transferred" to storage memory,
* so we won't acquire more memory than is actually needed to store the
block.
*
- * @return in case of success, the estimated size of the stored data. In
case of failure, return
- * an iterator containing the values of the block. The returned
iterator will be backed
- * by the combination of the partially-unrolled block and the
remaining elements of the
- * original input iterator. The caller must either fully consume
this iterator or call
- * `close()` on it in order to free the storage memory consumed
by the partially-unrolled
- * block.
+ * @param blockId The block id.
+ * @param values The values which need be stored.
+ * @param classTag the [[ClassTag]] for the block.
+ * @param memoryMode The values saved mode.
+ * @param valuesHolder A holder that supports storing record of values
into memory store as
+ * values or bytes.
+ * @return if the block is stored successfully, return the stored data
size. Else return the
+ * memory has used for unroll the block.
--- End diff --
First, I think you will do not disagree with that there's
partially-unrolled case exists in failure situation.
Second,
>The block can be unrolled fully, but the used memory exceeded the request
and can't request the extra memory.
Yeah, I know. But what I want to say is block unrolled fully doesn't mean
we have reserved unroll memory for all values(this only happens when the last
element in iterator % memoryCheckPeriod == 0), because of `memoryCheckPeriod`.
And here, we talk about `the memory has used for unroll the block`. So, it is
not accurately to say 'block be unrolled fully, so the used memory is for all
the values'.
So, mostly, it would be `partially-unrolled`. WDYT?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]