Hi

Broadcast variables are definitely stored in storage memory, the region
sized by spark.memory.storageFraction.

1. If you look at the writeBlocks method in TorrentBroadcast.scala and follow
the calls through BlockManager into MemoryStore, you can see that
deserialization of the variable happens in unroll memory, which is then
transferred to storage memory:

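memoryManager.synchronized {
  // Release the unroll reservation, then re-acquire the same amount as
  // storage memory, all under one lock.
  releaseUnrollMemoryForThisTask(MemoryMode.ON_HEAP, amount)
  val success = memoryManager.acquireStorageMemory(blockId, amount, MemoryMode.ON_HEAP)
}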


So broadcast variables are definitely stored in the storage memory governed
by spark.memory.storageFraction.
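
To make the accounting in that snippet concrete, here is a toy sketch of the
same idea. It is my own simplification, not Spark's MemoryStore: the class
ToyStoragePool and all of its methods are hypothetical names. It only shows
that the bytes reserved while unrolling end up counted as storage memory once
the transfer happens under a single lock.

```scala
// Toy model only: NOT Spark's MemoryStore or MemoryManager.
// One pool of `capacity` bytes with two counters sharing that capacity.
final class ToyStoragePool(val capacity: Long) {
  private var unrollUsed = 0L
  private var storageUsed = 0L

  // Reserve bytes for unrolling (deserializing) a block; fails if the pool is full.
  def reserveUnroll(amount: Long): Boolean = synchronized {
    if (unrollUsed + storageUsed + amount <= capacity) {
      unrollUsed += amount
      true
    } else false
  }

  // Move already-reserved unroll bytes into the storage counter atomically.
  def transferUnrollToStorage(amount: Long): Unit = synchronized {
    require(amount <= unrollUsed, "cannot transfer more than was reserved")
    unrollUsed -= amount
    storageUsed += amount
  }

  def unroll: Long = synchronized(unrollUsed)
  def storage: Long = synchronized(storageUsed)
}

object ToyStoragePoolDemo {
  def main(args: Array[String]): Unit = {
    val pool = new ToyStoragePool(capacity = 100L)
    pool.reserveUnroll(40L)           // block is being deserialized
    pool.transferUnrollToStorage(40L) // block is now a stored block
    println(s"unroll=${pool.unroll} storage=${pool.storage}")
  }
}
```

In this sketch the total never changes during the transfer; only the label on
the bytes does, which is why the broadcast ends up in the storage numbers.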


Can you explain how you are seeing a smaller amount of memory used for
broadcast variables on a given executor in the UI?
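
When comparing against the UI, it may help to estimate how large the storage
region should be for your settings. Below is a rough sketch. It assumes the
unified memory manager (Spark 1.6 and later), a 300 MiB reserved amount, and
the 2.x defaults of 0.6 for spark.memory.fraction and 0.5 for
spark.memory.storageFraction as I recall them; please check these against
your version. The object and method names are hypothetical, and the real
heap size that Spark sees is usually a bit below -Xmx.

```scala
// Rough estimate only; names are hypothetical, constants are assumptions
// about the unified memory manager (verify against your Spark version).
object StorageRegionEstimate {
  // Assumed reserved system memory: 300 MiB.
  val ReservedBytes: Long = 300L * 1024 * 1024

  // Memory shared by execution and storage: (heap - reserved) * spark.memory.fraction
  def unifiedBytes(heapBytes: Long, memoryFraction: Double): Long =
    ((heapBytes - ReservedBytes) * memoryFraction).toLong

  // Storage region inside the unified memory: unified * spark.memory.storageFraction
  def storageRegionBytes(heapBytes: Long,
                         memoryFraction: Double,
                         storageFraction: Double): Long =
    (unifiedBytes(heapBytes, memoryFraction) * storageFraction).toLong

  def main(args: Array[String]): Unit = {
    val mib = 1024L * 1024
    val heap = 4L * 1024 * mib // example: a 4 GiB executor heap
    val unified = unifiedBytes(heap, 0.6)
    val storage = storageRegionBytes(heap, 0.6, 0.5)
    println(s"unified=${unified / mib} MiB, storage region=${storage / mib} MiB")
  }
}
```

If the total size of your broadcasts is well above the estimate you get this
way, that would be worth mentioning along with the numbers the UI shows.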

Regards
Pralabh Kumar

On Thu, Jun 22, 2017 at 4:39 AM, Bryan Jeffrey <bryan.jeff...@gmail.com>
wrote:

> Satish,
>
> I agree - that was my impression too. However I am seeing a smaller set of
> storage memory used on a given executor than the amount of memory required
> for my broadcast variables. I am wondering if the statistics in the UI are
> incorrect or if the broadcasts are simply not a part of that storage memory
> fraction.
>
> Bryan Jeffrey
>
> On Wed, Jun 21, 2017 at 6:48 PM -0400, "satish lalam" <
> satish.la...@gmail.com> wrote:
>
>> My understanding is that it comes from storageFraction. Cached blocks in
>> that region are immune to eviction, so both persisted RDDs and broadcast
>> variables sit there. Ref
>> <https://image.slidesharecdn.com/sparkinternalsworkshoplatest-160303190243/95/apache-spark-in-depth-core-concepts-architecture-internals-20-638.jpg?cb=1457597704>
>>
>>
>> On Wed, Jun 21, 2017 at 1:43 PM, Bryan Jeffrey <bryan.jeff...@gmail.com>
>> wrote:
>>
>>> Hello.
>>>
>>> Question: Do broadcast variables stored on executors count as part of
>>> 'storage memory' or other memory?
>>>
>>> A little bit more detail:
>>>
>>> I understand that we have two knobs to control memory allocation:
>>> - spark.memory.fraction
>>> - spark.memory.storageFraction
>>>
>>> My understanding is that spark.memory.storageFraction controls the
>>> amount of memory allocated for cached RDDs. spark.memory.fraction controls
>>> how much memory is allocated to Spark operations (task serialization,
>>> operations, etc.), with the remainder reserved for user data structures,
>>> Spark internal metadata, etc. The Spark portion includes the storage
>>> memory for cached RDDs.
>>>
>>> You end up with executor memory that looks like the following:
>>> All memory: 0-100
>>> Spark memory: 0-75
>>> RDD Storage: 0-37
>>> Other Spark: 38-75
>>> Other Reserved: 76-100
>>>
>>> Where do broadcast variables fall into the mix?
>>>
>>> Regards,
>>>
>>> Bryan Jeffrey
>>>
>>
>>
