On the data path, Spark will write to a local disk when it runs out of
memory and needs to spill, or when doing a shuffle with the default shuffle
implementation. The spilling is a good thing because it lets you process
data that is too large to fit in memory. It is not great because the
processing slows down a lot when that happens, but slow is better than
crashing in many cases.

The default shuffle implementation will always write out to disk. Again,
this is good in that it allows you to process more data on a single box
than can fit in memory. It is bad when the shuffle data could fit in memory
but ends up being written to disk anyway.

On Linux the data is written into the page cache and is flushed to disk in
the background when memory is needed or after a set amount of time. If your
query is fast and shuffles little data, then it is likely running entirely
in memory: all of the shuffle reads and writes are probably going straight
to the page cache and the disk is not involved at all. If you really want
to, you can configure the page cache to not flush to disk until absolutely
necessary. That should get you really close to pure in-memory processing,
as long as you have enough free memory on the host to support it.
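
If you want a rough sketch of what that can look like on the Spark side
(just an illustration on my part, the tmpfs path and the numbers are made
up), you can point spark.local.dir at a RAM-backed filesystem so shuffle
and spill files never touch a real disk, and raise spark.memory.fraction so
spilling kicks in later:

  import org.apache.spark.sql.SparkSession

  // Sketch only: keep shuffle/spill files on tmpfs and give
  // execution/storage more of the heap so spills start later.
  // /dev/shm is tmpfs on most Linux distros; the path is a placeholder.
  val spark = SparkSession.builder()
    .appName("mostly-in-memory")
    .config("spark.local.dir", "/dev/shm/spark-local")
    .config("spark.memory.fraction", "0.8")  // default is 0.6
    .getOrCreate()

On the OS side, the knobs I was referring to are the dirty page writeback
sysctls (vm.dirty_background_ratio, vm.dirty_ratio,
vm.dirty_expire_centisecs), which control when the kernel starts flushing
page cache pages to disk.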

Bobby



On Fri, Aug 20, 2021 at 7:57 AM Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Well, I don't know what having an "in-memory only Spark" is going to
> achieve. The Spark GUI shows the amount of disk usage pretty well, and
> memory is used exclusively first by default anyway.
>
> Spark is no different from any other predominantly in-memory application.
> Effectively it is doing the classical disk-based Hadoop map-reduce
> operation "in memory" to speed up the processing, but it is still an
> application on top of the OS. So, like most applications, there are states
> of Spark, the running code and the OS(s) where disk usage will be needed.
>
> This is akin to swap space on the OS itself, and I quote: "Swap space is used when
> your operating system decides that it needs physical memory for active
> processes and the amount of available (unused) physical memory is
> insufficient. When this happens, inactive pages from the physical memory
> are then moved into the swap space, freeing up that physical memory for
> other uses"
>
>  free
>                total        used        free      shared  buff/cache   available
> Mem:        65659732    30116700     1429436     2341772    34113596    32665372
> Swap:      104857596      550912   104306684
>
> HTH
>
>
>    View my LinkedIn profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Fri, 20 Aug 2021 at 12:50, Jacek Laskowski <ja...@japila.pl> wrote:
>
>> Hi,
>>
>> I've been exploring BlockManager and the stores for a while now and am
>> tempted to say that a memory-only Spark setup would be possible (except
>> shuffle blocks). Is this correct?
>>
>> What about shuffle blocks? Do they have to be stored on disk (in
>> DiskStore)?
>>
>> I think broadcast variables are in-memory first, so unless an on-disk
>> storage level is explicitly used (by Spark devs), there's no reason not to
>> have Spark in-memory only.
>>
>> (I was told that one of the differences between Trino/Presto and Spark SQL
>> is that Trino keeps all processing in-memory only and will blow up when
>> memory runs out, while Spark uses disk to avoid OOMEs.)
>>
>> Pozdrawiam,
>> Jacek Laskowski
>> ----
>> https://about.me/JacekLaskowski
>> "The Internals Of" Online Books <https://books.japila.pl/>
>> Follow me on https://twitter.com/jaceklaskowski
>>
>
