Hello everyone,

In our group at EPFL we're doing research on understanding and potentially 
improving the performance of data-parallel frameworks that use secondary 
storage.

I was looking at the Flink code to understand how spilling to disk actually 
works.

So far I have gotten as far as UnilateralSortMerger.java and its spilling and 
reading threads. I also saw that some spilling markers are used.

I am curious if there is any design document available on this topic.

I was not able to find much online.

If there is no such design document, I would appreciate it if someone could 
help me understand how these spilling markers are used.

At a higher level, I am trying to understand how much data Flink spills to 
disk once it has concluded that it needs to spill.


Thank you very much

Florin Dinu
