mridulm commented on PR #38567:
URL: https://github.com/apache/spark/pull/38567#issuecomment-1308113372

   To comment on the proposal in the description, based on past prototypes I 
have worked on or seen: 
   
   Maintaining state at the driver in a disk-backed store and copying it to DFS 
comes with a few issues, particularly for larger applications.
   
   Such approaches are not very robust to application crashes, interact in 
nontrivial ways with the shutdown hook (HDFS failures), and increase application 
termination time during graceful shutdown.
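
   To make the shutdown concern concrete, here is a minimal sketch, not Spark 
code. It assumes the DFS can be stood in for by a local directory, and the names 
`copy_store_to_dfs` and `register_shutdown_copy` are hypothetical. The copy runs 
in an exit hook, so its duration is added to termination time, and a failure can 
only be logged at that point.

```python
import atexit
import shutil
import time
from pathlib import Path


def copy_store_to_dfs(local_store: Path, dfs_dir: Path) -> float:
    """Copy the local store to the stand-in DFS directory; return seconds spent."""
    start = time.monotonic()
    # Assumption: a plain local copy stands in for the DFS upload.
    # Against a real DFS this step can be slow or fail outright.
    shutil.copytree(local_store, dfs_dir, dirs_exist_ok=True)
    return time.monotonic() - start


def register_shutdown_copy(local_store: Path, dfs_dir: Path) -> None:
    """Register an exit hook that copies the store during graceful shutdown."""

    def _hook() -> None:
        try:
            elapsed = copy_store_to_dfs(local_store, dfs_dir)
            print(f"copied store in {elapsed:.3f}s")
        except OSError as exc:
            # Nothing is left to retry at this point; the state is simply lost.
            print(f"store copy failed during shutdown: {exc}")

    atexit.register(_hook)
```

Python's `atexit` hooks do not run when the process is killed by an unhandled 
signal or exits via `os._exit()`, which mirrors the crash case: the copy never 
happens.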
   
   Depending on application characteristics, a disk-backed store can help or 
hurt driver performance. It helps because updates are faster due to the index, 
which the in-memory store lacked (when I added an index, memory requirements 
increased :-( ). It hurts because of the increased disk activity. Which effect 
would dominate was difficult to predict.
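
   The index trade-off can be illustrated with a toy sketch. This is not Spark's 
store implementation; `ScanStore` and `IndexedStore` are hypothetical names. 
Without an index, an update scans the records. With one, the lookup is direct, 
but the index itself takes memory.

```python
import sys


class ScanStore:
    """Records kept in a list; updating by key scans the list (no index)."""

    def __init__(self):
        self.records = []  # list of [key, value] pairs

    def put(self, key, value):
        for rec in self.records:
            if rec[0] == key:
                rec[1] = value
                return
        self.records.append([key, value])

    def get(self, key):
        for rec in self.records:
            if rec[0] == key:
                return rec[1]
        return None


class IndexedStore(ScanStore):
    """Same records, plus a dict index from key to list position."""

    def __init__(self):
        super().__init__()
        self.index = {}

    def put(self, key, value):
        pos = self.index.get(key)
        if pos is None:
            self.index[key] = len(self.records)
            self.records.append([key, value])
        else:
            self.records[pos][1] = value

    def get(self, key):
        pos = self.index.get(key)
        return None if pos is None else self.records[pos][1]

    def index_overhead_bytes(self):
        # Shallow size of the dict only; a rough lower bound on the extra memory.
        return sys.getsizeof(self.index)
```

The indexed variant avoids the linear scan on each update, at the cost of 
holding the extra dict in memory alongside the records.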


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.