siying commented on PR #42567:
URL: https://github.com/apache/spark/pull/42567#issuecomment-1686685511

   When the block cache in strict mode is full and we try to insert more data, an 
exception will be thrown one way or another. I don't think SS has a way to 
handle the error, and the consequence will be a task failure. If that is not a 
behavior we can afford, we should disable strict mode.
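
   To make the failure mode concrete, here is a minimal sketch (my own 
illustration, not code from this PR) of a strict-capacity block cache set up 
through the RocksDB Java API; the cache size and path are made up. A read (or a 
flush/compaction) that needs to pull a block into the full cache surfaces the 
rejected insert as a RocksDBException, which in SS would bubble up and fail the 
task.

       import org.rocksdb.{BlockBasedTableConfig, LRUCache, Options, RocksDB, RocksDBException}

       RocksDB.loadLibrary()

       // strictCapacityLimit = true: inserting into a full cache fails instead of
       // letting the cache grow past `capacity`.
       val cache = new LRUCache(/* capacity = */ 64L * 1024 * 1024,
                                /* numShardBits = */ -1,
                                /* strictCapacityLimit = */ true)

       val tableConfig = new BlockBasedTableConfig().setBlockCache(cache)
       val options = new Options()
         .setCreateIfMissing(true)
         .setTableFormatConfig(tableConfig)

       val db = RocksDB.open(options, "/tmp/strict-cache-demo")
       try {
         // If this read cannot admit the block into the full strict cache,
         // the error status comes back as a RocksDBException.
         db.get("some-key".getBytes)
       } catch {
         case e: RocksDBException =>
           println(s"cache full, operation rejected: ${e.getMessage}")
       } finally {
         db.close()
       }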
   
   There is a trade-off here: either RocksDB rejects queries and background tasks 
that need more memory than allowed, or it uses more memory than configured. 
RocksDB might not have a better way to handle it. Strict mode is a less commonly 
used feature, mainly for applications that would rather fail than exceed the 
configured memory. For example, sometimes people don't feel comfortable letting 
an administrative daemon use more memory and crash a host that might be serving 
online queries.
   
   I think it is a decision Spark SS needs to make. When users configure a memory 
cap that turns out to be not enough, are we going to fail their queries, or are 
we going to allow RocksDB to use more memory than configured and potentially 
cause an OOM? I can see pros and cons on both sides.
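
   To make the choice concrete from the user side, here is a rough sketch. It is 
illustrative only: the memory-bounding conf names are the ones I recall from the 
RocksDB state store provider, not necessarily what this PR touches, and the 
strict-mode knob at the end is purely hypothetical, i.e. exactly the decision 
being discussed above.

       import org.apache.spark.sql.SparkSession

       val spark = SparkSession.builder()
         .appName("rocksdb-memory-cap-tradeoff")
         // Use the RocksDB-backed state store provider.
         .config("spark.sql.streaming.stateStore.providerClass",
           "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider")
         // Cap RocksDB memory across state store instances on an executor.
         .config("spark.sql.streaming.stateStore.rocksdb.boundedMemoryUsage", "true")
         .config("spark.sql.streaming.stateStore.rocksdb.maxMemoryUsageMB", "512")
         // Policy A (strict): exceeding the cap throws and fails the task.
         // Policy B (non-strict): RocksDB may exceed the cap, risking executor OOM.
         // A hypothetical knob for that choice could look like:
         // .config("spark.sql.streaming.stateStore.rocksdb.strictMemoryLimit", "false")
         .getOrCreate()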


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

