Github user vpchelko commented on the issue:
https://github.com/apache/spark/pull/16374
I no longer use the approach above.
To unpersist unnecessary RDDs, I hacked MapWithStateDStream a little by
calling unpersist on the previously generated RDDs in internalMapWithStateStream.
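For readers following along, here is a minimal user-space sketch of the general pattern (not the actual patch, which modified the non-public internalMapWithStateStream): keep a handle to the previous batch's RDD and unpersist it as soon as the next batch is generated, instead of waiting for LRU eviction. The helper name `unpersistPreviousBatches` is assumed for illustration.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream

// Sketch: eagerly unpersist the previous batch's RDD once the next batch's
// RDD arrives. The foreachRDD body runs on the driver, once per batch.
// Note: this only reaches the RDDs the public DStream exposes; the internal
// state RDDs of mapWithState are not accessible this way.
def unpersistPreviousBatches[T](stream: DStream[T]): Unit = {
  var previous: Option[RDD[T]] = None
  stream.foreachRDD { rdd =>
    previous.foreach(_.unpersist(blocking = false)) // drop last batch's data
    previous = Some(rdd)                            // remember current batch
  }
}
```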
Cache eviction does not work well: from my observations, cache eviction
is an expensive operation. Also, because of the large number of objects in
the JVM, the garbage collector starts consuming resources significantly.
P.S.
For my task I don't need mappedValues in the cache.
Why does mapWithState cache mappedValues in RAM?
Previous mappedValues are not required to calculate the next state or the
next mappedValues; they just use up RAM.
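If the mapped values themselves are not needed, one possible workaround using only the public API (a sketch, not a change to Spark itself) is to emit a minimal value from the mapping function and read the state via stateSnapshots(), so the cached records carry no payload beyond the state:

```scala
import org.apache.spark.streaming.{State, StateSpec}

// Mapping function that maintains a running sum per key but emits Unit,
// so the mapped-values side of mapWithState stays essentially empty.
def trackingFunc(key: String, value: Option[Int], state: State[Long]): Unit = {
  val newSum = state.getOption.getOrElse(0L) + value.getOrElse(0).toLong
  state.update(newSum)
}

// Usage, assuming `pairs: DStream[(String, Int)]`:
//   val stateStream = pairs.mapWithState(StateSpec.function(trackingFunc _))
//   val snapshots   = stateStream.stateSnapshots() // DStream[(String, Long)]
```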