Github user vpchelko commented on the issue:
https://github.com/apache/spark/pull/16374
I no longer use the approach above.
To unpersist unnecessary RDDs, I hacked MapWithStateDStream a little by
calling unpersist on the previously generated RDDs in internalMapWithStateStream.
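For readers following along, here is a minimal user-space sketch of the general pattern (not the actual patch, which modified the non-public internalMapWithStateStream): keep a handle to the previous batch's RDD and unpersist it as soon as the next batch is generated, instead of waiting for LRU eviction. The helper name `unpersistPreviousBatches` is assumed for illustration.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream

// Sketch: eagerly unpersist the previous batch's RDD once the next batch's
// RDD arrives. The foreachRDD body runs on the driver, once per batch.
// Note: this only reaches the RDDs the public DStream exposes; the internal
// state RDDs of mapWithState are not accessible this way.
def unpersistPreviousBatches[T](stream: DStream[T]): Unit = {
  var previous: Option[RDD[T]] = None
  stream.foreachRDD { rdd =>
    previous.foreach(_.unpersist(blocking = false)) // drop last batch's data
    previous = Some(rdd)                            // remember current batch
  }
}
```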
Cache eviction does not work well: from my observations, cache eviction
is an expensive operation. Also, because of the large number of objects in
the JVM, the garbage collector starts consuming resources significantly.
P.S.
For my task I don't need mappedValues in the cache.
Why does mapWithState cache mappedValues in RAM?
Previous mappedValues are not required to calculate the next state or the
next mappedValues; they just use up RAM.
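If the mapped values themselves are not needed, one possible workaround using only the public API (a sketch, not a change to Spark itself) is to emit a minimal value from the mapping function and read the state via stateSnapshots(), so the cached records carry no payload beyond the state:

```scala
import org.apache.spark.streaming.{State, StateSpec}

// Mapping function that maintains a running sum per key but emits Unit,
// so the mapped-values side of mapWithState stays essentially empty.
def trackingFunc(key: String, value: Option[Int], state: State[Long]): Unit = {
  val newSum = state.getOption.getOrElse(0L) + value.getOrElse(0).toLong
  state.update(newSum)
}

// Usage, assuming `pairs: DStream[(String, Int)]`:
//   val stateStream = pairs.mapWithState(StateSpec.function(trackingFunc _))
//   val snapshots   = stateStream.stateSnapshots() // DStream[(String, Long)]
```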