Ernest created SPARK-14289:
------------------------------

             Summary: Add support for multiple eviction strategies for cached 
RDD partitions
                 Key: SPARK-14289
                 URL: https://issues.apache.org/jira/browse/SPARK-14289
             Project: Spark
          Issue Type: New Feature
          Components: Block Manager, Spark Core
         Environment: Spark 2.0-SNAPSHOT
Single Rack
Standalone mode scheduling
8 node cluster
16 cores & 64G RAM / node
Data Replication factor of 3

Each node has 1 Spark executor configured with 16 cores and 40GB of RAM.
            Reporter: Ernest
            Priority: Minor


Currently, there is only one eviction strategy for cached RDD partitions in 
Spark. 

The default RDD eviction strategy is LRU, with an additional rule that a block 
belonging to the same RDD as the partition currently being created is never 
evicted.

When memory space is not sufficient for RDD caching, several partitions are 
evicted. If these partitions are needed again later, they are recomputed from 
lineage information and cached in memory again. This recomputation brings 
additional cost, and LRU gives no guarantee of the lowest recomputation cost. 
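The current behavior described above can be modeled with a small sketch. This is an illustrative model under assumed names (`LruStore`, `cache`, `access` are hypothetical, not Spark internals); it only captures the two rules: least-recently-used victim selection, and never evicting a block of the RDD currently being cached.

```python
from collections import OrderedDict

class LruStore:
    """Toy model of LRU eviction with the same-RDD restriction (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # iteration order == recency order (oldest first)

    def access(self, block):
        self.blocks.move_to_end(block)  # mark as most recently used

    def cache(self, block):
        """Cache `block` (an (rdd_id, partition) pair); return the evicted blocks."""
        rdd_id, _ = block
        evicted = []
        while len(self.blocks) >= self.capacity:
            # least-recently-used victim that does NOT belong to the same RDD
            victim = next((b for b in self.blocks if b[0] != rdd_id), None)
            if victim is None:
                return evicted  # give up rather than evict same-RDD blocks
            del self.blocks[victim]
            evicted.append(victim)
        self.blocks[block] = True
        return evicted
```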

The first RDD that needs to be cached is usually generated by reading from 
HDFS and applying several transformations. The read from HDFS usually costs 
more time than the subsequent Spark transformations. 

For example, suppose one stage has the following DAG structure: hdfs -> \[A\] 
-> B -> \[C\] -> D -> \[E\] -> \[F\], where RDDs A, C, E and F need to be 
cached in memory. F is created during this stage, while A, C and E were 
already created in previous stages. Under the LRU eviction strategy, 
partitions of A are evicted first. However, the time cost of \[A\] -> B -> 
\[C\] may be much less than that of hdfs -> \[A\], so evicting \[C\] may be 
better than evicting \[A\]. 
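The comparison above can be made concrete with invented timings (the numbers are assumptions for illustration only; only the ordering matters):

```python
# Illustrative segment costs (seconds) for hdfs -> [A] -> B -> [C] -> ...
cost = {"hdfs->A": 60.0, "A->B->C": 10.0}

# Recomputing an evicted A partition must replay the HDFS read;
# recomputing an evicted C partition replays only [A] -> B -> [C],
# since A is still cached.
recompute_A = cost["hdfs->A"]
recompute_C = cost["A->B->C"]
```

Under these assumed timings, undoing the eviction of C is six times cheaper than undoing the eviction of A, even though LRU would evict A first.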

An eviction strategy based on creation cost may do better than LRU: record 
each transformation's time during the creation of a cached RDD partition 
(e.g. for \[E\], only the time spent in \[C\] -> D and D -> \[E\] needs to be 
recorded), plus the time spent in any required shuffle reads. When memory for 
RDD storage is not sufficient, the partition with the least creation cost is 
evicted first. This strategy can be called LCS. My current demo shows a 
better performance gain than the default LRU.
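A minimal sketch of the proposed LCS victim selection, under assumed names (`LcsStore` and its methods are hypothetical, not part of Spark): each cached partition carries its measured creation cost, and the cheapest-to-recreate partition is evicted first.

```python
class LcsStore:
    """Toy model of LCS eviction: evict the partition with the least creation cost."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.parts = {}  # (rdd_id, partition) -> measured creation cost in ms

    def cache(self, part, creation_cost_ms):
        """Cache `part`; return the partitions evicted to make room."""
        evicted = []
        while len(self.parts) >= self.capacity:
            # evict the partition whose lineage is cheapest to replay
            victim = min(self.parts, key=self.parts.get)
            del self.parts[victim]
            evicted.append(victim)
        self.parts[part] = creation_cost_ms
        return evicted
```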

This strategy needs to consider the following situations:
1. Unified Memory Management (UMM) was introduced in Spark 1.6. The execution 
memory used while recomputing a partition may be quite different from that 
used when the partition was first created. So until this is better thought 
through, LCS may not be allowed in UMM mode (though my demo also shows an 
improvement of LCS over LRU in UMM mode).

2. MEMORY_AND_DISK_SER and similar storage levels serialize RDD partitions. 
By estimating the serialize/deserialize cost and comparing it with the 
creation cost, a partition whose serialize/deserialize cost is larger than 
its recreation cost should not be serialized but removed from memory 
directly. Since existing storage levels can only be set for a whole RDD, a 
new storage level may be needed that decides per RDD partition whether to 
serialize or just remove from memory.
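The per-partition decision described in point 2 reduces to a single comparison. A sketch, with the function name and cost estimates being assumptions for illustration:

```python
def should_serialize(serialize_ms, deserialize_ms, recreate_ms):
    """Serialize only if a round trip through the serialized form is estimated
    to be cheaper than recomputing the partition from lineage."""
    return serialize_ms + deserialize_ms < recreate_ms
```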

Besides LCS, FIFO or LFU would also be easy to implement.
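These strategies differ only in how the victim is ranked, which is why adding them is cheap once per-block metadata is tracked. A sketch over a hypothetical metadata record (all field names and numbers are invented for illustration):

```python
# Hypothetical per-block metadata; each strategy minimizes a different field.
blocks = {
    "A": {"inserted_at": 1, "access_count": 7, "creation_cost_ms": 60000},
    "C": {"inserted_at": 2, "access_count": 3, "creation_cost_ms": 5000},
    "E": {"inserted_at": 3, "access_count": 1, "creation_cost_ms": 20000},
}

def victim(key):
    """Return the block a strategy would evict first: the one minimizing `key`."""
    return min(blocks, key=lambda b: blocks[b][key])
```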



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
