[ 
https://issues.apache.org/jira/browse/TEZ-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010121#comment-14010121
 ] 

Rohini Palaniswamy commented on TEZ-1153:
-----------------------------------------

Yes, I am aware that this is possible with ObjectLifeCycle.DAG, but that is not 
advisable for a replicated join. We want the table in memory only for the 
vertices doing the replicated join. Since the small table of the join can be 
anywhere from 5MB to 1G, it would be really bad to cache it for the whole DAG. 
If the Processor is intelligent enough to mark cache items as belonging to 
specific vertices, it should not be difficult to clean items from the cache 
based on which vertex the newly scheduled task belongs to, instead of blindly 
clearing the whole cache whenever the task is from a different vertex.
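
A minimal sketch of that eviction idea, to make the proposal concrete. All 
names here (VertexScopedCache, put, get, onTaskScheduled) are hypothetical and 
are not part of the Tez API; it only assumes that each cached item is tagged 
with the names of the vertices allowed to reuse it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this is not the Tez ObjectRegistry.
class VertexScopedCache {

    // A cached value together with the vertex names allowed to reuse it.
    private static final class Entry {
        final Object value;
        final Set<String> vertices;

        Entry(Object value, Set<String> vertices) {
            this.value = value;
            this.vertices = vertices;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    // Cache a value and tag it with the vertices that may read it.
    void put(String key, Object value, Set<String> vertices) {
        entries.put(key, new Entry(value, new HashSet<>(vertices)));
    }

    // Returns the cached value, or null if it is absent or was evicted.
    Object get(String key) {
        Entry e = entries.get(key);
        return e == null ? null : e.value;
    }

    // Assumed to be called when a task of the given vertex is scheduled in
    // this container. Evicts only the entries not tagged for that vertex and
    // returns how many entries were dropped.
    int onTaskScheduled(String vertexName) {
        int evicted = 0;
        Iterator<Entry> it = entries.values().iterator();
        while (it.hasNext()) {
            if (!it.next().vertices.contains(vertexName)) {
                it.remove();
                evicted++;
            }
        }
        return evicted;
    }
}
```

With this shape, a small table tagged for two replicated-join vertices would 
survive a switch between those two vertices, and would be dropped as soon as 
a task from any other vertex is scheduled in the container.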

> Allow object caching for more than one vertex
> ---------------------------------------------
>
>                 Key: TEZ-1153
>                 URL: https://issues.apache.org/jira/browse/TEZ-1153
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>
>   It would be good to have an option to cache an object for more than one 
> vertex by specifying a list of vertex ids.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
