[ 
https://issues.apache.org/jira/browse/FLINK-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736582#comment-14736582
 ] 

ASF GitHub Bot commented on FLINK-1730:
---------------------------------------

Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/1083#issuecomment-138861596
  
    You are certainly right that there should be an API call to explicitly 
persist data in memory (or transparently on disk if memory is short) and to 
access this data later (within the same or another job). However, this feature 
can be implemented in different ways, for example in the network stack or at 
the operator level. Even if one implementation looks straightforward, it can 
have severe limitations and implications for the behavior of the system. That 
is why such features should be discussed before anyone starts implementing 
them, even if they look easy to do.
    
    Doing it at the operator level has several shortcomings:
    - Persisted data sets cannot be used for recovery. If it is done at the 
network stack level, basically the same code can be used for both.
    - Data cannot (easily) be shared across jobs. Operators are expected to 
return their memory when a job is done; otherwise, there is a memory leak. 
Once a job has finished, there is no way to free memory that it did not return.
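
    To illustrate the memory argument, here is a minimal toy model, not Flink 
code. `MemoryManager`, `run_job`, and the page counts are hypothetical names 
and numbers made up for this sketch; it only assumes that operators borrow 
pages from a shared pool and are expected to hand them back when the job ends.

```python
class MemoryManager:
    """Toy stand-in for a worker's managed memory pool (hypothetical, not Flink's class)."""

    def __init__(self, total_pages):
        self.free_pages = total_pages
        self.owned = {}  # job id -> number of pages currently held

    def allocate(self, job_id, pages):
        # Hand out pages from the shared pool, or fail if the pool is too small.
        if pages > self.free_pages:
            raise MemoryError("not enough managed memory")
        self.free_pages -= pages
        self.owned[job_id] = self.owned.get(job_id, 0) + pages

    def release(self, job_id, pages):
        # Return pages that the given job previously allocated.
        held = self.owned.get(job_id, 0)
        if pages > held:
            raise ValueError("job releases more pages than it holds")
        self.owned[job_id] = held - pages
        self.free_pages += pages


def run_job(manager, job_id, pages, persist):
    """Run a job whose operator needs `pages` pages of managed memory.

    With persist=True the operator keeps its pages after the job ends,
    which models persisting a data set at the operator level.
    """
    manager.allocate(job_id, pages)
    if not persist:
        manager.release(job_id, pages)


if __name__ == "__main__":
    manager = MemoryManager(total_pages=10)
    run_job(manager, "job-1", pages=8, persist=True)
    print("free pages after job-1:", manager.free_pages)
    try:
        run_job(manager, "job-2", pages=4, persist=False)
    except MemoryError as err:
        print("job-2 failed:", err)
```

    In this sketch the pages kept by the first job stay allocated after it has 
finished, so a later job that needs more than the remaining pages fails. 
Nothing in the model can reclaim those pages on behalf of the finished job.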
    
    @uce, @StephanEwen You are more familiar with this feature. Did I miss 
something? 


> Add a FlinkTools.persist style method to the Data Set.
> ------------------------------------------------------
>
>                 Key: FLINK-1730
>                 URL: https://issues.apache.org/jira/browse/FLINK-1730
>             Project: Flink
>          Issue Type: New Feature
>            Reporter: Stephan Ewen
>            Priority: Minor
>
> I think this is an operation that will be needed more prominently: defining a 
> point where one long logical program is broken into different executions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
