[ https://issues.apache.org/jira/browse/APEXMALHAR-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15207881#comment-15207881 ]

ASF GitHub Bot commented on APEXMALHAR-2017:
--------------------------------------------

Github user chandnisingh commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-malhar/pull/218#discussion_r57111096

    --- Diff: library/src/main/java/com/datatorrent/lib/io/fs/AbstractFileOutputOperator.java ---
    @@ -1195,6 +1188,24 @@ public void close() throws IOException
       }
     
       @Override
    +  public void beforeCheckpoint(long l)
    +  {
    +    try {
    +      Map<String, FSFilterStreamContext> openStreams = streamsCache.asMap();
    +      for (FSFilterStreamContext streamContext: openStreams.values()) {
    +        long start = System.currentTimeMillis();
    +        streamContext.finalizeContext();
    +        totalWritingTime += System.currentTimeMillis() - start;
    +        //streamContext.resetFilter();
    --- End diff --
    
    Why is this code commented out?


> Use pre checkpoint notification to optimize operator IO
> -------------------------------------------------------
>
>                 Key: APEXMALHAR-2017
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2017
>             Project: Apache Apex Malhar
>          Issue Type: Improvement
>            Reporter: Pramod Immaneni
>            Assignee: Pramod Immaneni
>
> Currently many output operators enforce persistence of data on endWindow by 
> calling flush, hflush or equivalent calls. This was done to help recovery: 
> flushing in every window ensures that the data corresponding to the 
> checkpointed state is always present at recovery.
> A recent addition to the engine lets operators know about an impending 
> checkpoint just before it happens, using a callback. Operators can now enforce 
> persistence of data once, in this callback, instead of at the end of every 
> window. This results in better performance because data is written to 
> persistent storage less frequently.
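
The trade-off described above can be sketched in isolation. The following is a
minimal illustration only, not the Malhar implementation: CountingSink,
SketchOutputOperator and simulate are hypothetical names, and only the
endWindow and beforeCheckpoint(long) method names are taken from the issue
text and the diff. It compares how many flushes occur when persisting in every
window versus once per checkpoint.

```java
import java.util.ArrayList;
import java.util.List;

public class CheckpointFlushSketch {

  /** Hypothetical stand-in for a stream to persistent storage; counts flushes. */
  static class CountingSink {
    private final List<String> buffered = new ArrayList<>();
    private final List<String> persisted = new ArrayList<>();
    int flushCount = 0;

    void write(String tuple) {
      buffered.add(tuple);
    }

    // Moves everything buffered so far into "persistent" storage.
    void flush() {
      persisted.addAll(buffered);
      buffered.clear();
      flushCount++;
    }
  }

  /** Hypothetical operator; it does not use or depend on the Apex API. */
  static class SketchOutputOperator {
    final CountingSink sink = new CountingSink();
    private final boolean flushEveryWindow;

    SketchOutputOperator(boolean flushEveryWindow) {
      this.flushEveryWindow = flushEveryWindow;
    }

    void process(String tuple) {
      sink.write(tuple);
    }

    // Old behaviour: enforce persistence at the end of every window.
    void endWindow() {
      if (flushEveryWindow) {
        sink.flush();
      }
    }

    // New behaviour: enforce persistence once, just before the checkpoint.
    void beforeCheckpoint(long windowId) {
      if (!flushEveryWindow) {
        sink.flush();
      }
    }
  }

  /** Runs the given number of windows and returns how many flushes happened. */
  public static int simulate(int windows, int checkpointEvery, boolean flushEveryWindow) {
    SketchOutputOperator op = new SketchOutputOperator(flushEveryWindow);
    for (long w = 1; w <= windows; w++) {
      op.process("tuple-" + w);
      op.endWindow();
      if (w % checkpointEvery == 0) {
        op.beforeCheckpoint(w);
      }
    }
    return op.sink.flushCount;
  }

  public static void main(String[] args) {
    System.out.println("flush per window:     " + simulate(60, 30, true));
    System.out.println("flush per checkpoint: " + simulate(60, 30, false));
  }
}
```

In both modes everything written before a checkpoint has been flushed by the
time the checkpoint is taken, which is the property recovery relies on; the
per-checkpoint mode simply reaches it with fewer flush calls. The window count
and checkpoint interval passed to simulate are arbitrary illustrative values.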



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
