[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537595#comment-15537595
 ] 

Chandni Singh commented on APEXMALHAR-2223:
-------------------------------------------

IncrementalCheckpointManager used to extend FSWindowDataManager, but this 
change requires saving multiple artifacts per window, and the API of 
IncrementalCheckpointManager was quite different from that of 
FSWindowDataManager. I have therefore abstracted the common code out into 
`AbstractFSWindowStateManager`.

> Managed state should parallelize WAL writes
> -------------------------------------------
>
>                 Key: APEXMALHAR-2223
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2223
>             Project: Apache Apex Malhar
>          Issue Type: Improvement
>    Affects Versions: 3.4.0
>            Reporter: Thomas Weise
>            Assignee: Chandni Singh
>
> Currently, data is accumulated in memory and written to the WAL only on 
> checkpoint. This causes a write spike on checkpoint and does not utilize the 
> HDFS write pipeline. The other extreme is to write to the WAL as soon as data 
> arrives and flush only in beforeCheckpoint. The downside of this is that 
> when the same key is written many times, all duplicates end up in the WAL. 
> We need to find a balanced approach that the user can potentially fine-tune. 
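
To make the trade-off in the issue description above concrete, here is a 
minimal sketch of one possible middle ground: buffer writes per key so that 
repeated writes of the same key collapse, drain the buffer to the WAL once a 
user-tunable number of distinct keys is pending, and drain whatever is left 
in beforeCheckpoint. Everything here is hypothetical and for illustration 
only: the class ThresholdWalBuffer, the maxPendingKeys setting and the 
in-memory list standing in for the WAL are assumptions, not Malhar APIs and 
not necessarily the approach taken in this change.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; not part of Apex Malhar. */
class ThresholdWalBuffer {
    // Tunable: number of distinct pending keys that triggers an early drain.
    private final int maxPendingKeys;
    // Latest value per key; a repeated key overwrites its pending value.
    private final Map<String, String> pending = new LinkedHashMap<>();
    // Stand-in for the real WAL: an in-memory list of "key=value" records.
    private final List<String> wal = new ArrayList<>();

    ThresholdWalBuffer(int maxPendingKeys) {
        if (maxPendingKeys < 1) {
            throw new IllegalArgumentException("maxPendingKeys must be >= 1");
        }
        this.maxPendingKeys = maxPendingKeys;
    }

    /** Buffers a write and drains early once enough distinct keys are pending. */
    void put(String key, String value) {
        pending.put(key, value);
        if (pending.size() >= maxPendingKeys) {
            drain();
        }
    }

    /** Drains whatever is still buffered, as a checkpoint hook would. */
    void beforeCheckpoint() {
        drain();
    }

    /** Returns a copy of the records appended to the simulated WAL so far. */
    List<String> walEntries() {
        return new ArrayList<>(wal);
    }

    private void drain() {
        for (Map.Entry<String, String> e : pending.entrySet()) {
            wal.add(e.getKey() + "=" + e.getValue());
        }
        pending.clear();
    }
}
```

With maxPendingKeys set to 1 this behaves like writing every tuple as it 
arrives, and with a very large value it behaves like the current 
write-on-checkpoint behaviour, so the single setting spans both extremes. 
Duplicates are only collapsed within one buffered batch; a key written again 
after a drain still appears in the WAL a second time.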



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
