[ https://issues.apache.org/jira/browse/APEXMALHAR-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chandni Singh reassigned APEXMALHAR-2223:
-----------------------------------------

    Assignee: Chandni Singh

> Managed state should parallelize WAL writes
> -------------------------------------------
>
>                 Key: APEXMALHAR-2223
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2223
>             Project: Apache Apex Malhar
>          Issue Type: Improvement
>    Affects Versions: 3.4.0
>            Reporter: Thomas Weise
>            Assignee: Chandni Singh
>
> Currently, data is accumulated in memory and written to the WAL on checkpoint
> only. This causes a write spike on checkpoint and does not utilize the HDFS
> write pipeline. The other extreme is writing to the WAL as soon as data
> arrives and then only flushing in beforeCheckpoint. The downside of this is
> that when the same key is written many times, all duplicates will be in the
> WAL. We need to find a balanced approach that the user can potentially
> fine-tune.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
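
To make the trade-off in the issue description concrete, here is a minimal sketch of one possible middle ground: buffer writes in memory so that repeated writes to the same key collapse into one entry, and drain the buffer to the WAL once it holds a configurable number of distinct keys. Everything in it is an assumption made for illustration and is not Malhar's managed state API: the class name `BufferedWalSketch`, the `maxBufferedKeys` setting, and the in-memory list standing in for the WAL file on HDFS are all hypothetical. The sketch is synchronous, so it only shows the tunable threshold and does not show writes happening in parallel with processing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch of a size-bounded write buffer in front of a WAL.
 * Not part of Apache Apex Malhar; all names here are illustrative.
 */
class BufferedWalSketch {
  // Hypothetical tuning knob: number of distinct keys held before draining.
  private final int maxBufferedKeys;
  // Latest value per key; a later write to the same key replaces the earlier one.
  private final Map<String, String> buffer = new LinkedHashMap<>();
  // Stand-in for the WAL file; a real implementation would append to HDFS.
  private final List<String> wal = new ArrayList<>();

  BufferedWalSketch(int maxBufferedKeys) {
    if (maxBufferedKeys < 1) {
      throw new IllegalArgumentException("maxBufferedKeys must be >= 1");
    }
    this.maxBufferedKeys = maxBufferedKeys;
  }

  /** Buffers the write and drains to the WAL once the threshold is reached. */
  void put(String key, String value) {
    buffer.put(key, value);
    if (buffer.size() >= maxBufferedKeys) {
      drainToWal();
    }
  }

  /** Writes out whatever is still buffered so the checkpoint sees all data. */
  void beforeCheckpoint() {
    drainToWal();
  }

  private void drainToWal() {
    for (Map.Entry<String, String> e : buffer.entrySet()) {
      wal.add(e.getKey() + "=" + e.getValue());
    }
    buffer.clear();
  }

  /** Returns a copy of the entries written to the WAL so far. */
  List<String> walEntries() {
    return new ArrayList<>(wal);
  }
}
```

Under these assumptions, setting `maxBufferedKeys` to 1 behaves like the write-on-arrival extreme (every write goes straight to the WAL, duplicates included), while a very large value behaves like the checkpoint-only extreme (everything is written in `beforeCheckpoint`). Values in between spread the writes out while still collapsing duplicates that arrive close together.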