Thomas Weise created APEXMALHAR-2223:
----------------------------------------

             Summary: Managed state should parallelize WAL writes
                 Key: APEXMALHAR-2223
                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2223
             Project: Apache Apex Malhar
          Issue Type: Improvement
            Reporter: Thomas Weise


Currently, data is accumulated in memory and written to the WAL only on 
checkpoint. This causes a write spike on checkpoint and does not utilize the 
HDFS write pipeline. The other extreme is to write to the WAL as soon as data 
arrives and only flush in beforeCheckpoint. The downside is that when the same 
key is written many times, all duplicates end up in the WAL. We need to find a 
balanced approach that the user can potentially fine-tune. 
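
One possible middle ground, as a rough sketch only: buffer writes in a map so 
that repeated writes to the same key coalesce, and drain the buffer to the WAL 
whenever it reaches a user-configurable number of distinct keys, with a final 
drain in beforeCheckpoint. Everything below is hypothetical and is not the 
Malhar managed state API: the class name CoalescingWalBuffer, the 
maxBufferedKeys setting, and the in-memory list standing in for the WAL file 
are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; names and structure are assumptions, not Malhar code. */
class CoalescingWalBuffer {
    // Hypothetical tuning knob: distinct keys to hold before draining to the WAL.
    private final int maxBufferedKeys;
    // Latest value per key; a repeated put overwrites, so duplicates coalesce.
    private final Map<String, String> pending = new LinkedHashMap<>();
    // Stand-in for the WAL file; a real implementation would append to HDFS.
    private final List<String> wal = new ArrayList<>();

    CoalescingWalBuffer(int maxBufferedKeys) {
        if (maxBufferedKeys < 1) {
            throw new IllegalArgumentException("maxBufferedKeys must be >= 1");
        }
        this.maxBufferedKeys = maxBufferedKeys;
    }

    /** Buffers a write and drains early once the threshold is reached. */
    void put(String key, String value) {
        pending.put(key, value);
        if (pending.size() >= maxBufferedKeys) {
            drain();
        }
    }

    /** Writes whatever is still buffered; a real WAL would also flush here. */
    void beforeCheckpoint() {
        drain();
    }

    private void drain() {
        for (Map.Entry<String, String> e : pending.entrySet()) {
            wal.add(e.getKey() + "=" + e.getValue());
        }
        pending.clear();
    }

    /** Entries appended to the stand-in WAL so far, in write order. */
    List<String> walEntries() {
        return wal;
    }
}
```

With maxBufferedKeys set to 1 this degenerates to writing every update as it 
arrives, and with a very large value it degenerates to the current 
write-on-checkpoint behavior, so the single setting spans both extremes 
described above. 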



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
