[ https://issues.apache.org/jira/browse/SAMZA-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160871#comment-14160871 ]
Roger Hoover commented on SAMZA-428:
------------------------------------
If you need to write to a remote DB and you want idempotency, each batch
should be written as a single transaction, and that transaction should match
the Samza batch so that it can be retried on failure (this assumes a
deterministic MessageChooser, so that a retried batch contains the same
messages).

I'm basing my thinking on the way Trident works. How should one handle
writes to remote data stores?
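
A minimal sketch of the pattern described above, under stated assumptions: an in-memory SQLite database stands in for the remote DB, and `open_db`, `apply_batch`, the table names, and the integer batch ids are all hypothetical (this is neither Samza nor Trident API). Each batch is applied in one transaction together with its batch id, so a retry of an already-applied batch becomes a no-op. This is only safe if a retried batch contains the same messages, hence the deterministic MessageChooser assumption.

```python
import sqlite3


def open_db():
    # In-memory SQLite stands in for the remote DB (assumption for this sketch).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE counts (name TEXT PRIMARY KEY, total INTEGER NOT NULL)")
    conn.execute("CREATE TABLE applied (part INTEGER PRIMARY KEY, batch_id INTEGER NOT NULL)")
    conn.commit()
    return conn


def apply_batch(conn, part, batch_id, increments):
    """Apply one batch atomically. Returns False if the batch was already applied."""
    with conn:  # commits on normal exit, rolls back if an exception escapes
        row = conn.execute(
            "SELECT batch_id FROM applied WHERE part = ?", (part,)
        ).fetchone()
        if row is not None and batch_id <= row[0]:
            return False  # retry of a batch that already committed: skip it
        for name, delta in increments.items():
            cur = conn.execute(
                "UPDATE counts SET total = total + ? WHERE name = ?", (delta, name)
            )
            if cur.rowcount == 0:
                conn.execute(
                    "INSERT INTO counts (name, total) VALUES (?, ?)", (name, delta)
                )
        # Record the batch id in the same transaction as the data it covers.
        conn.execute(
            "INSERT OR REPLACE INTO applied (part, batch_id) VALUES (?, ?)",
            (part, batch_id),
        )
    return True
```

Because the increments are not idempotent on their own, the stored batch id is what makes the retry safe: calling `apply_batch(conn, 0, 1, {...})` twice changes the counts only once.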
> Investigate: how to tune down caching in the KeyValueStore implementations
> --------------------------------------------------------------------------
>
> Key: SAMZA-428
> URL: https://issues.apache.org/jira/browse/SAMZA-428
> Project: Samza
> Issue Type: Improvement
> Components: kv
> Affects Versions: 0.8.0
> Reporter: Chinmay Soman
> Fix For: 0.8.0
>
>
> Currently, we have a 'CachedStore' layer on top of the KeyValueStore
> implementation that we use. This might lead to double caching:
> i) Once at the CachedStore layer
> ii) Possibly cached again in the specific K-V store that we use (e.g.
> RocksDB / BDB)
> We need the CachedStore layer so that the writes to LoggedStore (if
> configured) are done in an efficient manner.
> We can then potentially do some config tuning for the K-V store to reduce its
> memory footprint and simply write to disk.
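
To illustrate the layering the issue describes, here is a rough sketch, not Samza's actual CachedStore: a dict stands in for the underlying K-V store and a list for the changelog, and the class name and the `write_batch_size` / `cache_size` parameters are hypothetical. Dirty writes are held separately from the read cache, so shrinking the read cache does not lose the write batching.

```python
from collections import OrderedDict


class WriteBatchingCache:
    """Hypothetical cache layer: batches writes, keeps a small LRU for reads."""

    def __init__(self, store, changelog, write_batch_size=3, cache_size=2):
        self.store = store                # underlying K-V store (a dict here)
        self.changelog = changelog        # stand-in for the logged store
        self.write_batch_size = write_batch_size
        self.cache_size = cache_size      # max entries kept in the read cache
        self.cache = OrderedDict()        # LRU read cache
        self.dirty = {}                   # writes not yet flushed

    def put(self, key, value):
        self.dirty[key] = value           # repeated writes to a key collapse
        self._remember(key, value)
        if len(self.dirty) >= self.write_batch_size:
            self.flush()

    def get(self, key):
        if key in self.dirty:
            return self.dirty[key]
        if key in self.cache:
            self.cache.move_to_end(key)
            return self.cache[key]
        value = self.store.get(key)
        if value is not None:
            self._remember(key, value)
        return value

    def flush(self):
        if not self.dirty:
            return
        self.store.update(self.dirty)
        self.changelog.append(dict(self.dirty))  # one log write per batch
        self.dirty.clear()

    def _remember(self, key, value):
        self.cache[key] = value
        self.cache.move_to_end(key)
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)       # evict least recently used
```

In this sketch, setting `cache_size` to 0 sends every read of flushed data to the underlying store while still batching writes to the changelog, which is roughly the kind of tuning the issue asks about.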