Rayman created SAMZA-2255:
-----------------------------
Summary: Smart value writes in TaskSideInputStorageManager
Key: SAMZA-2255
URL: https://issues.apache.org/jira/browse/SAMZA-2255
Project: Samza
Issue Type: Improvement
Reporter: Rayman
TaskSideInputStorageManager converts each IncomingMessageEnvelope (IME) into
the set of records to be written by invoking the corresponding
sideInputsProcessor.
For example,
List<Record> entriesToBeWritten = sideInputsProcessor.process(IME.message);
It then iterates over this list: if an entry to be written has a null value,
the TaskSideInputStorageManager issues a delete to the KV store; otherwise it
issues a put.
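As a rough illustration of the per-entry behavior described above, here is a
minimal Java sketch. The names PerEntryWrites, SimpleStore and CountingStore
are hypothetical and are not Samza classes; the store is an in-memory stand-in
that only counts calls.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Samza.
final class PerEntryWrites {

  /** Hypothetical minimal store exposing single-key operations. */
  interface SimpleStore {
    void put(String key, String value);
    void delete(String key);
  }

  /** In-memory stand-in that counts how many store calls were issued. */
  static final class CountingStore implements SimpleStore {
    final Map<String, String> data = new HashMap<>();
    int calls = 0;

    @Override
    public void put(String key, String value) {
      calls++;
      data.put(key, value);
    }

    @Override
    public void delete(String key) {
      calls++;
      data.remove(key);
    }
  }

  /** One store call per entry: a null value means delete, anything else means put. */
  static void applyOneByOne(List<Map.Entry<String, String>> entries, SimpleStore store) {
    for (Map.Entry<String, String> entry : entries) {
      if (entry.getValue() == null) {
        store.delete(entry.getKey());
      } else {
        store.put(entry.getKey(), entry.getValue());
      }
    }
  }
}
```

With this approach a list of n entries always costs n store calls, even when
several entries target the same key.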
This can be optimized as follows:
For a given list of entriesToBeWritten, the TaskSideInputStorageManager should
a. Do an O(n) pass over the list and retain only the last record for each key.
b. Given the list from a., apply all records with a null value by using
deleteAll, and all records with a non-null value by using putAll.
Step a. is easy to do by iterating over the list in reverse order and
retaining the first record encountered for each key. Step b. is
straightforward.
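The two steps could be sketched in Java roughly as follows. This is only an
illustration under stated assumptions: SideInputWriteBatcher, Record and Batch
are hypothetical names, keys and values are plain strings, and the resulting
lists are assumed to be handed to the store's deleteAll and putAll calls
mentioned above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; not part of Samza.
final class SideInputWriteBatcher {

  /** Hypothetical stand-in for the ticket's "Record"; a null value means delete. */
  static final class Record {
    final String key;
    final String value;

    Record(String key, String value) {
      this.key = key;
      this.value = value;
    }
  }

  /** Keys to pass to a bulk delete and records to pass to a bulk put. */
  static final class Batch {
    final List<String> keysToDelete = new ArrayList<>();
    final List<Record> entriesToPut = new ArrayList<>();
  }

  /** Step a: walk the list in reverse and keep the first record seen per key. */
  static List<Record> lastRecordPerKey(List<Record> entries) {
    Set<String> seenKeys = new HashSet<>();
    List<Record> kept = new ArrayList<>();
    for (int i = entries.size() - 1; i >= 0; i--) {
      Record record = entries.get(i);
      if (seenKeys.add(record.key)) { // add() returns false if the key was already seen
        kept.add(record);
      }
    }
    Collections.reverse(kept); // restore the original relative order
    return kept;
  }

  /** Step b: split the de-duplicated records into deletes and puts. */
  static Batch partition(List<Record> deduped) {
    Batch batch = new Batch();
    for (Record record : deduped) {
      if (record.value == null) {
        batch.keysToDelete.add(record.key);
      } else {
        batch.entriesToPut.add(record);
      }
    }
    return batch;
  }
}
```

Because each key appears at most once after step a, the delete batch and the
put batch touch disjoint keys, so the order in which the two bulk calls are
issued does not change the final store contents.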
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)