Rayman created SAMZA-2255:
-----------------------------

             Summary: Smart value writes in TaskSideInputStorageManager
                 Key: SAMZA-2255
                 URL: https://issues.apache.org/jira/browse/SAMZA-2255
             Project: Samza
          Issue Type: Improvement
            Reporter: Rayman


TaskSideInputStorageManager converts each IncomingMessageEnvelope (IME) into 
the set of records to be written by invoking the corresponding 
sideInputsProcessor. 

For example, 
List<Record> entriesToBeWritten = sideInputsProcessor.process(IME.message);

Then it iterates over this list, and if the entry to be written has a null 
value then the TaskSideInputStorageManager issues a delete to the KV Store, 
otherwise it issues a put. 

This can be optimized as follows: 
For a given list of entriesToBeWritten, the TaskSideInputStorageManager should: 
a. Do an O(n) pass over it and retain only the last record for each key. 
b. Given the deduplicated list from a, apply all records with a null value 
using deleteAll, and all records with a non-null value using putAll.

a. is easy to do by simply iterating over the list in reverse order and 
retaining the first record encountered for each key. b. is straightforward.
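
The two steps could be sketched roughly as below. This is only an 
illustration, not the actual TaskSideInputStorageManager code: the class 
SideInputBatching, the Record and Batch types, and the plan method are 
hypothetical names, and keys and values are simplified to Strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Samza.
final class SideInputBatching {

  /** Hypothetical key/value pair; a null value means "delete this key". */
  public static final class Record {
    public final String key;
    public final String value;

    public Record(String key, String value) {
      this.key = key;
      this.value = value;
    }
  }

  /** Hypothetical result holder: keys to delete and records to put. */
  public static final class Batch {
    public final List<String> deletes = new ArrayList<>();
    public final List<Record> puts = new ArrayList<>();
  }

  public static Batch plan(List<Record> entriesToBeWritten) {
    Set<String> seenKeys = new HashSet<>();
    Batch batch = new Batch();

    // Step a: iterate in reverse, so the first record seen for a key
    // is the last record that was written for it.
    for (int i = entriesToBeWritten.size() - 1; i >= 0; i--) {
      Record record = entriesToBeWritten.get(i);
      if (!seenKeys.add(record.key)) {
        continue; // a later record for this key has already been kept
      }
      // Step b: split the surviving records into deletes and puts.
      if (record.value == null) {
        batch.deletes.add(record.key);
      } else {
        batch.puts.add(record);
      }
    }

    // Restore the original relative order of the surviving records.
    Collections.reverse(batch.deletes);
    Collections.reverse(batch.puts);
    return batch;
  }
}
```

The caller would then hand batch.deletes to the store's deleteAll and 
batch.puts to its putAll. After step a each key appears in at most one of 
the two lists, so the order of those two calls does not affect the result.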



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
