Github user miquillo commented on the issue:

    https://github.com/apache/nifi/pull/239
  
    Ouch, that's painful... we've just committed very similar code in NIFI-1769 
(PR #362, see reference), since we required a Kinesis Streams processor block. 
    
    I propose to take this solution further, though, as it seems more mature and 
has already been reviewed a few times. 
    
    My comments:
    
    - At the time of reading, the name of the issue/PR didn't make it clear to us 
that this processor block had already been built. Kinesis is the umbrella, 
which currently contains three services: Firehose, Streams, and Analytics. This 
processor block only works for Kinesis Streams, so please rename both the code 
and the PR/Jira :) (and also move the tests under a 'streams' subfolder)
    - The API documentation explains some restrictions on PutRecords(): 
    
    > Each PutRecords request can support up to 500 records. Each record in the 
request can be as large as 1 MB, up to a limit of 5 MB for the entire request, 
including partition keys. Each shard can support writes up to 1,000 records per 
second, up to a maximum data write total of 1 MB per second. 
    
    (http://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecords.html) 
    
    Data needs to be chunked into multiple PutRecords calls if the number of 
records exceeds 500. We used Guava for this (note that `Lists.partition` 
returns a list of lists; see the sketch below this list): 
    `List<List<PutRecordsRequestEntry>> recordChunks = Lists.partition(records, 
500);`
    - We added an NR_SHARDS parameter with future resharding (scaling up/down) 
in mind. Although we didn't implement a resharding mechanism, it's perhaps 
worth considering.
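    
    For illustration, here's a minimal sketch of the chunked PutRecords loop, 
assuming a `records` list of `PutRecordsRequestEntry` objects, an 
`AmazonKinesis` client, and a `streamName` variable (these names are ours, not 
from this PR): 
    
    ```java
    import java.util.List;
    
    import com.amazonaws.services.kinesis.AmazonKinesis;
    import com.amazonaws.services.kinesis.model.PutRecordsRequest;
    import com.amazonaws.services.kinesis.model.PutRecordsRequestEntry;
    import com.amazonaws.services.kinesis.model.PutRecordsResult;
    import com.google.common.collect.Lists;
    
    public class ChunkedPutRecords {
    
        // PutRecords accepts at most 500 records per request, so split the
        // full record list into chunks and issue one call per chunk.
        public static void putInChunks(AmazonKinesis client, String streamName,
                                       List<PutRecordsRequestEntry> records) {
            List<List<PutRecordsRequestEntry>> recordChunks = Lists.partition(records, 500);
            for (List<PutRecordsRequestEntry> chunk : recordChunks) {
                PutRecordsRequest request = new PutRecordsRequest()
                        .withStreamName(streamName)
                        .withRecords(chunk);
                PutRecordsResult result = client.putRecords(request);
                // A non-zero failed count means some records were throttled or
                // rejected and should be retried by the caller.
                if (result.getFailedRecordCount() > 0) {
                    // handle/retry failed entries here
                }
            }
        }
    }
    ```
    
    Note this sketch only handles the 500-record count limit; the 5 MB 
per-request size limit would still need a separate check when building each 
chunk.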
    
    


