GitHub user jvwing commented on the issue:

    https://github.com/apache/nifi/pull/239
  
    Thanks for your latest changes to the error handling.  The changes look OK, and I don't think we need the failure relationship.  The integration tests for PutKinesisStream and GetKinesisStream both worked fine.  I set up a small flow putting records to and getting records from a Kinesis stream to test the processors.  The processors do work, but I had a rough experience, interrupted by various errors that required a NiFi restart to fix.  The errors included the following, not necessarily in this order:
    
    * ERROR [pool-123-thread-4] c.a.s.kinesis.producer.KinesisProducer Error in child process
      java.lang.RuntimeException: EOF reached during read
    * ERROR [pool-37-thread-1] c.a.s.kinesis.producer.KinesisProducer Error in child process
      java.lang.RuntimeException: Child process exited with code 137
    * ERROR [Timer-Driven Process Thread-2] o.a.n.p.a.k.producer.PutKinesisStream
      com.amazonaws.services.kinesis.producer.DaemonException: The child process has been shutdown and can no longer accept messages.
    
    Are you familiar with any of these?  Once the child process errors show up, the PutKinesisStream processor seems to stop working, and stopping and starting the processor does not help.  I do not have a precise repro sequence yet, but the errors coincided with throughput around the throttle limit of my Kinesis stream.  I was running this on an Amazon Linux EC2 instance with permissions for Kinesis, DynamoDB, and CloudWatch.  I am not sure how to evaluate whether this is a KPL (Kinesis Producer Library) problem or a PutKinesisStream problem.
    
    One suggestion I have for the error handling in PutKinesisStream would be to NOT log the entire batch of FlowFiles (PutKinesisStream.java, lines 263, 268, and 272).  For example:
    
    ```
        if (failedFlowFiles.size() > 0) {
            session.transfer(failedFlowFiles, PutKinesisStream.REL_FAILURE);
            getLogger().error("Failed to publish to kinesis {} records {}",
                    new Object[]{stream, failedFlowFiles});
        }
    ```
    With the default batch size of 250, 250 x FlowFile::toString() adds up to a very large block of text that makes it difficult to find the error.  I'm not sure how helpful the FlowFile records are.  I certainly recommend putting the exception first, and maybe leaving out the FlowFiles entirely?  Would a count of FlowFiles be OK?  A sketch of what I mean is below.
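    
    Something along these lines is roughly what I had in mind.  It is just a sketch against the snippet quoted above, not something I have run, and I'm not sure whether an exception object is still in scope at that point, so the sketch only logs a count and the stream name:
    
    ```
        if (failedFlowFiles.size() > 0) {
            session.transfer(failedFlowFiles, PutKinesisStream.REL_FAILURE);
            // Log a summary (count + stream) instead of toString()-ing every FlowFile in the batch.
            getLogger().error("Failed to publish {} records to Kinesis stream {}",
                    new Object[]{failedFlowFiles.size(), stream});
            // If a Throwable is available at this point, the logger's
            // error(String, Object[], Throwable) overload would get the exception
            // into the log as well, ahead of any per-record detail.
        }
    ```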

