[ https://issues.apache.org/jira/browse/NIFI-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189597#comment-17189597 ]
ASF subversion and git services commented on NIFI-7740:
-------------------------------------------------------
Commit c10bd4990bfcc5f5fd17c3eefdb03801e7a036a9 in nifi's branch
refs/heads/support/nifi-1.12.x from Matt Burgess
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=c10bd49 ]
NIFI-7740: Add Records Per Transaction and Transactions Per Batch properties to
PutHive3Streaming
NIFI-7740: Incorporated review comments
NIFI-7740: Restore RecordsEOFException superclass to SerializationError
This closes #4489.
Signed-off-by: Peter Turcsanyi <[email protected]>
> Add Records Per Transaction and Transactions Per Batch to PutHive3Streaming
> ---------------------------------------------------------------------------
>
> Key: NIFI-7740
> URL: https://issues.apache.org/jira/browse/NIFI-7740
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Matt Burgess
> Assignee: Matt Burgess
> Priority: Major
> Fix For: 1.13.0, 1.12.1
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> The original PutHiveStreaming (for Hive 1.2.x) exposed properties to the user
> for tuning the number of records in an individual Hive Streaming transaction,
> as well as the number of transactions to be batched together (for
> performance).
> These properties should be exposed in the PutHive3Streaming processor in
> order to tune its performance. The default values should preserve the
> current behavior: a setting of zero for Records Per Transaction puts all
> records into a single transaction, and a setting of 1 for Transactions Per
> Batch results in a single transaction in each batch.
> For large files, Records Per Transaction should be set to something more
> manageable, such as 100,000, and Transactions Per Batch to something such
> as 10. As a rule, the product of the two numbers should be larger than the
> largest expected number of records in the flow file(s); this ensures that
> any failed transaction batch causes a full rollback. The documentation for
> these properties should include this prescription.
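
The sizing rule in the description above can be checked with simple arithmetic. The following is a minimal sketch, not the processor's implementation: the function names plan_transactions and fits_in_one_batch are hypothetical, and the treatment of an empty flow file is an assumption. It only counts how many transactions and batches a given record count would need under the two settings, treating a Records Per Transaction of zero as "all records in one transaction" as the description states.

```python
def plan_transactions(total_records, records_per_transaction, transactions_per_batch):
    """Return (transactions, batches) needed for one flow file.

    Hypothetical helper for illustration; it mirrors the arithmetic in the
    issue description, not the actual PutHive3Streaming code.
    """
    if total_records < 0 or records_per_transaction < 0 or transactions_per_batch < 1:
        raise ValueError("counts must be non-negative and batches at least 1")

    if records_per_transaction == 0:
        # Zero means "no limit": every record goes into a single transaction.
        # Assumption: an empty flow file needs no transaction at all.
        transactions = 1 if total_records > 0 else 0
    else:
        # Integer ceiling division, avoiding floating-point rounding.
        transactions = (total_records + records_per_transaction - 1) // records_per_transaction

    batches = (transactions + transactions_per_batch - 1) // transactions_per_batch
    return transactions, batches


def fits_in_one_batch(total_records, records_per_transaction, transactions_per_batch):
    """True when all records of the flow file land in a single transaction batch."""
    _, batches = plan_transactions(total_records, records_per_transaction, transactions_per_batch)
    return batches <= 1


if __name__ == "__main__":
    # Suggested settings from the description: 100,000 records x 10 transactions.
    print(plan_transactions(1_000_000, 100_000, 10))
    print(fits_in_one_batch(2_500_000, 100_000, 10))
```

With the suggested settings, a flow file of up to 1,000,000 records stays within one batch, while a larger one spills into further batches, which is the case the description recommends avoiding.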
--
This message was sent by Atlassian Jira
(v8.3.4#803005)