plexaikm commented on a change in pull request #3472: NIFI-6294 Support for
flowfile attribute in TABLE_NAME
URL: https://github.com/apache/nifi/pull/3472#discussion_r284287365
##########
File path:
nifi-nar-bundles/nifi-kudu-bundle/nifi-kudu-processors/src/main/java/org/apache/nifi/processors/kudu/PutKudu.java
##########
@@ -307,9 +302,11 @@ private void trigger(final ProcessContext context, final ProcessSession session,
final List<RowError> pendingRowErrors = new ArrayList<>();
for (FlowFile flowFile : flowFiles) {
try (final InputStream in = session.read(flowFile);
- final RecordReader recordReader = recordReaderFactory.createRecordReader(flowFile, in, getLogger())) {
+ final RecordReader recordReader = recordReaderFactory.createRecordReader(flowFile, in, getLogger())) {
final List<String> fieldNames = recordReader.getSchema().getFieldNames();
final RecordSet recordSet = recordReader.createRecordSet();
+ final String tableName = context.getProperty(TABLE_NAME).evaluateAttributeExpressions(flowFile).getValue();
+ final KuduTable kuduTable = kuduClient.openTable(tableName);
Review comment:
The connection to Kudu is still acquired in onSchedule (so all the
crypto-heavy work stays there); openTable essentially just retrieves the
table metadata from the Kudu server over the existing connection.
We observed no visible performance degradation in our environment when
loading ~60 million records from 600 flow files with a batch size of 10000.
For many small flow files with only a few rows each, this change will add
one round trip to Kudu per flow file for metadata retrieval.
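
If that extra round trip ever becomes a problem for small flow files, one possible mitigation is to cache the opened table per evaluated table name. The sketch below is only an illustration and is not part of this PR: `TableHandleCache` is a hypothetical helper, and it takes a generic opener function instead of calling the Kudu client directly, so the checked exception thrown by `openTable` would have to be wrapped by the caller.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical helper (not in the PR): keeps one handle per table name so
 * that repeated flow files targeting the same table reuse the handle
 * instead of asking the server for metadata again.
 */
final class TableHandleCache<T> {
    private final Map<String, T> handles = new ConcurrentHashMap<>();
    private final Function<String, T> opener;

    /** @param opener assumed to wrap something like kuduClient.openTable(name) */
    TableHandleCache(final Function<String, T> opener) {
        this.opener = opener;
    }

    /** Returns the cached handle, invoking the opener only on the first request for a name. */
    T get(final String tableName) {
        return handles.computeIfAbsent(tableName, opener);
    }

    /** Drops all cached handles, e.g. when the processor is stopped. */
    void clear() {
        handles.clear();
    }
}
```

In the processor this would presumably be used as `cache.get(tableName)` inside the flow file loop, with `clear()` called when the processor stops. The trade-off is that a cached handle would not reflect schema changes made after it was opened, so this is only worth it if the per-flow-file metadata trip is actually measurable.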