Mark Payne created NIFI-1673:
--------------------------------
Summary: Investigate indexing provenance events 'by block'
Key: NIFI-1673
URL: https://issues.apache.org/jira/browse/NIFI-1673
Project: Apache NiFi
Issue Type: Improvement
Components: Core Framework
Reporter: Mark Payne
Assignee: Mark Payne
Currently, we index each provenance event individually. Unfortunately, indexing
provenance events can sometimes be the bottleneck of the flow when we have many
small FlowFiles. We should instead investigate batching together a block of,
say, 100 FlowFiles and then indexing the attributes, etc. for the entire block.
Then, when queried, we could get back a block of events and filter the events
afterward. This would, in theory, provide a trade-off that gives us better
indexing performance at the cost of query performance. This block size could
then be configurable to give administrators the ability to choose that
trade-off.
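
To make the proposed trade-off concrete, here is a minimal sketch of block-level
indexing. It is not NiFi's provenance repository code and does not use Lucene: a
plain in-memory map stands in for the index, and every name in it
(BlockIndexSketch, Event, blockSize, postingCount) is hypothetical. The index
stores each distinct attribute term once per block rather than once per event,
and a query first finds candidate blocks and then filters the events inside them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch only: an in-memory stand-in for a block-level index.
class BlockIndexSketch {

    /** Hypothetical stand-in for a provenance event: an id plus its attributes. */
    static final class Event {
        final long id;
        final Map<String, String> attributes;

        Event(long id, Map<String, String> attributes) {
            this.id = id;
            this.attributes = attributes;
        }
    }

    private final int blockSize;                                        // events per block (assumed configurable)
    private final List<List<Event>> blocks = new ArrayList<>();         // stored blocks, addressed by position
    private final Map<String, Set<Integer>> index = new HashMap<>();    // "key=value" -> ids of blocks containing it
    private List<Event> pending = new ArrayList<>();                    // events not yet indexed

    BlockIndexSketch(int blockSize) {
        if (blockSize < 1) {
            throw new IllegalArgumentException("blockSize must be >= 1");
        }
        this.blockSize = blockSize;
    }

    /** Buffers an event; the block is indexed only once it is full. */
    void add(Event event) {
        pending.add(event);
        if (pending.size() >= blockSize) {
            flush();
        }
    }

    /** Indexes the distinct attribute terms of the pending events as one block. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        int blockId = blocks.size();
        Set<String> terms = new HashSet<>();
        for (Event event : pending) {
            for (Map.Entry<String, String> attr : event.attributes.entrySet()) {
                terms.add(attr.getKey() + "=" + attr.getValue());
            }
        }
        for (String term : terms) {
            index.computeIfAbsent(term, t -> new TreeSet<>()).add(blockId);
        }
        blocks.add(pending);
        pending = new ArrayList<>();
    }

    /** Looks up candidate blocks, then filters the events inside each one. */
    List<Event> query(String key, String value) {
        List<Event> hits = new ArrayList<>();
        Set<Integer> candidates =
                index.getOrDefault(key + "=" + value, Collections.<Integer>emptySet());
        for (int blockId : candidates) {
            for (Event event : blocks.get(blockId)) {
                if (value.equals(event.attributes.get(key))) {
                    hits.add(event);
                }
            }
        }
        return hits;
    }

    /** Number of (term, block) entries held by the index. */
    int postingCount() {
        int count = 0;
        for (Set<Integer> blockIds : index.values()) {
            count += blockIds.size();
        }
        return count;
    }

    public static void main(String[] args) {
        BlockIndexSketch idx = new BlockIndexSketch(100);
        for (long id = 0; id < 1000; id++) {
            Map<String, String> attrs = new HashMap<>();
            attrs.put("mime.type", "text/plain");           // shared by every event
            attrs.put("filename", "file-" + id + ".txt");   // unique per event
            idx.add(new Event(id, attrs));
        }
        idx.flush();
        System.out.println("index entries: " + idx.postingCount());
        System.out.println("hits: " + idx.query("filename", "file-42.txt").size());
    }
}
```

In this sketch the savings come from attribute values that repeat within a
block, which are indexed once per block instead of once per event. The cost
shows up in query(): every matching block has to be scanned in full, so a larger
blockSize means more filtering work per query. Events that are still pending are
also not searchable until their block is flushed, which a real implementation
would need to account for.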