[
https://issues.apache.org/jira/browse/NIFI-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joseph Witt updated NIFI-4675:
------------------------------
Fix Version/s: (was: 1.5.0)
> PublishKafka_0_10 can't use demarcator and kafka key at the same time
> ---------------------------------------------------------------------
>
> Key: NIFI-4675
> URL: https://issues.apache.org/jira/browse/NIFI-4675
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Affects Versions: 1.2.0
> Reporter: Jasper Knulst
> Labels: performance
>
> At the moment you cannot split a flowfile using a demarcator AND set the
> Kafka key (the kafka.key attribute) for all resulting Kafka records at the
> same time. The code explicitly prevents this.
> Still, it would be a valuable performance boost to be able to use both at
> the same time in all cases where one flowfile contains many individual
> Kafka records. Flowfiles would not have to be pre-split (which multiplies
> NiFi overhead) just to set the key.
> Note:
> Using a demarcator and a Kafka key at the same time will normally give every
> Kafka record produced from one incoming flowfile the same Kafka key
> (see REMARK).
> I know of a live NiFi deployment where this fix/feature (provided as a
> custom fix) led to a 500-600% increase in throughput. Others could and
> should benefit as well.
> REMARK
> The argument against this feature has been that it is not a good idea to
> intentionally generate many duplicate Kafka keys. I would argue that it is up
> to the user to decide. Most would use Kafka as a pure distributed log system,
> where key uniqueness is not important. The Kafka key can be a really valuable
> grouping placeholder, though. The only case where this becomes problematic
> is compaction of Kafka topics, when Kafka keys are deduplicated. But once
> we put sufficient warnings and disclaimers about this risk in the tooltips,
> it is up to the user to decide whether to use the performance booster.
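
The behaviour requested in the description can be illustrated with a minimal, self-contained sketch. It is not NiFi's actual code and does not use the Kafka client: the class DemarcatedKeySketch, the KeyedRecord type and the split method are hypothetical names chosen for illustration. It only shows the idea of cutting one flowfile's content on a demarcator and attaching the same key to every resulting record.

```java
import java.util.ArrayList;
import java.util.List;

public class DemarcatedKeySketch {

    /** Hypothetical stand-in for a Kafka record: one key, one value. */
    static final class KeyedRecord {
        final String key;
        final String value;

        KeyedRecord(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Splits the content on the demarcator and gives every non-empty piece
     * the same key (assumed here to come from the kafka.key attribute).
     */
    static List<KeyedRecord> split(String content, String demarcator, String key) {
        if (demarcator.isEmpty()) {
            throw new IllegalArgumentException("demarcator must not be empty");
        }
        List<KeyedRecord> records = new ArrayList<>();
        int start = 0;
        while (start <= content.length()) {
            int end = content.indexOf(demarcator, start);
            if (end < 0) {
                end = content.length(); // last piece runs to the end of the content
            }
            String value = content.substring(start, end);
            if (!value.isEmpty()) {
                records.add(new KeyedRecord(key, value));
            }
            start = end + demarcator.length();
        }
        return records;
    }

    public static void main(String[] args) {
        // One flowfile holding three newline-demarcated messages, one shared key.
        for (KeyedRecord r : split("a\nb\nc", "\n", "sensor-42")) {
            System.out.println(r.key + " -> " + r.value);
        }
    }
}
```

Because every record from one flowfile carries the same key, all of them would normally be routed to the same partition, and on a compacted topic only the latest record per key is retained, which is the risk the REMARK refers to.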
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)