[ https://issues.apache.org/jira/browse/NIFI-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16280906#comment-16280906 ]
ASF GitHub Bot commented on NIFI-4675:
--------------------------------------
Github user joewitt commented on the issue:
https://github.com/apache/nifi/pull/2326
Given your description of the user understanding the decision they're
making, I think I agree with you.
@markap14 what do you think?
@jasper-k we'll want to make sure the Kafka 0.10, 0.11, and 1.0 processors
are all updated for this, but let's see what Mark thinks too.
Thanks for contributing and sharing your findings.
> PublishKafka_0_10 can't use demarcator and kafka key at the same time
> ---------------------------------------------------------------------
>
> Key: NIFI-4675
> URL: https://issues.apache.org/jira/browse/NIFI-4675
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Affects Versions: 1.2.0
> Reporter: Jasper Knulst
> Labels: performance
>
> At the moment you can't split up a flowfile using a demarcator AND set the
> Kafka key (kafka.key) attribute for all resulting Kafka records at the same
> time. The code explicitly prevents this.
> Still, it would be a valuable performance booster to be able to use both at
> the same time whenever one flowfile contains many individual Kafka records.
> Flowfiles would then not have to be pre-split (an explosion of NiFi
> overhead) just to set the key.
> Note:
> Using a demarcator and a Kafka key at the same time will normally cause
> every Kafka record produced from one incoming flowfile to have the same
> Kafka key (see REMARK).
> I know of a live NiFi deployment where this fix/feature (provided as a
> custom fix) led to a 500-600% increase in throughput. Others could and
> should benefit as well.
> REMARK
> The argument against this feature has been that it is not a good idea to
> intentionally generate many duplicate Kafka keys. I would argue that this
> is up to the user to decide. Most use Kafka as a pure distributed log
> system, where key uniqueness is not important. The Kafka key can be a
> really valuable grouping placeholder, though. The only case where this
> gets problematic is compaction of Kafka topics, where records are
> deduplicated by key. But once we put sufficient warnings and disclaimers
> about this risk in the tooltips, it is up to the user to decide whether to
> use the performance booster.
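For illustration, the behavior requested above amounts to splitting the flowfile payload on the demarcator and attaching the flowfile's single kafka.key value to every resulting record. This is a minimal Python sketch with made-up names, not NiFi's actual implementation:

```python
def split_with_key(flowfile_content: bytes, demarcator: bytes, key: bytes):
    """Split a flowfile's content on the demarcator and pair every
    resulting record with the same Kafka key.

    Hypothetical sketch of the requested PublishKafka behavior; the
    function name and signature are illustrative, not NiFi's API.
    """
    # One flowfile body becomes many record values.
    records = [chunk for chunk in flowfile_content.split(demarcator) if chunk]
    # Every record produced from this flowfile shares the one kafka.key value,
    # which is exactly the duplicate-key situation the REMARK discusses.
    return [(key, record) for record in records]

# One flowfile holding three newline-demarcated records:
pairs = split_with_key(b"rec1\nrec2\nrec3", b"\n", b"device-42")
# -> [(b"device-42", b"rec1"), (b"device-42", b"rec2"), (b"device-42", b"rec3")]
```

Compared with pre-splitting into three flowfiles upstream, this keeps one flowfile (one provenance event, one repository entry) while still producing per-record keys on the Kafka side, which is where the throughput gain comes from.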
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)