[
https://issues.apache.org/jira/browse/NIFI-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17938986#comment-17938986
]
Dariusz Seweryn commented on NIFI-14394:
----------------------------------------
The maximum value of an unsigned long is 2^64 - 1 = 18,446,744,073,709,551,615,
which occupies at most 20 characters. {{String.format("%s%020d", sequenceNumber,
subSequenceNumber)}}, when wrapped into a {{BigInteger}}, allows comparing the
{{shardedSequenceNumber}} of any two messages and always yields the correct order,
whether they share the same {{sequenceNumber}} or not.
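A minimal sketch of that comparison, assuming the sequence number arrives as a
decimal string and the sub sequence number as a non-negative long. The class
name {{ShardedSequenceNumber}}, the helper {{of}}, and the short sample sequence
numbers are hypothetical and only serve the illustration; they are not part of
the processor.

```java
import java.math.BigInteger;

public class ShardedSequenceNumber {

    // Hypothetical helper: concatenates the decimal sequence number with the
    // sub sequence number zero-padded to 20 digits, then parses the result.
    static BigInteger of(String sequenceNumber, long subSequenceNumber) {
        if (subSequenceNumber < 0) {
            // A negative value would put a '-' inside the padded field.
            throw new IllegalArgumentException("subSequenceNumber must be >= 0");
        }
        return new BigInteger(String.format("%s%020d", sequenceNumber, subSequenceNumber));
    }

    public static void main(String[] args) {
        // Illustrative values only; real sequence numbers are much longer.
        BigInteger a = of("100", 5L);   // same sequenceNumber, earlier sub record
        BigInteger b = of("100", 12L);  // same sequenceNumber, later sub record
        BigInteger c = of("101", 0L);   // later sequenceNumber

        System.out.println(a.compareTo(b) < 0); // a sorts before b
        System.out.println(b.compareTo(c) < 0); // b sorts before c
    }
}
```

The fixed 20-digit padding means the combined value equals
sequenceNumber * 10^20 + subSequenceNumber, so the sequence number always
dominates the comparison and the sub sequence number only breaks ties.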
> ConsumeKinesisStream support for Sub Sequence Number
> ----------------------------------------------------
>
> Key: NIFI-14394
> URL: https://issues.apache.org/jira/browse/NIFI-14394
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Affects Versions: 2.3.0
> Reporter: Dariusz Seweryn
> Priority: Major
>
> ConsumeKinesisStream supports "Output Strategy" of "Use Wrapper" which wraps
> message data and metadata into record content.
> Currently metadata includes:
> * stream
> * shardId
> * sequenceNumber
> * partitionKey
> * approximateArrival
> What it does not include is the Sub Sequence Number, which is assigned to messages
> that [are aggregated by Amazon's Kinesis Producer
> Library|https://docs.aws.amazon.com/streams/latest/dev/kinesis-producer-adv-aggregation.html].
> Without it, such messages are indistinguishable by their metadata.
> Proposal is to add to wrapped metadata:
> * subSequenceNumber — a long
> * shardUniqueId — a String formed as `String.format("%s%020d",
> sequenceNumber, subSequenceNumber)` which could be useful for uniquely
> identifying messages and determining their order using a single comparison