devinbost commented on issue #5116: Provide a flag for message properties pass 
through in functions
URL: https://github.com/apache/pulsar/issues/5116#issuecomment-528062821
 
 
   I was about to file a new feature request when I saw this issue. 
   Here is an example use case that may help illustrate the value of this 
feature.
   
   Let's say we have a pipeline like this:
   
   > topic1 -> function1 -> topic2 -> function2 -> topic3 -> function3 -> . . . 
-> topicN -> functionN
   
   Let's say that I want to create a message tracing capability for an 
application that sends a test message through the pipeline and listens for 
reports of how far the message got in the pipeline. (Think of the X-Ray 
tracing feature for AWS Lambda.) If we can tag a message at the start of the 
pipeline, then our functions (which import from a common function override) 
can automatically watch for messages with the given tag and report to our 
application whether they received the message. If we could also attach a 
timestamp up front, then each subsequent function could compute and report 
the latency from the start of the pipeline up to that function. 
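   
   To make the idea concrete, here is a minimal sketch in plain Python. It 
does not use the Pulsar Functions API; it only models messages as a payload 
plus a properties dict. The property names `trace-id` and `trace-start-ms`, 
the `traced` wrapper, and the `trace_reports` list (standing in for the 
tracing application) are all hypothetical names chosen for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

TRACE_ID_KEY = "trace-id"            # hypothetical property name
TRACE_START_KEY = "trace-start-ms"   # hypothetical property name


@dataclass
class Message:
    payload: str
    properties: Dict[str, str] = field(default_factory=dict)


# (trace_id, stage_name, latency_ms) tuples; stands in for the tracing application.
trace_reports: List[Tuple[str, str, float]] = []


def traced(stage_name: str, fn: Callable[[str], str]) -> Callable[[Message], Message]:
    """Wrap a payload-only function so it reports tagged messages and
    copies the input properties onto its output."""
    def wrapper(msg: Message) -> Message:
        trace_id = msg.properties.get(TRACE_ID_KEY)
        if trace_id is not None:
            # Latency from the start of the pipeline up to this stage.
            start_ms = float(msg.properties[TRACE_START_KEY])
            latency_ms = time.time() * 1000.0 - start_ms
            trace_reports.append((trace_id, stage_name, latency_ms))
        # Property pass-through: without this copy the tag is lost after one hop.
        return Message(fn(msg.payload), dict(msg.properties))
    return wrapper


def run_pipeline(msg: Message, stages: List[Callable[[Message], Message]]) -> Message:
    """Feed the output of each stage into the next one."""
    for stage in stages:
        msg = stage(msg)
    return msg


stages = [
    traced("function1", str.upper),
    traced("function2", lambda p: p + "!"),
    traced("function3", str.strip),
]

# Tag the test message once, at the start of the pipeline.
test_msg = Message(
    " ping ",
    {TRACE_ID_KEY: "t-1", TRACE_START_KEY: str(time.time() * 1000.0)},
)
result = run_pipeline(test_msg, stages)
```

   The copy of `msg.properties` onto the output is the step that the 
requested flag would make automatic; the reporting logic only works if every 
hop preserves the tag and the start timestamp.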
   
   As another possible (though similar) application, we could randomly sample 
one out of every 1,000 messages (or some other rate) passing through a given 
pipeline for health, performance, or data contract checks to ensure that 
messages behave as expected across the many possible logical paths through 
the pipeline. For example, if multiple actors are producing to a single 
topic, this sampling could expose cases where a percentage of messages are 
malformed (but not malformed enough to create a backlog), such as where a 
particular field is an empty string when it shouldn't be. Extensive 
validation may be too expensive to perform on every message in a 
high-velocity pipeline, but inexpensive to perform on a sample of incoming 
messages. Sampling would also allow us to compute end-to-end pipeline 
latencies and other statistics accurately, which would be useful for SLAs.
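
   The sampling idea can be sketched the same way, again in plain Python 
rather than against the Pulsar API. The names `validate`, `sample_and_check`, 
and `SAMPLE_RATE` are hypothetical, and the "empty required field" rule is 
just the example check mentioned above.

```python
import random
from typing import Dict, Iterable, List, Optional

SAMPLE_RATE = 1 / 1000  # the 1-in-1,000 example rate; tune as needed


def validate(record: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return the names of required fields that are missing or empty."""
    return [name for name in required if not record.get(name)]


def sample_and_check(
    records: Iterable[Dict[str, str]],
    required: List[str],
    rate: float = SAMPLE_RATE,
    rng: Optional[random.Random] = None,
) -> Dict[str, int]:
    """Run the (possibly expensive) validation on a random sample only."""
    rng = rng or random.Random()
    sampled = 0
    malformed = 0
    for record in records:
        if rng.random() >= rate:
            continue  # skip the check for the vast majority of messages
        sampled += 1
        if validate(record, required):
            malformed += 1
    return {"sampled": sampled, "malformed": malformed}
```

   The ratio of `malformed` to `sampled` then estimates the share of bad 
messages on the topic without validating every message.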

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
