[ https://issues.apache.org/jira/browse/NIFI-12544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17803846#comment-17803846 ]
David Handermann edited comment on NIFI-12544 at 1/6/24 5:46 PM:
-----------------------------------------------------------------

Thanks for summarizing this potential new feature [~renlor].

On initial consideration, I do not recommend attempting to integrate Server-Sent Events into InvokeHTTP. The Processor is already rather complex, and as you noted, it is designed to send a request and receive a response for each invocation. The concept of receiving a stream of events certainly fits the pattern of other NiFi Processors, so the next question would be scoping the capabilities of such a Processor.

Perhaps something like ConsumeServerSentEvent could work as a general name. The ConsumeTwitter Processor is a more specific example of this type of concept, as the Twitter V2 API sends a continual stream of JSON objects over an HTTP response socket connection. The main question is whether this makes sense as a general Processor, versus something like ConsumeTwitter, targeted to a specific service.

Such a general Processor could build on the WebClientServiceProvider Controller Service to handle TLS and Proxy configuration, but it would still require some general configurable features, such as a Record Reader to handle various data formats. This is a case where it would be helpful to highlight some common SSE API examples, to see whether a generalized Processor makes sense, or whether the amount of complexity warrants more specific implementations.


was (Author: exceptionfactory):
Thanks for summarizing this potential new feature [~renlor].

On initial consider, I do not recommend attempting to integrate Server-Sent Events into InvokeHTTP. The Processor is already rather complex, and as you noted, it is designed to send a request and receive a response for each invocation. The concept of receiving a stream of events certainly fits the pattern of other NiFi Processors, so the next question would be scoping the capabilities of such a Processor.

Perhaps something like ConsumeServerSentEvent could work as a general name. The ConsumeTwitter Processor is a more specific example of this type of concept, as the Twitter V2 API sends a continual stream of JSON objects over an HTTP response socket connection. The main question is whether this makes sense as a general Processor, versus something like ConsumeTwitter, targeted to a specific service.

Such a general Processor could build on the WebClientServiceProvider Controller Service to handle TLS and Proxy configuration, but it would still require some general configurable features, such as a Record Reader to handle various data formats. This is a case where it would be helpful to highlight some common SSE API examples, to see whether a generalized Processor makes sense, or whether the amount of complexity warrants more specific implementations.

> Support receiving HTTP SSE (Server Sent Events)
> -----------------------------------------------
>
>                 Key: NIFI-12544
>                 URL: https://issues.apache.org/jira/browse/NIFI-12544
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Justin
>            Priority: Major
>
> Currently NiFi has no processor which supports long-poll type HTTP
> connections. InvokeHTTP expects that every HTTP request will terminate
> shortly and does not forward intermediate results for long-lived
> connections, instead hanging until the read timeout is reached and then
> failing the request. This makes it difficult to connect NiFi to any system
> which uses the SSE standard to stream events to clients as they happen,
> requiring a script processor or external tool to convert the SSE stream
> into something NiFi can handle. Since streaming data processing (streaming
> events) is NiFi's MO, it seems appropriate that there would be a native
> processor which can connect and stream these types of events.
> I also have need of such a processor and would like to upstream the
> changes, hence the ticket before I get too far into an implementation.
> A couple possibilities for implementing these changes:
> * Extend InvokeHTTP to support streaming content when enabled in processor
> options, so that it forwards content as it arrives, possibly broken apart
> via a delimiter/parse config. Connections would only time out when they
> make no progress within the read timeout limit.
> * Add a new SSE-specific processor which does all of the Proxy, Auth, ...
> work of InvokeHTTP, but outputs SSE events roughly as they arrive instead
> of a single HTTP response per input request. Should this accept
> configuration via FlowFile or be an INPUT_FORBIDDEN processor where
> everything is through Configuration? How do you close a stream
> programmatically if configured via FlowFile?
> * Create a new HTTP processor which has some level of support/awareness of
> content types (application/stream+json, application/x-ndjson,
> text/event-stream, ...) and doesn't time out long connections unless they
> aren't making progress. It would output blocks in some predefined size or
> by a delimiter and rely on downstream to decide what to do with the
> content.
> Ref: https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
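
For reference, the text/event-stream framing that the description refers to could be parsed roughly as follows. This is a minimal Python sketch based on the field rules of the WHATWG event stream format, not NiFi code and not a proposed implementation. The names ServerSentEvent and parse_sse are hypothetical, and the "retry" field and reconnection handling are left out.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class ServerSentEvent:
    """One dispatched event (hypothetical container, not a NiFi type)."""
    event: str = "message"
    data: str = ""
    id: Optional[str] = None


def parse_sse(lines: Iterable[str]) -> Iterator[ServerSentEvent]:
    """Yield one event per blank-line-terminated block of field lines."""
    event_type = ""
    data_lines: List[str] = []
    last_id: Optional[str] = None  # persists across events, as in the spec

    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line ends a block; blocks without data are discarded.
            if data_lines:
                yield ServerSentEvent(event_type or "message",
                                      "\n".join(data_lines), last_id)
            event_type, data_lines = "", []
            continue
        if line.startswith(":"):
            continue  # comment line, commonly used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # only a single leading space is stripped
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
        elif field == "id" and "\0" not in value:
            last_id = value
        # "retry" and unknown fields are ignored in this sketch


if __name__ == "__main__":
    sample = ["event: update\n", "data: first\n", "data: second\n", "\n"]
    for evt in parse_sse(sample):
        print(evt)
```

In a Processor along the lines discussed above, each yielded event would presumably become one FlowFile or one Record, with the event name and id carried as attributes. That mapping is an assumption for illustration only.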