Hi Adam,

I know this thread was opened over a month ago, but we recently had to move FlowFiles, including both attributes and content, from one NiFi cluster to another and could not build on the built-in Site-to-Site transfer mechanisms due to network restrictions between the clusters.
We've built upon an existing solution from a community member which has been dormant for some time. It uses a pair of custom processors to transfer FlowFile content and attributes over raw TCP connections. You can find the solution under its name "nifi-flow-over-tcp" both on GitHub and on Maven Central.

githubDOTcom/EndzeitBegins/nifi-flow-over-tcp

Maybe this can be helpful to you as well in the cases you mentioned, where you previously made use of the PostHTTP processor.

Best regards

Adam Taft <a...@adamtaft.com> wrote on Thu, Jan 12, 2023, 05:39:

> David,
>
> Thank you for the reasonable response to my questions. Much appreciated.
>
> I'm not a huge fan of the MergeContent -> InvokeHTTP -> {} -> ListenHTTP ->
> UnpackContent approach to provide the same functionality. But I do
> acknowledge that's the most direct replacement option without PostHTTP.
> It's adding extract processors to the chain for something that is
> effectively a transport issue. NiFi-to-Nifi using PostHTTP was a simple
> transport-oriented solution, and packing the data with MergeContent first
> isn't quite the same level of fidelity. You also miss the two-phase commit
> built into those extra bits. MergeContent is often a bit of a beast
> in-and-of-itself too.
>
> Flowfile attributes conveyed as HTTP headers definitely don't work for
> complex attribute values. But yes, I know that the functionality exists
> (having some history with that processor myself).
>
> Thanks again for the response.
>
> /Adam
>
> On Wed, Jan 11, 2023 at 9:27 PM Adam Taft <a...@adamtaft.com> wrote:
>
> > Hi Mathew,
> >
> > > It's quite remarkable you're advocating against standard practice
> > > presumably for your own convenience.
> >
> > Wow, absolutely not stated nor implied in my message. And even borderline
> > offensive.
> >
> > What I asked was simply, why remove it, if it's not hurting anything.
> > I agree with your statement that there is a (very small) cost for
> > maintaining the component in the source tree. But PostHTTP is not in the
> > same scope as compared to a component that has a dependency on an
> > abandoned, insecure, or completely out of standards library (for example).
> >
> > PostHTTP has a reasonable use case (as I described) that is not directly
> > matched with other processors. The two-phase commit protocol sitting
> > between PostHTTP and ListenHTTP has demonstrated to bear good fruit over
> > many hardened years of use. I think it's a reasonable reply to my
> > question to just simply suggest that the interaction between PostHTTP and
> > ListenHTTP is just not supported by NiFi going forward. But please don't
> > tell me my question/concern is "out of convenience."
> >
> > There is lacking documentation as to the rationale behind the deprecation
> > of PostHTTP. I might be missing it, can you please send me the link to
> > the rationale? That's what this thread is trying to address. It sounds
> > like, from your answer, that the rationale is to reduce code footprint,
> > which isn't the strongest argument for its removal given its established
> > historical use. Seems like we'd want more than just reduced footprint for
> > such a heavily used processor, no?
> >
> > /Adam
> >
> > On Wed, Jan 11, 2023 at 7:53 PM Matthew Hawkins <hawko2...@gmail.com>
> > wrote:
> >
> >> Hi Adam,
> >>
> >> PostHTTP was marked deprecated 3 years ago (aka six technology
> >> lifetimes). The successive technologies to replace it's functionality
> >> are well documented and proven in production. The technical reason to
> >> remove it is that it is superfluous code that has a cost to maintain and
> >> zero benefit. Backwards compatibility is never guaranteed for components
> >> marked deprecated for such a long length of time in any software product
> >> let alone nifi specifically.
> >>
> >> Your organisation is free to continue using the version of nifi it is on
> >> today and not take any further action. It is unhelpful to suggest every
> >> other organisation should be held back in progress because yours refuses
> >> to take the necessary flow maintenance action. One of the impetus for a
> >> major version upgrade is to specifically jettison deprecated components.
> >> It's quite remarkable you're advocating against standard practice
> >> presumably for your own convenience.
> >>
> >> Site to site connectivity is conducted with either raw sockets or http
> >> (which is https on secured nifi) so I'm highly skeptical there is any
> >> performance deprecation in InvokeHTTP or S2S over PostHTTP, given the
> >> former can take advantage of http/2 and the latter not. It's easy to
> >> monitor nifi and prove through metrics in any case. Sadly in enterprise
> >> environments this is sometimes necessary to defeat the political layer
> >> around change management.
> >>
> >> You can run records-based processing over either current method and it
> >> is ridiculously fast. The bottleneck in my last engagement ended up being
> >> network hardware limitations, not nifi. Having contributed in this
> >> domain, I also recommend tossing CompressContent into the flow to
> >> minimise bandwidth. On modern hardware the decompression is minimal time
> >> and you can plug a *lot* more data through in less CPU and wall clock
> >> time. It's easy to bench with DuplicateFlowfile on your production flow
> >> and metrics analysis, just make sure your provenance db has sufficient
> >> space.
> >>
> >> Kind regards,
> >>
> >> On Thu, Jan 12, 2023, 10:09 Adam Taft <a...@adamtaft.com> wrote:
> >>
> >> > Just wanted to note a concern on the deprecation (and presumed
> >> > removal) of the PostHTTP processor in the upcoming 2.0 release.
> >> >
> >> > While yes, for traditional client interactions with an external HTTP
> >> > service, utilizing InvokeHTTP for your POST operation is probably
> >> > sensible. The concern is that there are a number of NiFi-to-NiFi
> >> > transfers that leverage the "special sauce" that exists between
> >> > PostHTTP and ListenHTTP.
> >> >
> >> > What special sauce? Namely, the extra negotiation that enables an
> >> > automated serialization of NiFi flowfiles from one system to another.
> >> > InvokeHTTP is just a "raw" HTTP client and doesn't share any special
> >> > concern or support for NiFi-to-NiFi data transfer.
> >> >
> >> > Of course, if you remember the history, before there was any
> >> > site-to-site functionality built into processor groups, the primary
> >> > means of flowfile transport between NiFi systems was the PostHTTP /
> >> > ListenHTTP combo. It was an easy way to facilitate transfer between
> >> > two nifi systems.
> >> >
> >> > And from what I can tell, this "legacy" approach to NiFi data transfer
> >> > is still being used heavily in certain operational contexts. Why?
> >> > Because often it's the case that the _only_ traffic allowed between
> >> > network boundaries is done via HTTPS. The site-to-site protocol
> >> > provides its own ports and protocol operations that don't necessarily
> >> > comply with such a network policy. And I believe there's still some
> >> > lingering and/or demonstrated concern over the performance
> >> > characteristics of the site-to-site protocol by dataflow managers.
> >> > They have often reverted to using PostHTTP / ListenHTTP instead.
> >> >
> >> > While many of the other deprecated components seem logical, getting
> >> > rid of this one just seems like change-for-the-sake-of-change.
> >> >
> >> > Is there any actual technical reason to deprecate and remove PostHTTP
> >> > from the standard nar? Is it causing a burden to the product itself?
> >> > Or was the decision just more like, "hey it's dumb not to use
> >> > InvokeHTTP for all HTTP client operations" and maybe not realize the
> >> > alternative use case that PostHTTP enables?
> >> >
> >> > Thanks for any feedback.
> >> >
> >> > /Adam