David,

Thank you for the reasonable response to my questions. Much appreciated.

I'm not a huge fan of the MergeContent -> InvokeHTTP -> {} -> ListenHTTP -> UnpackContent approach to provide the same functionality. But I do acknowledge that's the most direct replacement option without PostHTTP. It's adding extra processors to the chain for something that is effectively a transport issue.

NiFi-to-NiFi using PostHTTP was a simple transport-oriented solution, and packing the data with MergeContent first isn't quite the same level of fidelity. You also miss the two-phase commit built into those extra bits. MergeContent is often a bit of a beast in-and-of-itself too. Flowfile attributes conveyed as HTTP headers definitely don't work for complex attribute values.

But yes, I know that the functionality exists (having some history with that processor myself).

Thanks again for the response.

/Adam

On Wed, Jan 11, 2023 at 9:27 PM Adam Taft <a...@adamtaft.com> wrote:

> Hi Matthew,
>
> > It's quite remarkable you're advocating against standard practice presumably
> > for your own convenience.
>
> Wow, absolutely not stated nor implied in my message. And even borderline offensive.
>
> What I asked was simply, why remove it, if it's not hurting anything. I agree with your statement that there is a (very small) cost for maintaining the component in the source tree. But PostHTTP is not in the same scope as compared to a component that has a dependency on an abandoned, insecure, or completely out of standards library (for example).
>
> PostHTTP has a reasonable use case (as I described) that is not directly matched with other processors. The two-phase commit protocol sitting between PostHTTP and ListenHTTP has proven to bear good fruit over many hardened years of use. I think it's a reasonable reply to my question to just simply suggest that the interaction between PostHTTP and ListenHTTP is just not supported by NiFi going forward. But please don't tell me my question/concern is "out of convenience."
>
> Documentation is lacking as to the rationale behind the deprecation of PostHTTP. I might be missing it; can you please send me the link to the rationale? That's what this thread is trying to address. It sounds like, from your answer, that the rationale is to reduce code footprint, which isn't the strongest argument for its removal given its established historical use. Seems like we'd want more than just reduced footprint for such a heavily used processor, no?
>
> /Adam
>
> On Wed, Jan 11, 2023 at 7:53 PM Matthew Hawkins <hawko2...@gmail.com> wrote:
>
>> Hi Adam,
>>
>> PostHTTP was marked deprecated 3 years ago (aka six technology lifetimes). The successive technologies to replace its functionality are well documented and proven in production. The technical reason to remove it is that it is superfluous code that has a cost to maintain and zero benefit. Backwards compatibility is never guaranteed for components marked deprecated for such a long length of time in any software product, let alone nifi specifically.
>>
>> Your organisation is free to continue using the version of nifi it is on today and not take any further action. It is unhelpful to suggest every other organisation should be held back in progress because yours refuses to take the necessary flow maintenance action. One impetus for a major version upgrade is specifically to jettison deprecated components. It's quite remarkable you're advocating against standard practice presumably for your own convenience.
>>
>> Site to site connectivity is conducted with either raw sockets or http (which is https on secured nifi), so I'm highly skeptical there is any performance degradation in InvokeHTTP or S2S over PostHTTP, given the former can take advantage of http/2 and the latter not. It's easy to monitor nifi and prove through metrics in any case. Sadly in enterprise environments this is sometimes necessary to defeat the political layer around change management.
>>
>> You can run records-based processing over either current method and it is ridiculously fast. The bottleneck in my last engagement ended up being network hardware limitations, not nifi. Having contributed in this domain, I also recommend tossing CompressContent into the flow to minimise bandwidth. On modern hardware the decompression is minimal time and you can plug a *lot* more data through in less CPU and wall clock time. It's easy to bench with DuplicateFlowFile on your production flow and metrics analysis, just make sure your provenance db has sufficient space.
>>
>> Kind regards,
>>
>> On Thu, Jan 12, 2023, 10:09 Adam Taft <a...@adamtaft.com> wrote:
>>
>> > Just wanted to note a concern on the deprecation (and presumed removal) of the PostHTTP processor in the upcoming 2.0 release.
>> >
>> > While yes, for traditional client interactions with an external HTTP service, utilizing InvokeHTTP for your POST operation is probably sensible, the concern is that there are a number of NiFi-to-NiFi transfers that leverage the "special sauce" that exists between PostHTTP and ListenHTTP.
>> >
>> > What special sauce? Namely, the extra negotiation that enables an automated serialization of NiFi flowfiles from one system to another. InvokeHTTP is just a "raw" HTTP client and doesn't share any special concern or support for NiFi-to-NiFi data transfer.
>> >
>> > Of course, if you remember the history, before there was any site-to-site functionality built into processor groups, the primary means of flowfile transport between NiFi systems was the PostHTTP / ListenHTTP combo. It was an easy way to facilitate transfer between two nifi systems.
>> >
>> > And from what I can tell, this "legacy" approach to NiFi data transfer is still being used heavily in certain operational contexts. Why? Because often it's the case that the _only_ traffic allowed between network boundaries is done via HTTPS. The site-to-site protocol provides its own ports and protocol operations that don't necessarily comply with such a network policy. And I believe there's still some lingering and/or demonstrated concern over the performance characteristics of the site-to-site protocol by dataflow managers. They have often reverted to using PostHTTP / ListenHTTP instead.
>> >
>> > While many of the other deprecated components seem logical, getting rid of this one just seems like change-for-the-sake-of-change.
>> >
>> > Is there any actual technical reason to deprecate and remove PostHTTP from the standard nar? Is it causing a burden to the product itself? Or was the decision just more like, "hey it's dumb not to use InvokeHTTP for all HTTP client operations" and maybe not realize the alternative use case that PostHTTP enables?
>> >
>> > Thanks for any feedback.
>> >
>> > /Adam