Re: new PackageFlowFile processor
Flow File Packager v3. You can find the source here:

https://github.com/apache/nifi/blob/main/nifi-commons/nifi-flowfile-packager/src/main/java/org/apache/nifi/util/FlowFilePackagerV3.java

It's a serialization format used for writing a flowfile (content and attributes) to a stream (network, file, etc.). It's a simple binary format: effectively the attributes serialized as key/value pairs, followed by the content. Byte-size markers are written at the start of each field, so that a deserializer can read each value into a byte array (or equivalent).

Flow File Packager v3 is primarily used in the MergeContent processor (for bundling) and the UnpackContent processor (for extraction). The (deprecated) PostHTTP processor and the ListenHTTP processor also support this format somewhat transparently, which enables two NiFi systems to send a serialized flowfile across the wire using HTTP.

You might see this format named "FlowFile Stream v3" or "flowfile-stream-v3" when looking at either MergeContent or UnpackContent.

On Fri, Sep 8, 2023 at 2:14 PM Russell Bateman wrote:
> Uh, sorry, "Version 3" refers to what exactly?
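The layout described in this message (attributes as key/value pairs, each field preceded by a size marker, followed by the content) can be sketched roughly as below. This is a simplified illustration only, not the actual FlowFilePackagerV3 wire format: the class name, the 4-byte and 8-byte length prefixes, and the absence of any header are assumptions made for the example. The linked source is the authority on the real layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a length-prefixed layout; NOT the real FlowFilePackagerV3 format.
class LengthPrefixedSketch {

    // Writes: attribute count, then each key and value as (length, bytes),
    // then the content as (length, bytes).
    static byte[] pack(Map<String, String> attributes, byte[] content) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(attributes.size());
            for (Map.Entry<String, String> entry : attributes.entrySet()) {
                writeField(out, entry.getKey());
                writeField(out, entry.getValue());
            }
            out.writeLong(content.length);
            out.write(content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the attributes into attributesOut and returns the content bytes.
    static byte[] unpack(byte[] packaged, Map<String, String> attributesOut) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(packaged))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String key = readField(in);
                String value = readField(in);
                attributesOut.put(key, value);
            }
            // Sketch assumption: the content is small enough to fit in one array.
            byte[] content = new byte[(int) in.readLong()];
            in.readFully(content);
            return content;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The size marker goes first, so the reader knows how many bytes to pull.
    private static void writeField(DataOutputStream out, String text) throws IOException {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    private static String readField(DataInputStream in) throws IOException {
        byte[] bytes = new byte[in.readInt()];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("filename", "a.txt");
        byte[] packaged = pack(attributes, "hello".getBytes(StandardCharsets.UTF_8));

        Map<String, String> restored = new LinkedHashMap<>();
        byte[] content = unpack(packaged, restored);
        System.out.println(restored + " " + new String(content, StandardCharsets.UTF_8));
    }
}
```

Because every field carries its own size, several packaged flowfiles can be written back to back in one stream and read out again one at a time, which is the property the bundling use case relies on.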
Re: new PackageFlowFile processor
Uh, sorry, "Version 3" refers to what exactly?

On 9/8/23 12:48, David Handermann wrote:
> I agree that this would be a useful general feature. I also agree with
> Joe that format support should be limited to *Version 3* due to the
> limitations of the earlier versions.
>
> This is definitely something that would be useful on the 1.x support
> branch to provide a smooth upgrade path for NiFi 2.
>
> This general topic also came up on the dev channel on the Apache NiFi
> Slack group:
>
> https://apachenifi.slack.com/archives/C0L9S92JY/p1692115270146369
>
> One key thing to note from that discussion is supporting
> interoperability with services outside of NiFi. That may be too much
> of a stretch for an initial implementation, but it is something I am
> planning to evaluate as time allows.
>
> For now, something focused narrowly on FlowFile Version 3 encoding
> seems like the best approach.
>
> I recommend referencing this discussion in a new Jira issue and
> outlining the general design goals.
>
> Regards,
> David Handermann
Re: new PackageFlowFile processor
I agree that this would be a useful general feature. I also agree with Joe that format support should be limited to Version 3 due to the limitations of the earlier versions.

This is definitely something that would be useful on the 1.x support branch to provide a smooth upgrade path for NiFi 2.

This general topic also came up on the dev channel on the Apache NiFi Slack group:

https://apachenifi.slack.com/archives/C0L9S92JY/p1692115270146369

One key thing to note from that discussion is supporting interoperability with services outside of NiFi. That may be too much of a stretch for an initial implementation, but it is something I am planning to evaluate as time allows.

For now, something focused narrowly on FlowFile Version 3 encoding seems like the best approach.

I recommend referencing this discussion in a new Jira issue and outlining the general design goals.

Regards,
David Handermann

On Fri, Sep 8, 2023 at 1:11 PM Adam Taft wrote:
> And also ... if we can land this in a 1.x release, this would help
> tremendously to those who are going to need a replacement for PostHTTP
> and don't want to "go dark" when they make the transition.
>
> That is, without this processor in 1.x, when a user upgrades from 1.x
> to 2.x, they will either have to have a MergeContent/InvokeHTTP
> solution in place already to replace PostHTTP, or they will have to
> take a (hopefully short) outage when they bring their canvas back up
> (removing PostHTTP and replacing with PackageFlowFile + InvokeHTTP).
>
> With this processor in 1.x, they can make that transition while
> PostHTTP is still available on their canvas. Wishful thinking that we
> can make the entire journey from 1.x to 2.x as smooth as possible, but
> this could potentially help some.
Re: new PackageFlowFile processor
And also ... if we can land this in a 1.x release, this would help tremendously to those who are going to need a replacement for PostHTTP and don't want to "go dark" when they make the transition.

That is, without this processor in 1.x, when a user upgrades from 1.x to 2.x, they will either have to have a MergeContent/InvokeHTTP solution in place already to replace PostHTTP, or they will have to take a (hopefully short) outage when they bring their canvas back up (removing PostHTTP and replacing with PackageFlowFile + InvokeHTTP).

With this processor in 1.x, they can make that transition while PostHTTP is still available on their canvas. Wishful thinking that we can make the entire journey from 1.x to 2.x as smooth as possible, but this could potentially help some.

On Fri, Sep 8, 2023 at 10:55 AM Adam Taft wrote:
> +1 on this as well. It's something I've kind of griped about before
> (with the loss of PostHTTP).
>
> I don't think it would be horrible (as per Joe's concern) to offer an
> N:1 "bundling" property. It would just have to be stupid simple. No
> "groups", timeouts, correlation attributes, minimum entries, etc. It
> should just basically call ProcessSession#get(int maxResults), where
> "maxResults" is a configurable property. Whatever number of flowfiles
> is returned in the list is what is "bundled" into FFv3 format for
> output.
>
> /Adam
Re: new PackageFlowFile processor
+1 on this as well. It's something I've kind of griped about before (with the loss of PostHTTP).

I don't think it would be horrible (as per Joe's concern) to offer an N:1 "bundling" property. It would just have to be stupid simple. No "groups", timeouts, correlation attributes, minimum entries, etc. It should just basically call ProcessSession#get(int maxResults), where "maxResults" is a configurable property. Whatever number of flowfiles is returned in the list is what is "bundled" into FFv3 format for output.

/Adam

On Fri, Sep 8, 2023 at 7:19 AM Phillip Lord wrote:
> +1 from me.
> I’ve experimented with both methods. The simplicity of a
> PackageFlowFile straight up 1:1 is convenient and straightforward.
> MergeContent on the other hand can be difficult to understand and
> tweak appropriately to gain desired results/throughput.
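The "stupid simple" N:1 idea in this message can be sketched as follows: take up to maxResults queued items in one call and bundle whatever comes back, with no bins, timeouts, or minimum counts. The sketch does not use the NiFi API. The queue, the get helper (which only mimics the behaviour described for ProcessSession#get(int maxResults)), and the string "bundle" are hypothetical stand-ins for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for a processor's incoming queue; not the NiFi API.
class BatchPollSketch {

    // Returns up to maxResults queued items, possibly fewer (or none)
    // if the queue runs dry. It never waits for more items to arrive.
    static <T> List<T> get(Queue<T> queue, int maxResults) {
        List<T> batch = new ArrayList<>();
        while (batch.size() < maxResults && !queue.isEmpty()) {
            batch.add(queue.poll());
        }
        return batch;
    }

    // Stand-in for packaging: whatever was returned becomes one output.
    static String bundle(List<String> batch) {
        return String.join("+", batch);
    }

    public static void main(String[] args) {
        Queue<String> incoming = new ArrayDeque<>(List.of("a", "b", "c", "d", "e"));
        int maxResults = 3; // would be the single configurable property

        List<String> batch;
        while (!(batch = get(incoming, maxResults)).isEmpty()) {
            System.out.println(bundle(batch));
        }
    }
}
```

The design point is that the batch size is only an upper bound: a call that finds two items bundles two, so nothing sits waiting for a bin to fill.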
Re: new PackageFlowFile processor
+1 from me.

I’ve experimented with both methods. The simplicity of a PackageFlowFile straight up 1:1 is convenient and straightforward. MergeContent on the other hand can be difficult to understand and tweak appropriately to gain desired results/throughput.

On Sep 8, 2023 at 10:14 AM -0400, Joe Witt wrote:
> Ok. Certainly simplifies it but likely makes it applicable to larger
> flowfiles only. The format is meant to allow appending and result in
> large sets of flowfiles for I/O efficiency and specifically for
> storage, as the small files/tons of files thing can cause poor
> performance pretty quickly (10s of thousands of files in a single
> directory).
>
> But maybe that simplicity is fine and we just link to the MergeContent
> packaging option if users need more.
Re: new PackageFlowFile processor
Ok. Certainly simplifies it but likely makes it applicable to larger flowfiles only. The format is meant to allow appending and result in large sets of flowfiles for I/O efficiency and specifically for storage, as the small files/tons of files thing can cause poor performance pretty quickly (10s of thousands of files in a single directory).

But maybe that simplicity is fine and we just link to the MergeContent packaging option if users need more.

On Fri, Sep 8, 2023 at 7:06 AM Michael Moser wrote:
> I was thinking 1 file in -> 1 flowfile-v3 file out. No merging of
> multiple files at all. Probably change the mime.type attribute. It
> might not even have any config properties at all if we only support
> flowfile-v3 and not v1 or v2.
>
> -- Mike
Re: new PackageFlowFile processor
Most of the complexity in MergeContent is around the bundling parameters... this processor would do no bundling, just straight pass through to the packaging library. No worries for the user about setting max package size, number of entries, number of bins, bin age, headers, footers, etc... even if they have sensible defaults in MergeContent, it's potentially confusing that if I want to "package" a FlowFile, what I need to do is "merge" it with itself (and ignore all the other confusing / irrelevant settings).

Bypassing the bundling logic probably improves performance for this use case as well.

On Fri, Sep 8, 2023 at 9:56 AM Joe Witt wrote:
> Mike
>
> In user terms this makes sense to me. I'd only bother with v3 or
> whatever is latest. We want to dump the old code. And if there are
> seriously older versions (v1, v2) then NiFi 1.x can be used.
>
> The challenge is that you end up needing some of the same complexity
> in implementation and config of MergeContent, I think. What did you
> have in mind for that?
>
> Thanks
Re: new PackageFlowFile processor
I was thinking 1 file in -> 1 flowfile-v3 file out. No merging of multiple files at all. Probably change the mime.type attribute. It might not even have any config properties at all if we only support flowfile-v3 and not v1 or v2.

-- Mike

On Fri, Sep 8, 2023 at 9:56 AM Joe Witt wrote:
> Mike
>
> In user terms this makes sense to me. I'd only bother with v3 or
> whatever is latest. We want to dump the old code. And if there are
> seriously older versions (v1, v2) then NiFi 1.x can be used.
>
> The challenge is that you end up needing some of the same complexity
> in implementation and config of MergeContent, I think. What did you
> have in mind for that?
>
> Thanks
Re: new PackageFlowFile processor
Mike

In user terms this makes sense to me. I'd only bother with v3 or whatever is latest. We want to dump the old code. And if there are seriously older versions (v1, v2) then NiFi 1.x can be used.

The challenge is that you end up needing some of the same complexity in implementation and config of MergeContent, I think. What did you have in mind for that?

Thanks

On Fri, Sep 8, 2023 at 6:53 AM Michael Moser wrote:
> Devs,
>
> I can't find if this was suggested before, so here goes. With the
> demise of PostHTTP in NiFi 2.0, the recommended alternative is to
> MergeContent 1 file into FlowFile-v3 format then InvokeHTTP. What does
> the community think about supporting a new PackageFlowFile processor
> that is simple to configure (compared to MergeContent!) and simply
> packages flowfile attributes + content into a FlowFile-v[1,2,3]
> format? This would also offer a simple way to export flowfiles from
> NiFi that could later be re-ingested and recovered using
> UnpackContent. I don't want to submit a PR for such a processor
> without first asking the community whether this would be acceptable.
>
> Thanks,
> -- Mike
Re: new PackageFlowFile processor
I have had to use that pattern myself recently. I think a simple PackageFlowFile processor makes a lot of sense. I am +1.

Brandon

From: Michael Moser
Sent: Friday, September 8, 2023 9:52:52 AM
To: dev@nifi.apache.org
Subject: new PackageFlowFile processor

Devs,

I can't find if this was suggested before, so here goes. With the demise of PostHTTP in NiFi 2.0, the recommended alternative is to MergeContent 1 file into FlowFile-v3 format then InvokeHTTP. What does the community think about supporting a new PackageFlowFile processor that is simple to configure (compared to MergeContent!) and simply packages flowfile attributes + content into a FlowFile-v[1,2,3] format? This would also offer a simple way to export flowfiles from NiFi that could later be re-ingested and recovered using UnpackContent. I don't want to submit a PR for such a processor without first asking the community whether this would be acceptable.

Thanks,
-- Mike
new PackageFlowFile processor
Devs,

I can't find if this was suggested before, so here goes. With the demise of PostHTTP in NiFi 2.0, the recommended alternative is to MergeContent 1 file into FlowFile-v3 format then InvokeHTTP. What does the community think about supporting a new PackageFlowFile processor that is simple to configure (compared to MergeContent!) and simply packages flowfile attributes + content into a FlowFile-v[1,2,3] format? This would also offer a simple way to export flowfiles from NiFi that could later be re-ingested and recovered using UnpackContent.

I don't want to submit a PR for such a processor without first asking the community whether this would be acceptable.

Thanks,
-- Mike