Re: new PackageFlowFile processor

2023-09-08 Thread Adam Taft
Flow File Packager v3. You can find the source here:

https://github.com/apache/nifi/blob/main/nifi-commons/nifi-flowfile-packager/src/main/java/org/apache/nifi/util/FlowFilePackagerV3.java

It's a serialization format used for writing a flowfile (content
and attributes) to a stream (network, file, etc.). It's a simple binary
format: effectively the attributes serialized as key/value pairs,
followed by the content. Byte size markers are written at the start
of each field, so that a deserializer can read the values into a byte array
(or equivalent).
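
To make the layout concrete, here is a minimal Java sketch of a length-prefixed
encoding in the same spirit. It is an illustration only, not the actual
FlowFilePackagerV3 wire format: the class name LengthPrefixedSketch, the 4-byte
field sizes, and the 8-byte content length are assumptions chosen for
simplicity. The linked source is the reference for the real header and size
encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative length-prefixed packaging; NOT the real FlowFilePackagerV3 format. */
final class LengthPrefixedSketch {

    /** Hypothetical holder for one unpacked flowfile (attributes + content). */
    static final class Unpacked {
        final Map<String, String> attributes;
        final byte[] content;

        Unpacked(Map<String, String> attributes, byte[] content) {
            this.attributes = attributes;
            this.content = content;
        }
    }

    /** Writes the attribute count, each key/value as size + bytes, then size + content. */
    static byte[] pack(Map<String, String> attributes, byte[] content) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(attributes.size());
            for (Map.Entry<String, String> entry : attributes.entrySet()) {
                writeField(out, entry.getKey().getBytes(StandardCharsets.UTF_8));
                writeField(out, entry.getValue().getBytes(StandardCharsets.UTF_8));
            }
            out.writeLong(content.length); // assumed 8-byte content size marker
            out.write(content);
        }
        return bytes.toByteArray();
    }

    /** Reads back exactly what pack() wrote, using the size markers to size each array. */
    static Unpacked unpack(byte[] packaged) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(packaged))) {
            int count = in.readInt();
            Map<String, String> attributes = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String key = new String(readField(in), StandardCharsets.UTF_8);
                String value = new String(readField(in), StandardCharsets.UTF_8);
                attributes.put(key, value);
            }
            byte[] content = new byte[Math.toIntExact(in.readLong())];
            in.readFully(content);
            return new Unpacked(attributes, content);
        }
    }

    private static void writeField(DataOutputStream out, byte[] field) throws IOException {
        out.writeInt(field.length); // assumed 4-byte size marker before each field
        out.write(field);
    }

    private static byte[] readField(DataInputStream in) throws IOException {
        byte[] field = new byte[in.readInt()];
        in.readFully(field);
        return field;
    }
}
```

Because every field carries its own size, the reader never has to scan for
delimiters, and several packaged flowfiles could be written back to back on the
same stream.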

Flow File Packager v3 is primarily used in the MergeContent processor (for
bundling) and the UnpackContent processor (for extraction). The
(deprecated) PostHTTP processor and the ListenHTTP processor also support
this format somewhat transparently, which enables two NiFi
systems to send a serialized flowfile across the wire using HTTP.

You might see this format name as "FlowFile Stream v3" or
"flowfile-stream-v3" when looking at either MergeContent or UnpackContent.





On Fri, Sep 8, 2023 at 2:14 PM Russell Bateman 
wrote:

> Uh, sorry, "Version 3" refers to what exactly?
>
> On 9/8/23 12:48, David Handermann wrote:
> > I agree that this would be a useful general feature. I also agree with
> > Joe that format support should be limited to *Version 3* due to the
> > limitations of the earlier versions.
> >
> > This is definitely something that would be useful on the 1.x support
> > branch to provide a smooth upgrade path for NiFi 2.
> >
> > This general topic also came up on the dev channel on the Apache NiFi
> > Slack group:
> >
> > https://apachenifi.slack.com/archives/C0L9S92JY/p1692115270146369
> >
> > One key thing to note from that discussion is supporting
> > interoperability with services outside of NiFi. That may be too much
> > of a stretch for an initial implementation, but it is something I am
> > planning to evaluate as time allows.
> >
> > For now, something focused narrowly on FlowFile Version 3 encoding
> > seems like the best approach.
> >
> > I recommend referencing this discussion in a new Jira issue and
> > outlining the general design goals.
> >
> > Regards,
> > David Handermann
> >
> >
> > On Fri, Sep 8, 2023 at 1:11 PM Adam Taft  wrote:
> >> And also ... if we can land this in a 1.x release, this would help
> >> tremendously to those who are going to need a replacement for PostHTTP
> and
> >> don't want to "go dark" when they make the transition.
> >>
> >> That is, without this processor in 1.x, when a user upgrades from 1.x to
> >> 2.x, they will either have to have a MergeContent/InvokeHTTP solution in
> >> place already to replace PostHTTP, or they will have to take a
> (hopefully
> >> short) outage when they bring their canvas back up (removing PostHTTP
> and
> >> replacing with PackageFlowFile + InvokeHTTP).
> >>
> >> With this processor in 1.x, they can make that transition while
> PostHTTP is
> >> still available on their canvas. Wishful thinking that we can make the
> >> entire journey from 1.x to 2.x as smooth as possible, but this could
> >> potentially help some.
> >>
> >>
> >> On Fri, Sep 8, 2023 at 10:55 AM Adam Taft  wrote:
> >>
> >>> +1 on this as well. It's something I've kind of griped about before
> (with
> >>> the loss of PostHTTP).
> >>>
> >>> I don't think it would be horrible (as per Joe's concern) to offer a
> N:1
> >>> "bundling" property. It would just have to be stupid simple. No
> "groups",
> >>> timeouts, correlation attributes, minimum entries, etc. It should just
> >>> basically call the ProcessSession#get(int maxResults) where
> "maxResults" is
> >>> a configurable property. Whatever number of flowfiles returned in the
> list
> >>> is what is "bundled" into FFv3 format for output.
> >>>
> >>> /Adam
> >>>
> >>>
> >>> On Fri, Sep 8, 2023 at 7:19 AM Phillip Lord
> >>> wrote:
> >>>
> >>>> +1 from me.
> >>>> I’ve experimented with both methods. The simplicity of a PackageFlowfile
> >>>> straight up 1:1 is convenient and straightforward.
> >>>> MergeContent on the other hand can be difficult to understand and tweak
> >>>> appropriately to gain desired results/throughput.
> >>>> On Sep 8, 2023 at 10:14 AM -0400, Joe Witt, wrote:
> >>>>> Ok. Certainly simplifies it but likely makes it applicable to larger
> >>>>> flowfiles only. The format is meant to allow appending and result in large
> >>>>> sets of flowfiles for io efficiency and specifically for storage as the
> >>>>> small files/tons of files thing can cause poor performance pretty quickly
> >>>>> (10s of thousands of files in a single directory).
> >>>>>
> >>>>> But maybe that simplicity is fine and we just link to the MergeContent
> >>>>> packaging option if users need more.
> >>>>>
> >>>>> On Fri, Sep 8, 2023 at 7:06 AM Michael Moser wrote:
> >>>>>> I was thinking 1 file in -> 1 flowfile-v3 file out. No merging of multiple
> >>>>>> files at all. Probably change the mime.type

Re: new PackageFlowFile processor

2023-09-08 Thread Russell Bateman

Uh, sorry, "Version 3" refers to what exactly?

On 9/8/23 12:48, David Handermann wrote:

I agree that this would be a useful general feature. I also agree with
Joe that format support should be limited to *Version 3* due to the
limitations of the earlier versions.

This is definitely something that would be useful on the 1.x support
branch to provide a smooth upgrade path for NiFi 2.

This general topic also came up on the dev channel on the Apache NiFi
Slack group:

https://apachenifi.slack.com/archives/C0L9S92JY/p1692115270146369

One key thing to note from that discussion is supporting
interoperability with services outside of NiFi. That may be too much
of a stretch for an initial implementation, but it is something I am
planning to evaluate as time allows.

For now, something focused narrowly on FlowFile Version 3 encoding
seems like the best approach.

I recommend referencing this discussion in a new Jira issue and
outlining the general design goals.

Regards,
David Handermann


On Fri, Sep 8, 2023 at 1:11 PM Adam Taft  wrote:

And also ... if we can land this in a 1.x release, this would help
tremendously to those who are going to need a replacement for PostHTTP and
don't want to "go dark" when they make the transition.

That is, without this processor in 1.x, when a user upgrades from 1.x to
2.x, they will either have to have a MergeContent/InvokeHTTP solution in
place already to replace PostHTTP, or they will have to take a (hopefully
short) outage when they bring their canvas back up (removing PostHTTP and
replacing with PackageFlowFile + InvokeHTTP).

With this processor in 1.x, they can make that transition while PostHTTP is
still available on their canvas. Wishful thinking that we can make the
entire journey from 1.x to 2.x as smooth as possible, but this could
potentially help some.


On Fri, Sep 8, 2023 at 10:55 AM Adam Taft  wrote:


+1 on this as well. It's something I've kind of griped about before (with
the loss of PostHTTP).

I don't think it would be horrible (as per Joe's concern) to offer a N:1
"bundling" property. It would just have to be stupid simple. No "groups",
timeouts, correlation attributes, minimum entries, etc. It should just
basically call the ProcessSession#get(int maxResults) where "maxResults" is
a configurable property. Whatever number of flowfiles returned in the list
is what is "bundled" into FFv3 format for output.

/Adam


On Fri, Sep 8, 2023 at 7:19 AM Phillip Lord
wrote:


+1 from me.
I’ve experimented with both methods.  The simplicity of a PackageFlowfile
straight up 1:1 is convenient and straightforward.
MergeContent on the other hand can be difficult to understand and tweak
appropriately to gain desired results/throughput.
On Sep 8, 2023 at 10:14 AM -0400, Joe Witt, wrote:

Ok. Certainly simplifies it but likely makes it applicable to larger
flowfiles only. The format is meant to allow appending and result in large
sets of flowfiles for io efficiency and specifically for storage as the
small files/tons of files thing can cause poor performance pretty quickly
(10s of thousands of files in a single directory).

But maybe that simplicity is fine and we just link to the MergeContent
packaging option if users need more.

On Fri, Sep 8, 2023 at 7:06 AM Michael Moser wrote:

I was thinking 1 file in -> 1 flowfile-v3 file out. No merging of multiple
files at all. Probably change the mime.type attribute. It might not even
have any config properties at all if we only support flowfile-v3 and not v1
or v2.

-- Mike


On Fri, Sep 8, 2023 at 9:56 AM Joe Witt  wrote:


Mike

In user terms this makes sense to me. Id only bother with v3 or whatever is
latest. We want to dump the old code. And if there are seriously older
versions v1,v2 then nifi 1.x can be used.

The challenge is that you end up needing some of the same complexity in
implementation and config of merge content i think. What did you have in
mind for that?

Thanks

On Fri, Sep 8, 2023 at 6:53 AM Michael Moser wrote:

Devs,

I can't find if this was suggested before, so here goes. With the demise
of PostHTTP in NiFi 2.0, the recommended alternative is to MergeContent 1
file into FlowFile-v3 format then InvokeHTTP. What does the community
think about supporting a new PackageFlowFile processor that is simple to
configure (compared to MergeContent!) and simply packages flowfile
attributes + content into a FlowFile-v[1,2,3] format? This would also
offer a simple way to export flowfiles from NiFi that could later be
re-ingested and recovered using UnpackContent. I don't want to submit a PR
for such a processor without first asking the community whether this would
be acceptable.

Thanks,
-- Mike



Re: new PackageFlowFile processor

2023-09-08 Thread David Handermann
I agree that this would be a useful general feature. I also agree with
Joe that format support should be limited to Version 3 due to the
limitations of the earlier versions.

This is definitely something that would be useful on the 1.x support
branch to provide a smooth upgrade path for NiFi 2.

This general topic also came up on the dev channel on the Apache NiFi
Slack group:

https://apachenifi.slack.com/archives/C0L9S92JY/p1692115270146369

One key thing to note from that discussion is supporting
interoperability with services outside of NiFi. That may be too much
of a stretch for an initial implementation, but it is something I am
planning to evaluate as time allows.

For now, something focused narrowly on FlowFile Version 3 encoding
seems like the best approach.

I recommend referencing this discussion in a new Jira issue and
outlining the general design goals.

Regards,
David Handermann


On Fri, Sep 8, 2023 at 1:11 PM Adam Taft  wrote:
>
> And also ... if we can land this in a 1.x release, this would help
> tremendously to those who are going to need a replacement for PostHTTP and
> don't want to "go dark" when they make the transition.
>
> That is, without this processor in 1.x, when a user upgrades from 1.x to
> 2.x, they will either have to have a MergeContent/InvokeHTTP solution in
> place already to replace PostHTTP, or they will have to take a (hopefully
> short) outage when they bring their canvas back up (removing PostHTTP and
> replacing with PackageFlowFile + InvokeHTTP).
>
> With this processor in 1.x, they can make that transition while PostHTTP is
> still available on their canvas. Wishful thinking that we can make the
> entire journey from 1.x to 2.x as smooth as possible, but this could
> potentially help some.
>
>
> On Fri, Sep 8, 2023 at 10:55 AM Adam Taft  wrote:
>
> > +1 on this as well. It's something I've kind of griped about before (with
> > the loss of PostHTTP).
> >
> > I don't think it would be horrible (as per Joe's concern) to offer a N:1
> > "bundling" property. It would just have to be stupid simple. No "groups",
> > timeouts, correlation attributes, minimum entries, etc. It should just
> > basically call the ProcessSession#get(int maxResults) where "maxResults" is
> > a configurable property. Whatever number of flowfiles returned in the list
> > is what is "bundled" into FFv3 format for output.
> >
> > /Adam
> >
> >
> > On Fri, Sep 8, 2023 at 7:19 AM Phillip Lord 
> > wrote:
> >
> >> +1 from me.
> >> I’ve experimented with both methods.  The simplicity of a PackageFlowfile
> >> straight up 1:1 is convenient and straightforward.
> >> MergeContent on the other hand can be difficult to understand and tweak
> >> appropriately to gain desired results/throughput.
> >> On Sep 8, 2023 at 10:14 AM -0400, Joe Witt , wrote:
> >> > Ok. Certainly simplifies it but likely makes it applicable to larger
> >> > flowfiles only. The format is meant to allow appending and result in
> >> large
> >> > sets of flowfiles for io efficiency and specifically for storage as the
> >> > small files/tons of files thing can cause poor performance pretty
> >> quickly
> >> > (10s of thousands of files in a single directory).
> >> >
> >> > But maybe that simplicity is fine and we just link to the MergeContent
> >> > packaging option if users need more.
> >> >
> >> > On Fri, Sep 8, 2023 at 7:06 AM Michael Moser 
> >> wrote:
> >> >
> >> > > I was thinking 1 file in -> 1 flowfile-v3 file out. No merging of
> >> multiple
> >> > > files at all. Probably change the mime.type attribute. It might not
> >> even
> >> > > have any config properties at all if we only support flowfile-v3 and
> >> not v1
> >> > > or v2.
> >> > >
> >> > > -- Mike
> >> > >
> >> > >
> >> > > On Fri, Sep 8, 2023 at 9:56 AM Joe Witt  wrote:
> >> > >
> >> > > > Mike
> >> > > >
> >> > > > In user terms this makes sense to me. Id only bother with v3 or
> >> whatever
> >> > > is
> >> > > > latest. We want to dump the old code. And if there are seriously
> >> older
> >> > > > versions v1,v2 then nifi 1.x can be used.
> >> > > >
> >> > > > The challenge is that you end up needing some of the same
> >> complexity in
> >> > > > implementation and config of merge content i think. What did you
> >> have in
> >> > > > mind for that?
> >> > > >
> >> > > > Thanks
> >> > > >
> >> > > > On Fri, Sep 8, 2023 at 6:53 AM Michael Moser 
> >> wrote:
> >> > > >
> >> > > > > Devs,
> >> > > > >
> >> > > > > I can't find if this was suggested before, so here goes. With the
> >> > > demise
> >> > > > > of PostHTTP in NiFi 2.0, the recommended alternative is to
> >> > > MergeContent 1
> >> > > > > file into FlowFile-v3 format then InvokeHTTP. What does the
> >> community
> >> > > > > think about supporting a new PackageFlowFile processor that is
> >> simple
> >> > > to
> >> > > > > configure (compared to MergeContent!) and simply packages flowfile
> >> > > > > attributes + content into a FlowFile-v[1,2,3] format? This would
> >> also
> >> > > > > offer a simple way 

Re: new PackageFlowFile processor

2023-09-08 Thread Adam Taft
And also ... if we can land this in a 1.x release, this would help
tremendously to those who are going to need a replacement for PostHTTP and
don't want to "go dark" when they make the transition.

That is, without this processor in 1.x, when a user upgrades from 1.x to
2.x, they will either have to have a MergeContent/InvokeHTTP solution in
place already to replace PostHTTP, or they will have to take a (hopefully
short) outage when they bring their canvas back up (removing PostHTTP and
replacing with PackageFlowFile + InvokeHTTP).

With this processor in 1.x, they can make that transition while PostHTTP is
still available on their canvas. Wishful thinking that we can make the
entire journey from 1.x to 2.x as smooth as possible, but this could
potentially help some.


On Fri, Sep 8, 2023 at 10:55 AM Adam Taft  wrote:

> +1 on this as well. It's something I've kind of griped about before (with
> the loss of PostHTTP).
>
> I don't think it would be horrible (as per Joe's concern) to offer a N:1
> "bundling" property. It would just have to be stupid simple. No "groups",
> timeouts, correlation attributes, minimum entries, etc. It should just
> basically call the ProcessSession#get(int maxResults) where "maxResults" is
> a configurable property. Whatever number of flowfiles returned in the list
> is what is "bundled" into FFv3 format for output.
>
> /Adam
>
>
> On Fri, Sep 8, 2023 at 7:19 AM Phillip Lord 
> wrote:
>
>> +1 from me.
>> I’ve experimented with both methods.  The simplicity of a PackageFlowfile
>> straight up 1:1 is convenient and straightforward.
>> MergeContent on the other hand can be difficult to understand and tweak
>> appropriately to gain desired results/throughput.
>> On Sep 8, 2023 at 10:14 AM -0400, Joe Witt , wrote:
>> > Ok. Certainly simplifies it but likely makes it applicable to larger
>> > flowfiles only. The format is meant to allow appending and result in
>> large
>> > sets of flowfiles for io efficiency and specifically for storage as the
>> > small files/tons of files thing can cause poor performance pretty
>> quickly
>> > (10s of thousands of files in a single directory).
>> >
>> > But maybe that simplicity is fine and we just link to the MergeContent
>> > packaging option if users need more.
>> >
>> > On Fri, Sep 8, 2023 at 7:06 AM Michael Moser 
>> wrote:
>> >
>> > > I was thinking 1 file in -> 1 flowfile-v3 file out. No merging of
>> multiple
>> > > files at all. Probably change the mime.type attribute. It might not
>> even
>> > > have any config properties at all if we only support flowfile-v3 and
>> not v1
>> > > or v2.
>> > >
>> > > -- Mike
>> > >
>> > >
>> > > On Fri, Sep 8, 2023 at 9:56 AM Joe Witt  wrote:
>> > >
>> > > > Mike
>> > > >
>> > > > In user terms this makes sense to me. Id only bother with v3 or
>> whatever
>> > > is
>> > > > latest. We want to dump the old code. And if there are seriously
>> older
>> > > > versions v1,v2 then nifi 1.x can be used.
>> > > >
>> > > > The challenge is that you end up needing some of the same
>> complexity in
>> > > > implementation and config of merge content i think. What did you
>> have in
>> > > > mind for that?
>> > > >
>> > > > Thanks
>> > > >
>> > > > On Fri, Sep 8, 2023 at 6:53 AM Michael Moser 
>> wrote:
>> > > >
>> > > > > Devs,
>> > > > >
>> > > > > I can't find if this was suggested before, so here goes. With the
>> > > demise
>> > > > > of PostHTTP in NiFi 2.0, the recommended alternative is to
>> > > MergeContent 1
>> > > > > file into FlowFile-v3 format then InvokeHTTP. What does the
>> community
>> > > > > think about supporting a new PackageFlowFile processor that is
>> simple
>> > > to
>> > > > > configure (compared to MergeContent!) and simply packages flowfile
>> > > > > attributes + content into a FlowFile-v[1,2,3] format? This would
>> also
>> > > > > offer a simple way to export flowfiles from NiFi that could later
>> be
>> > > > > re-ingested and recovered using UnpackContent. I don't want to
>> submit
>> > > a
>> > > > PR
>> > > > > for such a processor without first asking the community whether
>> this
>> > > > would
>> > > > > be acceptable.
>> > > > >
>> > > > > Thanks,
>> > > > > -- Mike
>> > > > >
>> > > >
>> > >
>>
>


Re: new PackageFlowFile processor

2023-09-08 Thread Adam Taft
+1 on this as well. It's something I've kind of griped about before (with
the loss of PostHTTP).

I don't think it would be horrible (as per Joe's concern) to offer an N:1
"bundling" property. It would just have to be stupid simple. No "groups",
timeouts, correlation attributes, minimum entries, etc. It should just
basically call the ProcessSession#get(int maxResults) where "maxResults" is
a configurable property. Whatever number of flowfiles returned in the list
is what is "bundled" into FFv3 format for output.
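
The N:1 idea above can be sketched as follows. This is a self-contained
stand-in, not NiFi code: BundlingSketch and its in-memory queue are
hypothetical, strings stand in for flowfiles, and its get(int maxResults) only
imitates the contract of ProcessSession#get(int maxResults) (return up to
maxResults queued items, possibly fewer).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical stand-in for a processor's incoming queue; not the NiFi API. */
final class BundlingSketch {

    private final Queue<String> incoming = new ArrayDeque<>();

    /** Queues one item; a String stands in for a flowfile here. */
    void enqueue(String flowFile) {
        incoming.add(flowFile);
    }

    /** Imitates ProcessSession#get(int maxResults): up to maxResults items, maybe fewer. */
    List<String> get(int maxResults) {
        List<String> batch = new ArrayList<>();
        while (batch.size() < maxResults && !incoming.isEmpty()) {
            batch.add(incoming.poll());
        }
        return batch;
    }

    /**
     * Simulates repeated triggers: each call to get() yields one bundle.
     * There are no bins, timeouts, or correlation attributes; whatever comes
     * back from get() is the bundle.
     */
    List<List<String>> drain(int maxResults) {
        List<List<String>> bundles = new ArrayList<>();
        List<String> batch = get(maxResults);
        while (!batch.isEmpty()) {
            bundles.add(batch);
            batch = get(maxResults);
        }
        return bundles;
    }
}
```

With five queued items and maxResults set to 3, drain(3) produces one bundle of
three items and one bundle of two. A real processor would hand each returned
list to the packager and write the result as a single output flowfile.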

/Adam


On Fri, Sep 8, 2023 at 7:19 AM Phillip Lord  wrote:

> +1 from me.
> I’ve experimented with both methods.  The simplicity of a PackageFlowfile
> straight up 1:1 is convenient and straightforward.
> MergeContent on the other hand can be difficult to understand and tweak
> appropriately to gain desired results/throughput.
> On Sep 8, 2023 at 10:14 AM -0400, Joe Witt , wrote:
> > Ok. Certainly simplifies it but likely makes it applicable to larger
> > flowfiles only. The format is meant to allow appending and result in
> large
> > sets of flowfiles for io efficiency and specifically for storage as the
> > small files/tons of files thing can cause poor performance pretty quickly
> > (10s of thousands of files in a single directory).
> >
> > But maybe that simplicity is fine and we just link to the MergeContent
> > packaging option if users need more.
> >
> > On Fri, Sep 8, 2023 at 7:06 AM Michael Moser  wrote:
> >
> > > I was thinking 1 file in -> 1 flowfile-v3 file out. No merging of
> multiple
> > > files at all. Probably change the mime.type attribute. It might not
> even
> > > have any config properties at all if we only support flowfile-v3 and
> not v1
> > > or v2.
> > >
> > > -- Mike
> > >
> > >
> > > On Fri, Sep 8, 2023 at 9:56 AM Joe Witt  wrote:
> > >
> > > > Mike
> > > >
> > > > In user terms this makes sense to me. Id only bother with v3 or
> whatever
> > > is
> > > > latest. We want to dump the old code. And if there are seriously
> older
> > > > versions v1,v2 then nifi 1.x can be used.
> > > >
> > > > The challenge is that you end up needing some of the same complexity
> in
> > > > implementation and config of merge content i think. What did you
> have in
> > > > mind for that?
> > > >
> > > > Thanks
> > > >
> > > > On Fri, Sep 8, 2023 at 6:53 AM Michael Moser 
> wrote:
> > > >
> > > > > Devs,
> > > > >
> > > > > I can't find if this was suggested before, so here goes. With the
> > > demise
> > > > > of PostHTTP in NiFi 2.0, the recommended alternative is to
> > > MergeContent 1
> > > > > file into FlowFile-v3 format then InvokeHTTP. What does the
> community
> > > > > think about supporting a new PackageFlowFile processor that is
> simple
> > > to
> > > > > configure (compared to MergeContent!) and simply packages flowfile
> > > > > attributes + content into a FlowFile-v[1,2,3] format? This would
> also
> > > > > offer a simple way to export flowfiles from NiFi that could later
> be
> > > > > re-ingested and recovered using UnpackContent. I don't want to
> submit
> > > a
> > > > PR
> > > > > for such a processor without first asking the community whether
> this
> > > > would
> > > > > be acceptable.
> > > > >
> > > > > Thanks,
> > > > > -- Mike
> > > > >
> > > >
> > >
>


Re: new PackageFlowFile processor

2023-09-08 Thread Phillip Lord
+1 from me.
I’ve experimented with both methods.  The simplicity of a PackageFlowFile
straight up 1:1 is convenient and straightforward.
MergeContent, on the other hand, can be difficult to understand and tweak
appropriately to get the desired results/throughput.
On Sep 8, 2023 at 10:14 AM -0400, Joe Witt , wrote:
> Ok. Certainly simplifies it but likely makes it applicable to larger
> flowfiles only. The format is meant to allow appending and result in large
> sets of flowfiles for io efficiency and specifically for storage as the
> small files/tons of files thing can cause poor performance pretty quickly
> (10s of thousands of files in a single directory).
>
> But maybe that simplicity is fine and we just link to the MergeContent
> packaging option if users need more.
>
> On Fri, Sep 8, 2023 at 7:06 AM Michael Moser  wrote:
>
> > I was thinking 1 file in -> 1 flowfile-v3 file out. No merging of multiple
> > files at all. Probably change the mime.type attribute. It might not even
> > have any config properties at all if we only support flowfile-v3 and not v1
> > or v2.
> >
> > -- Mike
> >
> >
> > On Fri, Sep 8, 2023 at 9:56 AM Joe Witt  wrote:
> >
> > > Mike
> > >
> > > In user terms this makes sense to me. Id only bother with v3 or whatever
> > is
> > > latest. We want to dump the old code. And if there are seriously older
> > > versions v1,v2 then nifi 1.x can be used.
> > >
> > > The challenge is that you end up needing some of the same complexity in
> > > implementation and config of merge content i think. What did you have in
> > > mind for that?
> > >
> > > Thanks
> > >
> > > On Fri, Sep 8, 2023 at 6:53 AM Michael Moser  wrote:
> > >
> > > > Devs,
> > > >
> > > > I can't find if this was suggested before, so here goes. With the
> > demise
> > > > of PostHTTP in NiFi 2.0, the recommended alternative is to
> > MergeContent 1
> > > > file into FlowFile-v3 format then InvokeHTTP. What does the community
> > > > think about supporting a new PackageFlowFile processor that is simple
> > to
> > > > configure (compared to MergeContent!) and simply packages flowfile
> > > > attributes + content into a FlowFile-v[1,2,3] format? This would also
> > > > offer a simple way to export flowfiles from NiFi that could later be
> > > > re-ingested and recovered using UnpackContent. I don't want to submit
> > a
> > > PR
> > > > for such a processor without first asking the community whether this
> > > would
> > > > be acceptable.
> > > >
> > > > Thanks,
> > > > -- Mike
> > > >
> > >
> >


Re: new PackageFlowFile processor

2023-09-08 Thread Joe Witt
Ok.  That certainly simplifies it, but it likely makes it applicable to larger
flowfiles only.  The format is meant to allow appending, resulting in large
sets of flowfiles for I/O efficiency and specifically for storage, since
having tons of small files can cause poor performance pretty quickly
(tens of thousands of files in a single directory).

But maybe that simplicity is fine and we just link to the MergeContent
packaging option if users need more.

On Fri, Sep 8, 2023 at 7:06 AM Michael Moser  wrote:

> I was thinking 1 file in -> 1 flowfile-v3 file out.  No merging of multiple
> files at all.  Probably change the mime.type attribute.  It might not even
> have any config properties at all if we only support flowfile-v3 and not v1
> or v2.
>
> -- Mike
>
>
> On Fri, Sep 8, 2023 at 9:56 AM Joe Witt  wrote:
>
> > Mike
> >
> > In user terms this makes sense to me. Id only bother with v3 or whatever
> is
> > latest. We want to dump the old code. And if there are seriously older
> > versions v1,v2 then nifi 1.x can be used.
> >
> > The challenge is that you end up needing some of the same complexity in
> > implementation and config of merge content i think. What did you have in
> > mind for that?
> >
> > Thanks
> >
> > On Fri, Sep 8, 2023 at 6:53 AM Michael Moser  wrote:
> >
> > > Devs,
> > >
> > > I can't find if this was suggested before, so here goes.  With the
> demise
> > > of PostHTTP in NiFi 2.0, the recommended alternative is to
> MergeContent 1
> > > file into FlowFile-v3 format then InvokeHTTP.  What does the community
> > > think about supporting a new PackageFlowFile processor that is simple
> to
> > > configure (compared to MergeContent!) and simply packages flowfile
> > > attributes + content into a FlowFile-v[1,2,3] format?  This would also
> > > offer a simple way to export flowfiles from NiFi that could later be
> > > re-ingested and recovered using UnpackContent.  I don't want to submit
> a
> > PR
> > > for such a processor without first asking the community whether this
> > would
> > > be acceptable.
> > >
> > > Thanks,
> > > -- Mike
> > >
> >
>


Re: new PackageFlowFile processor

2023-09-08 Thread Brandon DeVries
Most of the complexity in MergeContent is around the bundling parameters.
This processor would do no bundling, just a straight pass-through to the
packaging library.  The user would not have to worry about setting max
package size, number of entries, number of bins, bin age, headers, footers,
etc.  Even if those have sensible defaults in MergeContent, it's potentially
confusing that if I want to "package" a FlowFile, what I need to do is
"merge" it with itself (and ignore all the other confusing / irrelevant
settings).  Bypassing the bundling logic probably improves performance for
this use case as well.

On Fri, Sep 8, 2023 at 9:56 AM Joe Witt  wrote:

> Mike
>
> In user terms this makes sense to me. Id only bother with v3 or whatever is
> latest. We want to dump the old code. And if there are seriously older
> versions v1,v2 then nifi 1.x can be used.
>
> The challenge is that you end up needing some of the same complexity in
> implementation and config of merge content i think. What did you have in
> mind for that?
>
> Thanks
>
> On Fri, Sep 8, 2023 at 6:53 AM Michael Moser  wrote:
>
> > Devs,
> >
> > I can't find if this was suggested before, so here goes.  With the demise
> > of PostHTTP in NiFi 2.0, the recommended alternative is to MergeContent 1
> > file into FlowFile-v3 format then InvokeHTTP.  What does the community
> > think about supporting a new PackageFlowFile processor that is simple to
> > configure (compared to MergeContent!) and simply packages flowfile
> > attributes + content into a FlowFile-v[1,2,3] format?  This would also
> > offer a simple way to export flowfiles from NiFi that could later be
> > re-ingested and recovered using UnpackContent.  I don't want to submit a
> PR
> > for such a processor without first asking the community whether this
> would
> > be acceptable.
> >
> > Thanks,
> > -- Mike
> >
>


Re: new PackageFlowFile processor

2023-09-08 Thread Michael Moser
I was thinking 1 file in -> 1 flowfile-v3 file out.  No merging of multiple
files at all.  Probably change the mime.type attribute.  It might not even
have any config properties at all if we only support flowfile-v3 and not v1
or v2.

-- Mike


On Fri, Sep 8, 2023 at 9:56 AM Joe Witt  wrote:

> Mike
>
> In user terms this makes sense to me. Id only bother with v3 or whatever is
> latest. We want to dump the old code. And if there are seriously older
> versions v1,v2 then nifi 1.x can be used.
>
> The challenge is that you end up needing some of the same complexity in
> implementation and config of merge content i think. What did you have in
> mind for that?
>
> Thanks
>
> On Fri, Sep 8, 2023 at 6:53 AM Michael Moser  wrote:
>
> > Devs,
> >
> > I can't find if this was suggested before, so here goes.  With the demise
> > of PostHTTP in NiFi 2.0, the recommended alternative is to MergeContent 1
> > file into FlowFile-v3 format then InvokeHTTP.  What does the community
> > think about supporting a new PackageFlowFile processor that is simple to
> > configure (compared to MergeContent!) and simply packages flowfile
> > attributes + content into a FlowFile-v[1,2,3] format?  This would also
> > offer a simple way to export flowfiles from NiFi that could later be
> > re-ingested and recovered using UnpackContent.  I don't want to submit a
> PR
> > for such a processor without first asking the community whether this
> would
> > be acceptable.
> >
> > Thanks,
> > -- Mike
> >
>


Re: new PackageFlowFile processor

2023-09-08 Thread Joe Witt
Mike

In user terms this makes sense to me. I'd only bother with v3 or whatever is
latest. We want to dump the old code. And if anyone seriously needs the older
versions (v1, v2), then NiFi 1.x can be used.

The challenge is that you end up needing some of the same complexity in
implementation and config as MergeContent, I think. What did you have in
mind for that?

Thanks

On Fri, Sep 8, 2023 at 6:53 AM Michael Moser  wrote:

> Devs,
>
> I can't find if this was suggested before, so here goes.  With the demise
> of PostHTTP in NiFi 2.0, the recommended alternative is to MergeContent 1
> file into FlowFile-v3 format then InvokeHTTP.  What does the community
> think about supporting a new PackageFlowFile processor that is simple to
> configure (compared to MergeContent!) and simply packages flowfile
> attributes + content into a FlowFile-v[1,2,3] format?  This would also
> offer a simple way to export flowfiles from NiFi that could later be
> re-ingested and recovered using UnpackContent.  I don't want to submit a PR
> for such a processor without first asking the community whether this would
> be acceptable.
>
> Thanks,
> -- Mike
>


Re: new PackageFlowFile processor

2023-09-08 Thread Brandon DeVries
I have had to use that pattern myself recently. I think a simple 
PackageFlowFile processor makes a lot of sense. I am +1.

Brandon

From: Michael Moser 
Sent: Friday, September 8, 2023 9:52:52 AM
To: dev@nifi.apache.org 
Subject: new PackageFlowFile processor

Devs,

I can't find if this was suggested before, so here goes.  With the demise
of PostHTTP in NiFi 2.0, the recommended alternative is to MergeContent 1
file into FlowFile-v3 format then InvokeHTTP.  What does the community
think about supporting a new PackageFlowFile processor that is simple to
configure (compared to MergeContent!) and simply packages flowfile
attributes + content into a FlowFile-v[1,2,3] format?  This would also
offer a simple way to export flowfiles from NiFi that could later be
re-ingested and recovered using UnpackContent.  I don't want to submit a PR
for such a processor without first asking the community whether this would
be acceptable.

Thanks,
-- Mike


new PackageFlowFile processor

2023-09-08 Thread Michael Moser
Devs,

I can't find if this was suggested before, so here goes.  With the demise
of PostHTTP in NiFi 2.0, the recommended alternative is to MergeContent 1
file into FlowFile-v3 format then InvokeHTTP.  What does the community
think about supporting a new PackageFlowFile processor that is simple to
configure (compared to MergeContent!) and simply packages flowfile
attributes + content into a FlowFile-v[1,2,3] format?  This would also
offer a simple way to export flowfiles from NiFi that could later be
re-ingested and recovered using UnpackContent.  I don't want to submit a PR
for such a processor without first asking the community whether this would
be acceptable.

Thanks,
-- Mike