Re: NiFi 2.0.0-M2

2024-03-06 Thread Russell Bateman
Don't know if this is relevant, let alone helpful, but we were on CentOS 
for many years and, when it died, we moved to Alma.


Best of luck to you,

Russ

On 3/5/24 07:56, Marton Szasz wrote:
CentOS 7 is a very old distribution, so you may run into issues, but 
in theory, if you can install Java 21 on it, and start NiFi using 
that, then it should work.


I'd be surprised if Java 21 was included in the CentOS 7 package 
repositories. You'll most likely need to install it manually, and set 
JAVA_HOME appropriately in the environment you're using to start NiFi.
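For example, a sketch of the manual route (paths illustrative, untested on CentOS 7):

   # after unpacking a JDK 21 tarball, in the shell (or in bin/nifi-env.sh)
   # used to launch NiFi:
   export JAVA_HOME=/opt/jdk-21
   export PATH="$JAVA_HOME/bin:$PATH"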


Marton

On 3/5/24 14:10, Pierre Villard wrote:

Hi,

NiFi is OS-agnostic for Linux-based distributions. You can download the
tar.gz from the download page of the website and follow the instructions to
get started on CentOS 7.

HTH,
Pierre

On Tue, Mar 5, 2024 at 14:07, Gleb Efimov  wrote:



Good afternoon. I'm trying to deploy NiFi 2.0.0-M2 on CentOS 7, but,
unfortunately, I can't find a tar archive with the distribution I need.
Could you tell me where I can get it?
And is it even possible to deploy NiFi 2.0.0-M2 on CentOS 7?
Thank you very much.
Thank you very much.

Sincerely, Efimov Gleb.



Re: Block start/stop processors

2024-02-22 Thread Russell Bateman

Isha,

Wait, by "lefthand toolbox" are you referring to the process group's 
toolbar start button or to the start button in the pallet at the upper 
left of the page?


Please clarify.

Thanks


On 2/22/24 04:39, Isha Lamboo wrote:

always use the right-click context menu on a process group you want to
stop/start and never use the button in the lefthand toolbox.
Right-clicking the PG also selects it, so you don't have the risk of
clicking stop with no selection.


Re: Issue with Nifi pipeline

2023-11-28 Thread Russell Bateman

Sonia,

It sounds like you may prefer the Users Mailing List 
(us...@nifi.apache.org) rather than this one, which is more for custom 
processors and other development-related activities.


Best regards,

Russ


On 11/27/23 22:49, Sonia Soleimani wrote:

Hello,
I am working for Telus and there is a legacy pipeline that I am
trying to fix, but I get some errors, which could be because I am new to NiFi. I
would appreciate it if I could get someone from the NiFi team to resolve this
issue. It is mostly related to some tables not loading completely in GCP
BigQuery (destination). This could be because the "fetch" step doesn't
fully finish.

Thanks,
Sonia Soleimani



Internationalization and localization of the UI and processors

2023-09-29 Thread Russell Bateman
Looking around, I see there have been statements of intent by folk to 
localize NiFi [1], but few statements as to how far they got. I saw a 
question on stackoverflow [2] on how to hack relevant Java-annotated 
references (@CapabilityDescription) which isn't exactly 
internationalization, but it's a start.
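On the Java side, the usual building block is java.util.ResourceBundle. A minimal, hypothetical sketch (and note the catch the stackoverflow question runs into: annotation values such as @CapabilityDescription must be compile-time constants, so they cannot be looked up this way at runtime):

   import java.util.Locale;
   import java.util.ResourceBundle;

   // Reads e.g. src/main/resources/Messages_fr.properties at runtime
   ResourceBundle messages = ResourceBundle.getBundle("Messages", Locale.FRENCH);
   String blurb = messages.getString("processor.description");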


Most of what our down-streamers use are custom processors, though they use 
some standard NiFi ones too; the UI would be a big job and a pretty 
important piece.


Has this sort of thing gone pretty far and I just haven't found the right 
Google string to find it? I'd happily write a guide for custom-processor 
coding if it's worth walking the path because the UI is taken care of 
(assuming also that my company thinks it's worth the effort and expense, 
but I might happily crack the nut and do notes or a tutorial within the 
context of writing custom processors).


[1] https://cwiki.apache.org/confluence/display/NIFI/Localize+NiFi
[2] https://stackoverflow.com/questions/69766648/how-to-internationalization-in-java-annotations

Custom-processor configuration suggestions

2023-09-27 Thread Russell Bateman

I'm posting this plea for suggestions as I'm short on imagination here.

We have some custom processors that need extraordinary amounts of 
configuration of the sort a flow writer would have to copy and paste 
in--huge amounts of YAML, regular expressions, etc. This is what our 
flow writers are already doing. It would be easier to insert a filename 
or -path, but...


...asking a custom processor to perform filesystem I/O is icky because 
of unpredictable filesystem access post-installation. Installation is 
beyond my control, and I don't want to make it messy; containers, 
Kubernetes deployment, etc. complicate this.


I thought of wiring /GetFile/ to a subdirectory (problematic, but less 
so?) and accepting files as input to pass on to needy processors that 
would recognize, adopt and incorporate configuration based on 
higher-level and simpler cues posted by flow writers as property values.
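One middle road, sketched here with made-up names and hedged on the identifiesExternalResource API (added around NiFi 1.14): let a single property accept either inline text or a file path, so flow writers can paste small YAML directly or point at a file when it is huge:

   import java.io.InputStream;
   import org.apache.nifi.components.PropertyDescriptor;
   import org.apache.nifi.components.resource.ResourceCardinality;
   import org.apache.nifi.components.resource.ResourceType;
   import org.apache.nifi.processor.util.StandardValidators;

   static final PropertyDescriptor CONFIGURATION = new PropertyDescriptor.Builder()
           .name("Configuration")
           .description("Inline YAML, or a path to a YAML file readable by NiFi.")
           .identifiesExternalResource(ResourceCardinality.SINGLE, ResourceType.TEXT, ResourceType.FILE)
           .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
           .required(true)
           .build();

   // later, wherever the processor needs it:
   try (InputStream in = context.getProperty(CONFIGURATION).asResource().read()) {
       // parse the YAML, however it was supplied
   }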


Assuming you both grok and are interested in what I'm asking, do you 
have thoughts, cautionary statements or even cat-calls to offer? Maybe 
there are obvious answers I'm just not thinking of.


Profuse thanks,

Russ

Re: new PackageFlowFile processor

2023-09-08 Thread Russell Bateman

Uh, sorry, "Version 3" refers to what exactly?

On 9/8/23 12:48, David Handermann wrote:

I agree that this would be a useful general feature. I also agree with
Joe that format support should be limited to *Version 3* due to the
limitations of the earlier versions.

This is definitely something that would be useful on the 1.x support
branch to provide a smooth upgrade path for NiFi 2.

This general topic also came up on the dev channel on the Apache NiFi
Slack group:

https://apachenifi.slack.com/archives/C0L9S92JY/p1692115270146369

One key thing to note from that discussion is supporting
interoperability with services outside of NiFi. That may be too much
of a stretch for an initial implementation, but it is something I am
planning to evaluate as time allows.

For now, something focused narrowly on FlowFile Version 3 encoding
seems like the best approach.

I recommend referencing this discussion in a new Jira issue and
outlining the general design goals.

Regards,
David Handermann


On Fri, Sep 8, 2023 at 1:11 PM Adam Taft  wrote:

And also ... if we can land this in a 1.x release, this would help
tremendously to those who are going to need a replacement for PostHTTP and
don't want to "go dark" when they make the transition.

That is, without this processor in 1.x, when a user upgrades from 1.x to
2.x, they will either have to have a MergeContent/InvokeHTTP solution in
place already to replace PostHTTP, or they will have to take a (hopefully
short) outage when they bring their canvas back up (removing PostHTTP and
replacing with PackageFlowFile + InvokeHTTP).

With this processor in 1.x, they can make that transition while PostHTTP is
still available on their canvas. Wishful thinking that we can make the
entire journey from 1.x to 2.x as smooth as possible, but this could
potentially help some.


On Fri, Sep 8, 2023 at 10:55 AM Adam Taft  wrote:


+1 on this as well. It's something I've kind of griped about before (with
the loss of PostHTTP).

I don't think it would be horrible (as per Joe's concern) to offer a N:1
"bundling" property. It would just have to be stupid simple. No "groups",
timeouts, correlation attributes, minimum entries, etc. It should just
basically call the ProcessSession#get(int maxResults) where "maxResults" is
a configurable property. Whatever number of flowfiles returned in the list
is what is "bundled" into FFv3 format for output.

/Adam
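For concreteness, a minimal sketch of the N:1 idea Adam describes -- the class name, relationship, and the hard-coded 10 (standing in for a "maximum bundle size" property) are all illustrative, not the shipped processor; FlowFilePackagerV3 is the same packager MergeContent uses for the FlowFile-v3 format:

   import java.io.InputStream;
   import java.util.List;
   import java.util.Set;
   import org.apache.nifi.flowfile.FlowFile;
   import org.apache.nifi.processor.AbstractProcessor;
   import org.apache.nifi.processor.ProcessContext;
   import org.apache.nifi.processor.ProcessSession;
   import org.apache.nifi.processor.Relationship;
   import org.apache.nifi.util.FlowFilePackagerV3;

   public class PackageFlowFileSketch extends AbstractProcessor {

       static final Relationship REL_SUCCESS = new Relationship.Builder()
               .name("success").description("Bundled FlowFile-v3 output").build();

       @Override
       public Set<Relationship> getRelationships() {
           return Set.of(REL_SUCCESS);
       }

       @Override
       public void onTrigger(final ProcessContext context, final ProcessSession session) {
           final List<FlowFile> batch = session.get(10);  // whatever number is returned gets bundled
           if (batch.isEmpty()) {
               return;
           }
           FlowFile bundle = session.create(batch);       // one output, provenance-linked to all inputs
           bundle = session.write(bundle, out -> {
               final FlowFilePackagerV3 packager = new FlowFilePackagerV3();
               for (final FlowFile flowFile : batch) {
                   try (final InputStream in = session.read(flowFile)) {
                       // appends each input's attributes + content to the v3 stream
                       packager.packageFlowFile(in, out, flowFile.getAttributes(), flowFile.getSize());
                   }
               }
           });
           session.transfer(bundle, REL_SUCCESS);
           session.remove(batch);
       }
   }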


On Fri, Sep 8, 2023 at 7:19 AM Phillip Lord
wrote:


+1 from me.
I’ve experimented with both methods.  The simplicity of a PackageFlowfile
straight up 1:1 is convenient and straightforward.
MergeContent on the other hand can be difficult to understand and tweak
appropriately to gain desired results/throughput.
On Sep 8, 2023 at 10:14 AM -0400, Joe Witt, wrote:

Ok. Certainly simplifies it but likely makes it applicable to larger
flowfiles only. The format is meant to allow appending and result in
large sets of flowfiles for io efficiency and specifically for storage
as the small files/tons of files thing can cause poor performance pretty
quickly (10s of thousands of files in a single directory).

But maybe that simplicity is fine and we just link to the MergeContent
packaging option if users need more.

On Fri, Sep 8, 2023 at 7:06 AM Michael Moser

wrote:

I was thinking 1 file in -> 1 flowfile-v3 file out. No merging of
multiple files at all. Probably change the mime.type attribute. It might
not even have any config properties at all if we only support
flowfile-v3 and not v1 or v2.

-- Mike


On Fri, Sep 8, 2023 at 9:56 AM Joe Witt  wrote:


Mike

In user terms this makes sense to me. I'd only bother with v3 or
whatever is latest. We want to dump the old code. And if there are
seriously older versions v1, v2 then NiFi 1.x can be used.

The challenge is that you end up needing some of the same complexity in
implementation and config of MergeContent, I think. What did you have in
mind for that?

Thanks

On Fri, Sep 8, 2023 at 6:53 AM Michael Moser wrote:

Devs,

I can't find if this was suggested before, so here goes. With the demise
of PostHTTP in NiFi 2.0, the recommended alternative is to MergeContent
1 file into FlowFile-v3 format then InvokeHTTP. What does the community
think about supporting a new PackageFlowFile processor that is simple to
configure (compared to MergeContent!) and simply packages flowfile
attributes + content into a FlowFile-v[1,2,3] format? This would also
offer a simple way to export flowfiles from NiFi that could later be
re-ingested and recovered using UnpackContent. I don't want to submit a
PR for such a processor without first asking the community whether this
would be acceptable.

Thanks,
-- Mike



Re: Refreshing HTML displayed for View usage

2023-08-21 Thread Russell Bateman
Thanks, Matt. This is happening locally during development where I'm not 
using DNS (just localhost). I find this problem less acute on older 
versions of NiFi (1.13.2) than more recent ones (1.23.0, 1.19.1).


Thanks,
Russ

On 8/20/23 19:03, Matthew Hawkins wrote:

Hi Russell,

Something I've noticed myself, and it's easily reproducible on the Usage
pages for processors, is that Nifi is doing a reverse DNS lookup when
loading the page. If your DNS is broken, it can take 10-30 seconds for the
page content to appear.

Kr,

On Fri, 18 Aug 2023, 07:29 Russell Bateman,  wrote:


Matt,

I experimented with the Open in New Window button, but the new window,
which probably isn't an iframe, nevertheless doesn't respond to attempts
to get around the cache.

You're probably right about the version not changing being at the root
of the browser's inability to respond to a change. Sadly, the
edit-display-debug-edit cycle doesn't excite me into changing versions
just so I can see the result.

I'm certainly not going to clamor for work to be done to get around
this. It's annoying, but not crippling.

Thanks,

Russ


On 8/17/23 13:54, Matt Gilman wrote:

Russell,

Thanks for the response. The suggestion to open a new tab was for the
generated documentation, not the NiFi canvas itself. The generated
documentation is shown in an iframe which will offer you the menu item in
the context menu for opening in a new tab. IIRC, the path to the
generated documentation contains the version of the component. For folks
simply using a released version of NiFi this works great since versions
will differ and any browser caching will only optimize page loading for
the user. If you're a developer, however, you're likely making changes to
your component and the version is not changing (x.y.z-SNAPSHOT). This is
where the browser cache could result in the behavior you're seeing.

Matt

On Thu, Aug 17, 2023 at 3:21 PM Russell Bateman
wrote:


Thanks, Matt,

(Since I use Chrome 99% of the time, I'm using Chrome here.)

Clicking on the reload control next to the address bar, while holding
down Shift, reloads the canvas, but the NiFi Documentation page
disappears and I have to reload it using the View usage context menu of
my processor. Once reloaded, nothing has changed.

This is also the behavior of the Configure Processor dialog. As I reload
View usage, ...

In Chrome, there's no effective context-menu option to right-click in
the page contents and open in a new tab. I can...

  Back
  Forward
  Reload
  ---
  Save As...
  Print
  Cast
  Search images with Google
  -
  Send to your devices
  Create QR code for this page
  
  Translate
  -
  View page source
  View frame source
  Reload frame
  Inspect


If I right-click the current tab itself, I'm able to choose
Duplicate, but that doesn't seem to lead to anything more helpful.

By the way, I'm doing this using NiFi 1.23.0. I have also verified the
date of my NAR to ensure that I'm not repeatedly working using a NAR
with old content. Also, the version displayed for the custom processor
is that of the various /pom.xml/ files I'm building with.

I usually develop on the latest NiFi version. However, my company has
NARs that won't load beginning with 1.14.0. I just brought up 1.13.2, what
most of our customers run, and tried my latest NAR, which *does* display
my changes. Same with 1.1.2, which represents the oldest platform
executing at any customer site. Exceptionally, I set up 1.19.1, where it
*also works*.

This is unexpected; maybe it does point, as unlikely as it seems, to
something changed in 1.23.0 (instead of being caused by anything between
the dossier of my chair and the space bar of my keyboard as I have been
expecting to learn. ;-)  )

It could still be me and I'm just not seeing the obvious yet.


On 8/17/23 12:44, Matt Gilman wrote:

Russell,

Assuming this is a browser cache issue, can you try right-clicking
anywhere in the page contents of the generated documentation and open it
in a new tab. Once that is open, can you try doing a hard refresh by
holding Shift while clicking Reload next to the address bar? This should
clear the cache and fetch the updated generated documentation. At this
point, you should be able to close that tab and retry from the NiFi UI.

Let us know if this doesn't help and we can see if something isn't
getting generated and updated correctly.

Matt

On Thu, Aug 17, 2023 at 2:38 PM Russell Bateman
wrote:


Seems like a really stupid user/browser question, but I cannot seem to
get changes I've made to properties, relationships, attributes read or
written, etc. *for custom processors*. Also, from the Properties tab in
Configuring Processor, the cartoon blurbs obtained by hovering over (?)
aren't updated either.

This is despite the fact that changes I make to my /additionalDetails.html/
files come

Re: Refreshing HTML displayed for View usage

2023-08-17 Thread Russell Bateman

Matt,

I experimented with the Open in New Window button, but the new window, 
which probably isn't an iframe, nevertheless doesn't respond to attempts 
to get around the cache.


You're probably right about the version not changing being at the root 
of the browser's inability to respond to a change. Sadly, the 
edit-display-debug-edit cycle doesn't excite me into changing versions 
just so I can see the result.


I'm certainly not going to clamor for work to be done to get around 
this. It's annoying, but not crippling.


Thanks,

Russ


On 8/17/23 13:54, Matt Gilman wrote:

Russell,

Thanks for the response. The suggestion to open a new tab was for the
generated documentation, not the NiFi canvas itself. The generated
documentation is shown in an iframe which will offer you the menu item in
the context menu for opening in a new tab. IIRC, the path to the
generated documentation contains the version of the component. For folks
simply using a released version of NiFi this works great since versions
will differ and any browser caching will only optimize page loading for the
user. If you're a developer, however, you're likely making changes to your
component and the version is not changing (x.y.z-SNAPSHOT). This is where
the browser cache could result in the behavior you're seeing.

Matt

On Thu, Aug 17, 2023 at 3:21 PM Russell Bateman
wrote:


Thanks, Matt,

(Since I use Chrome 99% of the time, I'm using Chrome here.)

Clicking on the reload control next to the address bar, while holding
down Shift, reloads the canvas, but the NiFi Documentation page
disappears and I have to reload it using the View usage context menu of
my processor. Once reloaded, nothing has changed.

This is also the behavior of the Configure Processor dialog. As I reload
View usage, ...

In Chrome, there's no effective context-menu option to right-click in
the page contents and open in a new tab. I can...

 Back
 Forward
 Reload
 ---
 Save As...
 Print
 Cast
 Search images with Google
 -
 Send to your devices
 Create QR code for this page
 
 Translate
 -
 View page source
 View frame source
 Reload frame
 Inspect


If I right-click the current tab itself, I'm able to choose
Duplicate, but that doesn't seem to lead to anything more helpful.

By the way, I'm doing this using NiFi 1.23.0. I have also verified the
date of my NAR to ensure that I'm not repeatedly working using a NAR
with old content. Also, the version displayed for the custom processor
is that of the various /pom.xml/ files I'm building with.

I usually develop on the latest NiFi version. However, my company has
NARs that won't load beginning with 1.14.0. I just brought up 1.13.2, what
most of our customers run, and tried my latest NAR, which *does* display
my changes. Same with 1.1.2, which represents the oldest platform
executing at any customer site. Exceptionally, I set up 1.19.1, where it
*also works*.

This is unexpected; maybe it does point, as unlikely as it seems, to
something changed in 1.23.0 (instead of being caused by anything between
the dossier of my chair and the space bar of my keyboard as I have been
expecting to learn. ;-)  )

It could still be me and I'm just not seeing the obvious yet.


On 8/17/23 12:44, Matt Gilman wrote:

Russell,

Assuming this is a browser cache issue, can you try right-clicking
anywhere in the page contents of the generated documentation and open it
in a new tab. Once that is open, can you try doing a hard refresh by
holding Shift while clicking Reload next to the address bar? This should
clear the cache and fetch the updated generated documentation. At this
point, you should be able to close that tab and retry from the NiFi UI.

Let us know if this doesn't help and we can see if something isn't
getting generated and updated correctly.

Matt

On Thu, Aug 17, 2023 at 2:38 PM Russell Bateman
wrote:


Seems like a really stupid user/browser question, but I cannot seem to
get changes I've made to properties, relationships, attributes read or
written, etc. *for custom processors*. Also, from the Properties tab in
Configuring Processor, the cartoon blurbs obtained by hovering over (?)
aren't updated either.

This is despite the fact that changes I make to my /additionalDetails.html/
files come through with no problem.

I have tried Chrome, Opera, Brave and Firefox. All [mis]behave identically.

I have tried killing the browser running the NiFi UI, stopping also the
NiFi instance, relaunching NiFi, quickly asking the browser to display
the canvas, then holding down the Ctrl key as I click Reload.

In past times, I have observed that eventually the changes are
recognized and displayed, so this is not permanent; however, it's very
annoying, especially when I'm polishing my processor documentation
wording in the class that extends AbstractProcessor. All the while, as I
say, the processor's

Re: Refreshing HTML displayed for View usage

2023-08-17 Thread Russell Bateman

Thanks, Matt,

(Since I use Chrome 99% of the time, I'm using Chrome here.)

Clicking on the reload control next to the address bar, while holding 
down Shift, reloads the canvas, but the NiFi Documentation page 
disappears and I have to reload it using the View usage context menu of 
my processor. Once reloaded, nothing has changed.


This is also the behavior of the Configure Processor dialog. As I reload 
View usage, ...


In Chrome, there's no effective context-menu option to right-click in 
the page contents and open in a new tab. I can...


   Back
   Forward
   Reload
   ---
   Save As...
   Print
   Cast
   Search images with Google
   -
   Send to your devices
   Create QR code for this page
   
   Translate
   -
   View page source
   View frame source
   Reload frame
   Inspect


If I right-click the current tab itself, I'm able to choose 
Duplicate, but that doesn't seem to lead to anything more helpful.


By the way, I'm doing this using NiFi 1.23.0. I have also verified the 
date of my NAR to ensure that I'm not repeatedly working using a NAR 
with old content. Also, the version displayed for the custom processor 
is that of the various /pom.xml/ files I'm building with.


I usually develop on the latest NiFi version. However, my company has 
NARs that won't load beginning with 1.14.0. I just brought up 1.13.2, what 
most of our customers run, and tried my latest NAR, which *does* display 
my changes. Same with 1.1.2, which represents the oldest platform 
executing at any customer site. Exceptionally, I set up 1.19.1, where it 
*also works*.


This is unexpected; maybe it does point, as unlikely as it seems, to 
something changed in 1.23.0 (instead of being caused by anything between 
the dossier of my chair and the space bar of my keyboard as I have been 
expecting to learn. ;-)  )


It could still be me and I'm just not seeing the obvious yet.


On 8/17/23 12:44, Matt Gilman wrote:

Russell,

Assuming this is a browser cache issue, can you try right-clicking anywhere
in the page contents of the generated documentation and open it in a new
tab. Once that is open, can you try doing a hard refresh by holding Shift
while clicking Reload next to the address bar? This should clear the cache
and fetch the updated generated documentation. At this point, you should be
able to close that tab and retry from the NiFi UI.

Let us know if this doesn't help and we can see if something isn't getting
generated and updated correctly.

Matt

On Thu, Aug 17, 2023 at 2:38 PM Russell Bateman
wrote:


Seems like a really stupid user/browser question, but I cannot seem to
get changes I've made to properties, relationships, attributes read or
written, etc. *for custom processors*. Also, from the Properties tab in
Configuring Processor, the cartoon blurbs obtained by hovering over (?)
aren't updated either.

This is despite the fact that changes I make to my /additionalDetails.html/
files come through with no problem.

I have tried Chrome, Opera, Brave and Firefox. All [mis]behave identically.

I have tried killing the browser running the NiFi UI, stopping also the
NiFi instance, relaunching NiFi, quickly asking the browser to display
the canvas, then holding down the Ctrl key as I click Reload.

In past times, I have observed that eventually the changes are
recognized and displayed, so this is not permanent; however, it's very
annoying, especially when I'm polishing my processor documentation
wording in the class that extends AbstractProcessor. All the while, as I
say, the processor's corresponding /additionalDetails.html/ displays
changes I make to it in a timely and accurate fashion.

Suggestions?


Refreshing HTML displayed for View usage

2023-08-17 Thread Russell Bateman
Seems like a really stupid user/browser question, but I cannot seem to 
get changes I've made to properties, relationships, attributes read or 
written, etc. *for custom processors*. Also, from the Properties tab in 
Configuring Processor, the cartoon blurbs obtained by hovering over (?) 
aren't updated either.


This is despite the fact that changes I make to my /additionalDetails.html/ 
files come through with no problem.


I have tried Chrome, Opera, Brave and Firefox. All [mis]behave identically.

I have tried killing the browser running the NiFi UI, stopping also the 
NiFi instance, relaunching NiFi, quickly asking the browser to display 
the canvas, then holding down the Ctrl key as I click Reload.


In past times, I have observed that eventually the changes are 
recognized and displayed, so this is not permanent; however, it's very 
annoying, especially when I'm polishing my processor documentation 
wording in the class that extends AbstractProcessor. All the while, as I 
say, the processor's corresponding /additionalDetails.html/ displays 
changes I make to it in a timely and accurate fashion.


Suggestions?

Custom processor once had properties, has no more now, but they still show up

2023-07-20 Thread Russell Bateman
I have a custom processor that I modified, dropping a few properties because a 
specification change made them useless. I removed them. The processor 
works, but in the configuration for this processor (in the NiFi UI), the 
processor appears to have kept them, i.e., they're not disappearing. I 
would have expected trash cans on each, but they're just still there.


Yes, the processor bears the latest NAR version number and there is no 
"duplicate" or older NAR in the /extensions/ subdirectory.


Is my best recourse to remove the processor from the flow and re-add it? 
This is not a problem, but I'll have down-streamers with their own flows 
I'm not in contact with who will find this behavior unsettling. I could 
issue documentation, but I try to keep extra work to a minimum for them.


Thanks for any thoughts.

Russ

Re: Use of attribute uuid and other "native" attributes

2023-07-18 Thread Russell Bateman
Of course, a custom processor can create any attribute, including an 
"external id field." I don't think it can "lose" the original uuid 
since, if it attempts to reset it, the action will be quietly ignored 
(per Mark's answer).


Note that uuid figures prominently in the display of provenance--in my 
mind the crucial nature of my question. [1]


My question was about the "sanctified" state (or not) of uuid and Matt 
and Mark gave succinct and useful answers that I will explore. I was 
unaware of the suggested "best practice" of considering losing any and 
all previously established attributes before sending flowfiles on. I 
have long done this explicitly in the case of attributes I create, but 
will now contemplate doing it for other attributes I did not create and 
therefore have respected "religiously."


Russ

[1] 
https://www.tutorialspoint.com/apache_nifi/apache_nifi_data_provenance.htm


On 7/18/23 14:07, Edward Armes wrote:

Hmm,

I've seen this come up a few times now. I wonder: is there a need for a rename
of the uuid field and the creation of an external id field?

Edward

On Tue, 18 Jul 2023, 20:53 Lucas Ottersbach,
wrote:


Hey Matt,

you wrote that both `Session.create` and `Session.clone` set a new FlowFile
UUID to the resulting FlowFile. This somewhat sounds like there is an
alternative way where the UUID is not controlled by the framework itself?

I've got a different use case than Russell, but was wondering whether it is
even possible to control the FlowFile UUID as a Processor developer? I've
got a processor pair for inter-cluster transfer of FlowFiles (where
Site-to-Site is not applicable). As of now, the UUID on the receiving side
differs from the original on the origin cluster, because I'm using
`Session.create`.
Is there a way to control the UUID of new FlowFiles?


Best regards,

Lucas

Matt Burgess  schrieb am Di., 18. Juli 2023, 20:23:


In general I recommend only sending on those attributes that will be
used at some point downstream (unless you have an "original"
relationship that should maintain the original state with respect to
provenance). If you don't know that ahead of time you'll probably need
to send all/most of the attributes just in case.

Are you using session.create() or session.clone()? They both set a new
"uuid" attribute on the created FlowFile, with at least the latter
setting some other attributes as well (see the Developer Guide [1] for
more details).

Regards,
Matt

[1]https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html
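For illustration, a hedged sketch of the split pattern under discussion (imagined inside onTrigger, where original came from session.get(); REL_SPLIT and resourceBytes are hypothetical). The point is that the child's uuid is assigned by the framework; a putAttribute("uuid", ...) would be quietly ignored:

   FlowFile child = session.create(original);   // framework assigns a fresh uuid, provenance-linked to parent
   child = session.putAttribute(child, "resource.type", "Patient");
   child = session.write(child, out -> out.write(resourceBytes));
   session.transfer(child, REL_SPLIT);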

On Tue, Jul 18, 2023 at 12:25 PM Russell Bateman
wrote:

I have a custom processor, /SplitHl7v4Resources/, that splits out
individual FHIR resources (Patients, Observations, Encounters, etc.)
from great Bundle flowfiles. So, for a given flowfile, it's split into
hundreds of smaller ones.

When I do this, I leave the existing NiFi attributes as they were on
the original flowfile.

As I contemplate the uuid attribute, it occurs to me that I should find
out what its *significance is for provenance and other potential
debugging/tracing concerns*. I never really look at it, but, if there
were some kind of melt-down in a production environment, would I care
that it multiplied across hundreds of flowfiles besides the original
one?

Also these two other NiFi attributes remain unchanged:

 filename
 path

I do garnish each flowfile with many pointed/significant new attributes
like resource.type that are my own. In my processing, I don't care
about NiFi's original attributes, but should I?

Thanks,
Russ


Use of attribute uuid and other "native" attributes

2023-07-18 Thread Russell Bateman
I have a custom processor, /SplitHl7v4Resources/, that splits out 
individual FHIR resources (Patients, Observations, Encounters, etc.) 
from great Bundle flowfiles. So, for a given flowfile, it's split into 
hundreds of smaller ones.


When I do this, I leave the existing NiFi attributes as they were on the 
original flowfile.


As I contemplate the uuid attribute, it occurs to me that I should find 
out what its *significance is for provenance and other potential 
debugging/tracing concerns*. I never really look at it, but, if there 
were some kind of melt-down in a production environment, would I care 
that it multiplied across hundreds of flowfiles besides the original one?


Also these two other NiFi attributes remain unchanged:

   filename
   path


I do garnish each flowfile with many pointed/significant new attributes 
like resource.type that are my own. In my processing, I don't care about 
NiFi's original attributes, but should I?


Thanks,
Russ

Re: Possible Docker bug in 1.13.2

2023-06-08 Thread Russell Bateman

Thanks, Chris.

In fact, I merely hand-corrected the instances of the version in 
/Dockerfile/ and in /DockerImage.txt/, then ran /DockerBuild.sh/. I got 
exactly the image I wanted. It looked like a potential if irrelevant bug 
and I thought I'd report it.


Best,
Russ

On 6/7/23 19:17, Chris Sampson wrote:

The DockerImage.txt file isn't always updated in the repo, a bit like the
README for Docker Hub - it probably should be, but is often forgotten (same
for all of the convenience images built after a release). Indeed, this is
currently set to 1.15.1 on `main`, but the images in Docker Hub since then
have certainly not used an incorrect version.

NiFi 1.16.0 [1] updated the DockerBuild.sh script to allow easier
specification of the NIFI_IMAGE_VERSION from the command line. A small
issue has just been spotted with this script's use of the text file to
obtain the version's default value too [4].

Before then, I think the process was for someone (with access to push
images to Docker Hub) to change the text file locally, build & push the
image, but not always then raise a PR to update the text file.

To manually build the Docker Image locally containing the correct version
of the downloaded convenience binary from the Apache servers, simply
recreate the `docker build` command from within the DockerBuild.sh [2]
file, substituting the correct values for the `--build-arg`s.

Alternatively, run the dockermaven [3] build for the repo (having checked
out the correct tag from GitHub), enabling the `-P docker` profile as you
do so in your `mvn` command. Note, however, that there were differences
between the dockermaven and dockerhub builds (both the included script
files and the Dockerfile structure) that weren't rationalised until NiFi
1.16.0+.


[1]:
https://github.com/apache/nifi/blob/rel/nifi-1.16.0/nifi-docker/dockerhub/DockerBuild.sh

[2]:
https://github.com/apache/nifi/blob/rel/nifi-1.13.2/nifi-docker/dockerhub/DockerBuild.sh#L36

[3]:
https://github.com/apache/nifi/tree/rel/nifi-1.16.0/nifi-docker/dockermaven

[4]:
https://apachenifi.slack.com/archives/CDGMCSDJT/p1686135561932289?thread_ts=1686135561.932289&cid=CDGMCSDJT

On Thu, 8 Jun 2023, 00:10 Russell Bateman,  wrote:


I'm re-rolling in order to update the bundled Java to 11 to permit
using the new Java HTTP client. This seems to work well; I fixed
the bug locally.

Maybe too old to be important, but NiFi 1.14.0 is a quantum step up for
several aspects of processor writing requiring refactoring. So, until we
can shed the old NAR we cannot rebuild, we're stuck at NiFi 1.13.2.


On 6/7/23 15:31, Russell Bateman wrote:

I downloaded sources to 1.13.2 in order to hand-spin my own container
image. When I got down to
/nifi-1.13.2/nifi-docker/dockerhub/Dockerfile/, I found:

 ...
 ARG NIFI_VERSION=1.13.1
 ...

and the version is also wrong in /DockerImage.txt/ which
/DockerBuild.sh/ consumes.

Indeed, the image that is built appears to be versioned 1.13.1 and not
1.13.2 when listing local Docker repository images:

 REPOSITORY    TAG      IMAGE ID       CREATED          SIZE
 apache/nifi   1.13.1   8c18038f152a   30 minutes ago   2.06GB

Why am I juggling so ancient a version? Because I have custom
processors that cannot be rebuilt (source-code gone) and will not run
on 1.14.0 and later.

Russ


Re: Possible Docker bug in 1.13.2

2023-06-07 Thread Russell Bateman
I'm re-rolling in order to update the bundled Java to 11 to permit 
using the new Java HTTP client. This seems to work well; I fixed 
the bug locally.


Maybe too old to be important, but NiFi 1.14.0 is a quantum step up for 
several aspects of processor writing requiring refactoring. So, until we 
can shed the old NAR we cannot rebuild, we're stuck at NiFi 1.13.2.



On 6/7/23 15:31, Russell Bateman wrote:
I downloaded sources to 1.13.2 in order to hand-spin my own container 
image. When I got down to 
/nifi-1.13.2/nifi-docker/dockerhub/Dockerfile/, I found:


...
ARG NIFI_VERSION=1.13.1
...

and the version is also wrong in /DockerImage.txt/ which 
/DockerBuild.sh/ consumes.


Indeed, the image that is built appears to be versioned 1.13.1 and not 
1.13.2 when listing local Docker repository images:


REPOSITORY    TAG      IMAGE ID       CREATED          SIZE
apache/nifi   1.13.1   8c18038f152a   30 minutes ago   2.06GB

Why am I juggling so ancient a version? Because I have custom 
processors that cannot be rebuilt (source-code gone) and will not run 
on 1.14.0 and later.


Russ


Possible Docker bug in 1.13.2

2023-06-07 Thread Russell Bateman
I downloaded sources to 1.13.2 in order to hand-spin my own container 
image. When I got down to 
/nifi-1.13.2/nifi-docker/dockerhub/Dockerfile/, I found:


   ...
   ARG NIFI_VERSION=1.13.1
   ...

and the version is also wrong in /DockerImage.txt/ which 
/DockerBuild.sh/ consumes.


Indeed, the image that is built appears to be versioned 1.13.1 and not 
1.13.2 when listing local Docker repository images:


   REPOSITORY    TAG      IMAGE ID       CREATED          SIZE
   apache/nifi   1.13.1   8c18038f152a   30 minutes ago   2.06GB

Why am I juggling so ancient a version? Because I have custom processors 
that cannot be rebuilt (source-code gone) and will not run on 1.14.0 and 
later.


Russ

Re: Usage Documentation for Custom Processors

2023-04-04 Thread Russell Bateman

Matthew,

If you feel that the documentation generated from the annotations at the 
top of your custom processor class (@CapabilityDescription, etc., of 
which Bryan spoke) is insufficient, it's also possible to supplement it with

   src/main/resources/docs/<fully.qualified.package>.CustomProcessorClass/additionalDetails.html

You write it in simple HTML with embedded CSS. Your user reaches it via 
a hyperlink on the (standard) processor usage page, put there when the 
framework notices that you've supplied it (the directory name including 
the package path, the filesystem location, etc. are crucial).
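For reference, a hedged sketch of the annotation-driven piece (the names are made up; additionalDetails.html, if present at the path above, shows up as the hyperlink described):

   import org.apache.nifi.annotation.behavior.WritesAttribute;
   import org.apache.nifi.annotation.behavior.WritesAttributes;
   import org.apache.nifi.annotation.documentation.CapabilityDescription;
   import org.apache.nifi.annotation.documentation.Tags;
   import org.apache.nifi.processor.AbstractProcessor;
   import org.apache.nifi.processor.ProcessContext;
   import org.apache.nifi.processor.ProcessSession;

   @Tags({"example", "custom"})
   @CapabilityDescription("This text becomes the body of the generated usage page.")
   @WritesAttributes({@WritesAttribute(attribute = "example.attribute",
           description = "Documented on the usage page as well.")})
   public class CustomProcessorClass extends AbstractProcessor {
       @Override
       public void onTrigger(ProcessContext context, ProcessSession session) {
           // processor logic elided; only the documentation annotations matter here
       }
   }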


I do this for almost every last custom processor I write as a favor to 
my downstream flow writers.


Cheers,

Russ


On 4/4/23 08:54, Matthew Baine wrote:

Hi Bryan,

Sorry, on a separate note, what would be the best way to set up Usage 
Documentation for a custom processor?



We can't seem to get this right with the information online and in the 
NiFi developer guide 
(https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html). 
Our custom processors seem to only publish documentation of the native 
processors.



Kind Regards,
Matthew

On Tue, 4 Apr 2023 at 13:54, Matthew Baine  
wrote:


Hi Bryan,

Sorry for the delayed response, and thank you so much for the
feedback!

We will attempt the advised approach and revert if we run into any
trouble.

Thanks again!

Regards,

On Thu, 30 Mar 2023 at 16:49, Bryan Bende  wrote:

Hello,

This might not give you exactly what you want, but the MiNiFi
Toolkit already has the ability to transform the JSON snapshot from
registry; there are actually two commands:

"transform" - for XML templates
"transform-vfs" - for versioned flow snapshot (JSON from registry) [1]

It doesn't pull the snapshot from registry directly, so you would have
to script something to download the snapshot and then run transform-vfs.

Thanks,

Bryan

[1]

https://github.com/apache/nifi/blob/main/minifi/minifi-toolkit/minifi-toolkit-configuration/src/main/java/org/apache/nifi/minifi/toolkit/configuration/ConfigMain.java#L62

On Thu, Mar 30, 2023 at 10:22 AM Simeon Wentzel  wrote:
>
> Dear NiFi dev team
>
> Can you add extended functionality to the MiNiFi toolkit to extract a
> flow from the NiFi Registry software and convert it to the appropriate
> conf.yml file?
>
> We have found a limitation regarding the conversion in the MiNiFi
> toolkit: it can only convert the .xml file template extracted from a
> NiFi canvas on Java version 8; it cannot do the conversion on Java 11,
> which we have migrated to.
>
> Although extracting the flow as a template out of NiFi and then
> converting it to the conf.yaml file works, we find it a bit cumbersome
> because we cannot implement it in our pipeline to automate the process.
>
> Allowing the MiNiFi toolkit to pull a flow from the NiFi Registry and
> then convert it would give us the functionality to add this to our
> Jenkins pipeline to build individual docker containers for each of our
> flows.
>
> Regards
> Simeon
> DevOps Engineer






Re: ReplaceText 1.19 not processing Flowfile

2022-12-11 Thread Russell Bateman
Also, this strikes me as a NiFi Users List (us...@nifi.apache.org) 
question, though many of us haunt both forums.



On 12/11/22 07:55, Mark Payne wrote:

Hello,

It looks like the attachment didn’t come through. The mailing list often strips 
out attachments from emails. Perhaps put them in pastebin or google drive or a 
GitHub gist and send a link?

Thanks
Mark

Sent from my iPhone


On Dec 11, 2022, at 7:33 AM, develo...@alucio.dev wrote:

Hello, I am new to NiFi (right now on 1.19.0) and currently having trouble 
with the ReplaceText processor.
I did my best to find resources on the net to find out what's going on, but 
nothing seems to address my issue.
I built a simple flow... generating flowfile -> ReplaceText.

I can see that the Read counter in ReplaceText increased, but somehow the 
queue is stuck (data is being penalized).
On the ReplaceText right top corner, I can see some info window saying that the flowfile 
was transferred to "Success", but I don't see anything going on in the Success 
route.

I don't see anything in the log file that would indicate any errors relating to 
ReplaceText.
Attached are my flow and the ReplaceText config.
Could anyone point me in the right direction please?

Seems like this issue is similar to NIFI-6426

Thanks

Thanks


Re: Grammar error in error message

2022-11-16 Thread Russell Bateman
Yes, it should be the infinitive */receive/* instead of the past 
participle /received/.
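With the fix applied, the line would read:

   getLogger().warn("failed to receive acknowledgment for HOLD with ID {} sent by {}; rolling back session", new Object[] {id, wrapper.getClientIP()});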


On 11/16/22 08:18, Paul Schou wrote:

This error message does not look like it is grammatically correct:

https://github.com/gkatta4113/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ListenHTTP.java#L332

getLogger().warn("failed to received acknowledgment for HOLD with ID {}
sent by {}; rolling back session", new Object[] {id,
wrapper.getClientIP()});

- Paul Schou



Re: How to manage security artifacts from a custom processor

2022-07-05 Thread Russell Bateman
I appreciate the responses. I will try out the canonical 
/StandardSSLContextService/ first (since that's what I am using with 
Kafka), then imitate the other sample as needed.


However, where/how do I install the certificates I'll be given for use? 
I would expect something for certain representing the third-party 
service in a truststore and maybe another (a private key) in a keystore.
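For what it's worth, a hedged sketch of the pattern the replies point to (names illustrative; createContext() assumes a reasonably recent 1.x SSLContextService). The keystore and truststore files themselves live wherever the StandardSSLContextService is configured to find them on the NiFi host, so the processor never touches the filesystem directly:

   import javax.net.ssl.SSLContext;
   import org.apache.nifi.components.PropertyDescriptor;
   import org.apache.nifi.ssl.SSLContextService;

   static final PropertyDescriptor SSL_CONTEXT_SERVICE = new PropertyDescriptor.Builder()
           .name("SSL Context Service")
           .description("Supplies the keystore (my identity) and truststore (who I trust).")
           .identifiesControllerService(SSLContextService.class)
           .required(true)
           .build();

   // in onTrigger/onScheduled:
   final SSLContextService service = context.getProperty(SSL_CONTEXT_SERVICE)
           .asControllerService(SSLContextService.class);
   final SSLContext sslContext = service.createContext();
   // hand sslContext to the HTTP client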



On 7/5/22 16:30, Russell Bateman wrote:
From a custom processor, I intend to interface with a third-party 
service (via simple HTTP client), however, I would need as I 
understand it to


a) maintain a private key by which I can identify myself to that
third-party service and
b) maintain a trusted-store certificate by which I can guarantee
the identity of the service.

This is pretty far outside my own experience. I have been reading on 
how this is achieved in Java, but in my mind a complication arises 
from the fact that a custom NiFi processor lives within NiFi's JVM. My 
question is therefore, how can I control the certificates and 
authorities for my use in or associated with NiFi's JVM. Clearly, I 
don't grok this well enough even to ask the question; I'm hoping 
someone can see through what I'm asking and point me in a good 
direction to study.


I've written a pile of successful and useful custom NiFi processors to 
cover proprietary needs, so custom-processor writing isn't a mystery. 
Certificates, keys, trusts and security in general still is.


Profuse thanks,

Russ


How to manage security artifacts from a custom processor

2022-07-05 Thread Russell Bateman
From a custom processor, I intend to interface with a third-party 
service (via simple HTTP client), however, I would need as I understand 
it to


   a) maintain a private key by which I can identify myself to that
   third-party service and
   b) maintain a trusted-store certificate by which I can guarantee the
   identity of the service.

This is pretty far outside my own experience. I have been reading on how 
this is achieved in Java, but in my mind a complication arises from the 
fact that a custom NiFi processor lives within NiFi's JVM. My question 
is therefore, how can I control the certificates and authorities for my 
use in or associated with NiFi's JVM. Clearly, I don't grok this well 
enough even to ask the question; I'm hoping someone can see through what 
I'm asking and point me in a good direction to study.


I've written a pile of successful and useful custom NiFi processors to 
cover proprietary needs, so custom-processor writing isn't a mystery. 
Certificates, keys, trusts and security in general still is.


Profuse thanks,

Russ

Re: Reg Nifi java code generation

2022-05-02 Thread Russell Bateman
You don't have to write Java code to benefit from NiFi, which is an 
insanely useful framework all by itself with jillions of super-useful 
processors ready for use. However, if you plan to code your own 
proprietary processor to do something that hasn't been covered, here's a 
likely place to start:


https://nifi.apache.org/developer-guide.html

In addition, googling will help you find many samples to imitate.
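For orientation, the bare bones look roughly like this (a hedged sketch in the developer guide's style; the class and relationship names are made up):

   import java.util.Set;
   import org.apache.nifi.flowfile.FlowFile;
   import org.apache.nifi.processor.AbstractProcessor;
   import org.apache.nifi.processor.ProcessContext;
   import org.apache.nifi.processor.ProcessSession;
   import org.apache.nifi.processor.Relationship;

   public class MyFirstProcessor extends AbstractProcessor {

       static final Relationship REL_SUCCESS = new Relationship.Builder()
               .name("success").description("All FlowFiles go here.").build();

       @Override
       public Set<Relationship> getRelationships() {
           return Set.of(REL_SUCCESS);
       }

       @Override
       public void onTrigger(ProcessContext context, ProcessSession session) {
           FlowFile flowFile = session.get();   // pull one FlowFile from the input queue
           if (flowFile == null) {
               return;                          // nothing to do this round
           }
           // ...transform content or attributes here...
           session.transfer(flowFile, REL_SUCCESS);
       }
   }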

Best luck to you.

On 5/2/22 06:32, AKHILESH PATIDAR wrote:

Hi,
Great work done by you on NiFi Java code generation. I'm working on my
college project, which needs NiFi Java code; can you please help me
understand it?


Thank You
Akhilesh


Re: [DISCUSS] NiFi 2.0 Release Goals

2021-07-23 Thread Russell Bateman
Bringing up Elastic also reminds me that the Elastic framework has just 
recently transitioned out of open source, so to acknowledge that, maybe 
some effort toward OpenSearch is in order--I say this not understanding 
exactly how this sort of thing is considered in a large-scale, world-class 
software project like Apache NiFi. (I'm not a contributor, just a grateful consumer.)


Russ

On 7/23/21 10:28 AM, Matt Burgess wrote:

Along with the itemized list for ancient components we should look at
updating versions of drivers, SDKs, etc. for external systems such as
Elasticsearch, Cassandra, etc. There may be breaking changes but 2.0
is probably the right time to get things up to date to make them more
useful to more people.

On Fri, Jul 23, 2021 at 12:21 PM Nathan Gough  wrote:

I'm a +1 for removing pretty much all of this stuff. There are security
implications to keeping old dependencies around, so the more old code we
can remove the better. I agree that eventually we need to move to
supporting only Java 11+, and as our next release will probably be about 4
- 6 months from now that doesn't seem too soon. We could potentially break
this in two and remove the deprecated processors and leave 1.x on Java 8,
and finally start on 2.x which would support only Java 11. I'm unsure of
what implications changing the date and time handling would have - for
running systems that use long term historical logs, unexpected impacts to
time logging could be a problem.

As Joe says I think feature work will have to be dedicated to 2.x and we
could support 1.x for security fixes for some period of time. 2.x seems
like a gargantuan task but it's probably time to get started. Not sure how
we handle all open PRs and the transition between 1.x and 2.x.

On Fri, Jul 23, 2021 at 10:57 AM Joe Witt  wrote:


Jon

You're right we have to be careful and you're right there are still
significant Java 8 users out there.  But we also have to be careful
about security and sustainability of the codebase.  If we had talked
about this last year when that article came out I'd have agreed it is
too early.  Interestingly that link seems to get updated and I tried
[1] and found more recent data (not sure how recent).  Anyway it
suggests Java 8 is still the top dog but we see good growth on 11.  In
my $dayjob this aligns to what I'm seeing too.  Customers didn't seem
to care about Java 11 until later half last year and now suddenly it
is all over the place.

I think once we put out a NiFi 2.0 release we'd see rapid decrease in
work on the 1.x line just being blunt.  We did this many years ago
with 0.x to 1.x and we stood behind 0.x for a while (maybe a year or
so) but it was purely bug fix/security related bits.  We would need to
do something similar.  But feature work would almost certainly go to
the 2.x line.  Maybe there are other workable models but my instinct
suggests this is likely to follow a similar path.

...anyway I agree it isn't that easy of a call to dump Java 8.  We
need to make the call in both the interests of the user base and the
contributor base of the community.

[1] https://www.jetbrains.com/lp/devecosystem-2021/java/


Thanks
Joe

On Fri, Jul 23, 2021 at 7:46 AM Joe Witt  wrote:

Russ

Yeah the flow registry is a key part of it.  But also now you can
download the flow definition in JSON (upload i think is there now
too).  Templates offered a series of challenges such as we store them
in the flow definition which has made flows massive in an unintended
way which isn't fun for cluster behavior.

We have a couple cases where we headed down a particular concept and
came up with better approaches later.  We need to reconcile these with
the benefit of hindsight, and while being careful to be not overly
disruptive to existing users, to reduce the codebase/maintenance
burden and allow continued evolution of the project.

Thanks

On Fri, Jul 23, 2021 at 7:43 AM Russell Bateman 

wrote:

Joe,

I apologize for the off-topic intrusion, but what replaces templates?
The Registry? Templates rocked and we have used them since 0.5.x.

Russ

On 7/23/21 8:31 AM, Joe Witt wrote:

David,

I think this is a highly reasonable approach and such a focus will
greatly help make a 2.0 release far more approachable to knock out.
Not only that but tech debt reduction would help make work towards
major features we'd think about in a 'major release' sense more
approachable.

We should remove all deprecated things (as well as verify we have the
right list).  We should remove/consider removal of deprecated concepts
like templates.  We should consider whether we can resolve the various
ways we've handled what are now parameters down to one clean approach.
We should remove options in the nifi.properties which turn out to
never be used quite right (if there are).  There is quite a bit we can
do purely in the name of tech debt reduction.

Lots to consider here but I think this is the right discussion.

Thanks

On Fri, Jul 23, 2021 at 7:26 AM Bryan Bende  wrote

Re: [DISCUSS] NiFi 2.0 Release Goals

2021-07-23 Thread Russell Bateman

Joe,

I apologize for the off-topic intrusion, but what replaces templates? 
The Registry? Templates rocked and we have used them since 0.5.x.


Russ

On 7/23/21 8:31 AM, Joe Witt wrote:

David,

I think this is a highly reasonable approach and such a focus will
greatly help make a 2.0 release far more approachable to knock out.
Not only that but tech debt reduction would help make work towards
major features we'd think about in a 'major release' sense more
approachable.

We should remove all deprecated things (as well as verify we have the
right list).  We should remove/consider removal of deprecated concepts
like templates.  We should consider whether we can resolve the various
ways we've handled what are now parameters down to one clean approach.
We should remove options in the nifi.properties which turn out to
never be used quite right (if there are).  There is quite a bit we can
do purely in the name of tech debt reduction.

Lots to consider here but I think this is the right discussion.

Thanks

On Fri, Jul 23, 2021 at 7:26 AM Bryan Bende  wrote:

I'm a +1 for this... Not sure if this falls under "Removing Deprecated
Components", but I think we should also look at anything that has been
marked as deprecated throughout the code base as a candidate for
removal. There are quite a few classes, methods, properties, etc that
have been waiting for a chance to be removed.

On Fri, Jul 23, 2021 at 10:13 AM David Handermann
 wrote:

Team,

With all of the excellent work that many have contributed to NiFi over the
years, the code base has also accumulated some amount of technical debt. A
handful of components have been marked as deprecated, and some components
remain in the code base to support integration with old versions of various
products. Following the principles of semantic versioning, introducing a
major release would provide the opportunity to remove these deprecated and
unsupported components.

Rather than focusing the next major release on new features, what do you
think about focusing on technical debt removal? This approach would not
make for the most interesting release, but it provides the opportunity to
clean up elements that involve breaking changes.

Focusing on technical debt, at least three primary goals come to mind for
the next major release:

1. Removal of deprecated and unmaintained components
2. Require Java 11 as the minimum supported version
3. Transition internal date and time handling to JSR 310 java.time
components

*Removing Deprecated Components*

Removing support for older and deprecated components provides a great
opportunity to improve the overall security posture when it comes to
maintaining dependencies. The OWASP dependency plugin report currently
generates 50 MB of HTML for questionable dependencies, many of which are
related to old versions of various libraries.

As a starting point, here are a handful of components and extension modules
that could be targeted for removal in a major version:

- PostHTTP and GetHTTP
- ListenLumberjack and the entire nifi-lumberjack-bundle
- ListenBeats and the entire nifi-beats-bundle
- Elasticsearch 5 components
- Hive 1 and 2 components

*Requiring Java 11*

Java 8 is now over seven years old, and NiFi has supported general
compatibility with Java 11 for several years. NiFi 1.14.0 incorporated
internal improvements specifically related to TLS 1.3, which allowed
closing out the long-running Java 11 compatibility epic NIFI-5174. Making
Java 11 the minimum required version provides the opportunity to address
any lingering edge cases and put NiFi in a better position to support
current Java versions.

*JSR 310 for Date and Time Handling*

Without making the scope too broad, transitioning internal date and time
handling to use DateTimeFormatter instead of SimpleDateFormat would provide
a number of advantages. The Java Time components provide much better
clarity when it comes to handling localized date and time representations,
and also avoid the inherent confusion of java.sql.Date extending
java.util.Date. Many internal components, specifically Record-oriented
processors and services, rely on date parsing, leading to confusion and
various workarounds. The pattern formats of SimpleDateFormat and
DateTimeFormatter are very similar, but there are a few subtle differences.
Making this transition would provide a much better foundation going forward.
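For illustration only (not from the proposal itself), the shape of the change for a single pattern:

   import java.text.SimpleDateFormat;
   import java.time.LocalDateTime;
   import java.time.format.DateTimeFormatter;
   import java.util.Date;

   // Before: SimpleDateFormat is mutable, not thread-safe, and yields java.util.Date
   // (ParseException handling elided)
   Date legacy = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").parse("2021-07-23 10:13:00");

   // After: DateTimeFormatter is immutable and thread-safe (JSR 310)
   LocalDateTime modern = LocalDateTime.parse("2021-07-23 10:13:00",
           DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"));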

*Conclusion*

Thanks for giving this proposal some consideration. Many of you have been
developing NiFi for years and I look forward to your feedback. I would be
glad to put together a more formalized recommendation on Confluence and
write up Jira epics if this general approach sounds agreeable to the
community.

Regards,
David Handermann




Re: Penalizing one part of a flow over another

2021-04-22 Thread Russell Bateman

Thanks, Mark, both comments are very helpful.

Cheers,

Russ

On 4/22/21 11:19 AM, Mark Payne wrote:

Russell,

You can’t really set a “priority” of one flow over the other. A couple of options 
that may make sense for you though:

- You can set the Run Schedule to something other than “0 sec” for processors 
in the sub-flow. Perhaps set them to “100 millis” or something like that. This 
will lead to more latency in that flow but schedule the processors less 
frequently so they won’t interfere with your main flow as much. Here, though, 
if there’s a bunch of data coming in, it could result in backpressure all the 
way back to the main flow. So you’d want to consider if FlowFile Expiration is 
appropriate. That way you’d say if data sits in this first queue for more than 
3 seconds, for instance, expire it, so that it doesn’t cause back flow. You 
could schedule just the first processor in the sub-flow to run at a slower pace 
or all of them, depending on if you’re just trying to slow down the ingestion 
into the flow or all of the processing.

- Similarly, rather than mess with the Run Schedule, you could use a Control 
Rate and say that you’re only going to allow a throughput of maybe 10 MB/sec 
into the sub-flow. Again, that could cause backpressure so you’d want to 
consider FlowFile Expiration if you’d rather lose the FlowFiles than allow them 
to affect the main flow.

Hope that’s helpful!

Thanks
-Mark


On Apr 22, 2021, at 9:44 AM, Russell Bateman  wrote:

I have a flow performing ETL of HL7v4 (FHIR) documents on their way to indexing and 
storage. Custom processors perform the important transformations. Performance of this 
flow is at a premium for us. At some point along the way I want to gate off copies of raw 
or of transformed FHIR records (the flow writer's choice) to a new flow (a 
"subflow" of the total flow) for the purpose of validating those FHIR records 
as an option.

The main ETL flow will thus not be interrupted. Also, its performance should 
not be too hugely impacted by this new subflow. I have looked at priority 
techniques discussed, but usually the discussion is geared more toward a 
resulting order. I want to de-prioritize the performance of this new subflow to 
avoid handicapping the main flow, ideally anywhere from almost shutting down the 
subflow to allowing it equal performance with the main ETL flow.

Are there recommendations for such a thing? As I author many custom processors, 
is there something I could be doing in my code to aid this? I want rather to 
put the amount of crippling into the hands of my flow writers  a) by natural, 
existing configuration that's a feature of most NiFi processors and/or b) 
surfacing programming choices as configuration in my custom processor's 
configuration. Etc.

Any comments on this are hoped for and very welcome.

(Because I wrote so many custom processors that are crucial to my flows, I 
chose the NiFi developer- instead of the users list.)
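
For readers wanting to try Mark's second option above, a concrete rendering
(property names from memory of the 1.x ControlRate processor and illustrative
values -- verify against the documentation):

    ControlRate (placed at the entrance to the subflow):
        Rate Control Criteria : data rate
        Maximum Rate          : 10 MB
        Time Duration         : 1 sec

    On the queue feeding it, set FlowFile Expiration to, say, 3 sec so that
    a backed-up subflow sheds load instead of pushing backpressure into the
    main ETL flow.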





Penalizing one part of a flow over another

2021-04-22 Thread Russell Bateman
I have a flow performing ETL of HL7v4 (FHIR) documents on their way to 
indexing and storage. Custom processors perform the important 
transformations. Performance of this flow is at a premium for us. At 
some point along the way I want to gate off copies of raw or of 
transformed FHIR records (the flow writer's choice) to a new flow (a 
"subflow" of the total flow) for the purpose of validating those FHIR 
records as an option.


The main ETL flow will thus not be interrupted. Also, its performance 
should not be too hugely impacted by this new subflow. I have looked at 
priority techniques discussed, but usually the discussion is geared more 
toward a resulting order. I want to deprioritize this 
new subflow to avoid handicapping the main flow, ideally from almost 
shutting down the subflow to allowing it equal performance with the main 
ETL flow.


Are there recommendations for such a thing? As I author many custom 
processors, is there something I could be doing in my code to aid this? 
I want rather to put the amount of crippling into the hands of my flow 
writers: a) by natural, existing configuration that's a feature of most 
NiFi processors and/or b) surfacing programming choices as configuration 
in my custom processor's configuration. Etc.


Any comments on this are hoped for and very welcome.

(Because I wrote so many custom processors that are crucial to my flows, 
I chose the NiFi developer- instead of the users list.)




Re: [DISCUSS] Processors Market

2021-03-24 Thread Russell Bateman

Javi,

Don't despair. Could just be that folk are busy and haven't had time to 
reflect upon it.



On 3/23/21 11:55 PM, Javi Roman wrote:

I see that it has not been well received, I thought it would be a good idea
:-(

--
Javi Roman

Twitter: @javiromanrh
GitHub: github.com/javiroman
Linkedin: es.linkedin.com/in/javiroman
Big Data Blog: dataintensive.info


On Tue, Mar 23, 2021 at 6:44 AM Javi Roman  wrote:


Hi!

I'm not sure whether this topic has been discussed in the past. I would
like to open a thread talking about the possibility of creating a kind of
market of NiFi Processors, for third-party processors.

I'm thinking about something similar to Kubernetes Helm hub [1], or
Kubernetes Operators Hub [2], but for NiFi processors (NAR bundles).

This repository could be managed by NiFi Registry in order to use
processors in different stages of development (preview, stable, official
... and so forth).

This is a way of slimming down the NiFi image and opening the development
to new users.

It would also be interesting to discuss if this "processors market" could
be hosted in ASF resources.

What do you think?

[1] https://artifacthub.io/
[2] https://operatorhub.io/
--
Javi Roman

Twitter: @javiromanrh
GitHub: github.com/javiroman
Linkedin: es.linkedin.com/in/javiroman
Big Data Blog: dataintensive.info





Re: Preconfiguring dynamic properties

2021-02-24 Thread Russell Bateman

Thanks, that probably gets me all or most of what I wanted. I'll try it out.

On 2/24/21 11:19 AM, Bryan Bende wrote:

I don't think it was the intent to pre-add a dynamic property. You
should be able to set a default value though, the user still has to
click the + icon to add the property though.

On Wed, Feb 24, 2021 at 12:02 PM Russell Bateman  wrote:

I have a dynamic property in a custom processor that my down-streamers
struggle a little bit to configure (requires newlines and a peculiar
format). I would like to "preconfigure" a dynamic property as an example
that they can either modify or erase to add their own. Most of them
would probably just use what I preconfigure.

The point is that I don't wish it to be a full-fledged, static property.
I want to see a trash can to the right of it. The trash can is not a
feature of static properties.

Is this possible, wrong-headed, what?

Thanks,
Russ
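
A minimal sketch of Bryan's suggestion (the mapping value is illustrative,
formatted the way this processor expects): the dynamic descriptor returned by
the processor can carry a default value, which the UI offers once the user
clicks the + icon and names the property.

    @Override
    protected PropertyDescriptor getSupportedDynamicPropertyDescriptor( final String propertyDescriptorName )
    {
        return new PropertyDescriptor.Builder()
                .name( propertyDescriptorName )
                .defaultValue( "|http://loinc.org|LOINC\n|http://snomed.info/sct|SNOMED\n" )
                .addValidator( StandardValidators.NON_EMPTY_VALIDATOR )
                .required( false )
                .dynamic( true )
                .build();
    }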




Preconfiguring dynamic properties

2021-02-24 Thread Russell Bateman
I have a dynamic property in a custom processor that my down-streamers 
struggle a little bit to configure (requires newlines and a peculiar 
format). I would like to "preconfigure" a dynamic property as an example 
that they can either modify or erase to add their own. Most of them 
would probably just use what I preconfigure.


The point is that I don't wish it to be a full-fledged, static property. 
I want to see a trash can to the right of it. The trash can is not a 
feature of static properties.


Is this possible, wrong-headed, what?

Thanks,
Russ


Re: [discuss] we need to enable secure by default...

2021-02-10 Thread Russell Bateman
I second the concerns expressed, but second especially Bryan's pointing 
out that requiring LDAP/AD to be set up in order even to begin to use 
our framework would be a bit onerous for developers just interested in 
getting work done and a barrier to considering the framework should it 
be erected a little too high. Should we at least glance at how this is 
solved by the likes of other projects, Kafka and Cassandra come to mind, 
even if it means resorting to a store of a name or two? I didn't find 
getting into developing with them a pain, but making me jump through the 
hoop of setting up LDAP may very well have changed that.


These insecure instances of NiFi out there are not our community's 
fault. I suppose we're worried about getting splattered by bad press?


On 2/10/21 5:47 AM, Bryan Bende wrote:

I agree with the overall idea, although I would think it requires a
major release to make this kind of change to the default behavior.

Also, we have always avoided NiFi being a store of usernames and
passwords, so we don't have a login provider that uses a local file or
a database, we've always said you connect to LDAP/AD for that.

Obviously it can be implemented, but just pointing out that we'd have
to change our stance here if we want to provide a default username and
password to authenticate with.

On Tue, Feb 9, 2021 at 11:25 PM Andrew Grande  wrote:

Mysql has been generating an admin password on default installs for, like,
forever. This workflow should be familiar for many users.

I'd suggest taking the automation tooling into account and how a production
rollout (user-provided password) would fit into the workflow.

Andrew

On Tue, Feb 9, 2021, 8:15 PM Tony Kurc  wrote:


Joe,
In addition to your suggestions, were you thinking of making this processor
disabled by default as well?

Tony


On Tue, Feb 9, 2021, 11:04 PM Joe Witt  wrote:


Team

While secure by default may not be practical perhaps ‘not blatantly wide
open’ by default should be adopted.

I think we should consider killing support for http entirely and support
only https.  We should consider auto generating a user and password and
possibly server cert if nothing is configured and log the generated user
and password.  Sure it could still be configured to be non-secure, but
that would truly be an admin's fault. Now it's just 'on'.

This tweet is a great example of why

https://twitter.com/_escctrl_/status/1359280656174510081?s=21


Who agrees?  Who disagrees?   Please share ideas.

Thanks



Re: java api for changing parameter context

2021-01-27 Thread Russell Bateman

Wait! Can't this be done using the ReST APIs?

On 1/27/21 3:24 AM, u...@moosheimer.com wrote:

Hello NiFi-Core-Team,

Are you planning to create a high-level Java API for setting (and
clearing) individual parameters in the parameter context, so we can use
this API in processor development?

Example:
setParameter(string contextName, string parameterName, string
parameterValue, boolean sensitive);
deleteParameter(string contextName, string parameterName);

Some of our customers have systems with weekly changing parameter values
and/or access passphrase.
Apart from these nothing changes in the system and the changes can be
automated with self written processor.

Best regards,
Kay-Uwe Moosheimer
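
It can, though the parameter-context endpoints are asynchronous, so a scripted
update is a multi-step exchange. A rough sketch with curl (paths and payload
shape from memory of the 1.x REST API -- verify against your instance's
/nifi-api documentation; ids and values are placeholders):

    # 1. fetch the context to learn its current revision...
    curl -s https://nifi.example.com/nifi-api/parameter-contexts/<context-id>

    # 2. ...then submit an update request carrying only the parameter to change:
    curl -s -X POST -H 'Content-Type: application/json' \
         https://nifi.example.com/nifi-api/parameter-contexts/<context-id>/update-requests \
         -d '{"revision":{"version":3},
              "id":"<context-id>",
              "component":{"id":"<context-id>",
                "parameters":[{"parameter":{"name":"passphrase",
                                            "sensitive":true,
                                            "value":"<new value>"}}]}}'

    # 3. poll GET .../update-requests/<request-id> until it completes, then DELETE it.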







Re: Static processor design

2021-01-09 Thread Russell Bateman

Mark,

Thanks for responding. I think my question is a little more naive than 
that on my part.


I want to get those files through there as fast as possible. If I ask 
for /n/ files, what value of /n/ would get them through the quickest? 
After all, I will do nothing at all to any of them except transfer them 
on, and I don't care how many.


I write a lot of custom processors that do specific things to flowfiles 
one at a time. This isn't one of those. I don't care what's coming 
through, I just want to get every flowfile straight through with no changes.


Thanks.

Russ

On 1/9/21 9:09 AM, Mark Bean wrote:

Russell,

You can use "session.get(N)" where N is an integer. This will get up to N
flowfiles per OnTrigger() call.

-Mark


On Fri, Jan 8, 2021 at 5:07 PM Russell Bateman 
wrote:


Very well, I have decided to force customer flowfiles through this
processor (I did check out the /Listen/* processors, but chose this
easier solution). This now works. However,

It brings up another question: is this the most efficient way to pass
flowfiles straight through this processor (we're not processing them in
any way), or is there a batching technique that's faster, etc. I want
this to be straight-through, no back-pressure, throttling or influencing
their passage whatsoever (because I didn't want them coming through in
the first place). It should be as if this processor weren't there.

Thanks for any and all thoughts on this.

public class HumanReadables extends AbstractProcessor
{
  private boolean propertyModified = false;

  @Override
  public void onTrigger( final ProcessContext context, final ProcessSession session ) throws ProcessException
  {
    FlowFile flowfile = session.get();

    if( propertyModified )
    {
      propertyModified = false;
      // record effects of changed properties...
    }

    if( nonNull( flowfile ) )
      session.transfer( flowfile, SUCCESS );
  }

  ...

  @Override
  public void onPropertyModified( final PropertyDescriptor descriptor, final String oldValue, final String newValue )
  {
    propertyModified = true;
  }
}
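
And a minimal sketch of Mark Bean's session.get(N) suggestion applied to this
pure pass-through (the batch size of 1000 is arbitrary; SUCCESS is assumed to
be the processor's only relationship):

    @Override
    public void onTrigger( final ProcessContext context, final ProcessSession session ) throws ProcessException
    {
      // grab up to 1,000 queued flowfiles in one scheduling pass...
      final List< FlowFile > flowfiles = session.get( 1000 );

      if( flowfiles.isEmpty() )
        return;

      // ...and transfer them all unchanged; the framework commits the session.
      session.transfer( flowfiles, SUCCESS );
    }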






Re: Static processor design

2021-01-08 Thread Russell Bateman
Very well, I have decided to force customer flowfiles through this 
processor (I did check out the /Listen/* processors, but chose this 
easier solution). This now works. However,


It brings up another question: is this the most efficient way to pass 
flowfiles straight through this processor (we're not processing them in 
any way), or is there a batching technique that's faster, etc. I want 
this to be straight-through, no back-pressure, throttling or influencing 
their passage whatsoever (because I didn't want them coming through in 
the first place). It should be as if this processor weren't there.


Thanks for any and all thoughts on this.

public class HumanReadables extends AbstractProcessor
{
  private boolean propertyModified = false;

  @Override
  public void onTrigger( final ProcessContext context, final ProcessSession session ) throws ProcessException
  {
    FlowFile flowfile = session.get();

    if( propertyModified )
    {
      propertyModified = false;
      // record effects of changed properties...
    }

    if( nonNull( flowfile ) )
      session.transfer( flowfile, SUCCESS );
  }

  ...

  @Override
  public void onPropertyModified( final PropertyDescriptor descriptor, final String oldValue, final String newValue )
  {
    propertyModified = true;
  }
}




Re: Static processor design

2021-01-08 Thread Russell Bateman
I only put the code I want to execute in onTrigger(); I suspected it 
would not fire there. I know that this isn't what processors do. 
Configuration is a messy problem to solve when your downstreamers want 
it made easy. This is supposed to be a solution that allows them to 
remain in the NiFi UI and not have to run off doing harder things to 
configure. I could put what I'm doing in /HumanReadables/ into a real, 
running processor, but then, I would kind of have to add them to several 
processors and I wanted to avoid the confusion that created.


Here's the code. Thanks.

import com.windofkeltia.constants.HumanReadableMappings;

...

@TriggerWhenEmpty
@SideEffectFree
@CapabilityDescription( "Dynamic properties can be created to specify 
(or add to) static configuration"
  + " of key-value substitution pairs. See 
additional details." )

public class HumanReadables extends AbstractProcessor
{
  @Override
  public void onTrigger( final ProcessContext context, final 
ProcessSession session ) throws ProcessException

  {
    getLogger().info( "inside onTrigger()..." );

    for( Map.Entry< PropertyDescriptor, String > entry : 
context.getProperties().entrySet() )

    {
  PropertyDescriptor property = entry.getKey();

  // do work here--maybe pre-wipe all key-value pairs if we're just 
going to recreate them?

  final String PROPERTY_NAME  = property.getName();
  final String PROPERTY_VALUE = entry.getValue();

  logger.trace( "Processing configurable mappings titled \"" + 
PROPERTY_NAME + "\"" );


  try
  {
    harvestDynamicPropertyMappings( PROPERTY_VALUE );
  }
  catch( Exception e )
  {
getLogger().debug( e.getMessage() );
  }
    }
  }

  protected static void harvestDynamicPropertyMappings( final String 
PROPERTY_VALUE )

  {
    final String[] LINES  = PROPERTY_VALUE.split( "\n" );
    int    lineNumber = 0;

    if( LINES.length < 1 )
  return;

    final String WHICH_LIST = LINES[ 0 ];

    for( final String VALUE_LINE : LINES )
    {
  char delimiter = VALUE_LINE.charAt( 0 );
  int  position = VALUE_LINE.indexOf( delimiter, 1 );
  String key, value;

  key   = ( position < 0 ) ? VALUE_LINE.substring( 1 ) : 
VALUE_LINE.substring( 1, position ).trim();
  value = ( position > 0 ) ? VALUE_LINE.substring( position + 1 
).trim() : "";


  HumanReadableMappings.add( key, value );
    }
  }

  @Override
  protected PropertyDescriptor getSupportedDynamicPropertyDescriptor( 
final String propertyDescriptorName )

  {
    return new PropertyDescriptor.Builder()
 .required( false )
 .name( propertyDescriptorName )
 .addValidator( 
StandardValidators.NON_EMPTY_VALIDATOR )
 // or .addValidator( Validator.VALID ) 
if you do not wish it validated!

 .dynamic( true )
 .build();
  }

  private volatile Set< String > dynamicPropertyNames = new HashSet<>();

  @Override
  public void onPropertyModified( final PropertyDescriptor descriptor, 
final String oldValue, final String newValue )

  {
    getLogger().info( oldValue + " -> " + newValue );

    final Set< String > newDynamicPropertyNames = new HashSet<>( 
dynamicPropertyNames );


    if( isNull( newValue ) )
  newDynamicPropertyNames.remove( descriptor.getName() );
    else if( isNull( oldValue ) && descriptor.isDynamic() )
  newDynamicPropertyNames.add( descriptor.getName() );

    dynamicPropertyNames = Collections.unmodifiableSet( 
newDynamicPropertyNames );


    final Set< String > allDynamicProperties = dynamicPropertyNames;
  }

  @OnScheduled public void processProperties( final ProcessContext 
context )

  {
    for( Map.Entry< PropertyDescriptor, String > entry : 
context.getProperties().entrySet() )

    {
  PropertyDescriptor descriptor = entry.getKey();

  if( descriptor.isDynamic() )
    getLogger().debug( "Dynamic property named:\n    " + 
descriptor.getName()
    + ", value: " + 
entry.getValue().replaceAll( "\n", " + " ) );

    }
  }

  protected static final String DEFAULT_MAPPING_VALUE = ""
    + "|http://loinc.org |LOINC\n"
    + "|http://snomed.info/sct |SNOMED\n"
    + "|http://www.ama-assn.org/go/cpt  |CPT\n"
    + "|http://aapc.org |CPT\n"
    + "|http://www.nlm.nih.gov/research/umls/rxnorm |RxNorm\n"
    + "|http://hl7.org/fhir/sid/ndc |NDC\n"
    + "|http://hl7.org/fhir/sid/icd-9-cm |ICD-9\n"
    + "|http://hl7.org/fhir/sid/icd-10 |ICD-10\n";

  public static final PropertyDescriptor DEFAULT_MAPPINGS = new 
PropertyDescriptor.Builder()

  .name( "default mappings" )
  .displayName( "Default mappings" )
  .required( false )
  .expressionLanguageSupported( ExpressionLanguageScope.NONE )
  .defaultValue( 

Re: Static processor design

2021-01-08 Thread Russell Bateman
The code I really want to run is sitting in onTrigger(), though I could 
move it elsewhere.


Yes, I have tried

*Scheduling Strategy*of Timer driven
*Run Schedule*of 10 sec

...but the getLogger().info( "called from onTrigger()" ) never reaches 
/logs/nifi-app.log/ (while the logging statement from 
onPropertyModified() does reach the log every time I change properties to 
remove old or introduce new properties).



On 1/7/21 6:38 PM, Russell Bateman wrote:

(Inadequate title; didn't know what to call it.)

I have written a processor that doesn't feature any relationships.

It accepts dynamic properties that, in theory, when created (or 
removed, or when values are added or changed), set data into a class 
inside my NAR.


I wonder, however, at what I expect of it because, while it works in 
unit testing, it does not in practice. I can sort of guess why, but 
I'm not sure what to do about it. Given that I can create methods to 
be called at various opportunities by annotating thus:


@OnAdded
@OnEnabled
@OnRemoved
@OnScheduled
@OnUnscheduled
@OnStopped
@OnShutdown

There isn't one of these annotations that says to my brain, "When a 
dynamic property is added, changed or removed, wake up and run this 
method." Except, of course, for onPropertyModified(). A new property 
is duly added when created in configuration; my call to 
getLogger().info() from onPropertyModified() shows


2021-01-07 18:32:51,923 INFO [NiFi Web Server-78] 
c.windofkeltia.processor.HumanReadables 
HumanReadables[id=afa5b637-0176-1000-78bd-a74904054649] null -> 
|http://hospital.smarthealthit.org|Smart Health IT


But, how do I incite some code after the fact to awaken, analyze 
the newly added configuration and then affect the 
HumanReadableMappings class instance?


(Hope I haven't explained this too badly. I am willing to attach 
code--it's a very tiny processor.)


Thanks







Static processor design

2021-01-07 Thread Russell Bateman

(Inadequate title; didn't know what to call it.)

I have written a processor that doesn't feature any relationships.

It accepts dynamic properties that, in theory, when created (or 
removed, or when values are added or changed), set data into a class inside 
my NAR.


I wonder, however, at what I expect of it because, while it works in 
unit testing, it does not in practice. I can sort of guess why, but I'm 
not sure what to do about it. Given that I can create methods to be 
called at various opportunities by annotating thus:


   @OnAdded
   @OnEnabled
   @OnRemoved
   @OnScheduled
   @OnUnscheduled
   @OnStopped
   @OnShutdown

There isn't one of these annotations that says to my brain, "When a 
dynamic property is added, changed or removed, wake up and run this 
method." Except, of course, for onPropertyModified(). A new property is 
duly added when created in configuration; my call to 
getLogger().info() from onPropertyModified() shows


2021-01-07 18:32:51,923 INFO [NiFi Web Server-78] 
c.windofkeltia.processor.HumanReadables 
HumanReadables[id=afa5b637-0176-1000-78bd-a74904054649] null -> 
|http://hospital.smarthealthit.org|Smart Health IT


But, how do I incite some code after the fact to awaken, analyze the 
newly added configuration and then affect the HumanReadableMappings class 
instance?


(Hope I haven't explained this too badly. I am willing to attach 
code--it's a very tiny processor.)


Thanks





Re: Safely updating custom processors in existing flows...

2020-12-30 Thread Russell Bateman
I will be very happy to read what others contribute. Chris' suggestion, 
which has huge implications, I think, for my IDE and project structure, 
will be something to look into.


However, I was hoping someone would articulate the difference between 
*property names* and *property display names*: which of the two can be 
changed without invalidating the processor when it is simply replaced with 
the new one, etc.? I have read a lot about this topic, but I'm still 
confused. Is the display name supposed to be what shows up in the UI? Can 
it be changed without screwing up the processor? Is it the name that must 
stay fixed? Which is the safe one to modify?


I was hoping for some advice on taking quantum leaps (such as Chris 
discussed), but also on baby steps that must have been more common back 
in the days before processor versioning was supported.


Thanks
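
For readers with the same question, the convention documented in the NiFi
Developer's Guide is that .name() is the stable identifier persisted in the
flow, while .displayName() is only the UI label; once a processor is
deployed, keep .name() frozen and reword only .displayName(). A minimal
sketch (the property shown is illustrative):

    public static final PropertyDescriptor MAPPINGS = new PropertyDescriptor.Builder()
            .name( "default mappings" )          // stable key matched against flow.xml.gz; never change it
            .displayName( "Default mappings" )   // UI label; safe to reword or fix spelling later
            .required( false )
            .addValidator( StandardValidators.NON_EMPTY_VALIDATOR )
            .build();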


Safely updating custom processors in existing flows...

2020-12-30 Thread Russell Bateman
I have a custom processor actively used in customer flows. I need to 
enhance it, but avoid invalidating it when I update customers' existing 
NiFi installations.


   I know that the processor properties, in particular, the conflict
   between the property .name()and .displayName(), is a good way to get
   myself into trouble by changing the wording, correcting spelling, etc.

   Also, adding to or subtracting properties and/or relationships is a
   good way to blow it all up.

   How does versioning the processor solve any of this?

   Etc.

Can anyone share canonical advice on this?

Thanks,
Russ


Re: ETL to Nifi Migration

2020-12-26 Thread Russell Bateman
Unless you're certain you will need to write custom processors for NiFi, 
the forum you really want to subscribe to and post in is


NiFi Users 

Best regards!

On 12/26/20 7:37 AM, Sumith Karthikeyan wrote:

Hi Team,

Hope you all doing well !!!

This regards a Middle East public sector requirement to move their current ETL 
to Apache Nifi in AWS cloud. We have prepared a questionnaire in-align with our 
expectation and request your technical support to collect the details against 
each.

Please help me to connect right team/support members to get this done. 
Questionnaire is in excel format and not included here as I am not sure it's 
the right forum to do so.

Thanks in advance !!!

Merry Christmas and Happy New Year !!!

Thanks & Regards,
Sumith K


Disclaimer: This transmittal and/or attachments have been issued by LMRA. The 
information contained here within may be privileged or confidential. If you are 
not the intended recipient, you are hereby notified that you have received this 
transmittal in error; any review, dissemination, distribution or copying of 
this transmittal is strictly prohibited. If you have received this transmittal 
and/or attachments in error, please notify us immediately by reply to the 
sender or delete this message and all its attachments immediately.





Re: Okay to manage ComponentLog via ThreadLocal?

2020-12-08 Thread Russell Bateman
Yes, well, I would consider that I had out-foxed myself, but it's 
probably due to not quite having all my fingers around the scope of the 
processor. I guess there's only one component logger for the currently 
executing NiFi instance? Yeah, I was thinking really hard about 
per-thread stuff, which is how I used this before.


You're right. I swallowed the whole camel.

Thanks, Mark!


On 12/8/20 2:58 PM, Mark Payne wrote:

Russ,

Why not just use:

public class Foo {
    public void bar() {
        ComponentLog logger = CustomProcessor.getLogger();
        logger.warn( "This is a warning!" );
    }
}


Perhaps I’m missing something - or perhaps you made things simpler than they 
really are for demonstration purposes?

Thanks
-Mark



On Dec 8, 2020, at 4:54 PM, Russell Bateman  wrote:

Because it's so onerous to pass a reference to the logger down through 
parameters lists, I thought I might try using Java's thread-local store. I 
haven't been using it for anything else either, but I thought I'd start. For 
now, the logger is the only thing that tempts me. In past lives as a 
web-application writer, I used it quite a bit.

My question is, "Does this offend anyone who cares to give an opinion?" If 
there's a reason not to do this, I'll go to the effort (and muddy my parameter lists). 
Otherwise, I'll give it a try.


public class CustomProcessor extends AbstractProcessor
{
   private static ThreadLocal< ComponentLog > tls = new ThreadLocal<>();

   public static ThreadLocal< ComponentLog > getTls() { return tls; }

   @Override
   public void onTrigger( final ProcessContext context, final ProcessSession 
session ) throws ProcessException
   {
 // grab the logger and store it on the thread...
 tls.set( getLogger() );

 ...

 // we've finished using it--dump the logger...
 tls.remove();
   }
}

public class Foo
{
   public void bar()
   {
 ComponentLog logger = CustomProcessor.getTls().get();
 logger.warn( "This is a warning!" );
   }
}

Thanks for any opinions, statements of best practice, cat calls, sneers, etc.

;-)

Russ
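
For anyone landing here later: since the processor instance is shared, the
usual way to avoid both the parameter-list clutter and the ThreadLocal is to
hand helpers the logger once, at construction (a minimal sketch; Foo is
illustrative):

    public class Foo
    {
      private final ComponentLog logger;

      public Foo( final ComponentLog logger ) { this.logger = logger; }

      public void bar()
      {
        logger.warn( "This is a warning!" );
      }
    }

    // ...inside the processor, e.g. in onTrigger() or an @OnScheduled method:
    final Foo foo = new Foo( getLogger() );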




Okay to manage ComponentLog via ThreadLocal?

2020-12-08 Thread Russell Bateman
Because it's so onerous to pass a reference to the logger down through 
parameters lists, I thought I might try using Java's thread-local store. 
I haven't been using it for anything else either, but I thought I'd 
start. For now, the logger is the only thing that tempts me. In past 
lives as a web-application writer, I used it quite a bit.


My question is, "Does this offend anyone who cares to give an opinion?" 
If there's a reason not to do this, I'll go to the effort (and muddy my 
parameter lists). Otherwise, I'll give it a try.



public class CustomProcessor extends AbstractProcessor
{
  private static ThreadLocal< ComponentLog > tls = new ThreadLocal<>();

  public static ThreadLocal< ComponentLog > getTls() { return tls; }

  @Override
  public void onTrigger( final ProcessContext context, final 
ProcessSession session ) throws ProcessException

  {
    // grab the logger and store it on the thread...
    tls.set( getLogger() );

    ...

    // we've finished using it--dump the logger...
    tls.remove();
  }
}

public class Foo
{
  public void bar()
  {
    ComponentLog logger = CustomProcessor.getTls().get();
    logger.warn( "This is a warning!" );
  }
}

Thanks for any opinions, statements of best practice, cat calls, sneers, 
etc.


;-)

Russ


Re: APACHE NIFI CLUSTER INSTALLATION WALKTHROUGHS FOR 2NODE ON LINUX OS

2020-09-23 Thread Russell Bateman
You have treated Google as your friend and worked through the dozen or 
so examples by other folk doing this, right?


--just a suggestion.

On 9/23/20 1:21 PM, Abiodun Adegbile wrote:

Hello Team,

Still looking forward to your reply.

I got this though after I tried setting up the 2-node cluster on two
separate physical server VMs.

[inline screenshot not preserved in the archive]

Your prompt response will be highly appreciated.

KEEP SAFE, STAY SAFE & STAY STRONG

I celebrate you...

ADEGBILE ABIODUN A. (OCA, OCP, OCIFA, OCIAA, OCIAP)
Database Administrator/Infrastructure Services
Bluechip Technologies Limited
Plot 9B, Onikoyi Lane
Ikoyi. Lagos State.
Mobile : +234 806 206 9970
Website : www.bluechiptech.biz

...2016 Winner Oracle Transformational Deal of the Year - Nigeria.



*From:* Abiodun Adegbile
*Sent:* Sunday, September 20, 2020 2:33 AM
*To:* dev@nifi.apache.org 
*Subject:* APACHE NIFI CLUSTER INSTALLATION WALKTHROUGHS FOR 2NODE ON 
LINUX OS

Hello Team,

Good to write to you.

I am planning to implement a NiFi cluster installation for two
separate physical Linux servers (to be clustered) for my test lab. I
was hoping you could send walkthrough documentation for this.

Many thanks for your help. Will appreciate your prompt response.

KEEP SAFE, STAY SAFE & STAY STRONG

I celebrate you...

ADEGBILE ABIODUN A. (OCA, OCP, OCIFA, OCIAA, OCIAP)
Database Administrator/Infrastructure Services
Bluechip Technologies Limited
Plot 9B, Onikoyi Lane
Ikoyi. Lagos State.
Mobile : +234 806 206 9970
Website : www.bluechiptech.biz

...2016 Winner Oracle Transformational Deal of the Year - Nigeria.






Re: APACHE NIFI CLUSTER INSTALLATION WALKTHROUGHS FOR 2NODE ON LINUX OS

2020-09-20 Thread Russell Bateman

Google is your friend.

https://docs.cloudera.com/HDPDocuments/HDF3/HDF-3.5.1/nifi-configuration-best-practices/content/basic-cluster-setup.html
https://bryanbende.com/development/2018/10/23/apache-nifi-secure-cluster-setup
https://mintopsblog.com/2017/11/12/apache-nifi-cluster-configuration/
https://www.nifi.rocks/apache-nifi-docker-compose-cluster/

etc.

On 9/19/20 7:33 PM, Abiodun Adegbile wrote:

Hello Team,

Good to write to you.

I am planning to implement a NiFi cluster installation for two
separate physical Linux servers (to be clustered) for my test lab. I
was hoping you could send walkthrough documentation for this.

Many thanks for your help. Will appreciate your prompt response.

KEEP SAFE, STAY SAFE & STAY STRONG

I celebrate you...

ADEGBILE ABIODUN A. (OCA, OCP, OCIFA, OCIAA, OCIAP)
Database Administrator/Infrastructure Services
Bluechip Technologies Limited
Plot 9B, Onikoyi Lane
Ikoyi. Lagos State.
Mobile : +234 806 206 9970
Website : www.bluechiptech.biz

...2016 Winner Oracle Transformational Deal of the Year - Nigeria.






Re: TestRunner: enqueueing multiple flowfiles

2020-08-31 Thread Russell Bateman
Oh, no, Bryan, calling runner.enqueue() multiple times is a perfect 
solution. It wasn't clear that this was an option. I guess I missed that 
semantic in the Javadoc but, certainly, the name "enqueue" should have 
been a huge hint to me.


Thanks!

On 8/31/20 7:26 AM, Bryan Bende wrote:

I think you could call any of the enqueue methods multiple times to queue
multiple flow files.

If you really want to use the one that takes var args of FlowFile, then you
would need to create the MockFlowFiles yourself doing something like this...

https://github.com/apache/nifi/blob/main/nifi-mock/src/main/java/org/apache/nifi/util/StandardProcessorTestRunner.java#L443-L448

Instead of creating a new MockProcessSession, you would get the
ProcessSessionFactory from the TestRunner and then call
createProcessSession().


On Mon, Aug 31, 2020 at 9:01 AM Russell Bateman 
wrote:


In my JUnit testing of a custom processor, I need to queue up at least
two flowfiles. I see that there is an implementation of
TestRunner.enqueue() that takes *a list of flowfiles*, but I'm used to
using the implementation of this method that creates me a flowfile from
bytes or a stream. I do not know how to create a flowfile from scratch
inside test code. MockFlowFile's two constructors are no help. Getting
there via the ProcessSession interface seems a long road to travel just for
this.

Examples using TestRunner.enqueue( FlowFile ... flowfiles ) do not
abound out there in Googleland. Has someone else done this?

Thanks.
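
The resolution, as a runnable sketch (MyProcessor and its SUCCESS
relationship are placeholders; the enqueue() overloads taking a String, with
or without an attribute map, exist in nifi-mock):

    final TestRunner runner = TestRunners.newTestRunner( MyProcessor.class );

    // each call queues one flowfile; content and attributes are illustrative...
    runner.enqueue( "first flowfile" );
    runner.enqueue( "second flowfile", Collections.singletonMap( "doc.id", "42" ) );

    runner.run( 2 );   // trigger the processor once per queued flowfile
    runner.assertAllFlowFilesTransferred( MyProcessor.SUCCESS, 2 );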





TestRunner: enqueueing multiple flowfiles

2020-08-31 Thread Russell Bateman
In my JUnit testing of a custom processor, I need to queue up at least 
two flowfiles. I see that there is an implementation of 
TestRunner.enqueue() that takes *a list of flowfiles*, but I'm used to 
using the implementation of this method that creates me a flowfile from 
bytes or a stream. I do not know how to create a flowfile from scratch 
inside test code. MockFlowFile's two constructors are no help. Getting 
there via the ProcessSession interface seems a long road to travel just for 
this.


Examples using TestRunner.enqueue( FlowFile ... flowfiles ) do not 
abound out there in Googleland. Has someone else done this?


Thanks.


Re: From one flowfile to two...

2020-08-27 Thread Russell Bateman

In case anyone cares,
https://www.javahotchocolate.com/notes/nifi-custom.html#two-split-from-one

On 8/27/20 11:15 AM, Andy LoPresto wrote:

Russell,

Glad you found a working solution. Maybe it would be better for you to write up 
your findings and share them with a broader audience. I have often seen the 
best explanations are written by people who were recently in the “how do I do 
X?” state, as they are closest to the problem and can walk through their 
process of gathering understanding. Someone who works on these methods day in 
and day out may not write for the appropriate audience or explain the 
experience as well.

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
He/Him
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69


On Aug 27, 2020, at 10:10 AM, Russell Bateman  wrote:

I needed to get back here...

I took this advice to heart and finished my processor. Thanks to Matt and Mark 
for all their suggestions! They cleared up a few things. There was one bug in 
the code that was mine, small, but significant in its effect on the rest. That 
mistake also explained why I thought the uuid was identical between at least two 
of the cloned flowfiles. What I would wish for, and am probably not strong 
enough to write, would be a synthesis of the session methods read() and write() 
and how best to use them (one-to-one, one-to-many, etc.). Javadoc is too 
paratactic by nature, the NiFi Developer's Guide almost silent on these 
methods. If it were not for the many existing examples using these methods, it 
would be hard to learn to do even simple things. I did look for something 
closer to what I needed to do, but unsuccessfully.

Thanks again. If anything, the NiFi mailing lists are a place both for great 
information and being treated well.

Russ

On 8/25/20 12:24 PM, Mark Payne wrote:

Russ,

Several comments here. I’ve included them inline, below.

Hope it’s helpful.

Thanks
-Mark



On Aug 25, 2020, at 2:09 PM, Russell Bateman  wrote:

Thanks for your suggestions, Matt.

I decided to keep the original flowfile only upon failure. So, I have the 
embedded-document file and the serialized POJOs created from processing the non 
embedded-document part as the result if successful. (Condensed code at end...)

Now I have three questions...

1. I seem not to have placated NiFi with the assurance that I have transferred 
or disposed of all three flowfiles suitably. I get:

java.lang.AssertionError: 
org.apache.nifi.processor.exception.FlowFileHandlingException: Cannot commit 
session because the following FlowFiles have not been removed or transferred: 
[2]

This is probably because at the end of the block, you catch Exception and then 
route the original FlowFile to failure. But you’ve already cloned it and didn’t 
deal with the clone.

*Which of the three flowfiles does [2] refer to? Or does it just mean I botched 
two flowfiles?*

2. session.clone() generates a new flowfile with the identical uuid. I don't 
think I want the result to be two flowfiles with the same uuid. I am binding 
them together so I can associate them later using attribute embedded-document. 
*Should I/How do I force cloning to acquire new **uuid**s?*

This appears to actually be a bug in the mock framework. It *should* have a 
unique uuid, and would in a running NiFi instance. Feel free to file a Jira for 
that.

3. A question on theory... *Wouldn't all of this cloning be expensive* and I 
should just clone for one of the new files and then mangle the original 
flowfile to become the other?

session.clone() is not particularly expensive. It’s just creating a new 
FlowFile object. It doesn’t clone the FlowFile’s contents.

That said, it is probably more appropriate to call session.create(flowFile), 
rather than session.clone(flowFile). It makes little difference in practice but 
what you’re really doing is forking a child, and that will come across more 
cleanly in the Provenance lineage that is generated if using 
session.create(flowFile).

Additional comments in code below.



Thanks,
Russ


@Override
public void onTrigger( final ProcessContext context, final ProcessSession 
session ) throws ProcessException
{
   FlowFile flowfile = session.get();

   if( flowfile == null )
   {
 context.yield();

No need to yield here. Let the framework handle the scheduling. 
ProcessContext.yield() is meant for cases where you’re communicating with some 
external service, for instance, and you know the service is unavailable or rate 
limiting you or something like that. You can’t make any progress, so tell NiFi 
to not bother wasting CPU cycles with this Processor.

 return;
   }

   try
   {
 final String UUID = flowfile.getAttribute( NiFiUtilities.UUID );

 FlowFile document = session.clone( flowfile );

*// excerpt and write the embedded document to a new flowfile...*
 session.write( document, new OutputStreamCallback()
 {
   @Override public void process( OutputStream outputStream

Re: From one flowfile to two...

2020-08-27 Thread Russell Bateman
I will be sure to do that. I keep several pages of NiFi notes on my 
website (javahotchocolate.com). The notes are mostly for me to 
re-consult, but I have tutorials about writing custom processors and the 
like. I'll be putting out a skeletal copy of my recent code soon.


Thanks!

On 8/27/20 11:15 AM, Andy LoPresto wrote:

Russell,

Glad you found a working solution. Maybe it would be better for you to write up 
your findings and share them with a broader audience. I have often seen the 
best explanations are written by people who were recently in the “how do I do 
X?” state, as they are closest to the problem and can walk through their 
process of gathering understanding. Someone who works on these methods day in 
and day out may not write for the appropriate audience or explain the 
experience as well.

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
He/Him
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69


On Aug 27, 2020, at 10:10 AM, Russell Bateman  wrote:

I needed to get back here...

I took this advice to heart and finished my processor. Thanks to Matt and Mark 
for all their suggestions! They cleared up a few things. There was one bug in 
the code that was mine, small, but significant in its effect on the rest. That 
mistake also explained why I thought the uuid was identical between at least two 
of the cloned flowfiles. What I would wish for, and am probably not strong 
enough to write, would be a synthesis of the session methods read() and write() 
and how best to use them (one-to-one, one-to-many, etc.). Javadoc is too 
paratactic by nature, the NiFi Developer's Guide almost silent on these 
methods. If it were not for the many existing examples using these methods, it 
would be hard to learn to do even simple things. I did look for something 
closer to what I needed to do, but unsuccessfully.

Thanks again. If anything, the NiFi mailing lists are a place both for great 
information and being treated well.

Russ

On 8/25/20 12:24 PM, Mark Payne wrote:

Russ,

Several comments here. I’ve included them inline, below.

Hope it’s helpful.

Thanks
-Mark



On Aug 25, 2020, at 2:09 PM, Russell Bateman  wrote:

Thanks for your suggestions, Matt.

I decided to keep the original flowfile only upon failure. So, I have the 
embedded-document file and the serialized POJOs created from processing the non 
embedded-document part as the result if successful. (Condensed code at end...)

Now I have three questions...

1. I seem not to have placated NiFi with the assurance that I have transferred 
or disposed of all three flowfiles suitably. I get:

java.lang.AssertionError: 
org.apache.nifi.processor.exception.FlowFileHandlingException: Cannot commit 
session because the following FlowFiles have not been removed or transferred: 
[2]

This is probably because at the end of the block, you catch Exception and then 
route the original FlowFile to failure. But you’ve already cloned it and didn’t 
deal with the clone.

*Which of the three flowfiles does [2] refer to? Or does it just mean I botched 
two flowfiles?*

2. session.clone() generates a new flowfile with the identical uuid. I don't 
think I want the result to be two flowfiles with the same uuid. I am binding 
them together so I can associate them later using attribute embedded-document. 
*Should I/How do I force cloning to acquire new **uuid**s?*

This appears to actually be a bug in the mock framework. It *should* have a 
unique uuid, and would in a running NiFi instance. Feel free to file a Jira for 
that.

3. A question on theory... *Wouldn't all of this cloning be expensive* and I 
should just clone for one of the new files and then mangle the original 
flowfile to become the other?

session.clone() is not particularly expensive. It’s just creating a new 
FlowFile object. It doesn’t clone the FlowFile’s contents.

That said, it is probably more appropriate to call session.create(flowFile), 
rather than session.clone(flowFile). It makes little difference in practice but 
what you’re really doing is forking a child, and that will come across more 
cleanly in the Provenance lineage that is generated if using 
session.create(flowFile).

Additional comments in code below.



Thanks,
Russ


@Override
public void onTrigger( final ProcessContext context, final ProcessSession 
session ) throws ProcessException
{
   FlowFile flowfile = session.get();

   if( flowfile == null )
   {
 context.yield();

No need to yield here. Let the framework handle the scheduling. 
ProcessContext.yield() is meant for cases where you’re communicating with some 
external service, for instance, and you know the service is unavailable or rate 
limiting you or something like that. You can’t make any progress, so tell NiFi 
to not bother wasting CPU cycles with this Processor.

 return;
   }

   try
   {
 final String UUID = flowfile.getAttribute( NiFiUtilities.UUID );

 FlowFile document = session.clone( flowfile );

*// excerpt

Re: From one flowfile to two...

2020-08-27 Thread Russell Bateman

I needed to get back here...

I took this advice to heart and finished my processor. Thanks to Matt 
and Mark for all their suggestions! They cleared up a few things. There 
was one bug in the code that was mine, small, but significant in its 
effect on the rest. That mistake also explained why I thought the 
uuid was identical between at least two of the cloned flowfiles. What I 
would wish for, and am probably not strong enough to write, would be a 
synthesis of the session methods read() and write() and how best to use 
them (one-to-one, one-to-many, etc.). Javadoc is too paratactic by 
nature, the NiFi Developer's Guide almost silent on these methods. If it 
were not for the many existing examples using these methods, it would be 
hard to learn to do even simple things. I did look for something closer 
to what I needed to do, but unsuccessfully.


Thanks again. If anything, the NiFi mailing lists are a place both for 
great information and being treated well.


Russ

On 8/25/20 12:24 PM, Mark Payne wrote:

Russ,

Several comments here. I’ve included them inline, below.

Hope it’s helpful.

Thanks
-Mark



On Aug 25, 2020, at 2:09 PM, Russell Bateman  wrote:

Thanks for your suggestions, Matt.

I decided to keep the original flowfile only upon failure. So, I have the 
embedded-document file and the serialized POJOs created from processing the non 
embedded-document part as the result if successful. (Condensed code at end...)

Now I have three questions...

1. I seem not to have placated NiFi with the assurance that I have transferred 
or disposed of all three flowfiles suitably. I get:

java.lang.AssertionError: 
org.apache.nifi.processor.exception.FlowFileHandlingException: Cannot commit 
session because the following FlowFiles have not been removed or transferred: 
[2]

This is probably because at the end of the block, you catch Exception and then 
route the original FlowFile to failure. But you’ve already cloned it and didn’t 
deal with the clone.

*Which of the three flowfiles does [2] refer to? Or does it just mean I botched 
two flowfiles?*

2. session.clone() generates a new flowfile with the identical uuid. I don't 
think I want the result to be two flowfiles with the same uuid. I am binding 
them together so I can associate them later using attribute embedded-document. 
*Should I/How do I force cloning to acquire new **uuid**s?*

This appears to actually be a bug in the mock framework. It *should* have a 
unique uuid, and would in a running NiFi instance. Feel free to file a Jira for 
that.

3. A question on theory... *Wouldn't all of this cloning be expensive* and I 
should just clone for one of the new files and then mangle the original 
flowfile to become the other?

session.clone() is not particularly expensive. It’s just creating a new 
FlowFile object. It doesn’t clone the FlowFile’s contents.

That said, it is probably more appropriate to call session.create(flowFile), 
rather than session.clone(flowFile). It makes little difference in practice but 
what you’re really doing is forking a child, and that will come across more 
cleanly in the Provenance lineage that is generated if using 
session.create(flowFile).

Additional comments in code below.



Thanks,
Russ


@Override
public void onTrigger( final ProcessContext context, final ProcessSession 
session ) throws ProcessException
{
   FlowFile flowfile = session.get();

   if( flowfile == null )
   {
 context.yield();

No need to yield here. Let the framework handle the scheduling. 
ProcessContext.yield() is meant for cases where you’re communicating with some 
external service, for instance, and you know the service is unavailable or rate 
limiting you or something like that. You can’t make any progress, so tell NiFi 
to not bother wasting CPU cycles with this Processor.

 return;
   }

   try
   {
 final String UUID = flowfile.getAttribute( NiFiUtilities.UUID );

 FlowFile document = session.clone( flowfile );

*// excerpt and write the embedded document to a new flowfile...*
 session.write( document, new OutputStreamCallback()
 {
   @Override public void process( OutputStream outputStream )
   {
 // read from the original flowfile copying to the output flowfile...
 session.read( flowfile, new InputStreamCallback()
 {
   @Override public void process( InputStream inputStream ) throws 
IOException
   {
...
   }
 } );
   }
 } );

 FlowFile concepts = session.clone( flowfile );

 AtomicReference< ConceptList > conceptListHolder = new AtomicReference<>();

*// parse the concepts into a POJO list...*
 session.read( concepts, new InputStreamCallback()
 {
   final ConceptList conceptList = conceptListHolder.get();

   @Override public void process( InputStream inputStream ) throws 
IOException
   {
 ...
   }
 } );

*// write out the concept POJOs serialized...*

Re: From one flowfile to two...

2020-08-25 Thread Russell Bateman

Thanks for your suggestions, Matt.

I decided to keep the original flowfile only upon failure. So, I have 
the embedded-document file and the serialized POJOs created from 
processing the non embedded-document part as the result if successful. 
(Condensed code at end...)


Now I have three questions...

1. I seem not to have placated NiFi with the assurance that I have 
transferred or disposed of all three flowfiles suitably. I get:


java.lang.AssertionError: 
org.apache.nifi.processor.exception.FlowFileHandlingException: Cannot 
commit session because the following FlowFiles have not been removed or 
transferred: [2]


*Which of the three flowfiles does [2] refer to? Or does it just mean I 
botched two flowfiles?*


2. session.clone() generates a new flowfile with the identical uuid. I 
don't think I want the result to be two flowfiles with the same uuid. I 
am binding them together so I can associate them later using attribute 
embedded-document. *Should I/How do I force cloning to acquire new 
**uuid**s?*


3. A question on theory... *Wouldn't all of this cloning be expensive* 
and I should just clone for one of the new files and then mangle the 
original flowfile to become the other?


Thanks,
Russ


@Override
public void onTrigger( final ProcessContext context, final 
ProcessSession session ) throws ProcessException

{
  FlowFile flowfile = session.get();

  if( flowfile == null )
  {
    context.yield();
    return;
  }

  try
  {
    final String UUID = flowfile.getAttribute( NiFiUtilities.UUID );

    FlowFile document = session.clone( flowfile );

*    // excerpt and write the embedded document to a new flowfile...*
    session.write( document, new OutputStreamCallback()
    {
  @Override public void process( OutputStream outputStream )
  {
    // read from the original flowfile copying to the output 
flowfile...

    session.read( flowfile, new InputStreamCallback()
    {
  @Override public void process( InputStream inputStream ) 
throws IOException

  {
   ...
  }
    } );
  }
    } );

    FlowFile concepts = session.clone( flowfile );

    AtomicReference< ConceptList > conceptListHolder = new 
AtomicReference<>();


*    // parse the concepts into a POJO list...*
    session.read( concepts, new InputStreamCallback()
    {
  final ConceptList conceptList = conceptListHolder.get();

  @Override public void process( InputStream inputStream ) throws 
IOException

  {
    ...
  }
    } );

*    // write out the concept POJOs serialized...*
    session.write( concepts, new OutputStreamCallback()
    {
  @Override public void process( OutputStream outputStream )
  {
    ...
  }
    } );

    document = session.putAttribute( document, "embedded-document", UUID );
    concepts = session.putAttribute( concepts, "embedded-document", UUID );
    session.transfer( document, DOCUMENT );
    session.transfer( concepts, CONCEPTS );
    session.remove( flowfile );
  }
  catch( Exception e )
  {
    session.transfer( flowfile, FAILURE );
  }
}

On 8/24/20 4:52 PM, Matt Burgess wrote:

Russell,

session.read() won't overwrite any contents of the incoming flow file,
but write() will. For #2, are you doing any processing on the file? If
not, wouldn't that be the original flowfile anyway? Or do you want it
to be a different flowfile on purpose (so you can send the incoming
flowfile to a different relationship)? You can use session.clone() to
create a new flowfile that has the same content and attributes from
the incoming flowfile, then handle that separately from the incoming
(original) flowfile. For #1, you could clone() the original flowfile
and do the read/process/write as part of a session.write(FlowFile,
StreamCallback) call, then you're technically reading the "new" file
content (which is the same of course) and overwriting it on the way
out.

Regards,
Matt

On Mon, Aug 24, 2020 at 6:37 PM Russell Bateman  wrote:

I am writing a custom processor that, upon processing a flowfile,
results  in two new flowfiles (neither keeping the exact, original
content) out two different relationships. I might like to route the
original flowfile to a separate relationship.

FlowFile original = session.get();

Do I need to call session.create() for the two new files?

  1. session.read()of original file's contents, not all of the way
 through, but send the processed output from what I do read as
 flowfile 1.
  2. session.read()of original file's contents and send resulting output
 as flowfile 2.
  3. session.transfer()of original flowfile.

I look at all of these session.read() and session.write() calls and I'm a
bit confused as to which to use that won't lose the original flowfile's
content after #1 so I can start over again in #2.

Thanks.




From one flowfile to two...

2020-08-24 Thread Russell Bateman
I am writing a custom processor that, upon processing a flowfile, 
results in two new flowfiles (neither keeping the exact, original 
content) out of two different relationships. I might like to route the 
original flowfile to a separate relationship.


FlowFile original = session.get();

Do I need to call session.create() for the two new files?

1. session.read()of original file's contents, not all of the way
   through, but send the processed output from what I do read as
   flowfile 1.
2. session.read()of original file's contents and send resulting output
   as flowfile 2.
3. session.transfer()of original flowfile.

I look at all of these session.read() and session.write() calls and I'm a 
bit confused as to which to use that won't lose the original flowfile's 
content after #1 so I can start over again in #2.


Thanks.
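
Pulling together Matt's and Mark's advice from the replies above, the working
shape is roughly this (a sketch, not the finished processor; DOCUMENT and
CONCEPTS are this processor's relationships):

    FlowFile flowfile = session.get();
    if( flowfile == null )
      return;

    // fork two children: session.create( parent ) yields cleaner provenance than clone()...
    FlowFile document = session.create( flowfile );
    FlowFile concepts = session.create( flowfile );

    // write() returns the latest version of the flowfile--keep it!
    document = session.write( document, outputStream -> { /* excerpt #1 here */ } );
    concepts = session.write( concepts, outputStream -> { /* serialized POJOs here */ } );

    session.transfer( document, DOCUMENT );
    session.transfer( concepts, CONCEPTS );
    session.remove( flowfile );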


Re: Suggestions for splitting, then reassembling documents

2020-08-21 Thread Russell Bateman
Hey, thanks, Jason. I will give this approach a try. I confess I had not 
even thought of /Wait/ for that. Thanks too for pointing out side effects.


Russ

On 8/21/20 5:46 AM, Sherman, Jason wrote:

This sounds like a good use case for wait/notify, which I've used
successfully multiple times.  Once the document is split, the original
document part would sit at the wait processor until a notify processor
signals the completion of the flow.  I would first try using the original
files UUID for the wait/notify signal.

Also, you can set attributes on the notify processor that can get added to
the document at the wait processor. Then, use that added information to
build whatever output you need with the original document.

However, with such large documents, this will likely slow down the
processing for the XML portion, depending on how much processing they have
to go through.

Cheers,
Jason
--
Jason C. Sherman, CSSLP, CISSP
Owner
Logical Software Solutions, LLC
Solid. Secure. Software.

http://logicalsoftware.co/
.co?  Yes, your data isn't always what you expect.  We'll make sense of it.

https://www.linkedin.com/in/lss-js/


On Tue, Aug 18, 2020 at 12:38 PM Russell Bateman 
wrote:


I am writing custom processors that juggle medical documents (in a more
or less proprietary format). The documents are always XML and contain 
two major parts:

  1. an original document which may be text, HL7v2 or XML and may contain
 HTML between  ... , could be many megabytes in
size
  2. XML structure representing data extracted from (a) in myriad XML
 elements, rarely more than a few hundred kilobytes in size

I am using XStream to serialize #2 after I've parsed it into POJOs for
later use. It's too over the top to base-64 encode #1 to survive
serialization by XStream, and it buys me nothing except the convenience of
making #1 conceptually identical to #2. Since I don't need to dig down
into #1 or analyze it, and it's so big, processing it at all is costly
and undesirable.

What I thought I'd investigate is the possibility of splitting
 ... (#1) into a separate flowfile to be
reassembled by a later processor, but with (literally) millions of these
files flowing through NiFi, I wonder about the advisability of splitting
them up then hoping I can unite the correct parts and how to accomplish
that (discrete attribute ids on constituent parts?).

The reassembly would involve deserializing #2, working with that data to
generate a new document (HL7v4/FHIR, other formats) along with
reinserting #1.

Yes, I have examined /SplitXml/ and /SplitContent/, but I need to do
much more than just split the flowfile at the time I have it in my
hands, hence, a custom processor. Similarly, /MergeContent/ will not be
helpful for reassembly.

So, specifically, I can easily generate a flowfile attribute, an id that
discretely identifies these two now separate documents as suitable to
weld back together. However, I have not yet experimented with flowfiles
randomly (?) showing up together later in the flow within easy reach of
one processor for reassembly. Obviously, /Split/- and /MergeContent/
must be in the habit of dealing with this situation, but I have no
experience with them outside my primitive imagination.

I'm asking for suggestions, best practice, gotchas, warnings or any
other thoughts.

Russ
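
For readers wanting to try Jason's suggestion, the usual wiring looks roughly
like this (a sketch; property names from memory of the 1.x Wait/Notify pair
-- verify against the documentation):

    splitter --(original part)--> Wait --(success)--> reassembly processor
    splitter --(XML part)--> ...processing... --> Notify

    Wait   : Release Signal Identifier = the original flowfile's uuid attribute
    Notify : Release Signal Identifier = the same id, carried on the XML part
    Both reference the same DistributedMapCacheClientService.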







Suggestions for splitting, then reassembling documents

2020-08-18 Thread Russell Bateman
I am writing custom processors that juggle medical documents (in a more 
or less proprietary format). The documents are always XML and contain 
two major parts:


1. an original document which may be text, HL7v2 or XML and may contain
   HTML between  ... , could be many megabytes in size
2. XML structure representing data extracted from (a) in myriad XML
   elements, rarely more than a few hundred kilobytes in size

I am using XStream to serialize #2 after I've parsed it into POJOs for 
later use. It's too over the top to base-64 encode #1 to survive 
serialization by XStream, and it buys me nothing except the convenience of 
making #1 conceptually identical to #2. Since I don't need to dig down 
into #1 or analyze it, and it's so big, processing it at all is costly 
and undesirable.


What I thought I'd investigate is the possibility of splitting 
 ... (#1) into a separate flowfile to be 
reassembled by a later processor, but with (literally) millions of these 
files flowing through NiFi, I wonder about the advisability of splitting 
them up then hoping I can unite the correct parts and how to accomplish 
that (discrete attribute ids on constituent parts?).


The reassembly would involve deserializing #2, working with that data to 
generate a new document (HL7v4/FHIR, other formats) along with 
reinserting #1.


Yes, I have examined /SplitXml/ and /SplitContent/, but I need to do 
much more than just split the flowfile at the time I have it in my 
hands, hence, a custom processor. Similarly, /MergeContent/ will not be 
helpful for reassembly.


So, specifically, I can easily generate a flowfile attribute, an id that 
discretely identifies these two now separate documents as suitable to 
weld back together. However, I have not yet experimented with flowfiles 
randomly (?) showing up together later in the flow within easy reach of 
one processor for reassembly. Obviously, /Split/- and /MergeContent/ 
must be in the habit of dealing with this situation, but I have no 
experience with them outside my primitive imagination.


I'm asking for suggestions, best practice, gotchas, warnings or any 
other thoughts.


Russ
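
One way to make reuniting tractable is to stamp every child with a shared
correlation attribute at split time. A minimal sketch, assuming the split points
are already in hand and the incoming flowfile is held as flowFile inside
onTrigger; the fragment.* attribute names follow the convention NiFi's own split
processors write, but a custom reassembly processor can match on any agreed
attribute (CoreAttributes is org.apache.nifi.flowfile.attributes.CoreAttributes):

  // stamp both children with a shared id so a downstream processor can pair them up
  final String correlationId = flowFile.getAttribute( CoreAttributes.UUID.key() );

  FlowFile originalPart = session.create( flowFile );   // a child inherits the parent's attributes
  originalPart  = session.putAttribute( originalPart,  "fragment.identifier", correlationId );
  originalPart  = session.putAttribute( originalPart,  "fragment.index", "0" );
  originalPart  = session.putAttribute( originalPart,  "fragment.count", "2" );

  FlowFile extractedPart = session.create( flowFile );
  extractedPart = session.putAttribute( extractedPart, "fragment.identifier", correlationId );
  extractedPart = session.putAttribute( extractedPart, "fragment.index", "1" );
  extractedPart = session.putAttribute( extractedPart, "fragment.count", "2" );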




Live development of custom processors and JAVA_HOME--confirm best practice

2020-08-13 Thread Russell Bateman
When installing NiFi in production, Ansible can be used to set up 
JAVA_HOME. There is zero problem for users of NiFi.


However, from a development host, given the now rapid cadence of Java 
releases, we sometimes run into problems launching a private 
installation of NiFi in the course of testing or debugging our custom 
processors because tools used minute-to-minute and all day long like 
IntelliJ IDEA (and others) march on requiring later and later Java versions.


I found an old JIRA issue that suggests a solution for NiFi 0.1.0, that 
of working around the problem by setting java=  in /conf/bootstrap.conf/ 
to point to a valid Java 1.8 JRE/JDK. This sounds good to me, but the 
version is very old.


Is this still best practice?
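
For reference, the java= property is still present in /conf/bootstrap.conf/ at
least through the 1.x line, and pointing it at an absolute path overrides
whatever JAVA_HOME the launching shell happens to provide (the path below is
only an example):

  # conf/bootstrap.conf
  # Java command to use when running NiFi; the default is simply 'java'
  java=/usr/lib/jvm/java-8-openjdk/bin/java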



Re: Need to know if there is multiple template support

2020-08-06 Thread Russell Bateman
Since the advent of the NiFi Registry, templates are sort of deprecated. 
The Registry is a brilliant design far better able to support what you 
seem to be asking for.


By the way, this is much more a "user" question you might have asked in 
that forum rather than this "dev" forum.


Cheers!

On 8/6/20 2:44 AM, Rahul Vasan wrote:

I'm looking to create multiple templates for multiple clients, with different
processor flows for each of them. Is it possible?





Re: Failing to update custom processor properties names, displayNames, etc.

2020-07-18 Thread Russell Bateman

Andy,

You're right. It was a caching issue with Chrome. I don't know why I 
didn't think to try that--forest for the trees, I guess. Thank you.


Russ

On 7/17/20 5:58 PM, Andy LoPresto wrote:

Russell,

Have you verified this is not a browser caching issue? They are pernicious and 
it sounds like this could be an example. If you’re sure it’s not, verify the 
API calls using your browser’s Developer Tools to see what properties are 
actually being returned by the server when inspecting a component to see if the 
correct values are present. If so, it’s likely a UI bug, and if not, the new 
code is not being properly loaded/used by NiFi’s server application.


Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
He/Him
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69


On Jul 17, 2020, at 3:48 PM, Russell Bateman  wrote:

If I have changed a custom processor's PropertyDescriptor.name and/or 
.displayName, including changes I have made to my /additionalDetails.html/, and 
I have:

- removed that processor from my test flow or removed /flow.xml.gz/ altogether
- removed my NAR from /${NIFI_ROOT}/custom-lib/ and bounced NiFi with the new 
custom NAR copied

yet nothing changes (in usage or in additional details), what am I overlooking?

Let's set aside the vaguely confusing semantic distinction between name and 
displayName; I just want NiFi to forget my new custom processor completely and 
then accept my new version as if brand new, including all the changes I have 
made.

Thanks for any suggestions.






Failing to update custom processor properties names, displayNames, etc.

2020-07-17 Thread Russell Bateman
If I have changed a custom processor's PropertyDescriptor.name and/or 
.displayName, including changes I have made to my 
/additionalDetails.html/, and I have:


- removed that processor from my test flow or removed flow.xml.gz altogether
- removed my NAR from /${NIFI_ROOT}/custom-lib/ and bounced NiFi with 
the new custom NAR copied


yet nothing changes (in usage or in additional details), what am I 
overlooking?


Let's set aside the vaguely confusing semantic distinction between 
name and displayName; I just want NiFi to forget my new custom processor 
completely and then accept my new version as if brand new, including all 
the changes I have made.


Thanks for any suggestions.


Re: Derby as DBCP service, error from Kerberos?

2020-07-14 Thread Russell Bateman
Oopsie! Okay, too early for my eyes to pick up on that difference. 
Indeed, it works now. Thanks a million, Bryan!


On 7/14/20 8:25 AM, Bryan Bende wrote:

The one I referenced is actually "nifi-kerberos-credentials-service-api"
and you have "nifi-kerberos-credentials-service".

On Tue, Jul 14, 2020 at 10:24 AM Russell Bateman 
wrote:


Thanks for the responses. I did have this dependency already before
mailing to the forum:

  
  <properties>
    <nifi.version>1.11.0</nifi.version>
    ...
  </properties>
  <dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-kerberos-credentials-service</artifactId>
    <version>${nifi.version}</version>
  </dependency>

Other thoughts? I tried debugging through this as both test scope and no
specified scope. The result is the same.


On 7/14/20 8:12 AM, Matt Burgess wrote:

Don't forget to include that service with "test" scope so it doesn't
get included in the "real" bundle.

On Tue, Jul 14, 2020 at 9:49 AM Bryan Bende  wrote:

It looks like you are missing a dependency in your project...



https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-standard-services/nifi-dbcp-service-bundle/nifi-dbcp-service/pom.xml#L50-L55

On Mon, Jul 13, 2020 at 5:24 PM Russell Bateman 
wrote:


I'm trying to use Apache Derby as the DBCP controller in JUnit tests.
For the first test, I start off vetting my ability to inject Derby as
the DBCP controller I want to use. But, right off, I get this Kerberos
error. I wasn't trying to use Kerberos, but maybe I'm missing
configuration to tell that to DBCPConnectionPool?

public void test() throws Exception
{
 final DBCPConnectionPool service = new DBCPConnectionPool();

   *java.lang.NoClassDefFoundError: org/apache/nifi/kerberos/KerberosCredentialsService
     at org.apache.nifi.dbcp.DBCPConnectionPool.<init>(DBCPConnectionPool.java:243)
     at com.windofkeltia.processor.TestWithDerby.test(TestWithDerby.java:111)
     < 26 internal calls >
   Caused by: java.lang.ClassNotFoundException: org.apache.nifi.kerberos.KerberosCredentialsService
     < 2 internal calls >
     at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583)
     at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
     at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) ...*

 runner.addControllerService( "Derby service", service );
 runner.setProperty( service, DBCPConnectionPool.DATABASE_URL,  "jdbc:derby:memory:sampledb;create=true" );
 runner.setProperty( service, DBCPConnectionPool.DB_USER,       "sa" );
 runner.setProperty( service, DBCPConnectionPool.DB_PASSWORD,   "sa" );
 runner.setProperty( service, DBCPConnectionPool.DB_DRIVERNAME, "org.apache.derby.jdbc.EmbeddedDriver" );
 runner.enableControllerService( service );
 runner.assertValid( service );

 final DBCPService derbyService = ( DBCPService ) runner.getProcessContext()
                                                        .getControllerServiceLookup()
                                                        .getControllerService( "Derby service" );

 // get and verify connections to Derby...
 for( int count = 0; count < 10; count++ )
 {
   final Connection connection = service.getConnection();
   if( VERBOSE )
 System.out.println( connection );
   assertNotNull( connection );
   assertValidConnectionDerby( connection, count );
 }

 final Map< String, String > properties = new HashMap<>();
 runner.setProperty( TestWithDerby.DBCP_SERVICE, "Derby service" );
 runner.setIncomingConnection( false );
 runner.setIncomingConnection( false );
 runner.run();
}
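
For anyone landing on this thread with the same NoClassDefFoundError: per Bryan's
and Matt's notes above, the POM change that resolved it looks roughly like this
(version taken from the poster's own property; test scope keeps the service API
out of the real bundle):

  <dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-kerberos-credentials-service-api</artifactId>
    <version>${nifi.version}</version>
    <scope>test</scope>
  </dependency>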









Re: Derby as DBCP service, error from Kerberos?

2020-07-14 Thread Russell Bateman
Thanks for the responses. I did have this dependency already before 
mailing to the forum:


    
    <properties>
      <nifi.version>1.11.0</nifi.version>
      ...
    </properties>
    <dependency>
      <groupId>org.apache.nifi</groupId>
      <artifactId>*nifi-kerberos-credentials-service*</artifactId>
      <version>${nifi.version}</version>
    </dependency>

Other thoughts? I tried debugging through this as both test scope and no 
specified scope. The result is the same.



On 7/14/20 8:12 AM, Matt Burgess wrote:

Don't forget to include that service with "test" scope so it doesn't
get included in the "real" bundle.

On Tue, Jul 14, 2020 at 9:49 AM Bryan Bende  wrote:

It looks like you are missing a dependency in your project...

https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-standard-services/nifi-dbcp-service-bundle/nifi-dbcp-service/pom.xml#L50-L55

On Mon, Jul 13, 2020 at 5:24 PM Russell Bateman 
wrote:


I'm trying to use Apache Derby as the DBCP controller in JUnit tests.
For the first test, I start off vetting my ability to inject Derby as
the DBCP controller I want to use. But, right off, I get this Kerberos
error. I wasn't trying to use Kerberos, but maybe I'm missing
configuration to tell that to DBCPConnectionPool?

public void test() throws Exception
{
final DBCPConnectionPool service = new DBCPConnectionPool();

  *java.lang.NoClassDefFoundError: org/apache/nifi/kerberos/KerberosCredentialsService
    at org.apache.nifi.dbcp.DBCPConnectionPool.<init>(DBCPConnectionPool.java:243)
    at com.windofkeltia.processor.TestWithDerby.test(TestWithDerby.java:111)
    < 26 internal calls >
  Caused by: java.lang.ClassNotFoundException: org.apache.nifi.kerberos.KerberosCredentialsService
    < 2 internal calls >
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) ...*

runner.addControllerService( "Derby service", service );
runner.setProperty( service, DBCPConnectionPool.DATABASE_URL,
"jdbc:derby:memory:sampledb;create=true" );
runner.setProperty( service, DBCPConnectionPool.DB_USER,   "sa" );
runner.setProperty( service, DBCPConnectionPool.DB_PASSWORD,   "sa" );
runner.setProperty( service, DBCPConnectionPool.DB_DRIVERNAME,
"org.apache.derby.jdbc.EmbeddedDriver" );
runner.enableControllerService( service );
runner.assertValid( service );

final DBCPService derbyService = ( DBCPService ) runner.getProcessContext()
                                                       .getControllerServiceLookup()
                                                       .getControllerService( "Derby service" );

// get and verify connections to Derby...
for( int count = 0; count < 10; count++ )
{
  final Connection connection = service.getConnection();
  if( VERBOSE )
System.out.println( connection );
  assertNotNull( connection );
  assertValidConnectionDerby( connection, count );
}

final Map< String, String > properties = new HashMap<>();
runner.setProperty( TestWithDerby.DBCP_SERVICE, "Derby service" );
runner.setIncomingConnection( false );
runner.setIncomingConnection( false );
runner.run();
}







Derby as DBCP service, error from Kerberos?

2020-07-13 Thread Russell Bateman
I'm trying to use Apache Derby as the DBCP controller in JUnit tests. 
For the first test, I start off vetting my ability to inject Derby as 
the DBCP controller I want to use. But, right off, I get this Kerberos 
error. I wasn't trying to use Kerberos, but maybe I'm missing 
configuration to tell that to DBCPConnectionPool?


public void test() throws Exception
{
  final DBCPConnectionPool service = new DBCPConnectionPool();

*java.lang.NoClassDefFoundError: org/apache/nifi/kerberos/KerberosCredentialsService
  at org.apache.nifi.dbcp.DBCPConnectionPool.<init>(DBCPConnectionPool.java:243)
  at com.windofkeltia.processor.TestWithDerby.test(TestWithDerby.java:111)
  < 26 internal calls >
Caused by: java.lang.ClassNotFoundException: org.apache.nifi.kerberos.KerberosCredentialsService
  < 2 internal calls >
  at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583)
  at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
  at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) ...*


  runner.addControllerService( "Derby service", service );
  runner.setProperty( service, DBCPConnectionPool.DATABASE_URL,  
"jdbc:derby:memory:sampledb;create=true" );
  runner.setProperty( service, DBCPConnectionPool.DB_USER,   "sa" );
  runner.setProperty( service, DBCPConnectionPool.DB_PASSWORD,   "sa" );
  runner.setProperty( service, DBCPConnectionPool.DB_DRIVERNAME, 
"org.apache.derby.jdbc.EmbeddedDriver" );
  runner.enableControllerService( service );
  runner.assertValid( service );

  final DBCPService derbyService = ( DBCPService ) runner.getProcessContext()
                                                         .getControllerServiceLookup()
                                                         .getControllerService( "Derby service" );

  // get and verify connections to Derby...
  for( int count = 0; count < 10; count++ )
  {
final Connection connection = service.getConnection();
if( VERBOSE )
  System.out.println( connection );
assertNotNull( connection );
assertValidConnectionDerby( connection, count );
  }

  final Map< String, String > properties = new HashMap<>();
  runner.setProperty( TestWithDerby.DBCP_SERVICE, "Derby service" );
  runner.setIncomingConnection( false );
  runner.setIncomingConnection( false );
  runner.run();
}




Re: Difficulty using DBCPService

2020-06-25 Thread Russell Bateman

Great. Thanks, Matt. And it is working fine now.

On 6/25/20 10:44 AM, Matt Burgess wrote:

Russell,

Sorry I lost track of this. If a version of a dependency is not
provided, then Maven will go up to the parent POM and look for a
version for it, and so on until it is found. If a version for the
dependency has not been declared in any parent, you'll get that error
and need to set the version explicitly in your own POM.

For nifi-dbcp-service-nar, that contains the actual DBCPConnectionPool
implementation, but your code should (and is) using the interface it
implements, namely DBCPService. That interface is in
nifi-standard-services-api-nar, which is why you need it as a parent
NAR to your custom NAR.

Regards,
Matt

On Mon, Jun 22, 2020 at 7:52 PM Russell Bateman  wrote:

Putting this dependency in the root, with a version, worked. My processor is 
loading.

Have you the time to tell me how building with this Maven dependency forces the 
behavior I need? (Or what behavior this is?)

I thought that nifi-dbcp-service-nar.nar was already loading before when I 
tried and looked. I was already checking on that. Maybe I was fooling myself.

Thanks, Matt

On 6/22/20 5:43 PM, Matt Burgess wrote:

You’ll want to add a <version>1.12.0-SNAPSHOT</version> (or a released version of 
NiFi) before the ending </dependency> tag

Sent from my iPhone

On Jun 22, 2020, at 7:33 PM, Russell Bateman  wrote:

 Thank you for replying. You speak of the following dependency:


<dependency>
   <groupId>org.apache.nifi</groupId>
   <artifactId>nifi-standard-services-api-nar</artifactId>
   <type>nar</type>
</dependency>



I stumbled upon that, but could not make it work. My multimodule project 
structure appears thus:

-root
   - other submodules (whose custom processors work)
   - jdbc
 pom.xml
   - nar
 pom.xml
   pom.xml

No matter which pom.xml above I put that dependency into, I get the following 
complaint out of Maven:

[ERROR]   The project com.imatsolutions.pipeline:jdbc:4.0.0 
(/home/russ/sandboxes/imat-pipeline.master.dev/code/imat-pipeline/jdbc/pom.xml) 
has 1 error
   or
[ERROR]   The project com.imatsolutions.pipeline:imat-pipeline:4.0.0 
(/home/russ/sandboxes/imat-pipeline.master.dev/code/imat-pipeline/nar/pom.xml) 
has 1 error
   or
[ERROR]   The project com.imatsolutions.pipeline:imat-pipeline-parent:4.0.0 
(/home/russ/sandboxes/imat-pipeline.master.dev/code/imat-pipeline/pom.xml) has 
1 error
[ERROR] 'dependencies.dependency.version' for 
org.apache.nifi:nifi-standard-services-api-nar:nar is missing. @ line 136, 
column 17

On 6/22/20 4:38 PM, Matt Burgess wrote:

Not at my keyboard but does your NAR have the nifi-standard-services-api-nar as 
a parent? That should be where DBCPService is defined

Sent from my iPhone

On Jun 22, 2020, at 6:30 PM, Russell Bateman  wrote:

I find myself obliged to pick back up a custom processor, written by someone 
else a few years ago (in the NiFi 0.7.x era)  at my company, that makes use of 
DBCPService. While I think I understand the nuances of interface versus 
concrete controller class, etc. I probably need a push out the door onto the 
road of understanding why NiFi is not starting (and it's my fault). My JUnit 
test code works fine; I've created myself a loading problem maybe because I'm 
not including the proper auxiliary NAR (in this new, reduced world)? Thanks!

2020-06-22 16:11:37,139 ERROR [main] org.apache.nifi.NiFi Failure to launch 
NiFi due to java.util.ServiceConfigurationError: 
org.apache.nifi.processor.Processor: Provider 
com.imatsolutions.processor.JdbcToAttributes could not be instantiated
java.util.ServiceConfigurationError: org.apache.nifi.processor.Processor: 
Provider com.imatsolutions.processor.JdbcToAttributes could not be instantiated
 at java.util.ServiceLoader.fail(ServiceLoader.java:232)
 at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
 at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
 at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
 at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
 at 
org.apache.nifi.nar.StandardExtensionDiscoveringManager.loadExtensions(StandardExtensionDiscoveringManager.java:156)
 at 
org.apache.nifi.nar.StandardExtensionDiscoveringManager.discoverExtensions(StandardExtensionDiscoveringManager.java:131)
 at 
org.apache.nifi.nar.StandardExtensionDiscoveringManager.discoverExtensions(StandardExtensionDiscoveringManager.java:117)
 at org.apache.nifi.web.server.JettyServer.start(JettyServer.java:942)
 at org.apache.nifi.NiFi.<init>(NiFi.java:158)
 at org.apache.nifi.NiFi.<init>(NiFi.java:72)
 at org.apache.nifi.NiFi.main(NiFi.java:301)
*Caused by: java.lang.NoClassDefFoundError: org/apache/nifi/dbcp/DBCPService*
 at com.imatsolutions.processor.JdbcCommon.<clinit>(JdbcCommon.java:31)
 at com.imatsolutions.processor.AbstractJdbcTo.<init>(AbstractJdbcTo.java:290)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at 
sun.reflect.NativeConstructorAccessorImpl.newInstance
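
In short, the fix the thread converges on: declare the standard-services API NAR
as a dependency of the custom NAR, with an explicit version since no parent POM
supplies one. A sketch of the snippet, per Matt's messages:

  <dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-standard-services-api-nar</artifactId>
    <version>1.12.0-SNAPSHOT</version> <!-- or a released NiFi version -->
    <type>nar</type>
  </dependency>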

Re: Difficulty using DBCPService

2020-06-22 Thread Russell Bateman
Putting this dependency in the root, with a version, worked. My 
processor is loading.


Have you the time to tell me how building with this Maven dependency 
forces the behavior I need? (Or what behavior this is?)


I thought that /nifi-dbcp-service-nar.nar/ was already loading before 
when I tried and looked. I was already checking on that. Maybe I was 
fooling myself.


Thanks, Matt

On 6/22/20 5:43 PM, Matt Burgess wrote:
You’ll want to add a <version>1.12.0-SNAPSHOT</version> (or a released 
version of NiFi) before the ending </dependency> tag


Sent from my iPhone

On Jun 22, 2020, at 7:33 PM, Russell Bateman  
wrote:


 Thank you for replying. You speak of the following dependency:

<dependency>
   <groupId>org.apache.nifi</groupId>
   <artifactId>nifi-standard-services-api-nar</artifactId>
   <type>nar</type>
</dependency>


I stumbled upon that, but could not make it work. My multimodule 
project structure appears thus:


-root
  - other submodules (whose custom processors work)
  - jdbc
*pom.xml*
  - nar
*pom.xml*
*pom.xml*

No matter which /pom.xml/ above I put that dependency into, I get the 
following complaint out of Maven:


[ERROR]   The project com.imatsolutions.pipeline:jdbc:4.0.0 
(/home/russ/sandboxes/imat-pipeline.master.dev/code/imat-pipeline/jdbc/pom.xml) 
has 1 error

  or
[ERROR]   The project com.imatsolutions.pipeline:imat-pipeline:4.0.0 
(/home/russ/sandboxes/imat-pipeline.master.dev/code/imat-pipeline/nar/pom.xml) 
has 1 error

  or
[ERROR]   The project 
com.imatsolutions.pipeline:imat-pipeline-parent:4.0.0 
(/home/russ/sandboxes/imat-pipeline.master.dev/code/imat-pipeline/pom.xml) 
has 1 error
[ERROR] 'dependencies.dependency.version' for 
org.apache.nifi:nifi-standard-services-api-nar:nar is missing. @ line 
136, column 17


On 6/22/20 4:38 PM, Matt Burgess wrote:

Not at my keyboard but does your NAR have the nifi-standard-services-api-nar as 
a parent? That should be where DBCPService is defined

Sent from my iPhone


On Jun 22, 2020, at 6:30 PM, Russell Bateman  wrote:

I find myself obliged to pick back up a custom processor, written by someone 
else a few years ago (in the NiFi 0.7.x era)  at my company, that makes use of 
DBCPService. While I think I understand the nuances of interface versus 
concrete controller class, etc. I probably need a push out the door onto the 
road of understanding why NiFi is not starting (and it's my fault). My JUnit 
test code works fine; I've created myself a loading problem maybe because I'm 
not including the proper auxiliary NAR (in this new, reduced world)? Thanks!

2020-06-22 16:11:37,139 ERROR [main] org.apache.nifi.NiFi Failure to launch 
NiFi due to java.util.ServiceConfigurationError: 
org.apache.nifi.processor.Processor: Provider 
com.imatsolutions.processor.JdbcToAttributes could not be instantiated
java.util.ServiceConfigurationError: org.apache.nifi.processor.Processor: 
Provider com.imatsolutions.processor.JdbcToAttributes could not be instantiated
 at java.util.ServiceLoader.fail(ServiceLoader.java:232)
 at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
 at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
 at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
 at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
 at 
org.apache.nifi.nar.StandardExtensionDiscoveringManager.loadExtensions(StandardExtensionDiscoveringManager.java:156)
 at 
org.apache.nifi.nar.StandardExtensionDiscoveringManager.discoverExtensions(StandardExtensionDiscoveringManager.java:131)
 at 
org.apache.nifi.nar.StandardExtensionDiscoveringManager.discoverExtensions(StandardExtensionDiscoveringManager.java:117)
 at org.apache.nifi.web.server.JettyServer.start(JettyServer.java:942)
 at org.apache.nifi.NiFi.<init>(NiFi.java:158)
 at org.apache.nifi.NiFi.<init>(NiFi.java:72)
 at org.apache.nifi.NiFi.main(NiFi.java:301)
*Caused by: java.lang.NoClassDefFoundError: org/apache/nifi/dbcp/DBCPService*
 at com.imatsolutions.processor.JdbcCommon.<clinit>(JdbcCommon.java:31)
 at com.imatsolutions.processor.AbstractJdbcTo.<init>(AbstractJdbcTo.java:290)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
 at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
 at java.lang.Class.newInstance(Class.java:442)
 at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
 ... 9 common frames omitted
Caused by: java.lang.ClassNotFoundException: org.apache.nifi.dbcp.DBCPService
 at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
 ... 17 common frames omitted

I'm doing stuff like this inside my custom processor. (I hope this gives enough 
to go on.)

public class JdbcToAttributes

Re: Difficulty using DBCPService

2020-06-22 Thread Russell Bateman

Thank you for replying. You speak of the following dependency:


<dependency>
  <groupId>org.apache.nifi</groupId>
  <artifactId>nifi-standard-services-api-nar</artifactId>
  <type>nar</type>
</dependency>



I stumbled upon that, but could not make it work. My multimodule project 
structure appears thus:


-root
  - other submodules (whose custom processors work)
  - jdbc
*pom.xml*
  - nar
*pom.xml*
*pom.xml*

No matter which /pom.xml/ above I put that dependency into, I get the 
following complaint out of Maven:


[ERROR]   The project com.imatsolutions.pipeline:jdbc:4.0.0 
(/home/russ/sandboxes/imat-pipeline.master.dev/code/imat-pipeline/jdbc/pom.xml) 
has 1 error

  or
[ERROR]   The project com.imatsolutions.pipeline:imat-pipeline:4.0.0 
(/home/russ/sandboxes/imat-pipeline.master.dev/code/imat-pipeline/nar/pom.xml) 
has 1 error

  or
[ERROR]   The project 
com.imatsolutions.pipeline:imat-pipeline-parent:4.0.0 
(/home/russ/sandboxes/imat-pipeline.master.dev/code/imat-pipeline/pom.xml) 
has 1 error
[ERROR] 'dependencies.dependency.version' for 
org.apache.nifi:nifi-standard-services-api-nar:nar is missing. @ line 
136, column 17


On 6/22/20 4:38 PM, Matt Burgess wrote:

Not at my keyboard but does your NAR have the nifi-standard-services-api-nar as 
a parent? That should be where DBCPService is defined

Sent from my iPhone


On Jun 22, 2020, at 6:30 PM, Russell Bateman  wrote:

I find myself obliged to pick back up a custom processor, written by someone 
else a few years ago (in the NiFi 0.7.x era)  at my company, that makes use of 
DBCPService. While I think I understand the nuances of interface versus 
concrete controller class, etc. I probably need a push out the door onto the 
road of understanding why NiFi is not starting (and it's my fault). My JUnit 
test code works fine; I've created myself a loading problem maybe because I'm 
not including the proper auxiliary NAR (in this new, reduced world)? Thanks!

2020-06-22 16:11:37,139 ERROR [main] org.apache.nifi.NiFi Failure to launch 
NiFi due to java.util.ServiceConfigurationError: 
org.apache.nifi.processor.Processor: Provider 
com.imatsolutions.processor.JdbcToAttributes could not be instantiated
java.util.ServiceConfigurationError: org.apache.nifi.processor.Processor: 
Provider com.imatsolutions.processor.JdbcToAttributes could not be instantiated
 at java.util.ServiceLoader.fail(ServiceLoader.java:232)
 at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
 at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
 at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
 at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
 at 
org.apache.nifi.nar.StandardExtensionDiscoveringManager.loadExtensions(StandardExtensionDiscoveringManager.java:156)
 at 
org.apache.nifi.nar.StandardExtensionDiscoveringManager.discoverExtensions(StandardExtensionDiscoveringManager.java:131)
 at 
org.apache.nifi.nar.StandardExtensionDiscoveringManager.discoverExtensions(StandardExtensionDiscoveringManager.java:117)
 at org.apache.nifi.web.server.JettyServer.start(JettyServer.java:942)
 at org.apache.nifi.NiFi.<init>(NiFi.java:158)
 at org.apache.nifi.NiFi.<init>(NiFi.java:72)
 at org.apache.nifi.NiFi.main(NiFi.java:301)
*Caused by: java.lang.NoClassDefFoundError: org/apache/nifi/dbcp/DBCPService*
 at com.imatsolutions.processor.JdbcCommon.<clinit>(JdbcCommon.java:31)
 at com.imatsolutions.processor.AbstractJdbcTo.<init>(AbstractJdbcTo.java:290)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
 at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
 at java.lang.Class.newInstance(Class.java:442)
 at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
 ... 9 common frames omitted
Caused by: java.lang.ClassNotFoundException: org.apache.nifi.dbcp.DBCPService
 at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
 ... 17 common frames omitted

I'm doing stuff like this inside my custom processor. (I hope this gives enough 
to go on.)

public class JdbcToAttributes...
{
 public static final PropertyDescriptor DBCP_SERVICE = new 
PropertyDescriptor.Builder()
 .name( "Database Connection Pooling Service" )
 .description( "The controller service that is used to obtain connection 
to database" )
 .required( true )
 .identifiesControllerService( DBCPService.class )
 .build();
   ...

public void onTrigger( final ProcessContext context, final ProcessSession 
session, ProcessSessionFactory factory )
 throws ProcessException
   {
 ...

 final DBC

Difficulty using DBCPService

2020-06-22 Thread Russell Bateman
I find myself obliged to pick back up a custom processor, written by 
someone else a few years ago (in the NiFi 0.7.x era)  at my company, 
that makes use of DBCPService. While I think I understand the nuances of 
interface versus concrete controller class, etc. I probably need a push 
out the door onto the road of understanding why NiFi is not starting 
(and it's my fault). My JUnit test code works fine; I've created myself 
a loading problem maybe because I'm not including the proper auxiliary 
NAR (in this new, reduced world)? Thanks!


2020-06-22 16:11:37,139 ERROR [main] org.apache.nifi.NiFi Failure to 
launch NiFi due to java.util.ServiceConfigurationError: 
org.apache.nifi.processor.Processor: Provider 
com.imatsolutions.processor.JdbcToAttributes could not be instantiated
java.util.ServiceConfigurationError: 
org.apache.nifi.processor.Processor: Provider 
com.imatsolutions.processor.JdbcToAttributes could not be instantiated

    at java.util.ServiceLoader.fail(ServiceLoader.java:232)
    at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
    at 
java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)

    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
    at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
    at 
org.apache.nifi.nar.StandardExtensionDiscoveringManager.loadExtensions(StandardExtensionDiscoveringManager.java:156)
    at 
org.apache.nifi.nar.StandardExtensionDiscoveringManager.discoverExtensions(StandardExtensionDiscoveringManager.java:131)
    at 
org.apache.nifi.nar.StandardExtensionDiscoveringManager.discoverExtensions(StandardExtensionDiscoveringManager.java:117)

    at org.apache.nifi.web.server.JettyServer.start(JettyServer.java:942)
    at org.apache.nifi.NiFi.<init>(NiFi.java:158)
    at org.apache.nifi.NiFi.<init>(NiFi.java:72)
    at org.apache.nifi.NiFi.main(NiFi.java:301)
*Caused by: java.lang.NoClassDefFoundError: 
org/apache/nifi/dbcp/DBCPService*

    at com.imatsolutions.processor.JdbcCommon.<clinit>(JdbcCommon.java:31)
    at com.imatsolutions.processor.AbstractJdbcTo.<init>(AbstractJdbcTo.java:290)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
Method)
    at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.lang.Class.newInstance(Class.java:442)
    at 
java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)

    ... 9 common frames omitted
Caused by: java.lang.ClassNotFoundException: 
org.apache.nifi.dbcp.DBCPService

    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    ... 17 common frames omitted

I'm doing stuff like this inside my custom processor. (I hope this gives 
enough to go on.)


public class JdbcToAttributes...
{
    public static final PropertyDescriptor DBCP_SERVICE = new 
PropertyDescriptor.Builder()

            .name( "Database Connection Pooling Service" )
            .description( "The controller service that is used to 
obtain connection to database" )

            .required( true )
            .identifiesControllerService( DBCPService.class )
            .build();
  ...

public void onTrigger( final ProcessContext context, final 
ProcessSession session, ProcessSessionFactory factory )

            throws ProcessException
  {
    ...

    final DBCPService dbcpService = context.getProperty( DBCP_SERVICE 
).asControllerService( DBCPService.class );

    etc.





Re: Nifi tutorials

2020-05-26 Thread Russell Bateman

Certainly!

On 5/26/20 12:32 PM, Anuj Jain wrote:

Hi Team,

I have been working on Apache NiFi at my company for the past year.
I am planning to launch a series of YouTube videos for NiFi learners about what I
have learned in NiFi over the last year. Am I allowed to do that?


Regards,
Anuj Jain





Re: Reading the incoming flowfile "twice"

2020-03-31 Thread Russell Bateman

Let's see... Does this fix the typos the way you intended?

public void onTrigger( final ProcessContext context, final 
ProcessSession session ) throws ProcessException

{
  FlowFile original = session.get(); if( original == null ) { 
context.yield(); return; }

  FlowFile output   = session.create( original );

  // Begin writing to ‘output flowfile'
  FlowFile *modified* = session.write( output, new OutputStreamCallback()
  {
    @Override
    public void process( OutputStream out )
    {
  // read from original FlowFile
  session.read( original, new InputStreamCallback()
  {
    @Override
    public void process( InputStream in ) throws IOException
    {
  copyFirstHalf( in, out );
    }
  } );

  // read from original FlowFile a second time. Use a SAX parser to 
parse it

  // and write to the end of the ‘output flowfile'
  session.read( original, new InputStreamCallback()
  {
    @Override
    public void process( InputStream in ) throws IOException
    {
  processWithSaxParser( in, out );
    }
  } );
    }
  } );

  session.transfer( *modified*, SUCCESS );
  session.remove( original );
}

This seems very close to working for me; I don't see anything wrong and 
just need to plug in my SAX parser. This modified session is a new 
pattern for me (and a useful one).


Thanks!

On 3/31/20 12:44 PM, Russell Bateman wrote:

(Oh, I see where *out* comes from, but not *modified*.)

On 3/31/20 12:35 PM, Russell Bateman wrote:

Wait, where is *modified* from?

Thanks

On 3/31/20 12:24 PM, Mark Payne wrote:

Russ,

OK, so then I think the pattern you’d want to follow would be something like 
this:

FlowFile original = session.get();
if (flowFile == null) {
 return;
}

FlowFile output = session.create(original);

// Begin writing to ‘output flowfile'
output = session.write(*modified*, new OutputStreamCallback() {
 void process(OutputStream *out*) {

 // read from original FlowFile
 session.read(original, new InputStreamCallback() {
   void process(InputStream in) {
copyFirstHalf(in, out);
   }
});


 // read from original FlowFile a second time. Use a SAX parser to 
parse it and write to the end of the ‘output flowfile'
session.read(original, new InputStreamCallback() {
  void process(InputStream in) {
   processWithSaxParser(in, *out*);
  }
});

 }
});

session.transfer(output, REL_SUCCESS);
session.remove(original);


Thanks
-Mark



On Mar 31, 2020, at 2:04 PM, Russell Bateman  wrote:

Mark,

Thanks for getting back. My steps are:

1. Read the "first half" of the input stream copying it to the output stream. 
This is because I need to preserve the exact form of it (spacing, indentation, lines, 
etc.) without change whatsoever. If I

2. Reopen the stream from the beginning with a SAX parser. Its handler, which I wrote, will 
ignore the original part that I'm holding for sacred--everything between  
and .

3. The SAX handler writes the rest of the XML with a few changes out appending it to that 
same output stream on which the original "half" was written. (This does not 
seem to work.)

I was not seeing this as "overwriting" flowfile content, but, in my tiny little mind, I 
imagined an input stream, which I want to read exactly a) one-half, then again, b) one-whole time, 
and an output stream to which I start to write by copying (a), followed by a modification of (b) 
yet, the whole (b) or "second half." Then I'm done. I was thinking of the input stream as 
from the in-coming flowfile and a separate thing from the output stream which I see as being 
offered to me for my use in creating a new flowfile to transfer to. I guess this is not how it 
works.

My in-coming flowfiles can be megabytes in size. Copying to a string is not an option. Copying to a 
temporary file "isn't NiFi" as I understand it. I was hoping to avoid writing another 
processor or two to a) break up the flowfile into  ...  and (all the 
rest), fix (all the rest), then stitch the two back together in a later processor. I see having to 
coordinate the two halves of what used to be one file fraught with precarity and confusion, but I 
guess that's the solution I'm left with?

Thanks,
Russ


On 3/31/20 10:23 AM, Mark Payne wrote:

Russ,

As far as I can tell, this is working exactly as expected.

To verify, I created a simple Integration test, as well, which I attached below.

Let me outline what I *think* you’re trying to do here and please correct me if 
I’m wrong:

1. Read the content of the FlowFile. (Via session.read)
2. Overwrite the content of the FlowFile. (This is done by session.write)
3. Overwrite the content of the FlowFile again. (Via session.write)

The third step is the part where I’m confused. You’re calling session.write() 
again. In the callback, you’ll receive an Input
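
Pulling the thread's corrected pattern together, a self-contained sketch of its
final shape; copyFirstHalf and processWithSaxParser stand in for the poster's
own helpers, SUCCESS for the processor's relationship, and the callbacks are
written as lambdas since OutputStreamCallback and InputStreamCallback are
single-method interfaces:

  public void onTrigger( final ProcessContext context, final ProcessSession session ) throws ProcessException
  {
    FlowFile original = session.get();
    if( original == null )
    {
      context.yield();
      return;
    }

    FlowFile output = session.create( original );   // the child inherits the original's attributes
    output = session.write( output, out ->
    {
      session.read( original, in -> copyFirstHalf( in, out ) );          // pass 1: byte-for-byte copy
      session.read( original, in -> processWithSaxParser( in, out ) );   // pass 2: SAX-parse and append
    } );

    session.transfer( output, SUCCESS );
    session.remove( original );
  }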

Re: Reading the incoming flowfile "twice"

2020-03-31 Thread Russell Bateman

(Oh, I see where *out* comes from, but not *modified*.)

On 3/31/20 12:35 PM, Russell Bateman wrote:

Wait, where is *modified* from?

Thanks

On 3/31/20 12:24 PM, Mark Payne wrote:

Russ,

OK, so then I think the pattern you’d want to follow would be something like 
this:

FlowFile original = session.get();
if (flowFile == null) {
 return;
}

FlowFile output = session.create(original);

// Begin writing to ‘output flowfile'
output = session.write(*modified*, new OutputStreamCallback() {
 void process(OutputStream *out*) {

 // read from original FlowFile
 session.read(original, new InputStreamCallback() {
   void process(InputStream in) {
copyFirstHalf(in, out);
   }
});


 // read from original FlowFile a second time. Use a SAX parser to 
parse it and write to the end of the ‘output flowfile'
session.read(original, new InputStreamCallback() {
  void process(InputStream in) {
   processWithSaxParser(in, *out*);
  }
});

 }
});

session.transfer(output, REL_SUCCESS);
session.remove(original);


Thanks
-Mark



On Mar 31, 2020, at 2:04 PM, Russell Bateman  wrote:

Mark,

Thanks for getting back. My steps are:

1. Read the "first half" of the input stream copying it to the output stream. 
This is because I need to preserve the exact form of it (spacing, indentation, lines, 
etc.) without change whatsoever. If I

2. Reopen the stream from the beginning with a SAX parser. Its handler, which I wrote, will 
ignore the original part that I'm holding for sacred--everything between  
and .

3. The SAX handler writes the rest of the XML with a few changes out appending it to that 
same output stream on which the original "half" was written. (This does not 
seem to work.)

I was not seeing this as "overwriting" flowfile content, but, in my tiny little mind, I 
imagined an input stream, which I want to read exactly a) one-half, then again, b) one-whole time, 
and an output stream to which I start to write by copying (a), followed by a modification of (b) 
yet, the whole (b) or "second half." Then I'm done. I was thinking of the input stream as 
from the in-coming flowfile and a separate thing from the output stream which I see as being 
offered to me for my use in creating a new flowfile to transfer to. I guess this is not how it 
works.

My in-coming flowfiles can be megabytes in size. Copying to a string is not an option. Copying to a 
temporary file "isn't NiFi" as I understand it. I was hoping to avoid writing another 
processor or two to a) break up the flowfile into  ...  and (all the 
rest), fix (all the rest), then stitch the two back together in a later processor. I see having to 
coordinate the two halves of what used to be one file fraught with precarity and confusion, but I 
guess that's the solution I'm left with?

Thanks,
Russ


On 3/31/20 10:23 AM, Mark Payne wrote:

Russ,

As far as I can tell, this is working exactly as expected.

To verify, I created a simple Integration test, as well, which I attached below.

Let me outline what I *think* you’re trying to do here and please correct me if 
I’m wrong:

1. Read the content of the FlowFile. (Via session.read)
2. Overwrite the content of the FlowFile. (This is done by session.write)
3. Overwrite the content of the FlowFile again. (Via session.write)

The third step is the part where I’m confused. You’re calling session.write() 
again. In the callback, you’ll receive an InputStream that contains the 
contents of the FlowFile (which have now been modified, per Step 2). You’re 
also given an OutputStream to write the new content to.
If you then return without writing anything to the OutputStream, as in the 
example that you attached, then yes, you’ll have erased all of the FlowFile’s 
content.

It’s unclear to me exactly what you’re attempting to accomplish in the third 
step. It *sounds* like you’re expecting the content of the original/incoming 
FlowFile. But you’re not going to get that because you’ve already overwritten 
that FlowFile’s content. If that is what you’re trying to do, I think what 
you’d want to do is something more like this:

FlowFile original = session.get();
If (original == null) {
   return;
}

session.read(original, new InputStreamCallback() {…});

FlowFile childFlowFile = session.create(original); // Create a ‘child’ flow 
file whose content is equal to the original FlowFile’s content.
session.write(childFlowFile, new StreamCallback() {…});

// Read the original FlowFile’s content
session.read(original, new InputStreamCallback() { … });

session.transfer(childFlowFile, REL_SUCCESS);
session.remove(original); // or transfer to an ‘original’ relationship or 
whatever makes sense for you.



Hope this helps!
-Mark





On Mar 30, 2020, at 4:23 PM, Russell Bateman wrote:

If I haven't worn out my welcome, here is the sim

Re: Reading the incoming flowfile "twice"

2020-03-31 Thread Russell Bateman

And, also *out*?

On 3/31/20 12:35 PM, Russell Bateman wrote:

Wait, where is *modified* from?

Thanks

On 3/31/20 12:24 PM, Mark Payne wrote:

Russ,

OK, so then I think the pattern you’d want to follow would be something like 
this:

FlowFile original = session.get();
if (flowFile == null) {
 return;
}

FlowFile output = session.create(original);

// Begin writing to ‘output flowfile'
output = session.write(modified, new OutputStreamCallback() {
 void process(OutputStream out) {

 // read from original FlowFile
 session.read(original, new InputStreamCallback() {
   void process(InputStream in) {
copyFirstHalf(in, out);
   }
});


 // read from original FlowFile a second time. Use a SAX parser to 
parse it and write to the end of the ‘output flowfile'
session.read(original, new InputStreamCallback() {
  void process(InputStream in) {
   processWithSaxParser(in, *out*);
  }
});

 }
});

session.transfer(output, REL_SUCCESS);
session.remove(original);


Thanks
-Mark



On Mar 31, 2020, at 2:04 PM, Russell Bateman  wrote:

Mark,

Thanks for getting back. My steps are:

1. Read the "first half" of the input stream copying it to the output stream. 
This is because I need to preserve the exact form of it (spacing, indentation, lines, 
etc.) without change whatsoever. If I

2. Reopen the stream from the beginning with a SAX parser. Its handler, which I wrote, will 
ignore the original part that I'm holding for sacred--everything between  
and .

3. The SAX handler writes the rest of the XML with a few changes out appending it to that 
same output stream on which the original "half" was written. (This does not 
seem to work.)

I was not seeing this as "overwriting" flowfile content, but, in my tiny little mind, I 
imagined an input stream, which I want to read exactly a) one-half, then again, b) one-whole time, 
and an output stream to which I start to write by copying (a), followed by a modification of (b) 
yet, the whole (b) or "second half." Then I'm done. I was thinking of the input stream as 
from the in-coming flowfile and a separate thing from the output stream which I see as being 
offered to me for my use in creating a new flowfile to transfer to. I guess this is not how it 
works.

My in-coming flowfiles can be megabytes in size. Copying to a string is not an option. Copying to a 
temporary file "isn't NiFi" as I understand it. I was hoping to avoid writing another 
processor or two to a) break up the flowfile into  ...  and (all the 
rest), fix (all the rest), then stitch the two back together in a later processor. I see having to 
coordinate the two halves of what used to be one file fraught with precarity and confusion, but I 
guess that's the solution I'm left with?

Thanks,
Russ


On 3/31/20 10:23 AM, Mark Payne wrote:

Russ,

As far as I can tell, this is working exactly as expected.

To verify, I created a simple Integration test, as well, which I attached below.

Let me outline what I *think* you’re trying to do here and please correct me if 
I’m wrong:

1. Read the content of the FlowFile. (Via session.read)
2. Overwrite the content of the FlowFile. (This is done by session.write)
3. Overwrite the content of the FlowFile again. (Via session.write)

The third step is the part where I’m confused. You’re calling session.write() 
again. In the callback, you’ll receive an InputStream that contains the 
contents of the FlowFile (which have now been modified, per Step 2). You’re 
also given an OutputStream to write the new content to.
If you then return without writing anything to the OutputStream, as in the 
example that you attached, then yes, you’ll have erased all of the FlowFile’s 
content.

It’s unclear to me exactly what you’re attempting to accomplish in the third 
step. It *sounds* like you’re expecting the content of the original/incoming 
FlowFile. But you’re not going to get that because you’ve already overwritten 
that FlowFile’s content. If that is what you’re trying to do, I think what 
you’d want to do is something more like this:

FlowFile original = session.get();
If (original == null) {
   return;
}

session.read(original, new InputStreamCallback() {…});

FlowFile childFlowFile = session.create(original); // Create a ‘child’ flow 
file whose content is equal to the original FlowFile’s content.
session.write(childFlowFile, new StreamCallback() {…});

// Read the original FlowFile’s content
session.read(original, new InputStreamCallback() { … });

session.transfer(childFlowFile, REL_SUCCESS);
session.remove(original); // or transfer to an ‘original’ relationship or 
whatever makes sense for you.



Hope this helps!
-Mark





On Mar 30, 2020, at 4:23 PM, Russell Bateman wrote:

If I haven't worn out my welcome, here is the simplified code that should 
demonstrate eith

Re: Reading the incoming flowfile "twice"

2020-03-31 Thread Russell Bateman

Wait, where is *modified* from?

Thanks

On 3/31/20 12:24 PM, Mark Payne wrote:

Russ,

OK, so then I think the pattern you’d want to follow would be something like 
this:

FlowFile original = session.get();
if (flowFile == null) {
 return;
}

FlowFile output = session.create(original);

// Begin writing to ‘output flowfile'
output = session.write(*modified*, new OutputStreamCallback() {
 void process(OutputStream out) {

 // read from original FlowFile
 session.read(original, new InputStreamCallback() {
   void process(InputStream in) {
copyFirstHalf(in, out);
   }
});


 // read from original FlowFile a second time. Use a SAX parser to 
parse it and write to the end of the ‘output flowfile'
session.read(original, new InputStreamCallback() {
  void process(InputStream in) {
   processWithSaxParser(in, out);
  }
});

 }
});

session.transfer(output, REL_SUCCESS);
session.remove(original);


Thanks
-Mark



On Mar 31, 2020, at 2:04 PM, Russell Bateman  wrote:

Mark,

Thanks for getting back. My steps are:

1. Read the "first half" of the input stream copying it to the output stream. 
This is because I need to preserve the exact form of it (spacing, indentation, lines, 
etc.) without change whatsoever. If I

2. Reopen the stream from the beginning with a SAX parser. Its handler, which I wrote, will 
ignore the original part that I'm holding for sacred--everything between  
and .

3. The SAX handler writes the rest of the XML with a few changes out appending it to that 
same output stream on which the original "half" was written. (This does not 
seem to work.)

I was not seeing this as "overwriting" flowfile content, but, in my tiny little mind, I 
imagined an input stream, which I want to read exactly a) one-half, then again, b) one-whole time, 
and an output stream to which I start to write by copying (a), followed by a modification of (b) 
yet, the whole (b) or "second half." Then I'm done. I was thinking of the input stream as 
from the in-coming flowfile and a separate thing from the output stream which I see as being 
offered to me for my use in creating a new flowfile to transfer to. I guess this is not how it 
works.

My in-coming flowfiles can be megabytes in size. Copying to a string is not an option. Copying to a 
temporary file "isn't NiFi" as I understand it. I was hoping to avoid writing another 
processor or two to a) break up the flowfile into  ...  and (all the 
rest), fix (all the rest), then stitch the two back together in a later processor. I see having to 
coordinate the two halves of what used to be one file fraught with precarity and confusion, but I 
guess that's the solution I'm left with?

Thanks,
Russ


On 3/31/20 10:23 AM, Mark Payne wrote:

Russ,

As far as I can tell, this is working exactly as expected.

To verify, I created a simple Integration test, as well, which I attached below.

Let me outline what I *think* you’re trying to do here and please correct me if 
I’m wrong:

1. Read the content of the FlowFile. (Via session.read)
2. Overwrite the content of the FlowFile. (This is done by session.write)
3. Overwrite the content of the FlowFile again. (Via session.write)

The third step is the part where I’m confused. You’re calling session.write() 
again. In the callback, you’ll receive an InputStream that contains the 
contents of the FlowFile (which have now been modified, per Step 2). You’re 
also given an OutputStream to write the new content to.
If you then return without writing anything to the OutputStream, as in the 
example that you attached, then yes, you’ll have erased all of the FlowFile’s 
content.

It’s unclear to me exactly what you’re attempting to accomplish in the third 
step. It *sounds* like you’re expecting the content of the original/incoming 
FlowFile. But you’re not going to get that because you’ve already overwritten 
that FlowFile’s content. If that is what you’re trying to do, I think what 
you’d want to do is something more like this:

FlowFile original = session.get();
If (original == null) {
   return;
}

session.read(original, new InputStreamCallback() {…});

FlowFile childFlowFile = session.create(original); // Create a ‘child’ flow 
file whose content is equal to the original FlowFile’s content.
session.write(childFlowFile, new StreamCallback() {…});

// Read the original FlowFile’s content
session.read(original, new InputStreamCallback() { … });

session.transfer(childFlowFile, REL_SUCCESS);
session.remove(original); // or transfer to an ‘original’ relationship or 
whatever makes sense for you.



Hope this helps!
-Mark





On Mar 30, 2020, at 4:23 PM, Russell Bateman wrote:

If I haven't worn out my welcome, here is the simplified code that should 
demonstrate either that I have miscoded your suggestions or that the AP

Re: Reading the incoming flowfile "twice"

2020-03-31 Thread Russell Bateman

Mark,

Thanks for getting back. My steps are:

1. Read the "first half" of the input stream copying it to the output 
stream. This is because I need to preserve the exact form of it 
(spacing, indentation, lines, etc.) without change whatsoever. If I


2. Reopen the stream from the beginning with a SAX parser. Its handler, 
which I wrote, will ignore the original part that I'm holding for 
sacred--everything between  and .


3. The SAX handler writes the rest of the XML with a few changes out 
appending it to that same output stream on which the original "half" was 
written. (This does not seem to work.)


I was not seeing this as "overwriting" flowfile content, but, in my tiny 
little mind, I imagined an input stream, which I want to read exactly a) 
one-half, then again, b) one-whole time, and an output stream to which I 
start to write by copying (a), followed by a modification of (b) yet, 
the whole (b) or "second half." Then I'm done. I was thinking of the 
input stream as from the in-coming flowfile and a separate thing from 
the output stream which I see as being offered to me for my use in 
creating a new flowfile to transfer to. I guess this is not how it works.


My in-coming flowfiles can be megabytes in size. Copying to a string is 
not an option. Copying to a temporary file "isn't NiFi" as I understand 
it. I was hoping to avoid writing another processor or two to a) break 
up the flowfile into  ...  and (all the rest), fix 
(all the rest), then stitch the two back together in a later processor. 
I see having to coordinate the two halves of what used to be one file 
fraught with precarity and confusion, but I guess that's the solution 
I'm left with?


Thanks,
Russ


On 3/31/20 10:23 AM, Mark Payne wrote:

Russ,

As far as I can tell, this is working exactly as expected.

To verify, I created a simple Integration test, as well, which I 
attached below.


Let me outline what I *think* you’re trying to do here and please 
correct me if I’m wrong:


1. Read the content of the FlowFile. (Via session.read)
2. Overwrite the content of the FlowFile. (This is done by session.write)
3. Overwrite the content of the FlowFile again. (Via session.write)

The third step is the part where I’m confused. You’re calling 
session.write() again. In the callback, you’ll receive an InputStream 
that contains the contents of the FlowFile (which have now been 
modified, per Step 2). You’re also given an OutputStream to write the 
new content to.
If you then return without writing anything to the OutputStream, as in 
the example that you attached, then yes, you’ll have erased all of the 
FlowFile’s content.


It’s unclear to me exactly what you’re attempting to accomplish in the 
third step. It *sounds* like you’re expecting the content of the 
original/incoming FlowFile. But you’re not going to get that because 
you’ve already overwritten that FlowFile’s content. If that is what 
you’re trying to do, I think what you’d want to do is something more 
like this:


FlowFile original = session.get();
If (original == null) {
  return;
}

session.read(original, new InputStreamCallback() {…});

FlowFile childFlowFile = session.create(original); // Create a ‘child’ 
flow file whose content is equal to the original FlowFile’s content.

session.write(childFlowFile, new StreamCallback() {…});

// Read the original FlowFile’s content
session.read(original, new InputStreamCallback() { … });

session.transfer(childFlowFile, REL_SUCCESS);
session.remove(original); // or transfer to an ‘original’ relationship 
or whatever makes sense for you.




Hope this helps!
-Mark




On Mar 30, 2020, at 4:23 PM, Russell Bateman wrote:


If I haven't worn out my welcome, here is the simplified code that 
should demonstrate either that I have miscoded your suggestions or 
that the API doesn't in fact work as advertised. First, the output. 
The code, both JUnit test and processor are attached and the files 
are pretty small.


Much thanks,
Russ

This is the input stream first time around (before copying)
===

* * * session.read( flowfile );
  Here's what's in input stream:

      This is the original document.

      2016-06-28 13:23

      1980-07-01
      36

And now, let's copy some of the input stream to the output stream
=

* * * flowfile = session.write( flowfile, new StreamCallback() ...
  Copying input stream to output stream up to ...
  The output stream has in it at this point:

      This is the original document.

[1. When we examine the output stream, it has what we expect.]

After copying, can we reopen input stream intact and does 
outputstream have what we think? 

* * * flowfile = session.write( flowfile, new StreamCallback() ...
  Here's what's in input stream:

      This is the original document.

Re: Reading the incoming flowfile "twice"

2020-03-31 Thread Russell Bateman
Yes, I thought of that, but there's no way to insert completing XML 
structure into the input stream ahead of (). SAX will choke if 
I just start feeding it the flowfile where I left off from copying up to 
.


On 3/30/20 8:25 PM, Otto Fowler wrote:

Can I ask why you would consume the whole stream when doing the non-sax
part? If you consume the stream right up to the sax part ( the stream POS
is at the start of the xml ) then you can just pass the stream to sax as is
can’t you?
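
For concreteness, a sketch of what Otto describes (consumePrologue() is a
hypothetical helper that advances the stream just past the non-XML
prologue, and handler is your SAX DefaultHandler). As the reply above
notes, this only helps when the rest of the stream is well-formed XML on
its own:

   session.read( flowfile, new InputStreamCallback()
   {
      @Override
      public void process( InputStream in ) throws IOException
      {
         consumePrologue( in );   // hypothetical: read up to where the XML begins
         try
         {
            // SAX picks up the very same stream at its current position
            SAXParserFactory.newInstance().newSAXParser()
                            .parse( new InputSource( in ), handler );
         }
         catch( ParserConfigurationException | SAXException e )
         {
            throw new IOException( e );
         }
      }
   } );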




On March 30, 2020 at 16:23:27, Russell Bateman (r...@windofkeltia.com)
wrote:

If I haven't worn out my welcome, here is the simplified code that should
demonstrate either that I have miscoded your suggestions or that the API
doesn't in fact work as advertised. First, the output. The code, both JUnit
test and processor are attached and the files are pretty small.

Much thanks,
Russ

This is the input stream first time around (before copying)
===
* * * session.read( flowfile );
   Here's what's in input stream:

     This is the original document.

     2016-06-28 13:23

     1980-07-01
     36

And now, let's copy some of the input stream to the output stream
=
* * * flowfile = session.write( flowfile, new StreamCallback() ...
   Copying input stream to output stream up to ...
   The output stream has in it at this point:

     This is the original document.

[1. When we examine the output stream, it has what we expect.]

After copying, can we reopen input stream intact and does outputstream have
what we think? 
* * * flowfile = session.write( flowfile, new StreamCallback() ...
   Here's what's in input stream:

     This is the original document.

[2. The input stream as reported just above is truncated by exactly the
content we did not copy to the output stream. We expected to see the
entire, original file, but the second half is gone.]

   Here's what's in the output stream at this point:
     (nothing)

[3. The content we copied to the output stream has disappeared. Does it
disappear simply because we looked at it (printed it out here)?]


On 3/29/20 5:05 AM, Joe Witt wrote:

Russell

I recommend writing very simple code that does two successive read/write
operations on basic data so you can make sure the API works as expected.
Then add the xml bits.

Thanks

On Sun, Mar 29, 2020 at 5:15 AM Mike Thomsen 
 wrote:


If these files are only a few MB at the most, you can also just export them
to a ByteArrayOutputStream. Just a thought.

On Sun, Mar 29, 2020 at 12:16 AM Russell Bateman
 
wrote:


Joe and Mike,

Sadly, I was not able to get very far on this. It seems that to the extent
I copy the first half of the contents of the input stream, I
lose what comes after when I try to read again, basically, the second
half comprising the and elements which I was
hoping to SAX-parse. Here's code and output. I have highlighted the
output to make it easier to read.

try
{
    InputStream inputStream = session.read( flowfile );
    System.out.println( "This is the input stream first time around (before copying to output stream)..." );
    System.out.println( StreamUtilities.fromStream( inputStream ) );
    inputStream.close();
}
catch( IOException e )
{
    e.printStackTrace();
}
flowfile = session.write( flowfile, new StreamCallback()
{
    @Override
    public void process( InputStream inputStream, OutputStream outputStream ) throws IOException
    {
        System.out.println( "And now, let's copy..." );
        CxmlStreamUtilities.copyCxmlHeaderAndDocumentToOutput( inputStream, outputStream );
    }
} );
try
{
    InputStream inputStream = session.read( flowfile );
    System.out.println( "This is the input stream second time around (after copying)..." );
    System.out.println( StreamUtilities.fromStream( inputStream ) );
    inputStream.close();
}
catch( IOException e )
{
    e.printStackTrace();
}
// ...on to SAX parser which dies because the input has been truncated to
// exactly what was written out to the output stream


Output of above:

This is the input stream first time around (before copying to output
stream)...


  This is the original document.


  2016-06-28 13:23


  1980-07-01
  36



And now, let's copy...
This is the input stream second time around (after copying)...


  This is the original document.

And now, we'll go on to the SAX parser...
  This is the original document. 
[pool-1-thread-1] ERROR [...] SAX ruleparser error:
org.xml.sax.SAXParseException; lineNumber: 4; columnNumber: 14; XML
document structures must start and end within the same entity.


I left off the code that prints, "And now, we'll go on to the SAX
parser..." It's in the next flowfile = session.write( ... ). I ha

Re: Reading the incoming flowfile "twice"

2020-03-30 Thread Russell Bateman
If I haven't worn out my welcome, here is the simplified code that 
should demonstrate either that I have miscoded your suggestions or that 
the API doesn't in fact work as advertised. First, the output. The code, 
both JUnit test and processor are attached and the files are pretty small.


Much thanks,
Russ

This is the input stream first time around (before copying)
===

* * * session.read( flowfile );
  Here's what's in input stream:

      This is the original document.

      2016-06-28 13:23

      1980-07-01
      36

And now, let's copy some of the input stream to the output stream
=

* * * flowfile = session.write( flowfile, new StreamCallback() ...
  Copying input stream to output stream up to ...
  The output stream has in it at this point:

      This is the original document.

[1. When we examine the output stream, it has what we expect.]

After copying, can we reopen input stream intact and does outputstream 
have what we think? 

* * * flowfile = session.write( flowfile, new StreamCallback() ...
  Here's what's in input stream:

      This is the original document.

[2. The input stream as reported just above is truncated by exactly the 
content we did not copy to the output stream. We expected to see the 
entire, original file, but the second half is gone.]

  Here's what's in the output stream at this point:
      (nothing)

[3. The content we copied to the output stream has disappeared. Does it 
disappear simply because we looked at it (printed it out here)?]


On 3/29/20 5:05 AM, Joe Witt wrote:

Russell

I recommend writing very simple code that does two successive read/write
operations on basic data so you can make sure the API works as expected.
Then add the xml bits.

Thanks

On Sun, Mar 29, 2020 at 5:15 AM Mike Thomsen  wrote:


If these files are only a few MB at the most, you can also just export them
to a ByteArrayOutputStream. Just a thought.

On Sun, Mar 29, 2020 at 12:16 AM Russell Bateman 
wrote:


Joe and Mike,

Sadly, I was not able to get very far on this. It seems that to the extent
I copy the first half of the contents of the input stream, I
lose what comes after when I try to read again, basically, the second
half comprising the and elements which I was
hoping to SAX-parse. Here's code and output. I have highlighted the
output to make it easier to read.

try
{
    InputStream inputStream = session.read( flowfile );
    System.out.println( "This is the input stream first time around (before copying to output stream)..." );
    System.out.println( StreamUtilities.fromStream( inputStream ) );
    inputStream.close();
}
catch( IOException e )
{
    e.printStackTrace();
}
flowfile = session.write( flowfile, new StreamCallback()
{
    @Override
    public void process( InputStream inputStream, OutputStream outputStream ) throws IOException
    {
        System.out.println( "And now, let's copy..." );
        CxmlStreamUtilities.copyCxmlHeaderAndDocumentToOutput( inputStream, outputStream );
    }
} );
try
{
    InputStream inputStream = session.read( flowfile );
    System.out.println( "This is the input stream second time around (after copying)..." );
    System.out.println( StreamUtilities.fromStream( inputStream ) );
    inputStream.close();
}
catch( IOException e )
{
    e.printStackTrace();
}
// ...on to SAX parser which dies because the input has been truncated to
// exactly what was written out to the output stream


Output of above:

This is the input stream first time around (before copying to output
stream)...


  This is the original document.


  2016-06-28 13:23


  1980-07-01
  36



And now, let's copy...
This is the input stream second time around (after copying)...


  This is the original document.

And now, we'll go on to the SAX parser...
  This is the original document. 
[pool-1-thread-1] ERROR [...] SAX ruleparser error:
org.xml.sax.SAXParseException; lineNumber: 4; columnNumber: 14; XML
document structures must start and end within the same entity.


I left off the code that prints, "And now, we'll go on to the SAX
parser..." It's in the next flowfile = session.write( ... ). I have unit
tests that verify the good functioning of
copyCxmlHeaderAndDocumentToOutput(). The SAX error occurs because the
"file" is truncated; SAX finds the first "half" just fine, but there is
no second "half". If I comment out copying from input stream to output
stream, the error doesn't occur--the whole document is there.

Thanks for looking at this again if you can,
Russ

On 3/27/20 3:08 PM, Joe Witt wrote:

you should be able to call write as many times as you need.  just keep
using the resul

Re: Reading the incoming flowfile "twice"

2020-03-29 Thread Russell Bateman
No, this test file is tiny. The real thing is usually megabytes in size.

On Sun, Mar 29, 2020 at 3:15 AM Mike Thomsen  wrote:

> If these files are only a few MB at the most, you can also just export them
> to a ByteArrayOutputStream. Just a thought.
>
> On Sun, Mar 29, 2020 at 12:16 AM Russell Bateman 
> wrote:
>
> > Joe and Mike,
> >
> > Sadly, I was not able to get very far on this. It seems that to the extent
> > I copy the first half of the contents of the input stream, I
> > lose what comes after when I try to read again, basically, the second
> > half comprising the and elements which I was
> > hoping to SAX-parse. Here's code and output. I have highlighted the
> > output to make it easier to read.
> >
> > try
> > {
> >     InputStream inputStream = session.read( flowfile );
> >     System.out.println( "This is the input stream first time around (before copying to output stream)..." );
> >     System.out.println( StreamUtilities.fromStream( inputStream ) );
> >     inputStream.close();
> > }
> > catch( IOException e )
> > {
> >     e.printStackTrace();
> > }
> > flowfile = session.write( flowfile, new StreamCallback()
> > {
> >     @Override
> >     public void process( InputStream inputStream, OutputStream outputStream ) throws IOException
> >     {
> >         System.out.println( "And now, let's copy..." );
> >         CxmlStreamUtilities.copyCxmlHeaderAndDocumentToOutput( inputStream, outputStream );
> >     }
> > } );
> > try
> > {
> >     InputStream inputStream = session.read( flowfile );
> >     System.out.println( "This is the input stream second time around (after copying)..." );
> >     System.out.println( StreamUtilities.fromStream( inputStream ) );
> >     inputStream.close();
> > }
> > catch( IOException e )
> > {
> >     e.printStackTrace();
> > }
> > // ...on to SAX parser which dies because the input has been truncated to
> > // exactly what was written out to the output stream
> >
> > Output of above:
> >
> > This is the input stream first time around (before copying to output
> > stream)...
> > 
> >
> >  This is the original document.
> >
> >
> >  2016-06-28 13:23
> >
> >
> >  1980-07-01
> >  36
> >
> > 
> >
> > And now, let's copy...
> > This is the input stream second time around (after copying)...
> > 
> >
> >  This is the original document.
> >
> > And now, we'll go on to the SAX parser...
> >   This is the original document. 
> > [pool-1-thread-1] ERROR [...] SAX ruleparser error:
> > org.xml.sax.SAXParseException; lineNumber: 4; columnNumber: 14; XML
> > document structures must start and end within the same entity.
> >
> >
> > I left off the code that prints, "And now, we'll go on to the SAX
> > parser..." It's in the next flowfile = session.write( ... ). I have unit
> > tests that verify the good functioning of
> > copyCxmlHeaderAndDocumentToOutput(). The SAX error occurs because the
> > "file" is truncated; SAX finds the first "half" just fine, but there is
> > no second "half". If I comment out copying from input stream to output
> > stream, the error doesn't occur--the whole document is there.
> >
> > Thanks for looking at this again if you can,
> > Russ
> >
> > On 3/27/20 3:08 PM, Joe Witt wrote:
> > > you should be able to call write as many times as you need.  just keep
> > > using the resulting flowfile reference into the next call.
> > >
> > > On Fri, Mar 27, 2020 at 5:06 PM Russell Bateman  >
> > > wrote:
> > >
> > >> Mike,
> > >>
> > >> Many thanks for responding. Do you mean to say that all I have to do
> is
> > >> something like this?
> > >>
> > >>  public void onTrigger( final ProcessContext context, final
> > >>  ProcessSession session ) throws ProcessException
> > >>  {
> > >> FlowFile flowfile = session.get();
> > >> ...
> > >>
> > >> // this is will be our resulting flowfile...
> > >> AtomicReference< OutputStream > savedOutputStream = new
> > >>  AtomicReference<>();
> > >>
> >

Re: Reading the incoming flowfile "twice"

2020-03-28 Thread Russell Bateman

Joe and Mike,

Sadly, I was not able to get very far on this. It seems that to the extent 
I copy the first half of the contents of the input stream, I 
lose what comes after when I try to read again, basically, the second 
half comprising the and elements which I was 
hoping to SAX-parse. Here's code and output. I have highlighted the 
output to make it easier to read.


try
{
    InputStream inputStream = session.read( flowfile );
    System.out.println( "This is the input stream first time around (before copying to output stream)..." );
    System.out.println( StreamUtilities.fromStream( inputStream ) );
    inputStream.close();
}
catch( IOException e )
{
    e.printStackTrace();
}
flowfile = session.write( flowfile, new StreamCallback()
{
    @Override
    public void process( InputStream inputStream, OutputStream outputStream ) throws IOException
    {
        System.out.println( "And now, let's copy..." );
        CxmlStreamUtilities.copyCxmlHeaderAndDocumentToOutput( inputStream, outputStream );
    }
} );
try
{
    InputStream inputStream = session.read( flowfile );
    System.out.println( "This is the input stream second time around (after copying)..." );
    System.out.println( StreamUtilities.fromStream( inputStream ) );
    inputStream.close();
}
catch( IOException e )
{
    e.printStackTrace();
}
// ...on to SAX parser which dies because the input has been truncated to
// exactly what was written out to the output stream


Output of above:

This is the input stream first time around (before copying to output 
stream)...


  
This is the original document.
  
  
2016-06-28 13:23
  
  
1980-07-01
36
  


And now, let's copy...
This is the input stream second time around (after copying)...

  
This is the original document.
  
And now, we'll go on to the SAX parser...
  This is the original document. 
[pool-1-thread-1] ERROR [...] SAX ruleparser error: 
org.xml.sax.SAXParseException; lineNumber: 4; columnNumber: 14; XML document 
structures must start and end within the same entity.


I left off the code that prints, "And now, we'll go on to the SAX 
parser..." It's in the next flowfile = session.write( ... ). I have unit 
tests that verify the good functioning of 
copyCxmlHeaderAndDocumentToOutput(). The SAX error occurs because the 
"file" is truncated; SAX finds the first "half" just fine, but there is 
no second "half". If I comment out copying from input stream to output 
stream, the error doesn't occur--the whole document is there.


Thanks for looking at this again if you can,
Russ

On 3/27/20 3:08 PM, Joe Witt wrote:

you should be able to call write as many times as you need.  just keep
using the resulting flowfile reference into the next call.

On Fri, Mar 27, 2020 at 5:06 PM Russell Bateman 
wrote:


Mike,

Many thanks for responding. Do you mean to say that all I have to do is
something like this?

 public void onTrigger( final ProcessContext context, final
 ProcessSession session ) throws ProcessException
 {
    FlowFile flowfile = session.get();
    ...

    // this will be our resulting flowfile...
    AtomicReference< OutputStream > savedOutputStream = new AtomicReference<>();

    /* Do some processing on the in-coming flowfile then close its input
     * stream, but save the output stream for continued use.
     */
    session.write( flowfile, new StreamCallback()
    {
       @Override
       public void process( InputStream inputStream, OutputStream outputStream ) throws IOException
       {
          savedOutputStream.set( outputStream );
          ...

          // processing puts some output on the output stream...
          outputStream.write( etc. );

          inputStream.close();
       }
    } );

    /* Start over doing different processing on the (same/reopened) in-coming
     * flowfile continuing to use the original output stream. It's our
     * responsibility to close the saved output stream; NiFi closes the unused
     * output stream opened, but ignored by us.
     */
    session.write( flowfile, new StreamCallback()
    {
       @Override
       public void process( InputStream inputStream, OutputStream outputStream ) throws IOException
       {
          outputStream = savedOutputStream.get(); // (discard the new output stream)
          ...

          // processing puts (some more) output on the original output stream...
          outputStream.write( etc. );

          outputStream.close();
       }
    } );

    session.transfer( flowfile, etc. );
 }

I'm wondering if this will work to "discard" the new output stream
opened for me (the second time) and replace it with

Re: Reading the incoming flowfile "twice"

2020-03-27 Thread Russell Bateman

Joe,

Ah, thanks. I think I have learned a lot about what's going on down 
inside session.read/write() today. I don't have to stand on my head. For 
completeness if anyone else looks for this answer, here's my code amended:


public void onTrigger( final ProcessContext context, final ProcessSession 
session ) throws ProcessException
{
  FlowFile flowfile = session.get();
  ...

  // Do some processing on the in-coming flowfile then close its input stream
  flowfile = session.write( flowfile, new StreamCallback()
  {
@Override
public void process( InputStream inputStream, OutputStream outputStream ) 
throws IOException
{
  ...

  // processing puts some output on the output stream...
  outputStream.write( etc. );

  inputStream.close();
}
  } );

  // Start over doing different processing on the (same/reopened) in-coming 
flowfile
  // continuing to use the (same, also reopened, but appended to) output stream.
  flowfile = session.write( flowfile, new StreamCallback()
  {
@Override
public void process( InputStream inputStream, OutputStream outputStream ) 
throws IOException
{
  ...

  // processing puts (some more) output on the flowfile's output stream...
  outputStream.write( etc. );
}
  } );

  session.transfer( flowfile, etc. );
}
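
A quick way to confirm this behavior (and to guard it against
regressions) is a TestRunner-based unit test; a hedged sketch, in which
the processor class and relationship names are hypothetical:

   import org.apache.nifi.util.MockFlowFile;
   import org.apache.nifi.util.TestRunner;
   import org.apache.nifi.util.TestRunners;

   final TestRunner runner = TestRunners.newTestRunner( new MyXmlSplittingProcessor() );
   runner.enqueue( "...original XML...".getBytes() );
   runner.run();
   runner.assertAllFlowFilesTransferred( MyXmlSplittingProcessor.REL_SUCCESS, 1 );
   final MockFlowFile out = runner.getFlowFilesForRelationship( MyXmlSplittingProcessor.REL_SUCCESS ).get( 0 );
   out.assertContentEquals( "...expected output..." );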

As I'm fond of saying, NiFi just rocks because there's always a solution!

Russ

On 3/27/20 3:08 PM, Joe Witt wrote:

you should be able to call write as many times as you need.  just keep
using the resulting flowfile reference into the next call.

On Fri, Mar 27, 2020 at 5:06 PM Russell Bateman 
wrote:


Mike,

Many thanks for responding. Do you mean to say that all I have to do is
something like this?

 public void onTrigger( final ProcessContext context, final
 ProcessSession session ) throws ProcessException
 {
    FlowFile flowfile = session.get();
    ...

    // this will be our resulting flowfile...
    AtomicReference< OutputStream > savedOutputStream = new AtomicReference<>();

    /* Do some processing on the in-coming flowfile then close its input
     * stream, but save the output stream for continued use.
     */
    session.write( flowfile, new StreamCallback()
    {
       @Override
       public void process( InputStream inputStream, OutputStream outputStream ) throws IOException
       {
          savedOutputStream.set( outputStream );
          ...

          // processing puts some output on the output stream...
          outputStream.write( etc. );

          inputStream.close();
       }
    } );

    /* Start over doing different processing on the (same/reopened) in-coming
     * flowfile continuing to use the original output stream. It's our
     * responsibility to close the saved output stream; NiFi closes the unused
     * output stream opened, but ignored by us.
     */
    session.write( flowfile, new StreamCallback()
    {
       @Override
       public void process( InputStream inputStream, OutputStream outputStream ) throws IOException
       {
          outputStream = savedOutputStream.get(); // (discard the new output stream)
          ...

          // processing puts (some more) output on the original output stream...
          outputStream.write( etc. );

          outputStream.close();
       }
    } );

    session.transfer( flowfile, etc. );
 }

I'm wondering if this will work to "discard" the new output stream
opened for me (the second time) and replace it with the original one
which was probably closed when the first call to
session.write() finished. What's on these streams is way too big for me
to put them into temporary memory, say, a ByteArrayOutputStream.

Russ

On 3/27/20 10:03 AM, Mike Thomsen wrote:

session.read(FlowFile) just gives you an InputStream. You should be able to
rerun that as many times as you want provided you properly close it.

On Fri, Mar 27, 2020 at 11:25 AM Russell Bateman 
wrote:


In my custom processor, I'm using a SAX parser to process an incoming
flowfile that's in XML. Except that, this particular XML is in essence
two different files and I would like to split, read and process the
first "half" (which starts a couple of lines, i.e. XML elements, into the
file) not using the SAX parser. At the end, I would stream the output of
the first half, then the SAX-processed second half.

So, in short:

   1. process the incoming flowfile for the early content not using SAX,
  but merely copying as-is; at all cost I must avoid "reassembling"
  the first half using my SAX handler (what I'm doing now),
   2. output the first part down the output stream to the resulting

flowfile,

   3. (re)process the incoming flowfile using SAX (and I can just skip
  over the first bit) and spitting the result of this se

Re: Reading the incoming flowfile "twice"

2020-03-27 Thread Russell Bateman

Mike,

Many thanks for responding. Do you mean to say that all I have to do is 
something like this?


   public void onTrigger( final ProcessContext context, final
   ProcessSession session ) throws ProcessException
   {
      FlowFile flowfile = session.get();
      ...

      // this will be our resulting flowfile...
      AtomicReference< OutputStream > savedOutputStream = new AtomicReference<>();

      /* Do some processing on the in-coming flowfile then close its input
       * stream, but save the output stream for continued use.
       */
      session.write( flowfile, new StreamCallback()
      {
         @Override
         public void process( InputStream inputStream, OutputStream outputStream ) throws IOException
         {
            savedOutputStream.set( outputStream );
            ...

            // processing puts some output on the output stream...
            outputStream.write( etc. );

            inputStream.close();
         }
      } );

      /* Start over doing different processing on the (same/reopened) in-coming
       * flowfile continuing to use the original output stream. It's our
       * responsibility to close the saved output stream; NiFi closes the unused
       * output stream opened, but ignored by us.
       */
      session.write( flowfile, new StreamCallback()
      {
         @Override
         public void process( InputStream inputStream, OutputStream outputStream ) throws IOException
         {
            outputStream = savedOutputStream.get(); // (discard the new output stream)
            ...

            // processing puts (some more) output on the original output stream...
            outputStream.write( etc. );

            outputStream.close();
         }
      } );

      session.transfer( flowfile, etc. );
   }

I'm wondering if this will work to "discard" the new output stream 
opened for me (the second time) and replace it with the original one 
which was probably closed when the first call to 
session.write() finished. What's on these streams is way too big for me 
to put them into temporary memory, say, a ByteArrayOutputStream.


Russ

On 3/27/20 10:03 AM, Mike Thomsen wrote:

session.read(FlowFile) just gives you an InputStream. You should be able to
rerun that as many times as you want provided you properly close it.

On Fri, Mar 27, 2020 at 11:25 AM Russell Bateman 
wrote:


In my custom processor, I'm using a SAX parser to process an incoming
flowfile that's in XML. Except that, this particular XML is in essence
two different files and I would like to split, read and process the
first "half" (which starts a couple of lines, i.e. XML elements, into the
file) not using the SAX parser. At the end, I would stream the output of
the first half, then the SAX-processed second half.

So, in short:

  1. process the incoming flowfile for the early content not using SAX,
 but merely copying as-is; at all cost I must avoid "reassembling"
 the first half using my SAX handler (what I'm doing now),
  2. output the first part down the output stream to the resulting flowfile,
  3. (re)process the incoming flowfile using SAX (and I can just skip
 over the first bit) and spitting the result of this second part out
 down the output stream of the resulting flowfile.

I guess this is tantamount to asking how, in Java, I can read an input
stream twice (or one-half plus one times). Maybe it's less a NiFi
developer question and more a Java question. I have looked at it that
way too, but, if one of you knows (particularly NiFi) best practice, I
would very much like to hear about it.

Thanks.






Reading the incoming flowfile "twice"

2020-03-27 Thread Russell Bateman
In my custom processor, I'm using a SAX parser to process an incoming 
flowfile that's in XML. Except that, this particular XML is in essence 
two different files and I would like to split, read and process the 
first "half" (which starts a couple of lines, i.e. XML elements, into the 
file) not using the SAX parser. At the end, I would stream the output of 
the first half, then the SAX-processed second half.


So, in short:

1. process the incoming flowfile for the early content not using SAX,
   but merely copying as-is; at all cost I must avoid "reassembling"
   the first half using my SAX handler (what I'm doing now),
2. output the first part down the output stream to the resulting flowfile,
3. (re)process the incoming flowfile using SAX (and I can just skip
   over the first bit) and spitting the result of this second part out
   down the output stream of the resulting flowfile.

I guess this is tantamount to asking how, in Java, I can read an input 
stream twice (or one-half plus one times). Maybe it's less a NiFi 
developer question and more a Java question. I have looked at it that 
way too, but, if one of you knows (particularly NiFi) best practice, I 
would very much like to hear about it.


Thanks.
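
For later readers: as the replies above establish, each call to
session.read() hands you a fresh InputStream over the same content, so
nothing stops a processor from making two complete passes; roughly:

   // First pass over the content...
   session.read( flowfile, new InputStreamCallback()
   {
      @Override
      public void process( InputStream in ) throws IOException
      {
         // copy the first "half" as-is
      }
   } );

   // ...and a second, independent pass over the very same content.
   session.read( flowfile, new InputStreamCallback()
   {
      @Override
      public void process( InputStream in ) throws IOException
      {
         // SAX-parse, skipping what was already copied
      }
   } );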



Re: Managing custom processor logging statements during JUnit testing

2020-03-10 Thread Russell Bateman
Profuse thanks, Andy. I will get back to confirm this. Something else 
came up in the meantime.


On 3/9/20 1:35 PM, Andy LoPresto wrote:

Russell,

You can put a file called logback-test.xml in the src/test/resources/ directory 
of your custom processor module. This file is the same format as logback.xml 
but will supersede it for unit test execution. You can configure whatever level 
of logging severity and specificity in the same way you would in the production 
file.

Here is an example in nifi-security-utils [1].

[1] 
https://github.com/apache/nifi/blob/master/nifi-commons/nifi-security-utils/src/test/resources/logback-test.xml
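
A minimal logback-test.xml along the lines of [1] might look like this
(the logger name below is an assumption; substitute your custom
processor's package):

   <configuration>
       <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
           <encoder>
               <pattern>%-4r [%t] %-5p %c{1} - %m%n</pattern>
           </encoder>
       </appender>

       <!-- assumption: your custom processor's package -->
       <logger name="com.windofkeltia" level="TRACE"/>

       <root level="INFO">
           <appender-ref ref="CONSOLE"/>
       </root>
   </configuration>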

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69


On Mar 6, 2020, at 12:13 PM, Russell Bateman  wrote:

I'm interested in getting log statements that may occur from my *custom 
processor* code at the INFO, DEBUG and TRACE levels when running unit tests for 
it. This means that I would like to set (programmatically at runtime or by 
configuration) what I am used to setting in ${NIFI_ROOT}/conf/logback.xml, 
but effectively in/during my unit tests (using NiFi's TestRunner). And, 
obviously, I'd like to see those log statements come out to the console.

Thanks for any advice.







Managing custom processor logging statements during JUnit testing

2020-03-06 Thread Russell Bateman
I'm interested in getting log statements that may occur from my *custom 
processor* code at the INFO, DEBUG and TRACE levels when running unit 
tests for it. This means that I would like to set (programmatically at 
runtime or by configuration) what I am used to setting in 
${NIFI_ROOT}/conf/logback.xml, but effectively in/during my unit 
tests (using NiFi's TestRunner). And, obviously, I'd like to see those 
log statements come out to the console.


Thanks for any advice.



Re: How to preclude user-defined properties...

2020-02-25 Thread Russell Bateman
Ah, okay, I wondered. Googling for this, I saw wording that encouraged 
me to think I could "clean up" some of my processors, but I guess I 
can't. That's okay.


Thanks!

On 2/25/20 1:24 PM, Mark Payne wrote:

The UI always allows users to enter user-defined properties. It's certainly 
something that could be improved, I believe.

Thanks
-Mark
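
For what it's worth: the UI cannot be told not to offer the button, but a
processor that doesn't support dynamic properties should still end up
invalid when a user adds one. If memory serves, that is the default
behavior inherited from AbstractConfigurableComponent; a sketch of simply
leaving the default in place:

   // Returning null (the inherited default) means any user-defined
   // property is reported as not a supported property and the processor
   // is marked invalid, which is as close to precluding them as the
   // current UI allows.
   @Override
   protected PropertyDescriptor getSupportedDynamicPropertyDescriptor( final String propertyDescriptorName )
   {
      return null;
   }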


On Feb 25, 2020, at 3:18 PM, Russell Bateman  wrote:

...in a custom processor.

I have a custom processor (that I wrote) and, in on-canvas configuration, the 
dialog allows the user to create custom properties. I have no need of any and 
wish to help my down-streamers by removing the option of creating any 
user-defined properties. How is this accomplished or what have I done unawares 
to enable it in the first place?





How to preclude user-defined properties...

2020-02-25 Thread Russell Bateman

...in a custom processor.

I have a custom processor (that I wrote) and, in on-canvas 
configuration, the dialog allows the user to create custom properties. I 
have no need of any and wish to help my down-streamers by removing the 
option of creating any user-defined properties. How is this accomplished 
or what have I done unawares to enable it in the first place?




Re: additionalDetails.html styles

2020-02-22 Thread Russell Bateman

Matt,

I do not intend to have any other styles than what I can get from NiFi 
if possible. I tried to pull in /component-usage.css/, but failed. I 
will follow your link, however, once I'm back in the office Monday to 
see if I just failed to do something right.


Thanks,
Russ

On 2/21/20 7:15 PM, Matt Gilman wrote:

Hi Russell,

Do your additional details have different styles than others bundled by
NiFi? Have you tried referencing the component-usage.css stylesheet? There
is an example here [1].

Let me know if that helps.

[1]
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-hbase-bundle/nifi-hbase-processors/src/main/resources/docs/org.apache.nifi.hbase.PutHBaseCell/additionalDetails.html
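
For reference, the stylesheet link in that example is of this shape (the
relative path climbs out of the generated docs directory, so it is the
same for every component):

   <link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css" />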

On Thu, Feb 20, 2020 at 12:20 PM Russell Bateman 
wrote:


I would like to get what I'm offering in /additionalDetails.html/ for my
custom processor to use the same font style used down the left-hand
column listing processors (under /NiFi Documentation/). I'm not an idiot
when it comes to HTML/CSS, however, I have been unable in experimenting
to see how I can easily grab NiFi's existing style sheets into my HTML.

Has someone done that and can give me an easy, working version of

 

to put into my HTML header?

Profuse thanks!





additionalDetails.html styles

2020-02-20 Thread Russell Bateman
I would like to get what I'm offering in /additionalDetails.html/ for my 
custom processor to use the same font style used down the left-hand 
column listing processors (under /NiFi Documentation/). I'm not an idiot 
when it comes to HTML/CSS, however, I have been unable in experimenting 
to see how I can easily grab NiFi's existing style sheets into my HTML.


Has someone done that and can give me an easy, working version of

   

to put into my HTML header?

Profuse thanks!


Re: Need to have my email added

2020-02-20 Thread Russell Bateman

Eric,

If I understand what you're asking, you just need to sign up properly. 
Check out community information at

https://nifi.apache.org/mailing_lists.html .

Best regards.

On 2/20/20 9:23 AM, Butts, Eric wrote:

I need to have my email added to forum discussion
buttse...@hotmail.com





Re: View Usage brings up (default) Apache NiFi Overview documentation

2020-01-29 Thread Russell Bateman

Andy,

Profuse thanks for the time spent on this. I did not invent this 
simplified project structure, but stumbled upon it and tried it out 
adding it to these notes when I discovered that it worked (trivially at 
least since the processor works, but as we know the documentation bit 
doesn't work). I should have retained a URL to it, but did not. And, I 
would not expect of you that you investigate this further. Indeed, I am 
sorry for what is tantamount to wasting your time. I will return my 
notes to describing the original, fully working solution with perhaps a 
sentence warning against a simplified solution and why.


Best regards,
Russ

On 1/28/20 4:23 PM, Andy LoPresto wrote:

Russ,

I don’t know where the idea to do a “single-module project” came from. The 
recommended (read: working) method is to have multiple Maven modules — a parent 
(usually called “nifi-myprocessor-bundle”), a processor 
(“nifi-myprocessor-processor”), and a NAR builder (“nifi-myprocessor-nar”). 
This is how the Maven NIFi archetype structures a generated custom processor 
project [1]. Similarly, my presentation on building custom processors addresses 
this structure the same way [2], as does the official Developer Guide [3].

I briefly reviewed the contents of your notes, and they are nicely formatted. 
It appears the advantage of the “single-module” approach is a slightly cleaner 
presentation in your IDE, while the downside is that the processor annotations 
are not evaluated completely (and potentially other problems; I have not 
investigated setting up this non-supported structure further). I would guess 
that the problem is in a single-module project, the packaging type is NAR, so 
the bundling process of building the NAR from a compiled and built JAR doesn’t 
fully occur. Your notes don’t show the results of building the single module, 
and the contents of the target/ directory to be specific.

I don’t mean to sound glib, but I don’t have the bandwidth at this time to 
pursue the single-module structure further, as I know of no one recommending it 
and have no idea why it would work other than by accident. We try to avoid 
recommending additional complexity unless it is required, and to the best of my 
knowledge, the “3 module” approach is the minimum required for successful (and 
complete) custom processor building and deployment.

Hope this helps.

[1] https://cwiki.apache.org/confluence/display/NIFI/Maven+Projects+for+Extensions
[2] https://github.com/alopresto/slides/tree/master/dws_barcelona_2019
[3] https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#nars

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69


On Jan 28, 2020, at 1:49 PM, Russell Bateman  wrote:

Andy,

I have formulated, on the basis of posts I've seen over the years, that there 
are two ways to build a NAR containing one's own, custom processors, what I 
term a /single-module product/ and a /multi-module product/. mostly based on 
whether I need multiple modules in IntelliJ IDEA or only one. I expose this in 
a page of notes that I have maintained for some time and I just looked at the 
single-module version, the one with the simple NAR build and discovered (I had 
probably just never noticed before) that the simple one suffers from the same 
problem as I suffer from now, to wit, that the simpler project structure and 
/pom.xml/ files yields a processor whose /@CapabilityDescription/ is ignored.

My original projects used the more complicated project build, but I saw someone 
propose the simpler one, tried it, it worked, and I have been trying to adopt 
it for my new project where I plan for only one module, i.e.: the project 
itself, and therefore dispence with separate /nar/ and /processor/ 
subdirectories.

This is exposed at https://www.javahotchocolate.com/notes/nifi-project.html.

If you have time and choose to look at it, and the /single-module project/ 
leaps out at you with the answer (as to why it can't be done that way or what, 
of this way, prevents the documentation from working), would you point that 
out? I have rebuilt using the more complex build process with /nar/ 
subdirectory and three instead of one /pom.xml/ files and I now get the 
documentation to work. This is the answer, but why can't I build using one 
/pom.xml/ and the simpler project structure?

Russ

On 1/28/20 10:33 AM, Russell Bateman wrote:

Andy,

The processor is really called NoOp (not CustomProcessor); it's a straight 
pass-through that doesn't copy, let alone modify, the flowfile. It's used for 
debugging and crafting flows (and is always ultimately discarded or replaced 
with a real processor doing real things).

In my new NAR, I

Re: View Usage brings up (default) Apache NiFi Overview documentation

2020-01-28 Thread Russell Bateman
Thanks, Andy, no offense taken. I tried this in Chrome and Firefox, but 
am still getting that default page. My sample flow also contains an 
instance of GetFile feeding sample files into CustomProcessor. When I 
look at GetFile's usage, I see /Description: Creates FlowFiles from files 
in a directory/, etc.


I've written a few custom processors over the years, but this is the 
first time I've done something inadvertently so stupid as not to stumble 
later upon what would make usage fail for it. The only thing different 
is that I have it in a totally new NAR by itself so far and am using 1.10.0.


This is bizarre to say the least.

Russ

On 1/27/20 5:56 PM, Andy LoPresto wrote:

Hi Russ,

I hate to suggest something so simple, but have you cleared the cache of your 
browser? Sometimes this can be a result of stale caching.

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69


On Jan 27, 2020, at 2:30 PM, Russell Bateman  wrote:

Addendum: the @Tags I add are not showing up in the tag field at the left, but 
when I scroll down in the processor list and see my new custom processor, I do 
see these tags listed next to it, to its right.

On 1/27/20 3:27 PM, Russell Bateman wrote:

Perhaps I made my question less clear than I could have. Consider that I have 
this annotation on the class of processor:

@CapabilityDescription( "Custom processor usage statement..." )

and, because my processor is listed in the file

/src/main/resources/META-INF/services/org.apache.nifi.processor.Processor/

as

/com.windofkeltia.processor.CustomProcessor/

I am able to create a perfectly running and working instance of this processor 
on the canvas through the web browser UI of NiFi 1.10.0.

However, when I right-click the processor instance and choose View usage, I do 
not see documentation with

"Custom processor usage statement..."

but, instead, I see what I assume is default page content saying principally, 
"Apache NiFi Overview". I am looking for help/things to check that are keeping 
NiFi from displaying documentation for my capability description, properties, 
relationships, etc. Also, I am using the @Tags annotation to add a couple of quick-find 
tags: neither of these are showing up when I go to create an instance of the processor. 
This is also likely a clue to what I'm doing wrong.

Thanks,
Russ


On 1/24/20 3:30 PM, Russell Bateman wrote:

My custom processor's usage, which should come from the @CapabilityDescription 
annotation of the class containing the onTrigger() method, nevertheless is 
nothing more than the Apache NiFi Overview. I am able to place an instance on 
the canvas via the processor Component Tool (so, the custom processor is 
there--I just can't get my usage statement):

@SupportsBatching
@SideEffectFree
@InputRequirement( InputRequirement.Requirement.INPUT_REQUIRED )
@CapabilityDescription( "Custom processor usage statement..." )
public class CustomProcessor extends AbstractSessionFactoryProcessor
{
  @Override
  public void onTrigger( final ProcessContext context, final
ProcessSessionFactory sessionFactory )
  throws ProcessException
  {
...
  }

  etc.
}

What thing missing should I be looking for to fix this?

I'm sure it's some stupid oversight of something I have not failed to provide 
(successfully documented) custom processors I've written. I have compared this 
processor to those others; I've structured it identically, etc. and I have 
pored through 
https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#documenting-a-component
 a number of times. Usage just doesn't work.

Thanks.








Re: View Usage brings up (default) Apache NiFi Overview documentation

2020-01-27 Thread Russell Bateman
Addendum: the @Tags I add are not showing up in the tag field at the 
left, but when I scroll down in the processor list and see my new custom 
processor, I do see these tags listed next to it, to its right.


On 1/27/20 3:27 PM, Russell Bateman wrote:
Perhaps I made my question less clear than I could have. Consider that 
I have this annotation on the class of processor:


@CapabilityDescription( "Custom processor usage statement..." )

and, because my processor is listed in the file

/src/main/resources/META-INF/services/org.apache.nifi.processor.Processor/

as

/com.windofkeltia.processor.CustomProcessor/

I am able to create a perfectly running and working instance of this 
processor on the canvas through the web browser UI of NiFi 1.10.0.


However, when I right-click the processor instance and choose View 
usage, I do not see documentation with


"Custom processor usage statement..."

but, instead, I see what I assume is default page content saying 
principally, "Apache NiFi Overview". I am looking for help/things to 
check that are keeping NiFi from displaying documentation for my 
capability description, properties, relationships, etc. Also, I am 
using the @Tags annotation to add a couple of quick-find tags: neither 
of these are showing up when I go to create an instance of the 
processor. This is also likely a clue to what I'm doing wrong.


Thanks,
Russ


On 1/24/20 3:30 PM, Russell Bateman wrote:
My custom processor's usage, which should come from the 
@CapabilityDescription annotation of the class containing the 
onTrigger() method, nevertheless is nothing more than the Apache NiFi 
Overview. I am able to place an instance on the canvas via the 
processor Component Tool (so, the custom processor is there--I just 
can't get my usage statement):


@SupportsBatching
@SideEffectFree
@InputRequirement( InputRequirement.Requirement.INPUT_REQUIRED )
@CapabilityDescription( "Custom processor usage statement..." )
public class CustomProcessor extends AbstractSessionFactoryProcessor
{
  @Override
  public void onTrigger( final ProcessContext context, final
ProcessSessionFactory sessionFactory )
  throws ProcessException
  {
    ...
  }

  etc.
}

What thing missing should I be looking for to fix this?

I'm sure it's some stupid oversight of something I have not failed to provide 
in other (successfully documented) custom processors I've written. I 
have compared this processor to those others; I've structured it 
identically, etc. and I have pored through 
https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#documenting-a-component 
a number of times. Usage just doesn't work.


Thanks.








Re: View Usage brings up (default) Apache NiFi Overview documentation

2020-01-27 Thread Russell Bateman
Perhaps I made my question less clear than I could have. Consider that I 
have this annotation on the class of processor:


   @CapabilityDescription( "Custom processor usage statement..." )

and, because my processor is listed in the file

   /src/main/resources/META-INF/services/org.apache.nifi.processor.Processor/

as

   /com.windofkeltia.processor.CustomProcessor/

I am able to create a perfectly running and working instance of this 
processor on the canvas through the web browser UI of NiFi 1.10.0.


However, when I right-click the processor instance and choose View 
usage, I do not see documentation with


   "Custom processor usage statement..."

but, instead, I see what I assume is default page content saying 
principally, "Apache NiFi Overview". I am looking for help/things to 
check that are keeping NiFi from displaying documentation for my 
capability description, properties, relationships, etc. Also, I am using 
the @Tags annotation to add a couple of quick-find tags: neither of 
these are showing up when I go to create an instance of the processor. 
This is also likely a clue to what I'm doing wrong.


Thanks,
Russ


On 1/24/20 3:30 PM, Russell Bateman wrote:
My custom processor's usage, which should come from the 
@CapabilityDescription annotation of the class containing the 
onTrigger() method, nevertheless is nothing more than the Apache NiFi 
Overview. I am able to place an instance on the canvas via the 
processor Component Tool (so, the custom processor is there--I just 
can't get my usage statement):


@SupportsBatching
@SideEffectFree
@InputRequirement( InputRequirement.Requirement.INPUT_REQUIRED )
@CapabilityDescription( "Custom processor usage statement..." )
public class CustomProcessor extends AbstractSessionFactoryProcessor
{
  @Override
  public void onTrigger( final ProcessContext context, final
ProcessSessionFactory sessionFactory )
  throws ProcessException
  {
    ...
  }

  etc.
}

What thing missing should I be looking for to fix this?

I'm sure it's some stupid oversight of something I have not failed to provide 
in other (successfully documented) custom processors I've written. I 
have compared this processor to those others; I've structured it 
identically, etc. and I have pored through 
https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#documenting-a-component 
a number of times. Usage just doesn't work.


Thanks.






View Usage brings up (default) Apache NiFi Overview documentation

2020-01-24 Thread Russell Bateman
My custom processor's usage, which should come from the 
@CapabilityDescription annotation of the class containing the 
onTrigger() method, nevertheless is nothing more than the Apache NiFi 
Overview. I am able to place an instance on the canvas via the processor 
Component Tool (so, the custom processor is there--I just can't get my 
usage statement):


   @SupportsBatching
   @SideEffectFree
   @InputRequirement( InputRequirement.Requirement.INPUT_REQUIRED )
   @CapabilityDescription( "Custom processor usage statement..." )
   public class CustomProcessor extends AbstractSessionFactoryProcessor
   {
  @Override
  public void onTrigger( final ProcessContext context, final
   ProcessSessionFactory sessionFactory )
  throws ProcessException
  {
    ...
  }

  etc.
   }

What thing missing should I be looking for to fix this?

I'm sure it's some stupid oversight of something I have not failed to provide in 
other (successfully documented) custom processors I've written. I have 
compared this processor to those others; I've structured it identically, 
etc. and I have pored through 
https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#documenting-a-component 
a number of times. Usage just doesn't work.


Thanks.




Re: Unable to update processor...

2020-01-14 Thread Russell Bateman
Thanks for the feedback, sorry not to have got back until now to 
confirm. This was happening for me privately (my development host, no 
cluster). I wiped NiFi and reinstalled (a little in frustration, more 
just to make sure things were clean). I think I must have been updating 
the wrong /custom-lib/ subdirectory or something.


On 1/13/20 2:44 PM, Russell Bateman wrote:
I have a custom processor that I have updated (in potentially breaking 
ways and I know it), but I cannot seem to make NiFi 1.10.0 open its 
mouth and swallow it. I have copied my NAR file to 
/1.10.0/custom-lib/; there is no other version (in the /lib/ 
subdirectory for example). When I bounce NiFi I see nothing of the 
changes (properties wording, etc.). I've been away from NiFi for a 
while; maybe I'm forgetting something. I found some artifacts in 
/1.10.0/work/nar/extensions/ and /1.10.0/work/docs/ which I removed 
before bouncing NiFi, but when it comes back up, all that stuff 
appears to have been brought back. What am I missing? Any suggestions 
of where to look would be welcome.


Thanks,
Russ




Unable to update processor...

2020-01-13 Thread Russell Bateman
I have a custom processor that I have updated (in potentially breaking 
ways and I know it), but I cannot seem to make NiFi 1.10.0 open its 
mouth and swallow it. I have copied my NAR file to /1.10.0/custom-lib/; 
there is no other version (in the /lib/ subdirectory for example). When 
I bounce NiFi I see nothing of the changes (properties wording, etc.). 
I've been away from NiFi for a while; maybe I'm forgetting something. I 
found some artifacts in /1.10.0/work/nar/extensions/ and 
/1.10.0/work/docs/ which I removed before bouncing NiFi, but when it 
comes back up, all that stuff appears to have been brought back. What am 
I missing? Any suggestions of where to look would be welcome.


Thanks,
Russ


NiFi 1.10.0 JRE clarification...

2019-11-06 Thread Russell Bateman

Looking at the release notes, I see:

 * Apache NiFi can now be built on either Java 8 or Java 11! When built
   on Java 8 it can run on Java 8 or Java 11.

The implications of this statement make me ask:

 * At https://nifi.apache.org/download.html, how is it built? JDK 11?
 * Can I run it using JRE 8? Or, must I download source and build it
   using JDK 8 myself?

I apologize; I could try it myself, but I don't have a platform running 
JRE 8 easy to hand, so I thought I'd ask if anyone knows. Likely, 
anything I do will ultimately need to run on (CentOS) JRE 8, so I'm not 
asking idly.


Thanks,

Russ




Re: [DISCUSS] Time based release cycles

2019-11-05 Thread Russell Bateman

Kafka is first-rate, rock-star technology, just as is NiFi.

It would be nice to find something from Kafka elaborating on how this 
regular and accelerated release cadence is working out for them, how 
much more work it's been, what problems they've experienced, etc.


I show their releases over the last couple of years as below[1]. The 
cadence appears to be settling into the 4-month cycle proposed. It's 
possible to discern a maintenance schedule. It doesn't exactly match 
NiFi's 0.x and 1.x efforts (which were simultaneous for some time too), 
but it's clear they've faced similar complexity (maybe a little more 
though for a shorter time). And, of course, there's no meaningful way to 
compare the effort going into and features implemented in Kafka by 
comparison with NiFi.


2019
2.3.1    24 October
2.3.0    25 June
2.2.1     1 June
2.2.0    22 March
2.1.1    15 February

2018
2.1.0    20 November
2.0.1     9 November
2.0.0    30 July
1.1.1    19 July
1.0.2     8 July
0.11.0.3  2 July
0.10.2.2  2 July
1.1.0    28 March
1.0.1     5 March

2017
1.0.0 1 November
0.11.0.1 13 September
0.11.0.0 28 June
.
.
.

[1] https://kafka.apache.org/downloads

On 11/5/19 8:02 AM, Pierre Villard wrote:

Hi NiFi dev community,

We just released NiFi 1.10 and that's an amazing release with a LOT of
great new features. Congrats to everyone!

I wanted to take this opportunity to bring a discussion around how often
we're doing releases.

We released 1.10.0 yesterday and we released 1.9.0 in February, that's
around 8 months between the two releases. And if we take 1.9.2, released
early April, that's about 7 months.

I acknowledge that doing releases is really up to the committers and anyone
can take the lead to perform this process, however, we often have people
asking (on the mailing lists or somewhere else) about when will the next
release be. I'm wondering if it would make sense to think about something a
bit more "planned" by doing time based releases.

The Apache Kafka community wrote a nice summary of the pros/cons about such
an approach [1] and it definitely adds more work to the committers with
more frequent releases. I do, however, think that it'd ease the adoption of
NiFi, its deployment and the dynamism in PR/code review.

I'm just throwing the idea here and I'm genuinely curious about what you
think about this approach.

[1]
https://cwiki.apache.org/confluence/display/KAFKA/Time+Based+Release+Plan

Thanks,
Pierre





Re: Azure Event Hub Processors Upgrade Problem

2019-08-22 Thread Russell Bateman
I can probably help, Sunny. I made some notes a few years ago on doing 
this. How this is done on later versions hasn't changed I don't think. 
Check out http://www.javahotchocolate.com/notes/nifi.html#20160323.
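
On the attach-a-debugger part of the question: if I remember right,
conf/bootstrap.conf ships with a commented-out line much like the one
below. Un-comment it, restart NiFi, and point an IntelliJ "Remote" run
configuration at port 8000:

   java.arg.debug=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8000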


Russ Bateman

On 8/22/19 11:12 AM, Sunny Zhang wrote:

Hi there,



I’m a dev from Microsoft Azure Event Hub team. We noticed that the Azure
Event Hub processors on Nifi is using an very old version and may cause
usability problems to the users. We want to help to upgrade it to the
newest version!



However, I met some problem trying to debug with the project with IntelliJ
IDE. I can build-run the code with command lines, read log when exception
happens, and I can successfully opened the project in IntelliJ and edit,
but I’m not sure how to run and debug the project in IntelliJ. More
specifically, we wanna attach the running Nifi (http://localhost:8080/nifi/)
with the IDE so that we can have breakpoints, check runtime value without
having to print variables to log (but we didn’t find a way yet). We wanna
ensure the quality, so we want to do enough testing etc, so we hope to work
on debugging more effectively.



Could you help to explain how to work on the project with IntelliJ, or is
there any good ways?



Best,

Sunny





Re: Java 11 build support is live!

2019-08-15 Thread Russell Bateman
Yes, at the risk of adding nothing more than a "me too," I nevertheless 
wish to add my thanks. Good job!


On 8/15/19 12:04 PM, Jeff wrote:

Apache NiFi Developers,

Apache NiFi can now be built with Java 11!  Thanks to those that helped
with the review and committing of several PRs to get us to this point.

For all contributions going forward, developers, reviewers, and committers
need to make sure that PRs are verified by building with and running on
Java 8 and 11.

To assist in this process, the Linux-based Travis CI build has been updated
to perform an en_US locale-based build of NiFi using AdoptOpenJDK 11.0.4 in
addition to the three locale-based builds (en_US, fr_FR, ja_JP) on OpenJDK
8.

Apache NiFi still has a minimum requirement of Java 8 and as long as that
requirement exists the released convenience binaries will be built with
Java 8.

When building locally with Java 11, on all operating systems, a minimum JDK
version of 11.0.3 is required for a successful build.  If you are building
on OSX and use AdoptOpenJDK 11.0.4, there's an issue [4] [5] with native
libraries.  If you run into this issue, please downgrade your AdoptOpenJDK
11 installation to 11.0.3.  A updated release of AdoptOpenJDK 11.0.4 that
fixes the native library issues is forthcoming.  There is a comment [6] on
AdoptOpenJDK issue 1211 that describes the downgrade process.

Work still remains regarding Java 11.  Some of the components that NiFi
uses are not completely compatible with Java 11, and JIRAs will be created
or linked to the Java 11 parent JIRA [1] to track the remaining changes.

Several PRs not explicitly related to Java 11 were created and merged to
master ahead of NIFI-5176 [2] (PR 3404 [3]), and are listed as blockers on
NIFI-5176.  All blockers have been resolved, as well as NIFI-5176.

If you encounter an issue with the Java 11 build, or running the resulting
build on Java 11, please check NIFI-5174 to see if a JIRA for the issue has
already been filed, and create one if needed.

[1] https://issues.apache.org/jira/browse/NIFI-5174
[2] https://issues.apache.org/jira/browse/NIFI-5176
[3] https://github.com/apache/nifi/pull/3404
[4] https://github.com/AdoptOpenJDK/openjdk-build/issues/1206
[5] https://github.com/AdoptOpenJDK/openjdk-build/issues/1211
[6]
https://github.com/AdoptOpenJDK/openjdk-build/issues/1211#issuecomment-521392147





StateManager race condition potential

2019-08-09 Thread Russell Bateman
I'm assuming that the StateManager protects itself against race 
conditions for the consuming (custom) processor, but I'd like 
confirmation of that. Let's say we do something simple: we get an integer 
out of state, add one to it to get the next (piece of work to do), then 
immediately write that bumped value back for the next thread to get. In 
the time it took us to read the value, bump it by 1, and write it back 
out (I'm assuming Scope.LOCAL), I don't see that the StateManager is 
prevented from handing out that same value to another instance or task 
of my processor.


How does the StateManager Scope affect this? (By whether the instance of 
state is per host or per cluster?)

How do processor behavior annotations affect this?
How does processor scheduling configuration (concurrent task count) 
affect this?


Thanks for any comments.
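
For reference, a naive get-then-set on the StateManager is indeed racy, but
the API exposes an atomic replace() intended for exactly this
read-modify-write pattern. A minimal sketch, assuming the counter is kept
under the (hypothetical) key "counter" in local scope:

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.nifi.components.state.Scope;
    import org.apache.nifi.components.state.StateManager;
    import org.apache.nifi.components.state.StateMap;
    import org.apache.nifi.processor.ProcessContext;

    /* Returns the next counter value, retrying until the compare-and-swap wins. */
    private int nextCounter( final ProcessContext context ) throws IOException
    {
        final StateManager stateManager = context.getStateManager();
        while ( true )
        {
            final StateMap oldState = stateManager.getState( Scope.LOCAL );
            final String   current  = oldState.get( "counter" );
            final int      next     = ( current == null ) ? 0 : Integer.parseInt( current ) + 1;

            final Map<String, String> newState = new HashMap<>( oldState.toMap() );
            newState.put( "counter", String.valueOf( next ) );

            // replace() succeeds only if nothing else has updated the state
            // since our getState() call; otherwise loop and re-read.
            if ( stateManager.replace( oldState, newState, Scope.LOCAL ) )
                return next;
        }
    }

Scope only decides where the state lives (LOCAL is per node; CLUSTER is
shared across the cluster); it doesn't remove the need for the
compare-and-swap. Annotating the processor with @TriggerSerially, or
configuring a single concurrent task, sidesteps the race on one node but
not across a cluster.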


Re: [board report] Apache NiFi - July 2019

2019-07-10 Thread Russell Bateman
I mulled over whether it's appropriate to ask within this context. I 
guess I'll ask; slap me if I have chosen badly, but where does NiFi stand 
with respect to "modern" JDK versions? Is that too much detail for the 
audience of this periodic report?


Thanks.

On 7/10/19 11:15 AM, Joe Witt wrote:

Team,

I was running late so submitted the report already.  Here is what I sent to
the board for Apache NiFi July 2019 report.  Great work and great progress
all!



## Description:
  - Apache NiFi is an easy to use, powerful, and reliable system to process
    and distribute data.
  - Apache NiFi MiNiFi is an edge data collection agent built to seamlessly
    integrate with and leverage the command and control of NiFi. There are
    both Java and C++ implementations.
  - Apache NiFi Registry is a centralized registry for key configuration
    items including flow versions, assets, and extensions for Apache NiFi
    and Apache MiNiFi.
  - Apache NiFi Nar Maven Plugin is a release artifact used for supporting
    the NiFi classloader isolation model.
  - Apache NiFi Flow Design System is a theme-able set of high quality UI
    components and utilities for use across the various Apache NiFi web
    applications in order to provide a more consistent user experience.

## Issues:
  - There are no issues requiring board attention at this time.

## Activity:
  - Released Apache NiFi Registry 0.4.0, which allows storage of extensions.
    This is a critical step towards breaking apart today's monolithic
    release, resulting in smaller binaries on Apache infra and mirrors, and
    it will also bring a better user experience for updating live, running
    NiFi clusters and for operations in container-based environments.
  - The community is working through the release preparation and voting
    process for Apache NiFi MiNiFi CPP 0.6.1.
  - Apache NiFi 1.10.0 is progressing nicely with nearly 200 JIRAs already
    included. It brings powerful features such as sourcing extensions from
    the latest NiFi Registry at runtime, a far better model for
    parameterized, version-controlled flows, Java 11 compatibility, and
    much more.

## Health report:
  - Health of the community remains strong, with many active release lines,
    feature development, and an active user and developer base including
    both new and continuing participants.
  - We see considerable commentary on Apache NiFi in the form of meetups,
    conferences, training, and talks, including from folks at Google,
    Cloudera, and others. A recent tweet from a Ford Motor Company employee
    says it best:
    "Love, love, love @apachenifi   Using MiNiFi opens up IOT innovation in
    amazing ways.  Every IOT device manufacturer, including automotive,
    should consider opening up API's to let developers do their thing!
    Bringing value to your product! "

## PMC changes:

  - Currently 30 PMC members.
  - Peter Wicks was added to the PMC on Wed May 29 2019

## Committer base changes:

  - Currently 43 committers.
  - Arpad Boda was added as a committer on Thu May 23 2019

## Releases:

  - Apache NiFi Registry 0.4.0 was released on Mon May 20 2019

## Mailing list activity:

  - Activity on the mailing lists remains high, with a mixture of new users,
    contributors, and deeper, more experienced users and contributors
    sparking discussion and questions and filing bugs or feature requests.

  - We do see a significant drop in users-list usage while dev and issues
    remain busy and even growing.  Meanwhile we see strong growth in our
    Slack channel, and it is very user-centric.

  - Slack Channel Usage: apachenifi.slack.com
   - 394 users currently in the room.  This has grown weekly.

  - us...@nifi.apache.org:
 - 693 subscribers (up 18 in the last 3 months):
 - 447 emails sent to list (800 in previous quarter)

  - dev@nifi.apache.org:
 - 443 subscribers (down -1 in the last 3 months):
 - 357 emails sent to list (417 in previous quarter)

  - iss...@nifi.apache.org:
 - 56 subscribers (up 0 in the last 3 months):
 - 5371 emails sent to list (4140 in previous quarter)


## JIRA activity:

  - 274 JIRA tickets created in the last 3 months
  - 188 JIRA tickets closed/resolved in the last 3 months



Thanks
Joe





Re: nifi

2019-07-05 Thread Russell Bateman
It's short for "Niagara Files," in the sense that you're meant to think of 
huge flows of files, like water over Niagara Falls.


On 7/5/19 6:00 AM, Arpad Boda wrote:

"nye fye" (nī fī) is the preferred pronunciation.

https://nifi.apache.org/faq.html



On Fri, Jul 5, 2019 at 1:43 PM 鹿骏 <405453...@qq.com> wrote:


Sorry to trouble you.
I'm curious about how to pronounce "nifi"?




NAR extensions warning during build

2019-05-30 Thread Russell Bateman

In /pom.xml/, I have:


  *org.apache.nifi*
  *nifi-nar-maven-plugin*
  *3.5.1*
  *true*



I get this, which isn't fatal, when my NAR is built. I would like

   a) to know to what "extension(s)" this refers (related to the
   plug-in configuration in /pom.xml/?),
   b) where to put extension documentation such that it will be picked
   up or
   c) what I can do to eliminate this warning in favor of there not
   being any extensions or extensions documentation, but which
   satisfies nifi-nar-maven-plugin and keeps it quiet about it.

I didn't get this using some previous version of the plug-in.

Thanks.

[INFO] Copying commons-httpclient-3.1.jar to 
/home/russ/sandboxes/nifi-pipeline.master/imat-pipeline-nar/target/classes/META-INF/bundled-dependencies/commons-httpclient-3.1.jar

[INFO] etc
[INFO] Generating documentation for NiFi extensions in the NAR...
[INFO] Found NAR dependency of 
org.apache.nifi:nifi-dbcp-service-nar:nar:1.9.2:compile
[INFO] Found NAR dependency of 
org.apache.nifi:nifi-standard-services-api-nar:nar:1.9.2:compile
[INFO] Found NAR dependency of 
org.apache.nifi:nifi-jetty-bundle:nar:1.9.2:compile

[INFO] Found a dependency on version 1.9.2 of NiFi API
[WARNING] Could not generate extensions' documentation
org.apache.maven.plugin.MojoExecutionException: Failed to create 
Extension Documentation

    at org.apache.nifi.NarMojo.generateDocumentation (NarMojo.java:596)
    at org.apache.nifi.NarMojo.execute (NarMojo.java:499)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo 
(DefaultBuildPluginManager.java:137)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:210)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:148)
    at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
(LifecycleModuleBuilder.java:117)
    at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
(LifecycleModuleBuilder.java:81)
    at 
org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build 
(SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute 
(LifecycleStarter.java:128)

    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:288)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke 
(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke 
(DelegatingMethodAccessorImpl.java:43)

    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced 
(Launcher.java:289)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch 
(Launcher.java:229)
    at 
org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode 
(Launcher.java:415)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main 
(Launcher.java:356)
Caused by: java.lang.NoSuchMethodException: 
org.apache.nifi.documentation.xml.XmlDocumentationWriter.initialize(org.apache.nifi.components.ConfigurableComponent)

    at java.lang.Class.getMethod (Class.java:1786)
    at org.apache.nifi.NarMojo.writeDocumentation (NarMojo.java:631)
    at org.apache.nifi.NarMojo.writeDocumentation (NarMojo.java:605)
    at org.apache.nifi.NarMojo.generateDocumentation (NarMojo.java:577)
    at org.apache.nifi.NarMojo.execute (NarMojo.java:499)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo 
(DefaultBuildPluginManager.java:137)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:210)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:148)
    at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
(LifecycleModuleBuilder.java:117)
    at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
(LifecycleModuleBuilder.java:81)
    at 
org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build 
(SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute 
(LifecycleStarter.java:128)

    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
    at 

Latest NiFi customs?

2019-04-17 Thread Russell Bateman
After a couple of years' absence from NiFi (prior to Java 9), I find 
myself just now back in a developer role at a company that uses NiFi. 
(This is a pleasant thought, I might add, as I believe that NiFi rocks.) 
I have inherited an existing implementation that's sorely aged and, 
though I've googled mostly in vain on what I'm asking, I would like to 
dot the i's and cross the t's.


*What version of NiFi?*
How far forward, toward NiFi 1.9, should I push my company? I see that 
the Docker container is at 1.8, if that's any reference. I'm tempted 
right now to move to 1.8 immediately.


*What about Java?*
What is the state of Java in NiFi? It appears that it's still back on 
Java 8. I develop using IntelliJ IDEA. While I constrain the level of 
language features to 1.8, it isn't realistic to contemplate developing 
in IDEA without a pretty modern JDK (I use Java 11 today because of 
LTS). I assume, nevertheless, that if I'm careful not to permit, by a 
setting in IDEA, the use of language constructs in my custom processors 
that exceed 1.8, I should be okay, right? Or am I missing something; are 
there other considerations to watch out for?
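
For what it's worth, it also helps to pin the language level in the
bundle's Maven build, not only in IDEA, so the build fails fast if anything
newer than 1.8 creeps in. A minimal sketch using the standard Maven
compiler properties (nothing NiFi-specific assumed):

    <properties>
      <!-- compile for the Java 8 language level and bytecode target,
           regardless of which JDK actually runs the build -->
      <maven.compiler.source>1.8</maven.compiler.source>
      <maven.compiler.target>1.8</maven.compiler.target>
    </properties>

If the build itself runs on JDK 9 or later, setting
<maven.compiler.release> to 8 is the stricter choice, since it also flags
uses of post-8 library APIs.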


Thanks for any and all comments, setting me straight, etc.


Re: Calling getLogger() from @OnScheduled, @OnStopped, etc.

2018-04-12 Thread Russell Bateman
Yes, this is what I assumed, but I was hoping someone had developed a 
technique for reaching the log in some (twisted) way that I hadn't 
figured out yet. It would really help me visualize the order in which my 
code is called and make me feel better about what I've written.


Thanks,
Russ

On 04/12/2018 03:41 PM, Bryan Bende wrote:

The example processor you showed won’t work because you are calling
getLogger() inline as part of the variable declaration.

The logger is given to the processor in an init method which hasn’t been
called yet at that point, so that is assigning null to the variable.

Generally you should just call getLogger() whenever it is needed, or you
could assign it to a variable from inside OnScheduled.
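
For example, a minimal sketch of that second option, using the names from
the example quoted below (the field initializer goes away entirely):

    import org.apache.nifi.annotation.lifecycle.OnScheduled;
    import org.apache.nifi.logging.ComponentLog;
    import org.apache.nifi.processor.ProcessContext;

    private volatile ComponentLog logger;   // no initializer: init() hasn't run yet

    @OnScheduled
    public void processProperties( final ProcessContext context )
    {
        logger = getLogger();   // safe here: the framework has called init() by now
        logger.trace( "[PROFILE] processProperties()" );
    }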

On Thu, Apr 12, 2018 at 5:28 PM Russell Bateman <r...@windofkeltia.com>
wrote:


Thanks for responding, Andy.

I am able to use it, like you, in onTrigger(). Where I haven't been able
to use it is from annotated methods (in the sense that onTrigger() isn't
annotated except by @Override, which is not relevant to this question).
Imagine:

public class Fun extends AbstractProcessor
{
    private ComponentLog logger = getLogger();

    @Override
    public void onTrigger( final ProcessContext context, final
            ProcessSession session ) throws ProcessException
    {
        logger.trace( "[PROFILE] onTrigger()" );  /* A */
        ...
    }

    @OnScheduled
    public void processProperties( final ProcessContext context )
    {
        logger.trace( "[PROFILE] processProperties()" );  /* B */
        ...
    }

    @OnStopped
    public void dropEverything()
    {
        logger.trace( "[PROFILE] dropEverything()" );  /* C */
        ...
    }
    ...
}


Now, imagine suitable test code, FunTest.test(), which sets up

 runner = TestRunners.newTestRunner( processor = new Fun() );

etc., then

 runner.run( 1 );

Above, instance A works fine (it's the one you illustrated in footnote
[1]). Instances B and C cause the error:

 java.lang.AssertionError: Could not invoke methods annotated with
 @OnScheduled (or @OnStopped) annotation due to:
 java.lang.reflect.InvocationTargetException

Russ

On 04/12/2018 02:52 PM, Andy LoPresto wrote:

Hi Russ,

Are you saying the code that breaks is having “getLogger()” executed
inside one of the processor lifecycle methods (i.e.
GetFile#onTrigger()) or in your test code (i.e.
GetFileTest#testOnTriggerShouldReadFile())?

I’m not aware of anything with the JUnit runner that would cause
issues here. I use the loggers extensively in both my application code
[1] and the tests [2]. Obviously in the tests, I instantiate a new
Logger instance for the test class.

Can you share an example of the code that breaks this for you?

[1]


https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/EncryptContent.java#L511

[2]


https://github.com/apache/nifi/pull/2628/files#diff-e9cfa232683ae75b1fc505d6c9bd3b24R447

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69


On Apr 12, 2018, at 3:46 PM, Russell Bateman <r...@windofkeltia.com> wrote:

I seem to crash the NiFi JUnit test runner when I have code that calls
getLogger() or attempts to make use of the product of calling
getLogger() in situations where some context (probably) is missing,
like methods annotated for call at "special" times. This makes sense,
but is there a technique I can use to profile my custom processor in
order to observe (easily, such as by using the logger) the behavior of
(i.e., log-TRACE through) my processor with respect to @OnScheduled,
@OnUnscheduled, @OnStopped, etc. moments?

Many thanks,
Russ

--

Sent from Gmail Mobile




