Re: Error handling in @OnScheduled

2018-08-23 Thread James Srinivasan
I tried that, but the problem is the exception is caught and the test
fails due to this:

try {
  ReflectionUtils.invokeMethodsWithAnnotation(OnScheduled.class, processor, context);
} catch (final Exception e) {
  e.printStackTrace();
  Assert.fail("Could not invoke methods annotated with @OnScheduled annotation due to: " + e);
}

https://github.com/apache/nifi/blob/master/nifi-mock/src/main/java/org/apache/nifi/util/StandardProcessorTestRunner.java#L181
On Thu, 23 Aug 2018 at 15:41, Mike Thomsen  wrote:
>
> For unit tests, if you're doing this to catch a failure scenario, you
> should be able to wrap the failing call in something like this:
>
> final def msg = "Lorem ipsum..."
> def error = null
> try {
> runner.run()
> } catch (Throwable t) {
> error = t
> } finally {
> assertNotNull(error)
> assertTrue(error.cause instanceof SomeException)
> assertTrue(error.cause.message.contains(msg))
> }
>
> Obviously play around with the finally block, but I've had success with
> that pattern.
>
> On Thu, Aug 23, 2018 at 10:19 AM James Srinivasan <
> james.sriniva...@gmail.com> wrote:
>
> > What is the best way to handle exceptions which might be thrown in my
> > @OnScheduled method? Right now, I'm logging and propagating the
> > exception which has the desired behaviour in NiFi (bulletin in GUI and
> > processor cannot be started) but when trying to add a unit test, the
> > (expected) exception is caught in StandardProcessorTestRunner.run and
> > failure asserted.
> >
> > My actual @OnScheduled method builds a non-trivial object based on the
> > Processor's params - maybe I should be building that any time any of
> > the params change instead?
> >
> > Many thanks,
> >
> > James
> >


Error handling in @OnScheduled

2018-08-23 Thread James Srinivasan
What is the best way to handle exceptions which might be thrown in my
@OnScheduled method? Right now, I'm logging and propagating the
exception which has the desired behaviour in NiFi (bulletin in GUI and
processor cannot be started) but when trying to add a unit test, the
(expected) exception is caught in StandardProcessorTestRunner.run and
failure asserted.

My actual @OnScheduled method builds a non-trivial object based on the
Processor's params - maybe I should be building that any time any of
the params change instead?

Many thanks,

James


Re: Error handling in @OnScheduled

2018-08-23 Thread James Srinivasan
Ah, hadn't spotted that. It's close, but the Throwable I get is a
java.lang.AssertionError (Could not invoke methods annotated with
@OnScheduled annotation due to:
java.lang.reflect.InvocationTargetException) and there doesn't seem to
be any way to get the actual underlying exception my code threw in
order to properly validate it.
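
A minimal sketch of why the cause goes missing, assuming the runner fails with a plain string message as in the snippet quoted in this thread. It uses only the JDK; FakeProcessor, failLikeTheRunner and callDirectly are made-up names for illustration, not NiFi classes or methods.

```java
import java.lang.reflect.Method;

// Stand-alone model of the behaviour described above; no NiFi or JUnit classes.
final class LostCauseDemo {

    // Hypothetical stand-in for a processor whose lifecycle method fails.
    static final class FakeProcessor {
        public void onScheduled() {
            throw new IllegalStateException("schema could not be parsed");
        }
    }

    // Mimics the quoted pattern: invoke reflectively, then fail with a message
    // built by string concatenation, so no cause is attached to the error.
    static AssertionError failLikeTheRunner(final Object processor) {
        try {
            final Method method = processor.getClass().getMethod("onScheduled");
            method.invoke(processor);
            return null; // nothing was thrown
        } catch (final Exception e) {
            return new AssertionError(
                    "Could not invoke methods annotated with @OnScheduled annotation due to: " + e);
        }
    }

    // Calling the method directly surfaces the original exception untouched.
    static RuntimeException callDirectly(final FakeProcessor processor) {
        try {
            processor.onScheduled();
            return null; // nothing was thrown
        } catch (final RuntimeException e) {
            return e;
        }
    }

    public static void main(final String[] args) {
        final FakeProcessor processor = new FakeProcessor();
        System.out.println("via runner-style wrapper, cause = "
                + failLikeTheRunner(processor).getCause());
        System.out.println("direct call = " + callDirectly(processor));
    }
}
```

Because the wrapper only concatenates the exception into a string, getCause() on the resulting AssertionError has nothing to return, which matches what I see; a direct call keeps the original exception type and message available for assertions.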

Mark's suggestion of calling the @OnScheduled method directly seems a
little tricky when using the TestRunner framework, or should I just
replicate the test setup (e.g. create my own MockProcessContext etc.)

Thanks,

James
On Thu, 23 Aug 2018 at 21:03, Mike Thomsen  wrote:
>
> James try it with a throwable like in my example
> On Thu, Aug 23, 2018 at 10:51 AM Mark Payne  wrote:
>
> > James,
> >
> > If you are expecting the method to throw an Exception and want to verify
> > that, you should
> > just call the method directly from your unit test and catch the Exception
> > there. The TestRunner
> > expects to run the full lifecycle of the Processor.
> >
> > Thanks
> > -Mark
> >
> >
> > > On Aug 23, 2018, at 10:49 AM, James Srinivasan <
> > james.sriniva...@gmail.com> wrote:
> > >
> > > I tried that, but the problem is the exception is caught and the test
> > > fails due to this:
> > >
> > > try {
> > >  ReflectionUtils.invokeMethodsWithAnnotation(OnScheduled.class,
> > > processor, context);
> > > } catch (final Exception e) {
> > >  e.printStackTrace();
> > >  Assert.fail("Could not invoke methods annotated with @OnScheduled
> > > annotation due to: " + e);
> > > }
> > >
> > >
> > https://github.com/apache/nifi/blob/master/nifi-mock/src/main/java/org/apache/nifi/util/StandardProcessorTestRunner.java#L181
> > > On Thu, 23 Aug 2018 at 15:41, Mike Thomsen 
> > wrote:
> > >>
> > >> For unit tests, if you're doing this to catch a failure scenario, you
> > >> should be able to wrap the failing call in something like this:
> > >>
> > >> final def msg = "Lorem ipsum..."
> > >> def error = null
> > >> try {
> > >>runner.run()
> > >> } catch (Throwable t) {
> > >>error = t
> > >> } finally {
> > >>assertNotNull(error)
> > >>assertTrue(error.cause instanceof SomeException)
> > >>assertTrue(error.cause.message.contains(msg))
> > >> }
> > >>
> > >> Obviously play around with the finally block, but I've had success with
> > >> that pattern.
> > >>
> > >> On Thu, Aug 23, 2018 at 10:19 AM James Srinivasan <
> > >> james.sriniva...@gmail.com> wrote:
> > >>
> > >>> What is the best way to handle exceptions which might be thrown in my
> > >>> @OnScheduled method? Right now, I'm logging and propagating the
> > >>> exception which has the desired behaviour in NiFi (bulletin in GUI and
> > >>> processor cannot be started) but when trying to add a unit test, the
> > >>> (expected) exception is caught in StandardProcessorTestRunner.run and
> > >>> failure asserted.
> > >>>
> > >>> My actual @OnScheduled method builds a non-trivial object based on the
> > >>> Processor's params - maybe I should be building that any time any of
> > >>> the params change instead?
> > >>>
> > >>> Many thanks,
> > >>>
> > >>> James
> > >>>
> >
> >


Re: Error handling in @OnScheduled

2018-08-24 Thread James Srinivasan
Mark,

Thanks very much for the detailed answer. In my particular case, I
have a parameter corresponding to a schema file on disk and there is a
standard validator to ensure that the file exists. Currently, in my
@OnScheduled method, I read that schema file, parse it and store the
parsed results in a member of my class ready for use in my onTrigger
method. If the file fails to parse, an exception is thrown. I could
move that code into a validator, but setting a member as a side effect
of validation didn't quite feel right - does that make sense?
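
To make the split concrete, here is a rough sketch of what I mean, written against the plain JDK rather than the NiFi API. SchemaHolder, validate, onScheduled and the "name:type" file format are all hypothetical stand-ins: validate plays the role of the existing file-exists validator, and onScheduled plays the role of my @OnScheduled method.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Stand-alone model (no NiFi classes); every name here is hypothetical.
final class SchemaHolder {

    // Parsed schema, set only when the component is scheduled.
    private volatile List<String> parsedSchema = Collections.emptyList();

    // Cheap, side-effect-free check, playing the role of a property validator.
    // Returns an explanation when the value is invalid, otherwise empty.
    static Optional<String> validate(final Path schemaFile) {
        if (!Files.isRegularFile(schemaFile)) {
            return Optional.of("Schema file does not exist: " + schemaFile);
        }
        return Optional.empty();
    }

    // Expensive step, playing the role of the @OnScheduled method:
    // read the file, parse it, and only then store the result.
    void onScheduled(final Path schemaFile) {
        final List<String> lines;
        try {
            lines = Files.readAllLines(schemaFile, StandardCharsets.UTF_8);
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
        for (final String line : lines) {
            // Toy format assumed for illustration: one "name:type" per line.
            if (!line.matches("\\w+:\\w+")) {
                throw new IllegalArgumentException("Cannot parse schema line: " + line);
            }
        }
        parsedSchema = Collections.unmodifiableList(lines);
    }

    List<String> schema() {
        return parsedSchema;
    }

    // Demo helper: writes the given text to a temporary file.
    static Path writeTemp(final String contents) {
        try {
            final Path file = Files.createTempFile("schema", ".txt");
            Files.write(file, contents.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(final String[] args) {
        final Path good = writeTemp("id:int\nname:string\n");
        final SchemaHolder holder = new SchemaHolder();
        System.out.println("validation error: " + validate(good));
        holder.onScheduled(good);
        System.out.println("parsed: " + holder.schema());
    }
}
```

The validator never touches the member, so validation stays free of side effects, and a parse failure still surfaces as an exception from the scheduling step.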

James
On Fri, 24 Aug 2018 at 16:01, Mark Payne  wrote:
>
> James,
>
> You can certainly catch Throwable there, or AssertionError, more
> specifically, but I'd be very wary of doing that, because at that point
> you're really kind of working against the framework (both the nifi
> mock/test framework as well as the JUnit framework) instead of with it.
> If your intent is to test a specific method, I would recommend testing
> that method explicitly by calling it yourself.
>
> You don't need to create your own MockProcessContext. You can get the
> ProcessContext from the Test Runner. For example:
>
>
> final MyProcessor myProcessor = new MyProcessor();
> final TestRunner runner = TestRunners.newTestRunner(myProcessor);
>
> runner.setProperty(MyProcessor.MY_PROPERTY, "hello");
>
> try {
>   myProcessor.onScheduled( runner.getProcessContext() );
>   Assert.fail("Expected SomeException to get thrown from onScheduled method but it did not.");
> } catch (final SomeException e) {
>   // expected.
> }
>
> Now, this being said, it raises the question of whether you want to be
> throwing an Exception from your @OnScheduled method at all. I'm sure there
> are use cases where this makes perfect sense. However, you should first
> think about whether you can prevent the Exception from occurring by
> applying validation rules (addValidator() on PropertyDescriptors, or
> customValidate). The benefit of validators here is that when the user
> configures the Processor incorrectly, they get a clear indication
> immediately that it is not valid and a clear explanation of why it's not
> valid (as well as being shown in the Invalid Counts of Process Groups,
> etc.). If you wait until the user tries to start the Processor and throw
> an Exception, it will be less obvious that there's a configuration problem
> and the error message that they receive is likely not to be as clear.
>
> Thanks
> -Mark
>
>
> On Aug 23, 2018, at 5:25 PM, James Srinivasan <james.sriniva...@gmail.com> wrote:
>
> Ah, hadn't spotted that. It's close, but the Throwable I get is a
> java.lang.AssertionError (Could not invoke methods annotated with
> @OnScheduled annotation due to:
> java.lang.reflect.InvocationTargetException) and there doesn't seem to
> be any way to get the actual underlying exception my code threw in
> order to properly validate it.
>
> Mark's suggestion of calling the @OnScheduled method directly seems a
> little tricky when using the TestRunner framework, or should I just
> replicate the test setup (e.g. create my own MockProcessContext etc.)
>
> Thanks,
>
> James
> On Thu, 23 Aug 2018 at 21:03, Mike Thomsen <mikerthom...@gmail.com> wrote:
>
> James try it with a throwable like in my example
> On Thu, Aug 23, 2018 at 10:51 AM Mark Payne <marka...@hotmail.com> wrote:
>
> James,
>
> If you are expecting the method to throw an Exception and want to verify
> that, you should
> just call the method directly from your unit test and catch the Exception
> there. The TestRunner
> expects to run the full lifecycle of the Processor.
>
> Thanks
> -Mark
>
>
> On Aug 23, 2018, at 10:49 AM, James Srinivasan <james.sriniva...@gmail.com> wrote:
>
> I tried that, but the problem is the exception is caught and the test
> fails due to this:
>
> try {
>   ReflectionUtils.invokeMethodsWithAnnotation(OnScheduled.class, processor, context);
> } catch (final Exception e) {
>   e.printStackTrace();
>   Assert.fail("Could not invoke methods annotated with @OnScheduled annotation due to: " + e);
> }
>
>
> https://github.com/apache/nifi/blob/master/nifi-mock/src/main/java/org/apache/nifi/util/StandardProcessorTestRunner.java#L181
> On Thu, 23 Aug 2018 at 15:41, Mike Thomsen wrote:
>
> For unit tests, if you're doing this to catch a failure scenario, you
> should be able to wrap the failing call in something like this:
>
> final def msg = "Lorem ipsum..."
> def error = null
> try {
>   runner.run()
> } catch (Throwable t

Re: Error handling in @OnScheduled

2018-08-26 Thread James Srinivasan
Thanks very much, I'm now able to write a useful unit test to catch
the expected exception. Given the great support I've had from the
list, I'll start my organisation's process to contribute the code back
for this:

https://jira.apache.org/jira/browse/NIFI-5538

On Fri, 24 Aug 2018 at 16:41, Mark Payne  wrote:
>
> James,
>
> Yes, makes perfect sense. I think that's a good balance of logic, as well.
> Use a validator to quickly ensure that the file exists. Then, when trying
> to use it, go ahead and parse the data and set your member variable. You
> could have the validator parse the file (but not set the member variable)
> to ensure that the format is valid, etc. but I would personally avoid doing
> that, because the parsing may well prove to be quite expensive for
> validation purposes. I think you're very much on the right track.
>
> Thanks!
> -Mark
>
>
>
> > On Aug 24, 2018, at 11:34 AM, James Srinivasan wrote:
> >
> > Mark,
> >
> > Thanks very much for the detailed answer. In my particular case, I
> > have a parameter corresponding to a schema file on disk and there is a
> > standard validator to ensure that the file exists. Currently, in my
> > @OnScheduled method, I read that schema file, parse it and store the
> > parsed results in a member of my class ready for use in my onTrigger
> > method. If the file fails to parse, an exception is thrown. I could
> > move that code into a validator, but setting a member as a side effect
> > of validation didn't quite feel right - does that make sense?
> >
> > James
> > On Fri, 24 Aug 2018 at 16:01, Mark Payne  wrote:
> >>
> >> James,
> >>
> >> You can certainly catch Throwable there, or AssertionError, more
> >> specifically, but I'd be very wary of doing that, because at that point
> >> you're really kind of working against the framework (both the nifi
> >> mock/test framework as well as the JUnit framework) instead of with it.
> >> If your intent is to test a specific method, I would recommend testing
> >> that method explicitly by calling it yourself.
> >>
> >> You don't need to create your own MockProcessContext. You can get the
> >> ProcessContext from the Test Runner. For example:
> >>
> >>
> >> final MyProcessor myProcessor = new MyProcessor();
> >> final TestRunner runner = TestRunners.newTestRunner(myProcessor);
> >>
> >> runner.setProperty(MyProcessor.MY_PROPERTY, "hello");
> >>
> >> try {
> >>   myProcessor.onScheduled( runner.getProcessContext() );
> >>   Assert.fail("Expected SomeException to get thrown from onScheduled method but it did not.");
> >> } catch (final SomeException e) {
> >>   // expected.
> >> }
> >>
> >> Now, this being said, it raises the question of whether you want to be
> >> throwing an Exception from your @OnScheduled method at all. I'm sure there
> >> are use cases where this makes perfect sense. However, you should first
> >> think about whether you can prevent the Exception from occurring by
> >> applying validation rules (addValidator() on PropertyDescriptors, or
> >> customValidate). The benefit of validators here is that when the user
> >> configures the Processor incorrectly, they get a clear indication
> >> immediately that it is not valid and a clear explanation of why it's not
> >> valid (as well as being shown in the Invalid Counts of Process Groups,
> >> etc.). If you wait until the user tries to start the Processor and throw
> >> an Exception, it will be less obvious that there's a configuration problem
> >> and the error message that they receive is likely not to be as clear.
> >>
> >> Thanks
> >> -Mark
> >>
> >>
> >> On Aug 23, 2018, at 5:25 PM, James Srinivasan <james.sriniva...@gmail.com> wrote:
> >>
> >> Ah, hadn't spotted that. It's close, but the Throwable I get is a
> >> java.lang.AssertionError (Could not invoke methods annotated with
> >> @OnScheduled annotation due to:
> >> java.lang.reflect.InvocationTargetException) and there doesn't seem to
> >> be any way to get the actual underlying exception my code threw in
> >> order to properly validate it.
> >>
> >> Mark's suggestion of calling the @OnScheduled method directly seems a
> >> little tricky when using the TestRunner framework, or should I just
> >

Licenses for test dependencies

2018-08-30 Thread James Srinivasan
I am not a lawyer (and you probably aren't either), but what should I do
to document the licenses of dependent libraries of the processor I
want to contribute that are only used with a Maven 'test' scope, i.e.
not bundled in the nar?

Also, same question for plugins that run in the test scope.

Many thanks,

James


Re: [EXT] Re: New Standard Pattern - Put Exception that caused failure in an attribute

2018-10-30 Thread James Srinivasan
Apologies if I've missed this in the discussion so far - we use the
InvokeHTTP processor a lot, and the invokehttp.java.exception.message
attribute is really handy for diving into why things have failed without
having to match up logs with flow files (from a system with hundreds
of processors making thousands of requests). We also route on
invokehttp.status.code (e.g. to retry 403s due to a race hazard in an
external system) but I don't imagine we'd route on
invokehttp.java.exception.* since (as others have mentioned) it looks
pretty fragile.
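
As a rough illustration of the distinction, here is the routing decision modelled in plain Java. In a real flow this would be processor configuration (for example RouteOnAttribute) rather than code; the Route enum and the route method are hypothetical, and only the attribute names come from InvokeHTTP as mentioned above.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-alone model of the routing decision; not NiFi code.
final class HttpFailureRouting {

    enum Route { SUCCESS, RETRY, FAILURE }

    // Routes on the structured status code only. The exception message
    // attribute is deliberately ignored here: it is useful for diagnosis,
    // but matching on free text would be fragile.
    static Route route(final Map<String, String> attributes) {
        final String status = attributes.get("invokehttp.status.code");
        if (status == null) {
            return Route.FAILURE; // no HTTP response was received at all
        }
        if (status.equals("403")) {
            return Route.RETRY; // assumed transient, as in the race described above
        }
        if (status.startsWith("2")) {
            return Route.SUCCESS;
        }
        return Route.FAILURE;
    }

    public static void main(final String[] args) {
        final Map<String, String> attributes = new HashMap<>();
        attributes.put("invokehttp.status.code", "403");
        System.out.println(route(attributes));
    }
}
```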

-- 
James
On Tue, 30 Oct 2018 at 16:44, Peter Wicks (pwicks)  wrote:
>
> Sorry for the delayed response, I've been traveling.
>
> Responses in order:
>
> Matt,
> Right now our workaround is to keep retrying errors, usually with a penalty
> or control rate processor. The problem is that we don't know why it failed,
> and thus don't know if retry is the correct option. I have not found a way,
> without a code change, to determine whether retrying is the correct
> option or not.
>
> Koji,
> Detailed error handling would indeed be a good workaround to the problems
> raised by myself and Matt. I have not seen this on other processors, but that
> does not mean we can't do it of course. I agree that having some kind of
> hierarchy system for errors would be a much better solution.
>
> Pierre,
> My primary use case is as you described, a user friendly way to see what
> actually happened without going through the log files. But while I know
> it's fragile, routing on exception text stored in an attribute still feels
> like a very legitimate use case. I know in many systems there are good
> exception types that can be used to route FlowFiles to appropriate failure
> relationships, but as Matt mentioned, JDBC has just a handful of exception
> types for a very large number of possible error types.
>
> I think this is probably the same rationale that was used to justify
> including this feature in Execute Stream Command in the past: too many
> possible failure conditions to handle with just a few failure
> relationships.
>
> Uwe,
> That is a fair question, but it doesn't feel like such a bad fit to me. It's 
> like extra metadata on the lineage, "We followed this path through the flow 
> because we had exception "  " which caused the FlowFile to follow the 
> failure route".
>
> But I still prefer the attribute, it could be another option for Detailed 
> error handling; instead of, or in addition to, additional relationships for 
> failures, the exception text could be included in an attribute.
>
> Thanks,
>   Peter
>
> -Original Message-
> From: u...@moosheimer.com [mailto:u...@moosheimer.com]
> Sent: Saturday, October 27, 2018 10:46 AM
> To: dev@nifi.apache.org
> Subject: Re: [EXT] Re: New Standard Pattern - Put Exception that caused 
> failure in an attribute
>
> Do you really want to mix provenance and data lineage with logging/error 
> information?
>
> Writing exception information/logging information within an attribute is not 
> a bad idea in my opinion.
> If a user wants to use this for routing, why not ... or whatever the user 
> wants to do.
>
> I could imagine that this can be switched on and off by a property via 
> config. E.g. in development on and on production off.
>
> Regards,
> Uwe
>
> > On 26.10.2018 at 09:26, Pierre Villard wrote:
> >
> > Adding another option to the list.
> >
> > Peter - if I understand correctly and based on my own experience, the
> > idea is not to have an 'exception' attribute to perform custom routing
> > after the failure relationship but rather have a more user friendly
> > way to see what happened without going through all the logs for a given 
> > flow file.
> >
> > If that's correct, then could we add this information somehow to the
> > provenance event generated by the processor? Ideally adding a new
> > field to a provenance event or using the existing 'details' field?
> >
> > Pierre
> >
> >
> > On Fri, 26 Oct 2018 at 08:40, Koji Kawamura wrote:
> >
> >> Hi all,
> >>
> >> I'd like to add another option to Matt's list of solutions:
> >>
> >> 4) Add a processor property, 'Enable detailed error handling'
> >> (defaults to false), then toggle available list of relationships.
> >> This way, existing flows such as Peter's don't have to change, while
> >> he can opt-in new relationships. RouteOnAttribute can be a reference
> >> implementation.
> >>
> >> I like the idea of thinking of relationships as potential exceptions. It
> >> would be even better if relationships had a hierarchy.
> >> Some users need more granular relationships while others don't.
> >> For NiFi 2.0 or later, supporting a relationship hierarchy at the framework
> >> level could avoid the need for an additional property on each processor.
> >>
> >> Thanks,
> >> Koji
> >> On Fri, Oct 26, 2018 at 11:49 AM Matt Burgess 
> >> wrote:
> >>>
> >>> Peter,
> >>>
> >>> Totally agree, RDBMS/JDBC is in a weird class as always, there is a
> >>> teaspoon of 

Code coverage for NiFi processor

2018-09-21 Thread James Srinivasan
Has anyone got JaCoCo (or similar) working with a custom processor
created from the archetype?

Many thanks,

James


Re: Processor forwards/backwards compatibility

2018-11-16 Thread James Srinivasan
Mark,

Gotcha, thanks very much.

James
On Fri, 16 Nov 2018 at 19:41, Mark Payne  wrote:
>
> Hi James,
>
> The processor may or may not work with older versions of NiFi (1.x.0), but it
> should work with newer versions of NiFi (1.z.0). That's because there may be a
> feature of the nifi-api that you use in 1.y.0 which did not exist in 1.x.0,
> whereas we can guarantee that it will not be removed in 1.z.0.
>
> Thanks
> -Mark
>
>
>
> > On Nov 16, 2018, at 2:35 PM, James Srinivasan wrote:
> >
> > I've read 
> > https://cwiki.apache.org/confluence/display/NIFI/Version+Scheme+and+API+Compatibility
> > but I'm still not quite clear:
> >
> > I develop my own custom Processor against NiFi v. 1.y.0
> >
> > Should that processor work with older NiFi releases 1.x.0 (x < y)?
> > Should that processor work with newer NiFi releases 1.z.0 (y < z)?
> >
> > Thanks
> >
> > James
>
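
For my own notes, Mark's rule of thumb for the 1.x line can be written down as a tiny check. This is only a sketch of the expectation stated in his reply, not an official compatibility API; ApiCompatibility, Expectation and expect are names invented here, and the arguments are the minor version numbers (the y and z in 1.y.0 and 1.z.0).

```java
// Stand-alone sketch of the stated expectation; not part of NiFi.
final class ApiCompatibility {

    enum Expectation { SHOULD_WORK, MAY_NOT_WORK }

    // A processor built against 1.<builtAgainstMinor>.0 is expected to work on
    // the same or a newer 1.x release, and may or may not work on an older one.
    static Expectation expect(final int builtAgainstMinor, final int runtimeMinor) {
        return runtimeMinor >= builtAgainstMinor
                ? Expectation.SHOULD_WORK
                : Expectation.MAY_NOT_WORK;
    }

    public static void main(final String[] args) {
        // Illustrative minor versions only.
        System.out.println(expect(7, 8));
        System.out.println(expect(7, 5));
    }
}
```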


Processor forwards/backwards compatibility

2018-11-16 Thread James Srinivasan
I've read 
https://cwiki.apache.org/confluence/display/NIFI/Version+Scheme+and+API+Compatibility
but I'm still not quite clear:

I develop my own custom Processor against NiFi v. 1.y.0

Should that processor work with older NiFi releases 1.x.0 (x < y)?
Should that processor work with newer NiFi releases 1.z.0 (y < z)?

Thanks

James


Re: Lowering the barrier of entry

2019-01-26 Thread James Srinivasan
That would have been great - the only things I would add are
reading/writing attributes & reading the processor's properties. Maybe
something about testing in NiFi using GenerateFlowFile too?

I ended up referring to the bundled processors a lot, but it was
sometimes hard to see the wood for the trees.

On Fri, 25 Jan 2019 at 23:04, Andy LoPresto  wrote:
>
> James,
>
> I’m wondering if a page outlining a toy processor (something like `CountText` 
> or `ReverseContent`) and doing a line-by-line annotation from a developer’s 
> perspective would be helpful. It could be a few sections:
>
> 1. How to get to this point
>     * running the maven archetype
>     * choosing the directory to install to
>     * putting the class name in the manifest file
>     * etc.
> 2. The code
>     * here’s the class
>     * here’s what extending AbstractProcessor gets you, etc. A lot of
>       this is currently in the Developer Guide, but in textbook mode
>     * here’s how to modify some code (i.e. write one line of Java that
>       switches it from straight content pass-through to reversing the text)
>     * here’s how to make a unit test (introduce the TestRunner framework,
>       etc.)
> 3. Running, building, installing
>     * Run your unit test from the IDE/maven
>     * Build the NAR module
>     * Install the NAR in NiFi lib/ or custom/
>     * Restart NiFi
>     * See the NAR loaded in the log
>     * Deploy the component on the canvas
>
> I imagine this being written more conversationally/blog-like than most of our 
> current reference documentation to be used as a split-screen walkthrough. 
> Each section could certainly link to the existing detailed documentation for 
> various topics, like the processor lifecycle, etc.
>
> Does this sound like something that would have helped you?
>
> Andy LoPresto
> alopre...@apache.org
> alopresto.apa...@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> > On Jan 25, 2019, at 1:59 PM, James Srinivasan wrote:
> >
> > 9) Oh, and the wiki is a little hard to navigate and the contents rather
> > patchy
> >
> > On Fri, 25 Jan 2019 at 21:57, James Srinivasan
> >  wrote:
> >>
> >> As someone relatively new to NiFi dev, here's my £0.02. (Yes, I
> >> realise I could and possibly should submit PRs :)
> >>
> >> 1) I'm used to Java and Maven, so used the archetype. It worked fine,
> >> it would have been nice if it set up unit tests for me.
> >> 2) The User and Developer documentation is great and comprehensive.
> >> Finding the developer docs is a little painful (handful of items at
> >> the end of a scrolling list of 200+ processors)
> >> 3) The Developer docs could possibly do with a little more clarity on
> >> processor lifetime i.e. what is called when ^h^h^h - skimming back
> >> over the docs, it looks pretty clear now
> >> 4) Some example code for common operations e.g. getting/setting
> >> attributes or reading/writing/modifying flowfile content would be
> >> great.
> >> 5) When using existing processors for inspiration, best practices
> >> weren't always clear e.g. some generated properties inside
> >> getSupportedPropertyDescriptors(), others generated a private static
> >> list on init and returned that. Such differences are inevitable in a
> >> large project, but it would be nice to have something blessed to start
> >> from.
> >> 6) (Minor niggle - layout of the docs doesn't work great on a phone screen)
> >> 7) I couldn't find (m?)any docs about the Groovy scripting API, but
> >> the great blog posts by Matt Burgess and others were invaluable
> >> 8) In case this all sounds too negative, NiFi is fab!
> >>
> >> On Fri, 25 Jan 2019 at 18:47, Andrew Grande  wrote:
> >>>
> >>> I am not against the archetype. But we need to spell out every step of the
> >>> way. I'd like to see a user thinking about their custom logic ASAP rather
> >>> than fighting the tools to get started. Those steps should be brain-dead,
> >>> just reflexes, if you know what I mean. Hell, let them create a custom
> >>> processor project or prototype in a script by accident even! :)
> >>>
> >>> On Fri, Jan 25, 2019, 10:43 AM Bryan Bende  wrote:
> >>>
> >>>> That makes sense about the best practice for deploying to an
> >>>> additional lib directory.
> >>>>
> >>>> So for the project structure you are saying it would be easier to have
> &

Re: Lowering the barrier of entry

2019-01-28 Thread James Srinivasan
How about separating out User/Developer/Admin into separate docs?

On Mon, 28 Jan 2019 at 16:13, Bryan Bende  wrote:
>
> What does everyone think about bumping the "Developer" section of the
> docs ahead of "Processors" so that it's easier to find?
>
> Here is what it would look like -
> https://gist.github.com/bbende/279c983f5c80d4fad1431ae81862060f
>
> I also added links to the "Contributor Guide" and the "Maven Projects"
> page since I think it would be helpful to make the Developer section
> be the one-stop place people look for development help, although I can
> see an argument for not mixing wiki content with the docs content.
>
> One issue would be if we want the docs to be fully usable in an
> off-line environment, then linking to the wiki won't work, so we could
> remove those links, or convert those pages to be part of the docs now
> that they are more stable.
>
> For reference, we already have some links in the docs to the wiki:
>
> https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#supplying-a-contribution
>
> On Sat, Jan 26, 2019 at 10:49 AM Joe Witt  wrote:
> >
> > ...we can.  but the discussion is about how to both lower the bar and offer
> > more routes to the bar.
> >
> > On Sat, Jan 26, 2019, 10:45 AM Otto Fowler wrote:
> >
> > > Why wouldn’t we make the archetypes do this?
> > >
> > >
> > > On January 25, 2019 at 18:04:25, Andy LoPresto (alopre...@apache.org)
> > > wrote:
> > >
> > > James,
> > >
> > > I’m wondering if a page outlining a toy processor (something like
> > > `CountText` or `ReverseContent`) and doing a line-by-line annotation from 
> > > a
> > > developer’s perspective would be helpful. It could be a few sections:
> > >
> > > 1. How to get to this point
> > > * running the maven archetype
> > > * choosing the directory to install to
> > > * putting the class name in the manifest file
> > > * etc.
> > > 2. The code
> > > * here’s the class
> > > * here’s what extending AbstractProcessor gets you, etc. A lot of this is
> > > currently in the Developer Guide, but in textbook mode
> > > * here’s how to modify some code (i.e. write one line of Java that 
> > > switches
> > > it from straight content pass-through to reversing the text)
> > > * here’s how to make a unit test (introduce the TestRunner framework, 
> > > etc.)
> > > 3. Running, building, installing
> > > * Run your unit test from the IDE/maven
> > > * Build the NAR module
> > > * Install the NAR in NiFi lib/ or custom/
> > > * Restart NiFi
> > > * See the NAR loaded in the log
> > > * Deploy the component on the canvas
> > >
> > > I imagine this being written more conversationally/blog-like than most of
> > > our current reference documentation to be used as a split-screen
> > > walkthrough. Each section could certainly link to the existing detailed
> > > documentation for various topics, like the processor lifecycle, etc.
> > >
> > > Does this sound like something that would have helped you?
> > >
> > > Andy LoPresto
> > > alopre...@apache.org
> > > alopresto.apa...@gmail.com
> > > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
> > >
> > > > On Jan 25, 2019, at 1:59 PM, James Srinivasan <
> > > james.sriniva...@gmail.com>
> > > wrote:
> > > >
> > > > 9) Oh, and the wiki is a little hard to navigate and the contents rather
> > > patchy
> > > >
> > > > On Fri, 25 Jan 2019 at 21:57, James Srinivasan
> > > >  wrote:
> > > >>
> > > >> As someone relatively new to NiFi dev, here's my £0.02. (Yes, I
> > > >> realise I could and possibly should submit PRs :)
> > > >>
> > > >> 1) I'm used to Java and Maven, so used the archetype. It worked fine,
> > > >> it would have been nice if it set up unit tests for me.
> > > >> 2) The User and Developer documentation is great and comprehensive.
> > > >> Finding the developer docs is a little painful (handful of items at
> > > >> the end of a scrolling list of 200+ processors)
> > > >> 3) The Developer docs could possibly do with a little more clarity on
> > > >> processor lifetime i.e. what is called when ^h^h^h - skimming back
> > > >> over the docs, it looks pretty clear now
> > > >> 4) Some example code fo

Re: Lowering the barrier of entry

2019-01-28 Thread James Srinivasan
Rather than lumping all the documentation together in a single huge
doc, I was thinking separate entries in the top bar of the NiFi site
under "Documentation" for something like:

General (Overview & Getting Started)
User (User Guide, Expression Language Guide, Record Path Guide &
detailed Processor Usage)
Admin (Admin Guide)
Developer (All the text currently under Developer)

Breaking it out into multiple top-level headings will hopefully help
people find what they need more quickly e.g. with my Developer hat on,
I don't much care about the details of the FooBarProcessor, whereas
with my User hat on, I really want to know about its parameters and
what they mean. Likewise, a non-admin probably doesn't much care about
certificates etc.

Does this make sense? What do others think?

On Mon, 28 Jan 2019 at 17:04, Bryan Bende  wrote:
>
> Currently it’s broken into General and Developer, so were you thinking of
> splitting General into User and Admin?
>
> On Mon, Jan 28, 2019 at 11:34 AM James Srinivasan <
> james.sriniva...@gmail.com> wrote:
>
> > How about separating out User/Developer/Admin into separate docs?
> >
> > On Mon, 28 Jan 2019 at 16:13, Bryan Bende  wrote:
> > >
> > > What does everyone think about bumping the "Developer" section of the
> > > docs ahead of "Processors" so that it's easier to find?
> > >
> > > Here is what it would look like -
> > > https://gist.github.com/bbende/279c983f5c80d4fad1431ae81862060f
> > >
> > > I also added links to the "Contributor Guide" and the "Maven Projects"
> > > page since I think it would be helpful to make the Developer section
> > > be the one-stop place people look for development help, although I can
> > > see an argument for not mixing wiki content with the docs content.
> > >
> > > One issue would be if we want the docs to be fully usable in an
> > > off-line environment, then linking to the wiki won't work, so we could
> > > remove those links, or convert those pages to be part of the docs now
> > > that they are more stable.
> > >
> > > For reference, we already have some links in the docs to the wiki:
> > >
> > >
> > https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#supplying-a-contribution
> > >
> > > On Sat, Jan 26, 2019 at 10:49 AM Joe Witt  wrote:
> > > >
> > > > ...we can.  but the discussion is about how to both lower the bar and
> > offer
> > > > more routes to the bar.
> > > >
> > > > On Sat, Jan 26, 2019, 10:45 AM Otto Fowler  > wrote:
> > > >
> > > > > Why wouldn’t we make the archetypes do this?
> > > > >
> > > > >
> > > > > On January 25, 2019 at 18:04:25, Andy LoPresto (alopre...@apache.org
> > )
> > > > > wrote:
> > > > >
> > > > > James,
> > > > >
> > > > > I’m wondering if a page outlining a toy processor (something like
> > > > > `CountText` or `ReverseContent`) and doing a line-by-line annotation
> > from a
> > > > > developer’s perspective would be helpful. It could be a few sections:
> > > > >
> > > > > 1. How to get to this point
> > > > > * running the maven archetype
> > > > > * choosing the directory to install to
> > > > > * putting the class name in the manifest file
> > > > > * etc.
> > > > > 2. The code
> > > > > * here’s the class
> > > > > * here’s what extending AbstractProcessor gets you, etc. A lot of
> > this is
> > > > > currently in the Developer Guide, but in textbook mode
> > > > > * here’s how to modify some code (i.e. write one line of Java that
> > switches
> > > > > it from straight content pass-through to reversing the text)
> > > > > * here’s how to make a unit test (introduce the TestRunner
> > framework, etc.)
> > > > > 3. Running, building, installing
> > > > > * Run your unit test from the IDE/maven
> > > > > * Build the NAR module
> > > > > * Install the NAR in NiFi lib/ or custom/
> > > > > * Restart NiFi
> > > > > * See the NAR loaded in the log
> > > > > * Deploy the component on the canvas
> > > > >
> > > > > I imagine this being written more conversationally/blog-like than
> > most of
> > > > > our current reference documentation to be used as a split-screen
>

Re: Lowering the barrier of entry

2019-01-28 Thread James Srinivasan
Throw away :-)

In our org, we have some people who can use NiFi to create flows with
the built-in processors. I guess HWX would call them "Data Flow
Managers".
They might know some scripting, so can write Jython scripts.

We have far fewer people (approx 1, i.e. me) who can write custom
processors in Java. Sometimes, I also create flows.

I'd like to have lots more of both kinds of people.

"User" & "Extending NiFi" actually maps to the internal training I
run, so I'd be happy :-)

James

On Mon, 28 Jan 2019 at 18:01, Andrew Grande  wrote:
>
> Not to throw in a monkey wrench, but does it really make sense to split
> User and Developer? In all my years I've never seen a user who wasn't a
> developer.
>
> Maybe call it a User and Extending NiFi sections?
>
> On Mon, Jan 28, 2019, 9:18 AM James Srinivasan 
> wrote:
>
> > Rather than lumping all the documentation together in a single huge
> > doc, I was thinking separate entries in the top bar of the NiFi site
> > under "Documentation" for something like:
> >
> > General (Overview & Getting Started)
> > User (User Guide, Expression Language Guide, Record Path Guide &
> > detailed Processor Usage)
> > Admin (Admin Guide)
> > Developer (All the text currently under Developer)
> >
> > Breaking it out into multiple top-level headings will hopefully help
> > people find what they need more quickly e.g. with my Developer hat on,
> > I don't much care about the details of the FooBarProcessor, whereas
> > with my User hat on, I really want to know about its parameters and
> > what they mean. Likewise, a non-admin probably doesn't much care about
> > certificates etc.
> >
> > Does this make sense? What do others think?
> >
> > On Mon, 28 Jan 2019 at 17:04, Bryan Bende  wrote:
> > >
> > > Currently it’s broken into General and Developer, so were you thinking of
> > > splitting General into User and Admin?
> > >
> > > On Mon, Jan 28, 2019 at 11:34 AM James Srinivasan <
> > > james.sriniva...@gmail.com> wrote:
> > >
> > > > How about separating out User/Developer/Admin into separate docs?
> > > >
> > > > On Mon, 28 Jan 2019 at 16:13, Bryan Bende  wrote:
> > > > >
> > > > > What does everyone think about bumping the "Developer" section of the
> > > > > docs ahead of "Processors" so that it's easier to find?
> > > > >
> > > > > Here is what it would look like -
> > > > > https://gist.github.com/bbende/279c983f5c80d4fad1431ae81862060f
> > > > >
> > > > > I also added links to the "Contributor Guide" and the "Maven
> > Projects"
> > > > > page since I think it would be helpful to make the Developer section
> > > > > be the one-stop place people look for development help, although I
> > can
> > > > > see an argument for not mixing wiki content with the docs content.
> > > > >
> > > > > One issue would be if we want the docs to be fully usable in an
> > > > > off-line environment, then linking to the wiki won't work, so we
> > could
> > > > > remove those links, or convert those pages to be part of the docs now
> > > > > that they are more stable.
> > > > >
> > > > > For reference, we already have some links in the docs to the wiki:
> > > > >
> > > > >
> > > >
> > https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#supplying-a-contribution
> > > > >
> > > > > On Sat, Jan 26, 2019 at 10:49 AM Joe Witt 
> > wrote:
> > > > > >
> > > > > > ...we can.  but the discussion is about how to both lower the bar
> > and
> > > > offer
> > > > > > more routes to the bar.
> > > > > >
> > > > > > On Sat, Jan 26, 2019, 10:45 AM Otto Fowler <
> > ottobackwa...@gmail.com
> > > > wrote:
> > > > > >
> > > > > > > Why wouldn’t we make the archetypes do this?
> > > > > > >
> > > > > > >
> > > > > > > On January 25, 2019 at 18:04:25, Andy LoPresto (
> > alopre...@apache.org
> > > > )
> > > > > > > wrote:
> > > > > > >
> > > > > > > James,
> > > > > > >
> > > > > > > I’m wondering if a page outlining a toy processor (so

Re: Lowering the barrier of entry

2019-01-25 Thread James Srinivasan
As someone relatively new to NiFi dev, here's my £0.02. (Yes, I
realise I could and possibly should submit PRs :)

1) I'm used to Java and Maven, so used the archetype. It worked fine,
it would have been nice if it set up unit tests for me.
2) The User and Developer documentation is great and comprehensive.
Finding the developer docs is a little painful (handful of items at
the end of a scrolling list of 200+ processors)
3) The Developer docs could possibly do with a little more clarity on
processor lifetime i.e. what is called when ^h^h^h - skimming back
over the docs, it looks pretty clear now
4) Some example code for common operations e.g. getting/setting
attributes or reading/writing/modifying flowfile content would be
great.
5) When using existing processors for inspiration, best practices
weren't always clear e.g. some generated properties inside
getSupportedPropertyDescriptors(), others generated a private static
list on init and returned that. Such differences are inevitable in a
large project, but it would be nice to have something blessed to start
from.
6) (Minor niggle - layout of the docs doesn't work great on a phone screen)
7) I couldn't find (m?)any docs about the Groovy scripting API, but
the great blog posts by Matt Burgess and others were invaluable
8) In case this all sounds too negative, NiFi is fab!

On Fri, 25 Jan 2019 at 18:47, Andrew Grande  wrote:
>
> I am not against the archetype. But we need to spell out every step of the
> way. I'd like to see a user thinking about their custom logic ASAP rather
> than fighting the tools to get started. Those steps should be brain-dead,
> just reflexes, if you know what I mean. Hell, let them create a custom
> processor project or prototype in a script by accident even! :)
>
> On Fri, Jan 25, 2019, 10:43 AM Bryan Bende  wrote:
>
> > That makes sense about the best practice for deploying to an
> > additional lib directory.
> >
> > So for the project structure you are saying it would be easier to have
> > a repo somewhere with essentially the same thing that is in the
> > archetype, but they just clone it and rename it themselves (what the
> > archetype does for you)?
> >
> > Something that I think would be awesome is if we could provide a
> > web-based project initializer that would essentially run the archetype
> > behind the scenes and then let you download the archive of the code,
> > just like the spring-boot starter [1]. Not sure if their initializr is
> > something that can be re-used and customized [2].
> >
> > The problem is we would need to host that somewhere.
> >
> > [1] https://start.spring.io/
> > [2] https://github.com/spring-io/initializr
> >
> > On Fri, Jan 25, 2019 at 12:56 PM Andrew Grande  wrote:
> > >
> > > We assume they create new projects from archetypes every day. They don't.
> > >
> > > We also assume they know how to deploy new NARs. Most don't. Especially
> > if
> > > we want them to follow best practices and create an additional NAR
> > bundles
> > > directory entry in the config (vs dumping into nifi lib).
> > >
> > > I can attest that I feel a bit lost myself every time I need to come back
> > > to this and refresh my brain synapses. If we could make these not require
> > > any of that and make simple things dead simple
> > >
> > > Andrew
> > >
> > > On Fri, Jan 25, 2019, 9:47 AM Bryan Bende  wrote:
> > >
> > > > Andrew,
> > > >
> > > > I'm not disagreeing with your points, but I'm curious how you see
> > > > those two ideas being different from the processor archetype and the
> > > > wiki page with the archetype commands?
> > > >
> > > > Is it just that people don't know about it?
> > > >
> > > > -Bryan
> > > >
> > > > [1]
> > > >
> > https://cwiki.apache.org/confluence/display/NIFI/Maven+Projects+for+Extensions
> > > >
> > > > On Fri, Jan 25, 2019 at 12:23 PM Otto Fowler 
> > > > wrote:
> > > > >
> > > > > I think this ties into my other discuss thread on refreshing the
> > > > archetypes
> > > > >
> > > > >
> > > > > On January 25, 2019 at 11:50:10, Andrew Grande (apere...@gmail.com)
> > > > wrote:
> > > > >
> > > > > I consistently see my users struggling when they move up the nifi
> > food
> > > > > chain and start looking at custom processors. The good content about
> > > > > prototyping processors via scripting processors and finalizing with
> > a
> > > > full
> > > > > NAR bundle is everywhere but where it should be.
> > > > >
> > > > > A few simple changes could help (not *more* docs). They are great,
> > much
> > > > > better than in many other projects, but people are already drowning
> > in
> > > > > those.
> > > > >
> > > > > How about:
> > > > >
> > > > > + ISP has a pre-populated processor skeleton. A simple no-op to fill
> > in
> > > > is
> > > > > miles better than a blank text area (which invokes a blank stare).
> > > > >
> > > > > + As much as we may look down on this, but... A simple guide to a
> > full
> > > > NAR
> > > > > build as a series of copy/paste commands.
> > > > >
> > > > > 

Re: Lowering the barrier of entry

2019-01-25 Thread James Srinivasan
9) Oh, and the wiki is a little hard to navigate and the contents rather patchy

On Fri, 25 Jan 2019 at 21:57, James Srinivasan
 wrote:
>
> As someone relatively new to NiFi dev, here's my £0.02. (Yes, I
> realise I could and possibly should submit PRs :)
>
> 1) I'm used to Java and Maven, so used the archetype. It worked fine,
> it would have been nice if it set up unit tests for me.
> 2) The User and Developer documentation is great and comprehensive.
> Finding the developer docs is a little painful (handful of items at
> the end of a scrolling list of 200+ processors)
> 3) The Developer docs could possibly do with a little more clarity on
> processor lifetime i.e. what is called when ^h^h^h - skimming back
> over the docs, it looks pretty clear now
> 4) Some example code for common operations e.g. getting/setting
> attributes or reading/writing/modifying flowfile content would be
> great.
> 5) When using existing processors for inspiration, best practices
> weren't always clear e.g. some generated properties inside
> getSupportedPropertyDescriptors(), others generated a private static
> list on init and returned that. Such differences are inevitable in a
> large project, but it would be nice to have something blessed to start
> from.
> 6) (Minor niggle - layout of the docs doesn't work great on a phone screen)
> 7) I couldn't find (m?)any docs about the Groovy scripting API, but
> the great blog posts by Matt Burgess and others were invaluable
> 8) In case this all sounds too negative, NiFi is fab!
>
> On Fri, 25 Jan 2019 at 18:47, Andrew Grande  wrote:
> >
> > I am not against the archetype. But we need to spell out every step of the
> > way. I'd like to see a user thinking about their custom logic ASAP rather
> > than fighting the tools to get started. Those steps should be brain-dead,
> > just reflexes, if you know what I mean. Hell, let them create a custom
> > processor project or prototype in a script by accident even! :)
> >
> > On Fri, Jan 25, 2019, 10:43 AM Bryan Bende  wrote:
> >
> > > That makes sense about the best practice for deploying to an
> > > additional lib directory.
> > >
> > > So for the project structure you are saying it would be easier to have
> > > a repo somewhere with essentially the same thing that is in the
> > > archetype, but they just clone it and rename it themselves (what the
> > > archetype does for you)?
> > >
> > > Something that I think would be awesome is if we could provide a
> > > web-based project initializer that would essentially run the archetype
> > > behind the scenes and then let you download the archive of the code,
> > > just like the spring-boot starter [1]. Not sure if their initializr is
> > > something that can be re-used and customized [2].
> > >
> > > The problem is we would need to host that somewhere.
> > >
> > > [1] https://start.spring.io/
> > > [2] https://github.com/spring-io/initializr
> > >
> > > On Fri, Jan 25, 2019 at 12:56 PM Andrew Grande  wrote:
> > > >
> > > > We assume they create new projects from archetypes every day. They 
> > > > don't.
> > > >
> > > > We also assume they know how to deploy new NARs. Most don't. Especially
> > > if
> > > > we want them to follow best practices and create an additional NAR
> > > bundles
> > > > directory entry in the config (vs dumping into nifi lib).
> > > >
> > > > I can attest that I feel a bit lost myself every time I need to come 
> > > > back
> > > > to this and refresh my brain synapses. If we could make these not 
> > > > require
> > > > any of that and make simple things dead simple
> > > >
> > > > Andrew
> > > >
> > > > On Fri, Jan 25, 2019, 9:47 AM Bryan Bende  wrote:
> > > >
> > > > > Andrew,
> > > > >
> > > > > I'm not disagreeing with your points, but I'm curious how you see
> > > > > those two ideas being different from the processor archetype and the
> > > > > wiki page with the archetype commands?
> > > > >
> > > > > Is it just that people don't know about it?
> > > > >
> > > > > -Bryan
> > > > >
> > > > > [1]
> > > > >
> > > https://cwiki.apache.org/confluence/display/NIFI/Maven+Projects+for+Extensions
> > > > >
> > > > > On Fri, Jan 25, 2019 at 12:23 PM Otto Fowler 
> > > > > wrote:
>

Re: Airflow and NiFi

2019-08-15 Thread James Srinivasan
We use NiFi for all our ETL, and have done so very happily for several
years (including persuading partner organisations to adopt it); we
have recently started using Airflow for scheduling periodic machine
learning algorithm execution and retraining.

My £0.02 - we probably could do everything we currently do in Airflow
using NiFi, but for our Airflow use cases, things like task duration
are interesting metrics to monitor and I don't immediately know how we
would do this with NiFi. Plus the different user communities make two
separate tools no bad thing.

On Thu, 15 Aug 2019 at 21:00, Mike Thomsen  wrote:
>
> Does anyone see any areas where the two can complement each other and where
> we might want to give users the ability to offload processing to Airflow?
> Curious since our poking around the docs led us to conclude it was
> probably more "Airflow vs Spark+Oozie" than really competing with NiFi.


Re: Dependency and skipping tests

2020-09-18 Thread James Srinivasan
-DskipTests will compile but not run tests

You probably want something like -Dmaven.test.skip

See http://maven.apache.org/surefire/maven-surefire-plugin/test-mojo.html#skip
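
To make the difference concrete, here is a minimal sketch of the two invocations. The wrapper function names are made up for illustration; only the two -D properties are standard Maven/Surefire ones. Note that either way Maven may still try to resolve test-scoped dependencies, so a missing test-only artifact in a private repo could still break the build.

```shell
# Hypothetical wrapper functions; only the -D properties are standard Maven ones.

# Compiles main and test sources but does not run the tests (skipTests).
build_without_running_tests() {
  mvn clean install -DskipTests "$@"
}

# Skips compiling the test sources as well as running them (maven.test.skip).
build_without_compiling_tests() {
  mvn clean install -Dmaven.test.skip=true "$@"
}
```

Run either function from the source root; any extra arguments are passed straight through to mvn.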

On Fri, 18 Sep 2020 at 20:58, Mark Bean  wrote:
>
> I had an interesting problem building Apache NiFi (1.11.4) using Java 11
> which resulted in a build failure. The environment is on a private network
> with a dedicated Nexus repository. I used the maven option "-DskipTests".
> It appears test execution was skipped, but maven still attempted to compile
> the source. In the end, the failure was due to a missing dependency from
> the Nexus repo. Specifically, the artifact was a dependency in
> nifi-accumulo-processors, but only required by test classes.
>
> Group: org.openjfx
> Name: javafx.base
> Version: 11.0.0-SNAPSHOT
>
> First, it surprised me that a SNAPSHOT version was allowed. However, since
> it is required only for a test, perhaps this may not be an issue. Can
> someone please confirm?
>
> Secondly, why did maven still attempt to compile test classes even when
> using the option "-DskipTests" ? Is there something in the nifi poms
> overriding the behavior? Or have I just misunderstood what -DskipTests does?
>
> I had the same results on Maven version 3.6.3 and 3.5.0.
>
> Thanks.


Re: Dependency and skipping tests

2020-09-19 Thread James Srinivasan
Presumably Maven still tries to resolve the dependencies even if not
compiling (because there's no difference between submodules for test or
not?)?

On Sat, 19 Sep 2020, 16:44 Mark Bean,  wrote:

> I tried that option too. Same result. This surprised me even further since
> the documentation explicitly says compilation of tests is skipped.
>
> Hopefully, fresh eyes after a weekend will reveal the issue. Any additional
> suggestions are still welcome!
>
> Thanks.
>
> On Fri, Sep 18, 2020 at 4:01 PM James Srinivasan <
> james.sriniva...@gmail.com>
> wrote:
>
> > -DskipTests will compile but not run tests
> >
> > You probably want something like -Dmaven.test.skip
> >
> > See
> >
> http://maven.apache.org/surefire/maven-surefire-plugin/test-mojo.html#skip
> >
> > On Fri, 18 Sep 2020 at 20:58, Mark Bean  wrote:
> > >
> > > I had an interesting problem building Apache NiFi (1.11.4) using Java
> 11
> > > which resulted in a build failure. The environment is on a private
> > network
> > > with a dedicated Nexus repository. I used the maven option
> "-DskipTests".
> > > It appears test execution was skipped, but maven still attempted to
> > compile
> > > the source. In the end, the failure was due to a missing dependency
> from
> > > the Nexus repo. Specifically, the artifact was a dependency in
> > > nifi-accumulo-processors, but only required by test classes.
> > >
> > > Group: org.openjfx
> > > Name: javafx.base
> > > Version: 11.0.0-SNAPSHOT
> > >
> > > First, it surprised me that a SNAPSHOT version was allowed. However,
> > since
> > > it is required only for a test, perhaps this may not be an issue. Can
> > > someone please confirm?
> > >
> > > Secondly, why did maven still attempt to compile test classes even when
> > > using the option "-DskipTests" ? Is there something in the nifi poms
> > > overriding the behavior? Or have I just misunderstood what -DskipTests
> > does?
> > >
> > > I had the same results on Maven version 3.6.3 and 3.5.0.
> > >
> > > Thanks.
> >
>


Re: Nifi Over HTTPS

2020-05-29 Thread James Srinivasan
Have you found the walkthroughs?

https://nifi.apache.org/docs/nifi-docs/html/walkthroughs.html#securing-nifi-with-tls
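
In case it helps, a rough sketch (not taken from the walkthrough) of checking the chain of trust and bundling the key and cert together with openssl. Everything here is a throwaway assumption: the CA, the hostname nifi.example.internal, the "changeit" password and the output file names. The point is only that the keystore needs the private key and its certificate as one entry, not the cert on its own.

```shell
# All names, hostnames and passwords below are placeholders for illustration.
workdir=$(mktemp -d)
cd "$workdir"

# Throwaway internal CA (self-signed).
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
  -keyout ca-key.pem -out ca-cert.pem -subj "/CN=Example Internal CA"

# Server key and certificate signing request.
openssl req -new -newkey rsa:2048 -nodes \
  -keyout nifi-key.key -out nifi.csr -subj "/CN=nifi.example.internal"

# Sign the request with the CA.
openssl x509 -req -in nifi.csr -CA ca-cert.pem -CAkey ca-key.pem \
  -CAcreateserial -out nifi-cert.pem -days 30

# Confirm the server cert chains back to the CA.
openssl verify -CAfile ca-cert.pem nifi-cert.pem

# Bundle private key, server cert and CA cert into one PKCS12 keystore.
openssl pkcs12 -export -in nifi-cert.pem -inkey nifi-key.key \
  -certfile ca-cert.pem -name nifi-key \
  -out keystore.p12 -passout pass:changeit
```

The truststore would then need the CA cert imported separately (e.g. with keytool), and the keystore type in nifi.properties would have to match the file format used.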


On Fri, 29 May 2020, 19:41 MYERS, KYLE,  wrote:

> Hello,
>
> I am trying to set up Nifi over SSL/TLS connections. I am following the
> guide
> https://nifi.apache.org/docs/nifi-docs/html/toolkit-guide.html#tls_intermediate_ca
> .
>
> My scenario is this:
>
> I have an internal CA stood up using openssl, which I am using to sign
> certificates. I have been attempting to set up the keystore and truststore
> but have not had luck establishing trust between nifi and my cert signed by
> my ca. I provide the ca-cert.pem to the trust store and add my
> nifi-cert.pem and nifi-key.key to the keystore but still nothing.
>
> I have verified that the nifi-cert.pem and ca-cert.pem have a chain of
> trust, but nifi will still not trust my cert. Is there any specific
> configurations I need to perform to get things working?
>
> Any help or point in the right direction ASAP would be great.
>
>
> Thank you,
>
> Kyle Myers
> AT CyberSecurity
>


Re: SSLPeerUnverifiedException following upgrade from 1.6.0 to 1.13.2

2021-06-11 Thread James Srinivasan
This is my usual go-to reference for getting SANs working with openssl CSRs:

https://geekflare.com/san-ssl-certificate/

Newer openssl versions apparently allow it on the command line:

https://security.stackexchange.com/questions/74345/provide-subjectaltname-to-openssl-directly-on-the-command-line
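
For what it's worth, a minimal sketch of the command-line route, assuming OpenSSL 1.1.1 or newer (for -addext). The hostname is the one from the log excerpt in this thread; the file names and temp directory are just placeholders.

```shell
# Placeholder file names; hostname taken from the log excerpt in this thread.
workdir=$(mktemp -d)
host=nifi1.domain.blah

# Generate a new key and a CSR that carries a DNS Subject Alternative Name.
# -addext needs OpenSSL 1.1.1 or newer.
openssl req -new -newkey rsa:2048 -nodes \
  -keyout "$workdir/nifi1.key" \
  -out "$workdir/nifi1.csr" \
  -subj "/O=blah/OU=domain/CN=$host" \
  -addext "subjectAltName=DNS:$host"

# Inspect the request and confirm the SAN made it in.
openssl req -in "$workdir/nifi1.csr" -noout -text | grep -A1 "Subject Alternative Name"
```

One thing to watch: depending on how the CA signs the request, extensions in the CSR are not necessarily copied into the issued certificate, so it is worth checking the final cert for the SAN as well.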

On Fri, 11 Jun 2021 at 13:00, Phil H  wrote:
>
> Well, it took a lot of missteps recreating and signing the certs (used the
> wrong CA) and working through all the other issues with SANs, BUT I GOT IT
> WORKING!
>
> Thanks David, and thanks to everyone else that helps out in this group!
> Nifi is so complicated I can’t imagine trying to do this stuff alone!
>
> On Fri, 11 Jun 2021 at 15:09, David Handermann 
> wrote:
>
> > Hi Phil,
> >
> > Thanks for providing the stack trace.  Recent versions of NiFi include
> > updates to the OkHttp library, which modified the hostname verification
> > process.  OkHttp starting with version 3.10.0 made changes to TLS hostname
> > verification, requiring that a certificate contain DNS Subject Alternative
> > Names matching the connection hostname.  Based on the error message, it
> > appears that the certificates configured do not have any Subject
> > Alternative Names, resulting in the SSLPeerUnverifiedException.  Generating
> > or obtaining new certificates that include the required DNS Subject
> > Alternative Names should resolve the problem.
> >
> > Here are the release notes for OkHttp 3.10.0, referencing RFC 2818, which
> > deprecated falling back to certificate common names for hostname
> > verification:
> >
> > https://square.github.io/okhttp/changelog_3x/#version-3100
> >
> > Regards,
> > David Handermann
> >
> > On Thu, Jun 10, 2021 at 11:16 PM Phil H  wrote:
> >
> > > Hi there,
> > >
> > > I upgraded an older dev setup today from 1.6.0 to 1.13.2.  After a
> > > couple of config tweaks, it’s “working”, but if I try and access the
> > > interface at https://nifi2.domain.blah/ I get a message on screen
> > > stating that nifi1.domain.blah is not verified.  The logs contain this
> > > same message, along with the stack trace.  (This also happens if I
> > > access nifi1 – it complains nifi2 is not verified).
> > >
> > > My keystore and truststore on both servers both contain the certs for
> > > both servers, and the truststore additionally contains the CA that
> > > signed both servers’ certificates.
> > >
> > > What am I missing?
> > >
> > > Thanks,
> > > Phil
> > >
> > >
> > >
> > >
> > >
> > > 2021-06-11 23:51:20,970 WARN [Replicate Request Thread-1]
> > > o.a.n.c.c.h.r.ThreadPoolRequestReplicator Failed to replicate request
> > > GET /nifi-api/flow/current-user to nifi1.domain.blah:443 due to
> > > javax.net.ssl.SSLPeerUnverifiedException: Hostname nifi1.domain.blah
> > > not verified:
> > >
> > > certificate: sha256/Wv+eIBMlpsSS95xKF+Fry9C/jQhFbNS35yfJGK92/5U=
> > >
> > > DN: CN=nifi1.domain.blah, OU=domain, O=blah
> > >
> > > subjectAltNames: []
> > >
> > > 2021-06-11 23:51:20,970 WARN [Replicate Request Thread-1]
> > > o.a.n.c.c.h.r.ThreadPoolRequestReplicator
> > >
> > > javax.net.ssl.SSLPeerUnverifiedException: Hostname nifi1.
> > > nifi1.domain.blah not verified:
> > >
> > > certificate: sha256/Wv+eIBMlpsSS95xKF+Fry9C/jQhFbNS35yfJGK92/5U=
> > >
> > > DN: CN=nifi1.domain.blah OU=domain, O=blah
> > >
> > > subjectAltNames: []
> > >
> > > at
> > >
> > >
> > okhttp3.internal.connection.RealConnection.connectTls(RealConnection.kt:389)
> > >
> > > at
> > >
> > >
> > okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.kt:337)
> > >
> > > at
> > > okhttp3.internal.connection.RealConnection.connect(RealConnection.kt:209)
> > >
> > > at
> > >
> > >
> > okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.kt:226)
> > >
> > > at
> > >
> > >
> > okhttp3.internal.connection.ExchangeFinder.findHealthyConnection(ExchangeFinder.kt:106)
> > >
> > > at
> > > okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder.kt:74)
> > >
> > > at
> > > okhttp3.internal.connection.RealCall.initExchange$okhttp(RealCall.kt:255)
> > >
> > > at
> > >
> > >
> > okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:32)
> > >
> > > at
> > >
> > >
> > okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
> > >
> > > at
> > > okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
> > >
> > > at
> > >
> > >
> > okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
> > >
> > > at
> > >
> > okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83)
> > >
> > > at
> > >
> > >
> > okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
> > >
> > > at
> > >
> > >
> > 

Re: [discuss] we need to enable secure by default...

2021-02-10 Thread James Srinivasan
Would a suitably large warning on the http ui be a good starting point?

Browsers are getting increasingly wary of self signed certs and we probably
don't want to be encouraging people to ignore them.

What about easier acme+certbot support? (notwithstanding all the non public
deployments)

On Wed, 10 Feb 2021, 15:25 Otto Fowler,  wrote:

> Aren’t most of these applications installed by an installer?
> Maybe we can have one or more installers that setup a secure instance, and
> those installers
> could be the *production* nifi, and keep the zip distribution open for
> developers?
>
>
> > On Feb 10, 2021, at 10:04, David Handermann 
> wrote:
> >
> > I agree that a generated password is the way to go, and the challenge is
> > making it available to the user.  Depending on how it is stored for the
> > identity provider, having a command line tool to read and print it could
> be
> > a reasonable option.
> >
> > Although this particular thread referenced a specific Twitter post, this
> > general discussion is more of a reflection of the need to make things
> more
> > secure by default as other products have followed similar approaches.
> >
> > Regards,
> > David Handermann
> >
> > On Wed, Feb 10, 2021 at 8:53 AM Kevin Doran  wrote:
> >
> >> I am in favor of requiring some authentication by default.
> >>
> >> I’m in favor of an admin username and generated password. (It sounds
> like
> >> most of us are on the same page, but I don’t think a static, default
> >> password buys us much against the types of abuse shown on the twitter
> >> thread Joe shared.)
> >>
> >> We need some way of making the generated password discoverable… Print to
> >> the logs on first boot? Not great but are there other mechanisms? A
> setup
> >> CLI utility?
> >>
> >> To help not impede automated deployments, maybe we should change the
> >> startup flow such that there is a way to provide this password. That
> would
> >> be better than people just disabling whatever default authentication we
> set.
> >>
> >>
> >>> On Feb 10, 2021, at 09:45, David Handermann <
> exceptionfact...@gmail.com>
> >> wrote:
> >>>
> >>> Mark,
> >>>
> >>> Thanks for clarifying, that makes sense.
> >>>
> >>> Regards,
> >>> David Handermann
> >>>
> >>> On Wed, Feb 10, 2021 at 8:41 AM Mark Payne 
> wrote:
> >>>
>  David,
> 
>  My concern was purely around generating client certs and using mutual
> >> TLS.
>  I definitely think we should have a server cert if using username &
>  password.
> 
>  Thanks
>  -Mark
> 
> 
> > On Feb 10, 2021, at 9:37 AM, David Handermann <
>  exceptionfact...@gmail.com> wrote:
> >
> > Mark,
> >
> > Thanks for your perspective on certificate generation, I agree that
> > troubleshooting certificates can be confusing due to unclear error
> > messaging.  Generating self-signed certificates for one-way TLS still
> > requires the user to accept the certificate presented, but at least
> >> that
>  is
> > more common in various products.  Are you concerned about that
> >> situation,
> > or are you more concerned about the additional challenges of mutual
> TLS
> > authentication?
> >
> > Generating a random password in absence of any other configuration
> >> would
> > certainly be easier from a new user perspective.  Some of the
> security
> > concerns with password authentication can be mitigated with one-way
> >> TLS,
>  so
> > a blending of these approaches, as Joe describes in NIFI-8220, seems
>  like a
> > good way to go.
> >
> > Regards,
> > David Handermann
> >
> > On Wed, Feb 10, 2021 at 8:23 AM Mark Payne 
> >> wrote:
> >
> >> I would be in favor of this as well. I agree that there is no need
> to
>  wait
> >> for a 2.0 version - there is no inherent backward compatibility
> >> challenge
> >> here.
> >> Requiring that new application configs be set, etc. I think is
>  completely
> >> fair game between minor versions.
> >>
> >> Personally, though, I would very much stray away from
> auto-generating
> >> certificates. For those of us who have dealt with certificates
>  significantly
> >> In the past, minor configuration issues are easy to address. But for
> >> someone who hasn’t spent much time dealing with certificates, the
> >> error
> >> messages
> >> that are often intentionally vague can quickly result in users being
> >> overwhelmed. This is especially true if browser configurations are
>  already
> >> setup to
> >> automatically offer certificates that area already installed - this
> >> can
> >> result in weird error messages about untrusted certificates when the
>  user
> >> thinks
> >> they haven’t even provided a certificate, etc. It gets really hairy.
> >>
> >> I am more in favor of a default username/password personally. It
> would
> >> require implementing a new auth provider. And 

Re: Missing maven dependencies when building nifi

2022-04-05 Thread James Srinivasan
Possibly not your problem, but did you follow the instructions for building
on Windows here:

https://nifi.apache.org/quickstart.html

On Tue, 5 Apr 2022, 12:42 Phil H,  wrote:

> Okay, I started over with a completely new local repo.  This is the exact
> sequence of commands I ran:
>
>  $ git clone https://github.com/apache/nifi.git
>  $ cd nifi
>  $ git remote add upstream https://github.com/apache/nifi.git
>  $ git clean -fxd
>  $ mvn clean package verify -Pcontrib-check,include-grpc
>
> And after about 12 minutes:
>
>  [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test (default-test)
> on project nifi-web-security: There are test failures.
>  [ERROR]
>  [ERROR] Please refer to
>
> D:\nifi\nifi-nar-bundles\nifi-framework-bundle\nifi-framework\nifi-web\nifi-web-security\target\surefire-reports
> for the individual test results.
>  [ERROR] Please refer to dump files (if any exist) [date].dump,
> [date]-jvmRun[N].dump and [date].dumpstream.
>
> The contents of the dumpfile at that location is:
>
>  # Created at 2022-04-05T21:05:18.979
>  Boot Manifest-JAR contains absolute paths in classpath 'D:\nifi\nifi-nar-bundles\nifi-framework-bundle\nifi-framework\nifi-web\nifi-web-security\target\test-classes'
>  Hint: -Djdk.net.URLClassPath.disableClassPathURLCheck=true
>
> I'm sure there's some obvious mistake I'm making, or piece of information I
> am missing, but I have no idea what they are!
>
> Thanks
> Phil
>
> On Tue, Apr 5, 2022 at 12:53 AM Joe Witt  wrote:
>
> > Phil
> >
> > I think you need to run 'git clean -fxd' from the nifi source root.
> >
> > Then try the build again.  And allow the entire thing to run.
> >
> > Too many builds at different levels are leaving things around.
> >
> > Thanks
> >
> > On Mon, Apr 4, 2022 at 7:46 AM Phil H  wrote:
> >
> > > So, about that error that flashed past...
> > >
> > > > [INFO]
> > > > [INFO] BUILD FAILURE
> > > > [INFO]
> > > > [INFO] Total time:  22:21 min (Wall Clock)
> > > > [INFO] Finished at: 2022-04-05T00:37:46+10:00
> > > > [INFO]
> > > > [ERROR] Failed to execute goal org.apache.rat:apache-rat-plugin:0.13:check (default) on project nifi-poi-processors: Too many files with unapproved license: 1 See RAT report in: D:\nifi.github\nifi\nifi-nar-bundles\nifi-poi-bundle\nifi-poi-processors\target\rat.txt -> [Help 1]
> > >
> > > The rat file contains the following:
> > >
> > > *
> > > Summary
> > > ---
> > > Generated at: 2022-04-05T00:33:56+10:00
> > >
> > > Notes: 0
> > > Binaries: 4
> > > Archives: 0
> > > Standards: 7
> > >
> > > Apache Licensed: 6
> > > Generated Documents: 0
> > >
> > > JavaDocs are generated, thus a license header is optional.
> > > Generated files do not require license headers.
> > >
> > > 1 Unknown Licenses
> > >
> > > *
> > >
> > > > Files with unapproved licenses:
> > > >
> > > >   D:/nifi.github/nifi/nifi-nar-bundles/nifi-poi-bundle/nifi-poi-processors/src/test/resources/Unsupported.xls
> > >
> > > *
> > >
> > > *
> > > >   Files with Apache License headers will be marked AL
> > > >   Binary files (which do not require any license headers) will be marked B
> > > >   Compressed archives will be marked A
> > > >   Notices, licenses etc. will be marked N
> > > >   AL  D:/nifi.github/nifi/nifi-nar-bundles/nifi-poi-bundle/nifi-poi-processors/pom.xml
> > > >   AL  D:/nifi.github/nifi/nifi-nar-bundles/nifi-poi-bundle/nifi-poi-processors/src/main/java/org/apache/nifi/processors/poi/ConvertExcelToCSVProcessor.java
> > > >   AL  D:/nifi.github/nifi/nifi-nar-bundles/nifi-poi-bundle/nifi-poi-processors/src/main/resources/docs/org.apache.nifi.processors.poi.ConvertExcelToCSVProcessor/additionalDetails.html
> > > >   AL  D:/nifi.github/nifi/nifi-nar-bundles/nifi-poi-bundle/nifi-poi-processors/src/main/resources/META-INF/services/org.apache.nifi.processor.Processor
> > > >   AL  D:/nifi.github/nifi/nifi-nar-bundles/nifi-poi-bundle/nifi-poi-processors/src/test/java/org/apache/nifi/processors/poi/ConvertExcelToCSVProcessorTest.java
> > > >   B   D:/nifi.github/nifi/nifi-nar-bundles/nifi-poi-bundle/nifi-poi-processors/src/test/resources/CollegeScorecard.xlsx
> > > >   B   D:/nifi.github/nifi/nifi-nar-bundles/nifi-poi-bundle/nifi-poi-processors/src/test/resources/dataformatting.xlsx
> > >   AL
> > >
> > >
> >
> 

Re: [discuss] NiFi support for Hadoop ecosystem components

2023-03-24 Thread James Srinivasan
I'm a Hadoop and NiFi user without vendor support, so unsurprisingly I'm not
keen on #1, but then relying on community support and development is always
going to be a risk for us. If it came to it, we'd probably stop using NiFi
rather than pay a vendor, which would be a real shame.

Are certain Hadoop processors more maintenance-heavy than others? It's a
rather wide ecosystem.

On Fri, 24 Mar 2023, 18:07 Joe Witt,  wrote:

> Team,
>
> For the entire time NiFi has been in Apache, we've built in support for
> various Hadoop ecosystem components like HDFS, Hive, HBase, and others,
> and more recently the formats/serialization modes necessary for
> Parquet, ORC, Iceberg, etc.
>
> All of these, however, present endless compatibility challenges
> across different versions (Hive being the most difficult by far) and
> vendors (Hadoop vendors, cloud vendors, etc.).  Also notable is the
> incredible number of dependencies, dependency conflicts,
> inclusions/exclusions, old log libs, vulnerability updates, etc.  And
> last but certainly not least, they are a big reason why our build has
> grown so much.
>
> We have a couple options:
> 1. Deprecate these components in NiFi 1.x and drop them entirely in
> NiFi 2.x.  Leave this as a problem for vendors to deal with.  NiFi
> users interacting with such components are nearly exclusively doing so
> with vendors anyway.
>
> 2. Remove the components from NiFi main code line and create a
> separate repo for 'nifi-hadoop-extensions'.  We manage those
> independently and release them periodically.  They would be available
> for people to grab the nars if they want to use them.  We include none
> of them in the convenience binary going forward by default.
>
> 3. Change nothing.  Continue to battle with the above listed items.
> This is admittedly a bit of a non-option.  We can't keep spending the
> same time/energy on these that we have been.  It is a very small number
> of people that fight this battle.
>
> Look forward to hearing thoughts on these options or others we might
> consider.
>
> Thanks
>