RE: [EXT] Suggestion: Apache NiFi component enhancement

2018-04-12 Thread Peter Wicks (pwicks)
I think this is a good idea. But based on your example I think you would want 
to provide a primary Type along with a list of Alias types.
If NiFi starts and can no longer find a processor by the Type name it had in 
the flow.xml, it can check the annotations/aliases to see if it's been renamed. 
This would allow for easy renames.

Example 1: NiFi can no longer find AzureDocumentDBProcessor. The developer 
renamed it to CosmosDBProcessor. In this case we don't really want the type to 
still say "DocumentDB"; that's just confusing. Also, we might not want the type 
named CosmosDBProcessor. So we make the Type something nice, like "Azure 
Cosmos DB", then add Aliases for "AzureDocumentDBProcessor" and 
"CosmosDBProcessor".

Next year when Microsoft renames it "CelestialDB" we can rename the processor 
and add another alias.

Something like that?
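In code, the pairing might look roughly like this. Note that the @ComponentType annotation and the lookup helper below are hypothetical illustrations of the idea, not existing NiFi API:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;

// Hypothetical annotation: a primary display type plus the old type names
// (aliases) that flow.xml entries may still reference after a rename.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ComponentType {
    String value();                 // e.g. "Azure Cosmos DB"
    String[] aliases() default {};  // former names, e.g. "AzureDocumentDBProcessor"
}

@ComponentType(value = "Azure Cosmos DB",
        aliases = {"AzureDocumentDBProcessor", "CosmosDBProcessor"})
class CosmosDBProcessor { }

public class AliasLookup {
    // Resolve a type name found in flow.xml against a class's primary type
    // or any of its declared aliases, falling back to the plain class name
    // when the annotation is absent.
    static boolean matches(Class<?> clazz, String flowXmlType) {
        ComponentType ct = clazz.getAnnotation(ComponentType.class);
        if (ct == null) {
            return clazz.getSimpleName().equals(flowXmlType);
        }
        return ct.value().equals(flowXmlType)
                || Arrays.asList(ct.aliases()).contains(flowXmlType);
    }

    public static void main(String[] args) {
        // An old flow.xml referencing the pre-rename type still resolves.
        System.out.println(matches(CosmosDBProcessor.class, "AzureDocumentDBProcessor")); // true
        System.out.println(matches(CosmosDBProcessor.class, "UnrelatedProcessor"));       // false
    }
}
```

Next year's rename would then just update value() and append one more alias.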

-Original Message-
From: Sivaprasanna [mailto:sivaprasanna...@gmail.com] 
Sent: Wednesday, April 11, 2018 23:37
To: dev@nifi.apache.org
Subject: [EXT] Suggestion: Apache NiFi component enhancement

All,

Currently the "type" of a component is actually the component's canonical class 
name, which gets rendered in the UI as the class name with the component 
version. This is good. However, I think it would be better to have an annotation 
that a developer can use to override the component type.

How is it used?
I think an annotation would be sufficient. The framework checks whether the 
annotation is present; if it is, it uses the name provided there, otherwise it 
uses the class name as it does today.

Why and where is it needed?

   - In scenarios where we devise a new naming convention and want to apply
   it to older components without breaking backward compatibility.
   - A developer created a component class with one name, but later down
   the line the developer or someone else wants to change it to something
   else; the reason could again be a naming convention, or just that the new
   name makes more sense.
   - A component was built to work with third-party tech (e.g. the Azure,
   MongoDB, S3, and Druid processors), but that tech was later renamed by its
   original creators. (Something similar happened to Azure's DocumentDB,
   which was later rebranded as Azure Cosmos DB.) In such cases, this can be
   used without deprecating the processor or rebuilding a new one.

Before creating a JIRA, I wanted to get the community's thoughts. Feel free to 
share your thoughts, concerns. If everything seems fine, I'll start working on 
the implementation.

-

Sivaprasanna


RE: (NiFi 1.6) Funnels with no outgoing relationship filling my app log

2018-04-12 Thread Peter Wicks (pwicks)
https://issues.apache.org/jira/browse/NIFI-5075

From: Peter Wicks (pwicks)
Sent: Thursday, April 12, 2018 14:30
To: 'dev@nifi.apache.org' 
Subject: (NiFi 1.6) Funnels with no outgoing relationship filling my app log

I just upgraded one of my servers to NiFi 1.6.0. I have a couple of funnels 
that just dead-end: FlowFiles come in but never go anywhere after that. 
Mostly I use this for troubleshooting/validation. The inbound relationships all 
have expiration times, and it's a quick way for me to inspect the output of a 
processor on the fly.

These funnels are filling up my logs with errors saying they can't output to 
Relationship '' (see error log below). If I attach the funnel to another 
downstream processor then everything is fine. I went back and tested on my 
1.5.0 server and did not see the errors.

I briefly looked through the code, but the bug didn't jump out at me.
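For what it's worth, the failure mode suggests the funnel is attempting a transfer even when it has no outgoing connection. A hypothetical sketch of the missing guard (not the actual StandardFunnel code):

```java
import java.util.Collections;
import java.util.List;

public class FunnelGuardSketch {
    // Pick the relationship to transfer to, given the funnel's outgoing
    // connections. With no outgoing connection there is nothing to transfer
    // to, so return null and let the caller yield instead of calling
    // transfer() with a blank relationship name.
    static String transferTarget(List<String> outgoingRelationships) {
        if (outgoingRelationships.isEmpty()) {
            return null; // caller should yield; queued flowfiles just age out
        }
        return outgoingRelationships.get(0);
    }

    public static void main(String[] args) {
        System.out.println(transferTarget(Collections.emptyList())); // null -> yield, no error
        System.out.println(transferTarget(List.of("output")));       // output
    }
}
```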

2018-04-11 23:53:28,066 ERROR [Timer-Driven Process Thread-31] 
o.apache.nifi.controller.StandardFunnel 
StandardFunnel[id=b868231c-0162-1000-571c-ae3e7d15d848] 
StandardFunnel[id=b868231c-0162-1000-571c-ae3e7d15d848] failed to process 
session due to java.lang.RuntimeException: java.lang.IllegalArgumentException: 
Relationship '' is not known; Processor Administratively Yielded for 1 sec: 
java.lang.RuntimeException: java.lang.IllegalArgumentException: Relationship '' 
is not known
java.lang.RuntimeException: java.lang.IllegalArgumentException: Relationship '' 
is not known
    at org.apache.nifi.controller.StandardFunnel.onTrigger(StandardFunnel.java:365)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Relationship '' is not known
    at org.apache.nifi.controller.repository.StandardProcessSession.transfer(StandardProcessSession.java:1935)
    at org.apache.nifi.controller.StandardFunnel.onTrigger(StandardFunnel.java:379)
    at org.apache.nifi.controller.StandardFunnel.onTrigger(StandardFunnel.java:358)
    ... 9 common frames omitted


Thanks,
  Peter


Nifi UI improvement

2018-04-12 Thread Jorge Machado
Hi guys, 

Is there any effort to improve the UI? One nice feature would be the ability 
to limit searches on the canvas. Let's say we are inside a process group and 
we only want to search from there on.

Regards

Jorge Machado


Re: Nifi Parallel Execution

2018-04-12 Thread Eric Ulicny
We have attempted to use the distributed map cache with the DetectDuplicate
processor as recommended, to no avail. The first time two identical flowfiles
are sent simultaneously, they both make it through to the non-duplicate
relationship. After that point they are appropriately detected.

In particular, we are testing with GenerateFlowFile on a two-node cluster.
We extract the SHA key and use that when detecting duplicates.

-Eric
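The symptom Eric describes is consistent with a check-then-put race: both nodes ask the cache whether the key exists before either has written it. A sketch of the difference, using a local ConcurrentHashMap as a stand-in for the distributed cache:

```java
import java.util.concurrent.ConcurrentHashMap;

public class DedupSketch {
    private final ConcurrentHashMap<String, Boolean> cache = new ConcurrentHashMap<>();

    // Non-atomic check-then-put: two callers can both observe "absent"
    // before either writes, so two identical flowfiles both pass as new.
    boolean racyIsDuplicate(String key) {
        if (cache.containsKey(key)) {
            return true;
        }
        cache.put(key, Boolean.TRUE);
        return false;
    }

    // Atomic put-if-absent: exactly one caller wins the insert, regardless
    // of timing, so a simultaneous duplicate is always caught.
    boolean atomicIsDuplicate(String key) {
        return cache.putIfAbsent(key, Boolean.TRUE) != null;
    }

    public static void main(String[] args) {
        DedupSketch d = new DedupSketch();
        System.out.println(d.atomicIsDuplicate("sha-abc")); // false: first sighting
        System.out.println(d.atomicIsDuplicate("sha-abc")); // true: duplicate
    }
}
```

Whether the server-side cache honors that atomicity across two concurrent clients is the thing to verify here.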

On Tue, Apr 10, 2018, 1:32 PM Bryan Bende  wrote:

> Hello,
>
> DetectDuplicate uses a DistributedMapCacheClientService which would be
> connecting to a DistributedMapCacheServer on one of your nodes.
>
> So all nodes should be connecting to the same cache server which is
> where the information about previously seen data is stored.
>
> -Bryan
>
> On Tue, Apr 10, 2018 at 1:24 PM, Eric Ulicny  wrote:
> > Hello,
> >
> > We have a use case where we execute processors on all nodes but would
> like
> > to use the detect duplicate processor to ensure records are unique. We
> are
> > observing that we must run it on one node to truly detect duplicates. Is
> > there any way to merge flowfiles from all running executors?
> >
> > -Eric
>


Calling getLogger() from @OnScheduled, @OnStopped, etc.

2018-04-12 Thread Russell Bateman
I seem to crash the NiFi JUnit test runner when I have code that calls 
getLogger(), or attempts to make use of the product of calling 
getLogger(), in situations where some context (probably) is missing, such as 
methods annotated for invocation at "special" times. This makes sense, but is 
there a technique I can use to profile my custom processor in order to 
observe easily (for example, using the logger) the behavior of (i.e., 
log-TRACE through) my processor with respect to the @OnScheduled, 
@OnUnscheduled, @OnStopped, etc. moments?


Many thanks,
Russ


Re: Nifi UI improvement

2018-04-12 Thread Scott Aslan
Hello Jorge,

There are many ongoing efforts to improve the NiFi UI, and you can see a list
of them here:
https://issues.apache.org/jira/browse/NIFI-5066?jql=project%20%3D%20NIFI%20AND%20component%20%3D%20%22Core%20UI%22.
Please feel free to continue this discussion to determine the scope of the
effort you are envisioning and to gather feedback from the community about
what other users may want; then you can create a JIRA issue that will be
placed on the backlog.

-Scotty

On Thu, Apr 12, 2018 at 3:33 AM, Jorge Machado  wrote:

> Hi guys,
>
> Is there any effort to improve the UI ? One nice feature to have would be:
> Have the possibility to limit searches on a canvas. Let’s say we are inside
> a processor group and we only want to search from there on.
>
> Regards
>
> Jorge Machado
>


Ongoing support for a contribution

2018-04-12 Thread Anthony Roach
Hi,

I am very new to the open source world.  My organization is preparing to submit 
a NiFi processor to load data into our commercial NoSQL database.  Once the 
review and commit are completed, what are our responsibilities for managing 
issues?  Do we monitor the issues distribution list?

Thanks...
___
Anthony Roach
Product Manager
MarkLogic Corporation
Desk: +1 650 287 2587
Mobile: +1 415 368 6460
www.marklogic.com



Re: Data migration in sequence

2018-04-12 Thread Joe Witt
Anil

What is a little hard to tell here is do you mean:

1) I have to transfer data from system A to system B where there are
10 steps I want to execute in NiFi between A and B.
-or-
2) I have to transfer data from systems A,B,C,D,E,F,G,H,I,J to system
Z where first I want to load data from system A to Z, then B to Z,
then C to Z and so on.

If it is #1 then of course it is easy.
If it is #2 then I believe you'd want a series of Wait/Notify
processors, where B waits for the notification signal from A, C from B, and
so on.

Thanks

On Thu, Apr 12, 2018 at 4:16 PM, Mike Thomsen  wrote:
> It'd be very helpful to know what the input and output processors are for
> each of the flows because they have different capabilities for triggering
> them.
>


Re: [EXT] Suggestion: Apache NiFi component enhancement

2018-04-12 Thread Sivaprasanna
No, my suggestion was a simpler approach. It affects only the UI aspect,
as my intention is just to override how the 'type' gets rendered in the UI.
For example, a processor's type is set to its canonical class name
(DtoFactory.java#createProcessorDto), but rather than getting the canonical
class name, we could get it from some other method that checks whether the new
annotation is present: if it is, set the value provided in the annotation as
the type; if not, set the canonical class name just like now. Again, my
intention is to affect only the UI, so as to avoid unnecessary
complication that could pose backwards compatibility issues.
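A minimal sketch of that lookup; the @DisplayType annotation name is hypothetical, but the fallback mirrors the canonical-class-name behavior described above:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical annotation for overriding the type shown in the UI.
@Retention(RetentionPolicy.RUNTIME)
@interface DisplayType {
    String value();
}

@DisplayType("Azure Cosmos DB")
class AzureDocumentDBProcessor { }

class PlainProcessor { }

public class TypeNameResolver {
    // Use the annotation's value when present; otherwise fall back to the
    // class name, matching today's behavior. Only the rendered string
    // changes; nothing about how the component is loaded is affected.
    static String resolveType(Class<?> clazz) {
        DisplayType dt = clazz.getAnnotation(DisplayType.class);
        return dt != null ? dt.value() : clazz.getName();
    }

    public static void main(String[] args) {
        System.out.println(resolveType(AzureDocumentDBProcessor.class)); // Azure Cosmos DB
        System.out.println(resolveType(PlainProcessor.class));           // PlainProcessor
    }
}
```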

-
Sivaprasanna

On Thu, Apr 12, 2018 at 1:35 PM, Peter Wicks (pwicks) 
wrote:

> I think this is a good idea. But based on your example I think you would
> want to provide a primary Type along with a list of Alias types.
> If NiFi starts and can no longer find a processor by the Type name it had
> in the flow.xml, it can check the annotations/aliases to see if it's been
> renamed. This would allow for easy renames.
>
> Example 1: NiFi can no longer find AzureDocumentDBProcessor. The developer
> renamed it to CosmosDBProcessor. In this case we don't really want the type
> to still say "DocumentDB"; that's just confusing. Also, we might not want
> the type named CosmosDBProcessor. So we make the Type something nice,
> like "Azure Cosmos DB", then add Aliases for "AzureDocumentDBProcessor" and
> "CosmosDBProcessor".
>
> Next year when Microsoft renames it "CelestialDB" we can rename the
> processor and add another alias.
>
> Something like that?
>


Data migration in sequence

2018-04-12 Thread Anil Rai
Experts, below is my requirement.

I have to do a one-time data load from system A to B.
I have to transfer 10 different data sets.
These transfers have to happen in sequence: 1 followed by 2 followed by 3,
and so on till 10.

We have created separate process groups for each of the data set migrations.
How can we achieve the run sequence once I trigger the first flow?
I know this can be achieved by putting few flags in cache and checking them
in subsequen


Re: Data migration in sequence

2018-04-12 Thread Mike Thomsen
It'd be very helpful to know what the input and output processors are for
each of the flows because they have different capabilities for triggering
them.

On Thu, Apr 12, 2018 at 12:03 PM Anil Rai  wrote:

> sorry, sent email in between.
> This can be achieved by putting a few flags in cache and checking them in
> subsequent flows. But this is not elegant.
> Is there any other way to make these flows run in sequence?
>
> Thanks
> Anil
>


Re: Data migration in sequence

2018-04-12 Thread Anil Rai
Thanks Mike and Joe for the quick response.
Hi Mike, NiFi is making API calls to the source -> pull data -> map -> post
data to the target via API. GenerateFlowFile triggers the flow. Right now I
manually run one process group after the previous one is complete.
Hi Joe, the systems are the same, A and B. It's just that I will be pulling
different data sets, and that has to be done in sequence. For example, I will
load customers, followed by say orders, followed by items, etc.
The use case is a one-time data load from system A to B of different data
sets that needs to happen in sequence. The trigger has to fire only once, to
start the first one.
Hope that helps.

Regards
Anil


On Thu, Apr 12, 2018 at 12:26 PM, Joe Witt  wrote:

> Anil
>
> What is a little hard to tell here is do you mean:
>
> 1) I have to transfer data from system A to system B where there are
> 10 steps I want to execute in NiFi between A and B.
> -or-
> 2) I have to transfer data from systems A,B,C,D,E,F,G,H,I,J to system
> Z where first I want to load data from system A to Z, then B to Z,
> then C to Z and so on.
>
> If it is #1 then of course it is easy.
> If it is #2 then I believe you'd want a series of Wait/Notify
> processors, where B waits for the notification signal from A, C from B, and
> so on.
>
> Thanks


Restarting NiFi causing SiteToSiteBulletinReportingTask to fail

2018-04-12 Thread Woodhead, Chad
I am running HDF 3.0.1.1, which comes with NiFi 1.2.0.3.0.1.1-5. We are using 
SiteToSiteBulletinReportingTask to monitor bulletins (for things like Disk 
Usage and Memory Usage). When we restart NiFi via Ambari (either with a Restart 
or a Stop and then Start), the SiteToSiteBulletinReportingTask no longer works 
once NiFi comes back up. It throws the following error when it is first trying 
to start up:

SiteToSiteBulletinReportingTask[id=ba6b4499-0162-1000--3ccd7573] 
org.apache.nifi.remote.client.PeerSelector@34e976af Unable to refresh Remote 
Group's peers due to response code 409:Conflict with explanation: null

No matter how long we wait, it never works. The ways I have been able to get it 
to start working again are as follows:

  *   Stop and then Start the Remote Input Port the 
SiteToSiteBulletinReportingTask is using
  *   Delete the SiteToSiteBulletinReportingTask and create a new one
  *   Wait a while and stop and start the SiteToSiteBulletinReportingTask 
(however this doesn't work consistently)

I have tested the same flow steps using a process that uses a Remote Process 
Group and a different Remote Input Port, and that RPG throws the same error 
when first coming up but then starts working after a period of time. So maybe 
the SiteToSiteBulletinReportingTask isn't trying enough times to connect to the 
Remote Input Port?

Sincerely,
Chad Woodhead


Re: Data migration in sequence

2018-04-12 Thread Mike Thomsen
> Hi Joe, The systems are the same. A and B. Just that i will be pulling
> different data set and that has to be done in sequence

If your input is JSON that is being sent to the originating API to get
different data sets, what you can do is put all of the GenerateFlowFile
instances' content into one big JSON array and have SplitJson split the
array into multiple flowfiles using the JSONPath statement "$" without the
quotes.

On Thu, Apr 12, 2018 at 12:56 PM Anil Rai  wrote:

> Thanks Mike and Joe for the quick response.
> Hi Mike, Nifi is making api calls to source -> pull data -> Maps -> Post
> data to target via API. Generate flow file triggers the flow. Right now i
> manually run 1 process group after the previous one is complete.
> Hi Joe, The systems are the same. A and B. Just that i will be pulling
> different data set and that has to be done in sequence. Example, I will
> load customer followed by say order followed by items etc.
> The use case is one time data load from system A and B of different data
> sets that needs to happen in sequence. Trigger has to be only once to start
> the first one.
> Hope that helps.
>
> Regards
> Anil


Re: Calling getLogger() from @OnScheduled, @OnStopped, etc.

2018-04-12 Thread Russell Bateman

Thanks for responding, Andy.

I am able to use it, like you, in onTrigger(). Where I haven't been able 
to use it is from annotated methods (in the sense that onTrigger() isn't 
annotated except by @Override, which is not relevant to this question). 
Imagine:


public class Fun extends AbstractProcessor
{
  private ComponentLog logger = getLogger();

  @Override
  public void onTrigger( final ProcessContext context, final ProcessSession session ) throws ProcessException
  {
    logger.trace( "[PROFILE] onTrigger()" );    /* A */
    ...
  }

  @OnScheduled
  public void processProperties( final ProcessContext context )
  {
    logger.trace( "[PROFILE] processProperties()" );    /* B */
    ...
  }

  @OnStopped
  public void dropEverything()
  {
    logger.trace( "[PROFILE] dropEverything()" );    /* C */
    ...
  }
  ...
}


Now, imagining suitable test code, FunTest.test(), which sets up

   runner = TestRunners.newTestRunner( processor = new Fun() );

etc., then

   runner.run( 1 );

Above, instance A works fine (it's the one you illustrated in footnote 
[1]). Instances B and C cause the error:


   java.lang.AssertionError: Could not invoke methods annotated with
   @OnScheduled (or @OnStopped) annotation due to:
   java.lang.reflect.InvocationTargetException

Russ

On 04/12/2018 02:52 PM, Andy LoPresto wrote:

Hi Russ,

Are you saying the code that breaks is having “getLogger()” executed 
inside one of the processor lifecycle methods (i.e. 
GetFile#onTrigger()) or in your test code (i.e. 
GetFileTest#testOnTriggerShouldReadFile())?


I’m not aware of anything with the JUnit runner that would cause 
issues here. I use the loggers extensively in both my application code 
[1] and the tests [2]. Obviously in the tests, I instantiate a new 
Logger instance for the test class.


Can you share an example of the code that breaks this for you?

[1] 
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/EncryptContent.java#L511
[2] 
https://github.com/apache/nifi/pull/2628/files#diff-e9cfa232683ae75b1fc505d6c9bd3b24R447


Andy LoPresto
alopre...@apache.org 
/alopresto.apa...@gmail.com /
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69



Re: Calling getLogger() from @OnScheduled, @OnStopped, etc.

2018-04-12 Thread Matt Burgess
I’m AFK right now but I’m thinking there is a point in the lifecycle where you 
can expect a logger, and before that (like when the processor is instantiated 
as you have it coded), you may not be able to assume there is a logger (either 
in real NiFi or the mock-verse :)

My guess is the rule of thumb is that if you have a Context (either 
ProcessContext, ProcessorInitializationContext, ConfigurationContext, etc) then 
you can assume the logger is available, and just call getLogger() there.

Sent from my iPhone

> On Apr 12, 2018, at 5:27 PM, Russell Bateman  wrote:
> 
> Thanks for responding, Andy.
> 
> I am able to use it, like you, in onTrigger(). Where I haven't been able to 
> use it is from annotated methods (in the sense that onTrigger() isn't 
> annotated except by @Override, which is not relevant to this question). Imagine:
> 
> public class Fun extends AbstractProcessor
> {
>   private ComponentLog logger = getLogger();
> 
>   @Override
>   public void onTrigger( final ProcessContext context, final ProcessSession
> session ) throws ProcessException
>   {
>     logger.trace( "[PROFILE] onTrigger()" );    /* A */
>     ...
>   }
> 
>   @OnScheduled
>   public void processProperties( final ProcessContext context )
>   {
>     logger.trace( "[PROFILE] processProperties()" );    /* B */
>     ...
>   }
> 
>   @OnStopped
>   public void dropEverything()
>   {
>     logger.trace( "[PROFILE] dropEverything()" );    /* C */
>     ...
>   }
>   ...
> }
> 
> 
> Now, imagining suitable test code, FunTest.test(), which sets up
> 
>   runner = TestRunners.newTestRunner( processor = new Fun() );
> 
> etc., then
> 
>   runner.run( 1 );
> 
> Above, instance A works fine (it's the one you illustrated in footnote [1]). 
> Instances B and C cause the error:
> 
>   java.lang.AssertionError: Could not invoke methods annotated with
>   @OnScheduled (or @OnStopped) annotation due to:
>   java.lang.reflect.InvocationTargetException
> 
> Russ
> 


Re: Calling getLogger() from @OnScheduled, @OnStopped, etc.

2018-04-12 Thread Bryan Bende
The example processor you showed won’t work because you are calling
getLogger() inline as part of the variable declaration.

The logger is given to the processor in an init method which hasn’t been
called yet at that point, so that is assigning null to the variable.

Generally you should just call getLogger() whenever it is needed, or you
could assign it to a variable from inside OnScheduled.
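To make Bryan's timing point concrete, here is a runnable toy that mimics the lifecycle with stand-in classes (MockLogger and MockAbstractProcessor are stand-ins, not NiFi API): the field initializer runs during construction, before the framework's init call injects the logger, so the field captures null.

```java
// Minimal stand-ins mimicking the NiFi lifecycle, to show why a field
// initializer calling getLogger() sees null.
class MockLogger {
    void trace(String msg) { System.out.println("TRACE " + msg); }
}

abstract class MockAbstractProcessor {
    private MockLogger logger;                         // injected later by the framework
    final void initialize() { this.logger = new MockLogger(); }
    protected final MockLogger getLogger() { return logger; }
}

public class LoggerLifecycle extends MockAbstractProcessor {
    // BUG pattern: this runs during construction, before initialize(),
    // so eagerLogger is permanently null.
    final MockLogger eagerLogger = getLogger();

    void onScheduled() {
        // FIX pattern: fetch the logger when the method actually runs,
        // by which point the framework has injected it.
        MockLogger logger = getLogger();
        logger.trace("[PROFILE] onScheduled()");
    }

    public static void main(String[] args) {
        LoggerLifecycle p = new LoggerLifecycle();
        System.out.println(p.eagerLogger == null);  // true: the NPE source
        p.initialize();                             // framework injects logger
        p.onScheduled();                            // now safe
    }
}
```

The InvocationTargetException in the test runner would then just be the NullPointerException from the stale field, wrapped by reflection.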

On Thu, Apr 12, 2018 at 5:28 PM Russell Bateman 
wrote:

> Thanks for responding, Andy.
>
> I am able to use it, like you, in onTrigger(). Where I haven't been able
> to use it is from annotated methods (in the sense that onTrigger() isn't
> annotated except by @Override, which is not relevant to this question).
> Imagine:
>
> public class Fun extends AbstractProcessor
> {
>    private ComponentLog logger = getLogger();
>
>    @Override
>    public void onTrigger( final ProcessContext context, final
> ProcessSession session ) throws ProcessException
>    {
>      logger.trace( "[PROFILE] onTrigger()" );    /* A */
>      ...
>    }
>
>    @OnScheduled
>    public void processProperties( final ProcessContext context )
>    {
>      logger.trace( "[PROFILE] processProperties()" );    /* B */
>      ...
>    }
>
>    @OnStopped
>    public void dropEverything()
>    {
>      logger.trace( "[PROFILE] dropEverything()" );    /* C */
>      ...
>    }
>    ...
> }
>
>
> Now, imagining suitable test code, FunTest.test(), which sets up
>
> runner = TestRunners.newTestRunner( processor = new Fun() );
>
> etc., then
>
> runner.run( 1 );
>
> Above, instance A works fine (it's the one you illustrated in footnote
> [1]). Instances B and C cause the error:
>
> java.lang.AssertionError: Could not invoke methods annotated with
> @OnScheduled (or @OnStopped) annotation due to:
> java.lang.reflect.InvocationTargetException
>
> Russ
>
> On 04/12/2018 02:52 PM, Andy LoPresto wrote:
> > Hi Russ,
> >
> > Are you saying the code that breaks is having “getLogger()” executed
> > inside one of the processor lifecycle methods (i.e.
> > GetFile#onTrigger()) or in your test code (i.e.
> > GetFileTest#testOnTriggerShouldReadFile())?
> >
> > I’m not aware of anything with the JUnit runner that would cause
> > issues here. I use the loggers extensively in both my application code
> > [1] and the tests [2]. Obviously in the tests, I instantiate a new
> > Logger instance for the test class.
> >
> > Can you share an example of the code that breaks this for you?
> >
> > [1]
> >
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/EncryptContent.java#L511
> > [2]
> >
> https://github.com/apache/nifi/pull/2628/files#diff-e9cfa232683ae75b1fc505d6c9bd3b24R447
> >
> > Andy LoPresto
> > alopre...@apache.org
> > alopresto.apa...@gmail.com
> > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> >
> >> On Apr 12, 2018, at 3:46 PM, Russell Bateman wrote:
> >>
> >> I seem to crash the NiFi JUnit test runner when I have code that calls
> >> getLogger() or attempts to make use of the product of calling
> >> getLogger() in situations where some context (probably) is missing,
> >> like methods annotated for call at "special" times. This makes sense,
> >> but is there a technique I can use to profile my custom processor in
> >> order to observe (easily, such as by using the logger) the behavior of
> >> (i.e.: log-TRACE through) my processor with respect to @OnScheduled,
> >> @OnUnscheduled, @OnStopped, etc. moments?
> >>
> >> Many thanks,
> >> Russ
> >
>
> --
Sent from Gmail Mobile


Re: Restarting NiFi causing SiteToSiteBulletinReportingTask to fail

2018-04-12 Thread Pierre Villard
Hi Chad,

I believe this may have been fixed recently, but I have very limited access
right now (and for the next few days) and can't be sure.
I will check next week if no one has given you feedback before then.

Pierre

2018-04-12 19:57 GMT+02:00 Woodhead, Chad :

> I am running HDF 3.0.1.1 which comes with NiFi 1.2.0.3.0.1.1-5. We are
> using SiteToSiteBulletinReportingTask to monitor bulletins (for things
> like Disk Usage and Memory Usage). When we restart NiFi via Ambari (either
> with a Restart or Stop and then Start), when NiFi comes back up the
> SiteToSiteBulletinReportingTask no longer works. It throws the following
> error when it is first trying to start up:
>
> SiteToSiteBulletinReportingTask[id=ba6b4499-0162-1000--3ccd7573]
> org.apache.nifi.remote.client.PeerSelector@34e976af Unable to refresh
> Remote Group's peers due to response code 409:Conflict with explanation:
> null
>
> No matter how long we wait, it never works. The ways I have been able to
> get it to start working again are as follows:
>
>   *   Stop and then Start the Remote Input Port the
> SiteToSiteBulletinReportingTask is using
>   *   Delete the SiteToSiteBulletinReportingTask and create a new one
>   *   Wait a while and stop and start the SiteToSiteBulletinReportingTask
> (however this doesn't work consistently)
>
> I have tested the same flow steps using a process that uses a Remote
> Process Group and a different Remote Input Port, and that RPG throws the
> same error when first coming up but then starts working after a period of
> time. So maybe the SiteToSiteBulletinReportingTask isn't trying enough
> times to connect to the Remote Input Port?
>
> Sincerely,
> Chad Woodhead
>


Re: Ongoing support for a contribution

2018-04-12 Thread Mike Thomsen
Anthony,

I think monitoring those lists would be a great idea. In the interim, if
you could get a Docker image up on Docker Hub that allows members of the
community to casually spin up a working MarkLogic instance suitable for
kicking the tires with your processor(s) that would be a big help.

Also, be aware that there are two models of "putting" that you'll want to
address to get maximum visibility for MarkLogic: record and non-record
processors. For example, PutMongo just takes whatever JSON that a user
throws at it and writes to Mongo. PutMongoRecord uses the Record API to
take data based on a schema in any supported format with a record reader.
You'll really want to take a look at that particular use case, as it'll help
you accept input in a lot of common record formats besides XML.
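The two "putting" models can be summarized in a small conceptual sketch. The classes below (RawPut, RecordPut, RecordReader) are hypothetical stand-ins, not the real NiFi Record API or MarkLogic client: the non-record path passes the payload through untouched, while the record path delegates parsing to a pluggable, schema-aware reader so the processor only ever sees generic records.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Non-record model (in the spirit of PutMongo): write whatever the user sends.
class RawPut {
    String put(String rawJson) { return rawJson; }
}

// Record model (in the spirit of PutMongoRecord): a pluggable reader parses
// any supported format, and the processor works with format-agnostic records.
class RecordPut {
    interface RecordReader { List<Map<String, Object>> read(String payload); }

    private final RecordReader reader;
    RecordPut(RecordReader reader) { this.reader = reader; }

    // "Write" each parsed record; returns how many records were handled.
    int put(String payload) { return reader.read(payload).size(); }
}

public class PutModelsDemo {
    public static void main(String[] args) {
        // A trivial line-oriented "reader": one record per line, one column.
        RecordPut recordPut = new RecordPut(payload ->
            Arrays.stream(payload.split("\n"))
                  .map(line -> Map.<String, Object>of("value", line))
                  .collect(Collectors.toList()));
        System.out.println(recordPut.put("a\nb\nc"));    // 3
        System.out.println(new RawPut().put("{\"k\":1}"));
    }
}
```

Swapping in a different reader (JSON, Avro, CSV, or XML) changes nothing in the put logic, which is the benefit Mike is pointing at.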

Thanks,

Mike

On Thu, Apr 12, 2018 at 2:24 PM Anthony Roach 
wrote:

> Hi,
>
>
>
> I am very new to the open source world.  My organization is preparing to
> submit a NiFi processor to load data into our commercial NoSQL database.
> Once the review and commit are completed, what are our responsibilities for
> managing issues?  Do we monitor the issues distribution list?
>
>
>
> Thanks…
>
> ___
>
> Anthony Roach
>
> Product Manager
> MarkLogic Corporation
>
> Desk: +1 650 287 2587
>
> Mobile: +1 415 368 6460
> www.marklogic.com
>
>
>
>


Re: Calling getLogger() from @OnScheduled, @OnStopped, etc.

2018-04-12 Thread Andy LoPresto
Hi Russ,

Are you saying the code that breaks is having “getLogger()” executed inside one 
of the processor lifecycle methods (i.e. GetFile#onTrigger()) or in your test 
code (i.e. GetFileTest#testOnTriggerShouldReadFile())?

I’m not aware of anything with the JUnit runner that would cause issues here. I 
use the loggers extensively in both my application code [1] and the tests [2]. 
Obviously in the tests, I instantiate a new Logger instance for the test class.

Can you share an example of the code that breaks this for you?

[1] 
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/EncryptContent.java#L511
[2] 
https://github.com/apache/nifi/pull/2628/files#diff-e9cfa232683ae75b1fc505d6c9bd3b24R447

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Apr 12, 2018, at 3:46 PM, Russell Bateman  wrote:
> 
> I seem to crash NiFi JUnit test runner when I have code that calls 
> getLogger()or attempts to make use of the product of calling getLogger()in 
> situations where some context (probably) is missing like methods annotated 
> for call at "special" times. This makes sense, but is there a technique I can 
> use to profile my custom processor in order to observe (easily, such as using 
> the logger) the behavior of (i.e.: log-TRACE through) my processor in respect 
> to @OnScheduled, @OnUnscheduled, @OnStopped, etc. moments?
> 
> Many thanks,
> Russ





Re: Ongoing support for a contribution

2018-04-12 Thread Aldrin Piri
Hi Anthony,

From a general ASF standpoint, when that code is contributed to/accepted by
the project it is taken on by the community and its contributors.  Of
course you and any others involved in the contribution process would be
encouraged to help out as much as you care to, but there is no explicit
responsibility to do so.

There are a great many resources available to help you get better
acquainted with open source and the operations within the ASF.  I would
encourage you to scope out
https://www.apache.org/foundation/getinvolved.html and its linked content.

Let us know if you have additional questions!


On Thu, Apr 12, 2018 at 3:20 PM, Mike Thomsen 
wrote:

> Anthony,
>
> I think monitoring those lists would be a great idea. In the interim, if
> you could get a Docker image up on Docker Hub that allows members of the
> community to casually spin up a working MarkLogic instance suitable for
> kicking the tires with your processor(s) that would be a big help.
>
> Also, be aware that there are two models of "putting" that you'll want to
> address to get maximum visibility for MarkLogic: record and non-record
> processors. For example, PutMongo just takes whatever JSON that a user
> throws at it and writes to Mongo. PutMongoRecord uses the Record API to
> take data based on a schema in any supported format with a record reader.
> You'll want to really take a look at that particular use case as it'll help
> you accept input in a lot of common record formats beside XML.
>
> Thanks,
>
> Mike
>
> On Thu, Apr 12, 2018 at 2:24 PM Anthony Roach
> wrote:
>
> > Hi,
> >
> >
> >
> > I am very new to the open source world.  My organization is preparing to
> > submit a NiFi processor to load data into our commercial NoSQL database.
> > Once the review and commit are completed, what are our responsibilities
> for
> > managing issues?  Do we monitor the issues distribution list?
> >
> >
> >
> > Thanks…
> >
> > ___
> >
> > Anthony Roach
> >
> > Product Manager
> > MarkLogic Corporation
> >
> > Desk: +1 650 287 2587
> >
> > Mobile: +1 415 368 6460
> > www.marklogic.com
> >
> >
> >
> >
>


Re: Calling getLogger() from @OnScheduled, @OnStopped, etc.

2018-04-12 Thread Russell Bateman
Yes, this is what I assumed, but I was hoping that someone had developed
a technique for reaching the log in some (twisted) way that I hadn't
figured out yet. It would really help me visualize the order in which my
code's called and help me feel better about what I've written.


Thanks,
Russ

On 04/12/2018 03:41 PM, Bryan Bende wrote:

The example processor you showed won’t work because you are calling
getLogger() inline as part of the variable declaration.

The logger is given to the processor in an init method which hasn’t been
called yet at that point, so that is assigning null to the variable.

Generally you should just call getLogger() whenever it is needed, or you
could assign it to a variable from inside OnScheduled.

On Thu, Apr 12, 2018 at 5:28 PM Russell Bateman 
wrote:


Thanks for responding, Andy.

I am able to use it, like you, in onTrigger(). Where I haven't been able
to use it is from annotated methods (in the sense that onTrigger() isn't
annotated except by @Override, which is not relevant in this question).
Imagine:

public class Fun extends AbstractProcessor
{
    private ComponentLog logger = getLogger();

    @Override
    public void onTrigger( final ProcessContext context, final
ProcessSession session ) throws ProcessException
    {
      logger.trace( "[PROFILE] onTrigger()" );  /* A */
      ...
    }

    @OnScheduled
    public void processProperties( final ProcessContext context )
    {
      logger.trace( "[PROFILE] processProperties()" );  /* B */
      ...
    }

    @OnStopped
    public void dropEverything()
    {
      logger.trace( "[PROFILE] dropEverything()" );  /* C */
      ...
    }
    ...
}


Now, imagine suitable test code, FunTest.test(), which sets up

 runner = TestRunners.newTestRunner( processor = new Fun() );

etc., then

 runner.run( 1 );

Above, instance A works fine (it's the one you illustrated in footnote
[1]). Instances B and C cause the error:

 java.lang.AssertionError: Could not invoke methods annotated with
 @OnScheduled (or @OnStopped) annotation due to:
 java.lang.reflect.InvocationTargetException

Russ

On 04/12/2018 02:52 PM, Andy LoPresto wrote:

Hi Russ,

Are you saying the code that breaks is having “getLogger()” executed
inside one of the processor lifecycle methods (i.e.
GetFile#onTrigger()) or in your test code (i.e.
GetFileTest#testOnTriggerShouldReadFile())?

I’m not aware of anything with the JUnit runner that would cause
issues here. I use the loggers extensively in both my application code
[1] and the tests [2]. Obviously in the tests, I instantiate a new
Logger instance for the test class.

Can you share an example of the code that breaks this for you?

[1]


https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/EncryptContent.java#L511

[2]


https://github.com/apache/nifi/pull/2628/files#diff-e9cfa232683ae75b1fc505d6c9bd3b24R447

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69


On Apr 12, 2018, at 3:46 PM, Russell Bateman wrote:

I seem to crash the NiFi JUnit test runner when I have code that calls
getLogger() or attempts to make use of the product of calling
getLogger() in situations where some context (probably) is missing,
like methods annotated for call at "special" times. This makes sense,
but is there a technique I can use to profile my custom processor in
order to observe (easily, such as by using the logger) the behavior of
(i.e.: log-TRACE through) my processor with respect to @OnScheduled,
@OnUnscheduled, @OnStopped, etc. moments?

Many thanks,
Russ

--

Sent from Gmail Mobile





(NiFi 1.6) Funnels with no outgoing relationship filling my app log

2018-04-12 Thread Peter Wicks (pwicks)
I just upgraded one of my servers to NiFi 1.6.0. I have a couple of funnels
that just dead-end: FlowFiles come in but never go anywhere after that.
Mostly I use this for troubleshooting/validation. The inbound relationships all
have expiration times, and it's a quick way for me to inspect the output of a
processor on the fly.

These funnels are filling up my logs with errors saying they can't output to
Relationship '' (see error log below). If I attach the Funnel to another
downstream processor, then everything is fine. I went back and tested on my
1.5.0 server and did not see the errors.

I briefly looked through the code, but the bug didn't jump out at me.

2018-04-11 23:53:28,066 ERROR [Timer-Driven Process Thread-31] 
o.apache.nifi.controller.StandardFunnel 
StandardFunnel[id=b868231c-0162-1000-571c-ae3e7d15d848] 
StandardFunnel[id=b868231c-0162-1000-571c-ae3e7d15d848] failed to process 
session due to java.lang.RuntimeException: java.lang.IllegalArgumentException: 
Relationship '' is not known; Processor Administratively Yielded for 1 sec: 
java.lang.RuntimeException: java.lang.IllegalArgumentException: Relationship '' 
is not known
java.lang.RuntimeException: java.lang.IllegalArgumentException: Relationship '' 
is not known
at 
org.apache.nifi.controller.StandardFunnel.onTrigger(StandardFunnel.java:365)
at 
org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
at 
org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Relationship '' is not known
at 
org.apache.nifi.controller.repository.StandardProcessSession.transfer(StandardProcessSession.java:1935)
at 
org.apache.nifi.controller.StandardFunnel.onTrigger(StandardFunnel.java:379)
at 
org.apache.nifi.controller.StandardFunnel.onTrigger(StandardFunnel.java:358)
... 9 common frames omitted


Thanks,
  Peter