Re: [EXT] [ANNOUNCE] New Apache NiFi Committer Otto Fowler

2021-03-22 Thread Peter Wicks (pwicks)
Micron Confidential

Congratulations Otto!

From: Joe Witt 
Date: Monday, March 22, 2021 at 12:16 PM
To: dev@nifi.apache.org 
Subject: [EXT] [ANNOUNCE] New Apache NiFi Committer Otto Fowler
CAUTION: EXTERNAL EMAIL. Do not click links or open attachments unless you 
recognize the sender and were expecting this message.


On behalf of the Apache NiFi PMC, I am very pleased to announce that
Otto has accepted the PMC's invitation to become a committer on the
Apache NiFi project. We greatly appreciate all of Otto's hard work and
generous contributions to the project and look forward to his
continued involvement.

Otto has been engaged for several years and is always helping with
release votes, takes on some of the more corner-casey (that is a word)
PRs/JIRAs, shares perspective, and is always willing to share thoughts
with people on the mailing list or Slack, which is vital to community
growth.

Welcome and congratulations!




Re: [EXT] Using github to merge PRs and just generally easier tracking of commits..

2021-01-05 Thread Peter Wicks (pwicks)
Should mergers be putting the “Signed-off-by: …” line in when they merge PRs 
through GitHub?
I haven’t merged anything in a while, but I was putting this in the comment box 
when I merged, and it showed up on the commit the same way it did when I did it 
through the CLI:
https://github.com/apache/nifi/commit/f1d35f46cede60401190777d55350439af106ba7

As for signing merges, yeah, that’s unfortunate that GitHub doesn’t seem to 
support that.

From: Joe Witt 
Date: Monday, January 4, 2021 at 12:48 PM
To: dev@nifi.apache.org 
Subject: [EXT] Using github to merge PRs and just generally easier tracking of 
commits..


Team,

If you merge PRs from GitHub they will not get the 'signed-off-by' line and the
commits are not verified.  That said, the convenience is hard to argue with, and
obviously we need to do a better job of getting PRs processed in a
timely fashion.  So if you want an easier way to see not only who authored
a commit but also who actually merged it (the committer), you can run

git log --pretty=fuller

That gives output like

commit e7c6bdad42514200d8732b644c59dcb789e358da (HEAD -> main,
upstream/main, github/main)
Author: exceptionfactory 
AuthorDate: Sat Oct 17 12:25:00 2020 -0400
Commit: markap14 
CommitDate: Mon Jan 4 14:20:05 2021 -0500

NIFI-7937 Added StandardFlowFileMediaType enum to replace string
references to FlowFile Media Types



This is nice as it shows both who made the PR, who merged the PR, when the
PR was offered vs merged, etc.  And that works even if the commit
wasn't signed/signed off.

Thanks


RE: [EXT] Re: Teradata and Nifi

2020-08-28 Thread Peter Wicks (pwicks)
Micron Confidential

Harisa,

We did this in the past (sorry, not on Teradata anymore).  To streamline the 
process we ended up building a custom processor. Initially we used FASTLOAD 
CSV, and that worked OK, but it was hard to handle the error cases, since 
FASTLOAD errors don't work the same way as a normal JDBC insert/update error.

In the end, we just switched to batch-based processing using staging tables. 
This actually worked pretty well. Are you using record-based batch loading?

If you want to use FASTLOAD I'd suggest writing the data to a CSV file on disk 
and then running a script through NiFi to load that file.  Without a lot of 
custom coding it's probably your only path to using FASTLOAD.

--Peter
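
The staging-table approach described above can be sketched in plain JDBC. This is a minimal illustration, not the actual custom processor from the thread: the host, database, and `stg_orders` table names are invented, and the `TYPE=FASTLOAD` URL parameter is the Teradata JDBC driver's documented switch for enabling FastLoad on large batch inserts.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

public class TeradataStagingLoad {

    // Hypothetical JDBC URL builder. TYPE=FASTLOAD asks the Teradata JDBC
    // driver to use its FastLoad protocol for large batch inserts; the host
    // and database names here are made up for illustration.
    static String url(String host, String database) {
        return "jdbc:teradata://" + host + "/DATABASE=" + database + ",TYPE=FASTLOAD";
    }

    // Batch-insert rows into a hypothetical staging table in one transaction.
    // A plain INSERT ... SELECT can then move them into the target table,
    // where errors behave like ordinary JDBC errors (the point of the
    // staging-table approach described above).
    static void loadStaging(Connection conn, List<String[]> rows) throws Exception {
        conn.setAutoCommit(false); // batch everything into a single commit
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO stg_orders (id, payload) VALUES (?, ?)")) {
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch();
            }
            ps.executeBatch();
        }
        conn.commit();
    }

    public static void main(String[] args) {
        // No live database here; just show the URL this sketch would use.
        System.out.println(url("td.example.com", "etl_stage"));
    }
}
```

With this split, FastLoad only ever touches the empty staging table (which is what the utility requires), while error handling stays in ordinary SQL.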



-Original Message-
From: Otto Fowler  
Sent: Friday, August 28, 2020 9:44 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: Teradata and Nifi

I believe this still applies:
https://community.cloudera.com/t5/Support-Questions/Teradata-and-NIFI/td-p/211964



On August 28, 2020 at 10:54:34, Urooj, Harisa (harisa.ur...@teradata.com)
wrote:

We’re trying to do bulk loading of data in Teradata using NIFI. Is that 
possible in NIFI?

Currently we noticed that row-by-row inserts to Teradata are possible, but that 
does not scale well when it comes to loading millions of records.
Teradata has its own set of utility operators: FASTLOAD, MULTILOAD, TPTLOAD, 
TPTUPDATE, etc., that we can use to do fast bulk loading of records into 
Teradata.  Does NIFI support any of these operators? So far I only see the 
option to run SQL against Teradata, not utility jobs.









Harisa Urooj
EMEA Cloud Architect
Teradata


RE: [EXT] Re: NiFi Standard Libraries Info

2020-01-17 Thread Peter Wicks (pwicks)
I know the NiFi 1.11 release process is already going on, but I thought the 
community decided back in September to split most of the libraries out into the 
new `NiFi Standard Libraries` Git repo/project because NiFi as a binary release 
to Apache Maven was getting too big?

Were there further discussions concerning this that I missed (I've been pretty 
absent lately), or did the community decide to hold off on this for a while?

Thanks,
  Peter

-Original Message-
From: Bryan Bende  
Sent: Tuesday, September 10, 2019 11:35 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: NiFi Standard Libraries Info

Thanks for the heads up. I just pushed an initial commit with an empty
README. Let me know if it still doesn't work.

On Tue, Sep 10, 2019 at 1:27 PM Otto Fowler  wrote:
>
> We can’t fork the repo if it is empty on github. At least that is what it
> is saying.
>
>
>
>
> On September 10, 2019 at 10:17:52, Bryan Bende (bbe...@gmail.com) wrote:
>
> Per the recent vote to create the NiFi Standard Libraries sub-project,
> I went ahead and created the git repo [1] and JIRA project [2].
>
> I think there is still something we need INFRA to enable in JIRA so
> that we have the correct permissions as admins. Right now I don't see
> the settings icon on the left to manage users/versions/etc. I will
> look into this.
>
> In the meantime, I created NIFILIBS-1 [3] to initialize the git repo
> with a root pom, README, and LICENSE. If anyone wants to take this on
> feel free, otherwise I will get to it when I have some time.
>
> We also need to update the web site with information for the
> sub-project. This might be a good opportunity to also update the NiFi
> Registry portion of the site to be a true sub-site like MiNiFi.
>
> Thanks,
>
> Bryan
>
> [1] https://github.com/apache/nifi-standard-libraries
> [2] https://issues.apache.org/jira/projects/NIFILIBS/summary
> [3] https://issues.apache.org/jira/browse/NIFILIBS-1


RE: [EXT] Re: [DISCUSS] Time based release cycles

2019-11-05 Thread Peter Wicks (pwicks)
I feel like most users ask, "When is version x coming out?" because they don't 
want to/or can't do a build themselves and they really want to use new features.

I know it's a completely different direction from where I think your question 
was pointing Pierre, but I wonder how many users would be OK with a nightly 
build binary? Many other Apache projects provide nightly builds including 
JMeter, Ignite, ANT, Cordova, Solr and OpenOffice.  This would also make it 
easier for users to provide feedback sooner on changes, as they could just grab 
a pre-built binary.

Thanks,
  Peter

-Original Message-
From: Russell Bateman  
Sent: Tuesday, November 5, 2019 8:39 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: [DISCUSS] Time based release cycles

Kafka is first-rate, rock-star technology, just as is NiFi.

It would be nice to find something from Kafka elaborating on how this regular 
and accelerated release cadence is working out for them, how much more work 
it's been, what problems they've experienced, etc.

I show their releases over the last couple of years as below[1]. The cadence 
appears to be settling into the 4-month cycle proposed. It's possible to 
discern a maintenance schedule. It doesn't exactly match NiFi's 0.x and 1.x 
efforts (which were simultaneous for some time too), but it's clear they've 
faced similar complexity (maybe a little more though for a shorter time). And, 
of course, there's no meaningful way to compare the effort going into and 
features implemented in Kafka by comparison with NiFi.

2019
2.3.1    24 October
2.3.0    25 June
2.2.1     1 June
2.2.0    22 March
2.1.1    15 February

2018
2.1.0    20 November
2.0.1     9 November
2.0.0    30 July
1.1.1    19 July
1.0.2     8 July
0.11.0.3  2 July
0.10.2.2  2 July
1.1.0    28 March
1.0.1     5 March

2017
1.0.0 1 November
0.11.0.1 13 September
0.11.0.0 28 June
.
.
.

[1] https://kafka.apache.org/downloads

On 11/5/19 8:02 AM, Pierre Villard wrote:
> Hi NiFi dev community,
>
> We just released NiFi 1.10 and that's an amazing release with a LOT of 
> great new features. Congrats to everyone!
>
> I wanted to take this opportunity to bring a discussion around how 
> often we're doing releases.
>
> We released 1.10.0 yesterday and we released 1.9.0 in February, that's 
> around 8 months between the two releases. And if we take 1.9.2, 
> released early April, that's about 7 months.
>
> I acknowledge that doing releases is really up to the committers and 
> anyone can take the lead to perform this process, however, we often 
> have people asking (on the mailing lists or somewhere else) about when 
> will the next release be. I'm wondering if it would make sense to 
> think about something a bit more "planned" by doing time based releases.
>
> The Apache Kafka community wrote a nice summary of the pros/cons about 
> such an approach [1] and it definitely adds more work to the 
> committers with more frequent releases. I do, however, think that it'd 
> ease the adoption of NiFi, its deployment and the dynamism in PR/code review.
>
> I'm just throwing the idea here and I'm genuinely curious about what 
> you think about this approach.
>
> [1] https://cwiki.apache.org/confluence/display/KAFKA/Time+Based+Release+Plan
>
> Thanks,
> Pierre
>



RE: [EXT] [ANNOUNCE] New Apache NiFi Committer Kotaro Terada

2019-10-24 Thread Peter Wicks (pwicks)
Congratulations, Terada-san!

-Original Message-
From: Aldrin Piri  
Sent: Thursday, October 24, 2019 7:50 AM
To: dev 
Subject: [EXT] [ANNOUNCE] New Apache NiFi Committer Kotaro Terada

Apache NiFi community,

On behalf of the Apache NiFi PMC, I am very pleased to announce that Kotaro
has accepted the PMC's invitation to become a committer on the Apache NiFi
project. We greatly appreciate all of Kotaro's hard work and generous
contributions to the project. We look forward to continued involvement
in the project.

Kotaro contributed to a breadth of areas in both NiFi and Registry, as well as
being a regular reviewer of our releases. Kotaro's communication in Jira issues
and responsiveness to the review processes highlighted great collaboration and
embodied our community goals for the project.

Welcome and congratulations!

--ap


Project access to Jira NiFi

2019-09-23 Thread Peter Wicks (pwicks)
When I was added to the PMC, it looks like my Jira permissions were not updated 
to reflect the change.  Can someone on the PMC update my permissions in JIRA?

Thanks,
  Peter


How does http://nifi.apache.org update?

2019-09-20 Thread Peter Wicks (pwicks)
How are updates pushed to http://nifi.apache.org? Are 
these done automatically by some Apache Infra job, or is there a developer out 
there pushing changes?

--Peter


RE: [EXT] Re: [VOTE] Create NiFi Standard Libraries sub-project

2019-09-03 Thread Peter Wicks (pwicks)
+1, binding

-Original Message-
From: Kevin Doran  
Sent: Tuesday, September 3, 2019 7:12 PM
To: dev@nifi.apache.org; dev@nifi.apache.org
Subject: [EXT] Re: [VOTE] Create NiFi Standard Libraries sub-project

+1, binding



From: Tony Kurc 
Sent: Tuesday, September 3, 2019 8:33 PM
To: dev@nifi.apache.org
Subject: Re: [VOTE] Create NiFi Standard Libraries sub-project

+1 (binding)

On Wed, Sep 4, 2019 at 12:29 AM Aldrin Piri  wrote:

> +1, binding
>
> On Tue, Sep 3, 2019 at 19:46 Yolanda Davis 
> wrote:
>
> > +1 Create NiFi Standard Libraries (binding)
> >
> > On Tue, Sep 3, 2019 at 7:03 PM Koji Kawamura 
> > 
> > wrote:
> >
> > > +1 Create NiFi Standard Libraries (binding)
> > >
> > > On Wed, Sep 4, 2019 at 7:25 AM Mike Thomsen 
> > > 
> > > wrote:
> > > >
> > > > +1 binding
> > > >
> > > > On Tue, Sep 3, 2019 at 5:33 PM Andy LoPresto 
> > > > 
> > > wrote:
> > > >
> > > > > +1, create NiFi Standard Libraries (binding)
> > > > >
> > > > > Andy LoPresto
> > > > > alopre...@apache.org
> > > > > alopresto.apa...@gmail.com
> > > > > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D 
> > > > > EF69
> > > > >
> > > > > > On Sep 3, 2019, at 2:16 PM, Bryan Bende wrote:
> > > > > >
> > > > > > All,
> > > > > >
> > > > > > In a previous thread there was a plan discussed to restructure some
> > > > > > of the repositories in order to address several different issues,
> > > > > > such as build time, reusability of code, and eventually separating
> > > > > > how the framework and extensions are released [1][2].
> > > > > >
> > > > > > The overall plan requires many steps to get there, so I'd like to
> > > > > > propose starting with a small actionable step - the creation of a
> > > > > > new sub-project called NiFi Standard Libraries (formerly referred
> > > > > > to as nifi-commons).
> > > > > >
> > > > > > Project Name: Apache NiFi Standard Libraries
> > > > > > Git Repository: nifi-standard-libraries
> > > > > > JIRA: NIFILIBS
> > > > > >
> > > > > > Description:
> > > > > >
> > > > > > A collection of standard implementations used across the NiFi
> > > > > > ecosystem.
> > > > > >
> > > > > > Candidate Libraries:
> > > > > >
> > > > > > In general, each library may consist of multiple Maven modules, and
> > > > > > should be independent from the rest of the ecosystem, and from
> > > > > > other libraries within NiFi Standard Libraries.
> > > > > >
> > > > > > In addition, each library may make its own decision about whether
> > > > > > it is considered a public facing extension point/API, or an
> > > > > > internal library that may be changed at any time. This should be
> > > > > > documented in a README at the root of each library, such as
> > > > > > nifi-standard-libraries/nifi-xyz/README.
> > > > > >
> > > > > > An initial library that has been discussed was referred to as
> > > > > > 'nifi-security' and would centralize much of the security related
> > > > > > code shared by NiFi and NiFi Registry, such as shared security
> > > > > > APIs, and implementations for various providers, such as
> > > > > > LDAP/Kerberos/etc.
> > > > > >
> > > > > > A second candidate library would be an optimistic-locking library
> > > > > > based on NiFi's revision concept. Currently this has been created
> > > > > > inside nifi-registry for now [3], but could be moved as soon as
> > > > > > nifi-standard-libraries exists.
> > > > > >
> > > > > > (This list does not have to be final in order to decide if we are
> > > > > > creating NiFi Standard Libraries or not)
> > > > > >
> > > > > > Integration & Usage:
> > > > > >
> > > > > > Once NiFi Standard Libraries is created, the community can start
> > > > > > creating and/or moving code there and perform releases as
> > > > > > necessary. A release will consist of the standard Apache source
> > > > > > release, plus artifacts released to Maven central. The community
> > > > > > can then decide when it is appropriate to integrate these released
> > > > > > libraries into one of our downstream projects.
> > > > > >
> > > > > > For example, if we create a nifi-security library in
> > > > > > nifi-standard-libraries, we can release that whenever we decide,
> > > > > > but we may not integrate it into NiFi or NiFi Registry until it
> > > > > > makes sense for a given release of those projects.
> > > > > >
> > > > > > This vote will be open for 48 hours, please vote:
> > > > > >
> > > > > > [ ] +1 Create NiFi Standard Libraries
> > > > > > [ ] +0 no opinion
> > > > > > [ ] -1 Do not create NiFi Standard Libraries because...
> > > > > >
> > > > > > [1] https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fapach

RE: [EXT] Re: OnPrimaryNodeStateChange vs Primary Only configuration

2019-08-16 Thread Peter Wicks (pwicks)
Bryan,

I'm familiar with the getNodeTypeProvider method.  Unfortunately, this does not 
differentiate between processors that are scheduled to run only on the Primary 
node and those that are scheduled to run on all of them.

So you're saying a better fix would be to properly call scheduled/unscheduled, 
and when a processor is unscheduled make sure it then handles this; but that 
it's complicated. I can believe that.

But, in the meantime, there probably isn't a problem with exposing this piece 
of scheduling information in the ProcessContext?

Thanks,
  Peter

-Original Message-
From: Bryan Bende  
Sent: Friday, August 16, 2019 9:19 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: OnPrimaryNodeStateChange vs Primary Only configuration

AbstractSessionFactoryProcessor has a method

getNodeTypeProvider().isPrimary()

The ultimate fix for your problem is that a processor shouldn't have its 
onScheduled called at all unless it is actually scheduled to run on that node. 
Currently it calls onScheduled on all nodes, but then never calls onTrigger on 
the ones where it isn't scheduled. There is a long-standing JIRA for this, but 
it's a complex fix.

On Fri, Aug 16, 2019 at 11:07 AM Peter Wicks (pwicks)  wrote:
>
> I'm working on a bug fix for HandleHttpRequest and need to check if a 
> processor is configured to run only on primary node (and not if a processor 
> has the attribute that ONLY allows it to run on primary node).
> Here is the scenario for background:
>
>   *   NiFi cluster, but all nodes are on the same physical machine; we do 
> this to let developers develop/test in a cluster without needing a lot of 
> infrastructure before deploying to the real prod cluster.
>   *   To avoid Port conflicts, HandleHttpRequest is setup to run only on 
> master. But, if there is a master node change then the Http server is not 
> properly shutdown and we get a port conflict when the new master node starts 
> up the new instance of the processor.
>
> The problem is I don't think the Primary Only scheduling configuration is 
> exposed to the processor. I'd like to do something like the code below:
>
> @OnPrimaryNodeStateChange
> public void onPrimaryNodeChange(final PrimaryNodeState newState) {
>     // If this processor is scheduled Primary Only
>     // and this node is not primary, shut down the HTTP server.
>     if (this.isMasterOnlyScheduled) shutdown();
> }
>
> I can do some work to expose this, but I thought I'd ask in case I'm missing 
> it.
>
> Thanks,
>   Peter
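
The decision logic in the snippet above, combined with the `getNodeTypeProvider().isPrimary()` check Bryan mentions, can be modeled in a self-contained way. This is a sketch only: the `NodeTypeProvider` interface here is a minimal stand-in for NiFi's real one (which has more to it), and `shouldShutdown` is a hypothetical helper, not framework API.

```java
public class PrimaryOnlyShutdown {

    // Minimal stand-in for NiFi's NodeTypeProvider (reachable via
    // getNodeTypeProvider() on AbstractSessionFactoryProcessor); only
    // isPrimary() is modeled here.
    interface NodeTypeProvider {
        boolean isPrimary();
    }

    // Shut the embedded HTTP server down only when the processor is
    // scheduled primary-only and this node does not currently hold the
    // primary role.
    static boolean shouldShutdown(boolean scheduledPrimaryOnly, NodeTypeProvider provider) {
        return scheduledPrimaryOnly && !provider.isPrimary();
    }

    public static void main(String[] args) {
        NodeTypeProvider primary = () -> true;
        NodeTypeProvider notPrimary = () -> false;
        System.out.println(shouldShutdown(true, notPrimary));  // true: release the port
        System.out.println(shouldShutdown(true, primary));     // false: keep serving
        System.out.println(shouldShutdown(false, notPrimary)); // false: scheduled on all nodes
    }
}
```

The missing piece the thread identifies is the first argument: whether the processor is *scheduled* primary-only is not currently exposed through ProcessContext, which is exactly what Peter proposes surfacing.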


OnPrimaryNodeStateChange vs Primary Only configuration

2019-08-16 Thread Peter Wicks (pwicks)
I'm working on a bug fix for HandleHttpRequest and need to check if a processor 
is configured to run only on primary node (and not if a processor has the 
attribute that ONLY allows it to run on primary node).
Here is the scenario for background:

  *   NiFi cluster, but all nodes are on the same physical machine; we do this 
to let developers develop/test in a cluster without needing a lot of 
infrastructure before deploying to the real prod cluster.
  *   To avoid port conflicts, HandleHttpRequest is set up to run only on 
master. But, if there is a master node change then the Http server is not 
properly shutdown and we get a port conflict when the new master node starts up 
the new instance of the processor.

The problem is I don't think the Primary Only scheduling configuration is 
exposed to the processor. I'd like to do something like the code below:

@OnPrimaryNodeStateChange
public void onPrimaryNodeChange(final PrimaryNodeState newState) {
    // If this processor is scheduled Primary Only
    // and this node is not primary, shut down the HTTP server.
    if (this.isMasterOnlyScheduled) shutdown();
}

I can do some work to expose this, but I thought I'd ask in case I'm missing it.

Thanks,
  Peter


RE: [EXT] Airflow and NiFi

2019-08-15 Thread Peter Wicks (pwicks)
User: "I want to change my production flow while it's running, you know, mid 
stream just route my content to a completely different flow, re-run data mid 
run through a new set of processors, fork it on the fly, you know, whatever I 
want anytime I want "

NiFi: "Yeah, I can do that"

Everyone Else: " I'm sorry... what?"

-Original Message-
From: Mike Thomsen  
Sent: Thursday, August 15, 2019 2:00 PM
To: dev@nifi.apache.org
Subject: [EXT] Airflow and NiFi

Does anyone see any areas where the two can complement each other and where we 
might want to give users the ability to offload processing to Airflow?
Curious since our poking around the docs led us to conclude it was probably 
more "Airflow vs Spark+Oozie" than really competing with NiFi.


Should there be a sane limit on Strings in NiFI?

2019-08-01 Thread Peter Wicks (pwicks)
I noticed yesterday that automatic ellipsis was not working for relationship 
names in the Connection Creation window (PR submitted in NIFI-6512). As part of 
my test I started playing around with submitting fairly long names for 
relationships in RouteOnAttribute.  This all worked without issue on 
relationships with names greater than 1000 characters.

This led me to wonder how other parts of NiFi deal with very long strings in 
the UI, such as the breadcrumb trail.

To bring an already long story to a close, I submitted a 200MB string as a 
Process Group name using Chrome’s developer tools to do a custom PUT.  NiFi 
happily accepted the 104857600 character long name! Of course, this means each 
call to load that process group fails spectacularly in Chrome with the whole UI 
thread dead, but otherwise NiFi just keeps running.

This is what led me to wonder if maybe we should have some constraints on the 
maximum size of these UI visible strings.

Thanks,
  Peter
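
A simple server-side guard like the one being suggested could look as follows. This is purely illustrative: NiFi imposes no such limit today (that is the point of the email), and the 1000-character cap and `validateComponentName` helper are invented for the sketch.

```java
public class NameLengthGuard {

    // Hypothetical cap; 1000 is an arbitrary illustrative choice that still
    // comfortably exceeds any sensible component name.
    static final int MAX_NAME_LENGTH = 1000;

    // Returns null when the name is acceptable, otherwise an error message,
    // mirroring the shape of a simple server-side validation check applied
    // before persisting a UI-visible string.
    static String validateComponentName(String name) {
        if (name == null || name.isEmpty()) {
            return "Name must not be empty";
        }
        if (name.length() > MAX_NAME_LENGTH) {
            return "Name exceeds maximum length of " + MAX_NAME_LENGTH + " characters";
        }
        return null; // valid
    }

    public static void main(String[] args) {
        System.out.println(validateComponentName("My Process Group")); // null
        System.out.println(validateComponentName("x".repeat(2000)));   // error message
    }
}
```

Rejecting the oversized PUT at the REST layer would prevent the 200MB name from ever reaching the flow, so the browser never has to render it.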


RE: [EXT] Re: Duplicate flow files *without* their content

2019-07-31 Thread Peter Wicks (pwicks)
Lars,

If you are worried about it, using ReplaceText will have the same effect as 
your custom solution. When ReplaceText has its `Replacement Strategy` set to 
`Always Replace`, it doesn't read the contents of the FlowFile and simply 
writes out the Replacement Value, which in your case could be an empty string.

Thanks,
  Peter

From: Lars Winderling 
Sent: Wednesday, July 31, 2019 11:02 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: Duplicate flow files *without* their content

Hi Edward,

thank you for your input. I didn't know about the cow-semantics, that's really 
useful. I'll check out the in-depth guide for sure!
In my case, the content of the flow file does change heavily from one processor 
to the next one, so I doubt copy-on-write would help here.

Best,
Lars

On Wed, 2019-07-31 at 12:13 +0100, Edward Armes wrote:

Hi Lars,

In short, depending on how a FlowFile is duplicated, the content shouldn't be
duplicated as well.

In general, content is only duplicated when it has been deemed to have been
changed (copy-on-write semantics). For the most part (unless a FlowFile has a
large number of attributes) a FlowFile is actually quite small and therefore
the waste is minimal, hence why they can be held in memory and passed through
a Flow.

The best way to branch/clone a flow file is to add another output from the
processor you want to log the output from, and the framework that surrounds a
Processor will handle the rest. This does create a duplicate FlowFile but
doesn't create a copy of the content. In the provenance repository this is
marked as a CLONE event for the original FlowFile, and the new FlowFile gets
treated as its own unique FlowFile with a reference to the original content.

This is quite a short explanation; a better and more in-depth explanation,
which I think covers all the scenarios you're thinking about, can be found
here:

https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html

Edward

On Wed, Jul 31, 2019 at 11:47 AM Lars Winderling <lars.winderl...@posteo.de>
wrote:

Dear NiFi community,

I often face the use-case where I import flow files with content of order
O(1gb) or O(10gb) - already compressed. Let's say I need to branch off of a
flow where the actual flow file should be processed further, and on some side
branch I want just to do some kind of logging or whatever without accessing
the flow file's contents. Thus it's clearly wasteful to duplicate the flow
file including content.

For this case I wrote a processor defining 2 relationships: "original" and
"attributes only", so the flow file attributes can be accessed separately
from the content. I will gladly prepare a PR if anyone finds that worth
incorporating into NiFi.

Only remaining question for me would be: use an individual processor to that
end, or add it to e.g. the DuplicateFlowFile processor. The former seems
cleaner to me. Proposed names would be something like ForkProcessor (no
better idea yet).

Thanks in advance!
Best,
Lars

RE: [EXT] Re: Adding Color to Connections

2019-07-30 Thread Peter Wicks (pwicks)
Pierre, There are a few other indicators of back pressure, including the red 
shadow around the queue and the queue status bars. Do you think these are 
enough for a user to distinguish between a back pressure queue and a connection 
that someone marked as red? Any thoughts on how to keep these separate?

Matt, I see that the Bring to Front affects the entire connection, and not just 
the label. This makes sense, so no complaints there.  I can see where we might 
make it possible for a user to duplicate an existing style, and thus confuse 
themselves, such as creating a connection that looks just like a Ghost 
connection, but isn't.

--Peter



-Original Message-
From: Matt Gilman  
Sent: Tuesday, July 30, 2019 10:58 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: Adding Color to Connections

A couple of other things to note:

- Connections can already be gray and dashed when the underlying relationship 
no longer exists. This condition happens often using any Processor whose 
connections are dynamic (like RouteOnAttribute).
- Also, there is already an option to adjust the z-index of a connection.
However, the concept of 'z-index' is not surfaced to the user. The action 
available is 'Bring to front' and this affects the connection label.

One thing that's been discussed in the past is offering different modes (or
layers) based on specific use cases. Supporting different visualizations when 
the user is configuring versus when they are monitoring could possibly allow us 
to introduce more visualization while not conflicting with existing ones.

Thanks

Matt

On Tue, Jul 30, 2019 at 12:37 PM Pierre Villard 
wrote:

> Hey,
>
> I like the idea but we need to have a clear differentiator compared to 
> the situation where backpressure is enabled (and connection is 
> coloured in red).
>
> Pierre
>
> Le mar. 30 juil. 2019 à 18:19, Peter Wicks (pwicks) 
>  a écrit :
>
> > You know when you have a crazy complex flow, and it's hard sometimes 
> > to even tell where things are going? Especially those failure 
> > conditions
> that
> > are all going back to a central Funnel or Port? I thought it would 
> > be visually very helpful if you could add Color to your connections, 
> > using
> the
> > existing UI components in place for doing that.
> >
> > I have a working POC, https://issues.apache.org/jira/browse/NIFI-6504,
> > which includes a screenshot and a link to the branch.
> >
> > There are a number of open questions:
> >
> >   *   Should the markers (arrows) be colored also?
> >   *   Should there be an option to color other components such as funnels
> > or ports?
> >   *   I've also considered adding some additional line formatting options
> > such as thickness and dashes, like the dashed green lines you see 
> > when creating a new relationship, or the extra thick lines that 
> > represent multiple relationships use now.
> >   *   Should there be an option to adjust connection z-index so that
> users
> > can make sure a connection isn't overlapping things in a weird way.
> >
> > The goal of this ticket is squarely to make it easier and faster for 
> > users to visually understand a flow. Any feedback is appreciated!
> >
> > Thanks,
> >   Peter
> >
>


Adding Color to Connections

2019-07-30 Thread Peter Wicks (pwicks)
You know when you have a crazy complex flow, and it's hard sometimes to even 
tell where things are going? Especially those failure conditions that are all 
going back to a central Funnel or Port? I thought it would be visually very 
helpful if you could add Color to your connections, using the existing UI 
components in place for doing that.

I have a working POC, https://issues.apache.org/jira/browse/NIFI-6504, which 
includes a screenshot and a link to the branch.

There are a number of open questions:

  *   Should the markers (arrows) be colored also?
  *   Should there be an option to color other components such as funnels or 
ports?
  *   I've also considered adding some additional line formatting options such 
as thickness and dashes, like the dashed green lines you see when creating a 
new relationship, or the extra thick lines that represent multiple 
relationships use now.
  *   Should there be an option to adjust connection z-index so that users can 
make sure a connection isn't overlapping things in a weird way.

The goal of this ticket is simply to make it easier and faster for users to 
visually understand a flow. Any feedback is appreciated!

Thanks,
  Peter


RE: [EXT] Re: FlowFile Expiration - Lineage vs Queue Times

2019-07-24 Thread Peter Wicks (pwicks)
Thanks for the input Mark, I can definitely see that being valuable.  That 
leads to some new ideas.

We could have an "Expiration Strategy" drop-down.  Options might be:

 - Lineage Expiration (default, keeps backwards compat.)
 - Queue Expiration
 - Back Pressure + Queue Expiration (which would follow the rules Mark 
described).
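
A rough sketch of how that strategy selection might behave (the names and exact semantics below are illustrative only, not NiFi API):

```python
from enum import Enum

class ExpirationStrategy(Enum):
    LINEAGE = "lineage"                 # default; keeps backwards compatibility
    QUEUE = "queue"                     # age in the current queue only
    BACKPRESSURE_PLUS_QUEUE = "bp+q"    # only expire once back pressure is hit

def should_expire(strategy, lineage_age_s, queue_age_s, max_age_s, over_backpressure):
    """Return True if a flowfile should be expired under the given strategy."""
    if strategy is ExpirationStrategy.LINEAGE:
        return lineage_age_s > max_age_s
    if strategy is ExpirationStrategy.QUEUE:
        return queue_age_s > max_age_s
    # Back Pressure + Queue Expiration: expire only while the queue is in
    # violation of its back pressure limit.
    return over_backpressure and queue_age_s > max_age_s
```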

As for the order, I'm not confident I remember how it works right now.

Thanks,
  Peter


-Original Message-
From: Mark Bean  
Sent: Wednesday, July 24, 2019 7:08 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: FlowFile Expiration - Lineage vs Queue Times

On a similar note, we recently had a case where it would be desirable for the 
flowfile expiration to kick in only after the flowfile size/count back pressure 
limits have been reached. In other words, once a back pressure
(size) limit is reached, it would be desirable to then remove flowfiles - 
beginning with the oldest first - until the back pressure limit is no longer in 
violation.
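
That eviction behavior, sketched as a simple oldest-first loop (illustrative only, not actual NiFi queue internals):

```python
from collections import deque

def evict_until_under_limit(queue, size_limit_bytes):
    """Drop flowfiles oldest-first until queued size is back under the limit.

    `queue` holds (flowfile_id, size_bytes) tuples, oldest on the left.
    Returns the ids of the evicted flowfiles.
    """
    evicted = []
    total = sum(size for _, size in queue)
    while queue and total > size_limit_bytes:
        flowfile_id, size = queue.popleft()  # oldest first
        total -= size
        evicted.append(flowfile_id)
    return evicted
```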

Thanks,
Mark

On Tue, Jul 23, 2019 at 3:41 PM Peter Wicks (pwicks) 
wrote:

> I was thinking it would be nice to expire FlowFiles based on their 
> time in queue, in addition to the current option of their total 
> lineage time (as in, have both options available).
> Any thoughts on pros/cons of having this available?
>
> Thanks,
>   Peter
>


RE: [EXT] Re: Auto cleanup custom Controller Service properties?

2019-07-19 Thread Peter Wicks (pwicks)
Bryan,

That wasn't the behavior I was seeing, but then when I went and tested it 
again, that was the behavior I saw, so never mind.

--Peter


-Original Message-
From: Bryan Bende  
Sent: Friday, July 19, 2019 9:08 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: Auto cleanup custom Controller Service properties?

I believe you should still be able to delete the property with a delete icon 
next to it.

This should be the same behavior for processors when a property is removed, 
regardless of dynamic properties.

On Fri, Jul 19, 2019 at 11:01 AM Peter Wicks (pwicks)  wrote:
>
> I ran into a fairly rare situation. While working on a custom Controller 
> Service I removed a previously present property from the code, and this 
> specific Controller Service does not support Dynamic Properties.
>
> Like a Processor, the property now shows up as a custom property at the 
> bottom, but since the Controller Service is not flagged for Dynamic 
> Properties, there is no way to remove it.
>
> Is this a situation we should care about, in case we do remove a property 
> from a controller service in the future, rather than leaving users to 
> re-create all affected controller services to resolve the issue?
>
> Thanks,
>   Peter


RE: [EXT] Re: Weird spring issue if I run a snapshot w/out internet access

2019-07-18 Thread Peter Wicks (pwicks)
As a final follow-up, this has been resolved with the fix merged to master and 
the Jira closed.

--Peter

-Original Message-
From: Joe Witt  
Sent: Thursday, July 18, 2019 9:47 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: Weird spring issue if I run a snapshot w/out internet access

See here: 
https://issues.apache.org/jira/browse/NIFI-6439

Let's try to bring it all in that location for follow-up.  Seeing a similar 
statement/log output from Peter Wicks now on dev in slack.



On Tue, Jun 18, 2019 at 3:44 PM Mike Thomsen  wrote:

> The weird part is I don't think this has ever happened with an 
> official binary. Makes me wonder if it's not a gap in the 
> documentation or something.
>
> On Tue, Jun 18, 2019 at 2:18 PM Jon Logan  wrote:
>
> > I haven't looked at that specifically but we've run into issues 
> > before
> with
> > Spring quietly pulling XSDs over the internet if they're not 
> > packaged correctly in your artifact, and that breaking miserably if 
> > you don't have the internet available. That's just a wild guess though.
> >
> > On Tue, Jun 18, 2019 at 2:14 PM Mike Thomsen 
> > 
> > wrote:
> >
> > > Sometimes when I start a snapshot build when I don't have Internet
> > access I
> > > get this. Any ideas?
> > >
> > > 2019-06-18 14:11:54,040 WARN [main]
> > org.apache.nifi.web.server.JettyServer
> > > Failed to start web server... shutting down.
> > > org.springframework.beans.factory.xml.XmlBeanDefinitionStoreException:
> > Line
> > > 19 in XML document from class path resource [nifi-context.xml] is
> > invalid;
> > > nested exception is org.xml.sax.SAXParseException; lineNumber: 19;
> > > columnNumber: 139; cvc-elt.1: Cannot find the declaration of 
> > > element 'beans'.
> > > at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadBeanDefinitions(XmlBeanDefinitionReader.java:399)
> > > at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:336)
> > > at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:304)
> > > at org.springframework.beans.factory.support.AbstractBeanDefinitionReader.loadBeanDefinitions(AbstractBeanDefinitionReader.java:181)
> > > at org.springframework.beans.factory.support.AbstractBeanDefinitionReader.loadBeanDefinitions(AbstractBeanDefinitionReader.java:217)
> > > at org.springframework.beans.factory.support.AbstractBeanDefinitionReader.loadBeanDefinitions(AbstractBeanDefinitionReader.java:188)
> > > at org.springframework.context.annotation.ConfigurationClassBeanDefinitionReader.loadBeanDefinitionsFromImportedResources(ConfigurationClassBeanDefinitionReader.java:354)
> > > at org.springframework.context.annotation.ConfigurationClassBeanDefinitionReader.loadBeanDefinitionsForConfigurationClass(ConfigurationClassBeanDefinitionReader.java:143)
> >
>


UI Bug driving me crazy (NIFI-6455)

2019-07-18 Thread Peter Wicks (pwicks)
I ran across a UI bug, but haven’t been able to figure it out on my own so I 
wrote up a ticket, NIFI-6455. The bug causes the last item on a scrollable 
Configure window to not be accessible. The scrollbar is the right size, but the 
last property is inaccessible. Showed up in 1.10 for both Configure Processor 
and Configure Controller Service windows, no issues with the same 
processor/controller service in 1.9.2.

If you have a processor like GetSolr, which has a lot of property options, you 
can’t access the last one off the edge of the screen. I’m running into this on 
a custom processor also, and the last property is the most important one… Does 
not appear to be an issue with some other windows like NiFi History.

I tried tracking it down, but wasn't able to quickly find the root cause, so 
I'm looking for a little UI assistance from the team; hopefully someone can 
quickly replicate it and confirm it's not just my environment.

Thanks,
  Peter


RE: [EXT] Re: Thoughts on an internal "Terminate" handler for special processors

2019-06-19 Thread Peter Wicks (pwicks)
Mark,

Lots of good questions. I'll write up a JIRA on it at least, and finish 
reviewing your original PR on Terminate 
(https://github.com/apache/nifi/pull/2555/).

Thanks,
  Peter


-Original Message-
From: Mark Payne  
Sent: Wednesday, June 19, 2019 9:27 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: Thoughts on an internal "Terminate" handler for special 
processors

Peter,

Without thinking through this too much, I am not opposed to the idea, but there 
are certainly a lot of things that would have to be carefully thought through 
around the lifecycle:

- When a processor is terminated, its threads are interrupted. They may or may 
not ignore the interrupt. When should this handler be triggered? After the 
threads return from the interrupt? Or after the interrupt is triggered and 
before they return (if they ever will)?
- What if the Termination handler never returns? Does this just eat up an 
additional thread and get reported as a "Terminated thread"? What thread pool, 
if any, would this thread come from?

Just some thoughts that would have to be worked through... 


> On Jun 19, 2019, at 11:17 AM, Peter Wicks (pwicks)  wrote:
> 
> I've run into a few cases where a processor that works with external 
> resources (DBCP especially) will hang and can't be properly terminated. I was 
> thinking it would make sense to have an interface or annotation that these 
> processors could have that would flag them for "special" termination.
> 
> For example. In DBCP processors we could put a reference to the current 
> statement, and then when a terminate is received we can call into the 
> processor and ask it to perform its special termination process, which in 
> this case would be to cancel the statement.
> 
> I've heard similar complaints concerning ExecuteStreamCommand, but have not 
> experienced it.
> 
> Any thoughts on this? It would only be for the Terminate option, it would not 
> affect normal operation states such as Stop.
> 
> Thanks,
>  Peter Wicks



Thoughts on an internal "Terminate" handler for special processors

2019-06-19 Thread Peter Wicks (pwicks)
I've run into a few cases where a processor that works with external resources 
(DBCP especially) will hang and can't be properly terminated. I was thinking it 
would make sense to have an interface or annotation that these processors could 
have that would flag them for "special" termination.

For example. In DBCP processors we could put a reference to the current 
statement, and then when a terminate is received we can call into the processor 
and ask it to perform its special termination process, which in this case 
would be to cancel the statement.
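
As a rough, language-agnostic sketch of the idea (NiFi itself would express this as a Java interface or annotation; all names below are made up for illustration):

```python
class TerminationAware:
    """Marker base class: processors opting in to 'special' termination."""

    def on_terminate(self):
        raise NotImplementedError

class DbcpStyleProcessor(TerminationAware):
    """Toy stand-in for a DBCP-backed processor holding an in-flight statement."""

    def __init__(self):
        self.current_statement = None

    def on_trigger(self, statement):
        self.current_statement = statement  # remember what is running

    def on_terminate(self):
        # Invoked by the framework only on Terminate, never on a normal Stop:
        # cancel the hung statement so the interrupted thread can unwind.
        if self.current_statement is not None:
            self.current_statement.cancel()
            self.current_statement = None
```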

I've heard similar complaints concerning ExecuteStreamCommand, but have not 
experienced it.

Any thoughts on this? It would only be for the Terminate option, it would not 
affect normal operation states such as Stop.

Thanks,
  Peter Wicks


RE: [EXT] Re: NiFi 2.0 Roadmap

2019-06-14 Thread Peter Wicks (pwicks)
I've also heard something about a big change to the way relationships work. 
Maybe grouping relationships into larger groupings, with "failure" as a parent 
and multiple optional, fine-grained children. Something like that.  If that 
rings a bell, maybe add it to your list.

-Original Message-
From: Joe Witt  
Sent: Friday, June 14, 2019 11:31 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: NiFi 2.0 Roadmap

It makes sense there would be an mvp set of registry capability and thus a 
dependency for nifi 2.0 on registry readiness/version.  Otherwise I would 
largely hope not.


On Fri, Jun 14, 2019 at 1:29 PM Otto Fowler  wrote:

> Will that effort or planning be across all the nifi projects?  minifi 
> / cpp / registry etc?
>
>
> On June 14, 2019 at 13:01:36, Joe Witt (joe.w...@gmail.com) wrote:
>
> Peter,
>
> Yeah I think we're all circling around similar thoughts on things 
> which are 'best for a major release' and we need to start codifying 
> that. At the same time we need this to be focused on items which can 
> only reasonably happen in a major release and not become a new kitchen 
> sink for JIRAs/ideas. We should frame up a wiki page for this effort. 
> I'm happy to kick that off soon (time permitting). In my mind the key 
> domino here is having a Flow Registry that can hold extensions and we 
> can then make nifi
> 2.0 fundamentally about distributing nifi as a kernel (small as 
> possible) and all extensions come from a flow registry on demand. 
> Other obvious things like Java 11 as the base requirement and killing 
> off deprecated things come to mind.
>
> Thanks
>
> On Fri, Jun 14, 2019 at 11:45 AM Peter Wicks (pwicks) 
> 
> wrote:
>
> > I've seen a lot of comments along the line of, "I don't think this 
> > will happen before NiFi 2.0". Do we have a roadmap/list somewhere of 
> > the big general changes planned for NiFi 2.0 or some kind of 2.0 roadmap?
> >
> > --Peter
> >
>


NiFi 2.0 Roadmap

2019-06-14 Thread Peter Wicks (pwicks)
I've seen a lot of comments along the line of, "I don't think this will happen 
before NiFi 2.0". Do we have a roadmap/list somewhere of the big general 
changes planned for NiFi 2.0 or some kind of 2.0 roadmap?

--Peter


RE: [EXT] Re: GitHub Stuff

2019-06-11 Thread Peter Wicks (pwicks)
I like having signed commits. I develop on both Windows and Linux, but have 
only had success getting signing working on Windows (which was a bit 
complicated as it was). You can see when I switched from mostly Windows to 
mostly Linux by when I stopped signing commits...

Thanks,
  Peter

-Original Message-
From: Andy LoPresto  
Sent: Tuesday, June 11, 2019 1:25 PM
To: dev@nifi.apache.org
Subject: [EXT] Re: GitHub Stuff

I strongly support both of these suggestions. Thanks for starting the 
conversation Bryan. GPG signing is very important for security and for 
encouraging the rest of the community to adopt these practices as well. 


Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Jun 11, 2019, at 11:42 AM, Bryan Bende  wrote:
> 
> I had two thoughts related to our GitHub usage that I wanted to throw 
> out there for PMC members and committers...
> 
> 1) I think it would be helpful if everyone setup the link between 
> their Apache id and github [1]. Setting up this link puts you into the 
> nifi-committers group in Apache (currently 17 of us are in there), and 
> I believe this is what controls the list of users that can be selected 
> as a reviewer on a pull request. Since PRs are the primary form of 
> contribution, it would be nice if all of the PMC/committers were in 
> the reviewer list, but of course you can continue to commit against 
> Gitbox without doing this.
> 
> 2) I also think it would be nice if most of the commits in the repo 
> were signed commits that show up as "Verified" in GitHub [2]. Right 
> now I think we lose the verification if the user reviewing the commit 
> doesn't have signing setup, because when you amend the commit to add 
> "This closes ...", it technically produces a new commit hash, thus 
> making the original signature no longer apply (at least this is what I 
> think is happening, but other may know more).
> 
> These are obviously just my opinions and no one has to do these 
> things, but just thought I would throw it out there for discussion in 
> case anyone wasn't aware.
> 
> -Bryan
> 
> [1] https://gitbox.apache.org/setup/
> [2] https://help.github.com/en/articles/signing-commits



RE: [EXT] Sliding windows

2019-06-04 Thread Peter Wicks (pwicks)
Craig,

If you have a timestamp set as an attribute on the FlowFile, then this is kind 
of possible.

Have a regular MergeContent processor, with "Maximum Group Size" set to 1 mb, 
set "Max Bin Age" to 3 min; you may need to tweak settings to get the right 
cadence, but these are generally the settings you need to touch. Use the 
"Merged" relationship for whatever you need. To create the Window, pass the 
"Original" relationship to a RouteOnAttribute processor.

In the RouteOnAttribute use NiFi Expression Language to calculate how old the 
FlowFile is (using the timestamp attribute I mentioned). If the FlowFile is 
older than x, drop it, else send it back to the MergeContent processor.

Using this process, it should be easy to get a 5 min rolling window (drop any 
FlowFile older than 5 min in RouteOnAttribute).
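
A rough sketch of the age check RouteOnAttribute would perform (the attribute name `event.timestamp` and the exact Expression Language shown in the comment are assumptions; adjust them for your flow):

```python
import time

# Roughly the check a RouteOnAttribute rule such as
#   ${now():toNumber():minus(${event.timestamp}):gt(300000)}
# would perform, assuming `event.timestamp` is epoch milliseconds.
def is_outside_window(event_timestamp_ms, window_ms=5 * 60 * 1000, now_ms=None):
    """True if the flowfile fell out of the rolling window and should be dropped."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - event_timestamp_ms > window_ms
```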

I don't know that this perfectly answers what you asked, but does it give you a 
good direction to investigate?

Thanks,
  Peter

-Original Message-
From: Craig Knell  
Sent: Tuesday, June 4, 2019 1:32 AM
To: dev@nifi.apache.org
Subject: [EXT] Sliding windows

Hi Folks

We have a stream of data that I need to window to 5 minutes and the window is 
to slide every 3 minutes. Each minute is 1 MB, so I therefore have to deliver 
5 MB per 3 minutes.

What is the best way of achieving this in nifi?

Best regards

Craig


RE: [EXT] Re: [ANNOUNCE] New Apache NiFi PMC member Peter Wicks

2019-05-31 Thread Peter Wicks (pwicks)
Thanks everyone. Looking forward to continuing to work with the community for a 
long time to come.

-Original Message-
From: Mike Thomsen  
Sent: Friday, May 31, 2019 5:50 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: [ANNOUNCE] New Apache NiFi PMC member Peter Wicks

Congratulations!

On Thu, May 30, 2019 at 11:55 PM Sivaprasanna 
wrote:

> Congratulations, Peter!
>
> On Fri, 31 May 2019 at 7:07 AM, Michael Moser  wrote:
>
> > Great work, Peter.  Congrats!
> >
> >
> > On Thu, May 30, 2019 at 8:05 PM Marc Parisi  wrote:
> >
> > > Congrats!
> > >
> > > On Thu, May 30, 2019, 2:58 PM Jeff  wrote:
> > >
> > > > Welcome to the PMC, Peter!  Congrats!
> > > >
> > > > On Thu, May 30, 2019 at 2:45 PM Tony Kurc  wrote:
> > > >
> > > > > Congratulations Peter!!
> > > > >
> > > > > On Thu, May 30, 2019 at 11:21 AM Aldrin Piri 
> > > wrote:
> > > > >
> > > > > > NiFi Community,
> > > > > >
> > > > > > On behalf of the Apache NiFi PMC, I am pleased to announce that
> > Peter
> > > > > Wicks
> > > > > > has accepted the PMC's invitation to join the Apache NiFi PMC.
> > > > > >
> > > > > > Peter's contributions have been plentiful in code, community,
> > reviews
> > > > and
> > > > > > discussion after becoming a committer in November 2017.  His
> impact
> > > > > across
> > > > > > NiFi has led to improvements surrounding Kerberos, GetFile,
> > > ListFile,
> > > > > > Clustering, Node Offload, Recordset Writers, HDFS, and Database
> > > related
> > > > > > processors among others.
> > > > > >
> > > > > > Thank you for all your contributions and welcome to the PMC,
> Peter!
> > > > > >
> > > > > > --aldrin
> > > > > >
> > > > >
> > > >
> > >
> >
>


RE: [EXT] [discuss] Splitting NiFi framework and extension repos and releases

2019-05-30 Thread Peter Wicks (pwicks)
One more "not awesome" would be that core changes that affect extensions will 
be a little harder to test. If I make a core change that changes the signature 
of an interface/etc... I'll need to do some extra work to make sure I don't 
break extensions that use it.

Still worth it, just one more thing to mention.

-Original Message-
From: Joe Witt  
Sent: Thursday, May 30, 2019 9:19 AM
To: dev@nifi.apache.org
Subject: [EXT] [discuss] Splitting NiFi framework and extension repos and 
releases

Team,

We've discussed this a bit over the years in various forms but it again seems 
time to progress this topic and enough has changed I think to warrant it.

Tensions:
1) Our build times take too long.  In travis-ci for instance it takes 40 
minutes when it works.
2) The number of builds we do has increased.  We do us/jp/fr builds on open and 
oracle JDKs.  That is 6 builds.
3) We want to add Java 11 support such that one could build with 8 or 11 and 
the above still apply.  That becomes 12 builds.
4) With the progress in NiFi registry we can now load artifacts there and could 
pull them into NiFi.  And this integration will only get better.
5) The NiFi build is too huge and cannot grow any longer or else we cannot 
upload convenience binaries.

We cannot solve all the things just yet but we can make progress.  I suggest we 
split the NiFi 'framework/application' into its own repository and release 
cycle, separate from the 'nifi extensions', which would likewise get their own 
repository and release cycle.  The NiFi release would still pull in a specific set of extension 
bundles so to our end users at this time there is no change. In the future we 
could also just stop including the extensions in nifi the application and they 
could be sourced at runtime as needed from the registry (call that a NiFi 2.x 
thing).

Why does this help?
- Builds would only take as long as just extensions take or just core/app 
takes.  This reduces time for each change cycle and reduces load on travis-ci 
which runs the same tests over and over and over for each pull request/push 
regardless of whether it was an extension or core.

- It moves us toward the direction we're heading anyway whereby extensions can 
have their own lifecycle from the framework/app itself.

How is this not awesome:
- Doesn't yet solve for the large builds problem.  I think we'll get there with 
a NiFi 2.x release which fully leverages nifi-registry for retrieval of all 
extensions.
- Adds another 'thing we need to do a release cycle for'.  This is generally 
unpleasant but it is paid for once a release cycle and it does allow us to 
release independently for new cool extensions/fixes apart from the framework 
itself.

Would be great to hear others thoughts if they too feel it is time to make this 
happen.

Thanks
Joe


RE: [EXT] Re: Kicking the Kerberos out of org.apache.nifi.hadoop

2019-04-30 Thread Peter Wicks (pwicks)
Thanks Bryan, I was reviewing this earlier.  Some context on what I'm actually 
trying to do:

I'm trying to add Kerberos support to InvokeHTTP, but didn't really want to add 
a dependency on the org.apache.nifi.hadoop library.  The Livy controller 
service, which I was using as an example for this, uses two classes from 
org.apache.nifi.hadoop to build an Http Connection that uses Kerberos by 
referencing KerberosKeytabCredentials and 
KerberosKeytabSPNegoAuthSchemeProvider.

I also looked at the Solr method, but couldn't understand how it worked... I 
just took a second look and realized that most of the magic happens in the 
onTrigger through the KerberosAction, and not in KerberosHttpClientConfigurer, 
which was a bit misleading, but now makes more sense.

I'll take a shot at doing it this way; maybe I can move 
`KerberosHttpClientConfigurer` to nifi-security-utils?

Thanks,
  Peter


-Original Message-
From: Bryan Bende  
Sent: Tuesday, April 30, 2019 10:33 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: Kicking the Kerberos out of org.apache.nifi.hadoop

I created a mini-kerberos framework here in nifi-security-utils that does not 
depend on Hadoop:

https://github.com/apache/nifi/tree/master/nifi-commons/nifi-security-utils/src/main/java/org/apache/nifi/security/krb

The first use-case was for the Solr processors and I believe this code is also 
now used in the DBCPConnectionPool.

We could consider moving code there if it makes sense, but maybe see if you can 
achieve what you need to do with what is there already.

I think that other Hadoop spnego stuff was added around the same time as part 
of a different effort, but likely a bit of overlap.

On Tue, Apr 30, 2019 at 12:21 PM Peter Wicks (pwicks)  wrote:
>
> I was thinking of moving the non-Hadoop specific Kerberos classes out of 
> org.apache.nifi.hadoop into a more central location. There are some good 
> utility classes, such as KerberosKeytabCredentials, that it would be nice to 
> have centralized into a non-Hadoop specific library.
>
> Would it be appropriate to co-locate these with the 
> KerberosCredentialsService interface, or is there a better home in a 
> different location?
>
> Thanks,
>   Peter


Kicking the Kerberos out of org.apache.nifi.hadoop

2019-04-30 Thread Peter Wicks (pwicks)
I was thinking of moving the non-Hadoop specific Kerberos classes out of 
org.apache.nifi.hadoop into a more central location. There are some good 
utility classes, such as KerberosKeytabCredentials, that it would be nice to 
have centralized into a non-Hadoop specific library.

Would it be appropriate to co-locate these with the KerberosCredentialsService 
interface, or is there a better home in a different location?

Thanks,
  Peter


RE: [EXT] Re: Custom Javascript in UI Without Custom Build

2019-04-25 Thread Peter Wicks (pwicks)
Joe,

We are testing out a product called AppDynamics for application monitoring/user 
experience tracking. We've hooked it into NiFi as a javaagent, but it also has 
an option where you can inject some JavaScript into the frontend and monitor UX 
there.

Doesn't really feel like something that NiFi needs to support, unless there was 
going to be a generic function to add .js files to make them globally 
accessible, like we do with jars in the lib folder.

--Peter

-Original Message-
From: Joe Witt  
Sent: Thursday, April 25, 2019 1:42 PM
To: dev@nifi.apache.org
Subject: [EXT] Re: Custom Javascript in UI Without Custom Build

Peter,

I'm not aware of any plans to provide/support what you're suggesting at this 
point.

Can you describe more what you mean and would be doing with this?  Is what you 
are trying to do something that the UI should just enable?

Thanks

On Thu, Apr 25, 2019 at 3:26 PM Peter Wicks (pwicks) 
wrote:

> Pinging to see if anyone has a workaround for this?
>
> -Original Message-
> From: Peter Wicks (pwicks) 
> Sent: Wednesday, April 17, 2019 8:22 AM
> To: dev@nifi.apache.org
> Subject: Custom Javascript in UI Without Custom Build
>
> I have some custom Javascript, used for application monitoring, that 
> I'd like to put in the UI. I know how to do this if I'm willing to run 
> a custom build of NiFi.  Is there any way to do this without a custom build of 
> NiFi?
>
> My first thought was creating a dummy processor with an Advanced UI, 
> and seeing if maybe I could get Javascript for that Advanced UI to run 
> when the NiFi UI is loaded. This is a bit of work, so I thought I'd 
> ask before trying this.
>
> My second thought was to try and abuse the nifi.ui.banner.text 
> setting, but this uses `text()` to set the banner text, so would not 
> work for a block of Javascript.
>
> I ran across this article by Scott Aslan, 
> https://community.hortonworks.com/articles/134888/an-apache-nifi-frontend-developers-cookbook-part-1.html,
> but it was focused on core NiFi UI (great read, just not the direction 
> I'm trying to go).
>
> Thanks,
>   Peter
>


RE: Custom Javascript in UI Without Custom Build

2019-04-25 Thread Peter Wicks (pwicks)
Pinging to see if anyone has a workaround for this?

-Original Message-
From: Peter Wicks (pwicks)  
Sent: Wednesday, April 17, 2019 8:22 AM
To: dev@nifi.apache.org
Subject: Custom Javascript in UI Without Custom Build

I have some custom Javascript, used for application monitoring, that I'd like 
to put in the UI. I know how to do this if I'm willing to run a custom build of 
NiFi.  Is there any way to do this without a custom build of NiFi?

My first thought was creating a dummy processor with an Advanced UI, and seeing 
if maybe I could get Javascript for that Advanced UI to run when the NiFi UI is 
loaded. This is a bit of work, so I thought I'd ask before trying this.

My second thought was to try and abuse the nifi.ui.banner.text setting, but 
this uses `text()` to set the banner text, so would not work for a block of 
Javascript.

I ran across this article by Scott Aslan, 
https://community.hortonworks.com/articles/134888/an-apache-nifi-frontend-developers-cookbook-part-1.html,
 but it was focused on core NiFi UI (great read, just not the direction I'm 
trying to go).

Thanks,
  Peter


RE: [EXT] Re: Latest NiFi customs?

2019-04-18 Thread Peter Wicks (pwicks)
One other thing, that seems to catch me every time I upgrade an old instance: 
you will need to go in and allow users to read provenance data again. Somewhere 
along the way (1.6?) provenance reading moved into a separate policy, and it 
does not get assigned to anyone after upgrade.

-Original Message-
From: Lars Francke  
Sent: Thursday, April 18, 2019 3:05 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: Latest NiFi customs?

Hi,

I have just one data point on the version but I would suggest moving to 1.9 if 
you're just starting out and if you're using the Record based processors with 
potentially dynamic/changing schemas.
The automatic schema inference described in this blog post[1] makes things much 
easier (or possible). I see no reason to start with 1.8 today if you have the 
option of upgrading.

Java: Java 8, while outdated, is still pretty much standard almost everywhere I 
look.

Cheers,
Lars

[1] <
https://medium.com/@abdelkrim.hadjidj/democratizing-nifi-record-processors-with-automatic-schemas-inference-4f2b2794c427
>

On Wed, Apr 17, 2019 at 4:49 PM Russell Bateman 
wrote:

> After a couple of years absence from NiFi (prior to Java 9), I find 
> myself just now back in a developer role in a company that uses NiFi.
> (This is a pleasant thought, I might add, as I believe that NiFi 
> rocks.) I have inherited an existing implementation that's sorely aged 
> and, though I've googled mostly in vain on what I'm asking, would like 
> to dot the /i/s and cross the /t/s.
>
> *What version of NiFi?*
> How far forward (toward) NiFi 1.9 should I push my company? I see that 
> the Docker container is at 1.8 if that's any reference. I'm tempted 
> right now to move to 1.8 immediately.
>
> *What about Java?*
> What is the state of Java in NiFi? It appears that it's still back on 
> Java 8? I develop using IntelliJ IDEA. While I constrain the level of 
> language features to 1.8, it isn't realistic to contemplate developing 
> in IDEA without a pretty modern JDK version (I use Java 11 today 
> because LTS). I assume, nevertheless, that if I'm careful not to 
> permit--by setting in IDEA--the use of language constructs in my 
> custom processors to exceed 1.8, I should be okay, right? Or, am I 
> missing something and there are other considerations to watch out for?
>
> Thanks for any and all comments, setting me straight, etc.
>


Custom Javascript in UI Without Custom Build

2019-04-17 Thread Peter Wicks (pwicks)
I have some custom Javascript, used for application monitoring, that I'd like 
to put in the UI. I know how to do this if I'm willing to run a custom build of 
NiFi.  Is there any way to do this without a custom build of NiFi?

My first thought was creating a dummy processor with an Advanced UI, and seeing 
if maybe I could get Javascript for that Advanced UI to run when the NiFi UI is 
loaded. This is a bit of work, so I thought I'd ask before trying this.

My second thought was to try and abuse the nifi.ui.banner.text setting, but 
this uses `text()` to set the banner text, so would not work for a block of 
Javascript.

I ran across this article by Scott Aslan, 
https://community.hortonworks.com/articles/134888/an-apache-nifi-frontend-developers-cookbook-part-1.html,
 but it was focused on core NiFi UI (great read, just not the direction I'm 
trying to go).

Thanks,
  Peter


RE: [EXT] MS SQL CDC Processor

2019-04-17 Thread Peter Wicks (pwicks)
Viresh,

There are a couple of options for MS SQL CDC.  The "simple" option is to create 
a view in MS SQL that joins the CDC table with the lsn_time_mapping table. Then 
use the QueryDatabaseTable processor to load this view incrementally using the 
"tran_end_time" column.

I see you've also found my pull request from October 2017. I put a lot of time 
into getting this working and covering all of my use cases, unfortunately, 
there are very few users available to test. Because of the lack of testing, 
this PR has been hanging out for quite a while now.  MS SQL CDC used to be an 
Enterprise edition only feature, and only in recent years has become available 
in all versions. I'd still like to see this PR get merged in the future, if 
users become available for testing.

Thanks,
  Peter

-Original Message-
From: Viresh R Navalli  
Sent: Wednesday, April 17, 2019 12:41 AM
To: dev@nifi.apache.org
Cc: Chandrashekhar Thakare 
Subject: [EXT] MS SQL CDC Processor

Hi Team,

We are looking into CDC for MS SQL, got reference to below link.
https://github.com/apache/nifi/pull/2231

I checked the nifi-1.9.2 release and didn't see such a processor. Is this 
functionality available in the latest release?

_
Thanks & Regards
Viresh N







LivySessionController - Unexpected Behavior

2019-04-10 Thread Peter Wicks (pwicks)
I've been working on a few enhancements to the LivySessionController as part of 
my ticket NIFI-6175 : Livy - Add Support for All Missing Session Startup 
Features.

I've been running into a lot of... interesting? features in the code, and I was 
hoping someone (Matt?) who was involved with it from the beginning could shed 
some light on these questions before I change any default behavior/fix any 
perceived bugs.


  *   If NiFi is running in a cluster then a race condition is entered where 
you can't really predict how many Livy sessions will be created.
 *   If the controller service was recently running, and you just restarted 
the controller service, then everything is fine. The existing Livy sessions 
will be found and used. But even in this scenario it's not working correctly, 
because if I asked the controller service to use two instances, it will use all 
available sessions in Livy for the "Kind" (more on this farther down).
 *   The logic checks if any sessions exist on controller startup (and on 
each update interval), since all instances of the controller service start up 
roughly at the same time, you might end up with full duplication across the 
cluster, or you might end up with 50% duplication, or anything in between, 
depending on how quickly session create requests get sent in.
  *   The controller service will "steal" any open Livy session that it can 
see, so long as the "Kind" of session matches the configuration. It will also 
overreach in session allocation if more sessions are available than it needs.
 *   If there are 10 Livy sessions open, it will load all 10 as available 
for use, even if I only wanted 2. If some of those sessions die off it does not 
create new ones, but it will keep using them as long as they are available.
 *   If you have multiple Livy Controller Services, it's very hard 
(impossible?) to keep sessions separate if they are running under the same 
account (and maybe even if they are not, have not spent much time testing the 
separate account option).
 *   The code does not block the sessions/mark it as used. It relies on the 
Livy Session state value of "Idle" to designate a session as available. This is 
another race condition where running multiple threads of 
ExecuteSparkInteractive, either because you're in a cluster, or because you just 
have multiple threads, would easily dual assign a Livy session instead of using 
the expected WAIT relationship.
  *   The Controller Service is unable to delete existing sessions. So even 
though there is a Controller Service shutdown hook in the code, it does not 
clean up its open sessions and they have to time out.

I don't have resolutions for all these issues. But one thing I was thinking 
about doing was using the Livy Session Name parameter to tag the session when 
it's created so it's associated with a specific Controller Service instance by 
UUID (so it would work across a cluster too), and maybe only manage Livy 
sessions from the master node? (not sure how to find out if you're on the master node, but 
a thought).
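The tagging idea above can be reduced to a small sketch (plain Python; the naming scheme and the Livy session "name" field are assumptions, not what the controller service does today): embed the controller service's UUID in the session name at creation time, then only manage sessions carrying that UUID.

```python
import uuid

# One identifier per controller-service instance (hypothetical scheme).
SERVICE_ID = str(uuid.uuid4())

def session_name(service_id, n):
    """Name a new Livy session so its owner is recoverable later."""
    return f"nifi-livy-{service_id}-{n}"

def owned_sessions(sessions, service_id):
    """Keep only the sessions whose name embeds this service's UUID,
    instead of 'stealing' every idle session of the matching kind."""
    prefix = f"nifi-livy-{service_id}-"
    return [s for s in sessions if s.get("name", "").startswith(prefix)]
```

With this filter in place, two controller services of the same "Kind" on the same Livy server would no longer claim each other's sessions, though the cluster race on session creation would still need separate handling.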

Thanks,
  Peter




RE: [EXT] Re: Direct access to FlowFile Content Repository Path?

2019-03-04 Thread Peter Wicks (pwicks)
Thanks Joe. The data comes out of a database (ExecuteSQL). The Content Repository 
is not an area where I'm very knowledgeable on the internals.

Thanks again.

-Original Message-
From: Joe Witt  
Sent: Monday, March 4, 2019 10:05 PM
To: dev@nifi.apache.org
Subject: [EXT] Re: Direct access to FlowFile Content Repository Path?

Peter

It is not possible and if it were it would only be available for repo 
implementations that use files and further would need to be limited to cases 
where the content on disk was in a file by itself (unlikely) or for processes 
that would honor the offsets and length, would be read only, etc..

Another option is to just invoke the 3rd party lib on the files before pulling 
them into nifi.  If data arrives in list/fetch run list/execute custom/fetch 
instead maybe.

Thanks

On Mon, Mar 4, 2019 at 11:43 PM Peter Wicks (pwicks) 
wrote:

> I'm working on a custom processor. The processor calls a 3rd party 
> library that needs the path to my FlowFile content (no streams/arrays, 
> just file names).
>
> I could write the content out to a temp file, but the content is 
> already right there in the content repository... and many of the files 
> are very large, and it would increase execution times a lot to write 
> the data to a temp location a second time.
>
> Is it possible, in the context of a custom processor, to get the path 
> to the FlowFile in the content repo?
>
> Thanks,
>   Peter
>


Direct access to FlowFile Content Repository Path?

2019-03-04 Thread Peter Wicks (pwicks)
I'm working on a custom processor. The processor calls a 3rd party library that 
needs the path to my FlowFile content (no streams/arrays, just file names).

I could write the content out to a temp file, but the content is already right 
there in the content repository... and many of the files are very large, and it 
would increase execution times a lot to write the data to a temp location a 
second time.

Is it possible, in the context of a custom processor, to get the path to the 
FlowFile in the content repo?

Thanks,
  Peter
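For reference, the fallback in question (writing the content to a temporary file and handing the path to the library) looks roughly like this in plain Python; `process_file` is a hypothetical stand-in for the third-party call, and this extra copy is exactly the cost Peter is hoping to avoid.

```python
import os
import tempfile

def call_with_temp_path(content, process_file):
    """Write `content` to a temporary file, invoke a path-only API on it,
    and clean up afterwards. `process_file` stands in for the 3rd-party
    library call that accepts only a file name."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(content)
        return process_file(path)
    finally:
        os.remove(path)  # the temp copy is always deleted, even on failure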


RE: [EXT] [discuss] release apache nifi 1.9.0

2019-02-06 Thread Peter Wicks (pwicks)
I'd like to see a review of this bug before we release 1.9.0, as it helps 
mature the Node Offload feature.

NIFI-5940 Cluster Node Offload Hangs if any RPG on flow is Disabled
https://github.com/apache/nifi/pull/3255

Thanks,
  Peter

-Original Message-
From: Joe Witt [mailto:joew...@apache.org] 
Sent: Tuesday, February 5, 2019 10:51 AM
To: dev@nifi.apache.org
Subject: [EXT] [discuss] release apache nifi 1.9.0

Team,

https://issues.apache.org/jira/browse/NIFI-5995?jql=project%20%3D%20NIFI%20AND%20status%20in%20(Resolved%2C%20%22Patch%20Available%22)%20AND%20fixVersion%20%3D%201.9.0%20ORDER%20BY%20key%20DESC

We have a pretty large list of bugs/features added already to justify a
1.9.0 release as well.

I'm happy to give RM a go again unless there are volunteers strongly interested.

Thanks
Joe


RE: [EXT] Re: [DISCUSS] Early, voluntary relocation to GitBox

2018-12-14 Thread Peter Wicks (pwicks)
Aldrin,

Your comments have left me confused. Yes, in the past, I would always use `-s` to 
sign the commits and push them. But isn't the idea behind the improved GitHub 
integration that now I can just go to the PR and tell GitHub to accept the 
PR after I've reviewed it. Thus the manual act of amending the commit and 
signing it does not happen. Does that mean there is no Sign Off happening to 
the commit?

--Peter

-Original Message-
From: Aldrin Piri [mailto:aldrinp...@gmail.com] 
Sent: Friday, December 14, 2018 2:32 PM
To: dev 
Subject: Re: [EXT] Re: [DISCUSS] Early, voluntary relocation to GitBox

That level of interoperability is not there and, from some quick searching, do 
not believe it to be something that is supported. [1]

Not sure what is meant by sign-off being automated.  Personally, I use the -s 
flag when amending/pushing reviewed commits.

[1]
https://blogs.apache.org/infra/entry/improved_integration_between_apache_and

On Fri, Dec 14, 2018 at 3:34 PM Peter Wicks (pwicks) 
wrote:

> Does closing a PR through GitHub close the JIRA ticket correctly? I'm 
> assuming sign-off is automated this way, but wasn't sure about this step.
>
> Thanks,
>   Peter
>
> -Original Message-
> From: Aldrin Piri [mailto:aldrinp...@gmail.com]
> Sent: Friday, December 14, 2018 6:23 AM
> To: dev 
> Subject: [EXT] Re: [DISCUSS] Early, voluntary relocation to GitBox
>
> Some clarifying notes until I can get all the updates into place and 
> generate a new email.
>
> There are now two locations that are writable by committers for each 
> of our repositories.  The updated repository locations can be seen in 
> the Apache NiFi section of https://gitbox.apache.org/repos/asf.  NiFi, 
> for instance, is available at 
> https://gitbox.apache.org/repos/asf/nifi-site.git.  The web view of 
> the repository is located at 
> https://gitbox.apache.org/repos/asf?p=nifi-site.git.  Please note the 
> subtle distinction!
>
> If you are a contributor still on your way to committership or you 
> are a committer content with the same functionality we had with our old 
> setup (git-wip), you will use the above listed locations.  You can 
> update your location by performing the following steps:
>
>- Go to the root of your source directory, we'll use nifi as an example
>   - cd ${repo_home}/nifi
>- Update your remote.  On a default checkout this will be origin, but
>use the name that is pointing to the old git-wip location (you can view
>these using git remote -v)
>   - git remote set-url origin
>   https://gitbox.apache.org/repos/asf/nifi.git
>
>
> If you are a committer and interested in making use of the tighter 
> GitHub integration, there are a few steps needed to enable this 
> described at https://gitbox.apache.org/setup/.  Please follow those 
> and give some time to allow the background processes to sync.  
> Successfully completing them should result in the email from GitHub on 
> behalf of Gitbox I mentioned previously.
>
> Again, look for a bit more polished set of instructions in the next 
> hour or so but feel free to add any questions here in the interim if 
> it is holding you up.
>
>
>
> On Fri, Dec 14, 2018 at 7:35 AM Aldrin Piri  wrote:
>
> > Hey all,
> >
> > Just a quick note that migration has completed.  I'll be gathering 
> > up information shortly and sending out another, separate email to 
> > the community with any needed changes and will work to update our 
> > docs appropriately.
> >
> > Committers, you should have received an email regarding your addition 
> > to the nifi group on GitHub.
> >
> > --aldrin
> >
> > On Thu, Dec 13, 2018 at 1:20 PM Aldrin Piri 
> wrote:
> >
> >> Hey folks,
> >>
> >> The JIRA ticket has been submitted [1].  I will keep this thread 
> >> updated as things progress and then generate a separate "helper"
> >> email for instructions/any updates that may be needed upon completion.
> >>
> >> [1] https://issues.apache.org/jira/browse/INFRA-17419
> >>
> >> On Mon, Dec 10, 2018 at 10:51 AM Pierre Villard < 
> >> pierre.villard...@gmail.com> wrote:
> >>
> >>> +1 as well, should ease the contribution workflow.
> >>> Thanks for volunteering Aldrin.
> >>>
> >>> Le lun. 10 déc. 2018 à 16:41, Laszlo Horvath 
> >>>  a écrit :
> >>>
> >>> > +1
> >>> >
> >>> > On 10/12/2018, 15:22, "Michael Moser"  wrote:
> >>> >
> >>> > +1 from me, sounds great.
> >>> >
> >>> >
> >>> > On Mon, Dec 10, 2018 at

Backwards Compatibility for Property with Default Value

2018-12-14 Thread Peter Wicks (pwicks)
I'm working on a ticket that tweaks the way the Minimum File Age property works 
in GetFile and ListFile (NIFI-5897). Right now users see "0 sec" and assume 
that means the minimum age check does not happen, when in fact it does; and in 
some timezone scenarios you have future dated files, and this check delays 
files by hours and hours...

I'm trying to provide backwards compatibility with my fix by maintaining a 
default value of "0 sec", in case some users actually are ignoring future dated 
files, but making the property not required. But, if you clear the property, 
the default value takes over. And if you put in an "Empty String", well, then 
it just fails validation.

Thoughts on the path forward? I could just remove the default value, and users 
who already have the processor instantiated in their Flow when they upgrade 
will still be fine, as the old default value will be loaded.

Thanks,
  Peter
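The two behaviors at issue reduce to a tiny sketch (plain Python, not NiFi's actual property API): with a default present, clearing the property silently falls back to the default, so only a property with no default can resolve to "check disabled".

```python
def effective_min_age(configured, default):
    """Resolve a property the way described above: an explicit value wins;
    a cleared property falls back to the default; only a property with no
    default can resolve to None, i.e. 'skip the minimum-age check'."""
    if configured is not None and configured != "":
        return configured
    return default

# With a "0 sec" default, clearing the property still yields "0 sec":
#   effective_min_age(None, "0 sec") -> "0 sec"
# With no default, clearing it disables the check entirely:
#   effective_min_age(None, None)    -> None
```

This is why removing the default (while relying on existing flows having the old default persisted) is the only path that makes "cleared" mean "no check".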


RE: [EXT] Re: [DISCUSS] Early, voluntary relocation to GitBox

2018-12-14 Thread Peter Wicks (pwicks)
Does closing a PR through GitHub close the JIRA ticket correctly? I'm assuming 
sign-off is automated this way, but wasn't sure about this step.

Thanks,
  Peter

-Original Message-
From: Aldrin Piri [mailto:aldrinp...@gmail.com] 
Sent: Friday, December 14, 2018 6:23 AM
To: dev 
Subject: [EXT] Re: [DISCUSS] Early, voluntary relocation to GitBox

Some clarifying notes until I can get all the updates into place and generate a 
new email.

There are now two locations that are writable by committers for each of our 
repositories.  The updated repository locations can be seen in the Apache NiFi 
section of https://gitbox.apache.org/repos/asf.  NiFi, for instance, is 
available at https://gitbox.apache.org/repos/asf/nifi-site.git.  The web view 
of the repository is located at 
https://gitbox.apache.org/repos/asf?p=nifi-site.git.  Please note the subtle 
distinction!

If you are a contributor still on your way to committership or you are 
a committer content with the same functionality we had with our old setup 
(git-wip), you will use the above listed locations.  You can update your 
location by performing the following steps:

   - Go to the root of your source directory, we'll use nifi as an example
  - cd ${repo_home}/nifi
   - Update your remote.  On a default checkout this will be origin, but
   use the name that is pointing to the old git-wip location (you can view
   these using git remote -v)
  - git remote set-url origin
  https://gitbox.apache.org/repos/asf/nifi.git


If you are a committer and interested in making use of the tighter GitHub 
integration, there are a few steps needed to enable this described at 
https://gitbox.apache.org/setup/.  Please follow those and give some time to 
allow the background processes to sync.  Successfully completing them should 
result in the email from GitHub on behalf of Gitbox I mentioned previously.

Again, look for a bit more polished set of instructions in the next hour or so 
but feel free to add any questions here in the interim if it is holding you up.



On Fri, Dec 14, 2018 at 7:35 AM Aldrin Piri  wrote:

> Hey all,
>
> Just a quick note that migration has completed.  I'll be gathering up 
> information shortly and sending out another, separate email to the 
> community with any needed changes and will work to update our docs 
> appropriately.
>
> Committers, you should have received an email regarding your addition 
> to the nifi group on GitHub.
>
> --aldrin
>
> On Thu, Dec 13, 2018 at 1:20 PM Aldrin Piri  wrote:
>
>> Hey folks,
>>
>> The JIRA ticket has been submitted [1].  I will keep this thread 
>> updated as things progress and then generate a separate "helper" 
>> email for instructions/any updates that may be needed upon completion.
>>
>> [1] https://issues.apache.org/jira/browse/INFRA-17419
>>
>> On Mon, Dec 10, 2018 at 10:51 AM Pierre Villard < 
>> pierre.villard...@gmail.com> wrote:
>>
>>> +1 as well, should ease the contribution workflow.
>>> Thanks for volunteering Aldrin.
>>>
>>> Le lun. 10 déc. 2018 à 16:41, Laszlo Horvath 
>>>  a écrit :
>>>
>>> > +1
>>> >
>>> > On 10/12/2018, 15:22, "Michael Moser"  wrote:
>>> >
>>> > +1 from me, sounds great.
>>> >
>>> >
>>> > On Mon, Dec 10, 2018 at 4:07 AM Arpad Boda 
>>> > 
>>> > wrote:
>>> >
> >>> > > +1 (for being the guinea pig)
>>> > >
>>> > > On 09/12/2018, 04:01, "Aldrin Piri" 
>>> wrote:
>>> > >
>>> > > Thanks to those of you that responded.
>>> > >
>>> > > I think my tentative plan is to give this a few more days to
>>> see
>>> > if
>>> > > there
>>> > > are any strong objections.  Otherwise, I'll look to file the
>>> > ticket
>>> > > around
>>> > > midweek and then look to begin the process toward the end of
>>> > next week
>>> > > to
>>> > > hopefully minimize disruptions.  Thanks again!
>>> > >
>>> > > On Sat, Dec 8, 2018 at 12:09 PM Tony Kurc 
>>> > wrote:
>>> > >
>>> > > > +1, sounds like a great idea
>>> > > >
>>> > > > On Sat, Dec 8, 2018 at 7:44 AM Mike Thomsen <
>>> > mikerthom...@gmail.com>
>>> > > > wrote:
>>> > > >
>>> > > > > +1
>>> > > > >
>>> > > > > On Fri, Dec 7, 2018 at 9:08 PM Sivaprasanna <
>>> > > sivaprasanna...@gmail.com>
>>> > > > > wrote:
>>> > > > >
>>> > > > > > +1 (non-binding)
>>> > > > > >
>>> > > > > > I’m in. Thanks for doing it, Aldrin.
>>> > > > > >
>>> > > > > >
>>> > > > > > On Sat, 8 Dec 2018 at 7:32 AM, James Wing <
>>> > jvw...@gmail.com>
>>> > > wrote:
>>> > > > > >
>>> > > > > > > +1, thanks for volunteering.
>>> > > > > > >
>>> > > > > > > > On Dec 7, 2018, at 13:39, Kevin Doran <
>>> > > kdoran.apa...@gmail.com>
>>> > > > > wrote:
>>> > > > > > > >
>>> > > > > > > > +1
>>> > > > > > > >
>>> > > > > > > > On 12/7/18, 15:17, "Andy 

RE: [EXT] Custom UI for Custom Service Controller

2018-11-30 Thread Peter Wicks (pwicks)
I haven't done this before either, but you named your file: 
META-INF/nifi-controller-service-configuration?

-Original Message-
From: Ron Aday [mailto:ronaday@outlook.com] 
Sent: Friday, November 30, 2018 10:12 AM
To: dev@nifi.apache.org
Subject: [EXT] Custom UI for Custom Service Controller

Good day to all!
I have built many custom processors with custom ui's with success, but I
can't seem to get it to work with a custom controller.   By that I mean the
"Advanced" button never shows up on the controller no matter what I do (so 
far).  Is there any relevant information available pertaining to this topic? 
Or has anyone been successful in this and care to share their solution?

Thanks,
R



--
Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/


RE: [EXT] Re: New Standard Pattern - Put Exception that caused failure in an attribute

2018-11-09 Thread Peter Wicks (pwicks)
Joe,

The only new thing we can do is Matt can finish reviewing the PR for NIFI-5744, 
which adds this as a feature to ExecuteSQL, as he was waiting for this 
discussion to come to a close. 
https://github.com/apache/nifi/pull/3107#issuecomment-433260483
But apart from that, nothing new to do now.

Thanks,
  Peter

-Original Message-
From: Joe Witt [mailto:joe.w...@gmail.com] 
Sent: Friday, November 9, 2018 2:36 PM
To: dev@nifi.apache.org
Subject: Re: [EXT] Re: New Standard Pattern - Put Exception that caused failure 
in an attribute

Peter

Ok cool.  So i think we agree on the state of things.  And for processors you 
want to add more details in failure cases to you can do so (provided we're not 
just bloating attributes all over).  And we'll recognize that this model is 
basically to help users and will likely be abused and be brittle.  But I think 
i'm saying 'there is nothing new to do now' then right?

Do you agree?

Thanks
On Fri, Nov 9, 2018 at 3:59 PM Peter Wicks (pwicks)  wrote:
>
> Joe,
>
> Several different opinions have been expressed by Matt Burgess, Mark Payne 
> and Bryan Bende about whether we should be storing exception information in 
> attributes, and the pros and cons, in this thread. Those opinions generally 
> matched yours, which is that a well-defined relationship is the best 
> approach. I don't disagree in anyway with the consensus, I agree that the 
> best solution is to use a well-defined relationship.
>
> The pattern I see in the Processor list I provided below is that almost all 
> of the processors work with external systems (outside of NiFi), and in many 
> cases the number of distinct exception classes that can occur is low, but the 
> variety of exceptions is high (JDBC/Stream Command). Matt did a good job of 
> discussing this for JDBC type processors in his reply.
>
> My users are desperate for these error details, especially on ExecuteSQL; and 
> I won't lie, users are absolutely going to parse the exceptions and use 
> RouteOnAttribute. And yep, it's going to be fragile and break sometimes. (I 
> don't know that this will be the primary use case, as troubleshooting using 
> this information will also be a major facet of its usefulness) The problem, 
> especially with JDBC, is that I don't see a reasonable alternative. There are 
> so many different JDBC drivers, and NiFi will only see a SQLException type 
> with differing text. Even if we went down the route of putting exception 
> parsers into the DBAdapters to provide per driver failure handling, all we've 
> really done is move the fragileness into NiFi, where it's hard coded until 
> the next release.
>
> Thank you,
>   Peter
>
> -Original Message-
> From: Joe Witt [mailto:joe.w...@gmail.com]
> Sent: Friday, November 9, 2018 12:23 PM
> To: dev@nifi.apache.org
> Subject: Re: [EXT] Re: New Standard Pattern - Put Exception that 
> caused failure in an attribute
>
> Peter,
>
> I'm not clear on what you are asking be done precisely and how far it should 
> be carried.
>
> You're right there are many processors which store exception/log details as 
> attributes on flowfiles before they route them (success,failure, etc..).  
> This is fine and can be documented with the WritesAttribute annotations to be 
> helpful.
>
> Where that model breaks down though and should never be used is when someone 
> wants to use the text of that String to safely/reliably automate something.  
> If the 'failure' reason for a given situation is precisely knowable enough 
> and could reasonably be valuable for routing it should be an explicit 
> relationship.  Attributes for exceptions/log values are useful provided they 
> are 'advisory' only meaning largely just intended for users/general 
> awareness.  But not for automation or to define an explicit interface.
>
> So, with the above said can you clarify what you are precisely requesting and 
> for 'who' - who is the actor.
>
> Thanks
> On Fri, Nov 9, 2018 at 2:12 PM Peter Wicks (pwicks)  wrote:
> >
> > A one week bump on this thread. --Peter
> >
> > -Original Message-
> > From: Peter Wicks (pwicks)
> > Sent: Friday, November 2, 2018 11:54 AM
> > To: dev@nifi.apache.org
> > Subject: RE: [EXT] Re: New Standard Pattern - Put Exception that 
> > caused failure in an attribute
> >
> > Dev Team,
> >
> > I don’t think we've reached a conclusion on this discussion, but would like 
> > to. I had not done enough research when I originally suggested this as a 
> > "New Pattern". Having done a bit more research now, I'd say this is already 
> > a well established pattern.
> >
> > Examples using this pattern already, with exception types/text written in 
> > FlowFile Attributes

RE: [EXT] Re: New Standard Pattern - Put Exception that caused failure in an attribute

2018-11-09 Thread Peter Wicks (pwicks)
Joe,

Several different opinions have been expressed by Matt Burgess, Mark Payne and 
Bryan Bende about whether we should be storing exception information in 
attributes, and the pros and cons, in this thread. Those opinions generally 
matched yours, which is that a well-defined relationship is the best approach. 
I don't disagree in any way with the consensus; I agree that the best solution 
is to use a well-defined relationship.

The pattern I see in the Processor list I provided below is that almost all of 
the processors work with external systems (outside of NiFi), and in many cases 
the number of distinct exception classes that can occur is low, but the variety 
of exceptions is high (JDBC/Stream Command). Matt did a good job of discussing 
this for JDBC type processors in his reply.

My users are desperate for these error details, especially on ExecuteSQL; and I 
won't lie, users are absolutely going to parse the exceptions and use 
RouteOnAttribute. And yep, it's going to be fragile and break sometimes. (I 
don't know that this will be the primary use case, as troubleshooting using 
this information will also be a major facet of its usefulness) The problem, 
especially with JDBC, is that I don't see a reasonable alternative. There are 
so many different JDBC drivers, and NiFi will only see a SQLException type with 
differing text. Even if we went down the route of putting exception parsers 
into the DBAdapters to provide per driver failure handling, all we've really 
done is move the fragileness into NiFi, where it's hard coded until the next 
release.

Thank you,
  Peter

-Original Message-
From: Joe Witt [mailto:joe.w...@gmail.com] 
Sent: Friday, November 9, 2018 12:23 PM
To: dev@nifi.apache.org
Subject: Re: [EXT] Re: New Standard Pattern - Put Exception that caused failure 
in an attribute

Peter,

I'm not clear on what you are asking be done precisely and how far it should be 
carried.

You're right there are many processors which store exception/log details as 
attributes on flowfiles before they route them (success,failure, etc..).  This 
is fine and can be documented with the WritesAttribute annotations to be 
helpful.

Where that model breaks down though and should never be used is when someone 
wants to use the text of that String to safely/reliably automate something.  If 
the 'failure' reason for a given situation is precisely knowable enough and 
could reasonably be valuable for routing it should be an explicit relationship. 
 Attributes for exceptions/log values are useful provided they are 'advisory' 
only meaning largely just intended for users/general awareness.  But not for 
automation or to define an explicit interface.

So, with the above said can you clarify what you are precisely requesting and 
for 'who' - who is the actor.

Thanks
On Fri, Nov 9, 2018 at 2:12 PM Peter Wicks (pwicks)  wrote:
>
> A one week bump on this thread. --Peter
>
> -Original Message-----
> From: Peter Wicks (pwicks)
> Sent: Friday, November 2, 2018 11:54 AM
> To: dev@nifi.apache.org
> Subject: RE: [EXT] Re: New Standard Pattern - Put Exception that 
> caused failure in an attribute
>
> Dev Team,
>
> I don’t think we've reached a conclusion on this discussion, but would like 
> to. I had not done enough research when I originally suggested this as a 
> "New Pattern". Having done a bit more research now, I'd say this is already a 
> well established pattern.
>
> Examples using this pattern already, with exception types/text written in 
> FlowFile Attributes:
>  - GenerateTableFetch (Matt added back in 2017) does this for incoming 
> FlowFiles that cause a SQL exception
>  - PutDatabaseRecord (Matt added also in 2017 with the original 
> version of the processor)
>  - ValidateCSV and ValidateXML put the validation cause as an 
> Attribute (maybe not exactly the same, but feels similar)
>  - InvokeHTTP, InvokeGRPC Exception class name and Exception message
>  - Couchbase Processors (Put/Get) provides the exception class name
>  - PutLambda (six different exception fields get written to 
> Attributes)
>  - Other AWS processors are similar in how they handle this, such as the 
> Dynamo processor.
>  - ExecuteStreamCommand provides the error message from a script execution as 
> an attribute.
>  - DeleteHDFS puts the error message as an Attribute
>  - ScanHBase puts scanning errors as an Attribute
>  - DeleteElasticSearch (both versions) put deletion failure messages 
> as an attribute
>  - InfluxDB processors do this also (influxdb.error.message)
>  - ConvertExcelToCSV tracks conversion errors in an attribute
>  - RethinkDB processors do this too
>
> Thanks,
>   Peter
>
> -Original Message-
> From: James Srinivasan [mailto:james.sriniva...@gmail.com]
> Sent: Tuesday, October 30, 2018 3:00 PM
> To: dev@nifi.apache.org
> Subject: 

RE: [EXT] Re: New Standard Pattern - Put Exception that caused failure in an attribute

2018-11-09 Thread Peter Wicks (pwicks)
A one week bump on this thread. --Peter

-Original Message-
From: Peter Wicks (pwicks) 
Sent: Friday, November 2, 2018 11:54 AM
To: dev@nifi.apache.org
Subject: RE: [EXT] Re: New Standard Pattern - Put Exception that caused failure 
in an attribute

Dev Team,

I don’t think we've reached a conclusion on this discussion, but would like 
to. I had not done enough research when I originally suggested this as a "New 
Pattern". Having done a bit more research now, I'd say this is already a well 
established pattern.

Examples using this pattern already, with exception types/text written in 
FlowFile Attributes:
 - GenerateTableFetch (Matt added back in 2017) does this for incoming 
FlowFiles that cause a SQL exception
 - PutDatabaseRecord (Matt added also in 2017 with the original version of the 
processor)
 - ValidateCSV and ValidateXML put the validation cause as an Attribute 
(maybe not exactly the same, but feels similar)
 - InvokeHTTP, InvokeGRPC Exception class name and Exception message
 - Couchbase Processors (Put/Get) provides the exception class name
 - PutLambda (six different exception fields get written to Attributes)
 - Other AWS processors are similar in how they handle this, such as the Dynamo 
processor.
 - ExecuteStreamCommand provides the error message from a script execution as 
an attribute.
 - DeleteHDFS puts the error message as an Attribute
 - ScanHBase puts scanning errors as an Attribute
 - DeleteElasticSearch (both versions) put deletion failure messages as an 
attribute
 - InfluxDB processors do this also (influxdb.error.message)
 - ConvertExcelToCSV tracks conversion errors in an attribute
 - RethinkDB processors do this too 

Thanks,
  Peter

-Original Message-
From: James Srinivasan [mailto:james.sriniva...@gmail.com]
Sent: Tuesday, October 30, 2018 3:00 PM
To: dev@nifi.apache.org
Subject: Re: [EXT] Re: New Standard Pattern - Put Exception that caused failure 
in an attribute

Apologies if I've missed this in the discussion so far - we use the InvokeHTTP 
processor a lot, and the invokehttp.java.exception.message attribute is really 
handy for diving into why things have failed without having to match up logs with 
flow files (from a system with hundreds of processors making thousands of 
requests). We also route on invokehttp.status.code (e.g. to retry 403s due to a 
race hazard in an external system) but I don't imagine we'd route on
invokehttp.java.exception.* since (as others have mentioned) it looks pretty 
fragile.

--
James
On Tue, 30 Oct 2018 at 16:44, Peter Wicks (pwicks)  wrote:
>
> Sorry for the delayed response, I've been traveling.
>
> Responses in order:
>
> Matt,
> Right now our workaround is to keep retrying errors, usually with a penalty 
> or control rate processor. The problem is that we don't know why it failed, 
> and thus don't know if retry is the correct option. I have not found a way, 
> without code change, to be able to determine if retrying is the correct 
> option or not.
>
> Koji,
> Detailed error handling would indeed be a good workaround to the problems 
> raised by myself and Matt. I have not see this on other processors, but that 
> does not mean we can't do it of course.  I agree that having some kind of 
> hierarchy system for errors would be a much better solution.
>
> Pierre,
> My primary use case is as you described, a user friendly way to see what 
> actually happened without going through the log files. But while I know 
> it's fragile, routing on exception text stored in an attribute still feels 
> like a very legitimate use case. I know in many systems there are good 
> exception types that can be used to route FlowFile's to appropriate failure 
> relationships, but as Matt mentioned, JDBC has just a handful of exception 
> types for a very large number of possible error types.
>
> I think this is probably the same rationale that was used to justify 
> ExecuteStreamCommand's inclusion of this feature in the past: too many 
> possible failure conditions to handle with just a few failure 
> relationships.
>
> Uwe,
> That is a fair question, but it doesn't feel like such a bad fit to me. It's 
> like extra metadata on the lineage, "We followed this path through the flow 
> because we had exception "  " which caused the FlowFile to follow the 
> failure route".
>
> But I still prefer the attribute, it could be another option for Detailed 
> error handling; instead of, or in addition to, additional relationships for 
> failures, the exception text could be included in an attribute.
>
> Thanks,
>   Peter
>
> -Original Message-
> From: u...@moosheimer.com [mailto:u...@moosheimer.com]
> Sent: Saturday, October 27, 2018 10:46 AM
> To: dev@nifi.apache.org
> Subject: Re: [EXT] Re: New Standard Pattern - Put Exception that 
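
The routing James describes can be sketched in plain Java. This is an
illustrative sketch only, not NiFi code; it shows why routing on a stable
numeric attribute like invokehttp.status.code is robust, while matching on
exception message text would break whenever the wording changes between
releases:

```java
import java.util.Map;

// Illustrative downstream routing decision on FlowFile attributes,
// expressed in plain Java. The attribute name comes from InvokeHTTP;
// the retry policy here is an example only.
public class RetryDecision {

    static boolean shouldRetry(Map<String, String> flowFileAttributes) {
        String code = flowFileAttributes.get("invokehttp.status.code");
        // Retry 403s caused by a race in the external system, plus any 5xx.
        return "403".equals(code) || (code != null && code.startsWith("5"));
    }

    public static void main(String[] args) {
        System.out.println(shouldRetry(Map.of("invokehttp.status.code", "403"))); // true
        System.out.println(shouldRetry(Map.of("invokehttp.status.code", "200"))); // false
    }
}
```

In a flow this same predicate would typically live in a RouteOnAttribute
processor's expression rather than in code.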

RE: [EXT] Re: New Standard Pattern - Put Exception that caused failure in an attribute

2018-11-02 Thread Peter Wicks (pwicks)
Dev Team,

I don’t think we've reached a conclusion on this discussion, but I would like 
to. I had not done enough research when I originally suggested this as a "New 
Pattern". Having done a bit more research now, I'd say this is already a 
well-established pattern.

Examples using this pattern already, with exception types/text written in 
FlowFile Attributes:
 - GenerateTableFetch (Matt added back in 2017) does this for incoming 
FlowFiles that cause a SQL exception
 - PutDatabaseRecord (Matt added also in 2017 with the original version of the 
processor)
 - ValidateCSV and ValidateXML put the validation cause as an Attribute 
(maybe not exactly the same, but feels similar)
 - InvokeHTTP, InvokeGRPC Exception class name and Exception message
 - Couchbase Processors (Put/Get) provides the exception class name
 - PutLambda (six different exception fields get written to Attributes)
 - Other AWS processors are similar in how they handle this, such as the Dynamo 
processor.
 - ExecuteStreamCommand provides the error message from a script execution as 
an attribute.
 - DeleteHDFS puts the error message as an Attribute
 - ScanHBase puts scanning errors as an Attribute
 - DeleteElasticSearch (both versions) put deletion failure messages as an 
attribute
 - InfluxDB processors do this also (influxdb.error.message)
 - ConvertExcelToCSV tracks conversion errors in an attribute
 - RethinkDB processors do this too 

Thanks,
  Peter

-Original Message-
From: James Srinivasan [mailto:james.sriniva...@gmail.com] 
Sent: Tuesday, October 30, 2018 3:00 PM
To: dev@nifi.apache.org
Subject: Re: [EXT] Re: New Standard Pattern - Put Exception that caused failure 
in an attribute

Apologies if I've missed this in the discussion so far - we use the InvokeHTTP 
processor a lot, and the invokehttp.java.exception.message attribute is really 
handy for diving into why things have failed without having to match up logs with 
flow files (from a system with hundreds of processors making thousands of 
requests). We also route on invokehttp.status.code (e.g. to retry 403s due to a 
race hazard in an external system) but I don't imagine we'd route on
invokehttp.java.exception.* since (as others have mentioned) it looks pretty 
fragile.

--
James
On Tue, 30 Oct 2018 at 16:44, Peter Wicks (pwicks)  wrote:
>
> Sorry for the delayed response, I've been traveling.
>
> Responses in order:
>
> Matt,
> Right now our workaround is to keep retrying errors, usually with a penalty 
> or control rate processor. The problem is that we don't know why it failed, 
> and thus don't know if retry is the correct option. I have not found a way, 
> without code change, to be able to determine if retrying is the correct 
> option or not.
>
> Koji,
> Detailed error handling would indeed be a good workaround to the problems 
> raised by myself and Matt. I have not seen this on other processors, but that 
> does not mean we can't do it of course.  I agree that having some kind of 
> hierarchy system for errors would be a much better solution.
>
> Pierre,
> My primary use case is as you described, a user friendly way to see what 
> actually happened without going through the log files. But while I know 
> it's fragile, routing on exception text stored in an attribute still feels 
> like a very legitimate use case. I know in many systems there are good 
> exception types that can be used to route FlowFile's to appropriate failure 
> relationships, but as Matt mentioned, JDBC has just a handful of exception 
> types for a very large number of possible error types.
>
> I think this is probably the same rationale that was used to justify 
> ExecuteStreamCommand's inclusion of this feature in the past: too many 
> possible failure conditions to handle with just a few failure 
> relationships.
>
> Uwe,
> That is a fair question, but it doesn't feel like such a bad fit to me. It's 
> like extra metadata on the lineage, "We followed this path through the flow 
> because we had exception "  " which caused the FlowFile to follow the 
> failure route".
>
> But I still prefer the attribute, it could be another option for Detailed 
> error handling; instead of, or in addition to, additional relationships for 
> failures, the exception text could be included in an attribute.
>
> Thanks,
>   Peter
>
> -Original Message-
> From: u...@moosheimer.com [mailto:u...@moosheimer.com]
> Sent: Saturday, October 27, 2018 10:46 AM
> To: dev@nifi.apache.org
> Subject: Re: [EXT] Re: New Standard Pattern - Put Exception that 
> caused failure in an attribute
>
> Do you really want to mix provenance and data lineage with logging/error 
> information?
>
> Writing exception information/logging information within an attribute is not 
> a bad idea in my opini

RE: [EXT] Re: New Standard Pattern - Put Exception that caused failure in an attribute

2018-10-30 Thread Peter Wicks (pwicks)
>>> 
>>> We could mitigate 1-2 with a tool that updates your flow/template by 
>>> sending all new failure relationships to the same target as the 
>>> existing one, but then the tool itself suffers from maintainability 
>>> issues (as does option #3). If we could recognize that the new 
>>> relationships are self-terminated and then send the errors out to 
>>> the original failure relationship, that could be quite confusing to 
>>> the user, especially as time goes on (how to suppress the "new" 
>>> errors, e.g.).
>>> 
>>> IMHO I think we're between a rock and a hard place here, I guess 
>>> with great entropy comes great responsibility :P
>>> 
>>> P.S. For your use case, is the workaround to just keep retrying? Or 
>>> are there other constraints at play?
>>> 
>>> Regards,
>>> Matt
>>> 
>>> On Thu, Oct 25, 2018 at 10:27 PM Peter Wicks (pwicks) 
>>> 
>> wrote:
>>>> 
>>>> Matt,
>>>> 
>>>> If I were to split an existing failure relationship into several
>> relationships, I do not think I would want to auto-terminate in most cases.
>> Specifically, I'm interested in a failure relationship for a database 
>> disconnect during SQL execution (database was online when the 
>> connection was verified in the DBCP pool, but went down during 
>> execution). If I were to find a way to separate this into its own 
>> relationship, I do not think most users would appreciate it being a 
>> condition silently not handled by the normal failure path.
>>>> 
>>>> Thanks,
>>>>  Peter
>>>> 
>>>> -Original Message-
>>>> From: Matt Burgess [mailto:mattyb...@apache.org]
>>>> Sent: Friday, October 26, 2018 10:18 AM
>>>> To: dev@nifi.apache.org
>>>> Subject: Re: [EXT] Re: New Standard Pattern - Put Exception that
>> caused failure in an attribute
>>>> 
>>>> NiFi (as of the last couple releases I think) has the ability to 
>>>> set
>> auto-terminating relationships; this IMO is one of those use cases 
>> (for NiFi 1.x). If new relationships are added, they could default to 
>> auto-terminate; then the existing processors should remain valid.
>>>> However we might want an "omnibus Jira" to capture those 
>>>> relationships
>> we'd like to remove the auto-termination from in NiFi 2.0.
>>>> 
>>>> Regards,
>>>> Matt
>>>> On Thu, Oct 25, 2018 at 10:12 PM Peter Wicks (pwicks) <
>> pwi...@micron.com> wrote:
>>>>> 
>>>>> Mark,
>>>>> 
>>>>> I agree with you that this is the best option in general terms.
>> After thinking about it some more I think the biggest use case is for 
>> troubleshooting. If a file routes to failure, you need to be watching 
>> the UI to see what the exception was. An admin may have access to the 
>> NiFi log files and could grep the error out, but a normal user who 
>> checks in on the flow and sees a FlowFile in the error queue will not 
>> know what the cause was; this is especially frustrating if retrying 
>> the file works without failure the second time... Capturing the error 
>> message in an attribute makes this easy to find.
>>>>> 
>>>>> One thing I worry about too is adding new relationships to core
>> processors. After an upgrade, won't users need to go to each instance 
>> of that processor and handle the new relationship? Right now I'd 
>> wager we have at least five thousand ExecuteSQL processors in our 
>> environment; and while we have strong scripting skills in my NiFi 
>> team, I would not want to encounter this without that.
>>>>> 
>>>>> Thanks,
>>>>>  Peter
>>>>> 
>>>>> -Original Message-
>>>>> From: Mark Payne [mailto:marka...@hotmail.com]
>>>>> Sent: Thursday, October 25, 2018 10:38 PM
>>>>> To: dev@nifi.apache.org
>>>>> Subject: [EXT] Re: New Standard Pattern - Put Exception that 
>>>>> caused failure in an attribute
>>>>> 
>>>>> I agree - the notion of adding a "failure.reason" attribute is, in
>> my opinion, an anti-pattern that should be avoided. Relationships are 
>> not a workaround but rather the preferred approach in this scenario - 
>> an attribute I would consider a workaround. This is due to the fact 
>> that not only is it brittle and comp

RE: [EXT] Re: New Standard Pattern - Put Exception that caused failure in an attribute

2018-10-25 Thread Peter Wicks (pwicks)
Matt,

If I were to split an existing failure relationship into several relationships, 
I do not think I would want to auto-terminate in most cases. Specifically, I'm 
interested in a failure relationship for a database disconnect during SQL 
execution (database was online when the connection was verified in the DBCP 
pool, but went down during execution). If I were to find a way to separate this 
into its own relationship, I do not think most users would appreciate it being 
a condition silently not handled by the normal failure path.

Thanks,
  Peter

-Original Message-
From: Matt Burgess [mailto:mattyb...@apache.org] 
Sent: Friday, October 26, 2018 10:18 AM
To: dev@nifi.apache.org
Subject: Re: [EXT] Re: New Standard Pattern - Put Exception that caused failure 
in an attribute

NiFi (as of the last couple releases I think) has the ability to set 
auto-terminating relationships; this IMO is one of those use cases (for NiFi 
1.x). If new relationships are added, they could default to auto-terminate; 
then the existing processors should remain valid.
However we might want an "omnibus Jira" to capture those relationships we'd 
like to remove the auto-termination from in NiFi 2.0.

Regards,
Matt
On Thu, Oct 25, 2018 at 10:12 PM Peter Wicks (pwicks)  wrote:
>
> Mark,
>
> I agree with you that this is the best option in general terms. After 
> thinking about it some more I think the biggest use case is for 
> troubleshooting. If a file routes to failure, you need to be watching the UI 
> to see what the exception was. An admin may have access to the NiFi log files 
> and could grep the error out, but a normal user who checks in on the flow and 
> sees a FlowFile in the error queue will not know what the cause was; this is 
> especially frustrating if retrying the file works without failure the second 
> time... Capturing the error message in an attribute makes this easy to find.
>
> One thing I worry about too is adding new relationships to core processors. 
> After an upgrade, won't users need to go to each instance of that processor 
> and handle the new relationship? Right now I'd wager we have at least five 
> thousand ExecuteSQL processors in our environment; and while we have strong 
> scripting skills in my NiFi team, I would not want to encounter this without 
> that.
>
> Thanks,
>   Peter
>
> -Original Message-
> From: Mark Payne [mailto:marka...@hotmail.com]
> Sent: Thursday, October 25, 2018 10:38 PM
> To: dev@nifi.apache.org
> Subject: [EXT] Re: New Standard Pattern - Put Exception that caused 
> failure in an attribute
>
> I agree - the notion of adding a "failure.reason" attribute is, in my 
> opinion, an anti-pattern that should be avoided. Relationships are not a 
> workaround but rather the preferred approach in this scenario - an attribute 
> I would consider a workaround. This is due to the fact that not only is it 
> brittle and complex to add processors that route on such things, but there's 
> no reason at all to assume that from release to release (even bug 
> fix/increment releases) the Exception type or message will be the same, 
> so the flow could stop working at any time after upgrading nifi.
> Relationships offer a well-defined way to explicitly indicate "these are the 
> possible outcomes,"
> similar IMO to Java Exception classes vs. throwing Strings in C.
>
>
> > On Oct 25, 2018, at 9:47 AM, Bryan Bende  wrote:
> >
> > I think processors should really have well defined relationships for 
> > the error scenarios that need to be handled. Having the exception 
> > message is ok for a human who wants to see it, but in order to do 
> > anything with it in the flow you will have to have a bunch of 
> > parsing/interpreting of the message with a bunch of routing 
> > processors, which seems more brittle than just having the 
> > appropriate relationships.
> > On Thu, Oct 25, 2018 at 1:36 AM Peter Wicks (pwicks)  
> > wrote:
> >>
> >> When a FlowFile is routed to failure, frequently there is no clear reason 
> >> without looking into the actual error message.
> >> Some processors work around this by creating many different relationships, 
> >> but even then frequently the generic Failure relationship also provides 
> >> little guidance.
> >>
> >> I've seen a few cases recently where processors are including the 
> >> exception message as an attribute on the FlowFile when routing to failure 
> >> (ExecuteStreamCommand, new PR for ExecuteSQL). Should this be a standard 
> >> pattern so that it's easier for users to route failures?
> >>
> >> --Peter
>


RE: [EXT] Re: New Standard Pattern - Put Exception that caused failure in an attribute

2018-10-25 Thread Peter Wicks (pwicks)
Mark,

I agree with you that this is the best option in general terms. After thinking 
about it some more I think the biggest use case is for troubleshooting. If a 
file routes to failure, you need to be watching the UI to see what the 
exception was. An admin may have access to the NiFi log files and could grep 
the error out, but a normal user who checks in on the flow and sees a FlowFile 
in the error queue will not know what the cause was; this is especially 
frustrating if retrying the file works without failure the second time... 
Capturing the error message in an attribute makes this easy to find.

One thing I worry about too is adding new relationships to core processors. 
After an upgrade, won't users need to go to each instance of that processor and 
handle the new relationship? Right now I'd wager we have at least five 
thousand ExecuteSQL processors in our environment; and while we have strong 
scripting skills in my NiFi team, I would not want to encounter this without 
that. 

Thanks,
  Peter

-Original Message-
From: Mark Payne [mailto:marka...@hotmail.com] 
Sent: Thursday, October 25, 2018 10:38 PM
To: dev@nifi.apache.org
Subject: [EXT] Re: New Standard Pattern - Put Exception that caused failure in 
an attribute

I agree - the notion of adding a "failure.reason" attribute is, in my opinion, 
an anti-pattern that should be avoided. Relationships are not a workaround but 
rather the preferred approach in this scenario - an attribute I would consider 
a workaround. This is due to the fact that not only is it brittle and complex 
to add processors that route on such things, but there's no reason at all to 
assume that from release to release (even bug fix/increment releases) the 
Exception type or message will be the same, so the flow could stop working at 
any time after upgrading nifi.
Relationships offer a well-defined way to explicitly indicate "these are the 
possible outcomes,"
similar IMO to Java Exception classes vs. throwing Strings in C.


> On Oct 25, 2018, at 9:47 AM, Bryan Bende  wrote:
> 
> I think processors should really have well defined relationships for 
> the error scenarios that need to be handled. Having the exception 
> message is ok for a human who wants to see it, but in order to do 
> anything with it in the flow you will have to have a bunch of 
> parsing/interpreting of the message with a bunch of routing 
> processors, which seems more brittle than just having the appropriate 
> relationships.
> On Thu, Oct 25, 2018 at 1:36 AM Peter Wicks (pwicks)  
> wrote:
>> 
>> When a FlowFile is routed to failure, frequently there is no clear reason 
>> without looking into the actual error message.
>> Some processors work around this by creating many different relationships, 
>> but even then frequently the generic Failure relationship also provides 
>> little guidance.
>> 
>> I've seen a few cases recently where processors are including the exception 
>> message as an attribute on the FlowFile when routing to failure 
>> (ExecuteStreamCommand, new PR for ExecuteSQL). Should this be a standard 
>> pattern so that it's easier for users to route failures?
>> 
>> --Peter
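
The relationship-based approach Mark and Bryan argue for can be sketched in
plain Java. This is not NiFi API code; the relationship names and the
exception-to-relationship mapping below are hypothetical, chosen to mirror
the database-disconnect scenario discussed in this thread:

```java
import java.io.IOException;
import java.sql.SQLTransientException;

// Plain-Java sketch of mapping failure causes to a small, well-defined
// set of relationship names, instead of exposing raw exception text for
// routing. Relationship names here are illustrative only.
public class FailureRouter {

    static String relationshipFor(Throwable t) {
        if (t instanceof SQLTransientException) {
            return "retry";           // e.g. database went down mid-execution
        }
        if (t instanceof IOException) {
            return "comms.failure";   // network/IO problems
        }
        return "failure";             // everything else
    }

    public static void main(String[] args) {
        System.out.println(relationshipFor(new SQLTransientException("link lost")));
        System.out.println(relationshipFor(new IllegalArgumentException("bad SQL")));
    }
}
```

The trade-off raised above still applies: each new mapping here would surface
as a new relationship that existing flows must handle after an upgrade.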



New Standard Pattern - Put Exception that caused failure in an attribute

2018-10-24 Thread Peter Wicks (pwicks)
When a FlowFile is routed to failure, frequently there is no clear reason 
without looking into the actual error message.
Some processors work around this by creating many different relationships, but 
even then frequently the generic Failure relationship also provides little 
guidance.

I've seen a few cases recently where processors are including the exception 
message as an attribute on the FlowFile when routing to failure 
(ExecuteStreamCommand, new PR for ExecuteSQL). Should this be a standard 
pattern so that it's easier for users to route failures?

--Peter


License of NLKBufferedReader

2018-10-16 Thread Peter Wicks (pwicks)
While digging into ReplaceText today, I noticed that 
org.apache.nifi.processors.standard.util.NLKBufferedReader is actually a direct 
copy of the JDK version of BufferedReader, with a few minor code edits made on 
top of it so that new line characters are kept. This code was in the original 
code commit from way back.

It feels like there is a licensing issue here, as this is Oracle source code. 
Joe?

--Peter



RE: [EXT] Calcite version used in Nifi 1.7+

2018-10-16 Thread Peter Wicks (pwicks)
In NiFi 1.7.0 it was Calcite v1.12.0. In NiFi 1.8 it will be Calcite v1.17.0

-Original Message-
From: ruurd.schoonh...@dbiq.nl [mailto:ruurd.schoonh...@dbiq.nl] 
Sent: Tuesday, October 16, 2018 6:58 AM
To: dev@nifi.apache.org
Subject: [EXT] Calcite version used in Nifi 1.7+

Hi guys,

Which version of calcite is used by QueryRecord processor in Nifi 1.7 or above?

I’m looking for CROSS APPLY suport in my query as introduced in Calcite 1.11 
(january 2017)

Or has anyone a solution to perform select on self joined dataset (recursive)

Kind regards
Ruurd Schoonheim


Future of HDF at Hortonworks/Cloudera

2018-10-08 Thread Peter Wicks (pwicks)
Normally, I wouldn’t ask a question like this on the NiFi Dev group. But I did 
a census and out of the top 10 committers to NiFi in the last year, 8 
definitely work for Hortonworks (couldn’t figure out where MikeThomsen works). 
In the subsequent top 10 committers, there are at least four more working for 
Hortonworks (mosermw, thenatog, and markobean I couldn’t figure out).

I love that Hortonworks has put so much people power into NiFi; no other 
company has more than a single developer contributing code in large quantity 
(that I can find, large quantity being top 20 contributors list for last year). 
But I do worry about what would happen to the whole project if, assuming the 
merger goes through, Cloudera decided a change of direction was in order, or 
something to that effect, and the whole project was affected negatively.

Thoughts from the community (and if you aren’t legally allowed to represent 
your company… probably best not to respond more than you're allowed )?

--Peter


RE: [EXT] XKCD use case for NiFi

2018-10-03 Thread Peter Wicks (pwicks)
I don't think NiFi will run on his phone, MiNiFi maybe? 

-Original Message-
From: Joe Gresock [mailto:jgres...@gmail.com] 
Sent: Wednesday, October 3, 2018 12:08 PM
To: dev@nifi.apache.org
Subject: [EXT] XKCD use case for NiFi

https://xkcd.com/2054/

He needs NiFi!


Looking for reviewer for Unit Test only PR (NiFi-5381)

2018-08-06 Thread Peter Wicks (pwicks)
I try really hard not to ask for PR reviews... but I have a PR that only adds 
unit tests for GetSFTP and PutSFTP that's been open for about a month. I have a 
sizeable number of other code changes ready that affect these processors and 
wanted to get better unit tests in place before moving forward with changes.

https://github.com/apache/nifi/pull/2846

This is part of a larger effort I'm working on to restructure and enhance SSH 
functionality in NiFi (new processors like SCP and ExecuteSSH, updated code, 
etc...).

Thanks!
  Peter


RE: [EXT] Re: Moving UI objects on a parent you don't have access to

2018-08-01 Thread Peter Wicks (pwicks)
Matt,

I don't think you misinterpreted the first time. A few more examples:

 - User has Read/Write permission to the process group but has only Read to 
children:
Current: User cannot move children
Proposed: User can move all children on GUI

 - User has Read only permissions to process group, but Read/Write permission 
to children:
Current: User can move children
Proposed: User cannot move children

Hopefully that clarifies things, and I believe that lines right up with your 
original read. I'm OK with saving this for a future release.

Ticket: https://issues.apache.org/jira/browse/NIFI-5477

-Original Message-
From: Matt Gilman [mailto:matt.c.gil...@gmail.com] 
Sent: Friday, July 27, 2018 7:46 PM
To: dev@nifi.apache.org
Subject: [EXT] Re: Moving UI objects on a parent you don't have access to

Peter,

I just re-read your note and I realize that I may have misinterpreted your 
question. I thought that you were asking to only enforce WRITE permissions on 
the parent group. If this was the case, my previously stated concerns apply. If 
we're looking to retain the component based checks and additionally introduce a 
check on the parent group then my concerns don't apply. We certainly have other 
endpoints that concern multiple components (like referencing controller 
services for instance) which require multiple checks. However, they always 
include the primary component as a basis for authorization. As long as we're 
retaining the primary component check as well, we should be ok to introduce 
this in a minor version release.

Matt

On Fri, Jul 27, 2018 at 5:49 PM, Matt Gilman 
wrote:

> Please file the JIRA. I'm definitely not opposed to this change 
> long-term, possibly in the next major release. I do have some concerns 
> about introducing it in the near term. NiFi employs a fine grain 
> authorization model where policies on each component drive access 
> decisions. These resources map to the REST API resources. We treat our 
> REST APIs and corresponding data models as public interfaces from a 
> compatibility perspective (unless called out as non-guaranteed). 
> Currently, clients can perform this action by changing the [x, y] 
> coordinates on the component, invoking the component's REST endpoint, 
> and being authorized to perform this action. The concerns I have are 
> regarding this backward compatibility and existing clients and whether 
> the update would leave the REST API and authorization scheme 
> understandable/consumable. For instance, requiring the client to know 
> that updating field A requires policy Y but updating field B requires policy 
> Z.
>
> Matt
>
>
> On Fri, Jul 27, 2018 at 3:11 PM, Andy LoPresto 
> wrote:
>
>> Peter,
>>
>> I vaguely recall the conversations around (similar, not exactly the 
>> same) permissions at the time this was implemented, and it was 
>> decided to allow this due to time constraints. I do not object to 
>> your proposal to change this (maybe Matt Gilman feels differently?). 
>> If you open a Jira, it should be doable.
>>
>> Andy LoPresto
>> alopre...@apache.org
>> *alopresto.apa...@gmail.com * PGP 
>> Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>>
>> On Jul 27, 2018, at 9:32 AM, Peter Wicks (pwicks) 
>> wrote:
>>
>> While experimenting with permissions, I found that if I have no 
>> permissions to a process group, but do have permissions to a child 
>> that lives in that group, I can move that child around on the UI.
>>
>> I know that in the object model the x,y position values are part of 
>> the child, which I have access to; but in this scenario it feels like 
>> I'm allowed to modify things in a group where I have no permissions. 
>> I propose that users can't move (x,y) objects if they do not have 
>> modify access to the parent group. Thoughts?
>>
>> --Peter
>>
>>
>>
>


Moving UI objects on a parent you don't have access to

2018-07-27 Thread Peter Wicks (pwicks)
While experimenting with permissions, I found that if I have no permissions to 
a process group, but do have permissions to a child that lives in that group, I 
can move that child around on the UI.

I know that in the object model the x,y position values are part of the child, 
which I have access to; but in this scenario it feels like I'm allowed to 
modify things in a group where I have no permissions. I propose that users 
can't move (x,y) objects if they do not have modify access to the parent group. 
Thoughts?

--Peter


RE: [EXT] Re: Dark mode

2018-07-13 Thread Peter Wicks (pwicks)
I did something similar back in 1.2.0 using CSS overrides and a chrome plugin : 
https://userstyles.org/styles/142978/dark-nifi-1-2-0.


-Original Message-
From: Brandon DeVries [mailto:b...@jhu.edu] 
Sent: Friday, July 13, 2018 9:42 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: Dark mode

I think there are a lot of people that would be a big +1 on this.  Maybe
even more so if it was abstracted so there could be multiple / custom
"themes" (e.g. dark, classic, high contrast / 508 compliant...).

Brandon

On Fri, Jul 13, 2018 at 7:29 AM Joe Witt  wrote:

> Rich
>
> This could certainly be an interesting option.  Perhaps file a JIRA
> with some details on where you're at in your PR and post that when
> ready.  Identify things you think might be remaining and such..
>
> Thanks
>
> On Fri, Jul 13, 2018 at 7:25 AM, Rich M  wrote:
> > Hi
> >
> > Going through some NiFi bits and pieces while I'm between projects;
> > I've got a half-baked dark mode kicking around, is there any interest
> > in me pushing this to a remote branch somewhere? There's some iframes
> > in dialogs missing styling, it probably wants someone more familiar
> > with Angular to look at it and a couple of places the styling isn't
> > applied to so it'd need a little work to even consider merging.
> >
> > https://i.imgur.com/lSofSq5.jpg
> >
> > Rich
>


RE: [EXT] RE: JIRA #NIFI-5327

2018-06-27 Thread Peter Wicks (pwicks)
Prashanth,

Woops, I had a fix for that test error months ago, but dropped it while pushing 
a PR and never put in a ticket for it... I've submitted a PR with the fix.

https://github.com/apache/nifi/pull/2819

--Peter

https://issues.apache.org/jira/browse/NIFI-4561?focusedCommentId=16291838=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16291838



-Original Message-
From: Prashanth Venkatesan [mailto:prashanth.181...@gmail.com] 
Sent: Wednesday, June 27, 2018 1:19 AM
To: dev@nifi.apache.org
Subject: [EXT] RE: JIRA #NIFI-5327

Hi Team,



I created JIRA NIFI-5327  (
https://issues.apache.org/jira/browse/NIFI-5327) and wish to contribute to it. 
I am facing a couple of issues:

   - I can’t assign this issue to myself.
   - After merging my code, when I run ` mvn -Pcontrib-check clean
   install ` on my Windows machine, I get the following error:

[INFO] Running org.wali.TestMinimalLockingWriteAheadLog

[ERROR] Tests run: 12, Failures: 0, Errors: 1, Skipped: 2, Time elapsed:
24.552 s <<< FAILURE! - in org.wali.TestMinimalLockingWriteAheadLog

[ERROR]
testRecoverFileThatHasTrailingNULBytesAndTruncation(org.wali.TestMinimalLockingWriteAheadLog)
Time elapsed: 0.057 s  <<< ERROR!

java.nio.channels.OverlappingFileLockException

at
org.wali.TestMinimalLockingWriteAheadLog.testRecoverFileThatHasTrailingNULBytesAndTruncation(TestMinimalLockingWriteAheadLog.java:503)



Thanks & Regards,

Prashanth


Preliminary work on SCP and SSH processors

2018-06-26 Thread Peter Wicks (pwicks)
I’ve been working on NIFI-539 and NIFI-3698, adding support for SSH based 
processors like SCP and an Execute SSH.

I’ve submitted a PR, but it definitely needs some work still. Would appreciate 
some feedback on the approaches. Unlike most of my code changes, this one 
should be easy for most people to try out (no MS SQL ).

https://github.com/apache/nifi/pull/2814



Copying small portions of code form other Apache Projects

2018-06-08 Thread Peter Wicks (pwicks)
I'm working on some changes and found that the exact code I need, along with a 
lot of code I can make derivatives from, are buried in the Apache Ant project.
I'm not going to include Ant as a dependency, just copy some code from it.

Should there be some annotation on the code/license entry/anything for this use?

Thanks,
  Peter


RE: [EXT] Re: Primary Only Content Migration

2018-06-07 Thread Peter Wicks (pwicks)
Joe,

I agree it is a lot of work, which is why I was thinking of starting with a 
processor that could do some of these operations before looking further. If the 
processor could move FlowFiles between nodes in the cluster it would be a good 
step. Data comes in from a queue on any node, but gets written out to a queue 
on only the desired node, or gets round-robin output for a distribute 
scenario.

I want to work on it, and was trying to figure out if it could be done using 
only a processor, or if larger changes would be needed for sure.

--Peter 

-Original Message-
From: Joe Witt [mailto:joe.w...@gmail.com] 
Sent: Thursday, June 7, 2018 3:34 PM
To: dev@nifi.apache.org
Subject: Re: [EXT] Re: Primary Only Content Migration

Peter,

It isn't a pattern that is well supported now in a cluster context.

What is needed are automatically load-balanced connections with partitioning.  
This would mean a user could select a given relationship and indicate that data 
should be automatically distributed, and they should be able to express, 
optionally, a correlation attribute that is used for ensuring data 
which belongs together stays together or is brought together.  We could use this 
to automatically have a connection result in data being distributed across the 
cluster for load-balancing purposes and also ensure that data is brought back 
to a single node whenever necessary, which is the case in certain scenarios like 
fork/distribute/process/join/send and things like distributed receipt then join 
for merging (like defragmenting data which has been split).  To join them 
together we need affinity/correlation, and this could work based on some sort of 
hashing mechanism where there are as many buckets as there are nodes in a 
cluster at a given time.  It needs a lot of thought/design/testing/etc..
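The hashing idea described here — as many buckets as there are nodes, with a correlation attribute deciding the bucket — can be illustrated in plain Java. This is only a sketch of the concept, not NiFi code; the class and method names below are hypothetical:

```java
import java.util.List;

// Assigns items to cluster nodes by hashing a correlation key, so that
// records sharing a key always land on the same node.
public class CorrelationPartitioner {
    private final List<String> nodes; // current cluster members

    public CorrelationPartitioner(List<String> nodes) {
        this.nodes = nodes;
    }

    // Math.floorMod keeps the bucket index non-negative even when
    // hashCode() is negative.
    public String nodeFor(String correlationKey) {
        int bucket = Math.floorMod(correlationKey.hashCode(), nodes.size());
        return nodes.get(bucket);
    }
}
```

With this scheme the same correlation key always maps to the same node, but adding or removing a node reshuffles most keys — one reason the feature needs the careful design mentioned above.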

I was just having a conversation about this yesterday.  It is definitely a 
thing and will be a major effort.  Will make a JIRA for this soon.

Thanks

On Thu, Jun 7, 2018 at 5:21 PM, Peter Wicks (pwicks)  wrote:
> Bryan,
>
> We see this with large files that we have split up into smaller files and 
> distributed across the cluster using site-to-site. We then want to merge them 
> back together, so we send them to the primary node before continuing 
> processing.
>
> --Peter
>
> -Original Message-
> From: Bryan Bende [mailto:bbe...@gmail.com]
> Sent: Thursday, June 7, 2018 12:47 PM
> To: dev@nifi.apache.org
> Subject: [EXT] Re: Primary Only Content Migration
>
> Peter,
>
> There really shouldn't be any non-source processors scheduled for primary 
> node only. We may even want to consider preventing that option when the 
> processor has an incoming connection to avoid creating any confusion.
>
> As long as you set source processors to primary node only then everything 
> should be ok... if primary node changes, the source processor starts 
> executing on the new primary node, and any flow files it already produced on 
> the old primary node will continue to be worked off by the downstream 
> processors on the old node until they are all processed.
>
> -Bryan
>
>
>
> On Thu, Jun 7, 2018 at 1:55 PM, Peter Wicks (pwicks)  
> wrote:
>> I'm sure many of you have the same situation, a flow that runs on a cluster, 
>> and at some point merges back down to a primary only processor; your files 
>> sit there in the queue with nowhere to go... We've used the workaround of 
>> having a remote process group that loops the data back to the primary node 
>> for a while, but would really like a clean/simple solution. This approach 
>> requires that users be able to put an input port on the root flow, and then 
>> route the file back down, which is a nuisance.
>>
>> I have been thinking of adding either a processor that moves data between 
>> specific nodes in a cluster, or a queue (?) option that will let users 
>> migrate the content of a FlowFile back to the primary node. This would allow 
>> you to move data back to the primary very easily without needing RPGs and 
>> input ports at the root level.
>>
>> All of my development work with NiFi has been focused on processors, so I'm 
>> not really sure where I would start with this.  Thoughts?
>>
>> Thanks,
>>   Peter


RE: [EXT] Re: Primary Only Content Migration

2018-06-07 Thread Peter Wicks (pwicks)
Bryan,

We see this with large files that we have split up into smaller files and 
distributed across the cluster using site-to-site. We then want to merge them 
back together, so we send them to the primary node before continuing processing.

--Peter

-Original Message-
From: Bryan Bende [mailto:bbe...@gmail.com] 
Sent: Thursday, June 7, 2018 12:47 PM
To: dev@nifi.apache.org
Subject: [EXT] Re: Primary Only Content Migration

Peter,

There really shouldn't be any non-source processors scheduled for primary node 
only. We may even want to consider preventing that option when the processor 
has an incoming connection to avoid creating any confusion.

As long as you set source processors to primary node only then everything 
should be ok... if primary node changes, the source processor starts executing 
on the new primary node, and any flow files it already produced on the old 
primary node will continue to be worked off by the downstream processors on the 
old node until they are all processed.

-Bryan



On Thu, Jun 7, 2018 at 1:55 PM, Peter Wicks (pwicks)  wrote:
> I'm sure many of you have the same situation, a flow that runs on a cluster, 
> and at some point merges back down to a primary only processor; your files 
> sit there in the queue with nowhere to go... We've used the workaround of 
> having a remote process group that loops the data back to the primary node 
> for a while, but would really like a clean/simple solution. This approach 
> requires that users be able to put an input port on the root flow, and then 
> route the file back down, which is a nuisance.
>
> I have been thinking of adding either a processor that moves data between 
> specific nodes in a cluster, or a queue (?) option that will let users 
> migrate the content of a FlowFile back to the primary node. This would allow 
> you to move data back to the primary very easily without needing RPGs and 
> input ports at the root level.
>
> All of my development work with NiFi has been focused on processors, so I'm 
> not really sure where I would start with this.  Thoughts?
>
> Thanks,
>   Peter


Primary Only Content Migration

2018-06-07 Thread Peter Wicks (pwicks)
I'm sure many of you have the same situation, a flow that runs on a cluster, 
and at some point merges back down to a primary only processor; your files sit 
there in the queue with nowhere to go... We've used the workaround of having a 
remote process group that loops the data back to the primary node for a 
while, but would really like a clean/simple solution. This approach requires 
that users be able to put an input port on the root flow, and then route the 
file back down, which is a nuisance.

I have been thinking of adding either a processor that moves data between 
specific nodes in a cluster, or a queue (?) option that will let users migrate 
the content of a FlowFile back to the primary node. This would allow you to move 
data back to the primary very easily without needing RPGs and input ports at the 
root level.

All of my development work with NiFi has been focused on processors, so I'm not 
really sure where I would start with this.  Thoughts?

Thanks,
  Peter


RE: [EXT] [ANNOUNCE] New Apache NiFi Committer Sivaprasanna Sethuraman

2018-06-05 Thread Peter Wicks (pwicks)
Congratulations Sivaprasanna!

-Original Message-
From: Tony Kurc [mailto:tk...@apache.org] 
Sent: Tuesday, June 05, 2018 08:09
To: dev@nifi.apache.org
Subject: [EXT] [ANNOUNCE] New Apache NiFi Committer Sivaprasanna Sethuraman

On behalf of the Apache NiFI PMC, I am very pleased to announce that 
Sivaprasanna has accepted the PMC's invitation to become a committer on the 
Apache NiFi project. We greatly appreciate all of Sivaprasanna's hard work and 
generous contributions to the project. We look forward to continued 
involvement in the project.

Sivaprasanna has been working with the community on the mailing lists, and has 
a big mix of code and feature contributions to include features and 
improvements to cloud service integrations like Azure, AWS, and Google Cloud.

Welcome and congratulations!


RE: [EXT] Re: ReplaceText Flow File Processing Count

2018-05-08 Thread Peter Wicks (pwicks)
https://github.com/apache/nifi/pull/2687


-Original Message-
From: Joe Witt [mailto:joe.w...@gmail.com] 
Sent: Friday, May 04, 2018 21:19
To: dev@nifi.apache.org
Subject: [EXT] Re: ReplaceText Flow File Processing Count

Bryan's guess on the history is probably right, but more to the point, with what 
we have available these days with the record processors and so on, I think we 
should just change it back to one.  Peter's statement on user expectation I 
agree with for sure.  Any chance you want to file that JIRA/PR, Peter?

On Fri, May 4, 2018 at 9:13 AM, Bryan Bende <bbe...@gmail.com> wrote:
> I don't know the history of this particular processor, but I think the 
> purpose of the session.get() with batches is similar to the concept of 
> @SupportsBatching. Basically both of them should have better 
> performance because you are handling multiple flow files in a single 
> session. The supports-batching concept is a bit more flexible as it is 
> configurable by the user, whereas this case is hard-coded into the 
> processor.
>
> I suppose if there is some reason why you need to process 1 flow file 
> at a time, you could set the back-pressure threshold to 1 on the queue 
> leading into ReplaceText.
>
> On Fri, May 4, 2018 at 3:50 AM, Peter Wicks (pwicks) <pwi...@micron.com> 
> wrote:
>> Had a user notice today that a ReplaceText processor, scheduled to run every 
>> 20 minutes, had processed all 14 files in queue at once. I looked at the 
>> code and see that ReplaceText does not do a standard session.get, but 
>> instead calls:
>>
>> final List<FlowFile> flowFiles = 
>> session.get(FlowFileFilters.newSizeBasedFilter(1, DataUnit.MB, 100));
>>
>> Was there a design reason behind this? To us it was just really confusing 
>> that we didn't have full control over how quickly FlowFiles move through 
>> this processor.
>>
>> Thanks,
>>   Peter


ReplaceText Flow File Processing Count

2018-05-04 Thread Peter Wicks (pwicks)
Had a user notice today that a ReplaceText processor, scheduled to run every 20 
minutes, had processed all 14 files in queue at once. I looked at the code and 
see that ReplaceText does not do a standard session.get, but instead calls:

final List<FlowFile> flowFiles = 
session.get(FlowFileFilters.newSizeBasedFilter(1, DataUnit.MB, 100));

Was there a design reason behind this? To us it was just really confusing that 
we didn't have full control over how quickly FlowFiles move through this 
processor.

Thanks,
  Peter
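The batching behavior can be approximated in plain Java. The sketch below (hypothetical names, not the actual NiFi implementation) keeps pulling queued items until roughly 1 MB or 100 items have been collected, whichever comes first — which is why 14 small files can all be swept into a single trigger:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Mimics the effect of session.get(FlowFileFilters.newSizeBasedFilter(1,
// DataUnit.MB, 100)): accept queued items (here represented only by their
// byte sizes) until ~1 MB of content or 100 items, whichever comes first.
public class SizeBasedBatcher {
    static final long MAX_BYTES = 1_000_000L;
    static final int MAX_COUNT = 100;

    public static List<Long> nextBatch(Queue<Long> queuedSizes) {
        List<Long> batch = new ArrayList<>();
        long total = 0;
        while (!queuedSizes.isEmpty() && batch.size() < MAX_COUNT && total < MAX_BYTES) {
            long size = queuedSizes.poll();
            batch.add(size);
            total += size;
        }
        return batch;
    }
}
```

For example, 14 files of ~10 KB each total well under 1 MB, so all 14 come back in one batch; setting back-pressure to 1 on the incoming queue, as suggested above, is the workaround for forcing one-at-a-time behavior.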


Should I postpone submitting PR's on seriously breaking changes until prep for 2.0?

2018-04-23 Thread Peter Wicks (pwicks)
I have a couple of ideas for changes that would cause widespread breakage (like 
moving the DatabaseAdapter selection into the DBCP Service).
From what I've seen, this level of backwards incompatibility should be 
postponed until a major version change?

Thanks,
Peter


RE: [EXT] Suggestion: Apache NiFi component enhancement

2018-04-12 Thread Peter Wicks (pwicks)
I think this is a good idea. But based on your example I think you would want 
to provide a primary Type along with a list of Alias types.
If NiFi starts and it can no longer find a processor by the Type name it had in 
the flow.xml, it can check the annotations/aliases to see if it's been renamed. 
This would allow for easy renames.

Example 1: NiFi can no longer find AzureDocumentDBProcessor. The developer renamed 
it to CosmosDBProcessor. In this case we don't really want the type to still 
say "DocumentDB"; that's just confusing. Also, we might not want the type 
named CosmosDBProcessor. So we make the Type be something nice, like "Azure 
Cosmos DB", then add Aliases for "AzureDocumentDBProcessor" and 
"CosmosDBProcessor".

Next year when Microsoft renames it "CelestialDB" we can rename the processor 
and add another alias.

Something like that?
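The lookup described here could be sketched as a small resolver — illustrative Java, not NiFi's implementation — that maps both the canonical type and any registered aliases back to the canonical name:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Resolves a component type stored in flow.xml: exact match first,
// then any registered alias (e.g. an old class name after a rename).
public class TypeResolver {
    private final Map<String, String> aliasToCanonical = new HashMap<>();

    public void register(String canonical, String... aliases) {
        // The canonical name resolves to itself; each alias points at it.
        aliasToCanonical.put(canonical, canonical);
        for (String alias : aliases) {
            aliasToCanonical.put(alias, canonical);
        }
    }

    public Optional<String> resolve(String storedType) {
        return Optional.ofNullable(aliasToCanonical.get(storedType));
    }
}
```

Registering `"Azure Cosmos DB"` with aliases `"AzureDocumentDBProcessor"` and `"CosmosDBProcessor"` would then let a flow saved under either old name resolve to the renamed component.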

-Original Message-
From: Sivaprasanna [mailto:sivaprasanna...@gmail.com] 
Sent: Wednesday, April 11, 2018 23:37
To: dev@nifi.apache.org
Subject: [EXT] Suggestion: Apache NiFi component enhancement

All,

Currently the "type" of a component is actually the component's canonical class 
name which gets rendered in the UI as the class name with the component 
version. This is good. However I'm thinking it is better to have an annotation 
which a developer can use to override the component type.

How is it used?
I think an annotation can be sufficient. The framework checks whether the 
annotation is present; if it is, it uses the name provided there, otherwise it 
uses the class name as it does today.

Why and where is it needed?

   - In scenarios where we devise a new naming convention and want to apply
   it to older components without breaking backward compatibility
   - A developer had created a component class with a name but later down
   the line, the developer or someone else wants to change it to something
   else, the reason could again be naming convention or just that the new name
   makes more sense
   - A component that has been built to work with third-party tech, like
   Azure, MongoDB, S3, or Druid processors, but later versions of that tech
   have been renamed by the original creators. (Something
   similar happened to Azure's DocumentDB, which was later rebranded as
   Azure Cosmos DB.) In such cases, without deprecating or rebuilding a new
   processor, this can be used.

Before creating a JIRA, I wanted to get the community's thoughts. Feel free to 
share your thoughts, concerns. If everything seems fine, I'll start working on 
the implementation.

-

Sivaprasanna


RE: (NiFi 1.6) Funnels with no outgoing relationship filling my app log

2018-04-12 Thread Peter Wicks (pwicks)
https://issues.apache.org/jira/browse/NIFI-5075

From: Peter Wicks (pwicks)
Sent: Thursday, April 12, 2018 14:30
To: 'dev@nifi.apache.org' <dev@nifi.apache.org>
Subject: (NiFi 1.6) Funnels with no outgoing relationship filling my app log

I just upgraded one of my servers to NiFi 1.6.0. I have a couple of funnels 
that just dead-end; FlowFiles come in but never go anywhere after that. 
Mostly I use this for troubleshooting/validation. The inbound relationships all 
have expiration times, and it's a quick way for me to inspect the output of a 
processor on the fly.

These funnels are filling up my logs with errors that they can't output to 
Relationship '' (see error log below). If I attach the Funnel to another 
downstream processor then everything is fine. I went back and tested on my 
1.5.0 server and did not see the errors.

I briefly looked through the code, but the bug didn't jump out at me.

2018-04-11 23:53:28,066 ERROR [Timer-Driven Process Thread-31] o.apache.nifi.controller.StandardFunnel StandardFunnel[id=b868231c-0162-1000-571c-ae3e7d15d848] StandardFunnel[id=b868231c-0162-1000-571c-ae3e7d15d848] failed to process session due to java.lang.RuntimeException: java.lang.IllegalArgumentException: Relationship '' is not known; Processor Administratively Yielded for 1 sec: java.lang.RuntimeException: java.lang.IllegalArgumentException: Relationship '' is not known
java.lang.RuntimeException: java.lang.IllegalArgumentException: Relationship '' is not known
    at org.apache.nifi.controller.StandardFunnel.onTrigger(StandardFunnel.java:365)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Relationship '' is not known
    at org.apache.nifi.controller.repository.StandardProcessSession.transfer(StandardProcessSession.java:1935)
    at org.apache.nifi.controller.StandardFunnel.onTrigger(StandardFunnel.java:379)
    at org.apache.nifi.controller.StandardFunnel.onTrigger(StandardFunnel.java:358)
    ... 9 common frames omitted


Thanks,
  Peter
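The root cause is easier to see in miniature. The sketch below (illustrative plain Java, not the actual StandardProcessSession code) shows the kind of validation that produces the `Relationship '' is not known` error when a component transfers to a relationship that was never registered:

```java
import java.util.Set;

// Minimal illustration of a session that validates the transfer target
// against the component's registered relationships before routing.
public class SessionSketch {
    private final Set<String> knownRelationships;

    public SessionSketch(Set<String> knownRelationships) {
        this.knownRelationships = knownRelationships;
    }

    public void transfer(String flowFileId, String relationship) {
        if (!knownRelationships.contains(relationship)) {
            // Mirrors the message seen in the log above.
            throw new IllegalArgumentException(
                "Relationship '" + relationship + "' is not known");
        }
        // ...route flowFileId onward to the matching connection...
    }
}
```

A funnel with no outgoing connection apparently ends up transferring to an unnamed relationship, tripping exactly this kind of check.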


(NiFi 1.6) Funnels with no outgoing relationship filling my app log

2018-04-12 Thread Peter Wicks (pwicks)
I just upgraded one of my servers to NiFi 1.6.0. I have a couple of funnels 
that just dead-end; FlowFiles come in but never go anywhere after that. 
Mostly I use this for troubleshooting/validation. The inbound relationships all 
have expiration times, and it's a quick way for me to inspect the output of a 
processor on the fly.

These funnels are filling up my logs with errors that they can't output to 
Relationship '' (see error log below). If I attach the Funnel to another 
downstream processor then everything is fine. I went back and tested on my 
1.5.0 server and did not see the errors.

I briefly looked through the code, but the bug didn't jump out at me.

2018-04-11 23:53:28,066 ERROR [Timer-Driven Process Thread-31] o.apache.nifi.controller.StandardFunnel StandardFunnel[id=b868231c-0162-1000-571c-ae3e7d15d848] StandardFunnel[id=b868231c-0162-1000-571c-ae3e7d15d848] failed to process session due to java.lang.RuntimeException: java.lang.IllegalArgumentException: Relationship '' is not known; Processor Administratively Yielded for 1 sec: java.lang.RuntimeException: java.lang.IllegalArgumentException: Relationship '' is not known
java.lang.RuntimeException: java.lang.IllegalArgumentException: Relationship '' is not known
    at org.apache.nifi.controller.StandardFunnel.onTrigger(StandardFunnel.java:365)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Relationship '' is not known
    at org.apache.nifi.controller.repository.StandardProcessSession.transfer(StandardProcessSession.java:1935)
    at org.apache.nifi.controller.StandardFunnel.onTrigger(StandardFunnel.java:379)
    at org.apache.nifi.controller.StandardFunnel.onTrigger(StandardFunnel.java:358)
    ... 9 common frames omitted


Thanks,
  Peter


RE: [EXT] Re: NiFi Versioned Process Group Status Icons

2018-03-25 Thread Peter Wicks (pwicks)
Matt,

If you really want to drive awareness of this feature, enable the "Version" 
menu option even if no NiFi Registry has been configured, and when the user 
clicks "Start version control", take them to the help section for starting up a 
version control server and registering it.

The existing "Up to date" and "Stale" status icons that show in the upper-left 
corner are great. They only show up if the Process Group is versioned, and they 
aren't distracting you from the status of your flow in general; it's a simple 
single icon. But the five new status icons at the bottom feel very busy and 
distracting, even when I am actively using NiFi Registry.

I am excited about NiFi Registry. I am just starting to use it and I think it's 
going to solve a lot of issues.

Thanks,
  Peter

-Original Message-
From: Matt Gilman [mailto:matt.c.gil...@gmail.com] 
Sent: Saturday, March 24, 2018 04:18
To: dev@nifi.apache.org
Subject: [EXT] Re: NiFi Versioned Process Group Status Icons

Peter,

The status icon for a specific Process Group is hidden until that group is 
versioned. That icon is positioned next to the group name.

For the icons in the status bar and in the bottom of the Process Group, we were 
ultimately just trying to remain consistent. These reflect the counts of the 
encapsulated versioned Process Groups and do not include the group itself, 
just as we have with the status icons for the encapsulated Processors and Ports. 
Even if this Process Group is not configured to have any encapsulated 
components, we still render the counts as zero.

Additionally, the presence of the icons helps drive awareness of the feature.

Rob or Drew may have some additional insight to add. Hope this helps

Matt

On Fri, Mar 23, 2018 at 12:10 AM, Peter Wicks (pwicks) <pwi...@micron.com>
wrote:

> Why does NiFi show status icons for Versioned Process Group's on 
> servers that are not configured to connect to a NiFi Registry?
>
> Thanks,
>   Peter
>


NiFi Versioned Process Group Status Icons

2018-03-22 Thread Peter Wicks (pwicks)
Why does NiFi show status icons for Versioned Process Group's on servers that 
are not configured to connect to a NiFi Registry?

Thanks,
  Peter


RE: [EXT] Re: Help with frontend-maven-plugin?

2018-03-20 Thread Peter Wicks (pwicks)
Scott, thanks for the response.

As you can see in the log/error information I included, it is detecting my 
proxy settings, and the proxy information it shows (I removed parts of the 
hostname in my message) is correct.

> [INFO] --- frontend-maven-plugin:1.1:npm (npm install) @ nifi-web-ui
> [INFO] Found proxies: [m-proxy{protocol='http', host='proxy..com', port=8080}, 
> m-proxy-https{protocol='https', host='proxy..com', port=8080}]
> [INFO] Running 'npm --cache-min Infinity install 
> --https-proxy=http://proxy..com:8080 --proxy=http://proxy..com:8080' in 
> C:\\nifi\nifi-nar-bundles\nifi-framework-bundle\nifi-framework\nifi-web\nifi-web-ui\target\frontend-working-directory
> [INFO]

I also already tried changing NPM versions:

> I tried just updating the version numbers for the components involved, and the 
> build does complete, but I'm left with a copy of NiFi Web UI that doesn't 
> work after deployment.

Since you have had success with running newer versions with the NiFi Registry 
build, I copied those version numbers over into NiFi Web UI's pom.xml. When I 
do this the build succeeds, but the Web UI will not load all the way in Chrome. 
In the JavaScript console I see several errors:


Navigated to http://localhost:8080/nifi/
localhost/:61 GET http://localhost:8080/nifi/assets/angular/angular.min.js 
net::ERR_ABORTED
localhost/:62 GET 
http://localhost:8080/nifi/assets/angular-messages/angular-messages.min.js 
net::ERR_ABORTED
localhost/:65 GET 
http://localhost:8080/nifi/assets/angular-aria/angular-aria.min.js 
net::ERR_ABORTED
localhost/:66 GET 
http://localhost:8080/nifi/assets/angular-animate/angular-animate.min.js 
net::ERR_ABORTED
(index):61 GET http://localhost:8080/nifi/assets/angular/angular.min.js 
net::ERR_ABORTED
(index):62 GET 
http://localhost:8080/nifi/assets/angular-messages/angular-messages.min.js 
net::ERR_ABORTED
angular-resource.js:8 Uncaught TypeError: Cannot read property '$$minErr' of 
undefined at angular-resource.js:8 at angular-resource.js:6
angular-route.js:24 Uncaught TypeError: Cannot read property 'module' of 
undefined at angular-route.js:24 at angular-route.js:6
(index):65 GET 
http://localhost:8080/nifi/assets/angular-aria/angular-aria.min.js 
net::ERR_ABORTED
(index):66 GET 
http://localhost:8080/nifi/assets/angular-animate/angular-animate.min.js 
net::ERR_ABORTED
angular-material.min.js:7 Uncaught TypeError: Cannot read property 'module' of 
undefined at angular-material.min.js:7 at angular-material.min.js:7 at 
angular-material.min.js:15
jquery.min.js:2 jQuery.Deferred exception: Cannot read property 'module' of 
undefined TypeError: Cannot read property 'module' of undefined at 
HTMLDocument. 
(http://localhost:8080/nifi/js/nf/canvas/nf-canvas-all.js?1.6.0-SNAPSHOT:77:5364)
 at j (http://localhost:8080/nifi/assets/jquery/dist/jquery.min.js:2:29948) at 
k (http://localhost:8080/nifi/assets/jquery/dist/jquery.min.js:2:30262) 
undefined
jquery.min.js:2 Uncaught TypeError: Cannot read property 'module' of undefined 
at HTMLDocument. (nf-canvas-all.js?1.6.0-SNAPSHOT:77) at j 
(jquery.min.js:2) at k (jquery.min.js:2)



This is why I was hoping someone with more UI development experience could try 
updating the version numbers and see if they can work out the issues.

Thanks!
  Peter



-Original Message-
From: Scott Aslan [mailto:scottyas...@gmail.com]
Sent: Tuesday, March 20, 2018 23:11
To: dev@nifi.apache.org
Subject: [EXT] Re: Help with frontend-maven-plugin?



Hey there Peter,



I am not sure that the frontend-maven-plugin is out of date or old (the last 
commit was November 2017). NiFi is running version 1.1 of this plugin and NiFi 
Registry is running version 1.5. The frontend-maven-plugin downloads versions 
of Node and npm from https://nodejs.org/dist, extracts them and puts them into 
a node folder created in your installation directory. You can change the 
version of this maven plugin here 
<https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-ui/pom.xml#L362>.

Node will only be "installed" locally to your project. It will not be installed 
globally on the whole system (and it will not interfere with any Node 
installations already present). Node then installs npm. If the issue you are 
encountering is with the version of npm, then you can change the version of 
node or npm in the pom.xml here 
<https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-ui/pom.xml#L27>

. If you have configured proxy settings for Maven 
<http://maven.apache.org/guides/mini/guide-proxies.html> in your settings.xml 
file, the frontend-maven-plugin will automatically use the proxy for 
downloading node and npm, as well as passing the proxy to npm commands 
<https://docs.npmjs.com/misc/config#proxy>.
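For reference, a minimal proxy entry in Maven's `settings.xml` looks roughly like this (the id, host, and port are placeholders — adjust to your environment):

```xml
<settings>
  <proxies>
    <proxy>
      <id>corp-proxy</id>
      <active>true</active>
      <protocol>http</protocol>
      <host>proxy.example.com</host>
      <port>8080</port>
    </proxy>
  </proxies>
</settings>
```

With an active proxy configured here, the frontend-maven-plugin passes it through to the Node/npm download and to the npm commands, as described above.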



-Scott



On Tue, Mar 20, 2018 at 5:02 AM, Pete

Help with frontend-maven-plugin?

2018-03-20 Thread Peter Wicks (pwicks)
The version of "frontend-maven-plugin" used in NiFi is out of date, and the 
versions of NPM and Node referenced by it are also a bit old.

I wouldn't bring this up, except there is a bug in this version of NPM that can 
cause issues when trying to download NPM packages via proxy server.

This is the error I'm seeing:

[INFO] --- frontend-maven-plugin:1.1:npm (npm install) @ nifi-web-ui ---
[INFO] Found proxies: [m-proxy{protocol='http', host='proxy..com', port=8080}, 
m-proxy-https{protocol='https', host='proxy..com', port=8080}]
[INFO] Running 'npm --cache-min Infinity install 
--https-proxy=http://proxy..com:8080 --proxy=http://proxy..com:8080' in 
C:\\nifi\nifi-nar-bundles\nifi-framework-bundle\nifi-framework\nifi-web\nifi-web-ui\target\frontend-working-directory
[INFO]

...

[ERROR] npm http GET https://registry.npmjs.org/d3/4.13.0
[ERROR] npm ERR! TypeError: Request path contains unescaped characters.
[ERROR] npm ERR! at new ClientRequest (_http_client.js:53:11)
[ERROR] npm ERR! at TunnelingAgent.exports.request (http.js:31:10)



I've been successful building if I leave the proxy behind, but I can only do 
that outside of work... which makes it hard.  I tried just updating the version 
numbers for the components involved, and the build does complete, but I'm left 
with a copy of NiFi Web UI that doesn't work after deployment.

Thoughts? It would be great if one of the UI developers who's more familiar 
with NPM/Node could look at this.

Thanks,
  Peter


RE: [EXT] Re: Embedded Nifi

2018-01-23 Thread Peter Wicks (pwicks)
Vincent,

Embedded NiFi still has a long way to go to be really useful, in my opinion, 
and I don't know if anyone is actively working on those improvements.

The PR Andy mentioned simply allows you to start NiFi inside your process 
instead of running it directly from a startup script, but that doesn't mean you 
magically have access to all of NiFi's internals (someone can correct me if I'm 
wrong). If you want to actually interact with your new NiFi instance, you will 
still need to use the REST API.

Thanks,
  Peter

-Original Message-
From: Vincent Russell [mailto:vincent.russ...@gmail.com] 
Sent: Tuesday, January 23, 2018 03:07
To: dev@nifi.apache.org
Subject: [EXT] Re: Embedded Nifi

Thanks Andy,

This looks like a great first step. It would be nice to have a builder pattern 
and the ability to download the "executable" from a Nexus repository or the local 
filesystem like embedded Elasticsearch, but perhaps that might be better in some 
third-party library.

https://github.com/allegro/embedded-elasticsearch

-Vincent

On Mon, Jan 22, 2018 at 1:37 PM, Andy LoPresto  wrote:

> Vincent,
>
> I plan to merge this pull request [1] for NIFI-4424 [2] by Peter 
> Horvath today. Does this satisfy your requirements?
>
> [1] https://github.com/apache/nifi/pull/2251
> [2] https://issues.apache.org/jira/browse/NIFI-4424
>
> Andy LoPresto
> alopre...@apache.org
> *alopresto.apa...@gmail.com * PGP 
> Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On Jan 21, 2018, at 7:35 AM, Vincent Russell 
> 
> wrote:
>
> Devs,
>
> Does an embedded NiFi exist that would start a NiFi instance with a provided
> workflow?   I am aware of the Mock framework, but I am looking for
> something for integration tests.
>
> Thanks,
> Vincent
>
>
>


RE: [EXT] Re: [DISCUSS] CI / Travis / Jenkins

2017-12-07 Thread Peter Wicks (pwicks)
Best news I've heard in a long time :)
Thanks for all the hard work!

-Original Message-
From: af...@fucs.org [mailto:af...@fucs.org] On Behalf Of Andre
Sent: Thursday, December 07, 2017 12:15
To: dev@nifi.apache.org
Subject: [EXT] Re: [DISCUSS] CI / Travis / Jenkins

Joe,

Thanks for that!

Yes, you are correct. The builds occurred twice as the parallel build was 
broken. :-)

Since we can now build and run tests in parallel with contrib-check (YEAY), 
the dual build is no longer necessary.

THANK YOU for chasing this down. While Travis is one of the drivers, all the 
developers will benefit from being able to run contrib-check in parallel moving 
forward.

Cheers

On Thu, Dec 7, 2017 at 1:38 PM, Joe Witt  wrote:

> Team,
>
> Ok so finally some really solid news to share on the Travis-CI front.
> First, huge thanks to Aldrin for getting this started and folks like 
> Andre and Pierre who have tweaked it to make it more usable as well.
> After a long run of it helping us out as we all know it went poorly 
> with every build failing for what seemed like months.
>
> After some improvements and updates to our usage of maven which now 
> means parallel builds with contrib check seem to be working and after 
> going ruthless mode on hunting down unstable tests and either fixing 
> them or making them integration-tests the build is far more stable.
> We all need to try and stay on top of that.  Today, though, I realized 
> that our builds were happening twice, and that appeared to be why it 
> took roughly 50 minutes to finish, at best, and we'd time out and fail.
> So after adjusting our travis.yml we now only build once and the 
> process takes about 25 mins, so we're well within the limit.
>
> Latest build on travis-ci: https://travis-ci.org/apache/nifi/builds/312629807
> Appveyor builds:
> https://ci.appveyor.com/project/ApacheSoftwareFoundation/nifi/build/1.0.0-SNAPSHOT-6649
>
> So we're heading in the right direction.  If it stays stable perhaps 
> we could add openjdk builds as well.
>
> Thanks
> Joe
>
> On Tue, Dec 5, 2017 at 4:11 PM, Joe Witt  wrote:
> > OK well things are looking pretty good.  The only obvious problem 
> > now is that our builds take about 45-50 mins on travis-ci.org and 
> > the build time limit is 50 mins [1] so some jobs get killed.
> >
> > Will look at areas we can avoid spending build time on at least in 
> > travis-ci land.  Probably no great option but let's see.
> >
> > [1] 
> > https://docs.travis-ci.com/user/customizing-the-build#Build-Timeouts
> >
> > On Tue, Dec 5, 2017 at 2:56 PM, Joe Witt  wrote:
> >> Will try it out for PR https://github.com/apache/nifi/pull/2319 
> >> which is being built under
> >> https://travis-ci.org/apache/nifi/builds/312043710
> >>
> >> On Tue, Dec 5, 2017 at 2:51 PM, Joe Witt  wrote:
> >>> Andre
> >>>
> >>> Thanks - read through 
> >>> https://issues.apache.org/jira/browse/NIFI-1657
> >>> where this was discussed and where the relevant multi-env commit 
> >>> came in.
> >>>
> >>> Seems like five environments may be too taxing based on the build 
> >>> failures I'm observing.  I'll cut it down to three FR JP US For 
> >>> now.  We can evaluate if that helps at all and add more back if 
> >>> things become stable.
> >>>
> >>> Thanks
> >>> Joe
> >>>
> >>> On Tue, Dec 5, 2017 at 12:20 AM, Andre  wrote:
>  Joe,
> 
>  Glad to help! Few notes:
> 
>  If I recall correctly there was a reason we chose to add default and BR, 
>  but to be honest I can't really remember what it was. I think it has to 
>  do with Time Zones + Locale issues and has helped detect bizarre issues 
>  on time-based junits (Matt B and Pierre may remember this).
> 
>  Regarding the rat check: the idea behind that was a fast failure in case 
>  of basic style violations, rather than waiting until the end of the 
>  compilation. To be honest I don't know if this has worked as desired, 
>  but it should allow us to quickly identify validation errors which, if I 
>  recall correctly, were only detected at the end of contrib-check.
> 
>  And apologies for the anecdotal comments. I am away from my dev 
>  environment atm so I can't truly validate them.
> 
> 
>  Kind regards
> 
> 
>  On Tue, Dec 5, 2017 at 3:31 PM, Joe Witt  wrote:
> 
> > Great news!  So for the first time in a long time we now have 
> > travis-ci builds passing!
> >
> > I incorporated Dustin's PR which changed to the -Ddir-only 
> > instead of -P, added Andre's idea of dropping the -quiet flag, 
> > and dropped the number of builds in the config to a single 
> > parallel build with
> contrib
> > check now that we're seeing those pass with rat/checkstyle.
> >
> > https://travis-ci.org/apache/nifi/builds/311660398
> >
> 

CIFS/SMB Shares from Linux

2017-11-26 Thread Peter Wicks (pwicks)
We have a number of CIFS network shares that I've been denied permission to 
mount on my NiFi Linux server.
In the past I've accessed similar shares by running NiFi on a Windows server 
and then pushing the files using Site-to-Site, but I'd rather avoid that if I 
can. In the past those Windows servers were already available, but I don't want 
to setup new servers just to perform this duty.

The Hortonworks article I found online did it using two servers also: 
https://community.hortonworks.com/articles/26089/windows-share-nifi-hdfs-a-practical-guide.htm

I looked into adding CIFS support to the *File processors, but the only library 
I could find was jCIFS, which is licensed as LGPL... Apache Camel works around 
this licensing limitation by pointing users to the "Camel Extra" project, which 
is not officially an Apache project and is not distributed as part of Camel: 
http://camel.apache.org/jcifs.html.



RE: [EXT] Re: Contrib Check Build - RAT and Unit Test Failure

2017-11-22 Thread Peter Wicks (pwicks)
Joe,

I figured it out. At least on my box, there appears to be contention between 
two unit tests that are using the same folder to test the 
MinimalLockingWriteAheadLog. Folder name: 
target/testRecoverFileThatHasTrailingNULBytes.

Unit tests:
testRecoverFileThatHasTrailingNULBytesNoTruncation
testRecoverFileThatHasTrailingNULBytes

If I switch one of them to use a different folder name then the file is not 
locked when the second test runs and it's fine. I'll submit a PR.
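For reference, the failure mode being fixed here is JVM-local: `FileChannel.lock()` tracks held locks per file across the whole JVM, so a second `lock()` on the same path throws `OverlappingFileLockException` even from a different channel. A minimal standalone sketch reproducing that behavior (hypothetical class name, not the NiFi test code):

```java
import java.io.File;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.StandardOpenOption;

public class OverlapLockDemo {
    // Returns true when a second lock attempt on the same file, from the same
    // JVM, fails -- the contention pattern between the two unit tests above.
    static boolean secondLockFails() throws Exception {
        File f = File.createTempFile("wali-demo", ".lock");
        f.deleteOnExit();
        try (FileChannel c1 = FileChannel.open(f.toPath(), StandardOpenOption.WRITE);
             FileChannel c2 = FileChannel.open(f.toPath(), StandardOpenOption.WRITE)) {
            try (FileLock held = c1.lock()) {
                c2.lock(); // overlapping region, same JVM -> throws
                return false;
            } catch (OverlappingFileLockException expected) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(secondLockFails()); // prints "true"
    }
}
```

Giving each test its own folder, as described above, avoids ever reaching the overlapping `lock()` call.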

--Peter

-Original Message-
From: Joe Witt [mailto:joe.w...@gmail.com] 
Sent: Thursday, November 23, 2017 10:48
To: dev@nifi.apache.org
Subject: Re: [EXT] Re: Contrib Check Build - RAT and Unit Test Failure

Ok.  You might want to check if there is already a JIRA for that.  The tests on 
Windows are notoriously bad.  Many conditional ignores out there for Windows 
runs of tests.  That said, this may well be an actual windows problem and the 
test is possibly flagging it.

Thanks

On Wed, Nov 22, 2017 at 9:34 PM, Peter Wicks (pwicks) <pwi...@micron.com> wrote:
> Joe,
>
> This did resolve the RAT issue, but the unit test I mentioned still fails 
> with the same error.
>
> --Peter
>
> -Original Message-
> From: Joe Witt [mailto:joe.w...@gmail.com]
> Sent: Thursday, November 23, 2017 10:22
> To: dev@nifi.apache.org
> Subject: [EXT] Re: Contrib Check Build - RAT and Unit Test Failure
>
> Peter
>
I don't believe RAT works in parallel builds (-T2 for example).
>
> If I use RAT it is during a full clean build and activated via the 
> 'contrib-check' profile.  'mvn clean install -Pcontrib-check'
>
> Thanks
>
> On Wed, Nov 22, 2017 at 9:19 PM, Peter Wicks (pwicks) <pwi...@micron.com> 
> wrote:
>> I'm trying to successfully run Contrib Check on my dev box.
>>
>> Windows 10 x64
>> jdk1.8.0_91
>> MVN 3.3.9
>>
>> I'm using IntelliJ, so my Run looks like:
>>
>> Command line: -T2 -Drat.skip=true clean install
>> Profile: contrib-check
>>
>> If I don't disable RAT (-Drat.skip=true) then I get too many RAT failures 
>> for the build to run, and the files involved aren't the ones that have 
>> changed and that I'm testing.
>> Then, with RAT disabled I'm getting at least one Unit Test failure, with a 
>> test that has been in place since April: 
>> https://github.com/apache/nifi/commit/0f2ac39f69c1a744f151f0d924c9978f6790b7f7.
>>
>> Tests run: 12, Failures: 0, Errors: 1, Skipped: 2, Time elapsed:
>> 11.109 sec <<< FAILURE! - in org.wali.TestMinimalLockingWriteAheadLog
>> testRecoverFileThatHasTrailingNULBytesAndTruncation(org.wali.TestMinimalLockingWriteAheadLog)
>>   Time elapsed: 0.033 sec  <<< ERROR!
>> java.nio.channels.OverlappingFileLockException: null
>> at sun.nio.ch.SharedFileLockTable.checkList(FileLockTable.java:255)
>> at sun.nio.ch.SharedFileLockTable.add(FileLockTable.java:152)
>> at sun.nio.ch.FileChannelImpl.lock(FileChannelImpl.java:1063)
>> at java.nio.channels.FileChannel.lock(FileChannel.java:1053)
>> at org.wali.MinimalLockingWriteAheadLog.<init>(MinimalLockingWriteAheadLog.java:187)
>> at org.wali.MinimalLockingWriteAheadLog.<init>(MinimalLockingWriteAheadLog.java:108)
>> at org.wali.TestMinimalLockingWriteAheadLog.testRecoverFileThatHasTrailingNULBytesAndTruncation(TestMinimalLockingWriteAheadLog.java:472)
>>
>> Are there settings I need to change on my box that are causing these 
>> failures?
>>
>> Thanks,
>>   Peter


RE: [EXT] Re: Contrib Check Build - RAT and Unit Test Failure

2017-11-22 Thread Peter Wicks (pwicks)
Joe,

This did resolve the RAT issue, but the unit test I mentioned still fails with 
the same error.

--Peter

-Original Message-
From: Joe Witt [mailto:joe.w...@gmail.com] 
Sent: Thursday, November 23, 2017 10:22
To: dev@nifi.apache.org
Subject: [EXT] Re: Contrib Check Build - RAT and Unit Test Failure

Peter

I don't believe RAT works in parallel builds (-T2 for example).

If I use RAT it is during a full clean build and activated via the 
'contrib-check' profile.  'mvn clean install -Pcontrib-check'

Thanks

On Wed, Nov 22, 2017 at 9:19 PM, Peter Wicks (pwicks) <pwi...@micron.com> wrote:
> I'm trying to successfully run Contrib Check on my dev box.
>
> Windows 10 x64
> jdk1.8.0_91
> MVN 3.3.9
>
> I'm using IntelliJ, so my Run looks like:
>
> Command line: -T2 -Drat.skip=true clean install
> Profile: contrib-check
>
> If I don't disable RAT (-Drat.skip=true) then I get too many RAT failures for 
> the build to run, and the files involved aren't the ones that have changed 
> and that I'm testing.
> Then, with RAT disabled I'm getting at least one Unit Test failure, with a 
> test that has been in place since April: 
> https://github.com/apache/nifi/commit/0f2ac39f69c1a744f151f0d924c9978f6790b7f7.
>
> Tests run: 12, Failures: 0, Errors: 1, Skipped: 2, Time elapsed: 
> 11.109 sec <<< FAILURE! - in org.wali.TestMinimalLockingWriteAheadLog
> testRecoverFileThatHasTrailingNULBytesAndTruncation(org.wali.TestMinimalLockingWriteAheadLog)
>   Time elapsed: 0.033 sec  <<< ERROR!
> java.nio.channels.OverlappingFileLockException: null
> at sun.nio.ch.SharedFileLockTable.checkList(FileLockTable.java:255)
> at sun.nio.ch.SharedFileLockTable.add(FileLockTable.java:152)
> at sun.nio.ch.FileChannelImpl.lock(FileChannelImpl.java:1063)
> at java.nio.channels.FileChannel.lock(FileChannel.java:1053)
> at org.wali.MinimalLockingWriteAheadLog.<init>(MinimalLockingWriteAheadLog.java:187)
> at org.wali.MinimalLockingWriteAheadLog.<init>(MinimalLockingWriteAheadLog.java:108)
> at org.wali.TestMinimalLockingWriteAheadLog.testRecoverFileThatHasTrailingNULBytesAndTruncation(TestMinimalLockingWriteAheadLog.java:472)
>
> Are there settings I need to change on my box that are causing these failures?
>
> Thanks,
>   Peter


Contrib Check Build - RAT and Unit Test Failure

2017-11-22 Thread Peter Wicks (pwicks)
I'm trying to successfully run Contrib Check on my dev box.

Windows 10 x64
jdk1.8.0_91
MVN 3.3.9

I'm using IntelliJ, so my Run looks like:

Command line: -T2 -Drat.skip=true clean install
Profile: contrib-check

If I don't disable RAT (-Drat.skip=true) then I get too many RAT failures for 
the build to run, and the files involved aren't the ones that have changed and 
that I'm testing.
Then, with RAT disabled I'm getting at least one Unit Test failure, with a test 
that has been in place since April: 
https://github.com/apache/nifi/commit/0f2ac39f69c1a744f151f0d924c9978f6790b7f7.

Tests run: 12, Failures: 0, Errors: 1, Skipped: 2, Time elapsed: 11.109 sec <<< 
FAILURE! - in org.wali.TestMinimalLockingWriteAheadLog
testRecoverFileThatHasTrailingNULBytesAndTruncation(org.wali.TestMinimalLockingWriteAheadLog)
  Time elapsed: 0.033 sec  <<< ERROR!
java.nio.channels.OverlappingFileLockException: null
at sun.nio.ch.SharedFileLockTable.checkList(FileLockTable.java:255)
at sun.nio.ch.SharedFileLockTable.add(FileLockTable.java:152)
at sun.nio.ch.FileChannelImpl.lock(FileChannelImpl.java:1063)
at java.nio.channels.FileChannel.lock(FileChannel.java:1053)
at org.wali.MinimalLockingWriteAheadLog.<init>(MinimalLockingWriteAheadLog.java:187)
at org.wali.MinimalLockingWriteAheadLog.<init>(MinimalLockingWriteAheadLog.java:108)
at org.wali.TestMinimalLockingWriteAheadLog.testRecoverFileThatHasTrailingNULBytesAndTruncation(TestMinimalLockingWriteAheadLog.java:472)

Are there settings I need to change on my box that are causing these failures?

Thanks,
  Peter


RE: [EXT] Re: Automated Testing of Flows

2017-11-16 Thread Peter Wicks (pwicks)
I don't even know if I need Jenkins to set up a fresh instance. That could be 
nice, but I could use my TEST instance and just upgrade it by hand anytime I 
want to test a new version. I could create a dedicated Process Group that my 
test scripts would use for deploying the templates.

I've worked with the REST API quite a bit, so I'd probably use that to 
configure the data sources, etc... Have the test runner be Java or Python.

From a completely different approach, I saw a recent PR to allow for NiFi to be 
started up from within another process. This could be an interesting approach 
if the API's are easily accessible. https://github.com/apache/nifi/pull/2251




-Original Message-
From: Scott Aslan [mailto:scottyas...@gmail.com] 
Sent: Friday, November 17, 2017 12:39
To: dev@nifi.apache.org
Subject: [EXT] Re: Automated Testing of Flows

Just brain storming here

You could use travis or Jenkins to setup a job where you use mvn and npm to 
build and start NiFi. Assuming the test template was available from within the 
environment maybe you could use the protractor e2e testing suite to script the 
clicks necessary to load the template and run the flow...

On Wed, Nov 15, 2017 at 8:07 PM, Peter Wicks (pwicks) <pwi...@micron.com>
wrote:

> I would like to find a way to automatically run integration/upgrade 
> tests on flows, rather than just one or two processors like the mock 
> framework supports. Preferably something Template driven, where I can 
> provide a template xml file, and an accompanying config file to setup 
> the test, then run the flow and examine the results.
>
> Has anyone worked on something like this?
>


Automated Testing of Flows

2017-11-15 Thread Peter Wicks (pwicks)
I would like to find a way to automatically run integration/upgrade tests on 
flows, rather than just one or two processors like the mock framework supports. 
Preferably something Template driven, where I can provide a template xml file, 
and an accompanying config file to setup the test, then run the flow and 
examine the results.

Has anyone worked on something like this?


Adding EL support to UpdateAttribute for the Attribute Name

2017-10-30 Thread Peter Wicks (pwicks)
I've run into a use case for adding EL support into the Attribute Name itself 
in UpdateAttribute. Looking for thoughts on other approaches and the pros/cons 
of doing this.

I'm generically extracting data from a database. Right now I have ~30 tables, 
but that number could be anything, just think big enough to be a pain to put 
into RouteOnAttribute and handle individually. Each table has a varying number 
of columns. In my destination system I have some trailing metadata columns in 
the tables about when the data was loaded, what the FlowFile UUID was, etc... I 
provide the values for these columns using UpdateAttribute and providing a 
value. As part of my extraction code I've tacked on the column count as an 
attribute so that I know my metadata columns are ${fieldcount} + 1, +2, +3, 
etc...

Now my downstream processors are expecting a sql.args.##.value attribute for 
loading the data. Unfortunately, the column number for these trailing columns 
shifts from table to table. I'm experimenting with allowing UpdateAttribute to 
evaluate EL, where the Attribute Name might be: "sql.args.${fieldcount}.value".

https://github.com/patricker/nifi/commit/3f640c20f70956e4ddbe5741c25a422b3ed90357
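For illustration only — this is a simplified stand-in, not the NiFi Expression Language engine, and `AttributeNameEl` is a hypothetical class — resolving `${...}` references inside an attribute *name* against a flow file's existing attributes amounts to a substitution pass:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AttributeNameEl {
    private static final Pattern REF = Pattern.compile("\\$\\{([^}]+)\\}");

    // Substitute each ${name} in the attribute name with the value of the
    // referenced attribute; unknown references resolve to the empty string.
    static String resolve(String name, Map<String, String> attributes) {
        Matcher m = REF.matcher(name);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(out,
                Matcher.quoteReplacement(attributes.getOrDefault(m.group(1), "")));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> attrs = Map.of("fieldcount", "12");
        System.out.println(resolve("sql.args.${fieldcount}.value", attrs)); // sql.args.12.value
    }
}
```

Note that real EL also supports function calls such as `${fieldcount:plus(1)}`, which this sketch deliberately ignores.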

Thoughts?

Thanks,
  Peter


RE: [EXT] Re: Architecting the MS SQL CDC Processor

2017-10-27 Thread Peter Wicks (pwicks)
I've submitted the first pass at this processor: 
https://github.com/apache/nifi/pull/2230

I did use RecordSetWriter, and ResultSetRecordSet for reading, which has been 
working well.
I found I didn't need to worry about the rate at which state gets updated. In the 
MySQL case it checks to see if it should update state after every single row; in 
mine it updates after all changes for the table since the last run have been 
processed, so it behaves more like QueryDatabaseTable. This is possible because 
I'm not reading changes in anything remotely resembling the way MySQL works :)

I created unit tests for this processor, the unit tests run on Apache DB tables 
that match in schema, but not necessarily in type, to those in MS SQL.

The only quirky thing was in order to get my generated SQL to work for both 
Apache DB and MS SQL, I had to use quotes a lot more than usual in my SQL 
statements. So please no comments along the lines of, "Why are there so many 
quoted identifiers in your SQL statements" :)

Thanks,
  Peter

-Original Message-
From: Matt Burgess [mailto:mattyb...@apache.org] 
Sent: Tuesday, October 17, 2017 10:59 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: Architecting the MS SQL CDC Processor

Peter,

This is great to hear, I'm sure the community is looking forward to such a 
solution!  I worked on the first offering of the CaptureChangeMySQL processor, 
so here are some notes, comments, and
(hopefully!) answers to your questions:

* If you support a RecordSetWriter controller service as your output, then you 
won't need JdbcCommon per se; instead you would create Records and pass those 
to the user-selected RecordSetWriter. In that sense you can support Avro, CSV, 
JSON, or anything else for which there is a RecordSetWriter implementation.

* Depending on how often you'll be updating state, you may want to implement 
something similar to the State Update Interval property in CaptureChangeMySQL, 
which came about due to similar concerns about the overhead of state updates vs 
the amount of processing beforehand.
This allows the user to tune the tradeoff based on their own requirements and 
performance needs.

* I have no concerns with having a different output format from 
CaptureChangeMySQL; in fact the only reason it doesn't have a RecordSetWriter 
output interface is that those capabilities were being developed in parallel, 
so rather than have to wait for the record-aware API stuff, I chose to output 
JSON. I have written
NIFI-4491 to improve/augment CDC processor(s) with RecordSetWriter support. 
This would be very helpful by supporting various output formats as well as 
generating the accompanying schema. If your processor were the first to support 
this, it could be the exemplar for past and future CDC processors :)

Regards,
Matt

[1] https://issues.apache.org/jira/browse/NIFI-4491

On Mon, Oct 16, 2017 at 10:36 PM, Peter Wicks (pwicks) <pwi...@micron.com> 
wrote:
> I've been working on a new processor that does Change Data Capture with 
> Microsoft SQL Server. I followed Microsoft's documentation on how CDC works, 
> and I've got some code that gets me the changes and is testing well. Right 
> now, I don't actually have a processor, but a number of scripts that generate 
> SQL and I put it into ExecuteSQL and QueryDatabaseTable processors; with QDB 
> using my as-yet incomplete 
> NIFI-1706<https://github.com/apache/nifi/pull/2162>.
>
> One of the reasons I don't have a processor yet is because I don't want to 
> use the same output format as the MySQL CDC Processor, but I didn't want to 
> put in the time if it was not going to get merged. The MySQL CDC processor 
> uses JSON messages as the output format, but in MS SQL the CDC messages are 
> rows in a table; and it's much more convenient to output them as records. 
> Currently, I'm using Avro.
>
> Questions:
>
>   *   My output format doesn't have to be Avro, but given the source is rows 
> in a table being returned by a ResultSet, using the JdbcCommon class makes a 
> lot of sense to me. Can I move JdbcCommon to somewhere useful like 
> nifi-avro-record-utils?
>   *   I'll be looping through a list of tables and plan on committing the 
> files immediately to the success relationship as that table's CDC records are 
> pulled. I want to make sure that the max value tracking gets updated 
> immediately too. Does calling setState on the State Manager cause an 
> immediate state save? Is this safe to call repeatedly, assuming single 
> threaded, during the execution of the processor?
>   *   Concerns with using a different output format than the MySQL CDC 
> Processor?
>
> Thanks,
>   Peter


Initial Admin Identity can't query provenance?

2017-10-25 Thread Peter Wicks (pwicks)
I just setup a new NiFi instance and setup myself as the Initial Admin 
Identity, I'm currently the only user on the box and I've made no security 
changes. I noticed today that I couldn't query provenance.

I had to go in and create a new policy and add myself to it. This seems odd for 
an initial admin identity, but maybe it's by design?

Thanks,
  Peter


Architecting the MS SQL CDC Processor

2017-10-16 Thread Peter Wicks (pwicks)
I've been working on a new processor that does Change Data Capture with 
Microsoft SQL Server. I followed Microsoft's documentation on how CDC works, 
and I've got some code that gets me the changes and is testing well. Right now, 
I don't actually have a processor, but a number of scripts that generate SQL 
and I put it into ExecuteSQL and QueryDatabaseTable processors; with QDB using 
my as-yet incomplete NIFI-1706.

One of the reasons I don't have a processor yet is because I don't want to use 
the same output format as the MySQL CDC Processor, but I didn't want to put in 
the time if it was not going to get merged. The MySQL CDC processor uses JSON 
messages as the output format, but in MS SQL the CDC messages are rows in a 
table; and it's much more convenient to output them as records. Currently, I'm 
using Avro.

Questions:

  *   My output format doesn't have to be Avro, but given the source is rows in 
a table being returned by a ResultSet, using the JdbcCommon class makes a lot 
of sense to me. Can I move JdbcCommon to somewhere useful like 
nifi-avro-record-utils?
  *   I'll be looping through a list of tables and plan on committing the files 
immediately to the success relationship as that table's CDC records are pulled. 
I want to make sure that the max value tracking gets updated immediately too. 
Does calling setState on the State Manager cause an immediate state save? Is 
this safe to call repeatedly, assuming single threaded, during the execution of 
the processor?
  *   Concerns with using a different output format than the MySQL CDC 
Processor?

Thanks,
  Peter


RE: [EXT] Re: JAVA_HOME trouble in nifi.sh

2017-10-16 Thread Peter Wicks (pwicks)
Aldrin,

My branch isn't pure; in fact it has a few build-only related changes... I'll 
play around with it.

I'm working on moving my builds to a Unix box with Jenkins to reduce the CPU 
strain on my development box. But my builds were failing due to the NPM/Node 
Maven plugin. So I upgraded the package version to get the build to complete.

Since the build did not complete before this update I have no idea how it may 
have affected the build.

Thanks,
  Peter

-Original Message-
From: Aldrin Piri [mailto:aldrinp...@gmail.com] 
Sent: Friday, October 13, 2017 10:25 PM
To: dev <dev@nifi.apache.org>
Subject: [EXT] Re: JAVA_HOME trouble in nifi.sh

Can't say I've seen this before and certainly have JAVA_HOME set on most of the 
places where I've performed builds.  Would you mind please opening up a ticket 
as well as capturing the salient environmental bits (OS, Maven, JDK versions, 
etc)?

That locateJava blurb is something we used from another ASF project and use 
heavily throughout our executables, so definitely need to track it down.

Did you happen to see this substitution in any other files?

On Thu, Oct 12, 2017 at 10:51 PM, Peter Wicks (pwicks) <pwi...@micron.com>
wrote:

> Only when building on Linux, during build my "${JAVA_HOME}" string in 
> nifi.sh is getting overwritten by the current value of my environment 
> variable for JAVA_HOME on my build box... not sure if this is 
> something others have run into.
>
> I built on one box, where JAVA_HOME is set to 
> "/var/spe/tools/jdk1.8.0_144". I then copied the tar.gz directly from 
> nifi-assembly to another box. I only extracted it after getting to the 
> other server where JAVA_HOME is not set.
>
> Here is a snippet from nifi.sh, I've bolded the sections where the raw 
> file has ${JAVA_HOME}. Maybe this is a system config issue? Obviously 
> this isn't happening for everyone else building on Linux...?
>
> locateJava() {
> # Setup the Java Virtual Machine
> if $cygwin ; then
> [ -n "${JAVA}" ] && JAVA=$(cygpath --unix "${JAVA}")
> [ -n "/var/spe/tools/jdk1.8.0_144" ] && JAVA_HOME=$(cygpath 
> --unix
> "/var/spe/tools/jdk1.8.0_144")
> fi
>
> if [ "x${JAVA}" = "x" ] && [ -r /etc/gentoo-release ] ; then
> JAVA_HOME=$(java-config --jre-home)
> fi
> if [ "x${JAVA}" = "x" ]; then
> if [ "x/var/spe/tools/jdk1.8.0_144" != "x" ]; then
> if [ ! -d "/var/spe/tools/jdk1.8.0_144" ]; then
> die "JAVA_HOME is not valid: /var/spe/tools/jdk1.8.0_144"
> fi
> JAVA="/var/spe/tools/jdk1.8.0_144/bin/java"
> else
> warn "JAVA_HOME not set; results may vary"
> JAVA=$(type java)
> JAVA=$(expr "${JAVA}" : '.* \(/.*\)$')
> if [ "x${JAVA}" = "x" ]; then
> die "java command not found"
> fi
> fi
> fi
> # if command is env, attempt to add more to the classpath
> if [ "$1" = "env" ]; then
> [ "x${TOOLS_JAR}" =  "x" ] && [ -n 
> "/var/spe/tools/jdk1.8.0_144" ] && TOOLS_JAR=$(find -H 
> "/var/spe/tools/jdk1.8.0_144" -name "tools.jar")
> [ "x${TOOLS_JAR}" =  "x" ] && [ -n 
> "/var/spe/tools/jdk1.8.0_144" ] && TOOLS_JAR=$(find -H 
> "/var/spe/tools/jdk1.8.0_144" -name "classes.jar")
> if [ "x${TOOLS_JAR}" =  "x" ]; then
>  warn "Could not locate tools.jar or classes.jar. Please 
> set manually to avail all command features."
> fi
> fi
>
> }
>


RE: [EXT] Re: Please refresh my memory on NAR dependencies

2017-10-16 Thread Peter Wicks (pwicks)
I gave this a shot and it worked well for me.
https://github.com/apache/nifi/pull/2194

-Original Message-
From: Koji Kawamura [mailto:ijokaruma...@gmail.com] 
Sent: Monday, October 16, 2017 12:03 PM
To: dev <dev@nifi.apache.org>
Subject: Re: [EXT] Re: Please refresh my memory on NAR dependencies

Peter, Matt,

If the goal is sharing org.apache.nifi.csv.CSVUtils among modules, an 
alternative approach is moving CSVUtils to nifi-standard-record-util and adding 
an ordinary JAR dependency from nifi-poi-processors. What do you think?

Thanks,
Koji

On Mon, Oct 16, 2017 at 12:17 PM, Peter Wicks (pwicks) <pwi...@micron.com> 
wrote:
> Matt,
>
> I am trying to re-use most of CSVUtils, including most of the property 
> descriptors and CSVUtils.createCSVFormat.
>
> It seemed like a waste to duplicate the entire class. I can try making it the 
> parent, what are the implications if I do that?
>
> Thanks,
>   Peter
>
> -Original Message-
> From: Matt Burgess [mailto:mattyb...@apache.org]
> Sent: Monday, October 16, 2017 10:58 AM
> To: dev@nifi.apache.org
> Subject: [EXT] Re: Please refresh my memory on NAR dependencies
>
> Do you have a hard requirement on the implementations in 
> nifi-record-serialization-services? Otherwise, the existing examples have the 
> processor POM pointing at the following:
>
> <dependency>
>     <groupId>org.apache.nifi</groupId>
>     <artifactId>nifi-record-serialization-service-api</artifactId>
> </dependency>
>
> which is the API JAR I think. If you need the implementations behind 
> it, you will probably need to declare that as a parent (not a
> dependency) and perhaps still use the API JAR (though I'm guessing about the 
> latter).
>
> Regards,
> Matt
>
>
> On Sun, Oct 15, 2017 at 10:27 PM, Peter Wicks (pwicks) <pwi...@micron.com> 
> wrote:
>> For NIFI-4465 I want the nifi-poi-bundle to include a Maven dependency on 
>> nifi-record-serialization-services. So I start by adding the dependency to 
>> the pom.xml.
>>
>> <dependency>
>>     <groupId>org.apache.nifi</groupId>
>>     <artifactId>nifi-record-serialization-services</artifactId>
>> </dependency>
>>
>> I've tried several variations on this, with version numbers, putting it at 
>> higher pom levels, including it in the nifi-nar-bundles pom and marking it 
>> as included, etc...
>>
>> Throughout all this compiling is no problem, and all my unit tests run 
>> correctly. But when I try to start NiFi I immediately get Class not found 
>> exceptions from the nifi-poi classes related to the 
>> nifi-record-serialization libraries.
>>
>> I feel like I've run into this in the past, and it was due to how NAR's 
>> work. Can't remember though.
>>
>> Help would be appreciated!
>>
>> Thanks,
>>   Peter


RE: [EXT] Re: Funnel Queue Slowness

2017-10-16 Thread Peter Wicks (pwicks)
Pierre,

I agree with you all around. It would be nice if it was a little smarter.

--Peter


-Original Message-
From: Pierre Villard [mailto:pierre.villard...@gmail.com] 
Sent: Monday, October 16, 2017 4:00 PM
To: dev <dev@nifi.apache.org>
Subject: Re: [EXT] Re: Funnel Queue Slowness

Peter,

This behaviour is by design and it's the case for processors as well.

Back pressure is only checked by the component each time it is scheduled to see 
whether the component can run or not. If yes, the component will run as 
configured and will process as many flow files as it is supposed to process. In 
case of funnels, a funnel will always perform actions on a batch of 100 flow 
files (
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core-api/src/main/java/org/apache/nifi/controller/StandardFunnel.java#L372
).

You would have the same with other components. Let's say you have a SplitText 
creating 10k flow files for each incoming flow file. Even though backpressure 
is configured with 1k flow file on the downstream connection, if back pressure 
thresholds are not reached, the processor will be triggered and produce the 
expected number of flow files (which is over back pressure threshold).

I agree this hard-coded number of 100 for funnels could be improved (something 
like min(100, backpressure threshold - number of queued flow
files)) but I'm not sure that's really an issue.
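That suggested cap could be sketched like this (hypothetical names; the fixed batch of 100 lives in StandardFunnel as linked above):

```java
public class FunnelBatchSize {
    static final int DEFAULT_BATCH = 100;

    // Cap the funnel's poll size by the remaining back-pressure headroom of
    // the downstream connection instead of always taking a fixed 100.
    static int batchSize(int backPressureThreshold, int queuedDownstream) {
        int headroom = Math.max(0, backPressureThreshold - queuedDownstream);
        return Math.min(DEFAULT_BATCH, headroom);
    }

    public static void main(String[] args) {
        System.out.println(batchSize(10, 0));     // 10 -- respects a small threshold
        System.out.println(batchSize(1000, 950)); // 50 -- only the remaining headroom
        System.out.println(batchSize(10, 10));    // 0  -- queue already full
    }
}
```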

Pierre







2017-10-16 5:05 GMT+02:00 Peter Wicks (pwicks) <pwi...@micron.com>:

> Joe,
>
> It really is about just forgetting that penalization is a thing. 
> Penalized files are fairly well marked when you do a List Queue.
>
> I think funnels need an overall re-examination. I noticed another 
> quirk the other day when moving queues around that already contained 
> FlowFiles; funnels ignore back pressure settings if there is any 
> space available in the downstream queue.
>
> Prep the FlowFiles: https://photos.app.goo.gl/Fu3EBDtQZ5wurQNt2
> Configure the Queue to only allow Back Pressure of 10 files:
> https://photos.app.goo.gl/17OlJSu2NXkxQ8lZ2
> Funnel grabs 100 FlowFiles no matter what and shoves them through:
> https://photos.app.goo.gl/vEwoZYETH6iMImBJ3
>
> If you let the down-stream processor run until there is space for 1 
> FlowFile available then it loads in another 100 flow files:
> https://photos.app.goo.gl/R4P5mdXr3L5oJnSw2
>
> I created a ticket: NIFI-4486.
>
> Thanks,
>   Peter
>
> -Original Message-
> From: Joe Witt [mailto:joe.w...@gmail.com]
> Sent: Tuesday, October 10, 2017 10:01 AM
> To: dev@nifi.apache.org
> Subject: Re: [EXT] Re: Funnel Queue Slowness
>
> Peter,
>
> I see your point that it feels unnatural, or at least surprising.
> There are two challenges I see with what you propose.  One is user 
> oriented and the other is technical.
>
> The user oriented one is that penalized objects are penalized as a 
> function of the thing that last operated on them.  The further away we 
> let the data get the harder it would be to reason over why they were 
> penalized in the first place.
>
> The technical one is that once something is penalized and placed into 
> the queue there is prioritization and polling logic that kicks in as a factor.
> I'm not sure how we'd tweak it for that to be ok in some cases and in 
> others not.  Perhaps we could just make funnels truly a pass-through 
> and when calculating the queue we're storing on figure out the first 
> non-funnel queue provided there is no cloning/branching we'd have to 
> account for.  But even then it brings us back to the previous point 
> which is the user challenge of knowing what thing penalized objects in 
> queue in the first place.
>
> Alternatively, we should review whether it is obvious enough (or at
> all) that items within a queue at a given moment in time are penalized.
> I've worked with NiFi for a very long time and i'll be honest and 
> state I've forgotten that penalization was a thing more than a few times too.
>
> What do you think?
>
> Thanks
>
> On Mon, Oct 9, 2017 at 9:01 PM, Peter Wicks (pwicks) 
> <pwi...@micron.com>
> wrote:
> > Bryan,
> >
> > Yes, it was the penalty causing the issue. This feels like weird
> behavior for Funnel’s, and I’m not sure if it makes sense for 
> penalties to work this way.
> >
> > Would it make more sense if penalties were generally kept as is, but 
> > not
> applied at Funnel’s, then the penalty would kick back in at the first 
> non-funnel queue?
> >
> > Thanks,
> >   Peter
> >
> > From: Bryan Bende [mailto:bbe...@gmail.com]
> > Sent: Monday, October 09, 2017 7:33 PM
> > To: dev@nifi.apache.org
> Subject: [EXT] Re: Funnel Queue Slowness

RE: [EXT] Re: Please refresh my memory on NAR dependencies

2017-10-15 Thread Peter Wicks (pwicks)
Matt,

I am trying to re-use most of CSVUtils, including most of the property 
descriptors and CSVUtils.createCSVFormat.

It seemed like a waste to duplicate the entire class. I can try making it the 
parent; what are the implications if I do that?

Thanks,
  Peter

-Original Message-
From: Matt Burgess [mailto:mattyb...@apache.org] 
Sent: Monday, October 16, 2017 10:58 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: Please refresh my memory on NAR dependencies

Do you have a hard requirement on the implementations in 
nifi-record-serialization-services? Otherwise, the existing examples have the 
processor POM pointing at the following:


<dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-record-serialization-service-api</artifactId>
</dependency>


which is the API JAR I think. If you need the implementations behind it, you 
will probably need to declare that as a parent (not a
dependency) and perhaps still use the API JAR (though I'm guessing about the 
latter).
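
The arrangement Matt describes could be sketched roughly as below: the processor's JAR module compiles against the service API artifact, while the NAR module declares a dependency of type `nar`, which the NiFi NAR plugin treats as the parent classloader so implementations are reachable at runtime. The artifact IDs follow NiFi's naming, but the exact NAR artifact used here (`nifi-standard-services-api-nar`) and version handling are assumptions to verify against your own tree:

```xml
<!-- Processor JAR module pom.xml: compile against the service API only -->
<dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-record-serialization-service-api</artifactId>
</dependency>

<!-- NAR module pom.xml: a single dependency of type "nar" becomes the
     parent NAR classloader, rather than a normal compile dependency -->
<dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-standard-services-api-nar</artifactId>
    <type>nar</type>
</dependency>
```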

Regards,
Matt


On Sun, Oct 15, 2017 at 10:27 PM, Peter Wicks (pwicks) <pwi...@micron.com> 
wrote:
> For NIFI-4465 I want the nifi-poi-bundle to include a Maven dependency on 
> nifi-record-serialization-services. So I start by adding the dependency to 
> the pom.xml.
>
> 
> <dependency>
>    <groupId>org.apache.nifi</groupId>
>    <artifactId>nifi-record-serialization-services</artifactId>
> </dependency>
> 
>
> I've tried several variations on this, with version numbers, putting it at 
> higher pom levels, including it in the nifi-nar-bundles pom and marking it as 
> included, etc...
>
> Throughout all this compiling is no problem, and all my unit tests run 
> correctly. But when I try to start NiFi I immediately get Class not found 
> exceptions from the nifi-poi classes related to the nifi-record-serialization 
> libraries.
>
> I feel like I've run into this in the past, and it was due to how NAR's work. 
> Can't remember though.
>
> Help would be appreciated!
>
> Thanks,
>   Peter


RE: [EXT] Re: Funnel Queue Slowness

2017-10-15 Thread Peter Wicks (pwicks)
Joe,

It really is about just forgetting that penalization is a thing. Penalized 
files are fairly well marked when you do a List Queue.

I think Funnels need an overall re-examination. I noticed another quirk the 
other day when moving queues around that already contained FlowFiles: Funnels 
ignore back pressure settings if there is any space available in the 
downstream queue.

Prep the FlowFiles: https://photos.app.goo.gl/Fu3EBDtQZ5wurQNt2
Configure the Queue to only allow Back Pressure of 10 files: 
https://photos.app.goo.gl/17OlJSu2NXkxQ8lZ2
Funnel grabs 100 FlowFiles no matter what and shoves them through: 
https://photos.app.goo.gl/vEwoZYETH6iMImBJ3

If you let the downstream processor run until there is space for 1 FlowFile, 
the funnel loads in another 100 FlowFiles: 
https://photos.app.goo.gl/R4P5mdXr3L5oJnSw2

I created a ticket: NIFI-4486.
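
As a toy illustration of the NIFI-4486 quirk (assumed mechanics, not NiFi's actual queue code): if back pressure is only checked once before a batched pull, a funnel pulling 100 FlowFiles at a time overshoots a 10-file threshold, and once the queue dips below the threshold it overshoots again.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class BackPressureOvershoot {
    // Move up to batchSize items downstream; the back-pressure threshold
    // is checked once up front and never re-checked mid-batch.
    static int transferBatch(Deque<Integer> upstream, Deque<Integer> downstream,
                             int batchSize, int threshold) {
        if (downstream.size() >= threshold) {
            return 0; // back pressure engaged
        }
        int moved = 0;
        while (moved < batchSize && !upstream.isEmpty()) {
            downstream.add(upstream.pollFirst());
            moved++;
        }
        return moved;
    }

    public static void main(String[] args) {
        Deque<Integer> up = new ArrayDeque<>();
        for (int i = 0; i < 200; i++) up.add(i);
        Deque<Integer> down = new ArrayDeque<>();

        System.out.println(transferBatch(up, down, 100, 10)); // 100: overshoots the 10-file limit
        System.out.println(transferBatch(up, down, 100, 10)); // 0: now blocked

        while (down.size() >= 10) down.pollFirst();           // drain just below the threshold
        System.out.println(transferBatch(up, down, 100, 10)); // 100: overshoots again
    }
}
```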

Thanks,
  Peter

-Original Message-
From: Joe Witt [mailto:joe.w...@gmail.com] 
Sent: Tuesday, October 10, 2017 10:01 AM
To: dev@nifi.apache.org
Subject: Re: [EXT] Re: Funnel Queue Slowness

Peter,

I see your point that it feels unnatural, or at least surprising.
There are two challenges I see with what you propose: one is user oriented and 
the other is technical.

The user oriented one is that penalized objects are penalized as a function of 
the thing that last operated on them.  The further away we let the data get the 
harder it would be to reason over why they were penalized in the first place.

The technical one is that once something is penalized and placed into the 
queue, prioritization and polling logic kicks in as a factor. I'm not sure how 
we'd tweak it to be OK in some cases and not in others. Perhaps we could make 
funnels truly a pass-through and, when calculating the queue we're storing on, 
figure out the first non-funnel queue, provided there is no cloning/branching 
we'd have to account for. But even then it brings us back to the previous 
point: the user challenge of knowing what penalized the objects in the queue 
in the first place.

Alternatively, we should review whether it is obvious enough (or obvious at 
all) that items within a queue at a given moment in time are penalized. I've 
worked with NiFi for a very long time, and I'll be honest and state that I've 
forgotten penalization was a thing more than a few times too.
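
The interaction Joe describes can be sketched with a toy model (assumed behavior for illustration; this is not NiFi's actual FlowFileQueue implementation): a penalized item at the head of the queue makes the queue look empty to consumers until its penalty expires, which is exactly why penalized files appear "stuck".

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PenaltyQueueDemo {
    static class Item {
        final String id;
        final long penaltyExpirationMillis;
        Item(String id, long penaltyExpirationMillis) {
            this.id = id;
            this.penaltyExpirationMillis = penaltyExpirationMillis;
        }
        boolean isPenalized(long now) { return now < penaltyExpirationMillis; }
    }

    // NiFi-like polling: if the head of the queue is penalized,
    // nothing is returned even though items are waiting behind it.
    static Item poll(Deque<Item> queue, long now) {
        Item head = queue.peekFirst();
        if (head == null || head.isPenalized(now)) {
            return null;
        }
        return queue.pollFirst();
    }

    public static void main(String[] args) {
        Deque<Item> queue = new ArrayDeque<>();
        long now = 1_000L;
        queue.add(new Item("penalized", now + 30_000)); // 30s penalty, like the default
        queue.add(new Item("ready", 0));

        // While the head is penalized, consumers see an "empty" queue.
        System.out.println(poll(queue, now));             // null
        // Once the penalty expires, the items flow normally.
        System.out.println(poll(queue, now + 30_001).id); // penalized
        System.out.println(poll(queue, now + 30_001).id); // ready
    }
}
```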

What do you think?

Thanks

On Mon, Oct 9, 2017 at 9:01 PM, Peter Wicks (pwicks) <pwi...@micron.com> wrote:
> Bryan,
>
> Yes, it was the penalty causing the issue. This feels like weird behavior for 
> Funnel’s, and I’m not sure if it makes sense for penalties to work this way.
>
> Would it make more sense if penalties were generally kept as is, but not 
> applied at Funnel’s, then the penalty would kick back in at the first 
> non-funnel queue?
>
> Thanks,
>   Peter
>
> From: Bryan Bende [mailto:bbe...@gmail.com]
> Sent: Monday, October 09, 2017 7:33 PM
> To: dev@nifi.apache.org
> Subject: [EXT] Re: Funnel Queue Slowness
>
> Peter,
>
> The images didn’t come across for me, but since you mentioned that a failure 
> queue is involved, is it possible all the flow files going to failure are 
> being penalized which would cause them to not be processed immediately?
>
> -Bryan
>
>
> On Oct 8, 2017, at 10:49 PM, Peter Wicks (pwicks) 
> <pwi...@micron.com<mailto:pwi...@micron.com>> wrote:
>
> I’ve been running into an issue on 1.4.0 where my Funnel sometimes runs slow. 
> I haven’t been able to create a nice reproducible test case to pass on.
> What I’m seeing is that my failure queue on the right will start to fill up, 
> even though there is plenty of room for them in the next queue. You can see 
> that the Tasks/Time is fairly low, only 24 in the last 5 minutes (first 
> image), so it’s not that the FlowFile’s are moving so fast that they just 
> appear to be in queue.
>
> If I stop the downstream processor the files slowly trickle through the 
> funnel into the next queue slowly. I had an Oldest FlowFile First prioritizer 
> on the downstream queue. I tried removing it but there was no change in 
> behavior.
> One time where I saw this behavior in the past was when my NiFi 
> instance was thread starved, but there are plenty of threads available 
> on the instance and all other processors are running fine. I also 
> don’t understand why it trickles the FlowFile’s in, from what I’ve 
> seen in the code Funnel grabs large batches at one time…
>
> Thoughts?
>
> (Sometimes my images don’t make it, let me know if that happens.) 
>


Please refresh my memory on NAR dependencies

2017-10-15 Thread Peter Wicks (pwicks)
For NIFI-4465 I want the nifi-poi-bundle to include a Maven dependency on 
nifi-record-serialization-services. So I start by adding the dependency to the 
pom.xml.


<dependency>
   <groupId>org.apache.nifi</groupId>
   <artifactId>nifi-record-serialization-services</artifactId>
</dependency>


I've tried several variations on this, with version numbers, putting it at 
higher pom levels, including it in the nifi-nar-bundles pom and marking it as 
included, etc...

Throughout all this compiling is no problem, and all my unit tests run 
correctly. But when I try to start NiFi I immediately get Class not found 
exceptions from the nifi-poi classes related to the nifi-record-serialization 
libraries.

I feel like I've run into this in the past, and it was due to how NARs work. 
I can't remember, though.

Help would be appreciated!

Thanks,
  Peter


RE: [EXT] Re: Funnel Queue Slowness

2017-10-09 Thread Peter Wicks (pwicks)
Bryan,

Yes, it was the penalty causing the issue. This feels like weird behavior for 
Funnels, and I'm not sure it makes sense for penalties to work this way.

Would it make more sense if penalties were generally kept as-is, but not 
applied at Funnels, so the penalty would kick back in at the first 
non-funnel queue?

Thanks,
  Peter

From: Bryan Bende [mailto:bbe...@gmail.com]
Sent: Monday, October 09, 2017 7:33 PM
To: dev@nifi.apache.org
Subject: [EXT] Re: Funnel Queue Slowness

Peter,

The images didn’t come across for me, but since you mentioned that a failure 
queue is involved, is it possible all the flow files going to failure are being 
penalized which would cause them to not be processed immediately?

-Bryan


On Oct 8, 2017, at 10:49 PM, Peter Wicks (pwicks) 
<pwi...@micron.com<mailto:pwi...@micron.com>> wrote:

I’ve been running into an issue on 1.4.0 where my Funnel sometimes runs slow. I 
haven’t been able to create a nice reproducible test case to pass on.
What I’m seeing is that my failure queue on the right will start to fill up, 
even though there is plenty of room for them in the next queue. You can see 
that the Tasks/Time is fairly low, only 24 in the last 5 minutes (first image), 
so it’s not that the FlowFile’s are moving so fast that they just appear to be 
in queue.

If I stop the downstream processor the files slowly trickle through the funnel 
into the next queue slowly. I had an Oldest FlowFile First prioritizer on the 
downstream queue. I tried removing it but there was no change in behavior.
One time where I saw this behavior in the past was when my NiFi instance was 
thread starved, but there are plenty of threads available on the instance and 
all other processors are running fine. I also don’t understand why it trickles 
the FlowFile’s in, from what I’ve seen in the code Funnel grabs large batches 
at one time…

Thoughts?

(Sometimes my images don’t make it, let me know if that happens.)



Funnel Queue Slowness

2017-10-08 Thread Peter Wicks (pwicks)
I've been running into an issue on 1.4.0 where my Funnel sometimes runs slow. I 
haven't been able to create a nice reproducible test case to pass on.
What I'm seeing is that my failure queue on the right starts to fill up, 
even though there is plenty of room in the next queue. You can see 
that the Tasks/Time is fairly low, only 24 in the last 5 minutes (first image), 
so it's not that the FlowFiles are moving so fast that they just appear to be 
in queue.

If I stop the downstream processor, the files slowly trickle through the funnel 
into the next queue. I had an Oldest FlowFile First prioritizer on the 
downstream queue; I tried removing it, but there was no change in behavior.
One time I saw this behavior in the past was when my NiFi instance was 
thread-starved, but there are plenty of threads available on the instance and 
all other processors are running fine. I also don't understand why it trickles 
the FlowFiles in; from what I've seen in the code, Funnel grabs large batches 
at one time...

Thoughts?

(Sometimes my images don't make it, let me know if that happens.)


RE: [EXT] Re: [VOTE] Release Apache NiFi 1.4.0 (RC2)

2017-10-01 Thread Peter Wicks (pwicks)
+1 (non-binding).

Upgraded to 1.4.0, openJDK 1.8.0_102, RHEL. No issues.

-Original Message-
From: Joey Frazee [mailto:joey.fra...@icloud.com] 
Sent: Monday, October 02, 2017 3:50 AM
To: dev@nifi.apache.org
Subject: [EXT] Re: [VOTE] Release Apache NiFi 1.4.0 (RC2)

+1 (non-binding)

- Verified checksums and signature
- Successfully built and ran tests on OSX (Oracle 1.8.0_131), Amazon Linux 
(Oracle 1.8.0_131), and Docker maven:latest (OpenJDK 1.8.0_141)
- Built RPM with `mvn -T 2.0C clean install -Prpm,generateArchives -DskipTests` 
and tested install
- Tried out DMC w/ Redis :)

Note: Inconsistently seeing timeouts and test failures with TestListenSMTP on 
OSX (probably a me problem though)

-joey

On Oct 1, 2017, 12:05 PM -0500, Joe Gresock , wrote:
> +1 (non-binding)
>
> Built on CentOS 7, upgraded an existing 1.3.0 cluster, ran a complex flow
> with self-RPGs, reporting tasks, and controller services. All looks good!
> The Avro viewer is especially nice, and the double-click feature already
> feels natural. Great work!
>
> On Sun, Oct 1, 2017 at 4:41 PM, Joe Percivall  wrote:
>
> > +1 (binding)
> >
> > Built on OSX 10.12.5 and Windows 8.1, and ran very simple flows on OSX,
> > Windows 8.1 and Windows 10. When building on Windows 8.1, ran into previous
> > test failures and a couple new ones found here[1]. As stated before, not
> > important enough to downvote the release.
> >
> > Also one thing to comment, this vote thread doesn't have the SHA256 listed
> > but it is included with the other artifacts. It is:
> > 852dcda482342d9ae60e05e137a025cbf17d948e804716e3c185992a88e8cd8a
> >
> > Thanks for everyone's hard work!
> >
> > [1] https://issues.apache.org/jira/browse/NIFI-3840
> >
> > On Sun, Oct 1, 2017 at 10:44 AM, Joe Witt  wrote:
> >
> > > +1 (binding)
> > >
> > > Did all the normal release validation/L/sigs/etc..
> > >
> > > Did a bunch of different tests largely focused on high performance and
> > > stability. Things look really good.
> > >
> > > I did run into a test failure which I've narrowed down to the latest
> > > JRE/JDK version. Details
> > > https://issues.apache.org/jira/browse/NIFI-4445
> > >
> > > Thanks and great job Jeff on RM and to the community this is an
> > > awesome release. Now we need to focus on the growing list of PRs
> > > which reflect a wide range of new contributors!
> > >
> > > On Sat, Sep 30, 2017 at 11:27 AM, Gerdan Rezende dos Santos
> > >  wrote:
> > > > Verified hash, local build was successful on OS X, confirmed!
> > > > Good Version!
> > > >
> > > > On Sat, 30 Sep 2017 at 09:32 Tony Kurc  wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > Built on Ubuntu 14.04 with Java 1.8.0 and Maven 3.5.0. Verified hashes
> > > and
> > > > > signature. Tested some simple flows. Tested some tls toolkit
> > operations.
> > > > > Did not see any issues with LICENSE or NOTICE files.
> > > > >
> > > > > On Sat, Sep 30, 2017 at 4:17 AM, Koji Kawamura <
> > ijokaruma...@gmail.com
> > > > > wrote:
> > > > >
> > > > > > +1 (binding) Release this package as nifi-1.4.0
> > > > > >
> > > > > > Verified hashes, local build was successful on OS X, confirmed S2S
> > > > > > communication with older versions.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Sat, Sep 30, 2017 at 9:27 AM, Andy LoPresto <
> > alopre...@apache.org
> > > > > > wrote:
> > > > > > > +1 (binding)
> > > > > > >
> > > > > > > Build environment: Mac OS X 10.11.6, Java 1.8.0_101, Maven 3.3.9,
> > > JCE
> > > > > > > Unlimited Strength Cryptographic Jurisdiction Policies installed
> > > > > > >
> > > > > > > * verified GPG signature is valid and SHA512 digest
> > > > > > > * verified all checksums
> > > > > > > * verified all tests
> > > > > > > * verified checkstyle
> > > > > > > * verified Knox properties present in default nifi.properties
> > > > > > > * verified normal flow
> > > > > > > * verified ListenHTTP and HandleHTTPRequest only accept restricted
> > > > > SSLCS
> > > > > > > * verified bad authorizers.xml (copied from 1.2.0 -- missing
> > > > > > > managedAuthorizer) causes startup fail
> > > > > > > * verified good authorizers.xml works
> > > > > > > * verified secure instance works with client cert auth
> > > > > > > * verified secure instance works with Knox SSO
> > > > > > > * verified encrypted flow value migration works without Jasypt
> > > > > > >
> > > > > > > Andy LoPresto
> > > > > > > alopre...@apache.org
> > > > > > > alopresto.apa...@gmail.com
> > > > > > > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D
> > EF69
> > > > > > >
> > > > > > > On Sep 29, 2017, at 1:54 PM, Andrew Lim <
> > andrewlim.apa...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > +1 (non-binding)
> > > > > > >
> > > > > > > -Ran full clean install on OS X (10.11.4)
> > > > > > > -Tested UI changes including Variable Registry UI
> > > > > > > -Tested flows using Record 

RE: [EXT] ConvertCSVToAvro vs CSVReader - Value Delimiter

2017-09-24 Thread Peter Wicks (pwicks)
Arun,

I'm also using Ctrl+A as a delimiter and had the same problem.  I haven't had 
time to write up a PR but it looked like a pretty easy fix to me too.

I can't merge the change if you submit it, but I'd be happy to review it.

--Peter

-Original Message-
From: Arun Manivannan [mailto:a...@arunma.com] 
Sent: Sunday, September 24, 2017 11:17 PM
To: Dev@nifi.apache.org
Subject: [EXT] ConvertCSVToAvro vs CSVReader - Value Delimiter

Hi,

The ConvertCSVToAvro processor has been having performance issues while 
processing files larger than a GB, and I was advised to use 
ConvertRecord, which leverages the RecordReader and Writer. I did some tests and 
they do perform well.

Strangely, the CSVReader doesn't accept a unicode character as the value 
delimiter - the Control A (\u0001) character is the delimiter of my CSV.

Did some analysis, and I see that a minor change needs to be made in 
CSVUtils to unescape the delimiter, like ConvertCSVToAvro does, and also to 
modify the SingleCharacterValidator.
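
The unescaping step being proposed can be sketched as follows. ConvertCSVToAvro achieves this with a library call (commons-lang's StringEscapeUtils.unescapeJava); this self-contained stand-in handles only the `\\uXXXX` form, purely to illustrate the idea of turning the literal text the user types into the actual control character:

```java
public class DelimiterUnescape {
    // Turn occurrences of the six literal characters \\uXXXX into the
    // single character they encode; everything else passes through.
    static String unescapeUnicode(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            if (s.startsWith("\\u", i) && i + 6 <= s.length()) {
                out.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(s.charAt(i++));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "\\u0001"; // the six characters the user types into the property
        String delim = unescapeUnicode(raw);
        System.out.println(delim.length());        // 1
        System.out.println((int) delim.charAt(0)); // 1 (Ctrl+A)
    }
}
```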

Please let me know if you believe this isn't an issue and there's a workaround 
for this. Else, I am more than happy to raise an issue and submit a PR for 
review.

Best Regards,
Arun


Lifecycle of a Processor instance

2017-09-18 Thread Peter Wicks (pwicks)
There have been a lot of internal discussions at work about the actual life 
cycle of the class instance instantiated by NiFi, and the variables scoped at 
the class level. When do Processors "reset"? Are new instances created for each 
run, or are instances recycled? What about concurrent threads and thread safety?

I'm sure several developers here on the list could have easily answered these 
questions :), but I decided to do some research on my own. I built a test 
processor that increments either a non-thread-safe or a thread-safe private 
integer, based upon the property choice you make in the processor. Just to 
share and discuss, below are my tests and results. The value is stored only in 
the private variable; no state management is used.


-  Test 1: 1 Concurrent Thread, Non-Thread Safe

o   The purpose of this test is to find out what happens to a Processor's state 
between executions on FlowFiles

o   After 10,000 files the value was 10,000 on the last file. This means that 
state is maintained in a processor between runs (this was what I assumed, but a 
good place to start). Each execution of the processor used the same instance of 
the class.

-  Test 2: Stop the processor, then start it again and run a single file

o   The purpose of this test is to figure out when a processor "resets" its 
state.

o   After 1 file the value was 10,001.

o   This means that stopping and starting a processor does not reset the 
processor. The same class instance is still used.

-  Test 3: Stop, Disable, Enable, and then start again.

o   The purpose of this test is to see if disabling a processor causes the 
class instance to be disposed of.

o   After 1 file the value was 10,002.

o   This means that disabling and re-enabling a processor does not reset its 
state. The same class instance persists.

-  Test 4: Starting with a new copy of the test processor, run 10 
concurrent threads, non-thread safe

o   The purpose of this test is to see if each thread uses its own instance of 
the class, or if a shared instance of the class is used.

o   After 10,000 files, the value was 9,975 on the last file (I didn't run this 
test more than once, but the value should fluctuate from run to run due to 
thread contention).

o   I saw at least 8 concurrent threads running at one point. Combined with 
this, and the resulting value, I'm fairly confident that the same class 
instance is used for all concurrent threads.

-  Test 5: Starting with a new copy of the test processor, run 1 
Concurrent Thread, Thread-Safe

o   The purpose of this test is to find out what happens to a Processor's state 
between executions on FlowFiles

o   After 10,000 files the value was 10,000 on the last file

o   This means that state is maintained in a processor between runs (this 
matches the non-thread safe results, which makes sense for 1 concurrent thread).

-  Test 6: Starting with a new copy of the test processor, run 10 
concurrent threads, Thread-Safe

o   The purpose of this test is to contrast with the non-thread safe approach, 
and verify that a thread-safe object will work across concurrent threads.

o   After 10,000 files the value was 10,000 on the last file

o   This means that thread synchronization works with multiple concurrent 
threads for a single class instance.

These tests ran on a single NiFi instance, with no clustering and are not 
designed to say anything about clustering.

Based upon my limited test results, a Processor class is never re-instantiated 
unless the Processor is deleted from the flow (... yea, kind of like cheating) 
or NiFi restarts. There are of course other tests that could be run; I welcome 
any feedback!
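
Tests 4 through 6 above can be reproduced outside NiFi with a toy harness (illustrative only, not NiFi code): because all concurrent threads share one instance, a plain int counter can lose increments under contention, while an AtomicInteger always lands on the exact count.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedInstanceCounters {
    static int plain;                                   // not thread-safe
    static final AtomicInteger atomic = new AtomicInteger();

    // Run `tasks` increments across `threads` worker threads against the
    // same shared counters, mirroring one processor instance shared by
    // all concurrent tasks. Returns the thread-safe count.
    static int runConcurrently(int tasks, int threads) throws InterruptedException {
        plain = 0;
        atomic.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> { plain++; atomic.incrementAndGet(); });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return atomic.get();
    }

    public static void main(String[] args) throws InterruptedException {
        runConcurrently(10_000, 10);
        System.out.println("atomic = " + atomic.get()); // always 10000
        System.out.println("plain  = " + plain);        // may be < 10000 under contention
    }
}
```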

Thanks,
  Peter



RE: [EXT] Re: NiFi Processors show 30 Second Execution time, 0 executions

2017-05-30 Thread Peter Wicks (pwicks)
Thanks Matt.

Interestingly enough I'm at +09:00, and not a half hour interval, but hopefully 
this will fix it regardless.

-Original Message-
From: Matt Gilman [mailto:matt.c.gil...@gmail.com] 
Sent: Tuesday, May 30, 2017 9:05 PM
To: dev@nifi.apache.org
Subject: [EXT] Re: NiFi Processors show 30 Second Execution time, 0 executions

I do not believe this issue has been addressed yet. There is an open JIRA [1].

Matt

[1] https://issues.apache.org/jira/browse/NIFI-3719

On Sun, May 28, 2017 at 10:16 PM, Joe Witt <joe.w...@gmail.com> wrote:

> Peter
>
> Probably best to go ahead and file a JIRA.  In it you can reliably 
> post the attachments. There was a potentially related timezone 
> handling issue as I recall in this past release so perhaps there is 
> some relationship.
>
> Thanks
> Joe
>
> On Sun, May 28, 2017 at 10:04 PM, Peter Wicks (pwicks) 
> <pwi...@micron.com> wrote:
> > I wanted to re-open this discussion, it's been a while and I'm still
> seeing
> > the issue even with the latest version. I'm still seeing this issue
> running
> > a stock NiFi v1.2.0. By stock I mean no custom NAR’s, etc… just 
> > original vanilla code, in this case with no configuration, so 
> > running unsecured, empty canvas (except for my test case).
> >
> >
> >
> > I’ve expanded my test scenarios.
> >
> >
> >
> > Scenario 1 is Windows 7, code built using mvn, using Oracle Java.
> >
> > Java Version:
> >
> > java version "1.8.0_91"
> >
> > Java(TM) SE Runtime Environment (build 1.8.0_91-b15)
> >
> > Java HotSpot(TM) 64-Bit Server VM (build 25.91-b15, mixed mode)
> >
> >
> >
> > Scenario 2 is RHEL 7.3, the NiFi build is v1.2.0 downloaded from the 
> > NiFi website. Running OpenJDK.
> >
> >
> >
> > openjdk version "1.8.0_102"
> >
> > OpenJDK Runtime Environment (build 1.8.0_102-b14)
> >
> > OpenJDK 64-Bit Server VM (build 25.102-b14, mixed mode)
> >
> >
> >
> > I’ve attempted to attach a screenshot (my attachments seem to not 
> > make it very often on this list). In it I show the onscreen 
> > Tasks/Time for two
> > processors: one shows 1 / 00:30:04.292 and the other 0 / 00:30:00.000.
> >
> >
> >
> > Thanks!
> >
> >   Peter
> >
> >
> >
> > -Original Message-
> > From: Joseph Niemiec [mailto:josephx...@gmail.com]
> >
> > Sent: Friday, March 31, 2017 10:54 PM
> > To: dev@nifi.apache.org
> > Subject: Re: NiFi Processors show 30 Second Execution time, 0 
> > executions
> >
> >
> >
> > What version of Java are you running on ? Major_minor?
> >
> >
> >
> > On Fri, Mar 31, 2017 at 10:49 AM, Peter Wicks (pwicks) <
> pwi...@micron.com>
> >
> > wrote:
> >
> >
> >
> >> I misread my own screenshot, it says 30 minutes, not seconds. Also, 
> >> I
> >
> >> did a restart of NiFi and opened it up in a fresh instance of 
> >> Chrome;
> >
> >> no change. I kicked off a GenerateFlowFile processor and the
> >
> >> milliseconds are going up, but the 30 minutes is remaining the same...
> >
> >>
> >
> >>
> >
> >>
> >
> >> -Original Message-
> >
> >> From: Joseph Niemiec [mailto:josephx...@gmail.com]
> >
> >> Sent: Friday, March 31, 2017 10:41 PM
> >
> >> To: dev@nifi.apache.org
> >
> >> Subject: Re: NiFi Processors show 30 Second Execution time, 0
> >
> >> executions
> >
> >>
> >
> >> Doing a clean build of 091359b450a7d0fb6bb04e2238c9171728cd2720, I
> >
> >> will have to see if I have a windows 7 VM anywhere, I know Witt was
> >
> >> using Win10 and didnt see it... I find it odd that ALL your 
> >> processors
> >
> >> have a 30 second number not just the UpdateAttribute. Anything else
> >
> >> about your environment you can share that may be unique?
> >
> >>
> >
> >> On Fri, Mar 31, 2017 at 10:15 AM, Peter Wicks (pwicks)
> >
> >> <pwi...@micron.com>
> >
> >> wrote:
> >
> >>
> >
> >> > 091359b450a7d0fb6bb04e2238c9171728cd2720, so just one commit 
> >> > behind
> >
> >> > master.
> >
> >> > I am testing on Windows 7.
> >
> >> >
> >
> >> > Lee, yield isn't a bad idea, but UpdateAttribute in my screenshot
> >
> >> > has never run; 
