Re: Data flow rate exceeding provenance recording rate

2018-05-14 Thread Phil H
I may have spoken too soon. I was processing well through the 7-figure backlog when the system started slowing again. I upped the indexing thread count again (it was 2 initially, then 4, then finally 8) and the system became unusably slow, so I set it back to 4. The system is now operating

Re: Data flow rate exceeding provenance recording rate

2018-05-14 Thread Phil H
Thanks Mark, That has done the trick. The whole system seems to be performing better than I was used to, even before I started receiving those errors. Cheers, Phil On Tue, 15 May 2018 at 08:54, Mark Payne wrote: > Phil, > > This is just a side effect of how the old

Re: Proposal: standard record metadata attributes for data sources

2018-05-14 Thread Andy LoPresto
Maybe an ADDINFO event or FORK event could be used and a new flowfile with the relevant attributes/content could be created. The flowfiles would be linked, but the “sensitive” information wouldn’t travel with the original. Andy LoPresto alopre...@apache.org alopresto.apa...@gmail.com PGP

Re: Data flow rate exceeding provenance recording rate

2018-05-14 Thread Mark Payne
Phil, This is just a side effect of how the old provenance repository was designed. There is a new implementation that is far faster and seems to be more stable. However, in order to use it, you have to "opt in" simply because we wanted to make sure that it was stable enough to set it as the

Data flow rate exceeding provenance recording rate

2018-05-14 Thread Phil H
Hi gang, I have started receiving this error after perhaps 24 hours of run time. The first queue in our flow has a very large backlog by the time this error arrives. What is odd is that the incoming message rate is fairly constant at all times and while I am watching NiFi during the day, we never

Re: Proposal: standard record metadata attributes for data sources

2018-05-14 Thread Mike Thomsen
Does the provenance system have the ability to add user-defined key/value pairs to a flowfile's provenance record at a particular processor? On Mon, May 14, 2018 at 6:11 PM Andy LoPresto wrote: > I would actually propose that this is added to the provenance but not >

Re: Proposal: standard record metadata attributes for data sources

2018-05-14 Thread Andy LoPresto
I would actually propose that this is added to the provenance but not always put into the flowfile attributes. There are many scenarios in which the data retrieval should be separated from the analysis/follow-on, for visibility, responsibility, and security reasons. While I understand a

Re: User and Policies

2018-05-14 Thread Anil Rai
Thanks Bryan. It's just that the canvas looks very cluttered when we have a lot of process groups. Since I cannot do anything with the other process groups, I was wondering whether we could avoid showing them. On Mon, May 14, 2018 at 4:31 PM, Bryan Bende wrote: > There intentionally

Re: User and Policies

2018-05-14 Thread Bryan Bende
There intentionally isn't a way to hide components from the canvas. You can use the analogy of being in an apartment building and seeing all the doors... you can only see what's inside the one you have the key to, but you still know all the other locked doors are there. On Mon, May 14, 2018 at

Re: User and Policies

2018-05-14 Thread Anil Rai
On the same topic, we currently have many teams sharing our common development environment. We have created groups for each team and added users to the groups. Each team is given a process group, and this process group is assigned to that specific group, so that they can only work on their process

Re: User and Policies

2018-05-14 Thread Anil Rai
Thanks for the detailed explanation Bryan. Cheers Anil On Mon, May 14, 2018 at 3:01 PM, Bryan Bende wrote: > Hello, > > When a node joins the cluster, if the node has an empty flow.xml, no > users, and no authorizations, then the node will inherit all of those > from the

Re: User and Policies

2018-05-14 Thread Bryan Bende
Hello, When a node joins the cluster, if the node has an empty flow.xml, no users, and no authorizations, then the node will inherit all of those from the cluster, but if any of those are populated then it won't be able to join. One common issue that prevents this from working is if you have

User and Policies

2018-05-14 Thread Anil Rai
All, We noticed that we cannot add/modify users and policies when 1 node in a cluster is down. So it seems that all nodes must have the latest and identical users.xml and auth*.xml. Is this correct? Shouldn't the latest, up-to-date files be copied to the other nodes during startup instead? (like

Re: NiFi code re-use

2018-05-14 Thread Andrew Lim
Scott, Besides the documentation available in NiFi and in NiFi Registry [1], there are also videos available on the Registry web site [2] that might be helpful to you. Also, a Getting Started guide [3] has been written that didn’t make the 0.1.0 Registry release, but can be seen if you build

Re: Graph database support w/ NiFi

2018-05-14 Thread Otto Fowler
The wiki discussion should list these and other points of concern and should document the extent to which they are to be addressed. On May 12, 2018 at 12:37:59, u...@moosheimer.com (u...@moosheimer.com) wrote: Matt, You have some interesting ideas that I really like. GraphReaders and

Re: Graph database support w/ NiFi

2018-05-14 Thread Otto Fowler
+1 for the wiki page On May 12, 2018 at 10:52:43, Matt Burgess (mattyb...@apache.org) wrote: All, As Joe implied, I'm very happy that we are discussing graph tech in relation to NiFi! NiFi and Graph theory/tech/analytics are passions of mine. Mike, the examples you list are great, I would add

Re: NiFi code re-use

2018-05-14 Thread Ed B
Scott, No, versioned PGs aren't updated when the version in Registry is updated. But the NiFi UI will show you the PGs that aren't up to date, and it is easy to update them to the current version. As discussed in one of the emails in this thread, a feature could be implemented to have autoupdate