[GitHub] nifi pull request #2490: Added Pulsar processors and Controller Service

2018-02-23 Thread david-streamlio
Github user david-streamlio closed the pull request at:

https://github.com/apache/nifi/pull/2490


---


[GitHub] nifi issue #2490: Added Pulsar processors and Controller Service

2018-02-23 Thread david-streamlio
Github user david-streamlio commented on the issue:

https://github.com/apache/nifi/pull/2490
  
I neglected to add the JIRA ticket to the commit message: NIFI-4908


---


[GitHub] nifi pull request #2490: Added Pulsar processors and Controller Service

2018-02-23 Thread david-streamlio
GitHub user david-streamlio opened a pull request:

https://github.com/apache/nifi/pull/2490

Added Pulsar processors and Controller Service

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/david-streamlio/nifi NIFI-4908

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2490.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2490


commit e4f8550159132d85d0e19d61f24e087e51742dee
Author: David Kjerrumgaard 
Date:   2018-02-24T01:40:22Z

Added Pulsar processors and Controller Service




---


[jira] [Created] (NIFI-4908) Add Consumer and Producer Processors for Apache Pulsar

2018-02-23 Thread David Kjerrumgaard (JIRA)
David Kjerrumgaard created NIFI-4908:


 Summary: Add Consumer and Producer Processors for Apache Pulsar
 Key: NIFI-4908
 URL: https://issues.apache.org/jira/browse/NIFI-4908
 Project: Apache NiFi
  Issue Type: New Feature
  Components: Extensions
Affects Versions: 1.5.0
Reporter: David Kjerrumgaard
 Fix For: 1.5.0


Please add processors that allow NiFi to publish messages to Apache Pulsar 
topics and to subscribe to Apache Pulsar topics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4683) Add ability to execute Spark jobs via Livy

2018-02-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16375101#comment-16375101
 ] 

ASF GitHub Bot commented on NIFI-4683:
--

Github user mattyb149 commented on the issue:

https://github.com/apache/nifi/pull/2339
  
@jomach Does Livy support client-side Kerberos authentication? I thought 
the Livy Kerberos stuff was a config on the Livy server side to connect to a 
Kerberized Hadoop cluster?


> Add ability to execute Spark jobs via Livy
> --
>
> Key: NIFI-4683
> URL: https://issues.apache.org/jira/browse/NIFI-4683
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.5.0
>
>
> Proposal for a new feature to enable NiFi users to execute Spark jobs. A 
> natural entry point for this is to use Apache Livy, as it is a "REST service 
> for Apache Spark". This would allow NiFi to submit Spark jobs without needing 
> to bundle a Spark client itself (and to maintain Spark versions, for example).
> Some of the components that could be involved include:
> LivySessionController Controller Service (CS) - provides connections to 
> available sessions in Livy
> * Users could request a type of connection or to retrieve the same connection 
> back by session id if available.
> * Properties to configure Livy session such as number of executors, memory
> * Property for connection pool size
> * Will interact with Livy to ensure that only connections that are 
> idle/available are added to the pool and checked back in
> * Key for pool could be based on session id or type
> * Ensure to provide any user credentials
> * Leverages SSLContext for security
> LivyProcessor
> * Obtains Spark JARs/files via properties and/or flow file attribute(s)
> * Obtains connection information from LivySessionController
> * Provides attributes to configure session, maintain session id, attach to 
> session id
> * Potential advanced UI available for testing code (probably a follow-on Jira)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] nifi issue #2339: NIFI-4683: Add ability to execute Spark jobs and code via ...

2018-02-23 Thread mattyb149
Github user mattyb149 commented on the issue:

https://github.com/apache/nifi/pull/2339
  
@jomach Does Livy support client-side Kerberos authentication? I thought 
the Livy Kerberos stuff was a config on the Livy server side to connect to a 
Kerberized Hadoop cluster?


---


[jira] [Commented] (NIFI-4900) nifi Swagger definition - FlowApi ScheduleComponents returns empty component list

2018-02-23 Thread Daniel Chaffelson (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16375059#comment-16375059
 ] 

Daniel Chaffelson commented on NIFI-4900:
-

Thanks for the detailed explanation [~mcgilman]

This makes sense, in that the field is essentially for submission control only. 
It is not difficult to implement a wait loop for the components to achieve the 
submitted state or timeout, which is ultimately what I've done in my Python 
Client SDK.

I think, from your description, that this isn't a bug and also isn't really 
worth being a feature either, so unless someone else feels invested in the idea 
I'll close it.

> nifi Swagger definition - FlowApi ScheduleComponents returns empty component 
> list
> -
>
> Key: NIFI-4900
> URL: https://issues.apache.org/jira/browse/NIFI-4900
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: SDLC
>Affects Versions: 1.5.0
>Reporter: Daniel Chaffelson
>Priority: Minor
>
> When issuing a command to Schedule Components with the 'components' attribute 
> set, the command returns an empty set back. This seems incongruous. I would 
> expect to receive back an object listing out the states of each component 
> that the scheduling command was issued against so I can tell which components 
> were operated upon.
> This can be reproduced by creating two processors within a process group, and 
> issuing a schedule component command with a component attribute referencing 
> only one of the processors, which will then change scheduled state leaving 
> the other as-was. It evidences that the individual component request is being 
> honoured, but the return from the API is the same as if all components in the 
> process group had been scheduled.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4735) ParseEVTX only outputs one event per chunk

2018-02-23 Thread J Andrew Skene (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16375023#comment-16375023
 ] 

J Andrew Skene commented on NIFI-4735:
--

I hit this issue a few months ago and fixed it in a local fork. The NiFi code 
also skips the last chunk of any EVTX file. The above MR fixes both issues.

> ParseEVTX only outputs one event per chunk
> --
>
> Key: NIFI-4735
> URL: https://issues.apache.org/jira/browse/NIFI-4735
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.4.0
>Reporter: Terry Brugger
>Priority: Major
> Attachments: EVTX2XML.xml, Screen Shot 2018-01-03 at 15.06.24.png
>
>
> I have constructed a simple pipeline that reads a Windows EVTX binary file, 
> runs it through ParseEvtx, and writes out the result (template attached). As 
> a sample I fed it a 192MiB file and it only output 3.3MiB (see screenshot). 
> The output file contains 3071 events. Not coincidentally, I am sure, 
> 192MiB/64KiB = 3072, which would indicate that it only wrote out one event 
> from each chunk. If I configure the processor to output by the chunk or event 
> I get 3071 separate files with one event each. Unfortunately, I have no way 
> to sanitize binary EVTX so I cannot provide the actual file used.
> By way of comparison, I ran the same EVTX file through evtx_dump.py from the 
> python-evtx package (which I understand ParseEvtx was based on) and it 
> produced 395,757 events -- on par with what I would expect. It also took much 
> longer than NiFi -- like 30 minutes versus a few seconds -- which I also 
> expect is consistent with processing the entire file.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4735) ParseEVTX only outputs one event per chunk

2018-02-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16375015#comment-16375015
 ] 

ASF GitHub Bot commented on NIFI-4735:
--

GitHub user askene opened a pull request:

https://github.com/apache/nifi/pull/2489

NIFI-4735: ParseEVTX only outputs one event per chunk

Updated the EVTX FileHeader class to correctly check if there are more 
chunks in the file. Previously this would not process the last chunk.

Updated the EVTX ChunkHeader class to correctly check if there are 
additional records in the chunk. Previously this would only process the first 
record of each chunk. It was using the fileLastRecordNumber where it should 
have been using the logLastRecordNumber value.

Updated the EVTX unit tests to have the correct expected number of events 
and use the logLastRecordNumber.
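
A minimal Java sketch of the kind of wrong-counter bound described above (class 
and field names are hypothetical, not the actual NiFi EVTX classes):

{code:java}
// Hypothetical illustration: iterating chunk records against the wrong counter.
final class ChunkRecordIteratorSketch {
    private final long logLastRecordNumber;  // last record number within this chunk
    private final long fileLastRecordNumber; // last record number across the whole file
    private long currentRecordNumber;

    ChunkRecordIteratorSketch(long firstRecord, long logLast, long fileLast) {
        this.currentRecordNumber = firstRecord;
        this.logLastRecordNumber = logLast;
        this.fileLastRecordNumber = fileLast;
    }

    boolean hasNext() {
        // Buggy form -- bounding against the file-level counter ends iteration
        // too early whenever the two counters diverge:
        //   return currentRecordNumber < fileLastRecordNumber;
        return currentRecordNumber < logLastRecordNumber; // chunk-local bound
    }

    long next() {
        return currentRecordNumber++;
    }
}
{code}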

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/askene/nifi NIFI-4735

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2489.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2489


commit c04042095006760456e20f0c2e074ea996ae62a6
Author: Andrew Skene 
Date:   2018-02-23T21:58:53Z

Updated the EVTX FileHeader class to correctly check if there are more 
chunks in the file. Previously this would not process the last chunk.
Updated the EVTX ChunkHeader class to correctly check if there are 
additional records in the chunk. Previously this would only process the first 
record of each chunk. It was using the fileLastRecordNumber where it should 
have been using the logLastRecordNumber value.
Updated the EVTX unit tests to have the correct expected number of events 
and use the logLastRecordNumber.




> ParseEVTX only outputs one event per chunk
> --
>
> Key: NIFI-4735
> URL: https://issues.apache.org/jira/browse/NIFI-4735
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.4.0
>Reporter: Terry Brugger
>Priority: Major
> Attachments: EVTX2XML.xml, Screen Shot 2018-01-03 at 15.06.24.png
>
>
> I have constructed a simple pipeline that reads a Windows EVTX binary file, 
> runs it through ParseEvtx, and writes out the result (template attached). As 
> a sample I fed it a 192MiB file and it only output 3.3MiB (see screenshot). 
> The output file contains 3071 events. Not coincidentally, I am sure, 
> 192MiB/64KiB = 3072, which would indicate that it only wrote out one event 
> from each chunk. If I configure the processor to output by the chunk or event 
> I get 3071 separate files with one event each. Unfortunately, I have no way 
> to sanitize binary EVTX so I cannot provide the actual file used.
> By way of comparison, I ran the same EVTX file through evtx_dump.py from the 
> python-evtx package (which I understand ParseEvtx was 

[GitHub] nifi pull request #2489: NIFI-4735: ParseEVTX only outputs one event per chu...

2018-02-23 Thread askene
GitHub user askene opened a pull request:

https://github.com/apache/nifi/pull/2489

NIFI-4735: ParseEVTX only outputs one event per chunk

Updated the EVTX FileHeader class to correctly check if there are more 
chunks in the file. Previously this would not process the last chunk.

Updated the EVTX ChunkHeader class to correctly check if there are 
additional records in the chunk. Previously this would only process the first 
record of each chunk. It was using the fileLastRecordNumber where it should 
have been using the logLastRecordNumber value.

Updated the EVTX unit tests to have the correct expected number of events 
and use the logLastRecordNumber.

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/askene/nifi NIFI-4735

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2489.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2489


commit c04042095006760456e20f0c2e074ea996ae62a6
Author: Andrew Skene 
Date:   2018-02-23T21:58:53Z

Updated the EVTX FileHeader class to correctly check if there are more 
chunks in the file. Previously this would not process the last chunk.
Updated the EVTX ChunkHeader class to correctly check if there are 
additional records in the chunk. Previously this would only process the first 
record of each chunk. It was using the fileLastRecordNumber where it should 
have been using the logLastRecordNumber value.
Updated the EVTX unit tests to have the correct expected number of events 
and use the logLastRecordNumber.




---


[jira] [Commented] (NIFI-3599) Add nifi.properties value to globally set the default backpressure size threshold for each connection

2018-02-23 Thread Michael Moser (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374892#comment-16374892
 ] 

Michael Moser commented on NIFI-3599:
-

Thank you both for the feedback.  I agree showing the actual values in those 
fields is best, and I'll look into modifying the /nifi-api/flow/about endpoint 
response to provide them.

> Add nifi.properties value to globally set the default backpressure size 
> threshold for each connection
> -
>
> Key: NIFI-3599
> URL: https://issues.apache.org/jira/browse/NIFI-3599
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Jeremy Dyer
>Assignee: Jeremy Dyer
>Priority: Major
>
> By default each new connection added to the workflow canvas will have a 
> default backpressure size threshold of 10,000 objects. While the threshold 
> can be changed on a connection level it would be convenient to have a global 
> mechanism for setting that value to something other than 10,000. This 
> enhancement would add a property to nifi.properties that would allow for this 
> threshold to be set globally unless otherwise overridden at the connection 
> level.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] nifi-minifi pull request #116: MINIFI-438 Refactor MiNiFi C2 Server

2018-02-23 Thread kevdoran
GitHub user kevdoran opened a pull request:

https://github.com/apache/nifi-minifi/pull/116

MINIFI-438 Refactor MiNiFi C2 Server

This begins refactoring MiNiFi C2 Server to lay the groundwork
for future commits that will add new functionality, specifically
NiFi Registry integration as the preferred method of deploying flows
to agents.

The goal of this refactor is to reuse patterns and best practices
established by the NiFi Registry project, so that one day more code
can be extracted and shared across web services in the NiFi ecosystem
(for example, authn and authz), with the secondary benefit that
developers working on both projects will only have to learn one set
of frameworks and patterns.

Thank you for submitting a contribution to Apache NiFi - MiNiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with MINIFI-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi-minifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under minifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under minifi-assembly?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kevdoran/nifi-minifi minifi-c2-server

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi-minifi/pull/116.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #116


commit f14c0330dcb986686af57471ba539324257c43da
Author: Kevin Doran 
Date:   2018-02-23T18:06:40Z

MINIFI-438 Refactor MiNiFi C2 Server

This commit begins refactoring MiNiFi C2 Server to lay the groundwork
for future commits that will add new functionality, specifically
NiFi Registry integration as the preferred method of deploying flows
to agents.

The goal of this refactor is to reuse patterns and best practices
established by the NiFi Registry project, so that one day more code
can be extracted and shared across web services in the NiFi ecosystem
(for example, authn and authz), with the secondary benefit that
developers working on both projects will only have to learn one set
of frameworks and patterns.




---


[jira] [Commented] (NIFI-3599) Add nifi.properties value to globally set the default backpressure size threshold for each connection

2018-02-23 Thread Matt Gilman (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374828#comment-16374828
 ] 

Matt Gilman commented on NIFI-3599:
---

[~mosermw] We currently have an about endpoint that we use to relay server-side 
information to the front end (content viewer URI, version, timezone, etc). Some 
of these values come from nifi.properties. I would suggest adding to an 
endpoint like this so that the user knows what the actual default is and 
whether that's an appropriate value for the connection. Additionally, by still 
populating an actual value in the UI we are guaranteed that the same value will 
be applied across the cluster. If we allowed the user to specify a blank value 
or a 'default' alias then, if nodes had different defaults, we would have 
inconsistent flow configurations.
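
For illustration, a response from GET /nifi-api/flow/about extended with such 
defaults might look like the following sketch (the two backpressure fields are 
invented for this example, not an existing part of the API):

{code:java}
{
  "about" : {
    "title" : "NiFi",
    "version" : "1.5.0",
    "timezone" : "CET",
    "defaultBackPressureObjectThreshold" : 10000,
    "defaultBackPressureDataSizeThreshold" : "1 GB"
  }
}
{code}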

> Add nifi.properties value to globally set the default backpressure size 
> threshold for each connection
> -
>
> Key: NIFI-3599
> URL: https://issues.apache.org/jira/browse/NIFI-3599
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Jeremy Dyer
>Assignee: Jeremy Dyer
>Priority: Major
>
> By default each new connection added to the workflow canvas will have a 
> default backpressure size threshold of 10,000 objects. While the threshold 
> can be changed on a connection level it would be convenient to have a global 
> mechanism for setting that value to something other than 10,000. This 
> enhancement would add a property to nifi.properties that would allow for this 
> threshold to be set globally unless otherwise overridden at the connection 
> level.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-3599) Add nifi.properties value to globally set the default backpressure size threshold for each connection

2018-02-23 Thread Scott Aslan (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374824#comment-16374824
 ] 

Scott Aslan commented on NIFI-3599:
---

You are correct that the UI does not have access to nifi.properties, but I will 
let [~mcgilman] comment as to whether this approach makes sense on the server 
side. As for the UX, instead of leaving the field blank we should disable it 
and show a checked checkbox next to that field indicating that the 'default' 
values from nifi.properties will be used. Then, if the user wants to override 
those values, they would uncheck the checkbox and enter the new back pressure 
threshold values.

> Add nifi.properties value to globally set the default backpressure size 
> threshold for each connection
> -
>
> Key: NIFI-3599
> URL: https://issues.apache.org/jira/browse/NIFI-3599
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Jeremy Dyer
>Assignee: Jeremy Dyer
>Priority: Major
>
> By default each new connection added to the workflow canvas will have a 
> default backpressure size threshold of 10,000 objects. While the threshold 
> can be changed on a connection level it would be convenient to have a global 
> mechanism for setting that value to something other than 10,000. This 
> enhancement would add a property to nifi.properties that would allow for this 
> threshold to be set globally unless otherwise overridden at the connection 
> level.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (NIFI-4907) Provenance authorization refactoring

2018-02-23 Thread Mark Bean (JIRA)
Mark Bean created NIFI-4907:
---

 Summary: Provenance authorization refactoring
 Key: NIFI-4907
 URL: https://issues.apache.org/jira/browse/NIFI-4907
 Project: Apache NiFi
  Issue Type: Bug
  Components: Core Framework
Affects Versions: 1.5.0
Reporter: Mark Bean


Currently, the 'view the data' component policy is too tightly coupled with 
Provenance queries. The 'query provenance' policy should be the only policy 
required for viewing Provenance query results. Both 'view the component' and 
'view the data' policies should be used to refine the appropriate visibility of 
event details - but not the event itself.

1) Component Visibility
The authorization of Provenance events is inconsistent with the behavior of the 
graph. For example, if a user does not have 'view the component' policy, the 
graph shows this component as a "black box" (no details such as name, UUID, 
etc.) However, when querying Provenance, this component will show up including 
the Component Type and the Component Name. This is in effect a violation of the 
policy. These component details should be obscured in the displayed Provenance 
event if the user does not have the appropriate 'view the component' policy.

2) Data Visibility
For a Provenance query, all events should be visible as long as the user 
performing the query belongs to the 'query provenance' global policy. As 
mentioned above, some information about the component may be obscured depending 
on 'view the component' policy, but the event itself should be visible. 
Additionally, details of the event (clicking the View Details "i" icon) should 
only be accessible if the user belongs to the 'view the data' policy for the 
affected component. If the user is not in the appropriate 'view the data' 
policy, a popup warning should be displayed indicating the reason details are 
not visible with more specific detail than the current "Contact the system 
administrator".

3) Lineage Graphs
As with the Provenance table view recommendation above, the lineage graph 
should display all events. Currently, if the lineage graph includes an event 
belonging to a component for which the user does not have 'view the data', it is 
shown on the graph as "UNKNOWN". As with Data Visibility mentioned above, the 
graph should indicate the event type as long as the user is in the 'view the 
component' policy. Subsequent "View Details" on the event should only be visible 
if the user is in the 'view the data' policy.

In summary, for Provenance query results and lineage graphs, all events should 
be shown. Component Name and Component Type information should be conditionally 
visible depending on the corresponding 'view the component' component policy. 
Event details, including Provenance event type and FlowFile information, 
should be conditionally available depending on the corresponding 'view the 
data' component policy. Inability to display event details should provide 
feedback to the user indicating the reason.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-3599) Add nifi.properties value to globally set the default backpressure size threshold for each connection

2018-02-23 Thread Michael Moser (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374737#comment-16374737
 ] 

Michael Moser commented on NIFI-3599:
-

I have an approach to resolve this, but I would like to get [~mcgilman] and/or 
[~scottyaslan] to comment, because there are UI/UX implications.

It's fairly easy to move default back pressure Object and Data Size threshold 
settings from server-side code (StandardFlowFileQueue.java) to nifi.properties 
and make the back end use them.  However, the UI also has default back pressure 
set in the nf-connection-configuration.js code.  The UI does not seem to have 
access to nifi.properties in order to read settings from there.

When a new connection is drawn, I propose setting these two back pressure 
fields to 'default' in the UI, or leaving them empty.  If a user doesn't change 
them, the JS would send the server a null value in the JSON for these two 
fields.  The server would recognize this and use the nifi.properties default 
back pressure settings.  If a user makes changes to these fields, the JSON sent 
to the server would contain those changes.

I tested this approach and it works.  I'll be happy to submit a PR.  But is 
this an acceptable approach?  Thanks for feedback.
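
For reference, a sketch of what such nifi.properties entries might look like 
(the property names are illustrative only, not a settled API):

{code:java}
# Hypothetical global defaults applied to newly drawn connections:
nifi.queue.backpressure.count=10000
nifi.queue.backpressure.size=1 GB
{code}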

> Add nifi.properties value to globally set the default backpressure size 
> threshold for each connection
> -
>
> Key: NIFI-3599
> URL: https://issues.apache.org/jira/browse/NIFI-3599
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Jeremy Dyer
>Assignee: Jeremy Dyer
>Priority: Major
>
> By default each new connection added to the workflow canvas will have a 
> default backpressure size threshold of 10,000 objects. While the threshold 
> can be changed on a connection level it would be convenient to have a global 
> mechanism for setting that value to something other than 10,000. This 
> enhancement would add a property to nifi.properties that would allow for this 
> threshold to be set globally unless otherwise overridden at the connection 
> level.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4900) nifi Swagger definition - FlowApi ScheduleComponents returns empty component list

2018-02-23 Thread Matt Gilman (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374671#comment-16374671
 ] 

Matt Gilman commented on NIFI-4900:
---

The components field is used to indicate which components should be scheduled. 
The reason this exists is that only the components the current user has access 
to are eligible to be scheduled. Since the Authorizer is an extension point, 
there is no guarantee that each node would schedule the same set of components. 
To prevent this potential dataflow inconsistency, the node that receives the 
request will determine the components that it thinks are eligible for 
scheduling. This set of components is then replicated across the cluster using 
a two-phase commit to ensure all nodes attempt to operate on the same 
components. Because this field is meant for input to the endpoint, it only 
contains the component identifier and its current revision.

If this Jira is meant to include some output, please use a separate field and 
make it optionally populated. Some clients do not care about those details, 
which could be quite large, since the number of scheduled components is 
technically unbounded.
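
For reference, a sketch of a request body for scheduling a subset of components 
(the shape follows the description above: per-component identifier plus its 
current revision; all IDs here are made up):

{code:java}
PUT /nifi-api/flow/process-groups/3f8c0e2a-0164-1000-0000-000000000000
{
  "id" : "3f8c0e2a-0164-1000-0000-000000000000",
  "state" : "RUNNING",
  "components" : {
    "5a1b7d90-0164-1000-0000-000000000000" : {
      "clientId" : "my-client",
      "version" : 3
    }
  }
}
{code}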

> nifi Swagger definition - FlowApi ScheduleComponents returns empty component 
> list
> -
>
> Key: NIFI-4900
> URL: https://issues.apache.org/jira/browse/NIFI-4900
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: SDLC
>Affects Versions: 1.5.0
>Reporter: Daniel Chaffelson
>Priority: Minor
>
> When issuing a command to Schedule Components with the 'components' attribute 
> set, the command returns an empty set back. This seems incongruous. I would 
> expect to receive back an object listing out the states of each component 
> that the scheduling command was issued against so I can tell which components 
> were operated upon.
> This can be reproduced by creating two processors within a process group, and 
> issuing a schedule component command with a component attribute referencing 
> only one of the processors, which will then change scheduled state leaving 
> the other as-was. It evidences that the individual component request is being 
> honoured, but the return from the API is the same as if all components in the 
> process group had been scheduled.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-4905) If cluster response time is slow - there isn't enough information to help the operator know why

2018-02-23 Thread Joseph Witt (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Witt updated NIFI-4905:
--
Summary: If cluster response time is slow - there isn't enough information 
to help the operator know why  (was: Cluster response time is slow)

> If cluster response time is slow - there isn't enough information to help the 
> operator know why
> ---
>
> Key: NIFI-4905
> URL: https://issues.apache.org/jira/browse/NIFI-4905
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core UI
>Affects Versions: 1.5.0
> Environment: Centos 7.3
> NiFi 1.5.0 Cluster with 8 physical nodes
>Reporter: Josef Zahner
>Priority: Major
> Attachments: cpu_load_nifi.PNG, root_canvas.png
>
>
> We are currently working on a PoC with 8 nodes (HP BL460c Blades, 24 cores, 
> 44 GB RAM) in a NiFi 1.5.0 cluster. Our configuration has about 160 processors 
> and all of them are stopped. Even in the stopped state, we are constantly 
> getting the messages below for all nodes, not only for the primary node.
> {code:java}
> Response time from nifi2-07.xyz.ch:8080 was slow for each of the last 3 
> requests made. To see more information about timing, enable DEBUG logging for 
> org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator
> {code}
> If you are on the root canvas, you feel that it takes a few seconds until it 
> responds after a refresh. We have already tuned the parameters below, but 
> without any luck. After a restart of NiFi it is fine for a few minutes, but 
> then the messages return.
> {code:java}
> nifi.cluster.protocol.heartbeat.interval=15 sec
> nifi.cluster.node.protocol.threads=40
> nifi.cluster.node.protocol.max.threads=80
> nifi.cluster.node.connection.timeout=60 sec
> nifi.cluster.node.read.timeout=60 sec
> {code}
> The nodes have absolutely no load besides NiFi. What surprised me was that 
> when the UI refreshes (default every 30s) it produces about 20% CPU load on my 
> machine. And remember, it's a 24x2.9GHz blade server.
> !cpu_load_nifi.PNG!
> That's a picture of my root canvas:
> !root_canvas.png!
> Currently we can't work with the cluster under these circumstances because the 
> GUI always gets so slow.
> Is this a bug or normal behavior? Do we have too many elements on the GUI, or 
> what could cause this issue?
> Cheers
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4905) Cluster response time is slow

2018-02-23 Thread Josef Zahner (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374567#comment-16374567
 ] 

Josef Zahner commented on NIFI-4905:


[~joewitt] you were right! The culprits were at least 2 custom processors. As 
soon as I delete them from the configuration, the CPU load goes down to more or 
less 0%. Will verify with the developer why that happens. However, it was very 
painful to find out which processor caused the issue. So some sort of 
diagnostic log would be very helpful!

> Cluster response time is slow
> -
>
> Key: NIFI-4905
> URL: https://issues.apache.org/jira/browse/NIFI-4905
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core UI
>Affects Versions: 1.5.0
> Environment: Centos 7.3
> NiFi 1.5.0 Cluster with 8 physical nodes
>Reporter: Josef Zahner
>Priority: Major
> Attachments: cpu_load_nifi.PNG, root_canvas.png
>
>
> We are currently working on a PoC with 8 nodes (HP BL460c Blades, 24 cores, 
> 44 GB RAM) in a NiFi 1.5.0 cluster. Our configuration has about 160 processors 
> and all of them are stopped. Even in the stopped state, we are constantly 
> getting the messages below for all nodes, not only for the primary node.
> {code:java}
> Response time from nifi2-07.xyz.ch:8080 was slow for each of the last 3 
> requests made. To see more information about timing, enable DEBUG logging for 
> org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator
> {code}
> If you are on the root canvas, you feel that it takes a few seconds until it 
> responds after a refresh. We have already tuned the parameters below, but 
> without any luck. After a restart of NiFi it is fine for a few minutes, but 
> then the messages return.
> {code:java}
> nifi.cluster.protocol.heartbeat.interval=15 sec
> nifi.cluster.node.protocol.threads=40
> nifi.cluster.node.protocol.max.threads=80
> nifi.cluster.node.connection.timeout=60 sec
> nifi.cluster.node.read.timeout=60 sec
> {code}
> The nodes have absolutely no load besides NiFi. What surprised me was that 
> when the UI refreshes (default every 30s) it produces about 20% CPU load on my 
> machine. And remember, it's a 24x2.9GHz blade server.
> !cpu_load_nifi.PNG!
> That's a picture of my root canvas:
> !root_canvas.png!
> Currently we can't work with the cluster under these circumstances because the 
> GUI always gets so slow.
> Is this a bug or normal behavior? Do we have too many elements on the GUI, or 
> what could cause this issue?
> Cheers
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (NIFI-4906) Add GetHdfsFileInfo Processor

2018-02-23 Thread Ed Berezitsky (JIRA)
Ed Berezitsky created NIFI-4906:
---

 Summary: Add GetHdfsFileInfo Processor
 Key: NIFI-4906
 URL: https://issues.apache.org/jira/browse/NIFI-4906
 Project: Apache NiFi
  Issue Type: New Feature
  Components: Extensions
Reporter: Ed Berezitsky
Assignee: Ed Berezitsky


Add *GetHdfsFileInfo* Processor to be able to get stats from a file system.

This processor should support recursive scans, getting information on 
directories and files.

_File-level info required_: name, path, length, modified timestamp, last access 
timestamp, owner, group, permissions.

_Directory-level info required_: name, path, sum of lengths of files under a 
dir, count of files under a dir, modified timestamp, last access timestamp, 
owner, group, permissions.

 

The result should be returned either:
 * in a single flow file (the content containing a JSON line per file/dir info); or
 * as one flow file per file/dir info (in the content as a JSON object, or in a 
set of attributes, by choice).
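
For example, a hypothetical JSON line for a single file (field names are 
illustrative, derived from the required info listed above):

{code:java}
{
  "objectType" : "file",
  "name" : "part-00000.avro",
  "path" : "/data/landing/2018-02-23",
  "length" : 104857600,
  "lastModified" : 1519430422000,
  "lastAccess" : 1519430460000,
  "owner" : "nifi",
  "group" : "hdfs",
  "permissions" : "rw-r--r--"
}
{code}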



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4616) ConsumeKafka and ConsumeKafka_0_10 can block indefinitely if unable to communicate with Kafka broker that is SSL enabled

2018-02-23 Thread Stephen Barry (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374485#comment-16374485
 ] 

Stephen Barry commented on NIFI-4616:
-

After conducting some further testing, I found that this is actually a config 
issue on the Kafka broker side. Some background: I have a two-node SSL-secured 
cluster and the replication factor of __consumer_offsets was the default 1. 
When one of my brokers died, there wasn't another replica for its partitions, 
which seemed to cause this hang and caused ConsumeKafka to stop consuming 
messages.

I then increased the "offsets.topic.replication.factor" in server.properties 
and reassigned the existing __consumer_offsets partitions as per 
[https://stackoverflow.com/questions/46289511/kafka-reassignment-of-consumer-offsets-incorrect].
On re-running my tests, the issue no longer reproduces: NiFi ConsumeKafka 
consumes messages from the topic even when one of my two brokers is down. 
Tested on ConsumeKafka_1_0 and ConsumeKafka_0_11.
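
For reference, the broker-side setting involved is a standard Kafka 
server.properties entry; a sketch for a two-broker cluster like the one 
described above:

{code:java}
# server.properties on each broker: replicate the internal __consumer_offsets
# topic so that losing a single broker does not strand consumer offsets.
# This must be set before the topic is first created, or fixed afterwards
# via partition reassignment as described above.
offsets.topic.replication.factor=2
{code}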

> ConsumeKafka and ConsumeKafka_0_10 can block indefinitely if unable to 
> communicate with Kafka broker that is SSL enabled
> 
>
> Key: NIFI-4616
> URL: https://issues.apache.org/jira/browse/NIFI-4616
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.4.0
>Reporter: Aldrin Piri
>Priority: Major
>
> If I use ConsumeKafka and point to a broker that is in a bad state, I see 
> ConsumeKafka block indefinitely.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] nifi issue #2101: NIFI-4289 - InfluxDB put processor

2018-02-23 Thread mans2singh
Github user mans2singh commented on the issue:

https://github.com/apache/nifi/pull/2101
  
@pvillard31 @mattyb149 @MikeThomsen 

I've added expression language support for username and password.

Please let me know if there is any other recommendation.

Thanks

Mans


---


[jira] [Commented] (NIFI-4289) Implement put processor for InfluxDB

2018-02-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374463#comment-16374463
 ] 

ASF GitHub Bot commented on NIFI-4289:
--

Github user mans2singh commented on the issue:

https://github.com/apache/nifi/pull/2101
  
@pvillard31 @mattyb149 @MikeThomsen 

I've added expression language support for username and password.

Please let me know if there is any other recommendation.

Thanks

Mans


> Implement put processor for InfluxDB
> 
>
> Key: NIFI-4289
> URL: https://issues.apache.org/jira/browse/NIFI-4289
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Affects Versions: 1.3.0
> Environment: All
>Reporter: Mans Singh
>Assignee: Mans Singh
>Priority: Minor
>  Labels: insert, measurements,, put, timeseries
> Fix For: 1.6.0
>
>
> Support inserting time series measurements into InfluxDB.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4905) Cluster response time is slow

2018-02-23 Thread Joseph Witt (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374440#comment-16374440
 ] 

Joseph Witt commented on NIFI-4905:
---

[~jzahner] ok - i strongly suspect there are custom processor/validation 
methods that are causing this slow response time.  While we should definitely 
work to have such things be entirely asynchronous, that isn't really feasible 
'right now'.  It is important to validate the processors before they show up in 
a shared context in large usage.

Now, I could be wrong and it might be something more 
fundamental/frameworky/standard-processor-ish.  But I strongly recommend you 
check the custom code modules' validate methods first.  And you could easily 
add log statements in their validate methods to track time.
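
A self-contained sketch of such a timing wrapper (names are illustrative; 
inside a processor you would log via the ComponentLog rather than stdout):

{code:java}
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class ValidateTimer {
    // Runs the given body, prints how long it took, and returns its result.
    static <T> T timed(final String label, final Supplier<T> body) {
        final long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            final long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            System.out.println(label + " took " + elapsedMs + " ms");
        }
    }
}
{code}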

 

That said, we should do that in the framework logic someplace (log validate 
call durations).  [~markap14] do you think this would fit within the 
processor diagnostics stuff you had the PR for recently?

> Cluster response time is slow
> -
>
> Key: NIFI-4905
> URL: https://issues.apache.org/jira/browse/NIFI-4905
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core UI
>Affects Versions: 1.5.0
> Environment: Centos 7.3
> NiFi 1.5.0 Cluster with 8 physical nodes
>Reporter: Josef Zahner
>Priority: Major
> Attachments: cpu_load_nifi.PNG, root_canvas.png
>
>
> We are currently working on a PoC with 8 nodes (HP BL460c Blades, 24 cores, 
> 44 GB RAM) in a NiFi 1.5.0 cluster. Our configuration has about 160 processors 
> and all of them are stopped. Even in the stopped state, we are constantly 
> getting the messages below for all nodes, not only for the primary node.
> {code:java}
> Response time from nifi2-07.xyz.ch:8080 was slow for each of the last 3 
> requests made. To see more information about timing, enable DEBUG logging for 
> org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator
> {code}
> If you are on the root canvas, you feel that it takes a few seconds until it 
> responds after a refresh. We have already tuned the parameters below, but 
> without any luck. After a restart of NiFi it is fine for a few minutes, but 
> then the messages return.
> {code:java}
> nifi.cluster.protocol.heartbeat.interval=15 sec
> nifi.cluster.node.protocol.threads=40
> nifi.cluster.node.protocol.max.threads=80
> nifi.cluster.node.connection.timeout=60 sec
> nifi.cluster.node.read.timeout=60 sec
> {code}
> The nodes have absolutely no load besides NiFi. What surprised me was that 
> when the UI refreshes (default every 30s) it produces about 20% CPU load on my 
> machine. And remember, it's a 24x2.9GHz blade server.
> !cpu_load_nifi.PNG!
> That's a picture of my root canvas:
> !root_canvas.png!
> Currently we can't work with the cluster under these circumstances because the 
> GUI always gets so slow.
> Is this a bug or normal behavior? Do we have too many elements on the GUI, or 
> what could cause this issue?
> Cheers
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-4616) ConsumeKafka and ConsumeKafka_0_10 can block indefinitely if unable to communicate with Kafka broker that is SSL enabled

2018-02-23 Thread Aldrin Piri (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aldrin Piri updated NIFI-4616:
--
Fix Version/s: (was: 1.1.0)

> ConsumeKafka and ConsumeKafka_0_10 can block indefinitely if unable to 
> communicate with Kafka broker that is SSL enabled
> 
>
> Key: NIFI-4616
> URL: https://issues.apache.org/jira/browse/NIFI-4616
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.4.0
>Reporter: Aldrin Piri
>Priority: Major
>
> If I use ConsumeKafka and point to a broker that is in a bad state, I see 
> ConsumeKafka block indefinitely.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (NIFI-4896) Add option to UI for terminating a Processor when stopped but still has threads

2018-02-23 Thread Matt Gilman (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Gilman reassigned NIFI-4896:
-

Assignee: Matt Gilman

> Add option to UI for terminating a Processor when stopped but still has 
> threads
> ---
>
> Key: NIFI-4896
> URL: https://issues.apache.org/jira/browse/NIFI-4896
> Project: Apache NiFi
>  Issue Type: Sub-task
>  Components: Core UI
>Reporter: Mark Payne
>Assignee: Matt Gilman
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4905) Cluster response time is slow

2018-02-23 Thread Josef Zahner (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374391#comment-16374391
 ] 

Josef Zahner commented on NIFI-4905:


Ok, so to find out which processor causes the high load I need to remove the 
config step by step - hmpf, a lot of work :(.

Here what you requested:
{code:java}
[root@nifi2-08 ~]# grep -i "<class>" /tmp/flow.xml 
org.apache.nifi.processors.standard.RouteOnAttribute
org.apache.nifi.processors.standard.PutSFTP
org.apache.nifi.processors.attributes.UpdateAttribute
org.apache.nifi.processors.standard.FetchSFTP
org.apache.nifi.processors.standard.CompressContent
org.apache.nifi.processors.standard.UpdateRecord
org.apache.nifi.processors.standard.ListSFTP
com.xyz.nifi.processors.kudu.PutKuduNull
org.apache.nifi.avro.AvroRecordSetWriter
org.apache.nifi.record.script.ScriptedAvscReader2
org.apache.nifi.processors.standard.UpdateCounter
com.xyz.nifi.processors.kudu.PutKuduNull
org.apache.nifi.processors.standard.UpdateCounter
org.apache.nifi.processors.parquet.PutParquet
org.apache.nifi.processors.standard.MergeRecord
org.apache.nifi.processors.attributes.UpdateAttribute
org.apache.nifi.processors.standard.PutFile
org.apache.nifi.processors.attributes.UpdateAttribute
org.apache.nifi.processors.standard.UpdateCounter
com.xyz.nifi.processors.kudu.PutKuduNull
com.xyz.nifi.processors.kudu.PutKuduNull
org.apache.nifi.avro.AvroRecordSetWriter
org.apache.nifi.avro.AvroReader
org.apache.nifi.processors.standard.PutSFTP
org.apache.nifi.processors.standard.ListSFTP
org.apache.nifi.processors.standard.FetchSFTP
com.xyz.nnp.processors.cdrdecode.CDR2AvroProcessor
org.apache.nifi.processors.standard.CompressContent
org.apache.nifi.processors.attributes.UpdateAttribute
org.apache.nifi.processors.standard.ListSFTP
org.apache.nifi.processors.standard.FetchSFTP
com.xyz.nifi.processors.kudu.PutKuduNull
org.apache.nifi.processors.standard.ConvertRecord
org.apache.nifi.processors.standard.CompressContent
org.apache.nifi.processors.standard.PutSFTP
org.apache.nifi.processors.attributes.UpdateAttribute
org.apache.nifi.avro.AvroRecordSetWriter
org.apache.nifi.record.script.ScriptedAvscReader2
org.apache.nifi.processors.standard.ConvertRecord
org.apache.nifi.processors.standard.ListSFTP
org.apache.nifi.processors.standard.CompressContent
org.apache.nifi.processors.standard.PutSFTP
org.apache.nifi.processors.attributes.UpdateAttribute
org.apache.nifi.processors.standard.FetchSFTP
org.apache.nifi.avro.AvroRecordSetWriter
com.xyz.lidr.processors.smeradius.SmeRadiusReader
com.xyz.nifi.processors.kudu.PutKuduNull
org.apache.nifi.processors.standard.PutSFTP
org.apache.nifi.processors.standard.UpdateCounter
org.apache.nifi.processors.standard.UpdateCounter
org.apache.nifi.processors.parquet.PutParquet
org.apache.nifi.processors.attributes.UpdateAttribute
org.apache.nifi.processors.standard.PutSFTP
org.apache.nifi.processors.standard.GetFile
org.apache.nifi.processors.standard.MergeRecord
org.apache.nifi.avro.AvroReader
org.apache.nifi.avro.AvroRecordSetWriter
org.apache.nifi.processors.attributes.UpdateAttribute
org.apache.nifi.processors.standard.ListSFTP
org.apache.nifi.processors.standard.PutSFTP
org.apache.nifi.processors.standard.CompressContent
org.apache.nifi.processors.standard.FetchSFTP
org.apache.nifi.processors.standard.ConvertRecord
com.xyz.lidr.processors.smeradius.SmeRadiusReader
org.apache.nifi.avro.AvroRecordSetWriter
org.apache.nifi.processors.standard.ListSFTP
com.xyz.nnp.processors.cdrdecode.CDR2AvroProcessor
org.apache.nifi.processors.attributes.UpdateAttribute
org.apache.nifi.processors.standard.FetchSFTP
org.apache.nifi.processors.standard.CompressContent
org.apache.nifi.processors.standard.PutSFTP
org.apache.nifi.processors.standard.GenerateFlowFile
org.apache.nifi.processors.attributes.UpdateAttribute
org.apache.nifi.processors.standard.FetchSFTP
com.xyz.nifi.processors.kudu.PutKuduNull
org.apache.nifi.processors.standard.ListSFTP
org.apache.nifi.processors.standard.CompressContent
org.apache.nifi.processors.standard.ConvertRecord
org.apache.nifi.processors.standard.PutSFTP
org.apache.nifi.avro.AvroReader
org.apache.nifi.avro.AvroRecordSetWriter
org.apache.nifi.record.script.ScriptedAvscReader2
com.xyz.nnp.processors.cdrdecode.CDR2AvroProcessor
org.apache.nifi.processors.standard.ListSFTP
org.apache.nifi.processors.attributes.UpdateAttribute
com.xyz.nifi.processors.kudu.PutKuduNull
org.apache.nifi.processors.standard.FetchSFTP
org.apache.nifi.processors.standard.PutSFTP
org.apache.nifi.processors.standard.ListSFTP
org.apache.nifi.processors.parquet.PutParquet
org.apache.nifi.processors.standard.UpdateRecordValues
org.apache.nifi.processors.standard.PutSFTP
org.apache.nifi.processors.kite.InferAvroSchema
org.apache.nifi.processors.attributes.UpdateAttribute
com.xyz.nifi.processors.kudu.PutKuduNull
com.xyz.nnp.processors.cdrdecode.CDR2AvroProcessor
org.apache.nifi.processors.standard.CompressContent

[jira] [Commented] (NIFI-4905) Cluster response time is slow

2018-02-23 Thread Joseph Witt (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16374375#comment-16374375
 ] 

Joseph Witt commented on NIFI-4905:
---

[~jzahner] it's ok if you cannot share the flow.xml.

if you could just grep out the lines from the flow.xml.gz that have the 
processor classnames, that would probably be enough.

*however*, remove your custom processors from the flow for now and see if it 
still takes that long to validate the flow.  If it does, then we really need 
that list.  If it doesn't, then you know your custom processors have code in 
their validate methods that should not be there.

 

Generally speaking, make sure your validate methods do not do anything 
expensive at all.  Do expensive/time-consuming things in onTrigger calls via 
lazy-init, for example.
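
A minimal sketch of that pattern, assuming the standard NiFi processor API 
(the expensive client is a placeholder for whatever costly setup a custom 
processor does):

{code:java}
import java.util.concurrent.atomic.AtomicReference;

import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.exception.ProcessException;

public class LazyInitProcessorSketch extends AbstractProcessor {

    // Hypothetical expensive resource (e.g. a remote client), built lazily.
    private final AtomicReference<Object> client = new AtomicReference<>();

    @Override
    public void onTrigger(final ProcessContext context, final ProcessSession session)
            throws ProcessException {
        // Lazy-init: pay the expensive cost once, on the processing path,
        // rather than in validate/customValidate, which the framework calls
        // frequently while rendering and refreshing the flow.
        Object c = client.get();
        if (c == null) {
            final Object created = createExpensiveClient(context);
            c = client.compareAndSet(null, created) ? created : client.get();
        }
        // ... use c to process flow files ...
    }

    // Placeholder for costly setup (network connections, schema parsing, etc.).
    private Object createExpensiveClient(final ProcessContext context) {
        return new Object();
    }
}
{code}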

> Cluster response time is slow
> -
>
> Key: NIFI-4905
> URL: https://issues.apache.org/jira/browse/NIFI-4905
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core UI
>Affects Versions: 1.5.0
> Environment: Centos 7.3
> NiFi 1.5.0 Cluster with 8 physical nodes
>Reporter: Josef Zahner
>Priority: Major
> Attachments: cpu_load_nifi.PNG, root_canvas.png
>
>
> We are currently working on a PoC with 8 nodes (HP BL460c Blades, 24 cores, 
> 44 GB RAM) in a NiFi 1.5.0 cluster. Our configuration has about 160 processors 
> and all of them are stopped. Even in the stopped state, we are constantly 
> getting the messages below for all nodes, not only for the primary node.
> {code:java}
> Response time from nifi2-07.xyz.ch:8080 was slow for each of the last 3 
> requests made. To see more information about timing, enable DEBUG logging for 
> org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator
> {code}
> If you are on the root canvas, you feel that it takes a few seconds until it 
> responds after a refresh. We have already tuned the parameters below, but 
> without any luck. After a restart of NiFi it is fine for a few minutes, but 
> then the messages return.
> {code:java}
> nifi.cluster.protocol.heartbeat.interval=15 sec
> nifi.cluster.node.protocol.threads=40
> nifi.cluster.node.protocol.max.threads=80
> nifi.cluster.node.connection.timeout=60 sec
> nifi.cluster.node.read.timeout=60 sec
> {code}
> The nodes have absolutely no load besides NiFi. What surprised me was that 
> when the UI refreshes (default every 30s) it produces about 20% CPU load on my 
> machine. And remember, it's a 24x2.9GHz blade server.
> !cpu_load_nifi.PNG!
> That's a picture of my root canvas:
> !root_canvas.png!
> Currently we can't work with the cluster under these circumstances because the 
> GUI always gets so slow.
> Is this a bug or normal behavior? Do we have too many elements on the GUI, or 
> what could cause this issue?
> Cheers
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4905) Cluster response time is slow

2018-02-23 Thread Josef Zahner (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374365#comment-16374365
 ] 

Josef Zahner commented on NIFI-4905:


Puh, I have to check if I'm allowed to share the flow.xml.gz, but I guess it 
isn't possible. Will check that on Monday.

Yes, of course we have custom processors. Regarding the list of processors: is 
there an easy way to export them? NiFi Summary - General shows them, but 
copying doesn't work well.

> Cluster response time is slow
> -
>
> Key: NIFI-4905
> URL: https://issues.apache.org/jira/browse/NIFI-4905
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core UI
>Affects Versions: 1.5.0
> Environment: Centos 7.3
> NiFi 1.5.0 Cluster with 8 physical nodes
>Reporter: Josef Zahner
>Priority: Major
> Attachments: cpu_load_nifi.PNG, root_canvas.png
>
>
> we are currently working on a PoC with 8 nodes (HP BL460c Blades, 24 Cores, 
> 44GB RAM) in a NiFi 1.5.0 cluster. Our configuration has about 160 processors 
> and all of them are stopped. Even in the stopped state, we are constantly 
> getting the messages below for all nodes, not only for the primary node.
> {code:java}
> Response time from nifi2-07.xyz.ch:8080 was slow for each of the last 3 
> requests made. To see more information about timing, enable DEBUG logging for 
> org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator
> {code}
> If you are on the root canvas, you can feel that it takes a few seconds to 
> respond after a refresh. We have already tuned the parameters below, but 
> without any luck. After a restart of NiFi it is fine for a few minutes, but 
> then the messages return.
> {code:java}
> nifi.cluster.protocol.heartbeat.interval=15 sec
> nifi.cluster.node.protocol.threads=40
> nifi.cluster.node.protocol.max.threads=80
> nifi.cluster.node.connection.timeout=60 sec
> nifi.cluster.node.read.timeout=60 sec
> {code}
> The nodes have absolutely no load besides NiFi. What surprised me was that 
> when the UI refreshes (by default every 30s) it produces about 20% CPU load 
> on my machine. And remember, it's a 24x2.9GHz blade server.
> !cpu_load_nifi.PNG!
> That's a picture of my root canvas:
> !root_canvas.png!
> At the moment we can't work with the cluster under these circumstances 
> because the GUI always gets so slow.
> Is this a bug or normal behavior? Do we have too many elements on the GUI, 
> or what could cause this issue?
> Cheers
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (NIFI-4901) Json to Avro using Record framework does not support union types with boolean

2018-02-23 Thread Mark Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374364#comment-16374364
 ] 

Mark Payne commented on NIFI-4901:
--

[~gardellajuanpablo] I believe the behavior that you describe above in this 
Jira is the expected behavior. The schema that you provided indicates that 
there is a single field named "isSwap" and that its type is either boolean or 
null. So it would match something like:
{code:java}
{
  "isSwap": true
}{code}
In the sample JSON that you provided, the "isSwap" field is itself a record 
with a field named "boolean". So a matching schema would look like this:
{code:java}
{
  "type": "record",
  "name": "foo",
  "fields": [
    {
      "name": "isSwap",
      "type": {
        "type": "record",
        "name": "isSwapRecord",
        "fields": [
          {
            "name": "boolean",
            "type": [ "boolean", "null" ]
          }
        ]
      }
    }
  ]
}{code}
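
As an aside, the wrapped shape in the sample JSON is exactly what Avro's own 
JSON encoding produces for union values, which may be where it came from. A 
small sketch against the plain Avro library (not the NiFi record readers), 
assuming avro is on the classpath:
{code:java}
// Sketch: Avro's JSON encoding wraps non-null union values in a
// { "typeName": value } object, matching the sample JSON in this issue.
import java.io.ByteArrayOutputStream;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.io.JsonEncoder;

public class UnionJsonEncodingDemo {
    public static void main(final String[] args) throws Exception {
        final Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"foo\",\"fields\":["
          + "{\"name\":\"isSwap\",\"type\":[\"boolean\",\"null\"]}]}");

        final GenericData.Record record = new GenericData.Record(schema);
        record.put("isSwap", true);

        final ByteArrayOutputStream out = new ByteArrayOutputStream();
        final JsonEncoder encoder = EncoderFactory.get().jsonEncoder(schema, out);
        new GenericDatumWriter<GenericData.Record>(schema).write(record, encoder);
        encoder.flush();

        // Prints {"isSwap":{"boolean":true}}
        System.out.println(out.toString("UTF-8"));
    }
}
{code}
In other words, the sample JSON looks like Avro's JSON *encoding* of the 
schema, while the record readers expect plain JSON that matches the schema 
directly.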

> Json to Avro using Record framework does not support union types with boolean
> -
>
> Key: NIFI-4901
> URL: https://issues.apache.org/jira/browse/NIFI-4901
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.5.0
> Environment: ALL
>Reporter: Gardella Juan Pablo
>Priority: Major
> Attachments: optiona-boolean.zip
>
>
> Given the following valid Avro Schema:
> {code}
> {
>   "type": "record",
>   "name": "foo",
>   "fields": [
>     {
>       "name": "isSwap",
>       "type": [ "boolean", "null" ]
>     }
>   ]
> }
> {code}
> And the following JSON:
> {code}
> {
>   "isSwap": {
>     "boolean": true
>   }
> }
> {code}
> When it is converted to Avro using ConvertRecord, it fails with:
> {{org.apache.nifi.serialization.MalformedRecordException: Successfully parsed 
> a JSON object from input but failed to convert into a Record object with the 
> given schema}}
> Attached is a repository that reproduces the issue and also includes the fix:
> * Run {{mvn clean test}} to reproduce the issue.
> * Run {{mvn clean test -Ppatch}} to test the fix. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (NIFI-4905) Cluster response time is slow

2018-02-23 Thread Josef Zahner (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josef Zahner updated NIFI-4905:
---
Description: 
we are currently working on a PoC with 8 nodes (HP BL460c Blades, 24 Cores, 44GB 
RAM) in a NiFi 1.5.0 cluster. Our configuration has about 160 processors and 
all of them are stopped. Even in the stopped state, we are constantly getting 
the messages below for all nodes, not only for the primary node.
{code:java}
Response time from nifi2-07.xyz.ch:8080 was slow for each of the last 3 
requests made. To see more information about timing, enable DEBUG logging for 
org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator
{code}
If you are on the root canvas, you can feel that it takes a few seconds to 
respond after a refresh. We have already tuned the parameters below, but 
without any luck. After a restart of NiFi it is fine for a few minutes, but then 
the messages return.
{code:java}
nifi.cluster.protocol.heartbeat.interval=15 sec
nifi.cluster.node.protocol.threads=40
nifi.cluster.node.protocol.max.threads=80
nifi.cluster.node.connection.timeout=60 sec
nifi.cluster.node.read.timeout=60 sec
{code}
The nodes have absolutely no load besides NiFi. What surprised me was that 
when the UI refreshes (by default every 30s) it produces about 20% CPU load on 
my machine. And remember, it's a 24x2.9GHz blade server.

!cpu_load_nifi.PNG!

That's a picture of my root canvas:

!root_canvas.png!

At the moment we can't work with the cluster under these circumstances because 
the GUI always gets so slow.

Is this a bug or normal behavior? Do we have too many elements on the GUI, or 
what could cause this issue?

Cheers

 

  was:
we are actually working on a PoC with 8 nodes (HP BL460c Blades, 24 Cores, 44GB 
RAM) in a NiFi 1.5.0 cluster. Our configuration has about 160 processors and 
all of them are stopped. Even in the stopped state, we are constantly getting 
the messages below for all nodes, not only for the primary node.
{code:java}
Response time from nifi2-07.xyz.ch:8080 was slow for each of the last 3 
requests made. To see more information about timing, enable DEBUG logging for 
org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator
{code}
If you are on the root canvas, you feel that it takes a few seconds until it 
response after a refresh. We have already tuned the parameters below, but 
without any luck. After restart of NiFi it is fine for a few minutes, but then 
the messages return.
{code:java}
nifi.cluster.protocol.heartbeat.interval=15 sec
nifi.cluster.node.protocol.threads=40
nifi.cluster.node.protocol.max.threads=80
nifi.cluster.node.connection.timeout=60 sec
nifi.cluster.node.read.timeout=60 sec
{code}
The nodes have absolute no load beside of NiFi. What surprised me was, that 
when the UI refreshs (default every 30s) it produces about 20% cpu load on my 
machine. And remember, it's a 24x2.9GHz blade server.

!cpu_load_nifi.PNG!

That's a picture of my root canvas:

!root_canvas.png!

Actually we can't work with the cluster under this circumstances because the 
gui always gets so slow.

Is this a bug or normal behavior? Do we have to much elements on the GUI or 
what could cause this issue?

Cheers

 


> Cluster response time is slow
> -
>
> Key: NIFI-4905
> URL: https://issues.apache.org/jira/browse/NIFI-4905
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core UI
>Affects Versions: 1.5.0
> Environment: Centos 7.3
> NiFi 1.5.0 Cluster with 8 physical nodes
>Reporter: Josef Zahner
>Priority: Major
> Attachments: cpu_load_nifi.PNG, root_canvas.png
>
>
> we are currently working on a PoC with 8 nodes (HP BL460c Blades, 24 Cores, 
> 44GB RAM) in a NiFi 1.5.0 cluster. Our configuration has about 160 processors 
> and all of them are stopped. Even in the stopped state, we are constantly 
> getting the messages below for all nodes, not only for the primary node.
> {code:java}
> Response time from nifi2-07.xyz.ch:8080 was slow for each of the last 3 
> requests made. To see more information about timing, enable DEBUG logging for 
> org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator
> {code}
> If you are on the root canvas, you can feel that it takes a few seconds to 
> respond after a refresh. We have already tuned the parameters below, but 
> without any luck. After a restart of NiFi it is fine for a few minutes, but 
> then the messages return.
> {code:java}
> nifi.cluster.protocol.heartbeat.interval=15 sec
> nifi.cluster.node.protocol.threads=40
> nifi.cluster.node.protocol.max.threads=80
> nifi.cluster.node.connection.timeout=60 sec
> nifi.cluster.node.read.timeout=60 sec
> {code}
> The nodes have absolutely no load besides NiFi. What surprised me was that 
> when the UI refreshes 

[jira] [Commented] (NIFI-4905) Cluster response time is slow

2018-02-23 Thread Joseph Witt (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374355#comment-16374355
 ] 

Joseph Witt commented on NIFI-4905:
---

[~jzahner] that's not normal, and we definitely want to get to the bottom of it.

Can you share a flow.xml by any chance?  Given how well you cleaned up the 
screenshots, I'm betting not.  Do you have any custom processors in the flow?  
Can you send a list of all processor classes included in the flow?

 

If you do have custom processors/components, it is very possible that their 
validate methods are doing expensive things; see the hypothetical sketch below.
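
To make that concrete, here is a purely hypothetical sketch of the kind of 
validate method that causes this; the blocking connect below runs on every 
validation pass, on every node (the host name and class are made up):
{code:java}
// Anti-pattern sketch (entirely hypothetical): network I/O inside
// customValidate. Validation runs on every UI refresh, on every node, so a
// slow round-trip here makes the whole cluster feel sluggish.
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.apache.nifi.components.ValidationContext;
import org.apache.nifi.components.ValidationResult;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;

public class SlowValidateProcessor extends AbstractProcessor {

    @Override
    protected Collection<ValidationResult> customValidate(final ValidationContext context) {
        final List<ValidationResult> results = new ArrayList<>();
        // BAD: a blocking connect on every validation pass.
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress("some-service.example.com", 443), 5000);
        } catch (IOException e) {
            results.add(new ValidationResult.Builder()
                    .subject("Service reachability")
                    .valid(false)
                    .explanation("service not reachable: " + e.getMessage())
                    .build());
        }
        return results;
    }

    @Override
    public void onTrigger(final ProcessContext context, final ProcessSession session) {
        // no-op for this sketch
    }
}
{code}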

> Cluster response time is slow
> -
>
> Key: NIFI-4905
> URL: https://issues.apache.org/jira/browse/NIFI-4905
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core UI
>Affects Versions: 1.5.0
> Environment: Centos 7.3
> NiFi 1.5.0 Cluster with 8 physical nodes
>Reporter: Josef Zahner
>Priority: Major
> Attachments: cpu_load_nifi.PNG, root_canvas.png
>
>
> we are currently working on a PoC with 8 nodes (HP BL460c Blades, 24 Cores, 
> 44GB RAM) in a NiFi 1.5.0 cluster. Our configuration has about 160 processors 
> and all of them are stopped. Even in the stopped state, we are constantly 
> getting the messages below for all nodes, not only for the primary node.
> {code:java}
> Response time from nifi2-07.xyz.ch:8080 was slow for each of the last 3 
> requests made. To see more information about timing, enable DEBUG logging for 
> org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator
> {code}
> If you are on the root canvas, you can feel that it takes a few seconds to 
> respond after a refresh. We have already tuned the parameters below, but 
> without any luck. After a restart of NiFi it is fine for a few minutes, but 
> then the messages return.
> {code:java}
> nifi.cluster.protocol.heartbeat.interval=15 sec
> nifi.cluster.node.protocol.threads=40
> nifi.cluster.node.protocol.max.threads=80
> nifi.cluster.node.connection.timeout=60 sec
> nifi.cluster.node.read.timeout=60 sec
> {code}
> The nodes have absolutely no load besides NiFi. What surprised me was that 
> when the UI refreshes (by default every 30s) it produces about 20% CPU load 
> on my machine. And remember, it's a 24x2.9GHz blade server.
> !cpu_load_nifi.PNG!
> That's a picture of my root canvas:
> !root_canvas.png!
> At the moment we can't work with the cluster under these circumstances 
> because the GUI always gets so slow.
> Is this a bug or normal behavior? Do we have too many elements on the GUI, 
> or what could cause this issue?
> Cheers
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (NIFI-4905) Cluster response time is slow

2018-02-23 Thread Josef Zahner (JIRA)
Josef Zahner created NIFI-4905:
--

 Summary: Cluster response time is slow
 Key: NIFI-4905
 URL: https://issues.apache.org/jira/browse/NIFI-4905
 Project: Apache NiFi
  Issue Type: Bug
  Components: Core UI
Affects Versions: 1.5.0
 Environment: Centos 7.3
NiFi 1.5.0 Cluster with 8 physical nodes
Reporter: Josef Zahner
 Attachments: cpu_load_nifi.PNG, root_canvas.png

we are currently working on a PoC with 8 nodes (HP BL460c Blades, 24 Cores, 44GB 
RAM) in a NiFi 1.5.0 cluster. Our configuration has about 160 processors and 
all of them are stopped. Even in the stopped state, we are constantly getting 
the messages below for all nodes, not only for the primary node.
{code:java}
Response time from nifi2-07.xyz.ch:8080 was slow for each of the last 3 
requests made. To see more information about timing, enable DEBUG logging for 
org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator
{code}
If you are on the root canvas, you can feel that it takes a few seconds to 
respond after a refresh. We have already tuned the parameters below, but 
without any luck. After a restart of NiFi it is fine for a few minutes, but then 
the messages return.
{code:java}
nifi.cluster.protocol.heartbeat.interval=15 sec
nifi.cluster.node.protocol.threads=40
nifi.cluster.node.protocol.max.threads=80
nifi.cluster.node.connection.timeout=60 sec
nifi.cluster.node.read.timeout=60 sec
{code}
The nodes have absolutely no load besides NiFi. What surprised me was that 
when the UI refreshes (by default every 30s) it produces about 20% CPU load on 
my machine. And remember, it's a 24x2.9GHz blade server.

!cpu_load_nifi.PNG!

That's a picture of my root canvas:

!root_canvas.png!

At the moment we can't work with the cluster under these circumstances because 
the GUI always gets so slow.

Is this a bug or normal behavior? Do we have too many elements on the GUI, or 
what could cause this issue?

Cheers

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)