[jira] [Commented] (NIFI-5776) ListSFTP should log SSH_FXP_STATUS id value

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671137#comment-16671137
 ] 

ASF GitHub Bot commented on NIFI-5776:
--

GitHub user ijokarumawak opened a pull request:

https://github.com/apache/nifi/pull/3120

NIFI-5776: ListSFTP should log SSH_FXP_STATUS id value

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [x] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [x] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijokarumawak/nifi nifi-5776

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/3120.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3120


commit c1684bbbfe7f2bd0085af22d3f6eb23b89181570
Author: Koji Kawamura 
Date:   2018-11-01T05:41:06Z

NIFI-5776: ListSFTP should log SSH_FXP_STATUS id value




> ListSFTP should log SSH_FXP_STATUS id value
> ---
>
> Key: NIFI-5776
> URL: https://issues.apache.org/jira/browse/NIFI-5776
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.0.0
>Reporter: Koji Kawamura
>Assignee: Koji Kawamura
>Priority: Minor
>
> When an SSH_FXP_READDIR request is sent and an SSH_FXP_STATUS other than the 
> expected SSH_FX_EOF is returned, an SftpException is thrown. The current catch 
> block converts the exception into different exceptions based on the 
> SSH_FXP_STATUS id. But if the id is neither SSH_FX_NO_SUCH_FILE nor 
> SSH_FX_PERMISSION_DENIED, the returned id is not logged, which makes such 
> failures hard to investigate.
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/util/SFTPTransfer.java#L263



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] nifi pull request #3120: NIFI-5776: ListSFTP should log SSH_FXP_STATUS id va...

2018-10-31 Thread ijokarumawak
GitHub user ijokarumawak opened a pull request:

https://github.com/apache/nifi/pull/3120

NIFI-5776: ListSFTP should log SSH_FXP_STATUS id value




---


[jira] [Created] (NIFI-5776) ListSFTP should log SSH_FXP_STATUS id value

2018-10-31 Thread Koji Kawamura (JIRA)
Koji Kawamura created NIFI-5776:
---

 Summary: ListSFTP should log SSH_FXP_STATUS id value
 Key: NIFI-5776
 URL: https://issues.apache.org/jira/browse/NIFI-5776
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Affects Versions: 1.0.0
Reporter: Koji Kawamura
Assignee: Koji Kawamura


When an SSH_FXP_READDIR request is sent and an SSH_FXP_STATUS other than the 
expected SSH_FX_EOF is returned, an SftpException is thrown. The current catch 
block converts the exception into different exceptions based on the 
SSH_FXP_STATUS id. But if the id is neither SSH_FX_NO_SUCH_FILE nor 
SSH_FX_PERMISSION_DENIED, the returned id is not logged, which makes such 
failures hard to investigate.
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/util/SFTPTransfer.java#L263
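
As a rough illustration of the requested behavior, the sketch below maps a status id to an exception and keeps the id in the message for the otherwise unhandled case. It is a minimal sketch and not the SFTPTransfer code: `StatusException` and `translate` are hypothetical stand-ins for the client library's exception and the catch block, and the exception types chosen here are assumptions. Only the numeric status codes are taken from the SFTP protocol draft.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.AccessDeniedException;

class SftpStatusLoggingSketch {
    // Status codes as defined by the SFTP protocol draft.
    static final int SSH_FX_NO_SUCH_FILE = 2;
    static final int SSH_FX_PERMISSION_DENIED = 3;

    // Hypothetical stand-in for the SFTP client's exception, which carries the status id.
    static class StatusException extends Exception {
        final int id;

        StatusException(int id, String message) {
            super(message);
            this.id = id;
        }
    }

    // Converts a status failure into an IOException; the default branch reports the raw id.
    static IOException translate(String path, StatusException e) {
        switch (e.id) {
            case SSH_FX_NO_SUCH_FILE:
                return new FileNotFoundException("Could not list " + path + ": no such file on the remote server");
            case SSH_FX_PERMISSION_DENIED:
                return new AccessDeniedException(path);
            default:
                // Without the id in the message, this case is hard to investigate.
                return new IOException("Failed to list " + path + " (SSH_FXP_STATUS id=" + e.id + ")", e);
        }
    }

    public static void main(String[] args) {
        IOException ex = translate("/data/in", new StatusException(4, "Failure"));
        System.out.println(ex.getMessage());
    }
}
```

Keeping the original exception as the cause in the default branch preserves the server's message alongside the id.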





[jira] [Assigned] (NIFI-1505) LogAttribute Missing Information in Documentation

2018-10-31 Thread Kotaro Terada (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kotaro Terada reassigned NIFI-1505:
---

Assignee: Kotaro Terada

> LogAttribute Missing Information in Documentation
> -
>
> Key: NIFI-1505
> URL: https://issues.apache.org/jira/browse/NIFI-1505
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Documentation & Website
>Affects Versions: 0.4.1
> Environment: Mac OS/X - Web UI
>Reporter: Stephan Warren
>Assignee: Kotaro Terada
>Priority: Minor
>
> 1. Right-click on the LogAttribute processor to get usage (while in the NiFi 
> Flow web UI). The description is "No description provided." Please add a 
> description. (I would suggest one, but I am not sure what this processor does.)
> 2. On the following page: 
> https://nifi.apache.org/docs/nifi-docs/html/getting-started.html#downloading-and-installing-nifi
>  -- The LogAttribute processor is not covered in the processor list. Since 
> the getting started instructions say to use the LogAttribute processor, it 
> would help to explain what this processor does, perhaps by both (a) adding 
> the LogAttribute processor to the list and (b) saying something about what 
> the example is supposed to do. (Suggested question to answer: What does this 
> example do? The first processor, GetFile, seems obvious enough, but 
> LogAttribute is a little less clear.)





[jira] [Commented] (NIFI-1505) LogAttribute Missing Information in Documentation

2018-10-31 Thread Kotaro Terada (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671080#comment-16671080
 ] 

Kotaro Terada commented on NIFI-1505:
-

I recently noticed the same issue and found this JIRA ticket. {{LogAttribute}} 
is still the only processor that has no processor description among all the 
processors in the current version (1.8.0).

For the first suggestion, I agree with you and I'll add it.

For the second one, I don't think it is necessary to add {{LogAttribute}} to 
the list here now. Besides, to categorize it in the list, we might also need 
to add {{LogMessage}} and some other related processors. This could be a 
separate issue.

> LogAttribute Missing Information in Documentation
> -
>
> Key: NIFI-1505
> URL: https://issues.apache.org/jira/browse/NIFI-1505
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Documentation & Website
>Affects Versions: 0.4.1
> Environment: Mac OS/X - Web UI
>Reporter: Stephan Warren
>Priority: Minor
>
> 1. Right-click on the LogAttribute processor to get usage (while in the NiFi 
> Flow web UI). The description is "No description provided." Please add a 
> description. (I would suggest one, but I am not sure what this processor does.)
> 2. On the following page: 
> https://nifi.apache.org/docs/nifi-docs/html/getting-started.html#downloading-and-installing-nifi
>  -- The LogAttribute processor is not covered in the processor list. Since 
> the getting started instructions say to use the LogAttribute processor, it 
> would help to explain what this processor does, perhaps by both (a) adding 
> the LogAttribute processor to the list and (b) saying something about what 
> the example is supposed to do. (Suggested question to answer: What does this 
> example do? The first processor, GetFile, seems obvious enough, but 
> LogAttribute is a little less clear.)





[jira] [Commented] (NIFI-4715) ListS3 produces duplicates in frequently updated buckets

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671058#comment-16671058
 ] 

ASF GitHub Bot commented on NIFI-4715:
--

Github user adamlamar commented on the issue:

https://github.com/apache/nifi/pull/2361
  
Hi @ijokarumawak, I'm sorry I wasn't able to take this across the finish 
line. Thanks a bunch for continuing the effort! 


> ListS3 produces duplicates in frequently updated buckets
> 
>
> Key: NIFI-4715
> URL: https://issues.apache.org/jira/browse/NIFI-4715
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.2.0, 1.3.0, 1.4.0
> Environment: All
>Reporter: Milan Das
>Assignee: Koji Kawamura
>Priority: Major
> Attachments: List-S3-dup-issue.xml, ListS3_Duplication.xml, 
> screenshot-1.png
>
>
> ListS3 state is implemented using HashSet. HashSet is not thread safe. When 
> ListS3 operates in multi-threaded mode, it sometimes lists the same file 
> from the S3 bucket more than once. It seems the HashSet data is getting 
> corrupted.
> currentKeys = new HashSet<>(); // needs to be thread safe, e.g. 
> currentKeys = ConcurrentHashMap.newKeySet();
> *{color:red}+Update+{color}*:
> This is not a HashSet issue.
> Root cause: 
> A file gets uploaded to S3 while a ListS3 listing is in progress.
> In onTrigger, maxTimestamp is initialized to 0L, so the block shown below 
> clears the keys.
> When lastModifiedTime on an S3 object is the same as currentTimestamp, the 
> listed key should be skipped. Because the keys were cleared, the same file is 
> loaded again. 
> I think the fix should be to initialize maxTimestamp with currentTimestamp, 
> not 0L.
> {code}
>  long maxTimestamp = currentTimestamp;
> {code}
> The following block is clearing keys.
> {code:title=org.apache.nifi.processors.aws.s3.ListS3.java|borderStyle=solid}
>  if (lastModified > maxTimestamp) {
>      maxTimestamp = lastModified;
>      currentKeys.clear();
>      getLogger().debug("clearing keys");
>  }
> {code}
> Update: 01/03/2018
> There is one more flavor of the same defect.
> Suppose file1 is modified at 1514987611000 on S3 and currentTimestamp = 
> 1514987311000 in state.
> 1. The file will be picked up, and at that time the current state will be 
> updated to currentTimestamp=1514987311000 (but OS system time is 
> 1514987611000).
> 2. Next cycle, for file2 with lastModified 1514987611000: keys will be 
> cleared because lastModified > maxTimestamp 
> (=currentTimestamp=1514987311000). 
> currentTimestamp will be saved as 1514987611000.
> 3. Next cycle: currentTimestamp=1514987611000, and "file1 modified at 
> 1514987611000" will be picked up again because file1 is no longer in the keys.
> I think the solution is that currentTimestamp needs to persist the current 
> system timestamp.
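
The effect of seeding maxTimestamp with 0L versus currentTimestamp can be reproduced with a small model. This is a simplified sketch under assumptions, not the ListS3 source: the class `ListingStateSketch`, its `seedFromCurrent` flag, and the in-memory map standing in for a bucket listing are all hypothetical, and the skip condition is a guess at the general shape of the processor's logic.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ListingStateSketch {
    private final boolean seedFromCurrent;          // true = proposed fix, false = reported behavior
    private long currentTimestamp = 0L;             // newest timestamp seen in earlier passes
    private final Set<String> currentKeys = new HashSet<>(); // keys already listed at that timestamp

    ListingStateSketch(boolean seedFromCurrent) {
        this.seedFromCurrent = seedFromCurrent;
    }

    // One listing pass over a map of key -> lastModified; returns the keys that would be emitted.
    List<String> list(Map<String, Long> bucket) {
        long maxTimestamp = seedFromCurrent ? currentTimestamp : 0L;
        List<String> emitted = new ArrayList<>();
        for (Map.Entry<String, Long> entry : bucket.entrySet()) {
            String key = entry.getKey();
            long lastModified = entry.getValue();
            // Skip objects that are older than the state, or already listed at the same timestamp.
            if (lastModified < currentTimestamp
                    || (lastModified == currentTimestamp && currentKeys.contains(key))) {
                continue;
            }
            if (lastModified > maxTimestamp) {
                maxTimestamp = lastModified;
                currentKeys.clear(); // with a 0L seed this also fires when the timestamp has not advanced
            }
            if (lastModified == maxTimestamp) {
                currentKeys.add(key);
            }
            emitted.add(key);
        }
        currentTimestamp = Math.max(currentTimestamp, maxTimestamp);
        return emitted;
    }

    public static void main(String[] args) {
        Map<String, Long> first = new LinkedHashMap<>();
        first.put("a", 100L);
        Map<String, Long> second = new LinkedHashMap<>(first);
        second.put("b", 100L); // uploaded later with the same timestamp as "a"

        for (boolean seed : new boolean[] {false, true}) {
            ListingStateSketch state = new ListingStateSketch(seed);
            state.list(first);
            state.list(second);
            System.out.println("seedFromCurrent=" + seed + " third pass emits " + state.list(second));
        }
    }
}
```

In this model the 0L seed clears the remembered keys whenever a new object shares the stored timestamp, so earlier objects are emitted again on the next pass. Seeding from currentTimestamp keeps them in the set.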





[GitHub] nifi issue #2361: NIFI-4715: ListS3 produces duplicates in frequently update...

2018-10-31 Thread adamlamar
Github user adamlamar commented on the issue:

https://github.com/apache/nifi/pull/2361
  
Hi @ijokarumawak, I'm sorry I wasn't able to take this across the finish 
line. Thanks a bunch for continuing the effort! 


---


[jira] [Commented] (NIFI-4715) ListS3 produces duplicates in frequently updated buckets

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671013#comment-16671013
 ] 

ASF GitHub Bot commented on NIFI-4715:
--

Github user jvwing commented on the issue:

https://github.com/apache/nifi/pull/3116
  
Reviewing...




[GitHub] nifi issue #3116: NIFI-4715: ListS3 produces duplicates in frequently update...

2018-10-31 Thread jvwing
Github user jvwing commented on the issue:

https://github.com/apache/nifi/pull/3116
  
Reviewing...


---


[jira] [Resolved] (NIFI-5773) Improvements to the Release Guide

2018-10-31 Thread Jeff Storck (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Storck resolved NIFI-5773.
---
   Resolution: Done
Fix Version/s: 1.9.0

> Improvements to the Release Guide
> -
>
> Key: NIFI-5773
> URL: https://issues.apache.org/jira/browse/NIFI-5773
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Documentation & Website
>Affects Versions: 1.8.0
>Reporter: Jeff Storck
>Assignee: Jeff Storck
>Priority: Minor
> Fix For: 1.9.0
>
>
> Add some improvements and formatting fixes to the Release Guide.





[jira] [Commented] (NIFI-5773) Improvements to the Release Guide

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670923#comment-16670923
 ] 

ASF GitHub Bot commented on NIFI-5773:
--

Github user asfgit closed the pull request at:

https://github.com/apache/nifi-site/pull/32


> Improvements to the Release Guide
> -
>
> Key: NIFI-5773
> URL: https://issues.apache.org/jira/browse/NIFI-5773
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Documentation & Website
>Affects Versions: 1.8.0
>Reporter: Jeff Storck
>Assignee: Jeff Storck
>Priority: Minor
>
> Add some improvements and formatting fixes to the Release Guide.





[GitHub] nifi-site pull request #32: NIFI-5773 Added some steps and details to the re...

2018-10-31 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/nifi-site/pull/32


---


[jira] [Created] (NIFI-5775) DataTypeUtils "toString" incorrectly treats value as a "byte" when passing an array leading to ClassCastException

2018-10-31 Thread Joseph Percivall (JIRA)
Joseph Percivall created NIFI-5775:
--

 Summary: DataTypeUtils "toString" incorrectly treats value as a 
"byte" when passing an array leading to ClassCastException
 Key: NIFI-5775
 URL: https://issues.apache.org/jira/browse/NIFI-5775
 Project: Apache NiFi
  Issue Type: Bug
Affects Versions: 1.8.0
Reporter: Joseph Percivall


To reproduce, change this line[1] to either put "String" as the first choice of 
record type or just set the key to use string. 

The resulting error:

{noformat}
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Byte
    at org.apache.nifi.serialization.record.util.DataTypeUtils.toString(DataTypeUtils.java:530)
    at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:147)
    at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:115)
    at org.apache.nifi.json.WriteJsonResult.writeValue(WriteJsonResult.java:284)
    at org.apache.nifi.json.WriteJsonResult.writeRecord(WriteJsonResult.java:187)
    at org.apache.nifi.json.WriteJsonResult.writeRecord(WriteJsonResult.java:136)
    at org.apache.nifi.json.TestWriteJsonResult.testChoiceArray(TestWriteJsonResult.java:494)
{noformat}



[1] 
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/json/TestWriteJsonResult.java#L479
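
The kind of failure in the stack trace can be reduced to a few lines. The sketch below is a hypothetical reduction, not the DataTypeUtils code: `naiveToString` assumes every element of an Object[] is a Byte, and `guardedToString` shows one possible guard. Both method names and the fallback to `Arrays.toString` are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ArrayToStringSketch {
    // Treats every array as a boxed byte array; fails for any other element type.
    static String naiveToString(Object[] values) {
        byte[] bytes = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            bytes[i] = (Byte) values[i]; // ClassCastException if the element is, say, a String
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Decodes as bytes only when every element really is a Byte; otherwise prints the elements.
    static String guardedToString(Object[] values) {
        for (Object value : values) {
            if (!(value instanceof Byte)) {
                return Arrays.toString(values);
            }
        }
        return naiveToString(values);
    }

    public static void main(String[] args) {
        Object[] strings = {"a", "b"};
        System.out.println(guardedToString(strings));
        try {
            naiveToString(strings);
        } catch (ClassCastException e) {
            System.out.println("naive conversion failed: " + e.getMessage());
        }
    }
}
```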





[jira] [Created] (NIFI-5774) Refresh available component types on the front-end

2018-10-31 Thread Bryan Bende (JIRA)
Bryan Bende created NIFI-5774:
-

 Summary: Refresh available component types on the front-end
 Key: NIFI-5774
 URL: https://issues.apache.org/jira/browse/NIFI-5774
 Project: Apache NiFi
  Issue Type: Improvement
Reporter: Bryan Bende


Currently the application retrieves the available processors, controller 
services, and reporting tasks during initial page load and caches them on the 
client. This was fine because the types could never change without restarting 
the application, but now NIFI-5673 introduces the ability to dynamically load 
new NARs without restarting, so we'll need a way to reload the types on the 
front end.





[jira] [Commented] (NIFI-5673) Support auto loading of new NARs

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670678#comment-16670678
 ] 

ASF GitHub Bot commented on NIFI-5673:
--

GitHub user bbende opened a pull request:

https://github.com/apache/nifi/pull/3119

NIFI-5673 Support auto loading of new NARs

This PR introduces a new property in nifi.properties where you can specify 
a directory that will be watched for new NARs (defaults to ./extensions). Upon 
finding new NARs, it makes any extensions in them available to the running 
application without requiring a restart.

In addition, this PR includes a refactoring of the ExtensionManager to 
remove its static use and make it an instance that is injected everywhere. 
This will help make the framework code more testable going forward.

NOTE: Currently the UI retrieves the available types during its initial load 
and never retrieves them again because they couldn't change, so this 
will have to be changed in a follow-on PR.

Some scenarios to consider when testing...

- Dependencies across NARs - handled by keeping track of skipped NARs until 
a new NAR matching the required dependency is available

- NARs still being written - handled by only considering a NAR once its last 
modified time is at least 5 seconds old

- Custom UIs - make sure the custom UI is available for a component that 
was auto-loaded
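
The first two scenarios above can be sketched as a single polling pass. This is a minimal model under assumptions, not the NarAutoLoader from the PR: the `Nar` descriptor, the `poll` method, and the single-dependency simplification are hypothetical. Only the 5-second minimum age and the idea of retrying skipped NARs come from the description.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

class NarAutoLoadSketch {
    static final long MIN_AGE_MILLIS = 5_000L; // "at least 5 seconds old"

    // Hypothetical descriptor: a NAR name, the NAR it depends on (null if none), and its mtime.
    static final class Nar {
        final String name;
        final String dependsOn;
        final long lastModified;

        Nar(String name, String dependsOn, long lastModified) {
            this.name = name;
            this.dependsOn = dependsOn;
            this.lastModified = lastModified;
        }
    }

    private final Set<String> loaded = new HashSet<>();
    private final List<Nar> skipped = new ArrayList<>();

    // One watcher pass: considers newly seen NARs plus those skipped earlier,
    // and returns the names loaded during this pass, in load order.
    List<String> poll(List<Nar> newlySeen, long now) {
        List<Nar> candidates = new ArrayList<>(skipped);
        candidates.addAll(newlySeen);
        skipped.clear();

        List<String> loadedNow = new ArrayList<>();
        boolean progress = true;
        while (progress) { // repeat so a NAR can load right after its dependency does
            progress = false;
            for (Iterator<Nar> it = candidates.iterator(); it.hasNext();) {
                Nar nar = it.next();
                boolean stable = now - nar.lastModified >= MIN_AGE_MILLIS;
                boolean dependencyMet = nar.dependsOn == null || loaded.contains(nar.dependsOn);
                if (stable && dependencyMet) {
                    loaded.add(nar.name);
                    loadedNow.add(nar.name);
                    it.remove();
                    progress = true;
                }
            }
        }
        skipped.addAll(candidates); // retried on the next pass
        return loadedNow;
    }

    public static void main(String[] args) {
        NarAutoLoadSketch loader = new NarAutoLoadSketch();
        List<Nar> seen = new ArrayList<>();
        seen.add(new Nar("child", "parent", 0L));
        System.out.println(loader.poll(seen, 10_000L));  // dependency not available yet
        seen.clear();
        seen.add(new Nar("parent", null, 8_000L));
        System.out.println(loader.poll(seen, 10_000L));  // parent modified too recently
        seen.clear();
        System.out.println(loader.poll(seen, 13_000L));  // both can load now
    }
}
```

A real watcher would re-read each file's last modified time on every pass, since a file that is still being written keeps changing. The sketch fixes the timestamp to keep the example deterministic.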


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bbende/nifi NIFI-5673

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/3119.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3119


commit d7561ffa5601e4afd708fdbf8198f696bf3b2bbd
Author: Bryan Bende 
Date:   2018-10-12T15:15:30Z

NIFI-5673 Set up property/assembly for new auto-load directory
- Set up NarAutoLoader to watch directory for new files
- Move NarAutoLoader to JettyServer since it will need access to 
ExtensionManager
- Created NarLoader to be shared between NarAutoLoader and the framework
- Created nifi-framework-nar-loading-utils so we can use nifi-documentation 
to call DocGenerator
- Add additional bundles to overall map in NarClassLoaders as they are 
loaded
- Added handling of skipped NARs to include them in next iteration
- Added check of last modified timestamp on NARs
- Refactored JettyServer so we can load additional web contexts while the 
application is running
- Setting up unit tests

commit 81ed267f68c69566851d75b4a99b15e1a05f503c
Author: Bryan Bende 
Date:   2018-10-29T20:19:57Z

NIFI-5673 Remove static use of ExtensionManager

commit e126bfca87c79272836c7c281d6d80f09d9721e4
Author: Bryan Bende 
Date:   2018-10-30T19:51:37Z

NIFI-5673 Adding unit tests for NarLoader

commit 01a7feb9bdd02ef3407a3b3acd5f05c0938e2e49
Author: Bryan Bende 
Date:   2018-10-31T15:36:49Z

NIFI-5673 Extracting interface for ExtensionManager and splitting discovery 
into its own interface




> Support auto loading of new NARs
> 
>
> Key: NIFI-5673
> URL: https://issues.apache.org/jira/browse/NIFI-5673
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Bryan Bende
>Assignee: Bryan Bende
>Priority: Minor
>
> We should be able to detect when new NARs have been added to any of the NAR 
> directories and automatically load them and make the components available for 
> use without restarting the whole NiFi instance.





[jira] [Updated] (NIFI-5673) Support auto loading of new NARs

2018-10-31 Thread Bryan Bende (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Bende updated NIFI-5673:
--
Status: Patch Available  (was: Open)



[GitHub] nifi pull request #3119: NIFI-5673 Support auto loading of new NARs

2018-10-31 Thread bbende
GitHub user bbende opened a pull request:

https://github.com/apache/nifi/pull/3119

NIFI-5673 Support auto loading of new NARs




---


[jira] [Commented] (NIFI-5772) FetchFile Test Failing

2018-10-31 Thread Bryan Bende (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670643#comment-16670643
 ] 

Bryan Bende commented on NIFI-5772:
---

Hmm, that is definitely interesting, since it is passing on Travis, which is 
some form of Linux build.

> FetchFile Test Failing
> --
>
> Key: NIFI-5772
> URL: https://issues.apache.org/jira/browse/NIFI-5772
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Peter Wicks
>Priority: Major
>
> In TestFetchFile, the test 
> `testMoveOnCompleteWithParentOfTargetDirNotAccessible` fails for me
> on Ubuntu with OpenJDK 8 1.5.0 and on my Windows 10 laptop.
> Error: 
> Expected all Transferred FlowFiles to go to failure but 1 were routed to 
> success





[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-10-31 Thread joewitt
Github user joewitt commented on the issue:

https://github.com/apache/nifi/pull/2671
  
Team, given https://github.com/marklogic/nifi/releases, could we consider 
closing this PR and leaving MarkLogic artifact creation/maintenance as 
something MarkLogic takes care of at this time? It is a perfectly fine model. 
We could possibly even create a NiFi web page pointing at vendor- or 
community-managed/supported extensions.


---


[jira] [Commented] (NIFI-5772) FetchFile Test Failing

2018-10-31 Thread Peter Wicks (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670651#comment-16670651
 ] 

Peter Wicks commented on NIFI-5772:
---

I looked at the logs again, and the other test that does something similar, 
`testMoveOnCompleteWithTargetExistsButNotWritable`, also failed; I just didn't 
see the "Failures: 2" last time.

I'm running the builds as root. I wonder if root retains access to these 
directories even when you mark otherwise in code...
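
One way to check that hypothesis is to probe whether revoking write permission actually blocks file creation for the current user. The sketch below is an assumption-laden probe, not part of TestFetchFile: the class and method names are hypothetical. On POSIX systems the superuser generally bypasses permission checks, so the probe would be expected to report that revocation is not enforced when run as root.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

class PermissionProbe {
    // Returns true if, after removing write permission from a fresh temp directory,
    // creating a file inside it is refused for the current user.
    static boolean writeRevocationIsEnforced() {
        File dir;
        try {
            dir = Files.createTempDirectory("perm-probe").toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        File child = new File(dir, "child.txt");
        try {
            dir.setWritable(false, false); // may have no effect on some platforms
            boolean created;
            try {
                created = child.createNewFile();
            } catch (IOException denied) {
                created = false; // creation was refused, so the permission is honored
            }
            return !created;
        } finally {
            // Restore access and clean up regardless of the outcome.
            dir.setWritable(true, false);
            child.delete();
            dir.delete();
        }
    }

    public static void main(String[] args) {
        System.out.println("write revocation enforced: " + writeRevocationIsEnforced());
    }
}
```

A test that relies on an unwritable directory could call a probe like this first and skip itself when the result is false.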



[GitHub] nifi-site pull request #32: NIFI-5773 Added some steps and details to the re...

2018-10-31 Thread jtstorck
GitHub user jtstorck opened a pull request:

https://github.com/apache/nifi-site/pull/32

NIFI-5773 Added some steps and details to the release process

Fixed several formatting problems with lists and bullet points
Removed some extraneous mentions of "NIFI-" in front of ${JIRA_TICKET}

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jtstorck/nifi-site NIFI-5773

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi-site/pull/32.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #32


commit 8dad7b08671b997632199f15f01909916ab36078
Author: Jeff Storck 
Date:   2018-10-31T19:48:05Z

NIFI-5773 Added some steps and details to the release process
Fixed several formatting problems with lists and bullet points
Removed some extraneous mentions of "NIFI-" in front of ${JIRA_TICKET}




---


[jira] [Commented] (NIFI-5773) Improvements to the Release Guide

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670647#comment-16670647
 ] 

ASF GitHub Bot commented on NIFI-5773:
--

GitHub user jtstorck opened a pull request:

https://github.com/apache/nifi-site/pull/32

NIFI-5773 Added some steps and details to the release process

Fixed several formatting problems with lists and bullet points
Removed some extraneous mentions of "NIFI-" in front of ${JIRA_TICKET}

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jtstorck/nifi-site NIFI-5773

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi-site/pull/32.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #32


commit 8dad7b08671b997632199f15f01909916ab36078
Author: Jeff Storck 
Date:   2018-10-31T19:48:05Z

NIFI-5773 Added some steps and details to the release process
Fixed several formatting problems with lists and bullet points
Removed some extraneous mentions of "NIFI-" in front of ${JIRA_TICKET}




> Improvements to the Release Guide
> -
>
> Key: NIFI-5773
> URL: https://issues.apache.org/jira/browse/NIFI-5773
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Documentation & Website
> Affects Versions: 1.8.0
> Reporter: Jeff Storck
> Assignee: Jeff Storck
> Priority: Minor
>
> Add some improvements and formatting fixes to the Release Guide.





[jira] [Commented] (NIFI-5770) Memory Leak in ExecuteScript

2018-10-31 Thread Ivan Omar Olguin Torres (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670633#comment-16670633
 ] 

Ivan Omar Olguin Torres commented on NIFI-5770:
---

[~bdesert] you're right. I tried to reproduce this in my sandbox and did not 
see the memory leak.

The issue I had was the one in NIFI-4968 in version 1.2.

Thanks a lot for the clarification.

> Memory Leak in ExecuteScript
> 
>
> Key: NIFI-5770
> URL: https://issues.apache.org/jira/browse/NIFI-5770
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.8.0
> Reporter: Ed Berezitsky
> Assignee: Ed Berezitsky
> Priority: Major
> Labels: features, performance
> Attachments: 3117.patch, ExecuteScriptMemLeak.xml, jython_modules.zip
>
>
> ExecuteScript with the Jython engine has a memory leak.
> It uses the JythonScriptEngineConfigurator class to configure the Jython 
> execution environment.
> The problem is in this line:
> {code:java}
> engine.eval("sys.path.append('" + modulePath + "')");{code}
> There is no check whether the module path has already been added.
> As a result, the string value of the module property is appended on every 
> execution (onTrigger) and is never reset.
> Although InvokeScriptedProcessor (ISP) uses the same engine configurator, the 
> memory leak is not reproducible in it, because ISP builds the engine and 
> compiles the code only once (and rebuilds them every time a relevant 
> property is changed).
> Attached:
> * template with a flow to reproduce the bug
> * simple Python modules (to be unpacked under /tmp)
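
The unguarded append described above could be avoided with a membership check. 
The following is a minimal sketch under stated assumptions, not necessarily 
what 3117.patch does: `ModulePathGuard` and its methods are hypothetical names, 
the module path is assumed to contain no single quotes or backslashes, and a 
plain Java list stands in for Jython's `sys.path` so the example runs without 
a script engine.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not NiFi code.
final class ModulePathGuard {

    /** Builds a Jython statement that appends modulePath to sys.path only if it is absent. */
    static String guardedAppend(String modulePath) {
        // Assumption: modulePath needs no escaping inside a single-quoted Python literal.
        String literal = "'" + modulePath + "'";
        return "if " + literal + " not in sys.path: sys.path.append(" + literal + ")";
    }

    /** Simulates repeated onTrigger calls against a Java stand-in for sys.path. */
    static int pathSizeAfter(int triggers, String modulePath, boolean guarded) {
        List<String> sysPath = new ArrayList<>();
        for (int i = 0; i < triggers; i++) {
            // Unguarded: always append. Guarded: append only the first time.
            if (!guarded || !sysPath.contains(modulePath)) {
                sysPath.add(modulePath);
            }
        }
        return sysPath.size();
    }
}
```

With the guard, the simulated path stays at one entry no matter how often the 
processor is triggered; without it, the list grows by one entry per trigger. 
The string returned by `guardedAppend` is what would be passed to 
`engine.eval(...)` in place of the bare `sys.path.append` call.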





[jira] [Assigned] (NIFI-5773) Improvements to the Release Guide

2018-10-31 Thread Jeff Storck (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Storck reassigned NIFI-5773:
-

Assignee: Jeff Storck



[jira] [Created] (NIFI-5773) Improvements to the Release Guide

2018-10-31 Thread Jeff Storck (JIRA)
Jeff Storck created NIFI-5773:
-

 Summary: Improvements to the Release Guide
 Key: NIFI-5773
 URL: https://issues.apache.org/jira/browse/NIFI-5773
 Project: Apache NiFi
 Issue Type: Improvement
 Components: Documentation & Website
 Affects Versions: 1.8.0
 Reporter: Jeff Storck


Add some improvements and formatting fixes to the Release Guide.





[jira] [Commented] (NIFI-5772) FetchFile Test Failing

2018-10-31 Thread Peter Wicks (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670620#comment-16670620
 ] 

Peter Wicks commented on NIFI-5772:
---

[~bende] I initially ran into this on Ubuntu. I only tested on Windows later, 
to see whether it was just an issue with my Linux box.



[jira] [Updated] (NIFI-5771) If 0-byte FlowFiles are load balanced, can result in content claim not being cleaned up

2018-10-31 Thread Bryan Bende (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Bende updated NIFI-5771:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> If 0-byte FlowFiles are load balanced, can result in content claim not being 
> cleaned up
> ---
>
> Key: NIFI-5771
> URL: https://issues.apache.org/jira/browse/NIFI-5771
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Critical
> Fix For: 1.9.0
>
>
> To replicate, create two separate flows:
> GenerateFlowFile (File Size 800 KB) -> UpdateAttribute [auto-terminate]
> GenerateFlowFile (File Size 0 B) -> UpdateAttribute [auto-terminate]
> On the second flow, the one generating 0-byte FlowFiles, configure the 
> Connection to Load Balance (Round Robin is easiest).
> Start both flows.
> After a minute or two, stop the GenerateFlowFile processors and let 
> UpdateAttribute finish processing the remaining FlowFiles.
> Now, wait for the FlowFile Repository to checkpoint. At this point, the 
> content claims should be cleaned up, i.e. deleted or archived 
> (with the exception of a few claims that are still 'writable'). However, some 
> claims are still sticking around when they shouldn't.





[jira] [Commented] (NIFI-5771) If 0-byte FlowFiles are load balanced, can result in content claim not being cleaned up

2018-10-31 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670612#comment-16670612
 ] 

ASF subversion and git services commented on NIFI-5771:
---

Commit 4c10b47e602741adc52ad693a9bc56b9964cd7ef in nifi's branch 
refs/heads/master from [~markap14]
[ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=4c10b47 ]

NIFI-5771: Ensure that we only increment claimant count for content claim if we 
have a FlowFile that references it

This closes #3118.

Signed-off-by: Bryan Bende 
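
The idea in the commit message can be illustrated with a toy model. This is 
not the code from PR #3118: `ClaimantCounter` and its methods are hypothetical 
names, and the model simply assumes that a 0-byte FlowFile holds no reference 
to the content claim. It compares counting a claimant for every received 
FlowFile with counting one only when the FlowFile references the claim.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model for illustration only; not NiFi's claim management code.
final class ClaimantCounter {
    private final Map<String, Integer> counts = new HashMap<>();
    private final boolean onlyCountReferences;

    ClaimantCounter(boolean onlyCountReferences) {
        this.onlyCountReferences = onlyCountReferences;
    }

    /** Records a received FlowFile; assumes a 0-byte FlowFile does not reference the claim. */
    void receive(String claimId, long flowFileSize) {
        boolean referencesClaim = flowFileSize > 0;
        if (referencesClaim || !onlyCountReferences) {
            counts.merge(claimId, 1, Integer::sum);
        }
    }

    /** Records removal of a FlowFile; only FlowFiles that reference the claim release it. */
    void remove(String claimId, long flowFileSize) {
        if (flowFileSize > 0) {
            // Drop the entry entirely once the last claimant is gone.
            counts.computeIfPresent(claimId, (id, n) -> n > 1 ? n - 1 : null);
        }
    }

    /** A claim can be deleted or archived once nothing is counted against it. */
    boolean canCleanUp(String claimId) {
        return !counts.containsKey(claimId);
    }
}
```

In this model, `new ClaimantCounter(false)` leaves a claim stuck after a 
0-byte FlowFile is received and removed, because the count is incremented but 
never decremented. `new ClaimantCounter(true)` never counts that FlowFile, so 
the claim stays eligible for cleanup.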




[jira] [Commented] (NIFI-5771) If 0-byte FlowFiles are load balanced, can result in content claim not being cleaned up

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670614#comment-16670614
 ] 

ASF GitHub Bot commented on NIFI-5771:
--

Github user asfgit closed the pull request at:

https://github.com/apache/nifi/pull/3118




[GitHub] nifi pull request #3118: NIFI-5771: Ensure that we only increment claimant c...

2018-10-31 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/nifi/pull/3118


---


[jira] [Commented] (NIFI-5771) If 0-byte FlowFiles are load balanced, can result in content claim not being cleaned up

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670610#comment-16670610
 ] 

ASF GitHub Bot commented on NIFI-5771:
--

Github user bbende commented on the issue:

https://github.com/apache/nifi/pull/3118
  
Looks good, will merge




[GitHub] nifi issue #3118: NIFI-5771: Ensure that we only increment claimant count fo...

2018-10-31 Thread bbende
Github user bbende commented on the issue:

https://github.com/apache/nifi/pull/3118
  
Looks good, will merge


---


[jira] [Commented] (MINIFICPP-653) Log message will segfault client if no content produced

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/MINIFICPP-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670601#comment-16670601
 ] 

ASF GitHub Bot commented on MINIFICPP-653:
--

Github user asfgit closed the pull request at:

https://github.com/apache/nifi-minifi-cpp/pull/427


> Log message will segfault client if no content produced
> ---
>
> Key: MINIFICPP-653
> URL: https://issues.apache.org/jira/browse/MINIFICPP-653
> Project: NiFi MiNiFi C++
> Issue Type: Improvement
> Reporter: Mr TheSegfault
> Assignee: Mr TheSegfault
> Priority: Blocker
>
> Log message will segfault client if no content produced





[GitHub] nifi-minifi-cpp pull request #427: MINIFICPP-653: Check if empty content, if...

2018-10-31 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/nifi-minifi-cpp/pull/427


---


[jira] [Commented] (NIFI-5772) FetchFile Test Failing

2018-10-31 Thread Bryan Bende (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670593#comment-16670593
 ] 

Bryan Bende commented on NIFI-5772:
---

Thanks for the heads-up. I'm guessing we'll have to ignore the test when 
building on Windows.



[jira] [Commented] (NIFI-5772) FetchFile Test Failing

2018-10-31 Thread Peter Wicks (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670571#comment-16670571
 ] 

Peter Wicks commented on NIFI-5772:
---

[~bende] I saw you created this test. I can't get it to pass.



[jira] [Created] (NIFI-5772) FetchFile Test Failing

2018-10-31 Thread Peter Wicks (JIRA)
Peter Wicks created NIFI-5772:
-

 Summary: FetchFile Test Failing
 Key: NIFI-5772
 URL: https://issues.apache.org/jira/browse/NIFI-5772
 Project: Apache NiFi
 Issue Type: Bug
 Components: Core Framework
 Reporter: Peter Wicks


In TestFetchFile, the test 
`testMoveOnCompleteWithParentOfTargetDirNotAccessible` fails for me.

It fails on Ubuntu with OpenJDK 8 1.5.0 and on my Windows 10 laptop.

Error: 

Expected all Transferred FlowFiles to go to failure but 1 were routed to success





[jira] [Commented] (NIFI-5771) If 0-byte FlowFiles are load balanced, can result in content claim not being cleaned up

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670537#comment-16670537
 ] 

ASF GitHub Bot commented on NIFI-5771:
--

Github user bbende commented on the issue:

https://github.com/apache/nifi/pull/3118
  
Reviewing...




[GitHub] nifi issue #3118: NIFI-5771: Ensure that we only increment claimant count fo...

2018-10-31 Thread bbende
Github user bbende commented on the issue:

https://github.com/apache/nifi/pull/3118
  
Reviewing...


---


[jira] [Resolved] (MINIFICPP-663) Brew upgrade to bison ( 3.0.x -> 3.2 fails EL Build)

2018-10-31 Thread Aldrin Piri (JIRA)


 [ 
https://issues.apache.org/jira/browse/MINIFICPP-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aldrin Piri resolved MINIFICPP-663.
---
   Resolution: Fixed
Fix Version/s: 0.6.0

> Brew upgrade to bison ( 3.0.x -> 3.2 fails EL Build)
> 
>
> Key: MINIFICPP-663
> URL: https://issues.apache.org/jira/browse/MINIFICPP-663
> Project: NiFi MiNiFi C++
> Issue Type: Bug
> Reporter: Mr TheSegfault
> Assignee: Mr TheSegfault
> Priority: Critical
> Fix For: 0.6.0
>
>
> Brew upgrade to bison ( 3.0.x -> 3.2 fails EL Build)





[GitHub] nifi-minifi-cpp pull request #433: Minificpp 659: Break out CAPI into nanofi

2018-10-31 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/nifi-minifi-cpp/pull/433


---


[jira] [Resolved] (MINIFICPP-659) Move CAPI code out of libminifi

2018-10-31 Thread Aldrin Piri (JIRA)


 [ 
https://issues.apache.org/jira/browse/MINIFICPP-659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aldrin Piri resolved MINIFICPP-659.
---
   Resolution: Fixed
Fix Version/s: 0.6.0

> Move CAPI code out of libminifi
> ---
>
> Key: MINIFICPP-659
> URL: https://issues.apache.org/jira/browse/MINIFICPP-659
> Project: NiFi MiNiFi C++
> Issue Type: Improvement
> Reporter: Mr TheSegfault
> Assignee: Mr TheSegfault
> Priority: Major
> Fix For: 0.6.0
>
>






[jira] [Comment Edited] (NIFI-5770) Memory Leak in ExecuteScript

2018-10-31 Thread Ed Berezitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670477#comment-16670477
 ] 

Ed Berezitsky edited comment on NIFI-5770 at 10/31/18 6:26 PM:
---

[~ivanomarot], please confirm which version you are seeing this issue in.

As you describe it, the issue was reported and fixed as part of 
https://issues.apache.org/jira/browse/NIFI-4968 .

The fix has been available since version 1.6.

I also tried to reproduce it with bad syntax, but that produces only a single 
error (in the log, the bulletin, and the processor validation indicator), and 
validation does not run again until a property is changed.


was (Author: bdesert):
[~ivanomarot], please confirm the version you are facing this issue in.

As you describe it, the issue has been reported and fixed as part of 
https://issues.apache.org/jira/browse/NIFI-4968 .

Fixed is available since v 1.6.

I also tried to reproduce with bad syntax, but it gives only single error in 
log+bulletin+processor validation indicator, and until you change any property, 
it won't be running validations anymore.



[jira] [Commented] (NIFI-5770) Memory Leak in ExecuteScript

2018-10-31 Thread Ed Berezitsky (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670477#comment-16670477
 ] 

Ed Berezitsky commented on NIFI-5770:
-

[~ivanomarot], please confirm which version you are seeing this issue in.

As you describe it, the issue was reported and fixed as part of 
https://issues.apache.org/jira/browse/NIFI-4968 .

The fix has been available since version 1.6.

I also tried to reproduce it with bad syntax, but that produces only a single 
error (in the log, the bulletin, and the processor validation indicator), and 
validation does not run again until a property is changed.



[jira] [Commented] (NIFI-5771) If 0-byte FlowFiles are load balanced, can result in content claim not being cleaned up

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670424#comment-16670424
 ] 

ASF GitHub Bot commented on NIFI-5771:
--

GitHub user markap14 opened a pull request:

https://github.com/apache/nifi/pull/3118

NIFI-5771: Ensure that we only increment claimant count for content claim 
if we have a FlowFile that references it

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/markap14/nifi NIFI-5771

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/3118.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3118


commit 2fbb61f852abd86a134916bd8b2f6fda5b187241
Author: Mark Payne 
Date:   2018-10-31T17:11:58Z

NIFI-5771: Ensure that we only increment claimant count for content claim 
if we have a FlowFile that references it






[GitHub] nifi pull request #3118: NIFI-5771: Ensure that we only increment claimant c...

2018-10-31 Thread markap14
GitHub user markap14 opened a pull request:

https://github.com/apache/nifi/pull/3118

NIFI-5771: Ensure that we only increment claimant count for content claim 
if we have a FlowFile that references it

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/markap14/nifi NIFI-5771

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/3118.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3118


commit 2fbb61f852abd86a134916bd8b2f6fda5b187241
Author: Mark Payne 
Date:   2018-10-31T17:11:58Z

NIFI-5771: Ensure that we only increment claimant count for content claim 
if we have a FlowFile that references it




---


[jira] [Updated] (NIFI-5771) If 0-byte FlowFiles are load balanced, can result in content claim not being cleaned up

2018-10-31 Thread Mark Payne (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-5771:
-
Fix Version/s: 1.9.0
   Status: Patch Available  (was: Open)

> If 0-byte FlowFiles are load balanced, can result in content claim not being 
> cleaned up
> ---
>
> Key: NIFI-5771
> URL: https://issues.apache.org/jira/browse/NIFI-5771
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Critical
> Fix For: 1.9.0
>
>
> To replicate, create two separate flows:
> GenerateFlowFile (File Size 800 KB) -> UpdateAttribute [auto-terminate]
> GenerateFlowFile (File Size 0 B) -> UpdateAttribute [auto-terminate]
> On the second one, which generates 0-byte FlowFiles, configure the
> Connection to Load Balance (Round Robin is easiest).
> Start both flows.
> After a minute or two, stop the GenerateFlowFile processors; let the
> FlowFiles finish being processed by UpdateAttribute.
> Now, wait for the FlowFile Repository to checkpoint. At this point, any
> content claims that are no longer referenced should be cleaned up
> (deleted or archived), with the exception of a few claims that are still
> 'writable'. However, some claims are still sticking around when they shouldn't.
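
The rule named in the fix's commit message (only increment the claimant count for a content claim if a FlowFile references it) can be illustrated with a toy reference counter. This is a minimal sketch under stated assumptions, not NiFi's implementation: `ClaimTracker`, `receive_flowfile`, and the claim ids are hypothetical names invented for illustration.

```python
from collections import Counter


class ClaimTracker:
    """Toy reference counter for content claims (hypothetical, not NiFi's API)."""

    def __init__(self):
        self._claimants = Counter()

    def increment(self, claim_id):
        self._claimants[claim_id] += 1

    def decrement(self, claim_id):
        self._claimants[claim_id] -= 1
        if self._claimants[claim_id] <= 0:
            del self._claimants[claim_id]

    def cleanable(self, all_claims):
        # A claim can be deleted or archived once nothing references it.
        return [c for c in all_claims if self._claimants[c] == 0]


def receive_flowfile(tracker, claim_id, size):
    """Count a claimant only when the FlowFile actually references content."""
    if size > 0 and claim_id is not None:
        tracker.increment(claim_id)
        return claim_id
    # Assumption: a 0-byte FlowFile carries no content claim reference.
    return None


tracker = ClaimTracker()
kept = receive_flowfile(tracker, "claim-1", 800 * 1024)
empty = receive_flowfile(tracker, "claim-2", 0)
tracker.decrement(kept)  # the 800 KB FlowFile has finished processing
print(tracker.cleanable(["claim-1", "claim-2"]))
```

If the count were incremented for the 0-byte FlowFile as well, nothing would ever decrement it, and in this model `claim-2` would never become cleanable.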



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (NIFI-5771) If 0-byte FlowFiles are load balanced, can result in content claim not being cleaned up

2018-10-31 Thread Mark Payne (JIRA)
Mark Payne created NIFI-5771:


 Summary: If 0-byte FlowFiles are load balanced, can result in 
content claim not being cleaned up
 Key: NIFI-5771
 URL: https://issues.apache.org/jira/browse/NIFI-5771
 Project: Apache NiFi
  Issue Type: Bug
  Components: Core Framework
Reporter: Mark Payne
Assignee: Mark Payne


To replicate, create two separate flows:

GenerateFlowFile (File Size 800 KB) -> UpdateAttribute [auto-terminate]

GenerateFlowFile (File Size 0 B) -> UpdateAttribute [auto-terminate]

On the second one, which generates 0-byte FlowFiles, configure the
Connection to Load Balance (Round Robin is easiest).

Start both flows.

After a minute or two, stop the GenerateFlowFile processors; let the FlowFiles
finish being processed by UpdateAttribute.

Now, wait for the FlowFile Repository to checkpoint. At this point, any content
claims that are no longer referenced should be cleaned up (deleted or archived),
with the exception of a few claims that are still 'writable'. However, some
claims are still sticking around when they shouldn't.





[jira] [Commented] (NIFI-5724) Make the autocommit value in the PutSQL processor configurable

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670287#comment-16670287
 ] 

ASF GitHub Bot commented on NIFI-5724:
--

Github user viswaug commented on a diff in the pull request:

https://github.com/apache/nifi/pull/3113#discussion_r229756472
  
--- Diff: 
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutSQL.java
 ---
@@ -134,6 +134,14 @@
 
.expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES)
 .build();
 
+    static final PropertyDescriptor AUTO_COMMIT = new PropertyDescriptor.Builder()
+            .name("database-session-autocommit")
+            .displayName("Database session autocommit value")
+            .description("The autocommit mode to set on the database connection being used.")
+            .allowableValues("true", "false")
+            .defaultValue("false")
+            .build();
--- End diff --

@ijokarumawak Gotcha. How about I remove the property I just added and
set autocommit to true when the SUPPORT_TRANSACTIONS property value is set to
false?


> Make the autocommit value in the PutSQL processor configurable
> --
>
> Key: NIFI-5724
> URL: https://issues.apache.org/jira/browse/NIFI-5724
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: vish uma
>Priority: Minor
>
> The PutSQL processor currently always sets the autocommit value on the
> database session to false before the SQL statement is run and resets it back
> to the original value afterwards.
> I am not sure if the autocommit value is hardcoded to false for a reason; if
> it is, please let me know.
> This is causing an issue with the Snowflake DB, where abruptly disconnected
> sessions do not release the locks they have taken.
> I would like to make this autocommit value configurable. I can submit a patch
> for this if there are no objections.
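
The behaviour described above (set autocommit before the statement runs, restore the original value afterwards) together with the proposal in the comment (derive autocommit from the transaction setting) can be sketched as follows. This is only an illustration, not PutSQL's code: `FakeConnection`, `resolve_autocommit`, and `autocommit_scope` are hypothetical names, and the connection is a stand-in that only tracks the flag.

```python
from contextlib import contextmanager


class FakeConnection:
    """Stand-in for a database connection; it only tracks the autocommit flag."""

    def __init__(self, autocommit=True):
        self.autocommit = autocommit


def resolve_autocommit(support_transactions):
    # Proposal from the comment: autocommit is on only when
    # transactions are not requested.
    return not support_transactions


@contextmanager
def autocommit_scope(conn, support_transactions):
    """Apply the derived autocommit value, then restore the original one."""
    original = conn.autocommit
    conn.autocommit = resolve_autocommit(support_transactions)
    try:
        yield conn
    finally:
        # Restore even if the statement raised.
        conn.autocommit = original


conn = FakeConnection(autocommit=True)
with autocommit_scope(conn, support_transactions=True) as c:
    print("inside:", c.autocommit)
print("after:", conn.autocommit)
```

Restoring in a `finally` block keeps a pooled connection from leaking the temporary setting to the next caller.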





[GitHub] nifi pull request #3113: NIFI-5724 making the database connection autocommit...

2018-10-31 Thread viswaug
Github user viswaug commented on a diff in the pull request:

https://github.com/apache/nifi/pull/3113#discussion_r229756472
  
--- Diff: 
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutSQL.java
 ---
@@ -134,6 +134,14 @@
 
.expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES)
 .build();
 
+    static final PropertyDescriptor AUTO_COMMIT = new PropertyDescriptor.Builder()
+            .name("database-session-autocommit")
+            .displayName("Database session autocommit value")
+            .description("The autocommit mode to set on the database connection being used.")
+            .allowableValues("true", "false")
+            .defaultValue("false")
+            .build();
--- End diff --

@ijokarumawak Gotcha. How about I remove the property I just added and
set autocommit to true when the SUPPORT_TRANSACTIONS property value is set to
false?


---


[jira] [Commented] (NIFI-5718) Performance degraded in ReplaceText processor

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670259#comment-16670259
 ] 

ASF GitHub Bot commented on NIFI-5718:
--

Github user patricker commented on a diff in the pull request:

https://github.com/apache/nifi/pull/3100#discussion_r229750511
  
--- Diff: 
nifi-commons/nifi-utils/src/main/java/org/apache/nifi/stream/io/RepeatingInputStream.java
 ---
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.stream.io;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Objects;
+
+public class RepeatingInputStream extends InputStream {
--- End diff --

It would probably be best to remove it from this specific PR and include it 
in a future one when needed.

Generally, aren't we trying to avoid more `@Ignore` tests? Otherwise I'd
say just include your previous tests and call it good.


> Performance degraded in ReplaceText processor
> -
>
> Key: NIFI-5718
> URL: https://issues.apache.org/jira/browse/NIFI-5718
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.8.0
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Attachments: Screen Shot 2018-10-17 at 10.55.53 AM.png
>
>
> NIFI-5711 addresses some licensing concerns in the NLKBufferedReader class.
> In doing so, however, it results in lower performance. The ReplaceText 
> processor is affected if the Evaluation Mode is set to Line-by-Line, and the 
> RouteText processor will also be affected. We should be able to match the 
> performance of the previous version.





[GitHub] nifi pull request #3100: NIFI-5718: Implemented LineDemarcator and removed N...

2018-10-31 Thread patricker
Github user patricker commented on a diff in the pull request:

https://github.com/apache/nifi/pull/3100#discussion_r229750511
  
--- Diff: 
nifi-commons/nifi-utils/src/main/java/org/apache/nifi/stream/io/RepeatingInputStream.java
 ---
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.stream.io;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Objects;
+
+public class RepeatingInputStream extends InputStream {
--- End diff --

It would probably be best to remove it from this specific PR and include it 
in a future one when needed.

Generally, aren't we trying to avoid more `@Ignore` tests? Otherwise I'd
say just include your previous tests and call it good.


---


[jira] [Commented] (NIFI-5770) Memory Leak in ExecuteScript

2018-10-31 Thread Ivan Omar Olguin Torres (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670193#comment-16670193
 ] 

Ivan Omar Olguin Torres commented on NIFI-5770:
---

If the code fails to compile in an ISP, I see multiple attempts to compile it 
in the logs (around 300 times per minute).

Is it possible that we are getting the same memory leak due to the number of 
compilation attempts?

You can reproduce it by introducing incorrect indentation in any script; you
may then see multiple errors like this:
{code:java}
2018-10-30 13:56:11,103 ERROR - o.a.n.p.script.InvokeScriptedProcessor 
InvokeScriptedProcessor[id=c61d2e40-0166-1000--c870d193] Unable to load 
script: IndentationError: unindent does not match any outer indentation level 
in 

[jira] [Updated] (NIFI-5770) Memory Leak in ExecuteScript

2018-10-31 Thread Ed Berezitsky (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ed Berezitsky updated NIFI-5770:

Attachment: 3117.patch
Status: Patch Available  (was: In Progress)

> Memory Leak in ExecuteScript
> 
>
> Key: NIFI-5770
> URL: https://issues.apache.org/jira/browse/NIFI-5770
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features, performance
> Attachments: 3117.patch, ExecuteScriptMemLeak.xml, jython_modules.zip
>
>
> ExecuteScript with the Jython engine has a memory leak.
> It uses the JythonScriptEngineConfigurator class to configure the Jython
> execution environment.
> The problem is in the line:
> {code:java}
> engine.eval("sys.path.append('" + modulePath + "')");{code}
> There is no check whether a module has already been added.
> As a result, with each execution (onTrigger), the string value of the module
> property is appended again and never reset.
> Although InvokeScriptedProcessor uses the same engine configurator, the memory
> leak is not reproducible in it,
> because ISP builds the engine and compiles the code only once (and rebuilds
> every time any relevant property is changed).
> Attached:
>  * template with a flow to reproduce the bug
>  * simple python modules (to be unpacked under /tmp)
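
One way to add the missing check is to guard the append on the Python side. A minimal sketch follows; the helper name `add_module_path` and the directory `/tmp/jython_modules` are illustrative assumptions and do not come from the issue or its patch.

```python
import sys


def add_module_path(module_path, path_list=None):
    """Append module_path only if it is absent; return True when it was added."""
    paths = sys.path if path_list is None else path_list
    if module_path in paths:
        return False
    paths.append(module_path)
    return True


# Simulate many onTrigger calls against a private list instead of sys.path.
paths = []
for _ in range(1000):
    add_module_path("/tmp/jython_modules", paths)
print(len(paths))  # 1
```

Without the membership test the list would hold 1000 identical entries after the loop, which is the growth pattern the issue describes.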





[jira] [Commented] (NIFI-5718) Performance degraded in ReplaceText processor

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670089#comment-16670089
 ] 

ASF GitHub Bot commented on NIFI-5718:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/3100#discussion_r229685801
  
--- Diff: 
nifi-commons/nifi-utils/src/main/java/org/apache/nifi/stream/io/RepeatingInputStream.java
 ---
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.stream.io;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Objects;
+
+public class RepeatingInputStream extends InputStream {
--- End diff --

I am however tempted to leave the class in, as I've considered writing such 
a thing a few times for measuring performance of some piece of code.


> Performance degraded in ReplaceText processor
> -
>
> Key: NIFI-5718
> URL: https://issues.apache.org/jira/browse/NIFI-5718
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.8.0
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Attachments: Screen Shot 2018-10-17 at 10.55.53 AM.png
>
>
> NIFI-5711 addresses some licensing concerns in the NLKBufferedReader class.
> In doing so, however, it results in lower performance. The ReplaceText 
> processor is affected if the Evaluation Mode is set to Line-by-Line, and the 
> RouteText processor will also be affected. We should be able to match the 
> performance of the previous version.





[jira] [Commented] (NIFI-5770) Memory Leak in ExecuteScript

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670090#comment-16670090
 ] 

ASF GitHub Bot commented on NIFI-5770:
--

GitHub user bdesert opened a pull request:

https://github.com/apache/nifi/pull/3117

NIFI-5770 Fix Memory Leak in ExecuteScript on Jython

Moved module appending (aka classpath in python) into init stage instead of 
running each time onTrigger.

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [x] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [x] Is your initial contribution a single, squashed commit?

### For code changes:
- [x] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bdesert/nifi NIFI-5770_ExecuteScript

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/3117.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3117


commit e6837c81e84b2d7fa29b020c8192b4a2a9783e18
Author: Ed B 
Date:   2018-10-31T13:10:27Z

NIFI-5770 Fix Memory Leak in ExecuteScript on Jython

Moved module appending (aka classpath in python) into init stage instead of 
running each time onTrigger.




> Memory Leak in ExecuteScript
> 
>
> Key: NIFI-5770
> URL: https://issues.apache.org/jira/browse/NIFI-5770
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.8.0
>Reporter: Ed Berezitsky
>Assignee: Ed Berezitsky
>Priority: Major
>  Labels: features, performance
> Attachments: ExecuteScriptMemLeak.xml, jython_modules.zip
>
>
> ExecuteScript with the Jython engine has a memory leak.
> It uses the JythonScriptEngineConfigurator class to configure the Jython
> execution environment.
> The problem is in the line:
> {code:java}
> engine.eval("sys.path.append('" + modulePath + "')");{code}
> There is no check whether a module has already been added.
> As a result, with each execution (onTrigger), the string value of the module
> property is appended again and never reset.
> Although InvokeScriptedProcessor uses the same engine configurator, the memory
> leak is not reproducible in it,
> because ISP builds the engine and compiles the code only once (and rebuilds
> every time any relevant property is changed).
> Attached:
>  * template with a flow to reproduce the bug
>  * simple python modules (to be unpacked under /tmp)





[GitHub] nifi pull request #3100: NIFI-5718: Implemented LineDemarcator and removed N...

2018-10-31 Thread markap14
Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/3100#discussion_r229685801
  
--- Diff: 
nifi-commons/nifi-utils/src/main/java/org/apache/nifi/stream/io/RepeatingInputStream.java
 ---
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.stream.io;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Objects;
+
+public class RepeatingInputStream extends InputStream {
--- End diff --

I am however tempted to leave the class in, as I've considered writing such 
a thing a few times for measuring performance of some piece of code.


---


[GitHub] nifi pull request #3117: NIFI-5770 Fix Memory Leak in ExecuteScript on Jytho...

2018-10-31 Thread bdesert
GitHub user bdesert opened a pull request:

https://github.com/apache/nifi/pull/3117

NIFI-5770 Fix Memory Leak in ExecuteScript on Jython

Moved module appending (aka classpath in python) into init stage instead of 
running each time onTrigger.

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [x] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [x] Is your initial contribution a single, squashed commit?

### For code changes:
- [x] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bdesert/nifi NIFI-5770_ExecuteScript

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/3117.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3117


commit e6837c81e84b2d7fa29b020c8192b4a2a9783e18
Author: Ed B 
Date:   2018-10-31T13:10:27Z

NIFI-5770 Fix Memory Leak in ExecuteScript on Jython

Moved module appending (aka classpath in python) into init stage instead of 
running each time onTrigger.




---


[jira] [Commented] (NIFI-5718) Performance degraded in ReplaceText processor

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670085#comment-16670085
 ] 

ASF GitHub Bot commented on NIFI-5718:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/3100#discussion_r229685528
  
--- Diff: 
nifi-commons/nifi-utils/src/main/java/org/apache/nifi/stream/io/RepeatingInputStream.java
 ---
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.stream.io;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Objects;
+
+public class RepeatingInputStream extends InputStream {
--- End diff --

Interesting. I wrote this to use in some unit tests so that I could easily
test performance, running over several GB of data. The unit tests don't
reference it now... I intended to mark those tests as @Ignore, indicating that
they were only useful for manual testing before/after changes to determine the
performance. But apparently I removed the tests altogether instead of
@Ignore'ing them.
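
The idea behind such a stream (serve a small payload over and over so a test can read several GB without holding them in memory) can be sketched in a few lines. This is not the `RepeatingInputStream` from the PR, whose body is not shown in the diff; the class name `RepeatingStream` and its constructor arguments are assumptions made for illustration.

```python
import io


class RepeatingStream(io.RawIOBase):
    """Readable stream that yields `payload` repeated `repetitions` times."""

    def __init__(self, payload, repetitions):
        super().__init__()
        if not payload:
            raise ValueError("payload must not be empty")
        self._payload = bytes(payload)
        self._total = len(self._payload) * repetitions
        self._pos = 0

    def readable(self):
        return True

    def readinto(self, buffer):
        # Never hand out more than what is left of the virtual stream.
        n = min(len(buffer), self._total - self._pos)
        size = len(self._payload)
        offset = self._pos % size
        # Build just enough copies of the payload to cut n bytes from offset.
        chunk = (self._payload * (n // size + 2))[offset:offset + n]
        buffer[:n] = chunk
        self._pos += n
        return n  # 0 signals end of stream


stream = RepeatingStream(b"abc", 3)
print(stream.read(4))  # b'abca'
print(stream.read())   # b'bcabc'
```

Only one payload-sized buffer plus the current read chunk is alive at any time, so the logical stream length can be far larger than available memory.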


> Performance degraded in ReplaceText processor
> -
>
> Key: NIFI-5718
> URL: https://issues.apache.org/jira/browse/NIFI-5718
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.8.0
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Attachments: Screen Shot 2018-10-17 at 10.55.53 AM.png
>
>
> NIFI-5711 addresses some licensing concerns in the NLKBufferedReader class.
> In doing so, however, it results in lower performance. The ReplaceText 
> processor is affected if the Evaluation Mode is set to Line-by-Line, and the 
> RouteText processor will also be affected. We should be able to match the 
> performance of the previous version.





[GitHub] nifi pull request #3100: NIFI-5718: Implemented LineDemarcator and removed N...

2018-10-31 Thread markap14
Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/3100#discussion_r229685528
  
--- Diff: 
nifi-commons/nifi-utils/src/main/java/org/apache/nifi/stream/io/RepeatingInputStream.java
 ---
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.stream.io;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Objects;
+
+public class RepeatingInputStream extends InputStream {
--- End diff --

Interesting. I wrote this to use in some unit tests so that I could easily
test performance, running over several GB of data. The unit tests don't
reference it now... I intended to mark those tests as @Ignore, indicating that
they were only useful for manual testing before/after changes to determine the
performance. But apparently I removed the tests altogether instead of
@Ignore'ing them.


---


[jira] [Commented] (MINIFICPP-653) Log message will segfault client if no content produced

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/MINIFICPP-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670002#comment-16670002
 ] 

ASF GitHub Bot commented on MINIFICPP-653:
--

Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/427#discussion_r229665932
  
--- Diff: docker/test/integration/minifi/test/__init__.py ---
@@ -142,7 +147,7 @@ def check_output(self, timeout=5):
 self.wait_for_output(timeout)
 self.log_nifi_output()
 
-return self.output_validator.validate()
+return self.output_validator.validate() & ~self.segfault
--- End diff --

I would much rather learn and do what is more generally appropriate for 
that language domain. Thanks!
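
For reference, the usual Python spelling of "valid and no segfault" uses the logical operators rather than the bitwise ones. The sketch below is a hypothetical stand-in for the test harness class in the diff; the name `OutputCheck` and its constructor are assumptions made for illustration.

```python
class OutputCheck:
    """Hypothetical stand-in for the integration-test class in the diff."""

    def __init__(self, validator_passed, segfault=False):
        self.validator_passed = validator_passed
        self.segfault = segfault

    def check_output(self):
        # `and not` keeps the result a real bool and reads as plain logic.
        return self.validator_passed and not self.segfault


print(OutputCheck(True).check_output())                 # True
print(OutputCheck(True, segfault=True).check_output())  # False
```

On bools, `~` is integer inversion (`~True` is `-2`), so the bitwise form in the diff yields an int rather than a bool and depends on both operands being exact bools.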


> Log message will segfault client if no content produced
> ---
>
> Key: MINIFICPP-653
> URL: https://issues.apache.org/jira/browse/MINIFICPP-653
> Project: NiFi MiNiFi C++
>  Issue Type: Improvement
>Reporter: Mr TheSegfault
>Assignee: Mr TheSegfault
>Priority: Blocker
>
> Log message will segfault client if no content produced





[GitHub] nifi-minifi-cpp pull request #427: MINIFICPP-653: Check if empty content, if...

2018-10-31 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/427#discussion_r229665932
  
--- Diff: docker/test/integration/minifi/test/__init__.py ---
@@ -142,7 +147,7 @@ def check_output(self, timeout=5):
 self.wait_for_output(timeout)
 self.log_nifi_output()
 
-return self.output_validator.validate()
+return self.output_validator.validate() & ~self.segfault
--- End diff --

I would much rather learn and do what is more generally appropriate for 
that language domain. Thanks!


---


[jira] [Commented] (MINIFICPP-653) Log message will segfault client if no content produced

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/MINIFICPP-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670001#comment-16670001
 ] 

ASF GitHub Bot commented on MINIFICPP-653:
--

Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/427#discussion_r229665609
  
--- Diff: docker/test/integration/minifi/test/__init__.py ---
@@ -117,12 +119,15 @@ def log_nifi_output(self):
 for container in self.containers:
 container = self.client.containers.get(container.id)
 logging.info('Container logs for container \'%s\':\n%s', 
container.name, container.logs())
+if b'Segmentation fault' in container.logs():
+self.segfault=true
--- End diff --

Thanks! 


> Log message will segfault client if no content produced
> ---
>
> Key: MINIFICPP-653
> URL: https://issues.apache.org/jira/browse/MINIFICPP-653
> Project: NiFi MiNiFi C++
>  Issue Type: Improvement
>Reporter: Mr TheSegfault
>Assignee: Mr TheSegfault
>Priority: Blocker
>
> Log message will segfault client if no content produced





[GitHub] nifi-minifi-cpp pull request #427: MINIFICPP-653: Check if empty content, if...

2018-10-31 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/427#discussion_r229665609
  
--- Diff: docker/test/integration/minifi/test/__init__.py ---
@@ -117,12 +119,15 @@ def log_nifi_output(self):
 for container in self.containers:
 container = self.client.containers.get(container.id)
 logging.info('Container logs for container \'%s\':\n%s', 
container.name, container.logs())
+if b'Segmentation fault' in container.logs():
+self.segfault=true
--- End diff --

Thanks! 


---


[GitHub] nifi pull request #2297: #77864: Fixed error handling for the large file (Ou...

2018-10-31 Thread artemgolubnichenko
Github user artemgolubnichenko closed the pull request at:

https://github.com/apache/nifi/pull/2297


---


[jira] [Commented] (NIFI-5752) Load balancing fails with wildcard certs

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669826#comment-16669826
 ] 

ASF GitHub Bot commented on NIFI-5752:
--

Github user kotarot commented on the issue:

https://github.com/apache/nifi/pull/3110
  
@markap14 Thank you for your kind advice! That makes sense to me.

In the new commit, I have kept the existing authorization code and added 
authorization using `HostnameVerifier` after it. The 
authorization is performed as follows:
- If authorization by string match succeeds, it returns the 
node information from the Client Identities (as we fixed in #3109).
- After that, if authorization by `HostnameVerifier` succeeds, it 
returns the hostname derived from the socket.

Does this change look OK?

Also, I modified a few tests related to `LoadBalanceAuthorizer` because the 
interface of `authorize` has changed.

@ijokarumawak Could you please review it?
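
The two-step authorization described in this comment can be sketched roughly as follows. This is a minimal Python model, not the NiFi Java code: `authorize`, `wildcard_matches`, and `NotAuthorizedError` are hypothetical names, the wildcard rule is simplified (a `*` may only stand for the whole left-most label), and the check that the socket hostname is a known node is an assumption about the intended behavior.

```python
class NotAuthorizedError(Exception):
    """Raised when no client identity can be tied to a known cluster node."""


def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Simplified wildcard check: '*' may only replace the whole left-most label."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False
    if p_labels[0] == "*":
        return bool(h_labels[0]) and p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


def authorize(client_identities, node_identifiers, socket_hostname):
    """Return the identity to use for the peer, or raise NotAuthorizedError."""
    # Step 1: the existing behavior, a plain string match against known node IDs.
    for identity in client_identities:
        if identity in node_identifiers:
            return identity

    # Step 2 (assumed): fall back to hostname verification, so that a wildcard
    # identity such as '*.example.com' can vouch for a known node's hostname.
    if socket_hostname in node_identifiers and any(
        wildcard_matches(identity, socket_hostname) for identity in client_identities
    ):
        return socket_hostname

    raise NotAuthorizedError(
        f"Client IDs {sorted(client_identities)} are not authorized to load balance data"
    )
```

With the cluster from the issue description (`nf1.example.com`, `nf2.example.com`, `nf3.example.com`) and a client presenting only `*.example.com`, step 1 finds no match and step 2 returns the socket hostname if it is one of the three nodes.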


> Load balancing fails with wildcard certs
> 
>
> Key: NIFI-5752
> URL: https://issues.apache.org/jira/browse/NIFI-5752
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.8.0
>Reporter: Kotaro Terada
>Assignee: Kotaro Terada
>Priority: Major
>
> Load balancing fails when we construct a secure cluster with wildcard certs.
> For example, assume that we have a valid wildcard cert for {{*.example.com}} 
> and a cluster consisting of {{nf1.example.com}}, {{nf2.example.com}}, and 
> {{nf3.example.com}}. We cannot transfer a FlowFile between nodes for load 
> balancing because of the following authorization error:
> {noformat}
> 2018-10-25 19:05:13,520 WARN [Load Balance Server Thread-2] 
> o.a.n.c.q.c.s.ClusterLoadBalanceAuthorizer Authorization failed for Client 
> ID's [*.example.com] to Load Balance data because none of the ID's are known 
> Cluster Node Identifiers
> 2018-10-25 19:05:13,521 ERROR [Load Balance Server Thread-2] 
> o.a.n.c.q.c.s.ConnectionLoadBalanceServer Failed to communicate with Peer 
> /xxx.xxx.xxx.xxx:x
> org.apache.nifi.controller.queue.clustered.server.NotAuthorizedException: 
> Client ID's [*.example.com] are not authorized to Load Balance data
>   at 
> org.apache.nifi.controller.queue.clustered.server.ClusterLoadBalanceAuthorizer.authorize(ClusterLoadBalanceAuthorizer.java:65)
>   at 
> org.apache.nifi.controller.queue.clustered.server.StandardLoadBalanceProtocol.receiveFlowFiles(StandardLoadBalanceProtocol.java:142)
>   at 
> org.apache.nifi.controller.queue.clustered.server.ConnectionLoadBalanceServer$CommunicateAction.run(ConnectionLoadBalanceServer.java:176)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> This problem occurs because the {{authorize}} method of the 
> {{ClusterLoadBalanceAuthorizer}} class attempts authorization only by 
> matching strings.





[GitHub] nifi issue #3110: NIFI-5752: Load balancing fails with wildcard certs

2018-10-31 Thread kotarot
Github user kotarot commented on the issue:

https://github.com/apache/nifi/pull/3110
  
@markap14 Thank you for your kind advice! That makes sense to me.

In the new commit, I have kept the existing authorization code and added 
authorization using `HostnameVerifier` after it. The 
authorization is performed as follows:
- If authorization by string match succeeds, it returns the 
node information from the Client Identities (as we fixed in #3109).
- After that, if authorization by `HostnameVerifier` succeeds, it 
returns the hostname derived from the socket.

Does this change look OK?

Also, I modified a few tests related to `LoadBalanceAuthorizer` because the 
interface of `authorize` has changed.

@ijokarumawak Could you please review it?


---


[jira] [Commented] (MINIFICPP-653) Log message will segfault client if no content produced

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/MINIFICPP-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669726#comment-16669726
 ] 

ASF GitHub Bot commented on MINIFICPP-653:
--

Github user arpadboda commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/427#discussion_r229595113
  
--- Diff: docker/test/integration/minifi/test/__init__.py ---
@@ -142,7 +147,7 @@ def check_output(self, timeout=5):
 self.wait_for_output(timeout)
 self.log_nifi_output()
 
-return self.output_validator.validate()
+return self.output_validator.validate() & ~self.segfault
--- End diff --

I have hardly seen it, but the question was meant to ask what we plan to 
achieve here. 

I just ran through the code to understand what's going on here. This is 
fine, but using logical expressions is a more _pythonic_ way to achieve the same:

```
return self.output_validator.validate() and not self.segfault
```
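
To illustrate the difference between the two forms, here is a small self-contained comparison. The function names are hypothetical and stand in for the `check_output` return expression. The bitwise form yields an `int` whose truthiness happens to match, while the logical form yields a real `bool`. Note also that `~` on a `bool` is deprecated as of Python 3.12, which is why the sketch silences that warning.

```python
import warnings


def check_bitwise(valid: bool, segfault: bool):
    """Mirror of `validate() & ~self.segfault` from the diff."""
    with warnings.catch_warnings():
        # `~` applied to a bool emits a DeprecationWarning on Python 3.12+.
        warnings.simplefilter("ignore", DeprecationWarning)
        return valid & ~segfault  # ~False == -1, ~True == -2


def check_logical(valid: bool, segfault: bool) -> bool:
    """Mirror of the suggested `validate() and not self.segfault`."""
    return valid and not segfault


for valid in (True, False):
    for segfault in (True, False):
        print(valid, segfault, check_bitwise(valid, segfault), check_logical(valid, segfault))
```

Both variants agree on truthiness for genuine `bool` inputs, so the original line is not wrong as such; the logical form simply states the intent directly and keeps the result a `bool`.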


> Log message will segfault client if no content produced
> ---
>
> Key: MINIFICPP-653
> URL: https://issues.apache.org/jira/browse/MINIFICPP-653
> Project: NiFi MiNiFi C++
>  Issue Type: Improvement
>Reporter: Mr TheSegfault
>Assignee: Mr TheSegfault
>Priority: Blocker
>
> Log message will segfault client if no content produced





[GitHub] nifi-minifi-cpp pull request #427: MINIFICPP-653: Check if empty content, if...

2018-10-31 Thread arpadboda
Github user arpadboda commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/427#discussion_r229595113
  
--- Diff: docker/test/integration/minifi/test/__init__.py ---
@@ -142,7 +147,7 @@ def check_output(self, timeout=5):
 self.wait_for_output(timeout)
 self.log_nifi_output()
 
-return self.output_validator.validate()
+return self.output_validator.validate() & ~self.segfault
--- End diff --

I have hardly seen it, but the question was meant to ask what we plan to 
achieve here. 

I just ran through the code to understand what's going on here. This is 
fine, but using logical expressions is a more _pythonic_ way to achieve the same:

```
return self.output_validator.validate() and not self.segfault
```


---


[jira] [Commented] (MINIFICPP-653) Log message will segfault client if no content produced

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/MINIFICPP-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669723#comment-16669723
 ] 

ASF GitHub Bot commented on MINIFICPP-653:
--

Github user arpadboda commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/427#discussion_r229593381
  
--- Diff: docker/test/integration/minifi/test/__init__.py ---
@@ -117,12 +119,15 @@ def log_nifi_output(self):
 for container in self.containers:
 container = self.client.containers.get(container.id)
 logging.info('Container logs for container \'%s\':\n%s', 
container.name, container.logs())
+if b'Segmentation fault' in container.logs():
+self.segfault=true
--- End diff --

Actually this line is wrong, this should be True (capital 't')
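
A short sketch of why this matters in practice: Python has no `true` literal, so the assignment raises `NameError`, but only when the branch actually runs, i.e. only when a segfault shows up in the logs. `LogChecker` and its methods are hypothetical stand-ins for the test harness class in the diff.

```python
class LogChecker:
    def __init__(self):
        self.segfault = False

    def scan_buggy(self, log_bytes: bytes) -> None:
        if b'Segmentation fault' in log_bytes:
            # Bug from the diff: `true` is an undefined name in Python.
            self.segfault = true  # noqa: F821

    def scan_fixed(self, log_bytes: bytes) -> None:
        if b'Segmentation fault' in log_bytes:
            self.segfault = True


checker = LogChecker()
checker.scan_buggy(b'everything is fine')  # no error: the branch is not taken
try:
    checker.scan_buggy(b'Segmentation fault (core dumped)')
except NameError as err:
    print('buggy variant failed:', err)

checker.scan_fixed(b'Segmentation fault (core dumped)')
print('segfault detected:', checker.segfault)
```

Because the error is deferred until a segfault is logged, a run with healthy containers would not reveal the typo.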


> Log message will segfault client if no content produced
> ---
>
> Key: MINIFICPP-653
> URL: https://issues.apache.org/jira/browse/MINIFICPP-653
> Project: NiFi MiNiFi C++
>  Issue Type: Improvement
>Reporter: Mr TheSegfault
>Assignee: Mr TheSegfault
>Priority: Blocker
>
> Log message will segfault client if no content produced





[GitHub] nifi-minifi-cpp pull request #427: MINIFICPP-653: Check if empty content, if...

2018-10-31 Thread arpadboda
Github user arpadboda commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/427#discussion_r229593381
  
--- Diff: docker/test/integration/minifi/test/__init__.py ---
@@ -117,12 +119,15 @@ def log_nifi_output(self):
 for container in self.containers:
 container = self.client.containers.get(container.id)
 logging.info('Container logs for container \'%s\':\n%s', 
container.name, container.logs())
+if b'Segmentation fault' in container.logs():
+self.segfault=true
--- End diff --

Actually this line is wrong, this should be True (capital 't')


---


[jira] [Commented] (NIFIREG-209) Support rebuilding metadata DB from Git repo

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFIREG-209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669718#comment-16669718
 ] 

ASF GitHub Bot commented on NIFIREG-209:


Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi-registry/pull/144
  
Oh, this is going to be very useful! I'm reviewing now. Thanks!


> Support rebuilding metadata DB from Git repo
> 
>
> Key: NIFIREG-209
> URL: https://issues.apache.org/jira/browse/NIFIREG-209
> Project: NiFi Registry
>  Issue Type: Improvement
>Reporter: Bryan Bende
>Assignee: Bryan Bende
>Priority: Major
>
> Since the release of git persistence for flow storage, many users have asked 
> if there is a way to stand up a new NiFi Registry instance and just point it 
> at an existing git repo of flows.
> Currently the issue is that the git repo is only used for the persistence of 
> flow content, and the metadata comes from a relational database, so if you 
> lost your server and don't have a copy of the DB then the git repo alone 
> isn't enough.
> In general the DB should be backed up, or an external DB with HA (Postgres) 
> should be used instead of the H2 DB, but we should also be able to offer a 
> way to bootstrap a new NiFi Registry instance from a git repo of flows.





[GitHub] nifi-registry issue #144: NIFIREG-209 Rebuild metadata DB from FlowPersisten...

2018-10-31 Thread ijokarumawak
Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi-registry/pull/144
  
Oh, this is going to be very useful! I'm reviewing now. Thanks!


---


[jira] [Commented] (MINIFICPP-640) C API: how to support dynamic properties?

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/MINIFICPP-640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669712#comment-16669712
 ] 

ASF GitHub Bot commented on MINIFICPP-640:
--

Github user arpadboda closed the pull request at:

https://github.com/apache/nifi-minifi-cpp/pull/431


> C API: how to support dynamic properties?
> -
>
> Key: MINIFICPP-640
> URL: https://issues.apache.org/jira/browse/MINIFICPP-640
> Project: NiFi MiNiFi C++
>  Issue Type: Brainstorming
>Reporter: Arpad Boda
>Assignee: Arpad Boda
>Priority: Minor
>  Labels: CAPI
> Fix For: 0.6.0
>
>
> The current C API implementation only allows static properties to be 
> set/updated, which means processors that would require dynamic attributes, 
> such as UpdateAttribute cannot be configured properly. 
> The question is whether the difference between static and dynamic properties 
> should be transparent via the API or this should be hidden and handled 
> inside. 





[GitHub] nifi-minifi-cpp pull request #431: MINIFICPP-640 - C API: how to support dyn...

2018-10-31 Thread arpadboda
Github user arpadboda closed the pull request at:

https://github.com/apache/nifi-minifi-cpp/pull/431


---


[GitHub] nifi-minifi-cpp issue #431: MINIFICPP-640 - C API: how to support dynamic pr...

2018-10-31 Thread arpadboda
Github user arpadboda commented on the issue:

https://github.com/apache/nifi-minifi-cpp/pull/431
  
> @arpadboda this was merged but I used the wrong commit hook to close it. 
I closed the wrong PR. So can you close this when you have a chance? Sorry 
about that.

Sure, done


---


[jira] [Commented] (MINIFICPP-640) C API: how to support dynamic properties?

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/MINIFICPP-640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669710#comment-16669710
 ] 

ASF GitHub Bot commented on MINIFICPP-640:
--

Github user arpadboda commented on the issue:

https://github.com/apache/nifi-minifi-cpp/pull/431
  
> @arpadboda this was merged but I used the wrong commit hook to close it. 
I closed the wrong PR. So can you close this when you have a chance? Sorry 
about that.

Sure, done


> C API: how to support dynamic properties?
> -
>
> Key: MINIFICPP-640
> URL: https://issues.apache.org/jira/browse/MINIFICPP-640
> Project: NiFi MiNiFi C++
>  Issue Type: Brainstorming
>Reporter: Arpad Boda
>Assignee: Arpad Boda
>Priority: Minor
>  Labels: CAPI
> Fix For: 0.6.0
>
>
> The current C API implementation only allows static properties to be 
> set/updated, which means processors that would require dynamic attributes, 
> such as UpdateAttribute cannot be configured properly. 
> The question is whether the difference between static and dynamic properties 
> should be transparent via the API or this should be hidden and handled 
> inside. 





[jira] [Resolved] (MINIFICPP-640) C API: how to support dynamic properties?

2018-10-31 Thread Arpad Boda (JIRA)


 [ 
https://issues.apache.org/jira/browse/MINIFICPP-640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpad Boda resolved MINIFICPP-640.
--
   Resolution: Fixed
Fix Version/s: 0.6.0

> C API: how to support dynamic properties?
> -
>
> Key: MINIFICPP-640
> URL: https://issues.apache.org/jira/browse/MINIFICPP-640
> Project: NiFi MiNiFi C++
>  Issue Type: Brainstorming
>Reporter: Arpad Boda
>Assignee: Arpad Boda
>Priority: Minor
>  Labels: CAPI
> Fix For: 0.6.0
>
>
> The current C API implementation only allows static properties to be 
> set/updated, which means processors that would require dynamic attributes, 
> such as UpdateAttribute cannot be configured properly. 
> The question is whether the difference between static and dynamic properties 
> should be transparent via the API or this should be hidden and handled 
> inside. 





[jira] [Updated] (NIFI-4715) ListS3 produces duplicates in frequently updated buckets

2018-10-31 Thread Koji Kawamura (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Kawamura updated NIFI-4715:

Status: Patch Available  (was: In Progress)

> ListS3 produces duplicates in frequently updated buckets
> 
>
> Key: NIFI-4715
> URL: https://issues.apache.org/jira/browse/NIFI-4715
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.4.0, 1.3.0, 1.2.0
> Environment: All
>Reporter: Milan Das
>Assignee: Koji Kawamura
>Priority: Major
> Attachments: List-S3-dup-issue.xml, ListS3_Duplication.xml, 
> screenshot-1.png
>
>
> ListS3 state is implemented using a HashSet, and HashSet is not thread safe. When 
> ListS3 operates in multi-threaded mode, it sometimes tries to list the same 
> file from the S3 bucket. It seems the HashSet data is getting corrupted.
> currentKeys = new HashSet<>(); // needs to be implemented in a thread-safe way, like 
> currentKeys = //ConcurrentHashMap.newKeySet();
> *{color:red}+Update+{color}*:
> This is not a HashSet issue.
> The root cause is as follows: 
> The problem occurs when a file is uploaded to S3 while ListS3 is in progress.
> In onTrigger, maxTimestamp is initialized to 0L.
> This causes the keys to be cleared, as in the code below.
> When the lastModified time of an S3 object is the same as currentTimestamp for a listed 
> key, the object should be skipped. Because the keys have been cleared, the same file is loaded 
> again. 
> I think the fix should be to initialize maxTimestamp with currentTimestamp, not 
> 0L.
> {code}
>  long maxTimestamp = currentTimestamp;
> {code}
> The following block clears the keys.
> {code:title=org.apache.nifi.processors.aws.s3.ListS3.java|borderStyle=solid}
>  if (lastModified > maxTimestamp) {
>      maxTimestamp = lastModified;
>      currentKeys.clear();
>      getLogger().debug("clearing keys");
>  }
> {code}
> Update: 01/03/2018
> There is one more variant of the same defect.
> Suppose file1 was modified at 1514987611000 on S3 and currentTimestamp = 
> 1514987311000 in state.
> 1. The file is picked up, and the current state is updated to 
> currentTimestamp=1514987311000 (but the OS system time is 1514987611000).
> 2. In the next cycle, for file2 with lastModified 1514987611000, the keys are 
> cleared because lastModified > maxTimestamp 
> (=currentTimestamp=1514987311000). 
> currentTimestamp is saved as 1514987611000.
> 3. In the next cycle, currentTimestamp=1514987611000, and "file1 modified at 
> 1514987611000" is picked up again because file1 is no longer in the keys.
> I think the solution is that currentTimestamp needs to persist the current system 
> timestamp.





[jira] [Commented] (NIFI-4715) ListS3 produces duplicates in frequently updated buckets

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669701#comment-16669701
 ] 

ASF GitHub Bot commented on NIFI-4715:
--

Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi/pull/2361
  
Hi @adamlamar, I hope this message finds you well. Since some other users 
asked about this issue, I went ahead and took over the remaining concerns 
around updating `currentKeys` during the listing loop, and submitted another PR, #3116, 
based on your commits.

When it gets merged, this PR will be closed automatically. If you have any 
comments, please keep discussing on the new PR. Thanks again for your 
contribution!


> ListS3 produces duplicates in frequently updated buckets
> 
>
> Key: NIFI-4715
> URL: https://issues.apache.org/jira/browse/NIFI-4715
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.2.0, 1.3.0, 1.4.0
> Environment: All
>Reporter: Milan Das
>Assignee: Koji Kawamura
>Priority: Major
> Attachments: List-S3-dup-issue.xml, ListS3_Duplication.xml, 
> screenshot-1.png
>
>
> ListS3 state is implemented using a HashSet, and HashSet is not thread safe. When 
> ListS3 operates in multi-threaded mode, it sometimes tries to list the same 
> file from the S3 bucket. It seems the HashSet data is getting corrupted.
> currentKeys = new HashSet<>(); // needs to be implemented in a thread-safe way, like 
> currentKeys = //ConcurrentHashMap.newKeySet();
> *{color:red}+Update+{color}*:
> This is not a HashSet issue.
> The root cause is as follows: 
> The problem occurs when a file is uploaded to S3 while ListS3 is in progress.
> In onTrigger, maxTimestamp is initialized to 0L.
> This causes the keys to be cleared, as in the code below.
> When the lastModified time of an S3 object is the same as currentTimestamp for a listed 
> key, the object should be skipped. Because the keys have been cleared, the same file is loaded 
> again. 
> I think the fix should be to initialize maxTimestamp with currentTimestamp, not 
> 0L.
> {code}
>  long maxTimestamp = currentTimestamp;
> {code}
> The following block clears the keys.
> {code:title=org.apache.nifi.processors.aws.s3.ListS3.java|borderStyle=solid}
>  if (lastModified > maxTimestamp) {
>      maxTimestamp = lastModified;
>      currentKeys.clear();
>      getLogger().debug("clearing keys");
>  }
> {code}
> Update: 01/03/2018
> There is one more variant of the same defect.
> Suppose file1 was modified at 1514987611000 on S3 and currentTimestamp = 
> 1514987311000 in state.
> 1. The file is picked up, and the current state is updated to 
> currentTimestamp=1514987311000 (but the OS system time is 1514987611000).
> 2. In the next cycle, for file2 with lastModified 1514987611000, the keys are 
> cleared because lastModified > maxTimestamp 
> (=currentTimestamp=1514987311000). 
> currentTimestamp is saved as 1514987611000.
> 3. In the next cycle, currentTimestamp=1514987611000, and "file1 modified at 
> 1514987611000" is picked up again because file1 is no longer in the keys.
> I think the solution is that currentTimestamp needs to persist the current system 
> timestamp.





[GitHub] nifi issue #2361: NIFI-4715: ListS3 produces duplicates in frequently update...

2018-10-31 Thread ijokarumawak
Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi/pull/2361
  
Hi @adamlamar, I hope this message finds you well. Since some other users 
asked about this issue, I went ahead and took over the remaining concerns 
around updating `currentKeys` during the listing loop, and submitted another PR, #3116, 
based on your commits.

When it gets merged, this PR will be closed automatically. If you have any 
comments, please keep discussing on the new PR. Thanks again for your 
contribution!


---


[jira] [Commented] (NIFI-4715) ListS3 produces duplicates in frequently updated buckets

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669697#comment-16669697
 ] 

ASF GitHub Bot commented on NIFI-4715:
--

GitHub user ijokarumawak opened a pull request:

https://github.com/apache/nifi/pull/3116

NIFI-4715: ListS3 produces duplicates in frequently updated buckets

This PR is based on #2361. To preserve @adamlamar's credit, please do not 
squash the first commit when merging. Thanks!

The 2nd commit avoids updating `currentKeys` during the listing loop. 
Before this fix, it was easy to reproduce a duplicated listing with a small number of 
objects. E.g. with 10 objects uploaded to S3 at the same time, ListS3 can produce 27 
FlowFiles. Using min age doesn't address the issue.

Please use [the template file attached to the 
JIRA](https://issues.apache.org/jira/secure/attachment/12946341/ListS3_Duplication.xml)
 to reproduce.

After applying this fix, I confirmed ListS3 can produce FlowFiles without 
duplication. In my test, 10,000 objects were listed without duplication while 
they were being uploaded by PutS3 and listed by ListS3 simultaneously.


---

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [x] Does your PR title start with NIFI- where  is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijokarumawak/nifi nifi-4715

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/3116.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3116


commit 3853b13806121c1479edac2038634992ffc6fdfe
Author: Adam Lamar 
Date:   2017-12-24T03:29:02Z

NIFI-4715: ListS3 produces duplicates in frequently updated buckets

Keep totalListCount, reduce unnecessary persistState

This closes #2361.

Signed-off-by: Koji Kawamura 

commit 4d445055cf605811f85bfed12b33155adbd570a2
Author: Koji Kawamura 
Date:   2018-10-31T07:01:36Z

NIFI-4715: Update currentKeys after listing loop

ListS3 used to update currentKeys within the listing loop, which caused
duplicates. Because S3 returns the object list in lexicographic order, if we
clear currentKeys during the loop, we cannot tell whether an object has
already been listed in the case where a newer object has a lexicographically
earlier name.
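
The failure mode described in this commit message can be reproduced with a small simulation. This is a simplified Python model of the listing logic, not the ListS3 Java code: `list_once`, the `state` dictionary, and the sample keys are all hypothetical, and the skip conditions are an assumption based on the issue description.

```python
def list_once(objects, state, clear_during_loop):
    """Run one listing cycle and return the keys that would be emitted.

    objects: iterable of (key, last_modified) pairs.
    state:   dict with 'timestamp' (int) and 'keys' (set), updated in place.
    """
    emitted = []
    max_ts = state["timestamp"]
    keys_at_max = set()
    for key, last_modified in sorted(objects):  # S3 lists in lexicographic order
        if last_modified < state["timestamp"]:
            continue  # older than the persisted timestamp
        if last_modified == state["timestamp"] and key in state["keys"]:
            continue  # already listed in an earlier cycle
        emitted.append(key)
        if last_modified > max_ts:
            max_ts = last_modified
            keys_at_max = set()
            if clear_during_loop:
                state["keys"].clear()  # old behavior: forget listed keys too early
        if last_modified == max_ts:
            keys_at_max.add(key)
            if clear_during_loop:
                state["keys"].add(key)
    if not clear_during_loop:
        # Fixed behavior: touch the persisted keys only after the loop.
        if max_ts > state["timestamp"]:
            state["keys"] = keys_at_max
        else:
            state["keys"] |= keys_at_max
    state["timestamp"] = max_ts
    return emitted


def run(clear_during_loop):
    state = {"timestamp": 0, "keys": set()}
    first = list_once([("a", 10), ("b", 10)], state, clear_during_loop)
    # A newer object with a lexicographically earlier name appears.
    second = list_once([("a", 10), ("b", 10), ("0new", 11)], state, clear_during_loop)
    return first, second


print("clear during loop:", run(True))
print("update after loop:", run(False))
```

In the second cycle, `0new` is visited first and advances the maximum timestamp. If the key set is cleared at that point, `a` and `b` no longer look "already listed" and are emitted again; deferring the update until after the loop avoids that.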




> ListS3 produces duplicates in frequently updated buckets
> 
>
> Key: NIFI-4715
> URL: https://issues.apache.org/jira/browse/NIFI-4715
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.2.0, 1.3.0, 1.4.0
> Environment: All
>Reporter: Milan Das
>Assignee: Koji Kawamura
>Priority: Major
> Attachments: List-S3-dup-issue.xml, ListS3_Duplication.xml, 
> screenshot-1.png
>
>
> ListS3 state is implemented using HashSet. HashSet is not thread safe. When 
> ListS3 operates in multi threaded mode, sometimes it  tries to list  same 
> file from S3 bucket.  Seems like HashSet data 

[GitHub] nifi pull request #3116: NIFI-4715: ListS3 produces duplicates in frequently...

2018-10-31 Thread ijokarumawak
GitHub user ijokarumawak opened a pull request:

https://github.com/apache/nifi/pull/3116

NIFI-4715: ListS3 produces duplicates in frequently updated buckets

This PR is based on #2361. To preserve @adamlamar's credit, please do not 
squash the first commit when merging. Thanks!

The 2nd commit avoids updating `currentKeys` during the listing loop. 
Before this fix, it was easy to reproduce a duplicated listing with a small number of 
objects. E.g. with 10 objects uploaded to S3 at the same time, ListS3 can produce 27 
FlowFiles. Using min age doesn't address the issue.

Please use [the template file attached to the 
JIRA](https://issues.apache.org/jira/secure/attachment/12946341/ListS3_Duplication.xml)
 to reproduce.

After applying this fix, I confirmed ListS3 can produce FlowFiles without 
duplication. In my test, 10,000 objects were listed without duplication while 
they were being uploaded by PutS3 and listed by ListS3 simultaneously.


---

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [x] Does your PR title start with NIFI- where  is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijokarumawak/nifi nifi-4715

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/3116.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3116


commit 3853b13806121c1479edac2038634992ffc6fdfe
Author: Adam Lamar 
Date:   2017-12-24T03:29:02Z

NIFI-4715: ListS3 produces duplicates in frequently updated buckets

Keep totalListCount, reduce unnecessary persistState

This closes #2361.

Signed-off-by: Koji Kawamura 

commit 4d445055cf605811f85bfed12b33155adbd570a2
Author: Koji Kawamura 
Date:   2018-10-31T07:01:36Z

NIFI-4715: Update currentKeys after listing loop

ListS3 used to update currentKeys within the listing loop, which caused
duplicates. Because S3 returns the object list in lexicographic order, if we
clear currentKeys during the loop, we cannot tell whether an object has
already been listed in the case where a newer object has a lexicographically
earlier name.




---


[jira] [Updated] (NIFI-4715) ListS3 produces duplicates in frequently updated buckets

2018-10-31 Thread Koji Kawamura (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Kawamura updated NIFI-4715:

Attachment: ListS3_Duplication.xml

> ListS3 produces duplicates in frequently updated buckets
> 
>
> Key: NIFI-4715
> URL: https://issues.apache.org/jira/browse/NIFI-4715
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.2.0, 1.3.0, 1.4.0
> Environment: All
>Reporter: Milan Das
>Assignee: Koji Kawamura
>Priority: Major
> Attachments: List-S3-dup-issue.xml, ListS3_Duplication.xml, 
> screenshot-1.png
>
>
> ListS3 state is implemented using a HashSet, and HashSet is not thread safe. When 
> ListS3 operates in multi-threaded mode, it sometimes tries to list the same 
> file from the S3 bucket. It seems the HashSet data is getting corrupted.
> currentKeys = new HashSet<>(); // needs to be implemented in a thread-safe way, like 
> currentKeys = //ConcurrentHashMap.newKeySet();
> *{color:red}+Update+{color}*:
> This is not a HashSet issue.
> The root cause is as follows: 
> The problem occurs when a file is uploaded to S3 while ListS3 is in progress.
> In onTrigger, maxTimestamp is initialized to 0L.
> This causes the keys to be cleared, as in the code below.
> When the lastModified time of an S3 object is the same as currentTimestamp for a listed 
> key, the object should be skipped. Because the keys have been cleared, the same file is loaded 
> again. 
> I think the fix should be to initialize maxTimestamp with currentTimestamp, not 
> 0L.
> {code}
>  long maxTimestamp = currentTimestamp;
> {code}
> The following block clears the keys.
> {code:title=org.apache.nifi.processors.aws.s3.ListS3.java|borderStyle=solid}
>  if (lastModified > maxTimestamp) {
>      maxTimestamp = lastModified;
>      currentKeys.clear();
>      getLogger().debug("clearing keys");
>  }
> {code}
> Update: 01/03/2018
> There is one more variant of the same defect.
> Suppose file1 was modified at 1514987611000 on S3 and currentTimestamp = 
> 1514987311000 in state.
> 1. The file is picked up, and the current state is updated to 
> currentTimestamp=1514987311000 (but the OS system time is 1514987611000).
> 2. In the next cycle, for file2 with lastModified 1514987611000, the keys are 
> cleared because lastModified > maxTimestamp 
> (=currentTimestamp=1514987311000). 
> currentTimestamp is saved as 1514987611000.
> 3. In the next cycle, currentTimestamp=1514987611000, and "file1 modified at 
> 1514987611000" is picked up again because file1 is no longer in the keys.
> I think the solution is that currentTimestamp needs to persist the current system 
> timestamp.





[jira] [Assigned] (NIFI-4715) ListS3 produces duplicates in frequently updated buckets

2018-10-31 Thread Koji Kawamura (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Kawamura reassigned NIFI-4715:
---

Assignee: Koji Kawamura

> ListS3 produces duplicates in frequently updated buckets
> 
>
> Key: NIFI-4715
> URL: https://issues.apache.org/jira/browse/NIFI-4715
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.2.0, 1.3.0, 1.4.0
> Environment: All
> Reporter: Milan Das
> Assignee: Koji Kawamura
> Priority: Major
> Attachments: List-S3-dup-issue.xml, screenshot-1.png
>
>
> ListS3 state is implemented using a HashSet, which is not thread safe. When 
> ListS3 operates in multi-threaded mode, it sometimes lists the same 
> file from the S3 bucket more than once. It seems the HashSet data is getting 
> corrupted.
> currentKeys = new HashSet<>(); // should be thread safe, e.g. 
> // currentKeys = ConcurrentHashMap.newKeySet();
> *{color:red}+Update+{color}*:
> This is not a HashSet issue.
> Root cause: 
> The duplicate occurs when a file is uploaded to S3 while a ListS3 listing is 
> in progress.
> In onTrigger, maxTimestamp is initialized to 0L.
> This causes the keys to be cleared, as shown in the code below.
> When the lastModified time of an S3 object equals currentTimestamp and its 
> key is already listed, the object should be skipped. Because the keys were 
> cleared, the same file is listed again. 
> I think the fix should be to initialize maxTimestamp with currentTimestamp 
> instead of 0L.
> {code}
>  long maxTimestamp = currentTimestamp;
> {code}
> The following block clears the keys.
> {code:title=org.apache.nifi.processors.aws.s3.ListS3.java|borderStyle=solid}
> if (lastModified > maxTimestamp) {
>     maxTimestamp = lastModified;
>     currentKeys.clear();
>     getLogger().debug("clearing keys");
> }
> {code}
> Update: 01/03/2018
> There is one more flavor of the same defect.
> Suppose file1 is modified at 1514987611000 on S3 and currentTimestamp = 
> 1514987311000 in state.
> 1. The file is picked up, and at that time the state is updated to 
> currentTimestamp=1514987311000 (but the OS system time is 1514987611000).
> 2. In the next cycle, for file2 with lastModified 1514987611000, the keys are 
> cleared because lastModified > maxTimestamp 
> (=currentTimestamp=1514987311000). 
> currentTimestamp is saved as 1514987611000.
> 3. In the next cycle, currentTimestamp=1514987611000 and "file1 modified at 
> 1514987611000" is picked up again because file1 is no longer in the keys.
> I think the solution is that currentTimestamp needs to be persisted as the 
> current system timestamp.
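
The first root cause described above (maxTimestamp starting at 0L, so the
remembered keys are cleared when a second object arrives with the same
timestamp) can be replayed with a small stand-alone model. This is only a
sketch under stated assumptions, not the ListS3 source: the class
ListingStateSketch, its methods, the skip rule, and the rule that state is
updated only when something was listed are all hypothetical simplifications.
Only the "clear keys when lastModified > maxTimestamp" block is taken from
the issue text.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of the listing state; not the NiFi source. */
class ListingStateSketch {
    private long currentTimestamp = 0L;                       // timestamp kept in state
    private final Set<String> currentKeys = new HashSet<>();  // keys listed at that timestamp
    private final boolean seedFromState;                      // true = fix proposed in the issue

    ListingStateSketch(boolean seedFromState) {
        this.seedFromState = seedFromState;
    }

    /** Runs one listing cycle over key -> lastModified and returns the keys it lists. */
    List<String> listOnce(Map<String, Long> bucket) {
        long maxTimestamp = seedFromState ? currentTimestamp : 0L;
        List<String> listed = new ArrayList<>();
        for (Map.Entry<String, Long> entry : bucket.entrySet()) {
            String key = entry.getKey();
            long lastModified = entry.getValue();
            // Assumed skip rule: older objects, or objects already listed at currentTimestamp.
            if (lastModified < currentTimestamp
                    || (lastModified == currentTimestamp && currentKeys.contains(key))) {
                continue;
            }
            listed.add(key);
            // Block quoted in the issue: a newer timestamp clears the remembered keys.
            if (lastModified > maxTimestamp) {
                maxTimestamp = lastModified;
                currentKeys.clear();
            }
            if (lastModified == maxTimestamp) {
                currentKeys.add(key);
            }
        }
        // Sketch assumption: state only changes when something was listed.
        if (!listed.isEmpty()) {
            currentTimestamp = maxTimestamp;
        }
        return listed;
    }

    /** Replays three cycles; file2 appears in cycle 2 with the same timestamp as file1. */
    static List<String> thirdCycle(boolean seedFromState) {
        long t = 1514987611000L;
        ListingStateSketch state = new ListingStateSketch(seedFromState);
        Map<String, Long> bucket = new LinkedHashMap<>();
        bucket.put("file1", t);
        state.listOnce(bucket);        // cycle 1 lists file1
        bucket.put("file2", t);
        state.listOnce(bucket);        // cycle 2 lists file2
        return state.listOnce(bucket); // cycle 3: nothing new was uploaded
    }

    public static void main(String[] args) {
        System.out.println("maxTimestamp = 0L:               " + thirdCycle(false));
        System.out.println("maxTimestamp = currentTimestamp: " + thirdCycle(true));
    }
}
```

In this model, the third cycle lists file1 again when maxTimestamp starts at
0L, and lists nothing when it starts at currentTimestamp. The second flavor
from the 01/03/2018 update is not covered by this sketch.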


