[jira] [Commented] (NIFI-4212) Create RethinkDB Delete Processor

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16097068#comment-16097068
 ] 

ASF GitHub Bot commented on NIFI-4212:
--

GitHub user mans2singh opened a pull request:

https://github.com/apache/nifi/pull/2030

NIFI-4212 - RethinkDB Delete Processor

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [x] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [x] Is your initial contribution a single, squashed commit?

### For code changes:
- [x] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [x] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [x] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mans2singh/nifi NIFI-4212

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2030.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2030


commit 5ea664840e5caf88e8cd848f25c4a3cff2756409
Author: mans2singh 
Date:   2017-07-22T02:07:15Z

NIFI-4212 - RethinkDB Delete Processor




> Create RethinkDB Delete Processor
> -
>
> Key: NIFI-4212
> URL: https://issues.apache.org/jira/browse/NIFI-4212
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Affects Versions: 1.3.0
>Reporter: Mans Singh
>Assignee: Mans Singh
>Priority: Minor
>  Labels: delete, rethinkdb
> Fix For: 1.4.0
>
>
> Create processor to delete RethinkDB documents by id.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)



[jira] [Created] (NIFI-4215) Avro schemas with records that have a field of themselves fail to parse, causing stackoverflow exception

2017-07-21 Thread Wesley L Lawrence (JIRA)
Wesley L Lawrence created NIFI-4215:
---

 Summary: Avro schemas with records that have a field of themselves 
fail to parse, causing stackoverflow exception
 Key: NIFI-4215
 URL: https://issues.apache.org/jira/browse/NIFI-4215
 Project: Apache NiFi
  Issue Type: Bug
Affects Versions: 1.4.0
Reporter: Wesley L Lawrence
Priority: Minor


Noticed this while attempting to use the AvroSchemaRegistry with a complex 
schema. Boiled down, Avro lets you define a schema such as:
{code}
{ 
  "namespace": "org.apache.nifi.testing", 
  "name": "CompositRecord", 
  "type": "record", 
  "fields": [ 
{ 
  "name": "id", 
  "type": "int" 
}, 
{ 
  "name": "value", 
  "type": "string" 
}, 
{ 
  "name": "parent", 
  "type": [
"null",
"CompositRecord"
  ]
} 
  ] 
}
{code}
The AvroSchemaRegistry (AvroTypeUtil, specifically) fails to parse this and 
generates a stack overflow exception.

I've whipped up a fix, tested it out in 1.4.0, and am just running through the 
contrib build before I submit a patch.
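For context, the failure mode is easy to reproduce outside Avro: a naive 
depth-first walk of a self-referential record never terminates. A minimal, 
hypothetical Java sketch (not NiFi's actual AvroTypeUtil code; the names here 
are invented for illustration) of the usual fix, tracking record names already 
visited so the cycle is broken:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RecursionGuard {
    // A toy schema node: a named record whose fields may point to other
    // records, including itself (as with the "parent" field above).
    static final class Node {
        final String name;
        final List<Node> fields = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Walk the schema, skipping records already seen; without the seen-set
    // check, the self-reference would recurse until the stack overflows.
    static int countReachableRecords(Node node, Set<String> seen) {
        if (!seen.add(node.name)) {
            return 0; // already visited: stop instead of recursing forever
        }
        int count = 1;
        for (Node field : node.fields) {
            count += countReachableRecords(field, seen);
        }
        return count;
    }

    public static void main(String[] args) {
        // Build the self-referential structure from the schema above:
        // CompositRecord has a "parent" field of its own type.
        Node composit = new Node("CompositRecord");
        composit.fields.add(composit);
        System.out.println(countReachableRecords(composit, new HashSet<>()));
    }
}
```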





[GitHub] nifi issue #2020: [NiFi-3973] Add PutKudu Processor for ingesting data to Ku...

2017-07-21 Thread cammachusa
Github user cammachusa commented on the issue:

https://github.com/apache/nifi/pull/2020
  
Hi @joewitt, there is a sandbox with a Kudu instance (and related 
components) that lets you quickly spin up a VM and test the processor: 
https://kudu.apache.org/docs/quickstart.html
If you need something different, I can also provision an AWS VM with Kudu 
installed and give you access.
Btw, can you find me a second reviewer?
Thanks,



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---




[jira] [Commented] (NIFI-4032) Create Managed Ranger Authorizer

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096964#comment-16096964
 ] 

ASF GitHub Bot commented on NIFI-4032:
--

Github user YolandaMDavis commented on the issue:

https://github.com/apache/nifi/pull/2019
  
 I've worked through 3 Ranger configuration scenarios that leveraged the 
ldap user group provider, or the composite configurable user group provider 
(pairing the ldap provider with the file provider):

1) Using group authorizations for LDAP users (with no mapping for 
identities) alongside  user authorizations for nodes . This is to cover cases 
where node identities may not be present in LDAP

2) Using mapped identities to ensure that user-group associations would 
still be properly resolved

3) Using the Composite Configurable User Group Provider to allow 
maintenance of node identities and groups in NiFi while allowing policies to be 
enforced via Ranger

All three scenarios worked well with an established cluster. I was able to 
go from one scenario to the next by changing configurations and updating 
policies without issue. However, a bug was encountered in the third test case 
when I wanted to add a new node to the cluster.

The process of adding a new node requires that no information that would 
seed the users.xml file be provided in configuration (e.g. Initial Admin, 
Node Identifiers, etc.). The expectation is therefore that once the node 
attempts to join the cluster, it receives the necessary user information from 
the cluster to create its own local version of the file.  When using the 
ManagedRangerAuthorizer along with the configurable provider, that does not 
appear to happen, since the users.xml generated was empty.  The node started 
up fine, but attempting to access the UI from any node produced a proxy 
error.  Given that the users.xml file was empty, this error made sense: NiFi 
was unable to determine the users (node identities) or the groups they should 
be mapped to, and hence could not apply the Ranger policy that allowed the 
nodes group to perform proxying. 

In speaking with @mcgilman offline this error was due to the 
ManagedRangerAuthorizer not extracting user group information for cases when 
it's paired with configurable user group providers.


> Create Managed Ranger Authorizer
> 
>
> Key: NIFI-4032
> URL: https://issues.apache.org/jira/browse/NIFI-4032
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Gilman
>Assignee: Matt Gilman
> Fix For: 1.4.0
>
>
> Update the RangerAuthorizer to implement the ManagedAuthorizer interface. 
> This will allow the Ranger policies to be visualized in NiFi UI. May even be 
> able to extend the RangerAuthorizer to maintain compatibility with existing 
> configurations.
> Additionally, update the RangerAuthorizer's authorize(...) method to consider 
> the user's groups.





[jira] [Commented] (NIFI-4087) IdentifyMimeType: Optionally exclude filename from criteria

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096884#comment-16096884
 ] 

ASF GitHub Bot commented on NIFI-4087:
--

Github user asfgit closed the pull request at:

https://github.com/apache/nifi/pull/2026


> IdentifyMimeType: Optionally exclude filename from criteria
> ---
>
> Key: NIFI-4087
> URL: https://issues.apache.org/jira/browse/NIFI-4087
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.3.0, 0.7.4
>Reporter: Brandon DeVries
>Assignee: Joseph Witt
>Priority: Minor
> Fix For: 1.4.0
>
> Attachments: mimedetectortemplate.xml, 
> NIFI-4087-Add-option-to-exclude-filename-from-tika.patch
>
>
> In IdentifyMimeType\[1], the filename is always (when non-null) passed to Tika 
> as a criterion in determining the MIME type.  However, there are cases when 
> the filename may be known to be misleading (e.g. after decompression via 
> CompressContent with "Update Filename" set to false).  We should add a 
> boolean processor property (default true) indicating whether or not to pass 
> the filename to Tika.
> \[1] 
> https://github.com/apache/nifi/blob/a9a9b67430b33944b5eefa17cb85b5dd42c8d1fc/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/IdentifyMimeType.java#L126-L129
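A hypothetical Java sketch of the proposed behavior (this is not the actual 
NiFi or Tika API; the detector, the `.txt` rule, and all names here are 
invented stand-ins): the boolean property simply controls whether the 
filename hint reaches the detector at all.

```java
public class FilenameHint {
    // Stand-in for Tika-style detection: a filename hint, when present,
    // takes precedence over sniffing the bytes.
    static String detect(byte[] content, String filenameHint) {
        if (filenameHint != null && filenameHint.endsWith(".txt")) {
            return "text/plain";
        }
        // Fall back to content sniffing (gzip magic bytes as one example).
        if (content.length >= 2
                && (content[0] & 0xFF) == 0x1F && (content[1] & 0xFF) == 0x8B) {
            return "application/gzip";
        }
        return "application/octet-stream";
    }

    // The proposed boolean property: when false, the (possibly misleading)
    // filename is withheld, so only the bytes decide.
    static String identify(byte[] content, String filename, boolean useFilename) {
        return detect(content, useFilename ? filename : null);
    }

    public static void main(String[] args) {
        byte[] gzip = {(byte) 0x1F, (byte) 0x8B, 0x08};
        // A decompressed-then-renamed file misleads the detector when the
        // filename is included, and is sniffed correctly when it is not:
        System.out.println(identify(gzip, "report.txt", true));   // text/plain
        System.out.println(identify(gzip, "report.txt", false));  // application/gzip
    }
}
```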





[jira] [Commented] (NIFI-4087) IdentifyMimeType: Optionally exclude filename from criteria

2017-07-21 Thread Joseph Witt (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096886#comment-16096886
 ] 

Joseph Witt commented on NIFI-4087:
---

+1.  Merged to master.  Thanks for the contrib.  Attached the template used 
for testing.






[jira] [Updated] (NIFI-4087) IdentifyMimeType: Optionally exclude filename from criteria

2017-07-21 Thread Joseph Witt (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Witt updated NIFI-4087:
--
Fix Version/s: 1.4.0






[jira] [Updated] (NIFI-4087) IdentifyMimeType: Optionally exclude filename from criteria

2017-07-21 Thread Joseph Witt (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Witt updated NIFI-4087:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)






[jira] [Updated] (NIFI-4087) IdentifyMimeType: Optionally exclude filename from criteria

2017-07-21 Thread Joseph Witt (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Witt updated NIFI-4087:
--
Attachment: mimedetectortemplate.xml






[jira] [Updated] (NIFI-4087) IdentifyMimeType: Optionally exclude filename from criteria

2017-07-21 Thread Joseph Witt (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Witt updated NIFI-4087:
--
Assignee: Joseph Witt
  Status: Patch Available  (was: Open)






[jira] [Commented] (NIFI-4087) IdentifyMimeType: Optionally exclude filename from criteria

2017-07-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096883#comment-16096883
 ] 

ASF subversion and git services commented on NIFI-4087:
---

Commit 3371e915ccf29f6d7a240dd52ea11cc10cf8bc5c in nifi's branch 
refs/heads/master from [~Leah Anderson]
[ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=3371e91 ]

NIFI-4087 This closes #2026. Fix to allow exclusion of filename from tika 
criteria.









[jira] [Commented] (NIFI-3376) Implement content repository ResourceClaim compaction

2017-07-21 Thread Joseph Witt (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096806#comment-16096806
 ] 

Joseph Witt commented on NIFI-3376:
---

I agree that is the scenario of concern here.

I think a mitigating approach is to reduce how many content claims can live 
together in a single resource claim.  But that too could have some tradeoffs.

I'd like to see us better understand the problem by first implementing a way to 
monitor/observe how much reachability exists for a given content claim.

I've not observed flows with this behavior but it makes sense it could happen.  
We could rewrite still reachable claims and have compensating redirects in prov 
and flowfile repo.

> Implement content repository ResourceClaim compaction
> -
>
> Key: NIFI-3376
> URL: https://issues.apache.org/jira/browse/NIFI-3376
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 0.7.1, 1.1.1
>Reporter: Michael Moser
>Assignee: Michael Hogue
>
> On NiFi systems that deal with many files whose size is less than 1 MB, we 
> often see that the actual disk usage of the content_repository is much 
> greater than the size of flowfiles that NiFi reports are in its queues.  As 
> an example, NiFi may report "50,000 / 12.5 GB" but the content_repository 
> takes up 240 GB of its file system.  This leads to scenarios where a 500 GB 
> content_repository file system gets 100% full, but "I only had 40 GB of data 
> in my NiFi!"
> When several content claims exist in a single resource claim, and most but 
> not all content claims are terminated, the entire resource claim is still not 
> eligible for deletion or archive.  This could mean that only one 10 KB 
> content claim out of a 1 MB resource claim is counted by NiFi as existing in 
> its queues.
> If a particular flow has a slow egress point where flowfiles could back up 
> and remain on the system longer than expected, this problem is exacerbated.
> A potential solution is to compact resource claim files on disk. A background 
> thread could examine all resource claims, and for those that get "old" and 
> whose active content claim usage drops below a threshold, then rewrite the 
> resource claim file.
> A potential work-around is to allow modification of the FileSystemRepository 
> MAX_APPENDABLE_CLAIM_LENGTH to make it a smaller number.  This would increase 
> the probability that the content claims reference count in a resource claim 
> would reach 0 and the resource claim becomes eligible for deletion/archive.  
> Let users trade-off performance for more accurate accounting of NiFi queue 
> size to content repository size.
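The accounting described above can be sketched as follows. This is a 
hypothetical illustration (invented names, not NiFi's FileSystemRepository 
code) of why one small live content claim pins an entire resource claim on 
disk, and of the proposed "live usage below a threshold" compaction trigger:

```java
public class ClaimAccounting {
    // A resource claim is one on-disk file holding many content claims.
    // It can only be deleted/archived once every content claim inside it
    // has been terminated, so a single survivor pins the whole file.
    static final class ResourceClaim {
        final long totalBytes;
        long liveBytes;

        ResourceClaim(long totalBytes) {
            this.totalBytes = totalBytes;
            this.liveBytes = totalBytes;
        }

        void terminateContentClaim(long bytes) {
            liveBytes -= bytes;
        }

        boolean reclaimable() {
            return liveBytes == 0;
        }

        // Compaction candidate per the proposal: still has live content,
        // but its live fraction has dropped below the threshold, so the
        // background thread could rewrite it to a much smaller file.
        boolean compactionCandidate(double threshold) {
            return !reclaimable() && (double) liveBytes / totalBytes < threshold;
        }
    }

    public static void main(String[] args) {
        ResourceClaim claim = new ResourceClaim(1_000_000);  // ~1 MB file
        claim.terminateContentClaim(990_000);                // most content gone
        System.out.println(claim.reclaimable());             // false: 10 KB remains
        System.out.println(claim.compactionCandidate(0.05)); // true: rewrite it
    }
}
```

This also shows the work-around at the end of the description: a smaller 
maximum resource-claim size raises the odds that all content claims in a 
claim terminate together, making `reclaimable()` true sooner.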





[jira] [Commented] (NIFI-3376) Implement content repository ResourceClaim compaction

2017-07-21 Thread Tony Kurc (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096774#comment-16096774
 ] 

Tony Kurc commented on NIFI-3376:
-

As I understand it, I think the concern is a mix of long lived and short lived 
content "in flight". If you have 99.9% short lived content, and 0.01% long 
lived content (maybe you're buffering due to an outage, maybe the receiving 
service is very slow), you could be holding onto much more content than you may 
need and overrun your disks. Does it not make sense to build a strategy to help 
cope with the scenario?






[jira] [Commented] (NIFI-4087) IdentifyMimeType: Optionally exclude filename from criteria

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096745#comment-16096745
 ] 

ASF GitHub Bot commented on NIFI-4087:
--

Github user joewitt commented on the issue:

https://github.com/apache/nifi/pull/2026
  
will give this a manual test run soon.  For future contributions it is 
probably easiest if you add commits to your PR rather than force-pushing over 
the top, at least during the review phase.  This makes it easier on the 
reviewer because they can see the diffs between review/change cycles.  At the 
end, the reviewer might ask you to squash/rebase, or they can do it for you 
when they merge.

If you have any questions on that, let us know.

Thanks again for contributing.







[GitHub] nifi issue #2026: NIFI-4087 Fix to allow exclusion of filename from tika cri...

2017-07-21 Thread joewitt
Github user joewitt commented on the issue:

https://github.com/apache/nifi/pull/2026
  
will give this a manual test run soon.  For future contributions it is 
probably easiest if you add commits to your PR rather than force pushing over 
top at least during the review phase.  This makes it easier on the reviewer 
because they can see the diffs between review/change cycles.  At the end the 
reviewer might ask you to squash/rebase or they can do it for you when they 
merge.

If you have any questions on that let us know.

Thanks again for contributing


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (NIFI-106) Processor Counters should be included in the Status Reports

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096667#comment-16096667
 ] 

ASF GitHub Bot commented on NIFI-106:
-

Github user asfgit closed the pull request at:

https://github.com/apache/nifi/pull/1872


> Processor Counters should be included in the Status Reports
> ---
>
> Key: NIFI-106
> URL: https://issues.apache.org/jira/browse/NIFI-106
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Matt Gilman
>Assignee: Mark Payne
>Priority: Minor
> Fix For: 1.4.0
>
>
> This would allow a Processor's Status History to show counters that were 
> maintained over time periods instead of having only a single count since 
> system start.
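The idea above (per-interval counter values for the status-history chart, rather than one monotonically growing total since system start) can be sketched with a plain time-bucketed counter. This is an illustration only; NiFi's actual implementation folds counters into its existing StatusHistory snapshots rather than using a class like this.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch: bucket counter increments by time window so a status history
// can chart per-interval counts instead of a single running total.
public class WindowedCounter {
    private final long windowMillis;
    private final TreeMap<Long, Long> buckets = new TreeMap<>();

    WindowedCounter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    void increment(long timestampMillis, long delta) {
        long bucket = timestampMillis - (timestampMillis % windowMillis);
        buckets.merge(bucket, delta, Long::sum);
    }

    // The old behavior: only a cumulative count since start.
    long totalSinceStart() {
        return buckets.values().stream().mapToLong(Long::longValue).sum();
    }

    // The new behavior: one value per time period, chartable over time.
    Map<Long, Long> snapshot() {
        return new TreeMap<>(buckets);
    }

    public static void main(String[] args) {
        WindowedCounter c = new WindowedCounter(60_000); // 1-minute buckets
        c.increment(5_000, 2);
        c.increment(65_000, 3);
        System.out.println(c.snapshot());        // {0=2, 60000=3}
        System.out.println(c.totalSinceStart()); // 5
    }
}
```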





[jira] [Updated] (NIFI-106) Processor Counters should be included in the Status Reports

2017-07-21 Thread Matt Gilman (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Gilman updated NIFI-106:
-
   Resolution: Fixed
Fix Version/s: 1.4.0
   Status: Resolved  (was: Patch Available)

> Processor Counters should be included in the Status Reports
> ---
>
> Key: NIFI-106
> URL: https://issues.apache.org/jira/browse/NIFI-106
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Matt Gilman
>Assignee: Mark Payne
>Priority: Minor
> Fix For: 1.4.0
>
>
> This would allow a Processor's Status History to show counters that were 
> maintained over time periods instead of having only a single count since 
> system start.





[jira] [Commented] (NIFI-106) Processor Counters should be included in the Status Reports

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1609#comment-1609
 ] 

ASF GitHub Bot commented on NIFI-106:
-

Github user mcgilman commented on the issue:

https://github.com/apache/nifi/pull/1872
  
Thanks @markap14! This has been merged to master.


> Processor Counters should be included in the Status Reports
> ---
>
> Key: NIFI-106
> URL: https://issues.apache.org/jira/browse/NIFI-106
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Matt Gilman
>Assignee: Mark Payne
>Priority: Minor
> Fix For: 1.4.0
>
>
> This would allow a Processor's Status History to show counters that were 
> maintained over time periods instead of having only a single count since 
> system start.





[jira] [Commented] (NIFI-106) Processor Counters should be included in the Status Reports

2017-07-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096665#comment-16096665
 ] 

ASF subversion and git services commented on NIFI-106:
--

Commit 695e8aa98f1d9cce5a9b3025193ac57f9acd598e in nifi's branch 
refs/heads/master from [~markap14]
[ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=695e8aa ]

NIFI-106:
- Expose processors' counters in Stats History
- Only include counters in Processors' Status History if user has read access 
to corresponding Processor
- Addressed review feedback. Found and addressed bug where a counter is not 
present in all of the aggregate snapshot values for status history, resulting in 
the UI not rendering the chart properly
- This closes #1872


> Processor Counters should be included in the Status Reports
> ---
>
> Key: NIFI-106
> URL: https://issues.apache.org/jira/browse/NIFI-106
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Matt Gilman
>Assignee: Mark Payne
>Priority: Minor
>
> This would allow a Processor's Status History to show counters that were 
> maintained over time periods instead of having only a single count since 
> system start.





[GitHub] nifi issue #1872: NIFI-106: Expose processors' counters in Stats History

2017-07-21 Thread mcgilman
Github user mcgilman commented on the issue:

https://github.com/apache/nifi/pull/1872
  
Thanks @markap14! This has been merged to master.




[GitHub] nifi pull request #1872: NIFI-106: Expose processors' counters in Stats Hist...

2017-07-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/nifi/pull/1872




[jira] [Commented] (NIFI-4087) IdentifyMimeType: Optionally exclude filename from criteria

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096644#comment-16096644
 ] 

ASF GitHub Bot commented on NIFI-4087:
--

Github user Leah-Anderson commented on the issue:

https://github.com/apache/nifi/pull/2026
  
OK sounds good. Just let me know if I need to do anything else for this 
(already pushed an updated commit to remove the comment you mentioned)


> IdentifyMimeType: Optionally exclude filename from criteria
> ---
>
> Key: NIFI-4087
> URL: https://issues.apache.org/jira/browse/NIFI-4087
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.3.0, 0.7.4
>Reporter: Brandon DeVries
>Priority: Minor
> Attachments: NIFI-4087-Add-option-to-exclude-filename-from-tika.patch
>
>
> In IdentifyMimeType\[1], the filename is always (when non-null) passed to Tika 
> as a criterion in determining the MIME type.  However, there are cases when 
> the filename may be known to be misleading (e.g. after decompression via 
> CompressContent with "Update Filename" set to false).  We should add a 
> boolean processor property (default true) indicating whether or not to pass 
> the filename to Tika.
> \[1] 
> https://github.com/apache/nifi/blob/a9a9b67430b33944b5eefa17cb85b5dd42c8d1fc/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/IdentifyMimeType.java#L126-L129





[GitHub] nifi issue #2026: NIFI-4087 Fix to allow exclusion of filename from tika cri...

2017-07-21 Thread Leah-Anderson
Github user Leah-Anderson commented on the issue:

https://github.com/apache/nifi/pull/2026
  
OK sounds good. Just let me know if I need to do anything else for this 
(already pushed an updated commit to remove the comment you mentioned)




[jira] [Commented] (NIFI-4142) Implement a ValidateRecord Processor

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096635#comment-16096635
 ] 

ASF GitHub Bot commented on NIFI-4142:
--

Github user markap14 commented on the issue:

https://github.com/apache/nifi/pull/2015
  
@joewitt I've pushed a new commit that I believe better clarifies how 
schemas are treated in terms of strictness vs. leniency by providing two 
arguments instead of 'enforceSchema': 'coerceTypes' and 'dropUnknownFields'



> Implement a ValidateRecord Processor
> 
>
> Key: NIFI-4142
> URL: https://issues.apache.org/jira/browse/NIFI-4142
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 1.4.0
>
>
> We need a processor that is capable of validating that all Records in a 
> FlowFile adhere to the proper schema.
> The Processor should be configured with a Record Reader and should route each 
> record to either 'valid' or 'invalid' based on whether or not the record 
> adheres to the reader's schema. A record would be invalid in any of the 
> following cases:
> - Missing field that is required according to the schema
> - Extra field that is not present in schema (it should be configurable 
> whether or not this is a failure)
> - Field requires coercion and strict type checking enabled (this should also 
> be configurable)
> - Field is invalid, such as the value "hello" when it should be an integer
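The four invalidity cases above can be sketched with a toy validator. The two flag names ('coerceTypes', 'dropUnknownFields') come from the PR discussion; everything else here (map-based records, the coercion logic, the class name) is illustrative, not NiFi's RecordReader API.

```java
import java.util.HashMap;
import java.util.Map;

public class RecordValidator {
    // Validate a record (field name -> value) against a schema
    // (field name -> expected type), honoring the two knobs discussed.
    static boolean isValid(Map<String, Object> record,
                           Map<String, Class<?>> schema,
                           boolean coerceTypes,
                           boolean dropUnknownFields) {
        for (Map.Entry<String, Class<?>> field : schema.entrySet()) {
            Object value = record.get(field.getKey());
            if (value == null) {
                return false;                     // missing required field
            }
            if (!field.getValue().isInstance(value)) {
                if (!coerceTypes) {
                    return false;                 // strict type checking
                }
                try {                             // attempt type coercion
                    if (field.getValue() == Integer.class) {
                        Integer.parseInt(value.toString());
                    } else {
                        return false;
                    }
                } catch (NumberFormatException e) {
                    return false;                 // e.g. "hello" as an integer
                }
            }
        }
        if (!dropUnknownFields) {
            for (String key : record.keySet()) {
                if (!schema.containsKey(key)) {
                    return false;                 // extra field not in schema
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Class<?>> schema = new HashMap<>();
        schema.put("count", Integer.class);
        Map<String, Object> rec = new HashMap<>();
        rec.put("count", "42");
        System.out.println(isValid(rec, schema, true, true));  // true (coerced)
        System.out.println(isValid(rec, schema, false, true)); // false (strict)
    }
}
```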





[GitHub] nifi issue #2015: NIFI-4142: Refactored Record Reader/Writer to allow for re...

2017-07-21 Thread markap14
Github user markap14 commented on the issue:

https://github.com/apache/nifi/pull/2015
  
@joewitt I've pushed a new commit that I believe better clarifies how 
schemas are treated in terms of strictness vs. leniency by providing two 
arguments instead of 'enforceSchema': 'coerceTypes' and 'dropUnknownFields'





[jira] [Commented] (NIFI-3376) Implement content repository ResourceClaim compaction

2017-07-21 Thread Joseph Witt (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096611#comment-16096611
 ] 

Joseph Witt commented on NIFI-3376:
---

I definitely share a similar concern about compaction and the overhead-to-benefit 
tradeoff it would provide.  So let's go back to the original concern of the 
reporter, which was about the tradeoff of our current 'slab allocation' style of 
writing content.  

I've not experienced the cases where the current model can be problematic.  
However, I could imagine a good simple alternative, again now that NIFI-3736 is 
resolved: simply set the max appendable size to a small number.

> Implement content repository ResourceClaim compaction
> -
>
> Key: NIFI-3376
> URL: https://issues.apache.org/jira/browse/NIFI-3376
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 0.7.1, 1.1.1
>Reporter: Michael Moser
>Assignee: Michael Hogue
>
> On NiFi systems that deal with many files whose size is less than 1 MB, we 
> often see that the actual disk usage of the content_repository is much 
> greater than the size of flowfiles that NiFi reports are in its queues.  As 
> an example, NiFi may report "50,000 / 12.5 GB" but the content_repository 
> takes up 240 GB of its file system.  This leads to scenarios where a 500 GB 
> content_repository file system gets 100% full, but "I only had 40 GB of data 
> in my NiFi!"
> When several content claims exist in a single resource claim, and most but 
> not all content claims are terminated, the entire resource claim is still not 
> eligible for deletion or archive.  This could mean that only one 10 KB 
> content claim out of a 1 MB resource claim is counted by NiFi as existing in 
> its queues.
> If a particular flow has a slow egress point where flowfiles could back up 
> and remain on the system longer than expected, this problem is exacerbated.
> A potential solution is to compact resource claim files on disk. A background 
> thread could examine all resource claims, and for those that get "old" and 
> whose active content claim usage drops below a threshold, then rewrite the 
> resource claim file.
> A potential work-around is to allow modification of the FileSystemRepository 
> MAX_APPENDABLE_CLAIM_LENGTH to make it a smaller number.  This would increase 
> the probability that the content claims reference count in a resource claim 
> would reach 0 and the resource claim becomes eligible for deletion/archive.  
> Let users trade-off performance for more accurate accounting of NiFi queue 
> size to content repository size.
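The work-around described above maps to a single setting in nifi.properties. In NiFi 1.x the relevant property is believed to be nifi.content.claim.max.appendable.size (the exact name and default should be verified against the Admin Guide for your release); lowering it means each resource claim holds fewer content claims, so claims become eligible for deletion/archive sooner at the cost of more small files and reduced write performance:

```properties
# nifi.properties -- property name per the NiFi 1.x Admin Guide; verify for
# your version. A smaller value trades write performance for more accurate
# queue-size vs. content_repository-size accounting, as described above.
nifi.content.claim.max.appendable.size=50 KB
```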





[jira] [Commented] (NIFI-4087) IdentifyMimeType: Optionally exclude filename from criteria

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096601#comment-16096601
 ] 

ASF GitHub Bot commented on NIFI-4087:
--

Github user joewitt commented on the issue:

https://github.com/apache/nifi/pull/2026
  
Yep.  If Travis is happy on at least one, then we're likely in good shape. 


> IdentifyMimeType: Optionally exclude filename from criteria
> ---
>
> Key: NIFI-4087
> URL: https://issues.apache.org/jira/browse/NIFI-4087
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.3.0, 0.7.4
>Reporter: Brandon DeVries
>Priority: Minor
> Attachments: NIFI-4087-Add-option-to-exclude-filename-from-tika.patch
>
>
> In IdentifyMimeType\[1], the filename is always (when non-null) passed to Tika 
> as a criterion in determining the MIME type.  However, there are cases when 
> the filename may be known to be misleading (e.g. after decompression via 
> CompressContent with "Update Filename" set to false).  We should add a 
> boolean processor property (default true) indicating whether or not to pass 
> the filename to Tika.
> \[1] 
> https://github.com/apache/nifi/blob/a9a9b67430b33944b5eefa17cb85b5dd42c8d1fc/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/IdentifyMimeType.java#L126-L129





[GitHub] nifi issue #2026: NIFI-4087 Fix to allow exclusion of filename from tika cri...

2017-07-21 Thread joewitt
Github user joewitt commented on the issue:

https://github.com/apache/nifi/pull/2026
  
Yep.  If Travis is happy on at least one, then we're likely in good shape. 




[jira] [Commented] (NIFI-1580) Allow double-click to display config of processor

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096598#comment-16096598
 ] 

ASF GitHub Bot commented on NIFI-1580:
--

Github user scottyaslan commented on the issue:

https://github.com/apache/nifi/pull/2009
  
Reviewing...


> Allow double-click to display config of processor
> -
>
> Key: NIFI-1580
> URL: https://issues.apache.org/jira/browse/NIFI-1580
> Project: Apache NiFi
>  Issue Type: Wish
>  Components: Core UI
>Affects Versions: 0.4.1
> Environment: all
>Reporter: Uwe Geercken
>Priority: Minor
>  Labels: features, processor, ui
>
> A user frequently has to open the "config" dialog when designing NiFi flows. 
> Each time, the user has to right-click the processor and select "config" from 
> the menu.
> It would be quicker if it were possible to double-click a processor - 
> or maybe the title area - to display the config dialog.
> This could also be designed as a configuration of the UI that the user can 
> define (whether double-clicking opens the config dialog, does something else, or 
> simply does nothing).





[GitHub] nifi issue #2009: NIFI-1580 - Allow double-click to display config

2017-07-21 Thread scottyaslan
Github user scottyaslan commented on the issue:

https://github.com/apache/nifi/pull/2009
  
Reviewing...




[GitHub] nifi-minifi-cpp issue #117: MINIFI-338: Convert processor threads to use thr...

2017-07-21 Thread benqiu2016
Github user benqiu2016 commented on the issue:

https://github.com/apache/nifi-minifi-cpp/pull/117
  


[config.txt](https://github.com/apache/nifi-minifi-cpp/files/1166193/config.txt)





[GitHub] nifi-minifi-cpp issue #117: MINIFI-338: Convert processor threads to use thr...

2017-07-21 Thread benqiu2016
Github user benqiu2016 commented on the issue:

https://github.com/apache/nifi-minifi-cpp/pull/117
  
@phrocker the normal flow that I run is attached: one GetFile connected to 
an RPG






[jira] [Commented] (NIFI-385) Add Kerberos support in nifi-kite-nar

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096480#comment-16096480
 ] 

ASF GitHub Bot commented on NIFI-385:
-

Github user WilliamNouet commented on the issue:

https://github.com/apache/nifi/pull/1565
  
@joewitt what do you have in mind when saying "we have very strong 
alternatives for so that should be considered as well."?

This does not need to be validated by someone familiar with Kite, as the PR 
only deals with Kerberos-related changes and, as such, keeps the native Kite 
code which was developed.

Also, closing this PR and opening #2029


> Add Kerberos support in nifi-kite-nar
> -
>
> Key: NIFI-385
> URL: https://issues.apache.org/jira/browse/NIFI-385
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Ryan Blue
>
> Kite should be able to connect to a Kerberized Hadoop cluster to store data. 
> Kite's Flume connector has working code. The Kite dataset needs to be 
> instantiated in a {{doPrivileged}} block and its internal {{FileSystem}} 
> object will hold the credentials after that.
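The doPrivileged pattern the ticket describes can be sketched with the JDK's AccessController (deprecated in recent JDKs but still functional): perform the credential-sensitive acquisition inside the privileged block so the returned object carries the credentials afterwards. The Kite/Hadoop specifics (Kerberos login, the dataset's internal FileSystem) are replaced here by a placeholder string; this is an illustration of the pattern, not Kite's API.

```java
import java.security.AccessController;
import java.security.PrivilegedAction;

public class PrivilegedInitSketch {
    // Open a "dataset" inside a privileged block. In the real Kite case the
    // block would perform the Kerberos login and instantiate the dataset
    // (e.g. via Kite's Datasets.load), so that the dataset's internal
    // FileSystem object holds the credentials after that.
    static String openDataset(String uri) {
        return AccessController.doPrivileged((PrivilegedAction<String>) () ->
                "dataset:" + uri); // placeholder for the privileged work
    }

    public static void main(String[] args) {
        System.out.println(openDataset("hdfs://example/data"));
        // -> dataset:hdfs://example/data
    }
}
```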





[jira] [Commented] (NIFI-385) Add Kerberos support in nifi-kite-nar

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096481#comment-16096481
 ] 

ASF GitHub Bot commented on NIFI-385:
-

Github user WilliamNouet closed the pull request at:

https://github.com/apache/nifi/pull/1565


> Add Kerberos support in nifi-kite-nar
> -
>
> Key: NIFI-385
> URL: https://issues.apache.org/jira/browse/NIFI-385
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Ryan Blue
>
> Kite should be able to connect to a Kerberized Hadoop cluster to store data. 
> Kite's Flume connector has working code. The Kite dataset needs to be 
> instantiated in a {{doPrivileged}} block and its internal {{FileSystem}} 
> object will hold the credentials after that.





[GitHub] nifi pull request #1565: NIFI-385 Add Kerberos Support to Kite

2017-07-21 Thread WilliamNouet
Github user WilliamNouet closed the pull request at:

https://github.com/apache/nifi/pull/1565




[GitHub] nifi issue #1565: NIFI-385 Add Kerberos Support to Kite

2017-07-21 Thread WilliamNouet
Github user WilliamNouet commented on the issue:

https://github.com/apache/nifi/pull/1565
  
@joewitt what do you have in mind when saying "we have very strong 
alternatives for so that should be considered as well."?

This does not need to be validated by someone familiar with Kite, as the PR 
only deals with Kerberos-related changes and, as such, keeps the native Kite 
code which was developed.

Also, closing this PR and opening #2029




[jira] [Commented] (NIFI-3376) Implement content repository ResourceClaim compaction

2017-07-21 Thread Mark Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096477#comment-16096477
 ] 

Mark Payne commented on NIFI-3376:
--

[~m-hogue] [~mosermw] - I don't believe this is a road that we should really 
venture down. Trying to compact it would mean that the data would have to be 
copied, first off, which is potentially incredibly expensive. It means that any 
FlowFile would have to also be updated to point to the new content claim, which 
would require a stop-the-world lock in order to do. It would also mean that we 
would have some very complex inter-dependencies in locking because this could 
not happen while a Processor has access to a FlowFile that references a Content 
Claim that would be compacted. That means that we'd also have to store a 
tremendous amount of state in the Processor/queue about what is in-flight. In 
addition, it would mean that Provenance events that are pointing to the content 
would also no longer be pointing to the right place. Also, of note, this would 
have to be done in such a way that we write the new, 'compacted' resource claim 
parallel to the old content claim so that a failure can be rolled back; if we are 
already concerned about over-utilization of the repo, this may cause even more 
tension. And all of this doesn't even take into account the complexity of code 
maintenance.

To get this to work properly, I'd estimate that it would take several months of 
focused effort by the people who know the inner workings of the framework, and 
I think that with NIFI-3736 resolved, it provides a fairly small benefit, so 
I'm a -1 on this proposal.

> Implement content repository ResourceClaim compaction
> -
>
> Key: NIFI-3376
> URL: https://issues.apache.org/jira/browse/NIFI-3376
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 0.7.1, 1.1.1
>Reporter: Michael Moser
>Assignee: Michael Hogue
>
> On NiFi systems that deal with many files whose size is less than 1 MB, we 
> often see that the actual disk usage of the content_repository is much 
> greater than the size of flowfiles that NiFi reports are in its queues.  As 
> an example, NiFi may report "50,000 / 12.5 GB" but the content_repository 
> takes up 240 GB of its file system.  This leads to scenarios where a 500 GB 
> content_repository file system gets 100% full, but "I only had 40 GB of data 
> in my NiFi!"
> When several content claims exist in a single resource claim, and most but 
> not all content claims are terminated, the entire resource claim is still not 
> eligible for deletion or archive.  This could mean that only one 10 KB 
> content claim out of a 1 MB resource claim is counted by NiFi as existing in 
> its queues.
> If a particular flow has a slow egress point where flowfiles could back up 
> and remain on the system longer than expected, this problem is exacerbated.
> A potential solution is to compact resource claim files on disk. A background 
> thread could examine all resource claims, and for those that get "old" and 
> whose active content claim usage drops below a threshold, then rewrite the 
> resource claim file.
> A potential work-around is to allow modification of the FileSystemRepository 
> MAX_APPENDABLE_CLAIM_LENGTH to make it a smaller number.  This would increase 
> the probability that the content claims reference count in a resource claim 
> would reach 0 and the resource claim becomes eligible for deletion/archive.  
> Let users trade-off performance for more accurate accounting of NiFi queue 
> size to content repository size.





[jira] [Commented] (NIFI-385) Add Kerberos support in nifi-kite-nar

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096473#comment-16096473
 ] 

ASF GitHub Bot commented on NIFI-385:
-

GitHub user WilliamNouet opened a pull request:

https://github.com/apache/nifi/pull/2029

NIFI-385 Add Kerberos Support to Kite

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

[Y] Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?

[Y] Does your PR title start with NIFI- where  is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

[Y] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

[Y] Is your initial contribution a single, squashed commit?

For code changes:

[Y] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
[Y] Have you written or updated unit tests to verify your changes?
[N/A] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under ASF 2.0?
[N/A] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
[N/A] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
[N/A] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?
For documentation related changes:

[N/A] Have you ensured that format looks appropriate for the output in 
which it is rendered?
Note:

Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WilliamNouet/nifi NIFI-385

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2029.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2029


commit 79322c32537d98b83f3a432f298ac04b1c54cffe
Author: WilliamNouet 
Date:   2017-07-21T16:44:31Z

NIFI-385 Add Kerberos Support to Kite




> Add Kerberos support in nifi-kite-nar
> -
>
> Key: NIFI-385
> URL: https://issues.apache.org/jira/browse/NIFI-385
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Ryan Blue
>
> Kite should be able to connect to a Kerberized Hadoop cluster to store data. 
> Kite's Flume connector has working code. The Kite dataset needs to be 
> instantiated in a {{doPrivileged}} block and its internal {{FileSystem}} 
> object will hold the credentials after that.





[GitHub] nifi pull request #2029: NIFI-385 Add Kerberos Support to Kite

2017-07-21 Thread WilliamNouet
GitHub user WilliamNouet opened a pull request:

https://github.com/apache/nifi/pull/2029

NIFI-385 Add Kerberos Support to Kite

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

[Y] Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?

[Y] Does your PR title start with NIFI- where  is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

[Y] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

[Y] Is your initial contribution a single, squashed commit?

For code changes:

[Y] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
[Y] Have you written or updated unit tests to verify your changes?
[N/A] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under ASF 2.0?
[N/A] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
[N/A] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
[N/A] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?
For documentation related changes:

[N/A] Have you ensured that format looks appropriate for the output in 
which it is rendered?
Note:

Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WilliamNouet/nifi NIFI-385

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2029.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2029


commit 79322c32537d98b83f3a432f298ac04b1c54cffe
Author: WilliamNouet 
Date:   2017-07-21T16:44:31Z

NIFI-385 Add Kerberos Support to Kite




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (NIFI-4142) Implement a ValidateRecord Processor

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096461#comment-16096461
 ] 

ASF GitHub Bot commented on NIFI-4142:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2015#discussion_r128806560
  
--- Diff: 
nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/RecordReader.java
 ---
@@ -38,14 +38,35 @@
 public interface RecordReader extends Closeable {
 
 /**
- * Returns the next record in the stream or null if no 
more records are available.
+ * Returns the next record in the stream or null if no 
more records are available. Schema enforcement will be enabled.
  *
  * @return the next record in the stream or null if no 
more records are available.
  *
  * @throws IOException if unable to read from the underlying data
  * @throws MalformedRecordException if an unrecoverable failure occurs 
when trying to parse a record
+ * @throws SchemaValidationException if a Record contains a field that 
violates the schema and cannot be coerced into the appropriate field type.
  */
-Record nextRecord() throws IOException, MalformedRecordException;
+default Record nextRecord() throws IOException, 
MalformedRecordException {
+return nextRecord(true);
+}
+
+/**
+ * Reads the next record from the underlying stream. If schema 
enforcement is enabled, then any field in the Record whose type does not
+ * match the schema will be coerced to the correct type and a 
MalformedRecordException will be thrown if unable to coerce the data into
+ * the correct type. If schema enforcement is disabled, then no type 
coercion will occur. As a result, calling
+ * {@link 
Record#getValue(org.apache.nifi.serialization.record.RecordField)}
+ * may return any type of Object, such as a String or another Record, 
even though the schema indicates that the field must be an integer.
+ *
+ * @param enforceSchema whether or not fields in the Record should be 
validated against the schema and coerced when necessary
+ *
+ * @return the next record in the stream or null if no 
more records are available
+ * @throws IOException if unable to read from the underlying data
+ * @throws MalformedRecordException if an unrecoverable failure occurs 
when trying to parse a record, or a Record contains a field
+ * that violates the schema and cannot be coerced into the 
appropriate field type.
+ * @throws SchemaValidationException if a Record contains a field that 
violates the schema and cannot be coerced into the appropriate
+ * field type and schema enforcement is enabled
+ */
+Record nextRecord(boolean enforceSchema) throws IOException, 
MalformedRecordException;
--- End diff --

I think I actually want to just separate the concept out into two different 
variables here: boolean coerceTypes, boolean dropUnknownRecords. That way it is 
very explicit what is happening, and I don't think that 'strict' vs. 'lenient' 
really conveys those two semantics as well as I'd like.
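The two-flag variant Mark describes can be sketched in plain Java. Note this is an illustrative stand-in, not the actual NiFi API: the interface name, the String record type, and the flag names mirror the comment but are hypothetical.

```java
// Hypothetical two-flag variant of the reader interface discussed above.
// Names and the String "record" type are illustrative, not NiFi's API.
public class TwoFlagReaderDemo {
    public interface RecordReaderSketch {
        // coerceTypes: coerce field values to the schema's declared types;
        // dropUnknownFields: discard fields not present in the schema.
        String nextRecord(boolean coerceTypes, boolean dropUnknownFields);

        // Convenience form: enforce everything, mirroring nextRecord() -> nextRecord(true).
        default String nextRecord() {
            return nextRecord(true, true);
        }
    }

    public static void main(String[] args) {
        RecordReaderSketch reader = (coerce, drop) -> "coerce=" + coerce + ",drop=" + drop;
        System.out.println(reader.nextRecord()); // coerce=true,drop=true
    }
}
```

With two explicit booleans, each call site states both decisions, which is exactly the explicitness argued for above.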


> Implement a ValidateRecord Processor
> 
>
> Key: NIFI-4142
> URL: https://issues.apache.org/jira/browse/NIFI-4142
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 1.4.0
>
>
> We need a processor that is capable of validating that all Records in a 
> FlowFile adhere to the proper schema.
> The Processor should be configured with a Record Reader and should route each 
> record to either 'valid' or 'invalid' based on whether or not the record 
> adheres to the reader's schema. A record would be invalid in any of the 
> following cases:
> - Missing field that is required according to the schema
> - Extra field that is not present in schema (it should be configurable 
> whether or not this is a failure)
> - Field requires coercion and strict type checking enabled (this should also 
> be configurable)
> - Field is invalid, such as the value "hello" when it should be an integer
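The invalid-record cases listed above can be sketched as a plain-Java validity check. This is a minimal sketch under stated assumptions: the schema representation (field name to expected Java type) and the method names are hypothetical, not NiFi's RecordSchema API, and only String-to-Integer coercion is illustrated.

```java
import java.util.Map;

public class ValidateRecordSketch {
    // Hypothetical schema: required field name -> expected Java type.
    public static boolean isValid(Map<String, Object> record,
                                  Map<String, Class<?>> schema,
                                  boolean allowExtraFields,
                                  boolean strictTypeChecking) {
        for (Map.Entry<String, Class<?>> e : schema.entrySet()) {
            Object value = record.get(e.getKey());
            if (value == null) {
                return false;                       // missing required field
            }
            if (!e.getValue().isInstance(value)) {
                if (strictTypeChecking) {
                    return false;                   // field requires coercion, strict mode
                }
                // lenient: accept if the value can be coerced, e.g. "42" -> Integer
                if (e.getValue() == Integer.class && value instanceof String) {
                    try {
                        Integer.parseInt((String) value);
                    } catch (NumberFormatException nfe) {
                        return false;               // e.g. "hello" where an integer is required
                    }
                } else {
                    return false;
                }
            }
        }
        if (!allowExtraFields && !schema.keySet().containsAll(record.keySet())) {
            return false;                           // extra field not present in schema
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Class<?>> schema = Map.of("id", Integer.class, "name", String.class);
        System.out.println(isValid(Map.of("id", 1, "name", "a"), schema, false, true));        // true
        System.out.println(isValid(Map.of("id", "hello", "name", "a"), schema, false, false)); // false
    }
}
```

A processor would route each record to 'valid' or 'invalid' based on this boolean, with the two flags exposed as processor properties.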



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (NIFI-3518) Create a Morphlines processor

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096460#comment-16096460
 ] 

ASF GitHub Bot commented on NIFI-3518:
--

Github user WilliamNouet closed the pull request at:

https://github.com/apache/nifi/pull/1576


> Create a Morphlines processor
> -
>
> Key: NIFI-3518
> URL: https://issues.apache.org/jira/browse/NIFI-3518
> Project: Apache NiFi
>  Issue Type: New Feature
>Reporter: William Nouet
>Priority: Minor
>
> Create a dedicated processor to run Morphlines transformations 
> (http://kitesdk.org/docs/1.1.0/morphlines/morphlines-reference-guide.html) 





[jira] [Commented] (NIFI-3518) Create a Morphlines processor

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096459#comment-16096459
 ] 

ASF GitHub Bot commented on NIFI-3518:
--

Github user WilliamNouet commented on the issue:

https://github.com/apache/nifi/pull/1576
  
Closing this one and opening PR #2028 


> Create a Morphlines processor
> -
>
> Key: NIFI-3518
> URL: https://issues.apache.org/jira/browse/NIFI-3518
> Project: Apache NiFi
>  Issue Type: New Feature
>Reporter: William Nouet
>Priority: Minor
>
> Create a dedicated processor to run Morphlines transformations 
> (http://kitesdk.org/docs/1.1.0/morphlines/morphlines-reference-guide.html) 





[GitHub] nifi pull request #2015: NIFI-4142: Refactored Record Reader/Writer to allow...

2017-07-21 Thread markap14
Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2015#discussion_r128806560
  
--- Diff: 
nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/RecordReader.java
 ---
@@ -38,14 +38,35 @@
 public interface RecordReader extends Closeable {
 
 /**
- * Returns the next record in the stream or null if no 
more records are available.
+ * Returns the next record in the stream or null if no 
more records are available. Schema enforcement will be enabled.
  *
  * @return the next record in the stream or null if no 
more records are available.
  *
  * @throws IOException if unable to read from the underlying data
  * @throws MalformedRecordException if an unrecoverable failure occurs 
when trying to parse a record
+ * @throws SchemaValidationException if a Record contains a field that 
violates the schema and cannot be coerced into the appropriate field type.
  */
-Record nextRecord() throws IOException, MalformedRecordException;
+default Record nextRecord() throws IOException, 
MalformedRecordException {
+return nextRecord(true);
+}
+
+/**
+ * Reads the next record from the underlying stream. If schema 
enforcement is enabled, then any field in the Record whose type does not
+ * match the schema will be coerced to the correct type and a 
MalformedRecordException will be thrown if unable to coerce the data into
+ * the correct type. If schema enforcement is disabled, then no type 
coercion will occur. As a result, calling
+ * {@link 
Record#getValue(org.apache.nifi.serialization.record.RecordField)}
+ * may return any type of Object, such as a String or another Record, 
even though the schema indicates that the field must be an integer.
+ *
+ * @param enforceSchema whether or not fields in the Record should be 
validated against the schema and coerced when necessary
+ *
+ * @return the next record in the stream or null if no 
more records are available
+ * @throws IOException if unable to read from the underlying data
+ * @throws MalformedRecordException if an unrecoverable failure occurs 
when trying to parse a record, or a Record contains a field
+ * that violates the schema and cannot be coerced into the 
appropriate field type.
+ * @throws SchemaValidationException if a Record contains a field that 
violates the schema and cannot be coerced into the appropriate
+ * field type and schema enforcement is enabled
+ */
+Record nextRecord(boolean enforceSchema) throws IOException, 
MalformedRecordException;
--- End diff --

I think I actually want to just separate the concept out into two different 
variables here: boolean coerceTypes, boolean dropUnknownRecords. That way it is 
very explicit what is happening, and I don't think that 'strict' vs. 'lenient' 
really conveys those two semantics as well as I'd like.




[GitHub] nifi issue #1576: NIFI-3518 Create a Morphlines processor

2017-07-21 Thread WilliamNouet
Github user WilliamNouet commented on the issue:

https://github.com/apache/nifi/pull/1576
  
Closing this one and opening PR #2028 




[jira] [Commented] (NIFI-3518) Create a Morphlines processor

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096457#comment-16096457
 ] 

ASF GitHub Bot commented on NIFI-3518:
--

GitHub user WilliamNouet opened a pull request:

https://github.com/apache/nifi/pull/2028

NIFI-3518 Create a Morphlines processor

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

[Y] Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?

[Y] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

[Y] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

[Y] Is your initial contribution a single, squashed commit?

For code changes:

[Y] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
[Y] Have you written or updated unit tests to verify your changes?
[N/A] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under ASF 2.0?
[N/A] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
[N/A] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
[N/A] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?
For documentation related changes:

[N/A] Have you ensured that format looks appropriate for the output in 
which it is rendered?
Note:

Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WilliamNouet/nifi NIFI-3518

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2028.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2028


commit bddb035a58fdf8542ceb9808e5ca3c0eb61baf9c
Author: WilliamNouet 
Date:   2017-07-21T16:34:31Z

NIFI-3518 Create a Morphlines processor

diff --git 
a/nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-nar/pom.xml 
b/nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-nar/pom.xml
new file mode 100644
index 000..afb93b8
--- /dev/null
+++ b/nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-nar/pom.xml
@@ -0,0 +1,41 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <modelVersion>4.0.0</modelVersion>
+
+    <parent>
+        <groupId>org.apache.nifi</groupId>
+        <artifactId>nifi-morphlines-bundle</artifactId>
+        <version>1.4.0-SNAPSHOT</version>
+    </parent>
+
+    <artifactId>nifi-morphlines-nar</artifactId>
+    <version>1.4.0-SNAPSHOT</version>
+    <packaging>nar</packaging>
+    <properties>
+        <maven.javadoc.skip>true</maven.javadoc.skip>
+        <source.skip>true</source.skip>
+    </properties>
+
+    <dependencies>
+        <dependency>
+            <groupId>org.apache.nifi</groupId>
+            <artifactId>nifi-morphlines-processors</artifactId>
+            <version>1.4.0-SNAPSHOT</version>
+        </dependency>
+    </dependencies>
+</project>
diff --git 
a/nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-processors/pom.xml 
b/nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-processors/pom.xml
new file mode 100644
index 000..c9ebb31
--- /dev/null
+++ 
b/nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-processors/pom.xml
@@ -0,0 +1,61 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <modelVersion>4.0.0</modelVersion>
+
+    <parent>
+        <groupId>org.apache.nifi</groupId>
+        <artifactId>nifi-morphlines-bundle</artifactId>
+        <version>1.4.0-SNAPSHOT</version>
+    </parent>
+
+    <artifactId>nifi-morphlines-processors</artifactId>
+    <packaging>jar</packaging>
+
+    <dependencies>
+        <dependency>
+            <groupId>org.apache.nifi</groupId>
+            <artifactId>nifi-api</artifactId>
+            <version>${nifi.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.nifi</groupId>
+            <artifactId>nifi-processor-utils</artifactId>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.nifi</groupId>
+            <artifactId>nifi-mock</artifactId>
+            <version>${nifi.version}</version>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>org.slf4j</groupId>
+            <artifactId>slf4j-simple</artifactId>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>junit</groupId>
+            <artifactId>junit</artifactId>
+            <version>4.11</version>
+            <scope>test</scope>
+        </dependency>

[GitHub] nifi pull request #1576: NIFI-3518 Create a Morphlines processor

2017-07-21 Thread WilliamNouet
Github user WilliamNouet closed the pull request at:

https://github.com/apache/nifi/pull/1576




[GitHub] nifi pull request #2028: NIFI-3518 Create a Morphlines processor

2017-07-21 Thread WilliamNouet
GitHub user WilliamNouet opened a pull request:

https://github.com/apache/nifi/pull/2028

NIFI-3518 Create a Morphlines processor

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

[Y] Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?

[Y] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

[Y] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

[Y] Is your initial contribution a single, squashed commit?

For code changes:

[Y] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
[Y] Have you written or updated unit tests to verify your changes?
[N/A] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under ASF 2.0?
[N/A] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
[N/A] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
[N/A] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?
For documentation related changes:

[N/A] Have you ensured that format looks appropriate for the output in 
which it is rendered?
Note:

Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WilliamNouet/nifi NIFI-3518

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2028.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2028


commit bddb035a58fdf8542ceb9808e5ca3c0eb61baf9c
Author: WilliamNouet 
Date:   2017-07-21T16:34:31Z

NIFI-3518 Create a Morphlines processor

diff --git 
a/nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-nar/pom.xml 
b/nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-nar/pom.xml
new file mode 100644
index 000..afb93b8
--- /dev/null
+++ b/nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-nar/pom.xml
@@ -0,0 +1,41 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <modelVersion>4.0.0</modelVersion>
+
+    <parent>
+        <groupId>org.apache.nifi</groupId>
+        <artifactId>nifi-morphlines-bundle</artifactId>
+        <version>1.4.0-SNAPSHOT</version>
+    </parent>
+
+    <artifactId>nifi-morphlines-nar</artifactId>
+    <version>1.4.0-SNAPSHOT</version>
+    <packaging>nar</packaging>
+    <properties>
+        <maven.javadoc.skip>true</maven.javadoc.skip>
+        <source.skip>true</source.skip>
+    </properties>
+
+    <dependencies>
+        <dependency>
+            <groupId>org.apache.nifi</groupId>
+            <artifactId>nifi-morphlines-processors</artifactId>
+            <version>1.4.0-SNAPSHOT</version>
+        </dependency>
+    </dependencies>
+</project>
diff --git 
a/nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-processors/pom.xml 
b/nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-processors/pom.xml
new file mode 100644
index 000..c9ebb31
--- /dev/null
+++ 
b/nifi-nar-bundles/nifi-morphlines-bundle/nifi-morphlines-processors/pom.xml
@@ -0,0 +1,61 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <modelVersion>4.0.0</modelVersion>
+
+    <parent>
+        <groupId>org.apache.nifi</groupId>
+        <artifactId>nifi-morphlines-bundle</artifactId>
+        <version>1.4.0-SNAPSHOT</version>
+    </parent>
+
+    <artifactId>nifi-morphlines-processors</artifactId>
+    <packaging>jar</packaging>
+
+    <dependencies>
+        <dependency>
+            <groupId>org.apache.nifi</groupId>
+            <artifactId>nifi-api</artifactId>
+            <version>${nifi.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.nifi</groupId>
+            <artifactId>nifi-processor-utils</artifactId>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.nifi</groupId>
+            <artifactId>nifi-mock</artifactId>
+            <version>${nifi.version}</version>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>org.slf4j</groupId>
+            <artifactId>slf4j-simple</artifactId>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>junit</groupId>
+            <artifactId>junit</artifactId>
+            <version>4.11</version>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>org.kitesdk</groupId>
+            <artifactId>kite-morphlines-core</artifactId>
+            <version>${kite.version}</version>
+        </dependency>
+    </dependencies>
+</project>
diff --git 

[GitHub] nifi-minifi-cpp issue #117: MINIFI-338: Convert processor threads to use thr...

2017-07-21 Thread phrocker
Github user phrocker commented on the issue:

https://github.com/apache/nifi-minifi-cpp/pull/117
  
@benqiu2016 I've only run this for two hours. It dramatically improved 
throughput on that test when configuring several concurrent tasks for my 
threads. Do you have an example flow that you'd like to run with? I would be 
happy to run that overnight. 




[GitHub] nifi-minifi-cpp pull request #117: MINIFI-338: Convert processor threads to ...

2017-07-21 Thread phrocker
Github user phrocker commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/117#discussion_r128787106
  
--- Diff: libminifi/include/utils/ThreadPool.h ---
@@ -246,15 +349,67 @@ void ThreadPool<T>::startWorkers() {
 template<typename T>
 void ThreadPool<T>::run_tasks() {
   auto waitperiod = std::chrono::milliseconds(1) * 100;
+  uint64_t wait_decay_ = 0;
   while (running_.load()) {
 
+// if we are spinning, perform a wait. If something changes in the 
worker such that the timeslice has changed, we will pick that information up. 
Note that it's possible
+// we could starve for processing time if all workers are waiting. In 
the event that the number of workers far exceeds the number of threads, threads 
will spin and potentially
+// wait until they arrive at a task that can be run. In this case we 
reset the wait_decay and attempt to pick up a new task. This means that threads 
that recently ran should
+// be more likely to run. This is intentional.
+if (wait_decay_ > 1000) {
+  std::this_thread::sleep_for(std::chrono::nanoseconds(wait_decay_));
+}
 Worker task;
 if (!worker_queue_.try_dequeue(task)) {
+
   std::unique_lock<std::mutex> lock(worker_queue_mutex_);
   tasks_available_.wait_for(lock, waitperiod);
   continue;
 }
-task.run();
+else {
+
+  std::unique_lock<std::mutex> lock(worker_queue_mutex_);
+  if (!task_status_[task.getIdentifier()]) {
+continue;
+  }
+}
+
+bool wait_to_run = false;
+if (task.getTimeSlice() > 1) {
+  auto now = std::chrono::system_clock::now().time_since_epoch();
+  auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(now);
+  if (task.getTimeSlice() > ms.count()) {
+wait_to_run = true;
+  }
+}
+// if we have to wait we re-queue the worker.
+if (wait_to_run) {
+  {
+std::unique_lock<std::mutex> lock(worker_queue_mutex_);
+if (!task_status_[task.getIdentifier()]) {
+  continue;
+}
+  }
+  worker_queue_.enqueue(std::move(task));
--- End diff --

Unfortunately that would require a locking queue, or dequeuing everything in 
order to sort. Since it is a lock-free queue, tasks can be enqueued and dequeued 
with only a std::move at relatively low cost. An alternative would be 
to maintain multiple queues, but that would require a prioritized strategy to 
access each queue. I think that would be a follow-on activity if the need 
arose. 

The biggest negative thus far is that shutdown takes longer because we are 
now deterministically awaiting threads to end.




[jira] [Commented] (NIFI-3376) Implement content repository ResourceClaim compaction

2017-07-21 Thread Michael Hogue (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096339#comment-16096339
 ] 

Michael Hogue commented on NIFI-3376:
-

[~markap14]: I'm at a point in the implementation where I know when a resource 
claim should be compacted, but I'm not entirely sure how to rewrite a 
resource claim file. It looks like content claims are appended to a file 
without demarcation, which makes reading them back out of an existing file pretty 
difficult. Any advice on constructing a new resource claim with only active 
content claims? 

My current WIP is pushed to the branch linked in my previous comment.

> Implement content repository ResourceClaim compaction
> -
>
> Key: NIFI-3376
> URL: https://issues.apache.org/jira/browse/NIFI-3376
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 0.7.1, 1.1.1
>Reporter: Michael Moser
>Assignee: Michael Hogue
>
> On NiFi systems that deal with many files whose size is less than 1 MB, we 
> often see that the actual disk usage of the content_repository is much 
> greater than the size of flowfiles that NiFi reports are in its queues.  As 
> an example, NiFi may report "50,000 / 12.5 GB" but the content_repository 
> takes up 240 GB of its file system.  This leads to scenarios where a 500 GB 
> content_repository file system gets 100% full, but "I only had 40 GB of data 
> in my NiFi!"
> When several content claims exist in a single resource claim, and most but 
> not all content claims are terminated, the entire resource claim is still not 
> eligible for deletion or archive.  This could mean that only one 10 KB 
> content claim out of a 1 MB resource claim is counted by NiFi as existing in 
> its queues.
> If a particular flow has a slow egress point where flowfiles could back up 
> and remain on the system longer than expected, this problem is exacerbated.
> A potential solution is to compact resource claim files on disk. A background 
> thread could examine all resource claims, and for those that get "old" and 
> whose active content claim usage drops below a threshold, then rewrite the 
> resource claim file.
> A potential work-around is to allow modification of the FileSystemRepository 
> MAX_APPENDABLE_CLAIM_LENGTH to make it a smaller number.  This would increase 
> the probability that the content claims reference count in a resource claim 
> would reach 0 and the resource claim becomes eligible for deletion/archive.  
> Let users trade-off performance for more accurate accounting of NiFi queue 
> size to content repository size.
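The compaction idea in the description can be sketched as follows. This is a minimal illustration only: it assumes the (offset, length) of each active content claim is already known, which, per Michael's comment above, is exactly the hard part given the lack of demarcation; the Claim type and method names are hypothetical, not NiFi's API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CompactionSketch {
    // Hypothetical active content claim: (offset, length) into a resource claim file.
    public record Claim(long offset, int length) {}

    // Rewrite the resource claim file keeping only the bytes of active claims,
    // returning each claim's new offset in the compacted file.
    public static Map<Claim, Long> compact(Path resourceClaim, Path compacted,
                                           List<Claim> activeClaims) throws IOException {
        byte[] all = Files.readAllBytes(resourceClaim);
        Map<Claim, Long> newOffsets = new LinkedHashMap<>();
        try (var out = Files.newOutputStream(compacted)) {
            long pos = 0;
            for (Claim c : activeClaims) {
                out.write(all, (int) c.offset(), c.length());  // copy only live bytes
                newOffsets.put(c, pos);
                pos += c.length();
            }
        }
        return newOffsets;
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("claim", ".bin");
        Files.write(src, "deadBEEFdead".getBytes());
        Path dst = Files.createTempFile("claim", ".compact");
        // Keep only the middle claim (offset 4, length 4): "BEEF".
        Map<Claim, Long> offsets = compact(src, dst, List.of(new Claim(4, 4)));
        System.out.println(new String(Files.readAllBytes(dst))); // BEEF
        System.out.println(offsets.get(new Claim(4, 4)));        // 0
    }
}
```

A real implementation would also have to atomically swap the compacted file in and update every FlowFile's content claim offsets.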





[jira] [Created] (NIFI-4214) Support html element extraction based on CSS Selector in Nifi Expression Language

2017-07-21 Thread Anil (JIRA)
Anil created NIFI-4214:
--

 Summary: Support html element extraction based on CSS Selector in 
Nifi Expression Language
 Key: NIFI-4214
 URL: https://issues.apache.org/jira/browse/NIFI-4214
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Core Framework
Reporter: Anil


Using the GetHtmlElement processor, it seems I can extract only one HTML element 
from the HTML flow file.
I would like to know if there is a way to extract multiple HTML elements from a 
single HTML flow file (or from a flowfile attribute).
It would be helpful if HTML element extraction were implemented in the NiFi 
Expression Language, something like {attributeName:getHtmlElement(CSSSelector)}
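To illustrate the kind of multi-element extraction being requested, here is a deliberately naive sketch. It uses a regex only so the example stays self-contained; a real implementation would use an HTML parser with CSS selector support (e.g. jsoup's select(), which is what the HTML processors build on), since regexes cannot handle general HTML. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HtmlElementSketch {
    // Naive illustration only: extracts the inner text of every <tag>...</tag>
    // occurrence. Handles only simple, well-formed, non-nested tags.
    public static List<String> elementsByTag(String html, String tag) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile("<" + tag + "[^>]*>(.*?)</" + tag + ">",
                                    Pattern.DOTALL).matcher(html);
        while (m.find()) {
            out.add(m.group(1).trim());
        }
        return out;
    }

    public static void main(String[] args) {
        String html = "<ul><li>one</li><li>two</li></ul>";
        System.out.println(elementsByTag(html, "li")); // [one, two]
    }
}
```

An expression-language function would additionally need to accept a full CSS selector rather than a bare tag name.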





[GitHub] nifi-minifi-cpp issue #117: MINIFI-338: Convert processor threads to use thr...

2017-07-21 Thread benqiu2016
Github user benqiu2016 commented on the issue:

https://github.com/apache/nifi-minifi-cpp/pull/117
  
Overall looks good; maybe some optimization for the queue.
Please run some long-duration tests to make sure it is not breaking 
master, because it is a big change.




[GitHub] nifi-minifi-cpp pull request #117: MINIFI-338: Convert processor threads to ...

2017-07-21 Thread benqiu2016
Github user benqiu2016 commented on a diff in the pull request:

https://github.com/apache/nifi-minifi-cpp/pull/117#discussion_r128773288
  
--- Diff: libminifi/include/utils/ThreadPool.h ---
@@ -246,15 +349,67 @@ void ThreadPool<T>::startWorkers() {
 template<typename T>
 void ThreadPool<T>::run_tasks() {
   auto waitperiod = std::chrono::milliseconds(1) * 100;
+  uint64_t wait_decay_ = 0;
   while (running_.load()) {
 
+// if we are spinning, perform a wait. If something changes in the 
worker such that the timeslice has changed, we will pick that information up. 
Note that it's possible
+// we could starve for processing time if all workers are waiting. In 
the event that the number of workers far exceeds the number of threads, threads 
will spin and potentially
+// wait until they arrive at a task that can be run. In this case we 
reset the wait_decay and attempt to pick up a new task. This means that threads 
that recently ran should
+// be more likely to run. This is intentional.
+if (wait_decay_ > 1000) {
+  std::this_thread::sleep_for(std::chrono::nanoseconds(wait_decay_));
+}
 Worker task;
 if (!worker_queue_.try_dequeue(task)) {
+
   std::unique_lock<std::mutex> lock(worker_queue_mutex_);
   tasks_available_.wait_for(lock, waitperiod);
   continue;
 }
-task.run();
+else {
+
+  std::unique_lock<std::mutex> lock(worker_queue_mutex_);
+  if (!task_status_[task.getIdentifier()]) {
+continue;
+  }
+}
+
+bool wait_to_run = false;
+if (task.getTimeSlice() > 1) {
+  auto now = std::chrono::system_clock::now().time_since_epoch();
+  auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(now);
+  if (task.getTimeSlice() > ms.count()) {
+wait_to_run = true;
+  }
+}
+// if we have to wait we re-queue the worker.
+if (wait_to_run) {
+  {
+std::unique_lock<std::mutex> lock(worker_queue_mutex_);
+if (!task_status_[task.getIdentifier()]) {
+  continue;
+}
+  }
+  worker_queue_.enqueue(std::move(task));
--- End diff --

OK. It is possible to sort the queue, or somehow arrange it so that the 
head of the queue is the first to expire.
In that case, we can avoid an enqueue/dequeue for all the items in the queues.
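The minifi code above is C++, but the "head of the queue is the first to expire" structure being suggested is, in Java terms, essentially what java.util.concurrent.DelayQueue provides; a sketch of the idea (the C++ side would need a priority queue with locking, which is the trade-off discussed above):

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class TimeSliceQueueDemo {
    // A task that becomes eligible to run at a fixed time slice (epoch millis).
    public static class TimedTask implements Delayed {
        public final String name;
        public final long runAtMillis;

        public TimedTask(String name, long runAtMillis) {
            this.name = name;
            this.runAtMillis = runAtMillis;
        }

        public long getDelay(TimeUnit unit) {
            return unit.convert(runAtMillis - System.currentTimeMillis(),
                                TimeUnit.MILLISECONDS);
        }

        public int compareTo(Delayed o) {
            return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                                o.getDelay(TimeUnit.MILLISECONDS));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        DelayQueue<TimedTask> queue = new DelayQueue<>();
        long now = System.currentTimeMillis();
        queue.put(new TimedTask("later", now + 200));
        queue.put(new TimedTask("sooner", now + 50));
        // take() blocks until the head's delay expires; the head is always the
        // first task to expire, so no spin/re-enqueue loop is needed.
        System.out.println(queue.take().name); // sooner
        System.out.println(queue.take().name); // later
    }
}
```

The cost is that the queue is internally locked and heap-ordered, rather than lock-free.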





[jira] [Commented] (NIFI-4142) Implement a ValidateRecord Processor

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096270#comment-16096270
 ] 

ASF GitHub Bot commented on NIFI-4142:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2015#discussion_r128764060
  
--- Diff: 
nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/SchemaValidationException.java
 ---
@@ -15,14 +15,16 @@
  * limitations under the License.
  */
 
-package org.apache.nifi.serialization.record;
+package org.apache.nifi.serialization;
 
-public class TypeMismatchException extends RuntimeException {
--- End diff --

I don't agree that they are pretty much the same thing. 
TypeMismatchException is very specific. SchemaValidationException can be much 
more broad. For instance, if a required field is missing, that is not a Type 
Mismatch, but it is a Schema Validation.


> Implement a ValidateRecord Processor
> 
>
> Key: NIFI-4142
> URL: https://issues.apache.org/jira/browse/NIFI-4142
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
> Fix For: 1.4.0
>
>
> We need a processor that is capable of validating that all Records in a 
> FlowFile adhere to the proper schema.
> The Processor should be configured with a Record Reader and should route each 
> record to either 'valid' or 'invalid' based on whether or not the record 
> adheres to the reader's schema. A record would be invalid in any of the 
> following cases:
> - Missing field that is required according to the schema
> - Extra field that is not present in schema (it should be configurable 
> whether or not this is a failure)
> - Field requires coercion and strict type checking enabled (this should also 
> be configurable)
> - Field is invalid, such as the value "hello" when it should be an integer





[GitHub] nifi pull request #2015: NIFI-4142: Refactored Record Reader/Writer to allow...

2017-07-21 Thread markap14
Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2015#discussion_r128764060
  
--- Diff: 
nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/SchemaValidationException.java
 ---
@@ -15,14 +15,16 @@
  * limitations under the License.
  */
 
-package org.apache.nifi.serialization.record;
+package org.apache.nifi.serialization;
 
-public class TypeMismatchException extends RuntimeException {
--- End diff --

I don't agree that they are pretty much the same thing. 
TypeMismatchException is very specific. SchemaValidationException can be much 
more broad. For instance, if a required field is missing, that is not a Type 
Mismatch, but it is a Schema Validation.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (NIFI-4213) PutHDFS umask not working

2017-07-21 Thread William Nouet (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

William Nouet updated NIFI-4213:

Description: 
The PutHDFS permission umask property is not working. The umask is set when the 
processor is scheduled to run, as shown below:

@OnScheduled
public void onScheduled(ProcessContext context) throws Exception {
    super.abstractOnScheduled(context);

    // Set umask once, to avoid thread safety issues doing it in onTrigger
    final PropertyValue umaskProp = context.getProperty(UMASK);
    final short dfsUmask;
    if (umaskProp.isSet()) {
        dfsUmask = Short.parseShort(umaskProp.getValue(), 8);
    } else {
        dfsUmask = FsPermission.DEFAULT_UMASK;
    }
    *final Configuration conf = getConfiguration();*
    *FsPermission.setUMask(conf, new FsPermission(dfsUmask));*
}

However, when the flowfile is processed, a new Configuration is loaded:

@Override
public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
    final FlowFile flowFile = session.get();
    if (flowFile == null) {
        return;
    }

    final FileSystem hdfs = getFileSystem();
    *final Configuration configuration = getConfiguration();*
    ...
}

This configuration is the one used when putting the file to HDFS; it does not 
carry the umask set previously in onScheduled, only the default from 
hdfs-site.xml. Thus, the umask property has no effect.

The fix should be straightforward: externalize the configuration and fetch it 
again in onTrigger, or set a new hdfsResources in onScheduled.

  was:
The PutHDFS permission umask property is not working. The umask is set when the 
processor is scheduled to run, as shown below:

@OnScheduled
public void onScheduled(ProcessContext context) throws Exception {
    super.abstractOnScheduled(context);

    // Set umask once, to avoid thread safety issues doing it in onTrigger
    final PropertyValue umaskProp = context.getProperty(UMASK);
    final short dfsUmask;
    if (umaskProp.isSet()) {
        dfsUmask = Short.parseShort(umaskProp.getValue(), 8);
    } else {
        dfsUmask = FsPermission.DEFAULT_UMASK;
    }
    *final Configuration conf = getConfiguration();
    FsPermission.setUMask(conf, new FsPermission(dfsUmask));*
}

However, when the flowfile is processed, a new Configuration is loaded:

@Override
public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
    final FlowFile flowFile = session.get();
    if (flowFile == null) {
        return;
    }

    final FileSystem hdfs = getFileSystem();
    *final Configuration configuration = getConfiguration();*
    ...
}

This configuration is the one used when putting the file to HDFS; it does not 
carry the umask set previously in onScheduled, only the default from 
hdfs-site.xml. Thus, the umask property has no effect.

The fix should be straightforward: externalize the configuration and fetch it 
again in onTrigger, or set a new hdfsResources in onScheduled.


> PutHDFS umask not working
> -
>
> Key: NIFI-4213
> URL: https://issues.apache.org/jira/browse/NIFI-4213
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.1.1
>Reporter: William Nouet
>
> The PutHDFS permission umask property is not working. The umask is set when 
> the processor is scheduled to run as per below:
> @OnScheduled
> public void onScheduled(ProcessContext context) throws Exception {
> super.abstractOnScheduled(context);
> // Set umask once, to avoid thread safety issues doing it in onTrigger
> final PropertyValue umaskProp = context.getProperty(UMASK);
> final short dfsUmask;
> if (umaskProp.isSet()) {
> dfsUmask = Short.parseShort(umaskProp.getValue(), 8);
> } else {
> dfsUmask = FsPermission.DEFAULT_UMASK;
> }
> *final Configuration conf = getConfiguration();*
> *FsPermission.setUMask(conf, new FsPermission(dfsUmask));*
> }
> However, when the flowfile is being processed, a new set of configuration is 
> loaded:
> @Override
> public void onTrigger(ProcessContext context, ProcessSession session) 
> throws ProcessException {
> final FlowFile flowFile = session.get();
> if (flowFile == null) {
> return;
> }
> final FileSystem hdfs = getFileSystem();
> *final Configuration configuration = getConfiguration();*
> ...
> }
> This configuration is the one which is going to be used when putting the file 
> to HDFS; hence not grabbing the 

[jira] [Resolved] (NIFI-1603) Remove deprecated content repo properties

2017-07-21 Thread Mark Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne resolved NIFI-1603.
--
Resolution: Duplicate

> Remove deprecated content repo properties
> -
>
> Key: NIFI-1603
> URL: https://issues.apache.org/jira/browse/NIFI-1603
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Configuration, Core Framework
>Affects Versions: 0.5.1
>Reporter: Aldrin Piri
>Assignee: Michael Hogue
>Priority: Trivial
>
> Both nifi.content.claim.max.appendable.size & 
> nifi.content.claim.max.flow.files no longer appear to be used with defaults 
> specified for the associated properties in FileSystemRepository's 
> maxAppendClaimLength and the configured writableClaimQueue length.
> These should either be removed from both the codebase and documentation or 
> configured to use the associated properties and have their documentation 
> updated to reflect current configuration defaults.





[jira] [Resolved] (NIFI-3736) NiFi not honoring the "nifi.content.claim.max.appendable.size" and "nifi.content.claim.max.flow.files" properties

2017-07-21 Thread Mark Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne resolved NIFI-3736.
--
   Resolution: Fixed
Fix Version/s: 1.4.0

> NiFi not honoring the "nifi.content.claim.max.appendable.size" and 
> "nifi.content.claim.max.flow.files" properties
> -
>
> Key: NIFI-3736
> URL: https://issues.apache.org/jira/browse/NIFI-3736
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Michael Hogue
> Fix For: 1.4.0
>
>
> The nifi.properties file has two properties for controlling how many 
> FlowFiles to jam into one Content Claim. Unfortunately, it looks like this is 
> no longer honored in FileSystemRepository.





[jira] [Commented] (NIFI-3736) NiFi not honoring the "nifi.content.claim.max.appendable.size" and "nifi.content.claim.max.flow.files" properties

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096260#comment-16096260
 ] 

ASF GitHub Bot commented on NIFI-3736:
--

Github user asfgit closed the pull request at:

https://github.com/apache/nifi/pull/2010


> NiFi not honoring the "nifi.content.claim.max.appendable.size" and 
> "nifi.content.claim.max.flow.files" properties
> -
>
> Key: NIFI-3736
> URL: https://issues.apache.org/jira/browse/NIFI-3736
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Michael Hogue
>
> The nifi.properties file has two properties for controlling how many 
> FlowFiles to jam into one Content Claim. Unfortunately, it looks like this is 
> no longer honored in FileSystemRepository.





[jira] [Commented] (NIFI-3736) NiFi not honoring the "nifi.content.claim.max.appendable.size" and "nifi.content.claim.max.flow.files" properties

2017-07-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096258#comment-16096258
 ] 

ASF subversion and git services commented on NIFI-3736:
---

Commit c54b2ad81c07e96f26d1ef19d0f887a0f7704da5 in nifi's branch 
refs/heads/master from m-hogue
[ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=c54b2ad ]

NIFI-3736: change to honor nifi.content.claim.max.appendable.size and 
nifi.content.claim.max.flow.files properties. Added 100 MB cap for 
NiFiProperties.MAX_APPENDABLE_CLAIM_SIZE

This closes #2010.
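The cap mentioned in the commit can be pictured as a simple clamp on the configured value. The sketch below is illustrative only; the constant and method names are hypothetical, not the actual FileSystemRepository code:

```java
// Illustrative sketch of capping a configured max appendable claim size at
// 100 MB (hypothetical names, not the actual FileSystemRepository code).
public class ClaimSizeCapSketch {
    static final long MAX_APPENDABLE_CLAIM_BYTES = 100L * 1024 * 1024; // 100 MB hard cap

    static long effectiveClaimSize(long configuredBytes) {
        // Honor the configured value but never exceed the hard cap.
        return Math.min(configuredBytes, MAX_APPENDABLE_CLAIM_BYTES);
    }

    public static void main(String[] args) {
        System.out.println(effectiveClaimSize(10L * 1024 * 1024));   // 10 MB: honored as-is
        System.out.println(effectiveClaimSize(500L * 1024 * 1024));  // 500 MB: clamped to the cap
    }
}
```

A clamp like this lets the property be honored again without letting an oversized setting degrade content-repository performance.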


> NiFi not honoring the "nifi.content.claim.max.appendable.size" and 
> "nifi.content.claim.max.flow.files" properties
> -
>
> Key: NIFI-3736
> URL: https://issues.apache.org/jira/browse/NIFI-3736
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Michael Hogue
>
> The nifi.properties file has two properties for controlling how many 
> FlowFiles to jam into one Content Claim. Unfortunately, it looks like this is 
> no longer honored in FileSystemRepository.





[jira] [Commented] (NIFI-3736) NiFi not honoring the "nifi.content.claim.max.appendable.size" and "nifi.content.claim.max.flow.files" properties

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096259#comment-16096259
 ] 

ASF GitHub Bot commented on NIFI-3736:
--

Github user markap14 commented on the issue:

https://github.com/apache/nifi/pull/2010
  
@m-hogue thanks for the update! +1 merged to master.


> NiFi not honoring the "nifi.content.claim.max.appendable.size" and 
> "nifi.content.claim.max.flow.files" properties
> -
>
> Key: NIFI-3736
> URL: https://issues.apache.org/jira/browse/NIFI-3736
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Michael Hogue
>
> The nifi.properties file has two properties for controlling how many 
> FlowFiles to jam into one Content Claim. Unfortunately, it looks like this is 
> no longer honored in FileSystemRepository.





[GitHub] nifi pull request #2010: NIFI-3736: change to honor nifi.content.claim.max.a...

2017-07-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/nifi/pull/2010




[GitHub] nifi issue #2010: NIFI-3736: change to honor nifi.content.claim.max.appendab...

2017-07-21 Thread markap14
Github user markap14 commented on the issue:

https://github.com/apache/nifi/pull/2010
  
@m-hogue thanks for the update! +1 merged to master.




[jira] [Created] (NIFI-4213) PutHDFS umask not working

2017-07-21 Thread William Nouet (JIRA)
William Nouet created NIFI-4213:
---

 Summary: PutHDFS umask not working
 Key: NIFI-4213
 URL: https://issues.apache.org/jira/browse/NIFI-4213
 Project: Apache NiFi
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: William Nouet


The PutHDFS permission umask property is not working. The umask is set when the 
processor is scheduled to run, as shown below:

@OnScheduled
public void onScheduled(ProcessContext context) throws Exception {
    super.abstractOnScheduled(context);

    // Set umask once, to avoid thread safety issues doing it in onTrigger
    final PropertyValue umaskProp = context.getProperty(UMASK);
    final short dfsUmask;
    if (umaskProp.isSet()) {
        dfsUmask = Short.parseShort(umaskProp.getValue(), 8);
    } else {
        dfsUmask = FsPermission.DEFAULT_UMASK;
    }
    *final Configuration conf = getConfiguration();
    FsPermission.setUMask(conf, new FsPermission(dfsUmask));*
}

However, when the flowfile is processed, a new Configuration is loaded:

@Override
public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
    final FlowFile flowFile = session.get();
    if (flowFile == null) {
        return;
    }

    final FileSystem hdfs = getFileSystem();
    *final Configuration configuration = getConfiguration();*
    ...
}

This configuration is the one used when putting the file to HDFS; it does not 
carry the umask set previously in onScheduled, only the default from 
hdfs-site.xml. Thus, the umask property has no effect.

The fix should be straightforward: externalize the configuration and fetch it 
again in onTrigger, or set a new hdfsResources in onScheduled.
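The failure mode described here is simply that a setting applied to one configuration object is invisible to a second, freshly constructed one. A minimal, self-contained illustration, using a hypothetical map-backed stand-in rather than Hadoop's actual Configuration class:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal hypothetical stand-in for Hadoop's Configuration, to show the bug
// pattern: a umask written into one instance never reaches a fresh instance.
class ConfigSketch {
    private final Map<String, String> props = new HashMap<>();
    void set(String key, String value) { props.put(key, value); }
    String get(String key, String defaultValue) { return props.getOrDefault(key, defaultValue); }
}

public class UmaskBugSketch {
    public static void main(String[] args) {
        ConfigSketch scheduledConf = new ConfigSketch();        // built once in onScheduled
        scheduledConf.set("fs.permissions.umask-mode", "027");  // umask applied here only

        ConfigSketch triggerConf = new ConfigSketch();          // rebuilt in onTrigger
        // The fresh instance falls back to the default, so the configured umask is lost:
        System.out.println(triggerConf.get("fs.permissions.umask-mode", "022"));
    }
}
```

Either proposed fix amounts to making both code paths read the same instance, or re-applying the umask to whichever instance onTrigger actually uses.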





[jira] [Commented] (NIFI-4200) Consider a ControlNiFi processor

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096218#comment-16096218
 ] 

ASF GitHub Bot commented on NIFI-4200:
--

Github user pvillard31 commented on the issue:

https://github.com/apache/nifi/pull/2022
  
Hey @trixpan. Just pushed a new commit with the following modifications:
- Renamed the processor to ControlNiFiComponent
- Allow the user to control processor, process group, controller service, 
reporting task
- Allow the user to define dynamic properties (with EL allowed) that will 
be used to update the configuration of the target component

This would, for instance, allow a user to define a web service to download 
files with something like:
HandleHttpRequest -> ControlNiFiComponent -failure-> HandleHttpResponse
GetHDFS -> HandleHttpResponse
This way, a user could send an HTTP request with the path of the file to 
download; the ControlNiFiComponent starts or stops the GetHDFS processor 
with that path set in the processor, and the file is sent back to the user.

I believe this would cover NIFI-890.


> Consider a ControlNiFi processor
> 
>
> Key: NIFI-4200
> URL: https://issues.apache.org/jira/browse/NIFI-4200
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Pierre Villard
>Assignee: Pierre Villard
>
> We frequently see on the mailing list the need to start/stop a processor 
> based on incoming flow files. At the moment, that's something that can be 
> scripted or that can be done using multiple InvokeHttp processors but it 
> requires a bit of work.
> Even though it is not really in the "NiFi way of thinking", it would be 
> interesting to have a processor with the following parameters:
> - NiFi REST API URL
> - Username
> - Password
> - Processor UUID (with expression language)
> - Action to perform (START, STOP, START/STOP, STOP/START)
> - Sleep duration (between the START and STOP calls when action is START/STOP, 
> or STOP/START)
> That would be helpful in use cases like:
> - start a workflow based on another workflow
> - start a processor not accepting incoming relationship based on a flow file 
> - restart a processor to "refresh" its configuration when the processor 
> relies on configuration files that could be changed
> - have a "start once" behavior





[GitHub] nifi issue #2022: NIFI-4200 - Initial commit for a ControlNiFi processor

2017-07-21 Thread pvillard31
Github user pvillard31 commented on the issue:

https://github.com/apache/nifi/pull/2022
  
Hey @trixpan. Just pushed a new commit with the following modifications:
- Renamed the processor to ControlNiFiComponent
- Allow the user to control processor, process group, controller service, 
reporting task
- Allow the user to define dynamic properties (with EL allowed) that will 
be used to update the configuration of the target component

This would, for instance, allow a user to define a web service to download 
files with something like:
HandleHttpRequest -> ControlNiFiComponent -failure-> HandleHttpResponse
GetHDFS -> HandleHttpResponse
This way, a user could send an HTTP request with the path of the file to 
download; the ControlNiFiComponent starts or stops the GetHDFS processor 
with that path set in the processor, and the file is sent back to the user.

I believe this would cover NIFI-890.




[jira] [Commented] (NIFI-4087) IdentifyMimeType: Optionally exclude filename from criteria

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096181#comment-16096181
 ] 

ASF GitHub Bot commented on NIFI-4087:
--

Github user Leah-Anderson commented on the issue:

https://github.com/apache/nifi/pull/2026
  
Hi Joe. I can go ahead and push an update to remove the comment line. I do 
see, though, that the appveyor build failed but the travis-ci one did not. 
Looking at the appveyor build, it looks like the failure wasn't related to the 
change. Is it safe to ignore for now? 


> IdentifyMimeType: Optionally exclude filename from criteria
> ---
>
> Key: NIFI-4087
> URL: https://issues.apache.org/jira/browse/NIFI-4087
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.3.0, 0.7.4
>Reporter: Brandon DeVries
>Priority: Minor
> Attachments: NIFI-4087-Add-option-to-exclude-filename-from-tika.patch
>
>
> In IdentifyMimeType\[1], the filename is always (when non-null) passed to tika 
> as a criterion in determining the mime type.  However, there are cases when 
> the filename may be known to be misleading (e.g. after decompression via 
> CompressContent with "Update Filename" set to false).  We should add a 
> boolean processor property (default true) indicating whether or not to pass 
> the filename to tika.
> \[1] 
> https://github.com/apache/nifi/blob/a9a9b67430b33944b5eefa17cb85b5dd42c8d1fc/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/IdentifyMimeType.java#L126-L129





[GitHub] nifi issue #2026: NIFI-4087 Fix to allow exclusion of filename from tika cri...

2017-07-21 Thread Leah-Anderson
Github user Leah-Anderson commented on the issue:

https://github.com/apache/nifi/pull/2026
  
Hi Joe. I can go ahead and push an update to remove the comment line. I do 
see, though, that the appveyor build failed but the travis-ci one did not. 
Looking at the appveyor build, it looks like the failure wasn't related to the 
change. Is it safe to ignore for now? 




[jira] [Commented] (NIFI-968) Add Primary Node Scheduling to User and Dev Guide

2017-07-21 Thread Pierre Villard (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096082#comment-16096082
 ] 

Pierre Villard commented on NIFI-968:
-

I added @OnPrimaryNodeStateChange information in the developer guide. I believe 
the other points of this JIRA have been addressed in recent releases of the 
documentation.

> Add Primary Node Scheduling to User and Dev Guide
> -
>
> Key: NIFI-968
> URL: https://issues.apache.org/jira/browse/NIFI-968
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Documentation & Website
>Reporter: Joseph Percivall
>Assignee: Pierre Villard
>Priority: Minor
>
> The User Guide talks about every other scheduling type except Primary Node 
> Only. This should be noted as an option in cluster environments.
> In the Dev Guide, every other annotation is talked about except for 
> OnPrimaryNodeStateChange. It should be added and potentially some information 
> about cluster environment considerations when coding could be added too 
> (currently nothing regarding cluster environments in dev guide).





[jira] [Updated] (NIFI-968) Add Primary Node Scheduling to User and Dev Guide

2017-07-21 Thread Pierre Villard (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pierre Villard updated NIFI-968:

Assignee: Pierre Villard
  Status: Patch Available  (was: Open)

> Add Primary Node Scheduling to User and Dev Guide
> -
>
> Key: NIFI-968
> URL: https://issues.apache.org/jira/browse/NIFI-968
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Documentation & Website
>Reporter: Joseph Percivall
>Assignee: Pierre Villard
>Priority: Minor
>
> The User Guide talks about every other scheduling type except Primary Node 
> Only. This should be noted as an option in cluster environments.
> In the Dev Guide, every other annotation is talked about except for 
> OnPrimaryNodeStateChange. It should be added and potentially some information 
> about cluster environment considerations when coding could be added too 
> (currently nothing regarding cluster environments in dev guide).





[jira] [Commented] (NIFI-968) Add Primary Node Scheduling to User and Dev Guide

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096080#comment-16096080
 ] 

ASF GitHub Bot commented on NIFI-968:
-

GitHub user pvillard31 opened a pull request:

https://github.com/apache/nifi/pull/2027

NIFI-968 - add @OnPrimaryNodeStateChange to dev guide

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI- where  is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pvillard31/nifi NIFI-968

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2027.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2027


commit e40136013317a6c552f8568b87a319d64a3372e1
Author: Pierre Villard 
Date:   2017-07-21T10:00:18Z

NIFI-968 - add @OnPrimaryNodeStateChange to dev guide




> Add Primary Node Scheduling to User and Dev Guide
> -
>
> Key: NIFI-968
> URL: https://issues.apache.org/jira/browse/NIFI-968
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Documentation & Website
>Reporter: Joseph Percivall
>Priority: Minor
>
> The User Guide talks about every other scheduling type except Primary Node 
> Only. This should be noted as an option in cluster environments.
> In the Dev Guide, every other annotation is talked about except for 
> OnPrimaryNodeStateChange. It should be added and potentially some information 
> about cluster environment considerations when coding could be added too 
> (currently nothing regarding cluster environments in dev guide).





[jira] [Updated] (NIFI-968) Add Primary Node Scheduling to User and Dev Guide

2017-07-21 Thread Pierre Villard (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pierre Villard updated NIFI-968:

Component/s: Documentation & Website

> Add Primary Node Scheduling to User and Dev Guide
> -
>
> Key: NIFI-968
> URL: https://issues.apache.org/jira/browse/NIFI-968
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Documentation & Website
>Reporter: Joseph Percivall
>Priority: Minor
>
> The User Guide talks about every other scheduling type except Primary Node 
> Only. This should be noted as an option in cluster environments.
> In the Dev Guide, every other annotation is talked about except for 
> OnPrimaryNodeStateChange. It should be added and potentially some information 
> about cluster environment considerations when coding could be added too 
> (currently nothing regarding cluster environments in dev guide).





[GitHub] nifi pull request #2027: NIFI-968 - add @OnPrimaryNodeStateChange to dev gui...

2017-07-21 Thread pvillard31
GitHub user pvillard31 opened a pull request:

https://github.com/apache/nifi/pull/2027

NIFI-968 - add @OnPrimaryNodeStateChange to dev guide

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI- where  is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pvillard31/nifi NIFI-968

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2027.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2027


commit e40136013317a6c552f8568b87a319d64a3372e1
Author: Pierre Villard 
Date:   2017-07-21T10:00:18Z

NIFI-968 - add @OnPrimaryNodeStateChange to dev guide






[jira] [Resolved] (NIFI-898) Improve docs specifically around clustering

2017-07-21 Thread Pierre Villard (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pierre Villard resolved NIFI-898.
-
Resolution: Fixed

Closing as documentation has been largely improved with 1.x releases.
I believe all the mentioned points have been addressed. This can be reopened if 
not.

> Improve docs specifically around clustering
> ---
>
> Key: NIFI-898
> URL: https://issues.apache.org/jira/browse/NIFI-898
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Documentation & Website
>Reporter: Edgardo Vega
>
> I just wanted to provide some feedback on some of the issues I ran into
> that I felt the docs didn't do a good job of explaining.
>- Under the Basic Cluster setup it would be good to give an example with
>hostnames ports etc to show exactly what needs to be set. This would really
>help in how many ports I need to open up, how to map to my environment into
>the correct config etc.
>- Under Controlling Levels of Access it mentions that there are other
>providers for clustered mode, but does not talk about how to configure
>them. Why not mention it there and point to the longer explanation in the
>clustering configuration section?
>- I ran into the following issues:
>   - I switched to the following providers in authority-providers.xml
>   and then ran into the issue where
> nifi.security.user.authority.provider in
>   nifi.properties need to match what the identifier is in
>   the authority-providers.xml file.
>   - I also didn't initially set the Authority Provider Port, which I had
>   to figure out by also looking at the logs and then figuring out that it
>   should be another port being opened.
> Hopefully this helps someone in the future.





[jira] [Commented] (NIFI-2162) InvokeHttp's underlying library for Digest Auth uses the Android logger

2017-07-21 Thread Bill Oates (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096074#comment-16096074
 ] 

Bill Oates commented on NIFI-2162:
--

Thanks for taking a look [~JPercivall] - appreciate it. I deployed from the 
released version, so will need to work out how to update with the code that 
you've changed, but looking forward to giving it a go.

Thanks!


> InvokeHttp's underlying library for Digest Auth uses the Android logger
> ---
>
> Key: NIFI-2162
> URL: https://issues.apache.org/jira/browse/NIFI-2162
> Project: Apache NiFi
>  Issue Type: Bug
>Reporter: Joseph Percivall
>Assignee: Joseph Percivall
>
> A user emailed the User mailing list with an issue that InvokeHttp was 
> failing due to not being able to find "android/util/Log"[1]. InvokeHttp uses 
> OkHttp and the library they recommend for digest authentication is 
> okhttp-digest[2]. Currently okhttp-digest assumes it's running on an Android 
> device and has access to the Android logger (OkHttp does not assume it's on 
> an Android device). 
> I raised an issue about it on the project's github page[3] and the creator 
> said he "Will change this soonish."
> Once that is addressed, InvokeHttp will need to update the versions of OkHttp 
> and okhttp-digest. 
> [1] http://mail-archives.apache.org/mod_mbox/nifi-users/201606.mbox/browser
> [2] https://github.com/square/okhttp/issues/205
> [3] https://github.com/rburgst/okhttp-digest/issues/13





[jira] [Resolved] (NIFI-804) Only one processor is selectable in the UI when multiple have the same classname

2017-07-21 Thread Pierre Villard (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pierre Villard resolved NIFI-804.
-
Resolution: Fixed

> Only one processor is selectable in the UI when multiple have the same 
> classname
> 
>
> Key: NIFI-804
> URL: https://issues.apache.org/jira/browse/NIFI-804
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework, Core UI
>Affects Versions: 0.2.1
>Reporter: Aldrin Piri
>
> Not sure if this is an issue in the framework itself and/or the UI, but if 
> two processors have the same classname, regardless of package name, only one 
> will show up in the Add Processor dialog.
> For example, consider org.apache.nifi.processors.standard.EncryptContent and 
> com.mycompany.nifi.processors.crypto.EncryptContent.
> Both processors will be listed in the NiFi startup process logs, but only one 
> will show up in the UI.





[jira] [Updated] (NIFI-4184) I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the PutHDFS Processor

2017-07-21 Thread Koji Kawamura (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Kawamura updated NIFI-4184:

Component/s: Extensions

>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the 
> PutHDFS Processor
> --
>
> Key: NIFI-4184
> URL: https://issues.apache.org/jira/browse/NIFI-4184
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: dav
> Fix For: 1.4.0
>
>
>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER; to 
> achieve this, I put expressionLanguageSupported(true) on the PropertyDescriptors 
> of REMOTE_GROUP and REMOTE_OWNER.





[jira] [Resolved] (NIFI-4184) I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the PutHDFS Processor

2017-07-21 Thread Koji Kawamura (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Kawamura resolved NIFI-4184.
-
   Resolution: Fixed
Fix Version/s: 1.4.0

>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the 
> PutHDFS Processor
> --
>
> Key: NIFI-4184
> URL: https://issues.apache.org/jira/browse/NIFI-4184
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: dav
> Fix For: 1.4.0
>
>
>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER; to 
> achieve this, I put expressionLanguageSupported(true) on the PropertyDescriptors 
> of REMOTE_GROUP and REMOTE_OWNER.





[jira] [Commented] (NIFI-4184) I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the PutHDFS Processor

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095927#comment-16095927
 ] 

ASF GitHub Bot commented on NIFI-4184:
--

Github user asfgit closed the pull request at:

https://github.com/apache/nifi/pull/2007


>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the 
> PutHDFS Processor
> --
>
> Key: NIFI-4184
> URL: https://issues.apache.org/jira/browse/NIFI-4184
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: dav
>
>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER; to 
> achieve this, I put expressionLanguageSupported(true) on the PropertyDescriptors 
> of REMOTE_GROUP and REMOTE_OWNER.





[GitHub] nifi pull request #2007: NIFI-4184: PutHDFS Processor Expression language TR...

2017-07-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/nifi/pull/2007


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (NIFI-4184) I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the PutHDFS Processor

2017-07-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095925#comment-16095925
 ] 

ASF subversion and git services commented on NIFI-4184:
---

Commit d334532b169e53f18765eb20c93ebcee1ac7ade0 in nifi's branch 
refs/heads/master from Davide
[ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=d334532 ]

NIFI-4184: Enabled EL on PutHDFS Remote Group and Owner

I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER; in order to 
achieve this, I put expressionLanguageSupported(true) on the PropertyDescriptors of 
REMOTE_GROUP and REMOTE_OWNER.

This closes #2007.

Signed-off-by: Davide 
Signed-off-by: Koji Kawamura 
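
The change enables NiFi Expression Language on the two properties, so the owner and group are now evaluated per FlowFile against its attributes. As a rough illustration only (plain Java, not the NiFi API; the class and method names here are made up), per-FlowFile evaluation amounts to:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ElSketch {
    // Simplified stand-in for NiFi's per-FlowFile expression evaluation:
    // replace each ${name} with the FlowFile attribute value, or with an
    // empty string when the attribute is missing (NiFi yields an empty
    // string for a missing attribute as well).
    static String evaluate(String expression, Map<String, String> attributes) {
        Matcher m = Pattern.compile("\\$\\{([^}]+)\\}").matcher(expression);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb,
                    Matcher.quoteReplacement(attributes.getOrDefault(m.group(1), "")));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> attrs = Map.of("hdfs.owner", "etl");
        System.out.println(evaluate("${hdfs.owner}", attrs)); // prints: etl
        System.out.println(evaluate("${missing}", attrs));    // prints an empty line
    }
}
```

In the actual processor this is done via context.getProperty(REMOTE_OWNER).evaluateAttributeExpressions(flowFile).getValue(), as the diffs quoted in the review comments show.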


>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the 
> PutHDFS Processor
> --
>
> Key: NIFI-4184
> URL: https://issues.apache.org/jira/browse/NIFI-4184
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: dav
>
>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER; to 
> achieve this, I put expressionLanguageSupported(true) on the PropertyDescriptors 
> of REMOTE_GROUP and REMOTE_OWNER.





[jira] [Commented] (NIFI-4184) I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the PutHDFS Processor

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095921#comment-16095921
 ] 

ASF GitHub Bot commented on NIFI-4184:
--

Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi/pull/2007
  
The AppVeyor build has been failing. There are many unit tests that do not 
work well on Windows. We've been trying to improve this, but it is not completed yet.


>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the 
> PutHDFS Processor
> --
>
> Key: NIFI-4184
> URL: https://issues.apache.org/jira/browse/NIFI-4184
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: dav
>
>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER; to 
> achieve this, I put expressionLanguageSupported(true) on the PropertyDescriptors 
> of REMOTE_GROUP and REMOTE_OWNER.





[GitHub] nifi issue #2007: NIFI-4184: PutHDFS Processor Expression language TRUE on R...

2017-07-21 Thread ijokarumawak
Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi/pull/2007
  
The AppVeyor build has been failing. There are many unit tests that do not 
work well on Windows. We've been trying to improve this, but it is not completed yet.




[jira] [Commented] (NIFI-4184) I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the PutHDFS Processor

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095918#comment-16095918
 ] 

ASF GitHub Bot commented on NIFI-4184:
--

Github user panelladavide commented on the issue:

https://github.com/apache/nifi/pull/2007
  
Thanks for your help and for your patience! Why did the AppVeyor build fail? When 
will NiFi publish a new release?


>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the 
> PutHDFS Processor
> --
>
> Key: NIFI-4184
> URL: https://issues.apache.org/jira/browse/NIFI-4184
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: dav
>
>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER; to 
> achieve this, I put expressionLanguageSupported(true) on the PropertyDescriptors 
> of REMOTE_GROUP and REMOTE_OWNER.





[GitHub] nifi issue #2007: NIFI-4184: PutHDFS Processor Expression language TRUE on R...

2017-07-21 Thread panelladavide
Github user panelladavide commented on the issue:

https://github.com/apache/nifi/pull/2007
  
Thanks for your help and for your patience! Why did the AppVeyor build fail? When 
will NiFi publish a new release?




[jira] [Commented] (NIFI-4184) I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the PutHDFS Processor

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095910#comment-16095910
 ] 

ASF GitHub Bot commented on NIFI-4184:
--

Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi/pull/2007
  
@panelladavide LGTM, +1. I'm going to merge. Thanks for your contribution!


>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the 
> PutHDFS Processor
> --
>
> Key: NIFI-4184
> URL: https://issues.apache.org/jira/browse/NIFI-4184
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: dav
>
>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER; to 
> achieve this, I put expressionLanguageSupported(true) on the PropertyDescriptors 
> of REMOTE_GROUP and REMOTE_OWNER.





[GitHub] nifi issue #2007: NIFI-4184: PutHDFS Processor Expression language TRUE on R...

2017-07-21 Thread ijokarumawak
Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi/pull/2007
  
@panelladavide LGTM, +1. I'm going to merge. Thanks for your contribution!




[jira] [Commented] (NIFI-4184) I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the PutHDFS Processor

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095843#comment-16095843
 ] 

ASF GitHub Bot commented on NIFI-4184:
--

Github user ijokarumawak commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2007#discussion_r128689940
  
--- Diff: 
nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java
 ---
@@ -386,11 +388,14 @@ public void process(InputStream in) throws 
IOException {
 });
 }
 
-protected void changeOwner(final ProcessContext context, final 
FileSystem hdfs, final Path name) {
+protected void changeOwner(final ProcessContext context, final 
FileSystem hdfs, final Path name, final FlowFile flowFile) {
 try {
 // Change owner and group of file if configured to do so
-String owner = context.getProperty(REMOTE_OWNER).getValue();
-String group = context.getProperty(REMOTE_GROUP).getValue();
+//String owner = context.getProperty(REMOTE_OWNER).getValue();
+//String group = context.getProperty(REMOTE_GROUP).getValue();
+String owner = 
context.getProperty(REMOTE_OWNER).evaluateAttributeExpressions(flowFile).getValue();
+String group = 
context.getProperty(REMOTE_GROUP).evaluateAttributeExpressions(flowFile).getValue();
+
 if (owner != null || group != null) {
--- End diff --

When I tested with an expression that returns no value, e.g. specifying an 
attribute name that does not exist, the group string became an empty string, 
and the file was updated with an empty group. To avoid that, could you 
add the following code before this if statement, to turn an empty string into null:

```
owner = owner == null || owner.isEmpty() ? null : owner;
group = group == null || group.isEmpty() ? null : group;
```


![image](https://user-images.githubusercontent.com/1107620/28451433-3b1718a2-6e28-11e7-9473-282bbfcdd881.png)
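
The suggested two lines are easy to verify standalone; a minimal self-contained sketch (the class and helper names below are illustrative, not from the NiFi codebase):

```java
public class OwnerGroupNormalize {
    // NiFi expression language yields an empty string when a referenced
    // attribute is missing, while Hadoop's FileSystem.setOwner() accepts
    // null to mean "leave owner/group unchanged" -- so normalize "" to null
    // before calling it.
    static String emptyToNull(String value) {
        return (value == null || value.isEmpty()) ? null : value;
    }

    public static void main(String[] args) {
        System.out.println(emptyToNull(""));      // prints: null
        System.out.println(emptyToNull("hdfs"));  // prints: hdfs
    }
}
```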



>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the 
> PutHDFS Processor
> --
>
> Key: NIFI-4184
> URL: https://issues.apache.org/jira/browse/NIFI-4184
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: dav
>
>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER; to 
> achieve this, I put expressionLanguageSupported(true) on the PropertyDescriptors 
> of REMOTE_GROUP and REMOTE_OWNER.





[jira] [Commented] (NIFI-4184) I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the PutHDFS Processor

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095845#comment-16095845
 ] 

ASF GitHub Bot commented on NIFI-4184:
--

Github user ijokarumawak commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2007#discussion_r128684447
  
--- Diff: 
nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java
 ---
@@ -386,11 +388,14 @@ public void process(InputStream in) throws 
IOException {
 });
 }
 
-protected void changeOwner(final ProcessContext context, final 
FileSystem hdfs, final Path name) {
+protected void changeOwner(final ProcessContext context, final 
FileSystem hdfs, final Path name, final FlowFile flowFile) {
 try {
 // Change owner and group of file if configured to do so
-String owner = context.getProperty(REMOTE_OWNER).getValue();
-String group = context.getProperty(REMOTE_GROUP).getValue();
+//String owner = context.getProperty(REMOTE_OWNER).getValue();
+//String group = context.getProperty(REMOTE_GROUP).getValue();
--- End diff --

Please remove these old lines. We don't need to keep old code unless 
there's a strong need.


>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the 
> PutHDFS Processor
> --
>
> Key: NIFI-4184
> URL: https://issues.apache.org/jira/browse/NIFI-4184
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: dav
>
>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER; to 
> achieve this, I put expressionLanguageSupported(true) on the PropertyDescriptors 
> of REMOTE_GROUP and REMOTE_OWNER.





[jira] [Commented] (NIFI-4184) I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the PutHDFS Processor

2017-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095844#comment-16095844
 ] 

ASF GitHub Bot commented on NIFI-4184:
--

Github user ijokarumawak commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2007#discussion_r128684359
  
--- Diff: 
nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java
 ---
@@ -352,7 +354,7 @@ public void process(InputStream in) throws IOException {
 + " to its final filename");
 }
 
-changeOwner(context, hdfs, copyFile);
+changeOwner(context, hdfs, copyFile,flowFile);
--- End diff --

Just add a whitespace between the arguments.


>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER in the 
> PutHDFS Processor
> --
>
> Key: NIFI-4184
> URL: https://issues.apache.org/jira/browse/NIFI-4184
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: dav
>
>  I needed to put some attributes on REMOTE_GROUP and REMOTE_OWNER; to 
> achieve this, I put expressionLanguageSupported(true) on the PropertyDescriptors 
> of REMOTE_GROUP and REMOTE_OWNER.





[GitHub] nifi pull request #2007: NIFI-4184: PutHDFS Processor Expression language TR...

2017-07-21 Thread ijokarumawak
Github user ijokarumawak commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2007#discussion_r128689940
  
--- Diff: 
nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java
 ---
@@ -386,11 +388,14 @@ public void process(InputStream in) throws 
IOException {
 });
 }
 
-protected void changeOwner(final ProcessContext context, final 
FileSystem hdfs, final Path name) {
+protected void changeOwner(final ProcessContext context, final 
FileSystem hdfs, final Path name, final FlowFile flowFile) {
 try {
 // Change owner and group of file if configured to do so
-String owner = context.getProperty(REMOTE_OWNER).getValue();
-String group = context.getProperty(REMOTE_GROUP).getValue();
+//String owner = context.getProperty(REMOTE_OWNER).getValue();
+//String group = context.getProperty(REMOTE_GROUP).getValue();
+String owner = 
context.getProperty(REMOTE_OWNER).evaluateAttributeExpressions(flowFile).getValue();
+String group = 
context.getProperty(REMOTE_GROUP).evaluateAttributeExpressions(flowFile).getValue();
+
 if (owner != null || group != null) {
--- End diff --

When I tested with an expression that returns no value, e.g. specifying an 
attribute name that does not exist, the group string became an empty string, 
and the file was updated with an empty group. To avoid that, could you 
add the following code before this if statement, to turn an empty string into null:

```
owner = owner == null || owner.isEmpty() ? null : owner;
group = group == null || group.isEmpty() ? null : group;
```


![image](https://user-images.githubusercontent.com/1107620/28451433-3b1718a2-6e28-11e7-9473-282bbfcdd881.png)





[GitHub] nifi pull request #2007: NIFI-4184: PutHDFS Processor Expression language TR...

2017-07-21 Thread ijokarumawak
Github user ijokarumawak commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2007#discussion_r128684447
  
--- Diff: 
nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java
 ---
@@ -386,11 +388,14 @@ public void process(InputStream in) throws 
IOException {
 });
 }
 
-protected void changeOwner(final ProcessContext context, final 
FileSystem hdfs, final Path name) {
+protected void changeOwner(final ProcessContext context, final 
FileSystem hdfs, final Path name, final FlowFile flowFile) {
 try {
 // Change owner and group of file if configured to do so
-String owner = context.getProperty(REMOTE_OWNER).getValue();
-String group = context.getProperty(REMOTE_GROUP).getValue();
+//String owner = context.getProperty(REMOTE_OWNER).getValue();
+//String group = context.getProperty(REMOTE_GROUP).getValue();
--- End diff --

Please remove these old lines. We don't need to keep old code unless 
there's a strong need.




[GitHub] nifi pull request #2007: NIFI-4184: PutHDFS Processor Expression language TR...

2017-07-21 Thread ijokarumawak
Github user ijokarumawak commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2007#discussion_r128684359
  
--- Diff: 
nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java
 ---
@@ -352,7 +354,7 @@ public void process(InputStream in) throws IOException {
 + " to its final filename");
 }
 
-changeOwner(context, hdfs, copyFile);
+changeOwner(context, hdfs, copyFile,flowFile);
--- End diff --

Just add a whitespace between the arguments.

