[jira] [Created] (MINIFICPP-1404) Add option to disable unity build of AWS lib

2020-11-10 Thread Gabor Gyimesi (Jira)
Gabor Gyimesi created MINIFICPP-1404:


 Summary: Add option to disable unity build of AWS lib
 Key: MINIFICPP-1404
 URL: https://issues.apache.org/jira/browse/MINIFICPP-1404
 Project: Apache NiFi MiNiFi C++
  Issue Type: New Feature
Reporter: Gabor Gyimesi


The AWS library is built with the unity build option ON by default to make the 
library smaller. With this option the library is built from a single generated 
cpp file, which is regenerated and therefore recompiled on every build. Because 
of this, if we build iteratively (for example while developing on a local 
machine), we have to rebuild the library every single time even if no change 
has occurred. We should add an option to disable the unity build in this case.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (MINIFICPP-1404) Add option to disable unity build of AWS lib

2020-11-10 Thread Gabor Gyimesi (Jira)


 [ 
https://issues.apache.org/jira/browse/MINIFICPP-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Gyimesi reassigned MINIFICPP-1404:


Assignee: Gabor Gyimesi

> Add option to disable unity build of AWS lib
> 
>
> Key: MINIFICPP-1404
> URL: https://issues.apache.org/jira/browse/MINIFICPP-1404
> Project: Apache NiFi MiNiFi C++
>  Issue Type: New Feature
>Reporter: Gabor Gyimesi
>Assignee: Gabor Gyimesi
>Priority: Trivial
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The AWS library is built with the unity build option ON by default to make 
> the library smaller. With this option the library is built from a single 
> generated cpp file, which is regenerated and therefore recompiled on every 
> build. Because of this, if we build iteratively (for example while developing 
> on a local machine), we have to rebuild the library every single time even if 
> no change has occurred. We should add an option to disable the unity build in this case.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (NIFI-7831) KeytabCredentialsService not working with HBase Clients

2020-11-10 Thread Rastislav Krist (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229083#comment-17229083
 ] 

Rastislav Krist commented on NIFI-7831:
---

Same issue here - the HBase connector keeps disconnecting after Kerberos TGT 
expiration. The issue appeared with 1.12.1.

> KeytabCredentialsService not working with HBase Clients
> ---
>
> Key: NIFI-7831
> URL: https://issues.apache.org/jira/browse/NIFI-7831
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.12.0
>Reporter: Manuel Navarro
>Priority: Major
>
> The HBase Client (both 1.x and 2.x) is not able to renew its ticket after 
> expiration with KeytabCredentialsService configured (same behaviour with the 
> principal and password configured directly in the controller service). The 
> same KeytabCredentialsService works fine with the Hive and HBase clients 
> configured in the same NiFi cluster. 
> Note that the same configuration works fine in version 1.11 (the errors 
> started to appear after upgrading from 1.11 to 1.12). 
> After 24 hours (the ticket renewal period in our case), the following error 
> appears when using HBase_2_ClientServices + HBase_2_ClientMapCacheService: 
> {code:java}
> 2020-09-17 09:00:27,014 ERROR [Relogin service.Chore.1] 
> org.apache.hadoop.hbase.AuthUtil Got exception while trying to refresh 
> credentials: loginUserFromKeyTab must be done first java.io.IOException: 
> loginUserFromKeyTab must be done first at 
> org.apache.hadoop.security.UserGroupInformation.reloginFromKeytab(UserGroupInformation.java:1194)
>  at 
> org.apache.hadoop.security.UserGroupInformation.checkTGTAndReloginFromKeytab(UserGroupInformation.java:1125)
>  at org.apache.hadoop.hbase.AuthUtil$1.chore(AuthUtil.java:206) at 
> org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186) at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at 
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> {code}
>  
> With HBase_1_1_2_ClientServices + HBase_1_1_2_ClientMapCacheService the 
> following error appears: 
>  
> {code:java}
>  2020-09-22 12:18:37,184 WARN [hconnection-0x55d9d8d1-shared--pool3-t769] 
> o.a.hadoop.hbase.ipc.AbstractRpcClient Exception encountered while connecting 
> to the server : javax.security.sasl.SaslException: GSS initiate failed 
> [Caused by GSSException: No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt)] 2020-09-22 12:18:37,197 ERROR 
> [hconnection-0x55d9d8d1-shared--pool3-t769] 
> o.a.hadoop.hbase.ipc.AbstractRpcClient SASL authentication failed. The most 
> likely cause is missing or invalid credentials. Consider 'kinit'. 
> javax.security.sasl.SaslException: GSS initiate failed at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
>  at 
> org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:612)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:157)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:738)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:735)
>  at java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:735)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:897)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:866)
>  at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1208) 
> at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:223)
>  at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:328)
>  at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.multi(ClientProtos.java:32879)
>  at 
> org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:128)
>  at 
> 
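For context on the stack traces above: the HBase relogin chore goes through Hadoop's 
UserGroupInformation, and checkTGTAndReloginFromKeytab only succeeds on a UGI that was 
originally created from a keytab. A minimal sketch of that sequence, with placeholder 
principal and keytab values rather than anything from this ticket:

{code:java}
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class KeytabReloginSketch {
    public static void main(String[] args) throws Exception {
        final Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);

        // Relogin only works if the UGI was created from a keytab; otherwise
        // checkTGTAndReloginFromKeytab() fails with "loginUserFromKeyTab must be done first".
        final UserGroupInformation ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
                "nifi@EXAMPLE.COM", "/etc/security/keytabs/nifi.keytab");  // placeholder values

        // Renew the TGT when it is close to expiry, e.g. before each batch of client calls.
        ugi.checkTGTAndReloginFromKeytab();

        // Run the HBase/Hive client calls as the logged-in principal.
        ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
            // client calls go here
            return null;
        });
    }
}
{code}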

[jira] [Commented] (NIFI-7376) Avro Single-object encoding Support

2020-11-10 Thread Alasdair Brown (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229094#comment-17229094
 ] 

Alasdair Brown commented on NIFI-7376:
--

FYI see NIFI-7917

> Avro Single-object encoding Support
> ---
>
> Key: NIFI-7376
> URL: https://issues.apache.org/jira/browse/NIFI-7376
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.9.2
>Reporter: Nathan English
>Priority: Minor
>
> For my flows I consume Avro binary-encoded messages from Kafka, which is 
> currently working great! However, going forward one of our inputs is looking 
> to provide [Avro Single-object 
> Encoding|https://avro.apache.org/docs/1.8.2/spec.html#single_object_encoding].
>  
> Avro Single Object Encoding gives us a way to avoid the overhead and message 
> size increase of embedding the schema in the message. Single Object Encoding 
> achieves this with a couple of extra fields: one to confirm the message is an 
> Avro message, and one carrying a fingerprint of the schema used to encode the 
> message.
> This is a massive benefit for us, because we have multiple devices of the 
> same type producing messages into one Kafka topic. This is fine until we 
> start upgrading these devices and schema changes may occur; this is when 
> schema fingerprinting comes into its element.
> From all the information I have found, it looks to me as if Avro Single 
> Object Encoding was added in version 1.8.2 of the Avro specification. NiFi is 
> currently using version 1.8.1 of the Avro specification based on the 
> [pom|https://github.com/apache/nifi/blob/rel/nifi-1.11.4/nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/pom.xml]
>  in the NiFi Record Serialization Services section of the GitHub project.
> I'm sure there are a million ways to tackle this issue, and I personally 
> haven't done enough research on NiFi or Avro to suggest a way to resolve it, 
> but I can tell it's not as simple as just upgrading the Avro version used.
> My thought was to upgrade the library version, then modify the Avro schema 
> registry to either take in a fingerprint value or have it calculated when the 
> registry is enabled. I'm sure it's probably not as simple as I have just made 
> it sound!
> More than happy to help where I can.
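For readers unfamiliar with the format: Avro's single-object encoding (available since 
Avro 1.8.2) prefixes each binary payload with a 2-byte marker (C3 01) and the 8-byte 
CRC-64-AVRO fingerprint of the writer schema, so a consumer can look up the right schema 
per message. A minimal standalone sketch using Avro's message API, not NiFi code; the 
record schema below is made up for illustration:

{code:java}
import java.nio.ByteBuffer;

import org.apache.avro.Schema;
import org.apache.avro.SchemaNormalization;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.generic.GenericRecordBuilder;
import org.apache.avro.message.BinaryMessageDecoder;
import org.apache.avro.message.BinaryMessageEncoder;

public class SingleObjectEncodingSketch {
    public static void main(String[] args) throws Exception {
        final Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Reading\",\"fields\":["
                + "{\"name\":\"deviceId\",\"type\":\"string\"},"
                + "{\"name\":\"value\",\"type\":\"double\"}]}");
        final GenericRecord record = new GenericRecordBuilder(schema)
                .set("deviceId", "sensor-1")
                .set("value", 42.0)
                .build();

        // Encode: 2-byte marker (C3 01) + 8-byte schema fingerprint + Avro binary body.
        final BinaryMessageEncoder<GenericRecord> encoder =
                new BinaryMessageEncoder<>(GenericData.get(), schema);
        final ByteBuffer payload = encoder.encode(record);

        // The fingerprint in the header is the CRC-64-AVRO of the normalized writer schema.
        System.out.printf("writer schema fingerprint: %016x%n",
                SchemaNormalization.parsingFingerprint64(schema));

        // Decode: the decoder matches the embedded fingerprint against its known schemas.
        final BinaryMessageDecoder<GenericRecord> decoder =
                new BinaryMessageDecoder<>(GenericData.get(), schema);
        System.out.println(decoder.decode(payload));
    }
}
{code}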



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (NIFI-7376) Avro Single-object encoding Support

2020-11-10 Thread Alasdair Brown (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229094#comment-17229094
 ] 

Alasdair Brown edited comment on NIFI-7376 at 11/10/20, 9:53 AM:
-

FYI see NIFI-7917 which is tracking Avro upgrade to 1.10


was (Author: sdairs):
FYI see NIFI-7917

> Avro Single-object encoding Support
> ---
>
> Key: NIFI-7376
> URL: https://issues.apache.org/jira/browse/NIFI-7376
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.9.2
>Reporter: Nathan English
>Priority: Minor
>
> For my flows I consume Avro binary-encoded messages from Kafka, which is 
> currently working great! However, going forward one of our inputs is looking 
> to provide [Avro Single-object 
> Encoding|https://avro.apache.org/docs/1.8.2/spec.html#single_object_encoding].
>  
> Avro Single Object Encoding gives us a way to avoid the overhead and message 
> size increase of embedding the schema in the message. Single Object Encoding 
> achieves this with a couple of extra fields: one to confirm the message is an 
> Avro message, and one carrying a fingerprint of the schema used to encode the 
> message.
> This is a massive benefit for us, because we have multiple devices of the 
> same type producing messages into one Kafka topic. This is fine until we 
> start upgrading these devices and schema changes may occur; this is when 
> schema fingerprinting comes into its element.
> From all the information I have found, it looks to me as if Avro Single 
> Object Encoding was added in version 1.8.2 of the Avro specification. NiFi is 
> currently using version 1.8.1 of the Avro specification based on the 
> [pom|https://github.com/apache/nifi/blob/rel/nifi-1.11.4/nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/pom.xml]
>  in the NiFi Record Serialization Services section of the GitHub project.
> I'm sure there are a million ways to tackle this issue, and I personally 
> haven't done enough research on NiFi or Avro to suggest a way to resolve it, 
> but I can tell it's not as simple as just upgrading the Avro version used.
> My thought was to upgrade the library version, then modify the Avro schema 
> registry to either take in a fingerprint value or have it calculated when the 
> registry is enabled. I'm sure it's probably not as simple as I have just made 
> it sound!
> More than happy to help where I can.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi-minifi-cpp] lordgamez opened a new pull request #935: MINIFICPP-1404 Add option to disable unity build of AWS library

2020-11-10 Thread GitBox


lordgamez opened a new pull request #935:
URL: https://github.com/apache/nifi-minifi-cpp/pull/935


   Thank you for submitting a contribution to Apache NiFi - MiNiFi C++.
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?
   
   - [ ] Does your PR title start with MINIFICPP-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
   
   - [ ] Has your PR been rebased against the latest commit within the target 
branch (typically main)?
   
   - [ ] Is your initial contribution a single, squashed commit?
   
   ### For code changes:
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the LICENSE file?
   - [ ] If applicable, have you updated the NOTICE file?
   
   ### For documentation related changes:
   - [ ] Have you ensured that format looks appropriate for the output in which 
it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check GitHub Actions CI 
results for build issues and submit an update to your PR as soon as possible.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (NIFI-7831) KeytabCredentialsService not working with HBase Clients

2020-11-10 Thread Anton Koval (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229082#comment-17229082
 ] 

Anton Koval commented on NIFI-7831:
---

We have the same issue after upgrading NiFi from 1.11.4 to 1.12.1.

> KeytabCredentialsService not working with HBase Clients
> ---
>
> Key: NIFI-7831
> URL: https://issues.apache.org/jira/browse/NIFI-7831
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.12.0
>Reporter: Manuel Navarro
>Priority: Major
>
> The HBase Client (both 1.x and 2.x) is not able to renew its ticket after 
> expiration with KeytabCredentialsService configured (same behaviour with the 
> principal and password configured directly in the controller service). The 
> same KeytabCredentialsService works fine with the Hive and HBase clients 
> configured in the same NiFi cluster. 
> Note that the same configuration works fine in version 1.11 (the errors 
> started to appear after upgrading from 1.11 to 1.12). 
> After 24 hours (the ticket renewal period in our case), the following error 
> appears when using HBase_2_ClientServices + HBase_2_ClientMapCacheService: 
> {code:java}
> 2020-09-17 09:00:27,014 ERROR [Relogin service.Chore.1] 
> org.apache.hadoop.hbase.AuthUtil Got exception while trying to refresh 
> credentials: loginUserFromKeyTab must be done first java.io.IOException: 
> loginUserFromKeyTab must be done first at 
> org.apache.hadoop.security.UserGroupInformation.reloginFromKeytab(UserGroupInformation.java:1194)
>  at 
> org.apache.hadoop.security.UserGroupInformation.checkTGTAndReloginFromKeytab(UserGroupInformation.java:1125)
>  at org.apache.hadoop.hbase.AuthUtil$1.chore(AuthUtil.java:206) at 
> org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186) at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at 
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> {code}
>  
> With HBase_1_1_2_ClientServices + HBase_1_1_2_ClientMapCacheService the 
> following error appears: 
>  
> {code:java}
>  2020-09-22 12:18:37,184 WARN [hconnection-0x55d9d8d1-shared--pool3-t769] 
> o.a.hadoop.hbase.ipc.AbstractRpcClient Exception encountered while connecting 
> to the server : javax.security.sasl.SaslException: GSS initiate failed 
> [Caused by GSSException: No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt)] 2020-09-22 12:18:37,197 ERROR 
> [hconnection-0x55d9d8d1-shared--pool3-t769] 
> o.a.hadoop.hbase.ipc.AbstractRpcClient SASL authentication failed. The most 
> likely cause is missing or invalid credentials. Consider 'kinit'. 
> javax.security.sasl.SaslException: GSS initiate failed at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
>  at 
> org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:612)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:157)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:738)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:735)
>  at java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:735)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:897)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:866)
>  at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1208) 
> at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:223)
>  at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:328)
>  at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.multi(ClientProtos.java:32879)
>  at 
> org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:128)
>  at 
> 

[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-11-10 Thread GitBox


adamdebreceni commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r520459478



##
File path: libminifi/test/Utils.h
##
@@ -29,4 +30,11 @@
 return std::forward(instance).method(std::forward(args)...); \
   }
 
-#endif  // LIBMINIFI_TEST_UTILS_H_
+std::string repeat(const std::string& str, std::size_t count) {

Review comment:
   reimplemented repeat using join





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi] MikeThomsen commented on a change in pull request #4570: NIFI-7879 Created record path function for UUID v5

2020-11-10 Thread GitBox


MikeThomsen commented on a change in pull request #4570:
URL: https://github.com/apache/nifi/pull/4570#discussion_r520601275



##
File path: 
nifi-commons/nifi-record-path/src/test/java/org/apache/nifi/record/path/TestRecordPath.java
##
@@ -1644,6 +1646,48 @@ public void testPadRight() {
         assertEquals("MyStringfewfewfewfew", RecordPath.compile("padRight(/someString, 20, \"few\")").evaluate(record).getSelectedFields().findFirst().get().getValue());
     }
 
+    @Test
+    public void testUuidV5() {
+        final List<RecordField> fields = new ArrayList<>();
+        fields.add(new RecordField("input", RecordFieldType.STRING.getDataType()));
+        fields.add(new RecordField("name", RecordFieldType.STRING.getDataType(), true));
+        final RecordSchema schema = new SimpleRecordSchema(fields);
+        final UUID name = UUID.fromString("67eb2232-f06e-406a-b934-e17f5fa31ae4");
+        final String input = "testing NiFi functionality";
+        final Map<String, Object> values = new HashMap<>();
+        values.put("input", input);
+        values.put("name", name.toString());

Review comment:
   I changed `name` to `namespace`.
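   For anyone following along, a version-5 UUID is the RFC 4122 name-based variant: 
SHA-1 over the namespace UUID's bytes followed by the name's bytes, with the version 
and variant bits forced afterwards. A self-contained sketch of that construction, not 
the implementation under review, reusing the namespace and input strings from the 
quoted test:

{code:java}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.UUID;

public class Uuid5Sketch {
    // RFC 4122 name-based UUID, version 5: SHA-1 over namespace bytes followed by name bytes.
    static UUID uuid5(final UUID namespace, final String name) throws Exception {
        final MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        sha1.update(ByteBuffer.allocate(16)
                .putLong(namespace.getMostSignificantBits())
                .putLong(namespace.getLeastSignificantBits())
                .array());
        final byte[] hash = sha1.digest(name.getBytes(StandardCharsets.UTF_8));

        hash[6] = (byte) ((hash[6] & 0x0F) | 0x50);  // set version to 5
        hash[8] = (byte) ((hash[8] & 0x3F) | 0x80);  // set the RFC 4122 variant

        final ByteBuffer truncated = ByteBuffer.wrap(hash, 0, 16);
        return new UUID(truncated.getLong(), truncated.getLong());
    }

    public static void main(String[] args) throws Exception {
        final UUID namespace = UUID.fromString("67eb2232-f06e-406a-b934-e17f5fa31ae4");
        // Deterministic: the same namespace + name always yields the same UUID.
        System.out.println(uuid5(namespace, "testing NiFi functionality"));
    }
}
{code}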





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] aminadinari19 commented on pull request #934: MINIFICPP-1330-add conversion from microseconds

2020-11-10 Thread GitBox


aminadinari19 commented on pull request #934:
URL: https://github.com/apache/nifi-minifi-cpp/pull/934#issuecomment-724690340


   Thank you. Will implement that tip in the future :)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] aminadinari19 removed a comment on pull request #934: MINIFICPP-1330-add conversion from microseconds

2020-11-10 Thread GitBox


aminadinari19 removed a comment on pull request #934:
URL: https://github.com/apache/nifi-minifi-cpp/pull/934#issuecomment-724690340


   Thank you. Will implement that tip in the future :)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] hunyadi-dev commented on a change in pull request #934: MINIFICPP-1330-add conversion from microseconds

2020-11-10 Thread GitBox


hunyadi-dev commented on a change in pull request #934:
URL: https://github.com/apache/nifi-minifi-cpp/pull/934#discussion_r520551504



##
File path: libminifi/test/unit/PropertyTests.cpp
##
@@ -22,6 +22,39 @@
 #include "../../include/core/Property.h"
 #include "utils/StringUtils.h"
 #include "../TestBase.h"
+namespace {

Review comment:
   It is a good practice to put all tests into anon namespaces.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] aminadinari19 commented on a change in pull request #934: MINIFICPP-1330-add conversion from microseconds

2020-11-10 Thread GitBox


aminadinari19 commented on a change in pull request #934:
URL: https://github.com/apache/nifi-minifi-cpp/pull/934#discussion_r520551885



##
File path: libminifi/test/unit/PropertyTests.cpp
##
@@ -22,6 +22,39 @@
 #include "../../include/core/Property.h"
 #include "utils/StringUtils.h"
 #include "../TestBase.h"
+namespace {
+enum class ParsingStatus { ParsingFail, ParsingSuccessful, ValuesMatch };
+enum class ConversionTestTarget { MS, NS };
+
+ParsingStatus checkTimeValue(const std::string& input, int64_t t1, core::TimeUnit t2) {
+  int64_t TimeVal;
+  core::TimeUnit unit;
+  bool parsing_succeeded = org::apache::nifi::minifi::core::Property::StringToTime(input, TimeVal, unit);
+  if (parsing_succeeded) {
+    if (TimeVal == t1 && unit == t2) {
+      return ParsingStatus::ValuesMatch;
+    } else {
+      return ParsingStatus::ParsingSuccessful;
+    }
+  } else {
+    return ParsingStatus::ParsingFail;
+  }
+}
+
+bool conversionTest(uint64_t number, core::TimeUnit unit, uint64_t check, ConversionTestTarget conversionUnit) {
+  uint64_t out;
+  bool returnStatus;

Review comment:
   Yeah that is always a better method. Will keep that in mind :)





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Resolved] (NIFI-7831) KeytabCredentialsService not working with HBase Clients

2020-11-10 Thread Pierre Villard (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pierre Villard resolved NIFI-7831.
--
Fix Version/s: 1.13.0
 Assignee: Tamas Palfy
   Resolution: Duplicate

This is most likely solved by NIFI-7954.

> KeytabCredentialsService not working with HBase Clients
> ---
>
> Key: NIFI-7831
> URL: https://issues.apache.org/jira/browse/NIFI-7831
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.12.0
>Reporter: Manuel Navarro
>Assignee: Tamas Palfy
>Priority: Major
> Fix For: 1.13.0
>
>
> The HBase Client (both 1.x and 2.x) is not able to renew its ticket after 
> expiration with KeytabCredentialsService configured (same behaviour with the 
> principal and password configured directly in the controller service). The 
> same KeytabCredentialsService works fine with the Hive and HBase clients 
> configured in the same NiFi cluster. 
> Note that the same configuration works fine in version 1.11 (the errors 
> started to appear after upgrading from 1.11 to 1.12). 
> After 24 hours (the ticket renewal period in our case), the following error 
> appears when using HBase_2_ClientServices + HBase_2_ClientMapCacheService: 
> {code:java}
> 2020-09-17 09:00:27,014 ERROR [Relogin service.Chore.1] 
> org.apache.hadoop.hbase.AuthUtil Got exception while trying to refresh 
> credentials: loginUserFromKeyTab must be done first java.io.IOException: 
> loginUserFromKeyTab must be done first at 
> org.apache.hadoop.security.UserGroupInformation.reloginFromKeytab(UserGroupInformation.java:1194)
>  at 
> org.apache.hadoop.security.UserGroupInformation.checkTGTAndReloginFromKeytab(UserGroupInformation.java:1125)
>  at org.apache.hadoop.hbase.AuthUtil$1.chore(AuthUtil.java:206) at 
> org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186) at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at 
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> {code}
>  
> With HBase_1_1_2_ClientServices + HBase_1_1_2_ClientMapCacheService the 
> following error appears: 
>  
> {code:java}
>  2020-09-22 12:18:37,184 WARN [hconnection-0x55d9d8d1-shared--pool3-t769] 
> o.a.hadoop.hbase.ipc.AbstractRpcClient Exception encountered while connecting 
> to the server : javax.security.sasl.SaslException: GSS initiate failed 
> [Caused by GSSException: No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt)] 2020-09-22 12:18:37,197 ERROR 
> [hconnection-0x55d9d8d1-shared--pool3-t769] 
> o.a.hadoop.hbase.ipc.AbstractRpcClient SASL authentication failed. The most 
> likely cause is missing or invalid credentials. Consider 'kinit'. 
> javax.security.sasl.SaslException: GSS initiate failed at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
>  at 
> org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:612)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:157)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:738)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:735)
>  at java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:735)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:897)
>  at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:866)
>  at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1208) 
> at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:223)
>  at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:328)
>  at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.multi(ClientProtos.java:32879)
>  at 
> 

[jira] [Commented] (NIFI-7716) Received error HTTP_ERROR: HTTP/1.1 420 Enhance Your Calm. Will attempt to reconnect

2020-11-10 Thread Kourge (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229264#comment-17229264
 ] 

Kourge commented on NIFI-7716:
--

Hi [~V1ncent24],

Are you sure that the same Twitter Consumer Key/Secret and Access Token/Secret 
are not used by another GetTwitter processor or by another application?
In my experience the "420 Enhance Your Calm" exception often occurs when the 
very same Twitter API credentials are used simultaneously.

> Received error HTTP_ERROR: HTTP/1.1 420 Enhance Your Calm. Will attempt to 
> reconnect
> 
>
> Key: NIFI-7716
> URL: https://issues.apache.org/jira/browse/NIFI-7716
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: NiFi Stateless
>Affects Versions: 1.11.3
> Environment: Nifi  on AWS EC2 instance
>Reporter: Vincent Naveen
>Priority: Critical
> Attachments: ErrorOnGetTwitter.JPG
>
>
> We are using a Python script in the "ExecuteStreamCommand" processor which 
> takes the Twitter user from the database via the incoming flow file 
> generated by the "QueryTableRecord" processor. We are able to successfully 
> stop the GetTwitter processor, update the IDs_To_Follow via its parameter 
> context, and start the GetTwitter processor again. The issue is that once 
> the GetTwitter processor starts, it throws the errors below:
> Received error HTTP_ERROR: HTTP/1.1 420 Enhance Your Calm. Will attempt to reconnect
> Received error STOPPED_BY_ERROR: Retries exhausted due to null. Will not 
> attempt to reconnect
> 12:42:46 IST ERROR be2a0f28-0173-1000-0bbc-4401fbf6d330
> We came to know that a wait time needs to be added whenever doing the REST 
> API calls, which we have implemented, but we are still getting the same 
> issue. We changed the Max Client Retry count to 50 as well, and still the 
> issue is not fixed. From the links below it seems the fix involves Java 
> code, but we are using a Python script in ExecuteStreamCommand. 
> https://issues.apache.org/jira/browse/NIFI-5953
> [https://github.com/apache/nifi/pull/3276]
> Could you please help us with how we need to handle this scenario? Thanks in 
> advance!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi-minifi-cpp] lordgamez commented on a change in pull request #934: MINIFICPP-1330-add conversion from microseconds

2020-11-10 Thread GitBox


lordgamez commented on a change in pull request #934:
URL: https://github.com/apache/nifi-minifi-cpp/pull/934#discussion_r520558191



##
File path: libminifi/test/unit/PropertyTests.cpp
##
@@ -22,6 +22,39 @@
 #include "../../include/core/Property.h"
 #include "utils/StringUtils.h"
 #include "../TestBase.h"
+namespace {

Review comment:
   Thanks, I was just wondering if it was due to a linker error or some 
other reason.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi] pvillard31 commented on pull request #4576: NIFI-7886: FetchAzureBlobStorage, FetchS3Object, and FetchGCSObject processors should be able to fetch ranges

2020-11-10 Thread GitBox


pvillard31 commented on pull request #4576:
URL: https://github.com/apache/nifi/pull/4576#issuecomment-724707553


   No, I just don't have the time to make any tests on my side with all the 
cloud providers right now (it seems this has been tested extensively on AWS, 
but we need to make sure this works well on ADLS/BlobStorage/GCS as well). If 
any other committer can give a +1, I'm good with this.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi] MikeThomsen commented on pull request #4570: NIFI-7879 Created record path function for UUID v5

2020-11-10 Thread GitBox


MikeThomsen commented on pull request #4570:
URL: https://github.com/apache/nifi/pull/4570#issuecomment-724761862


   > Just a consistency thing, but the UUID-related functions in EL are 
uppercase and here it's lower.
   
   Everything on record path looks camel case, so that's what I was going with.
   
   > The other modules in nifi-commons aren't nifi-commons-* so I think this 
should just be nifi-uuid5.
   
   Yup. The name will be changed when I push the updates.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi] pvillard31 commented on a change in pull request #4644: NIFI-7974 - Upgrading calcite, hbase, geo2ip deps

2020-11-10 Thread GitBox


pvillard31 commented on a change in pull request #4644:
URL: https://github.com/apache/nifi/pull/4644#discussion_r520571693



##
File path: 
nifi-nar-bundles/nifi-standard-services/nifi-hbase_1_1_2-client-service-bundle/nifi-hbase_1_1_2-client-service/pom.xml
##
@@ -25,7 +25,7 @@
 nifi-hbase_1_1_2-client-service
 jar
 
-1.1.13
+1.4.13

Review comment:
   Oh my bad, I don't recall that discussion. I discussed with some HBase folks 
and the changes are non-existent on the client side when it comes to data 
push/pull across the 1.x line (even between 1.x and 2.x, to be honest, the 
changes are mostly around administration and management APIs). This upgrade is 
motivated by transitive dependencies, but I can remove this specific change if 
needed.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Resolved] (NIFI-6618) Allow 'Change Version' even when local changes exist, which would forget those changes for the version that is selected.

2020-11-10 Thread Pierre Villard (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pierre Villard resolved NIFI-6618.
--
Resolution: Information Provided

> Allow 'Change Version' even when local changes exist, which would forget 
> those changes for the version that is selected.
> 
>
> Key: NIFI-6618
> URL: https://issues.apache.org/jira/browse/NIFI-6618
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Flow Versioning
>Reporter: George Knaggs
>Priority: Major
>
> With an upgrade I performed from 1.8 to 1.9.2, the UpdateAttribute processor 
> was updated with a new property.  This new property was perceived by 
> versioning as local changes that needed to be committed to the NiFi Registry. 
>  The problem was that this versioned flow was a nested flow that I have 
> re-used in multiple data flows, and I was not able to commit the change once 
> and propagate that change to all other copies. 
> After I committed the upgrade-related change to the first occurrence of the 
> nested versioned flow, all other occurrences of the nested versioned flow 
> changed their state/status from '* Locally modified' to '! Locally modified 
> and stale'.  In this state, you can no longer commit changes since the flow 
> is no longer on the latest version.  The only option you have is to revert 
> the local changes, but since the changes are a result of the upgrade, it is 
> impossible to revert those changes.  Even if you choose the revert option, 
> the changes are not removed and the state does not change.  
> Currently the only options you have are to stop version control, removing 
> the ability to propagate any future enhancements or bug fixes to that 
> nested versioned flow, OR to delete the instance of that nested flow and 
> re-import it from the NiFi Registry, selecting the latest version committed 
> with the changes due to the upgrade.  Repeating this activity multiple times 
> is slow and error prone, as one has to manually verify that every variable 
> within the nested versioned flow (at top level and any contained nested pgs) 
> is set correctly.
> Ideally, the option to 'Change Version' would be available under the version 
> context menu, even when local changes are present.  This would forget any 
> local changes, replacing the versioned flow for the version that is selected, 
> but preserving the variable settings as per the current functionality.  This 
> would provide the same result in situations not related to an upgrade where 
> the user is able to revert the changes and then change the version to 
> another.  
> Request priority for this functionality by the next major release as this 
> issue is particularly painful when upgrading nested versioned flows.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi-minifi-cpp] lordgamez commented on a change in pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-11-10 Thread GitBox


lordgamez commented on a change in pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917#discussion_r520519481



##
File path: extensions/libarchive/CompressContent.cpp
##
@@ -60,10 +60,29 @@ core::Property CompressContent::EncapsulateInTar(
     "If false, on compression the content of the FlowFile simply gets compressed, and on decompression a simple compressed content is expected.\n"
     "true is the behaviour compatible with older MiNiFi C++ versions, false is the behaviour compatible with NiFi.")
     ->isRequired(false)->withDefaultValue(true)->build());
+core::Property CompressContent::BatchSize(
+    core::PropertyBuilder::createProperty("Batch Size")
+    ->withDescription("Maximum number of FlowFiles processed in a single session")
+    ->withDefaultValue(1)->build());
 
 core::Relationship CompressContent::Success("success", "FlowFiles will be transferred to the success relationship after successfully being compressed or decompressed");
 core::Relationship CompressContent::Failure("failure", "FlowFiles will be transferred to the failure relationship if they fail to compress/decompress");
 
+std::map CompressContent::compressionFormatMimeTypeMap_{

Review comment:
   Sure that sounds okay





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] arpadboda closed pull request #917: MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors

2020-11-10 Thread GitBox


arpadboda closed pull request #917:
URL: https://github.com/apache/nifi-minifi-cpp/pull/917


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Resolved] (MINIFICPP-1380) MergeContent batch process incoming files

2020-11-10 Thread Adam Debreceni (Jira)


 [ 
https://issues.apache.org/jira/browse/MINIFICPP-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Debreceni resolved MINIFICPP-1380.
---
Resolution: Implemented

> MergeContent batch process incoming files
> -
>
> Key: MINIFICPP-1380
> URL: https://issues.apache.org/jira/browse/MINIFICPP-1380
> Project: Apache NiFi MiNiFi C++
>  Issue Type: Improvement
>Reporter: Adam Debreceni
>Assignee: Adam Debreceni
>Priority: Major
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Currently the MergeContent processor dequeues flowFiles one-by-one on each 
> onTrigger, instead we should process as many flowFiles as we can but up to a 
> user provided maximum batchSize count.
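The C++ change itself is quoted in the PR #917 discussion above; for comparison, in 
NiFi's Java processor API the same batching idea is a single session.get(batchSize) 
call per trigger instead of a one-by-one dequeue. A minimal sketch, not the MiNiFi 
implementation (BATCH_SIZE stands in for the user-provided property):

{code:java}
import java.util.List;

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.processor.exception.ProcessException;

public class BatchDequeueSketch extends AbstractProcessor {
    static final Relationship REL_SUCCESS = new Relationship.Builder()
            .name("success")
            .build();

    // Stand-in for a user-configurable "Batch Size" property.
    private static final int BATCH_SIZE = 100;

    @Override
    public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
        // Dequeue up to BATCH_SIZE FlowFiles per trigger instead of one at a time.
        final List<FlowFile> flowFiles = session.get(BATCH_SIZE);
        for (final FlowFile flowFile : flowFiles) {
            // real work (compression, merging, ...) would happen here
            session.transfer(flowFile, REL_SUCCESS);
        }
    }
}
{code}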



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi] MikeThomsen commented on a change in pull request #4570: NIFI-7879 Created record path function for UUID v5

2020-11-10 Thread GitBox


MikeThomsen commented on a change in pull request #4570:
URL: https://github.com/apache/nifi/pull/4570#discussion_r520560682



##
File path: nifi-commons/nifi-commons-uuid5/pom.xml
##
@@ -0,0 +1,38 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <modelVersion>4.0.0</modelVersion>
+    <dependencies>
+        <dependency>
+            <groupId>org.apache.commons</groupId>
+            <artifactId>commons-lang3</artifactId>
+            <version>3.10</version>
+            <scope>compile</scope>
+        </dependency>
+        <dependency>
+            <groupId>commons-codec</groupId>
+            <artifactId>commons-codec</artifactId>
+            <version>1.14</version>
+            <scope>compile</scope>
+        </dependency>
+    </dependencies>
+    <parent>
+        <groupId>org.apache.nifi</groupId>
+        <artifactId>nifi-commons</artifactId>
+        <version>1.13.0-SNAPSHOT</version>
+    </parent>
+    <artifactId>nifi-commons-uuid5</artifactId>

Review comment:
   Will do.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi] MikeThomsen commented on a change in pull request #4570: NIFI-7879 Created record path function for UUID v5

2020-11-10 Thread GitBox


MikeThomsen commented on a change in pull request #4570:
URL: https://github.com/apache/nifi/pull/4570#discussion_r520560456



##
File path: nifi-docs/src/main/asciidoc/record-path-guide.adoc
##
@@ -869,7 +869,42 @@ The following record path expression would append '@' 
characters to the input St
 | `padRight(/name, 15, '@')` | john smith@
 |==
 
+=== uuid5

Review comment:
   I rewrote the documentation, so it should be good now.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (NIFI-7989) Add Hive "data drift" processor

2020-11-10 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-7989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-7989:
---
Status: Patch Available  (was: In Progress)

> Add Hive "data drift" processor
> ---
>
> Key: NIFI-7989
> URL: https://issues.apache.org/jira/browse/NIFI-7989
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It would be nice to have a Hive processor (one for each Hive NAR) that could 
> check an incoming record-based flowfile against a destination table, and 
> either add columns and/or partition values, or even create the table if it 
> does not exist. Such a processor could be used in a flow where the incoming 
> data's schema can change and we want to be able to write it to a Hive table, 
> preferably by using PutHDFS, PutParquet, or PutORC to place it directly where 
> it can be queried.
> Such a processor should be able to use a HiveConnectionPool to execute any 
> DDL (ALTER TABLE ADD COLUMN, e.g.) necessary to make the table match the 
> incoming data. For Partition Values, they could be provided via a property 
> that supports Expression Language. In such a case, an ALTER TABLE would be 
> issued to add the partition directory.
> Whether the table is created or updated, and whether there are partition 
> values to consider, an attribute should be written to the outgoing flowfile 
> corresponding to the location of the table (and any associated partitions). 
> This supports the idea of having a flow that updates a Hive table based on 
> the incoming data, and then allows the user to put the flowfile directly into 
> the destination location (PutHDFS, e.g.) instead of having to load it using 
> HiveQL or being subject to the restrictions of Hive Streaming tables 
> (ORC-backed, transactional, etc.)
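As a rough illustration of the "data drift" DDL described above (not the API of the 
processor added in the PR below), generating an ALTER TABLE statement for columns 
present in the incoming records but missing from the table might look like this; the 
helper and its inputs are hypothetical:

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;

public class HiveDriftSketch {
    // Builds an ALTER TABLE ... ADD COLUMNS statement for record fields missing from the table.
    // Field names and Hive types are assumed to be resolved already by the caller.
    static String buildAddColumnsDdl(final String tableName,
                                     final Map<String, String> incomingFields,
                                     final Set<String> existingColumns) {
        final Map<String, String> missing = new LinkedHashMap<>(incomingFields);
        missing.keySet().removeAll(existingColumns);
        if (missing.isEmpty()) {
            return null;  // table already matches the incoming record schema
        }
        final StringJoiner columns = new StringJoiner(", ", "(", ")");
        missing.forEach((name, hiveType) -> columns.add("`" + name + "` " + hiveType));
        return "ALTER TABLE " + tableName + " ADD COLUMNS " + columns;
    }

    public static void main(String[] args) {
        final Map<String, String> incoming = new LinkedHashMap<>();
        incoming.put("device_id", "STRING");
        incoming.put("reading", "DOUBLE");
        incoming.put("site", "STRING");
        // Prints: ALTER TABLE sensor_readings ADD COLUMNS (`site` STRING)
        System.out.println(buildAddColumnsDdl("sensor_readings", incoming, Set.of("device_id", "reading")));
    }
}
{code}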



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi] mattyb149 opened a new pull request #4653: NIFI-7989: Add UpdateHiveTable processors for data drift capability

2020-11-10 Thread GitBox


mattyb149 opened a new pull request #4653:
URL: https://github.com/apache/nifi/pull/4653


   Thank you for submitting a contribution to Apache NiFi.
   
   Please provide a short description of the PR here:
   
    Description of PR
   
   This PR adds an UpdateHive3Table processor (and copies of it for Hive 1 and 
Hive 1.1 packages) that compares an incoming flow file against a Hive table 
definition and will add columns and partition values as needed, and will create 
a table using the incoming record schema if configured to do so.
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [x] Is there a JIRA ticket associated with this PR? Is it referenced 
in the commit message?
   
   - [x] Does your PR title start with **NIFI-XXXXX** where XXXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
   
   - [x] Has your PR been rebased against the latest commit within the target 
branch (typically `main`)?
   
   - [x] Is your initial contribution a single, squashed commit? _Additional 
commits in response to PR reviewer feedback should be made on this branch and 
pushed to allow change tracking. Do not `squash` or use `--force` when pushing 
to allow for clean monitoring of changes._
   
   ### For code changes:
   - [ ] Have you ensured that the full suite of tests is executed via `mvn 
-Pcontrib-check clean install` at the root `nifi` folder?
   - [x] Have you written or updated unit tests to verify your changes?
   - [x] Have you verified that the full build is successful on JDK 8?
   - [ ] Have you verified that the full build is successful on JDK 11?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
   - [ ] If applicable, have you updated the `LICENSE` file, including the main 
`LICENSE` file under `nifi-assembly`?
   - [ ] If applicable, have you updated the `NOTICE` file, including the main 
`NOTICE` file found under `nifi-assembly`?
   - [ ] If adding new Properties, have you added `.displayName` in addition to 
.name (programmatic access) for each of the new properties?
   
   ### For documentation related changes:
   - [ ] Have you ensured that format looks appropriate for the output in which 
it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check GitHub Actions CI for 
build issues and submit an update to your PR as soon as possible.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (NIFI-7992) Content Repository can fail to cleanup archive directory fast enough

2020-11-10 Thread Joe Witt (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229447#comment-17229447
 ] 

Joe Witt commented on NIFI-7992:


Didn't review the code in detail but did review this writeup and thinking back 
about 8 years ago when I think we last talked about this...the writeup/change 
makes a lot of sense!

> Content Repository can fail to cleanup archive directory fast enough
> 
>
> Key: NIFI-7992
> URL: https://issues.apache.org/jira/browse/NIFI-7992
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Critical
> Fix For: 1.13.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the scenario where a user is generating many small FlowFiles and has the 
> "nifi.content.claim.max.appendable.size" property set to a small value, we 
> can encounter a situation where data is constantly archived but not cleaned 
> up quickly enough. As a result, the Content Repository can run out of space.
> The FileSystemRepository has a backpressure mechanism built in to avoid 
> allowing this to happen, but under the above conditions, it can sometimes 
> fail to prevent this situation. The backpressure mechanism works by 
> performing the following steps:
>  # When a new Content Claim is created, the Content Repository determines 
> which 'container' to use.
>  # Content Repository checks if the amount of storage space used for the 
> container is greater than the configured backpressure threshold.
>  # If so, the thread blocks until a background task completes cleanup of the 
> archive directories.
> However, in Step #2 above, it determines if the amount of space currently 
> being used by looking at a cached member variable. That cached member 
> variable is only updated on the first iteration, and when the said background 
> task completes.
> So, now consider a case where there are millions of files in the content 
> repository archive. The background task could take a massive amount of time 
> performing cleanup. Meanwhile, processors are able to write to the repository 
> without any backpressure being applied because the background task hasn't 
> updated the cached variable for the amount of space used. This continues 
> until the content repository fills.
> There are three important very simple things that should be changed:
>  # The background task should be faster in this case. While we cannot improve 
> the amount of time it takes to destroy the files, we do create an ArrayList 
> to hold all of the file info and then use an iterator, calling remove(). 
> Under the hood, this creates a copy of the underlying array for each file 
> that is removed. On my laptop, performing this procedure on an ArrayList with 
> 1 million elements took approximately 1 minute. Changing to a LinkedList took 
> 15 milliseconds but took much more heap. Keeping an ArrayList, then removing 
> all of the elements at the end (via ArrayList.subList(0, n).clear()) resulted in 
> similar performance to LinkedList with the memory footprint of ArrayList.
>  # The check to see whether or not the content repository's usage has crossed 
> the threshold should not rely entirely on a cache that is populated by a 
> process that can take a long time. It should periodically calculate the disk 
> usage itself (perhaps once per minute).
>  # When backpressure does get applied, it can appear that the system has 
> frozen up, not performing any sort of work. The background task that is 
> clearing space should periodically log its progress at INFO level to allow 
> users to understand that this action is taking place.
>  
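A standalone sketch of the list-handling point in item 1 above, not the repository code 
itself: removing the leading elements through an Iterator shifts the ArrayList's backing 
array on every call, while clearing a leading subList view compacts it once.

{code:java}
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ArchiveCleanupSketch {
    // Slow pattern: each Iterator.remove() shifts the ArrayList's remaining elements.
    static void removeLeadingWithIterator(final List<String> archive, final int n) {
        final Iterator<String> it = archive.iterator();
        for (int i = 0; i < n && it.hasNext(); i++) {
            it.next();
            it.remove();  // O(size) array shift per removed element
        }
    }

    // Faster pattern, same memory footprint: clear a leading view in a single compaction.
    static void removeLeadingWithSubList(final List<String> archive, final int n) {
        archive.subList(0, Math.min(n, archive.size())).clear();
    }

    public static void main(String[] args) {
        final List<String> archive = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            archive.add("claim-" + i);
        }
        final long start = System.nanoTime();
        removeLeadingWithSubList(archive, 900_000);
        System.out.printf("cleanup took %d ms, %d entries left%n",
                (System.nanoTime() - start) / 1_000_000, archive.size());
    }
}
{code}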



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi] asfgit closed pull request #4652: NIFI-7992: Periodically check disk usage for content repo to see if b…

2020-11-10 Thread GitBox


asfgit closed pull request #4652:
URL: https://github.com/apache/nifi/pull/4652


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (NIFI-7992) Content Repository can fail to cleanup archive directory fast enough

2020-11-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229503#comment-17229503
 ] 

ASF subversion and git services commented on NIFI-7992:
---

Commit badcfe1ab7b7166decb92a0d427ba48fbf613400 in nifi's branch 
refs/heads/main from Mark Payne
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=badcfe1 ]

NIFI-7992: Periodically check disk usage for content repo to see if 
backpressure should be applied. Log progress in background task. Improve 
performance of background cleanup task by not using an ArrayList Iterator and 
constantly calling remove but instead wait until the end of our cleanup loop 
and then removed from the list all elements that should be removed in a single 
update

This closes #4652.

Signed-off-by: Bryan Bende 


> Content Repository can fail to cleanup archive directory fast enough
> 
>
> Key: NIFI-7992
> URL: https://issues.apache.org/jira/browse/NIFI-7992
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Critical
> Fix For: 1.13.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> For the scenario where a user is generating many small FlowFiles and has the 
> "nifi.content.claim.max.appendable.size" property set to a small value, we 
> can encounter a situation where data is constantly archived but not cleaned 
> up quickly enough. As a result, the Content Repository can run out of space.
> The FileSystemRepository has a backpressure mechanism built in to avoid 
> allowing this to happen, but under the above conditions, it can sometimes 
> fail to prevent this situation. The backpressure mechanism works by 
> performing the following steps:
>  # When a new Content Claim is created, the Content Repository determines 
> which 'container' to use.
>  # Content Repository checks if the amount of storage space used for the 
> container is greater than the configured backpressure threshold.
>  # If so, the thread blocks until a background task completes cleanup of the 
> archive directories.
> However, in Step #2 above, it determines if the amount of space currently 
> being used by looking at a cached member variable. That cached member 
> variable is only updated on the first iteration, and when the said background 
> task completes.
> So, now consider a case where there are millions of files in the content 
> repository archive. The background task could take a massive amount of time 
> performing cleanup. Meanwhile, processors are able to write to the repository 
> without any backpressure being applied because the background task hasn't 
> updated the cached variable for the amount of space used. This continues 
> until the content repository fills.
> There are three important very simple things that should be changed:
>  # The background task should be faster in this case. While we cannot improve 
> the amount of time it takes to destroy the files, we do create an ArrayList 
> to hold all of the file info and then use an iterator, calling remove(). 
> Under the hood, this creates a copy of the underlying array for each file 
> that is removed. On my laptop, performing this procedure on an ArrayList with 
> 1 million elements took approximately 1 minute. Changing to a LinkedList took 
> 15 milliseconds but took much more heap. Keeping an ArrayList, then removing 
> all of the elements at the end (via ArrayList.subList(0, n).clear()) resulted in 
> similar performance to LinkedList with the memory footprint of ArrayList.
>  # The check to see whether or not the content repository's usage has crossed 
> the threshold should not rely entirely on a cache that is populated by a 
> process that can take a long time. It should periodically calculate the disk 
> usage itself (perhaps once per minute).
>  # When backpressure does get applied, it can appear that the system has 
> frozen up, not performing any sort of work. The background task that is 
> clearing space should periodically log its progress at INFO level to allow 
> users to understand that this action is taking place.
>  
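
A minimal, self-contained sketch of the removal pattern point 1 describes; the claim list and the shouldDestroy()/destroy() helpers are illustrative placeholders, not the actual FileSystemRepository code:

{code:java}
import java.util.ArrayList;
import java.util.List;

public class BatchRemovalSketch {

    public static void main(String[] args) {
        final List<String> archivedClaims = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            archivedClaims.add("claim-" + i);
        }

        // Removing elements one by one through an Iterator shifts the rest of the
        // backing array on every remove() call, which is O(n^2) for a large prefix.
        // Instead, destroy first and drop the processed prefix in a single compaction:
        int processed = 0;
        for (final String claim : archivedClaims) {
            if (!shouldDestroy(claim)) {
                break; // stop at the first claim that must be kept
            }
            destroy(claim);
            processed++;
        }
        archivedClaims.subList(0, processed).clear();
    }

    private static boolean shouldDestroy(String claim) {
        return true; // placeholder for the real age/size-based decision
    }

    private static void destroy(String claim) {
        // placeholder for the actual file deletion
    }
}
{code}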



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (NIFI-7988) Prometheus Remote Write Processor

2020-11-10 Thread Javi Roman (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-7988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Javi Roman updated NIFI-7988:
-
Description: 
A new processor that allows NiFi to store Prometheus metrics as a Remote Write 
Adapter.  

Prometheus's local storage is limited by single nodes in its scalability and 
durability. Instead of trying to solve clustered storage in Prometheus itself, 
Prometheus has a set of interfaces that allow integrating with remote storage 
systems.

The Remote Write feature in Prometheus allows sending samples to a third party 
storage system. There is a list of specialized remote endpoints here [1].

With this NiFi Prometheus Remote Write Processor you can store the metrics in 
any storage supported by NiFi, even in several storages at the same time, 
taking advantage of NiFi's routing capabilities.

This processor has two user defined working modes:
 # One FlowFile per Prometheus sample.
 # One FlowFile every N samples defined by the user. This mode allows storing 
the samples in bunches in an easy way without needing other NiFi processors for 
aggregations.

The user decides the operation mode.

The content of the FlowFiles is in JSON format.

[1] 
[https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage]
 

  was:
A new processor that allows NiFi to store Prometheus metrics as a Remote Write 
Adapter.  

Prometheus's local storage is limited by single nodes in its scalability and 
durability. Instead of trying to solve clustered storage in Prometheus itself, 
Prometheus has a set of interfaces that allow integrating with remote storage 
systems.

The Remote Write feature in Prometheus allows sending samples to a third party 
storage system. There is a list of specialized remote endpoints here [1].

With this NiFi Prometheus Remote Write Processor you can store the metrics in 
whatever storage supported in NiFi, even store in several storages with the 
advantages of NiFi routing capabilities.

This processor has two user defined working modes:
 # One FlowFile per Prometheus sample.
 # One FlowFile every N samples defined by the user. This mode allows storing 
the samples in bunches in an easy way without needing other NiFi processors for 
aggregations.

The user decides the operation mode.

The content of the FlowFiles is JSON format.

[1] 
[https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage]
 


> Prometheus Remote Write Processor 
> --
>
> Key: NIFI-7988
> URL: https://issues.apache.org/jira/browse/NIFI-7988
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Javi Roman
>Assignee: Javi Roman
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A new processor that allows NiFi to store Prometheus metrics as a Remote 
> Write Adapter.  
> Prometheus's local storage is limited by single nodes in its scalability and 
> durability. Instead of trying to solve clustered storage in Prometheus 
> itself, Prometheus has a set of interfaces that allow integrating with remote 
> storage systems.
> The Remote Write feature in Prometheus allows sending samples to a third 
> party storage system. There is a list of specialized remote endpoints here 
> [1].
> With this NiFi Prometheus Remote Write Processor you can store the metrics in 
> any storage supported by NiFi, even in several storages at the same time, 
> taking advantage of NiFi's routing capabilities.
> This processor has two user defined working modes:
>  # One FlowFile per Prometheus sample.
>  # One FlowFile every N samples defined by the user. This mode allows storing 
> the samples in bunches in an easy way without needing other NiFi processors 
> for aggregations.
> The user decides the operation mode.
> The content of the FlowFiles is in JSON format.
> [1] 
> [https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage]
>  
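
For illustration only, one possible JSON shape for a single remote-write sample carried in a FlowFile; the actual field names are defined by the processor implementation in the PR, not by this ticket:

{code}
{
  "timeseries": [
    {
      "labels": [
        { "name": "__name__", "value": "node_cpu_seconds_total" },
        { "name": "instance", "value": "host1:9100" }
      ],
      "samples": [
        { "timestamp": 1605000000000, "value": 123.45 }
      ]
    }
  ]
}
{code}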



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi] bbende commented on pull request #4652: NIFI-7992: Periodically check disk usage for content repo to see if b…

2020-11-10 Thread GitBox


bbende commented on pull request #4652:
URL: https://github.com/apache/nifi/pull/4652#issuecomment-724947542


   Thanks for such a thorough JIRA write-up. The code changes look good. I was 
able to test by setting the claim size really low and saw the new log message 
printing the cleanup status; eventually the archive was cleaned up and 
everything looks good. Going to merge.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (NIFI-7992) Content Repository can fail to cleanup archive directory fast enough

2020-11-10 Thread Bryan Bende (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-7992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Bende updated NIFI-7992:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Content Repository can fail to cleanup archive directory fast enough
> 
>
> Key: NIFI-7992
> URL: https://issues.apache.org/jira/browse/NIFI-7992
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Critical
> Fix For: 1.13.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> For the scenario where a user is generating many small FlowFiles and has the 
> "nifi.content.claim.max.appendable.size" property set to a small value, we 
> can encounter a situation where data is constantly archived but not cleaned 
> up quickly enough. As a result, the Content Repository can run out of space.
> The FileSystemRepository has a backpressure mechanism built in to avoid 
> allowing this to happen, but under the above conditions, it can sometimes 
> fail to prevent this situation. The backpressure mechanism works by 
> performing the following steps:
>  # When a new Content Claim is created, the Content Repository determines 
> which 'container' to use.
>  # Content Repository checks if the amount of storage space used for the 
> container is greater than the configured backpressure threshold.
>  # If so, the thread blocks until a background task completes cleanup of the 
> archive directories.
> However, in Step #2 above, it determines the amount of space currently 
> being used by looking at a cached member variable. That cached member 
> variable is only updated on the first iteration, and when the said background 
> task completes.
> So, now consider a case where there are millions of files in the content 
> repository archive. The background task could take a massive amount of time 
> performing cleanup. Meanwhile, processors are able to write to the repository 
> without any backpressure being applied because the background task hasn't 
> updated the cached variable for the amount of space used. This continues 
> until the content repository fills.
> There are three important very simple things that should be changed:
>  # The background task should be faster in this case. While we cannot improve 
> the amount of time it takes to destroy the files, we do create an ArrayList 
> to hold all of the file info and then use an iterator, calling remove(). 
> Under the hood, this creates a copy of the underlying array for each file 
> that is removed. On my laptop, performing this procedure on an ArrayList with 
> 1 million elements took approximately 1 minute. Changing to a LinkedList took 
> 15 milliseconds but took much more heap. Keeping an ArrayList, then removing 
> all of the elements at the end (via ArrayList.subList(0, n).clear()) resulted in 
> similar performance to LinkedList with the memory footprint of ArrayList.
>  # The check to see whether or not the content repository's usage has crossed 
> the threshold should not rely entirely on a cache that is populated by a 
> process that can take a long time. It should periodically calculate the disk 
> usage itself (perhaps once per minute).
>  # When backpressure does get applied, it can appear that the system has 
> frozen up, not performing any sort of work. The background task that is 
> clearing space should periodically log its progress at INFO level to allow 
> users to understand that this action is taking place.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (NIFI-7994) ReplaceText concurrency issue

2020-11-10 Thread Peter Turcsanyi (Jira)
Peter Turcsanyi created NIFI-7994:
-

 Summary: ReplaceText concurrency issue
 Key: NIFI-7994
 URL: https://issues.apache.org/jira/browse/NIFI-7994
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Reporter: Peter Turcsanyi
Assignee: Peter Turcsanyi


ReplaceText, when scheduled to run with multiple Concurrent Tasks, can result 
in corrupted data due to the concurrent / unprotected usage of the 
{{additionalAttrs}} map (storing the capturing groups with their current 
matches) in the {{RegexReplace}} class 
(https://github.com/apache/nifi/blob/badcfe1ab7b7166decb92a0d427ba48fbf613400/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ReplaceText.java#L497).
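
A stripped-down illustration of the race (and the obvious per-invocation alternative); the class and method names below are made up for the example and are not the actual ReplaceText code:

{code:java}
import java.util.HashMap;
import java.util.Map;

class SharedStateReplacer {
    // One map shared by all concurrent tasks: another thread can overwrite
    // the capture-group entry between the put and the subsequent read.
    private final Map<String, String> additionalAttrs = new HashMap<>();

    String replace(final String groupValue) {
        additionalAttrs.put("$1", groupValue);
        return additionalAttrs.get("$1"); // may already hold another FlowFile's value
    }
}

class LocalStateReplacer {
    // A map created per invocation keeps each task's capture groups isolated.
    String replace(final String groupValue) {
        final Map<String, String> additionalAttrs = new HashMap<>();
        additionalAttrs.put("$1", groupValue);
        return additionalAttrs.get("$1");
    }
}
{code}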



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (NIFI-7993) Upgrade Jetty from 9.4.26.v20200117 to 9.4.34.v20201102

2020-11-10 Thread Nathan Gough (Jira)
Nathan Gough created NIFI-7993:
--

 Summary: Upgrade Jetty from 9.4.26.v20200117 to 9.4.34.v20201102
 Key: NIFI-7993
 URL: https://issues.apache.org/jira/browse/NIFI-7993
 Project: Apache NiFi
  Issue Type: Task
Affects Versions: 1.12.1, 1.12.0
Reporter: Nathan Gough
Assignee: Nathan Gough


Dependabot notified us to upgrade Jetty from 9.4.26.v20200117 to 
9.4.34.v20201102.

 

[https://github.com/apache/nifi/pull/4649]

 

Manually upgrade, test and verify the upgraded Jetty version.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (NIFI-7993) Upgrade Jetty from 9.4.26.v20200117 to 9.4.34.v20201102

2020-11-10 Thread Joe Witt (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229462#comment-17229462
 ] 

Joe Witt commented on NIFI-7993:


nice thanks [~thenatog]


> Upgrade Jetty from 9.4.26.v20200117 to 9.4.34.v20201102
> ---
>
> Key: NIFI-7993
> URL: https://issues.apache.org/jira/browse/NIFI-7993
> Project: Apache NiFi
>  Issue Type: Task
>Affects Versions: 1.12.0, 1.12.1
>Reporter: Nathan Gough
>Assignee: Nathan Gough
>Priority: Major
>  Labels: jetty, upgrade
>
> Dependabot notified us to upgrade Jetty from 9.4.26.v20200117 to 
> 9.4.34.v20201102.
>  
> [https://github.com/apache/nifi/pull/4649]
>  
> Manually upgrade, test and verify the upgraded Jetty version.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (NIFI-7771) Infinite loop on WebUI when node stopped in cluster

2020-11-10 Thread Nathan Gough (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229529#comment-17229529
 ] 

Nathan Gough commented on NIFI-7771:


Hi [~humpfhumpf], my initial guess would be that that PR will not fix the 
issue. [~mtien] is going to double check that and see if we can reproduce the 
issue and then try out your PR.



As for your suggestions about handling signing keys, I think that may need 
further discussion. I know that sharing the signing keys/users database has 
been an issue discussed in the past, but I don't think we have arrived at a secure 
method to do that yet. The simplest mechanism was probably using ZooKeeper, but 
ZooKeeper so far has not been considered a secure option. I'm not certain why 
we generate a new NiFi token instead of using the id_token provided by the IDP; 
this may have been to keep in line with other authentication mechanisms that 
result in generating a JWT.

Some of these issues are likely to be considered for a NiFi 2.0 release some 
day.

> Infinite loop on WebUI when node stopped in cluster
> ---
>
> Key: NIFI-7771
> URL: https://issues.apache.org/jira/browse/NIFI-7771
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core UI
>Affects Versions: 1.10.0, 1.9.2, 1.12.0, 1.11.4
> Environment: Linux
>Reporter: humpfhumpf
>Priority: Critical
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h2. Context
> Before the bug occurs, one needs:
>  * a load-balancer which uses client IP affinity to “select” one NiFi 
> instance in the cluster;
>  * a fresh and successful OpenID Connect authentication on WebUI.
> Then, stop the NiFi instance that was “selected” by the load-balancer for 
> you.
> In the meantime, the WebUI (javascript) continues to poll the WebAPI by fetching 
> status URLs, but all calls return HTTP 401 error. Then WebUI shows an error 
> screen: “Unauthorized. Unable to validate the access token.”, with 2 links: 
> “log out” and “home”.
> If you click on the “home” link, *the browser enters in an infinite loop of 
> redirects between NiFi and OIDC Identity Provider.*
> The offending HTTP flow is:
>  * “home” link calls: GET /nifi
>  * Redirects to: GET /nifi/
>  ** asynchronously calls: GET /nifi-api/flow/current-user => failed HTTP 401
>  * Redirects to: GET /nifi/login
>  * Redirects to: GET /openid-connect/auth
>  * Redirects to: GET /nifi-api/access/oidc/callback
>  * Redirects to: GET /nifi
>  * Redirects to: GET /nifi/
>  ** asynchronously calls: GET /nifi-api/flow/current-user => failed HTTP 401
>  * [loop]
>  
> The stack trace of {{/nifi-api/flow/current-user}} call is:
>  
> {code:java}
> ERROR [NiFi Web Server-284] o.a.nifi.web.security.jwt.JwtService There was an 
> error validating the JWT io.jsonwebtoken.JwtException: Unable to validate the 
> access token.
>      at 
> org.apache.nifi.web.security.jwt.JwtService.parseTokenFromBase64EncodedString(JwtService.java:106)
>      at 
> org.apache.nifi.web.security.jwt.JwtService.getAuthenticationFromToken(JwtService.java:60)
>      at 
> org.apache.nifi.web.security.jwt.JwtAuthenticationProvider.authenticate(JwtAuthenticationProvider.java:48)
>      at 
> org.springframework.security.authentication.ProviderManager.authenticate(ProviderManager.java:174)
>      at 
> org.apache.nifi.web.security.NiFiAuthenticationFilter.authenticate(NiFiAuthenticationFilter.java:78)
>      at 
> org.apache.nifi.web.security.NiFiAuthenticationFilter.doFilter(NiFiAuthenticationFilter.java:58)
>      at 
> org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
>      at 
> org.apache.nifi.web.security.NiFiAuthenticationFilter.authenticate(NiFiAuthenticationFilter.java:99)
>      at 
> org.apache.nifi.web.security.NiFiAuthenticationFilter.doFilter(NiFiAuthenticationFilter.java:58)
>      ...
> Caused by: io.jsonwebtoken.SignatureException: JWT signature does not match 
> locally computed signature. JWT validity cannot be asserted and should not be 
> trusted.
>      at 
> io.jsonwebtoken.impl.DefaultJwtParser.parse(DefaultJwtParser.java:342)
>      at 
> io.jsonwebtoken.impl.DefaultJwtParser.parse(DefaultJwtParser.java:458)
>      at 
> io.jsonwebtoken.impl.DefaultJwtParser.parseClaimsJws(DefaultJwtParser.java:518)
>      at 
> org.apache.nifi.web.security.jwt.JwtService.parseTokenFromBase64EncodedString(JwtService.java:102)
>      ...
> {code}
>  
>  
> In my opinion, there are 2 problems:
> h2. Problem 1
> The NiFi JWT token is used (and not replaced) by nfCanvas.init() even after 
> loadCurrentUser() failed with an authentication error (when GET 
> /nifi-api/flow/current-user returns 401).
> In such case, this token, stored in nfStorage (in Javascript Local Storage of 

[jira] [Commented] (NIFI-7771) Infinite loop on WebUI when node stopped in cluster

2020-11-10 Thread humpfhumpf (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229311#comment-17229311
 ] 

humpfhumpf commented on NIFI-7771:
--

[~thenatog], [~mtien]: do you think that PR #4593 
[https://github.com/apache/nifi/pull/4593] will fix this ticket, because it 
changes the way "to convert OIDC Token to a Login AuthN Token instead of a NiFi 
JWT"?

regards.

> Infinite loop on WebUI when node stopped in cluster
> ---
>
> Key: NIFI-7771
> URL: https://issues.apache.org/jira/browse/NIFI-7771
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core UI
>Affects Versions: 1.10.0, 1.9.2, 1.12.0, 1.11.4
> Environment: Linux
>Reporter: humpfhumpf
>Priority: Critical
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h2. Context
> Before the bug occurs, one needs:
>  * a load-balancer which uses client IP affinity to “select” one NiFi 
> instance in the cluster;
>  * a fresh and successful OpenID Connect authentication on WebUI.
> Then, stop the NiFi instance that was “selected” by the load-balancer for 
> you.
> In the meantime, the WebUI (javascript) continues to poll the WebAPI by fetching 
> status URLs, but all calls return HTTP 401 error. Then WebUI shows an error 
> screen: “Unauthorized. Unable to validate the access token.”, with 2 links: 
> “log out” and “home”.
> If you click on the “home” link, *the browser enters in an infinite loop of 
> redirects between NiFi and OIDC Identity Provider.*
> The offending HTTP flow is:
>  * “home” link calls: GET /nifi
>  * Redirects to: GET /nifi/
>  ** asynchronously calls: GET /nifi-api/flow/current-user => failed HTTP 401
>  * Redirects to: GET /nifi/login
>  * Redirects to: GET /openid-connect/auth
>  * Redirects to: GET /nifi-api/access/oidc/callback
>  * Redirects to: GET /nifi
>  * Redirects to: GET /nifi/
>  ** asynchronously calls: GET /nifi-api/flow/current-user => failed HTTP 401
>  * [loop]
>  
> The stack trace of {{/nifi-api/flow/current-user}} call is:
>  
> {code:java}
> ERROR [NiFi Web Server-284] o.a.nifi.web.security.jwt.JwtService There was an 
> error validating the JWT io.jsonwebtoken.JwtException: Unable to validate the 
> access token.
>      at 
> org.apache.nifi.web.security.jwt.JwtService.parseTokenFromBase64EncodedString(JwtService.java:106)
>      at 
> org.apache.nifi.web.security.jwt.JwtService.getAuthenticationFromToken(JwtService.java:60)
>      at 
> org.apache.nifi.web.security.jwt.JwtAuthenticationProvider.authenticate(JwtAuthenticationProvider.java:48)
>      at 
> org.springframework.security.authentication.ProviderManager.authenticate(ProviderManager.java:174)
>      at 
> org.apache.nifi.web.security.NiFiAuthenticationFilter.authenticate(NiFiAuthenticationFilter.java:78)
>      at 
> org.apache.nifi.web.security.NiFiAuthenticationFilter.doFilter(NiFiAuthenticationFilter.java:58)
>      at 
> org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
>      at 
> org.apache.nifi.web.security.NiFiAuthenticationFilter.authenticate(NiFiAuthenticationFilter.java:99)
>      at 
> org.apache.nifi.web.security.NiFiAuthenticationFilter.doFilter(NiFiAuthenticationFilter.java:58)
>      ...
> Caused by: io.jsonwebtoken.SignatureException: JWT signature does not match 
> locally computed signature. JWT validity cannot be asserted and should not be 
> trusted.
>      at 
> io.jsonwebtoken.impl.DefaultJwtParser.parse(DefaultJwtParser.java:342)
>      at 
> io.jsonwebtoken.impl.DefaultJwtParser.parse(DefaultJwtParser.java:458)
>      at 
> io.jsonwebtoken.impl.DefaultJwtParser.parseClaimsJws(DefaultJwtParser.java:518)
>      at 
> org.apache.nifi.web.security.jwt.JwtService.parseTokenFromBase64EncodedString(JwtService.java:102)
>      ...
> {code}
>  
>  
> In my opinion, there are 2 problems:
> h2. Problem 1
> The NiFi JWT token is used (and not replaced) by nfCanvas.init() even after 
> loadCurrentUser() failed with an authentication error (when GET 
> /nifi-api/flow/current-user returns 401).
> In such case, this token, stored in nfStorage (in Javascript Local Storage of 
> browser), should be cleared before redirecting to /nifi/login.
> h2. Problem 2
> The NiFi JWT token is not shared among NiFi cluster.
> This token is not the original id_token returned by the OIDC Identity 
> Provider, but a new one, generated by the NiFi instance on which the browser 
> was routed during the OIDC connection flow.
> This token is closely related to some data stored in H2 users database 
> (nifi-user-keys.h2.db).
> Its “KEY” table contains the following data for each user: email, a primary 
> key (auto increment) and a generated UUID.
> The NiFi JWT token contains (among other things) a copy of email and 

[GitHub] [nifi] markap14 opened a new pull request #4652: NIFI-7992: Periodically check disk usage for content repo to see if b…

2020-11-10 Thread GitBox


markap14 opened a new pull request #4652:
URL: https://github.com/apache/nifi/pull/4652


   …ackpressure should be applied. Log progress in background task. Improve 
performance of background cleanup task by not using an ArrayList Iterator and 
constantly calling remove but instead wait until the end of our cleanup loop 
and then removed from the list all elements that should be removed in a single 
update
   
   Thank you for submitting a contribution to Apache NiFi.
   
   Please provide a short description of the PR here:
   
    Description of PR
   
   _Enables X functionality; fixes bug NIFI-._
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
in the commit message?
   
   - [ ] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
   
   - [ ] Has your PR been rebased against the latest commit within the target 
branch (typically `main`)?
   
   - [ ] Is your initial contribution a single, squashed commit? _Additional 
commits in response to PR reviewer feedback should be made on this branch and 
pushed to allow change tracking. Do not `squash` or use `--force` when pushing 
to allow for clean monitoring of changes._
   
   ### For code changes:
   - [ ] Have you ensured that the full suite of tests is executed via `mvn 
-Pcontrib-check clean install` at the root `nifi` folder?
   - [ ] Have you written or updated unit tests to verify your changes?
   - [ ] Have you verified that the full build is successful on JDK 8?
   - [ ] Have you verified that the full build is successful on JDK 11?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
   - [ ] If applicable, have you updated the `LICENSE` file, including the main 
`LICENSE` file under `nifi-assembly`?
   - [ ] If applicable, have you updated the `NOTICE` file, including the main 
`NOTICE` file found under `nifi-assembly`?
   - [ ] If adding new Properties, have you added `.displayName` in addition to 
.name (programmatic access) for each of the new properties?
   
   ### For documentation related changes:
   - [ ] Have you ensured that format looks appropriate for the output in which 
it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check GitHub Actions CI for 
build issues and submit an update to your PR as soon as possible.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (NIFI-7992) Content Repository can fail to cleanup archive directory fast enough

2020-11-10 Thread Joe Witt (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-7992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe Witt updated NIFI-7992:
---
Priority: Critical  (was: Major)

> Content Repository can fail to cleanup archive directory fast enough
> 
>
> Key: NIFI-7992
> URL: https://issues.apache.org/jira/browse/NIFI-7992
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Critical
> Fix For: 1.13.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the scenario where a user is generating many small FlowFiles and has the 
> "nifi.content.claim.max.appendable.size" property set to a small value, we 
> can encounter a situation where data is constantly archived but not cleaned 
> up quickly enough. As a result, the Content Repository can run out of space.
> The FileSystemRepository has a backpressure mechanism built in to avoid 
> allowing this to happen, but under the above conditions, it can sometimes 
> fail to prevent this situation. The backpressure mechanism works by 
> performing the following steps:
>  # When a new Content Claim is created, the Content Repository determines 
> which 'container' to use.
>  # Content Repository checks if the amount of storage space used for the 
> container is greater than the configured backpressure threshold.
>  # If so, the thread blocks until a background task completes cleanup of the 
> archive directories.
> However, in Step #2 above, it determines the amount of space currently 
> being used by looking at a cached member variable. That cached member 
> variable is only updated on the first iteration, and when the said background 
> task completes.
> So, now consider a case where there are millions of files in the content 
> repository archive. The background task could take a massive amount of time 
> performing cleanup. Meanwhile, processors are able to write to the repository 
> without any backpressure being applied because the background task hasn't 
> updated the cached variable for the amount of space used. This continues 
> until the content repository fills.
> There are three important very simple things that should be changed:
>  # The background task should be faster in this case. While we cannot improve 
> the amount of time it takes to destroy the files, we do create an ArrayList 
> to hold all of the file info and then use an iterator, calling remove(). 
> Under the hood, this creates a copy of the underlying array for each file 
> that is removed. On my laptop, performing this procedure on an ArrayList with 
> 1 million elements took approximately 1 minute. Changing to a LinkedList took 
> 15 milliseconds but took much more heap. Keeping an ArrayList, then removing 
> all of the elements at the end (via ArrayList.subList(0, n).clear()) resulted in 
> similar performance to LinkedList with the memory footprint of ArrayList.
>  # The check to see whether or not the content repository's usage has crossed 
> the threshold should not rely entirely on a cache that is populated by a 
> process that can take a long time. It should periodically calculate the disk 
> usage itself (perhaps once per minute).
>  # When backpressure does get applied, it can appear that the system has 
> frozen up, not performing any sort of work. The background task that is 
> clearing space should periodically log its progress at INFO level to allow 
> users to understand that this action is taking place.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (NIFI-7992) Content Repository can fail to cleanup archive directory fast enough

2020-11-10 Thread Joe Witt (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-7992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe Witt updated NIFI-7992:
---
Fix Version/s: 1.13.0

> Content Repository can fail to cleanup archive directory fast enough
> 
>
> Key: NIFI-7992
> URL: https://issues.apache.org/jira/browse/NIFI-7992
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 1.13.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the scenario where a user is generating many small FlowFiles and has the 
> "nifi.content.claim.max.appendable.size" property set to a small value, we 
> can encounter a situation where data is constantly archived but not cleaned 
> up quickly enough. As a result, the Content Repository can run out of space.
> The FileSystemRepository has a backpressure mechanism built in to avoid 
> allowing this to happen, but under the above conditions, it can sometimes 
> fail to prevent this situation. The backpressure mechanism works by 
> performing the following steps:
>  # When a new Content Claim is created, the Content Repository determines 
> which 'container' to use.
>  # Content Repository checks if the amount of storage space used for the 
> container is greater than the configured backpressure threshold.
>  # If so, the thread blocks until a background task completes cleanup of the 
> archive directories.
> However, in Step #2 above, it determines the amount of space currently 
> being used by looking at a cached member variable. That cached member 
> variable is only updated on the first iteration, and when the said background 
> task completes.
> So, now consider a case where there are millions of files in the content 
> repository archive. The background task could take a massive amount of time 
> performing cleanup. Meanwhile, processors are able to write to the repository 
> without any backpressure being applied because the background task hasn't 
> updated the cached variable for the amount of space used. This continues 
> until the content repository fills.
> There are three important very simple things that should be changed:
>  # The background task should be faster in this case. While we cannot improve 
> the amount of time it takes to destroy the files, we do create an ArrayList 
> to hold all of the file info and then use an iterator, calling remove(). 
> Under the hood, this creates a copy of the underlying array for each file 
> that is removed. On my laptop, performing this procedure on an ArrayList with 
> 1 million elements took approximately 1 minute. Changing to a LinkedList took 
> 15 milliseconds but took much more heap. Keeping an ArrayList, then removing 
> all of the elements at the end (via ArrayList.subList(0, n).clear()) resulted in 
> similar performance to LinkedList with the memory footprint of ArrayList.
>  # The check to see whether or not the content repository's usage has crossed 
> the threshold should not rely entirely on a cache that is populated by a 
> process that can take a long time. It should periodically calculate the disk 
> usage itself (perhaps once per minute).
>  # When backpressure does get applied, it can appear that the system has 
> frozen up, not performing any sort of work. The background task that is 
> clearing space should periodically log its progress at INFO level to allow 
> users to understand that this action is taking place.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi] markobean commented on pull request #4641: NIFI-6394 - Ability to export contents of a List Queue

2020-11-10 Thread GitBox


markobean commented on pull request #4641:
URL: https://github.com/apache/nifi/pull/4641#issuecomment-724788240


   There is a 'View All' option at the top of the listing and at the bottom, 
next to the Refresh button. I believe the one at the bottom should be removed.
   An extreme nit-pick, but for consistency: the message reads "Displaying X of 
Y". The "X" value does not have a comma whereas the "Y" value does.
   
   More significant issue: I tried 'View All' on a queue with 1,000,200 FF's. 
It only listed 19,200 (and the corresponding CSV only contained those entries as 
well).
   Re-running the test with 262,000 files in the queue, 'View All' topped out 
at 19,000. 
   Interesting to note that my first test initially filled the queue with 200 
files, then piled on another 1M. Not sure if that is the reason for the "extra" 
200 in the 19,200 limit compared to the second test, which topped out at 19,000. 
   Either way, this limitation should be addressed.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (NIFI-7989) Add Hive "data drift" processor

2020-11-10 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-7989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-7989:
--

Assignee: Matt Burgess

> Add Hive "data drift" processor
> ---
>
> Key: NIFI-7989
> URL: https://issues.apache.org/jira/browse/NIFI-7989
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>
> It would be nice to have a Hive processor (one for each Hive NAR) that could 
> check an incoming record-based flowfile against a destination table, and 
> either add columns and/or partition values, or even create the table if it 
> does not exist. Such a processor could be used in a flow where the incoming 
> data's schema can change and we want to be able to write it to a Hive table, 
> preferably by using PutHDFS, PutParquet, or PutORC to place it directly where 
> it can be queried.
> Such a processor should be able to use a HiveConnectionPool to execute any 
> DDL (ALTER TABLE ADD COLUMN, e.g.) necessary to make the table match the 
> incoming data. For Partition Values, they could be provided via a property 
> that supports Expression Language. In such a case, an ALTER TABLE would be 
> issued to add the partition directory.
> Whether the table is created or updated, and whether there are partition 
> values to consider, an attribute should be written to the outgoing flowfile 
> corresponding to the location of the table (and any associated partitions). 
> This supports the idea of having a flow that updates a Hive table based on 
> the incoming data, and then allows the user to put the flowfile directly into 
> the destination location (PutHDFS, e.g.) instead of having to load it using 
> HiveQL or being subject to the restrictions of Hive Streaming tables 
> (ORC-backed, transactional, etc.)
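
For illustration, a minimal sketch of the kind of DDL such a processor might issue through the connection pool; the table, column, and partition names are hypothetical:

{code:java}
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.DataSource;

public class HiveSchemaDriftSketch {

    // Add a column that appears in the incoming records but not in the table.
    public static void addMissingColumn(final DataSource hivePool) throws SQLException {
        try (Connection conn = hivePool.getConnection();
             Statement stmt = conn.createStatement()) {
            stmt.execute("ALTER TABLE events ADD COLUMNS (new_field STRING)");
        }
    }

    // Register a partition directory so data written there becomes queryable.
    public static void addPartition(final DataSource hivePool) throws SQLException {
        try (Connection conn = hivePool.getConnection();
             Statement stmt = conn.createStatement()) {
            stmt.execute("ALTER TABLE events ADD IF NOT EXISTS PARTITION (dt='2020-11-10')");
        }
    }
}
{code}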



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi-minifi-cpp] szaszm commented on pull request #935: MINIFICPP-1404 Add option to disable unity build of AWS library

2020-11-10 Thread GitBox


szaszm commented on pull request #935:
URL: https://github.com/apache/nifi-minifi-cpp/pull/935#issuecomment-724800185


   I wonder why changes in our aws code trigger recompilation of the third 
party aws lib. It's as if libaws had a dependency on the minifi aws extension 
rather than the other way around.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (NIFI-7992) Content Repository can fail to cleanup archive directory fast enough

2020-11-10 Thread Mark Payne (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-7992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-7992:
-
Status: Patch Available  (was: Open)

> Content Repository can fail to cleanup archive directory fast enough
> 
>
> Key: NIFI-7992
> URL: https://issues.apache.org/jira/browse/NIFI-7992
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
>
> For the scenario where a user is generating many small FlowFiles and has the 
> "nifi.content.claim.max.appendable.size" property set to a small value, we 
> can encounter a situation where data is constantly archived but not cleaned 
> up quickly enough. As a result, the Content Repository can run out of space.
> The FileSystemRepository has a backpressure mechanism built in to avoid 
> allowing this to happen, but under the above conditions, it can sometimes 
> fail to prevent this situation. The backpressure mechanism works by 
> performing the following steps:
>  # When a new Content Claim is created, the Content Repository determines 
> which 'container' to use.
>  # Content Repository checks if the amount of storage space used for the 
> container is greater than the configured backpressure threshold.
>  # If so, the thread blocks until a background task completes cleanup of the 
> archive directories.
> However, in Step #2 above, it determines the amount of space currently 
> being used by looking at a cached member variable. That cached member 
> variable is only updated on the first iteration, and when the said background 
> task completes.
> So, now consider a case where there are millions of files in the content 
> repository archive. The background task could take a massive amount of time 
> performing cleanup. Meanwhile, processors are able to write to the repository 
> without any backpressure being applied because the background task hasn't 
> updated the cached variable for the amount of space used. This continues 
> until the content repository fills.
> There are three important very simple things that should be changed:
>  # The background task should be faster in this case. While we cannot improve 
> the amount of time it takes to destroy the files, we do create an ArrayList 
> to hold all of the file info and then use an iterator, calling remove(). 
> Under the hood, this creates a copy of the underlying array for each file 
> that is removed. On my laptop, performing this procedure on an ArrayList with 
> 1 million elements took approximately 1 minute. Changing to a LinkedList took 
> 15 milliseconds but took much more heap. Keeping an ArrayList, then removing 
> all of the elements at the end (via ArrayList.subList(0, n).clear()) resulted in 
> similar performance to LinkedList with the memory footprint of ArrayList.
>  # The check to see whether or not the content repository's usage has crossed 
> the threshold should not rely entirely on a cache that is populated by a 
> process that can take a long time. It should periodically calculate the disk 
> usage itself (perhaps once per minute).
>  # When backpressure does get applied, it can appear that the system has 
> frozen up, not performing any sort of work. The background task that is 
> clearing space should periodically log its progress at INFO level to allow 
> users to understand that this action is taking place.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (NIFI-7990) PutElasticsearch/RecordHttp processors should support Elasticsearch Data Streams

2020-11-10 Thread Chris Sampson (Jira)
Chris Sampson created NIFI-7990:
---

 Summary: PutElasticsearch/RecordHttp processors should support 
Elasticsearch Data Streams
 Key: NIFI-7990
 URL: https://issues.apache.org/jira/browse/NIFI-7990
 Project: Apache NiFi
  Issue Type: Improvement
Affects Versions: 1.12.1, 1.11.4
Reporter: Chris Sampson


PutElasticsearchHttp and PutElasticsearchRecordHttp (and possibly other ES 
related processors) should support the new [Elasticsearch Data 
Streams|https://www.elastic.co/guide/en/elasticsearch/reference/current/use-a-data-stream.html#add-documents-to-a-data-stream].

As these processors use the _bulk endpoint to PUT one or more documents in one 
request, the processors need to be updated to support the "create" operation 
type.

Likely related to: NIFI-7474
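
For reference, a hedged sketch of what a _bulk payload using the "create" action (the op type required when writing into a data stream) could look like; the index name and document fields are illustrative only:

{code}
{ "create": { "_index": "logs-app-default" } }
{ "@timestamp": "2020-11-10T12:00:00Z", "message": "example log entry" }
{ "create": { "_index": "logs-app-default" } }
{ "@timestamp": "2020-11-10T12:00:01Z", "message": "another entry" }
{code}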



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi-minifi-cpp] lordgamez commented on pull request #935: MINIFICPP-1404 Add option to disable unity build of AWS library

2020-11-10 Thread GitBox


lordgamez commented on pull request #935:
URL: https://github.com/apache/nifi-minifi-cpp/pull/935#issuecomment-724797290


   > Tested, works, but I'm not sure this change adds a lot of value to the 
project. I would expect a third party lib to only change when it's updated to a 
newer version, which is a rare occurrence. Since it's behind an option, it 
doesn't hurt too much either. Treat my approval as a verification that this 
change is successfully achieving its goals but I'm neutral on whether we need 
this feature.
   
   It was my personal experience that led to this change: while working on the 
AWS features, every newly introduced test change or minor code change triggered 
the regeneration and recompilation of the AWS SDK if unity build was enabled. 
This became tiresome after a while, as changes took a disproportionately long 
time to compile because of this. I did not want to disable the unity build by 
default because it makes the binary 20-30MB smaller in my experience, so I 
chose this option, but I opened the PR separately from other changes to be open 
for discussion on whether it's okay to have this option or to just locally 
disable it in the CMake files when developing the AWS extension. Having the 
option may be more explicit and could be found more easily, with some 
explanation, if anyone else experiences the same problem.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (NIFI-7991) Flow Configuration History displays "annotation data not found/available" from "Advanced" changes.

2020-11-10 Thread Tim Smith (Jira)
Tim Smith created NIFI-7991:
---

 Summary: Flow Configuration History displays "annotation data not 
found/available" from "Advanced" changes.
 Key: NIFI-7991
 URL: https://issues.apache.org/jira/browse/NIFI-7991
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Core Framework
Reporter: Tim Smith


On making changes through the UpdateAttribute Advanced tab, I discovered that 
"AnnotationData" is presented in "Flow Configuration History" as "annotation 
data not found/available" in both the previous and new values by ProcessorAuditor. 
UpdateAttribute represents the annotation data as XML. ProcessorAuditor, at a 
minimum, should attempt to represent the new and previous values as the difference 
between the XML nodes, or as String values if the data is not in XML format. The 
default value presented is not useful.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi] MikeThomsen commented on pull request #4570: NIFI-7879 Created record path function for UUID v5

2020-11-10 Thread GitBox


MikeThomsen commented on pull request #4570:
URL: https://github.com/apache/nifi/pull/4570#issuecomment-724825411


   @jfrazee we should be all good now. Even got the L language updated.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] lordgamez edited a comment on pull request #935: MINIFICPP-1404 Add option to disable unity build of AWS library

2020-11-10 Thread GitBox


lordgamez edited a comment on pull request #935:
URL: https://github.com/apache/nifi-minifi-cpp/pull/935#issuecomment-724797290


   > Tested, works, but I'm not sure this change adds a lot of value to the 
project. I would expect a third party lib to only change when it's updated to a 
newer version, which is a rare occurrence. Since it's behind an option, it 
doesn't hurt too much either. Treat my approval as a verification that this 
change is successfully achieving its goals but I'm neutral on whether we need 
this feature.
   
   It was my personal experience that led to this change: while working on the 
AWS features, every newly introduced test change or minor code change triggered 
the regeneration and recompilation of the AWS SDK if unity build was enabled. 
This became tiresome after a while, as changes took a disproportionately long 
time to compile because of this. I did not want to disable the unity build by 
default because it makes the binary 20-30MB smaller in my experience, so I 
chose this option. I also felt that this may not have a lot of added value, so I 
opened the PR separately from other changes to be open for discussion on whether 
it's okay to have this option or to just locally disable it in the CMake files 
when developing the extension. Having this option may be more explicit and could 
be found more easily, with some explanation, if anyone else experiences the same 
problem.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi-minifi-cpp] lordgamez commented on pull request #935: MINIFICPP-1404 Add option to disable unity build of AWS library

2020-11-10 Thread GitBox


lordgamez commented on pull request #935:
URL: https://github.com/apache/nifi-minifi-cpp/pull/935#issuecomment-724805060


   We currently have 2 dependencies from the sdk, the aws-cpp-sdk-s3 and the 
aws-cpp-sdk-core libraries. When we trigger a build after a dependent change, 
the unity build generates new ub_S3.cpp and ub_core.cpp files from the source 
files of those libraries. CMake only checks the timestamp of these source files, 
so even if the content is unchanged, they are rebuilt.
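
   For anyone who wants to reproduce the workaround locally, a rough sketch of the CMake wiring such an option implies; the option and variable names below are illustrative, not necessarily what the PR uses:

{code}
# Hypothetical MiNiFi-side switch, forwarded to the AWS SDK's own
# ENABLE_UNITY_BUILD flag when the SDK is configured as an external project.
option(AWS_ENABLE_UNITY_BUILD "Unity build for the AWS SDK (smaller binary, slower incremental builds)" ON)

if(AWS_ENABLE_UNITY_BUILD)
    list(APPEND AWS_SDK_CMAKE_ARGS "-DENABLE_UNITY_BUILD=ON")
else()
    list(APPEND AWS_SDK_CMAKE_ARGS "-DENABLE_UNITY_BUILD=OFF")
endif()
{code}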



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (NIFI-7992) Content Repository can fail to cleanup archive directory fast enough

2020-11-10 Thread Mark Payne (Jira)
Mark Payne created NIFI-7992:


 Summary: Content Repository can fail to cleanup archive directory 
fast enough
 Key: NIFI-7992
 URL: https://issues.apache.org/jira/browse/NIFI-7992
 Project: Apache NiFi
  Issue Type: Bug
  Components: Core Framework
Reporter: Mark Payne
Assignee: Mark Payne


For the scenario where a user is generating many small FlowFiles and has the 
"nifi.content.claim.max.appendable.size" property set to a small value, we can 
encounter a situation where data is constantly archived but not cleaned up 
quickly enough. As a result, the Content Repository can run out of space.

The FileSystemRepository has a backpressure mechanism built in to avoid 
allowing this to happen, but under the above conditions, it can sometimes fail 
to prevent this situation. The backpressure mechanism works by performing the 
following steps:
 # When a new Content Claim is created, the Content Repository determines which 
'container' to use.
 # Content Repository checks if the amount of storage space used for the 
container is greater than the configured backpressure threshold.
 # If so, the thread blocks until a background task completes cleanup of the 
archive directories.

However, in Step #2 above, it determines the amount of space currently being 
used by looking at a cached member variable. That cached member variable is 
only updated on the first iteration, and when the said background task 
completes.

So, now consider a case where there are millions of files in the content 
repository archive. The background task could take a massive amount of time 
performing cleanup. Meanwhile, processors are able to write to the repository 
without any backpressure being applied because the background task hasn't 
updated the cached variable for the amount of space used. This continues until 
the content repository fills.

There are three important very simple things that should be changed:
 # The background task should be faster in this case. While we cannot improve 
the amount of time it takes to destroy the files, we do create an ArrayList to 
hold all of the file info and then use an iterator, calling remove(). Under the 
hood, this creates a copy of the underlying array for each file that is 
removed. On my laptop, performing this procedure on an ArrayList with 1 million 
elements took approximately 1 minute. Changing to a LinkedList took 15 
milliseconds but took much more heap. Keeping an ArrayList, then removing all 
of the elements at the end (via ArrayList.subList(0, n).clear()) resulted in 
similar performance to LinkedList with the memory footprint of ArrayList.
 # The check to see whether or not the content repository's usage has crossed 
the threshold should not rely entirely on a cache that is populated by a 
process that can take a long time. It should periodically calculate the disk 
usage itself (perhaps once per minute).
 # When backpressure does get applied, it can appear that the system has frozen 
up, not performing any sort of work. The background task that is clearing space 
should periodically log its progress at INFO level to allow users to understand 
that this action is taking place.
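
A rough sketch of the kind of periodic disk-usage check point 2 calls for, using standard java.nio APIs; the threshold constant and path below are assumptions for illustration, not the actual FileSystemRepository implementation:

{code:java}
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ContainerUsageCheck {

    // Illustrative threshold; the real value would come from the configured
    // archive usage percentage in nifi.properties.
    private static final double USAGE_THRESHOLD = 0.90;

    public static boolean isBackpressureNeeded(final Path containerPath) throws IOException {
        final FileStore store = Files.getFileStore(containerPath);
        final long total = store.getTotalSpace();
        final long usable = store.getUsableSpace();
        final double usedRatio = (double) (total - usable) / total;
        return usedRatio >= USAGE_THRESHOLD;
    }

    public static void main(String[] args) throws IOException {
        // Using the working directory here just so the sketch runs anywhere;
        // the repository would check its own container paths instead.
        System.out.println(isBackpressureNeeded(Paths.get(".")));
    }
}
{code}

Recomputing this once a minute in the background task (or before blocking a writer) would keep the cached value from going stale for long stretches.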

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi] thenatog commented on a change in pull request #4613: NiFi-7819 - Add Zookeeper client TLS (external zookeeper) for cluster state management

2020-11-10 Thread GitBox


thenatog commented on a change in pull request #4613:
URL: https://github.com/apache/nifi/pull/4613#discussion_r520912917



##
File path: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/state/providers/zookeeper/ZooKeeperStateProvider.java
##
@@ -133,20 +147,52 @@ public ZooKeeperStateProvider() {
 return properties;
 }
 
-
 @Override
 public synchronized void init(final StateProviderInitializationContext context) {
 connectionString = context.getProperty(CONNECTION_STRING).getValue();
 rootNode = context.getProperty(ROOT_NODE).getValue();
 timeoutMillis = context.getProperty(SESSION_TIMEOUT).asTimePeriod(TimeUnit.MILLISECONDS).intValue();
 
+final Properties stateProviderProperties = new Properties();
+stateProviderProperties.setProperty(NiFiProperties.ZOOKEEPER_SESSION_TIMEOUT, String.valueOf(timeoutMillis));
+stateProviderProperties.setProperty(NiFiProperties.ZOOKEEPER_CONNECT_TIMEOUT, String.valueOf(timeoutMillis));
+stateProviderProperties.setProperty(NiFiProperties.ZOOKEEPER_ROOT_NODE, rootNode);
+stateProviderProperties.setProperty(NiFiProperties.ZOOKEEPER_CONNECT_STRING, connectionString);
+
+zooKeeperClientConfig = ZooKeeperClientConfig.createConfig(combineProperties(nifiProperties, stateProviderProperties));

Review comment:
   Apparently the order of the initialization -> validator was correct, so I 
reverted that back and stopped the properties from being used in the initialization 
of the ZooKeeperStateProvider, to fix the issue of the validator not catching 
empty/missing properties.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (NIFI-7995) Requesting create Parameter Context via API with Parameters null generates an NPE

2020-11-10 Thread Daniel Chaffelson (Jira)
Daniel Chaffelson created NIFI-7995:
---

 Summary: Requesting create Parameter Context via API with 
Parameters null generates an NPE
 Key: NIFI-7995
 URL: https://issues.apache.org/jira/browse/NIFI-7995
 Project: Apache NiFi
  Issue Type: Bug
  Components: Core Framework
Affects Versions: 1.12.1, 1.10.0
Reporter: Daniel Chaffelson


org.apache.nifi.web.api.ParameterContextResource.validateParameterNames(ParameterContextResource.java:403)
gets very upset if you submit a JSON with null for the Parameter listing 
instead of an empty or populated list. It is a minor thing, but probably worth 
tidying up in the validator.

Discovered because NiPyAPI defaults to Python 'None' for unpopulated properties.
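
A minimal sketch of the kind of null guard the validator could add; the method signature and message below are illustrative, not the actual ParameterContextResource code:

{code:java}
import java.util.Collections;
import java.util.Set;

public class ParameterNameValidation {

    public static void validateParameterNames(final Set<String> parameterNames) {
        // Treat a null listing the same as an empty one instead of dereferencing it.
        final Set<String> names = parameterNames == null ? Collections.<String>emptySet() : parameterNames;
        for (final String name : names) {
            if (name == null || name.trim().isEmpty()) {
                throw new IllegalArgumentException("Parameter names must not be blank");
            }
        }
    }
}
{code}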

 

Full logged error is:
2020-11-09 13:20:28,612 ERROR [NiFi Web Server-22] 
o.a.nifi.web.api.config.ThrowableMapper An unexpected error has occurred: java.lang.NullPointerException. Returning Internal Server Error response.
java.lang.NullPointerException: null
at org.apache.nifi.web.api.ParameterContextResource.validateParameterNames(ParameterContextResource.java:403)
at org.apache.nifi.web.api.ParameterContextResource.createParameterContext(ParameterContextResource.java:199)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:76)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:148)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:191)
at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:200)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:103)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:493)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:415)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:104)
at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:277)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:272)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:268)
at org.glassfish.jersey.internal.Errors.process(Errors.java:316)
at org.glassfish.jersey.internal.Errors.process(Errors.java:298)
at org.glassfish.jersey.internal.Errors.process(Errors.java:268)
at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:289)
at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:256)
at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:703)
at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:416)
at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:370)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:389)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:342)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:229)
at org.eclipse.jetty.servlet.ServletHolder$NotAsyncServlet.service(ServletHolder.java:1395)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:755)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1617)
at org.apache.nifi.web.filter.RequestLogger.doFilter(RequestLogger.java:66)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1604)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:317)
at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:127)
at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:91)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at 
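
The trace above is truncated by the archive. For illustration only, here is a minimal sketch of the kind of null guard that would avoid a NullPointerException when the submitted entity arrives without a parameters list; the class, method, and field names below are hypothetical stand-ins, not the actual ParameterContextResource code.

{code:java}
// Hypothetical sketch -- not the actual NiFi code. Shows the null guard idea only.
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ParameterNameValidation {

    // Stand-in for the DTO field actually involved; the name is an assumption.
    public static List<String> safeParameterNames(List<String> requestedNames) {
        // Treat a missing list the same as an empty one instead of dereferencing null.
        if (requestedNames == null) {
            return Collections.emptyList();
        }
        for (final String name : requestedNames) {
            if (name == null || name.trim().isEmpty()) {
                throw new IllegalArgumentException("Parameter names may not be blank");
            }
        }
        return requestedNames;
    }

    public static void main(String[] args) {
        System.out.println(safeParameterNames(null));                      // []
        System.out.println(safeParameterNames(Arrays.asList("foo")));      // [foo]
    }
}
{code}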

[jira] [Commented] (NIFI-7819) Add Zookeeper client TLS (external zookeeper) for cluster state management

2020-11-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229542#comment-17229542
 ] 

ASF subversion and git services commented on NIFI-7819:
---

Commit 479ee6e3db58ee22dc1c7f4510eed5767c4458a0 in nifi's branch 
refs/heads/main from Nathan Gough
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=479ee6e ]

NIFI-7819 - Added ZooKeeperStateProvider TLS properties.
- Added tests for TLS with ZooKeeperStateProvider.
- Added docs to administration guide.
- Small fixes for PR comments.
- Changed the ZooKeeperStateProvider to receive configuration from the 
nifi.properties file. Uses the Zookeeper TLS properties or if they are not 
declared, uses the standard NiFi TLS properties.
- Updated administration-guide.
- Fixed some boolean literals. Set the ZooKeeper watcher to null. Removed 
stacktrace prints to standard out. Added getPreferredProperty for 
key/truststore types.
- Removing some unused code. Fixing up NiFi properties methods. Removed 
whitespace.
- Added some tests for getPreferredProperty().
- Checkstyle fixes.
- Passing through nifi properties to the state provider using an annotation to 
avoid ZooKeeper references in the StateManagerProvider.
- Fixed comment.
- Added CLIENT_SECURE property to isZooKeeperTlsConfigurationPresent() check.
- Small change to getPreferredProperty, added more tests.
- Added checkstyle fix.
- Moved StateProviderContext to nifi-framework-api.
- Changed combine properties to handle null NiFiProperties. Inject 
NiFiProperties object for tests.
- Checkstyle fix.
- Changed the connect string in state-management.xml to be required. Rearranged 
order of property validation to validate before initialization.
- Rearranged the way ZooKeeperClientConfig is initialized and added a non blank 
validator to connect string.
- Minor change to ZooKeeperClientConfig member variable set and get.

This closes #4613.

Signed-off-by: Bryan Bende 
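
For readers following the fallback described above (use the ZooKeeper TLS properties and, if they are not declared, fall back to the standard NiFi TLS properties), here is a minimal sketch of a getPreferredProperty-style lookup. It is not the committed implementation, and the property keys shown are assumptions.

{code:java}
// Sketch only: illustrates the fallback pattern from the commit message,
// not the actual ZooKeeperStateProvider code. Property keys are assumptions.
import java.util.Properties;

public class PreferredPropertyExample {

    /** Return the first non-blank value among the candidate keys, else null. */
    static String getPreferredProperty(Properties props, String... keys) {
        for (String key : keys) {
            String value = props.getProperty(key);
            if (value != null && !value.trim().isEmpty()) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Properties nifiProperties = new Properties();
        // Only the standard NiFi keystore is declared here...
        nifiProperties.setProperty("nifi.security.keystore", "/opt/nifi/conf/keystore.jks");

        // ...so the ZooKeeper-specific key is skipped and the standard one is used.
        String keystore = getPreferredProperty(nifiProperties,
                "nifi.zookeeper.security.keystore",   // preferred, ZooKeeper-specific (assumed key)
                "nifi.security.keystore");            // fallback, standard NiFi TLS
        System.out.println(keystore);  // /opt/nifi/conf/keystore.jks
    }
}
{code}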


> Add Zookeeper client TLS (external zookeeper) for cluster state management
> --
>
> Key: NIFI-7819
> URL: https://issues.apache.org/jira/browse/NIFI-7819
> Project: Apache NiFi
>  Issue Type: Sub-task
>Affects Versions: 1.12.0
>Reporter: Nathan Gough
>Assignee: Nathan Gough
>Priority: Major
>  Labels: security, tls, zookeeper
>
> When NiFi is configured to use an external Zookeeper, configuration on the 
> NiFi side should allow cluster state management to use TLS. If configured 
> with TLS, it should not allow any connections/communication to operate 
> unsecured (an all or nothing approach). 
> This ticket, in combination with NIFI-7115, should allow NiFi to completely 
> use an external Zookeeper securely.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi] asfgit closed pull request #4613: NiFi-7819 - Add Zookeeper client TLS (external zookeeper) for cluster state management

2020-11-10 Thread GitBox


asfgit closed pull request #4613:
URL: https://github.com/apache/nifi/pull/4613


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (NIFI-7862) Allow PutDatabaseRecord to optionally create the target table if it doesn't exist

2020-11-10 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-7862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-7862:
---
Description: 
As part of NIFI-6934, the "Database Type" property was added, allowing for 
DB-specific SQL/DML to be generated. That property could be used with a new 
true/false property "Create Table" that would create the target table if it 
didn't already exist, mapping the NiFi Record datatypes to DB-specific column 
types and generating/executing a DB-specific CREATE TABLE statement. If "Create 
Table" is set to true and an error occurs while attempting to create the table, 
the flowfile should be routed to failure or rolled back (depending on the 
setting of the "Rollback on Failure" property).

Also it could support some schema migration (adding columns, e.g.) with an 
additional value in the "Unmatched Field Behavior" property for "Create Column 
on Unmatched Fields". The use case here is for feature parity with 
UpdateHiveTable (see NIFI-7989)

  was:
As part of NIFI-6934, the "Database Type" property was added, allowing for 
DB-specific SQL/DML to be generated. That property could be used with a new 
true/false property "Create Table" that would create the target table if it 
didn't already exist.

I wouldn't go so far as to support schema migration (changing column types, 
e.g.), at least not for this case, the intent here would only be to check if 
table exists and attempt to create it , mapping the NiFi Record datatypes to 
DB-specific column types and generating/executing a DB-specific CREATE TABLE 
statement. If "Create Table" is set to true and an error occurs while 
attempting to create the table, the flowfile should be routed to failure or 
rolled back (depending on the setting of the "Rollback on Failure" property).
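
As a rough illustration of the proposed behaviour (not the eventual PutDatabaseRecord implementation), the sketch below maps a few record field types to column types for one hypothetical dialect and emits a CREATE TABLE statement; the type mapping itself is an assumption.

{code:java}
// Illustrative sketch only -- not the PutDatabaseRecord/DatabaseAdapter code.
// Shows the general idea: map record field types to SQL column types and
// emit a CREATE TABLE IF NOT EXISTS statement for one dialect.
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class CreateTableSketch {

    // Hypothetical record-type -> column-type mapping for a PostgreSQL-like dialect.
    static String toSqlType(String recordType) {
        switch (recordType) {
            case "INT":       return "INTEGER";
            case "LONG":      return "BIGINT";
            case "DOUBLE":    return "DOUBLE PRECISION";
            case "BOOLEAN":   return "BOOLEAN";
            case "TIMESTAMP": return "TIMESTAMP";
            default:          return "VARCHAR(4000)";
        }
    }

    static String createTableStatement(String table, Map<String, String> fields) {
        StringJoiner columns = new StringJoiner(", ");
        fields.forEach((name, type) -> columns.add(name + " " + toSqlType(type)));
        return "CREATE TABLE IF NOT EXISTS " + table + " (" + columns + ")";
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "LONG");
        fields.put("name", "STRING");
        fields.put("updated", "TIMESTAMP");
        System.out.println(createTableStatement("users", fields));
        // CREATE TABLE IF NOT EXISTS users (id BIGINT, name VARCHAR(4000), updated TIMESTAMP)
    }
}
{code}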


> Allow PutDatabaseRecord to optionally create the target table if it doesn't 
> exist
> -
>
> Key: NIFI-7862
> URL: https://issues.apache.org/jira/browse/NIFI-7862
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Burgess
>Priority: Major
>
> As part of NIFI-6934, the "Database Type" property was added, allowing for 
> DB-specific SQL/DML to be generated. That property could be used with a new 
> true/false property "Create Table" that would create the target table if it 
> didn't already exist, mapping the NiFi Record datatypes to DB-specific column 
> types and generating/executing a DB-specific CREATE TABLE statement. If 
> "Create Table" is set to true and an error occurs while attempting to create 
> the table, the flowfile should be routed to failure or rolled back (depending 
> on the setting of the "Rollback on Failure" property).
> Also it could support some schema migration (adding columns, e.g.) with an 
> additional value in the "Unmatched Field Behavior" property for "Create 
> Column on Unmatched Fields". The use case here is for feature parity with 
> UpdateHiveTable (see NIFI-7989)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (NIFI-7862) Allow PutDatabaseRecord to optionally create the target table and/or columns if they don't exist

2020-11-10 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-7862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-7862:
---
Summary: Allow PutDatabaseRecord to optionally create the target table 
and/or columns if they don't exist  (was: Allow PutDatabaseRecord to optionally 
create the target table if it doesn't exist)

> Allow PutDatabaseRecord to optionally create the target table and/or columns 
> if they don't exist
> 
>
> Key: NIFI-7862
> URL: https://issues.apache.org/jira/browse/NIFI-7862
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Burgess
>Priority: Major
>
> As part of NIFI-6934, the "Database Type" property was added, allowing for 
> DB-specific SQL/DML to be generated. That property could be used with a new 
> true/false property "Create Table" that would create the target table if it 
> didn't already exist, mapping the NiFi Record datatypes to DB-specific column 
> types and generating/executing a DB-specific CREATE TABLE statement. If 
> "Create Table" is set to true and an error occurs while attempting to create 
> the table, the flowfile should be routed to failure or rolled back (depending 
> on the setting of the "Rollback on Failure" property).
> Also it could support some schema migration (adding columns, e.g.) with an 
> additional value in the "Unmatched Field Behavior" property for "Create 
> Column on Unmatched Fields". The use case here is for feature parity with 
> UpdateHiveTable (see NIFI-7989)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (NIFI-7819) Add Zookeeper client TLS (external zookeeper) for cluster state management

2020-11-10 Thread Bryan Bende (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-7819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Bende resolved NIFI-7819.
---
Fix Version/s: 1.13.0
   Resolution: Fixed

> Add Zookeeper client TLS (external zookeeper) for cluster state management
> --
>
> Key: NIFI-7819
> URL: https://issues.apache.org/jira/browse/NIFI-7819
> Project: Apache NiFi
>  Issue Type: Sub-task
>Affects Versions: 1.12.0
>Reporter: Nathan Gough
>Assignee: Nathan Gough
>Priority: Major
>  Labels: security, tls, zookeeper
> Fix For: 1.13.0
>
>
> When NiFi is configured to use an external Zookeeper, configuration on the 
> NiFi side should allow cluster state management to use TLS. If configured 
> with TLS, it should not allow any connections/communication to operate 
> unsecured (an all or nothing approach). 
> This ticket, in combination with NIFI-7115, should allow NiFi to completely 
> use an external Zookeeper securely.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi] jfrazee commented on pull request #4613: NiFi-7819 - Add Zookeeper client TLS (external zookeeper) for cluster state management

2020-11-10 Thread GitBox


jfrazee commented on pull request #4613:
URL: https://github.com/apache/nifi/pull/4613#issuecomment-724999026


   @bbende That's fine. I wasn't expecting to find anything else blocking. I'll 
definitely be using it, so if I find anything I'll file an issue. Glad to see 
this go in!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi] turcsanyip opened a new pull request #4654: NIFI-7994: Fixed ReplaceText concurrency issue

2020-11-10 Thread GitBox


turcsanyip opened a new pull request #4654:
URL: https://github.com/apache/nifi/pull/4654


   Moving RegexReplace.additionalAttrs field to local variable in replace() 
method.
   Concurrent usage of the additionalAttrs map caused data corruption.
   
   https://issues.apache.org/jira/browse/NIFI-7994
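
   A minimal sketch of the fix pattern described above: move a shared mutable field into a local variable so concurrent replace() calls no longer share state. Class and method names here are stand-ins, not the actual ReplaceText source.

```java
// Stand-in sketch of the field-to-local-variable refactor, not the real code.

// Before: a mutable field shared by all threads running replace() concurrently,
// so one thread's attributes could leak into another thread's evaluation.
class RegexReplaceBefore {
    private final java.util.Map<String, String> additionalAttrs = new java.util.HashMap<>();

    String replace(String value, java.util.Map<String, String> flowFileAttrs) {
        additionalAttrs.clear();                 // races with other threads
        additionalAttrs.putAll(flowFileAttrs);
        return evaluate(value, additionalAttrs);
    }

    String evaluate(String value, java.util.Map<String, String> attrs) { return value; }
}

// After: the map is created per invocation, so each thread works on its own copy.
class RegexReplaceAfter {
    String replace(String value, java.util.Map<String, String> flowFileAttrs) {
        final java.util.Map<String, String> additionalAttrs = new java.util.HashMap<>(flowFileAttrs);
        return evaluate(value, additionalAttrs);
    }

    String evaluate(String value, java.util.Map<String, String> attrs) { return value; }
}
```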
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
in the commit message?
   
   - [ ] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
   
   - [ ] Has your PR been rebased against the latest commit within the target 
branch (typically `main`)?
   
   - [ ] Is your initial contribution a single, squashed commit? _Additional 
commits in response to PR reviewer feedback should be made on this branch and 
pushed to allow change tracking. Do not `squash` or use `--force` when pushing 
to allow for clean monitoring of changes._
   
   ### For code changes:
   - [ ] Have you ensured that the full suite of tests is executed via `mvn 
-Pcontrib-check clean install` at the root `nifi` folder?
   - [ ] Have you written or updated unit tests to verify your changes?
   - [ ] Have you verified that the full build is successful on JDK 8?
   - [ ] Have you verified that the full build is successful on JDK 11?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
   - [ ] If applicable, have you updated the `LICENSE` file, including the main 
`LICENSE` file under `nifi-assembly`?
   - [ ] If applicable, have you updated the `NOTICE` file, including the main 
`NOTICE` file found under `nifi-assembly`?
   - [ ] If adding new Properties, have you added `.displayName` in addition to 
.name (programmatic access) for each of the new properties?
   
   ### For documentation related changes:
   - [ ] Have you ensured that format looks appropriate for the output in which 
it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check GitHub Actions CI for 
build issues and submit an update to your PR as soon as possible.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi] bbende commented on pull request #4613: NiFi-7819 - Add Zookeeper client TLS (external zookeeper) for cluster state management

2020-11-10 Thread GitBox


bbende commented on pull request #4613:
URL: https://github.com/apache/nifi/pull/4613#issuecomment-724992528


   Latest changes look good, going to squash and merge, thanks!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi] bbende commented on pull request #4613: NiFi-7819 - Add Zookeeper client TLS (external zookeeper) for cluster state management

2020-11-10 Thread GitBox


bbende commented on pull request #4613:
URL: https://github.com/apache/nifi/pull/4613#issuecomment-724996746


   @jfrazee sorry I realized you had been reviewing this as well, Nathan's 
latest changes had incorporated your feedback so I assumed it was good on your 
end, but let us know if there was anything additional.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (NIFI-7964) PutAzureBlobStorage OutOfMemory Exception

2020-11-10 Thread Joey Frazee (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17229554#comment-17229554
 ] 

Joey Frazee commented on NIFI-7964:
---

[~esecules] I forgot to mention, this was replicated on 1.9.x. I think it 
likely never worked but has just been hidden by heaps larger than the file size.

> PutAzureBlobStorage OutOfMemory Exception
> -
>
> Key: NIFI-7964
> URL: https://issues.apache.org/jira/browse/NIFI-7964
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.12.1
>Reporter: Eric Secules
>Assignee: Joey Frazee
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As part of my flow I upload files to azure blob storage. They can be several 
> hundred MB in size. I put a 300 MB file into my flow and it choked on the 
> PutAzureBlobStorage processor with the following log message.
> {code:java}
> 2020-10-28 19:34:10,717 ERROR [Timer-Driven Process Thread-6] 
> o.a.n.p.a.storage.PutAzureBlobStorage 
> PutAzureBlobStorage[id=74b80a47-016d-3430-fd74-ece7653158d5] 
> PutAzureBlobStorage[id=74b80a47-016d-3430-fd74-ece7653158d5] failed to 
> process session due to java.lang.OutOfMemoryError: Java heap space; Processor 
> Administratively Yielded for 1 sec: java.lang.OutOfMemoryError: Java heap 
> space
> java.lang.OutOfMemoryError: Java heap space
> 2020-10-28 19:34:10,717 WARN [Timer-Driven Process Thread-6] 
> o.a.n.controller.tasks.ConnectableTask Administratively Yielding 
> PutAzureBlobStorage[id=74b80a47-016d-3430-fd74-ece7653158d5] due to uncaught 
> Exception: java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> {code}
> I did not expect this to happen because I think the PutAzureBlob processor 
> should be streaming the flowfile from disk directly to blob. But this 
> behaviour suggests to me that it's getting read into memory in its entirety. 
> My JVM heap size is set to 512 MB, which shouldn't be a problem if streaming 
> was used to upload to blob storage in chunks.
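
For context on the streaming expectation above, here is a minimal sketch of chunked copying that keeps heap usage bounded regardless of content size. It deliberately uses only generic java.io streams; the point where a real client would upload each block is marked as a placeholder, not an actual Azure SDK call.

{code:java}
// Generic sketch of streaming content in fixed-size chunks instead of reading
// the whole payload into memory. Azure SDK specifics are intentionally omitted.
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamingCopy {

    // Copies content in 1 MB chunks so heap usage stays bounded regardless of file size.
    static long copyInChunks(InputStream in, OutputStream out) throws IOException {
        final byte[] buffer = new byte[1024 * 1024];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);   // a real client would upload this block here
            total += read;
        }
        return total;
    }
}
{code}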



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi] ottobackwards opened a new pull request #4513: NIFI-7761 Allow HandleHttpRequest to add specified form data to FlowF…

2020-11-10 Thread GitBox


ottobackwards opened a new pull request #4513:
URL: https://github.com/apache/nifi/pull/4513


   
   
[http_template.xml.zip](https://github.com/apache/nifi/files/5181358/http_template.xml.zip)
   
   …ile attributes
   
   To verify:
   - add the attached template
   - configure the HandleHttpRequest `Parameters to Attributes List` property 
to have `data1,DataTwo`
   - Enable the context service
   - run the flow.
   
   Observe the new attributes in the nifi-app.log
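
   A rough sketch of the parameter-to-attribute idea exercised by these steps, assuming a comma-separated property value and an `http.param.` attribute prefix (the prefix is an assumption for illustration, not taken from the PR):

```java
// Sketch of the "Parameters to Attributes List" idea, not the actual
// HandleHttpRequest code; the attribute key prefix is an assumption.
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class ParamsToAttributes {

    static Map<String, String> toAttributes(String parametersToAttributesList,
                                             Map<String, String[]> formParameters) {
        final Map<String, String> attributes = new LinkedHashMap<>();
        for (String name : parametersToAttributesList.split(",")) {
            final String trimmed = name.trim();
            final String[] values = formParameters.get(trimmed);
            if (values != null && values.length > 0) {
                attributes.put("http.param." + trimmed, values[0]);
            }
        }
        return attributes;
    }

    public static void main(String[] args) {
        Map<String, String[]> form = new HashMap<>();
        form.put("data1", new String[]{"one"});
        form.put("DataTwo", new String[]{"two"});
        System.out.println(toAttributes("data1,DataTwo", form));
        // {http.param.data1=one, http.param.DataTwo=two}
    }
}
```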
   
   
   Thank you for submitting a contribution to Apache NiFi.
   
   Please provide a short description of the PR here:
   
    Description of PR
   
   _Enables X functionality; fixes bug NIFI-YYYY._
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [x] Is there a JIRA ticket associated with this PR? Is it referenced 
in the commit message?
   
   - [x] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
   
   - [x] Has your PR been rebased against the latest commit within the target 
branch (typically `main`)?
   
   - [x] Is your initial contribution a single, squashed commit? _Additional 
commits in response to PR reviewer feedback should be made on this branch and 
pushed to allow change tracking. Do not `squash` or use `--force` when pushing 
to allow for clean monitoring of changes._
   
   ### For code changes:
   - [x] Have you ensured that the full suite of tests is executed via `mvn 
-Pcontrib-check clean install` at the root `nifi` folder?
   - [-] Have you written or updated unit tests to verify your changes?
   - [x] Have you verified that the full build is successful on JDK 8?
   - [-] Have you verified that the full build is successful on JDK 11?
   - [-] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
   - [-] If applicable, have you updated the `LICENSE` file, including the main 
`LICENSE` file under `nifi-assembly`?
   - [-] If applicable, have you updated the `NOTICE` file, including the main 
`NOTICE` file found under `nifi-assembly`?
   - [x] If adding new Properties, have you added `.displayName` in addition to 
.name (programmatic access) for each of the new properties?
   
   ### For documentation related changes:
   - [x] Have you ensured that format looks appropriate for the output in which 
it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check GitHub Actions CI for 
build issues and submit an update to your PR as soon as possible.
   
   
   ---
   This change is [Reviewable](https://reviewable.io/reviews/apache/nifi/4513)
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [nifi] ottobackwards closed pull request #4513: NIFI-7761 Allow HandleHttpRequest to add specified form data to FlowF…

2020-11-10 Thread GitBox


ottobackwards closed pull request #4513:
URL: https://github.com/apache/nifi/pull/4513


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org