[GitHub] [nifi] tpalfy commented on issue #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor

2020-01-10 Thread GitBox
tpalfy commented on issue #3966: NIFI-6992 - Add "Batch Size" property to 
GetHDFSFileInfo processor
URL: https://github.com/apache/nifi/pull/3966#issuecomment-573157022
 
 
   > Thanks @tpalfy for this improvement - definitely think this will help 
users who want to monitor huge HDFS directories! I think we can simplify the 
code and the user experience a bit by avoiding the use of an empty string for 
Batch Size, as I indicated in the inline comments. But please do let me know if 
I misunderstood something.
   > 
   > Another option that may further simplify things would be to set the 
default value of the Batch Size property to 1. Then, in the custom validate, 
rather than check if it's set, check whether the value is 1. That way, you 
could actually avoid ever even having to check if the value is null, etc., 
because the validators guarantee that it's a positive integer and the default 
value guarantees that it is non-null. Up to you if you want to go this route, 
but I think it simplifies the code (and tends to adhere to common NiFi 
patterns).
   
   Hi @markap14,
   
   Yes, this was my first approach.
   
   However, @turcsanyip suggested we should help users notice when they 
misconfigure the processor, by making sure that the otherwise valid _'Batch 
Size'_ values become invalid unless the prerequisites are met (i.e. 
_'Destination'_ is set to _'Content'_ and _'Grouping'_ is set to _'None'_).
   With that, we needed a valid value for when the prerequisites are _not_ met, 
which would be the empty string.
   
   So basically, the user should clear the value if it's not going to be used.




[GitHub] [nifi] joewitt commented on a change in pull request #3968: NIFI-3833 Implemented encrypted flowfile repository

2020-01-10 Thread GitBox
joewitt commented on a change in pull request #3968: NIFI-3833 Implemented 
encrypted flowfile repository
URL: https://github.com/apache/nifi/pull/3968#discussion_r365371051
 
 

 ##
 File path: 
nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadFlowFileRepository.java
 ##
 @@ -200,17 +211,19 @@ protected void initialize(final ResourceClaimManager claimManager, final Reposit
 // delete backup. On restore, if no files exist in partition's directory, would have to check backup directory
 this.serdeFactory = serdeFactory;
 
-if (walImplementation.equals(SEQUENTIAL_ACCESS_WAL)) {
+// The specified implementation can be plaintext or encrypted; the only difference is the serde factory
+if (isSequentialAccessWAL(walImplementation)) {
+// TODO: May need to instantiate ESAWAL for clarity?
 
 Review comment:
   This TODO seems unnecessary. Given the TODOs related to follow-on work in 
NIFI-6617, though, this seems fine for now to help get more evaluation cycles 
on this new capability.




[GitHub] [nifi] alopresto commented on issue #3968: NIFI-3833 Implemented encrypted flowfile repository

2020-01-10 Thread GitBox
alopresto commented on issue #3968: NIFI-3833 Implemented encrypted flowfile 
repository
URL: https://github.com/apache/nifi/pull/3968#issuecomment-573155914
 
 
   Thanks everyone for the reviews. My favorite commits are the ones that 
remove more code than they add. Adding just a little bit (and lots of tests) 
is my next favorite. Will merge. 




[GitHub] [nifi] markap14 commented on issue #3968: NIFI-3833 Implemented encrypted flowfile repository

2020-01-10 Thread GitBox
markap14 commented on issue #3968: NIFI-3833 Implemented encrypted flowfile 
repository
URL: https://github.com/apache/nifi/pull/3968#issuecomment-573154939
 
 
   I'm a +1 as well. One of the great things about this implementation is that 
it requires relatively small changes to the existing codebase, mostly just 
adding features as new classes. Nicely done!




[jira] [Updated] (NIFI-6964) Use compression level for xz-lzma2 format of the CompressContent processor

2020-01-10 Thread John Pierce (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-6964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Pierce updated NIFI-6964:
--
Description: 
The CompressContent processor does not use the Compression Level property of 
the processor except when using the GZIP compression format. In contrast, the 
xz-lzma2 compression format defaults to XZ compression level 6 (I read the 
CompressContent.java source code to verify this) – disregarding whatever 
compression level you set on the processor itself.

As a side note, the xz compression format supports, amazingly enough, 10 levels 
of compression from 0 to 9 – the same as GZIP. The only difference that I can 
tell is that level 0 of xz is not the absence of compression, but the lightest 
compression possible (i.e. still some compression) – whereas GZIP compression 
level 0 means just store the content in the container without compressing it.

I have a use case where I must use the xz-lzma2 format (don't ask why) and I 
have to send (using the XZ format) already highly-compressed content that is 
+*NOT*+ XZ format to begin with. I have in excess of 500 GB of this sort of 
already highly compressed content to further compress into the XZ format on a 
daily basis.

The attached patch enhances the CompressContent.java source code, enabling the 
compression level property to be used in both the GZIP and XZ-LZMA2 formats.

Please consider adding this patch to the baseline for this processor. I've 
tested it and the results are fantastic because I can crank down the 
compression level to 0 for XZ-LZMA2 now and use a lot less CPU. I'm generally 
seeing a 66% improvement in elapsed time to process highly compressed content 
using XZ format with compression level of 0 versus the hard-coded level 6 of 
the baseline code.
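
For reference, a minimal sketch of the kind of change involved, using Apache 
Commons Compress (which CompressContent builds on); the preset argument is what 
the Compression Level property would supply, and the file names are made up:

{code:java}
import java.io.FileInputStream;
import java.io.FileOutputStream;

import org.apache.commons.compress.compressors.xz.XZCompressorOutputStream;

public class XzPresetDemo {
    public static void main(String[] args) throws Exception {
        try (FileInputStream in = new FileInputStream("input.bin");
             FileOutputStream fileOut = new FileOutputStream("output.xz");
             // The second argument is the LZMA2 preset (0-9): 0 is the
             // lightest/cheapest compression, 6 is the library default.
             XZCompressorOutputStream out = new XZCompressorOutputStream(fileOut, 0)) {
            final byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
    }
}
{code}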

 

  was:
The CompressContent processor does not use the Compression Level property of 
the processor except when using the GZIP compression format. In contrast, the 
xz-lzma2 compression format defaults to XZ compression level 6 (I read the 
CompressContent.java source code to verify this) – disregarding whatever 
compression level you set on the processor itself.

I have a use case where I must use the xz-lzma2 format (don't ask why) and I 
have to send (using the XZ format) already highly-compressed content that is 
+*NOT*+ XZ format to begin with. I have in excess of 500 GB of this sort of 
already highly compressed content to further compress into the XZ format on a 
daily basis.

The attached patch enhances the CompressContent.java source code, enabling the 
compression level property to be used in both the GZIP and XZ-LZMA2 formats.

Please consider adding this patch to the baseline for this processor. I've 
tested it and the results are fantastic because I can crank down the 
compression level to 0 for XZ-LZMA2 now and use a lot less CPU. I'm generally 
seeing a 66% improvement in elapsed time to process highly compressed content 
using XZ format with compression level of 0 versus the hard-coded level 6 of 
the baseline code.

 

 Labels: compression xz-lzma2  (was: xz-lzma2)

> Use compression level for xz-lzma2 format of the CompressContent processor
> --
>
> Key: NIFI-6964
> URL: https://issues.apache.org/jira/browse/NIFI-6964
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.10.0
>Reporter: John Pierce
>Priority: Minor
>  Labels: compression, xz-lzma2
> Fix For: 1.11.0
>
>   Original Estimate: 4h
>  Time Spent: 10m
>  Remaining Estimate: 3h 50m
>
> The CompressContent processor does not use the Compression Level property of 
> the processor except when using the GZIP compression format. In contrast, the 
> xz-lzma2 compression format defaults to XZ compression level 6 (I read the 
> CompressContent.java source code to verify this) – disregarding whatever 
> compression level you set on the processor itself.
> As a side note, the xz compression format supports, amazingly enough, 10 
> levels of compression from 0 to 9 – the same as GZIP. The only difference 
> that I can tell is that level 0 of xz is not the absence of compression, but 
> the lightest compression possible (i.e. still some compression) – whereas 
> GZIP compression level 0 means just store the content in the container 
> without compressing it.
> I have a use case where I must use the xz-lzma2 format (don't ask why) and I 
> have to send (using the XZ format) already highly-compressed content that is 
> +*NOT*+ XZ format to begin with. I have in excess of 500 GB of this sort of 
> already highly compressed content to further compress into the XZ format on a 
> daily basis.
> The 

[GitHub] [nifi] joewitt commented on issue #3968: NIFI-3833 Implemented encrypted flowfile repository

2020-01-10 Thread GitBox
joewitt commented on issue #3968: NIFI-3833 Implemented encrypted flowfile 
repository
URL: https://github.com/apache/nifi/pull/3968#issuecomment-573154475
 
 
   Walked through this with Mark Payne. I'm +1. Minor comments. Love all the 
good docs here to help folks get going too. Thanks Andy! Looking forward to 
NIFI-6617 to ease config too.




[jira] [Updated] (NIFI-3833) Implement encrypted flowfile repository

2020-01-10 Thread Andy LoPresto (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy LoPresto updated NIFI-3833:

Fix Version/s: 1.11.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Implement encrypted flowfile repository
> ---
>
> Key: NIFI-3833
> URL: https://issues.apache.org/jira/browse/NIFI-3833
> Project: Apache NiFi
>  Issue Type: Sub-task
>  Components: Core Framework
>Affects Versions: 1.2.0
>Reporter: Andy LoPresto
>Assignee: Andy LoPresto
>Priority: Major
>  Labels: encryption, repository, security
> Fix For: 1.11.0
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Similar to the {{EncryptedWriteAheadProvenanceRepository}}, implement an 
> encrypted flowfile repository with random-access seek and confidentiality and 
> integrity controls over the data persisted to disk. 





[jira] [Commented] (NIFI-3833) Implement encrypted flowfile repository

2020-01-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013142#comment-17013142
 ] 

ASF subversion and git services commented on NIFI-3833:
---

Commit 2cc467eb587247f1e8e0f02285ee01acc4ec9918 in nifi's branch 
refs/heads/master from Andy LoPresto
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=2cc467e ]

NIFI-3833 Added encrypted flowfile repository implementation.
Added EncryptedSchemaRepositoryRecordSerde.
Refactored CryptoUtils utility methods for repository encryption configuration 
validation checks to RepositoryEncryptorUtils.
Added FlowFile repo encryption config container.
Added more logging in cryptographic and serialization operations.
Generalized log messages in shared encryption services.
Added encrypted serde factory.
Added marker impl for encrypted WAL.
Moved validation of FF repo encryption config earlier in startup process.
Refactored duplicate property lookup code in NiFiProperties.
Added title case string helper.
Added validation and warning around misformatted encryption repo properties.
Added unit tests.
Added documentation to User Guide & Admin Guide.
Added screenshot for docs.
Added links to relevant sections of NiFi In-Depth doc to User Guide.
Added flowfile & content repository encryption configuration properties to 
default nifi.properties.

Signed-off-by: Joe Witt 
Signed-off-by: Mark Payne 

This closes #3968.


> Implement encrypted flowfile repository
> ---
>
> Key: NIFI-3833
> URL: https://issues.apache.org/jira/browse/NIFI-3833
> Project: Apache NiFi
>  Issue Type: Sub-task
>  Components: Core Framework
>Affects Versions: 1.2.0
>Reporter: Andy LoPresto
>Assignee: Andy LoPresto
>Priority: Major
>  Labels: encryption, repository, security
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Similar to the {{EncryptedWriteAheadProvenanceRepository}}, implement an 
> encrypted flowfile repository with random-access seek and confidentiality and 
> integrity controls over the data persisted to disk. 





[GitHub] [nifi] asfgit closed pull request #3968: NIFI-3833 Implemented encrypted flowfile repository

2020-01-10 Thread GitBox
asfgit closed pull request #3968: NIFI-3833 Implemented encrypted flowfile 
repository
URL: https://github.com/apache/nifi/pull/3968
 
 
   




[GitHub] [nifi] markap14 closed pull request #3978: NIFI-7011: SwappablePriorityQueue contains two internal data structur…

2020-01-10 Thread GitBox
markap14 closed pull request #3978: NIFI-7011: SwappablePriorityQueue contains 
two internal data structur…
URL: https://github.com/apache/nifi/pull/3978
 
 
   




[GitHub] [nifi-minifi] apiri commented on issue #181: MINIFI-517: Added /opt/minifi/minifi-current symlink

2020-01-10 Thread GitBox
apiri commented on issue #181: MINIFI-517: Added /opt/minifi/minifi-current 
symlink
URL: https://github.com/apache/nifi-minifi/pull/181#issuecomment-573207559
 
 
   @r65535 Looks good with that fix. Thanks for adjusting!




[GitHub] [nifi] turcsanyip commented on issue #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor

2020-01-10 Thread GitBox
turcsanyip commented on issue #3966: NIFI-6992 - Add "Batch Size" property to 
GetHDFSFileInfo processor
URL: https://github.com/apache/nifi/pull/3966#issuecomment-573156420
 
 
   @markap14 As the batch size is only used when Grouping=NONE and 
Destination=CONTENT, I suggested having no default value. To me it is more 
consistent if a property has no value when that value will not be used. It 
might be the wrong approach, and I actually haven't checked how similar cases 
are handled elsewhere in NiFi.
   Regarding the empty string, I agree with you; I overlooked it in the review.




[GitHub] [nifi-minifi] apiri commented on issue #177: MINIFI-516: Added bootstrap option to override processors ssl references to use minifi parent ssl instead

2020-01-10 Thread GitBox
apiri commented on issue #177: MINIFI-516: Added bootstrap option to override 
processors ssl references to use minifi parent ssl instead
URL: https://github.com/apache/nifi-minifi/pull/177#issuecomment-573208556
 
 
   reviewing




[GitHub] [nifi-minifi] apiri closed pull request #181: MINIFI-517: Added /opt/minifi/minifi-current symlink

2020-01-10 Thread GitBox
apiri closed pull request #181: MINIFI-517: Added /opt/minifi/minifi-current 
symlink
URL: https://github.com/apache/nifi-minifi/pull/181
 
 
   




[GitHub] [nifi-minifi-cpp] arpadboda closed pull request #706: MINIFICPP-1116 - Fix error reporting in C2Agent

2020-01-10 Thread GitBox
arpadboda closed pull request #706: MINIFICPP-1116 - Fix error reporting in 
C2Agent
URL: https://github.com/apache/nifi-minifi-cpp/pull/706
 
 
   




[GitHub] [nifi] turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor

2020-01-10 Thread GitBox
turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch 
Size" property to GetHDFSFileInfo processor
URL: https://github.com/apache/nifi/pull/3966#discussion_r365190890
 
 

 ##
 File path: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java
 ##
 @@ -609,7 +636,11 @@ protected HDFSFileInfoRequest buildRequestDetails(ProcessContext context, Proces
 req.isIgnoreDotDirs = context.getProperty(IGNORE_DOTTED_DIRS).asBoolean();
 
 req.groupping = HDFSFileInfoRequest.Groupping.getEnum(context.getProperty(GROUPING).getValue());
-req.batchSize = context.getProperty(BATCH_SIZE).asInteger();
+req.batchSize = Optional.ofNullable(context.getProperty(BATCH_SIZE))
+.filter(propertyValue -> propertyValue.getValue() != null)
+.filter(propertyValue -> !propertyValue.getValue().equals(""))
 
 Review comment:
   StringUtils.isEmpty() could be used here too.
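   
   For illustration, a minimal, runnable sketch of what `StringUtils.isEmpty` 
covers (commons-lang3 assumed, as discussed elsewhere in this review):
   
   ```java
   import org.apache.commons.lang3.StringUtils;

   public class IsEmptyDemo {
       public static void main(String[] args) {
           // isEmpty is true for both null and "", so it can replace the two
           // explicit filter conditions in the quoted code with one check.
           System.out.println(StringUtils.isEmpty(null)); // true
           System.out.println(StringUtils.isEmpty(""));   // true
           System.out.println(StringUtils.isEmpty("5"));  // false
       }
   }
   ```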




[GitHub] [nifi] turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor

2020-01-10 Thread GitBox
turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch 
Size" property to GetHDFSFileInfo processor
URL: https://github.com/apache/nifi/pull/3966#discussion_r365193100
 
 

 ##
 File path: nifi-commons/nifi-utils/src/main/java/org/apache/nifi/processor/util/StandardValidators.java
 ##
 @@ -507,6 +507,20 @@ public ValidationResult validate(final String subject, final String input, final
 // FACTORY METHODS FOR VALIDATORS
 //
 //
+public static Validator createAllowEmptyValidator(Validator delegate) {
+return (subject, input, context) -> {
+ValidationResult validationResult;
+
+if (input == null || input.equals("")) {
 
 Review comment:
   The isEmpty() method of this class could be used here instead.




[GitHub] [nifi] turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor

2020-01-10 Thread GitBox
turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch 
Size" property to GetHDFSFileInfo processor
URL: https://github.com/apache/nifi/pull/3966#discussion_r365192302
 
 

 ##
 File path: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/test/java/org/apache/nifi/processors/hadoop/TestGetHDFSFileInfo.java
 ##
 @@ -76,6 +76,59 @@ public void setup() throws InitializationException {
 runner.setProperty(GetHDFSFileInfo.HADOOP_CONFIGURATION_RESOURCES, "src/test/resources/core-site.xml");
 }
 
+@Test
+public void testInvalidBatchSizeWhenDestinationAndGroupingDoesntAllowBatchSize() throws Exception {
+Arrays.asList("1", "2", "100").forEach(
+validBatchSize -> {
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_ATTRIBUTES, GetHDFSFileInfo.GROUP_ALL, validBatchSize, false);
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_ATTRIBUTES, GetHDFSFileInfo.GROUP_PARENT_DIR, validBatchSize, false);
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_ATTRIBUTES, GetHDFSFileInfo.GROUP_NONE, validBatchSize, false);
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_CONTENT, GetHDFSFileInfo.GROUP_ALL, validBatchSize, false);
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_CONTENT, GetHDFSFileInfo.GROUP_PARENT_DIR, validBatchSize, false);
+}
+);
+}
+
+@Test
+public void testValidBatchSize() throws Exception {
+Arrays.asList("1", "2", "100").forEach(
+validBatchSize -> {
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_CONTENT, GetHDFSFileInfo.GROUP_NONE, validBatchSize, true);
+}
+);
+
+Arrays.asList("", null).forEach(
+emptyBatchSize -> {
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_ATTRIBUTES, GetHDFSFileInfo.GROUP_NONE, emptyBatchSize, true);
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_ATTRIBUTES, GetHDFSFileInfo.GROUP_PARENT_DIR, emptyBatchSize, true);
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_ATTRIBUTES, GetHDFSFileInfo.GROUP_ALL, emptyBatchSize, true);
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_CONTENT, GetHDFSFileInfo.GROUP_ALL, emptyBatchSize, true);
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_CONTENT, GetHDFSFileInfo.GROUP_PARENT_DIR, emptyBatchSize, true);
 
 Review comment:
   Grouping=NONE + Destination=CONTENT + Batch Size="" / null should also be 
added as a valid config.
   
   Grouping=NONE + Destination=CONTENT could also be tested with -1 and 0 as 
invalid configs.
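   
   A sketch of those suggested cases, reusing the `testValidateBatchSize` 
helper from the diff above (its last argument is the expected validity):
   
   ```java
   // Grouping=NONE + Destination=CONTENT with an empty/null Batch Size should be valid:
   testValidateBatchSize(GetHDFSFileInfo.DESTINATION_CONTENT, GetHDFSFileInfo.GROUP_NONE, "", true);
   testValidateBatchSize(GetHDFSFileInfo.DESTINATION_CONTENT, GetHDFSFileInfo.GROUP_NONE, null, true);

   // ...while non-positive batch sizes should be invalid:
   testValidateBatchSize(GetHDFSFileInfo.DESTINATION_CONTENT, GetHDFSFileInfo.GROUP_NONE, "-1", false);
   testValidateBatchSize(GetHDFSFileInfo.DESTINATION_CONTENT, GetHDFSFileInfo.GROUP_NONE, "0", false);
   ```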




[GitHub] [nifi] turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor

2020-01-10 Thread GitBox
turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch 
Size" property to GetHDFSFileInfo processor
URL: https://github.com/apache/nifi/pull/3966#discussion_r365190713
 
 

 ##
 File path: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java
 ##
 @@ -266,6 +268,31 @@ public void onPropertyModified(PropertyDescriptor descriptor, String oldValue, S
 req = null;
 }
 
+@Override
+protected Collection<ValidationResult> customValidate(ValidationContext validationContext) {
+final List<ValidationResult> validationResults = new ArrayList<>(super.customValidate(validationContext));
+
+String destination = validationContext.getProperty(DESTINATION).getValue();
+String grouping = validationContext.getProperty(GROUPING).getValue();
+String batchSize = validationContext.getProperty(BATCH_SIZE).getValue();
+
+if (
+(!DESTINATION_CONTENT.getValue().equals(destination) || !GROUP_NONE.getValue().equals(grouping))
+&& batchSize != null
+&& !batchSize.equals("")
 
 Review comment:
   StringUtils.isEmpty() could be used to simplify the condition.
   We have a StringUtils in NiFi and commons-lang3's StringUtils is also used. 
The latter is already imported here, so I'd use that one (but I don't know if 
there is a convention for it).




[GitHub] [nifi] turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor

2020-01-10 Thread GitBox
turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch 
Size" property to GetHDFSFileInfo processor
URL: https://github.com/apache/nifi/pull/3966#discussion_r365196428
 
 

 ##
 File path: nifi-commons/nifi-utils/src/main/java/org/apache/nifi/processor/util/StandardValidators.java
 ##
 @@ -507,6 +507,20 @@ public ValidationResult validate(final String subject, final String input, final
 // FACTORY METHODS FOR VALIDATORS
 //
 //
+public static Validator createAllowEmptyValidator(Validator delegate) {
 
 Review comment:
   Please add some unit tests for this new validator.
   You can find the existing tests in a different package for some reason: 
org.apache.nifi.util.validator.TestStandardValidators
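   
   A minimal sketch of such a test (the class name is hypothetical; the 
ValidationContext is Mockito-mocked as in the existing TestStandardValidators, 
and createAllowEmptyValidator is the factory added in this PR):
   
   ```java
   import static org.junit.Assert.assertFalse;
   import static org.junit.Assert.assertTrue;

   import org.apache.nifi.components.ValidationContext;
   import org.apache.nifi.components.Validator;
   import org.apache.nifi.processor.util.StandardValidators;
   import org.junit.Test;
   import org.mockito.Mockito;

   public class TestAllowEmptyValidator {

       @Test
       public void testEmptyInputIsValidAndDelegateStillApplies() {
           final Validator validator = StandardValidators.createAllowEmptyValidator(
                   StandardValidators.NON_NEGATIVE_INTEGER_VALIDATOR);
           // The mock returns false for expression-language checks, so the
           // delegate's plain integer validation path is exercised.
           final ValidationContext context = Mockito.mock(ValidationContext.class);

           assertTrue(validator.validate("Batch Size", "", context).isValid());
           assertTrue(validator.validate("Batch Size", null, context).isValid());
           assertTrue(validator.validate("Batch Size", "5", context).isValid());
           assertFalse(validator.validate("Batch Size", "abc", context).isValid());
       }
   }
   ```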




[GitHub] [nifi] turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor

2020-01-10 Thread GitBox
turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch 
Size" property to GetHDFSFileInfo processor
URL: https://github.com/apache/nifi/pull/3966#discussion_r365187489
 
 

 ##
 File path: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java
 ##
 @@ -183,9 +186,8 @@
 .name("gethdfsfileinfo-batch-size")
 .description("Number of records to put into an output flowfile when 'Destination' is set to 'Content'"
 + " and 'Group Results' is set to 'None'")
-.required(true)
-.addValidator(StandardValidators.NON_NEGATIVE_INTEGER_VALIDATOR)
-.defaultValue("1")
+.required(false)
+.addValidator(StandardValidators.createAllowEmptyValidator(StandardValidators.NON_NEGATIVE_INTEGER_VALIDATOR))
 
 Review comment:
   POSITIVE_INTEGER_VALIDATOR seems more reasonable to me here.
   "0" falls back to batch size = 1, but that might not be straightforward or 
obvious to the end user.
   The batch size properties of other processors also have 1 as their lower 
bound.




[jira] [Created] (NIFI-7008) PutS3Object: Invalid V4 Authorization Header When Using Custom S3 Blobstore

2020-01-10 Thread Matt M (Jira)
Matt M created NIFI-7008:


 Summary: PutS3Object: Invalid V4 Authorization Header When Using 
Custom S3 Blobstore
 Key: NIFI-7008
 URL: https://issues.apache.org/jira/browse/NIFI-7008
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Affects Versions: 1.10.0
 Environment:  NiFi 1.10.0, connecting to MinIO 2019-12-19 
S3-Compatible Blobstore
Reporter: Matt M


Hello!

Some background: I'm currently attempting to use a {{PutS3Object}} processor in 
NiFi {{1.10.0}} to upload an object to a [MinIO|https://min.io/] cluster. The 
MinIO cluster is configured to act as an S3-compatible blobstore in the 
{{us-east-1}} region. The MinIO cluster is running on an internal private 
network at my company at https://s3.mydomain.mycompany.com .

The {{PutS3Object}} processor is configured as follows:
- {{Bucket}}: {{mybucket}}
- {{Region}}: {{US East (N. Virginia)}}
- {{Endpoint Override URL}}: {{https://s3.mydomain.mycompany.com:9000}}
- {{Signer Override}}: {{Signature v4}}

All other options are left at their default values.

When I attempt to use the processor to put a file into MinIO, the processor 
shows an error like the following: {{Status Code: 400, Error Code: 
AuthorizationHeaderMalformed}}.

After some debugging, it looks like the HTTP {{Authorization}} header being 
generated by NiFi isn't quite what I would expect. The {{Authorization}} 
header starts off like this:

{noformat}
Authorization: AWS4-HMAC-SHA256 Credential=AKIAIOSFODNN7EXAMPLE/20200111/mydomain/s3/aws4_request ...
{noformat}
Whereas what I would _expect_ is something more like this:
{noformat}
Authorization: AWS4-HMAC-SHA256 Credential=AKIAIOSFODNN7EXAMPLE/20200111/us-east-1/s3/aws4_request ...
{noformat}

The current behaviour seems to be: take part of the domain from the {{Endpoint 
Override URL}} and use it as the region inside the {{Authorization}} header, 
instead of using the {{Region}} that was specified.

As a workaround for now we can use {{Signature v2}}, but it is unknown how 
long MinIO will continue to support {{Signature v2}}.

Would it be possible to fix the S3 family of processors so that they use the 
specified {{Region}} instead of attempting to extract the region from the URL?
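
For context, a minimal sketch (not NiFi's actual code) of how the AWS SDK for 
Java v1 pairs an endpoint override with an explicit signing region, which is 
the behaviour the processors would need here:

{code:java}
import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

public class S3EndpointOverrideDemo {
    public static void main(String[] args) {
        // The EndpointConfiguration carries both the endpoint override and the
        // signing region, so the region is never guessed from the host name.
        AmazonS3 s3 = AmazonS3ClientBuilder.standard()
                .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(
                        "https://s3.mydomain.mycompany.com:9000", "us-east-1"))
                .withPathStyleAccessEnabled(true) // common for S3-compatible stores like MinIO
                .build();
        System.out.println(s3.getRegionName()); // us-east-1
    }
}
{code}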






[GitHub] [nifi] woutifier-t opened a new pull request #3977: NIFI-7007 Add update functionality to the PutCassandraRecord processor.

2020-01-10 Thread GitBox
woutifier-t opened a new pull request #3977: NIFI-7007 Add update functionality 
to the PutCassandraRecord processor.
URL: https://github.com/apache/nifi/pull/3977
 
 
   Thank you for submitting a contribution to Apache NiFi.
   
   Please provide a short description of the PR here:
   
    #### Description of PR
   
   _Enables UPDATE functionality in the PutCassandraRecord processor; fixes bug 
NIFI-7007._
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [x] Is there a JIRA ticket associated with this PR? Is it referenced 
in the commit message?
   
   - [x] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
   
   - [x] Has your PR been rebased against the latest commit within the target 
branch (typically `master`)?
   
   - [x] Is your initial contribution a single, squashed commit? _Additional 
commits in response to PR reviewer feedback should be made on this branch and 
pushed to allow change tracking. Do not `squash` or use `--force` when pushing 
to allow for clean monitoring of changes._
   
   ### For code changes:
   - [x] Have you ensured that the full suite of tests is executed via `mvn 
-Pcontrib-check clean install` at the root `nifi` folder?
   - [x] Have you written or updated unit tests to verify your changes?
   - [x] Have you verified that the full build is successful on both JDK 8 and 
JDK 11?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
   - [ ] If applicable, have you updated the `LICENSE` file, including the main 
`LICENSE` file under `nifi-assembly`?
   - [ ] If applicable, have you updated the `NOTICE` file, including the main 
`NOTICE` file found under `nifi-assembly`?
   - [x] If adding new Properties, have you added `.displayName` in addition to 
.name (programmatic access) for each of the new properties?
   
   ### For documentation related changes:
   - [ ] Have you ensured that format looks appropriate for the output in which 
it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
   




[jira] [Resolved] (MINIFICPP-1116) Segfault when trying to use non-existent C2 protocol class

2020-01-10 Thread Marton Szasz (Jira)


 [ 
https://issues.apache.org/jira/browse/MINIFICPP-1116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Szasz resolved MINIFICPP-1116.
-
Resolution: Fixed

https://github.com/apache/nifi-minifi-cpp/pull/706

> Segfault when trying to use non-existent C2 protocol class
> --
>
> Key: MINIFICPP-1116
> URL: https://issues.apache.org/jira/browse/MINIFICPP-1116
> Project: Apache NiFi MiNiFi C++
>  Issue Type: Bug
>Affects Versions: master
> Environment: Ubuntu 18.04.3 LTS
>Reporter: Marton Szasz
>Assignee: Marton Szasz
>Priority: Minor
> Fix For: 0.8.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> h3. Steps to reproduce:
>  # Compile minifi-cpp with no CoAP support
>  # Configure C2 with {{nifi.c2.agent.protocol.class=CoapProtocol}}
>  # Start MiNiFi C++
> h3. Expected behavior:
> Print an error message to stderr, produce a log message with log level of at 
> least error and exit. Alternatively, produce a log message with log level of 
> at least warning and continue execution without C2.
> h3. Actual behavior:
> No output on stderr.
>  Error message logged with log level info.
>  Segmentation fault in {{C2Agent.cpp:329}}, trying to dereference 
> {{C2Agent::protocol_}}, but it's null.





[jira] [Created] (NIFI-7007) Add update capability to PutCassandraRecord

2020-01-10 Thread Wouter de Vries (Jira)
Wouter de Vries created NIFI-7007:
-

 Summary: Add update capability to PutCassandraRecord
 Key: NIFI-7007
 URL: https://issues.apache.org/jira/browse/NIFI-7007
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Wouter de Vries


The current PutCassandraRecord processor (1.10.0) only supports the INSERT 
statement. It would be useful to extend it to also support UPDATE, especially 
in the context of COUNTER tables, which do not allow INSERTs.
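
For illustration, a sketch of the kind of statement involved, using the 
DataStax Java driver 3.x QueryBuilder (keyspace, table, and column names are 
made up):

{code:java}
import com.datastax.driver.core.querybuilder.QueryBuilder;
import com.datastax.driver.core.querybuilder.Update;

public class CounterUpdateDemo {
    public static void main(String[] args) {
        // Counter columns can only be modified via UPDATE ... SET c = c + n;
        // INSERT is rejected for counter tables, which is why UPDATE support
        // matters for this processor.
        Update.Where update = QueryBuilder.update("mykeyspace", "page_views")
                .with(QueryBuilder.incr("views", 10L))
                .where(QueryBuilder.eq("page", "/index.html"));
        // Prints roughly: UPDATE mykeyspace.page_views SET views=views+10 WHERE page='/index.html';
        System.out.println(update.getQueryString());
    }
}
{code}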





[GitHub] [nifi] turcsanyip commented on issue #3894: NIFI-6884 - Native library loading fixed/improved

2020-01-10 Thread GitBox
turcsanyip commented on issue #3894: NIFI-6884 - Native library loading 
fixed/improved
URL: https://github.com/apache/nifi/pull/3894#issuecomment-573006140
 
 
   @joewitt @markap14 @tpalfy The 2nd round of review (refactorings, 
integration tests) is in progress on my side.




[GitHub] [nifi] ravowlga123 commented on a change in pull request #3971: NIFI-6997: consumeAMQP closing connection when queue not found

2020-01-10 Thread GitBox
ravowlga123 commented on a change in pull request #3971: NIFI-6997: consumeAMQP 
closing connection when queue not found 
URL: https://github.com/apache/nifi/pull/3971#discussion_r365238434
 
 

 ##
 File path: nifi-nar-bundles/nifi-amqp-bundle/nifi-amqp-processors/src/main/java/org/apache/nifi/amqp/processors/ConsumeAMQP.java
 ##
 @@ -194,6 +194,13 @@ protected synchronized AMQPConsumer createAMQPWorker(final ProcessContext contex
 
 return amqpConsumer;
 } catch (final IOException ioe) {
+try {
+connection.close();
+getLogger().warn("Closed connection at port " + connection.getPort());
+} catch (final IOException ioe_close) {
 
 Review comment:
   Instead of "ioe_close" "ioeClose" would be better?? Also not sure if log 
message should be at warn or info level. 




[GitHub] [nifi] asfgit closed pull request #3970: NIFI-6945: Use minExtInfo=true to reduce the amount of data queried f…

2020-01-10 Thread GitBox
asfgit closed pull request #3970: NIFI-6945: Use minExtInfo=true to reduce the 
amount of data queried f…
URL: https://github.com/apache/nifi/pull/3970
 
 
   




[GitHub] [nifi] mcgilman commented on issue #3970: NIFI-6945: Use minExtInfo=true to reduce the amount of data queried f…

2020-01-10 Thread GitBox
mcgilman commented on issue #3970: NIFI-6945: Use minExtInfo=true to reduce the 
amount of data queried f…
URL: https://github.com/apache/nifi/pull/3970#issuecomment-573048135
 
 
   Thanks for the PR @turcsanyip! Thanks for the review @tpalfy!




[jira] [Commented] (NIFI-6945) Use minExtInfo=true to reduce the amount of data queried from Atlas

2020-01-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17012898#comment-17012898
 ] 

ASF subversion and git services commented on NIFI-6945:
---

Commit b8ffb5461216b8330fb464162316197458f482c7 in nifi's branch 
refs/heads/master from Peter Turcsanyi
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=b8ffb54 ]

NIFI-6945: Use minExtInfo=true to reduce the amount of data queried from Atlas

This closes #3970


> Use minExtInfo=true to reduce the amount of data queried from Atlas
> ---
>
> Key: NIFI-6945
> URL: https://issues.apache.org/jira/browse/NIFI-6945
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Peter Turcsanyi
>Assignee: Peter Turcsanyi
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When ReportLineageToAtlas queries the entities from Atlas, the 
> AtlasEntityWithExtInfo response object also contains all the referred 
> entities with all their data by default, which leads to huge data transfers 
> between Atlas and NiFi. On the other hand, this extended data of the related 
> entities is not used by the reporting task, so it could be excluded from 
> the query (using the {{minExtInfo=true}} parameter) in order to reduce the 
> response size.





[jira] [Updated] (NIFI-6945) Use minExtInfo=true to reduce the amount of data queried from Atlas

2020-01-10 Thread Matt Gilman (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Gilman updated NIFI-6945:
--
Fix Version/s: 1.11.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Use minExtInfo=true to reduce the amount of data queried from Atlas
> ---
>
> Key: NIFI-6945
> URL: https://issues.apache.org/jira/browse/NIFI-6945
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Peter Turcsanyi
>Assignee: Peter Turcsanyi
>Priority: Major
> Fix For: 1.11.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When ReportLineageToAtlas queries the entities from Atlas, the 
> AtlasEntityWithExtInfo response object also contains all the referred 
> entities with all their data by default, which leads to huge data transfers 
> between Atlas and NiFi. On the other hand, this extended data of the related 
> entities is not used by the reporting task, so it could be excluded from 
> the query (using the {{minExtInfo=true}} parameter) in order to reduce the 
> response size.





[jira] [Updated] (NIFI-7009) Deleted flow components should not be queried by Atlas reporting task

2020-01-10 Thread Peter Turcsanyi (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Turcsanyi updated NIFI-7009:
--
Description: 
The NiFi Atlas reporting task retrieves the flow and its components from Atlas 
at every execution of the task in the following way:
 # get flow
 # get each flow component
 # filter out deleted components

#1 returns references to all the subcomponents of the flow, including the 
deleted ones. It seems there is no way to parameterize the Atlas query to 
filter out the deleted items here.

But the result of #1 already contains the status of the components, so the 
deleted items do not need to be queried in #2 and then filtered out in #3.

#2 and #3 can be swapped, and only the active items need to be retrieved.

  was:
The NiFi Atlas reporting task retrieves the flow and its components from Atlas 
at every execution of the task in the following way:
 # get flow
 # get each flow component
 # filter out deleted components

#1 returns references to all the subcomponents of the flow, including the 
deleted ones. It seems there is no way to parameterize the Atlas query to 
filter out the deleted items here.

But the result of #1 already contains the status of the components, so the 
deleted items do not need to be queried in #2 and then filtered out in #3. So 
#2 and #3 can be swapped.


> Deleted flow components should not be queried by Atlas reporting task
> -
>
> Key: NIFI-7009
> URL: https://issues.apache.org/jira/browse/NIFI-7009
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Peter Turcsanyi
>Assignee: Peter Turcsanyi
>Priority: Major
>
> The NiFi Atlas reporting task retrieves the flow and its components from 
> Atlas at every execution of the task in the following way:
>  # get flow
>  # get each flow component
>  # filter out deleted components
> #1 returns references to all the subcomponents of the flow, including the 
> deleted ones. It seems there is no way to parameterize the Atlas query to 
> filter out the deleted items here.
> But the result of #1 already contains the status of the components, so the 
> deleted items do not need to be queried in #2 and then filtered out in #3.
> #2 and #3 can be swapped, and only the active items need to be retrieved.





[GitHub] [nifi] joewitt commented on issue #3971: NIFI-6997: consumeAMQP closing connection when queue not found

2020-01-10 Thread GitBox
joewitt commented on issue #3971: NIFI-6997: consumeAMQP closing connection 
when queue not found 
URL: https://github.com/apache/nifi/pull/3971#issuecomment-573043464
 
 
   No checklist needed.
   Exceptions should not be logged at info level.
   Good to use the common naming convention of camel case rather than 
underscores.




[GitHub] [nifi] shawnweeks commented on issue #3973: NIFI-6852 Don't Validate Processors that accept any ControllerService…

2020-01-10 Thread GitBox
shawnweeks commented on issue #3973: NIFI-6852 Don't Validate Processors that 
accept any ControllerService…
URL: https://github.com/apache/nifi/pull/3973#issuecomment-573055892
 
 
   @joewitt Do you have any suggestions on building a test case for this? We 
have basically no test cases for this class at all, and I'm not having much 
luck getting enough of it stubbed out to actually test it.




[jira] [Commented] (NIFI-6852) CTL doesn't work in ExecuteGroovyScript processor with Apache NiFi 1.10.0

2020-01-10 Thread Mark Bean (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17012927#comment-17012927
 ] 

Mark Bean commented on NIFI-6852:
-

This issue also presents itself in another use case: a controller service that 
includes dynamic properties where such a property is another controller service 
(i.e. ControllerService.class).
Prior to 1.10.0, referenced controller services in dynamic properties were not 
validated against an API; NIFI-6798 changed that. Therefore, a (custom) 
controller service which validated pre-1.10.0 no longer validates.
I don't see a downside to allowing all ControllerService.class API components 
to validate as long as the designer intends to allow ANY controller service. In 
certain cases, this is required.

> CTL doesn't work in ExecuteGroovyScript processor with Apache NiFi 1.10.0
> -
>
> Key: NIFI-6852
> URL: https://issues.apache.org/jira/browse/NIFI-6852
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.10.0
>Reporter: Behrouz
>Assignee: Shawn Weeks
>Priority: Blocker
> Attachments: NIFI_CTL.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> more information is available in this stackoverflow question:
> [https://stackoverflow.com/questions/58731092/why-ctl-doesnt-work-in-executegroovyscript-processor-with-apache-nifi-1-10-0]
> according to daggett's comment on stackoverflow, the source of the error seems to be 
> [https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core-api/src/main/java/org/apache/nifi/controller/AbstractComponentNode.java#L746]





[GitHub] [nifi] tpalfy commented on issue #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor

2020-01-10 Thread GitBox
tpalfy commented on issue #3966: NIFI-6992 - Add "Batch Size" property to 
GetHDFSFileInfo processor
URL: https://github.com/apache/nifi/pull/3966#issuecomment-573020796
 
 
   @turcsanyip Thanks for the comments, I applied all the suggested changes.




[GitHub] [nifi] ravowlga123 commented on issue #3971: NIFI-6997: consumeAMQP closing connection when queue not found

2020-01-10 Thread GitBox
ravowlga123 commented on issue #3971: NIFI-6997: consumeAMQP closing connection 
when queue not found 
URL: https://github.com/apache/nifi/pull/3971#issuecomment-573038910
 
 
   Hey @dirkbig, I don't see the checklist regarding the changes made that is 
usually present when opening a PR. Would you please add it?




[jira] [Created] (NIFI-7009) Deleted flow components should not be queried by Atlas reporting task

2020-01-10 Thread Peter Turcsanyi (Jira)
Peter Turcsanyi created NIFI-7009:
-

 Summary: Deleted flow components should not be queried by Atlas 
reporting task
 Key: NIFI-7009
 URL: https://issues.apache.org/jira/browse/NIFI-7009
 Project: Apache NiFi
  Issue Type: Improvement
Reporter: Peter Turcsanyi
Assignee: Peter Turcsanyi


The NiFi Atlas reporting task retrieves the flow and its components from Atlas 
at every execution of the task in the following way:
 # get flow
 # get each flow component
 # filter out deleted components

#1 returns references to all the subcomponents of the flow, including the 
deleted ones. It seems there is no way to parameterize the Atlas query to 
filter out the deleted items here.

But the result of #1 already contains the status of the components, so the 
deleted items do not need to be queried in #2 and then filtered out in #3. So 
#2 and #3 can be swapped.





[GitHub] [nifi-minifi-cpp] szaszm commented on a change in pull request #707: MINIFICPP-1117 - minifi::Exception is now nothrow copyable

2020-01-10 Thread GitBox
szaszm commented on a change in pull request #707: MINIFICPP-1117 - 
minifi::Exception is now nothrow copyable
URL: https://github.com/apache/nifi-minifi-cpp/pull/707#discussion_r365213737
 
 

 ##
 File path: libminifi/include/Exception.h
 ##
 @@ -44,49 +43,56 @@ enum ExceptionType {
   MAX_EXCEPTION
 };
 
-// Exception String 
 static const char *ExceptionStr[MAX_EXCEPTION] = { "File Operation", "Flow File Operation", "Processor Operation", "Process Session Operation", "Process Schedule Operation", "Site2Site Protocol",
 "General Operation", "Regex Operation" };
 
-// Exception Type to String 
 inline const char *ExceptionTypeToString(ExceptionType type) {
   if (type < MAX_EXCEPTION)
 return ExceptionStr[type];
   else
 return NULL;
 }
 
-// Exception Class
-class Exception : public std::exception {
- public:
-  // Constructor
-  /*!
-   * Create a new exception
-   */
-  Exception(ExceptionType type, std::string errorMsg)
-  : _type(type),
-_errorMsg(std::move(errorMsg)) {
-  }
+namespace detail {
+inline size_t StringLength(const char* str) { return strlen(str); }
 
-  // Destructor
-  virtual ~Exception() noexcept {
-  }
-  virtual const char * what() const noexcept {
+template<size_t L>
+constexpr size_t StringLength(const char (&str)[L]) { return L; }
 
-_whatStr = ExceptionTypeToString(_type);
+inline size_t StringLength(const std::string& str) { return str.size(); }
 
-_whatStr += ":" + _errorMsg;
-return _whatStr.c_str();
-  }
+template<typename... SizeT>
+size_t sum(SizeT... ns) {
+  size_t result = 0;
+  (void)(std::initializer_list<size_t>{(
+  result += ns
+  )...});
+  return result; // (ns + ...)
+}
 
- private:
-  // Exception type
-  ExceptionType _type;
-  // Exception detailed information
-  std::string _errorMsg;
-  // Hold the what result
-  mutable std::string _whatStr;
+template<typename... Strs>
+std::string StringJoin(Strs&&... strs) {
 
 Review comment:
   moved to StringUtils in dbc0033




[GitHub] [nifi-minifi-cpp] szaszm edited a comment on issue #707: MINIFICPP-1117 - minifi::Exception is now nothrow copyable

2020-01-10 Thread GitBox
szaszm edited a comment on issue #707: MINIFICPP-1117 - minifi::Exception is 
now nothrow copyable
URL: https://github.com/apache/nifi-minifi-cpp/pull/707#issuecomment-573019916
 
 
   update:
   - fix `ProcessorTests`
   - move string utilities to `StringUtils` and generalize them
   - add unit test to cover the new `StringUtils::join_pack`
   
   Moving the utilities from a private namespace to `StringUtils` means that 
they need to be more general IMO. I've extended them to work with any character 
type and do SFINAE checking on the string types. 
   
   This means quite a lot of template code, which may not be welcome in this 
codebase. Suggestions are welcome.






[GitHub] [nifi] joewitt commented on a change in pull request #3971: NIFI-6997: consumeAMQP closing connection when queue not found

2020-01-10 Thread GitBox
joewitt commented on a change in pull request #3971: NIFI-6997: consumeAMQP 
closing connection when queue not found 
URL: https://github.com/apache/nifi/pull/3971#discussion_r365281042
 
 

 ##
 File path: nifi-nar-bundles/nifi-amqp-bundle/nifi-amqp-processors/src/main/java/org/apache/nifi/amqp/processors/ConsumeAMQP.java
 ##
 @@ -194,6 +194,13 @@ protected synchronized AMQPConsumer createAMQPWorker(final ProcessContext contex
 
 return amqpConsumer;
 } catch (final IOException ioe) {
+try {
+connection.close();
+getLogger().warn("Closed connection at port " + connection.getPort());
+} catch (final IOException ioeClose) {
+throw new ProcessException("Failed to close connection at port " + connection.getPort());
 
 Review comment:
   You're at this point because of an IO exception that already occurred. Here 
you're closing a connection as good practice, but the fact that the close 
itself might not go well isn't useful and isn't something you could do 
anything about. As such I'd close quietly, meaning do not rethrow the 
exception and do not log it, even at warn level. If you want, you could keep 
the logging part, but only at debug level.
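   
   A sketch of that close-quietly shape inside the quoted catch block 
(getLogger() and ProcessException are from the surrounding processor code; the 
exact messages and the final rethrow are illustrative):
   
   ```java
   } catch (final IOException ioe) {
       try {
           connection.close();
       } catch (final IOException ioeClose) {
           // Closing is best-effort cleanup; the original IOException is the
           // failure that matters, so don't rethrow or warn here.
           getLogger().debug("Failed to close AMQP connection", ioeClose);
       }
       throw new ProcessException(ioe);
   }
   ```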




[jira] [Created] (NIFI-7010) Add queue inspection and management to NiFi toolkit cli

2020-01-10 Thread John Wise (Jira)
John Wise created NIFI-7010:
---

 Summary: Add queue inspection and management to NiFi toolkit cli
 Key: NIFI-7010
 URL: https://issues.apache.org/jira/browse/NIFI-7010
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Tools and Build
Reporter: John Wise


There are currently no commands in the cli to inspect or manage connection 
queues.  We'd like to be able to use the cli to perform monitoring, alerting, 
and triage of certain queues.

Also, the ability to find processors, groups, and queues by name, rather than 
just UUID, would greatly simplify some of our scripts.
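For illustration, resolving a component by name today means calling the REST 
search endpoint directly, roughly like the sketch below (host, port, and the 
search term are placeholders):

{code:java}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Rough sketch: look up components by name via the existing REST search
// endpoint, then feed the returned UUIDs to other REST or cli calls.
public class FlowSearch {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/nifi-api/flow/search-results?q=MyQueue"))
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        // The JSON body lists matching processors, connections, and process
        // groups together with the UUIDs that the other endpoints require.
        System.out.println(response.body());
    }
}
{code}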



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi] dirkbig commented on a change in pull request #3971: NIFI-6997: consumeAMQP closing connection when queue not found

2020-01-10 Thread GitBox
dirkbig commented on a change in pull request #3971: NIFI-6997: consumeAMQP 
closing connection when queue not found 
URL: https://github.com/apache/nifi/pull/3971#discussion_r365272278
 
 

 ##
 File path: 
nifi-nar-bundles/nifi-amqp-bundle/nifi-amqp-processors/src/main/java/org/apache/nifi/amqp/processors/ConsumeAMQP.java
 ##
 @@ -194,6 +194,13 @@ protected synchronized AMQPConsumer 
createAMQPWorker(final ProcessContext contex
 
 return amqpConsumer;
 } catch (final IOException ioe) {
+try {
+connection.close();
+getLogger().warn("Closed connection at port " + 
connection.getPort());
+} catch (final IOException ioe_close) {
 
 Review comment:
   Thanks for the input, just changed it. I kept the log message at WARN 
level, as joewitt affirmed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (NIFI-7011) FlowFile ordering can become incorrect when swapping data

2020-01-10 Thread Mark Payne (Jira)
Mark Payne created NIFI-7011:


 Summary: FlowFile ordering can become incorrect when swapping data
 Key: NIFI-7011
 URL: https://issues.apache.org/jira/browse/NIFI-7011
 Project: Apache NiFi
  Issue Type: Bug
  Components: Core Framework
Reporter: Mark Payne
Assignee: Mark Payne


NiFi provides weak ordering guarantees when using prioritizers in conjunction 
with swapped data. When data is swapped out, it is always the lowest priority 
data (according to the selected prioritizers) that is swapped out first. When 
data is swapped, it is always swapped back in, in the same order.

However, I've come across a problem after data is swapped out: the data that 
was not swapped out gets processed first, then the data waiting to be swapped 
out (the lowest priority data), and then the already-swapped data. The data 
that is waiting to be swapped out should always be processed after all data 
that has already been swapped out is swapped back in.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi] markap14 opened a new pull request #3978: NIFI-7011: SwappablePriorityQueue contains two internal data structur…

2020-01-10 Thread GitBox
markap14 opened a new pull request #3978: NIFI-7011: SwappablePriorityQueue 
contains two internal data structur…
URL: https://github.com/apache/nifi/pull/3978
 
 
   …es: activeQueue, swapQueue. activeQueue is intended to be pulled from for 
processing. swapQueue is intended to hold FlowFiles that are waiting to be 
swapped out. We want to ensure that we first swap in any data that has already 
been swapped out before processing the swap queue, in order to ensure that we 
process the data in the correct order. This fix addresses an issue where data 
was being swapped out by writing the lowest priority data to a swap file, then 
adding the highest priority data to activeQueue and the 'middle' priority data 
back to swapQueue. As a result, when polling from the queue we got the highest 
priority data, followed by the lowest priority data, followed by the middle 
priority data. This is addressed by avoiding putting anything back on 
swapQueue when we swap out. Instead, we write data to the swap file, then push 
everything else to activeQueue. This way, any new data that comes in will 
still go to the swapQueue, as it should, but all data that didn't get written 
to the swap file will be processed before the low priority data in the swap 
file.
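   
   A simplified model of the corrected swap-out step (the `SwapWriter` interface and the method shape are illustrative, not the actual `SwappablePriorityQueue` API):
   
   ```java
   import java.util.List;
   import java.util.Queue;
   
   final class SwapOutSketch {
   
       interface SwapWriter<T> {
           void write(List<T> lowestPriority); // persists one swap file
       }
   
       // prioritized is ordered highest priority first. The lowest-priority tail
       // is written to the swap file and everything else goes to activeQueue;
       // nothing is put back on swapQueue. New arrivals still land on swapQueue,
       // but they are only polled after the swap file has been swapped back in.
       static <T> void swapOut(final List<T> prioritized, final int swapCount,
                               final Queue<T> activeQueue, final SwapWriter<T> swapWriter) {
           final int cutoff = prioritized.size() - swapCount;
           swapWriter.write(prioritized.subList(cutoff, prioritized.size()));
           activeQueue.addAll(prioritized.subList(0, cutoff));
       }
   }
   ```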
   
   Thank you for submitting a contribution to Apache NiFi.
   
   Please provide a short description of the PR here:
   
    Description of PR
   
   _Enables X functionality; fixes bug NIFI-YYYY._
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
in the commit message?
   
   - [ ] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
   
   - [ ] Has your PR been rebased against the latest commit within the target 
branch (typically `master`)?
   
   - [ ] Is your initial contribution a single, squashed commit? _Additional 
commits in response to PR reviewer feedback should be made on this branch and 
pushed to allow change tracking. Do not `squash` or use `--force` when pushing 
to allow for clean monitoring of changes._
   
   ### For code changes:
   - [ ] Have you ensured that the full suite of tests is executed via `mvn 
-Pcontrib-check clean install` at the root `nifi` folder?
   - [ ] Have you written or updated unit tests to verify your changes?
   - [ ] Have you verified that the full build is successful on both JDK 8 and 
JDK 11?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
   - [ ] If applicable, have you updated the `LICENSE` file, including the main 
`LICENSE` file under `nifi-assembly`?
   - [ ] If applicable, have you updated the `NOTICE` file, including the main 
`NOTICE` file found under `nifi-assembly`?
   - [ ] If adding new Properties, have you added `.displayName` in addition to 
.name (programmatic access) for each of the new properties?
   
   ### For documentation related changes:
   - [ ] Have you ensured that format looks appropriate for the output in which 
it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (NIFI-7010) Add queue inspection and management to NiFi toolkit cli

2020-01-10 Thread John Wise (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-7010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Wise updated NIFI-7010:

Description: 
There are currently no commands in the cli to inspect or manage connection 
queues.  We'd like to be able to use the cli to perform monitoring, alerting, 
and triage of certain queues.  We're currently using a series of cli and REST 
API calls to accomplish this now.

Also, the ability to find processors, groups, and queues by name, rather than 
just UUID, would greatly simplify some of our scripts.

  was:
There are currently no commands in the cli to inspect or manage connection 
queues.  We'd like to be able to use the cli to perform monitoring, alerting, 
and triage of certain queues.

Also, the ability to find processors, groups, and queues by name, rather than 
just UUID, would greatly simplify some of our scripts.


> Add queue inspection and management to NiFi toolkit cli
> ---
>
> Key: NIFI-7010
> URL: https://issues.apache.org/jira/browse/NIFI-7010
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Tools and Build
>Reporter: John Wise
>Priority: Minor
>
> There are currently no commands in the cli to inspect or manage connection 
> queues.  We'd like to be able to use the cli to perform monitoring, alerting, 
> and triage of certain queues.  We're currently using a series of cli and REST 
> API calls to accomplish this now.
> Also, the ability to find processors, groups, and queues by name, rather than 
> just UUID, would greatly simplify some of our scripts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (NIFI-7012) Using sensitive parameter in sensitive property of InvokeScriptedProcessor causes Jetty shutdown on NiFi restart

2020-01-10 Thread Dariusz Chmielewski (Jira)
Dariusz Chmielewski created NIFI-7012:
-

 Summary: Using sensitive parameter in sensitive property of 
InvokeScriptedProcessor causes Jetty shutdown on NiFi restart
 Key: NIFI-7012
 URL: https://issues.apache.org/jira/browse/NIFI-7012
 Project: Apache NiFi
  Issue Type: Bug
Affects Versions: 1.10.0
 Environment: OpenJDK 1.8.0_232 on Ubuntu 18.04.3 LTS
Reporter: Dariusz Chmielewski
 Attachments: groovy_sample.txt

To simulate this, add a new InvokeScriptedProcessor to your flow with the 
attached example Groovy script as the Script Body. This creates a sensitive 
"Password" property on the processor; then, using the UI, enter a sensitive 
parameter (e.g. "#{test_pass}") as the value of this property. At this point 
everything works fine, with no errors in the UI or in nifi-app.log while the 
processor executes, but when you restart NiFi you will see the warning below 
in nifi-app.log. To get around this issue and have Jetty start again, I had to 
manually edit flow.xml and set the parameter's ("test_pass") sensitive flag to 
false. Note that both the processor property and the parameter appear as 
sensitive in the UI (before making changes to flow.xml) and their values are 
encrypted in flow.xml.
{quote}2020-01-10 15:37:28,492 INFO [main] org.eclipse.jetty.server.Server 
Started @29687ms
 2020-01-10 15:37:28,493 WARN [main] org.apache.nifi.web.server.JettyServer 
Failed to start web server... shutting down.
 org.apache.nifi.controller.serialization.FlowSynchronizationException: 
java.lang.IllegalArgumentException: The property 'Password' cannot reference 
Parameter 'test_pass' because Sensitive Parameters may only be referenced by 
Sensitive Properties.
 at 
org.apache.nifi.controller.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:510)
 at 
org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1368)
 at 
org.apache.nifi.persistence.StandardXMLFlowConfigurationDAO.load(StandardXMLFlowConfigurationDAO.java:88)
 at 
org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:812)
 at 
org.apache.nifi.controller.StandardFlowService.load(StandardFlowService.java:557)
 at 
org.apache.nifi.web.contextlistener.ApplicationStartupContextListener.contextInitialized(ApplicationStartupContextListener.java:72)
 at 
org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:959)
 at 
org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:553)
 at 
org.eclipse.jetty.server.handler.ContextHandler.startContext(ContextHandler.java:924)
 at 
org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:365)
 at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1497)
 at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1459)
 at 
org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:854)
 at 
org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:278)
 at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:545)
 at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
 at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:167)
 at 
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:119)
 at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
 at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
 at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:167)
 at 
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
 at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
 at 
org.eclipse.jetty.server.handler.gzip.GzipHandler.doStart(GzipHandler.java:406)
 at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
 at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:167)
 at 
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:119)
 at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
 at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
 at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:167)
 at 
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:119)
 at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
 at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
 at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:167)
 at 

[GitHub] [nifi] turcsanyip opened a new pull request #3979: NIFI-7009: Atlas reporting task retrieves only the active flow compon…

2020-01-10 Thread GitBox
turcsanyip opened a new pull request #3979: NIFI-7009: Atlas reporting task 
retrieves only the active flow compon…
URL: https://github.com/apache/nifi/pull/3979
 
 
   …ents
   
   Filter out the deleted components before querying them, instead of retrieving
   all the components before filtering.
   
   https://issues.apache.org/jira/browse/NIFI-7009
   
   Thank you for submitting a contribution to Apache NiFi.
   
   Please provide a short description of the PR here:
   
    Description of PR
   
   _Enables X functionality; fixes bug NIFI-YYYY._
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
in the commit message?
   
   - [ ] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
   
   - [ ] Has your PR been rebased against the latest commit within the target 
branch (typically `master`)?
   
   - [ ] Is your initial contribution a single, squashed commit? _Additional 
commits in response to PR reviewer feedback should be made on this branch and 
pushed to allow change tracking. Do not `squash` or use `--force` when pushing 
to allow for clean monitoring of changes._
   
   ### For code changes:
   - [ ] Have you ensured that the full suite of tests is executed via `mvn 
-Pcontrib-check clean install` at the root `nifi` folder?
   - [ ] Have you written or updated unit tests to verify your changes?
   - [ ] Have you verified that the full build is successful on both JDK 8 and 
JDK 11?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
   - [ ] If applicable, have you updated the `LICENSE` file, including the main 
`LICENSE` file under `nifi-assembly`?
   - [ ] If applicable, have you updated the `NOTICE` file, including the main 
`NOTICE` file found under `nifi-assembly`?
   - [ ] If adding new Properties, have you added `.displayName` in addition to 
.name (programmatic access) for each of the new properties?
   
   ### For documentation related changes:
   - [ ] Have you ensured that format looks appropriate for the output in which 
it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (NIFI-7009) Deleted flow components should not be queried by Atlas reporting task

2020-01-10 Thread Peter Turcsanyi (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Turcsanyi updated NIFI-7009:
--
Status: Patch Available  (was: In Progress)

> Deleted flow components should not be queried by Atlas reporting task
> -
>
> Key: NIFI-7009
> URL: https://issues.apache.org/jira/browse/NIFI-7009
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Peter Turcsanyi
>Assignee: Peter Turcsanyi
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The NiFi Atlas reporting task retrieves the flow and its components from 
> Atlas at every execution of the task in the following way:
>  # get flow
>  # get each flow component
>  # filter out deleted components
> #1 returns references to all the subcomponents of the flow, including the 
> deleted ones, and it seems there is no way to parameterize the Atlas query 
> to filter out the deleted items here.
> But the result of #1 already contains the status of each component, so the 
> deleted items do not need to be queried in #2 and then filtered out in #3.
> #2 and #3 can be swapped, so that only the active items are retrieved.
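
Illustratively, the reordering amounts to the following (the {{ComponentRef}} 
and {{AtlasClient}} types are stand-ins, not the actual Atlas client API):

{code:java}
import java.util.List;
import java.util.stream.Collectors;

// Stand-in types: the flow query already returns each referred component
// with its status, so drop DELETED entries first, then fetch only the rest.
public class ActiveComponentFetch {

    interface ComponentRef { String guid(); String status(); }
    interface Entity {}
    interface AtlasClient { Entity getEntity(String guid); }

    static List<Entity> fetchActive(List<ComponentRef> refs, AtlasClient client) {
        return refs.stream()
                .filter(ref -> !"DELETED".equals(ref.status()))  // old step #3, now first
                .map(ref -> client.getEntity(ref.guid()))        // old step #2, active items only
                .collect(Collectors.toList());
    }
}
{code}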



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [nifi] markap14 commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor

2020-01-10 Thread GitBox
markap14 commented on a change in pull request #3966: NIFI-6992 - Add "Batch 
Size" property to GetHDFSFileInfo processor
URL: https://github.com/apache/nifi/pull/3966#discussion_r365353329
 
 

 ##
 File path: 
nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java
 ##
 @@ -178,6 +183,15 @@
 .defaultValue(GROUP_ALL.getValue())
 .build();
 
+public static final PropertyDescriptor BATCH_SIZE = new 
PropertyDescriptor.Builder()
+.displayName("Batch Size")
+.name("gethdfsfileinfo-batch-size")
+.description("Number of records to put into an output flowfile 
when 'Destination' is set to 'Content'"
++ " and 'Group Results' is set to 'None'")
+.required(false)
+
.addValidator(StandardValidators.createAllowEmptyValidator(StandardValidators.POSITIVE_INTEGER_VALIDATOR))
 
 Review comment:
   Can you explain the logic here, @tpalfy? I don't understand why we would 
ever allow an empty string for this property. It is marked as optional 
(required = false), so it should either be specified as an integer or not 
specified at all. It looks like quite a bit of code in this PR (the creation 
of this validator, and the several places that check whether the value is 
empty) could be avoided.
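   
   For illustration, the simpler descriptor would look roughly like this (the builder calls mirror the diff above; treat it as a sketch rather than the final PR code):
   
   ```java
   import org.apache.nifi.components.PropertyDescriptor;
   import org.apache.nifi.processor.util.StandardValidators;
   
   public class BatchSizeDescriptor {
       // Optional property: either absent, or validated as a positive integer.
       // No allow-empty wrapper around the validator is needed.
       public static final PropertyDescriptor BATCH_SIZE = new PropertyDescriptor.Builder()
               .displayName("Batch Size")
               .name("gethdfsfileinfo-batch-size")
               .description("Number of records to put into an output flowfile when 'Destination'"
                       + " is set to 'Content' and 'Group Results' is set to 'None'")
               .required(false)
               .addValidator(StandardValidators.POSITIVE_INTEGER_VALIDATOR)
               .build();
   }
   ```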


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [nifi] markap14 commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor

2020-01-10 Thread GitBox
markap14 commented on a change in pull request #3966: NIFI-6992 - Add "Batch 
Size" property to GetHDFSFileInfo processor
URL: https://github.com/apache/nifi/pull/3966#discussion_r365353915
 
 

 ##
 File path: 
nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java
 ##
 @@ -255,6 +270,30 @@ public void onPropertyModified(PropertyDescriptor 
descriptor, String oldValue, S
 req = null;
 }
 
+@Override
+protected Collection<ValidationResult> customValidate(ValidationContext 
validationContext) {
+final List<ValidationResult> validationResults = new 
ArrayList<>(super.customValidate(validationContext));
+
+String destination = 
validationContext.getProperty(DESTINATION).getValue();
+String grouping = validationContext.getProperty(GROUPING).getValue();
+String batchSize = 
validationContext.getProperty(BATCH_SIZE).getValue();
+
+if (
+(!DESTINATION_CONTENT.getValue().equals(destination) || 
!GROUP_NONE.getValue().equals(grouping))
+&& !isEmpty(batchSize)
 
 Review comment:
   If we avoid allowing Batch Size to be set to an empty string, as recommended 
above, we can instead just check 
`validationContext.getProperty(BATCH_SIZE).isSet()` here. I think this reads 
more cleanly, and is more in line with the way most processors are written.
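   
   A sketch of what the custom validation could look like with `isSet()`, reusing the names from the diff; this is the method body as it would sit inside the processor class, illustrative rather than the final PR code:
   
   ```java
   import java.util.ArrayList;
   import java.util.Collection;
   import java.util.List;
   
   import org.apache.nifi.components.ValidationContext;
   import org.apache.nifi.components.ValidationResult;
   
   // Assumes the DESTINATION, GROUPING, and BATCH_SIZE descriptors and the
   // DESTINATION_CONTENT / GROUP_NONE allowable values from the processor.
   @Override
   protected Collection<ValidationResult> customValidate(final ValidationContext validationContext) {
       final List<ValidationResult> results = new ArrayList<>(super.customValidate(validationContext));
   
       final boolean prerequisitesMet =
               DESTINATION_CONTENT.getValue().equals(validationContext.getProperty(DESTINATION).getValue())
               && GROUP_NONE.getValue().equals(validationContext.getProperty(GROUPING).getValue());
   
       // isSet() replaces the empty-string checks: the property is either absent,
       // or its validator guarantees a positive integer.
       if (!prerequisitesMet && validationContext.getProperty(BATCH_SIZE).isSet()) {
           results.add(new ValidationResult.Builder()
                   .subject(BATCH_SIZE.getDisplayName())
                   .valid(false)
                   .explanation("'Batch Size' may only be set when 'Destination' is 'Content'"
                           + " and 'Grouping' is 'None'")
                   .build());
       }
       return results;
   }
   ```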


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [nifi-minifi-cpp] am-c-p-p commented on a change in pull request #656: MINIFI-1013 Used soci library.

2020-01-10 Thread GitBox
am-c-p-p commented on a change in pull request #656: MINIFI-1013 Used soci 
library.
URL: https://github.com/apache/nifi-minifi-cpp/pull/656#discussion_r365129833
 
 

 ##
 File path: extensions/sql/processors/ExecuteSQL.cpp
 ##
 @@ -0,0 +1,150 @@
+/**
+ * @file ExecuteSQL.cpp
+ * ExecuteSQL class declaration
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include "ExecuteSQL.h"
+
+#include <string>
+#include <memory>
+#include <sstream>
+#include <mutex>
+#include <limits>
+
+#include <soci/soci.h>
+
+#include "io/DataStream.h"
+#include "core/ProcessContext.h"
+#include "core/ProcessSession.h"
+#include "Exception.h"
+#include "utils/OsUtils.h"
+#include "data/DatabaseConnectors.h"
+#include "data/JSONSQLWriter.h"
+#include "data/WriteCallback.h"
+
+namespace org {
+namespace apache {
+namespace nifi {
+namespace minifi {
+namespace processors {
+
+  const std::string ExecuteSQL::ProcessorName("ExecuteSQL");
+
+static core::Property DBCControllerService(
+core::PropertyBuilder::createProperty("DB Controller 
Service")->isRequired(true)->withDescription("Database Controller 
Service.")->supportsExpressionLanguage(true)->build());
+
+static core::Property s_SQLSelectQuery(
+core::PropertyBuilder::createProperty("SQL select 
query")->isRequired(true)->withDefaultValue("System")->withDescription(
+"The SQL select query to execute. The query can be empty, a constant 
value, or built from attributes using Expression Language. "
+"If this property is specified, it will be used regardless of the 
content of incoming flowfiles. "
+"If this property is empty, the content of the incoming flow file is 
expected to contain a valid SQL select query, to be issued by the processor to 
the database. "
+"Note that Expression Language is not evaluated for flow file 
contents.")->supportsExpressionLanguage(true)->build());
+
+static core::Property MaxRowsPerFlowFile(
+   core::PropertyBuilder::createProperty("Max Rows Per Flow 
File")->isRequired(true)->withDefaultValue(0)->withDescription(
+   "The maximum number of result rows that will be included into 
a flow file. If zero then all will be placed into the flow 
file")->supportsExpressionLanguage(true)->build());
+
+core::Relationship ExecuteSQL::Success("success", "Successfully created 
FlowFile from SQL query result set.");
+
+ExecuteSQL::ExecuteSQL(const std::string& name, utils::Identifier uuid)
+: core::Processor(name, uuid), max_rows_(0),
+  logger_(logging::LoggerFactory<ExecuteSQL>::getLogger()) {
+}
+
+ExecuteSQL::~ExecuteSQL() {
+}
+
+void ExecuteSQL::initialize() {
+  //! Set the supported properties
+  setSupportedProperties( { DBCControllerService, s_SQLSelectQuery, 
MaxRowsPerFlowFile });
+
+  //! Set the supported relationships
+  setSupportedRelationships( { Success });
+}
+
+void ExecuteSQL::onSchedule(const std::shared_ptr<core::ProcessContext> 
&context, const std::shared_ptr<core::ProcessSessionFactory> &sessionFactory) {
+  context->getProperty(DBCControllerService.getName(), db_controller_service_);
+  context->getProperty(s_SQLSelectQuery.getName(), sqlSelectQuery_);
+  context->getProperty(MaxRowsPerFlowFile.getName(), max_rows_);
+
+  database_service_ = 
std::dynamic_pointer_cast<sql::controllers::DatabaseService>(context->getControllerService(db_controller_service_));
+  if (!database_service_)
+throw minifi::Exception(PROCESSOR_EXCEPTION, "'DB Controller Service' must 
be defined");
+}
+
+void ExecuteSQL::onTrigger(const std::shared_ptr<core::ProcessContext> 
&context, const std::shared_ptr<core::ProcessSession> &session) {
+  std::unique_lock<std::mutex> lock(onTriggerMutex_, std::try_to_lock);
+  if (!lock.owns_lock()) {
+logger_->log_warn("'onTrigger' is called before previous 'onTrigger' call 
is finished.");
+context->yield();
+return;
+  }
+
+  if (!connection_) {
+connection_ = database_service_->getConnection();
+if (!connection_) {
+  context->yield();
+  return;
+}
+  }
+
+  try {
+auto statement = connection_->prepareStatement(sqlSelectQuery_);
+
+auto rowset = statement->execute();
+
+int count = 0;
+size_t row_count = 0;
+std::stringstream outputStream;
+sql::JSONSQLWriter writer(rowset, &outputStream);
+// serialize the rows
+do {
+  row_count = writer.serialize(max_rows_ == 0 ? 
std::numeric_limits<int>::max() : max_rows_);

[GitHub] [nifi-minifi-cpp] am-c-p-p commented on a change in pull request #656: MINIFI-1013 Used soci library.

2020-01-10 Thread GitBox
am-c-p-p commented on a change in pull request #656: MINIFI-1013 Used soci 
library.
URL: https://github.com/apache/nifi-minifi-cpp/pull/656#discussion_r365130435
 
 

 ##
 File path: extensions/sql/processors/ExecuteSQL.cpp
 ##
 @@ -0,0 +1,150 @@
+/**
+ * @file ExecuteSQL.cpp
+ * ExecuteSQL class declaration
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include "ExecuteSQL.h"
+
+#include <string>
+#include <memory>
+#include <sstream>
+#include <mutex>
+#include <limits>
+
+#include <soci/soci.h>
+
+#include "io/DataStream.h"
+#include "core/ProcessContext.h"
+#include "core/ProcessSession.h"
+#include "Exception.h"
+#include "utils/OsUtils.h"
+#include "data/DatabaseConnectors.h"
+#include "data/JSONSQLWriter.h"
+#include "data/WriteCallback.h"
+
+namespace org {
+namespace apache {
+namespace nifi {
+namespace minifi {
+namespace processors {
+
+  const std::string ExecuteSQL::ProcessorName("ExecuteSQL");
+
+static core::Property DBCControllerService(
+core::PropertyBuilder::createProperty("DB Controller 
Service")->isRequired(true)->withDescription("Database Controller 
Service.")->supportsExpressionLanguage(true)->build());
+
+static core::Property s_SQLSelectQuery(
+core::PropertyBuilder::createProperty("SQL select 
query")->isRequired(true)->withDefaultValue("System")->withDescription(
+"The SQL select query to execute. The query can be empty, a constant 
value, or built from attributes using Expression Language. "
+"If this property is specified, it will be used regardless of the 
content of incoming flowfiles. "
+"If this property is empty, the content of the incoming flow file is 
expected to contain a valid SQL select query, to be issued by the processor to 
the database. "
+"Note that Expression Language is not evaluated for flow file 
contents.")->supportsExpressionLanguage(true)->build());
+
+static core::Property MaxRowsPerFlowFile(
+   core::PropertyBuilder::createProperty("Max Rows Per Flow 
File")->isRequired(true)->withDefaultValue(0)->withDescription(
+   "The maximum number of result rows that will be included into 
a flow file. If zero then all will be placed into the flow 
file")->supportsExpressionLanguage(true)->build());
+
+core::Relationship ExecuteSQL::Success("success", "Successfully created 
FlowFile from SQL query result set.");
+
+ExecuteSQL::ExecuteSQL(const std::string& name, utils::Identifier uuid)
+: core::Processor(name, uuid), max_rows_(0),
+  logger_(logging::LoggerFactory<ExecuteSQL>::getLogger()) {
+}
+
+ExecuteSQL::~ExecuteSQL() {
+}
+
+void ExecuteSQL::initialize() {
+  //! Set the supported properties
+  setSupportedProperties( { DBCControllerService, s_SQLSelectQuery, 
MaxRowsPerFlowFile });
+
+  //! Set the supported relationships
+  setSupportedRelationships( { Success });
+}
+
+void ExecuteSQL::onSchedule(const std::shared_ptr<core::ProcessContext> 
&context, const std::shared_ptr<core::ProcessSessionFactory> &sessionFactory) {
+  context->getProperty(DBCControllerService.getName(), db_controller_service_);
+  context->getProperty(s_SQLSelectQuery.getName(), sqlSelectQuery_);
+  context->getProperty(MaxRowsPerFlowFile.getName(), max_rows_);
+
+  database_service_ = 
std::dynamic_pointer_cast<sql::controllers::DatabaseService>(context->getControllerService(db_controller_service_));
+  if (!database_service_)
+throw minifi::Exception(PROCESSOR_EXCEPTION, "'DB Controller Service' must 
be defined");
+}
+
+void ExecuteSQL::onTrigger(const std::shared_ptr<core::ProcessContext> 
&context, const std::shared_ptr<core::ProcessSession> &session) {
+  std::unique_lock<std::mutex> lock(onTriggerMutex_, std::try_to_lock);
+  if (!lock.owns_lock()) {
+logger_->log_warn("'onTrigger' is called before previous 'onTrigger' call 
is finished.");
+context->yield();
+return;
+  }
+
+  if (!connection_) {
+connection_ = database_service_->getConnection();
 
 Review comment:
   Fixed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services