[GitHub] [nifi] tpalfy commented on issue #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor
tpalfy commented on issue #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor URL: https://github.com/apache/nifi/pull/3966#issuecomment-573157022

> Thanks @tpalfy for this improvement - definitely think this will help for users that want to monitor huge HDFS directories! I think we can simplify the code and the user experience a bit by avoiding the use of an empty string for Batch Size, as I indicated in the inline comments. But please do let me know if I misunderstood something.
>
> Another option that may further simplify things would be to set the default value of the Batch Size property to 1. Then, in the custom validate, rather than check if it's set, check if the value is 1 or not. That way, you could actually avoid ever even having to check if the value is null, etc. because the validators guarantee that it's a positive integer and the default value guarantees that it is non-null. Up to you if you want to go this route, but I think it simplifies the code (and tends to adhere to common NiFi patterns).

Hi @markap14,

Yes, this was my first approach. However, @turcsanyip suggested we should help users notice if they misconfigure the processor by making sure that the otherwise valid _'Batch Size'_ values become invalid unless the prerequisites are met (i.e. _'Destination'_ is set to _'Content'_ and _'Grouping'_ is set to _'None'_). With that we needed a valid value for when the prerequisites are _not_ met, which would be the empty string. So basically the user should clear the value if it's not going to be used.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
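The prerequisite-dependent validation discussed in this exchange can be sketched in isolation. The class below is a hypothetical, self-contained illustration (the names and the simplified customValidate signature are stand-ins, not the real NiFi API): a non-empty Batch Size is reported as a problem unless Destination is "Content" and Grouping is "None".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the validation rule discussed above; not the real
// NiFi API. A non-empty Batch Size is only valid when Destination=Content
// and Grouping=None, so a misconfiguration is surfaced to the user.
public class BatchSizeValidationSketch {

    static List<String> customValidate(String destination, String grouping, String batchSize) {
        final List<String> problems = new ArrayList<>();
        final boolean prerequisitesMet = "Content".equals(destination) && "None".equals(grouping);
        final boolean batchSizeSet = batchSize != null && !batchSize.isEmpty();
        if (batchSizeSet && !prerequisitesMet) {
            problems.add("'Batch Size' is only supported when 'Destination' is 'Content'"
                    + " and 'Grouping' is 'None'");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Misconfigured: Batch Size set, but the grouping prerequisite is not met
        System.out.println(customValidate("Content", "All", "100"));
        // Valid: both prerequisites met
        System.out.println(customValidate("Content", "None", "100"));
        // Valid: the user cleared the value since it is not going to be used
        System.out.println(customValidate("Attributes", "All", ""));
    }
}
```

This mirrors the design choice described above: rather than silently ignoring an unused property, the otherwise valid value is rejected until it is cleared.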
[GitHub] [nifi] joewitt commented on a change in pull request #3968: NIFI-3833 Implemented encrypted flowfile repository
joewitt commented on a change in pull request #3968: NIFI-3833 Implemented encrypted flowfile repository URL: https://github.com/apache/nifi/pull/3968#discussion_r365371051

## File path: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadFlowFileRepository.java
## @@ -200,17 +211,19 @@ protected void initialize(final ResourceClaimManager claimManager, final Reposit
// delete backup. On restore, if no files exist in partition's directory, would have to check backup directory
this.serdeFactory = serdeFactory;
-if (walImplementation.equals(SEQUENTIAL_ACCESS_WAL)) {
+// The specified implementation can be plaintext or encrypted; the only difference is the serde factory
+if (isSequentialAccessWAL(walImplementation)) {
+// TODO: May need to instantiate ESAWAL for clarity?

Review comment: This TODO seems unnecessary. Given the TODOs related to follow-on work with NIFI-6617, though, this seems fine for now to help get more evaluation cycles on this new capability.
[GitHub] [nifi] alopresto commented on issue #3968: NIFI-3833 Implemented encrypted flowfile repository
alopresto commented on issue #3968: NIFI-3833 Implemented encrypted flowfile repository URL: https://github.com/apache/nifi/pull/3968#issuecomment-573155914 Thanks everyone for the reviews. My favorite commits are the ones that remove more code than they add. Adding just a little bit (and lots of tests) is my next favorite. Will merge.
[GitHub] [nifi] markap14 commented on issue #3968: NIFI-3833 Implemented encrypted flowfile repository
markap14 commented on issue #3968: NIFI-3833 Implemented encrypted flowfile repository URL: https://github.com/apache/nifi/pull/3968#issuecomment-573154939 I'm a +1 as well. One of the great things about this implementation is that it requires relatively small changes to the existing codebase, mostly just adding features as new classes. Nicely done!
[jira] [Updated] (NIFI-6964) Use compression level for xz-lzma2 format of the CompressContent processor
[ https://issues.apache.org/jira/browse/NIFI-6964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Pierce updated NIFI-6964: -- Description: The CompressContent processor does not use the Compression Level property of the processor except for when using the GZIP compression format. On the contrary, the xz-lzma2 compression format defaults to using XZ compression level 6 for that specific format (I read the CompressContent.java source code to verify this) – disregarding whatever compression level you set on the processor itself. As a side note, the xz compression format supports, amazingly enough, 10 levels of compression from 0 to 9 – the same as GZIP. The only difference that I can tell is level 0 of xz is not the lack of compression, but the lightest compression possible (i.e. still some compression) – whereas GZIP compression level 0 means just store the content but do not compress. I have a use case where I must use the xz-lzma2 format (don't ask why) and I have to send (using the XZ format) already highly-compressed content that is +*NOT*+ XZ format to begin with. I have in excess of 500 GB of this sort of already highly compressed content to further compress into the XZ format on a daily basis. The attached patch will enhance the CompressContent.java source code enabling the compression level property to be used in both the GZIP and the XZ-LZMA2 formats. Please consider adding this patch to the baseline for this processor. I've tested it and the results are fantastic because I can crank down the compression level to 0 for XZ-LZMA2 now and use a lot less CPU. I'm generally seeing a 66% improvement in elapsed time to process highly compressed content using XZ format with compression level of 0 versus the hard-coded level 6 of the baseline code. was: The CompressContent processor does not use the Compression Level property of the processor except for when using the GZIP compression format. 
On the contrary, the xz-lzma2 compression format defaults to using XZ compression level 6 for that specific format (I read the CompressContent.java source code to verify this) – disregarding whatever compression level you set on the processor itself. I have a use case where I must use the xz-lzma2 format (don't ask why) and I have to send (using the XZ format) already highly-compressed content that is +*NOT*+ XZ format to begin with. I have in excess of 500 GB of this sort of already highly compressed content to further compress into the XZ format on a daily basis. The attached patch will enhance the CompressContent.java source code enabling the compression level property to be used in both the GZIP and the XZ-LZMA2 formats. Please consider adding this patch to the baseline for this processor. I've tested it and the results are fantastic because I can crank down the compression level to 0 for XZ-LZMA2 now and use a lot less CPU. I'm generally seeing a 66% improvement in elapsed time to process highly compressed content using XZ format with compression level of 0 versus the hard-coded level 6 of the baseline code. Labels: compression xz-lzma2 (was: xz-lzma2) > Use compression level for xz-lzma2 format of the CompressContent processor > -- > > Key: NIFI-6964 > URL: https://issues.apache.org/jira/browse/NIFI-6964 > Project: Apache NiFi > Issue Type: Improvement > Components: Core Framework >Affects Versions: 1.10.0 >Reporter: John Pierce >Priority: Minor > Labels: compression, xz-lzma2 > Fix For: 1.11.0 > > Original Estimate: 4h > Time Spent: 10m > Remaining Estimate: 3h 50m > > The CompressContent processor does not use the Compression Level property of > the processor except for when using the GZIP compression format. 
> On the contrary, the xz-lzma2 compression format defaults to using XZ compression level 6 for that specific format (I read the CompressContent.java source code to verify this) – disregarding whatever compression level you set on the processor itself.
> As a side note, the xz compression format supports, amazingly enough, 10 levels of compression from 0 to 9 – the same as GZIP. The only difference that I can tell is level 0 of xz is not the lack of compression, but the lightest compression possible (i.e. still some compression) – whereas GZIP compression level 0 means just store the content but do not compress.
> I have a use case where I must use the xz-lzma2 format (don't ask why) and I have to send (using the XZ format) already highly-compressed content that is +*NOT*+ XZ format to begin with. I have in excess of 500 GB of this sort of already highly compressed content to further compress into the XZ format on a daily basis.
> The
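The level-versus-CPU trade-off described in this ticket can be demonstrated with the JDK's own Deflater, which exposes the same 0-9 level range as GZIP (the XZ library used by CompressContent is a third-party dependency, so this sketch only illustrates the general effect of a compression-level setting, not the XZ presets themselves):

```java
import java.util.zip.Deflater;

// Illustrates the 0-9 compression-level trade-off from the ticket using the
// JDK's DEFLATE implementation. Level 0 stores the data with only framing
// overhead; higher levels spend more CPU to shrink the output.
public class CompressionLevelSketch {

    static int deflatedSize(byte[] data, int level) {
        final Deflater deflater = new Deflater(level);
        deflater.setInput(data);
        deflater.finish();
        final byte[] buf = new byte[data.length + 1024];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buf);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        final byte[] repetitive = "already compressed payloads gain little; plain text gains a lot. "
                .repeat(200).getBytes();
        System.out.println("level 0: " + deflatedSize(repetitive, 0) + " bytes");
        System.out.println("level 1: " + deflatedSize(repetitive, 1) + " bytes");
        System.out.println("level 9: " + deflatedSize(repetitive, 9) + " bytes");
    }
}
```

On input that is already highly compressed, the gap between the levels shrinks while the CPU cost of the high levels remains, which is exactly why dialing the level down to 0 helps in the reporter's use case.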
[GitHub] [nifi] joewitt commented on issue #3968: NIFI-3833 Implemented encrypted flowfile repository
joewitt commented on issue #3968: NIFI-3833 Implemented encrypted flowfile repository URL: https://github.com/apache/nifi/pull/3968#issuecomment-573154475 Walked through this with Mark Payne. I'm +1. Minor comments. Love all the good docs here to help folks get going too. Thanks Andy! Look forward to NIFI-6617 to ease config too.
[jira] [Updated] (NIFI-3833) Implement encrypted flowfile repository
[ https://issues.apache.org/jira/browse/NIFI-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy LoPresto updated NIFI-3833: Fix Version/s: 1.11.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Implement encrypted flowfile repository > --- > > Key: NIFI-3833 > URL: https://issues.apache.org/jira/browse/NIFI-3833 > Project: Apache NiFi > Issue Type: Sub-task > Components: Core Framework >Affects Versions: 1.2.0 >Reporter: Andy LoPresto >Assignee: Andy LoPresto >Priority: Major > Labels: encryption, repository, security > Fix For: 1.11.0 > > Time Spent: 6h > Remaining Estimate: 0h > > Similar to the {{EncryptedWriteAheadProvenanceRepository}}, implement an > encrypted flowfile repository with random-access seek and confidentiality and > integrity controls over the data persisted to disk. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (NIFI-3833) Implement encrypted flowfile repository
[ https://issues.apache.org/jira/browse/NIFI-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013142#comment-17013142 ] ASF subversion and git services commented on NIFI-3833: --- Commit 2cc467eb587247f1e8e0f02285ee01acc4ec9918 in nifi's branch refs/heads/master from Andy LoPresto [ https://gitbox.apache.org/repos/asf?p=nifi.git;h=2cc467e ] NIFI-3833 Added encrypted flowfile repository implementation. Added EncryptedSchemaRepositoryRecordSerde. Refactored CryptoUtils utility methods for repository encryption configuration validation checks to RepositoryEncryptorUtils. Added FlowFile repo encryption config container. Added more logging in cryptographic and serialization operations. Generalized log messages in shared encryption services. Added encrypted serde factory. Added marker impl for encrypted WAL. Moved validation of FF repo encryption config earlier in startup process. Refactored duplicate property lookup code in NiFiProperties. Added title case string helper. Added validation and warning around misformatted encryption repo properties. Added unit tests. Added documentation to User Guide & Admin Guide. Added screenshot for docs. Added links to relevant sections of NiFi In-Depth doc to User Guide. Added flowfile & content repository encryption configuration properties to default nifi.properties. Signed-off-by: Joe Witt Signed-off-by: Mark Payne This closes #3968. 
> Implement encrypted flowfile repository > --- > > Key: NIFI-3833 > URL: https://issues.apache.org/jira/browse/NIFI-3833 > Project: Apache NiFi > Issue Type: Sub-task > Components: Core Framework >Affects Versions: 1.2.0 >Reporter: Andy LoPresto >Assignee: Andy LoPresto >Priority: Major > Labels: encryption, repository, security > Time Spent: 5h 50m > Remaining Estimate: 0h > > Similar to the {{EncryptedWriteAheadProvenanceRepository}}, implement an > encrypted flowfile repository with random-access seek and confidentiality and > integrity controls over the data persisted to disk.
[GitHub] [nifi] asfgit closed pull request #3968: NIFI-3833 Implemented encrypted flowfile repository
asfgit closed pull request #3968: NIFI-3833 Implemented encrypted flowfile repository URL: https://github.com/apache/nifi/pull/3968
[GitHub] [nifi] markap14 closed pull request #3978: NIFI-7011: SwappablePriorityQueue contains two internal data structur…
markap14 closed pull request #3978: NIFI-7011: SwappablePriorityQueue contains two internal data structur… URL: https://github.com/apache/nifi/pull/3978
[GitHub] [nifi-minifi] apiri commented on issue #181: MINIFI-517: Added /opt/minifi/minifi-current symlink
apiri commented on issue #181: MINIFI-517: Added /opt/minifi/minifi-current symlink URL: https://github.com/apache/nifi-minifi/pull/181#issuecomment-573207559 @r65535 looks good with that fix. thanks for adjusting!
[GitHub] [nifi] turcsanyip commented on issue #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor
turcsanyip commented on issue #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor URL: https://github.com/apache/nifi/pull/3966#issuecomment-573156420 @markap14 As the batch size is only used when Grouping=NONE and Destination=CONTENT, I suggested having no default value. For me it is more consistent if a property has no value when that value will not be used. It might be a wrong approach, though, and I haven't actually checked how similar cases are handled elsewhere in NiFi. Regarding the empty string, I agree with you; I overlooked it in the review.
[GitHub] [nifi-minifi] apiri commented on issue #177: MINIFI-516: Added bootstrap option to override processors ssl references to use minifi parent ssl instead
apiri commented on issue #177: MINIFI-516: Added bootstrap option to override processors ssl references to use minifi parent ssl instead URL: https://github.com/apache/nifi-minifi/pull/177#issuecomment-573208556 reviewing
[GitHub] [nifi-minifi] apiri closed pull request #181: MINIFI-517: Added /opt/minifi/minifi-current symlink
apiri closed pull request #181: MINIFI-517: Added /opt/minifi/minifi-current symlink URL: https://github.com/apache/nifi-minifi/pull/181
[GitHub] [nifi-minifi-cpp] arpadboda closed pull request #706: MINIFICPP-1116 - Fix error reporting in C2Agent
arpadboda closed pull request #706: MINIFICPP-1116 - Fix error reporting in C2Agent URL: https://github.com/apache/nifi-minifi-cpp/pull/706
[GitHub] [nifi] turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor
turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor URL: https://github.com/apache/nifi/pull/3966#discussion_r365190890

## File path: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java
## @@ -609,7 +636,11 @@ protected HDFSFileInfoRequest buildRequestDetails(ProcessContext context, Proces
req.isIgnoreDotDirs = context.getProperty(IGNORE_DOTTED_DIRS).asBoolean();
req.groupping = HDFSFileInfoRequest.Groupping.getEnum(context.getProperty(GROUPING).getValue());
-req.batchSize = context.getProperty(BATCH_SIZE).asInteger();
+req.batchSize = Optional.ofNullable(context.getProperty(BATCH_SIZE))
+.filter(propertyValue -> propertyValue.getValue() != null)
+.filter(propertyValue -> !propertyValue.getValue().equals(""))

Review comment: StringUtils.isEmpty() could be used here too.
[GitHub] [nifi] turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor
turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor URL: https://github.com/apache/nifi/pull/3966#discussion_r365193100

## File path: nifi-commons/nifi-utils/src/main/java/org/apache/nifi/processor/util/StandardValidators.java
## @@ -507,6 +507,20 @@ public ValidationResult validate(final String subject, final String input, final
// FACTORY METHODS FOR VALIDATORS //
//
+public static Validator createAllowEmptyValidator(Validator delegate) {
+return (subject, input, context) -> {
+ValidationResult validationResult;
+
+if (input == null || input.equals("")) {

Review comment: isEmpty() method of this class could be used.
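The allow-empty wrapper under review here can be sketched outside of NiFi. The types below are deliberately simplified stand-ins for NiFi's Validator and ValidationResult (assumptions, not the real interfaces): the wrapper short-circuits to "valid" for null or empty input and otherwise delegates to the wrapped validator.

```java
// Self-contained sketch of the "allow empty" delegating-validator pattern
// under review. Validator here is a simplified stand-in (returns an error
// message, or null when valid); it is NOT NiFi's real Validator interface.
public class AllowEmptyValidatorSketch {

    interface Validator {
        String validate(String subject, String input); // null means valid
    }

    static final Validator POSITIVE_INTEGER = (subject, input) -> {
        try {
            return Integer.parseInt(input) > 0 ? null : subject + " must be a positive integer";
        } catch (NumberFormatException e) {
            return subject + " must be a positive integer";
        }
    };

    // Treat null/empty as valid; otherwise defer to the wrapped validator.
    static Validator allowEmpty(Validator delegate) {
        return (subject, input) ->
                (input == null || input.isEmpty()) ? null : delegate.validate(subject, input);
    }

    public static void main(String[] args) {
        final Validator v = allowEmpty(POSITIVE_INTEGER);
        System.out.println(v.validate("Batch Size", ""));    // valid: empty allowed
        System.out.println(v.validate("Batch Size", "100")); // valid: delegate accepts
        System.out.println(v.validate("Batch Size", "abc")); // invalid: delegate rejects
    }
}
```

The wrapper composes with any delegate, which is what lets the property stay optional while still enforcing a positive integer whenever a value is actually supplied.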
[GitHub] [nifi] turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor
turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor URL: https://github.com/apache/nifi/pull/3966#discussion_r365192302

## File path: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/test/java/org/apache/nifi/processors/hadoop/TestGetHDFSFileInfo.java
## @@ -76,6 +76,59 @@ public void setup() throws InitializationException {
runner.setProperty(GetHDFSFileInfo.HADOOP_CONFIGURATION_RESOURCES, "src/test/resources/core-site.xml");
}
+@Test
+public void testInvalidBatchSizeWhenDestinationAndGroupingDoesntAllowBatchSize() throws Exception {
+Arrays.asList("1", "2", "100").forEach(
+validBatchSize -> {
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_ATTRIBUTES, GetHDFSFileInfo.GROUP_ALL, validBatchSize, false);
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_ATTRIBUTES, GetHDFSFileInfo.GROUP_PARENT_DIR, validBatchSize, false);
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_ATTRIBUTES, GetHDFSFileInfo.GROUP_NONE, validBatchSize, false);
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_CONTENT, GetHDFSFileInfo.GROUP_ALL, validBatchSize, false);
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_CONTENT, GetHDFSFileInfo.GROUP_PARENT_DIR, validBatchSize, false);
+}
+);
+}
+
+@Test
+public void testValidBatchSize() throws Exception {
+Arrays.asList("1", "2", "100").forEach(
+validBatchSize -> {
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_CONTENT, GetHDFSFileInfo.GROUP_NONE, validBatchSize, true);
+}
+);
+
+Arrays.asList("", null).forEach(
+emptyBatchSize -> {
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_ATTRIBUTES, GetHDFSFileInfo.GROUP_NONE, emptyBatchSize, true);
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_ATTRIBUTES, GetHDFSFileInfo.GROUP_PARENT_DIR, emptyBatchSize, true);
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_ATTRIBUTES, GetHDFSFileInfo.GROUP_ALL, emptyBatchSize, true);
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_CONTENT, GetHDFSFileInfo.GROUP_ALL, emptyBatchSize, true);
+testValidateBatchSize(GetHDFSFileInfo.DESTINATION_CONTENT, GetHDFSFileInfo.GROUP_PARENT_DIR, emptyBatchSize, true);

Review comment: Grouping=NONE + Destination=CONTENT + Batch Size="" / null should also be added as a valid config. Grouping=NONE + Destination=CONTENT could also be tested with -1 and 0 as invalid configs.
[GitHub] [nifi] turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor
turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor URL: https://github.com/apache/nifi/pull/3966#discussion_r365190713

## File path: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java
## @@ -266,6 +268,31 @@ public void onPropertyModified(PropertyDescriptor descriptor, String oldValue, S
req = null;
}
+@Override
+protected Collection customValidate(ValidationContext validationContext) {
+final List validationResults = new ArrayList<>(super.customValidate(validationContext));
+
+String destination = validationContext.getProperty(DESTINATION).getValue();
+String grouping = validationContext.getProperty(GROUPING).getValue();
+String batchSize = validationContext.getProperty(BATCH_SIZE).getValue();
+
+if (
+(!DESTINATION_CONTENT.getValue().equals(destination) || !GROUP_NONE.getValue().equals(grouping))
+&& batchSize != null
+&& !batchSize.equals("")

Review comment: StringUtils.isEmpty() could be used to simplify the condition. We have a StringUtils in NiFi and commons-lang3's StringUtils is also used. The latter is already imported here, so I'd use that one (but I don't know if there is a convention for it).
[GitHub] [nifi] turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor
turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor URL: https://github.com/apache/nifi/pull/3966#discussion_r365196428

## File path: nifi-commons/nifi-utils/src/main/java/org/apache/nifi/processor/util/StandardValidators.java
## @@ -507,6 +507,20 @@ public ValidationResult validate(final String subject, final String input, final
// FACTORY METHODS FOR VALIDATORS //
//
+public static Validator createAllowEmptyValidator(Validator delegate) {

Review comment: Please add some unit tests for this new validator. You can find the existing tests in a different package for some reason: org.apache.nifi.util.validator.TestStandardValidators
[GitHub] [nifi] turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor
turcsanyip commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor URL: https://github.com/apache/nifi/pull/3966#discussion_r365187489

## File path: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java
## @@ -183,9 +186,8 @@
.name("gethdfsfileinfo-batch-size")
.description("Number of records to put into an output flowfile when 'Destination' is set to 'Content'"
+ " and 'Group Results' is set to 'None'")
-.required(true)
-.addValidator(StandardValidators.NON_NEGATIVE_INTEGER_VALIDATOR)
-.defaultValue("1")
+.required(false)
+.addValidator(StandardValidators.createAllowEmptyValidator(StandardValidators.NON_NEGATIVE_INTEGER_VALIDATOR))

Review comment: POSITIVE_INTEGER_VALIDATOR seems more reasonable to me here. "0" falls back to batch size = 1, but that might not be straightforward / obvious for the end user. The batch size properties of the other processors also have 1 as their lower bound.
[jira] [Created] (NIFI-7008) PutS3Object: Invalid V4 Authorization Header When Using Custom S3 Blobstore
Matt M created NIFI-7008: Summary: PutS3Object: Invalid V4 Authorization Header When Using Custom S3 Blobstore Key: NIFI-7008 URL: https://issues.apache.org/jira/browse/NIFI-7008 Project: Apache NiFi Issue Type: Bug Components: Extensions Affects Versions: 1.10.0 Environment: NiFi 1.10.0, connecting to MinIO 2019-12-19 S3-Compatible Blobstore Reporter: Matt M

Hello! Some background: I'm currently attempting to use a {{PutS3Object}} processor in NiFi {{1.10.0}} to upload an object to a [MinIO|https://min.io/] cluster. The MinIO cluster is configured to act as an S3-compatible blobstore in the {{us-east-1}} region. The MinIO cluster is running on an internal private network at my company at https://s3.mydomain.mycompany.com . The {{PutS3Object}} processor is configured thusly:
- {{Bucket}}: {{mybucket}}
- {{Region}}: {{US East (N. Virginia)}}
- {{Endpoint Override URL}}: {{https://s3.mydomain.mycompany.com:9000}}
- {{Signer Override}}: {{Signature v4}}

All other options are left at their default values. What happens when I attempt to use the processor to put a file into MinIO is that the processor shows an error like the following: {{Status Code: 400, Error Code: AuthorizationHeaderMalformed}}. After some debugging, it looks like the HTTP {{Authorization}} header being generated by NiFi isn't quite what I would expect. The {{Authorization}} header starts off like this:
{noformat}
Authorization: AWS4-HMAC-SHA256 Credential=AKIAIOSFODNN7EXAMPLE/20200111/mydomain/s3/aws4_request ...
{noformat}
Whereas what I would _expect_ is something more like this:
{noformat}
Authorization: AWS4-HMAC-SHA256 Credential=AKIAIOSFODNN7EXAMPLE/20200111/us-east-1/s3/aws4_request ...
{noformat}
The current behaviour seems to be: take part of the domain from the {{Endpoint Override URL}} and use that as the region inside of the {{Authorization}} header, instead of using the {{Region}} that was specified. 
As a workaround for now we can use {{Signature v2}} instead, but how long MinIO will continue to support {{Signature v2}} is unknown. Would it be possible to fix the S3 family of processors so that they use the {{Region}} being specified instead of attempting to extract the region from the URL?
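The malformed header in this report comes down to the credential scope embedded in a SigV4 Authorization value, which takes the form date/region/service/aws4_request. The sketch below is illustrative only and does not reproduce the AWS SDK's actual endpoint parsing: regionFromHost is a naive guess (second hostname label) that happens to produce exactly the "mydomain" scope the reporter observed, contrasted with a scope built from the explicitly configured region.

```java
// Illustrative-only sketch of the SigV4 credential-scope portion of the
// Authorization header. regionFromHost() mimics the misbehaviour described
// in the report (guessing the region from the endpoint hostname); it is
// NOT the AWS SDK's real parsing logic.
public class CredentialScopeSketch {

    static String credentialScope(String date, String region, String service) {
        return date + "/" + region + "/" + service + "/aws4_request";
    }

    // Naive guess: take the second hostname label, as a virtual-hosted
    // S3 URL (s3.<region>.amazonaws.com) would suggest.
    static String regionFromHost(String host) {
        final String[] labels = host.split("\\.");
        return labels.length > 1 ? labels[1] : "us-east-1";
    }

    public static void main(String[] args) {
        // What the report expects: scope built from the configured Region.
        System.out.println(credentialScope("20200111", "us-east-1", "s3"));
        // What the report observed: region guessed from the override host.
        System.out.println(credentialScope("20200111",
                regionFromHost("s3.mydomain.mycompany.com"), "s3"));
    }
}
```

For a custom endpoint like the MinIO one above, the second label is "mydomain", so any host-derived region guess produces a scope the server rejects; only the explicitly configured region yields the expected us-east-1 scope.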
[GitHub] [nifi] woutifier-t opened a new pull request #3977: NIFI-7007 Add update functionality to the PutCassandraRecord processor.
woutifier-t opened a new pull request #3977: NIFI-7007 Add update functionality to the PutCassandraRecord processor. URL: https://github.com/apache/nifi/pull/3977

Thank you for submitting a contribution to Apache NiFi. Please provide a short description of the PR here:

Description of PR
_Enables UPDATE functionality in the PutCassandraRecord processor; fixes bug NIFI-7007._

In order to streamline the review of the contribution we ask you to ensure the following steps have been taken:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
- [x] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [x] Has your PR been rebased against the latest commit within the target branch (typically `master`)?
- [x] Is your initial contribution a single, squashed commit? _Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not `squash` or use `--force` when pushing to allow for clean monitoring of changes._

### For code changes:
- [x] Have you ensured that the full suite of tests is executed via `mvn -Pcontrib-check clean install` at the root `nifi` folder?
- [x] Have you written or updated unit tests to verify your changes?
- [x] Have you verified that the full build is successful on both JDK 8 and JDK 11?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE` file, including the main `LICENSE` file under `nifi-assembly`?
- [ ] If applicable, have you updated the `NOTICE` file, including the main `NOTICE` file found under `nifi-assembly`?
- [x] If adding new Properties, have you added `.displayName` in addition to `.name` (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
[jira] [Resolved] (MINIFICPP-1116) Segfault when trying to use non-existent C2 protocol class
[ https://issues.apache.org/jira/browse/MINIFICPP-1116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Szasz resolved MINIFICPP-1116. - Resolution: Fixed https://github.com/apache/nifi-minifi-cpp/pull/706 > Segfault when trying to use non-existent C2 protocol class > -- > > Key: MINIFICPP-1116 > URL: https://issues.apache.org/jira/browse/MINIFICPP-1116 > Project: Apache NiFi MiNiFi C++ > Issue Type: Bug >Affects Versions: master > Environment: Ubuntu 18.04.3 LTS >Reporter: Marton Szasz >Assignee: Marton Szasz >Priority: Minor > Fix For: 0.8.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > h3. Steps to reproduce: > # Compile minifi-cpp with no CoAP support > # Configure C2 with {{nifi.c2.agent.protocol.class=CoapProtocol}} > # Start MiNiFi C++ > h3. Expected behavior: > Print an error message to stderr, produce a log message with log level of at > least error and exit. Alternatively, produce a log message with log level of > at least warning and continue execution without C2. > h3. Actual behavior: > No output on stderr. > Error message logged with log level info. > Segmentation fault in {{C2Agent.cpp:329}}, trying to dereference > {{C2Agent::protocol_}}, but it's null. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (NIFI-7007) Add update capability to PutCassandraRecord
Wouter de Vries created NIFI-7007: - Summary: Add update capability to PutCassandraRecord Key: NIFI-7007 URL: https://issues.apache.org/jira/browse/NIFI-7007 Project: Apache NiFi Issue Type: Improvement Components: Extensions Reporter: Wouter de Vries The current PutCassandraRecord processor (1.10.0) only supports the INSERT statement. It would be useful to extend it to also support UPDATE, especially in the context of COUNTER tables which do not allow INSERTs. -- This message was sent by Atlassian Jira (v8.3.4#803005)
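The counter-table constraint mentioned in the ticket is a CQL rule: counter columns cannot be written with INSERT and can only be incremented or decremented through UPDATE. A minimal sketch of the statement shapes involved (all table and column names here are made up, and the real processor presumably builds statements through the Cassandra driver rather than by string concatenation):

```java
// Illustrates why counter tables need UPDATE support. Hypothetical names only.
class CounterCqlSketch {
    // A counter column can only be mutated relative to its current value.
    static String counterUpdate(String table, String counterCol, String keyCol) {
        return "UPDATE " + table
                + " SET " + counterCol + " = " + counterCol + " + ?"
                + " WHERE " + keyCol + " = ?";
    }

    // For comparison, the INSERT shape a non-counter table would use;
    // Cassandra rejects this form against a counter table.
    static String plainInsert(String table, String keyCol, String valueCol) {
        return "INSERT INTO " + table + " (" + keyCol + ", " + valueCol + ") VALUES (?, ?)";
    }
}
```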
[GitHub] [nifi] turcsanyip commented on issue #3894: NIFI-6884 - Native library loading fixed/improved
turcsanyip commented on issue #3894: NIFI-6884 - Native library loading fixed/improved URL: https://github.com/apache/nifi/pull/3894#issuecomment-573006140 @joewitt @markap14 @tpalfy The 2nd round review (refactorings, integration tests) is in progress from my side. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [nifi] ravowlga123 commented on a change in pull request #3971: NIFI-6997: consumeAMQP closing connection when queue not found
ravowlga123 commented on a change in pull request #3971: NIFI-6997: consumeAMQP closing connection when queue not found URL: https://github.com/apache/nifi/pull/3971#discussion_r365238434 ## File path: nifi-nar-bundles/nifi-amqp-bundle/nifi-amqp-processors/src/main/java/org/apache/nifi/amqp/processors/ConsumeAMQP.java ## @@ -194,6 +194,13 @@ protected synchronized AMQPConsumer createAMQPWorker(final ProcessContext contex return amqpConsumer; } catch (final IOException ioe) { +try { +connection.close(); +getLogger().warn("Closed connection at port " + connection.getPort()); +} catch (final IOException ioe_close) { Review comment: Instead of "ioe_close" "ioeClose" would be better?? Also not sure if log message should be at warn or info level. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [nifi] asfgit closed pull request #3970: NIFI-6945: Use minExtInfo=true to reduce the amount of data queried f…
asfgit closed pull request #3970: NIFI-6945: Use minExtInfo=true to reduce the amount of data queried f… URL: https://github.com/apache/nifi/pull/3970 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [nifi] mcgilman commented on issue #3970: NIFI-6945: Use minExtInfo=true to reduce the amount of data queried f…
mcgilman commented on issue #3970: NIFI-6945: Use minExtInfo=true to reduce the amount of data queried f… URL: https://github.com/apache/nifi/pull/3970#issuecomment-573048135 Thanks for the PR @turcsanyip! Thanks for the review @tpalfy! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (NIFI-6945) Use minExtInfo=true to reduce the amount of data queried from Atlas
[ https://issues.apache.org/jira/browse/NIFI-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17012898#comment-17012898 ] ASF subversion and git services commented on NIFI-6945: --- Commit b8ffb5461216b8330fb464162316197458f482c7 in nifi's branch refs/heads/master from Peter Turcsanyi [ https://gitbox.apache.org/repos/asf?p=nifi.git;h=b8ffb54 ] NIFI-6945: Use minExtInfo=true to reduce the amount of data queried from Atlas This closes #3970 > Use minExtInfo=true to reduce the amount of data queried from Atlas > --- > > Key: NIFI-6945 > URL: https://issues.apache.org/jira/browse/NIFI-6945 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Peter Turcsanyi >Assignee: Peter Turcsanyi >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > When ReportLineageToAtlas queries the entities from Atlas, the > AtlasEntityWithExtInfo response object also contains all the referred > entities with all their data by default which leads to huge data transfers > between Atlas and NiFi. On the other hand, this extended data of the related > entities are not used by the reporting task, so they could be excluded from > the query (using {{minExtInfo=true}} parameter) in order to reduce the > response size. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (NIFI-6945) Use minExtInfo=true to reduce the amount of data queried from Atlas
[ https://issues.apache.org/jira/browse/NIFI-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Gilman updated NIFI-6945: -- Fix Version/s: 1.11.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Use minExtInfo=true to reduce the amount of data queried from Atlas > --- > > Key: NIFI-6945 > URL: https://issues.apache.org/jira/browse/NIFI-6945 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Peter Turcsanyi >Assignee: Peter Turcsanyi >Priority: Major > Fix For: 1.11.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > When ReportLineageToAtlas queries the entities from Atlas, the > AtlasEntityWithExtInfo response object also contains all the referred > entities with all their data by default which leads to huge data transfers > between Atlas and NiFi. On the other hand, this extended data of the related > entities are not used by the reporting task, so they could be excluded from > the query (using {{minExtInfo=true}} parameter) in order to reduce the > response size. -- This message was sent by Atlassian Jira (v8.3.4#803005)
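For reference, `minExtInfo` is a query parameter on Atlas's v2 entity REST endpoints; when set to true, referred entities come back as lightweight headers instead of full bodies. A sketch of how a client might append it (the path shown follows the standard v2 entity-by-GUID endpoint, but treat the exact URL as an assumption rather than what the reporting task literally calls):

```java
// Builds an Atlas v2 entity query URL with minExtInfo set, so referred
// entities are not fetched with their full bodies. Path is illustrative.
class AtlasQuerySketch {
    static String entityByGuidUrl(String atlasBaseUrl, String guid, boolean minExtInfo) {
        return atlasBaseUrl + "/api/atlas/v2/entity/guid/" + guid + "?minExtInfo=" + minExtInfo;
    }
}
```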
[jira] [Updated] (NIFI-7009) Deleted flow components should not be queried by Atlas reporting task
[ https://issues.apache.org/jira/browse/NIFI-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Turcsanyi updated NIFI-7009: -- Description: The NiFi Atlas reporting task retrieves the flow and its components from Atlas at every execution of the task in the following way: # get flow # get each flow component # filter out deleted components #1 returns references to all the subcomponents of the flow, also to the deleted ones. It seems there is no way to parameterize the Atlas query to filter out the deleted item here. But the result of #1 already contains the status of the components, so the deleted items do not need to be queried in #2, then filter out in #3 only. #2 and #3 can be swapped, and only the active items need to be retrieved. was: The NiFi Atlas reporting task retrieves the flow and its components from Atlas at every execution of the task in the following way: # get flow # get each flow component # filter out deleted components #1 returns references to all the subcomponents of the flow, also to the deleted ones. It seems there is no way to parameterize the Atlas query to filter out the deleted item here. But the result of #1 already contains the status of the components, so the deleted items do not need to be queried in #2, then filter out in #3. So #2 and #3 can be swapped. > Deleted flow components should not be queried by Atlas reporting task > - > > Key: NIFI-7009 > URL: https://issues.apache.org/jira/browse/NIFI-7009 > Project: Apache NiFi > Issue Type: Improvement >Reporter: Peter Turcsanyi >Assignee: Peter Turcsanyi >Priority: Major > > The NiFi Atlas reporting task retrieves the flow and its components from > Atlas at every execution of the task in the following way: > # get flow > # get each flow component > # filter out deleted components > #1 returns references to all the subcomponents of the flow, also to the > deleted ones. 
It seems there is no way to parameterize the Atlas query to > filter out the deleted item here. > But the result of #1 already contains the status of the components, so the > deleted items do not need to be queried in #2, then filter out in #3 only. > #2 and #3 can be swapped, and only the active items need to be retrieved. -- This message was sent by Atlassian Jira (v8.3.4#803005)
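The reordering described above can be sketched with stand-in types (NiFi's Atlas reporting task uses the Atlas client API; the `DELETED` status string and the names below are illustrative): since the flow query already reports each referred component's status, the filter (step 3) can run before the per-component queries (step 2), so deleted components are never fetched at all.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the flow's referred-entity headers: the first query already
// tells us each component's guid and status.
class ComponentRef {
    final String guid;
    final String status;
    ComponentRef(String guid, String status) { this.guid = guid; this.status = status; }
}

class FilterBeforeFetch {
    // Returns only the guids worth querying individually: the filter that
    // used to run after fetching now runs before it.
    static List<String> guidsToFetch(List<ComponentRef> referred) {
        List<String> active = new ArrayList<>();
        for (ComponentRef ref : referred) {
            if (!"DELETED".equals(ref.status)) {
                active.add(ref.guid);
            }
        }
        return active;
    }
}
```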
[GitHub] [nifi] joewitt commented on issue #3971: NIFI-6997: consumeAMQP closing connection when queue not found
joewitt commented on issue #3971: NIFI-6997: consumeAMQP closing connection when queue not found URL: https://github.com/apache/nifi/pull/3971#issuecomment-573043464 no checklist needed. exceptions should not be at info level. good to use common naming conventions of camel case rather than hyphen This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [nifi] shawnweeks commented on issue #3973: NIFI-6852 Don't Validate Processors that accept any ControllerService…
shawnweeks commented on issue #3973: NIFI-6852 Don't Validate Processors that accept any ControllerService… URL: https://github.com/apache/nifi/pull/3973#issuecomment-573055892 @joewitt Do you have any suggestions on building a test case for this. We have basically no test cases for this class at all and I'm not having much luck getting enough of it stubbed out to actually test it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (NIFI-6852) CTL doesn't work in ExecuteGroovyScript processor with Apache NiFi 1.10.0
[ https://issues.apache.org/jira/browse/NIFI-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17012927#comment-17012927 ] Mark Bean commented on NIFI-6852: - This issue also presents itself for another use case: a controller service that includes dynamic properties where that property is another controller service (i.e. ControllerService.class). Prior to 1.10.0, referenced controller services in dynamic properties were not validated against an API. NIFI-6798 fixed that. Therefore, a (custom) controller service which validated pre-1.10.0 no longer validates. I don't see a downside to allowing all ControllerService.class API components to validate as long as the designer intends to allow ANY controller service. In certain cases, this is required. > CTL doesn't work in ExecuteGroovyScript processor with Apache NiFi 1.10.0 > - > > Key: NIFI-6852 > URL: https://issues.apache.org/jira/browse/NIFI-6852 > Project: Apache NiFi > Issue Type: Bug > Components: Core Framework >Affects Versions: 1.10.0 >Reporter: Behrouz >Assignee: Shawn Weeks >Priority: Blocker > Attachments: NIFI_CTL.png > > Time Spent: 40m > Remaining Estimate: 0h > > more information is available in this stackoverflow question: > [https://stackoverflow.com/questions/58731092/why-ctl-doesnt-work-in-executegroovyscript-processor-with-apache-nifi-1-10-0] > according to daggett comment in stackoverflow the source of error seems to be > [https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core-api/src/main/java/org/apache/nifi/controller/AbstractComponentNode.java#L746] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [nifi] tpalfy commented on issue #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor
tpalfy commented on issue #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor URL: https://github.com/apache/nifi/pull/3966#issuecomment-573020796 @turcsanyip Thanks for the comments, I applied all the suggested changes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [nifi] ravowlga123 commented on issue #3971: NIFI-6997: consumeAMQP closing connection when queue not found
ravowlga123 commented on issue #3971: NIFI-6997: consumeAMQP closing connection when queue not found URL: https://github.com/apache/nifi/pull/3971#issuecomment-573038910 Hey @dirkbig I don't see the checklist which is usually present while opening the PR regarding the changes made. Would you please add it?? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (NIFI-7009) Deleted flow components should not be queried by Atlas reporting task
Peter Turcsanyi created NIFI-7009: - Summary: Deleted flow components should not be queried by Atlas reporting task Key: NIFI-7009 URL: https://issues.apache.org/jira/browse/NIFI-7009 Project: Apache NiFi Issue Type: Improvement Reporter: Peter Turcsanyi Assignee: Peter Turcsanyi The NiFi Atlas reporting task retrieves the flow and its components from Atlas at every execution of the task in the following way: # get flow # get each flow component # filter out deleted components #1 returns references to all the subcomponents of the flow, also to the deleted ones. It seems there is no way to parameterize the Atlas query to filter out the deleted item here. But the result of #1 already contains the status of the components, so the deleted items do not need to be queried in #2, then filter out in #3. So #2 and #3 can be swapped. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [nifi-minifi-cpp] szaszm commented on a change in pull request #707: MINIFICPP-1117 - minifi::Exception is now nothrow copyable
szaszm commented on a change in pull request #707: MINIFICPP-1117 - minifi::Exception is now nothrow copyable URL: https://github.com/apache/nifi-minifi-cpp/pull/707#discussion_r365213737

## File path: libminifi/include/Exception.h

```diff
@@ -44,49 +43,56 @@ enum ExceptionType {
   MAX_EXCEPTION
 };
-// Exception String
 static const char *ExceptionStr[MAX_EXCEPTION] = { "File Operation", "Flow File Operation", "Processor Operation", "Process Session Operation", "Process Schedule Operation", "Site2Site Protocol", "General Operation", "Regex Operation" };
-// Exception Type to String
 inline const char *ExceptionTypeToString(ExceptionType type) {
   if (type < MAX_EXCEPTION)
     return ExceptionStr[type];
   else
     return NULL;
 }
-// Exception Class
-class Exception : public std::exception {
- public:
-  // Constructor
-  /*!
-   * Create a new exception
-   */
-  Exception(ExceptionType type, std::string errorMsg)
-      : _type(type),
-        _errorMsg(std::move(errorMsg)) {
-  }
+
-  // Destructor
-  virtual ~Exception() noexcept {
-  }
-  virtual const char * what() const noexcept {
+namespace detail {
+inline size_t StringLength(const char* str) { return strlen(str); }
-    _whatStr = ExceptionTypeToString(_type);
+template<size_t L>
+constexpr size_t StringLength(const char (&)[L]) { return L; }
-    _whatStr += ":" + _errorMsg;
+inline size_t StringLength(const std::string& str) { return str.size(); }
-    return _whatStr.c_str();
-  }
+template<typename... SizeT>
+size_t sum(SizeT... ns) {
+  size_t result = 0;
+  (void)(std::initializer_list<size_t>{(
+      result += ns
+  )...});
+  return result;  // (ns + ...)
+}
- private:
-  // Exception type
-  ExceptionType _type;
-  // Exception detailed information
-  std::string _errorMsg;
-  // Hold the what result
-  mutable std::string _whatStr;
+template<typename... Strs>
+std::string StringJoin(Strs&&... strs) {
```

Review comment: moved to StringUtils in dbc0033 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [nifi-minifi-cpp] szaszm edited a comment on issue #707: MINIFICPP-1117 - minifi::Exception is now nothrow copyable
szaszm edited a comment on issue #707: MINIFICPP-1117 - minifi::Exception is now nothrow copyable URL: https://github.com/apache/nifi-minifi-cpp/pull/707#issuecomment-573019916 update: - fix `ProcessorTests` - move string utilities to `StringUtils` and generalize them - add unit test to cover the new `StringUtils::join_pack` Moving the utilities from a private namespace to `StringUtils` means that they need to be more general IMO. I've extended them to work with any character type and do SFINAE checking on the string types. This means quite some template code, which may not be welcome in this codebase. Suggestions are welcome. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [nifi] joewitt commented on a change in pull request #3971: NIFI-6997: consumeAMQP closing connection when queue not found
joewitt commented on a change in pull request #3971: NIFI-6997: consumeAMQP closing connection when queue not found URL: https://github.com/apache/nifi/pull/3971#discussion_r365281042 ## File path: nifi-nar-bundles/nifi-amqp-bundle/nifi-amqp-processors/src/main/java/org/apache/nifi/amqp/processors/ConsumeAMQP.java ## @@ -194,6 +194,13 @@ protected synchronized AMQPConsumer createAMQPWorker(final ProcessContext contex return amqpConsumer; } catch (final IOException ioe) { +try { +connection.close(); +getLogger().warn("Closed connection at port " + connection.getPort()); +} catch (final IOException ioeClose) { +throw new ProcessException("Failed to close connection at port " + connection.getPort()); Review comment: You're at this point because of an io exception that already occurred. Here you're closing a connection as good practice but the fact that it too might not go well isn't useful/not something you could do anything about. As such I'd closeQuietly meaning do not rethrow the exception and do not log even at a warn level. If you want you could keep the logging part and only as a debug thing. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
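The "close quietly" pattern suggested here is standard cleanup practice: when you are already unwinding from one IOException, a failure while closing the connection is not actionable, so it is swallowed (at most logged at debug) rather than rethrown. A minimal sketch of such a helper (the name is ours; Commons IO's `IOUtils.closeQuietly` serves the same purpose):

```java
import java.io.Closeable;
import java.io.IOException;

// Cleanup-after-failure helper: never lets a secondary close() failure mask
// the original exception that triggered the cleanup.
class CloseQuietly {
    static void closeQuietly(Closeable resource) {
        if (resource == null) {
            return;
        }
        try {
            resource.close();
        } catch (IOException suppressed) {
            // Not actionable here; optionally log at debug level only.
        }
    }
}
```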
[jira] [Created] (NIFI-7010) Add queue inspection and management to NiFi toolkit cli
John Wise created NIFI-7010: --- Summary: Add queue inspection and management to NiFi toolkit cli Key: NIFI-7010 URL: https://issues.apache.org/jira/browse/NIFI-7010 Project: Apache NiFi Issue Type: Improvement Components: Tools and Build Reporter: John Wise There are currently no commands in the cli to inspect or manage connection queues. We'd like to be able to use the cli to perform monitoring, alerting, and triage of certain queues. Also, the ability to find processors, groups, and queues by name, rather than just UUID, would greatly simplify some of our scripts. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [nifi] dirkbig commented on a change in pull request #3971: NIFI-6997: consumeAMQP closing connection when queue not found
dirkbig commented on a change in pull request #3971: NIFI-6997: consumeAMQP closing connection when queue not found URL: https://github.com/apache/nifi/pull/3971#discussion_r365272278 ## File path: nifi-nar-bundles/nifi-amqp-bundle/nifi-amqp-processors/src/main/java/org/apache/nifi/amqp/processors/ConsumeAMQP.java ## @@ -194,6 +194,13 @@ protected synchronized AMQPConsumer createAMQPWorker(final ProcessContext contex return amqpConsumer; } catch (final IOException ioe) { +try { +connection.close(); +getLogger().warn("Closed connection at port " + connection.getPort()); +} catch (final IOException ioe_close) { Review comment: thanks for the input, just changed it. Kept the log message to level WARN as joewitt affirmed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (NIFI-7011) FlowFile ordering can become incorrect when swapping data
Mark Payne created NIFI-7011: Summary: FlowFile ordering can become incorrect when swapping data Key: NIFI-7011 URL: https://issues.apache.org/jira/browse/NIFI-7011 Project: Apache NiFi Issue Type: Bug Components: Core Framework Reporter: Mark Payne Assignee: Mark Payne NiFi provides weak ordering guarantees when using prioritizers in conjunction with swapped data. When data is swapped out, it is always the lowest priority data (according to the selected prioritizers) that is swapped out first. When data is swapped, it is always swapped back in, in the same order. However, I've come across a problem where data is swapped out. Then, the data that is not swapped out gets processed, then data waiting to be swapped out (lowest priority data) is processed, then the swapped data. It should always be that the lowest priority data, waiting to be swapped out, should be processed after all data that is already swapped out gets swapped back in. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [nifi] markap14 opened a new pull request #3978: NIFI-7011: SwappablePriorityQueue contains two internal data structur…
markap14 opened a new pull request #3978: NIFI-7011: SwappablePriorityQueue contains two internal data structur… URL: https://github.com/apache/nifi/pull/3978 …es: activeQueue, swapQueue. activeQueue is intended to be pulled from for processing. swapQueue is intended to hold FlowFiles that are waiting to be swapped out. We want to ensure that we first swap in any data that has already been swapped out before processing the swap queue, in order to ensure that we process the data in the correct order. This fix addresses an issue where data was being swapped out by writing the lowest priority data to a swap file, then adding the highest priority data to activeQueue and the 'middle' priority data back to swapQueue. As a result, when polling from the queue we got highest priority data, followed by lowest priority data, followed by middle priority data. This is addressed by avoiding putting anything back on swapQueue when we swap out. Instead, write data to the swap file, then push everything else to activeQueue. This way, any new data that comes in will still go to the swapQueue, as it should, but all data that didn't get written to the swap file will be processed before the low priority data in the swap file. Thank you for submitting a contribution to Apache NiFi. Please provide a short description of the PR here: Description of PR _Enables X functionality; fixes bug NIFI-._ In order to streamline the review of the contribution we ask you to ensure the following steps have been taken: ### For all changes: - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message? - [ ] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [ ] Has your PR been rebased against the latest commit within the target branch (typically `master`)? - [ ] Is your initial contribution a single, squashed commit? 
_Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not `squash` or use `--force` when pushing to allow for clean monitoring of changes._ ### For code changes: - [ ] Have you ensured that the full suite of tests is executed via `mvn -Pcontrib-check clean install` at the root `nifi` folder? - [ ] Have you written or updated unit tests to verify your changes? - [ ] Have you verified that the full build is successful on both JDK 8 and JDK 11? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] If applicable, have you updated the `LICENSE` file, including the main `LICENSE` file under `nifi-assembly`? - [ ] If applicable, have you updated the `NOTICE` file, including the main `NOTICE` file found under `nifi-assembly`? - [ ] If adding new Properties, have you added `.displayName` in addition to .name (programmatic access) for each of the new properties? ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered? ### Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
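The swap-out behavior the PR describes can be modeled with a plain `PriorityQueue` (a simplification, not NiFi's actual `SwappablePriorityQueue`): on swap-out, the lowest-priority entries go to the "swap file" and everything else is pushed to the active queue, leaving the swap queue empty, so data already sitting in swap files is swapped back in before the lower-priority survivors are processed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Simplified model of the swap-out fix: smaller int = higher priority.
class SwapOutSketch {
    // Drains swapQueue; returns the swapCount lowest-priority entries (the
    // "swap file" contents) and moves all remaining entries to activeQueue
    // instead of putting any of them back on swapQueue.
    static List<Integer> swapOut(PriorityQueue<Integer> swapQueue,
                                 PriorityQueue<Integer> activeQueue,
                                 int swapCount) {
        List<Integer> all = new ArrayList<>(swapQueue);
        swapQueue.clear();
        all.sort(null);  // ascending: highest priority first
        int keep = all.size() - swapCount;
        List<Integer> swapFile = new ArrayList<>(all.subList(keep, all.size()));
        activeQueue.addAll(all.subList(0, keep));  // survivors become active
        return swapFile;
    }
}
```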
[jira] [Updated] (NIFI-7010) Add queue inspection and management to NiFi toolkit cli
[ https://issues.apache.org/jira/browse/NIFI-7010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Wise updated NIFI-7010: Description: There are currently no commands in the cli to inspect or manage connection queues. We'd like to be able to use the cli to perform monitoring, alerting, and triage of certain queues. We're currently using a series of cli and REST API calls to accomplish this now. Also, the ability to find processors, groups, and queues by name, rather than just UUID, would greatly simplify some of our scripts. was: There are currently no commands in the cli to inspect or manage connection queues. We'd like to be able to use the cli to perform monitoring, alerting, and triage of certain queues. Also, the ability to find processors, groups, and queues by name, rather than just UUID, would greatly simplify some of our scripts. > Add queue inspection and management to NiFi toolkit cli > --- > > Key: NIFI-7010 > URL: https://issues.apache.org/jira/browse/NIFI-7010 > Project: Apache NiFi > Issue Type: Improvement > Components: Tools and Build >Reporter: John Wise >Priority: Minor > > There are currently no commands in the cli to inspect or manage connection > queues. We'd like to be able to use the cli to perform monitoring, alerting, > and triage of certain queues. We're currently using a series of cli and REST > API calls to accomplish this now. > Also, the ability to find processors, groups, and queues by name, rather than > just UUID, would greatly simplify some of our scripts. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (NIFI-7012) Using sensitive parameter in sensitive property of InvokeScriptedProcessor causes Jetty shutdown on NiFi restart
Dariusz Chmielewski created NIFI-7012: - Summary: Using sensitive parameter in sensitive property of InvokeScriptedProcessor causes Jetty shutdown on NiFi restart Key: NIFI-7012 URL: https://issues.apache.org/jira/browse/NIFI-7012 Project: Apache NiFi Issue Type: Bug Affects Versions: 1.10.0 Environment: OpenJDK 1.8.0_232 on Ubuntu 18.04.3 LTS Reporter: Dariusz Chmielewski Attachments: groovy_sample.txt To simulate this add a new InvokeScriptedProcessor to your flow with attached example basic Groovy script as the Script Body. This will create a "Password" sensitive property on the processor, then using UI enter sensitive parameter (ex: "#\{test_pass}") in the value of this property. At this point everything will work fine, no errors in UI or nifi-app.log during this processor execution, but when you restart NiFi you will notice the below warning in nifi-app.log. To get around this issue and have Jetty start again I had to manually edit flow.xml and set the parameter ("test_pass") sensitive value to false. Note that both the processor property and parameter appear as sensitive in UI (before making changes to flow.xml) and their values are encrypted in flow.xml. {quote}2020-01-10 15:37:28,492 INFO [main] org.eclipse.jetty.server.Server Started @29687ms 2020-01-10 15:37:28,493 WARN [main] org.apache.nifi.web.server.JettyServer Failed to start web server... shutting down. org.apache.nifi.controller.serialization.FlowSynchronizationException: java.lang.IllegalArgumentException: The property 'Password' cannot reference Parameter 'test_pass' because Sensitive Parameters may only be referenced by Sensitive Properties. 
    at org.apache.nifi.controller.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:510)
    at org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1368)
    at org.apache.nifi.persistence.StandardXMLFlowConfigurationDAO.load(StandardXMLFlowConfigurationDAO.java:88)
    at org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:812)
    at org.apache.nifi.controller.StandardFlowService.load(StandardFlowService.java:557)
    at org.apache.nifi.web.contextlistener.ApplicationStartupContextListener.contextInitialized(ApplicationStartupContextListener.java:72)
    at org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:959)
    at org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:553)
    at org.eclipse.jetty.server.handler.ContextHandler.startContext(ContextHandler.java:924)
    at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:365)
    at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1497)
    at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1459)
    at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:854)
    at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:278)
    at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:545)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:167)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:119)
    at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:167)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
    at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
    at org.eclipse.jetty.server.handler.gzip.GzipHandler.doStart(GzipHandler.java:406)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:167)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:119)
    at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:167)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:119)
    at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:167){quote}
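The constraint Jetty trips over during flow synchronization can be modeled in a few lines. This is an illustrative Python sketch of the rule stated in the error message, not NiFi's actual implementation; the function name and parameters are assumptions made for the example:

```python
def check_parameter_reference(property_sensitive: bool, parameter_sensitive: bool,
                              property_name: str, parameter_name: str) -> None:
    """Raise if a non-sensitive property references a sensitive parameter,
    mirroring the rule: 'Sensitive Parameters may only be referenced by
    Sensitive Properties.'"""
    if parameter_sensitive and not property_sensitive:
        raise ValueError(
            f"The property '{property_name}' cannot reference Parameter "
            f"'{parameter_name}' because Sensitive Parameters may only be "
            f"referenced by Sensitive Properties.")

# The reported failure mode: at restart, the scripted processor's dynamically
# created 'Password' property is evaluated before the script has marked it
# sensitive, so the check fails even though the UI showed it as sensitive.
check_parameter_reference(True, True, "Password", "test_pass")   # accepted
```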
[GitHub] [nifi] turcsanyip opened a new pull request #3979: NIFI-7009: Atlas reporting task retrieves only the active flow compon…
turcsanyip opened a new pull request #3979: NIFI-7009: Atlas reporting task retrieves only the active flow compon…
URL: https://github.com/apache/nifi/pull/3979

…ents

Filter out the deleted components before querying them, instead of retrieving all the components and then filtering.

https://issues.apache.org/jira/browse/NIFI-7009

Thank you for submitting a contribution to Apache NiFi.

Please provide a short description of the PR here:

#### Description of PR

_Enables X functionality; fixes bug NIFI-XXXX._

In order to streamline the review of the contribution we ask you to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
- [ ] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [ ] Has your PR been rebased against the latest commit within the target branch (typically `master`)?
- [ ] Is your initial contribution a single, squashed commit? _Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not `squash` or use `--force` when pushing to allow for clean monitoring of changes._

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via `mvn -Pcontrib-check clean install` at the root `nifi` folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] Have you verified that the full build is successful on both JDK 8 and JDK 11?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE` file, including the main `LICENSE` file under `nifi-assembly`?
- [ ] If applicable, have you updated the `NOTICE` file, including the main `NOTICE` file found under `nifi-assembly`?
- [ ] If adding new Properties, have you added `.displayName` in addition to `.name` (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
[jira] [Updated] (NIFI-7009) Deleted flow components should not be queried by Atlas reporting task
[ https://issues.apache.org/jira/browse/NIFI-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Turcsanyi updated NIFI-7009:
----------------------------------
    Status: Patch Available  (was: In Progress)

> Deleted flow components should not be queried by Atlas reporting task
> ---------------------------------------------------------------------
>
>                 Key: NIFI-7009
>                 URL: https://issues.apache.org/jira/browse/NIFI-7009
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Peter Turcsanyi
>            Assignee: Peter Turcsanyi
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The NiFi Atlas reporting task retrieves the flow and its components from Atlas at every execution of the task in the following way:
> # get flow
> # get each flow component
> # filter out deleted components
> #1 returns references to all the subcomponents of the flow, including the deleted ones. It seems there is no way to parameterize the Atlas query to filter out the deleted items here.
> But the result of #1 already contains the status of the components, so the deleted items do not need to be queried in #2 and then filtered out in #3.
> #2 and #3 can be swapped, and only the active items need to be retrieved.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
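The proposed swap of steps #2 and #3 can be sketched as follows. This is illustrative Python, not the Atlas client API; the dict shapes, the `status` field values, and the function names are assumptions for the example:

```python
def fetch_active_components(flow_refs, fetch_component):
    """flow_refs: component references already returned by the flow query
    (step 1), each a dict like {'guid': ..., 'status': 'ACTIVE'|'DELETED'}.
    fetch_component(guid) issues one per-component query (step 2)."""
    # Step 3 runs before step 2: filter on the status already present in
    # the flow result, so deleted components are never queried at all.
    active_refs = [ref for ref in flow_refs if ref["status"] != "DELETED"]
    return [fetch_component(ref["guid"]) for ref in active_refs]

# Deleted components incur no query: only 'a' is fetched below.
queried = []
def fetch(guid):
    queried.append(guid)
    return {"guid": guid}

refs = [{"guid": "a", "status": "ACTIVE"}, {"guid": "b", "status": "DELETED"}]
components = fetch_active_components(refs, fetch)
```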
[GitHub] [nifi] markap14 commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor
markap14 commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor
URL: https://github.com/apache/nifi/pull/3966#discussion_r365353329

## File path: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java
## @@ -178,6 +183,15 @@
         .defaultValue(GROUP_ALL.getValue())
         .build();

+    public static final PropertyDescriptor BATCH_SIZE = new PropertyDescriptor.Builder()
+        .displayName("Batch Size")
+        .name("gethdfsfileinfo-batch-size")
+        .description("Number of records to put into an output flowfile when 'Destination' is set to 'Content'"
+            + " and 'Group Results' is set to 'None'")
+        .required(false)
+        .addValidator(StandardValidators.createAllowEmptyValidator(StandardValidators.POSITIVE_INTEGER_VALIDATOR))

Review comment: Can you explain the logic here @tpalfy? I don't understand why we would ever allow an empty string for this property. It is marked as optional (required=false), so it should either be specified as an integer or not specified at all. It looks like there is quite a bit of code in this PR (the creation of this validator, and several places where the code checks whether the value is empty) that could be avoided.
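For readers unfamiliar with the validator being questioned, its behavior is roughly "treat the empty string as valid, delegate everything else to a wrapped validator". The sketch below models that idea in Python; `StandardValidators.createAllowEmptyValidator` is the real NiFi API, while the function names here are illustrative:

```python
def allow_empty(validator):
    """Wrap a validator so that the empty string is always accepted;
    any other value is delegated to the wrapped validator."""
    def wrapped(value: str) -> bool:
        if value == "":
            return True
        return validator(value)
    return wrapped

def positive_integer(value: str) -> bool:
    # Rough stand-in for POSITIVE_INTEGER_VALIDATOR.
    return value.isdigit() and int(value) > 0

# The Batch Size property then validates "" and "5", but rejects "0".
batch_size_validator = allow_empty(positive_integer)
```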
[GitHub] [nifi] markap14 commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor
markap14 commented on a change in pull request #3966: NIFI-6992 - Add "Batch Size" property to GetHDFSFileInfo processor
URL: https://github.com/apache/nifi/pull/3966#discussion_r365353915

## File path: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java
## @@ -255,6 +270,30 @@ public void onPropertyModified(PropertyDescriptor descriptor, String oldValue, S
         req = null;
     }

+    @Override
+    protected Collection<ValidationResult> customValidate(ValidationContext validationContext) {
+        final List<ValidationResult> validationResults = new ArrayList<>(super.customValidate(validationContext));
+
+        String destination = validationContext.getProperty(DESTINATION).getValue();
+        String grouping = validationContext.getProperty(GROUPING).getValue();
+        String batchSize = validationContext.getProperty(BATCH_SIZE).getValue();
+
+        if (
+            (!DESTINATION_CONTENT.getValue().equals(destination) || !GROUP_NONE.getValue().equals(grouping))
+            && !isEmpty(batchSize)

Review comment: If we avoid allowing Batch Size to be set to an empty string, as recommended above, we can instead just check `validationContext.getProperty(BATCH_SIZE).isSet()` here. I think this reads more cleanly, and is more in line with the way most processors are written.
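The intent of the empty-string convention under discussion can be contrasted with a small model, written in Python purely for illustration (the real code is the Java `customValidate` quoted above; the literal property values used here are assumptions):

```python
def validate_batch_size(destination: str, grouping: str, batch_size: str):
    """Model of the custom validation: 'Batch Size' is only meaningful when
    Destination is 'Content' and Grouping is 'None'. With the empty-string
    convention, a value left in place while the prerequisites are unmet is
    flagged as invalid, so the user notices the misconfiguration."""
    errors = []
    prerequisites_met = destination == "Content" and grouping == "None"
    if batch_size != "" and not prerequisites_met:
        errors.append("'Batch Size' is only allowed when 'Destination' is "
                      "'Content' and 'Grouping' is 'None'")
    return errors
```

Under the alternative suggested in the review, `batch_size != ""` would become an `isSet()`-style check on an optional (or default-1) property, and the empty-string special case disappears.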
[GitHub] [nifi-minifi-cpp] am-c-p-p commented on a change in pull request #656: MINIFI-1013 Used soci library.
am-c-p-p commented on a change in pull request #656: MINIFI-1013 Used soci library.
URL: https://github.com/apache/nifi-minifi-cpp/pull/656#discussion_r365129833

## File path: extensions/sql/processors/ExecuteSQL.cpp
## @@ -0,0 +1,150 @@
+/**
+ * @file ExecuteSQL.cpp
+ * ExecuteSQL class declaration
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include "ExecuteSQL.h"
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+#include "io/DataStream.h"
+#include "core/ProcessContext.h"
+#include "core/ProcessSession.h"
+#include "Exception.h"
+#include "utils/OsUtils.h"
+#include "data/DatabaseConnectors.h"
+#include "data/JSONSQLWriter.h"
+#include "data/WriteCallback.h"
+
+namespace org {
+namespace apache {
+namespace nifi {
+namespace minifi {
+namespace processors {
+
+const std::string ExecuteSQL::ProcessorName("ExecuteSQL");
+
+static core::Property DBCControllerService(
+    core::PropertyBuilder::createProperty("DB Controller Service")->isRequired(true)->withDescription("Database Controller Service.")->supportsExpressionLanguage(true)->build());
+
+static core::Property s_SQLSelectQuery(
+    core::PropertyBuilder::createProperty("SQL select query")->isRequired(true)->withDefaultValue("System")->withDescription(
+        "The SQL select query to execute. The query can be empty, a constant value, or built from attributes using Expression Language. "
+        "If this property is specified, it will be used regardless of the content of incoming flowfiles. "
+        "If this property is empty, the content of the incoming flow file is expected to contain a valid SQL select query, to be issued by the processor to the database. "
+        "Note that Expression Language is not evaluated for flow file contents.")->supportsExpressionLanguage(true)->build());
+
+static core::Property MaxRowsPerFlowFile(
+    core::PropertyBuilder::createProperty("Max Rows Per Flow File")->isRequired(true)->withDefaultValue(0)->withDescription(
+        "The maximum number of result rows that will be included into a flow file. If zero then all will be placed into the flow file")->supportsExpressionLanguage(true)->build());
+
+core::Relationship ExecuteSQL::Success("success", "Successfully created FlowFile from SQL query result set.");
+
+ExecuteSQL::ExecuteSQL(const std::string& name, utils::Identifier uuid)
+    : core::Processor(name, uuid), max_rows_(0),
+      logger_(logging::LoggerFactory<ExecuteSQL>::getLogger()) {
+}
+
+ExecuteSQL::~ExecuteSQL() {
+}
+
+void ExecuteSQL::initialize() {
+  //! Set the supported properties
+  setSupportedProperties({ DBCControllerService, s_SQLSelectQuery, MaxRowsPerFlowFile });
+
+  //! Set the supported relationships
+  setSupportedRelationships({ Success });
+}
+
+void ExecuteSQL::onSchedule(const std::shared_ptr<core::ProcessContext> &context, const std::shared_ptr<core::ProcessSessionFactory> &sessionFactory) {
+  context->getProperty(DBCControllerService.getName(), db_controller_service_);
+  context->getProperty(s_SQLSelectQuery.getName(), sqlSelectQuery_);
+  context->getProperty(MaxRowsPerFlowFile.getName(), max_rows_);
+
+  database_service_ = std::dynamic_pointer_cast(context->getControllerService(db_controller_service_));
+  if (!database_service_)
+    throw minifi::Exception(PROCESSOR_EXCEPTION, "'DB Controller Service' must be defined");
+}
+
+void ExecuteSQL::onTrigger(const std::shared_ptr<core::ProcessContext> &context, const std::shared_ptr<core::ProcessSession> &session) {
+  std::unique_lock<std::mutex> lock(onTriggerMutex_, std::try_to_lock);
+  if (!lock.owns_lock()) {
+    logger_->log_warn("'onTrigger' is called before previous 'onTrigger' call is finished.");
+    context->yield();
+    return;
+  }
+
+  if (!connection_) {
+    connection_ = database_service_->getConnection();
+    if (!connection_) {
+      context->yield();
+      return;
+    }
+  }
+
+  try {
+    auto statement = connection_->prepareStatement(sqlSelectQuery_);
+
+    auto rowset = statement->execute();
+
+    int count = 0;
+    size_t row_count = 0;
+    std::stringstream outputStream;
+    sql::JSONSQLWriter writer(rowset, &outputStream);
+    // serialize the rows
+    do {
+      row_count = writer.serialize(max_rows_ == 0 ? std::numeric_limits<size_t>::max() : max_rows_);
[GitHub] [nifi-minifi-cpp] am-c-p-p commented on a change in pull request #656: MINIFI-1013 Used soci library.
am-c-p-p commented on a change in pull request #656: MINIFI-1013 Used soci library.
URL: https://github.com/apache/nifi-minifi-cpp/pull/656#discussion_r365130435

## File path: extensions/sql/processors/ExecuteSQL.cpp
## @@ -0,0 +1,150 @@ (same ExecuteSQL.cpp hunk as quoted in the previous comment)

+    connection_ = database_service_->getConnection();

Review comment: Fixed.
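The Max Rows Per Flow File semantics stated in the property description above (at most `max_rows` result rows per output flow file, with 0 meaning all rows in a single flow file) can be sketched in Python; the function name and data shapes are illustrative, not the MiNiFi C++ API:

```python
def batch_rows(rows, max_rows):
    """Split a result set into per-flow-file batches: at most max_rows rows
    each, or a single batch containing everything when max_rows is 0."""
    limit = len(rows) if max_rows == 0 else max_rows
    batches = []
    # max(limit, 1) guards the step for an empty result set with max_rows == 0.
    for i in range(0, len(rows), max(limit, 1)):
        batches.append(rows[i:i + limit])
    return batches
```

This mirrors the `do { writer.serialize(...) } while (...)` loop in the diff, where `max_rows_ == 0` is mapped to "serialize everything in one pass".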