[jira] [Updated] (NIFI-8134) DataTypeUtils.toRecord methods do not recursively convert Maps into Records
[ https://issues.apache.org/jira/browse/NIFI-8134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne updated NIFI-8134:
-----------------------------
    Fix Version/s: 2.0.0-M4
    Resolution: Fixed
    Status: Resolved  (was: Patch Available)

> DataTypeUtils.toRecord methods do not recursively convert Maps into Records
> ---------------------------------------------------------------------------
>
> Key: NIFI-8134
> URL: https://issues.apache.org/jira/browse/NIFI-8134
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.11.4
> Reporter: Chris Sampson
> Assignee: Chris Sampson
> Priority: Major
> Fix For: 2.0.0-M4
>
> Time Spent: 3h
> Remaining Estimate: 0h
>
> Given a Java Map that contains one or more Maps as values (optionally nested
> within arrays), the DataTypeUtils.toRecord method should convert the child
> Maps to Records before converting the top-level Map.
> This assumes the associated schema for the data represents these objects as
> Records (including as part of a Choice or Array type).

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
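The behaviour requested in NIFI-8134 can be illustrated with a minimal sketch. It does not use NiFi's DataTypeUtils or Record classes: SimpleRecord and NestedMapConversion are hypothetical stand-ins, and the sketch ignores the schema check that the ticket mentions (it converts every child Map it finds, whether or not a schema would describe it as a Record).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NestedMapConversion {

    /** Hypothetical stand-in for a Record: a thin wrapper around field values. */
    static final class SimpleRecord {
        private final Map<String, Object> values;

        SimpleRecord(Map<String, Object> values) {
            this.values = values;
        }

        Object getValue(String fieldName) {
            return values.get(fieldName);
        }
    }

    /** Converts a Map into a SimpleRecord, converting child Maps before wrapping the parent. */
    static SimpleRecord toRecord(Map<?, ?> map) {
        Map<String, Object> converted = new LinkedHashMap<>();
        for (Map.Entry<?, ?> entry : map.entrySet()) {
            converted.put(String.valueOf(entry.getKey()), convertValue(entry.getValue()));
        }
        return new SimpleRecord(converted);
    }

    /** Recurses into Maps, Lists and Object arrays; any other value is returned unchanged. */
    private static Object convertValue(Object value) {
        if (value instanceof Map) {
            return toRecord((Map<?, ?>) value);
        }
        if (value instanceof List) {
            List<Object> result = new ArrayList<>();
            for (Object element : (List<?>) value) {
                result.add(convertValue(element));
            }
            return result;
        }
        if (value instanceof Object[]) {
            Object[] source = (Object[]) value;
            Object[] result = new Object[source.length];
            for (int i = 0; i < source.length; i++) {
                result[i] = convertValue(source[i]);
            }
            return result;
        }
        return value;
    }
}
```

With this sketch, a Map holding a child Map under "address" yields a SimpleRecord whose "address" value is itself a SimpleRecord rather than a raw Map.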
[jira] [Updated] (NIFI-13206) S3 Integration tests failing due to server-side encrypt enabled by default
[ https://issues.apache.org/jira/browse/NIFI-13206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne updated NIFI-13206:
------------------------------
    Status: Patch Available  (was: Open)

> S3 Integration tests failing due to server-side encrypt enabled by default
> ---------------------------------------------------------------------------
>
> Key: NIFI-13206
> URL: https://issues.apache.org/jira/browse/NIFI-13206
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 2.0.0-M3
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The 3.0 version of Localstack enabled S3 server-side encryption by default.
> This is now causing integration tests to fail with errors such as:
> {code:java}
> org.opentest4j.AssertionFailedError: Attribute s3.sseAlgorithm should not exist on FlowFile, but exists with value AES256 ==>
> Expected :false
> Actual :true {code}
> This is happening in both the Fetch and Put S3 integration tests. The tests work
> as-is if we change the Docker image of Localstack to 2.3.2.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Created] (NIFI-13206) S3 Integration tests failing due to server-side encrypt enabled by default
Mark Payne created NIFI-13206:
---------------------------------

    Summary: S3 Integration tests failing due to server-side encrypt enabled by default
    Key: NIFI-13206
    URL: https://issues.apache.org/jira/browse/NIFI-13206
    Project: Apache NiFi
    Issue Type: Improvement
    Components: Extensions
    Reporter: Mark Payne
    Assignee: Mark Payne
    Fix For: 2.0.0-M3

The 3.0 version of Localstack enabled S3 server-side encryption by default. This is now causing integration tests to fail with errors such as:
{code:java}
org.opentest4j.AssertionFailedError: Attribute s3.sseAlgorithm should not exist on FlowFile, but exists with value AES256 ==>
Expected :false
Actual :true {code}
This is happening in both the Fetch and Put S3 integration tests. The tests work as-is if we change the Docker image of Localstack to 2.3.2.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Commented] (NIFI-13200) Framework should not allow removal of 'filename' or 'path' attributes
[ https://issues.apache.org/jira/browse/NIFI-13200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845158#comment-17845158 ]

Mark Payne commented on NIFI-13200:
-----------------------------------

Yup good catch, [~mosermw]

> Framework should not allow removal of 'filename' or 'path' attributes
> ----------------------------------------------------------------------
>
> Key: NIFI-13200
> URL: https://issues.apache.org/jira/browse/NIFI-13200
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 2.0.0-M3
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Currently, the framework prevents processors from removing the 'uuid'
> attribute, but it allows any other attribute to be removed. However,
> there are many processors that assume that the 'filename' and 'path'
> attributes exist, and they are intended always to exist - they are even
> assigned values when the FlowFile is created. We should ensure that these
> attributes cannot be removed.
> Otherwise, configuring UpdateAttribute to remove these attributes can cause
> follow-on processors to fail with unexpected NullPointerExceptions.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Updated] (NIFI-13203) Include s3.url attribute in FetchS3Object and PutS3Object
[ https://issues.apache.org/jira/browse/NIFI-13203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne updated NIFI-13203:
------------------------------
    Fix Version/s: 2.0.0-M3
    Assignee: Mark Payne
    Status: Patch Available  (was: Open)

> Include s3.url attribute in FetchS3Object and PutS3Object
> ----------------------------------------------------------
>
> Key: NIFI-13203
> URL: https://issues.apache.org/jira/browse/NIFI-13203
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 2.0.0-M3
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The FetchS3Object and PutS3Object processors add several S3-related
> attributes, such as s3.bucket, s3.key, etc., and then at the end of
> onTrigger, they emit a provenance event with the S3 URL. However, there is
> no attribute for the S3 URL. This would be handy to have available when
> building flows.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Created] (NIFI-13203) Include s3.url attribute in FetchS3Object and PutS3Object
Mark Payne created NIFI-13203:
---------------------------------

    Summary: Include s3.url attribute in FetchS3Object and PutS3Object
    Key: NIFI-13203
    URL: https://issues.apache.org/jira/browse/NIFI-13203
    Project: Apache NiFi
    Issue Type: Improvement
    Components: Extensions
    Reporter: Mark Payne

The FetchS3Object and PutS3Object processors add several S3-related attributes, such as s3.bucket, s3.key, etc., and then at the end of onTrigger, they emit a provenance event with the S3 URL. However, there is no attribute for the S3 URL. This would be handy to have available when building flows.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Updated] (NIFI-13200) Framework should not allow removal of 'filename' or 'path' attributes
[ https://issues.apache.org/jira/browse/NIFI-13200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne updated NIFI-13200:
------------------------------
    Status: Patch Available  (was: Open)

> Framework should not allow removal of 'filename' or 'path' attributes
> ----------------------------------------------------------------------
>
> Key: NIFI-13200
> URL: https://issues.apache.org/jira/browse/NIFI-13200
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently, the framework prevents processors from removing the 'uuid'
> attribute, but it allows any other attribute to be removed. However,
> there are many processors that assume that the 'filename' and 'path'
> attributes exist, and they are intended always to exist - they are even
> assigned values when the FlowFile is created. We should ensure that these
> attributes cannot be removed.
> Otherwise, configuring UpdateAttribute to remove these attributes can cause
> follow-on processors to fail with unexpected NullPointerExceptions.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Created] (NIFI-13200) Framework should not allow removal of 'filename' or 'path' attributes
Mark Payne created NIFI-13200:
---------------------------------

    Summary: Framework should not allow removal of 'filename' or 'path' attributes
    Key: NIFI-13200
    URL: https://issues.apache.org/jira/browse/NIFI-13200
    Project: Apache NiFi
    Issue Type: Improvement
    Components: Core Framework
    Reporter: Mark Payne
    Assignee: Mark Payne

Currently, the framework prevents processors from removing the 'uuid' attribute, but it allows any other attribute to be removed. However, there are many processors that assume that the 'filename' and 'path' attributes exist, and they are intended always to exist - they are even assigned values when the FlowFile is created. We should ensure that these attributes cannot be removed.

Otherwise, configuring UpdateAttribute to remove these attributes can cause follow-on processors to fail with unexpected NullPointerExceptions.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
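The guard proposed in NIFI-13200 could look roughly like the sketch below. It is not the framework's code: ProtectedAttributes and removeAttributes are hypothetical names, attributes are modelled as a plain Map, and silently ignoring a protected key (rather than throwing) is an assumption made only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class ProtectedAttributes {

    // 'uuid' is already protected according to the ticket; 'filename' and 'path' are the proposed additions.
    private static final Set<String> PROTECTED = Set.of("uuid", "filename", "path");

    /** Returns a copy of the attributes with the requested keys removed, leaving protected keys in place. */
    static Map<String, String> removeAttributes(Map<String, String> attributes, Set<String> keysToRemove) {
        Map<String, String> updated = new HashMap<>(attributes);
        for (String key : keysToRemove) {
            if (!PROTECTED.contains(key)) {
                updated.remove(key);
            }
        }
        return updated;
    }
}
```

Under this sketch, a request to remove "filename", "path" and "custom" drops only "custom", so downstream code that reads "filename" never sees a null.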
[jira] [Created] (NIFI-13199) Update ValidateRecord to avoid writing to FlowFiles that will be auto-terminated
Mark Payne created NIFI-13199:
---------------------------------

    Summary: Update ValidateRecord to avoid writing to FlowFiles that will be auto-terminated
    Key: NIFI-13199
    URL: https://issues.apache.org/jira/browse/NIFI-13199
    Project: Apache NiFi
    Issue Type: Improvement
    Components: Extensions
    Reporter: Mark Payne

NIFI-13196 introduces the ability to check if a relationship is auto-terminated. In the case of ValidateRecord, the processor is commonly used to filter out invalid records. Before writing records to an 'invalid' FlowFile we should first check if the relationship is auto-terminated and not spend the resources to create the data if it will be auto-terminated.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Created] (NIFI-13198) Update RouteText not to write to FlowFiles for auto-terminated relationships
Mark Payne created NIFI-13198:
---------------------------------

    Summary: Update RouteText not to write to FlowFiles for auto-terminated relationships
    Key: NIFI-13198
    URL: https://issues.apache.org/jira/browse/NIFI-13198
    Project: Apache NiFi
    Issue Type: Improvement
    Components: Extensions
    Reporter: Mark Payne

NIFI-13196 introduces the ability to check if a relationship is auto-terminated. In the case of RouteText, the processor is commonly used to filter out unwanted lines of text. For anything that is auto-terminated, though, we still write out the data. We should instead check if the Relationship that we're writing to is auto-terminated and if so, don't bother creating the FlowFile or writing to it.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Updated] (NIFI-13196) Add a new isAutoTerminated(Relationship) method to ProcessContext
[ https://issues.apache.org/jira/browse/NIFI-13196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne updated NIFI-13196:
------------------------------
    Fix Version/s: 2.0.0-M3
    Status: Patch Available  (was: Open)

> Add a new isAutoTerminated(Relationship) method to ProcessContext
> ------------------------------------------------------------------
>
> Key: NIFI-13196
> URL: https://issues.apache.org/jira/browse/NIFI-13196
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 2.0.0-M3
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently a Processor has no way of determining whether or not a Relationship
> is auto-terminated. There are cases where a Processor forks an incoming
> FlowFile and updates it (in a potentially expensive manner) and then
> transfers it to a Relationship that is auto-terminated.
> We should add the ability to determine whether or not a given relationship is
> auto-terminated so that we can be more efficient.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Created] (NIFI-13196) Add a new isAutoTerminated(Relationship) method to ProcessContext
Mark Payne created NIFI-13196:
---------------------------------

    Summary: Add a new isAutoTerminated(Relationship) method to ProcessContext
    Key: NIFI-13196
    URL: https://issues.apache.org/jira/browse/NIFI-13196
    Project: Apache NiFi
    Issue Type: Improvement
    Components: Core Framework
    Reporter: Mark Payne
    Assignee: Mark Payne

Currently a Processor has no way of determining whether or not a Relationship is auto-terminated. There are cases where a Processor forks an incoming FlowFile and updates it (in a potentially expensive manner) and then transfers it to a Relationship that is auto-terminated.

We should add the ability to determine whether or not a given relationship is auto-terminated so that we can be more efficient.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
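How a processor might use such a check (the pattern NIFI-13198 and NIFI-13199 describe) can be sketched as follows. The Context interface here is a hypothetical stand-in that takes a relationship name as a String; the real proposal is a method on ProcessContext that takes a Relationship, and collectUnmatched is invented purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class AutoTerminationSketch {

    /** Hypothetical, minimal stand-in for the one ProcessContext method the ticket proposes. */
    interface Context {
        boolean isAutoTerminated(String relationshipName);
    }

    /**
     * Collects the lines that do not contain the keyword, but only when the
     * 'unmatched' relationship has a consumer. If it is auto-terminated, the
     * output would be discarded anyway, so nothing is collected.
     */
    static List<String> collectUnmatched(Context context, List<String> lines, String keyword) {
        List<String> unmatched = new ArrayList<>();
        boolean discardUnmatched = context.isAutoTerminated("unmatched");
        for (String line : lines) {
            if (!line.contains(keyword) && !discardUnmatched) {
                unmatched.add(line);
            }
        }
        return unmatched;
    }
}
```

The check is made once, before the loop, so the per-line cost of the optimisation is a single boolean test.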
[jira] [Resolved] (NIFI-13146) ConsumeSlack processor rate limited
[ https://issues.apache.org/jira/browse/NIFI-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne resolved NIFI-13146.
-------------------------------
    Fix Version/s: 2.0.0-M3
    Resolution: Fixed

> ConsumeSlack processor rate limited
> -----------------------------------
>
> Key: NIFI-13146
> URL: https://issues.apache.org/jira/browse/NIFI-13146
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Zsihovszki Krisztina
> Assignee: Zsihovszki Krisztina
> Priority: Major
> Fix For: 2.0.0-M3
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> The ConsumeSlack processor was not able to start when it was running in a
> Slack workspace with thousands of channels. It got stuck in the
> initialization phase and reported a rate limit error continuously.
> The processor fetches all available channels (conversation list) during its
> setup and creates a channel id/name mapping.
> Fetching the conversation list items is executed in batches, 1000
> channels/batch. Since the fetch is done continuously, after a while the Slack
> API returns a rate limit error. (Rate limit settings were at default in Slack.)
> Even if some delay was added after each API call, 30 seconds was not enough
> to fetch all the channels (since it is in onScheduled, after 30 seconds the
> initialization is re-attempted).
> As a mitigation for the problem I'd like to add logic which checks if the
> "Channels" property contains only IDs. In case no channel name is specified,
> another Slack API call (
> [conversations.info|https://api.slack.com/methods/conversations.info]) can be
> used to fetch the channel names for the channel IDs and it is not necessary
> to fetch the whole conversation list.
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
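The "contains only IDs" check described in NIFI-13146 might be sketched as below. This is not the processor's code. ChannelIdCheck is a hypothetical name, the property is assumed to be a comma-separated list, and the ID pattern (an uppercase letter followed by at least five uppercase letters or digits, e.g. C0123ABCD) is an assumption about what a channel ID looks like rather than a documented Slack rule.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class ChannelIdCheck {

    // Assumed shape of a channel ID; anything else (for example "#general") is treated as a name.
    private static final Pattern ID_PATTERN = Pattern.compile("[A-Z][A-Z0-9]{5,}");

    /** Returns true only if the property lists at least one entry and every entry looks like an ID. */
    static boolean containsOnlyIds(String channelsProperty) {
        List<String> entries = Arrays.stream(channelsProperty.split(","))
                .map(String::trim)
                .filter(entry -> !entry.isEmpty())
                .collect(Collectors.toList());
        if (entries.isEmpty()) {
            return false;
        }
        return entries.stream().allMatch(entry -> ID_PATTERN.matcher(entry).matches());
    }
}
```

When the check returns true, the processor could look up each name individually instead of paging through the whole conversation list; when it returns false, the full listing is still needed to resolve names to IDs.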
[jira] [Resolved] (NIFI-13076) reduce enum array allocation in OpenTelemetry bundle
[ https://issues.apache.org/jira/browse/NIFI-13076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne resolved NIFI-13076.
-------------------------------
    Fix Version/s: 2.0.0-M3
    Resolution: Fixed

> reduce enum array allocation in OpenTelemetry bundle
> ----------------------------------------------------
>
> Key: NIFI-13076
> URL: https://issues.apache.org/jira/browse/NIFI-13076
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Sean Sullivan
> Priority: Minor
> Fix For: 2.0.0-M3
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> module: *nifi-opentelemetry-bundle*
>
> h2. Motivation
> reduce enum array allocation
> h2. Modifications
> cache enum .values() in a static variable
> h2. Additional context
> [https://www.gamlor.info/wordpress/2017/08/javas-enum-values-hidden-allocations/]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
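The modification described in NIFI-13076 follows a common Java pattern, sketched below. The Status enum and the fromOrdinal helper are hypothetical and are not taken from the OpenTelemetry bundle; the only point being illustrated is that values() returns a fresh array copy on each call, so a hot lookup path benefits from caching one copy.

```java
final class EnumValuesCache {

    enum Status {
        UNSET, OK, ERROR;

        // values() clones its backing array on every call; keep one private copy for lookups.
        private static final Status[] VALUES = values();

        /** Looks up a constant by ordinal without allocating a new array per call. */
        static Status fromOrdinal(int ordinal) {
            return VALUES[ordinal];
        }
    }
}
```

The cached array stays private so callers cannot mutate it, which is the reason values() makes a defensive copy in the first place.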
[jira] [Commented] (NIFI-12986) Tidy up JavaDoc of ProcessSession
[ https://issues.apache.org/jira/browse/NIFI-12986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839719#comment-17839719 ]

Mark Payne commented on NIFI-12986:
-----------------------------------

This PR incorrectly marks {{ProcessSession.commit()}} as being deprecated. It is not deprecated. In the vast majority of cases, {{commitAsync()}} should be preferred. However, there are still cases where {{commit()}} may make sense. It is used, for example, in the Site-to-Site server, as it cannot respond to the client until the commit has completed. Such code *could* be rewritten to use commitAsync but currently has not been. We should not be deprecating methods that we are actively using and do not necessarily intend to stop using. Additionally, while it is possible to rewrite it in such a way that it uses commitAsync, there's really no need to, as the synchronous commit is still a valid approach and is more straightforward.

Additionally, the PR changes the formatting of the methods from the syntax:
{code:java}
Documentation Paragraph 1
Documentation Paragraph 2
{code}
To a less explicit version of:
{code:java}
Documentation Paragraph 1
Documentation Paragraph 2
{code}
This should be undone, as the former formatting is preferred and is the dominant formatting throughout the codebase. It also makes it clearer where a paragraph begins and ends, and results in more consistent rendering of the text, as the latter approach does not necessarily apply the same formatting as the former.

It does look like the commit adds some documentation around Exceptions that are thrown, but honestly it is difficult to say, as GitHub shows it as if methods were added and removed, or parameters were changed, etc. I think it gets confused by the change in formatting?

> Tidy up JavaDoc of ProcessSession
> ---------------------------------
>
> Key: NIFI-12986
> URL: https://issues.apache.org/jira/browse/NIFI-12986
> Project: Apache NiFi
> Issue Type: Sub-task
> Reporter: endzeit
> Assignee: endzeit
> Priority: Major
> Fix For: 2.0.0-M3
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> While working on NIFI-12982 I noticed that the JavaDoc of {{ProcessSession}}
> has some minor typos and documentation drifts between method overloads.
> The goal of this ticket is to make the JavaDoc for the current
> {{ProcessSession}} specification more consistent. The specified contract must
> not be altered.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Reopened] (NIFI-12986) Tidy up JavaDoc of ProcessSession
[ https://issues.apache.org/jira/browse/NIFI-12986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne reopened NIFI-12986:
-------------------------------

Reopening Issue, as I do not believe it to be correct. Will elaborate more in a separate comment.

> Tidy up JavaDoc of ProcessSession
> ---------------------------------
>
> Key: NIFI-12986
> URL: https://issues.apache.org/jira/browse/NIFI-12986
> Project: Apache NiFi
> Issue Type: Sub-task
> Reporter: endzeit
> Assignee: endzeit
> Priority: Major
> Fix For: 2.0.0-M3
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> While working on NIFI-12982 I noticed that the JavaDoc of {{ProcessSession}}
> has some minor typos and documentation drifts between method overloads.
> The goal of this ticket is to make the JavaDoc for the current
> {{ProcessSession}} specification more consistent. The specified contract must
> not be altered.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Commented] (NIFI-12969) Under heavy load, nifi node unable to rejoin cluster, graph modified with temp funnel
[ https://issues.apache.org/jira/browse/NIFI-12969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17833738#comment-17833738 ]

Mark Payne commented on NIFI-12969:
-----------------------------------

[~Nissim Shiman] [~pgyori] I pushed a PR that appears to address the issue. I believe you're on the right track, that the situation is caused by the fact that the temp funnel was incorrectly used. But instead of trying to detect when it's going to happen and/or roll back, the issue is that we had a bug in the logic for when the temp funnel was created. In this case, there should never be a temp funnel. In cases where we DO need a temp funnel, the existing logic should handle stopping the Port, which would make this work smoothly. The issue arose here because the Port was (rightly) left running. We just need to avoid creating the temp funnel unnecessarily.

> Under heavy load, nifi node unable to rejoin cluster, graph modified with temp funnel
> --------------------------------------------------------------------------------------
>
> Key: NIFI-12969
> URL: https://issues.apache.org/jira/browse/NIFI-12969
> Project: Apache NiFi
> Issue Type: Bug
> Affects Versions: 1.24.0, 2.0.0-M2
> Reporter: Nissim Shiman
> Assignee: Mark Payne
> Priority: Critical
> Fix For: 2.0.0-M3, 1.26.0
>
> Attachments: nifi-app.log, simple_flow.png, simple_flow_with_temp-funnel.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Under heavy load, if a node leaves the cluster (due to heartbeat timeout),
> it is often unable to rejoin the cluster.
> The node's graph will have been modified with a temp-funnel as well.
> Appears to be some sort of [timing
> issue|https://github.com/apache/nifi/blob/rel/nifi-2.0.0-M2/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-components/src/main/java/org/apache/nifi/connectable/StandardConnection.java#L298]
> # To reproduce, on a nifi cluster of three nodes, set up:
> 2 GenerateFlowFile processors -> PG
> Inside PG:
> inputPort -> UpdateAttribute
> # Keep all defaults except for the following:
> For UpdateAttribute terminate the success relationship
> One of the GenerateFlowFile processors can be disabled,
> the other one should have Run Schedule set to 0 min (this will allow for the
> heavy load)
> # In nifi.properties (on all 3 nodes), to allow for nodes to fall out of the
> cluster, set:
> nifi.cluster.protocol.heartbeat.interval=2 sec (default is 5)
> nifi.cluster.protocol.heartbeat.missable.max=1 (default is 8)
> Restart nifi. Start flow. The nodes will quickly fall out of and rejoin the cluster.
> After a few minutes one will likely not be able to rejoin. The graph for
> that node will have the disabled GenerateFlowFile now pointing to a funnel (a
> temp-funnel) instead of the PG.
> The stack trace in that node's nifi-app.log will look like this (this is from
> 2.0.0-M2):
> {code:java}
> 2024-03-28 13:55:19,395 INFO [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Node disconnected due to Failed to properly handle Reconnection request due to org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption.
> 2024-03-28 13:55:19,395 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption.
> org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption.
> at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:985)
> at org.apache.nifi.controller.StandardFlowService.handleReconnectionRequest(StandardFlowService.java:655)
> at org.apache.nifi.controller.StandardFlowService$1.run(StandardFlowService.java:384)
> at java.base/java.lang.Thread.run(Thread.java:1583)
> Caused by: org.apache.nifi.controller.serialization.FlowSynchronizationException: java.lang.IllegalStateException: Cannot change destination of Connection because FlowFiles from this Connection are currently held by LocalPort[id=99213c00-78ca-4848-112f-5454cc20656b, type=INPUT_PORT, name=inputPort, group=innerPG]
> at org.apache.nifi.controller.serialization.VersionedFlowSynchronizer.synchronizeFlow(VersionedFlowSynchronizer.java:472)
> at
[jira] [Updated] (NIFI-12969) Under heavy load, nifi node unable to rejoin cluster, graph modified with temp funnel
[ https://issues.apache.org/jira/browse/NIFI-12969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne updated NIFI-12969:
------------------------------
    Assignee: Mark Payne  (was: Nissim Shiman)
    Status: Patch Available  (was: Open)

> Under heavy load, nifi node unable to rejoin cluster, graph modified with temp funnel
> --------------------------------------------------------------------------------------
>
> Key: NIFI-12969
> URL: https://issues.apache.org/jira/browse/NIFI-12969
> Project: Apache NiFi
> Issue Type: Bug
> Affects Versions: 2.0.0-M2, 1.24.0
> Reporter: Nissim Shiman
> Assignee: Mark Payne
> Priority: Critical
> Fix For: 2.0.0-M3, 1.26.0
>
> Attachments: nifi-app.log, simple_flow.png, simple_flow_with_temp-funnel.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Under heavy load, if a node leaves the cluster (due to heartbeat timeout),
> it is often unable to rejoin the cluster.
> The node's graph will have been modified with a temp-funnel as well.
> Appears to be some sort of [timing
> issue|https://github.com/apache/nifi/blob/rel/nifi-2.0.0-M2/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-components/src/main/java/org/apache/nifi/connectable/StandardConnection.java#L298]
> # To reproduce, on a nifi cluster of three nodes, set up:
> 2 GenerateFlowFile processors -> PG
> Inside PG:
> inputPort -> UpdateAttribute
> # Keep all defaults except for the following:
> For UpdateAttribute terminate the success relationship
> One of the GenerateFlowFile processors can be disabled,
> the other one should have Run Schedule set to 0 min (this will allow for the
> heavy load)
> # In nifi.properties (on all 3 nodes), to allow for nodes to fall out of the
> cluster, set:
> nifi.cluster.protocol.heartbeat.interval=2 sec (default is 5)
> nifi.cluster.protocol.heartbeat.missable.max=1 (default is 8)
> Restart nifi. Start flow. The nodes will quickly fall out of and rejoin the cluster.
> After a few minutes one will likely not be able to rejoin. The graph for
> that node will have the disabled GenerateFlowFile now pointing to a funnel (a
> temp-funnel) instead of the PG.
> The stack trace in that node's nifi-app.log will look like this (this is from
> 2.0.0-M2):
> {code:java}
> 2024-03-28 13:55:19,395 INFO [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Node disconnected due to Failed to properly handle Reconnection request due to org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption.
> 2024-03-28 13:55:19,395 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption.
> org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption.
> at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:985)
> at org.apache.nifi.controller.StandardFlowService.handleReconnectionRequest(StandardFlowService.java:655)
> at org.apache.nifi.controller.StandardFlowService$1.run(StandardFlowService.java:384)
> at java.base/java.lang.Thread.run(Thread.java:1583)
> Caused by: org.apache.nifi.controller.serialization.FlowSynchronizationException: java.lang.IllegalStateException: Cannot change destination of Connection because FlowFiles from this Connection are currently held by LocalPort[id=99213c00-78ca-4848-112f-5454cc20656b, type=INPUT_PORT, name=inputPort, group=innerPG]
> at org.apache.nifi.controller.serialization.VersionedFlowSynchronizer.synchronizeFlow(VersionedFlowSynchronizer.java:472)
> at org.apache.nifi.controller.serialization.VersionedFlowSynchronizer.sync(VersionedFlowSynchronizer.java:223)
> at org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1740)
> at org.apache.nifi.persistence.StandardFlowConfigurationDAO.load(StandardFlowConfigurationDAO.java:91)
> at org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:805)
> at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:954)
> ... 3 common frames omitted
> Caused by: java.lang.IllegalStateException: Cannot change destination of Connection
[jira] [Updated] (NIFI-12934) RenameRecordField does not clear serialized form of records
[ https://issues.apache.org/jira/browse/NIFI-12934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne updated NIFI-12934:
------------------------------
    Fix Version/s: 2.0.0-M3

> RenameRecordField does not clear serialized form of records
> ------------------------------------------------------------
>
> Key: NIFI-12934
> URL: https://issues.apache.org/jira/browse/NIFI-12934
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Mark Payne
> Priority: Critical
> Fix For: 2.0.0-M3
>
>
> When RenameRecordField runs, it updates the parent record for any matches.
> However, when this happens, the Record's "serialized form" does not get
> cleared. As a result, the Record Writer may (depending on its configuration)
> write out the 'cached' / serialized form of the Record, resulting in no
> change to the records.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Updated] (NIFI-12934) RenameRecordField does not clear serialized form of records
[ https://issues.apache.org/jira/browse/NIFI-12934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne updated NIFI-12934:
------------------------------
    Priority: Critical  (was: Major)

> RenameRecordField does not clear serialized form of records
> ------------------------------------------------------------
>
> Key: NIFI-12934
> URL: https://issues.apache.org/jira/browse/NIFI-12934
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Mark Payne
> Priority: Critical
>
> When RenameRecordField runs, it updates the parent record for any matches.
> However, when this happens, the Record's "serialized form" does not get
> cleared. As a result, the Record Writer may (depending on its configuration)
> write out the 'cached' / serialized form of the Record, resulting in no
> change to the records.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Updated] (NIFI-12959) Support loading Python processors from NARs
[ https://issues.apache.org/jira/browse/NIFI-12959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne updated NIFI-12959:
------------------------------
    Status: Patch Available  (was: Open)

> Support loading Python processors from NARs
> -------------------------------------------
>
> Key: NIFI-12959
> URL: https://issues.apache.org/jira/browse/NIFI-12959
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 2.0.0-M3
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently, third-party dependencies for Python Processors can be handled in
> two ways. Either they can be declared as dependencies in a Processor itself;
> or the Processor can be in a module where a {{requirements.txt}} dictates the
> requirements. These can be very helpful for developing Python-based
> Processors.
> However, in production environments, it is not uncommon to see environments
> where {{pip}} is not installed. There is an inherent risk in allowing remote
> code to be downloaded in an ad-hoc manner like this, without any sort of
> vulnerability scanning, etc.
> As such, we should allow users to also package Python packages in NiFi's
> native archiving format (NARs).
> The package structure should be as follows:
> {code:java}
> my-nar.nar
> +-- META-INF/
>     +-- MANIFEST.MF
> +-- NAR-INF/
>     +-- bundled-dependencies/
>         +-- dependency1
>         +-- dependency2
>         +-- etc.
> +-- MyProcessor.py{code}
> Where {{MyProcessor.py}} could also be a Python module / directory.
> In this way, we allow a Python Processor to be packaged up with its
> third-party dependencies and dropped in the lib/ directory (or extensions
> directory) of a NiFi installation in the same way that a Java processor would
> be.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Created] (NIFI-12959) Support loading Python processors from NARs
Mark Payne created NIFI-12959:
---------------------------------

    Summary: Support loading Python processors from NARs
    Key: NIFI-12959
    URL: https://issues.apache.org/jira/browse/NIFI-12959
    Project: Apache NiFi
    Issue Type: Improvement
    Components: Core Framework
    Reporter: Mark Payne
    Assignee: Mark Payne
    Fix For: 2.0.0-M3

Currently, third-party dependencies for Python Processors can be handled in two ways. Either they can be declared as dependencies in a Processor itself; or the Processor can be in a module where a {{requirements.txt}} dictates the requirements. These can be very helpful for developing Python-based Processors.

However, in production environments, it is not uncommon to see environments where {{pip}} is not installed. There is an inherent risk in allowing remote code to be downloaded in an ad-hoc manner like this, without any sort of vulnerability scanning, etc.

As such, we should allow users to also package Python packages in NiFi's native archiving format (NARs).

The package structure should be as follows:
{code:java}
my-nar.nar
+-- META-INF/
    +-- MANIFEST.MF
+-- NAR-INF/
    +-- bundled-dependencies/
        +-- dependency1
        +-- dependency2
        +-- etc.
+-- MyProcessor.py{code}
Where {{MyProcessor.py}} could also be a Python module / directory.

In this way, we allow a Python Processor to be packaged up with its third-party dependencies and dropped in the lib/ directory (or extensions directory) of a NiFi installation in the same way that a Java processor would be.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Created] (NIFI-12934) RenameRecordField does not clear serialized form of records
Mark Payne created NIFI-12934: - Summary: RenameRecordField does not clear serialized form of records Key: NIFI-12934 URL: https://issues.apache.org/jira/browse/NIFI-12934 Project: Apache NiFi Issue Type: Bug Components: Extensions Reporter: Mark Payne When RenameRecordField runs, it updates the parent record for any matches. However, when this happens, the Record's "serialized form" does not get cleared. As a result, the Record Writer may (depending on its configuration) write out the 'cached' / serialized form of the Record, resulting in no change to the records. -- This message was sent by Atlassian Jira (v8.20.10#820010)
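The failure mode described in NIFI-12934 can be pictured with a toy model. This is a sketch under stated assumptions, not NiFi's Record implementation: {{CachedRecord}}, its methods, and the {{clear_cache}} flag are all hypothetical names chosen for illustration.

```python
import json


class CachedRecord:
    """Toy record that remembers the serialized text it was read from."""

    def __init__(self, values, serialized_form=None):
        self._values = dict(values)
        # A writer may reuse this text instead of re-serializing the values.
        self._serialized_form = serialized_form

    def rename_field(self, old_name, new_name, clear_cache=True):
        if old_name not in self._values:
            return
        # Rebuild the mapping so the renamed field keeps its position.
        self._values = {
            (new_name if key == old_name else key): value
            for key, value in self._values.items()
        }
        if clear_cache:
            # The cached text no longer matches the values, so drop it.
            self._serialized_form = None

    def write(self):
        # Prefer the cached form when present, as a writer might be configured to do.
        if self._serialized_form is not None:
            return self._serialized_form
        return json.dumps(self._values)
```

With {{clear_cache=False}} the writer still emits the original text after a rename, which mirrors the "no change to the records" symptom; clearing the cache forces the values to be re-serialized.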
[jira] [Commented] (NIFI-12897) Allow users to upload files/resources to nifi for use in the dataflow
[ https://issues.apache.org/jira/browse/NIFI-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828101#comment-17828101 ] Mark Payne commented on NIFI-12897: --- [~pvillard] probably best to use comments in the feature proposal. > Allow users to upload files/resources to nifi for use in the dataflow > - > > Key: NIFI-12897 > URL: https://issues.apache.org/jira/browse/NIFI-12897 > Project: Apache NiFi > Issue Type: Epic > Components: Core Framework, Core UI >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > > A common feature request that we receive is to make it easier to upload a > resource file, such as JDBC Driver JAR to nifi so that all of the nodes > receive the file. This epic is meant to capture that request -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (NIFI-12897) Allow users to upload files/resources to nifi for use in the dataflow
[ https://issues.apache.org/jira/browse/NIFI-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828053#comment-17828053 ] Mark Payne commented on NIFI-12897: --- [~pvillard] I pushed up a Feature Proposal: [https://cwiki.apache.org/confluence/display/NIFI/Asset+Management] > Allow users to upload files/resources to nifi for use in the dataflow > - > > Key: NIFI-12897 > URL: https://issues.apache.org/jira/browse/NIFI-12897 > Project: Apache NiFi > Issue Type: Epic > Components: Core Framework, Core UI >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > > A common feature request that we receive is to make it easier to upload a > resource file, such as JDBC Driver JAR to nifi so that all of the nodes > receive the file. This epic is meant to capture that request -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (NIFI-12897) Allow users to upload files/resources to nifi for use in the dataflow
[ https://issues.apache.org/jira/browse/NIFI-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827195#comment-17827195 ] Mark Payne commented on NIFI-12897: --- [~pvillard] not at the moment. I will plan to put together a Feature Proposal in the next few days. > Allow users to upload files/resources to nifi for use in the dataflow > - > > Key: NIFI-12897 > URL: https://issues.apache.org/jira/browse/NIFI-12897 > Project: Apache NiFi > Issue Type: Epic > Components: Core Framework, Core UI >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > > A common feature request that we receive is to make it easier to upload a > resource file, such as JDBC Driver JAR to nifi so that all of the nodes > receive the file. This epic is meant to capture that request -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12897) Allow users to upload files/resources to nifi for use in the dataflow
Mark Payne created NIFI-12897: - Summary: Allow users to upload files/resources to nifi for use in the dataflow Key: NIFI-12897 URL: https://issues.apache.org/jira/browse/NIFI-12897 Project: Apache NiFi Issue Type: Epic Components: Core Framework, Core UI Reporter: Mark Payne Assignee: Mark Payne A common feature request that we receive is to make it easier to upload a resource file, such as JDBC Driver JAR to nifi so that all of the nodes receive the file. This epic is meant to capture that request -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12899) UI - Allow user to choose assets to upload for a given parameter
Mark Payne created NIFI-12899: - Summary: UI - Allow user to choose assets to upload for a given parameter Key: NIFI-12899 URL: https://issues.apache.org/jira/browse/NIFI-12899 Project: Apache NiFi Issue Type: New Feature Components: Core UI Reporter: Mark Payne -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12898) Backend - Allow uploading asset and referencing via Parameter
Mark Payne created NIFI-12898: - Summary: Backend - Allow uploading asset and referencing via Parameter Key: NIFI-12898 URL: https://issues.apache.org/jira/browse/NIFI-12898 Project: Apache NiFi Issue Type: New Feature Components: Core Framework Reporter: Mark Payne Assignee: Mark Payne -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (NIFI-11446) Better handling for cases where Python process dies
[ https://issues.apache.org/jira/browse/NIFI-11446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne resolved NIFI-11446. --- Fix Version/s: 2.0.0 Assignee: Mark Payne Resolution: Fixed > Better handling for cases where Python process dies > --- > > Key: NIFI-11446 > URL: https://issues.apache.org/jira/browse/NIFI-11446 > Project: Apache NiFi > Issue Type: Sub-task > Components: Core Framework >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > Fix For: 2.0.0 > > > If a Python process dies, we need the ability to detect this, re-launch the > Process, and recreate the Processors that are a part of the Process, and then > restore the Processors' configuration and enable/start them. Essentially, if > the Python process dies, the framework should spawn a new process and allow > the Processor to keep running. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (NIFI-11444) Improve FlowFileTransform to allow returning a String for the content instead of byte[]
[ https://issues.apache.org/jira/browse/NIFI-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne resolved NIFI-11444. --- Fix Version/s: 2.0.0 Resolution: Fixed > Improve FlowFileTransform to allow returning a String for the content instead > of byte[] > --- > > Key: NIFI-11444 > URL: https://issues.apache.org/jira/browse/NIFI-11444 > Project: Apache NiFi > Issue Type: Sub-task > Components: Core Framework >Reporter: Mark Payne >Priority: Major > Fix For: 2.0.0 > > > The FlowFileTransform Python class returns a FlowFileTransformResult. If the > contents are to be returned, they must be provided as a byte[]. But we should > also allow providing the contents as a String and deal with the conversion > behind the scenes, in order to provide a simpler API. -- This message was sent by Atlassian Jira (v8.20.10#820010)
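The idea in NIFI-11444 of accepting either text or bytes and converting behind the scenes might look like the sketch below. {{TransformResult}} is a hypothetical stand-in, not the real FlowFileTransformResult class, and the UTF-8 default is an assumption made for this example.

```python
class TransformResult:
    """Hypothetical result object that accepts str or bytes content."""

    def __init__(self, relationship, contents=None, attributes=None, encoding="utf-8"):
        if isinstance(contents, str):
            # Convert text to bytes so callers do not have to.
            contents = contents.encode(encoding)
        elif contents is not None and not isinstance(contents, (bytes, bytearray)):
            raise TypeError("contents must be str, bytes, bytearray, or None")
        self.relationship = relationship
        # Normalize to immutable bytes; None means "leave the content unchanged".
        self.contents = None if contents is None else bytes(contents)
        self.attributes = dict(attributes or {})
```

Doing the conversion in the constructor keeps every downstream consumer working with a single type, so the simpler API does not leak into the rest of the framework.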
[jira] [Updated] (NIFI-11443) Setup proper logging for Python framework
[ https://issues.apache.org/jira/browse/NIFI-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-11443: -- Fix Version/s: 2.0.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Setup proper logging for Python framework > - > > Key: NIFI-11443 > URL: https://issues.apache.org/jira/browse/NIFI-11443 > Project: Apache NiFi > Issue Type: Sub-task > Components: Core Framework >Reporter: Mark Payne >Assignee: David Handermann >Priority: Major > Fix For: 2.0.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Currently the python framework establishes logging to logs/nifi-python.log > (directory configured in nifi.properties). But we need to establish proper > logging with log file rotation, etc. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-12498) The Prioritization description in the User Guide is different from the actual source code implementation.
[ https://issues.apache.org/jira/browse/NIFI-12498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne updated NIFI-12498:
--
Fix Version/s: 2.0.0
Resolution: Fixed
Status: Resolved (was: Patch Available)

> The Prioritization description in the User Guide is different from the actual source code implementation.
> -
>
> Key: NIFI-12498
> URL: https://issues.apache.org/jira/browse/NIFI-12498
> Project: Apache NiFi
> Issue Type: Bug
> Components: Documentation Website
> Affects Versions: 1.25.0, 2.0.0-M2
> Reporter: Doin Cha
> Assignee: endzeit
> Priority: Minor
> Fix For: 2.0.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> In the prioritization explanation of the User Guide, it is stated that *OldestFlowFileFirstPrioritizer* is the _"default scheme that is used if no prioritizers are selected."_
> ([https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#prioritization])
>
> However, in the actual source code implementation, *there is no automatic default setting when prioritizers are not selected.* In such cases, the sorting is done by comparing the *ContentClaim* of FlowFiles.
> ([https://github.com/apache/nifi/blob/9a5ec83baa1b3593031f0917659a69e7a36bb0be/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/queue/QueuePrioritizer.java#L39-L90])
>
> It looks like the user guide needs to be revised.
[jira] [Reopened] (NIFI-12740) Python Processors sometimes stuck in invalid state: 'Initializing runtime environment'
[ https://issues.apache.org/jira/browse/NIFI-12740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne reopened NIFI-12740: --- Re-opening issue. While the fix greatly reduced the chances of this happening, I did encounter the issue again. So not all cases are handled correctly. > Python Processors sometimes stuck in invalid state: 'Initializing runtime > environment' > -- > > Key: NIFI-12740 > URL: https://issues.apache.org/jira/browse/NIFI-12740 > Project: Apache NiFi > Issue Type: Bug > Components: Core Framework >Affects Versions: 2.0.0-M1, 2.0.0-M2 >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Blocker > Fix For: 2.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > When creating a Python processor, sometimes the Processor remains in an > invalid state with the message "Initializing runtime environment" > In the logs, we see the following error/stack trace: > {code:java} > 2024-02-05 17:23:30,308 ERROR [Initialize SetRecordField] > org.apache.nifi.NiFi An Unknown Error Occurred in Thread > VirtualThread[#123,Initialize > SetRecordField]/runnable@ForkJoinPool-1-worker-5: > java.lang.NullPointerException: Cannot invoke "java.util.List.stream()" > because "processorTypes" is null > java.lang.NullPointerException: Cannot invoke "java.util.List.stream()" > because "processorTypes" is null > at > org.apache.nifi.py4j.StandardPythonBridge.findExtensionId(StandardPythonBridge.java:322) > at > org.apache.nifi.py4j.StandardPythonBridge.createProcessorBridge(StandardPythonBridge.java:99) > at > org.apache.nifi.py4j.StandardPythonBridge.lambda$createProcessor$3(StandardPythonBridge.java:142) > at > org.apache.nifi.python.processor.PythonProcessorProxy.lambda$new$0(PythonProcessorProxy.java:73) > at java.base/java.lang.VirtualThread.run(VirtualThread.java:309) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (NIFI-12841) Introduce RemoveXYZ type of processors
[ https://issues.apache.org/jira/browse/NIFI-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820810#comment-17820810 ] Mark Payne commented on NIFI-12841: --- [~EndzeitBegins] I don't have a particular problem with it, if there's a use case where it's needed. Typically, though, you'd only want to delete a file on the local file system after processing it, and the FetchFile / GetFile have strategies to handle that already once the file has been ingested. > Introduce RemoveXYZ type of processors > -- > > Key: NIFI-12841 > URL: https://issues.apache.org/jira/browse/NIFI-12841 > Project: Apache NiFi > Issue Type: Improvement >Reporter: endzeit >Priority: Minor > > There is the notion of "families" or "types" of processors in the standard > distribution of NiFi. > Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, > and {{PutXYZ}}. > The following examples will be based on files on the local filesystem. > However, the same principle applies to other types of resources, e.g. files > on a SFTP server. > The existing {{GetFile}} and {{FetchFile}} processors support the removal of > the resource from the source after successful transfer into the content of a > FlowFile. > However, in some scenarios it might be undesired to remove the resource until > it has been processed successfully and the transformation result be stored, > e.g. to a secure network storage. > This cannot be achieved with a {{GetXYZ}} or {{FetchXYZ}} processor on its > own. > As of now, one of the scripting processors or even a full-fledged custom > processor can be used to achieve this. > However, these might get relatively involved due to session handling or other > concerns. > This issue proposes the introduction of an additional such processor "type", > namely {{RemoveXYZ}} which removes a resource. 
> The base processor should have two properties, namely {{path}} and > {{filename}}, by default retrieving their values from the respective core > FlowFile attributes. Implementations may add protocol specific properties, > e.g. for authentication. > There should be three outgoing relationships at least: > - "success" for FlowFiles, where the resource was removed from the source, > - "not exists" for FlowFiles, where the resource did (no longer) exist on the > source, > - "failure" for FlowFiles, where the resource couldn't be removed from the > source, e.g. due to network errors or missing permissions. > An initial implementation should provide {{RemoveXYZ}} for one of the > existing resources types, e.g. File, FTP, SFTP... -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (NIFI-12841) Introduce RemoveXYZ type of processors
[ https://issues.apache.org/jira/browse/NIFI-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820514#comment-17820514 ]

Mark Payne commented on NIFI-12841:
---
Hey [~EndzeitBegins] thanks for reaching out about this. In general, the naming convention used in NiFi for such a thing is DeleteXYZ, rather than RemoveXYZ. Several of these components exist. For example: DeleteMongo, DeleteHDFS, DeleteDynamoDB, DeleteSQS, DeleteS3Object, DeleteGCSObject.

It is important to note that these Processors should not share a base class. There is no significant code reuse that would be gained by sharing a base class, but doing so would constrain the extensibility of the Processors. For example, DeleteS3Object is likely to extend from an AbstractS3Processor, etc.

Typically, the Processor will have both a "success" and a "failure" relationship. As for a "does not exist" relationship, it depends on the Processor. Some Processors may provide such a relationship while others do not. Each Processor should document how it behaves in that condition: whether there is a specific relationship, or the FlowFile goes to failure (because it failed to delete the file), or the FlowFile goes to success (because the file no longer exists), etc. It would be good to ensure that we are consistent, but given that several Delete* Processors already exist, it may not make sense to start changing the behavior. It would take some investigation there.

It is also important to note that detecting whether or not a given file exists may have significant performance implications. For something like an SQS Processor, it may be expensive to make a request for every single message to detect whether or not it exists.
> Introduce RemoveXYZ type of processors > -- > > Key: NIFI-12841 > URL: https://issues.apache.org/jira/browse/NIFI-12841 > Project: Apache NiFi > Issue Type: Improvement >Reporter: endzeit >Priority: Minor > > There is the notion of "families" or "types" of processors in the standard > distribution of NiFi. > Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, > and {{PutXYZ}}. > The following examples will be based on files on the local filesystem. > However, the same principle applies to other types of resources, e.g. files > on a SFTP server. > The existing {{GetFile}} and {{FetchFile}} processors support the removal of > the resource from the source after successful transfer into the content of a > FlowFile. > However, in some scenarios it might be undesired to remove the resource until > it has been processed successfully and the transformation result be stored, > e.g. to a secure network storage. > This cannot be achieved with a {{GetXYZ}} or {{FetchXYZ}} processor on its > own. > As of now, one of the scripting processors or even a full-fledged custom > processor can be used to achieve this. > However, these might get relatively involved due to session handling or other > concerns. > This issue proposes the introduction of an additional such processor "type", > namely {{RemoveXYZ}} which removes a resource. > The base processor should have two properties, namely {{path}} and > {{filename}}, by default retrieving their values from the respective core > FlowFile attributes. Implementations may add protocol specific properties, > e.g. for authentication. > There should be three outgoing relationships at least: > - "success" for FlowFiles, where the resource was removed from the source, > - "not exists" for FlowFiles, where the resource did (no longer) exist on the > source, > - "failure" for FlowFiles, where the resource couldn't be removed from the > source, e.g. due to network errors or missing permissions. 
> An initial implementation should provide {{RemoveXYZ}} for one of the > existing resources types, e.g. File, FTP, SFTP... -- This message was sent by Atlassian Jira (v8.20.10#820010)
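The three outcomes proposed in the description ("success", "not exists", "failure") can be sketched for the local-filesystem case as below. This is only an illustration of the routing idea, not a NiFi processor: {{remove_file}} and the {{REL_*}} constants are hypothetical names, and a real implementation would route a FlowFile to a relationship instead of returning a string.

```python
import os
import tempfile

# Relationship names taken from the issue description.
REL_SUCCESS = "success"
REL_NOT_EXISTS = "not exists"
REL_FAILURE = "failure"


def remove_file(directory, filename):
    """Delete directory/filename and report which relationship applies."""
    target = os.path.join(directory, filename)
    try:
        os.remove(target)
    except FileNotFoundError:
        # The resource did not (or no longer) exist on the source.
        return REL_NOT_EXISTS
    except OSError:
        # Any other OS-level problem, e.g. missing permissions.
        return REL_FAILURE
    return REL_SUCCESS
```

Attempting the delete and classifying the resulting error avoids a separate existence check before each removal, which matters for sources where such a check is an extra round trip.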
[jira] [Updated] (NIFI-12834) ConsumeSlack throwing NullPointerException
[ https://issues.apache.org/jira/browse/NIFI-12834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12834: -- Status: Patch Available (was: Open) > ConsumeSlack throwing NullPointerException > -- > > Key: NIFI-12834 > URL: https://issues.apache.org/jira/browse/NIFI-12834 > Project: Apache NiFi > Issue Type: Bug > Components: Extensions >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > Fix For: 2.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > I have an instance of ConsumeSlack that is throwing NullPointerExceptions: > ``` > 2024-02-22 21:33:08,239 ERROR [Timer-Driven Process Thread-6] > o.a.nifi.processors.slack.ConsumeSlack > ConsumeSlack[id=55d6ad46-018d-1000--24aa6dbf] Failed to retrieve > messages > java.lang.NullPointerException: Cannot invoke "String.split(String)" because > "value" is null > at > org.apache.nifi.processors.slack.consume.SlackTimestamp.(SlackTimestamp.java:42) > at > org.apache.nifi.processors.slack.consume.ConsumeChannel.fetchReplies(ConsumeChannel.java:637) > at > org.apache.nifi.processors.slack.consume.ConsumeChannel.consumeMessages(ConsumeChannel.java:493) > at > org.apache.nifi.processors.slack.consume.ConsumeChannel.consumeReplies(ConsumeChannel.java:268) > at > org.apache.nifi.processors.slack.consume.ConsumeChannel.consume(ConsumeChannel.java:173) > at > org.apache.nifi.processors.slack.ConsumeSlack.onTrigger(ConsumeSlack.java:346) > at > org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) > at > org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1274) > at > org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:244) > at > org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102) > at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110) > at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown > Source) > at 
java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source) > at > java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown > Source) > at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown > Source) > at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown > Source) > at java.base/java.lang.Thread.run(Unknown Source) > ``` -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12834) ConsumeSlack throwing NullPointerException
Mark Payne created NIFI-12834: - Summary: ConsumeSlack throwing NullPointerException Key: NIFI-12834 URL: https://issues.apache.org/jira/browse/NIFI-12834 Project: Apache NiFi Issue Type: Bug Components: Extensions Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0 I have an instance of ConsumeSlack that is throwing NullPointerExceptions: ``` 2024-02-22 21:33:08,239 ERROR [Timer-Driven Process Thread-6] o.a.nifi.processors.slack.ConsumeSlack ConsumeSlack[id=55d6ad46-018d-1000--24aa6dbf] Failed to retrieve messages java.lang.NullPointerException: Cannot invoke "String.split(String)" because "value" is null at org.apache.nifi.processors.slack.consume.SlackTimestamp.(SlackTimestamp.java:42) at org.apache.nifi.processors.slack.consume.ConsumeChannel.fetchReplies(ConsumeChannel.java:637) at org.apache.nifi.processors.slack.consume.ConsumeChannel.consumeMessages(ConsumeChannel.java:493) at org.apache.nifi.processors.slack.consume.ConsumeChannel.consumeReplies(ConsumeChannel.java:268) at org.apache.nifi.processors.slack.consume.ConsumeChannel.consume(ConsumeChannel.java:173) at org.apache.nifi.processors.slack.ConsumeSlack.onTrigger(ConsumeSlack.java:346) at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1274) at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:244) at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102) at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source) at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.base/java.lang.Thread.run(Unknown Source) ``` -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12832) Cleanup nifi-mock dependencies
Mark Payne created NIFI-12832:
-
Summary: Cleanup nifi-mock dependencies
Key: NIFI-12832
URL: https://issues.apache.org/jira/browse/NIFI-12832
Project: Apache NiFi
Issue Type: Improvement
Components: Core Framework, Extensions
Reporter: Mark Payne
Assignee: Mark Payne
Fix For: 2.0.0

We have allowed quite a few dependencies to creep into the nifi-mock module. It now depends on nifi-utils, nifi-framework-api, and nifi-parameter. These are not modules that the mock framework should depend on. We should ensure that we keep this module lean and clean.

I suspect removing these dependencies from the mock framework will have a trickle-down effect: most modules depend on this module, so removing these dependencies will likely require updates to modules that use them as transitive dependencies.

It appears that nifi-parameter is not even used, even though it's a dependency.

There are two classes in nifi-utils that are in use: CoreAttributes and StandardValidators. But I argue these really should move to nifi-api, as they are APIs that are widely used and for which we will guarantee backward compatibility.

Additionally, StandardValidators depends on FormatUtils. While we don't want to bring FormatUtils into nifi-api, we should introduce a new TimeFormat class in nifi-api that is responsible for parsing things like the durations that our extensions use ("5 mins", etc.). This makes it simpler to build "framework-level extensions" and allows for a cleaner implementation of NiFiProperties in the future. FormatUtils should then make use of this class.
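To make the duration-parsing responsibility concrete, here is a minimal sketch of what parsing a value such as "5 mins" involves. It is not the proposed TimeFormat class: the function name {{parse_duration_millis}} and the list of accepted unit spellings are assumptions for this example, and NiFi's actual set of units may differ.

```python
import re

# Assumed unit spellings, grouped by their length in milliseconds.
_UNITS_IN_MILLIS = {
    ("ms", "milli", "millis", "millisecond", "milliseconds"): 1,
    ("s", "sec", "secs", "second", "seconds"): 1_000,
    ("m", "min", "mins", "minute", "minutes"): 60_000,
    ("h", "hr", "hrs", "hour", "hours"): 3_600_000,
    ("d", "day", "days"): 86_400_000,
}
_MILLIS_BY_UNIT = {
    name: factor for names, factor in _UNITS_IN_MILLIS.items() for name in names
}
# A number (optionally decimal), optional whitespace, then a unit word.
_DURATION = re.compile(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*")


def parse_duration_millis(text):
    """Parse strings like '5 mins' or '30sec' into whole milliseconds."""
    match = _DURATION.fullmatch(text)
    if match is None:
        raise ValueError(f"Not a duration: {text!r}")
    amount, unit = match.groups()
    factor = _MILLIS_BY_UNIT.get(unit.lower())
    if factor is None:
        raise ValueError(f"Unknown time unit: {unit!r}")
    return round(float(amount) * factor)
```

Keeping the parser free of other utilities is the point of the proposal: a class this small can live in an API module without dragging further dependencies along.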
[jira] [Updated] (NIFI-12232) Frequent "failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption"
[ https://issues.apache.org/jira/browse/NIFI-12232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne updated NIFI-12232:
--
Assignee: Mark Payne
Status: Patch Available (was: Open)

> Frequent "failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption"
> -
>
> Key: NIFI-12232
> URL: https://issues.apache.org/jira/browse/NIFI-12232
> Project: Apache NiFi
> Issue Type: Bug
> Components: Configuration Management
> Affects Versions: 1.23.2
> Reporter: John Joseph
> Assignee: Mark Payne
> Priority: Major
> Attachments: image-2023-10-16-16-12-31-027.png, image-2024-02-14-13-33-44-354.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> This is an issue that we have been observing in the 1.23.2 version of NiFi when we try to upgrade.
> Since rolling upgrades are not supported in NiFi, we scale out the revision that is running and {_}run a helm upgrade{_}.
> We have NiFi running in k8s cluster mode, and there is a post job that calls the Tenants and Policies API. A successful run looks like this:
> {code:java}
> set_policies() Action: 'read' Resource: '/flow' entity_id: 'ad2d3ad6-5d69-3e0f-95e9-c7feb36e2de5' entity_name: 'CN=nifi-api-admin' entity_type: 'USER'
> set_policies() status: '200'
> 'read' '/flow' policy already exists. It will be updated...
> set_policies() fetching policy inside -eq 200 status: '200'
> set_policies() after update PUT: '200'
> set_policies() Action: 'read' Resource: '/tenants' entity_id: 'ad2d3ad6-5d69-3e0f-95e9-c7feb36e2de5' entity_name: 'CN=nifi-api-admin' entity_type: 'USER'
> set_policies() status: '200'{code}
> *_This job was running fine in 1.23.0, 1.22 and other previous versions._* In {*}{{1.23.2}}{*}, we are noticing that the job fails very frequently with the following error logs:
> {code:java}
> set_policies() Action: 'read' Resource: '/flow' entity_id: 'ad2d3ad6-5d69-3e0f-95e9-c7feb36e2de5' entity_name: 'CN=nifi-api-admin' entity_type: 'USER'
> set_policies() status: '200'
> 'read' '/flow' policy already exists. It will be updated...
> set_policies() fetching policy inside -eq 200 status: '200'
> set_policies() after update PUT: '400'
> An error occurred getting 'read' '/flow' policy: 'This node is disconnected from its configured cluster. The requested change will only be allowed if the flag to acknowledge the disconnected node is set.'{code}
> {{_*'This node is disconnected from its configured cluster. The requested change will only be allowed if the flag to acknowledge the disconnected node is set.'*_}}
> The job is configured to run only after all the pods are up and running. Although the pods are up, we see the following exception inside the pods:
> {code:java}
> org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption.
> at > org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:1059) > at > org.apache.nifi.controller.StandardFlowService.handleReconnectionRequest(StandardFlowService.java:667) > at > org.apache.nifi.controller.StandardFlowService.access$200(StandardFlowService.java:107) > at > org.apache.nifi.controller.StandardFlowService$1.run(StandardFlowService.java:396) > at java.base/java.lang.Thread.run(Thread.java:833) > Caused by: > org.apache.nifi.controller.serialization.FlowSynchronizationException: > java.lang.IllegalStateException: Cannot change destination of Connection > because the current destination is running > at > org.apache.nifi.controller.serialization.VersionedFlowSynchronizer.synchronizeFlow(VersionedFlowSynchronizer.java:448) > at > org.apache.nifi.controller.serialization.VersionedFlowSynchronizer.sync(VersionedFlowSynchronizer.java:206) > at > org.apache.nifi.controller.serialization.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:42) > at > org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1530) > at > org.apache.nifi.persistence.StandardFlowConfigurationDAO.load(StandardFlowConfigurationDAO.java:104) > at > org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:817) > at > org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:1028) > ... 4 common frames omitted > Caused by: java.lang.IllegalStateException: Cannot change destination of > Connection because the current destination is running > at >
[jira] [Created] (NIFI-12797) Record.incorporateInactiveFields fails if inactive field added with same name but different type
Mark Payne created NIFI-12797: - Summary: Record.incorporateInactiveFields fails if inactive field added with same name but different type Key: NIFI-12797 URL: https://issues.apache.org/jira/browse/NIFI-12797 Project: Apache NiFi Issue Type: Bug Components: Extensions Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0 The Record.incorporateInactiveFields has a bug in it. It considers two cases: updated fields and inactive fields. When considering inactive fields, it skips any fields that are also present in the 'updated fields'. This makes sense, as we don't want to add a new field if there's already a field with the same name. However, the comparison it uses is based on RecordField and not the field name. So in some cases it can throw an Exception because there's a conflict where an inactive field has the same name but different type than an updated field. -- This message was sent by Atlassian Jira (v8.20.10#820010)
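The difference between comparing whole fields and comparing field names, as described in NIFI-12797, can be shown with a small model. This is a sketch only and not the code of Record.incorporateInactiveFields: the {{Field}} tuple and the functions {{merge_by_field}}, {{merge_by_name}}, and {{duplicate_names}} are hypothetical.

```python
from collections import namedtuple

# A field is identified here by its name and a data type label.
Field = namedtuple("Field", ["name", "data_type"])


def merge_by_field(updated, inactive):
    """Skip an inactive field only if an identical field (name AND type) was updated."""
    merged = list(updated)
    for field in inactive:
        if field not in updated:
            merged.append(field)
    return merged


def merge_by_name(updated, inactive):
    """Skip an inactive field whenever a field with the same name is already present."""
    merged = list(updated)
    taken = {field.name for field in updated}
    for field in inactive:
        if field.name not in taken:
            merged.append(field)
            taken.add(field.name)
    return merged


def duplicate_names(fields):
    """Return the set of names that occur more than once."""
    seen, duplicates = set(), set()
    for field in fields:
        (duplicates if field.name in seen else seen).add(field.name)
    return duplicates
```

When an inactive field shares a name with an updated field but has a different type, {{merge_by_field}} keeps both and leaves a name conflict in the result, while {{merge_by_name}} keeps only the updated one.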
[jira] [Updated] (NIFI-12739) Python custom processor cannot import ProcessPoolExecutor
[ https://issues.apache.org/jira/browse/NIFI-12739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12739: -- Fix Version/s: 2.0.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Python custom processor cannot import ProcessPoolExecutor > - > > Key: NIFI-12739 > URL: https://issues.apache.org/jira/browse/NIFI-12739 > Project: Apache NiFi > Issue Type: Bug > Components: Extensions >Affects Versions: 2.0.0-M2 >Reporter: Alex Ethier >Assignee: Alex Ethier >Priority: Major > Fix For: 2.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > A runtime exception is thrown when trying to import ProcessPoolExecutor in a > Python custom processor. This affects other libraries such as llama-index > when it tries to import ProcessPoolExecutor. > My system's full stack trace (see below for a simpler stack trace): > {code:java} > py4j.Py4JException: An exception was raised by the Python Proxy. Return > Message: Traceback (most recent call last): > File "/opt/nifi-2.0.0-SNAPSHOT/python/framework/py4j/java_gateway.py", line > 2466, in _call_proxy > return_value = getattr(self.pool[obj_id], method)(*params) >^^^ > File "/opt/nifi-2.0.0-SNAPSHOT/./python/framework/Controller.py", line 75, > in createProcessor > processorClass = self.extensionManager.getProcessorClass(processorType, > version, work_dir) > > ^ > File "/opt/nifi-2.0.0-SNAPSHOT/python/framework/ExtensionManager.py", line > 104, in getProcessorClass > processor_class = self.__load_extension_module(module_file, > details.local_dependencies) > > ^ > File "/opt/nifi-2.0.0-SNAPSHOT/python/framework/ExtensionManager.py", line > 360, in __load_extension_module > module_spec.loader.exec_module(module) > File "", line 940, in exec_module > File "", line 241, in _call_with_frames_removed > File > "/Users/aethier/playground/the_source/datavolo/datavolo-resources/demo/advanced_rag_small_to_big/processors/RedisVectorStoreProcessor.py", > line 4, in > from llama_index import 
GPTVectorStoreIndex, StorageContext, > ServiceContext, Document > File > "/opt/nifi-2.0.0-SNAPSHOT/./work/python/extensions/RedisVectorStoreProcessor/2.0.0-M1/llama_index/__init__.py", > line 24, in > from llama_index.indices import ( > File > "/opt/nifi-2.0.0-SNAPSHOT/./work/python/extensions/RedisVectorStoreProcessor/2.0.0-M1/llama_index/indices/__init__.py", > line 4, in > from llama_index.indices.composability.graph import ComposableGraph > File > "/opt/nifi-2.0.0-SNAPSHOT/./work/python/extensions/RedisVectorStoreProcessor/2.0.0-M1/llama_index/indices/composability/__init__.py", > line 4, in > from llama_index.indices.composability.graph import ComposableGraph > File > "/opt/nifi-2.0.0-SNAPSHOT/./work/python/extensions/RedisVectorStoreProcessor/2.0.0-M1/llama_index/indices/composability/graph.py", > line 7, in > from llama_index.indices.base import BaseIndex > File > "/opt/nifi-2.0.0-SNAPSHOT/./work/python/extensions/RedisVectorStoreProcessor/2.0.0-M1/llama_index/indices/base.py", > line 10, in > from llama_index.ingestion import run_transformations > File > "/opt/nifi-2.0.0-SNAPSHOT/./work/python/extensions/RedisVectorStoreProcessor/2.0.0-M1/llama_index/ingestion/__init__.py", > line 2, in > from llama_index.ingestion.pipeline import ( > File > "/opt/nifi-2.0.0-SNAPSHOT/./work/python/extensions/RedisVectorStoreProcessor/2.0.0-M1/llama_index/ingestion/pipeline.py", > line 5, in > from concurrent.futures import ProcessPoolExecutor > File "", line 1229, in _handle_fromlist > File > "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/__init__.py", > line 44, in __getattr__ > from .process import ProcessPoolExecutor as pe > File > "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/process.py", > line 106, in > threading._register_atexit(_python_exit) > File > 
"/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", > line 1527, in _register_atexit > raise RuntimeError("can't register atexit after shutdown") > RuntimeError: can't register atexit after shutdown > at py4j.Protocol.getReturnValue(Protocol.java:476) > at > org.apache.nifi.py4j.client.PythonProxyInvocationHandler.invoke(PythonProxyInvocationHandler.java:64) > at
[jira] [Updated] (NIFI-12773) Add 'join' and 'anchored' RecordPath functions
[ https://issues.apache.org/jira/browse/NIFI-12773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12773: -- Status: Patch Available (was: Open) > Add 'join' and 'anchored' RecordPath functions > -- > > Key: NIFI-12773 > URL: https://issues.apache.org/jira/browse/NIFI-12773 > Project: Apache NiFi > Issue Type: Improvement > Components: Core Framework >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > Fix For: 2.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > I've come across two functions that would make flow design much simpler in > RecordPath. > The first one, 'join' would be similar to the 'concat' method but provides a > delimiter between each element instead of just smashing the values together. > The other provides the ability to anchor the context node while evaluating a > RecordPath. For example, given the following record: > {code:java} > { > "id": "1234", > "elements": [{ > "name": "book", > "color": "red" > }, { > "name": "computer", > "color": "black" > }] > } {code} > We should be able to use: > {code:java} > anchored(/elements, concat(/name, ': ', /color)) {code} > In order to obtain an array of 2 elements: > {code:java} > book: red {code} > and > {code:java} > computer: black {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12773) Add 'join' and 'anchored' RecordPath functions
Mark Payne created NIFI-12773: - Summary: Add 'join' and 'anchored' RecordPath functions Key: NIFI-12773 URL: https://issues.apache.org/jira/browse/NIFI-12773 Project: Apache NiFi Issue Type: Improvement Components: Core Framework Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0 I've come across two functions that would make flow design much simpler in RecordPath. The first one, 'join' would be similar to the 'concat' method but provides a delimiter between each element instead of just smashing the values together. The other provides the ability to anchor the context node while evaluating a RecordPath. For example, given the following record: {code:java} { "id": "1234", "elements": [{ "name": "book", "color": "red" }, { "name": "computer", "color": "black" }] } {code} We should be able to use: {code:java} anchored(/elements, concat(/name, ': ', /color)) {code} In order to obtain an array of 2 elements: {code:java} book: red {code} and {code:java} computer: black {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
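The semantics requested in this issue can be illustrated outside NiFi. The following is a minimal Python sketch, not RecordPath itself: the helper names `anchored` and `join` mirror the proposed function names, but their signatures here are hypothetical and only model the behavior described above (evaluate an expression once per element of the anchor path; combine values with a delimiter).

```python
# Illustrative model of the proposed 'anchored' and 'join' behavior.
# These are plain Python helpers, NOT the NiFi RecordPath implementation.

def join(delimiter, values):
    """Like 'concat', but places a delimiter between the values."""
    return delimiter.join(str(v) for v in values)

def anchored(record, anchor_field, evaluate):
    """Evaluate 'evaluate' once per element found at the anchor field,
    using that element (not the root record) as the context node."""
    return [evaluate(node) for node in record[anchor_field]]

record = {
    "id": "1234",
    "elements": [
        {"name": "book", "color": "red"},
        {"name": "computer", "color": "black"},
    ],
}

# Rough equivalent of: anchored(/elements, concat(/name, ': ', /color))
result = anchored(record, "elements", lambda node: node["name"] + ": " + node["color"])

# 'join' applied to the same values with a delimiter
joined = join(", ", result)
```

Here `result` holds one string per element of `elements`, which matches the two-element array described in the issue.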
[jira] [Updated] (NIFI-12764) Remove commons-codec and commons-lang3 from nifi-security-utils
[ https://issues.apache.org/jira/browse/NIFI-12764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12764: -- Fix Version/s: 2.0.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Remove commons-codec and commons-lang3 from nifi-security-utils > --- > > Key: NIFI-12764 > URL: https://issues.apache.org/jira/browse/NIFI-12764 > Project: Apache NiFi > Issue Type: Improvement > Components: Core Framework >Reporter: David Handermann >Assignee: David Handermann >Priority: Minor > Fix For: 2.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > The {{nifi-security-utils}} module is a dependency of many other components > and should have a minimal set of dependencies. With the introduction of Java > HexFormat, Apache Commons Codec is no longer necessary in > {{{}nifi-security-utils{}}}. The module also makes minimal use of Apache > Commons Lang3, so references to {{StringUtils}} can be reduced and refactored. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-12768) Intermittent Failures in TestListFile.testFilterAge
[ https://issues.apache.org/jira/browse/NIFI-12768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12768: -- Fix Version/s: 2.0.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Intermittent Failures in TestListFile.testFilterAge > --- > > Key: NIFI-12768 > URL: https://issues.apache.org/jira/browse/NIFI-12768 > Project: Apache NiFi > Issue Type: Bug >Affects Versions: 2.0.0-M2 >Reporter: David Handermann >Assignee: David Handermann >Priority: Major > Fix For: 2.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > The TestListFile class has not changed substantively in quite some time, but > it has recently begun to fail across multiple platforms on GitHub Actions > runners. > The {{testFilterAge}} method often fails with the same stack trace: > {noformat} > Error: org.apache.nifi.processors.standard.TestListFile.testFilterAge -- > Time elapsed: 6.436 s <<< FAILURE! > org.opentest4j.AssertionFailedError: expected: but was: > at > org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151) > at > org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132) > at > org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197) > at > org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182) > at > org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177) > at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1141) > at > org.apache.nifi.processors.standard.TestListFile.testFilterAge(TestListFile.java:331) > at java.base/java.lang.reflect.Method.invoke(Method.java:580) > at java.base/java.util.ArrayList.forEach(ArrayList.java:1596) > at java.base/java.util.ArrayList.forEach(ArrayList.java:1596) > {noformat} > The test method uses recalculated timestamps to set file modification time, so > the problem appears to be related to these timing calculations. 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-12757) Memory leak on Python side can result in OOMKiller killing python processes
[ https://issues.apache.org/jira/browse/NIFI-12757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12757: -- Status: Patch Available (was: Open) > Memory leak on Python side can result in OOMKiller killing python processes > --- > > Key: NIFI-12757 > URL: https://issues.apache.org/jira/browse/NIFI-12757 > Project: Apache NiFi > Issue Type: Bug > Components: Core Framework >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Blocker > Fix For: 2.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > There is a memory leak on the Python side that results in objects not being > properly cleaned up when the transform method is invoked. This ultimately leads > to the Python process using large amounts of RAM and often results in > OOMKiller killing the process. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12757) Memory leak on Python side can result in OOMKiller killing python processes
Mark Payne created NIFI-12757: - Summary: Memory leak on Python side can result in OOMKiller killing python processes Key: NIFI-12757 URL: https://issues.apache.org/jira/browse/NIFI-12757 Project: Apache NiFi Issue Type: Bug Components: Core Framework Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0 There is a memory leak on the Python side that results in objects not being properly cleaned up when the transform method is invoked. This ultimately leads to the Python process using large amounts of RAM and often results in OOMKiller killing the process. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-12740) Python Processors sometimes stuck in invalid state: 'Initializing runtime environment'
[ https://issues.apache.org/jira/browse/NIFI-12740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12740: -- Status: Patch Available (was: Open) > Python Processors sometimes stuck in invalid state: 'Initializing runtime > environment' > -- > > Key: NIFI-12740 > URL: https://issues.apache.org/jira/browse/NIFI-12740 > Project: Apache NiFi > Issue Type: Bug > Components: Core Framework >Affects Versions: 2.0.0-M2, 2.0.0-M1 >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Blocker > Fix For: 2.0.0 > > > When creating a Python processor, sometimes the Processor remains in an > invalid state with the message "Initializing runtime environment" > In the logs, we see the following error/stack trace: > {code:java} > 2024-02-05 17:23:30,308 ERROR [Initialize SetRecordField] > org.apache.nifi.NiFi An Unknown Error Occurred in Thread > VirtualThread[#123,Initialize > SetRecordField]/runnable@ForkJoinPool-1-worker-5: > java.lang.NullPointerException: Cannot invoke "java.util.List.stream()" > because "processorTypes" is null > java.lang.NullPointerException: Cannot invoke "java.util.List.stream()" > because "processorTypes" is null > at > org.apache.nifi.py4j.StandardPythonBridge.findExtensionId(StandardPythonBridge.java:322) > at > org.apache.nifi.py4j.StandardPythonBridge.createProcessorBridge(StandardPythonBridge.java:99) > at > org.apache.nifi.py4j.StandardPythonBridge.lambda$createProcessor$3(StandardPythonBridge.java:142) > at > org.apache.nifi.python.processor.PythonProcessorProxy.lambda$new$0(PythonProcessorProxy.java:73) > at java.base/java.lang.VirtualThread.run(VirtualThread.java:309) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12740) Python Processors sometimes stuck in invalid state: 'Initializing runtime environment'
Mark Payne created NIFI-12740: - Summary: Python Processors sometimes stuck in invalid state: 'Initializing runtime environment' Key: NIFI-12740 URL: https://issues.apache.org/jira/browse/NIFI-12740 Project: Apache NiFi Issue Type: Bug Components: Core Framework Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0 When creating a Python processor, sometimes the Processor remains in an invalid state with the message "Initializing runtime environment" In the logs, we see the following error/stack trace: {code:java} 2024-02-05 17:23:30,308 ERROR [Initialize SetRecordField] org.apache.nifi.NiFi An Unknown Error Occurred in Thread VirtualThread[#123,Initialize SetRecordField]/runnable@ForkJoinPool-1-worker-5: java.lang.NullPointerException: Cannot invoke "java.util.List.stream()" because "processorTypes" is null java.lang.NullPointerException: Cannot invoke "java.util.List.stream()" because "processorTypes" is null at org.apache.nifi.py4j.StandardPythonBridge.findExtensionId(StandardPythonBridge.java:322) at org.apache.nifi.py4j.StandardPythonBridge.createProcessorBridge(StandardPythonBridge.java:99) at org.apache.nifi.py4j.StandardPythonBridge.lambda$createProcessor$3(StandardPythonBridge.java:142) at org.apache.nifi.python.processor.PythonProcessorProxy.lambda$new$0(PythonProcessorProxy.java:73) at java.base/java.lang.VirtualThread.run(VirtualThread.java:309) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-12740) Python Processors sometimes stuck in invalid state: 'Initializing runtime environment'
[ https://issues.apache.org/jira/browse/NIFI-12740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12740: -- Affects Version/s: 2.0.0-M2 2.0.0-M1 > Python Processors sometimes stuck in invalid state: 'Initializing runtime > environment' > -- > > Key: NIFI-12740 > URL: https://issues.apache.org/jira/browse/NIFI-12740 > Project: Apache NiFi > Issue Type: Bug > Components: Core Framework >Affects Versions: 2.0.0-M1, 2.0.0-M2 >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Blocker > Fix For: 2.0.0 > > > When creating a Python processor, sometimes the Processor remains in an > invalid state with the message "Initializing runtime environment" > In the logs, we see the following error/stack trace: > {code:java} > 2024-02-05 17:23:30,308 ERROR [Initialize SetRecordField] > org.apache.nifi.NiFi An Unknown Error Occurred in Thread > VirtualThread[#123,Initialize > SetRecordField]/runnable@ForkJoinPool-1-worker-5: > java.lang.NullPointerException: Cannot invoke "java.util.List.stream()" > because "processorTypes" is null > java.lang.NullPointerException: Cannot invoke "java.util.List.stream()" > because "processorTypes" is null > at > org.apache.nifi.py4j.StandardPythonBridge.findExtensionId(StandardPythonBridge.java:322) > at > org.apache.nifi.py4j.StandardPythonBridge.createProcessorBridge(StandardPythonBridge.java:99) > at > org.apache.nifi.py4j.StandardPythonBridge.lambda$createProcessor$3(StandardPythonBridge.java:142) > at > org.apache.nifi.python.processor.PythonProcessorProxy.lambda$new$0(PythonProcessorProxy.java:73) > at java.base/java.lang.VirtualThread.run(VirtualThread.java:309) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-12710) PutDatabaseRecord processor does not handle microsecond timestamps properly
[ https://issues.apache.org/jira/browse/NIFI-12710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12710: -- Status: Patch Available (was: Open) > PutDatabaseRecord processor does not handle microsecond timestamps properly > --- > > Key: NIFI-12710 > URL: https://issues.apache.org/jira/browse/NIFI-12710 > Project: Apache NiFi > Issue Type: Bug > Components: Extensions >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > Fix For: 2.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12710) PutDatabaseRecord processor does not handle microsecond timestamps properly
Mark Payne created NIFI-12710: - Summary: PutDatabaseRecord processor does not handle microsecond timestamps properly Key: NIFI-12710 URL: https://issues.apache.org/jira/browse/NIFI-12710 Project: Apache NiFi Issue Type: Bug Components: Extensions Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (NIFI-8932) Add feature to CSVReader to skip N lines at top of the file
[ https://issues.apache.org/jira/browse/NIFI-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17813009#comment-17813009 ] Mark Payne commented on NIFI-8932: -- I think I'm a -1 on this Jira. The CSV Reader should be given valid CSV, rather than skipping over an arbitrary number of lines. [~iiojj2] to strip out the first line of a file, you should not use the complex flow above but rather just use RouteText. Add a property with a value of ${lineNo:gt(1)} and auto-terminate unmatched. > Add feature to CSVReader to skip N lines at top of the file > --- > > Key: NIFI-8932 > URL: https://issues.apache.org/jira/browse/NIFI-8932 > Project: Apache NiFi > Issue Type: Improvement >Reporter: Philipp Korniets >Assignee: Matt Burgess >Priority: Minor > Labels: backport-needed > Time Spent: 3h 10m > Remaining Estimate: 0h > > We have a lot of CSV files where the provider adds a custom header/footer to valid > CSV content. > The CSV header is actually the second row. > To remove unnecessary data we can use > * ReplaceText > * SplitText -> RouteOnAttribute -> MergeContent > It would be great to have an option in the CSVReader controller to skip N rows > from top/bottom in order to get clean data. 
> * skip N from the top > * skip M from the bottom > Similar request was developed in FLINK > https://issues.apache.org/jira/browse/FLINK-1002 > > Data Example: > {code} > 7/20/21 2:48:47 AM GMT-04:00 ABB: Blended Rate Calc (X),,, > distribution_id,Distribution > Id,settle_date,group_code,company_name,currency_code,common_account_name,business_date,prod_code,security,class,asset_type > -1,all,20210719,Repo 21025226,qwerty > ,EUR,TPSL_21025226 ,19-Jul-21,BRM96ST7 ,ABC > 14/09/24,NR,BOND > -1,all,20210719,Repo 21025226,qwerty > ,GBP,RPSS_21025226 ,19-Jul-21,,Total @ -0.11,, > {code} > |7/20/21 2:48:47 AM GMT-04:00 ABB: Blended Rate Calc (X)| | | | | | | > | | | | | > |distribution_id|Distribution > Id|settle_date|group_code|company_name|currency_code|common_account_name|business_date|prod_code|security|class|asset_type| > |-1|all|20210719|Repo 21025226|qwerty > |EUR|TPSL_21025226 |19-Jul-21|BRM96ST7 |ABC > 14/09/24|NR|BOND | > |-1|all|20210719|Repo 21025226|qwerty > |GBP|RPSS_21025226 |19-Jul-21| |Total @ -0.11| | | -- This message was sent by Atlassian Jira (v8.20.10#820010)
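The effect under discussion, dropping leading non-CSV lines before a CSV parser sees them, can be sketched outside NiFi. This is a plain Python illustration using only the standard library, not the CSVReader or RouteText implementation; the function name `read_csv_skipping` and the sample data are hypothetical.

```python
import csv
import io
from itertools import islice

def read_csv_skipping(text, skip_top=0):
    """Parse CSV text after discarding the first 'skip_top' lines.

    The first remaining line is treated as the CSV header.
    """
    lines = io.StringIO(text)
    body = islice(lines, skip_top, None)  # drop the custom preamble lines
    return [dict(row) for row in csv.DictReader(body)]

# Hypothetical sample: one custom preamble line, then valid CSV
sample = "report generated 7/20/21,,,\nid,name\n1,alpha\n2,beta\n"
rows = read_csv_skipping(sample, skip_top=1)
```

With `skip_top=1` the preamble line is discarded and `id,name` becomes the header, which corresponds to keeping only lines whose line number is greater than 1.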
[jira] [Created] (NIFI-12707) Allow LookupRecord to operate on multiple "child records"
Mark Payne created NIFI-12707: - Summary: Allow LookupRecord to operate on multiple "child records" Key: NIFI-12707 URL: https://issues.apache.org/jira/browse/NIFI-12707 Project: Apache NiFi Issue Type: Improvement Components: Extensions Reporter: Mark Payne Assignee: Mark Payne LookupRecord provides a lot of power when it comes to performing enrichment in Records. However, there are cases in which a single Record has many sub-records, or child records. For example, let's take the following record: {code:java} { "fileSet": { "id": "11223344", "source": "external", "files": [{ "filename": "file1.txt", "size": 4810 }, { "filename": "file2.pdf", "size": 47203782 }, { "filename": "unknown-file.unk", "size": 278102 } ] } } {code} Let's say that I want to lookup a MIME type, based on the filename. So I want an output such as: {code:java} { "fileSet" : { "id" : "11223344", "source" : "external", "files" : [ { "filename" : "file1.txt", "size" : 4810, "mimeType" : "text/plain" }, { "filename" : "file2.pdf", "size" : 47203782, "mimeType" : "application/pdf" }, { "filename" : "unknown-file.unk", "size" : 278102, "mimeType" : null } ] } } {code} I can have a Lookup Service that is capable of handling this, no problem. And in LookupRecord, I can specify the path to lookup as {{/fileSet/files[*]/filename}} but then I have a problem - there's no way to tell it where to place the returned values (i.e., the mimeType field) because it is relative to each individual value. We need to add a "Root Record Path" that allows us to choose a sub-record. In this case, {{/fileSet/files[*]}} and then specify the value to lookup as {{/filename}} and the return value should be placed at {{{}/mimeType{}}}. This gives us much greater flexibility in performing lookups/enrichments. -- This message was sent by Atlassian Jira (v8.20.10#820010)
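The "Root Record Path" idea above can be modeled with a short Python sketch. It is only an illustration of the proposed semantics, assuming a lookup keyed on the file extension: the names `enrich_children`, `lookup_mime_type`, and `MIME_TYPES` are hypothetical and are not part of LookupRecord or any Lookup Service.

```python
# Hypothetical lookup table standing in for a Lookup Service
MIME_TYPES = {".txt": "text/plain", ".pdf": "application/pdf"}

def lookup_mime_type(filename):
    """Return the MIME type for a filename, or None when it is unknown."""
    dot = filename.rfind(".")
    extension = filename[dot:] if dot != -1 else ""
    return MIME_TYPES.get(extension)

def enrich_children(record, select_children, lookup_key, result_field, lookup):
    """For each child record chosen by 'select_children' (the "root record
    path"), look up child[lookup_key] and store the result in
    child[result_field], relative to that child."""
    for child in select_children(record):
        child[result_field] = lookup(child[lookup_key])
    return record

record = {
    "fileSet": {
        "id": "11223344",
        "source": "external",
        "files": [
            {"filename": "file1.txt", "size": 4810},
            {"filename": "file2.pdf", "size": 47203782},
            {"filename": "unknown-file.unk", "size": 278102},
        ],
    }
}

# Root path ~ /fileSet/files[*], lookup value ~ /filename, result ~ /mimeType
enrich_children(record, lambda r: r["fileSet"]["files"], "filename", "mimeType", lookup_mime_type)
mime_types = [f["mimeType"] for f in record["fileSet"]["files"]]
```

Because both the lookup key and the result field are resolved relative to each child, every element of `files` receives its own `mimeType`, including `None` for the unknown extension.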
[jira] [Created] (NIFI-12697) Improve JSON Reader/Writer handling of floating-point numbers
Mark Payne created NIFI-12697: - Summary: Improve JSON Reader/Writer handling of floating-point numbers Key: NIFI-12697 URL: https://issues.apache.org/jira/browse/NIFI-12697 Project: Apache NiFi Issue Type: Improvement Components: Extensions Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0 The JSON Writer currently provides no way to dictate whether or not floating-point numbers should use scientific notation. Several people have run into issues where the downstream systems do not understand scientific notation. We should allow this to be configurable. Specifically, we should not change the behavior of existing services, but we should default new services so that they do not use scientific notation. Additionally, the Jackson parser has the ability to use its own implementation of floating point parsing, which should be faster than the default version supplied by Java. We should enable that feature. -- This message was sent by Atlassian Jira (v8.20.10#820010)
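To show what "scientific notation" versus plain notation means for a downstream consumer, here is a small Python sketch. It uses Python's standard `json` and `decimal` modules purely as an analogy; it says nothing about how the NiFi JSON Writer or Jackson is configured, and `plain_float` is a hypothetical helper name.

```python
import json
from decimal import Decimal

def plain_float(value):
    """Render a float as a decimal string without an exponent."""
    # repr() yields the shortest string that round-trips the float;
    # Decimal's 'f' format then expands any exponent into plain digits.
    return format(Decimal(repr(value)), "f")

small = 1e-7
default_text = json.dumps(small)   # default rendering uses an exponent
plain_text = plain_float(small)    # same value, written out in full
```

A consumer that cannot parse exponents would reject `default_text` but accept `plain_text`, even though both denote the same number.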
[jira] [Updated] (NIFI-12675) Python Processor erroring when creating custom relationships
[ https://issues.apache.org/jira/browse/NIFI-12675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12675: -- Status: Patch Available (was: Open) > Python Processor erroring when creating custom relationships > > > Key: NIFI-12675 > URL: https://issues.apache.org/jira/browse/NIFI-12675 > Project: Apache NiFi > Issue Type: Bug > Components: Core Framework >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > From apache Slack thread > ([https://apachenifi.slack.com/archives/C0L9VCD47/p1706176890922519):] > {quote}Hello, I am trying to test some custom python processors with nifi > 2.0.0-M1 > It works fine except when I try to add custom relationships to it (other than > the default success, failure and original). > Here's what I am trying: > {code:java} > self.matched = Relationship("matched", "flowfiles having a match with > the regex") > self.unmatched = Relationship("unmatched", "flowfiles not having any > match with regex") > self.failure = Relationship("failure", "flowfiles for which process > errored while matching") > self.relationships = {self.matched, self.unmatched, self.failure} > {code} > I get py4j complaining about AttributeError: 'set' object has no attribute > '_get_object_id' > which seems like the auto conversion of Python to java container is not > happening for "Relationship" class. Any idea what could be wrong here? > {quote} > The problem appears to be that Relationships created are of type > {{nifiapi.Relationship}} but that is being sent back to the Java side without > being converted into a {{org.apache.nifi.processor.Relationship}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-12693) When Processor is removed, Python Process should be notified asynchronously
[ https://issues.apache.org/jira/browse/NIFI-12693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12693: -- Status: Patch Available (was: Open) > When Processor is removed, Python Process should be notified asynchronously > --- > > Key: NIFI-12693 > URL: https://issues.apache.org/jira/browse/NIFI-12693 > Project: Apache NiFi > Issue Type: Bug > Components: Core Framework >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > Fix For: 2.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > When a Processor is removed, the PythonBridge is notified of the removal, and > it then notifies any relevant Python process. This is done synchronously > during the removal. I encountered two occurrences in which notifying the > Python process failed. > While the failure itself is not a huge concern, the handling of those > failures resulted in very bad outcomes. In the first instance, the > communication with the Python process was blocked on a socket read or write. > As a result, the Service Facade's lock was never released, and no web > requests could be made; they all blocked on the read lock. This resulted in > requiring a restart of NiFi. > In the other scenario, the call did not block indefinitely but threw an > Exception. In this case, the associated Connections were never removed. As a > result, I could no longer navigate to that Process Group in the UI, or the UI > would have errors because there were Connections whose source or destination > didn't exist. This required manually removing those connections from the > flow.json file to recover. > Since the intention of this action is simply a notification so that the > Python process can cleanup after itself, this notification should be moved to > a background thread, so that any failures are simply logged without causing > problematic side effects. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12693) When Processor is removed, Python Process should be notified asynchronously
Mark Payne created NIFI-12693: - Summary: When Processor is removed, Python Process should be notified asynchronously Key: NIFI-12693 URL: https://issues.apache.org/jira/browse/NIFI-12693 Project: Apache NiFi Issue Type: Bug Components: Core Framework Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0 When a Processor is removed, the PythonBridge is notified of the removal, and it then notifies any relevant Python process. This is done synchronously during the removal. I encountered two occurrences in which notifying the Python process failed. While the failure itself is not a huge concern, the handling of those failures resulted in very bad outcomes. In the first instance, the communication with the Python process was blocked on a socket read or write. As a result, the Service Facade's lock was never released, and no web requests could be made; they all blocked on the read lock. This resulted in requiring a restart of NiFi. In the other scenario, the call did not block indefinitely but threw an Exception. In this case, the associated Connections were never removed. As a result, I could no longer navigate to that Process Group in the UI, or the UI would have errors because there were Connections whose source or destination didn't exist. This required manually removing those connections from the flow.json file to recover. Since the intention of this action is simply a notification so that the Python process can cleanup after itself, this notification should be moved to a background thread, so that any failures are simply logged without causing problematic side effects. -- This message was sent by Atlassian Jira (v8.20.10#820010)
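The approach proposed above, performing the removal first and sending the notification from a background thread where failures are only logged, can be sketched in Python. This is a generic illustration and not the PythonBridge code: `remove_processor`, `notify_async`, and the `connections` dictionary are hypothetical stand-ins.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

logger = logging.getLogger("removal-notifier")

# Single background worker dedicated to best-effort notifications
_executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix="notify-python")

def notify_async(notify, processor_id):
    """Run 'notify' on a background thread; log failures instead of raising."""
    def task():
        try:
            notify(processor_id)
        except Exception:
            logger.exception("Failed to notify Python process of removal of %s", processor_id)
    return _executor.submit(task)

def remove_processor(processor_id, connections, notify):
    """Remove local state first, then notify without blocking the caller."""
    connections.pop(processor_id, None)  # cleanup never depends on the notification
    return notify_async(notify, processor_id)

def failing_notify(processor_id):
    # Simulates the second failure scenario: the call throws an exception
    raise RuntimeError("socket closed")

connections = {"processor-1": ["connection-a"]}
future = remove_processor("processor-1", connections, failing_notify)
outcome = future.result(timeout=5)  # the failure was logged, not propagated
```

The caller returns as soon as the task is submitted, so it holds no lock while the notification is in flight, and a failed notification cannot leave the connections half-removed. A notification that blocks forever would still occupy the worker thread, so a real implementation would likely also want a socket timeout.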
[jira] [Created] (NIFI-12675) Python Processor erroring when creating custom relationships
Mark Payne created NIFI-12675: - Summary: Python Processor erroring when creating custom relationships Key: NIFI-12675 URL: https://issues.apache.org/jira/browse/NIFI-12675 Project: Apache NiFi Issue Type: Bug Components: Core Framework Reporter: Mark Payne Assignee: Mark Payne From apache Slack thread ([https://apachenifi.slack.com/archives/C0L9VCD47/p1706176890922519]): {quote}Hello, I am trying to test some custom python processors with nifi 2.0.0-M1 It works fine except when I try to add custom relationships to it (other than the default success, failure and original). Here's what I am trying: {code:java} self.matched = Relationship("matched", "flowfiles having a match with the regex") self.unmatched = Relationship("unmatched", "flowfiles not having any match with regex") self.failure = Relationship("failure", "flowfiles for which process errored while matching") self.relationships = {self.matched, self.unmatched, self.failure} {code} I get py4j complaining about AttributeError: 'set' object has no attribute '_get_object_id' which seems like the auto conversion of Python to java container is not happening for "Relationship" class. Any idea what could be wrong here? {quote} The problem appears to be that Relationships created are of type {{nifiapi.Relationship}} but that is being sent back to the Java side without being converted into a {{org.apache.nifi.processor.Relationship}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12647) Add @MultiProcessorUseCase annotations to explain how to use ListFile/FetchFile together
Mark Payne created NIFI-12647: - Summary: Add @MultiProcessorUseCase annotations to explain how to use ListFile/FetchFile together Key: NIFI-12647 URL: https://issues.apache.org/jira/browse/NIFI-12647 Project: Apache NiFi Issue Type: Improvement Components: Extensions Reporter: Mark Payne Assignee: Mark Payne It's common to use ListFile / FetchFile together to pull in all files in a directory, or to pull in specific files (based on filename, for instance). We should add documentation to ListFile to explain how these Processors go hand-in-hand. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-12629) Add metadata filtering to QueryPinecone
[ https://issues.apache.org/jira/browse/NIFI-12629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12629: -- Fix Version/s: 2.0.0-M2 Resolution: Fixed Status: Resolved (was: Patch Available) > Add metadata filtering to QueryPinecone > --- > > Key: NIFI-12629 > URL: https://issues.apache.org/jira/browse/NIFI-12629 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Affects Versions: 2.0.0-M1 >Reporter: Pierre Villard >Assignee: Pierre Villard >Priority: Major > Fix For: 2.0.0-M2 > > Time Spent: 50m > Remaining Estimate: 0h > > The QueryPinecone processor should be improved to allow for metadata > filtering. > [https://docs.pinecone.io/docs/metadata-filtering] > [https://medium.com/@gmarcilhacy/deep-dive-into-langchain-and-pinecone-metadata-filtering-75a9b6eba9c] > An optional filter property should be added to the processor allowing a user > to specify which metadata filters should be applied to the query. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-12634) Kubernetes Components Should Ignore Empty Prefix Properties
[ https://issues.apache.org/jira/browse/NIFI-12634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12634: -- Resolution: Fixed Status: Resolved (was: Patch Available) > Kubernetes Components Should Ignore Empty Prefix Properties > --- > > Key: NIFI-12634 > URL: https://issues.apache.org/jira/browse/NIFI-12634 > Project: Apache NiFi > Issue Type: Bug > Components: Core Framework >Reporter: David Handermann >Assignee: David Handermann >Priority: Major > Fix For: 2.0.0-M2 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Following recent changes on the main branch to support optional prefix > properties for Kubernetes Leases and ConfigMaps, testing indicated that the > Leader Election Manager and State Provider included empty strings as valid > values. This changes the default behavior based on the default > nifi.properties and state-management.xml including empty strings for prefix > values. The components should be modified to ignore empty strings in addition > to null values, aligning with current behavior prior to the introduction of > these properties. -- This message was sent by Atlassian Jira (v8.20.10#820010)
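The rule described above, treating an empty prefix the same as a missing one, can be sketched as follows. This Python fragment is only an illustration: the function name `resolve_name`, the `-` separator, and the choice to also ignore whitespace-only values are assumptions and do not come from the NiFi Kubernetes components.

```python
def resolve_name(prefix, base_name):
    """Apply an optional prefix, ignoring None and empty values.

    Assumption: whitespace-only prefixes are treated as empty too,
    and '-' is used as a hypothetical separator.
    """
    if prefix is None or not prefix.strip():
        return base_name  # same result as before prefixes existed
    return f"{prefix}-{base_name}"

unset = resolve_name(None, "leases")
empty = resolve_name("", "leases")      # e.g. an empty default property value
named = resolve_name("team-a", "leases")
```

Since default configuration files may carry the property with an empty value, checking only for `None` would change the resulting names; checking for empty strings as well keeps the earlier behavior.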
[jira] [Updated] (NIFI-12638) Add @UseCase documentation to QueryRecord to explain how to use as a record-based Router
[ https://issues.apache.org/jira/browse/NIFI-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12638: -- Status: Patch Available (was: Open) > Add @UseCase documentation to QueryRecord to explain how to use as a > record-based Router > > > Key: NIFI-12638 > URL: https://issues.apache.org/jira/browse/NIFI-12638 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > Fix For: 2.0.0-M2 > > Time Spent: 10m > Remaining Estimate: 0h > > A common use case for QueryRecord is to use it to route Records to one route > or another. Add use case documentation explaining how to set this up. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12638) Add @UseCase documentation to QueryRecord to explain how to use as a record-based Router
Mark Payne created NIFI-12638: - Summary: Add @UseCase documentation to QueryRecord to explain how to use as a record-based Router Key: NIFI-12638 URL: https://issues.apache.org/jira/browse/NIFI-12638 Project: Apache NiFi Issue Type: Improvement Components: Extensions Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0-M2 A common use case for QueryRecord is to use it to route Records to one route or another. Add use case documentation explaining how to set this up. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-12637) Automatically update InvokeHTTP Proxy configuration properties
[ https://issues.apache.org/jira/browse/NIFI-12637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12637: -- Status: Patch Available (was: Open) > Automatically update InvokeHTTP Proxy configuration properties > -- > > Key: NIFI-12637 > URL: https://issues.apache.org/jira/browse/NIFI-12637 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > Fix For: 2.0.0-M2 > > Time Spent: 10m > Remaining Estimate: 0h > > Users updating from 1.x are finding that InvokeHTTP is failing because the > "Proxy Type" property that was previously defined no longer is. As a result, > InvokeHTTP treats it as a header and attempts to send it as an HTTP Header. > However, since it has a space in the name, it's invalid and InvokeHTTP fails. > We should automatically handle migrating the Proxy properties to make this > seamless. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12637) Automatically update InvokeHTTP Proxy configuration properties
Mark Payne created NIFI-12637: - Summary: Automatically update InvokeHTTP Proxy configuration properties Key: NIFI-12637 URL: https://issues.apache.org/jira/browse/NIFI-12637 Project: Apache NiFi Issue Type: Improvement Components: Extensions Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0-M2 Users updating from 1.x are finding that InvokeHTTP is failing because the "Proxy Type" property that was previously defined no longer is. As a result, InvokeHTTP treats it as a header and attempts to send it as an HTTP Header. However, since it has a space in the name, it's invalid and InvokeHTTP fails. We should automatically handle migrating the Proxy properties to make this seamless. -- This message was sent by Atlassian Jira (v8.20.10#820010)
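To see why a leftover "Proxy Type" property breaks the request, here is a rough Python sketch (not NiFi code). It checks names against the HTTP header field-name grammar (the "token" rule in RFC 7230) and lists properties that would be sent as headers but cannot be. The function names and the example property set are hypothetical.

```python
import re
from typing import Dict, Iterable, List

# Characters allowed in an HTTP header field name ("token" in RFC 7230).
_HEADER_NAME = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def is_valid_header_name(name: str) -> bool:
    """True when every character of name is a legal header-name character."""
    return _HEADER_NAME.fullmatch(name) is not None


def invalid_dynamic_headers(properties: Dict[str, str],
                            known: Iterable[str]) -> List[str]:
    """List property names that would be sent as headers but are not legal.

    Assumption for this sketch: any property not in `known` is treated as a
    header, which is how the issue describes the behaviour.
    """
    known_names = set(known)
    return [name for name in properties
            if name not in known_names and not is_valid_header_name(name)]
```

A space is not a token character, so "Proxy Type" is rejected while a name such as "X-Request-Id" passes. An automatic migration would need to remove or remap such properties before they reach this stage.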
[jira] [Updated] (NIFI-12635) Upgrade slack client to 1.37.0
[ https://issues.apache.org/jira/browse/NIFI-12635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12635: -- Status: Patch Available (was: Open) > Upgrade slack client to 1.37.0 > -- > > Key: NIFI-12635 > URL: https://issues.apache.org/jira/browse/NIFI-12635 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > Fix For: 2.0.0-M2 > > Time Spent: 10m > Remaining Estimate: 0h > > I sometimes see the ListenSlack spew errors about Rate Limiting and > connection failures. This appears to be fixed in the 1.37.0 version of the > client according to [https://github.com/slackapi/java-slack-sdk/pull/1265] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12635) Upgrade slack client to 1.37.0
Mark Payne created NIFI-12635: - Summary: Upgrade slack client to 1.37.0 Key: NIFI-12635 URL: https://issues.apache.org/jira/browse/NIFI-12635 Project: Apache NiFi Issue Type: Improvement Components: Extensions Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0-M2 I sometimes see the ListenSlack spew errors about Rate Limiting and connection failures. This appears to be fixed in the 1.37.0 version of the client according to [https://github.com/slackapi/java-slack-sdk/pull/1265] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12623) Allow ListenSlack to receive App Mention events and include user details
Mark Payne created NIFI-12623: - Summary: Allow ListenSlack to receive App Mention events and include user details Key: NIFI-12623 URL: https://issues.apache.org/jira/browse/NIFI-12623 Project: Apache NiFi Issue Type: Improvement Components: Extensions Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0-M2 When using the ListenSlack processor, I often want only the events that mention my bot by name. However, as it is, the processor requires that I receive all events in the channels that have my bot, and then filter them out. We should instead allow users to receive only App Mention events. Additionally, it would be beneficial to retrieve user details such as username instead of just the User ID. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12616) Enable @use_case and @multi_processor_use_case decorators to be added to Python Processors
Mark Payne created NIFI-12616: - Summary: Enable @use_case and @multi_processor_use_case decorators to be added to Python Processors Key: NIFI-12616 URL: https://issues.apache.org/jira/browse/NIFI-12616 Project: Apache NiFi Issue Type: Bug Components: Core Framework, Extensions Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0-M2 Currently, Python processors have no way of articulating specific use cases and multi-processor use cases in their docs. Introduce new decorators to allow for these. We use decorators here in order to keep the structure similar to that of Java but also because it offers a clean mechanism for defining the MultiProcessorUseCase, which becomes awkward if trying to include in the ProcessorDetails inner class. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-9464) Provenance Events files corrupted
[ https://issues.apache.org/jira/browse/NIFI-9464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Payne updated NIFI-9464:
- Resolution: Fixed
- Status: Resolved (was: Patch Available)

> Provenance Events files corrupted
> ---
>
> Key: NIFI-9464
> URL: https://issues.apache.org/jira/browse/NIFI-9464
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.11.0, 1.15.0
> Environment: java 11, centos 7, nifi standalone
> Reporter: Wiktor Kubicki
> Assignee: Tamas Palfy
> Priority: Minor
> Fix For: 1.25.0, 2.0.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> In my logs I found:
> {code:java}
> SiteToSiteProvenanceReportingTask[id=b209c0ae-016e-1000-ae39-301c9dcfc544] Failed to retrieve Provenance Events from repository due to: Attempted to skip to byte offset 9149491 for 1125432890.prov.gz but file does not have that many bytes (TOC Reader=StandardTocReader[file=//provenance_repository/toc/1125432890.toc, compressed=false]): java.io.EOFException: Attempted to skip to byte offset 9149491 for 1125432890.prov.gz but file does not have that many bytes (TOC Reader=StandardTocReader[file=/.../provenance_repository/toc/1125432890.toc, compressed=false])
> {code}
> It is critically important for me to be 100% sure of my logs. It happened
> about 100 times in the last year for 15 *.prov.gz files:
> {code:java}
> -rw-rw-rw-. 1 user user 1013923 Oct 17 21:17 1075441276.prov.gz
> -rw-rw-rw-. 1 user user 1345431 Oct 24 13:06 1083362251.prov.gz
> -rw-rw-rw-. 1 user user 1359282 Oct 25 13:07 1084546392.prov.gz
> -rw-rw-rw-. 1 user user 1155791 Nov 2 17:08 1094516954.prov.gz
> -rw-rw-r--. 1 user user 974136 Nov 18 22:07 1113402183.prov.gz
> -rw-rw-r--. 1 user user 1125608 Nov 28 22:00 1125097576.prov.gz
> -rw-rw-r--. 1 user user 1248319 Nov 29 04:30 1125432890.prov.gz
> -rw-rw-r--. 1 user user 832120 Feb 2 2021 661957813.prov.gz
> -rw-rw-r--. 1 user user 1110978 Mar 17 2021 734807613.prov.gz
> -rw-rw-r--. 1 user user 1506819 Apr 16 2021 786154249.prov.gz
> -rw-rw-r--. 1 user user 1763198 May 25 2021 852626782.prov.gz
> -rw-rw-r--. 1 user user 1580598 Jun 15 08:32 891934274.prov.gz
> -rw-rw-r--. 1 user user 2960296 Jun 28 17:07 917991812.prov.gz
> -rw-rw-r--. 1 user user 1808037 Jun 28 17:37 918051650.prov.gz
> -rw-rw-rw-. 1 user user 765924 Aug 14 13:09 991505484.prov.gz
> {code}
> BTW, it's interesting that there are different chmods.
> My config for provenance (BTW, if you see a possibility to tune it, please
> tell me):
> {code:java}
> nifi.provenance.repository.directory.default=/../provenance_repository
> nifi.provenance.repository.max.storage.time=730 days
> nifi.provenance.repository.max.storage.size=512 GB
> nifi.provenance.repository.rollover.time=10 mins
> nifi.provenance.repository.rollover.size=100 MB
> nifi.provenance.repository.query.threads=2
> nifi.provenance.repository.index.threads=1
> nifi.provenance.repository.compress.on.rollover=true
> nifi.provenance.repository.always.sync=false
> nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID
> nifi.provenance.repository.indexed.attributes=
> nifi.provenance.repository.index.shard.size=1 GB
> nifi.provenance.repository.max.attribute.length=65536
> nifi.provenance.repository.concurrent.merge.threads=1
> nifi.provenance.repository.buffer.size=10
> {code}
> Now my provenance repo has 140GB of data.
>

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (NIFI-9464) Provenance Events files corrupted
[ https://issues.apache.org/jira/browse/NIFI-9464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17804441#comment-17804441 ]

Mark Payne commented on NIFI-9464:
---
[~tpalfy] I got you. Makes sense. I did a quick look over this again to make sure that I fully understand what's happening here. It looks like this was actually designed to work as you've proposed in the PR. But when the Encrypted Prov Repo was introduced, the base class's init() method was changed to start creating its own `EventFileManager`. As a result, the base class has a different instance than the concrete class is using. So this change fixes that to ensure that both the base class and the concrete class are sharing the same instance. Makes perfect sense. Great catch! Thanks for running that down and fixing. I'm a +1, will merge.

> Provenance Events files corrupted
> ---
>
> Key: NIFI-9464
> URL: https://issues.apache.org/jira/browse/NIFI-9464
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.11.0, 1.15.0
> Environment: java 11, centos 7, nifi standalone
> Reporter: Wiktor Kubicki
> Assignee: Tamas Palfy
> Priority: Minor
> Fix For: 1.25.0, 2.0.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> In my logs I found:
> {code:java}
> SiteToSiteProvenanceReportingTask[id=b209c0ae-016e-1000-ae39-301c9dcfc544] Failed to retrieve Provenance Events from repository due to: Attempted to skip to byte offset 9149491 for 1125432890.prov.gz but file does not have that many bytes (TOC Reader=StandardTocReader[file=//provenance_repository/toc/1125432890.toc, compressed=false]): java.io.EOFException: Attempted to skip to byte offset 9149491 for 1125432890.prov.gz but file does not have that many bytes (TOC Reader=StandardTocReader[file=/.../provenance_repository/toc/1125432890.toc, compressed=false])
> {code}
> It is critically important for me to be 100% sure of my logs. It happened
> about 100 times in the last year for 15 *.prov.gz files:
> {code:java}
> -rw-rw-rw-. 1 user user 1013923 Oct 17 21:17 1075441276.prov.gz
> -rw-rw-rw-. 1 user user 1345431 Oct 24 13:06 1083362251.prov.gz
> -rw-rw-rw-. 1 user user 1359282 Oct 25 13:07 1084546392.prov.gz
> -rw-rw-rw-. 1 user user 1155791 Nov 2 17:08 1094516954.prov.gz
> -rw-rw-r--. 1 user user 974136 Nov 18 22:07 1113402183.prov.gz
> -rw-rw-r--. 1 user user 1125608 Nov 28 22:00 1125097576.prov.gz
> -rw-rw-r--. 1 user user 1248319 Nov 29 04:30 1125432890.prov.gz
> -rw-rw-r--. 1 user user 832120 Feb 2 2021 661957813.prov.gz
> -rw-rw-r--. 1 user user 1110978 Mar 17 2021 734807613.prov.gz
> -rw-rw-r--. 1 user user 1506819 Apr 16 2021 786154249.prov.gz
> -rw-rw-r--. 1 user user 1763198 May 25 2021 852626782.prov.gz
> -rw-rw-r--. 1 user user 1580598 Jun 15 08:32 891934274.prov.gz
> -rw-rw-r--. 1 user user 2960296 Jun 28 17:07 917991812.prov.gz
> -rw-rw-r--. 1 user user 1808037 Jun 28 17:37 918051650.prov.gz
> -rw-rw-rw-. 1 user user 765924 Aug 14 13:09 991505484.prov.gz
> {code}
> BTW, it's interesting that there are different chmods.
> My config for provenance (BTW, if you see a possibility to tune it, please
> tell me):
> {code:java}
> nifi.provenance.repository.directory.default=/../provenance_repository
> nifi.provenance.repository.max.storage.time=730 days
> nifi.provenance.repository.max.storage.size=512 GB
> nifi.provenance.repository.rollover.time=10 mins
> nifi.provenance.repository.rollover.size=100 MB
> nifi.provenance.repository.query.threads=2
> nifi.provenance.repository.index.threads=1
> nifi.provenance.repository.compress.on.rollover=true
> nifi.provenance.repository.always.sync=false
> nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID
> nifi.provenance.repository.indexed.attributes=
> nifi.provenance.repository.index.shard.size=1 GB
> nifi.provenance.repository.max.attribute.length=65536
> nifi.provenance.repository.concurrent.merge.threads=1
> nifi.provenance.repository.buffer.size=10
> {code}
> Now my provenance repo has 140GB of data.
>

-- This message was sent by Atlassian Jira (v8.20.10#820010)
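The instance mismatch described in the comment on NIFI-9464 can be reproduced in miniature. The following Python sketch is not the NiFi implementation: the class and method names (other than `EventFileManager`, which the comment mentions) are hypothetical, and the "manager" is reduced to a set of locked file names. It only shows why a base class that creates its own manager cannot see locks taken through the subclass's manager, and how passing one shared instance removes the gap.

```python
class EventFileManager:
    """Stand-in for a per-file lock registry (hypothetical, simplified API)."""

    def __init__(self):
        self._write_locked = set()

    def obtain_write_lock(self, file_name):
        self._write_locked.add(file_name)

    def is_write_locked(self, file_name):
        return file_name in self._write_locked


class BaseRepository:
    def __init__(self, file_manager=None):
        # When no manager is handed in, the base creates a private one.
        # This mirrors the mismatch described in the comment.
        self.base_file_manager = (
            file_manager if file_manager is not None else EventFileManager()
        )


class MismatchedRepository(BaseRepository):
    def __init__(self):
        self.file_manager = EventFileManager()
        super().__init__()  # base ends up with a different instance


class SharedRepository(BaseRepository):
    def __init__(self):
        self.file_manager = EventFileManager()
        super().__init__(self.file_manager)  # both sides share one instance
```

In the mismatched variant, a write lock obtained through `file_manager` is invisible to code that consults `base_file_manager`, so one side may read a file the other is still modifying. In the shared variant both views agree.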
[jira] [Updated] (NIFI-12536) ParseDocument incorrectly converts byte array to String, result in text like b'...' instead of just ...
[ https://issues.apache.org/jira/browse/NIFI-12536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12536: -- Status: Patch Available (was: Open) > ParseDocument incorrectly converts byte array to String, result in text like > b'...' instead of just ... > --- > > Key: NIFI-12536 > URL: https://issues.apache.org/jira/browse/NIFI-12536 > Project: Apache NiFi > Issue Type: Bug > Components: Extensions > Affects Versions: 2.0.0-M1 > Reporter: Mark Payne > Assignee: Mark Payne > Priority: Trivial > Fix For: 2.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Documents that are produced from ParseDocument use > {{str(flowfile.getContentsAsBytes())}} when they should use > {{flowfile.getContentsAsBytes().decode('utf-8')}}. > This results in text such as {{One Two Three}} being produced as {{b'One Two > Three'}}. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12536) ParseDocument incorrectly converts byte array to String, result in text like b'...' instead of just ...
Mark Payne created NIFI-12536: - Summary: ParseDocument incorrectly converts byte array to String, result in text like b'...' instead of just ... Key: NIFI-12536 URL: https://issues.apache.org/jira/browse/NIFI-12536 Project: Apache NiFi Issue Type: Bug Components: Extensions Affects Versions: 2.0.0-M1 Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0 Documents that are produced from ParseDocument use {{str(flowfile.getContentsAsBytes())}} when they should use {{flowfile.getContentsAsBytes().decode('utf-8')}}. This results in text such as {{One Two Three}} being produced as {{b'One Two Three'}}. -- This message was sent by Atlassian Jira (v8.20.10#820010)
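The difference between the two calls named in NIFI-12536 is standard Python behaviour and can be shown without NiFi. In this minimal sketch the `raw` variable stands in for the value returned by `flowfile.getContentsAsBytes()`; that substitution is an assumption of the example.

```python
# Stand-in for flowfile.getContentsAsBytes(); assumed to be a bytes object.
raw = b"One Two Three"

# str() on bytes returns the printable representation, including the b'' wrapper.
wrong = str(raw)

# decode() interprets the bytes as UTF-8 text and returns the actual characters.
right = raw.decode("utf-8")

print(wrong)  # b'One Two Three'
print(right)  # One Two Three
```

Content that is not valid UTF-8 would raise `UnicodeDecodeError` on `decode`, so a real processor may also need to decide how to handle other encodings.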
[jira] [Updated] (NIFI-12516) When clustered, View Content shows wrong content type, will not show formatted
[ https://issues.apache.org/jira/browse/NIFI-12516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12516: -- Status: Patch Available (was: Open) > When clustered, View Content shows wrong content type, will not show formatted > -- > > Key: NIFI-12516 > URL: https://issues.apache.org/jira/browse/NIFI-12516 > Project: Apache NiFi > Issue Type: Bug > Components: Core Framework, Core UI > Affects Versions: 2.0.0-M1 > Reporter: Mark Payne > Assignee: Mark Payne > Priority: Major > Fix For: 2.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > When I choose to view a FlowFile's contents in the UI (regardless of whether > it came from Provenance view or List Queue view), it shows the content. > However, it shows the filename as an empty string, and it shows the > content-type as "text/plain" even though the mime.type attribute is set to > "application/json". As a result, when I try to switch the 'view as' option > from 'original' to 'formatted', it does not render the content as JSON, since > it thinks the data is text/plain. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12516) When clustered, View Content shows wrong content type, will not show formatted
Mark Payne created NIFI-12516: - Summary: When clustered, View Content shows wrong content type, will not show formatted Key: NIFI-12516 URL: https://issues.apache.org/jira/browse/NIFI-12516 Project: Apache NiFi Issue Type: Bug Components: Core Framework, Core UI Affects Versions: 2.0.0-M1 Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0 When I choose to view a FlowFile's contents in the UI (regardless of whether it came from Provenance view or List Queue view), it shows the content. However, it shows the filename as an empty string, and it shows the content-type as "text/plain" even though the mime.type attribute is set to "application/json". As a result, when I try to switch the 'view as' option from 'original' to 'formatted', it does not render the content as JSON, since it thinks the data is text/plain. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (NIFI-12394) when importing versioned flow with component that migrates properties, controller service reference is invalid
[ https://issues.apache.org/jira/browse/NIFI-12394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796772#comment-17796772 ] Mark Payne commented on NIFI-12394: --- Thanks for reporting, [~mosermw]. That's a good corner case that I'd not thought of. So when a Processor (or Controller Service or Reporting Task) is created in the StandardVersionedComponentSynchronizer, we add a {{CreatedExtension}} to the {{createdExtensions}} Map. When the entry is added, it has the original property values. We then call {{updateProcessor}}, which calls {{populatePropertiesMap}}. This is the part of the code where, if a Property references a Controller Service, it updates the Properties Map. Then, at the end, we call {{migrateConfiguration}}, which is responsible for updating the properties, based on the original property values. So the solution that I would propose is to update the logic in {{populatePropertiesMap}} so that if it maps a property value to a Controller Service, we also get the {{CreatedExtension}} from the Map and update the property map there too. I believe this should then pass in the values to {{migrateConfiguration}} (which then calls {{migrateProperties}}) with the appropriate value. > when importing versioned flow with component that migrates properties, > controller service reference is invalid > -- > > Key: NIFI-12394 > URL: https://issues.apache.org/jira/browse/NIFI-12394 > Project: Apache NiFi > Issue Type: Bug > Components: Flow Versioning > Reporter: Michael W Moser > Priority: Major > > I built a Process Group containing one StandardRestrictedSSLContextService > that is referenced by one InvokeHTTP processor. I downloaded that Process > Group as a flow definition *with external services*. I also versioned > that Process Group in NiFi Registry. 
> Inside the flow definition file, I see the > StandardRestrictedSSLContextService with > "identifier":"d7d70b6c-abe4-3564-a219-b289cb7f25d2" and InvokeHTTP references > that UUID. > When I create a new Process Group using either the downloaded flow definition > or the NiFi Registry flow, a new StandardRestrictedSSLContextService is > created and it has a new UUID as expected. The InvokeHTTP processor is > invalid because it references the proposed > StandardRestrictedSSLContextService UUID d7d70b6c-abe4-3564-a219-b289cb7f25d2 > which does not exist. > The service and processor are created and references are updated, but when > migrating processor properties and any change occurs, the service reference > is reverted back to what was in proposedProperties. -- This message was sent by Atlassian Jira (v8.20.10#820010)
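The fix proposed in the comment on NIFI-12394 can be sketched with plain dictionaries. This is a Python illustration under stated assumptions, not the Java code in StandardVersionedComponentSynchronizer: the function name, the property names, and the representation of a created extension as a dict of property values are all hypothetical.

```python
from typing import Dict


def populate_properties(proposed: Dict[str, str],
                        service_id_mapping: Dict[str, str],
                        created_extension_props: Dict[str, str]) -> Dict[str, str]:
    """Resolve proposed controller-service IDs to the newly created IDs.

    `created_extension_props` plays the role of the snapshot that is later
    handed to property migration. The proposed fix is the marked line: when a
    value is remapped, the snapshot is updated as well, so migration no longer
    reverts the reference to the proposed (non-existent) ID.
    """
    resolved = {}
    for name, value in proposed.items():
        new_value = service_id_mapping.get(value, value)
        resolved[name] = new_value
        if new_value != value:
            created_extension_props[name] = new_value  # keep snapshot in sync
    return resolved
```

Properties that do not reference a mapped service pass through unchanged, and only remapped entries touch the snapshot.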
[jira] [Commented] (NIFI-9464) Provenance Events files corrupted
[ https://issues.apache.org/jira/browse/NIFI-9464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796766#comment-17796766 ]

Mark Payne commented on NIFI-9464:
---
Thanks [~tpalfy] - your analysis seems reasonable. As you said, it's difficult to confirm. One thing that I would recommend in order to confirm (although you'd not want to leave this in the codebase) would be to temporarily add a `Thread.sleep` into the code in between the time that the new .toc.tmp file completes its writing and the time that it's renamed to .toc. If you were to add a sleep of, say, 30 seconds or 1 minute, it would be easy to confirm that the threading issue is present as described and also that this change addresses the concern.

> Provenance Events files corrupted
> ---
>
> Key: NIFI-9464
> URL: https://issues.apache.org/jira/browse/NIFI-9464
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.11.0, 1.15.0
> Environment: java 11, centos 7, nifi standalone
> Reporter: Wiktor Kubicki
> Assignee: Tamas Palfy
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> In my logs I found:
> {code:java}
> SiteToSiteProvenanceReportingTask[id=b209c0ae-016e-1000-ae39-301c9dcfc544] Failed to retrieve Provenance Events from repository due to: Attempted to skip to byte offset 9149491 for 1125432890.prov.gz but file does not have that many bytes (TOC Reader=StandardTocReader[file=//provenance_repository/toc/1125432890.toc, compressed=false]): java.io.EOFException: Attempted to skip to byte offset 9149491 for 1125432890.prov.gz but file does not have that many bytes (TOC Reader=StandardTocReader[file=/.../provenance_repository/toc/1125432890.toc, compressed=false])
> {code}
> It is critically important for me to be 100% sure of my logs. It happened
> about 100 times in the last year for 15 *.prov.gz files:
> {code:java}
> -rw-rw-rw-. 1 user user 1013923 Oct 17 21:17 1075441276.prov.gz
> -rw-rw-rw-. 1 user user 1345431 Oct 24 13:06 1083362251.prov.gz
> -rw-rw-rw-. 1 user user 1359282 Oct 25 13:07 1084546392.prov.gz
> -rw-rw-rw-. 1 user user 1155791 Nov 2 17:08 1094516954.prov.gz
> -rw-rw-r--. 1 user user 974136 Nov 18 22:07 1113402183.prov.gz
> -rw-rw-r--. 1 user user 1125608 Nov 28 22:00 1125097576.prov.gz
> -rw-rw-r--. 1 user user 1248319 Nov 29 04:30 1125432890.prov.gz
> -rw-rw-r--. 1 user user 832120 Feb 2 2021 661957813.prov.gz
> -rw-rw-r--. 1 user user 1110978 Mar 17 2021 734807613.prov.gz
> -rw-rw-r--. 1 user user 1506819 Apr 16 2021 786154249.prov.gz
> -rw-rw-r--. 1 user user 1763198 May 25 2021 852626782.prov.gz
> -rw-rw-r--. 1 user user 1580598 Jun 15 08:32 891934274.prov.gz
> -rw-rw-r--. 1 user user 2960296 Jun 28 17:07 917991812.prov.gz
> -rw-rw-r--. 1 user user 1808037 Jun 28 17:37 918051650.prov.gz
> -rw-rw-rw-. 1 user user 765924 Aug 14 13:09 991505484.prov.gz
> {code}
> BTW, it's interesting that there are different chmods.
> My config for provenance (BTW, if you see a possibility to tune it, please
> tell me):
> {code:java}
> nifi.provenance.repository.directory.default=/../provenance_repository
> nifi.provenance.repository.max.storage.time=730 days
> nifi.provenance.repository.max.storage.size=512 GB
> nifi.provenance.repository.rollover.time=10 mins
> nifi.provenance.repository.rollover.size=100 MB
> nifi.provenance.repository.query.threads=2
> nifi.provenance.repository.index.threads=1
> nifi.provenance.repository.compress.on.rollover=true
> nifi.provenance.repository.always.sync=false
> nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID
> nifi.provenance.repository.indexed.attributes=
> nifi.provenance.repository.index.shard.size=1 GB
> nifi.provenance.repository.max.attribute.length=65536
> nifi.provenance.repository.concurrent.merge.threads=1
> nifi.provenance.repository.buffer.size=10
> {code}
> Now my provenance repo has 140GB of data.
>

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-12478) Return Message Type as body for JMS Object Messages
[ https://issues.apache.org/jira/browse/NIFI-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12478: -- Fix Version/s: 2.0.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Return Message Type as body for JMS Object Messages > --- > > Key: NIFI-12478 > URL: https://issues.apache.org/jira/browse/NIFI-12478 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: David Handermann >Assignee: David Handermann >Priority: Minor > Fix For: 2.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > The ConsumeJMS Processor supports receiving multiple types of JMS messages > and implements different serialization strategies for each type of message. > The JMS ObjectMessage Type provides a generic wrapper around an opaque Java > Object without any further information. The ConsumeJMS Processor currently > writes the bytes of an Object using Java Object serialization, which presents > several issues. Java Object serialization is not compatible with services > outside of Java, it writes the exact version of the Java Object, and it can > reference classes that may not be present on the receiving system. This can > lead to unexpected errors when receiving JMS messages in the context of a > NiFi Processor. Instead of reporting the message as an error, the message > metadata could still be useful in some flows. Using the Message Type of > {{ObjectMessage}} as the output bytes enables this edge case scenario, > although any system designed to interoperate with NiFi should use other types > of JMS messages to enable subsequent handling in other Processors. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-12480) Improve handling of embedded JSON records
[ https://issues.apache.org/jira/browse/NIFI-12480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12480: -- Fix Version/s: 2.0.0 Status: Patch Available (was: Open) > Improve handling of embedded JSON records > - > > Key: NIFI-12480 > URL: https://issues.apache.org/jira/browse/NIFI-12480 > Project: Apache NiFi > Issue Type: Improvement > Components: Core Framework, Extensions > Reporter: Mark Payne > Assignee: Mark Payne > Priority: Major > Fix For: 2.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > It proves to be difficult to treat embedded JSON elements as Strings. There > are times when this is necessary, such as when pushing to a database that > declares a field of type JSON or interacting with a web service that expects > incoming JSON as a String. However, there's no easy way to do this in NiFi > today. > Instead, what typically happens is that the Record gets converted to a String > via a call to {{toString()}} and that produces something like > {{MapRecord[name=John Doe, color=blue]}}, which is not helpful. > However, when a JSON Reader is used, we already have a JSON representation of > the Record in the record's SerializedForm. When {{toString()}} is called, we > should always use the SerializedForm of a Record, if it is available and only > fall back to the given version. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12480) Improve handling of embedded JSON records
Mark Payne created NIFI-12480: - Summary: Improve handling of embedded JSON records Key: NIFI-12480 URL: https://issues.apache.org/jira/browse/NIFI-12480 Project: Apache NiFi Issue Type: Improvement Components: Core Framework, Extensions Reporter: Mark Payne Assignee: Mark Payne It proves to be difficult to treat embedded JSON elements as Strings. There are times when this is necessary, such as when pushing to a database that declares a field of type JSON or interacting with a web service that expects incoming JSON as a String. However, there's no easy way to do this in NiFi today. Instead, what typically happens is that the Record gets converted to a String via a call to {{toString()}} and that produces something like {{MapRecord[name=John Doe, color=blue]}}, which is not helpful. However, when a JSON Reader is used, we already have a JSON representation of the Record in the record's SerializedForm. When {{toString()}} is called, we should always use the SerializedForm of a Record, if it is available and only fall back to the given version. -- This message was sent by Atlassian Jira (v8.20.10#820010)
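The string-conversion rule proposed in NIFI-12480 amounts to "prefer the serialized form, otherwise fall back". A minimal Python sketch of that rule follows; the `Record` class here is a hypothetical stand-in rather than NiFi's Java `MapRecord`, and the fallback format simply imitates the output quoted in the issue.

```python
from typing import Dict, Optional


class Record:
    """Hypothetical record holding field values and, optionally, the original
    serialized text it was read from (for example, the source JSON)."""

    def __init__(self, values: Dict[str, object],
                 serialized_form: Optional[str] = None):
        self.values = values
        self.serialized_form = serialized_form

    def __str__(self) -> str:
        # Prefer the serialized form when the reader supplied one.
        if self.serialized_form is not None:
            return self.serialized_form
        # Otherwise fall back to a MapRecord-style rendering.
        fields = ", ".join(f"{k}={v}" for k, v in self.values.items())
        return f"MapRecord[{fields}]"
```

A record read from JSON then converts to its JSON text, while a record built without a serialized form still produces the older rendering.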
[jira] [Assigned] (NIFI-12331) Introduce a PublishSlack processor
[ https://issues.apache.org/jira/browse/NIFI-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne reassigned NIFI-12331: - Assignee: Mark Payne > Introduce a PublishSlack processor > -- > > Key: NIFI-12331 > URL: https://issues.apache.org/jira/browse/NIFI-12331 > Project: Apache NiFi > Issue Type: New Feature > Components: Extensions > Reporter: Mark Payne > Assignee: Mark Payne > Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > The Slack API provides multiple different ways to publish messages to a Slack > channel. NiFi already has two Processors for pushing to Slack - PostSlack and > PutSlack. These processors have slightly different nuances, and the > documentation does not articulate when to use which one. One of them is > oriented more toward sending FlowFile contents as an attachment while the > other is oriented toward posting a message based on a property value. We > should consolidate both of these Processors into a single Processor that is > capable of sending a message and optionally providing the FlowFile content as > an attachment. > Both PostSlack and PutSlack make use of WebHooks instead of using the > official Slack SDK. This means that rather than simply specifying the name of > the Channel to post to, in order to send a message in Slack, the creator of > the Slack App must explicitly add a Webhook for the desired channel, and the > Processor must then be configured to use that Webhook. As a result, the > channel cannot be easily configured and cannot be dynamic. This makes it > difficult to use in conjunction with ListenSlack / ConsumeSlack in order to > respond in threads. > We need to consolidate both into a single Processor that is configured and > behaves differently, based on the SDK. 
> This Processor should be configured with properties that allow specifying: > * Bot Token > * Name of the channel to send to > * How to obtain the message content (FlowFile Content or specified as a > Property that accepts Expression Language) > * If using a Property value, should be configured also with the message to > send, and whether or not to attach the FlowFile content as an attachment to > the message. > * Thread Timestamp (optional to convey which thread the message should be > sent to) - should support Expression Language -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-12457) Add Use Case documentation to explain how to use RouteOnAttribute for specific use cases
[ https://issues.apache.org/jira/browse/NIFI-12457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12457: -- Status: Patch Available (was: Open) > Add Use Case documentation to explain how to use RouteOnAttribute for > specific use cases > > > Key: NIFI-12457 > URL: https://issues.apache.org/jira/browse/NIFI-12457 > Project: Apache NiFi > Issue Type: Improvement >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Add @MultiProcessorUseCase that explains how to use RouteOnAttribute in > conjunction with List/Fetch S3 in order to fetch only specific files. > Also add documentation showing how to use RouteOnAttribute alongside > PartitionRecord in order to route record-oriented data based on the contents > of the FlowFile -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12457) Add Use Case documentation to explain how to use RouteOnAttribute for specific use cases
Mark Payne created NIFI-12457: - Summary: Add Use Case documentation to explain how to use RouteOnAttribute for specific use cases Key: NIFI-12457 URL: https://issues.apache.org/jira/browse/NIFI-12457 Project: Apache NiFi Issue Type: Improvement Reporter: Mark Payne Assignee: Mark Payne Add @MultiProcessorUseCase that explains how to use RouteOnAttribute in conjunction with List/Fetch S3 in order to fetch only specific files. Also add documentation showing how to use RouteOnAttribute alongside PartitionRecord in order to route record-oriented data based on the contents of the FlowFile -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12456) Improve leniency of JSON readers and flexibility of JSON Writer
Mark Payne created NIFI-12456: - Summary: Improve leniency of JSON readers and flexibility of JSON Writer Key: NIFI-12456 URL: https://issues.apache.org/jira/browse/NIFI-12456 Project: Apache NiFi Issue Type: Improvement Components: Extensions Reporter: Mark Payne Currently, we adhere to the JSON specification fairly strictly, with the exception of allowing for "JSON Lines" / ndjson / ldjson. However, the Jackson library allows for several {{Features}} that we do not expose, which may be helpful for handling data that does not strictly adhere to the schema, or where there are preferences in serialization. For example, {{JsonParser.Feature}} makes it possible to allow comments in JSON (including lines beginning with {{{}//{}}}, {{{}/*{}}}, and "YAML Style" comments (#)). Additionally, it allows for single quotes around field names or no quoting at all. While these do not adhere to the specification, they are common enough for the parser to support them, and we should support them too. Similarly, on the serialization side, we have had requests to support writing decimal values without use of scientific notation, which can be achieved by enabling the {{WRITE_BIGDECIMAL_AS_PLAIN}} feature. We should expose these options on the JsonTreeReader and the JSON Writer. I don't know of any downside to enabling the leniency / non-standard options, so it probably makes sense to simply enable them all by default, though there is an argument for introducing a new "Parsing Leniency" option that allows the user to disable these features. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-12454) Allow decommissioning a node without shutdown
[ https://issues.apache.org/jira/browse/NIFI-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12454: -- Status: Patch Available (was: Open) > Allow decommissioning a node without shutdown > - > > Key: NIFI-12454 > URL: https://issues.apache.org/jira/browse/NIFI-12454 > Project: Apache NiFi > Issue Type: Improvement > Components: Core Framework >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > Fix For: 2.0.0 > > > When a node is decommissioned, it takes the following steps: > * Disconnect Node from cluster > * Trigger data offload > * Wait for data offload > * Remove node from cluster > * Shutdown > It would be helpful to allow taking the node out of the cluster without > completely terminating the process. -- This message was sent by Atlassian Jira (v8.20.10#820010)
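The step list in the ticket above can be sketched as a small plan builder in which the final shutdown is optional. This is only an illustration of the requested behaviour, not NiFi code: the class name DecommissionPlan, the Step enum and the shutdownAfterRemoval flag are all hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: models the decommission sequence from the ticket,
// with the last step (Shutdown) made optional. Not part of NiFi.
final class DecommissionPlan {
    enum Step { DISCONNECT_NODE, TRIGGER_OFFLOAD, WAIT_FOR_OFFLOAD, REMOVE_FROM_CLUSTER, SHUTDOWN }

    private DecommissionPlan() {}

    // Returns the ordered steps; SHUTDOWN is appended only when requested.
    static List<Step> steps(boolean shutdownAfterRemoval) {
        List<Step> plan = new ArrayList<>(Arrays.asList(
                Step.DISCONNECT_NODE,
                Step.TRIGGER_OFFLOAD,
                Step.WAIT_FOR_OFFLOAD,
                Step.REMOVE_FROM_CLUSTER));
        if (shutdownAfterRemoval) {
            plan.add(Step.SHUTDOWN);
        }
        return plan;
    }
}
```

Calling DecommissionPlan.steps(false) yields the first four steps only, which corresponds to taking the node out of the cluster while leaving the process running.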
[jira] [Updated] (NIFI-12453) Allow obtaining a node's cluster status via nifi.sh
[ https://issues.apache.org/jira/browse/NIFI-12453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12453: -- Status: Patch Available (was: Open) > Allow obtaining a node's cluster status via nifi.sh > --- > > Key: NIFI-12453 > URL: https://issues.apache.org/jira/browse/NIFI-12453 > Project: Apache NiFi > Issue Type: Improvement > Components: Core Framework >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > Fix For: 2.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > There are often times when we want to check the cluster status of a > particular node (e.g., is it CONNECTING, CONNECTED, DISCONNECTING, > DISCONNECTED, etc.). Currently the only way to obtain this information is via > the REST API or the full diagnostic dump. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12454) Allow decommissioning a node without shutdown
Mark Payne created NIFI-12454: - Summary: Allow decommissioning a node without shutdown Key: NIFI-12454 URL: https://issues.apache.org/jira/browse/NIFI-12454 Project: Apache NiFi Issue Type: Improvement Components: Core Framework Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0 When a node is decommissioned, it takes the following steps: * Disconnect Node from cluster * Trigger data offload * Wait for data offload * Remove node from cluster * Shutdown It would be helpful to allow taking the node out of the cluster without completely terminating the process. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12453) Allow obtaining a node's cluster status via nifi.sh
Mark Payne created NIFI-12453: - Summary: Allow obtaining a node's cluster status via nifi.sh Key: NIFI-12453 URL: https://issues.apache.org/jira/browse/NIFI-12453 Project: Apache NiFi Issue Type: Improvement Components: Core Framework Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.0.0 There are often times when we want to check the cluster status of a particular node (e.g., is it CONNECTING, CONNECTED, DISCONNECTING, DISCONNECTED, etc.). Currently the only way to obtain this information is via the REST API or the full diagnostic dump. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-11671) JoinEnrichment SQL strategy doesn't allow attributes in join statement
[ https://issues.apache.org/jira/browse/NIFI-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-11671: -- Status: Patch Available (was: Open) > JoinEnrichment SQL strategy doesn't allow attributes in join statement > -- > > Key: NIFI-11671 > URL: https://issues.apache.org/jira/browse/NIFI-11671 > Project: Apache NiFi > Issue Type: Bug > Components: Extensions >Affects Versions: 1.23.0, 1.20.0, 1.18.0 >Reporter: Philipp Korniets >Assignee: Mark Payne >Priority: Minor > Fix For: 1.latest, 2.latest > > Attachments: screenshot-1.png, screenshot-2.png > > Time Spent: 10m > Remaining Estimate: 0h > > We use the ForkEnrichment - JoinEnrichment pattern and want to include filtering > in the join SQL. The filter value comes from a FlowFile attribute: > {code:sql} > ${test} = 'NewValue' > SELECT original.*, enrichment.*,'${test}' > FROM original > LEFT OUTER JOIN enrichment > ON original.Underlying = enrichment.Underlying > WHERE enrichment.MyField = '${test}' > {code} > However, this doesn't work because JoinEnrichment doesn't use > evaluateAttributeExpressions. > Additionally, versions 1.18 and 1.23 do not allow the whole query to be passed as an > attribute. > !screenshot-1.png|width=692,height=431!
> > {code:java} > 2023-06-28 11:07:16,611 ERROR [Timer-Driven Process Thread-7] > o.a.n.processors.standard.JoinEnrichment > JoinEnrichment[id=dbe156ac-0187-1000-4477-0183899e0432] Failed to join > 'original' FlowFile > StandardFlowFileRecord[uuid=2ab9f6ad-73a5-4763-b25e-fd26c44835e1,claim=StandardContentClaim > [resourceClaim=StandardResourceClaim[id=1687948831976-629, > container=default, section=629], offset=8334082, > length=600557],offset=0,name=lmr_SY08C41-1_S_514682_20230627.csv,size=600557] > and 'enrichment' FlowFile > StandardFlowFileRecord[uuid=e4bb7769-fdce-4dfe-af18-443676103035,claim=StandardContentClaim > [resourceClaim=StandardResourceClaim[id=1687949723375-631, > container=default, section=631], offset=5362822, > length=1999502],offset=0,name=lmr_SY08C41-1_S_514682_20230627.csv,size=1999502]; > routing to failure > java.sql.SQLException: Error while preparing statement [${instrumentJoinSQL}] > at org.apache.calcite.avatica.Helper.createException(Helper.java:56) > at org.apache.calcite.avatica.Helper.createException(Helper.java:41) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement_(CalciteConnectionImpl.java:224) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:203) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:99) > at > org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:178) > at > org.apache.nifi.processors.standard.enrichment.SqlJoinCache.createCalciteParameters(SqlJoinCache.java:91) > at > org.apache.nifi.processors.standard.enrichment.SqlJoinCache.getCalciteParameters(SqlJoinCache.java:65) > at > org.apache.nifi.processors.standard.enrichment.SqlJoinStrategy.join(SqlJoinStrategy.java:49) > at > org.apache.nifi.processors.standard.JoinEnrichment.processBin(JoinEnrichment.java:387) > at > org.apache.nifi.processor.util.bin.BinFiles.processBins(BinFiles.java:233) > at > 
org.apache.nifi.processors.standard.JoinEnrichment.processBins(JoinEnrichment.java:503) > at > org.apache.nifi.processor.util.bin.BinFiles.onTrigger(BinFiles.java:193) > at > org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1354) > at > org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:246) > at > org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102) > at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750) > Caused by: java.lang.RuntimeException: parse failed: Encountered "$" at line > 1, column 1. > Was expecting one of: > "ABS" ... > {code} > As I understand issue is in following line of code >
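The parse failure above ('Encountered "$" at line 1, column 1') is what one would expect if the ${...} reference reaches the SQL parser unresolved. As a rough illustration of the missing step, the sketch below resolves ${name} references against a map of attributes before the text would be handed to a parser. It is not NiFi's Expression Language and not the eventual fix: the class AttributeSubstitution, the method resolve and the choice to reject unknown attributes are assumptions made for this example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: plain-text substitution of ${name} references.
// NiFi's real Expression Language is far richer; this only shows the idea.
final class AttributeSubstitution {
    // Matches ${name}, capturing everything between the braces.
    private static final Pattern REFERENCE = Pattern.compile("\\$\\{([^}]+)\\}");

    private AttributeSubstitution() {}

    static String resolve(String sql, Map<String, String> attributes) {
        Matcher matcher = REFERENCE.matcher(sql);
        StringBuffer resolved = new StringBuffer();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = attributes.get(name);
            if (value == null) {
                // Sketch-only choice: fail fast on unknown attributes.
                throw new IllegalArgumentException("No attribute named '" + name + "'");
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            matcher.appendReplacement(resolved, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(resolved);
        return resolved.toString();
    }
}
```

With an attribute test=NewValue, resolve("... WHERE enrichment.MyField = '${test}'", attributes) returns the same text ending in 'NewValue'. Raw text substitution like this is only reasonable for trusted attribute values, since the value becomes part of the SQL.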
[jira] [Assigned] (NIFI-11671) JoinEnrichment SQL strategy doesn't allow attributes in join statement
[ https://issues.apache.org/jira/browse/NIFI-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne reassigned NIFI-11671: - Assignee: Mark Payne > JoinEnrichment SQL strategy doesn't allow attributes in join statement > -- > > Key: NIFI-11671 > URL: https://issues.apache.org/jira/browse/NIFI-11671 > Project: Apache NiFi > Issue Type: Bug > Components: Extensions >Affects Versions: 1.18.0, 1.20.0, 1.23.0 >Reporter: Philipp Korniets >Assignee: Mark Payne >Priority: Minor > Fix For: 1.latest, 2.latest > > Attachments: screenshot-1.png, screenshot-2.png > > > We use the ForkEnrichment - JoinEnrichment pattern and want to include filtering > in the join SQL. The filter value comes from a FlowFile attribute: > {code:sql} > ${test} = 'NewValue' > SELECT original.*, enrichment.*,'${test}' > FROM original > LEFT OUTER JOIN enrichment > ON original.Underlying = enrichment.Underlying > WHERE enrichment.MyField = '${test}' > {code} > However, this doesn't work because JoinEnrichment doesn't use > evaluateAttributeExpressions. > Additionally, versions 1.18 and 1.23 do not allow the whole query to be passed as an > attribute. > !screenshot-1.png|width=692,height=431!
> > {code:java} > 2023-06-28 11:07:16,611 ERROR [Timer-Driven Process Thread-7] > o.a.n.processors.standard.JoinEnrichment > JoinEnrichment[id=dbe156ac-0187-1000-4477-0183899e0432] Failed to join > 'original' FlowFile > StandardFlowFileRecord[uuid=2ab9f6ad-73a5-4763-b25e-fd26c44835e1,claim=StandardContentClaim > [resourceClaim=StandardResourceClaim[id=1687948831976-629, > container=default, section=629], offset=8334082, > length=600557],offset=0,name=lmr_SY08C41-1_S_514682_20230627.csv,size=600557] > and 'enrichment' FlowFile > StandardFlowFileRecord[uuid=e4bb7769-fdce-4dfe-af18-443676103035,claim=StandardContentClaim > [resourceClaim=StandardResourceClaim[id=1687949723375-631, > container=default, section=631], offset=5362822, > length=1999502],offset=0,name=lmr_SY08C41-1_S_514682_20230627.csv,size=1999502]; > routing to failure > java.sql.SQLException: Error while preparing statement [${instrumentJoinSQL}] > at org.apache.calcite.avatica.Helper.createException(Helper.java:56) > at org.apache.calcite.avatica.Helper.createException(Helper.java:41) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement_(CalciteConnectionImpl.java:224) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:203) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:99) > at > org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:178) > at > org.apache.nifi.processors.standard.enrichment.SqlJoinCache.createCalciteParameters(SqlJoinCache.java:91) > at > org.apache.nifi.processors.standard.enrichment.SqlJoinCache.getCalciteParameters(SqlJoinCache.java:65) > at > org.apache.nifi.processors.standard.enrichment.SqlJoinStrategy.join(SqlJoinStrategy.java:49) > at > org.apache.nifi.processors.standard.JoinEnrichment.processBin(JoinEnrichment.java:387) > at > org.apache.nifi.processor.util.bin.BinFiles.processBins(BinFiles.java:233) > at > 
org.apache.nifi.processors.standard.JoinEnrichment.processBins(JoinEnrichment.java:503) > at > org.apache.nifi.processor.util.bin.BinFiles.onTrigger(BinFiles.java:193) > at > org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1354) > at > org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:246) > at > org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102) > at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750) > Caused by: java.lang.RuntimeException: parse failed: Encountered "$" at line > 1, column 1. > Was expecting one of: > "ABS" ... > {code} > As I understand issue is in following line of code >
[jira] [Updated] (NIFI-11671) JoinEnrichment SQL strategy doesn't allow attributes in join statement
[ https://issues.apache.org/jira/browse/NIFI-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-11671: -- Fix Version/s: 1.latest 2.latest > JoinEnrichment SQL strategy doesn't allow attributes in join statement > -- > > Key: NIFI-11671 > URL: https://issues.apache.org/jira/browse/NIFI-11671 > Project: Apache NiFi > Issue Type: Bug > Components: Extensions >Affects Versions: 1.18.0, 1.20.0, 1.23.0 >Reporter: Philipp Korniets >Priority: Minor > Fix For: 1.latest, 2.latest > > Attachments: screenshot-1.png, screenshot-2.png > > > We use the ForkEnrichment - JoinEnrichment pattern and want to include filtering > in the join SQL. The filter value comes from a FlowFile attribute: > {code:sql} > ${test} = 'NewValue' > SELECT original.*, enrichment.*,'${test}' > FROM original > LEFT OUTER JOIN enrichment > ON original.Underlying = enrichment.Underlying > WHERE enrichment.MyField = '${test}' > {code} > However, this doesn't work because JoinEnrichment doesn't use > evaluateAttributeExpressions. > Additionally, versions 1.18 and 1.23 do not allow the whole query to be passed as an > attribute. > !screenshot-1.png|width=692,height=431!
> > {code:java} > 2023-06-28 11:07:16,611 ERROR [Timer-Driven Process Thread-7] > o.a.n.processors.standard.JoinEnrichment > JoinEnrichment[id=dbe156ac-0187-1000-4477-0183899e0432] Failed to join > 'original' FlowFile > StandardFlowFileRecord[uuid=2ab9f6ad-73a5-4763-b25e-fd26c44835e1,claim=StandardContentClaim > [resourceClaim=StandardResourceClaim[id=1687948831976-629, > container=default, section=629], offset=8334082, > length=600557],offset=0,name=lmr_SY08C41-1_S_514682_20230627.csv,size=600557] > and 'enrichment' FlowFile > StandardFlowFileRecord[uuid=e4bb7769-fdce-4dfe-af18-443676103035,claim=StandardContentClaim > [resourceClaim=StandardResourceClaim[id=1687949723375-631, > container=default, section=631], offset=5362822, > length=1999502],offset=0,name=lmr_SY08C41-1_S_514682_20230627.csv,size=1999502]; > routing to failure > java.sql.SQLException: Error while preparing statement [${instrumentJoinSQL}] > at org.apache.calcite.avatica.Helper.createException(Helper.java:56) > at org.apache.calcite.avatica.Helper.createException(Helper.java:41) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement_(CalciteConnectionImpl.java:224) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:203) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:99) > at > org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:178) > at > org.apache.nifi.processors.standard.enrichment.SqlJoinCache.createCalciteParameters(SqlJoinCache.java:91) > at > org.apache.nifi.processors.standard.enrichment.SqlJoinCache.getCalciteParameters(SqlJoinCache.java:65) > at > org.apache.nifi.processors.standard.enrichment.SqlJoinStrategy.join(SqlJoinStrategy.java:49) > at > org.apache.nifi.processors.standard.JoinEnrichment.processBin(JoinEnrichment.java:387) > at > org.apache.nifi.processor.util.bin.BinFiles.processBins(BinFiles.java:233) > at > 
org.apache.nifi.processors.standard.JoinEnrichment.processBins(JoinEnrichment.java:503) > at > org.apache.nifi.processor.util.bin.BinFiles.onTrigger(BinFiles.java:193) > at > org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1354) > at > org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:246) > at > org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102) > at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750) > Caused by: java.lang.RuntimeException: parse failed: Encountered "$" at line > 1, column 1. > Was expecting one of: > "ABS" ... > {code} > As I understand issue is in following line of code >
[jira] [Updated] (NIFI-11671) JoinEnrichment SQL strategy doesn't allow attributes in join statement
[ https://issues.apache.org/jira/browse/NIFI-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-11671: -- Priority: Minor (was: Critical) > JoinEnrichment SQL strategy doesn't allow attributes in join statement > -- > > Key: NIFI-11671 > URL: https://issues.apache.org/jira/browse/NIFI-11671 > Project: Apache NiFi > Issue Type: Bug > Components: Core Framework >Affects Versions: 1.18.0, 1.20.0, 1.23.0 >Reporter: Philipp Korniets >Priority: Minor > Attachments: screenshot-1.png, screenshot-2.png > > > We use the ForkEnrichment - JoinEnrichment pattern and want to include filtering > in the join SQL. The filter value comes from a FlowFile attribute: > {code:sql} > ${test} = 'NewValue' > SELECT original.*, enrichment.*,'${test}' > FROM original > LEFT OUTER JOIN enrichment > ON original.Underlying = enrichment.Underlying > WHERE enrichment.MyField = '${test}' > {code} > However, this doesn't work because JoinEnrichment doesn't use > evaluateAttributeExpressions. > Additionally, versions 1.18 and 1.23 do not allow the whole query to be passed as an > attribute. > !screenshot-1.png|width=692,height=431!
> > {code:java} > 2023-06-28 11:07:16,611 ERROR [Timer-Driven Process Thread-7] > o.a.n.processors.standard.JoinEnrichment > JoinEnrichment[id=dbe156ac-0187-1000-4477-0183899e0432] Failed to join > 'original' FlowFile > StandardFlowFileRecord[uuid=2ab9f6ad-73a5-4763-b25e-fd26c44835e1,claim=StandardContentClaim > [resourceClaim=StandardResourceClaim[id=1687948831976-629, > container=default, section=629], offset=8334082, > length=600557],offset=0,name=lmr_SY08C41-1_S_514682_20230627.csv,size=600557] > and 'enrichment' FlowFile > StandardFlowFileRecord[uuid=e4bb7769-fdce-4dfe-af18-443676103035,claim=StandardContentClaim > [resourceClaim=StandardResourceClaim[id=1687949723375-631, > container=default, section=631], offset=5362822, > length=1999502],offset=0,name=lmr_SY08C41-1_S_514682_20230627.csv,size=1999502]; > routing to failure > java.sql.SQLException: Error while preparing statement [${instrumentJoinSQL}] > at org.apache.calcite.avatica.Helper.createException(Helper.java:56) > at org.apache.calcite.avatica.Helper.createException(Helper.java:41) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement_(CalciteConnectionImpl.java:224) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:203) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:99) > at > org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:178) > at > org.apache.nifi.processors.standard.enrichment.SqlJoinCache.createCalciteParameters(SqlJoinCache.java:91) > at > org.apache.nifi.processors.standard.enrichment.SqlJoinCache.getCalciteParameters(SqlJoinCache.java:65) > at > org.apache.nifi.processors.standard.enrichment.SqlJoinStrategy.join(SqlJoinStrategy.java:49) > at > org.apache.nifi.processors.standard.JoinEnrichment.processBin(JoinEnrichment.java:387) > at > org.apache.nifi.processor.util.bin.BinFiles.processBins(BinFiles.java:233) > at > 
org.apache.nifi.processors.standard.JoinEnrichment.processBins(JoinEnrichment.java:503) > at > org.apache.nifi.processor.util.bin.BinFiles.onTrigger(BinFiles.java:193) > at > org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1354) > at > org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:246) > at > org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102) > at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750) > Caused by: java.lang.RuntimeException: parse failed: Encountered "$" at line > 1, column 1. > Was expecting one of: > "ABS" ... > {code} > As I understand issue is in following line of code >
[jira] [Updated] (NIFI-11671) JoinEnrichment SQL strategy doesn't allow attributes in join statement
[ https://issues.apache.org/jira/browse/NIFI-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-11671: -- Component/s: Extensions (was: Core Framework) > JoinEnrichment SQL strategy doesn't allow attributes in join statement > -- > > Key: NIFI-11671 > URL: https://issues.apache.org/jira/browse/NIFI-11671 > Project: Apache NiFi > Issue Type: Bug > Components: Extensions >Affects Versions: 1.18.0, 1.20.0, 1.23.0 >Reporter: Philipp Korniets >Priority: Minor > Attachments: screenshot-1.png, screenshot-2.png > > > We use the ForkEnrichment - JoinEnrichment pattern and want to include filtering > in the join SQL. The filter value comes from a FlowFile attribute: > {code:sql} > ${test} = 'NewValue' > SELECT original.*, enrichment.*,'${test}' > FROM original > LEFT OUTER JOIN enrichment > ON original.Underlying = enrichment.Underlying > WHERE enrichment.MyField = '${test}' > {code} > However, this doesn't work because JoinEnrichment doesn't use > evaluateAttributeExpressions. > Additionally, versions 1.18 and 1.23 do not allow the whole query to be passed as an > attribute. > !screenshot-1.png|width=692,height=431!
> > {code:java} > 2023-06-28 11:07:16,611 ERROR [Timer-Driven Process Thread-7] > o.a.n.processors.standard.JoinEnrichment > JoinEnrichment[id=dbe156ac-0187-1000-4477-0183899e0432] Failed to join > 'original' FlowFile > StandardFlowFileRecord[uuid=2ab9f6ad-73a5-4763-b25e-fd26c44835e1,claim=StandardContentClaim > [resourceClaim=StandardResourceClaim[id=1687948831976-629, > container=default, section=629], offset=8334082, > length=600557],offset=0,name=lmr_SY08C41-1_S_514682_20230627.csv,size=600557] > and 'enrichment' FlowFile > StandardFlowFileRecord[uuid=e4bb7769-fdce-4dfe-af18-443676103035,claim=StandardContentClaim > [resourceClaim=StandardResourceClaim[id=1687949723375-631, > container=default, section=631], offset=5362822, > length=1999502],offset=0,name=lmr_SY08C41-1_S_514682_20230627.csv,size=1999502]; > routing to failure > java.sql.SQLException: Error while preparing statement [${instrumentJoinSQL}] > at org.apache.calcite.avatica.Helper.createException(Helper.java:56) > at org.apache.calcite.avatica.Helper.createException(Helper.java:41) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement_(CalciteConnectionImpl.java:224) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:203) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:99) > at > org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:178) > at > org.apache.nifi.processors.standard.enrichment.SqlJoinCache.createCalciteParameters(SqlJoinCache.java:91) > at > org.apache.nifi.processors.standard.enrichment.SqlJoinCache.getCalciteParameters(SqlJoinCache.java:65) > at > org.apache.nifi.processors.standard.enrichment.SqlJoinStrategy.join(SqlJoinStrategy.java:49) > at > org.apache.nifi.processors.standard.JoinEnrichment.processBin(JoinEnrichment.java:387) > at > org.apache.nifi.processor.util.bin.BinFiles.processBins(BinFiles.java:233) > at > 
org.apache.nifi.processors.standard.JoinEnrichment.processBins(JoinEnrichment.java:503) > at > org.apache.nifi.processor.util.bin.BinFiles.onTrigger(BinFiles.java:193) > at > org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1354) > at > org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:246) > at > org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102) > at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750) > Caused by: java.lang.RuntimeException: parse failed: Encountered "$" at line > 1, column 1. > Was expecting one of: > "ABS" ... > {code} > As I understand issue is in following line of code >
[jira] [Updated] (NIFI-12358) NPE when configured network interfaces do not exist
[ https://issues.apache.org/jira/browse/NIFI-12358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12358: -- Fix Version/s: 2.latest Assignee: Mark Payne Status: Patch Available (was: Open) > NPE when configured network interfaces do not exist > --- > > Key: NIFI-12358 > URL: https://issues.apache.org/jira/browse/NIFI-12358 > Project: Apache NiFi > Issue Type: Bug >Affects Versions: 1.20.0 >Reporter: Guillaume Lhermenier >Assignee: Mark Payne >Priority: Major > Fix For: 2.latest > > Time Spent: 10m > Remaining Estimate: 0h > > I recently had to switch our NiFi base AMIs in AWS from amazonlinux 2 to > amazonlinux 2023. > This went pretty smoothly, but I hit an issue with network interfaces. > For some reason, I had the following configured in my nifi.properties: > {code:java} > nifi.web.https.host=nifi1.emea.qa.domain.io > nifi.web.https.port=8443 > nifi.web.https.network.interface.eth0=eth0 > nifi.web.https.network.interface.eth1=eth1{code} > And this worked for many years. > However, in Amazon Linux, networking seems to have changed, and the interface naming too. > Instead of eth0/eth1, my network interfaces were named ens5/ens6. > Of course, NiFi wasn't able to find them. > However, the log could be clearer than a NullPointerException: > {code:java} > 2023-11-13 14:35:28,644 WARN [main] o.a.nifi.web.server.HostHeaderHandler > Failed to determine custom network interfaces.
> java.lang.NullPointerException: null > at > org.apache.nifi.web.server.HostHeaderHandler.extractIPsFromNetworkInterfaces(HostHeaderHandler.java:335) > at > org.apache.nifi.web.server.HostHeaderHandler.generateDefaultHostnames(HostHeaderHandler.java:276) > at > org.apache.nifi.web.server.HostHeaderHandler.<init>(HostHeaderHandler.java:100) > at org.apache.nifi.web.server.JettyServer.init(JettyServer.java:217) > at > org.apache.nifi.web.server.JettyServer.initialize(JettyServer.java:1074) > at org.apache.nifi.NiFi.<init>(NiFi.java:164) > at org.apache.nifi.NiFi.<init>(NiFi.java:83) > at org.apache.nifi.NiFi.main(NiFi.java:332) > 2023-11-13 14:35:28,649 INFO [main] o.a.nifi.web.server.HostHeaderHandler > Determined 14 valid hostnames and IP addresses for incoming headers: > 127.0.0.1, 127.0.0.1:8443, localhost, localhost:8443, [::1], [::1]:8443, > ip-172-30-xx-xx.eu-west-1.compute.internal, > ip-172-30-xx-xx.eu-west-1.compute.internal:8443, 172.30.xx.xx, > 172.30.xx.xx:8443, nifi1.emea.qa.domain.io, nifi1.emea.qa.domain.io:8443, > nifi.emea.qa.domain.io, {code} > > NB: I haven't tested this on versions newer than 1.20 and won't have time to > in the coming weeks. > However, our migration to 1.23 should be done in the next few months, and I'll update > the ticket if needed at that time. -- This message was sent by Atlassian Jira (v8.20.10#820010)
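One plausible source of an NPE like the one reported above is that java.net.NetworkInterface.getByName returns null, rather than throwing, when no interface has the given name. The sketch below shows a null-safe lookup that reports the missing interface by name. It is a minimal illustration and not the HostHeaderHandler code: the class InterfaceAddresses and the method addressesFor are hypothetical, and the warning is written to stderr only to keep the example self-contained.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a null-safe interface lookup; not NiFi code.
final class InterfaceAddresses {
    private InterfaceAddresses() {}

    // Returns the host addresses bound to the named interface,
    // or an empty list when no such interface exists.
    static List<String> addressesFor(String interfaceName) {
        try {
            // getByName returns null when the interface is not present.
            NetworkInterface nic = NetworkInterface.getByName(interfaceName);
            if (nic == null) {
                System.err.println("Configured network interface '" + interfaceName
                        + "' does not exist; ignoring it");
                return Collections.emptyList();
            }
            List<String> addresses = new ArrayList<>();
            for (InetAddress address : Collections.list(nic.getInetAddresses())) {
                addresses.add(address.getHostAddress());
            }
            return addresses;
        } catch (SocketException e) {
            // SocketException is an IOException, so it can be rethrown unchecked.
            throw new UncheckedIOException(e);
        }
    }
}
```

With a configuration that still names eth0 on a host whose interfaces are ens5/ens6, addressesFor("eth0") would return an empty list and print a message naming the missing interface instead of failing with a bare NullPointerException.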
[jira] [Updated] (NIFI-12374) Add Use Case based documentation for performing full/incremental loads
[ https://issues.apache.org/jira/browse/NIFI-12374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12374: -- Status: Patch Available (was: Open) > Add Use Case based documentation for performing full/incremental loads > -- > > Key: NIFI-12374 > URL: https://issues.apache.org/jira/browse/NIFI-12374 > Project: Apache NiFi > Issue Type: Task > Components: Extensions >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Minor > Fix For: 2.latest > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-12374) Add Use Case based documentation for performing full/incremental loads
Mark Payne created NIFI-12374: - Summary: Add Use Case based documentation for performing full/incremental loads Key: NIFI-12374 URL: https://issues.apache.org/jira/browse/NIFI-12374 Project: Apache NiFi Issue Type: Task Components: Extensions Reporter: Mark Payne Assignee: Mark Payne Fix For: 2.latest -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-12332) Remove nifi-toolkit-flowfile-repo Module
[ https://issues.apache.org/jira/browse/NIFI-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-12332: -- Fix Version/s: 2.0.0 (was: 2.latest) Resolution: Fixed Status: Resolved (was: Patch Available) > Remove nifi-toolkit-flowfile-repo Module > > > Key: NIFI-12332 > URL: https://issues.apache.org/jira/browse/NIFI-12332 > Project: Apache NiFi > Issue Type: Improvement >Reporter: David Handermann >Assignee: David Handermann >Priority: Minor > Fix For: 2.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > The nifi-toolkit-flowfile-repo module contains a command to support repairing > corrupted endings in a FlowFile repository. The command is not accessible > through any shell scripts and is not regularly maintained as part of the NiFi > CLI or other toolkit components. For these reasons, the module should be > removed from the main branch. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (NIFI-12339) Sensitive Dynamic Properties not properly decrypted, resulting in wrong property value and ever-growing flow.json.gz
[ https://issues.apache.org/jira/browse/NIFI-12339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne resolved NIFI-12339. --- Resolution: Fixed > Sensitive Dynamic Properties not properly decrypted, resulting in wrong > property value and ever-growing flow.json.gz > > > Key: NIFI-12339 > URL: https://issues.apache.org/jira/browse/NIFI-12339 > Project: Apache NiFi > Issue Type: Bug > Components: Core Framework >Reporter: Mark Payne >Assignee: David Handermann >Priority: Blocker > Fix For: 2.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > To replicate, create an InvokeHTTP Processor. Add a Sensitive Dynamic > Property named "Authorization" with a value of "Bearer > fsi8y3ofysp9f8ncp9nupnu8p3s9nu3s9" (it's ok that the value is nonsense). > Apply the changes. > Check the flow.json.gz: > {code:java} > cat conf/flow.json.gz | gunzip - | jq | grep Authorization{code} > Restart NiFi. > The value is no longer correct. And if you run the {{cat}} command above, > you'll see the value has doubled in length. 
After restarting several times we > can see this: > {code:java} > nifi-2.0.0-SNAPSHOT $ cat conf/flow.json.gz | gunzip - | jq | grep > Authorization > "Authorization": > "enc{f1f9ba180c6468ff8ce393955034e69383739de54b44ef42b1bf2050c2639e83815d940b8a0cf9f5bc65bdf36f7df59bff9d7e69fa02f0ccc25c8b381684550c8fc6b6a8c570998064ef730f05b0dc}", > -- restart --nifi-2.0.0-SNAPSHOT $ cat conf/flow.json.gz | gunzip - | jq | > grep Authorization > "Authorization": > "enc{e4455b884d07a7156397d2f60ce3a2f44be909084403f5a84af205bae2af6dbfa2adf47a33d6663799ab523915e9323064554030236b928d5b1684b0a9d635b6589d878b731c35ae1560fbef5627a433b23fb331657e66af355ac356a1c9cd1435c0836a4ecb872966c2852aa3b13e179da1a0f7898c64173b27363458c01dbf7c8595a5dfe9ab798834568c9e0a52fefaf03f6f9d1bdf6ad230fea7cf1e8663a78a6b964d945c729d9ae678e2eaba8910d02373cd9acd08e7a047e0c676ee8a13e9c0}", > -- restart --nifi-2.0.0-SNAPSHOT $ cat conf/flow.json.gz | gunzip - | jq | > grep Authorization > "Authorization": > "enc{1aeb6970c1ff7f10b88f5b94a2c0cfa70c179638eb976ff7580f5b2546a64b4d96ae834afff9d01cae79c98b9ca4d73af604eab5e95013047e79c152d3e90b3c556e054f9478713eb156da41477d59668902c606f3f300e9804b8a504712822b5f072a5a596c2ba1706520f0163ce8bf0a51dbaf84ee9359c60e55df029dec700725ff1ac599774d4271d5c390ad49d4b350d21bee9f2c235a81f5356d85279db7b4e335bc11fc0d6bf1045a6d2610ff61d8b9da931fc026d356a3d9a9b738312d283c01740757a286e5eb9ad675daa14a391d3df694eaeeb6c66085976a88c86a08052b3eb046e622e5346205bc1e38bfe4aed2ff130595688e4b72d217f29a5c24a28bc06c7bb55e4fd2d25fea15ce523e92b8d721e9a9c0d08ab6d1634cb027658c868feacd89462796b604db7dc55cc2bba7c650f77148bad4ec7328ae8dbeed743420b5b640061f36ed8c8c1db200bbe6a241d6eb370cb024a5881fc734d722e2f1091f1ffa178ad841a4859c9dc734b66a628fbfeb8c3f0a1e5d02e28ce3e2c04737ab5b92d032fafe21ebe5abd542731228b394356bb5b547c68517f972864351022d2ef1118426}", > -- restart -- > nifi-2.0.0-SNAPSHOT $ cat conf/flow.json.gz | gunzip - | jq | grep > Authorization > "Authorization": >