
[jira] [Updated] (NIFI-8134) DataTypeUtils.toRecord methods do not recursively convert Maps into Records

2024-05-14 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-8134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-8134:
-
Fix Version/s: 2.0.0-M4
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> DataTypeUtils.toRecord methods do not recursively convert Maps into Records
> ---
>
> Key: NIFI-8134
> URL: https://issues.apache.org/jira/browse/NIFI-8134
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.11.4
>Reporter: Chris Sampson
>Assignee: Chris Sampson
>Priority: Major
> Fix For: 2.0.0-M4
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Given a Java Map that contains one or more Maps as values (optionally nested 
> within arrays), the DataTypeUtils.toRecord method should convert the child 
> Maps to Records before converting the top-level Map.
> This assumes the associated schema for the data represents these objects as 
> Records (including as part of a Choice or Array type).
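
The depth-first conversion described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SimpleRecord` and `RecursiveMapConversion` are hypothetical names, not NiFi classes, and the sketch ignores the schema check that the real `DataTypeUtils.toRecord` methods depend on.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a record type; NOT NiFi's Record interface.
final class SimpleRecord {
    final Map<String, Object> values;

    SimpleRecord(Map<String, Object> values) {
        this.values = values;
    }
}

final class RecursiveMapConversion {

    // Converts depth-first: child Maps (also those inside Lists or arrays)
    // become records before the enclosing Map is wrapped.
    static Object convert(Object value) {
        if (value instanceof Map) {
            Map<String, Object> converted = new LinkedHashMap<>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                converted.put(String.valueOf(entry.getKey()), convert(entry.getValue()));
            }
            return new SimpleRecord(converted);
        }
        if (value instanceof List) {
            List<Object> converted = new ArrayList<>();
            for (Object element : (List<?>) value) {
                converted.add(convert(element));
            }
            return converted;
        }
        if (value instanceof Object[]) {
            Object[] array = (Object[]) value;
            Object[] converted = new Object[array.length];
            for (int i = 0; i < array.length; i++) {
                converted[i] = convert(array[i]);
            }
            return converted;
        }
        return value; // scalars are returned unchanged
    }
}
```

A schema-aware version would presumably convert a child Map only when the corresponding field type is a Record (or a Choice/Array containing one), as the description notes.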



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Updated] (NIFI-13206) S3 Integration tests failing due to server-side encrypt enabled by default

2024-05-09 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-13206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-13206:
--
Status: Patch Available  (was: Open)

> S3 Integration tests failing due to server-side encrypt enabled by default
> --
>
> Key: NIFI-13206
> URL: https://issues.apache.org/jira/browse/NIFI-13206
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0-M3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The 3.0 version of Localstack enabled S3 server-side encryption by default. 
> This is causing integration tests now to fail with errors such as:
> {code:java}
> org.opentest4j.AssertionFailedError: Attribute s3.sseAlgorithm should not 
> exist on FlowFile, but exists with value AES256 ==> 
> Expected :false
> Actual   :true {code}
> This is happening in both Fetch and Put S3 integration tests. The tests work 
> as-is if we change the Docker image of Localstack to 2.3.2.




[jira] [Created] (NIFI-13206) S3 Integration tests failing due to server-side encrypt enabled by default

2024-05-09 Thread Mark Payne (Jira)

Mark Payne created NIFI-13206:
-

 Summary: S3 Integration tests failing due to server-side encrypt 
enabled by default
 Key: NIFI-13206
 URL: https://issues.apache.org/jira/browse/NIFI-13206
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0-M3


The 3.0 version of Localstack enabled S3 server-side encryption by default. 
This is causing integration tests now to fail with errors such as:
{code:java}
org.opentest4j.AssertionFailedError: Attribute s3.sseAlgorithm should not exist 
on FlowFile, but exists with value AES256 ==> 
Expected :false
Actual   :true {code}
This is happening in both Fetch and Put S3 integration tests. The tests work 
as-is if we change the Docker image of Localstack to 2.3.2.




[jira] [Commented] (NIFI-13200) Framework should not allow removal of 'filename' or 'path' attributes

2024-05-09 Thread Mark Payne (Jira)



[ 
https://issues.apache.org/jira/browse/NIFI-13200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845158#comment-17845158
 ] 

Mark Payne commented on NIFI-13200:
---

Yup good catch, [~mosermw] 

> Framework should not allow removal of 'filename' or 'path' attributes
> -
>
> Key: NIFI-13200
> URL: https://issues.apache.org/jira/browse/NIFI-13200
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0-M3
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, the framework prevents processors from removing the 'uuid' 
> attribute, but it allows any other attribute to be removed. However, 
> there are many processors that assume that the 'filename' and 'path' 
> attributes exist, and they are intended always to exist - they are even 
> assigned values when the FlowFile is created. We should ensure that these 
> attributes cannot be removed.
> Otherwise, configuring UpdateAttribute to remove these attributes can cause 
> follow-on processors to fail with unexpected NullPointerExceptions
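
One way to picture the proposed guard is the minimal sketch below. It assumes that protected attributes are silently skipped on removal; the ticket does not say whether the framework should ignore the request or reject it, and `ProtectedAttributes` is a hypothetical helper, not framework code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; not part of the NiFi framework.
final class ProtectedAttributes {

    // Attributes that should always be present on a FlowFile per the ticket.
    private static final Set<String> PROTECTED = Set.of("uuid", "filename", "path");

    // Returns a copy of the attributes with the requested names removed,
    // leaving protected attributes in place (assumed behavior: skip silently).
    static Map<String, String> removeAttributes(Map<String, String> attributes, Set<String> namesToRemove) {
        Map<String, String> result = new HashMap<>(attributes);
        for (String name : namesToRemove) {
            if (!PROTECTED.contains(name)) {
                result.remove(name);
            }
        }
        return result;
    }
}
```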




[jira] [Updated] (NIFI-13203) Include s3.url attribute in FetchS3Object and PutS3Object

2024-05-09 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-13203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-13203:
--
Fix Version/s: 2.0.0-M3
 Assignee: Mark Payne
   Status: Patch Available  (was: Open)

> Include s3.url attribute in FetchS3Object and PutS3Object
> -
>
> Key: NIFI-13203
> URL: https://issues.apache.org/jira/browse/NIFI-13203
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0-M3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The FetchS3Object and PutS3Object add several s3-related attributes, such as 
> s3.bucket, s3.key, etc. and then at the end of the onTrigger, they emit a 
> provenance event with the S3 URL. However, there is no attribute for the S3 
> url. This would be handy to have available when building flows.




[jira] [Created] (NIFI-13203) Include s3.url attribute in FetchS3Object and PutS3Object

2024-05-09 Thread Mark Payne (Jira)

Mark Payne created NIFI-13203:
-

 Summary: Include s3.url attribute in FetchS3Object and PutS3Object
 Key: NIFI-13203
 URL: https://issues.apache.org/jira/browse/NIFI-13203
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Mark Payne


The FetchS3Object and PutS3Object add several s3-related attributes, such as 
s3.bucket, s3.key, etc. and then at the end of the onTrigger, they emit a 
provenance event with the S3 URL. However, there is no attribute for the S3 
url. This would be handy to have available when building flows.




[jira] [Updated] (NIFI-13200) Framework should not allow removal of 'filename' or 'path' attributes

2024-05-09 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-13200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-13200:
--
Status: Patch Available  (was: Open)

> Framework should not allow removal of 'filename' or 'path' attributes
> -
>
> Key: NIFI-13200
> URL: https://issues.apache.org/jira/browse/NIFI-13200
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, the framework prevents processors from removing the 'uuid' 
> attribute, but it allows any other attribute to be removed. However, 
> there are many processors that assume that the 'filename' and 'path' 
> attributes exist, and they are intended always to exist - they are even 
> assigned values when the FlowFile is created. We should ensure that these 
> attributes cannot be removed.
> Otherwise, configuring UpdateAttribute to remove these attributes can cause 
> follow-on processors to fail with unexpected NullPointerExceptions




[jira] [Created] (NIFI-13200) Framework should not allow removal of 'filename' or 'path' attributes

2024-05-09 Thread Mark Payne (Jira)

Mark Payne created NIFI-13200:
-

 Summary: Framework should not allow removal of 'filename' or 
'path' attributes
 Key: NIFI-13200
 URL: https://issues.apache.org/jira/browse/NIFI-13200
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Core Framework
Reporter: Mark Payne
Assignee: Mark Payne


Currently, the framework prevents processors from removing the 'uuid' 
attribute, but it allows any other attribute to be removed. However, 
there are many processors that assume that the 'filename' and 'path' attributes 
exist, and they are intended always to exist - they are even assigned values 
when the FlowFile is created. We should ensure that these attributes cannot be 
removed.

Otherwise, configuring UpdateAttribute to remove these attributes can cause 
follow-on processors to fail with unexpected NullPointerExceptions




[jira] [Created] (NIFI-13199) Update ValidateRecord to avoid writing to FlowFiles that will be auto-terminated

2024-05-09 Thread Mark Payne (Jira)

Mark Payne created NIFI-13199:
-

 Summary: Update ValidateRecord to avoid writing to FlowFiles that 
will be auto-terminated
 Key: NIFI-13199
 URL: https://issues.apache.org/jira/browse/NIFI-13199
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Mark Payne


NIFI-13196 introduces the ability to check if a relationship is 
auto-terminated. In the case of ValidateRecord, the processor is commonly used 
to filter out invalid records. Before writing records to an 'invalid' FlowFile 
we should first check if the relationship is auto-terminated and not spend the 
resources to create the data if it will be auto-terminated.




[jira] [Created] (NIFI-13198) Update RouteText not to write to FlowFiles for auto-terminated relationships

2024-05-09 Thread Mark Payne (Jira)

Mark Payne created NIFI-13198:
-

 Summary: Update RouteText not to write to FlowFiles for 
auto-terminated relationships
 Key: NIFI-13198
 URL: https://issues.apache.org/jira/browse/NIFI-13198
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Mark Payne


NIFI-13196 introduces the ability to check if a relationship is 
auto-terminated. In the case of RouteText, the processor is commonly used to 
filter out unwanted lines of text. For anything that is auto-terminated, 
though, we still write out the data. We should instead check if the 
Relationship that we're writing to is auto-terminated and if so, don't bother 
creating the flowfile or writing to it.




[jira] [Updated] (NIFI-13196) Add a new isAutoTerminated(Relationship) method to ProcessContext

2024-05-09 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-13196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-13196:
--
Fix Version/s: 2.0.0-M3
   Status: Patch Available  (was: Open)

> Add a new isAutoTerminated(Relationship) method to ProcessContext
> -
>
> Key: NIFI-13196
> URL: https://issues.apache.org/jira/browse/NIFI-13196
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0-M3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently a Processor has no way of determining whether or not a Relationship 
> is auto-terminated. There are cases where a Processor forks an incoming 
> FlowFile and updates it (in a potentially expensive manner) and then 
> transfers it to a Relationship that is auto-terminated.
> We should add the ability to determine whether or not a given relationship is 
> auto-terminated so that we can be more efficient




[jira] [Created] (NIFI-13196) Add a new isAutoTerminated(Relationship) method to ProcessContext

2024-05-09 Thread Mark Payne (Jira)

Mark Payne created NIFI-13196:
-

 Summary: Add a new isAutoTerminated(Relationship) method to 
ProcessContext
 Key: NIFI-13196
 URL: https://issues.apache.org/jira/browse/NIFI-13196
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Core Framework
Reporter: Mark Payne
Assignee: Mark Payne


Currently a Processor has no way of determining whether or not a Relationship 
is auto-terminated. There are cases where a Processor forks an incoming 
FlowFile and updates it (in a potentially expensive manner) and then transfers 
it to a Relationship that is auto-terminated.

We should add the ability to determine whether or not a given relationship is 
auto-terminated so that we can be more efficient
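
As a rough sketch of how a processor might use such a check, assuming a simplified context: `SketchContext` and `AutoTerminationSketch` are hypothetical stand-ins, and the relationship is identified by a plain String here rather than a `Relationship` object.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a processor context; only the
// isAutoTerminated idea comes from the ticket.
interface SketchContext {
    boolean isAutoTerminated(String relationshipName);
}

final class AutoTerminationSketch {

    // Returns the content produced for the given relationship, skipping the
    // (potentially expensive) work when the result would be dropped anyway.
    static List<String> writeIfNeeded(SketchContext context, String relationship, List<String> lines) {
        List<String> written = new ArrayList<>();
        if (context.isAutoTerminated(relationship)) {
            return written; // nothing downstream will ever read this output
        }
        for (String line : lines) {
            written.add(line.toUpperCase()); // stand-in for expensive processing
        }
        return written;
    }
}
```

The same guard would apply wherever a processor forks a FlowFile only to route it to a relationship that nobody consumes.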




[jira] [Resolved] (NIFI-13146) ConsumeSlack processor rate limited

2024-05-09 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne resolved NIFI-13146.
---
Fix Version/s: 2.0.0-M3
   Resolution: Fixed

> ConsumeSlack processor rate limited
> ---
>
> Key: NIFI-13146
> URL: https://issues.apache.org/jira/browse/NIFI-13146
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Zsihovszki Krisztina
>Assignee: Zsihovszki Krisztina
>Priority: Major
> Fix For: 2.0.0-M3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> ConsumeSlack processor was not able to start when it was running in a Slack 
> workspace with thousands of channels. It got stuck in the initialization 
> phase and reported rate limit errors continuously.
> The processor fetches all available channels (conversation list) during its 
> setup and creates a channel id/name mapping. 
> Fetching the conversation list items is executed in batches, 1000 
> channels/batch. Since the fetch is done continuously, after a while the Slack 
> API returns a rate limit error. (Rate limit settings were at default in Slack.)
> Even if some delay was added after each API call, 30 seconds was not enough 
> to fetch all the channels (since it is in onScheduled, after 30 seconds the 
> initialization is re-attempted).
> As a mitigation for the problem I'd like to add logic that checks whether the 
> "Channels" property contains only IDs. If no channel name is specified, 
> another Slack API call (
> [conversations.info|https://api.slack.com/methods/conversations.info]) can be 
> used to fetch the channel names for the channel IDs, so it is not necessary 
> to fetch the whole conversation list.
>  




[jira] [Resolved] (NIFI-13076) reduce enum array allocation in OpenTelemetry bundle

2024-04-22 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-13076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne resolved NIFI-13076.
---
Fix Version/s: 2.0.0-M3
   Resolution: Fixed

> reduce enum array allocation in OpenTelemetry bundle
> 
>
> Key: NIFI-13076
> URL: https://issues.apache.org/jira/browse/NIFI-13076
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Sean Sullivan
>Priority: Minor
> Fix For: 2.0.0-M3
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> module:  *nifi-opentelemetry-bundle* 
>  
> h2. Motivation
> reduce enum array allocation
> h2. Modifications
> cache enum .values() in a static variable
> h2. Additional context
> [https://www.gamlor.info/wordpress/2017/08/javas-enum-values-hidden-allocations/]
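
The modification amounts to the pattern sketched below. `SignalType` is a hypothetical enum used only for illustration; it is not taken from the OpenTelemetry bundle.

```java
// Hypothetical enum for illustration; not taken from nifi-opentelemetry-bundle.
enum SignalType {
    TRACES, METRICS, LOGS;

    // values() returns a fresh array copy on every call, so keep one copy
    // in a static field and reuse it on hot paths.
    private static final SignalType[] CACHED_VALUES = values();

    // Looks up a constant by ordinal without allocating a new array.
    static SignalType fromOrdinal(int ordinal) {
        if (ordinal < 0 || ordinal >= CACHED_VALUES.length) {
            throw new IllegalArgumentException("Unknown ordinal: " + ordinal);
        }
        return CACHED_VALUES[ordinal];
    }
}
```

The cached array must never be exposed to callers, since unlike the copy returned by `values()` it is shared and mutable.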




[jira] [Commented] (NIFI-12986) Tidy up JavaDoc of ProcessSession

2024-04-22 Thread Mark Payne (Jira)



[ 
https://issues.apache.org/jira/browse/NIFI-12986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839719#comment-17839719
 ] 

Mark Payne commented on NIFI-12986:
---

This PR incorrectly marks {{ProcessSession.commit()}} as being deprecated. It is 
not deprecated. In the vast majority of cases, {{commitAsync()}} should be 
preferred. However, there are still cases where {{commit()}} may make sense. It 
is used, for example, in the Site-to-Site server, as it cannot respond to the 
client until the commit has completed. Such code *could* be rewritten to use 
commitAsync but currently has not. We should not be deprecating methods that we 
are actively using and do not necessarily intend to stop using. Additionally, 
while it is possible to rewrite in such a way that it uses commitAsync, there's 
really no need to, as the synchronous commit is still a valid approach and is 
more straightforward.

Additionally, the PR changes the formatting of the methods from the syntax:
{code:java}
<p>
Documentation Paragraph 1
</p>
<p>
Documentation Paragraph 2
</p>
{code}
To a less explicit version of:
{code:java}
Documentation Paragraph 1
<p>
Documentation Paragraph 2 {code}
This should be undone, as the former formatting is preferred and is the 
dominant formatting throughout the codebase. It also makes it more clear where 
a paragraph begins and ends, and results in more consistent rendering of the 
text, as the latter approach does not necessarily apply the same formatting as 
the former.

It does look like the commit applies some additional documentation around 
Exceptions that are thrown, but honestly it is difficult to say, as Github 
shows it as if methods were added and removed, or parameters were changed, etc. 
I think it gets confused by the change in formatting?

> Tidy up JavaDoc of ProcessSession
> -
>
> Key: NIFI-12986
> URL: https://issues.apache.org/jira/browse/NIFI-12986
> Project: Apache NiFi
>  Issue Type: Sub-task
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
> Fix For: 2.0.0-M3
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> While working on NIFI-12982 I noticed that the JavaDoc of {{ProcessSession}} 
> has some minor typos and documentation drifts between method overloads.
> The goal of this ticket is to make the JavaDoc for the current 
> {{ProcessSession}} specification more consistent. The specified contract must 
> not be altered. 




[jira] [Reopened] (NIFI-12986) Tidy up JavaDoc of ProcessSession

2024-04-22 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne reopened NIFI-12986:
---

Reopening Issue, as I do not believe it to be correct. Will elaborate more in a 
separate comment.

> Tidy up JavaDoc of ProcessSession
> -
>
> Key: NIFI-12986
> URL: https://issues.apache.org/jira/browse/NIFI-12986
> Project: Apache NiFi
>  Issue Type: Sub-task
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
> Fix For: 2.0.0-M3
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> While working on NIFI-12982 I noticed that the JavaDoc of {{ProcessSession}} 
> has some minor typos and documentation drifts between method overloads.
> The goal of this ticket is to make the JavaDoc for the current 
> {{ProcessSession}} specification more consistent. The specified contract must 
> not be altered. 




[jira] [Commented] (NIFI-12969) Under heavy load, nifi node unable to rejoin cluster, graph modified with temp funnel

2024-04-03 Thread Mark Payne (Jira)



[ 
https://issues.apache.org/jira/browse/NIFI-12969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17833738#comment-17833738
 ] 

Mark Payne commented on NIFI-12969:
---

[~Nissim Shiman] [~pgyori] I pushed a PR that appears to address the issue. I 
believe you're on the right track, that the situation is caused by the fact 
that the temp funnel was incorrectly used. But instead of trying to detect when 
it's going to happen and/or rollback, the issue is that we had a bug in the 
logic for when the temp funnel was created. In this case, there should never be 
a temp funnel. In cases where we DO need a temp funnel, the existing logic 
should handle stopping the Port, which would make this work smoothly. The issue 
arose here because the Port was (rightly) left running. We just need to avoid 
creating the temp funnel unnecessarily.

> Under heavy load, nifi node unable to rejoin cluster, graph modified with 
> temp funnel
> -
>
> Key: NIFI-12969
> URL: https://issues.apache.org/jira/browse/NIFI-12969
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.24.0, 2.0.0-M2
>Reporter: Nissim Shiman
>Assignee: Mark Payne
>Priority: Critical
> Fix For: 2.0.0-M3, 1.26.0
>
> Attachments: nifi-app.log, simple_flow.png, 
> simple_flow_with_temp-funnel.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Under heavy load, if a node leaves the cluster (due to heartbeat time out), 
> many times it is unable to rejoin the cluster.
> The node's graph will have been modified with a temp-funnel as well.
> Appears to be some sort of [timing 
> issue|https://github.com/apache/nifi/blob/rel/nifi-2.0.0-M2/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-components/src/main/java/org/apache/nifi/connectable/StandardConnection.java#L298]
>  # To reproduce, on a nifi cluster of three nodes, set up:
> 2 GenerateFlowFile processors -> PG
> Inside PG:
> inputPort -> UpdateAttribute
>  # Keep all defaults except for the following:
> For UpdateAttribute terminate the success relationship
> One of the GenerateFlowFile processors can be disabled,
> the other one should have Run Schedule to be 0 min (this will allow for the 
> heavy load)
>  # In nifi.properties (on all 3 nodes) to allow for nodes to fall out of the 
> cluster, set: nifi.cluster.protocol.heartbeat.interval=2 sec  (default is 5) 
> nifi.cluster.protocol.heartbeat.missable.max=1   (default is 8)
> Restart nifi. Start flow. The nodes will quickly fall out and rejoin cluster. 
> After a few minutes one will likely not be able to rejoin.  The graph for 
> that node will have the disabled GenerateFlowFile now pointing to a funnel (a 
> temp-funnel) instead of the PG
> Stack trace in that node's nifi-app.log will look like this: (this is from 
> 2.0.0-M2):
> {code:java}
> 2024-03-28 13:55:19,395 INFO [Reconnect to Cluster] 
> o.a.nifi.controller.StandardFlowService Node disconnected due to Failed to 
> properly handle Reconnection request due to 
> org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed 
> to connect node to cluster because local flow controller partially updated. 
> Administrator should disconnect node and review flow for corruption.
> 2024-03-28 13:55:19,395 ERROR [Reconnect to Cluster] 
> o.a.nifi.controller.StandardFlowService Handling reconnection request failed 
> due to: 
> org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed 
> to connect node to cluster because local flow controller partially updated. 
> Administrator should disconnect node and review flow for corruption.
> org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed 
> to connect node to cluster because local flow controller partially updated. 
> Administrator should disconnect node and review flow for corruption.
> at 
> org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:985)
> at 
> org.apache.nifi.controller.StandardFlowService.handleReconnectionRequest(StandardFlowService.java:655)
> at 
> org.apache.nifi.controller.StandardFlowService$1.run(StandardFlowService.java:384)
> at java.base/java.lang.Thread.run(Thread.java:1583)
> Caused by: 
> org.apache.nifi.controller.serialization.FlowSynchronizationException: 
> java.lang.IllegalStateException: Cannot change destination of Connection 
> because FlowFiles from this Connection
> are currently held by LocalPort[id=99213c00-78ca-4848-112f-5454cc20656b, 
> type=INPUT_PORT, name=inputPort, group=innerPG]
> at 
> org.apache.nifi.controller.serialization.VersionedFlowSynchronizer.synchronizeFlow(VersionedFlowSynchronizer.java:472)
> at 
>

[jira] [Updated] (NIFI-12969) Under heavy load, nifi node unable to rejoin cluster, graph modified with temp funnel

2024-04-03 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12969:
--
Assignee: Mark Payne  (was: Nissim Shiman)
  Status: Patch Available  (was: Open)

> Under heavy load, nifi node unable to rejoin cluster, graph modified with 
> temp funnel
> -
>
> Key: NIFI-12969
> URL: https://issues.apache.org/jira/browse/NIFI-12969
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 2.0.0-M2, 1.24.0
>Reporter: Nissim Shiman
>Assignee: Mark Payne
>Priority: Critical
> Fix For: 2.0.0-M3, 1.26.0
>
> Attachments: nifi-app.log, simple_flow.png, 
> simple_flow_with_temp-funnel.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Under heavy load, if a node leaves the cluster (due to heartbeat time out), 
> many times it is unable to rejoin the cluster.
> The node's graph will have been modified with a temp-funnel as well.
> Appears to be some sort of [timing 
> issue|https://github.com/apache/nifi/blob/rel/nifi-2.0.0-M2/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-components/src/main/java/org/apache/nifi/connectable/StandardConnection.java#L298]
>  # To reproduce, on a nifi cluster of three nodes, set up:
> 2 GenerateFlowFile processors -> PG
> Inside PG:
> inputPort -> UpdateAttribute
>  # Keep all defaults except for the following:
> For UpdateAttribute terminate the success relationship
> One of the GenerateFlowFile processors can be disabled,
> the other one should have Run Schedule to be 0 min (this will allow for the 
> heavy load)
>  # In nifi.properties (on all 3 nodes) to allow for nodes to fall out of the 
> cluster, set: nifi.cluster.protocol.heartbeat.interval=2 sec  (default is 5) 
> nifi.cluster.protocol.heartbeat.missable.max=1   (default is 8)
> Restart nifi. Start flow. The nodes will quickly fall out and rejoin cluster. 
> After a few minutes one will likely not be able to rejoin.  The graph for 
> that node will have the disabled GenerateFlowFile now pointing to a funnel (a 
> temp-funnel) instead of the PG
> Stack trace in that node's nifi-app.log will look like this: (this is from 
> 2.0.0-M2):
> {code:java}
> 2024-03-28 13:55:19,395 INFO [Reconnect to Cluster] 
> o.a.nifi.controller.StandardFlowService Node disconnected due to Failed to 
> properly handle Reconnection request due to 
> org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed 
> to connect node to cluster because local flow controller partially updated. 
> Administrator should disconnect node and review flow for corruption.
> 2024-03-28 13:55:19,395 ERROR [Reconnect to Cluster] 
> o.a.nifi.controller.StandardFlowService Handling reconnection request failed 
> due to: 
> org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed 
> to connect node to cluster because local flow controller partially updated. 
> Administrator should disconnect node and review flow for corruption.
> org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed 
> to connect node to cluster because local flow controller partially updated. 
> Administrator should disconnect node and review flow for corruption.
> at 
> org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:985)
> at 
> org.apache.nifi.controller.StandardFlowService.handleReconnectionRequest(StandardFlowService.java:655)
> at 
> org.apache.nifi.controller.StandardFlowService$1.run(StandardFlowService.java:384)
> at java.base/java.lang.Thread.run(Thread.java:1583)
> Caused by: 
> org.apache.nifi.controller.serialization.FlowSynchronizationException: 
> java.lang.IllegalStateException: Cannot change destination of Connection 
> because FlowFiles from this Connection
> are currently held by LocalPort[id=99213c00-78ca-4848-112f-5454cc20656b, 
> type=INPUT_PORT, name=inputPort, group=innerPG]
> at 
> org.apache.nifi.controller.serialization.VersionedFlowSynchronizer.synchronizeFlow(VersionedFlowSynchronizer.java:472)
> at 
> org.apache.nifi.controller.serialization.VersionedFlowSynchronizer.sync(VersionedFlowSynchronizer.java:223)
> at 
> org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1740)
> at 
> org.apache.nifi.persistence.StandardFlowConfigurationDAO.load(StandardFlowConfigurationDAO.java:91)
> at 
> org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:805)
> at 
> org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:954)
> ... 3 common frames omitted
> Caused by: java.lang.IllegalStateException: Cannot change destination of 
> Connection

[jira] [Updated] (NIFI-12934) RenameRecordField does not clear serialized form of records

2024-03-27 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12934:
--
Fix Version/s: 2.0.0-M3

> RenameRecordField does not clear serialized form of records
> ---
>
> Key: NIFI-12934
> URL: https://issues.apache.org/jira/browse/NIFI-12934
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Mark Payne
>Priority: Critical
> Fix For: 2.0.0-M3
>
>
> When RenameRecordField runs, it updates the parent record for any matches. 
> However, when this happens, the Record's "serialized form" does not get 
> cleared. As a result, the Record Writer may (depending on its configuration) 
> write out the 'cached' / serialized form of the Record, resulting in no 
> change to the records.
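
The failure mode can be illustrated with a minimal sketch, assuming a record that caches the text it was originally read from. `CachingRecord` is a hypothetical class, not NiFi's Record API; the point is only that a structural change such as a rename has to discard the cached form.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical record with a cached serialized form; NOT NiFi's Record API.
final class CachingRecord {
    private final Map<String, Object> values = new LinkedHashMap<>();
    private String serializedForm; // e.g. the original text read from the input

    CachingRecord(Map<String, Object> initial, String serializedForm) {
        this.values.putAll(initial);
        this.serializedForm = serializedForm;
    }

    // A writer may choose to emit this cached text instead of re-serializing.
    Optional<String> getSerializedForm() {
        return Optional.ofNullable(serializedForm);
    }

    // Renames a field and drops the cached form so a writer cannot emit stale text.
    boolean rename(String oldName, String newName) {
        if (!values.containsKey(oldName) || values.containsKey(newName)) {
            return false;
        }
        values.put(newName, values.remove(oldName));
        serializedForm = null; // the step the ticket reports as missing
        return true;
    }

    Map<String, Object> toMap() {
        return values;
    }
}
```

Without the `serializedForm = null` line, a writer that prefers the cached form would write the record out exactly as it was read, which matches the "no change to the records" symptom.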




[jira] [Updated] (NIFI-12934) RenameRecordField does not clear serialized form of records

2024-03-27 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12934:
--
Priority: Critical  (was: Major)

> RenameRecordField does not clear serialized form of records
> ---
>
> Key: NIFI-12934
> URL: https://issues.apache.org/jira/browse/NIFI-12934
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Mark Payne
>Priority: Critical
>
> When RenameRecordField runs, it updates the parent record for any matches. 
> However, when this happens, the Record's "serialized form" does not get 
> cleared. As a result, the Record Writer may (depending on its configuration) 
> write out the 'cached' / serialized form of the Record, resulting in no 
> change to the records.




[jira] [Updated] (NIFI-12959) Support loading Python processors from NARs

2024-03-26 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12959:
--
Status: Patch Available  (was: Open)

> Support loading Python processors from NARs
> ---
>
> Key: NIFI-12959
> URL: https://issues.apache.org/jira/browse/NIFI-12959
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0-M3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, third-party dependencies for Python Processors can be handled in 
> two ways: either they are declared as dependencies in the Processor itself, 
> or the Processor lives in a module where a {{requirements.txt}} dictates the 
> requirements. Both approaches are very helpful when developing Python-based 
> Processors.
> However, it is not uncommon to see production environments where {{pip}} is 
> not installed. There is an inherent risk in allowing remote code to be 
> downloaded in an ad-hoc manner like this, without any sort of 
> vulnerability scanning, etc.
> As such, we should also allow users to package Python packages in NiFi's 
> native archiving format (NARs).
> The package structure should be as follows:
> {code:java}
> my-nar.nar
> +-- META-INF/
>     +-- MANIFEST.MF
> +-- NAR-INF/
>     +-- bundled-dependencies/
>         +-- dependency1
>         +-- dependency2
>         +-- etc.
> +-- MyProcessor.py{code}
> Where {{MyProcessor.py}} could also be a python module / directory.
> In this way, we allow a Python Processor to be packaged up with its 
> third-party dependencies and dropped in the lib/ (or extensions) directory 
> of a NiFi installation in the same way that a Java processor would be.




[jira] [Created] (NIFI-12959) Support loading Python processors from NARs

2024-03-26 Thread Mark Payne (Jira)

Mark Payne created NIFI-12959:
-

 Summary: Support loading Python processors from NARs
 Key: NIFI-12959
 URL: https://issues.apache.org/jira/browse/NIFI-12959
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Core Framework
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0-M3


Currently, third-party dependencies for Python Processors can be handled in two 
ways: either they are declared as dependencies in the Processor itself, or the 
Processor lives in a module where a {{requirements.txt}} dictates the 
requirements. Both approaches are very helpful when developing Python-based 
Processors.

However, it is not uncommon to see production environments where {{pip}} is 
not installed. There is an inherent risk in allowing remote code to be 
downloaded in an ad-hoc manner like this, without any sort of vulnerability 
scanning, etc.

As such, we should also allow users to package Python packages in NiFi's native 
archiving format (NARs).

The package structure should be as follows:
{code:java}
my-nar.nar
+-- META-INF/
    +-- MANIFEST.MF
+-- NAR-INF/
    +-- bundled-dependencies/
        +-- dependency1
        +-- dependency2
        +-- etc.
+-- MyProcessor.py{code}
Where {{MyProcessor.py}} could also be a python module / directory.

In this way, we allow a Python Processor to be packaged up with its third-party 
dependencies and dropped in the lib/ (or extensions) directory of a NiFi 
installation in the same way that a Java processor would be.
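
As a rough illustration of the layout proposed above, the following Python sketch assembles a zip archive with the same entries. It assumes a NAR can be treated as an ordinary zip file; the helper name {{build_python_nar}}, the one-line manifest, and the placeholder dependency payloads are hypothetical and are not part of NiFi's build tooling.

```python
import io
import zipfile


def build_python_nar(processor_source: str, dependencies: dict) -> bytes:
    """Return the bytes of a zip archive laid out like the proposed Python NAR.

    `dependencies` maps an entry name (e.g. "dependency1") to its raw bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as nar:
        # Placeholder manifest; a real NAR manifest carries more attributes.
        nar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        # Third-party dependencies travel inside the archive, so pip is not
        # needed on the production host.
        for name, payload in dependencies.items():
            nar.writestr(f"NAR-INF/bundled-dependencies/{name}", payload)
        # The processor itself sits at the top level of the archive.
        nar.writestr("MyProcessor.py", processor_source)
    return buffer.getvalue()


nar_bytes = build_python_nar(
    "# processor code goes here\n",
    {"dependency1": b"", "dependency2": b""},
)
with zipfile.ZipFile(io.BytesIO(nar_bytes)) as nar:
    for entry in nar.namelist():
        print(entry)
```

Listing the entries of the resulting archive is a quick way to check that a packaged processor matches the expected structure before dropping it into a NiFi installation.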




[jira] [Created] (NIFI-12934) RenameRecordField does not clear serialized form of records

2024-03-22 Thread Mark Payne (Jira)

Mark Payne created NIFI-12934:
-

 Summary: RenameRecordField does not clear serialized form of 
records
 Key: NIFI-12934
 URL: https://issues.apache.org/jira/browse/NIFI-12934
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Reporter: Mark Payne


When RenameRecordField runs, it updates the parent record for any matches. 
However, when this happens, the Record's "serialized form" does not get 
cleared. As a result, the Record Writer may (depending on its configuration) 
write out the 'cached' / serialized form of the Record, resulting in no change 
to the records.
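
The failure mode can be modelled with a small Python sketch. This is not NiFi's Record implementation: the {{CachedRecord}} class, its method names, and the {{key=value}} output format are all hypothetical, and only stand in for a record that remembers the bytes it was originally read from.

```python
class CachedRecord:
    """Toy record that caches its serialized form (hypothetical, not NiFi's API)."""

    def __init__(self, values, serialized_form=None):
        self._values = dict(values)
        self._serialized_form = serialized_form

    def rename_field_without_invalidation(self, old, new):
        # Models the reported bug: the values change, the cache does not.
        self._values[new] = self._values.pop(old)

    def rename_field(self, old, new):
        # Models the expected behaviour: any change drops the cached form.
        self._values[new] = self._values.pop(old)
        self._serialized_form = None

    def write(self):
        # A writer that trusts the cache emits stale data if it was not cleared.
        if self._serialized_form is not None:
            return self._serialized_form
        return ",".join(f"{key}={value}" for key, value in self._values.items())


stale = CachedRecord({"name": "a"}, serialized_form="name=a")
stale.rename_field_without_invalidation("name", "title")
print(stale.write())

fresh = CachedRecord({"name": "a"}, serialized_form="name=a")
fresh.rename_field("name", "title")
print(fresh.write())
```

In the first case the writer still emits the old field name because the cached form survived the rename; in the second the cache is cleared and the record is re-serialized from its current values.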




[jira] [Commented] (NIFI-12897) Allow users to upload files/resources to nifi for use in the dataflow

2024-03-18 Thread Mark Payne (Jira)



[ 
https://issues.apache.org/jira/browse/NIFI-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828101#comment-17828101
 ] 

Mark Payne commented on NIFI-12897:
---

[~pvillard] probably best to use comments in the feature proposal.

> Allow users to upload files/resources to nifi for use in the dataflow
> -
>
> Key: NIFI-12897
> URL: https://issues.apache.org/jira/browse/NIFI-12897
> Project: Apache NiFi
>  Issue Type: Epic
>  Components: Core Framework, Core UI
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
>
> A common feature request that we receive is to make it easier to upload a 
> resource file, such as a JDBC driver JAR, to NiFi so that all of the nodes 
> receive the file. This epic is meant to capture that request.




[jira] [Commented] (NIFI-12897) Allow users to upload files/resources to nifi for use in the dataflow

2024-03-18 Thread Mark Payne (Jira)



[ 
https://issues.apache.org/jira/browse/NIFI-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828053#comment-17828053
 ] 

Mark Payne commented on NIFI-12897:
---

[~pvillard] I pushed up a Feature Proposal: 
[https://cwiki.apache.org/confluence/display/NIFI/Asset+Management]

 

> Allow users to upload files/resources to nifi for use in the dataflow
> -
>
> Key: NIFI-12897
> URL: https://issues.apache.org/jira/browse/NIFI-12897
> Project: Apache NiFi
>  Issue Type: Epic
>  Components: Core Framework, Core UI
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
>
> A common feature request that we receive is to make it easier to upload a 
> resource file, such as a JDBC driver JAR, to NiFi so that all of the nodes 
> receive the file. This epic is meant to capture that request.




[jira] [Commented] (NIFI-12897) Allow users to upload files/resources to nifi for use in the dataflow

2024-03-14 Thread Mark Payne (Jira)



[ 
https://issues.apache.org/jira/browse/NIFI-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827195#comment-17827195
 ] 

Mark Payne commented on NIFI-12897:
---

[~pvillard] not at the moment. I will plan to put together a Feature Proposal 
in the next few days.

> Allow users to upload files/resources to nifi for use in the dataflow
> -
>
> Key: NIFI-12897
> URL: https://issues.apache.org/jira/browse/NIFI-12897
> Project: Apache NiFi
>  Issue Type: Epic
>  Components: Core Framework, Core UI
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
>
> A common feature request that we receive is to make it easier to upload a 
> resource file, such as a JDBC driver JAR, to NiFi so that all of the nodes 
> receive the file. This epic is meant to capture that request.




[jira] [Created] (NIFI-12897) Allow users to upload files/resources to nifi for use in the dataflow

2024-03-14 Thread Mark Payne (Jira)

Mark Payne created NIFI-12897:
-

 Summary: Allow users to upload files/resources to nifi for use in 
the dataflow
 Key: NIFI-12897
 URL: https://issues.apache.org/jira/browse/NIFI-12897
 Project: Apache NiFi
  Issue Type: Epic
  Components: Core Framework, Core UI
Reporter: Mark Payne
Assignee: Mark Payne


A common feature request that we receive is to make it easier to upload a 
resource file, such as a JDBC driver JAR, to NiFi so that all of the nodes 
receive the file. This epic is meant to capture that request.




[jira] [Created] (NIFI-12899) UI - Allow user to choose assets to upload for a given parameter

2024-03-14 Thread Mark Payne (Jira)

Mark Payne created NIFI-12899:
-

 Summary: UI - Allow user to choose assets to upload for a given 
parameter
 Key: NIFI-12899
 URL: https://issues.apache.org/jira/browse/NIFI-12899
 Project: Apache NiFi
  Issue Type: New Feature
  Components: Core UI
Reporter: Mark Payne







[jira] [Created] (NIFI-12898) Backend - Allow uploading asset and referencing via Parameter

2024-03-14 Thread Mark Payne (Jira)

Mark Payne created NIFI-12898:
-

 Summary: Backend - Allow uploading asset and referencing via 
Parameter
 Key: NIFI-12898
 URL: https://issues.apache.org/jira/browse/NIFI-12898
 Project: Apache NiFi
  Issue Type: New Feature
  Components: Core Framework
Reporter: Mark Payne
Assignee: Mark Payne







[jira] [Resolved] (NIFI-11446) Better handling for cases where Python process dies

2024-03-05 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-11446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne resolved NIFI-11446.
---
Fix Version/s: 2.0.0
 Assignee: Mark Payne
   Resolution: Fixed

> Better handling for cases where Python process dies
> ---
>
> Key: NIFI-11446
> URL: https://issues.apache.org/jira/browse/NIFI-11446
> Project: Apache NiFi
>  Issue Type: Sub-task
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0
>
>
> If a Python process dies, we need the ability to detect this, re-launch the 
> Process, and recreate the Processors that are a part of the Process, and then 
> restore the Processors' configuration and enable/start them. Essentially, if 
> the Python process dies, the framework should spawn a new process and allow 
> the Processor to keep running.
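
The detect-and-relaunch cycle described above could look roughly like the following Python sketch. It is only an illustration: NiFi's framework is written in Java, and the {{PythonProcessSupervisor}} class, its fields, and the way processor configuration is "restored" (simply recorded in a list here) are assumptions made for the example.

```python
import subprocess
import sys


class PythonProcessSupervisor:
    """Hypothetical supervisor that relaunches a child process when it dies."""

    def __init__(self, command, processor_configs):
        self.command = list(command)
        # Saved configuration, keyed by processor name, used after a relaunch.
        self.processor_configs = dict(processor_configs)
        self.restarts = 0
        self.restored = []
        self.process = None
        self._launch()

    def _launch(self):
        self.process = subprocess.Popen(self.command)
        # A real framework would re-create each processor in the new process
        # and re-apply its configuration; this sketch only records the names.
        self.restored = sorted(self.processor_configs)

    def ensure_alive(self):
        """Relaunch the child if it has exited; return True if a relaunch happened."""
        if self.process.poll() is None:
            return False
        self.restarts += 1
        self._launch()
        return True

    def shutdown(self):
        if self.process.poll() is None:
            self.process.terminate()
        self.process.wait()
```

A caller would invoke {{ensure_alive()}} periodically (for example from a scheduled task) with a command such as {{[sys.executable, "worker.py"]}}, where {{worker.py}} is a placeholder name, and call {{shutdown()}} when the framework stops.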




[jira] [Resolved] (NIFI-11444) Improve FlowFileTransform to allow returning a String for the content instead of byte[]

2024-03-05 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne resolved NIFI-11444.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Improve FlowFileTransform to allow returning a String for the content instead 
> of byte[]
> ---
>
> Key: NIFI-11444
> URL: https://issues.apache.org/jira/browse/NIFI-11444
> Project: Apache NiFi
>  Issue Type: Sub-task
>  Components: Core Framework
>Reporter: Mark Payne
>Priority: Major
> Fix For: 2.0.0
>
>
> The FlowFileTransform Python class returns a FlowFileTransformResult. If the 
> contents are to be returned, they must be provided as a byte[]. But we should 
> also allow providing the contents as a String and deal with the conversion 
> behind the scenes, in order to provide a simpler API.
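
A minimal sketch of the "conversion behind the scenes" idea, in Python. The {{TransformResult}} class below is a hypothetical stand-in and not the actual NiFi Python API class; the choice of UTF-8 as the encoding is also an assumption for the example.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class TransformResult:
    """Hypothetical result object that accepts text or bytes as contents."""

    relationship: str
    contents: Optional[Union[str, bytes]] = None
    attributes: dict = field(default_factory=dict)

    def contents_as_bytes(self) -> Optional[bytes]:
        # None means "leave the FlowFile content unchanged" in this sketch.
        if self.contents is None:
            return None
        # Text is encoded on behalf of the caller (UTF-8 is assumed here).
        if isinstance(self.contents, str):
            return self.contents.encode("utf-8")
        return bytes(self.contents)


print(TransformResult("success", "hello").contents_as_bytes())
print(TransformResult("success", b"hello").contents_as_bytes())
```

With this shape, processor authors can return whichever type is convenient and the framework side only ever deals with bytes.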




[jira] [Updated] (NIFI-11443) Setup proper logging for Python framework

2024-03-05 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-11443:
--
Fix Version/s: 2.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Setup proper logging for Python framework
> -
>
> Key: NIFI-11443
> URL: https://issues.apache.org/jira/browse/NIFI-11443
> Project: Apache NiFi
>  Issue Type: Sub-task
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: David Handermann
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently, the Python framework establishes logging to logs/nifi-python.log 
> (directory configured in nifi.properties). But we need to establish proper 
> logging with log file rotation, etc.
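
For reference, size-based log rotation is available in Python's standard library through {{logging.handlers.RotatingFileHandler}}. The sketch below shows one way it could be wired up; it is not necessarily how the issue was resolved, and the function name, logger name, format string, and size limits are all assumptions.

```python
import logging
import logging.handlers
import os


def configure_python_logging(log_dir, max_bytes=10 * 1024 * 1024, backup_count=5):
    """Attach a size-rotated file handler writing to <log_dir>/nifi-python.log."""
    os.makedirs(log_dir, exist_ok=True)
    handler = logging.handlers.RotatingFileHandler(
        os.path.join(log_dir, "nifi-python.log"),
        maxBytes=max_bytes,        # roll over once the file would exceed this size
        backupCount=backup_count,  # keep at most this many rolled-over files
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s [%(threadName)s] %(name)s %(message)s")
    )
    logger = logging.getLogger("nifi.python.sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

Rolled-over files are named {{nifi-python.log.1}}, {{nifi-python.log.2}}, and so on, and the oldest one is discarded once {{backup_count}} is reached. Calling the function more than once would attach duplicate handlers, so a real setup would guard against that.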




[jira] [Updated] (NIFI-12498) The Prioritization description in the User Guide is different from the actual source code implementation.

2024-02-28 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12498:
--
Fix Version/s: 2.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> The Prioritization description in the User Guide is different from the actual 
> source code implementation.
> -
>
> Key: NIFI-12498
> URL: https://issues.apache.org/jira/browse/NIFI-12498
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Documentation & Website
>Affects Versions: 1.25.0, 2.0.0-M2
>Reporter: Doin Cha
>Assignee: endzeit
>Priority: Minor
> Fix For: 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In the prioritization explanation of the User Guide, it is stated that 
> *OldestFlowFileFirstPrioritizer* is the _"default scheme that is used if no 
> prioritizers are selected."_
> _([https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#prioritization])_
>  
>  
> However, in the actual source code implementation, *there is no automatic 
> default setting when prioritizers are not selected.*
> In such cases, the sorting is done by comparing the *ContentClaim* of the 
> FlowFiles.
> _([https://github.com/apache/nifi/blob/9a5ec83baa1b3593031f0917659a69e7a36bb0be/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/queue/QueuePrioritizer.java#L39-L90])_
>  
>  
> It looks like the user guide needs to be revised.




[jira] [Reopened] (NIFI-12740) Python Processors sometimes stuck in invalid state: 'Initializing runtime environment'

2024-02-28 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne reopened NIFI-12740:
---

Re-opening issue. While the fix greatly reduced the chances of this happening, 
I did encounter the issue again. So not all cases are handled correctly.

> Python Processors sometimes stuck in invalid state: 'Initializing runtime 
> environment'
> --
>
> Key: NIFI-12740
> URL: https://issues.apache.org/jira/browse/NIFI-12740
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 2.0.0-M1, 2.0.0-M2
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Blocker
> Fix For: 2.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When creating a Python processor, sometimes the Processor remains in an 
> invalid state with the message "Initializing runtime environment".
> In the logs, we see the following error/stack trace:
> {code:java}
> 2024-02-05 17:23:30,308 ERROR [Initialize SetRecordField] org.apache.nifi.NiFi An Unknown Error Occurred in Thread VirtualThread[#123,Initialize SetRecordField]/runnable@ForkJoinPool-1-worker-5: java.lang.NullPointerException: Cannot invoke "java.util.List.stream()" because "processorTypes" is null
> java.lang.NullPointerException: Cannot invoke "java.util.List.stream()" because "processorTypes" is null
>   at org.apache.nifi.py4j.StandardPythonBridge.findExtensionId(StandardPythonBridge.java:322)
>   at org.apache.nifi.py4j.StandardPythonBridge.createProcessorBridge(StandardPythonBridge.java:99)
>   at org.apache.nifi.py4j.StandardPythonBridge.lambda$createProcessor$3(StandardPythonBridge.java:142)
>   at org.apache.nifi.python.processor.PythonProcessorProxy.lambda$new$0(PythonProcessorProxy.java:73)
>   at java.base/java.lang.VirtualThread.run(VirtualThread.java:309) {code}




[jira] [Commented] (NIFI-12841) Introduce RemoveXYZ type of processors

2024-02-26 Thread Mark Payne (Jira)



[ 
https://issues.apache.org/jira/browse/NIFI-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820810#comment-17820810
 ] 

Mark Payne commented on NIFI-12841:
---

[~EndzeitBegins] I don't have a particular problem with it, if there's a use 
case where it's needed. Typically, though, you'd only want to delete a file on 
the local file system after processing it, and FetchFile / GetFile already have 
strategies to handle that once the file has been ingested.

> Introduce RemoveXYZ type of processors
> --
>
> Key: NIFI-12841
> URL: https://issues.apache.org/jira/browse/NIFI-12841
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: endzeit
>Priority: Minor
>
> There is the notion of "families" or "types" of processors in the standard 
> distribution of NiFi. 
> Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
> and {{PutXYZ}}. 
> The following examples are based on files on the local filesystem. 
> However, the same principle applies to other types of resources, e.g. files 
> on an SFTP server.
> The existing {{GetFile}} and {{FetchFile}} processors support the removal of 
> the resource from the source after successful transfer into the content of a 
> FlowFile. 
> However, in some scenarios it might be undesirable to remove the resource 
> until it has been processed successfully and the transformation result has 
> been stored, e.g. on secure network storage.
> This cannot be achieved with a {{GetXYZ}} or {{FetchXYZ}} processor on its 
> own. 
> As of now, one of the scripting processors or even a full-fledged custom 
> processor can be used to achieve this. 
> However, these might get relatively involved due to session handling or other 
> concerns.
> This issue proposes the introduction of an additional such processor "type", 
> namely {{RemoveXYZ}} which removes a resource.
> The base processor should have two properties, namely {{path}} and 
> {{filename}}, by default retrieving their values from the respective core 
> FlowFile attributes. Implementations may add protocol specific properties, 
> e.g. for authentication. 
> There should be at least three outgoing relationships:
> - "success" for FlowFiles where the resource was removed from the source,
> - "not exists" for FlowFiles where the resource did not (or no longer) exist 
> on the source,
> - "failure" for FlowFiles where the resource couldn't be removed from the 
> source, e.g. due to network errors or missing permissions.
> An initial implementation should provide {{RemoveXYZ}} for one of the 
> existing resource types, e.g. File, FTP, SFTP...




[jira] [Commented] (NIFI-12841) Introduce RemoveXYZ type of processors

2024-02-25 Thread Mark Payne (Jira)



[ 
https://issues.apache.org/jira/browse/NIFI-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820514#comment-17820514
 ] 

Mark Payne commented on NIFI-12841:
---

Hey [~EndzeitBegins] thanks for reaching out about this.

In general, the naming convention used in NiFi for such a thing is DeleteXYZ, 
rather than RemoveXYZ. Several of these components exist. For example: 
DeleteMongo, DeleteHDFS, DeleteDynamoDB, DeleteSQS, DeleteS3Object, 
DeleteGCSObject.

It is important to note that these Processors should not share a base class. 
There is no significant code reuse that would be gained by sharing a base 
class, but doing so would constrain the extensibility of the Processors. For 
example, DeleteS3Object is likely to extend from an AbstractS3Processor, etc.

Typically, the Processor will have both a "success" and a "failure" 
relationship. As for a "does not exist" relationship, it depends on the 
Processor. Some Processors may provide such a relationship while others do not. 
It should be documented how each Processor behaves in such a condition - 
whether it's a specific relationship, or the FlowFile goes to failure (because 
it failed to delete the file), or the FlowFile goes to success (because the 
file no longer exists), etc. It would be good to ensure that we are consistent, 
but given that several Delete* Processors already exist, it may not make sense 
to start changing the behavior. It would take some investigation there.

It is also important to note that detecting whether or not a given file exists 
may have significant performance implications. For something like an SQS 
Processor, it may be expensive to make a request for every single message to 
detect whether or not it exists.
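
To make the relationship question concrete, here is a small Python sketch for the local-file case. It is not taken from any existing Delete* Processor: the function name, the relationship names, and the {{missing_relationship}} parameter are hypothetical. The sketch attempts the delete directly and interprets the error, rather than issuing a separate existence check first.

```python
import os


def delete_resource(path, missing_relationship="not found"):
    """Delete a local file and return the relationship a FlowFile would be routed to.

    `missing_relationship` captures the per-processor policy for a resource
    that is already gone: a dedicated relationship, "success", or "failure".
    """
    try:
        os.remove(path)
    except FileNotFoundError:
        # The resource did not exist (or no longer exists).
        return missing_relationship
    except OSError:
        # Permission problems, the path being a directory, I/O errors, etc.
        return "failure"
    return "success"
```

For example, {{delete_resource(path)}} on an existing file returns "success", a second call on the same path returns "not found", and {{delete_resource(path, missing_relationship="success")}} models a Processor that treats an already-missing file as a successful delete. Whichever policy is chosen, it should be documented per Processor.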

> Introduce RemoveXYZ type of processors
> --
>
> Key: NIFI-12841
> URL: https://issues.apache.org/jira/browse/NIFI-12841
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: endzeit
>Priority: Minor
>
> There is the notion of "families" or "types" of processors in the standard 
> distribution of NiFi. 
> Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
> and {{PutXYZ}}. 
> The following examples are based on files on the local filesystem. 
> However, the same principle applies to other types of resources, e.g. files 
> on an SFTP server.
> The existing {{GetFile}} and {{FetchFile}} processors support the removal of 
> the resource from the source after successful transfer into the content of a 
> FlowFile. 
> However, in some scenarios it might be undesirable to remove the resource 
> until it has been processed successfully and the transformation result has 
> been stored, e.g. on secure network storage.
> This cannot be achieved with a {{GetXYZ}} or {{FetchXYZ}} processor on its 
> own. 
> As of now, one of the scripting processors or even a full-fledged custom 
> processor can be used to achieve this. 
> However, these might get relatively involved due to session handling or other 
> concerns.
> This issue proposes the introduction of an additional such processor "type", 
> namely {{RemoveXYZ}} which removes a resource.
> The base processor should have two properties, namely {{path}} and 
> {{filename}}, by default retrieving their values from the respective core 
> FlowFile attributes. Implementations may add protocol specific properties, 
> e.g. for authentication. 
> There should be at least three outgoing relationships:
> - "success" for FlowFiles where the resource was removed from the source,
> - "not exists" for FlowFiles where the resource did not (or no longer) exist 
> on the source,
> - "failure" for FlowFiles where the resource couldn't be removed from the 
> source, e.g. due to network errors or missing permissions.
> An initial implementation should provide {{RemoveXYZ}} for one of the 
> existing resource types, e.g. File, FTP, SFTP...




[jira] [Updated] (NIFI-12834) ConsumeSlack throwing NullPointerException

2024-02-22 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12834:
--
Status: Patch Available  (was: Open)

> ConsumeSlack throwing NullPointerException
> --
>
> Key: NIFI-12834
> URL: https://issues.apache.org/jira/browse/NIFI-12834
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I have an instance of ConsumeSlack that is throwing NullPointerExceptions:
> ```
> 2024-02-22 21:33:08,239 ERROR [Timer-Driven Process Thread-6] o.a.nifi.processors.slack.ConsumeSlack ConsumeSlack[id=55d6ad46-018d-1000--24aa6dbf] Failed to retrieve messages
> java.lang.NullPointerException: Cannot invoke "String.split(String)" because "value" is null
>     at org.apache.nifi.processors.slack.consume.SlackTimestamp.<init>(SlackTimestamp.java:42)
>     at org.apache.nifi.processors.slack.consume.ConsumeChannel.fetchReplies(ConsumeChannel.java:637)
>     at org.apache.nifi.processors.slack.consume.ConsumeChannel.consumeMessages(ConsumeChannel.java:493)
>     at org.apache.nifi.processors.slack.consume.ConsumeChannel.consumeReplies(ConsumeChannel.java:268)
>     at org.apache.nifi.processors.slack.consume.ConsumeChannel.consume(ConsumeChannel.java:173)
>     at org.apache.nifi.processors.slack.ConsumeSlack.onTrigger(ConsumeSlack.java:346)
>     at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1274)
>     at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:244)
>     at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102)
>     at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>     at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
>     at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>     at java.base/java.lang.Thread.run(Unknown Source)
> ```




[jira] [Created] (NIFI-12834) ConsumeSlack throwing NullPointerException

2024-02-22 Thread Mark Payne (Jira)

Mark Payne created NIFI-12834:
-

 Summary: ConsumeSlack throwing NullPointerException
 Key: NIFI-12834
 URL: https://issues.apache.org/jira/browse/NIFI-12834
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0


I have an instance of ConsumeSlack that is throwing NullPointerExceptions:
```

2024-02-22 21:33:08,239 ERROR [Timer-Driven Process Thread-6] o.a.nifi.processors.slack.ConsumeSlack ConsumeSlack[id=55d6ad46-018d-1000--24aa6dbf] Failed to retrieve messages
java.lang.NullPointerException: Cannot invoke "String.split(String)" because "value" is null
    at org.apache.nifi.processors.slack.consume.SlackTimestamp.<init>(SlackTimestamp.java:42)
    at org.apache.nifi.processors.slack.consume.ConsumeChannel.fetchReplies(ConsumeChannel.java:637)
    at org.apache.nifi.processors.slack.consume.ConsumeChannel.consumeMessages(ConsumeChannel.java:493)
    at org.apache.nifi.processors.slack.consume.ConsumeChannel.consumeReplies(ConsumeChannel.java:268)
    at org.apache.nifi.processors.slack.consume.ConsumeChannel.consume(ConsumeChannel.java:173)
    at org.apache.nifi.processors.slack.ConsumeSlack.onTrigger(ConsumeSlack.java:346)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1274)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:244)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102)
    at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)

```




[jira] [Created] (NIFI-12832) Cleanup nifi-mock dependencies

2024-02-21 Thread Mark Payne (Jira)

Mark Payne created NIFI-12832:
-

 Summary: Cleanup nifi-mock dependencies
 Key: NIFI-12832
 URL: https://issues.apache.org/jira/browse/NIFI-12832
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Core Framework, Extensions
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0


We have allowed quite a few dependencies to creep into the nifi-mock module. It 
has dependencies now on nifi-utils, nifi-framework-api, nifi-parameter. These 
are not modules that the mock framework should depend on. We should ensure that 
we keep this module lean and clean.

I suspect removing these dependencies from the mock framework will have a 
trickle-down effect: most modules depend on this module, so removing these 
dependencies will likely require updates to modules that rely on them as 
transitive dependencies.

It appears that nifi-parameter is not even used, even though it's a dependency. 
There are two classes in nifi-utils that are in use: CoreAttributes and 
StandardValidators. But I argue these really should move to nifi-api, as they 
are APIs that are widely used and we will guarantee backward compatibility.

Additionally, StandardValidators depends on FormatUtils. While we don't want to 
bring FormatUtils into nifi-api, we should introduce a new TimeFormat class in 
nifi-api that is responsible for parsing things like durations that our 
extensions use ("5 mins", etc.). This makes it simpler to build "framework-level 
extensions" and allows for a cleaner implementation of NiFiProperties in the 
future. FormatUtils should then make use of this class.
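
As an illustration of the kind of parsing such a class would own, here is a minimal sketch in Python (the real class would be Java, in nifi-api). The function name, the table of accepted unit spellings, and the choice to return whole milliseconds are assumptions made for the example, not the proposed TimeFormat API.

```python
import re

# Milliseconds per unit. The accepted spellings are an assumption for this sketch.
_UNIT_MILLIS = {}
for _names, _millis in (
    (("ms", "milli", "millis"), 1),
    (("s", "sec", "secs", "second", "seconds"), 1_000),
    (("m", "min", "mins", "minute", "minutes"), 60_000),
    (("h", "hr", "hrs", "hour", "hours"), 3_600_000),
    (("d", "day", "days"), 86_400_000),
):
    for _name in _names:
        _UNIT_MILLIS[_name] = _millis

# A whole number, optional whitespace, then a unit name.
_DURATION = re.compile(r"\s*(\d+)\s*([A-Za-z]+)\s*")


def parse_duration_millis(text: str) -> int:
    """Parse a duration such as "5 mins" into whole milliseconds."""
    match = _DURATION.fullmatch(text)
    if match is None:
        raise ValueError(f"not a duration: {text!r}")
    amount, unit = match.groups()
    try:
        return int(amount) * _UNIT_MILLIS[unit.lower()]
    except KeyError:
        raise ValueError(f"unknown time unit: {unit!r}") from None


print(parse_duration_millis("5 mins"))
```

Keeping the unit table and the regular expression in one small, dependency-free place is what allows both validators and utility classes to share the same parsing rules.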




[jira] [Updated] (NIFI-12232) Frequent "failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption"

2024-02-16 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12232:
--
Assignee: Mark Payne
  Status: Patch Available  (was: Open)

> Frequent "failed to connect node to cluster because local flow controller 
> partially updated. Administrator should disconnect node and review flow for 
> corruption"
> -
>
> Key: NIFI-12232
> URL: https://issues.apache.org/jira/browse/NIFI-12232
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Configuration Management
>Affects Versions: 1.23.2
>Reporter: John Joseph
>Assignee: Mark Payne
>Priority: Major
> Attachments: image-2023-10-16-16-12-31-027.png, 
> image-2024-02-14-13-33-44-354.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is an issue that we have been observing in version 1.23.2 of NiFi 
> when we try to upgrade.
> Since rolling upgrades are not supported in NiFi, we scale out the revision 
> that is running and {_}run a helm upgrade{_}.
> We have NiFi running in k8s cluster mode, and there is a post job that calls 
> the Tenants and Policies API. A successful run produces output like this:
> {code:java}
> set_policies() Action: 'read' Resource: '/flow' entity_id: 'ad2d3ad6-5d69-3e0f-95e9-c7feb36e2de5' entity_name: 'CN=nifi-api-admin' entity_type: 'USER'
> set_policies() status: '200'
> 'read' '/flow' policy already exists. It will be updated...
> set_policies() fetching policy inside -eq 200 status: '200'
> set_policies() after update PUT: '200'
> set_policies() Action: 'read' Resource: '/tenants' entity_id: 'ad2d3ad6-5d69-3e0f-95e9-c7feb36e2de5' entity_name: 'CN=nifi-api-admin' entity_type: 'USER'
> set_policies() status: '200'{code}
> *_This job was running fine in 1.23.0, 1.22 and other previous versions._* In 
> {*}{{1.23.2}}{*}, we are noticing that the job fails very frequently 
> with the following error logs:
> {code:java}
> set_policies() Action: 'read' Resource: '/flow' entity_id: 'ad2d3ad6-5d69-3e0f-95e9-c7feb36e2de5' entity_name: 'CN=nifi-api-admin' entity_type: 'USER'
> set_policies() status: '200'
> 'read' '/flow' policy already exists. It will be updated...
> set_policies() fetching policy inside -eq 200 status: '200'
> set_policies() after update PUT: '400'
> An error occurred getting 'read' '/flow' policy: 'This node is disconnected from its configured cluster. The requested change will only be allowed if the flag to acknowledge the disconnected node is set.'{code}
> {{_*'This node is disconnected from its configured cluster. The requested 
> change will only be allowed if the flag to acknowledge the disconnected node 
> is set.'*_}}
> The job is configured to run only after all the pods are up and running. 
> Although the pods are up, we see this exception inside the pods:
> {code:java}
> org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption.
>     at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:1059)
>     at org.apache.nifi.controller.StandardFlowService.handleReconnectionRequest(StandardFlowService.java:667)
>     at org.apache.nifi.controller.StandardFlowService.access$200(StandardFlowService.java:107)
>     at org.apache.nifi.controller.StandardFlowService$1.run(StandardFlowService.java:396)
>     at java.base/java.lang.Thread.run(Thread.java:833)
> Caused by: org.apache.nifi.controller.serialization.FlowSynchronizationException: java.lang.IllegalStateException: Cannot change destination of Connection because the current destination is running
>     at org.apache.nifi.controller.serialization.VersionedFlowSynchronizer.synchronizeFlow(VersionedFlowSynchronizer.java:448)
>     at org.apache.nifi.controller.serialization.VersionedFlowSynchronizer.sync(VersionedFlowSynchronizer.java:206)
>     at org.apache.nifi.controller.serialization.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:42)
>     at org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1530)
>     at org.apache.nifi.persistence.StandardFlowConfigurationDAO.load(StandardFlowConfigurationDAO.java:104)
>     at org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:817)
>     at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:1028)
>     ... 4 common frames omitted
> Caused by: java.lang.IllegalStateException: Cannot change destination of Connection because the current destination is running
> at 
>

[jira] [Created] (NIFI-12797) Record.incorporateInactiveFields fails if inactive field added with same name but different type

2024-02-15 Thread Mark Payne (Jira)

Mark Payne created NIFI-12797:
-

 Summary: Record.incorporateInactiveFields fails if inactive field 
added with same name but different type
 Key: NIFI-12797
 URL: https://issues.apache.org/jira/browse/NIFI-12797
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0


The Record.incorporateInactiveFields method has a bug.

It considers two cases: updated fields and inactive fields. When considering 
inactive fields, it skips any fields that are also present in the 'updated 
fields'. This makes sense, as we don't want to add a new field if there's 
already a field with the same name.

However, the comparison it uses is based on RecordField and not the field name. 
So in some cases it can throw an Exception because there's a conflict where an 
inactive field has the same name but different type than an updated field.
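
The difference between the two comparisons can be sketched as follows. This is only an illustration: `Field` and `InactiveFieldMerge` are hypothetical stand-ins, not NiFi's `RecordField` or the actual `incorporateInactiveFields` code, and the type names are plain strings for brevity.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a schema field; not NiFi's RecordField.
record Field(String name, String type) {}

final class InactiveFieldMerge {
    // Merge inactive fields into the updated fields, skipping any inactive
    // field whose NAME is already taken by an updated field.
    static List<Field> merge(List<Field> updated, List<Field> inactive) {
        Set<String> names = new HashSet<>();
        List<Field> merged = new ArrayList<>();
        for (Field f : updated) {
            names.add(f.name());
            merged.add(f);
        }
        for (Field f : inactive) {
            // Set.add returns false when the name is already present.
            if (names.add(f.name())) {
                merged.add(f);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Field> updated = List.of(new Field("id", "INT"));
        List<Field> inactive = List.of(new Field("id", "STRING"), new Field("note", "STRING"));
        // Comparing whole Field objects treats id:INT and id:STRING as different,
        // so an object-based check would not skip the conflicting field.
        System.out.println(updated.contains(inactive.get(0)));
        System.out.println(merge(updated, inactive));
    }
}
```

Comparing by name keeps the updated definition of `id` and drops the conflicting inactive one, which is the behavior the description above asks for.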




[jira] [Updated] (NIFI-12739) Python custom processor cannot import ProcessPoolExecutor

2024-02-14 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12739:
--
Fix Version/s: 2.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Python custom processor cannot import ProcessPoolExecutor
> -
>
> Key: NIFI-12739
> URL: https://issues.apache.org/jira/browse/NIFI-12739
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 2.0.0-M2
>Reporter: Alex Ethier
>Assignee: Alex Ethier
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A runtime exception is thrown when trying to import ProcessPoolExecutor in a 
> Python custom processor. This affects other libraries such as llama-index 
> when it tries to import ProcessPoolExecutor.
> My system's full stack trace (see below for a simpler stack trace):
> {code:java}
> py4j.Py4JException: An exception was raised by the Python Proxy. Return 
> Message: Traceback (most recent call last):
>   File "/opt/nifi-2.0.0-SNAPSHOT/python/framework/py4j/java_gateway.py", line 
> 2466, in _call_proxy
> return_value = getattr(self.pool[obj_id], method)(*params)
>^^^
>   File "/opt/nifi-2.0.0-SNAPSHOT/./python/framework/Controller.py", line 75, 
> in createProcessor
> processorClass = self.extensionManager.getProcessorClass(processorType, 
> version, work_dir)
>  
> ^
>   File "/opt/nifi-2.0.0-SNAPSHOT/python/framework/ExtensionManager.py", line 
> 104, in getProcessorClass
> processor_class = self.__load_extension_module(module_file, 
> details.local_dependencies)
>   
> ^
>   File "/opt/nifi-2.0.0-SNAPSHOT/python/framework/ExtensionManager.py", line 
> 360, in __load_extension_module
> module_spec.loader.exec_module(module)
>   File "", line 940, in exec_module
>   File "", line 241, in _call_with_frames_removed
>   File 
> "/Users/aethier/playground/the_source/datavolo/datavolo-resources/demo/advanced_rag_small_to_big/processors/RedisVectorStoreProcessor.py",
>  line 4, in 
> from llama_index import GPTVectorStoreIndex, StorageContext, 
> ServiceContext, Document
>   File 
> "/opt/nifi-2.0.0-SNAPSHOT/./work/python/extensions/RedisVectorStoreProcessor/2.0.0-M1/llama_index/__init__.py",
>  line 24, in 
> from llama_index.indices import (
>   File 
> "/opt/nifi-2.0.0-SNAPSHOT/./work/python/extensions/RedisVectorStoreProcessor/2.0.0-M1/llama_index/indices/__init__.py",
>  line 4, in 
> from llama_index.indices.composability.graph import ComposableGraph
>   File 
> "/opt/nifi-2.0.0-SNAPSHOT/./work/python/extensions/RedisVectorStoreProcessor/2.0.0-M1/llama_index/indices/composability/__init__.py",
>  line 4, in 
> from llama_index.indices.composability.graph import ComposableGraph
>   File 
> "/opt/nifi-2.0.0-SNAPSHOT/./work/python/extensions/RedisVectorStoreProcessor/2.0.0-M1/llama_index/indices/composability/graph.py",
>  line 7, in 
> from llama_index.indices.base import BaseIndex
>   File 
> "/opt/nifi-2.0.0-SNAPSHOT/./work/python/extensions/RedisVectorStoreProcessor/2.0.0-M1/llama_index/indices/base.py",
>  line 10, in 
> from llama_index.ingestion import run_transformations
>   File 
> "/opt/nifi-2.0.0-SNAPSHOT/./work/python/extensions/RedisVectorStoreProcessor/2.0.0-M1/llama_index/ingestion/__init__.py",
>  line 2, in 
> from llama_index.ingestion.pipeline import (
>   File 
> "/opt/nifi-2.0.0-SNAPSHOT/./work/python/extensions/RedisVectorStoreProcessor/2.0.0-M1/llama_index/ingestion/pipeline.py",
>  line 5, in 
> from concurrent.futures import ProcessPoolExecutor
>   File "", line 1229, in _handle_fromlist
>   File 
> "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/__init__.py",
>  line 44, in __getattr__
> from .process import ProcessPoolExecutor as pe
>   File 
> "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/process.py",
>  line 106, in 
> threading._register_atexit(_python_exit)
>   File 
> "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py",
>  line 1527, in _register_atexit
> raise RuntimeError("can't register atexit after shutdown")
> RuntimeError: can't register atexit after shutdown
>   at py4j.Protocol.getReturnValue(Protocol.java:476)
>   at 
> org.apache.nifi.py4j.client.PythonProxyInvocationHandler.invoke(PythonProxyInvocationHandler.java:64)
>   at

[jira] [Updated] (NIFI-12773) Add 'join' and 'anchored' RecordPath functions

2024-02-09 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12773:
--
Status: Patch Available  (was: Open)

> Add 'join' and 'anchored' RecordPath functions
> --
>
> Key: NIFI-12773
> URL: https://issues.apache.org/jira/browse/NIFI-12773
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I've come across two functions that would make flow design much simpler in 
> RecordPath.
> The first one, 'join', would be similar to the 'concat' function but would place a 
> delimiter between each element instead of just smashing the values together.
> The other provides the ability to anchor the context node while evaluating a 
> RecordPath. For example, given the following record:
> {code:java}
> {
>   "id": "1234",
>   "elements": [{
>     "name": "book",
>     "color": "red"
>   }, {
>     "name": "computer",
>     "color": "black"
>   }]
> } {code}
> We should be able to use:
> {code:java}
> anchored(/elements, concat(/name, ': ', /color)) {code}
> In order to obtain an array of 2 elements:
> {code:java}
> book: red {code}
> and
> {code:java}
> computer: black {code}
>  




[jira] [Created] (NIFI-12773) Add 'join' and 'anchored' RecordPath functions

2024-02-09 Thread Mark Payne (Jira)

Mark Payne created NIFI-12773:
-

 Summary: Add 'join' and 'anchored' RecordPath functions
 Key: NIFI-12773
 URL: https://issues.apache.org/jira/browse/NIFI-12773
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Core Framework
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0


I've come across two functions that would make flow design much simpler in 
RecordPath.

The first one, 'join', would be similar to the 'concat' function but would place a 
delimiter between each element instead of just smashing the values together.

The other provides the ability to anchor the context node while evaluating a 
RecordPath. For example, given the following record:
{code:java}
{
  "id": "1234",
  "elements": [{
    "name": "book",
    "color": "red"
  }, {
    "name": "computer",
    "color": "black"
  }]
} {code}
We should be able to use:
{code:java}
anchored(/elements, concat(/name, ': ', /color)) {code}
In order to obtain an array of 2 elements:
{code:java}
book: red {code}
and
{code:java}
computer: black {code}
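
The intended semantics of the two functions can be sketched in plain Java. This is a rough illustration only: it does not use the RecordPath API, records are modelled as simple maps, and the class and method names are made up for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class RecordPathSketch {
    // 'join': like concat, but places a delimiter between the values.
    static String join(String delimiter, List<String> values) {
        return String.join(delimiter, values);
    }

    // 'anchored': evaluate an expression once per child record,
    // using that child as the context node.
    static List<String> anchored(List<Map<String, String>> children,
                                 Function<Map<String, String>, String> expression) {
        return children.stream().map(expression).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, String>> elements = List.of(
                Map.of("name", "book", "color", "red"),
                Map.of("name", "computer", "color", "black"));
        // Mirrors anchored(/elements, concat(/name, ': ', /color))
        System.out.println(anchored(elements, e -> e.get("name") + ": " + e.get("color")));
        // Mirrors a join with ', ' as the delimiter
        System.out.println(join(", ", List.of("book", "computer")));
    }
}
```

The lambda plays the role of the inner RecordPath: it only sees one element of `/elements` at a time, which is what anchoring the context node means here.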
 




[jira] [Updated] (NIFI-12764) Remove commons-codec and commons-lang3 from nifi-security-utils

2024-02-09 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12764:
--
Fix Version/s: 2.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Remove commons-codec and commons-lang3 from nifi-security-utils
> ---
>
> Key: NIFI-12764
> URL: https://issues.apache.org/jira/browse/NIFI-12764
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Reporter: David Handermann
>Assignee: David Handermann
>Priority: Minor
> Fix For: 2.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The {{nifi-security-utils}} module is a dependency of many other components 
> and should have a minimal set of dependencies. With the introduction of Java 
> HexFormat, Apache Commons Codec is no longer necessary in 
> {{nifi-security-utils}}. The module also makes minimal use of Apache 
> Commons Lang3, so references to {{StringUtils}} can be reduced and refactored.
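
As a rough sketch of what such a replacement can look like, the snippet below uses only the JDK: `java.util.HexFormat` (available since Java 17) for hex encoding and decoding, and `String.isBlank()` for a null-safe blank check. The class and method names are hypothetical, and the actual changes made in `nifi-security-utils` may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.HexFormat;

final class JdkOnlyUtils {
    // Lowercase hex encoding without Apache Commons Codec.
    static String encodeHex(byte[] bytes) {
        return HexFormat.of().formatHex(bytes);
    }

    // Hex decoding without Apache Commons Codec.
    static byte[] decodeHex(String hex) {
        return HexFormat.of().parseHex(hex);
    }

    // Null-safe blank check without Apache Commons Lang3 StringUtils.
    static boolean isBlank(String value) {
        return value == null || value.isBlank();
    }

    public static void main(String[] args) {
        String hex = encodeHex("NiFi".getBytes(StandardCharsets.UTF_8));
        System.out.println(hex);
        System.out.println(new String(decodeHex(hex), StandardCharsets.UTF_8));
        System.out.println(isBlank("   "));
    }
}
```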




[jira] [Updated] (NIFI-12768) Intermittent Failures in TestListFile.testFilterAge

2024-02-09 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12768:
--
Fix Version/s: 2.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Intermittent Failures in TestListFile.testFilterAge
> ---
>
> Key: NIFI-12768
> URL: https://issues.apache.org/jira/browse/NIFI-12768
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 2.0.0-M2
>Reporter: David Handermann
>Assignee: David Handermann
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The TestListFile class has not changed substantively in quite some time, but 
> it has begun to fail more recently across multiple platforms on GitHub Actions 
> runners.
> The {{testFilterAge}} method often fails with the same stack trace:
> {noformat}
> Error:  org.apache.nifi.processors.standard.TestListFile.testFilterAge -- 
> Time elapsed: 6.436 s <<< FAILURE!
> org.opentest4j.AssertionFailedError: expected:  but was: 
>   at 
> org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
>   at 
> org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
>   at 
> org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
>   at 
> org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182)
>   at 
> org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177)
>   at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1141)
>   at 
> org.apache.nifi.processors.standard.TestListFile.testFilterAge(TestListFile.java:331)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:580)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
> {noformat}
> The test method uses recalculated timestamps to set file modification time, so 
> the problem appears to be related to these timing calculations.




[jira] [Updated] (NIFI-12757) Memory leak on Python side can result in OOMKiller killing python processes

2024-02-07 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12757:
--
Status: Patch Available  (was: Open)

> Memory leak on Python side can result in OOMKiller killing python processes
> ---
>
> Key: NIFI-12757
> URL: https://issues.apache.org/jira/browse/NIFI-12757
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Blocker
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is a memory leak on the Python side that results in objects not being 
> properly cleaned up when the transform method is invoked. This ultimately leads 
> to the Python process using large amounts of RAM and often results in 
> OOMKiller killing the process.




[jira] [Created] (NIFI-12757) Memory leak on Python side can result in OOMKiller killing python processes

2024-02-07 Thread Mark Payne (Jira)

Mark Payne created NIFI-12757:
-

 Summary: Memory leak on Python side can result in OOMKiller 
killing python processes
 Key: NIFI-12757
 URL: https://issues.apache.org/jira/browse/NIFI-12757
 Project: Apache NiFi
  Issue Type: Bug
  Components: Core Framework
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0


There is a memory leak on the Python side that results in objects not being 
properly cleaned up when the transform method is invoked. This ultimately leads to 
the Python process using large amounts of RAM and often results in OOMKiller 
killing the process.




[jira] [Updated] (NIFI-12740) Python Processors sometimes stuck in invalid state: 'Initializing runtime environment'

2024-02-05 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12740:
--
Status: Patch Available  (was: Open)

> Python Processors sometimes stuck in invalid state: 'Initializing runtime 
> environment'
> --
>
> Key: NIFI-12740
> URL: https://issues.apache.org/jira/browse/NIFI-12740
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 2.0.0-M2, 2.0.0-M1
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Blocker
> Fix For: 2.0.0
>
>
> When creating a Python processor, sometimes the Processor remains in an 
> invalid state with the message "Initializing runtime environment"
> In the logs, we see the following error/stack trace:
> {code:java}
> 2024-02-05 17:23:30,308 ERROR [Initialize SetRecordField] 
> org.apache.nifi.NiFi An Unknown Error Occurred in Thread 
> VirtualThread[#123,Initialize 
> SetRecordField]/runnable@ForkJoinPool-1-worker-5: 
> java.lang.NullPointerException: Cannot invoke "java.util.List.stream()" 
> because "processorTypes" is null
> java.lang.NullPointerException: Cannot invoke "java.util.List.stream()" 
> because "processorTypes" is null
>   at 
> org.apache.nifi.py4j.StandardPythonBridge.findExtensionId(StandardPythonBridge.java:322)
>   at 
> org.apache.nifi.py4j.StandardPythonBridge.createProcessorBridge(StandardPythonBridge.java:99)
>   at 
> org.apache.nifi.py4j.StandardPythonBridge.lambda$createProcessor$3(StandardPythonBridge.java:142)
>   at 
> org.apache.nifi.python.processor.PythonProcessorProxy.lambda$new$0(PythonProcessorProxy.java:73)
>   at java.base/java.lang.VirtualThread.run(VirtualThread.java:309) {code}




[jira] [Created] (NIFI-12740) Python Processors sometimes stuck in invalid state: 'Initializing runtime environment'

2024-02-05 Thread Mark Payne (Jira)

Mark Payne created NIFI-12740:
-

 Summary: Python Processors sometimes stuck in invalid state: 
'Initializing runtime environment'
 Key: NIFI-12740
 URL: https://issues.apache.org/jira/browse/NIFI-12740
 Project: Apache NiFi
  Issue Type: Bug
  Components: Core Framework
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0


When creating a Python processor, sometimes the Processor remains in an invalid 
state with the message "Initializing runtime environment"

In the logs, we see the following error/stack trace:
{code:java}
2024-02-05 17:23:30,308 ERROR [Initialize SetRecordField] org.apache.nifi.NiFi 
An Unknown Error Occurred in Thread VirtualThread[#123,Initialize 
SetRecordField]/runnable@ForkJoinPool-1-worker-5: 
java.lang.NullPointerException: Cannot invoke "java.util.List.stream()" because 
"processorTypes" is null
java.lang.NullPointerException: Cannot invoke "java.util.List.stream()" because 
"processorTypes" is null
at 
org.apache.nifi.py4j.StandardPythonBridge.findExtensionId(StandardPythonBridge.java:322)
at 
org.apache.nifi.py4j.StandardPythonBridge.createProcessorBridge(StandardPythonBridge.java:99)
at 
org.apache.nifi.py4j.StandardPythonBridge.lambda$createProcessor$3(StandardPythonBridge.java:142)
at 
org.apache.nifi.python.processor.PythonProcessorProxy.lambda$new$0(PythonProcessorProxy.java:73)
at java.base/java.lang.VirtualThread.run(VirtualThread.java:309) {code}




[jira] [Updated] (NIFI-12740) Python Processors sometimes stuck in invalid state: 'Initializing runtime environment'

2024-02-05 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12740:
--
Affects Version/s: 2.0.0-M2
   2.0.0-M1

> Python Processors sometimes stuck in invalid state: 'Initializing runtime 
> environment'
> --
>
> Key: NIFI-12740
> URL: https://issues.apache.org/jira/browse/NIFI-12740
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 2.0.0-M1, 2.0.0-M2
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Blocker
> Fix For: 2.0.0
>
>
> When creating a Python processor, sometimes the Processor remains in an 
> invalid state with the message "Initializing runtime environment"
> In the logs, we see the following error/stack trace:
> {code:java}
> 2024-02-05 17:23:30,308 ERROR [Initialize SetRecordField] 
> org.apache.nifi.NiFi An Unknown Error Occurred in Thread 
> VirtualThread[#123,Initialize 
> SetRecordField]/runnable@ForkJoinPool-1-worker-5: 
> java.lang.NullPointerException: Cannot invoke "java.util.List.stream()" 
> because "processorTypes" is null
> java.lang.NullPointerException: Cannot invoke "java.util.List.stream()" 
> because "processorTypes" is null
>   at 
> org.apache.nifi.py4j.StandardPythonBridge.findExtensionId(StandardPythonBridge.java:322)
>   at 
> org.apache.nifi.py4j.StandardPythonBridge.createProcessorBridge(StandardPythonBridge.java:99)
>   at 
> org.apache.nifi.py4j.StandardPythonBridge.lambda$createProcessor$3(StandardPythonBridge.java:142)
>   at 
> org.apache.nifi.python.processor.PythonProcessorProxy.lambda$new$0(PythonProcessorProxy.java:73)
>   at java.base/java.lang.VirtualThread.run(VirtualThread.java:309) {code}




[jira] [Updated] (NIFI-12710) PutDatabaseRecord processor does not handle microsecond timestamps properly

2024-01-31 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12710:
--
Status: Patch Available  (was: Open)

> PutDatabaseRecord processor does not handle microsecond timestamps properly
> ---
>
> Key: NIFI-12710
> URL: https://issues.apache.org/jira/browse/NIFI-12710
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>





[jira] [Created] (NIFI-12710) PutDatabaseRecord processor does not handle microsecond timestamps properly

2024-01-31 Thread Mark Payne (Jira)

Mark Payne created NIFI-12710:
-

 Summary: PutDatabaseRecord processor does not handle microsecond 
timestamps properly
 Key: NIFI-12710
 URL: https://issues.apache.org/jira/browse/NIFI-12710
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0







[jira] [Commented] (NIFI-8932) Add feature to CSVReader to skip N lines at top of the file

2024-01-31 Thread Mark Payne (Jira)



[ 
https://issues.apache.org/jira/browse/NIFI-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17813009#comment-17813009
 ] 

Mark Payne commented on NIFI-8932:
--

I think I'm a -1 on this Jira. The CSV Reader should be given valid CSV, rather 
than skipping over an arbitrary number of lines.

[~iiojj2] to strip out the first line of a file, you should not use the complex 
flow above but rather just use RouteText. Add a property with a value of 
${lineNo:gt(1)} and auto-terminate the unmatched relationship.
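
The effect of that condition can be sketched in plain Java. This is not NiFi code and does not use the RouteText processor; it only illustrates the filter "keep lines whose 1-based line number is greater than 1", with a hypothetical class name and sample data.

```java
import java.util.ArrayList;
import java.util.List;

final class LineNumberFilter {
    // Keep lines whose 1-based line number is greater than 'threshold',
    // mirroring the condition lineNo > 1 when threshold is 1.
    static List<String> keepAfter(List<String> lines, int threshold) {
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            int lineNo = i + 1;
            if (lineNo > threshold) {
                kept.add(lines.get(i));
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("custom banner line", "id,name", "1,alpha");
        System.out.println(keepAfter(lines, 1));
    }
}
```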

> Add feature to CSVReader to skip N lines at top of the file
> ---
>
> Key: NIFI-8932
> URL: https://issues.apache.org/jira/browse/NIFI-8932
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Philipp Korniets
>Assignee: Matt Burgess
>Priority: Minor
>  Labels: backport-needed
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> We have a lot of CSV files where providers add a custom header/footer to valid 
> CSV content.
>  The CSV header is actually the second row. 
> To remove unnecessary data we can use
>  * ReplaceText 
>  * splitText->RouteOnAttribute -> MergeContent
> It would be great to have an option in the CSVReader controller service to skip N rows 
> from the top/bottom in order to get clean data.
>  * skip N from the top
>  * skip M from the bottom
>  A similar request was implemented in Flink: 
> https://issues.apache.org/jira/browse/FLINK-1002
>  
> Data Example:
> {code}
> 7/20/21 2:48:47 AM GMT-04:00  ABB: Blended Rate Calc (X),,,
> distribution_id,Distribution 
> Id,settle_date,group_code,company_name,currency_code,common_account_name,business_date,prod_code,security,class,asset_type
> -1,all,20210719,Repo 21025226,qwerty                                    
> ,EUR,TPSL_21025226   ,19-Jul-21,BRM96ST7   ,ABC 
> 14/09/24,NR,BOND  
> -1,all,20210719,Repo 21025226,qwerty                                    
> ,GBP,RPSS_21025226   ,19-Jul-21,,Total @ -0.11,,
> {code}
> |7/20/21 2:48:47 AM GMT-04:00  ABB: Blended Rate Calc (X)|  |  |  |  |  |  |  
> |  |  |  |  |  
> |distribution_id|Distribution 
> Id|settle_date|group_code|company_name|currency_code|common_account_name|business_date|prod_code|security|class|asset_type|
> |-1|all|20210719|Repo 21025226|qwerty                                    
> |EUR|TPSL_21025226   |19-Jul-21|BRM96ST7   |ABC 
> 14/09/24|NR|BOND  |
> |-1|all|20210719|Repo 21025226|qwerty                                    
> |GBP|RPSS_21025226   |19-Jul-21| |Total @ -0.11| | |




[jira] [Created] (NIFI-12707) Allow LookupRecord to operate on multiple "child records"

2024-01-31 Thread Mark Payne (Jira)

Mark Payne created NIFI-12707:
-

 Summary: Allow LookupRecord to operate on multiple "child records"
 Key: NIFI-12707
 URL: https://issues.apache.org/jira/browse/NIFI-12707
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Mark Payne
Assignee: Mark Payne


LookupRecord provides a lot of power when it comes to performing enrichment in 
Records. However, there are cases in which a single Record has many 
sub-records, or child records. For example, let's take the following record:
{code:java}
{
  "fileSet": {
    "id": "11223344",
    "source": "external",
    "files": [{
      "filename": "file1.txt",
      "size": 4810
    }, {
      "filename": "file2.pdf",
      "size": 47203782
    }, {
      "filename": "unknown-file.unk",
      "size": 278102
    }]
  }
} {code}
Let's say that I want to look up a MIME type, based on the filename. So I want 
an output such as:
{code:java}
{
  "fileSet" : {
    "id" : "11223344",
    "source" : "external",
    "files" : [ {
      "filename" : "file1.txt",
      "size" : 4810,
      "mimeType" : "text/plain"
    }, {
      "filename" : "file2.pdf",
      "size" : 47203782,
      "mimeType" : "application/pdf"
    }, {
      "filename" : "unknown-file.unk",
      "size" : 278102,
      "mimeType" : null
    } ]
  }
} {code}
 

I can have a Lookup Service that is capable of handling this, no problem. And 
in LookupRecord, I can specify the path to look up as 
{{/fileSet/files[*]/filename}} but then I have a problem: there's no way to 
tell it where to place the returned values (i.e., the mimeType field) because 
it is relative to each individual value.

We need to add a "Root Record Path" that allows us to choose a sub-record. In 
this case, we would set it to {{/fileSet/files[*]}}, then specify the value to 
look up as {{/filename}}, and the return value should be placed at {{/mimeType}}.

This gives us much greater flexibility in performing lookups/enrichments.
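
The proposed behavior can be sketched in plain Java. The code below is an illustration only: it does not use the LookupRecord or Lookup Service APIs, records are modelled as maps, and the extension-to-MIME-type table is a made-up stand-in for a real lookup service.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ChildRecordLookup {
    // Hypothetical lookup table standing in for a Lookup Service.
    static final Map<String, String> MIME_BY_EXTENSION =
            Map.of("txt", "text/plain", "pdf", "application/pdf");

    // Returns null when the extension is unknown.
    static String lookupMimeType(String filename) {
        int dot = filename.lastIndexOf('.');
        String extension = dot < 0 ? "" : filename.substring(dot + 1);
        return MIME_BY_EXTENSION.get(extension);
    }

    // 'files' plays the role of the child records selected by the root path.
    // For each child, the key is read relative to the child (/filename) and
    // the result is written relative to the same child (/mimeType).
    static List<Map<String, Object>> enrich(List<Map<String, Object>> files) {
        List<Map<String, Object>> enriched = new ArrayList<>();
        for (Map<String, Object> file : files) {
            Map<String, Object> copy = new HashMap<>(file); // HashMap permits null values
            copy.put("mimeType", lookupMimeType((String) file.get("filename")));
            enriched.add(copy);
        }
        return enriched;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> files = List.of(
                Map.<String, Object>of("filename", "file1.txt", "size", 4810),
                Map.<String, Object>of("filename", "unknown-file.unk", "size", 278102));
        System.out.println(enrich(files));
    }
}
```

Because both the lookup key and the result location are resolved against each child record, every element of the array receives its own `mimeType`, including `null` for unknown files.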




[jira] [Created] (NIFI-12697) Improve JSON Reader/Writer handling of floating-point numbers

2024-01-30 Thread Mark Payne (Jira)

Mark Payne created NIFI-12697:
-

 Summary: Improve JSON Reader/Writer handling of floating-point 
numbers
 Key: NIFI-12697
 URL: https://issues.apache.org/jira/browse/NIFI-12697
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0


The JSON Writer currently provides no way to dictate whether or not 
floating-point numbers should use scientific notation. Several people have run 
into issues where the downstream systems do not understand scientific notation. 
We should allow this to be configurable.

Specifically, we should not change the behavior of existing services, but we 
should default new services so that they do not use scientific notation.

Additionally, the Jackson parser has the ability to use its own implementation 
of floating point parsing, which should be faster than the default version 
supplied by Java. We should enable that feature.
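
The scientific-notation issue can be reproduced with the JDK alone. The sketch below is not the JSON Writer's code and does not use Jackson; it only shows how the default `double` formatting differs from a plain decimal rendering via `BigDecimal.toPlainString()`. The class name is hypothetical.

```java
import java.math.BigDecimal;

final class PlainNotation {
    // Default Java formatting switches to scientific notation for
    // large (and very small) magnitudes.
    static String defaultFormat(double value) {
        return Double.toString(value);
    }

    // Plain decimal rendering with no exponent.
    static String plainFormat(double value) {
        return BigDecimal.valueOf(value).toPlainString();
    }

    public static void main(String[] args) {
        double value = 1.0E10;
        System.out.println(defaultFormat(value)); // 1.0E10
        System.out.println(plainFormat(value));   // 10000000000
    }
}
```

A downstream system that cannot parse `1.0E10` would accept the plain form, which is why making this configurable is useful.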

 




[jira] [Updated] (NIFI-12675) Python Processor erroring when creating custom relationships

2024-01-30 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12675:
--
Status: Patch Available  (was: Open)

> Python Processor erroring when creating custom relationships
> 
>
> Key: NIFI-12675
> URL: https://issues.apache.org/jira/browse/NIFI-12675
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> From an Apache Slack thread 
> (https://apachenifi.slack.com/archives/C0L9VCD47/p1706176890922519):
> {quote}Hello, I am trying to test some custom python processors with nifi 
> 2.0.0-M1
> It works fine except when I try to add custom relationships to it (other than 
> the default success, failure and original).
> Here's what I am trying:
> {code:java}
> self.matched = Relationship("matched", "flowfiles having a match with the regex")
> self.unmatched = Relationship("unmatched", "flowfiles not having any match with regex")
> self.failure = Relationship("failure", "flowfiles for which process errored while matching")
> self.relationships = {self.matched, self.unmatched, self.failure}
> {code}
> I get py4j complaining about AttributeError: 'set' object has no attribute 
> '_get_object_id'
> which seems like the auto conversion of Python to java container is not 
> happening for "Relationship" class. Any idea what could be wrong here?
> {quote}
> The problem appears to be that Relationships created are of type 
> {{nifiapi.Relationship}} but that is being sent back to the Java side without 
> being converted into a {{org.apache.nifi.processor.Relationship}}




[jira] [Updated] (NIFI-12693) When Processor is removed, Python Process should be notified asynchronously

2024-01-30 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12693:
--
Status: Patch Available  (was: Open)

> When Processor is removed, Python Process should be notified asynchronously
> ---
>
> Key: NIFI-12693
> URL: https://issues.apache.org/jira/browse/NIFI-12693
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a Processor is removed, the PythonBridge is notified of the removal, and 
> it then notifies any relevant Python process. This is done synchronously 
> during the removal. I encountered two occurrences in which notifying the 
> Python process failed.
> While the failure itself is not a huge concern, the handling of those 
> failures resulted in very bad outcomes. In the first instance, the 
> communication with the Python process was blocked on a socket read or write. 
> As a result, the Service Facade's lock was never released, and no web 
> requests could be made; they all blocked on the read lock. This resulted in 
> requiring a restart of NiFi.
> In the other scenario, the call did not block indefinitely but threw an 
> Exception. In this case, the associated Connections were never removed. As a 
> result, I could no longer navigate to that Process Group in the UI, or the UI 
> would have errors because there were Connections whose source or destination 
> didn't exist. This required manually removing those connections from the 
> flow.json file to recover.
> Since the intention of this action is simply a notification so that the 
> Python process can clean up after itself, this notification should be moved to 
> a background thread, so that any failures are simply logged without causing 
> problematic side effects.




[jira] [Created] (NIFI-12693) When Processor is removed, Python Process should be notified asynchronously

2024-01-30 Thread Mark Payne (Jira)

Mark Payne created NIFI-12693:
-

 Summary: When Processor is removed, Python Process should be 
notified asynchronously
 Key: NIFI-12693
 URL: https://issues.apache.org/jira/browse/NIFI-12693
 Project: Apache NiFi
  Issue Type: Bug
  Components: Core Framework
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0


When a Processor is removed, the PythonBridge is notified of the removal, and 
it then notifies any relevant Python process. This is done synchronously during 
the removal. I encountered two occurrences in which notifying the Python 
process failed.

While the failure itself is not a huge concern, the handling of those failures 
resulted in very bad outcomes. In the first instance, the communication with 
the Python process was blocked on a socket read or write. As a result, the 
Service Facade's lock was never released, and no web requests could be made; 
they all blocked on the read lock. This resulted in requiring a restart of NiFi.

In the other scenario, the call did not block indefinitely but threw an 
Exception. In this case, the associated Connections were never removed. As a 
result, I could no longer navigate to that Process Group in the UI, or the UI 
would have errors because there were Connections whose source or destination 
didn't exist. This required manually removing those connections from the 
flow.json file to recover.

Since the intention of this action is simply a notification so that the Python 
process can clean up after itself, this notification should be moved to a 
background thread, so that any failures are simply logged without causing 
problematic side effects.
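
A minimal sketch of the proposed approach, written in Python for illustration only (the actual change would live in NiFi's Java framework code). The function name notify_removed_async and the single-thread executor are hypothetical, not NiFi APIs. The point is only that the caller submits the notification and returns immediately, and any failure is logged instead of propagating:

```python
import logging
from concurrent.futures import ThreadPoolExecutor

logger = logging.getLogger("removal-notifier")

# Hypothetical executor dedicated to best-effort notifications.
_executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix="notify")


def notify_removed_async(notify, processor_id):
    """Run notify(processor_id) on a background thread.

    Returns a Future whose result is True on success and False on failure.
    Exceptions are logged and never reach the caller.
    """
    def task():
        try:
            notify(processor_id)
            return True
        except Exception:
            logger.exception(
                "Failed to notify Python process of removal of %s", processor_id
            )
            return False

    return _executor.submit(task)
```

Note that a notification blocked on a socket would still tie up the background thread, but it would no longer hold up the removal itself or any lock held by the caller.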




[jira] [Created] (NIFI-12675) Python Processor erroring when creating custom relationships

2024-01-25 Thread Mark Payne (Jira)

Mark Payne created NIFI-12675:
-

 Summary: Python Processor erroring when creating custom 
relationships
 Key: NIFI-12675
 URL: https://issues.apache.org/jira/browse/NIFI-12675
 Project: Apache NiFi
  Issue Type: Bug
  Components: Core Framework
Reporter: Mark Payne
Assignee: Mark Payne


From apache Slack thread 
([https://apachenifi.slack.com/archives/C0L9VCD47/p1706176890922519]):
{quote}Hello, I am trying to test some custom python processors with nifi 
2.0.0-M1
It works fine except when I try to add custom relationships to it (other than 
the default success, failure and original).
Here's what I am trying:
{code:python}
self.matched = Relationship("matched", "flowfiles having a match with the regex")
self.unmatched = Relationship("unmatched", "flowfiles not having any match with regex")
self.failure = Relationship("failure", "flowfiles for which process errored while matching")

self.relationships = {self.matched, self.unmatched, self.failure}
{code}
I get py4j complaining about AttributeError: 'set' object has no attribute 
'_get_object_id'
which seems like the auto conversion of Python to java container is not 
happening for "Relationship" class. Any idea what could be wrong here?
{quote}

The problem appears to be that Relationships created are of type 
{{nifiapi.Relationship}} but that is being sent back to the Java side without 
being converted into a {{org.apache.nifi.processor.Relationship}}




[jira] [Created] (NIFI-12647) Add @MultiProcessorUseCase annotations to explain how to use ListFile/FetchFile together

2024-01-19 Thread Mark Payne (Jira)

Mark Payne created NIFI-12647:
-

 Summary: Add @MultiProcessorUseCase annotations to explain how to 
use ListFile/FetchFile together
 Key: NIFI-12647
 URL: https://issues.apache.org/jira/browse/NIFI-12647
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Mark Payne
Assignee: Mark Payne


It's common to use ListFile / FetchFile together to pull in all files in a 
directory, or to pull in specific files (based on filename, for instance). We 
should add documentation to ListFile to explain how these Processors go 
hand-in-hand.




[jira] [Updated] (NIFI-12629) Add metadata filtering to QueryPinecone

2024-01-19 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12629:
--
Fix Version/s: 2.0.0-M2
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Add metadata filtering to QueryPinecone
> ---
>
> Key: NIFI-12629
> URL: https://issues.apache.org/jira/browse/NIFI-12629
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Affects Versions: 2.0.0-M1
>Reporter: Pierre Villard
>Assignee: Pierre Villard
>Priority: Major
> Fix For: 2.0.0-M2
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The QueryPinecone processor should be improved to allow for metadata 
> filtering.
> [https://docs.pinecone.io/docs/metadata-filtering]
> [https://medium.com/@gmarcilhacy/deep-dive-into-langchain-and-pinecone-metadata-filtering-75a9b6eba9c]
> An optional filter property should be added to the processor allowing a user 
> to specify which metadata filters should be applied to the query.




[jira] [Updated] (NIFI-12634) Kubernetes Components Should Ignore Empty Prefix Properties

2024-01-19 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12634:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Kubernetes Components Should Ignore Empty Prefix Properties
> ---
>
> Key: NIFI-12634
> URL: https://issues.apache.org/jira/browse/NIFI-12634
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: David Handermann
>Assignee: David Handermann
>Priority: Major
> Fix For: 2.0.0-M2
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Following recent changes on the main branch to support optional prefix 
> properties for Kubernetes Leases and ConfigMaps, testing indicated that the 
> Leader Election Manager and State Provider included empty strings as valid 
> values. This changes the default behavior based on the default 
> nifi.properties and state-management.xml including empty strings for prefix 
> values. The components should be modified to ignore empty strings in addition 
> to null values, aligning with current behavior prior to the introduction of 
> these properties.
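
The rule described above (treat an empty string the same as a missing value) can be sketched as follows. This is Python for illustration only; the helper names and the "-" separator are assumptions and do not reflect the actual Kubernetes component code:

```python
def normalize_prefix(value):
    """Treat None and the empty string alike: no prefix is configured."""
    return value if value else None


def prefixed_name(prefix, name):
    """Build a resource name, applying the prefix only when one is set.

    The "-" separator is a hypothetical choice for this sketch.
    """
    prefix = normalize_prefix(prefix)
    return name if prefix is None else f"{prefix}-{name}"
```

With this normalization, a default configuration file that ships the prefix property with an empty value behaves the same as one that omits the property.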




[jira] [Updated] (NIFI-12638) Add @UseCase documentation to QueryRecord to explain how to use as a record-based Router

2024-01-18 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12638:
--
Status: Patch Available  (was: Open)

> Add @UseCase documentation to QueryRecord to explain how to use as a 
> record-based Router
> 
>
> Key: NIFI-12638
> URL: https://issues.apache.org/jira/browse/NIFI-12638
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0-M2
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A common use case for QueryRecord is to use it to route Records to one route 
> or another. Add use case documentation explaining how to set this up.




[jira] [Created] (NIFI-12638) Add @UseCase documentation to QueryRecord to explain how to use as a record-based Router

2024-01-18 Thread Mark Payne (Jira)

Mark Payne created NIFI-12638:
-

 Summary: Add @UseCase documentation to QueryRecord to explain how 
to use as a record-based Router
 Key: NIFI-12638
 URL: https://issues.apache.org/jira/browse/NIFI-12638
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0-M2


A common use case for QueryRecord is to use it to route Records to one route or 
another. Add use case documentation explaining how to set this up.




[jira] [Updated] (NIFI-12637) Automatically update InvokeHTTP Proxy configuration properties

2024-01-18 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12637:
--
Status: Patch Available  (was: Open)

> Automatically update InvokeHTTP Proxy configuration properties
> --
>
> Key: NIFI-12637
> URL: https://issues.apache.org/jira/browse/NIFI-12637
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0-M2
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Users updating from 1.x are finding that InvokeHTTP is failing because the 
> "Proxy Type" property that was previously defined no longer is. As a result, 
> InvokeHTTP treats it as a header and attempts to send it as an HTTP Header. 
> However, since it has a space in the name, it's invalid and InvokeHTTP fails.
> We should automatically handle migrating the Proxy properties to make this 
> seamless.




[jira] [Created] (NIFI-12637) Automatically update InvokeHTTP Proxy configuration properties

2024-01-18 Thread Mark Payne (Jira)

Mark Payne created NIFI-12637:
-

 Summary: Automatically update InvokeHTTP Proxy configuration 
properties
 Key: NIFI-12637
 URL: https://issues.apache.org/jira/browse/NIFI-12637
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0-M2


Users updating from 1.x are finding that InvokeHTTP is failing because the 
"Proxy Type" property that was previously defined no longer is. As a result, 
InvokeHTTP treats it as a header and attempts to send it as an HTTP Header. 
However, since it has a space in the name, it's invalid and InvokeHTTP fails.

We should automatically handle migrating the Proxy properties to make this 
seamless.
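
To illustrate the failure mode and the kind of migration meant here, a rough Python sketch follows. It is not NiFi's migration API: the rename map and the new property name are hypothetical, and the validity check simply applies the HTTP rule that a header field name must be a token (so it cannot contain a space):

```python
import re

# HTTP header field names are "tokens": no spaces or separators allowed.
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def is_valid_header_name(name):
    """Return True if name is a syntactically valid HTTP header field name."""
    return _TOKEN.fullmatch(name) is not None


def migrate_properties(properties, renames):
    """Return a copy of properties with old names replaced by new ones.

    renames maps an old property name to its new name; names that are not
    listed are carried over unchanged.
    """
    return {renames.get(name, name): value for name, value in properties.items()}


# Hypothetical example: the new name below is made up for illustration.
old = {"Proxy Type": "http", "X-Custom": "1"}
new = migrate_properties(old, {"Proxy Type": "hypothetical-proxy-type"})
```

Without such a migration, the leftover "Proxy Type" entry is treated as a dynamic property, and is_valid_header_name("Proxy Type") shows why sending it as a header fails.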




[jira] [Updated] (NIFI-12635) Upgrade slack client to 1.37.0

2024-01-18 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12635:
--
Status: Patch Available  (was: Open)

> Upgrade slack client to 1.37.0
> --
>
> Key: NIFI-12635
> URL: https://issues.apache.org/jira/browse/NIFI-12635
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0-M2
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I sometimes see the ListenSlack spew errors about Rate Limiting and 
> connection failures. This appears to be fixed in the 1.37.0 version of the 
> client according to [https://github.com/slackapi/java-slack-sdk/pull/1265]




[jira] [Created] (NIFI-12635) Upgrade slack client to 1.37.0

2024-01-18 Thread Mark Payne (Jira)

Mark Payne created NIFI-12635:
-

 Summary: Upgrade slack client to 1.37.0
 Key: NIFI-12635
 URL: https://issues.apache.org/jira/browse/NIFI-12635
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0-M2


I sometimes see the ListenSlack spew errors about Rate Limiting and connection 
failures. This appears to be fixed in the 1.37.0 version of the client 
according to [https://github.com/slackapi/java-slack-sdk/pull/1265]




[jira] [Created] (NIFI-12623) Allow ListenSlack to receive App Mention events and include user details

2024-01-16 Thread Mark Payne (Jira)

Mark Payne created NIFI-12623:
-

 Summary: Allow ListenSlack to receive App Mention events and 
include user details
 Key: NIFI-12623
 URL: https://issues.apache.org/jira/browse/NIFI-12623
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0-M2


When using the ListenSlack processor, I often want only the events that mention 
my bot by name. However, as it is, the processor requires that I receive all 
events in the channels that have my bot, and then filter them out. We should 
instead allow users to receive only App Mention events.

Additionally, it would be beneficial to retrieve user details such as username 
instead of just the User ID.




[jira] [Created] (NIFI-12616) Enable @use_case and @multi_processor_use_case decorators to be added to Python Processors

2024-01-16 Thread Mark Payne (Jira)

Mark Payne created NIFI-12616:
-

 Summary: Enable @use_case and @multi_processor_use_case decorators 
to be added to Python Processors
 Key: NIFI-12616
 URL: https://issues.apache.org/jira/browse/NIFI-12616
 Project: Apache NiFi
  Issue Type: Bug
  Components: Core Framework, Extensions
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0-M2


Currently, Python processors have no way of articulating specific use cases and 
multi-processor use cases in their docs. Introduce new decorators to allow for 
these.

We use decorators here in order to keep the structure similar to that of Java 
but also because it offers a clean mechanism for defining the 
MultiProcessorUseCase, which becomes awkward if trying to include in the 
ProcessorDetails inner class.




[jira] [Updated] (NIFI-9464) Provenance Events files corrupted

2024-01-08 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-9464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-9464:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Provenance Events files corrupted
> -
>
> Key: NIFI-9464
> URL: https://issues.apache.org/jira/browse/NIFI-9464
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.11.0, 1.15.0
> Environment: java 11, centos 7, nifi standalone
>Reporter: Wiktor Kubicki
>Assignee: Tamas Palfy
>Priority: Minor
> Fix For: 1.25.0, 2.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In my logs I found:
> {code:java}
> SiteToSiteProvenanceReportingTask[id=b209c0ae-016e-1000-ae39-301c9dcfc544] 
> Failed to retrieve Provenance Events from repository due to: Attempted to 
> skip to byte offset 9149491 for 1125432890.prov.gz but file does not have 
> that many bytes (TOC 
> Reader=StandardTocReader[file=//provenance_repository/toc/1125432890.toc, 
> compressed=false]): java.io.EOFException: Attempted to skip to byte offset 
> 9149491 for 1125432890.prov.gz but file does not have that many bytes (TOC 
> Reader=StandardTocReader[file=/.../provenance_repository/toc/1125432890.toc, 
> compressed=false])
> {code}
> It is critically important for me to be 100% sure of my logs. It happened 
> about 100 times in the last year for 15 *.prov.gz files:
> {code:java}
> -rw-rw-rw-. 1 user user 1013923 Oct 17 21:17 1075441276.prov.gz
> -rw-rw-rw-. 1 user user 1345431 Oct 24 13:06 1083362251.prov.gz
> -rw-rw-rw-. 1 user user 1359282 Oct 25 13:07 1084546392.prov.gz
> -rw-rw-rw-. 1 user user 1155791 Nov  2 17:08 1094516954.prov.gz
> -rw-rw-r--. 1 user user  974136 Nov 18 22:07 1113402183.prov.gz
> -rw-rw-r--. 1 user user 1125608 Nov 28 22:00 1125097576.prov.gz
> -rw-rw-r--. 1 user user 1248319 Nov 29 04:30 1125432890.prov.gz
> -rw-rw-r--. 1 user user  832120 Feb  2  2021 661957813.prov.gz
> -rw-rw-r--. 1 user user 1110978 Mar 17  2021 734807613.prov.gz
> -rw-rw-r--. 1 user user 1506819 Apr 16  2021 786154249.prov.gz
> -rw-rw-r--. 1 user user 1763198 May 25  2021 852626782.prov.gz
> -rw-rw-r--. 1 user user 1580598 Jun 15 08:32 891934274.prov.gz
> -rw-rw-r--. 1 user user 2960296 Jun 28 17:07 917991812.prov.gz
> -rw-rw-r--. 1 user user 1808037 Jun 28 17:37 918051650.prov.gz
> -rw-rw-rw-. 1 user user  765924 Aug 14 13:09 991505484.prov.gz
> {code}
> BTW, it's interesting that there are different chmods.
> My config for provenance (BTW, if you see a possibility to tune it, please tell 
> me):
> {code:java}
> nifi.provenance.repository.directory.default=/../provenance_repository
> nifi.provenance.repository.max.storage.time=730 days
> nifi.provenance.repository.max.storage.size=512 GB
> nifi.provenance.repository.rollover.time=10 mins
> nifi.provenance.repository.rollover.size=100 MB
> nifi.provenance.repository.query.threads=2
> nifi.provenance.repository.index.threads=1
> nifi.provenance.repository.compress.on.rollover=true
> nifi.provenance.repository.always.sync=false
> nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, 
> ProcessorID
> nifi.provenance.repository.indexed.attributes=
> nifi.provenance.repository.index.shard.size=1 GB
> nifi.provenance.repository.max.attribute.length=65536
> nifi.provenance.repository.concurrent.merge.threads=1
> nifi.provenance.repository.buffer.size=10
> {code}
> Now my provenance repo has 140GB of data.
>  




[jira] [Commented] (NIFI-9464) Provenance Events files corrupted

2024-01-08 Thread Mark Payne (Jira)



[ 
https://issues.apache.org/jira/browse/NIFI-9464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17804441#comment-17804441
 ] 

Mark Payne commented on NIFI-9464:
--

[~tpalfy] I got you. Makes sense. I did a quick look over this again to make 
sure that I fully understand what's happening here. It looks like this was 
actually designed to work as you've proposed in the PR. But when the Encrypted 
Prov Repo was introduced, the base class's init() method was changed to start 
creating its own `EventFileManager`. As a result, the base class has a 
different instance than the concrete class is using. So this change fixes that 
to ensure that both the base class and the concrete class are sharing the same 
instance.  Makes perfect sense. Great catch! Thanks for running that down and 
fixing. I'm a +1 and will merge.
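
The pattern described in this comment can be sketched generically as below. This is Python for illustration only, with hypothetical class names standing in for the base and concrete repository classes; it is not the NiFi code or the actual patch:

```python
class EventFileManager:
    """Hypothetical stand-in for an object that coordinates access to event files."""


class BuggyBaseStore:
    def __init__(self):
        # Bug pattern: the base class quietly creates its own manager.
        self.file_manager = EventFileManager()


class BuggyStore(BuggyBaseStore):
    def __init__(self):
        super().__init__()
        # The concrete class creates a second, unrelated manager.
        self.my_manager = EventFileManager()


class FixedBaseStore:
    def __init__(self, file_manager):
        # Fix pattern: the base class uses the instance it is given.
        self.file_manager = file_manager


class FixedStore(FixedBaseStore):
    def __init__(self):
        self.my_manager = EventFileManager()
        super().__init__(self.my_manager)


buggy = BuggyStore()
fixed = FixedStore()
print(buggy.file_manager is buggy.my_manager)  # False
print(fixed.file_manager is fixed.my_manager)  # True
```

When the two halves of a class hierarchy coordinate through different manager instances, neither sees the other's locks, which is consistent with the corruption discussed in this issue.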

> Provenance Events files corrupted
> -
>
> Key: NIFI-9464
> URL: https://issues.apache.org/jira/browse/NIFI-9464
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.11.0, 1.15.0
> Environment: java 11, centos 7, nifi standalone
>Reporter: Wiktor Kubicki
>Assignee: Tamas Palfy
>Priority: Minor
> Fix For: 1.25.0, 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In my logs I found:
> {code:java}
> SiteToSiteProvenanceReportingTask[id=b209c0ae-016e-1000-ae39-301c9dcfc544] 
> Failed to retrieve Provenance Events from repository due to: Attempted to 
> skip to byte offset 9149491 for 1125432890.prov.gz but file does not have 
> that many bytes (TOC 
> Reader=StandardTocReader[file=//provenance_repository/toc/1125432890.toc, 
> compressed=false]): java.io.EOFException: Attempted to skip to byte offset 
> 9149491 for 1125432890.prov.gz but file does not have that many bytes (TOC 
> Reader=StandardTocReader[file=/.../provenance_repository/toc/1125432890.toc, 
> compressed=false])
> {code}
> It is critically important for me to be 100% sure of my logs. It happened 
> about 100 times in the last year for 15 *.prov.gz files:
> {code:java}
> -rw-rw-rw-. 1 user user 1013923 Oct 17 21:17 1075441276.prov.gz
> -rw-rw-rw-. 1 user user 1345431 Oct 24 13:06 1083362251.prov.gz
> -rw-rw-rw-. 1 user user 1359282 Oct 25 13:07 1084546392.prov.gz
> -rw-rw-rw-. 1 user user 1155791 Nov  2 17:08 1094516954.prov.gz
> -rw-rw-r--. 1 user user  974136 Nov 18 22:07 1113402183.prov.gz
> -rw-rw-r--. 1 user user 1125608 Nov 28 22:00 1125097576.prov.gz
> -rw-rw-r--. 1 user user 1248319 Nov 29 04:30 1125432890.prov.gz
> -rw-rw-r--. 1 user user  832120 Feb  2  2021 661957813.prov.gz
> -rw-rw-r--. 1 user user 1110978 Mar 17  2021 734807613.prov.gz
> -rw-rw-r--. 1 user user 1506819 Apr 16  2021 786154249.prov.gz
> -rw-rw-r--. 1 user user 1763198 May 25  2021 852626782.prov.gz
> -rw-rw-r--. 1 user user 1580598 Jun 15 08:32 891934274.prov.gz
> -rw-rw-r--. 1 user user 2960296 Jun 28 17:07 917991812.prov.gz
> -rw-rw-r--. 1 user user 1808037 Jun 28 17:37 918051650.prov.gz
> -rw-rw-rw-. 1 user user  765924 Aug 14 13:09 991505484.prov.gz
> {code}
> BTW, it's interesting that there are different chmods.
> My config for provenance (BTW, if you see a possibility to tune it, please tell 
> me):
> {code:java}
> nifi.provenance.repository.directory.default=/../provenance_repository
> nifi.provenance.repository.max.storage.time=730 days
> nifi.provenance.repository.max.storage.size=512 GB
> nifi.provenance.repository.rollover.time=10 mins
> nifi.provenance.repository.rollover.size=100 MB
> nifi.provenance.repository.query.threads=2
> nifi.provenance.repository.index.threads=1
> nifi.provenance.repository.compress.on.rollover=true
> nifi.provenance.repository.always.sync=false
> nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, 
> ProcessorID
> nifi.provenance.repository.indexed.attributes=
> nifi.provenance.repository.index.shard.size=1 GB
> nifi.provenance.repository.max.attribute.length=65536
> nifi.provenance.repository.concurrent.merge.threads=1
> nifi.provenance.repository.buffer.size=10
> {code}
> Now my provenance repo has 140GB of data.
>  




[jira] [Updated] (NIFI-12536) ParseDocument incorrectly converts byte array to String, result in text like b'...' instead of just ...

2023-12-21 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12536:
--
Status: Patch Available  (was: Open)

> ParseDocument incorrectly converts byte array to String, result in text like 
> b'...' instead of just ...
> ---
>
> Key: NIFI-12536
> URL: https://issues.apache.org/jira/browse/NIFI-12536
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 2.0.0-M1
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Trivial
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Documents that are produced from ParseDocument use 
> {{str(flowfile.getContentsAsBytes())}} when they should use 
> {{flowfile.getContentsAsBytes().decode('utf-8')}}.
> This results in text such as {{One Two Three}} being produced as {{b'One Two 
> Three'}}




[jira] [Created] (NIFI-12536) ParseDocument incorrectly converts byte array to String, result in text like b'...' instead of just ...

2023-12-21 Thread Mark Payne (Jira)

Mark Payne created NIFI-12536:
-

 Summary: ParseDocument incorrectly converts byte array to String, 
result in text like b'...' instead of just ...
 Key: NIFI-12536
 URL: https://issues.apache.org/jira/browse/NIFI-12536
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Affects Versions: 2.0.0-M1
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0


Documents that are produced from ParseDocument use 
{{str(flowfile.getContentsAsBytes())}} when they should use 
{{flowfile.getContentsAsBytes().decode('utf-8')}}.

This results in text such as {{One Two Three}} being produced as {{b'One Two 
Three'}}
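
The difference is easy to reproduce in plain Python, independent of NiFi. Here a literal bytes value stands in for the result of flowfile.getContentsAsBytes():

```python
content = b"One Two Three"  # stand-in for flowfile.getContentsAsBytes()

as_repr = str(content)             # the repr of the bytes object, not its text
as_text = content.decode("utf-8")  # the decoded text

print(as_repr)  # b'One Two Three'
print(as_text)  # One Two Three
```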




[jira] [Updated] (NIFI-12516) When clustered, View Content shows wrong content type, will not show formatted

2023-12-15 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12516:
--
Status: Patch Available  (was: Open)

> When clustered, View Content shows wrong content type, will not show formatted
> --
>
> Key: NIFI-12516
> URL: https://issues.apache.org/jira/browse/NIFI-12516
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework, Core UI
>Affects Versions: 2.0.0-M1
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When I choose to view a FlowFile's contents in the UI (regardless of whether 
> it came from Provenance view or List Queue view), it shows the content. 
> However, it shows the filename as an empty string, and it shows the 
> content-type as "text/plain" even though the mime.type attribute is set to 
> "application/json". As a result, when I try to switch the 'view as' option 
> from 'original' to 'formatted', it does not render the content as JSON, since 
> it thinks the data is text/plain.




[jira] [Created] (NIFI-12516) When clustered, View Content shows wrong content type, will not show formatted

2023-12-15 Thread Mark Payne (Jira)

Mark Payne created NIFI-12516:
-

 Summary: When clustered, View Content shows wrong content type, 
will not show formatted
 Key: NIFI-12516
 URL: https://issues.apache.org/jira/browse/NIFI-12516
 Project: Apache NiFi
  Issue Type: Bug
  Components: Core Framework, Core UI
Affects Versions: 2.0.0-M1
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0


When I choose to view a FlowFile's contents in the UI (regardless of whether it 
came from Provenance view or List Queue view), it shows the content. However, 
it shows the filename as an empty string, and it shows the content-type as 
"text/plain" even though the mime.type attribute is set to "application/json". 
As a result, when I try to switch the 'view as' option from 'original' to 
'formatted', it does not render the content as JSON, since it thinks the data 
is text/plain.




[jira] [Commented] (NIFI-12394) when importing versioned flow with component that migrates properties, controller service reference is invalid

2023-12-14 Thread Mark Payne (Jira)



[ 
https://issues.apache.org/jira/browse/NIFI-12394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796772#comment-17796772
 ] 

Mark Payne commented on NIFI-12394:
---

Thanks for reporting, [~mosermw] . That's a good corner case that I'd not 
thought of.

So when a Processor (or Controller Service or Reporting Task) is created in the 
StandardVersionedComponentSynchronizer, we add a {{CreatedExtension}} to the 
{{createdExtensions}} Map. When the entry is added, it has the original 
property values.

We then call {{updateProcessor}} which calls {{populatePropertiesMap}}. 
This is the part of the code where, if a Property references a Controller 
Service, it updates the Properties Map.

Then, at the end, we call {{migrateConfiguration}} which is responsible for 
updating the properties, based on the original property values.

So the solution that I would propose is to update the logic in 
{{populatePropertiesMap}} so that, if it maps a property value to a Controller 
Service, we also get the {{CreatedExtension}} from the Map and update the 
property map there too. I believe this should then pass the values to 
{{migrateConfiguration}} (which then calls {{migrateProperties}}) with the 
appropriate value.

> when importing versioned flow with component that migrates properties, 
> controller service reference is invalid
> --
>
> Key: NIFI-12394
> URL: https://issues.apache.org/jira/browse/NIFI-12394
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Flow Versioning
>Reporter: Michael W Moser
>Priority: Major
>
> I built a Process Group containing one StandardRestrictedSSLContextService 
> that is referenced by one InvokeHTTP processor.  I downloaded that Process 
> Group as a flow definition {*}with external services{*}.  I also versioned 
> that Process Group in NiFi Registry.
> Inside the flow definition file, I see the 
> StandardRestrictedSSLContextService with 
> "identifier":"d7d70b6c-abe4-3564-a219-b289cb7f25d2" and InvokeHTTP references 
> that UUID.
> When I create a new Process Group using either the downloaded flow definition 
> or the NiFi Registry flow, a new StandardRestrictedSSLContextService is 
> created and it has a new UUID as expected.  The InvokeHTTP processor is 
> invalid because it references the proposed 
> StandardRestrictedSSLContextService UUID d7d70b6c-abe4-3564-a219-b289cb7f25d2 
> which does not exist.
> The service and processor are created and references are updated, but when 
> migrating processor properties and any change occurs, the service reference 
> is reverted back to what was in proposedProperties.




[jira] [Commented] (NIFI-9464) Provenance Events files corrupted

2023-12-14 Thread Mark Payne (Jira)



[ 
https://issues.apache.org/jira/browse/NIFI-9464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796766#comment-17796766
 ] 

Mark Payne commented on NIFI-9464:
--

Thanks [~tpalfy] - your analysis seems reasonable. As you said, it's difficult 
to confirm. One thing that I would recommend in order to confirm (although 
you'd not want to leave this in the codebase) would be to temporarily add a 
`Thread.sleep` into the code in between the time that the new .toc.tmp file 
completes its writing and the time that it's renamed to .toc. If you were to 
add a sleep of say 30 seconds or 1 minute, it would be easy to confirm that the 
threading issue is present as described and also that this change addresses the 
concern.

> Provenance Events files corrupted
> -
>
> Key: NIFI-9464
> URL: https://issues.apache.org/jira/browse/NIFI-9464
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.11.0, 1.15.0
> Environment: java 11, centos 7, nifi standalone
>Reporter: Wiktor Kubicki
>Assignee: Tamas Palfy
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In my logs I found:
> {code:java}
> SiteToSiteProvenanceReportingTask[id=b209c0ae-016e-1000-ae39-301c9dcfc544] 
> Failed to retrieve Provenance Events from repository due to: Attempted to 
> skip to byte offset 9149491 for 1125432890.prov.gz but file does not have 
> that many bytes (TOC 
> Reader=StandardTocReader[file=//provenance_repository/toc/1125432890.toc, 
> compressed=false]): java.io.EOFException: Attempted to skip to byte offset 
> 9149491 for 1125432890.prov.gz but file does not have that many bytes (TOC 
> Reader=StandardTocReader[file=/.../provenance_repository/toc/1125432890.toc, 
> compressed=false])
> {code}
> It is critically important for me to be 100% sure of my logs. It happened 
> about 100 times in the last year for 15 *.prov.gz files:
> {code:java}
> -rw-rw-rw-. 1 user user 1013923 Oct 17 21:17 1075441276.prov.gz
> -rw-rw-rw-. 1 user user 1345431 Oct 24 13:06 1083362251.prov.gz
> -rw-rw-rw-. 1 user user 1359282 Oct 25 13:07 1084546392.prov.gz
> -rw-rw-rw-. 1 user user 1155791 Nov  2 17:08 1094516954.prov.gz
> -rw-rw-r--. 1 user user  974136 Nov 18 22:07 1113402183.prov.gz
> -rw-rw-r--. 1 user user 1125608 Nov 28 22:00 1125097576.prov.gz
> -rw-rw-r--. 1 user user 1248319 Nov 29 04:30 1125432890.prov.gz
> -rw-rw-r--. 1 user user  832120 Feb  2  2021 661957813.prov.gz
> -rw-rw-r--. 1 user user 1110978 Mar 17  2021 734807613.prov.gz
> -rw-rw-r--. 1 user user 1506819 Apr 16  2021 786154249.prov.gz
> -rw-rw-r--. 1 user user 1763198 May 25  2021 852626782.prov.gz
> -rw-rw-r--. 1 user user 1580598 Jun 15 08:32 891934274.prov.gz
> -rw-rw-r--. 1 user user 2960296 Jun 28 17:07 917991812.prov.gz
> -rw-rw-r--. 1 user user 1808037 Jun 28 17:37 918051650.prov.gz
> -rw-rw-rw-. 1 user user  765924 Aug 14 13:09 991505484.prov.gz
> {code}
> BTW, it's interesting that there are different chmods.
> My config for provenance (BTW, if you see a possibility to tune it, please tell 
> me):
> {code:java}
> nifi.provenance.repository.directory.default=/../provenance_repository
> nifi.provenance.repository.max.storage.time=730 days
> nifi.provenance.repository.max.storage.size=512 GB
> nifi.provenance.repository.rollover.time=10 mins
> nifi.provenance.repository.rollover.size=100 MB
> nifi.provenance.repository.query.threads=2
> nifi.provenance.repository.index.threads=1
> nifi.provenance.repository.compress.on.rollover=true
> nifi.provenance.repository.always.sync=false
> nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, 
> ProcessorID
> nifi.provenance.repository.indexed.attributes=
> nifi.provenance.repository.index.shard.size=1 GB
> nifi.provenance.repository.max.attribute.length=65536
> nifi.provenance.repository.concurrent.merge.threads=1
> nifi.provenance.repository.buffer.size=10
> {code}
> Now my provenance repo has 140GB of data.
>  




[jira] [Updated] (NIFI-12478) Return Message Type as body for JMS Object Messages

2023-12-07 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12478:
--
Fix Version/s: 2.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Return Message Type as body for JMS Object Messages
> ---
>
> Key: NIFI-12478
> URL: https://issues.apache.org/jira/browse/NIFI-12478
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
> Reporter: David Handermann
> Assignee: David Handermann
> Priority: Minor
> Fix For: 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The ConsumeJMS Processor supports receiving multiple types of JMS messages 
> and implements different serialization strategies for each type of message. 
> The JMS ObjectMessage Type provides a generic wrapper around an opaque Java 
> Object without any further information. The ConsumeJMS Processor currently 
> writes the bytes of an Object using Java Object serialization, which presents 
> several issues. Java Object serialization is not compatible with services 
> outside of Java, it writes the exact version of the Java Object, and it can 
> reference classes that may not be present on the receiving system. This can 
> lead to unexpected errors when receiving JMS messages in the context of a 
> NiFi Processor. Instead of reporting the message as an error, the message 
> metadata could still be useful in some flows. Using the Message Type of 
> {{ObjectMessage}} as the output bytes enables this edge case scenario, 
> although any system designed to interoperate with NiFi should use other types 
> of JMS messages to enable subsequent handling in other Processors.




[jira] [Updated] (NIFI-12480) Improve handling of embedded JSON records

2023-12-06 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12480:
--
Fix Version/s: 2.0.0
   Status: Patch Available  (was: Open)

> Improve handling of embedded JSON records
> -
>
> Key: NIFI-12480
> URL: https://issues.apache.org/jira/browse/NIFI-12480
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework, Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It proves to be difficult to treat embedded JSON elements as Strings. There 
> are times when this is necessary, such as when pushing to a database that 
> declares a field of type JSON or interacting with a web service that expects 
> incoming JSON as a String. However, there's no easy way to do this in NiFi 
> today.
> Instead, what typically happens is that the Record gets converted to a String 
> via a call to {{toString()}} and that produces something like 
> {{MapRecord[name=John Doe, color=blue]}}, which is not helpful.
> However, when a JSON Reader is used, we already have a JSON representation of 
> the Record in the record's SerializedForm. When {{toString()}} is called, we 
> should use the SerializedForm of a Record whenever it is available, and only 
> fall back to the default rendering when it is not.




[jira] [Created] (NIFI-12480) Improve handling of embedded JSON records

2023-12-06 Thread Mark Payne (Jira)

Mark Payne created NIFI-12480:
-

 Summary: Improve handling of embedded JSON records
 Key: NIFI-12480
 URL: https://issues.apache.org/jira/browse/NIFI-12480
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Core Framework, Extensions
Reporter: Mark Payne
Assignee: Mark Payne


It proves to be difficult to treat embedded JSON elements as Strings. There are 
times when this is necessary, such as when pushing to a database that declares 
a field of type JSON or interacting with a web service that expects incoming 
JSON as a String. However, there's no easy way to do this in NiFi today.

Instead, what typically happens is that the Record gets converted to a String 
via a call to {{toString()}} and that produces something like 
{{MapRecord[name=John Doe, color=blue]}}, which is not helpful.

However, when a JSON Reader is used, we already have a JSON representation of 
the Record in the record's SerializedForm. When {{toString()}} is called, we 
should use the SerializedForm of a Record whenever it is available, and only 
fall back to the default rendering when it is not.
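
The behavior proposed above can be sketched roughly as follows. This is an 
illustration only: SketchRecord and its constructor are hypothetical stand-ins, 
not NiFi's record classes, and the serialized form is modeled as a plain String 
that may be absent.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical stand-in for a record; NOT NiFi's actual MapRecord implementation.
final class SketchRecord {
    private final Map<String, Object> values;
    private final String serializedForm; // assumed: original JSON text, or null if unknown

    SketchRecord(Map<String, Object> values, String serializedForm) {
        this.values = new LinkedHashMap<>(values);
        this.serializedForm = serializedForm;
    }

    Optional<String> getSerializedForm() {
        return Optional.ofNullable(serializedForm);
    }

    @Override
    public String toString() {
        // Prefer the serialized form; otherwise fall back to the generic rendering.
        return getSerializedForm().orElseGet(() -> values.entrySet().stream()
                .map(entry -> entry.getKey() + "=" + entry.getValue())
                .collect(Collectors.joining(", ", "MapRecord[", "]")));
    }

    public static void main(String[] args) {
        Map<String, Object> values = new LinkedHashMap<>();
        values.put("name", "John Doe");
        values.put("color", "blue");

        // Without a serialized form: MapRecord[name=John Doe, color=blue]
        System.out.println(new SketchRecord(values, null));
        // With a serialized form, the JSON text itself is printed.
        System.out.println(new SketchRecord(values, "{\"name\":\"John Doe\",\"color\":\"blue\"}"));
    }
}
```

With this shape, a record read by a JSON Reader would render as its original 
JSON text, while records built by other means keep the generic rendering.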




[jira] [Assigned] (NIFI-12331) Introduce a PublishSlack processor

2023-12-04 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne reassigned NIFI-12331:
-

Assignee: Mark Payne

> Introduce a PublishSlack processor
> --
>
> Key: NIFI-12331
> URL: https://issues.apache.org/jira/browse/NIFI-12331
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> The Slack API provides multiple different ways to publish messages to a Slack 
> channel. NiFi already has two Processors for pushing to Slack - PostSlack and 
> PutSlack. These processors have slightly different nuances, and the 
> documentation does not articulate when to use which one. One of them is oriented 
> more toward sending FlowFile contents as an attachment while the other is 
> oriented toward posting a message based on a property value. We should 
> consolidate both of these Processors into a single Processor that is capable 
> of sending a message and optionally providing the FlowFile content as an 
> attachment.
> Both PostSlack and PutSlack make use of WebHooks instead of using the 
> official Slack SDK. This means that rather than simply specifying the name of 
> the Channel to post to, in order to send a message in Slack, the creator of 
> the Slack App must explicitly add a Webhook for the desired channel, and the 
> Processor must then be configured to use that Webhook. As a result, the 
> channel cannot be easily configured and cannot be dynamic. This makes it 
> difficult to use in conjunction with ListenSlack / ConsumeSlack in order to 
> respond in threads.
> We need to consolidate both into a single Processor that is configured and 
> behaves differently, based on the SDK.
> This Processor should be configured with properties that allow specifying:
>  * Bot Token
>  * Name of the channel to send to
>  * How to obtain the message content (FlowFile Content or specified as a 
> Property that accepts Expression Language)
>  * If using a Property value, should be configured also with the message to 
> send, and whether or not to attach the FlowFile content as an attachment to 
> the message.
>  * Thread Timestamp (optional, to convey which thread the message should be 
> sent to) - should support Expression Language




[jira] [Updated] (NIFI-12457) Add Use Case documentation to explain how to use RouteOnAttribute for specific use cases

2023-12-03 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12457:
--
Status: Patch Available  (was: Open)

> Add Use Case documentation to explain how to use RouteOnAttribute for 
> specific use cases
> 
>
> Key: NIFI-12457
> URL: https://issues.apache.org/jira/browse/NIFI-12457
> Project: Apache NiFi
>  Issue Type: Improvement
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add @MultiProcessorUseCase that explains how to use RouteOnAttribute in 
> conjunction with List/Fetch S3 in order to fetch only specific files.
> Also add documentation showing how to use RouteOnAttribute alongside 
> PartitionRecord in order to route record-oriented data based on the contents 
> of the FlowFile




[jira] [Created] (NIFI-12457) Add Use Case documentation to explain how to use RouteOnAttribute for specific use cases

2023-12-03 Thread Mark Payne (Jira)

Mark Payne created NIFI-12457:
-

 Summary: Add Use Case documentation to explain how to use 
RouteOnAttribute for specific use cases
 Key: NIFI-12457
 URL: https://issues.apache.org/jira/browse/NIFI-12457
 Project: Apache NiFi
  Issue Type: Improvement
Reporter: Mark Payne
Assignee: Mark Payne


Add @MultiProcessorUseCase that explains how to use RouteOnAttribute in 
conjunction with List/Fetch S3 in order to fetch only specific files.

Also add documentation showing how to use RouteOnAttribute alongside 
PartitionRecord in order to route record-oriented data based on the contents of 
the FlowFile




[jira] [Created] (NIFI-12456) Improve leniency of JSON readers and flexibility of JSON Writer

2023-12-03 Thread Mark Payne (Jira)

Mark Payne created NIFI-12456:
-

 Summary: Improve leniency of JSON readers and flexibility of JSON 
Writer
 Key: NIFI-12456
 URL: https://issues.apache.org/jira/browse/NIFI-12456
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Mark Payne


Currently, we adhere to the JSON specification fairly strictly, with the 
exception of allowing for "JSON Lines" / ndjson / ldjson.

However, the Jackson library allows for several {{Features}} that we do not 
expose, which may be helpful for handling data that does not strictly adhere to 
the schema, or where there are preferences in serialization.

For example, {{JsonParser.Feature}} makes it possible to allow comments in 
JSON (including lines beginning with {{//}}, {{/*}}, and "YAML Style" 
comments (#)). Additionally, it allows for single-quotes for field names or no 
quoting at all. While these do not adhere to the specification, they are common 
enough for the parser to support them, and we should do the same.

Similarly, on the serialization side, we have had requests to support writing 
decimal values without use of scientific notation, which can be achieved by 
enabling the {{WRITE_BIGDECIMAL_AS_PLAIN}} feature.

We should expose these options on the JsonTreeReader and the JSON Writer. I 
don't know of any downside to enabling the leniency / non-standard options, so 
it probably makes sense to simply enable them all by default. There is, however, 
an argument for introducing a new "Parsing Leniency" option that allows the user 
to disable these features.
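
To make the scientific-notation point concrete, here is a small sketch using 
only the JDK. It does not call Jackson; it assumes (per Jackson's 
documentation, as far as I know) that {{WRITE_BIGDECIMAL_AS_PLAIN}} amounts to 
writing the value with BigDecimal.toPlainString() instead of toString(). The 
class and method names are made up for the illustration.

```java
import java.math.BigDecimal;

// Illustrative helper names; this is not part of Jackson or NiFi.
final class PlainDecimalSketch {

    // Default rendering: may use scientific notation for some scale values.
    static String defaultForm(BigDecimal value) {
        return value.toString();
    }

    // Plain rendering: never uses an exponent.
    static String plainForm(BigDecimal value) {
        return value.toPlainString();
    }

    public static void main(String[] args) {
        BigDecimal value = new BigDecimal("1E+3");
        System.out.println(defaultForm(value)); // prints 1E+3
        System.out.println(plainForm(value));   // prints 1000
    }
}
```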




[jira] [Updated] (NIFI-12454) Allow decommissioning a node without shutdown

2023-12-01 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12454:
--
Status: Patch Available  (was: Open)

> Allow decommissioning a node without shutdown
> -
>
> Key: NIFI-12454
> URL: https://issues.apache.org/jira/browse/NIFI-12454
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 2.0.0
>
>
> When a node is decommissioned, it takes the following steps:
>  * Disconnect Node from cluster
>  * Trigger data offload
>  * Wait for data offload
>  * Remove node from cluster
>  * Shutdown
> It would be helpful to allow taking the node out of the cluster without 
> completely terminating the process.




[jira] [Updated] (NIFI-12453) Allow obtaining a node's cluster status via nifi.sh

2023-12-01 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12453:
--
Status: Patch Available  (was: Open)

> Allow obtaining a node's cluster status via nifi.sh
> ---
>
> Key: NIFI-12453
> URL: https://issues.apache.org/jira/browse/NIFI-12453
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There are often times when we want to check the cluster status of a 
> particular node (i.e., is it CONNECTING, CONNECTED, DISCONNECTING, 
> DISCONNECTED, etc.) Currently the only way to obtain this information is via 
> the REST API or the full diagnostic dump.




[jira] [Created] (NIFI-12454) Allow decommissioning a node without shutdown

2023-12-01 Thread Mark Payne (Jira)

Mark Payne created NIFI-12454:
-

 Summary: Allow decommissioning a node without shutdown
 Key: NIFI-12454
 URL: https://issues.apache.org/jira/browse/NIFI-12454
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Core Framework
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0


When a node is decommissioned, it takes the following steps:
 * Disconnect Node from cluster
 * Trigger data offload
 * Wait for data offload
 * Remove node from cluster
 * Shutdown

It would be helpful to allow taking the node out of the cluster without 
completely terminating the process.
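
The step list above, with the final shutdown made optional, might be modeled 
like this. It is a hedged sketch only: the enum, the plan() method and the 
shutdown flag are hypothetical names chosen for illustration and do not 
reflect NiFi's actual decommission code or command-line options.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the decommission sequence; not NiFi's implementation.
final class DecommissionSketch {

    enum Step { DISCONNECT_NODE, TRIGGER_OFFLOAD, WAIT_FOR_OFFLOAD, REMOVE_NODE, SHUTDOWN }

    // Builds the ordered list of steps, skipping SHUTDOWN when it is not requested.
    static List<Step> plan(boolean shutdown) {
        List<Step> steps = new ArrayList<>();
        for (Step step : Step.values()) {
            if (step == Step.SHUTDOWN && !shutdown) {
                continue;
            }
            steps.add(step);
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(plan(true));  // all five steps, ending with SHUTDOWN
        System.out.println(plan(false)); // the same steps without SHUTDOWN
    }
}
```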




[jira] [Created] (NIFI-12453) Allow obtaining a node's cluster status via nifi.sh

2023-12-01 Thread Mark Payne (Jira)

Mark Payne created NIFI-12453:
-

 Summary: Allow obtaining a node's cluster status via nifi.sh
 Key: NIFI-12453
 URL: https://issues.apache.org/jira/browse/NIFI-12453
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Core Framework
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.0.0


There are often times when we want to check the cluster status of a particular 
node (i.e., is it CONNECTING, CONNECTED, DISCONNECTING, DISCONNECTED, etc.) 
Currently the only way to obtain this information is via the REST API or the 
full diagnostic dump.




[jira] [Updated] (NIFI-11671) JoinEnrichment SQL strategy doesn't allow attributes in join statement

2023-11-21 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-11671:
--
Status: Patch Available  (was: Open)

> JoinEnrichment SQL strategy doesn't allow attributes in join statement
> --
>
> Key: NIFI-11671
> URL: https://issues.apache.org/jira/browse/NIFI-11671
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
> Affects Versions: 1.23.0, 1.20.0, 1.18.0
> Reporter: Philipp Korniets
> Assignee: Mark Payne
> Priority: Minor
> Fix For: 1.latest, 2.latest
>
> Attachments: screenshot-1.png, screenshot-2.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We use the ForkEnrichment - JoinEnrichment pattern and want to include filtering 
> in the join SQL. The filter value comes from a FlowFile attribute:
> {code:sql}
> ${test}  = 'NewValue'
> SELECT original.*, enrichment.*,'${test}'
> FROM original 
> LEFT OUTER JOIN enrichment 
> ON original.Underlying = enrichment.Underlying
> WHERE enrichment.MyField = '${test}'
> {code}
> However, this doesn't work because JoinEnrichment doesn't use 
> evaluateAttributeExpressions.
> Additionally, versions 1.18 and 1.23 don't allow the whole query to be passed 
> as an attribute.
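
For illustration, the substitution the reporter expects for ${test} can be 
approximated with plain JDK code. This is a simplified sketch, not NiFi's 
Expression Language engine: it only resolves bare ${name} references, the 
class and method names are hypothetical, and values are inserted verbatim 
without any SQL escaping.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified stand-in for attribute evaluation; NOT NiFi Expression Language.
final class AttributeSubstitutionSketch {

    // Matches ${name}, where name consists of letters, digits, '_', '.', or '-'.
    private static final Pattern REFERENCE = Pattern.compile("\\$\\{([A-Za-z0-9_.-]+)\\}");

    // Replaces each ${name} with the matching attribute value; unknown names are left untouched.
    static String evaluate(String sql, Map<String, String> attributes) {
        Matcher matcher = REFERENCE.matcher(sql);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String value = attributes.get(matcher.group(1));
            String replacement = value == null ? matcher.group() : value;
            // quoteReplacement keeps '$' and '\' in the value from being treated specially.
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        Map<String, String> attributes = Collections.singletonMap("test", "NewValue");
        System.out.println(evaluate("WHERE enrichment.MyField = '${test}'", attributes));
    }
}
```

Because values are inserted verbatim, a real implementation would also need to 
consider quoting and injection concerns, which this sketch ignores.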
> !screenshot-1.png|width=692,height=431!
>  
> {code:java}
> 2023-06-28 11:07:16,611 ERROR [Timer-Driven Process Thread-7] 
> o.a.n.processors.standard.JoinEnrichment 
> JoinEnrichment[id=dbe156ac-0187-1000-4477-0183899e0432] Failed to join 
> 'original' FlowFile 
> StandardFlowFileRecord[uuid=2ab9f6ad-73a5-4763-b25e-fd26c44835e1,claim=StandardContentClaim
>  [resourceClaim=StandardResourceClaim[id=1687948831976-629, 
> container=default, section=629], offset=8334082, 
> length=600557],offset=0,name=lmr_SY08C41-1_S_514682_20230627.csv,size=600557] 
> and 'enrichment' FlowFile 
> StandardFlowFileRecord[uuid=e4bb7769-fdce-4dfe-af18-443676103035,claim=StandardContentClaim
>  [resourceClaim=StandardResourceClaim[id=1687949723375-631, 
> container=default, section=631], offset=5362822, 
> length=1999502],offset=0,name=lmr_SY08C41-1_S_514682_20230627.csv,size=1999502];
>  routing to failure
> java.sql.SQLException: Error while preparing statement [${instrumentJoinSQL}]
>     at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
>     at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
>     at 
> org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement_(CalciteConnectionImpl.java:224)
>     at 
> org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:203)
>     at 
> org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:99)
>     at 
> org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:178)
>     at 
> org.apache.nifi.processors.standard.enrichment.SqlJoinCache.createCalciteParameters(SqlJoinCache.java:91)
>     at 
> org.apache.nifi.processors.standard.enrichment.SqlJoinCache.getCalciteParameters(SqlJoinCache.java:65)
>     at 
> org.apache.nifi.processors.standard.enrichment.SqlJoinStrategy.join(SqlJoinStrategy.java:49)
>     at 
> org.apache.nifi.processors.standard.JoinEnrichment.processBin(JoinEnrichment.java:387)
>     at 
> org.apache.nifi.processor.util.bin.BinFiles.processBins(BinFiles.java:233)
>     at 
> org.apache.nifi.processors.standard.JoinEnrichment.processBins(JoinEnrichment.java:503)
>     at 
> org.apache.nifi.processor.util.bin.BinFiles.onTrigger(BinFiles.java:193)
>     at 
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1354)
>     at 
> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:246)
>     at 
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102)
>     at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:750)
> Caused by: java.lang.RuntimeException: parse failed: Encountered "$" at line 
> 1, column 1.
> Was expecting one of:
>     "ABS" ...
>  {code}
> As I understand it, the issue is in the following line of code
>

[jira] [Assigned] (NIFI-11671) JoinEnrichment SQL strategy doesn't allow attributes in join statement

2023-11-21 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne reassigned NIFI-11671:
-

Assignee: Mark Payne

> JoinEnrichment SQL strategy doesn't allow attributes in join statement
> --
>
> Key: NIFI-11671
> URL: https://issues.apache.org/jira/browse/NIFI-11671
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
> Affects Versions: 1.18.0, 1.20.0, 1.23.0
> Reporter: Philipp Korniets
> Assignee: Mark Payne
> Priority: Minor
> Fix For: 1.latest, 2.latest
>
> Attachments: screenshot-1.png, screenshot-2.png
>
>

[jira] [Updated] (NIFI-11671) JoinEnrichment SQL strategy doesn't allow attributes in join statement

2023-11-21 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-11671:
--
Fix Version/s: 1.latest
   2.latest

> JoinEnrichment SQL strategy doesn't allow attributes in join statement
> --
>
> Key: NIFI-11671
> URL: https://issues.apache.org/jira/browse/NIFI-11671
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
> Affects Versions: 1.18.0, 1.20.0, 1.23.0
> Reporter: Philipp Korniets
> Priority: Minor
> Fix For: 1.latest, 2.latest
>
> Attachments: screenshot-1.png, screenshot-2.png
>
>

[jira] [Updated] (NIFI-11671) JoinEnrichment SQL strategy doesn't allow attributes in join statement

2023-11-21 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-11671:
--
Priority: Minor  (was: Critical)

> JoinEnrichment SQL strategy doesn't allow attributes in join statement
> --
>
> Key: NIFI-11671
> URL: https://issues.apache.org/jira/browse/NIFI-11671
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.18.0, 1.20.0, 1.23.0
>Reporter: Philipp Korniets
>Priority: Minor
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We use ForkEnrichement - JoinEnrichment pattern and want to include filtering 
> in join SQL. Filter value is coming from FlowFile attribute
> {code:sql}
> ${test}  = 'NewValue'
> SELECT original.*, enrichment.*,'${test}'
> FROM original 
> LEFT OUTER JOIN enrichment 
> ON original.Underlying = enrichment.Underlying
> WHERE enrichment.MyField = '${test}'
> {code}
> However this doesnt work because JoinEnrichment doesnt use 
> evaluateAttributeExpressions
> Additionally in version 1.18,1.23 - doesnt allow whole query to be passed as 
> attribute.
> !screenshot-1.png|width=692,height=431!
>  
> {code:java}
> 2023-06-28 11:07:16,611 ERROR [Timer-Driven Process Thread-7] 
> o.a.n.processors.standard.JoinEnrichment 
> JoinEnrichment[id=dbe156ac-0187-1000-4477-0183899e0432] Failed to join 
> 'original' FlowFile 
> StandardFlowFileRecord[uuid=2ab9f6ad-73a5-4763-b25e-fd26c44835e1,claim=StandardContentClaim
>  [resourceClaim=StandardResourceClaim[id=1687948831976-629, 
> container=default, section=629], offset=8334082, 
> length=600557],offset=0,name=lmr_SY08C41-1_S_514682_20230627.csv,size=600557] 
> and 'enrichment' FlowFile 
> StandardFlowFileRecord[uuid=e4bb7769-fdce-4dfe-af18-443676103035,claim=StandardContentClaim
>  [resourceClaim=StandardResourceClaim[id=1687949723375-631, 
> container=default, section=631], offset=5362822, 
> length=1999502],offset=0,name=lmr_SY08C41-1_S_514682_20230627.csv,size=1999502];
>  routing to failure
> java.sql.SQLException: Error while preparing statement [${instrumentJoinSQL}]
>     at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
>     at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
>     at 
> org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement_(CalciteConnectionImpl.java:224)
>     at 
> org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:203)
>     at 
> org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:99)
>     at 
> org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:178)
>     at 
> org.apache.nifi.processors.standard.enrichment.SqlJoinCache.createCalciteParameters(SqlJoinCache.java:91)
>     at 
> org.apache.nifi.processors.standard.enrichment.SqlJoinCache.getCalciteParameters(SqlJoinCache.java:65)
>     at 
> org.apache.nifi.processors.standard.enrichment.SqlJoinStrategy.join(SqlJoinStrategy.java:49)
>     at 
> org.apache.nifi.processors.standard.JoinEnrichment.processBin(JoinEnrichment.java:387)
>     at 
> org.apache.nifi.processor.util.bin.BinFiles.processBins(BinFiles.java:233)
>     at 
> org.apache.nifi.processors.standard.JoinEnrichment.processBins(JoinEnrichment.java:503)
>     at 
> org.apache.nifi.processor.util.bin.BinFiles.onTrigger(BinFiles.java:193)
>     at 
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1354)
>     at 
> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:246)
>     at 
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102)
>     at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:750)
> Caused by: java.lang.RuntimeException: parse failed: Encountered "$" at line 
> 1, column 1.
> Was expecting one of:
>     "ABS" ...
>  {code}
> As I understand it, the issue is in the following line of code
>

[jira] [Updated] (NIFI-11671) JoinEnrichment SQL strategy doesn't allow attributes in join statement

2023-11-21 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-11671:
--
Component/s: Extensions
 (was: Core Framework)

> JoinEnrichment SQL strategy doesn't allow attributes in join statement
> --
>
> Key: NIFI-11671
> URL: https://issues.apache.org/jira/browse/NIFI-11671
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.18.0, 1.20.0, 1.23.0
>Reporter: Philipp Korniets
>Priority: Minor
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We use the ForkEnrichment - JoinEnrichment pattern and want to include filtering 
> in the join SQL. The filter value comes from a FlowFile attribute:
> {code:sql}
> ${test}  = 'NewValue'
> SELECT original.*, enrichment.*,'${test}'
> FROM original 
> LEFT OUTER JOIN enrichment 
> ON original.Underlying = enrichment.Underlying
> WHERE enrichment.MyField = '${test}'
> {code}
> However, this doesn't work because JoinEnrichment doesn't use 
> evaluateAttributeExpressions.
> Additionally, versions 1.18 and 1.23 don't allow the whole query to be passed 
> as an attribute.
> !screenshot-1.png|width=692,height=431!
>  
> {code:java}
> 2023-06-28 11:07:16,611 ERROR [Timer-Driven Process Thread-7] 
> o.a.n.processors.standard.JoinEnrichment 
> JoinEnrichment[id=dbe156ac-0187-1000-4477-0183899e0432] Failed to join 
> 'original' FlowFile 
> StandardFlowFileRecord[uuid=2ab9f6ad-73a5-4763-b25e-fd26c44835e1,claim=StandardContentClaim
>  [resourceClaim=StandardResourceClaim[id=1687948831976-629, 
> container=default, section=629], offset=8334082, 
> length=600557],offset=0,name=lmr_SY08C41-1_S_514682_20230627.csv,size=600557] 
> and 'enrichment' FlowFile 
> StandardFlowFileRecord[uuid=e4bb7769-fdce-4dfe-af18-443676103035,claim=StandardContentClaim
>  [resourceClaim=StandardResourceClaim[id=1687949723375-631, 
> container=default, section=631], offset=5362822, 
> length=1999502],offset=0,name=lmr_SY08C41-1_S_514682_20230627.csv,size=1999502];
>  routing to failure
> java.sql.SQLException: Error while preparing statement [${instrumentJoinSQL}]
>     at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
>     at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
>     at 
> org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement_(CalciteConnectionImpl.java:224)
>     at 
> org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:203)
>     at 
> org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:99)
>     at 
> org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:178)
>     at 
> org.apache.nifi.processors.standard.enrichment.SqlJoinCache.createCalciteParameters(SqlJoinCache.java:91)
>     at 
> org.apache.nifi.processors.standard.enrichment.SqlJoinCache.getCalciteParameters(SqlJoinCache.java:65)
>     at 
> org.apache.nifi.processors.standard.enrichment.SqlJoinStrategy.join(SqlJoinStrategy.java:49)
>     at 
> org.apache.nifi.processors.standard.JoinEnrichment.processBin(JoinEnrichment.java:387)
>     at 
> org.apache.nifi.processor.util.bin.BinFiles.processBins(BinFiles.java:233)
>     at 
> org.apache.nifi.processors.standard.JoinEnrichment.processBins(JoinEnrichment.java:503)
>     at 
> org.apache.nifi.processor.util.bin.BinFiles.onTrigger(BinFiles.java:193)
>     at 
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1354)
>     at 
> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:246)
>     at 
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102)
>     at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:750)
> Caused by: java.lang.RuntimeException: parse failed: Encountered "$" at line 
> 1, column 1.
> Was expecting one of:
>     "ABS" ...
>  {code}
> As I understand it, the issue is in the following line of code
>
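
To illustrate what the reporter is asking for, here is a minimal, standalone sketch of resolving ${name} references against FlowFile attributes before the SQL reaches the parser. It is not the NiFi implementation: the class AttributeSqlSketch and the method resolve are hypothetical, it handles only plain ${name} references rather than the full Expression Language, and it assumes each reference sits inside a single-quoted SQL string literal (so it would not suit passing a whole query as one attribute).

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of NiFi.
class AttributeSqlSketch {
    // Matches simple ${name} references only; NiFi Expression Language is far richer.
    private static final Pattern REF = Pattern.compile("\\$\\{([A-Za-z0-9_.-]+)\\}");

    public static String resolve(String sql, Map<String, String> attributes) {
        Matcher m = REF.matcher(sql);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = attributes.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No attribute named " + m.group(1));
            }
            // Double any single quote so the value stays inside its SQL string literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value.replace("'", "''")));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String sql = "SELECT original.*, enrichment.* FROM original "
                + "LEFT OUTER JOIN enrichment ON original.Underlying = enrichment.Underlying "
                + "WHERE enrichment.MyField = '${test}'";
        System.out.println(resolve(sql, Map.of("test", "NewValue")));
    }
}
```

With the attribute test set to NewValue, the parser would receive a WHERE clause comparing against 'NewValue' instead of failing on the "$" character as in the stack trace above.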

[jira] [Updated] (NIFI-12358) NPE when configured network interfaces do not exist

2023-11-15 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12358:
--
Fix Version/s: 2.latest
 Assignee: Mark Payne
   Status: Patch Available  (was: Open)

> NPE when configured network interfaces do not exist
> ---
>
> Key: NIFI-12358
> URL: https://issues.apache.org/jira/browse/NIFI-12358
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.20.0
>Reporter: Guillaume Lhermenier
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.latest
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I recently had to switch our NiFi base AMIs in AWS from amazonlinux 2 to 
> amazonlinux 2023. 
> This went pretty smoothly, but I hit an issue with network interfaces.
> For some reason, I had the following configured in my nifi.properties:
> {code:java}
> nifi.web.https.host=nifi1.emea.qa.domain.io
> nifi.web.https.port=8443 
> nifi.web.https.network.interface.eth0=eth0
> nifi.web.https.network.interface.eth1=eth1{code}
> And this worked for many years.
> However, in Amazon Linux 2023, networking seems to have changed, and the 
> interface naming too. 
> Instead of eth0/eth1, my network interfaces were named ens5/ens6.
> Of course, NiFi wasn't able to find them. 
> However, the log could be clearer than a NullPointerException:
> {code:java}
> 2023-11-13 14:35:28,644 WARN [main] o.a.nifi.web.server.HostHeaderHandler 
> Failed to determine custom network interfaces.
> java.lang.NullPointerException: null
> at 
> org.apache.nifi.web.server.HostHeaderHandler.extractIPsFromNetworkInterfaces(HostHeaderHandler.java:335)
> at 
> org.apache.nifi.web.server.HostHeaderHandler.generateDefaultHostnames(HostHeaderHandler.java:276)
> at 
> org.apache.nifi.web.server.HostHeaderHandler.<init>(HostHeaderHandler.java:100)
> at org.apache.nifi.web.server.JettyServer.init(JettyServer.java:217)
> at 
> org.apache.nifi.web.server.JettyServer.initialize(JettyServer.java:1074)
> at org.apache.nifi.NiFi.<init>(NiFi.java:164)
> at org.apache.nifi.NiFi.<init>(NiFi.java:83)
> at org.apache.nifi.NiFi.main(NiFi.java:332)
> 2023-11-13 14:35:28,649 INFO [main] o.a.nifi.web.server.HostHeaderHandler 
> Determined 14 valid hostnames and IP addresses for incoming headers: 
> 127.0.0.1, 127.0.0.1:8443, localhost, localhost:8443, [::1], [::1]:8443, 
> ip-172-30-xx-xx.eu-west-1.compute.internal, 
> ip-172-30-xx-xx.eu-west-1.compute.internal:8443, 172.30.xx.xx, 
> 172.30.xx.xx:8443, nifi1.emea.qa.domain.io, nifi1.emea.qa.domain.io:8443, 
> nifi.emea.qa.domain.io, {code}
>  
> NB: I haven't tested this on versions newer than 1.20 and won't have time to 
> in the coming weeks.
> However, our migration to 1.23 should be done in the coming months; I'll update 
> the ticket if needed at that time.
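
As a sketch of the clearer diagnostics the reporter asks for: java.net.NetworkInterface.getByName returns null, rather than throwing, when no interface has the given name, so an unchecked result is one plausible source of the NullPointerException. The code at HostHeaderHandler.java:335 is not shown in this thread, so that link is an assumption, and the class InterfaceLookupSketch and method findMissing below are hypothetical names, not NiFi code.

```java
import java.io.UncheckedIOException;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of NiFi.
class InterfaceLookupSketch {
    /** Returns the configured names that match no network interface on this host. */
    public static List<String> findMissing(List<String> configuredNames) {
        List<String> missing = new ArrayList<>();
        for (String name : configuredNames) {
            try {
                // getByName returns null when no interface with this name exists.
                if (NetworkInterface.getByName(name) == null) {
                    missing.add(name);
                }
            } catch (SocketException e) {
                // SocketException is an IOException, so it can be rethrown unchecked.
                throw new UncheckedIOException(e);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Names taken from the nifi.properties excerpt in the report.
        for (String name : findMissing(List.of("eth0", "eth1"))) {
            System.err.println("Configured network interface '" + name
                    + "' was not found on this host");
        }
    }
}
```

On a host whose interfaces are named ens5/ens6, this would report both eth0 and eth1 by name instead of surfacing a bare NullPointerException.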




[jira] [Updated] (NIFI-12374) Add Use Case based documentation for performing full/incremental loads

2023-11-15 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12374:
--
Status: Patch Available  (was: Open)

> Add Use Case based documentation for performing full/incremental loads
> --
>
> Key: NIFI-12374
> URL: https://issues.apache.org/jira/browse/NIFI-12374
> Project: Apache NiFi
>  Issue Type: Task
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Minor
> Fix For: 2.latest
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>





[jira] [Created] (NIFI-12374) Add Use Case based documentation for performing full/incremental loads

2023-11-15 Thread Mark Payne (Jira)

Mark Payne created NIFI-12374:
-

 Summary: Add Use Case based documentation for performing 
full/incremental loads
 Key: NIFI-12374
 URL: https://issues.apache.org/jira/browse/NIFI-12374
 Project: Apache NiFi
  Issue Type: Task
  Components: Extensions
Reporter: Mark Payne
Assignee: Mark Payne
 Fix For: 2.latest







[jira] [Updated] (NIFI-12332) Remove nifi-toolkit-flowfile-repo Module

2023-11-10 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-12332:
--
Fix Version/s: 2.0.0
   (was: 2.latest)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Remove nifi-toolkit-flowfile-repo Module
> 
>
> Key: NIFI-12332
> URL: https://issues.apache.org/jira/browse/NIFI-12332
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: David Handermann
>Assignee: David Handermann
>Priority: Minor
> Fix For: 2.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The nifi-toolkit-flowfile-repo module contains a command to support repairing 
> corrupted endings in a FlowFile repository. The command is not accessible 
> through any shell scripts and is not regularly maintained as part of the NiFi 
> CLI or other toolkit components. For these reasons, the module should be 
> removed from the main branch.




[jira] [Resolved] (NIFI-12339) Sensitive Dynamic Properties not properly decrypted, resulting in wrong property value and ever-growing flow.json.gz

2023-11-10 Thread Mark Payne (Jira)



 [ 
https://issues.apache.org/jira/browse/NIFI-12339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne resolved NIFI-12339.
---
Resolution: Fixed

> Sensitive Dynamic Properties not properly decrypted, resulting in wrong 
> property value and ever-growing flow.json.gz
> 
>
> Key: NIFI-12339
> URL: https://issues.apache.org/jira/browse/NIFI-12339
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: David Handermann
>Priority: Blocker
> Fix For: 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> To replicate, create an InvokeHTTP Processor. Add a Sensitive Dynamic 
> Property named "Authorization" with a value of "Bearer 
> fsi8y3ofysp9f8ncp9nupnu8p3s9nu3s9" (it's ok that the value is nonsense). 
> Apply the changes.
> Check the flow.json.gz:
> {code:java}
> cat conf/flow.json.gz | gunzip - | jq | grep Authorization{code}
> Restart NiFi.
> The value is no longer correct. And if you run the {{cat}} command above, 
> you'll see the value has doubled in length. After restarting several times we 
> can see this:
> {code:java}
> nifi-2.0.0-SNAPSHOT $ cat conf/flow.json.gz | gunzip - | jq | grep 
> Authorization
>               "Authorization": 
> "enc{f1f9ba180c6468ff8ce393955034e69383739de54b44ef42b1bf2050c2639e83815d940b8a0cf9f5bc65bdf36f7df59bff9d7e69fa02f0ccc25c8b381684550c8fc6b6a8c570998064ef730f05b0dc}",
> -- restart --
> nifi-2.0.0-SNAPSHOT $ cat conf/flow.json.gz | gunzip - | jq | grep 
> Authorization
>               "Authorization": 
> "enc{e4455b884d07a7156397d2f60ce3a2f44be909084403f5a84af205bae2af6dbfa2adf47a33d6663799ab523915e9323064554030236b928d5b1684b0a9d635b6589d878b731c35ae1560fbef5627a433b23fb331657e66af355ac356a1c9cd1435c0836a4ecb872966c2852aa3b13e179da1a0f7898c64173b27363458c01dbf7c8595a5dfe9ab798834568c9e0a52fefaf03f6f9d1bdf6ad230fea7cf1e8663a78a6b964d945c729d9ae678e2eaba8910d02373cd9acd08e7a047e0c676ee8a13e9c0}",
> -- restart --
> nifi-2.0.0-SNAPSHOT $ cat conf/flow.json.gz | gunzip - | jq | grep 
> Authorization
>               "Authorization": 
> "enc{1aeb6970c1ff7f10b88f5b94a2c0cfa70c179638eb976ff7580f5b2546a64b4d96ae834afff9d01cae79c98b9ca4d73af604eab5e95013047e79c152d3e90b3c556e054f9478713eb156da41477d59668902c606f3f300e9804b8a504712822b5f072a5a596c2ba1706520f0163ce8bf0a51dbaf84ee9359c60e55df029dec700725ff1ac599774d4271d5c390ad49d4b350d21bee9f2c235a81f5356d85279db7b4e335bc11fc0d6bf1045a6d2610ff61d8b9da931fc026d356a3d9a9b738312d283c01740757a286e5eb9ad675daa14a391d3df694eaeeb6c66085976a88c86a08052b3eb046e622e5346205bc1e38bfe4aed2ff130595688e4b72d217f29a5c24a28bc06c7bb55e4fd2d25fea15ce523e92b8d721e9a9c0d08ab6d1634cb027658c868feacd89462796b604db7dc55cc2bba7c650f77148bad4ec7328ae8dbeed743420b5b640061f36ed8c8c1db200bbe6a241d6eb370cb024a5881fc734d722e2f1091f1ffa178ad841a4859c9dc734b66a628fbfeb8c3f0a1e5d02e28ce3e2c04737ab5b92d032fafe21ebe5abd542731228b394356bb5b547c68517f972864351022d2ef1118426}",
> -- restart --
> nifi-2.0.0-SNAPSHOT $ cat conf/flow.json.gz | gunzip - | jq | grep 
> Authorization
>               "Authorization": 
>
