[jira] [Created] (NIFI-13217) Update JavaDoc on core attributes unaffected by removeAttribute / removeAllAttributes

2024-05-11 Thread endzeit (Jira)
endzeit created NIFI-13217:
--

 Summary: Update JavaDoc on core attributes unaffected by 
removeAttribute / removeAllAttributes
 Key: NIFI-13217
 URL: https://issues.apache.org/jira/browse/NIFI-13217
 Project: Apache NiFi
  Issue Type: Task
Reporter: endzeit
Assignee: endzeit


NIFI-13200 introduced a behavioral change to the {{standard}} implementation of 
{{ProcessSession}}.

The existing JavaDoc of {{ProcessSession}} notes that {{removeAttribute}} and 
{{removeAllAttributes}} do not affect the core attribute {{uuid}}. A similar 
note should be added for the core attributes {{path}} and {{filename}}, which 
are now exempt as well.
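
To illustrate the exemption, here is a minimal sketch that uses a plain map as a stand-in for a FlowFile's attributes. It is not NiFi code: {{AttributeRemovalSketch}}, {{PROTECTED}}, and both helper methods are hypothetical names chosen for this example, and the real {{ProcessSession}} implementation may handle the exemption differently.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for attribute handling; not the NiFi implementation.
final class AttributeRemovalSketch {

    // Core attributes the ticket describes as exempt from removal.
    static final Set<String> PROTECTED = Set.of("uuid", "path", "filename");

    // Returns a copy without the given key, unless the key is a protected core attribute.
    static Map<String, String> removeAttribute(Map<String, String> attributes, String key) {
        Map<String, String> result = new HashMap<>(attributes);
        if (!PROTECTED.contains(key)) {
            result.remove(key);
        }
        return result;
    }

    // Returns a copy without any of the requested keys, skipping protected core attributes.
    static Map<String, String> removeAllAttributes(Map<String, String> attributes, Set<String> keys) {
        Map<String, String> result = new HashMap<>(attributes);
        for (String key : keys) {
            if (!PROTECTED.contains(key)) {
                result.remove(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = Map.of(
                "uuid", "1234", "path", "./", "filename", "data.txt", "mime.type", "text/plain");
        // Requesting removal of every key still leaves uuid, path and filename in place.
        Map<String, String> remaining = removeAllAttributes(attributes, attributes.keySet());
        System.out.println(remaining.keySet());
    }
}
```

In this model a request to remove a protected key is silently ignored rather than rejected, which matches how the ticket describes {{uuid}} being "not affected".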



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (NIFI-12986) Tidy up JavaDoc of ProcessSession

2024-04-22 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839795#comment-17839795
 ] 

endzeit commented on NIFI-12986:


I opened [PR 8683|https://github.com/apache/nifi/pull/8683] addressing the 
deprecation notice. 

For now, I've kept the paragraph style consistent with the other paragraphs in 
the file.
I would like to address the paragraph formatting in a separate issue / PR, so 
we can get this merged to main quickly and don't block release work. 
 

> Tidy up JavaDoc of ProcessSession
> -
>
> Key: NIFI-12986
> URL: https://issues.apache.org/jira/browse/NIFI-12986
> Project: Apache NiFi
>  Issue Type: Sub-task
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
> Fix For: 2.0.0-M3
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> While working on NIFI-12982 I noticed that the JavaDoc of {{ProcessSession}} 
> has some minor typos and documentation drifts between method overloads.
> The goal of this ticket is to make the JavaDoc for the current 
> {{ProcessSession}} specification more consistent. The specified contract must 
> not be altered. 





[jira] [Comment Edited] (NIFI-12986) Tidy up JavaDoc of ProcessSession

2024-04-22 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839770#comment-17839770
 ] 

endzeit edited comment on NIFI-12986 at 4/22/24 4:43 PM:
-

As discussed with [~exceptionfactory], I've found both styles of paragraphs 
throughout the codebase and contradictory recommendations online, which is why 
I opted for the less verbose option; see the conversation in the PR. 

I agree with David that either style is fine as long as it's consistent. 
If you prefer the closing tags, I'd like to open a PR that aligns all comments 
in the codebase to that style.

I'm fine with removing the deprecation notice but would like to keep the 
general explanation of why `commitAsync` was introduced and should be preferred 
in most cases. 
The TestRunner fails by default when using `commit`, but I wasn't able to 
directly find the reasoning behind these changes. The comments I added were 
based solely on information I found on the PR / ticket that introduced these 
changes. 
What are your thoughts on that?


was (Author: endzeitbegins):
As discussed with [~exceptionfactory], I've found both styles of paragraphs 
throughout the codebase and contradictory recommendations online, which is why 
I opted for the less verbose option; see the conversation in the PR. 

I agree with David that either style is fine as long as it's consistent. 
If you prefer the closing tags, I'd like to open a PR that aligns all comments 
in the codebase to that style.

I'm fine with removing the deprecation notice but would like to see the general 
explanation of why `commitAsync` was introduced and should be preferred in most 
cases. 
The TestRunner fails by default when using `commit`, but I wasn't able to 
directly find the reasoning behind these changes. The comments I added were 
based solely on information I found on the PR / ticket that introduced these 
changes. 
What are your thoughts on that?

> Tidy up JavaDoc of ProcessSession
> -
>
> Key: NIFI-12986
> URL: https://issues.apache.org/jira/browse/NIFI-12986
> Project: Apache NiFi
>  Issue Type: Sub-task
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
> Fix For: 2.0.0-M3
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> While working on NIFI-12982 I noticed that the JavaDoc of {{ProcessSession}} 
> has some minor typos and documentation drifts between method overloads.
> The goal of this ticket is to make the JavaDoc for the current 
> {{ProcessSession}} specification more consistent. The specified contract must 
> not be altered. 





[jira] [Commented] (NIFI-12986) Tidy up JavaDoc of ProcessSession

2024-04-22 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839770#comment-17839770
 ] 

endzeit commented on NIFI-12986:


As discussed with [~exceptionfactory], I've found both styles of paragraphs 
throughout the codebase and contradictory recommendations online, which is why 
I opted for the less verbose option; see the conversation in the PR. 

I agree with David that either style is fine as long as it's consistent. 
If you prefer the closing tags, I'd like to open a PR that aligns all comments 
in the codebase to that style.

I'm fine with removing the deprecation notice but would like to see the general 
explanation of why `commitAsync` was introduced and should be preferred in most 
cases. 
The TestRunner fails by default when using `commit`, but I wasn't able to 
directly find the reasoning behind these changes. The comments I added were 
based solely on information I found on the PR / ticket that introduced these 
changes. 
What are your thoughts on that?

> Tidy up JavaDoc of ProcessSession
> -
>
> Key: NIFI-12986
> URL: https://issues.apache.org/jira/browse/NIFI-12986
> Project: Apache NiFi
>  Issue Type: Sub-task
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
> Fix For: 2.0.0-M3
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> While working on NIFI-12982 I noticed that the JavaDoc of {{ProcessSession}} 
> has some minor typos and documentation drifts between method overloads.
> The goal of this ticket is to make the JavaDoc for the current 
> {{ProcessSession}} specification more consistent. The specified contract must 
> not be altered. 





[jira] [Assigned] (NIFI-13039) Provide default implementations for method overloads in ProcessSession

2024-04-13 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit reassigned NIFI-13039:
--

Assignee: endzeit

> Provide default implementations for method overloads in ProcessSession
> --
>
> Key: NIFI-13039
> URL: https://issues.apache.org/jira/browse/NIFI-13039
> Project: Apache NiFi
>  Issue Type: Sub-task
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
>
> While going through the JavaDoc of ProcessSession in NIFI-12986, I noticed 
> that _void commitAsync();_ and _FlowFile merge(Collection<FlowFile> sources, 
> FlowFile destination);_ are overloads of their respective "sibling" functions.
> As discussed in the [PR 
> #8620|https://github.com/apache/nifi/pull/8620#issuecomment-2050053994], it 
> makes sense to provide default implementations that utilize the "sibling" 
> functions with more parameters and remove the existing implementations.
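
The pattern proposed here can be sketched with a small, self-contained interface. This is a hedged illustration only: {{SessionSketch}}, {{RecordingSession}}, and {{DefaultOverloadDemo}} are hypothetical names, the two-argument signature is an assumption modelled on the overload pair named above, and the no-op callbacks passed by the default method are a choice made for this example rather than what {{ProcessSession}} necessarily does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical interface modelled loosely on the overload pair from the ticket; not the NiFi API.
interface SessionSketch {

    // The "sibling" with more parameters: every implementation must provide it.
    void commitAsync(Runnable onSuccess, Consumer<Throwable> onFailure);

    // The overload with fewer parameters delegates to the sibling,
    // so implementations no longer need their own copy of it.
    default void commitAsync() {
        commitAsync(() -> { }, failure -> { });
    }
}

// Implements only the sibling with more parameters and records each call.
final class RecordingSession implements SessionSketch {
    final List<String> calls = new ArrayList<>();

    @Override
    public void commitAsync(Runnable onSuccess, Consumer<Throwable> onFailure) {
        calls.add("commitAsync(onSuccess, onFailure)");
        onSuccess.run();
    }
}

final class DefaultOverloadDemo {
    public static void main(String[] args) {
        RecordingSession session = new RecordingSession();
        session.commitAsync(); // resolved by the default method on the interface
        System.out.println(session.calls); // prints [commitAsync(onSuccess, onFailure)]
    }
}
```

With this shape, an implementation that previously overrode both overloads can drop the shorter one without changing behavior for callers.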





[jira] [Updated] (NIFI-13039) Provide default implementations for method overloads in ProcessSession

2024-04-12 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-13039:
---
Description: 
While going through the JavaDoc of ProcessSession in NIFI-12986, I noticed that 
_void commitAsync();_ and _FlowFile merge(Collection<FlowFile> sources, 
FlowFile destination);_ are overloads of their respective "sibling" functions.
As discussed in the [PR 
#8620|https://github.com/apache/nifi/pull/8620#issuecomment-2050053994], it 
makes sense to provide default implementations that utilize the "sibling" 
functions with more parameters and remove the existing implementations.


  was:
While going through the JavaDoc of ProcessSession in NIFI-12986, I noticed that 
_void commitAsync();_ and _FlowFile merge(Collection<FlowFile> sources, 
FlowFile destination);_ are overloads of their respective "sibling" functions.
As discussed in the [PR #8620|https://github.com/apache/nifi/pull/8620], it 
makes sense to provide default implementations that utilize the "sibling" 
functions with more parameters and remove the existing implementations.



> Provide default implementations for method overloads in ProcessSession
> --
>
> Key: NIFI-13039
> URL: https://issues.apache.org/jira/browse/NIFI-13039
> Project: Apache NiFi
>  Issue Type: Sub-task
>Reporter: endzeit
>Priority: Major
>
> While going through the JavaDoc of ProcessSession in NIFI-12986, I noticed 
> that _void commitAsync();_ and _FlowFile merge(Collection<FlowFile> sources, 
> FlowFile destination);_ are overloads of their respective "sibling" functions.
> As discussed in the [PR 
> #8620|https://github.com/apache/nifi/pull/8620#issuecomment-2050053994], it 
> makes sense to provide default implementations that utilize the "sibling" 
> functions with more parameters and remove the existing implementations.





[jira] [Updated] (NIFI-13039) Provide default implementations for method overloads in ProcessSession

2024-04-12 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-13039:
---
Description: 
While going through the JavaDoc of ProcessSession in NIFI-12986, I noticed that 
_void commitAsync();_ and _FlowFile merge(Collection<FlowFile> sources, 
FlowFile destination);_ are overloads of their respective "sibling" functions.
As discussed in the [PR #8620|https://github.com/apache/nifi/pull/8620], it 
makes sense to provide default implementations that utilize the "sibling" 
functions with more parameters and remove the existing implementations.


  was:
While going through the JavaDoc of ProcessSession in NIFI-12986, I noticed that 
_void commitAsync();_ and _FlowFile merge(Collection<FlowFile> sources, 
FlowFile destination);_ are overloads of their respective "sibling" functions.
As discussed in the PR with more parameters but do not define a default 
implementation.



> Provide default implementations for method overloads in ProcessSession
> --
>
> Key: NIFI-13039
> URL: https://issues.apache.org/jira/browse/NIFI-13039
> Project: Apache NiFi
>  Issue Type: Sub-task
>Reporter: endzeit
>Priority: Major
>
> While going through the JavaDoc of ProcessSession in NIFI-12986, I noticed 
> that _void commitAsync();_ and _FlowFile merge(Collection<FlowFile> sources, 
> FlowFile destination);_ are overloads of their respective "sibling" functions.
> As discussed in the [PR #8620|https://github.com/apache/nifi/pull/8620], it 
> makes sense to provide default implementations that utilize the "sibling" 
> functions with more parameters and remove the existing implementations.





[jira] [Created] (NIFI-13039) Provide default implementations for method overloads in ProcessSession

2024-04-12 Thread endzeit (Jira)
endzeit created NIFI-13039:
--

 Summary: Provide default implementations for method overloads in 
ProcessSession
 Key: NIFI-13039
 URL: https://issues.apache.org/jira/browse/NIFI-13039
 Project: Apache NiFi
  Issue Type: Sub-task
Reporter: endzeit


While going through the JavaDoc of ProcessSession in NIFI-12986, I noticed that 
_void commitAsync();_ and _FlowFile merge(Collection<FlowFile> sources, 
FlowFile destination);_ are overloads of their respective "sibling" functions.
As discussed in the PR with more parameters but do not define a default 
implementation.






[jira] [Created] (NIFI-12986) Tidy up JavaDoc of ProcessSession

2024-04-01 Thread endzeit (Jira)
endzeit created NIFI-12986:
--

 Summary: Tidy up JavaDoc of ProcessSession
 Key: NIFI-12986
 URL: https://issues.apache.org/jira/browse/NIFI-12986
 Project: Apache NiFi
  Issue Type: Sub-task
Reporter: endzeit
Assignee: endzeit


While working on NIFI-12982 I noticed that the JavaDoc of {{ProcessSession}} 
has some minor typos and documentation drifts between method overloads.

The goal of this ticket is to make the JavaDoc for the current 
{{ProcessSession}} specification more consistent. The specified contract must 
not be altered. 





[jira] [Created] (NIFI-12985) Refactor MockProcessSession using current API methods

2024-04-01 Thread endzeit (Jira)
endzeit created NIFI-12985:
--

 Summary: Refactor MockProcessSession using current API methods
 Key: NIFI-12985
 URL: https://issues.apache.org/jira/browse/NIFI-12985
 Project: Apache NiFi
  Issue Type: Sub-task
Reporter: endzeit
Assignee: endzeit








[jira] [Assigned] (NIFI-12971) Provide a utility to detect leaked ProcessSession objects in unit tests or the UI

2024-04-01 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit reassigned NIFI-12971:
--

Assignee: endzeit

> Provide a utility to detect leaked ProcessSession objects in unit tests or 
> the UI
> -
>
> Key: NIFI-12971
> URL: https://issues.apache.org/jira/browse/NIFI-12971
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.25.0, 2.0.0-M2
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
>
> When developing processors for NiFi, developers need to implement 
> [Processor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/Processor.html].
> Most often this is done by extending 
> [AbstractProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractProcessor.html]
>  which ensures that the 
> [ProcessSession|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/ProcessSession.html]
>  used is either committed or, if that's not possible, rolled back.
> In cases where the developer needs more control over session management, they 
> might extend from 
> [AbstractSessionFactoryProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractSessionFactoryProcessor.html]
>  instead, which allows them to create and handle {{ProcessSessions}} on their 
> own terms.
> When using the latter, developers need to handle every session they create 
> gracefully, that is, commit it or roll it back, as {{AbstractProcessor}} does 
> on their behalf.
> However, failing to do so may lead to unnoticed leakage / loss of 
> [FlowFile|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/flowfile/FlowFile.html]s
>  and their associated data. 
> While data might be recovered from provenance, users are most likely not even 
> aware of the data loss, as there won't be a bulletin visible in the UI 
> indicating it, because no Exception occurs and no error is logged.
> The following is a minimal example that reproduces the problem. All 
> {{FlowFiles}} that enter the processor leak and eventually get lost when the 
> processor is shut down.
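> {code:java}
> import org.apache.nifi.annotation.behavior.InputRequirement;
> import org.apache.nifi.flowfile.FlowFile;
> import org.apache.nifi.processor.AbstractSessionFactoryProcessor;
> import org.apache.nifi.processor.ProcessContext;
> import org.apache.nifi.processor.ProcessSession;
> import org.apache.nifi.processor.ProcessSessionFactory;
> import org.apache.nifi.processor.Relationship;
> import org.apache.nifi.processor.exception.ProcessException;
>
> import java.util.Set;
>
> import static org.apache.nifi.annotation.behavior.InputRequirement.Requirement.INPUT_REQUIRED;
>
> @InputRequirement(INPUT_REQUIRED)
> public class LeakFlowFile extends AbstractSessionFactoryProcessor {
>     public static final Relationship REL_SUCCESS = new Relationship.Builder()
>             .name("success")
>             .description("All FlowFiles are routed to this relationship.")
>             .build();
>     private static final Set<Relationship> RELATIONSHIPS = Set.of(REL_SUCCESS);
>
>     @Override
>     public Set<Relationship> getRelationships() {
>         return RELATIONSHIPS;
>     }
>
>     @Override
>     public void onTrigger(ProcessContext context, ProcessSessionFactory sessionFactory) throws ProcessException {
>         // The processor creates its own session and is therefore responsible for it.
>         ProcessSession session = sessionFactory.createSession();
>         FlowFile flowFile = session.get();
>         if (flowFile == null) {
>             return;
>         }
>         session.transfer(flowFile, REL_SUCCESS);
>         // whoops, no commit or rollback
>     }
> } {code}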
> While the issue is quite obvious in this example, it might not be for more 
> complex processors, e.g. when based on 
> [BinFiles|https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-extension-utils/nifi-bin-manager/src/main/java/org/apache/nifi/processor/util/bin/BinFiles.java].
>  If a developer fails to commit / roll back the session in 
> {{processBin}}, the same behavior can be observed.
> This behavior is also not made visible by tests. The following test passes, 
> even though the session has not been committed (or rolled back).
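> {code:java}
> import org.apache.nifi.util.TestRunner;
> import org.apache.nifi.util.TestRunners;
> import org.junit.jupiter.api.Test;
>
> class LeakFlowFileTest {
>     private final TestRunner testRunner = TestRunners.newTestRunner(LeakFlowFile.class);
>
>     @Test
>     void doesNotDetectLeak() {
>         testRunner.enqueue("some data");
>         testRunner.run();
>         // Passes although the session was never committed or rolled back.
>         testRunner.assertAllFlowFilesTransferred(LeakFlowFile.REL_SUCCESS, 1);
>     }
> } {code}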
> 
> I would like to propose enhancements to NiFi in order to ease detection of 
> such implementation faults or even limit the harm they might cause.
> One approach is to extend the capabilities of TestRunner such that, on 
> shutdown of a tested processor, it checks that no session created during the 
> test and changed in some way, e.g. by pulling a FlowFile or adjusting state, 
> has pending changes left, i.e. that each was properly handled, e.g. by 
> committing it. Otherwise, the test may fail, similar to trying to commit a 
> session where FlowFiles haven't been transferred / removed. 
> This way, developers that test their processors 

[jira] [Updated] (NIFI-12982) Extend test suite of MockProcessSession

2024-03-31 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12982:
---
Description: 
As part of NIFI-12971 (or a subtask) most likely changes will be introduced to 
both {{StandardProcessorTestRunner}} and {{MockProcessSession}}.
 
To reduce the risk of regressions introduced by those changes, the test suite 
of {{MockProcessSession}} should be extended to account for more scenarios, 
regarding both expected successful and failing interactions.

  was:
As part of NIFI-12971 (or a subtask) most likely changes will be introduced to 
both {{
StandardProcessorTestRunner}} and {{MockProcessSession}}.
 
To reduce the risk of regressions introduced by those changes, the test suite 
of \{{MockProcessSession}} should be extended to account for more scenarios, 
regarding both expected successful and failing interactions.


> Extend test suite of MockProcessSession
> ---
>
> Key: NIFI-12982
> URL: https://issues.apache.org/jira/browse/NIFI-12982
> Project: Apache NiFi
>  Issue Type: Sub-task
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As part of NIFI-12971 (or a subtask) most likely changes will be introduced 
> to both {{StandardProcessorTestRunner}} and {{MockProcessSession}}.
>  
> To reduce the risk of regressions introduced by those changes, the test 
> suite of {{MockProcessSession}} should be extended to account for more 
> scenarios, regarding both expected successful and failing interactions.





[jira] [Updated] (NIFI-12982) Extend test suite of MockProcessSession

2024-03-31 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12982:
---
Description: 
As part of NIFI-12971 (or a subtask thereof) most likely changes will be 
introduced to both {{StandardProcessorTestRunner}} and {{MockProcessSession}}.
 
To reduce the risk of regressions introduced by those changes, the test suite 
of {{MockProcessSession}} should be extended to account for more scenarios, 
regarding both expected successful and failing interactions.

  was:
As part of NIFI-12971 (or a subtask) most likely changes will be introduced to 
both {{StandardProcessorTestRunner}} and {{MockProcessSession}}.
 
To reduce the risk of regressions introduced by those changes, the test suite 
of {{MockProcessSession}} should be extended to account for more scenarios, 
regarding both expected successful and failing interactions.


> Extend test suite of MockProcessSession
> ---
>
> Key: NIFI-12982
> URL: https://issues.apache.org/jira/browse/NIFI-12982
> Project: Apache NiFi
>  Issue Type: Sub-task
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As part of NIFI-12971 (or a subtask thereof) most likely changes will be 
> introduced to both {{StandardProcessorTestRunner}} and {{MockProcessSession}}.
>  
> To reduce the risk of regressions introduced by those changes, the test 
> suite of {{MockProcessSession}} should be extended to account for more 
> scenarios, regarding both expected successful and failing interactions.





[jira] [Updated] (NIFI-12982) Extend test suite of MockProcessSession

2024-03-31 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12982:
---
Description: 
As part of NIFI-12971 (or a subtask) most likely changes will be introduced to 
both {{
StandardProcessorTestRunner}} and {{MockProcessSession}}.
 
To reduce the risk of regressions introduced by those changes, the test suite 
of \{{MockProcessSession}} should be extended to account for more scenarios, 
regarding both expected successful and failing interactions.

  was:
As part of NIFI-12971 (or a subtask) most likely changes will be introduced to 
both {{
StandardProcessorTestRunner}} and \{{MockProcessSession}}.
 
To reduce the risk of regressions introduced by those changes, the test suite 
of \{{MockProcessSession}} should be extended to account for more scenarios, 
regarding both expected successful and failing interactions.


> Extend test suite of MockProcessSession
> ---
>
> Key: NIFI-12982
> URL: https://issues.apache.org/jira/browse/NIFI-12982
> Project: Apache NiFi
>  Issue Type: Sub-task
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As part of NIFI-12971 (or a subtask) most likely changes will be introduced 
> to both {{
> StandardProcessorTestRunner}} and {{MockProcessSession}}.
>  
> To reduce the risk of regressions introduced by those changes, the test 
> suite of \{{MockProcessSession}} should be extended to account for more 
> scenarios, regarding both expected successful and failing interactions.





[jira] [Created] (NIFI-12982) Extend test suite of MockProcessSession

2024-03-31 Thread endzeit (Jira)
endzeit created NIFI-12982:
--

 Summary: Extend test suite of MockProcessSession
 Key: NIFI-12982
 URL: https://issues.apache.org/jira/browse/NIFI-12982
 Project: Apache NiFi
  Issue Type: Sub-task
Reporter: endzeit
Assignee: endzeit


As part of NIFI-12971 (or a subtask) most likely changes will be introduced to 
both {{
StandardProcessorTestRunner}} and \{{MockProcessSession}}.
 
To reduce the risk of regressions introduced by those changes, the test suite 
of \{{MockProcessSession}} should be extended to account for more scenarios, 
regarding both expected successful and failing interactions.





[jira] [Updated] (NIFI-12971) Processor may leak / lose ProcessSession with FlowFile

2024-03-29 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12971:
---
Description: 
When developing processors for NiFi, developers need to implement 
[Processor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/Processor.html].

Most often this is done by extending 
[AbstractProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractProcessor.html]
 which ensures that the 
[ProcessSession|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/ProcessSession.html]
 used is either committed or, if that's not possible, rolled back.

In cases where the developer needs more control over session management, they 
might extend from 
[AbstractSessionFactoryProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractSessionFactoryProcessor.html]
 instead, which allows them to create and handle {{ProcessSessions}} on their 
own terms.

When using the latter, developers need to handle every session they create 
gracefully, that is, commit it or roll it back, as {{AbstractProcessor}} does 
on their behalf.
However, failing to do so may lead to unnoticed leakage / loss of 
[FlowFile|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/flowfile/FlowFile.html]s
 and their associated data. 
While data might be recovered from provenance, users are most likely not even 
aware of the data loss, as there won't be a bulletin visible in the UI 
indicating it, because no Exception occurs and no error is logged.

The following is a minimal example that reproduces the problem. All 
{{FlowFiles}} that enter the processor leak and eventually get lost when the 
processor is shut down.
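{code:java}
import org.apache.nifi.annotation.behavior.InputRequirement;
import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractSessionFactoryProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.ProcessSessionFactory;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.processor.exception.ProcessException;

import java.util.Set;

import static org.apache.nifi.annotation.behavior.InputRequirement.Requirement.INPUT_REQUIRED;

@InputRequirement(INPUT_REQUIRED)
public class LeakFlowFile extends AbstractSessionFactoryProcessor {

    public static final Relationship REL_SUCCESS = new Relationship.Builder()
            .name("success")
            .description("All FlowFiles are routed to this relationship.")
            .build();

    private static final Set<Relationship> RELATIONSHIPS = Set.of(REL_SUCCESS);

    @Override
    public Set<Relationship> getRelationships() {
        return RELATIONSHIPS;
    }

    @Override
    public void onTrigger(ProcessContext context, ProcessSessionFactory sessionFactory) throws ProcessException {
        // The processor creates its own session and is therefore responsible for it.
        ProcessSession session = sessionFactory.createSession();

        FlowFile flowFile = session.get();
        if (flowFile == null) {
            return;
        }

        session.transfer(flowFile, REL_SUCCESS);

        // whoops, no commit or rollback
    }
} {code}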
While the issue is quite obvious in this example, it might not be for more 
complex processors, e.g. when based on 
[BinFiles|https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-extension-utils/nifi-bin-manager/src/main/java/org/apache/nifi/processor/util/bin/BinFiles.java].
 If a developer fails to commit / roll back the session in 
{{processBin}}, the same behavior can be observed.

This behavior is also not made visible by tests. The following test passes, even 
though the session has not been committed (or rolled back).
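{code:java}
import org.apache.nifi.util.TestRunner;
import org.apache.nifi.util.TestRunners;
import org.junit.jupiter.api.Test;

class LeakFlowFileTest {

    private final TestRunner testRunner = TestRunners.newTestRunner(LeakFlowFile.class);

    @Test
    void doesNotDetectLeak() {
        testRunner.enqueue("some data");

        testRunner.run();

        // Passes although the session was never committed or rolled back.
        testRunner.assertAllFlowFilesTransferred(LeakFlowFile.REL_SUCCESS, 1);
    }
} {code}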

I would like to propose enhancements to NiFi in order to ease detection of such 
implementation faults or even limit the harm they might cause.

One approach is to extend the capabilities of TestRunner such that, on shutdown 
of a tested processor, it checks that no session created during the test and 
changed in some way, e.g. by pulling a FlowFile or adjusting state, has pending 
changes left, i.e. that each was properly handled, e.g. by committing it. 
Otherwise, the test may fail, similar to trying to commit a session where 
FlowFiles haven't been transferred / removed. 
This way, developers who test their processors thoroughly might catch such 
implementation mistakes early on, before they get into production.
However, if tests are missing for a scenario or altogether, the issue might 
get overlooked and happen in production. On the plus side, such a change would 
only affect development and would have no (e.g. performance) impact on 
production environments. Side note: Maybe the TestRunner also should not treat 
FlowFiles as transferred until the session in which the transfer was issued is 
committed.
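
The TestRunner check described above could look roughly like the following self-contained model. It is a sketch under stated assumptions, not NiFi code: {{TrackedSession}}, {{TrackingSessionFactory}}, {{countLeakedSessions}}, and {{assertNoLeakedSessions}} are hypothetical names, and a single boolean stands in for whatever "pending changes" would mean in a real session.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a session; it only tracks whether uncommitted work exists.
final class TrackedSession {
    private boolean pendingChanges;

    void get() { pendingChanges = true; }       // pulling a FlowFile counts as a change
    void commit() { pendingChanges = false; }
    void rollback() { pendingChanges = false; }
    boolean hasPendingChanges() { return pendingChanges; }
}

// Hypothetical factory that remembers every session it hands out.
final class TrackingSessionFactory {
    private final List<TrackedSession> created = new ArrayList<>();

    TrackedSession createSession() {
        TrackedSession session = new TrackedSession();
        created.add(session);
        return session;
    }

    // Counts sessions that were changed but neither committed nor rolled back.
    long countLeakedSessions() {
        return created.stream().filter(TrackedSession::hasPendingChanges).count();
    }

    // A test harness could call this when the tested processor shuts down.
    void assertNoLeakedSessions() {
        long leaked = countLeakedSessions();
        if (leaked > 0) {
            throw new AssertionError(leaked + " session(s) were neither committed nor rolled back");
        }
    }
}

final class LeakDetectionDemo {
    public static void main(String[] args) {
        TrackingSessionFactory factory = new TrackingSessionFactory();

        TrackedSession handled = factory.createSession();
        handled.get();
        handled.commit();

        TrackedSession leaked = factory.createSession();
        leaked.get(); // no commit or rollback follows

        System.out.println(factory.countLeakedSessions()); // prints 1
    }
}
```

Sessions that were created but never changed are not reported in this model, in line with the restriction above to sessions that "had a change associated with them".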

A different approach would be to enhance {{AbstractSessionFactoryProcessor}} or 
a cooperating component to check for unhandled 

[jira] [Updated] (NIFI-12971) Processor may leak / lose ProcessSession with FlowFile

2024-03-29 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12971:
---
Description: 
When developing processors for NiFi, developers need to implement 
[Processor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/Processor.html].

Most often this is done by extending 
[AbstractProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractProcessor.html]
 which ensures that the 
[ProcessSession|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/ProcessSession.html]
 used is either committed or, if that's not possible, rolled back.

In cases where the developer needs more control over session management, they 
might extend from 
[AbstractSessionFactoryProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractSessionFactoryProcessor.html]
 instead, which allows them to create and handle {{ProcessSessions}} on their 
own terms.

When using the latter, developers need to handle every session they create 
gracefully, that is, commit it or roll it back, as {{AbstractProcessor}} does 
on their behalf.
However, failing to do so may lead to unnoticed leakage / loss of 
[FlowFile|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/flowfile/FlowFile.html]s
 and their associated data. 
While data might be recovered from provenance, users are most likely not even 
aware of the data loss, as there won't be a bulletin visible in the UI 
indicating it, because no Exception occurs and no error is logged.

The following is a minimal example that reproduces the problem. All 
{{FlowFiles}} that enter the processor leak and eventually get lost when the 
processor is shut down.
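{code:java}
import org.apache.nifi.annotation.behavior.InputRequirement;
import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractSessionFactoryProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.ProcessSessionFactory;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.processor.exception.ProcessException;

import java.util.Set;

import static org.apache.nifi.annotation.behavior.InputRequirement.Requirement.INPUT_REQUIRED;

@InputRequirement(INPUT_REQUIRED)
public class LeakFlowFile extends AbstractSessionFactoryProcessor {

    public static final Relationship REL_SUCCESS = new Relationship.Builder()
            .name("success")
            .description("All FlowFiles are routed to this relationship.")
            .build();

    private static final Set<Relationship> RELATIONSHIPS = Set.of(REL_SUCCESS);

    @Override
    public Set<Relationship> getRelationships() {
        return RELATIONSHIPS;
    }

    @Override
    public void onTrigger(ProcessContext context, ProcessSessionFactory sessionFactory) throws ProcessException {
        // The processor creates its own session and is therefore responsible for it.
        ProcessSession session = sessionFactory.createSession();

        FlowFile flowFile = session.get();
        if (flowFile == null) {
            return;
        }

        session.transfer(flowFile, REL_SUCCESS);

        // whoops, no commit or rollback
    }
} {code}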
While the issue is quite obvious in this example, it might not be for more 
complex processors, e.g. when based on 
[BinFiles|https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-extension-utils/nifi-bin-manager/src/main/java/org/apache/nifi/processor/util/bin/BinFiles.java].
 If a developer forgets to commit / roll back the session in 
{{processBin}}, the same behaviour can be observed.
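
For comparison, a handled variant would commit the session after successful processing and roll it back otherwise. The following is only a sketch of that pattern: {{SketchSession}}, {{SketchSessionFactory}} and {{SketchWork}} are hypothetical stand-ins for the NiFi types so that the example is self-contained, and the real {{ProcessSession}} offers far more than the two calls modelled here.

```java
// Hypothetical stand-in for a session; only commit and rollback are modelled.
interface SketchSession {
    void commit();
    void rollback();
}

// Hypothetical stand-in for the factory handed to onTrigger.
interface SketchSessionFactory {
    SketchSession createSession();
}

// Hypothetical callback that represents the actual processing logic.
interface SketchWork {
    void accept(SketchSession session) throws Exception;
}

final class SessionHandlingSketch {

    // Every session created here ends in exactly one of commit or rollback.
    static void trigger(SketchSessionFactory factory, SketchWork work) {
        SketchSession session = factory.createSession();
        try {
            work.accept(session);
            session.commit();
        } catch (Exception e) {
            // processing or commit failed, so undo the pending changes
            session.rollback();
            throw new IllegalStateException("processing failed, session rolled back", e);
        }
    }
}
```

Because the commit and the rollback sit directly around the processing call, there is no path out of {{trigger}} that leaves the session unhandled.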

The behavior is not made visible by tests either. The following test passes, even 
though the session has been neither committed nor rolled back.
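{code:java}
import org.apache.nifi.util.TestRunner;
import org.apache.nifi.util.TestRunners;
import org.junit.jupiter.api.Test;

class LeakFlowFileTest {

    private final TestRunner testRunner = TestRunners.newTestRunner(LeakFlowFile.class);

    @Test
    void doesNotDetectLeak() {
        testRunner.enqueue("some data");

        testRunner.run();

        // passes although the session was neither committed nor rolled back
        testRunner.assertAllFlowFilesTransferred(LeakFlowFile.REL_SUCCESS, 1);
    }
}
{code}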

I would like to propose enhancements to NiFi in order to ease detection of such 
implementation faults or even confine the harm they might incur.

One approach is to extend the capabilities of TestRunner such that, on shutdown 
of a tested processor, it checks every session that was created during the test 
and had a change associated with it, e.g. pulling a FlowFile or adjusting state. 
Such a session must not have pending changes left but must have been properly 
handled, e.g. by committing it. Otherwise the test may fail, similar to trying 
to commit a session where FlowFiles haven't been transferred / removed. 
This way, developers who test their processors thoroughly might catch such 
implementation mistakes early on, even before they get into production.
However, if tests are missing for a scenario or altogether, the issue 
might get overlooked and happen in production. On the plus side, such a change 
would only affect development and would have no (e.g. performance) impact on 
production environments.
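
The check described above could look roughly like the following sketch. It is not how TestRunner works today: {{TrackedSession}} and {{LeakCheckingSessionFactory}} are hypothetical names, and a single {{get()}} call stands in for any change made through a session.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a session as seen by the processor under test.
interface TrackedSession {
    void get();      // stands in for any change, e.g. pulling a FlowFile
    void commit();
    void rollback();
}

// Hypothetical factory that remembers every session it hands out.
final class LeakCheckingSessionFactory {

    private final List<RecordingSession> created = new ArrayList<>();

    TrackedSession createSession() {
        RecordingSession session = new RecordingSession();
        created.add(session);
        return session;
    }

    // Number of sessions that were changed but neither committed nor rolled back.
    long countUnhandled() {
        return created.stream().filter(RecordingSession::hasPendingChanges).count();
    }

    // Intended to be called once the tested processor has been shut down.
    void assertNoUnhandledSessions() {
        long unhandled = countUnhandled();
        if (unhandled > 0) {
            throw new AssertionError(unhandled
                    + " session(s) have pending changes but were neither committed nor rolled back");
        }
    }

    private static final class RecordingSession implements TrackedSession {
        private boolean dirty;

        @Override
        public void get() {
            dirty = true;
        }

        @Override
        public void commit() {
            dirty = false;
        }

        @Override
        public void rollback() {
            dirty = false;
        }

        boolean hasPendingChanges() {
            return dirty;
        }
    }
}
```

A session that was created but never touched is deliberately not reported, matching the restriction to sessions that had a change associated with them.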

A different approach would be to enhance {{AbstractSessionFactoryProcessor}} or 
a cooperating component to check for unhandled {{ProcessSessions}} on processor 
shutdown. In case an unhandled session is found, an error should be logged to 
make the issue visible and the session 

[jira] [Updated] (NIFI-12971) Processor may leak / lose ProcessSession with FlowFile

2024-03-29 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12971:
---
Description: 
When developing processors for NiFi, developers need to implement 
[Processor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/Processor.html].

Most often this is done by extending 
[AbstractProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractProcessor.html]
 which ensures that the 
[ProcessSession|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/ProcessSession.html]
 used is either committed or, if that's not possible, rolled back.

In cases where the developer needs more control over session management, they 
might extend from 
[AbstractSessionFactoryProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractSessionFactoryProcessor.html]
 instead, which allows them to create and handle {{ProcessSessions}} on their own 
terms.

When using the latter, developers need to ensure they handle all sessions 
created gracefully, that is, to commit or roll back all sessions they create, 
like {{AbstractProcessor}} ensures.
However, failing to do so may lead to unnoticed leakage / loss of 
[FlowFile|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/flowfile/FlowFile.html]s
 and their associated data. 
While data might be recovered from provenance, users are most likely not even 
aware of the data loss, as 
there won't be a bulletin visible in the UI indicating data loss due to no 
Exception occurring or an error being logged.

The following is a minimal example, which reproduces the problem. All 
{{FlowFiles}} that enter the processor leak and eventually get lost when the 
processor is shut down.
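{code:java}
import java.util.Set;

import org.apache.nifi.annotation.behavior.InputRequirement;
import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractSessionFactoryProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.ProcessSessionFactory;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.processor.exception.ProcessException;

import static org.apache.nifi.annotation.behavior.InputRequirement.Requirement.INPUT_REQUIRED;

@InputRequirement(INPUT_REQUIRED)
public class LeakFlowFile extends AbstractSessionFactoryProcessor {

    public static final Relationship REL_SUCCESS = new Relationship.Builder()
            .name("success")
            .description("A FlowFile is routed to this relationship after it has been successfully stored in HBase")
            .build();

    private static final Set<Relationship> RELATIONSHIPS = Set.of(REL_SUCCESS);

    @Override
    public Set<Relationship> getRelationships() {
        return RELATIONSHIPS;
    }

    @Override
    public void onTrigger(ProcessContext context, ProcessSessionFactory sessionFactory) throws ProcessException {
        ProcessSession session = sessionFactory.createSession();

        FlowFile flowFile = session.get();
        if (flowFile == null) {
            return;
        }

        session.transfer(flowFile, REL_SUCCESS);

        // whoops, no commit or rollback
    }
}
{code}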
While the issue is quite obvious in this example, it might not be for more 
complex processors, e.g. when based on 
[BinFiles|https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-extension-utils/nifi-bin-manager/src/main/java/org/apache/nifi/processor/util/bin/BinFiles.java].
 If a developer forgets to commit / roll back the session in 
{{processBin}}, the same behaviour can be observed.

The behavior also is not made visible by tests. The following test passes, even 
though the session has not been committed (or rolled back).
{code:java}
class LeakFlowFileTest {

    private final TestRunner testRunner = TestRunners.newTestRunner(LeakFlowFile.class);

    @Test
    void doesNotDetectLeak() {
        testRunner.enqueue("some data");

        testRunner.run();

        testRunner.assertAllFlowFilesTransferred(LeakFlowFile.REL_SUCCESS, 1);
    }
}
{code}

I would like to propose enhancements to NiFi in order to ease detection of such 
implementation faults or even confine the harm they might incur.

One approach is to extend the capabilities of TestRunner such that on shutdown 
of a tested processor, it checks whether all sessions that were created during 
the test and had a change associated with them, e.g. pulling a FlowFile or 
adjusting state, do not have pending changes left but were properly handled, 
e.g. by committing the session. In case that's not the case, the test may fail, 
similar to trying to commit a session where FlowFiles haven't been transferred 
/ removed. 
This way, developers who test their processors thoroughly might catch such 
implementation mistakes early on, even before they get into production.
However, in case tests are missing for a scenario or altogether, the issue 
might get overlooked and happen in production. On the plus side, such a change 
would only affect development and would have no (e.g. performance) impact on 
production environments.

A different approach would be to enhance {{AbstractSessionFactoryProcessor}} or 
a cooperating component to check for unhandled {{ProcessSessions}} on processor 
shutdown. In case an unhandled session is found, an error should be logged to 

[jira] [Updated] (NIFI-12971) Processor may leak / lose ProcessSession with FlowFile

2024-03-29 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12971:
---
Description: 
When developing processors for NiFi, developers need to implement 
[Processor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/Processor.html].

Most often this is done by extending 
[AbstractProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractProcessor.html]
 which ensures that the 
[ProcessSession|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/ProcessSession.html]
 used is either committed or, if that's not possible, rolled back.

In cases where the developer needs more control over session management, they 
might extend from 
[AbstractSessionFactoryProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractSessionFactoryProcessor.html]
 instead, which allows them to create and handle {{ProcessSessions}} on their own 
terms.

When using the latter, developers need to ensure they handle all sessions 
created gracefully, that is, to commit or roll back all sessions they create, 
like {{AbstractProcessor}} ensures.
However, failing to do so may lead to unnoticed leakage / loss of 
[FlowFile|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/flowfile/FlowFile.html]s
 and their associated data. 
While data might be recovered from provenance, users are most likely not even 
aware of the data loss, as 
there won't be a bulletin visible in the UI indicating data loss due to no 
Exception occurring or an error being logged. 

The following is a minimal example, which reproduces the problem. All 
{{FlowFiles}} that enter the processor leak and eventually get lost when the 
processor is shut down. 

{code:java}
class LeakFlowFile extends AbstractSessionFactoryProcessor {
TODO
}
{code}

While the issue is quite obvious in this example, it might not be for more 
complex processors, e.g. when based on 
[BinFiles|https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-extension-utils/nifi-bin-manager/src/main/java/org/apache/nifi/processor/util/bin/BinFiles.java].
 If a developer forgets to commit / roll back the session in {{processBin}}, 
the same behaviour can be observed.

-

I would like to propose enhancements to NiFi in order to ease detection of such 
implementation faults or even confine the harm they might incur.

One approach is to extend the capabilities of TestRunner such that on shutdown 
of a tested processor, it checks whether all sessions that were created during 
the test and had a change associated with them, e.g. pulling a FlowFile or 
adjusting state, do not have pending changes left but were properly handled, 
e.g. by committing the session. In case that's not the case, the test may fail, 
similar to trying to commit a session where FlowFiles haven't been transferred 
/ removed. 
This way, developers who test their processors thoroughly might catch such 
implementation mistakes early on, even before they get into production.
However, in case tests are missing for a scenario or altogether, the issue 
might get overlooked and happen in production. On the plus side, such a change 
would only affect development and would have no (e.g. performance) impact on 
production environments. 

A different approach would be to enhance {{AbstractSessionFactoryProcessor}} or 
a cooperating component to check for unhandled {{ProcessSessions}} on processor 
shutdown. In case an unhandled session is found, an error should be logged to 
make the issue visible, and the session should be handled, probably by rolling 
it back. While this might reduce the chance of data loss, it might impact 
performance and have other unpredictable side effects for processors that 
interact with external systems and have already communicated some sort of 
change. However, this might be worth it in order to not lose data. 
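
A minimal sketch of such a shutdown check follows. All names are hypothetical: {{OpenSession}} and its {{isHandled()}} method are assumptions made for illustration and are not part of the NiFi API, and {{java.util.logging}} merely stands in for whatever logger the framework would use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical view of a session that can report whether it was committed or rolled back.
interface OpenSession {
    boolean isHandled();
    void rollback();
}

// Hypothetical cooperating component that is notified when the processor stops.
final class ShutdownSessionGuard {

    private static final Logger LOG = Logger.getLogger(ShutdownSessionGuard.class.getName());

    private final List<OpenSession> sessions = new ArrayList<>();

    synchronized void register(OpenSession session) {
        sessions.add(session);
    }

    // Logs and rolls back every unhandled session; returns how many were found.
    synchronized int rollBackUnhandled() {
        int count = 0;
        for (OpenSession session : sessions) {
            if (!session.isHandled()) {
                LOG.severe("Session was neither committed nor rolled back, rolling back now");
                session.rollback();
                count++;
            }
        }
        sessions.clear();
        return count;
    }
}
```

Whether rolling back is the right reaction depends on the processor, for the reasons given above regarding external systems.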

The approaches might be combined to reap the benefits of both, that is, 
catching issues in test cases early on but having a more graceful behaviour in 
production in case an issue is overlooked.

  was:
When developing processors for NiFi, developers need to implement 
[Processor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/Processor.html].

Most often this is done by extending 
[AbstractProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractProcessor.html]
 which ensures that the 
[ProcessSession|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/ProcessSession.html]
 used is either committed or, if that's not possible, rolled back.

In cases where the developer needs more control over session management, they 
might extend from 

[jira] [Updated] (NIFI-12971) Processor may leak / lose ProcessSession with FlowFile

2024-03-29 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12971:
---
Description: 
When developing processors for NiFi, developers need to implement 
[Processor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/Processor.html].

Most often this is done by extending 
[AbstractProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractProcessor.html]
 which ensures that the 
[ProcessSession|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/ProcessSession.html]
 used is either committed or, if that's not possible, rolled back.

In cases where the developer needs more control over session management, they 
might extend from 
[AbstractSessionFactoryProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractSessionFactoryProcessor.html]
 instead, which allows them to create and handle {{ProcessSessions}} on their own 
terms.

When using the latter, developers need to ensure they handle all sessions 
created gracefully, that is, to commit or roll back all sessions they create, 
like {{AbstractProcessor}} ensures.
However, failing to do so may lead to unnoticed leakage / loss of 
[FlowFile|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/flowfile/FlowFile.html]s
 and their associated data. 

The following is a minimal example, which reproduces the problem. All 
{{FlowFiles}} that enter the processor leak and eventually get lost when the 
processor is shut down. 

{code:java}
class LeakFlowFile extends AbstractSessionFactoryProcessor {
TODO
}
{code}

While the issue is quite obvious in this example, it might not be for more 
complex processors, e.g. when based on 
[BinFiles|https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-extension-utils/nifi-bin-manager/src/main/java/org/apache/nifi/processor/util/bin/BinFiles.java].
 If a developer forgets to commit / roll back the session in {{processBin}}, 
the same behaviour can be observed.

-

I would like to propose enhancements to NiFi in order to ease detection of such 
implementation faults or even confine the harm they might incur.

TODO TestRunner

TODO AbstractSessionFactoryProcessor?



  was:
When developing processors for NiFi, developers need to implement 
[Processor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/Processor.html].

Most often this is done by extending 
[AbstractProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractProcessor.html]
 which ensures that the 
[ProcessSession|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/ProcessSession.html]
 used is either committed or, if that's not possible, rolled back.

In cases where the developer needs more control over session management, they 
might extend from 
[AbstractSessionFactoryProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractSessionFactoryProcessor.html]
 instead, which allows them to create and handle {{ProcessSessions}} on their own 
terms.

When using the latter, developers need to ensure they handle all sessions 
created gracefully, that is, to commit or roll back all sessions they create, 
like {{AbstractProcessor}} ensures.
However, failing to do so may lead to unnoticed leakage / loss of 
[FlowFile|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/flowfile/FlowFile.html]s
 and their associated data. 

The following is a minimal example, which reproduces the problem. All 
{{FlowFiles}} that enter the processor leak and eventually get lost when the 
processor is shut down. 

{code:java}
class LeakFlowFile extends AbstractSessionFactoryProcessor {
TODO
}
{code}

While the issue is quite obvious in this example, it might not be for more 
complex processors, e.g. when based on 
[BinFiles|https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-extension-utils/nifi-bin-manager/src/main/java/org/apache/nifi/processor/util/bin/BinFiles.java].
 If a developer forgets to commit / roll back the session in {{processBin}}, 
the same behaviour can be observed.

---

I would like to propose enhancements to NiFi in order to ease detection of such 
implementation faults or even confine the harm they might incur.

TODO TestRunner

TODO AbstractSessionFactoryProcessor?




> Processor may leak / lose ProcessSession with FlowFile
> --
>
> Key: NIFI-12971
> URL: https://issues.apache.org/jira/browse/NIFI-12971
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.25.0, 2.0.0-M2
>Reporter: endzeit
>Priority: Major
>
> When developing processors for NiFi, 

[jira] [Updated] (NIFI-12971) Processor may leak / lose ProcessSession with FlowFile

2024-03-29 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12971:
---
Description: 
When developing processors for NiFi, developers need to implement 
[Processor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/Processor.html].

Most often this is done by extending 
[AbstractProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractProcessor.html]
 which ensures that the 
[ProcessSession|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/ProcessSession.html]
 used is either committed or, if that's not possible, rolled back.

In cases where the developer needs more control over session management, they 
might extend from 
[AbstractSessionFactoryProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractSessionFactoryProcessor.html]
 instead, which allows them to create and handle {{ProcessSessions}} on their own 
terms.

When using the latter, developers need to ensure they handle all sessions 
created gracefully, that is, to commit or roll back all sessions they create, 
like {{AbstractProcessor}} ensures.
However, failing to do so may lead to unnoticed leakage / loss of 
[FlowFile|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/flowfile/FlowFile.html]s
 and their associated data. 

The following is a minimal example, which reproduces the problem. All 
{{FlowFiles}} that enter the processor leak and eventually get lost when the 
processor is shut down. 

{code:java}
class LeakFlowFile extends AbstractSessionFactoryProcessor {
TODO
}
{code}

While the issue is quite obvious in this example, it might not be for more 
complex processors, e.g. when based on 
[BinFiles|https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-extension-utils/nifi-bin-manager/src/main/java/org/apache/nifi/processor/util/bin/BinFiles.java].
 If a developer forgets to commit / roll back the session in {{processBin}}, 
the same behaviour can be observed.

---

I would like to propose enhancements to NiFi in order to ease detection of such 
implementation faults or even confine the harm they might incur.

TODO TestRunner

TODO AbstractSessionFactoryProcessor?



  was:
When developing processors for NiFi, developers need to implement 
[Processor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/Processor.html].

Most often this is done by extending 
[AbstractProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractProcessor.html]
 which ensures that the 
[ProcessSession|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/ProcessSession.html]
 used is either committed or, if that's not possible, rolled back.

In cases where the developer needs more control over session management, they 
might extend from 
[AbstractSessionFactoryProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractSessionFactoryProcessor.html]
 instead, which allows them to create and handle {{ProcessSessions}} on their own 
terms.

TODO

{code:java}
class LeakFlowFile extends AbstractSessionFactoryProcessor {
TODO
}
{code}

---

I would like to propose enhancements to NiFi in order to ease detection of such 
implementation faults or even confine the harm they might incur.

TODO TestRunner

TODO AbstractSessionFactoryProcessor?




> Processor may leak / lose ProcessSession with FlowFile
> --
>
> Key: NIFI-12971
> URL: https://issues.apache.org/jira/browse/NIFI-12971
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.25.0, 2.0.0-M2
>Reporter: endzeit
>Priority: Major
>
> When developing processors for NiFi, developers need to implement 
> [Processor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/Processor.html].
> Most often this is done by extending 
> [AbstractProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractProcessor.html]
>  which ensures that the 
> [ProcessSession|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/ProcessSession.html]
>  used is either committed or, if that's not possible, rolled back.
> In cases where the developer needs more control over session management, they 
> might extend from 
> [AbstractSessionFactoryProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractSessionFactoryProcessor.html]
>  instead, which allows them to create and handle {{ProcessSessions}} on their own 
> terms.
> When using the latter, developers need to ensure they handle all sessions 
> created 

[jira] [Updated] (NIFI-12971) Processor may leak / lose ProcessSession with FlowFile

2024-03-29 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12971:
---
Description: 
When developing processors for NiFi, developers need to implement 
[Processor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/Processor.html].
 

Most often this is done by extending 
[AbstractProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractProcessor.html]
 which ensures that the 
[ProcessSession|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/ProcessSession.html]
 used is either committed or, if that's not possible, rolled back. 

In cases where the developer needs more control over session management, they 
might extend from 
[AbstractSessionFactoryProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractSessionFactoryProcessor.html]
 instead, which allows them to create and handle {{ProcessSessions}} on their own 
terms. 

TODO

  was:
When developing processors for NiFi, developers need to implement 
[Processor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/Processor.html].
 

Most often this is done by extending 
[AbstractProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractProcessor.html]
 which ensures that the 
[ProcessSession|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/ProcessSession.html]
 used is either committed or, if that's not possible, rolled back. 

In cases where the developer needs more control over session management, they 
might extend from 
[AbstractSessionFactoryProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractSessionFactoryProcessor.html]
 instead, which allows them to create and handle {{ProcessSession}}s on their own 
terms. 

TODO


> Processor may leak / lose ProcessSession with FlowFile
> --
>
> Key: NIFI-12971
> URL: https://issues.apache.org/jira/browse/NIFI-12971
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.25.0, 2.0.0-M2
>Reporter: endzeit
>Priority: Major
>
> When developing processors for NiFi, developers need to implement 
> [Processor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/Processor.html].
>  
> Most often this is done by extending 
> [AbstractProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractProcessor.html]
>  which ensures that the 
> [ProcessSession|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/ProcessSession.html]
>  used is either committed or, if that's not possible, rolled back. 
> In cases where the developer needs more control over session management, they 
> might extend from 
> [AbstractSessionFactoryProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractSessionFactoryProcessor.html]
>  instead, which allows them to create and handle {{ProcessSessions}} on their own 
> terms. 
> TODO





[jira] [Updated] (NIFI-12971) Processor may leak / lose ProcessSession with FlowFile

2024-03-29 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12971:
---
Description: 
When developing processors for NiFi, developers need to implement 
[Processor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/Processor.html].
 

Most often this is done by extending 
[AbstractProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractProcessor.html]
 which ensures that the 
[ProcessSession|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/ProcessSession.html]
 used is either committed or, if that's not possible, rolled back. 

In cases where the developer needs more control over session management, they 
might extend from 
[AbstractSessionFactoryProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractSessionFactoryProcessor.html]
 instead, which allows them to create and handle {{ProcessSession}}s on their own 
terms. 

TODO

  was:When developing processors for NiFi, developers need to implement 
{{org.apache.nifi.processor.Processor}}. Most often this is done by extending 


> Processor may leak / lose ProcessSession with FlowFile
> --
>
> Key: NIFI-12971
> URL: https://issues.apache.org/jira/browse/NIFI-12971
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.25.0, 2.0.0-M2
>Reporter: endzeit
>Priority: Major
>
> When developing processors for NiFi, developers need to implement 
> [Processor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/Processor.html].
>  
> Most often this is done by extending 
> [AbstractProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractProcessor.html]
>  which ensures that the 
> [ProcessSession|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/ProcessSession.html]
>  used is either committed or, if that's not possible, rolled back. 
> In cases where the developer needs more control over session management, they 
> might extend from 
> [AbstractSessionFactoryProcessor|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/processor/AbstractSessionFactoryProcessor.html]
>  instead, which allows them to create and handle {{ProcessSession}}s on their own 
> terms. 
> TODO





[jira] [Updated] (NIFI-12971) Processor may leak / lose ProcessSession with FlowFile

2024-03-29 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12971:
---
Description: When developing processors for NiFi, developers need to 
implement {{org.apache.nifi.processor.Processor}}. Most often this is done by 
extending   (was: TODO)

> Processor may leak / lose ProcessSession with FlowFile
> --
>
> Key: NIFI-12971
> URL: https://issues.apache.org/jira/browse/NIFI-12971
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.25.0, 2.0.0-M2
>Reporter: endzeit
>Priority: Major
>
> When developing processors for NiFi, developers need to implement 
> {{org.apache.nifi.processor.Processor}}. Most often this is done by extending 





[jira] [Created] (NIFI-12971) Processor may leak / lose ProcessSession with FlowFile

2024-03-29 Thread endzeit (Jira)
endzeit created NIFI-12971:
--

 Summary: Processor may leak / lose ProcessSession with FlowFile
 Key: NIFI-12971
 URL: https://issues.apache.org/jira/browse/NIFI-12971
 Project: Apache NiFi
  Issue Type: Bug
Affects Versions: 2.0.0-M2, 1.25.0
Reporter: endzeit


TODO





[jira] [Resolved] (NIFI-12893) Adjust documentation on time based properties in DBCPProperties

2024-03-16 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit resolved NIFI-12893.

Fix Version/s: 2.0.0
   1.26.0
   Resolution: Fixed

> Adjust documentation on time based properties in DBCPProperties
> ---
>
> Key: NIFI-12893
> URL: https://issues.apache.org/jira/browse/NIFI-12893
> Project: Apache NiFi
>  Issue Type: Task
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
> Fix For: 2.0.0, 1.26.0
>
>
> Some of the properties in {{DBCPProperties}}, like {{MAX_CONN_LIFETIME}}, 
> note that the time has to be defined in milliseconds.
> However, this is not true, as a time period like "5 mins" is expected.
> The documentation should be adjusted accordingly.





[jira] [Updated] (NIFI-12900) Avoid unnecessary directory listing in PutSFTP

2024-03-16 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12900:
---
Summary: Avoid unnecessary directory listing in PutSFTP   (was: Avoid 
unnecessary file listing in PutSFTP )

> Avoid unnecessary directory listing in PutSFTP 
> ---
>
> Key: NIFI-12900
> URL: https://issues.apache.org/jira/browse/NIFI-12900
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The processor {{PutSFTP}} is based on {{PutFileTransfer}}.
> Before an actual upload takes place, potential conflicts (e.g. existing file) 
> are identified and resolved using {{identifyAndResolveConflictFile(...)}}.
> As part of this process, information on the target file is retrieved using 
> {{FileTransfer.getRemoteFileInfo(...)}}.
> In case of {{PutSFTP}} this is implemented by {{SFTPTransfer}}.
> The implementation of {{getRemoteFileInfo}} executes {{ls}} on the target 
> directory path. In case there are a lot of files inside the remote directory, 
> e.g. more than 10,000 files, the listing reduces the performance of {{PutSFTP}} 
> significantly.
> Instead of a listing on the directory, file information should be retrieved 
> using either {{ls}} or {{stat}} on the target file directly.





[jira] [Created] (NIFI-12901) Remove time units in description of properties that specify a time period

2024-03-14 Thread endzeit (Jira)
endzeit created NIFI-12901:
--

 Summary: Remove time units in description of properties that 
specify a time period
 Key: NIFI-12901
 URL: https://issues.apache.org/jira/browse/NIFI-12901
 Project: Apache NiFi
  Issue Type: Improvement
Reporter: endzeit
Assignee: endzeit


In some properties that specify a time period, like {{Max Connection Lifetime}} 
of {{DBCPProperties}}, the documentation claims that the property value is to 
be defined in a concrete time unit, like milliseconds.
However, those properties use the {{StandardValidators.TIME_PERIOD_VALIDATOR}}, 
or a derivative thereof, which accepts a period expressed in any of several 
time units.

The documentation should be changed accordingly, to clarify that the properties 
do not require an integer value of a static time unit but rather the 
declaration of a time period including a time unit.
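
To make the distinction concrete, the following is a minimal sketch of how a value such as "5 mins" differs from a bare integer of milliseconds. It is plain Java and does not call NiFi's validator; the class {{TimePeriodSketch}}, the method {{toMillis}}, and the set of accepted unit names are assumptions made for illustration only.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for "<number> <unit>" values; not NiFi's implementation.
class TimePeriodSketch {

    // A non-negative integer, optional whitespace, then a unit name.
    private static final Pattern PERIOD = Pattern.compile("(\\d+)\\s*([a-zA-Z]+)");

    static long toMillis(String value) {
        Matcher matcher = PERIOD.matcher(value.trim());
        if (!matcher.matches()) {
            // A bare number such as "300000" is rejected: the unit is mandatory.
            throw new IllegalArgumentException("Expected '<number> <unit>', got: " + value);
        }
        long amount = Long.parseLong(matcher.group(1));
        String unit = matcher.group(2).toLowerCase(Locale.ROOT);
        TimeUnit timeUnit;
        switch (unit) {
            case "ms": case "millis": case "milliseconds":
                timeUnit = TimeUnit.MILLISECONDS;
                break;
            case "s": case "sec": case "secs": case "seconds":
                timeUnit = TimeUnit.SECONDS;
                break;
            case "min": case "mins": case "minutes":
                timeUnit = TimeUnit.MINUTES;
                break;
            case "h": case "hr": case "hrs": case "hours":
                timeUnit = TimeUnit.HOURS;
                break;
            default:
                throw new IllegalArgumentException("Unknown time unit: " + unit);
        }
        return timeUnit.toMillis(amount);
    }

    public static void main(String[] args) {
        System.out.println(toMillis("5 mins"));
        System.out.println(toMillis("30 secs"));
    }
}
```

A property description that says "in milliseconds" would suggest the value "300000", which this kind of parser rejects, whereas "5 mins" is accepted.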





[jira] [Updated] (NIFI-12900) Avoid unnecessary file listing in PutSFTP

2024-03-14 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12900:
---
Description: 
The processor {{PutSFTP}} is based on {{PutFileTransfer}}.
Before an actual upload takes place, potential conflicts (e.g. existing file) 
are identified and resolved using {{identifyAndResolveConflictFile(...)}}.
As part of this process, information on the target file is retrieved using 
{{FileTransfer.getRemoteFileInfo(...)}}.
In case of {{PutSFTP}} this is implemented by {{SFTPTransfer}}.

The implementation of {{getRemoteFileInfo}} executes {{ls}} on the target 
directory path. In case there are a lot of files inside the remote directory, 
e.g. >10.000 files, the listing reduces the performance of {{PutSFTP}} 
significantly.

Instead of a listing on the directory, file information should be retrieved 
using either {{ls}} or {{stat}} on the target file directly.


  was:
The processor `PutSFTP` is based on `PutFileTransfer`.
Before an actual upload takes place, potential conflicts (e.g. existing file) 
are identified and resolved using `identifyAndResolveConflictFile(...)`.
As part of this process, information on the target file is retrieved using 
`FileTransfer.getRemoteFileInfo(...)`.
In case of `PutSFTP` this is implemented by `SFTPTransfer`.

The implementation of `getRemoteFileInfo` executes `ls` on the target directory 
path. In case there are a lot of files inside the remote directory, e.g. 
>10.000 files, the listing reduces the performance of `PutSFTP` significantly.

Instead of a listing on the directory, file information should be retrieved 
using either `ls` or `stat` on the target file directly.



> Avoid unnecessary file listing in PutSFTP 
> --
>
> Key: NIFI-12900
> URL: https://issues.apache.org/jira/browse/NIFI-12900
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
>
> The processor {{PutSFTP}} is based on {{PutFileTransfer}}.
> Before an actual upload takes place, potential conflicts (e.g. existing file) 
> are identified and resolved using {{identifyAndResolveConflictFile(...)}}.
> As part of this process, information on the target file is retrieved using 
> {{FileTransfer.getRemoteFileInfo(...)}}.
> In case of {{PutSFTP}} this is implemented by {{SFTPTransfer}}.
> The implementation of {{getRemoteFileInfo}} executes {{ls}} on the target 
> directory path. In case there are a lot of files inside the remote directory, 
> e.g. >10.000 files, the listing reduces the performance of {{PutSFTP}} 
> significantly.
> Instead of a listing on the directory, file information should be retrieved 
> using either {{ls}} or {{stat}} on the target file directly.





[jira] [Created] (NIFI-12900) Avoid unnecessary file listing in PutSFTP

2024-03-14 Thread endzeit (Jira)
endzeit created NIFI-12900:
--

 Summary: Avoid unnecessary file listing in PutSFTP 
 Key: NIFI-12900
 URL: https://issues.apache.org/jira/browse/NIFI-12900
 Project: Apache NiFi
  Issue Type: Improvement
Reporter: endzeit
Assignee: endzeit


The processor `PutSFTP` is based on `PutFileTransfer`.
Before an actual upload takes place, potential conflicts (e.g. existing file) 
are identified and resolved using `identifyAndResolveConflictFile(...)`.
As part of this process, information on the target file is retrieved using 
`FileTransfer.getRemoteFileInfo(...)`.
In case of `PutSFTP` this is implemented by `SFTPTransfer`.

The implementation of `getRemoteFileInfo` executes `ls` on the target directory 
path. In case there are a lot of files inside the remote directory, e.g. 
>10.000 files, the listing reduces the performance of `PutSFTP` significantly.

Instead of a listing on the directory, file information should be retrieved 
using either `ls` or `stat` on the target file directly.






[jira] [Created] (NIFI-12893) Adjust documentation on time based properties in DBCPProperties

2024-03-13 Thread endzeit (Jira)
endzeit created NIFI-12893:
--

 Summary: Adjust documentation on time based properties in 
DBCPProperties
 Key: NIFI-12893
 URL: https://issues.apache.org/jira/browse/NIFI-12893
 Project: Apache NiFi
  Issue Type: Task
Reporter: endzeit
Assignee: endzeit


Some of the properties in {{DBCPProperties}}, like {{MAX_CONN_LIFETIME}}, note 
that the time has to be defined in milliseconds.
However, this is not true, as a time period such as "5 mins" is expected.

The documentation should be adjusted accordingly.





[jira] [Commented] (NIFI-12882) Allow control over unexpected failure behavior in processors

2024-03-11 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17825424#comment-17825424
 ] 

endzeit commented on NIFI-12882:


There was a discussion on a related topic on the [dev mailing 
list|https://lists.apache.org/thread/bps716px66lo47x5xb7tm8znp04dgdk0] just a 
few weeks ago.

> Allow control over unexpected failure behavior in processors
> 
>
> Key: NIFI-12882
> URL: https://issues.apache.org/jira/browse/NIFI-12882
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: saarbs
>Priority: Minor
>
> Recently we had a problem with a flow using UpdateAttribute that does 
> base64decode, and one of our flow sources began sending invalid base64. This 
> resulted in exceptions in the processor causing rollbacks to the flowfiles 
> and backpressure in the flow.
> Since the processor has no failure relationship, there was no way to 
> resolve this issue. We realized that many processors face the same problem: 
> an unexpected failure can cause an infinite loop, whereas some flows would 
> prefer to send the affected FlowFiles to a failure relationship.
> I suggest a setting or an unexpected failure relationship, which would allow 
> keeping the existing behavior of a rollback but would also allow terminating 
> or processing the failure in different ways.





[jira] [Updated] (NIFI-12880) Add DeleteFile processor

2024-03-09 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12880:
---
Description: 
The existing processors that retrieve a file from the file system, namely 
{{FetchFile}} and {{GetFile}}, support the removal of the file from the file 
system once the content has been copied into the FlowFile.

However, deleting the file from the file system immediately might not be 
feasible in certain circumstances.  
In cases where the content repository of NiFi does not meet sufficient data 
durability guarantees, it might be desired to remove the source file only after 
it has been processed successfully and its result transferred to a system that 
satisfies those durability constraints.

As of now, there is no built-in solution to achieve such behavior using the 
standard NiFi distribution.
Current workarounds involve the usage of a scripted processor or the creation 
of a custom processor, that provides the desired functionality.

This issue proposes the addition of a {{DeleteFile}} processor to the NiFi 
standard-processors bundle, that fills this gap.
It should expect a FlowFile and delete the file at the path derived from the 
FlowFile attributes. The default values to determine the file path should be 
compatible with the attributes written by the existing {{ListFiles}} processor.

  was:
The existing processor to retrieve a file from the file system, namely 
{{FetchFile}} and {{GetFile}}, support the removal of the file from the file 
system once the content has been copied into the FlowFile.

However, deleting the file from the file system immediately might not be 
feasible in certain circumstances.  
In cases where the content repository of NiFi does not meet sufficient data 
durability guarantees, it might be desired to remove the source file only after 
it has been processed successfully and its result transferred to a system that 
satisfies those durability constraints.

As of now, there is no built-in solution to achieve such behavior using the 
standard NiFi distribution.
Current workarounds involve the usage of a scripted processor or the creation 
of a custom processor, that provides the desired functionality.

This issue proposes the addition of a {{DeleteFile}} processor to the NiFi 
standard-processors bundle, that fills this gap.
It should expect a FlowFile and delete the file at the path derived from the 
FlowFile attributes. The default values for determine the file path should be 
compatible with the attributes written by the existing {{ListFiles}} processor.


> Add DeleteFile processor
> 
>
> Key: NIFI-12880
> URL: https://issues.apache.org/jira/browse/NIFI-12880
> Project: Apache NiFi
>  Issue Type: New Feature
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The existing processors that retrieve a file from the file system, namely 
> {{FetchFile}} and {{GetFile}}, support the removal of the file from the file 
> system once the content has been copied into the FlowFile.
> However, deleting the file from the file system immediately might not be 
> feasible in certain circumstances.  
> In cases where the content repository of NiFi does not meet sufficient data 
> durability guarantees, it might be desired to remove the source file only 
> after it has been processed successfully and its result transferred to a 
> system that satisfies those durability constraints.
> As of now, there is no built-in solution to achieve such behavior using the 
> standard NiFi distribution.
> Current workarounds involve the usage of a scripted processor or the creation 
> of a custom processor, that provides the desired functionality.
> This issue proposes the addition of a {{DeleteFile}} processor to the NiFi 
> standard-processors bundle, that fills this gap.
> It should expect a FlowFile and delete the file at the path derived from the 
> FlowFile attributes. The default values to determine the file path should be 
> compatible with the attributes written by the existing {{ListFiles}} 
> processor.
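
The core of the proposed behavior, deriving a path from FlowFile attributes and deleting the file there, can be sketched in plain Java as follows. This is not the processor's implementation: the class {{DeleteFileSketch}} is hypothetical, a {{Map}} stands in for the FlowFile attributes, and the attribute names {{absolute.path}} and {{filename}} are assumed defaults chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; a Map stands in for the attributes of a FlowFile.
class DeleteFileSketch {

    // Builds the target path from two attributes (names are assumptions).
    static Path resolveTarget(Map<String, String> attributes) {
        String directory = attributes.get("absolute.path");
        String filename = attributes.get("filename");
        if (directory == null || filename == null) {
            throw new IllegalArgumentException("Missing 'absolute.path' or 'filename' attribute");
        }
        return Paths.get(directory, filename);
    }

    // Returns true if a file was deleted, false if nothing existed at the path.
    static boolean delete(Map<String, String> attributes) {
        try {
            return Files.deleteIfExists(resolveTarget(attributes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path directory = Files.createTempDirectory("delete-file-sketch");
        Files.createFile(directory.resolve("example.txt"));

        Map<String, String> attributes = new HashMap<>();
        attributes.put("absolute.path", directory.toString());
        attributes.put("filename", "example.txt");

        System.out.println("first attempt deleted: " + delete(attributes));
        System.out.println("second attempt deleted: " + delete(attributes));
        Files.delete(directory);
    }
}
```

Keeping the path resolution separate from the deletion makes it easy to swap the attribute names for configurable properties.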





[jira] [Updated] (NIFI-12880) Add DeleteFile processor

2024-03-09 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12880:
---
Description: 
The existing processors that retrieve a file from the file system, namely 
{{FetchFile}} and {{GetFile}}, support the removal of the file from the file 
system once the content has been copied into the FlowFile.

However, deleting the file from the file system immediately might not be 
feasible in certain circumstances.  
In cases where the content repository of NiFi does not meet sufficient data 
durability guarantees, it might be desired to remove the source file only after 
it has been processed successfully and its result transferred to a system that 
satisfies those durability constraints.

As of now, there is no built-in solution to achieve such behavior using the 
standard NiFi distribution.
Current workarounds involve the usage of a scripted processor or the creation 
of a custom processor, that provides the desired functionality.

This issue proposes the addition of a {{DeleteFile}} processor to the NiFi 
standard-processors bundle, that fills this gap.
It should expect a FlowFile and delete the file at the path derived from the 
FlowFile attributes. The default values for determine the file path should be 
compatible with the attributes written by the existing {{ListFiles}} processor.

  was:
The existing processor to retrieve a file from the file system, namely 
{{FetchFile}} and {{GetFile}}, support the removal of the file from the file 
system once the content has been copied into the FlowFile.

However, deleting the file from the file system immediately might not be 
feasible in certain circumstances.  
In cases where the content repository of NiFi does not meet sufficient data 
durability guarantees, it might be desired to remove the source file only after 
it has been processed successfully and its result transferred to a system that 
satisfies those durability constraints.

As of now, there is no built-in solution to achieve such behavior using the 
standard NiFi distribution.
Current workarounds involve the usage of a scripted processor or the creation 
of a custom processor, that provides the desired functionality.

This issue proposes the addition of a {{DeleteFile}} processor to the NiFi 
standard-processors bundle, that fills this gap.
It should expect a FlowFile and delete the file at the file path provided as 
FlowFile attributes. The default values for determine the file path should be 
compatible with the attributes written by the existing {{ListFiles}} processor.


> Add DeleteFile processor
> 
>
> Key: NIFI-12880
> URL: https://issues.apache.org/jira/browse/NIFI-12880
> Project: Apache NiFi
>  Issue Type: New Feature
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The existing processors that retrieve a file from the file system, namely 
> {{FetchFile}} and {{GetFile}}, support the removal of the file from the file 
> system once the content has been copied into the FlowFile.
> However, deleting the file from the file system immediately might not be 
> feasible in certain circumstances.  
> In cases where the content repository of NiFi does not meet sufficient data 
> durability guarantees, it might be desired to remove the source file only 
> after it has been processed successfully and its result transferred to a 
> system that satisfies those durability constraints.
> As of now, there is no built-in solution to achieve such behavior using the 
> standard NiFi distribution.
> Current workarounds involve the usage of a scripted processor or the creation 
> of a custom processor, that provides the desired functionality.
> This issue proposes the addition of a {{DeleteFile}} processor to the NiFi 
> standard-processors bundle, that fills this gap.
> It should expect a FlowFile and delete the file at the path derived from the 
> FlowFile attributes. The default values for determine the file path should be 
> compatible with the attributes written by the existing {{ListFiles}} 
> processor.





[jira] [Updated] (NIFI-12880) Add DeleteFile processor

2024-03-09 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12880:
---
Description: 
The existing processors that retrieve a file from the file system, namely 
{{FetchFile}} and {{GetFile}}, support the removal of the file from the file 
system once the content has been copied into the FlowFile.

However, deleting the file from the file system immediately might not be 
feasible in certain circumstances.  
In cases where the content repository of NiFi does not meet sufficient data 
durability guarantees, it might be desired to remove the source file only after 
it has been processed successfully and its result transferred to a system that 
satisfies those durability constraints.

As of now, there is no built-in solution to achieve such behavior using the 
standard NiFi distribution.
Current workarounds involve the usage of a scripted processor or the creation 
of a custom processor, that provides the desired functionality.

This issue proposes the addition of a {{DeleteFile}} processor to the NiFi 
standard-processors bundle, that fills this gap.
It should expect a FlowFile and delete the file at the file path provided as 
FlowFile attributes. The default values for determine the file path should be 
compatible with the attributes written by the existing {{ListFiles}} processor.

  was:TODO


> Add DeleteFile processor
> 
>
> Key: NIFI-12880
> URL: https://issues.apache.org/jira/browse/NIFI-12880
> Project: Apache NiFi
>  Issue Type: New Feature
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
>
> The existing processors that retrieve a file from the file system, namely 
> {{FetchFile}} and {{GetFile}}, support the removal of the file from the file 
> system once the content has been copied into the FlowFile.
> However, deleting the file from the file system immediately might not be 
> feasible in certain circumstances.  
> In cases where the content repository of NiFi does not meet sufficient data 
> durability guarantees, it might be desired to remove the source file only 
> after it has been processed successfully and its result transferred to a 
> system that satisfies those durability constraints.
> As of now, there is no built-in solution to achieve such behavior using the 
> standard NiFi distribution.
> Current workarounds involve the usage of a scripted processor or the creation 
> of a custom processor, that provides the desired functionality.
> This issue proposes the addition of a {{DeleteFile}} processor to the NiFi 
> standard-processors bundle, that fills this gap.
> It should expect a FlowFile and delete the file at the file path provided as 
> FlowFile attributes. The default values for determine the file path should be 
> compatible with the attributes written by the existing {{ListFiles}} 
> processor.





[jira] [Closed] (NIFI-12632) Extract SFTP components out of the standard bundle

2024-03-09 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit closed NIFI-12632.
--

> Extract SFTP components out of the standard bundle
> --
>
> Key: NIFI-12632
> URL: https://issues.apache.org/jira/browse/NIFI-12632
> Project: Apache NiFi
>  Issue Type: Sub-task
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> NIFI-11171 and the goals for NIFI 2.0 outline the desire to extract the SFTP 
> based components out of the standard bundle into a separate bundle. 





[jira] [Created] (NIFI-12880) Add DeleteFile processor

2024-03-09 Thread endzeit (Jira)
endzeit created NIFI-12880:
--

 Summary: Add DeleteFile processor
 Key: NIFI-12880
 URL: https://issues.apache.org/jira/browse/NIFI-12880
 Project: Apache NiFi
  Issue Type: New Feature
Reporter: endzeit
Assignee: endzeit


TODO





[jira] [Closed] (NIFI-12841) Introduce RemoveXYZ type of processors

2024-03-09 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit closed NIFI-12841.
--

> Introduce RemoveXYZ type of processors
> --
>
> Key: NIFI-12841
> URL: https://issues.apache.org/jira/browse/NIFI-12841
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: endzeit
>Priority: Minor
>
> There is the notion of "families" or "types" of processors in the standard 
> distribution of NiFi. 
> Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
> and {{PutXYZ}}. 
> The following examples will be based on files on the local filesystem. 
> However, the same principle applies to other types of resources, e.g. files 
> on an SFTP server.
> The existing {{GetFile}} and {{FetchFile}} processors support the removal of 
> the resource from the source after successful transfer into the content of a 
> FlowFile. 
> However, in some scenarios it might be undesired to remove the resource until 
> it has been processed successfully and the transformation result has been stored, 
> e.g. to a secure network storage.
> This cannot be achieved with a {{GetXYZ}} or {{FetchXYZ}} processor on its 
> own. 
> As of now, one of the scripting processors or even a full-fledged custom 
> processor can be used to achieve this. 
> However, these might get relatively involved due to session handling or other 
> concerns.
> This issue proposes the introduction of an additional such processor "type", 
> namely {{RemoveXYZ}} which removes a resource.
> The base processor should have two properties, namely {{path}} and 
> {{filename}}, by default retrieving their values from the respective core 
> FlowFile attributes. Implementations may add protocol specific properties, 
> e.g. for authentication. 
> There should be three outgoing relationships at least:
> - "success" for FlowFiles, where the resource was removed from the source,
> - "not exists" for FlowFiles, where the resource did not (or no longer) exist on 
> the source,
> - "failure" for FlowFiles, where the resource couldn't be removed from the 
> source, e.g. due to network errors or missing permissions.
> An initial implementation should provide {{RemoveXYZ}} for one of the 
> existing resource types, e.g. File, FTP, SFTP...
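
The three outgoing relationships proposed above can be sketched for the local-file case as a mapping from the result of a delete attempt to an outcome. The class {{RemoveResourceSketch}} and the enum {{Outcome}} are hypothetical names; an actual processor would transfer a FlowFile to a relationship instead of returning an enum value.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical sketch of the routing decision; not an actual NiFi processor.
class RemoveResourceSketch {

    // Mirrors the proposed relationships "success", "not exists", "failure".
    enum Outcome { SUCCESS, NOT_EXISTS, FAILURE }

    static Outcome remove(Path resource) {
        try {
            Files.delete(resource);
            return Outcome.SUCCESS;
        } catch (NoSuchFileException e) {
            // The resource did not exist, or no longer exists, at the source.
            return Outcome.NOT_EXISTS;
        } catch (IOException | SecurityException e) {
            // Any other problem, e.g. missing permissions or I/O errors.
            return Outcome.FAILURE;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("remove-sketch", ".txt");
        System.out.println("first attempt: " + remove(file));
        System.out.println("second attempt: " + remove(file));
    }
}
```

Distinguishing "not exists" from "failure" lets a flow treat an already-removed resource as harmless while still retrying or alerting on genuine errors.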





[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:42 AM:


The latest commit version of 2.x seems not to be affected.

This issue seems to only exist on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly, the other 
one is displayed without name and said to be expected in the body, instead of 
as part of the form-data; similar to the endpoint mentioned in the issue itself.

This issue can be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 already, thus is not an issue of the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

-I suspect there is either an issue in the Swagger itself, introduced into NiFi 
by using a newer version, a new Swagger version requiring a different type of 
declaration or a change in the generation procedure of NiFi itself.-

The documentation is broken in the most recent release version for 1.x which is 
1.25.0.
However, it was broken already for:
 - 
[1.24.0|https://web.archive.org/web/20231208064943/https://nifi.apache.org/docs.html]
 
 - 
[1.23.2|https://web.archive.org/web/20231106213006/https://nifi.apache.org/docs.html]
 - ...
 - 
[1.19.1|https://web.archive.org/web/20230315024356/https://nifi.apache.org/docs.html]
 - ...
 - 
[1.14.0|https://web.archive.org/web/20211021223339/https://nifi.apache.org/docs.html]
 - 
[1.13.2|https://web.archive.org/web/20201004172411/http://nifi.apache.org/docs.html]
 - 
[1.10.0|https://web.archive.org/web/20190906190320/http://nifi.apache.org:80/docs.html]

*Looking at the documentation of all these older versions of NiFi in the Wayback 
Machine, it looks like this never worked in the first place.*

It looks like the issue might stem from the Maven plugin used for the spec 
generation, see [issue 
#352|https://github.com/kongchen/swagger-maven-plugin/issues/352] in the 
project. As outlined in the issue, there is a workaround that involves 
declaring the param a second time as {{@ApiImplicitParam}} which seems to work, 
as the only form parameter visible in the documentation for 
{{/process-groups/id/templates/upload}} has this declared.


was (Author: endzeitbegins):
The latest commit version of 2.x seems to not he affected.

This issue seems to only exist on the 1.x branch.

It affects other endpoint that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly, the other 
one is displayed without name and said to be expected in the body, instead of 
as part of the form-data; similar to the endpoint mentioned in the issue itself.

This issue can be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 already, thus is not an issue of the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" 

[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:42 AM:


The latest commit version of 2.x seems not to be affected.

This issue seems to only exist on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly, the other 
one is displayed without name and said to be expected in the body, instead of 
as part of the form-data; similar to the endpoint mentioned in the issue itself.

This issue can be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 already, thus is not an issue of the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

-I suspect there is either an issue in the Swagger itself, introduced into NiFi 
by using a newer version, a new Swagger version requiring a different type of 
declaration or a change in the generation procedure of NiFi itself.-

The documentation is broken in the most recent release version for 1.x which is 
1.25.0.
However, it was broken already for:
 - 
[1.24.0|https://web.archive.org/web/20231208064943/https://nifi.apache.org/docs.html]
 
 - 
[1.23.2|https://web.archive.org/web/20231106213006/https://nifi.apache.org/docs.html]
 - ...
 - 
[1.19.1|https://web.archive.org/web/20230315024356/https://nifi.apache.org/docs.html]
 - ...
 - 
[1.14.0|https://web.archive.org/web/20211021223339/https://nifi.apache.org/docs.html]
 - 
[1.13.2|https://web.archive.org/web/20201004172411/http://nifi.apache.org/docs.html]
 - 
[1.10.0|https://web.archive.org/web/20190906190320/http://nifi.apache.org:80/docs.html]

*Looking at the documentation of all these older versions of NiFi in the Wayback 
Machine, it looks like this never worked in the first place.*

It looks like the issue might stem from the Maven plugin used for the spec 
generation, see [issue 
#352|https://github.com/kongchen/swagger-maven-plugin/issues/352] in the 
project. As outlined in the issue, there is a workaround that involves 
declaring the param a second time as {{@ApiImplicitParam}} which seems to work, 
as the only form parameter visible in the documentation for 
{{/process-groups/{id}/templates/upload}} has this declared.



[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:39 AM:


The latest commit of 2.x does not seem to be affected.

This issue seems to exist only on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly; the other 
one is displayed without a name and is said to be expected in the body instead 
of as part of the form-data, similar to the endpoint mentioned in the issue itself.

This issue is already visible in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}},
 so it is not a problem with the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
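
The symptom in the fragment above can also be checked mechanically. The following is a minimal sketch, not part of NiFi: it assumes the generated file follows the usual Swagger 2.0 layout with a top-level {{paths}} object, and the function name {{find_misplaced_form_params}} and the trimmed {{SAMPLE_SPEC}} are hypothetical, introduced only for illustration. It reports every parameter declared as {{"in" : "body"}} on an operation that consumes {{multipart/form-data}}, a combination that Swagger 2.0 does not allow alongside {{formData}} parameters.

```python
import json


def find_misplaced_form_params(spec):
    """Return (path, method, name) for each 'body' parameter that is declared
    on an operation consuming multipart/form-data."""
    findings = []
    for path, operations in spec.get("paths", {}).items():
        for method, operation in operations.items():
            # Path items may also carry non-operation keys such as a shared
            # "parameters" list; skip anything that is not an operation object.
            if not isinstance(operation, dict):
                continue
            if "multipart/form-data" not in operation.get("consumes", []):
                continue
            for param in operation.get("parameters", []):
                if param.get("in") == "body":
                    findings.append((path, method, param.get("name")))
    return findings


# Trimmed copy of the fragment quoted above, wrapped in a "paths" object
# (assumption: this mirrors the structure of the generated swagger.json).
SAMPLE_SPEC = json.loads("""
{
  "paths": {
    "/process-groups/{id}/templates/upload": {
      "post": {
        "consumes": ["multipart/form-data"],
        "parameters": [
          {"name": "id", "in": "path", "required": true, "type": "string"},
          {"in": "body", "name": "body", "required": false,
           "schema": {"type": "boolean"}},
          {"name": "template", "in": "formData", "required": true,
           "type": "file"}
        ]
      }
    }
  }
}
""")

if __name__ == "__main__":
    for path, method, name in find_misplaced_form_params(SAMPLE_SPEC):
        print(f"{method.upper()} {path}: parameter {name!r} is declared in the body")
```

To run the same check against a real build, the parsed content of the generated {{swagger.json}} could be passed in place of {{SAMPLE_SPEC}}.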
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

-I suspect there is either an issue in the Swagger itself, introduced into NiFi 
by using a newer version, a new Swagger version requiring a different type of 
declaration or a change in the generation procedure of NiFi itself.-

The documentation is broken in the most recent release version for 1.x which is 
1.25.0.
However, it was broken already for:
 - 
[1.24.0|https://web.archive.org/web/20231208064943/https://nifi.apache.org/docs.html]
 
 - 
[1.23.2|https://web.archive.org/web/20231106213006/https://nifi.apache.org/docs.html]
 - ...
 - 
[1.19.1|https://web.archive.org/web/20230315024356/https://nifi.apache.org/docs.html]
 - ...
 - 
[1.14.0|https://web.archive.org/web/20211021223339/https://nifi.apache.org/docs.html]
 - 
[1.13.2|https://web.archive.org/web/20201004172411/http://nifi.apache.org/docs.html]
 - 
[1.10.0|https://web.archive.org/web/20190906190320/http://nifi.apache.org:80/docs.html]

*Looking at the documentation of all these older versions of NiFi in the Wayback 
Machine, it looks like this never worked in the first place.*

It looks like the issue might stem from the Maven plugin used for the spec 
generation, see [issue 
#352|https://github.com/kongchen/swagger-maven-plugin/issues/352] in the 
project.



[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:30 AM:


The latest commit of 2.x does not seem to be affected.

This issue seems to exist only on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly; the other 
one is displayed without a name and is said to be expected in the body instead 
of as part of the form-data, similar to the endpoint mentioned in the issue itself.

This issue is already visible in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}},
 so it is not a problem with the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

-I suspect there is either an issue in the Swagger itself, introduced into NiFi 
by using a newer version, a new Swagger version requiring a different type of 
declaration or a change in the generation procedure of NiFi itself.-

The documentation is broken in the most recent release version for 1.x which is 
1.25.0.
However, it was broken already for:
 - 
[1.24.0|https://web.archive.org/web/20231208064943/https://nifi.apache.org/docs.html]
 
 - 
[1.23.2|https://web.archive.org/web/20231106213006/https://nifi.apache.org/docs.html]
 - ...
 - 
[1.19.1|https://web.archive.org/web/20230315024356/https://nifi.apache.org/docs.html]
 - ...
 - 
[1.14.0|https://web.archive.org/web/20211021223339/https://nifi.apache.org/docs.html]
 - 
[1.13.2|https://web.archive.org/web/20201004172411/http://nifi.apache.org/docs.html]
 - 
[1.10.0|https://web.archive.org/web/20190906190320/http://nifi.apache.org:80/docs.html]

*Looking at the documentation of all these older versions of NiFi in the Wayback 
Machine, it looks like this never worked in the first place.*



[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:28 AM:


The latest commit of 2.x does not seem to be affected.

This issue seems to exist only on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly; the other 
one is displayed without a name and is said to be expected in the body instead 
of as part of the form-data, similar to the endpoint mentioned in the issue itself.

This issue is already visible in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}},
 so it is not a problem with the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

*I suspect there is either an issue in the Swagger itself, introduced into NiFi 
by using a newer version, a new Swagger version requiring a different type of 
declaration or a change in the generation procedure of NiFi itself.*

The documentation is broken in the most recent release version for 1.x which is 
1.25.0.
However, it was broken already for:
 - 
[1.24.0|https://web.archive.org/web/20231208064943/https://nifi.apache.org/docs.html]
 
 - 
[1.23.2|https://web.archive.org/web/20231106213006/https://nifi.apache.org/docs.html]
 - ...
 - 
[1.19.1|https://web.archive.org/web/20230315024356/https://nifi.apache.org/docs.html]
 - ...
 - 
[1.14.0|https://web.archive.org/web/20211021223339/https://nifi.apache.org/docs.html]
 - 
[1.13.2|https://web.archive.org/web/20201004172411/http://nifi.apache.org/docs.html]
 - 
[1.10.0|https://web.archive.org/web/20190906190320/http://nifi.apache.org:80/docs.html]



[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:27 AM:


The latest commit of 2.x does not seem to be affected.

This issue seems to exist only on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly; the other 
one is displayed without a name and is said to be expected in the body instead 
of as part of the form-data, similar to the endpoint mentioned in the issue itself.

This issue is already visible in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}},
 so it is not a problem with the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

*I suspect there is either an issue in the Swagger itself, introduced into NiFi 
by using a newer version, a new Swagger version requiring a different type of 
declaration or a change in the generation procedure of NiFi itself.*

The documentation is broken in the most recent release version for 1.x which is 
1.25.0.
However, it was broken already for:
 - 
[1.24.0|https://web.archive.org/web/20231208064943/https://nifi.apache.org/docs.html]
 
 - 
[1.23.2|https://web.archive.org/web/20231106213006/https://nifi.apache.org/docs.html]
 - ...
 - 
[1.19.1|https://web.archive.org/web/20230315024356/https://nifi.apache.org/docs.html]
 - ...
 - 
[1.14.0|https://web.archive.org/web/20211021223339/https://nifi.apache.org/docs.html]
 - 
[1.13.2|https://web.archive.org/web/20201004172411/http://nifi.apache.org/docs.html]



[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:26 AM:


The latest commit of 2.x does not seem to be affected.

This issue seems to exist only on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly; the other 
one is displayed without a name and is said to be expected in the body instead 
of as part of the form-data, similar to the endpoint mentioned in the issue itself.

This issue is already visible in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}},
 so it is not a problem with the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

*I suspect there is either an issue in the Swagger itself, introduced into NiFi 
by using a newer version, a new Swagger version requiring a different type of 
declaration or a change in the generation procedure of NiFi itself.*

The documentation is broken in the most recent release version for 1.x which is 
1.25.0.
However, it was broken already for:
 - 
[1.24.0|https://web.archive.org/web/20231208064943/https://nifi.apache.org/docs.html]
 
 - 
[1.23.2|https://web.archive.org/web/20231106213006/https://nifi.apache.org/docs.html]
 - ...
 - 
[1.19.1|https://web.archive.org/web/20230315024356/https://nifi.apache.org/docs.html]
 - ...
 - 
[1.14.0|https://web.archive.org/web/20211021223339/https://nifi.apache.org/docs.html]



[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:23 AM:


The latest commit of 2.x does not seem to be affected.

This issue seems to exist only on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly; the other 
one is displayed without a name and is said to be expected in the body instead 
of as part of the form-data, similar to the endpoint mentioned in the issue itself.

This issue is already visible in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}},
 so it is not a problem with the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

*I suspect there is either an issue in the Swagger itself, introduced into NiFi 
by using a newer version, a new Swagger version requiring a different type of 
declaration or a change in the generation procedure of NiFi itself.*

The documentation is broken in the most recent release version for 1.x which is 
1.25.0.
However, it was broken already for:
 - 
[1.24.0|https://web.archive.org/web/20231208064943/https://nifi.apache.org/docs.html]
 
 - 
[1.23.2|https://web.archive.org/web/20231106213006/https://nifi.apache.org/docs.html]
 - ...
 - 
[1.19.1|https://web.archive.org/web/20230315024356/https://nifi.apache.org/docs.html]



[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:22 AM:


The latest commit of 2.x does not seem to be affected.

This issue seems to exist only on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly; the other 
one is displayed without a name and is said to be expected in the body instead 
of as part of the form-data, similar to the endpoint mentioned in the issue itself.

This issue is already visible in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}},
 so it is not a problem with the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow 
for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being 
uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.
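
To check which other endpoints are affected, a minimal sketch along the following lines could scan the generated {{swagger.json}}. It relies on the Swagger 2.0 rule that body and formData parameters cannot be combined in one operation, so any operation that consumes multipart/form-data but declares a parameter with "in" : "body" is suspicious. The function name and the inline sample spec are made up for illustration; they are not part of NiFi.

```python
import json
from pathlib import Path


def find_mixed_payload_operations(spec: dict) -> list:
    """Return (path, method, body parameter names) for every operation that
    consumes multipart/form-data yet declares a parameter with "in": "body"."""
    findings = []
    for path, operations in spec.get("paths", {}).items():
        for method, operation in operations.items():
            # Path items may also carry non-operation keys such as "parameters".
            if not isinstance(operation, dict):
                continue
            if "multipart/form-data" not in operation.get("consumes", []):
                continue
            body_params = [
                param.get("name", "")
                for param in operation.get("parameters", [])
                if param.get("in") == "body"
            ]
            if body_params:
                findings.append((path, method, body_params))
    return findings


def scan_file(swagger_json: Path) -> list:
    """Load a swagger.json from disk (location is an assumption) and scan it."""
    return find_mixed_payload_operations(json.loads(swagger_json.read_text()))


# Hypothetical, trimmed-down sample mirroring the excerpt quoted above.
SAMPLE_SPEC = {
    "paths": {
        "/process-groups/{id}/templates/upload": {
            "post": {
                "consumes": ["multipart/form-data"],
                "parameters": [
                    {"name": "id", "in": "path", "type": "string"},
                    {"name": "body", "in": "body", "schema": {"type": "boolean"}},
                    {"name": "template", "in": "formData", "type": "file"},
                ],
            }
        }
    }
}

for path, method, names in find_mixed_payload_operations(SAMPLE_SPEC):
    print(f"{method.upper()} {path}: body parameter(s) {names}")
```

Calling {{scan_file}} on the {{swagger.json}} of a local 1.x build should then list every multipart endpoint with the same symptom.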

*I suspect there is either an issue in the Swagger itself, introduced into NiFi 
by using a newer version, a new Swagger version requiring a different type of 
declaration or a change in the generation procedure of NiFi itself.*

The documentation is broken in the most recent release version for 1.x which is 
1.25.0.
However, it was broken already for:
 - [1.24.0|https://web.archive.org/web/20231208064943/https://nifi.apache.org/docs.html]
 - [1.23.2|https://web.archive.org/web/20231106213006/https://nifi.apache.org/docs.html]



was (Author: endzeitbegins):
The latest commit version of 2.x seems not to be affected.

This issue seems to only exist on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly, the other 
one is displayed without name and said to be expected in the body, instead of 
as part of the form-data; similar to the endpoint mentioned in the issue itself.

This issue can be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 already, thus is not an issue of the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow 
for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being 
uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

*I suspect there is either an issue in the Swagger itself, introduced into NiFi 
by using a newer version, a new Swagger version requiring a different 

[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:20 AM:


The latest commit version of 2.x seems not to be affected.

This issue seems to only exist on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly, the other 
one is displayed without name and said to be expected in the body, instead of 
as part of the form-data; similar to the endpoint mentioned in the issue itself.

This issue can be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 already, thus is not an issue of the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow 
for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being 
uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

*I suspect there is either an issue in the Swagger itself, introduced into NiFi 
by using a newer version, a new Swagger version requiring a different type of 
declaration or a change in the generation procedure of NiFi itself.*

The documentation is broken in the most recent release version for 1.x which is 
1.25.0.
However, it was broken already for:
 - [1.24.0|https://web.archive.org/web/20231208064943/https://nifi.apache.org/docs.html]
 


was (Author: endzeitbegins):
The latest commit version of 2.x seems not to be affected.

This issue seems to only exist on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly, the other 
one is displayed without name and said to be expected in the body, instead of 
as part of the form-data; similar to the endpoint mentioned in the issue itself.

This issue can be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 already, thus is not an issue of the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow 
for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being 
uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

*I suspect there is either an issue in the Swagger itself, introduced into NiFi 
by using a newer version, a new Swagger version requiring a different type of 
declaration or a change in the generation procedure of NiFi itself.*

> Missing 

[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:16 AM:


The latest commit version of 2.x seems not to be affected.

This issue seems to only exist on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly, the other 
one is displayed without name and said to be expected in the body, instead of 
as part of the form-data; similar to the endpoint mentioned in the issue itself.

This issue can be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 already, thus is not an issue of the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow 
for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being 
uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

*I suspect there is either an issue in the Swagger itself, introduced into NiFi 
by using a newer version, a new Swagger version requiring a different type of 
declaration or a change in the generation procedure of NiFi itself.*


was (Author: endzeitbegins):
The latest commit version of 2.x seems not to be affected.

This issue seems to only exist on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly, the other 
one is displayed without name and said to be expected in the body, instead of 
as part of the form-data; similar to the endpoint mentioned in the issue itself.

This issue can be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 already, thus is not an issue of the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow 
for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being 
uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

*I suspect there is either an issue in the Swagger Codegen itself, introduced 
into NiFi by using a newer version, a new Codegen version requiring a different 
type of declaration or a change in the generation procedure of NiFi itself.*

> Missing Documentation for nifi-api
> --
>
> Key: NIFI-12503
> URL: https://issues.apache.org/jira/browse/NIFI-12503
> Project: Apache NiFi
>  

[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:13 AM:


The latest commit version of 2.x seems not to be affected.

This issue seems to only exist on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly, the other 
one is displayed without name and said to be expected in the body, instead of 
as part of the form-data; similar to the endpoint mentioned in the issue itself.

This issue can be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 already, thus is not an issue of the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow 
for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being 
uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

*I suspect there is either an issue in the Swagger Codegen itself, introduced 
into NiFi by using a newer version, a new Codegen version requiring a different 
type of declaration or a change in the generation procedure of NiFi itself.*


was (Author: endzeitbegins):
The latest commit version of 2.x seems not to be affected.

This issue seems to only exist on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly, the other 
one is displayed without name and said to be expected in the body, instead of 
as part of the form-data; similar to the endpoint mentioned in the issue itself.

This issue can be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 already, thus is not an issue of the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow 
for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being 
uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

> Missing Documentation for nifi-api
> --
>
> Key: NIFI-12503
> URL: https://issues.apache.org/jira/browse/NIFI-12503
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.25.0
>Reporter: Steven Matison
>Assignee: endzeit
>Priority: Minor
> Attachments: SAMSAL_0-1701894321710.png
>
>
> Community user 

[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:11 AM:


The latest commit version of 2.x seems not to be affected.

This issue seems to only exist on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code:java}
/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly, the other 
one is displayed without name and said to be expected in the body, instead of 
as part of the form-data; similar to the endpoint mentioned in the issue itself.

This issue can be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 already, thus is not an issue of the rendered documentation itself.
{code:java}
"/process-groups/{id}/templates/upload" : {
      "post" : {
        "tags" : [ "process-groups" ],
        "summary" : "Uploads a template",
        "description" : "",
        "operationId" : "uploadTemplate",
        "consumes" : [ "multipart/form-data" ],
        "produces" : [ "application/xml" ],
        "parameters" : [ {
          "name" : "id",
          "in" : "path",
          "description" : "The process group id.",
          "required" : true,
          "type" : "string"
        }, {
          "in" : "body",
          "name" : "body",
          "description" : "Acknowledges that this node is disconnected to allow 
for mutable requests to proceed.",
          "required" : false,
          "schema" : {
            "type" : "boolean"
          }
        }, {
          "name" : "template",
          "in" : "formData",
          "description" : "The binary content of the template file being 
uploaded.",
          "required" : true,
          "type" : "file"
        } ],
        "responses" : ... {code}
The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.


was (Author: endzeitbegins):
The latest commit version of 2.x seems not to be affected.

This issue seems to only exist on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code}/process-groups/{id}/templates/upload Uploads a template{code}

Here only one of the two form-data parameters is displayed correctly, the other 
one is displayed without name and said to be expected in the body, instead of 
as part of the form-data; similar to the endpoint mentioned in the issue itself.

This issue can be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 already, thus is not an issue of the rendered documentation itself.
{{code}}"/process-groups/{id}/templates/upload" : {
  "post" : {
"tags" : [ "process-groups" ],
"summary" : "Uploads a template",
"description" : "",
"operationId" : "uploadTemplate",
"consumes" : [ "multipart/form-data" ],
"produces" : [ "application/xml" ],
"parameters" : [ {
  "name" : "id",
  "in" : "path",
  "description" : "The process group id.",
  "required" : true,
  "type" : "string"
}, {
  "in" : "body",
  "name" : "body",
  "description" : "Acknowledges that this node is disconnected to allow 
for mutable requests to proceed.",
  "required" : false,
  "schema" : {
"type" : "boolean"
  }
}, {
  "name" : "template",
  "in" : "formData",
  "description" : "The binary content of the template file being 
uploaded.",
  "required" : true,
  "type" : "file"
} ],
"responses" : ...{{code}}

The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

> Missing Documentation for nifi-api
> --
>
> Key: NIFI-12503
> URL: https://issues.apache.org/jira/browse/NIFI-12503
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.25.0
>Reporter: Steven Matison
>Assignee: endzeit
>Priority: Minor
> Attachments: SAMSAL_0-1701894321710.png
>
>
> Community user has noticed that nifi-api docs are missing required request 
> values.  One such example is groupName on the api call for uploading 
> process-groups:
> /process-groups/upload
>  
>  
> More dialogue and original conversation here:
> 

[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:10 AM:


The latest commit version of 2.x seems not to be affected.

This issue seems to only exist on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code}/process-groups/{id}/templates/upload Uploads a template{code}

Here only one of the two form-data parameters is displayed correctly, the other 
one is displayed without name and said to be expected in the body, instead of 
as part of the form-data; similar to the endpoint mentioned in the issue itself.

This issue can be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 already, thus is not an issue of the rendered documentation itself.
{{code}}"/process-groups/{id}/templates/upload" : {
  "post" : {
"tags" : [ "process-groups" ],
"summary" : "Uploads a template",
"description" : "",
"operationId" : "uploadTemplate",
"consumes" : [ "multipart/form-data" ],
"produces" : [ "application/xml" ],
"parameters" : [ {
  "name" : "id",
  "in" : "path",
  "description" : "The process group id.",
  "required" : true,
  "type" : "string"
}, {
  "in" : "body",
  "name" : "body",
  "description" : "Acknowledges that this node is disconnected to allow 
for mutable requests to proceed.",
  "required" : false,
  "schema" : {
"type" : "boolean"
  }
}, {
  "name" : "template",
  "in" : "formData",
  "description" : "The binary content of the template file being 
uploaded.",
  "required" : true,
  "type" : "file"
} ],
"responses" : ...{{code}}

The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.


was (Author: endzeitbegins):
The latest commit version of 2.x seems not to be affected.

This issue seems to only exist on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code}/process-groups/{id}/templates/upload Uploads a template{code}

Here only one of the two form-data parameters is displayed correctly, the other 
one is displayed without name and said to be expected in the body, instead of 
as part of the form-data; similar to the endpoint mentioned in the issue itself.

This issue can be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 already, thus is not an issue of the rendered documentation itself.
{{code}}"/process-groups/{id}/templates/upload" : {
  "post" : {
"tags" : [ "process-groups" ],
"summary" : "Uploads a template",
"description" : "",
"operationId" : "uploadTemplate",
"consumes" : [ "multipart/form-data" ],
"produces" : [ "application/xml" ],
"parameters" : [ {
  "name" : "id",
  "in" : "path",
  "description" : "The process group id.",
  "required" : true,
  "type" : "string"
}, {
  "in" : "body",
  "name" : "body",
  "description" : "Acknowledges that this node is disconnected to allow 
for mutable requests to proceed.",
  "required" : false,
  "schema" : {
"type" : "boolean"
  }
}, {
  "name" : "template",
  "in" : "formData",
  "description" : "The binary content of the template file being 
uploaded.",
  "required" : true,
  "type" : "file"
} ],
"responses" : ...
{{code}}

The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

> Missing Documentation for nifi-api
> --
>
> Key: NIFI-12503
> URL: https://issues.apache.org/jira/browse/NIFI-12503
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.25.0
>Reporter: Steven Matison
>Assignee: endzeit
>Priority: Minor
> Attachments: SAMSAL_0-1701894321710.png
>
>
> Community user has noticed that nifi-api docs are missing required request 
> values.  One such example is groupName on the api call for uploading 
> process-groups:
> /process-groups/upload
>  
>  
> More dialogue and original conversation here:
> 

[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:10 AM:


The latest commit version of 2.x seems not to be affected.

This issue seems to only exist on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code}/process-groups/{id}/templates/upload Uploads a template{code}

Here only one of the two form-data parameters is displayed correctly, the other 
one is displayed without name and said to be expected in the body, instead of 
as part of the form-data; similar to the endpoint mentioned in the issue itself.

This issue can be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 already, thus is not an issue of the rendered documentation itself.
{{code}}"/process-groups/{id}/templates/upload" : {
  "post" : {
"tags" : [ "process-groups" ],
"summary" : "Uploads a template",
"description" : "",
"operationId" : "uploadTemplate",
"consumes" : [ "multipart/form-data" ],
"produces" : [ "application/xml" ],
"parameters" : [ {
  "name" : "id",
  "in" : "path",
  "description" : "The process group id.",
  "required" : true,
  "type" : "string"
}, {
  "in" : "body",
  "name" : "body",
  "description" : "Acknowledges that this node is disconnected to allow 
for mutable requests to proceed.",
  "required" : false,
  "schema" : {
"type" : "boolean"
  }
}, {
  "name" : "template",
  "in" : "formData",
  "description" : "The binary content of the template file being 
uploaded.",
  "required" : true,
  "type" : "file"
} ],
"responses" : ...
{{code}}

The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.


was (Author: endzeitbegins):
The latest commit version of 2.x seems not to be affected.

This issue seems to only exist on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code}/process-groups/{id}/templates/upload Uploads a template{code}

Here only one of the two form-data parameters is displayed correctly, the other 
one is displayed without name and said to be expected in the body, instead of 
as part of the form-data; similar to the endpoint mentioned in the issue itself.

This issue can be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 already, thus is not an issue of the rendered documentation itself.
{{code}}
"/process-groups/{id}/templates/upload" : {
  "post" : {
"tags" : [ "process-groups" ],
"summary" : "Uploads a template",
"description" : "",
"operationId" : "uploadTemplate",
"consumes" : [ "multipart/form-data" ],
"produces" : [ "application/xml" ],
"parameters" : [ {
  "name" : "id",
  "in" : "path",
  "description" : "The process group id.",
  "required" : true,
  "type" : "string"
}, {
  "in" : "body",
  "name" : "body",
  "description" : "Acknowledges that this node is disconnected to allow 
for mutable requests to proceed.",
  "required" : false,
  "schema" : {
"type" : "boolean"
  }
}, {
  "name" : "template",
  "in" : "formData",
  "description" : "The binary content of the template file being 
uploaded.",
  "required" : true,
  "type" : "file"
} ],
"responses" : ...
{{code}}

The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.

> Missing Documentation for nifi-api
> --
>
> Key: NIFI-12503
> URL: https://issues.apache.org/jira/browse/NIFI-12503
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.25.0
>Reporter: Steven Matison
>Assignee: endzeit
>Priority: Minor
> Attachments: SAMSAL_0-1701894321710.png
>
>
> Community user has noticed that nifi-api docs are missing required request 
> values.  One such example is groupName on the api call for uploading 
> process-groups:
> /process-groups/upload
>  
>  
> More dialogue and original conversation here:
> 

[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:09 AM:


The latest commit on 2.x seems not to be affected.

This issue seems to exist only on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code}/process-groups/{id}/templates/upload Uploads a template{code}

Here only one of the two form-data parameters is displayed correctly; the other 
one is displayed without a name and is said to be expected in the body instead 
of as part of the form-data, similar to the endpoint mentioned in the issue itself.

This issue can already be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 and is therefore not an issue of the rendered documentation itself.
{code}
"/process-groups/{id}/templates/upload" : {
  "post" : {
    "tags" : [ "process-groups" ],
    "summary" : "Uploads a template",
    "description" : "",
    "operationId" : "uploadTemplate",
    "consumes" : [ "multipart/form-data" ],
    "produces" : [ "application/xml" ],
    "parameters" : [ {
      "name" : "id",
      "in" : "path",
      "description" : "The process group id.",
      "required" : true,
      "type" : "string"
    }, {
      "in" : "body",
      "name" : "body",
      "description" : "Acknowledges that this node is disconnected to allow for mutable requests to proceed.",
      "required" : false,
      "schema" : {
        "type" : "boolean"
      }
    }, {
      "name" : "template",
      "in" : "formData",
      "description" : "The binary content of the template file being uploaded.",
      "required" : true,
      "type" : "file"
    } ],
    "responses" : ...
{code}

The declaration side of the endpoints in 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/api/ProcessGroupResource.java}}
 looks fine and has not changed for a long time.


was (Author: endzeitbegins):
The latest commit on 2.x seems not to be affected.

This issue seems to exist only on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code}/process-groups/{id}/templates/upload Uploads a template{code}

Here only one of the two form-data parameters is displayed correctly; the other 
one is displayed without a name and is said to be expected in the body instead 
of as part of the form-data, similar to the endpoint mentioned in the issue itself.
This issue can already be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 and is therefore not an issue of the rendered documentation itself.


> Missing Documentation for nifi-api
> --
>
> Key: NIFI-12503
> URL: https://issues.apache.org/jira/browse/NIFI-12503
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.25.0
>Reporter: Steven Matison
>Assignee: endzeit
>Priority: Minor
> Attachments: SAMSAL_0-1701894321710.png
>
>
> Community user has noticed that nifi-api docs are missing required request 
> values.  One such example is groupName on the api call for uploading 
> process-groups:
> /process-groups/upload
>  
>  
> More dialogue and original conversation here:
> [https://community.cloudera.com/t5/Support-Questions/NIFI-API-REST-Upload-Json-definition-flow-file/m-p/380384#M244057]





[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 9:04 AM:


The latest commit on 2.x seems not to be affected.

This issue seems to exist only on the 1.x branch.

It affects other endpoints that use multipart/form-data as well, e.g.
{code}/process-groups/{id}/templates/upload Uploads a template{code}

Here only one of the two form-data parameters is displayed correctly; the other 
one is displayed without a name and is said to be expected in the body instead 
of as part of the form-data, similar to the endpoint mentioned in the issue itself.
This issue can already be seen in the generated {{swagger.json}} under 
{{nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/target/swagger-ui}}
 and is therefore not an issue of the rendered documentation itself.



was (Author: endzeitbegins):
The latest commit on 2.x seems not to be affected.

This issue seems to exist only on the 1.x branch.
It affects other endpoints that use multipart/form-data as well, e.g.
{code}/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly; the other 
one is displayed without a name and is said to be expected in the body instead 
of as part of the form-data, similar to the endpoint mentioned in the issue itself.


> Missing Documentation for nifi-api
> --
>
> Key: NIFI-12503
> URL: https://issues.apache.org/jira/browse/NIFI-12503
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.25.0
>Reporter: Steven Matison
>Assignee: endzeit
>Priority: Minor
> Attachments: SAMSAL_0-1701894321710.png
>
>
> Community user has noticed that nifi-api docs are missing required request 
> values.  One such example is groupName on the api call for uploading 
> process-groups:
> /process-groups/upload
>  
>  
> More dialogue and original conversation here:
> [https://community.cloudera.com/t5/Support-Questions/NIFI-API-REST-Upload-Json-definition-flow-file/m-p/380384#M244057]





[jira] [Comment Edited] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit edited comment on NIFI-12503 at 3/8/24 8:57 AM:


The latest commit on 2.x seems not to be affected.

This issue seems to exist only on the 1.x branch.
It affects other endpoints that use multipart/form-data as well, e.g.
{code}/process-groups/{id}/templates/upload Uploads a template{code}
Here only one of the two form-data parameters is displayed correctly; the other 
one is displayed without a name and is said to be expected in the body instead 
of as part of the form-data, similar to the endpoint mentioned in the issue itself.



was (Author: endzeitbegins):
This issue seems to only exist on the 1.x branch.

The latest commit on 2.x seems not to be affected.

> Missing Documentation for nifi-api
> --
>
> Key: NIFI-12503
> URL: https://issues.apache.org/jira/browse/NIFI-12503
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.25.0
>Reporter: Steven Matison
>Assignee: endzeit
>Priority: Minor
> Attachments: SAMSAL_0-1701894321710.png
>
>
> Community user has noticed that nifi-api docs are missing required request 
> values.  One such example is groupName on the api call for uploading 
> process-groups:
> /process-groups/upload
>  
>  
> More dialogue and original conversation here:
> [https://community.cloudera.com/t5/Support-Questions/NIFI-API-REST-Upload-Json-definition-flow-file/m-p/380384#M244057]





[jira] [Updated] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12503:
---
Affects Version/s: 1.25.0

> Missing Documentation for nifi-api
> --
>
> Key: NIFI-12503
> URL: https://issues.apache.org/jira/browse/NIFI-12503
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.25.0
>Reporter: Steven Matison
>Assignee: endzeit
>Priority: Minor
> Attachments: SAMSAL_0-1701894321710.png
>
>
> Community user has noticed that nifi-api docs are missing required request 
> values.  One such example is groupName on the api call for uploading 
> process-groups:
> /process-groups/upload
>  
>  
> More dialogue and original conversation here:
> [https://community.cloudera.com/t5/Support-Questions/NIFI-API-REST-Upload-Json-definition-flow-file/m-p/380384#M244057]





[jira] [Commented] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824657#comment-17824657
 ] 

endzeit commented on NIFI-12503:


This issue seems to only exist on the 1.x branch.

The latest commit on 2.x seems not to be affected.

> Missing Documentation for nifi-api
> --
>
> Key: NIFI-12503
> URL: https://issues.apache.org/jira/browse/NIFI-12503
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Steven Matison
>Assignee: endzeit
>Priority: Minor
> Attachments: SAMSAL_0-1701894321710.png
>
>
> Community user has noticed that nifi-api docs are missing required request 
> values.  One such example is groupName on the api call for uploading 
> process-groups:
> /process-groups/upload
>  
>  
> More dialogue and original conversation here:
> [https://community.cloudera.com/t5/Support-Questions/NIFI-API-REST-Upload-Json-definition-flow-file/m-p/380384#M244057]





[jira] [Assigned] (NIFI-12503) Missing Documentation for nifi-api

2024-03-08 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit reassigned NIFI-12503:
--

Assignee: endzeit

> Missing Documentation for nifi-api
> --
>
> Key: NIFI-12503
> URL: https://issues.apache.org/jira/browse/NIFI-12503
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Steven Matison
>Assignee: endzeit
>Priority: Minor
> Attachments: SAMSAL_0-1701894321710.png
>
>
> Community user has noticed that nifi-api docs are missing required request 
> values.  One such example is groupName on the api call for uploading 
> process-groups:
> /process-groups/upload
>  
>  
> More dialogue and original conversation here:
> [https://community.cloudera.com/t5/Support-Questions/NIFI-API-REST-Upload-Json-definition-flow-file/m-p/380384#M244057]





[jira] [Resolved] (NIFI-9931) OutOfMemoryError from EvaluateXPath processor halts all FlowFiles from upstream

2024-03-01 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-9931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit resolved NIFI-9931.
---
Resolution: Won't Fix

> OutOfMemoryError from EvaluateXPath processor halts all FlowFiles from 
> upstream
> ---
>
> Key: NIFI-9931
> URL: https://issues.apache.org/jira/browse/NIFI-9931
> Project: Apache NiFi
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: endzeit
>Assignee: endzeit
>Priority: Minor
>
> For some of our flows in Apache NiFi we need to extract information out of 
> XML files for later use. As we need to transform the FlowFile's content while 
> retaining that information, we extract the required bits into FlowFile 
> attributes.
> Most of the time we use the _EvaluateXPath_ processor for this, 
> which works like a charm in 99.99% of cases.
> However, recently we had a minor outage caused by the processor. Normally the 
> content inside the tag is quite small and can be put into the FlowFile 
> attributes (and thus in RAM) without problems. A malprocessed XML with an 
> unusually large content in one of the XML tags we extract to the FlowFile 
> attributes reached the processor, which resulted in an _OutOfMemoryError_ and 
> the processor itself yielding. As the FlowFile's content did not change, all 
> subsequent attempts to extract the data resulted in the same 
> _OutOfMemoryError_ and the processor yielding again and again.  
> Ultimately, this blocked all following FlowFiles upstream 
> and effectively brought processing to a halt.
> 
> That's why we'd like to propose (and contribute, if accepted) an extension to 
> the _EvaluateXPath_ processor to mitigate or at least reduce the risk of 
> this behaviour occurring.
> We thought about a new (optional) property which limits the number of 
> characters / bytes allowed for each extracted tag. This "{_}Maximum Attribute 
> Size{_}" would only take effect when set and the _Destination_ is set to 
> {_}flowfile-attribute{_}. If any extraction would reach this limit, the 
> FlowFile should be moved to the _failure_ relationship instead of yielding 
> the processor and blocking the upstream.
> However, other ideas and proposals are welcome as well. This will not be a 
> complete solution to the problem, but should limit the probability of it 
> happening.
> --
> As a "quick fix", to mitigate the error for now, we placed a 
> _RouteOnAttribute_ processor in front of every _EvaluateXPath_ processor. It 
> filters out any files whose content exceeds an arbitrary size, chosen based 
> on FlowFiles we know were processed successfully in the past.
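
The proposed limit could be sketched roughly as follows. This is only an illustration under stated assumptions: the class AttributeSizeGuard, the Outcome enum and the check method are hypothetical and not part of NiFi, the limit is assumed to be given in bytes of the UTF-8 encoded value, and a value is rejected only when it exceeds the limit. The sketch inspects a value that has already been extracted, so an actual implementation would additionally have to enforce the limit while reading the content in order to avoid the OutOfMemoryError in the first place.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of NiFi.
final class AttributeSizeGuard {

    // ACCEPT: the value may be stored as a FlowFile attribute.
    // FAILURE: the FlowFile would be routed to the failure relationship.
    enum Outcome { ACCEPT, FAILURE }

    // maxAttributeSizeBytes stands for the proposed optional property;
    // null means the property is not set and nothing is limited.
    static Outcome check(String extractedValue, Integer maxAttributeSizeBytes) {
        if (maxAttributeSizeBytes == null) {
            return Outcome.ACCEPT;
        }
        int sizeInBytes = extractedValue.getBytes(StandardCharsets.UTF_8).length;
        // Assumption: only values larger than the limit are rejected.
        return sizeInBytes > maxAttributeSizeBytes ? Outcome.FAILURE : Outcome.ACCEPT;
    }
}
```

Routing the single oversized FlowFile to failure, instead of yielding the processor, is what would keep the remaining FlowFiles in the queue moving.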





[jira] [Resolved] (NIFI-12632) Extract SFTP components out of the standard bundle

2024-03-01 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit resolved NIFI-12632.

Resolution: Later

> Extract SFTP components out of the standard bundle
> --
>
> Key: NIFI-12632
> URL: https://issues.apache.org/jira/browse/NIFI-12632
> Project: Apache NiFi
>  Issue Type: Sub-task
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> NIFI-11171 and the goals for NiFi 2.0 outline the desire to extract the 
> SFTP-based components out of the standard bundle into a separate bundle. 





[jira] [Resolved] (NIFI-12841) Introduce RemoveXYZ type of processors

2024-03-01 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit resolved NIFI-12841.

Resolution: Invalid

> Introduce RemoveXYZ type of processors
> --
>
> Key: NIFI-12841
> URL: https://issues.apache.org/jira/browse/NIFI-12841
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: endzeit
>Priority: Minor
>
> There is the notion of "families" or "types" of processors in the standard 
> distribution of NiFi. 
> Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
> and {{PutXYZ}}. 
> The following examples will be based on files on the local filesystem. 
> However, the same principle applies to other types of resources, e.g. files 
> on a SFTP server.
> The existing {{GetFile}} and {{FetchFile}} processors support the removal of 
> the resource from the source after successful transfer into the content of a 
> FlowFile. 
> However, in some scenarios it might be undesirable to remove the resource until 
> it has been processed successfully and the transformation result has been stored, 
> e.g. to a secure network storage.
> This cannot be achieved with a {{GetXYZ}} or {{FetchXYZ}} processor on its 
> own. 
> As of now, one of the scripting processors or even a full-fledged custom 
> processor can be used to achieve this. 
> However, these might get relatively involved due to session handling or other 
> concerns.
> This issue proposes the introduction of an additional such processor "type", 
> namely {{RemoveXYZ}} which removes a resource.
> The base processor should have two properties, namely {{path}} and 
> {{filename}}, by default retrieving their values from the respective core 
> FlowFile attributes. Implementations may add protocol specific properties, 
> e.g. for authentication. 
> There should be three outgoing relationships at least:
> - "success" for FlowFiles, where the resource was removed from the source,
> - "not exists" for FlowFiles, where the resource did (no longer) exist on the 
> source,
> - "failure" for FlowFiles, where the resource couldn't be removed from the 
> source, e.g. due to network errors or missing permissions.
> An initial implementation should provide {{RemoveXYZ}} for one of the 
> existing resource types, e.g. File, FTP, SFTP...
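
For the local filesystem, the three proposed relationships could map onto a delete operation roughly like this. It is a minimal sketch under assumptions: the class RemoveFileSketch, its Relationship enum and the remove method are hypothetical and only mirror the description above, they are not an existing NiFi processor. The path and filename arguments stand in for the values of the two proposed properties.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not an existing NiFi processor.
final class RemoveFileSketch {

    // Mirrors the three outgoing relationships proposed in the issue.
    enum Relationship { SUCCESS, NOT_EXISTS, FAILURE }

    static Relationship remove(String path, String filename) {
        Path target = Path.of(path, filename);
        try {
            // deleteIfExists returns false when there was nothing to delete.
            return Files.deleteIfExists(target)
                    ? Relationship.SUCCESS
                    : Relationship.NOT_EXISTS;
        } catch (IOException | SecurityException e) {
            // For example missing permissions or a non-empty directory.
            return Relationship.FAILURE;
        }
    }
}
```

Keeping "not exists" separate from "failure" lets a flow treat an already removed resource as harmless while still retrying or alerting on real errors.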





[jira] [Updated] (NIFI-12857) Refactor QueuePrioritizer using updated Java APIs

2024-03-01 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12857:
---
Description: The QueuePrioritizer is not covered by a dedicated test. 
Additionally, it contains some boilerplate that can be reduced by using updated 
Java APIs.  (was: The FlowFilePrioritizer implementations and test contain some 
boilerplate / duplication that can be reduced by using updated Java APIs.)

> Refactor QueuePrioritizer using updated Java APIs
> -
>
> Key: NIFI-12857
> URL: https://issues.apache.org/jira/browse/NIFI-12857
> Project: Apache NiFi
>  Issue Type: Task
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
>
> The QueuePrioritizer is not covered by a dedicated test. Additionally, it 
> contains some boilerplate that can be reduced by using updated Java APIs.





[jira] [Created] (NIFI-12857) Refactor QueuePrioritizer using updated Java APIs

2024-03-01 Thread endzeit (Jira)
endzeit created NIFI-12857:
--

 Summary: Refactor QueuePrioritizer using updated Java APIs
 Key: NIFI-12857
 URL: https://issues.apache.org/jira/browse/NIFI-12857
 Project: Apache NiFi
  Issue Type: Task
Reporter: endzeit
Assignee: endzeit


The FlowFilePrioritizer implementations and test contain some boilerplate / 
duplication that can be reduced by using updated Java APIs.





[jira] [Created] (NIFI-12853) Refactor FlowFilePrioritizer using updated Java APIs

2024-02-29 Thread endzeit (Jira)
endzeit created NIFI-12853:
--

 Summary: Refactor FlowFilePrioritizer using updated Java APIs
 Key: NIFI-12853
 URL: https://issues.apache.org/jira/browse/NIFI-12853
 Project: Apache NiFi
  Issue Type: Task
Reporter: endzeit
Assignee: endzeit


The FlowFilePrioritizer implementations and test contain some boilerplate / 
duplication that can be reduced by using updated Java APIs.
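
As an illustration of the kind of boilerplate meant here, the sketch below contrasts a handwritten comparator with the Comparator factory methods available since Java 8. It is not taken from the NiFi code base: QueuedItem is a hypothetical stand-in for a FlowFile, and the fields queueDate and id are assumptions chosen for the example (records require Java 16+).

```java
import java.util.Comparator;

// Hypothetical stand-in for a queued FlowFile; not NiFi's actual type.
record QueuedItem(long queueDate, long id) {}

final class PrioritizerSketch {

    // Handwritten style: explicit null handling and field-by-field comparison.
    static final Comparator<QueuedItem> VERBOSE = (a, b) -> {
        if (a == null && b == null) {
            return 0;
        } else if (a == null) {
            return 1;
        } else if (b == null) {
            return -1;
        }
        int byDate = Long.compare(a.queueDate(), b.queueDate());
        return byDate != 0 ? byDate : Long.compare(a.id(), b.id());
    };

    // The same ordering expressed with Comparator factory methods:
    // nulls sort last, then by queue date, then by id.
    static final Comparator<QueuedItem> CONCISE = Comparator.nullsLast(
            Comparator.comparingLong(QueuedItem::queueDate)
                    .thenComparingLong(QueuedItem::id));
}
```

Both comparators order items identically; the second states the ordering declaratively and leaves the null handling to the library.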





[jira] [Created] (NIFI-12852) Refactor TestRecordPath

2024-02-29 Thread endzeit (Jira)
endzeit created NIFI-12852:
--

 Summary: Refactor TestRecordPath
 Key: NIFI-12852
 URL: https://issues.apache.org/jira/browse/NIFI-12852
 Project: Apache NiFi
  Issue Type: Task
Reporter: endzeit
Assignee: endzeit


The tests for RecordPath contain quite a bit of duplication and can be tidied up.





[jira] [Commented] (NIFI-12841) Introduce RemoveXYZ type of processors

2024-02-25 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820519#comment-17820519
 ] 

endzeit commented on NIFI-12841:


Thank you for your feedback [~markap14], that makes perfect sense. Somehow I 
must've overlooked those processors. ._.'
What are your thoughts on adding a dedicated {{DeleteFile}} processor? 


> Introduce RemoveXYZ type of processors
> --
>
> Key: NIFI-12841
> URL: https://issues.apache.org/jira/browse/NIFI-12841
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: endzeit
>Priority: Minor
>
> There is the notion of "families" or "types" of processors in the standard 
> distribution of NiFi. 
> Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
> and {{PutXYZ}}. 
> The following examples will be based on files on the local filesystem. 
> However, the same principle applies to other types of resources, e.g. files 
> on a SFTP server.
> The existing {{GetFile}} and {{FetchFile}} processors support the removal of 
> the resource from the source after successful transfer into the content of a 
> FlowFile. 
> However, in some scenarios it might be undesirable to remove the resource until 
> it has been processed successfully and the transformation result has been stored, 
> e.g. to a secure network storage.
> This cannot be achieved with a {{GetXYZ}} or {{FetchXYZ}} processor on its 
> own. 
> As of now, one of the scripting processors or even a full-fledged custom 
> processor can be used to achieve this. 
> However, these might get relatively involved due to session handling or other 
> concerns.
> This issue proposes the introduction of an additional such processor "type", 
> namely {{RemoveXYZ}} which removes a resource.
> The base processor should have two properties, namely {{path}} and 
> {{filename}}, by default retrieving their values from the respective core 
> FlowFile attributes. Implementations may add protocol specific properties, 
> e.g. for authentication. 
> There should be three outgoing relationships at least:
> - "success" for FlowFiles, where the resource was removed from the source,
> - "not exists" for FlowFiles, where the resource did (no longer) exist on the 
> source,
> - "failure" for FlowFiles, where the resource couldn't be removed from the 
> source, e.g. due to network errors or missing permissions.
> An initial implementation should provide {{RemoveXYZ}} for one of the 
> existing resource types, e.g. File, FTP, SFTP...





[jira] [Updated] (NIFI-12841) Introduce RemoveXYZ type of processors

2024-02-25 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12841:
---
Description: 
There is the notion of "families" or "types" of processors in the standard 
distribution of NiFi. 
Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
and {{PutXYZ}}. 

The following examples will be based on files on the local filesystem. However, 
the same principle applies to other types of resources, e.g. files on a SFTP 
server.

The existing {{GetFile}} and {{FetchFile}} processors support the removal of 
the resource from the source after successful transfer into the content of a 
FlowFile. 
However, in some scenarios it might be undesirable to remove the resource until 
it has been processed successfully and the transformation result has been stored, 
e.g. to a secure network storage.
This cannot be achieved with a {{GetXYZ}} or {{FetchXYZ}} processor on its own. 
As of now, one of the scripting processors or even a full-fledged custom 
processor can be used to achieve this. 
However, these might get relatively involved due to session handling or other 
concerns.

This issue proposes the introduction of an additional such processor "type", 
namely {{RemoveXYZ}} which removes a resource.

The base processor should have two properties, namely {{path}} and 
{{filename}}, by default retrieving their values from the respective core 
FlowFile attributes. Implementations may add protocol specific properties, e.g. 
for authentication. 
There should be three outgoing relationships at least:
- "success" for FlowFiles, where the resource was removed from the source,
- "not exists" for FlowFiles, where the resource did (no longer) exist on the 
source,
- "failure" for FlowFiles, where the resource couldn't be removed from the 
source, e.g. due to network errors or missing permissions.

An initial implementation should provide {{RemoveXYZ}} for one of the existing 
resource types, e.g. File, FTP, SFTP...

  was:
There is the notion of "families" or "types" of processors in the standard 
distribution of NiFi. 
Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
and {{PutXYZ}}. 

The following examples will be based on files on the local filesystem. However, 
the same principle applies to other types of resources, e.g. files on a SFTP 
server.

The existing {{GetFile}} and {{FetchFile}} processors support the removal of 
the resource from the source after successful transfer into the content of a 
FlowFile. 
However, in some scenarios it might be undesirable to remove the resource until 
it has been processed successfully and the transformation result has been stored, 
e.g. to a secure network storage.
This cannot be achieved with a {{GetXYZ}} or {{FetchXYZ}} processor on its own. 
As of now,  one of the scripting processors or even a full-fledged custom 
processor can be used to achieve this. 

This issue proposes the introduction of an additional such processor "type", 
namely {{RemoveXYZ}} which removes a resource.

The base processor should have two properties, namely {{path}} and 
{{filename}}, by default retrieving their values from the respective core 
FlowFile attributes. Implementations may add protocol specific properties, e.g. 
for authentication. 
There should be three outgoing relationships at least:
- "success" for FlowFiles, where the resource was removed from the source,
- "not exists" for FlowFiles, where the resource did (no longer) exist on the 
source,
- "failure" for FlowFiles, where the resource couldn't be removed from the 
source, e.g. due to network errors or missing permissions.

An initial implementation should provide {{RemoveXYZ}} for one of the existing 
resource types, e.g. File, FTP, SFTP...


> Introduce RemoveXYZ type of processors
> --
>
> Key: NIFI-12841
> URL: https://issues.apache.org/jira/browse/NIFI-12841
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: endzeit
>Priority: Minor
>
> There is the notion of "families" or "types" of processors in the standard 
> distribution of NiFi. 
> Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
> and {{PutXYZ}}. 
> The following examples will be based on files on the local filesystem. 
> However, the same principle applies to other types of resources, e.g. files 
> on a SFTP server.
> The existing {{GetFile}} and {{FetchFile}} processors support the removal of 
> the resource from the source after successful transfer into the content of a 
> FlowFile. 
> However, in some scenarios it might be undesirable to remove the resource until 
> it has been processed successfully and the transformation result has been stored, 
> e.g. to a secure network storage.
> This cannot be achieved with a {{GetXYZ}} or {{FetchXYZ}} processor on its 
> own. 
> As of now, one 

[jira] [Updated] (NIFI-12841) Introduce RemoveXYZ type of processors

2024-02-25 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12841:
---
Description: 
There is the notion of "families" or "types" of processors in the standard 
distribution of NiFi. 
Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
and {{PutXYZ}}. 

The following examples will be based on files on the local filesystem. However, 
the same principle applies to other types of resources, e.g. files on a SFTP 
server.

The existing {{GetFile}} and {{FetchFile}} processors support the removal of 
the resource from the source after successful transfer into the content of a 
FlowFile. 
However, in some scenarios it might be undesirable to remove the resource until 
it has been processed successfully and the transformation result has been stored, 
e.g. to a secure network storage.
This cannot be achieved with a {{GetXYZ}} or {{FetchXYZ}} processor on its own. 
As of now,  one of the scripting processors or even a full-fledged custom 
processor can be used to achieve this. 

This issue proposes the introduction of an additional such processor "type", 
namely {{RemoveXYZ}} which removes a resource.

The base processor should have two properties, namely {{path}} and 
{{filename}}, by default retrieving their values from the respective core 
FlowFile attributes. Implementations may add protocol specific properties, e.g. 
for authentication. 
There should be three outgoing relationships at least:
- "success" for FlowFiles, where the resource was removed from the source,
- "not exists" for FlowFiles, where the resource did (no longer) exist on the 
source,
- "failure" for FlowFiles, where the resource couldn't be removed from the 
source, e.g. due to network errors or missing permissions.

An initial implementation should provide {{RemoveXYZ}} for one of the existing 
resource types, e.g. File, FTP, SFTP...

  was:
There is the notion of "families" or "types" of processors in the standard 
distribution of NiFi. 
Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
and {{PutXYZ}}. 

The following examples will be based on files on the local filesystem. However, 
the same principle applies to other types of resources, e.g. files on a SFTP 
server.

The existing {{GetFile}} and {{FetchFile}} processors support the removal of 
the resource from the source after successful transfer into the content of a 
FlowFile. 
However, in some scenarios it might be undesirable to remove the resource until 
it has been processed successfully and the transformation result has been stored, 
e.g. to a secure network storage.
This cannot be achieved with the {{GetXYZ}} or {{FetchXYZ}} processor on its 
own. 
As of now,  one of the scripting processors or even a full-fledged custom 
processor can be used to achieve this. 

This issue proposes the introduction of an additional such processor "type", 
namely {{RemoveXYZ}} which removes a resource.

The base processor should have two properties, namely {{path}} and 
{{filename}}, by default retrieving their values from the respective core 
FlowFile attributes. Implementations may add protocol specific properties, e.g. 
for authentication. 
There should be three outgoing relationships at least:
- "success" for FlowFiles, where the resource was removed from the source,
- "not exists" for FlowFiles, where the resource did (no longer) exist on the 
source,
- "failure" for FlowFiles, where the resource couldn't be removed from the 
source, e.g. due to network errors or missing permissions.

An initial implementation should provide {{RemoveXYZ}} for one of the existing 
resource types, e.g. File, FTP, SFTP...


> Introduce RemoveXYZ type of processors
> --
>
> Key: NIFI-12841
> URL: https://issues.apache.org/jira/browse/NIFI-12841
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: endzeit
>Priority: Minor
>
> There is the notion of "families" or "types" of processors in the standard 
> distribution of NiFi. 
> Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
> and {{PutXYZ}}. 
> The following examples will be based on files on the local filesystem. 
> However, the same principle applies to other types of resources, e.g. files 
> on a SFTP server.
> The existing {{GetFile}} and {{FetchFile}} processors support the removal of 
> the resource from the source after successful transfer into the content of a 
> FlowFile. 
> However, in some scenarios it might be undesirable to remove the resource until 
> it has been processed successfully and the transformation result has been stored, 
> e.g. to a secure network storage.
> This cannot be achieved with a {{GetXYZ}} or {{FetchXYZ}} processor on its 
> own. 
> As of now,  one of the scripting processors or even a full-fledged custom 
> processor can be used to 

[jira] [Updated] (NIFI-12841) Introduce RemoveXYZ type of processors

2024-02-25 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12841:
---
Description: 
There is the notion of "families" or "types" of processors in the standard 
distribution of NiFi. 
Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
and {{PutXYZ}}. 

The following examples will be based on files on the local filesystem. However, 
the same principle applies to other types of resources, e.g. files on a SFTP 
server.

The existing {{GetFile}} and {{FetchFile}} processors support the removal of 
the resource from the source after successful transfer into the content of a 
FlowFile. 
However, in some scenarios it might be undesirable to remove the resource until 
it has been processed successfully and the transformation result has been 
stored, e.g. to a secure network storage.
This cannot be achieved with a {{GetXYZ}} or {{FetchXYZ}} processor on its 
own. 
As of now, one of the scripting processors or even a full-fledged custom 
processor can be used to achieve this. 

This issue proposes the introduction of an additional such processor "type", 
namely {{RemoveXYZ}} which removes a resource.

The base processor should have two properties, namely {{path}} and 
{{filename}}, by default retrieving their values from the respective core 
FlowFile attributes. Implementations may add protocol-specific properties, e.g. 
for authentication. 
There should be at least three outgoing relationships:
- "success" for FlowFiles where the resource was removed from the source,
- "not exists" for FlowFiles where the resource did not (or no longer) exist on 
the source,
- "failure" for FlowFiles where the resource couldn't be removed from the 
source, e.g. due to network errors or missing permissions.

An initial implementation should provide {{RemoveXYZ}} for one of the existing 
resource types, e.g. File, FTP, SFTP...

  was:
There is the notion of "families" or "types" of processors in the standard 
distribution of NiFi. 
Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
and {{PutXYZ}}.

This issue proposes the introduction of an additional such "type", namely 
{{RemoveXYZ}} which removes a resource. 

The following examples will be based on files on the local filesystem. However, 
the same principle applies to other types of resources, e.g. files on a SFTP 
server.

TODO ..


> Introduce RemoveXYZ type of processors
> --
>
> Key: NIFI-12841
> URL: https://issues.apache.org/jira/browse/NIFI-12841
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: endzeit
>Priority: Minor


[jira] [Updated] (NIFI-12841) Introduce RemoveXYZ type of processors

2024-02-25 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12841:
---
Description: 
There is the notion of "families" or "types" of processors in the standard 
distribution of NiFi. 
Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
and {{PutXYZ}}.

This issue proposes the introduction of an additional such "type", namely 
{{RemoveXYZ}} which removes a resource. 

The following examples will be based on files on the local filesystem. However, 
the same principle applies to other types of resources, e.g. files on a SFTP 
server.

TODO ..

  was:
There is the notion of "families" or "types" of processors in the standard 
distribution of NiFi. 
Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
and {{PutXYZ}}.

This issue propose the introduction of an additional such "type", namely 
{{RemoveXYZ}}.

TODO ...


> Introduce RemoveXYZ type of processors
> --
>
> Key: NIFI-12841
> URL: https://issues.apache.org/jira/browse/NIFI-12841
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: endzeit
>Priority: Minor


[jira] [Updated] (NIFI-12841) Introduce RemoveXYZ type of processors

2024-02-25 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12841:
---
Description: 
There is the notion of "families" or "types" of processors in the standard 
distribution of NiFi. 
Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
and {{PutXYZ}}.

This issue propose the introduction of an additional such "type", namely 
{{RemoveXYZ}}.

TODO ...

  was:There exist several "families" of processor types already, to name a few 
\{{ListXYZ}}


> Introduce RemoveXYZ type of processors
> --
>
> Key: NIFI-12841
> URL: https://issues.apache.org/jira/browse/NIFI-12841
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: endzeit
>Priority: Minor


[jira] [Updated] (NIFI-12841) Introduce RemoveXYZ type of processors

2024-02-25 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12841:
---
Summary: Introduce RemoveXYZ type of processors  (was: Introduce RemoveX 
type of processors)

> Introduce RemoveXYZ type of processors
> --
>
> Key: NIFI-12841
> URL: https://issues.apache.org/jira/browse/NIFI-12841
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: endzeit
>Priority: Minor
>
> There exist several "families" of processor types already, to name a few 
> \{{ListXYZ}}





[jira] [Updated] (NIFI-12841) Introduce RemoveX type of processors

2024-02-25 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12841:
---
Description: There exist several "families" of processor types already, to 
name a few \{{ListXYZ}}  (was: TODO)

> Introduce RemoveX type of processors
> 
>
> Key: NIFI-12841
> URL: https://issues.apache.org/jira/browse/NIFI-12841
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: endzeit
>Priority: Minor


[jira] [Created] (NIFI-12841) Introduce RemoveX type of processors

2024-02-25 Thread endzeit (Jira)
endzeit created NIFI-12841:
--

 Summary: Introduce RemoveX type of processors
 Key: NIFI-12841
 URL: https://issues.apache.org/jira/browse/NIFI-12841
 Project: Apache NiFi
  Issue Type: Improvement
Reporter: endzeit


TODO





[jira] [Assigned] (NIFI-12498) The Prioritization description in the User Guide is different from the actual source code implementation.

2024-02-25 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit reassigned NIFI-12498:
--

Assignee: endzeit

> The Prioritization description in the User Guide is different from the actual 
> source code implementation.
> -
>
> Key: NIFI-12498
> URL: https://issues.apache.org/jira/browse/NIFI-12498
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Documentation & Website
>Reporter: Doin Cha
>Assignee: endzeit
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the prioritization explanation of the User Guide, it is stated that 
> *OldestFlowFileFirstPrioritizer* is the _"default scheme that is used if no 
> prioritizers are selected."_
> _([https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#prioritization])_
>  
>  
> However, in the actual source code implementation, *there is no automatic 
> default setting when prioritizers are not selected.*
> In such cases, the sorting is done by comparing the *ContentClaim* of 
> FlowFiles.
> _([https://github.com/apache/nifi/blob/9a5ec83baa1b3593031f0917659a69e7a36bb0be/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/queue/QueuePrioritizer.java#L39-L90])_
>  
>  
> It looks like the user guide needs to be revised.





[jira] [Updated] (NIFI-12498) The Prioritization description in the User Guide is different from the actual source code implementation.

2024-02-25 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12498:
---
Affects Version/s: 2.0.0-M2
   1.25.0

> The Prioritization description in the User Guide is different from the actual 
> source code implementation.
> -
>
> Key: NIFI-12498
> URL: https://issues.apache.org/jira/browse/NIFI-12498
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Documentation & Website
>Affects Versions: 1.25.0, 2.0.0-M2
>Reporter: Doin Cha
>Assignee: endzeit
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h


[jira] [Updated] (NIFI-12704) Record Path function escapeJson raises NPE when given root "/" as argument

2024-02-25 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12704:
---
Affects Version/s: 2.0.0-M2

> Record Path function escapeJson raises NPE when given root "/" as argument
> --
>
> Key: NIFI-12704
> URL: https://issues.apache.org/jira/browse/NIFI-12704
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.24.0, 2.0.0-M2
> Environment: Docker, RedHat 8
>Reporter: Stephen Jeffrey Hindmarch
>Assignee: endzeit
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> My use case is I want to create a field called "original" which is the 
> escaped string of my whole record. This will preserve the original contents 
> of the message before I start transforming it.
> My expectation is that I should be able to use an UpdateRecord processor to 
> create the field using RecordPath language.
> What actually happens is that when I use 
> {code:java}
> escapeJson(/)
> {code}
> as the function the result is that the processor throws a Null Pointer 
> Exception (NPE).
> Detail:
> For any input flow file containing JSON records, define an UpdateRecord 
> processor with these settings.
>  * Replacement Value Strategy = Record Path Value
>  * /original = escapeJson(/)
> When the flow file is presented to the processor, the processor generates an 
> NPE.
> For example, I present this record.
> {noformat}
> [{"hello":"world","record":{"key":"one","value":"hello","subrecord":{"key":"two","value":"bob"}},"array":[0,1,2]}]{noformat}
> If the escapeJson function is offered an existing field to escape then it 
> works as expected. Stack trace as follows.
> {noformat}
> 2024-01-31 12:17:15 2024-01-31 12:17:14,741 ERROR [Timer-Driven Process 
> Thread-8] o.a.n.processors.standard.UpdateRecord 
> UpdateRecord[id=5f6b498d-018d-1000--323c6b89] Failed to process 
> StandardFlowFileRecord[uuid=2d4a3eab-e752-4117-93d3-3e5515b6f3f8,claim=StandardContentClaim
>  [resourceClaim=StandardResourceClaim[id=1706703050563-1, container=default, 
> section=1], offset=114, 
> length=114],offset=0,name=2d4a3eab-e752-4117-93d3-3e5515b6f3f8,size=114]; 
> will route to failure
> 2024-01-31 12:17:15 
> org.apache.nifi.record.path.exception.RecordPathException: 
> java.lang.NullPointerException
> 2024-01-31 12:17:15     at 
> org.apache.nifi.record.path.RecordPath.compile(RecordPath.java:105)
> 2024-01-31 12:17:15     at 
> com.github.benmanes.caffeine.cache.LocalLoadingCache.lambda$newMappingFunction$2(LocalLoadingCache.java:145)
> 2024-01-31 12:17:15     at 
> com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2406)
> 2024-01-31 12:17:15     at 
> java.base/java.util.concurrent.ConcurrentHashMap.compute(Unknown Source)
> 2024-01-31 12:17:15     at 
> com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2404)
> 2024-01-31 12:17:15     at 
> com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2387)
> 2024-01-31 12:17:15     at 
> com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108)
> 2024-01-31 12:17:15     at 
> com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:56)
> 2024-01-31 12:17:15     at 
> org.apache.nifi.record.path.util.RecordPathCache.getCompiled(RecordPathCache.java:34)
> 2024-01-31 12:17:15     at 
> org.apache.nifi.processors.standard.UpdateRecord.process(UpdateRecord.java:166)
> 2024-01-31 12:17:15     at 
> org.apache.nifi.processors.standard.AbstractRecordProcessor$1.process(AbstractRecordProcessor.java:147)
> 2024-01-31 12:17:15     at 
> org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:3432)
> 2024-01-31 12:17:15     at 
> org.apache.nifi.processors.standard.AbstractRecordProcessor.onTrigger(AbstractRecordProcessor.java:122)
> 2024-01-31 12:17:15     at 
> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
> 2024-01-31 12:17:15     at 
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1361)
> 2024-01-31 12:17:15     at 
> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:247)
> 2024-01-31 12:17:15     at 
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102)
> 2024-01-31 12:17:15     at 
> org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
> 2024-01-31 12:17:15     at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
> 2024-01-31 12:17:15     at 
> java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
> 2024-01-31 12:17:15     at 
> 
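
The result the reporter expects from escapeJson(/) can be illustrated with a 
small standalone sketch: the whole record, serialized as JSON text, is escaped 
so that it can be stored as a string in a field called "original". This is a 
hand-written escaper for illustration only, not NiFi's RecordPath 
implementation; the class EscapeJsonSketch and its method are hypothetical 
names.

```java
// Hypothetical sketch: illustrates the expected result, not NiFi's implementation.
class EscapeJsonSketch {

    // Escapes a string so that it can be embedded as a JSON string value.
    static String escapeJson(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String record = "{\"hello\":\"world\",\"array\":[0,1,2]}";
        // Re-creates the record with an added "original" field holding the escaped text.
        String withOriginal = record.substring(0, record.length() - 1)
                + ",\"original\":\"" + escapeJson(record) + "\"}";
        System.out.println(withOriginal);
    }
}
```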

[jira] [Assigned] (NIFI-12704) Record Path function escapeJson raises NPE when given root "/" as argument

2024-02-25 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit reassigned NIFI-12704:
--

Assignee: endzeit

> Record Path function escapeJson raises NPE when given root "/" as argument
> --
>
> Key: NIFI-12704
> URL: https://issues.apache.org/jira/browse/NIFI-12704
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.24.0
> Environment: Docker, RedHat 8
>Reporter: Stephen Jeffrey Hindmarch
>Assignee: endzeit
>Priority: Minor

[jira] [Created] (NIFI-12840) Expose REMOTE_POLL_BATCH_SIZE property for ListSFTP

2024-02-24 Thread endzeit (Jira)
endzeit created NIFI-12840:
--

 Summary: Expose REMOTE_POLL_BATCH_SIZE property for ListSFTP
 Key: NIFI-12840
 URL: https://issues.apache.org/jira/browse/NIFI-12840
 Project: Apache NiFi
  Issue Type: Improvement
Affects Versions: 1.25.0
Reporter: endzeit


Backport changes from NIFI-12772 into the _support/nifi-1.x_ branch by exposing 
the property {{REMOTE_POLL_BATCH_SIZE}} for {{ListSFTP}}.





[jira] [Assigned] (NIFI-12840) Expose REMOTE_POLL_BATCH_SIZE property for ListSFTP

2024-02-24 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit reassigned NIFI-12840:
--

Assignee: endzeit

> Expose REMOTE_POLL_BATCH_SIZE property for ListSFTP
> ---
>
> Key: NIFI-12840
> URL: https://issues.apache.org/jira/browse/NIFI-12840
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.25.0
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
>
> Backport changes from NIFI-12772 into the _support/nifi-1.x_ branch by 
> exposing the property {{REMOTE_POLL_BATCH_SIZE}} for {{ListSFTP}}.





[jira] [Commented] (NIFI-12648) Refactor components in elasticsearch bundle using current API methods

2024-01-20 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-12648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17808939#comment-17808939
 ] 

endzeit commented on NIFI-12648:


Thank you for removing the version information from the Jira ticket 
@[~exceptionfactory]. I hadn't noticed that it was copied as well. 

> Refactor components in elasticsearch bundle using current API methods
> -
>
> Key: NIFI-12648
> URL: https://issues.apache.org/jira/browse/NIFI-12648
> Project: Apache NiFi
>  Issue Type: Sub-task
>Reporter: endzeit
>Assignee: endzeit
>Priority: Minor
>
> Based on improvements to support for {{DescribedValue}} in NiFi API and other 
> improvements in Java 21, the _elasticsearch_ bundle components should be 
> updated to use current methods.





[jira] [Updated] (NIFI-12648) Refactor components in elasticsearch bundle using current API methods

2024-01-19 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12648:
---
Description: Based on improvements to support for {{DescribedValue}} in 
NiFi API and other improvements in Java 21, the _elasticsearch_ bundle 
components should be updated to use current methods.  (was: Based on 
improvements to support for {{DescribedValue}} in NiFi API and other 
improvements in Java 21, the _azure_ bundle components should be updated to use 
current methods.)

> Refactor components in elasticsearch bundle using current API methods
> -
>
> Key: NIFI-12648
> URL: https://issues.apache.org/jira/browse/NIFI-12648
> Project: Apache NiFi
>  Issue Type: Sub-task
>Reporter: endzeit
>Assignee: endzeit
>Priority: Minor
> Fix For: 2.0.0-M2
>
>
> Based on improvements to support for {{DescribedValue}} in NiFi API and other 
> improvements in Java 21, the _elasticsearch_ bundle components should be 
> updated to use current methods.





[jira] [Created] (NIFI-12648) Refactor components in elasticsearch bundle using current API methods

2024-01-19 Thread endzeit (Jira)
endzeit created NIFI-12648:
--

 Summary: Refactor components in elasticsearch bundle using current 
API methods
 Key: NIFI-12648
 URL: https://issues.apache.org/jira/browse/NIFI-12648
 Project: Apache NiFi
  Issue Type: Sub-task
Reporter: endzeit
Assignee: endzeit
 Fix For: 2.0.0-M2


Based on improvements to support for {{DescribedValue}} in NiFi API and other 
improvements in Java 21, the _azure_ bundle components should be updated to use 
current methods.





[jira] [Updated] (NIFI-12632) Extract SFTP components out of the standard bundle

2024-01-18 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12632:
---
Description: NIFI-11171 and the goals for NIFI 2.0 outline the desire to 
extract the SFTP based components out of the standard bundle into a separate 
bundle. 

> Extract SFTP components out of the standard bundle
> --
>
> Key: NIFI-12632
> URL: https://issues.apache.org/jira/browse/NIFI-12632
> Project: Apache NiFi
>  Issue Type: Sub-task
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
>
> NIFI-11171 and the goals for NIFI 2.0 outline the desire to extract the SFTP 
> based components out of the standard bundle into a separate bundle. 





[jira] [Created] (NIFI-12632) Extract SFTP components out of the standard bundle

2024-01-18 Thread endzeit (Jira)
endzeit created NIFI-12632:
--

 Summary: Extract SFTP components out of the standard bundle
 Key: NIFI-12632
 URL: https://issues.apache.org/jira/browse/NIFI-12632
 Project: Apache NiFi
  Issue Type: Sub-task
Reporter: endzeit
Assignee: endzeit








[jira] [Created] (NIFI-12627) Extract nifi-file-transfer from nifi-standard-processors

2024-01-17 Thread endzeit (Jira)
endzeit created NIFI-12627:
--

 Summary: Extract nifi-file-transfer from nifi-standard-processors
 Key: NIFI-12627
 URL: https://issues.apache.org/jira/browse/NIFI-12627
 Project: Apache NiFi
  Issue Type: Sub-task
Reporter: endzeit
Assignee: endzeit


NIFI-11171 plans to move {{SFTP}} processors out of the 
{{nifi-standard-processors}} bundle.
These processors are based on the "FileTransfer" processor family.
The same is true for the {{FTP}} processors.

In order for both Maven packages, {{nifi-standard-processors}} and the new one 
for the SFTP processors, to access / build upon the "FileTransfer" processors, 
these processors need to be moved to a separate package "nifi-file-transfer".





[jira] [Created] (NIFI-12613) Align type-safe access to allowableValue with it's declaration

2024-01-15 Thread endzeit (Jira)
endzeit created NIFI-12613:
--

 Summary: Align type-safe access to allowableValue with it's 
declaration 
 Key: NIFI-12613
 URL: https://issues.apache.org/jira/browse/NIFI-12613
 Project: Apache NiFi
  Issue Type: Improvement
Reporter: endzeit
Assignee: endzeit


NIFI-12452 introduced a new method on {{PropertyValue}} to type-safely access a 
property with allowableValues constrained by an Enum. 
{code:java}
<E extends Enum<E> & DescribedValue> E asDescribedValue(Class<E> enumType) 
throws IllegalArgumentException {code}
I think it makes sense to align the access site in {{PropertyValue}} with the 
declaration site in {{PropertyDescriptor.Builder}}.

This would involve renaming the method to {{asAllowableValue}} for improved 
symmetry.
This is a breaking change; however, the method was never part of a stable 
release.

Additionally, NIFI-12573 unified the behaviour of specifying Enums (not) 
implementing {{DescribedValue}} as allowableValues. With this change in place, 
I think it's reasonable to open the method to accept any Enum as well. 
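
The idea of a lookup that accepts any Enum can be sketched as follows. This is 
a standalone illustration under assumptions, not the NiFi implementation: 
DescribedValueSketch is a hypothetical stand-in for {{DescribedValue}} reduced 
to a single method, the two enums are invented examples, and the raw property 
value is passed in as a parameter instead of being read from a 
{{PropertyValue}}.

```java
// Hypothetical stand-in for NiFi's DescribedValue, reduced to one method.
interface DescribedValueSketch {
    String getValue();
}

class AllowableValueSketch {

    // Invented example of an enum that implements the stand-in interface.
    enum Compression implements DescribedValueSketch {
        NONE("none"), GZIP("gzip");

        private final String value;

        Compression(String value) {
            this.value = value;
        }

        @Override
        public String getValue() {
            return value;
        }
    }

    // Invented example of a plain enum without the interface.
    enum Mode { FAST, SAFE }

    // Resolves a raw property value to an enum constant. Constants implementing
    // the stand-in interface are matched on getValue(), all others on name().
    static <E extends Enum<E>> E asAllowableValue(String raw, Class<E> enumType) {
        for (E constant : enumType.getEnumConstants()) {
            String value = constant instanceof DescribedValueSketch
                    ? ((DescribedValueSketch) constant).getValue()
                    : constant.name();
            if (value.equals(raw)) {
                return constant;
            }
        }
        throw new IllegalArgumentException(
                "'" + raw + "' is not an allowable value of " + enumType.getSimpleName());
    }

    public static void main(String[] args) {
        System.out.println(asAllowableValue("gzip", Compression.class));
        System.out.println(asAllowableValue("SAFE", Mode.class));
    }
}
```

Dropping the {{DescribedValue}} bound from the type parameter is what lets both 
kinds of enum pass through the same method in this sketch.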





[jira] [Updated] (NIFI-12595) Introduce an "Entity Tracking Mode" and "Track Last Listing Time" to ListedEntityTracker

2024-01-14 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12595:
---
Description: 
h1. Situation

The existing {{ListX}} processors support different "Listing Strategies". One 
commonly used "Listing Strategy" is "Tracking Entities", whereby crucial 
information about all recently listed entities, e.g. files, is remembered.
On every listing, if information about an entity has been remembered before, 
the entity is not listed again (unless it was modified).

This has several benefits over other available "Listing Strategies". For 
example, unlike with "No Tracking", the same entity is not listed repeatedly. 
Unlike with "Tracking Timestamps", entities with an older timestamp than ones 
previously listed can be picked up. 
However, the strategy comes with its own problems.
h1. Problem

Due to the ever-present constraints on available memory and performance, 
entities cannot be tracked indefinitely.
That's why the {{ListedEntityTracker}}, used for implementing "Tracking 
Entities" by most processors, introduces the notion of an "Entity Tracking Time 
Window".
All remembered entities that are out of the time window (they are older than 
the current time minus the time window) are removed from the tracking cache to 
limit memory use. Additionally, not yet listed entities that are out of the 
time window are exempt from listing, as they would be removed from the "cache" 
immediately on the next run, resulting in them being listed over and over.

However, this results in entities "older" than the specified "Entity Tracking 
Time Window" not being picked up. For example, suppose entities are listed from 
a remote server and this server is unavailable for some time. Once the server 
is available again, the listing continues. However, all entities / files that 
were created before the defined time window will be silently ignored.
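
The effect can be illustrated with a minimal sketch. It is not the actual {{ListedEntityTracker}} code; the class {{TimeWindowSketch}} and the method {{listable}} are hypothetical names, and only the {{minTimestampToList}} threshold is taken from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the time-window check; not NiFi code.
final class TimeWindowSketch {

    /** Returns the ids of entities whose timestamp lies inside the tracking time window. */
    static List<String> listable(Map<String, Long> entityTimestamps, long nowMillis, long windowMillis) {
        // Entities older than this threshold are never listed.
        long minTimestampToList = nowMillis - windowMillis;
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> entry : entityTimestamps.entrySet()) {
            if (entry.getValue() >= minTimestampToList) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

With a window of one second and a current time of 10,000 ms, an entity stamped at 1,000 ms is skipped without any notice, which corresponds to the outage scenario described above.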

As of now, this can be solved by manual intervention, re-starting the ListX 
processor. The "Entity Tracking Time Window" can be ignored upon initial 
listing, when the "Entity Tracking Initial Listing Target" is set to 
"All Available" (default).

However, this requires the NiFi user to be aware of lingering old entities 
being available on the connected remote source. Additionally, the need for 
manual intervention might be undesired / impractical when a large number of 
sources are connected.

Additionally, the "Entity Tracking Time Window" can be increased to account for 
longer time frames. However, this only improves the situation somewhat and does 
not solve the problem. There is also a limit to this, as it increases the 
memory needed.

h1. Proposal

This issue proposes introducing the notion of an "Entity Tracking Mode", whereby 
the current behavior could be understood as "Track Entity Timestamp".

A new mode of "Track Last Listing Time" is added. Unlike the existing 
"Track Entity Timestamp" mode, this would not impose any prerequisites on the 
entities regarding their timestamp (see {{minTimestampToList}}). Instead, 
all entities would be considered. 
However, this mode needs a way to limit / clean the entity cache as well. 
Instead of measuring the time window by the timestamp of the entity, the mode 
should remember the last time the entity was tracked, that is, returned by a 
call to "listEntities" in "trackEntities". Every time an entity is listed, 
its cache entry is renewed. After every listing, only the cache entries that 
have been updated in the time window will be kept. All other entities, those 
that have not been listed for a longer time, are removed from the cache.
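
A rough sketch of the proposed bookkeeping, under simplifying assumptions: entities are identified by a plain string id, modification detection is left out, and the class and method names ({{LastListingTimeCacheSketch}}, {{trackListing}}) are made up for illustration and do not exist in NiFi.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of a "Track Last Listing Time" cache; not NiFi code.
final class LastListingTimeCacheSketch {
    private final Map<String, Long> lastListedAt = new HashMap<>();
    private final long windowMillis;

    LastListingTimeCacheSketch(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /**
     * Records one listing run: renews the cache entry of every listed entity,
     * then drops entries that were not renewed within the time window.
     * Returns the ids that were not in the cache before this run.
     */
    Set<String> trackListing(Set<String> listedIds, long nowMillis) {
        Set<String> newlySeen = new TreeSet<>();
        for (String id : listedIds) {
            if (lastListedAt.put(id, nowMillis) == null) {
                newlySeen.add(id);
            }
        }
        // The entity's own timestamp plays no role; only the last listing time counts.
        lastListedAt.values().removeIf(lastListed -> lastListed < nowMillis - windowMillis);
        return newlySeen;
    }

    int size() {
        return lastListedAt.size();
    }
}
```

In this model an entity stays cached for as long as the source keeps returning it, regardless of how old the entity itself is, and it leaves the cache once it has been absent from listings for longer than the window.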

In case users want to limit a processor to only list entities up to a certain 
age, most processors have support for this with a separate property already, 
e.g. "Maximum File Age" in ListSFTP.

While this mode solves the problem of listing "old" entities, it comes with its 
own downsides. Due to lifting the restriction on {{minTimestampToList}}, 
more entities can be listed, potentially leading to long listing times. 
Additionally, similar to the existing "Track Entity Timestamp" mode, there is 
no enforced upper limit on how many cache entries are possible. See NIFI-12609 
for a proposal that may address both problems.

  was:
h1. Situation

The existing {{ListX}} processors support different "Listing Strategies". One 
commonly used  "Listing Strategy" is "Tracking Entities" whereby crucial 
information of all recently listed entities, e.g. files, is remembered.
On every listing, in case information to an entity has been remembered before, 
the entity is not listed again (unless it was modified).

This has several benefits over other available "Listing Strategies". For 
example, unlike with "No Tracking" the same entity is not listed repeatedly.  
Other than "Tracking Timestamps", entities with an older timestamp than ones 
previously listed can be picked up. 
However, the 

[jira] [Updated] (NIFI-12609) Introduce an "Cached Entity Limit" to ListedEntityTracker

2024-01-14 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12609:
---
Description: 
The {{ListedEntityTracker}} has no option to enforce a hard limit on cacheable 
entities.

This can result in both 1.) an arbitrary number of entities listed, and 2.) an 
arbitrary number of cache entries stored in the DistributedMapCache. 
This may lead to long response times of the cache as well as an overload of the 
NiFi instance with FlowFiles.

The ListedEntityTracker should support the (optional) enforcement of an upper 
limit on listable entities during the time window. 

Given a configured upper limit of N, no more than N entries should be 
supported in the entity tracking cache simultaneously and listed by the 
processor during the configured time window. 

This limit is designed to prohibit abuse and reduce the consequences of load 
peaks. Thus a warning bulletin should be shown to the user whenever the limit 
is reached, to make the unusual usage visible.

Additionally, no listing should be executed against the source system as long 
as the cache is full, to avoid undesired and unnecessary load. 
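
The intended behavior could look roughly like the following sketch. All names ({{CachedEntityLimitSketch}}, {{mayList}}, {{track}}, {{expire}}) are hypothetical, the warning bulletin is reduced to a counter, and expiry by time window is reduced to an explicit call.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of a "Cached Entity Limit"; not NiFi code.
final class CachedEntityLimitSketch {
    private final Set<String> tracked = new HashSet<>();
    private final int limit;
    private int warnings = 0; // stands in for a warning bulletin

    CachedEntityLimitSketch(int limit) {
        this.limit = limit;
    }

    /** Returns false, and records a warning, while the cache is full, so the source is not queried. */
    boolean mayList() {
        if (tracked.size() >= limit) {
            warnings++;
            return false;
        }
        return true;
    }

    /** Tracks an entity if there is room; returns true if the entity should be emitted. */
    boolean track(String id) {
        if (tracked.contains(id)) {
            return false; // already listed
        }
        if (tracked.size() >= limit) {
            warnings++;
            return false; // limit reached during this listing
        }
        tracked.add(id);
        return true;
    }

    /** Removes an entry, e.g. once it has left the time window. */
    void expire(String id) {
        tracked.remove(id);
    }

    int warnings() {
        return warnings;
    }
}
```

A processor would call {{mayList}} before contacting the source system and {{track}} for every entity returned, so at most N entities are emitted until older entries expire.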

  was:TODO


> Introduce an "Cached Entity Limit" to ListedEntityTracker
> -
>
> Key: NIFI-12609
> URL: https://issues.apache.org/jira/browse/NIFI-12609
> Project: Apache NiFi
>  Issue Type: New Feature
>Reporter: endzeit
>Priority: Major
>
> The {{ListedEntityTracker}} has no option to enforce a strong limit on 
> cachable entities.
> This can result in both, 1.) an arbitrary amount of entities listed, and 2.) 
> an arbitrary amount of cache entries stored in the DistributedMapCache. 
> This may lead to long response times of the cache as well as an overload of 
> the NiFi instance with FlowFiles.
> The ListedEntityTracker should support the (optional) enforcement of an upper 
> limit on listable entities during the time window. 
> Given a configured upper limit of N, no more than N entries should be 
> supported in the entity tracking cache simultaneously and listed by the 
> processor during the configured time window. 
> This limit is designed to prohibit abuse and reduce the consequences of load 
> peaks. Thus a warning bulletin should be shown to the user whenever the limit 
> is reached, to make the unusual usage visible.
> Additionally, no listing should be executed against the source system as long 
> as the cache is full, to avoid undesired and unnecessary load. 





[jira] [Assigned] (NIFI-12595) Introduce an "Entity Tracking Mode" and "Track Last Listing Time" to ListedEntityTracker

2024-01-14 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit reassigned NIFI-12595:
--

Assignee: endzeit

> Introduce an "Entity Tracking Mode" and "Track Last Listing Time" to 
> ListedEntityTracker
> 
>
> Key: NIFI-12595
> URL: https://issues.apache.org/jira/browse/NIFI-12595
> Project: Apache NiFi
>  Issue Type: New Feature
>Affects Versions: 1.24.0
>Reporter: endzeit
>Assignee: endzeit
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h1. Situation
> The existing {{ListX}} processors support different "Listing Strategies". One 
> commonly used  "Listing Strategy" is "Tracking Entities" whereby crucial 
> information of all recently listed entities, e.g. files, is remembered.
> On every listing, in case information to an entity has been remembered 
> before, the entity is not listed again (unless it was modified).
> This has several benefits over other available "Listing Strategies". For 
> example, unlike with "No Tracking" the same entity is not listed repeatedly.  
> Other than "Tracking Timestamps", entities with an older timestamp than ones 
> previously listed can be picked up. 
> However, the strategy comes with its own problems.
> h1. Problem
> Due to the ever given constraints to available memory and performance, 
> entities cannot be tracked indefinitely.
> That's why the {{ListedEntityTracker}}, used for implementing "Tracking 
> Entities" by most processors, introduces the notion of an "Entity Tracking 
> Time Window".
> All remembered entities that are out of the time window (they are older than 
> the current time minus the time window) are removed from the tracking cache, 
> to limit memory use. Additionally, not yet listed entities that are out of 
> the time window are exempt from listing, as they would be removed from the 
> "cache" on the next run immediately, resulting in them being listed over and 
> over. 
> However, this results in entities "older" than the specified "Entity Tracking 
> Time Window" not being picked up. For example, given entities are listed from 
> a remote server and this server is not available for some time. Once the 
> server is available again, the listing continues. However, all entities / 
> files that were created before the defined time window, will be silently 
> ignored.
> As of now, this can be solved by manual intervention, re-starting the ListX 
> processor. The 
> "Entity Tracking Time Window" can be ignored upon initial listing, when the 
> "Entity Tracking Initial Listing Target" is set to "All Available" (default).
> However, this requires the NiFi user to be aware of lingering old entities 
> being available on the connected remote source. Additionally, the need for 
> manual intervention might be undesired / impractical when having a plentiful 
> of sources connected.
> Additionally, the "Entity Tracking Time Window" can be increased to account 
> for longer time frames. However, this only betters the situation somewhat and 
> does not solve the problem. Also there is a limit to this, as it increases 
> the memory needed.
> h1. Proposal
> This issue proposes introducing the notion of an "Entity Tracking Mode", 
> whereby the current behavior could be understood as "Track Entity Timestamp".
> A new mode of "Track Last Listing Time" is added. Unlike the existing 
> "Track Entity Timestamp" mode, this would not impose any prerequisites on the 
> entities regarding their timestamp (see {{minTimestampToList}}). Instead, all 
> entities would be considered. 
> However, this strategy needs a way to limit / clean the entity cache as well. 
> Instead of measuring the time window by the timestamp of the entity, the mode 
> should remember the last time the entity was tracked; that is, part of a call 
> to "listEntities" in "trackEntities". That is, every time an entity is 
> listed, its cache entry is renewed. After every listing, only the cache 
> entries that have been updated in the time window will be kept. All other 
> entities, those that have not been listed for a longer time, are removed from 
> the cache.
> In case users want to limit a processor to only list entities up to a certain 
> age, most processors have support for this with a separate property already, 
> e.g. "Maximum File Age" in ListSFTP. 
> While this mode solves the problem of listing "old" entities it comes with 
> its own downsides. Due to lifting the restriction on {{minTimestampToList}}, 
> more entities can be listed, potentially leading to long listing times. 
> Additionally, similar to the existing "Track Entity Timestamp" there is no 
> enforced upper limit on how many cache entries are possible. See NIFI-12609 
> for a proposal that may address both problems.




[jira] [Updated] (NIFI-12595) Introduce an "Entity Tracking Mode" and "Track Last Listing Time" to ListedEntityTracker

2024-01-14 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12595:
---
Description: 
h1. Situation

The existing {{ListX}} processors support different "Listing Strategies". One 
commonly used  "Listing Strategy" is "Tracking Entities" whereby crucial 
information of all recently listed entities, e.g. files, is remembered.
On every listing, in case information to an entity has been remembered before, 
the entity is not listed again (unless it was modified).

This has several benefits over other available "Listing Strategies". For 
example, unlike with "No Tracking" the same entity is not listed repeatedly.  
Other than "Tracking Timestamps", entities with an older timestamp than ones 
previously listed can be picked up. 
However, the strategy comes with its own problems.

h1. Problem

Due to the ever given constraints to available memory and performance, entities 
cannot be tracked indefinitely.
That's why the {{ListedEntityTracker}}, used for implementing "Tracking 
Entities" by most processors, introduces the notion of an "Entity Tracking Time 
Window".
All remembered entities that are out of the time window (they are older than 
the current time minus the time window) are removed from the tracking cache, to 
limit memory use. Additionally, not yet listed entities that are out of the 
time window are exempt from listing, as they would be removed from the "cache" 
on the next run immediately, resulting in them being listed over and over. 

However, this results in entities "older" than the specified "Entity Tracking 
Time Window" not being picked up. For example, given entities are listed from a 
remote server and this server is not available for some time. Once the server 
is available again, the listing continues. However, all entities / files that 
were created before the defined time window, will be silently ignored.

As of now, this can be solved by manual intervention, re-starting the ListX 
processor. The 
"Entity Tracking Time Window" can be ignored upon initial listing, when the 
"Entity Tracking Initial Listing Target" is set to "All Available" (default).

However, this requires the NiFi user to be aware of lingering old entities 
being available on the connected remote source. Additionally, the need for 
manual intervention might be undesired / impractical when having a plentiful of 
sources connected.

Additionally, the "Entity Tracking Time Window" can be increased to account for 
longer time frames. However, this only betters the situation somewhat and does 
not solve the problem. Also there is a limit to this, as it increases the 
memory needed.

h1. Proposal

This issue proposes introducing the notion of an "Entity Tracking Mode", whereby 
the current behavior could be understood as "Track Entity Timestamp".

A new mode of "Track Last Listing Time" is added. Unlike the existing 
"Track Entity Timestamp" mode, this would not impose any prerequisites on the 
entities regarding their timestamp (see {{minTimestampToList}}). Instead, all 
entities would be considered. 
However, this strategy needs a way to limit / clean the entity cache as well. 
Instead of measuring the time window by the timestamp of the entity, the mode 
should remember the last time the entity was tracked; that is, part of a call 
to "listEntities" in "trackEntities". That is, every time an entity is listed, 
its cache entry is renewed. After every listing, only the cache entries that 
have been updated in the time window will be kept. All other entities, those 
that have not been listed for a longer time, are removed from the cache.

In case users want to limit a processor to only list entities up to a certain 
age, most processors have support for this with a separate property already, 
e.g. "Maximum File Age" in ListSFTP. 

While this mode solves the problem of listing "old" entities it comes with its 
own downsides. Due to lifting the restriction on {{minTimestampToList}}, more 
entities can be listed, potentially leading to long listing times. 
Additionally, similar to the existing "Track Entity Timestamp" there is no 
enforced upper limit on how many cache entries are possible. See NIFI-12609 for 
a proposal that may address both problems.


  was:
h1. Situation

The existing {{ListX}} processors support different "Listing Strategies". One 
commonly used  "Listing Strategy" is "Tracking Entities" whereby crucial 
information of all recently listed entities, e.g. files, is remembered.
On every listing, in case information to an entity has been remembered before, 
the entity is not listed again (unless it was modified).

This has several benefits over other available "Listing Strategies". For 
example, unlike with "No Tracking" the same entity is not listed repeatedly.  
Other than "Tracking Timestamps", entities with an older timestamp than ones 
previously listed can be picked up. 
However, the strategy 

[jira] [Created] (NIFI-12609) Introduce an "Cached Entity Limit" to ListedEntityTracker

2024-01-14 Thread endzeit (Jira)
endzeit created NIFI-12609:
--

 Summary: Introduce an "Cached Entity Limit" to ListedEntityTracker
 Key: NIFI-12609
 URL: https://issues.apache.org/jira/browse/NIFI-12609
 Project: Apache NiFi
  Issue Type: New Feature
Reporter: endzeit


TODO





[jira] [Updated] (NIFI-12595) Introduce an "Entity Tracking Mode" and "Track Last Listing Time" to ListedEntityTracker

2024-01-14 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12595:
---
Description: 
h1. Situation

The existing {{ListX}} processors support different "Listing Strategies". One 
commonly used  "Listing Strategy" is "Tracking Entities" whereby crucial 
information of all recently listed entities, e.g. files, is remembered.
On every listing, in case information to an entity has been remembered before, 
the entity is not listed again (unless it was modified).

This has several benefits over other available "Listing Strategies". For 
example, unlike with "No Tracking" the same entity is not listed repeatedly.  
Other than "Tracking Timestamps", entities with an older timestamp than ones 
previously listed can be picked up. 
However, the strategy comes with its own problems.

h1. Problem

Due to the ever given constraints to available memory and performance, entities 
cannot be tracked indefinitely.
That's why the {{ListedEntityTracker}}, used for implementing "Tracking 
Entities" by most processors, introduces the notion of an "Entity Tracking Time 
Window".
All remembered entities that are out of the time window (they are older than 
the current time minus the time window) are removed from the tracking cache, to 
limit memory use. Additionally, not yet listed entities that are out of the 
time window are exempt from listing, as they would be removed from the "cache" 
on the next run immediately, resulting in them being listed over and over. 

However, this results in entities "older" than the specified "Entity Tracking 
Time Window" not being picked up. For example, given entities are listed from a 
remote server and this server is not available for some time. Once the server 
is available again, the listing continues. However, all entities / files that 
were created before the defined time window, will be silently ignored.

As of now, this can be solved by manual intervention, re-starting the ListX 
processor. The 
"Entity Tracking Time Window" can be ignored upon initial listing, when the 
"Entity Tracking Initial Listing Target" is set to "All Available" (default).

However, this requires the NiFi user to be aware of lingering old entities 
being available on the connected remote source. Additionally, the need for 
manual intervention might be undesired / impractical when having a plentiful of 
sources connected.

Additionally, the "Entity Tracking Time Window" can be increased to account for 
longer time frames. However, this only betters the situation somewhat and does 
not solve the problem. Also there is a limit to this, as it increases the 
memory needed.

h1. Proposal

This issue proposes introducing the notion of an "Entity Tracking Mode", whereby 
the current behavior could be understood as "Track Entity Timestamp".

A new mode of "Track Last Listing Time" is added. Unlike the existing 
"Track Entity Timestamp" mode, this would not impose any prerequisites on the 
entities regarding their timestamp (see {{minTimestampToList}}). Instead, all 
entities would be considered. 
However, this strategy needs a way to limit / clean the entity cache as well. 
Instead of measuring the time window by the timestamp of the entity, the mode 
should remember the last time the entity was tracked; that is, part of a call 
to "listEntities" in "trackEntities". That is, every time an entity is listed, 
its cache entry is renewed. After every listing, only the cache entries that 
have been updated in the time window will be kept. All other entities, those 
that have not been listed for a longer time, are removed from the cache.

While this mode solves the problem of listing "old" entities, it comes with its 
own downsides. Similar to the existing "Track Entity Timestamp" mode, there is 
no upper limit on how many cache entries are possible. 
The NiFi user has to configure a sensible value for the amount N of maximum 
cache entries. Failing to do so can result in listing an entity more than once 
(similar to "No Tracking"), when the source provides more than N entities at 
once (e.g. due to a load peak) and the entities are not removed from the source 
until the next listing. Thus this approach works best for Flows where the 
entity is removed from the source shortly after the listing. This behavior should 
be mentioned in the documentation of the property.

  was:
h1. Situation

The existing {{ListX}} processors support different "Listing Strategies". One 
commonly used  "Listing Strategy" is "Tracking Entities" whereby crucial 
information of all recently listed entities, e.g. files, is remembered.
On every listing, in case information to an entity has been remembered before, 
the entity is not listed again (unless it was modified).

This has several benefits over other available "Listing Strategies". For 
example, unlike with "No Tracking" the same entity is not listed repeatedly.  
Other than "Tracking 

[jira] [Updated] (NIFI-12595) Introduce an "Entity Tracking Mode" and "Track Last Listing Time" to ListedEntityTracker

2024-01-14 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12595:
---
Summary: Introduce an "Entity Tracking Mode" and "Track Last Listing Time" 
to ListedEntityTracker  (was: Introduce an "Entity Tracking Strategy" and 
"Tracking Recently Listed" to ListedEntityTracker)

> Introduce an "Entity Tracking Mode" and "Track Last Listing Time" to 
> ListedEntityTracker
> 
>
> Key: NIFI-12595
> URL: https://issues.apache.org/jira/browse/NIFI-12595
> Project: Apache NiFi
>  Issue Type: New Feature
>Affects Versions: 1.24.0
>Reporter: endzeit
>Priority: Major
>
> h1. Situation
> The existing {{ListX}} processors support different "Listing Strategies". One 
> commonly used  "Listing Strategy" is "Tracking Entities" whereby crucial 
> information of all recently listed entities, e.g. files, is remembered.
> On every listing, in case information to an entity has been remembered 
> before, the entity is not listed again (unless it was modified).
> This has several benefits over other available "Listing Strategies". For 
> example, unlike with "No Tracking" the same entity is not listed repeatedly.  
> Other than "Tracking Timestamps", entities with an older timestamp than ones 
> previously listed can be picked up. 
> However, the strategy comes with its own problems.
> h1. Problem
> Due to the ever given constraints to available memory and performance, 
> entities cannot be tracked indefinitely.
> That's why the {{ListedEntityTracker}}, used for implementing "Tracking 
> Entities" by most processors, introduces the notion of an "Entity Tracking 
> Time Window".
> All remembered entities that are out of the time window (they are older than 
> the current time minus the time window) are removed from the tracking cache, 
> to limit memory use. Additionally, not yet listed entities that are out of 
> the time window are exempt from listing, as they would be removed from the 
> "cache" on the next run immediately, resulting in them being listed over and 
> over. 
> However, this results in entities "older" than the specified "Entity Tracking 
> Time Window" not being picked up. For example, given entities are listed from 
> a remote server and this server is not available for some time. Once the 
> server is available again, the listing continues. However, all entities / 
> files that were created before the defined time window, will be silently 
> ignored.
> As of now, this can be solved by manual intervention, re-starting the ListX 
> processor. The 
> "Entity Tracking Time Window" can be ignored upon initial listing, when the 
> "Entity Tracking Initial Listing Target" is set to "All Available" (default).
> However, this requires the NiFi user to be aware of lingering old entities 
> being available on the connected remote source. Additionally, the need for 
> manual intervention might be undesired / impractical when having a plentiful 
> of sources connected.
> Additionally, the "Entity Tracking Time Window" can be increased to account 
> for longer time frames. However, this only betters the situation somewhat and 
> does not solve the problem. Also there is a limit to this, as it increases 
> the memory needed.
> h1. Proposal
> This issue proposes introducing the notion of an "Entity Tracking Strategy", 
> whereby the current behavior could be understood as "Tracking Time Window".
> A new strategy of "Tracking Recently Listed" is added. Unlike the 
> existing "Tracking Time Window" strategy, this would not impose any 
> prerequisites on the entities regarding their timestamp (see 
> {{minTimestampToList}}). Instead, all entities would be considered. 
> However, this strategy needs a way to limit / clean the entity cache as well. 
> Instead of removing entries that leave the time window, the strategy should 
> remember only the last N listed entities. That is, every time an entity is 
> listed, it is moved to the front of an "ordered list". In case the entity has 
> been listed before, its entry in the "list" is moved to the front 
> nonetheless. After every listing, only the first up to N entries are kept. 
> All other, less recently listed entities, are removed from cache.
> While this strategy solves the problem of listing "old" entities it comes 
> with its own downsides. The NiFi user has to configure a sensible value for 
> the amount N of maximum cache entries. Failing to do so can result in listing 
> an entity more than once (similar to "No Tracking"), when the source provides 
> more than N entities at once (e.g. due to a load peak) and the entities are 
> not removed from the source until the next listing. Thus this approach works 
> best for Flows where the entity is removed from the source shortly after the 
> listing. 

[jira] [Created] (NIFI-12595) Introduce an "Entity Tracking Strategy" to ListedEntityTracker

2024-01-10 Thread endzeit (Jira)
endzeit created NIFI-12595:
--

 Summary: Introduce an "Entity Tracking Strategy" to 
ListedEntityTracker
 Key: NIFI-12595
 URL: https://issues.apache.org/jira/browse/NIFI-12595
 Project: Apache NiFi
  Issue Type: New Feature
Affects Versions: 1.24.0
Reporter: endzeit


h1. Situation

The existing {{ListX}} processors support different "Listing Strategies". One 
commonly used  "Listing Strategy" is "Tracking Entities" whereby crucial 
information of all recently listed entities, e.g. files, is remembered.
On every listing, in case information to an entity has been remembered before, 
the entity is not listed again (unless it was modified).

This has several benefits over other available "Listing Strategies". For 
example, unlike with "No Tracking" the same entity is not listed repeatedly.  
Other than "Tracking Timestamps", entities with an older timestamp than ones 
previously listed can be picked up. 
However, the strategy comes with its own problems.

h1. Problem

Due to the ever-present constraints on available memory and performance, 
entities cannot be tracked indefinitely.
That's why the {{ListedEntityTracker}}, used for implementing "Tracking 
Entities" by most processors, introduces the notion of an "Entity Tracking Time 
Window".
All remembered entities that are out of the time window (they are older than 
the current time minus the time window) are removed from the tracking cache, to 
limit memory use. Additionally, not yet listed entities that are out of the 
time window are exempt from listing, as they would be removed from the "cache" 
on the next run immediately, resulting in them being listed over and over. 

However, this results in entities "older" than the specified "Entity Tracking 
Time Window" not being picked up. For example, suppose entities are listed from 
a remote server and this server is unavailable for some time. Once the server 
is available again, the listing continues. However, all entities / files that 
were created before the defined time window will be silently ignored.

As of now, this can be solved by manual intervention, re-starting the ListX 
processor. The 
"Entity Tracking Time Window" can be ignored upon initial listing, when the 
"Entity Tracking Initial Listing Target" is set to "All Available" (default).

However, this requires the NiFi user to be aware of lingering old entities 
being available on the connected remote source. Additionally, the need for 
manual intervention might be undesired / impractical when a large number of 
sources are connected.

Additionally, the "Entity Tracking Time Window" can be increased to account for 
longer time frames. However, this only improves the situation somewhat and does 
not solve the problem. There is also a limit to this, as it increases the 
memory needed.

h1. Proposal

This issue proposes introducing the notion of an "Entity Tracking Strategy", 
whereby the current behavior could be understood as "Tracking Time Window".

A new strategy of "Tracking Recently Listed" is added. Unlike the 
existing "Tracking Time Window" strategy, this would not impose any 
prerequisites on the entities regarding their timestamp (see 
{{minTimestampToList}}). Instead, all entities would be considered. 
However, this strategy needs a way to limit / clean the entity cache as well. 
Instead of removing entries that leave the time window, the strategy should 
remember only the last N listed entities. That is, every time an entity is 
listed, it is moved to the front of an "ordered list". In case the entity has 
been listed before, its entry in the "list" is moved to the front nonetheless. 
After every listing, only the first up to N entries are kept. All other, less 
recently listed entities are removed from the cache.

While this strategy solves the problem of listing "old" entities it comes with 
its own downsides. The NiFi user has to configure a sensible value for the 
amount N of maximum cache entries. Failing to do so can result in listing an 
entity more than once (similar to "No Tracking"), when the source provides more 
than N entities at once (e.g. due to a load peak) and the entities are not 
removed from the source until the next listing. Thus this approach works best 
for Flows where the entity is removed from the source shortly after the listing. 
This behavior should be mentioned in the documentation of the property.
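
The "ordered list" of the last N listed entities resembles a least-recently-used cache. The sketch below is one possible reading, not NiFi code: the class {{RecentlyListedCacheSketch}} and its methods are hypothetical, and for brevity it evicts on insertion rather than after the whole listing as described above. It uses an access-ordered {{java.util.LinkedHashMap}} with {{removeEldestEntry}}.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of "Tracking Recently Listed"; not NiFi code.
final class RecentlyListedCacheSketch {
    private final LinkedHashMap<String, Long> cache;

    RecentlyListedCacheSketch(final int maxEntries) {
        // accessOrder = true: re-listing an entity moves it to the most recent position.
        this.cache = new LinkedHashMap<String, Long>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns true if the entity was not cached before; either way it becomes the most recent entry. */
    boolean list(String id, long timestamp) {
        return cache.put(id, timestamp) == null;
    }

    boolean isTracked(String id) {
        return cache.containsKey(id);
    }
}
```

With N = 2 and a source that holds three entities which are never removed, each listing evicts an entity that the next listing reports as new again. This is the duplicate listing described above for a limit that is too small.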





[jira] [Updated] (NIFI-12595) Introduce an "Entity Tracking Strategy" and "Tracking Recently Listed" to ListedEntityTracker

2024-01-10 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit updated NIFI-12595:
---
Summary: Introduce an "Entity Tracking Strategy" and "Tracking Recently 
Listed" to ListedEntityTracker  (was: Introduce an "Entity Tracking Strategy" 
to ListedEntityTracker)

> Introduce an "Entity Tracking Strategy" and "Tracking Recently Listed" to 
> ListedEntityTracker
> -
>
> Key: NIFI-12595
> URL: https://issues.apache.org/jira/browse/NIFI-12595
> Project: Apache NiFi
>  Issue Type: New Feature
>Affects Versions: 1.24.0
>Reporter: endzeit
>Priority: Major
>
> h1. Situation
> The existing {{ListX}} processors support different "Listing Strategies". One 
> commonly used "Listing Strategy" is "Tracking Entities", whereby crucial 
> information about all recently listed entities, e.g. files, is remembered.
> On every listing, if information about an entity has been remembered 
> before, the entity is not listed again (unless it was modified).
> This has several benefits over other available "Listing Strategies". For 
> example, unlike with "No Tracking", the same entity is not listed repeatedly. 
> Unlike with "Tracking Timestamps", entities with an older timestamp than ones 
> previously listed can be picked up. 
> However, the strategy comes with its own problems.
> h1. Problem
> Due to the ever-present constraints on available memory and performance, 
> entities cannot be tracked indefinitely.
> That's why the {{ListedEntityTracker}}, used for implementing "Tracking 
> Entities" by most processors, introduces the notion of an "Entity Tracking 
> Time Window".
> All remembered entities that are out of the time window (they are older than 
> the current time minus the time window) are removed from the tracking cache, 
> to limit memory use. Additionally, not yet listed entities that are out of 
> the time window are exempt from listing, as they would be removed from the 
> "cache" on the next run immediately, resulting in them being listed over and 
> over. 
> However, this results in entities "older" than the specified "Entity Tracking 
> Time Window" not being picked up. For example, suppose entities are listed from 
> a remote server and this server is unavailable for some time. Once the 
> server is available again, the listing continues. However, all entities / 
> files that were created before the defined time window will be silently 
> ignored.
> As of now, this can be solved by manual intervention, i.e. restarting the ListX 
> processor. The "Entity Tracking Time Window" can be ignored upon initial 
> listing when the "Entity Tracking Initial Listing Target" is set to 
> "All Available" (default).
> However, this requires the NiFi user to be aware of old entities lingering 
> on the connected remote source. Additionally, the need for manual 
> intervention might be undesired / impractical when many sources are 
> connected.
> Alternatively, the "Entity Tracking Time Window" can be increased to account 
> for longer time frames. However, this only mitigates the situation and 
> does not solve the problem. There is also a limit to this, as it increases 
> the memory needed.
> h1. Proposal
> This issue proposes introducing the notion of an "Entity Tracking Strategy", 
> whereby the current behavior could be understood as "Tracking Time Window".
> A new strategy of "Tracking Recently Listed" is added. Unlike the 
> existing "Tracking Time Window" strategy, this would not impose any 
> prerequisites on the entities regarding their timestamp (see 
> {{minTimestampToList}}). Instead, all entities would be considered. 
> However, this strategy needs a way to limit / clean the entity cache as well. 
> Instead of removing entries that leave the time window, the strategy should 
> remember only the last N listed entities. That is, every time an entity is 
> listed, it is moved to the front of an "ordered list". In case the entity has 
> been listed before, its entry in the "list" is moved to the front 
> nonetheless. After every listing, at most the first N entries are kept. 
> All other, less recently listed entities are removed from the cache.
> While this strategy solves the problem of listing "old" entities, it comes 
> with its own downsides. The NiFi user has to configure a sensible value for 
> the maximum number N of cache entries. Failing to do so can result in listing 
> an entity more than once (similar to "No Tracking") when the source provides 
> more than N entities at once (e.g. due to a load peak) and the entities are 
> not removed from the source until the next listing. Thus, this approach works 
> best for flows where the entity is removed from the source shortly after the 
> listing. This behavior 

[jira] [Assigned] (NIFI-12089) CSVReader - bad documentation

2024-01-09 Thread endzeit (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-12089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

endzeit reassigned NIFI-12089:
--

Assignee: endzeit

> CSVReader - bad documentation
> -
>
> Key: NIFI-12089
> URL: https://issues.apache.org/jira/browse/NIFI-12089
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Examples
>Affects Versions: 1.20.0
>Reporter: Mermillod
>Assignee: endzeit
>Priority: Trivial
> Attachments: image-2023-09-19-17-34-44-532.png
>
>
> This Avro sample in the CSVReader documentation is invalid:
> !image-2023-09-19-17-34-44-532.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (NIFI-10972) PutBigQuery - Invalid project resource name projects/${Project}

2024-01-09 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17804847#comment-17804847
 ] 

endzeit edited comment on NIFI-10972 at 1/9/24 7:02 PM:


@[~hipotures]: 

The property "Project ID" of the processor PutBigQuery does not support 
evaluating FlowFile attributes, as can be read [in the 
documentation|https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-gcp-nar/1.23.2/org.apache.nifi.processors.gcp.bigquery.PutBigQuery/index.html]:
{noformat}
Google Cloud Project ID
Supports Expression Language: true (will be evaluated using variable registry 
only) {noformat}
Please note that NiFi 2.x will no longer support the variable registry. Thus 
the documentation for this version states:
{noformat}
Google Cloud Project ID
Supports Expression Language: true (will be evaluated using Environment 
variables only) {noformat}
 

Thus, the behavior is documented, intended and not a bug. In case you want this 
new functionality, feel free to convert the type of this Jira issue to 
"Improvement" and provide a pull request with the required changes; see the 
[contributor 
guide|https://cwiki.apache.org/confluence/display/NIFI/Contributor+Guide#ContributorGuide-HowtocontributetoApacheNiFi]
 on how to get started.


> PutBigQuery - Invalid project resource name projects/${Project}
> ---
>
> Key: NIFI-10972
> URL: https://issues.apache.org/jira/browse/NIFI-10972
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.19.0, 1.19.1
> Environment: Debian
>Reporter: hipotures
>Priority: Major
>
> Processor: *PutBigQuery* 
> Parameter: *Project ID*
> Value for Project ID: *${Project}*
> Problem: *attributes are unevaluated*
> Error log:
> {code:java}
> 2022-12-13 10:40:18,955 ERROR [Timer-Driven Process Thread-9] 
> o.a.n.p.gcp.bigquery.PutBigQuery 
> PutBigQuery[id=121711dc-1182-1bfc-7f3d-6fb3bcfc7d0b] Processing halted: 
> yielding [1 sec]
> com.google.api.gax.rpc.InvalidArgumentException: 
> io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Invalid project resource 
> name projects/${Project}; Project id: ${Project}
>     at 
> com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:92)
>     at 
> com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:41)
>     at 
> com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:86)
>     at 
> com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:66)
>     at 
> com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
>     at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:67)
>     at 
> com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1132)
>     at 
> com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
>     at 
> com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
>     at 
> com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
>     at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
>     at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:563)
>     at 
> io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:533)
>     at 
> io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
>     at 
> io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
>     at 
> 

[jira] [Commented] (NIFI-10972) PutBigQuery - Invalid project resource name projects/${Project}

2024-01-09 Thread endzeit (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17804847#comment-17804847
 ] 

endzeit commented on NIFI-10972:


The property "Project ID" of the processor PutBigQuery does not support 
evaluating FlowFile attributes, as can be read [in the 
documentation|https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-gcp-nar/1.23.2/org.apache.nifi.processors.gcp.bigquery.PutBigQuery/index.html]:
{noformat}
Google Cloud Project ID
Supports Expression Language: true (will be evaluated using variable registry 
only) {noformat}
Please note that NiFi 2.x will no longer support the variable registry. Thus 
the documentation for this version states:
{noformat}
Google Cloud Project ID
Supports Expression Language: true (will be evaluated using Environment 
variables only) {noformat}
 

Thus, the behavior is documented, intended and not a bug. In case you want this 
new functionality, feel free to convert the type of this Jira issue to 
"Improvement" and provide a pull request with the required changes; see the 
[contributor 
guide|https://cwiki.apache.org/confluence/display/NIFI/Contributor+Guide#ContributorGuide-HowtocontributetoApacheNiFi]
 on how to get started.

> PutBigQuery - Invalid project resource name projects/${Project}
> ---
>
> Key: NIFI-10972
> URL: https://issues.apache.org/jira/browse/NIFI-10972
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.19.0, 1.19.1
> Environment: Debian
>Reporter: hipotures
>Priority: Major
>
> Processor: *PutBigQuery* 
> Parameter: *Project ID*
> Value for Project ID: *${Project}*
> Problem: *attributes are unevaluated*
> Error log:
> {code:java}
> 2022-12-13 10:40:18,955 ERROR [Timer-Driven Process Thread-9] 
> o.a.n.p.gcp.bigquery.PutBigQuery 
> PutBigQuery[id=121711dc-1182-1bfc-7f3d-6fb3bcfc7d0b] Processing halted: 
> yielding [1 sec]
> com.google.api.gax.rpc.InvalidArgumentException: 
> io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Invalid project resource 
> name projects/${Project}; Project id: ${Project}
>     at 
> com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:92)
>     at 
> com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:41)
>     at 
> com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:86)
>     at 
> com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:66)
>     at 
> com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
>     at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:67)
>     at 
> com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1132)
>     at 
> com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
>     at 
> com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
>     at 
> com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
>     at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
>     at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:563)
>     at 
> io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:533)
>     at 
> io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
>     at 
> io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
>     at 
> io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
>     at 
> com.google.api.gax.grpc.ChannelPool$ReleasingClientCall$1.onClose(ChannelPool.java:535)
>     at 
> io.grpc.internal.DelayedClientCall$DelayedListener$3.run(DelayedClientCall.java:463)
>     at 
> io.grpc.internal.DelayedClientCall$DelayedListener.delayOrExecute(DelayedClientCall.java:427)
>     at 
> io.grpc.internal.DelayedClientCall$DelayedListener.onClose(DelayedClientCall.java:460)
>     at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:562)
>     at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
>     at 
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:743)
>     at 
> io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:722)
>     at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>     at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at 
> 
