[jira] [Updated] (NIFI-11670) Encrypted Content Repository is very slow when FlowFiles have a non-zero Content Claim Offset

2023-06-08 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11670:

Status: Patch Available  (was: Open)

> Encrypted Content Repository is very slow when FlowFiles have a non-zero 
> Content Claim Offset
> ----------------------------------------------------------------------------------------------
>
> Key: NIFI-11670
> URL: https://issues.apache.org/jira/browse/NIFI-11670
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> To replicate, create a flow with GenerateFlowFile -> MergeContent.
> Configure GenerateFlowFile to generate 25 KB FlowFiles with a batch size of 
> 1,000. It is important, in order to replicate the issue, that a batch size be 
> used. Configure MergeContent to merge bins of 1,000 FlowFiles.
> Merging the files with the unencrypted/default content repository takes 
> milliseconds; with the Encrypted Repository it takes nearly a minute.
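The slowdown is consistent with the decrypting stream not being seekable: reaching a FlowFile whose Content Claim starts at a non-zero offset means decrypting and discarding every preceding byte of the resource claim, whereas a plain file can seek directly. A minimal Python sketch of the access-pattern difference (illustrative only, not NiFi's implementation):

```python
import io

def read_claim_seekable(f: io.BytesIO, offset: int, length: int) -> bytes:
    # Plain content repository: jump straight to the claim in O(1).
    f.seek(offset)
    return f.read(length)

def read_claim_streaming(stream: io.BytesIO, offset: int, length: int) -> bytes:
    # Simulates a decrypting stream: every byte before the offset must be
    # read (decrypted) and thrown away before the claim can be returned.
    remaining = offset
    while remaining > 0:
        skipped = stream.read(min(8192, remaining))  # decrypt-and-discard
        remaining -= len(skipped)
    return stream.read(length)

data = bytes(range(256)) * 1000          # stand-in for one resource claim file
plain = read_claim_seekable(io.BytesIO(data), offset=200_000, length=16)
enc = read_claim_streaming(io.BytesIO(data), offset=200_000, length=16)
assert plain == enc                      # same bytes, very different cost
```

With many small FlowFiles packed into one claim, the discarded prefix grows for each merge participant, which matches the observed near-minute merge times.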



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (NIFI-4957) Enable JoltTransformJSON to pickup a Jolt Spec file from a file location

2023-06-08 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-4957:
---
Fix Version/s: 1.latest
   2.latest

> Enable JoltTransformJSON to pickup a Jolt Spec file from a file location
> ------------------------------------------------------------------------
>
> Key: NIFI-4957
> URL: https://issues.apache.org/jira/browse/NIFI-4957
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Ryan Hendrickson
>Assignee: Ryan Hendrickson
>Priority: Minor
> Fix For: 1.latest, 2.latest
>
> Attachments: image-2018-03-09-23-56-43-912.png
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Add a property to allow the Jolt Spec to be read from a file on disk and/or 
> the classpath.
> !image-2018-03-09-23-56-43-912.png!
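One way the requested property could behave is to accept either an inline spec or a path, preferring the file when it exists on disk. A hypothetical Python sketch (the function name and fallback order are assumptions, not the processor's actual API):

```python
import json
from pathlib import Path

def resolve_jolt_spec(spec_property: str) -> list:
    """Return the parsed Jolt spec, reading from disk if the value is a path."""
    candidate = Path(spec_property)
    if candidate.is_file():                  # a file on disk takes precedence
        return json.loads(candidate.read_text())
    return json.loads(spec_property)         # otherwise treat as inline JSON

inline = '[{"operation": "shift", "spec": {"a": "b"}}]'
assert resolve_jolt_spec(inline)[0]["operation"] == "shift"
```

Classpath resolution would need a separate lookup step, since a classpath entry is not generally a filesystem path.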



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (NIFI-11647) org.apache.nifi.serialization.record.util.DataTypeUtils.getSQLTypeValue does not map UUID RecordFieldType

2023-06-08 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11647:

Fix Version/s: 2.0.0
   1.23.0
   (was: 1.latest)
   (was: 2.latest)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> org.apache.nifi.serialization.record.util.DataTypeUtils.getSQLTypeValue does 
> not map UUID RecordFieldType
> ---------------------------------------------------------------------------------------------------------
>
> Key: NIFI-11647
> URL: https://issues.apache.org/jira/browse/NIFI-11647
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Sander Bylemans
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 2.0.0, 1.23.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> org.apache.nifi.serialization.record.util.DataTypeUtils.getSQLTypeValue does 
> not map the RecordFieldType UUID.
> This causes PutDatabaseRecord to fail when attempting to INSERT a FlowFile 
> containing a UUID logical type, with the error message:
> {code}
> 2023-06-05 22:03:29,872 ERROR [Timer-Driven Process Thread-11] 
> o.a.n.p.standard.PutDatabaseRecord 
> PutDatabaseRecord[id=8f505e85-8058-3714-ac24-aaeeb5efc6a3] Failed to put 
> Records to database for 
> StandardFlowFileRecord[uuid=cedad728-117a-4235-9251-ded3b7580b7b,claim=StandardContentClaim
>  [resourceClaim=StandardResourceClaim[id=1685995389355-150, 
> container=default, section=150], offset=2643, 
> length=6551],offset=0,name=fase_3.2.23_00699164_00699164.parquet,size=4991].
>  Routing to failure.
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException: 
> Cannot convert unknown type UUID
>   at 
> org.apache.nifi.serialization.record.util.DataTypeUtils.getSQLTypeValue(DataTypeUtils.java:2148)
>   at 
> org.apache.nifi.processors.standard.PutDatabaseRecord.executeDML(PutDatabaseRecord.java:723)
>   at 
> org.apache.nifi.processors.standard.PutDatabaseRecord.putToDatabase(PutDatabaseRecord.java:970)
>   at 
> org.apache.nifi.processors.standard.PutDatabaseRecord.onTrigger(PutDatabaseRecord.java:493)
>   at 
> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>   at 
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1356)
>   at 
> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:246)
>   at 
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102)
>   at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at 
> java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
>   at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:829)
> {code}
> Possibly more types are affected.
> This was added in https://issues.apache.org/jira/browse/NIFI-9981, where there 
> was already concern about the need to map UUID.
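The failure mode reduces to a type-code lookup with a missing key. A Python sketch of the gap and the fix (an illustrative stand-in for the Java logic in DataTypeUtils.getSQLTypeValue; the numeric constants are the JDK's java.sql.Types values, though the actual fix may map UUID to a different SQL type):

```python
# java.sql.Types constants: VARCHAR = 12, INTEGER = 4, OTHER = 1111.
SQL_TYPE_FOR_FIELD = {
    "STRING": 12,    # java.sql.Types.VARCHAR
    "INT": 4,        # java.sql.Types.INTEGER
    "UUID": 1111,    # java.sql.Types.OTHER -- the previously missing mapping
}

def get_sql_type_value(field_type: str) -> int:
    try:
        return SQL_TYPE_FOR_FIELD[field_type]
    except KeyError:
        # Mirrors the IllegalTypeConversionException in the stack trace above.
        raise ValueError(f"Cannot convert unknown type {field_type}")

assert get_sql_type_value("UUID") == 1111
```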



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (NIFI-11666) ModifyCompression Missing Exception in Error Log

2023-06-08 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11666:

Fix Version/s: 1.23.0
   (was: 1.latest)

> ModifyCompression Missing Exception in Error Log
> ------------------------------------------------
>
> Key: NIFI-11666
> URL: https://issues.apache.org/jira/browse/NIFI-11666
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.22.0
>Reporter: David Handermann
>Assignee: David Handermann
>Priority: Trivial
> Fix For: 2.0.0, 1.23.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The {{ModifyCompression}} Processor does not include the Exception as an 
> argument when logging an error on processing failures. This was an 
> inadvertent omission when finalizing the initial version of the Processor and 
> should be corrected.
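The difference is a single logger argument. A Python logging sketch of the omission (illustrative only; the processor itself logs through NiFi's ComponentLog in Java):

```python
import logging

logger = logging.getLogger("ModifyCompression")

def on_failure_without_cause(e: Exception) -> None:
    # The exception is dropped: the message is logged with no stack trace.
    logger.error("Compression modification failed")

def on_failure_with_cause(e: Exception) -> None:
    # Passing the exception preserves the cause and full stack trace.
    logger.error("Compression modification failed", exc_info=e)
```

Without the exception argument, operators see only the generic message and lose the underlying cause of the failure.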



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (NIFI-11666) ModifyCompression Missing Exception in Error Log

2023-06-08 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11666:

Fix Version/s: 2.0.0
   (was: 2.latest)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> ModifyCompression Missing Exception in Error Log
> ------------------------------------------------
>
> Key: NIFI-11666
> URL: https://issues.apache.org/jira/browse/NIFI-11666
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.22.0
>Reporter: David Handermann
>Assignee: David Handermann
>Priority: Trivial
> Fix For: 2.0.0, 1.latest
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The {{ModifyCompression}} Processor does not include the Exception as an 
> argument when logging an error on processing failures. This was an 
> inadvertent omission when finalizing the initial version of the Processor and 
> should be corrected.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (NIFI-11655) GenerateRecord doesn't generate floats and doubles correctly when a schema is supplied

2023-06-07 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11655:

Status: Patch Available  (was: In Progress)

> GenerateRecord doesn't generate floats and doubles correctly when a schema is 
> supplied
> --
>
> Key: NIFI-11655
> URL: https://issues.apache.org/jira/browse/NIFI-11655
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a schema is supplied to GenerateRecord via the Schema Text property and 
> it contains either float or double fields, the processor fails with an error:
> 2023-06-06 15:10:36,271 ERROR [Timer-Driven Process Thread-7] 
> o.a.n.processors.standard.GenerateRecord 
> GenerateRecord[id=9201dbe8-0188-1000-6d56-74ba1fc1e732] Processing failed
> org.apache.nifi.processor.exception.ProcessException: Record generation failed
>   at 
> org.apache.nifi.processors.standard.GenerateRecord.onTrigger(GenerateRecord.java:274)
>   at 
> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>   at 
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1360)
>   at 
> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:243)
>   at 
> org.apache.nifi.controller.scheduling.AbstractTimeBasedSchedulingAgent.lambda$doScheduleOnce$0(AbstractTimeBasedSchedulingAgent.java:59)
>   at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: java.lang.ClassCastException: class 
> org.apache.nifi.serialization.record.DataType cannot be cast to class 
> org.apache.nifi.serialization.record.type.DecimalDataType 
> (org.apache.nifi.serialization.record.DataType and 
> org.apache.nifi.serialization.record.type.DecimalDataType are in unnamed 
> module of loader org.apache.nifi.nar.NarClassLoader @7fd987ef)
>   at 
> org.apache.nifi.processors.standard.GenerateRecord.generateValueFromRecordField(GenerateRecord.java:316)
>   at 
> org.apache.nifi.processors.standard.GenerateRecord.lambda$onTrigger$0(GenerateRecord.java:238)
>   at 
> org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:3138)
>   at 
> org.apache.nifi.processors.standard.GenerateRecord.onTrigger(GenerateRecord.java:210)
> This is because GenerateRecord handles floats, doubles, and decimals the same 
> way, attempting to treat them all as DecimalDataTypes, even though floats and 
> doubles are not compatible with that type (they have their own distinct data 
> types). The cases should be handled separately, and unit tests added/augmented 
> to verify.
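A sketch of the separated branches, using Python stand-ins for NiFi's DataType classes (illustrative only, not the actual GenerateRecord code):

```python
import random
from decimal import Decimal

class DataType:
    # Stand-in for org.apache.nifi.serialization.record.DataType
    def __init__(self, name: str):
        self.name = name

class DecimalDataType(DataType):
    # Stand-in for DecimalDataType: carries precision/scale the base type lacks,
    # which is why casting a plain FLOAT/DOUBLE DataType to it fails.
    def __init__(self, precision: int, scale: int):
        super().__init__("DECIMAL")
        self.precision, self.scale = precision, scale

def generate_value(dtype: DataType):
    if isinstance(dtype, DecimalDataType):      # only true decimals are cast
        unscaled = random.randint(0, 10 ** dtype.precision - 1)
        return Decimal(unscaled) / (10 ** dtype.scale)
    if dtype.name in ("FLOAT", "DOUBLE"):       # separate branch: no cast
        return random.uniform(0.0, 1.0)
    raise ValueError(f"Unsupported type {dtype.name}")

assert isinstance(generate_value(DataType("FLOAT")), float)
assert isinstance(generate_value(DecimalDataType(5, 2)), Decimal)
```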



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (NIFI-11647) org.apache.nifi.serialization.record.util.DataTypeUtils.getSQLTypeValue does not map UUID RecordFieldType

2023-06-07 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11647:

Fix Version/s: 1.latest
   2.latest
   Status: Patch Available  (was: In Progress)

> org.apache.nifi.serialization.record.util.DataTypeUtils.getSQLTypeValue does 
> not map UUID RecordFieldType
> ---------------------------------------------------------------------------------------------------------
>
> Key: NIFI-11647
> URL: https://issues.apache.org/jira/browse/NIFI-11647
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Sander Bylemans
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> org.apache.nifi.serialization.record.util.DataTypeUtils.getSQLTypeValue does 
> not map the RecordFieldType UUID.
> This causes PutDatabaseRecord to fail when attempting to INSERT a FlowFile 
> containing a UUID logical type, with the error message:
> {code}
> 2023-06-05 22:03:29,872 ERROR [Timer-Driven Process Thread-11] 
> o.a.n.p.standard.PutDatabaseRecord 
> PutDatabaseRecord[id=8f505e85-8058-3714-ac24-aaeeb5efc6a3] Failed to put 
> Records to database for 
> StandardFlowFileRecord[uuid=cedad728-117a-4235-9251-ded3b7580b7b,claim=StandardContentClaim
>  [resourceClaim=StandardResourceClaim[id=1685995389355-150, 
> container=default, section=150], offset=2643, 
> length=6551],offset=0,name=fase_3.2.23_00699164_00699164.parquet,size=4991].
>  Routing to failure.
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException: 
> Cannot convert unknown type UUID
>   at 
> org.apache.nifi.serialization.record.util.DataTypeUtils.getSQLTypeValue(DataTypeUtils.java:2148)
>   at 
> org.apache.nifi.processors.standard.PutDatabaseRecord.executeDML(PutDatabaseRecord.java:723)
>   at 
> org.apache.nifi.processors.standard.PutDatabaseRecord.putToDatabase(PutDatabaseRecord.java:970)
>   at 
> org.apache.nifi.processors.standard.PutDatabaseRecord.onTrigger(PutDatabaseRecord.java:493)
>   at 
> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>   at 
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1356)
>   at 
> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:246)
>   at 
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102)
>   at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at 
> java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
>   at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:829)
> {code}
> Possibly more types are affected.
> This was added in https://issues.apache.org/jira/browse/NIFI-9981, where there 
> was already concern about the need to map UUID.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (NIFI-11655) GenerateRecord doesn't generate floats and doubles correctly when a schema is supplied

2023-06-06 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11655:
---

Assignee: Matt Burgess

> GenerateRecord doesn't generate floats and doubles correctly when a schema is 
> supplied
> --
>
> Key: NIFI-11655
> URL: https://issues.apache.org/jira/browse/NIFI-11655
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>
> When a schema is supplied to GenerateRecord via the Schema Text property and 
> it contains either float or double fields, the processor fails with an error:
> 2023-06-06 15:10:36,271 ERROR [Timer-Driven Process Thread-7] 
> o.a.n.processors.standard.GenerateRecord 
> GenerateRecord[id=9201dbe8-0188-1000-6d56-74ba1fc1e732] Processing failed
> org.apache.nifi.processor.exception.ProcessException: Record generation failed
>   at 
> org.apache.nifi.processors.standard.GenerateRecord.onTrigger(GenerateRecord.java:274)
>   at 
> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>   at 
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1360)
>   at 
> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:243)
>   at 
> org.apache.nifi.controller.scheduling.AbstractTimeBasedSchedulingAgent.lambda$doScheduleOnce$0(AbstractTimeBasedSchedulingAgent.java:59)
>   at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: java.lang.ClassCastException: class 
> org.apache.nifi.serialization.record.DataType cannot be cast to class 
> org.apache.nifi.serialization.record.type.DecimalDataType 
> (org.apache.nifi.serialization.record.DataType and 
> org.apache.nifi.serialization.record.type.DecimalDataType are in unnamed 
> module of loader org.apache.nifi.nar.NarClassLoader @7fd987ef)
>   at 
> org.apache.nifi.processors.standard.GenerateRecord.generateValueFromRecordField(GenerateRecord.java:316)
>   at 
> org.apache.nifi.processors.standard.GenerateRecord.lambda$onTrigger$0(GenerateRecord.java:238)
>   at 
> org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:3138)
>   at 
> org.apache.nifi.processors.standard.GenerateRecord.onTrigger(GenerateRecord.java:210)
> This is because GenerateRecord handles floats, doubles, and decimals the same 
> way, attempting to treat them all as DecimalDataTypes, even though floats and 
> doubles are not compatible with that type (they have their own distinct data 
> types). The cases should be handled separately, and unit tests added/augmented 
> to verify.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (NIFI-11655) GenerateRecord doesn't generate floats and doubles correctly when a schema is supplied

2023-06-06 Thread Matt Burgess (Jira)
Matt Burgess created NIFI-11655:
---

 Summary: GenerateRecord doesn't generate floats and doubles 
correctly when a schema is supplied
 Key: NIFI-11655
 URL: https://issues.apache.org/jira/browse/NIFI-11655
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Reporter: Matt Burgess
 Fix For: 1.latest, 2.latest


When a schema is supplied to GenerateRecord via the Schema Text property and it 
contains either float or double fields, the processor fails with an error:

2023-06-06 15:10:36,271 ERROR [Timer-Driven Process Thread-7] 
o.a.n.processors.standard.GenerateRecord 
GenerateRecord[id=9201dbe8-0188-1000-6d56-74ba1fc1e732] Processing failed
org.apache.nifi.processor.exception.ProcessException: Record generation failed
at 
org.apache.nifi.processors.standard.GenerateRecord.onTrigger(GenerateRecord.java:274)
at 
org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at 
org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1360)
at 
org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:243)
at 
org.apache.nifi.controller.scheduling.AbstractTimeBasedSchedulingAgent.lambda$doScheduleOnce$0(AbstractTimeBasedSchedulingAgent.java:59)
at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at 
java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.ClassCastException: class 
org.apache.nifi.serialization.record.DataType cannot be cast to class 
org.apache.nifi.serialization.record.type.DecimalDataType 
(org.apache.nifi.serialization.record.DataType and 
org.apache.nifi.serialization.record.type.DecimalDataType are in unnamed module 
of loader org.apache.nifi.nar.NarClassLoader @7fd987ef)
at 
org.apache.nifi.processors.standard.GenerateRecord.generateValueFromRecordField(GenerateRecord.java:316)
at 
org.apache.nifi.processors.standard.GenerateRecord.lambda$onTrigger$0(GenerateRecord.java:238)
at 
org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:3138)
at 
org.apache.nifi.processors.standard.GenerateRecord.onTrigger(GenerateRecord.java:210)


This is because GenerateRecord handles floats, doubles, and decimals the same 
way, attempting to treat them all as DecimalDataTypes, even though floats and 
doubles are not compatible with that type (they have their own distinct data 
types). The cases should be handled separately, and unit tests added/augmented 
to verify.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (NIFI-11647) org.apache.nifi.serialization.record.util.DataTypeUtils.getSQLTypeValue does not map UUID RecordFieldType

2023-06-05 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11647:
---

Assignee: Matt Burgess

> org.apache.nifi.serialization.record.util.DataTypeUtils.getSQLTypeValue does 
> not map UUID RecordFieldType
> ---------------------------------------------------------------------------------------------------------
>
> Key: NIFI-11647
> URL: https://issues.apache.org/jira/browse/NIFI-11647
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Sander Bylemans
>Assignee: Matt Burgess
>Priority: Major
>
> org.apache.nifi.serialization.record.util.DataTypeUtils.getSQLTypeValue does 
> not map the RecordFieldType UUID.
> This causes PutDatabaseRecord to fail when attempting to INSERT a FlowFile 
> containing a UUID logical type, with the error message:
> {code}
> 2023-06-05 22:03:29,872 ERROR [Timer-Driven Process Thread-11] 
> o.a.n.p.standard.PutDatabaseRecord 
> PutDatabaseRecord[id=8f505e85-8058-3714-ac24-aaeeb5efc6a3] Failed to put 
> Records to database for 
> StandardFlowFileRecord[uuid=cedad728-117a-4235-9251-ded3b7580b7b,claim=StandardContentClaim
>  [resourceClaim=StandardResourceClaim[id=1685995389355-150, 
> container=default, section=150], offset=2643, 
> length=6551],offset=0,name=fase_3.2.23_00699164_00699164.parquet,size=4991].
>  Routing to failure.
> org.apache.nifi.serialization.record.util.IllegalTypeConversionException: 
> Cannot convert unknown type UUID
>   at 
> org.apache.nifi.serialization.record.util.DataTypeUtils.getSQLTypeValue(DataTypeUtils.java:2148)
>   at 
> org.apache.nifi.processors.standard.PutDatabaseRecord.executeDML(PutDatabaseRecord.java:723)
>   at 
> org.apache.nifi.processors.standard.PutDatabaseRecord.putToDatabase(PutDatabaseRecord.java:970)
>   at 
> org.apache.nifi.processors.standard.PutDatabaseRecord.onTrigger(PutDatabaseRecord.java:493)
>   at 
> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>   at 
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1356)
>   at 
> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:246)
>   at 
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102)
>   at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at 
> java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
>   at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:829)
> {code}
> Possibly more types are affected.
> This was added in https://issues.apache.org/jira/browse/NIFI-9981, where there 
> was already concern about the need to map UUID.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (NIFI-11646) Deprecate Lua and Ruby Script Engines

2023-06-05 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11646:

Status: Patch Available  (was: In Progress)

> Deprecate Lua and Ruby Script Engines
> -------------------------------------
>
> Key: NIFI-11646
> URL: https://issues.apache.org/jira/browse/NIFI-11646
> Project: Apache NiFi
>  Issue Type: Task
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The {{lua}} and {{ruby}} Script Engines for multiple scripted Processors and 
> Controller Services should be deprecated for removal in NiFi 2.0. The engines 
> are not often used, and because Lua and Ruby are not JVM-native languages, 
> their capabilities are more limited than those of a JVM-native language such 
> as Groovy. This Jira continues the idea of NIFI-11630, but for the Lua and 
> Ruby script engines.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (NIFI-11646) Deprecate Lua and Ruby Script Engines

2023-06-05 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11646:
---

Assignee: Matt Burgess

> Deprecate Lua and Ruby Script Engines
> -------------------------------------
>
> Key: NIFI-11646
> URL: https://issues.apache.org/jira/browse/NIFI-11646
> Project: Apache NiFi
>  Issue Type: Task
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest
>
>
> The {{lua}} and {{ruby}} Script Engines for multiple scripted Processors and 
> Controller Services should be deprecated for removal in NiFi 2.0. The engines 
> are not often used, and because Lua and Ruby are not JVM-native languages, 
> their capabilities are more limited than those of a JVM-native language such 
> as Groovy. This Jira continues the idea of NIFI-11630, but for the Lua and 
> Ruby script engines.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (NIFI-11646) Deprecate Lua and Ruby Script Engines

2023-06-05 Thread Matt Burgess (Jira)
Matt Burgess created NIFI-11646:
---

 Summary: Deprecate Lua and Ruby Script Engines
 Key: NIFI-11646
 URL: https://issues.apache.org/jira/browse/NIFI-11646
 Project: Apache NiFi
  Issue Type: Task
  Components: Extensions
Reporter: Matt Burgess
 Fix For: 1.latest


The {{lua}} and {{ruby}} Script Engines for multiple scripted Processors and 
Controller Services should be deprecated for removal in NiFi 2.0. The engines 
are not often used, and because Lua and Ruby are not JVM-native languages, 
their capabilities are more limited than those of a JVM-native language such 
as Groovy. This Jira continues the idea of NIFI-11630, but for the Lua and 
Ruby script engines.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (NIFI-11631) Add Oracle support for NiFi Registry

2023-06-05 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11631:

Summary: Add Oracle support for NiFi Registry  (was: Add OracleDB support 
for Nifi Registry)

> Add Oracle support for NiFi Registry
> ------------------------------------
>
> Key: NIFI-11631
> URL: https://issues.apache.org/jira/browse/NIFI-11631
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: NiFi Registry
>Reporter: Kalmár Róbert
>Assignee: Kalmár Róbert
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (NIFI-11630) Deprecate ECMAScript Script Engine

2023-06-05 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11630:

Fix Version/s: 1.22.0
   (was: 1.latest)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Deprecate ECMAScript Script Engine
> ----------------------------------
>
> Key: NIFI-11630
> URL: https://issues.apache.org/jira/browse/NIFI-11630
> Project: Apache NiFi
>  Issue Type: Task
>  Components: Extensions
>Reporter: David Handermann
>Assignee: David Handermann
>Priority: Major
> Fix For: 1.22.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The {{ECMAScript}} Script Engine for multiple scripted Processors and 
> Controller Services should be deprecated for removal in NiFi 2.0.
> The {{ECMAScript}} engine supports JavaScript-compatible scripted components. 
> The Nashorn engine was deprecated in Java 11 as described in [JEP 
> 335|https://openjdk.org/jeps/335] and is no longer available in Java 17. 
> Alternative JavaScript engines could be considered separately, but 
> {{ECMAScript}} should be deprecated for removal.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (NIFI-11639) Update maven-checkstyle-plugin to 3.3.0

2023-06-03 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11639:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Update maven-checkstyle-plugin to 3.3.0
> ---------------------------------------
>
> Key: NIFI-11639
> URL: https://issues.apache.org/jira/browse/NIFI-11639
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Siddharth R
>Assignee: Siddharth R
>Priority: Minor
>  Labels: dependency-upgrade
> Fix For: 2.0.0, 1.22.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Bump maven-checkstyle-plugin from 3.2.1 to 3.3.0 to remediate CVE:
> [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13936]
>  
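The bump itself is a one-line version change in the Maven build; a minimal pom.xml fragment with the plugin's standard coordinates:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-checkstyle-plugin</artifactId>
  <version>3.3.0</version>
</plugin>
```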



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (NIFI-11639) Update maven-checkstyle-plugin to 3.3.0

2023-06-03 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11639:

Fix Version/s: 2.0.0
   1.22.0
   (was: 1.latest)
   (was: 2.latest)

> Update maven-checkstyle-plugin to 3.3.0
> ---------------------------------------
>
> Key: NIFI-11639
> URL: https://issues.apache.org/jira/browse/NIFI-11639
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Siddharth R
>Assignee: Siddharth R
>Priority: Minor
>  Labels: dependency-upgrade
> Fix For: 2.0.0, 1.22.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Bump maven-checkstyle-plugin from 3.2.1 to 3.3.0 to remediate CVE:
> [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13936]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (NIFI-11639) Update maven-checkstyle-plugin to 3.3.0

2023-06-03 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11639:

Fix Version/s: 1.latest
   2.latest
   (was: 2.0.0)
   (was: 1.22.0)

> Update maven-checkstyle-plugin to 3.3.0
> ---------------------------------------
>
> Key: NIFI-11639
> URL: https://issues.apache.org/jira/browse/NIFI-11639
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Siddharth R
>Assignee: Siddharth R
>Priority: Minor
>  Labels: dependency-upgrade
> Fix For: 1.latest, 2.latest
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Bump maven-checkstyle-plugin from 3.2.1 to 3.3.0 to remediate CVE:
> [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13936]
>  





[jira] [Updated] (NIFI-11639) Update maven-checkstyle-plugin to 3.3.0

2023-06-03 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11639:

Status: Patch Available  (was: Open)

> Update maven-checkstyle-plugin to 3.3.0
> ---
>
> Key: NIFI-11639
> URL: https://issues.apache.org/jira/browse/NIFI-11639
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Siddharth R
>Assignee: Siddharth R
>Priority: Minor
>  Labels: dependency-upgrade
> Fix For: 1.latest, 2.latest
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Bump maven-checkstyle-plugin from 3.2.1 to 3.3.0 to remediate CVE:
> [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13936]
>  





[jira] [Updated] (NIFI-11636) ParquetReader buffers up to 2 GB of content into heap unnecessarily

2023-06-02 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11636:

Fix Version/s: 2.0.0
   1.22.0
   (was: 1.latest)
   (was: 2.latest)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> ParquetReader buffers up to 2 GB of content into heap unnecessarily
> ---
>
> Key: NIFI-11636
> URL: https://issues.apache.org/jira/browse/NIFI-11636
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0, 1.22.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The Parquet Record Reader uses the NiFiSeekableInputStream. Because Parquet 
> requires reading the footer first, this class is intended to use 
> {{mark/reset}} so that we can read the footer and then reset back to the 
> beginning.
> To achieve this, it calls {{InputStream.mark(Integer.MAX_VALUE)}} which will 
> buffer up to 2 GB onto heap. However, the underlying InputStream is the 
> ContentClaimInputStream. The ContentClaimInputStream has smarts built into it 
> to allow resetting without having to buffer content into memory. In 
> particular, if you read over the {{limit}} provided and then call {{reset}} 
> it will close the InputStream and open a new InputStream from the beginning 
> of the FlowFile content and seek to the desired offset.
> Because of this, we don't need to use {{InputStream.mark(Integer.MAX_VALUE)}} 
> and can instead use {{InputStream.mark(8192)}} or some similarly small value.
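The mark/reset behavior at issue can be illustrated with plain java.io streams; a minimal sketch (not the NiFi classes) showing that a small mark limit suffices when only a few bytes are read before resetting:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

public class MarkResetSketch {
    public static void main(String[] args) throws IOException {
        byte[] content = "PAR1 ... file body ... footer".getBytes();
        BufferedInputStream in =
                new BufferedInputStream(new ByteArrayInputStream(content));

        // A small read-ahead limit is enough when the reader only peeks at
        // the first few bytes; mark(Integer.MAX_VALUE) would instead allow
        // the internal buffer to grow toward 2 GB of heap.
        in.mark(8192);
        byte[] first = new byte[4];
        int n = in.read(first);

        in.reset();                    // rewind to the marked position
        byte[] second = new byte[4];
        in.read(second);

        System.out.println(n == 4 && new String(first).equals(new String(second)));
    }
}
```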





[jira] [Resolved] (NIFI-11537) Add support for Iceberg tables to UpdateHive3Table

2023-06-01 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess resolved NIFI-11537.
-
Fix Version/s: (was: 2.latest)
   Resolution: Won't Fix

Closing this case as the Iceberg support in Apache Hive 3.1.3 isn't complete 
enough to warrant this effort. The storage format can't be specified and 
columns can't be added (these capabilities will be added in Hive 4).

> Add support for Iceberg tables to UpdateHive3Table 
> ---
>
> Key: NIFI-11537
> URL: https://issues.apache.org/jira/browse/NIFI-11537
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> UpdateHive3Table currently adds columns to Iceberg-backed tables 
> successfully, but Iceberg needs a special CREATE TABLE command to specify the 
> Iceberg Storage Handler and table properties.
> This Jira proposes to add a Create Table Storage Handler property with 
> Default and Iceberg as the initial choices. Default does not generate a 
> STORED BY clause, and Iceberg will generate the appropriate STORED BY clause 
> and set the necessary table properties. This approach can be used in the 
> future to add support for HBase- and Kudu-backed Hive tables.





[jira] [Updated] (NIFI-11621) Inferring schema for JSON fails when there's a CHOICE of different ARRAY types

2023-05-31 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11621:

Fix Version/s: 2.0.0
   1.22.0
   (was: 1.latest)
   (was: 2.latest)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Inferring schema for JSON fails when there's a CHOICE of different ARRAY types
> --
>
> Key: NIFI-11621
> URL: https://issues.apache.org/jira/browse/NIFI-11621
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0, 1.22.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> From Apache Slack: 
> https://apachenifi.slack.com/archives/C0L9VCD47/p1685553667778359?thread_ts=1685461745.470939=C0L9VCD47
> When using ConvertRecord with a JSON Reader and an Avro Writer, when 
> inferring the JSON schema, each of the following two records works properly:
> {code}
> {"test_record":{"array_test_record":{"test_array":[]}}}
> {code}
> {code}
> {"test_record":{"array_test_record":{"test_array":["test"]}}}
> {code}
> However, when combined into a single FlowFile:
> {code}
> {"test_record":{"array_test_record":{"test_array":[]}}}
> {"test_record":{"array_test_record":{"test_array":["test"]}}}
> {code}
> It fails with a NullPointerException:
> {code}
> 2023-05-31 13:51:35,632 ERROR [Timer-Driven Process Thread-8] 
> o.a.n.processors.standard.ConvertRecord 
> ConvertRecord[id=72e564dc-0188-1000-360a-9f86b50ec8ac] Failed to process 
> StandardFlowFileRecord[uuid=9bf4f0fb-0942-48ba-8a16-0dbd98db3f97,claim=StandardContentClaim
>  [resourceClaim=StandardResourceClaim[id=1685554864966-1, container=default, 
> section=1], offset=3278, 
> length=117],offset=0,name=9bf4f0fb-0942-48ba-8a16-0dbd98db3f97,size=117]; 
> will route to failure
> org.apache.nifi.processor.exception.ProcessException: Could not determine the 
> Avro Schema to use for writing the content
> at 
> org.apache.nifi.avro.AvroRecordSetWriter.createWriter(AvroRecordSetWriter.java:154)
> at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> at 
> org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:254)
> at 
> org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:105)
> at com.sun.proxy.$Proxy177.createWriter(Unknown Source)
> at 
> org.apache.nifi.processors.standard.AbstractRecordProcessor$1.process(AbstractRecordProcessor.java:150)
> at 
> org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:3441)
> at 
> org.apache.nifi.processors.standard.AbstractRecordProcessor.onTrigger(AbstractRecordProcessor.java:122)
> at 
> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
> at 
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1360)
> at 
> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:243)
> at 
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:102)
> at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at 
> java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
> at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: org.apache.nifi.schema.access.SchemaNotFoundException: Failed to 
> compile Avro Schema
> at 
> org.apache.nifi.avro.AvroRecordSetWriter.createWriter(AvroRecordSetWriter.java:145)
> ... 21 common frames omitted
> Caused by: java.lang.NullPointerException: null
> at 
> org.apache.nifi.avro.AvroTypeUtil.buildAvroSchema(AvroTypeUtil.java:208)
> at 
> 

[jira] [Updated] (NIFI-11537) Add support for Iceberg tables to UpdateHive3Table

2023-05-31 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11537:

Fix Version/s: (was: 1.latest)
   Status: Open  (was: Patch Available)

> Add support for Iceberg tables to UpdateHive3Table 
> ---
>
> Key: NIFI-11537
> URL: https://issues.apache.org/jira/browse/NIFI-11537
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 2.latest
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> UpdateHive3Table currently adds columns to Iceberg-backed tables 
> successfully, but Iceberg needs a special CREATE TABLE command to specify the 
> Iceberg Storage Handler and table properties.
> This Jira proposes to add a Create Table Storage Handler property with 
> Default and Iceberg as the initial choices. Default does not generate a 
> STORED BY clause, and Iceberg will generate the appropriate STORED BY clause 
> and set the necessary table properties. This approach can be used in the 
> future to add support for HBase- and Kudu-backed Hive tables.





[jira] [Updated] (NIFI-11538) PutIceberg does not correctly convert primitive source objects into target objects

2023-05-23 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11538:

Fix Version/s: 1.latest
   2.latest

> PutIceberg does not correctly convert primitive source objects into target 
> objects
> --
>
> Key: NIFI-11538
> URL: https://issues.apache.org/jira/browse/NIFI-11538
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When inserting data into an Iceberg table using the PutIceberg processor, if the 
> incoming record field(s) being inserted have a different datatype than the 
> target column type, the data is not automatically converted but throws a 
> ClassCastException. This happens for primitive types such as long <-> int, 
> int <-> string, etc.





[jira] [Updated] (NIFI-11538) PutIceberg does not correctly convert primitive source objects into target objects

2023-05-23 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11538:

Status: Patch Available  (was: In Progress)

> PutIceberg does not correctly convert primitive source objects into target 
> objects
> --
>
> Key: NIFI-11538
> URL: https://issues.apache.org/jira/browse/NIFI-11538
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When inserting data into an Iceberg table using the PutIceberg processor, if the 
> incoming record field(s) being inserted have a different datatype than the 
> target column type, the data is not automatically converted but throws a 
> ClassCastException. This happens for primitive types such as long <-> int, 
> int <-> string, etc.





[jira] [Updated] (NIFI-11557) Eliminate use of Files.walkFileTree for any performance-critical parts of application

2023-05-23 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11557:

Fix Version/s: 2.0.0
   1.22.0
   (was: 1.latest)
   (was: 2.latest)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Eliminate use of Files.walkFileTree for any performance-critical parts of 
> application
> -
>
> Key: NIFI-11557
> URL: https://issues.apache.org/jira/browse/NIFI-11557
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework, Extensions
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
>  Labels: content-repo, content-repository, performance, slowness, 
> startup
> Fix For: 2.0.0, 1.22.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The FileSystemRepository (content repo implementation) as well as ListFile 
> both make use of the {{Files.walkFileTree}} method. Recently, I worked with a 
> user who had horribly long startup times. Thread dumps show that the time was 
> almost entirely in the FileSystemRepository's {{initializeRepository}} method 
> as it is walking the file tree in order to determine which archive files can 
> be cleaned up next. This is done during startup and again periodically in 
> background threads.
> I made a small modification locally to instead use the standard synchronous 
> IO methods ({{File.listFiles}}). I used GenerateFlowFile to generate 
> 1-byte FlowFiles and set  {{nifi.content.claim.max.appendable.size=1 B}} in 
> nifi.properties in order to generate a huge number of files - about 1.2 
> million files in the content repository and restarted a few times. 
> Additionally, added some log lines to show how long this part of the startup 
> process took.
> With the existing code, startup took 210 seconds (3.5 mins). With the new 
> implementation, it took 6.7 seconds. This appears to be due to the fact that 
> when using NIO.2 for every file, it does an individual disk access to obtain 
> File attributes, while when using the {{File.listFiles}} method the File 
> objects that are returned already have the necessary attributes. As a result, 
> the NIO.2 approach makes millions of disk accesses that are unnecessary. As 
> the number of files in the repository grows, the discrepancy also grows.
> We need to eliminate any use of {{Files.walkFileTree}} for any 
> performance-critical parts of the codebase.
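A minimal sketch of the {{File.listFiles}}-based traversal described above (an illustration under stated assumptions, not the actual repository code):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ListFilesSketch {
    // Recursive count using File.listFiles, the synchronous-IO approach the
    // ticket describes as far cheaper than Files.walkFileTree when the
    // repository holds millions of files.
    static int countFiles(File dir) {
        File[] children = dir.listFiles();
        if (children == null) {
            return 0;        // not a directory, or not readable
        }
        int count = 0;
        for (File child : children) {
            count += child.isDirectory() ? countFiles(child) : 1;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("content-repo");
        Files.createFile(root.resolve("claim-1"));
        Files.createFile(root.resolve("claim-2"));
        System.out.println(countFiles(root.toFile()));   // 2
    }
}
```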





[jira] [Updated] (NIFI-11221) Remove support for processor-level connection configuration in the MongoDB package

2023-05-19 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11221:

Fix Version/s: 2.0.0
   (was: 2.latest)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Remove support for processor-level connection configuration in the MongoDB 
> package
> --
>
> Key: NIFI-11221
> URL: https://issues.apache.org/jira/browse/NIFI-11221
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Mike Thomsen
>Assignee: Mike Thomsen
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The configuration should be done entirely through the controller service that 
> manages the connections. That class has been around for a few years now, so 
> it shouldn't be a surprise to anyone.





[jira] [Updated] (NIFI-11552) Support FlowFile attributes in PutIceberg's Table Name property

2023-05-18 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11552:

Status: Patch Available  (was: In Progress)

> Support FlowFile attributes in PutIceberg's Table Name property
> ---
>
> Key: NIFI-11552
> URL: https://issues.apache.org/jira/browse/NIFI-11552
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>
> The documentation for PutIceberg's Table Name property says it doesn’t 
> support any Expression Language but the code calls the evaluate method on the 
> property without passing in a FlowFile, so at the very least it supports 
> Variable Registry. This Jira proposes to add EL support including the 
> FlowFile attributes for the Table Name property and update the documentation 
> to reflect the new behavior.





[jira] [Assigned] (NIFI-11552) Support FlowFile attributes in PutIceberg's Table Name property

2023-05-18 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11552:
---

Assignee: Matt Burgess

> Support FlowFile attributes in PutIceberg's Table Name property
> ---
>
> Key: NIFI-11552
> URL: https://issues.apache.org/jira/browse/NIFI-11552
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>
> The documentation for PutIceberg's Table Name property says it doesn’t 
> support any Expression Language but the code calls the evaluate method on the 
> property without passing in a FlowFile, so at the very least it supports 
> Variable Registry. This Jira proposes to add EL support including the 
> FlowFile attributes for the Table Name property and update the documentation 
> to reflect the new behavior.





[jira] [Updated] (NIFI-5151) Patch Nifi with Upsert functions for PutDatabaseRecord processor

2023-05-18 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-5151:
---
Fix Version/s: 2.0.0
   1.22.0
   (was: 1.latest)
   (was: 2.latest)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Patch Nifi with Upsert functions for PutDatabaseRecord processor
> 
>
> Key: NIFI-5151
> URL: https://issues.apache.org/jira/browse/NIFI-5151
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Affects Versions: 1.7.0
>Reporter: Karl Amundsson
>Assignee: Lehel Boér
>Priority: Major
>  Labels: Processor
> Fix For: 2.0.0, 1.22.0
>
> Attachments: 
> 0001-NIFI-5151-Adding-support-for-UPSERT-in-PutDatabaseRe.patch, 
> 0001-NIFI-5151-Using-DatabaseAdapter-to-generate-INSERT-S.patch
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> Since Phoenix doesn't support the SQL statement INSERT you have to use a 
> process like: ConvertAttributesToJSON->ConvertJSONToSQL in Insert 
> mode->ReplaceText to replace "INSERT" with "UPSERT" -> PutSQL (See: 
> [https://community.hortonworks.com/questions/40561/nifi-phoenix-processor.html])
> With this patch you can choose to use UPSERT directly.
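The ReplaceText step in that workaround amounts to a one-line string rewrite, sketched here as a hypothetical helper:

```java
public class UpsertSketch {
    // The old workaround's ReplaceText step as a plain string rewrite:
    // Phoenix accepts UPSERT where standard SQL uses INSERT.
    static String toUpsert(String sql) {
        return sql.replaceFirst("^INSERT", "UPSERT");
    }

    public static void main(String[] args) {
        System.out.println(toUpsert("INSERT INTO t (id) VALUES (1)"));
        // UPSERT INTO t (id) VALUES (1)
    }
}
```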





[jira] [Updated] (NIFI-5151) Patch Nifi with Upsert functions for PutDatabaseRecord processor

2023-05-18 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-5151:
---
Fix Version/s: 1.latest
   2.latest

> Patch Nifi with Upsert functions for PutDatabaseRecord processor
> 
>
> Key: NIFI-5151
> URL: https://issues.apache.org/jira/browse/NIFI-5151
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Affects Versions: 1.7.0
>Reporter: Karl Amundsson
>Assignee: Lehel Boér
>Priority: Major
>  Labels: Processor
> Fix For: 1.latest, 2.latest
>
> Attachments: 
> 0001-NIFI-5151-Adding-support-for-UPSERT-in-PutDatabaseRe.patch, 
> 0001-NIFI-5151-Using-DatabaseAdapter-to-generate-INSERT-S.patch
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> Since Phoenix doesn't support the SQL statement INSERT you have to use a 
> process like: ConvertAttributesToJSON->ConvertJSONToSQL in Insert 
> mode->ReplaceText to replace "INSERT" with "UPSERT" -> PutSQL (See: 
> [https://community.hortonworks.com/questions/40561/nifi-phoenix-processor.html])
> With this patch you can choose to use UPSERT directly.





[jira] [Updated] (NIFI-11567) GeoEnrichIP processors should auto-reload the database file

2023-05-18 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11567:

Status: Patch Available  (was: In Progress)

> GeoEnrichIP processors should auto-reload the database file
> ---
>
> Key: NIFI-11567
> URL: https://issues.apache.org/jira/browse/NIFI-11567
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>
> Currently the GeoEnrichIP processors only load the database when the 
> processor is scheduled. This requires a processor restart if the database 
> file changes. Instead, the processors should auto-reload the database file 
> when a change is detected.





[jira] [Updated] (NIFI-11567) GeoEnrichIP processors should auto-reload the database file

2023-05-18 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11567:

Fix Version/s: 1.latest
   2.latest

> GeoEnrichIP processors should auto-reload the database file
> ---
>
> Key: NIFI-11567
> URL: https://issues.apache.org/jira/browse/NIFI-11567
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>
> Currently the GeoEnrichIP processors only load the database when the 
> processor is scheduled. This requires a processor restart if the database 
> file changes. Instead, the processors should auto-reload the database file 
> when a change is detected.





[jira] [Assigned] (NIFI-11567) GeoEnrichIP processors should auto-reload the database file

2023-05-18 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11567:
---

Assignee: Matt Burgess

> GeoEnrichIP processors should auto-reload the database file
> ---
>
> Key: NIFI-11567
> URL: https://issues.apache.org/jira/browse/NIFI-11567
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>
> Currently the GeoEnrichIP processors only load the database when the 
> processor is scheduled. This requires a processor restart if the database 
> file changes. Instead, the processors should auto-reload the database file 
> when a change is detected.





[jira] [Created] (NIFI-11567) GeoEnrichIP processors should auto-reload the database file

2023-05-18 Thread Matt Burgess (Jira)
Matt Burgess created NIFI-11567:
---

 Summary: GeoEnrichIP processors should auto-reload the database 
file
 Key: NIFI-11567
 URL: https://issues.apache.org/jira/browse/NIFI-11567
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Matt Burgess


Currently the GeoEnrichIP processors only load the database when the processor 
is scheduled. This requires a processor restart if the database file changes. 
Instead, the processors should auto-reload the database file when a change is 
detected.
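A minimal sketch of the auto-reload idea, keyed off the file's modification timestamp (hypothetical helper, not the processor code):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReloadSketch {
    private long loadedStamp = -1;
    private int loadCount = 0;

    // Reload only when the file's modification time changes, so a new
    // database file is picked up without restarting the processor.
    void refreshIfChanged(File dbFile) {
        long stamp = dbFile.lastModified();
        if (stamp != loadedStamp) {
            loadCount++;              // stand-in for the real database load
            loadedStamp = stamp;
        }
    }

    public static void main(String[] args) throws IOException {
        Path db = Files.createTempFile("geo", ".mmdb");
        ReloadSketch sketch = new ReloadSketch();
        sketch.refreshIfChanged(db.toFile());
        sketch.refreshIfChanged(db.toFile());   // unchanged: no second load
        System.out.println(sketch.loadCount);   // 1
    }
}
```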





[jira] [Created] (NIFI-11552) Support FlowFile attributes in PutIceberg's Table Name property

2023-05-15 Thread Matt Burgess (Jira)
Matt Burgess created NIFI-11552:
---

 Summary: Support FlowFile attributes in PutIceberg's Table Name 
property
 Key: NIFI-11552
 URL: https://issues.apache.org/jira/browse/NIFI-11552
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Matt Burgess
 Fix For: 1.latest, 2.latest


The documentation for PutIceberg's Table Name property says it doesn’t support 
any Expression Language but the code calls the evaluate method on the property 
without passing in a FlowFile, so at the very least it supports Variable 
Registry. This Jira proposes to add EL support including the FlowFile 
attributes for the Table Name property and update the documentation to reflect 
the new behavior.
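As a rough illustration of what attribute-based evaluation does (a hypothetical stand-in, not the NiFi Expression Language engine), ${...} placeholders are resolved against the FlowFile's attribute map:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ElSketch {
    // Hypothetical placeholder substitution: replaces ${attr} tokens in a
    // property value with the matching FlowFile attribute value.
    static String evaluate(String property, Map<String, String> attributes) {
        Matcher m = Pattern.compile("\\$\\{([\\w.]+)}").matcher(property);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(out, Matcher.quoteReplacement(
                    attributes.getOrDefault(m.group(1), "")));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> attrs = Map.of("table.name", "events");
        System.out.println(evaluate("${table.name}", attrs));   // events
    }
}
```

This is roughly what evaluating the Table Name property against a FlowFile's attributes would achieve once EL support is added.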





[jira] [Created] (NIFI-11551) Default Use Avro Logical Types to 'true'

2023-05-15 Thread Matt Burgess (Jira)
Matt Burgess created NIFI-11551:
---

 Summary: Default Use Avro Logical Types to 'true'
 Key: NIFI-11551
 URL: https://issues.apache.org/jira/browse/NIFI-11551
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Matt Burgess
 Fix For: 2.latest


Starting with Apache NiFi 2.0, we should default any Use Avro Logical Types 
properties to 'true'. NiFi supports most/all Avro logical types as native NiFi 
record field types, so to get better datatype resolution downstream, we should 
be using Avro logical types wherever possible.

Historically, it defaulted to false because NiFi adopted Avro before logical 
types existed; when logical types were later added to Avro, NiFi kept the 
existing behavior of not using them by default. At this point we should use 
logical types wherever possible (by default).





[jira] [Updated] (NIFI-11538) PutIceberg does not correctly convert primitive source objects into target objects

2023-05-11 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11538:

Summary: PutIceberg does not correctly convert primitive source objects 
into target objects  (was: PutIceberg does not correctly convert source objects 
into target objects)

> PutIceberg does not correctly convert primitive source objects into target 
> objects
> --
>
> Key: NIFI-11538
> URL: https://issues.apache.org/jira/browse/NIFI-11538
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>
> When inserting data into an Iceberg table using the PutIceberg processor, if the 
> incoming record field(s) being inserted have a different datatype than the 
> target column type, the data is not automatically converted but throws a 
> ClassCastException. This happens for primitive types such as long <-> int, 
> int <-> string, etc.





[jira] [Assigned] (NIFI-11538) PutIceberg does not correctly convert source objects into target objects

2023-05-11 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11538:
---

Assignee: Matt Burgess

> PutIceberg does not correctly convert source objects into target objects
> 
>
> Key: NIFI-11538
> URL: https://issues.apache.org/jira/browse/NIFI-11538
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>
> When inserting data into an Iceberg table using the PutIceberg processor, if the 
> incoming record field(s) being inserted have a different datatype than the 
> target column type, the data is not automatically converted but throws a 
> ClassCastException. This happens for primitive types such as long <-> int, 
> int <-> string, etc.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (NIFI-11538) PutIceberg does not correctly convert source objects into target objects

2023-05-11 Thread Matt Burgess (Jira)
Matt Burgess created NIFI-11538:
---

 Summary: PutIceberg does not correctly convert source objects into 
target objects
 Key: NIFI-11538
 URL: https://issues.apache.org/jira/browse/NIFI-11538
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Reporter: Matt Burgess


When inserting data into an Iceberg table using the PutIceberg processor, if the 
incoming record field(s) being inserted have a different datatype than the 
target column type, the data is not automatically converted but throws a 
ClassCastException. This happens for primitive types such as long <-> int, int 
<-> string, etc.
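A minimal sketch of the kind of primitive coercion the fix implies, converting the source value instead of casting it blindly (hypothetical helper, not the actual PutIceberg code):

```java
public class CoercionSketch {
    // Hypothetical coercion: widen or convert a primitive source value to
    // the target column type; a blind cast is what raises the
    // ClassCastException described above.
    static Object coerce(Object value, Class<?> target) {
        if (target == Long.class && value instanceof Number) {
            return ((Number) value).longValue();   // e.g. int -> long
        }
        if (target == String.class) {
            return String.valueOf(value);          // e.g. int -> string
        }
        return target.cast(value);                 // last resort: may throw
    }

    public static void main(String[] args) {
        System.out.println(coerce(42, Long.class));     // 42 (as a Long)
        System.out.println(coerce(42, String.class));   // 42 (as a String)
    }
}
```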





[jira] [Updated] (NIFI-11537) Add support for Iceberg tables to UpdateHive3Table

2023-05-10 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11537:

Fix Version/s: 1.latest
   2.latest
   Status: Patch Available  (was: In Progress)

> Add support for Iceberg tables to UpdateHive3Table 
> ---
>
> Key: NIFI-11537
> URL: https://issues.apache.org/jira/browse/NIFI-11537
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> UpdateHive3Table currently adds columns to Iceberg-backed tables 
> successfully, but Iceberg needs a special CREATE TABLE command to specify the 
> Iceberg Storage Handler and table properties.
> This Jira proposes to add a Create Table Storage Handler property with 
> Default and Iceberg as the initial choices. Default does not generate a 
> STORED BY clause, and Iceberg will generate the appropriate STORED BY clause 
> and set the necessary table properties. This approach can be used in the 
> future to add support for HBase- and Kudu-backed Hive tables.





[jira] [Assigned] (NIFI-11537) Add support for Iceberg tables to UpdateHive3Table

2023-05-10 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11537:
---

Assignee: Matt Burgess

> Add support for Iceberg tables to UpdateHive3Table 
> ---
>
> Key: NIFI-11537
> URL: https://issues.apache.org/jira/browse/NIFI-11537
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>
> UpdateHive3Table currently adds columns to Iceberg-backed tables 
> successfully, but Iceberg needs a special CREATE TABLE command to specify the 
> Iceberg Storage Handler and table properties.
> This Jira proposes to add a Create Table Storage Handler property with 
> Default and Iceberg as the initial choices. Default does not generate a 
> STORED BY clause, and Iceberg will generate the appropriate STORED BY clause 
> and set the necessary table properties. This approach can be used in the 
> future to add support for HBase- and Kudu-backed Hive tables.





[jira] [Created] (NIFI-11537) Add support for Iceberg tables to UpdateHive3Table

2023-05-10 Thread Matt Burgess (Jira)
Matt Burgess created NIFI-11537:
---

 Summary: Add support for Iceberg tables to UpdateHive3Table 
 Key: NIFI-11537
 URL: https://issues.apache.org/jira/browse/NIFI-11537
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Matt Burgess


UpdateHive3Table currently adds columns to Iceberg-backed tables successfully, 
but Iceberg needs a special CREATE TABLE command to specify the Iceberg Storage 
Handler and table properties.

This Jira proposes to add a Create Table Storage Handler property with Default 
and Iceberg as the initial choices. Default does not generate a STORED BY 
clause, and Iceberg will generate the appropriate STORED BY clause and set the 
necessary table properties. This approach can be used in the future to add 
support for HBase- and Kudu-backed Hive tables.





[jira] [Assigned] (NIFI-11449) add autocommit property to PutDatabaseRecord processor

2023-05-10 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11449:
---

Assignee: (was: Matt Burgess)

> add autocommit property to PutDatabaseRecord processor
> --
>
> Key: NIFI-11449
> URL: https://issues.apache.org/jira/browse/NIFI-11449
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Affects Versions: 1.21.0
> Environment: Any Nifi Deployment
>Reporter: Abdelrahim Ahmad
>Priority: Blocker
>  Labels: Trino, autocommit, database, iceberg, putdatabaserecord
>
> The issue is with the {{PutDatabaseRecord}} processor in Apache NiFi. When 
> using the processor with the Trino-JDBC-Driver or Dremio-JDBC-Driver to write 
> to an Iceberg catalog, it disables the autocommit feature. This leads to 
> errors such as "{*}Catalog only supports writes using autocommit: iceberg{*}".
> An autocommit property needs to be added to the processor so the feature can 
> be enabled/disabled.
> Enabling auto-commit in the NiFi PutDatabaseRecord processor is important for 
> Delta Lake, Iceberg, and Hudi, as it ensures data consistency and integrity by 
> allowing atomic writes to be performed in the underlying database. This will 
> allow the processor to be widely used with a bigger range of databases.
> _Improving this processor will allow NiFi to be the main tool to ingest data 
> into these new technologies, so we don't have to deal with another tool to do 
> so._
> +*_{color:#de350b}BUT:{color}_*+
> I have reviewed the {{PutDatabaseRecord}} processor in NiFi. It inserts 
> records one by one into the database using a prepared statement, and commits 
> the transaction at the end of the loop that processes each record. This 
> approach can be inefficient and slow when inserting large volumes of data 
> into tables that are optimized for bulk ingestion, such as Delta Lake, 
> Iceberg, and Hudi tables.
> These tables use various techniques to optimize the performance of bulk 
> ingestion, such as partitioning, clustering, and indexing. Inserting records 
> one by one using a prepared statement can bypass these optimizations, leading 
> to poor performance and potentially causing issues such as excessive disk 
> usage, increased memory consumption, and decreased query performance.
> To avoid these issues, it is recommended to add a new processor, or a feature 
> to the current one, that uses a bulk insert method with the autocommit feature 
> when inserting large volumes of data into Delta Lake, Iceberg, and Hudi 
> tables. 
>  
> P.S.: PutSQL does not have an autocommit option either, and it has the same 
> performance problem described above.
> Thanks and best regards :)
> Abdelrahim Ahmad
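A minimal sketch of the requested enable/disable behavior, assuming a simple boolean autocommit property. This is not NiFi source: TxConnection mirrors only the relevant subset of java.sql.Connection rather than using a real driver.

```java
// Sketch of the autocommit property requested in NIFI-11449. This is NOT NiFi
// source: TxConnection mirrors only the relevant subset of java.sql.Connection,
// and the boolean flag stands in for a processor property.
public class AutoCommitSketch {

    interface TxConnection {                 // subset of java.sql.Connection
        boolean getAutoCommit();
        void setAutoCommit(boolean autoCommit);
        void commit();
    }

    // When autoCommit is true (required by Trino/Iceberg catalogs), the driver
    // commits each statement itself; otherwise keep today's manual commit.
    // The previous setting is restored either way.
    static void runWithCommitMode(TxConnection conn, boolean autoCommit, Runnable work) {
        boolean previous = conn.getAutoCommit();
        try {
            conn.setAutoCommit(autoCommit);
            work.run();
            if (!autoCommit) {
                conn.commit();
            }
        } finally {
            conn.setAutoCommit(previous);
        }
    }

    public static void main(String[] args) {
        boolean[] auto = {true};
        boolean[] committed = {false};
        TxConnection conn = new TxConnection() {
            public boolean getAutoCommit() { return auto[0]; }
            public void setAutoCommit(boolean a) { auto[0] = a; }
            public void commit() { committed[0] = true; }
        };
        runWithCommitMode(conn, false, () -> {});
        System.out.println("manual commit issued: " + committed[0]); // prints true
    }
}
```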





[jira] [Assigned] (NIFI-11449) add autocommit property to PutDatabaseRecord processor

2023-05-10 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11449:
---

Assignee: Matt Burgess

> add autocommit property to PutDatabaseRecord processor
> --
>
> Key: NIFI-11449
> URL: https://issues.apache.org/jira/browse/NIFI-11449
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Affects Versions: 1.21.0
> Environment: Any Nifi Deployment
>Reporter: Abdelrahim Ahmad
>Assignee: Matt Burgess
>Priority: Blocker
>  Labels: Trino, autocommit, database, iceberg, putdatabaserecord
>
> The issue is with the {{PutDatabaseRecord}} processor in Apache NiFi. When 
> using the processor with the Trino-JDBC-Driver or Dremio-JDBC-Driver to write 
> to an Iceberg catalog, it disables the autocommit feature. This leads to 
> errors such as "{*}Catalog only supports writes using autocommit: iceberg{*}".
> An autocommit property needs to be added to the processor so the feature can 
> be enabled/disabled.
> Enabling auto-commit in the NiFi PutDatabaseRecord processor is important for 
> Delta Lake, Iceberg, and Hudi, as it ensures data consistency and integrity by 
> allowing atomic writes to be performed in the underlying database. This will 
> allow the processor to be widely used with a bigger range of databases.
> _Improving this processor will allow NiFi to be the main tool to ingest data 
> into these new technologies, so we don't have to deal with another tool to do 
> so._
> +*_{color:#de350b}BUT:{color}_*+
> I have reviewed the {{PutDatabaseRecord}} processor in NiFi. It inserts 
> records one by one into the database using a prepared statement, and commits 
> the transaction at the end of the loop that processes each record. This 
> approach can be inefficient and slow when inserting large volumes of data 
> into tables that are optimized for bulk ingestion, such as Delta Lake, 
> Iceberg, and Hudi tables.
> These tables use various techniques to optimize the performance of bulk 
> ingestion, such as partitioning, clustering, and indexing. Inserting records 
> one by one using a prepared statement can bypass these optimizations, leading 
> to poor performance and potentially causing issues such as excessive disk 
> usage, increased memory consumption, and decreased query performance.
> To avoid these issues, it is recommended to add a new processor, or a feature 
> to the current one, that uses a bulk insert method with the autocommit feature 
> when inserting large volumes of data into Delta Lake, Iceberg, and Hudi 
> tables. 
>  
> P.S.: PutSQL does not have an autocommit option either, and it has the same 
> performance problem described above.
> Thanks and best regards :)
> Abdelrahim Ahmad





[jira] [Updated] (NIFI-11221) Remove support for processor-level connection configuration in the MongoDB package

2023-05-01 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11221:

Fix Version/s: 2.latest
   (was: 2.0.0)

> Remove support for processor-level connection configuration in the MongoDB 
> package
> --
>
> Key: NIFI-11221
> URL: https://issues.apache.org/jira/browse/NIFI-11221
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Mike Thomsen
>Assignee: Mike Thomsen
>Priority: Major
> Fix For: 2.latest
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The configuration should be done entirely through the controller service that 
> manages the connections. That class has been around for a few years now, so 
> it shouldn't be a surprise to anyone.





[jira] [Updated] (NIFI-11221) Remove support for processor-level connection configuration in the MongoDB package

2023-04-30 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11221:

Status: Patch Available  (was: Open)

> Remove support for processor-level connection configuration in the MongoDB 
> package
> --
>
> Key: NIFI-11221
> URL: https://issues.apache.org/jira/browse/NIFI-11221
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Mike Thomsen
>Assignee: Mike Thomsen
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The configuration should be done entirely through the controller service that 
> manages the connections. That class has been around for a few years now, so 
> it shouldn't be a surprise to anyone.





[jira] [Updated] (NIFI-11034) Image Viewer not available in Apache NiFi release

2023-04-29 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11034:

Status: Patch Available  (was: In Progress)

> Image Viewer not available in Apache NiFi release
> -
>
> Key: NIFI-11034
> URL: https://issues.apache.org/jira/browse/NIFI-11034
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The image viewer in the UI (used to display JPEG, GIF, WEBP, etc.) is in the 
> nifi-media-nar, which is no longer included in the Apache NiFi binary release 
> due to size constraints. However, the image viewer itself should be available 
> in the binary release (unless it is too large by itself). 
> Recommend breaking the image viewer out into its own module and include it in 
> the NiFi assembly by default.





[jira] [Assigned] (NIFI-11034) Image Viewer not available in Apache NiFi release

2023-04-29 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11034:
---

Assignee: Matt Burgess

> Image Viewer not available in Apache NiFi release
> -
>
> Key: NIFI-11034
> URL: https://issues.apache.org/jira/browse/NIFI-11034
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>
> The image viewer in the UI (used to display JPEG, GIF, WEBP, etc.) is in the 
> nifi-media-nar, which is no longer included in the Apache NiFi binary release 
> due to size constraints. However, the image viewer itself should be available 
> in the binary release (unless it is too large by itself). 
> Recommend breaking the image viewer out into its own module and include it in 
> the NiFi assembly by default.





[jira] [Updated] (NIFI-11034) Image Viewer not available in Apache NiFi release

2023-04-29 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11034:

Fix Version/s: 1.latest
   2.latest

> Image Viewer not available in Apache NiFi release
> -
>
> Key: NIFI-11034
> URL: https://issues.apache.org/jira/browse/NIFI-11034
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>
> The image viewer in the UI (used to display JPEG, GIF, WEBP, etc.) is in the 
> nifi-media-nar, which is no longer included in the Apache NiFi binary release 
> due to size constraints. However, the image viewer itself should be available 
> in the binary release (unless it is too large by itself). 
> Recommend breaking the image viewer out into its own module and include it in 
> the NiFi assembly by default.





[jira] [Commented] (NIFI-10946) ScriptedRecord processors script error handling

2023-04-29 Thread Matt Burgess (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-10946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17717931#comment-17717931
 ] 

Matt Burgess commented on NIFI-10946:
-

Is there a reason to have the processor itself set the attribute for the error, 
or should we just leave it to the script writer to handle errors and set the 
attribute in the script and route the FlowFile to failure?

> ScriptedRecord processors script error handling
> ---
>
> Key: NIFI-10946
> URL: https://issues.apache.org/jira/browse/NIFI-10946
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Julien G.
>Priority: Major
>
> An attribute is written if an error occurred in the ScriptedProcessor's 
> reader.
> But if an error occurs in the script execution, the FlowFile will be routed 
> to failure but the error will not be written to an attribute in the FlowFile.
> Can an improvement be made in this direction?





[jira] [Updated] (NIFI-11473) Flow version change in NiFi should not stop a component when only position is changed

2023-04-27 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11473:

Fix Version/s: 2.0.0
   1.22.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Flow version change in NiFi should not stop a component when only position is 
> changed
> -
>
> Key: NIFI-11473
> URL: https://issues.apache.org/jira/browse/NIFI-11473
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Timea Barna
>Assignee: Timea Barna
>Priority: Major
> Fix For: 2.0.0, 1.22.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When going from one flow version to another and the position of a component 
> is changing, but not its configuration, the component should not be stopped.





[jira] [Updated] (NIFI-11466) Add a ModifyCompression processor

2023-04-17 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11466:

Status: Patch Available  (was: In Progress)

> Add a ModifyCompression processor
> -
>
> Key: NIFI-11466
> URL: https://issues.apache.org/jira/browse/NIFI-11466
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 2.latest
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If a user would like to convert from one compression format to another, they 
> currently have to use CompressContent to decompress, then another 
> CompressContent to compress into a different format. Two processors plus disk 
> I/O for the FlowFiles and their underlying content claims can be I/O 
> intensive in that case.
> Instead, a new ModifyCompression processor is proposed, to allow for both 
> decompression of the incoming FlowFile and compression for the outgoing 
> FlowFile, using appropriate memory buffers for the 
> decompression/recompression. Adding "no decompression" and "no compression" 
> options for the respective properties could allow this property to function 
> like CompressContent does now, plus the ability to convert from one 
> compression format (gzip, e.g.) to another (snappy-hadoop, e.g.). One example 
> of a use case where this would be helpful is an I/O bound flow to get 
> compressed data from a legacy source system into HDFS for faster (and 
> larger-volume / distributed) processing of the data.
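The single-pass decompress/recompress idea can be sketched with java.util.zip streams (gzip in, raw deflate out here; the actual processor would support more formats such as snappy and bzip2 via additional libraries):

```java
// Sketch of ModifyCompression's core idea (NIFI-11466): decompress and
// recompress in one streaming pass instead of chaining two CompressContent
// processors. Formats are limited here to what java.util.zip offers.
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

public class RecompressSketch {

    // gzip -> deflate in a single streamed pass (no full-content buffer).
    static byte[] gzipToDeflate(byte[] gzipped) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped));
             OutputStream deflate = new DeflaterOutputStream(out)) {
            in.transferTo(deflate);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (OutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    static byte[] inflate(byte[] deflated) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(deflated))) {
            in.transferTo(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = "compressed once, converted once".getBytes();
        byte[] converted = gzipToDeflate(gzip(raw));
        System.out.println("round trip ok: "
                + java.util.Arrays.equals(inflate(converted), raw));
    }
}
```

Because the conversion is streamed, content never has to be written back to the content repository in its intermediate uncompressed form.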





[jira] [Assigned] (NIFI-11466) Add a ModifyCompression processor

2023-04-17 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11466:
---

Assignee: Matt Burgess

> Add a ModifyCompression processor
> -
>
> Key: NIFI-11466
> URL: https://issues.apache.org/jira/browse/NIFI-11466
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 2.latest
>
>
> If a user would like to convert from one compression format to another, they 
> currently have to use CompressContent to decompress, then another 
> CompressContent to compress into a different format. Two processors plus disk 
> I/O for the FlowFiles and their underlying content claims can be I/O 
> intensive in that case.
> Instead, a new ModifyCompression processor is proposed, to allow for both 
> decompression of the incoming FlowFile and compression for the outgoing 
> FlowFile, using appropriate memory buffers for the 
> decompression/recompression. Adding "no decompression" and "no compression" 
> options for the respective properties could allow this property to function 
> like CompressContent does now, plus the ability to convert from one 
> compression format (gzip, e.g.) to another (snappy-hadoop, e.g.). One example 
> of a use case where this would be helpful is an I/O bound flow to get 
> compressed data from a legacy source system into HDFS for faster (and 
> larger-volume / distributed) processing of the data.





[jira] [Created] (NIFI-11466) Add a ModifyCompression processor

2023-04-17 Thread Matt Burgess (Jira)
Matt Burgess created NIFI-11466:
---

 Summary: Add a ModifyCompression processor
 Key: NIFI-11466
 URL: https://issues.apache.org/jira/browse/NIFI-11466
 Project: Apache NiFi
  Issue Type: New Feature
  Components: Extensions
Reporter: Matt Burgess
 Fix For: 2.latest


If a user would like to convert from one compression format to another, they 
currently have to use CompressContent to decompress, then another 
CompressContent to compress into a different format. Two processors plus disk 
I/O for the FlowFiles and their underlying content claims can be I/O intensive 
in that case.

Instead, a new ModifyCompression processor is proposed, to allow for both 
decompression of the incoming FlowFile and compression for the outgoing 
FlowFile, using appropriate memory buffers for the decompression/recompression. 
Adding "no decompression" and "no compression" options for the respective 
properties could allow this property to function like CompressContent does now, 
plus the ability to convert from one compression format (gzip, e.g.) to another 
(snappy-hadoop, e.g.). One example of a use case where this would be helpful is 
an I/O bound flow to get compressed data from a legacy source system into HDFS 
for faster (and larger-volume / distributed) processing of the data.





[jira] [Updated] (NIFI-11380) Refactor CaptureChangeMySQL with improvements

2023-04-03 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11380:

Status: Patch Available  (was: In Progress)

> Refactor CaptureChangeMySQL with improvements
> -
>
> Key: NIFI-11380
> URL: https://issues.apache.org/jira/browse/NIFI-11380
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The CaptureChangeMySQL processor can be improved in many ways:
> - Eliminate use of DistributedCacheClient
> - MySQLCDCUtils - delete this class. It’s unnecessary. Just put the single 
> method that exists in the parent class of the writers
> - Eliminate mutable member variables. Gather all state together into an 
> object and store that as a single volatile member variable.
> - The outputEvents method is a huge block of switch/case and if/then/else 
> blocks. Kill all of this. Create an interface that’s capable of handling a 
> given event type and have multiple implementations. Determine appropriate 
> impl and call the method.
> - Do not keep a bunch of member variables to “rollback local state”. Keep 
> this in variables. If we fail, no harm, no foul. If we succeed, then update 
> member variable.
> - Remove onStopped method, just annotate stop() method with @OnStopped. No 
> need for @OnShutdown
> - Change name of “hostname” property to "node", and don’t require the port! 
> Default to 3306.
> - Remove unused hasRun member variable
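The proposed handler-per-event-type refactor of outputEvents can be sketched as a dispatch map; the event types and handler signature here are simplified stand-ins, not the real binlog event model:

```java
// Sketch of the proposed handler-per-event-type refactor of outputEvents:
// a dispatch map replaces the large switch/case block. Event types and the
// handler signature are simplified stand-ins for the real binlog event model.
import java.util.Map;

public class EventDispatchSketch {

    enum EventType { WRITE, UPDATE, DELETE }

    interface EventHandler {
        String handle(String payload);   // real handlers would emit FlowFiles
    }

    static final Map<EventType, EventHandler> HANDLERS = Map.of(
            EventType.WRITE,  p -> "insert:" + p,
            EventType.UPDATE, p -> "update:" + p,
            EventType.DELETE, p -> "delete:" + p);

    static String dispatch(EventType type, String payload) {
        EventHandler handler = HANDLERS.get(type);
        if (handler == null) {
            throw new IllegalArgumentException("No handler registered for " + type);
        }
        return handler.handle(payload);
    }

    public static void main(String[] args) {
        System.out.println(dispatch(EventType.UPDATE, "row42")); // prints update:row42
    }
}
```

Each implementation stays small and independently testable, and adding a new event type means registering one handler rather than growing the switch.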





[jira] [Updated] (NIFI-11380) Refactor CaptureChangeMySQL with improvements

2023-04-03 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11380:

Fix Version/s: 1.latest
   2.latest

> Refactor CaptureChangeMySQL with improvements
> -
>
> Key: NIFI-11380
> URL: https://issues.apache.org/jira/browse/NIFI-11380
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>
> The CaptureChangeMySQL processor can be improved in many ways:
> - Eliminate use of DistributedCacheClient
> - MySQLCDCUtils - delete this class. It’s unnecessary. Just put the single 
> method that exists in the parent class of the writers
> - Eliminate mutable member variables. Gather all state together into an 
> object and store that as a single volatile member variable.
> - The outputEvents method is a huge block of switch/case and if/then/else 
> blocks. Kill all of this. Create an interface that’s capable of handling a 
> given event type and have multiple implementations. Determine appropriate 
> impl and call the method.
> - Do not keep a bunch of member variables to “rollback local state”. Keep 
> this in variables. If we fail, no harm, no foul. If we succeed, then update 
> member variable.
> - Remove onStopped method, just annotate stop() method with @OnStopped. No 
> need for @OnShutdown
> - Change name of “hostname” property to "node", and don’t require the port! 
> Default to 3306.
> - Remove unused hasRun member variable





[jira] [Assigned] (NIFI-11380) Refactor CaptureChangeMySQL with improvements

2023-04-03 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11380:
---

Assignee: Matt Burgess

> Refactor CaptureChangeMySQL with improvements
> -
>
> Key: NIFI-11380
> URL: https://issues.apache.org/jira/browse/NIFI-11380
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>
> The CaptureChangeMySQL processor can be improved in many ways:
> - Eliminate use of DistributedCacheClient
> - MySQLCDCUtils - delete this class. It’s unnecessary. Just put the single 
> method that exists in the parent class of the writers
> - Eliminate mutable member variables. Gather all state together into an 
> object and store that as a single volatile member variable.
> - The outputEvents method is a huge block of switch/case and if/then/else 
> blocks. Kill all of this. Create an interface that’s capable of handling a 
> given event type and have multiple implementations. Determine appropriate 
> impl and call the method.
> - Do not keep a bunch of member variables to “rollback local state”. Keep 
> this in variables. If we fail, no harm, no foul. If we succeed, then update 
> member variable.
> - Remove onStopped method, just annotate stop() method with @OnStopped. No 
> need for @OnShutdown
> - Change name of “hostname” property to "node", and don’t require the port! 
> Default to 3306.
> - Remove unused hasRun member variable





[jira] [Created] (NIFI-11380) Refactor CaptureChangeMySQL with improvements

2023-04-03 Thread Matt Burgess (Jira)
Matt Burgess created NIFI-11380:
---

 Summary: Refactor CaptureChangeMySQL with improvements
 Key: NIFI-11380
 URL: https://issues.apache.org/jira/browse/NIFI-11380
 Project: Apache NiFi
  Issue Type: Improvement
Reporter: Matt Burgess


The CaptureChangeMySQL processor can be improved in many ways:

- Eliminate use of DistributedCacheClient
- MySQLCDCUtils - delete this class. It’s unnecessary. Just put the single 
method that exists in the parent class of the writers

- Eliminate mutable member variables. Gather all state together into an object 
and store that as a single volatile member variable.
- The outputEvents method is a huge block of switch/case and if/then/else 
blocks. Kill all of this. Create an interface that’s capable of handling a 
given event type and have multiple implementations. Determine appropriate impl 
and call the method.
- Do not keep a bunch of member variables to “rollback local state”. Keep this 
in variables. If we fail, no harm, no foul. If we succeed, then update member 
variable.
- Remove onStopped method, just annotate stop() method with @OnStopped. No need 
for @OnShutdown
- Change name of “hostname” property to "node", and don’t require the port! 
Default to 3306.
- Remove unused hasRun member variable






[jira] [Updated] (NIFI-8710) StandardProvenanceReporter - receive() incorrectly overriding details parameter

2023-03-31 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-8710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-8710:
---
Fix Version/s: 2.latest

> StandardProvenanceReporter - receive() incorrectly overriding details parameter
> --
>
> Key: NIFI-8710
> URL: https://issues.apache.org/jira/browse/NIFI-8710
> Project: Apache NiFi
>  Issue Type: Bug
>Reporter: Nissim Shiman
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 2.latest
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The ProvenanceReporter interface has receive() with details as the third 
> parameter [1]:
> void receive(FlowFile flowFile, String transitUri, String details, long 
> transmissionMillis);
> StandardProvenanceReporter.java implements this with 
> sourceSystemFlowFileIdentifier as the third parameter [2]:
>  public void receive(final FlowFile flowFile, final String transitUri, final 
> String sourceSystemFlowFileIdentifier, final long transmissionMillis)
> This implementation in StandardProvenanceReporter should be modified to 
> reflect the interface.
>  
> [1] 
> [https://github.com/apache/nifi/blob/main/nifi-api/src/main/java/org/apache/nifi/provenance/ProvenanceReporter.java#L100]
> [2] 
> [https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-components/src/main/java/org/apache/nifi/controller/repository/StandardProvenanceReporter.java#L159]
>  





[jira] [Updated] (NIFI-11337) PutDatabaseRecord docs don't reflect actual behavior and Max Batch Size shouldn't be a dependent property

2023-03-23 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11337:

Fix Version/s: 1.latest
   2.latest
   Status: Patch Available  (was: In Progress)

> PutDatabaseRecord docs don't reflect actual behavior and Max Batch Size 
> shouldn't be a dependent property
> -
>
> Key: NIFI-11337
> URL: https://issues.apache.org/jira/browse/NIFI-11337
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The current documentation for PutDatabaseRecord's Maximum Batch Size property 
> says: "Specifies maximum batch size for INSERT and UPDATE statements. This 
> parameter has no effect for other statements specified in 'Statement Type'. 
> Zero means the batch size is not limited." This is not accurate as the 
> Maximum Batch Size is used for all batches regardless of statement type. This 
> also makes this property not dependent, so the dependency should be removed and 
> the documentation updated to reflect the behavior. The default value of 
> Maximum Batch Size should also be changed from 0 to something like 1000 to 
> avoid possible memory usage issues for large numbers of incoming records.
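The intended flush behavior (applied to every statement type, flushing at Maximum Batch Size, with 0 meaning unbounded) can be sketched without a database; flushPoints is a hypothetical helper standing in for addBatch()/executeBatch() calls:

```java
// Sketch of the batching behavior discussed above: batches are flushed every
// maxBatchSize records for all statement types, and 0 means unbounded.
// flushPoints is a hypothetical helper standing in for addBatch()/executeBatch().
import java.util.ArrayList;
import java.util.List;

public class BatchFlushSketch {

    // Returns the size of each executed batch.
    static List<Integer> flushPoints(int recordCount, int maxBatchSize) {
        List<Integer> batches = new ArrayList<>();
        int pending = 0;
        for (int i = 0; i < recordCount; i++) {
            pending++;                            // stmt.addBatch() would go here
            if (maxBatchSize > 0 && pending == maxBatchSize) {
                batches.add(pending);             // stmt.executeBatch()
                pending = 0;
            }
        }
        if (pending > 0) {
            batches.add(pending);                 // flush the final partial batch
        }
        return batches;
    }

    public static void main(String[] args) {
        System.out.println(flushPoints(2500, 1000)); // prints [1000, 1000, 500]
    }
}
```

With a default of 1000 instead of 0, memory held for pending statements is bounded regardless of how many records arrive in one FlowFile.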





[jira] [Created] (NIFI-11337) PutDatabaseRecord docs don't reflect actual behavior and Max Batch Size shouldn't be a dependent property

2023-03-23 Thread Matt Burgess (Jira)
Matt Burgess created NIFI-11337:
---

 Summary: PutDatabaseRecord docs don't reflect actual behavior and 
Max Batch Size shouldn't be a dependent property
 Key: NIFI-11337
 URL: https://issues.apache.org/jira/browse/NIFI-11337
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Reporter: Matt Burgess


The current documentation for PutDatabaseRecord's Maximum Batch Size property 
says: "Specifies maximum batch size for INSERT and UPDATE statements. This 
parameter has no effect for other statements specified in 'Statement Type'. 
Zero means the batch size is not limited." This is not accurate as the Maximum 
Batch Size is used for all batches regardless of statement type. This also 
makes this property not dependent, so the dependency should be removed and the 
documentation updated to reflect the behavior. The default value of Maximum 
Batch Size should also be changed from 0 to something like 1000 to avoid 
possible memory usage issues for large numbers of incoming records.





[jira] [Assigned] (NIFI-11337) PutDatabaseRecord docs don't reflect actual behavior and Max Batch Size shouldn't be a dependent property

2023-03-23 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11337:
---

Assignee: Matt Burgess

> PutDatabaseRecord docs don't reflect actual behavior and Max Batch Size 
> shouldn't be a dependent property
> -
>
> Key: NIFI-11337
> URL: https://issues.apache.org/jira/browse/NIFI-11337
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>
> The current documentation for PutDatabaseRecord's Maximum Batch Size property 
> says: "Specifies maximum batch size for INSERT and UPDATE statements. This 
> parameter has no effect for other statements specified in 'Statement Type'. 
> Zero means the batch size is not limited." This is not accurate as the 
> Maximum Batch Size is used for all batches regardless of statement type. This 
> also makes this property not dependent, so the dependency should be removed and 
> the documentation updated to reflect the behavior. The default value of 
> Maximum Batch Size should also be changed from 0 to something like 1000 to 
> avoid possible memory usage issues for large numbers of incoming records.





[jira] [Reopened] (NIFI-11305) CaptureChangeMySQL does not stop if the queue is not empty

2023-03-21 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reopened NIFI-11305:
-

Reopening as there is an order-of-events bug where the binlog client waits on 
the event listener thread which no longer has anything pulling events off the 
queue. The event listener should be unregistered before the binlog client 
disconnects.

> CaptureChangeMySQL does not stop if the queue is not empty
> --
>
> Key: NIFI-11305
> URL: https://issues.apache.org/jira/browse/NIFI-11305
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 2.0.0, 1.21.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> There is a logic bug in the handling of stopping the CaptureChangeMySQL 
> processor. This causes it to not stop while there are events in the queue. If 
> the processor isn't running fast enough to drain the queue, stopping the 
> processor will have no effect.
> The logic was being handled in OnStopped, but that won't get called until the 
> onTrigger has finished. Instead the loop should be checking to see if the 
> processor is still scheduled using isScheduled(), and if not should break out 
> of the loop and finish the onTrigger processing, thereby allowing the 
> OnStopped logic to be executed.





[jira] [Updated] (NIFI-11310) When Processor terminated, any resources that modify the classpath are not reloaded

2023-03-21 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11310:

Fix Version/s: 2.0.0
   1.21.0
   (was: 1.latest)
   (was: 2.latest)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> When Processor terminated, any resources that modify the classpath are not 
> reloaded
> ---
>
> Key: NIFI-11310
> URL: https://issues.apache.org/jira/browse/NIFI-11310
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Major
> Fix For: 2.0.0, 1.21.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Processors often provide Property Descriptors that can be used to modify the 
> classpath. For example, processors may expose a property for specifying .jar 
> files to load, such as JDBC drivers.
> When a Processor is terminated, though, and restarted, the Processor no 
> longer has access to those additional classpath resources. The processor must 
> have its properties modified in some way in order to re-establish those 
> classpath resources. Otherwise, we see errors such as 
> {{ClassNotFoundException}} being thrown





[jira] [Updated] (NIFI-4651) PutSQL should return error messages and error codes in an attribute

2023-03-21 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-4651:
---
Fix Version/s: 2.0.0
   1.21.0
   (was: 1.latest)
   (was: 2.latest)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> PutSQL should return error messages and error codes in an attribute
> ---
>
> Key: NIFI-4651
> URL: https://issues.apache.org/jira/browse/NIFI-4651
> Project: Apache NiFi
>  Issue Type: Improvement
>Affects Versions: 1.20.0
>Reporter: Kay-Uwe Moosheimer
>Assignee: Zsihovszki Krisztina
>Priority: Minor
> Fix For: 2.0.0, 1.21.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In case of an error PutSQL should return the error messages and the error 
> code in attributes to be able to edit each FlowFile manually or automatically 
> according to the error.
> A selection "Return errors" -> "Yes/No" and the possibility to specify two 
> attribute names (one for error message and one for error code) would be 
> helpful for postprocessing.





[jira] [Updated] (NIFI-11305) CaptureChangeMySQL does not stop if the queue is not empty

2023-03-20 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11305:

Status: Patch Available  (was: In Progress)

> CaptureChangeMySQL does not stop if the queue is not empty
> --
>
> Key: NIFI-11305
> URL: https://issues.apache.org/jira/browse/NIFI-11305
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is a logic bug in the handling of stopping the CaptureChangeMySQL 
> processor. This causes it to not stop while there are events in the queue. If 
> the processor isn't running fast enough to drain the queue, stopping the 
> processor will have no effect.
> The logic was being handled in OnStopped, but that won't get called until the 
> onTrigger has finished. Instead the loop should be checking to see if the 
> processor is still scheduled using isScheduled(), and if not should break out 
> of the loop and finish the onTrigger processing, thereby allowing the 
> OnStopped logic to be executed.





[jira] [Updated] (NIFI-11305) CaptureChangeMySQL does not stop if the queue is not empty

2023-03-20 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11305:

Fix Version/s: 1.latest
   2.latest

> CaptureChangeMySQL does not stop if the queue is not empty
> --
>
> Key: NIFI-11305
> URL: https://issues.apache.org/jira/browse/NIFI-11305
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>
> There is a logic bug in the handling of stopping the CaptureChangeMySQL 
> processor. This causes it to not stop while there are events in the queue. If 
> the processor isn't running fast enough to drain the queue, stopping the 
> processor will have no effect.
> The logic was being handled in OnStopped, but that won't get called until the 
> onTrigger has finished. Instead the loop should be checking to see if the 
> processor is still scheduled using isScheduled(), and if not should break out 
> of the loop and finish the onTrigger processing, thereby allowing the 
> OnStopped logic to be executed.





[jira] [Created] (NIFI-11305) CaptureChangeMySQL does not stop if the queue is not empty

2023-03-20 Thread Matt Burgess (Jira)
Matt Burgess created NIFI-11305:
---

 Summary: CaptureChangeMySQL does not stop if the queue is not empty
 Key: NIFI-11305
 URL: https://issues.apache.org/jira/browse/NIFI-11305
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Reporter: Matt Burgess


There is a logic bug in the handling of stopping the CaptureChangeMySQL 
processor. This causes it to not stop while there are events in the queue. If 
the processor isn't running fast enough to drain the queue, stopping the 
processor will have no effect.

The logic was being handled in OnStopped, but that won't get called until the 
onTrigger has finished. Instead the loop should be checking to see if the 
processor is still scheduled using isScheduled(), and if not should break out 
of the loop and finish the onTrigger processing, thereby allowing the OnStopped 
logic to be executed.
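The proposed loop can be sketched as follows. This is a simplified, hypothetical shape (the real processor checks NiFi's `isScheduled()` on its scheduling context and drains an internal binlog event queue); the point is that the scheduled flag is tested on every iteration so stopping the processor lets onTrigger finish and the OnStopped logic run:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Simplified sketch of the fixed onTrigger loop: re-check the scheduled flag
// each iteration so a stop request breaks out of the drain loop instead of
// looping until the queue empties. Names here are illustrative.
class EventDrainLoop {
    static int drainWhileScheduled(BlockingQueue<String> queue, AtomicBoolean scheduled) {
        int processed = 0;
        while (scheduled.get()) {           // was: loop until queue empty, ignoring stop
            String event = queue.poll();    // non-blocking take from the event queue
            if (event == null) {
                break;                      // queue drained for this trigger
            }
            processed++;                    // stand-in for emitting a FlowFile
        }
        return processed;
    }
}
```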





[jira] [Assigned] (NIFI-11305) CaptureChangeMySQL does not stop if the queue is not empty

2023-03-20 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11305:
---

Assignee: Matt Burgess

> CaptureChangeMySQL does not stop if the queue is not empty
> --
>
> Key: NIFI-11305
> URL: https://issues.apache.org/jira/browse/NIFI-11305
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>
> There is a logic bug in the handling of stopping the CaptureChangeMySQL 
> processor. This causes it to not stop while there are events in the queue. If 
> the processor isn't running fast enough to drain the queue, stopping the 
> processor will have no effect.
> The logic was being handled in OnStopped, but that won't get called until the 
> onTrigger has finished. Instead the loop should be checking to see if the 
> processor is still scheduled using isScheduled(), and if not should break out 
> of the loop and finish the onTrigger processing, thereby allowing the 
> OnStopped logic to be executed.





[jira] [Updated] (NIFI-11285) Unable to build module created by maven archetype plugin

2023-03-14 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11285:

Fix Version/s: 2.0.0
   1.21.0
   (was: 1.latest)
   (was: 2.latest)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Unable to build module created by maven archetype plugin
> 
>
> Key: NIFI-11285
> URL: https://issues.apache.org/jira/browse/NIFI-11285
> Project: Apache NiFi
>  Issue Type: Bug
>Reporter: Nandor Soma Abonyi
>Assignee: Nandor Soma Abonyi
>Priority: Major
> Fix For: 2.0.0, 1.21.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Modules created by the archetype plugin use dependencies that are not 
> imported. Also, they have check-style issues.





[jira] [Updated] (NIFI-11279) CaptureChangeMySQL should not stop processing if event stream is out of sync

2023-03-13 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11279:

Status: Patch Available  (was: In Progress)

> CaptureChangeMySQL should not stop processing if event stream is out of sync
> 
>
> Key: NIFI-11279
> URL: https://issues.apache.org/jira/browse/NIFI-11279
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently if CaptureChangeMySQL gets a BEGIN event while already in a 
> transaction or a COMMIT event without being in a transaction, processing of 
> the event queue stalls and no data will be emitted. Instead a warning should 
> be issued to the user that the event stream may be out of sync, and the 
> processor should continue processing binlog events.





[jira] [Updated] (NIFI-11279) CaptureChangeMySQL should not stop processing if event stream is out of sync

2023-03-13 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11279:

Fix Version/s: 1.latest
   2.latest

> CaptureChangeMySQL should not stop processing if event stream is out of sync
> 
>
> Key: NIFI-11279
> URL: https://issues.apache.org/jira/browse/NIFI-11279
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 1.latest, 2.latest
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently if CaptureChangeMySQL gets a BEGIN event while already in a 
> transaction or a COMMIT event without being in a transaction, processing of 
> the event queue stalls and no data will be emitted. Instead a warning should 
> be issued to the user that the event stream may be out of sync, and the 
> processor should continue processing binlog events.





[jira] [Assigned] (NIFI-11279) CaptureChangeMySQL should not stop processing if event stream is out of sync

2023-03-13 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11279:
---

Assignee: Matt Burgess

> CaptureChangeMySQL should not stop processing if event stream is out of sync
> 
>
> Key: NIFI-11279
> URL: https://issues.apache.org/jira/browse/NIFI-11279
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>
> Currently if CaptureChangeMySQL gets a BEGIN event while already in a 
> transaction or a COMMIT event without being in a transaction, processing of 
> the event queue stalls and no data will be emitted. Instead a warning should 
> be issued to the user that the event stream may be out of sync, and the 
> processor should continue processing binlog events.





[jira] [Created] (NIFI-11279) CaptureChangeMySQL should not stop processing if event stream is out of sync

2023-03-13 Thread Matt Burgess (Jira)
Matt Burgess created NIFI-11279:
---

 Summary: CaptureChangeMySQL should not stop processing if event 
stream is out of sync
 Key: NIFI-11279
 URL: https://issues.apache.org/jira/browse/NIFI-11279
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Reporter: Matt Burgess


Currently if CaptureChangeMySQL gets a BEGIN event while already in a 
transaction or a COMMIT event without being in a transaction, processing of the 
event queue stalls and no data will be emitted. Instead a warning should be 
issued to the user that the event stream may be out of sync, and the processor 
should continue processing binlog events.
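The proposed behavior amounts to a small transaction state machine that warns and resyncs instead of stalling. A hedged sketch (illustrative only, not the processor's actual code):

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of the proposed handling: on a mismatched BEGIN or COMMIT,
// record a warning and resynchronize the transaction state so later binlog
// events keep flowing, instead of stalling the event queue.
class BinlogTxnTracker {
    private boolean inTransaction = false;
    final List<String> warnings = new ArrayList<>();

    void onBegin() {
        if (inTransaction) {
            warnings.add("BEGIN received while already in a transaction; event stream may be out of sync");
        }
        inTransaction = true;   // keep processing rather than stalling
    }

    void onCommit() {
        if (!inTransaction) {
            warnings.add("COMMIT received outside a transaction; event stream may be out of sync");
        }
        inTransaction = false;  // resync and continue with later binlog events
    }
}
```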





[jira] [Updated] (NIFI-5501) MySQL connection cleanup thread leak

2023-03-10 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-5501:
---
Fix Version/s: 1.21.0

> MySQL connection cleanup thread leak
> 
>
> Key: NIFI-5501
> URL: https://issues.apache.org/jira/browse/NIFI-5501
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
> Environment: - Ubuntu 18.04 LTS
> - mysql-connector-java-5.1.46
> - Java 1.8.0_171
>Reporter: Tanapol Nearunchorn
>Assignee: Matt Burgess
>Priority: Major
> Fix For: 2.0.0, 1.21.0
>
> Attachments: nifi-threaddump.txt
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> There are thousands of "Abandoned connection cleanup thread" threads left 
> running that cause a memory leak in NiFi.
> I got a thread dump example here (full thread dump also attached):
> {code:java}
> "Abandoned connection cleanup thread" #18371 daemon prio=5 os_prio=0 
> tid=0x7f3b840e7800 nid=0x76a3 in Object.wait() [0x7f3b24ebb000]
> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
> - locked <0x000348a12628> (a java.lang.ref.ReferenceQueue$Lock)
> at 
> com.mysql.jdbc.AbandonedConnectionCleanupThread.run(AbandonedConnectionCleanupThread.java:64)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Locked ownable synchronizers:
> - <0x000348a12648> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> {code}
> I'm not sure which component these threads come from, because in my flow I 
> used the CaptureChangeMySQL processor and the DBCPConnectionPool controller 
> service.
> I also found a related problem here: [mysql Bug 
> #69526|https://bugs.mysql.com/bug.php?id=69526], but it is quite old.





[jira] [Updated] (NIFI-5501) MySQL connection cleanup thread leak

2023-03-10 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-5501:
---
Affects Version/s: (was: 1.7.1)
   Status: Patch Available  (was: In Progress)

> MySQL connection cleanup thread leak
> 
>
> Key: NIFI-5501
> URL: https://issues.apache.org/jira/browse/NIFI-5501
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
> Environment: - Ubuntu 18.04 LTS
> - mysql-connector-java-5.1.46
> - Java 1.8.0_171
>Reporter: Tanapol Nearunchorn
>Assignee: Matt Burgess
>Priority: Major
> Attachments: nifi-threaddump.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There are thousands of "Abandoned connection cleanup thread" threads left 
> running that cause a memory leak in NiFi.
> I got a thread dump example here (full thread dump also attached):
> {code:java}
> "Abandoned connection cleanup thread" #18371 daemon prio=5 os_prio=0 
> tid=0x7f3b840e7800 nid=0x76a3 in Object.wait() [0x7f3b24ebb000]
> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
> - locked <0x000348a12628> (a java.lang.ref.ReferenceQueue$Lock)
> at 
> com.mysql.jdbc.AbandonedConnectionCleanupThread.run(AbandonedConnectionCleanupThread.java:64)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Locked ownable synchronizers:
> - <0x000348a12648> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> {code}
> I'm not sure which component these threads come from, because in my flow I 
> used the CaptureChangeMySQL processor and the DBCPConnectionPool controller 
> service.
> I also found a related problem here: [mysql Bug 
> #69526|https://bugs.mysql.com/bug.php?id=69526], but it is quite old.





[jira] [Reopened] (NIFI-5501) MySQL connection cleanup thread leak

2023-03-10 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reopened NIFI-5501:

  Assignee: Matt Burgess

Reopening as this processor should handle the driver and connections more 
cleanly to avoid creating multiple cleanup threads.
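One way to clean up on processor stop is to ask Connector/J to stop its abandoned-connection cleanup thread. The class and method names below (`com.mysql.jdbc.AbandonedConnectionCleanupThread.shutdown()`) come from the 5.1 driver line the reporter used and should be treated as an assumption; reflection keeps the sketch compilable without the driver on the classpath:

```java
// Hedged sketch: shut down Connector/J 5.1's abandoned-connection cleanup
// thread when the component stops. Invoked reflectively so this compiles and
// runs without the MySQL driver present; returns false when the driver (or
// that particular API) is absent.
class MySqlCleanup {
    static boolean shutdownCleanupThread() {
        try {
            Class<?> cls = Class.forName("com.mysql.jdbc.AbandonedConnectionCleanupThread");
            cls.getMethod("shutdown").invoke(null);
            return true;
        } catch (ReflectiveOperationException e) {
            return false; // driver not on the classpath, or API differs
        }
    }
}
```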

> MySQL connection cleanup thread leak
> 
>
> Key: NIFI-5501
> URL: https://issues.apache.org/jira/browse/NIFI-5501
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.7.1
> Environment: - Ubuntu 18.04 LTS
> - mysql-connector-java-5.1.46
> - Java 1.8.0_171
>Reporter: Tanapol Nearunchorn
>Assignee: Matt Burgess
>Priority: Major
> Attachments: nifi-threaddump.txt
>
>
> There are thousands of "Abandoned connection cleanup thread" left running 
> that cause memory leak in NiFi.
> I got a thread dump example here (full thread dump also attached):
> {code:java}
> "Abandoned connection cleanup thread" #18371 daemon prio=5 os_prio=0 
> tid=0x7f3b840e7800 nid=0x76a3 in Object.wait() [0x7f3b24ebb000]
> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
> - locked <0x000348a12628> (a java.lang.ref.ReferenceQueue$Lock)
> at 
> com.mysql.jdbc.AbandonedConnectionCleanupThread.run(AbandonedConnectionCleanupThread.java:64)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Locked ownable synchronizers:
> - <0x000348a12648> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> {code}
> I'm not sure where these threads come from which component because in my 
> flow, I used CaptureChangeMySQL processor and DBCPConnectionPool controller 
> service.
> As I also found related problem here: [mysql Bug 
> #69526|https://bugs.mysql.com/bug.php?id=69526] but it quite a bit old.





[jira] [Updated] (NIFI-11253) Remove H2 Database Migrator

2023-03-07 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11253:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Remove H2 Database Migrator
> ---
>
> Key: NIFI-11253
> URL: https://issues.apache.org/jira/browse/NIFI-11253
> Project: Apache NiFi
>  Issue Type: Task
>  Components: Core Framework
>Reporter: David Handermann
>Assignee: David Handermann
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> NiFi 1.16.0 and following included a shaded version of the H2 1.4 library to 
> support automated migration from H2 1.4 to 2.1. All installations upgrading 
> from NiFi 1.16.0 or later to a newer version do not need the migration. The 
> custom migration should be removed for NiFi 2.0.0 since H2 2.1 is the 
> baseline version.





[jira] [Resolved] (NIFI-11147) Allow QuerySalesforceObject to query all existing fields

2023-02-23 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess resolved NIFI-11147.
-
Fix Version/s: 2.0.0
   Resolution: Fixed

> Allow QuerySalesforceObject to query all existing fields
> 
>
> Key: NIFI-11147
> URL: https://issues.apache.org/jira/browse/NIFI-11147
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Matt Burgess
>Assignee: Lehel Boér
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently the Field Names property of QuerySalesforceObject is required and 
> must contain the names of the fields the user wants to return. However in a 
> schema drift use case, the user may want to add a field to a Salesforce 
> object and have the NiFi flow continue without needing alteration.
> This Jira is to make it possible for QuerySalesforceObject to return all 
> fields from an object. A suggestion is to make Field Names optional and if it 
> is not set, all fields are queried. The documentation should be updated to 
> match the behavior.
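One possible shape of the suggestion, as a hypothetical helper (SOQL has no SELECT *, so "all fields" would have to come from the object's describe metadata; names here are illustrative, not the processor's actual code):

```java
import java.util.List;

// Hypothetical sketch of the suggested behavior: when Field Names is unset,
// fall back to the full field list obtained from the Salesforce object's
// describe metadata, making the flow tolerant of schema drift.
class SoqlBuilder {
    static String buildQuery(String sObject, List<String> fieldNames, List<String> describedFields) {
        List<String> fields = (fieldNames == null || fieldNames.isEmpty())
                ? describedFields            // all current fields of the object
                : fieldNames;                // explicit user-provided list
        return "SELECT " + String.join(", ", fields) + " FROM " + sObject;
    }
}
```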





[jira] [Updated] (NIFI-11209) UpdateHive3Table and UpdateDatabaseTable erase newly created columns in output FlowFile

2023-02-23 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11209:

Status: Patch Available  (was: In Progress)

> UpdateHive3Table and UpdateDatabaseTable erase newly created columns in 
> output FlowFile
> ---
>
> Key: NIFI-11209
> URL: https://issues.apache.org/jira/browse/NIFI-11209
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When using UpdateHive3Table or UpdateDatabaseTable with the Update Field Names 
> property set to "true", the output will include only those fields that 
> already had existing columns in the target table. This means if an incoming 
> FlowFile has a "new" field, the column will be created and populated in the 
> target database, but the output FlowFile will no longer have the "new" fields 
> in the outgoing records.
> This is because the original database columns are used to populate the 
> outgoing record schema when the Update Field Names property is set to true. 
> Instead the database columns should be refreshed/maintained after any DDL is 
> executed in order to get the updated set of database column names.





[jira] [Updated] (NIFI-11209) UpdateHive3Table and UpdateDatabaseTable erase newly created columns in output FlowFile

2023-02-23 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11209:

Description: 
When using UpdateHive3Table or UpdateDatabaseTable with the Update Field Names 
property set to "true", the output will include only those fields that already 
had existing columns in the target table. This means if an incoming FlowFile 
has a "new" field, the column will be created and populated in the target 
database, but the output FlowFile will no longer have the "new" fields in the 
outgoing records.

This is because the original database columns are used to populate the outgoing 
record schema when the Update Field Names property is set to true. Instead the 
database columns should be refreshed/maintained after any DDL is executed in 
order to get the updated set of database column names.

  was:
When using UpdateHive3Table or UpdateDatabaseTable with the Update Field Names 
property set to "true", the output will include only those fields that already 
had existing columns in the target table. This means if an incoming FlowFile 
has a "new" field, the column will be created and populated in the target 
database, but the output FlowFile will no longer have the "new" fields in the 
outgoing records.

This is because the original database columns are used to populate the outgoing 
record schema when the Update Field Names property is set to true. Instead the 
database columns should be refreshed after any DDL is executed in order to get 
the updated set of database column names.


> UpdateHive3Table and UpdateDatabaseTable erase newly created columns in 
> output FlowFile
> ---
>
> Key: NIFI-11209
> URL: https://issues.apache.org/jira/browse/NIFI-11209
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>
> When using UpdateHive3Table or UpdateDatabaseTable with the Update Field Names 
> property set to "true", the output will include only those fields that 
> already had existing columns in the target table. This means if an incoming 
> FlowFile has a "new" field, the column will be created and populated in the 
> target database, but the output FlowFile will no longer have the "new" fields 
> in the outgoing records.
> This is because the original database columns are used to populate the 
> outgoing record schema when the Update Field Names property is set to true. 
> Instead the database columns should be refreshed/maintained after any DDL is 
> executed in order to get the updated set of database column names.
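The fix amounts to rebuilding the outgoing record schema from the column set as it exists after the DDL ran. A minimal illustration (hypothetical helper, not the processors' actual code):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Minimal illustration (hypothetical): the outgoing record schema must come
// from the column set refreshed AFTER any ALTER TABLE, so newly created
// columns survive into the output FlowFile's records.
class ColumnRefresh {
    static List<String> refreshedColumns(List<String> columnsBeforeDdl, List<String> incomingFields) {
        LinkedHashSet<String> cols = new LinkedHashSet<>(columnsBeforeDdl);
        cols.addAll(incomingFields);   // simulate ALTER TABLE ADD COLUMN + metadata re-read
        return new ArrayList<>(cols);
    }
}
```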





[jira] [Assigned] (NIFI-11209) UpdateHive3Table and UpdateDatabaseTable erase newly created columns in output FlowFile

2023-02-23 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11209:
---

Assignee: Matt Burgess

> UpdateHive3Table and UpdateDatabaseTable erase newly created columns in 
> output FlowFile
> ---
>
> Key: NIFI-11209
> URL: https://issues.apache.org/jira/browse/NIFI-11209
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>
> When using UpdateHive3Table or UpdateDatabaseTable with the Update Field Names 
> property set to "true", the output will include only those fields that 
> already had existing columns in the target table. This means if an incoming 
> FlowFile has a "new" field, the column will be created and populated in the 
> target database, but the output FlowFile will no longer have the "new" fields 
> in the outgoing records.
> This is because the original database columns are used to populate the 
> outgoing record schema when the Update Field Names property is set to true. 
> Instead the database columns should be refreshed after any DDL is executed in 
> order to get the updated set of database column names.





[jira] [Updated] (NIFI-11192) If Port moved from parent to child group or vice versa between flow versions, version change can leave nifi in bad state

2023-02-22 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11192:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> If Port moved from parent to child group or vice versa between flow versions, 
> version change can leave nifi in bad state
> 
>
> Key: NIFI-11192
> URL: https://issues.apache.org/jira/browse/NIFI-11192
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Blocker
> Fix For: 2.0.0, 1.21.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> To reproduce:
> Create a Process Group, Parent.
> Inside of Parent, create:
>  * A processor, for example, ReplaceText
>  * A Process Group, Child.
> Inside of Child, create:
>  * An Input Port, In
>  * A Processor, for example, UpdateAttribute
>  * A connection between the two components
> Then connect ReplaceText to the Input Port, In.
> Save the Parent PG flow as Version 1 of a flow.
> Now, create a new Processor, say RouteOnAttribute, within the Parent PG.
> Move the destination of the connection from the Input Port to 
> RouteOnAttribute.
> Step into the Child PG. Select all components, right-click, and choose "Move 
> to Parent"
> Save Parent PG as Version 2 of the flow.
> Now, attempt to Change Version on the Parent Group. Change the version to 
> Version 1.
> The version change will fail with an error: "Failed to perform update flow 
> request due to 42fb2904-c774-359b-5368-2e48b60ac02d is the destination of 
> another component" and the logs will have a stack trace:
> {code:java}
> 2023-02-16 15:18:28,830 ERROR [Process Group Update Thread-1] 
> o.apache.nifi.web.api.FlowUpdateResource Failed to perform update flow request
> java.lang.IllegalStateException: 42fb2904-c774-359b-5368-2e48b60ac02d is the 
> destination of another component
>     at 
> org.apache.nifi.controller.AbstractPort.verifyCanDelete(AbstractPort.java:562)
>     at 
> org.apache.nifi.controller.AbstractPort.verifyCanDelete(AbstractPort.java:542)
>     at 
> org.apache.nifi.groups.StandardProcessGroup.removeInputPort(StandardProcessGroup.java:637)
>     at 
> org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.removeMissingComponents(StandardVersionedComponentSynchronizer.java:948)
>     at 
> org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.removeMissingInputPorts(StandardVersionedComponentSynchronizer.java:873)
>     at 
> org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.synchronize(StandardVersionedComponentSynchronizer.java:410)
>     at 
> org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.lambda$synchronize$0(StandardVersionedComponentSynchronizer.java:260)
>     at 
> org.apache.nifi.controller.flow.AbstractFlowManager.withParameterContextResolution(AbstractFlowManager.java:556)
>     at 
> org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.synchronize(StandardVersionedComponentSynchronizer.java:255)
>     at 
> org.apache.nifi.groups.StandardProcessGroup.synchronizeFlow(StandardProcessGroup.java:3972)
>     at 
> org.apache.nifi.groups.StandardProcessGroup.updateFlow(StandardProcessGroup.java:3952)
>     at 
> org.apache.nifi.web.dao.impl.StandardProcessGroupDAO.updateProcessGroupFlow(StandardProcessGroupDAO.java:435)
>     at 
> org.apache.nifi.web.dao.impl.StandardProcessGroupDAO$$FastClassBySpringCGLIB$$10a99b47.invoke()
>     at 
> org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
>     at 
> org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:793)
>     at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
>     at 
> org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:763)
>     at 
> org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:89)
>     at 
> org.apache.nifi.audit.ProcessGroupAuditor.updateProcessGroupFlowAdvice(ProcessGroupAuditor.java:308)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
>     at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at 
> 

[jira] [Updated] (NIFI-11192) If Port moved from parent to child group or vice versa between flow versions, version change can leave nifi in bad state

2023-02-21 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11192:

Status: Patch Available  (was: Open)

> If Port moved from parent to child group or vice versa between flow versions, 
> version change can leave nifi in bad state
> 
>
> Key: NIFI-11192
> URL: https://issues.apache.org/jira/browse/NIFI-11192
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Blocker
> Fix For: 2.0.0, 1.21.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> To reproduce:
> Create a Process Group, Parent.
> Inside of Parent, create:
>  * A processor, for example, ReplaceText
>  * A Process Group, Child.
> Inside of Child, create:
>  * An Input Port, In
>  * A Processor, for example, UpdateAttribute
>  * A connection between the two components
> Then connect ReplaceText to the Input Port, In.
> Save the Parent PG flow as Version 1 of a flow.
> Now, create a new Processor, say RouteOnAttribute, within the Parent PG.
> Move the destination of the connection from the Input Port to 
> RouteOnAttribute.
> Step into the Child PG. Select all components, right-click, and choose "Move 
> to Parent"
> Save Parent PG as Version 2 of the flow.
> Now, attempt to Change Version on the Parent Group. Change the version to 
> Version 1.
> The version change will fail with an error: "Failed to perform update flow 
> request due to 42fb2904-c774-359b-5368-2e48b60ac02d is the destination of 
> another component" and the logs will have a stack trace:
> {code:java}
> 2023-02-16 15:18:28,830 ERROR [Process Group Update Thread-1] 
> o.apache.nifi.web.api.FlowUpdateResource Failed to perform update flow request
> java.lang.IllegalStateException: 42fb2904-c774-359b-5368-2e48b60ac02d is the 
> destination of another component
>     at 
> org.apache.nifi.controller.AbstractPort.verifyCanDelete(AbstractPort.java:562)
>     at 
> org.apache.nifi.controller.AbstractPort.verifyCanDelete(AbstractPort.java:542)
>     at 
> org.apache.nifi.groups.StandardProcessGroup.removeInputPort(StandardProcessGroup.java:637)
>     at 
> org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.removeMissingComponents(StandardVersionedComponentSynchronizer.java:948)
>     at 
> org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.removeMissingInputPorts(StandardVersionedComponentSynchronizer.java:873)
>     at 
> org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.synchronize(StandardVersionedComponentSynchronizer.java:410)
>     at 
> org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.lambda$synchronize$0(StandardVersionedComponentSynchronizer.java:260)
>     at 
> org.apache.nifi.controller.flow.AbstractFlowManager.withParameterContextResolution(AbstractFlowManager.java:556)
>     at 
> org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.synchronize(StandardVersionedComponentSynchronizer.java:255)
>     at 
> org.apache.nifi.groups.StandardProcessGroup.synchronizeFlow(StandardProcessGroup.java:3972)
>     at 
> org.apache.nifi.groups.StandardProcessGroup.updateFlow(StandardProcessGroup.java:3952)
>     at 
> org.apache.nifi.web.dao.impl.StandardProcessGroupDAO.updateProcessGroupFlow(StandardProcessGroupDAO.java:435)
>     at 
> org.apache.nifi.web.dao.impl.StandardProcessGroupDAO$$FastClassBySpringCGLIB$$10a99b47.invoke()
>     at 
> org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
>     at 
> org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:793)
>     at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
>     at 
> org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:763)
>     at 
> org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:89)
>     at 
> org.apache.nifi.audit.ProcessGroupAuditor.updateProcessGroupFlowAdvice(ProcessGroupAuditor.java:308)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
>     at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at 
> org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:634)
>     at 
> 

[jira] [Assigned] (NIFI-11192) If Port moved from parent to child group or vice versa between flow versions, version change can leave nifi in bad state

2023-02-21 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11192:
---

Assignee: Mark Payne

> If Port moved from parent to child group or vice versa between flow versions, 
> version change can leave nifi in bad state
> 
>
> Key: NIFI-11192
> URL: https://issues.apache.org/jira/browse/NIFI-11192
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Blocker
> Fix For: 2.0.0, 1.21.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> To reproduce:
> Create a Process Group, Parent.
> Inside of Parent, create:
>  * A processor, for example, ReplaceText
>  * A Process Group, Child.
> Inside of Child, create:
>  * An Input Port, In
>  * A Processor, for example, UpdateAttribute
>  * A connection between the two components
> Then connect ReplaceText to the Input Port, In.
> Save the Parent PG flow as Version 1 of a flow.
> Now, create a new Processor, say RouteOnAttribute, within the Parent PG.
> Move the destination of the connection from the Input Port to 
> RouteOnAttribute.
> Step into the Child PG. Select all components, right-click, and choose "Move 
> to Parent"
> Save Parent PG as Version 2 of the flow.
> Now, attempt to Change Version on the Parent Group. Change the version to 
> Version 1.
> The version change will fail with an error: "Failed to perform update flow 
> request due to 42fb2904-c774-359b-5368-2e48b60ac02d is the destination of 
> another component" and the logs will have a stack trace:
> {code:java}
> 2023-02-16 15:18:28,830 ERROR [Process Group Update Thread-1] 
> o.apache.nifi.web.api.FlowUpdateResource Failed to perform update flow request
> java.lang.IllegalStateException: 42fb2904-c774-359b-5368-2e48b60ac02d is the 
> destination of another component
>     at 
> org.apache.nifi.controller.AbstractPort.verifyCanDelete(AbstractPort.java:562)
>     at 
> org.apache.nifi.controller.AbstractPort.verifyCanDelete(AbstractPort.java:542)
>     at 
> org.apache.nifi.groups.StandardProcessGroup.removeInputPort(StandardProcessGroup.java:637)
>     at 
> org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.removeMissingComponents(StandardVersionedComponentSynchronizer.java:948)
>     at 
> org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.removeMissingInputPorts(StandardVersionedComponentSynchronizer.java:873)
>     at 
> org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.synchronize(StandardVersionedComponentSynchronizer.java:410)
>     at 
> org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.lambda$synchronize$0(StandardVersionedComponentSynchronizer.java:260)
>     at 
> org.apache.nifi.controller.flow.AbstractFlowManager.withParameterContextResolution(AbstractFlowManager.java:556)
>     at 
> org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.synchronize(StandardVersionedComponentSynchronizer.java:255)
>     at 
> org.apache.nifi.groups.StandardProcessGroup.synchronizeFlow(StandardProcessGroup.java:3972)
>     at 
> org.apache.nifi.groups.StandardProcessGroup.updateFlow(StandardProcessGroup.java:3952)
>     at 
> org.apache.nifi.web.dao.impl.StandardProcessGroupDAO.updateProcessGroupFlow(StandardProcessGroupDAO.java:435)
>     at 
> org.apache.nifi.web.dao.impl.StandardProcessGroupDAO$$FastClassBySpringCGLIB$$10a99b47.invoke()
>     at 
> org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
>     at 
> org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:793)
>     at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
>     at 
> org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:763)
>     at 
> org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:89)
>     at 
> org.apache.nifi.audit.ProcessGroupAuditor.updateProcessGroupFlowAdvice(ProcessGroupAuditor.java:308)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
>     at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at 
> org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:634)
>     at 
> 

[jira] [Updated] (NIFI-11187) Remove ActiveMQ from Standard Processors

2023-02-21 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11187:

Fix Version/s: 2.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Remove ActiveMQ from Standard Processors
> 
>
> Key: NIFI-11187
> URL: https://issues.apache.org/jira/browse/NIFI-11187
> Project: Apache NiFi
>  Issue Type: Task
>  Components: Extensions
>Reporter: David Handermann
>Assignee: David Handermann
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> With the removal of GetJMS and PutJMS Processors from the standard processors 
> module, the ActiveMQ dependency and related classes should be removed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (NIFI-11208) Remove Hortonworks Schema Registry

2023-02-21 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11208:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Remove Hortonworks Schema Registry
> --
>
> Key: NIFI-11208
> URL: https://issues.apache.org/jira/browse/NIFI-11208
> Project: Apache NiFi
>  Issue Type: Task
>Reporter: David Handermann
>Assignee: David Handermann
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The {{HortonworksSchemaRegistry}} was deprecated for removal in NiFi 1.20.0 
> and should be removed along with the {{nifi-hwx-schema-registry-bundle}} 
> modules.





[jira] [Created] (NIFI-11209) UpdateHive3Table and UpdateDatabaseTable erase newly created columns in output FlowFile

2023-02-21 Thread Matt Burgess (Jira)
Matt Burgess created NIFI-11209:
---

 Summary: UpdateHive3Table and UpdateDatabaseTable erase newly 
created columns in output FlowFile
 Key: NIFI-11209
 URL: https://issues.apache.org/jira/browse/NIFI-11209
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Reporter: Matt Burgess


When using UpdateHive3Table or UpdateDatabaseTable with the Update Field Names 
property set to "true", the output will include only those fields that already 
had existing columns in the target table. This means that if an incoming FlowFile 
has a "new" field, the column will be created and populated in the target 
database, but the output FlowFile will no longer have the "new" field in the 
outgoing records.

This is because the original database columns are used to populate the outgoing 
record schema when the Update Field Names property is set to true. Instead, the 
database columns should be refreshed after any DDL is executed in order to get 
the updated set of database column names.
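The fix described above can be sketched in plain Java. This is an illustrative stand-in only, assuming hypothetical helper names (`refreshColumns`, `buildOutputFieldNames`) rather than the actual processor internals: the key point is that the outgoing field list must be filtered against the column set re-read after DDL, not against the stale pre-DDL snapshot.

```java
import java.util.*;

// Hypothetical sketch of the fix: after issuing ALTER TABLE statements for new
// fields, re-read the table's columns instead of reusing the pre-DDL snapshot,
// so the outgoing record schema keeps the new fields. Names are illustrative,
// not NiFi API; refreshColumns stands in for re-querying DatabaseMetaData.
public class ColumnRefreshSketch {

    // Stand-in for re-reading the table's columns after DDL has executed.
    static Set<String> refreshColumns(Set<String> existing, List<String> addedByDdl) {
        Set<String> refreshed = new LinkedHashSet<>(existing);
        refreshed.addAll(addedByDdl);
        return refreshed;
    }

    // Outgoing field names are filtered against the (hopefully refreshed) columns.
    static List<String> buildOutputFieldNames(List<String> recordFields, Set<String> columns) {
        List<String> out = new ArrayList<>();
        for (String field : recordFields) {
            if (columns.contains(field)) {
                out.add(field);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> before = new LinkedHashSet<>(List.of("id", "name"));
        List<String> record = List.of("id", "name", "new_field");
        // Buggy behavior: filtering against the stale column set drops "new_field".
        System.out.println(buildOutputFieldNames(record, before)); // [id, name]
        // Fixed behavior: refresh after DDL, and "new_field" survives.
        Set<String> after = refreshColumns(before, List.of("new_field"));
        System.out.println(buildOutputFieldNames(record, after)); // [id, name, new_field]
    }
}
```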





[jira] [Updated] (NIFI-11149) Add PutRedisHashRecord processor

2023-02-13 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11149:

Summary: Add PutRedisHashRecord processor  (was: Add PutRedisRecord 
processor)

> Add PutRedisHashRecord processor
> 
>
> Key: NIFI-11149
> URL: https://issues.apache.org/jira/browse/NIFI-11149
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This case is to add a record-enabled processor to send data into Redis. The 
> hash should be chosen such that the field values can be retrieved later. It 
> should use the existing RedisConnectionPool controller service.





[jira] [Updated] (NIFI-11149) Add PutRedisRecord processor

2023-02-10 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11149:

Status: Patch Available  (was: In Progress)

> Add PutRedisRecord processor
> 
>
> Key: NIFI-11149
> URL: https://issues.apache.org/jira/browse/NIFI-11149
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This case is to add a record-enabled processor to send data into Redis. The 
> hash should be chosen such that the field values can be retrieved later. It 
> should use the existing RedisConnectionPool controller service.





[jira] [Assigned] (NIFI-11149) Add PutRedisRecord processor

2023-02-10 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11149:
---

Assignee: Matt Burgess

> Add PutRedisRecord processor
> 
>
> Key: NIFI-11149
> URL: https://issues.apache.org/jira/browse/NIFI-11149
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>
> This case is to add a record-enabled processor to send data into Redis. The 
> hash should be chosen such that the field values can be retrieved later. It 
> should use the existing RedisConnectionPool controller service.





[jira] [Created] (NIFI-11149) Add PutRedisRecord processor

2023-02-07 Thread Matt Burgess (Jira)
Matt Burgess created NIFI-11149:
---

 Summary: Add PutRedisRecord processor
 Key: NIFI-11149
 URL: https://issues.apache.org/jira/browse/NIFI-11149
 Project: Apache NiFi
  Issue Type: New Feature
  Components: Extensions
Reporter: Matt Burgess


This case is to add a record-enabled processor to send data into Redis. The 
hash should be chosen such that the field values can be retrieved later. It 
should use the existing RedisConnectionPool controller service.
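The "retrievable field values" idea above maps naturally onto a Redis hash, where each record field becomes a hash field fetchable later with `HGET <key> <field>`. The sketch below is illustrative only: class and method names are hypothetical, and a plain `Map` stands in for NiFi's record API and the RedisConnectionPool-backed client.

```java
import java.util.*;

// Illustrative sketch: flatten one record into the field/value pairs of a
// Redis hash so each value can be fetched later with HGET. The real processor
// (later named PutRedisHashRecord) uses NiFi's record API and the
// RedisConnectionPool controller service rather than this plain-Map stand-in.
public class RecordToRedisHash {

    // One record becomes one hash: every record field maps to a hash field,
    // with values stringified since Redis hash values are strings.
    static Map<String, String> toHashEntries(Map<String, Object> record) {
        Map<String, String> entries = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : record.entrySet()) {
            entries.put(e.getKey(), String.valueOf(e.getValue()));
        }
        return entries;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("id", 42);
        record.put("name", "alice");
        // With a real client these entries would be sent as:
        //   HSET user:42 id 42 name alice
        System.out.println(toHashEntries(record)); // {id=42, name=alice}
    }
}
```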





[jira] [Created] (NIFI-11148) Race condition loading script engines in InvokeScriptedProcessor

2023-02-07 Thread Matt Burgess (Jira)
Matt Burgess created NIFI-11148:
---

 Summary: Race condition loading script engines in 
InvokeScriptedProcessor
 Key: NIFI-11148
 URL: https://issues.apache.org/jira/browse/NIFI-11148
 Project: Apache NiFi
  Issue Type: Bug
Reporter: Matt Burgess


In the 1.20.0 release candidate I noticed this error when loading a flow with 
an InvokeScriptedProcessor:

2023-02-07 16:26:39,825 ERROR [NiFi Web Server-32] 
o.a.n.p.script.InvokeScriptedProcessor 
InvokeScriptedProcessor[id=2bbc053d-8b08-3206-7c47-cc00a08beb64] Error adding 
script engine Groovy
2023-02-07 16:26:39,826 ERROR [NiFi Web Server-32] 
o.a.n.p.script.InvokeScriptedProcessor 
InvokeScriptedProcessor[id=2bbc053d-8b08-3206-7c47-cc00a08beb64] Unable to load 
script: No script runner available
org.apache.nifi.processor.exception.ProcessException: No script runner available
at 
org.apache.nifi.processors.script.InvokeScriptedProcessor.reloadScript(InvokeScriptedProcessor.java:371)
at 
org.apache.nifi.processors.script.InvokeScriptedProcessor.reloadScriptBody(InvokeScriptedProcessor.java:326)
at 
org.apache.nifi.processors.script.InvokeScriptedProcessor.setup(InvokeScriptedProcessor.java:230)
at 
org.apache.nifi.processors.script.InvokeScriptedProcessor.onConfigurationRestored(InvokeScriptedProcessor.java:222)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:145)
at 
org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:133)
at 
org.apache.nifi.util.ReflectionUtils.quietlyInvokeMethodsWithAnnotations(ReflectionUtils.java:316)
at 
org.apache.nifi.util.ReflectionUtils.quietlyInvokeMethodsWithAnnotation(ReflectionUtils.java:93)
at 
org.apache.nifi.controller.StandardProcessorNode.onConfigurationRestored(StandardProcessorNode.java:2115)
at 
org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.addProcessor(StandardVersionedComponentSynchronizer.java:2417)
at 
org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.synchronizeProcessors(StandardVersionedComponentSynchronizer.java:932)
at 
org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.synchronize(StandardVersionedComponentSynchronizer.java:422)
at 
org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.lambda$synchronize$0(StandardVersionedComponentSynchronizer.java:260)
at 
org.apache.nifi.controller.flow.AbstractFlowManager.withParameterContextResolution(AbstractFlowManager.java:550)
at 
org.apache.nifi.flow.synchronization.StandardVersionedComponentSynchronizer.synchronize(StandardVersionedComponentSynchronizer.java:255)
at 
org.apache.nifi.groups.StandardProcessGroup.synchronizeFlow(StandardProcessGroup.java:3972)
at 
org.apache.nifi.groups.StandardProcessGroup.updateFlow(StandardProcessGroup.java:3952)
at 
org.apache.nifi.web.dao.impl.StandardProcessGroupDAO.updateProcessGroupFlow(StandardProcessGroupDAO.java:435)
at 
org.apache.nifi.web.dao.impl.StandardProcessGroupDAO$$FastClassBySpringCGLIB$$10a99b47.invoke()
at 
org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:793)
at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:763)
at 
org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:89)
at 
org.apache.nifi.audit.ProcessGroupAuditor.updateProcessGroupFlowAdvice(ProcessGroupAuditor.java:308)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:634)
at 

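The "No script runner available" error above surfaces when a second thread enters setup() while another is still adding the script engine. A common remedy for this kind of initialization race is to build each engine exactly once behind `ConcurrentHashMap.computeIfAbsent`; the sketch below shows that generic pattern only, and is not the actual InvokeScriptedProcessor fix.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// Generic sketch of avoiding an initialization race: computeIfAbsent runs the
// factory at most once per key, and concurrent callers block until it finishes
// instead of observing a half-initialized (or missing) engine.
public class EngineCacheSketch {

    private static final ConcurrentMap<String, Object> ENGINES = new ConcurrentHashMap<>();
    static final AtomicInteger CREATES = new AtomicInteger();

    static Object engineFor(String name) {
        return ENGINES.computeIfAbsent(name, n -> {
            CREATES.incrementAndGet();
            return new Object(); // stand-in for an expensive ScriptEngine
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 8; i++) {
            pool.submit(() -> engineFor("Groovy"));
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(CREATES.get()); // 1: the engine was built only once
    }
}
```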
[jira] [Assigned] (NIFI-11147) Allow QuerySalesforceObject to query all existing fields

2023-02-07 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11147:
---

Assignee: Lehel Boér  (was: Matt Burgess)

> Allow QuerySalesforceObject to query all existing fields
> 
>
> Key: NIFI-11147
> URL: https://issues.apache.org/jira/browse/NIFI-11147
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Matt Burgess
>Assignee: Lehel Boér
>Priority: Major
>
> Currently the Field Names property of QuerySalesforceObject is required and 
> must contain the names of the fields the user wants to return. However in a 
> schema drift use case, the user may want to add a field to a Salesforce 
> object and have the NiFi flow continue without needing alteration.
> This Jira is to make it possible for QuerySalesforceObject to return all 
> fields from an object. A suggestion is to make Field Names optional and if it 
> is not set, all fields are queried. The documentation should be updated to 
> match the behavior.





[jira] [Assigned] (NIFI-11147) Allow QuerySalesforceObject to query all existing fields

2023-02-07 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess reassigned NIFI-11147:
---

Assignee: Matt Burgess

> Allow QuerySalesforceObject to query all existing fields
> 
>
> Key: NIFI-11147
> URL: https://issues.apache.org/jira/browse/NIFI-11147
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>
> Currently the Field Names property of QuerySalesforceObject is required and 
> must contain the names of the fields the user wants to return. However in a 
> schema drift use case, the user may want to add a field to a Salesforce 
> object and have the NiFi flow continue without needing alteration.
> This Jira is to make it possible for QuerySalesforceObject to return all 
> fields from an object. A suggestion is to make Field Names optional and if it 
> is not set, all fields are queried. The documentation should be updated to 
> match the behavior.





[jira] [Created] (NIFI-11147) Allow QuerySalesforceObject to query all existing fields

2023-02-07 Thread Matt Burgess (Jira)
Matt Burgess created NIFI-11147:
---

 Summary: Allow QuerySalesforceObject to query all existing fields
 Key: NIFI-11147
 URL: https://issues.apache.org/jira/browse/NIFI-11147
 Project: Apache NiFi
  Issue Type: Improvement
Reporter: Matt Burgess


Currently the Field Names property of QuerySalesforceObject is required and 
must contain the names of the fields the user wants to return. However in a 
schema drift use case, the user may want to add a field to a Salesforce object 
and have the NiFi flow continue without needing alteration.

This Jira is to make it possible for QuerySalesforceObject to return all fields 
from an object. A suggestion is to make Field Names optional and if it is not 
set, all fields are queried. The documentation should be updated to match the 
behavior.
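The suggested "Field Names optional" behavior can be sketched as a small query-building fallback. This is a hedged illustration under assumed names (`buildSoql`, a describe-derived field list), not the processor's real API: when the property is blank, fall back to every field reported by the object's describe metadata so newly added Salesforce fields flow through without editing the flow.

```java
import java.util.*;

// Hypothetical sketch of the proposal: if the Field Names property is unset,
// query all fields known from the object's describe metadata; otherwise keep
// the user's explicit list. Method names are illustrative only.
public class SoqlFieldFallback {

    static String buildSoql(String objectName, String fieldNamesProperty, List<String> describeFields) {
        String fields = (fieldNamesProperty == null || fieldNamesProperty.isBlank())
                ? String.join(",", describeFields)   // property not set: query every known field
                : fieldNamesProperty;                // property set: honor the explicit list
        return "SELECT " + fields + " FROM " + objectName;
    }

    public static void main(String[] args) {
        List<String> described = List.of("Id", "Name", "NewField__c");
        System.out.println(buildSoql("Account", null, described));
        // SELECT Id,Name,NewField__c FROM Account
        System.out.println(buildSoql("Account", "Id,Name", described));
        // SELECT Id,Name FROM Account
    }
}
```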





[jira] [Updated] (NIFI-11110) Create processor for triggering HMS events

2023-02-02 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11110:

Status: Patch Available  (was: Open)

> Create processor for triggering HMS events
> --
>
> Key: NIFI-11110
> URL: https://issues.apache.org/jira/browse/NIFI-11110
> Project: Apache NiFi
>  Issue Type: New Feature
>Reporter: Mark Bathori
>Assignee: Mark Bathori
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Create a processor capable of triggering HiveMetaStore actions and 
> generating notifications for them. The main goal is to be able to register 
> file and directory insertions and removals made in the HiveMetaStore by 
> other processors.





[jira] [Updated] (NIFI-11110) Create processor for triggering HMS events

2023-02-02 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11110:

Fix Version/s: 1.20.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Create processor for triggering HMS events
> --
>
> Key: NIFI-11110
> URL: https://issues.apache.org/jira/browse/NIFI-11110
> Project: Apache NiFi
>  Issue Type: New Feature
>Reporter: Mark Bathori
>Assignee: Mark Bathori
>Priority: Major
> Fix For: 1.20.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Create a processor capable of triggering HiveMetaStore actions and 
> generating notifications for them. The main goal is to be able to register 
> file and directory insertions and removals made in the HiveMetaStore by 
> other processors.





[jira] [Updated] (NIFI-11094) Allow CaptureChangeMySQL to send multiple events per FlowFile

2023-01-31 Thread Matt Burgess (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-11094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-11094:

Status: Patch Available  (was: In Progress)

> Allow CaptureChangeMySQL to send multiple events per FlowFile
> -
>
> Key: NIFI-11094
> URL: https://issues.apache.org/jira/browse/NIFI-11094
> Project: Apache NiFi
>  Issue Type: Improvement
>Reporter: Matt Burgess
>Assignee: Matt Burgess
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It would be nice if there were an Events Per FlowFile Strategy property 
> for CaptureChangeMySQL that allowed options such as N events per FlowFile 
> or one full transaction per FlowFile. This would lower overhead downstream 
> and increase the overall performance of the flow.
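The batching strategy described above can be sketched generically: partition the incoming CDC event stream into fixed-size groups, each group becoming one FlowFile's content. This is an assumption-laden illustration (names like `EventBatcher` and `batch` are invented, and a plain list stands in for the binlog event stream), not CaptureChangeMySQL's actual implementation.

```java
import java.util.*;

// Sketch of one possible "N events per FlowFile" strategy: split the event
// stream into fixed-size batches; each batch would become one FlowFile.
// Class and method names are illustrative, not the processor's real API.
public class EventBatcher {

    static <T> List<List<T>> batch(List<T> events, int eventsPerFlowFile) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < events.size(); i += eventsPerFlowFile) {
            batches.add(events.subList(i, Math.min(i + eventsPerFlowFile, events.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> events = List.of("e1", "e2", "e3", "e4", "e5");
        // Two full batches of two events, plus one batch with the remainder.
        System.out.println(batch(events, 2)); // [[e1, e2], [e3, e4], [e5]]
    }
}
```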




