[jira] [Commented] (NIFI-3599) Add nifi.properties value to globally set the default backpressure size threshold for each connection

2017-04-10 Thread Toivo Adams (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962687#comment-15962687
 ] 

Toivo Adams commented on NIFI-3599:
---

It would be very useful to be able to set default values for both
Back Pressure Object Threshold
and Back Pressure Data Size Threshold
in nifi.properties.

[~jeremy.dyer], do you have time to implement this?

Thanks
Toivo
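
As a hedged illustration, the new nifi.properties entries might look like this
(the property names are illustrative only; this issue does not define them):

{noformat}
# Hypothetical global defaults applied to each new connection:
nifi.queue.backpressure.count=20000
nifi.queue.backpressure.size=1 GB
{noformat}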


> Add nifi.properties value to globally set the default backpressure size 
> threshold for each connection
> -
>
> Key: NIFI-3599
> URL: https://issues.apache.org/jira/browse/NIFI-3599
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Jeremy Dyer
>Assignee: Jeremy Dyer
>
> By default each new connection added to the workflow canvas will have a 
> default backpressure size threshold of 10,000 objects. While the threshold 
> can be changed on a connection level it would be convenient to have a global 
> mechanism for setting that value to something other than 10,000. This 
> enhancement would add a property to nifi.properties that would allow for this 
> threshold to be set globally unless otherwise overridden at the connection 
> level.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (NIFI-2624) JDBC-to-Avro processors handle BigDecimals as Strings

2017-02-15 Thread Toivo Adams (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868124#comment-15868124
 ] 

Toivo Adams commented on NIFI-2624:
---

@ijokarumawak,

Sorry, I didn't have time to look at your code.
Your approach looks much better.

My only concern is: do we generate a record which conforms to the Avro standard?
Can any other consumer (outside NiFi) read our Avro record and get back the same
BigDecimal?

@joewitt,

Right, let’s try to finish this.

Thanks
Toivo
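
To make the question concrete, here is a minimal sketch of what an external
consumer might do, assuming an Avro release that ships
Conversions.DecimalConversion (the file name and field name are illustrative):

{code}
import java.io.File;
import java.math.BigDecimal;
import java.nio.ByteBuffer;

import org.apache.avro.Conversions;
import org.apache.avro.LogicalTypes;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;

// Illustrative only: read one record and decode a decimal field using
// the precision/scale carried by the logical type in the file's schema.
public class ExternalConsumerSketch {

    public static void main(final String[] args) throws Exception {
        try (DataFileReader<GenericRecord> reader = new DataFileReader<>(
                new File("result.avro"), new GenericDatumReader<GenericRecord>())) {
            final Schema fieldSchema = reader.getSchema().getField("price").schema();
            final GenericRecord record = reader.next();
            final ByteBuffer raw = (ByteBuffer) record.get("price");
            final BigDecimal value = new Conversions.DecimalConversion().fromBytes(
                    raw, fieldSchema, LogicalTypes.fromSchema(fieldSchema));
            System.out.println(value);
        }
    }
}
{code}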


> JDBC-to-Avro processors handle BigDecimals as Strings
> -
>
> Key: NIFI-2624
> URL: https://issues.apache.org/jira/browse/NIFI-2624
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Toivo Adams
>
> The original SQL processors implemented BigDecimal values as Strings for 
> Avro, as the version of Avro it used (1.7.6) did not support DECIMAL type.
> As of Avro 1.7.7 (AVRO-1402), this type is supported and so the SQL/HiveQL 
> processors should be updated to handle BigDecimals correctly if possible.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] (NIFI-3339) Add getDataSource() to DBCPService

2017-01-29 Thread Toivo Adams (JIRA)

Toivo Adams commented on NIFI-3339:
---

Thank you for reviewing, Koji Kawamura. Great ideas. I created a new PR:
https://github.com/apache/nifi/pull/1450
A Spring JdbcTemplate usage example is in DBCPServiceTest.java, in
testGetDataSource() and createInsertSelectDrop(DataSource dataSource).

Thanks
Toivo
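
A hedged sketch of the kind of usage this enables (assumes the DataSource
comes from the getDataSource() method added in this issue; the table name is
illustrative):

{code}
import javax.sql.DataSource;

import org.springframework.jdbc.core.JdbcTemplate;

// Illustrative only: obtain the DataSource from the proposed
// DBCPService#getDataSource() and hand it to Spring's JdbcTemplate.
public class JdbcTemplateSketch {

    public static int countRows(final DataSource dataSource) {
        final JdbcTemplate template = new JdbcTemplate(dataSource);
        // queryForObject maps the single result column to an Integer
        return template.queryForObject("SELECT COUNT(*) FROM my_table", Integer.class);
    }
}
{code}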



--
This message was sent by Atlassian JIRA
(v6.3.15#6346-sha1:dbc023d)

[jira] [Updated] (NIFI-3339) Add getDataSource() to DBCPService

2017-01-14 Thread Toivo Adams (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toivo Adams updated NIFI-3339:
--
Status: Patch Available  (was: In Progress)

> Add getDataSource() to DBCPService
> --
>
> Key: NIFI-3339
> URL: https://issues.apache.org/jira/browse/NIFI-3339
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Toivo Adams
>Assignee: Toivo Adams
>Priority: Minor
>
> Currently DBCPService returns only Connection.
> Sometimes a DataSource is needed; for example, Spring JdbcTemplate and
> SimpleJdbcCall need a DataSource.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (NIFI-3339) Add getDataSource() to DBCPService

2017-01-13 Thread Toivo Adams (JIRA)
Toivo Adams created NIFI-3339:
-

 Summary: Add getDataSource() to DBCPService
 Key: NIFI-3339
 URL: https://issues.apache.org/jira/browse/NIFI-3339
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Reporter: Toivo Adams
Assignee: Toivo Adams
Priority: Minor


Currently DBCPService returns only Connection.
Sometimes a DataSource is needed; for example, Spring JdbcTemplate and
SimpleJdbcCall need a DataSource.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-1251) Allow ExecuteSQL to send out large result sets in chunks

2016-12-03 Thread Toivo Adams (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15717822#comment-15717822
 ] 

Toivo Adams commented on NIFI-1251:
---

Hi [~arun_90]

This seems to have stalled.
Do you need this feature, or would you like to contribute it?

Thanks
Toivo


> Allow ExecuteSQL to send out large result sets in chunks
> 
>
> Key: NIFI-1251
> URL: https://issues.apache.org/jira/browse/NIFI-1251
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Mark Payne
>
> Currently, when using ExecuteSQL, if a result set is very large, it can take 
> quite a long time to pull back all of the results. It would be nice to have 
> the ability to specify the maximum number of records to put into a FlowFile, 
> so that if we pull back say 1 million records we can configure it to create 
> 1000 FlowFiles, each with 1000 records. This way, we can begin processing the 
> first 1,000 records while the next 1000 are being pulled from the remote 
> database.
> This suggestion comes from Vinay via the dev@ mailing list:
> Is there a way to have a streaming feature when a large result set is fetched
> from the database, basically to read data from the database in chunks of
> records instead of loading the full result set into memory?
> As part of ExecuteSQL, can a property be specified called "FetchSize" which
> indicates how many rows should be fetched from the ResultSet?
> Since I am a bit new to using NiFi, can anyone guide me on the above?
> Thanks in advance
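
A minimal sketch of the chunking idea in plain JDBC (illustrative only, not
the actual ExecuteSQL code; the SQL and chunk handling are placeholders):

{code}
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: stream a large result set in fixed-size chunks,
// where each chunk would become its own FlowFile.
public class ChunkedQuerySketch {

    public static List<List<String>> readInChunks(final Connection con,
            final String sql, final int rowsPerChunk) throws SQLException {
        final List<List<String>> chunks = new ArrayList<>();
        try (Statement st = con.createStatement()) {
            st.setFetchSize(rowsPerChunk); // JDBC hint: rows per DB round trip
            try (ResultSet rs = st.executeQuery(sql)) {
                List<String> chunk = new ArrayList<>();
                while (rs.next()) {
                    chunk.add(rs.getString(1));
                    if (chunk.size() >= rowsPerChunk) {
                        chunks.add(chunk); // one future FlowFile
                        chunk = new ArrayList<>();
                    }
                }
                if (!chunk.isEmpty()) {
                    chunks.add(chunk);
                }
            }
        }
        return chunks;
    }
}
{code}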



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-2918) JDBC getColumnName may return empty string

2016-10-19 Thread Toivo Adams (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15588458#comment-15588458
 ] 

Toivo Adams commented on NIFI-2918:
---

Hi,

I don't want to argue with SAP.
But maybe we should preserve the current behavior?
I mean, we still use getColumnName(),
and when the return value is null or an empty String we try getColumnLabel().

I am not sure how other vendors use getColumnLabel().

Thanks
Toivo
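
A sketch of that fallback (illustrative only, not the actual JdbcCommon code):

{code}
import java.sql.ResultSetMetaData;
import java.sql.SQLException;

// Illustrative only: keep the current getColumnName() behavior and fall
// back to getColumnLabel() when the driver returns null or an empty name.
public final class ColumnNameSketch {

    public static String columnName(final ResultSetMetaData meta, final int i)
            throws SQLException {
        final String name = meta.getColumnName(i);
        return (name == null || name.isEmpty()) ? meta.getColumnLabel(i) : name;
    }
}
{code}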


> JDBC getColumnName may return empty string
> --
>
> Key: NIFI-2918
> URL: https://issues.apache.org/jira/browse/NIFI-2918
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.1.0
>Reporter: Peter Wicks
>Assignee: Peter Wicks
>
> The SAP Hana JDBC Driver returns an empty string for getColumnName if the 
> column name in the SELECT statement was aliased.
> Instead you have to call getColumnLabel on the ResultSetMetaData.
> SAP as a company feels this is per the JDBC spec, so they are not going to 
> change it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (NIFI-1280) Create FilterCSVColumns Processor

2016-10-06 Thread Toivo Adams (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-1280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15552595#comment-15552595
 ] 

Toivo Adams edited comment on NIFI-1280 at 10/6/16 5:30 PM:


[~markap14]

Roughly the same way as databases use indexes.
For example:
select * from emp join emp as mgr on emp.mgr = mgr.id where emp.salary > mgr.salary

We can create an index on the salary column.
Of course, indexing itself is an expensive operation.
But maybe we can find a way to create the index reasonably cheaply.
I have a crazy idea: create the index in a previous step, while writing to the FlowFile.
Of course, creating an index should be optional and should be used with care.
But sometimes it might improve performance considerably.

Maybe indexing is not worth the trouble.

Glad you have a new version almost ready.
I am certainly interested to see it.
I have a feeling this greatly improves the user experience.

Thanks
Toivo


> Create FilterCSVColumns Processor
> -
>
> Key: NIFI-1280
> URL: https://issues.apache.org/jira/browse/NIFI-1280
> Project: Apache NiFi
>  Issue Type: Task
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Toivo Adams
>
> We should have a Processor that allows users to easily filter out specific 
> columns from CSV data. For instance, a user would configure two different 
> properties: "Columns of Interest" (a comma-separated list of column indexes) 
> and "Filtering Strategy" (Keep Only These Columns, Remove Only These Columns).
> We can do this today with ReplaceText, but it is far more difficult than it 
> would be with this Processor, as the user has to use Regular Expressions, 
> etc. with ReplaceText.
> Eventually a Custom UI could even be built that allows a user to upload a 
> Sample CSV and choose which columns from there, similar to the way that Excel 
> works when importing CSV by dragging and selecting the desired columns? That 
> would certainly be a larger undertaking and would not need to be done for an 
> initial implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-1280) Create FilterCSVColumns Processor

2016-10-06 Thread Toivo Adams (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-1280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15552595#comment-15552595
 ] 

Toivo Adams commented on NIFI-1280:
---

@markap14

Roughly the same way as databases use indexes.
For example:
select * from emp join emp as mgr on emp.mgr = mgr.id where emp.salary > mgr.salary

We can create an index on the salary column.
Of course, indexing itself is an expensive operation.
But maybe we can find a way to create the index reasonably cheaply.
I have a crazy idea: create the index in a previous step, while writing to the FlowFile.
Of course, creating an index should be optional and should be used with care.
But sometimes it might improve performance considerably.

Maybe indexing is not worth the trouble.

Glad you have a new version almost ready.
I am certainly interested to see it.
I have a feeling this greatly improves the user experience.

Thanks
Toivo

> Create FilterCSVColumns Processor
> -
>
> Key: NIFI-1280
> URL: https://issues.apache.org/jira/browse/NIFI-1280
> Project: Apache NiFi
>  Issue Type: Task
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Toivo Adams
>
> We should have a Processor that allows users to easily filter out specific 
> columns from CSV data. For instance, a user would configure two different 
> properties: "Columns of Interest" (a comma-separated list of column indexes) 
> and "Filtering Strategy" (Keep Only These Columns, Remove Only These Columns).
> We can do this today with ReplaceText, but it is far more difficult than it 
> would be with this Processor, as the user has to use Regular Expressions, 
> etc. with ReplaceText.
> Eventually a Custom UI could even be built that allows a user to upload a 
> Sample CSV and choose which columns from there, similar to the way that Excel 
> works when importing CSV by dragging and selecting the desired columns? That 
> would certainly be a larger undertaking and would not need to be done for an 
> initial implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-1613) ConvertJSONToSQL Drops Type Information

2016-10-04 Thread Toivo Adams (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15545519#comment-15545519
 ] 

Toivo Adams commented on NIFI-1613:
---

Hi @trixpan

If I remember correctly, Boolean and Numeric type handling has been improved.

But DATE, TIME and TIMESTAMP handling is still problematic (when used together
with the PutSQL processor).
PutSQL expects Long numeric types.
But JSON usually uses ISO 8601 for dates, for example: 2012-04-23T18:25:43.511Z.
This means PutSQL will fail.

Maybe PutSQL should be changed? But a change may break current PutSQL behavior.
A workaround might be to change PutSQL as follows:
1. We first try to convert the numeric value.
2. If that conversion fails, we try to convert the String value.
This way we can preserve the current behaviour.

Thanks
Toivo
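
A sketch of the proposed fallback (illustrative only, not the actual PutSQL
code; java.time assumes Java 8+):

{code}
import java.sql.Timestamp;
import java.time.Instant;

// Illustrative only: first try the current behavior (epoch millis as a
// Long), and only if that fails, parse the value as an ISO 8601 instant.
public final class TimestampParamSketch {

    public static Timestamp toTimestamp(final String value) {
        try {
            return new Timestamp(Long.parseLong(value)); // current PutSQL path
        } catch (final NumberFormatException e) {
            // e.g. "2012-04-23T18:25:43.511Z" as commonly found in JSON
            return Timestamp.from(Instant.parse(value));
        }
    }
}
{code}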


> ConvertJSONToSQL Drops Type Information
> ---
>
> Key: NIFI-1613
> URL: https://issues.apache.org/jira/browse/NIFI-1613
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 0.4.1, 0.5.1
> Environment: Ubuntu 14.04 LTS
>Reporter: Aaron Stephens
>Assignee: Toivo Adams
>  Labels: ConvertJSONToSQL, Phoenix, SQL
>
> It appears that the ConvertJSONToSQL processor is turning Boolean (and 
> possibly Integer and Float) values into Strings.  This is okay for some 
> drivers (like PostgreSQL) which can coerce a String back into a Boolean, but 
> it causes issues for others (specifically Phoenix in my case).
> {noformat}
> org.apache.phoenix.schema.ConstraintViolationException: 
> org.apache.phoenix.schema.TypeMismatchException: ERROR 203 (22005): Type 
> mismatch. VARCHAR cannot be coerced to BOOLEAN
> at 
> org.apache.phoenix.schema.types.PDataType.throwConstraintViolationException(PDataType.java:282)
>  ~[na:na]
> at 
> org.apache.phoenix.schema.types.PBoolean.toObject(PBoolean.java:136) ~[na:na]
> at 
> org.apache.phoenix.jdbc.PhoenixPreparedStatement.setObject(PhoenixPreparedStatement.java:442)
>  ~[na:na]
> at 
> org.apache.commons.dbcp.DelegatingPreparedStatement.setObject(DelegatingPreparedStatement.java:166)
>  ~[na:na]
> at 
> org.apache.commons.dbcp.DelegatingPreparedStatement.setObject(DelegatingPreparedStatement.java:166)
>  ~[na:na]
> at 
> org.apache.nifi.processors.standard.PutSQL.setParameter(PutSQL.java:728) 
> ~[na:na]
> at 
> org.apache.nifi.processors.standard.PutSQL.setParameters(PutSQL.java:606) 
> ~[na:na]
> at 
> org.apache.nifi.processors.standard.PutSQL.onTrigger(PutSQL.java:223) ~[na:na]
> at 
> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>  ~[nifi-api-0.4.1.jar:0.4.1]
> at 
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1146)
>  ~[nifi-framework-core-0.4.1.jar:0.4.1]
> at 
> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:139)
>  [nifi-framework-core-0.4.1.jar:0.4.1]
> at 
> org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:49)
>  [nifi-framework-core-0.4.1.jar:0.4.1]
> at 
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:119)
>  [nifi-framework-core-0.4.1.jar:0.4.1]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> [na:1.7.0_79]
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) 
> [na:1.7.0_79]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>  [na:1.7.0_79]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>  [na:1.7.0_79]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_79]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_79]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
> Caused by: org.apache.phoenix.schema.TypeMismatchException: ERROR 203 
> (22005): Type mismatch. VARCHAR cannot be coerced to BOOLEAN
> at 
> org.apache.phoenix.exception.SQLExceptionCode$1.newException(SQLExceptionCode.java:71)
>  ~[na:na]
> at 
> org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
>  ~[na:na]
> ... 20 common frames omitted
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-2848) Queues aren't fairly drained when leading to a single component

2016-09-30 Thread Toivo Adams (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15536240#comment-15536240
 ] 

Toivo Adams commented on NIFI-2848:
---

I removed the Funnel and tested again.

One queue tends to dominate; this time the third GenerateFlowFile queue.
After stopping the third GenerateFlowFile, the second started to dominate.
Once a queue gets the opportunity, it keeps it forever and doesn't give the
others a chance to send anything.

See dominate_1.png and dominate_2.png.


> Queues aren't fairly drained when leading to a single component
> ---
>
> Key: NIFI-2848
> URL: https://issues.apache.org/jira/browse/NIFI-2848
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.0.0, 0.7.0
>Reporter: Joseph Gresock
> Attachments: Backpressure_prioritization_test.xml, dominate_1.png, 
> dominate_2.png, queue_drain.png
>
>
> Consider the scenario where multiple queues lead to a single component and 
> all of them are full due to back pressure.  With the attached template, it is 
> easily observable that once a single queue starts to drain due to relieved 
> back pressure, it will continue to drain as long as it has incoming flow 
> files.  This means that if there's a constant flow of incoming flow files to 
> this queue, the other queues will never be drained (at least, that's my 
> theory based on several hours of observation).
> To reproduce this: 
> # Load the template into NiFi 1.0.0
> # Play all three GenerateFlowFile processors, but not the UpdateAttribute 
> processor (this simulates backpressure).  Wait until each queue has 1,000 
> flow files (max backpressure)
> # Stop the GenerateFlowFile processors, and play the UpdateAttribute 
> processor (this relieves the backpressure)
> # Observe which queue has started to drain, and start its GenerateFlowFile 
> processor
> # Observe that the other two queues remain full indefinitely, while the 
> draining queue continues to replenish and be drained indefinitely



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (NIFI-2848) Queues aren't fairly drained when leading to a single component

2016-09-30 Thread Toivo Adams (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toivo Adams updated NIFI-2848:
--
Attachment: dominate_2.png
dominate_1.png

> Queues aren't fairly drained when leading to a single component
> ---
>
> Key: NIFI-2848
> URL: https://issues.apache.org/jira/browse/NIFI-2848
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.0.0, 0.7.0
>Reporter: Joseph Gresock
> Attachments: Backpressure_prioritization_test.xml, dominate_1.png, 
> dominate_2.png, queue_drain.png
>
>
> Consider the scenario where multiple queues lead to a single component and 
> all of them are full due to back pressure.  With the attached template, it is 
> easily observable that once a single queue starts to drain due to relieved 
> back pressure, it will continue to drain as long as it has incoming flow 
> files.  This means that if there's a constant flow of incoming flow files to 
> this queue, the other queues will never be drained (at least, that's my 
> theory based on several hours of observation).
> To reproduce this: 
> # Load the template into NiFi 1.0.0
> # Play all three GenerateFlowFile processors, but not the UpdateAttribute 
> processor (this simulates backpressure).  Wait until each queue has 1,000 
> flow files (max backpressure)
> # Stop the GenerateFlowFile processors, and play the UpdateAttribute 
> processor (this relieves the backpressure)
> # Observe which queue has started to drain, and start its GenerateFlowFile 
> processor
> # Observe that the other two queues remain full indefinitely, while the 
> draining queue continues to replenish and be drained indefinitely



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (NIFI-2848) Queues aren't fairly drained when leading to a single component

2016-09-30 Thread Toivo Adams (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toivo Adams updated NIFI-2848:
--
Attachment: queue_drain.png

> Queues aren't fairly drained when leading to a single component
> ---
>
> Key: NIFI-2848
> URL: https://issues.apache.org/jira/browse/NIFI-2848
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.0.0, 0.7.0
>Reporter: Joseph Gresock
> Attachments: Backpressure_prioritization_test.xml, queue_drain.png
>
>
> Consider the scenario where multiple queues lead to a single component and 
> all of them are full due to back pressure.  With the attached template, it is 
> easily observable that once a single queue starts to drain due to relieved 
> back pressure, it will continue to drain as long as it has incoming flow 
> files.  This means that if there's a constant flow of incoming flow files to 
> this queue, the other queues will never be drained (at least, that's my 
> theory based on several hours of observation).
> To reproduce this: 
> # Load the template into NiFi 1.0.0
> # Play all three GenerateFlowFile processors, but not the UpdateAttribute 
> processor (this simulates backpressure).  Wait until each queue has 1,000 
> flow files (max backpressure)
> # Stop the GenerateFlowFile processors, and play the UpdateAttribute 
> processor (this relieves the backpressure)
> # Observe which queue has started to drain, and start its GenerateFlowFile 
> processor
> # Observe that the other two queues remain full indefinitely, while the 
> draining queue continues to replenish and be drained indefinitely



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-2848) Queues aren't fairly drained when leading to a single component

2016-09-30 Thread Toivo Adams (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15536071#comment-15536071
 ] 

Toivo Adams commented on NIFI-2848:
---

Hi Joseph Gresock,

I observed similar behaviour.
The first GenerateFlowFile (from left to right) got the opportunity and
continued to push new files.
The second never had a chance.
The third had the opportunity (only for a short period) before I started the
first one again.

Thanks
Toivo

> Queues aren't fairly drained when leading to a single component
> ---
>
> Key: NIFI-2848
> URL: https://issues.apache.org/jira/browse/NIFI-2848
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 1.0.0, 0.7.0
>Reporter: Joseph Gresock
> Attachments: Backpressure_prioritization_test.xml
>
>
> Consider the scenario where multiple queues lead to a single component and 
> all of them are full due to back pressure.  With the attached template, it is 
> easily observable that once a single queue starts to drain due to relieved 
> back pressure, it will continue to drain as long as it has incoming flow 
> files.  This means that if there's a constant flow of incoming flow files to 
> this queue, the other queues will never be drained (at least, that's my 
> theory based on several hours of observation).
> To reproduce this: 
> # Load the template into NiFi 1.0.0
> # Play all three GenerateFlowFile processors, but not the UpdateAttribute 
> processor (this simulates backpressure).  Wait until each queue has 1,000 
> flow files (max backpressure)
> # Stop the GenerateFlowFile processors, and play the UpdateAttribute 
> processor (this relieves the backpressure)
> # Observe which queue has started to drain, and start its GenerateFlowFile 
> processor
> # Observe that the other two queues remain full indefinitely, while the 
> draining queue continues to replenish and be drained indefinitely



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-1280) Create FilterCSVColumns Processor

2016-09-23 Thread Toivo Adams (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-1280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15516152#comment-15516152
 ] 

Toivo Adams commented on NIFI-1280:
---

Hi,

Could indexing the FlowFile content help avoid rereading the whole data set?

Thanks
Toivo

> Create FilterCSVColumns Processor
> -
>
> Key: NIFI-1280
> URL: https://issues.apache.org/jira/browse/NIFI-1280
> Project: Apache NiFi
>  Issue Type: Task
>  Components: Extensions
>Reporter: Mark Payne
>Assignee: Toivo Adams
>
> We should have a Processor that allows users to easily filter out specific 
> columns from CSV data. For instance, a user would configure two different 
> properties: "Columns of Interest" (a comma-separated list of column indexes) 
> and "Filtering Strategy" (Keep Only These Columns, Remove Only These Columns).
> We can do this today with ReplaceText, but it is far more difficult than it 
> would be with this Processor, as the user has to use Regular Expressions, 
> etc. with ReplaceText.
> Eventually a Custom UI could even be built that allows a user to upload a 
> Sample CSV and choose which columns from there, similar to the way that Excel 
> works when importing CSV by dragging and selecting the desired columns? That 
> would certainly be a larger undertaking and would not need to be done for an 
> initial implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-2381) Connection Pooling Service -Drop invalid connections and create new ones

2016-09-16 Thread Toivo Adams (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15496871#comment-15496871
 ] 

Toivo Adams commented on NIFI-2381:
---

Hi

Could any Committer take a look?
This change is potentially important.

Thanks
Toivo


> Connection Pooling Service -Drop invalid connections and create new ones 
> -
>
> Key: NIFI-2381
> URL: https://issues.apache.org/jira/browse/NIFI-2381
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 0.7.0
> Environment: all
>Reporter: Carlos Manuel António Fernandes
>Assignee: Toivo Adams
>
> The connections in the Connection Pooling Service become invalid for several
> reasons: session timeout, firewalls blocking idle connections, outages of the
> backend server, etc.
> In current NiFi releases these connections remain in the pool as if they were
> good, but when the user uses one of them, an error is triggered by the backend
> database.
> Ex: org.netezza.error.NzSQLException: FATAL 1:  Connection Terminated -
> session timeout exceeded
> With this improvement we intend to periodically test all the connections,
> drop the invalid ones, create new ones, and keep the whole pool healthy.
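
A hedged sketch of the kind of validation being proposed, using Commons DBCP
settings that already exist on BasicDataSource (the validation query is
driver-specific and shown only as an example):

{code}
import org.apache.commons.dbcp.BasicDataSource;

// Illustrative only: validate connections on borrow and evict invalid
// idle connections periodically, so dead connections never reach the user.
public class ValidatingPoolSketch {

    public static BasicDataSource newDataSource(final String url) {
        final BasicDataSource ds = new BasicDataSource();
        ds.setUrl(url);
        ds.setValidationQuery("SELECT 1");           // driver-specific
        ds.setTestOnBorrow(true);                    // check before handing out
        ds.setTestWhileIdle(true);                   // check idle connections...
        ds.setTimeBetweenEvictionRunsMillis(60000L); // ...every 60 seconds
        ds.setNumTestsPerEvictionRun(3);
        return ds;
    }
}
{code}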



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-2764) JdbcCommon Avro Can't Process Java Short Types

2016-09-13 Thread Toivo Adams (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15487179#comment-15487179
 ] 

Toivo Adams commented on NIFI-2764:
---

Hi,

The fix itself looks good to me.
Maybe you can add a test method to TestJdbcCommon,
the way testSignedIntShouldBeInt() is done? Using a mock, you don't need the
sqljdbc41.jar dependency.

Thanks
Toivo
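
A sketch of such a mock-based test setup (illustrative only; the exact
assertions would follow TestJdbcCommon's existing style):

{code}
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.Types;

// Illustrative only: a mocked ResultSet whose TINYINT column yields a
// java.lang.Short, reproducing the MS SQL Server case without sqljdbc41.jar.
public class TinyIntMockSketch {

    public static ResultSet mockTinyIntResultSet() throws Exception {
        final ResultSetMetaData meta = mock(ResultSetMetaData.class);
        when(meta.getColumnCount()).thenReturn(1);
        when(meta.getColumnType(1)).thenReturn(Types.TINYINT);
        when(meta.getColumnName(1)).thenReturn("tiny");

        final ResultSet rs = mock(ResultSet.class);
        when(rs.getMetaData()).thenReturn(meta);
        when(rs.next()).thenReturn(true, false);     // exactly one row
        when(rs.getObject(1)).thenReturn((short) 1); // driver returns Short
        return rs;
    }
}
{code}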


> JdbcCommon Avro Can't Process Java Short Types
> --
>
> Key: NIFI-2764
> URL: https://issues.apache.org/jira/browse/NIFI-2764
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Affects Versions: 1.0.0
>Reporter: Peter Wicks
>
> Microsoft SQL Server returns TINYINT values as Java Shorts. Avro is unable
> to write datums of this type and throws an exception when trying to.
> This currently breaks QueryDatabaseTable, at the very least when querying MS
> SQL Server with TINYINTs in the ResultSet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (NIFI-2381) Connection Pooling Service -Drop invalid connections and create new ones

2016-09-04 Thread Toivo Adams (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toivo Adams reassigned NIFI-2381:
-

Assignee: Toivo Adams

> Connection Pooling Service -Drop invalid connections and create new ones 
> -
>
> Key: NIFI-2381
> URL: https://issues.apache.org/jira/browse/NIFI-2381
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Core Framework
>Affects Versions: 0.7.0
> Environment: all
>Reporter: Carlos Manuel António Fernandes
>Assignee: Toivo Adams
>
> The connections in the Connection Pooling Service become invalid for several
> reasons: session timeout, firewalls blocking idle connections, outages of the
> backend server, etc.
> In current NiFi releases these connections remain in the pool as if they were
> good, but when the user uses one of them, an error is triggered by the backend
> database.
> Ex: org.netezza.error.NzSQLException: FATAL 1:  Connection Terminated -
> session timeout exceeded
> With this improvement we intend to periodically test all the connections,
> drop the invalid ones, create new ones, and keep the whole pool healthy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (NIFI-2624) JDBC-to-Avro processors handle BigDecimals as Strings

2016-09-03 Thread Toivo Adams (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15460710#comment-15460710
 ] 

Toivo Adams edited comment on NIFI-2624 at 9/3/16 8:49 AM:
---

Note!! BigInteger handling is not yet changed.

I am not sure how Avro supports DECIMAL types.
The current solution looks clumsy.
You can't put a BigDecimal value directly into an Avro record, like
   rec.put(i - 1, bigDecimalValue);
Avro will throw
org.apache.avro.AvroRuntimeException: Unknown datum type java.math.BigDecimal: 38

Instead you need to convert the value to bytes, like
   Schema decimalSchema = getDecimalSchema(schema, meta.getColumnName(i));
   LogicalType logicalType = LogicalTypes.fromSchema(decimalSchema);
   ByteBuffer byteBuffer = decimalConversion.toBytes((BigDecimal) value, decimalSchema, logicalType);
   rec.put(i - 1, byteBuffer);

And getting a BigDecimal value back from the record is equally weird:

record = dataFileReader.next(record);
DecimalConversion decimalConversion = new DecimalConversion();
Schema schema = record.getSchema();
Schema decimalSchema = getDecimalSchema(schema, "Chairman");
LogicalType logicalType = LogicalTypes.fromSchema(decimalSchema);
ByteBuffer buffer = (ByteBuffer) record.get("Chairman");

BigDecimal resultBD = decimalConversion.fromBytes(buffer, decimalSchema, logicalType);

I am not sure how to handle BigInteger, because it seems DecimalConversion
supports only BigDecimal.

Thanks
Toivo





> JDBC-to-Avro processors handle BigDecimals as Strings
> -
>
> Key: NIFI-2624
> URL: https://issues.apache.org/jira/browse/NIFI-2624
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Toivo Adams
>
> The original SQL processors implemented BigDecimal values as Strings for 
> Avro, as the version of Avro it used (1.7.6) did not support DECIMAL type.
> As of Avro 1.7.7 (AVRO-1402), this type is supported and so the SQL/HiveQL 
> processors should be updated to handle BigDecimals correctly if possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-2624) JDBC-to-Avro processors handle BigDecimals as Strings

2016-09-03 Thread Toivo Adams (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15460710#comment-15460710
 ] 

Toivo Adams commented on NIFI-2624:
---

Note!! BigInteger handling is not yet changed.

I am not sure how Avro supports DECIMAL types.
The current solution looks clumsy.
You can't put a BigDecimal value directly into an Avro record, like
   rec.put(i - 1, bigDecimalValue);
Avro will throw
org.apache.avro.AvroRuntimeException: Unknown datum type java.math.BigDecimal: 38

Instead you need to convert the value to bytes, like
   Schema decimalSchema = getDecimalSchema(schema, meta.getColumnName(i));
   LogicalType logicalType = LogicalTypes.fromSchema(decimalSchema);
   ByteBuffer byteBuffer = decimalConversion.toBytes((BigDecimal) value, decimalSchema, logicalType);
   rec.put(i - 1, byteBuffer);

And getting a BigDecimal value back from the record is equally weird:

record = dataFileReader.next(record);
DecimalConversion decimalConversion = new DecimalConversion();
Schema schema = record.getSchema();
Schema decimalSchema = getDecimalSchema(schema, "Chairman");
LogicalType logicalType = LogicalTypes.fromSchema(decimalSchema);
ByteBuffer buffer = (ByteBuffer) record.get("Chairman");

BigDecimal resultBD = decimalConversion.fromBytes(buffer, decimalSchema, logicalType);

I am not sure how to handle BigInteger, because it seems DecimalConversion
supports only BigDecimal.

Thanks
Toivo



> JDBC-to-Avro processors handle BigDecimals as Strings
> -
>
> Key: NIFI-2624
> URL: https://issues.apache.org/jira/browse/NIFI-2624
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Toivo Adams
>
> The original SQL processors implemented BigDecimal values as Strings for 
> Avro, as the version of Avro it used (1.7.6) did not support DECIMAL type.
> As of Avro 1.7.7 (AVRO-1402), this type is supported and so the SQL/HiveQL 
> processors should be updated to handle BigDecimals correctly if possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (NIFI-2624) JDBC-to-Avro processors handle BigDecimals as Strings

2016-09-03 Thread Toivo Adams (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toivo Adams updated NIFI-2624:
--
Status: Patch Available  (was: In Progress)

https://github.com/apache/nifi/pull/984
JdbcCommon now treats BigDecimals as an Avro logical type, using bytes to hold
the data (not String, as it was before).
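
A minimal round-trip sketch of the logical type (illustrative; assumes an Avro
release that ships Conversions.DecimalConversion, with a precision and scale
chosen here only for the example):

{code}
import java.math.BigDecimal;
import java.nio.ByteBuffer;

import org.apache.avro.Conversions;
import org.apache.avro.LogicalType;
import org.apache.avro.LogicalTypes;
import org.apache.avro.Schema;

// Illustrative only: BigDecimal -> bytes (what goes into the Avro record)
// and bytes -> BigDecimal (what a consumer reads back).
public class DecimalRoundTripSketch {

    public static void main(final String[] args) {
        final Schema decimalSchema = LogicalTypes.decimal(10, 2)
                .addToSchema(Schema.create(Schema.Type.BYTES));
        final LogicalType logicalType = LogicalTypes.fromSchema(decimalSchema);

        final Conversions.DecimalConversion conversion = new Conversions.DecimalConversion();
        final BigDecimal original = new BigDecimal("12345.67"); // scale must match (2)

        final ByteBuffer bytes = conversion.toBytes(original, decimalSchema, logicalType);
        final BigDecimal roundTripped = conversion.fromBytes(bytes, decimalSchema, logicalType);

        System.out.println(original.equals(roundTripped)); // true
    }
}
{code}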

> JDBC-to-Avro processors handle BigDecimals as Strings
> -
>
> Key: NIFI-2624
> URL: https://issues.apache.org/jira/browse/NIFI-2624
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Toivo Adams
>
> The original SQL processors implemented BigDecimal values as Strings for 
> Avro, as the version of Avro it used (1.7.6) did not support DECIMAL type.
> As of Avro 1.7.7 (AVRO-1402), this type is supported and so the SQL/HiveQL 
> processors should be updated to handle BigDecimals correctly if possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (NIFI-2624) JDBC-to-Avro processors handle BigDecimals as Strings

2016-08-28 Thread Toivo Adams (JIRA)

 [ 
https://issues.apache.org/jira/browse/NIFI-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toivo Adams reassigned NIFI-2624:
-

Assignee: Toivo Adams

> JDBC-to-Avro processors handle BigDecimals as Strings
> -
>
> Key: NIFI-2624
> URL: https://issues.apache.org/jira/browse/NIFI-2624
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Reporter: Matt Burgess
>Assignee: Toivo Adams
>
> The original SQL processors implemented BigDecimal values as Strings for 
> Avro, as the version of Avro it used (1.7.6) did not support DECIMAL type.
> As of Avro 1.7.7 (AVRO-1402), this type is supported and so the SQL/HiveQL 
> processors should be updated to handle BigDecimals correctly if possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-1214) Mock Framework should allow order-independent assumptions on FlowFiles

2016-07-20 Thread Toivo Adams (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385585#comment-15385585
 ] 

Toivo Adams commented on NIFI-1214:
---

I will try to solve it over the weekend.

> Mock Framework should allow order-independent assumptions on FlowFiles
> --
>
> Key: NIFI-1214
> URL: https://issues.apache.org/jira/browse/NIFI-1214
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Tools and Build
>Reporter: Mark Payne
>Assignee: Toivo Adams
> Fix For: 1.0.0
>
>
> A common pattern in unit testing is to iterate over all FlowFiles that are 
> output to a Relationship and verify that each FlowFile matches one criteria 
> or another and that all criteria are met. For example, the following code 
> snippet from TestRouteText  verifies that two FlowFiles were output and that 
> Criteria A was met by one of them and Criteria B was met by the other:
> {code}
> final List<MockFlowFile> list = runner.getFlowFilesForRelationship("o");
> boolean found1 = false;
> boolean found2 = false;
> for (final MockFlowFile mff : list) {
> if (mff.getAttribute(RouteText.GROUP_ATTRIBUTE_KEY).equals("1")) {
> mff.assertContentEquals("1,hello\n1,good-bye");
> found1 = true;
> } else {
> mff.assertAttributeEquals(RouteText.GROUP_ATTRIBUTE_KEY, "2");
> mff.assertContentEquals("2,world\n");
> found2 = true;
> }
> }
> assertTrue(found1);
> assertTrue(found2);
> {code}
> This is very verbose, and error-prone. It could be done much more concisely 
> if we have a method like:
> {code}
> TestRunner.assertAllConditionsMet( Relationship relationship, 
> FlowFileVerifier... verifier );
> {code}
> Where FlowFileVerifier is able to verify some condition on a FlowFile. This 
> method would then be responsible for ensuring that each FlowFile that was 
> routed to 'relationship' meets one of the criteria specified by a verifier, 
> and that all of the verifiers were met. For example:
> {code}
> runner.assertAllConditionsMet( "o", 
> { mff -> mff.isAttributeEqual(RouteText.GROUP_ATTRIBUTE_KEY, "1") && 
> mff.isContentEqual("1,hello\n1,good-bye") },
> { mff -> mff.isAttributeEqual(RouteText.GROUP_ATTRIBUTE_KEY, "2") && 
> mff.isContentEqual("2,world\n") }
> );
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)