[ 
https://issues.apache.org/jira/browse/NIFI-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698408#comment-14698408
 ] 

Toivo Adams commented on NIFI-853:
----------------------------------

Flexibility is always good but we may lose something.

I am a little bit worried about performance.
A single insert/update can be very slow.
Depending on how many columns are included in indexes, some relational 
databases are able to do only a few hundred inserts/updates per second.

This is painfully low compared to how much NiFi can handle.

Using JDBC batch insert/update, throughput can be improved 5-8 times. This is 
a huge difference.
I know this is not so simple. Many databases have special parameters for 
fine-tuning 'bulk' inserts/updates.

It might be that I don't know how PreparedStatement can be used, but I suspect 
batching is not possible when the statements use different columns.
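
To illustrate the point: a PreparedStatement is created from a single SQL 
string, and addBatch() only queues another set of parameter values for that 
same statement. So rows with different columns cannot share one batch, but they 
could be grouped by their generated SQL text and batched per group. Below is a 
minimal sketch of that idea, not the code in the attached patches. The names 
BatchBySql, insertSql, groupBySql and insertAll are hypothetical, and table and 
column names are assumed to be trusted identifiers:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

public class BatchBySql {

    // Builds e.g. "INSERT INTO t (a, b) VALUES (?, ?)" from the row's column names.
    // Assumption: table and column names are trusted (no quoting or escaping here).
    static String insertSql(String table, Map<String, Object> row) {
        StringJoiner cols = new StringJoiner(", ");
        StringJoiner marks = new StringJoiner(", ");
        for (String col : row.keySet()) {
            cols.add(col);
            marks.add("?");
        }
        return "INSERT INTO " + table + " (" + cols + ") VALUES (" + marks + ")";
    }

    // Groups the rows' parameter values by generated SQL text, in first-seen order.
    // Rows whose columns differ (or are ordered differently) land in different groups.
    static Map<String, List<List<Object>>> groupBySql(String table,
                                                      List<Map<String, Object>> rows) {
        Map<String, List<List<Object>>> groups = new LinkedHashMap<>();
        for (Map<String, Object> row : rows) {
            groups.computeIfAbsent(insertSql(table, row), k -> new ArrayList<>())
                  .add(new ArrayList<>(row.values()));
        }
        return groups;
    }

    // Uses one PreparedStatement and one executeBatch() call per distinct SQL text.
    static void insertAll(Connection conn, String table,
                          List<Map<String, Object>> rows) throws SQLException {
        for (Map.Entry<String, List<List<Object>>> group
                : groupBySql(table, rows).entrySet()) {
            try (PreparedStatement ps = conn.prepareStatement(group.getKey())) {
                for (List<Object> params : group.getValue()) {
                    for (int i = 0; i < params.size(); i++) {
                        ps.setObject(i + 1, params.get(i)); // JDBC parameters are 1-based
                    }
                    ps.addBatch();
                }
                ps.executeBatch();
            }
        }
    }
}
```

If most incoming records share the same columns, nearly all of them would end 
up in one group and so in one batch. How much that helps in practice still 
depends on the driver and the database.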

The second concern is how well NiFi handles a lot of small FlowFiles instead 
of one bigger one.
I know NiFi is very good, but still, isn't one bigger FlowFile much less 
resource-hungry than many small ones?

And the third concern is provenance usage. I always prefer to first read some 
kind of summary.
Say ten thousand records are inserted successfully. I don't want to see them 
all separately.

Thanks
Toivo

> Create Processors to put JSON data to a Relational Database
> -----------------------------------------------------------
>
>                 Key: NIFI-853
>                 URL: https://issues.apache.org/jira/browse/NIFI-853
>             Project: Apache NiFi
>          Issue Type: Task
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>             Fix For: 0.3.0
>
>         Attachments: 
> 0001-NIFI-853-Added-processors-ConvertFlatJSONToSQL-PutSQ.patch, 
> 0002-NIFI-853-Made-updates-to-processors.patch
>
>
> Most of the discussion/design for these processors happened in the comments 
> of NIFI-293, which was the initial ticket for implementing JDBC functionality 
> in NiFi, but was closed in a previous version, so this ticket was created to 
> do the work.
> The idea is to have a processor that will take in FlowFiles whose contents 
> are arbitrary SQL INSERT/UPDATE commands. The commands can be parameterized 
> with the parameters' values and types in FlowFile attributes.
> We then should have a processor that converts a JSON document into a SQL 
> command to either update or insert data into a database table. We will also 
> want some other processors in the future probably to handle other data types, 
> such as converting XML, CSV, Avro, etc. into SQL commands.
> This breakout gives us a nice coherence to the "do only one thing and do it 
> well" principle by separating the logic of handling all of the incoming 
> formats from the logic of updating the database.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
