[
https://issues.apache.org/jira/browse/NIFI-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698408#comment-14698408
]
Toivo Adams commented on NIFI-853:
----------------------------------
Flexibility is always good but we may lose something.
I am a little worried about performance.
A single insert/update can be very slow.
Depending on how many columns are included in indexes, some relational
databases can do only a few hundred inserts/updates per second.
This is painfully low compared to how much NiFi can handle.
With JDBC batching, insert/update throughput can improve 5-8 times. This is a
huge difference.
I know this is not so simple. Many databases have special parameters for
fine-tuning 'bulk' inserts/updates.
I may not know every way PreparedStatement can be used, but I suspect batching
is not possible when the statements use different columns.
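
To illustrate the batching concern, here is a minimal sketch, not taken from
the attached patches. A JDBC batch is built on one PreparedStatement, which is
prepared from a single SQL string, so commands that touch different columns
cannot share a batch. One possible workaround is to group incoming commands by
identical SQL text and send each group as its own batch. The names SqlBatcher,
Command, groupBySql and executeBatches are hypothetical; only the java.sql
calls are standard JDBC.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of NiFi. */
final class SqlBatcher {

    /** One parameterized command: SQL text plus positional parameter values. */
    static final class Command {
        final String sql;
        final List<Object> params;

        Command(String sql, List<Object> params) {
            this.sql = sql;
            this.params = params;
        }
    }

    /**
     * Groups commands by identical SQL text, preserving first-seen order.
     * Commands with different column lists have different SQL text and
     * therefore end up in different groups.
     */
    static Map<String, List<List<Object>>> groupBySql(List<Command> commands) {
        Map<String, List<List<Object>>> groups = new LinkedHashMap<>();
        for (Command c : commands) {
            groups.computeIfAbsent(c.sql, k -> new ArrayList<>()).add(c.params);
        }
        return groups;
    }

    /**
     * Sends each group as one JDBC batch inside a single transaction.
     * Returns the number of batches executed.
     */
    static int executeBatches(Connection conn, List<Command> commands) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        int batches = 0;
        try {
            for (Map.Entry<String, List<List<Object>>> e : groupBySql(commands).entrySet()) {
                try (PreparedStatement ps = conn.prepareStatement(e.getKey())) {
                    for (List<Object> params : e.getValue()) {
                        for (int i = 0; i < params.size(); i++) {
                            ps.setObject(i + 1, params.get(i)); // JDBC parameters are 1-based
                        }
                        ps.addBatch();
                    }
                    ps.executeBatch();
                    batches++;
                }
            }
            conn.commit();
        } catch (SQLException ex) {
            conn.rollback();
            throw ex;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
        return batches;
    }
}
```

Note that grouping changes the order in which commands reach the database, so
this only works when the commands are independent of each other. How much
faster it is will depend on the driver and the database.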
My second concern is how well NiFi handles a lot of small FlowFiles instead of
one bigger one.
I know NiFi is very good, but still, isn't one bigger FlowFile much less
resource-hungry than many small ones?
My third concern is provenance usage. I always prefer to read some kind
of summary first.
Say ten thousand records are inserted successfully. I don't want to see them
all separately.
Thanks
Toivo
> Create Processors to put JSON data to a Relational Database
> -----------------------------------------------------------
>
> Key: NIFI-853
> URL: https://issues.apache.org/jira/browse/NIFI-853
> Project: Apache NiFi
> Issue Type: Task
> Components: Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Fix For: 0.3.0
>
> Attachments:
> 0001-NIFI-853-Added-processors-ConvertFlatJSONToSQL-PutSQ.patch,
> 0002-NIFI-853-Made-updates-to-processors.patch
>
>
> Most of the discussion/design for these processors happened in the comments
> of NIFI-293, which was the initial ticket for implementing JDBC functionality
> in NiFi, but was closed in a previous version, so this ticket was created to
> do the work.
> The idea is to have a processor that will take in FlowFiles whose contents
> are arbitrary SQL INSERT/UPDATE commands. The commands can be parameterized
> with the parameters' values and types in FlowFile attributes.
> We then should have a processor that converts a JSON document into a SQL
> command to either update or insert data into a database table. We will also
> want some other processors in the future probably to handle other data types,
> such as converting XML, CSV, Avro, etc. into SQL commands.
> This breakout adheres nicely to the "do only one thing and do it
> well" principle by separating the logic of handling all of the incoming
> formats from the logic of updating the database.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)