[
https://issues.apache.org/jira/browse/NIFI-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698251#comment-14698251
]
Mark Payne commented on NIFI-853:
---------------------------------
Toivo,
Some excellent feedback! Thank you! Responses to questions below.
1. You're right - the processor is not currently batching these together. This
is certainly something that we need to address.
2. I used the content of the FlowFile, rather than a FlowFile attribute,
because I was afraid that the SQL statement might get quite large and would not
be something that we want as an attribute. However, now that you mention this,
it's probably not a problem at all because the actual values to get inserted
are individual attributes. So the SQL statement itself won't likely be that
large. We could certainly consider changing it to use an attribute instead.
[~aldrin], I think you looked at this Processor a little bit as well, right? Do
you have any thoughts on using FlowFile content vs. attributes? I'm thinking
that you're right, Toivo - attributes may be the way to go here.
3. Yes, each FlowFile will represent a single row in the database.
One advantage of using attributes to hold the SQL statement instead of the
FlowFile content is that we could potentially have many statements in FlowFile
attributes, rather than having a single FlowFile per row to insert. The
complexity here, though, is how to handle parameterized queries when a single
FlowFile carries multiple queries. Right now, the query parameters are stored
in a known location - sql.args.1.value, sql.args.1.type, sql.args.2.value,
sql.args.2.type, and so on. It becomes more difficult if we have even two
queries for a given FlowFile - how do we specify which parameters belong to
which query?
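
To make the parameter convention concrete, here is a rough sketch of how the
sql.args.N.value / sql.args.N.type attributes could be collected into an
ordered parameter list. This is only an illustration, not the code in the
attached patches: the SqlArgs and Param names are hypothetical, the type is
kept as an opaque string because its encoding is not specified here, and
treating a missing value as SQL NULL is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of the NiFi API or the attached patches.
public class SqlArgs {
    // Matches attribute names such as "sql.args.1.value" or "sql.args.2.type".
    private static final Pattern ARG =
            Pattern.compile("sql\\.args\\.(\\d+)\\.(value|type)");

    /** One positional parameter: 1-based index, declared type, raw value. */
    public static final class Param {
        public final int index;
        public final String type;   // opaque here; the encoding is an assumption
        public final String value;  // null is treated as SQL NULL (assumption)

        Param(int index, String type, String value) {
            this.index = index;
            this.type = type;
            this.value = value;
        }
    }

    /** Collects sql.args.N.* attributes into a list ordered by N. */
    public static List<Param> extract(Map<String, String> attributes) {
        // index -> {type, value}; TreeMap keeps the parameters in positional order.
        Map<Integer, String[]> byIndex = new TreeMap<>();
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            Matcher m = ARG.matcher(e.getKey());
            if (!m.matches()) {
                continue; // unrelated attribute, e.g. "filename"
            }
            int idx = Integer.parseInt(m.group(1));
            String[] slot = byIndex.computeIfAbsent(idx, k -> new String[2]);
            slot["type".equals(m.group(2)) ? 0 : 1] = e.getValue();
        }

        List<Param> params = new ArrayList<>();
        for (Map.Entry<Integer, String[]> e : byIndex.entrySet()) {
            String[] slot = e.getValue();
            if (slot[0] == null) {
                throw new IllegalArgumentException(
                        "missing sql.args." + e.getKey() + ".type");
            }
            params.add(new Param(e.getKey(), slot[0], slot[1]));
        }
        return params;
    }
}
```

Note that the flat sql.args.N namespace has only one index, so a sketch like
this can only ever serve one statement per FlowFile. Supporting several
statements would need some additional qualifier in the attribute names, which
is exactly the open question above.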
> Create Processors to put JSON data to a Relational Database
> -----------------------------------------------------------
>
> Key: NIFI-853
> URL: https://issues.apache.org/jira/browse/NIFI-853
> Project: Apache NiFi
> Issue Type: Task
> Components: Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Fix For: 0.3.0
>
> Attachments:
> 0001-NIFI-853-Added-processors-ConvertFlatJSONToSQL-PutSQ.patch,
> 0002-NIFI-853-Made-updates-to-processors.patch
>
>
> Most of the discussion/design for these processors happened in the comments
> of NIFI-293, which was the initial ticket for implementing JDBC functionality
> in NiFi, but was closed in a previous version, so this ticket was created to
> do the work.
> The idea is to have a processor that will take in FlowFiles whose contents
> are arbitrary SQL INSERT/UPDATE commands. The commands can be parameterized
> with the parameters' values and types in FlowFile attributes.
> We then should have a processor that converts a JSON document into a SQL
> command to either update or insert data into a database table. In the future,
> we will probably also want processors that handle other data formats,
> converting XML, CSV, Avro, etc. into SQL commands.
> This breakout gives us a nice coherence to the "do only one thing and do it
> well" principle by separating the logic of handling all of the incoming
> formats from the logic of updating the database.