[ 
https://issues.apache.org/jira/browse/NIFI-853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698251#comment-14698251
 ] 

Mark Payne commented on NIFI-853:
---------------------------------

Toivo,

Some excellent feedback! Thank you! Responses to questions below.

1. You're right - the processor is not currently batching these together. This 
is certainly something that we need to address.

2. I used the content of the FlowFile, rather than a FlowFile attribute, 
because I was afraid that the SQL statement might get quite large, which is not 
something that we want in an attribute. However, now that you mention this, 
it's probably not a problem at all because the actual values to be inserted 
are individual attributes. So the SQL statement itself likely won't be that 
large. We could certainly consider changing it to use an attribute instead. 
[~aldrin], I think you looked at this Processor a little bit as well, right? Do 
you have any thoughts on using FlowFile content vs. attributes? I'm thinking 
that you're right, Toivo - attributes may be the way to go here.

3. Yes, each FlowFile will represent a single row in the database.

One advantage of using attributes to hold the SQL statement instead of the 
FlowFile content is that we could potentially have many statements in FlowFile 
attributes, rather than having a single FlowFile per row to insert. The 
complexity here, though, is how do we handle parameterized queries, if we have 
multiple queries? Right now, the query parameters are stored in a known 
location - sql.args.1.value, sql.args.1.type, sql.args.2.value, 
sql.args.2.type, and so on. It becomes more difficult if we have even two 
queries for a given FlowFile - how do we specify which parameters belong to 
which query?
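
To make the current single-query convention concrete, here is a minimal 
sketch of how the numbered attributes could be collected into an ordered 
parameter list. It is illustrative only and is not the code from the attached 
patches: the class name SqlArgs, the Param holder, and the extract method are 
hypothetical, and it assumes that sql.args.N.type holds an integer 
java.sql.Types code, which this thread does not confirm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of NiFi: gathers sql.args.N.* attributes.
public class SqlArgs {

    /** One positional parameter of a parameterized statement. */
    public static final class Param {
        public final int index;     // 1-based position of the '?' placeholder
        public final int jdbcType;  // assumed to be a java.sql.Types code
        public final String value;  // null when no .value attribute is present

        Param(int index, int jdbcType, String value) {
            this.index = index;
            this.jdbcType = jdbcType;
            this.value = value;
        }
    }

    private static final Pattern TYPE_KEY =
            Pattern.compile("sql\\.args\\.(\\d+)\\.type");

    /** Returns the parameters found in the attributes, ordered by index. */
    public static List<Param> extract(Map<String, String> attributes) {
        // TreeMap keeps the parameters sorted by their numeric index.
        TreeMap<Integer, Param> byIndex = new TreeMap<>();
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            Matcher m = TYPE_KEY.matcher(entry.getKey());
            if (!m.matches()) {
                continue; // unrelated attribute, or a .value key handled below
            }
            int index = Integer.parseInt(m.group(1));
            int jdbcType = Integer.parseInt(entry.getValue());
            String value = attributes.get("sql.args." + index + ".value");
            byIndex.put(index, new Param(index, jdbcType, value));
        }
        return new ArrayList<>(byIndex.values());
    }
}
```

Because the attribute names carry only a parameter index, nothing in this 
scheme says which statement a parameter belongs to. That is exactly the gap 
that opens up once a FlowFile holds more than one query.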

> Create Processors to put JSON data to a Relational Database
> -----------------------------------------------------------
>
>                 Key: NIFI-853
>                 URL: https://issues.apache.org/jira/browse/NIFI-853
>             Project: Apache NiFi
>          Issue Type: Task
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>             Fix For: 0.3.0
>
>         Attachments: 
> 0001-NIFI-853-Added-processors-ConvertFlatJSONToSQL-PutSQ.patch, 
> 0002-NIFI-853-Made-updates-to-processors.patch
>
>
> Most of the discussion/design for these processors happened in the comments 
> of NIFI-293, which was the initial ticket for implementing JDBC functionality 
> in NiFi, but was closed in a previous version, so this ticket was created to 
> do the work.
> The idea is to have a processor that will take in FlowFiles whose contents 
> are arbitrary SQL INSERT/UPDATE commands. The commands can be parameterized 
> with the parameters' values and types in FlowFile attributes.
> We then should have a processor that converts a JSON document into a SQL 
> command to either update or insert data into a database table. We will also 
> want some other processors in the future probably to handle other data types, 
> such as converting XML, CSV, Avro, etc. into SQL commands.
> This breakout gives us a nice coherence to the "do only one thing and do it 
> well" principle by separating the logic of handling all of the incoming 
> formats from the logic of updating the database.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
