Samrat Vilasrao Bandgar created NIFI-4359:
---------------------------------------------

             Summary: Enhance ConvertJSONToSQL processor to handle JSON 
containing fields having complex type
                 Key: NIFI-4359
                 URL: https://issues.apache.org/jira/browse/NIFI-4359
             Project: Apache NiFi
          Issue Type: Improvement
          Components: Extensions
            Reporter: Samrat Vilasrao Bandgar


Processor: ConvertJSONToSQL

*Problem statement: *

Sample JSON:
{
    "prop1": "value1",
    "prop2": "value2",
    "prop3": "value3",
    "prop4": {
        "prop5": "value5",
        "prop6": "value6"
    }
}

Sample table:
{noformat}
mysql> desc mytable;
+-------+------------+------+-----+---------+----------------+
| Field | Type       | Null | Key | Default | Extra          |
+-------+------------+------+-----+---------+----------------+
| id    | bigint(20) | NO   | PRI | NULL    | auto_increment |
| prop1 | char(30)   | NO   |     | NULL    |                |
| prop2 | char(30)   | NO   |     | NULL    |                |
| prop3 | char(30)   | NO   |     | NULL    |                |
| prop4 | text       | NO   |     | NULL    |                |
+-------+------------+------+-----+---------+----------------+
5 rows in set (0.00 sec)
{noformat}

With the above mentioned sample json and table, I want to convert the json into 
insert sql in such a way that prop4 column will get inserted with value 
{"prop5":"value5","prop6":"value6"}. However, when I use the current 
ConvertJSONToSQL processor, prop4 column gets inserted with empty string.

*Expected:*

{noformat}
mysql> select * from mytable;
+----+--------+--------+--------+-------------------------------------+
| id | prop1  | prop2  | prop3  | prop4                               |
+----+--------+--------+--------+-------------------------------------+
|  1 | value1 | value2 | value3 | {"prop5":"value5","prop6":"value6"} |
+----+--------+--------+--------+-------------------------------------+
1 row in set (0.00 sec)
{noformat}

*Actual:*

{noformat}
mysql> select * from mytable;
+----+--------+--------+--------+----------+
| id | prop1  | prop2  | prop3  | prop4    |
+----+--------+--------+--------+----------+
|  1 | value1 | value2 | value3 |          |
+----+--------+--------+--------+----------+
1 row in set (0.00 sec)
{noformat}

*Attributes details captured from Provenance Event UI for the above use case 
are:*

sql.args.1.type
1
sql.args.1.value
value1

sql.args.2.type
1
sql.args.2.value
value2

sql.args.3.type
1
sql.args.3.value
value3

sql.args.4.type
-1
sql.args.4.value
{color:red}Empty string set{color}

sql.table
mytable

The ConvertJSONToSQL.java has a method createSqlStringValue(final JsonNode 
fieldNode, final Integer colSize, final int sqlType) which is responsible for 
populating attribute values for each column. This method uses below line to get 
field value.

{code:java}
String fieldValue = fieldNode.asText();
{code}

Documentation for org.codehaus.jackson.JsonNode.asText() method tells us:

*asText()*

Method that will return valid String representation of the container value, if 
the node is a value node (method isValueNode() returns true), otherwise empty 
String.

Since prop4 in this case is not a value node, empty string is returned and get 
set to attribute value for the column prop4.

Suggested improvement is as below. 

If the fieldNode is value node, use asText() else use toString() with 
StringEscapeUtils.escapeSql() to take of characters like quotes in insert 
query. I have tested this locally. Please let me know if it makes sense to add 
this improvement. I will attach the patch file for code changes.

Thanks



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to