Good find, Matt. From a quick glance it looks like NIFI-2625 is definitely
related, although it is more specific to the case where the timestamp is in
milliseconds vs. nanoseconds.

Part of the problem is that PutSQL can handle the epoch OR a date string,
but the Avro produced by ExecuteSQL/QueryDatabaseTable always calls
value.toString() for timestamps, which usually yields something like
yyyy-MM-dd HH:mm:ss.
We end up passing around a date string with a JDBC type of 93 (timestamp),
so comparing the string's length against the size of the column in the
target database will never line up.
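
For reference on the toString() behavior: java.sql.Timestamp.toString() is
what produces that date-string form. A minimal illustration (the exact
output depends on the JVM's default time zone):

import java.sql.Timestamp;

public class TimestampStringDemo {
    public static void main(String[] args) {
        // epoch millis for 2016-04-29 00:00:00 UTC
        Timestamp ts = new Timestamp(1461888000000L);
        System.out.println(ts);           // e.g. 2016-04-29 00:00:00.0
        System.out.println(ts.getTime()); // 1461888000000, the epoch form PutSQL also accepts
    }
}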

Should we modify JdbcCommon so that if rs.getObject(i) is a date/timestamp
we take the epoch and put that as the string value, rather than relying on
value.toString()?

https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/util/JdbcCommon.java#L153-L157
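
Something like this, as a rough sketch against the loop linked above (not
tested; the rec/value names are what I see in the current JdbcCommon, and
the instanceof check is just one way to detect the temporal types):

// inside the per-column loop, where value = rs.getObject(i)
if (value instanceof java.util.Date) {
    // covers java.sql.Date, java.sql.Time, and java.sql.Timestamp, which
    // all extend java.util.Date; store the epoch millis as the string
    // value instead of value.toString(), since PutSQL already accepts an epoch
    rec.put(i - 1, String.valueOf(((java.util.Date) value).getTime()));
} else {
    rec.put(i - 1, value.toString());
}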

On Sat, Sep 10, 2016 at 2:46 PM, Matt Burgess <[email protected]> wrote:

> I think we're running into NIFI-2625 [1] and/or NIFI-1613 [2]: the
> column size returned is for the underlying data type, not the String
> literal it corresponds to.
>
> Regards,
> Matt
>
> [1] https://issues.apache.org/jira/browse/NIFI-2625
> [2] https://issues.apache.org/jira/browse/NIFI-1613
>
> On Sat, Sep 10, 2016 at 2:43 PM, Bryan Bende <[email protected]> wrote:
> > Peng,
> >
> > Thanks for checking what the value of sql.args.3.value was... the fact
> > that it is already truncated there, and not including the
> > hours/mins/seconds, means something is going wrong before it even gets
> > to PutSQL.
> >
> > I was looking at ConvertJSONToSQL and found this code:
> >
> > final Integer colSize = desc.getColumnSize();
> >
> > final JsonNode fieldNode = rootNode.get(fieldName);
> >
> > if (!fieldNode.isNull()) {
> >     String fieldValue = rootNode.get(fieldName).asText();
> >     if (colSize != null && fieldValue.length() > colSize) {
> >         fieldValue = fieldValue.substring(0, colSize);
> >     }
> >     attributes.put("sql.args." + fieldCount + ".value", fieldValue);
> > }
> >
> >
> > What could be happening is that the target DB is saying the column
> > size is 7 (this comes from desc.getColumnSize() above), and then it
> > sees that the string value is longer than 7, so it takes the substring
> > from 0 to 6. It seems weird to me that 7 would be the size of a
> > timestamp column, but I can't come up with any other explanation,
> > unless Oracle's driver is just doing something unusual.
> >
> > If that is what is happening, we probably have a couple of options,
> > and I'm curious to hear what others with more DB experience think...
> >
> > 1) We could decide to leave ConvertJSONToSQL as is, and recommend that
> > for timestamp columns people manipulate the JSON between
> > ConvertAvroToJSON & ConvertJSONToSQL. This could probably be done using
> > the JOLT processor to transform the date string into a long, but I'm
> > not sure. This would probably be confusing because no one would realize
> > they need to do it.
> >
> > 2) We could completely remove this size check from ConvertJSONToSQL,
> > which could in some cases lead to insert statements that fail
> > downstream, but that can be handled by routing the failure relationship
> > of PutSQL somewhere. We know that PutSQL has logic to handle the epoch
> > or date string, so it would only be other cases that could fail.
> >
> > 3) We could add special logic to ConvertJSONToSQL, similar to that in
> > PutSQL, where we determine whether the column is a timestamp and, if
> > the value is a date string, skip this size check/truncation but leave
> > it in place for other types (a rough sketch is below). It could be a
> > slippery slope to start doing special logic for different types of
> > columns.
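> >
> > To sketch option 3 (illustrative only; I'm assuming the
> > ColumnDescription can hand us the JDBC type alongside the column size,
> > and the other names mirror the excerpt above):
> >
> > final int colDataType = desc.getDataType(); // assumed accessor for the JDBC type
> > final boolean isTemporal = colDataType == java.sql.Types.DATE
> >         || colDataType == java.sql.Types.TIME
> >         || colDataType == java.sql.Types.TIMESTAMP;
> >
> > if (!fieldNode.isNull()) {
> >     String fieldValue = fieldNode.asText();
> >     // skip truncation for temporal columns, whose reported "size"
> >     // describes the underlying type rather than the string literal
> >     if (!isTemporal && colSize != null && fieldValue.length() > colSize) {
> >         fieldValue = fieldValue.substring(0, colSize);
> >     }
> >     attributes.put("sql.args." + fieldCount + ".value", fieldValue);
> > }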
> >
> > We need to consider that a lot of people are going to do
> > ExecuteSQL/QueryDatabaseTable -> ConvertAvroToJSON -> ConvertJSONToSQL
> > -> PutSQL, but other people may be producing JSON in some other way and
> > using only the ConvertJSONToSQL -> PutSQL part. In the first case we
> > know that we are storing the date as a string in Avro, so we will
> > always run into this; in the second we don't really know.
> >
> > Thoughts?
> >
> > -Bryan
> >
> >
> > On Sat, Sep 10, 2016 at 9:59 AM, Li, Peng <[email protected]> wrote:
> >
> >> Hi Bryan,
> >>
> >> In provenance the attribute sql.args.3.value was given as "2016-04" and
> >> sql.args.3.type was given as "93".
> >>
> >> But as I already said in my previous mail, I checked the content of
> >> the output of ConvertAvroToJSON and "2016-04-29 00:00:00.0" was given
> >> there, so I think this value was passed to the following
> >> ConvertJSONToSQL processor.
> >>
> >>
> >>
> >> Perhaps it has nothing to do with the format "yyyy-MM-dd HH:mm:ss.SSS"
> >> but with something else?
> >>
> >>
> >>
> >> Thanks
> >>
> >> Peng
> >>
> >>
> >>
> >>
> >> -----Original Message-----
> >>
> >>
> >> On Fri, 09 Sep 2016 12:06:35 GMT, Bryan Bende <[email protected]> wrote:
> >>
> >> Hi Peng,
> >>
> >> In the stack trace it looks like it thinks the value of
> >> sql.args.3.value is "2016-04" rather than "2016-04-29 00:00:00.000".
> >>
> >> Can you double-check the value of sql.args.3.value? You can look at
> >> the flow file in provenance and inspect its attributes.
> >>
> >>
> >>
>
