[ 
https://issues.apache.org/jira/browse/NIFI-15599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18058382#comment-18058382
 ] 

Pierre Villard commented on NIFI-15599:
---------------------------------------

Hi [~fube1] - thanks for reporting this issue.

This is actually a known "issue" on the Calcite side:

[https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/rel/type/RelDataTypeSystemImpl.java#L154]

The default precision when casting a string is 0 even though it should not 
according to standards.

The reason why your cases lead to different results is because we don't take 
the same code paths in Calcite code when you pass a string literal and when you 
reference a "column".

Case 1 - String literal: CAST('2026-01-01 12:12:00.123' AS TIMESTAMP)

Calcite sees that this is a constant expression (a literal cast to a type). 
During query planning (before execution), Calcite performs constant folding - 
it evaluates the expression eagerly. It parses the string into a 
TimestampString, then constructs a TIMESTAMP literal with the declared 
precision. Since TIMESTAMP defaults to TIMESTAMP(0), the value is actively 
truncated to second precision at this point.

Case 2 - Column reference: CAST(ts_json AS TIMESTAMP)

Calcite cannot evaluate this at planning time because the value of ts_json is 
not known until the query runs. So it generates runtime conversion code. At 
execution time, when Calcite processes each row, it converts the String value 
to a long (epoch milliseconds) using a function like 
SqlFunctions.toTimestamp(). This function parses the full string "2026-03-03 
03:23:55.456" into an epoch-millis long value that naturally includes the 456 
milliseconds. The declared type is still TIMESTAMP(0), but the runtime 
conversion does not enforce precision truncation on the actual value - the long 
just contains what it contains.

Back to the issue here... the best way to fix the problem is to specify the 
precision in the query:
{code:java}
select 
CAST('2026-01-01 12:12:00.123' AS TIMESTAMP(3)) AS ts_from_string_timestamp,
CAST(ts_json AS TIMESTAMP(3)) AS ts_from_json_timestamp,
CAST('${ts_attr}' AS TIMESTAMP(3)) AS ts_from_attr_timestamp
from flowfile  {code}
We could do something on the NiFi side to specifically set a default that 
overrides the default one in Calcite but, even though it could be considered as 
a bug fix, it could be considered as a breaking change for existing flows. So 
I'd rather do nothing on the NiFi side. But open to the discussion on this.

> QueryRecord CAST String with miliseconds to TIMESTAMP not always works as 
> expected 
> -----------------------------------------------------------------------------------
>
>                 Key: NIFI-15599
>                 URL: https://issues.apache.org/jira/browse/NIFI-15599
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 2.7.2
>            Reporter: Beat Fuellemann
>            Assignee: Pierre Villard
>            Priority: Major
>         Attachments: Bildschirmfoto 2026-02-13 um 07.55.46.png, 
> Bildschirmfoto 2026-02-13 um 07.56.05.png
>
>
> With queryRecord I need to cast a String with miliseconds format (2026-01-01 
> 12:12:00.123) to a timestamp.
> Unfortunately, that doesn't work in each case.
> ------------------------------------
> Case1: (String directly to timestamp)
> CAST('2026-01-01 12:12:00.123' AS TIMESTAMP) AS ts_from_string_timestamp
> Results:  "ts_from_string_timestamp" : "2026-01-01T12:12:00Z"  -> 
> *MILISECONDS are dropped*
> ------------------------------------
> Case2: (String from content to timestamp)
> CAST(ts_json AS TIMESTAMP) AS ts_from_json_timestamp,
> Result:   "ts_from_json_timestamp" : "2026-03-03T03:23:55.{*}456{*}Z" -> 
> MILISECONDS are shown as expected
> ------------------------------------
> Case3: (String from attribute to timestamp)
> CAST('${ts_attr}' AS TIMESTAMP) AS ts_from_attr_timestamp
> Result: "ts_from_attr_timestamp" : "2026-02-02T20:20:20Z"  -> *MILISECONDS 
> are dropped*
>  
> QueryRecord SQL:
> {code:java}
> select 
> CAST('2026-01-01 12:12:00.123' AS TIMESTAMP) AS ts_from_string_timestamp,
> CAST(ts_json AS TIMESTAMP) AS ts_from_json_timestamp,
> CAST('${ts_attr}' AS TIMESTAMP) AS ts_from_attr_timestamp
> from flowfile {code}
> Result Flowfile original as avro:
> {code:java}
> Objavro.schema�{"type":"record","name":"nifiRecord","namespace":"org.apache.nifi","fields":[{"name":"ts_from_string_timestamp","type":{"type":"long","logicalType":"timestamp-millis"}},{"name":"ts_from_json_timestamp","type":[{"type":"long","logicalType":"timestamp-millis"},"null"]},{"name":"ts_from_attr_timestamp","type":{"type":"long","logicalType":"timestamp-millis"}}]}avro.codecnull{code}
> Result Flowfile  formated as avro:
> {code:java}
> [ {
>   "ts_from_string_timestamp" : "2026-01-01T12:12:00Z",
>   "ts_from_json_timestamp" : "2026-03-03T03:23:55.456Z",
>   "ts_from_attr_timestamp" : "2026-02-02T20:20:20Z"
> } ]{code}
> Because it works in case where the string is get from json content, I think 
> it is a bug in nifi and NOT a limitation of calcite SQL.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to