hudi-bot opened a new issue, #16549:
URL: https://github.com/apache/hudi/issues/16549

   The following error is thrown when using the JSON Kafka source with a 
transformer when a decimal is in the schema:
   {code:java}
   Caused by: Json to Avro Type conversion error for field loaded_at, 2024-06-03 13:42:34.951+00:00 for {"type":"long","logicalType":"timestamp-millis"}
        at org.apache.hudi.avro.MercifulJsonConverter$JsonToAvroFieldProcessorUtil$JsonToAvroFieldProcessor.convertToAvro(MercifulJsonConverter.java:194)
        at org.apache.hudi.avro.MercifulJsonConverter$JsonToAvroFieldProcessorUtil.convertToAvro(MercifulJsonConverter.java:204)
        at org.apache.hudi.avro.MercifulJsonConverter.convertJsonToAvroField(MercifulJsonConverter.java:182)
        at org.apache.hudi.avro.MercifulJsonConverter.convertJsonToAvro(MercifulJsonConverter.java:126)
        at org.apache.hudi.avro.MercifulJsonConverter.convert(MercifulJsonConverter.java:107)
        at org.apache.hudi.utilities.sources.helpers.AvroConvertor.fromJson(AvroConvertor.java:118)
        ... 43 more
   {code}
   We need to make sure "2024-06-03 13:42:34.951+00:00" is supported by the 
timestamp logical type.
    * ISO 8601 includes the zone offset in the standard, e.g., {{+01:00}}, and 
{{Z}} is the zone offset equivalent to {{+00:00}}, i.e., UTC 
([ref1|https://en.wikipedia.org/wiki/ISO_8601#Time_zone_designators])
    * {{2011-12-03T10:15:30+01:00}} conforms to ISO 8601 with {{T}} as the 
separation character
    * Some systems use {{ }} (space) instead of {{T}} as the separator (the 
other parts are the same).  References indicate that ISO 8601 used to allow 
this by _mutual agreement_ 
([ref2|https://stackoverflow.com/questions/30201003/how-to-deal-with-optional-t-in-iso-8601-timestamp-in-java-8-jsr-310-threet],
 [ref3|https://www.reddit.com/r/ISO8601/comments/173r61j/t_vs_space_separation_of_date_and_time/])
    * {{DateTimeFormatter.ISO_OFFSET_DATE_TIME}} can parse timestamps like 
{{2024-05-13T23:53:36.004Z}}, which {{MercifulJsonConverter}} already 
supports, and additionally timestamps with a zone offset like 
{{2011-12-03T10:15:30+01:00}}, which {{MercifulJsonConverter}} does not support 
yet
    * {{DateTimeFormatter.ISO_OFFSET_DATE_TIME}} cannot parse a timestamp that 
uses a space as the separator, like {{2011-12-03 10:15:30+01:00}}.  With a 
small change to the formatter, this can be supported easily.
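
A minimal sketch of such a formatter, assuming the value is parsed into epoch 
milliseconds for {{timestamp-millis}}. The class name, the constant, and 
{{toEpochMillis}} are hypothetical and are not the actual 
{{MercifulJsonConverter}} code. Because both separators are optional sections, 
this variant is slightly more lenient than strictly needed (it also accepts 
input with no separator).

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;

// Hypothetical helper for illustration only; not part of Hudi.
final class LenientTimestampParser {

    // ISO local date, then an optional 'T' or an optional space,
    // then ISO local time and a zone offset such as +01:00 or Z.
    static final DateTimeFormatter LENIENT_OFFSET_DATE_TIME =
        new DateTimeFormatterBuilder()
            .parseCaseInsensitive()
            .append(DateTimeFormatter.ISO_LOCAL_DATE)
            .optionalStart().appendLiteral('T').optionalEnd()
            .optionalStart().appendLiteral(' ').optionalEnd()
            .append(DateTimeFormatter.ISO_LOCAL_TIME)
            .appendOffsetId()
            .toFormatter();

    // Parses the text and converts it to milliseconds since the epoch (UTC).
    static long toEpochMillis(String text) {
        return OffsetDateTime.parse(text, LENIENT_OFFSET_DATE_TIME)
            .toInstant()
            .toEpochMilli();
    }

    public static void main(String[] args) {
        // The value from the stack trace (space separator, +00:00 offset).
        System.out.println(toEpochMillis("2024-06-03 13:42:34.951+00:00"));
        // The forms ISO_OFFSET_DATE_TIME already handles still parse.
        System.out.println(toEpochMillis("2024-05-13T23:53:36.004Z"));
        System.out.println(toEpochMillis("2011-12-03T10:15:30+01:00"));
    }
}
```

Since inputs with {{T}} and {{Z}} parse to the same instants as before, a 
formatter along these lines would only widen the set of accepted values.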
   
   My take is that we should change the formatter for the timestamp logical 
types to support the zone offset and the space character as the separator 
(which is backwards compatible), rather than introducing a new format config 
(assuming that the space character is the only common variant). 
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-7985
   - Type: Improvement
   - Fix version(s):
     - 1.1.0

