[ 
https://issues.apache.org/jira/browse/HUDI-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Vexler updated HUDI-9172:
----------------------------------
    Status: In Progress  (was: Open)

> Timestamp millis logical type is being read wrong from log files
> ----------------------------------------------------------------
>
>                 Key: HUDI-9172
>                 URL: https://issues.apache.org/jira/browse/HUDI-9172
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: reader-core, spark, spark-sql
>    Affects Versions: 1.0.0, 1.0.1
>            Reporter: Jonathan Vexler
>            Assignee: Jonathan Vexler
>            Priority: Major
>
> Partial schema:
> {code:java}
>     {
>       "name": "timestamp_millis_nullable_field",
>       "type": [
>         "null",
>         {
>           "type": "long",
>           "logicalType": "timestamp-millis"
>         }
>       ],
>       "default": null
>     },
>     {
>       "name": "timestamp_micros_nullable_field",
>       "type": [
>         "null",
>         {
>           "type": "long",
>           "logicalType": "timestamp-micros"
>         }
>       ],
>       "default": null
>     },
>     {
>       "name": "timestamp_local_millis_nullable_field",
>       "type": [
>         "null",
>         {
>           "type": "long",
>           "logicalType": "local-timestamp-millis"
>         }
>       ],
>       "default": null
>     },
>     {
>       "name": "timestamp_local_micros_nullable_field",
>       "type": [
>         "null",
>         {
>           "type": "long",
>           "logicalType": "local-timestamp-micros"
>         }
>       ],
>       "default": null
>     }{code}
> Here is the same record read before and after compaction using the Spark datasource:
> {code:java}
> {"_hoodie_commit_time":"20250312194153518","_hoodie_commit_seqno":"20250312194153518_2_83","_hoodie_record_key":"0252824c-7b64-41a0-81a4-b8f5e2271b14","_hoodie_partition_path":"WARN","key":"0252824c-7b64-41a0-81a4-b8f5e2271b14","ts":1741808513729,"severity":null,"double_field":0.7353261619385916,"float_field":0.7660646,"int_field":2010860559,"long_field":8103800306916814465,"boolean_field":true,"string_field":"JMdZXEImEEXXScOivldhirRdMxmdbXxuQMyMfHpQynWkTDNBoOkoyOdVZNPgvxNQZOColrHsbrLJASmWSHKOEsKXnVUYsZhckRjEHLCSrUBIeeCEftWvmtxoExNcOPCxVhNZrQgRqxAWbssnYiPqzMFfmZXrtMfkihFfvWfgMZQkTIKpDdpBOREWPrqYBNwRmVtpMXItwCIsgvpWmUiiTQCkxsegiauMpMgGiOTQUPkJppnjrloeOBpTMjkbNefyXNfsyRlqsfIVnnAfgxuwJdbBKFMYZnjJCPqPmCWZUVetPLiVUWvTrhUFjyLlxsjvyfrOIktzabVyiPnIzZkUUTJoIkIktqVzWeWbVWSivYrOCbRboPYbmTtfPIYaUcMQrlHaYEKwtFXpWZBeIHcOkTpCueBPqJAcdxRsfkwIwTRIExGqXlMaCLoUtaNrccViRqLnfjjjguskqWZncyOUtYeZjFQvFFcYsuhbrpUSFTiCFYrtSpdvBnKCnjINoUijyYSLhvNNaggCnEGShkrgBeWguHyFnFhNWbVWXUjrACTzLSFyZVWRGfEvBzzlKlEymyXXeRvnoMxxfhcEDBOpQBXEGFZMLEdmdqhKmNvafARRuHJGrzjWxwTfPTFqtLjSGnxqdZBIOqjuignkWIFpzbHWnWtYfCRIqRBICdnNzKvNVjtYgIsBXjZLRdkzdBvsNeMRhbDzYjxbxDyiEIdBHabzoTlWgguFLkStQvkYhMrPhcioDmiusCgyuuVzlqStzLMsRksajVDRxFEKmZKZLKApeuRCLKDoVOSkuMXBowizUdEe","bytes_field":"dFloYXhsRGFIUw==","decimal_field":7608.44,"nested_record":null,"nullable_map_field":{"MmbfD":{"nested_int":-560277617,"level":"ERROR"}},"array_field":[{"nested_int":2020017221,"level":"WARN"},{"nested_int":370699254,"level":"ERROR"}],"enum_field":"FIRST","date_nullable_field":"2025-03-11","timestamp_millis_nullable_field":"1970-01-21T03:47:59.388Z","timestamp_micros_nullable_field":"2025-03-11T04:40:28.143Z","timestamp_local_millis_nullable_field":1741715142032,"timestamp_local_micros_nullable_field":1741675805756000,"level":"WARN"}
> {"_hoodie_commit_time":"20250312194153518","_hoodie_commit_seqno":"20250312194153518_2_83","_hoodie_record_key":"0252824c-7b64-41a0-81a4-b8f5e2271b14","_hoodie_partition_path":"WARN","key":"0252824c-7b64-41a0-81a4-b8f5e2271b14","ts":1741808513729,"severity":null,"double_field":0.7353261619385916,"float_field":0.7660646,"int_field":2010860559,"long_field":8103800306916814465,"boolean_field":true,"string_field":"JMdZXEImEEXXScOivldhirRdMxmdbXxuQMyMfHpQynWkTDNBoOkoyOdVZNPgvxNQZOColrHsbrLJASmWSHKOEsKXnVUYsZhckRjEHLCSrUBIeeCEftWvmtxoExNcOPCxVhNZrQgRqxAWbssnYiPqzMFfmZXrtMfkihFfvWfgMZQkTIKpDdpBOREWPrqYBNwRmVtpMXItwCIsgvpWmUiiTQCkxsegiauMpMgGiOTQUPkJppnjrloeOBpTMjkbNefyXNfsyRlqsfIVnnAfgxuwJdbBKFMYZnjJCPqPmCWZUVetPLiVUWvTrhUFjyLlxsjvyfrOIktzabVyiPnIzZkUUTJoIkIktqVzWeWbVWSivYrOCbRboPYbmTtfPIYaUcMQrlHaYEKwtFXpWZBeIHcOkTpCueBPqJAcdxRsfkwIwTRIExGqXlMaCLoUtaNrccViRqLnfjjjguskqWZncyOUtYeZjFQvFFcYsuhbrpUSFTiCFYrtSpdvBnKCnjINoUijyYSLhvNNaggCnEGShkrgBeWguHyFnFhNWbVWXUjrACTzLSFyZVWRGfEvBzzlKlEymyXXeRvnoMxxfhcEDBOpQBXEGFZMLEdmdqhKmNvafARRuHJGrzjWxwTfPTFqtLjSGnxqdZBIOqjuignkWIFpzbHWnWtYfCRIqRBICdnNzKvNVjtYgIsBXjZLRdkzdBvsNeMRhbDzYjxbxDyiEIdBHabzoTlWgguFLkStQvkYhMrPhcioDmiusCgyuuVzlqStzLMsRksajVDRxFEKmZKZLKApeuRCLKDoVOSkuMXBowizUdEe","bytes_field":"dFloYXhsRGFIUw==","decimal_field":7608.44,"nested_record":null,"nullable_map_field":{"MmbfD":{"nested_int":-560277617,"level":"ERROR"}},"array_field":[{"nested_int":2020017221,"level":"WARN"},{"nested_int":370699254,"level":"ERROR"}],"enum_field":"FIRST","date_nullable_field":"2025-03-11","timestamp_millis_nullable_field":"2025-03-11T07:49:48.590Z","timestamp_micros_nullable_field":"2025-03-11T04:40:28.143Z","timestamp_local_millis_nullable_field":1741715142032,"timestamp_local_micros_nullable_field":1741675805756000,"level":"WARN"}
>  {code}
> All fields are the same except timestamp_millis_nullable_field. My guess
> is that this is due to the Avro -> InternalRow conversion in the file group
> reader. HUDI-9142 seems like it might be related.
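
The two values reported for timestamp_millis_nullable_field differ in a way that is consistent with a unit mix-up: the epoch-millisecond value behind 2025-03-11T07:49:48.590Z, if interpreted as microseconds, lands on 1970-01-21T03:47:59.388Z. The sketch below only checks that arithmetic. It does not call any Hudi or Spark code, the helper names `from_epoch` and `to_epoch_millis` are hypothetical, and the idea that the millis value reaches Spark's microsecond-based timestamp representation without being scaled is an assumption, not a confirmed root cause.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch(value: int, unit: str) -> datetime:
    """Interpret an epoch offset as 'millis' or 'micros' (hypothetical helper)."""
    if unit == "millis":
        return EPOCH + timedelta(milliseconds=value)
    if unit == "micros":
        return EPOCH + timedelta(microseconds=value)
    raise ValueError(f"unsupported unit: {unit}")


def to_epoch_millis(ts: datetime) -> int:
    """Convert an aware datetime to whole milliseconds since the epoch."""
    return (ts - EPOCH) // timedelta(milliseconds=1)


# One of the two values reported in the issue for the timestamp-millis field.
expected = datetime(2025, 3, 11, 7, 49, 48, 590000, tzinfo=timezone.utc)

# The long that a timestamp-millis field would hold for that instant.
stored_millis = to_epoch_millis(expected)

# Reading the long with the unit the schema declares.
correct = from_epoch(stored_millis, "millis")

# Assumed failure mode: the same long treated as microseconds.
misread = from_epoch(stored_millis, "micros")

print(stored_millis)
print(correct.isoformat(timespec="milliseconds"))
print(misread.isoformat(timespec="milliseconds"))
```

If this assumption holds, multiplying the timestamp-millis long by 1000 before handing it to a microsecond-based representation would make the two reads agree. The timestamp-micros field would be unaffected either way, which matches the records above.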



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
