[GitHub] [hudi] leoyy0316 opened a new issue, #6097: [SUPPORT] spark timestamp issue

GitBox Wed, 13 Jul 2022 03:52:44 -0700


leoyy0316 opened a new issue, #6097:
URL: https://github.com/apache/hudi/issues/6097


   For a Hudi timestamp field, if bulk_insert is used for initialization and 
upsert is used for stream processing, the field reads as 1970-01-19 
09:54:26.763497 at query time. Whenever a file receives an update, the field 
in that file becomes 1970-01-19 09:54:26.763497; files that have not been 
updated are not affected. If bulk_insert is used for stream processing as well 
as for initialization, the problem does not occur.
   
   This field is correct after initialization, but after stream processing it 
becomes 1970-01-19 09:54:26.763497
   
   Spark (Scala) is used for initialization, and Spark Structured Streaming is 
used for stream processing.
   
   **Spark, Hive, and Parquet types and values are below**
   parquet: int64      1619666763497
   spark:   timestamp  1970-01-19 09:54:26.763497
   hive:    bigint     1619666763497
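
A quick arithmetic check, offered as a sketch rather than a confirmed diagnosis: the values above are consistent with the stored int64 being epoch milliseconds that get read back as microseconds. The names `EPOCH` and `raw` are illustrative only. The calculation is done in UTC, so the hour may differ from the Spark output depending on the session time zone; the minutes, seconds, and fractional part are what to compare.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
raw = 1619666763497  # int64 value reported for parquet and hive

# Assumption: the writer intended epoch milliseconds.
as_millis = EPOCH + timedelta(milliseconds=raw)

# The same number interpreted as epoch microseconds.
as_micros = EPOCH + timedelta(microseconds=raw)

print(as_millis.isoformat())  # 2021-04-29T03:26:03.497000+00:00
print(as_micros.isoformat())  # 1970-01-19T17:54:26.763497+00:00
```

The microsecond reading lands on 1970-01-19 with the same `54:26.763497` seen in the Spark query result, which suggests (but does not prove) a milliseconds/microseconds mismatch between the bulk_insert and upsert write paths.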
   
   **Environment Description**
   
   * Hudi version : 0.11.1
   
   * Spark version : 3.0.2
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

