[GitHub] [incubator-doris] zeropc opened a new issue #5888: Date type column got incorrect value using spark load with GMT-5 timezone

GitBox Sun, 23 May 2021 20:52:43 -0700


zeropc opened a new issue #5888:
URL: https://github.com/apache/incubator-doris/issues/5888



   Date type column got incorrect value using spark load with GMT-5 timezone
   
   Steps to reproduce the behavior:
   1. Deploy BE on a machine whose timezone is GMT-5.
   2. Use Spark load to load data containing a date type column, with a value such as '2021-05-22'.
   3. The target table contains '2021-05-21'.
   
   **Expected behavior**
   The target table should contain '2021-05-22'.
   
   **Screenshots**
   show partitions:
   | PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | VisibleVersionHash | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | LastConsistencyCheckTime | DataSize | IsInMemory |
   | 289486 | p20210522 | 2 | 2021-05-23 16:12:47 | 6425342295842155734 | NORMAL | etl_date | [types: [DATE]; keys: [2021-05-22]; ..types: [DATE]; keys: [2021-05-23]; ) | date, ltv_type | 12 | 3 | HDD | 9999-12-31 23:59:59 | 2021-05-23 23:00:53 | 952.310 KB | false |
   select distinct:
   +------------+
   | etl_date   |
   +------------+
   | 2021-05-21 |
   +------------+
   
   
   **Additional context**
   In /be/src/exec/parquet_reader.cpp:
   `
   time_t timestamp = (time_t)((int64_t)ts_array->Value(_current_line_of_batch) * 24 * 60 * 60);
   struct tm local;
   localtime_r(&timestamp, &local);
   `
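   Date type columns written by Hive and Impala are stored in Parquet files as int32 values giving the number of days since 1970-01-01. This value carries no timezone, but localtime_r depends on the machine's timezone. On a GMT-5 machine, midnight UTC on 2021-05-22 is 19:00 on 2021-05-21 local time, so the loaded date is one day early.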



