kimberlyamandalu commented on issue #2123:
URL: https://github.com/apache/hudi/issues/2123#issuecomment-778464849


   For reference, this is their response for the ticket.
   They have suggested a workaround.
   
   Greetings Kim,
   Thank you for contacting AWS Premium Support. This is Mrunmayee and I will 
be assisting you with the case.
   I did some investigation on the timestamps not being displayed correctly in 
Athena. As pointed out by you, this is caused When a Glue crawler is run on the 
parquet files, the field is confirmed as TIMESTAMP as granularity of timestamp 
file fields is in "TIMESTAMP_MICROS".
   In the Presto documentation [1], it is given that timestamp granularity up 
to millisecond is supported but not microseconds.  As Athena uses Presto engine 
as the backend query engine, this limitation of Presto validates for Athena 
also which is why we are not being able to see the correct timestamp for those 
columns in Athena console.
   
   In order to deal with this issue, one possible workaround is to change the 
data type of timestamp columns from "timestamp" to "bigint" by editing the 
schema of your table “table_name”. Here “createdt” is the column of concern
   
   You can do this by following the below mentioned steps:
   
   1. Go to the Glue console.
   2. Navigate to the "Tables" option on the sidebar on the left hand side of 
the console.
   3. Select your table, “table_name”
   4. On the right hand side of the console, select the "Edit Schemas" option.
   5. For the field "createdt", select on the datatype (which is currently set 
to timestamp) and change it to bigint by selecting on it.
   6. Click on "Save" to save the changes.
   
   Then we can create a view over the table using Presto functions eg. 
"from_unixtime" [2].
   
   For example : I have changed the 'createdt' data type as bigint and ran the 
following query to get the the desired result with correct timestamp value :
   
   ------------
   SELECT id, code, from_unixtime(createdt/1000000) AS createdt FROM 
"default”.”table”_name limit 10;
   ------------
   
   Let me know if this workaround works for you. As far as a patch/fix is 
concerned, at the moment, I do not have visibility into it. That being said, I 
will look into this and provide you with an update as soon as I have one. 
Apologies for the inconvenience cause.
   I hope you find this information useful. Apologies in advance if I 
misunderstood your case. Please feel free to get in touch with me if you have 
any questions or need clarifications with anything.
   
   References :
   [1] TIMESTAMP limitations - 
http://teradata.github.io/presto/docs/127t/release/unsupported.html
   [2] from_unixtime() - 
https://prestodb.github.io/docs/current/functions/datetime.html
   
   We value your feedback. Please share your experience by rating this 
correspondence using the AWS Support Center link at the end of this 
correspondence. Each correspondence can also be rated by selecting the stars in 
top right corner of each correspondence within the AWS Support Center.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to