kou commented on issue #13396:
URL: https://github.com/apache/arrow/issues/13396#issuecomment-1162676527

   Apache Parquet doesn't support "seconds" as a timestamp unit (only 
milliseconds, microseconds and nanoseconds are available): 
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#timestamp
   So a "seconds" timestamp type is converted to a "milliseconds" timestamp 
type when we write Apache Arrow data as Apache Parquet.
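
   To illustrate the unit change with plain integers: a "seconds" value 
becomes a "milliseconds" value by multiplying by 1000, and dividing by 1000 
recovers the original value exactly. This is only a sketch of the arithmetic. 
It doesn't call any Apache Arrow or Apache Parquet API, and the method names 
are hypothetical:

   ```ruby
   # Plain-Ruby illustration only; no Arrow or Parquet calls are used.
   # The method names below are hypothetical helpers for this sketch.
   MILLIS_PER_SECOND = 1000

   # Epoch seconds -> epoch milliseconds (the unit used after writing).
   def seconds_to_millis(seconds)
     seconds * MILLIS_PER_SECOND
   end

   # Epoch milliseconds -> epoch seconds (the unit we want back).
   def millis_to_seconds(millis)
     millis.div(MILLIS_PER_SECOND)
   end

   original = Time.utc(2022, 6, 22, 1, 2, 3).to_i
   stored = seconds_to_millis(original)
   restored = millis_to_seconds(stored)

   # The round trip doesn't lose anything for whole-second values.
   puts restored == original
   ```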
   
   We can cast the columns back to the "seconds" timestamp type with the following:
   
   ```ruby
   # Cast each chunk of the two timestamp columns back to the :second unit,
   # then merge the rebuilt columns into the table.
   second_type = Arrow::TimestampDataType.new(:second)
   s3_existing_table.merge(
     "History Completed Time" =>
       Arrow::ChunkedArray.new(
         s3_existing_table["History Completed Time"].data.chunks.collect {|chunk|
           chunk.cast(second_type)
         }
       ),
     "History Created Time" =>
       Arrow::ChunkedArray.new(
         s3_existing_table["History Created Time"].data.chunks.collect {|chunk|
           chunk.cast(second_type)
         }
       )
   )
   ```
   
   But this is too inconvenient... I'll add convenient APIs so that we can 
write something like the following:
   
   ```ruby
   # Proposed API (not available yet): cast a whole column by unit.
   s3_existing_table.merge(
     "History Completed Time" =>
       s3_existing_table["History Completed Time"].cast(unit: :second),
     "History Created Time" =>
       s3_existing_table["History Created Time"].cast(unit: :second),
   )
   ```
   

