Hi Kabeer,

Thanks for bringing this up. I don't think we have actually hit this before
:)

Let me spend some time understanding the issue and get back to you.

Thanks
Vinoth

On Thu, Mar 14, 2019 at 10:46 PM Kabeer Ahmed <[email protected]> wrote:

> Hi,
>
> https://github.com/apache/incubator-hudi/issues/547
> has resulted in the jira https://issues.apache.org/jira/browse/HUDI-12.
> The requirement is to be able to interpret a timestamp from CSV and store it
> in the Parquet table. Does anyone have a working example along these lines?
> Going by the Hudi example on GitHub, the timestamp is being encoded in Avro
> as a double:
> https://github.com/apache/incubator-hudi/blob/master/hoodie-client/src/test/java/com/uber/hoodie/common/HoodieTestDataGenerator.java#L69
>
> The end result is that the Parquet field for the timestamp is not of the
> timestamp type (INT96).
>
> My best guess is that this would have been a requirement at Uber (tracking
> trips in minutes and seconds), so how is it being handled there?
>
> If anyone else has handled this and has an example that can be shared, it
> will be much appreciated.
> Kabeer Ahmed, http://www.linkedin.com/in/kabeerahmed
>
>
