Hi Kabeer,

Thanks for bringing this up. I don't think we have actually hit this before :)
Let me spend some time understanding the issue and get back to you.

Thanks
Vinoth

On Thu, Mar 14, 2019 at 10:46 PM Kabeer Ahmed <[email protected]> wrote:

> Hi,
>
> https://github.com/apache/incubator-hudi/issues/547 has resulted in the
> jira https://issues.apache.org/jira/browse/HUDI-12.
> The requirement is to be able to interpret timestamp from CSV and store it
> in the parquet table. Does anyone have a working example on these lines?
> Going by the Hudi example from the GitHub:
> Timestamp is being encoded in avro as double:
> https://github.com/apache/incubator-hudi/blob/master/hoodie-client/src/test/java/com/uber/hoodie/common/HoodieTestDataGenerator.java#L69
>
> The end result is that parquet field for timestamp is not of timestamp
> (INT96).
>
> My best guess is that this would have been a requirement at Uber (tracking
> trips in minutes and seconds) and how is it being handled.
>
> If anyone else has handled this and has an example that can be shared, it
> will be much appreciated.
> Kabeer Ahmed, http://www.linkedin.com/in/kabeerahmed
