Hi Kabeer,

I spent some time looking at the issue and the other issues linked to it.
At a high level, it seems we need to change the data type mappings for these date/timestamp types. It does seem doable, given that Avro also supports date/timestamp types. Do you have a sample schema or data generator that we can start with?

Thanks
Vinoth

On Fri, Mar 15, 2019 at 11:19 AM Vinoth Chandar <[email protected]> wrote:

> Hi Kabeer,
>
> Thanks for bringing this up. I don't think we have actually hit this
> before :)
>
> Let me spend some time understanding the issue and get back to you.
>
> Thanks
> Vinoth
>
> On Thu, Mar 14, 2019 at 10:46 PM Kabeer Ahmed <[email protected]>
> wrote:
>
>> Hi,
>>
>> https://github.com/apache/incubator-hudi/issues/547
>> has resulted in the jira https://issues.apache.org/jira/browse/HUDI-12.
>> The requirement is to be able to interpret a timestamp from CSV and store
>> it in the parquet table. Does anyone have a working example along these lines?
>> Going by the Hudi example on GitHub, the timestamp is being encoded in
>> Avro as a double:
>> https://github.com/apache/incubator-hudi/blob/master/hoodie-client/src/test/java/com/uber/hoodie/common/HoodieTestDataGenerator.java#L69
>>
>> The end result is that the parquet field for the timestamp is not of
>> timestamp type (INT96).
>>
>> My best guess is that this would have been a requirement at Uber
>> (tracking trips in minutes and seconds), so I am wondering how it is
>> being handled there.
>>
>> If anyone else has handled this and has an example that can be shared, it
>> will be much appreciated.
>> Kabeer Ahmed, http://www.linkedin.com/in/kabeerahmed
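To make the "change the data type mappings" idea concrete, here is a minimal sketch of what a starting schema and CSV conversion might look like. It only uses the `date` and `timestamp-millis` logical types from the Avro specification; the record name, field names, CSV formats, and the assumption that CSV timestamps are in UTC are all hypothetical and chosen purely for illustration. It does not touch Hudi itself, and I have not checked how the Hudi write path or the Parquet writer treats these annotations.

```python
import json
from datetime import date, datetime, timedelta, timezone

# Hypothetical Avro schema: record and field names are illustrative only.
# "date" (int, days since epoch) and "timestamp-millis" (long, ms since epoch)
# are logical types defined by the Avro specification.
TRIP_SCHEMA = {
    "type": "record",
    "name": "TripRecord",
    "fields": [
        {"name": "trip_id", "type": "string"},
        {"name": "trip_date", "type": {"type": "int", "logicalType": "date"}},
        {"name": "pickup_ts",
         "type": {"type": "long", "logicalType": "timestamp-millis"}},
    ],
}

EPOCH_DATE = date(1970, 1, 1)
EPOCH_TS = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_date_days(text, fmt="%Y-%m-%d"):
    """Convert a CSV date string into days since the Unix epoch."""
    return (datetime.strptime(text, fmt).date() - EPOCH_DATE).days


def to_timestamp_millis(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Convert a CSV timestamp string (assumed to be UTC) into
    milliseconds since the Unix epoch."""
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    return (parsed - EPOCH_TS) // timedelta(milliseconds=1)


def csv_row_to_record(row):
    """Map one CSV row (a dict of strings) onto the schema's field types."""
    return {
        "trip_id": row["trip_id"],
        "trip_date": to_date_days(row["trip_date"]),
        "pickup_ts": to_timestamp_millis(row["pickup_ts"]),
    }


if __name__ == "__main__":
    print(json.dumps(TRIP_SCHEMA, indent=2))
    print(csv_row_to_record({
        "trip_id": "t-1",
        "trip_date": "2019-03-15",
        "pickup_ts": "2019-03-15 11:19:00",
    }))
```

The point of the sketch is only that the values are stored as `int`/`long` with a logical type annotation rather than as a `double`, which is what would give a downstream writer the chance to pick a date/timestamp column type.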
