[ https://issues.apache.org/jira/browse/PIG-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288680#comment-13288680 ]
Thejas M Nair commented on PIG-1314:
------------------------------------

bq. When adding the DateTime type for Pig, we need to take care of the I/O with AVRO, which still doesn't support the Date/Time type.

StoreFuncs that write in Avro format will need to throw an exception if the schema being stored contains a datetime type. That will force users to serialize datetime as some other type. As long as we are not breaking existing Pig queries that don't use the datetime type, we should be fine. Avro is just one of many formats.

Regarding AugmentBaseDataVisitor, that is used for example generation (see the [SIGMOD paper on the illustrate feature | http://infolab.stanford.edu/~olston/publications/sigmod09.pdf] for details). For example, if no value of col1 in the sample satisfies "col1 > 0", a value with col1 > 0 is generated. This will be useful for the datetime type as well. To generate a more realistic value (similar to the values in the input), I think we should increment/decrement the smallest field that is non-zero. For example, if the millisecond and second fields are 0 but the hour field is non-zero, increment that. If all time parts are 0 but the day of month is not, increment that.
In the case of boolean, as we don't support > or < operations, these functions do not make sense.

Thanks for bringing this up. I had forgotten about this use case. We should add a few unit tests for example generation that involve datetime.

> Add DateTime Support to Pig
> ---------------------------
>
>                 Key: PIG-1314
>                 URL: https://issues.apache.org/jira/browse/PIG-1314
>             Project: Pig
>          Issue Type: Bug
>          Components: data
>    Affects Versions: 0.7.0
>            Reporter: Russell Jurney
>            Assignee: Zhijie Shen
>              Labels: gsoc2012
>         Attachments: joda_vs_builtin.zip
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Hadoop/Pig are primarily used to parse log data, and most logs have a
> timestamp component. Therefore Pig should support dates as a primitive.
> Can someone familiar with adding types to Pig comment on how hard this is?
> We're looking at doing this, rather than use UDFs. Is this a patch that
> would be accepted?
> This is a candidate project for Google summer of code 2012. More information
> about the program can be found at
> https://cwiki.apache.org/confluence/display/PIG/GSoc2012
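
As a rough illustration of the "increment/decrement the smallest non-zero field" idea from the comment on example generation, here is a minimal Java sketch. It is not taken from Pig: it uses the JDK's java.time types rather than whatever DateTime representation Pig adopts, and the class and method names (DateTimeNudge, nudge) are hypothetical. It also assumes that when every time-of-day field is zero, the day is the field to step, since day of month is never zero.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of Pig or AugmentBaseDataVisitor.
class DateTimeNudge {

    /**
     * Returns a value slightly above (increment = true) or below
     * (increment = false) the input, stepping by one unit of the
     * smallest time field that is non-zero in the input.
     */
    static LocalDateTime nudge(LocalDateTime value, boolean increment) {
        long step = increment ? 1 : -1;
        ChronoUnit unit;
        if (value.getNano() / 1_000_000 != 0) {
            unit = ChronoUnit.MILLIS;   // millisecond part is set
        } else if (value.getSecond() != 0) {
            unit = ChronoUnit.SECONDS;
        } else if (value.getMinute() != 0) {
            unit = ChronoUnit.MINUTES;
        } else if (value.getHour() != 0) {
            unit = ChronoUnit.HOURS;
        } else {
            // Assumption: all time parts are zero, so step the day.
            unit = ChronoUnit.DAYS;
        }
        return value.plus(step, unit);
    }

    public static void main(String[] args) {
        // Hour is the smallest non-zero field here, so the hour is stepped.
        LocalDateTime sample = LocalDateTime.of(2012, 6, 4, 13, 0, 0);
        System.out.println(nudge(sample, true));
    }
}
```

Stepping by the granularity already present in the sample keeps generated values looking like the input: hourly data yields another whole hour, and date-only data yields another midnight.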