[ https://issues.apache.org/jira/browse/PIG-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401884#comment-13401884 ]
Zhijie Shen commented on PIG-1314:
----------------------------------

Hi Thejas and Russell,

I'll do serialization for the timezone as well.

{quote}
I think converting the string timezone (location name) to UTC offset in minutes, is one possibility.
{quote}
In my opinion, this kind of compression is lossy. Several time zones may share the same UTC offset, so when the reverse operation is performed, there is no way to know which time zone the UTC offset should be converted back to.

{quote}
We need an efficient way to serialize timezone along with the long. Can you propose something ? (Maybe, just make it efficient for 256 most 'popular' timezones and store it a byte. And not have the byte for UTC. For other timezones, add a timezone string ?)
{quote}
The time zone classes in both the builtin library and Joda have the function "getAvailableIDs", which returns all the available time zone strings. On my machine, I got 616 from the builtin time zone class and 558 from the Joda one. We could probably define a one-to-one mapping between the time zone strings and integer ids stored in short variables. However, the "available" in "getAvailableIDs" sounds tricky: I'm not sure whether it returns the same time zone list on all machines or is machine-dependent.

> Add DateTime Support to Pig
> ---------------------------
>
>                 Key: PIG-1314
>                 URL: https://issues.apache.org/jira/browse/PIG-1314
>             Project: Pig
>          Issue Type: Bug
>          Components: data
>    Affects Versions: 0.7.0
>            Reporter: Russell Jurney
>            Assignee: Zhijie Shen
>              Labels: gsoc2012
>         Attachments: PIG-1314-1.patch, PIG-1314-2.patch, joda_vs_builtin.zip
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Hadoop/Pig are primarily used to parse log data, and most logs have a
> timestamp component. Therefore Pig should support dates as a primitive.
> Can someone familiar with adding types to pig comment on how hard this is?
> We're looking at doing this, rather than use UDFs. Is this a patch that
> would be accepted?
> This is a candidate project for Google summer of code 2012. More information
> about the program can be found at
> https://cwiki.apache.org/confluence/display/PIG/GSoc2012
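
As a rough illustration of the two ideas raised in the comment, the sketch below uses only the builtin java.util.TimeZone class. It counts how many time zone IDs share each raw UTC offset (which is why the offset-in-minutes encoding cannot be reversed), and it builds a one-to-one mapping between ID strings and short codes. The class name TimeZoneIdCodec and its methods are hypothetical and are not taken from any PIG-1314 patch. The sketch also assumes that the writer and the reader construct the codec from the same ID list, which is exactly the open question about whether "getAvailableIDs" is machine-dependent.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TimeZone;
import java.util.TreeMap;

// Hypothetical helper, not part of Pig: maps time zone ID strings to short codes.
public class TimeZoneIdCodec {
    // Sorted copy of the ID list; the array index is the short code.
    private final String[] idsByCode;
    private final Map<String, Short> codeById = new HashMap<>();

    public TimeZoneIdCodec(String[] ids) {
        idsByCode = ids.clone();
        Arrays.sort(idsByCode); // sorting makes the codes independent of input order
        if (idsByCode.length > Short.MAX_VALUE) {
            throw new IllegalArgumentException("too many IDs for a short code");
        }
        for (int i = 0; i < idsByCode.length; i++) {
            codeById.put(idsByCode[i], (short) i);
        }
    }

    // Returns the short code for a known ID; unknown IDs are rejected.
    public short encode(String id) {
        Short code = codeById.get(id);
        if (code == null) {
            throw new IllegalArgumentException("unknown time zone ID: " + id);
        }
        return code;
    }

    // Returns the ID string for a code produced by encode().
    public String decode(short code) {
        if (code < 0 || code >= idsByCode.length) {
            throw new IllegalArgumentException("unknown time zone code: " + code);
        }
        return idsByCode[code];
    }

    // Counts how many IDs share each raw UTC offset, in minutes.
    // Any count above 1 means the offset alone cannot identify the zone.
    public static Map<Integer, Integer> idsPerOffsetMinutes(String[] ids) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (String id : ids) {
            int minutes = TimeZone.getTimeZone(id).getRawOffset() / 60000;
            counts.merge(minutes, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] ids = TimeZone.getAvailableIDs();
        TimeZoneIdCodec codec = new TimeZoneIdCodec(ids);
        short code = codec.encode(ids[0]);
        System.out.println(ids.length + " IDs available on this JVM");
        System.out.println(ids[0] + " -> " + code + " -> " + codec.decode(code));
        System.out.println("distinct raw offsets: " + idsPerOffsetMinutes(ids).size());
    }
}
```

Because the codes are positions in a sorted list, they change whenever the list changes. A real implementation would therefore need a fixed, versioned ID table rather than whatever the local JVM reports.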