[ 
https://issues.apache.org/jira/browse/PIG-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401884#comment-13401884
 ] 

Zhijie Shen commented on PIG-1314:
----------------------------------

Hi Thejas and Russell,

I'll handle serialization of the timezone as well.

{quote}
I think converting the string timezone (location name) to UTC offset in 
minutes, is one possibility.
{quote}

In my opinion, this kind of compression is lossy. Several time zones may share 
the same UTC offset, so when the conversion is reversed, there is no way to 
know which time zone the UTC offset should map back to.
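
To illustrate the collision, here is a small sketch using the builtin 
java.util.TimeZone class. The class name OffsetCollision and the helper 
idsWithRawOffset are made up for this comment, and the exact list of IDs 
depends on the time zone data shipped with the JVM.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TimeZone;

// Hypothetical helper class, for illustration only.
class OffsetCollision {

    // Returns every zone ID known to this JVM whose raw (standard time)
    // UTC offset equals the given number of minutes.
    static List<String> idsWithRawOffset(int offsetMinutes) {
        int offsetMillis = offsetMinutes * 60 * 1000;
        return Arrays.asList(TimeZone.getAvailableIDs(offsetMillis));
    }

    public static void main(String[] args) {
        // UTC+01:00 is shared by many locations, so an offset of 60 minutes
        // cannot be mapped back to a single zone ID.
        List<String> ids = idsWithRawOffset(60);
        System.out.println(ids.size() + " zone IDs share a raw offset of +60 minutes");
        System.out.println("Contains Europe/Paris:  " + ids.contains("Europe/Paris"));
        System.out.println("Contains Europe/Berlin: " + ids.contains("Europe/Berlin"));
    }
}
```

Zones that share a raw offset can still differ in their daylight saving 
rules, so the offset alone does not preserve the zone's behavior either.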

{quote}
We need an efficient way to serialize timezone along with the long. Can you 
propose something ? (Maybe, just make it efficient for 256 most 'popular' 
timezones and store it a byte. And not have the byte for UTC. For other 
timezones, add a timezone string ?)
{quote}

Both the builtin and the joda time zone classes have the function 
"getAvailableIDs", which returns all the available time zone strings. On my 
machine, I got 616 IDs from the builtin time zone class and 558 from the joda 
one. We could probably define a one-to-one mapping between the time zone 
strings and integer ids stored in short variables. However, the "available" in 
"getAvailableIDs" sounds tricky: I'm not sure whether it returns the same time 
zone list on all machines or whether the list is machine-dependent.
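
A minimal sketch of the string-to-short mapping, assuming the table is fixed 
in the code rather than taken from "getAvailableIDs" at runtime, so that every 
machine agrees on the ids. The class ZoneIdCodec, its four-entry table, and 
the -1 marker for unlisted zones are all hypothetical and only show the idea; 
this is not the format used in any patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical codec, for illustration only.
class ZoneIdCodec {

    // Assumed fixed table shipped with the code. A real table would be much
    // longer and could only ever be appended to, never reordered.
    static final List<String> TABLE = Arrays.asList(
            "UTC", "America/Los_Angeles", "Europe/Paris", "Asia/Shanghai");

    // Marker meaning "the zone string follows in full".
    static final short UNLISTED = -1;

    static byte[] encode(String zoneId) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            int index = TABLE.indexOf(zoneId);
            if (index >= 0) {
                out.writeShort(index);      // listed zone: two bytes
            } else {
                out.writeShort(UNLISTED);   // unlisted zone: marker + string
                out.writeUTF(zoneId);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static String decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            short code = in.readShort();
            return code == UNLISTED ? in.readUTF() : TABLE.get(code);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String id : new String[] {"Europe/Paris", "Antarctica/Troll"}) {
            byte[] data = encode(id);
            System.out.println(id + " -> " + data.length + " bytes -> " + decode(data));
        }
    }
}
```

The fallback to the full string keeps the encoding lossless for zones that are 
not in the table, at the cost of a longer record for those zones.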

                
> Add DateTime Support to Pig
> ---------------------------
>
>                 Key: PIG-1314
>                 URL: https://issues.apache.org/jira/browse/PIG-1314
>             Project: Pig
>          Issue Type: Bug
>          Components: data
>    Affects Versions: 0.7.0
>            Reporter: Russell Jurney
>            Assignee: Zhijie Shen
>              Labels: gsoc2012
>         Attachments: PIG-1314-1.patch, PIG-1314-2.patch, joda_vs_builtin.zip
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Hadoop/Pig are primarily used to parse log data, and most logs have a 
> timestamp component.  Therefore Pig should support dates as a primitive.
> Can someone familiar with adding types to pig comment on how hard this is?  
> We're looking at doing this, rather than use UDFs.  Is this a patch that 
> would be accepted?
> This is a candidate project for Google summer of code 2012. More information 
> about the program can be found at 
> https://cwiki.apache.org/confluence/display/PIG/GSoc2012

