[jira] Created: (PIG-1015) [piggybank] DateExtractor should take into account timezones

2009-10-11 Thread Dmitriy V. Ryaboy (JIRA)
[piggybank] DateExtractor should take into account timezones


 Key: PIG-1015
 URL: https://issues.apache.org/jira/browse/PIG-1015
 Project: Pig
  Issue Type: Bug
Reporter: Dmitriy V. Ryaboy


The current implementation defaults to the local timezone when parsing strings, 
thereby providing inconsistent results depending on the settings of the 
computer the program is executing on (this is causing unit test failures). We 
should set the timezone to a consistent default, and allow users to override 
this default.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1015) [piggybank] DateExtractor should take into account timezones

2009-10-11 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy updated PIG-1015:
---

Attachment: date_extractor.patch

Note that this changes the contract slightly: DateExtractor now extracts 
dates in GMT by default, whereas before it extracted them in the system's 
local time. 
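
To illustrate the behavior change described above, here is a minimal sketch 
in plain Java. It is not the code in date_extractor.patch and not the 
piggybank DateExtractor API: the class name, method names, and the 
Apache-log-style input pattern are assumptions made for this example. It 
only shows why pinning the time zone (GMT by default, overridable by the 
caller) gives the same answer on every machine.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper for illustration; not the piggybank DateExtractor.
final class TimeZoneParsingSketch {
    // Assumed input/output patterns (Apache access-log style timestamp in, ISO date out).
    private static final String IN_PATTERN = "dd/MMM/yyyy:HH:mm:ss Z";
    private static final String OUT_PATTERN = "yyyy-MM-dd";

    // Default: extract the date in GMT, independent of the machine's local zone.
    static String extractDate(String input) {
        return extractDate(input, TimeZone.getTimeZone("GMT"));
    }

    // Override: the caller chooses the zone the date is expressed in.
    static String extractDate(String input, TimeZone zone) {
        // Locale.ENGLISH keeps month names such as "Sep" parseable everywhere.
        SimpleDateFormat in = new SimpleDateFormat(IN_PATTERN, Locale.ENGLISH);
        SimpleDateFormat out = new SimpleDateFormat(OUT_PATTERN, Locale.ENGLISH);
        in.setTimeZone(zone);
        out.setTimeZone(zone);
        try {
            Date instant = in.parse(input);
            return out.format(instant);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable timestamp: " + input, e);
        }
    }
}
```

With this sketch, the timestamp "20/Sep/2008:23:53:04 -0600" falls on 
2008-09-21 in GMT but on 2008-09-20 in America/Denver, which is the kind of 
machine-dependent difference the issue describes.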

 [piggybank] DateExtractor should take into account timezones
 

 Key: PIG-1015
 URL: https://issues.apache.org/jira/browse/PIG-1015
 Project: Pig
  Issue Type: Bug
Reporter: Dmitriy V. Ryaboy
 Attachments: date_extractor.patch






[jira] Updated: (PIG-1015) [piggybank] DateExtractor should take into account timezones

2009-10-11 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy updated PIG-1015:
---

Fix Version/s: 0.6.0
   Status: Patch Available  (was: Open)

 [piggybank] DateExtractor should take into account timezones
 

 Key: PIG-1015
 URL: https://issues.apache.org/jira/browse/PIG-1015
 Project: Pig
  Issue Type: Bug
Reporter: Dmitriy V. Ryaboy
 Fix For: 0.6.0

 Attachments: date_extractor.patch






[jira] Commented: (PIG-868) indexof / lastindexof / lower / replace / substring udf's

2009-10-11 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764533#action_12764533
 ] 

Dmitriy V. Ryaboy commented on PIG-868:
---

The DateExtractor issue is addressed by PIG-1015; just changing the test case 
is not sufficient, as the test case will still break in some parts of the 
world because it relies on local settings.

 indexof / lastindexof / lower / replace / substring udf's
 -

 Key: PIG-868
 URL: https://issues.apache.org/jira/browse/PIG-868
 Project: Pig
  Issue Type: New Feature
Reporter: Bennie Schut
Priority: Trivial
 Attachments: addSomeUDFsPatch.patch, dateExtractorPatch.patch


 We parse some Apache logs using Pig and use some pretty simple UDFs 
 like this:
 B = FOREACH A GENERATE substring(uri, lastindexof(uri, '/')+1, indexof(uri, '.txt')) as lang;
 It's pretty simple stuff but I figured someone else might find it useful.
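
 For readers unfamiliar with these UDFs, the expression above can be sketched 
 in plain Java. This assumes the UDFs follow java.lang.String semantics 
 (0-based indices, exclusive end index), which the issue text does not state; 
 the class and method names are hypothetical and this is not the code in the 
 attached patch.

```java
// Hypothetical illustration of substring(uri, lastindexof(uri, '/')+1, indexof(uri, '.txt')).
final class UriLangSketch {
    // Returns the text between the last '/' and the first ".txt", or null if there is none.
    static String lang(String uri) {
        int start = uri.lastIndexOf('/') + 1; // first character after the last slash
        int end = uri.indexOf(".txt");        // -1 when ".txt" is absent
        if (end < start) {
            return null;                      // no ".txt" after the last slash
        }
        return uri.substring(start, end);
    }
}
```

 Under those assumptions, a URI such as "/docs/en.txt" yields "en".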




[jira] Updated: (PIG-986) [zebra] Zebra Column Group Naming Support

2009-10-11 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated PIG-986:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I just committed this. Thanks Yan.

 [zebra] Zebra Column Group Naming Support
 -

 Key: PIG-986
 URL: https://issues.apache.org/jira/browse/PIG-986
 Project: Pig
  Issue Type: New Feature
  Components: impl
Affects Versions: 0.4.0
Reporter: Chao Wang
Assignee: Chao Wang
 Fix For: 0.6.0

 Attachments: ColumnGroupName.patch, ColumnGroupName.patch, 
 ColumnGroupName.patch


 We introduce column group name to Zebra and make it a first-class citizen in 
 Zebra. This can ease management of column groups.
 We plan to introduce an as clause for column group name in Zebra's syntax.
 Functional Specifications:
 1) Column group names are optional. For column groups that do not have a 
 user-provided name, Zebra will internally assign default column group names 
 that are unique for that table - CG0, CG1, CG2 ... Note: If CGx is used by 
 the user, then it cannot be used as an internal name.
 2) We introduce an AS clause in Zebra's syntax for column group names. If 
 it occurs, it has to immediately follow [ ]. For example, [a1, a2] as PI 
 secure by user:joe group:secure perm:640; [a3, a4] as General compress by 
 lzo. Note that keyword AS is case insensitive.
 3) Column group names are unique within one table and are case sensitive, 
 i.e., c1 and C1 are different.
 4) Column group names will be used as the physical column group directory 
 path names.
 5) Zebra V2 will support dropColumnGroup by column group names (will 
 integrate with Raghu's A29 drop column work).
 6) Zebra V2 can support backward compatibility (if there are Zebra V1-created 
 tables in production when V2 is released). More specifically, this means that 
 Zebra V2 can load from V1-created tables and do dropColumnGroup on them.
 7) Does NOT support renaming.
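
 Rules 1) and 3) above can be sketched as follows. This is an illustration 
 only, not Zebra's implementation: the class and method names are 
 hypothetical, and skipping to the next free CGn when the user has already 
 taken a default-style name is an assumption about how the conflict is 
 resolved.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of default column group naming; not Zebra code.
final class ColumnGroupNamingSketch {
    // userNames holds one entry per column group; null means "no user-provided name".
    static List<String> assignNames(List<String> userNames) {
        // Names are case sensitive, so a plain HashSet keeps "c1" and "C1" distinct.
        Set<String> taken = new HashSet<>();
        for (String name : userNames) {
            if (name != null && !taken.add(name)) {
                throw new IllegalArgumentException("Duplicate column group name: " + name);
            }
        }
        List<String> result = new ArrayList<>();
        int next = 0;
        for (String name : userNames) {
            if (name != null) {
                result.add(name);
                continue;
            }
            // Assumption: skip any CGn already used by the user and take the next free one.
            String candidate;
            do {
                candidate = "CG" + next++;
            } while (!taken.add(candidate));
            result.add(candidate);
        }
        return result;
    }
}
```

 For example, with user names PI, (none), CG0, (none), the sketch assigns 
 PI, CG1, CG0, CG2, because CG0 is already used by the user.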




[jira] Commented: (PIG-993) [zebra] Ability to drop a column group in a table

2009-10-11 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764552#action_12764552
 ] 

Raghu Angadi commented on PIG-993:
--

This patch depends on PIG-992. It is not a functional dependency and can be 
removed if required.

 [zebra] Ability to drop a column group in a table
 --

 Key: PIG-993
 URL: https://issues.apache.org/jira/browse/PIG-993
 Project: Pig
  Issue Type: Bug
Reporter: Raghu Angadi
Assignee: Raghu Angadi
 Fix For: 0.6.0

 Attachments: DropColumnGroupExample.java, zebra-drop-cg.patch, 
 zebra-drop-cg.patch


 A Zebra table is stored as multiple sub-tables, each containing a set of 
 columns called a column group (CG). The user specifies how these columns are 
 grouped when creating a table, through the _storage hint_.
 For some large tables, users may need to remove a set of columns and retain 
 the rest. This issue provides a way for users to delete an entire column 
 group. 
 The following comments will have more details on the API and the semantics. 
