FYI - forking TFile off Hadoop into Zebra

2009-11-11 Thread Chao Wang
Hi all, In Jira PIG-1077, the Zebra team plans to use Hadoop TFile's support for splitting by record sequence number to provide record(row)-based input split support in Zebra. We would like to point out that, along the way, we also plan to resolve the dependency issue that Zebra record-based

[jira] Updated: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

2009-10-28 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1057: --- Attachment: patch_1057 > [Zebra] Zebra does not support concurrent deletions of column groups

[jira] Updated: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

2009-10-29 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1057: --- Attachment: patch_1057 > [Zebra] Zebra does not support concurrent deletions of column groups

[jira] Updated: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

2009-10-29 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1057: --- Attachment: (was: patch_1057) > [Zebra] Zebra does not support concurrent deletions of column groups

[jira] Updated: (PIG-1057) [Zebra] Zebra does not support concurrent deletions of column groups now.

2009-10-29 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1057: --- Status: Patch Available (was: Open) > [Zebra] Zebra does not support concurrent deletions of column gro

[jira] Commented: (PIG-993) [zebra] Abitlity to drop a column group in a table

2009-10-29 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771686#action_12771686 ] Chao Wang commented on PIG-993: --- Raghu's comment has been addressed in Jira 1057 :

[jira] Created: (PIG-1067) [Zebra] to support pig projection push down in Zebra

2009-10-30 Thread Chao Wang (JIRA)
Reporter: Chao Wang Assignee: Chao Wang Fix For: 0.6.0 Pig tries to determine which fields in a query script file will be needed and passes that information to the load function, thereby optimizing the query by reducing the data to be loaded. To support this
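
The projection push down idea described above can be illustrated with a minimal sketch. It is not Pig's or Zebra's actual interface: the class `ProjectingLoader` and its methods `push_projection` and `load` are hypothetical names invented for illustration. The sketch only shows the principle that the loader is told which fields the query needs and materializes nothing else.

```python
from typing import Dict, Iterable, Iterator, List, Optional

Row = Dict[str, object]


class ProjectingLoader:
    """Hypothetical loader (not a Pig or Zebra class) that honors a pushed-down projection."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self._rows: List[Row] = list(rows)
        # None means no projection was pushed down, so every field is loaded.
        self._projection: Optional[List[str]] = None

    def push_projection(self, fields: Iterable[str]) -> None:
        """Record which fields the query actually needs."""
        self._projection = list(fields)

    def load(self) -> Iterator[Row]:
        """Yield rows restricted to the requested fields."""
        for row in self._rows:
            if self._projection is None:
                yield dict(row)
            else:
                yield {name: row[name] for name in self._projection}


loader = ProjectingLoader(
    [{"a": 1, "b": "x", "c": 3.0}, {"a": 2, "b": "y", "c": 4.0}]
)
loader.push_projection(["a", "c"])  # the query only references fields a and c
projected = list(loader.load())
```

In a columnar store the saving would come from never reading the unneeded column from disk; the in-memory filtering here only stands in for that.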

[jira] Updated: (PIG-1026) [zebra] map split returns null

2009-11-02 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1026: --- Patch reviewed. +1 > [zebra] map split returns null > -- > >

[jira] Created: (PIG-1077) [Zebra] to support record(row)-based file split in Zebra' TableInputFormat

2009-11-09 Thread Chao Wang (JIRA)
Type: New Feature Affects Versions: 0.4.0 Reporter: Chao Wang Assignee: Chao Wang Fix For: 0.6.0 TFile currently supports split by record sequence number (see Jira HADOOP-6218). We want to utilize this to provide record(row)-based input split support in
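
As a rough illustration of what splitting by record sequence number means, the sketch below divides a file of `total_rows` records into half-open row ranges. The function name `row_splits` and the fixed `rows_per_split` parameter are assumptions made for this example; the snippet does not say how Zebra or TFile actually choose split boundaries.

```python
from typing import List, Tuple


def row_splits(total_rows: int, rows_per_split: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) row ranges that cover rows 0..total_rows.

    Hypothetical helper: each range could be handed to one map task, which
    would seek to record number `start` and read until `end`.
    """
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if rows_per_split <= 0:
        raise ValueError("rows_per_split must be positive")
    return [
        (start, min(start + rows_per_split, total_rows))
        for start in range(0, total_rows, rows_per_split)
    ]


splits = row_splits(10, 4)
```

Splitting on row numbers rather than byte offsets keeps every split aligned to record boundaries, which is the property the issue is after.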

[jira] Updated: (PIG-1077) [Zebra] to support record(row)-based file split in Zebra's TableInputFormat

2009-11-09 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1077: --- Summary: [Zebra] to support record(row)-based file split in Zebra's TableInputFormat (was: [Zebra] to su

[jira] Updated: (PIG-1077) [Zebra] to support record(row)-based file split in Zebra's TableInputFormat

2009-11-11 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1077: --- Release Note: In this jira, we plan to also resolve the dependency issue that Zebra record-based split needs

[jira] Created: (PIG-1087) Use Pig's version for Zebra's own version.

2009-11-11 Thread Chao Wang (JIRA)
Reporter: Chao Wang Assignee: Chao Wang Fix For: 0.6.0 Zebra is now a contrib project of Pig. It should use Pig's version for its own version.

[jira] Updated: (PIG-1087) Use Pig's version for Zebra's own version.

2009-11-11 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1087: --- Attachment: patch_Pig1087 > Use Pig's version for Zebra'

[jira] Updated: (PIG-1087) Use Pig's version for Zebra's own version.

2009-11-11 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1087: --- Status: Patch Available (was: Open) > Use Pig's version for Zebra'

[jira] Updated: (PIG-1077) [Zebra] to support record(row)-based file split in Zebra's TableInputFormat

2009-11-13 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1077: --- Attachment: patch_Pig1077 > [Zebra] to support record(row)-based file split in Zebra's TableInp

[jira] Updated: (PIG-1077) [Zebra] to support record(row)-based file split in Zebra's TableInputFormat

2009-11-13 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1077: --- Status: Patch Available (was: Open) > [Zebra] to support record(row)-based file split in Zebr

[jira] Updated: (PIG-1099) [zebra] version on APACHE trunk should be 0.7.0 to be in pace with PIG

2009-11-19 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1099: --- Patch reviewed +1 > [zebra] version on APACHE trunk should be 0.7.0 to be in pace with

[jira] Updated: (PIG-1091) [zebra] Exception when load with projection of map keys on a map column that is not map split

2009-11-19 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1091: --- Patch reviewed. +1 > [zebra] Exception when load with projection of map keys on a map column that > is n

[jira] Updated: (PIG-1078) [zebra] merge join with empty table failed

2009-11-23 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1078: --- Patch reviewed. +1 > [zebra] merge join with empty table fai

[jira] Created: (PIG-1104) [zebra] Provide streaming support in Zebra.

2009-11-23 Thread Chao Wang (JIRA)
Reporter: Chao Wang Assignee: Chao Wang Fix For: 0.6.0 Hadoop streaming is very popular among Hadoop users. The main attraction is its simplicity of use. A user can write the application logic in any language and process large amounts of data using the Hadoop framework. As more

[jira] Commented: (PIG-1098) [zebra] Zebra Performance Optimizations

2009-12-01 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784468#action_12784468 ] Chao Wang commented on PIG-1098: Ideally, should have a better structure for methods suc

[jira] Commented: (PIG-1122) [zebra] Zebra build.xml still uses 0.6 version

2009-12-02 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785127#action_12785127 ] Chao Wang commented on PIG-1122: +1 > [zebra] Zebra build.xml still uses 0.6

[jira] Commented: (PIG-1111) [Zebra] multiple outputs support

2009-12-03 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785419#action_12785419 ] Chao Wang commented on PIG-1111: Why do we need a build script change to run multiple out

[jira] Commented: (PIG-1111) [Zebra] multiple outputs support

2009-12-03 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785431#action_12785431 ] Chao Wang commented on PIG-1111: +1 > [Zebra] multiple outputs

[jira] Updated: (PIG-1104) [zebra] Provide streaming support in Zebra.

2009-12-03 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1104: --- Fix Version/s: 0.7.0 > [zebra] Provide streaming support in Ze

[jira] Updated: (PIG-1104) [zebra] Provide streaming support in Zebra.

2009-12-03 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1104: --- Attachment: PIG1104.patch > [zebra] Provide streaming support in Ze

[jira] Commented: (PIG-1104) [zebra] Provide streaming support in Zebra.

2009-12-03 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785571#action_12785571 ] Chao Wang commented on PIG-1104: Response to the above comment: point 1) The scenarios

[jira] Created: (PIG-1125) [zebra] Using typed APIs for Zebra's Map/Reduce interface

2009-12-03 Thread Chao Wang (JIRA)
Affects Versions: 0.4.0 Reporter: Chao Wang Assignee: Chao Wang Fix For: 0.6.0, 0.7.0 We plan to modify Zebra's M/R interface to use typed APIs, i.e., APIs taking object arguments instead of String arguments. Take TableInputFormat as an example: setSchema(jo
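
The difference between a String-argument API and a typed one can be sketched as follows. Nothing here reproduces Zebra's real signatures, which are truncated in the snippet above: `Schema`, `set_schema_from_string`, and `set_schema_typed` are hypothetical names, and a plain dictionary stands in for the job configuration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Schema:
    """Hypothetical typed schema object, validated when it is constructed."""

    columns: Tuple[str, ...]

    def __post_init__(self) -> None:
        if not self.columns or any(not c.strip() for c in self.columns):
            raise ValueError("schema needs at least one non-empty column name")


def set_schema_from_string(conf: Dict[str, str], schema: str) -> None:
    """String-style setter: accepts any text, so mistakes surface only later."""
    conf["schema"] = schema


def set_schema_typed(conf: Dict[str, str], schema: Schema) -> None:
    """Typed setter: the argument was already validated as a Schema object."""
    conf["schema"] = ",".join(schema.columns)


conf: Dict[str, str] = {}
set_schema_typed(conf, Schema(("f1", "f2")))
```

The design point is that a malformed schema fails at the call site, when the object is built, rather than deep inside a running job.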

[jira] Updated: (PIG-1104) [zebra] Provide streaming support in Zebra.

2009-12-04 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1104: --- Attachment: PIG-1104.patch > [zebra] Provide streaming support in Ze

[jira] Updated: (PIG-1104) [zebra] Provide streaming support in Zebra.

2009-12-04 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1104: --- Status: Patch Available (was: Open) > [zebra] Provide streaming support in Ze

[jira] Updated: (PIG-1104) [zebra] Provide streaming support in Zebra.

2009-12-04 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1104: --- Attachment: (was: PIG1104.patch) > [zebra] Provide streaming support in Ze

[jira] Updated: (PIG-1104) [zebra] Provide streaming support in Zebra.

2009-12-04 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1104: --- Status: Open (was: Patch Available) > [zebra] Provide streaming support in Ze

[jira] Updated: (PIG-1104) [zebra] Provide streaming support in Zebra.

2009-12-06 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1104: --- Status: Open (was: Patch Available) > [zebra] Provide streaming support in Ze

[jira] Updated: (PIG-1104) [zebra] Provide streaming support in Zebra.

2009-12-06 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1104: --- Status: Patch Available (was: Open) > [zebra] Provide streaming support in Ze

[jira] Commented: (PIG-1104) [zebra] Provide streaming support in Zebra.

2009-12-06 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786758#action_12786758 ] Chao Wang commented on PIG-1104: It seems Pig has some issue. I checked Pig's TestBui

[jira] Updated: (PIG-1125) [zebra] Using typed APIs for Zebra's Map/Reduce interface

2009-12-07 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1125: --- Attachment: PIG-1125.patch > [zebra] Using typed APIs for Zebra's Map/Reduce i

[jira] Updated: (PIG-1125) [zebra] Using typed APIs for Zebra's Map/Reduce interface

2009-12-08 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1125: --- Attachment: PIG-1125.patch > [zebra] Using typed APIs for Zebra's Map/Reduce i

[jira] Updated: (PIG-1125) [zebra] Using typed APIs for Zebra's Map/Reduce interface

2009-12-08 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1125: --- Status: Patch Available (was: Open) > [zebra] Using typed APIs for Zebra's Map/Reduce i

[jira] Resolved: (PIG-982) [zebra] Prevent checkin test cases from running twice in nightly test.

2009-12-09 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang resolved PIG-982. --- Resolution: Not A Problem > [zebra] Prevent checkin test cases from running twice in nightly t

[jira] Resolved: (PIG-985) [zebra] Make necessary changes to build scripts to accommodate new zebra features plus other improvement.

2009-12-09 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang resolved PIG-985. --- Resolution: Not A Problem > [zebra] Make necessary changes to build scripts to accommodate new ze

[jira] Commented: (PIG-1145) [zebra] merge join on large table ( 100,000.000 rows zebra table) failed

2009-12-09 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788496#action_12788496 ] Chao Wang commented on PIG-1145: Patch reviewed +1. > [zebra] merge join on larg

[jira] Commented: (PIG-1145) [zebra] merge join on large table ( 100,000.000 rows zebra table) failed

2009-12-11 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789425#action_12789425 ] Chao Wang commented on PIG-1145: The patch looks good +1. > [zebra] merge join o

[jira] Updated: (PIG-1136) [zebra] Map Split of Storage info do not allow for leading underscore char '_'

2009-12-17 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1136: --- Status: Open (was: Patch Available) > [zebra] Map Split of Storage info do not allow for leading undersc

[jira] Commented: (PIG-1153) [zebra] spliting columns at different levels in a complex record column into different column groups throws exception

2009-12-22 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793798#action_12793798 ] Chao Wang commented on PIG-1153: Patch reviewed. +1 > [zebra] spliting columns at di

[jira] Commented: (PIG-1167) [zebra] Zebra does not support Hadoop Globs

2010-01-04 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796259#action_12796259 ] Chao Wang commented on PIG-1167: Patch looks good +1. > [zebra] Zebra does not

[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

2010-01-19 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1140: --- Status: Open (was: Patch Available) > [zebra] Use of Hadoop 2.0 A

[jira] Updated: (PIG-1201) [zebra] HDFS meta queries are issued by all mappers; Pig Loader serialize all JobConf contents including those unused by zebra

2010-02-02 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1201: --- Patch looks good +1 > [zebra] HDFS meta queries are issued by all mappers; Pig Loader serialize all > J

[jira] Updated: (PIG-1227) [zebra] Missing column group meta file should not be allowed at query time

2010-02-08 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1227: --- Patch looks good +1. > [zebra] Missing column group meta file should not be allowed at query t

[jira] Created: (PIG-1253) [zebra] make map/reduce test cases run on real cluster

2010-02-23 Thread Chao Wang (JIRA)
Reporter: Chao Wang Assignee: Chao Wang Fix For: 0.7.0 The goal of this task is to make map/reduce test cases run on a real cluster. Currently, map/reduce test cases are mostly tested in local mode. When running on a real cluster, all involved jars have to be

[jira] Updated: (PIG-1198) [zebra] performance improvements

2010-02-25 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1198: --- Patch reviewed. Some feedback: 1) in the fillRowSplit() method, reader.close() should always be called at the end

[jira] Created: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-08 Thread Chao Wang (JIRA)
0.6.0 Reporter: Chao Wang Assignee: Chao Wang Fix For: 0.7.0 The goal of this task is to make Zebra's pig test cases run on a real cluster. Currently, Zebra's pig test cases are mostly tested using MiniCluster. We want to use a real Hadoop cluster to test

[jira] Updated: (PIG-1253) [zebra] make map/reduce test cases run on real cluster

2010-03-08 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1253: --- Attachment: PIG-1253.patch > [zebra] make map/reduce test cases run on real clus

[jira] Updated: (PIG-1253) [zebra] make map/reduce test cases run on real cluster

2010-03-08 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1253: --- Status: Patch Available (was: Open) > [zebra] make map/reduce test cases run on real clus

[jira] Updated: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-09 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1282: --- Attachment: PIG-1282.patch > [zebra] make Zebra's pig test cases run on real

[jira] Commented: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-09 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843342#action_12843342 ] Chao Wang commented on PIG-1282: Thanks to Jing for most of the migration work. >

[jira] Updated: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-09 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1282: --- Status: Patch Available (was: Open) > [zebra] make Zebra's pig test cases run on real

[jira] Updated: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-09 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1282: --- Status: Open (was: Patch Available) > [zebra] make Zebra's pig test cases run on real

[jira] Updated: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-10 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1282: --- Attachment: PIG-1282.patch > [zebra] make Zebra's pig test cases run on real

[jira] Updated: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-10 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1282: --- Status: Patch Available (was: Open) > [zebra] make Zebra's pig test cases run on real

[jira] Commented: (PIG-1268) [Zebra] Need an ant target that runs all pig-related tests in Zebra

2010-03-10 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843679#action_12843679 ] Chao Wang commented on PIG-1268: Patch looks good +1 > [Zebra] Need an ant target th

[jira] Commented: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-10 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843856#action_12843856 ] Chao Wang commented on PIG-1282: The test case failure is caused by some environmental i

[jira] Commented: (PIG-1269) [Zebra] Restrict schema definition for collection

2010-03-11 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844212#action_12844212 ] Chao Wang commented on PIG-1269: Patch looks good +1 > [Zebra] Restrict schema def

[jira] Updated: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-18 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1282: --- Attachment: PIG-1282.patch > [zebra] make Zebra's pig test cases run on real

[jira] Commented: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-18 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12847162#action_12847162 ] Chao Wang commented on PIG-1282: The newly attached patch addressed the concerns raise

[jira] Updated: (PIG-1253) [zebra] make map/reduce test cases run on real cluster

2010-03-19 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1253: --- Attachment: PIG-1253-0.6.patch Some of the test case clean up work also applies to 0.6 branch. > [ze

[jira] Updated: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-19 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1282: --- Status: Open (was: Patch Available) > [zebra] make Zebra's pig test cases run on real

[jira] Updated: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-19 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1282: --- Status: Patch Available (was: Open) > [zebra] make Zebra's pig test cases run on real

[jira] Updated: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-21 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1282: --- Attachment: PIG-1282.patch added hadoop license comment to BaseTestCase.java > [zebra] make Zebra's

[jira] Updated: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-21 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1282: --- Status: Open (was: Patch Available) > [zebra] make Zebra's pig test cases run on real

[jira] Updated: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-21 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1282: --- Status: Patch Available (was: Open) > [zebra] make Zebra's pig test cases run on real

[jira] Updated: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-21 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1282: --- Attachment: (was: PIG-1282.patch) > [zebra] make Zebra's pig test cases run on real

[jira] Updated: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-21 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1282: --- Attachment: (was: PIG-1282.patch) > [zebra] make Zebra's pig test cases run on real

[jira] Updated: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-21 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1282: --- Attachment: (was: PIG-1282.patch) > [zebra] make Zebra's pig test cases run on real

[jira] Created: (PIG-1337) Make job configuration object properly serialized to backend in Pig's LoadFunc

2010-03-29 Thread Chao Wang (JIRA)
: Pig Issue Type: Improvement Affects Versions: 0.6.0 Reporter: Chao Wang Fix For: 0.8.0 The Zebra storage layer needs to use the distributed cache to reduce name node load during job runs. To do this, Zebra needs to set up distributed cache related configuration informati

[jira] Updated: (PIG-1337) Need a way to pass distributed cache configuration information to hadoop backend in Pig's LoadFunc

2010-03-29 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1337: --- Summary: Need a way to pass distributed cache configuration information to hadoop backend in Pig's Loa

[jira] Commented: (PIG-1337) Need a way to pass distributed cache configuration information to hadoop backend in Pig's LoadFunc

2010-03-29 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851105#action_12851105 ] Chao Wang commented on PIG-1337: This may also relate to https://issues.apache.org/

[jira] Created: (PIG-1342) [Zebra] Avoid making unnecessary name node calls for writes in Zebra

2010-03-30 Thread Chao Wang (JIRA)
Affects Versions: 0.6.0, 0.7.0 Reporter: Chao Wang Assignee: Chao Wang Fix For: 0.8.0 Currently, table and column group level meta data is extracted from job configuration object and written onto HDFS disk within checkOutputSpec(). Later on, writers at

[jira] Commented: (PIG-1340) [zebra] The zebra version number should be changed from 0.7 to 0.8

2010-03-31 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852124#action_12852124 ] Chao Wang commented on PIG-1340: +1 > [zebra] The zebra version number should be

[jira] Updated: (PIG-1342) [Zebra] Avoid making unnecessary name node calls for writes in Zebra

2010-03-31 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1342: --- Attachment: PIG-1342.patch > [Zebra] Avoid making unnecessary name node calls for writes in Ze

[jira] Commented: (PIG-1337) Need a way to pass distributed cache configuration information to hadoop backend in Pig's LoadFunc

2010-04-01 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852445#action_12852445 ] Chao Wang commented on PIG-1337: It's ok for us not to use getSchema() for thi

[jira] Updated: (PIG-1342) [Zebra] Avoid making unnecessary name node calls for writes in Zebra

2010-04-02 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1342: --- Attachment: PIG-1342.patch > [Zebra] Avoid making unnecessary name node calls for writes in Ze

[jira] Updated: (PIG-1342) [Zebra] Avoid making unnecessary name node calls for writes in Zebra

2010-04-02 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1342: --- Attachment: (was: PIG-1342.patch) > [Zebra] Avoid making unnecessary name node calls for writes in Ze

[jira] Created: (PIG-1351) [Zebra] No type check when we write to the basic table

2010-04-02 Thread Chao Wang (JIRA)
, 0.7.0, 0.8.0 Reporter: Chao Wang Assignee: Chao Wang Fix For: 0.8.0 In Zebra, we do not have any type check when writing to a basic table. For example, given the schema "f1:int, f2:string", we can still write a tuple ("abc", 123) without a
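
The missing check can be pictured with a small sketch that validates a tuple against a schema string of the form quoted in the issue. The helpers `parse_schema` and `check_tuple` are hypothetical and support only the two types in that example; the patch's real compatibility rules are not visible in this snippet and are not reproduced here.

```python
from typing import List, Sequence, Tuple

# Only the two column types from the issue's example are supported in this sketch.
_TYPES = {"int": int, "string": str}


def parse_schema(spec: str) -> List[Tuple[str, type]]:
    """Parse a schema string such as "f1:int, f2:string" into (name, type) pairs."""
    fields = []
    for part in spec.split(","):
        name, type_name = part.strip().split(":")
        fields.append((name.strip(), _TYPES[type_name.strip()]))
    return fields


def check_tuple(schema: List[Tuple[str, type]], values: Sequence[object]) -> None:
    """Raise TypeError if the tuple does not match the schema column by column."""
    if len(values) != len(schema):
        raise TypeError(f"expected {len(schema)} columns, got {len(values)}")
    for (name, expected), value in zip(schema, values):
        # bool is a subclass of int in Python; this sketch has no boolean
        # column type, so boolean values are rejected outright.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise TypeError(
                f"column {name!r} expects {expected.__name__}, "
                f"got {type(value).__name__}"
            )


schema = parse_schema("f1:int, f2:string")
check_tuple(schema, (123, "abc"))  # matches the schema, so nothing is raised
```

Calling `check_tuple(schema, ("abc", 123))`, the tuple from the issue, raises `TypeError` on column `f1`.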

[jira] Updated: (PIG-1342) [Zebra] Avoid making unnecessary name node calls for writes in Zebra

2010-04-02 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1342: --- Attachment: PIG-1342.patch > [Zebra] Avoid making unnecessary name node calls for writes in Ze

[jira] Updated: (PIG-1342) [Zebra] Avoid making unnecessary name node calls for writes in Zebra

2010-04-02 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1342: --- Attachment: (was: PIG-1342.patch) > [Zebra] Avoid making unnecessary name node calls for writes in Ze

[jira] Updated: (PIG-1351) [Zebra] No type check when we write to the basic table

2010-04-06 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1351: --- Attachment: PIG-1351.patch > [Zebra] No type check when we write to the basic ta

[jira] Updated: (PIG-1351) [Zebra] No type check when we write to the basic table

2010-04-07 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1351: --- Attachment: (was: PIG-1351.patch) > [Zebra] No type check when we write to the basic ta

[jira] Updated: (PIG-1351) [Zebra] No type check when we write to the basic table

2010-04-07 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1351: --- Attachment: PIG-1351.patch > [Zebra] No type check when we write to the basic ta

[jira] Commented: (PIG-1351) [Zebra] No type check when we write to the basic table

2010-04-07 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854595#action_12854595 ] Chao Wang commented on PIG-1351: 1) We follow Java's type compatibility rule as fol

[jira] Commented: (PIG-1350) [Zebra] Zebra column names cannot have leading "_"

2010-04-07 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854623#action_12854623 ] Chao Wang commented on PIG-1350: Patch looks good +1 > [Zebra] Zebra column names

[jira] Commented: (PIG-1357) [zebra] Test cases of map-side GROUP-BY should be added.

2010-04-07 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854732#action_12854732 ] Chao Wang commented on PIG-1357: +1 > [zebra] Test cases of map-side GROUP-BY sh

[jira] Updated: (PIG-1351) [Zebra] No type check when we write to the basic table

2010-04-08 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1351: --- Attachment: (was: PIG-1351.patch) > [Zebra] No type check when we write to the basic ta

[jira] Updated: (PIG-1351) [Zebra] No type check when we write to the basic table

2010-04-08 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1351: --- Attachment: PIG-1351.patch > [Zebra] No type check when we write to the basic ta

[jira] Created: (PIG-1375) [Zebra] To support writing multiple Zebra tables through Pig

2010-04-13 Thread Chao Wang (JIRA)
Versions: 0.8.0 Reporter: Chao Wang Assignee: Chao Wang Fix For: 0.8.0 In Zebra, we already have multiple outputs support for map/reduce. But we do not support this feature if users use Zebra through Pig. This jira is to address this issue. We plan to support

[jira] Updated: (PIG-1375) [Zebra] To support writing multiple Zebra tables through Pig

2010-04-15 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1375: --- Attachment: PIG-1375.patch > [Zebra] To support writing multiple Zebra tables through

[jira] Updated: (PIG-1351) [Zebra] No type check when we write to the basic table

2010-04-15 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1351: --- Attachment: PIG-1351.patch > [Zebra] No type check when we write to the basic ta

[jira] Updated: (PIG-1351) [Zebra] No type check when we write to the basic table

2010-04-15 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1351: --- One small change: we only check the first non-null map value for map and the first record for collection
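
The sampling rule mentioned in this comment (inspect only the first non-null map value, and only the first record of a collection) might look something like the sketch below. It is an illustration in plain Java under stated assumptions, not the PIG-1351 patch: `SampledTypeCheck`, `mapValuesLookLike` and `collectionLooksLike` are hypothetical names, records are simplified to single values, and an empty or all-null container is assumed to pass.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SampledTypeCheck {
    // Checks only the first non-null value of the map against the expected class.
    static boolean mapValuesLookLike(Map<String, ?> map, Class<?> expected) {
        for (Object value : map.values()) {
            if (value != null) {
                return expected.isInstance(value);
            }
        }
        return true; // assumption: nothing to check when every value is null
    }

    // Checks only the first record of the collection against the expected class.
    static boolean collectionLooksLike(List<?> records, Class<?> expected) {
        if (records.isEmpty()) {
            return true; // assumption: an empty collection passes
        }
        Object first = records.get(0);
        return first == null || expected.isInstance(first);
    }

    public static void main(String[] args) {
        Map<String, Object> map = new LinkedHashMap<>();
        map.put("a", null);   // skipped: null
        map.put("b", "text"); // first non-null value, the only one inspected
        map.put("c", 42);     // never inspected

        System.out.println(mapValuesLookLike(map, String.class));
        System.out.println(collectionLooksLike(Arrays.asList(1, "oops"), Integer.class));
    }
}
```

The trade-off is cost against coverage: each write inspects at most one element per container, so a mismatched value later in the map or collection goes unnoticed.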

[jira] Updated: (PIG-1351) [Zebra] No type check when we write to the basic table

2010-04-15 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1351: --- Status: Patch Available (was: Open) Affects Version/s: (was: 0.6.0

[jira] Updated: (PIG-1351) [Zebra] No type check when we write to the basic table

2010-04-15 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1351: --- Status: Open (was: Patch Available) > [Zebra] No type check when we write to the basic ta

[jira] Updated: (PIG-1351) [Zebra] No type check when we write to the basic table

2010-04-15 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1351: --- Attachment: (was: PIG-1351.patch) > [Zebra] No type check when we write to the basic ta
