[jira] [Updated] (PIG-1842) Improve Scalability of the XMLLoader for large datasets such as wikipedia

2011-03-30 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1842: Fix Version/s: (was: 0.7.0) Improve Scalability of the XMLLoader for large datasets such as wikipedia

[jira] [Updated] (PIG-1932) GFCross should allow the user to set the DEFAULT_PARALLELISM value

2011-03-30 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1932: Resolution: Fixed Status: Resolved (was: Patch Available) Patch 2 checked in. GFCross should

[jira] [Commented] (PIG-1899) Pig needs a tool for doing end to end testing efficiently

2011-03-30 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13013692#comment-13013692 ] Alan Gates commented on PIG-1899: - bq. (1) There are several scripts that are placed

[jira] [Updated] (PIG-1899) Pig needs a tool for doing end to end testing efficiently

2011-03-30 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1899: Attachment: PIG-1899-3.patch Addressed several issues from Olga's comments. Added embedding (turing

[jira] [Updated] (PIG-1932) GFCross should allow the user to set the DEFAULT_PARALLELISM value

2011-03-29 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1932: Status: Patch Available (was: Open) GFCross should allow the user to set the DEFAULT_PARALLELISM value

[jira] [Updated] (PIG-1932) GFCross should allow the user to set the DEFAULT_PARALLELISM value

2011-03-29 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1932: Attachment: PIG-1932_2.patch Commit test unit tests pass [exec] -1 overall. [exec] [exec

[jira] [Commented] (PIG-1881) Need a special interface for Penny (Inspector Gadget)

2011-03-29 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012586#comment-13012586 ] Alan Gates commented on PIG-1881: - It looks like you brought back

[jira] [Updated] (PIG-1932) GFCross should allow the user to set the DEFAULT_PARALLELISM value

2011-03-24 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1932: Attachment: PIG-1932.patch Unit tests pass. Results of test-patch: [exec] -1 overall. [exec

[jira] [Updated] (PIG-1932) GFCross should allow the user to set the DEFAULT_PARALLELISM value

2011-03-24 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1932: Fix Version/s: 0.9.0 Assignee: Alan Gates Status: Patch Available (was: Open) GFCross

[jira] [Commented] (PIG-1931) Integrate Macro Expansion with New Parser

2011-03-24 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13010899#comment-13010899 ] Alan Gates commented on PIG-1931: - Please make sure you post any results from your

[jira] [Resolved] (PIG-1802) Mark old logical plan and related classes deprecated so developers know where to focus their work

2011-03-24 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates resolved PIG-1802. - Resolution: Invalid Old logical plan is being removed as part of 0.9. Mark old logical plan and related

[jira] [Updated] (PIG-1932) GFCross should allow the user to set the DEFAULT_PARALLELISM value

2011-03-24 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1932: Status: Open (was: Patch Available) Daniel convinced me I should use the parallelism value from the cross

[jira] [Updated] (PIG-1924) CSV Loader/Store that handles newlines in fields, and other Excel CSV features.

2011-03-22 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1924: Attachment: PIG-1924.patch Rather than directly attach the source code, you should generate a patch. I've

[jira] [Commented] (PIG-1924) CSV Loader/Store that handles newlines in fields, and other Excel CSV features.

2011-03-22 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009750#comment-13009750 ] Alan Gates commented on PIG-1924: - Wait, I missed that you did not check the grant box

[jira] [Assigned] (PIG-1924) CSV Loader/Store that handles newlines in fields, and other Excel CSV features.

2011-03-22 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1924: --- Assignee: Andreas Paepcke CSV Loader/Store that handles newlines in fields, and other Excel CSV

[jira] [Resolved] (PIG-1925) Parser error message doesn't show location of the error or show it as Line 0:0

2011-03-22 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates resolved PIG-1925. - Resolution: Fixed Patch 1925-1 checked in. Parser error message doesn't show location of the error

[jira] [Created] (PIG-1932) GFCross should allow the user to set the DEFAULT_PARALLELISM value

2011-03-22 Thread Alan Gates (JIRA)
Components: impl Affects Versions: 0.8.0 Reporter: Alan Gates Priority: Minor The internal UDF GFCross uses a final static int DEFAULT_PARALLELISM to determine how wide to spread the records in a cross. It is currently hard wired to 96. There are no comments

[jira] [Commented] (PIG-671) typechecker does not throw an error when multiple arguments are passed to COUNT

2011-03-21 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009331#comment-13009331 ] Alan Gates commented on PIG-671: test-patch uses a tool to audit the files and make sure

[jira] [Commented] (PIG-1925) Parser error message doesn't show location of the error or show it as Line 0:0

2011-03-21 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009336#comment-13009336 ] Alan Gates commented on PIG-1925: - A couple of comments. One, 1:9. I assume this means

[jira] [Commented] (PIG-671) typechecker does not throw an error when multiple arguments are passed to COUNT

2011-03-21 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009342#comment-13009342 ] Alan Gates commented on PIG-671: Patch checked in. Thanks Deepak. typechecker does

[jira] Updated: (PIG-1899) Pig needs a tool for doing end to end testing efficiently

2011-03-17 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1899: Attachment: PIG-1899.patch Added UDFs for use with the test harness that were left out in the first patch

[jira] Commented: (PIG-1830) Type mismatch error in key from map, when doing GROUP on PigStorageSchema() variable

2011-03-16 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13007651#comment-13007651 ] Alan Gates commented on PIG-1830: - Unit tests pass. All looks good. Type mismatch error

[jira] Commented: (PIG-1830) Type mismatch error in key from map, when doing GROUP on PigStorageSchema() variable

2011-03-15 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13006961#comment-13006961 ] Alan Gates commented on PIG-1830: - If the release audit warnings are all in generated code

[jira] Commented: (PIG-671) typechecker does not throw an error when multiple arguments are passed to COUNT

2011-03-15 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13006972#comment-13006972 ] Alan Gates commented on PIG-671: Reviewing the patch. typechecker does not throw an error

[jira] Commented: (PIG-1896) CastUtils - Converting Pig DataTypes to Java Data Types

2011-03-15 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13007300#comment-13007300 ] Alan Gates commented on PIG-1896: - Unit tests pass. CastUtils - Converting Pig DataTypes

[jira] Commented: (PIG-1885) SUBSTRING fails when input length less than start

2011-03-14 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13006561#comment-13006561 ] Alan Gates commented on PIG-1885: - [exec] +1 overall. [exec] [exec] +1

[jira] Updated: (PIG-1885) SUBSTRING fails when input length less than start

2011-03-14 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1885: Resolution: Fixed Status: Resolved (was: Patch Available) Patch committed. Thanks Deepak

[jira] Updated: (PIG-1830) Type mismatch error in key from map, when doing GROUP on PigStorageSchema() variable

2011-03-14 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1830: Attachment: 1830-testpatch.tgz release audit output from test-patch Type mismatch error in key from map

[jira] Assigned: (PIG-1896) CastUtils - Converting Pig DataTypes to Java Data Types

2011-03-14 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1896: --- Assignee: Jonathan Holloway CastUtils - Converting Pig DataTypes to Java Data Types

[jira] Commented: (PIG-1891) Enable StoreFunc to make intelligent decision based on job success or failure

2011-03-11 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13005707#comment-13005707 ] Alan Gates commented on PIG-1891: - When we redesigned the load and store interfaces in 0.7

[jira] Commented: (PIG-144) The error message should be more meaningful when there is a typo in PIg script

2011-03-11 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13005759#comment-13005759 ] Alan Gates commented on PIG-144: I don't think we have to suggest alternate spellings

[jira] Commented: (PIG-1885) SUBSTRING fails when input length less than start

2011-03-11 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13005904#comment-13005904 ] Alan Gates commented on PIG-1885: - Patch looks good. I'll run the unit tests and test_patch

[jira] Created: (PIG-1899) Pig needs a tool for doing end to end testing efficiently

2011-03-11 Thread Alan Gates (JIRA)
Reporter: Alan Gates Assignee: Alan Gates Pig currently uses junit for all testing. junit is good for unit tests, but limited for end to end and integration testing. Building an end to end test in junit is cumbersome (a lot of setup and such to do using MiniCluster

[jira] Updated: (PIG-1899) Pig needs a tool for doing end to end testing efficiently

2011-03-11 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1899: Attachment: e2e.patch The attached patch contains a testing tool developed by the Yahoo Pig team to handle

[jira] Commented: (PIG-1874) Make PigServer work in a multithreading environment

2011-03-10 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13005183#comment-13005183 ] Alan Gates commented on PIG-1874: - Changes look good. What kind of testing are we doing

[jira] Commented: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor

2011-03-09 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13004872#comment-13004872 ] Alan Gates commented on PIG-366: By ownership here Olga didn't mean taking the code out

[jira] Commented: (PIG-1891) Enable StoreFunc to make intelligent decision based on job success or failure

2011-03-09 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13004873#comment-13004873 ] Alan Gates commented on PIG-1891: - It sounds like what you want is a way for the storage

[jira] Commented: (PIG-1881) Need a special interface for Penny (Inspector Gadget)

2011-03-08 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13004075#comment-13004075 ] Alan Gates commented on PIG-1881: - I don't want this to be a public interface at all

[jira] Assigned: (PIG-1885) SUBSTRING fails when input length less than start

2011-03-08 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1885: --- Assignee: Deepak Kumar V SUBSTRING fails when input length less than start

[jira] Commented: (PIG-671) typechecker does not throw an error when multiple arguments are passed to COUNT

2011-03-08 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13004087#comment-13004087 ] Alan Gates commented on PIG-671: Any changes we make have to still work in two scenarios

[jira] Commented: (PIG-1885) SUBSTRING fails when input length less than start

2011-03-08 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13004136#comment-13004136 ] Alan Gates commented on PIG-1885: - Changes look good. One thought I had is rather than

[jira] Resolved: (PIG-824) SQL interface for Pig

2011-03-08 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates resolved PIG-824. Resolution: Won't Fix SQL interface for Pig - Key: PIG-824

[jira] Commented: (PIG-1881) Need a special interface for Penny (Inspector Gadget)

2011-03-07 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13003449#comment-13003449 ] Alan Gates commented on PIG-1881: - The reason for not opening this up is that it exposes

[jira] Created: (PIG-1885) SUBSTRING fails when input length less than start

2011-03-05 Thread Alan Gates (JIRA)
Versions: 0.8.0, 0.9.0 Reporter: Alan Gates Priority: Minor SUBSTRING throws an error if it gets a string which has a length less than its start value. For example, SUBSTRING(x, 100, 120) will fail with any chararray of length less than 100. It should return null instead
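The behaviour requested in this report can be illustrated with a minimal sketch. This is not Pig's SUBSTRING implementation; the function name `substring_or_none` is hypothetical, and the sketch assumes non-negative offsets and that the end offset is clamped to the string length.

```python
from typing import Optional


def substring_or_none(value: Optional[str], start: int, end: int) -> Optional[str]:
    """Return value[start:end], or None when the input is too short to reach start."""
    # A null input stays null.
    if value is None:
        return None
    # The case from the report: the string is shorter than the start offset,
    # so return None (null) instead of raising an error.
    if len(value) < start:
        return None
    # Assumption: clamp the end offset so a short string yields what is available.
    return value[start:min(end, len(value))]
```

With this sketch, `substring_or_none(x, 100, 120)` yields `None` for any `x` shorter than 100 characters, which mirrors the null result the report asks for.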

[jira] Created: (PIG-1881) Need a special interface for Penny (Inspector Gadget)

2011-03-03 Thread Alan Gates (JIRA)
Affects Versions: 0.9.0 Reporter: Alan Gates Assignee: Laukik Chitnis Priority: Minor Fix For: 0.9.0 The proposed Penny tool needs access to Pig's new logical plan in order to inject code into the dataflow. Once it has modified the plan

[jira] Updated: (PIG-1881) Need a special interface for Penny (Inspector Gadget)

2011-03-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1881: Attachment: toolsserver.patch This patch provides the requested interface. It has gone out of sync

[jira] Commented: (PIG-1876) Typed map for Pig

2011-03-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13002100#comment-13002100 ] Alan Gates commented on PIG-1876: - I assume at the end when the schema for b is listed

[jira] Commented: (PIG-1875) Keep tuples serialized to limit spilling and speed it when it happens

2011-03-01 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13001192#comment-13001192 ] Alan Gates commented on PIG-1875: - Thoughts so far on a possible implementation

[jira] Updated: (PIG-1875) Keep tuples serialized to limit spilling and speed it when it happens

2011-03-01 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1875: Attachment: mrtuple.patch Here's a first pass at what MToRTuple might look like. I've done some basic

[jira] Commented: (PIG-1680) Pig 0.8 HBaseStorage may not against HBase 0.89

2011-03-01 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13001211#comment-13001211 ] Alan Gates commented on PIG-1680: - Sorry Dmitriy, I still haven't gotten to testing it. Go

[jira] Created: (PIG-1877) map constants not working properly in filter statements

2011-03-01 Thread Alan Gates (JIRA)
Affects Versions: 0.8.0 Reporter: Alan Gates Priority: Minor The Pig Latin script: {code} A = load '/Users/gates/test/data/studenttab10' as (a:map[], b:tuple(), c:bag{}); B = filter A by a == ['name'#'bob', 'age'#55]; dump B; {code} runs but produces the error: {code

[jira] Commented: (PIG-1877) map constants not working properly in filter statements

2011-03-01 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13001221#comment-13001221 ] Alan Gates commented on PIG-1877: - Tuple constants appear to have the same issue. map

[jira] Commented: (PIG-1680) Pig 0.8 HBaseStorage may not against HBase 0.89

2011-02-25 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12999504#comment-12999504 ] Alan Gates commented on PIG-1680: - Since you added an additional call to setLocation I want

[jira] Commented: (PIG-1842) Improve Scalability of the XMLLoader for large datasets such as wikipedia

2011-02-25 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12999506#comment-12999506 ] Alan Gates commented on PIG-1842: - I have checked the patch into trunk. I applied

[jira] Updated: (PIG-1842) Improve Scalability of the XMLLoader for large datasets such as wikipedia

2011-02-25 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-1842: Attachment: TEST-org.apache.pig.piggybank.test.storage.TestXMLLoader.txt Improve Scalability

[jira] Commented: (PIG-1842) Improve Scalability of the XMLLoader for large datasets such as wikipedia

2011-02-24 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12999053#comment-12999053 ] Alan Gates commented on PIG-1842: - From reviewing the code it is not clear to me how

[jira] Commented: (PIG-1680) Pig 0.8 HBaseStorage may not against HBase 0.89

2011-02-24 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12999074#comment-12999074 ] Alan Gates commented on PIG-1680: - https://repository.apache.org/content/repositories

[jira] Commented: (PIG-1867) Allow UDFs that can generate multiple output tuples from a single input tuple

2011-02-23 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12998570#comment-12998570 ] Alan Gates commented on PIG-1867: - Pig already offers this. Have your UDF return a bag
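As a rough illustration of the idea in this comment, the sketch below models a UDF that returns a bag as a plain Python function returning a list of tuples. It does not use Pig's UDF API. The names `tokenize` and `flatten_bags` are hypothetical, and the flattening step is an assumption about how the bag would be turned into separate output records, since the comment is cut off here.

```python
from typing import Iterable, Iterator, List, Tuple


def tokenize(line: str) -> List[Tuple[str]]:
    """Stand-in for a UDF: one input value in, a bag (list) of one-field tuples out."""
    return [(word,) for word in line.split()]


def flatten_bags(rows: Iterable[str]) -> Iterator[Tuple[str]]:
    """Stand-in for flattening: emit each tuple of each returned bag as its own record."""
    for row in rows:
        for out_tuple in tokenize(row):
            yield out_tuple
```

A single input such as `"a b"` produces a two-tuple bag, and flattening turns that bag into two output records.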

Re: About time estimator

2011-02-17 Thread Alan Gates
, and it was the join case. I mean they say that the progress estimator wouldn't deal with joins. What do you think would be a good approach for a join progress estimator? Any ideas are more than welcome. Thanks in advance! Renato M. 2011/2/15 Alan Gates ga...@yahoo-inc.com Parallax is implemented on a pull

Re: REMINDER: Pig developer meeting in February

2011-02-15 Thread Alan Gates
On Feb 15, 2011, at 5:18 AM, Dmitriy Ryaboy wrote: Is there overlap there with the error handling proposal? I don't think so. The error handling proposal is about how to handle errors that happen when you are running Pig jobs. Penny is a way to instrument your scripts so that you can

[jira] Commented: (PIG-1765) as_clause in foreach statement should differentiate between simple type and type within tuple

2011-02-15 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12994717#comment-12994717 ] Alan Gates commented on PIG-1765: - We don't know what we want to do in the long term here

[jira] Assigned: (PIG-1848) Confusing statement for Merge Join - Both Conditions in Pig reference manual1

2011-02-10 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1848: --- Assignee: Corinne Chandel Confusing statement for Merge Join - Both Conditions in Pig reference

Re: Review Request: Add macro expansion to Pig Latin

2011-02-08 Thread Alan Gates
On Feb 7, 2011, at 5:05 PM, Richard Ding wrote: On 2011-02-07 15:25:52, Julien Le Dem wrote: http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestMacroExpansion.java , line 3 https://reviews.apache.org/r/400/diff/1/?file=10791#file10791line3 Can a Macro call another

Re: Review Request: Add macro expansion to Pig Latin

2011-02-08 Thread Alan Gates
Cool. Alan. On Feb 8, 2011, at 9:54 AM, Richard Ding wrote: On 2011-02-07 15:25:52, Julien Le Dem wrote: http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestMacroExpansion.java , line 3 https://reviews.apache.org/r/400/diff/1/?file=10791#file10791line3 Can a Macro

[jira] Commented: (PIG-1842) Improve Scalability of the XMLLoader for large datasets such as wikipedia

2011-02-08 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12992088#comment-12992088 ] Alan Gates commented on PIG-1842: - The patch does not apply cleanly against the trunk. Can

[jira] Commented: (PIG-1825) ability to turn off the write ahead log for pig's HBaseStorage

2011-02-04 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12990759#comment-12990759 ] Alan Gates commented on PIG-1825: - Unit tests pass. The output of test-patch: [exec

[jira] Commented: (PIG-1717) pig needs to call setPartitionFilter if schema is null but getPartitionKeys is not

2011-02-01 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12989327#comment-12989327 ] Alan Gates commented on PIG-1717: - Re-running test-patch and unit tests. pig needs to call

[jira] Commented: (PIG-1717) pig needs to call setPartitionFilter if schema is null but getPartitionKeys is not

2011-02-01 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12989329#comment-12989329 ] Alan Gates commented on PIG-1717: - [exec] -1 overall. [exec] [exec] +1

[jira] Commented: (PIG-1825) ability to turn off the write ahead log for pig's HBaseStorage

2011-02-01 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12989330#comment-12989330 ] Alan Gates commented on PIG-1825: - Dmitriy, is this something we should check in? You

[jira] Created: (PIG-1836) Accumulator like interface should be used with Pig operators after (co)group in certain cases

2011-02-01 Thread Alan Gates (JIRA)
Project: Pig Issue Type: Improvement Reporter: Alan Gates There are a number of cases where people (co)group their data, and then pass it to an operator other than foreach with a UDF, but where an accumulator like interface would still make sense. A few examples

[jira] Commented: (PIG-1812) Problem with DID_NOT_FIND_LOAD_ONLY_MAP_PLAN

2011-01-27 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12987878#action_12987878 ] Alan Gates commented on PIG-1812: - +1 Problem with DID_NOT_FIND_LOAD_ONLY_MAP_PLAN

[jira] Commented: (PIG-1824) Support import modules in Jython UDF

2011-01-26 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12987143#action_12987143 ] Alan Gates commented on PIG-1824: - +1 to Ashutosh's comment. Also, this won't port well

[jira] Commented: (PIG-1769) Consistency for HBaseStorage

2011-01-24 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12985879#action_12985879 ] Alan Gates commented on PIG-1769: - +1, changes look good. I'll run the test-patch and unit

Re: dataflow in logical plan

2011-01-24 Thread Alan Gates
The logical plan for your script will look like: Load - Filter - Store Filter will have an expression plan that looks like Proj($0) const(5) So yes, all your data will go through the filter operator. But keep in mind that there is a filter operator in each map task, so all your code will
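The plan described in this reply can be modelled with a small sketch. It is not Pig's logical plan classes: `proj`, `const`, `greater_than`, and `run_plan` are hypothetical helpers, and the comparison operator is an assumption, since the operator between Proj($0) and const(5) did not survive in the archived text.

```python
from typing import Callable, Iterable, List, Tuple

Record = Tuple[int, ...]
Expr = Callable[[Record], int]


def proj(index: int) -> Expr:
    """Expression node: project field `index` out of the record."""
    return lambda record: record[index]


def const(value: int) -> Expr:
    """Expression node: a constant, independent of the record."""
    return lambda record: value


def greater_than(left: Expr, right: Expr) -> Callable[[Record], bool]:
    """Assumed comparison node joining the two sub-expressions."""
    return lambda record: left(record) > right(record)


def run_plan(records: Iterable[Record], predicate: Callable[[Record], bool]) -> List[Record]:
    """Load - Filter - Store: every record passes through the filter exactly once."""
    loaded = iter(records)                               # Load
    kept = (rec for rec in loaded if predicate(rec))     # Filter
    return list(kept)                                    # Store
```

The point of the sketch is that the filter has a single data input; the projection and the constant are parts of its expression, evaluated per record, not a second input stream.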

Re: dataflow in logical plan

2011-01-24 Thread Alan Gates
that the filter takes two inputs: the myfile data and the result of the proj(0) 5 which is 7 9 6 regards On Mon, Jan 24, 2011 at 10:08 PM, Alan Gates ga...@yahoo-inc.com wrote: The logical plan for your script will look like: Load - Filter - Store Filter will have an expression plan that looks like

[jira] Commented: (PIG-1717) pig needs to call setPartitionFilter if schema is null but getPartitionKeys is not

2011-01-20 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12984315#action_12984315 ] Alan Gates commented on PIG-1717: - I saw several unit test failures: [junit] Test

[jira] Assigned: (PIG-1749) Update Pig parser so that function arguments can contain newline characters

2011-01-20 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1749: --- Assignee: Jakob Homan Update Pig parser so that function arguments can contain newline characters

[jira] Commented: (PIG-847) Setting twoLevelAccessRequired field in a bag schema should not be required to access fields in the tuples of the bag

2011-01-14 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12981816#action_12981816 ] Alan Gates commented on PIG-847: I'm 100% behind removing twoLevelAccess, but I don't want

Re: Semantic cleanup: How to adding two bytearray

2011-01-14 Thread Alan Gates
I think the big win of static typing is that from examining the script alone you can know the output: A = load 'bla' using BinStorage(); B = foreach A generate $0 + $1; With static typing $0 and $1 will both be viewed as bytearrays and thus will be cast to doubles, regardless of how
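The static-typing rule described here can be sketched as follows. This is only an illustration of the stated behaviour (both operands treated as bytearrays and cast to doubles), not Pig's cast logic; the function name `add_bytearrays` is hypothetical, and the sketch assumes the bytes hold UTF-8 text that parses as a number.

```python
def add_bytearrays(left: bytes, right: bytes) -> float:
    """Add two untyped values by casting each one to a double first."""
    # Assumption: each bytearray holds the textual form of a number.
    return float(left.decode("utf-8")) + float(right.decode("utf-8"))
```

The result type is always a double, whatever the underlying data looks like, so the output type can be read off the script alone.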

[jira] Created: (PIG-1802) Mark old logical plan and related classes deprecated so developers know where to focus their work

2011-01-12 Thread Alan Gates (JIRA)
Project: Pig Issue Type: Bug Affects Versions: 0.9.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.9.0

[jira] Commented: (PIG-1717) pig needs to call setPartitionFilter if schema is null but getPartitionKeys is not

2011-01-12 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12981044#action_12981044 ] Alan Gates commented on PIG-1717: - Let me clarify my last comment. We are still

[jira] Commented: (PIG-1717) pig needs to call setPartitionFilter if schema is null but getPartitionKeys is not

2011-01-10 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979808#action_12979808 ] Alan Gates commented on PIG-1717: - Ideally I would like to pick A, since that is the clean

[jira] Commented: (PIG-1479) Embed Pig in scripting languages

2011-01-06 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978503#action_12978503 ] Alan Gates commented on PIG-1479: - Latest patch looks good. I just have one question. Why

[jira] Commented: (PIG-1675) Suggest to allow PigServer can register pig script from InputStream

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976979#action_12976979 ] Alan Gates commented on PIG-1675: - +1 on the latest patch, changes look good. To comment

[jira] Commented: (PIG-1777) LoadFunc in a scripting language

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976988#action_12976988 ] Alan Gates commented on PIG-1777: - Changes look good. One thing I missed before

[jira] Commented: (PIG-1717) pig needs to call setPartitionFilter if schema is null but getPartitionKeys is not

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976993#action_12976993 ] Alan Gates commented on PIG-1717: - We do not see using the AS clause in LOAD as the preferred

[jira] Assigned: (PIG-1584) deal with inner cogroup

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1584: --- Assignee: Daniel Dai (was: Alan Gates) deal with inner cogroup

[jira] Assigned: (PIG-1536) use same logic for merging inner schemas in default union and union onschema

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1536: --- Assignee: Daniel Dai (was: Alan Gates) use same logic for merging inner schemas in default union

[jira] Assigned: (PIG-1222) cast ends up with NULL value

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1222: --- Assignee: Daniel Dai (was: Alan Gates) cast ends up with NULL value

[jira] Assigned: (PIG-1112) FLATTEN eliminates the alias

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1112: --- Assignee: Daniel Dai (was: Alan Gates) FLATTEN eliminates the alias

[jira] Assigned: (PIG-998) revisit frontend logic and pig-latin semantics

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-998: -- Assignee: Daniel Dai (was: Alan Gates) revisit frontend logic and pig-latin semantics

[jira] Resolved: (PIG-904) Conversion from double to chararray for udf input arguments does not occur

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates resolved PIG-904. Resolution: Invalid Conversion from double to chararray for udf input arguments does not occur

[jira] Assigned: (PIG-847) Setting twoLevelAccessRequired field in a bag schema should not be required to access fields in the tuples of the bag

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-847: -- Assignee: Daniel Dai (was: Alan Gates) Setting twoLevelAccessRequired field in a bag schema should

[jira] Updated: (PIG-798) Schema errors when using PigStorage and none when using BinStorage in FOREACH??

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-798: --- Fix Version/s: (was: 0.9.0) Schema errors when using PigStorage and none when using BinStorage

[jira] Assigned: (PIG-767) Schema reported from DESCRIBE and actual schema of inner bags are different.

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-767: -- Assignee: Daniel Dai (was: Alan Gates) Schema reported from DESCRIBE and actual schema of inner bags

[jira] Assigned: (PIG-730) problem combining schema from a union of several LOAD expressions, with a nested bag inside the schema.

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-730: -- Assignee: Daniel Dai (was: Alan Gates) problem combining schema from a union of several LOAD

[jira] Assigned: (PIG-723) Pig generates incorrect schema for generated bags after FOREACH.

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-723: -- Assignee: Daniel Dai (was: Alan Gates) Pig generates incorrect schema for generated bags after FOREACH

[jira] Assigned: (PIG-694) Schema merge should take into account bags with tuples and bags with schemas

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-694: -- Assignee: Daniel Dai (was: Alan Gates) Schema merge should take into account bags with tuples and bags

[jira] Assigned: (PIG-496) project of bags from complex data causes failures

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-496: -- Assignee: Daniel Dai (was: Alan Gates) project of bags from complex data causes failures

[jira] Updated: (PIG-678) as support for group-by

2011-01-03 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-678: --- Fix Version/s: (was: 0.9.0) as support for group-by - Key: PIG
