[jira] Commented: (PIG-794) Use Avro serialization in Pig

2009-07-10 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12729700#action_12729700 ] Alan Gates commented on PIG-794: I agree with Doug's comments that it's better to use an API

[jira] Commented: (PIG-697) Proposed improvements to pig's optimizer

2009-07-02 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12726528#action_12726528 ] Alan Gates commented on PIG-697: A couple of questions and a comment on patch4-part2 I don't

[jira] Updated: (PIG-820) PERFORMANCE: The RandomSampleLoader should be changed to allow it subsume another loader

2009-06-30 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-820: --- Resolution: Fixed Status: Resolved (was: Patch Available) v6 of the patch checked in. Thanks Ashutosh

[jira] Resolved: (PIG-788) Proposal to remove float from Pig data types

2009-06-30 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates resolved PIG-788. Resolution: Won't Fix Avro has decided to keep float as a type. Proposal to remove float from Pig data

[jira] Commented: (PIG-793) Improving memory efficiency of Tuple implementation

2009-06-27 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12724880#action_12724880 ] Alan Gates commented on PIG-793: The cost for storing data raw is: 16 bytes for the tuple

[jira] Assigned: (PIG-793) Improving memory efficiency of Tuple implementation

2009-06-26 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-793: -- Assignee: Alan Gates Improving memory efficiency of Tuple implementation

[jira] Commented: (PIG-793) Improving memory efficiency of Tuple implementation

2009-06-26 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12724594#action_12724594 ] Alan Gates commented on PIG-793: Using jmap, I've been toying around with our DefaultTuple

Re: requirements for Pig 1.0?

2009-06-24 Thread Alan Gates
Jurney wrote: For 1.0 - complete Owl? http://wiki.apache.org/pig/Metadata Russell Jurney rjur...@cloudstenography.com On Jun 23, 2009, at 4:40 PM, Alan Gates wrote: I don't believe there's a solid list of want to haves for 1.0. The big issue I see is that there are too many interfaces

Re: requirements for Pig 1.0?

2009-06-24 Thread Alan Gates
To be clear, going to 1.0 is not about having a certain set of features. It is about stability and usability. When a project declares itself 1.0 it is making some guarantees regarding the stability of its interfaces (in Pig's case this is Pig Latin, UDFs, and command line usage). It is

[jira] Commented: (PIG-794) Use Avro serialization in Pig

2009-06-24 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12723812#action_12723812 ] Alan Gates commented on PIG-794: PIG-734 has been committed. This will allow this patch

Re: asking for comments on benchmark queries

2009-06-23 Thread Alan Gates
Zheng, I don't think you're subscribed to pig-dev (your emails have been bouncing to the moderator). So I've cc'd you explicitly on this. I don't think we need a Pig JIRA, it's probably easier if we all work on the hive one. I'll post my comments on the various scripts to that bug.

[jira] Commented: (PIG-820) PERFORMANCE: The RandomSampleLoader should be changed to allow it subsume another loader

2009-06-23 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12723204#action_12723204 ] Alan Gates commented on PIG-820: +1 PERFORMANCE: The RandomSampleLoader should be changed

Re: [VOTE] Release Pig 0.3.0 (candidate 0)

2009-06-22 Thread Alan Gates
Downloaded, ran, ran tutorial, built piggybank. All looks good. +1 Alan. On Jun 18, 2009, at 12:30 PM, Olga Natkovich wrote: Hi, I created a candidate build for Pig 0.3.0 release. The main feature of this release is support for multiquery which allows to share computation across

[jira] Commented: (PIG-697) Proposed improvements to pig's optimizer

2009-06-19 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12721968#action_12721968 ] Alan Gates commented on PIG-697: Why is it that some Logical operators (LOCross, LOStream

[jira] Commented: (PIG-697) Proposed improvements to pig's optimizer

2009-06-19 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12721977#action_12721977 ] Alan Gates commented on PIG-697: +1, looks good. Proposed improvements to pig's optimizer

[jira] Updated: (PIG-734) Non-string keys in maps

2009-06-18 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-734: --- Status: Open (was: Patch Available) Non-string keys in maps --- Key

[jira] Updated: (PIG-734) Non-string keys in maps

2009-06-18 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-734: --- Attachment: PIG-734_2.patch New version of the patch, brought up to date with current trunk. Non-string keys

[jira] Updated: (PIG-734) Non-string keys in maps

2009-06-18 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-734: --- Fix Version/s: (was: 0.3.0) 0.4.0 Status: Patch Available (was: Open) Non

[jira] Updated: (PIG-734) Non-string keys in maps

2009-06-18 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-734: --- Attachment: PIG-734_3.patch Attaching a version of the file that fixes some of the introduced compiler

[jira] Commented: (PIG-753) Provide support for UDFs without parameters

2009-06-18 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12721559#action_12721559 ] Alan Gates commented on PIG-753: +1 I tested the patch, and the issue was just with the bzip

[jira] Commented: (PIG-856) PERFORMANCE: reduce number of replicas

2009-06-18 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12721590#action_12721590 ] Alan Gates commented on PIG-856: My $0.02, based on the assumption that we see a significant

Re: Rewire and multi-query load/store optimization

2009-06-16 Thread Alan Gates
+1 on option one. The use of store-load was only to overcome a temporary problem in Pig. We've fixed the problem, so let's not propagate it. We will need to document this very clearly (maybe even to the point of issuing warnings in the parser when we see this combo) so users understand

[jira] Commented: (PIG-842) PigStorage should support multi-byte delimiters

2009-06-16 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12720259#action_12720259 ] Alan Gates commented on PIG-842: I'm concerned about the performance hit of supporting multi

[jira] Commented: (PIG-753) Provide support for UDFs without parameters

2009-06-15 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12719616#action_12719616 ] Alan Gates commented on PIG-753: The test failures are in bzip tests, which I doubt

[jira] Updated: (PIG-753) Provide support for UDFs without parameters

2009-06-15 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-753: --- Status: Open (was: Patch Available) Provide support for UDFs without parameters

Re: PigPen Source

2009-06-15 Thread Alan Gates
It has not yet been integrated into contrib because it requires the eclipse libraries to build, and those weren't integrated. The ivy stuff used by pig's build should be configured to pick up the appropriate eclipse jars so that this can be added to contrib. Alan. On Jun 15, 2009, at

[jira] Commented: (PIG-823) Hadoop Metadata Service

2009-06-10 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12718063#action_12718063 ] Alan Gates commented on PIG-823: In response to Matei's comment: The intent

[jira] Commented: (PIG-6) Addition of Hbase Storage Option In Load/Store Statement

2009-06-10 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12718071#action_12718071 ] Alan Gates commented on PIG-6: -- The outstanding patch that has not been applied (m34813f5.txt

[jira] Updated: (PIG-830) Port Apache Log parsing piggybank contrib to Pig 0.2

2009-06-05 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-830: --- Resolution: Fixed Fix Version/s: 0.3.0 Status: Resolved (was: Patch Available) Patch checked

[jira] Commented: (PIG-826) DISTINCT as Function/Operator rather than statement/operator - High Level Pig

2009-06-02 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12715639#action_12715639 ] Alan Gates commented on PIG-826: It can be done like this: {code} Logs = load 'log' using

[jira] Created: (PIG-831) Records and bytes written reported by pig are wrong in a multi-store program

2009-06-02 Thread Alan Gates (JIRA)
Type: Bug Components: impl Affects Versions: 0.3.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Minor The stats features checked in as part of PIG-626 (reporting the number of records and bytes written at the end of the query) print wrong

[jira] Commented: (PIG-831) Records and bytes written reported by pig are wrong in a multi-store program

2009-06-02 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12715657#action_12715657 ] Alan Gates commented on PIG-831: There are a couple of issues going on here. One, PigStats

[jira] Updated: (PIG-830) Port Apache Log parsing piggybank contrib to Pig 0.2

2009-06-02 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-830: --- Attachment: TEST-org.apache.pig.piggybank.test.storage.TestMyRegExLoader.txt Log file for failing unit test

[jira] Commented: (PIG-809) number of input lines it processed, number of output lines it produced for PIG job

2009-06-02 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12715709#action_12715709 ] Alan Gates commented on PIG-809: Sorry, I referenced the wrong jira in the previous comment

[jira] Updated: (PIG-830) Port Apache Log parsing piggybank contrib to Pig 0.2

2009-06-02 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-830: --- Status: Open (was: Patch Available) When I run the unit tests I get a failure in TestMyRegexLoader. I'll

[jira] Commented: (PIG-564) Parameter Substitution using -param option does not seem to work when parameters contain special characters such as +,=,-,?,'

2009-06-02 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12715764#action_12715764 ] Alan Gates commented on PIG-564: Questions/comments on the patch. 1) Why did output1.pig

[jira] Resolved: (PIG-825) PIG_HADOOP_VERSION should be 18

2009-06-01 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates resolved PIG-825. Resolution: Fixed Fix Version/s: 0.3.0 Patch checked in. Thanks Dmitriy. PIG_HADOOP_VERSION should

[jira] Commented: (PIG-753) Provide support for UDFs without parameters

2009-06-01 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12715312#action_12715312 ] Alan Gates commented on PIG-753: The patch should include a unit test that tests whether

[jira] Commented: (PIG-825) PIG_HADOOP_VERSION should be 18

2009-05-29 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12714621#action_12714621 ] Alan Gates commented on PIG-825: I'll take a look at this patch. PIG_HADOOP_VERSION should

[jira] Updated: (PIG-619) Dumping empty results produces Unable to get results for /tmp/temp-1964806069/tmp256878619 org.apache.pig.builtin.BinStorage message

2009-05-28 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-619: --- Resolution: Fixed Status: Resolved (was: Patch Available) Patch checked in. Dumping empty results

Proposed design for new merge join in pig

2009-05-28 Thread Alan Gates
http://wiki.apache.org/pig/PigMergeJoin Alan.

[jira] Commented: (PIG-796) support conversion from numeric types to chararray

2009-05-28 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12714224#action_12714224 ] Alan Gates commented on PIG-796: Can options a and b not be combined? Could we cache the type

Updated PigMix numbers for latest top of trunk

2009-05-28 Thread Alan Gates
http://wiki.apache.org/pig/PigMix Alan.

[jira] Created: (PIG-820) PERFORMANCE: The RandomSampleLoader should be changed to allow it subsume another loader

2009-05-26 Thread Alan Gates (JIRA)
Project: Pig Issue Type: Improvement Components: impl Affects Versions: 0.3.0 Reporter: Alan Gates Assignee: Alan Gates Currently a sampling job requires that data already be stored in BinaryStorage format, since RandomSampleLoader extends BinaryStorage

[jira] Commented: (PIG-697) Proposed improvements to pig's optimizer

2009-05-22 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12712153#action_12712153 ] Alan Gates commented on PIG-697: +1 for latest rev of part 3. Proposed improvements to pig's

Re: UDF with parameters?

2009-05-22 Thread Alan Gates
Yes, it is possible. The UDF should take the percentage you want as a constructor argument. It will have to be passed as a string and converted. Then in your Pig Latin, you will use the DEFINE statement to pass the argument to the constructor. REGISTER /src/myfunc.jar DEFINE percentile
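The constructor-argument pattern described in this reply can be sketched in plain Java. This is only an illustration under stated assumptions: the class name PercentileSketch, the exec signature, and the nearest-rank calculation are hypothetical, and the class deliberately does not extend any Pig class. It shows only the "passed as a string and converted" step.

```java
public class PercentileSketch {
    private final double fraction;

    // The DEFINE argument arrives as a string, so it is parsed here.
    public PercentileSketch(String percent) {
        this.fraction = Double.parseDouble(percent) / 100.0;
    }

    // Nearest-rank percentile of values that are already sorted ascending.
    // (Hypothetical method; a real Pig UDF would receive a tuple instead.)
    public double exec(double[] sorted) {
        if (sorted.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        int rank = (int) Math.ceil(fraction * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        PercentileSketch median = new PercentileSketch("50");
        System.out.println(median.exec(new double[] {1.0, 2.0, 3.0, 4.0}));
    }
}
```

Doing the conversion once in the constructor means the parsing cost is not paid on every call to exec.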

[jira] Commented: (PIG-697) Proposed improvements to pig's optimizer

2009-05-21 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12711861#action_12711861 ] Alan Gates commented on PIG-697: Comments on OptimizerPhase3_parrt1.patch Why does LOSplit

Re: A proposal for changing pig's memory management

2009-05-20 Thread Alan Gates
On May 19, 2009, at 10:30 PM, Mridul Muralidharan wrote: I am still not very convinced about the value of this implementation - particularly considering the advances made since 1.3 in memory allocators and garbage collection. My fundamental concern is not with the slowness of garbage

Re: A proposal for changing pig's memory management

2009-05-19 Thread Alan Gates
are there to provide? Wouldn't a virtual tuple type that was nothing more than a byte buffer, type and an offset do almost all of what is proposed here? On Thu, May 14, 2009 at 5:33 PM, Alan Gates ga...@yahoo-inc.com wrote: http://wiki.apache.org/pig/PigMemory Alan.

[jira] Commented: (PIG-697) Proposed improvements to pig's optimizer

2009-05-18 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12710473#action_12710473 ] Alan Gates commented on PIG-697: +1 for OptimizerPhase2.patch Proposed improvements to pig's

[jira] Commented: (PIG-809) number of input lines it processed, number of output lines it produced for PIG job

2009-05-14 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12709440#action_12709440 ] Alan Gates commented on PIG-809: Is this a duplicate of PIG-619, which was just committed

[jira] Updated: (PIG-810) Scripts failing with NPE

2009-05-14 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-810: --- Attachment: PIG-810.patch Scripts failing with NPE Key: PIG-810

[jira] Updated: (PIG-810) Scripts failing with NPE

2009-05-14 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-810: --- Fix Version/s: 0.3.0 Status: Patch Available (was: Open) Scripts failing with NPE

[jira] Updated: (PIG-619) Dumping empty results produces Unable to get results for /tmp/temp-1964806069/tmp256878619 org.apache.pig.builtin.BinStorage message

2009-05-14 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-619: --- Fix Version/s: 0.3.0 Status: Patch Available (was: Open) In order to see this behavior, you need

[jira] Commented: (PIG-806) to remove author tags in the pig source code

2009-05-12 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12708508#action_12708508 ] Alan Gates commented on PIG-806: http://wiki.apache.org/pig/HowToContribute see section

[jira] Commented: (PIG-788) Proposal to remove float from Pig data types

2009-05-12 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12708512#action_12708512 ] Alan Gates commented on PIG-788: Reading the latest comments on AVRO-17 it looks like

[jira] Updated: (PIG-626) Statistics (records read by each mapper and reducer)

2009-05-11 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-626: --- Status: Patch Available (was: Reopened) Statistics (records read by each mapper and reducer

[jira] Assigned: (PIG-788) Proposal to remove float from Pig data types

2009-05-11 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-788: -- Assignee: Alan Gates Proposal to remove float from Pig data types

[jira] Updated: (PIG-626) Statistics (records read by each mapper and reducer)

2009-05-08 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-626: --- Assignee: Shubham Chopra Status: Patch Available (was: Open) Statistics (records read by each mapper

[jira] Updated: (PIG-626) Statistics (records read by each mapper and reducer)

2009-05-08 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-626: --- Resolution: Fixed Fix Version/s: 0.3.0 Status: Resolved (was: Patch Available) Patch checked

[jira] Reopened: (PIG-626) Statistics (records read by each mapper and reducer)

2009-05-08 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reopened PIG-626: I should have checked my other window before I marked the bug as fixed. The commit failed, I can't seem

[jira] Updated: (PIG-626) Statistics (records read by each mapper and reducer)

2009-05-08 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-626: --- Attachment: PIG-626.patch A version of the patch that deals with the findbugs and javac warnings. Statistics

[jira] Updated: (PIG-800) script1-hadoop.pig in pig tutorial hangs when run in local mode

2009-05-07 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-800: --- Attachment: PIG-800.patch script1-hadoop.pig in pig tutorial hangs when run in local mode

[jira] Updated: (PIG-800) script1-hadoop.pig in pig tutorial hangs when run in local mode

2009-05-07 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-800: --- Attachment: PIG-800.patch script1-hadoop.pig in pig tutorial hangs when run in local mode

[jira] Updated: (PIG-800) script1-hadoop.pig in pig tutorial hangs when run in local mode

2009-05-07 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-800: --- Status: Patch Available (was: Open) Changed POSort and PODistinct to swallow POStatus.STATUS_NULL instead

[jira] Updated: (PIG-800) script1-hadoop.pig in pig tutorial hangs when run in local mode

2009-05-07 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-800: --- Resolution: Fixed Fix Version/s: 0.3.0 Status: Resolved (was: Patch Available) Patch

[jira] Updated: (PIG-734) Non-string keys in maps

2009-05-07 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-734: --- Attachment: PIG-734.patch Non-string keys in maps --- Key: PIG-734

[jira] Commented: (PIG-734) Non-string keys in maps

2009-05-07 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12707174#action_12707174 ] Alan Gates commented on PIG-734: For serialization, a type discovery has to happen on every

[jira] Commented: (PIG-734) Non-string keys in maps

2009-05-07 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12707195#action_12707195 ] Alan Gates commented on PIG-734: Changing maps to allow the user to specify a type would

[jira] Resolved: (PIG-357) PERFORMANCE: progress reported on every tuple

2009-05-06 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates resolved PIG-357. Resolution: Won't Fix I made a change in ProgressableReporter to only report progress every 2 minutes. I

[jira] Created: (PIG-800) script1-hadoop.pig in pig tutorial hangs when run in local mode

2009-05-06 Thread Alan Gates (JIRA)
Versions: 0.2.0 Reporter: Alan Gates Assignee: Alan Gates Any script of the form {code} B = foreach A generate flatten(X); -- X is a bag C = distinct B; {code} where X is sometimes an empty bag will hang in local mode. If distinct is replaced by order by it will also hang

[jira] Updated: (PIG-741) Add LIMIT as a statement that works in nested FOREACH

2009-05-05 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-741: --- Resolution: Fixed Status: Resolved (was: Patch Available) Patch checked in. Add LIMIT as a statement

[jira] Resolved: (PIG-789) coupling load and store in script no longer works

2009-05-04 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates resolved PIG-789. Resolution: Fixed Fix Version/s: 0.3.0 Patch checked in. Thanks Gunther. coupling load and store

[jira] Commented: (PIG-795) Command that selects a random sample of the rows, similar to LIMIT

2009-05-01 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12705037#action_12705037 ] Alan Gates commented on PIG-795: Eric, Thanks for the patch. I agree this is a feature

[jira] Commented: (PIG-626) Statistics (records read by each mapper and reducer)

2009-05-01 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12705152#action_12705152 ] Alan Gates commented on PIG-626: Shubham, I apologize for being so slow to get to this. I

[jira] Assigned: (PIG-697) Proposed improvements to pig's optimizer

2009-05-01 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-697: -- Assignee: Santhosh Srinivasan (was: Alan Gates) Proposed improvements to pig's optimizer

[jira] Commented: (PIG-795) Command that selects a random sample of the rows, similar to LIMIT

2009-05-01 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12705134#action_12705134 ] Alan Gates commented on PIG-795: I think it's fine to have sample as a keyword. It's valuable

[jira] Commented: (PIG-741) Add LIMIT as a statement that works in nested FOREACH

2009-04-30 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12704672#action_12704672 ] Alan Gates commented on PIG-741: Since limit distributes rather nicely, I'd very much like

[jira] Commented: (PIG-741) Add LIMIT as a statement that works in nested FOREACH

2009-04-30 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12704775#action_12704775 ] Alan Gates commented on PIG-741: I only added tests for local mode because inner operators

[jira] Commented: (PIG-627) PERFORMANCE: multi-query optimization

2009-04-28 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12703792#action_12703792 ] Alan Gates commented on PIG-627: Checked in multiquery-phase3_0423.patch to multiquery branch

[jira] Commented: (PIG-619) Dumping empty results produces Unable to get results for /tmp/temp-1964806069/tmp256878619 org.apache.pig.builtin.BinStorage message

2009-04-28 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12703940#action_12703940 ] Alan Gates commented on PIG-619: Does fixing this still make sense? IIRC the main reason

[jira] Created: (PIG-789) coupling load and store in script no longer works

2009-04-28 Thread Alan Gates (JIRA)
: 0.3.0 Reporter: Alan Gates Many users' Pig scripts do something like this: a = load '/user/pig/tests/data/singlefile/studenttab10k' as (name, age, gpa); c = filter a by age 500; e = group c by (name, age); f = foreach e generate group, COUNT($1); store f into 'bla'; f1 = load 'bla

[jira] Commented: (PIG-774) Pig does not handle Chinese characters (in both the parameter substitution using -param_file or embedded in the Pig script) correctly

2009-04-24 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12702445#action_12702445 ] Alan Gates commented on PIG-774: Two lines of change are needed to fix this: 1

[jira] Created: (PIG-783) PigStorage does not handle unicode characters above \u007f as a separator in the data.

2009-04-24 Thread Alan Gates (JIRA)
: Pig Issue Type: Bug Components: impl Affects Versions: 0.2.0 Reporter: Alan Gates Priority: Minor PigStorage reads one byte at a time to find the separator character in the data. So any multi-byte UTF-8 character will not work as a separator
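The limitation reported here follows from UTF-8 itself: any character above U+007F is encoded as more than one byte, so a reader that compares one byte at a time against a single separator byte cannot match it. A minimal Java sketch of that fact, with illustrative class and method names that are not taken from PigStorage:

```java
import java.nio.charset.StandardCharsets;

public class SeparatorBytes {
    // Number of bytes the given separator occupies once encoded as UTF-8.
    static int utf8Length(String separator) {
        return separator.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // A tab is below U+007F; e-acute (U+00E9) is above it.
        System.out.println("tab: " + utf8Length("\t"));
        System.out.println("e-acute: " + utf8Length("\u00e9"));
    }
}
```

A separator whose encoded length is greater than one would need a multi-byte comparison rather than a single-byte check.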

[jira] Commented: (PIG-759) HBaseStorage scheme for Load/Slice function

2009-04-23 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12702002#action_12702002 ] Alan Gates commented on PIG-759: Are you suggesting that the hbase scheme include ways

[jira] Commented: (PIG-712) Need utilities to create schemas for bags and tuples

2009-04-23 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12702007#action_12702007 ] Alan Gates commented on PIG-712: My apologies, I dropped the ball on this. As SchemaUtil

[jira] Commented: (PIG-775) PORelationToExprProject should create a NonSpillableDataBag to create empty bags

2009-04-23 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12702183#action_12702183 ] Alan Gates commented on PIG-775: +1 PORelationToExprProject should create

[jira] Commented: (PIG-697) Proposed improvements to pig's optimizer

2009-04-20 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12700958#action_12700958 ] Alan Gates commented on PIG-697: +1 on Part2 of Phase 1 patch. Proposed improvements

Re: [Pig Wiki] Update of HowToContribute by AlanGates

2009-04-16 Thread Alan Gates
At this point these are all proposed, none are yet realized. So there is no code for any of them. The place to track these proposals is in the referenced JIRAs. Alan. On Apr 15, 2009, at 6:44 PM, zhang jianfeng wrote: Hi Alan, Thank you for your guideline. So where's code of these

Re: [Pig Wiki] Update of ProposedProjects by AlanGates

2009-04-16 Thread Alan Gates
Your understanding of the proposal is correct. The goal would be to produce Java code rather than a pipeline configuration. But the reasoning is not so that users can then take that and modify themselves. There's nothing preventing them from doing it, but it has a couple of major

Re: Ajax library for Pig

2009-04-14 Thread Alan Gates
calls ( i.e async call from server to browser an inbuilt feature in DWR). DWR is under Apache Licence V2. --nitesh On Wed, Apr 8, 2009 at 9:11 PM, Alan Gates ga...@yahoo-inc.com wrote: Sorry if these are silly questions, but I'm not very familiar with some of these technologies. So what

[jira] Commented: (PIG-766) java.lang.OutOfMemoryError: Java heap space

2009-04-14 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12698996#action_12698996 ] Alan Gates commented on PIG-766: It isn't overall data size that matters. It is the size

[jira] Commented: (PIG-697) Proposed improvements to pig's optimizer

2009-04-13 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12698481#action_12698481 ] Alan Gates commented on PIG-697: Patch looks good. A few comments on comments. It looks like

Pig release 0.2.0

2009-04-09 Thread Alan Gates
The Pig team is happy to announce Pig 0.2.0 has been released. This release includes the addition of types, better error detection and handling, and a 5x performance improvement over 0.1.1. The details of the release can be found at http://hadoop.apache.org/pig/releases.html . Pig is a

[jira] Updated: (PIG-712) Need utilities to create schemas for bags and tuples

2009-04-09 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-712: --- Status: Open (was: Patch Available) The patch looks good. A couple of comments: 1) You set up a set

[jira] Commented: (PIG-729) Use of default parallelism

2009-04-09 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12697627#action_12697627 ] Alan Gates commented on PIG-729: -1 to requiring parallel as a keyword. Users move

[jira] Commented: (PIG-724) Treating integers and strings in PigStorage

2009-04-08 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12697054#action_12697054 ] Alan Gates commented on PIG-724: Currently Pig doesn't require that all keys and values

[jira] Commented: (PIG-745) Please add DataTypes.toString() conversion function

2009-04-08 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12697056#action_12697056 ] Alan Gates commented on PIG-745: I'm reviewing this patch. Please add DataTypes.toString

Re: Ajax library for Pig

2009-04-08 Thread Alan Gates
Sorry if these are silly questions, but I'm not very familiar with some of these technologies. So what you propose is that Pig would be installed on some dedicated server machine and a web server would be placed in front of it. Then client libraries would be developed that made calls to

[jira] Commented: (PIG-712) Need utilities to create schemas for bags and tuples

2009-04-08 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12697075#action_12697075 ] Alan Gates commented on PIG-712: Jeff, Thanks for the patch. I'll take a look
