Re: How to group on a group id that is present inside a complex hierarchy

2011-04-06 Thread Alan Gates
Approach 2 should work except for a bug in the way flatten schemas are handled (the bug will be fixed in 0.9 fwiw). If you specify the schema after the flatten I think it will work. Change at = foreach inputTuples generate flatten($0.$0#'stdout'); to at = foreach inputTuples generate flat

Re: Tests using ExecType.MAPREDUCE

2011-04-16 Thread Alan Gates
With the exception of occasional timeout issues all the unit tests should pass. We (Yahoo dev team) run them all every night. Alan. On Apr 15, 2011, at 9:55 PM, Dmitriy Ryaboy wrote: As a follow-up: I've been running the "ant test" suite since noon, trying to validate PIG-1870. It's now 1

Re: Tests using ExecType.MAPREDUCE

2011-04-16 Thread Alan Gates
I'm not opposed to switching them. IIRC the "using" clause is ignored in local mode, so tests for joins, cogroup, etc. that have "using" should not be switched. I want to discuss further how to fix this at the developer meeting next week. The current setup is a total mess. Alan. O

Re: [VOTE] Release Pig 0.8.1 (candidate 0)

2011-04-22 Thread Alan Gates
+1. Built it, ran the commit tests, ran the tutorial, checked the signature of the download and the md5. All looks good. Alan. On Apr 19, 2011, at 10:49 PM, Daniel Dai wrote: Hi, I have created a candidate build for Pig 0.8.1. A description of what is new and different is included in t

Review Request: Review request for PIG-1883-2.patch

2011-05-02 Thread Alan Gates
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/683/ --- Review request for pig. Summary --- This is Laukik's patch for PIG-1883 Th

Re: Review Request: Review request for PIG-1883-2.patch

2011-05-02 Thread Alan Gates
y one parameter, but two are listed in the comment. - Alan On 2011-05-02 20:41:04, Alan Gates wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/683/ > --

Unit tests messed up on 0.9 branch

2011-05-04 Thread Alan Gates
When I run test-commit on the 0.9 branch I get three tests failing: TestTypeCheckingValidatorNoSchema, TestTypeChecking, and TestSchemaParser. All three complain that the test class is not found. It appears the test classes have been moved but that the commit-tests file hasn't been updat

Re: Pig 0.7 download mirror sites not working

2011-05-12 Thread Alan Gates
Hadoop has removed the release artifacts of its former subprojects (including Pig) from the mirrors. You can still find the release in Apache's archive: http://archive.apache.org/dist/hadoop/pig/pig-0.7.0/ Alan. On May 12, 2011, at 9:16 AM, Subhramanian, Deepak wrote: The mirror sites I

New DOAP file for Pig Project

2011-05-16 Thread Alan Gates
I have a DOAP file for the Pig project that I would like to have listed on http://projects.apache.org/ The file is available at http://svn.apache.org/repos/asf/pig/trunk/doap_Pig.rdf Thanks. Alan.

Welcome to Aniket Mokashi

2011-05-19 Thread Alan Gates
Please join me in welcoming Aniket Mokashi as a new committer on Pig. Aniket has been contributing to Pig since last summer. He wrote or helped shepherd several major features in 0.8, including the Python UDF work, the new mapreduce functionality, and the custom partitioner. We look forw

Re: [jira] [Commented] (PIG-1772) Pig 090 Documentation

2011-05-28 Thread Alan Gates
0.9 has not yet been released. You can build your own copy from source by doing: svn co http://svn.apache.org/repos/asf/pig/branches/branch-0.9 cd branch-0.9 ant Alan. On May 27, 2011, at 4:00 PM, Richa Khandelwal wrote: Is 0.9 version of pig available for download? On Fri, May 27, 2011

Fwd: Travel Assistance applications now open for ApacheCon NA 2011

2011-06-06 Thread Alan Gates
Begin forwarded message: From: Gavin McDonald Date: June 6, 2011 1:03:24 AM PDT To: "p...@apache.org" Subject: Travel Assistance applications now open for ApacheCon NA 2011 Reply-To: "priv...@incubator.apache.org" , "ga...@16degrees.com.au" > Hi PMC folks, could you please kindly redist

Pig meetup after the Hadoop summit

2011-06-06 Thread Alan Gates
I've created a meetup at http://www.meetup.com/PigUser/events/21215831/ for the Pig user meetup on 6/30, the day after the Hadoop summit. We already have some great discussions lined up on Elephant Bird, embedding Pig in Python, and integrating Pig and Cassandra. There will also be time f

Re: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]

2011-06-17 Thread Alan Gates
MAX should definitely handle null, and it should ignore it. The goal for our SQL-like built-in aggregate functions (MIN, MAX, COUNT, SUM, AVG) is to be SQL-like. SQL ignores nulls in these functions. It's inconsistent, but it's usually what people want. So, we should be consistently inconsis
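
The null-handling rule described above can be sketched in a few lines of Python. This is only an illustration of the SQL convention (nulls are skipped; an all-null input yields null), not Pig's actual MAX or COUNT code; the names sql_max and sql_count are hypothetical.

```python
from typing import Iterable, Optional


def sql_max(values: Iterable[Optional[float]]) -> Optional[float]:
    """Return the largest non-null value, or None if every value is null.

    Hypothetical helper: mirrors the SQL convention that MAX skips nulls.
    """
    non_null = [v for v in values if v is not None]
    return max(non_null) if non_null else None


def sql_count(values: Iterable[Optional[float]]) -> int:
    """Count only the non-null values, as SQL's COUNT(column) does."""
    return sum(1 for v in values if v is not None)


if __name__ == "__main__":
    column = [1.0, None, 3.5, None]
    print(sql_max(column))    # nulls are ignored, not treated as errors
    print(sql_count(column))  # only the two non-null entries are counted
```

The point of the sketch is that a null input row is filtered out before the comparison rather than raising an exception, which is the behaviour the message argues for.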

Re: [VOTE] Release Pig 0.9.0 (candidate 0)

2011-06-23 Thread Alan Gates
Are you referring to PIG-2137? I have a few questions on that before I vote for this release candidate or to reroll. Is this a new issue introduced in 0.9? Is there a workaround for this? We have already discussed that 0.9.0 will be beta quality, and a follow up release will be needed a

Re: [DISCUSS] Release Pig 0.9.0 (candidate 0)

2011-06-23 Thread Alan Gates
t data, with no way for an analyst to know her statistics are now fundamentally wrong. Silent and subtle data corruption is just about the worst bug we can have. I can't really block the release, you guys can outvote me. But I'd rather you didn't; we can patch this problem

Fwd: Reminder: TAC Assistance to ApacheCon NA 2011 closes July 8th

2011-07-02 Thread Alan Gates
Begin forwarded message: > From: "Gavin McDonald" > Date: July 2, 2011 5:16:14 PM PDT > To: > Subject: Reminder: TAC Assistance to ApacheCon NA 2011 closes July 8th > Reply-To: priv...@incubator.apache.org > Reply-To: > > PMCs, please re-post this reminder to your user and dev lists and anyw

Pig testing proposal

2011-07-14 Thread Alan Gates
I have posted a proposal for changes in Pig's testing that I would like to make. https://cwiki.apache.org/confluence/display/PIG/PigTestProposal Please take a look and provide feedback. Alan.

Updates to HowToContribute

2011-07-21 Thread Alan Gates
I have changed our HowToContribute page. Most of the changes are trivial (Hudson->Jenkins, etc.). One significant change is that everywhere we said contributors should run "ant test" before submitting their patch, the page now says they should run "ant test-commit". We (as committers, and

Marking patches as submitted

2011-07-21 Thread Alan Gates
A heads up to all those contributing patches. I've noticed some people attach their patch without changing the workflow state to SubmitPatch. The way we Pig committers know to start reviewing and testing a patch is when it changes to this state. So if you don't change the state we are likely

Re: [VOTE] Release Pig 0.9.0 (candidate 1)

2011-07-22 Thread Alan Gates
+1. Ran the test-commit, tutorial, and quick sanity test against a real cluster on Linux, ran a quick sanity test in local mode on Mac. Checked signature key and md5. Alan. On Jul 22, 2011, at 2:12 PM, Olga Natkovich wrote: > I have created the second candidate build for Pig 0.9.0 release. T

Re: Failing tests after parser change?

2011-08-11 Thread Alan Gates
This looks like the intermittent Antlr bug we're seeing (https://issues.apache.org/jira/browse/PIG-2055). We're testing other versions of Antlr to try to fix this, but until we find one that addresses the issue the only solution is to do ant clean, and then rebuild and see if it goes away. We

Moving to new e2e harness for end-to-end testing

2011-09-01 Thread Alan Gates
I have gotten the end-to-end test harness to the point where it runs basically all the existing tests and where it can be run from ant. It can be run on either an existing cluster or in Amazon's EC2. There are instructions on how to run it in both settings at https://cwiki.apache.org/confluen

Re: Moving to new e2e harness for end-to-end testing

2011-09-01 Thread Alan Gates
re's EC2 for running your own tests if you don't have a cluster. But this isn't a viable long term solution. Alan. On Sep 1, 2011, at 6:49 PM, Dmitriy Ryaboy wrote: > Alan, > Great work. > Any plans for hooking this up to the apache Jenkins instance? > > D >

Proposing a Pig 0.9.1 release

2011-09-07 Thread Alan Gates
I propose to create a Pig 0.9.1 release. The main motivation behind this release is getting a version of Pig that works with Hadoop 0.20.204 out of the box. It would also work with 0.20.2. This would have all of the JIRAs that have been checked into the 0.9 branch post 0.9.0 [1]. I also prop

Re: [VOTE] Release Pig 0.9.1 (candidate 0)

2011-09-29 Thread Alan Gates
I found one very strange issue: Download and unpackage the tar ball, then do: > bin/pig -x local 2011-09-29 17:39:57,895 [main] INFO org.apache.pig.Main - Logging error messages to: /homes/hortonal/tmp/pig-0.9.1/pig-0.9.1/pig_1317317997892.log 2011-09-29 17:39:58,051 [main] INFO org.apache.pi

Re: [VOTE] Release Pig 0.9.1 (candidate 1)

2011-09-30 Thread Alan Gates
+1 Checked signatures and checksums on all 3 packages. Ran smoke test in local and cluster mode for tar installation. Tested installation of the rpm. Ran Piggybank tests and checked tutorial runs. Alan. On Sep 29, 2011, at 4:27 PM, Daniel Dai wrote: > Hi, > > I have created a candidate build f

Re: Using 'collected' group

2011-10-07 Thread Alan Gates
I would vote for Dmitriy's original option b, on a per feature basis. I know per feature switches are more cumbersome, but a "turn off all sanity checks" option is dangerous. When removing safeties it seems better to do it one at a time. Alan. On Oct 6, 2011, at 10:50 PM, Dmitriy Ryaboy wrot

Re: What is the canonicalname field in a Schema object used for?

2011-11-18 Thread Alan Gates
Santosh is the best person to answer this, as he wrote that code. But, IIRC its purpose is to store the "full" name of a column after cogroups and joins. For example, A = load 'foo' as (u, v); B = load 'bar' as (x, y); C = join A by u, B by x; I believe the canonicalname will now hold A::u, e
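
As a rough illustration of the naming convention mentioned above, the Python sketch below shows how the field names of a join result can be qualified with the alias they came from. It is an assumption-laden toy, not Pig's Schema class: qualified_join_schema is a hypothetical helper, and it models only the alias::field naming, not the canonicalname field itself.

```python
from typing import List


def qualified_join_schema(
    left_alias: str,
    left_fields: List[str],
    right_alias: str,
    right_fields: List[str],
) -> List[str]:
    """Return the join's output field names, each prefixed with its source alias.

    Hypothetical helper: models the alias::field convention only.
    """
    left = [f"{left_alias}::{name}" for name in left_fields]
    right = [f"{right_alias}::{name}" for name in right_fields]
    return left + right


if __name__ == "__main__":
    # Mirrors the example in the message: A has (u, v), B has (x, y),
    # and C is the join of A by u with B by x.
    print(qualified_join_schema("A", ["u", "v"], "B", ["x", "y"]))
```

Qualifying every field by alias keeps names unambiguous even when both inputs carry a field with the same short name.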

Re: Does the name of the tuple that a bag has to have matter?

2011-11-18 Thread Alan Gates
The name doesn't matter. We mostly left it there for backward compatibility, for both specifying schemas and for UDFs. I do think we should make sure we ignore it everywhere (including equality for schemas). Alan. On Nov 16, 2011, at 7:17 PM, Jonathan Coveney wrote: > This is related to an i

Re: Adding a flatten that doesn't throw out the row if you have an empty bag?

2011-12-09 Thread Alan Gates
+1, it seems like people often ask for this. I also would prefer a new operator. OUTER_FLATTEN maybe? Alan. On Dec 9, 2011, at 5:49 PM, Jonathan Coveney wrote: > I think this would be nice to have. We could either add a LEFTFLATTEN() > operator, or add a flag to FLATTEN ie FLATTEN({()},true)

Re: StoreMetadata.storeStatistics

2011-12-20 Thread Alan Gates
At the moment it is unused. It was placed there on the belief that someday storage functions like HCatStorer would want to record statistics from Pig when data was being generated. Alan. On Dec 19, 2011, at 11:04 PM, Vivek Padmanabhan wrote: > Hi, > Can someone tell what is the exact use case

Re: How do we feel about improving pigmix queries?

2011-12-20 Thread Alan Gates
On Dec 14, 2011, at 12:41 PM, Dmitriy Ryaboy wrote: > Two questions relating to that: > > 1) we currently hardcode parallel 40 in pigmix. Since Pig can now > automatically select parallelism, would it be better to let it do so? I agree the hard wiring is bad. But my take is that the auto-parall

Re: Request for contribution

2012-01-10 Thread Alan Gates
What's your ID on JIRA? I can make you a contributor so that you can assign JIRAs to yourself. Alan. On Jan 10, 2012, at 9:54 AM, Carl Frendo wrote: > Hi, > >I would like to start contributing to the pig project starting by the > newbie issues. I tried to assign a ticket in JIRA but I fou

Re: [VOTE] Release Pig 0.9.2 (candidate 0)

2012-01-16 Thread Alan Gates
+1. I checked the keys and signature of .rpm, .deb, and source release. I ran smoke tests in both local and cluster mode on the source release. One issue that we should clear up but that I don't believe blocks the tests is that there are a bunch (~100) 0 length .java files. These are files th

Re: [VOTE] Release Pig 0.9.2 (candidate 1)

2012-01-18 Thread Alan Gates
+1. Checked the rat report, signatures and md5s on all packages. I installed the tar/src release and ran smoke tests in local and cluster mode. I also built the packages and ran the commit unit tests. Alan. On Jan 17, 2012, at 5:16 PM, Daniel Dai wrote: > Hi, > > I have created a candidate

Re: Adding a method to PigProgressNotificationListener

2012-02-13 Thread Alan Gates
AFAIK the only user of this interface is Oozie. You might want to shoot a message to oozie-...@incubator.apache.org and let them know about the change. Alan. On Feb 12, 2012, at 5:12 PM, Dmitriy Ryaboy wrote: > I created https://issues.apache.org/jira/browse/PIG-2528 to track this > issue. >

Re: Where does the m in "mBagFactory" and "mTupleFactory" come from?

2012-02-14 Thread Alan Gates
Member. Somewhere a while ago I picked up the habit of starting member variables with an m so I could keep track of what was local vs what was a member. Alan. On Feb 13, 2012, at 11:58 PM, Jonathan Coveney wrote: > This came up today and I've always been curious as well. What is it > supposed

Re: Is this desirable: relation.projection as sugar for foreach relation generate projection

2012-02-23 Thread Alan Gates
I've been ruminating on this for a while, and I don't have an answer yet. So I'll just give the feedback I've thought of. This further blurs the already blurry line between a relation (which can go on the left side of an assignment) and a bag (which cannot). I don't know if that's a good thing or no

Re: Where do we want to put non-java source files?

2012-03-16 Thread Alan Gates
I vote we avoid the re-organization until there's a tangible benefit. I don't think there's any cost (beyond annoyance maybe) to putting ruby stuff in src-ruby. There isn't any benefit to moving to src/main/java/maven/demands/super/long/paths until we move to maven, if we ever do. Alan. On

Re: Where do we want to put non-java source files?

2012-03-16 Thread Alan Gates
e the other one once it is in. >> >> 2012/3/16 Alan Gates >> >>> I vote we avoid the re-organization until there's a tangible benefit. I >>> don't think there's any cost (beyond annoyance maybe) to putting ruby stuff >>> in src-ruby.

Re: [PIG-2226] Submitting a patch how to

2012-03-19 Thread Alan Gates
See Contributing your work in https://cwiki.apache.org/confluence/display/PIG/HowToContribute It talks about how to upload the patch to JIRA. Alan. On Mar 19, 2012, at 6:02 AM, Sachith Withana wrote: > Hi folks, > I prepared a patch for the PIG-2226 bug and I'd like to know how I can > submit i

Re: Pig User Group

2012-03-19 Thread Alan Gates
There will be a Pig meetup the day before Hadoop summit. But if you guys are willing to organize one sooner that's great. It would be cool to have a PUG every couple months, maybe rotating between SF and the valley. We just need someone with the time and desire to organize it. Alan. On Mar

Re: Pig User Group

2012-03-19 Thread Alan Gates
> >>> Great. >>> >>> Since Dmitriy graciously offered Twitter as a location for hosting such an >>> event :) , and I am sure Salesforce campus could be used too, what other >>> logistics would need to be worked out to arrange such a meet? >>

Re: Making git the repo of choice for Pig?

2012-03-21 Thread Alan Gates
AFAIK Apache still does not support git as a primary repository. You can use the git mirror, which Pig does. If this has changed (or when it does), I'm +0 on changing, by which I mean I don't care which we use. Alan. On Mar 20, 2012, at 11:22 PM, Jonathan Coveney wrote: > Would anyone be opp

Re: Making git the repo of choice for Pig?

2012-03-21 Thread Alan Gates
>> had >>>>>> re git) where I was pointing out that some incubator projects were >>> using >>>>>> git. >>>>>> >>>>>> https://twitter.com/#!/billgraham/status/174744199407738880 >>>>>> >>>>>>

Re: Fixing a broken dependency // can we include a patched piece of JRuby source code in Pig?

2012-03-23 Thread Alan Gates
Won't a lot of people already have their version of JRuby and not want a special one? I'm fine with having a patched version on github and referring to it in our release notes. I'm not wild about including a version of JRuby with Pig, for both licensing reasons and because our tar file is bloated

Re: [VOTE] Release Pig 0.10.1 (candidate 1)

2012-12-27 Thread Alan Gates
bin/pig and autocomplete files are missing from the -src artifact. Also this artifact has penny.jar in it, which I assume is unintentional. Alan. On Dec 27, 2012, at 11:23 AM, Daniel Dai wrote: > Hi, > > I have created a candidate build for Pig 0.10.1. This is a maintenance > release of Pig 0

Re: Patching multiple versions

2012-12-28 Thread Alan Gates
Are you asking how to personally apply the patch to other branches or how to have it applied to other branches when it is checked in? To apply it yourself just check out the branch you want and apply the patch. You may need to make some changes to account for changes in the code between trun

Re: [VOTE] Release Pig 0.10.1 (candidate 2)

2012-12-28 Thread Alan Gates
+1. Checked the checksums and signature for the source release. Built and ran some of the unit tests. Downloaded the tar binary, ran local and cluster jobs. Alan. On Dec 28, 2012, at 10:01 AM, Daniel Dai wrote: > Hi, > > I have created a candidate build for Pig 0.10.1. This is a maintenance

Re: [VOTE] Release Pig 0.10.1 (candidate 3)

2012-12-31 Thread Alan Gates
+1, yet again :). Checked the key signature and checksum on the source package. Built and ran commit unit tests on src, ran a test job in local mode. Downloaded the tar binary and ran a job in local and cluster mode. Alan. On Dec 28, 2012, at 11:50 PM, Daniel Dai wrote: > Hi, > > I have c

Re: Turn off Speculative Execution in a UDF?

2013-01-11 Thread Alan Gates
Store functions can run in either map or reduce depending on your script. If your script has any operator that requires a reduce (most joins, group by, order by, distinct, limit) then the store function will be in a reduce. Alan. On Jan 11, 2013, at 9:14 AM, Corbin Hoenes wrote: > Hi all, >
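
The rule of thumb above can be written down as a small Python sketch. It is a deliberate simplification and not how Pig's compiler decides: the operator names are plain strings, REDUCE_OPERATORS and store_phase are hypothetical, every join is treated as reduce-side (the message says only "most joins"), and multi-job scripts are ignored.

```python
from typing import Iterable

# Operators the message lists as requiring a reduce phase.
# Assumption: all joins are treated as reduce-side here, which overstates
# the rule, since the message says "most joins".
REDUCE_OPERATORS = frozenset({"join", "group", "order", "distinct", "limit"})


def store_phase(operators: Iterable[str]) -> str:
    """Return 'reduce' if any operator in the script needs a reduce, else 'map'."""
    needs_reduce = any(op in REDUCE_OPERATORS for op in operators)
    return "reduce" if needs_reduce else "map"


if __name__ == "__main__":
    print(store_phase(["load", "filter", "store"]))           # map-only script
    print(store_phase(["load", "group", "foreach", "store"]))  # group forces a reduce
```

For a UDF author the practical consequence is that settings tied to one phase, such as speculative execution, may need to cover both map and reduce tasks.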

Re: Review Request: Add BigInteger and BigDecimal to Pig

2013-01-21 Thread Alan Gates
Seems like the second argument here should be a BigInteger, not a boolean. Same comment for the next line. - Alan Gates On Jan. 18, 2013, 10:11 p.m., Jonathan Coveney wrote: > > --- > This is an

Re: pig pull request: Merge pull request #1 from apache/trunk

2013-01-24 Thread Alan Gates
The way to get patches into Pig is to file a JIRA ( https://issues.apache.org/jira/browse/PIG ) and attach your patch there. Alan. On Jan 22, 2013, at 4:45 PM, l33t-verticloud wrote: > GitHub user l33t-verticloud opened a pull request: > >https://github.com/apache/pig/pull/8 > >Merge

Re: Got a build error: missing required library: 'build/ivy/lib/Pig/javacc-4.2.jar'

2013-01-25 Thread Alan Gates
javacc is still used in Pig trunk. The main parser was replaced by ANTLR in 0.9, but there are still several places Pig uses javacc. Alan. On Jan 25, 2013, at 11:57 AM, Kyungho Jeon wrote: > Hello, > > I just cloned Pig svn repository and imported into Eclipse. > Immediately I got an error as

Re: Pig 11.0

2013-01-29 Thread Alan Gates
Olga had previously volunteered to be RM for this release. Olga, did you still want to do this or are you open to someone else taking it on? BTW, I don't have time to drive it right now either, though I'm happy to coach a first time RM through it if Olga doesn't want to do it. Alan. On Jan 2

Re: [VOTE] Release Pig 0.11.0 (candidate 0)

2013-02-11 Thread Alan Gates
+1 Ran a few smoke tests in both local and cluster modes, compiled and ran the unit tests on the source, tested piggybank and the tutorial, looked over the NOTICE, LICENSE, and RELEASE_NOTES files. We should update the RELEASE_NOTES to say we've tested this version with Hadoop 1.0 (instead of

Re: [VOTE] Release Pig 0.11.0 (candidate 1)

2013-02-12 Thread Alan Gates
+1 Release notes look good. Built the code and tests, including piggybank and docs. Ran the piggybank tests and all the tutorials. Ran simple smoke tests in local and cluster mode. Alan. On Feb 11, 2013, at 11:03 PM, Bill Graham wrote: > Hi, > > I have created a candidate build for Pig 0.

Re: What do we need to change site documentation?

2013-02-19 Thread Alan Gates
No, somebody fixed it a while ago so it works with java 6. Just checkout pig/site, make your changes, build with "ant -Dforrest.home=", view the changes locally under the publish directory, add any new files, and check in. The publication from SVN to web is now automatic. It all works fine wi

Re: pig 0.11 candidate 2 feedback: Several problems

2013-02-20 Thread Alan Gates
Time is not the right metric to determine how long we support a particular release of Hadoop. Some versions of Hadoop are widely adopted and have a long life (0.20) and some have very low adoption and short life spans (0.19, 0.21). We're an open source project driven by volunteers, so it makes

Re: When making a change to pig.apache.org, do we attach just the patch for the changes to author, or to the post-forrest changes to publish as well?

2013-02-20 Thread Alan Gates
You need to check in both author and publish. The site is now directly loaded from SVN using what's under publish. Alan. On Feb 20, 2013, at 1:35 AM, Jonathan Coveney wrote: > I believe that this is how it works, but it has been a while and I want to > make sure...

Re: pig 0.11 candidate 2 feedback: Several problems

2013-02-20 Thread Alan Gates
No. Bugs like these are supposed to be found and fixed after we branch from trunk (which happened several months ago in the case of 0.11). The point of RCs is to check that it's a good build, licenses are right, etc. Any bugs found this late in the game have to be seen as failures of earlier

Re: ON_ERROR command

2013-02-22 Thread Alan Gates
AFAIK no one has picked it up to work on it. I still believe this would be a very valuable feature. If you want to pick it up and drive it that would be really cool. Alan. On Feb 22, 2013, at 12:10 PM, Adam Silberstein wrote: > Hi, > I'm interested in custom-error handling in Pig. I came ac

Re: How do we post to the apache pig blog?

2013-02-22 Thread Alan Gates
I tried to send Dmitriy an invitation to be an author on the blog, but it told me "Error creating user invitation". I'm happy to post the content and make it clear you're the author. I've also filed https://issues.apache.org/jira/browse/INFRA-5894 to try to fix the issue. Alan. On Feb 22, 20

Fwd: Call for papers: Management of Big Data track - ICAC'13 by USENIX/ACM-SIGARCH

2013-03-20 Thread Alan Gates
Begin forwarded message: > From: Dani Abel Rayan > Date: March 14, 2013 10:55:00 AM PDT > To: user > Subject: Call for papers: Management of Big Data track - ICAC'13 by > USENIX/ACM-SIGARCH > Reply-To: u...@hadoop.apache.org > > Hi, > > Join us for the 10th International Conference on Auton

Re: A major addition to Pig. Working with spatial data

2013-05-01 Thread Alan Gates
Passing on the technical details for a moment, I see a licensing issue. JTS is licensed under LGPL. Apache projects cannot contain or ship [L]GPL. Apache does not meet the requirements of GPL and thus we cannot repackage their code. If you wanted to go forward using that class this would have

Re: A major addition to Pig. Working with spatial data

2013-05-02 Thread Alan Gates
for example, to download the jar file when compiling. > On May 1, 2013 7:50 PM, "Alan Gates" wrote: > >> Passing on the technical details for a moment, I see a licensing issue. >> JTS is licensed under LGPL. Apache projects cannot contain or ship >> [L]GPL.

Re: CHANGES.txt in trunk

2013-05-03 Thread Alan Gates
What do you mean by remove? They should still be in the file. They may need to be relocated under the 0.11 section. But the trunk CHANGES file should include all changes that are on trunk. Alan. On May 3, 2013, at 1:34 PM, Rohini Palaniswamy wrote: > Hi, > I see a lot of patches that went into

Re: CHANGES.txt in trunk

2013-05-06 Thread Alan Gates
Cool, just wanted to make sure. I agree this is a good idea. Alan. On May 5, 2013, at 7:06 PM, Rohini Palaniswamy wrote: > Alan, > I meant relocating only - Moving jiras from 0.12 to 0.11.x releases > section :). > > Regards, > Rohini > > > On Fri, May 3, 2013 at

Hadoop Summit Pig Meetup

2013-05-08 Thread Alan Gates
As always we'll be hosting a Pig meetup to coincide with the Hadoop Summit in San Jose. Details are at http://www.meetup.com/PigUser/events/118434872/ Please sign up there if you plan to attend. Alan.

Re: Uploading patches for review

2013-06-06 Thread Alan Gates
I think it's fine for a reviewer to ask for a particular patch to be put in review board. I think it would also be fine to put in our HowToContribute doc that for larger patches putting it in review board may help get it reviewed more quickly. I'm not in favor of requiring it, as some reviewer

Fwd: DesignLounge @ HadoopSummit

2013-06-13 Thread Alan Gates
Begin forwarded message: > From: Eric Baldeschwieler > Date: June 11, 2013 10:46:25 AM PDT > To: "common-...@hadoop.apache.org" > Subject: DesignLounge @ HadoopSummit > Reply-To: common-...@hadoop.apache.org > > Hi Folks, > > We thought we'd try something new at Hadoop Summit this year to bu

Fwd: DesignLounge @ HadoopSummit

2013-06-24 Thread Alan Gates
Begin forwarded message: > From: Eric Baldeschwieler > Date: June 23, 2013 9:32:12 PM PDT > To: "common-...@hadoop.apache.org" , > "mapreduce-...@hadoop.apache.org" , > "hdfs-...@hadoop.apache.org" > Subject: DesignLounge @ HadoopSummit > Reply-To: common-...@hadoop.apache.org > > Hi Folks,

Re: Pig and Storm

2013-07-24 Thread Alan Gates
This sounds exciting. The next question is how do you plan to do it? Would a physical plan be translated to a Storm job (or jobs)? Would it need a different physical plan? Or would you just have the connection at the language layer and all the planning separate? Do you envision needing ext

Re: JsonLoader fails the pig job in case of malformed json input

2013-08-08 Thread Alan Gates
Definitely, please provide a patch. Alan. On Aug 8, 2013, at 4:58 AM, Demeter Sztanko wrote: > Hi all, > > Suppose I have a text file that contains only one line: > {"a", bad} > > This is obviously not a valid json. > > This input fails this simple script: > b = load 'bad.input' using Jso

Re: Slow Group By operator

2013-08-22 Thread Alan Gates
When data comes out of a map task, Hadoop serializes it so that it can know its exact size as it writes it into the output buffer. To run it through the combiner it needs to deserialize it again, and then re-serialize it when it comes out. So each pass through the combiner costs a serialize/de

Re: Propose UDF

2013-09-04 Thread Alan Gates
A few questions: 1) Why did you try to use RANK? I don't see how rank is part of this. 2) The semantics here aren't clear to me. record_id appears to be crossed with name and id but name and id appear to be chosen in order. If this is join semantics I'd have expected two more entries in B, on

Re: Which file translates the program into a map reduce plan

2013-09-19 Thread Alan Gates
Check out src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java Alan. On Sep 19, 2013, at 3:54 PM, Abdollahian Noghabi, Shadi wrote: > Hi, > > I want to find which file in pig converts the physical plan into the map > reduce plan. Actually, I want to get some informa

Re: [Discussion] Any thoughts on PIG-3457?

2013-09-30 Thread Alan Gates
We should separate out two separate concerns. If I understand correctly we don't need any of these changes in 0.12. So we should revert these patches from the 12 branch so that we can get it released quickly in a backwards compatible way. We will then have plenty of time to discuss the sepa

Re: [VOTE] Release Pig 0.12.0 (candidate 2)

2013-10-07 Thread Alan Gates
+1. Downloaded, ran commit-test, piggybank unit tests, tutorial, and simple local mode smoke tests. Looked over the CHANGES, README, RELEASE_NOTES files to make sure they looked reasonable. Alan. On Oct 7, 2013, at 12:28 PM, Daniel Dai wrote: > Hi, > > I have created a candidate build for P

Re: How do we determine 'stable' pig version?

2013-10-22 Thread Alan Gates
I don't think we should change our use of stable. Our usage is in line with the Hadoop usage of the term in their releases. To the best of our knowledge as Apache developers it is stable. It passes all of the tests we have. We have no criteria for deciding stability beyond this. Alan. On O

Re: please unsubscribe

2016-06-01 Thread Alan Gates
To unsubscribe send email to dev-unsubscr...@pig.apache.org Alan. > On Jun 1, 2016, at 07:40, asser dennis wrote: > >

Re: [VOTE] Release Pig 0.16.0 (candidate 0)

2016-06-03 Thread Alan Gates
+1. Checked the signatures, did a build, ran a smoke test. Looks good. Alan. > On Jun 1, 2016, at 23:39, Daniel Dai wrote: > > Hi, > > I have created a candidate build for Pig 0.16.0. > > Keys used to sign the release are available at > http://svn.apache.org/viewvc/pig/trunk/KEYS?view=marku

Re: Please add me to the contributors

2016-07-01 Thread Alan Gates
Done. Welcome to the Pig team. Alan. > On Jul 1, 2016, at 05:58, Nandor Kollar wrote: > > Hi Dev team, > > Can you please add me to the Pig contributors? I'd like to submit patches > to open issues. > > Thanks, > Nandor

Re: Request for addition as contributor

2016-07-12 Thread Alan Gates
Done. Welcome to the Pig project! Alan. > On Jul 12, 2016, at 06:56, Adam Szita wrote: > > Hi, > > Can you add my userid (szita) as contributor to Pig please. > > Thanks, > Adam

Hadoop Summit EU 2017

2016-11-04 Thread Alan Gates
The DataWorks Summit EU 2017 (including Hadoop Summit) is going to be in Munich April 5-6 2017. I’ve pasted the text from the CFP below. Would you like to share your knowledge with the best and brightest in the data community? If so, we encourage you to submit an abstract for DataWorks Summit

Call for abstracts open for Dataworks & Hadoop Summit San Jose

2017-01-31 Thread Alan Gates
The Dataworks & Hadoop summit will be in San Jose June 13-15, 2017. The call for abstracts closes February 10. You can submit an abstract at http://tinyurl.com/dwsj17CFA There are tracks for Hadoop, data processing and warehousing, governance and security, IoT and streaming, cloud and operati

CFP for Dataworks Summit Sydney

2017-05-03 Thread Alan Gates
The Australia/Pacific version of Dataworks Summit is in Sydney this year, September 20-21.   This is a great place to talk about work you are doing in Apache Pig or how you are using Pig.  Information on submitting an abstract is at https://dataworkssummit.com/sydney-2017/abstracts/submit-abstra

Fwd: Travel Assistance applications open. Please inform your communities

2018-02-14 Thread Alan Gates
-- Forwarded message -- From: Gavin McDonald Date: Wed, Feb 14, 2018 at 1:34 AM Subject: Travel Assistance applications open. Please inform your communities To: travel-assista...@apache.org Hello PMCs. Please could you forward on the below email to your dev and user lists. Than

Moving Pig's code

2010-10-05 Thread Alan Gates
As part of our move to a TLP we need to move our code base out of svn.apache.org/repos/asf/hadoop/pig/ and into svn.apache.org/repos/asf/pig/. I plan to do this move tomorrow. It should not affect any of your outstanding checkouts (including ones where you have made changes), as there is

Re: Moving Pig's code

2010-10-06 Thread Alan Gates
I don't think so, but I'll check and find out. Alan. On Oct 5, 2010, at 11:29 PM, Russell Jurney wrote: Just curious - are apache projects allowed to use git as the 'repo of record' ? Russ On Tuesday, October 5, 2010, Alan Gates wrote: As part of our move to a TLP

Re: Moving Pig's code

2010-10-06 Thread Alan Gates
, and http://wiki.apache.org/general/GitAtApache. So if you prefer git you should be able to forget the official repository is in SVN and just live in the git world. Alan. On Oct 6, 2010, at 6:40 AM, Alan Gates wrote: I don't think so, but I'll check and find out. Alan. On Oct 5

Re: Moving Pig's code

2010-10-06 Thread Alan Gates
I'm starting the move now. You should hold off on any checkins until it's done, as you may not know where to check it in. :) Alan. On Oct 5, 2010, at 4:22 PM, Alan Gates wrote: As part of our move to a TLP we need to move our code base out of svn.apache.org/repos/asf/hadoop/pig/ and

Re: Moving Pig's code

2010-10-06 Thread Alan Gates
URL accordingly. And remember, from this point forward checkouts will need to come from https://svn.apache.org/repos/asf/pig/ and not https://svn.apache.org/repos/asf/hadoop/pig/ Alan. On Oct 6, 2010, at 12:52 PM, Alan Gates wrote: I'm starting the move now. You should hold off o

Re: Moving Pig's code

2010-10-07 Thread Alan Gates
On Thu, Oct 7, 2010 at 8:16 AM, Thejas M Nair wrote: Replace https with http in the commands below if your repository is checked out using http. (sending this because it took me a few minutes to figure out why the command didn't work for me) -Thejas On 10/6/10 1:24 PM, "

Proposed design for adding control flow to Pig

2010-10-15 Thread Alan Gates
After several months of mulling things around Richard and I have put together a proposed design for adding control flow to Pig. See http://wiki.apache.org/pig/TuringCompletePig for complete details. Please give us your feedback. Alan.

Re: Proposed design for adding control flow to Pig

2010-10-15 Thread Alan Gates
on examples and see tons of alias naming boilerplate that should IMO be implicit somehow. Pig already has a lot of alias and field naming boilerplate, I would like to avoid introducing more. Otherwise, I'm sure I'll use a preprocessor again to get rid of it :). On Oct 15, 2

Re: Proposed design for adding control flow to Pig

2010-10-25 Thread Alan Gates
On Oct 18, 2010, at 12:28 PM, Scott Carey wrote: It is the last that looks to complicate things a bit. We should probably pass the fields visible to the macro as well as the aliases. That might look like: define disjunctive_join_filter[out RESULT, A.(a,b,c), B.(a,b

Pig contributor meeting notes

2010-11-08 Thread Alan Gates
d anywhere else. No solid conclusion was reached on what is the right syntax, though no one voted for the currently proposed syntax. Dmitriy also noted that if we picked the right scripting language (such as JRuby) we could do the functions in that language without a need for adding macros to Pi

Re: pig LoadMetaData find schema in AS clause from Loader.

2010-11-10 Thread Alan Gates
To answer your direct question, no, there is currently no provision in the interface for Pig to provide the user defined schema to the load function. But it seems like the real solution to your problem is that LoadMetaData:setPartitionFilter ought to be called regardless of whether the lo
