Re: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]
On Thursday 04 October 2007 04:56, Robert Burrell Donkin wrote: http://agileskills2.org/blog/2007/09/my_thoughts_on_the_differences.html i have the impression that howard is one of those people who dominates by his charisma and energy rather than any abuse of the process Yes, and I hope he has wisdom enough to mitigate confrontation should it arise. Cheers -- Niclas Hedhman, Software Developer I live here; http://tinyurl.com/2qq9er I work here; http://tinyurl.com/2ymelc I relax here; http://tinyurl.com/2cgsug - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]
On 9/25/07, Craig L Russell [EMAIL PROTECTED] wrote: On Sep 25, 2007, at 8:28 AM, Guillaume Nodet wrote: One of the purposes of the incubator is to ensure that there is a sustainable developer community, so I don't see failure of incubating projects as a real problem. +1. If we knew for sure that a project would be able to attract a community, we would have much less need for incubation. +1 - robert
Re: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]
On 9/26/07, Niall Pemberton [EMAIL PROTECTED] wrote: On 9/25/07, Guillaume Nodet [EMAIL PROTECTED] wrote: One of the purposes of the incubator is to ensure that there is a sustainable developer community, so I don't see failure of incubating projects as a real problem. +1 There's more of an issue IMO with projects that don't come thru the incubator, since they don't have to meet the Incubator's stringent graduation requirement. As an example - Tapestry was pushed out to a TLP from Jakarta, but the following blog from a Tapestry committer doesn't make good reading from a community PoV: http://agileskills2.org/blog/2007/09/my_thoughts_on_the_differences.html i have the impression that howard is one of those people who dominates by his charisma and energy rather than any abuse of the process - robert
Re: Jakarta [was: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]]
On 01.10.2007, at 18:43, Roland Weber wrote: Erik Abele wrote: Sure, am happy to help (as a satisfied user of both, HttpComponents and JMeter); just let me know where you'd like to see me subscribed... (I assume [EMAIL PROTECTED] and [EMAIL PROTECTED]) That's great! Yes, those will be the interesting lists in terms of future directions for both subprojects. Done. If the traffic on either list is too high, you could subscribe to [EMAIL PROTECTED] instead. Already subscribed. I'll make sure to post there when discussions get under way. For HttpComponents, we're planning to prepare the TLP proposal for the December board meeting. Nice. You can find some older discussions in the mailing list archives. Ok, will have a look, thx for the pointers. Cheers, Erik
Re: Jakarta [was: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]]
On 30.09.2007, at 18:17, Roland Weber wrote: Niclas Hedhman wrote: I don't know what to suggest, but perhaps one or more veteran ASFers, recruited either off the members' list or from among experienced Incubator mentors, who feel this is important could just join the PMC and at least ensure process with 3 pairs of eyeballs. Yeah, we'll try to get a veteran (though not a member) to help us out as the chair for the initial phase. (speaking for HttpComponents, not JMeter) If anyone here feels like keeping an eye on us too, you're most welcome. We know our way through the code and public processes up to PMC, but we currently don't have an ASF member on board who is familiar with what's going on beyond PMC. (speaking for both) Sure, am happy to help (as a satisfied user of both, HttpComponents and JMeter); just let me know where you'd like to see me subscribed... (I assume [EMAIL PROTECTED] and [EMAIL PROTECTED]) Cheers, Erik
Re: Jakarta [was: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]]
Erik Abele wrote: Sure, am happy to help (as a satisfied user of both, HttpComponents and JMeter); just let me know where you'd like to see me subscribed... (I assume [EMAIL PROTECTED] and [EMAIL PROTECTED]) That's great! Yes, those will be the interesting lists in terms of future directions for both subprojects. If the traffic on either list is too high, you could subscribe to [EMAIL PROTECTED] instead. I'll make sure to post there when discussions get under way. For HttpComponents, we're planning to prepare the TLP proposal for the December board meeting. You can find some older discussions in the mailing list archives. thanks! Roland
Re: Jakarta [was: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]]
On Sunday 30 September 2007 01:19, Roland Weber wrote: The new HttpComponents as well as the old HttpClient we maintain are being used by Apache projects, so coming into the Incubator is not an option for HttpComponents. I agree. And typically, TLPs receive somewhat more exposure than sub-projects and a better chance of building a stronger community. I don't know what to suggest, but perhaps one or more veteran ASFers, recruited either off the members' list or from among experienced Incubator mentors, who feel this is important could just join the PMC and at least ensure process with 3 pairs of eyeballs. It sounds to me there should be such interest... Cheers -- Niclas Hedhman, Software Developer I live here; http://tinyurl.com/2qq9er I work here; http://tinyurl.com/2ymelc I relax here; http://tinyurl.com/2cgsug
Re: Jakarta [was: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]]
Niclas Hedhman wrote: I don't know what to suggest, but perhaps one or more veteran ASFers, recruited either off the members' list or from among experienced Incubator mentors, who feel this is important could just join the PMC and at least ensure process with 3 pairs of eyeballs. Yeah, we'll try to get a veteran (though not a member) to help us out as the chair for the initial phase. (speaking for HttpComponents, not JMeter) If anyone here feels like keeping an eye on us too, you're most welcome. We know our way through the code and public processes up to PMC, but we currently don't have an ASF member on board who is familiar with what's going on beyond PMC. (speaking for both) thanks for taking the time, Roland
Re: Jakarta [was: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]]
Hello Niclas, Staying at Jakarta will buy some time, but won't last forever. If you have ideas on what to do with these small but active projects, please come over to [EMAIL PROTECTED] and share your thoughts. Can't Jakarta just be revitalized as a home for small, but mature and stable projects?? I feel Jakarta has to downsize some more before we can think about reviving it in a new role. A home for small, mature and stable projects is surely an option, though I don't believe it will be easy. The two projects I care about, however, do not match that profile. They are highly active, evolving, and growing. You can see that from Jakarta's September board report[1]: 5 releases, all of them from JMeter and HttpComponents. JMeter released 2.3 final today, and HttpComponents has three more releases in the pipeline until the end of the year. Both projects have hundreds of mails on their lists each month. That's not what I would associate with mature and stable, which sounds more like maintenance mode. So, both projects are actively developed and used. We know how to vote and cut releases. The projects have prospects of growing and attracting a larger user base, from which we can hope to get new committers over time. But at the moment, both depend on a very small group of developers that provide continuity, with occasional patches from others coming in. And we don't get the time to grow organically, with Jakarta disintegrating. I don't know what Sebastian plans for JMeter. Oleg and I will push for an HttpComponents TLP later this year. Not because we feel that the project is ready for that move, but because we see it is the best option left to us. Either we stay at Jakarta until we're being asked to leave (or shut down), or we make a move of our own while we can still choose the time to our convenience. A fallback option is to move from Jakarta to WebServices, exchanging one umbrella for another. That wouldn't move us forward, and the impression we got from Jakarta is that umbrellas are not in favour at the board. As a TLP, we will have a fighting chance to grow the project to the point where it no longer depends on just the two of us. Until that is achieved, we'll be one of those projects that Niall referred to, with community issues because they didn't pass through the Incubator. That's why I thought it was a good occasion to ask for suggestions. The new HttpComponents as well as the old HttpClient we maintain are being used by Apache projects, so coming into the Incubator is not an option for HttpComponents. I'm sorry if this is getting off-topic. I'm just trying to tap into the Incubator's experience in community building. cheers, Roland [1] http://wiki.apache.org/jakarta/JakartaBoardReport-current
Jakarta [was: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]]
Niall Pemberton wrote: There's more of an issue IMO with projects that don't come thru the incubator, since they don't have to meet the Incubator's stringent graduation requirement. As an example - Tapestry was pushed out to a TLP from Jakarta,[...] Jakarta is disintegrating. All big projects have gone TLP, there are two or three more that might just make it. The rest are too inactive, because of maturity or disinterest, to stand on their own. There have been discussions every few months on how to revive inactive projects, and sending them back into the incubator was one of the options. Not that it matters much, I don't remember any that picked up enough interest. Now the projects that are still active but have only a small developer community - too small for Incubator standards for sure - are caught between a rock and a hard place. There are users out there, and the code has seen many official releases. Going into the Incubator and making unofficial incubating releases from there is not a preferred option. Going TLP with just enough PMC members to collect three binding votes during holiday season creates TLPs with a very high dependency on very few people. Projects which are an issue. Staying at Jakarta will buy some time, but won't last forever. If you have ideas on what to do with these small but active projects, please come over to [EMAIL PROTECTED] and share your thoughts. The two examples I have in mind are HttpComponents and JMeter, but there may be others: http://jakarta.apache.org/httpcomponents/index.html http://jakarta.apache.org/jmeter/index.html cheers, Roland
Re: Jakarta [was: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]]
On Saturday 29 September 2007 00:03, Roland Weber wrote: Staying at Jakarta will buy some time, but won't last forever. If you have ideas on what to do with these small but active projects, please come over to [EMAIL PROTECTED] and share your thoughts. Can't Jakarta just be revitalized as a home for small, but mature and stable projects?? We must allow for stable, near perfect codebases, to just exist without further development, i.e. no dedicated community. Cheers Niclas
Re: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]
Noel J. Bergman wrote: Dims wrote: Niclas Hedhman asked: Do we have any examples where corporate backing has been withdrawn, and how the project was affected, whether inside or outside ASF? TSIK - Verisign folks lost interest, community did not form, project shelved. Plus Kabuki, Heraldry, and possibly Lokahi. We still hope to save the latter, as there is consistently a lot of user interest, but the developer input has dwindled, and we need developers! ahh, now I understand why you've been trying to get me on the mail list :) The biggest departure I know of was not in the incubator, it was the implementation of bits of WS-RF that HP was doing under WS; what was it, Apache Muse? Suddenly corporate priorities got changed and all FTEs got reassigned to something else. It just sat there for a while before IBM took up the challenge with a port to Axis2. Similarly, there was a bit of stutter in Axis1 when the IBM team suddenly dropped off the net. There were lots of other active developers, but there were whole swathes of things like Java-to-WSDL code that came from IBM and which the others suddenly needed to learn, because till now that area had been well covered by the IBM folk, but not outstandingly well documented. At least they provided lots of tests, which does make it easier for others to take on the maintenance task; it reduces the amount of damage done while learning. It seems to me then, that the problem is more than just in-incubator. -steve
Re: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]
On 9/25/07, Guillaume Nodet [EMAIL PROTECTED] wrote: One of the purposes of the incubator is to ensure that there is a sustainable developer community, so I don't see failure of incubating projects as a real problem. +1 There's more of an issue IMO with projects that don't come thru the incubator, since they don't have to meet the Incubator's stringent graduation requirement. As an example - Tapestry was pushed out to a TLP from Jakarta, but the following blog from a Tapestry committer doesn't make good reading from a community PoV: http://agileskills2.org/blog/2007/09/my_thoughts_on_the_differences.html Niall On 9/25/07, Noel J. Bergman [EMAIL PROTECTED] wrote: Dims wrote: Niclas Hedhman asked: Do we have any examples where corporate backing has been withdrawn, and how the project was affected, whether inside or outside ASF? TSIK - Verisign folks lost interest, community did not form, project shelved. Plus Kabuki, Heraldry, and possibly Lokahi. We still hope to save the latter, as there is consistently a lot of user interest, but the developer input has dwindled, and we need developers! --- Noel
Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]
On Tuesday 25 September 2007 04:18, Robert Burrell Donkin wrote: this is probably just an indication that it's time to start thinking... +1 to all said. Do we have any examples where corporate backing has been withdrawn, and how the project was affected, whether inside or outside ASF? It would be interesting to hear about... Not that this has much to do with the current proposal Pig... Cheers -- Niclas Hedhman, Software Developer I live here; http://tinyurl.com/2qq9er I work here; http://tinyurl.com/2ymelc I relax here; http://tinyurl.com/2cgsug
Re: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]
TSIK - Verisign folks lost interest, community did not form, project shelved. -- dims On 9/25/07, Niclas Hedhman [EMAIL PROTECTED] wrote: On Tuesday 25 September 2007 04:18, Robert Burrell Donkin wrote: this is probably just an indication that it's time to start thinking... +1 to all said. Do we have any examples where corporate backing has been withdrawn, and how the project was affected, whether inside or outside ASF? It would be interesting to hear about... Not that this has much to do with the current proposal Pig... Cheers -- Niclas Hedhman, Software Developer I live here; http://tinyurl.com/2qq9er I work here; http://tinyurl.com/2ymelc I relax here; http://tinyurl.com/2cgsug -- Davanum Srinivas :: http://davanum.wordpress.com
Re: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]
One of the purposes of the incubator is to ensure that there is a sustainable developer community, so I don't see failure of incubating projects as a real problem. On 9/25/07, Noel J. Bergman [EMAIL PROTECTED] wrote: Dims wrote: Niclas Hedhman asked: Do we have any examples where corporate backing has been withdrawn, and how the project was affected, whether inside or outside ASF? TSIK - Verisign folks lost interest, community did not form, project shelved. Plus Kabuki, Heraldry, and possibly Lokahi. We still hope to save the latter, as there is consistently a lot of user interest, but the developer input has dwindled, and we need developers! --- Noel -- Cheers, Guillaume Nodet Blog: http://gnodet.blogspot.com/
Re: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]
On Sep 25, 2007, at 8:28 AM, Guillaume Nodet wrote: One of the purposes of the incubator is to ensure that there is a sustainable developer community, so I don't see failure of incubating projects as a real problem. +1. If we knew for sure that a project would be able to attract a community, we would have much less need for incubation. Craig I have not failed. I've just found 10,000 ways that won't work. Thomas Alva Edison On 9/25/07, Noel J. Bergman [EMAIL PROTECTED] wrote: Dims wrote: Niclas Hedhman asked: Do we have any examples where corporate backing has been withdrawn, and how the project was affected, whether inside or outside ASF? TSIK - Verisign folks lost interest, community did not form, project shelved. Plus Kabuki, Heraldry, and possibly Lokahi. We still hope to save the latter, as there is consistently a lot of user interest, but the developer input has dwindled, and we need developers! --- Noel -- Cheers, Guillaume Nodet Blog: http://gnodet.blogspot.com/ Craig Russell Architect, Sun Java Enterprise System http://java.sun.com/products/jdo 408 276-5638 mailto:[EMAIL PROTECTED] P.S. A good JDO? O, Gasp!
RE: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]
One of the purposes of the incubator is to ensure that there is a sustainable developer community, so I don't see failure of incubating projects as a real problem. Neither do I. It merely underscores the need to make sure that there is such a sustainable community. But Niclas did ask for examples. :-) --- Noel
Re: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]
On 9/25/07, Guillaume Nodet [EMAIL PROTECTED] wrote: One of the purposes of the incubator is to ensure that there is a sustainable developer community, so I don't see failure of incubating projects as a real problem. Actually I think it's showing the incubator's success even if it's always sad to see a project die. I would see a problem if a project graduated successfully from the incubator and then got abandoned a few months later, which I haven't seen so far. So a strong corporate backing is maybe just a sign that more attention is going to be needed from mentors and the IPMC but I don't see it as worrying either. Matthieu On 9/25/07, Noel J. Bergman [EMAIL PROTECTED] wrote: Dims wrote: Niclas Hedhman asked: Do we have any examples where corporate backing has been withdrawn, and how the project was affected, whether inside or outside ASF? TSIK - Verisign folks lost interest, community did not form, project shelved. Plus Kabuki, Heraldry, and possibly Lokahi. We still hope to save the latter, as there is consistently a lot of user interest, but the developer input has dwindled, and we need developers! --- Noel -- Cheers, Guillaume Nodet Blog: http://gnodet.blogspot.com/
Re: Effects on corporate backing withdrawals [was: Incubator Proposal: Pig]
On Sep 25, 2007, at 11:28 AM, Guillaume Nodet wrote: One of the purposes of the incubator is to ensure that there is a sustainable developer community, so I don't see failure of incubating projects as a real problem. Neither do I...
Re: Incubator Proposal: Pig
Niclas Hedhman wrote: [...] b) I can't say that I understand the technical merits of the proposal, and just see the headline "analyzing large data sets". And I would like to know the relationship with UIMA's statement "... analyze large volumes of unstructured information..." and hear whether there is overlap, synergies and/or collaboration in view. Niclas, I'm not 100% clear on where there could be synergies between Pig and UIMA. Map/reduce is a natural distribution strategy for UIMA, so executing UIMA programs on top of Hadoop seems natural. Maybe Pig can help with that and make it easier somehow. However, that is not clear to me from the proposal at this time. At the same time, I don't really think there is any overlap. Pig is concerned with computation in a distributed environment, while UIMA is agnostic in that respect. On the other hand, UIMA offers a component model to develop analysis modules and combine them into processing chains (with an emphasis on reuse). I do not see from the proposal that Pig is in the business of defining a component model. So synergies probably yes, no overlap as far as I can see. --Thilo
Re: Incubator Proposal: Pig
Olga Natkovich wrote: Hi, Yahoo! research and development teams have developed a proposal below. The proposal is also available on wiki at http://wiki.apache.org/incubator/PigProposal. We would like to ask that the ASF consider forming a podling according to the proposal. High-level tools like Pig are definitely needed to ease the adoption of non-traditional storage/database systems like Hadoop, both by the developer communities and their managers. I was pretty excited when the first opensource version was released a few months ago, so a big +1 for this proposal. I'd be happy to be a mentor too. Sylvain -- Sylvain Wallez - http://bluxte.net
Re: Incubator Proposal: Pig
I am a +1 on the proposal, but I am still unclear, at this point, how Y! is going to align the open source aspects of Pig with their hiring push for Pig developers as per: http://research.yahoo.com/project/pig I guess this is more a general concern about the changing dynamics. First, of course, there were people developing code as volunteers because it was fun or because they were directly influenced by the code itself (after all, this is where httpd got its start). Then we were able to move into the sweet spot where not only did we have true volunteer developers but also developers who got paid to continue developing... Now we seem to be getting into the realm where a condition of their employment is to code ASF stuff... The main concern is whether they are developing because they want to, or they have to. In other words, if the corporate support of the project or podling went away, would they stop developing and working on the codebase because they, after all, had no allegiance in the code at all? Were they, in effect, coders-for-hire? Certainly Pig is not unique in this. There are other Incubator podlings so much in line with a major corporate entity that if the entity decided today that Apache Foo didn't make corporate sense, 95% of their developers would never be seen or heard from again... I, of course, trust the Mentors of projects to work through these issues, and a condition of graduation after all is that the community itself is diverse enough and strong enough to survive such transitions. But I see this becoming harder as time goes on as well as much, much more common.
Re: Incubator Proposal: Pig
On Sep 24, 2007, at 4:44 AM, Sylvain Wallez wrote: Olga Natkovich wrote: Hi, Yahoo! research and development teams have developed a proposal below. The proposal is also available on wiki at http://wiki.apache.org/incubator/PigProposal. We would like to ask that the ASF consider forming a podling according to the proposal. +1
Re: Incubator Proposal: Pig
Niclas Hedhman wrote: a) The name Pig is somewhat provocative (not kosher/halal) and I would like to hear the rationale behind the name, and whether there is any willingness to look for another name. It is not meant to be provocative. It is named after the animal and is not an acronym. It also provides a convenient name for the query language, Pig Latin. I'm sure that the developers would have some regret about changing the name, but if it were truly determined to be offensive then I believe there would be a willingness to change it. Doug
Re: Incubator Proposal: Pig
Speaking just for myself, I find the name unusual but not offensive or even provocative. The fact that you wouldn't eat an animal doesn't mean you deny its existence... Of course, I thought the language was officially called igpay atinlay. Craig On Sep 24, 2007, at 10:15 AM, Doug Cutting wrote: Niclas Hedhman wrote: a) The name Pig is somewhat provocative (not kosher/halal) and I would like to hear the rationale behind the name, and whether there is any willingness to look for another name. It is not meant to be provocative. It is named after the animal and is not an acronym. It also provides a convenient name for the query language, Pig Latin. I'm sure that the developers would have some regret about changing the name, but if it were truly determined to be offensive then I believe there would be a willingness to change it. Doug Craig Russell Architect, Sun Java Enterprise System http://java.sun.com/products/jdo 408 276-5638 mailto:[EMAIL PROTECTED] P.S. A good JDO? O, Gasp!
Re: Incubator Proposal: Pig
On 9/24/07, Craig L Russell [EMAIL PROTECTED] wrote: Speaking just for myself, I find the name unusual but not offensive or even provocative. The fact that you wouldn't eat an animal doesn't mean you deny its existence... And they make good pets in many cultures and are generally acknowledged to be smart. I thought 'pig' sounds a bit weird for an OSS project, but certainly not offensive. Eelco
Re: Incubator Proposal: Pig
On Sep 24, 2007, at 8:22 PM, Craig L Russell wrote: Speaking just for myself, I find the name unusual but not offensive or even provocative. The fact that you wouldn't eat an animal doesn't mean you deny its existence... +1 I find it a fun name, and one unlikely to infringe on existing software trademarks. Andrus
Re: Incubator Proposal: Pig
Jim Jagielski wrote: In other words, if the corporate support of the project or podling went away, would they stop developing and working on the codebase because they, after all, had no allegiance in the code at all? Were they, in effect, coders-for-hire? Yes, this is a known risk, perhaps the largest risk for Pig's incubation. We must develop a diverse developer community so the project can survive the departure of any employer or individual. Yahoo! is aware of this, and seeks non-Yahoo! developers. This is the primary motivation to incubate. If Yahoo! wished to develop Pig alone, then it could simply continue to distribute it under a BSD license. And, yes, some developers may stop contributing when they change employment. But, in my experience, many others will be hired specifically for their experience with an Apache project. I see more careers built around Apache experience than short-term coders-for-hire. Today Yahoo! is hiring folks to work on Pig. Soon, hopefully, other companies will do so as well, if they're not already. There's no shame in being paid to work on Apache projects, is there? Doug
RE: Incubator Proposal: Pig
Doug Cutting wrote: Niclas Hedhman wrote: a) The name Pig is somewhat provocative (not kosher/halal) and I would like to hear the rationale behind the name, and whether there is any willingness to look for another name. It is not meant to be provocative. It is named after the animal and is not an acronym. As one of the residents around here who keeps Kosher, I don't find it offensive in that regard. I do, however, consider it an odd choice for a name, since referring to some software as a Pig is never intended to be a good thing. :-) I certainly don't want anyone calling my work product a pig! :-) Just pointing it out, not basing a vote on the name. :-) --- Noel
Re: Incubator Proposal: Pig
Actually we would really like to modify Ruby and Python to support Pig Latin as part of the language: Ubyray and Ythonpay. We have avoided creating our own crippled scripting language (there are just too many in the world), and instead hope to take an existing language and embed Pig Latin into it as a first-class part of the language. ben On Monday 24 September 2007, Craig L Russell wrote: Speaking just for myself, I find the name unusual but not offensive or even provocative. The fact that you wouldn't eat an animal doesn't mean you deny its existence... Of course, I thought the language was officially called igpay atinlay. Craig On Sep 24, 2007, at 10:15 AM, Doug Cutting wrote: Niclas Hedhman wrote: a) The name Pig is somewhat provocative (not kosher/halal) and I would like to hear the rationale behind the name, and whether there is any willingness to look for another name. It is not meant to be provocative. It is named after the animal and is not an acronym. It also provides a convenient name for the query language, Pig Latin. I'm sure that the developers would have some regret about changing the name, but if it were truly determined to be offensive then I believe there would be a willingness to change it. Doug Craig Russell Architect, Sun Java Enterprise System http://java.sun.com/products/jdo 408 276-5638 mailto:[EMAIL PROTECTED] P.S. A good JDO? O, Gasp!
Re: Incubator Proposal: Pig
Hey,

On 9/24/07, Doug Cutting [EMAIL PROTECTED] wrote:

Niclas Hedhman wrote:
a) The name Pig is somewhat provocative (not kosher/halal) and I would like to hear the rationale behind the name, and whether there is any willingness to look for another name.

It is not meant to be provocative. It is named after the animal and is not an acronym. It also provides a convenient name for the query language, Pig Latin. I'm sure that the developers would have some regret about changing the name, but if it were truly determined to be offensive then I believe there would be a willingness to change it.

As another Jewish person on the list, I don't find it the least bit provocative. It's a freakin' animal, that's all ;)

In fact, I'm a big fan of more fun names around here. Succubus, Imperius, Pig, all are great in my book. I'm tired of four-letter acronyms and packaging that's been through N layers of marketing analysis ;) Let's not do this: http://video.google.com/videoplay?docid=36099539665548298

Yoav
Re: Incubator Proposal: Pig
We could add rules indefinitely to make just about any name unusable... I don't have any issues with Pig.

Carl.

Yoav Shapira wrote:

Hey,

On 9/24/07, Doug Cutting [EMAIL PROTECTED] wrote:

Niclas Hedhman wrote:
a) The name Pig is somewhat provocative (not kosher/halal) and I would like to hear the rationale behind the name, and whether there is any willingness to look for another name.

It is not meant to be provocative. It is named after the animal and is not an acronym. It also provides a convenient name for the query language, Pig Latin. I'm sure that the developers would have some regret about changing the name, but if it were truly determined to be offensive then I believe there would be a willingness to change it.

As another Jewish person on the list, I don't find it the least bit provocative. It's a freakin' animal, that's all ;)

In fact, I'm a big fan of more fun names around here. Succubus, Imperius, Pig, all are great in my book. I'm tired of four-letter acronyms and packaging that's been through N layers of marketing analysis ;) Let's not do this: http://video.google.com/videoplay?docid=36099539665548298

Yoav
Re: Incubator Proposal: Pig
Doug Cutting wrote:

Niclas Hedhman wrote:
a) The name Pig is somewhat provocative (not kosher/halal) and I would like to hear the rationale behind the name, and whether there is any willingness to look for another name.

It is not meant to be provocative. It is named after the animal and is not an acronym. It also provides a convenient name for the query language, Pig Latin. I'm sure that the developers would have some regret about changing the name, but if it were truly determined to be offensive then I believe there would be a willingness to change it.

Not offensive at all, nobody has to eat the source code.

That said, is Igpay available as a project name and would the project entertain it? Now that would be a fun name :)

Bill
Re: Incubator Proposal: Pig
I don't find the name provocative either, but the connotations are a bit weird :-P

On Sep 24, 2007, at 10:39 AM, Noel J. Bergman [EMAIL PROTECTED] wrote:

Doug Cutting wrote:

Niclas Hedhman wrote:
a) The name Pig is somewhat provocative (not kosher/halal) and I would like to hear the rationale behind the name, and whether there is any willingness to look for another name.

It is not meant to be provocative. It is named after the animal and is not an acronym.

As one of the residents around here who keeps Kosher, I don't find it offensive in that regard. I do, however, consider it an odd choice for a name, since referring to some software as a Pig is never intended to be a good thing. :-) I certainly don't want anyone calling my work product a pig! :-)

Just pointing it out, not basing a vote on the name. :-)

--- Noel
Re: Incubator Proposal: Pig
On 9/24/07, Doug Cutting [EMAIL PROTECTED] wrote:

Jim Jagielski wrote:
In other words, if the corporate support of the project or podling went away, would they stop developing and working on the codebase because they, after all, had no allegiance in the code at all? Were they, in effect, coders-for-hire?

Yes, this is a known risk, perhaps the largest risk for Pig's incubation. We must develop a diverse developer community so the project can survive the departure of any employer or individual. Yahoo! is aware of this, and seeks non-Yahoo! developers. This is the primary motivation to incubate. If Yahoo! wished to develop Pig alone, then it could simply continue to distribute it under a BSD license.

true. any corporation can pick a license, host a public repository, build brand awareness and create a project where the source is open but the development is closed. the reason to approach apache is that we've had a reasonable track record in the difficult task of building healthy and open communities.

And, yes, some developers may stop contributing when they change employment. But, in my experience, many others will be hired specifically for their experience with an Apache project. I see more careers built around Apache experience than short-term coders-for-hire. Today Yahoo! is hiring folks to work on Pig. Soon, hopefully, other companies will do so as well, if they're not already. There's no shame in being paid to work on Apache projects, is there?

of course not, but there has been a definite shift over the years. committers paid to work full time on a project (as opposed to being allowed to work on the project in work time) are now more common. it seems comfortable when a long time contributor is hired to work full time on a project. now that so many volunteers are employed to code open source, it is perhaps inevitable that corporations will look to hire from outside this pool.

i'm not sure (though) that we've had enough time to digest this phenomenon to really understand its long term effects on community health. this is probably just an indication that it's time to start thinking...

- robert
Re: Incubator Proposal: Pig
Olga Natkovich wrote:
http://wiki.apache.org/incubator/PigProposal. We would like to ask that the ASF consider forming a podling according to the proposal.

Thanks for all the comments. I've seen no issues raised that should block Pig from entering incubation. Unless something arises before then, I will call a formal vote on the proposal tomorrow.

Doug
Re: Incubator Proposal: Pig
On Wednesday 19 September 2007 03:52, Olga Natkovich wrote:
We would like to ask that the ASF consider forming a podling according to the proposal.

+1, but I also have a couple of observations.

a) The name Pig is somewhat provocative (not kosher/halal) and I would like to hear the rationale behind the name, and whether there is any willingness to look for another name.

b) I can't say that I understand the technical merits of the proposal, and just see the headline "analyzing large data sets". And I would like to know the relationship with UIMA's statement "... analyze large volumes of unstructured information..." and hear whether there are overlaps, synergies and/or collaboration in view.

Cheers
--
Niclas Hedhman, Software Developer
I live here; http://tinyurl.com/2qq9er
I work here; http://tinyurl.com/2ymelc
I relax here; http://tinyurl.com/2cgsug
Re: Incubator Proposal: Pig
+1 -- I'd offer to help as much as I can, but I know how little that is right now :-( Definitely support (and will probably use at least ;-)

-Brian

On Sep 18, 2007, at 12:52 PM, Olga Natkovich wrote:

Hi,

Yahoo! research and development teams have developed a proposal below. The proposal is also available on wiki at http://wiki.apache.org/incubator/PigProposal.

We would like to ask that the ASF consider forming a podling according to the proposal.

Thanks,
Olga Natkovich
Re: Incubator Proposal: Pig
On 9/18/07, Olga Natkovich [EMAIL PROTECTED] wrote:
...We would like to ask that the ASF consider forming a podling according to the proposal

+1 to the proposal, and I'd be happy to help as a mentor.

-Bertrand
Re: Incubator Proposal: Pig
+1 from me, too. It looks very promising. :-)

On 9/21/07, Bertrand Delacretaz [EMAIL PROTECTED] wrote:

On 9/18/07, Olga Natkovich [EMAIL PROTECTED] wrote:
...We would like to ask that the ASF consider forming a podling according to the proposal

+1 to the proposal, and I'd be happy to help as a mentor.

-Bertrand

--
Regards, Petar!
Karlovo, Bulgaria.
EOOXML objections http://www.grokdoc.net/index.php/EOOXML_objections
Public PGP Key at: http://keyserver.linux.it/pks/lookup?op=get&search=0x1A15B53B761500F9
Key Fingerprint: AA16 8004 AADD 9C76 EF5B 4210 1A15 B53B 7615 00F9
Re: Incubator Proposal: Pig
+1

-Jim

On Sep 18, 2007, at 3:52 PM, Olga Natkovich wrote:

Hi,

Yahoo! research and development teams have developed a proposal below. The proposal is also available on wiki at http://wiki.apache.org/incubator/PigProposal.

We would like to ask that the ASF consider forming a podling according to the proposal.

Thanks,
Olga Natkovich

= Pig Open Source Proposal =

== Abstract ==

Pig is a platform for analyzing large data sets.

== Proposal ==

The Pig project consists of high-level languages for expressing data analysis programs, coupled with infrastructure for evaluating these
: : :
Re: Incubator Proposal: Pig
On Sep 18, 2007, at 9:52 PM, Olga Natkovich wrote:
Yahoo! research and development teams have developed a proposal below. The proposal is also available on wiki at http://wiki.apache.org/incubator/PigProposal. We would like to ask that the ASF consider forming a podling according to the proposal.
...
Pig is a platform for analyzing large data sets.

+1, looks cool!

...seems like your biggest challenge here is attracting a diverse developer community, and hopefully the apache incubation process will help you there...

cheers,

Leo Simons
--
http://www.leosimons.com/blog/
Re: Incubator Proposal: Pig
On 9/20/07, Leo Simons [EMAIL PROTECTED] wrote:

On Sep 18, 2007, at 9:52 PM, Olga Natkovich wrote:
Yahoo! research and development teams have developed a proposal below. The proposal is also available on wiki at http://wiki.apache.org/incubator/PigProposal. We would like to ask that the ASF consider forming a podling according to the proposal.
...
Pig is a platform for analyzing large data sets.

+1, looks cool!

...seems like your biggest challenge here is attracting a diverse developer community, and hopefully the apache incubation process will help you there...

+1

it's very important to focus on encouraging new developers in the neonate period of a project.

the energy required to let people know about a new project is often underestimated. the open source space is now much bigger and more diffuse than years ago, so it's not as easy for interesting projects and interested people to find each other any more.

stuff like blogging (www.planetapache.org aggregates many blogs written by apache committers) and podcasting (www.feathercast.org is an apache podcast) is useful but tends to reach only people who are already interested in apache. articles, grassroots meetings and conference talks are also important.

one of the black arts is trying to ensure that the right level of exposure is achieved. too much too early, before the development infrastructure is ready, leads to disappointment, but too little too late, when the project is too finished, means that there is less chance for meaningful contributions to be made.

- robert
Re: Incubator Proposal: Pig
On 20.09.2007, at 19:06, Robert Burrell Donkin wrote:

On 9/20/07, Leo Simons [EMAIL PROTECTED] wrote:

On Sep 18, 2007, at 9:52 PM, Olga Natkovich wrote:
Yahoo! research and development teams have developed a proposal below. The proposal is also available on wiki at http://wiki.apache.org/incubator/PigProposal. We would like to ask that the ASF consider forming a podling according to the proposal.
...
Pig is a platform for analyzing large data sets.

+1, looks cool!

...seems like your biggest challenge here is attracting a diverse developer community, and hopefully the apache incubation process will help you there...

+1

+1

Actually I would also be interested in stepping up as a mentor.

cheers
--
Torsten
Re: Incubator Proposal: Pig
Torsten Curdt wrote:
+1 Actually I would also be interested in stepping up as a mentor.

Thanks, that'd be great! Please add yourself to the proposal in the wiki.

Doug
Re: Incubator Proposal: Pig
Big +1! :)

Otis
Simpy -- http://www.simpy.com/ - Tag - Search - Share

- Original Message
From: Olga Natkovich [EMAIL PROTECTED]
To: general@incubator.apache.org
Sent: Tuesday, September 18, 2007 3:52:23 PM
Subject: Incubator Proposal: Pig

Hi,

Yahoo! research and development teams have developed a proposal below. The proposal is also available on wiki at http://wiki.apache.org/incubator/PigProposal.

We would like to ask that the ASF consider forming a podling according to the proposal.

Thanks,
Olga Natkovich
Re: Incubator Proposal: Pig
Done!

On 20.09.2007, at 19:46, Doug Cutting wrote:

Torsten Curdt wrote:
+1 Actually I would also be interested in stepping up as a mentor.

Thanks, that'd be great! Please add yourself to the proposal in the wiki.

Doug
Incubator Proposal: Pig
Hi,

Yahoo! research and development teams have developed a proposal below. The proposal is also available on the wiki at http://wiki.apache.org/incubator/PigProposal.

We would like to ask that the ASF consider forming a podling according to the proposal.

Thanks,
Olga Natkovich
mailto:[EMAIL PROTECTED]

= Pig Open Source Proposal =

== Abstract ==

Pig is a platform for analyzing large data sets.

== Proposal ==

The Pig project consists of high-level languages for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets.

At the present time, Pig's infrastructure layer consists of a compiler that produces sequences of Map-Reduce programs, for which large-scale parallel implementations already exist (e.g., the Hadoop subproject). Pig's language layer currently consists of a textual language called Pig Latin, which has the following key properties:

1. ''Ease of programming''. It is trivial to achieve parallel execution of simple, embarrassingly parallel data analysis tasks. Complex tasks comprised of multiple interrelated data transformations are explicitly encoded as data flow sequences, making them easy to write, understand, and maintain.
2. ''Optimization opportunities''. The way in which tasks are encoded permits the system to optimize their execution automatically, allowing the user to focus on semantics rather than efficiency.
3. ''Extensibility''. Users can create their own functions to do special-purpose processing.

== Background ==

Pig started as a research project at Yahoo! in May of 2006 to combine ideas in parallel databases and distributed computing. The first internal release took place in July 2006. The first release was a simple front-end to the Hadoop Map/Reduce framework. The following releases added new features and evolved the language based on user feedback. In July 2007, Pig was taken over by a development team and the first production version is due to be released on 9/28/07. Since its inception, we have observed a steady growth of the user community within Yahoo!.

In April 2007, Pig was released under a BSD-type license. Several external parties are using this version and have expressed interest in collaborating on its development.

== Rationale ==

In an information-centric world, innovation is driven by ad-hoc analysis of large data sets. For example, search engine companies routinely deploy and refine services based on analyzing the recorded behavior of users, publishers, and advertisers. The rate of innovation depends on the efficiency with which data can be analyzed.

To analyze large data sets efficiently, one needs parallelism. The cheapest and most scalable form of parallelism is cluster computing. Unfortunately, programming for a cluster computing environment is difficult and time-consuming. Pig makes it easy to harness the power of cluster computing for ad-hoc data analysis.

While other languages exist that try to achieve the same goals, we believe that Pig provides more flexibility and gives more control to the end user.

SQL typically requires (1) importing data from a user's preferred format into a database system's internal format, (2) well-structured, normalized data with a declared schema, and (3) programs expressed in declarative SELECT-FROM-WHERE blocks. In contrast, Pig Latin facilitates (1) interoperability, i.e. data may be read/written in a format accepted by other applications such as text editors or graph generators, (2) flexibility, i.e. data may be loosely structured or have structure that is defined operationally, and (3) adoption by programmers who find procedural programming more natural than declarative programming.

Sawzall is a scripting language used at Google on top of Map-Reduce. A Sawzall program has a fairly rigid structure consisting of a filtering phase (the map step) followed by an aggregation phase (the reduce step). Furthermore, only the filtering phase can be written by the user, and only a pre-built set of aggregations is available (new ones are non-trivial to add). While Pig Latin has similar higher-level primitives like filtering and aggregation, an arbitrary number of them can be flexibly chained together in a Pig Latin program, and all primitives can use user-defined functions with equal ease. Further, Pig Latin has additional primitives such as cogrouping, which allow operations such as joins (which require multiple programs in Sawzall) to be written in a single line in Pig Latin. Further, Pig Latin is designed to be embedded into other languages, and can use functions written in other languages. Thus, in contrast to Sawzall, it directly caters to a large community of developers
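
As a rough illustration of the cogrouping primitive the proposal mentions, the sketch below shows in plain Python how a cogroup collects records from two inputs under a shared key, and how an equi-join then falls out of it as a per-key cross product. It is not Pig Latin and is not taken from the Pig code base; the `cogroup` and `join` helpers and the sample `visits` and `ranks` data are hypothetical, chosen only to make the idea concrete.

```python
from collections import defaultdict


def cogroup(left, right, key_left, key_right):
    """Group two record lists by key.

    Returns {key: (records_from_left, records_from_right)}.
    Hypothetical helper for illustration only.
    """
    groups = defaultdict(lambda: ([], []))
    for rec in left:
        groups[key_left(rec)][0].append(rec)
    for rec in right:
        groups[key_right(rec)][1].append(rec)
    return dict(groups)


def join(left, right, key_left, key_right):
    """Equi-join expressed on top of cogroup: cross product within each key."""
    return [
        (l, r)
        for ls, rs in cogroup(left, right, key_left, key_right).values()
        for l in ls
        for r in rs
    ]


# Hypothetical sample data: (user, url) visits and (url, rank) scores.
visits = [("alice", "cnn.com"), ("bob", "bbc.co.uk"), ("alice", "bbc.co.uk")]
ranks = [("cnn.com", 0.9), ("bbc.co.uk", 0.8)]

grouped = cogroup(visits, ranks, lambda v: v[1], lambda r: r[0])
joined = join(visits, ranks, lambda v: v[1], lambda r: r[0])
```

Once the cogroup step exists, the join is a single expression over its output, which is roughly the convenience the proposal claims over a fixed filter-then-aggregate pipeline.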
Re: Incubator Proposal: Pig
Hey,

On 9/18/07, Olga Natkovich [EMAIL PROTECTED] wrote:
Yahoo! research and development teams have developed a proposal below. The proposal is also available on wiki at http://wiki.apache.org/incubator/PigProposal

Looks very cool to me. +1 to accepting Pig as an Incubator project. I'll also gladly volunteer as a mentor.

Yoav
Re: Incubator Proposal: Pig
Garrett Rooney wrote:
Is there any particular reason you want this podling to be sponsored by the Incubator PMC (which is generally done for projects that intend to turn into their own top level project) rather than having it sponsored by the Lucene PMC (where Hadoop currently resides)? It seems to me that the close relationship between Pig and Hadoop implies that they very well might best be served under the same roof.

The existing contributor base is largely disjoint from the Hadoop contributor base, and they expect that to mostly remain the case. Nigel, Owen and I, Hadoop committers, will mostly just help the Pig crew out with Apache ways, and don't expect to become significant contributors to Pig. Pig builds on Hadoop, and the communities may overlap a bit, but, to the primary folks involved, it feels like a separate community and they'd prefer to aim for a TLP.

Doug
Re: Incubator Proposal: Pig
Hey,

On 9/18/07, Doug Cutting [EMAIL PROTECTED] wrote:
overlap a bit, but, to the primary folks involved, it feels like a separate community and they'd prefer to aim for a TLP.

It should be clear to everyone involved, though, that part of the goal of incubation is to diversify the project's community so that it's not disjoint from everyone else. I hope to have a bunch of non-Yahoo people contributing to the project.

Yoav
Re: Incubator Proposal: Pig
Is there any particular reason you want this podling to be sponsored by the Incubator PMC (which is generally done for projects that intend to turn into their own top level project)

Is this true? I thought all new projects had to go through the incubator. Woden [1] is an incubator project that plans to graduate and join the WS PMC.

[1] http://incubator.apache.org/woden/

Lawrence

Garrett Rooney [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
09/18/2007 04:02 PM
Please respond to general@incubator.apache.org
To general@incubator.apache.org
cc
Subject Re: Incubator Proposal: Pig

On 9/18/07, Olga Natkovich [EMAIL PROTECTED] wrote:

Hi,

Yahoo! research and development teams have developed a proposal below. The proposal is also available on wiki at http://wiki.apache.org/incubator/PigProposal.

We would like to ask that the ASF consider forming a podling according to the proposal.

Is there any particular reason you want this podling to be sponsored by the Incubator PMC (which is generally done for projects that intend to turn into their own top level project) rather than having it sponsored by the Lucene PMC (where Hadoop currently resides)? It seems to me that the close relationship between Pig and Hadoop implies that they very well might best be served under the same roof.

-garrett
Re: Incubator Proposal: Pig
On 9/18/07, Lawrence Mandel [EMAIL PROTECTED] wrote:
>> Is there any particular reason you want this podling to be sponsored by the Incubator PMC (which is generally done for projects that intend to turn into their own top level project)
>
> Is this true? I thought all new projects had to go through the incubator. Woden [1] is an incubator project that plans to graduate and join the WS PMC.
>
> [1] http://incubator.apache.org/woden/

All projects need to go through the incubation process, but not all are sponsored by the incubator PMC, many are sponsored by an existing PMC outside the incubator. I don't recall for sure, but I'd expect that woden entered the incubator after being sponsored by the WS PMC (at least, that's how things tend to work today if I understand correctly, it's quite possible that woden predates that practice, I'm really not sure).

-garrett
Re: Incubator Proposal: Pig
On 9/18/07, Olga Natkovich [EMAIL PROTECTED] wrote:
> Hi,
>
> Yahoo! research and development teams have developed a proposal below. The proposal is also available on wiki at http://wiki.apache.org/incubator/PigProposal.
>
> We would like to ask that the ASF consider forming a podling according to the proposal.

Is there any particular reason you want this podling to be sponsored by the Incubator PMC (which is generally done for projects that intend to turn into their own top level project) rather than having it sponsored by the Lucene PMC (where Hadoop currently resides)? It seems to me that the close relationship between Pig and Hadoop implies that they very well might best be served under the same roof.

-garrett
Re: Incubator Proposal: Pig
+1 as well. I would be happy to help with code contributions and user testing.

Yoav Shapira-2 wrote:
> Hey,
>
> On 9/18/07, Olga Natkovich [EMAIL PROTECTED] wrote:
>> Yahoo! research and development teams have developed a proposal below. The proposal is also available on wiki at http://wiki.apache.org/incubator/PigProposal
>
> Looks very cool to me. +1 to accepting Pig as an Incubator project. I'll also gladly volunteer as a mentor.
>
> Yoav
Re: Incubator Proposal: Pig
On 9/18/07, Doug Cutting [EMAIL PROTECTED] wrote:
> Garrett Rooney wrote:
>> Is there any particular reason you want this podling to be sponsored by the Incubator PMC (which is generally done for projects that intend to turn into their own top level project) rather than having it sponsored by the Lucene PMC (where Hadoop currently resides)? It seems to me that the close relationship between Pig and Hadoop implies that they very well might best be served under the same roof.
>
> The existing contributor base is largely disjoint from the Hadoop contributor base, and they expect that to mostly remain the case. Nigel, Owen & I, Hadoop committers, will mostly just help the Pig crew out with Apache ways, and don't expect to become significant contributors to Pig. Pig builds on Hadoop, and the communities may overlap a bit, but, to the primary folks involved, it feels like a separate community and they'd prefer to aim for a TLP.

Well, it seems a little odd to me (if anything it seems like a new TLP associated with hadoop and generic distributed computing tools like pig that are built on top of hadoop seems like it would make more sense than just a pig TLP), but if that's what the people involved want I don't see anything wrong with it. I mean it's not like a decision on a final home for the project has to happen now anyway.

In any event, +1 from me, this is a neat project and I'd be happy to see it here.

-garrett
Re: Incubator Proposal: Pig
Garrett Rooney wrote:
> (if anything it seems like a new TLP associated with hadoop and generic distributed computing tools like pig that are built on top of hadoop seems like it would make more sense than just a pig TLP),

Yes, I agree. But that's not happened yet, and the Pig folks are ready to enter the incubator now.

> but if that's what the people involved want I don't see anything wrong with it. I mean it's not like a decision on a final home for the project has to happen now anyway.

Exactly. If, when Pig is ready to graduate, there is a more Hadoop-specific TLP, then it may make sense to have Pig join that as a sub-project, or it may not. But, for now, the folks involved have elected to aim for TLP rather than Lucene sub-project (the available options).

Doug
Re: Incubator Proposal: Pig
Yoav Shapira wrote:
> It should be clear to everyone involved, though, that part of the goal of incubation is to diversify the project's community so that it's not disjoint from everyone else. I hope to have a bunch of non-Yahoo people contributing to the project.

Indeed. That's the primary reason to move this to Apache: to be able to collaborate with others outside Y!. If Y! didn't want to diversify the community it could just keep posting code dumps under BSD as it does today.

Doug
Re: Incubator Proposal: Pig
On Sep 18, 2007, at 2:45 PM, Doug Cutting wrote:
> Garrett Rooney wrote:
>> (if anything it seems like a new TLP associated with hadoop and generic distributed computing tools like pig that are built on top of hadoop seems like it would make more sense than just a pig TLP),
>
> Yes, I agree. But that's not happened yet, and the Pig folks are ready to enter the incubator now.
>
>> but if that's what the people involved want I don't see anything wrong with it. I mean it's not like a decision on a final home for the project has to happen now anyway.
>
> Exactly. If, when Pig is ready to graduate, there is a more Hadoop-specific TLP, then it may make sense to have Pig join that as a sub-project, or it may not.

I agree. One of the objectives of incubation is to decide exactly where in Apache a project (or sub-project) belongs. Let's get it into incubation and see what kind of community/synergy Pig can build with other projects.

Craig

> But, for now, the folks involved have elected to aim for TLP rather than Lucene sub-project (the available options).
>
> Doug

Craig Russell
Architect, Sun Java Enterprise System http://java.sun.com/products/jdo
408 276-5638 mailto:[EMAIL PROTECTED]
P.S. A good JDO? O, Gasp!