To be clear, going to 1.0 is not about having a certain set of
features. It is about stability and usability. When a project
declares itself 1.0 it is making some guarantees regarding the
stability of its interfaces (in Pig's case this is Pig Latin, UDFs,
and command line usage). It is also declaring itself ready for the
world at large, not just the brave and the free. New features can
come in as experimental once we're 1.0, but the semantics of the
language and UDFs can't keep shifting (as they have over the last several
releases and will continue to do for a bit, I think).
With that in mind, further comments inlined.
On Jun 24, 2009, at 10:18 AM, Dmitriy Ryaboy wrote:
Meaning do we need to reach a certain speed before 1.0? I don't think
so. Pig is fast enough now that many people find it useful. We want
to continue working to shrink the gap between Pig and MR, but I don't
see this as a blocker for 1.0.
Alan, any thoughts on performance baselines and benchmarks?
If we were debating today whether to go 1.0, I agree that we would not
wait for SQL. But given that we aren't (at least I wouldn't vote for
it now) and that SQL will be in soon, it will need to stabilize.
I am a little surprised that you think SQL is a requirement for 1.0;
it's essentially an overlay, not core functionality.
To be clear, the Zebra (columnar store stuff) is not a rewrite of the
storage layer. It is an additional storage option we want to
support. We aren't changing current support for load and store.
What about the storage layer rewrite (or is that what you referred
to in your first bullet point)?
Also, the subject of making more (or all) operators nestable within a
foreach comes up now and then. Would you consider this important,
or something that can wait?
This would be an added feature, not a semantic change in Pig Latin.
Integration with other languages (a-la PyPig)?
Again, this is a new feature, not a stability issue.
Agreed. Olga has given me the task of updating this soon. I'm going
to try to get to that over the next couple of weeks. This discussion
will certainly provide input to that update.
The Roadmap on the Wiki is still "as of Q3 2007"... which makes it hard
for an outside contributor to know where to jump in :-).