In general, it would be really, really nice if it were easy to build
abstract Pig syntax trees outside of the normal parser.

For instance, I find the fact that Pig is not a full-scale scripting
language incredibly confining.  I would love to be able to build a DSL in
Groovy that let me use Groovy for scripting, but still execute Pig jobs
easily.  If I could build Pig syntax trees easily, then I would be, as they
say, in pig heaven.

That would also let the switch to a different parsing technology happen
gradually rather than all at once.  Two different Grunt interpreters could
coexist for a short time while the new one is proved out.

On Thu, Feb 12, 2009 at 3:58 PM, Olga Natkovich wrote:

> Pig Developers,
> Pig currently uses javacc for parsing pig commands. We have found
> several shortcomings with using javacc. In particular,
> (1) Lack of good documentation, which makes it hard and time consuming
> to learn javacc and make changes to the Pig grammar
> (2) No easy way to customize error handling and error messages
> (3) Single path that performs both tokenizing and parsing
> We are considering using JFlex and Cup, which are Java versions of Lex
> and Bison, instead. The main advantage of this transition is a proven,
> well-known, and well-understood technology and input format. In addition,
> it addresses the issues stated above.
> One problem with the transition is that JFlex and Cup have a GPL license
> that is not compatible with the Apache license. The workaround could be
> that we don't commit the tools into SVN, and instead developers who need
> to update the grammar would install them on their own. Note that we can
> commit the input grammar as well as the generated parser output into SVN,
> which means that for developers just compiling code or making non-parser
> changes, there will be no impact.
> Please comment on whether you think this is a reasonable change.
> Thanks,
> Olga

Ted Dunning, CTO
4600 Bohannon Drive, Suite 220
Menlo Park, CA 94025
650-324-0110, ext. 738
858-414-0013 (m)
