Re: switching to different parser in Pig

Alan Gates Tue, 17 Feb 2009 08:53:07 -0800

Ted,

If I understand your comments correctly, you aren't chiming in on whether
we should switch parsers, just that you would like there to be a
published interface of what pig latin syntax trees look like so you
could generate them in other tools and then feed them into pig. Is
that correct? So whether we switch parsing technologies or not is not
of interest to you, only the interfaces we expose?


Alan.

On Feb 12, 2009, at 4:42 PM, Ted Dunning wrote:

In general, it would be really, really nice if it were easy to build
abstract Pig syntax trees outside of the normal parser.

For instance, I find the fact that pig is not a full scale scripting
language incredibly confining. I would love to be able to build a DSL in
groovy that let me use groovy for scripting, but still execute pig jobs
easily. If I could build Pig syntax trees easily, then I would be, as they
say, in pig heaven.

That would also let the switch to a different parsing technology happen
gradually rather than all at once. Two different grunt interpreters could
coexist for a short time while the new one is proved out.
On Thu, Feb 12, 2009 at 3:58 PM, Olga Natkovich <ol...@yahoo-inc.com> wrote:
Pig Developers,

Pig currently uses javacc for parsing pig commands. We have found
several shortcomings with using javacc. In particular,
(1) Lack of good documentation, which makes it hard and time-consuming
to learn javacc and make changes to the Pig grammar
(2) No easy way to customize error handling and error messages
(3) Single path that performs both tokenizing and parsing
We are considering using JFlex and Cup, which are Java versions of Lex
and Bison, instead. The main advantage of this transition is a proven,
well-known, and well-understood technology and input format. In addition,
it addresses the issues stated above.
One problem with the transition is that JFlex and Cup have a GPL license
that is not compatible with the Apache license. The workaround could be
that we don't commit the tools into SVN and instead developers who need to
update the grammar would install them on their own. Note that we can
commit the input grammar as well as the output of the grammar into SVN,
which means that for developers just compiling code or making non-parser
changes, there will be no impact.

Please comment on whether you think this is a reasonable change.

Thanks,

Olga
--
Ted Dunning, CTO
DeepDyve
4600 Bohannon Drive, Suite 220
Menlo Park, CA 94025
www.deepdyve.com
650-324-0110, ext. 738
858-414-0013 (m)
