To add to what Alan said -- one other implementation did exist (local, in-memory mode), and it was consistently just different enough from MapReduce (MR) mode to cause all kinds of confusion and bugs for users. We also noticed that our abstractions wound up being partial wrappers of Hadoop's abstractions, and less extensible at that. My feeling is that Hadoop already has quite a few abstraction layers that let you implement your own input splits, file systems, etc., so sneaking in a different system could be done at that layer.
You could try to add a new mode and intercept the output of the logical planner, when the job is still represented as a DAG of logical operators; you'd have to write all of your own physical operators, but it could be done. It'd be a fair amount of work.

D

On Wed, Dec 14, 2011 at 10:19 AM, Alan Gates <[email protected]> wrote:
> The issue was that trying to be non-backend specific forced us to write
> abstractions for all of Hadoop's features rather than use them directly.
> For example, we had to have an abstract notion of a file system rather
> than use HDFS, an abstract notion of input rather than use InputFormats.
> This caused efficiency issues for the runtime system and made it much
> harder for developers to understand what was going on in the code. Given
> that no other implementations existed, it did not make sense to pay this
> price.
>
> Alan.
>
> On Dec 14, 2011, at 8:13 AM, Tharindu Mathew wrote:
>
> > Hi Gianmarco,
> >
> > Thanks for the response...
> >
> > I do understand that Pig Latin is not coupled with anything. Hence, the
> > question for use in another framework.
> >
> > I was referring to what you said secondly. It would be great to run Pig as
> > a Pig Latin Executor without Hadoop.
> >
> > Why was such an extension point a burden? It seems like the right thing to
> > do in terms of design.
> >
> > On Wed, Dec 14, 2011 at 7:29 PM, Gianmarco De Francisci Morales <
> > [email protected]> wrote:
> >
> >> Hi,
> >> Pig Latin (the language) is not coupled to Hadoop and can be used on any
> >> other framework.
> >> Pig (the system), however, is tightly coupled to Hadoop.
> >> Extension points existed but were removed because they were a major burden.
> >> That said, it is theoretically possible to build another Pig Latin executor
> >> that runs on a different framework.
> >>
> >> Cheers,
> >> --
> >> Gianmarco
> >>
> >>
> >>
> >> On Wed, Dec 14, 2011 at 13:17, Tharindu Mathew <[email protected]>
> >> wrote:
> >>
> >>> I understand that Pig Latin is a data flow language. In that sense it
> >>> should be theoretically possible to execute Pig Latin in any framework,
> >>> though currently it is meant to be executed in a Hadoop environment.
> >>> How hard would it be to switch Pig Latin to run on a different framework?
> >>> Are there any extension points for this, if at all, or is Pig Latin
> >>> tightly coupled to Hadoop?
> >>>
> >>> --
> >>> Regards,
> >>>
> >>> Tharindu
> >>>
> >>> blog: http://mackiemathew.com/
> >>>
> >>
> >
> >
> >
> > --
> > Regards,
> >
> > Tharindu
> >
> > blog: http://mackiemathew.com/
> >
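
To make the "intercept the logical plan" suggestion at the top of this message a bit more concrete, here is a toy sketch of the general shape. It is not Pig's code or API: `LogicalOp`, `MemLoad`, `MemFilter`, `MemStore` and `compile_plan` are hypothetical names invented for illustration, and a real backend would have to cover every operator Pig's planner can emit (joins, group-by, and so on), which is where the "fair amount of work" comes from.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


# Logical layer: a stand-in for what a logical planner might hand over.
@dataclass
class LogicalOp:
    kind: str                                  # "load", "filter" or "store"
    inputs: List["LogicalOp"] = field(default_factory=list)
    arg: object = None                         # rows, predicate, or sink list


# Physical layer: what a new (here: in-memory) backend has to supply.
class PhysicalOp:
    def rows(self) -> Iterator[tuple]:
        raise NotImplementedError


class MemLoad(PhysicalOp):
    def __init__(self, data):
        self.data = data

    def rows(self):
        return iter(self.data)


class MemFilter(PhysicalOp):
    def __init__(self, child, pred):
        self.child, self.pred = child, pred

    def rows(self):
        return (r for r in self.child.rows() if self.pred(r))


class MemStore(PhysicalOp):
    def __init__(self, child, sink):
        self.child, self.sink = child, sink

    def rows(self):
        for r in self.child.rows():
            self.sink.append(r)                # "store" just appends to a list
            yield r


def compile_plan(op, cache=None):
    """Translate a logical DAG bottom-up into physical operators.

    The cache maps each logical node to its physical operator, so a node
    reachable through several paths is translated only once.
    """
    cache = {} if cache is None else cache
    if id(op) in cache:
        return cache[id(op)]
    children = [compile_plan(i, cache) for i in op.inputs]
    if op.kind == "load":
        phys = MemLoad(op.arg)
    elif op.kind == "filter":
        phys = MemFilter(children[0], op.arg)
    elif op.kind == "store":
        phys = MemStore(children[0], op.arg)
    else:
        raise ValueError(f"no physical operator for {op.kind!r}")
    cache[id(op)] = phys
    return phys


def run_example():
    out = []
    load = LogicalOp("load", arg=[("a", 1), ("b", 5), ("c", 9)])
    filt = LogicalOp("filter", [load], lambda r: r[1] > 3)
    store = LogicalOp("store", [filt], out)
    for _ in compile_plan(store).rows():       # drain the pipeline
        pass
    return out
```

The point of the sketch is only the split: the logical DAG knows nothing about the backend, and all backend-specific behaviour lives in the physical operators that `compile_plan` chooses.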
