On Dec 15, 2011, at 7:24 AM, Tharindu Mathew wrote:

> Thanks folks. You've been really helpful.
>
> Just to clear my doubts, does the local mode in [1] refer to the Hadoop
> local mode?

Yes, since 0.6 or 0.7 we use Hadoop's LocalJobRunner to execute local mode queries.
Alan.

>
> [1] - http://pig.apache.org/docs/r0.9.1/cont.html#embed-java
>
> On Thu, Dec 15, 2011 at 12:36 AM, Dmitriy Ryaboy <[email protected]> wrote:
>
>> To add to what Alan said -- one other implementation existed (local,
>> in-memory mode), and it was consistently just different enough from MR mode
>> to cause all kinds of confusion and bugs for users. We also noticed that
>> our abstractions wound up being partial wrappers of Hadoop's abstractions,
>> and less extensible at that. My feeling is that Hadoop already has quite a
>> few abstraction layers that allow one to implement his own input splits,
>> file systems, etc, so sneaking in a different system could be done at that
>> layer.
>>
>> You could try to add a new mode and intercept the output of the logical
>> planner, when the job is still represented as a DAG of logical operators;
>> you'd have to write all of your own physical operators, but it could be
>> done. It'd be a fair amount of work.
>>
>> D
>>
>> On Wed, Dec 14, 2011 at 10:19 AM, Alan Gates <[email protected]>
>> wrote:
>>
>>> The issue was that trying to be non-backend specific forced us to write
>>> abstractions for all of Hadoop's features rather than use them directly.
>>> For example, we had to have an abstract notion of a file system rather
>>> than use HDFS, an abstract notion of input rather than use InputFormats.
>>> This caused efficiency issues for the runtime system and made it much
>>> harder for developers to understand what was going on in the code. Given
>>> that no other implementations existed, it did not make sense to pay this
>>> price.
>>>
>>> Alan.
>>>
>>> On Dec 14, 2011, at 8:13 AM, Tharindu Mathew wrote:
>>>
>>>> Hi Gianmarco,
>>>>
>>>> Thanks for the response...
>>>>
>>>> I do understand that Pig Latin is not coupled with anything. Hence the
>>>> question about using it in another framework.
>>>>
>>>> I was referring to your second point. It would be great to run Pig as
>>>> a Pig Latin Executor without Hadoop.
>>>>
>>>> Why was such an extension point a burden? It seems like the right thing to
>>>> do in terms of design.
>>>>
>>>> On Wed, Dec 14, 2011 at 7:29 PM, Gianmarco De Francisci Morales <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi,
>>>>> Pig Latin (the language) is not coupled to Hadoop and can be used on any
>>>>> other framework.
>>>>> Pig (the system), however, is tightly coupled to Hadoop.
>>>>> Extension points existed but were removed because they were a major burden.
>>>>> That said, it is theoretically possible to build another Pig Latin executor
>>>>> that runs on a different framework.
>>>>>
>>>>> Cheers,
>>>>> --
>>>>> Gianmarco
>>>>>
>>>>> On Wed, Dec 14, 2011 at 13:17, Tharindu Mathew <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I understand that Pig Latin is a data flow language. In that sense it
>>>>>> should be theoretically possible to execute Pig Latin in any framework,
>>>>>> though currently it is meant to be executed in a Hadoop environment.
>>>>>> How hard would it be to switch Pig Latin to run on a different framework?
>>>>>> Are there any extension points for this, or is Pig Latin tightly
>>>>>> coupled to Hadoop?
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>>
>>>>>> Tharindu
>>>>>>
>>>>>> blog: http://mackiemathew.com/
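
Dmitriy's suggestion in the thread (intercept the plan while it is still a DAG of logical operators, then supply your own physical operators) can be illustrated with a toy sketch. Nothing below is Pig's API: the `LogicalOp` class, the operator kinds, the `PHYSICAL` table, and `execute` are all hypothetical names invented for illustration, and the "backend" is just plain Python lists. It only shows the shape of the idea, not how Pig's planner actually represents or hands off its plans.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class LogicalOp:
    """Hypothetical logical-plan node: an operator kind, its argument, its inputs."""
    kind: str                      # "load", "filter", or "foreach"
    arg: Any = None
    inputs: List["LogicalOp"] = field(default_factory=list)


# Hypothetical in-memory "physical operators", one per logical operator kind.
def phys_load(arg, upstream):
    return list(arg)               # arg is already an in-memory list of tuples


def phys_filter(arg, upstream):
    return [row for row in upstream[0] if arg(row)]


def phys_foreach(arg, upstream):
    return [arg(row) for row in upstream[0]]


PHYSICAL = {"load": phys_load, "filter": phys_filter, "foreach": phys_foreach}


def execute(op):
    """Evaluate inputs first, then run this node's physical operator on their results.

    No memoization: a node shared by two consumers would be evaluated twice.
    """
    upstream = [execute(child) for child in op.inputs]
    return PHYSICAL[op.kind](op.arg, upstream)


# Roughly: A = LOAD ...; B = FILTER A BY $1 > 1; C = FOREACH B GENERATE $0;
a = LogicalOp("load", [("x", 1), ("y", 2), ("z", 3)])
b = LogicalOp("filter", lambda row: row[1] > 1, [a])
c = LogicalOp("foreach", lambda row: (row[0],), [b])
result = execute(c)
print(result)  # [('y',), ('z',)]
```

The "fair amount of work" Dmitriy mentions is everything this sketch leaves out: a real backend would need a physical operator for every logical operator (joins, groups, sorts, and so on), plus its own loading, storage, and scheduling.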
