A couple of years ago we had this concept that Pig, as is, should be able to run on other backends (like, say, Dryad, if it were open source). So we built this whole backend interface and (mostly) kept Hadoop-specific objects out of the front end.

Recently we have modified that stance and said that this implementation of Pig is Hadoop-specific. Pig Latin itself will still stay Hadoop-independent. So the ability to have multiple backends is fine. But the ability to have non-Hadoop backends is not really interesting now.

So I at least see the proposal here as getting rid of generic code that tries to hide the fact that we are working on top of Hadoop (things like DataStorage and ExecutionEngine).

Alan.

On Apr 22, 2010, at 4:14 PM, Arun C Murthy wrote:

I read it as getting rid of concepts parallel to Hadoop in src/org/apache/pig/backend/hadoop/datastorage.

Is that true?

thanks,
Arun

On Apr 22, 2010, at 1:34 PM, Dmitriy Ryaboy wrote:

I kind of dig the concept of being able to plug in a different backend, though I definitely think we should get rid of the dead local-mode code. Can you give an example of how this will simplify the codebase? Is it more than just GenericClass foo = new SpecificClass(), and the associated extra files?

-D
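
For readers following the thread, the "GenericClass foo = new SpecificClass()" pattern asked about above can be sketched roughly as below. Every name here (GenericStore, HadoopStore, DirectHadoopStore, BackendSketch, the "hdfs:" prefix) is hypothetical and chosen only for illustration; none of it is taken from Pig's actual classes. The sketch assumes a single implementation behind a generic interface and shows what calling code looks like with and without that layer.

```java
// Hypothetical names throughout; these are NOT Pig's actual classes.

// With an abstraction layer: callers program against a generic interface,
// even though only one implementation exists.
interface GenericStore {
    String resolve(String path);
}

class HadoopStore implements GenericStore {
    @Override
    public String resolve(String path) {
        return "hdfs:" + path; // illustrative prefix only
    }
}

// Without the layer: the single real implementation is used directly,
// and the interface file disappears.
class DirectHadoopStore {
    String resolve(String path) {
        return "hdfs:" + path;
    }
}

class BackendSketch {
    // The "GenericClass foo = new SpecificClass()" form.
    static String viaInterface(String path) {
        GenericStore store = new HadoopStore();
        return store.resolve(path);
    }

    // The form left over once the generic layer is removed.
    static String direct(String path) {
        return new DirectHadoopStore().resolve(path);
    }

    public static void main(String[] args) {
        System.out.println(viaInterface("/data/in"));
        System.out.println(direct("/data/in"));
    }
}
```

Both call paths return the same value, so in a toy case like this the saving is one interface and one level of indirection. Whether the real saving in Pig is larger than that is exactly the question raised in this message.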

On Thu, Apr 22, 2010 at 1:25 PM, Arun C Murthy wrote:

+1

Arun


On Apr 22, 2010, at 11:35 AM, Richard Ding wrote:

Pig has an abstraction layer (interfaces and abstract classes) to support multiple execution engines. After PIG-1053, Hadoop is the only execution engine supported by Pig. I wonder if we should remove this layer of code and make Hadoop THE execution engine for Pig. This will simplify the backend code a lot.



Thanks,

-Richard






