A couple of years ago we had this concept that Pig, as is, should be
able to run on other backends (like, say, Dryad if it were open
source). So we built this whole backend interface and (mostly) kept
Hadoop-specific objects out of the front end.
Recently we have modified that stance and said that this implementation
of Pig is Hadoop-specific. Pig Latin itself will still stay
Hadoop-independent. So the ability to have multiple backends is fine.
But the ability to have non-Hadoop backends is not really interesting
now.
So I at least see the proposal here as getting rid of generic code
that tries to hide the fact that we are working on top of Hadoop
(things like DataStorage and ExecutionEngine).
Alan.
On Apr 22, 2010, at 4:14 PM, Arun C Murthy wrote:
I read it as getting rid of concepts parallel to Hadoop in
src/org/apache/pig/backend/hadoop/datastorage.
Is that true?
thanks,
Arun
On Apr 22, 2010, at 1:34 PM, Dmitriy Ryaboy wrote:
I kind of dig the concept of being able to plug in a different backend,
though I definitely think we should get rid of the dead local mode
code. Can you give an example of how this will simplify the codebase?
Is it more than just GenericClass foo = new SpecificClass(), and the
associated extra files?
-D
On Thu, Apr 22, 2010 at 1:25 PM, Arun C Murthy wrote:
+1
Arun
On Apr 22, 2010, at 11:35 AM, Richard Ding wrote:
Pig has an abstraction layer (interfaces and abstract classes) to
support multiple execution engines. After PIG-1053, Hadoop is the only
execution engine supported by Pig. I wonder if we should remove this
layer of code and make Hadoop THE execution engine for Pig. This will
simplify the backend code a lot.
Thanks,
-Richard