Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 
notification.

The following page has been changed by Shravan Narayanamurthy:
http://wiki.apache.org/pig/PigContextAbuse

------------------------------------------------------------------------------
  == Current roles of PigContext ==
   1. It has a set of properties
   2. Maintains handles to the DataStores both DFS and LFS
-  3. It is tightly integrated with the jar registration on object instantiation
+  3. It is tightly integrated with the jar registration and object 
instantiation
   4. Even has methods that rename & copy files
   5. Maintains a reference to the current JobConf that is being executed
+  6. Maintains a reference to the execution engine
  
  == Changes I would like to see ==
- It should basically just be a set of properties. But the object instantiation 
is tightly coupled with other parts. So we should probably do it later. Parts 
that can be easily separated out are the DataStores. Classes that want to 
create handles to DataStore should just use the PigContext instance for 
properties and create the DataStores internally instead of depending on 
PigContext to provide the handles. Also it is probably not a good idea to make 
PigContext Singleton or a class with static methods because if we move to 
server model where we can have multiple backends to which pig has to talk, we 
can create different instances of PigContext one per backend and use that to 
connect to the backend. With this the object would become lighter. If we can 
decouple the object instantiation logic too that would be ideal.
+ It should basically just be a set of properties. Parts that can be easily 
separated out are the DataStores and the handle to the execution engine. 
Classes that want handles to a DataStore should use the PigContext instance 
only for its properties and create the DataStores internally instead of 
depending on PigContext to provide the handles. The execution engine is 
something that PigServer should maintain. Better still, PigServer could keep 
a mapping between the execution engines and the PigContext that was used to 
invoke each of them. Also, it is probably not a good idea to make PigContext a 
Singleton or a class with static methods, because if we move to a server model 
where Pig has to talk to multiple backends, we need to create a different 
instance of PigContext per backend and use it to connect to that backend. 
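
A minimal sketch of the separation described above, not Pig's actual API: the context holds nothing but properties, is not a singleton, and a consumer builds its own DataStore from those properties. Every class name here (PigContextSketch, DataStoreSketch, DataStoreFactorySketch, LoaderSketch) and both property keys ("store.type", "store.uri") are hypothetical, chosen only for illustration.

```java
import java.util.Properties;

// Hypothetical stand-in for a DataStore handle (DFS or LFS); not Pig's real interface.
interface DataStoreSketch {
    String describe();
}

final class DataStoreFactorySketch {
    // Builds a store from properties alone. "store.type" and "store.uri" are assumed keys.
    static DataStoreSketch fromProperties(Properties props) {
        String type = props.getProperty("store.type", "local");
        String uri = props.getProperty("store.uri", "file:///");
        return () -> type + "@" + uri;
    }
}

// A slimmed-down context: only properties, with one instance per backend.
final class PigContextSketch {
    private final Properties properties = new Properties();

    PigContextSketch(Properties initial) {
        properties.putAll(initial);
    }

    Properties getProperties() {
        return properties;
    }
}

// A consumer that creates its DataStore internally instead of asking the context for a handle.
final class LoaderSketch {
    private final DataStoreSketch store;

    LoaderSketch(PigContextSketch context) {
        this.store = DataStoreFactorySketch.fromProperties(context.getProperties());
    }

    String storeDescription() {
        return store.describe();
    }
}
```

Because nothing in the sketch is static, two PigContextSketch instances with different properties can coexist, which is what the multiple-backend server model would need.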
  
+ Object instantiation is tightly coupled with other parts. We should add 
another utility class that does only object instantiation, driven by the 
properties inside PigContext, such as the extra jars that were added during the 
execution of Pig. With this, in fact, the PigContext class becomes redundant. 
Its state can be maintained in PigServer as a mapping from each execution 
engine to a Properties object, and PigServer can have a variable 
currentExecEngine which points to an entry in the mapping. So all accesses to 
PigContext can be replaced by accesses to a Properties object. However, this 
would mean that PigServer will be the starting point for any operation. I guess 
that is the way it should be.
+ 
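One way the proposal above could look, as a rough sketch under stated assumptions rather than the actual Pig design: a utility class that instantiates objects using only a Properties object, and a server that keeps one Properties per execution engine plus a currentExecEngine pointer. The names InstantiatorSketch and PigServerSketch, the use of a String as the engine key, and the "extra.jars" property key (a comma-separated list of jar paths) are all hypothetical.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical utility: object instantiation driven only by properties.
final class InstantiatorSketch {
    static Object instantiate(String className, Properties props) throws Exception {
        // "extra.jars" is an assumed key holding comma-separated jar paths.
        List<URL> urls = new ArrayList<>();
        for (String jar : props.getProperty("extra.jars", "").split(",")) {
            if (!jar.trim().isEmpty()) {
                urls.add(Paths.get(jar.trim()).toUri().toURL());
            }
        }
        // The loader is left open on purpose so loaded classes stay usable.
        ClassLoader loader = new URLClassLoader(
                urls.toArray(new URL[0]), InstantiatorSketch.class.getClassLoader());
        return Class.forName(className, true, loader).getDeclaredConstructor().newInstance();
    }
}

// Hypothetical server: one Properties object per execution engine, no PigContext.
final class PigServerSketch {
    private final Map<String, Properties> propsByEngine = new HashMap<>();
    private String currentExecEngine; // key of the active entry in the mapping

    void registerEngine(String engineName, Properties props) {
        propsByEngine.put(engineName, props);
        if (currentExecEngine == null) {
            currentExecEngine = engineName; // first registered engine becomes current
        }
    }

    void useEngine(String engineName) {
        if (!propsByEngine.containsKey(engineName)) {
            throw new IllegalArgumentException("unknown engine: " + engineName);
        }
        currentExecEngine = engineName;
    }

    Properties currentProperties() {
        if (currentExecEngine == null) {
            throw new IllegalStateException("no execution engine registered");
        }
        return propsByEngine.get(currentExecEngine);
    }

    // Every operation starts at the server, which supplies the right properties.
    Object instantiate(String className) throws Exception {
        return InstantiatorSketch.instantiate(className, currentProperties());
    }
}
```

In this sketch every caller goes through PigServerSketch to reach the properties of the active engine, which matches the note that PigServer would become the starting point for any operation.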
