Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 

The following page has been changed by Shravan Narayanamurthy:

New page:
= Use, or rather abuse, of PigContext =
The current Pig code uses PigContext as a generic placeholder for anything 
and everything, to the extent that it even has rename and copy methods. This is 
confusing, and the class does not stick to the role its name implies. Below is 
the role I would prefer for PigContext and how I would like to see it used.

== Role of PigContext ==
From the name I infer that the class should be a context in which Pig 
executes. So this should ideally just be a set of properties.
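
A minimal sketch of what "just a set of properties" could look like. The class name PropertiesOnlyContext and the key "exec.type" are invented for illustration and are not taken from the Pig code base.

```java
import java.util.Properties;

// Hypothetical sketch: a context that is nothing more than a set of
// properties. This is not Pig's actual PigContext.
final class PropertiesOnlyContext {
    private final Properties properties = new Properties();

    PropertiesOnlyContext(Properties initial) {
        // Copy the entries so later changes to the caller's object
        // do not leak into this context.
        properties.putAll(initial);
    }

    // Returns the value for the key, or null if the key is absent.
    String getProperty(String key) {
        return properties.getProperty(key);
    }

    // Returns the value for the key, or defaultValue if the key is absent.
    String getProperty(String key, String defaultValue) {
        return properties.getProperty(key, defaultValue);
    }
}
```

A context like this has no file operations, no DataStore handles and no job state; callers only read configuration from it.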

== Current roles of PigContext ==
 1. It has a set of properties
 2. Maintains handles to the DataStores, both DFS and LFS
 3. It is tightly integrated with the jar registration on object instantiation
 4. Even has methods that rename & copy files
 5. Maintains a reference to the current JobConf that is being executed

== Changes I would like to see ==
PigContext should basically just be a set of properties. However, the object 
instantiation logic is tightly coupled with other parts of the code, so we 
should probably separate that out later. The DataStores can be separated out 
easily: classes that need a DataStore handle should use the PigContext 
instance only for its properties and create the DataStore internally, instead 
of depending on PigContext to provide the handle. With this change the object 
would become lighter. It is also probably not a good idea to make PigContext a 
singleton or a class with static methods: if we move to a server model in 
which Pig has to talk to multiple backends, we can create one PigContext 
instance per backend and use it to connect to that backend. If we can decouple 
the object instantiation logic too, that would be ideal.
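
The following is a rough sketch of the two ideas above: a class creating its own DataStore handle from the context's properties, and one context instance per backend. All names here (BackendContext, StoreHandle, StoreClient, and the key "store.uri") are hypothetical stand-ins and do not correspond to existing Pig classes or settings.

```java
import java.util.Properties;

// Hypothetical stand-in for a context that only carries properties.
final class BackendContext {
    private final Properties properties = new Properties();

    BackendContext(Properties initial) {
        properties.putAll(initial);
    }

    String getProperty(String key, String defaultValue) {
        return properties.getProperty(key, defaultValue);
    }
}

// Hypothetical stand-in for a DataStore handle.
final class StoreHandle {
    final String uri;

    StoreHandle(String uri) {
        this.uri = uri;
    }
}

// A class that needs a store builds the handle itself from the
// context's properties instead of asking the context for a handle.
final class StoreClient {
    private final StoreHandle store;

    StoreClient(BackendContext context) {
        // "store.uri" is an invented key; falls back to the local file system.
        this.store = new StoreHandle(context.getProperty("store.uri", "file:///"));
    }

    String storeUri() {
        return store.uri;
    }
}

final class PerBackendDemo {
    public static void main(String[] args) {
        // One context instance per backend, no singleton and no statics.
        Properties clusterProps = new Properties();
        clusterProps.setProperty("store.uri", "hdfs://cluster-a:9000/");
        BackendContext clusterContext = new BackendContext(clusterProps);
        BackendContext localContext = new BackendContext(new Properties());

        System.out.println(new StoreClient(clusterContext).storeUri());
        System.out.println(new StoreClient(localContext).storeUri());
    }
}
```

Because each client derives its handle from whichever context it is given, two backends can coexist in one process without shared static state.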
