Hi Alan et al,
Comments inline.
Here's my thinking on this, though I don't speak for all of the
committers. Pig should have 3 ways to pick up configuration:
1) from .pigrc, as it does now
agreed
2) when embedded in another java program, the caller should be able to
set values in PigContext, as I referred to in my response to
Benjamin's email.
agreed
3) From the pig script, we should be able to do something like: set
conf.x = y (I'm not necessarily suggesting syntax here).
OK, the start of a patch for this is attached.
I wasn't sure whether the "conf." prefix is required, so the patch has
this as a comment.
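
To make the idea concrete, here is a minimal sketch of how a "set" line could be turned into a property. This is not the attached patch: the class name SetCommandSketch, the regular expression, and the choice to treat the "conf." prefix as optional are all assumptions made for illustration, and the syntax itself is still open.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig: parses "set key = value" lines.
class SetCommandSketch {
    // Matches: set <key> = <value>, with an optional trailing semicolon.
    private static final Pattern SET =
        Pattern.compile("^\\s*set\\s+([\\w.]+)\\s*=\\s*(.*?)\\s*;?\\s*$");

    /** Applies one statement to props; returns false if it is not a set statement. */
    static boolean apply(String statement, Properties props) {
        Matcher m = SET.matcher(statement);
        if (!m.matches()) {
            return false;
        }
        String key = m.group(1);
        // Assumption: the "conf." prefix is optional and is stripped if present.
        if (key.startsWith("conf.")) {
            key = key.substring("conf.".length());
        }
        props.setProperty(key, m.group(2));
        return true;
    }
}
```

Statements that are not "set" lines are left for the normal parser, so something like this could sit in front of the existing query handling.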
* Should .pigrc evolve into a place for Pig aliases and properties,
and even scripts? (similar to .bashrc etc)
Right now you can store pig properties here. It's not clear it needs
to grow beyond that. What use case do you see for storing aliases or
scripts here?
(motivations below)
* Should new commands be added: import, include, sharedFS etc?
I'm guessing this is the same thing as what I'm saying in 3 above. If
not, please elaborate on what these new commands would do.
I'm trying to envisage how to allow reusability in Pig.
These are mostly from my original email.
2. Extensions to Pig syntax
(a) "set" command sets all system properties
(b) "include" includes and parses another pig script
(c) "import" adds a package namespace to the search path
3. Change ~/.pigrc into a pig script that is parsed on
startup of Grunt/PigServer?
(1) Why should a user have to supply the fully qualified name to his
user defined function, if all the functions he ever uses are in that
package? Obviously, he shouldn't have to, which is why PigContext
includes this line:
packageImportList.add("com.yahoo.pig.yst.sds.ULT.");
I'm asking to add a command that allows me to do something like:
import uk.ac.gla.terrier.pig
and have that package searched for any functions. Yahoo users have this
ability (com.yahoo.pig.yst.sds.ULT. is searched by default), why not
everyone else ;-)
[Somewhat similar to the define keyword.]
These could be properties instead.
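
As a rough sketch of what I mean by searching an import list: the class below is hypothetical (ImportListSketch, addImport and resolve are names I made up) and is not how PigContext actually resolves functions. It only shows the prefix-search idea, with the empty prefix tried first so fully qualified names keep working.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a package search path; not Pig's actual resolver.
class ImportListSketch {
    // The empty prefix means fully qualified names are tried first.
    private final List<String> packageImportList = new ArrayList<String>();

    ImportListSketch() {
        packageImportList.add("");
    }

    /** What an "import some.package" command might do. */
    void addImport(String packageName) {
        String prefix = packageName.endsWith(".") ? packageName : packageName + ".";
        if (!packageImportList.contains(prefix)) {
            packageImportList.add(prefix);
        }
    }

    /** Returns the first class found under any imported package. */
    Class<?> resolve(String name) {
        for (String prefix : packageImportList) {
            try {
                return Class.forName(prefix + name);
            } catch (ClassNotFoundException | LinkageError e) {
                // Not in this package; try the next prefix.
            }
        }
        throw new IllegalArgumentException("Could not resolve " + name);
    }
}
```

With this, "import uk.ac.gla.terrier.pig" would amount to one addImport call, and the same list could just as easily be filled from a property.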
(2) Include other pig files. Just to allow commonly created imports,
configuration, defines, etc. to be easily loaded. How often do you
register the same jar files, time in, time out, for every pig script
that you write?
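
A minimal sketch of how "include" could work as a simple preprocessing step follows. Everything here is an assumption for illustration: the include 'name'; syntax, the IncludeSketch class, and the use of a Map in place of the file system (so the sketch has no I/O). It expands includes recursively and refuses circular ones.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of an "include" preprocessor; the syntax is an assumption.
class IncludeSketch {
    // Matches: include 'name'; on a line of its own.
    private static final Pattern INCLUDE =
        Pattern.compile("^\\s*include\\s+'([^']+)'\\s*;?\\s*$");

    /** Expands includes in the named script; scripts stands in for the file system. */
    static String expand(String name, Map<String, String> scripts) {
        StringBuilder out = new StringBuilder();
        expand(name, scripts, new ArrayDeque<String>(), out);
        return out.toString();
    }

    private static void expand(String name, Map<String, String> scripts,
                               Deque<String> inProgress, StringBuilder out) {
        if (inProgress.contains(name)) {
            throw new IllegalStateException("Circular include: " + name);
        }
        String text = scripts.get(name);
        if (text == null) {
            throw new IllegalArgumentException("No such script: " + name);
        }
        inProgress.push(name);
        for (String line : text.split("\n")) {
            Matcher m = INCLUDE.matcher(line);
            if (m.matches()) {
                // Splice the included script in place of the include line.
                expand(m.group(1), scripts, inProgress, out);
            } else {
                out.append(line).append('\n');
            }
        }
        inProgress.pop();
    }
}
```

A common file holding the register and define lines would then be written once and pulled into each script with a single include.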
(3) sharedFS - see PIG-102 - equally could be a property too.
(4) pigrc as a script - similar to (2).
This is like your Unix shell rc file, e.g. .bashrc
Mine is full of single character aliases for commands I use all the
time, etc.
C