PigContext could indeed wrap the existing Pig-specific System properties,
provided a getProperty() method is added as well.
For instance, many System properties are used to control HOD connection
in HExecutionEngine. These could instead be stored in PigContext, which
is passed to HExecutionEngine anyway.
Any thoughts about set, import, include etc?
Using pigrc as a place for set commands etc would allow users to simply
call pig.pl to start their Pig session - the pigrc could do imports, set
appropriate properties etc. The set command may be tricky - if
properties are used to control the connection to the execution engine,
then changing these properties using a set command would have no effect,
as PigContext.connect() has already been called by PigServer's
constructor. One option might be for PigServer to delay connection until
the first query is executed.
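
To illustrate the delayed-connection idea, here is a minimal sketch. It does not use the real Pig classes: LazyServerSketch is a hypothetical stand-in for PigServer, and the "cluster" property name and its "local" default are assumptions made only for the example.

```java
import java.util.Properties;

// Hypothetical stand-in for PigServer; not the real Pig API.
class LazyServerSketch {
    private final Properties props = new Properties();
    private boolean connected = false;
    private String connectedTo = null;

    // Mimics a "set" command: it only records the property.
    void set(String key, String value) {
        if (connected) {
            // After connecting, connection properties can no longer take effect.
            throw new IllegalStateException(
                "already connected; setting " + key + " would have no effect");
        }
        props.setProperty(key, value);
    }

    // The connection is deferred until the first query arrives.
    void registerQuery(String query) {
        ensureConnected();
        // A real server would parse and register the query here.
    }

    private void ensureConnected() {
        if (!connected) {
            // "cluster" is an assumed property name; "local" an assumed default.
            connectedTo = props.getProperty("cluster", "local");
            connected = true;
        }
    }

    boolean isConnected() { return connected; }

    String connectedTo() { return connectedTo; }
}
```

With this shape, any set commands issued from a pigrc before the first registerQuery() call are seen by the connection logic, while a later set fails loudly rather than being silently ignored.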
C
Alan Gates wrote:
It is already possible to do the following:
PigContext context = new PigContext(execType); // execType is a PigServer.ExecType
PigServer pig = new PigServer(context);
pig.registerQuery(...);
...
PigContext contains, among other things, a Properties object that keeps
the properties. What's missing is a way to set values in that
properties object. So if we added a setProperty() method to
PigContext I think we'd have what you suggest below.
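
A rough sketch of what such accessors might look like follows. ContextSketch is a hypothetical stand-in, not the real PigContext, and the fallback to JVM System properties in getProperty() is an assumption added so that existing -D settings would keep working.

```java
import java.util.Properties;

// Hypothetical stand-in for PigContext; only the property handling is sketched.
class ContextSketch {
    private final Properties conf = new Properties();

    // Stores a value in the context's own Properties object.
    void setProperty(String key, String value) {
        conf.setProperty(key, value);
    }

    // Looks in the context first, then falls back to the JVM System
    // properties (an assumption, not confirmed Pig behaviour).
    String getProperty(String key) {
        String value = conf.getProperty(key);
        return value != null ? value : System.getProperty(key);
    }
}
```

Code such as HExecutionEngine could then read its settings through getProperty() on the context it is handed, instead of calling System.getProperty() directly.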
Alan.
Benjamin Francisoud wrote:
I'm not a project lead but as a user of pig I'd like a possibility to
configure pig programmatically.
Doing something like:
PigConfiguration conf = new PigConfiguration();
conf.set("foo", "bar");
Pig pig = new Pig();
pig.setConf(conf);
pig.run();
This is just a pseudo-code example.
Replace Pig and PigConfiguration with whatever name you like ;)
--
Benjamin Francisoud
Craig Macdonald wrote:
Hi pig-devs,
Just a quiet ping for comments from project leads on this. It seems
I have raised several issues recently that require configuration of
pig. Questions:
* Are System properties the best place for these?
* Should .pigrc evolve into a place for Pig aliases and properties,
and even scripts? (similar to .bashrc etc)
* Should new commands be added: import, include, sharedFS etc?
* Please direct me as to how JIRAs should be created.
I may be able to provide patches to some JIRAs I have created if we
have a policy for configuration-type stuff.
C
Craig Macdonald wrote:
Good morning Pig-devs,
This email notes some of the Yahoo specifics remaining in Pig that
may need to be checked before a Pig release (see (1) below). I
would hope that Pig syntax can be evolved to allow these to be
removed (see (2) below), and instead placed in users' .pigrc.
From PigContext, I note that the .pigrc is in fact a place for
properties. An alternative would be for the Grunt set command to
set System properties, and then make .pigrc into a pig script,
allowing users to define aliases, register common jar files, import
common namespaces, include other pig script files.
Please direct how JIRAs should be created to track these issues -
one issue for all, with subtasks; separate tasks for the three
issues below?
Details below.
Craig
1. Yahoo specifics
src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java
doHod() contains Yahoo-specific stuff. I'm not sure HOD has
stabilised sufficiently for this to be changed
fixUpDomain() assumes unqualified hostnames are part of the
.inktomisearch.com DNS domain
src/org/apache/pig/impl/PigContext.java:125
packageImportList.add("com.yahoo.pig.yst.sds.ULT.");
Note - there is no Pig command to allow imports of package
namespaces into the packageImportList ArrayList
scripts/pig.pl
kryptonite mentions, specifics: 69, 114
2. Extensions to Pig syntax
(a) "set" command sets all system properties
(b) "include" includes and parses another pig script
(c) "import" adds a package namespace to the search path
3. Change ~/.pigrc into a Pig script that is parsed on
startup of Grunt/PigServer?
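
As a rough illustration of points 2 and 3, the sketch below interprets a startup script containing set, import and include lines. Everything here is hypothetical: PigrcSketch is not a Pig class, and the one-command-per-line syntax with an optional trailing semicolon is an assumption. The only detail taken from the existing code is that entries in packageImportList end with a dot.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical interpreter for a ~/.pigrc written as a small script.
class PigrcSketch {
    final Properties props = new Properties();
    final List<String> packageImportList = new ArrayList<>();
    final List<String> includes = new ArrayList<>();

    // Processes the script text one line at a time.
    void run(String script) {
        for (String raw : script.split("\n")) {
            String line = raw.trim();
            if (line.endsWith(";")) {
                // Drop an optional trailing semicolon.
                line = line.substring(0, line.length() - 1).trim();
            }
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split("\\s+");
            if (parts[0].equals("set") && parts.length == 3) {
                // (a) set <key> <value>
                props.setProperty(parts[1], parts[2]);
            } else if (parts[0].equals("import") && parts.length == 2) {
                // (c) import <package>: stored with a trailing dot
                String pkg = parts[1].endsWith(".") ? parts[1] : parts[1] + ".";
                packageImportList.add(pkg);
            } else if (parts[0].equals("include") && parts.length == 2) {
                // (b) include <file>: only recorded here; a real
                // implementation would read and parse the named file.
                includes.add(parts[1]);
            } else {
                throw new IllegalArgumentException("unrecognised line: " + raw);
            }
        }
    }
}
```

In this sketch set writes to a Properties object owned by the interpreter rather than to System properties; either target would fit the same structure.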