[ 
https://issues.apache.org/jira/browse/PIG-111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12572344#action_12572344
 ] 

Stefan Groschupf commented on PIG-111:
--------------------------------------

The direction is very good, and this is important to do ASAP. However, here 
are some general thoughts on configuration in general and on this patch:

+ Storing a hidden .pigrc in the user's home folder is a very bad idea. It 
is not transparent to the user.
++ Configuration files should go into PIG_HOME/conf.
++ The file should list at least all possible configuration values, even if 
they are commented out.
++ In Java, configuration files have the .properties or .xml extension, so 
users will recognize more quickly that these are configuration files.
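
As a rough sketch of the .properties idea (the file contents and the 
pig.example.* keys below are made up for illustration, they are not existing 
Pig settings), a file that lists every setting but leaves unused ones 
commented out loads cleanly with java.util.Properties, because lines starting 
with # are treated as comments:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PigPropertiesSketch {
    // Hypothetical contents of PIG_HOME/conf/pig.properties.
    // The keys are invented for this example only.
    static final String SAMPLE =
        "# Every known setting is listed; unused ones stay commented out.\n"
        + "#pig.example.tempdir=/tmp/pig\n"
        + "pig.example.exectype=mapreduce\n";

    // Parses properties-file text; comment lines are skipped by Properties.load.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = load(SAMPLE);
        // The active key is present, the commented-out key is absent.
        System.out.println(props.getProperty("pig.example.exectype"));
        System.out.println(props.getProperty("pig.example.tempdir", "(unset)"));
    }
}
```

In a real setup the text would be read from the file under PIG_HOME/conf 
rather than from a string constant.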

+ Method names like pigContext.getConfiguration().getConfiguration() are 
easy to misunderstand. How about pigContext.getConfiguration().toHadoopConf() 
or something similar?

+ In the context of removing Hadoop dependencies from Pig, the whole 
HConfiguration makes no sense, since it introduces Hadoop dependencies into a 
configuration object. Instead, HExecutionEngine and HDataStorage should 
convert the plain properties object into a Hadoop conf object.
+ The configuration object should be as generic and stupid as possible. 
ExecutionEngines and DataStorage should handle Hadoop-related stuff.
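
A minimal sketch of that split, under some assumptions: BackendConf is a 
hypothetical stand-in for the Hadoop conf class (so the example runs without 
Hadoop on the classpath), and SketchExecutionEngine and toBackendConf are 
invented names, not the classes in the patch. The point is only that the 
generic side holds a plain java.util.Properties and the engine does the 
conversion:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for a backend-specific conf class such as Hadoop's.
class BackendConf {
    private final Map<String, String> values = new HashMap<>();

    void set(String key, String value) { values.put(key, value); }

    String get(String key) { return values.get(key); }
}

// Hypothetical engine: the only place that knows about the backend type.
class SketchExecutionEngine {
    // Copies every string property into a fresh backend conf object.
    static BackendConf toBackendConf(Properties props) {
        BackendConf conf = new BackendConf();
        // stringPropertyNames() also covers keys inherited from defaults.
        for (String key : props.stringPropertyNames()) {
            conf.set(key, props.getProperty(key));
        }
        return conf;
    }
}

class ConversionDemo {
    public static void main(String[] args) {
        // The generic configuration carries no backend dependency at all.
        Properties props = new Properties();
        props.setProperty("example.cluster", "localhost:9001");

        BackendConf conf = SketchExecutionEngine.toBackendConf(props);
        System.out.println(conf.get("example.cluster"));
    }
}
```

With this layout, code that only needs configuration values never has to 
import anything backend-specific.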

To illustrate my suggestions I will contribute a patch soon.

> Configuration of Pig
> --------------------
>
>                 Key: PIG-111
>                 URL: https://issues.apache.org/jira/browse/PIG-111
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Craig Macdonald
>         Attachments: after.png, before.png, config.patch.1502, 
> PIG-93-v01.patch, PIG-93-v02.patch
>
>
> This JIRA discusses issues relating to the configuration of Pig.
> Use cases:
>  
> 1. I want to configure Pig programmatically from Java
>  Motivation: pig can be embedded from another Java program, and configuration 
> should be accessible to be set by the client code
> 2. I want to configure Pig from the command line
> 3. I want to configure Pig from the Pig shell (Grunt)
> 4. I want Pig to remember my configuration for every Pig session
>  Motivation: to save me typing in some configuration stuff every time.
> 5. I want Pig to remember my configuration for this script.
>  Motivation: I must use a common configuration for 50% of my Pig scripts - 
> can I share this configuration between scripts.
> Current Status: 
>  * Pig uses System properties for some configuration
>  * A configuration properties object in PigContext is not used.
>  * pigrc can contain properties
>  * Configuration properties can not be set from Grunt
> Proposed solutions to use cases:
> 1. Configuration should be set in PigContext, and accessible from client code.
> 2. System properties are copied to PigContext, or can be specified on the 
> command line (duplication with System properties)
> 3. Allow configuration properties to be set using the "set" command in Grunt
> 4. Pigrc can contain properties. Is this enough, or can other configuration 
> stuff be set, e.g. aliases, imports, etc.?
> 5. Add an include directive to pig, to allow a shared configuration/Pig 
> script to be included.
> Connections to Shell scripting: 
>  * The source command in Bash allows another bash script file to be included 
> - this allows shared variables to be set in one file shared between a set of 
> scripts.
>  * Aliases can be set, according to user preferences, etc.
>  * All this can be done in your .bashrc file
> Issues: 
>  * What happens when you change a property after the property has been read?
>  * Can Grunt read a pigrc containing various statements etc before the 
> PigServer is completely configured?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
