[ 
https://issues.apache.org/jira/browse/PIG-111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573999#action_12573999
 ] 

Alan Gates commented on PIG-111:
--------------------------------

Wow, what a patch to review.  I have a few comments and issues:

First the comments:
The config file conf/pig.properties references conf/log4j.properties, though 
that file doesn't exist in SVN at this point.  Does this patch depend on 
another patch?

Why was @SuppressWarnings added in Main.java?

I know we had some discussion earlier on unit testing and one of the points 
mentioned was that unit tests should be moved to reside in the packages of the 
classes they test.  However, we haven't yet as a group agreed to that.  So 
until we do, let's keep the tests in org.apache.pig.test (PigContextTest was 
put in pig.impl).

Now, some issues:

1) I ran the unit tests, and saw several failures.  

Testcase: testFunctionInsideFunction took 18.472 sec
Testcase: testJoin took 11.915 sec
Testcase: testDriverMethod took 11.831 sec
Testcase: testMapLookup took 11.85 sec
Testcase: testBagFunctionWithFlattening took 14.012 sec
Testcase: testSort took 0.35 sec
    Caused an ERROR
Bad mapred.job.tracker: local
java.lang.RuntimeException: Bad mapred.job.tracker: local
    at org.apache.hadoop.mapred.JobTracker.getAddress(JobTracker.java:711)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:143)
    at org.apache.pig.impl.PigContext.connect(PigContext.java:154)
    at org.apache.pig.PigServer.<init>(PigServer.java:128)
    at org.apache.pig.test.TestEvalPipeline.testSortDistinct(TestEvalPipeline.java:284)
    at org.apache.pig.test.TestEvalPipeline.testSort(TestEvalPipeline.java:266)

Testcase: testDistinct took 0.387 sec
    Caused an ERROR
Bad mapred.job.tracker: local
java.lang.RuntimeException: Bad mapred.job.tracker: local
    at org.apache.hadoop.mapred.JobTracker.getAddress(JobTracker.java:711)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:143)
    at org.apache.pig.impl.PigContext.connect(PigContext.java:154)
    at org.apache.pig.PigServer.<init>(PigServer.java:128)
    at org.apache.pig.test.TestEvalPipeline.testSortDistinct(TestEvalPipeline.java:284)
    at org.apache.pig.test.TestEvalPipeline.testDistinct(TestEvalPipeline.java:271)

and

Testcase: testBigGroupAll took 18.651 sec
Testcase: testStoreFunction took 11.995 sec
Testcase: testQualifiedFuncions took 11.834 sec
Testcase: testDefinedFunctions took 11.78 sec
Testcase: testPigServer took 0.045 sec
    Caused an ERROR
Bad mapred.job.tracker: local
java.lang.RuntimeException: Bad mapred.job.tracker: local
    at org.apache.hadoop.mapred.JobTracker.getAddress(JobTracker.java:711)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:143)
    at org.apache.pig.impl.PigContext.connect(PigContext.java:154)
    at org.apache.pig.PigServer.<init>(PigServer.java:128)
    at org.apache.pig.test.TestMapReduce.testPigServer(TestMapReduce.java:212)
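
All three traces end in JobTracker.getAddress rejecting the value "local", 
which suggests that the local-mode marker is reaching code that expects a 
host:port address.  The following is a minimal sketch of the kind of guard 
that would separate the two cases.  JobTrackerSetting and its methods are 
hypothetical names used only for illustration; they are not part of Pig or 
Hadoop, and this is not the code in the patch.

```java
import java.net.InetSocketAddress;

// Hypothetical helper for illustration only; not a Pig or Hadoop class.
final class JobTrackerSetting {

    /** Returns true when the setting selects local (in-process) execution. */
    static boolean isLocal(String value) {
        return value == null || value.trim().equalsIgnoreCase("local");
    }

    /** Parses "host:port" and rejects anything else, including "local". */
    static InetSocketAddress parse(String value) {
        if (isLocal(value)) {
            throw new IllegalArgumentException("Not a host:port address: " + value);
        }
        int colon = value.lastIndexOf(':');
        if (colon <= 0 || colon == value.length() - 1) {
            throw new IllegalArgumentException("Bad mapred.job.tracker: " + value);
        }
        String host = value.substring(0, colon).trim();
        int port = Integer.parseInt(value.substring(colon + 1).trim());
        // Unresolved, so constructing the address performs no DNS lookup.
        return InetSocketAddress.createUnresolved(host, port);
    }
}
```

A caller would check isLocal first and only call parse when connecting to a 
real job tracker, so "local" never reaches the address parser.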

2)  When I tried to run Pig with this change, I could only get it to run in 
local mode.  In pig.properties I set exectype=mapreduce and 
cluster=<my jobtracker>, but it still ran only in local mode.

> Configuration of Pig
> --------------------
>
>                 Key: PIG-111
>                 URL: https://issues.apache.org/jira/browse/PIG-111
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Craig Macdonald
>         Attachments: after.png, before.png, config.patch.1502, 
> PIG-111-v04.patch, PIG-111-v05.patch, PIG-111-v06.patch, 
> PIG-111_v_3_sg.patch, PIG-93-v01.patch, PIG-93-v02.patch
>
>
> This JIRA discusses issues relating to the configuration of Pig.
> Use cases:
>  
> 1. I want to configure Pig programmatically from Java
>  Motivation: pig can be embedded from another Java program, and configuration 
> should be accessible to be set by the client code
> 2. I want to configure Pig from the command line
> 3. I want to configure Pig from the Pig shell (Grunt)
> 4. I want Pig to remember my configuration for every Pig session
>  Motivation: to save me typing in some configuration stuff every time.
> 5. I want Pig to remember my configuration for this script.
>  Motivation: I must use a common configuration for 50% of my Pig scripts - 
> can I share this configuration between scripts?
> Current Status: 
>  * Pig uses System properties for some configuration
>  * A configuration properties object in PigContext is not used.
>  * pigrc can contain properties
>  * Configuration properties cannot be set from Grunt
> Proposed solutions to use cases:
> 1. Configuration should be set in PigContext, and accessible from client code.
> 2. System properties are copied to PigContext, or can be specified on the 
> command line (duplication with System properties)
> 3. Allow configuration properties to be set using the "set" command in Grunt
> 4. Pigrc can contain properties. Is this enough, or can other configuration 
> stuff be set, e.g. aliases, imports, etc.?
> 5. Add an include directive to pig, to allow a shared configuration/Pig 
> script to be included.
> Connections to Shell scripting: 
>  * The source command in Bash allows another bash script file to be included 
> - this allows shared variables to be set in one file shared between a set of 
> scripts.
>  * Aliases can be set, according to user preferences, etc.
>  * All this can be done in your .bashrc file
> Issues: 
>  * What happens when you change a property after the property has been read?
>  * Can Grunt read a pigrc containing various statements etc before the 
> PigServer is completely configured?
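
The proposed solutions in the description above (properties from a file such 
as pigrc, System properties copied into PigContext, and values set by client 
code) can be sketched as layers merged into one java.util.Properties object.  
This is only an illustration: LayeredConfig is a hypothetical name, not a Pig 
class, and the precedence order (file, then System properties, then client 
code) is an assumption that the issue does not specify.

```java
import java.util.Properties;

// Hypothetical illustration; the class and method names are not Pig APIs.
final class LayeredConfig {

    /**
     * Builds the effective configuration. Later layers override earlier
     * ones. Assumed order: file, then System properties, then client code.
     */
    static Properties resolve(Properties fromFile,
                              Properties fromSystem,
                              Properties fromClient) {
        Properties effective = new Properties();
        for (Properties layer : new Properties[] {fromFile, fromSystem, fromClient}) {
            if (layer == null) {
                continue; // a missing layer contributes nothing
            }
            for (String name : layer.stringPropertyNames()) {
                effective.setProperty(name, layer.getProperty(name));
            }
        }
        return effective;
    }
}
```

Because the merge produces a snapshot, a property changed in one of the 
layers afterwards is not seen until resolve is called again, which is one 
possible answer to the first question under "Issues".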

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
