-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15634/
-----------------------------------------------------------
Review request for pig, Aniket Mokashi, Daniel Dai, and Rohini Palaniswamy.
Bugs: PIG-3525
https://issues.apache.org/jira/browse/PIG-3525
Repository: pig-git
Description
-------
This is my 2nd attempt to fix PIG-3525. Please review this patch and ignore the
previous one.
Problem:
PigStats and ScriptState are thread local variables that are accessible via
PigStats.get() and ScriptState.get(). Currently, these get() functions can
return MR-specific objects regardless of the exec type (MR or non-MR).
Fix:
I am introducing the following contract for PigStats and ScriptState:
* PigStats.start(PigStats) and ScriptState.start(ScriptState) must be called
before any calls to PigStats.get() and ScriptState.get() are made.
* If not, PigStats.get() and ScriptState.get() will return null since PigStats
and ScriptState objects are not initialized.
In addition, I add the following lines to the PigServer constructor:
-----
if (PigStats.get() == null) {
PigStats.start(pigContext.getExecutionEngine().instantiatePigStats());
}
if (ScriptState.get() == null) {
ScriptState.start(pigContext.getExecutionEngine().instantiateScriptState());
}
-----
This initializes PigStats and ScriptState at start-up for both interactive and
batch modes. The only exception is embedded mode where the main PigStats is set
to EmbeddedPigStats in Main class. EmbeddedPigStats is really a collection of
PigStats objects, and each running thread creates its own PigStats object in
BoundScript class.
Note that I decided to not infer the object type in the get() functions because
that requires access to PigContext, resulting in a lot of code changes.
Diffs
-----
src/org/apache/pig/Main.java 9ba1b86
src/org/apache/pig/PigServer.java c0826ea
src/org/apache/pig/backend/executionengine/ExecutionEngine.java 56a958b
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRExecutionEngine.java
ddc89f9
src/org/apache/pig/scripting/BoundScript.java 28340d1
src/org/apache/pig/tools/pigstats/PigStats.java ff0d0f9
src/org/apache/pig/tools/pigstats/PigStatsUtil.java 84c52de
src/org/apache/pig/tools/pigstats/ScriptState.java e3a2ba2
test/org/apache/pig/test/TestJobControlCompiler.java 1ef5b4d
test/org/apache/pig/test/TestPOPartialAggPlan.java f6e84d1
test/org/apache/pig/test/TestPlanGeneration.java 506857f
test/org/apache/pig/test/TestScriptLanguage.java cb7eb9d
Diff: https://reviews.apache.org/r/15634/diff/
Testing
-------
I ran all the unit tests and fixed the following tests-
* TestJobControlCompiler
* TestPOPartialAggPlan
* TestPlanGeneration
* TestScriptLanguage
There were two reasons why these tests failed-
* PigStats and ScriptState were never initialized, and thus, PigStats.get() and
ScriptState.get() caused NPEs. I fixed this by explicitly creating a PigServer
object.
* PigServer was created by JUnit before, and PigStats was set to SimplePigStats
instead of EmbeddedPigStats in embedded mode. I fixed this by moving the call
to PigServer constructor out of JUnit before.
Thanks,
Cheolsoo Park