-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30262/
-----------------------------------------------------------
(Updated Jan. 26, 2015, 7:23 p.m.)
Review request for pig, liyun zhang and Praveen R.
Changes
-------
Incorporated feedback: removed the Spark version change from this patch.
Bugs: PIG-4393
https://issues.apache.org/jira/browse/PIG-4393
Repository: pig-git
Description
-------
PIG-4393 : Add stats and error reporting for Spark
After Pig submits a job to a Spark cluster, we need to report job progress,
Spark-specific stats and any error logs back to the user.
This is an initial patch that adds Spark-specific stats, mostly to get feedback
on the assumption that a separate Spark job is launched for each POStore
operator.
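To make the assumption concrete, here is a minimal sketch of per-store stats
collection, assuming exactly one Spark job per store. It is plain Java with no
Spark or Pig dependencies so that it runs standalone. StoreJobStats and
StoreJobMetricsSketch are hypothetical names used only for illustration; they
are not the JobMetricsListener or SparkJobStats classes from this patch, and
the method names do not mirror any Spark listener API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for the stats of one job; not the patch's SparkJobStats.
final class StoreJobStats {
    final int jobId;
    final String storeLocation;
    long recordsWritten;
    long bytesWritten;

    StoreJobStats(int jobId, String storeLocation) {
        this.jobId = jobId;
        this.storeLocation = storeLocation;
    }
}

// Hypothetical collector built on the assumption "one Spark job per store".
final class StoreJobMetricsSketch {
    // Insertion order is kept so jobs are reported in submission order.
    private final Map<Integer, StoreJobStats> statsByJobId = new LinkedHashMap<>();

    // Bind a job id to the store it materializes. A second binding for the
    // same job id would violate the one-job-per-store assumption.
    void registerStoreJob(int jobId, String storeLocation) {
        if (statsByJobId.containsKey(jobId)) {
            throw new IllegalStateException("Job " + jobId + " is already bound to a store");
        }
        statsByJobId.put(jobId, new StoreJobStats(jobId, storeLocation));
    }

    // Fold the output of one finished task into the totals of its job.
    void recordTaskOutput(int jobId, long records, long bytes) {
        StoreJobStats stats = statsByJobId.get(jobId);
        if (stats == null) {
            throw new IllegalArgumentException("Unknown job " + jobId);
        }
        stats.recordsWritten += records;
        stats.bytesWritten += bytes;
    }

    // Returns null when the job id was never registered.
    StoreJobStats statsFor(int jobId) {
        return statsByJobId.get(jobId);
    }

    int jobCount() {
        return statsByJobId.size();
    }
}
```

If the assumption does not hold (for example, if one job ends up serving
several stores), the one-to-one map above would have to become a mapping from
a job to a list of stores.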
It also refactors the code to populate PigStats correctly. Most unit tests rely
on PigStats, so this should fix a number of failing unit tests.
TODO items:
- We probably need to add counters that capture the number of records and
bytes in the output file, in order to populate OutputStats.
- Though StatsReportListener prints Spark job progress in the logs, we probably
also need to implement PigProgressNotificationListener for Spark.
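As a rough illustration of the second TODO item, the sketch below turns
task-completion events into percentage progress callbacks. It is plain Java and
runs standalone. ProgressCallback is a hypothetical interface standing in for
PigProgressNotificationListener, whose real method signatures are not
reproduced here, and TaskProgressTracker is likewise a hypothetical name. The
sketch assumes the total task count is known up front and it is not
thread-safe.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a progress notification listener.
interface ProgressCallback {
    void progressUpdated(int percent);
}

// Hypothetical tracker: converts finished-task counts into percent progress.
final class TaskProgressTracker {
    private final int totalTasks;
    private int finishedTasks;
    private int lastReported = -1;
    private final List<ProgressCallback> callbacks = new ArrayList<>();

    TaskProgressTracker(int totalTasks) {
        if (totalTasks <= 0) {
            throw new IllegalArgumentException("totalTasks must be positive");
        }
        this.totalTasks = totalTasks;
    }

    void addCallback(ProgressCallback callback) {
        callbacks.add(callback);
    }

    // Call once per finished task. Callbacks fire only when the integer
    // percentage changes, so listeners are not flooded with duplicates.
    void taskFinished() {
        if (finishedTasks < totalTasks) {
            finishedTasks++;
        }
        int percent = (int) (100L * finishedTasks / totalTasks);
        if (percent != lastReported) {
            lastReported = percent;
            for (ProgressCallback callback : callbacks) {
                callback.progressUpdated(percent);
            }
        }
    }
}
```

In a real implementation the taskFinished() call would presumably be driven by
whatever Spark listener the patch registers; that wiring is left out here.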
Diffs (updated)
-----
src/org/apache/pig/backend/hadoop/executionengine/spark/JobMetricsListener.java
PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/spark/SparkExecutionEngine.java
db152b5003ce6e79b001b2624010b91cc0f921d8
src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java
6e9b29753fa2db360b5063da38c785675f1e5b57
src/org/apache/pig/tools/pigstats/SparkStats.java
fd45dd4f0be415dd48d9fb7381c57c861bbbf7ce
src/org/apache/pig/tools/pigstats/spark/SparkJobStats.java PRE-CREATION
src/org/apache/pig/tools/pigstats/spark/SparkPigStats.java PRE-CREATION
src/org/apache/pig/tools/pigstats/spark/SparkStatsUtil.java PRE-CREATION
Diff: https://reviews.apache.org/r/30262/diff/
Testing
-------
Thanks,
Mohit Sabharwal