-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30262/#review70542
-----------------------------------------------------------


@Mohit: Did you get a chance to run the full unit test suite? I would like to
know whether the patch breaks any tests beyond the current 129 failures.
I see that 2 new tests pass with the patch, but the merge would be easier
if the patch doesn't cause any additional failures.
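
One way to check this is to diff the failing test names from a baseline run
against those from a run with the patch applied. A minimal sketch, assuming
the failing test names have already been collected into two sets; the class
FailureDiff and the name TestNewlyBroken are hypothetical placeholders, not
real results:

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: compares failing-test sets from two runs.
class FailureDiff {
    // Tests that fail with the patch but did not fail in the baseline.
    static Set<String> newFailures(Set<String> baseline, Set<String> patched) {
        Set<String> result = new TreeSet<>(patched);
        result.removeAll(baseline);
        return result;
    }

    // Tests that failed in the baseline and no longer fail with the patch.
    static Set<String> fixed(Set<String> baseline, Set<String> patched) {
        Set<String> result = new TreeSet<>(baseline);
        result.removeAll(patched);
        return result;
    }

    public static void main(String[] args) {
        // Placeholder inputs for illustration only.
        Set<String> baseline = new TreeSet<>(Arrays.asList(
                "TestToolsPigServer", "TestSplitStore", "TestOther"));
        Set<String> patched = new TreeSet<>(Arrays.asList(
                "TestOther", "TestNewlyBroken"));
        System.out.println("New failures: " + newFailures(baseline, patched));
        System.out.println("Fixed: " + fixed(baseline, patched));
    }
}
```

An empty "new failures" set would mean the patch adds nothing on top of the
existing failures.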

- Praveen R


On Jan. 30, 2015, 1:39 p.m., Mohit Sabharwal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30262/
> -----------------------------------------------------------
> 
> (Updated Jan. 30, 2015, 1:39 p.m.)
> 
> 
> Review request for pig, liyun zhang and Praveen R.
> 
> 
> Bugs: PIG-4393
>     https://issues.apache.org/jira/browse/PIG-4393
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> PIG-4393 : Add stats and error reporting for Spark
> 
> After Pig submits a job to a Spark cluster, we need to report job progress,
> Spark-specific stats and any error logs back to the user.
> 
> (1) It reports basic success/failure status for each Spark job.
> (2) It logs Spark-specific stats to the log file. It registers a job
> metrics listener with the Spark context, collects Spark task-level metrics
> and aggregates them.
> (3) It also refactors the code to populate PigStats correctly, which most
> unit tests rely on. This should fix a number of unit tests.
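
The aggregation described in (2) can be pictured roughly as follows. This is
only a sketch under stated assumptions, not the code from the patch: the class
JobMetricsAggregator and its methods are hypothetical, it does not use the
Spark listener API, and the metric values in main are made up. It assumes the
caller already knows which job each finished task belongs to.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: sums per-task metric values into per-job totals,
// roughly what a job metrics listener would do on each task completion.
class JobMetricsAggregator {
    // jobId -> (metric name -> running total); insertion order is kept
    // so that metrics would be logged in a stable order.
    private final Map<Integer, Map<String, Long>> totalsByJob = new LinkedHashMap<>();

    // Intended to be called once per finished task.
    synchronized void onTaskEnd(int jobId, Map<String, Long> taskMetrics) {
        Map<String, Long> totals =
                totalsByJob.computeIfAbsent(jobId, id -> new LinkedHashMap<>());
        for (Map.Entry<String, Long> e : taskMetrics.entrySet()) {
            totals.merge(e.getKey(), e.getValue(), Long::sum);
        }
    }

    // Returns the aggregated value, or 0 if the job or metric was never seen.
    synchronized long total(int jobId, String metric) {
        Map<String, Long> totals = totalsByJob.get(jobId);
        if (totals == null) {
            return 0L;
        }
        Long value = totals.get(metric);
        return value == null ? 0L : value;
    }

    public static void main(String[] args) {
        JobMetricsAggregator agg = new JobMetricsAggregator();

        // Made-up per-task values, for illustration only.
        Map<String, Long> task1 = new LinkedHashMap<>();
        task1.put("ExecutorRunTime", 300L);
        task1.put("JvmGCTime", 5L);
        Map<String, Long> task2 = new LinkedHashMap<>();
        task2.put("ExecutorRunTime", 200L);

        agg.onTaskEnd(0, task1);
        agg.onTaskEnd(0, task2);
        System.out.println("ExecutorRunTime : " + agg.total(0, "ExecutorRunTime"));
    }
}
```

The methods are synchronized on the assumption that task-end callbacks and
the code that reads the totals may run on different threads.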
> 
> TODO items in a follow-up patch:
>  - Add #records to OutputStats for each job.
>  - Although StatsReportListener prints Spark job progress in the logs, we
> probably also need to implement PigProgressNotificationListener for Spark.
> 
> 
> Diffs
> -----
> 
>   
>   src/org/apache/pig/backend/hadoop/executionengine/spark/JobMetricsListener.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/spark/SparkExecutionEngine.java db152b5003ce6e79b001b2624010b91cc0f921d8 
>   src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java b15994d525250bbb26f7b7126dae619b9da363c8 
>   src/org/apache/pig/tools/pigstats/SparkStats.java fd45dd4f0be415dd48d9fb7381c57c861bbbf7ce 
>   src/org/apache/pig/tools/pigstats/spark/SparkJobStats.java PRE-CREATION 
>   src/org/apache/pig/tools/pigstats/spark/SparkPigStats.java PRE-CREATION 
>   src/org/apache/pig/tools/pigstats/spark/SparkStatsUtil.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30262/diff/
> 
> 
> Testing
> -------
> 
> Tested with unit tests. At least some unit tests that were failing
> earlier due to lack of stats, such as TestToolsPigServer and TestSplitStore,
> now pass.
> 
> 
> Example of Spark Job metrics that appear in logs:
> 
> 2015-01-29 23:06:42,520 [main] INFO  org.apache.pig.tools.pigstats.spark.SparkPigStats  - Spark Job [0] Metrics
> 2015-01-29 23:06:42,520 [main] INFO  org.apache.pig.tools.pigstats.spark.SparkPigStats  -       EexcutorDeserializeTime : 74
> 2015-01-29 23:06:42,520 [main] INFO  org.apache.pig.tools.pigstats.spark.SparkPigStats  -       ExecutorRunTime : 538
> 2015-01-29 23:06:42,520 [main] INFO  org.apache.pig.tools.pigstats.spark.SparkPigStats  -       ResultSize : 2535
> 2015-01-29 23:06:42,520 [main] INFO  org.apache.pig.tools.pigstats.spark.SparkPigStats  -       JvmGCTime : 0
> 2015-01-29 23:06:42,520 [main] INFO  org.apache.pig.tools.pigstats.spark.SparkPigStats  -       ResultSerializationTime : 1
> 2015-01-29 23:06:42,520 [main] INFO  org.apache.pig.tools.pigstats.spark.SparkPigStats  -       MemoryBytesSpilled : 0
> 2015-01-29 23:06:42,520 [main] INFO  org.apache.pig.tools.pigstats.spark.SparkPigStats  -       DiskBytesSpilled : 0
> 2015-01-29 23:06:42,520 [main] INFO  org.apache.pig.tools.pigstats.spark.SparkPigStats  -       RemoteBlocksFetched : 0
> 2015-01-29 23:06:42,520 [main] INFO  org.apache.pig.tools.pigstats.spark.SparkPigStats  -       LocalBlocksFetched : 2
> 2015-01-29 23:06:42,521 [main] INFO  org.apache.pig.tools.pigstats.spark.SparkPigStats  -       TotalBlocksFetched : 2
> 2015-01-29 23:06:42,521 [main] INFO  org.apache.pig.tools.pigstats.spark.SparkPigStats  -       FetchWaitTime : 0
> 2015-01-29 23:06:42,521 [main] INFO  org.apache.pig.tools.pigstats.spark.SparkPigStats  -       RemoteBytesRead : 0
> 2015-01-29 23:06:42,521 [main] INFO  org.apache.pig.tools.pigstats.spark.SparkPigStats  -       ShuffleBytesWritten : 918
> 2015-01-29 23:06:42,521 [main] INFO  org.apache.pig.tools.pigstats.spark.SparkPigStats  -       ShuffleWriteTime : 67000
> 
> 
> Thanks,
> 
> Mohit Sabharwal
> 
>
