[
https://issues.apache.org/jira/browse/HIVE-18652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444287#comment-16444287
]
Sahil Takiar commented on HIVE-18652:
-------------------------------------
Now the console logs show this:
{code}
Spark Job[1] Metrics: TaskDurationTime: 570 ExecutorCpuTime: 390 JvmGCTime: 0
BytesRead / RecordsRead: 11150 / 500
ShuffleTotalBytesRead / ShuffleRecordsRead: 2445 / 1
ShuffleBytesWritten / ShuffleRecordsWritten: 2445 / 1
{code}
I decided to expose only these metrics because they are the ones the Spark
Web UI shows by default.
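For context, here is a minimal sketch of how per-task counters like these can
be aggregated through Spark's public listener API. This is illustrative Java
against the Spark 2.x {{SparkListener}} API, not the actual patch; the class
name and the nanosecond-to-millisecond conversion for {{executorCpuTime}} are
my assumptions:
{code}
import java.util.concurrent.atomic.AtomicLong;
import org.apache.spark.executor.TaskMetrics;
import org.apache.spark.scheduler.SparkListener;
import org.apache.spark.scheduler.SparkListenerTaskEnd;

// Illustrative aggregator (not the actual HIVE-18652 patch): sums the same
// per-task counters the Spark Web UI shows by default, across a job's tasks.
public class ConsoleMetricsListener extends SparkListener {
  private final AtomicLong taskDurationMs = new AtomicLong();
  private final AtomicLong executorCpuMs = new AtomicLong();
  private final AtomicLong jvmGcMs = new AtomicLong();
  private final AtomicLong bytesRead = new AtomicLong();
  private final AtomicLong recordsRead = new AtomicLong();
  private final AtomicLong shuffleBytesRead = new AtomicLong();
  private final AtomicLong shuffleRecordsRead = new AtomicLong();
  private final AtomicLong shuffleBytesWritten = new AtomicLong();
  private final AtomicLong shuffleRecordsWritten = new AtomicLong();

  @Override
  public void onTaskEnd(SparkListenerTaskEnd taskEnd) {
    TaskMetrics m = taskEnd.taskMetrics();
    if (m == null) {
      return; // metrics may be unavailable, e.g. for failed tasks
    }
    taskDurationMs.addAndGet(m.executorRunTime());          // milliseconds
    executorCpuMs.addAndGet(m.executorCpuTime() / 1000000); // Spark reports ns
    jvmGcMs.addAndGet(m.jvmGCTime());                       // milliseconds
    bytesRead.addAndGet(m.inputMetrics().bytesRead());
    recordsRead.addAndGet(m.inputMetrics().recordsRead());
    shuffleBytesRead.addAndGet(m.shuffleReadMetrics().totalBytesRead());
    shuffleRecordsRead.addAndGet(m.shuffleReadMetrics().recordsRead());
    shuffleBytesWritten.addAndGet(m.shuffleWriteMetrics().bytesWritten());
    shuffleRecordsWritten.addAndGet(m.shuffleWriteMetrics().recordsWritten());
  }

  // Renders one summary line in the same shape as the console output above.
  public String format(int jobId) {
    return String.format(
        "Spark Job[%d] Metrics: TaskDurationTime: %d ExecutorCpuTime: %d"
            + " JvmGCTime: %d BytesRead / RecordsRead: %d / %d"
            + " ShuffleTotalBytesRead / ShuffleRecordsRead: %d / %d"
            + " ShuffleBytesWritten / ShuffleRecordsWritten: %d / %d",
        jobId, taskDurationMs.get(), executorCpuMs.get(), jvmGcMs.get(),
        bytesRead.get(), recordsRead.get(),
        shuffleBytesRead.get(), shuffleRecordsRead.get(),
        shuffleBytesWritten.get(), shuffleRecordsWritten.get());
  }
}
{code}
A listener like this would be registered via {{SparkContext#addSparkListener}}
before the job runs; the actual HoS collection path may differ (e.g., metrics
shipped back from a remote driver over RPC).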
Some follow-up enhancements:
* HIVE-19051: Add units to the displayed metrics (a small sketch of this
follows the list below)
* HIVE-19176: When we implement this, we can also add metrics for each
individual Spark stage; right now the granularity is at the job level
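For the units follow-up (HIVE-19051), the change could be as small as a
formatting helper. A minimal sketch, assuming we only want human-readable
byte counts (illustrative only, not the actual patch):
{code}
// Illustrative helper for HIVE-19051-style output: renders a raw byte
// counter in human-readable units, e.g. 11150 -> "10.89 KB".
static String humanReadableBytes(long bytes) {
  if (bytes < 1024) {
    return bytes + " B";
  }
  int exp = (int) (Math.log(bytes) / Math.log(1024));
  return String.format("%.2f %sB",
      bytes / Math.pow(1024, exp), "KMGTPE".charAt(exp - 1));
}
{code}
With that, {{BytesRead: 11150}} would print as {{BytesRead: 10.89 KB}}.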
> Print Spark metrics on console
> ------------------------------
>
> Key: HIVE-18652
> URL: https://issues.apache.org/jira/browse/HIVE-18652
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Priority: Major
> Attachments: HIVE-18652.1.patch, HIVE-18652.2.patch
>
>
> For Hive-on-MR, each MR job launched prints out some stats about the job:
> {code}
> INFO : 2018-02-07 17:51:11,218 Stage-1 map = 0%, reduce = 0%
> INFO : 2018-02-07 17:51:18,396 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.87 sec
> INFO : 2018-02-07 17:51:25,742 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.34 sec
> INFO : MapReduce Total cumulative CPU time: 4 seconds 340 msec
> INFO : Ended Job = job_1517865654989_0004
> INFO : MapReduce Jobs Launched:
> INFO : Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 4.34 sec HDFS Read: 7353 HDFS Write: 151 SUCCESS
> INFO : Total MapReduce CPU Time Spent: 4 seconds 340 msec
> {code}
> We should do the same for Hive-on-Spark.