[ https://issues.apache.org/jira/browse/SPARK-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570446#comment-15570446 ]

Gaoxiang Liu edited comment on SPARK-16827 at 10/13/16 1:16 AM:
----------------------------------------------------------------

[~rxin], for this one, I think spill bytes (both memory and disk) and shuffle 
bytes are already logged and reported, right?
Also, if I want to add spill time metrics, do you suggest I create a parent 
class DiskWriteMetrics, have ShuffleWriteMetrics and my new class (e.g. 
SpillWriteMetrics) inherit from it, and then pass the parent class 
(DiskWriteMetrics) to UnsafeSorterSpillWriter 
https://github.com/facebook/FB-Spark/blob/fb-2.0/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java#L209
 ?

Or do you suggest renaming the ShuffleWriteMetrics class to something like 
WriteMetrics?
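
Roughly, the hierarchy I have in mind would look like the sketch below. The class names DiskWriteMetrics, ShuffleWriteMetrics, and SpillWriteMetrics are the ones discussed above, but every field and method signature here is an illustrative assumption, not Spark's actual metrics API:

```java
// Hypothetical sketch only: field and method names are assumptions,
// not the real ShuffleWriteMetrics API in Spark.
abstract class DiskWriteMetrics {
    private long bytesWritten = 0L;
    private long writeTime = 0L; // nanoseconds

    public void incBytesWritten(long v) { bytesWritten += v; }
    public void incWriteTime(long v) { writeTime += v; }
    public long bytesWritten() { return bytesWritten; }
    public long writeTime() { return writeTime; }
}

// Both shuffle writes and spill writes share the same accounting,
// so each subclass only marks which kind of write it tracks.
class ShuffleWriteMetrics extends DiskWriteMetrics { }
class SpillWriteMetrics extends DiskWriteMetrics { }

// A writer such as UnsafeSorterSpillWriter could then accept the parent
// type and stay agnostic about which metrics object it updates.
class SpillWriter {
    private final DiskWriteMetrics metrics;

    SpillWriter(DiskWriteMetrics metrics) { this.metrics = metrics; }

    void write(byte[] data) {
        long start = System.nanoTime();
        // ... actual disk write elided ...
        metrics.incBytesWritten(data.length);
        metrics.incWriteTime(System.nanoTime() - start);
    }
}
```

That way the spill path would report into SpillWriteMetrics instead of ShuffleWriteMetrics, without duplicating the write-accounting code.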



> Stop reporting spill metrics as shuffle metrics
> -----------------------------------------------
>
>                 Key: SPARK-16827
>                 URL: https://issues.apache.org/jira/browse/SPARK-16827
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle, Spark Core
>    Affects Versions: 2.0.0
>            Reporter: Sital Kedia
>            Assignee: Brian Cho
>              Labels: performance
>             Fix For: 2.1.0
>
>
> One of our Hive jobs, which looks like this -
> {code}
>  SELECT  userid
>      FROM  table1 a
>      JOIN table2 b
>       ON    a.ds = '2016-07-15'
>       AND  b.ds = '2016-07-15'
>       AND  a.source_id = b.id
> {code}
> After upgrading to Spark 2.0, the job is significantly slower. Digging a 
> little into it, we found that one of the stages produces an excessive amount 
> of shuffle data. Please note that this is a regression from Spark 1.6: Stage 
> 2 of the job, which used to produce 32KB of shuffle data with 1.6, now 
> produces more than 400GB with Spark 2.0. We also tried turning off 
> whole-stage code generation, but that did not help.
> PS - Even if the intermediate shuffle data size is huge, the job still 
> produces accurate output.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
