[jira] [Commented] (SPARK-5647) Output metrics do not show up for older hadoop versions (< 2.5)

2015-02-09 Thread Patrick Wendell (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313020#comment-14313020
 ] 

Patrick Wendell commented on SPARK-5647:


Isn't it possible, for file output formats, to just get the file path and 
then read the size of that file? The main challenge I see is how quickly 
that size becomes visible to the HDFS client. In general I think it's worth 
doing because a lot of people still run Spark against older versions of the 
HDFS client, for instance people on AWS who primarily read from S3 and don't 
keep up to date with the newest Hadoop APIs.
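
A minimal sketch of the "read the size of the output file" idea, under stated assumptions: it uses the local filesystem through java.nio as a stand-in for HDFS so that it runs without Hadoop on the classpath, and the names OutputSizeSketch, bytesWritten and writeAndMeasure are hypothetical. Against HDFS the analogous lookup would presumably be FileSystem.getFileStatus(path).getLen() from the Hadoop API. This is not how Spark implements output metrics.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: derive "bytes written" from the final size of the output file.
final class OutputSizeSketch {

    // Returns the current size of the file in bytes.
    // Assumption: the local filesystem stands in for the HDFS client here.
    static long bytesWritten(Path outputFile) {
        try {
            return Files.size(outputFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes data to a temporary file, closes it, then measures it.
    static long writeAndMeasure(byte[] data) {
        try {
            Path out = Files.createTempFile("part-00000", ".tmp");
            try {
                try (OutputStream os = Files.newOutputStream(out)) {
                    os.write(data);
                } // the stream is closed here, before the size is read
                return bytesWritten(out);
            } finally {
                Files.deleteIfExists(out);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytesWritten = " + writeAndMeasure(new byte[1024]));
    }
}
```

The sketch reads the size only after the writer has been closed. That sidesteps, rather than answers, the question of how quickly the size becomes visible to the client.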

> Output metrics do not show up for older hadoop versions (< 2.5)
> ---
>
> Key: SPARK-5647
> URL: https://issues.apache.org/jira/browse/SPARK-5647
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Reporter: Kostas Sakellis
>
> Need to add output metrics for hadoop < 2.5. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5647) Output metrics do not show up for older hadoop versions (< 2.5)

2015-02-09 Thread Kay Ousterhout (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312841#comment-14312841
 ] 

Kay Ousterhout commented on SPARK-5647:
---

Cool, thanks [~sandyr]. Mostly I was curious because I've done this for my own 
purposes by recompiling HDFS with these metrics exposed, and was just wondering 
if there was something simpler.


[jira] [Commented] (SPARK-5647) Output metrics do not show up for older hadoop versions (< 2.5)

2015-02-09 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312834#comment-14312834
 ] 

Sandy Ryza commented on SPARK-5647:
---

Yeah, we would need to check the final file size.  But my opinion is that this 
isn't worth the effort.


[jira] [Commented] (SPARK-5647) Output metrics do not show up for older hadoop versions (< 2.5)

2015-02-09 Thread Kostas Sakellis (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312818#comment-14312818
 ] 

Kostas Sakellis commented on SPARK-5647:


I'm not sure if this is possible with older Hadoop versions; I need to do some 
investigation. Could we check the final file size after it has been 
written? [~sandyr] had some ideas when I talked to him about it.


[jira] [Commented] (SPARK-5647) Output metrics do not show up for older hadoop versions (< 2.5)

2015-02-09 Thread Kay Ousterhout (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312811#comment-14312811
 ] 

Kay Ousterhout commented on SPARK-5647:
---

Is this possible? I thought Hadoop didn't add thread-level stats until 2.5 (see 
the comment here: 
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala#L156)
 -- is there a different way you were thinking of adding the output bytes?
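
A rough sketch of how code can detect at runtime whether the thread-level statistics API is present, which is the kind of version check the linked comment refers to. The class HadoopStatsProbe and the method hasPublicMethod are hypothetical names. The Hadoop names probed for (org.apache.hadoop.fs.FileSystem$Statistics and getThreadStatistics) are an assumption about the Hadoop 2.5+ API, and they are looked up purely by reflection so the sketch also runs when Hadoop is absent.

```java
import java.lang.reflect.Method;

// Hypothetical probe: does a class on the classpath expose a given public method?
final class HadoopStatsProbe {

    // Returns true if className can be loaded and has a public method named methodName.
    static boolean hasPublicMethod(String className, String methodName) {
        try {
            // Load without running static initializers; only the method table is needed.
            Class<?> cls = Class.forName(
                className, false, HadoopStatsProbe.class.getClassLoader());
            for (Method m : cls.getMethods()) {
                if (m.getName().equals(methodName)) {
                    return true;
                }
            }
            return false;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class missing or unloadable: treat the capability as unavailable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed names for the Hadoop 2.5+ thread-level statistics API.
        boolean available = hasPublicMethod(
            "org.apache.hadoop.fs.FileSystem$Statistics", "getThreadStatistics");
        System.out.println("thread-level stats available: " + available);
    }
}
```

When the probe reports false, a caller would have to fall back to some other source for the output bytes, such as the file-size approach discussed in this thread.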
