[
https://issues.apache.org/jira/browse/SPARK-34015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Apache Spark reassigned SPARK-34015:
------------------------------------
Assignee: Apache Spark
> SparkR partition timing summary reports input time correctly
> ------------------------------------------------------------
>
> Key: SPARK-34015
> URL: https://issues.apache.org/jira/browse/SPARK-34015
> Project: Spark
> Issue Type: Bug
> Components: SparkR
> Affects Versions: 2.3.2, 3.0.1
> Environment: Observed on CentOS-7 running spark 2.3.1 and on my mac
> running master
> Reporter: Tom Howland
> Assignee: Apache Spark
> Priority: Major
> Original Estimate: 0h
> Remaining Estimate: 0h
>
> When SparkR is run at log level INFO, the worker prints a summary of how it
> spent its time processing the partition. A logic error causes that summary
> to over-report the time spent reading input rows.
> In detail: in the wider context, the variable inputElap marks the beginning
> of reading rows, but in the part changed here it was reused as a local
> variable for measuring compute time. The error is therefore not observable
> when there is only one group per partition, which is what you get in unit
> tests.
> For our application, here's what a log entry looks like before these changes
> were applied:
> {{20/10/09 04:08:58 WARN RRunner: Times: boot = 0.013 s, init = 0.005 s,
> broadcast = 0.000 s, read-input = 529.471 s, compute = 492.037 s,
> write-output = 0.020 s, total = 1021.546 s}}
> This indicates that we're spending more time reading rows than operating on
> them.
> After these changes, it looks like this:
> {{20/12/15 06:43:29 WARN RRunner: Times: boot = 0.013 s, init = 0.010 s,
> broadcast = 0.000 s, read-input = 120.275 s, compute = 1680.161 s,
> write-output = 0.045 s, total = 1812.553 s}}
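
The effect described in the issue can be reproduced with a small model. The sketch below is not the SparkR worker code. It is a simplified Python illustration that assumes a single timestamp (named input_elap here, after inputElap) is sampled around the read phase and later subtracted to report read-input. The FakeClock class, the run_partition function and the workload numbers are all hypothetical.

```python
class FakeClock:
    """Deterministic clock so the example does not depend on wall time."""

    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


def run_partition(group_costs, read_cost, reuse_marker):
    """Return (read_input, compute) as a worker might report them.

    group_costs  -- compute seconds per group (hypothetical workload)
    read_cost    -- seconds spent reading the partition's rows
    reuse_marker -- True models the bug: the read-phase marker is
                    overwritten at the start of every group's compute step
    """
    clock = FakeClock()
    read_start = clock.now
    clock.advance(read_cost)      # read the rows
    input_elap = clock.now        # marker belonging to the read phase

    compute_total = 0.0
    for cost in group_costs:
        compute_start = clock.now
        if reuse_marker:
            # Assumed bug: the shared marker doubles as a local variable.
            input_elap = compute_start
        clock.advance(cost)       # run the user function on one group
        compute_total += clock.now - compute_start

    # With the marker overwritten, this also counts the compute time of
    # every group except the last one.
    read_input = input_elap - read_start
    return read_input, compute_total


if __name__ == "__main__":
    print(run_partition([10.0, 10.0, 10.0], 5.0, reuse_marker=True))
    print(run_partition([10.0, 10.0, 10.0], 5.0, reuse_marker=False))
    print(run_partition([10.0], 5.0, reuse_marker=True))
```

With three groups of 10 s each and 5 s of reading, the model with the reused marker reports 25 s of read-input, while the model with a separate marker reports 5 s. With a single group both variants report 5 s, which matches the observation that the error does not show up when there is only one group per partition.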