[
https://issues.apache.org/jira/browse/SPARK-26260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
shahid updated SPARK-26260:
---------------------------
Description:
Currently, task summary metrics are calculated based on all tasks, instead
of only successful tasks.
After https://issues.apache.org/jira/browse/SPARK-26119, when using the
InMemory store, task summary metrics are computed from successful tasks
only. But we still need an efficient implementation for the disk store case
for the Spark History Server (SHS). The main bottleneck for the disk store is
deserialization time overhead.
Hints: The way indexing works needs to be reworked so that specific metrics
can be indexed separately for successful and failed tasks (this would be
tricky). This would also require changing the disk store version (to
invalidate old stores).
Or any other efficient solution.
was:
Currently, task summary metrics are calculated based on all tasks, instead
of only successful tasks.
After https://issues.apache.org/jira/browse/SPARK-26119, when using the
InMemory store, task summary metrics are computed from successful tasks
only. But we still need an efficient implementation for the disk store case
for the Spark History Server (SHS). The main bottleneck for the disk store is
deserialization time overhead.
Hints: The way indexing works needs to be reworked so that specific metrics
can be indexed separately for successful and failed tasks (this would be
tricky). This would also require changing the disk store version (to
invalidate old stores).
> Summary Task Metrics for Stage Page: Efficient implementation for SHS when
> using disk store.
> --------------------------------------------------------------------------------------------
>
> Key: SPARK-26260
> URL: https://issues.apache.org/jira/browse/SPARK-26260
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 2.4.0, 3.0.0
> Reporter: shahid
> Priority: Major
>
> Currently, task summary metrics are calculated based on all tasks, instead
> of only successful tasks.
> After https://issues.apache.org/jira/browse/SPARK-26119, when using the
> InMemory store, task summary metrics are computed from successful tasks
> only. But we still need an efficient implementation for the disk store case
> for the Spark History Server (SHS). The main bottleneck for the disk store is
> deserialization time overhead.
> Hints: The way indexing works needs to be reworked so that specific metrics
> can be indexed separately for successful and failed tasks (this would be
> tricky). This would also require changing the disk store version (to
> invalidate old stores).
> Or any other efficient solution.
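To make the indexing hint concrete, here is a minimal sketch in Python of
one way such an index could behave. It is not Spark's kvstore code and not a
proposed patch: the MetricIndex class, the SUCCESS/FAILED buckets, and the
quantile position rule are all assumptions made for illustration. The idea is
that if index keys start with the task status, the successful tasks form one
contiguous, value-sorted range, so a quantile can be read by position instead
of deserializing every task.

```python
import bisect

# Hypothetical status buckets used as the leading part of each index key.
SUCCESS, FAILED = 0, 1


class MetricIndex:
    """Toy ordered index over (status, metric value, task id) keys."""

    def __init__(self):
        # Kept sorted, standing in for the key order of an on-disk store.
        self._keys = []

    def add(self, task_id, status, value):
        bisect.insort(self._keys, (status, value, task_id))

    def quantiles(self, status, probs):
        # Find the contiguous key range that belongs to one status bucket.
        lo = bisect.bisect_left(self._keys, (status,))
        hi = bisect.bisect_left(self._keys, (status + 1,))
        count = hi - lo
        if count == 0:
            return []
        # Read only the keys at the quantile positions within that range.
        return [
            self._keys[lo + min(int(p * count), count - 1)][1]
            for p in probs
        ]


index = MetricIndex()
# (task id, status, run time); the failed task has an outlier run time.
for task_id, status, run_time in [
    (1, SUCCESS, 10),
    (2, FAILED, 900),
    (3, SUCCESS, 30),
    (4, SUCCESS, 20),
    (5, SUCCESS, 40),
]:
    index.add(task_id, status, run_time)

print(index.quantiles(SUCCESS, [0.0, 0.5, 1.0]))  # [10, 30, 40]
```

In this sketch the failed task's run time of 900 never enters the summary for
successful tasks, and each quantile costs one positional read. A real disk
store would need one such index per metric, which is where the extra work and
the store version change mentioned in the description would come in.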
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)