[ 
https://issues.apache.org/jira/browse/SPARK-54921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18051942#comment-18051942
 ] 

Dipanshu Pandey commented on SPARK-54921:
-----------------------------------------

Hi [~warrenzhu25],

  I've started working on this issue and have raised a PR: 
https://github.com/apache/spark/pull/53803

  Implementation approach:
  - Added a parentIds: collection.Seq[Int] field to the StageData class
  - Updated the protobuf schema with a repeated int64 parent_ids field
  - Populated the field from the existing StageInfo.parentIds, which already 
tracks parent stage dependencies
  - Added serialization/deserialization support for persistence
  - Added unit tests, including the edge case of empty parentIds (root stages)

  The parent stage information was already available internally in the 
scheduler (StageInfo.parentIds), so this change simply exposes it through the 
REST API.
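
  To illustrate the idea, here is a minimal, self-contained sketch of the 
flow described above. It is not the code from the PR: StageInfoSketch, 
StageDataSketch, and the ParentIdsSketch helpers are hypothetical, simplified 
stand-ins, and a plain Seq[Long] stands in for the repeated int64 field.

```scala
// Hypothetical, simplified stand-ins; these are NOT the actual Spark classes.
final case class StageInfoSketch(stageId: Int, parentIds: Seq[Int])
final case class StageDataSketch(stageId: Int, parentIds: collection.Seq[Int])

object ParentIdsSketch {
  // Copy the scheduler-side parent ids into the REST-facing record.
  def toStageData(info: StageInfoSketch): StageDataSketch =
    StageDataSketch(info.stageId, info.parentIds)

  // Mimic a `repeated int64` field: widen to Long on write ...
  def serialize(data: StageDataSketch): Seq[Long] =
    data.parentIds.map(_.toLong).toSeq

  // ... and narrow back to Int on read.
  def deserialize(stageId: Int, wire: Seq[Long]): StageDataSketch =
    StageDataSketch(stageId, wire.map(_.toInt))

  def main(args: Array[String]): Unit = {
    // A root stage has no parents, so its parentIds is empty.
    val root = toStageData(StageInfoSketch(0, Seq.empty))
    val child = toStageData(StageInfoSketch(2, Seq(0, 1)))

    // Both the empty and the non-empty case should survive a round trip.
    assert(deserialize(0, serialize(root)) == root)
    assert(deserialize(2, serialize(child)) == child)
    println(child.parentIds.mkString(","))
  }
}
```

  The empty-sequence case is checked explicitly because a root stage has no 
parents, which is the edge case mentioned in the list above.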

  Could you please assign this issue to me?

> Add parentIds in StageData
> --------------------------
>
>                 Key: SPARK-54921
>                 URL: https://issues.apache.org/jira/browse/SPARK-54921
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 4.1.0
>            Reporter: Zhongwei Zhu
>            Priority: Major
>              Labels: pull-request-available
>
> The `parentIds` field is necessary for tracking stage dependencies in the 
> Spark UI and other status-related tools.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
