[ 
https://issues.apache.org/jira/browse/SPARK-42472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692981#comment-17692981
 ] 

Ruifeng Zheng edited comment on SPARK-42472 at 2/24/23 2:40 AM:
----------------------------------------------------------------

Since on the ML side 'groupId' is currently only used when training models, and 
training can be treated as a 'command', I guess it is fine to:
1, support it in 'commands.proto' by adding an additional field 'group_id';
2, add an additional 'cancel job group' command.


{code:java}
// A [[Command]] is an operation that is executed by the server that does not
// directly consume or produce a relational result.
message Command {
  oneof command_type {
    CommonInlineUserDefinedFunction register_function = 1;
    WriteOperation write_operation = 2;
    CreateDataFrameViewCommand create_dataframe_view = 3;
    WriteOperationV2 write_operation_v2 = 4;
    CancelJobGroup cancel_job_group = 5;

    // This field is used to mark extensions to the protocol. When plugins generate arbitrary
    // Commands they can add them here. During the planning the correct resolution is done.
    google.protobuf.Any extension = 999;

  }
  
  // (Optional)
  //
  // Optional for commands other than 'CancelJobGroup'.
  // If set, the command will be executed within the given group id. A new group
  // will be created if needed.
  //
  // Required for command 'CancelJobGroup'.
  // All running commands within the given group id will be cancelled.
  //
  // NOTE: the field number below is illustrative only.
  optional string job_group_id = 6;
}
{code}
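
To make the intended semantics concrete, here is a minimal sketch of the bookkeeping a server could keep for the 'job_group_id' field: commands register under a group (created on demand), and a 'cancel job group' request signals every command still running in that group. This is plain Python, not Spark Connect code; the class `JobGroupRegistry`, its methods, and the group names are all hypothetical.

```python
import threading
from collections import defaultdict


class JobGroupRegistry:
    """Hypothetical server-side bookkeeping; not part of Spark Connect."""

    def __init__(self):
        self._lock = threading.Lock()
        # job_group_id -> cancellation flags of commands still running
        self._running = defaultdict(set)

    def start(self, job_group_id):
        """Register a command under a group, creating the group if needed."""
        flag = threading.Event()
        with self._lock:
            self._running[job_group_id].add(flag)
        return flag

    def finish(self, job_group_id, flag):
        """Forget a command that completed or observed its cancellation."""
        with self._lock:
            flags = self._running.get(job_group_id)
            if flags is not None:
                flags.discard(flag)
                if not flags:
                    del self._running[job_group_id]

    def cancel_group(self, job_group_id):
        """Signal every running command in the group; return how many."""
        with self._lock:
            flags = self._running.pop(job_group_id, set())
        for flag in flags:
            flag.set()
        return len(flags)


registry = JobGroupRegistry()
fit_a = registry.start("cv-fold-search")   # e.g. two model fits in one group
fit_b = registry.start("cv-fold-search")
other = registry.start("unrelated")
cancelled = registry.cancel_group("cv-fold-search")
```

A running command would poll its flag (`flag.is_set()`) and stop early; commands in other groups are left untouched.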



was (Author: podongfeng):
Since on the ML side 'groupId' is currently only used when training models, and 
training can be treated as a 'command', I guess it is fine to:
1, support it in 'commands.proto' by adding an additional field 'group_id';
2, add an additional 'cancel job group' command

> Make spark connect supporting canceling job group
> -------------------------------------------------
>
>                 Key: SPARK-42472
>                 URL: https://issues.apache.org/jira/browse/SPARK-42472
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Connect, ML
>    Affects Versions: 3.4.0
>            Reporter: Ruifeng Zheng
>            Priority: Major
>
> PySpark ML CrossValidator relies on this.
> Proposal:
> when the Spark client sends a request, can we make it include a 
> "job_group_id", and on the server side, handle every request with the same 
> (user_id, job_group_id) in a fixed thread?
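
The "fixed thread" idea in the proposal can be sketched as one single-thread executor per (user_id, job_group_id) key. A likely motivation is that `SparkContext.setJobGroup` stores the group as a thread-local property, so requests handled on the same thread share the group. The sketch below is plain Python with no Spark dependency; `FixedThreadDispatcher` and `handle_request` are hypothetical names, not an existing server API.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FixedThreadDispatcher:
    """Hypothetical router: one single-thread executor per (user_id, job_group_id)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._executors = {}

    def submit(self, user_id, job_group_id, handler, *args):
        """Run handler(*args) on the thread dedicated to this key."""
        key = (user_id, job_group_id)
        with self._lock:
            executor = self._executors.get(key)
            if executor is None:
                executor = ThreadPoolExecutor(max_workers=1)
                self._executors[key] = executor
        return executor.submit(handler, *args)

    def shutdown(self):
        """Stop all dedicated threads after their queued work completes."""
        with self._lock:
            executors = list(self._executors.values())
            self._executors.clear()
        for executor in executors:
            executor.shutdown(wait=True)


def handle_request(_payload):
    # Stand-in for real request handling; reports which thread ran it.
    return threading.get_ident()


dispatcher = FixedThreadDispatcher()
same_group = [
    dispatcher.submit("alice", "cv-1", handle_request, i).result()
    for i in range(3)
]
other_group = dispatcher.submit("alice", "cv-2", handle_request, 0).result()
dispatcher.shutdown()
```

All three requests for ("alice", "cv-1") report the same thread id, while the request for a different group runs on its own thread. A real server would also need to retire idle executors, which this sketch leaves out.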



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
