[
https://issues.apache.org/jira/browse/SUBMARINE-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094813#comment-17094813
]
Wangda Tan commented on SUBMARINE-481:
--------------------------------------
Thanks [~pingsutw]. I see it is still using "Job" as the name; I think we
should switch to "Experiment" for the first cut.
Here are my rough ideas about how to define the experiment API:
Top-level APIs:
- POST /experiment: add a new experiment.
- GET /experiment/\{experiment-id}: get an experiment.
- GET /experiment/list (with parameters): get the experiments matching the parameters.
- DELETE /experiment/\{experiment-id}: delete an experiment.
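One thing to watch with these paths: "list" sits in the same position as an experiment ID, so the router has to try /experiment/list before the ID pattern. Below is a minimal Python sketch of that ordering. The route table, the action names and the resolve() helper are all hypothetical and are not part of the Submarine code base.

```python
import re

# Hypothetical route table for the four proposed endpoints.
# Order matters: "/experiment/list" is listed before the ID pattern,
# otherwise "list" would be captured as an experiment ID.
ROUTES = [
    ("POST", r"^/experiment$", "create"),
    ("GET", r"^/experiment/list$", "list"),
    ("GET", r"^/experiment/(?P<experiment_id>[^/]+)$", "get"),
    ("DELETE", r"^/experiment/(?P<experiment_id>[^/]+)$", "delete"),
]


def resolve(method, path):
    """Return (action, path_params) for the first matching route, or None."""
    for route_method, pattern, action in ROUTES:
        if route_method != method:
            continue
        match = re.match(pattern, path)
        if match:
            return action, match.groupdict()
    return None


if __name__ == "__main__":
    for method, path in [
        ("POST", "/experiment"),
        ("GET", "/experiment/list"),
        ("GET", "/experiment/exp-1"),
        ("DELETE", "/experiment/exp-1"),
    ]:
        print(method, path, "->", resolve(method, path))
```

An alternative that avoids the ordering concern would be to serve the list from GET /experiment with query parameters, but that is a design choice for the actual spec.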
Objects definition:
{code:java}
Experiment {
  // runtime status of an experiment
  ExperimentStatus: {
    ExperimentState: Running (string, common states of experiments,
        such as Running, New, Pending, Completed, Failed, etc.)
    startTime: string, (in UTC format, like 2020-04-28T19:01:37Z)
    runningTimeInMilliSeconds: long (how long the experiment has been running),
    terminatedTime: string, (in UTC format, empty if not terminated).
  },
  // experiment spec (static)
  Experiment: {
    // below are common fields for all experiments
    name: $unique-name-for-the-experiment (string)
    environment: $name-for-environment (string)
    // Where we can sync the code from
    code:
      $ref: "#/definitions/CodeSpec"
    timeout:
      $ref: "#/definitions/Time"
    type: string with following values:
        ("Script", "Tensorflow", "PyTorch", "Template")
    // based on the type, one of the following types
    parameters:
      // Use the oneOf keyword of swagger, see
      // https://swagger.io/docs/specification/data-models/inheritance-and-polymorphism/
      oneOf:
        - $ref: "#/definitions/ScriptParameterSpec"
        - $ref: "#/definitions/TensorflowParameterSpec"
        - ...
  },
}
CodeSpec {
  syncMode: s3/git (string)
  url: string (git://, or s3://)
}
ScriptParameterSpec {
  cli:
    - $ref: "#/definitions/CommandLineOptions"
  resource:
    - $ref: "#/definitions/ResourceSpec"
}
TensorflowParameterSpec {
  // reference to the below like:
  ps:
    environment: "team-default-ml-cpu"
    resource_constraint:
      res="mem=20gb, vcore=3, gpu=0"
  worker:
    environment: "team-default-ml-gpu"
    resource_constraint:
      res="mem=20gb, vcore=3, gpu=2"
}
Time {
  time: 30
  unit: "Minute" (or other units)
}
CommandLineOptions {
  type: array
  items:
    type: string
}
ResourceSpec {
  // Should be a list of key (resource type) to value, and unit,
  // like cpu=10, memory = 20 GB, etc.
}
{code}
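To make the object definitions a bit more concrete, here is a rough Python sketch that builds the static spec part of a "Script" experiment as a plain dict and serializes it to JSON. The field names follow the draft above. The make_experiment_spec() helper, the example values, and the choice to derive syncMode from the URL scheme are assumptions for illustration only.

```python
import json

# Allowed values taken from the draft definitions above.
EXPERIMENT_TYPES = {"Script", "Tensorflow", "PyTorch", "Template"}
SYNC_MODES = {"git", "s3"}


def make_experiment_spec(name, environment, code_url, exp_type,
                         parameters, timeout_minutes):
    """Build the static experiment spec (hypothetical helper)."""
    if exp_type not in EXPERIMENT_TYPES:
        raise ValueError("unknown experiment type: %s" % exp_type)
    # Assumption: syncMode is derived from the URL scheme (git:// or s3://).
    scheme, sep, _ = code_url.partition("://")
    if not sep or scheme not in SYNC_MODES:
        raise ValueError("code url must start with git:// or s3://")
    return {
        "name": name,
        "environment": environment,
        "code": {"syncMode": scheme, "url": code_url},
        "timeout": {"time": timeout_minutes, "unit": "Minute"},
        "type": exp_type,
        "parameters": parameters,
    }


if __name__ == "__main__":
    spec = make_experiment_spec(
        name="mnist-script-demo",            # example value
        environment="team-default-ml-cpu",
        code_url="git://example.org/demo.git",  # example value
        exp_type="Script",
        # ScriptParameterSpec: command line options plus resources.
        parameters={
            "cli": ["python", "train.py"],
            "resource": {"cpu": 10, "memory": "20 GB"},
        },
        timeout_minutes=30,
    )
    print(json.dumps(spec, indent=2))
```

The runtime ExperimentStatus is left out on purpose, since the server would fill it in rather than the client.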
> Use Swagger to describe and document submarine RESTful APIs
> -----------------------------------------------------------
>
> Key: SUBMARINE-481
> URL: https://issues.apache.org/jira/browse/SUBMARINE-481
> Project: Apache Submarine
> Issue Type: Improvement
> Components: Doc
> Reporter: Kevin Su
> Priority: Major
> Labels: pull-request-available
> Attachments: swagger.yaml
>
>
> Follow
> [JobManagerRestApi.java|https://github.com/apache/submarine/blob/31f9322216307f958a1c3ec79e8a09cb0a5f5b5e/submarine-server/server-core/src/main/java/org/apache/submarine/server/rest/JobManagerRestApi.java#L44]
> and [Job.java
> |https://github.com/apache/submarine/blob/master/submarine-server/server-api/src/main/java/org/apache/submarine/server/api/job/Job.java],
> [submarine-server
> doc|https://github.com/apache/submarine/tree/master/docs/submarine-server] to
> define our job API with Swagger.
> We could copy swagger.yaml to [https://editor.swagger.io/] so that we can
> automatically generate docs and the related client API.
> If I missed anything or something needs to be modified, please let me know.
> cc [~leftnoteasy] [~liuxun] [~jiwq] [~tangzhankun]