[
https://issues.apache.org/jira/browse/SUBMARINE-949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kevin Su resolved SUBMARINE-949.
--------------------------------
Resolution: Fixed
> [Umbrella] Refactor and stabilize experiment service in submarine-server
> ------------------------------------------------------------------------
>
> Key: SUBMARINE-949
> URL: https://issues.apache.org/jira/browse/SUBMARINE-949
> Project: Apache Submarine
> Issue Type: Improvement
> Components: Backend Server, experiment
> Reporter: Kai-Hsun Chen
> Assignee: Kai-Hsun Chen
> Priority: Major
> Fix For: 0.6.0
>
>
> The experiment service is currently the most important feature in Apache
> Submarine. However, it is neither stable nor user-friendly. Known problems
> and missing features include:
> (1) The frontend workbench cannot reflect the actual experiment status (e.g.,
> OOM).
> (2) The server does not check some constraints imposed by the Kubernetes Java
> client (e.g., if the experiment name contains the character "_", the
> Kubernetes Java API throws an exception).
> (3) Unexpected out-of-memory errors: it is hard for users to predict the
> actual memory usage before running an experiment. Using the memory request
> and memory limit mechanism to allow memory overcommitment would therefore
> help users.
> (4) Allow users to create experiments with the same name and to retrieve
> these experiments by name.
> (5) Allow users to set tags on experiments to divide them into categories, so
> that they can retrieve experiments by tag.
> (6) The K8sSubmitter submits an experiment to the Kubernetes cluster as soon
> as it is created, regardless of how much resource quota remains.
>
> For these reasons, it is necessary to refactor and stabilize the experiment
> service in submarine-server.
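
To illustrate item (2), here is a minimal sketch of a name check that could run
before an experiment is submitted. It is not taken from the submarine-server
code base: the class ExperimentNames and its methods are hypothetical, and it
assumes the experiment name must be an RFC 1123 DNS label (lowercase
alphanumerics and "-", at most 63 characters), which is the stricter of the
name forms Kubernetes uses for object names. Either form rejects "_".

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper, not part of submarine-server.
final class ExperimentNames {
    // RFC 1123 DNS label: lowercase alphanumerics or '-', and it must
    // start and end with an alphanumeric character.
    private static final Pattern DNS_LABEL =
        Pattern.compile("[a-z0-9]([-a-z0-9]*[a-z0-9])?");
    private static final int MAX_LENGTH = 63;

    // Returns true if the name can be used as-is under the assumed rule.
    static boolean isValid(String name) {
        return name != null
            && name.length() <= MAX_LENGTH
            && DNS_LABEL.matcher(name).matches();
    }

    // Best-effort conversion: lowercases the name, replaces every invalid
    // character with '-', and trims leading and trailing dashes.
    // The result may be empty, so callers should still call isValid().
    static String sanitize(String name) {
        String s = name.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9-]", "-");
        s = s.replaceAll("^-+", "").replaceAll("-+$", "");
        if (s.length() > MAX_LENGTH) {
            s = s.substring(0, MAX_LENGTH).replaceAll("-+$", "");
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(isValid("mnist_v1"));
        System.out.println(sanitize("mnist_v1"));
    }
}
```

Rejecting an invalid name up front gives the user a clear error instead of an
exception from the Kubernetes client. If names are sanitized instead, two
different inputs can map to the same result, which matters for item (4).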
--
This message was sent by Atlassian Jira
(v8.3.4#803005)