kevin85421 commented on pull request #622:
URL: https://github.com/apache/submarine/pull/622#issuecomment-870701150


   @jiwq Thank you for your recommendation! In my opinion, data scientists need 
to run multiple experiments to achieve a single task. 
   
   For example, a data scientist wants to train a model with the MNIST dataset. 
At first, he creates an experiment "mnist-exp" with his `minist-ver1.py`. Next, 
because the model accuracy does not satisfy the scientist's expectation, the 
data scientist makes some modifications to the Python script (`mnist-ver1.py`) 
and creates an experiment "mnist-exp" again. The bug will occur seamlessly 
(That is, the data scientist does not know his experiment will not be created).
   
   To avoid the condition, the data scientist needs to:
   
   * Method1: 
     * Delete the first "mnist-exp"
     * Create "mnist-exp" with the updated `mnist-ver1.py`.
     * Cons: 
       * (1) Lose the data about the first "mnist-exp" 
       * (2) Delete "mnist-exp" manually
   * Method2:
     * Change the name of the experiment (ex: "mnist-exp-2")
     * Cons: Data scientists usually need to run many experiments to achieve a 
single task. Thus, the repeat human effort is very annoying. (For example, 
"mnist-exp-2", "mnist-exp-3", "mnist-exp-4" ... "mnist-exp-100")
   
   
   With my personal user experience, I usually use a script to help me create 
an experiment without entering the information, including experiment name, 
number of workers, and resource limitations, again and again. However, because 
of this bug, when I want to run my script to create an experiment again, I need 
a lot of human effort.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to