[
https://issues.apache.org/jira/browse/DATALAB-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17459129#comment-17459129
]
Leonid Frolov edited comment on DATALAB-2644 at 12/14/21, 1:04 PM:
-------------------------------------------------------------------
*Is only one image currently pulled from the cloud (AWS/GCP/Azure)?*
At the moment we don't pull images from the cloud. The image list is specified
in our code, probably because we need not only the image name but also its
description, so that the user understands what a particular image contains.
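A minimal sketch of what such a hard-coded list with descriptions could look
like. This is only an illustration: the class, the catalog contents and the
image names below are hypothetical and are not taken from the DataLab code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ImageEntry:
    name: str         # identifier passed to the cloud provider (hypothetical)
    description: str  # text shown to the user in the UI


# Hypothetical catalog: each provider maps to the images offered to the user.
CATALOG: Dict[str, List[ImageEntry]] = {
    "aws": [
        ImageEntry("example-conda-ami",
                   "Conda AMI with preinstalled frameworks and libraries"),
    ],
    "azure": [
        ImageEntry("example-dl-ubuntu",
                   "Ubuntu-based Deep Learning image"),
    ],
}


def images_for(provider: str) -> List[ImageEntry]:
    """Return the curated images for a provider, or an empty list."""
    return list(CATALOG.get(provider.lower(), []))
```

Because every entry carries its own description, the UI never has to guess
what an image contains from its name alone.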
On AWS there are two DeepLearning images: the Base AMI, with only some
libraries and drivers, and the Conda AMI, with many preinstalled libraries,
frameworks, etc. We use the Conda AMI, but v.42.1 rather than the latest
version, and the user has no way to change it.
[https://aws.amazon.com/machine-learning/amis/]
[https://aws.amazon.com/marketplace/pp/prodview-x5nivojpquy6y?sr=0-1&ref_=beagle&applicationId=AWSMPContessa]
On Azure we use the Deep Learning image from the Azure docs (the docs list
only one, based on Ubuntu)
[https://azure.microsoft.com/en-us/services/virtual-machines/data-science-virtual-machines/#product-overview]
[https://azuremarketplace.microsoft.com/en-us/marketplace/apps/microsoft-dsvm.ubuntu-1804?tab=Overview]
On GCP there are about 940 DeepLearning images (both GPU and CPU), spread
across 12 frameworks
[https://cloud.google.com/deep-learning-vm/docs/images]
In the DataLab UI we offer images for 8 of these frameworks (all except
TensorFlow Enterprise 2.x, TensorFlow Enterprise 1.x, R (CPU only) and
PyTorch XLA). 7 of the 8 frameworks use the latest GPU image; for the base
framework there are 3 images with different CUDA versions to choose from
(11.0, 10.0, 9.2)
[https://github.com/apache/incubator-datalab/blob/develop/infrastructure-provisioning/src/general/files/gcp/deeplearning_description.json]
*Why can't we pull the full list of available images (templates) for the
related product from the cloud service?*
On AWS we could probably add the Base AMI, but the user would have to install
all frameworks, packages and libraries themselves.
On the Azure marketplace there are many images with software sets similar to
DeepLearning, but they are published not only by Microsoft but also by other
companies. All those images have different names, and there is no easy way to
distinguish them from other images (such as base Ubuntu, openSUSE, Jupyter,
etc.).
On GCP the list of all available DeepLearning images is too big.
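To illustrate the trade-off, here is a rough sketch of reducing a large raw
listing to a small described set. It assumes the listing has already been
fetched as plain image names; the prefixes, descriptions and the `curate`
helper are hypothetical, and the allowlist still has to be maintained by hand,
which is the manual step described above.

```python
from typing import Dict, Iterable, List

# Hypothetical allowlist: image-name prefix -> description shown to the user.
CURATED_PREFIXES: Dict[str, str] = {
    "framework-a-gpu": "Framework A with GPU drivers",
    "base-cuda": "Base image with CUDA only",
}


def curate(listing: Iterable[str]) -> List[Dict[str, str]]:
    """Keep only images whose name starts with an allowlisted prefix."""
    selected = []
    for name in sorted(listing):
        for prefix, description in CURATED_PREFIXES.items():
            if name.startswith(prefix):
                selected.append({"name": name, "description": description})
                break  # first matching prefix wins
    return selected
```

Images without a recognizable naming convention (the Azure marketplace case)
simply fall through such a filter, so a prefix match alone would not be enough
there.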
*Why is the current version of image pulling from the cloud service
implemented this way? (To add another image we have to add it manually.)*
Probably because it was faster to implement. The custom DeepLearning image
that we had before suffered from many compatibility issues caused by software
updates and failed on creation. So the cloud images whose frameworks and
libraries were most similar to our DeepLearning image were chosen.
> Investigate how image adding is implemented and possible future options
> -----------------------------------------------------------------------
>
> Key: DATALAB-2644
> URL: https://issues.apache.org/jira/browse/DATALAB-2644
> Project: Apache DataLab
> Issue Type: Task
> Security Level: Public(Regular Issues)
> Components: DataLab Main
> Reporter: Vira Vitanska
> Assignee: Leonid Frolov
> Priority: Major
> Labels: AWS, AZURE, DevOps, GCP
> Fix For: v.2.5.2
>
>
> *Answer these questions in DeepLearning content*
> # Is only one image currently pulled from the cloud (AWS/GCP/Azure)?
> # Why can't we pull the full list of available images (templates) for the
> related product from the cloud service?
> # Why is the current version of image pulling from the cloud service
> implemented this way? (To add another image we have to add it manually.)