This is an automated email from the ASF dual-hosted git repository.

arvindsh pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/fluo-muchos.git


The following commit(s) were added to refs/heads/main by this push:
     new 7128cc6  Make Azure data disk config names consistent (#397)
7128cc6 is described below

commit 7128cc630b9e01920ea6782fca57ffaadac0f462
Author: Arvind Shyamsundar <[email protected]>
AuthorDate: Fri Jun 11 09:35:18 2021 -0700

    Make Azure data disk config names consistent (#397)
    
    * Make Azure data disk config names consistent
    * Use the `data_disk` prefix consistently across single-VMSS and
      multiple-VMSS configurations in Azure
    * Consistently specify OS disk SKU, and disk caching across VMSS types
    * Remove the hard-coded Azure VM SKU for the proxy host
    * Remove second `metrics` role assignment in example Azure multi-VMSS
      configuration file - we just need 1 host assigned `metrics`.
    * Update Azure multiple VMSS doc and sample config files accordingly
    * Document the `azure_proxy_host_vm_sku` configuration
    * Other minor edits to the README
---
 README.md                                           | 12 +++++++-----
 ansible/roles/azure/tasks/create_multiple_vmss.yml  |  2 +-
 ansible/roles/azure/tasks/create_optional_proxy.yml |  8 +++++---
 ansible/roles/azure/tasks/create_vmss.yml           |  5 +++--
 conf/azure_multiple_vmss_vars.yml.example           | 17 ++++++++---------
 conf/muchos.props.example                           |  7 +++++--
 docs/azure-multiple-vmss.md                         |  2 +-
 lib/muchos/azure.py                                 | 14 ++++----------
 lib/muchos/config/azure.py                          |  6 +++---
 lib/muchos/config/azurevalidationhelpers.py         |  6 +-----
 lib/muchos/config/azurevalidations.py               | 17 +++++++++--------
 lib/tests/azure/test_config.py                      |  2 +-
 12 files changed, 48 insertions(+), 50 deletions(-)
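
Existing configurations need two renames after this commit: `managed_disk_type` becomes `data_disk_sku` in the `azure` section of `muchos.props`, and `disk_sku` becomes `data_disk_sku` in each `vars_list` entry of the multiple-VMSS vars file. The sketch below is one way to apply those renames in memory. The helper names `rename_props_key` and `rename_vmss_keys` are hypothetical and not part of Muchos, and writing the result back with `configparser` would drop comments from the file.

```python
import configparser


def rename_props_key(props_text, section="azure",
                     old="managed_disk_type", new="data_disk_sku"):
    """Parse INI-style text and rename `old` to `new` within `section`.

    Hypothetical helper for illustration; returns (parser, changed).
    """
    # Interpolation is disabled so '%' characters in values are kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(props_text)
    changed = False
    if parser.has_section(section) and parser.has_option(section, old):
        # Do not overwrite a value already stored under the new name.
        if not parser.has_option(section, new):
            parser.set(section, new, parser.get(section, old))
        parser.remove_option(section, old)
        changed = True
    return parser, changed


def rename_vmss_keys(vars_list, old="disk_sku", new="data_disk_sku"):
    """Return copies of the VMSS dicts with `old` renamed to `new`."""
    migrated = []
    for vmss in vars_list:
        entry = dict(vmss)  # copy so the caller's data is left untouched
        if old in entry:
            entry.setdefault(new, entry.pop(old))
        migrated.append(entry)
    return migrated


parser, changed = rename_props_key(
    "[azure]\nmanaged_disk_type = Standard_LRS\n"
)
vmss_entries = rename_vmss_keys(
    [{"name_suffix": "vmss1", "disk_sku": "Premium_LRS"}]
)
```

The second helper takes the already-parsed `vars_list`, so it works with whichever YAML loader is in use.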

diff --git a/README.md b/README.md
index 390793b..efe71fb 100644
--- a/README.md
+++ b/README.md
@@ -144,7 +144,7 @@ You can check the status of the nodes using the EC2 Dashboard or by running the
 ## Launching an Azure cluster
 
 Before launching a cluster, you will need to complete the requirements for Azure above, clone the Muchos repo, and
-create [muchos.props] by making a copy of existing [muchos.props.example]. If you want to give others access to your
+create your `conf/muchos.props` file by making a copy of the [muchos.props] example. If you want to give others access to your
 cluster, add their public keys to a file named `keys` in your `conf/` directory.  During the setup of your cluster,
 this file will be appended on each node to the `~/.ssh/authorized_keys` file for the user set by the
 `cluster.username` property.  You will also need to ensure you have authenticated to Azure and set the target
@@ -168,14 +168,14 @@ Under the `azure` section, edit following values as per your configuration:
 * `vnet` to provide the name of the VNET that your cluster nodes should use. A new VNET with this name will be
   created if it doesn't already exist
 * `subnet` to provide a name for the subnet within which the cluster resources will be deployed
-* `use_multiple_vmss` allows you to configure VMs with different CPU, memory, disks for leaders and workers. To 
-  know more about this feature, please follow the [doc](docs/azure-multiple-vmss.md).   
-* `azure_image_reference` allows you to specify the CentOS image SKU in the format as shown below. To configure 
+* `use_multiple_vmss` allows you to configure VMs with different CPU, memory, disk configurations for leaders and workers. To
+  know more about this feature, please follow the [doc](docs/azure-multiple-vmss.md).
+* `azure_image_reference` allows you to specify the CentOS image SKU in the format as shown below. To configure
   CentOS 8.x, please follow [these steps](docs/azure-image-reference.md).
   ```bash
   offer|publisher|sku|version|
   Ex: CentOS|OpenLogic|7.5|latest|
-  ```  
+  ```
 * `numnodes` to change the cluster size in terms of number of nodes deployed
 * `data_disk_count` to specify how many persistent data disks are attached to each node and will be used by HDFS.
    If you would prefer to use ephemeral / storage for Azure clusters, please follow [these steps](docs/azure-ephemeral-disks.md).
@@ -189,6 +189,8 @@ Under the `azure` section, edit following values as per your configuration:
   [Create and Share dashboards](https://docs.microsoft.com/en-us/azure/azure-portal/azure-portal-dashboards).
   [Azure Monitor Workbooks](https://docs.microsoft.com/en-us/azure/azure-monitor/platform/workbooks-overview).
 
+Please refer to the [muchos.props] example for the full list of Azure-specific configurations - some of which have supplementary comments.
+
 Within Azure the `nodes` section is auto populated with the hostnames and their default roles.
 
 After following the steps above, run the following command to launch an Azure VMSS cluster called `mycluster`
diff --git a/ansible/roles/azure/tasks/create_multiple_vmss.yml b/ansible/roles/azure/tasks/create_multiple_vmss.yml
index 02537d5..1e2a224 100644
--- a/ansible/roles/azure/tasks/create_multiple_vmss.yml
+++ b/ansible/roles/azure/tasks/create_multiple_vmss.yml
@@ -59,7 +59,7 @@
     data_disks: |
       {%- set data_disks = [] -%}
       {%- for lun in range(item.data_disk_count) -%}
-        {%- set _ = data_disks.append({'lun': lun, 'disk_size_gb': item.data_disk_size_gb, 'managed_disk_type': item.disk_sku }) -%}
+        {%- set _ = data_disks.append({'lun': lun, 'disk_size_gb': item.data_disk_size_gb, 'managed_disk_type': item.data_disk_sku, 'caching': None }) -%}
       {%- endfor -%}
       {{ data_disks }}
   with_items:
diff --git a/ansible/roles/azure/tasks/create_optional_proxy.yml b/ansible/roles/azure/tasks/create_optional_proxy.yml
index 1de89e8..32d62bb 100644
--- a/ansible/roles/azure/tasks/create_optional_proxy.yml
+++ b/ansible/roles/azure/tasks/create_optional_proxy.yml
@@ -66,7 +66,7 @@
     name: "{{ azure_proxy_host }}"
     network_interface_names:
       - "{{ azure_proxy_host }}-nic"
-    vm_size: Standard_D8s_v3
+    vm_size: "{{ azure_proxy_host_vm_sku }}"
     admin_username: "{{ cluster_user }}"
     ssh_password_enabled: false
     ssh_public_keys:
@@ -78,9 +78,11 @@
       publisher: OpenLogic
       sku: 7.5
       version: latest
-    managed_disk_type: "{{ managed_disk_type }}"
+    managed_disk_type: "{{ osdisk_sku }}"
     data_disks:
      - lun: 0
        disk_size_gb: 64
-       managed_disk_type: "{{ managed_disk_type }}"
+       managed_disk_type: "{{ data_disk_sku }}"
+  vars:
+  - osdisk_sku: "{{ 'Premium_LRS' if azure_proxy_host_vm_sku in premiumio_capable_skus else 'Standard_LRS' }}"
   when: azure_proxy_host is defined and azure_proxy_host and azure_proxy_host != None
diff --git a/ansible/roles/azure/tasks/create_vmss.yml b/ansible/roles/azure/tasks/create_vmss.yml
index 22e7de4..eadf05f 100644
--- a/ansible/roles/azure/tasks/create_vmss.yml
+++ b/ansible/roles/azure/tasks/create_vmss.yml
@@ -31,7 +31,7 @@
 
 - name: Create luns dictionary
   set_fact:
-    luns_dict: "{{ luns_dict | default ([]) + [{ 'lun': item, 'disk_size_gb': disk_size_gb , 'caching': None } ] }}"
+    luns_dict: "{{ luns_dict | default ([]) + [{ 'lun': item, 'disk_size_gb': disk_size_gb, 'managed_disk_type': data_disk_sku, 'caching': None } ] }}"
   with_sequence: start=0 end={{ data_disk_count-1 if data_disk_count > 0 else 0 }}
 
 - name: Set single placement group to correct value
@@ -55,7 +55,7 @@
     subnet_name: "{{ subnet }}"
     upgrade_policy: Manual
     tier: Standard
-    managed_disk_type: "{{ managed_disk_type }}"
+    managed_disk_type: "{{ osdisk_sku }}"
     os_disk_caching: ReadWrite
     enable_accelerated_networking: "{{ accnet_capable }}"
     single_placement_group: "{{ single_placement_group | default(omit) }}"
@@ -71,6 +71,7 @@
   - image_sku: "{{ azure_image_reference.split('|')[2] }}"
   - image_version: "{{ azure_image_reference.split('|')[3] }}"
   - accnet_capable: "{{ True if vm_sku in accnet_capable_skus else False }}"
+  - osdisk_sku: "{{ 'Premium_LRS' if vm_sku in premiumio_capable_skus else 'Standard_LRS' }}"
   tags: create_vmss
 
 # SECTION 4: Automatically populate entries in the hosts file and in the muchos.props file, based on the VMSS node details
diff --git a/conf/azure_multiple_vmss_vars.yml.example b/conf/azure_multiple_vmss_vars.yml.example
index 71c94f4..d66dbec 100644
--- a/conf/azure_multiple_vmss_vars.yml.example
+++ b/conf/azure_multiple_vmss_vars.yml.example
@@ -5,7 +5,7 @@ vars_list:
     sku: Standard_D4s_v3
     perf_profile: azd8s
     data_disk_count: 4
-    disk_sku: Premium_LRS
+    data_disk_sku: Premium_LRS
     data_disk_size_gb: 512
     capacity: 4
     roles:
@@ -20,12 +20,11 @@ vars_list:
     sku: Standard_D4s_v3
     perf_profile: perf-small
     data_disk_count: 4
-    disk_sku: Standard_LRS
+    data_disk_sku: Standard_LRS
     data_disk_size_gb: 512
     capacity: 4
     roles:
       zookeeper: 2
-      metrics: 1
       journalnode: 1
       namenode: 1
       zkfc: 1
@@ -36,18 +35,18 @@ vars_list:
     sku: Standard_D4s_v3
     perf_profile: azd8s
     data_disk_count: 8
-    disk_sku: Standard_LRS
+    data_disk_sku: Standard_LRS
     data_disk_size_gb: 1024
     capacity: 4
     roles:
       worker: 4
-    
+
   # The below roles are required when HA is not enabled  (i.e hdfs_ha = False)
   - name_suffix: vmss4
     sku: Standard_D4s_v3
     perf_profile: azd8s
     data_disk_count: 4
-    disk_sku: Premium_LRS
+    data_disk_sku: Premium_LRS
     data_disk_size_gb: 512
     capacity: 3
     roles:
@@ -60,7 +59,7 @@ vars_list:
     sku: Standard_D4s_v3
     perf_profile: azd8s
     data_disk_count: 4
-    disk_sku: Premium_LRS
+    data_disk_sku: Premium_LRS
     data_disk_size_gb: 512
     capacity: 1
     roles:
@@ -70,7 +69,7 @@ vars_list:
     sku: Standard_D8s_v3
     perf_profile: azd8s
     data_disk_count: 8
-    disk_sku: Standard_LRS
+    data_disk_sku: Standard_LRS
     data_disk_size_gb: 1024
     capacity: 3
     roles:
@@ -87,7 +86,7 @@ vars_list:
     azure_disk_device_pattern: nvme*n1
     mount_root: /nvmedata
     data_disk_count: 0
-    disk_sku: Standard_LRS
+    data_disk_sku: Standard_LRS
     data_disk_size_gb: 1024
     capacity: 4
     roles:
diff --git a/conf/muchos.props.example b/conf/muchos.props.example
index 4af9e70..c7f60e8 100644
--- a/conf/muchos.props.example
+++ b/conf/muchos.props.example
@@ -117,7 +117,7 @@ subnet_cidr = 10.1.0.0/16
 #Optional. If set to True, will create multiple VMSS based on multiple_vmss_vars.yml
 use_multiple_vmss = False
 # Azure image reference defined as a pipe-delimited string in the format offer|publisher|sku|version|
-# Please refer 'Launching an Azure cluster' section of the README before making changes 
+# Please refer 'Launching an Azure cluster' section of the README before making changes
 azure_image_reference = CentOS|OpenLogic|7.5|latest|
 # Size of the cluster to provision.
 # A virtual machine scale set (VMSS) with these many VMs will be created.
@@ -135,7 +135,7 @@ azure_disk_device_pattern = lun*
 # azure_disk_device_path = /dev
 # azure_disk_device_pattern = nvme*n1
 # Type of the data disk attached to the VMSS. 'Standard_LRS' for HDD, 'Premium_LRS' for SSD, 'StandardSSD_LRS' for Standard SSD
-managed_disk_type = Standard_LRS
+data_disk_sku = Standard_LRS
 # Number of managed disks provisioned on each VM
 data_disk_count = 3
 # The size of each managed disk provisioned
@@ -146,6 +146,9 @@ mount_root = /var/data
 metrics_drive_root = var-data
 # Optional proxy VM. If not set, the first node of the cluster will be selected as the proxy.
 azure_proxy_host =
+# Azure VM SKU to use when creating the proxy host - defaults to a 8-vCore general-purpose VM
+azure_proxy_host_vm_sku = Standard_D8s_v3
+# The Azure datacenter location to use for creating Muchos resources
 location = westus2
 # Enable ADLS Gen2 storage configuration. Muchos parameters instance_volumes_input, instance_volumes_adls & adls_storage_type is not required if use_adlsg2 is false.
 use_adlsg2 = False
diff --git a/docs/azure-multiple-vmss.md b/docs/azure-multiple-vmss.md
index 082281f..0f8458c 100644
--- a/docs/azure-multiple-vmss.md
+++ b/docs/azure-multiple-vmss.md
@@ -27,7 +27,7 @@ Muchos provides a [sample file](../conf/azure_multiple_vmss_vars.yml.example) wh
 | `azure_disk_device_pattern`| Optional | If not specified, the corresponding `azure_disk_device_pattern` value from the `azure` section in [muchos.props](../conf/muchos.props.example) is used | This is a device name wildcard pattern used (internally) in conjunction with `azure_disk_device_path` to enumerate attached SCSI or NVME disks to use for persistent local storage |
 | `mount_root`| Optional | If not specified, the corresponding `mount_root` value from the `azure` section in [muchos.props](../conf/muchos.props.example) is used | This is the folder in the file system where the persistent disks are mounted |
 | `data_disk_count`| Required | - | An integer value which specifies the number of persistent (managed) data disks to be attached to each VM in the VMSS. It can be 0 in specific cases - see [notes on using ephemeral storage](./azure-ephemeral-disks.md) for details |
-| `disk_sku`| Required | - | Can be either Standard_LRS (for HDD) or Premium_LRS (for Premium SSD). At this time, we have not tested the use of Standard SSD or UltraSSD with Muchos |
+| `data_disk_sku`| Required | - | Can be either Standard_LRS (for HDD) or Premium_LRS (for Premium SSD). At this time, we have not tested the use of Standard SSD or UltraSSD with Muchos |
 | `data_disk_size_gb`| Required | - | An integer value specifying the size of each persistent (managed) data disk in GiB |
 | `image_reference`| Optional | If not specified, the corresponding `azure_image_reference` value from the `azure` section in [muchos.props](../conf/muchos.props.example) is used | Azure image reference defined as a pipe-delimited string.
 | `capacity`| Required | - | An integer value specifying the number of VMs in this specific VMSS |
diff --git a/lib/muchos/azure.py b/lib/muchos/azure.py
index 54cdbb3..8558a1e 100644
--- a/lib/muchos/azure.py
+++ b/lib/muchos/azure.py
@@ -84,8 +84,7 @@ class VmssCluster(ExistingCluster):
                 [
                     "ansible-playbook",
                     path.join(
-                        config.deploy_path,
-                        "ansible/azure_terminate.yml",
+                        config.deploy_path, "ansible/azure_terminate.yml",
                     ),
                     "--extra-vars",
                     json.dumps(azure_config),
@@ -109,10 +108,7 @@ class VmssCluster(ExistingCluster):
         retcode = subprocess.call(
             [
                 "ansible-playbook",
-                path.join(
-                    config.deploy_path,
-                    "ansible/azure_wipe.yml",
-                ),
+                path.join(config.deploy_path, "ansible/azure_wipe.yml",),
                 "--extra-vars",
                 json.dumps(azure_config),
             ]
@@ -202,8 +198,7 @@ class VmssCluster(ExistingCluster):
 
                     print(
                         "{0}: {1}".format(
-                            "worker_data_dirs",
-                            curr_worker_dirs,
+                            "worker_data_dirs", curr_worker_dirs,
                         ),
                         file=vmss_file,
                     )
@@ -218,8 +213,7 @@ class VmssCluster(ExistingCluster):
 
                     print(
                         "{0}: {1}".format(
-                            "default_data_dirs",
-                            curr_default_dirs,
+                            "default_data_dirs", curr_default_dirs,
                         ),
                         file=vmss_file,
                     )
diff --git a/lib/muchos/config/azure.py b/lib/muchos/config/azure.py
index 0e728cc..5c1d8db 100644
--- a/lib/muchos/config/azure.py
+++ b/lib/muchos/config/azure.py
@@ -297,15 +297,15 @@ class AzureDeployConfig(BaseConfig):
         return self.get("azure", "azure_proxy_host")
 
     @ansible_host_var
-    @default(None)
+    @default("Standard_D8s_v3")
     def azure_proxy_host_vm_sku(self):
         return self.get("azure", "azure_proxy_host_vm_sku")
 
     @ansible_host_var
     @default("Standard_LRS")
     @is_valid(is_in(["Standard_LRS", "Premium_LRS", "StandardSSD_LRS"]))
-    def managed_disk_type(self):
-        return self.get("azure", "managed_disk_type")
+    def data_disk_sku(self):
+        return self.get("azure", "data_disk_sku")
 
     @ansible_host_var
     def accnet_capable_skus(self):
diff --git a/lib/muchos/config/azurevalidationhelpers.py b/lib/muchos/config/azurevalidationhelpers.py
index 421cd10..3d9ff1e 100644
--- a/lib/muchos/config/azurevalidationhelpers.py
+++ b/lib/muchos/config/azurevalidationhelpers.py
@@ -50,11 +50,7 @@ def vmss_status_succeeded_if_exists(config, client):
 
 
 def validate_disk_count(
-    context,
-    specified_disk_count,
-    mount_root,
-    disk_pattern,
-    validation_errors,
+    context, specified_disk_count, mount_root, disk_pattern, validation_errors,
 ):
     # min_data_disk_count is 1 unless we are using exclusively
     # ephemeral storage (data_disk_count is 0), which in turn is when:
diff --git a/lib/muchos/config/azurevalidations.py b/lib/muchos/config/azurevalidations.py
index 803fd39..7fb69dd 100644
--- a/lib/muchos/config/azurevalidations.py
+++ b/lib/muchos/config/azurevalidations.py
@@ -93,32 +93,33 @@ AZURE_VALIDATIONS = {
             "when use_multiple_vmss == True, any VMSS with sku "
             "must be a valid VM SKU for the selected location",
         ),
-        # managed_disk_type in
+        # data_disk_sku in
         # ['Standard_LRS', 'StandardSSD_LRS', Premium_LRS']
         ConfigValidator(
-            lambda config, client: config.managed_disk_type()
+            lambda config, client: config.data_disk_sku()
             in ["Standard_LRS", "StandardSSD_LRS", "Premium_LRS"],
-            "managed_disk_type must be "
+            "data_disk_sku must be "
             "one of Standard_LRS, StandardSSD_LRS, or Premium_LRS",
         ),
         ConfigValidator(
             lambda config, client: not config.use_multiple_vmss()
             or all(
                 [
-                    vmss.get("disk_sku")
+                    vmss.get("data_disk_sku")
                     in ["Standard_LRS", "StandardSSD_LRS", "Premium_LRS"]
                     for vmss in config.azure_multiple_vmss_vars.get(
                         "vars_list", []
                     )
                 ]
             ),
-            "when use_multiple_vmss == True, any VMSS with disk_sku must "
-            "be one of Standard_LRS, StandardSSD_LRS or Premium_LRS",
+            "when use_multiple_vmss == True, the data_disk_sku specified for "
+            "the VMSS must be one of Standard_LRS, StandardSSD_LRS "
+            "or Premium_LRS",
         ),
         # Cannot specify Premium managed disks if VMSS SKU is / are not capable
         ConfigValidator(
             lambda config, client: config.use_multiple_vmss()
-            or not config.managed_disk_type() == "Premium_LRS"
+            or not config.data_disk_sku() == "Premium_LRS"
             or config.vm_sku() in config.premiumio_capable_skus(),
             "azure.vm_sku must be Premium I/O capable VM SKU "
             "in order to use Premium Managed Disks",
@@ -128,7 +129,7 @@ AZURE_VALIDATIONS = {
             or all(
                 [
                     vmss.get("sku") in config.premiumio_capable_skus()
-                    if vmss.get("disk_sku") == "Premium_LRS"
+                    if vmss.get("data_disk_sku") == "Premium_LRS"
                     else True
                     for vmss in config.azure_multiple_vmss_vars.get(
                         "vars_list", []
diff --git a/lib/tests/azure/test_config.py b/lib/tests/azure/test_config.py
index 7fa1830..69469a3 100644
--- a/lib/tests/azure/test_config.py
+++ b/lib/tests/azure/test_config.py
@@ -43,7 +43,7 @@ def test_azure_cluster():
         "d611d7fc67698c91ec73da0e85b9907aa72b98d5eb4d49ea3a5d51b0c6c5785f"
     )
     assert c.get("azure", "vm_sku") == "Standard_D8s_v3"
-    assert c.get("azure", "managed_disk_type") == "Standard_LRS"
+    assert c.get("azure", "data_disk_sku") == "Standard_LRS"
     assert c.user_home() == "/home/centos"
     assert c.mount_root() == "/var/data"
     assert c.use_multiple_vmss() is False
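
The disk rules in this patch can be restated in plain Python as a rough sketch. The OS disk SKU follows the Jinja expression added to the Ansible tasks, and the data disk checks follow the single-VMSS validators in `azurevalidations.py`. The function names and the sample capability set are hypothetical; in Muchos the list comes from `premiumio_capable_skus`.

```python
VALID_DATA_DISK_SKUS = ("Standard_LRS", "StandardSSD_LRS", "Premium_LRS")


def os_disk_sku(vm_sku, premiumio_capable_skus):
    """Pick the OS disk SKU the same way the osdisk_sku Jinja expression does."""
    return "Premium_LRS" if vm_sku in premiumio_capable_skus else "Standard_LRS"


def data_disk_sku_errors(vm_sku, data_disk_sku, premiumio_capable_skus):
    """Collect validation messages for a single-VMSS configuration."""
    errors = []
    if data_disk_sku not in VALID_DATA_DISK_SKUS:
        errors.append(
            "data_disk_sku must be one of "
            "Standard_LRS, StandardSSD_LRS, or Premium_LRS"
        )
    # Premium data disks require a Premium I/O capable VM SKU.
    if data_disk_sku == "Premium_LRS" and vm_sku not in premiumio_capable_skus:
        errors.append(
            "azure.vm_sku must be Premium I/O capable VM SKU "
            "in order to use Premium Managed Disks"
        )
    return errors


# Assumed capability set, for illustration only.
sample_premium_skus = {"Standard_D8s_v3"}

proxy_os_disk = os_disk_sku("Standard_D8s_v3", sample_premium_skus)
problems = data_disk_sku_errors(
    "Standard_D8_v3", "Premium_LRS", sample_premium_skus
)
```

With the assumed set, the default proxy SKU `Standard_D8s_v3` gets a Premium OS disk, while pairing `Premium_LRS` data disks with a SKU outside the set yields one validation message.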
