Re: [Autotest] [PATCH] KVM-test: Add transparent hugepage test cases

Lucas Meneghel Rodrigues Thu, 09 Jun 2011 18:32:08 -0700

On Fri, 2011-05-20 at 14:55 +0800, Yiqiao Pu wrote:
> Transparent hugepage test includes:
> 1)smoking test and stress test as basic test
> Smoking test is test the transparent hugepage is used by kvm and guest.
> Stress test test use a parallel dd to test the stability of transparent
> hugepage
> 2)swapping test
> Bootup a vm and verify that it can be swapped out and swapped in correctly
> 3)defrag test
> Allocate hugepage for libhugetlbfs while defrag is on and off.
> Than compare the result
> And there is a script for env preparing named thp.py. It will set the
> khugepaged to a active mode and it also including a simple test for
> khugepaged.


Hi Yiqiao, I read through the tests and have some comments. Bottom
line: we need to move the tests to the current autotest API, avoid
the commands module, and rework how the test setup stage is done. See below.

> Signed-off-by: Yiqiao Pu <[email protected]>
> ---
>  client/tests/kvm/scripts/thp.py                   |  130 +++++++++++++++++++++
>  client/tests/kvm/tests/trans_hugepage.py          |  120 +++++++++++++++++++
>  client/tests/kvm/tests/trans_hugepage_defrag.py   |   72 ++++++++++++
>  client/tests/kvm/tests/trans_hugepage_swapping.py |  104 ++++++++++++++++
>  client/tests/kvm/tests_base.cfg.sample            |   18 +++
>  5 files changed, 444 insertions(+), 0 deletions(-)
>  create mode 100755 client/tests/kvm/scripts/thp.py
>  create mode 100644 client/tests/kvm/tests/trans_hugepage.py
>  create mode 100644 client/tests/kvm/tests/trans_hugepage_defrag.py
>  create mode 100644 client/tests/kvm/tests/trans_hugepage_swapping.py
> 
> diff --git a/client/tests/kvm/scripts/thp.py b/client/tests/kvm/scripts/thp.py
> new file mode 100755
> index 0000000..a1cf288
> --- /dev/null
> +++ b/client/tests/kvm/scripts/thp.py

^ We have moved away from keeping pre-test setup scripts in the scripts
dir. Now it's recommended that such setup code lives in one of two
places:

* The test itself, in case the code is only going to be used once, such
as unattended install:

http://autotest.kernel.org/browser/trunk/client/tests/kvm/tests/unattended_install.py

* The test setup library, for code that is going to be used in multiple
tests (that is the case for THP tests):

http://autotest.kernel.org/browser/trunk/client/virt/virt_test_setup.py

See HugePageConfig in that file. There are also some other details,
discussed below.

> @@ -0,0 +1,130 @@
> +#!/usr/bin/python
> +# -*- coding: utf-8 -*-
> +import os, sys, time, commands, string, re, stat, shelve
> +
> +"""
> +script to save and restore environment in transparent hugepage test
> +and khuagepaged test
> +"""
> +
> +class THPError(Exception):
> +    """
> +    Exception from transparent hugepage preparing scripts
> +    """
> +    pass

^ Maybe we can add more exception types for different things that can go
wrong during the course of the test.
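
Something along these lines, as a rough sketch. The subclass names are
only suggestions of mine, not existing autotest classes; pick whatever
split matches the failures the setup code can actually hit:

```python
class THPError(Exception):
    """Base class for errors raised by the THP setup code."""


class THPNotSupportedError(THPError):
    """The host exposes no transparent hugepage sysfs directory."""


class THPWriteConfigError(THPError):
    """Writing a value to a THP configuration file failed."""


class THPKhugepagedError(THPError):
    """khugepaged did not reach the expected running/stopped state."""
```

Callers that do not care about the distinction can still catch
THPError, since every specific error derives from it.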

> +class THP:
> +    def __init__(self):
> +        """
> +        Get the configuration path of transparent hugepage, ksm,
> +        libhugetlbfs and etc. And the option config parameters from user
> +        """
> +        if os.path.isdir("/sys/kernel/mm/redhat_transparent_hugepage"):
> +            self.thp_path = '/sys/kernel/mm/redhat_transparent_hugepage'
> +        elif os.path.isdir("/sys/kernel/mm/transparent_hugepage"):
> +            self.thp_path = '/sys/kernel/mm/transparent_hugepage'
> +        else:
> +            raise THPError("System don't support transparent hugepage")

^ Better to use constants for the paths, which reduces the amount of
typing:

RH_THP_PATH = "/sys/kernel/mm/redhat_transparent_hugepage"
UPSTREAM_THP_PATH = "/sys/kernel/mm/transparent_hugepage"

if os.path.isdir(RH_THP_PATH):
    self.thp_path = RH_THP_PATH

And so on for the upstream path.
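
Spelled out a bit more, this could look like the sketch below.
find_thp_path() is a hypothetical helper name, and returning None for
"not supported" is just one option; raising an exception there would
work equally well:

```python
import os

RH_THP_PATH = "/sys/kernel/mm/redhat_transparent_hugepage"
UPSTREAM_THP_PATH = "/sys/kernel/mm/transparent_hugepage"


def find_thp_path(candidates=(RH_THP_PATH, UPSTREAM_THP_PATH)):
    """Return the first candidate that is an existing directory, or None."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None
```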

> +
> +        self.default_config_file = '/tmp/thp_default_config'
> +        # Update the test config dict from environment
> +        test_cfg={"%s/defrag" % self.thp_path:"yes",
> +                "%s/enabled" % self.thp_path:"always",
> +                "%s/khugepaged/defrag" % self.thp_path:"yes",
> +                "%s/khugepaged/scan_sleep_millisecs" % self.thp_path:"100",
> +                "%s/khugepaged/pages_to_scan" % self.thp_path:"4096",
> +                "%s/khugepaged/alloc_sleep_millisecs" % self.thp_path:"100",
> +                "/sys/kernel/mm/ksm/run":"1",
> +                "/proc/sys/vm/nr_hugepages":"0"
> +                }
> +        if os.path.isfile("%s/khugepaged/enabled" % self.thp_path):
> +            test_cfg["%s/khugepaged/enabled" % self.thp_path] = "always"
> +        if os.path.isfile("%s/khugepaged/max_ptes_none" % self.thp_path):
> +            test_cfg["%s/khugepaged/max_ptes_none" % self.thp_path] = "511"
> +            test_cfg["%s/defrag" % self.thp_path] = "always"
> +
> +        tmp_list = []
> +        test_config = str(os.environ['KVM_TEST_thp_test_config'])

^ Now rather than collecting env variables, we are instantiating the
setup object with the test parameters (see for example the enospc test).
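
As an illustration only, the "path:value;path:value" string handled
below could be read from the params dict like this. The function name
is hypothetical, and I am assuming the thp_test_config key keeps the
format used in this patch:

```python
def parse_thp_test_config(params):
    """
    Build a {path: value} dict from the 'thp_test_config' test parameter.

    The parameter is expected to look like "path1:value1;path2:value2".
    """
    config = {}
    raw = params.get("thp_test_config", "")
    for entry in raw.split(";"):
        entry = entry.strip()
        if not entry:
            # Skip empty chunks such as the one produced by a trailing ';'
            continue
        path, value = entry.split(":", 1)
        config[path.strip()] = value.strip()
    return config
```

The setup object would then receive params in its constructor and merge
this dict over its defaults, with no environment variables involved.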

> +        if len(test_config) > 0:
> +            tmp_list = re.split(';', test_config)
> +        while len(tmp_list) > 0:
> +            tmp_cfg = tmp_list.pop()
> +            test_cfg[re.split(":", tmp_cfg)[0]] = \
> +                                           re.split(":", tmp_cfg)[1]
> +        self.test_config = test_cfg
> +
> +    def save_env(self):
> +        """
> +        Save and set the environment including related parts in kernel.
> +        Such as ksm and libhugetlbfs.
> +        """
> +        fd_default = shelve.open(str(self.default_config_file))

^ Now that we have moved away from standalone scripts, we can use classes
and internal object state to store such information rather than saving
it to a file, so this entire function can be dropped.

> +        for f in self.test_config.keys():
> +            parameter = file(f,'r').read()
> +            if string.find(f, "enabled") > 0 or string.find(f, "defrag") > 0:
> +                parameter = re.split("\[|\]", parameter)[1] + '\n'
> +        fd_default[f] = parameter
> +        fd_default.close()
> +
> +    def set_env(self):
> +        """
> +        After khugepaged test inuse_config is already set to an active mode of
> +        transparent hugepage. Get some special config of sub test and set it.
> +        """
> +        if len(self.test_config) > 0:

^ You can test for an empty dict just like:

if self.test_config:

without using the length, as an empty dict (like an empty string or
list) evaluates to False.

> +            for path in self.test_config.keys():
> +                file(path, 'w').write(self.test_config[path])
> +
> +    def set_params(self, path_dict={}, filename="", value=""):
> +        """
> +        Set the value of a config files in one path which is
> +        store in a path dict
> +        @param path_dict: Dict of files' pathes {path : value}
> +        @param filename: Name of file to setup
> +        @param value: Value set to the configuration files
> +        """
> +        for path in path_dict.keys():
> +            if string.find(path, filename) > 0:
> +                try:
> +                    file(path, "w").write(value)
> +                except IOError, e:
> +                    raise THPError("Can not set %s to %s: %s" % \
> +                                   (value, filename, e))
> +
> +    def khugepaged_test(self):
> +        """
> +        Start, stop and frequency change test for khugepaged
> +        """
> +        action_list = [("never", 256),("always", 0),("never", 256)]
> +        # ret is the (status, value)
> +        ret = [self.set_params(self.test_config, "enabled", a) or\
> +         commands.getstatusoutput('pgrep khugepaged') for (a, r) in action_list]

^ I don't understand: set_params does not return a value, so how can its
result be used to populate this list? I assume the list will always
hold the result of commands.getstatusoutput('pgrep khugepaged'), as
self.set_params(self.test_config, "enabled", a) will always evaluate
to None...
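
To show what I mean, here is a tiny self-contained demonstration. Both
functions are stand-ins I made up for this mail, not the real calls:

```python
def set_params_stub(value):
    """Stand-in for set_params(): only a side effect, returns None."""
    return None


def pgrep_stub():
    """Stand-in for commands.getstatusoutput('pgrep khugepaged')."""
    return (0, "42")


# 'None or X' evaluates to X, so every element is the pgrep result.
ret = [set_params_stub(action) or pgrep_stub()
       for action in ("never", "always", "never")]
```

So the `or` is only being used to sequence the two calls. If that is
the intent, a plain for loop that calls set_params and then collects
the pgrep status would be much easier to read.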

> +        for i in range(len(action_list)):
> +            if ret[i][0] != action_list[i][1]:
> +                raise THPError("khugepaged can not set to status %s" %\
> +                                action_list[i][0])
> +
> +
> +    def restore_default_config(self):
> +        """:
> +        Restore the default configuration to host after test
> +        """
> +        fd = shelve.open(self.default_config_file)
> +        for path in fd.keys():
> +            file(path, 'w').write(fd[path])
> +        fd.close()
> +
> +if __name__ == "__main__":
> +    if len(sys.argv) < 2:
> +        raise THPError("Please use -s for set and -r for restore")

^ Here we can replace calling -s or -r with functions setup() and
cleanup() of the config object.
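
A minimal sketch of that shape, assuming nothing about the final class
beyond the two method names. THPConfigSketch and its attributes are
hypothetical, and the original values are kept in memory instead of in
a shelve file:

```python
class THPConfigSketch(object):
    """Hypothetical setup object that remembers original values itself."""

    def __init__(self, test_config):
        # {file path: value to write during setup()}
        self.test_config = dict(test_config)
        # {file path: value found before setup() touched the file}
        self.original_config = {}

    @staticmethod
    def _active_value(text):
        """
        Return the bracketed choice from text like 'always [madvise] never',
        or the stripped text when nothing is bracketed.
        """
        start, end = text.find("["), text.find("]")
        if 0 <= start < end:
            return text[start + 1:end]
        return text.strip()

    def setup(self):
        """Save the current value of each file, then write the test value."""
        for path, value in self.test_config.items():
            with open(path) as config_file:
                current = self._active_value(config_file.read())
            self.original_config[path] = current
            with open(path, "w") as config_file:
                config_file.write(value)

    def cleanup(self):
        """Write back the values saved by setup()."""
        for path, value in self.original_config.items():
            with open(path, "w") as config_file:
                config_file.write(value)
```

The test would call setup() before doing its work and cleanup() in a
finally block, so the host settings are restored even when the test
fails.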

> +    run_time = sys.argv[1][1:]
> +    trans_hugepage = THP()
> +    if run_time == "s":
> +        trans_hugepage.save_env()
> +        trans_hugepage.khugepaged_test()
> +        trans_hugepage.set_env()
> +    elif run_time == "r":
> +        trans_hugepage.restore_default_config()
> +    else:
> +        raise THPError("Please use -s for set and -r for restore")
> diff --git a/client/tests/kvm/tests/trans_hugepage.py b/client/tests/kvm/tests/trans_hugepage.py
> new file mode 100644
> index 0000000..5cd3ef9
> --- /dev/null
> +++ b/client/tests/kvm/tests/trans_hugepage.py
> @@ -0,0 +1,120 @@
> +import logging, time, commands, os, string, re
> +from autotest_lib.client.common_lib import error
> +from autotest_lib.client.virt import virt_test_utils, aexpect
> +
> +def run_trans_hugepage(test, params, env):
> +    """
> +    KVM khugepage user side test:
> +    1) Smoking test
> +    2) Stress test
> +
> +    @param test: kvm test object.
> +    @param params: Dictionary with test parameters.
> +    @param env: Dictionary with the test environment.
> +    """
> +    def get_mem_status(params, type):
> +        if type == "host":
> +            s, info = commands.getstatusoutput("cat /proc/meminfo")

^ We should use utils.system_output here

> +        else:
> +            s, info = session.get_command_status_output("cat /proc/meminfo")

^ Here we could use info = session.cmd("cat...") and handle the
ShellError exceptions that can be raised.
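
Once both branches just produce the text of /proc/meminfo (from
utils.system_output on the host, from session.cmd in the guest), the
parsing can live in one small function that takes the text as input.
The name below is hypothetical; this is a sketch, not existing autotest
code:

```python
import re


def get_meminfo_value(meminfo_text, key):
    """
    Return the integer value of a /proc/meminfo field.

    meminfo_text: full contents of /proc/meminfo as a string
    key: field name without the colon, e.g. "AnonHugePages"
    """
    pattern = r"^%s:\s+(\d+)" % re.escape(key)
    match = re.search(pattern, meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("Field %s not found in meminfo output" % key)
    return int(match.group(1))
```

Returning an int also makes later checks such as nr_ah[0] <= 0
meaningful, and a missing field raises a clear error rather than
leaving the result variable unset.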

> +        if s != 0:
> +            raise error.TestError("Can not get memory info in guest")
> +        for h in re.split("\n+", info):
> +            if h.startswith("%s" % params):
> +                output = re.split('\s+', h)[1]
> +        return output
> +    # Check khugepage is used by guest
> +    dd_timeout = float(params.get("dd_timeout", 900))
> +    fail = 0
> +    nr_ah = []
> +    mem = params['mem']
> +
> +    debugfs_flag = 1
> +    if not os.path.ismount('/mnt/debugfs'):
> +        if not os.path.isdir('/mnt/debugfs'):
> +            os.makedirs('/mnt/debugfs')
> +        s, o = commands.getstatusoutput("mount -t debugfs none /mnt/debugfs")

^ Similarly, better to use the autotest API rather than commands. This
comment applies to the other uses of the commands module as well.

> +        if s != 0:
> +            debugfs_flag = 0
> +            logging.info("Warning: debugfs is unusable")
> +
> +    logging.info("Smoking test start")
> +    login_timeout = float(params.get("login_timeout", "3600"))
> +    vm = virt_test_utils.get_living_vm(env, params.get("main_vm"))
> +    session = virt_test_utils.wait_for_login(vm, timeout=login_timeout)
> +
> +    nr_ah.append(get_mem_status('AnonHugePages', 'host'))
> +    if nr_ah[0] <= 0:
> +        raise error.TestFail("VM is not using transparent hugepage")
> +
> +    # Protect system from oom killer
> +    if int(get_mem_status('MemFree', 'guest')) / 1024 < mem :
> +        mem = int(get_mem_status('MemFree', 'guest')) / 1024
> +    cmd = "ls -d /space"
> +    status = session.get_command_status(cmd)
> +    if status != 0:
> +        cmd = "mkdir /space &> /dev/null"
> +        status = session.get_command_status(cmd)
> +        if status != 0:
> +            raise error.TestError("Can not mkdir in guest")

^ Better to just use session.cmd, since that API raises an exception
on failure anyway.

> +
> +    cmd = "mount -t tmpfs -o size=%sM none /space" % str(mem)
> +    status = session.get_command_status(cmd)
> +    if status != 0:
> +        raise error.TestError("Can not mount tmpfs in guest")
> +
> +    count = mem / 4
> +    cmd = "dd if=/dev/zero of=/space/1 bs=4000000 count=%s" % count
> +    try:
> +        status = session.get_command_status(cmd, timeout=dd_timeout)
> +    except aexpect.ShellStatusError, e:
> +        logging.debug(e)
> +        raise error.TestFail("Could not get exit status of command %s" % cmd)
> +    except Exception, e:
> +        logging.debug(e)
> +        raise error.TestFail("Fail command: %s. Please check debug log!" % cmd)
> +
> +    if status != 0:
> +        raise error.TestError("dd failed in guest")
> +
> +    nr_ah.append(get_mem_status('AnonHugePages', 'host'))
> +
> +    if nr_ah[1] <= nr_ah[0]:
> +        logging.info("WARNING: VM don't use transparent hugepage when dd")
> +
> +    if debugfs_flag == 1:
> +        if int(open('/mnt/debugfs/kvm/largepages', 'r').read()) <= 0:
> +            raise error.TestFail("KVM doesn't use transparenthugepage")
> +
> +    logging.info("Smoking test finished")
> +
> +    # Use parallel dd as stress for memory
> +    count = count / 3
> +    logging.info("Stress test start")
> +    cmd = "for i in `seq %s`;do dd if=/dev/zero of=/space/$i" % count \
> +           + " bs=4000000 count=1& done "
> +    try:
> +        status, output = session.get_command_status_output(cmd,
> +                                                           timeout=dd_timeout)
> +    except aexpect.ShellStatusError, e:
> +        logging.debug(e)
> +        raise error.TestFail("Could not get exit status of command %s" % cmd)
> +    except Exception, e:
> +        logging.debug(e)
> +        raise error.TestFail("Fail command: %s. Please check debug log!" % cmd)
> +
> +    if len(re.findall("No space", output)) > count * 0.05:
> +        raise error.TestFail("Too many dd failed in guest")
> +
> +    s, o = session.get_command_status_output('pidof dd')
> +    if s == 0 :
> +        for i in re.split('\n+', o):
> +            session.get_command_status_output('kill -9 %s' % i)
> +
> +    cmd = "umount /space"
> +    status, output = session.get_command_status_output(cmd)
> +    if status != 0:
> +        raise error.TestFail("Can not umount tmpfs after stress test%s"% output)
> +    logging.info("Stress test finished")
> +
> +    session.close()
> diff --git a/client/tests/kvm/tests/trans_hugepage_defrag.py b/client/tests/kvm/tests/trans_hugepage_defrag.py
> new file mode 100644
> index 0000000..14ce546
> --- /dev/null
> +++ b/client/tests/kvm/tests/trans_hugepage_defrag.py
> @@ -0,0 +1,72 @@
> +
> +import logging, time, commands, os, string, re
> +from autotest_lib.client.common_lib import error
> +from autotest_lib.client.virt import virt_test_utils
> +
> +def run_trans_hugepage_defrag(test, params, env):
> +    """
> +    KVM khugepage user side test:
> +    1) Verify that the host supports khugepahe. If it does proceed with the test.
> +    2) Verify that the khugepage can be used in host
> +    3) Verify that the khugepage can be used in guest
> +    4) Migration while using khuge page
> +
> +    @param test: kvm test object.
> +    @param params: Dictionary with test parameters.
> +    @param env: Dictionary with the test environment.
> +    """
> +    def get_mem_status(params):
> +        for line in file('/proc/meminfo', 'r').readlines():
> +            if line.startswith("%s" % params):
> +                output = re.split('\s+', line)[1]
> +        return output
> +
> +    def set_libhugetlbfs(number):
> +        fd = file("/proc/sys/vm/nr_hugepages", "w+")

^ Better not to use the name fd (which commonly stands for file
descriptor) for a file object (which is what fd actually is here), to
avoid confusing people reading the code.
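
For instance, a sketch with a more descriptive name. The path argument
is something I added so the function can be tried on a regular file; it
is not in the patch. Note that the with statement needs Python 2.6, or
2.5 with a __future__ import:

```python
def set_nr_hugepages(number, path="/proc/sys/vm/nr_hugepages"):
    """Write the requested hugepage count and return what the file reports."""
    with open(path, "w") as hugepages_file:
        hugepages_file.write(str(number))
    # Reopen for reading so we see the value the kernel actually accepted.
    with open(path) as hugepages_file:
        return int(hugepages_file.read())
```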

> +        fd.write(number)
> +        fd.seek(0)
> +        ret = fd.read()
> +        return int(ret)
> +
> +    # Test the defrag
> +    logging.info("Defrag test start")
> +    login_timeout = float(params.get("login_timeout", 360))
> +    vm = virt_test_utils.get_living_vm(env, params.get("main_vm"))
> +    session = virt_test_utils.wait_for_login(vm, timeout=login_timeout)
> +
> +    try:
> +        if not os.path.isdir('/space'):
> +            os.makedirs('/space')
> +        if os.system("mount -t tmpfs none /space"):
> +            raise error.TestError("Can not mount tmpfs")
> +
> +        # Try to make some fragment in memory, the total size of fragments is 1G

> +        cmd = "for i in `seq 262144`; do dd if=/dev/urandom of=/space/$i bs=4K count=1 & done"
> +        s, o = commands.getstatusoutput(cmd)
> +        if s != 0:
> +            raise error.TestError("Can not dd in host")
> +    finally:
> +        s, o = commands.getstatusoutput("umount /space")
> +        if s != 0:
> +            raise error.TestError("Can not free tmpfs")
> +
> +    total = int(get_mem_status('MemTotal'))
> +    hugepagesize = int(get_mem_status('Hugepagesize'))
> +    nr_full = str(total / hugepagesize)
> +    # Allocate hugepage for libhugetlbfs before and after enable defrag.
> +    # And see the different

> +    nr = []
> +    nr.append(set_libhugetlbfs(nr_full))
> +    try:
> +        defrag_path = '/sys/kernel/mm/redhat_transparent_hugepage/khugepaged/defrag'
> +        file(str(defrag_path),'w').write('yes')
> +    except IOError, e:
> +        raise error.TestFail("Can not start defrag:%s" % e)
> +
> +    time.sleep(1)
> +    nr.append(set_libhugetlbfs(nr_full))
> +
> +    if nr[0] >= nr[1]:
> +        raise error.TestFail("Do not defrag in the system %s" % nr)

^ Suggested message: "It was not possible to defrag memory in host..."

> +    session.close()
> +    logging.info("Defrag test succeed")
> diff --git a/client/tests/kvm/tests/trans_hugepage_swapping.py b/client/tests/kvm/tests/trans_hugepage_swapping.py
> new file mode 100644
> index 0000000..a58ee6b
> --- /dev/null
> +++ b/client/tests/kvm/tests/trans_hugepage_swapping.py
> @@ -0,0 +1,104 @@
> +import logging, time, commands, os, string, re
> +from autotest_lib.client.common_lib import error
> +from autotest_lib.client.virt import virt_test_utils
> +
> +def run_trans_hugepage_swapping(test, params, env):
> +    """
> +    KVM khugepage user side test:
> +    1) Verify that the hugepages can be swapped in/out.
> +
> +    @param test: kvm test object.
> +    @param params: Dictionary with test parameters.
> +    @param env: Dictionary with the test environment.
> +    """
> +    def get_args(args_list):
> +        """
> +        Get the memory arguments from system

^ Suggested docstring: "Populates a dict with memory information
extracted from the system."

> +        """
> +        args_list_tmp = args_list.copy()
> +        for line in file('/proc/meminfo', 'r').readlines():
> +            for key in args_list_tmp.keys():
> +                if line.startswith("%s" % args_list_tmp[key]):
> +                    args_list_tmp[key] = int(re.split('\s+', line)[1])
> +        return args_list_tmp
> +
> +    # Swapping test
> +    logging.info("Swapping test start")
> +    # Parameters of memory information
> +    # @total: Memory size
> +    # @free: Free memory size
> +    # @swap_size: Swap size
> +    # @swap_free: Free swap size
> +    # @hugepage_size: Page size of one hugepage
> +    # @page_size: The biggest page size that app can ask for
> +    args_dict_check = {"free" : "MemFree", "swap_size" : "SwapTotal",
> +                        "swap_free" : "SwapFree", "total" : "MemTotal",
> +                       "hugepage_size" : "Hugepagesize",}
> +    args_dict = get_args(args_dict_check)
> +    swap_free = []
> +    total = int(args_dict['total']) / 1024
> +    free = int(args_dict['free']) / 1024
> +    swap_size = int(args_dict['swap_size']) / 1024
> +    swap_free.append(int(args_dict['swap_free'])/1024)
> +    hugepage_size = int(args_dict['hugepage_size']) / 1024
> +    dd_timeout = float(params.get("dd_timeout", 900))
> +    login_timeout = float(params.get("login_timeout", 360))
> +    check_cmd_timeout = float(params.get("check_cmd_timeout", 900))
> +    # If swap is enough fill all memory with dd
> +    if swap_free > (total - free):
> +        count = total / hugepage_size
> +        tmpfs_size = total
> +    else:
> +        count = free / hugepage_size
> +        tmpfs_size = free
> +
> +    if swap_size <= 0:
> +        raise logging.info("Host don't have swap")
> +    try:
> +        if not os.path.isdir('/space'):
> +            os.makedirs('/space')
> +        if os.system("mount -t tmpfs  -o size=%sM none /space" % tmpfs_size):
> +            raise error.TestError("Can not mount tmpfs")

^ Use utils.run() here instead of os.system().

> +        # Set the memory size of vm
> +        # To ignore the oom killer set it to the free swap size
> +        vm = virt_test_utils.get_living_vm(env, params.get("main_vm"))
> +        if int(params['mem']) > swap_free[0]:
> +            vm.destroy()
> +            vm_name = 'vmsw'
> +            vm0 =  params.get("main_vm")
> +            vm0_key = virt_utils.env_get_vm(env, vm0)
> +            params['vms'] = params['vms'] + " " + vm_name
> +            params['mem'] = str(swap_free[0])
> +            vm_key = vm0_key.clone(vm0, params)
> +            virt_utils.env_register_vm(env, vm_name, vm_key)
> +            virt_env_preprocessing.preprocess_vm(test, params, env, vm_name)
> +            vm_key.create()
> +            session = virt_utils.wait_for(vm_key.remote_login, timeout=login_timeout)
> +        else:
> +            session = virt_test_utils.wait_for_login(vm, timeout=login_timeout)
> +
> +        cmd = "dd if=/dev/zero of=/space/zero bs=%s000000 count=%s" % \
> +                (hugepage_size, count)
> +        s, o = commands.getstatusoutput(cmd)
> +        if s != 0:
> +            raise error.TestError("dd failed in host: %s" % o)
> +
> +        args_dict = get_args(args_dict_check)
> +        swap_free.append(int(args_dict['swap_free'])/1024)
> +
> +        if swap_free[1] - swap_free[0] >= 0:
> +            raise error.TestFail("Nothing is swapped")
> +
> +        # Check the vm still alive
> +        s = session.get_command_status("find / -name \"*\"",
> +                                        timeout=check_cmd_timeout)

^ Why use a command that takes so long to execute to check whether the
VM is still alive? I suppose any simple command could verify that the
session is alive, not to mention we can also use vm.verify_alive()
here to check that the VM process is well and its monitors are
responsive.

> +        if s != 0 :
> +            raise error.TestFail("There is something wrong after swap")
> +    finally:
> +        s, o = commands.getstatusoutput("umount /space")
> +        if s != 0:
> +            logging.warning("Can not umount tmpfs after swap")
> +        logging.info("Swapping test succeed")
> +
> +    session.close()
> diff --git a/client/tests/kvm/tests_base.cfg.sample b/client/tests/kvm/tests_base.cfg.sample
> index 78c84c6..3c8e552 100644
> --- a/client/tests/kvm/tests_base.cfg.sample
> +++ b/client/tests/kvm/tests_base.cfg.sample
> @@ -965,6 +965,24 @@ variants:
>          kill_vm_gracefully = no
>      # Do not define test variants below shutdown
>  
> +    - trans_hugepage:
> +        pre_command = python scripts/thp.py -s
> +        post_command = python scripts/thp.py -r
> +        thp_default_config = "/tmp/thp_default_config"
> +        thp_test_config = ""
> +        kill_vm = yes
> +        login_timeout = 360
> +        variants:
> +            - base:
> +                type = trans_hugepage
> +                dd_timeout = 900
> +            - defrag:
> +                type = trans_hugepage_defrag
> +            - swapping:
> +                type = trans_hugepage_swapping
> +                dd_timeout = 900
> +                check_cmd_timeout = 900
> +
>  
>  # NICs
>  variants:


_______________________________________________
Autotest mailing list
[email protected]
http://test.kernel.org/cgi-bin/mailman/listinfo/autotest
