[Vserver] Re: [Devel] Fwd: [PATCH] pidspace is_init()
Sukadev Bhattiprolu wrote: CCing VServer and OpenVZ lists. BTW, can we add these two lists to the lxc-devel list? Sure, feel free. ___ Vserver mailing list Vserver@list.linux-vserver.org http://list.linux-vserver.org/mailman/listinfo/vserver
Re: [Devel] Re: [Vserver] Linux Containers : next steps
Serge E. Hallyn wrote: Quoting Cedric Le Goater ([EMAIL PROTECTED]): However, I've also heard many times that we should agree before flooding lkml. So I guess we should use the vserver, openvz, lxc-devel mailing lists (Eric, please subscribe to one) before sending our agreement or disagreement to lkml. vserver@list.linux-vserver.org [EMAIL PROTECTED] [EMAIL PROTECTED] Given (a) the likely occasional bursts of activity, and (b) the narrow scope, which shouldn't interest people just looking for vserver or openvz help, I think we should go with just the third. Well, this is what we agreed upon during OLS/KS. I'd like to have the [EMAIL PROTECTED] list included in To. Let Herbert speak on behalf of the vserver list. Regards, Kir
Re: [Vserver] Re: [Devel] Container Test Campaign
Clément Calmels wrote: What do you think of something like this: o reboot o run dbench (or whatever) X times o reboot Perfectly fine with me. Here you do not have to reboot. The OpenVZ tools do not require an OpenVZ kernel to be built. You got me... I was still believing the VZKERNEL_HEADERS variable was needed. Things have changed since vzctl 3.0.0-4. Yes, we got rid of that dependency, to ease the external packages maintenance. I can split the 'launch a guest' part into 2 parts: o guest creation o reboot o guest start-up Do you feel comfortable with that? Perfectly fine. The same scenario applies to other cases: the rule of thumb is, if your test preparation involves a lot of I/O, you'd better reboot between preparation and the actual test. The same will happen with most of the other tests involving I/O. Thus, test results will be inaccurate. To achieve more accuracy and exclude the impact of the disk and filesystem layout on the results, you should reformat the partition you use for testing each time before the test. Note that you don't have to reinstall everything from scratch -- just use a separate partition (mounted to, say, /mnt/temptest) and make sure most of the I/O during the test happens on that partition. It would be possible for the 'host' node... inside the 'guest' node, I don't know if it makes sense. Just adding an 'external' partition to the 'guest' for I/O test purposes? For example, in an OpenVZ guest, creating a new and empty simfs partition in order to run the test on it? simfs is not a real filesystem; it is a kind of 'pass-through' fake FS which works on top of a real FS (like ext2 or ext3). So, in order to have a fresh filesystem for guests, you can create some disk partition, mkfs it, and mount it to /vz. If you want to keep templates, just change the TEMPLATE variable in /etc/vz/vz.conf from /vz/template to something outside of /vz. There are other ways possible, and I think the same applies to VServer.
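The fresh-filesystem recipe described above can be sketched as a shell fragment. The device name /dev/sdb1 and the template destination are assumptions for illustration; this is a sketch, not something to run blindly, since mkfs destroys the partition's contents:

```shell
# Give OpenVZ guests a freshly formatted filesystem before each run.
# ASSUMPTION: /dev/sdb1 is a spare partition dedicated to /vz.
#
# To keep OS templates across reformats, first point TEMPLATE in
# /etc/vz/vz.conf somewhere outside /vz, e.g.:
#   TEMPLATE=/var/lib/vz-templates   (instead of /vz/template)

umount /vz 2>/dev/null || true   # ignore error if not mounted
mkfs.ext3 /dev/sdb1              # wipe the old layout, start clean
mount /dev/sdb1 /vz
mkdir -p /vz/private /vz/root    # directories vzctl expects under /vz
```

The same idea (reformat a dedicated partition between iterations) applies to the /mnt/temptest suggestion for host-side filesystem tests.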
- For the settings of the guest, I tried to use the default settings (I had to change some OpenVZ guest settings), just following the HOWTO on the VServer or OpenVZ site. For the kernel parameters, did you mean kernel config file tweaking? No, I mean those params from /proc/sys (== /etc/sysctl.conf). For example, if you want networking for OpenVZ guests, you have to turn on ip_forwarding. There are some params affecting network performance, such as various gc_thresholds. For a big number of guests, you have to tune some system-wide parameters as well. For the moment, I just follow the available documentation: http://wiki.openvz.org/Quick_installation#Configuring_sysctl_settings Do you really think these parameters can affect network performance? From what I understand, a lot of them are needed. OK. Still, such stuff should be documented on the test results pages.
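For reference, the sysctl settings the quick-installation page refers to boil down to an /etc/sysctl.conf fragment along these lines (the exact values below are the ones commonly documented for OpenVZ hosts; double-check against the wiki page cited above):

```shell
# /etc/sysctl.conf fragment for an OpenVZ host (apply with: sysctl -p)

net.ipv4.ip_forward = 1                     # required for venet guest networking
net.ipv4.conf.default.proxy_arp = 0         # no proxy ARP by default
net.ipv4.conf.all.rp_filter = 1             # source route verification
net.ipv4.conf.default.send_redirects = 1    # redirects for normal interfaces
net.ipv4.conf.all.send_redirects = 0        # but not between guests
kernel.sysrq = 1                            # magic SysRq for debugging
```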
Re: [Vserver] Re: [Devel] Container Test Campaign
Clément, Thanks for addressing my concerns! See comments below. Clément Calmels wrote: Hi, 1.1 It would be nice to run vmstat (say, vmstat 10) for the duration of the tests, and put the vmstat output logs on the site. Our benchmark framework allows us to use oprofile during the test... couldn't it be better than vmstat? Good idea. Basically, a detailed description of the process would be nice to have, in order to catch possible problems. There are a lot of tiny things that influence the results. For example, in Linux 2.4 kernels, binding the NIC IRQ to a single CPU on an SMP system boosts network performance by about 15%! Sure, this is not relevant here; it's just an example. I agree. Actually, I always try to use the 'default' configuration or installation, but I will try to describe the tests in detail. 1.3 It would be nice to have diffs between the different kernel configs. The different configs used are available on the lxc site. You will notice that I used a minimal config file for most of the tests, but for OpenVZ I had to use the one I found on the OpenVZ site because I faced a kernel build error (some CONFIG_NET... issues). We are trying to eliminate those, so a bug report would be nice. I think that the differences mostly concern network stuff. For example, the tbench test probably failed to finish because it hit the limits for privvmpages, tcpsndbuf and tcprcvbuf. I increased the limits for those parameters and the test finished successfully. Also, the dbench test could hit the disk quota limit for a VE. Some more info is available at http://wiki.openvz.org/Resource_management I already used this page. I had to increase the 'diskinodes' and 'diskspace' resources in order to run some tests properly (the disk errors were more self-explanatory). I'm wondering why a default 'guest' creation implies some resource restrictions? Couldn't the resources be unlimited? I understand the need for resource management, but the default values look a little bit tiny...
The reason is security. A guest is untrusted by default, though sane limits are applied. Same as ulimit, which has some sane defaults (check the output of ulimit -a). Same as those kernel settings from /proc/sys -- should /proc/sys/fs/file-max be 'unlimited' by default? In fact, those limits are taken from a sample configuration file during the vzctl create stage. The sample file is specified in the global OpenVZ config file (/etc/vz/vz.conf; the parameter name is CONFIGFILE, and the default is to take the configuration from /etc/vz/conf/ve-vps.basic.conf-sample). There are several ways to change that default configuration:
1. (globally) Put another sample config and specify it in /etc/vz/vz.conf
2. (globally) Edit the existing sample config (/etc/vz/conf/ve-vps.basic.conf-sample)
3. (per VE) Specify another config during the vzctl create stage, like this: vzctl create VEID [--config name]
4. (per VE) Tune the specific parameters using vzctl set [--param value ...] --save
2.2 For OpenVZ specifically, it would be nice to collect /proc/user_beancounters output before and after the test. For sure... I will take a look at how to integrate it into our automatic test environment. Best regards,
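As a concrete illustration of the per-VE options (3) and (4) above, the commands look like this; the VE ID 101, the sample config name, and the specific limit values are illustrative only:

```shell
# (3) Create a guest from a named sample config
#     (reads /etc/vz/conf/ve-vps.basic.conf-sample in this example):
vzctl create 101 --config vps.basic

# (4) Raise individual limits afterwards. UBC and disk quota values
#     are given as barrier:limit pairs; --save persists them to the
#     VE's config file. The numbers below are arbitrary examples.
vzctl set 101 --diskspace 1048576:1153024 --save   # 1 GB soft, ~1.1 GB hard
vzctl set 101 --privvmpages 65536:69632 --save
```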
Re: [Vserver] Re: [Devel] Container Test Campaign
Clément Calmels wrote: Hi, I'm wondering why a default 'guest' creation implies some resource restrictions? Couldn't the resources be unlimited? I understand the need for resource management, but the default values look a little bit tiny... The reason is security. A guest is untrusted by default, though sane limits are applied. Same as ulimit, which has some sane defaults (check the output of ulimit -a). Same as those kernel settings from /proc/sys -- should /proc/sys/fs/file-max be 'unlimited' by default? OK. So, as our benchmarks have no security concerns, you will see no objection if I set all the parameters in the 'guest' to their values on the host, will you? Sure. In case you are testing performance (but not, say, isolation), you can definitely set all the UBCs to unlimited values (i.e. both barrier and limit for each parameter should be set to MAX_LONG). The only issue is with the vmguarpages parameter, because it is a guarantee, not a limit -- but unless you are doing something weird, it should be OK to set it to MAX_LONG as well. Another approach is to generate a sample config (for the given server) using the vzsplit utility with the number of VEs set to 1, like this: # vzsplit -f one-ve -n 1 [-s xxx] and use it for new VE creation: # vzctl create 123 --config one-ve
Re: [Vserver] Re: [Devel] Container Test Campaign
See my comments below. In general, please don't get the impression I'm trying to be fastidious. I'm just trying to help you create a system in which results can be reproducible and trusted. There are a lot of factors that influence performance; some of them are far from obvious. Clément Calmels wrote: A basic 'patch' test looks like: o build the appropriate kernel (2.6.16-026test014-x86_64-smp for example) o reboot o run dbench on /tmp with 8 processes IMO you should add a reboot here, between _different_ tests, just because previous tests should not influence the following ones. Certainly you do not need a reboot between iterations of the same test. o run tbench with 8 processes o run lmbench o run kernbench For a test inside a 'guest', I just do something like: o build the appropriate kernel (2.6.16-026test014-x86_64-smp for example) o reboot Here you do not have to reboot. The OpenVZ tools do not require an OpenVZ kernel to be built. o build the utilities (vzctl+vzquota for example) o reboot o launch a guest Even this part is tricky! You haven't specified whether you create the guest before or after the reboot. Let me explain. If you create a guest before the reboot, the performance (at least in the first iteration) could be a bit higher than if you create the guest after the reboot. The reason is that in the second case the buffer cache will be filled with OS template data (which is several hundred megs). o run dbench in the guest ... Again, a clean reboot is needed IMO. o run tbench in the guest ... - The results are the average value of several iterations of each set of these kinds of tests. I hope you do not recompile the kernels between the iterations (just to speed things up). I will try to update the site with the number of iterations behind each value.
It would be great to have that data (as well as the results of the individual iterations, and probably graphs of the individual iterations -- to see the warm-up progress, the discrepancy between iterations, degradation over iterations (if that takes place), etc.). Based on that data, one can decide to further tailor the testing process. For example, if there are visible signs of warm-up for the first few iterations (i.e. the performance is worse), it makes sense to unconditionally exclude those from the results. If there is a sign of degradation, something is wrong. And so on... - For the filesystem testing, the partition is not reformatted. I can change this behaviour... Disk layout influences the results of tests which do heavy I/O. Just a single example: if you try to test the performance of a web server, results will decrease over time. The reason for the degradation is ... the web server's access_log file! It grows over time, and the write operation takes a bit longer (due to several different reasons). The same will happen with most of the other tests involving I/O. Thus, test results will be inaccurate. To achieve more accuracy and exclude the impact of the disk and filesystem layout on the results, you should reformat the partition you use for testing each time before the test. Note that you don't have to reinstall everything from scratch -- just use a separate partition (mounted to, say, /mnt/temptest) and make sure most of the I/O during the test happens on that partition. - For the settings of the guest, I tried to use the default settings (I had to change some OpenVZ guest settings), just following the HOWTO on the VServer or OpenVZ site. For the kernel parameters, did you mean kernel config file tweaking? No, I mean those params from /proc/sys (== /etc/sysctl.conf). For example, if you want networking for OpenVZ guests, you have to turn on ip_forwarding. There are some params affecting network performance, such as various gc_thresholds.
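To make warm-up or degradation across iterations visible at a glance, a tiny helper like this can summarize per-iteration results. The input format (one throughput number per line, one line per iteration, e.g. dbench MB/s) is an assumption for illustration:

```shell
# summarize: print count, mean, min, and max of one-number-per-line
# iteration results read from stdin.
summarize() {
    awk '
        NR == 1 { min = $1; max = $1 }           # seed min/max
        { sum += $1
          if ($1 < min) min = $1
          if ($1 > max) max = $1 }
        END {
            if (NR > 0)
                printf "n=%d mean=%.2f min=%.2f max=%.2f\n",
                       NR, sum / NR, min, max
        }'
}
```

Usage: `summarize < dbench-iterations.txt`. A wide min/max spread relative to the mean is a hint that the first (warm-up) iterations should be dropped, or that something is degrading over time.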
For a big number of guests, you have to tune some system-wide parameters as well. So I am led to the proposition that all such changes should be documented in the test results. - Cron is stopped during tests. I hope you do that for the guest as well... :) - All binaries are always built on the test node. I assume you are doing your tests on the same system (i.e. same compiler/libs/whatever else), and you do not change that system over time (i.e. you do not upgrade gcc on it between the tests). Feel free to suggest different scenarios which you think are more relevant.
[Vserver] Re: [Devel] Container Test Campaign
Clement, Thanks for sharing the results! A few comments... (1) General 1.1 It would be nice to run vmstat (say, vmstat 10) for the duration of the tests, and put the vmstat output logs on the site. 1.2 Can you tell how you run the tests? I am particularly interested in: - how many iterations do you do? - what result do you choose from those iterations? - how reproducible are the results? - are you rebooting the box between the iterations? - are you reformatting the partition used for filesystem testing? - what settings are you using (such as kernel vm params)? - did you stop cron daemons before running the test? - are you using the same test binaries across all the participants? - etc. etc... Basically, a detailed description of the process would be nice to have, in order to catch possible problems. There are a lot of tiny things that influence the results. For example, in Linux 2.4 kernels, binding the NIC IRQ to a single CPU on an SMP system boosts network performance by about 15%! Sure, this is not relevant here; it's just an example. 1.3 It would be nice to have diffs between the different kernel configs. (2) OpenVZ specifics 2.1 Concerning the tests running inside an OpenVZ VE, the problem is that there is a (default) set of resource limits applied to each VE. Basically, one should tailor those limits to suit the applications running, OR, for the purpose of testing, just set those limits to some very high values so they will never be reached. For example, the tbench test probably failed to finish because it hit the limits for privvmpages, tcpsndbuf and tcprcvbuf. I increased the limits for those parameters and the test finished successfully. Also, the dbench test could hit the disk quota limit for a VE. Some more info is available at http://wiki.openvz.org/Resource_management 2.2 For OpenVZ specifically, it would be nice to collect /proc/user_beancounters output before and after the test.
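Points 1.1 and 2.2 combine naturally into a small wrapper around each benchmark run. The benchmark script name and log file names below are assumptions; the /proc/user_beancounters snapshots are skipped silently on a non-OpenVZ kernel:

```shell
# Wrap a benchmark run with vmstat logging and (on OpenVZ)
# /proc/user_beancounters snapshots before and after.
# ASSUMPTION: ./run_benchmark.sh is your test driver.

STAMP=$(date +%Y%m%d-%H%M%S)

cat /proc/user_beancounters > "ubc-before-$STAMP.txt" 2>/dev/null || true

vmstat 10 > "vmstat-$STAMP.log" &   # one sample every 10 seconds
VMSTAT_PID=$!

./run_benchmark.sh                  # the actual test goes here

kill "$VMSTAT_PID"
cat /proc/user_beancounters > "ubc-after-$STAMP.txt" 2>/dev/null || true
```

Comparing the failcnt column between the before and after snapshots shows immediately whether any UBC limit was hit during the run.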
Clément Calmels wrote: Hi, A first round of virtualisation benchmarks can be found here: http://lxc.sourceforge.net/bench/ These benchmarks run with vanilla kernels and the patched versions of well-known virtualisation solutions: VServer and OpenVZ. Some benchmarks also run inside the virtual 'guest', but we ran into trouble trying to run some of them... probably virtual 'guest' configuration issues... we will try to fix them. The metacluster migration solution (formerly a Meiosys product) was added, as it seems that the checkpoint/restart topic is close to virtualisation (OpenVZ now provides a checkpoint/restart capability). For the moment, benchmarks only ran on a Xeon platform, but we expect more architectures soon. Besides the 'classic' benchmarks used, more network-oriented benchmarks will be added: Netpipe between two virtual 'guests', for example. We hope we will be able to provide results concerning virtual 'guest' scalability, running several 'guests' at the same time. Best regards, On Wednesday, 7 June 2006 at 16:20 +0200, Clement Calmels wrote: Hello! I'm part of a team of IBMers working on lightweight containers and we are going to start a new test campaign. Candidates are vserver, vserver context, namespaces (being pushed upstream), openvz, mcr (our simple container dedicated to migration) and eventually xen. We will focus on the performance overhead, but we are also interested in checkpoint/restart and live migration. A last topic would be how well the resource management criteria are met, but that's extra for the moment. We plan on measuring performance overhead by comparing the results on a vanilla kernel with a partial and with a complete virtual environment. By partial, we mean the patched kernel and a 'namespace' virtualisation.
Test tools -- o For network performance: * netpipe (http://www.scl.ameslab.gov/netpipe/) * netperf (http://www.netperf.org/netperf/NetperfPage.html) * tbench (http://samba.org/ftp/tridge/dbench/README) o Filesystem: * dbench (http://samba.org/ftp/tridge/dbench/README) * iozone (http://www.iozone.org/) o General: * kernbench (http://ck.kolivas.org/kernbench/) -- stresses CPU and filesystem through kernel compilation * More 'real world' applications could be used; feel free to submit candidates... We have experience with C/R and migration, so we'll start with our own scenario: migrating Oracle under load. The load is generated by DOTS (http://ltp.sourceforge.net/dotshowto.php). If you could provide us some material on what has already been done (URLs, bench tools, scenarios), we'll try to compile them in. Configuration hints and tuning are most welcome if they are reasonable. We will set up the testing environment so as to be able to accept new versions, patches and test tools, and rerun them all on demand. Results, tools and scenarios will be published on lxc.sf.net. thanks ! Clement,
Re: [Vserver] Re: [ANNOUNCE] second stable release of Linux-VServer
Herbert Poetzl wrote: Additionally, the pid virtualization we've been discussing (and which should be submitted soon) would remove the need for the tasklookup patch, so bsdjail would reduce even further, to network and simple access controls. Complete pid virtualization would be interesting for migration and checkpointing too (not just isolation and security), so I think that might be something of interest for a broader audience... Just to make sure everybody is aware: pids are already virtualized in OpenVZ. If you want to look at the code, it is available within the diff-openvz-ve patch; see http://ftp.openvz.org/kernel/broken-out/022stab053.1/
Re: [Vserver] [EMAIL PROTECTED]: Re: [Users] VServer vs OpenVZ]
Rik Bobbaers wrote: stable: yes, secure... well... as far as possible, BUT! multipath using devicemapper in their kernel? almost impossible, unless they backported that entirely from 2.6.13 (or some 2.6.12 rcX). A lot of other enhancements are in 2.6.8+ kernels... it's for a reason that kernels get updated, you know... The current OpenVZ kernel has more than 200 patches backported from mainstream. Second is resource management. There are a lot of resources that can be abused from inside a VServer guest or an OpenVZ VPS, leading to at least a DoS; some of those resources are not under the control of traditional UNIX means such as ulimit. In OpenVZ we have User Beancounters (UBC for short), which account for and limit about 20 such resources (including IPC objects, various kernel buffers, etc.). There is decent resource management in vserver too... it's not easy at all to DoS an entire vserver. (You have an rlimits map for every vserver if you want, where you can choose what the limits are.) I am really interested in a comparison of OpenVZ's and VServer's resource management.
Re: [Vserver] VServer vs OpenVZ
Let me comment on that as well (CCing our users@ mailing list). Sure, I'm biased as well :) Herbert Poetzl wrote: On Tue, Dec 06, 2005 at 01:20:13PM +0100, Eugen Leitl wrote: Factors of interest are - stability, Z: the announcement reads first stable OVZ version Although this is indeed the first stable OpenVZ release, OpenVZ is essentially Virtuozzo (without its bells and whistles), and Virtuozzo for Linux has been available for more than five years already. S: we are at version 2.0.1 (two years of stable releases) - Debian support, Z: afaik they are Red Hat oriented (and recently trying to get Gentoo support done) Certainly you can compile OpenVZ from sources and use it on any Linux distro. And yes, we have been a part of Gentoo for about two months already (with all the recent releases making their way into Gentoo in a very timely fashion). Debian is one of our goals (see the roadmap at http://openvz.org/development/roadmap); although I am personally not a Debian expert, with some help from the Debian community we will make it. S: L-VS is in sarge (although with older/broken packages), but either using recent packages or compiling the tools yourself works pretty well on Debian - hardware utilization, Z: no idea S: support for 90% of all kernel archs at (almost) native speed (utilization? I'd say 100% if required) OpenVZ supports the x86 (i386), x86_64 (AMD64, EM64T) and ia64 platforms. Supporting here means we have enough hardware for all three platforms, and do extensive quality testing (functionality, performance and stress tests) and a security audit on all of them. It's a pity, but we cannot provide the same level of support for platforms other than those three. Speaking of specific hardware, we support the same set of hardware that RHEL4 does, achieving this by backporting newer drivers from mainstream, vendor and RHEL4 kernels. There is an official Virtuozzo/OpenVZ HCL: http://virtuozzo.com/en/products/virtuozzo/hcl.
- documentation and Z: no idea S: the wiki, the L-VS paper(s) and google There is an extensive 100-page user's guide (http://openvz.org/documentation/guides/). Also, all utilities have man pages, and there are some short, to-the-point howtos on the site and the forum. We are working on more stuff, like a QoS (User Beancounters/Fair Scheduler/Disk Quota) paper. - community support, Z: irc channel and forum/bug tracker S: ML, irc channel (I guess we have excellent support) There is bug tracking (Bugzilla) and quite active support forums; we also have mailing lists and an IRC channel (#openvz at freenode). We also provide fee-based support for OpenVZ, done by the same excellent team who supports Virtuozzo. - security. guess both projects are trying to keep high security and IMHO the security is at least as high as with the vanilla kernel release ... I can definitely say our security is higher than that of the vanilla kernel. We achieve that by two means: (1) sticking to an older kernel and backporting all the fixes from mainstream, and (2) hiring top-rated security specialists to do an OpenVZ security audit. Regards, Kir, OpenVZ project leader.