Re: [lxc-users] AWS EC2: timeout connecting to instance metadata webserver (169.254.169.254) for *some* URLs (when connecting from a LXD container)
On 2020-11-19 00:07, Tomasz Chmielewski wrote: On 2020-11-18 23:50, Tomasz Chmielewski wrote: That's a weird one! In AWS, there is a concept of "instance metadata" - a webserver which lets you fetch some instance metadata using http: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html For example, you can run this (from both AWS/EC2 instance and LXD container running inside a AWS/EC2 instance), it will return some metadata: curl -v http://169.254.169.254/latest/meta-data/ Now, some of these requests time out when executed from a LXD container running inside a AWS/EC2 - but work perfectly from the very same AWS/EC2 instance. For example, this request works fine from AWS/EC2 instance (ignore the output - HTTP connection works just fine): root@aws-instance:~# curl -v http://169.254.169.254/latest/api/token * Trying 169.254.169.254... * TCP_NODELAY set * Connected to 169.254.169.254 (169.254.169.254) port 80 (#0) GET /latest/api/token HTTP/1.1 Host: 169.254.169.254 User-Agent: curl/7.58.0 Accept: */* < HTTP/1.1 405 Not Allowed < Allow: OPTIONS, PUT < Content-Length: 0 < Date: Wed, 18 Nov 2020 22:41:46 GMT < Server: EC2ws < Connection: close < Content-Type: text/plain < * Closing connection 0 However, when executed from within a LXD container running inside the very same AWS/EC2 instance - it times out! root@lxd-container:~# curl -v http://169.254.169.254/latest/api/token * Trying 169.254.169.254... * TCP_NODELAY set * Connected to 169.254.169.254 (169.254.169.254) port 80 (#0) GET /latest/api/token HTTP/1.1 Host: 169.254.169.254 User-Agent: curl/7.58.0 Accept: */* Even more weirdly, these work inside the container: curl -v http://169.254.169.254/latest/api/ curl -v http://169.254.169.254/latest/api/t curl -v http://169.254.169.254/latest/api/to curl -v http://169.254.169.254/latest/api/tok curl -v http://169.254.169.254/latest/api/toke And this times out: curl -v http://169.254.169.254/latest/api/token Does anyone know why? tcpdump doesn't give me many clues (TTL?). A somewhat related post (with docker having a similar issue): https://rtfm.co.ua/en/aws-eksctl-put-http-169-254-169-254-latest-api-token-net-http-request-canceled-2/ But, I'm no closer in getting a similar workaround for LXD. If someone's struggling with a similar issue - here is a fix: aws ec2 modify-instance-metadata-options --instance-id i-abcdefghijklmn --http-put-response-hop-limit 2 --http-endpoint enabled Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
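For background: as I understand it, the EC2 metadata service sends IMDSv2-related responses with an IP TTL equal to the configured hop limit (1 by default), so the reply is discarded once it has to cross the extra hop from the host into the container; raising the hop limit to 2, as above, lets it survive. A quick check, assuming the AWS CLI is configured and reusing the placeholder instance ID from the message above:

# from inside the container - an IMDSv2 token request should now succeed:
curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"

# confirm the new setting on the instance:
aws ec2 describe-instances --instance-ids i-abcdefghijklmn --query 'Reservations[].Instances[].MetadataOptions'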
Re: [lxc-users] AWS EC2: timeout connecting to instance metadata webserver (169.254.169.254) for *some* URLs (when connecting from a LXD container)
On 2020-11-18 23:50, Tomasz Chmielewski wrote: That's a weird one! In AWS, there is a concept of "instance metadata" - a webserver which lets you fetch some instance metadata using http: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html For example, you can run this (from both AWS/EC2 instance and LXD container running inside a AWS/EC2 instance), it will return some metadata: curl -v http://169.254.169.254/latest/meta-data/ Now, some of these requests time out when executed from a LXD container running inside a AWS/EC2 - but work perfectly from the very same AWS/EC2 instance. For example, this request works fine from AWS/EC2 instance (ignore the output - HTTP connection works just fine): root@aws-instance:~# curl -v http://169.254.169.254/latest/api/token * Trying 169.254.169.254... * TCP_NODELAY set * Connected to 169.254.169.254 (169.254.169.254) port 80 (#0) GET /latest/api/token HTTP/1.1 Host: 169.254.169.254 User-Agent: curl/7.58.0 Accept: */* < HTTP/1.1 405 Not Allowed < Allow: OPTIONS, PUT < Content-Length: 0 < Date: Wed, 18 Nov 2020 22:41:46 GMT < Server: EC2ws < Connection: close < Content-Type: text/plain < * Closing connection 0 However, when executed from within a LXD container running inside the very same AWS/EC2 instance - it times out! root@lxd-container:~# curl -v http://169.254.169.254/latest/api/token * Trying 169.254.169.254... * TCP_NODELAY set * Connected to 169.254.169.254 (169.254.169.254) port 80 (#0) GET /latest/api/token HTTP/1.1 Host: 169.254.169.254 User-Agent: curl/7.58.0 Accept: */* Even more weirdly, these work inside the container: curl -v http://169.254.169.254/latest/api/ curl -v http://169.254.169.254/latest/api/t curl -v http://169.254.169.254/latest/api/to curl -v http://169.254.169.254/latest/api/tok curl -v http://169.254.169.254/latest/api/toke And this times out: curl -v http://169.254.169.254/latest/api/token Does anyone know why? tcpdump doesn't give me many clues (TTL?). A somewhat related post (with docker having a similar issue): https://rtfm.co.ua/en/aws-eksctl-put-http-169-254-169-254-latest-api-token-net-http-request-canceled-2/ But, I'm no closer in getting a similar workaround for LXD. Tomasz Chmielewski ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] AWS EC2: timeout connecting to instance metadata webserver (169.254.169.254) for *some* URLs (when connecting from a LXD container)
That's a weird one! In AWS, there is a concept of "instance metadata" - a webserver which lets you fetch some instance metadata using http: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html For example, you can run this (from both AWS/EC2 instance and LXD container running inside a AWS/EC2 instance), it will return some metadata: curl -v http://169.254.169.254/latest/meta-data/ Now, some of these requests time out when executed from a LXD container running inside a AWS/EC2 - but work perfectly from the very same AWS/EC2 instance. For example, this request works fine from AWS/EC2 instance (ignore the output - HTTP connection works just fine): root@aws-instance:~# curl -v http://169.254.169.254/latest/api/token * Trying 169.254.169.254... * TCP_NODELAY set * Connected to 169.254.169.254 (169.254.169.254) port 80 (#0) GET /latest/api/token HTTP/1.1 Host: 169.254.169.254 User-Agent: curl/7.58.0 Accept: */* < HTTP/1.1 405 Not Allowed < Allow: OPTIONS, PUT < Content-Length: 0 < Date: Wed, 18 Nov 2020 22:41:46 GMT < Server: EC2ws < Connection: close < Content-Type: text/plain < * Closing connection 0 However, when executed from within a LXD container running inside the very same AWS/EC2 instance - it times out! root@lxd-container:~# curl -v http://169.254.169.254/latest/api/token * Trying 169.254.169.254... * TCP_NODELAY set * Connected to 169.254.169.254 (169.254.169.254) port 80 (#0) GET /latest/api/token HTTP/1.1 Host: 169.254.169.254 User-Agent: curl/7.58.0 Accept: */* Even more weirdly, these work inside the container: curl -v http://169.254.169.254/latest/api/ curl -v http://169.254.169.254/latest/api/t curl -v http://169.254.169.254/latest/api/to curl -v http://169.254.169.254/latest/api/tok curl -v http://169.254.169.254/latest/api/toke And this times out: curl -v http://169.254.169.254/latest/api/token Does anyone know why? tcpdump doesn't give me many clues (TTL?). Tomasz Chmielewski ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
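For anyone trying to confirm the TTL theory: tcpdump's verbose mode prints the IP TTL of each packet, so watching the metadata traffic on both the host's primary interface and the container bridge should show whether the replies arrive with ttl 1 and never make it across to the bridge. A rough sketch; the interface names (eth0, lxdbr0) are assumptions for a typical setup:

# on the EC2 host, in two terminals:
tcpdump -ni eth0 -v host 169.254.169.254
tcpdump -ni lxdbr0 -v host 169.254.169.254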
[lxc-users] /etc/sysctl.conf, /etc/security/limits.conf for LXD from snap?
Are /etc/sysctl.conf and /etc/security/limits.conf changes documented on https://github.com/lxc/lxd/blob/master/doc/production-setup.md still relevant for LXD installed from snap (on Ubuntu 20.04)? Tomasz ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
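Those are host-level kernel and limit settings, so as far as I can tell they remain relevant regardless of how LXD itself is packaged. For reference, the sysctl values suggested there are along these lines (quoted from memory - verify against the current production-setup document before applying):

# /etc/sysctl.d/99-lxd.conf
fs.inotify.max_queued_events = 1048576
fs.inotify.max_user_instances = 1048576
fs.inotify.max_user_watches = 1048576
vm.max_map_count = 262144
kernel.keys.maxkeys = 2000

# apply without a reboot:
sysctl --system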
Re: [lxc-users] Quick Question
On 2020-03-20 21:07, Ray Jender wrote: So if I have an LXD container hosting an application that requires some specific ports be open, must those same ports be opened on the host OS? For example, I need udp ports 5000-65000 open in the container. Must I also open these ports on the host? Does the container have a dedicated, public IP? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
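Whether the host needs the ports opened depends on how traffic reaches the container. If the container only has a private bridge address and services are exposed through LXD's proxy device, the listening side is on the host, e.g. (names and ports are placeholders; newer LXD releases also accept port ranges in listen/connect, I believe):

lxc config device add mycontainer udp5000 proxy listen=udp:0.0.0.0:5000 connect=udp:127.0.0.1:5000

If the container instead has its own routable address on a bridged network, packets go straight to the container and the firewall is mostly managed inside it (plus any FORWARD rules the host applies).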
Re: [lxc-users] lxc: command not found
On 2020-03-17 09:12, Tomasz Chmielewski wrote: Not sure what happened, but suddenly, my system no longer has lxc command: (...) I didn't do any snap manipulations (like lxd removal) recently. The containers are still running on this system (as seen by ps). Sorry for noise - that's due to some filesystem issues described in a separate ticket (btrfs quotas). It's fixed now. Tomasz ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] lxc: command not found
Not sure what happened, but suddenly, my system no longer has lxc command:

# lxc
-bash: lxc: command not found
# lxd
-bash: lxd: command not found

There are no lxc/lxd commands in /snap/bin/. The directory was modified nearly 2 hours ago:

# date
Tue Mar 17 00:11:01 UTC 2020
# ls -ld /snap/bin
drwxr-xr-x 2 root root 4096 Mar 16 22:34 /snap/bin

# snap list
Name     Version    Rev    Tracking  Publisher   Notes
aws-cli  1.16.266   151    stable    aws✓        classic
core     16-2.43.3  8689   stable    canonical✓  core
lxd      3.22       13717  stable    canonical✓  disabled

I didn't do any snap manipulations (like lxd removal) recently. The containers are still running on this system (as seen by ps). Tomasz ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] lxd VMs - running ARM VM on amd64 host?
There is this image: | e/arm64 (2 more) | ea0b0d18e384 | yes| ubuntu 19.10 arm64 (release) (20200307) | aarch64 | VIRTUAL-MACHINE | 481.19MB | Mar 7, 2020 at 12:00am (UTC) | Let's try to run it on a amd64 host: $ lxc launch --vm ubuntu:ea0b0d18e384 arm Creating arm Error: Failed instance creation: Create instance: Requested architecture isn't supported by this host Obviously this won't work - the snap doesn't even have a binary for it: $ ls /snap/lxd/13704/bin/qemu-system-* /snap/lxd/13704/bin/qemu-system-x86_64 Is it planned to support foreign architecture VMs at some point (i.e. ARM VM on amd64 host)? I understand it would be quite slow, but in general, it works if you fiddle with qemu-system-arm. Tomasz Chmielewski ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] how to forbid cross-network traffic?
On 2020-02-11 05:32, Andrey Repin wrote: Containers in these two networks have IP address assigned from DHCP and can connect out to the world - this is what I want. Unfortunately, containers from one network (staging) can also connect to containers from the other network (testing) - which is not what I want. So, fix it? iptables to your rescue. (E.g.: this is not an LXD problem.) IMO it's LXD configuration nuance. And a problem. See below. Is there any mechanism in LXD to prevent it? Or do I have to add my own, custom iptables rules? You have enabled packet forwarding on the host, but not specified any restrictions. Indeed, everything is forwarded where possible. That's why I'm asking if there is any mechanism in LXD to prevent such traffic. LXD adds a lot of its own iptables rules. I can add my own, of course, but in my opinion, it's not a very clear solution: - if one uses iptables-persistent, these rules will kind of conflict with the ones set by LXD and in case of reload, will even clear iptables rules set by LXD; there are issues with rule saving and so on - I can set my own rules via other mechanisms, i.e. in /etc/rc.local on server startup - but then again, there is no reload/change mechanism Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
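For reference, a minimal sketch of host rules that drop forwarding between the two bridges in this thread (br-staging and br-testing); they are not persistent on their own, which is exactly the concern raised above:

iptables -I FORWARD -i br-staging -o br-testing -j DROP
iptables -I FORWARD -i br-testing -o br-staging -j DROP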
[lxc-users] how to forbid cross-network traffic?
I have these two networks:

# lxc network show br-staging
config:
  ipv4.address: 10.100.0.1/24
  ipv4.dhcp.ranges: 10.100.0.50-10.100.0.254
  ipv4.firewall: "true"
  ipv4.nat: "true"
description: staging network
name: br-staging
type: bridge

# lxc network show br-testing
config:
  ipv4.address: 10.200.0.1/24
  ipv4.dhcp.ranges: 10.200.0.50-10.200.0.254
  ipv4.firewall: "true"
  ipv4.nat: "true"
description: testing network
name: br-testing
type: bridge

Containers in these two networks have IP address assigned from DHCP and can connect out to the world - this is what I want. Unfortunately, containers from one network (staging) can also connect to containers from the other network (testing) - which is not what I want. Is there any mechanism in LXD to prevent it? Or do I have to add my own, custom iptables rules? Tomasz Chmielewski ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] lxc proxy nat? is there a "reverse proxy nat"?
On 2020-01-24 23:40, Tomasz Chmielewski wrote: Now, it works great. However, mail sent from container 10.2.2.2 will use LXD server's 1.1.1.1 as the outgoing IP. I'd like it to use 2.2.2.2, and still have the private IP assigned (I don't want to assign the public IP to this container). How can I do it? OK, there is "ipv4.nat.address" which can be set in a network, i.e. "lxc network edit lxdbr1". Good enough! Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
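For reference, the equivalent one-liner (network name and address as used in this thread):

lxc network set lxdbr1 ipv4.nat.address 2.2.2.2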
[lxc-users] lxc proxy nat? is there a "reverse proxy nat"?
Let's say I have a LXD server with two public IPs, 1.1.1.1 and 2.2.2.2. The default IP for outgoing routing is 1.1.1.1. There, I setup two containers with private IP addresses: 10.1.1.1 and 10.2.2.2. They receive the following proxy nat config:

- LXD server passes TCP traffic 1.1.1.1:25 to container 10.1.1.1:25:

  proxy-smtp:
    connect: tcp:10.1.1.1:25
    listen: tcp:1.1.1.1:25
    nat: "true"
    type: proxy

- LXD server passes TCP traffic 2.2.2.2:25 to container 10.2.2.2:25:

  proxy-smtp:
    connect: tcp:10.2.2.2:25
    listen: tcp:2.2.2.2:25
    nat: "true"
    type: proxy

Now, it works great. However, mail sent from container 10.2.2.2 will use LXD server's 1.1.1.1 as the outgoing IP. I'd like it to use 2.2.2.2, and still have the private IP assigned (I don't want to assign the public IP to this container). How can I do it? Tomasz ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] Docker in unprivileged LXC?
On 2019-11-20 19:52, Dirk Geschke wrote: Hi all, is there a way to get docker up and running in an unprivileged LXC? It seems to have problems with cgroups: docker: Error response from daemon: OCI runtime create failed: container_linux.go:344: starting container process caused "process_linux.go:275: applying cgroup configuration for process caused \"mkdir /sys/fs/cgroup/cpuset/docker: permission denied\"": unknown. Does someone know a way to get it working? I don't trust the docker containers, so my idea was to run them in an LXC. But up to now I have no clue how to do this... You just need to set this: security.nesting: "true" (in "lxc config edit container-name"). Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
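For reference, the commands spelled out (container name is a placeholder; as far as I recall the container needs a restart for the option to take effect):

lxc config set docker-host security.nesting true
lxc restart docker-host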
[lxc-users] lxc config edit $container - how to change the editor (defaults to vim.tiny)?
In the last week or two, my lxd servers installed from snap start "vim.tiny" when I use "lxc config edit $container". It used to use vim before (I think - vim.tiny has arrow keys messed up, and they used to work as expected before). How to change the editor to something else, i.e. proper/full vim? Tomasz Chmielewski ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
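As far as I know, "lxc config edit" picks its editor from the VISUAL or EDITOR environment variables before falling back to a default, so pointing those at full vim on the host should be enough (assuming vim is installed there):

export EDITOR=/usr/bin/vim
lxc config edit some-container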
Re: [lxc-users] Error: websocket: close 1006 (abnormal closure): unexpected EOF
On 2019-09-20 12:23, Fajar A. Nugraha wrote: If you're asking 'how to keep "lxc exec session running when lxd is restarted", then it's not possible. Yes, I guess that's what I'm asking about! Is there a chance it will be possible in the future? With some changes to lxd restart model maybe? Or how lxc exec sessions are handled? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] Error: websocket: close 1006 (abnormal closure): unexpected EOF
Ubuntu 18.04, lxd installed from snap. Very often, a "lxc shell container" or "lxc exec container some-command" session gets interrupted with: Error: websocket: close 1006 (abnormal closure): unexpected EOF I suppose this happens when lxd snap gets auto-updated (i.e. today, from lxd ver 3.17, rev. 11964 to rev. 11985). This is quite annoying and leads to various errors, including data loss: - some long-running jobs within "lxc shell container" console get interrupted when this happens - script/jobs/builds running as "lxc exec container some-command" also get interrupted when this happens Is there a way to prevent this? Tomasz Chmielewski ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
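One partial mitigation is to pin snap refreshes to a known maintenance window, so the interruptions at least happen at a predictable time (the window value is just an example):

snap set system refresh.timer=sat,03:00-05:00
snap refresh --time    # show the resulting schedule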
[lxc-users] lxcfs segfaults since around 2019-07-23, containers malfunction
Since around 2019-07-23, lxcfs segfaults randomly on Ubuntu 18.04 servers with LXD from snap: lxcfs[1424]: segfault at 0 ip 7f518f5e4326 sp 7f519da1f9a0 error 4 in liblxcfs.so[7f518f5d8000+1a000] As a result, containers malfunction. If a container is stopped, then started again - everything works well, however - after a few hours, it happens again. root@uni01:~# free Error: /proc must be mounted To mount /proc at boot you need an /etc/fstab line like: proc /proc procdefaults In the meantime, run "mount proc /proc -t proc" root@uni01:~# uptime Error: /proc must be mounted To mount /proc at boot you need an /etc/fstab line like: proc /proc procdefaults In the meantime, run "mount proc /proc -t proc" root@uni01:~# ls /proc ls: cannot access '/proc/stat': Transport endpoint is not connected ls: cannot access '/proc/swaps': Transport endpoint is not connected ls: cannot access '/proc/uptime': Transport endpoint is not connected ls: cannot access '/proc/cpuinfo': Transport endpoint is not connected ls: cannot access '/proc/meminfo': Transport endpoint is not connected ls: cannot access '/proc/diskstats': Transport endpoint is not connected (...) Is it a known issue? I'm observing it on around 10 servers. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] unable to set security.protection.delete on a new LXD server
That was it! Thanks for the hint. Tomasz On 2019-07-03 14:09, Stéphane Graber wrote: Do you maybe have both the deb and snap installed on that system and are in fact interacting with the deb rather than snap? dpkg -l | grep lxd And if you do have both, then run `lxd.migrate` to transition the data over to the snap. On Tue, Jul 2, 2019 at 9:52 PM Tomasz Chmielewski wrote: Just installed lxd from snap on a Ubuntu 18.04 server and launched the first container: # snap list Name VersionRevTracking Publisher Notes amazon-ssm-agent 2.3.612.0 1335 stable/… aws✓classic core 16-2.39.3 7270 stablecanonical✓ core lxd 3.14 11098 stablecanonical✓ - # lxc launch ubuntu:18.04 terraform Creating terraform Starting terraform However, I'm not able to set security.protection.delete for containers created here: # lxc config set terraform security.protection.delete true Error: Invalid config: Unknown configuration key: security.protection.delete Also doesn't work when I try to set it via "lxc config edit". This works perfectly on other LXD servers, so I'm a bit puzzled why it won't work here? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] unable to set security.protection.delete on a new LXD server
Just installed lxd from snap on a Ubuntu 18.04 server and launched the first container:

# snap list
Name              Version    Rev    Tracking  Publisher   Notes
amazon-ssm-agent  2.3.612.0  1335   stable/…  aws✓        classic
core              16-2.39.3  7270   stable    canonical✓  core
lxd               3.14       11098  stable    canonical✓  -

# lxc launch ubuntu:18.04 terraform
Creating terraform
Starting terraform

However, I'm not able to set security.protection.delete for containers created here:

# lxc config set terraform security.protection.delete true
Error: Invalid config: Unknown configuration key: security.protection.delete

Also doesn't work when I try to set it via "lxc config edit". This works perfectly on other LXD servers, so I'm a bit puzzled why it won't work here? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] limits.memory - possible to set per group of containers?
On 2019-06-18 11:59, Stéphane Graber wrote: So we have plans to introduce project quotas which will allow placing such restrictions in a clean way through LXD. Until then you can manually tweak /sys/fs/cgroup/memory/lxc or /sys/fs/cgroup/memory/lxc.payload (depending on version of liblxc) as all containers reside under there and limits are hierarchical. It's pretty similar to what systemd would attempt to do except that liblxc/lxd bypass systemd's expected cgroup so placing the limit through systemd wouldn't work.

I was just going to test systemd - because systemd-cgls shows these:

1) this is just the monitor process, so setting any limits on it won't have the desired effect:

├─lxc.monitor
│ ├─shard01d
│ │ └─26025 [lxc monitor] /var/snap/lxd/common/lxd/containers shard01d
│ └─uni01-2019-06-18-01-09-11
│   └─11442 [lxc monitor] /var/snap/lxd/common/lxd/containers uni01-2019-06-18-01-09-11

2) Container processes seem to be all under lxc.payload tree:

├─lxc.payload
│ ├─shard01d
│ │ ├─  488 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid
│ │ ├─ 1821 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d
│ │ ├─ 3158 /lib/systemd/systemd-udevd
│ │ ├─ 3878 /usr/sbin/rsyslogd -n
│ │ ├─ 4592 /sbin/rpcbind -f -w
│ │ (...)
│ └─uni01-2019-06-18-01-09-11
│   ├─ 2875 /usr/bin/php cli.php Jiradaemon
│   ├─ 3446 /usr/bin/php cli.php Jiradaemon
│   ├─ 4022 pickup -l -t unix -u -c
│   ├─ 4180 /usr/bin/php cli.php Jiradaemon
│   (...)

But since there isn't any "lxc.payload" systemd service - then yes, like you said - I'd need to dive into /sys/fs/cgroup/memory/ for now. Great to hear project quotas are in the plans! Tomasz Chmielewski ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
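A minimal sketch of the manual tweak described above, assuming cgroup v1 and the lxc.payload path (the exact path differs between liblxc versions, and the value does not survive a reboot):

echo 29G > /sys/fs/cgroup/memory/lxc.payload/memory.limit_in_bytes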
Re: [lxc-users] limits.memory - possible to set per group of containers?
On 2019-06-18 11:34, Fajar A. Nugraha wrote: You could probably just use nested lxd instead: https://stgraber.org/2016/12/07/running-snaps-in-lxd-containers/ Set the outer container memory limit to 29GB, and put other containers inside that one. Yeah, but that's a bit "hackish" (to achieve the desired result - memory limit per group of containers; nothing wrong with nesting itself), and may be even hard to implement on existing setups. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] limits.memory - possible to set per group of containers?
Let's say I have a host with 32 GB RAM. To make sure the host is not affected by any weird memory consumption patterns, I've set the following in the container: limits.memory: 29GB This works quite well - where previously, several processes with high memory usage, forking rapidly (a forkbomb to test, but also i.e. a supervisor in normal usage) running in the container could make the host very slow or even unreachable - with the above setting, everything (on the host) is just smooth no matter what the container does. However, that's just with one container. With two (or more) containers having "limits.memory: 29GB" set - it's easy for each of them to consume i.e. 20 GB, leading to host unavailability. Is it possible to set a global, or per-container group "limits.memory: 29GB"? For example, if I add "MemoryMax=29G" to /etc/systemd/system/snap.lxd.daemon.service - would I achieve a desired effect? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] Looking for LXD Container with AWS CDN Experience?
On 2019-03-18 01:09, Ray Jender wrote: If there is anyone experienced with using Amazon CloudFront with an LXD container, I could really use a little help! What are you trying to achieve? With Amazon CloudFront, an LXD container isn't used any differently than a standalone server or a full virtual machine like KVM or Xen. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] docker in LXD from snap?
On 2019-03-14 19:05, Sergiusz Pawlowicz wrote: On Thu, 14 Mar 2019 at 16:55, Tomasz Chmielewski wrote: Is the following guide also relevant for running docker in LXD installed from snap? https://stgraber.org/2016/04/13/lxd-2-0-docker-in-lxd-712/ yes, I am using docker - but container must be privileged for docker to run properly. I've just started a bionic container which is not privileged, and simple docker images seem to run fine. I did use this option though: security.nesting: "true" Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
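For reference, the nesting option can also be given at creation time (container name is a placeholder):

lxc launch ubuntu:18.04 docker1 -c security.nesting=true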
Re: [lxc-users] lxc container rootfs dev folder permission are changing from ro to rw inside container
On 2019-02-25 17:27, Yasoda Padala wrote: Actual results: dev folder of container rootfs is read-only on host machine but inside container, it is writable. Please help with inputs on why the dev folder permissions are changed on lxc-attach. Can you paste the output of: mount cat /proc/mounts from the container? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] docker in LXD from snap?
Is the following guide also relevant for running docker in LXD installed from snap? https://stgraber.org/2016/04/13/lxd-2-0-docker-in-lxd-712/ Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] [LXD] how to set container description (in a script)?
I would like to add some of my own "metadata" for a container. I thought I'd use "description" for that, since it's present in "lxc config show <container>". This however doesn't work:

$ lxc config show testcontainer | grep ^description
description: ""
$ lxc config set testcontainer description "some description"
Error: Invalid config: Unknown configuration key: description

What would be the best way to set the description for a container? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
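For arbitrary per-container metadata, the user.* configuration namespace is probably a better fit - any user.* key is accepted and stored as-is:

lxc config set testcontainer user.comment "some description"
lxc config get testcontainer user.comment

The description field itself isn't accepted by "lxc config set"; patching it through the raw API (e.g. with "lxc query -X PATCH") might work, but I haven't verified that.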
Re: [lxc-users] lxc container rootfs dev folder permission are changing from ro to rw inside container
On 2019-02-26 13:02, Yasoda Padala wrote: Hi Tomasz, Please find below the output of mount & cat /proc/mounts container config is also attached with this mail yasoda@yasoda-HP-Z600-Workstation:~/.local/share/lxc/busybox$ lxc-attach -n busybox BusyBox v1.22.1 (Ubuntu 1:1.22.0-15ubuntu1) built-in shell (ash) Enter 'help' for a list of built-in commands. / # mount /dev/loop0 on / type squashfs (ro,relatime) none on /dev type tmpfs This. You have tmpfs mounted over /dev in your container. Why is it an issue for you? I'd say it's perfectly normal behaviour. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] lxc container rootfs dev folder permission are changing from ro to rw inside container
Please note these are two separate commands: mount cat /proc/mounts Tomasz Chmielewski https://lxadm.com On 2019-02-25 17:37, Yasoda Padala wrote: yasoda@yasoda-HP-Z600-Workstation:~/.local/share/lxc/busybox$ lxc-attach -n busybox lxc-attach: busybox: utils.c: get_ns_uid: 548 No such file or directory - Failed to open uid_map lxc-attach: busybox: utils.c: get_ns_gid: 579 No such file or directory - Failed to open gid_map BusyBox v1.22.1 (Ubuntu 1:1.22.0-15ubuntu1) built-in shell (ash) Enter 'help' for a list of built-in commands. / # mount cat /proc/mounts mount: mounting cat on /proc/mounts failed: No such file or directory / # / # Please find attached container config On Mon, Feb 25, 2019 at 2:01 PM Tomasz Chmielewski wrote: On 2019-02-25 17:27, Yasoda Padala wrote: Actual results: dev folder of container rootfs is read-only on host machine but inside container, it is writable. Please help with inputs on why the dev folder permissions are changed on lxc-attach. Can you paste the output of: mount cat /proc/mounts from the container? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] why does ssh + lxc hang? (used to work)
Yes, these parameters passed to lxc command don't really help. "ssh -t" makes "lxc exec" return after executing the command. Though I'd be interested to understand why it suddenly started to happen; have a number of scripts which broke because of this change. Tomasz Chmielewski https://lxadm.com On 2019-02-25 00:39, Kees Bos wrote: Did you try '-T, --force-noninteractive' ? (Disable pseudo-terminal allocation) i.e. laptop$ ssh root@host "/snap/bin/lxc exec container -T -- date" On Sun, 2019-02-24 at 21:33 +0900, Tomasz Chmielewski wrote: This works (executed on a host): host# lxc exec container -- date Sun Feb 24 12:25:21 UTC 2019 host# This however hangs and doesn't return (executed from a remote system, i.e. your laptop or a different server): laptop$ ssh root@host "export PATH=$PATH:/snap/bin ; lxc exec container -- date" Sun Feb 24 12:28:04 UTC 2019 (...command does not return...) Or a direct path to lxc binary - also hangs: laptop$ ssh root@host "/snap/bin/lxc exec container -- date" Sun Feb 24 12:29:54 UTC 2019 (...command does not return...) Of course a simple "date" execution via ssh on the host does not hang: laptop$ ssh root@host date Sun Feb 24 12:31:33 UTC 2019 laptop$ Why do commands executed via ssh and lxc hang? It used to work some 1-2 months ago, not sure with which lxd version it regressed like this. Tomasz Chmielewski ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] why does ssh + lxc hang? (used to work)
This works (executed on a host): host# lxc exec container -- date Sun Feb 24 12:25:21 UTC 2019 host# This however hangs and doesn't return (executed from a remote system, i.e. your laptop or a different server): laptop$ ssh root@host "export PATH=$PATH:/snap/bin ; lxc exec container -- date" Sun Feb 24 12:28:04 UTC 2019 (...command does not return...) Or a direct path to lxc binary - also hangs: laptop$ ssh root@host "/snap/bin/lxc exec container -- date" Sun Feb 24 12:29:54 UTC 2019 (...command does not return...) Of course a simple "date" execution via ssh on the host does not hang: laptop$ ssh root@host date Sun Feb 24 12:31:33 UTC 2019 laptop$ Why do commands executed via ssh and lxc hang? It used to work some 1-2 months ago, not sure with which lxd version it regressed like this. Tomasz Chmielewski ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] kernel messages appear in containers?
On 2019-02-03 11:51, Richard Hector wrote: Hi all, I've noticed that some log messages that really belong to the host (like those from monthly RAID checks, for example) can appear in arbitrary containers instead - so they're spread all over the place. Is that normal/fixable? I'd say it's a "normal, bad default". You can fix it by adding this to your sysctl config file: kernel.dmesg_restrict = 1 Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
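To make that setting persistent across reboots and apply it immediately (the file name is arbitrary):

echo 'kernel.dmesg_restrict = 1' > /etc/sysctl.d/60-dmesg-restrict.conf
sysctl --system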
Re: [lxc-users] unable to start containers ("Permission denied - Failed to mount")
It just broke for me on two servers again, more or less at the same time: root@backup01 ~ # ls -l /data/lxd total 0 drwx-- 1 root root 198 Jan 24 03:34 containers (...) Both servers are running Ubuntu 18.04 with LXD from snap: lxd 3.99919 stablecanonical✓ - And storage on a btrfs device: root@lxd05 ~ # lxc storage list +-+-++---+-+ | NAME | DESCRIPTION | DRIVER | SOURCE | USED BY | +-+-++---+-+ | default | | btrfs | /data/lxd | 16 | +-+-++---+-+ root@backup01 ~ # lxc storage list +-+-++---+-+ | NAME | DESCRIPTION | DRIVER | SOURCE | USED BY | +-+-++---+-+ | default | | btrfs | /data/lxd | 44 | +-+-++---+-+ Not sure what's causing, but it's yet another time I'm seeing it. Tomasz On 2018-09-24 22:43, Christian Brauner wrote: On Mon, Sep 24, 2018 at 03:40:57PM +0200, Tomasz Chmielewski wrote: Turns out something changed the permissions on "containers" directory: Odd, the new storage snapshot api performs an on-disk upgrade but it shouldn't touch the containers directory... //cc Stéphane Christian # lxc storage list +-+-++---+-+ | NAME | DESCRIPTION | DRIVER | SOURCE | USED BY | +-+-++---+-+ | default | | btrfs | /data/lxd | 12 | +-+-++---+-+ # ls -l /data/lxd total 0 drwxr-xr-x 1 root root 90 Sep 24 13:05 archives drwx-- 1 root root 518 Sep 24 13:12 containers <- here drwx--x--x 1 root root 0 Mar 28 16:14 custom drwx-- 1 root root 0 Sep 21 06:05 images drwx-- 1 root root 0 Sep 24 05:48 snapshots This fixed it: chmod 711 /data/lxd/containers/ I'm 99% sure we did not change the permissions on that directory... Tomasz On 2018-09-24 15:32, Tomasz Chmielewski wrote: > I'm not able to start any container today. > > # lxc start preprod-app > Error: Failed to run: /snap/lxd/current/bin/lxd forkstart preprod-app > /var/snap/lxd/common/lxd/containers > /var/snap/lxd/common/lxd/logs/preprod-app/lxc.conf: > Try `lxc info --show-log preprod-app` for more info > > > # lxc info --show-log preprod-app > Name: preprod-app > Remote: unix:// > Architecture: x86_64 > Created: 2018/09/05 15:01 UTC > Status: Stopped > Type: persistent > Profiles: default > > Log: > > lxc preprod-app 20180924132438.883 WARN conf - > conf.c:lxc_map_ids:2917 - newuidmap binary is missing > lxc preprod-app 20180924132438.883 WARN conf - > conf.c:lxc_map_ids:2923 - newgidmap binary is missing > lxc preprod-app 20180924132438.887 WARN conf - > conf.c:lxc_map_ids:2917 - newuidmap binary is missing > lxc preprod-app 20180924132438.887 WARN conf - > conf.c:lxc_map_ids:2923 - newgidmap binary is missing > lxc preprod-app 20180924132438.917 ERRORdir - > storage/dir.c:dir_mount:195 - Permission denied - Failed to mount > "/var/snap/lxd/common/lxd/containers/preprod-app/rootfs" on > "/var/snap/lxd/common/lxc/" > lxc preprod-app 20180924132438.917 ERRORconf - > conf.c:lxc_mount_rootfs:1337 - Failed to mount rootfs > "/var/snap/lxd/common/lxd/containers/preprod-app/rootfs" onto > "/var/snap/lxd/common/lxc/" with options "(null)" > lxc preprod-app 20180924132438.917 ERRORconf - > conf.c:lxc_setup_rootfs_prepare_root:3446 - Failed to setup rootfs for > lxc preprod-app 20180924132438.917 ERRORconf - > conf.c:lxc_setup:3510 - Failed to setup rootfs > lxc preprod-app 20180924132438.917 ERRORstart - > start.c:do_start:1234 - Failed to setup container "preprod-app" > lxc preprod-app 20180924132438.918 ERRORsync - > sync.c:__sync_wait:59 - An error occurred in another process (expected > sequence number 5) > lxc preprod-app 20180924132439.235 ERRORstart - > start.c:__lxc_start:1910 - Failed to spawn container "preprod-app" > lxc preprod-app 
20180924132439.235 ERRORlxccontainer - > lxccontainer.c:wait_on_daemonized_start:840 - Received container state > "ABORTING" instead of "RUNNING" > lxc preprod-app 20180924132439.963 WARN conf - > conf.c:lxc_map_ids:2917 - newuidmap binary is missing > lxc preprod-app 20180924132439.101 WARN conf - > conf.c:lxc_map_ids:2923 - newgidmap binary is missing > lxc 20180924132439.380 WARN commands - > commands.c:lxc_cmd_rsp_recv:130 - Connection reset by peer - Failed to > receive response for command "get_state" > > > # snap lis
Re: [lxc-users] Container mount accounting
On 2018-12-20 21:40, Guido Jäkel wrote: Hi all, is there any way to measure (read/written bytes and/or ops) the "traffic" and/or inspect (monitor) fs operations on a container root-mount and additional mounts without serious impact on performance? systemd-cgtop? But frankly, it doesn't show the data too reliably. It seems to rely on reading data from /sys/fs/cgroup/ - maybe there are better ways to process it and come up with some meaningful data. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
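If per-container block I/O counters are enough, the cgroup v1 blkio controller already keeps cumulative byte and operation counts that can be sampled cheaply; a hedged sketch (the lxc.payload path and the container name are assumptions and depend on how the container was started):

cat /sys/fs/cgroup/blkio/lxc.payload/mycontainer/blkio.throttle.io_service_bytes
cat /sys/fs/cgroup/blkio/lxc.payload/mycontainer/blkio.throttle.io_serviced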
Re: [lxc-users] LXD: modify IP of snapshot before starting
On 2018-12-08 00:34, Steven Spencer wrote: All, My Google search turned up empty, so I'm turning to the list to see if this is possible: * In LXD I make a copy of a container, but want to create a new container from it * The container has a static assigned IP address, so if I bring up the new container with the other one running, I'm going to end up with an IP conflict * What I'd like to be able to do is to change the IP of the snapshot before creating a container out of it. Is that possible, or am I missing another method. I've already done this step before, which works, but isn't the best if you want to keep systems up. * Stop the original container * create the new container with the snapshot * modify the IP of the new container * start the original container If it isn't possible, I'll continue on as I've been doing. lxc file pull / lxc file push ? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
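A hedged sketch of the file pull/push approach, assuming an Ubuntu guest configured through netplan (the config path is an assumption - adjust it to however the guest sets its static address); lxc file also works on stopped containers with the common storage backends, as far as I know:

lxc copy oldcontainer/snapname newcontainer
lxc file pull newcontainer/etc/netplan/50-cloud-init.yaml ./net.yaml
# edit the static address in net.yaml, then:
lxc file push ./net.yaml newcontainer/etc/netplan/50-cloud-init.yaml
lxc start newcontainer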
Re: [lxc-users] unable to delete a snapshot with a '+' sign in the name
On 2018-12-05 22:00, Werner Hack wrote: Hi all, I am using lxd 3.0.2 on Ubuntu 16.04. I noticed that I can not delete a snapshot with a '+' sign in the name. lxc delete vm/v2.monit+letsencrypt Error: not found Also quoting the filename or masking the sign does not help. Any idea what I can do? It's similar with snapshots with a space in name - which used to work in the past. Unfortunately I also don't have a solution for that. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] /proc lost in some containers
On 2018-11-09 11:19, Stéphane Graber wrote: On Fri, Nov 09, 2018 at 04:29:46AM +0900, Tomasz Chmielewski wrote: LXD 3.6 from a snap on an up-to-date Ubuntu 18.04 server: lxd 3.69510 stablecanonical✓ - Suddenly, some (but not all) containers lost their /proc filesystem: # ps auxf Error: /proc must be mounted To mount /proc at boot you need an /etc/fstab line like: proc /proc procdefaults In the meantime, run "mount proc /proc -t proc" # I think I've seen something similar like this in the past. Can it be attributed to some not-so-well automatic snap upgrades? That can either be a lxcfs crash or lxcfs bug of some kind. Can you show "ps fauxww | grep lxcfs" on the host. Hmm, running twice? root 85202 0.0 0.0 382524 1236 ?S/var/snap/lxd/common/var/lib/lxcfs -p /var/snap/lxd/common/lxcfs.pid root 19414 0.0 0.0 530260 1716 ?S/var/snap/lxd/common/var/lib/lxcfs -p /var/snap/lxd/common/lxcfs.pid And then inside an affected container run "grep lxcfs /proc/mounts" and for each of the /proc path listed, attempted to read them with "cat", that should let you check if it's just one lxcfs file that's broken or all of them. Unfortunately I've already stopped / started all affected containers. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] /proc lost in some containers
LXD 3.6 from a snap on an up-to-date Ubuntu 18.04 server: lxd 3.69510 stablecanonical✓ - Suddenly, some (but not all) containers lost their /proc filesystem: # ps auxf Error: /proc must be mounted To mount /proc at boot you need an /etc/fstab line like: proc /proc procdefaults In the meantime, run "mount proc /proc -t proc" # I think I've seen something similar like this in the past. Can it be attributed to some not-so-well automatic snap upgrades? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] unable to start any container ("Permission denied - Failed to mount")
I'm not able to start any container today. # lxc start preprod-app Error: Failed to run: /snap/lxd/current/bin/lxd forkstart preprod-app /var/snap/lxd/common/lxd/containers /var/snap/lxd/common/lxd/logs/preprod-app/lxc.conf: Try `lxc info --show-log preprod-app` for more info # lxc info --show-log preprod-app Name: preprod-app Remote: unix:// Architecture: x86_64 Created: 2018/09/05 15:01 UTC Status: Stopped Type: persistent Profiles: default Log: lxc preprod-app 20180924132438.883 WARN conf - conf.c:lxc_map_ids:2917 - newuidmap binary is missing lxc preprod-app 20180924132438.883 WARN conf - conf.c:lxc_map_ids:2923 - newgidmap binary is missing lxc preprod-app 20180924132438.887 WARN conf - conf.c:lxc_map_ids:2917 - newuidmap binary is missing lxc preprod-app 20180924132438.887 WARN conf - conf.c:lxc_map_ids:2923 - newgidmap binary is missing lxc preprod-app 20180924132438.917 ERRORdir - storage/dir.c:dir_mount:195 - Permission denied - Failed to mount "/var/snap/lxd/common/lxd/containers/preprod-app/rootfs" on "/var/snap/lxd/common/lxc/" lxc preprod-app 20180924132438.917 ERRORconf - conf.c:lxc_mount_rootfs:1337 - Failed to mount rootfs "/var/snap/lxd/common/lxd/containers/preprod-app/rootfs" onto "/var/snap/lxd/common/lxc/" with options "(null)" lxc preprod-app 20180924132438.917 ERRORconf - conf.c:lxc_setup_rootfs_prepare_root:3446 - Failed to setup rootfs for lxc preprod-app 20180924132438.917 ERRORconf - conf.c:lxc_setup:3510 - Failed to setup rootfs lxc preprod-app 20180924132438.917 ERRORstart - start.c:do_start:1234 - Failed to setup container "preprod-app" lxc preprod-app 20180924132438.918 ERRORsync - sync.c:__sync_wait:59 - An error occurred in another process (expected sequence number 5) lxc preprod-app 20180924132439.235 ERRORstart - start.c:__lxc_start:1910 - Failed to spawn container "preprod-app" lxc preprod-app 20180924132439.235 ERRORlxccontainer - lxccontainer.c:wait_on_daemonized_start:840 - Received container state "ABORTING" instead of "RUNNING" lxc preprod-app 20180924132439.963 WARN conf - conf.c:lxc_map_ids:2917 - newuidmap binary is missing lxc preprod-app 20180924132439.101 WARN conf - conf.c:lxc_map_ids:2923 - newgidmap binary is missing lxc 20180924132439.380 WARN commands - commands.c:lxc_cmd_rsp_recv:130 - Connection reset by peer - Failed to receive response for command "get_state" # snap list Name Version Rev Tracking Publisher Notes core 16-2.35 5328 stable canonical✓ core lxd 3.5 8774 stablecanonical✓ - This is on Ubuntu 18.04. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] unable to start containers ("Permission denied - Failed to mount")
Turns out something changed the permissions on "containers" directory: # lxc storage list +-+-++---+-+ | NAME | DESCRIPTION | DRIVER | SOURCE | USED BY | +-+-++---+-+ | default | | btrfs | /data/lxd | 12 | +-+-++---+-+ # ls -l /data/lxd total 0 drwxr-xr-x 1 root root 90 Sep 24 13:05 archives drwx-- 1 root root 518 Sep 24 13:12 containers <- here drwx--x--x 1 root root 0 Mar 28 16:14 custom drwx-- 1 root root 0 Sep 21 06:05 images drwx-- 1 root root 0 Sep 24 05:48 snapshots This fixed it: chmod 711 /data/lxd/containers/ I'm 99% sure we did not change the permissions on that directory... Tomasz On 2018-09-24 15:32, Tomasz Chmielewski wrote: I'm not able to start any container today. # lxc start preprod-app Error: Failed to run: /snap/lxd/current/bin/lxd forkstart preprod-app /var/snap/lxd/common/lxd/containers /var/snap/lxd/common/lxd/logs/preprod-app/lxc.conf: Try `lxc info --show-log preprod-app` for more info # lxc info --show-log preprod-app Name: preprod-app Remote: unix:// Architecture: x86_64 Created: 2018/09/05 15:01 UTC Status: Stopped Type: persistent Profiles: default Log: lxc preprod-app 20180924132438.883 WARN conf - conf.c:lxc_map_ids:2917 - newuidmap binary is missing lxc preprod-app 20180924132438.883 WARN conf - conf.c:lxc_map_ids:2923 - newgidmap binary is missing lxc preprod-app 20180924132438.887 WARN conf - conf.c:lxc_map_ids:2917 - newuidmap binary is missing lxc preprod-app 20180924132438.887 WARN conf - conf.c:lxc_map_ids:2923 - newgidmap binary is missing lxc preprod-app 20180924132438.917 ERRORdir - storage/dir.c:dir_mount:195 - Permission denied - Failed to mount "/var/snap/lxd/common/lxd/containers/preprod-app/rootfs" on "/var/snap/lxd/common/lxc/" lxc preprod-app 20180924132438.917 ERRORconf - conf.c:lxc_mount_rootfs:1337 - Failed to mount rootfs "/var/snap/lxd/common/lxd/containers/preprod-app/rootfs" onto "/var/snap/lxd/common/lxc/" with options "(null)" lxc preprod-app 20180924132438.917 ERRORconf - conf.c:lxc_setup_rootfs_prepare_root:3446 - Failed to setup rootfs for lxc preprod-app 20180924132438.917 ERRORconf - conf.c:lxc_setup:3510 - Failed to setup rootfs lxc preprod-app 20180924132438.917 ERRORstart - start.c:do_start:1234 - Failed to setup container "preprod-app" lxc preprod-app 20180924132438.918 ERRORsync - sync.c:__sync_wait:59 - An error occurred in another process (expected sequence number 5) lxc preprod-app 20180924132439.235 ERRORstart - start.c:__lxc_start:1910 - Failed to spawn container "preprod-app" lxc preprod-app 20180924132439.235 ERRORlxccontainer - lxccontainer.c:wait_on_daemonized_start:840 - Received container state "ABORTING" instead of "RUNNING" lxc preprod-app 20180924132439.963 WARN conf - conf.c:lxc_map_ids:2917 - newuidmap binary is missing lxc preprod-app 20180924132439.101 WARN conf - conf.c:lxc_map_ids:2923 - newgidmap binary is missing lxc 20180924132439.380 WARN commands - commands.c:lxc_cmd_rsp_recv:130 - Connection reset by peer - Failed to receive response for command "get_state" # snap list Name Version Rev Tracking Publisher Notes core 16-2.35 5328 stablecanonical✓ core lxd 3.5 8774 stablecanonical✓ - This is on Ubuntu 18.04. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] unable to start containers ("Permission denied - Failed to mount")
I'm not able to start any container today. # lxc start preprod-app Error: Failed to run: /snap/lxd/current/bin/lxd forkstart preprod-app /var/snap/lxd/common/lxd/containers /var/snap/lxd/common/lxd/logs/preprod-app/lxc.conf: Try `lxc info --show-log preprod-app` for more info # lxc info --show-log preprod-app Name: preprod-app Remote: unix:// Architecture: x86_64 Created: 2018/09/05 15:01 UTC Status: Stopped Type: persistent Profiles: default Log: lxc preprod-app 20180924132438.883 WARN conf - conf.c:lxc_map_ids:2917 - newuidmap binary is missing lxc preprod-app 20180924132438.883 WARN conf - conf.c:lxc_map_ids:2923 - newgidmap binary is missing lxc preprod-app 20180924132438.887 WARN conf - conf.c:lxc_map_ids:2917 - newuidmap binary is missing lxc preprod-app 20180924132438.887 WARN conf - conf.c:lxc_map_ids:2923 - newgidmap binary is missing lxc preprod-app 20180924132438.917 ERRORdir - storage/dir.c:dir_mount:195 - Permission denied - Failed to mount "/var/snap/lxd/common/lxd/containers/preprod-app/rootfs" on "/var/snap/lxd/common/lxc/" lxc preprod-app 20180924132438.917 ERRORconf - conf.c:lxc_mount_rootfs:1337 - Failed to mount rootfs "/var/snap/lxd/common/lxd/containers/preprod-app/rootfs" onto "/var/snap/lxd/common/lxc/" with options "(null)" lxc preprod-app 20180924132438.917 ERRORconf - conf.c:lxc_setup_rootfs_prepare_root:3446 - Failed to setup rootfs for lxc preprod-app 20180924132438.917 ERRORconf - conf.c:lxc_setup:3510 - Failed to setup rootfs lxc preprod-app 20180924132438.917 ERRORstart - start.c:do_start:1234 - Failed to setup container "preprod-app" lxc preprod-app 20180924132438.918 ERRORsync - sync.c:__sync_wait:59 - An error occurred in another process (expected sequence number 5) lxc preprod-app 20180924132439.235 ERRORstart - start.c:__lxc_start:1910 - Failed to spawn container "preprod-app" lxc preprod-app 20180924132439.235 ERRORlxccontainer - lxccontainer.c:wait_on_daemonized_start:840 - Received container state "ABORTING" instead of "RUNNING" lxc preprod-app 20180924132439.963 WARN conf - conf.c:lxc_map_ids:2917 - newuidmap binary is missing lxc preprod-app 20180924132439.101 WARN conf - conf.c:lxc_map_ids:2923 - newgidmap binary is missing lxc 20180924132439.380 WARN commands - commands.c:lxc_cmd_rsp_recv:130 - Connection reset by peer - Failed to receive response for command "get_state" # snap list Name Version Rev Tracking Publisher Notes core 16-2.35 5328 stable canonical✓ core lxd 3.5 8774 stablecanonical✓ - This is on Ubuntu 18.04. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] lxc publish - how to use "pigz" (parallel gzip) for compression?
On 2018-09-21 09:28, Stéphane Graber wrote: On Fri, Sep 21, 2018 at 09:22:46AM +0200, Tomasz Chmielewski wrote: On 2018-09-21 09:11, lxc-us...@licomonch.net wrote: > maybe not what you are looking for, but could work as workaround for the > moment: > mv /snap/core/4917/bin/gzip /snap/core/4917/bin/gzip_dist > ln -s /usr/bin/pigz /snap/core/4917/bin/gzip Nope, it's snap, read-only: mount -o bind /usr/bin/pigz /snap/core/4917/bin/gzip That may work. But won't survive core snap update or reboots. Are there any plans to support pigz (or "xz -T 0") "out of the box"? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] lxc publish - how to use "pigz" (parallel gzip) for compression?
On 2018-09-21 09:11, lxc-us...@licomonch.net wrote: maybe not what you are looking for, but could work as workaround for the moment: mv /snap/core/4917/bin/gzip /snap/core/4917/bin/gzip_dist ln -s /usr/bin/pigz /snap/core/4917/bin/gzip Nope, it's snap, read-only: # touch /snap/core/4917/bin/anything touch: cannot touch '/snap/core/4917/bin/anything': Read-only file system Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] lxc publish - how to use "pigz" (parallel gzip) for compression?
"lxc publish $container --alias $container" can take quite long even on a server with multiple CPUs. This is because "gzip" is used, which can only use one CPU, i.e.: gzip -c -n /var/snap/lxd/common/lxd/images/lxd_build_015181701/lxd_build_tar_202915839 Is it possible to specify an alternative compression program which is able to do parallel compression? I see it doesn't work for pigz: # lxc publish $container --alias $container --compression pigz Error: exec: "pigz": executable file not found in $PATH This is because it looks up the binary in /snap/core/4917/bin/ and pigz does not exist there. xz is able to do parallel compression (i.e. xz -T 0), but sadly, lxd is not using it, so it's even slower than gzip: # lxc publish $container --alias $container --compression xz And this notation is not allowed: # lxc publish $container --alias $container --compression "xz -T 0" Are there any possible workarounds to use parallel compression for "lxc publish"? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] lxc container network occasional problem with bridge network on bonding device
FYI I've seen a similar phenomenon when launching new containers. Sometimes, connectivity freezes for several seconds after that. What usually "helps" is sending an arping to the gateway IP from an affected container. Tomasz On 2018-09-17 06:02, toshinao wrote: Hi. I experienced occasional network problem of containers running on ubuntu server 18.04.1. Containers can communicate with host IP always and they can communicate sometimes to the other hosts but they are disconnected occasionally. When the problem occurs, the ping from the container to external hosts does not reach at all, but very rarely they recover after, for example, several hours later. Disconnection happens much more easily. The host network is organized by using netplan in the following topology. +-eno1-< <--lan_cable--> >-+ br0--bond0-+ +-- Cisco 3650 +-en02-< <--lan_cable--> >-+ The bonding mode is balance-a1b. I also found that if one of the LAN cables is physically disconnected, this problem has never happened. By using iptraf-ng, I watched the bridge device, the following br0, as well as the slave devices. Even if containers send a ping to the external hosts, no ping packet is detected, when they cannot communicate. Ping packets are detected by iptraf-ng on these devices when the communication is working. I guess this can be a low-level problem of virtual networking. Are there any suggestions to solve the problem ? Here's the detail of the setting. host's netplan setting network: version: 2 renderer: networkd ethernets: eno1: dhcp4: no eno2: dhcp4: no bonds: bond0: interfaces: [eno1, eno2] parameters: mode: balanec-a1b bridges: br0: interfaces: - bond0 addresses: [10.1.2.3/24] gateway4: 10.1.2.254 dhcp4: no host network interface status host# ip a s 1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eno1: mtu 1500 qdisc mq master bond0 state UP group default qlen 1000 link/ether 0b:25:b5:f2:e1:34 brd ff:ff:ff:ff:ff:ff 3: eno2: mtu 1500 qdisc mq master bond0 state UP group default qlen 1000 link/ether 0b:25:b5:f2:e1:35 brd ff:ff:ff:ff:ff:ff 4: br0: mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether 0a:1a:6c:85:ff:ed brd ff:ff:ff:ff:ff:ff inet 10.1.2.3/24 brd 10.1.2.255 scope global br0 valid_lft forever preferred_lft forever inet6 fe80::81a:6cff:fe85:ffed/64 scope link valid_lft forever preferred_lft forever 5: bond0: mtu 1500 qdisc noqueue master br0 state UP group default qlen 1000 link/ether 0a:54:4b:f2:d7:10 brd ff:ff:ff:ff:ff:ff 7: vethK4HOFU@if6: mtu 1500 qdisc noqueue master br0 state UP group default qlen 1000 link/ether fe:ca:07:3e:2b:2d brd ff:ff:ff:ff:ff:ff link-netnsid 0 inet6 fe80::fcca:7ff:fe3e:2b2d/64 scope link valid_lft forever preferred_lft forever 9: veth77HJ0V@if8: mtu 1500 qdisc noqueue master br0 state UP group default qlen 1000 link/ether fe:85:f0:ef:78:b2 brd ff:ff:ff:ff:ff:ff link-netnsid 1 inet6 fe80::fc85:f0ff:feef:78b2/64 scope link valid_lft forever preferred_lft forever container's network interface status root@bionic0:~# ip a s 1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 6: eth0@if7: mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether 
00:16:3e:cb:ef:ce brd ff:ff:ff:ff:ff:ff link-netnsid 0 inet 10.1.2.20/24 brd 10.1.2.255 scope global eth0 valid_lft forever preferred_lft forever inet6 fe80::216:3eff:fecb:efce/64 scope link valid_lft forever preferred_lft forever ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
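A minimal sketch of the arping workaround mentioned at the top of this thread, run from inside an affected container: it probes the gateway so the ARP/FDB entries along the bond path get refreshed. The interface name eth0 and the gateway 10.1.2.254 come from the configuration above; the iputils-arping package name is an assumption.

# inside the affected container (run as root)
apt install iputils-arping
arping -c 3 -I eth0 10.1.2.254   # probe the gateway so the switch/bond tables relearn our MAC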
[lxc-users] lxc publish -> no space left on device
I'm using LXD 3.4 installed from snap on Ubuntu 18.04. I'm trying to publish a ~100 GB container snapshot; unfortunately, it ends with "no space left on device": # time lxc publish mysql-prod-slave/2018-09-13-15h-08m-58s-daily --alias db-testcontainer Error: Failed to copy file content: write /var/snap/lxd/common/lxd/images/lxd_build_309046357/lxd_build_tar_011920719: no space left on device real 1m14.851s user 0m0.052s sys 0m0.036s My default and only storage has over 1.7 TB free: # lxc storage list | NAME | DESCRIPTION | DRIVER | SOURCE | USED BY | | default | | btrfs | /data/lxd | 12 | # btrfs fi usage /data/lxd Overall: Device size: 5.18TiB Device allocated: 1.98TiB Device unallocated: 3.20TiB Device missing: 0.00B Used: 1.64TiB Free (estimated): 1.76TiB (min: 1.76TiB) Data ratio: 2.00 Metadata ratio: 2.00 Global reserve: 512.00MiB (used: 0.00B) (...) My root (/) partition is only 20 GB in size, which is why the process fails. Why isn't LXD using the "default" storage for images? Is there a way to persuade LXD to use the default storage also for temporary files when publishing the images (/var/snap/lxd/common/lxd/images/)? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
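One possible workaround, sketched under the assumption that the snap's image directory can simply live on the big btrfs filesystem: stop LXD, move /var/snap/lxd/common/lxd/images there and bind-mount it back, so the temporary lxd_build_* files created by "lxc publish" land on the large pool. The /data/lxd-images path is a placeholder, and this is not an officially documented knob.

systemctl stop snap.lxd.daemon
mkdir -p /data/lxd-images
mv /var/snap/lxd/common/lxd/images/* /data/lxd-images/ 2>/dev/null
mount --bind /data/lxd-images /var/snap/lxd/common/lxd/images
echo '/data/lxd-images /var/snap/lxd/common/lxd/images none bind 0 0' >> /etc/fstab   # keep it across reboots
systemctl start snap.lxd.daemon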
[lxc-users] lxc list -> "Error: Failed to fetch http://unix.socket/1.0: 500 Internal Server Error"
On some Ubuntu 18.04 systems running LXD from snap I'm getting the following when using any lxc command: # lxc shell some-container Error: Failed to fetch http://unix.socket/1.0: 500 Internal Server Error or: # lxc list Error: Get http://unix.socket/1.0: EOF Restarting lxd with the following command does not help: # systemctl restart snap.lxd.daemon The existing containers seem to be running just fine (except it's not possible to use lxc list, lxc exec, lxc shell etc.). The affected and unaffected systems are running the following versions: # snap list Name Version Rev Tracking Publisher Notes core 16-2.34.3 5145 stable canonical core lxd 3.4 8393 stable canonical - Full system restart helps, but it would be great to know if there is a "better" fix. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
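A few hedged diagnostic steps that may narrow down why the API returns 500/EOF without rebooting the whole machine; these commands only inspect state and are not a confirmed fix.

snap logs lxd -n 200                  # recent daemon output, including any Go panics
journalctl -u snap.lxd.daemon -n 200  # same, via systemd's journal
ps aux | grep '[l]xd'                 # is the lxd process actually running?
/snap/bin/lxd waitready --timeout 60  # ask the daemon whether it ever becomes ready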
Re: [lxc-users] "lxc list" on Linux 4.18: cannot perform readlinkat() on the mount namespace file descriptor of the init process: Permission denied
On 2018-08-15 12:06, Christian Brauner wrote: On Wed, Aug 15, 2018 at 11:49:40AM +, Tomasz Chmielewski wrote: # lxc list cannot perform readlinkat() on the mount namespace file descriptor of the init process: Permission denied Where is this error coming from? It's not from LX{C,D} What does lxc info show? It looks like some apparmor / snap issue: https://bugs.launchpad.net/snapd/+bug/1786889 Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] "lxc list" on Linux 4.18: cannot perform readlinkat() on the mount namespace file descriptor of the init process: Permission denied
# lxc list cannot perform readlinkat() on the mount namespace file descriptor of the init process: Permission denied # dmesg -c [ 1554.529049] audit: type=1400 audit(1534333565.580:49): apparmor="DENIED" operation="ptrace" profile="/snap/core/5145/usr/lib/snapd/snap-confine" pid=2636 comm="snap-confine" requested_mask="read" denied_mask="read" peer="unconfined" This is after upgrading the kernel to: # uname -a Linux lxd05 4.18.0-041800-generic #201808122131 SMP Sun Aug 12 21:33:20 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux # snap list Name Version Rev Tracking Publisher Notes core 16-2.34.3 5145 stable canonical core lxd 3.3 8011 stable canonical - # cat /etc/issue Ubuntu 18.04.1 LTS \n \l Expected? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
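A small sketch for confirming this is the snap-confine AppArmor/ptrace denial tracked in the snapd bug linked in the reply above, and for checking whether an updated core snap is already published. Purely diagnostic; the grep patterns are assumptions.

dmesg | grep 'apparmor="DENIED"' | grep snap-confine   # the ptrace read denial
snap refresh --list                                    # is a newer core/snapd available?
snap refresh core                                      # pull it in if so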
Re: [lxc-users] "error: LXD still not running after 5 minutes" - failed lxd.migrate - how to recover?
On 2018-08-08 21:06, Tomasz Chmielewski wrote: I've tried to migrate from deb to snap on Ubuntu 18.04. Unfortunately, lxd.migrate failed with "error: LXD still not running after 5 minutes": (...) Not sure how to recover now? The containers seem intact in /var/lib/lxd/ It seems it's partially migrated with no clear info on how to continue. Attempting to do "systemctl start lxd" produces: Error: failed to open cluster database: failed to ensure schema: schema version '9' is more recent than expected '7' Attempting to start lxd from the snap results in: # /snap/bin/lxc list Error: Both native and snap packages are installed on this system Run "lxd.migrate" to complete your migration to the snap package # systemctl status lxd ● lxd.service - LXD - main daemon Loaded: loaded (/lib/systemd/system/lxd.service; indirect; vendor preset: enabled) Active: activating (start-post) (Result: exit-code) since Wed 2018-08-08 19:28:57 UTC; 9s ago Docs: man:lxd(1) Process: 6829 ExecStart=/usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log (code=exited, status=1/FAILURE) Process: 6824 ExecStartPre=/usr/lib/x86_64-linux-gnu/lxc/lxc-apparmor-load (code=exited, status=0/SUCCESS) Main PID: 6829 (code=exited, status=1/FAILURE); Control PID: 6831 (lxd) Tasks: 8 CGroup: /system.slice/lxd.service └─6831 /usr/lib/lxd/lxd waitready --timeout=600 Aug 08 19:28:57 b1 systemd[1]: Starting LXD - main daemon... Aug 08 19:28:57 b1 lxd[6829]: lvl=warn msg="AppArmor support has been disabled because of lack of kernel support" t=2018-08-08T19:28:57+ Aug 08 19:28:57 b1 lxd[6829]: lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored." t=2018-08-08T19:28:57+ Aug 08 19:28:58 b1 lxd[6829]: lvl=eror msg="Failed to start the daemon: failed to open cluster database: failed to ensure schema: schema version '9' is more recent than expected '7'" t=2018-08-08T19:28:58+ Aug 08 19:28:58 b1 lxd[6829]: Error: failed to open cluster database: failed to ensure schema: schema version '9' is more recent than expected '7' Aug 08 19:28:58 b1 systemd[1]: lxd.service: Main process exited, code=exited, status=1/FAILURE # lxd.migrate => Connecting to source server error: Unable to connect to the source LXD: Get http://unix.socket/1.0: EOF # dpkg -l|grep lxd ii lxd 3.0.1-0ubuntu1~18.04.1 amd64Container hypervisor based on LXC - daemon ii lxd-client3.0.1-0ubuntu1~18.04.1 amd64Container hypervisor based on LXC - client # snap list Name VersionRev Tracking Publisher Notes core 16-2.34.3 5145 stablecanonical core lxd 3.38011 stablecanonical - -- Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
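The "schema version '9' is more recent than expected '7'" error suggests the snap's newer LXD already upgraded the database, so the older deb LXD can no longer open it; before retrying lxd.migrate it may help to confirm exactly which versions sit on each side. A hedged, read-only checklist:

dpkg -l | grep '^ii  lxd'     # deb LXD (3.0.x in this thread)
snap list lxd                 # snap LXD (3.3 in this thread)
/snap/bin/lxd --version
ls -la /var/lib/lxd           # the containers and database should still be present here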
[lxc-users] "error: LXD still not running after 5 minutes" - failed lxd.migrate - how to recover?
I've tried to migrate from deb to snap on Ubuntu 18.04. Unfortunately, lxd.migrate failed with "error: LXD still not running after 5 minutes": root@b1 ~ # lxd.migrate => Connecting to source server => Connecting to destination server => Running sanity checks === Source server LXD version: 3.0.1 LXD PID: 2656 Resources: Containers: 6 Images: 4 Networks: 1 Storage pools: 1 === Destination server LXD version: 3.3 LXD PID: 12791 Resources: Containers: 0 Images: 0 Networks: 0 Storage pools: 0 The migration process will shut down all your containers then move your data to the destination LXD. Once the data is moved, the destination LXD will start and apply any needed updates. And finally your containers will be brought back to their previous state, completing the migration. WARNING: /var/lib/lxd is a mountpoint. You will need to update that mount location after the migration. Are you ready to proceed (yes/no) [default=no]? yes => Shutting down the source LXD => Stopping the source LXD units => Stopping the destination LXD unit => Unmounting source LXD paths => Unmounting destination LXD paths => Wiping destination LXD clean => Backing up the database => Moving the /var/lib/lxd mountpoint => Updating the storage backends => Starting the destination LXD => Waiting for LXD to come online error: LXD still not running after 5 minutes. root@b1 ~ # lxd.migrate => Connecting to source server error: Unable to connect to the source LXD: Get http://unix.socket/1.0: dial unix /var/lib/lxd/unix.socket: connect: no such file or directory root@b1 ~ # lxc list Error: Get http://unix.socket/1.0: dial unix /var/lib/lxd/unix.socket: connect: no such file or directory Not sure how to recover now? The containers seem intact in /var/lib/lxd/ Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] Error: Error opening startup config file: "loading config file for the container failed" after upgrading to LXD 3.0.1
This is what's happening after upgrading to LXD 3.0.1 and trying to run "lxc shell": # lxc shell webpagetest Error: Error opening startup config file: "loading config file for the container failed" Error: EOF # lxc exec webpagetest /bin/bash Error: Error opening startup config file: "loading config file for the container failed" Error: EOF "lxc config edit webpagetest" works fine. How to fix that? Tried "systemctl restart lxd", but it didn't help. # dpkg -l|grep lxd ii lxd 3.0.1-0ubuntu1~16.04.1 amd64Container hypervisor based on LXC - daemon ii lxd-client3.0.1-0ubuntu1~16.04.1 amd64Container hypervisor based on LXC - client # cat /etc/issue Ubuntu 16.04.4 LTS \n \l Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
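A hedged first debugging step: the per-container log kept by LXD often records which generated configuration line LXC rejected. The --show-log flag is part of the lxc client; the log path below is the deb install default and the container name is taken from the post.

lxc info webpagetest --show-log
ls -la /var/log/lxd/webpagetest/   # the generated lxc.conf and lxc.log usually live here on deb installs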
Re: [lxc-users] filesystem full -> lxd won't start after freeing space
Tada! Works now, thanks a lot! # systemctl status snap.lxd.daemon ● snap.lxd.daemon.service - Service for snap application lxd.daemon Loaded: loaded (/etc/systemd/system/snap.lxd.daemon.service; enabled; vendor preset: enabled) Active: active (running) since Mon 2018-05-07 14:45:31 CEST; 2min 35s ago Process: 39652 ExecStop=/usr/bin/snap run --command=stop lxd.daemon (code=exited, status=0/SUCCESS) Main PID: 39893 (daemon.start) CGroup: /system.slice/snap.lxd.daemon.service ‣ 39893 /bin/sh /snap/lxd/6954/commands/daemon.start May 07 14:45:33 lxd05 lxd.daemon[3336]: 5: fd: 13: pids May 07 14:45:33 lxd05 lxd.daemon[3336]: 6: fd: 14: devices May 07 14:45:33 lxd05 lxd.daemon[3336]: 7: fd: 15: cpuset May 07 14:45:33 lxd05 lxd.daemon[3336]: 8: fd: 16: blkio May 07 14:45:33 lxd05 lxd.daemon[3336]: 9: fd: 17: cpu,cpuacct May 07 14:45:33 lxd05 lxd.daemon[3336]: 10: fd: 18: freezer May 07 14:45:33 lxd05 lxd.daemon[3336]: 11: fd: 19: name=systemd May 07 14:45:33 lxd05 lxd.daemon[3336]: lxcfs.c: 105: do_reload: lxcfs: reloaded May 07 14:45:35 lxd05 lxd.daemon[39893]: => LXD is ready May 07 14:46:55 lxd05 lxd.daemon[39893]: lvl=warn msg="Detected poll(POLLNVAL) event." t=2018-05-07T12:46:55+ Tomasz On 2018-05-07 20:56, Free Ekanayaka wrote: Hello, I set a tarball of the repaired database to Tomasz privately. I simply deleted the last raft log entry which was broken, although there was not enough information to understand what caused the issue. There have been several fixes in dqlite in that area (rollback of failed transactions), so hopefully this won't happen again with LXD 3.1.0. Free Free Ekanayaka <free.ekanay...@canonical.com> writes: Hello, yes, I received it. I'm going to get through in the next few hours and get back to you. Cheers Tomasz Chmielewski <man...@wpkg.org> writes: On 2018-05-06 18:02, Tomasz Chmielewski wrote: Please would send us tar with the content /var/snap/lxd/common/lxd/database? (or /var/snap/lxd/common/lxd/raft/, depending on which version of the snap you use). I believe this particular crash has been solved in our master branches, but probably it's not the build you have. I'll give a look at the data you send to confirm that, and possibly post a workaround. I've sent it to your free.*@canonical* address, let me know if you need more info. Could you confirm if you've received the archive? It would be great to know if it's possible to make LXD run again, or if it crashed for good and I have to recreate the containers somehow. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] filesystem full -> lxd won't start after freeing space
On 2018-05-06 18:02, Tomasz Chmielewski wrote: Please would send us tar with the content /var/snap/lxd/common/lxd/database? (or /var/snap/lxd/common/lxd/raft/, depending on which version of the snap you use). I believe this particular crash has been solved in our master branches, but probably it's not the build you have. I'll give a look at the data you send to confirm that, and possibly post a workaround. I've sent it to your free.*@canonical* address, let me know if you need more info. Could you confirm if you've received the archive? It would be great to know if it's possible to make LXD run again, or if it crashed for good and I have to recreate the containers somehow. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] filesystem full -> lxd won't start after freeing space
On 2018-05-06 17:55, Free Ekanayaka wrote: Perhaps it's related to the disk full issue? Maybe Stephane has more insight bout this. About the panic, I assume it's still the case that you can't start the LXD daemon because it panics at startup all the times? Correct. Please would send us tar with the content /var/snap/lxd/common/lxd/database? (or /var/snap/lxd/common/lxd/raft/, depending on which version of the snap you use). I believe this particular crash has been solved in our master branches, but probably it's not the build you have. I'll give a look at the data you send to confirm that, and possibly post a workaround. I've sent it to your free.*@canonical* address, let me know if you need more info. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
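For anyone hitting the same panic and needing to share the same data, a minimal sketch of producing that archive; stopping the daemon first is an assumption made here to get a consistent copy, and the path is the snap default mentioned above.

systemctl stop snap.lxd.daemon
tar -C /var/snap/lxd/common/lxd -czf /root/lxd-database.tar.gz database
systemctl start snap.lxd.daemon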
Re: [lxc-users] filesystem full -> lxd won't start after freeing space
On 2018-05-06 17:27, Tomasz Chmielewski wrote: On 2018-05-06 17:16, Tomasz Chmielewski wrote: On 2018-05-06 17:07, Tomasz Chmielewski wrote: I have a Ubuntu 16.04 server with LXD 3.0 installed from snap. I've filled the disk to 100% to get "No space left on device". (...) The same error shows up after freeing space and restarting the server. How to debug this? Also found this (panic: txn not found): # snap logs -n 1000 lxd (...) 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.57677: fsm: restore=1101 filename: db.bin I've copied /var/snap/lxd/common/lxd/database/global/db.bin to /tmp/db.bin - and I'm able to open it with sqlite3. Not sure what's broken there. This issue is very similar: https://github.com/lxc/lxd/issues/4465 Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] filesystem full -> lxd won't start after freeing space
On 2018-05-06 17:16, Tomasz Chmielewski wrote: On 2018-05-06 17:07, Tomasz Chmielewski wrote: I have a Ubuntu 16.04 server with LXD 3.0 installed from snap. I've filled the disk to 100% to get "No space left on device". (...) The same error shows up after freeing space and restarting the server. How to debug this? Also found this (panic: txn not found): # snap logs -n 1000 lxd (...) 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.57677: fsm: restore=1101 filename: db.bin I've copied /var/snap/lxd/common/lxd/database/global/db.bin to /tmp/db.bin - and I'm able to open it with sqlite3. Not sure what's broken there. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] filesystem full -> lxd won't start after freeing space
On 2018-05-06 17:07, Tomasz Chmielewski wrote: I have a Ubuntu 16.04 server with LXD 3.0 installed from snap. I've filled the disk to 100% to get "No space left on device". (...) The same error shows up after freeing space and restarting the server. How to debug this? Also found this (panic: txn not found): # snap logs -n 1000 lxd (...) 2018-05-06T08:08:06Z systemd[1]: Stopping Service for snap application lxd.daemon... 2018-05-06T08:08:06Z lxd.daemon[4176]: => Stop reason is: crashed 2018-05-06T08:08:06Z systemd[1]: Stopped Service for snap application lxd.daemon. 2018-05-06T08:08:06Z systemd[1]: Started Service for snap application lxd.daemon. 2018-05-06T08:08:06Z lxd.daemon[4215]: => Preparing the system 2018-05-06T08:08:06Z lxd.daemon[4215]: ==> Loading snap configuration 2018-05-06T08:08:06Z lxd.daemon[4215]: ==> Setting up mntns symlink 2018-05-06T08:08:06Z lxd.daemon[4215]: ==> Setting up kmod wrapper 2018-05-06T08:08:06Z lxd.daemon[4215]: ==> Preparing /boot 2018-05-06T08:08:06Z lxd.daemon[4215]: ==> Preparing a clean copy of /run 2018-05-06T08:08:06Z lxd.daemon[4215]: ==> Preparing a clean copy of /etc 2018-05-06T08:08:06Z lxd.daemon[4215]: ==> Setting up ceph configuration 2018-05-06T08:08:06Z lxd.daemon[4215]: ==> Setting up LVM configuration 2018-05-06T08:08:06Z lxd.daemon[4215]: ==> Rotating logs 2018-05-06T08:08:06Z lxd.daemon[4215]: ==> Escaping the systemd cgroups 2018-05-06T08:08:06Z lxd.daemon[4215]: ==> Escaping the systemd process resource limits 2018-05-06T08:08:06Z lxd.daemon[4215]: ==> Detected kernel with partial AppArmor support 2018-05-06T08:08:06Z lxd.daemon[4215]: => Re-using existing LXCFS 2018-05-06T08:08:06Z lxd.daemon[4215]: => Starting LXD 2018-05-06T08:08:06Z lxd.daemon[4215]: lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored." 
t=2018-05-06T08:08:06+ 2018-05-06T08:08:06Z lxd.daemon[4215]: panic: txn not found 2018-05-06T08:08:06Z lxd.daemon[4215]: trace: 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.57659: fsm: restore=1101 start 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.57659: fsm: restore=1101 database size: 4096 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.57660: fsm: restore=1101 wal size: 325512 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.57677: fsm: restore=1101 filename: db.bin 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.57677: fsm: restore=1101 transaction ID: 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.57732: fsm: restore=1101 open follower: db.bin 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.57817: fsm: restore=1101 done 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.58869: fsm: term=1 index=1102 cmd=frames txn=1101 pages=3 commit=1 start 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.59327: fsm: term=1 index=1102 cmd=frames txn=1101 pages=3 commit=1 unregister txn 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.59327: fsm: term=1 index=1102 cmd=frames txn=1101 pages=3 commit=1 done 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.59333: fsm: term=1 index=1103 cmd=frames txn=1102 pages=3 commit=1 start 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.59377: fsm: term=1 index=1103 cmd=frames txn=1102 pages=3 commit=1 unregister txn 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.59378: fsm: term=1 index=1103 cmd=frames txn=1102 pages=3 commit=1 done 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.60338: fsm: term=1 index=1104 cmd=frames txn=1103 pages=3 commit=1 start 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.60400: fsm: term=1 index=1104 cmd=frames txn=1103 pages=3 commit=1 unregister txn 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.60400: fsm: term=1 index=1104 cmd=frames txn=1103 pages=3 commit=1 done 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.60407: fsm: term=1 index=1105 cmd=frames txn=1104 pages=11 commit=1 start 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.60461: fsm: term=1 index=1105 cmd=frames txn=1104 pages=11 commit=1 unregister txn 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.60461: fsm: term=1 index=1105 cmd=frames txn=1104 pages=11 commit=1 done 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.60462: fsm: term=1 index=1106 cmd=frames txn=1105 pages=3 commit=1 start 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.60501: fsm: term=1 index=1106 cmd=frames txn=1105 pages=3 commit=1 unregister txn 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.60501: fsm: term=1 index=1106 cmd=frames txn=1105 pages=3 commit=1 done 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.60502: fsm: term=1 index=1107 cmd=frames txn=1106 pages=3 commit=1 start 2018-05-06T08:08:06Z lxd.daemon[4215]: 2018-05-06 08:08:06.60540: fsm: t
Re: [lxc-users] Migration from LXD 2.x packages to LXD 3.0 snap
How exactly did you upgrade from 2.x deb to 3.0 snap? I've made a test "upgrade" on one server, but it didn't really result in an upgrade: - the old deb install was left with its containers and settings - the new snap version was installed in a different place, and was managing its own containers and own settings Though I didn't investigate later how to move the containers from the deb setup to snap setup (I could probably import/export, but it would take a while on production servers). Tomasz Chmielewski https://lxadm.com On 2018-05-05 04:55, Steven Spencer wrote: Thomas, I don't know if you have been able to answer your own question or not, but moving from 2.x on my workstation to 3.x did not interrupt the containers I had running on my local machine. It does not mean that if you have a specific filesystem back end (btrfs, zfs) that there isn't something that you need to do. Hopefully you've been able to continue on! Steve On Fri, Apr 20, 2018 at 11:42 AM Thomas Ward <tew...@ubuntu.com> wrote: I'm currently using the Ubuntu backports repositories in 16.04 to get LXD 2.x packages. LXD 3.0 was released recently, and it seems to work better with the networking now, getting over issues I had within the LXD 2.x series snaps. However, I've got a bunch of containers running within the LXD 2.x infrastructure. Is there any documentation on how I go about moving from the LXD 2.x packages to the LXD 3.0 snap? Short of rebuilding the entire system again, that is. Thomas ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
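For reference, a rough sketch of the intended deb-to-snap path (hedged; exact behaviour depends on the LXD versions involved, and lxd.migrate is meant to move the data from /var/lib/lxd into /var/snap/lxd/common/lxd rather than leave two parallel installs):

snap install lxd
lxd.migrate                 # interactive: shuts containers down, moves data, restarts them
apt purge lxd lxd-client    # only if lxd.migrate did not already remove the deb packages
/snap/bin/lxc list          # the client now lives under /snap/bin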
Re: [lxc-users] bionic image not getting IPv4 address
On 2018-05-03 21:57, Fajar A. Nugraha wrote: I can confirm this. Seeing the same issue. BTW. It's the /etc/netplan/10-lxc.yaml Not working (current) version: network: ethernets: eth0: {dhcp4: true} version: 2 Working version (for me): network: version: 2 ethernets: eth0: dhcp4: true Works for me. Both with images:ubuntu/bionic (which has /etc/netplan/10-lxc.yaml, identical to your 'not working' one) and ubuntu:bionic (which has /etc/netplan/50-cloud-init.yaml). Then again the images:ubuntu/bionic one has '20180503_11:06' in its description, so it's possible that the bug was fixed recently. Indeed, the bug seems now fixed in the bionic image and new containers are getting IPv4 via DHCP again: | | 88a22ac497ad | no | Ubuntu bionic amd64 (20180503_03:49) | x86_64 | 104.71MB | May 3, 2018 at 8:51am (UTC) | This one was producing broken /etc/netplan/10-lxc.yaml: | | 87b5c0fec8ff | no | Ubuntu bionic amd64 (20180502_09:49) | x86_64 | 118.15MB | May 3, 2018 at 2:39am (UTC) | Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] bionic image not getting IPv4 address
On 2018-05-03 15:28, Kees Bos wrote: I can confirm this. Seeing the same issue. BTW. It's the /etc/netplan/10-lxc.yaml Not working (current) version:

network:
  ethernets:
    eth0: {dhcp4: true}
version: 2

Working version (for me):

network:
  version: 2
  ethernets:
    eth0:
      dhcp4: true

Indeed, I can confirm it's some netplan-related issue with /etc/netplan/10-lxc.yaml. Working version for bionic containers set up before 2018-May-02:

network:
  ethernets:
    eth0: {dhcp4: true}
  version: 2

Broken version for bionic containers set up after 2018-May-02:

network:
  ethernets:
    eth0: {dhcp4: true}
version: 2

Please note that the broken one has no indentation (two spaces) before "version: 2" - this is the only thing that differs, and it is what breaks DHCPv4. What's responsible for this? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
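Until the image is fixed, a hedged way to repair an already-launched container is to rewrite /etc/netplan/10-lxc.yaml with correct indentation and re-apply netplan from inside the container. The container name is a placeholder; the YAML mirrors the working snippet above.

lxc exec bionic-broken-dhcp -- sh -c 'cat > /etc/netplan/10-lxc.yaml <<EOF
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: true
EOF
netplan apply'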
Re: [lxc-users] bionic image not getting IPv4 address
On 2018-05-03 12:09, Tomasz Chmielewski wrote: On 2018-05-03 11:58, Mark Constable wrote: On 5/3/18 12:42 PM, Tomasz Chmielewski wrote: > Today or yesterday, bionic image launched in LXD is not getting an IPv4 > address. It is getting an IPv6 address. If you do a "lxc profile show default" you will probably find it doesn't have an IPv4 network attached by default. I haven't yet found a simple step by step howto example of how to setup a network for v3.0 but in my case I use a bridge on my host and create a new profile that includes... lxc network attach-profile lxdbr0 [profile name] eth0 then when I manually launch a container I use something like... lxc launch images:ubuntu-core/16 uc1 -p [profile name] The bionic container is attached to a bridge with IPv4 networking. Besides, a xenial container is getting an IPv4 address just fine, while bionic is not. The issue is not LXD 3.0 specific - I'm able to reproduce this on servers with LXD 2.21 and LXD 3.0. I'm able to reproduce this issue with these LXD servers: - Ubuntu 16.04 with LXD 2.21 from deb - Ubuntu 18.04 with LXD 3.0.0 from deb - Ubuntu 16.04 with LXD 3.0.0 from snap Reproducing is easy: # lxc launch images:ubuntu/bionic/amd64 bionic-broken-dhcp Then wait a few seconds until it starts - "lxc list" will show it has an IPv6 address (if your bridge was configured to provide IPv6), but not IPv4 (and you can confirm by doing "lxc shell", too): # lxc list On the other hand, this works fine with xenial, and "lxc list" will show this container is getting an IPv4 address: # lxc launch images:ubuntu/xenial/amd64 xenial-working-dhcp Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] bionic image not getting IPv4 address
On 2018-05-03 12:14, David Favor wrote: Mark Constable wrote: On 5/3/18 12:42 PM, Tomasz Chmielewski wrote: Today or yesterday, bionic image launched in LXD is not getting an IPv4 address. It is getting an IPv6 address. If you do a "lxc profile show default" you will probably find it doesn't have an IPv4 network attached by default. I haven't yet found a simple step by step howto example of how to setup a network for v3.0 but in my case I use a bridge on my host and create a new profile that includes... lxc network attach-profile lxdbr0 [profile name] eth0 then when I manually launch a container I use something like... lxc launch images:ubuntu-core/16 uc1 -p [profile name] Be aware there is a bug in Bionic packaging, so if you upgrade machine level OS from any previous OS version to Bionic, LXD networking becomes broken... so badly... no Ubuntu or LXD developer has figured out a fix. To avoid this, move all containers off the machine... via... lxc stop lxc copy local:cname offsite:cname Then do a fresh Bionic install at machine level. Then install LXD via SNAP (which is only LXD install option on Bionic). Once done, you're good to go... Just ensure... I'm having an issue with *new* (2018-May-02 onwards) bionic containers not getting IPv4 addresses. Bionic containers created before 2018-May-02 are getting IPv4 just fine. Here is how I launch *new* bionic containers: lxc launch images:ubuntu/bionic/amd64 bionictest Just try it yourself and see if this container is getting an IPv4 address. 1) You've setup routes for all your IP ranges to lxcbr0. All routes are fine. 2) You've added your IPV4 address to one of... /etc/netplan/* /etc/network/interfaces I'm talking about DHCP, not a static IP address. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] bionic image not getting IPv4 address
On 2018-05-03 11:58, Mark Constable wrote: On 5/3/18 12:42 PM, Tomasz Chmielewski wrote: > Today or yesterday, bionic image launched in LXD is not getting an IPv4 > address. It is getting an IPv6 address. If you do a "lxc profile show default" you will probably find it doesn't have an IPv4 network attached by default. I haven't yet found a simple step by step howto example of how to setup a network for v3.0 but in my case I use a bridge on my host and create a new profile that includes... lxc network attach-profile lxdbr0 [profile name] eth0 then when I manually launch a container I use something like... lxc launch images:ubuntu-core/16 uc1 -p [profile name] The bionic container is attached to a bridge with IPv4 networking. Besides, xenial container is getting IPv4 address just fine, while bionic is not. The issue is not LXD 3.0 specific - I'm able to reproduce this on servers with LXD 2.21 and LXD 3.0. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] bionic image not getting IPv4 address
On 2018-05-03 11:42, Tomasz Chmielewski wrote: I was able to reproduce it on two different LXD servers. This used to work a few days ago. Also - xenial containers are getting IPv4 address just fine. Here is the output of "systemctl status systemd-networkd" on a bionic container launched yesterday, with working DHCP (it's also getting IPv4 after restart etc.): # systemctl status systemd-networkd ● systemd-networkd.service - Network Service Loaded: loaded (/lib/systemd/system/systemd-networkd.service; enabled-runtime; vendor preset: enabled) Active: active (running) since Thu 2018-05-03 02:46:14 UTC; 6min ago Docs: man:systemd-networkd.service(8) Main PID: 45 (systemd-network) Status: "Processing requests..." Tasks: 1 (limit: 4915) CGroup: /system.slice/systemd-networkd.service └─45 /lib/systemd/systemd-networkd May 03 02:46:14 a19ea62218-2018-05-02-11-12-12 systemd-networkd[45]: Enumeration completed May 03 02:46:14 a19ea62218-2018-05-02-11-12-12 systemd[1]: Started Network Service. May 03 02:46:14 a19ea62218-2018-05-02-11-12-12 systemd-networkd[45]: eth0: DHCPv4 address 10.190.0.95/24 via 10.190.0.1 May 03 02:46:14 a19ea62218-2018-05-02-11-12-12 systemd-networkd[45]: Not connected to system bus, not setting hostname. May 03 02:46:16 a19ea62218-2018-05-02-11-12-12 systemd-networkd[45]: eth0: Gained IPv6LL May 03 02:46:16 a19ea62218-2018-05-02-11-12-12 systemd-networkd[45]: eth0: Configured Here is the output of "systemctl status systemd-networkd" on a bionic container launched today - DHCPv4 is not working (I can get IPv4 there by running "dhclient eth0" manually, but that's not how it should work): # systemctl status systemd-networkd ● systemd-networkd.service - Network Service Loaded: loaded (/lib/systemd/system/systemd-networkd.service; enabled; vendor preset: enabled) Active: active (running) since Thu 2018-05-03 02:49:10 UTC; 3min 36s ago Docs: man:systemd-networkd.service(8) Main PID: 54 (systemd-network) Status: "Processing requests..." Tasks: 1 (limit: 4915) CGroup: /system.slice/systemd-networkd.service └─54 /lib/systemd/systemd-networkd May 03 02:49:10 tomasztest systemd[1]: Starting Network Service... May 03 02:49:10 tomasztest systemd-networkd[54]: Enumeration completed May 03 02:49:10 tomasztest systemd[1]: Started Network Service. May 03 02:49:11 tomasztest systemd-networkd[54]: eth0: Gained IPv6LL Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] bionic image not getting IPv4 address
Today or yesterday, bionic image launched in LXD is not getting an IPv4 address. It is getting an IPv6 address. I'm launching the container like this: lxc launch images:ubuntu/bionic/amd64 bionictest Inside the container: 44: eth0@if45: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether 00:16:3e:4b:61:41 brd ff:ff:ff:ff:ff:ff link-netnsid 0 inet6 fd42:b5d:f6dd:6b21:216:3eff:fe4b:6141/64 scope global dynamic mngtmpaddr valid_lft 3553sec preferred_lft 3553sec inet6 fe80::216:3eff:fe4b:6141/64 scope link valid_lft forever preferred_lft forever I was able to reproduce it on two different LXD servers. This used to work a few days ago. Did anything change in bionic images recently? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] LXC Copying a running container
On 2018-04-20 21:58, Saint Michael wrote: I need to transfer a copy of a container that cannot be stopped. I don't mind if the data is slightly out of sync. Is there a way to do this? I tried lxc-copy and it fails because the source is running. rsync? On a destination server, create a container with the same operating system/version. Then copy it with rsync. Please note that any files which are open and in use will most likely be corrupt and unusable. This will almost certainly be true for any database files, e.g. MySQL. Also - if, on the source system, you have a filesystem with snapshotting functionality, you can snapshot, and copy the snapshot. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
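A hedged sketch of the rsync approach for a classic LXC container on dir-backed storage (paths and names are placeholders; for LXD the rootfs lives under /var/lib/lxd/containers instead). Running the sync twice narrows the window during which files change underneath it, but open database files still need a dump or a filesystem snapshot, as noted above.

# two passes: the second one only copies what changed during the first
for pass in 1 2; do
    rsync -aHAX --numeric-ids --delete \
        /var/lib/lxc/webserver/rootfs/ \
        root@desthost:/var/lib/lxc/webserver/rootfs/
done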
Re: [lxc-users] Announcing LXC, LXD and LXCFS 3.0 LTS
On 2018-04-12 06:25, Stéphane Graber wrote: The LXC, LXD and LXCFS teams are proud to announce the release of the 3.0 version of all three projects. Great news, great features! What's the best way to upgrade from a 2.xx deb to a 3.xx snap? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] LXC containers networking
On 2018-04-06 02:33, Bhangui, Avadhut Upendra wrote: Hello, I'm pretty new to using LXC containers. I have a requirement that the solution running inside the container should be able to communicate to services in public cloud and also with some services on the host machine. * How do I setup the networking of this container? * When it will try to communicate to the service on the host machine, will request be routed to machine over the physical network? I'd say best to attach two NICs to the container, with two network bridges: - one with a public IP (assuming the container needs a public IP) - one to a NIC with internal network only If the container doesn't need a public IP, then one NIC attached to the internal network should be enough. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
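If the containers are managed with LXD, a hedged sketch of attaching two NICs as suggested above; the bridge names br-public and br-internal must already exist on the host, and the container name app1 is a placeholder.

lxc config device add app1 eth0 nic nictype=bridged parent=br-public name=eth0
lxc config device add app1 eth1 nic nictype=bridged parent=br-internal name=eth1
lxc exec app1 -- ip a    # both interfaces should now show up inside the container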
[lxc-users] deb -> snap usage issues
I've noticed that cronjobs using "lxc" commands don't work on servers with LXD installed from a snap package. It turned out that: # which lxc /snap/bin/lxc Which is not in a $PATH when executing via cron. In other words - lxc binary is in a $PATH when installed from a deb package, but is not when installed from a snap package. Is it a bug, a feature? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
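Two hedged workarounds until/unless the snap PATH handling changes: either set PATH in the crontab or call the binary by its absolute path. The example uses /etc/cron.d syntax (with a user field) and a placeholder container name.

PATH=/snap/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
0 3 * * * root lxc snapshot mycontainer nightly
# or, without touching PATH:
0 3 * * * root /snap/bin/lxc snapshot mycontainer nightly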
Re: [lxc-users] LXD - detect old configuration keys
I'm also seeing it complaining with no raw.lxc keys. Tomasz Chmielewski https://lxadm.com On 2018-03-24 23:28, MonkZ wrote: Nope - so what are the other 0,01%? ;) architecture: x86_64 config: image.architecture: amd64 image.description: ubuntu 17.10 amd64 (release) (20180314) image.label: release image.os: ubuntu image.release: artful image.serial: "20180314" image.version: "17.10" volatile.base_image: c7cfac63aedfa6a5d68ae74bfa662bb03e97ada576dcd801768d617b3e59d3db volatile.eth0.hwaddr: 00:16:3e:67:8a:7a volatile.idmap.base: "2197152" volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":2197152,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":2197152,"Nsid":0,"Maprange":65536}]' volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":2197152,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":2197152,"Nsid":0,"Maprange":65536}]' volatile.last_state.power: RUNNING devices: {} ephemeral: false profiles: - default stateful: false description: "" I think it would be nice if LXD/LXC reports those keys to have an option to run a (verbose) sanity check on the config. MfG MonkZ On 19.03.2018 00:49, Andrey Repin wrote: Greetings, MonkZ! Hiho, i'm running LXD 2.21 on Ubuntu. With my latest upgrade i got warnings like "The configuration file contains legacy configuration keys. Please update your configuration file!" Is there a way to list those keys? 99,99% chances are it is lxc.raw ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] logging for applications in container
On 2018-02-17 02:20, Stepan Santalov wrote: Hello! Is there a way to log to host's machine syslog by applications, running in containers? I've googled but with no luck. rsyslog over TCP/UDP? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
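A minimal sketch of the rsyslog-over-UDP suggestion, assuming the host is reachable from the containers at 10.0.3.1 and that both sides run rsyslog; the IP, port and file names are assumptions.

# host side, e.g. /etc/rsyslog.d/10-listen.conf: accept syslog from the containers
module(load="imudp")
input(type="imudp" port="514")
# container side, e.g. /etc/rsyslog.d/90-forward.conf: forward everything to the host
*.* @10.0.3.1:514
# restart rsyslog on both sides afterwards
systemctl restart rsyslog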
Re: [lxc-users] Creating testcontainer error: Failed to fetch https://images.linuxcontainers.org:8443/1.0/images/ubuntu%2Fxenial%2Famd64: 404 Not Found
On 2017-12-27 01:17, Tomasz Chmielewski wrote: Trying to launch a new container, but it fails: # lxc launch images:ubuntu/xenial/amd64 testcontainer Creating testcontainer error: Failed to fetch https://images.linuxcontainers.org:8443/1.0/images/ubuntu%2Fxenial%2Famd64: 404 Not Found The issue started today, or a few days ago. What's interesting, the above command works on some other systems. ii lxd-client 2.21-0ubuntu2~16.04.1~ppa1 amd64Container hypervisor based on LXC - client Replying to myself - this fixed the issue: lxc remote set-url images https://images.linuxcontainers.org I had https://images.linuxcontainers.org:8443 as the URL, apparently it's no longer valid. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] Creating testcontainer error: Failed to fetch https://images.linuxcontainers.org:8443/1.0/images/ubuntu%2Fxenial%2Famd64: 404 Not Found
Trying to launch a new container, but it fails: # lxc launch images:ubuntu/xenial/amd64 testcontainer Creating testcontainer error: Failed to fetch https://images.linuxcontainers.org:8443/1.0/images/ubuntu%2Fxenial%2Famd64: 404 Not Found The issue started today, or a few days ago. What's interesting, the above command works on some other systems. ii lxd-client 2.21-0ubuntu2~16.04.1~ppa1 amd64Container hypervisor based on LXC - client Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] "lxc delete" freezes sometimes
"lxc delete containername --force" is sometimes freezing and does not return. ii lxd 2.20-0ubuntu4~16.04.1 amd64Container hypervisor based on LXC - daemon ii lxd-client 2.20-0ubuntu4~16.04.1 amd64Container hypervisor based on LXC - client ii liblxc1 2.1.1-0ubuntu1~ubuntu16.04.1~ppa1 amd64Linux Containers userspace tools (library) ii lxc-common 2.1.1-0ubuntu1~ubuntu16.04.1~ppa1 amd64Linux Containers userspace tools (common tools) ii lxcfs2.0.8-1ubuntu2~ubuntu16.04.1~ppa1 amd64FUSE based filesystem for LXC Server is hosted on AWS and is running linux-image-4.4.0-1043-aws kernel (Ubuntu 16.04). Container's /var/log/lxd/containername/lxc.log comtains: lxc 20171217104137.857 WARN lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory. lxc 20171217104137.857 WARN lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory. lxc 20171217104137.978 WARN lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory. lxc 20171217104137.978 WARN lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory. lxc 20171219063617.834 WARN lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory. lxc 20171221100816.533 WARN lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory. Not sure how to debug this. It kills our server deployment / automation. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] lxc exec - support for wildcards and/or variables?
lxc exec does not seem to support wildcards: # lxc exec $CONTAINER -- touch /tmp/file1 /tmp/file2 # lxc exec $CONTAINER -- ls /tmp/file* ls: cannot access '/tmp/file*': No such file or directory # lxc exec $CONTAINER -- ls /tmp/file\* ls: cannot access '/tmp/file*': No such file or directory So let's try by setting a variable, which works: # lxc exec --env LSFILES=/tmp/file* $CONTAINER -- env PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin container=lxc TERM=xterm-256color USER=root HOME=/root LSFILES=/tmp/file* <- it's set, here! LANG=C.UTF-8 But how to use it from lxc exec? - this obviously gets expanded on the host: # lxc exec --env LSFILES=/tmp/file* $CONTAINER -- echo $LSFILES - this is passed as a literal $LSFILES to the container: # lxc exec --env LSFILES=/tmp/file* $CONTAINER -- echo \$LSFILES $LSFILES How do I use the variables / wildcards with lxc exec? Say, I want to remove all /tmp/somefile* in the container. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
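The usual workaround for both problems is to run a shell inside the container and let that inner shell do the globbing and variable expansion; a small sketch (the paths are the ones from the examples above):

lxc exec $CONTAINER -- sh -c 'ls /tmp/file*'
lxc exec $CONTAINER -- sh -c 'rm -f /tmp/somefile*'
# a variable passed with --env is expanded by the inner shell the same way
lxc exec --env LSFILES='/tmp/file*' $CONTAINER -- sh -c 'ls $LSFILES'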
Re: [lxc-users] "The configuration file contains legacy configuration keys" - which ones?
On 2017-10-20 12:27, Stéphane Graber wrote: What's legacy there? Can you confirm you restarted that container? Yes, it was restarted. Also, the "lxc config show" isn't very useful if you don't pass it "--expanded" as you may have some raw.lxc config keys set through profiles. So please post "lxc config show --expanded proddeploymenttest" # lxc exec work02-2017-10-26-08-46-40 /bin/date The configuration file contains legacy configuration keys. Please update your configuration file! Thu Oct 26 09:55:10 UTC 2017 # lxc config show work02-2017-10-26-08-46-40 --expanded architecture: x86_64 config: environment.http_proxy: "" image.architecture: amd64 image.description: ubuntu 16.04 LTS amd64 (release) (20171011) image.label: release image.os: ubuntu image.release: xenial image.serial: "20171011" image.version: "16.04" user.network_mode: "" volatile.base_image: 61d54418874f2f84e24ddd6934b3bb759ca76cbc49820da7d34f8b5b778e4816 volatile.eth0.hwaddr: 00:16:3e:30:b0:e7 volatile.eth0.name: eth0 volatile.idmap.base: "0" volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":10,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":10,"Nsid":0,"Maprange":65536}]' volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":10,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":10,"Nsid":0,"Maprange":65536}]' volatile.last_state.power: RUNNING devices: eth0: nictype: bridged parent: lxdbr0 type: nic root: path: / pool: default type: disk ephemeral: false profiles: - default stateful: false description: "" Also listing what's in /usr/share/lxc/config/common.conf.d/ would be useful. # ls -l /usr/share/lxc/config/common.conf.d/ total 4 -rw-r--r-- 1 root root 103 Jul 5 22:24 00-lxcfs.conf # cat /usr/share/lxc/config/common.conf.d/00-lxcfs.conf lxc.hook.mount = /usr/share/lxcfs/lxc.mount.hook lxc.hook.post-stop = /usr/share/lxcfs/lxc.reboot.hook # dpkg -l|grep lxd ii lxd 2.18-0ubuntu3~16.04.2 amd64Container hypervisor based on LXC - daemon ii lxd-client 2.18-0ubuntu3~16.04.2 amd64Container hypervisor based on LXC - client Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] "The configuration file contains legacy configuration keys" - which ones?
I'm seeing some more "The configuration file contains legacy configuration keys": host# lxc shell proddeploymenttest The configuration file contains legacy configuration keys. <- Please update your configuration file! <- run-parts: /etc/update-motd.d/98-fsck-at-reboot exited with return code 1 mesg: ttyname failed: Success root@proddeploymenttest:~# What's legacy there? host# lxc config show proddeploymenttest architecture: x86_64 config: image.architecture: amd64 image.description: ubuntu 16.04 LTS amd64 (release) (20171011) image.label: release image.os: ubuntu image.release: xenial image.serial: "20171011" image.version: "16.04" volatile.base_image: 61d54418874f2f84e24ddd6934b3bb759ca76cbc49820da7d34f8b5b778e4816 volatile.eth0.hwaddr: 00:16:3e:83:ba:9b volatile.eth0.name: eth0 volatile.idmap.base: "0" volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":10,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":10,"Nsid":0,"Maprange":65536}]' volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":10,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":10,"Nsid":0,"Maprange":65536}]' volatile.last_state.power: RUNNING devices: {} ephemeral: false profiles: - default stateful: false description: "" host# dpkg -l|grep lxd ii lxd 2.18-0ubuntu3~16.04.2 amd64Container hypervisor based on LXC - daemon ii lxd-client 2.18-0ubuntu3~16.04.2 amd64Container hypervisor based on LXC - client Ubuntu 16.04.03 LTS. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] preventing multiple networks to connect to each other?
On 2017-10-02 03:25, Mike Wright wrote: On 10/01/2017 10:59 AM, Tomasz Chmielewski wrote: I would like to have several networks on the same host - so I've created them with: # lxc network create br-testing # lxc network create br-staging Then edited to match: # lxc network show br-staging config: ipv4.address: 10.191.0.1/24 ipv4.dhcp.ranges: 10.191.0.50-10.191.0.254 ipv4.nat: "false" # lxc network show br-testing config: ipv4.address: 10.190.0.1/24 ipv4.dhcp.ranges: 10.190.0.50-10.190.0.254 ipv4.nat: "false" The problem is I'd like these network to be separated - i.e. containers using br-staging bridge should not be able to connect to br-testing containers, and the other way around. Both networks should be able to connect to hosts in the internet. Is there any easy switch for that? So far, one thing which works is write my own iptables rules, but that gets messy with more networks. Is there any reason to keep them on the same subnet? They are not the same subnets (one is 10.190.0.1/24, the other is 10.191.0.1/24). How about: to the host 10.191.0.0/23 (or larger), then the subnets: 10.191.0.0/24 and 10.191.1.0/24. Then iptables could easily block them from each other: -s 10.191.0.0/24 -d 10.191.1.0/24 -j DROP and -s 10.191.1.0/24 -d 10.191.0.0/24 -d DROP. Like this, it won't work, because LXD adds iptables rules which pass all kinds of traffic between the networks. Also, you can see how the number of combinations grow if the number of network grow. Also, filtering by IP will not be secure in some environments - i.e. if a user in a container adds an IP from a different network, the rules will no longer apply. So we need to filter on the interface. So I figured I need to set: config: ipv4.address: 10.190.0.1/24 ipv4.dhcp.ranges: 10.190.0.50-10.190.0.254 ipv4.firewall: "false" # < important Then add firewall rules which allow internet connectivity, and prevent cross-bridge traffic. But would be cool if we were able to do it somehow in LXD network configuration. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
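A hedged sketch of interface-based isolation with ipv4.firewall disabled on both networks: matching on the bridge devices instead of source addresses means a container changing its own IP gains nothing. Rule ordering relative to other FORWARD rules, and the uplink name eth0, are assumptions.

# block cross-bridge traffic in both directions
iptables -I FORWARD -i br-testing -o br-staging -j DROP
iptables -I FORWARD -i br-staging -o br-testing -j DROP
# with ipv4.firewall disabled LXD no longer adds its own rules, so allow the uplink explicitly
iptables -A FORWARD -i br-testing -o eth0 -j ACCEPT
iptables -A FORWARD -i eth0 -o br-testing -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -i br-staging -o eth0 -j ACCEPT
iptables -A FORWARD -i eth0 -o br-staging -m state --state RELATED,ESTABLISHED -j ACCEPT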
[lxc-users] preventing multiple networks to connect to each other?
I would like to have several networks on the same host - so I've created them with: # lxc network create br-testing # lxc network create br-staging Then edited to match: # lxc network show br-staging config: ipv4.address: 10.191.0.1/24 ipv4.dhcp.ranges: 10.191.0.50-10.191.0.254 ipv4.nat: "false" # lxc network show br-testing config: ipv4.address: 10.190.0.1/24 ipv4.dhcp.ranges: 10.190.0.50-10.190.0.254 ipv4.nat: "false" The problem is I'd like these networks to be separated - i.e. containers using the br-staging bridge should not be able to connect to br-testing containers, and the other way around. Both networks should still be able to connect to hosts on the internet. Is there any easy switch for that? So far, the one thing which works is writing my own iptables rules, but that gets messy with more networks. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] "The configuration file contains legacy configuration keys" - which ones?
On 2017-09-27 22:03, Stéphane Graber wrote: On Wed, Sep 27, 2017 at 09:48:39PM +0900, Tomasz Chmielewski wrote: # lxc exec some-container /bin/bash The configuration file contains legacy configuration keys. Please update your configuration file! Is there a way to find out which ones are legacy without pasting the whole config on the mailing list? Tomasz Chmielewski https://lxadm.com In most cases, it's just that the container was started on LXC 2.0.x, so just restarting it will have LXD generate a new LXC 2.1 config for it, getting rid of the warning. If that warning remains, then it means you're using a "raw.lxc" config in your container or one of its profiles, and that this key is the one which contains a now-legacy config key. Details on key changes can be found here: https://discuss.linuxcontainers.org/t/lxc-2-1-has-been-released/487 The tl;dr is that for 99% of LXD users, all you need to do is restart your running containers so that they get switched to the new config format, no config change required. The remaining 1% is where you're using raw.lxc, which then needs manual updating to get rid of the warning. I use this one, as I have a newer kernel from Ubuntu ppa: config: raw.lxc: lxc.aa_allow_incomplete=1 So how exactly do I modify it? lxc.aa_allow_incomplete -> lxc.apparmor.allow_incomplete ? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
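That rename does match the LXC 2.1 key changes (the lxc.aa_* keys moved under lxc.apparmor.*), so updating the raw.lxc value and restarting should clear the warning; a small sketch, with the container name taken from the thread:

lxc config set some-container raw.lxc 'lxc.apparmor.allow_incomplete=1'
lxc restart some-container
lxc exec some-container -- true   # the legacy-keys warning should now be gone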
[lxc-users] "The configuration file contains legacy configuration keys" - which ones?
# lxc exec some-container /bin/bash The configuration file contains legacy configuration keys. Please update your configuration file! Is there a way to find out which ones are legacy without pasting the whole config on the mailing list? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] ?==?utf-8?q? OVS / GRE - guest-transparent mesh networking across multiple hosts
I think fan is single server only and / or won't cross different networks. You may also take a look at https://www.tinc-vpn.org/ Tomasz https://lxadm.com On Thursday, August 03, 2017 20:51 JST, Félix Archambaultwrote: > Hi Amblard, > > I have never used it, but this may be worth taking a look to solve your > problem: > > https://wiki.ubuntu.com/FanNetworking > > On Aug 3, 2017 12:46 AM, "Amaury Amblard-Ladurantie" > wrote: > > Hello, > > I am deploying 10< bare metal servers to serve as hosts for containers > managed through LXD. > As the number of container grows, management of inter-container > running on different hosts becomes difficult to manage and need to be > streamlined. > > The goal is to setup a 192.168.0.0/24 network over which containers > could communicate regardless of their host. The solutions I looked at > [1] [2] [3] recommend use of OVS and/or GRE on hosts and the use of > bridge.driver: openvswitch configuration for LXD. > Note: baremetal servers are hosted on different physical networks and > use of multicast was ruled out. > > An illustration of the goal architecture is similar to the image visible on > https://books.google.fr/books?id=vVMoDwAAQBAJ=PA168= > 6aJRw15HSf=PA197#v=onepage=false > Note: this extract is from a book about LXC, not LXD. > > The point that is not clear is > - whether each container needs to have as many veth as there are > baremetal host, in which case [de]commission of a new baremetal would > require configuration updated of all existing containers (and > basically rule out this scenario) > - or whether it is possible to "hide" this mesh network at the host > level and have a single veth inside each container to access the > private network and communicate with all the other containers > regardless of their physical location and regardeless of the number of > physical peers > > Has anyone built such a setup? > Does the OVS+GRE setup need to be build prior to LXD init or can LXD > automate part of the setup? > Online documentation is scarce on the topic so any help would be > appreciated. > > Regards, > Amaury > > [1] https://stgraber.org/2016/10/27/network-management-with-lxd-2-3/ > [2] https://stackoverflow.com/questions/39094971/want-to-use > -the-vlan-feature-of-openvswitch-with-lxd-lxc > [3] https://bayton.org/docs/linux/lxd/lxd-zfs-and-bridged-ne > tworking-on-ubuntu-16-04-lts/ > > > ___ > lxc-users mailing list > lxc-users@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-users ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
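A hedged sketch of the OVS + GRE idea between two of the hosts: each host gets one OVS bridge and one GRE port per remote peer, and the containers only ever see a single eth0 attached to that bridge, so adding a host means adding ports on the hosts, not reconfiguring containers. The bridge name br-mesh, the example public IPs, and the assumption that LXD's default profile carries the eth0 device are all placeholders; a full mesh of more hosts also needs loop prevention (RSTP or a proper overlay).

# host A (public IP 203.0.113.1), peering with host B (203.0.113.2)
ovs-vsctl add-br br-mesh
ovs-vsctl add-port br-mesh gre-b -- set interface gre-b type=gre options:remote_ip=203.0.113.2
# host B, the mirror image
ovs-vsctl add-br br-mesh
ovs-vsctl add-port br-mesh gre-a -- set interface gre-a type=gre options:remote_ip=203.0.113.1
# on each host, point LXD's default profile at the OVS bridge
lxc profile device set default eth0 parent br-mesh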
Re: [lxc-users] ?==?utf-8?q? ?==?utf-8?q? ?= "lxc network create" erro
On Tuesday, August 01, 2017 19:49 JST, Sjoerd <sjo...@sjomar.eu> wrote: > >> I vote for feature, since dev is most likely a reserved word, since it's > >> short for device in routing terms. > > Unless someone has i.e. "prod" and "dev" environments. > Unrelate imho. In this case you're trying to create a network, which > implies routing commands under the hood, so than I find it logical that > dev can't be used as name. What spec defines that "dev" can't be used as a name in networking world? I.e. this one works: # ifconfig eth0:dev 1.2.3.4 # ifconfig eth0:prod 2.3.4.5 This will also work: # ip addr add 10.1.2.3 dev eth0 label eth0:dev # ip addr add 10.2.3.4 dev eth0 label eth0:prod This one also works: # brctl addbr prod # brctl addbr dev # brctl show bridge name bridge id STP enabled interfaces dev 8000. no prod 8000. no -- Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
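For what it's worth, the failing command in the original error looks like an iproute2 parsing ambiguity rather than a rule about networking names: "dev" gets consumed as the keyword that normally precedes a device name. Passing the name explicitly suggests the kernel itself has no problem with a bridge called "dev" (a sketch, not an LXD fix):

ip link add dev type bridge        # fails: "dev" is parsed as a keyword, not a name
ip link add name dev type bridge   # works: the bridge is explicitly named "dev"
ip link del dev dev                # clean up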
Re: [lxc-users] "lxc network create" error
On Tuesday, August 01, 2017 18:04 JST, Sjoerd <sjo...@sjomar.eu> wrote: > > > On 30-07-17 17:15, Tomasz Chmielewski wrote: > > Bug or a feature? > > > > # lxc network create dev > > error: Failed to run: ip link add dev type bridge: Error: either "dev" is > > duplicate, or "bridge" is a garbage. > > > > > > # lxc network create devel > > Network devel created > > > > > I vote for feature, since dev is most likely a reserved word, since it's > short for device in routing terms. Unless someone has i.e. "prod" and "dev" environments. > i.e. setting routing can be done like : ip route add 192.168.10.0/24 via > 10.2.2.1 dev eth0 But that's a different command. > So in your case the command would end like : dev dev ...I would be > confused by that as well ;) If we treat it as a feature - then it's an undocumented feature. We would need documentation specifying a list of disallowed network names. -- Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] "lxc network create" error
Bug or a feature? # lxc network create dev error: Failed to run: ip link add dev type bridge: Error: either "dev" is duplicate, or "bridge" is a garbage. # lxc network create devel Network devel created -- Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
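If I read the iproute2 error right, this looks like an argument-building quirk rather than a rule that "dev" is reserved: LXD constructs the command as "ip link add dev type bridge", and iproute2 takes the bare word dev as its DEV keyword. With the explicit name keyword the same interface name appears to be accepted, e.g. (run by hand on the host, purely illustrative):
# ip link add name dev type bridge
# ip link show type bridge
So presumably the fix on the LXD side would be to pass "name" when constructing the command, or at least to document which names currently trip it up.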
Re: [lxc-users] LXD 2.14 - Ubuntu 16.04 - kernel 4.4.0-57-generic - SWAP continuing to grow
Most likely your database cache is simply set too large. I've been experiencing similar issues with MySQL (please read in detail): https://stackoverflow.com/questions/43259136/mysqld-out-of-memory-with-plenty-of-memory/43259820 It finally went away after I've been lowering MySQL cache by a few GBs from each OOM to OOM, until it stopped happenin Tomasz Chmielewski https://lxadm.com On Saturday, July 15, 2017 18:36 JST, Saint Michael <vene...@gmail.com> wrote: > I have a lot of memory management issues using pure LXC. In my case, my box > has only one container. I use LXC to be able to move my app around, not to > squeeze performance out of hardware. What happens is my database gets > killed the OOM manager, although there are gigabytes of RAM used for cache. > The memory manager kills applications instead of reclaiming memory from > disc cache. How can this be avoided? > > My config at the host is: > > vm.hugepages_treat_as_movable=0 > vm.hugetlb_shm_group=27 > vm.nr_hugepages=2500 > vm.nr_hugepages_mempolicy=2500 > vm.nr_overcommit_hugepages=0 > vm.overcommit_memory=0 > vm.swappiness=0 > vm.vfs_cache_pressure=150 > vm.dirty_ratio=10 > vm.dirty_background_ratio=5 > > This shows the issue > [9449866.130270] Node 0 hugepages_total=1250 hugepages_free=1250 > hugepages_surp=0 hugepages_size=2048kB > [9449866.130271] Node 1 hugepages_total=1250 hugepages_free=1248 > hugepages_surp=0 hugepages_size=2048kB > [9449866.130271] 46181 total pagecache pages > [9449866.130273] 33203 pages in swap cache > [9449866.130274] Swap cache stats: add 248571542, delete 248538339, find > 69031185/100062903 > [9449866.130274] Free swap = 0kB > [9449866.130275] Total swap = 8305660kB > [9449866.130276] 20971279 pages RAM > [9449866.130276] 0 pages HighMem/MovableOnly > [9449866.130276] 348570 pages reserved > [9449866.130277] 0 pages cma reserved > [9449866.130277] 0 pages hwpoisoned > [9449866.130278] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds > swapents oom_score_adj name > [9449866.130286] [ 618] 0 61887181 135 168 3 >3 0 systemd-journal > [9449866.130288] [ 825] 0 82511343 130 25 3 >0 0 systemd-logind > [9449866.130289] [ 830] 0 830 1642 31 8 3 >0 0 mcelog > [9449866.130290] [ 832] 996 83226859 51 23 3 > 47 0 chronyd > [9449866.130292] [ 834] 0 834 4905 100 12 3 >0 0 irqbalance > [9449866.130293] [ 835] 0 835 6289 177 15 3 >0 0 smartd > [9449866.130295] [ 837]81 83728499 258 28 3 > 149 -900 dbus-daemon > [9449866.130296] [ 857] 0 857 1104 16 7 3 >0 0 rngd > [9449866.130298] [ 859] 0 859 19246337114 224 4 > 40630 0 NetworkManager > [9449866.130300] [ 916] 0 91625113 229 50 3 >0 -1000 sshd > [9449866.130302] [ 924] 0 924 6490 50 17 3 >0 0 atd > [9449866.130303] [ 929] 0 92935327 199 20 3 > 284 0 agetty > [9449866.130305] [ 955] 0 95522199 3185 43 3 > 312 0 dhclient > [9449866.130307] [ 1167] 0 1167 6125 88 17 3 >2 0 lxc-autostart > [9449866.130309] [ 1176] 0 117610818 275 24 3 > 38 0 systemd > [9449866.130310] [ 1188] 0 118813303 1980 29 3 > 36 0 systemd-journal > [9449866.130312] [ 1372]99 1372 38812 12 3 > 45 0 dnsmasq > [9449866.130313] [ 1375]81 1375 6108 77 17 3 > 39 -900 dbus-daemon > [9449866.130315] [ 1394] 0 1394 6175 46 15 3 > 168 0 systemd-logind > [9449866.130316] [ 1395] 0 139578542 1142 69 3 >4 0 rsyslogd > [9449866.130317] [ 1397] 0 1397 1614 32 8 3 >0 0 agetty > [9449866.130319] [ 1398] 0 1398 1614 31 8 3 >0 0 agetty > [9449866.130320] [ 1400] 0 1400 1614 31 8 3 >0 0 agetty > [9449866.130321] [ 1401] 0 1401 16142 8 3 > 30 0 agetty > [9449866.130322] [ 1402] 0 1402 16142 8 3 > 29 0 agetty > 
[9449866.130324] [ 1403] 0 1403 1614 31 8 3 >0 0 agetty > [9449866.13032
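For what it's worth, the two knobs I would look at first are the InnoDB buffer pool inside the container and an explicit memory limit on the container itself, so the page cache is reclaimed within the container's limit before the OOM killer goes after mysqld. The numbers and the container name below are purely illustrative:
In the container's my.cnf, [mysqld] section:
innodb_buffer_pool_size = 2G
On the host, for an LXD container:
# lxc config set mycontainer limits.memory 6GB
For a plain LXC container the rough equivalent would be a cgroup limit in its config, e.g.:
lxc.cgroup.memory.limit_in_bytes = 6G
With vm.swappiness=0 and no free swap left the kernel has little room to move; a small positive swappiness (e.g. 10) sometimes behaves better, though that part is very workload dependent.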
Re: [lxc-users] lxc commands failing randomly: error: not found
\"2017-07-12T21:24:21.01927786Z\",\n\t\t\"updated_at\": \"2017-07-12T21:24:21.01927786Z\",\n\t\t\"status\": \"Running\",\n\t\t\"status_code\": 103,\n\t\t\"resources\": {\n\t\t\t\"containers\": [\n\t\t\t\t\"/1.0/containers/vpn-hz1\"\n\t\t\t]\n\t\t},\n\t\t\"metadata\": {\n\t\t\t\"fds\": {\n\t\t\t\t\"0\": \"05693ff9baedf120f482bf513e13956b214536d47629e5b40c635d0037e8bd27\",\n\t\t\t\t\"1\": \"a59868283645f18a144eb671d15768681dc577b4d2ca1163cb4c65454ce6192e\",\n\t\t\t\t\"2\": \"707ef10a6025752a4e89dcf24c26ac1ef2962492660b71e4ab1e51fb073c4939\",\n\t\t\t\t\"control\": \"14b79db9b5ccd0224201c0e22cd35b2420a236ffb080992ec46c2488b36a3314\"\n\t\t\t}\n\t\t},\n\t\t\"may_cancel\": false,\n\t\t\"err\": \"\"\n\t}" t=2017-07-12T21:24:21+ lvl=dbug msg="Connected to the websocket" t=2017-07-12T21:24:21+ lvl=dbug msg="Connected to the websocket" t=2017-07-12T21:24:21+ lvl=dbug msg="Connected to the websocket" t=2017-07-12T21:24:21+ etag= lvl=info method=GET msg="Sending request to LXD" t=2017-07-12T21:24:21+ url=https://lxd-server:8443/1.0/operations/c2148e6d-bb82-4d8f-aafa-0a2a5b7c9fa0 lvl=dbug msg="got message barrier" t=2017-07-12T21:24:21+ lvl=dbug msg="got message barrier" t=2017-07-12T21:24:21+ error: not found Wed Jul 12 21:24:21 UTC 2017 1 On Thursday, July 13, 2017 04:55 JST, Ivan Kurnosov <zer...@zerkms.ru> wrote: > Please run it with `--debug` for more details. > > On 13 July 2017 at 03:42, Tomasz Chmielewski <man...@wpkg.org> wrote: > > > On Thursday, July 13, 2017 00:35 JST, "Tomasz Chmielewski" < > > man...@wpkg.org> wrote: > > > > > On Wednesday, July 12, 2017 20:52 JST, "Tomasz Chmielewski" < > > man...@wpkg.org> wrote: > > > > > > > It only fails with "error: not found" on the first, second or third > > "lxc config" line. > > > > > > > > It started to fail in the last 2 weeks I think (lxd updates?) - > > before, it was rock solid. > > > > > > Also "lxc exec" fails. > > > > > > Here is a reproducer: > > > > > > # lxc copy base-uni-web01 ztest > > > # lxc start ztest ; while true ; do OUT=$(lxc exec ztest date ; echo $?) > > ; echo $OUT ; done > > > Wed Jul 12 15:32:43 UTC 2017 0 > > > Wed Jul 12 15:32:54 UTC 2017 0 > > > Wed Jul 12 15:32:54 UTC 2017 0 > > > Wed Jul 12 15:32:55 UTC 2017 0 > > > Wed Jul 12 15:32:55 UTC 2017 0 > > > Wed Jul 12 15:32:55 UTC 2017 0 > > > Wed Jul 12 15:32:56 UTC 2017 0 > > > Wed Jul 12 15:32:56 UTC 2017 0 > > > Wed Jul 12 15:32:57 UTC 2017 0 > > > Wed Jul 12 15:32:57 UTC 2017 0 > > > Wed Jul 12 15:32:57 UTC 2017 0 > > > Wed Jul 12 15:32:58 UTC 2017 0 > > > error: not found > > > Wed Jul 12 15:32:58 UTC 2017 1 > > > Wed Jul 12 15:33:09 UTC 2017 0 > > > Wed Jul 12 15:33:09 UTC 2017 0 > > > Wed Jul 12 15:33:19 UTC 2017 0 > > > Wed Jul 12 15:33:25 UTC 2017 0 > > > Wed Jul 12 15:33:40 UTC 2017 0 > > > Wed Jul 12 15:33:46 UTC 2017 0 > > > > > > > > > It seems to be easier to reproduce if the host server is slightly > > overloaded. > > > > Also - it only happens when lxc remote is https://. > > > > It doesn't happen when lxc remote is unix:// > > > > > > Tomasz Chmielewski > > https://lxadm.com > > ___ > > lxc-users mailing list > > lxc-users@lists.linuxcontainers.org > > http://lists.linuxcontainers.org/listinfo/lxc-users > > > > > -- > With best regards, Ivan Kurnosov -- Tomasz Chmielewski ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] lxc commands failing randomly: error: not found
On Thursday, July 13, 2017 00:35 JST, "Tomasz Chmielewski" <man...@wpkg.org> wrote: > On Wednesday, July 12, 2017 20:52 JST, "Tomasz Chmielewski" <man...@wpkg.org> > wrote: > > > It only fails with "error: not found" on the first, second or third "lxc > > config" line. > > > > It started to fail in the last 2 weeks I think (lxd updates?) - before, it > > was rock solid. > > Also "lxc exec" fails. > > Here is a reproducer: > > # lxc copy base-uni-web01 ztest > # lxc start ztest ; while true ; do OUT=$(lxc exec ztest date ; echo $?) ; > echo $OUT ; done > Wed Jul 12 15:32:43 UTC 2017 0 > Wed Jul 12 15:32:54 UTC 2017 0 > Wed Jul 12 15:32:54 UTC 2017 0 > Wed Jul 12 15:32:55 UTC 2017 0 > Wed Jul 12 15:32:55 UTC 2017 0 > Wed Jul 12 15:32:55 UTC 2017 0 > Wed Jul 12 15:32:56 UTC 2017 0 > Wed Jul 12 15:32:56 UTC 2017 0 > Wed Jul 12 15:32:57 UTC 2017 0 > Wed Jul 12 15:32:57 UTC 2017 0 > Wed Jul 12 15:32:57 UTC 2017 0 > Wed Jul 12 15:32:58 UTC 2017 0 > error: not found > Wed Jul 12 15:32:58 UTC 2017 1 > Wed Jul 12 15:33:09 UTC 2017 0 > Wed Jul 12 15:33:09 UTC 2017 0 > Wed Jul 12 15:33:19 UTC 2017 0 > Wed Jul 12 15:33:25 UTC 2017 0 > Wed Jul 12 15:33:40 UTC 2017 0 > Wed Jul 12 15:33:46 UTC 2017 0 > > > It seems to be easier to reproduce if the host server is slightly overloaded. Also - it only happens when lxc remote is https://. It doesn't happen when lxc remote is unix:// Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] lxc commands failing randomly: error: not found
On Wednesday, July 12, 2017 20:52 JST, "Tomasz Chmielewski" <man...@wpkg.org> wrote: > It only fails with "error: not found" on the first, second or third "lxc > config" line. > > It started to fail in the last 2 weeks I think (lxd updates?) - before, it > was rock solid. Also "lxc exec" fails. Here is a reproducer: # lxc copy base-uni-web01 ztest # lxc start ztest ; while true ; do OUT=$(lxc exec ztest date ; echo $?) ; echo $OUT ; done Wed Jul 12 15:32:43 UTC 2017 0 Wed Jul 12 15:32:54 UTC 2017 0 Wed Jul 12 15:32:54 UTC 2017 0 Wed Jul 12 15:32:55 UTC 2017 0 Wed Jul 12 15:32:55 UTC 2017 0 Wed Jul 12 15:32:55 UTC 2017 0 Wed Jul 12 15:32:56 UTC 2017 0 Wed Jul 12 15:32:56 UTC 2017 0 Wed Jul 12 15:32:57 UTC 2017 0 Wed Jul 12 15:32:57 UTC 2017 0 Wed Jul 12 15:32:57 UTC 2017 0 Wed Jul 12 15:32:58 UTC 2017 0 error: not found Wed Jul 12 15:32:58 UTC 2017 1 Wed Jul 12 15:33:09 UTC 2017 0 Wed Jul 12 15:33:09 UTC 2017 0 Wed Jul 12 15:33:19 UTC 2017 0 Wed Jul 12 15:33:25 UTC 2017 0 Wed Jul 12 15:33:40 UTC 2017 0 Wed Jul 12 15:33:46 UTC 2017 0 It seems to be easier to reproduce if the host server is slightly overloaded. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] lxc commands failing randomly: error: not found
On Wednesday, July 12, 2017 20:33 JST, "Tomasz Chmielewski" wrote: > In the last days, lxc commands are failing randomly. > > Example (used in a script): > > # lxc config set TDv2-z-testing-a19ea62218-2017-07-12-11-23-03 raw.lxc > lxc.aa_allow_incomplete=1 > error: not found > > It shouldn't fail, because: > > # lxc list|grep TDv2-z-testing-a19ea62218-2017-07-12-11-23-03 > | TDv2-z-testing-a19ea62218-2017-07-12-11-23-03| STOPPED | > | | > PERSISTENT | 0 | To be more specific: it only seems to fail on the "lxc config" commands (lxc config set, lxc config edit, lxc config device add). My container startup script works like below: lxc copy container-template newcontainer lxc config set newcontainer raw.lxc "lxc.aa_allow_incomplete=1" ...lots of lxc file pull / push... lxc config show newcontainer | sed -e "s/$BRIDGE_OLD/$BRIDGE_NEW/" | lxc config edit newcontainer ...again, lots of lxc file pull / push... lxc config device add newcontainer uploads disk source=/some/path path=/var/www/path It only fails with "error: not found" on the first, second or third "lxc config" line. It started to fail in the last 2 weeks I think (lxd updates?) - before, it was rock solid. Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
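Until the root cause is known, a blunt workaround for scripts is to retry the flaky calls; a minimal sketch (bash, retry count and sleep purely arbitrary, container name from the script above):
retry() { local i; for i in 1 2 3 4 5; do "$@" && return 0; sleep 2; done; return 1; }
retry lxc config set newcontainer raw.lxc "lxc.aa_allow_incomplete=1"
retry lxc config device add newcontainer uploads disk source=/some/path path=/var/www/path
Ugly, but it keeps provisioning going while the https-remote issue is being debugged.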
[lxc-users] lxc commands failing randomly: error: not found
In the last days, lxc commands are failing randomly. Example (used in a script): # lxc config set TDv2-z-testing-a19ea62218-2017-07-12-11-23-03 raw.lxc lxc.aa_allow_incomplete=1 error: not found It shouldn't fail, because: # lxc list|grep TDv2-z-testing-a19ea62218-2017-07-12-11-23-03 | TDv2-z-testing-a19ea62218-2017-07-12-11-23-03| STOPPED | | | PERSISTENT | 0 | The lxc client is in a container and is using: # dpkg -l|grep lxd ii lxd-client 2.15-0ubuntu6~ubuntu16.04.1~ppa1 amd64 Container hypervisor based on LXC - client The lxd server is using: # dpkg -l|grep lxd ii lxd 2.15-0ubuntu6~ubuntu16.04.1~ppa1 amd64 Container hypervisor based on LXC - daemon ii lxd-client 2.15-0ubuntu6~ubuntu16.04.1~ppa1 amd64 Container hypervisor based on LXC - client Not sure how I can debug this. -- Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] lxc network set... error: Only managed networks can be modified
I've set this bridge in /etc/network/interfaces on the host: # lxc network show br-testing config: {} description: "" name: br-testing type: bridge used_by: - /1.0/containers/TDv2-testing-307c906842-2017-06-29-11-55-34 - /1.0/containers/TDv2-testing-bc29e3f587-2017-06-29-11-47-25 - /1.0/containers/TDv2-testing-bc29e3f587-2017-06-29-11-49-17 - /1.0/containers/TDv2-z-testing-a19ea62218-2017-06-29-10-31-48 managed: false Because of this, it's unmanaged, and I'm not able to configure it via "lxc network edit": # lxc network edit br-testing error: Only managed networks can be modified. How do I best get this unmanaged network to be managed? This one unfortunately doesn't work - so I guess it will be some sort of removing the bridge from the config, then adding it with lxc: # lxc network set br-testing managed=true error: Only managed networks can be modified. But perhaps there is some "recommended" way? -- Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
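I don't think the managed flag can be flipped in place; as far as I know the usual route is to recreate the bridge as an LXD-managed network under the same name and let the containers re-attach to it. A rough sketch, assuming a short network outage for the containers on br-testing is acceptable and with the subnet below adjusted to your environment:
1. Stop (or at least expect a blip on) the containers using br-testing.
2. Remove the br-testing stanza from /etc/network/interfaces and take the bridge down:
# ifdown br-testing
# brctl delbr br-testing   (only if it is still present after ifdown)
3. Recreate it as a managed network:
# lxc network create br-testing ipv4.address=192.168.100.1/24 ipv4.nat=true ipv6.address=none
4. Containers whose nic device already has parent: br-testing should come back unchanged; otherwise re-attach them:
# lxc network attach br-testing <container> eth0
Since the bridge keeps its name, the containers' own config should not need to change.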
Re: [lxc-users] lxd 2.15 broke "lxc file push -r"?
On Thursday, June 29, 2017 13:30 JST, Stéphane Graber <stgra...@ubuntu.com> wrote: > Hmm, so that just proves that we really need more systematic testing of the > way we handle all that mess in "lxc file push". > > For this particular case, my feeling is that LXD 2.15's behavior is > correct. OK, I can see we can restore "2.14 behaviour" with an asterisk, i.e.: lxc file push -r /tmp/testdir/* container/tmp Which makes "lxc file push -r" in 2.15 behave similarly to cp. Before, 2.14 behaved similarly to rsync. So can we assume, going forward, the behaviour won't change anymore? :) Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
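One caveat with the asterisk form: the glob is expanded by the calling shell, not by lxc, so dotfiles are skipped and an empty directory leaves a literal '*' that makes the command fail. In bash the hidden files can be included with dotglob, e.g. (container name illustrative):
# shopt -s dotglob
# lxc file push -r /tmp/testdir/* container1/tmp
For scripts it may be less fragile to push the directory itself (the 2.15 way) and reference /tmp/testdir/... inside the container afterwards.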
[lxc-users] lxd 2.15 broke "lxc file push -r"?
With lxd 2.14: # mkdir /tmp/testdir # touch /tmp/testdir/file1 /tmp/testdir/file2 # lxc file push -r /tmp/testdir/ testvm1/tmp # < note the trailing slash after /tmp/testdir/ # echo $? 0 # lxc exec testvm1 ls /tmp file1 file2 With lxd 2.15: # mkdir /tmp/testdir # touch /tmp/testdir/file1 /tmp/testdir/file2 # lxc file push -r /tmp/testdir/ testvm2/tmp # < note the trailing slash after /tmp/testdir/ # lxc exec testvm2 ls /tmp testdir # lxc exec testvm2 ls /tmp/testdir file1 file2 This breaks many scripts! -- Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
[lxc-users] how to set raw.idmap (both uid and gid) on container creation?
This one works: $ lxc launch images:ubuntu/trusty/amd64 testct1 -c 'raw.idmap=uid 1000 33' This one doesn't: $ lxc launch images:ubuntu/trusty/amd64 testct2 -c 'raw.idmap=uid 1000 33\ngid 1002 33' Creating testct2 error: invalid raw.idmap line uid 1000 33\ngid 1000 33 What's the correct syntax to set it? Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
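Two things that should work here, though I have not tested them against this exact image: raw.idmap accepts a "both" line when the uid and gid map to the same host id, and for separate uid/gid lines the value needs a real newline rather than a literal \n, which bash can supply with $'...' quoting:
$ lxc launch images:ubuntu/trusty/amd64 testct2 -c raw.idmap="both 1000 33"
$ lxc launch images:ubuntu/trusty/amd64 testct3 -c raw.idmap=$'uid 1000 33\ngid 1002 33'
The second form is shell-specific; under plain sh the same value can be set after creation with lxc config edit.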
[lxc-users] error: Get https://images.linuxcontainers.org:8443/1.0/images/ubuntu/xenial/amd64: x509: certificate is valid for images.linuxcontainers.org
I'm not able to launch a container on one of the servers: # lxc launch images:ubuntu/xenial/amd64 containername error: Get https://images.linuxcontainers.org:8443/1.0/images/ubuntu/xenial/amd64: x509: certificate is valid for images.linuxcontainers.org, uk.images.linuxcontainers.org, us.images.linuxcontainers.org, not *.linuxcontainers.org Not sure how to debug this. This works: # curl https://images.linuxcontainers.org:8443/1.0/images/ubuntu/xenial/amd64 301 Moved Permanently - The document has moved to https://uk.images.linuxcontainers.org:8443/1.0/images/ubuntu/xenial/amd64 (Apache/2.4.7 (Ubuntu) Server at images.linuxcontainers.org Port 8443) -- Tomasz Chmielewski https://lxadm.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
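A couple of debugging suggestions, nothing LXD-specific, just to narrow it down: check what certificate that particular server actually receives on port 8443 and compare it with a working server, and check whether the cached certificate for the images: remote is stale:
# getent hosts images.linuxcontainers.org
# openssl s_client -connect images.linuxcontainers.org:8443 </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer -dates
If there is a cached server certificate for that remote under ~/.config/lxc/servercerts/ (the exact path may differ per install), moving it aside forces the client to fetch a fresh one. A transparent proxy or a stale DNS answer on that one host would also show up in the openssl output.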
Re: [lxc-users] LXD and Kernel Samepage Merging (KSM)
On Monday, June 05, 2017 10:58 JST, "Fajar A. Nugraha" <l...@fajar.net> wrote: > On Mon, Jun 5, 2017 at 7:48 AM, Ron Kelley <rkelley...@gmail.com> wrote: > > > > > As for the openvz link; I read that a few times but I don’t get any > > positive results using those methods. This leads me to believe (a) LXD > > does not support KSM or (b) the applications are not registering w/the KSM > > part of the kernel. > > > > > > Works for me here on a simple test, in 2 privileged containers. That's valuable info. I wonder if you also checked in unprivileged containers? -- Tomasz Chmielewski ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
Re: [lxc-users] LXD - upgrade from 2.08 to 2.12 enables BTRFS quota
FYI btrfs quotas are not yet "stable" even in the newest kernel (4.11.x as we speak). Reference: https://btrfs.wiki.kernel.org/index.php/Status There are also btrfs quota problems mentioned on btrfs mailing lists quite often (severe performance problems, inaccurate or even negative counts, especially when snapshots are in use; kernel crashes/stability issues). So making LXD enable it automatically might be premature - btrfs quotas might work in two or five years from now, but they are not there yet. If LXD wants to support it - fine; but IMO don't enable it by default, add a big fat warning for the user about the potential issues and document a way to enable it manually. Tomasz Chmielewski https://lxadm.com On Monday, June 05, 2017 02:11 JST, Stéphane Graber <stgra...@ubuntu.com> wrote: > On Sun, Jun 04, 2017 at 08:36:28AM -0400, Ron Kelley wrote: > > Greetings all. > > > > I recently upgraded a number of LXD 2.08 servers to 2.12 and noticed the > > btrfs “quota” option was enabled where it was disabled before. Enabling > > quotas on a filesystem with lots of snapshots can cause huge performance > > issues (as indicated by our 20min outage this morning when I tried to clear > > out some old snapshots). > > > > Can one of the developers confirm if the upgrade enables quota? If so, I > > will file a bug to ensure the user gets notified/alerted quotas will be > > enabled (so it can be disabled if necessary). > > > > -Ron > > LXD enables btrfs quota on any newly created btrfs pool, which includes > all your existing ones when upgrading to LXD 2.9 or higher (as the > upgrade step effectively creates a new pool). > > That's done so that the size limit feature of LXD can work on btrfs. > > > Given the potential performance impact and the fact that btrfs quotas > aren't that great to begin with, I'd be fine with adding a pool > configuration option that lets you turn them off. And tweaking our > upgrade code to set that property based on the existing state. > > > Filing a github issue would be good. Please include what I wrote above > and I suspect Christian will take a look into this shortly. > > -- > Stéphane Graber > Ubuntu developer > http://www.ubuntu.com ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users
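For anyone who needs to turn them off right now, quotas can be disabled by hand on the pool's btrfs filesystem; a sketch assuming the default LXD 2.9+ storage layout (adjust the path to wherever your pool is mounted, and note that LXD may re-enable quota when it creates a new pool):
# btrfs quota disable /var/lib/lxd/storage-pools/default
# btrfs qgroup show /var/lib/lxd/storage-pools/default   (should now report that quotas are not enabled)
On pre-2.9 layouts the path is simply the btrfs mount point holding /var/lib/lxd.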
Re: [lxc-users] LXD and Kernel Samepage Merging (KSM)
KSM only works with applications which support it: KSM only operates on those areas of address space which an application has advised to be likely candidates for merging, by using the madvise(2) system call: int madvise(addr, length, MADV_MERGEABLE. This means that doing: echo 1 > /sys/kernel/mm/ksm/run will be enough for KVM, but will not do anything for applications like bash, nginx, apache, php-fpm and so on. Please refer to "Enabling memory deduplication libraries in containers" on https://openvz.org/KSM_(kernel_same-page_merging) - you will have to use ksm_preload mentioned. I haven't personally used it with LXD. Tomasz Chmielewski https://lxadm.com On Monday, June 05, 2017 09:48 JST, Ron Kelley <rkelley...@gmail.com> wrote: > Thanks Fajar. > > This is on-site with our own physical servers, storage, etc. The goal is to > get the most containers per server as possible. While our servers have lots > of RAM, we need to come up with a long-term scaling plan and hope KSM can > help us scale beyond the standard numbers. > > As for the openvz link; I read that a few times but I don’t get any positive > results using those methods. This leads me to believe (a) LXD does not > support KSM or (b) the applications are not registering w/the KSM part of the > kernel. > > I am going to run through some tests this week to see if I can get KSM > working outside the LXD environment then try to replicate the same tests > inside LXD. > > Thanks again for the feedback. > > > > > On Jun 4, 2017, at 6:15 PM, Fajar A. Nugraha <l...@fajar.net> wrote: > > > > On Sun, Jun 4, 2017 at 11:16 PM, Ron Kelley <rkelley...@gmail.com> wrote: > > (Reviving the thread about Container Scaling: > > https://lists.linuxcontainers.org/pipermail/lxc-users/2016-May/011607.html) > > > > We have hit critical mass with LXD 2.12 and I need to get Kernel Samepage > > Merging (KSM) working as soon as possible. All my research has come to a > > dead-end, and I am reaching out to the group at large for suggestions. > > > > Background: We have 5 host servers - each running U16.04 > > (4.4.0-57-generic), 8G RAM, 20G SWAP, and 50 containers (exact configs per > > server - nginx and php 7). > > > > > > Is this a cloud, or on-site setup? > > > > For cloud, there are a lot of options that could get you running with MUCH > > more memory, which would save you lots of headaches getting KSM to work. My > > favorite is EC2 spot instance on AWS. > > > > On another note, I now setup most of my hosts with no swap, since > > performance plummets whenever swap is used. YMMV. > > > > > > I am trying to get KSM working since each container is an identical replica > > of the other (other than hostname/IP). I have read a ton of information on > > the ‘net about Ubuntu and KSM, yet I can’t seem to get any pages to share > > on the host. I am not sure if this is a KSM config issue or if LXD won’t > > allow KSM between containers. > > > > > > > > > > > > Here is what I have done thus far: > > -- > > * Installed the ksmtuned utility and verified ksmd is running on each host. > > * Created the ksm_preload and ksm-wrapper tools per this site (the > > https://github.com/unbrice/ksm_preload). > > * Created 50 identical Ubuntu 16.04 containers running nginx > > * Modified the nginx startup script on each container to include the > > ksm_preload.so library; no issues running nginx. 
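A quick way to check on the host whether anything is actually being merged, plus the ksm_preload-style invocation inside a container, as a sketch (the library path and the nginx example are illustrative, and as said above I have not used ksm_preload with LXD myself):
# echo 1 > /sys/kernel/mm/ksm/run
# echo 1000 > /sys/kernel/mm/ksm/pages_to_scan
# grep . /sys/kernel/mm/ksm/pages_shared /sys/kernel/mm/ksm/pages_sharing
Then, inside a container, start the workload with the preload library so its allocations get madvise(MADV_MERGEABLE):
# LD_PRELOAD=/usr/local/lib/libksm_preload.so nginx -g 'daemon off;'
If pages_sharing on the host stays at 0 after a few scan intervals, the processes are not advertising mergeable memory (or KSM cannot see it), which would point at the preload setup rather than at LXD.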
> > > > (Note: since I could not find the ksm_preload library for Ubuntu, I had to > > use the ksm-wrapper tool listed above) > > > > All the relevant files under /sys/kernel/mm/ksm still show 0 (pages_shared, > > pages_sharing, etc) regardless of what I do. > > > > > > Can any (@stgraber @brauner) confirm if KSM is supported with LXD? If so, > > what is the “magic” to make it work? We really want to get 2-3x more sites > > per container if possible. > > > > > > Have you read https://openvz.org/KSM_(kernel_same-page_merging) ? Some info > > might be relevant. For example, it mentions something which you did not > > wrote: > > > > To start ksmd, issue > > [root@HN ~]# echo 1 > /sys/kernel/mm/ksm/run > > > > Also the section about Tuning and Caveats. > > > > -- > > Fajar > > ___ > > lxc-users mailing list > > lxc-users@lists.linuxcontainers.org > > http://lists.linuxcontainers.org/listinfo/lxc-users > > ___ > lxc-users mailing list > lxc-users@lists.linuxcontainers.org > http://lists.linuxcontainers.org/listinfo/lxc-users -- Tomasz Chmielewski ___ lxc-users mailing list lxc-users@lists.linuxcontainers.org http://lists.linuxcontainers.org/listinfo/lxc-users