[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
with https://github.com/snapcore/snapd/pull/3130 landed, should this bug be fix-released or is there desire to actually fix the Go implementation that snapd uses at some point ? (in which case we probably should let it open) -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1638537 Title: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0 To manage notifications about this bug go to: https://bugs.launchpad.net/snapd/+bug/1638537/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
** Project changed: snappy => snapd ** Changed in: snapd (Ubuntu) Status: New => Triaged ** Changed in: snapd Status: New => Triaged -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1638537 Title: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0 To manage notifications about this bug go to: https://bugs.launchpad.net/snapd/+bug/1638537/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
to gather some data i did 16 fresh installs in a row today (freshly flashed SD for each) and captured the log between the start of the key generation ("Generate device key") and the finalization ("System seeded") ... http://paste.ubuntu.com/24312862/ has the captured raw data ... in average the key generation is taking between 10 and 20seconds in these tests. now the slowest bit of a first boot is the wait for the "press enter to configure" message to come up (30sec up to 1min) this should be researched in a new bug, but i suspect this is caused by copying the snaps in place before we allow interaction with the system, so perhaps unavoidable. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1638537 Title: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0 To manage notifications about this bug go to: https://bugs.launchpad.net/snappy/+bug/1638537/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
http://paste.ubuntu.com/24307672/ has the log output of a test of the above image on a pi3 ... (with manually added rng_core.default_quality=700 to cmdline.txt in system-boot before the first boot) -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1638537 Title: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0 To manage notifications about this bug go to: https://bugs.launchpad.net/snappy/+bug/1638537/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
in case anyone wants to test: http://people.canonical.com/~ogra/snappy/all-snaps/ssh-keygen-test/ has images for pi2, pi3 and beaglebone black with https://github.com/snapcore/snapd/pull/3130 included in snapd -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1638537 Title: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0 To manage notifications about this bug go to: https://bugs.launchpad.net/snappy/+bug/1638537/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
ssh-keygen is about 5.5 times faster than that genkey.go program on my laptop too (it's just that the times are much smaller, obviously). AIUI, generating an RSA key ends up benchmarking montgomery multiplication, and I know there are SIMD tricks you can use for that. Go doesn't seem to be using anything in this area but it looks like ssh-keygen is using openssh's routines for this and I bet they are optimized nearly as much as is possible. So, shelling out to ssh-keygen to make a key is probably going to be much faster until someone goes to the effort of improving Go's implementation... -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1638537 Title: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0 To manage notifications about this bug go to: https://bugs.launchpad.net/snappy/+bug/1638537/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
using https://github.com/tmc/genkey/blob/master/genkey.go i see similar slowness ... right after boot a key generation (with 4096 bits like used in snapd) takes between 3-5 min. doing a 25 run loop like in http://paste.ubuntu.com/24260311/ gets me runs between 12sec and 7min ... with an average around 2-3min. using the rng_core.default_quality=700 boot option seems to help the run right after boot a little, but it never seems to run as fast as ssh- keygen can where the slowest i saw was around 50sec. so it seems to clearly be the rsa.GenerateKey() function of go that does cause the slowness on arm. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1638537 Title: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0 To manage notifications about this bug go to: https://bugs.launchpad.net/snappy/+bug/1638537/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
@seth: this is with an entropy of above 900 using the rng_core.default_quality=700 kernel boot option. even if i let the board sit for a while before hitting enter to run console-conf the builtin key generation never goes below 3min while the ssh-keygen run is always fast (it goes down to 15-20sec if there is more entropy or if i run it multiple times in succession (that latter one might indeed be more I/O than entropy related)) the key type might be based on store requirements, not sure why else we picked 4096/rsa here. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1638537 Title: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0 To manage notifications about this bug go to: https://bugs.launchpad.net/snappy/+bug/1638537/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
How much time had elapsed between the user.creation step and the manual run of ssh-keygen? Perhaps the ssh-keygen just ran quickly because the kernel had already accumulated enough entropy. Do these keys need to be 4096 bit RSA keys? curve25519 ecc keys are vastly smaller, faster to generate, and should provide security roughly on par with 2048 bit RSA keys. (There are longer curves in case RSA 4096 security level is important for this use; see Ed448-goldilocks, m-511, e-521, for examples.) Thanks -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1638537 Title: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0 To manage notifications about this bug go to: https://bugs.launchpad.net/snappy/+bug/1638537/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
one obvious workaround would indeed be to make snapd call out to ssh- keygen to have it handle the key generation ... if the security team does not see any drawbacks with this -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1638537 Title: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0 To manage notifications about this bug go to: https://bugs.launchpad.net/snappy/+bug/1638537/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
we have been doing various test runs ... adding rng_core.default_quality=700 to teh commandline on pi and dragonboard gives a much greater entropy on boot: $ cat /proc/sys/kernel/random/entropy_avail 958 but the user.creation step in console-conf still hangs for 3min doing the device key generation. this is a 4096bit rsa key ... for comparison trying to run something like: ssh-keygen -N "" -b 4096 -t rsa -f foo.key does finish in abouit 30sec so the suspicion is that rsa.GenerateKey() from go has some issues with arm here ... the respective code in snapd is at: https://github.com/snapcore/snapd/blob/master/overlord/devicestate/handlers.go#L88 -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1638537 Title: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0 To manage notifications about this bug go to: https://bugs.launchpad.net/snappy/+bug/1638537/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
A default image on firstboot doesn't have a lot of tools available, and I'm not sure how to modify that at image-build time... so I have only very limited information about what exactly it's doing. It might not be key generation after all though... I don't recall seeing it for a while, but today I caught it hanging for 10 minutes on rpi3. Mar 22 16:08:14 localhost /usr/lib/snapd/snapd[1179]: taskrunner.go:353: DEBUG: Running task 26 on Do: Make snap "pi3" (6) available to the system Mar 22 16:08:14 localhost /usr/lib/snapd/snapd[1179]: taskrunner.go:353: DEBUG: Running task 27 on Do: Setup snap "pi3" (6) security profiles (phase 2) Mar 22 16:08:14 localhost /usr/lib/snapd/snapd[1179]: taskrunner.go:353: DEBUG: Running task 32 on Do: Generate device key Mar 22 16:08:14 localhost /usr/lib/snapd/snapd[1179]: taskrunner.go:353: DEBUG: Running task 28 on Do: Set automatic aliases for snap "pi3" --- blocked here --- Mar 22 16:18:18 localhost /usr/lib/snapd/snapd[1179]: daemon.go:176: DEBUG: uid=1000;@ GET /v2/system-info 9m49.567409367s 200 Mar 22 16:18:18 localhost /usr/lib/snapd/snapd[1179]: taskrunner.go:353: DEBUG: Running task 29 on Do: Setup snap "pi3" aliases Mar 22 16:18:18 localhost /usr/lib/snapd/snapd[1179]: taskrunner.go:353: DEBUG: Running task 33 on Do: Request device serial At 16:08:28 a script logged in via ssh and tried to run "snap version", which blocked for 10 minutes. But in this case it looks like maybe the key generation was already complete. Anyway, here's the rest of syslog from the time between snapd entries: Mar 22 16:08:24 localhost rsyslogd-2007: action 'action 11' suspended, next retry is Wed Mar 22 16:08:54 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ] Mar 22 16:08:25 localhost systemd[1]: Created slice User Slice of qa. Mar 22 16:08:25 localhost systemd[1]: Starting User Manager for UID 1000... Mar 22 16:08:25 localhost systemd[1]: Started Session 1 of user qa. Mar 22 16:08:25 localhost systemd[1721]: Reached target Sockets. Mar 22 16:08:25 localhost systemd[1721]: Reached target Paths. Mar 22 16:08:25 localhost systemd[1721]: Reached target Timers. Mar 22 16:08:25 localhost systemd[1721]: Reached target Basic System. Mar 22 16:08:25 localhost systemd[1721]: Reached target Default. Mar 22 16:08:25 localhost systemd[1721]: Startup finished in 691ms. Mar 22 16:08:25 localhost systemd[1]: Started User Manager for UID 1000. Mar 22 16:08:26 localhost systemd[1]: Stopping User Manager for UID 1000... Mar 22 16:08:26 localhost systemd[1721]: Reached target Shutdown. Mar 22 16:08:26 localhost systemd[1721]: Starting Exit the Session... Mar 22 16:08:26 localhost systemd[1721]: Stopped target Default. Mar 22 16:08:26 localhost systemd[1721]: Stopped target Basic System. Mar 22 16:08:26 localhost systemd[1721]: Stopped target Paths. Mar 22 16:08:26 localhost systemd[1721]: Stopped target Sockets. Mar 22 16:08:26 localhost systemd[1721]: Stopped target Timers. Mar 22 16:08:26 localhost systemd[1721]: Received SIGRTMIN+24 from PID 1737 (kill). Mar 22 16:08:26 localhost systemd[1]: Stopped User Manager for UID 1000. Mar 22 16:08:26 localhost systemd[1]: Removed slice User Slice of qa. Mar 22 16:08:26 localhost systemd[1]: Created slice User Slice of qa. Mar 22 16:08:26 localhost systemd[1]: Starting User Manager for UID 1000... Mar 22 16:08:27 localhost systemd[1]: Started Session 2 of user qa. Mar 22 16:08:27 localhost systemd[1749]: Reached target Timers. Mar 22 16:08:27 localhost systemd[1749]: Reached target Paths. Mar 22 16:08:27 localhost systemd[1749]: Reached target Sockets. Mar 22 16:08:27 localhost systemd[1749]: Reached target Basic System. Mar 22 16:08:27 localhost systemd[1749]: Reached target Default. Mar 22 16:08:27 localhost systemd[1749]: Startup finished in 629ms. Mar 22 16:08:27 localhost systemd[1]: Started User Manager for UID 1000. Mar 22 16:08:27 localhost systemd[1]: Stopping User Manager for UID 1000... Mar 22 16:08:27 localhost systemd[1749]: Stopped target Default. Mar 22 16:08:27 localhost systemd[1749]: Stopped target Basic System. Mar 22 16:08:27 localhost systemd[1749]: Reached target Shutdown. Mar 22 16:08:27 localhost systemd[1749]: Stopped target Timers. Mar 22 16:08:27 localhost systemd[1749]: Stopped target Paths. Mar 22 16:08:27 localhost systemd[1749]: Starting Exit the Session... Mar 22 16:08:27 localhost systemd[1749]: Stopped target Sockets. Mar 22 16:08:27 localhost systemd[1749]: Received SIGRTMIN+24 from PID 1767 (kill). Mar 22 16:08:27 localhost systemd[1]: Stopped User Manager for UID 1000. Mar 22 16:08:27 localhost systemd[1]: Removed slice User Slice of qa. Mar 22 16:08:28 localhost systemd[1]: Created slice User Slice of qa. Mar 22 16:08:28 localhost systemd[1]: Starting User Manager for UID 1000... Mar 22 16:08:28 localhost systemd[1]: Started Session 3 of user qa. Mar 22 16:08:28 localhost systemd[1776]: Reached target Timers. Mar 22 16:08:28 localhost systemd[1776]: Reached target Sockets. Mar 22 16:08:28 localhost
[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
What kind of device key is being constructed? For what purpose? Do the machines have suitable randomness at boot? Are there hardware sources of entropy that aren't feeding into the kernel's randomness pool? Where's the slowdown? Thanks -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1638537 Title: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0 To manage notifications about this bug go to: https://bugs.launchpad.net/snappy/+bug/1638537/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
Until the device key is generated there is nothing it _can_ do (in fact it might not even be listening to requests yet; not sure offhand). That said, on the pi2 it takes a little over a minute to do the keygen (and that's too long, but not much we can do about it without better hardware support for vector multiplication), so you should be seeing times significantly shorter than that. Would you be able to build a profiling snapd and do a first boot with that? -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1638537 Title: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0 To manage notifications about this bug go to: https://bugs.launchpad.net/snappy/+bug/1638537/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
I'm not sure if it's the same issue, but I tend to get a delay of ~8 minutes on a dragonboard on first boot. I'm booting an image with a system-user assertion, not using the console at all. On first boot, the system creates a new user, I confirm that ssh works, then log in and try to run "snap version" to detect the version of snapd... and sometimes it works, while sometimes it hangs for ~8 minutes until snapd starts responding. I haven't been able to investigate very deeply since a default image doesn't include many debugging tools, but this seems to be the last entry in syslog before it hangs: Mar 1 05:55:33 localhost /usr/lib/snapd/snapd[1371]: taskrunner.go:353: DEBUG: Running task 29 on Do: Generate device key The 100% CPU load isn't a problem for me, but it's an issue when snapd doesn't respond to new requests. Could it be made to spin off tasks into separate threads so the service won't block? -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1638537 Title: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0 To manage notifications about this bug go to: https://bugs.launchpad.net/snappy/+bug/1638537/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
>From the moment I entered my SSO email and hit "go" to the moment it returned telling me how to SSH in was about 7 minutes on the dragonboard. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1638537 Title: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0 To manage notifications about this bug go to: https://bugs.launchpad.net/snappy/+bug/1638537/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1638537] Re: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0
I see this as well on various arm based devices. Any more ideas what could be the reason or a possible fix? -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1638537 Title: snapd eats 100% CPU for about 5 minutes on first boot causing a load of >2.0 To manage notifications about this bug go to: https://bugs.launchpad.net/snappy/+bug/1638537/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs