On Thu, May 21, 2026 at 07:37:58AM +0800, Barry Song wrote:
> On Thu, May 21, 2026 at 5:35 AM David Hildenbrand (Arm)
> <[email protected]> wrote:
> >
> > On 5/20/26 23:15, Matthew Wilcox wrote:
> > > On Thu, May 21, 2026 at 05:14:20AM +0800, Barry Song wrote:
> > >> My understanding is that we should not blame applications here. This is
> > >> 2026: there are basically only two kinds of applications - single-threaded
> > >> and multi-threaded - and single-threaded applications are nearly extinct.
> > >
> > > all of the applications i run are either single threaded or don't fork.
> > > what multithreaded applications call fork?
> >
> > Traditionally the problem was random libraries using fork+execve to launch
> > other programs ... instead of using alternatives like posix_spawn (some use
> > cases require more work done before execve and cannot yet switch to that).
> > I'd hope that that is less of a problem on Android.
> >
> > I assume Android zygote might be multi threaded? Maybe sshd as well?
> > Systemd?
> > But I'd be surprised if there are really performance implications.
>
> I am trying to answer the question above:
>
> 1. zygote, multi-threaded on my phone using Android 13.
> / # ls /proc/`pidof zygote64`/task/
> 1359 22728 22729 22730 22731 22732
>
> /proc/1359/task # cat 22728/comm
> Jit thread pool
> /proc/1359/task # cat 22730/comm
> ReferenceQueueD
> /proc/1359/task # cat 22731/comm
> FinalizerDaemon
> /proc/1359/task # cat 22732/comm
> FinalizerWatchd
> /proc/1359/task # cat 1359/comm
> main
>
> But on another phone of mine running Android 16, zygote64 is
> single-threaded.
> Not sure if it is due to the Android team making some changes
> related to threads from Android 13 to Android 16.
>
> 2. sshd, multi-processes instead of multi-threads:
> $ ps aux | grep sshd
> root 1192 0.0 0.0 15444 9032 ? Ss 09:42 0:00 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
> root 2465 0.0 0.0 17164 10760 ? Ss 09:42 0:00 sshd: barry [priv]
> barry 2632 0.0 0.0 17164 7852 ? S 09:42 0:00 sshd: barry@pts/0
> root 3305 2.5 0.0 17164 10772 ? Ss 09:44 0:00 sshd: barry [priv]
> barry 3406 0.0 0.0 17164 7940 ? S 09:44 0:00 sshd: barry@pts/1
>
> 3. systemd, also multi-processes
>
> $ ps ax | grep systemd
> 350 ? S<s 0:00 /lib/systemd/systemd-journald
> 387 ? Ss 0:00 /lib/systemd/systemd-udevd
> 666 ? Ss 0:00 /lib/systemd/systemd-oomd
> 667 ? Ss 0:00 /lib/systemd/systemd-resolved
> 728 ? Ss 0:00 @dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
> 751 ? Ss 0:00 /lib/systemd/systemd-logind
> 753 ? Ssl 0:00 /usr/sbin/thermald --systemd --dbus-enable --adaptive
> 1350 ? Ss 0:00 /lib/systemd/systemd --user
> 1428 ? Ss 0:00 /usr/bin/dbus-daemon --session --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
> 1900 ? Ssl 0:00 /usr/libexec/gnome-session-binary --systemd-service --session=ubuntu
> 2141 ? Ssl 0:00 /lib/systemd/systemd-timesyncd
>
> >
> > Not sure about web browsers .... I think most of them switched to fork
> > servers, where I would assume fork servers would be single-threaded.
>
> On my phone, Chrome is multi-process, but its parent process
> chrome_zygote (10774) is single-threaded:
>
> ps -A | grep chrome
> u0_i15 9883 10774 321066464 119452 do_epoll_wait 0 S com.android.chrome:sandboxed_process0:org.chromium.content.app.SandboxedProcessService0:15
> u0_a142 10164 1359 35110548 277640 do_epoll_wait 0 S com.android.chrome
> u0_a278 10724 1359 9779864 104988 do_epoll_wait 0 S com.google.android.apps.chromecast.app
> u0_a142 10774 1359 32803908 64076 do_sys_poll 0 S com.android.chrome_zygote
> u0_a142 11173 1359 34208592 142192 do_epoll_wait 0 S com.android.chrome:privileged_process0
>
> /proc/10774/task # ls
> 10774
>
> >
> > So, yeah, getting a clear understanding how this ends up being a problem on
> > Android would be great.
>
> I guess the real issue is that in the Android market, there
> are so many applications that are out of our control?
>
> Here are some trace examples from Nanzhe:
>
> iQIYI plugin
> vma reader thread:
> PbMisc-0, pid=27183, tgid=26444
>
> vma writer thread:
> i.video:plugin1, pid=27298, tgid=26444
> writer blocked: 440394938 ns (440 ms)
>
> reader stack:
> vma_start_read
> lock_vma_under_rcu
> do_page_fault
> do_translation_fault
> do_mem_abort
> el0_da
> el0t_64_sync_handler
> el0t_64_sync
>
> writer stack:
> __vma_start_write
> dup_mmap
> copy_mm
> copy_process
> kernel_clone
> __arm64_sys_clone
> invoke_syscall
> el0_svc_common
> do_el0_svc
> el0_svc
>
>
> Baidu Tieba
> vma reader thread:
> elastic_pms_pro, pid=7731, tgid=7575
>
> vma writer thread:
> com.baidu.tieba, pid=8005, tgid=7575
> writer blocked: 514975545 ns (515 ms)
>
> reader stack:
> vma_start_read
> lock_vma_under_rcu
> do_page_fault
> do_translation_fault
> do_mem_abort
> el0_da
> el0t_64_sync_handler
> el0t_64_sync
>
> writer stack:
> __vma_start_write
> dup_mmap
> copy_mm
> copy_process
> kernel_clone
> __arm64_sys_clone
> invoke_syscall
> el0_svc_common
> do_el0_svc
> el0_svc
>
> Thanks
> Barry

Again this is making me want to sit outside and sip on some lemonade and
ice :)

Yes - android processes are aggressively multi-threaded, sure of course.

The missing bit here is the forking - what, where, why, when?

And then you say zygote is sometimes multi-threaded but sometimes
single-threaded, which is adding a whole bunch of confusion on top of all
that.

I don't find these stack trace dumps all that useful (though thanks of
course for taking the time to gather them), I think we'd be better off with
specific data on forking, in some _concise_ _summarised_ form, ideally with
numbers.

There's such a thing as too much information :))

Anyway, again, please let's see a new _RFC_ with the approach proposed by
Suren, with some _succinct_ data demonstrating _exactly_ what the problem
is, so we can make some headway here.

And now I'm off for a cornetto! :)

Thanks, Lorenzo
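
As a rough illustration of the kind of concise, numeric summary asked for above, here is a minimal sketch that condenses trace dumps into one line per application. It assumes the traces are available as plain text in the layout quoted earlier (an application name on the first line of each blank-line-separated block, and a "writer blocked: <ns> ns" line within it); the `summarise` helper and the `TRACES` sample are hypothetical names introduced only for this example, and the sample is abridged from the quoted traces.

```python
import re

# Abridged sample in the layout of the quoted traces (assumption: one trace
# per blank-line-separated block, application name on the first line).
TRACES = """\
iQIYI plugin
vma writer thread:
i.video:plugin1, pid=27298, tgid=26444
writer blocked: 440394938 ns (440 ms)

Baidu Tieba
vma writer thread:
com.baidu.tieba, pid=8005, tgid=7575
writer blocked: 514975545 ns (515 ms)
"""

# Matches the nanosecond figure, with or without spaces around it.
BLOCKED_RE = re.compile(r"writer blocked:\s*(\d+)\s*ns")


def summarise(text):
    """Return (app, blocked_ms) pairs, one per trace block that has a figure."""
    rows = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        match = BLOCKED_RE.search(block)
        if not lines or match is None:
            continue  # skip blocks with no "writer blocked" line
        rows.append((lines[0], int(match.group(1)) / 1e6))
    return rows


if __name__ == "__main__":
    for app, ms in summarise(TRACES):
        print(f"{app:<14} {ms:8.1f} ms")
```

This only tabulates how long the writer was blocked; it says nothing about how often each application forks, which is the data still missing from the thread.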
