Re: Showing /sys/fs/cgroup/memory/memory.stat very slow on some machines
On 24 July 2018 at 12:05, Bruce Merry wrote:
> To reproduce:
> 1. Start cadvisor running. I use the 0.30.2 binary from Github, and
> run it with sudo ./cadvisor-0.30.2 --logtostderr=true
> 2. Run the Python 3 script below, which repeatedly creates a cgroup,
> enters it, stats some files in it, and leaves it again (and removes
> it). It takes a few minutes to run.
> 3. time cat /sys/fs/cgroup/memory/memory.stat. It now takes about 20ms for me.
> 4. sudo sysctl vm.drop_caches=2
> 5. time cat /sys/fs/cgroup/memory/memory.stat. It is back to 1-2ms.
>
> I've also added some code to memcg_stat_show to report the number of
> cgroups in the hierarchy (iterations in for_each_mem_cgroup_tree).
> Running the script increases it from ~700 to ~41000. The script
> iterates 250,000 times, so only some fraction of the cgroups become
> zombies.

I've discovered that I'd messed up that instrumentation code (it was
incrementing inside a loop, so it counted 5x too many cgroups), so some
of the things I said turn out to be wrong. Let me try again:

- Running the script generates about 8000 zombies (not 40,000), with or
  without Shakeel's patch, for 250,000 cgroups created/destroyed - so
  possibly there is some timing condition that makes them into zombies.
  I've only measured it with 4.17, but based on timing results I have
  no particular reason to think it's wildly different on older kernels.

- After running the script 5 times (to generate 40K zombies), getting
  the stats takes 20ms with Shakeel's patch and 80ms without it (on
  4.17.9) - a speedup of the same order of magnitude as Shakeel
  observed with non-zombies.

- 4.17.9 already seems to be an improvement over 4.15: with 40K
  (non-zombie) cgroups, memory.stat time decreases from 200ms to 75ms.

So going from 4.15 to 4.17.9 plus Shakeel's patch reduces the effects
by an order of magnitude, which is good news. Of course, that doesn't
solve the fundamental issue of why the zombies get generated in the
first place. I'm not a kernel developer and I very much doubt I'll have
the time to try to debug what may turn out to be a race condition, but
let me know if I can help with testing things.

Regards
Bruce

--
Bruce Merry
Senior Science Processing Developer
SKA South Africa
Re: Showing /sys/fs/cgroup/memory/memory.stat very slow on some machines
On 26 July 2018 at 02:55, Singh, Balbir wrote:
> Do you by any chance have use_hierarchy=1? memcg_stat_show should just
> rely on counters inside the memory cgroup and the LRU sizes for each
> node.

Yes, /sys/fs/cgroup/memory/memory.use_hierarchy is 1. I assume systemd
is doing that.

Bruce

--
Bruce Merry
Senior Science Processing Developer
SKA South Africa
Re: [PATCH] memcg: reduce memcg tree traversals for stats collection
On 25 July 2018 at 00:46, Shakeel Butt wrote:
> I ran a simple benchmark which reads the root_mem_cgroup's stat file
> 1000 times in the presence of 2500 memcgs on cgroup-v1. The results are:
>
> Without the patch:
> $ time ./read-root-stat-1000-times
>
> real    0m1.663s
> user    0m0.000s
> sys     0m1.660s
>
> With the patch:
> $ time ./read-root-stat-1000-times
>
> real    0m0.468s
> user    0m0.000s
> sys     0m0.467s

Thanks for cc'ing me. I've tried this patch using my test case and the
results are interesting. With the patch applied, running my script only
generates about 8000 new cgroups, compared to 40,000 before -
presumably because the optimisation has altered the timing.

On the other hand, if I run the script 5 times to generate 40,000
zombie cgroups, the time to get stats for the root cgroup (cgroup-v1)
is almost unchanged at around 18ms (it was 20ms before, but there were
slightly more cgroups then as well), compared to the almost 4x speedup
you're seeing in your test.

Regards
Bruce

--
Bruce Merry
Senior Science Processing Developer
SKA South Africa
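[A minimal sketch of what a benchmark equivalent to
read-root-stat-1000-times could look like. This is an assumption on my
part, not Shakeel's actual program; it assumes cgroup-v1 with the
memory controller mounted at /sys/fs/cgroup/memory.]

/* Hypothetical stand-in for read-root-stat-1000-times: read the root
 * memory.stat 1000 times and let time(1) report the cost. Each read(2)
 * of the file triggers memcg_stat_show in the kernel. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[65536];

    for (int i = 0; i < 1000; i++) {
        int fd = open("/sys/fs/cgroup/memory/memory.stat", O_RDONLY);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        /* Drain the whole file so the full stats walk runs. */
        while (read(fd, buf, sizeof(buf)) > 0)
            ;
        close(fd);
    }
    return 0;
}

Built and timed with, e.g.:
$ gcc -O2 -o read-root-stat-1000-times read_stat.c
$ time ./read-root-stat-1000-times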
Re: Showing /sys/fs/cgroup/memory/memory.stat very slow on some machines
On 18 July 2018 at 19:40, Bruce Merry wrote:
>> Yes, very easy to produce zombies, though I don't think the kernel
>> provides any way to tell how many zombies exist on the system.
>>
>> To create a zombie, first create a memcg node, enter that memcg,
>> create a tmpfs file of a few KiBs, exit the memcg and rmdir the memcg.
>> That memcg will be a zombie until you delete that tmpfs file.
>
> Thanks, that makes sense. I'll see if I can reproduce the issue.

Hi

I've had some time to experiment with this issue, and I've now got a
way to reproduce it fairly reliably, including with a stock 4.17.8
kernel. However, it's very phase-of-the-moon stuff, and even apparently
trivial changes (like switching the order in which the files are
statted) make the issue disappear.

To reproduce:
1. Start cadvisor running. I use the 0.30.2 binary from Github, and run
   it with sudo ./cadvisor-0.30.2 --logtostderr=true
2. Run the Python 3 script below, which repeatedly creates a cgroup,
   enters it, stats some files in it, and leaves it again (and removes
   it). It takes a few minutes to run.
3. time cat /sys/fs/cgroup/memory/memory.stat. It now takes about 20ms
   for me.
4. sudo sysctl vm.drop_caches=2
5. time cat /sys/fs/cgroup/memory/memory.stat. It is back to 1-2ms.

I've also added some code to memcg_stat_show to report the number of
cgroups in the hierarchy (iterations in for_each_mem_cgroup_tree).
Running the script increases it from ~700 to ~41000. The script
iterates 250,000 times, so only some fraction of the cgroups become
zombies.

I also tried the suggestion of force_empty: it makes the problem go
away, but it is also very, very slow (about 0.5s per iteration), and
given the sensitivity of the test to small changes I don't know how
meaningful that is.

Reproduction code (if you have tqdm installed you get a nice progress
bar, but it's not required). Hopefully Gmail doesn't do any format
mangling:

#!/usr/bin/env python3
import os

try:
    from tqdm import trange as range
except ImportError:
    pass


def clean():
    try:
        os.rmdir(name)
    except FileNotFoundError:
        pass


def move_to(cgroup):
    with open(cgroup + '/tasks', 'w') as f:
        print(pid, file=f)


pid = os.getpid()
os.chdir('/sys/fs/cgroup/memory')
name = 'dummy'
N = 250000

clean()
try:
    for i in range(N):
        os.mkdir(name)
        move_to(name)
        for filename in ['memory.stat', 'memory.swappiness']:
            os.stat(os.path.join(name, filename))
        move_to('user.slice')
        os.rmdir(name)
finally:
    move_to('user.slice')
    clean()

Regards
Bruce

--
Bruce Merry
Senior Science Processing Developer
SKA South Africa
Re: Showing /sys/fs/cgroup/memory/memory.stat very slow on some machines
On 18 July 2018 at 20:13, Shakeel Butt wrote:
> On Wed, Jul 18, 2018 at 10:58 AM Bruce Merry wrote:
> Yes, if there is no memory pressure such memory can stay around.
>
> On your production machine, before deleting memory containers, you can
> try force_empty to reclaim such memory from them. See if that helps.

Thanks. At the moment the cgroups are all managed by systemd and
docker, but I'll keep that in mind while experimenting.

Bruce

--
Bruce Merry
Senior Science Processing Developer
SKA South Africa
Re: Showing /sys/fs/cgroup/memory/memory.stat very slow on some machines
On 18 July 2018 at 19:48, Shakeel Butt wrote:
> On Wed, Jul 18, 2018 at 10:40 AM Bruce Merry wrote:
>> > Yes, very easy to produce zombies, though I don't think the kernel
>> > provides any way to tell how many zombies exist on the system.
>> >
>> > To create a zombie, first create a memcg node, enter that memcg,
>> > create a tmpfs file of a few KiBs, exit the memcg and rmdir the memcg.
>> > That memcg will be a zombie until you delete that tmpfs file.
>>
>> Thanks, that makes sense. I'll see if I can reproduce the issue. Do
>> you expect the same thing to happen with normal (non-tmpfs) files that
>> are sitting in the page cache, and/or dentries?
>
> Normal files and their dentries can get reclaimed, while tmpfs will
> stick, and even if the data of tmpfs goes to swap, the kmem related to
> tmpfs files will remain in memory.

Sure, page cache and dentries are reclaimable given memory pressure.
These machines all have more memory than they need though (64GB+) and
generally don't come under any memory pressure. I'm just wondering if
the behaviour we're seeing can be explained as a result of a lot of
dentries sticking around (because there is no memory pressure) and in
turn causing a lot of zombie cgroups to stay present until something
forces reclamation of dentries.

Cheers
Bruce

--
Bruce Merry
Senior Science Processing Developer
SKA South Africa
Re: Showing /sys/fs/cgroup/memory/memory.stat very slow on some machines
On 18 July 2018 at 17:49, Shakeel Butt wrote:
> On Wed, Jul 18, 2018 at 8:37 AM Bruce Merry wrote:
>> That sounds promising. Is there any way to tell how many zombies there
>> are, and is there any way to deliberately create zombies? If I can
>> produce zombies that might give me a reliable way to reproduce the
>> problem, which could then sensibly be tested against newer kernel
>> versions.
>
> Yes, very easy to produce zombies, though I don't think the kernel
> provides any way to tell how many zombies exist on the system.
>
> To create a zombie, first create a memcg node, enter that memcg,
> create a tmpfs file of a few KiBs, exit the memcg and rmdir the memcg.
> That memcg will be a zombie until you delete that tmpfs file.

Thanks, that makes sense. I'll see if I can reproduce the issue. Do you
expect the same thing to happen with normal (non-tmpfs) files that are
sitting in the page cache, and/or dentries?

Cheers
Bruce

--
Bruce Merry
Senior Science Processing Developer
SKA South Africa
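[A minimal C sketch of the zombie recipe quoted above, as one possible
interpretation of it. Assumptions not in the original: run as root,
cgroup-v1 memory controller mounted at /sys/fs/cgroup/memory, /dev/shm
used as the tmpfs, and the names zombie-test and zombie-data are made
up for illustration.]

/* Sketch of the recipe: create a memcg, enter it, dirty a few KiB of
 * tmpfs so the memcg is charged, move back to the root memcg, and
 * rmdir. The tmpfs pages should keep the offline memcg alive as a
 * zombie until /dev/shm/zombie-data is deleted. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

static void enter(const char *tasks_path)
{
    FILE *f = fopen(tasks_path, "w");
    if (!f) { perror(tasks_path); exit(1); }
    fprintf(f, "%d", getpid());   /* move this process into the cgroup */
    fclose(f);
}

int main(void)
{
    char data[4096];
    memset(data, 'x', sizeof(data));

    if (mkdir("/sys/fs/cgroup/memory/zombie-test", 0755) != 0) {
        perror("mkdir");
        return 1;
    }
    enter("/sys/fs/cgroup/memory/zombie-test/tasks");

    /* A few KiB of tmpfs data, charged to the new memcg. */
    int fd = open("/dev/shm/zombie-data", O_CREAT | O_WRONLY, 0644);
    if (fd < 0 || write(fd, data, sizeof(data)) != sizeof(data))
        perror("tmpfs write");
    close(fd);

    enter("/sys/fs/cgroup/memory/tasks");       /* back to the root memcg */
    rmdir("/sys/fs/cgroup/memory/zombie-test"); /* now (hopefully) a zombie */
    return 0;
}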
Re: Showing /sys/fs/cgroup/memory/memory.stat very slow on some machines
On 18 July 2018 at 17:26, Shakeel Butt wrote:
> On Wed, Jul 18, 2018 at 7:29 AM Bruce Merry wrote:
> It seems like you are using cgroup-v1. How many nodes are there in
> your memcg tree and also how many cpus does the system have?

From my original email: "there are 106 memory.stat files in
/sys/fs/cgroup/memory." - is that what you mean by the number of nodes?
The affected systems all have 8 CPU cores (hyperthreading is disabled).

> Please note that memcg_stat_show or reading memory.stat in cgroup-v1
> is not as optimized as cgroup-v2. The function memcg_stat_show() in
> 4.13 does ~17 tree walks and then for ~12 of those tree walks, it goes
> through all cpus for each node in the memcg tree. In 4.16,
> a983b5ebee57 ("mm: memcontrol: fix excessive complexity in memory.stat
> reporting") optimizes away the cpu traversal at the expense of some
> accuracy. The next optimization would be to do just one memcg tree
> traversal, similar to cgroup-v2's memory_stat_show().

On most machines it is still fast (1-2ms), and there is no difference
in the number of CPUs and only very small differences in the number of
live memory cgroups, so presumably something else is going on.

> The memcg tree does include all zombie memcgs and these zombies do
> contribute to the memcg_stat_show cost.

That sounds promising. Is there any way to tell how many zombies there
are, and is there any way to deliberately create zombies? If I can
produce zombies that might give me a reliable way to reproduce the
problem, which could then sensibly be tested against newer kernel
versions.

Thanks
Bruce

--
Bruce Merry
Senior Science Processing Developer
SKA South Africa
Re: Showing /sys/fs/cgroup/memory/memory.stat very slow on some machines
On 18 July 2018 at 16:47, Michal Hocko wrote:
>> Thanks for looking into this. I'm not familiar with ftrace. Can you
>> give me a specific command line to run? Based on "perf record cat
>> /sys/fs/cgroup/memory/memory.stat"/"perf report", I see the following:
>>
>>   42.09%  cat  [kernel.kallsyms]  [k] memcg_stat_show
>>   29.19%  cat  [kernel.kallsyms]  [k] memcg_sum_events.isra.22
>>   12.41%  cat  [kernel.kallsyms]  [k] mem_cgroup_iter
>>    5.42%  cat  [kernel.kallsyms]  [k] _find_next_bit
>>    4.14%  cat  [kernel.kallsyms]  [k] css_next_descendant_pre
>>    3.44%  cat  [kernel.kallsyms]  [k] find_next_bit
>>    2.84%  cat  [kernel.kallsyms]  [k] mem_cgroup_node_nr_lru_pages
>
> I would just use perf record as you did. How long did the call take?
> Also is the excessive time an outlier or a more consistent thing? If
> the former does perf record show any difference?

I didn't note the exact time for that particular run, but it's pretty
consistently 372-377ms on the machine that has that perf report. The
times differ between machines showing the symptom (anywhere from
200-500ms), but are consistent (within a few ms) in back-to-back runs
on each machine.

>> Ubuntu 16.04 with kernel 4.13.0-41-generic (so presumably includes
>> some Ubuntu special sauce).
>
> Do you see the same when running with the vanilla kernel?

We don't currently have any boxes running vanilla kernels. While I
could install a test box with a vanilla kernel, I don't know how to
reproduce the problem, what piece of our production environment is
triggering it, or even why some machines are unaffected, so if the
problem didn't re-occur on the test box I wouldn't be able to conclude
anything useful.

Do you have suggestions on things I could try that might trigger this?
e.g. are there cases where a cgroup no longer shows up in the
filesystem but is still lingering while waiting for its refcount to hit
zero? Does every child cgroup contribute to the stat_show cost of its
parent, or does it have to have some non-trivial variation from its
parent?

Thanks
Bruce

--
Bruce Merry
Senior Science Processing Developer
SKA South Africa
Re: Showing /sys/fs/cgroup/memory/memory.stat very slow on some machines
On 18 July 2018 at 12:42, Michal Hocko wrote:
> [CC some more people]
>
> On Tue 17-07-18 21:23:07, Andrew Morton wrote:
>> (cc linux-mm)
>>
>> On Tue, 3 Jul 2018 08:43:23 +0200 Bruce Merry wrote:
>>
>> > Hi
>> >
>> > I've run into an odd performance issue in the kernel, and not being a
>> > kernel dev or knowing terribly much about cgroups, am looking for
>> > advice on diagnosing the problem further (I discovered this while
>> > trying to pin down high CPU load in cadvisor).
>> >
>> > On some machines in our production system, cat
>> > /sys/fs/cgroup/memory/memory.stat is extremely slow (500ms on one
>> > machine), while on other nominally identical machines it is fast
>> > (2ms).
>
> Could you try to use ftrace to see where the time is spent?

Thanks for looking into this. I'm not familiar with ftrace. Can you
give me a specific command line to run? Based on "perf record cat
/sys/fs/cgroup/memory/memory.stat"/"perf report", I see the following:

  42.09%  cat  [kernel.kallsyms]  [k] memcg_stat_show
  29.19%  cat  [kernel.kallsyms]  [k] memcg_sum_events.isra.22
  12.41%  cat  [kernel.kallsyms]  [k] mem_cgroup_iter
   5.42%  cat  [kernel.kallsyms]  [k] _find_next_bit
   4.14%  cat  [kernel.kallsyms]  [k] css_next_descendant_pre
   3.44%  cat  [kernel.kallsyms]  [k] find_next_bit
   2.84%  cat  [kernel.kallsyms]  [k] mem_cgroup_node_nr_lru_pages

> memory_stat_show should only scale with the depth of the cgroup
> hierarchy for memory.stat to get cumulative numbers. All the rest
> should be simply reads of gathered counters. There is no locking
> involved in the current kernel. What is the kernel version you are
> using, btw?

Ubuntu 16.04 with kernel 4.13.0-41-generic (so presumably it includes
some Ubuntu special sauce).

Some new information: when this occurred on another machine I ran "echo
2 > /proc/sys/vm/drop_caches" to drop the dentry cache, and performance
immediately improved. Unfortunately, I've not been able to deliberately
reproduce the issue. I've tried doing the following 10^7 times in a
loop, and while it inflates the dentry cache, it doesn't cause any
significant slowdown:

1. Create a temporary cgroup: mkdir /sys/fs/cgroup/memory/<name>
2. stat /sys/fs/cgroup/memory/<name>/memory.stat
3. rmdir /sys/fs/cgroup/memory/<name>

I've also tried inflating the dentry cache just by stat-ing millions of
non-existent files, and again, no slowdown. So I'm not sure exactly how
the dentry cache is related.

Regards
Bruce

--
Bruce Merry
Senior Science Processing Developer
SKA South Africa
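[A minimal C sketch of the three-step loop above. Assumptions not in
the original: run as root, same cgroup-v1 mount, and a fresh tmp-%ld
name each iteration (the original doesn't say whether the name was
reused; a fresh name guarantees the dentry cache grows). Note that
nothing in this loop ever charges memory to the new cgroup, which may
be exactly why it fails to produce zombies.]

/* Create a cgroup, stat its memory.stat, and remove it, 10^7 times. */
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    struct stat st;
    char dir[128], file[160];

    for (long i = 0; i < 10000000L; i++) {
        snprintf(dir, sizeof(dir), "/sys/fs/cgroup/memory/tmp-%ld", i);
        if (mkdir(dir, 0755) != 0) {
            perror("mkdir");
            return 1;
        }
        snprintf(file, sizeof(file), "%s/memory.stat", dir);
        stat(file, &st);   /* touch the kernfs dentry */
        rmdir(dir);
    }
    return 0;
}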
Showing /sys/fs/cgroup/memory/memory.stat very slow on some machines
Hi

I've run into an odd performance issue in the kernel, and not being a
kernel dev or knowing terribly much about cgroups, am looking for
advice on diagnosing the problem further (I discovered this while
trying to pin down high CPU load in cadvisor).

On some machines in our production system, cat
/sys/fs/cgroup/memory/memory.stat is extremely slow (500ms on one
machine), while on other nominally identical machines it is fast (2ms).
One other thing I've noticed is that the affected machines generally
have much larger values for SUnreclaim in /proc/meminfo (up to several
GB), and slabtop reports >1GB of dentry.

Before I tracked the original problem (high CPU usage in cadvisor) down
to this, I rebooted one of the machines and the original problem went
away, so it seems to be cleared by a reboot; I'm reluctant to reboot
more machines to confirm since I don't have a sure-fire way to
reproduce the problem again to debug it.

The machines are running Ubuntu 16.04 with kernel 4.13.0-41-generic.
They're running Docker, which creates a bunch of cgroups, but not an
excessive number: there are 106 memory.stat files in
/sys/fs/cgroup/memory.

Digging a bit further, cat
/sys/fs/cgroup/memory/system.slice/memory.stat also takes ~500ms, but
"find /sys/fs/cgroup/memory/system.slice -mindepth 2 -name memory.stat
| xargs cat" takes only 8ms.

Any thoughts, particularly on what I should compare between the good
and bad machines to narrow down the cause, or even better, how to
prevent it happening?

Thanks
Bruce

--
Bruce Merry
Senior Science Processing Developer
SKA South Africa
[tip:perf/urgent] perf bench: Fix order of arguments to memcpy_alloc_mem
Commit-ID:  e17fdaeaec066c725f73cd3cda1feae52b2646f5
Gitweb:     http://git.kernel.org/tip/e17fdaeaec066c725f73cd3cda1feae52b2646f5
Author:     Bruce Merry
AuthorDate: Thu, 15 Jan 2015 11:20:22 +0200
Committer:  Arnaldo Carvalho de Melo
CommitDate: Sun, 22 Feb 2015 23:10:56 -0300

perf bench: Fix order of arguments to memcpy_alloc_mem

This was causing the destination instead of the source to be filled.
As a result, the source was typically all mapped to one zero page, and
hence very cacheable.

Signed-off-by: Bruce Merry
Acked-by: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20150115092022.GA11292@kryton
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/bench/mem-memcpy.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/bench/mem-memcpy.c b/tools/perf/bench/mem-memcpy.c
index 6c14afe..db1d3a2 100644
--- a/tools/perf/bench/mem-memcpy.c
+++ b/tools/perf/bench/mem-memcpy.c
@@ -289,7 +289,7 @@ static u64 do_memcpy_cycle(const struct routine *r, size_t len, bool prefault)
 	memcpy_t fn = r->fn.memcpy;
 	int i;
 
-	memcpy_alloc_mem(&src, &dst, len);
+	memcpy_alloc_mem(&dst, &src, len);
 
 	if (prefault)
 		fn(dst, src, len);
@@ -312,7 +312,7 @@ static double do_memcpy_gettimeofday(const struct routine *r, size_t len,
 	void *src = NULL, *dst = NULL;
 	int i;
 
-	memcpy_alloc_mem(&src, &dst, len);
+	memcpy_alloc_mem(&dst, &src, len);
 
 	if (prefault)
 		fn(dst, src, len);
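[To see why the swapped arguments inflated the benchmark numbers: an
anonymous private mapping that has never been written is backed
entirely by the shared zero page, so copying *from* it is
unrealistically cache-friendly. A standalone sketch of that effect -
illustrative only, not part of the patch; exact timings vary by
machine.]

/* Compare memcpy from a never-written (zero-page-backed) buffer with
 * memcpy from a buffer whose pages were actually faulted in. All reads
 * from the untouched buffer hit the same physical 4 KiB page, so they
 * stay in cache. */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

#define LEN (64 * 1024 * 1024)

static double copy_seconds(void *dst, const void *src)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    memcpy(dst, src, LEN);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
}

int main(void)
{
    void *dst = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    void *zero_src = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    void *real_src = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (dst == MAP_FAILED || zero_src == MAP_FAILED ||
        real_src == MAP_FAILED)
        return 1;

    memset(real_src, 1, LEN);  /* fault in distinct physical pages */
    memset(dst, 0, LEN);       /* prefault the destination */

    printf("copy from zero-page-backed src: %.3f s\n",
           copy_seconds(dst, zero_src));
    printf("copy from touched src:          %.3f s\n",
           copy_seconds(dst, real_src));
    return 0;
}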
[PATCH v2] perf bench: fix order of arguments to memcpy_alloc_mem
This was causing the destination instead of the source to be filled.
As a result, the source was typically all mapped to one zero page, and
hence very cacheable.

Signed-off-by: Bruce Merry
---
 tools/perf/bench/mem-memcpy.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/bench/mem-memcpy.c b/tools/perf/bench/mem-memcpy.c
index 6c14afe..db1d3a2 100644
--- a/tools/perf/bench/mem-memcpy.c
+++ b/tools/perf/bench/mem-memcpy.c
@@ -289,7 +289,7 @@ static u64 do_memcpy_cycle(const struct routine *r, size_t len, bool prefault)
 	memcpy_t fn = r->fn.memcpy;
 	int i;
 
-	memcpy_alloc_mem(&src, &dst, len);
+	memcpy_alloc_mem(&dst, &src, len);
 
 	if (prefault)
 		fn(dst, src, len);
@@ -312,7 +312,7 @@ static double do_memcpy_gettimeofday(const struct routine *r, size_t len,
 	void *src = NULL, *dst = NULL;
 	int i;
 
-	memcpy_alloc_mem(&src, &dst, len);
+	memcpy_alloc_mem(&dst, &src, len);
 
 	if (prefault)
 		fn(dst, src, len);
-- 
1.9.1
[PATCH] perf bench: fix order of arguments to memcpy_alloc_mem
This was causing the destination instead of the source to be filled.
As a result, the source was typically all mapped to one zero page, and
hence very cacheable
---
 tools/perf/bench/mem-memcpy.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/bench/mem-memcpy.c b/tools/perf/bench/mem-memcpy.c
index 6c14afe..db1d3a2 100644
--- a/tools/perf/bench/mem-memcpy.c
+++ b/tools/perf/bench/mem-memcpy.c
@@ -289,7 +289,7 @@ static u64 do_memcpy_cycle(const struct routine *r, size_t len, bool prefault)
 memcpy_t fn = r->fn.memcpy;
 int i;

-memcpy_alloc_mem(&src, &dst, len);
+memcpy_alloc_mem(&dst, &src, len);

 if (prefault)
 fn(dst, src, len);
@@ -312,7 +312,7 @@ static double do_memcpy_gettimeofday(const struct routine *r, size_t len,
 void *src = NULL, *dst = NULL;
 int i;

-memcpy_alloc_mem(&src, &dst, len);
+memcpy_alloc_mem(&dst, &src, len);

 if (prefault)
 fn(dst, src, len);
-- 
1.9.1

--
Bruce Merry
Senior Science Processing Developer
SKA South Africa
Oops error
Hello

Sorry I had to send this to the whole developer list - there wasn't
much in the output of ksymoops that told me who to send it to.

Here's the background in case this is useful: I have a background
process that plays mp3's through amp. After one finished and another
tried to start, I got the oops and the mp3 never played (it went on to
the next one). The mp3 was on a UDF CD-RW, and the next one it played
was on the hard drive. Soon thereafter I noticed that all mp3's had
stopped and that any process that tried to read anything from /cdrom
(including ls /cdrom) went into daemon state and refused to die, even
with kill -9. I'm guessing that this means the problem is in either the
CD-ROM code or the UDF code, but it might be unrelated. I've attached
the output from ksymoops.

B4N
Bruce

/-----------------------------------------------------------------\
| Bruce Merry (Entropy)             | bmerry at iafrica dot com   |
| Proud user of Linux!              | http://www.cs.uct.ac.za/~bmerry |
| Disc space -- the final frontier! |                             |
\-----------------------------------------------------------------/

ksymoops 2.3.4 on i686 2.4.0-test8.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.0-test8/ (default)
     -m /usr/src/linux/System.map (default)

Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.

Sep 11 19:31:10 cheese kernel: Oops:
Sep 11 19:31:10 cheese kernel: CPU:    0
Sep 11 19:31:10 cheese kernel: EIP:    0010:[<c485223d>]
Using defaults from ksymoops -t elf32-i386 -a i386
Sep 11 19:31:10 cheese kernel: EFLAGS: 00010246
Sep 11 19:31:10 cheese kernel: eax: c2da4080  ebx: c3bf9ca0  ecx: 0008  edx: 07e81b13
Sep 11 19:31:10 cheese kernel: esi: c2ad  edi: 0005  ebp:  esp: c2de1ccc
Sep 11 19:31:10 cheese kernel: ds: 0018  es: 0018  ss: 0018
Sep 11 19:31:10 cheese kernel: Process amp (pid: 366, stackpage=c2de1000)
Sep 11 19:31:10 cheese kernel: Stack: c3bf9ca0 c2ad8400 fd036273 c2de1d64 c2de1d08 c4851f0f
Sep 11 19:31:10 cheese kernel:        c2ad8400 fd036273 c3df9800 c378f5e0
Sep 11 19:31:10 cheese kernel:        c484f25f c2ad8400 fd036273 c28d1860 c3df9800
Sep 11 19:31:10 cheese kernel: Call Trace: [<fd036273>] [<c4851f0f>] [<fd036273>] [<c484f25f>] [<fd036273>] [<fd036273>] [tcp_v4_send_check+45/112]
Sep 11 19:31:10 cheese kernel:        [<c484f5b1>] [do_no_page+84/256] [d_alloc+21/368] [real_lookup+79/224] [path_walk+614/2144] [open_namei+118/1360] [filp_open+49/112] [getname+104/176]
Sep 11 19:31:10 cheese kernel: Code: 0f b6 14 02 eb 3b 8d b6 00 00 00 00 8d bc 27 00 00 00 00 80

>>EIP; c485223d <[sb]__module_parm_desc_acer+3ae5/b908>   <=====
Trace; fd036273 <END_OF_CODE+387b841c/????>
Trace; c4851f0f <[sb]__module_parm_desc_acer+37b7/b908>
Trace; fd036273 <END_OF_CODE+387b841c/????>
Trace; c484f25f <[sb]__module_parm_desc_acer+b07/b908>
Trace; fd036273 <END_OF_CODE+387b841c/????>
Trace; fd036273 <END_OF_CODE+387b841c/????>
Trace; c484f5b1 <[sb]__module_parm_desc_acer+e59/b908>

Code;  c485223d <[sb]__module_parm_desc_acer+3ae5/b908>
<_EIP>:
Code;  c485223d <[sb]__module_parm_desc_acer+3ae5/b908>   <=====
   0:   0f b6 14 02               movzbl (%edx,%eax,1),%edx   <=====
Code;  c4852241 <[sb]__module_parm_desc_acer+3ae9/b908>
   4:   eb 3b                     jmp    41 <_EIP+0x41> c485227e <[sb]__module_parm_desc_acer+3b26/b908>
Code;  c4852243 <[sb]__module_parm_desc_acer+3aeb/b908>
   6:   8d b6 00 00 00 00         leal   0x0(%esi),%esi
Code;  c4852249 <[sb]__module_parm_desc_acer+3af1/b908>
   c:   8d bc 27 00 00 00 00      leal   0x0(%edi,1),%edi
Code;  c4852250 <[sb]__module_parm_desc_acer+3af8/b908>
  13:   80 00 00                  addb   $0x0,(%eax)

1 warning issued.  Results may not be reliable.