Bug#703356: megasas: Failed to alloc kernel SGL buffer for IOCTL (ref.#688198)
I noticed in the last reply to this bug that the MegaRAID Storage Manager is suspect. I'm running Ubuntu with a 3.5.0-32 kernel and see this same behavior when using the MegaCli64 command line tool. I run this tool through cron each hour to grab the logs from the RAID controller and put them into syslog. Everything was fine for a day or so, but now every time I run the tool I get an error message about the SGL buffer. I believe this appeared in the latest kernel update for Ubuntu. Perhaps a similar patch was applied to both Debian and Ubuntu recently?

-- 
http://mtu.net/~jpschewe
Bug#703356: megasas: Failed to alloc kernel SGL buffer for IOCTL (ref.#688198)
Jean-Francois Chevrette jf.cr...@gmail.com writes:

> Package: src:linux
> Version: 3.2.39-2
> Severity: important
>
> (first time submitting to a bug report, sorry if I missed anything)
>
> We are still affected by bug #688198

Yes, I see that it was closed after applying a related bugfix. But as I noted in http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=688198#25 the reported bug would not be fixed by this after all. The fixed bug was real, but unrelated to the reported one.

> We have other seemingly identical servers (hardware and software) and not all of them have this problem. Is there anything else I can provide to help?

The message indicates a memory allocation problem related to sending management commands from userspace to the driver/controller. Management commands are e.g. requests from smartctl, raid monitoring etc. All data transferred between these userspace applications and the controller must be copied to/from dma-coherent buffers for transfer to the controller, and it is the allocation of these buffers which fails, either because the requests are so bogus (too many or too big) that they just cannot be serviced, or because the system is out of memory in the appropriate pool.

Maybe we can get some ideas about why this fails if you describe the conditions you experience the problem under. I believe the fact that you only see this on some of your otherwise identical servers is very interesting. If we could find some pattern here, then that would help. Is there some special monitoring application running on the failing servers? Are there other devices in these servers which may have drivers eating memory?

I can't, but maybe the Debian kernel gurus can read something out of

  /proc/slabinfo
  /proc/buddyinfo
  /proc/pagetypeinfo

Comparing those files on a failing server and a non-failing server would certainly be interesting.

Bjørn

-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/87zjxz241l@nemi.mork.no
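
The suggestion above to compare /proc/buddyinfo between a failing and a non-failing server could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, it assumes the usual "Node N, zone NAME count count ..." layout where the i-th count is the number of free blocks of 2^i pages, and the allocation order the driver actually needs is not known from this thread, so min_order is a guess the reader has to supply.

```python
import re

# One line of /proc/buddyinfo: "Node 0, zone   Normal   10   4   0 ..."
LINE_RE = re.compile(r"Node\s+(\d+),\s+zone\s+(\S+)\s+(.*)")


def parse_buddyinfo(text):
    """Return {(node, zone): [free block count per order]} from buddyinfo text."""
    zones = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip blank or unexpected lines
        node, zone, counts = m.groups()
        zones[(int(node), zone)] = [int(c) for c in counts.split()]
    return zones


def free_pages_at_or_above(counts, min_order):
    """Pages available in free blocks of at least 2**min_order contiguous pages."""
    return sum(n << order for order, n in enumerate(counts) if order >= min_order)


def compare(failing, healthy, min_order):
    """List (zone key, pages on failing host, pages on healthy host) per zone.

    min_order is an assumption: pick the block size you suspect the
    failing allocation needs.
    """
    rows = []
    for key in sorted(set(failing) | set(healthy)):
        rows.append((
            key,
            free_pages_at_or_above(failing.get(key, []), min_order),
            free_pages_at_or_above(healthy.get(key, []), min_order),
        ))
    return rows
```

To use it, save the contents of /proc/buddyinfo from each server to a file, pass each file's text through parse_buddyinfo(), and print the rows from compare(). A zone that shows few or no pages at higher orders on the failing server but plenty on the healthy one would point at fragmentation rather than a plain shortage of memory.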
Bug#703356: megasas: Failed to alloc kernel SGL buffer for IOCTL (ref.#688198)
Jean-Francois Chevrette jf.cr...@gmail.com writes:

> On Tue, Mar 19, 2013 at 4:21 AM, Bjørn Mork bj...@mork.no wrote:
>
>> Maybe we can get some ideas about why this fails if you describe the conditions you experience the problem under.
>
> This server is running Xen 4.1 and a single VM. Nothing fancy there. It's also running DRBD to replicate a device to another server. It's also running a few userland tools for monitoring (nagios) and graphing (munin). Other than that nothing fancy. Nagios is the one calling MegaCli to monitor the array consistency.
>
> One thing to note is that after a server reboot, the MegaCli tool works fine for a while.

This does sound like there's a leak somewhere.

> I just found out that this server is also running a service called MegaRAID Storage Manager which is a tool provided by LSI to manage the array through a java GUI. Maybe this tool is somehow causing this problem.

That sounds like a very likely suspect, yes.

> Stopping it didn't solve the problem. I'll try disabling the tool and reboot without ever starting it to see if the problem occurs again.

Good. If that works then we probably should find out what this tool does to trigger the problem, so that it can be handled properly by the driver.

Bjørn

Archive: http://lists.debian.org/87mwtz1qma@nemi.mork.no