On 10/14/13 11:17 PM, Matthew Ahrens wrote:
> That message is about failing due to running out of memory, which your
> changes don't address. Your changes address running out of virtual
> address space.

Actually no, in that case the machine was running "low on memory" only in the sense of there being very few large unallocated chunks. They had 128GB of physical memory, of which 90GB was used by the ARC - not exactly what you'd call a traditional low-memory situation (the ARC being considered expendable). What happened was that the kernel address space got heavily fragmented and the ARC wasn't feeling much pressure to contract, so larger allocations started failing. What compounded the situation was that the allocation originated in the ARC itself and was KM_SLEEP, which meant that after a journey through the VM subsystem, the ARC *did* attempt to contract to accommodate the new allocation, but it deadlocked (due to having reentered itself).

In the hash table allocation case this deadlock fortunately can't happen (KM_NOSLEEP), but what will happen is that the buf_init routine will retry after lowering its request size, potentially resulting in a much smaller hash table with far worse performance. The performance impact can be very hard to diagnose and, since the problem is transient (after a reboot it's most likely gone), very hard to pin down.

Cheers,
--
Saso
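
For illustration, the retry behaviour described above can be sketched in userland C. This is a rough model, not the actual buf_init code: try_alloc_nosleep() and hash_table_size() are hypothetical names, the max_contig parameter is a made-up stand-in for "largest contiguous chunk still available", and sizing the table for a 64K average block size is an assumption rather than something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a KM_NOSLEEP allocation: it never blocks and
 * simply returns NULL for any request larger than max_contig bytes,
 * mimicking a fragmented address space with no large free chunks left.
 */
static void *
try_alloc_nosleep(size_t size, size_t max_contig)
{
	if (size > max_contig)
		return (NULL);
	return (calloc(1, size));
}

/*
 * Pick a power-of-two bucket count for phys_bytes of memory (assuming one
 * bucket per 64K of memory), then halve the request until the allocation
 * succeeds. Returns the bucket count that was finally obtained.
 */
static uint64_t
hash_table_size(uint64_t phys_bytes, size_t max_contig)
{
	uint64_t hsize = 1ULL << 12;
	void *table;

	while (hsize * 65536 < phys_bytes)
		hsize <<= 1;

	/* No sleeping and no reclaim: on failure just ask for half as much. */
	while ((table = try_alloc_nosleep(hsize * sizeof (void *),
	    max_contig)) == NULL) {
		assert(hsize > (1ULL << 8));	/* give up below 256 buckets */
		hsize >>= 1;
	}

	free(table);
	return (hsize);
}
```

Under these assumptions a 128GB machine asks for 2^21 buckets (16MB of 8-byte pointers). If the largest chunk it can get is 1MB, the loop quietly settles for 2^17 buckets, i.e. hash chains roughly 16 times longer on average, and nothing is logged to say so.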
