On Wed, Aug 25, 2010 at 09:48:47PM -0400, Ted Ts'o wrote:
On Wed, Aug 25, 2010 at 05:30:42PM -0700, David Rientjes wrote:
We certainly hope that nobody will reimplement the same function without
the __deprecated warning, especially for order PAGE_ALLOC_COSTLY_ORDER
where there's no
On Wed, Aug 25, 2010 at 08:09:21PM -0700, David Rientjes wrote:
On Wed, 25 Aug 2010, Ted Ts'o wrote:
I think it's really sad that the caller can't know what the upper bounds
of its memory requirement are ahead of time or at least be able to
implement a memory freeing function when
On Wed, 2010-08-25 at 20:09 -0700, David Rientjes wrote:
Oh, we can determine an upper bound. You might just not like it.
Actually ext3/ext4 shouldn't be as bad as XFS, which Dave estimated to
be around 400k for a transaction. My guess is that the worst case for
ext3/ext4 is probably
On Wed, 25 Aug 2010, David Rientjes wrote:
Because we can remove the flag, remove branches from the page allocator
slowpath, and none of these allocations actually are dependent on
__GFP_NOFAIL since they are all under PAGE_ALLOC_COSTLY_ORDER.
Then we can simply remove __GFP_NOFAIL? Functions
On Thu, 26 Aug 2010, Christoph Lameter wrote:
Because we can remove the flag, remove branches from the page allocator
slowpath, and none of these allocations actually are dependent on
__GFP_NOFAIL since they are all under PAGE_ALLOC_COSTLY_ORDER.
Then we can simply remove __GFP_NOFAIL?
On Tue, Aug 24, 2010 at 01:11:26PM -0700, David Rientjes wrote:
On Tue, 24 Aug 2010, Jens Axboe wrote:
Should be possible to warn at build time for anyone using __GFP_NOFAIL
without wrapping it in a function.
We could make this __deprecated functions as Peter suggested if you think
On Wed, 2010-08-25 at 07:24 -0400, Ted Ts'o wrote:
Part of the problem is that we have a few places in the kernel where
failure is really not an option --- or rather, if we're going to fail
while we're in the middle of doing a commit, our choices really are
(a) retry the loop in the jbd layer
On Wed, Aug 25, 2010 at 01:35:32PM +0200, Peter Zijlstra wrote:
On Wed, 2010-08-25 at 07:24 -0400, Ted Ts'o wrote:
Part of the problem is that we have a few places in the kernel where
failure is really not an option --- or rather, if we're going to fail
while we're in the middle of doing a
On Wed, 2010-08-25 at 07:57 -0400, Ted Ts'o wrote:
On Wed, Aug 25, 2010 at 01:35:32PM +0200, Peter Zijlstra wrote:
On Wed, 2010-08-25 at 07:24 -0400, Ted Ts'o wrote:
Part of the problem is that we have a few places in the kernel where
failure is really not an option --- or rather, if
On Wed, 2010-08-25 at 14:48 +0200, Peter Zijlstra wrote:
On Wed, 2010-08-25 at 07:24 -0400, Ted Ts'o wrote:
(a) retry the loop in the jbd layer (which Andrew really doesn't
like), (b) keep our own private cache of free memory so we don't fail
and/or loop,
Well (b) sounds a lot
On Aug 25, 2010, at 8:52 AM, Peter Zijlstra wrote:
Also, there's a good reason for disliking (a), its a deadlock scenario,
suppose we need to write out data to get free pages, but the writing out
is blocked on requiring free pages.
There's really nothing the page allocator can do to help
On Wed, Aug 25, 2010 at 02:48:36PM +0200, Peter Zijlstra wrote:
On Wed, 2010-08-25 at 07:57 -0400, Ted Ts'o wrote:
On Wed, Aug 25, 2010 at 01:35:32PM +0200, Peter Zijlstra wrote:
On Wed, 2010-08-25 at 07:24 -0400, Ted Ts'o wrote:
Part of the problem is that we have a few places in the
On Wed, 2010-08-25 at 09:20 -0400, Theodore Tso wrote:
On Aug 25, 2010, at 8:52 AM, Peter Zijlstra wrote:
Also, there's a good reason for disliking (a), its a deadlock scenario,
suppose we need to write out data to get free pages, but the writing out
is blocked on requiring free pages.
On Wed, 2010-08-25 at 09:20 -0400, Theodore Tso wrote:
Well, if all of these users start having their own private pools of
emergency memory, I'm not sure that's such a great idea either.
That's a secondary problem, and could be reduced by something like the
memory reservation system I provided
On Wed, 2010-08-25 at 23:24 +1000, Dave Chinner wrote:
That is, the guarantee that we will always make progress simply does
not exist in filesystems, so a mempool-like concept seems to me to
be doomed from the start
While I appreciate that it might be somewhat (a lot) harder for a
On Wed, 2010-08-25 at 07:24 -0400, Ted Ts'o wrote:
There are other places where we can fail safely (for example, in jbd's
start_this_handle, although that just pushes the layer up the stack,
and ultimately, to userspace where most userspace programs don't
really expect ENOMEM to get returned
On Wed, 25 Aug 2010, Peter Zijlstra wrote:
I'm not sure, but I think the cgroup thing doesn't account kernel
allocations, in which case the above problem doesn't exist.
Right, the only cgroup that does is cpusets because it binds the memory
allocations to a set of nodes.
For the cpuset
On Wed, Aug 25, 2010 at 03:35:42PM +0200, Peter Zijlstra wrote:
While I appreciate that it might be somewhat (a lot) harder for a
filesystem to provide that guarantee, I'd be deeply worried about your
claim that its impossible.
It would render a system without swap very prone to deadlocks.
On Wed, 25 Aug 2010, Ted Ts'o wrote:
A pool of free pages which is reserved for routines that are doing
page cleaning would probably also be a good idea. Maybe that's just
retrying with GFP_ATOMIC if a normal allocation fails, or maybe we
need our own special pool, or maybe we need to
On Wed, 25 Aug 2010, Peter Zijlstra wrote:
The cpusets case is actually the easiest to fix: use GFP_ATOMIC.
I don't think that's a valid usage of GFP_ATOMIC, I think we should
fallback to outside the cpuset for kernel allocations by default.
Cpusets doesn't enforce isolation for only
On Wed, 2010-08-25 at 16:11 -0500, Christoph Lameter wrote:
We already have __GFP_NOFAIL behavior for slab allocations since
a __GFP_NOFAIL flag is passed through to the page allocator if no objects
are available.
The page allocator will do the same as when called directly with
__GFP_NOFAIL
On Wed, 25 Aug 2010, David Rientjes wrote:
It all depends on what flags are passed to kmalloc(), slab nor slub
enforce __GFP_NOFAIL behavior themselves. In slab, cache_grow() will
return NULL depending on whether the page allocator returns NULL, and that
would only happen for __GFP_NORETRY
On Wed, 2010-08-25 at 16:53 -0400, Ted Ts'o wrote:
On Wed, Aug 25, 2010 at 03:35:42PM +0200, Peter Zijlstra wrote:
While I appreciate that it might be somewhat (a lot) harder for a
filesystem to provide that guarantee, I'd be deeply worried about your
claim that its impossible.
It
On Wed, 25 Aug 2010, Christoph Lameter wrote:
If the higher order fails in slub then an order 0 alloc is attempted
without __GFP_NORETRY. In both cases the nofail behavior of the page
allocator determines the outcode.
Right, I thought you said the slab layer passes __GFP_NOFAIL when there's
On Wed, 25 Aug 2010, Peter Zijlstra wrote:
There's still no hard guarantee that the memory will allocatable
(GFP_KERNEL, the compaction, then GFP_ATOMIC could all still fail), but I
don't see how continuously looping the page allocator is possibly supposed
to help in these situations.
On Wed, Aug 25, 2010 at 03:35:42PM +0200, Peter Zijlstra wrote:
On Wed, 2010-08-25 at 23:24 +1000, Dave Chinner wrote:
That is, the guarantee that we will always make progress simply does
not exist in filesystems, so a mempool-like concept seems to me to
be doomed from the start
On Wed, Aug 25, 2010 at 04:11:38PM -0700, David Rientjes wrote:
I'll repropose the patchset with __deprecated as you suggested. Thanks!
And what Dave and I are saying is that we'll either need to do our on
loop to avoid the deprecation warning, or the use of the deprecated
function will
On Wed, 25 Aug 2010, Ted Ts'o wrote:
I'll repropose the patchset with __deprecated as you suggested. Thanks!
And what Dave and I are saying is that we'll either need to do our on
loop to avoid the deprecation warning, or the use of the deprecated
function will probably be used forever...
On Wed, 25 Aug 2010, David Rientjes wrote:
On Wed, 25 Aug 2010, Christoph Lameter wrote:
If the higher order fails in slub then an order 0 alloc is attempted
without __GFP_NORETRY. In both cases the nofail behavior of the page
allocator determines the outcode.
Right, I thought you
On Wed, Aug 25, 2010 at 05:30:42PM -0700, David Rientjes wrote:
We certainly hope that nobody will reimplement the same function without
the __deprecated warning, especially for order PAGE_ALLOC_COSTLY_ORDER
where there's no looping at a higher level. So perhaps the best
alternative is
On Wed, 25 Aug 2010, Ted Ts'o wrote:
We certainly hope that nobody will reimplement the same function without
the __deprecated warning, especially for order PAGE_ALLOC_COSTLY_ORDER
where there's no looping at a higher level. So perhaps the best
alternative is to implement the same
On Wed, 25 Aug 2010, Christoph Lameter wrote:
Right, I thought you said the slab layer passes __GFP_NOFAIL when there's
no objects available.
Yes, the slab layer calls the page allocator when there are no objects
available and passes the __GFP_NOFAIL that the user may have set in the
Add kmalloc_nofail(), kcalloc_nofail(), and kzalloc_nofail(). These
functions are equivalent to kmalloc(), kcalloc(), and kzalloc(),
respectively, except that they will never return NULL and instead loop
forever trying to allocate memory.
If the first allocation attempt fails, a warning will be
On Tue, 2010-08-24 at 03:50 -0700, David Rientjes wrote:
These were added as helper functions for documentation and auditability.
No future callers should be added.
git grep GFP_NOFAIL isn't auditable enough?
might as well declare these functions depricated if you really want to
do this.
FWIW
On 2010-08-24 15:29, Peter Zijlstra wrote:
On Tue, 2010-08-24 at 03:50 -0700, David Rientjes wrote:
These were added as helper functions for documentation and auditability.
No future callers should be added.
git grep GFP_NOFAIL isn't auditable enough?
might as well declare these functions
On Tue, Aug 24, 2010 at 03:29:18PM +0200, Peter Zijlstra wrote:
On Tue, 2010-08-24 at 03:50 -0700, David Rientjes wrote:
These were added as helper functions for documentation and auditability.
No future callers should be added.
git grep GFP_NOFAIL isn't auditable enough?
might as well
On Tue, 24 Aug 2010, Jens Axboe wrote:
Should be possible to warn at build time for anyone using __GFP_NOFAIL
without wrapping it in a function.
We could make this __deprecated functions as Peter suggested if you think
build time warnings for existing users would be helpful.
--
To
On Tue, 24 Aug 2010, Dave Chinner wrote:
git grep GFP_NOFAIL isn't auditable enough?
might as well declare these functions depricated if you really want to
do this.
Also, if you are going to add tight loops, you might want to put a
backoff in the loops like
38 matches
Mail list logo