On Wed, 20 Feb 2008 15:46:25 +0100 Peter Zijlstra <[EMAIL PROTECTED]> wrote:

> Provide the basic infrastructure to reserve and charge/account network memory.
> 
> We provide the following reserve tree:
> 
> 1)  total network reserve
> 2)    network TX reserve
> 3)      protocol TX pages
> 4)    network RX reserve
> 5)      SKB data reserve
> 
> [1] is used to make all the network reserves a single subtree, for easy
> manipulation.
> 
> [2] and [4] are merely for aesthetic reasons.
> 
> The TX pages reserve [3] is assumed to be bounded, since it is the upper
> bound of memory that can be used for sending pages (not quite true, but
> good enough).
> 
> The SKB reserve [5] is an aggregate reserve, which is used to charge SKB data
> against in the fallback path.
> 
> The consumers for these reserves are sockets marked with:
>   SOCK_MEMALLOC
> 
> Such sockets are to be used to service the VM (iow. to swap over). They
> must be handled kernel side; exposing such a socket to user-space is a BUG.
> 
> +/**
> + *   sk_adjust_memalloc - adjust the global memalloc reserve for critical RX
> + *   @socks: number of new %SOCK_MEMALLOC sockets
> + *   @tx_reserve_pages: number of pages to (un)reserve for TX
> + *
> + *   This function adjusts the memalloc reserve based on system demand.
> + *   The RX reserve is a limit, and only added once, not for each socket.
> + *
> + *   NOTE:
> + *      @tx_reserve_pages is an upper-bound of memory used for TX hence
> + *      we need not account the pages like we do for RX pages.
> + */
> +int sk_adjust_memalloc(int socks, long tx_reserve_pages)
> +{
> +     int nr_socks;
> +     int err;
> +
> +     err = mem_reserve_pages_add(&net_tx_pages, tx_reserve_pages);
> +     if (err)
> +             return err;
> +
> +     nr_socks = atomic_read(&memalloc_socks);
> +     if (!nr_socks && socks > 0)
> +             err = mem_reserve_connect(&net_reserve, &mem_reserve_root);

This looks like it should have some locking?

> +     nr_socks = atomic_add_return(socks, &memalloc_socks);
> +     if (!nr_socks && socks)
> +             err = mem_reserve_disconnect(&net_reserve);

Or does that try to make up for it?  Still looks fishy.

> +     if (err)
> +             mem_reserve_pages_add(&net_tx_pages, -tx_reserve_pages);
> +
> +     return err;
> +}
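
To make the concern concrete: between the atomic_read() and the
atomic_add_return(), two callers can both observe a count of zero and both
try to connect the reserve. A minimal userspace sketch of one way to close
that window, assuming a sleeping lock is acceptable on this path (it may not
be in the real code). Every name below is a hypothetical stand-in, not the
kernel's reserve API:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the reserve and the socket counter. */
static int reserve_connected;   /* 1 while the reserve is "connected" */
static int memalloc_socks;      /* protected by memalloc_lock */
static pthread_mutex_t memalloc_lock = PTHREAD_MUTEX_INITIALIZER;

static int reserve_connect(void)    { reserve_connected = 1; return 0; }
static int reserve_disconnect(void) { reserve_connected = 0; return 0; }

/*
 * Adjust the socket count by @socks. The reserve is connected on the
 * 0 -> nonzero transition and disconnected on the nonzero -> 0
 * transition. Holding the lock across the test and the update means
 * no second caller can observe the stale count in between.
 */
static int adjust_memalloc_sketch(int socks)
{
    int err = 0;
    int old, now;

    pthread_mutex_lock(&memalloc_lock);
    old = memalloc_socks;
    now = old + socks;
    assert(now >= 0);           /* more releases than acquisitions */

    if (!old && now)
        err = reserve_connect();
    else if (old && !now)
        err = reserve_disconnect();

    if (!err)                   /* commit the count only on success */
        memalloc_socks = now;
    pthread_mutex_unlock(&memalloc_lock);

    return err;
}
```

Here the count is only committed once the connect or disconnect has
succeeded, so a failure leaves the count and the reserve state consistent.
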
> +
> +/**
> + *   sk_set_memalloc - sets %SOCK_MEMALLOC
> + *   @sk: socket to set it on
> + *
> + *   Set %SOCK_MEMALLOC on a socket and increase the memalloc reserve
> + *   accordingly.
> + */
> +int sk_set_memalloc(struct sock *sk)
> +{
> +     int set = sock_flag(sk, SOCK_MEMALLOC);
> +#ifndef CONFIG_NETVM
> +     BUG();
> +#endif

??  #error, maybe?

> +     if (!set) {
> +             int err = sk_adjust_memalloc(1, 0);
> +             if (err)
> +                     return err;
> +
> +             sock_set_flag(sk, SOCK_MEMALLOC);
> +             sk->sk_allocation |= __GFP_MEMALLOC;
> +     }
> +     return !set;
> +}
> +EXPORT_SYMBOL_GPL(sk_set_memalloc);
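
For reference, a rough sketch of what the #error variant might look like.
It turns the misconfiguration into a build failure instead of a runtime
BUG(). This assumes the code is only ever compiled when the option is
supposed to be set; if the file is built unconditionally, #error would break
every build without it. CONFIG_NETVM is defined by hand here only so that
the sketch compiles standalone, and netvm_built_in() is a hypothetical
helper:

```c
/* Assumption: defined by hand for this sketch; in the kernel the
 * symbol would come from Kconfig. */
#define CONFIG_NETVM 1

#ifndef CONFIG_NETVM
#error "SOCK_MEMALLOC support requires CONFIG_NETVM"
#endif

/* Hypothetical helper: only compiled when the check above passed. */
static int netvm_built_in(void)
{
    return 1;
}
```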
