[UPDATED PATCH] SGISEEQ: use cached memory access to make driver work on IP28

2007-12-02 Thread Thomas Bogendoerfer
SGI IP28 machines would need special treatment (enabling additional
wait states) when accessing memory uncached. To avoid this pain I changed
the driver to use only cached access to memory.

Signed-off-by: Thomas Bogendoerfer [EMAIL PROTECTED]
---

Changes to last version:
- Use inline functions for dma_sync_* instead of macros (suggested by Ralf)
- Added Kconfig change to make selection for similar SGI boxes easier


 drivers/net/Kconfig   |2 +-
 drivers/net/sgiseeq.c |  243 ++---
 2 files changed, 171 insertions(+), 74 deletions(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 7a55bc1..9cbd5de 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -1795,7 +1795,7 @@ config DE620
 
 config SGISEEQ
 	tristate "SGI Seeq ethernet controller support"
-	depends on SGI_IP22
+	depends on SGI_HAS_SEEQ
 	help
 	  Say Y here if you have an Seeq based Ethernet network card. This is
 	  used in many Silicon Graphics machines.
diff --git a/drivers/net/sgiseeq.c b/drivers/net/sgiseeq.c
index ff40563..c69bb8b 100644
--- a/drivers/net/sgiseeq.c
+++ b/drivers/net/sgiseeq.c
@@ -12,7 +12,6 @@
 #include <linux/init.h>
 #include <linux/types.h>
 #include <linux/interrupt.h>
-#include <linux/slab.h>
 #include <linux/string.h>
 #include <linux/delay.h>
 #include <linux/netdevice.h>
@@ -53,14 +52,27 @@ static char *sgiseeqstr = "SGI Seeq8003";
 				  sp->tx_old + (SEEQ_TX_BUFFERS - 1) - sp->tx_new : \
 				  sp->tx_old - sp->tx_new - 1)
 
+#define VIRT_TO_DMA(sp, v) ((sp)->srings_dma +                                 \
+				  (dma_addr_t)((unsigned long)(v) -            \
+					       (unsigned long)((sp)->rx_desc)))
+
+/* Copy frames shorter than rx_copybreak, otherwise pass on up in
+ * a full sized sk_buff.  Value of 100 stolen from tulip.c (!alpha).
+ */
+static int rx_copybreak = 100;
+
+#define PAD_SIZE    (128 - sizeof(struct hpc_dma_desc) - sizeof(void *))
+
 struct sgiseeq_rx_desc {
volatile struct hpc_dma_desc rdma;
-   volatile signed int buf_vaddr;
+   u8 padding[PAD_SIZE];
+   struct sk_buff *skb;
 };
 
 struct sgiseeq_tx_desc {
volatile struct hpc_dma_desc tdma;
-   volatile signed int buf_vaddr;
+   u8 padding[PAD_SIZE];
+   struct sk_buff *skb;
 };
 
 /*
@@ -96,6 +108,18 @@ struct sgiseeq_private {
spinlock_t tx_lock;
 };
 
+static inline void dma_sync_desc_cpu(struct net_device *dev, void *addr)
+{
+	dma_cache_sync(dev->dev.parent, addr, sizeof(struct sgiseeq_rx_desc),
+		       DMA_FROM_DEVICE);
+}
+
+static inline void dma_sync_desc_dev(struct net_device *dev, void *addr)
+{
+	dma_cache_sync(dev->dev.parent, addr, sizeof(struct sgiseeq_rx_desc),
+		       DMA_TO_DEVICE);
+}
+
 static inline void hpc3_eth_reset(struct hpc3_ethregs *hregs)
 {
 	hregs->reset = HPC3_ERST_CRESET | HPC3_ERST_CLRIRQ;
@@ -163,35 +187,55 @@ static int seeq_init_ring(struct net_device *dev)
 
/* Setup tx ring. */
 	for(i = 0; i < SEEQ_TX_BUFFERS; i++) {
-		if (!sp->tx_desc[i].tdma.pbuf) {
-			unsigned long buffer;
-
-			buffer = (unsigned long) kmalloc(PKT_BUF_SZ, GFP_KERNEL);
-			if (!buffer)
-				return -ENOMEM;
-			sp->tx_desc[i].buf_vaddr = CKSEG1ADDR(buffer);
-			sp->tx_desc[i].tdma.pbuf = CPHYSADDR(buffer);
-		}
 		sp->tx_desc[i].tdma.cntinfo = TCNTINFO_INIT;
+		dma_sync_desc_dev(dev, &sp->tx_desc[i]);
 	}
 
 	/* And now the rx ring. */
 	for (i = 0; i < SEEQ_RX_BUFFERS; i++) {
 		if (!sp->rx_desc[i].rdma.pbuf) {
-			unsigned long buffer;
+			dma_addr_t dma_addr;
+			struct sk_buff *skb = netdev_alloc_skb(dev, PKT_BUF_SZ);
 
-			buffer = (unsigned long) kmalloc(PKT_BUF_SZ, GFP_KERNEL);
-			if (!buffer)
+			if (skb == NULL)
 				return -ENOMEM;
-			sp->rx_desc[i].buf_vaddr = CKSEG1ADDR(buffer);
-			sp->rx_desc[i].rdma.pbuf = CPHYSADDR(buffer);
+			skb_reserve(skb, 2);
+			dma_addr = dma_map_single(dev->dev.parent,
+						  skb->data - 2,
+						  PKT_BUF_SZ, DMA_FROM_DEVICE);
+			sp->rx_desc[i].skb = skb;
+			sp->rx_desc[i].rdma.pbuf = dma_addr;
 		}
 		sp->rx_desc[i].rdma.cntinfo = RCNTINFO_INIT;
+		dma_sync_desc_dev(dev, &sp->rx_desc[i]);
 	}
 	sp->rx_desc[i - 1].rdma.cntinfo |= HPCDMA_EOR;
+	dma_sync_desc_dev(dev, &sp->rx_desc[i - 1]);
return 0;
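
A note for readers following the descriptor changes in this patch: the
PAD_SIZE arithmetic makes each ring descriptor exactly 128 bytes, presumably
so that a cache sync of one descriptor never touches its neighbour, and
VIRT_TO_DMA turns a descriptor's CPU address into a bus address by adding its
byte offset within the ring to the ring's DMA base. The user-space sketch
below only checks that arithmetic. struct fake_hpc_dma_desc (assumed to be
three 32-bit words), struct fake_ring and ring_virt_to_dma() are stand-ins
invented for illustration, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct hpc_dma_desc; the three-word layout is an assumption. */
struct fake_hpc_dma_desc {
	uint32_t pbuf;
	uint32_t cntinfo;
	uint32_t pnext;
};

/* Same arithmetic as PAD_SIZE in the patch. */
#define PAD_SIZE (128 - sizeof(struct fake_hpc_dma_desc) - sizeof(void *))

struct fake_rx_desc {
	struct fake_hpc_dma_desc rdma;
	uint8_t padding[PAD_SIZE];
	void *skb;			/* struct sk_buff * in the driver */
};

/* Each descriptor fills one 128-byte block on 32-bit and 64-bit builds. */
static_assert(sizeof(struct fake_rx_desc) == 128,
	      "descriptor must be exactly 128 bytes");

/* Hypothetical miniature of the driver's private data. */
struct fake_ring {
	struct fake_rx_desc *rx_desc;	/* CPU address of the first descriptor */
	uint64_t srings_dma;		/* bus address of the first descriptor */
};

/* Function form of VIRT_TO_DMA: DMA base plus byte offset into the ring. */
static uint64_t ring_virt_to_dma(const struct fake_ring *sp, const void *v)
{
	return sp->srings_dma + ((uintptr_t)v - (uintptr_t)sp->rx_desc);
}
```

With a ring based at bus address 0x1000, descriptor 2 would map to
0x1000 + 2 * 128 under these assumptions.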

Re: [PATCH 13/21] [TCP]: Store retransmit queue packets in RB tree.

2007-12-02 Thread Ilpo Järvinen
On Sun, 2 Dec 2007, Herbert Xu wrote:

> On Sun, Dec 02, 2007 at 12:48:08AM +0200, Ilpo Järvinen wrote:
> > 
> > This work was mostly done by David S. Miller.
> 
> ...
> 
> > @@ -253,6 +254,8 @@ struct sk_buff {
> >  	struct sk_buff		*next;
> >  	struct sk_buff		*prev;
> >  
> > +	struct rb_node		rb;
> 
> Did David really do this? :)

...Believe it or not, yes he did :-), and there's even more convincing
proof of that in here:

http://marc.info/?l=linux-netdev&m=117269223013422&w=2

...and in Dave's obsolete tcp-2.6 tree (which has the bug-fixed
version)... ;-)

...I changed it just by moving tp->write_queue_rb = RB_ROOT to another
function because I needed more initialization later on.

> I suppose if we moved this into cb (by growing it by the required amount)
> then at least the new space can be used by others.

...yeah, will do.


-- 
 i.

Re: [PATCH 12/21] [TCP]: Introduce per skb fack_counts to retransmit queue

2007-12-02 Thread Ilpo Järvinen
On Sun, 2 Dec 2007, Herbert Xu wrote:

> On Sun, Dec 02, 2007 at 12:48:07AM +0200, Ilpo Järvinen wrote:
> >
> > @@ -1220,6 +1221,11 @@ static inline struct sk_buff *tcp_write_queue_next(struct sock *sk, struct sk_bu
> >  	return skb->next;
> >  }
> >  
> > +static inline struct sk_buff *tcp_write_queue_prev(struct sock *sk, struct sk_buff *skb)
> > +{
> > +	return skb->prev;
> > +}
> > +
> >  #define tcp_for_write_queue(skb, sk)					\
> >  		for (skb = (sk)->sk_write_queue.next;			\
> >  		     (skb != (struct sk_buff *)&(sk)->sk_write_queue);	\
> > @@ -1241,6 +1247,11 @@ static inline struct sk_buff *tcp_send_head(struct sock *sk)
> >  
> >  static inline void tcp_advance_send_head(struct sock *sk, struct sk_buff *skb)
> >  {
> > +	struct sk_buff *prev = tcp_write_queue_prev(sk, skb);
> > +
> > +	TCP_SKB_CB(skb)->fack_count = TCP_SKB_CB(prev)->fack_count +
> > +				      tcp_skb_pcount(prev);
> > +
> >  	sk->sk_send_head = skb->next;
> >  	if (sk->sk_send_head == (struct sk_buff *)&sk->sk_write_queue)
> >  		sk->sk_send_head = NULL;
> 
> This crashed my machine on boot.  The reason is that prev was empty.
> I've added this fix to net-2.6.25.

Hmm didn't crash for me, probably a silent corruption instead :-(.

> [TCP]: Fix crash in tcp_advance_send_head
> 
> We need to check whether there is a prev in tcp_advance_send_head
> before dereferencing it. This patch does just that and leaves the
> fack count at zero if there is no prev.

Maybe I could add catch for invalid skb dereferences (those list heads) to 
TCP_SKB_CB and tcp_skb_pcount if some DEBUG thing is set. Those two should 
catch at least 98% of the invalid users.

...Not sure if CONFIG_DEBUG_LIST is right one to use, suggestion welcome.

-- 
 i.
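
For anyone wondering how prev can be "empty": the write queue is a circular
list whose head is not a real skb, so the first packet's prev points at the
list head, and reading per-packet fields through it touches unrelated memory.
Below is a minimal user-space sketch of the guard described in the fix quoted
in this message. struct node, struct pkt, struct pkt_queue and
advance_fack_count() are invented for illustration; this is not the kernel
code.

```c
#include <assert.h>
#include <stddef.h>

/* Link fields shared by the list head and by every packet. */
struct node {
	struct node *next;
	struct node *prev;
};

/* Hypothetical packet; link must stay the first member. */
struct pkt {
	struct node link;
	unsigned int fack_count;	/* packets queued ahead of this one */
	unsigned int pcount;		/* packets carried by this node */
};

/* The head has link fields only; it carries no fack_count or pcount. */
struct pkt_queue {
	struct node head;
};

static void queue_init(struct pkt_queue *q)
{
	q->head.next = q->head.prev = &q->head;
}

static void queue_add_tail(struct pkt_queue *q, struct pkt *p)
{
	assert(q != NULL && p != NULL);
	p->link.next = &q->head;
	p->link.prev = q->head.prev;
	q->head.prev->next = &p->link;
	q->head.prev = &p->link;
}

/* Previous packet, or NULL when p is first (its prev is the list head). */
static struct pkt *queue_prev(const struct pkt_queue *q, const struct pkt *p)
{
	if (p->link.prev == &q->head)
		return NULL;
	return (struct pkt *)p->link.prev;	/* valid: link is the first member */
}

/* Leave the count at zero when there is no previous packet. */
static void advance_fack_count(const struct pkt_queue *q, struct pkt *p)
{
	struct pkt *prev = queue_prev(q, p);

	p->fack_count = prev ? prev->fack_count + prev->pcount : 0;
}
```

A debug check of the kind discussed in this thread would amount to asserting
that the pointer handed to the per-packet accessors is never the list head.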

Re: [PATCH 12/21] [TCP]: Introduce per skb fack_counts to retransmit queue

2007-12-02 Thread Herbert Xu
On Sun, Dec 02, 2007 at 01:29:36PM +0200, Ilpo Järvinen wrote:

> Hmm didn't crash for me, probably a silent corruption instead :-(.

It crashed during NIS start-up and I suppose I'm the only one still
running NIS :)

> Maybe I could add catch for invalid skb dereferences (those list heads) to
> TCP_SKB_CB and tcp_skb_pcount if some DEBUG thing is set. Those two should
> catch at least 98% of the invalid users.
> 
> ...Not sure if CONFIG_DEBUG_LIST is right one to use, suggestion welcome.

Sounds OK to me.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 1/3] [NET] phy/fixed.c: rework to not duplicate PHYlayer functionality

2007-12-02 Thread Joakim Tjernlund
[SNIP]
> ^^ the correct solution is to implement arch_initcall function
> which will create fixed PHYs, and then leave only
> snprintf(fpi->bus_id, 16, PHY_ID_FMT, 0, *data); part in the
> fs_enet's find_phy().
> 
> Try add something like this to the fsl_soc.c (compile untested):
> 
> - - - -
> static int __init of_add_fixed_phys(void)
> {
> 	struct device_node *np;
> 	const u32 *prop;
> 	struct fixed_phy_status status = {};
> 
> 	while ((np = of_find_node_by_name(NULL, "ethernet"))) {
> 		data  = of_get_property(np, "fixed-link", NULL);
> 		if (!data)
> 			continue;
> 
> 		status.link = 1;
> 		status.duplex = data[1];
> 		status.speed  = data[2];

What about Pause and Asym_Pause? Dunno why so few, if any, eth drivers
impl. it, but the PHY lib supports it.
Even if fixed PHYs doesn't support it directly I think the OF interface
should have it.

- fixed-link : a b c d e where a is emulated phy id - choose any,
  but unique to the all specified fixed-links, b is duplex - 0 half,
  1 full, c is link speed - d#10/d#100/d#1000, d is pause - 0 no pause,
  1 pause, e is asym_pause - 0 no asym_pause, 1 asym_pause.

Jocke
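
To make the five-cell proposal concrete, here is a sketch of how such a
property could be decoded and validated. It assumes the cells arrive as an
array of five 32-bit values already in CPU byte order; struct fixed_link_cfg
and parse_fixed_link() are hypothetical names, not part of the PHY library or
of any posted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of "fixed-link = <a b c d e>". */
struct fixed_link_cfg {
	uint32_t phy_id;	/* a: emulated PHY id */
	int duplex;		/* b: 0 half, 1 full */
	int speed;		/* c: 10, 100 or 1000 */
	int pause;		/* d: 0 no pause, 1 pause */
	int asym_pause;		/* e: 0 no asym_pause, 1 asym_pause */
};

/* Returns 0 on success, -1 if the property is malformed. */
static int parse_fixed_link(const uint32_t *cells, size_t ncells,
			    struct fixed_link_cfg *cfg)
{
	assert(cfg != NULL);

	if (cells == NULL || ncells != 5)
		return -1;
	/* duplex, pause and asym_pause are booleans */
	if (cells[1] > 1 || cells[3] > 1 || cells[4] > 1)
		return -1;
	if (cells[2] != 10 && cells[2] != 100 && cells[2] != 1000)
		return -1;

	cfg->phy_id = cells[0];
	cfg->duplex = (int)cells[1];
	cfg->speed = (int)cells[2];
	cfg->pause = (int)cells[3];
	cfg->asym_pause = (int)cells[4];
	return 0;
}
```

Rejecting anything other than exactly five cells is a choice made for this
sketch; a real binding might want to keep accepting a shorter form and treat
the missing pause cells as zero.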

 
> 		ret = fixed_phy_add(PHY_POLL, data[0], &status);
> 		if (ret)
> 			return ret;
> 	}
> 
> 	return 0;
> }
> arch_initcall(of_add_fixed_phys);
> - - - -
> 
> And remove fixed_phy_add() from the fs_enet. This should work
> nicely and also should be ideologically correct. ;-)
> 
> > How is this supposed to work for modules or for the
> > PPC_CPM_NEW_BINDING mode where the device tree is no longer scanned
> > during fs_soc initialization but during device initialization?
> 
> We should mark fixed.c as bool. Fake/virtual/fixed/platform PHYs
> creation is architecture code anyway, can't be =m.
> 
> -- 
> Anton Vorontsov
> email: [EMAIL PROTECTED]
> backup email: [EMAIL PROTECTED]
> irc://irc.freenode.net/bd2



Re: [PATCH 1/3] [NET] phy/fixed.c: rework to not duplicate PHYlayer functionality

2007-12-02 Thread Anton Vorontsov
On Sun, Dec 02, 2007 at 12:54:36PM +0100, Joakim Tjernlund wrote:
> [SNIP]
> > ^^ the correct solution is to implement arch_initcall function
> > which will create fixed PHYs, and then leave only
> > snprintf(fpi->bus_id, 16, PHY_ID_FMT, 0, *data); part in the
> > fs_enet's find_phy().
> > 
> > Try add something like this to the fsl_soc.c (compile untested):
> > 
> > - - - -
> > static int __init of_add_fixed_phys(void)
> > {
> > 	struct device_node *np;
> > 	const u32 *prop;
> > 	struct fixed_phy_status status = {};
> > 
> > 	while ((np = of_find_node_by_name(NULL, "ethernet"))) {
> > 		data  = of_get_property(np, "fixed-link", NULL);
> > 		if (!data)
> > 			continue;
> > 
> > 		status.link = 1;
> > 		status.duplex = data[1];
> > 		status.speed  = data[2];
> 
> What about Pause and Asym_Pause?

Will be addressed in the next respin of these patches. Let's
hope on Monday.

> Dunno why so few, if any, eth drivers
> impl. it, but the PHY lib supports it.
> Even if fixed PHYs doesn't support it directly I think the OF interface
> should have it.
> 
> - fixed-link : a b c d e where a is emulated phy id - choose any,
>   but unique to the all specified fixed-links, b is duplex - 0 half,
>   1 full, c is link speed - d#10/d#100/d#1000, d is pause - 0 no pause,
>   1 pause, e is asym_pause - 0 no asym_pause, 1 asym_pause.

-- 
Anton Vorontsov
email: [EMAIL PROTECTED]
backup email: [EMAIL PROTECTED]
irc://irc.freenode.net/bd2


Re: SSB: No is not an answer

2007-12-02 Thread Arnaldo Carvalho de Melo
On Sat, Dec 01, 2007 at 09:27:49PM +0100, Michael Buesch wrote:
> On Saturday 01 December 2007 20:00:23 Arnaldo Carvalho de Melo wrote:
> > On Sat, Dec 01, 2007 at 12:45:32PM -0500, John W. Linville wrote:
> > > On Sat, Dec 01, 2007 at 03:17:44PM -0200, Arnaldo Carvalho de Melo wrote:
> > > > Sonics Silicon Backplane support (SSB) [M/y/?] (NEW) n
> > > > 
> > > > Support for the Sonics Silicon Backplane bus.
> > > > You only need to enable this option, if you are
> > > > configuring a kernel for an embedded system with
> > > > this bus.
> > > > It will be auto-selected if needed in other
> > > > environments.
> > > > 
> > > > The module will be called ssb.
> > > > 
> > > > If unsure, say N.
> > > > 
> > > > Sonics Silicon Backplane support (SSB) [M/y/?] (NEW)
> > > 
> > > I think this is OK -- it isn't really offering the choice to say
> > > no anyway.  You must have turned-on B44 or B43(LEGACY) already?
> > > 
> > > So, your choice is merely whether to have it built-in or as a module.
> > 
> > Ok, so the comment on being unsure is wrong as we can't say N as
> > suggested :-)
> 
> Oh, come on... Read the _whole_ comment.

Oh well, if you think that it is OK, so be it. I found it confusing and
reported it; you find it OK to suggest choosing 'N' when 'M' or 'y' are the
only possible answers.

Perhaps saying that N is not a possible answer when B44 or B43 was
previously selected could help. Up to you.

- Arnaldo


[PATCH 1/5] Remove trailing NULs from network bonding sysfs interface.

2007-12-02 Thread Wagner Ferenc
Also remove trailing spaces from multivalued files.

This fixes output like the following, for example:

$ od -c /sys/class/net/bond0/bonding/slaves
000   e   t   h   -   l   e   f   t   e   t   h   -   r   i   g
020   h   t  \n  \0
025

It mostly entails deleting '+1'-s after sprintf() calls: the return value
of sprintf is the number of characters printed, without the closing NUL,
i.e. exactly what the sysfs interface requires.  The three multivalue
cases are different, because they also have to swallow back a trailing
space.

Signed-off-by: Ferenc Wagner [EMAIL PROTECTED]
---

Jay Vosburgh [EMAIL PROTECTED] writes:

> > The previous code returned '\n' (that is, a single empty line)
> > from most files, with one exception (xmit_hash_policy), where
> > it returned 'NA\n'.  This patch consolidates each file to return
> > nothing at all if not applicable, not even a '\n'.
> > 
> > I find this behaviour more usual, more useful, more efficient
> > and shorter to code from both sides.
> [...]
> > +	if ((bond->params.mode == BOND_MODE_XOR) ||
> > +	    (bond->params.mode == BOND_MODE_8023AD)) {
> >  		count = sprintf(buf, "%s %d\n",
> >  			xmit_hashtype_tbl[bond->params.xmit_policy].modename,
> >  			bond->params.xmit_policy);
> 
> 	Rather than this (returning nothing if not in xor or 802.3ad
> mode), I'd prefer to see this always return whatever the xmit policy is
> (regardless of the mode), and remove the mode test from
> bonding_store_xmit_hash().
> 
> 	This would be consistent with the way the arp_ip_target option
> is treated: the actual value is always displayed, even if it is not
> used, and it is legal to change the value, regardless of the mode.

Okay, I'm resending the full patch series with correct subject
counters accounting for the two followup patches taking care for your
comments (4 and 5).  Please let me know if something is still missing.

Thanks,
Feri.
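
The rule this cleanup relies on can be checked in user space: sprintf()
returns the number of characters written, excluding the terminating NUL, so
returning that count (rather than count + 1) hands exactly the visible bytes
back to the reader. The sketch below also covers the multivalue case, where
the last separator is overwritten with a newline instead of appending one.
show_names() is a made-up stand-in for the show functions, and the buffer is
assumed to be large enough for the input.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the names separated by single spaces and terminated by one
 * newline. Returns the number of bytes produced, not counting the NUL.
 * An empty list produces no output at all and returns 0.
 */
static int show_names(char *buf, const char *const *names, int n)
{
	int res = 0;
	int i;

	assert(buf != NULL);

	for (i = 0; i < n; i++)
		res += sprintf(buf + res, "%s ", names[i]);
	if (res)
		buf[res - 1] = '\n';	/* eat the leftover space */
	return res;
}
```

For the two slave names from the od example in the commit message this yields
19 bytes, the last of which is the newline, with no NUL inside the counted
range.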

 drivers/net/bonding/bond_sysfs.c |   66 +
 1 files changed, 30 insertions(+), 36 deletions(-)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index b29330d..a3f1b4a 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -86,14 +86,13 @@ static ssize_t bonding_show_bonds(struct class *cls, char *buffer)
 			/* not enough space for another interface name */
 			if ((PAGE_SIZE - res) > 10)
 				res = PAGE_SIZE - 10;
-			res += sprintf(buffer + res, "++more++");
+			res += sprintf(buffer + res, "++more++ ");
 			break;
 		}
 		res += sprintf(buffer + res, "%s ",
 			       bond->dev->name);
 	}
-	res += sprintf(buffer + res, "\n");
-	res++;
+	if (res) buffer[res-1] = '\n'; /* eat the leftover space */
 	up_read(&(bonding_rwsem));
 	return res;
 }
@@ -235,14 +234,13 @@ static ssize_t bonding_show_slaves(struct device *d,
 			/* not enough space for another interface name */
 			if ((PAGE_SIZE - res) > 10)
 				res = PAGE_SIZE - 10;
-			res += sprintf(buf + res, "++more++");
+			res += sprintf(buf + res, "++more++ ");
 			break;
 		}
 		res += sprintf(buf + res, "%s ", slave->dev->name);
 	}
 	read_unlock(&bond->lock);
-	res += sprintf(buf + res, "\n");
-	res++;
+	if (res) buf[res-1] = '\n'; /* eat the leftover space */
 	return res;
 }
 
@@ -406,7 +404,7 @@ static ssize_t bonding_show_mode(struct device *d,
 
 	return sprintf(buf, "%s %d\n",
 			bond_mode_tbl[bond->params.mode].modename,
-			bond->params.mode) + 1;
+			bond->params.mode);
 }
 
 static ssize_t bonding_store_mode(struct device *d,
@@ -463,11 +461,11 @@ static ssize_t bonding_show_xmit_hash(struct device *d,
 	if ((bond->params.mode != BOND_MODE_XOR) &&
 	    (bond->params.mode != BOND_MODE_8023AD)) {
 		// Not Applicable
-		count = sprintf(buf, "NA\n") + 1;
+		count = sprintf(buf, "NA\n");
 	} else {
 		count = sprintf(buf, "%s %d\n",
 			xmit_hashtype_tbl[bond->params.xmit_policy].modename,
-			bond->params.xmit_policy) + 1;
+			bond->params.xmit_policy);
 	}
 
 	return count;
@@ -527,7 +525,7 @@ static ssize_t bonding_show_arp_validate(struct device *d,
 
 	return sprintf(buf, "%s %d\n",
 		       arp_validate_tbl[bond->params.arp_validate].modename,
-		       bond->params.arp_validate) + 1;
+		       bond->params.arp_validate);
 }
 
 static ssize_t bonding_store_arp_validate(struct device *d,
@@ -627,7 +625,7 @@ static ssize_t bonding_show_arp_interval(struct device *d,
 {

[PATCH 2/5] net/bonding: Return nothing for not applicable values

2007-12-02 Thread Wagner Ferenc
The previous code returned '\n' (that is, a single empty line)
from most files, with one exception (xmit_hash_policy), where
it returned 'NA\n'.  This patch consolidates each file to return
nothing at all if not applicable, not even a '\n'.

I find this behaviour more usual, more useful, more efficient
and shorter to code from both sides.

Signed-off-by: Ferenc Wagner [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_sysfs.c |   25 -
 1 files changed, 4 insertions(+), 21 deletions(-)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index a3f1b4a..6bb91e2 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -455,14 +455,11 @@ static ssize_t bonding_show_xmit_hash(struct device *d,
 				      struct device_attribute *attr,
 				      char *buf)
 {
-	int count;
+	int count = 0;
 	struct bonding *bond = to_bond(d);
 
-	if ((bond->params.mode != BOND_MODE_XOR) &&
-	    (bond->params.mode != BOND_MODE_8023AD)) {
-		// Not Applicable
-		count = sprintf(buf, "NA\n");
-	} else {
+	if ((bond->params.mode == BOND_MODE_XOR) ||
+	    (bond->params.mode == BOND_MODE_8023AD)) {
 		count = sprintf(buf, "%s %d\n",
 			xmit_hashtype_tbl[bond->params.xmit_policy].modename,
 			bond->params.xmit_policy);
@@ -1079,8 +1076,6 @@ static ssize_t bonding_show_primary(struct device *d,
 
 	if (bond->primary_slave)
 		count = sprintf(buf, "%s\n", bond->primary_slave->dev->name);
-	else
-		count = sprintf(buf, "\n");
 
 	return count;
 }
@@ -1186,7 +1181,7 @@ static ssize_t bonding_show_active_slave(struct device *d,
 {
 	struct slave *curr;
 	struct bonding *bond = to_bond(d);
-	int count;
+	int count = 0;
 
 	read_lock(&bond->curr_slave_lock);
 	curr = bond->curr_active_slave;
@@ -1194,8 +1189,6 @@ static ssize_t bonding_show_active_slave(struct device *d,
 
 	if (USES_PRIMARY(bond->params.mode) && curr)
 		count = sprintf(buf, "%s\n", curr->dev->name);
-	else
-		count = sprintf(buf, "\n");
 	return count;
 }
 
@@ -1309,8 +1302,6 @@ static ssize_t bonding_show_ad_aggregator(struct device *d,
 		struct ad_info ad_info;
 		count = sprintf(buf, "%d\n", (bond_3ad_get_active_agg_info(bond, &ad_info)) ?  0 : ad_info.aggregator_id);
 	}
-	else
-		count = sprintf(buf, "\n");
 
 	return count;
 }
@@ -1331,8 +1322,6 @@ static ssize_t bonding_show_ad_num_ports(struct device *d,
 		struct ad_info ad_info;
 		count = sprintf(buf, "%d\n", (bond_3ad_get_active_agg_info(bond, &ad_info)) ?  0: ad_info.ports);
 	}
-	else
-		count = sprintf(buf, "\n");
 
 	return count;
 }
@@ -1353,8 +1342,6 @@ static ssize_t bonding_show_ad_actor_key(struct device *d,
 		struct ad_info ad_info;
 		count = sprintf(buf, "%d\n", (bond_3ad_get_active_agg_info(bond, &ad_info)) ?  0 : ad_info.actor_key);
 	}
-	else
-		count = sprintf(buf, "\n");
 
 	return count;
 }
@@ -1375,8 +1362,6 @@ static ssize_t bonding_show_ad_partner_key(struct device *d,
 		struct ad_info ad_info;
 		count = sprintf(buf, "%d\n", (bond_3ad_get_active_agg_info(bond, &ad_info)) ?  0 : ad_info.partner_key);
 	}
-	else
-		count = sprintf(buf, "\n");
 
 	return count;
 }
@@ -1401,8 +1386,6 @@ static ssize_t bonding_show_ad_partner_mac(struct device *d,
 				       print_mac(mac, ad_info.partner_system));
 		}
 	}
-	else
-		count = sprintf(buf, "\n");
 
 	return count;
 }
-- 
1.4.4.4



[PATCH 3/5] net/bonding: Purely cosmetic: rename a local variable

2007-12-02 Thread Wagner Ferenc
Code for rendering multivalue sysfs files occurs three times
in this module.  Rename 'buffer' to 'buf' in the first, for
the sake of consistency.

Signed-off-by: Ferenc Wagner [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_sysfs.c |9 -
 1 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 6bb91e2..5c31f5c 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -74,7 +74,7 @@ struct rw_semaphore bonding_rwsem;
  * show function for the bond_masters attribute.
  * The class parameter is ignored.
  */
-static ssize_t bonding_show_bonds(struct class *cls, char *buffer)
+static ssize_t bonding_show_bonds(struct class *cls, char *buf)
 {
int res = 0;
struct bonding *bond;
@@ -86,13 +86,12 @@ static ssize_t bonding_show_bonds(struct class *cls, char *buffer)
 			/* not enough space for another interface name */
 			if ((PAGE_SIZE - res) > 10)
 				res = PAGE_SIZE - 10;
-			res += sprintf(buffer + res, "++more++ ");
+			res += sprintf(buf + res, "++more++ ");
 			break;
 		}
-		res += sprintf(buffer + res, "%s ",
-			       bond->dev->name);
+		res += sprintf(buf + res, "%s ", bond->dev->name);
 	}
-	if (res) buffer[res-1] = '\n'; /* eat the leftover space */
+	if (res) buf[res-1] = '\n'; /* eat the leftover space */
 	up_read(&(bonding_rwsem));
 	return res;
 }
-- 
1.4.4.4



[PATCH 4/5] net/bonding: Adhere to coding style: break line after the if condition

2007-12-02 Thread Wagner Ferenc
Signed-off-by: Ferenc Wagner [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_sysfs.c |9 ++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 5c31f5c..9de2c52 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -91,7 +91,8 @@ static ssize_t bonding_show_bonds(struct class *cls, char *buf)
 		}
 		res += sprintf(buf + res, "%s ", bond->dev->name);
 	}
-	if (res) buf[res-1] = '\n'; /* eat the leftover space */
+	if (res)
+		buf[res-1] = '\n'; /* eat the leftover space */
 	up_read(&(bonding_rwsem));
 	return res;
 }
@@ -239,7 +240,8 @@ static ssize_t bonding_show_slaves(struct device *d,
 		res += sprintf(buf + res, "%s ", slave->dev->name);
 	}
 	read_unlock(&bond->lock);
-	if (res) buf[res-1] = '\n'; /* eat the leftover space */
+	if (res)
+		buf[res-1] = '\n'; /* eat the leftover space */
 	return res;
 }
 
@@ -705,7 +707,8 @@ static ssize_t bonding_show_arp_targets(struct device *d,
 			res += sprintf(buf + res, "%u.%u.%u.%u ",
 				       NIPQUAD(bond->params.arp_targets[i]));
 	}
-	if (res) buf[res-1] = '\n'; /* eat the leftover space */
+	if (res)
+		buf[res-1] = '\n'; /* eat the leftover space */
 	return res;
 }
 
-- 
1.4.4.4



[PATCH 5/5] net/bonding: Allow setting and querying xmit policy regardless of mode

2007-12-02 Thread Wagner Ferenc
For consistency with the behaviour of the arp_ip_target option,
let /sys/class/net/bond0/bonding/xmit_hash_policy accept and report
current policy even if the bonding mode in effect does not use it.

Signed-off-by: Ferenc Wagner [EMAIL PROTECTED]
---
 drivers/net/bonding/bond_sysfs.c |   21 +++--
 1 files changed, 3 insertions(+), 18 deletions(-)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 9de2c52..11b76b3 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -456,17 +456,11 @@ static ssize_t bonding_show_xmit_hash(struct device *d,
 				      struct device_attribute *attr,
 				      char *buf)
 {
-	int count = 0;
 	struct bonding *bond = to_bond(d);
 
-	if ((bond->params.mode == BOND_MODE_XOR) ||
-	    (bond->params.mode == BOND_MODE_8023AD)) {
-		count = sprintf(buf, "%s %d\n",
-			xmit_hashtype_tbl[bond->params.xmit_policy].modename,
-			bond->params.xmit_policy);
-	}
-
-	return count;
+	return sprintf(buf, "%s %d\n",
+		       xmit_hashtype_tbl[bond->params.xmit_policy].modename,
+		       bond->params.xmit_policy);
 }
 
 static ssize_t bonding_store_xmit_hash(struct device *d,
@@ -484,15 +478,6 @@ static ssize_t bonding_store_xmit_hash(struct device *d,
 		goto out;
 	}
 
-	if ((bond->params.mode != BOND_MODE_XOR) &&
-	    (bond->params.mode != BOND_MODE_8023AD)) {
-		printk(KERN_ERR DRV_NAME
-		       "%s: Transmit hash policy is irrelevant in this mode.\n",
-		       bond->dev->name);
-		ret = -EPERM;
-		goto out;
-	}
-
 	new_value = bond_parse_parm((char *)buf, xmit_hashtype_tbl);
 	if (new_value < 0)  {
 		printk(KERN_ERR DRV_NAME
-- 
1.4.4.4



Re: namespace support requires network modules to say GPL

2007-12-02 Thread Alan Cox
On Sat, 1 Dec 2007 16:30:35 -0800
Stephen Hemminger [EMAIL PROTECTED] wrote:

> On Sat, 1 Dec 2007 11:17:36 -0800
> Stephen Hemminger [EMAIL PROTECTED] wrote:
> 
> > Then init_net needs to be not GPL limited. Sorry, we need to allow
> > non GPL network drivers.  There is a fine line between keeping the
> > binary seething masses from accessing random kernel functions, and allowing
> > reasonable (but still non GPL) things like ndiswrapper to use network
> > device interface.
> > 
> I spoke too soon earlier, ndiswrapper builds and loads against current
> 2.6.24-rc3. Vmware and proprietary VPN software probably do not. Once again I don't
> give a damn, but the enterprise distro vendors certainly care.

Enterprise distro vendors ship kernels from the 2.6.19 era, so I don't
see why they care.

Alan


Re: sock_valbool_flag is required by VMware

2007-12-02 Thread Christoph Hellwig
On Fri, Nov 30, 2007 at 10:52:22PM +0100, Jiri Slaby wrote:
> Hi,
> 
> this commit:
> [NET]: Move sock_valbool_flag to socket.c
> 
> The sock_valbool_flag() helper is used in setsockopt to
> set or reset some flag on the sock. This helper is required
> in the net/socket.c only, so move it there.
> 
> Besides, patch two places in sys_setsockopt() that repeat
> this helper functionality manually.
> 
> Since this is not a bugfix, but a trivial cleanup, I
> prepared this patch against net-2.6.25, but it also
> applies (with a single offset) to the latest net-2.6.
> 
> breaks vmware module compilation, since it uses sock_valbool_flag(). Is this
> their business (and they should use sock_set_flag/sock_reset_flag) or should
> this be reverted?

It's their business, they should never have used it.



Re: Reproducible data corruption with sendfile+vsftp - splice regression?

2007-12-02 Thread Holger Hoffstaette

On Fri, 30 Nov 2007 10:26:54 -0800, Rick Jones wrote:

> Could the corruption be seen in a tcpdump trace prior to transmission (ie
> taken on the sender) or was it only seen after the data passed out the
> NIC?

I did the following:

1) turn on tso on the server's r8169: ethtool --offload eth0 tso on
2) on the server: tcpdump -i eth0 -s 0 -w file
3) ftp'ed file to 100mbit client

As expected the file was corrupted, and the various corrupted byte
sequences also show up in the tcpdump file at the corresponding offsets.

I did this with 2.6.22.14, so it does not seem to be a recent regression
in .23/.24.

All files can be found here:
http://hoho.dyndns.org/~holger/dist/r8169-tso/

I will gladly try out any other tweaks but need some guidance as I don't
know what exactly to change - maybe without NAPI for the r8169?

thank you
Holger




[PATCH net-2.6.25] TCP: Fix copy-paste (or code move) error

2007-12-02 Thread Ilpo Järvinen
Should get the skb from the same queue. I had it first elsewhere
and missed this change while moving.

Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]
---

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 7ae72c3..5929022 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1524,7 +1524,7 @@ static inline void tcp_reset_fack_counts(struct sock *sk, struct sk_buff *inskb)
 			/* Lazy find for the other queue */
 			if (skb[queue] == NULL) {
 				skb[queue] = tcp_write_queue_find(sk, TCP_SKB_CB(prev)->seq,
-								  queue ^ TCP_WQ_SACKED);
+								  queue);
 				if (skb[queue] == NULL)
 					break;
 			}
--
1.5.0.6

Re: Inconsistent lock state and possible irq lock inversion dependency detected in ax25.ko

2007-12-02 Thread Bernard Pidoux

Hi,

Many thanks for your patch for ~/net/ax25/ax25_subr.c

Introducing local_bh_disable() ... local_bh_enable() cured the
inconsistent lock state related to the AX25 connect timeout.

I now have a stable monoprocessor system running AX25 and the ROSE network
packet switching application FPAC, whether the kernel is compiled with or
without hack option.

There are no more problems during normal operation.

This was achieved thanks to your AX25 patch and the patch from Alexey
Dobriyan for the rose module.

I also patched the rose module to make packet routing more efficient,
taking into account the restarted flag that is raised when a neighbour
node is already connected.

To summarize the present situation on my Linux machine, I built a patch
against kernel 2.6.23.9.

I would appreciate it if you could get it included in a future kernel release.

Many thanks and best regards,

Bernard Pidoux
F6BVP




rose-patch-2.6.23.9.tgz
Description: application/compressed-tar


Re: namespace support requires network modules to say GPL

2007-12-02 Thread Ben Greear

Stephen Hemminger wrote:


Naw, enterprise (or any other) distro vendors shouldn't have any issues here,
since they can just patch their kernels around any issues.

But it looks like Eric has this one thought out well enough.



So you are saying all this is not a problem, fine.
Any affected parties can certainly lobby for themselves. But I suspect
they all think the kernel community is a bunch of ... and will just ignore
the problem. 
  
I have a binary module that uses dev_get_by_name...it's sort of a
bridge-like thing and needs user-space to tell it which device to
listen for packets on...

This code doesn't need or care about name-spaces, so I don't see how it
could really be infringing on the author's code (any worse than loading
a binary driver into the kernel ever does).

I would certainly prefer to not have to patch around any problems with
calling dev_get_by_name from a non-gpl module, but if required, I can
probably figure something out...


Thanks,
Ben

--
Ben Greear [EMAIL PROTECTED] 
Candela Technologies Inc  http://www.candelatech.com



--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: namespace support requires network modules to say GPL

2007-12-02 Thread Valdis . Kletnieks
On Sun, 02 Dec 2007 13:51:04 GMT, Alan Cox said:
> On Sat, 1 Dec 2007 16:30:35 -0800
> > I spoke too soon earlier, ndiswrapper builds and loads against current
> > 2.6.24-rc3. Vmware and proprietary VPN software probably do not. Once again
> > I don't give a damn, but the enterprise distro vendors certainly care.
>
> Enterprise distro vendors ship kernels from the 2.6.19 era, so I don't
> see why they care.

They don't care *now*.  They will care when they try to rev forward from .19.

Not that they'll care a *lot* - it took *me* all of about an hour to get VMware
Server 1.0.4 working under -rc3-mm2.  Probably will take an enterprise distro
4-5 hours, 30 mins for the port and 4 1/2 hours for the paperwork. :)





Re: namespace support requires network modules to say GPL

2007-12-02 Thread Patrick McHardy

Ben Greear wrote:
> Stephen Hemminger wrote:
>>> Naw, enterprise (or any other) distro vendors shouldn't have any
>>> issues here, since they can just patch their kernels around any issues.
>>>
>>> But it looks like Eric has this one thought out well enough.
>>
>> So you are saying all this is not a problem, fine.
>> Any affected parties can certainly lobby for themselves. But I suspect
>> they all think the kernel community is a bunch of ... and will just
>> ignore the problem.
>
> I have a binary module that uses dev_get_by_name...it's sort of a
> bridge-like thing and needs user-space to tell it which device to listen
> for packets on...
>
> This code doesn't need or care about name-spaces, so I don't see how it
> could really be infringing on the author's code (any worse than loading a
> binary driver into the kernel ever does).
>
> I would certainly prefer to not have to patch around any problems with
> calling dev_get_by_name from a non-gpl module, but if required, I can
> probably figure something out...



For all I care binary modules can break, but frankly I don't see
how encapsulating a couple of structures and pointers in a new
structure and adding a new argument to existing functions shifts
the decision about how a function should be usable to the namespace
guys. IMO all functions should continue to be usable as before,
as decided by whoever actually wrote them. The only exception
might be stuff where an existing EXPORT_SYMBOL is clearly wrong,
but that would be a separate discussion.



Re: namespace support requires network modules to say GPL

2007-12-02 Thread Adrian Bunk
On Sun, Dec 02, 2007 at 09:03:56PM +0100, Patrick McHardy wrote:
> Ben Greear wrote:
>> Stephen Hemminger wrote:
>>>> Naw, enterprise (or any other) distro vendors shouldn't have any issues
>>>> here, since they can just patch their kernels around any issues.
>>>>
>>>> But it looks like Eric has this one thought out well enough.
>>>
>>> So you are saying all this is not a problem, fine.
>>> Any affected parties can certainly lobby for themselves. But I suspect
>>> they all think the kernel community is a bunch of ... and will just
>>> ignore the problem.
>>
>> I have a binary module that uses dev_get_by_name...it's sort of a
>> bridge-like thing and needs user-space to tell it which device to listen
>> for packets on...
>>
>> This code doesn't need or care about name-spaces, so I don't see how it
>> could really be infringing on the author's code (any worse than loading a
>> binary driver into the kernel ever does).
>>
>> I would certainly prefer to not have to patch around any problems with
>> calling dev_get_by_name from a non-gpl module, but if required, I can
>> probably figure something out...
>
> For all I care binary modules can break, but frankly I don't see
> how encapsulating a couple of structures and pointers in a new
> structure and adding a new argument to existing functions shifts
> the decision about how a function should be usable to the namespace
> guys. IMO all functions should continue to be usable as before,
> as decided by whoever actually wrote them.
...

Even ignoring the fact that it's unclear whether distributing modules
with non-GPLv2-compatible licences is legal at all or might land you in
jail, your statement has an interesting implication:

Stuff like e.g. the EXPORT_SYMBOL(sk_alloc) predates the
EXPORT_SYMBOL_GPL stuff.

Who is considered the author of this code?

And when should he state that he would prefer EXPORT_SYMBOL_GPL, but
wasn't able to use it at the time he wrote the code since his code
predates it, and is glad to be able to decide this now?

cu
Adrian

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed



[PATCH 4/7] [TFRC]: Make the rx history slab be global

2007-12-02 Thread Arnaldo Carvalho de Melo
This is in preparation for merging the new rx history code written by
Gerrit Renker.

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/ccids/ccid3.c  |   35 ++---
 net/dccp/ccids/lib/packet_history.c |   95 ++-
 net/dccp/ccids/lib/packet_history.h |   43 ++--
 3 files changed, 60 insertions(+), 113 deletions(-)

diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c
index 5dea690..07920bb 100644
--- a/net/dccp/ccids/ccid3.c
+++ b/net/dccp/ccids/ccid3.c
@@ -49,8 +49,6 @@ static int ccid3_debug;
 #define ccid3_pr_debug(format, a...)
 #endif
 
-static struct dccp_rx_hist *ccid3_rx_hist;
-
 /*
  * Transmitter Half-Connection Routines
  */
@@ -807,9 +805,9 @@ static int ccid3_hc_rx_detect_loss(struct sock *sk,
 	}
 
 detect_out:
-	dccp_rx_hist_add_packet(ccid3_rx_hist, &hcrx->ccid3hcrx_hist,
-			       &hcrx->ccid3hcrx_li_hist, packet,
-			       hcrx->ccid3hcrx_seqno_nonloss);
+	dccp_rx_hist_add_packet(&hcrx->ccid3hcrx_hist,
+				&hcrx->ccid3hcrx_li_hist, packet,
+				hcrx->ccid3hcrx_seqno_nonloss);
 	return loss;
 }
 
@@ -852,8 +850,7 @@ static void ccid3_hc_rx_packet_recv(struct sock *sk, struct sk_buff *skb)
 		return;
 	}
 
-	packet = dccp_rx_hist_entry_new(ccid3_rx_hist, opt_recv->dccpor_ndp,
-					skb, GFP_ATOMIC);
+	packet = dccp_rx_hist_entry_new(opt_recv->dccpor_ndp, skb, GFP_ATOMIC);
 	if (unlikely(packet == NULL)) {
 		DCCP_WARN("%s(%p), Not enough mem to add rx packet "
 			  "to history, consider it lost!\n", dccp_role(sk), sk);
@@ -936,7 +933,7 @@ static void ccid3_hc_rx_exit(struct sock *sk)
 	ccid3_hc_rx_set_state(sk, TFRC_RSTATE_TERM);
 
 	/* Empty packet history */
-	dccp_rx_hist_purge(ccid3_rx_hist, &hcrx->ccid3hcrx_hist);
+	dccp_rx_hist_purge(&hcrx->ccid3hcrx_hist);
 
 	/* Empty loss interval history */
 	dccp_li_hist_purge(&hcrx->ccid3hcrx_li_hist);
@@ -1013,33 +1010,13 @@ MODULE_PARM_DESC(ccid3_debug, "Enable debug messages");
 
 static __init int ccid3_module_init(void)
 {
-	int rc = -ENOBUFS;
-
-	ccid3_rx_hist = dccp_rx_hist_new("ccid3");
-	if (ccid3_rx_hist == NULL)
-		goto out;
-
-	rc = ccid_register(&ccid3);
-	if (rc != 0)
-		goto out_free_rx;
-out:
-	return rc;
-
-out_free_rx:
-	dccp_rx_hist_delete(ccid3_rx_hist);
-	ccid3_rx_hist = NULL;
-	goto out;
+	return ccid_register(&ccid3);
 }
 module_init(ccid3_module_init);
 
 static __exit void ccid3_module_exit(void)
 {
 	ccid_unregister(&ccid3);
-
-	if (ccid3_rx_hist != NULL) {
-		dccp_rx_hist_delete(ccid3_rx_hist);
-		ccid3_rx_hist = NULL;
-	}
 }
 module_exit(ccid3_module_exit);
 
diff --git a/net/dccp/ccids/lib/packet_history.c b/net/dccp/ccids/lib/packet_history.c
index b628714..e1ab853 100644
--- a/net/dccp/ccids/lib/packet_history.c
+++ b/net/dccp/ccids/lib/packet_history.c
@@ -114,48 +114,33 @@ EXPORT_SYMBOL_GPL(tfrc_tx_hist_rtt);
 /*
  * Receiver History Routines
  */
-struct dccp_rx_hist *dccp_rx_hist_new(const char *name)
+static struct kmem_cache *tfrc_rx_hist_slab;
+
+struct dccp_rx_hist_entry *dccp_rx_hist_entry_new(const u32 ndp,
+						  const struct sk_buff *skb,
+						  const gfp_t prio)
 {
-	struct dccp_rx_hist *hist = kmalloc(sizeof(*hist), GFP_ATOMIC);
-	static const char dccp_rx_hist_mask[] = "rx_hist_%s";
-	char *slab_name;
-
-	if (hist == NULL)
-		goto out;
-
-	slab_name = kmalloc(strlen(name) + sizeof(dccp_rx_hist_mask) - 1,
-			    GFP_ATOMIC);
-	if (slab_name == NULL)
-		goto out_free_hist;
-
-	sprintf(slab_name, dccp_rx_hist_mask, name);
-	hist->dccprxh_slab = kmem_cache_create(slab_name,
-					     sizeof(struct dccp_rx_hist_entry),
-					     0, SLAB_HWCACHE_ALIGN, NULL);
-	if (hist->dccprxh_slab == NULL)
-		goto out_free_slab_name;
-out:
-	return hist;
-out_free_slab_name:
-	kfree(slab_name);
-out_free_hist:
-	kfree(hist);
-	hist = NULL;
-	goto out;
-}
+	struct dccp_rx_hist_entry *entry = kmem_cache_alloc(tfrc_rx_hist_slab,
+							    prio);
 
-EXPORT_SYMBOL_GPL(dccp_rx_hist_new);
+	if (entry != NULL) {
+		const struct dccp_hdr *dh = dccp_hdr(skb);
 
-void dccp_rx_hist_delete(struct dccp_rx_hist *hist)
-{
-	const char* name = kmem_cache_name(hist->dccprxh_slab);
+		entry->dccphrx_seqno = DCCP_SKB_CB(skb)->dccpd_seq;
+		entry->dccphrx_ccval = dh->dccph_ccval;
+		entry->dccphrx_type

[PATCH 3/7] [TFRC]: Rename tfrc_tx_hist to tfrc_tx_hist_slab, for consistency

2007-12-02 Thread Arnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/ccids/lib/packet_history.c |   20 ++--
 1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/net/dccp/ccids/lib/packet_history.c b/net/dccp/ccids/lib/packet_history.c
index 1d4d6ee..b628714 100644
--- a/net/dccp/ccids/lib/packet_history.c
+++ b/net/dccp/ccids/lib/packet_history.c
@@ -53,7 +53,7 @@ struct tfrc_tx_hist_entry {
 /*
  * Transmitter History Routines
  */
-static struct kmem_cache *tfrc_tx_hist;
+static struct kmem_cache *tfrc_tx_hist_slab;
 
 static struct tfrc_tx_hist_entry *
tfrc_tx_hist_find_entry(struct tfrc_tx_hist_entry *head, u64 seqno)
@@ -66,7 +66,7 @@ static struct tfrc_tx_hist_entry *
 
 int tfrc_tx_hist_add(struct tfrc_tx_hist_entry **headp, u64 seqno)
 {
-	struct tfrc_tx_hist_entry *entry = kmem_cache_alloc(tfrc_tx_hist, gfp_any());
+	struct tfrc_tx_hist_entry *entry = kmem_cache_alloc(tfrc_tx_hist_slab, gfp_any());
 
 	if (entry == NULL)
 		return -ENOBUFS;
@@ -85,7 +85,7 @@ void tfrc_tx_hist_purge(struct tfrc_tx_hist_entry **headp)
 	while (head != NULL) {
 		struct tfrc_tx_hist_entry *next = head->next;
 
-		kmem_cache_free(tfrc_tx_hist, head);
+		kmem_cache_free(tfrc_tx_hist_slab, head);
 		head = next;
 	}
 
@@ -278,17 +278,17 @@ EXPORT_SYMBOL_GPL(dccp_rx_hist_purge);
 
 __init int packet_history_init(void)
 {
-	tfrc_tx_hist = kmem_cache_create("tfrc_tx_hist",
-					 sizeof(struct tfrc_tx_hist_entry), 0,
-					 SLAB_HWCACHE_ALIGN, NULL);
+	tfrc_tx_hist_slab = kmem_cache_create("tfrc_tx_hist",
+					      sizeof(struct tfrc_tx_hist_entry), 0,
+					      SLAB_HWCACHE_ALIGN, NULL);
 
-	return tfrc_tx_hist == NULL ? -ENOBUFS : 0;
+	return tfrc_tx_hist_slab == NULL ? -ENOBUFS : 0;
 }
 
 void packet_history_exit(void)
 {
-   if (tfrc_tx_hist != NULL) {
-   kmem_cache_destroy(tfrc_tx_hist);
-   tfrc_tx_hist = NULL;
+   if (tfrc_tx_hist_slab != NULL) {
+   kmem_cache_destroy(tfrc_tx_hist_slab);
+   tfrc_tx_hist_slab = NULL;
}
 }
-- 
1.5.3.4



[RFC][PATCHES 0/7]: Reorganization of RX history patches

2007-12-02 Thread Arnaldo Carvalho de Melo
WARNING: After reading some messages from Ingo Molnar on lkml I think we
         should really trim the number of lists we use for kernel
         development. And since I moved back to using mutt for reading
         e-mails, something I should have never, ever stopped doing, I
         guess we should move the DCCP discussions to netdev, where we
         hopefully can get more people interested and reviewing the work
         we do, so please consider moving DCCP discussion to
         netdev@vger.kernel.org, where lots of smart networking folks are
         present and can help our efforts on turning RFCs to code.

Back to business...:

Hi Gerrit,

Please take a look at this patch series where I reorganized your work on
the new TFRC rx history handling code. I'll wait for your considerations
and then do as many iterations as reasonable to get your work merged.

It should be completely equivalent, plus some fixes and optimizations,
such as:

. The code that allocates the RX ring now deals with failures when one of
  the entries in the ring buffer is not successfully allocated; the
  original code was leaking the successfully allocated entries.

. We do just one allocation for the ring buffer: as the number of entries
  is fixed, we should do one allocation and not TFRC_NDUPACK of them.

. I haven't checked if all the code was committed, as I tried to introduce
  just what was immediately used; we'll probably need to make some changes
  when working on the merge of your loss intervals code.

. I changed the ccid3_hc_rx_packet_recv code to set hcrx->ccid3hcrx_s for
  the first non-data packet instead of calling ccid3_hc_rx_set_state,
  which would use 0 as the initial value in the EWMA calculation.

. I also moved some patch parts (hunks) around to improve the readability
  of the patches, trying to get things that logically replaced what was
  there before closer together.

. Separation of parts of your patches and combination of others is another
  thing you'll see in this patch set. I understand that it is difficult to
  find the right compromise and I hope you don't feel too bad about the
  decisions I made; eventually we'll find common ground.

. Another change was related to namespacing: I added tfrc_rx_hist_ to a
  number of functions and in some cases just normalised the naming to be
  consistent.

. I'm not that happy with deferring changes to the loss intervals code
  that uses rx handling data structures, but I'm OK with leaving some code
  commented out till we get to merging the new loss intervals code.
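
The two allocation points in the list above (unwinding when one per-entry
allocation fails, and replacing TFRC_NDUPACK separate allocations with a
single one) can be sketched in plain userspace C. This is only an
illustration and not the code from the series: struct rx_entry, struct
rx_ring, the ring_* helpers and the value 3 for TFRC_NDUPACK are
assumptions made for the example, and calloc()/free() stand in for the
slab allocator.

```c
#include <stddef.h>
#include <stdlib.h>

#define TFRC_NDUPACK 3	/* assumed value, for illustration only */

/* Hypothetical stand-in for one rx history entry. */
struct rx_entry {
	unsigned long long seqno;
};

/* Hypothetical ring: a fixed number of slots plus the base of the block. */
struct rx_ring {
	struct rx_entry *slot[TFRC_NDUPACK];
	struct rx_entry *base;	/* non-NULL only for the single-block variant */
};

/*
 * Variant 1: one allocation per entry. On failure, the entries allocated
 * before the failing one are released so that nothing is leaked.
 */
int ring_alloc_each(struct rx_ring *r)
{
	int i;

	r->base = NULL;
	for (i = 0; i < TFRC_NDUPACK; i++)
		r->slot[i] = NULL;

	for (i = 0; i < TFRC_NDUPACK; i++) {
		r->slot[i] = calloc(1, sizeof(*r->slot[i]));
		if (r->slot[i] == NULL)
			goto out_free;
	}
	return 0;

out_free:
	while (i-- > 0) {
		free(r->slot[i]);
		r->slot[i] = NULL;
	}
	return -1;
}

/* Variant 2: a single allocation; the slots point into the one block. */
int ring_alloc_once(struct rx_ring *r)
{
	int i;

	r->base = calloc(TFRC_NDUPACK, sizeof(*r->base));
	for (i = 0; i < TFRC_NDUPACK; i++)
		r->slot[i] = r->base ? &r->base[i] : NULL;
	return r->base ? 0 : -1;
}

/* Release a ring set up by either variant. */
void ring_free(struct rx_ring *r)
{
	int i;

	if (r->base != NULL) {
		free(r->base);			/* one free for the whole block */
	} else {
		for (i = 0; i < TFRC_NDUPACK; i++)
			free(r->slot[i]);	/* free(NULL) is a no-op */
	}
	r->base = NULL;
	for (i = 0; i < TFRC_NDUPACK; i++)
		r->slot[i] = NULL;
}
```

With the single block there is only one failure point and no partial state
to unwind, which is why the second variant needs no out_free label.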

For what it's worth, I leave here my deep appreciation of your work and
also my (repeated) apologies for not being able to do these kinds of
review sessions months ago, but I am willing and able to cure these
shortcomings by continuing the work I've been doing recently on finally
reviewing your hard work. Keep it up!

It is available at:

master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6.25

Best Regards,

- Arnaldo

 b/net/dccp/ccids/Kconfig   |   13
 b/net/dccp/ccids/ccid3.c   |   35 -
 b/net/dccp/ccids/ccid3.h   |   14
 b/net/dccp/ccids/lib/Makefile  |2
 b/net/dccp/ccids/lib/loss_interval.c   |   14
 b/net/dccp/ccids/lib/packet_history.c  |   27 -
 b/net/dccp/ccids/lib/packet_history.h  |3
 b/net/dccp/ccids/lib/packet_history_internal.h |   68 +++
 b/net/dccp/ccids/lib/tfrc.c|   48 ++
 b/net/dccp/ccids/lib/tfrc.h|   18
 b/net/dccp/dccp.h  |   13
 net/dccp/ccids/ccid3.c |  289 --
 net/dccp/ccids/lib/loss_interval.c |   14
 net/dccp/ccids/lib/packet_history.c|  483 +
 net/dccp/ccids/lib/packet_history.h|  175 +++--
 15 files changed, 602 insertions(+), 614 deletions(-)


[PATCH 1/7] [TFRC]: Provide central source file and debug facility

2007-12-02 Thread Arnaldo Carvalho de Melo
From: Gerrit Renker [EMAIL PROTECTED]

This patch changes the tfrc_lib module in the following manner:

 (1) a dedicated tfrc source file to call the packet history and
     loss interval init/exit functions.
 (2) a dedicated tfrc_pr_debug macro with toggle switch `tfrc_debug'.

Committer note: renamed tfrc_module.c to tfrc.c, and made CONFIG_IP_DCCP_CCID3
select IP_DCCP_TFRC_LIB.
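
As a rough illustration of point (2), a debug macro gated by a toggle
switch can look like the userspace sketch below. It is not the macro added
by this patch: fprintf() stands in for the kernel's printing facilities,
and tfrc_debug_emitted and tfrc_demo_init() are hypothetical names that
exist only so the effect of the toggle can be observed.

```c
#include <stdio.h>

/* Toggle switch: non-zero enables the messages. */
int tfrc_debug;

/* Count of emitted messages; exists only so this sketch can be checked. */
unsigned int tfrc_debug_emitted;

/*
 * Print a message prefixed with the calling function, but only when the
 * toggle is set. The arguments are not evaluated when it is clear.
 */
#define tfrc_pr_debug(...)						\
	do {								\
		if (tfrc_debug) {					\
			fprintf(stderr, "%s: ", __func__);		\
			fprintf(stderr, __VA_ARGS__);			\
			tfrc_debug_emitted++;				\
		}							\
	} while (0)

/* Hypothetical caller, standing in for library code. */
void tfrc_demo_init(int nentries)
{
	tfrc_pr_debug("allocated %d history entries\n", nentries);
}
```

Wrapping the body in do { } while (0) keeps the macro safe to use as a
single statement, for instance in an unbraced if/else.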

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Ian McDonald [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/ccids/Kconfig  |   13 ++---
 net/dccp/ccids/lib/Makefile |2 +-
 net/dccp/ccids/lib/packet_history.c |   27 ++-
 net/dccp/ccids/lib/packet_history.h |3 +-
 net/dccp/ccids/lib/tfrc.c   |   48 +++
 net/dccp/ccids/lib/tfrc.h   |   17 +---
 6 files changed, 75 insertions(+), 35 deletions(-)
 create mode 100644 net/dccp/ccids/lib/tfrc.c

diff --git a/net/dccp/ccids/Kconfig b/net/dccp/ccids/Kconfig
index 3d7d867..1227594 100644
--- a/net/dccp/ccids/Kconfig
+++ b/net/dccp/ccids/Kconfig
@@ -38,6 +38,7 @@ config IP_DCCP_CCID2_DEBUG
 config IP_DCCP_CCID3
 	tristate "CCID3 (TCP-Friendly) (EXPERIMENTAL)"
 	def_tristate IP_DCCP
+	select IP_DCCP_TFRC_LIB
 	---help---
 	  CCID 3 denotes TCP-Friendly Rate Control (TFRC), an equation-based
 	  rate-controlled congestion control mechanism.  TFRC is designed to
@@ -63,10 +64,6 @@ config IP_DCCP_CCID3
 
  If in doubt, say M.
 
-config IP_DCCP_TFRC_LIB
-   depends on IP_DCCP_CCID3
-   def_tristate IP_DCCP_CCID3
-
 config IP_DCCP_CCID3_DEBUG
 	  bool "CCID3 debugging messages"
 	  depends on IP_DCCP_CCID3
@@ -110,5 +107,13 @@ config IP_DCCP_CCID3_RTO
 	    is serious network congestion: experimenting with larger values should
 	    therefore not be performed on WANs.
 
+config IP_DCCP_TFRC_LIB
+   tristate
+   default n
+
+config IP_DCCP_TFRC_DEBUG
+   bool
+   depends on IP_DCCP_TFRC_LIB
+   default y if IP_DCCP_CCID3_DEBUG
 
 endmenu
diff --git a/net/dccp/ccids/lib/Makefile b/net/dccp/ccids/lib/Makefile
index 5f940a6..68c93e3 100644
--- a/net/dccp/ccids/lib/Makefile
+++ b/net/dccp/ccids/lib/Makefile
@@ -1,3 +1,3 @@
 obj-$(CONFIG_IP_DCCP_TFRC_LIB) += dccp_tfrc_lib.o
 
-dccp_tfrc_lib-y := loss_interval.o packet_history.o tfrc_equation.o
+dccp_tfrc_lib-y := tfrc.o tfrc_equation.o packet_history.o loss_interval.o
diff --git a/net/dccp/ccids/lib/packet_history.c b/net/dccp/ccids/lib/packet_history.c
index 4805de9..1d4d6ee 100644
--- a/net/dccp/ccids/lib/packet_history.c
+++ b/net/dccp/ccids/lib/packet_history.c
@@ -35,7 +35,6 @@
  *  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
  */
 
-#include <linux/module.h>
 #include <linux/string.h>
 #include "packet_history.h"
 
@@ -277,39 +276,19 @@ void dccp_rx_hist_purge(struct dccp_rx_hist *hist, struct list_head *list)
 
 EXPORT_SYMBOL_GPL(dccp_rx_hist_purge);
 
-extern int __init dccp_li_init(void);
-extern void dccp_li_exit(void);
-
-static __init int packet_history_init(void)
+__init int packet_history_init(void)
 {
-   if (dccp_li_init() != 0)
-   goto out;
-
 	tfrc_tx_hist = kmem_cache_create("tfrc_tx_hist",
 					 sizeof(struct tfrc_tx_hist_entry), 0,
 					 SLAB_HWCACHE_ALIGN, NULL);
-   if (tfrc_tx_hist == NULL)
-   goto out_li_exit;
 
-   return 0;
-out_li_exit:
-   dccp_li_exit();
-out:
-   return -ENOBUFS;
+   return tfrc_tx_hist == NULL ? -ENOBUFS : 0;
 }
-module_init(packet_history_init);
 
-static __exit void packet_history_exit(void)
+void packet_history_exit(void)
 {
if (tfrc_tx_hist != NULL) {
kmem_cache_destroy(tfrc_tx_hist);
tfrc_tx_hist = NULL;
}
-   dccp_li_exit();
 }
-module_exit(packet_history_exit);
-
-MODULE_AUTHOR("Ian McDonald [EMAIL PROTECTED], "
-	      "Arnaldo Carvalho de Melo [EMAIL PROTECTED]");
-MODULE_DESCRIPTION("DCCP TFRC library");
-MODULE_LICENSE("GPL");
diff --git a/net/dccp/ccids/lib/packet_history.h b/net/dccp/ccids/lib/packet_history.h
index 0670f46..9a2642e 100644
--- a/net/dccp/ccids/lib/packet_history.h
+++ b/net/dccp/ccids/lib/packet_history.h
@@ -39,8 +39,7 @@
 #include <linux/ktime.h>
 #include <linux/list.h>
 #include <linux/slab.h>
-
-#include "../../dccp.h"
+#include "tfrc.h"
 
 /* Number of later packets received before one is considered lost */
 #define TFRC_RECV_NUM_LATE_LOSS 3
diff --git a/net/dccp/ccids/lib/tfrc.c b/net/dccp/ccids/lib/tfrc.c
new file mode 100644
index 000..3a7a183
--- /dev/null
+++ b/net/dccp/ccids/lib/tfrc.c
@@ -0,0 +1,48 @@
+/*
+ * TFRC: main module holding the pieces of the TFRC library together
+ *
+ * Copyright (c) 2007 The University of Aberdeen, Scotland, UK
+ * Copyright (c) 2007 Arnaldo Carvalho de Melo [EMAIL PROTECTED]
+ */
+#include 

[PATCH 6/7] [CCID3]: The receiver of a half-connection does not set window counter values

2007-12-02 Thread Arnaldo Carvalho de Melo
From: Gerrit Renker [EMAIL PROTECTED]

Only the sender sets window counters [RFC 4342, sections 5 and 8.1].

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Ian McDonald [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/ccids/ccid3.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c
index c95dca8..5ff5aab 100644
--- a/net/dccp/ccids/ccid3.c
+++ b/net/dccp/ccids/ccid3.c
@@ -733,7 +733,6 @@ static int ccid3_hc_rx_insert_options(struct sock *sk, struct sk_buff *skb)
 		return 0;
 
 	hcrx = ccid3_hc_rx_sk(sk);
-	DCCP_SKB_CB(skb)->dccpd_ccval = hcrx->ccid3hcrx_ccval_last_counter;
 
 	if (dccp_packet_without_ack(skb))
 		return 0;
-- 
1.5.3.4



Re: Inconsistent lock state and possible irq lock inversion dependency detected in ax25.ko

2007-12-02 Thread Jarek Poplawski
Bernard Pidoux wrote, On 12/02/2007 06:37 PM:

> Hi,
>
> Many thanks for your patch for ~/net/ax25/ax25_subr.c
>
> Introduction of local_bh_disable() ... local_bh_enable()
>
> cured the inconsistent lock state related to AX25 connect timeout.
>
> I now have a stable uniprocessor system running AX25 and the ROSE network
> packet switching application FPAC, whether the kernel is compiled with or
> without the hack option.
>
> There are no more problems during normal operation.
>
> This was achieved thanks to your AX25 patch and the patch from Alexey
> Dobriyan for the rose module.
>
> I also patched the rose module to make packet routing more efficient,
> taking into account the restarted flag that is raised when a neighbour
> node is already connected.
>
> To summarize the present situation on my Linux machine, I built a patch
> against kernel 2.6.23.9.
>
> I would appreciate it if you could have it included in a future kernel
> release.
... 

Bernard, I'm very glad I could be a little helpful, but I'm not sure of
your intentions: my patch proposal is a rather trivial interpretation of
lockdep's report; I haven't studied AX25 enough even to be sure there is
a real lockup possible in this place. Since this change looks not very
costly and quite safe, I can 'take a risk' to sign this off after your
testing. But anything more is beyond my 'range'.

So, since you've spent quite a lot of time on all this, maybe it would
be simpler if you tried the same with the current kernel, and resent a
proper (not gzipped and with changelog) patch or patches. Then, I hope,
Ralf, as the maintainer, will do the rest.

Regards,
Jarek P.


Re: namespace support requires network modules to say GPL

2007-12-02 Thread Patrick McHardy

Adrian Bunk wrote:
> On Sun, Dec 02, 2007 at 09:03:56PM +0100, Patrick McHardy wrote:
>> For all I care binary modules can break, but frankly I don't see
>> how encapsulating a couple of structures and pointers in a new
>> structure and adding a new argument to existing functions shifts
>> the decision about how a function should be usable to the namespace
>> guys. IMO all functions should continue to be usable as before,
>> as decided by whoever actually wrote them.
> ...
>
> Even ignoring the fact that it's unclear whether distributing modules
> with non-GPLv2-compatible licences is legal at all or might land you in
> jail,

Agreed, let's ignore that :)

> your statement has an interesting implication:
>
> Stuff like e.g. the EXPORT_SYMBOL(sk_alloc) predates the
> EXPORT_SYMBOL_GPL stuff.
>
> Who is considered the author of this code?
>
> And when should he state that he would prefer EXPORT_SYMBOL_GPL, but
> wasn't able to use it at the time he wrote the code since his code
> predates it, and is glad to be able to decide this now?



He can state it when he feels like it, I don't see the point.
Authors generally get to decide whether they use EXPORT_SYMBOL
or EXPORT_SYMBOL_GPL, except in cases where it's really clear-cut
that EXPORT_SYMBOL is inappropriate. But that's a different matter.

If a symbol was OK to be used previously and something using it
would not automatically be considered a derived work, how does
passing init_net to the function just to make the compiler
happy, avoid BUG_ONs and generally keep things working as before
make it more of a derived work?


[PATCH 2/7] [DCCP]: Introduce generic function to test for `data packets'

2007-12-02 Thread Arnaldo Carvalho de Melo
From: Gerrit Renker [EMAIL PROTECTED]

as per RFC 4340, sec. 7.7.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Ian McDonald [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/dccp.h |   12 
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/net/dccp/dccp.h b/net/dccp/dccp.h
index ee97950..f4a5ea1 100644
--- a/net/dccp/dccp.h
+++ b/net/dccp/dccp.h
@@ -334,6 +334,7 @@ struct dccp_skb_cb {
 
 #define DCCP_SKB_CB(__skb) ((struct dccp_skb_cb *)&((__skb)->cb[0]))
 
+/* RFC 4340, sec. 7.7 */
 static inline int dccp_non_data_packet(const struct sk_buff *skb)
 {
 	const __u8 type = DCCP_SKB_CB(skb)->dccpd_type;
@@ -346,6 +347,17 @@ static inline int dccp_non_data_packet(const struct sk_buff *skb)
 	       type == DCCP_PKT_SYNCACK;
 }
 
+/* RFC 4340, sec. 7.7 */
+static inline int dccp_data_packet(const struct sk_buff *skb)
+{
+	const __u8 type = DCCP_SKB_CB(skb)->dccpd_type;
+
+	return type == DCCP_PKT_DATA	 ||
+	       type == DCCP_PKT_DATAACK  ||
+	       type == DCCP_PKT_REQUEST  ||
+	       type == DCCP_PKT_RESPONSE;
+}
+
 static inline int dccp_packet_without_ack(const struct sk_buff *skb)
 {
 	const __u8 type = DCCP_SKB_CB(skb)->dccpd_type;
-- 
1.5.3.4



[PATCH 5/7] [TFRC]: Rename dccp_rx_ to tfrc_rx_

2007-12-02 Thread Arnaldo Carvalho de Melo
This is in preparation for merging the new rx history code written by
Gerrit Renker.

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/ccids/ccid3.c  |   32 ++--
 net/dccp/ccids/lib/loss_interval.c  |   14 +++---
 net/dccp/ccids/lib/packet_history.c |   90 +-
 net/dccp/ccids/lib/packet_history.h |   48 +-
 4 files changed, 92 insertions(+), 92 deletions(-)

diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c
index 07920bb..c95dca8 100644
--- a/net/dccp/ccids/ccid3.c
+++ b/net/dccp/ccids/ccid3.c
@@ -677,7 +677,7 @@ static void ccid3_hc_rx_send_feedback(struct sock *sk)
 {
 	struct ccid3_hc_rx_sock *hcrx = ccid3_hc_rx_sk(sk);
 	struct dccp_sock *dp = dccp_sk(sk);
-	struct dccp_rx_hist_entry *packet;
+	struct tfrc_rx_hist_entry *packet;
 	ktime_t now;
 	suseconds_t delta;
 
@@ -701,7 +701,7 @@ static void ccid3_hc_rx_send_feedback(struct sock *sk)
 		return;
 	}
 
-	packet = dccp_rx_hist_find_data_packet(&hcrx->ccid3hcrx_hist);
+	packet = tfrc_rx_hist_find_data_packet(&hcrx->ccid3hcrx_hist);
 	if (unlikely(packet == NULL)) {
 		DCCP_WARN("%s(%p), no data packet in history!\n",
 			  dccp_role(sk), sk);
@@ -709,7 +709,7 @@ static void ccid3_hc_rx_send_feedback(struct sock *sk)
 	}
 
 	hcrx->ccid3hcrx_tstamp_last_feedback = now;
-	hcrx->ccid3hcrx_ccval_last_counter   = packet->dccphrx_ccval;
+	hcrx->ccid3hcrx_ccval_last_counter   = packet->tfrchrx_ccval;
 	hcrx->ccid3hcrx_bytes_recv	     = 0;
 
 	if (hcrx->ccid3hcrx_p == 0)
@@ -752,12 +752,12 @@ static int ccid3_hc_rx_insert_options(struct sock *sk, struct sk_buff *skb)
 }
 
 static int ccid3_hc_rx_detect_loss(struct sock *sk,
-				    struct dccp_rx_hist_entry *packet)
+				    struct tfrc_rx_hist_entry *packet)
 {
 	struct ccid3_hc_rx_sock *hcrx = ccid3_hc_rx_sk(sk);
-	struct dccp_rx_hist_entry *rx_hist =
-				dccp_rx_hist_head(&hcrx->ccid3hcrx_hist);
-	u64 seqno = packet->dccphrx_seqno;
+	struct tfrc_rx_hist_entry *rx_hist =
+				tfrc_rx_hist_head(&hcrx->ccid3hcrx_hist);
+	u64 seqno = packet->tfrchrx_seqno;
 	u64 tmp_seqno;
 	int loss = 0;
 	u8 ccval;
@@ -766,9 +766,9 @@ static int ccid3_hc_rx_detect_loss(struct sock *sk,
 	tmp_seqno = hcrx->ccid3hcrx_seqno_nonloss;
 
 	if (!rx_hist ||
-	   follows48(packet->dccphrx_seqno, hcrx->ccid3hcrx_seqno_nonloss)) {
+	   follows48(packet->tfrchrx_seqno, hcrx->ccid3hcrx_seqno_nonloss)) {
 		hcrx->ccid3hcrx_seqno_nonloss = seqno;
-		hcrx->ccid3hcrx_ccval_nonloss = packet->dccphrx_ccval;
+		hcrx->ccid3hcrx_ccval_nonloss = packet->tfrchrx_ccval;
 		goto detect_out;
 	}
 
@@ -789,7 +789,7 @@ static int ccid3_hc_rx_detect_loss(struct sock *sk,
 		dccp_inc_seqno(&tmp_seqno);
 		hcrx->ccid3hcrx_seqno_nonloss = tmp_seqno;
 		dccp_inc_seqno(&tmp_seqno);
-		while (dccp_rx_hist_find_entry(&hcrx->ccid3hcrx_hist,
+		while (tfrc_rx_hist_find_entry(&hcrx->ccid3hcrx_hist,
 		   tmp_seqno, &ccval)) {
 			hcrx->ccid3hcrx_seqno_nonloss = tmp_seqno;
 			hcrx->ccid3hcrx_ccval_nonloss = ccval;
@@ -799,13 +799,13 @@ static int ccid3_hc_rx_detect_loss(struct sock *sk,
 
 	/* FIXME - this code could be simplified with above while */
 	/* but works at moment */
-	if (follows48(packet->dccphrx_seqno, hcrx->ccid3hcrx_seqno_nonloss)) {
+	if (follows48(packet->tfrchrx_seqno, hcrx->ccid3hcrx_seqno_nonloss)) {
 		hcrx->ccid3hcrx_seqno_nonloss = seqno;
-		hcrx->ccid3hcrx_ccval_nonloss = packet->dccphrx_ccval;
+		hcrx->ccid3hcrx_ccval_nonloss = packet->tfrchrx_ccval;
 	}
 
 detect_out:
-	dccp_rx_hist_add_packet(&hcrx->ccid3hcrx_hist,
+	tfrc_rx_hist_add_packet(&hcrx->ccid3hcrx_hist,
 				&hcrx->ccid3hcrx_li_hist, packet,
 				hcrx->ccid3hcrx_seqno_nonloss);
 	return loss;
@@ -815,7 +815,7 @@ static void ccid3_hc_rx_packet_recv(struct sock *sk, struct sk_buff *skb)
 {
 	struct ccid3_hc_rx_sock *hcrx = ccid3_hc_rx_sk(sk);
 	const struct dccp_options_received *opt_recv;
-	struct dccp_rx_hist_entry *packet;
+	struct tfrc_rx_hist_entry *packet;
 	u32 p_prev, r_sample, rtt_prev;
 	int loss, payload_size;
 	ktime_t now;
@@ -850,7 +850,7 @@ static void ccid3_hc_rx_packet_recv(struct sock *sk, struct sk_buff *skb)
 		return;
 	}
 
-	packet = dccp_rx_hist_entry_new(opt_recv->dccpor_ndp, skb, GFP_ATOMIC);
+	packet = tfrc_rx_hist_entry_new(opt_recv->dccpor_ndp, skb, GFP_ATOMIC);
 	if (unlikely(packet ==

[PATCH 7/7] [TFRC] New rx history code

2007-12-02 Thread Arnaldo Carvalho de Melo
Credit here goes to Gerrit Renker, who provided the initial implementation
for this new codebase.

I modified it just to try to make it closer to the existing API, hide
details from the CCIDs and fix a couple of bugs found in the initial
implementation.

Original changeset comment from Gerrit:
  ---
This provides a new, self-contained and generic RX history service for TFRC
based protocols.

Details:
 * new data structure, initialisation and cleanup routines;
 * allocation of dccp_rx_hist entries local to packet_history.c,
   as a service exported by the dccp_tfrc_lib module.
 * interface to automatically track highest-received seqno;
 * receiver-based RTT estimation (needed for instance by RFC 3448, 6.3.1);
 * a generic function to test for `data packets' as per RFC 4340, sec. 7.7.
  ---
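
To make the "automatically track highest-received seqno" item concrete,
here is a small userspace sketch. It is not the code in this patch: it
only relies on DCCP sequence numbers being 48 bits wide and wrapping
around, and seq48_delta(), struct rx_hist_state and rx_hist_track() are
names invented for the example.

```c
#include <stdint.h>

#define SEQ48_MASK ((UINT64_C(1) << 48) - 1)

/*
 * Signed distance from a to b in 48-bit circular sequence space:
 * positive when b is "after" a, negative when it is "before".
 */
int64_t seq48_delta(uint64_t a, uint64_t b)
{
	uint64_t d = (b - a) & SEQ48_MASK;

	if (d & (UINT64_C(1) << 47))	/* upper half of the space: b is behind a */
		return (int64_t)d - (INT64_C(1) << 48);
	return (int64_t)d;
}

/* Hypothetical history state: only the highest sequence number seen. */
struct rx_hist_state {
	uint64_t highest;
	int	 valid;		/* 0 until the first packet arrives */
};

/* Record seqno; returns 1 if it became the new highest, 0 otherwise. */
int rx_hist_track(struct rx_hist_state *h, uint64_t seqno)
{
	seqno &= SEQ48_MASK;
	if (h->valid && seq48_delta(h->highest, seqno) <= 0)
		return 0;
	h->highest = seqno;
	h->valid = 1;
	return 1;
}
```

The comparison is done on the circular distance rather than with a plain
"greater than", so a packet numbered 2 arriving after 2^48 - 1 is still
treated as newer.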

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/ccids/ccid3.c   |  255 
 net/dccp/ccids/ccid3.h   |   14 +-
 net/dccp/ccids/lib/loss_interval.c   |   14 +-
 net/dccp/ccids/lib/packet_history.c  |  277 +++---
 net/dccp/ccids/lib/packet_history.h  |   82 +++-
 net/dccp/ccids/lib/packet_history_internal.h |   67 ++
 6 files changed, 353 insertions(+), 356 deletions(-)
 create mode 100644 net/dccp/ccids/lib/packet_history_internal.h

diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c
index 5ff5aab..af64c1d 100644
--- a/net/dccp/ccids/ccid3.c
+++ b/net/dccp/ccids/ccid3.c
@@ -641,6 +641,15 @@ static int ccid3_hc_tx_getsockopt(struct sock *sk, const int optname, int len,
 /*
  * Receiver Half-Connection Routines
  */
+
+/* CCID3 feedback types */
+enum ccid3_fback_type {
+   CCID3_FBACK_NONE = 0,
+   CCID3_FBACK_INITIAL,
+   CCID3_FBACK_PERIODIC,
+   CCID3_FBACK_PARAM_CHANGE
+};
+
 #ifdef CONFIG_IP_DCCP_CCID3_DEBUG
 static const char *ccid3_rx_state_name(enum ccid3_hc_rx_states state)
 {
@@ -673,53 +682,60 @@ static inline void ccid3_hc_rx_update_s(struct ccid3_hc_rx_sock *hcrx, int len)
hcrx->ccid3hcrx_s = tfrc_ewma(hcrx->ccid3hcrx_s, len, 9);
 }
 
-static void ccid3_hc_rx_send_feedback(struct sock *sk)
+static void ccid3_hc_rx_send_feedback(struct sock *sk,
+ const struct sk_buff *skb,
+ enum ccid3_fback_type fbtype)
 {
struct ccid3_hc_rx_sock *hcrx = ccid3_hc_rx_sk(sk);
struct dccp_sock *dp = dccp_sk(sk);
-   struct tfrc_rx_hist_entry *packet;
ktime_t now;
-   suseconds_t delta;
+   s64 delta = 0;
 
ccid3_pr_debug("%s(%p) - entry \n", dccp_role(sk), sk);
 
+   if (unlikely(hcrx->ccid3hcrx_state == TFRC_RSTATE_TERM))
+   return;
+
now = ktime_get_real();
 
-   switch (hcrx->ccid3hcrx_state) {
-   case TFRC_RSTATE_NO_DATA:
+   switch (fbtype) {
+   case CCID3_FBACK_INITIAL:
hcrx->ccid3hcrx_x_recv = 0;
+   hcrx->ccid3hcrx_pinv   = ~0U;   /* see RFC 4342, 8.5 */
break;
-   case TFRC_RSTATE_DATA:
-   delta = ktime_us_delta(now,
-  hcrx->ccid3hcrx_tstamp_last_feedback);
-   DCCP_BUG_ON(delta < 0);
-   hcrx->ccid3hcrx_x_recv =
-   scaled_div32(hcrx->ccid3hcrx_bytes_recv, delta);
+   case CCID3_FBACK_PARAM_CHANGE:
+   /*
+* When parameters change (new loss or p > p_prev), we do not
+* have a reliable estimate for R_m of [RFC 3448, 6.2] and so
+* need to reuse the previous value of X_recv. However, when
+* X_recv was 0 (due to early loss), this would kill X down to
+* s/t_mbi (i.e. one packet in 64 seconds).
+* To avoid such drastic reduction, we approximate X_recv as
+* the number of bytes since last feedback.
+* This is a safe fallback, since X is bounded above by X_calc.
+*/
+   if (hcrx->ccid3hcrx_x_recv > 0)
+   break;
+   /* fall through */
+   case CCID3_FBACK_PERIODIC:
+   delta = ktime_us_delta(now, hcrx->ccid3hcrx_tstamp_last_feedback);
+   if (delta <= 0)
+   DCCP_BUG("delta (%ld) <= 0", (long)delta);
+   else
+   hcrx->ccid3hcrx_x_recv =
+   scaled_div32(hcrx->ccid3hcrx_bytes_recv, delta);
break;
-   case TFRC_RSTATE_TERM:
-   DCCP_BUG("%s(%p) - Illegal state TERM", dccp_role(sk), sk);
+   default:
return;
}
 
-   packet = tfrc_rx_hist_find_data_packet(&hcrx->ccid3hcrx_hist);
-   if (unlikely(packet == NULL)) {
-   DCCP_WARN("%s(%p), no data packet in history!\n",
- dccp_role(sk), sk);
-   return;
-   }
+   
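
The feedback-type switch in the hunk above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: `struct rx_state`, `update_x_recv()` and the `FBACK_*` names are hypothetical, the rate is taken to be in bytes per second, and the division is assumed to be scaled by 10^6 because the interval is measured in microseconds.

```c
#include <stdint.h>

/* Feedback types, mirroring the enum introduced by the patch. */
enum fback_type {
	FBACK_NONE = 0,
	FBACK_INITIAL,
	FBACK_PERIODIC,
	FBACK_PARAM_CHANGE
};

/* Hypothetical, simplified receiver state (not the kernel struct). */
struct rx_state {
	uint32_t x_recv;     /* receive rate, bytes per second */
	uint32_t bytes_recv; /* bytes received since the last feedback */
};

/*
 * Refresh x_recv for the given feedback type. delta_us is the time since
 * the last feedback in microseconds. Returns 1 if feedback would be sent,
 * 0 if the feedback type is not handled.
 */
static int update_x_recv(struct rx_state *rx, enum fback_type type,
			 int64_t delta_us)
{
	switch (type) {
	case FBACK_INITIAL:
		rx->x_recv = 0;
		return 1;
	case FBACK_PARAM_CHANGE:
		/* A previous non-zero estimate is reused unchanged. */
		if (rx->x_recv > 0)
			return 1;
		/* fall through: no estimate yet, approximate from bytes_recv */
	case FBACK_PERIODIC:
		if (delta_us > 0)
			rx->x_recv = (uint32_t)(((uint64_t)rx->bytes_recv *
						 1000000u) / (uint64_t)delta_us);
		return 1;
	default:
		return 0;
	}
}
```

With 1500 bytes received over 500000 microseconds the sketch yields a rate of 3000 bytes per second; a later parameter-change feedback leaves that value untouched.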

kernel 2.6.23.8: KERNEL: assertion in net/ipv4/tcp_input.c

2007-12-02 Thread Wolfgang Walter
Hello,

with kernel 2.6.23.8 we saw a

KERNEL: assertion ((int)tcp_packets_in_flight(tp) >= 0) failed at net/ipv4/tcp_input.c (1292)

Regards,

Wolfgang Walter
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: namespace support requires network modules to say GPL

2007-12-02 Thread David Schwartz

> > Then init_net needs to be not GPL limited. Sorry, we need to allow
> > non GPL network drivers.  There is a fine line between keeping the

> Why - they aren't exactly likely to be permissible by law

Really? What law and/or what clause in the GPL says that derivative works
have to be licensed under the GPL? Or does the kernel have some new
technique to determine whether or not code has been distributed?

As I read the GPL, it only requires you to release something under the GPL
if you distribute it. The kernel has no idea whether or not code has been
distributed. So if it's enforcing the GPL, it cannot prohibit anything
non-distributed code can lawfully do. (Ergo, it's *NOT* *ENFORCING* the
GPL.)

> > binary seething masses from accessing random kernel functions,
> > and allowing reasonable (but still non GPL) things like ndiswrapper
> > to use network device interface.

> It's up to the ndiswrapper authors how they licence their code, but they
> should respect how we licence ours.

You license yours under the GPL, so they should respect the GPL.

It sounds like we're back to where we were years ago. Didn't we already
agree that EXPORT_SYMBOL_GPL was *NOT* a GPL-enforcement mechanism and had
nothing to do with respecting the GPL? After all, if it's a GPL-enforcement
mechanism, why is it not a further restriction which is prohibited by the
GPL? (The GPL contains no restrictions on what code can use what symbols if
that code is not distributed, but EXPORT_SYMBOL_GPL does.)

Are you now claiming that EXPORT_SYMBOL_GPL is intended to enforce the GPL?

DS




Re: [PATCH]: fix lro_gen_skb() alignment

2007-12-02 Thread Andrew Gallatin

Herbert Xu wrote:

On Fri, Nov 30, 2007 at 02:35:43PM -0500, Andrew Gallatin wrote:

Isn't the value of 2 ethernet-specific (to round the 14-byte header up
to 16)?  Given that the rest of the lro code is fairly careful to
calculate mac_hdr_len etc it seems as if it would be cleaner to make
this independent of the specific L2 being used.

(And I plan on using the LRO module for IP-over-InfiniBand so this is
not completely theoretical)


Good point!

We really should rename NET_IP_ALIGN so that both Ethernet and DMA
occur in it somehow :)


Good point.  I tend to think all the world is ethernet.
Perhaps the better way would be to simply add an alignment pad
field to lro_mgr?  When the driver initializes it, it specifies
any padding needed.  Ethernet drivers would specify 2.


Just pass in the mac_hdr_len, and calculate the padding as


That was my first thought as well, but it turns out that
when lro_gen_skb() is called via the out1 label, mac_hdr_len
may not be known.  It seemed simplest and cleanest to just
make it a field in lro_mgr.

Drew
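
The pad-field idea discussed in this message can be sketched as follows. It is a minimal userspace illustration, not the actual LRO code: `struct lro_mgr_sketch` and its `frag_align_pad` member are hypothetical stand-ins for the proposed field, and the only arithmetic used is the one stated in the thread (2 bytes of padding in front of a 14-byte Ethernet header gives 16).

```c
#include <stddef.h>

/* Hypothetical stand-in for the LRO manager; only the proposed field is shown. */
struct lro_mgr_sketch {
	size_t frag_align_pad; /* bytes reserved before the L2 header, set by the driver */
};

/* Offset of the IP header from the start of the buffer. */
static size_t ip_hdr_offset(const struct lro_mgr_sketch *mgr, size_t mac_hdr_len)
{
	return mgr->frag_align_pad + mac_hdr_len;
}

/* Non-zero when the IP header lands on a 4-byte boundary. */
static int ip_hdr_is_aligned(const struct lro_mgr_sketch *mgr, size_t mac_hdr_len)
{
	return ip_hdr_offset(mgr, mac_hdr_len) % 4 == 0;
}
```

An Ethernet driver would set the pad to 2 when it initializes the manager; a driver for a different L2 would pick whatever value suits its own header length, which keeps the generic code free of Ethernet-specific constants.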


Re: namespace support requires network modules to say GPL

2007-12-02 Thread Alan Cox
> You license yours under the GPL, so they should respect the GPL.
>
> It sounds like we're back to where we were years ago. Didn't we already
> agree that EXPORT_SYMBOL_GPL was *NOT* a GPL-enforcement mechanism and had
> nothing to do with respecting the GPL? After all, if it's a GPL-enforcement

No we seem to be back recycling the fact that certain people were making
statements that might be construed, unanswered, as giving permission to
violate the GPL.

I'm merely reminding people that I've not waived my GPL rights, I've not
said modules are somehow magically OK, and I don't agree with Linus. 

The GPL very clearly says that you can make your own unredistributed
modifications and keep them that way.

Alan


Re: namespace support requires network modules to say GPL

2007-12-02 Thread Adrian Bunk
On Sun, Dec 02, 2007 at 10:59:46PM +0100, Patrick McHardy wrote:
 Adrian Bunk wrote:
 On Sun, Dec 02, 2007 at 09:03:56PM +0100, Patrick McHardy wrote:
...
 your statement has an interesting implication:

 Stuff like e.g. the EXPORT_SYMBOL(sk_alloc) predates the EXPORT_SYMBOL_GPL
 stuff.

 Who is considered the author of this code?

 And when should he state whether he prefers to use EXPORT_SYMBOL_GPL but
 wasn't able to use it at the time he wrote it since his code predates it
 and is glad to be able to decide this now?

 He can state it when he feels like it, I don't see the point.
 Authors generally get to decide whether they use EXPORT_SYMBOL
 or EXPORT_SYMBOL_GPL unless in cases where it's really clear-cut
 that EXPORT_SYMBOL is inappropriate. But that's a different matter.
...

You miss my point.

Stuff like sk_alloc was exported to modules before EXPORT_SYMBOL_GPL 
existed (it was even exported to modules before EXPORT_SYMBOL existed).

We are talking about code and exports that are about 12 years old,
which is about twice as old as EXPORT_SYMBOL_GPL.

So what should happen in your opinion if e.g. Alan checks which of the 
network code he had written when it was exported a dozen years ago, 
stating that he never wanted it to be available to non-GPL modules?

cu
Adrian

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed



[PATCH 4/5] [IrDA] stir4200 fixes

2007-12-02 Thread Samuel Ortiz
The attached patch observes the stir4200 FIFO size and clears the FIFO if
the size is increasing while it should be transmitting bytes.

From: Olaf Hartmann [EMAIL PROTECTED]
Signed-off-by: Samuel Ortiz [EMAIL PROTECTED]

---
 drivers/net/irda/stir4200.c |   10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

Index: net-2.6/drivers/net/irda/stir4200.c
===
--- net-2.6.orig/drivers/net/irda/stir4200.c2007-11-25 05:53:43.0 +0100
+++ net-2.6/drivers/net/irda/stir4200.c 2007-11-26 02:16:12.0 +0100
@@ -142,9 +142,6 @@
 };
 
 enum StirFifoCtlMask {
-   FIFOCTL_EOF = 0x80,
-   FIFOCTL_UNDER = 0x40,
-   FIFOCTL_OVER = 0x20,
FIFOCTL_DIR = 0x10,
FIFOCTL_CLR = 0x08,
FIFOCTL_EMPTY = 0x04,
@@ -594,9 +591,10 @@
 {
int err;
unsigned long count, status;
+   unsigned long prev_count = 0x1fff;
 
/* Read FIFO status and count */
-   for(;;) {
+   for (;; prev_count = count) {
err = read_reg(stir, REG_FIFOCTL, stir->fifo_status, 
   FIFO_REGS_SIZE);
if (unlikely(err != FIFO_REGS_SIZE)) {
@@ -629,6 +627,10 @@
if (space >= 0 && STIR_FIFO_SIZE - 4 > space + count)
return 0;
 
+   /* queue confused */
+   if (prev_count < count)
+   break;
+
/* estimate transfer time for remaining chars */
msleep((count * 8000) / stir->speed);
}
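
The stall check added by this patch can be illustrated outside the driver. The sketch below is an assumption-laden userspace model, not the stir4200 code: `fifo_check()` is a hypothetical helper that walks a list of successive fill-level readings, and "drained" is simplified to a reading of 0. It keeps the patch's initial previous value of 0x1fff and its strict less-than comparison.

```c
#include <stddef.h>

#define FIFO_COUNT_MAX 0x1fff /* initial "previous" value, as in the patch */

/*
 * Walk successive FIFO fill-level readings. Returns 0 if the FIFO drained
 * (a reading of 0 was seen), 1 if a reading grew compared with the previous
 * one (the "queue confused" case, after which the FIFO would be cleared),
 * and -1 if the samples ran out first.
 */
static int fifo_check(const unsigned long *counts, size_t n)
{
	unsigned long prev = FIFO_COUNT_MAX;
	size_t i;

	for (i = 0; i < n; prev = counts[i], i++) {
		if (counts[i] == 0)
			return 0;
		if (prev < counts[i])
			return 1;
	}
	return -1;
}
```

Two equal consecutive readings are not treated as a stall here, matching the strict comparison in the patch.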

-- 



[PATCH 2/5] [IrDA] mcs7780 needs to free allocated rx buffer

2007-12-02 Thread Samuel Ortiz
While testing the mcs7780 based IrDA USB dongle I've stumbled upon a
memory leak in mcs_net_close(). The patch below fixes it.

Signed-off-by: Hinko Kocevar [EMAIL PROTECTED]
Signed-off-by: Samuel Ortiz [EMAIL PROTECTED]

---
 drivers/net/irda/mcs7780.c |2 ++
 1 file changed, 2 insertions(+)

Index: net-2.6/drivers/net/irda/mcs7780.c
===
--- net-2.6.orig/drivers/net/irda/mcs7780.c 2007-11-25 05:53:43.0 +0100
+++ net-2.6/drivers/net/irda/mcs7780.c  2007-11-25 07:12:11.0 +0100
@@ -677,6 +677,8 @@
/* Stop transmit processing */
netif_stop_queue(netdev);
 
+   kfree_skb(mcs->rx_buff.skb);
+
/* kill and free the receive and transmit URBs */
usb_kill_urb(mcs->rx_urb);
usb_free_urb(mcs->rx_urb);

-- 



[PATCH 0/5] [IrDA] IrDA updates for net-2.6

2007-12-02 Thread Samuel Ortiz
Hi Herbert,

Here goes a series of 5 IrDA patches against your latest net-2.6 tree.

Cheers,
Samuel.
-- 



[PATCH 5/5] [IrDA] irda parameters warning fixes.

2007-12-02 Thread Samuel Ortiz
This patch fixes:
  CHECK   /home/kernel/src/net/irda/parameters.c
/home/kernel/src/net/irda/parameters.c:466:2: warning: Using plain integer as NULL pointer
/home/kernel/src/net/irda/parameters.c:520:2: warning: Using plain integer as NULL pointer
/home/kernel/src/net/irda/parameters.c:573:2: warning: Using plain integer as NULL pointer

Signed-off-by: Richard Knutsson [EMAIL PROTECTED]
Signed-off-by: Samuel Ortiz [EMAIL PROTECTED]
Cc: Andrew Morton [EMAIL PROTECTED]
---
 net/irda/parameters.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Index: net-2.6/net/irda/parameters.c
===
--- net-2.6.orig/net/irda/parameters.c  2007-11-25 05:54:02.0 +0100
+++ net-2.6/net/irda/parameters.c   2007-11-26 10:20:38.0 +0100
@@ -463,7 +463,7 @@
int n = 0;
 
IRDA_ASSERT(buf != NULL, return ret;);
-   IRDA_ASSERT(info != 0, return ret;);
+   IRDA_ASSERT(info != NULL, return ret;);
 
pi_minor = pi & info->pi_mask;
pi_major = pi >> info->pi_major_offset;
@@ -517,7 +517,7 @@
int n = 0;
 
IRDA_ASSERT(buf != NULL, return ret;);
-   IRDA_ASSERT(info != 0, return ret;);
+   IRDA_ASSERT(info != NULL, return ret;);
 
pi_minor = buf[n] & info->pi_mask;
pi_major = buf[n] >> info->pi_major_offset;
@@ -570,7 +570,7 @@
int n = 0;
 
IRDA_ASSERT(buf != NULL, return ret;);
-   IRDA_ASSERT(info != 0, return ret;);
+   IRDA_ASSERT(info != NULL, return ret;);
 
/*
 * Parse all parameters. Each parameter must be at least two bytes

-- 



[PATCH 1/5] [IrDA] Race between open and disconnect in irda-usb

2007-12-02 Thread Samuel Ortiz
It seems to me that irda_usb_net_open() must set self->netopen
under the spinlock, or disconnect() may fail to kill all URBs if it is called
while an interface is opened.

Signed-off-by: Oliver Neukum [EMAIL PROTECTED]
Signed-off-by: Samuel Ortiz [EMAIL PROTECTED]

---
 drivers/net/irda/irda-usb.c |5 +
 1 file changed, 5 insertions(+)

Index: net-2.6/drivers/net/irda/irda-usb.c
===
--- net-2.6.orig/drivers/net/irda/irda-usb.c2007-11-25 05:53:43.0 +0100
+++ net-2.6/drivers/net/irda/irda-usb.c 2007-11-25 07:12:09.0 +0100
@@ -1168,6 +1168,7 @@
 static int irda_usb_net_open(struct net_device *netdev)
 {
struct irda_usb_cb *self;
+   unsigned long flags;
charhwname[16];
int i;

@@ -1177,13 +1178,16 @@
self = (struct irda_usb_cb *) netdev->priv;
IRDA_ASSERT(self != NULL, return -1;);
 
+   spin_lock_irqsave(&self->lock, flags);
/* Can only open the device if it's there */
if(!self->present) {
+   spin_unlock_irqrestore(&self->lock, flags);
IRDA_WARNING("%s(), device not present!\n", __FUNCTION__);
return -1;
}
 
if(self->needspatch) {
+   spin_unlock_irqrestore(&self->lock, flags);
IRDA_WARNING("%s(), device needs patch\n", __FUNCTION__) ;
return -EIO ;
}
@@ -1198,6 +1202,7 @@
/* To do *before* submitting Rx urbs and starting net Tx queue
 * Jean II */
self->netopen = 1;
+   spin_unlock_irqrestore(&self->lock, flags);
 
/* 
 * Now that everything should be initialized properly,
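
The locking pattern behind this fix can be modelled in userspace. The following is a sketch under stated assumptions rather than the irda-usb driver: `struct dev_sketch`, `dev_open()` and `dev_disconnect()` are hypothetical, a pthread mutex stands in for the spinlock, and the disconnect side is an assumed counterpart that reads the flag under the same lock.

```c
#include <pthread.h>

/* Hypothetical device state; the mutex stands in for the driver's spinlock. */
struct dev_sketch {
	pthread_mutex_t lock;
	int present; /* device is still plugged in */
	int netopen; /* interface has been opened */
};

/* Check "present" and set "netopen" in one critical section. */
static int dev_open(struct dev_sketch *d)
{
	pthread_mutex_lock(&d->lock);
	if (!d->present) {
		pthread_mutex_unlock(&d->lock);
		return -1;
	}
	d->netopen = 1;
	pthread_mutex_unlock(&d->lock);
	return 0;
}

/*
 * Mark the device gone and report whether it was open, so the caller
 * knows it still has in-flight transfers to cancel.
 */
static int dev_disconnect(struct dev_sketch *d)
{
	int was_open;

	pthread_mutex_lock(&d->lock);
	d->present = 0;
	was_open = d->netopen;
	pthread_mutex_unlock(&d->lock);
	return was_open;
}
```

Because both sides touch the two flags only while holding the lock, a disconnect can never observe a half-finished open: it sees either a closed interface or one that is fully marked open.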

-- 



[PASEMI_MAC] Don't claim to do IPv6 checksum offload

2007-12-02 Thread David Woodhouse
Signed-off-by: David Woodhouse [EMAIL PROTECTED]

diff --git a/drivers/net/pasemi_mac.c b/drivers/net/pasemi_mac.c
index 09b4fde..a8db5d7 100644
--- a/drivers/net/pasemi_mac.c
+++ b/drivers/net/pasemi_mac.c
@@ -1362,7 +1362,7 @@ pasemi_mac_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
netif_napi_add(dev, &mac->napi, pasemi_mac_poll, 64);
 
-   dev->features = NETIF_F_HW_CSUM | NETIF_F_LLTX | NETIF_F_SG;
+   dev->features = NETIF_F_IP_CSUM | NETIF_F_LLTX | NETIF_F_SG;
 
/* These should come out of the device tree eventually */
mac->dma_txch = index;

-- 
dwmw2



Re: [1/4] dst: Distributed storage documentation.

2007-12-02 Thread Matt Mackall
On Thu, Nov 29, 2007 at 03:53:23PM +0300, Evgeniy Polyakov wrote:
> 
> Distributed storage documentation.
> 
> Algorithms used in the system, userspace interfaces
> (sysfs dirs and files), design and implementation details
> are described here.

Can you give us a summary of how this differs from using device mapper
with NBD?

-- 
Mathematics is the supreme nostalgia of our time.


Re: [PATCH (resubmit)] Fix inet_diag.ko register vs rcv race

2007-12-02 Thread Herbert Xu
On Thu, Nov 29, 2007 at 04:01:25PM +0300, Pavel Emelyanov wrote:

> @@ -863,13 +861,13 @@ int inet_diag_register(const struct inet_diag_handler *h)
>   if (type >= INET_DIAG_GETSOCK_MAX)
>   goto out;
>  
> - spin_lock(&inet_diag_register_lock);
> + mutex_lock(&inet_diag_mutex);
>   err = -EEXIST;
>   if (inet_diag_table[type] == NULL) {
>   inet_diag_table[type] = h;
>   err = 0;
>   }
> - spin_unlock(&inet_diag_register_lock);
> + mutex_unlock(&inet_diag_mutex);

Actually this causes a dead-lock when the handlers are built as modules
because we try to load them with that mutex held.

I've fixed it with this patch on top.
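
The dead-lock and the shape of the fix can be shown with a small userspace model. Everything here is a hypothetical sketch, not the inet_diag code: `load_handler()` stands in for module loading, which ends up calling the registration routine, and registration takes the table mutex. The lookup therefore triggers loading before it takes that mutex. Unlike the patch, this sketch returns NULL and drops the mutex on failure, which is a choice of the sketch only.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_TYPES 4

struct handler { int type; };

static pthread_mutex_t table_mutex = PTHREAD_MUTEX_INITIALIZER;
static const struct handler *table[MAX_TYPES];
static const struct handler builtin[MAX_TYPES] = { {0}, {1}, {2}, {3} };

/* Registration takes table_mutex, as a module's init routine would. */
static int register_handler(const struct handler *h)
{
	int err = -1;

	pthread_mutex_lock(&table_mutex);
	if (table[h->type] == NULL) {
		table[h->type] = h;
		err = 0;
	}
	pthread_mutex_unlock(&table_mutex);
	return err;
}

/* Stand-in for module loading; pretend no module exists for type 3. */
static void load_handler(int type)
{
	if (type != 3)
		register_handler(&builtin[type]);
}

/* Returns the handler with table_mutex held, or NULL with it released. */
static const struct handler *lock_handler(int type)
{
	if (type < 0 || type >= MAX_TYPES)
		return NULL;
	if (table[type] == NULL) /* unlocked peek, as in the patch */
		load_handler(type); /* must happen before the lock is taken */

	pthread_mutex_lock(&table_mutex);
	if (table[type] == NULL) {
		pthread_mutex_unlock(&table_mutex);
		return NULL;
	}
	return table[type];
}

static void unlock_handler(const struct handler *h)
{
	(void)h;
	pthread_mutex_unlock(&table_mutex);
}
```

Calling `load_handler()` after `pthread_mutex_lock()` instead would make `register_handler()` block on a mutex its own caller already holds, which is the regression described above.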

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
commit d523a328fb0271e1a763e985a21f2488fd816e7e
Author: Herbert Xu [EMAIL PROTECTED]
Date:   Mon Dec 3 15:51:25 2007 +1100

[INET]: Fix inet_diag dead-lock regression

The inet_diag register fix broke inet_diag module loading because the
loaded module had to take the same mutex that's already held by the
loader in order to register the new handler.

This patch fixes it by introducing a separate mutex to protect the
handling of handlers.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index 6b3fffb..e468e7a 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -51,6 +51,29 @@ static struct sock *idiagnl;
 #define INET_DIAG_PUT(skb, attrtype, attrlen) \
RTA_DATA(__RTA_PUT(skb, attrtype, attrlen))
 
+static DEFINE_MUTEX(inet_diag_table_mutex);
+
+static const struct inet_diag_handler *inet_diag_lock_handler(int type)
+{
+#ifdef CONFIG_KMOD
+   if (!inet_diag_table[type])
+   request_module("net-pf-%d-proto-%d-type-%d", PF_NETLINK,
+  NETLINK_INET_DIAG, type);
+#endif
+
+   mutex_lock(&inet_diag_table_mutex);
+   if (!inet_diag_table[type])
+   return ERR_PTR(-ENOENT);
+
+   return inet_diag_table[type];
+}
+
+static inline void inet_diag_unlock_handler(
+   const struct inet_diag_handler *handler)
+{
+   mutex_unlock(&inet_diag_table_mutex);
+}
+
 static int inet_csk_diag_fill(struct sock *sk,
  struct sk_buff *skb,
  int ext, u32 pid, u32 seq, u16 nlmsg_flags,
@@ -235,9 +258,12 @@ static int inet_diag_get_exact(struct sk_buff *in_skb,
struct inet_hashinfo *hashinfo;
const struct inet_diag_handler *handler;
 
-   handler = inet_diag_table[nlh->nlmsg_type];
-   BUG_ON(handler == NULL);
+   handler = inet_diag_lock_handler(nlh->nlmsg_type);
+   if (!handler)
+   return -ENOENT;
+
hashinfo = handler->idiag_hashinfo;
+   err = -EINVAL;
 
if (req->idiag_family == AF_INET) {
sk = inet_lookup(hashinfo, req->id.idiag_dst[0],
@@ -255,11 +281,12 @@ static int inet_diag_get_exact(struct sk_buff *in_skb,
}
 #endif
else {
-   return -EINVAL;
+   goto unlock;
}
 
+   err = -ENOENT;
if (sk == NULL)
-   return -ENOENT;
+   goto unlock;
 
err = -ESTALE;
if ((req->id.idiag_cookie[0] != INET_DIAG_NOCOOKIE ||
@@ -296,6 +323,8 @@ out:
else
sock_put(sk);
}
+unlock:
+   inet_diag_unlock_handler(handler);
return err;
 }
 
@@ -678,8 +707,10 @@ static int inet_diag_dump(struct sk_buff *skb, struct netlink_callback *cb)
const struct inet_diag_handler *handler;
struct inet_hashinfo *hashinfo;
 
-   handler = inet_diag_table[cb->nlh->nlmsg_type];
-   BUG_ON(handler == NULL);
+   handler = inet_diag_lock_handler(cb->nlh->nlmsg_type);
+   if (!handler)
+   goto no_handler;
+
hashinfo = handler->idiag_hashinfo;
 
s_i = cb->args[1];
@@ -743,7 +774,7 @@ skip_listen_ht:
}
 
if (!(r->idiag_states & ~(TCPF_LISTEN | TCPF_SYN_RECV)))
-   return skb->len;
+   goto unlock;
 
for (i = s_i; i < hashinfo->ehash_size; i++) {
struct inet_ehash_bucket *head = &hashinfo->ehash[i];
@@ -805,6 +836,9 @@ next_dying:
 done:
cb->args[1] = i;
cb->args[2] = num;
+unlock:
+   inet_diag_unlock_handler(handler);
+no_handler:
return skb->len;
 }
 
@@ -816,15 +850,6 @@ static int inet_diag_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
nlmsg_len(nlh) < hdrlen)
return -EINVAL;
 
-#ifdef CONFIG_KMOD
-   if (inet_diag_table[nlh->nlmsg_type] == NULL)
-   request_module("net-pf-%d-proto-%d-type-%d", PF_NETLINK,
-  NETLINK_INET_DIAG, nlh->nlmsg_type);
-#endif
-
-   if (inet_diag_table[nlh->nlmsg_type] == NULL)
-   

Re: [PASEMI_MAC] Don't claim to do IPv6 checksum offload

2007-12-02 Thread Olof Johansson
On Mon, Dec 03, 2007 at 04:34:32AM +, David Woodhouse wrote:
> Signed-off-by: David Woodhouse [EMAIL PROTECTED]
Acked-by: Olof Johansson [EMAIL PROTECTED]

Jeff,

I'd like to see this in 2.6.24, please.


Thanks,

-Olof



Re: [PATCH] XFRM: SPD auditing fix to include the netmask/prefix-length

2007-12-02 Thread Paul Moore
On Saturday 01 December 2007 7:28:34 am Herbert Xu wrote:
> On Fri, Nov 30, 2007 at 09:51:48AM -0500, Paul Moore wrote:
> > Steve and/or Joy, could we get a verdict on this issue?  The lack of a
> > netmask in the SPD audit messages is pretty serious so I'd like to see
> > this fixed as soon as possible.
>
> I'll take the resounding silence as an indication of approval :)
>
> Patch applied to net-2.6.25.  Thanks Paul.

Thanks Herbert.  If I hear any grumblings from the audit folks I'll send out 
another patch to make the prefix length a permanent fixture.

-- 
paul moore
linux security @ hp


Re: [PATCH 0/5] [IrDA] IrDA updates for net-2.6

2007-12-02 Thread Herbert Xu
On Mon, Dec 03, 2007 at 10:28:31AM +0100, Samuel Ortiz wrote:
> Hi Herbert,
> 
> Here goes a series of 5 IrDA patches against your latest net-2.6 tree.

I seem to be missing 3/5.  Could you please resend?

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode

2007-12-02 Thread Wang Chen
Herbert Xu said the following on 2007-12-1 9:54:
 On Fri, Nov 30, 2007 at 11:19:49AM +, Gerrit Renker wrote:
 |  csum_copy_err:
 | -  UDP6_INC_STATS_USER(UDP_MIB_INERRORS, is_udplite);
 | +  UDP6_INC_STATS_BH(UDP_MIB_INERRORS, is_udplite);
 |skb_kill_datagram(sk, skb, flags);
 |  
 |if (flags & MSG_DONTWAIT)
 | 
 Is it not the other way round ?? :- 
 
 I agree.  Wang Chen, please change this and other appropriate
 BH calls to USER.  Basically recvmsg should always be USER while
 rcv is BH.
 

Resubmit the patch.

System calls should be USER. So change the BH to USER for
UDP*_INC_STATS_BH().

Signed-off-by: Wang Chen [EMAIL PROTECTED]
---
 ipv4/udp.c |4 ++--
 ipv6/udp.c |2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

--- linux-2.6.24.rc3.org/net/ipv4/udp.c 2007-12-03 15:10:49.0 +0800
+++ linux-2.6.24.rc3/net/ipv4/udp.c 2007-12-03 14:52:54.0 +0800
@@ -874,7 +874,7 @@ try_again:
if (err)
goto out_free;
 
-   UDP_INC_STATS_BH(UDP_MIB_INDATAGRAMS, is_udplite);
+   UDP_INC_STATS_USER(UDP_MIB_INDATAGRAMS, is_udplite);
 
sock_recv_timestamp(msg, sk, skb);
 
@@ -899,7 +899,7 @@ out:
return err;
 
 csum_copy_err:
-   UDP_INC_STATS_BH(UDP_MIB_INERRORS, is_udplite);
+   UDP_INC_STATS_USER(UDP_MIB_INERRORS, is_udplite);
 
skb_kill_datagram(sk, skb, flags);
 
--- linux-2.6.24.rc3.org/net/ipv6/udp.c 2007-12-03 15:10:49.0 +0800
+++ linux-2.6.24.rc3/net/ipv6/udp.c 2007-12-03 15:09:55.0 +0800
@@ -164,7 +164,7 @@ try_again:
if (err)
goto out_free;
 
-   UDP6_INC_STATS_BH(UDP_MIB_INDATAGRAMS, is_udplite);
+   UDP6_INC_STATS_USER(UDP_MIB_INDATAGRAMS, is_udplite);
 
sock_recv_timestamp(msg, sk, skb);



Re: [PATCH net-2.6.25] Add packet filtering based on process's security context.

2007-12-02 Thread Patrick McHardy

James Morris wrote:

On Thu, 22 Nov 2007, Tetsuo Handa wrote:


This patch allows LSM modules to filter incoming connections/datagrams
based on the security context of the process that is attempting to pick them up.

There are already hooks to filter incoming connections/datagrams
based on the socket's security context, but these hooks are not
applicable when one wants to do TCP Wrapper-like filtering
(e.g. App1 is permitted to accept TCP connections from 192.168.0.0/16).


This functionality looks like it could be useful in that we currently have 
no direct security mapping from incoming packet to user process, but only 
to the receiving socket, as you mention.  For SELinux, it may help us 
simplify/clarify policy.


It's also been long-desired for netfilter/iptables, to allow ipt_owner to 
work cleanly for incoming packets.


So, this probably needs to be implemented in a way which works for both LSM 
and netfilter.  There have been several discussions on the issue from the 
netfilter side, although I don't know what the latest status of that is 
(I've expanded the cc list to hopefully get some more feedback).


No news on that. I'm also a bit sceptical if adding all this complexity
and overhead would really be worth it (considering only netfilter) just
to use the owner match and UID/GID matching. It wouldn't even be
accurate because there is not 1:1 mapping of sockets and processes.

I actually like Samir Bellabes' approach, which doesn't suffer from
these problems IIRC.

From memory, one approach under discussion was to add netfilter hooks to 
the transport layer, which could be invoked correctly by each type of 
protocol when the target process is selected.


We can only invoke the hooks after the socket lookup, but we don't
know which process is going to call recvmsg() for that socket.


Re: [PATCH 0/2] netem: trace enhancement

2007-12-02 Thread Patrick McHardy

Ariane Keller wrote:

Thanks for your comments!

I'd like to better understand your dislike of the current implementation 
 of the data transfer from user space to kernel space.

Is it the fact that we use configfs?
I think, we had already a discussion about this (and we changed from 
procfs to configfs).
Or don't you like that we need a user space daemon which is responsible 
for feeding the data to the kernel module?
I think we do not have another option, since the trace file may be of 
arbitrary length.

Or anything else?



I dislike using anything besides rtnetlink for qdisc configuration.
The only way to transfer arbitrary amounts of data over netlink would
be to spread the data over multiple messages. But then again, you're
using kmalloc and only seem to allocate 4k, so how large are these
traces in practice?
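
Spreading a large buffer over several bounded messages, as suggested above, could look roughly like this. The sketch is transport-agnostic and makes no claim about the netlink API: `send_fn`, `send_in_chunks()` and `count_bytes()` are hypothetical names, and the per-message payload limit is simply a parameter.

```c
#include <stddef.h>

/* Hypothetical per-message sender; returns 0 on success. */
typedef int (*send_fn)(const unsigned char *chunk, size_t len, void *ctx);

/*
 * Split buf into messages of at most max_payload bytes each.
 * Returns the number of messages sent, or -1 on error.
 */
static long send_in_chunks(const unsigned char *buf, size_t len,
			   size_t max_payload, send_fn send, void *ctx)
{
	long sent = 0;

	if (max_payload == 0)
		return -1;

	while (len > 0) {
		size_t n = len < max_payload ? len : max_payload;

		if (send(buf, n, ctx) != 0)
			return -1;
		buf += n;
		len -= n;
		sent++;
	}
	return sent;
}

/* Example sender that only totals the bytes it was handed. */
static int count_bytes(const unsigned char *chunk, size_t len, void *ctx)
{
	(void)chunk;
	*(size_t *)ctx += len;
	return 0;
}
```

A 10000-byte trace with a 4096-byte limit goes out as three messages; the receiving side would need its own way to tell where a trace ends, which this sketch does not address.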
