Re: [PATCH] NET : convert network timestamps to ktime_t

2007-03-05 Thread Eric Dumazet

David Miller wrote:

From: Eric Dumazet [EMAIL PROTECTED]
Date: Mon, 05 Mar 2007 08:40:03 +0100

Here is the second version of this patch, including the missing bits spotted by 
Stephen. This is against net-2.6.22.


Applied, thanks a lot.


Note: this patch includes a bug fix in compat_sock_get_timestamp(),
where an err = 0; was missing (so this syscall returned -ENOENT instead of
0).


Grrr, good spotting.  Eric can you send me a version of just this bug
fix against 2.6.21?  I'll push that to Linus and 2.6.x-stable.


Yes, I will do it. But are you sure it's really used?

I don't know this compat stuff very well, but I found 
compat_sock_get_timestamp() called only from net/x25/af_x25.c.


I wonder how a 32-bit task on x86_64 can enter this compat code on other 
sockets?
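
For readers following along, the missing-assignment pattern discussed above can be reduced to a few lines. This is a hypothetical user-space sketch, not the kernel function: get_stamp_buggy(), get_stamp_fixed() and copy_out() are made-up names, and copy_out() only stands in for the put_user() calls.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for put_user(): returns 0 on success. */
static int copy_out(long *dst, long val)
{
    if (!dst)
        return -1;
    *dst = val;
    return 0;
}

/* Buggy shape: err starts as -ENOENT and is never cleared on success. */
static int get_stamp_buggy(long stamp_sec, long *out)
{
    int err = -ENOENT;

    if (stamp_sec == -1)        /* no timestamp recorded yet */
        return err;
    if (copy_out(out, stamp_sec))
        err = -EFAULT;
    return err;                 /* still -ENOENT although the copy worked */
}

/* Fixed shape: clear err once the "no timestamp" case is ruled out. */
static int get_stamp_fixed(long stamp_sec, long *out)
{
    int err = -ENOENT;

    if (stamp_sec == -1)
        return err;
    err = 0;
    if (copy_out(out, stamp_sec))
        err = -EFAULT;
    return err;
}

/* Usage: both variants copy the value, only the fixed one reports success. */
void stamp_demo(void)
{
    long out = 0;

    assert(get_stamp_buggy(42, &out) == -ENOENT && out == 42);
    assert(get_stamp_fixed(42, &out) == 0 && out == 42);
}
```

The caller sees valid data in the buggy case but is told the call failed, which matches the symptom described in the note.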

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v.2] RTNL and flush_scheduled_work deadlocks

2007-03-05 Thread Jarek Poplawski
On Sun, Feb 18, 2007 at 10:27:19PM -0800, Ben Greear wrote:
 Jarek Poplawski wrote:
 On Fri, Feb 16, 2007 at 11:04:02AM -0800, Ben Greear wrote:
...
 On Thu, 15 Feb 2007 23:40:32 -0800
 Ben Greear [EMAIL PROTECTED] wrote:
   
 Maybe there should be something like an ASSERT_NOT_RTNL() in the 
 flush_scheduled_work()
 method?  If it's performance critical, #ifdef it out if we're not 
 debugging locks?
...
 For performance reasons, I'd leave the rtnl_owner inside the
 #if debugging locking code
 
 You are also changing the semantics of ASSERT_RTNL (assert *this thread* 
 has rtnl, from the
 old behaviour:  assert *some thread* has rtnl).  It may be better this
 way, but it could break code that assumes the old behaviour.
 
 Ben

Hi,

I'm not sure anybody is interested yet, but I think it's
a good idea so here is a reworked proposal. 

Jarek P.
===

[NET] DEBUG_ASSERT_RTNL and DEBUG_ASSERT_NOT_RTNL macros

Debug RTNL macros usable with CONFIG_DEBUG_MUTEXES on.
Based on suggestions of Ben Greear (to help debugging
flush_scheduled_work()).

Signed-off-by: Jarek Poplawski [EMAIL PROTECTED]
Cc: Ben Greear [EMAIL PROTECTED]

---

 include/linux/rtnetlink.h |   27 +++
 net/core/rtnetlink.c  |   10 ++
 2 files changed, 37 insertions(+)


diff -Nurp linux-2.6.21-rc2-git2-/include/linux/rtnetlink.h 
linux-2.6.21-rc2-git2/include/linux/rtnetlink.h
--- linux-2.6.21-rc2-git2-/include/linux/rtnetlink.h2007-02-04 
19:44:54.0 +0100
+++ linux-2.6.21-rc2-git2/include/linux/rtnetlink.h 2007-03-04 
14:47:00.0 +0100
@@ -728,6 +728,33 @@ rtattr_failure:
return table;
 }
 
+#ifdef CONFIG_DEBUG_MUTEXES
+extern int debug_rtnl_assert(void);
+
+#define DEBUG_ASSERT_RTNL() do { \
+   if (unlikely(!debug_rtnl_assert())) { \
+   printk(KERN_ERR "DEBUG RTNL: " \
+   "assertion failed at %s (%d)\n", \
+  __FILE__,  __LINE__); \
+   dump_stack(); \
+   } \
+} while(0)
+
+#define DEBUG_ASSERT_NOT_RTNL() do { \
+   if (unlikely(debug_rtnl_assert())) { \
+   printk(KERN_ERR "DEBUG NOT RTNL: " \
+   "assertion failed at %s (%d)\n", \
+  __FILE__,  __LINE__); \
+   dump_stack(); \
+   } \
+} while(0)
+
+#else
+/* debug_rtnl_assert() invalid here */
+#define DEBUG_ASSERT_RTNL()
+#define DEBUG_ASSERT_NOT_RTNL()
+#endif /* CONFIG_DEBUG_MUTEXES */
+
 #endif /* __KERNEL__ */
 
 
diff -Nurp linux-2.6.21-rc2-git2-/net/core/rtnetlink.c 
linux-2.6.21-rc2-git2/net/core/rtnetlink.c
--- linux-2.6.21-rc2-git2-/net/core/rtnetlink.c 2007-02-21 19:46:47.0 
+0100
+++ linux-2.6.21-rc2-git2/net/core/rtnetlink.c  2007-03-04 15:23:34.0 
+0100
@@ -82,6 +82,13 @@ int rtnl_trylock(void)
    return mutex_trylock(&rtnl_mutex);
 }
 
+#ifdef CONFIG_DEBUG_MUTEXES
+int debug_rtnl_assert(void)
+{
+   return (rtnl_mutex.owner == current_thread_info());
+}
+#endif
+
 int rtattr_parse(struct rtattr *tb[], int maxattr, struct rtattr *rta, int len)
 {
memset(tb, 0, sizeof(struct rtattr*)*maxattr);
@@ -893,3 +900,6 @@ EXPORT_SYMBOL(rtnl_unlock);
 EXPORT_SYMBOL(rtnl_unicast);
 EXPORT_SYMBOL(rtnl_notify);
 EXPORT_SYMBOL(rtnl_set_sk_err);
+#ifdef CONFIG_DEBUG_MUTEXES
+EXPORT_SYMBOL(debug_rtnl_assert);
+#endif
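
The distinction Ben raises (assert that *this thread* holds the lock versus assert that *some thread* holds it) can be illustrated in user space. This is only a sketch under stated assumptions: struct owned_lock and its helpers are hypothetical stand-ins for a mutex that records its owner, and task identity is modelled by a plain pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a mutex that records its owner. */
struct owned_lock {
    int locked;
    const void *owner;      /* NULL when unlocked */
};

static void lock_by(struct owned_lock *l, const void *task)
{
    assert(!l->locked);
    l->locked = 1;
    l->owner = task;
}

static void unlock_by(struct owned_lock *l, const void *task)
{
    assert(l->locked && l->owner == task);
    l->owner = NULL;
    l->locked = 0;
}

/* "Some thread has it": true for every caller while the lock is held. */
static int held_by_anyone(const struct owned_lock *l)
{
    return l->locked;
}

/* "This thread has it": true only for the recorded owner. */
static int held_by(const struct owned_lock *l, const void *task)
{
    return l->locked && l->owner == task;
}

void owner_check_demo(void)
{
    struct owned_lock rtnl = { 0, NULL };
    int task_a = 0, task_b = 0;     /* addresses serve as task identities */

    lock_by(&rtnl, &task_a);
    assert(held_by_anyone(&rtnl));      /* passes no matter who asks */
    assert(held_by(&rtnl, &task_a));    /* passes for the owner */
    assert(!held_by(&rtnl, &task_b));   /* fails for a bystander */
    unlock_by(&rtnl, &task_a);
    assert(!held_by_anyone(&rtnl));
}
```

Code that calls the old-style check from a thread that merely knows another thread holds the lock would start tripping the owner-based assertion, which is the compatibility concern quoted above.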


Re: [PATCH] NET : convert network timestamps to ktime_t

2007-03-05 Thread David Miller
From: Eric Dumazet [EMAIL PROTECTED]
Date: Mon, 05 Mar 2007 09:21:30 +0100

 I dont know very much this compat stuff, but I found 
 compat_sock_get_timestamp() only called from net/x25/af_x25.c

Good point, I thought for some reason that this function
was used for all SO_TIMESTAMP handling, but looking again
I see that the main stuff is handled in put_cmsg() compat
code.


[PATCH] Ensure IF_READY is unset when link is not ready

2007-03-05 Thread Mitsuru Chinen
Hi there,

On linux-2.6.21-rc2 or later, an IPv6 link-local address is not assigned to
some kinds of interfaces during system start-up. (I found this issue
occurs with e100, e1000 and tg3.)

This issue comes from the change that allocates inet6_dev when the
NETDEV_REGISTER event occurs. The allocation code is in ipv6_add_dev()
in net/ipv6/addrconf.c. In that code, the IF_READY bit is set when the
link is ready. Link readiness is checked with netif_carrier_ok().
However, as the drivers don't call netif_carrier_off() prior to calling
register_netdev(), netif_carrier_ok() returns true regardless of the
actual link state.

Here's a workaround patch. It makes IF_READY unset when the NETDEV_UP
event occurs and the link is not ready. This patch may not be a
fundamental fix, but I don't have any other idea now.

Signed-off-by: Mitsuru Chinen [EMAIL PROTECTED]
---
 net/ipv6/addrconf.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index e16f1bb..1593cd1 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2196,6 +2196,7 @@ static int addrconf_notify(struct notifi
    "ADDRCONF(NETDEV_UP): %s: "
    "link is not ready\n",
    dev->name);
+   idev->if_flags &= ~IF_READY;
break;
}
 
-- 
1.4.3.4
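
The one-line change in this patch is a plain bit clear. A minimal user-space sketch of keeping such a flag in step with the carrier state follows; struct fake_idev and sync_ready() are hypothetical, and the IF_READY value is assumed for illustration only.

```c
#include <assert.h>

#define IF_READY 0x80000000u    /* value assumed for this sketch */

/* Hypothetical stand-in for the per-device IPv6 state. */
struct fake_idev {
    unsigned int if_flags;
};

/* Set or clear IF_READY so that it mirrors the carrier state. */
static void sync_ready(struct fake_idev *idev, int carrier_ok)
{
    if (carrier_ok)
        idev->if_flags |= IF_READY;
    else
        idev->if_flags &= ~IF_READY;   /* leaves all other bits alone */
}

void if_ready_demo(void)
{
    struct fake_idev idev = { IF_READY | 0x1u };   /* stale "ready" bit */

    sync_ready(&idev, 0);               /* link reported down at NETDEV_UP */
    assert(idev.if_flags == 0x1u);
    sync_ready(&idev, 1);               /* carrier comes up later */
    assert(idev.if_flags == (IF_READY | 0x1u));
}
```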



linux 2.6 Ipv4 routing enhancement (fwd)

2007-03-05 Thread Richard Kojedzinszky

Dear all,

I work for an ISP, and we do not spend money on heavy routers; we use 
Linux for the routing tasks, even at core level. We use commercial Intel 
servers for this job. But once such a router had to handle ~1 Gbit/s of 
traffic while also updating the routing table (from BGP), the route cache 
turned out to be the bottleneck: every FIB update flushes the whole route 
cache, and this sometimes took so many CPU cycles that some packets were 
dropped. Meanwhile I knew that *BSD systems do not use such a cache, and 
of course without it a router can provide constant performance, 
independent of the number of distinct IP flows, and updating the FIB does 
not take as long.


To solve this, I have played with IPv4 routing in the Linux kernel a 
bit. I have done two separate things:

- developed a new FIB algorithm to replace fib_trie for IPv4
- rewrote the kernel not to use its dst cache

I named my work Linux Express Forwarding; I hope I will not get into any 
trouble with this. :)


The FIB algorithm is like Cisco's CEF (at least if my knowledge is correct): 
first I use a 16-way branching tree to look up the address in 4-bit steps, and 
each node in this tree contains a simple sub-tree, a radix tree with a maximum 
possible height of 4. I think this is very simple, and it is nearly 3 times 
faster than fib_trie. It is currently missing one feature: it does not export 
the FIB in /proc/net/route.
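
The structure described above can be sketched in user space as follows. This is not the code from lef.tgz; it is an illustration under assumptions: every name is hypothetical, the per-node sub-tree is stored as a 15-slot array in heap order (one slot for each prefix of 0 to 3 extra bits), and next hops are plain integers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NO_ROUTE (-1)
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

struct nib_node {
    int sub[15];                 /* prefixes ending 0..3 bits into this nibble */
    struct nib_node *child[16];  /* one branch per 4-bit step */
};

static struct nib_node *nib_node_new(void)
{
    struct nib_node *n = calloc(1, sizeof(*n));
    int i;

    if (!n)
        abort();
    for (i = 0; i < 15; i++)
        n->sub[i] = NO_ROUTE;
    return n;
}

/* Walk whole nibbles, then store the remaining 0..3 bits in the sub-tree. */
static void nib_insert(struct nib_node *n, uint32_t prefix, int plen, int nexthop)
{
    int shift = 28;
    unsigned int bits = 0;

    assert(plen >= 0 && plen <= 32);
    while (plen >= 4) {
        unsigned int nib = (prefix >> shift) & 0xf;

        if (!n->child[nib])
            n->child[nib] = nib_node_new();
        n = n->child[nib];
        shift -= 4;
        plen -= 4;
    }
    if (plen > 0)
        bits = (prefix >> (shift + 4 - plen)) & ((1u << plen) - 1);
    n->sub[(1u << plen) - 1 + bits] = nexthop;
}

/* Longest-prefix match: deeper nodes and longer sub-prefixes win. */
static int nib_lookup(const struct nib_node *n, uint32_t addr)
{
    int best = NO_ROUTE;
    int shift, l;

    for (shift = 28; n; shift -= 4) {
        unsigned int nib = shift >= 0 ? (addr >> shift) & 0xf : 0;
        int maxlen = shift >= 0 ? 3 : 0;   /* a /32 node has no bits left */

        for (l = 0; l <= maxlen; l++) {
            int v = n->sub[(1u << l) - 1 + (nib >> (4 - l))];

            if (v != NO_ROUTE)
                best = v;
        }
        if (shift < 0)
            break;
        n = n->child[nib];
    }
    return best;
}

/* Usage; the tree is deliberately not freed in this short demo. */
void nib_demo(void)
{
    struct nib_node *root = nib_node_new();

    nib_insert(root, IP4(0, 0, 0, 0), 0, 0);
    nib_insert(root, IP4(10, 0, 0, 0), 8, 1);
    nib_insert(root, IP4(10, 1, 0, 0), 16, 2);
    nib_insert(root, IP4(10, 1, 128, 0), 17, 3);
    assert(nib_lookup(root, IP4(10, 1, 200, 5)) == 3);
    assert(nib_lookup(root, IP4(10, 1, 1, 1)) == 2);
    assert(nib_lookup(root, IP4(10, 2, 0, 0)) == 1);
    assert(nib_lookup(root, IP4(11, 0, 0, 0)) == 0);
}
```

A lookup touches at most nine nodes for IPv4, and each visit inspects at most four sub-tree slots, so the cost does not depend on the number of active flows.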


The second thing I have done is to minimize the CPU cycles spent in the 
forwarding phase by rewriting ip_input.c, route.c and some others into lef.c 
with minimal functionality. For example, when a packet goes through the LEF 
functions, IPsec policies are not checked.
And to be more efficient, I attached a neighbour pointer to each FIB entry; 
with this, the lookup + forwarding code is very fast.


Of course, the route cache needs very little time to forward packets when there 
are only a small number of distinct IP flows, but that cannot be assumed when 
dealing with ISP traffic at core level.


So I have run tests with LEF and compared the results to the original Linux 
kernel's performance.
In its worst case, LEF achieved nearly 90% of the performance of the original 
Linux kernel in its best case. Of course the original Linux kernel performs 
poorly in the worst case.


Here is a list of the features/bugs that still need to be completed/fixed (a TODO list):
FIB:
- export data to /proc/net/route
LEF:
- support packet fragmentation
- support SMP

These are the most important. Of course some might decide not to use it at all 
without these. LEF has been running on our routers for 3 months, and no 
problems have arisen. Now it seems that the routers' internal bus speed is the 
bottleneck, but that could only be fixed with hardware. :)


The patches are for 2.6.19.1; I have not made an effort to apply them to the 
latest kernel.


So I am sending the patches; please comment on them, and tell me whether I can 
hope that this gets into the kernel, or what more I should do.


The files should be applied in alphabetic order.

Regards,
Richard Kojedzinszky

lef.tgz
Description: GNU Unix tar archive


Re: Extensible hashing and RCU

2007-03-05 Thread Robert Olsson

Michael K. Edwards writes:

  This, incidentally, seems very similar to the process that Robert
  Olsson and Stefan Nilsson have gone through with their trie/hash
  project.  Although I haven't tried it out yet and don't have any basis
  for an independent opinion, the data and analysis provided in their
  paper are quite convincing. 

 I've info about this process :) Moved fib_trie.c to userland to extend 
 it to longer and variable key lengths. The test program was doing 
 insert/lookup/remove with random data with blistering performance (very 
 flat trees).

 So, quite happy, I moved this trie back to the kernel and started testing with
 real data - ip addresses and rDoS.  The result was disastrous: very deep 
 trees and awful network performance. Random data is very different from 
 real data was the lesson learned again. Gave up. 

 A couple of days later an idea came up. I remembered the proof from the
 LC-trie paper that the length of the key does not impact tree depth. So I went 
 to Stefan: can't we just fix up the data rather than fiddling with a 
 new improved algorithm? The result was this hash header to boost tree 
 compression.
 
 Cheers
--ro

 


Re: Extensible hashing and RCU

2007-03-05 Thread Evgeniy Polyakov
On Sun, Mar 04, 2007 at 02:02:36AM -0800, Michael K. Edwards ([EMAIL 
PROTECTED]) wrote:
 On 3/3/07, Evgeniy Polyakov [EMAIL PROTECTED] wrote:
 Btw, you could try to implement something you have written above to show
 its merits, so that it would not be an empty words :)
 
 Before I implement, I design.  Before I design, I analyze.  Before I
 analyze, I prototype.  Before I prototype, I gather requirements.
 Before I gather requirements, I think -- and the only way I know how
 to think about technical matters is to write down my intuitions and
 compare them against the sea of published research on the topic.  I'm

You forgot 'before (and whilst) I do all of the above I write tons of words on
mailing lists about nothing, thus wasting my time and others''.

-- 
Evgeniy Polyakov


[PATCH 0/3] NetXen: Make driver use multiple PCI functions

2007-03-05 Thread Linsys Contractor Mithlesh Thukral
Hi All,

I will be sending updates to NetXen: 1G/10G Ethernet driver in subsequent mails.
The patches will be with respect to netdev#upstream.

My apologies for insufficient explanation of the patches in my last patchset.

Regards,
Mithlesh Thukral


[PATCH 2/3] NetXen: Removal of redundant function call parameters

2007-03-05 Thread Linsys Contractor Mithlesh Thukral
NetXen: Removal of redundant function call parameters and bug fixes.
This patch removes the redundant parameters which were being passed to
many functions, since adapter->portnum can now be used instead.

Signed-off-by: Mithlesh Thukral [EMAIL PROTECTED]

---
 drivers/net/netxen/netxen_nic.h |   33 +++
 drivers/net/netxen/netxen_nic_ethtool.c |   10 +-
 drivers/net/netxen/netxen_nic_hdr.h |1 
 drivers/net/netxen/netxen_nic_hw.c  |   19 +++-
 drivers/net/netxen/netxen_nic_hw.h  |4 
 drivers/net/netxen/netxen_nic_init.c|7 -
 drivers/net/netxen/netxen_nic_isr.c |9 +-
 drivers/net/netxen/netxen_nic_main.c|   30 --
 drivers/net/netxen/netxen_nic_niu.c |   96 +-
 9 files changed, 111 insertions(+), 98 deletions(-)

diff --git a/drivers/net/netxen/netxen_nic.h b/drivers/net/netxen/netxen_nic.h
index 64f22ef..38d7409 100644
--- a/drivers/net/netxen/netxen_nic.h
+++ b/drivers/net/netxen/netxen_nic.h
@@ -205,6 +205,7 @@ #define RCV_DESC_TYPE(ID) \
 
 #define MAX_CMD_DESCRIPTORS            1024
 #define MAX_RCV_DESCRIPTORS            16384
+#define MAX_RCV_DESCRIPTORS_1G         (MAX_RCV_DESCRIPTORS / 4)
 #define MAX_JUMBO_RCV_DESCRIPTORS      1024
 #define MAX_LRO_RCV_DESCRIPTORS        64
 #define MAX_RCVSTATUS_DESCRIPTORS      MAX_RCV_DESCRIPTORS
@@ -780,6 +781,7 @@ struct netxen_hardware_context {
struct pci_dev *cmd_desc_pdev;
dma_addr_t cmd_desc_phys_addr;
struct netxen_adapter *adapter;
+   int pci_func;
 };
 
 #define RCV_RING_LRO   RCV_DESC_LRO
@@ -916,15 +918,15 @@ struct netxen_adapter {
struct netxen_ring_ctx *ctx_desc;
struct pci_dev *ctx_desc_pdev;
dma_addr_t ctx_desc_phys_addr;
-   int (*enable_phy_interrupts) (struct netxen_adapter *, int);
-   int (*disable_phy_interrupts) (struct netxen_adapter *, int);
+   int (*enable_phy_interrupts) (struct netxen_adapter *);
+   int (*disable_phy_interrupts) (struct netxen_adapter *);
void (*handle_phy_intr) (struct netxen_adapter *);
int (*macaddr_set) (struct netxen_adapter *, netxen_ethernet_macaddr_t);
int (*set_mtu) (struct netxen_adapter *, int);
int (*set_promisc) (struct netxen_adapter *, netxen_niu_prom_mode_t);
int (*unset_promisc) (struct netxen_adapter *, netxen_niu_prom_mode_t);
-   int (*phy_read) (struct netxen_adapter *, long phy, long reg, u32 *);
-   int (*phy_write) (struct netxen_adapter *, long phy, long reg, u32 val);
+   int (*phy_read) (struct netxen_adapter *, long reg, u32 *);
+   int (*phy_write) (struct netxen_adapter *, long reg, u32 val);
int (*init_port) (struct netxen_adapter *, int);
void (*init_niu) (struct netxen_adapter *);
int (*stop_port) (struct netxen_adapter *);
@@ -970,27 +972,21 @@ static inline void __iomem *pci_base(str
return NULL;
 }
 
-int netxen_niu_xgbe_enable_phy_interrupts(struct netxen_adapter *adapter,
- int port);
-int netxen_niu_gbe_enable_phy_interrupts(struct netxen_adapter *adapter,
-int port);
-int netxen_niu_xgbe_disable_phy_interrupts(struct netxen_adapter *adapter,
-  int port);
-int netxen_niu_gbe_disable_phy_interrupts(struct netxen_adapter *adapter,
- int port);
-int netxen_niu_xgbe_clear_phy_interrupts(struct netxen_adapter *adapter,
-int port);
-int netxen_niu_gbe_clear_phy_interrupts(struct netxen_adapter *adapter,
-   int port);
+int netxen_niu_xgbe_enable_phy_interrupts(struct netxen_adapter *adapter);
+int netxen_niu_gbe_enable_phy_interrupts(struct netxen_adapter *adapter);
+int netxen_niu_xgbe_disable_phy_interrupts(struct netxen_adapter *adapter);
+int netxen_niu_gbe_disable_phy_interrupts(struct netxen_adapter *adapter);
+int netxen_niu_xgbe_clear_phy_interrupts(struct netxen_adapter *adapter);
+int netxen_niu_gbe_clear_phy_interrupts(struct netxen_adapter *adapter);
 void netxen_nic_xgbe_handle_phy_intr(struct netxen_adapter *adapter);
 void netxen_nic_gbe_handle_phy_intr(struct netxen_adapter *adapter);
 void netxen_niu_gbe_set_mii_mode(struct netxen_adapter *adapter, int port,
 long enable);
 void netxen_niu_gbe_set_gmii_mode(struct netxen_adapter *adapter, int port,
  long enable);
-int netxen_niu_gbe_phy_read(struct netxen_adapter *adapter, long phy, long reg,
+int netxen_niu_gbe_phy_read(struct netxen_adapter *adapter, long reg,
__u32 * readval);
-int netxen_niu_gbe_phy_write(struct netxen_adapter *adapter, long phy,
+int netxen_niu_gbe_phy_write(struct netxen_adapter *adapter,
 long reg, __u32 val);
 
 /* Functions available from netxen_nic_hw.c */
@@ -1010,6 +1006,7 @@ int 

[PATCH 3/3] NetXen: Fix ping failure of Jumbo frames on MEZ cards.

2007-03-05 Thread Linsys Contractor Mithlesh Thukral
NetXen: Fix ping failure of Jumbo frames on MEZ cards.

Signed-off-by: Mithlesh Thukral [EMAIL PROTECTED]

---

 drivers/net/netxen/netxen_nic_hw.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletion(-)
 
diff --git a/drivers/net/netxen/netxen_nic_hw.c 
b/drivers/net/netxen/netxen_nic_hw.c
index 693d01a..81ebc81 100644
--- a/drivers/net/netxen/netxen_nic_hw.c
+++ b/drivers/net/netxen/netxen_nic_hw.c
@@ -962,7 +962,12 @@ int netxen_nic_set_mtu_gb(struct netxen_
 int netxen_nic_set_mtu_xgb(struct netxen_adapter *adapter, int new_mtu)
 {
new_mtu += NETXEN_NIU_HDRSIZE + NETXEN_NIU_TLRSIZE;
-   netxen_nic_write_w0(adapter, NETXEN_NIU_XGE_MAX_FRAME_SIZE, new_mtu);
+   if (adapter->portnum == 0)
+       netxen_nic_write_w0(adapter, NETXEN_NIU_XGE_MAX_FRAME_SIZE, 
+                           new_mtu);
+   else if (adapter->portnum == 1)
+       netxen_nic_write_w0(adapter, NETXEN_NIU_XG1_MAX_FRAME_SIZE,
+                           new_mtu);
return 0;
 }
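
The fix amounts to choosing the max-frame-size register by port instead of always writing port 0's register. A user-space sketch of that selection is below; the register offsets, header and trailer sizes, and struct fake_adapter are all made up for illustration and do not come from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets and sizes below are invented for this sketch. */
#define XGE0_MAX_FRAME_SIZE 0x100u
#define XGE1_MAX_FRAME_SIZE 0x200u
#define HDRSIZE 16
#define TLRSIZE 4

struct fake_adapter {
    int portnum;
    uint32_t regs[0x300];   /* fake register file indexed by offset */
};

/* Write the frame limit to the register that belongs to this port. */
static int set_mtu_xgb(struct fake_adapter *ad, int new_mtu)
{
    new_mtu += HDRSIZE + TLRSIZE;
    if (ad->portnum == 0)
        ad->regs[XGE0_MAX_FRAME_SIZE] = (uint32_t)new_mtu;
    else if (ad->portnum == 1)
        ad->regs[XGE1_MAX_FRAME_SIZE] = (uint32_t)new_mtu;
    else
        return -1;  /* sketch only: the patch itself returns 0 here */
    return 0;
}

void mtu_demo(void)
{
    static struct fake_adapter ad;  /* zero-initialized */

    ad.portnum = 1;
    assert(set_mtu_xgb(&ad, 9000) == 0);
    assert(ad.regs[XGE1_MAX_FRAME_SIZE] == 9020);
    assert(ad.regs[XGE0_MAX_FRAME_SIZE] == 0);  /* port 0 left untouched */
}
```

With the old single-register write, port 1 would keep its default frame limit, which is consistent with jumbo pings failing on the second port.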
 


Re: [PATCH 3/3] NetXen: Make driver use multi PCI functions

2007-03-05 Thread Mithlesh Thukral
On Saturday 03 March 2007 06:35, Jeff Garzik wrote:
 Linsys Contractor Mithlesh Thukral wrote:
  NetXen: Make driver use multi PCI functions.
 
  Signed-off by: Mithlesh Thukral [EMAIL PROTECTED]
 
  ---
 
   netxen_nic.h          |  126 +---
   netxen_nic_ethtool.c  |   80 +++
   netxen_nic_hdr.h      |    8
   netxen_nic_hw.c       |  213 +++-
   netxen_nic_hw.h       |   18 -
   netxen_nic_init.c     |  115 +++---
   netxen_nic_isr.c      |   80 +++
   netxen_nic_main.c     |  523 +-
   netxen_nic_niu.c      |   27 +-
   netxen_nic_phan_reg.h |  125 ---
   10 files changed, 631 insertions(+), 684 deletions(-)

 all three patches in this patchset contained nothing but one-line
 summaries of the changes included in them, and are overall very poorly
 and vaguely described.

 This patch is far too big, with far too little description and
 justification to go along with it.

 If you are not going to make the effort to write a paragraph or two
 describing such huge changes, then I'm not going to make the effort to
 review and apply it.  NAK.
My apologies for the insufficient explanation of the patch. I resent this 
patch a short while ago. 

Regards,
Mithlesh Thukral


Re: [RFT] sky2 auto negotiation PHY errata

2007-03-05 Thread Rob Sims
On Tue, Feb 20, 2007 at 11:00:53AM -0800, Stephen Hemminger wrote:
 You need the flow control fix and the tx_timeout fix posted for 2.6.20 
 (stable)
 and current git tree. 

sky2 1.13 has been far better than 1.10; there have been no system hangs
or permanent sky2 failures.  However, the following two incidents were
in syslog:

Feb 27 07:08:21 btd kernel: Linux version 2.6.20.sky2.1.13-btd3 ([EMAIL 
PROTECTED]) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP 
PREEMPT Tue Feb 27 00:07:34 MST 2007
Feb 27 07:08:21 btd kernel: sky2 :04:00.0: v1.13 addr 0xfa9fc000 irq 17 
Yukon-EC (0xb6) rev 2
Feb 27 07:08:21 btd kernel: sky2 eth0: addr 00:1a:92:23:52:4d
Feb 27 07:08:21 btd kernel: sky2 :03:00.0: v1.13 addr 0xfa8fc000 irq 16 
Yukon-EC (0xb6) rev 2
Feb 27 07:08:21 btd kernel: sky2 eth1: addr 00:1a:92:23:4b:a6
Feb 27 07:08:21 btd kernel: sky2 eth0: enabling interface
Feb 27 07:08:21 btd kernel: sky2 eth0: ram buffer 48K
Feb 27 07:08:21 btd kernel: sky2 eth0: Link is up at 1000 Mbps, full duplex, 
flow control both
Feb 27 19:48:34 btd kernel: sky2 :04:00.0: v1.13 addr 0xfa9fc000 irq 17 
Yukon-EC (0xb6) rev 2
Feb 27 19:48:34 btd kernel: sky2 eth0: addr 00:1a:92:23:52:4d
Feb 27 19:48:34 btd kernel: sky2 :03:00.0: v1.13 addr 0xfa8fc000 irq 16 
Yukon-EC (0xb6) rev 2
Feb 27 19:48:34 btd kernel: sky2 eth1: addr 00:1a:92:23:4b:a6
Feb 27 19:48:34 btd kernel: sky2 eth0: enabling interface
Feb 27 19:48:34 btd kernel: sky2 eth0: ram buffer 48K
Feb 27 19:48:34 btd kernel: sky2 eth0: Link is up at 1000 Mbps, full duplex, 
flow control both

Feb 28 19:06:57 btd kernel: NETDEV WATCHDOG: eth0: transmit timed out
Feb 28 19:06:57 btd kernel: sky2 eth0: tx timeout
Feb 28 19:06:57 btd kernel: sky2 eth0: transmit ring 133 ..  110 report=133 
done=133
Feb 28 19:06:57 btd kernel: sky2 eth0: disabling interface
Feb 28 19:06:57 btd kernel: sky2 eth0: enabling interface
Feb 28 19:06:57 btd kernel: sky2 eth0: ram buffer 48K
Feb 28 19:07:00 btd kernel: sky2 eth0: Link is up at 1000 Mbps, full duplex, 
flow control both

Mar  4 13:58:31 btd kernel: NETDEV WATCHDOG: eth0: transmit timed out
Mar  4 13:58:31 btd kernel: sky2 eth0: tx timeout
Mar  4 13:58:31 btd kernel: sky2 eth0: transmit ring 353 .. 330 report=353 
done=353
Mar  4 13:58:31 btd kernel: sky2 eth0: disabling interface
Mar  4 13:58:31 btd kernel: sky2 eth0: enabling interface
Mar  4 13:58:31 btd kernel: sky2 eth0: ram buffer 48K
Mar  4 13:58:34 btd kernel: sky2 eth0: Link is up at 1000 Mbps, full duplex, 
flow control both

I only noticed the second of the two.
-- 
Rob


signature.asc
Description: Digital signature


[PATCH] twcal_jiffie should be unsigned long, not int

2007-03-05 Thread Eric Dumazet
Hi David

While browsing include/net/inet_timewait_sock.h, I found this buggy definition 
of twcal_jiffie.

int twcal_jiffie;

I wonder how inet_twdr_twcal_tick() can really work on x86_64.

This seems to be quite an old bug; it was there before Arnaldo Carvalho de 
Melo introduced inet_timewait_death_row.
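
To make the concern concrete, here is a user-space sketch of what can go wrong when a 64-bit tick count is stored in a 32-bit signed field. It is an illustration, not the kernel code path: after64() only mimics the idea of a wrap-safe "is after" comparison, all names are hypothetical, and fixed-width types are used so the result does not depend on the host's long size.

```c
#include <assert.h>
#include <stdint.h>

/* Wrap-safe "a is after b" for 64-bit tick counters. */
static int after64(uint64_t a, uint64_t b)
{
    return (int64_t)(b - a) < 0;
}

/* Deadline kept at full width: the comparison behaves as intended. */
static int expired_wide(uint64_t now, uint64_t deadline)
{
    return after64(now, deadline);
}

/* Deadline squeezed through a 32-bit signed field, as with "int". */
static int expired_narrow(uint64_t now, uint64_t deadline)
{
    /* keeps the low 32 bits; values above INT32_MAX convert in an
     * implementation-defined way, typically wrapping negative */
    int32_t stored = (int32_t)(uint32_t)deadline;

    return after64(now, (uint64_t)(int64_t)stored);
}

void twcal_demo(void)
{
    uint64_t now = (UINT64_C(1) << 32) + 1000;   /* counter beyond 32 bits */
    uint64_t deadline = now + 100;               /* 100 ticks in the future */

    assert(!expired_wide(now, deadline));        /* correctly still pending */
    assert(expired_narrow(now, deadline));       /* wrongly looks expired */
}
```

Once the counter no longer fits in 32 bits, the narrow field compares as far in the past, so every pending deadline appears to have already fired.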

[PATCH] twcal_jiffie should be unsigned long, not int

Signed-off-by: Eric Dumazet [EMAIL PROTECTED]

diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h
index f7be1ac..09a2532 100644
--- a/include/net/inet_timewait_sock.h
+++ b/include/net/inet_timewait_sock.h
@@ -66,7 +66,7 @@ #define INET_TWDR_TWKILL_QUOTA 100
 struct inet_timewait_death_row {
/* Short-time timewait calendar */
int twcal_hand;
-   int twcal_jiffie;
+   unsigned long   twcal_jiffie;
struct timer_list   twcal_timer;
struct hlist_head   twcal_row[INET_TWDR_RECYCLE_SLOTS];
 


[PATCH/RFC 00/13] Transparent proxying patches, take two

2007-03-05 Thread KOVACS Krisztian
  Hi,

These patches are my second try at providing Linux 2.2-like transparent
proxying support for Linux 2.6.

Major changes since the first version:

- iptable_tproxy now does IPv4 fragment reassembly (necessary for
  processing TCP/UDP header)

- The removal of the source address check in ip_route_output() was
  incorrect.  Instead, I've implemented a separate setsockopt-settable
  per-socket flag (setting it requires CAP_NET_ADMIN) to selectively
  loosen that check in ip_route_output().

Besides these, I've tried to fix all the problems raised on netdev@ in
January.

Unfortunately the newly introduced IP_TRANSPARENT socket option leads to
a quite intrusive set of patches touching core IPv4 routing and TCP
code, however this was necessary as DaveM rejected our idea of using
IP_FREEBIND instead (and he's right, of course, as it would have caused
ABI breakage.) The current approach works by adding a new bit to the
flag field in struct flowi.

Furthermore, I haven't removed the IPv4 routing local diversion code
(caching socket lookups in the skb) yet. Patrick recommended throwing it
out altogether and using mark-based policy routing instead, but I still
think that would harm usability, as the user would need to
harmonize the configuration of two completely independent
subsystems in order to have them interoperate.

-- 
 Regards,
  Krisztian Kovacs


[PATCH/RFC 01/13] Implement local diversion of IPv4 skbs

2007-03-05 Thread KOVACS Krisztian
The input path for non-local bound sockets requires diverting certain
packets locally, even if their destination IP address is not
considered local. We achieve this by assigning a specially crafted dst
entry to these skbs, and optionally also attaching a socket to the skb
so that the upper layer code does not need to redo the socket lookup.

We also have to be able to differentiate between these fake entries
and real entries in the cache: it is perfectly legal that the
diversion is done only for certain TCP or UDP packets and not for all
packets of the flow. Since these special dst entries are used only by
the iptables tproxy code, and that code uses exclusively these
entries, simply flagging these entries as DST_DIVERTED is OK. All
other cache lookup paths skip diverted entries, while our new
ip_divert_local() function uses exclusively diverted dst entries.

Signed-off-by: KOVACS Krisztian [EMAIL PROTECTED]

---

 include/net/dst.h   |1 
 include/net/route.h |2 +
 net/ipv4/route.c|  113 +++
 3 files changed, 115 insertions(+), 1 deletions(-)

diff --git a/include/net/dst.h b/include/net/dst.h
index e12a8ce..4cd0745 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -48,6 +48,7 @@ struct dst_entry
 #define DST_NOPOLICY   4
 #define DST_NOHASH 8
 #define DST_BALANCED   0x10
+#define DST_DIVERTED   0x20
unsigned long   expires;
 
unsigned short  header_len; /* more space at head required 
*/
diff --git a/include/net/route.h b/include/net/route.h
index 749e4df..efaa6b2 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -125,6 +125,8 @@ extern int  ip_rt_ioctl(unsigned int cmd, void 
__user *arg);
 extern void ip_rt_get_source(u8 *src, struct rtable *rt);
 extern int ip_rt_dump(struct sk_buff *skb,  struct 
netlink_callback *cb);
 
+extern int ip_divert_local(struct sk_buff *skb, const struct 
in_device *in, struct sock *sk);
+
 struct in_ifaddr;
 extern void fib_add_ifaddr(struct in_ifaddr *);
 
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 37e0d4d..c526fb2 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -100,6 +100,7 @@
 #include <net/ip_fib.h>
 #include <net/arp.h>
 #include <net/tcp.h>
+#include <linux/dccp.h>
 #include <net/icmp.h>
 #include <net/xfrm.h>
 #include <net/ip_mp_alg.h>
@@ -941,9 +942,11 @@ restart:
    while ((rth = *rthp) != NULL) {
 #ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED
        if (!(rth->u.dst.flags & DST_BALANCED) &&
+           !((rt->u.dst.flags ^ rth->u.dst.flags) & DST_DIVERTED) &&
            compare_keys(&rth->fl, &rt->fl)) {
 #else
-       if (compare_keys(&rth->fl, &rt->fl)) {
+       if (!((rt->u.dst.flags ^ rth->u.dst.flags) & DST_DIVERTED) &&
+           compare_keys(&rth->fl, &rt->fl)) {
 #endif
            /* Put it first */
            *rthp = rth->u.dst.rt_next;
@@ -1165,6 +1168,7 @@ void ip_rt_redirect(__be32 old_gw, __be32 daddr, __be32 
new_gw,
                if (rth->fl.fl4_dst != daddr ||
                    rth->fl.fl4_src != skeys[i] ||
                    rth->fl.oif != ikeys[k] ||
+                   (rth->u.dst.flags & DST_DIVERTED) ||
                    rth->fl.iif != 0) {
                    rthp = &rth->u.dst.rt_next;
                    continue;
@@ -1525,6 +1529,111 @@ static int ip_rt_bug(struct sk_buff *skb)
return 0;
 }
 
+static void ip_divert_free_sock(struct sk_buff *skb)
+{
+   struct sock *sk = skb->sk;
+
+   skb->sk = NULL;
+   skb->destructor = NULL;
+
+   if (sk) {
+       /* TIME_WAIT inet sockets have to be handled differently */
+       if (((sk->sk_protocol == IPPROTO_TCP) && (sk->sk_state == TCP_TIME_WAIT)) ||
+           ((sk->sk_protocol == IPPROTO_DCCP) && (sk->sk_state == DCCP_TIME_WAIT)))
+           inet_twsk_put(inet_twsk(sk));
+       else
+           sock_put(sk);
+   }
+}
+
+int ip_divert_local(struct sk_buff *skb, const struct in_device *in, struct sock *sk)
+{
+   struct iphdr *iph = skb->nh.iph;
+   struct rtable *rth, *rtres;
+   unsigned hash;
+   const int iif = in->dev->ifindex;
+   u_int8_t tos;
+   int err;
+
+   /* look up hash first */
+   tos = iph->tos & IPTOS_RT_MASK;
+   hash = rt_hash_code(iph->daddr, iph->saddr ^ (iif << 5));
+
+   rcu_read_lock();
+   for (rth = rcu_dereference(rt_hash_table[hash].chain); rth;
+        rth = rcu_dereference(rth->u.dst.rt_next)) {
+       if (rth->fl.fl4_dst == iph->daddr &&
+           rth->fl.fl4_src == iph->saddr &&
+           rth->fl.iif == iif &&
+           rth->fl.oif == 0 &&
+           (rth->u.dst.flags & DST_DIVERTED)) {
+           rth->u.dst.lastuse = jiffies;
+
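
The cache-matching rule from the description (ordinary lookups skip diverted entries, diversion lookups use only diverted ones) rests on an XOR test over the flag words. A small stand-alone sketch of that test follows; same_divert_state() is a hypothetical helper name, while the two flag values are the ones shown in the dst.h hunk.

```c
#include <assert.h>

#define DST_BALANCED 0x10
#define DST_DIVERTED 0x20

/* True when both flag words agree on the DST_DIVERTED bit:
 * XOR leaves a 1 only where the two inputs differ. */
static int same_divert_state(unsigned int a, unsigned int b)
{
    return !((a ^ b) & DST_DIVERTED);
}

void divert_flag_demo(void)
{
    /* both normal, or both diverted: entries may match */
    assert(same_divert_state(0, DST_BALANCED));
    assert(same_divert_state(DST_DIVERTED, DST_DIVERTED | DST_BALANCED));
    /* one diverted, one not: never treated as the same entry */
    assert(!same_divert_state(DST_DIVERTED, 0));
    assert(!same_divert_state(DST_BALANCED, DST_DIVERTED));
}
```

Other bits, such as DST_BALANCED, do not influence the result, so the test can be combined with the existing key comparison.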

[PATCH/RFC 02/13] Port redirection support for TCP

2007-03-05 Thread KOVACS Krisztian
Current TCP code relies on the local port of the listening socket
being the same as the destination port of the incoming
connection. Port redirection used by many transparent proxying
techniques obviously breaks this, so we have to store the original
destination port.

This patch extends struct inet_request_sock and stores the incoming
destination port value there. It also modifies the handshake code to
use that value as the source port when sending reply packets.

Signed-off-by: KOVACS Krisztian [EMAIL PROTECTED]

---

 include/net/inet_sock.h |1 +
 include/net/tcp.h   |1 +
 net/ipv4/inet_connection_sock.c |2 ++
 net/ipv4/syncookies.c   |1 +
 net/ipv4/tcp_output.c   |2 +-
 5 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index ce6da97..0bd167b 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -64,6 +64,7 @@ struct inet_request_sock {
 #endif
__be32  loc_addr;
__be32  rmt_addr;
+   __be16  loc_port;
__be16  rmt_port;
u16 snd_wscale : 4, 
rcv_wscale : 4, 
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 5c472f2..e1cb3d0 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -982,6 +982,7 @@ static inline void tcp_openreq_init(struct request_sock 
*req,
    ireq->acked = 0;
    ireq->ecn_ok = 0;
    ireq->rmt_port = skb->h.th->source;
+   ireq->loc_port = skb->h.th->dest;
 }
 
 extern void tcp_enter_memory_pressure(void);
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 43fb160..83ad972 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -502,6 +502,8 @@ struct sock *inet_csk_clone(struct sock *sk, const struct 
request_sock *req,
    newicsk->icsk_bind_hash = NULL;
 
    inet_sk(newsk)->dport = inet_rsk(req)->rmt_port;
+   inet_sk(newsk)->num = ntohs(inet_rsk(req)->loc_port);
+   inet_sk(newsk)->sport = inet_rsk(req)->loc_port;
    newsk->sk_write_space = sk_stream_write_space;
 
    newicsk->icsk_retransmits = 0;
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 33016cc..431c81d 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -223,6 +223,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct 
sk_buff *skb,
    treq->rcv_isn   = ntohl(skb->h.th->seq) - 1;
    treq->snt_isn   = cookie;
    req->mss        = mss;
+   ireq->loc_port  = skb->h.th->dest;
    ireq->rmt_port  = skb->h.th->source;
    ireq->loc_addr  = skb->nh.iph->daddr;
    ireq->rmt_addr  = skb->nh.iph->saddr;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index dc15113..a3ea7a1 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2135,7 +2135,7 @@ struct sk_buff * tcp_make_synack(struct sock *sk, struct dst_entry *dst,
 	th->syn = 1;
 	th->ack = 1;
 	TCP_ECN_make_synack(req, th);
-	th->source = inet_sk(sk)->sport;
+	th->source = ireq->loc_port;
 	th->dest = ireq->rmt_port;
 	TCP_SKB_CB(skb)->seq = tcp_rsk(req)->snt_isn;
 	TCP_SKB_CB(skb)->end_seq = TCP_SKB_CB(skb)->seq + 1;



[PATCH/RFC 03/13] Don't do the TCP socket lookup if we already have one attached

2007-03-05 Thread KOVACS Krisztian
The TCP input code path looks up the TCP socket hash tables to find a
socket matching the incoming packet. However, because iptable_tproxy
does socket lookups early, the skb may already have the appropriate
socket reference attached. In that case we steal that reference
instead of doing the lookup.
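
As a rough illustration of the "steal the reference" idea, here is a minimal userspace sketch. The `mock_sock` and `mock_skb` types and the `lookup_slow()` and `get_sock()` helpers are hypothetical stand-ins invented for this note, not kernel structures or functions. The point it tries to show is that on the fast path the reference simply moves from the skb to the caller, so no reference count changes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock: only a reference count. */
struct mock_sock {
    int refcnt;
};

/* Hypothetical stand-in for struct sk_buff. */
struct mock_skb {
    struct mock_sock *sk;                     /* socket attached early, or NULL */
    void (*destructor)(struct mock_skb *skb); /* would drop the reference */
};

/* Hypothetical slow path: stands in for the hash lookup and takes a reference. */
static struct mock_sock *lookup_slow(struct mock_sock *table_entry)
{
    if (table_entry != NULL)
        table_entry->refcnt++;
    return table_entry;
}

/* Return the socket for this packet; the caller owns one reference to it. */
static struct mock_sock *get_sock(struct mock_skb *skb,
                                  struct mock_sock *table_entry)
{
    struct mock_sock *sk;

    assert(skb != NULL);
    if (skb->sk != NULL) {
        /* Steal: ownership moves from the skb to the caller. */
        sk = skb->sk;
        skb->destructor = NULL;
        skb->sk = NULL;
        return sk;
    }
    return lookup_slow(table_entry);
}
```

Clearing the destructor together with the pointer matters in this sketch: otherwise freeing the skb later would drop a reference the skb no longer owns.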

Signed-off-by: KOVACS Krisztian [EMAIL PROTECTED]

---

 net/ipv4/tcp_ipv4.c |   13 ++---
 1 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 0ba74bb..536db7b 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1647,9 +1647,16 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	TCP_SKB_CB(skb)->flags   = skb->nh.iph->tos;
 	TCP_SKB_CB(skb)->sacked  = 0;
 
-	sk = __inet_lookup(&tcp_hashinfo, skb->nh.iph->saddr, th->source,
-			   skb->nh.iph->daddr, th->dest,
-			   inet_iif(skb));
+	if (unlikely(skb->sk)) {
+		/* steal reference */
+		sk = skb->sk;
+		skb->destructor = NULL;
+		skb->sk = NULL;
+	} else {
+		sk = __inet_lookup(&tcp_hashinfo, skb->nh.iph->saddr, th->source,
+				   skb->nh.iph->daddr, th->dest,
+				   inet_iif(skb));
+	}
 
 	if (!sk)
 		goto no_tcp_socket;



[PATCH/RFC 04/13] Don't do the UDP socket lookup if we already have one attached

2007-03-05 Thread KOVACS Krisztian
The UDP input code path looks up the UDP socket hash tables to find a
socket matching the incoming packet. However, because iptable_tproxy
does socket lookups early, the skb may already have the appropriate
socket reference attached. In that case we steal that reference
instead of doing the lookup.

Signed-off-by: KOVACS Krisztian [EMAIL PROTECTED]

---

 net/ipv4/udp.c |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index ce6c460..1d15edc 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1226,8 +1226,15 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct hlist_head udptable[],
 	if(rt->rt_flags & (RTCF_BROADCAST|RTCF_MULTICAST))
 		return __udp4_lib_mcast_deliver(skb, uh, saddr, daddr, udptable);
 
-	sk = __udp4_lib_lookup(saddr, uh->source, daddr, uh->dest,
-			       skb->dev->ifindex, udptable);
+	if (skb->sk) {
+		/* steal reference */
+		sk = skb->sk;
+		skb->destructor = NULL;
+		skb->sk = NULL;
+	} else {
+		sk = __udp4_lib_lookup(saddr, uh->source, daddr, uh->dest,
+				       skb->dev->ifindex, udptable);
+	}
 
 	if (sk != NULL) {
 		int ret = udp_queue_rcv_skb(sk, skb);



[PATCH/RFC 05/13] Loosen source address check on IPv4 output

2007-03-05 Thread KOVACS Krisztian
ip_route_output() contains a check to make sure that no flows with
non-local source IP addresses are routed. This obviously makes using
such addresses impossible.

This patch introduces a flowi flag which makes omitting this check
possible. The new flag provides a way of handling transparent and
non-transparent connections differently.
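
To make the two effects of the flag concrete, here is a small userspace sketch. The `mock_flow` type and both helper functions are hypothetical simplifications made up for this note; only the flag value 0x02 is taken from the patch. It tries to show that the flag relaxes the local-source check, and that it has to take part in the route cache key so that a transparent and a non-transparent flow never share a cached route.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLOWI_FLAG_TRANSPARENT 0x02  /* value taken from the patch */

/* Hypothetical, trimmed-down flow key: source address and flags only. */
struct mock_flow {
    uint32_t src;
    uint8_t  flags;
};

/* The source check passes if the address is local or the flow is transparent. */
static bool src_check_passes(const struct mock_flow *fl, bool src_is_local)
{
    assert(fl != NULL);
    return src_is_local || (fl->flags & FLOWI_FLAG_TRANSPARENT) != 0;
}

/* Two flows may share a cached route only if their transparent bits agree. */
static bool same_cache_key(const struct mock_flow *a, const struct mock_flow *b)
{
    return a->src == b->src &&
           !((a->flags ^ b->flags) & FLOWI_FLAG_TRANSPARENT);
}
```

Without the flag in the key, a route cached for a transparent flow could be returned for an ordinary one, which would bypass the check for a socket that never asked for it.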

Signed-off-by: KOVACS Krisztian [EMAIL PROTECTED]

---

 include/net/flow.h |1 +
 net/ipv4/route.c   |8 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/net/flow.h b/include/net/flow.h
index ce4b10d..9eb91f2 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -49,6 +49,7 @@ struct flowi {
__u8proto;
__u8flags;
 #define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01
+#define FLOWI_FLAG_TRANSPARENT 0x02
union {
struct {
__be16  sport;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index c526fb2..8091a96 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -572,7 +572,8 @@ static inline int compare_keys(struct flowi *fl1, struct flowi *fl2)
 		(*(u16 *)&fl1->nl_u.ip4_u.tos ^
 		 *(u16 *)&fl2->nl_u.ip4_u.tos) |
 		(fl1->oif ^ fl2->oif) |
-		(fl1->iif ^ fl2->iif)) == 0;
+		(fl1->iif ^ fl2->iif) |
+		((fl1->flags ^ fl2->flags) & FLOWI_FLAG_TRANSPARENT)) == 0;
 }
 
 #ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED
@@ -2338,6 +2339,7 @@ static inline int __mkroute_output(struct rtable **result,
 	rth->fl.fl4_src = oldflp->fl4_src;
 	rth->fl.oif     = oldflp->oif;
 	rth->fl.mark    = oldflp->mark;
+	rth->fl.flags   = oldflp->flags;
 	rth->rt_dst     = fl->fl4_dst;
 	rth->rt_src     = fl->fl4_src;
 	rth->rt_iif     = oldflp->oif ? : dev_out->ifindex;
@@ -2482,6 +2484,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp)
 						  RT_SCOPE_LINK :
 						  RT_SCOPE_UNIVERSE),
 				      } },
+			    .flags = oldflp->flags,
 			    .mark = oldflp->mark,
 			    .iif = loopback_dev.ifindex,
 			    .oif = oldflp->oif };
@@ -2506,7 +2509,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp)
 
 		/* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
 		dev_out = ip_dev_find(oldflp->fl4_src);
-		if (dev_out == NULL)
+		if (dev_out == NULL && !(oldflp->flags & FLOWI_FLAG_TRANSPARENT))
 			goto out;
 
 		/* I removed check for oif == dev_out->oif here.
@@ -2678,6 +2681,7 @@ int __ip_route_output_key(struct rtable **rp, const struct flowi *flp)
 		    rth->fl.iif == 0 &&
 		    rth->fl.oif == flp->oif &&
 		    rth->fl.mark == flp->mark &&
+		    !((rth->fl.flags ^ flp->flags) & FLOWI_FLAG_TRANSPARENT) &&
 		    !((rth->fl.fl4_tos ^ flp->fl4_tos) &
 			    (IPTOS_RT_MASK | RTO_ONLINK))) {
 



[PATCH/RFC 06/13] Implement IP_TRANSPARENT socket option

2007-03-05 Thread KOVACS Krisztian
This patch introduces the IP_TRANSPARENT socket option: enabling it makes
IPv4 routing omit the non-local source address check on output. Setting
IP_TRANSPARENT requires the CAP_NET_ADMIN capability.
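
From userspace, enabling the option would presumably look like any other boolean IP-level socket option. The sketch below is an assumption about usage, not part of the patch: `set_transparent()` is a hypothetical helper, and the fallback definition of `IP_TRANSPARENT` uses the value 19 proposed here in case the installed headers do not define it.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IP_TRANSPARENT
#define IP_TRANSPARENT 19  /* value proposed in this patch */
#endif

/*
 * Hypothetical helper: try to mark fd as transparent.
 * Returns 0 on success, or the errno value reported by setsockopt().
 */
static int set_transparent(int fd)
{
    int one = 1;

    if (setsockopt(fd, IPPROTO_IP, IP_TRANSPARENT, &one, sizeof(one)) < 0)
        return errno;
    return 0;
}
```

With the check added by this patch, a caller without CAP_NET_ADMIN should get EPERM back, so a proxy would typically call this right after socket() and before dropping privileges.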

Signed-off-by: KOVACS Krisztian [EMAIL PROTECTED]

---

 include/linux/in.h   |1 +
 include/net/inet_sock.h  |3 ++-
 include/net/inet_timewait_sock.h |3 ++-
 include/net/route.h  |1 +
 net/ipv4/inet_timewait_sock.c|1 +
 net/ipv4/ip_sockglue.c   |   12 +++-
 6 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/include/linux/in.h b/include/linux/in.h
index 1912e7c..66be615 100644
--- a/include/linux/in.h
+++ b/include/linux/in.h
@@ -75,6 +75,7 @@ struct in_addr {
 #define IP_IPSEC_POLICY16
 #define IP_XFRM_POLICY 17
 #define IP_PASSSEC 18
+#define IP_TRANSPARENT 19
 
 /* BSD compatibility */
 #define IP_RECVRETOPTS IP_RETOPTS
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 0bd167b..14b597d 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -128,7 +128,8 @@ struct inet_sock {
is_icsk:1,
freebind:1,
hdrincl:1,
-   mc_loop:1;
+   mc_loop:1,
+   transparent:1;
int mc_index;
__be32  mc_addr;
struct ip_mc_socklist   *mc_list;
diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h
index f7be1ac..e30dd61 100644
--- a/include/net/inet_timewait_sock.h
+++ b/include/net/inet_timewait_sock.h
@@ -126,7 +126,8 @@ struct inet_timewait_sock {
__be16  tw_dport;
__u16   tw_num;
/* And these are ours. */
-   __u8tw_ipv6only:1;
+   __u8tw_ipv6only:1,
+   tw_transparent:1;
/* 15 bits hole, try to pack */
__u16   tw_ipv6_offset;
int tw_timeout;
diff --git a/include/net/route.h b/include/net/route.h
index efaa6b2..13da592 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -27,6 +27,7 @@
 #include <net/dst.h>
 #include <net/inetpeer.h>
 #include <net/flow.h>
+#include <net/inet_sock.h>
 #include <linux/in_route.h>
 #include <linux/rtnetlink.h>
 #include <linux/route.h>
diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index a73cf93..f57f81a 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -108,6 +108,7 @@ struct inet_timewait_sock *inet_twsk_alloc(const struct sock *sk, const int stat
 		tw->tw_reuse        = sk->sk_reuse;
 		tw->tw_hash         = sk->sk_hash;
 		tw->tw_ipv6only     = 0;
+		tw->tw_transparent  = inet->transparent;
 		tw->tw_prot         = sk->sk_prot_creator;
 		atomic_set(&tw->tw_refcnt, 1);
 		inet_twsk_dead_node_init(tw);
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 23048d9..02e8d9f 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -414,7 +414,7 @@ static int do_ip_setsockopt(struct sock *sk, int level,
 			    (1<<IP_TTL) | (1<<IP_HDRINCL) |
 			    (1<<IP_MTU_DISCOVER) | (1<<IP_RECVERR) |
 			    (1<<IP_ROUTER_ALERT) | (1<<IP_FREEBIND) |
-			    (1<<IP_PASSSEC))) ||
+			    (1<<IP_PASSSEC) | (1<<IP_TRANSPARENT))) ||
 	    optname == IP_MULTICAST_TTL ||
 	    optname == IP_MULTICAST_LOOP) {
 		if (optlen >= sizeof(int)) {
@@ -875,6 +875,16 @@ mc_msf_out:
 			err = xfrm_user_policy(sk, optname, optval, optlen);
 			break;
 
+		case IP_TRANSPARENT:
+			if (!capable(CAP_NET_ADMIN)) {
+				err = -EPERM;
+				break;
+			}
+			if (optlen < 1)
+				goto e_inval;
+			inet->transparent = !!val;
+			break;
+
 		default:
 			err = -ENOPROTOOPT;
 			break;



[PATCH/RFC 07/13] Conditionally enable transparent flow flag when connecting

2007-03-05 Thread KOVACS Krisztian
Set FLOWI_FLAG_TRANSPARENT in flowi->flags if the socket has the
transparent socket option set. This way we selectively enable certain
connections with non-local source addresses to be routed.

Signed-off-by: KOVACS Krisztian [EMAIL PROTECTED]

---

 include/net/route.h |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/include/net/route.h b/include/net/route.h
index 13da592..4dff368 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -161,6 +161,10 @@ static inline int ip_route_connect(struct rtable **rp, __be32 dst,
 					 .dport = dport } } };
 
 	int err;
+
+	if (inet_sk(sk)->transparent)
+		fl.flags |= FLOWI_FLAG_TRANSPARENT;
+
 	if (!dst || !src) {
 		err = __ip_route_output_key(rp, &fl);
 		if (err)



[PATCH/RFC 08/13] Handle TCP SYN+ACK/ACK/RST transparency

2007-03-05 Thread KOVACS Krisztian
The TCP stack sends out SYN+ACK/ACK/RST reply packets in response to
incoming packets. The non-local source address check on output bites
us again, as replies for transparently redirected traffic won't have a
chance to leave the node.

This patch selectively sets the FLOWI_FLAG_TRANSPARENT flag when doing
the route lookup for those replies. Transparent replies are enabled if
the listening socket has the transparent socket flag set.

Signed-off-by: KOVACS Krisztian [EMAIL PROTECTED]

---

 include/net/ip.h|3 +++
 include/net/request_sock.h  |3 ++-
 net/ipv4/inet_connection_sock.c |2 ++
 net/ipv4/ip_output.c|6 +-
 net/ipv4/syncookies.c   |2 ++
 net/ipv4/tcp_ipv4.c |   16 ++--
 net/ipv4/tcp_minisocks.c|3 ++-
 7 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index e79c3e3..8b71991 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -133,8 +133,11 @@ static inline void ip_tr_mc_map(__be32 addr, char *buf)
buf[5]=0x00;
 }
 
+#define IP_REPLY_ARG_NOSRCCHECK 1
+
 struct ip_reply_arg {
struct kvec iov[1];   
+   int flags;
__wsum  csum;
int csumoffset; /* u16 offset of csum in iov[0].iov_base */
/* -1 if not needed */ 
diff --git a/include/net/request_sock.h b/include/net/request_sock.h
index 7aed02c..b9c8974 100644
--- a/include/net/request_sock.h
+++ b/include/net/request_sock.h
@@ -34,7 +34,8 @@ struct request_sock_ops {
   struct request_sock *req,
   struct dst_entry *dst);
void(*send_ack)(struct sk_buff *skb,
-   struct request_sock *req);
+   struct request_sock *req,
+   int reply_flags);
void(*send_reset)(struct sock *sk,
  struct sk_buff *skb);
void(*destructor)(struct request_sock *req);
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 83ad972..90459a1 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -323,6 +323,8 @@ struct dst_entry* inet_csk_route_req(struct sock *sk,
 					.saddr = ireq->loc_addr,
 					.tos = RT_CONN_FLAGS(sk) } },
 			    .proto = sk->sk_protocol,
+			    .flags = inet_sk(sk)->transparent ?
+					FLOWI_FLAG_TRANSPARENT : 0,
 			    .uli_u = { .ports =
 				       { .sport = inet_sk(sk)->sport,
 					 .dport = ireq->rmt_port } } };
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index d096332..7af25d4 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -312,6 +312,8 @@ int ip_queue_xmit(struct sk_buff *skb, int ipfragok)
 							.saddr = inet->saddr,
 							.tos = RT_CONN_FLAGS(sk) } },
 					    .proto = sk->sk_protocol,
+					    .flags = inet->transparent ?
+						     FLOWI_FLAG_TRANSPARENT : 0,
 					    .uli_u = { .ports =
 						       { .sport = inet->sport,
 							 .dport = inet->dport } } };
@@ -1357,7 +1359,9 @@ void ip_send_reply(struct sock *sk, struct sk_buff *skb, struct ip_reply_arg *ar
 				    .uli_u = { .ports =
 					       { .sport = skb->h.th->dest,
 						 .dport = skb->h.th->source } },
-				    .proto = sk->sk_protocol };
+				    .proto = sk->sk_protocol,
+				    .flags = (arg->flags & IP_REPLY_ARG_NOSRCCHECK) ?
+						FLOWI_FLAG_TRANSPARENT : 0 };
 		security_skb_classify_flow(skb, &fl);
 		if (ip_route_output_key(&rt, &fl))
 			return;
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 431c81d..08d8920 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -261,6 +261,8 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
 					.saddr = ireq->loc_addr,
 					.tos = RT_CONN_FLAGS(sk) } },
 			    .proto = IPPROTO_TCP,
+			    .flags = inet_sk(sk)->transparent ?
+

[PATCH/RFC 09/13] Create a tproxy flag in struct sk_buff

2007-03-05 Thread KOVACS Krisztian
We would like to be able to match on whether or not a given packet has
been diverted by tproxy. To make this possible we need a flag in
sk_buff.

Signed-off-by: KOVACS Krisztian [EMAIL PROTECTED]

---

 include/linux/skbuff.h |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4ff3940..6d7f5c7 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -284,7 +284,8 @@ struct sk_buff {
nfctinfo:3;
__u8pkt_type:3,
fclone:2,
-   ipvs_property:1;
+   ipvs_property:1,
+   ip_tproxy:1;
__be16  protocol;
 
void(*destructor)(struct sk_buff *skb);



[PATCH/RFC 10/13] Export UDP socket lookup function

2007-03-05 Thread KOVACS Krisztian
The iptables tproxy code has to be able to do UDP socket hash lookups,
so we have to provide an exported lookup function for this purpose.

Signed-off-by: KOVACS Krisztian [EMAIL PROTECTED]

---

 include/net/udp.h |4 
 net/ipv4/udp.c|8 
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/include/net/udp.h b/include/net/udp.h
index 1b921fa..ea5aa31 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -141,6 +141,10 @@ extern int udp_lib_setsockopt(struct sock *sk, int level, int optname,
   char __user *optval, int optlen,
   int (*push_pending_frames)(struct sock *));
 
+extern struct sock *udp4_lib_lookup(__be32 saddr, __be16 sport,
+   __be32 daddr, __be16 dport,
+   int dif);
+
 DECLARE_SNMP_STAT(struct udp_mib, udp_statistics);
 /*
  * SNMP statistics for UDP and UDP-Lite
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 1d15edc..52695a6 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -285,6 +285,14 @@ static struct sock *__udp4_lib_lookup(__be32 saddr, __be16 sport,
return result;
 }
 
+struct sock *udp4_lib_lookup(__be32 saddr, __be16 sport,
+__be32 daddr, __be16 dport,
+int dif)
+{
+   return __udp4_lib_lookup(saddr, sport, daddr, dport, dif, udp_hash);
+}
+EXPORT_SYMBOL_GPL(udp4_lib_lookup);
+
 static inline struct sock *udp_v4_mcast_next(struct sock *sk,
 __be16 loc_port, __be32 loc_addr,
 __be16 rmt_port, __be32 rmt_addr,



[PATCH/RFC 11/13] iptables tproxy table

2007-03-05 Thread KOVACS Krisztian
The iptables tproxy table registers a new hook on PRE_ROUTING and for
each incoming TCP/UDP packet performs as follows:

1. Does IPv4 fragment reassembly. We need this to be able to do TCP/UDP
   header processing.

2. Does a TCP/UDP socket hash lookup to decide whether or not the packet
   is sent to a non-local bound socket. If a matching socket is found
   and the socket has the IP_TRANSPARENT socket option enabled the skb is
   diverted locally and the socket reference is stored in the skb.

3. If no matching socket was found, the PREROUTING chain of the
   iptables tproxy table is consulted. Matching rules with the TPROXY
   target can do transparent redirection here. (In this case it is not
   necessary to have the IP_TRANSPARENT socket option enabled for the
   target socket, redirection takes place even for regular
   sockets. This way no modification of the application is necessary.)
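
The decision in steps 2 and 3 can be sketched roughly as follows. Everything in this block is hypothetical: `mock_sock`, `tproxy_action` and `tproxy_decide()` are names invented for this note. The description does not spell out what happens when a matching socket exists but does not have IP_TRANSPARENT set; the sketch assumes that case is handled like "no match" and falls through to the rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of the hook for one packet. */
enum tproxy_action {
    TP_DIVERT,        /* step 2: divert the skb to the matching socket */
    TP_CONSULT_RULES  /* step 3: walk the PREROUTING chain of the tproxy table */
};

/* Hypothetical stand-in for the result of the socket hash lookup. */
struct mock_sock {
    bool transparent; /* IP_TRANSPARENT enabled on this socket */
};

/* sk is the lookup result, or NULL if no socket matched the packet. */
static enum tproxy_action tproxy_decide(const struct mock_sock *sk)
{
    if (sk != NULL && sk->transparent)
        return TP_DIVERT;
    /* Assumption: a match without IP_TRANSPARENT is treated like no match. */
    return TP_CONSULT_RULES;
}
```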

Signed-off-by: KOVACS Krisztian [EMAIL PROTECTED]

---

 include/linux/netfilter_ipv4.h   |1 
 include/linux/netfilter_ipv4/ip_tproxy.h |   20 ++
 include/net/ip.h |3 
 net/ipv4/netfilter/Kconfig   |   10 +
 net/ipv4/netfilter/Makefile  |1 
 net/ipv4/netfilter/iptable_tproxy.c  |  267 ++
 6 files changed, 301 insertions(+), 1 deletions(-)

diff --git a/include/linux/netfilter_ipv4.h b/include/linux/netfilter_ipv4.h
index ceae87a..cc4d83b 100644
--- a/include/linux/netfilter_ipv4.h
+++ b/include/linux/netfilter_ipv4.h
@@ -58,6 +58,7 @@ enum nf_ip_hook_priorities {
NF_IP_PRI_SELINUX_FIRST = -225,
NF_IP_PRI_CONNTRACK = -200,
NF_IP_PRI_MANGLE = -150,
+   NF_IP_PRI_TPROXY = -125,
NF_IP_PRI_NAT_DST = -100,
NF_IP_PRI_FILTER = 0,
NF_IP_PRI_NAT_SRC = 100,
diff --git a/include/linux/netfilter_ipv4/ip_tproxy.h b/include/linux/netfilter_ipv4/ip_tproxy.h
new file mode 100644
index 000..ae890e3
--- /dev/null
+++ b/include/linux/netfilter_ipv4/ip_tproxy.h
@@ -0,0 +1,20 @@
+#ifndef _IP_TPROXY_H
+#define _IP_TPROXY_H
+
+#include <linux/types.h>
+
+/* look up and get a reference to a matching socket */
+extern struct sock *
+ip_tproxy_get_sock(const u8 protocol,
+  const __be32 saddr, const __be32 daddr,
+  const __be16 sport, const __be16 dport,
+  const struct net_device *in);
+
+/* divert skb to a given socket */
+extern int
+ip_tproxy_do_divert(struct sk_buff *skb,
+   const struct sock *sk,
+   const int require_freebind,
+   const struct net_device *in);
+
+#endif
diff --git a/include/net/ip.h b/include/net/ip.h
index 8b71991..a589e6e 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -321,7 +321,8 @@ enum ip_defrag_users
IP_DEFRAG_CONNTRACK_OUT,
IP_DEFRAG_VS_IN,
IP_DEFRAG_VS_OUT,
-   IP_DEFRAG_VS_FWD
+   IP_DEFRAG_VS_FWD,
+   IP_DEFRAG_TP_IN,
 };
 
 struct sk_buff *ip_defrag(struct sk_buff *skb, u32 user);
diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig
index 601808c..17c3ec8 100644
--- a/net/ipv4/netfilter/Kconfig
+++ b/net/ipv4/netfilter/Kconfig
@@ -628,6 +628,16 @@ config IP_NF_RAW
  If you want to compile it as a module, say M here and read
  file:Documentation/modules.txt.  If unsure, say `N'.
 
+# tproxy table
+config IP_NF_TPROXY
+	tristate "Transparent proxying"
+   depends on IP_NF_IPTABLES
+   help
+ Transparent proxying. For more information see
+ http://www.balabit.com/downloads/tproxy.
+
+ To compile it as a module, choose M here.  If unsure, say N.
+
 # ARP tables
 config IP_NF_ARPTABLES
 	tristate "ARP tables support"
diff --git a/net/ipv4/netfilter/Makefile b/net/ipv4/netfilter/Makefile
index 6625ec6..21a29f4 100644
--- a/net/ipv4/netfilter/Makefile
+++ b/net/ipv4/netfilter/Makefile
@@ -81,6 +81,7 @@ obj-$(CONFIG_IP_NF_MANGLE) += iptable_mangle.o
 obj-$(CONFIG_IP_NF_NAT) += iptable_nat.o
 obj-$(CONFIG_NF_NAT) += iptable_nat.o
 obj-$(CONFIG_IP_NF_RAW) += iptable_raw.o
+obj-$(CONFIG_IP_NF_TPROXY) += iptable_tproxy.o
 
 # matches
 obj-$(CONFIG_IP_NF_MATCH_IPRANGE) += ipt_iprange.o
diff --git a/net/ipv4/netfilter/iptable_tproxy.c b/net/ipv4/netfilter/iptable_tproxy.c
new file mode 100644
index 000..a241f11
--- /dev/null
+++ b/net/ipv4/netfilter/iptable_tproxy.c
@@ -0,0 +1,267 @@
+/*
+ * Transparent proxy support for Linux/iptables
+ *
+ * Copyright (c) 2006-2007 BalaBit IT Ltd.
+ * Author: Balazs Scheidler, Krisztian Kovacs
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include <linux/version.h>
+#include <linux/module.h>
+
+#include <linux/net.h>
+#include <linux/if.h>
+#include <linux/netdevice.h>
+#include <linux/inetdevice.h>
+#include <linux/in.h>
+#include <net/tcp.h>
+#include <net/udp.h>
+#include 

[PATCH/RFC 12/13] iptables TPROXY target

2007-03-05 Thread KOVACS Krisztian
The TPROXY target implements redirection of non-local TCP/UDP traffic
to local sockets. It is simply a wrapper around functionality exported
from iptable_tproxy.

Signed-off-by: KOVACS Krisztian [EMAIL PROTECTED]

---

 include/linux/netfilter_ipv4/ipt_TPROXY.h |9 +++
 net/ipv4/netfilter/Kconfig|   11 +++
 net/ipv4/netfilter/Makefile   |1 
 net/ipv4/netfilter/ipt_TPROXY.c   |   92 +
 4 files changed, 113 insertions(+), 0 deletions(-)

diff --git a/include/linux/netfilter_ipv4/ipt_TPROXY.h b/include/linux/netfilter_ipv4/ipt_TPROXY.h
new file mode 100644
index 000..d05c956
--- /dev/null
+++ b/include/linux/netfilter_ipv4/ipt_TPROXY.h
@@ -0,0 +1,9 @@
+#ifndef _IPT_TPROXY_H_target
+#define _IPT_TPROXY_H_target
+
+struct ipt_tproxy_target_info {
+   u_int16_t lport;
+   u_int32_t laddr;
+};
+
+#endif
diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig
index 17c3ec8..ecd8da5 100644
--- a/net/ipv4/netfilter/Kconfig
+++ b/net/ipv4/netfilter/Kconfig
@@ -638,6 +638,17 @@ config IP_NF_TPROXY
 
  To compile it as a module, choose M here.  If unsure, say N.
 
+config IP_NF_TARGET_TPROXY
+	tristate "TPROXY target support"
+   depends on IP_NF_TPROXY
+   help
+ This option adds a `TPROXY' target, which is somewhat similar to
+ REDIRECT.  It can only be used in the tproxy table and is useful
+ to redirect traffic to a transparent proxy.  It does _not_ depend
+ on Netfilter connection tracking.
+
+ To compile it as a module, choose M here.  If unsure, say N.
+
 # ARP tables
 config IP_NF_ARPTABLES
 	tristate "ARP tables support"
diff --git a/net/ipv4/netfilter/Makefile b/net/ipv4/netfilter/Makefile
index 21a29f4..a50a64e 100644
--- a/net/ipv4/netfilter/Makefile
+++ b/net/ipv4/netfilter/Makefile
@@ -106,6 +106,7 @@ obj-$(CONFIG_IP_NF_TARGET_LOG) += ipt_LOG.o
 obj-$(CONFIG_IP_NF_TARGET_ULOG) += ipt_ULOG.o
 obj-$(CONFIG_IP_NF_TARGET_CLUSTERIP) += ipt_CLUSTERIP.o
 obj-$(CONFIG_IP_NF_TARGET_TTL) += ipt_TTL.o
+obj-$(CONFIG_IP_NF_TARGET_TPROXY) += ipt_TPROXY.o
 
 # generic ARP tables
 obj-$(CONFIG_IP_NF_ARPTABLES) += arp_tables.o
diff --git a/net/ipv4/netfilter/ipt_TPROXY.c b/net/ipv4/netfilter/ipt_TPROXY.c
new file mode 100644
index 000..89a08b1
--- /dev/null
+++ b/net/ipv4/netfilter/ipt_TPROXY.c
@@ -0,0 +1,92 @@
+/*
+ * Transparent proxy support for Linux/iptables
+ *
+ * Copyright (c) 2006-2007 BalaBit IT Ltd.
+ * Author: Balazs Scheidler, Krisztian Kovacs
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/ip.h>
+#include <net/checksum.h>
+#include <net/udp.h>
+#include <net/inet_sock.h>
+
+#include <linux/netfilter_ipv4/ip_tables.h>
+#include <linux/netfilter_ipv4/ip_tproxy.h>
+#include <linux/netfilter_ipv4/ipt_TPROXY.h>
+
+static unsigned int
+target(struct sk_buff **pskb,
+   const struct net_device *in,
+   const struct net_device *out,
+   unsigned int hooknum,
+   const struct xt_target *target,
+   const void *targinfo)
+{
+	const struct iphdr *iph = (*pskb)->nh.iph;
+	const struct ipt_tproxy_target_info *tgi =
+		(const struct ipt_tproxy_target_info *) targinfo;
+	unsigned int verdict = NF_ACCEPT;
+	struct sk_buff *skb = *pskb;
+	struct udphdr _hdr, *hp;
+	struct sock *sk;
+	__be32 daddr;
+	__be16 dport;
+
+	/* TCP/UDP only */
+	if ((iph->protocol != IPPROTO_TCP) &&
+	    (iph->protocol != IPPROTO_UDP))
+		return NF_ACCEPT;
+
+	hp = skb_header_pointer(*pskb, iph->ihl * 4, sizeof(_hdr), &_hdr);
+	if (hp == NULL)
+		return NF_DROP;
+
+	daddr = tgi->laddr ? : iph->daddr;
+	dport = tgi->lport ? : hp->dest;
+	sk = ip_tproxy_get_sock(iph->protocol,
+				iph->saddr, daddr,
+				hp->source, dport, in);
+	if (sk != NULL) {
+		if (ip_tproxy_do_divert(skb, sk, 0, in) < 0)
+			verdict = NF_DROP;
+
+		if ((iph->protocol == IPPROTO_TCP) && (sk->sk_state == TCP_TIME_WAIT))
+			inet_twsk_put(inet_twsk(sk));
+		else
+			sock_put(sk);
+	}
+
+   return verdict;
+}
+
+static struct xt_target ipt_tproxy_reg = {
+	.name       = "TPROXY",
+	.family     = AF_INET,
+	.target     = target,
+	.targetsize = sizeof(struct ipt_tproxy_target_info),
+	.table      = "tproxy",
+	.me         = THIS_MODULE,
+};
+
+static int __init init(void)
+{
+	return xt_register_target(&ipt_tproxy_reg);
+}
+
+static void __exit fini(void)
+{
+	xt_unregister_target(&ipt_tproxy_reg);
+}
+
+module_init(init);
+module_exit(fini);

[PATCH/RFC 13/13] iptables tproxy match

2007-03-05 Thread KOVACS Krisztian
Implements an iptables module which matches packets which have the
tproxy flag set, that is, packets diverted in the tproxy table.

Signed-off-by: KOVACS Krisztian [EMAIL PROTECTED]

---

 net/netfilter/Kconfig |9 +
 net/netfilter/Makefile|1 +
 net/netfilter/xt_tproxy.c |   77 +
 3 files changed, 87 insertions(+), 0 deletions(-)

diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 253fce3..b22346e 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -603,6 +603,15 @@ config NETFILTER_XT_MATCH_QUOTA
  If you want to compile it as a module, say M here and read
  file:Documentation/modules.txt.  If unsure, say `N'.
 
+config NETFILTER_XT_MATCH_TPROXY
+   tristate 'tproxy match support'
+   depends on NETFILTER_XTABLES
+   help
+ This option adds a `tproxy' match, which allows you to match
+ packets which have been diverted to local sockets by TProxy.
+
+ To compile it as a module, choose M here.  If unsure, say N.
+
 config NETFILTER_XT_MATCH_REALM
tristate  'realm match support'
depends on NETFILTER_XTABLES
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index b2b5c75..83b2fd9 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -64,6 +64,7 @@ obj-$(CONFIG_NETFILTER_XT_MATCH_MARK) += xt_mark.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_MULTIPORT) += xt_multiport.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_POLICY) += xt_policy.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_PKTTYPE) += xt_pkttype.o
+obj-$(CONFIG_NETFILTER_XT_MATCH_TPROXY) += xt_tproxy.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_QUOTA) += xt_quota.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_REALM) += xt_realm.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_SCTP) += xt_sctp.o
diff --git a/net/netfilter/xt_tproxy.c b/net/netfilter/xt_tproxy.c
new file mode 100644
index 000..53f8bee
--- /dev/null
+++ b/net/netfilter/xt_tproxy.c
@@ -0,0 +1,77 @@
+/*
+ * Transparent proxy support for Linux/iptables
+ *
+ * Copyright (c) 2007 BalaBit IT Ltd.
+ * Author: Krisztian Kovacs
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/skbuff.h>
+
+#include <linux/netfilter/x_tables.h>
+
+static int
+match(const struct sk_buff *skb,
+  const struct net_device *in,
+  const struct net_device *out,
+  const struct xt_match *match,
+  const void *matchinfo,
+  int offset,
+  unsigned int protoff,
+  int *hotdrop)
+{
+	return skb->ip_tproxy;
+}
+
+static int
+check(const char *tablename,
+  const void *entry,
+  const struct xt_match *match,
+  void *matchinfo,
+  unsigned int hook_mask)
+{
+   return 1;
+}
+
+static struct xt_match tproxy_matches[] = {
+	{
+		.name       = "tproxy",
+		.match      = match,
+		.matchsize  = 0,
+		.checkentry = check,
+		.family     = AF_INET,
+		.me         = THIS_MODULE,
+	},
+	{
+		.name       = "tproxy",
+		.match      = match,
+		.matchsize  = 0,
+		.checkentry = check,
+		.family     = AF_INET6,
+		.me         = THIS_MODULE,
+	},
+};
+
+static int __init xt_tproxy_init(void)
+{
+   return xt_register_matches(tproxy_matches, ARRAY_SIZE(tproxy_matches));
+}
+
+static void __exit xt_tproxy_fini(void)
+{
+   xt_unregister_matches(tproxy_matches, ARRAY_SIZE(tproxy_matches));
+}
+
+module_init(xt_tproxy_init);
+module_exit(xt_tproxy_fini);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Krisztian Kovacs [EMAIL PROTECTED]");
+MODULE_DESCRIPTION("iptables tproxy match module");
+MODULE_ALIAS("ipt_tproxy");
+MODULE_ALIAS("ip6t_tproxy");



Re: SWS for rcvbuf MTU

2007-03-05 Thread Alex Sidorenko
On March 3, 2007 06:40:12 pm John Heffner wrote:
> David Miller wrote:
> > From: John Heffner [EMAIL PROTECTED]
> > Date: Fri, 02 Mar 2007 16:16:39 -0500
> >
> > Please don't apply the patch I sent.  I've been thinking about this a
> > bit harder, and it may not fix this particular problem.  (Hard to say
> > without knowing exactly what it is.)  As the comment above
> > __tcp_select_window() states, we do not do full receive-side SWS
> > avoidance because of header prediction.
> >
> > Alex, you're right I missed that special zero-window case.  I'm still
> > not quite sure I'm completely happy with this patch.  I'd like to think
> > about this a little bit harder...
> >
> > Ok
>
> Alright, I've thought about it a bit more, and I think the patch I sent
> should work.  Alex, any opinion?  Any way you can test this out?

Here are the values from live kernel (obtained with 'crash') when the host was 
in SWS state:

full_space=708  full_space/2=354
free_space=393
window=76

In this case the test from my original fix, (window < full_space/2), 
succeeds. But John's test

free_space > window + full_space/2
393 > 430

does not. So I suspect that the new fix will not always work. From tcpdump 
traces we can see that both hosts exchange 76-byte packets for a long 
time. From the customer's application log we see that it continues to read 
76-byte chunks per each read() call, even though more than that is available 
in the receive buffer. Technically it's OK for read() to return even after 
reading one byte, so if sk->receive_queue contains multiple 76-byte skbuffs 
we may return after processing just one skbuff (but we don't understand 
the details of why this happens on the customer's system).

Are there any particular reasons why you want to postpone window update until 
free_space becomes  window + full_space/2 and not as soon as 
free_space  full_space/2? As the only real-life occurance of SWS shows 
free_space oscillating slightly above full_space/2, I created the fix 
specifically to match this phenomena as seen on customer's host. We reach the 
modified section only when (free_space  full_space/2) so it should be OK to 
update the window at this point if mss==full_space. 

So yes, we can test John's fix on the customer's host, but I doubt it will work for 
the reasons mentioned above. In brief:

'window = free_space' instead of 'window=full_space/2' is OK,
but the test 'free_space > window + full_space/2' is not, given the specific 
pattern the customer sees on his hosts.
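
The comparison above can be checked with a small standalone sketch. The
function names are hypothetical and the comparison operators are reconstructed
from the numbers quoted in this message; this is not the kernel's
__tcp_select_window() code.

```c
#include <stdbool.h>

/* Test from the original fix: is the advertised window below half the buffer? */
static bool original_test(int window, int full_space)
{
	return window < full_space / 2;
}

/* Test from the proposed fix: has free space grown past window + half the buffer? */
static bool proposed_test(int free_space, int window, int full_space)
{
	return free_space > window + full_space / 2;
}
```

With the values captured from the live kernel (full_space=708, free_space=393,
window=76), original_test() is true while proposed_test() is false, because
76 + 354 = 430 exceeds 393.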

Thanks,
Alex


-- 
--
Alexandre Sidorenko email: [EMAIL PROTECTED]
Global Solutions Engineering:   Unix Networking
Hewlett-Packard (Canada)
--
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC NET_SCHED 00/03]: ktime + nano-second clock resolution for packet schedulers

2007-03-05 Thread Patrick McHardy
David Miller wrote:
 Frankly, I think now that we have ktime and all of the proper generic
 infrastructure to do this stuff properly, I think we should just use
 ktime for the packet scheduler across the board and just delete all of
 that old by-hand timekeeping selection crap from pkt_sched.h

Sounds good, I'm going to remove all other clock sources.
Will resend in a couple of days after fixing a few more
problems I noticed.



Re: [PATCH RFC 17/31] net: Factor out __dev_alloc_name from dev_alloc_name

2007-03-05 Thread Benjamin Thery

Hello Eric,

See comments about __dev_alloc_name() below.

Regards,
Benjamin

Eric W. Biederman wrote:

From: Eric W. Biederman [EMAIL PROTECTED] - unquoted

When forcibly changing the network namespace of a device
I need something that can generate a name for the device
in the new namespace without overwriting the old name.

__dev_alloc_name provides me that functionality.

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
 net/core/dev.c |   44 +---
 1 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 32fe905..fc0d2af 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -655,9 +655,10 @@ int dev_valid_name(const char *name)
 }
 
 /**

- * dev_alloc_name - allocate a name for a device
- * @dev: device
+ * __dev_alloc_name - allocate a name for a device
+ * @net: network namespace to allocate the device name in
  * @name: name format string
+ * @buf:  scratch buffer and result name string
  *
  * Passed a format string - eg lt%d it will try and find a suitable
  * id. It scans list of devices to build up a free map, then chooses
@@ -668,18 +669,13 @@ int dev_valid_name(const char *name)
  * Returns the number of the unit assigned or a negative errno code.
  */
 
-int dev_alloc_name(struct net_device *dev, const char *name)

+static int __dev_alloc_name(net_t net, const char *name, char buf[IFNAMSIZ])


IMHO the third parameter should be: char *buf
Indeed, using char buf[IFNAMSIZ] is misleading because later in the 
routine sizeof(buf) is used (with an expected result of IFNAMSIZ).
Unfortunately this is no longer the case: buf is now a pointer parameter, 
so sizeof(buf) is only the size of a pointer (4 here).


This corrupts the registration of network devices (now I understand 
why only one of my e1000 showed up after each reboot :).


Also sizeof(buf) should be replaced by IFNAMSIZ in this new routine.
(See below)
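
The sizeof behaviour described above can be seen in a minimal userspace
sketch. The function names are hypothetical and IFNAMSIZ is redefined locally
so the fragment stands alone; it is not the code from the patch.

```c
#include <stddef.h>

#define IFNAMSIZ 16	/* the Linux value, repeated so the sketch stands alone */

/*
 * A parameter declared as char buf[IFNAMSIZ] is adjusted to char *buf,
 * so sizeof(buf) here is the size of a pointer, not IFNAMSIZ.
 * Compilers commonly warn about exactly this line.
 */
static size_t size_seen_by_callee(char buf[IFNAMSIZ])
{
	return sizeof(buf);
}

/* A genuine local array: sizeof(buf) is IFNAMSIZ. */
static size_t size_seen_by_owner(void)
{
	char buf[IFNAMSIZ];

	return sizeof(buf);
}
```

On a 32-bit build the first function returns 4 and on a 64-bit build 8; the
second returns 16 on both.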


 {
int i = 0;
-   char buf[IFNAMSIZ];
const char *p;
const int max_netdevices = 8*PAGE_SIZE;
long *inuse;
struct net_device *d;
-   net_t net;
-
-   BUG_ON(null_net(dev->nd_net));
-   net = dev->nd_net;
 
 	p = strnchr(name, IFNAMSIZ-1, '%');

if (p) {
@@ -713,10 +709,8 @@ int dev_alloc_name(struct net_device *dev, const char 
*name)
}
 
 	snprintf(buf, sizeof(buf), name, i);


Replace this with snprintf(buf, IFNAMSIZ, name, i); or i will never be 
appended to name and all your ethernet devices will try to 
register the name eth.


There is another occurrence of snprintf(buf, sizeof(buf), ...) to 
replace in the for loop above.
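
A rough illustration of why the unit number disappears, assuming a 32-bit
build where the pointer size is 4. The helper name is hypothetical; it only
wraps the standard snprintf().

```c
#include <stdio.h>

/*
 * Hypothetical helper: expand a "name%d"-style format into out, writing at
 * most size bytes including the terminating NUL.
 */
static int format_unit_name(char *out, size_t size, const char *fmt, int unit)
{
	return snprintf(out, size, fmt, unit);
}
```

Called with size 4 and the format "eth%d", only "eth" plus the NUL fits, so
every unit number yields the same name. Called with size IFNAMSIZ (16), the
result for unit 0 is "eth0".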



-   if (!__dev_get_by_name(net, buf)) {
-   strlcpy(dev->name, buf, IFNAMSIZ);
+   if (!__dev_get_by_name(net, buf))
return i;
-   }
 
 	/* It is possible to run out of possible slots

 * when the name is long and there isn't enough space left
@@ -725,6 +719,34 @@ int dev_alloc_name(struct net_device *dev, const char 
*name)
return -ENFILE;
 }
 
+/**

+ * dev_alloc_name - allocate a name for a device
+ * @dev: device
+ * @name: name format string
+ *
+ * Passed a format string - eg lt%d it will try and find a suitable
+ * id. It scans list of devices to build up a free map, then chooses
+ * the first empty slot. The caller must hold the dev_base or rtnl lock
+ * while allocating the name and adding the device in order to avoid
+ * duplicates.
+ * Limited to bits_per_byte * page size devices (ie 32K on most platforms).
+ * Returns the number of the unit assigned or a negative errno code.
+ */
+
+int dev_alloc_name(struct net_device *dev, const char *name)
+{
+   char buf[IFNAMSIZ];
+   net_t net;
+   int ret;
+
+   BUG_ON(null_net(dev->nd_net));
+   net = dev->nd_net;
+   ret = __dev_alloc_name(net, name, buf);
+   if (ret >= 0)
+   strlcpy(dev->name, buf, IFNAMSIZ);
+   return ret;
+}
+
 
 /**

  * dev_change_name - change name of a device



--
B e n j a m i n   T h e r y  - BULL/DT/Open Software RD

   http://www.bull.com


Re: [RFC NET_SCHED 00/03]: ktime + nano-second clock resolution for packet schedulers

2007-03-05 Thread Stephen Hemminger
On Mon, 05 Mar 2007 18:42:26 +0100
Patrick McHardy [EMAIL PROTECTED] wrote:

 David Miller wrote:
  Frankly, I think now that we have ktime and all of the proper generic
  infrastructure to do this stuff properly, I think we should just use
  ktime for the packet scheduler across the board and just delete all of
  that old by-hand timekeeping selection crap from pkt_sched.h
 
 Sounds good, I'm going to remove all other clock sources.
 Will resend in a couple of days after fixing a few more
 problems I noticed.
 

Don't bother changing netem. I have a version that uses hrtimer's
and doesn't use PSCHED() clock source anymore.

-- 
Stephen Hemminger [EMAIL PROTECTED]


Re: [RFC NET_SCHED 00/03]: ktime + nano-second clock resolution for packet schedulers

2007-03-05 Thread Patrick McHardy
Stephen Hemminger wrote:
 Don't bother changing netem. I have a version that uses hrtimer's
 and doesn't use PSCHED() clock source anymore.

Me too :) I'm going to send it with my other patches soon, if you
don't like it we can still drop it.


[PATCH] natsemi: netpoll fixes

2007-03-05 Thread Sergei Shtylyov
Fix two issues in this driver's netpoll path: one usual, with spin_unlock_irq()
enabling interrupts which nobody asked it to do (this has been fixed recently in
a number of drivers), and one unusual, with the poll_controller() method possibly
causing loss of interrupts because the interrupt status register is cleared
by a simple read and the interrupt handler simply stores it rather than accumulating.

Signed-off-by: Sergei Shtylyov [EMAIL PROTECTED]

---
 drivers/net/natsemi.c |   24 +++-
 1 files changed, 19 insertions(+), 5 deletions(-)

Index: linux-2.6/drivers/net/natsemi.c
===
--- linux-2.6.orig/drivers/net/natsemi.c
+++ linux-2.6/drivers/net/natsemi.c
@@ -2024,6 +2024,7 @@ static int start_tx(struct sk_buff *skb,
struct netdev_private *np = netdev_priv(dev);
void __iomem * ioaddr = ns_ioaddr(dev);
unsigned entry;
+   unsigned long flags;
 
/* Note: Ordering is important here, set the field with the
   ownership bit last, and only then increment cur_tx. */
@@ -2037,7 +2038,7 @@ static int start_tx(struct sk_buff *skb,
 
np->tx_ring[entry].addr = cpu_to_le32(np->tx_dma[entry]);
 
-   spin_lock_irq(&np->lock);
+   spin_lock_irqsave(&np->lock, flags);
 
if (!np->hands_off) {
np->tx_ring[entry].cmd_status = cpu_to_le32(DescOwn | skb->len);
@@ -2056,7 +2057,7 @@ static int start_tx(struct sk_buff *skb,
dev_kfree_skb_irq(skb);
np->stats.tx_dropped++;
}
-   spin_unlock_irq(&np->lock);
+   spin_unlock_irqrestore(&np->lock, flags);
 
dev-trans_start = jiffies;
 
@@ -,6 +2223,8 @@ static void netdev_rx(struct net_device 
pkt_len = (desc_status & DescSizeMask) - 4;
if ((desc_status&(DescMore|DescPktOK|DescRxLong)) != DescPktOK){
if (desc_status & DescMore) {
+   unsigned long flags;
+
if (netif_msg_rx_err(np))
printk(KERN_WARNING
"%s: Oversized(?) Ethernet "
@@ -2236,12 +2239,12 @@ static void netdev_rx(struct net_device 
 * reset procedure documented in
 * AN-1287. */
 
-   spin_lock_irq(&np->lock);
+   spin_lock_irqsave(&np->lock, flags);
reset_rx(dev);
reinit_rx(dev);
writel(np->ring_dma, ioaddr + RxRingPtr);
check_link(dev);
-   spin_unlock_irq(&np->lock);
+   spin_unlock_irqrestore(&np->lock, flags);
 
/* We'll enable RX on exit from this
 * function. */
@@ -2396,8 +2399,19 @@ static struct net_device_stats *get_stat
 #ifdef CONFIG_NET_POLL_CONTROLLER
 static void natsemi_poll_controller(struct net_device *dev)
 {
+   struct netdev_private *np = netdev_priv(dev);
+
disable_irq(dev->irq);
-   intr_handler(dev->irq, dev);
+
+   /*
+* A real interrupt might have already reached us at this point
+* but NAPI might still haven't called us back.  As the interrupt
+* status register is cleared by reading, we should prevent an
+* interrupt loss in this case...
+*/
+   if (!np->intr_status)
+   intr_handler(dev->irq, dev);
+
enable_irq(dev->irq);
 }
 #endif



[PATCH/RFC] Use of uninitialized data in udp_recvmsg().

2007-03-05 Thread Jim Chow
After inspection of some networking code, it seems there is a use of
uninitialized data in udp_recvmsg(),
linux-2.6.20.1/net/ipv4/udp.c:843, while testing msg->msg_flags (see
the backtrace below).  It looks like sys_recvfrom() is not
initializing msg.msg_flags and, along the path given below, msg_flags
is tested (at #0) without (necessarily) being written to.

A simple fix for this particular problem is given below.

Alternatively, udp_recvmsg() could be changed to initialize msg_flags
for its caller, since udp_recvmsg() (always? [*]) uses msg_flags as an
output argument.

In any case, I wanted to verify the bug with the networking gurus to see 
if they agree.

#0 udp_recvmsg (linux-2.6.20.1/net/ipv4/udp.c:843)
#1 sock_common_recvmsg (linux-2.6.20.1/net/core/sock.c:1617)
#2 sock_recvmsg (linux-2.6.20.1/net/socket.c:630)
#3 sys_recvfrom (linux-2.6.20.1/net/socket.c:1608)
#4 sys_socketcall (linux-2.6.20.1/net/socket.c:2007)
#5 syscall_call (linux-2.6.20.1/arch/i386/kernel/entry.S:0)

Index: linux-2.6.20.1/net/socket.c
===
--- linux-2.6.20.1.orig/net/socket.c
+++ linux-2.6.20.1/net/socket.c
@@ -1601,6 +1601,7 @@
iov.iov_base = ubuf;
msg.msg_name = address;
msg.msg_namelen = MAX_SOCK_ADDR;
+   msg.msg_flags = 0;
if (sock->file->f_flags & O_NONBLOCK)
flags |= MSG_DONTWAIT;
err = sock_recvmsg(sock, &msg, size, flags);


--

[*] Although do_sock_read() linux-2.6.20.1/net/socket.c:704, for one,
seems to want to initialize msg_flags nonzero, so maybe not.



Re: [PATCH] twcal_jiffie should be unsigned long, not int

2007-03-05 Thread David Miller
From: Eric Dumazet [EMAIL PROTECTED]
Date: Mon, 5 Mar 2007 16:09:21 +0100

 While browsing include/net/inet_timewait_sock.h, I found this buggy 
 definition 
 of twcal_jiffie.
 
 int twcal_jiffie;
 
 I wonder how inet_twdr_twcal_tick() can really work on x86_64
 
 This seems quite an old bug, it was there before introduction of 
 inet_timewait_death_row made by Arnaldo Carvalho de Melo.
 
 [PATCH] twcal_jiffie should be unsigned long, not int
 
 Signed-off-by: Eric Dumazet [EMAIL PROTECTED]

Grrr, good catch Eric.  I'll push this fix to -stable too.

Thanks a lot.
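
A sketch of why an int field is a problem on a 64-bit machine, under the
assumption that the field is filled from a 64-bit jiffies-style counter. The
function name is hypothetical and the fixed-width types stand in for int and
unsigned long on x86_64; this is not the inet_timewait_sock code.

```c
#include <stdint.h>

/*
 * Store a 64-bit tick count in a 32-bit signed field, then widen it again
 * and report whether the original value survived the round trip.
 */
static int survives_int_field(uint64_t ticks)
{
	int32_t twcal_jiffie = (int32_t)ticks;	/* upper 32 bits are lost */

	return (uint64_t)(int64_t)twcal_jiffie == ticks;
}
```

Small values round-trip unchanged, but once the counter exceeds what 32 bits
can hold the stored value no longer matches, so later comparisons against the
real counter go wrong.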


skge 0000:01:0a.0: unsupported phy type 0x0

2007-03-05 Thread Chris Stromsoe
I have a bunch of dual-port SK 98xx cards that work with sk98lin but not 
with skge.  After loading skge, I get


ACPI: PCI Interrupt 0000:01:0a.0[A] -> Link [LNKC] -> GSI 10 (level, low) -> IRQ 10
skge 0000:01:0a.0: unsupported phy type 0x0
ACPI: PCI interrupt for device 0000:01:0a.0 disabled
skge: probe of 0000:01:0a.0 failed with error -95


lspci -vv output for the card:

0000:01:0a.0 Ethernet controller: Syskonnect (Schneider & Koch) SK-98xx Gigabit 
Ethernet Server Adapter (rev 12)
Subsystem: Syskonnect (Schneider & Koch) SK-9844 Gigabit Ethernet 
Server Adapter (SK-NET GE-SX dual link)
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV+ VGASnoop- ParErr- 
Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- 
<MAbort- >SERR- <PERR-
Interrupt: pin A routed to IRQ 10
Region 0: Memory at ff8fc000 (32-bit, non-prefetchable) [size=16K]
Region 1: I/O ports at d800 [size=256]
Expansion ROM at ff40 [disabled] [size=128K]
Capabilities: [48] Power Management version 1
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [50] Vital Product Data




-Chris


Re: [PATCH/RFC] Use of uninitialized data in udp_recvmsg().

2007-03-05 Thread Herbert Xu
Jim Chow [EMAIL PROTECTED] wrote:
 After inspection of some networking code, it seems there is a use of
 uninitialized data in udp_recvmsg(),
 linux-2.6.20.1/net/ipv4/udp.c:843, while testing msg->msg_flags (see
 the backtrace below).  It looks like sys_recvfrom() is not

msg_flags is set on return and its initial value is not used.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH/RFC] Use of uninitialized data in udp_recvmsg().

2007-03-05 Thread Jim Chow
On Tue, 6 Mar 2007, Herbert Xu wrote:
 msg_flags [...] its initial value is not used.

That's not true.  Please see my post.

Specifically, at linux-2.6.20.1/net/ipv4/udp.c:843, it seems that 
udp_recvmsg() can randomly ignore whether the HW has computed a checksum 
and compute it in SW redundantly.

Jim


Re: skge 0000:01:0a.0: unsupported phy type 0x0

2007-03-05 Thread Stephen Hemminger
On Mon, 5 Mar 2007 13:48:29 -0800 (PST)
Chris Stromsoe [EMAIL PROTECTED] wrote:

 I have a bunch of dual-port SK 98xx cards that work with sk98lin but not 
 with skge.  After loading skge, I get
 
 ACPI: PCI Interrupt 0000:01:0a.0[A] -> Link [LNKC] -> GSI 10 (level, low) -> 
 IRQ 10
 skge 0000:01:0a.0: unsupported phy type 0x0
 ACPI: PCI interrupt for device 0000:01:0a.0 disabled
 skge: probe of 0000:01:0a.0 failed with error -95
 
 
 lspci -vv output for the card:
 
 0000:01:0a.0 Ethernet controller: Syskonnect (Schneider & Koch) SK-98xx 
 Gigabit Ethernet Server Adapter (rev 12)
  Subsystem: Syskonnect (Schneider & Koch) SK-9844 Gigabit Ethernet 
 Server Adapter (SK-NET GE-SX dual link)
  Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV+ VGASnoop- ParErr- 
 Stepping- SERR+ FastB2B-
  Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
 <TAbort- <MAbort- >SERR- <PERR-
  Interrupt: pin A routed to IRQ 10
  Region 0: Memory at ff8fc000 (32-bit, non-prefetchable) [size=16K]
  Region 1: I/O ports at d800 [size=256]
  Expansion ROM at ff40 [disabled] [size=128K]
  Capabilities: [48] Power Management version 1
  Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
 PME(D0-,D1-,D2-,D3hot-,D3cold-)
  Status: D0 PME-Enable- DSel=0 DScale=1 PME-
  Capabilities: [50] Vital Product Data
 

What kernel version? Type 0 is XMAC support, and that was added in a fairly
recent kernel (2.6.19?)

-- 
Stephen Hemminger [EMAIL PROTECTED]


Re: skge 0000:01:0a.0: unsupported phy type 0x0

2007-03-05 Thread Chris Stromsoe

On Mon, 5 Mar 2007, Stephen Hemminger wrote:


What kernel version? Type 0 is XMAC support, and that was added in a 
fairly recent kernel (2.6.19?)


It was an old kernel.  I booted into 2.6.16 instead of 2.6.20.  See my 
follow-up (and ignore the report).


-Chris


Re: [PATCH] natsemi: netpoll fixes

2007-03-05 Thread Mark Brown
On Tue, Mar 06, 2007 at 12:10:08AM +0400, Sergei Shtylyov wrote:

  #ifdef CONFIG_NET_POLL_CONTROLLER
  static void natsemi_poll_controller(struct net_device *dev)
  {
 + struct netdev_private *np = netdev_priv(dev);
 +
   disable_irq(dev->irq);
 - intr_handler(dev->irq, dev);
 +
 + /*
 +  * A real interrupt might have already reached us at this point
 +  * but NAPI might still haven't called us back.  As the interrupt
 +  * status register is cleared by reading, we should prevent an
 +  * interrupt loss in this case...
 +  */
 + if (!np->intr_status)
 + intr_handler(dev->irq, dev);
 +
   enable_irq(dev->irq);

Is it possible for this to run at the same time as the NAPI poll?  If so
then it is possible for the netpoll poll to run between np->intr_status
being cleared and netif_rx_complete() being called.  If the hardware
asserts an interrupt at the wrong moment then this could cause the 

In any case, this is a problem independently of netpoll if the chip
shares an interrupt with anything so the interrupt handler should be
fixed to cope with this situation instead.
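
The difference between a handler that stores the status and one that
accumulates it can be modelled in a few lines. Everything here (struct names,
field names, the simulated register) is hypothetical and only mimics the
read-to-clear behaviour described in the patch; it is not the natsemi driver.

```c
#include <stdint.h>

/* Simulated chip with a read-to-clear interrupt status register. */
struct fake_chip {
	uint32_t isr;		/* cause bits currently pending in "hardware" */
};

/* Simulated driver state. */
struct fake_priv {
	uint32_t intr_status;	/* causes latched for the deferred poll */
};

/* Reading returns the pending bits and clears them. */
static uint32_t read_and_clear(struct fake_chip *chip)
{
	uint32_t v = chip->isr;

	chip->isr = 0;
	return v;
}

/* Stores the status: a second call overwrites causes not yet processed. */
static void handler_store(struct fake_chip *chip, struct fake_priv *np)
{
	np->intr_status = read_and_clear(chip);
}

/* Accumulates the status: earlier causes survive a second call. */
static void handler_accumulate(struct fake_chip *chip, struct fake_priv *np)
{
	np->intr_status |= read_and_clear(chip);
}
```

If the handler runs twice before the poll consumes intr_status (once for a
real interrupt, once from poll_controller() or a shared interrupt line), the
storing variant ends up with zero while the accumulating variant keeps the
original cause bits.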

-- 
You grabbed my hand and we fell into it, like a daydream - or a fever.




Re: [PATCH] natsemi: netpoll fixes

2007-03-05 Thread Mark Brown
[Once more with CCs]

On Tue, Mar 06, 2007 at 12:10:08AM +0400, Sergei Shtylyov wrote:

  #ifdef CONFIG_NET_POLL_CONTROLLER
  static void natsemi_poll_controller(struct net_device *dev)
  {
 + struct netdev_private *np = netdev_priv(dev);
 +
   disable_irq(dev->irq);
 - intr_handler(dev->irq, dev);
 +
 + /*
 +  * A real interrupt might have already reached us at this point
 +  * but NAPI might still haven't called us back.  As the
 interrupt
 +  * status register is cleared by reading, we should prevent an
 +  * interrupt loss in this case...
 +  */
 + if (!np->intr_status)
 + intr_handler(dev->irq, dev);
 +
   enable_irq(dev->irq);

Is it possible for this to run at the same time as the NAPI poll?  If so
then it is possible for the netpoll poll to run between np->intr_status
being cleared and netif_rx_complete() being called.  If the hardware
asserts an interrupt at the wrong moment then this could cause the

In any case, this is a problem independently of netpoll if the chip
shares an interrupt with anything so the interrupt handler should be
fixed to cope with this situation instead.

--
You grabbed my hand and we fell into it, like a daydream - or a fever.




ignore; Re: skge 0000:01:0a.0: unsupported phy type 0x0

2007-03-05 Thread Chris Stromsoe
Ignore this.  I rebooted into the wrong kernel and was testing with 2.6.16 
instead of 2.6.20.  It works fine with 2.6.20.


-Chris

On Mon, 5 Mar 2007, Chris Stromsoe wrote:

I have a bunch of dual-port SK 98xx cards that work with sk98lin but not 
with skge.  After loading skge, I get


ACPI: PCI Interrupt 0000:01:0a.0[A] -> Link [LNKC] -> GSI 10 (level, low) -> 
IRQ 10

skge 0000:01:0a.0: unsupported phy type 0x0
ACPI: PCI interrupt for device 0000:01:0a.0 disabled
skge: probe of 0000:01:0a.0 failed with error -95


lspci -vv output for the card:

0000:01:0a.0 Ethernet controller: Syskonnect (Schneider & Koch) SK-98xx 
Gigabit Ethernet Server Adapter (rev 12)
   Subsystem: Syskonnect (Schneider & Koch) SK-9844 Gigabit Ethernet 
Server Adapter (SK-NET GE-SX dual link)
   Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV+ VGASnoop- ParErr- 
Stepping- SERR+ FastB2B-
   Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
<TAbort- <MAbort- >SERR- <PERR-

   Interrupt: pin A routed to IRQ 10
   Region 0: Memory at ff8fc000 (32-bit, non-prefetchable) [size=16K]
   Region 1: I/O ports at d800 [size=256]
   Expansion ROM at ff40 [disabled] [size=128K]
   Capabilities: [48] Power Management version 1
   Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot-,D3cold-)

   Status: D0 PME-Enable- DSel=0 DScale=1 PME-
   Capabilities: [50] Vital Product Data




-Chris


Re: [Bugme-new] [Bug 8132] New: pptp server lockup in ppp_asynctty_receive()

2007-03-05 Thread Andrew Morton
On Mon, 5 Mar 2007 14:26:30 -0800
[EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=8132
 
Summary: pptp server lockup in ppp_asynctty_receive()
 Kernel Version:  2.6.20
 Status: NEW
   Severity: high
  Owner: [EMAIL PROTECTED]
  Submitter: [EMAIL PROTECTED]
 CC: [EMAIL PROTECTED]
 
 
 For several kernel releases now I've experienced various lockups of the VPN 
 (pptp) server.
 There are sometimes more than 200 ppp connections.
 With kernel debugging I was able to retrieve the following information:
 
 First:
 Showing all locks held in the system:
 1 lock held by agetty/4486:
  #0:  (tty-atomic_read_lock){--..}, at: [c01d4e6f] read_chan+0x41a/0x60b
 1 lock held by agetty/4487:
  #0:  (tty-atomic_read_lock){--..}, at: [c01d4e6f] read_chan+0x41a/0x60b
 1 lock held by agetty/4488:
  #0:  (tty-atomic_read_lock){--..}, at: [c01d4e6f] read_chan+0x41a/0x60b
 2 locks held by pptpctrl/4500:
  #0:  (tty-atomic_write_lock){--..}, at: [c01d0f8f] tty_write+0x83/0x1d0
  #1:  (ap-recv_lock){}, at: [c02241fe] 
 ppp_asynctty_receive+0x2e/0x710
 
 =
 BUG: spinlock lockup on CPU#1, pppd/4504, df5048c4
  [c01c42c6] _raw_spin_lock+0x100/0x134
  [c0223f03] ppp_async_ioctl+0xa7/0x1d0
  [c0221b7c] ppp_ioctl+0xa5/0xbff
  [c012f2a5] down_read+0x29/0x3a
  [c0223e5c] ppp_async_ioctl+0x0/0x1d0
  [c0221ba5] ppp_ioctl+0xce/0xbff
  [c029b78f] _spin_unlock+0x14/0x1c
  [c014a22a] do_wp_page+0x256/0x4ba
  [c014c1e9] __handle_mm_fault+0x74e/0xa22
  [c0167224] do_ioctl+0x64/0x6d
  [c016727d] vfs_ioctl+0x50/0x273
  [c01674d4] sys_ioctl+0x34/0x50
  [c0102dae] sysenter_past_esp+0x5f/0x99
  ===
 BUG: soft lockup detected on CPU#0!
  [c013ce27] softlockup_tick+0x8d/0xbc
  [c0123784] update_process_times+0x28/0x5e
  [c010d9c4] smp_apic_timer_interrupt+0x80/0x9c
  [c0103903] apic_timer_interrupt+0x33/0x38
  [c01c2e9d] delay_tsc+0x9/0x13
  [c01c2ed3] __delay+0x6/0x7
  [c01c426f] _raw_spin_lock+0xa9/0x134
  [c01d0f8f] tty_write+0x83/0x1d0
  [c01cf418] tty_ldisc_try+0x2f/0x33
  [c029bc8c] lock_kernel+0x19/0x24
  [c01d1017] tty_write+0x10b/0x1d0
  [c01d35cf] write_chan+0x0/0x320
  [c015d502] vfs_write+0x87/0xf0
  [c01d0f0c] tty_write+0x0/0x1d0
  [c015daa9] sys_write+0x41/0x6a
  [c0102dae] sysenter_past_esp+0x5f/0x99
  ===
 
 
 Second)
 0BUG: spinlock lockup on CPU#0, pppd/5209, de3e2884
  [c01c42c6] _raw_spin_lock+0x100/0x134
 BUG: spinlock lockup on CPU#1, ip-down/7524, c0353300
  [c01c42c6] _raw_spin_lock+0x100/0x134
  [c029bc8c] lock_kernel+0x19/0x24
  [c015f234] chrdev_open+0x8a/0x16e
  [c015f1aa] chrdev_open+0x0/0x16e
  [c015ba94] __dentry_open+0xaf/0x1a0
  [c015bc12] nameidata_to_filp+0x31/0x3a
  [c015bc54] do_filp_open+0x39/0x40
  [c029b78f] _spin_unlock+0x14/0x1c
  [c015b9d4] get_unused_fd+0xaa/0xbb
  [c015bc95] do_sys_open+0x3a/0x6d
  [c015bd03] sys_open+0x1c/0x20
  [c0102dae] sysenter_past_esp+0x5f/0x99
  ===
  [c0223f03] ppp_async_ioctl+0xa7/0x1d0
  [c0221b7c] ppp_ioctl+0xa5/0xbff
  [c012f2a5] down_read+0x29/0x3a
  [c0223e5c] ppp_async_ioctl+0x0/0x1d0
  [c0221ba5] ppp_ioctl+0xce/0xbff
  [c029b78f] _spin_unlock+0x14/0x1c
  [c014a22a] do_wp_page+0x256/0x4ba
  [c014c1e9] __handle_mm_fault+0x74e/0xa22
  [c0167224] do_ioctl+0x64/0x6d
  [c016727d] vfs_ioctl+0x50/0x273
  [c01674d4] sys_ioctl+0x34/0x50
  [c0102dae] sysenter_past_esp+0x5f/0x99
  ===
 
 Third)
 BUG: soft lockup detected on CPU#0!
  [c013ce27] softlockup_tick+0x8d/0xbc
  [c0123784] update_process_times+0x28/0x5e
  [c010d9c4] smp_apic_timer_interrupt+0x80/0x9c
  [c0103903] apic_timer_interrupt+0x33/0x38
  [c01c2e9d] delay_tsc+0x9/0x13
  [c01c2ed3] __delay+0x6/0x7
  [c01c426f] _raw_spin_lock+0xa9/0x134
  [c01cf418] tty_ldisc_try+0x2f/0x33
  [c029bc8c] lock_kernel+0x19/0x24
  [c01d1984] tty_read+0x5a/0xbe
  [c015d5f0] vfs_read+0x85/0xee
  [c01d192a] tty_read+0x0/0xbe
  [c015da3f] sys_read+0x41/0x6a
  [c0102dae] sysenter_past_esp+0x5f/0x99
  ===
 BUG: soft lockup detected on CPU#0!
  [c013ce27] softlockup_tick+0x8d/0xbc
  [c0123784] update_process_times+0x28/0x5e
  [c010d9c4] smp_apic_timer_interrupt+0x80/0x9c
  [c0103903] apic_timer_interrupt+0x33/0x38
  [c01c00d8] prio_tree_insert+0xe8/0x23b
  [c01c4275] _raw_spin_lock+0xaf/0x134
  [c01cf418] tty_ldisc_try+0x2f/0x33
  [c029bc8c] lock_kernel+0x19/0x24
  [c01d1984] tty_read+0x5a/0xbe
  [c015d5f0] vfs_read+0x85/0xee
  [c01d192a] tty_read+0x0/0xbe
  [c015da3f] sys_read+0x41/0x6a
  [c0102dae] sysenter_past_esp+0x5f/0x99
 
 
 Next via SysRq:
 
 Showing all locks held in the system:
 1 lock held by agetty/5057:
  #0:  (tty-atomic_read_lock){--..}, at: [c01d4e6f] read_chan+0x41a/0x60b
 1 lock held by agetty/5058:
  #0:  (tty-atomic_read_lock){--..}, at: [c01d4e6f] read_chan+0x41a/0x60b
 1 lock held by agetty/5059:
  #0:  (tty-atomic_read_lock){--..}, at: [c01d4e6f] read_chan+0x41a/0x60b
 2 locks held by pptpctrl/5071:
  #0:  

Re: [PATCH/RFC] Use of uninitialized data in udp_recvmsg().

2007-03-05 Thread Herbert Xu
On Mon, Mar 05, 2007 at 01:01:16PM -0800, Jim Chow wrote:
 On Tue, 6 Mar 2007, Herbert Xu wrote:
  msg_flags [...] its initial value is not used.
 
 That's not true.  Please see my post.
 
 Specifically, at linux-2.6.20.1/net/ipv4/udp.c:843, it seems that 
 udp_recvmsg() can randomly ignore whether the HW has computed a checksum 
 and compute it in SW redundantly.

Sorry, you're right.  This bug has been there for years.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH/RFC] Use of uninitialized data in udp_recvmsg().

2007-03-05 Thread Herbert Xu
On Tue, Mar 06, 2007 at 10:34:49AM +1100, Herbert Xu wrote:
  
  That's not true.  Please see my post.
  
  Specifically, at linux-2.6.20.1/net/ipv4/udp.c:843, it seems that 
  udp_recvmsg() can randomly ignore whether the HW has computed a checksum 
  and compute it in SW redundantly.
 
 Sorry, you're right.  This bug has been there for years.

Actually I think we should fix UDP regardless of whether we initialise
msg_flags to zero here.  It's just too error-prone to rely on it to not
have MSG_TRUNC set.
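
To make the hazard concrete, here is a simplified model of a receive routine
that ORs a truncation bit into an output flags field and later tests it
without ever clearing it. The struct, the constant name and the function are
hypothetical stand-ins, not the udp_recvmsg() source.

```c
#define FAKE_MSG_TRUNC 0x20u	/* stand-in for the truncation flag bit */

struct fake_msghdr {
	unsigned int msg_flags;	/* meant as an output, but also read below */
};

/*
 * Sets the truncation bit when the datagram does not fit, then branches on
 * it. Returns nonzero if the "truncated" path was taken.
 */
static int took_truncated_path(struct fake_msghdr *msg, int datagram_len,
			       int buffer_len)
{
	if (datagram_len > buffer_len)
		msg->msg_flags |= FAKE_MSG_TRUNC;

	return (msg->msg_flags & FAKE_MSG_TRUNC) != 0;
}
```

When the caller zeroes msg_flags first, a datagram that fits takes the normal
path. When the field holds leftover data with that bit set, the same datagram
takes the truncated path, which is the kind of caller-dependent behaviour
being discussed.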

I'm going to clean this up for UDP and improve the UDP-lite checksum
handling while I'm at it.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Johannes Berg
On Sun, 2007-03-04 at 17:17 -0800, Greg KH wrote:

   This is due to the recent sysfs restructuring I think. IIRC the fix is
   to upgrade hal to a current git version.
  
  If that's the cause, the fix is to back out whatever was done to break
  userspace. Breaking userspace is not ok. Upgrading from 2.6.x to
  2.6.x+1 should not entail replacing substantial parts of userspace,
  especially with NOT-EVEN-FRAKKING-RELEASED-YET CODE. 
 
 I should not have broken any userspace if CONFIG_SYSFS_DEPRECATED is
 enabled with that patch.  If that is enabled, and that patch still
 causes problems, please let me know.

Yeah, sorry for not mentioning that in my original email and the
discussion it stirred.

johannes




Re: [1/6] 2.6.21-rc2: known regressions

2007-03-05 Thread Marcel Holtmann
Hi Greg,

  Subject: Bluetooth RFComm locks up the machine  (device_move() related)
  References : http://lkml.org/lkml/2007/3/4/64
  Submitter  : Mark Lord [EMAIL PROTECTED]
  Caused-By  : Marcel Holtmann [EMAIL PROTECTED]
   commit c1a3313698895d8ad4760f98642007bf236af2e8
  Status : unknown
  
  A 2-line patch exists for fs/sysfs/dir.c to address this.
  Waiting on Greg to apply it or substitute something prettier.  ;)
 
 I want to see if Marcel agrees with it, as he did the original patch in
 that area.

I am not deep enough into the sysfs code to tell you whether Mark's change is
correct or not. However, it looks fully reasonable to me. From the higher
level perspective of the device_move() usage the RFCOMM code looks
correct and was tested before I submitted it for inclusion.

Regards

Marcel




Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Theodore Tso
On Sun, Mar 04, 2007 at 05:17:29PM -0800, Greg KH wrote:
 I should not have broken any userspace if CONFIG_SYSFS_DEPRECATED is
 enabled with that patch.  If that is enabled, and that patch still
 causes problems, please let me know.

But we still need to update the help text for CONFIG_SYSFS_DEPRECATED to
make it clear that its deprecation schedule still needs to be 2009 to
2011 (depending on whether we want to accommodate Debian's glacial
release schedule).  Certainly the 2006 date which is currently there
simply isn't accurate.

- Ted


Re: TCP 2MSL on loopback

2007-03-05 Thread Eric Dumazet
On Monday 05 March 2007 12:20, Howard Chu wrote:
 Why is the Maximum Segment Lifetime a global parameter? Surely the
 maximum possible lifetime of a particular TCP segment depends on the
 actual connection. At the very least, it would be useful to be able to
 set it on a per-interface basis. E.g., in the case of the loopback
 interface, it would be useful to be able to set it to a very small
 duration.

Hi Howard

I think you should address these questions on netdev instead of linux-kernel.


 As I note in this draft
 http://www.ietf.org/internet-drafts/draft-chu-ldap-ldapi-00.txt
 when doing a connection soak test of OpenLDAP using clients connected
 through localhost, the entire port range is exhausted in well under a
 second, at which point the test stalls until a port comes out of
 TIME_WAIT state so the next connection can be opened.

 These days it's not uncommon for an OpenLDAP slapd server to handle tens
 of thousands of connections per second in real use (e.g., at Google, or
 at various telcos). While the LDAP server is fast enough to saturate
 even 10gbit ethernet using contemporary CPUs, we have to resort to
 multiple virtual interfaces just to make sure we have enough port
 numbers available.


I don't understand... doesn't the slapd server listen for connections on a
given port, like http? Or does it open connections itself, like an ftp server?

Of course, if you want to open more than 60,000 concurrent connections using
the 127.0.0.1 address, you might have a problem...

 Ideally the 2MSL parameter would be dynamically adjusted based on the
 route to the destination and the weights associated with those routes.
 In the simplest case, connections between machines on the same subnet
 (i.e., no router hops involved) should have a much smaller default value
 than connections that traverse any routers. I'd settle for a two-level
 setting - with no router hops, use the small value; with any router hops
 use the large value.

Well, is it really an MSL problem?

I did a small test (linux-2.6.21-rc1) and was able to get 1,000,000
connections on localhost on my dual-processor machine in one minute, without
an error.
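
One way to frame the question is the bound implied by the port-exhaustion argument: if every closed connection kept one local port unavailable for the whole TIME_WAIT period, the sustainable rate to a single destination address and port could not exceed the number of usable ports divided by that period. The sketch below only does that arithmetic. The helper name and the sample figures in the note after it are illustrative assumptions, not measurements from this thread, and the test result reported above suggests the premise does not always hold in practice.

```c
#include <stdint.h>

/*
 * Hypothetical helper: upper bound on new connections per second to one
 * destination (address, port) pair, ASSUMING each closed connection keeps
 * its local port unavailable for time_wait_secs seconds.
 */
uint64_t max_conn_rate(uint32_t port_lo, uint32_t port_hi,
                       uint32_t time_wait_secs)
{
    /* Reject an empty port range or a zero period. */
    if (port_hi < port_lo || time_wait_secs == 0)
        return 0;

    /* Number of ports in the inclusive range, divided by the hold time. */
    return ((uint64_t)port_hi - port_lo + 1) / time_wait_secs;
}
```

With an assumed local port range of 32768 to 61000 and an assumed 60 second hold time, max_conn_rate(32768, 61000, 60) gives the bound under that premise.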



RE: [PATCH] xfrm_policy delete security check misplaced

2007-03-05 Thread Venkat Yekkirala
 @@ -2552,7 +2550,7 @@ static int pfkey_spdget(struct sock *sk, struct sk_buff *skb, struct sadb_msg *h
   return -EINVAL;
  
   xp = xfrm_policy_byid(XFRM_POLICY_TYPE_MAIN, dir, pol->sadb_x_policy_id,
 -   hdr->sadb_msg_type == SADB_X_SPDDELETE2);
 +   hdr->sadb_msg_type == SADB_X_SPDDELETE2, &err);
   if (xp == NULL)
   return -ENOENT;
I guess you meant to do this here?
	else if (err)
		return err;

Also, [Joy cc'd] deletions here needn't be audited?
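
The check suggested above can be sketched in isolation. This is a minimal illustration, not the pfkey or xfrm code: struct policy, policy_byid(), policy_get() and the deny_delete flag are all hypothetical stand-ins for a lookup that returns a pointer and reports a separate error through an out-parameter.

```c
#include <errno.h>
#include <stddef.h>

struct policy { int id; };              /* stand-in for struct xfrm_policy */

static struct policy the_policy = { 42 };

/*
 * Hypothetical lookup: returns the policy (or NULL if absent) and reports
 * a permission failure for delete requests through *err.
 */
struct policy *policy_byid(int id, int delete, int deny_delete, int *err)
{
    *err = 0;
    if (id != the_policy.id)
        return NULL;
    if (delete && deny_delete)
        *err = -EPERM;
    return &the_policy;
}

/* Caller in the style suggested above: check the pointer, then err. */
int policy_get(int id, int delete, int deny_delete)
{
    int err;
    struct policy *xp = policy_byid(id, delete, deny_delete, &err);

    if (xp == NULL)
        return -ENOENT;
    else if (err)
        return err;
    return 0;
}
```

The point of the second test is that a non-NULL return does not by itself mean the operation was permitted.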


RE: [PATCH] xfrm_policy delete security check misplaced

2007-03-05 Thread Venkat Yekkirala

 Also, [Joy cc'd] deletions here needn't be audited?

OK, I see the next patch addressed this :)


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Tomasz Torcz
On Mon, Mar 05, 2007 at 01:13:26AM -0600, Matt Mackall wrote:
 On Sun, Mar 04, 2007 at 11:02:48PM -0800, Greg KH wrote:
  On Mon, Mar 05, 2007 at 12:42:29AM -0600, Matt Mackall wrote:
   On Sun, Mar 04, 2007 at 05:16:25PM -0800, Greg KH wrote:
On Sun, Mar 04, 2007 at 04:08:57PM -0600, Matt Mackall wrote:
 Recent kernels are having troubles with wireless for me. Two seemingly
 related problems:
 
 a) NetworkManager seems oblivious to the existence of my IPW2200
 b) Manual iwconfig waits for 60s and then reports:
 
 Error for wireless request Set Encode (8B2A) :
 SET failed on device eth1 ; Operation not supported.

Do you have CONFIG_SYSFS_DEPRECATED enabled?  If not, please do as that
will keep you from having to change any userspace code.
   
   No, it's disabled. Will test once I'm done tracking down the iwconfig
   problem. From the help text for SYSFS_DEPRECATED:
   
 If you are using a distro that was released in 2006 or
   later, it should be safe to say N here.
   
   If we need an as-yet-unreleased HAL without it, I would say the above
   should be changed to 2008 or so. If Debian actually cuts a release in
   the next few months, you might make that 2010.
  
  Well, just because Debian has such a slow release cycle, should the rest
  of the world be forced to follow suit?  :)
  
  When I originally wrote that, I thought Debian would have already done
  their release, my mistake...
 
 That's not the point. The point is that Debian/unstable as of _this
 morning_ doesn't work. For reference, I'm running both the latest
 releases of both hal (0.5.8.1-6.1) and network-manager (0.6.4-6). And
 there are people telling me I need a copy of HAL out of git that
 hasn't even been released for Debian to package. Debian isn't the
 problem here.

  hal 0.5.9-rc1 (released, not from git) should work. It will probably
be released soon and picked up by sane distributions. Debian is a very
irritating corner case.

-- 
Tomasz Torcz            Only gods can safely risk perfection,
[EMAIL PROTECTED]       it's a dangerous thing for a man.  -- Alia



RE: [PATCH] xfrm_policy delete security check misplaced

2007-03-05 Thread Venkat Yekkirala
 
 Signed-off-by: Eric Paris [EMAIL PROTECTED]
Acked-by: Venkat Yekkirala [EMAIL PROTECTED] 


RE: [PATCH] Add xfrm policy change auditing to pfkey_spdget

2007-03-05 Thread Venkat Yekkirala

 pfkey_spdget neither had an LSM security hook nor auditing for the
 removal of xfrm_policy structs.  The security hook was added 
 when it was
 moved into xfrm_policy_byid instead of the callers to that function by
 my earlier patch and this patch adds the auditing hooks as well.
 
 Signed-off-by: Eric Paris [EMAIL PROTECTED]
Acked-by: Venkat Yekkirala [EMAIL PROTECTED]  


RE: [PATCH] xfrm audit hook misplaced in pfkey_delete and xfrm_del_sa

2007-03-05 Thread Venkat Yekkirala
 Inside pfkey_delete and xfrm_del_sa the audit hooks were not called if
 there was any permission/security failures in attempting to do the del
 operation (such as permission denied from security_xfrm_state_delete).
 This patch moves the audit hook to the exit path such that 
 all failures
 (and successes) will actually get audited.

Not sure ALL failures are being audited this way elsewhere, but I guess
they will catch up in due course.

 
 Signed-off-by: Eric Paris [EMAIL PROTECTED]
Acked-by: Venkat Yekkirala [EMAIL PROTECTED] 


RE: [PATCH] xfrm_policy delete security check misplaced

2007-03-05 Thread James Morris
On Mon, 5 Mar 2007, Venkat Yekkirala wrote:

  
  Signed-off-by: Eric Paris [EMAIL PROTECTED]
 Acked-by: Venkat Yekkirala [EMAIL PROTECTED] 

What about your previous comment:

 I guess you meant to do this here?
else if (err)
return err; 




-- 
James Morris
[EMAIL PROTECTED]



RE: [PATCH] xfrm_policy delete security check misplaced

2007-03-05 Thread Venkat Yekkirala
   
   Signed-off-by: Eric Paris [EMAIL PROTECTED]
  Acked-by: Venkat Yekkirala [EMAIL PROTECTED] 
 
 What about your previous comment:
 
  I guess you meant to do this here?
 else if (err)
 return err; 

I saw that this was taken care of in patch-2 for the delete case, but
while err isn't currently applicable to the non-delete case, it would
be proper/complete for err to still be handled for the non-delete case.
Thanks for asking.


Re: [NFS] [PATCH 001 of 3] knfsd: Use recv_msg to get peer address for NFSD instead of code-copying

2007-03-05 Thread Olaf Kirch
On Friday 02 March 2007 05:28, NeilBrown wrote:
 The sunrpc server code needs to know the source and destination address
 for UDP packets so it can reply properly.
 It currently copies code out of the network stack to pick the pieces out
 of the skb.
 This is ugly and causes compile problems with the IPv6 stuff.

... and this IPv6 code could never have worked anyway:


   case AF_INET6: {
...
 - rqstp->rq_addrlen = sizeof(struct sockaddr_in);
... this should have been sizeof(struct sockaddr_in6)...

 - /* Remember which interface received this request */
 - ipv6_addr_copy(&rqstp->rq_daddr.addr6,
 - &skb->nh.ipv6h->saddr);
... and this should have copied from daddr, not saddr.

But I find using recvmsg just for getting at the addresses
a little awkward too. And I think to be on the safe side, you
should check that you're really looking at a PKTINFO cmsg
rather than something else.
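
As a user-space illustration of that check (a sketch only, not the sunrpc code; find_pktinfo() is a hypothetical name), the control messages can be walked with the standard CMSG macros and the payload accepted only when the level, type and length all match IP_PKTINFO:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* for struct in_pktinfo on glibc */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Return the IP_PKTINFO payload found in msg, or NULL if none is present. */
struct in_pktinfo *find_pktinfo(struct msghdr *msg)
{
    struct cmsghdr *cmh;

    for (cmh = CMSG_FIRSTHDR(msg); cmh != NULL; cmh = CMSG_NXTHDR(msg, cmh)) {
        /* Accept only an IPv4 PKTINFO message that is large enough. */
        if (cmh->cmsg_level == IPPROTO_IP &&
            cmh->cmsg_type == IP_PKTINFO &&
            cmh->cmsg_len >= CMSG_LEN(sizeof(struct in_pktinfo)))
            return (struct in_pktinfo *)CMSG_DATA(cmh);
    }
    return NULL;
}
```

A caller that gets NULL back can then log an error and drop the packet rather than read an unrelated control message as an address.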

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Greg KH
On Mon, Mar 05, 2007 at 07:59:50AM -0500, Theodore Tso wrote:
 On Sun, Mar 04, 2007 at 05:17:29PM -0800, Greg KH wrote:
  I should not have broken any userspace if CONFIG_SYSFS_DEPRECATED is
  enabled with that patch.  If that is enabled, and that patch still
  causes problems, please let me know.
 
 But we still need to update the help text for CONFIG_SYSFS_DEPRECATED to
 make it clear that its deprecation schedule still needs to be 2009 to
 2011 (depending on whether we want to accommodate Debian's glacial
 release schedule).  Certainly the 2006 date which is currently there
 simply isn't accurate.

Ok, how about the following patch.  Is it acceptable to everyone?

thanks,

greg k-h

---
 init/Kconfig |   13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

--- gregkh-2.6.orig/init/Kconfig
+++ gregkh-2.6/init/Kconfig
@@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
  that belong to a class, back into the /sys/class heirachy, in
  order to support older versions of udev.
 
- If you are using a distro that was released in 2006 or later,
- it should be safe to say N here.
+ If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
+ release from 2007 or later, it should be safe to say N here.
+
+ If you are using Debian or other distros that are slow to
+ update HAL, please say Y here.
+
+ If you have any problems with devices not being found properly
+ from userspace programs, and this option is disabled, say Y
+ here.
+
+ If you are unsure about this at all, say Y.
 
 config RELAY
	bool "Kernel-user space relay support (formerly relayfs)"


Re: [NFS] [PATCH 001 of 3] knfsd: Use recv_msg to get peer address for NFSD instead of code-copying

2007-03-05 Thread Olaf Kirch

Hi Neil,

here's another minor comment:

On Friday 02 March 2007 05:28, NeilBrown wrote:
 +static inline void svc_udp_get_dest_address(struct svc_rqst *rqstp,
 + struct cmsghdr *cmh)
  {
   switch (rqstp->rq_sock->sk_sk->sk_family) {
   case AF_INET: {
 + struct in_pktinfo *pki = CMSG_DATA(cmh);
 + rqstp->rq_daddr.addr.s_addr = pki->ipi_spec_dst.s_addr;
   break;
 + }
...

The daddr that is extracted here will only ever be used to build
another PKTINFO cmsg when sending the reply. So it would be
much easier to just store the raw control message in the svc_rqst,
without looking at its contents, and send it out along with the reply,
unchanged.

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Matt Mackall
On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:
 Ok, how about the following patch.  Is it acceptable to everyone?
 
 -   If you are using a distro that was released in 2006 or later,
 -   it should be safe to say N here.
 +   If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
 +   release from 2007 or later, it should be safe to say N here.
 +
 +   If you are using Debian or other distros that are slow to
 +   update HAL, please say Y here.

What HAL version do you think Debian ought to have, pray tell? And
what the hell version do those other distros have?

The last HAL release was 0.5.8 on 11-Sep-2006. It showed up in
Debian/unstable on 2-Oct. There have been six Debian bugfix releases,
the most recent on 12-Feb.

http://people.freedesktop.org/~david/dist/
http://packages.debian.org/changelogs/pool/main/h/hal/hal_0.5.8.1-6.1/changelog

The last NetworkManager is 0.6.4 released 13-Jul-2006. It showed up in
Debian/unstable on 8-Aug. There have been five bugfix releases, the
most recent on 30-Nov.

http://ftp.gnome.org/pub/GNOME/sources/NetworkManager/0.6/
http://packages.debian.org/changelogs/pool/main/n/network-manager/network-manager_0.6.4-6/changelog

Debian is NOT the problem.

-- 
Mathematics is the supreme nostalgia of our time.


Re: [NFS] [PATCH 001 of 3] knfsd: Use recv_msg to get peer address for NFSD instead of code-copying

2007-03-05 Thread Neil Brown
On Monday March 5, [EMAIL PROTECTED] wrote:
 
 Hi Neil,
 
 here's another minor comment:
 
 On Friday 02 March 2007 05:28, NeilBrown wrote:
  +static inline void svc_udp_get_dest_address(struct svc_rqst *rqstp,
  +   struct cmsghdr *cmh)
   {
  switch (rqstp->rq_sock->sk_sk->sk_family) {
  case AF_INET: {
  +   struct in_pktinfo *pki = CMSG_DATA(cmh);
  +   rqstp->rq_daddr.addr.s_addr = pki->ipi_spec_dst.s_addr;
  break;
  +   }
 ...
 
 The daddr that is extracted here will only ever be used to build
 another PKTINFO cmsg when sending the reply. So it would be
 much easier to just store the raw control message in the svc_rqst,
 without looking at its contents, and send it out along with the reply,
 unchanged.

Yes, sounds tempting, doesn't it?
Unfortunately it isn't that simple as I found out when the sunrpc code
in glibc did exactly that.

You see sendmsg will use the interface-number as well as the source
address from the PKTINFO structure.

Suppose my server has two interfaces (A and B) on two subnets that
both are connected to some router which is connected to a third subnet
that my client is on.  Further, suppose my server has only one default
route, out interface A.
The client chooses the IP address of interface B and sends a request.
It arrives on interface B and is processed.
If the PKTINFO received is passed unchanged to sendmsg, the packet will
be sent out interface B.  But interface B doesn't have a route to
that client, so the packet is dropped.

This is exactly what was happening for me with mountd a few years ago.

So yes, we could just zero the interface field, but I think it is
clearer to extract the wanted data, then re-insert it.  They really
are different structures with different meanings (send versus receive)
which happen to have the same layout.
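
A user-space sketch of that extract-and-re-insert step, under the assumption that only the local address should carry over to the reply (make_reply_pktinfo() is a hypothetical name, not a function from the patch):

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* for struct in_pktinfo on glibc */
#endif
#include <string.h>
#include <netinet/in.h>

/*
 * Build the IP_PKTINFO to hand to sendmsg() from the one recvmsg() returned:
 * keep the local address the request was sent to as the reply's source, but
 * clear the interface index so normal routing picks the outgoing interface.
 */
void make_reply_pktinfo(const struct in_pktinfo *rx, struct in_pktinfo *tx)
{
    memset(tx, 0, sizeof(*tx));
    tx->ipi_spec_dst = rx->ipi_spec_dst;   /* source address for the reply */
    tx->ipi_ifindex = 0;                   /* do not pin the interface */
}
```

Building a fresh structure this way makes it explicit that the receive-side interface index is deliberately not reused on the send side.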

Thanks,
NeilBrown



RE: [PATCH] xfrm_policy delete security check misplaced

2007-03-05 Thread Eric Paris
On Mon, 2007-03-05 at 11:39 -0500, James Morris wrote:
 On Mon, 5 Mar 2007, Venkat Yekkirala wrote:
 
   
   Signed-off-by: Eric Paris [EMAIL PROTECTED]
  Acked-by: Venkat Yekkirala [EMAIL PROTECTED] 
 
 What about your previous comment:
 
  I guess you meant to do this here?
 else if (err)
 return err; 

That also gets taken care of in the pfkey_spdget cleanup in a later
patch.  The return isn't in the same place Venkat suggested; it instead
happens inside the new if (delete) block.  (err is only non-zero on
delete operations, so there is no need to check it otherwise.)

-Eric



Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Matt Mackall
On Mon, Mar 05, 2007 at 04:46:09PM +0100, Tomasz Torcz wrote:
  That's not the point. The point is that Debian/unstable as of _this
  morning_ doesn't work. For reference, I'm running both the latest
  releases of both hal (0.5.8.1-6.1) and network-manager (0.6.4-6). And
  there are people telling me I need a copy of HAL out of git that
  hasn't even been released for Debian to package. Debian isn't the
  problem here.
 
   hal 0.5.9-rc1 (released, not from git) should work. It will probably
 be released soon and picked up by sane distributions. Debian is a very
 irritating corner case.

Presumably the -rc1 stands for release candidate. Which means not
yet released. And when did it show up? 04-Mar-2007 at 18:31. That's
right, YESTERDAY. Almost a full month after Greg's commit.

For the last time, DEBIAN IS NOT THE PROBLEM.

-- 
Mathematics is the supreme nostalgia of our time.


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Joel Becker
On Mon, Mar 05, 2007 at 04:46:09PM +0100, Tomasz Torcz wrote:
 On Mon, Mar 05, 2007 at 01:13:26AM -0600, Matt Mackall wrote:
  That's not the point. The point is that Debian/unstable as of _this
  morning_ doesn't work. For reference, I'm running both the latest
  releases of both hal (0.5.8.1-6.1) and network-manager (0.6.4-6). And
  there are people telling me I need a copy of HAL out of git that
  hasn't even been released for Debian to package. Debian isn't the
  problem here.
 
   hal 0.5.9-rc1 (released, not from git) should work. It will probably
 be released soon and picked up by sane distributions. Debian is a very
 irritating corner case.

As of right now, Fedora Core 6 has hal-0.5.8.1-6.fc6.  This is also
too old.  Please, stop claiming that Debian unstable is some corner
case.  No one is talking about Debian stable here.  No one is talking
about the Enterprise versions of Red Hat or SuSE (you'd find them just
as irritating with modern kernels).  Debian unstable tracks released
code as fast or faster than Fedora and OpenSuSE.  They all keep up with
releases.
But the last release of hal is 0.5.8.1.  _Release_, not release
candidate.  You can't break that.  You can't break it for a while, if
you want a sane deprecation schedule.  These are userspace interfaces.
Matt is absolutely correct that you shouldn't deprecate a
userspace-kernel interface before you've even provided a release of
the tool that detects the change!

Joel

-- 

When ideas fail, words come in very handy. 
 - Goethe

Joel Becker
Principal Software Developer
Oracle
E-mail: [EMAIL PROTECTED]
Phone: (650) 506-8127


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Greg KH
On Mon, Mar 05, 2007 at 01:55:30PM -0600, Matt Mackall wrote:
 On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:
  Ok, how about the following patch.  Is it acceptable to everyone?
  
  - If you are using a distro that was released in 2006 or later,
  - it should be safe to say N here.
  + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
  + release from 2007 or later, it should be safe to say N here.
  +
  + If you are using Debian or other distros that are slow to
  + update HAL, please say Y here.
 
 What HAL version do you think Debian ought to have, pray tell? And
 what the hell version do those other distros have?
 
 The last HAL release was 0.5.8 on 11-Sep-2006. It showed up in
 Debian/unstable on 2-Oct. There have been six Debian bugfix releases,
 the most recent on 12-Feb.
 
 http://people.freedesktop.org/~david/dist/
 http://packages.debian.org/changelogs/pool/main/h/hal/hal_0.5.8.1-6.1/changelog

Ok, I only named HAL as that is what people have told me the problem is.
I have been running this change on my boxes, without
CONFIG_SYSFS_DEPRECATED, since last July or so.

I don't use NetworkManager here for the most part, but I have tried
this in the OpenSuSE 10.3 alpha releases and it seems to work just fine
with whatever version of NetworkManager it uses.

So perhaps it's some wrapper scripts somewhere?  I think SuSE had some
odd things hard coded somewhere that prevented 10.1 from working
properly with this change.

Ok, so I'll drop the HAL wording above, what should I say instead?

thanks,

greg k-h


RE: [PATCH] s2io: add PCI error recovery support

2007-03-05 Thread Ramkrishna Vepa
Comments on this patch -

1. device_close_flag is unused and is not required.
 +static pci_ers_result_t s2io_io_error_detected(struct pci_dev *pdev,
 +   pci_channel_state_t state)
 +{
...
 + do_s2io_card_down(sp, 0);
 + sp->device_close_flag = TRUE;   /* Device is shut down. */

2. s2io_reset() can fail to reset the device. Ideally it should return a
failure in that case (it returns void now). Could s2io_io_slot_reset()
then be retried, perhaps up to three times in total, before giving up on
resetting the slot?
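
The retry idea in point 2 could look roughly like the sketch below. Everything in it is hypothetical: struct fake_nic and fake_reset() stand in for the device and for an s2io_reset() changed to return a status, and the limit of three attempts is simply the figure floated above.

```c
#include <stddef.h>

#define RESET_ATTEMPTS 3

/* Hypothetical device model: reset fails until fail_count reaches zero. */
struct fake_nic {
    int fail_count;
    int attempts;
};

/* Stand-in for a reset routine that reports failure instead of void. */
int fake_reset(struct fake_nic *nic)
{
    nic->attempts++;
    if (nic->fail_count > 0) {
        nic->fail_count--;
        return -1;
    }
    return 0;
}

/* Try the reset up to RESET_ATTEMPTS times; 0 on success, -1 otherwise. */
int reset_with_retry(struct fake_nic *nic)
{
    int i;

    for (i = 0; i < RESET_ATTEMPTS; i++) {
        if (fake_reset(nic) == 0)
            return 0;
    }
    return -1;
}
```

In a slot_reset handler, a failure from the final attempt would be the point at which to report that the slot could not be recovered.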

Ram
 -Original Message-
 From: Linas Vepstas [mailto:[EMAIL PROTECTED]
 Sent: Thursday, February 15, 2007 3:09 PM
 To: Ramkrishna Vepa; Raghavendra Koushik; Ananda Raju
 Cc: Wen Xiong; linux-kernel@vger.kernel.org; linux-
 [EMAIL PROTECTED]; netdev@vger.kernel.org; Jeff Garzik; Andrew Morton
 Subject: [PATCH] s2io: add PCI error recovery support
 
 
 Koushik, Raju,
 
 Please review, comment, and if you find this acceptable,
 please forward upstream. This patch incorporates all of
 fixes resulting from the last set of discussions, circa
 November 2006.
 
 --linas
 
 This patch adds PCI error recovery support to the
 s2io 10-Gigabit ethernet device driver. Fourth revision,
 blocks interrupts and the watchdog. Adds a flag to
 s2io_down(), to avoid doing I/O when PCI bus is offline.
 
 Tested, seems to work well.
 
 Signed-off-by: Linas Vepstas [EMAIL PROTECTED]
 Acked-by: Ramkrishna Vepa [EMAIL PROTECTED]
 Cc: Raghavendra Koushik [EMAIL PROTECTED]
 Cc: Ananda Raju [EMAIL PROTECTED]
 Cc: Wen Xiong [EMAIL PROTECTED]
 
 
  drivers/net/s2io.c |  116 ++---
  drivers/net/s2io.h |5 ++
  2 files changed, 116 insertions(+), 5 deletions(-)
 
 Index: linux-2.6.20-git4/drivers/net/s2io.c
 ===
 --- linux-2.6.20-git4.orig/drivers/net/s2io.c 2007-02-15 15:39:35.0 -0600
 +++ linux-2.6.20-git4/drivers/net/s2io.c  2007-02-15 16:15:10.0 -0600
 @@ -435,11 +435,18 @@ static struct pci_device_id s2io_tbl[] _
 
  MODULE_DEVICE_TABLE(pci, s2io_tbl);
 
 +static struct pci_error_handlers s2io_err_handler = {
 + .error_detected = s2io_io_error_detected,
 + .slot_reset = s2io_io_slot_reset,
 + .resume = s2io_io_resume,
 +};
 +
  static struct pci_driver s2io_driver = {
   .name = "S2IO",
   .id_table = s2io_tbl,
   .probe = s2io_init_nic,
   .remove = __devexit_p(s2io_rem_nic),
 + .err_handler = &s2io_err_handler,
  };
 
  /* A simplifier macro used both by init and free shared_mem Fns(). */
 @@ -2577,6 +2584,9 @@ static void s2io_netpoll(struct net_devi
   u64 val64 = 0xULL;
   int i;
 
 + if (pci_channel_offline(nic->pdev))
 + return;
 +
   disable_irq(dev->irq);
 
   atomic_inc(&nic->isr_cnt);
 @@ -3079,6 +3089,8 @@ static void alarm_intr_handler(struct s2
   int i;
   if (atomic_read(&nic->card_state) == CARD_DOWN)
   return;
 + if (pci_channel_offline(nic->pdev))
 + return;
   nic->mac_control.stats_info->sw_stat.ring_full_cnt = 0;
   /* Handling the XPAK counters update */
   if(nic->mac_control.stats_info->xpak_stat.xpak_timer_count < 72000) {
 @@ -4117,6 +4129,10 @@ static irqreturn_t s2io_isr(int irq, voi
   struct mac_info *mac_control;
   struct config_param *config;
 
 + /* Pretend we handled any irq's from a disconnected card */
 + if (pci_channel_offline(sp->pdev))
 + return IRQ_NONE;
 +
   atomic_inc(&sp->isr_cnt);
   mac_control = &sp->mac_control;
   config = &sp->config;
 @@ -6188,7 +6204,7 @@ static void s2io_rem_isr(struct s2io_nic
   } while(cnt < 5);
  }
 
 -static void s2io_card_down(struct s2io_nic * sp)
 +static void do_s2io_card_down(struct s2io_nic * sp, int do_io)
  {
   int cnt = 0;
   struct XENA_dev_config __iomem *bar0 = sp->bar0;
 @@ -6203,7 +6219,8 @@ static void s2io_card_down(struct s2io_n
   atomic_set(&sp->card_state, CARD_DOWN);
 
   /* disable Tx and Rx traffic on the NIC */
 - stop_nic(sp);
 + if (do_io)
 + stop_nic(sp);
 
   s2io_rem_isr(sp);
 
 @@ -6211,7 +6228,7 @@ static void s2io_card_down(struct s2io_n
   tasklet_kill(&sp->task);
 
   /* Check if the device is Quiescent and then Reset the NIC */
 - do {
 + while(do_io) {
   /* As per the HW requirement we need to replenish the
    * receive buffer to avoid the ring bump. Since there is
    * no intention of processing the Rx frame at this pointwe are
 @@ -6236,8 +6253,9 @@ static void s2io_card_down(struct s2io_n
 (unsigned long long) val64);
   break;
   }
 - } while (1);
 - s2io_reset(sp);
 + }
 + if (do_io)
 + s2io_reset(sp);
 
   

Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Jeffrey Hundstad

Greg KH wrote:

On Mon, Mar 05, 2007 at 07:59:50AM -0500, Theodore Tso wrote:
  
Ok, how about the following patch.  Is it acceptable to everyone?


thanks,

greg k-h

---
 init/Kconfig |   13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

--- gregkh-2.6.orig/init/Kconfig
+++ gregkh-2.6/init/Kconfig
@@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
  that belong to a class, back into the /sys/class heirachy, in
  order to support older versions of udev.
 
- If you are using a distro that was released in 2006 or later,
- it should be safe to say N here.
+ If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
+ release from 2007 or later, it should be safe to say N here.
+
+ If you are using Debian or other distros that are slow to
+ update HAL, please say Y here.
+
+ If you have any problems with devices not being found properly
+ from userspace programs, and this option is disabled, say Y
+ here.
+
+ If you are unsure about this at all, say Y.
 
 config RELAY
	bool "Kernel-user space relay support (formerly relayfs)"


Since it appears you're trying to offend people with this patch, it 
would seem appropriate to call someone's mother a bad name.  This may 
be in the style guide; perhaps I should submit a patch.


--
Jeffrey Hundstad
PS: Humor (really!) relax.



[2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Adrian Bunk
On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:
 
 Ok, how about the following patch.  Is it acceptable to everyone?
 
 thanks,
 
 greg k-h
 
 ---
  init/Kconfig |   13 +++--
  1 file changed, 11 insertions(+), 2 deletions(-)
 
 --- gregkh-2.6.orig/init/Kconfig
 +++ gregkh-2.6/init/Kconfig
 @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
 that belong to a class, back into the /sys/class heirachy, in
 order to support older versions of udev.
  
 -   If you are using a distro that was released in 2006 or later,
 -   it should be safe to say N here.
 +   If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
 +   release from 2007 or later, it should be safe to say N here.
 +
 +   If you are using Debian or other distros that are slow to
 +   update HAL, please say Y here.
...

The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally 
for all users, and schedule its removal for mid-2008 (or later).

12 months after the first _release_ of a HAL that can live without it seems 
to be the earliest point at which we can consider getting rid of it, since 
all distributions with at least one release a year should ship it by then.

Currently, SYSFS_DEPRECATED is only a trap for users.

Suggested patch below.

cu
Adrian


--  snip  --


unconditionally enable SYSFS_DEPRECATED

This patch unconditionally enables SYSFS_DEPRECATED and schedules its
removal for July 2008.

Currently, SYSFS_DEPRECATED is only a trap for users accidentally
disabling it.

In July 2008, all distributions with at least one release a year should
be able to run without SYSFS_DEPRECATED.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index c3b1430..b0bce93 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -316,3 +316,13 @@ Why:   The option/code is
 Who:   Johannes Berg [EMAIL PROTECTED]
 
 ---
+
+What:  deprecated sysfs files (CONFIG_SYSFS_DEPRECATED)
+When:  July 2008
+Why:   None of these features or values should be used any longer,
+   as they export driver core implementation details to userspace
+   or export properties which can't be kept stable across kernel
+   releases.
+Who:   Greg KH [EMAIL PROTECTED]
+
+---
diff --git a/init/Kconfig b/init/Kconfig
index f977086..f652b6f 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -274,24 +274,9 @@ config CPUSETS
  Say N if unsure.
 
 config SYSFS_DEPRECATED
-   bool "Create deprecated sysfs files"
+   bool
default y
help
- This option creates deprecated symlinks such as the
- device-link, the subsystem:name-link, and the
- bus-link. It may also add deprecated key in the
- uevent environment.
- None of these features or values should be used today, as
- they export driver core implementation details to userspace
- or export properties which can't be kept stable across kernel
- releases.
-
- If enabled, this option will also move any device structures
- that belong to a class, back into the /sys/class heirachy, in
- order to support older versions of udev.
-
- If you are using a distro that was released in 2006 or later,
- it should be safe to say N here.
 
 config RELAY
	bool "Kernel-user space relay support (formerly relayfs)"


Re: [NFS] [PATCH 001 of 3] knfsd: Use recv_msg to get peer address for NFSD instead of code-copying

2007-03-05 Thread Neil Brown
On Monday March 5, [EMAIL PROTECTED] wrote:
 On Friday 02 March 2007 05:28, NeilBrown wrote:
  The sunrpc server code needs to know the source and destination address
  for UDP packets so it can reply properly.
  It currently copies code out of the network stack to pick the pieces out
  of the skb.
  This is ugly and causes compile problems with the IPv6 stuff.
 
 ... and this IPv6 code could never have worked anyway:

:-(
It's hard to test the IPv6 server until we have an IPv6 client I
guess, so thanks for the code review, even though we aren't going to
end up using that code...

 
 But I find using recvmsg just for getting at the addresses
 a little awkward too.

Do you?  It's surely a lot better than code duplication, and it is
exactly how you would get the information from user-space.

   And I think to be on the safe side, you
 should check that you're really looking at a PKTINFO cmsg
 rather than something else.

Maybe.
But is there really a chance that it might not be PKTINFO?
And what do you do if it isn't?
Log an error and drop the packet I guess.

I'll see what I can do.

NeilBrown
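
For reference, the user-space equivalent of "recvmsg plus a PKTINFO check" looks roughly like the sketch below. It is not the knfsd code: recv_with_dst() is a name made up for this example, it handles IPv4 only, and it assumes the caller has already enabled IP_PKTINFO on the socket with setsockopt(). If no matching control message is found it returns -1, which leaves the caller free to log an error and drop the packet.

```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Receive one datagram on fd and report the address it was sent to.
 * Returns the payload length, or -1 on error or when no IP_PKTINFO
 * control message is attached to the datagram.
 */
static ssize_t recv_with_dst(int fd, void *buf, size_t len,
			     struct sockaddr_in *src, struct in_addr *dst)
{
	union {
		char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
		struct cmsghdr align;	/* forces correct alignment */
	} ctl;
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	struct msghdr msg;
	struct cmsghdr *cm;
	ssize_t n;

	memset(&msg, 0, sizeof(msg));
	msg.msg_name = src;
	msg.msg_namelen = sizeof(*src);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	n = recvmsg(fd, &msg, 0);
	if (n < 0)
		return -1;

	/* Accept only a cmsg that really is IPPROTO_IP / IP_PKTINFO. */
	for (cm = CMSG_FIRSTHDR(&msg); cm; cm = CMSG_NXTHDR(&msg, cm)) {
		if (cm->cmsg_level == IPPROTO_IP &&
		    cm->cmsg_type == IP_PKTINFO &&
		    cm->cmsg_len >= CMSG_LEN(sizeof(struct in_pktinfo))) {
			struct in_pktinfo pi;

			memcpy(&pi, CMSG_DATA(cm), sizeof(pi));
			*dst = pi.ipi_addr;	/* destination from the IP header */
			return n;
		}
	}
	return -1;	/* something other than PKTINFO: caller should drop */
}
```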


Re: [RFC] div64_64 support

2007-03-05 Thread Stephen Hemminger
On 03 Mar 2007 03:31:52 +0100
Andi Kleen [EMAIL PROTECTED] wrote:

 Stephen Hemminger [EMAIL PROTECTED] writes:
 
  Here is another way to handle the 64 bit divide case.
  It allows full 64 bit divide by adding the support routine
  GCC needs.
 
 Not supplying that was intentional by Linus so that people
 think twice (or more often) before using such expensive
 operations. A plain / looks too innocent.
 
 Is it really needed by CUBIC anyways?  It uses it for getting
 the cubic root, but the algorithm recommended by Hacker's Delight
 (great book) doesn't use any divisions at all. Probably better 
 to use a better algorithm without divisions.
 

I tried the code from Hacker's Delight.
It is cool, but performance is CPU (and data) dependent:

Average # of usecs per operation:

                Hacker  Newton
Pentium 3       68.6    90.4
T2050           98.6    92.0
U1400           450     415
Xeon            70      90
Xeon (newer)    71      78

EM64T           21.8    24.6
AMD64           23.4    32.0

It might be worth the change for code size reduction though.
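
For context, a Newton-Raphson integer cube root needs one 64-by-64 divide per iteration, which is where the div64_64 requirement comes from. The sketch below is a plain user-space illustration, not the tcp_cubic code and not what was benchmarked above; icbrt_newton() and its starting guess are assumptions made for the example. On a 32-bit kernel the `/` on a 64-bit value would have to go through a helper routine.

```c
#include <stdint.h>

/* floor(cbrt(a)) by Newton-Raphson; one 64-by-64 divide per iteration. */
static uint32_t icbrt_newton(uint64_t a)
{
	uint64_t x, y;
	int bits = 0;

	if (a == 0)
		return 0;

	/* Count significant bits so the first guess is >= the true root. */
	for (y = a; y; y >>= 1)
		bits++;
	x = 1ULL << ((bits + 2) / 3);

	for (;;) {
		/* x_next = (2*x + a / x^2) / 3, in integer arithmetic */
		y = (2 * x + a / (x * x)) / 3;
		if (y >= x)	/* no longer decreasing: x is the floor */
			return (uint32_t)x;
		x = y;
	}
}
```

The number of iterations depends on how far the first guess is from the root, so the cost varies with the input value.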


-- 
Stephen Hemminger [EMAIL PROTECTED]


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Greg KH
On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote:
 On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:
  
  Ok, how about the following patch.  Is it acceptable to everyone?
  
  thanks,
  
  greg k-h
  
  ---
   init/Kconfig |   13 +++--
   1 file changed, 11 insertions(+), 2 deletions(-)
  
  --- gregkh-2.6.orig/init/Kconfig
  +++ gregkh-2.6/init/Kconfig
  @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
that belong to a class, back into the /sys/class heirachy, in
order to support older versions of udev.
   
  - If you are using a distro that was released in 2006 or later,
  - it should be safe to say N here.
  + If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
  + release from 2007 or later, it should be safe to say N here.
  +
  + If you are using Debian or other distros that are slow to
  + update HAL, please say Y here.
 ...
 
 The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally 
 for all users, and schedule its removal for mid-2008 (or later).
 
 12 months after the first _release_ of a HAL that can live without seems 
 to be the first time when we can consider getting rid of it, since all 
 distributions with at least one release a year should ship it by then.
 
 Currently, SYSFS_DEPRECATED is only a trap for users.

Huh?

No, again, I've been using this just fine for about 6 months now.

And what about all of the servers not using HAL/NetworkManager?
And what about all of the embedded systems not using either?

So to not allow this to be turned off by people who might want to (we
want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will
other distros released this year), is pretty heavy-handed.

It also will work in OpenSuSE 10.2 which is already released, and I
think Fedora 6, but I've only limited experience with these.

Oh, and Gentoo works just fine, and has been for the past 6 months.

I would just prefer to come up with an acceptable set of wording that
will work to properly warn people.

I proposed one such wording which some people took as a slam against
Debian, which it really was not at all.

Does someone else want to propose some other wording instead?

thanks,

greg k-h


Re: [RFC] div64_64 support

2007-03-05 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Mon, 5 Mar 2007 15:57:14 -0800

 I tried the code from Hacker's Delight.
 It is cool, but performance is CPU (and data) dependent:
 
 Average # of usecs per operation:

Interesting results.

The problem with these algorithms that trade off one or more
multiplies in order to avoid a divide is that they don't
gain anything, and often lose, when both multiplies and
divides are emulated in software.

This is particularly true in this cube-root case from Hacker's
Delight, because it's using 3 multiplies per iteration in place of one
divide per iteration.

Actually, sorry, there is only one real multiply in there since the
other two can be computed using addition and shifts.

Another thing is that the non-Hacker's Delight version iterates
differently for different input values, so the input value space is
very important to consider when comparing these two pieces of code.
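
For readers without the book at hand, the division-free routine under discussion looks roughly like the sketch below. It follows the 32-bit cube root along the lines of the one in Hacker's Delight, written out for user space; it is not the code that was benchmarked, and the name icbrt_hd() is invented here. The term 3*y*(y+1) is where the multiply count comes from: 3*y can be formed as y + (y << 1), leaving one real multiply by (y + 1) per iteration.

```c
#include <stdint.h>

/* floor(cbrt(x)) for 32-bit x, no divisions, always 11 iterations. */
static uint32_t icbrt_hd(uint32_t x)
{
	uint32_t y = 0, b;
	int s;

	for (s = 30; s >= 0; s -= 3) {
		y <<= 1;
		/* (y+1)^3 - y^3 = 3*y*(y+1) + 1; 3*y done as add + shift */
		b = ((y + (y << 1)) * (y + 1) + 1) << s;
		if (x >= b) {
			x -= b;
			y++;
		}
	}
	return y;
}
```

The loop count here is fixed regardless of the input, which is the contrast with the iterative version drawn above.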


Re: [PATCH/RFC] Use of uninitialized data in udp_recvmsg().

2007-03-05 Thread Jim Chow
On Tue, 6 Mar 2007, Herbert Xu wrote:
 It's just too error-prone to rely on it to not have MSG_TRUNC set.

Agreed.

 I'm going to clean this up for UDP and improve the UDP-lite checksum
 handling while I'm at it.

Great.  It'll be good to get this years-old UDP bug fixed.

Thanks,
Jim


[UDP]: Reread uh pointer after pskb_trim

2007-03-05 Thread Herbert Xu
Hi Dave:

[UDP]: Reread uh pointer after pskb_trim

The header may have moved when trimming.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index ce6c460..fc620a7 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1215,6 +1215,7 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct hlist_head udptable[],
 
 	if (ulen < sizeof(*uh) || pskb_trim_rcsum(skb, ulen))
 		goto short_packet;
+	uh = skb->h.uh;
 
 	udp4_csum_init(skb, uh);
 


Re: [UDP]: Reread uh pointer after pskb_trim

2007-03-05 Thread David Miller
From: Herbert Xu [EMAIL PROTECTED]
Date: Tue, 6 Mar 2007 12:00:20 +1100

 Hi Dave:
 
 [UDP]: Reread uh pointer after pskb_trim
 
 The header may have moved when trimming.
 
 Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Good catch, I'll apply this and push to -stable, thanks
Herbert.
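
The bug class is easy to reproduce outside the kernel. The sketch below is a user-space analogy only: toy_buf, toy_hdr, toy_trim() and toy_rcv() are invented for the example and stand in for the skb, the UDP header, pskb_trim_rcsum() and the receive path. The toy trim always moves the data to a fresh allocation so that the effect is deterministic; in the real code, per the changelog, the header only may have moved.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct toy_hdr { uint16_t source, dest, len, check; };
struct toy_buf { unsigned char *data; size_t len; };

/* Trim to len bytes. Always reallocates, so old pointers into data go stale. */
static int toy_trim(struct toy_buf *b, size_t len)
{
	unsigned char *n;

	if (len > b->len)
		return -1;
	n = malloc(len ? len : 1);
	if (!n)
		return -1;
	memcpy(n, b->data, len);
	free(b->data);
	b->data = n;
	b->len = len;
	return 0;
}

/* Returns the datagram length from the header, or -1 for a short packet. */
static int toy_rcv(struct toy_buf *b)
{
	struct toy_hdr *uh;
	size_t ulen;

	if (b->len < sizeof(*uh))
		return -1;
	uh = (struct toy_hdr *)b->data;
	ulen = uh->len;		/* host byte order, to keep the toy simple */

	if (ulen < sizeof(*uh) || toy_trim(b, ulen))
		return -1;
	uh = (struct toy_hdr *)b->data;	/* reread: the old uh is now stale */

	return uh->len;
}
```

Without the second assignment, the final read would go through a pointer into memory that toy_trim() has already freed.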


Re: when having to acquire an SA, ipsec drops the packet

2007-03-05 Thread Joy Latten
From: Joy Latten [EMAIL PROTECTED]
Date: Mon, 05 Feb 2007 14:53:39 -0600

 I can run some tests with this patch and report any results... 

Please check out the two most recent patches I posted:

1) Updated core patch with ipv6 side added.
2) Fix for thinko noticed by Venkat.

I have been testing this a lot in the lspp kernel.
Plan to test also in upstream kernel.
I am seeing a second ACQUIRE occur while establishing the SAs.

My scenario:
My policy states to use both the ESP and AH protocols (may not
make much sense but this was for testing purposes).  I get double 
SAs with only difference being SPI.

Here is what I see happening... 

1. Trigger first ACQUIRE via ping or netperf.

2. xfrm_lookup() calls xfrm_tmpl_resolve(), which calls xfrm_state_find().
   The first time around, an SA has to be established, so a minimal SA
   gets allocated and put in the SAD, a timer is set for the minimal SA
   to be acquired, and km_query() gets called.
   
3. xfrm_tmpl_resolve() returns -EAGAIN, so add_wait_queue(&km_waitq, &wait)
   and the code following it get called to wait for the SA to be established.
   As long as the minimal SA in state XFRM_STATE_ACQ is in the SAD,
   we keep waiting...
   
4. First set of SAs (one for AH and ESP) for IN direction get inserted in SAD.
 
5. Around the time the set of SAs for OUT direction are to be
   inserted into SAD, I see another ACQUIRE happening.
   
   I have not yet figured out where this second ACQUIRE comes from
   and why it happens. As long as the minimal SA or set of valid outgoing
   SAs exist in SAD, an ACQUIRE should not happen.
   The minimal SA does not get removed from the SAD until the set
   of SAs for OUT gets added and the xfrm_state_lock is
   released. And the lock pretty much guarantees no one else can step
   through the SAD until after the new SAs are added...
   and if someone gets the lock to step through the SAD before the OUT SAs
   are added, the minimal SA is still there...

6. Since this second ACQUIRE was able to happen, the result is two identical
   sets of SAs for the traffic stream. The SPIs are the only difference.
 
7. I noticed something while pasting the log info below.
   Perhaps when the outgoing AH SA is added, wake_up(&km_waitq) gets called,
   the lock is released, and the minimal SA is deleted (xfrm_state_add()).
   xfrm_tmpl_resolve() is then called and looks first for the outgoing
   ESP SA. Since that is not there yet and there is no minimal SA, km_query()
   results in an ACQUIRE just before the outgoing ESP SA gets added.

It would explain why I only see it when both ESP and AH are specified...
that is, if I am thinking correctly...
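
The hypothesis in step 7 can be written down as a toy model. This is not the xfrm code: toy_sad, toy_resolve() and toy_state_add() are invented names, and the model assumes exactly what step 7 guesses, namely a single minimal SA that is dropped as soon as any outbound SA is added, and a lookup that checks ESP before AH.

```c
#include <stdbool.h>

enum toy_proto { TOY_AH, TOY_ESP, TOY_NPROTO };

struct toy_sad {
	bool valid[TOY_NPROTO];	/* established outbound SA per protocol */
	bool acq;		/* the minimal SA waiting to be acquired */
	int acquires;		/* how many ACQUIREs were sent */
};

/* Resolve the policy templates, ESP first. False means "-EAGAIN, wait". */
static bool toy_resolve(struct toy_sad *sad)
{
	static const enum toy_proto order[] = { TOY_ESP, TOY_AH };
	unsigned int i;

	for (i = 0; i < sizeof(order) / sizeof(order[0]); i++) {
		if (sad->valid[order[i]])
			continue;
		if (!sad->acq) {	/* no minimal SA: ask the key manager */
			sad->acq = true;
			sad->acquires++;
		}
		return false;
	}
	return true;
}

/* Adding any valid SA removes the minimal SA (the assumed behaviour). */
static void toy_state_add(struct toy_sad *sad, enum toy_proto p)
{
	sad->valid[p] = true;
	sad->acq = false;
}
```

Running resolve, add AH, resolve, add ESP, resolve through this model sends two ACQUIREs, matching the doubled SAs in the log; adding ESP and AH together before the waiter runs sends only one.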

Regards,
Joy Latten

From my log file:

Mar  5 19:10:02 racoon: INFO: initiate new phase 2 negotiation: 9.3.192.210[500]<=>9.3.189.55[500]
Mar  5 19:10:03 racoon: INFO: IPsec-SA established: AH/Transport 9.3.189.55[0]->9.3.192.210[0] spi=137942922(0x838d78a)
Mar  5 19:10:03 racoon: INFO: IPsec-SA established: ESP/Transport 9.3.189.55[0]->9.3.192.210[0] spi=244321490(0xe900cd2)
Mar  5 19:10:03 racoon: INFO: IPsec-SA established: AH/Transport 9.3.192.210[0]->9.3.189.55[0] spi=38721750(0x24ed8d6)
Mar  5 19:10:03 racoon: INFO: initiate new phase 2 negotiation: 9.3.192.210[500]<=>9.3.189.55[500]
Mar  5 19:10:03 racoon: INFO: IPsec-SA established: ESP/Transport 9.3.192.210[0]->9.3.189.55[0] spi=265079770(0xfcccbda)
Mar  5 19:10:05 racoon: INFO: IPsec-SA established: AH/Transport 9.3.189.55[0]->9.3.192.210[0] spi=108627618(0x67986a2)
Mar  5 19:10:05 racoon: INFO: IPsec-SA established: ESP/Transport 9.3.189.55[0]->9.3.192.210[0] spi=182973856(0xae7f5a0)
Mar  5 19:10:05 racoon: INFO: IPsec-SA established: AH/Transport 9.3.192.210[0]->9.3.189.55[0] spi=58486297(0x37c6e19)
Mar  5 19:10:05 racoon: INFO: IPsec-SA established: ESP/Transport 9.3.192.210[0]->9.3.189.55[0] spi=268295215(0xffddc2f)


[RFC PATCH]: Dynamically sized routing cache hash table.

2007-03-05 Thread David Miller

This is essentially a port of Nick Piggin's dcache hash table
patches to the routing cache.  It solves the locking issues
during table grow/shrink that I couldn't handle properly last
time I tried to code up a patch like this.

But one of the core issues of this kind of change still remains.
There is a conflict between the desire of routing cache garbage
collection to reach a state of equilibrium and the hash table
grow code's desire to match the table size to the current state
of affairs.

Actually, more accurately, the conflict exists in how this GC
logic is implemented.  The core issue is that hash table size
guides the GC processing, and hash table growth therefore
modifies those GC goals.  So with the patch below we'll just
keep growing the hash table instead of giving GC some time to
try to keep the working set in equilibrium before doing the
hash grow.

One idea is to put the hash grow check in the garbage collector,
and put the hash shrink check in rt_del().

In fact, it would be a good time to perhaps hack up some entirely
new passive GC logic for the routing cache.

BTW, another thing that plays into this is that Robert's TRASH work
could make this patch not necessary :-)

Finally, I know (due to some of Nick's helpful comments the
other day) that I'm missing some rcu_assign_pointer()'s in here.
Fixes in this area are most welcome.

This patch passes basic testing on UP sparc64, but please handle
with care :)
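
To make the resize trigger concrete: check_nr_rthash() in the patch below appears to grow the table when the entry count exceeds 1.5 times the number of buckets and to shrink it when the count drops below half. The user-space sketch here only models that decision. toy_next_log() is a made-up name, and the clamping to the MIN/MAX shift is an assumption about what rt_hash_resize() does, since that part of the patch is not visible.

```c
#define MIN_RTHASH_SHIFT 4
#define MAX_RTHASH_SHIFT 24	/* the patch uses 30 when BITS_PER_LONG is 64 */

/* Given nr cached routes and a table of 2^log buckets, pick the next log. */
static unsigned int toy_next_log(unsigned int nr, unsigned int log)
{
	unsigned int sz = 1U << log;

	if (nr > sz + (sz >> 1) && log < MAX_RTHASH_SHIFT)
		return log + 1;		/* more than 1.5 entries per bucket */
	if (nr < (sz >> 1) && log > MIN_RTHASH_SHIFT)
		return log - 1;		/* less than 0.5 entries per bucket */
	return log;
}
```

The gap between the two thresholds gives some hysteresis: a table that has just grown from 16 to 32 buckets at 25 entries does not immediately shrink again. It does nothing about the GC interaction described above, because the GC targets still move with the table size.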

Signed-off-by: David S. Miller [EMAIL PROTECTED]

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 0b3d7bf..57e004a 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -92,6 +92,9 @@
 #include <linux/jhash.h>
 #include <linux/rcupdate.h>
 #include <linux/times.h>
+#include <linux/workqueue.h>
+#include <linux/vmalloc.h>
+#include <linux/mutex.h>
 #include <net/protocol.h>
 #include <net/ip.h>
 #include <net/route.h>
@@ -242,28 +245,195 @@ static spinlock_t *rt_hash_locks;
 # define rt_hash_lock_init()
 #endif
 
-static struct rt_hash_bucket   *rt_hash_table;
-static unsigned                rt_hash_mask;
-static int                     rt_hash_log;
-static unsigned int            rt_hash_rnd;
+#define MIN_RTHASH_SHIFT 4
+#if BITS_PER_LONG == 32
+#define MAX_RTHASH_SHIFT 24
+#else
+#define MAX_RTHASH_SHIFT 30
+#endif
+
+struct rt_hash {
+   struct rt_hash_bucket   *table;
+   unsigned int            mask;
+   unsigned int            log;
+};
+
+struct rt_hash *rt_hash __read_mostly;
+struct rt_hash *old_rt_hash __read_mostly;
+static unsigned int rt_hash_rnd __read_mostly;
+static DEFINE_SEQLOCK(resize_transfer_lock);
+static DEFINE_MUTEX(resize_mutex);
 
 static DEFINE_PER_CPU(struct rt_cache_stat, rt_cache_stat);
 #define RT_CACHE_STAT_INC(field) \
(__raw_get_cpu_var(rt_cache_stat).field++)
 
-static int rt_intern_hash(unsigned hash, struct rtable *rth,
-   struct rtable **res);
+static void rt_hash_resize(unsigned int new_shift);
+static void check_nr_rthash(void)
+{
+   unsigned int sz = rt_hash->mask + 1;
+   unsigned int nr = atomic_read(&ipv4_dst_ops.entries);
+
+   if (unlikely(nr > (sz + (sz >> 1))))
+   rt_hash_resize(rt_hash->log + 1);
+   else if (unlikely(nr < (sz >> 1)))
+   rt_hash_resize(rt_hash->log - 1);
+}
 
-static unsigned int rt_hash_code(u32 daddr, u32 saddr)
+static struct rt_hash_bucket *rthash_alloc(unsigned int sz)
+{
+   struct rt_hash_bucket *n;
+
+   if (sz <= PAGE_SIZE)
+   n = kmalloc(sz, GFP_KERNEL);
+   else if (hashdist)
+   n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL);
+   else
+   n = (struct rt_hash_bucket *)
+   __get_free_pages(GFP_KERNEL, get_order(sz));
+
+   if (n)
+   memset(n, 0, sz);
+
+   return n;
+}
+
+static void rthash_free(struct rt_hash_bucket *r, unsigned int sz)
+{
+   if (sz <= PAGE_SIZE)
+   kfree(r);
+   else if (hashdist)
+   vfree(r);
+   else
+   free_pages((unsigned long)r, get_order(sz));
+}
+
+static unsigned int rt_hash_code(struct rt_hash *hashtable,
+u32 daddr, u32 saddr)
 {
return (jhash_2words(daddr, saddr, rt_hash_rnd)
-       & rt_hash_mask);
+       & hashtable->mask);
 }
 
-#define rt_hash(daddr, saddr, idx) \
-   rt_hash_code((__force u32)(__be32)(daddr),\
+#define rt_hashfn(htab, daddr, saddr, idx) \
+   rt_hash_code(htab, (__force u32)(__be32)(daddr),\
         (__force u32)(__be32)(saddr) ^ ((idx) << 5))
 
+static unsigned int resize_new_shift;
+
+static void rt_hash_resize_work(struct work_struct *work)
+{
+   struct rt_hash *new_hash, *old_hash;
+   unsigned int new_size, old_size, transferred;
+   int i;
+
+   if (!mutex_trylock(&resize_mutex))
+   goto out;
+
+   new_hash = kmalloc(sizeof(struct rt_hash), GFP_KERNEL);
+   if (!new_hash)
+   goto out_unlock;
+
+   

Re: [PATCH] LVS: Send ICMP unreachable responses to end-users when real-servers are removed

2007-03-05 Thread David Miller
From: Horms [EMAIL PROTECTED]
Date: Sun, 11 Feb 2007 12:04:43 +0900

 this is a small patch by Janusz Krzysztofik to ip_route_output_slow()
 that allows a VIP-less LVS linux director to generate packets originating
 from the VIP if sysctl_ip_nonlocal_bind is set.
 
 In a nutshell, the intention is for an LVS linux director to be able
 to send ICMP unreachable responses to end-users when real-servers are
 removed.
 
 http://archive.linuxvirtualserver.org/html/lvs-users/2007-01/msg00106.html
 
 I'm not really sure about the correctness of this approach,
 so I am sending it here to netdev for review
 
 Cc: Janusz Krzysztofik [EMAIL PROTECTED]
 Signed-off-by: Simon Horman [EMAIL PROTECTED]

I'm not against this patch or the idea, I just want to
think about it some more to make sure there are not bad
unintended side effects to allowing this.

If someone else could provide some feedback or comments,
I'd very much appreciate that as well.


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Adrian Bunk
On Mon, Mar 05, 2007 at 04:07:22PM -0800, Greg KH wrote:
 On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote:
  On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:
   
   Ok, how about the following patch.  Is it acceptable to everyone?
   
   thanks,
   
   greg k-h
   
   ---
init/Kconfig |   13 +++--
1 file changed, 11 insertions(+), 2 deletions(-)
   
   --- gregkh-2.6.orig/init/Kconfig
   +++ gregkh-2.6/init/Kconfig
   @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
   that belong to a class, back into the /sys/class heirachy, in
   order to support older versions of udev.

   -   If you are using a distro that was released in 2006 or later,
   -   it should be safe to say N here.
   +   If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
   +   release from 2007 or later, it should be safe to say N here.
   +
   +   If you are using Debian or other distros that are slow to
   +   update HAL, please say Y here.
  ...
  
  The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally 
  for all users, and schedule its removal for mid-2008 (or later).
  
  12 months after the first _release_ of a HAL that can live without seems 
  to be the first time when we can consider getting rid of it, since all 
  distributions with at least one release a year should ship it by then.
  
  Currently, SYSFS_DEPRECATED is only a trap for users.
 
 Huh?
 
 No, again, I've been using this just fine for about 6 months now.
 
 And what about all of the servers not using HAL/NetworkManager?

On a server, it shouldn't harm.

 And what about all of the embedded systems not using either?

If it was much code, I would have sent a patch that allowed disabling it 
if EMBEDDED=y.

 So to not allow this to be turned off by people who might want to (we
 want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will
 other distros released this year), is pretty heavy-handed.
 
 It also will work in OpenSuSE 10.2 which is already released, and I
 think Fedora 6, but I've only limited experience with these.
 
 Oh, and Gentoo works just fine, and has been for the past 6 months.

For most people, it simply doesn't matter whether SYSFS_DEPRECATED is 
on or off.

But accidentally disabling SYSFS_DEPRECATED has proven to be a trap 
people sometimes fall into - and tracking them down to 
SYSFS_DEPRECATED=n sometimes takes some time.

 I would just prefer to come up with an acceptable set of wording that
 will work to properly warn people.
 
 I proposed one such wording which some people took as a slam against
 Debian, which it really was not at all.
 
 Does someone else want to propose some other wording instead?
 
 thanks,
 
 greg k-h

cu
Adrian

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: [PATCH] xfrm_policy delete security check misplaced

2007-03-05 Thread James Morris
On Fri, 2 Mar 2007, Eric Paris wrote:

 Signed-off-by: Eric Paris [EMAIL PROTECTED]

Acked-by: James Morris [EMAIL PROTECTED]



-- 
James Morris
[EMAIL PROTECTED]


Re: [PATCH] Add xfrm policy change auditing to pfkey_spdget

2007-03-05 Thread James Morris
On Fri, 2 Mar 2007, Eric Paris wrote:

 pfkey_spdget neither had an LSM security hook nor auditing for the
 removal of xfrm_policy structs.  The security hook was added when it was
 moved into xfrm_policy_byid instead of the callers to that function by
 my earlier patch and this patch adds the auditing hooks as well.
 
 Signed-off-by: Eric Paris [EMAIL PROTECTED]

Acked-by: James Morris [EMAIL PROTECTED]


-- 
James Morris
[EMAIL PROTECTED]


Re: [PATCH] xfrm audit hook misplaced in pfkey_delete and xfrm_del_sa

2007-03-05 Thread James Morris
On Fri, 2 Mar 2007, Eric Paris wrote:

 Inside pfkey_delete and xfrm_del_sa the audit hooks were not called if
 there was any permission/security failures in attempting to do the del
 operation (such as permission denied from security_xfrm_state_delete).
 This patch moves the audit hook to the exit path such that all failures
 (and successes) will actually get audited.
 
 Signed-off-by: Eric Paris [EMAIL PROTECTED]

Acked-by: James Morris [EMAIL PROTECTED]


-- 
James Morris
[EMAIL PROTECTED]


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Greg KH
On Tue, Mar 06, 2007 at 11:24:57AM +1100, Bron Gondwana wrote:
 On Mon, Mar 05, 2007 at 03:14:25PM -0600, Matt Mackall wrote:
  On Mon, Mar 05, 2007 at 04:46:09PM +0100, Tomasz Torcz wrote:
That's not the point. The point is that Debian/unstable as of _this
morning_ doesn't work. For reference, I'm running both the latest
releases of both hal (0.5.8.1-6.1) and network-manager (0.6.4-6). And
there are people telling me I need a copy of HAL out of git that
hasn't even been released for Debian to package. Debian isn't the
problem here.
   
 hal 0.5.9-rc1 (released, not from git) should work. It will be
   probably released soon and picked up by sane distributions. Debian is a very
   irritating corner case.
  
  Presumably the -rc1 stands for release candidate. Which means not
  yet released. And when did it show up? 04-Mar-2007 at 18:31. That's
  right, YESTERDAY. Almost a full month after Greg's commit.
  
  For the last time, DEBIAN IS NOT THE PROBLEM.
 
 Can I please second this (having been burned by hell that was udev of
 the 0.5ish era) - Greg, please try to make changes in a cross-compatible
 way so that versions of userspace and kernel are not so closely
 dependent on tracking each other.  The whole 2.6.8 - 2.6.12 series of
 kernels and associated udevs are fraught with race conditions where
 upgrading one but not the other will leave your machine unbootable.

But I AM TRYING TO MAKE IT COMPATIBLE!!!

That's what that config option is there for.  If you happen to be
running a newer userspace, a different distro than what is in Debian
right now, or don't use HAL and NetworkManager, then disable that
option.  Then all of sysfs looks just like it used to, no user visible
changes at all.  It doesn't get any more compatible than that.

Again, I've pointed out distros that work just fine many times in this
thread...

It's been there since 2.6.20 I think, no one seemed to have noticed it
then for an odd reason...

And the default is enabled, you have to manually turn it off in order to
break your machine.

Again, how can I word this in a manner that would be sufficient to keep
this misunderstanding from happening again?

greg k-h


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Greg KH
On Tue, Mar 06, 2007 at 01:35:41AM +0100, Adrian Bunk wrote:
 On Mon, Mar 05, 2007 at 04:07:22PM -0800, Greg KH wrote:
  On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote:
   On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:

Ok, how about the following patch.  Is it acceptable to everyone?

thanks,

greg k-h

---
 init/Kconfig |   13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

--- gregkh-2.6.orig/init/Kconfig
+++ gregkh-2.6/init/Kconfig
@@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
  that belong to a class, back into the /sys/class heirachy, in
  order to support older versions of udev.
 
- If you are using a distro that was released in 2006 or later,
- it should be safe to say N here.
+ If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
+ release from 2007 or later, it should be safe to say N here.
+
+ If you are using Debian or other distros that are slow to
+ update HAL, please say Y here.
   ...
   
   The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally 
   for all users, and schedule its removal for mid-2008 (or later).
   
   12 months after the first _release_ of a HAL that can live without seems 
   to be the first time when we can consider getting rid of it, since all 
   distributions with at least one release a year should ship it by then.
   
   Currently, SYSFS_DEPRECATED is only a trap for users.
  
  Huh?
  
  No, again, I've been using this just fine for about 6 months now.
  
  And what about all of the servers not using HAL/NetworkManager?
 
 On a server, it shouldn't harm.

But if they wanted that option enabled?

  And what about all of the embedded systems not using either?
 
 If it was much code, I would have sent a patch that allowed disabling it 
 if EMBEDDED=y.

It's not a code size issue.  In fact, if the option is enabled, like you
have done, it builds more code into the kernel than before.

  So to not allow this to be turned off by people who might want to (we
  want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will
  other distros released this year), is pretty heavy-handed.
  
  It also will work in OpenSuSE 10.2 which is already released, and I
  think Fedora 6, but I've only limited experience with these.
  
  Oh, and Gentoo works just fine, and has been for the past 6 months.
 
 For most people, it simply doesn't matter whether SYSFS_DEPRECATED is 
 on or off.

Exactly.

 But accidentally disabling SYSFS_DEPRECATED has proven to be a trap 
 people sometimes fall into - and tracking them down to 
 SYSFS_DEPRECATED=n sometimes takes some time.

So how do I put up the warning flag any larger than I have?

I do not want this always enabled, that option is not acceptable to me,
or to the zillions of people who are running a distro that this option
works just fine on (see above list...)

thanks,

greg k-h


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Bron Gondwana
On Mon, Mar 05, 2007 at 03:14:25PM -0600, Matt Mackall wrote:
 On Mon, Mar 05, 2007 at 04:46:09PM +0100, Tomasz Torcz wrote:
   That's not the point. The point is that Debian/unstable as of _this
   morning_ doesn't work. For reference, I'm running both the latest
   releases of both hal (0.5.8.1-6.1) and network-manager (0.6.4-6). And
   there are people telling me I need a copy of HAL out of git that
   hasn't even been released for Debian to package. Debian isn't the
   problem here.
  
hal 0.5.9-rc1 (released, not from git) should work. It will be
  probably released soon and picked up by sane distributions. Debian is a very
  irritating corner case.
 
 Presumably the -rc1 stands for release candidate. Which means not
 yet released. And when did it show up? 04-Mar-2007 at 18:31. That's
 right, YESTERDAY. Almost a full month after Greg's commit.
 
 For the last time, DEBIAN IS NOT THE PROBLEM.

Can I please second this (having been burned by hell that was udev of
the 0.5ish era) - Greg, please try to make changes in a cross-compatible
way so that versions of userspace and kernel are not so closely
dependent on tracking each other.  The whole 2.6.8 - 2.6.12 series of
kernels and associated udevs are fraught with race conditions where
upgrading one but not the other will leave your machine unbootable.

I read the manifesto for udev showing how crap devfs was, it was
broken, it could never be fixed etc - yet my experience was that devfs
systems just worked[tm] and udev was very dangerous.  My thinking is
going to be tarnished by that for a while and my mental image of udev
is unreliable POS.  I'm hoping enough good experiences with udev might
make me feel less scared whenever I have to deal with it.

Similarly, I'm hoping I don't have to think oh shit, will this break
boot every time I upgrade either a kernel or hal version for the next
year, because it would really suck to do that all over again.  It
contributes to the meme that linux is unreliable and perpetually
unstable.

Regards,

Bron.


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Theodore Tso
On Mon, Mar 05, 2007 at 04:37:15PM -0800, Greg KH wrote:
 But I AM TRYING TO MAKE IT COMPATIBLE!!!
 
 That's what that config option is there for.  If you happen to be
 running a newer userspace, a different distro than what is in Debian
 right now, or don't use HAL and NetworkManager, then disable that
 option.  Then all of sysfs looks just like it used to, no user visible
 changes at all.  It doesn't get any more compatible than that.

This is great, but I think the real problem isn't the config option,
but what is changing if the config option isn't enabled.  The claim
which some, including Matt and Bron, seem to be making is that if you
turn *off* CONFIG_SYSFS_DEPRECATED, you must be using at least hal
0.5.9-rc1, released ***yesterday***, or suffer breakages for at least
some system configurations.

So the problem with putting a date in Kconfig.txt help file, or in
Documentation/feature-removal-schedule.txt, is that if there are other
incompatible changes which are added to sysfs in say, December 2007 or
January 2008, but which are papered over with CONFIG_SYSFS_DEPRECATED,
and then come June 2008, CONFIG_SYSFS_DEPRECATED is unceremoniously
ripped out, then users will get screwed.  

So the question really is are we really done making changes to sysfs,
or maybe what we should do is talk about major version numbers to
sysfs.  Call what we have currently not CONFIG_SYSFS_DEPRECATED, but
rather CONFIG_SYSFS_LAYOUT_1.  At the moment, CONFIG_SYSFS_LAYOUT_2 is
undergoing changes, but at some point we need to lock down and state
that Layout version 2 is never going to change, and then people who
want changes can go work on CONFIG_SYSFS_LAYOUT_3.  

The problem with calling it CONFIG_SYSFS_DEPRECATED is that people think
that since it's deprecated, it should be turned off, but if we have
staged major version numbers, with guarantees of absolute stability
once a particular major version number is locked down, then it may
make it a lot easier to talk about what version of hal and udev and
Network Manager is really needed for different versions.  

- Ted


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Andrew Morton
On Mon, 5 Mar 2007 19:56:25 -0500
Theodore Tso [EMAIL PROTECTED] wrote:

 So the question really is are we really done making changes to sysfs,
 or maybe what we should do is talk about major version numbers to
 sysfs.

Perhaps using a config option wasn't the right way to do this - a kernel
boot parameter might be better.

In fact, one could envisage a kernel boot parameter sysfs_version=N which 
will allow distro people to select the sysfs-of-the-day which works with their
userspace.

Because it does appear that we need _something_ which will get us away from this
ongoing problem of needing to keep the kernel and userspace synchronised across
sysfs changes.


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Greg KH
On Mon, Mar 05, 2007 at 07:56:25PM -0500, Theodore Tso wrote:
 On Mon, Mar 05, 2007 at 04:37:15PM -0800, Greg KH wrote:
  But I AM TRYING TO MAKE IT COMPATIBLE!!!
  
  That's what that config option is there for.  If you happen to be
  running a newer userspace, a different distro than what is in Debian
  right now, or don't use HAL and NetworkManager, then disable that
  option.  Then all of sysfs looks just like it used to, no user-visible
  changes at all.  It doesn't get any more compatible than that.
 
 This is great, but I think the real problem isn't the config option,
 but what is changing if the config option isn't enabled.  The claim
 which some, including Matt and Bron, seem to be making is that if you
 turn *off* CONFIG_SYSFS_DEPRECATED, you must be using at least hal
 0.5.9-rc1, released ***yesterday***, or suffer breakages for at least
 some system configurations.

Ok, well that has been proven incorrect.  I originally thought it was
HAL that had the problem, but I think that is not true, as I am using
the older version of hal here (0.5.7.1) just fine.

 So the problem with putting a date in Kconfig.txt help file, or in
 Documentation/feature-removal-schedule.txt, is that if there are other
 incompatible changes which are added to sysfs in say, December 2007 or
 January 2008, but which are papered over with CONFIG_SYSFS_DEPRECATED,
 and then come June 2008, CONFIG_SYSFS_DEPRECATED is unceremoniously
 ripped out, then users will get screwed.  
 
 So the question really is are we really done making changes to sysfs,
 or maybe what we should do is talk about major version numbers to
 sysfs.  Call what we have currently not CONFIG_SYSFS_DEPRECATED, but
 rather CONFIG_SYSFS_LAYOUT_1.  At the moment, CONFIG_SYSFS_LAYOUT_2 is
 undergoing changes, but at some point we need to lock down and state
 that Layout version 2 is never going to change, and then people who
 want changes can go work on CONFIG_SYSFS_LAYOUT_3.  
 
 The problem with calling it CONFIG_SYSFS_DEPRECATED is that people think
 that since it's deprecated, it should be turned off, but if we have
 staged major version numbers, with guarantees of absolute stability
 once a particular major version number is locked down, then it may
 make it a lot easier to talk about what version of hal and udev and
 Network Manager is really needed for different versions.  

This is what Documentation/ABI/ has tried to nail down; unfortunately it
has turned out to be very hard to track down all of the odd userspace
programs that use sysfs and see what they are relying on.  We are slowly
fixing things, as the OpenSuSE and Gentoo releases show.

And I'll be the first to admit that the ABI/ directory needs some
fleshing out...

And it isn't really a whole different layout, the only problem here is
that a directory has turned into a symlink, so programs that were not
written that well (and I'll be the first to admit that I made the same
mistake in udev many years ago) can't handle the change.

So numerous programs just work fine, but for a limited few, they have
problems, hence the config option so that nothing will break.

And if you look in the ABI/ directory, it describes this usage of the
class devices in sysfs.  But again, no one is flushing out the users of
these features, or even reading the stuff that is there...

So, again, a better wording for the CONFIG help text anyone?  Or a
better name for the CONFIG value itself?

thanks,

greg k-h


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Greg KH
On Mon, Mar 05, 2007 at 05:08:49PM -0800, Andrew Morton wrote:
 On Mon, 5 Mar 2007 19:56:25 -0500
 Theodore Tso [EMAIL PROTECTED] wrote:
 
  So the question really is are we really done making changes to sysfs,
  or maybe what we should do is talk about major version numbers to
  sysfs.
 
 Perhaps using a config option wasn't the right way to do this - a kernel
 boot parameter might be better.

Ok, I have no problem with that if people really want it.  But give me
the option to also make it a config option so I don't have to change our
bootloaders too.

Does that sound acceptable?

thanks,

greg k-h


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Andrew Morton
On Mon, 5 Mar 2007 17:17:09 -0800
Greg KH [EMAIL PROTECTED] wrote:

 On Mon, Mar 05, 2007 at 05:08:49PM -0800, Andrew Morton wrote:
  On Mon, 5 Mar 2007 19:56:25 -0500
  Theodore Tso [EMAIL PROTECTED] wrote:
  
   So the question really is are we really done making changes to sysfs,
   or maybe what we should do is talk about major version numbers to
   sysfs.
  
  Perhaps using a config option wasn't the right way to do this - a kernel
  boot parameter might be better.
 
 Ok, I have no problem with that if people really want it.  But give me
 the option to also make it a config option so I don't have to change our
 bootloaders too.

Sometimes we provide a config option which provides the default version of
the boot option.  So:

CONFIG_SYSFS_VERSION=1.2

and

if (user_provided_sysfs_version == NULL)
	user_provided_sysfs_version = CONFIG_SYSFS_VERSION;
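
A minimal userspace sketch of that fallback, assuming a hypothetical
sysfs_version= boot parameter and a string-valued CONFIG_SYSFS_VERSION
default. Neither exists in the kernel; the helper names and the
space-separated command-line parsing are invented for illustration and
are not the kernel's own parameter handling.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical build-time default, standing in for CONFIG_SYSFS_VERSION. */
#define CONFIG_SYSFS_VERSION "1.2"

/*
 * Look for a "sysfs_version=VALUE" token in a space-separated command
 * line.  On success VALUE is copied into buf and buf is returned; NULL
 * is returned if the token is absent, empty, or too long for buf.
 */
static const char *find_sysfs_version(const char *cmdline,
				      char *buf, size_t buflen)
{
	static const char key[] = "sysfs_version=";
	const size_t keylen = sizeof(key) - 1;
	const char *p = cmdline;

	while (p != NULL && *p != '\0') {
		while (*p == ' ')
			p++;		/* skip separators */

		if (strncmp(p, key, keylen) == 0) {
			const char *val = p + keylen;
			size_t n = strcspn(val, " ");

			if (n == 0 || n >= buflen)
				return NULL;
			memcpy(buf, val, n);
			buf[n] = '\0';
			return buf;
		}
		p += strcspn(p, " ");	/* skip the rest of this token */
	}
	return NULL;
}

/* The user's value wins; otherwise fall back to the config default. */
const char *effective_sysfs_version(const char *cmdline,
				    char *buf, size_t buflen)
{
	const char *user_provided_sysfs_version =
		find_sysfs_version(cmdline, buf, buflen);

	if (user_provided_sysfs_version == NULL)
		user_provided_sysfs_version = CONFIG_SYSFS_VERSION;
	return user_provided_sysfs_version;
}
```

With this shape a distro can set the default at build time and leave the
bootloader alone, while a user can still override it at boot.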


 Does that sound acceptable?

If we make CONFIG_SYSFS_DEPRECATED just a boolean boot option then that
fixes this problem (we hope) but won't help us next time we want to change
something.

It all depends on whether sysfs is finished yet ;)


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Matt Mackall
On Mon, Mar 05, 2007 at 04:07:22PM -0800, Greg KH wrote:
 On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote:
  On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:
   
   Ok, how about the following patch.  Is it acceptable to everyone?
   
   thanks,
   
   greg k-h
   
   ---
    init/Kconfig |   13 +++--
    1 file changed, 11 insertions(+), 2 deletions(-)
   
   --- gregkh-2.6.orig/init/Kconfig
   +++ gregkh-2.6/init/Kconfig
   @@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
   that belong to a class, back into the /sys/class heirachy, in
   order to support older versions of udev.

   -   If you are using a distro that was released in 2006 or later,
   -   it should be safe to say N here.
   +   If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
   +   release from 2007 or later, it should be safe to say N here.
   +
   +   If you are using Debian or other distros that are slow to
   +   update HAL, please say Y here.
  ...
  
  The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally 
  for all users, and schedule its removal for mid-2008 (or later).
  
  12 months after the first _release_ of a HAL that can live without it seems 
  to be the first time when we can consider getting rid of it, since all 
  distributions with at least one release a year should ship it by then.
  
  Currently, SYSFS_DEPRECATED is only a trap for users.
 
 Huh?
 
 No, again, I've been using this just fine for about 6 months now.
 
 And what about all of the servers not using HAL/NetworkManager?
 And what about all of the embedded systems not using either?
 
 So to not allow this to be turned off by people who might want to (we
 want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will
 other distros released this year), is pretty heavy-handed.
 
 It also will work in OpenSuSE 10.2 which is already released, and I
 think Fedora 6, but I've only limited experience with these.
 
 Oh, and Gentoo works just fine, and has been for the past 6 months.

 I would just prefer to come up with an acceptable set of wording that
 will work to properly warn people.
 
 I proposed one such wording which some people took as a slam against
 Debian, which it really was not at all.
 
 Does someone else want to propose some other wording instead?

Back up a bit. Let's review:

Problem: NetworkManager stopped working with my ipw2200 on Debian/unstable

Theory A: It broke because I'm not running an as-yet-unreleased HAL.

 Then we should revert the patch pronto because it's an unqualified
 regression.

Theory B: It broke because I'm not running relatively recent HAL.

 By all accounts I'm running the latest and greatest HAL and Network
 Manager, more than recent enough to work.

Theory C: It broke because I've got some goofy config.

 My setup passes no arguments to either. The HAL config file is
 completely bare-bones and there's no sign of any configuration files
 for Network Manager.

Theory D: It broke for some nebulous Debian-related reason.

 That's a bunch of unhelpful crap.

Can we come up with an actual theory for what's wrong with my setup, please?
Like, perhaps:

Theory E: There's some undiagnosed new breakage that this introduces
that no one else hit until it went into mainline.

 Hmmm, this one sounds more promising.

-- 
Mathematics is the supreme nostalgia of our time.


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Greg KH
On Mon, Mar 05, 2007 at 07:30:21PM -0600, Matt Mackall wrote:
 On Mon, Mar 05, 2007 at 04:07:22PM -0800, Greg KH wrote:
  On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote:
   On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:

Ok, how about the following patch.  Is it acceptable to everyone?

thanks,

greg k-h

---
 init/Kconfig |   13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

--- gregkh-2.6.orig/init/Kconfig
+++ gregkh-2.6/init/Kconfig
@@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
  that belong to a class, back into the /sys/class heirachy, in
  order to support older versions of udev.
 
- If you are using a distro that was released in 2006 or later,
- it should be safe to say N here.
+ If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
+ release from 2007 or later, it should be safe to say N here.
+
+ If you are using Debian or other distros that are slow to
+ update HAL, please say Y here.
   ...
   
   The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally 
   for all users, and schedule its removal for mid-2008 (or later).
   
   12 months after the first _release_ of a HAL that can live without it seems 
   to be the first time when we can consider getting rid of it, since all 
   distributions with at least one release a year should ship it by then.
   
   Currently, SYSFS_DEPRECATED is only a trap for users.
  
  Huh?
  
  No, again, I've been using this just fine for about 6 months now.
  
  And what about all of the servers not using HAL/NetworkManager?
  And what about all of the embedded systems not using either?
  
  So to not allow this to be turned off by people who might want to (we
  want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will
  other distros released this year), is pretty heavy-handed.
  
  It also will work in OpenSuSE 10.2 which is already released, and I
  think Fedora 6, but I've only limited experience with these.
  
  Oh, and Gentoo works just fine, and has been for the past 6 months.
 
  I would just prefer to come up with an acceptable set of wording that
  will work to properly warn people.
  
  I proposed one such wording which some people took as a slam against
  Debian, which it really was not at all.
  
  Does someone else want to propose some other wording instead?
 
 Back up a bit. Let's review:
 
 Problem: NetworkManager stopped working with my ipw2200 on Debian/unstable
 
 Theory A: It broke because I'm not running an as-yet-unreleased HAL.
 
  Then we should revert the patch pronto because it's an unqualified
  regression.
 
 Theory B: It broke because I'm not running relatively recent HAL.
 
  By all accounts I'm running the latest and greatest HAL and Network
  Manager, more than recent enough to work.
 
 Theory C: It broke because I've got some goofy config.
 
  My setup passes no arguments to either. The HAL config file is
  completely bare-bones and there's no sign of any configuration files
  for Network Manager.
 
 Theory D: It broke for some nebulous Debian-related reason.
 
  That's a bunch of unhelpful crap.
 

 Can we come up with an actual theory for what's wrong with my setup, please?
 Like, perhaps:
 
 Theory E: There's some undiagnosed new breakage that this introduces
 that no one else hit until it went into mainline.

Theory F:  It broke because you are using NetworkManager for your
network devices and the patches that fix this have not made it into a
real release?

I'm just guessing, but is anyone who is having this problem NOT using
NetworkManager?

I'm running an old version of HAL just fine, but I'm not using
NetworkManager here.

I am using NetworkManager on an OpenSuSE 10.3 release, but suse's version
of NetworkManager is well known to not be anywhere near what is released
as a tarball :(

thanks,

greg k-h


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Greg KH
On Mon, Mar 05, 2007 at 07:30:21PM -0600, Matt Mackall wrote:
 On Mon, Mar 05, 2007 at 04:07:22PM -0800, Greg KH wrote:
  On Tue, Mar 06, 2007 at 12:40:52AM +0100, Adrian Bunk wrote:
   On Mon, Mar 05, 2007 at 10:58:13AM -0800, Greg KH wrote:

Ok, how about the following patch.  Is it acceptable to everyone?

thanks,

greg k-h

---
 init/Kconfig |   13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

--- gregkh-2.6.orig/init/Kconfig
+++ gregkh-2.6/init/Kconfig
@@ -290,8 +290,17 @@ config SYSFS_DEPRECATED
  that belong to a class, back into the /sys/class heirachy, in
  order to support older versions of udev.
 
- If you are using a distro that was released in 2006 or later,
- it should be safe to say N here.
+ If you are using an OpenSuSE, Gentoo, Ubuntu, or Fedora
+ release from 2007 or later, it should be safe to say N here.
+
+ If you are using Debian or other distros that are slow to
+ update HAL, please say Y here.
   ...
   
   The sane solution seems to be to enable SYSFS_DEPRECATED unconditionally 
   for all users, and schedule its removal for mid-2008 (or later).
   
   12 months after the first _release_ of a HAL that can live without it seems 
   to be the first time when we can consider getting rid of it, since all 
   distributions with at least one release a year should ship it by then.
   
   Currently, SYSFS_DEPRECATED is only a trap for users.
  
  Huh?
  
  No, again, I've been using this just fine for about 6 months now.
  
  And what about all of the servers not using HAL/NetworkManager?
  And what about all of the embedded systems not using either?
  
  So to not allow this to be turned off by people who might want to (we
  want this for OpenSuSE 10.3, and Fedora 7 also will want this, as will
  other distros released this year), is pretty heavy-handed.
  
  It also will work in OpenSuSE 10.2 which is already released, and I
  think Fedora 6, but I've only limited experience with these.
  
  Oh, and Gentoo works just fine, and has been for the past 6 months.
 
  I would just prefer to come up with an acceptable set of wording that
  will work to properly warn people.
  
  I proposed one such wording which some people took as a slam against
  Debian, which it really was not at all.
  
  Does someone else want to propose some other wording instead?
 
 Back up a bit. Let's review:
 
 Problem: NetworkManager stopped working with my ipw2200 on Debian/unstable

Wait, have you confirmed that if you enable this config option,
NetworkManager starts back up again and works properly?

If so, can you disable the option and strace it to see what program is
trying to access what?  That will put the
HAL/NetworkManager/libsysfs/distro script finger pointing to rest pretty
quickly :)

thanks,

greg k-h


Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)

2007-03-05 Thread Matthew Garrett
On Mon, Mar 05, 2007 at 02:39:00PM -0800, Greg KH wrote:

 Ok, I only named HAL as that is what people have told me the problem is.
 I have been running this change on my boxes, without
 CONFIG_SYSFS_DEPRECATED since last July or so.
 
 But I don't use NetworkManager here for the most part, but I have tried
 this in the OpenSuSE 10.3 alpha releases and it seems to work just fine
 with whatever version of NetworkManager it uses.

At a guess, you're carrying either a git snapshot or have backports from 
git. Several distributions do this, but until there's actually been a 
released version that works, it's a bit early to set a timescale.

-- 
Matthew Garrett | [EMAIL PROTECTED]


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Matt Mackall
On Mon, Mar 05, 2007 at 06:48:50PM -0800, Greg KH wrote:
 
 Wait, have you confirmed that if you enable this config option,
 NetworkManager starts back up again and works properly?

Yep, probably should have mentioned that.

 If so, can you disable the option and strace it to see what program is
 trying to access what?  That will put the
 HAL/NetworkManager/libsysfs/distro script finger pointing to rest pretty
 quickly :)

Did that a few hours ago, got a very large dump from both programs. No
smoking guns to my eye, but I'll send you the logs later.

-- 
Mathematics is the supreme nostalgia of our time.


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Matt Mackall
On Mon, Mar 05, 2007 at 06:48:50PM -0800, Greg KH wrote:
 If so, can you disable the option and strace it to see what program is
 trying to access what?  That will put the
 HAL/NetworkManager/libsysfs/distro script finger pointing to rest pretty
 quickly :)

Ok, I've got straces of both good and bad (5M each). Filtered out
random pointer values and the like, diffed, and filtered for /sys/,
and the result's still 1.5M. What should I be looking for?

-- 
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [2.6.21 patch] unconditionally enable SYSFS_DEPRECATED

2007-03-05 Thread Greg KH
On Mon, Mar 05, 2007 at 09:39:47PM -0600, Matt Mackall wrote:
 On Mon, Mar 05, 2007 at 06:48:50PM -0800, Greg KH wrote:
  If so, can you disable the option and strace it to see what program is
  trying to access what?  That will put the
  HAL/NetworkManager/libsysfs/distro script finger pointing to rest pretty
  quickly :)
 
 Ok, I've got straces of both good and bad (5M each). Filtered out
 random pointer values and the like, diffed, and filtered for /sys/,
 and the result's still 1.5M. What should I be looking for?

Failures when trying to read from /sys/class/net/

Or opening the directory and iterating over the subdirs in there.  Or
something like that.

But the /sys/class/net/ stuff should hopefully help narrow it down.
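
A rough sketch of that first filtering step, assuming strace's usual
one-call-per-line output in which a failed call ends with
") = -1 ERRNO (...)". The function names are hypothetical. It only
catches failed accesses under /sys/class/net; a program that opens the
directory and then silently skips symlinked entries would not show up as
a failure and still has to be found by reading the trace.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if an strace output line is a failed system call that
 * mentions a path under /sys/class/net, 0 otherwise.
 */
int is_failed_sysfs_net_access(const char *line)
{
	return strstr(line, "/sys/class/net") != NULL &&
	       strstr(line, ") = -1 ") != NULL;
}

/*
 * Copy every matching line from in to out and return how many matched.
 * Lines longer than the buffer are examined in pieces, which is good
 * enough for a sketch.
 */
size_t print_failed_sysfs_net_accesses(FILE *in, FILE *out)
{
	char line[4096];
	size_t hits = 0;

	while (fgets(line, sizeof(line), in) != NULL) {
		if (is_failed_sysfs_net_access(line)) {
			fputs(line, out);
			hits++;
		}
	}
	return hits;
}
```

Calling print_failed_sysfs_net_accesses(stdin, stdout) from a small
main() and feeding it the "bad" trace should shrink a multi-megabyte log
to something readable.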

thanks,

greg k-h


Re: [RFC] Arp announce (for Xen)

2007-03-05 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Thu, 1 Mar 2007 17:30:30 -0800

 What about implementing the unused arp_announce flag on the inetdevice?
 Something like the following.  Totally untested...
 
 Looks like it either was there (and got removed) or was planned but
 never implemented.

This idea is fine.  But:

 + case NETDEV_CHANGEADDR:
 +     /* Send gratuitous ARP in case of address change or new device */
 +     if (IN_DEV_ARP_ANNOUNCE(in_dev))
 +         arp_send(ARPOP_REQUEST, ETH_P_ARP,
 +                  in_dev->ifa_list->ifa_address, dev,
 +                  in_dev->ifa_list->ifa_address, NULL,
 +                  dev->dev_addr, NULL);

We'll need to make sure the appropriate 'arp_announce' address
selection is employed here.

One idea is to change arp_solicit() such that it can be invoked in
this context, or provide a new helper function which will do the
source address selection rules of 'arp_announce' and then invoke
arp_send() as appropriate for us.

