[PATCH] NET : convert network timestamps to ktime_t
Hi David

Here is the second version of this patch, including the missing bits spotted by Stephen. This is against net-2.6.22.

Thank you

[PATCH] NET : convert network timestamps to ktime_t

We currently use a special structure (struct skb_timeval) and plain 'struct timeval' to store packet timestamps in sk_buffs and struct sock.

This has some drawbacks :
- Fixed resolution of one microsecond.
- Waste of space on 64bit platforms where sizeof(struct timeval)=16

I suggest using ktime_t, which is a nice abstraction of high resolution time services, currently capable of nanosecond resolution.

As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits an 8 byte shrink of this structure on 64bit architectures. Some other structures also benefit from this size reduction (struct ipq in ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c, ...)

Once this ktime infrastructure is adopted, we can more easily provide nanosecond resolution on top of it. (ioctl SIOCGSTAMPNS and/or SO_TIMESTAMPNS/SCM_TIMESTAMPNS)

Note : this patch includes a bug correction in compat_sock_get_timestamp() where a "err = 0;" was missing (so this syscall returned -ENOENT instead of 0)

Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]>
CC: Stephen Hemminger <[EMAIL PROTECTED]>
CC: John find <[EMAIL PROTECTED]>

 include/linux/skbuff.h                  | 26 --
 include/net/sock.h                      | 18 +++
 kernel/time.c                           |  1
 net/bridge/netfilter/ebt_ulog.c         |  6 +++--
 net/compat.c                            | 15
 net/core/dev.c                          | 19 +++-
 net/core/sock.c                         | 16 +++--
 net/econet/af_econet.c                  |  2 -
 net/ipv4/ip_fragment.c                  |  8 +++---
 net/ipv4/netfilter/ip_queue.c           |  6 +++--
 net/ipv4/netfilter/ipt_ULOG.c           |  8 --
 net/ipv6/exthdrs.c                      |  2 -
 net/ipv6/netfilter/ip6_queue.c          |  6 +++--
 net/ipv6/netfilter/nf_conntrack_reasm.c |  6 ++---
 net/ipv6/reassembly.c                   |  6 ++---
 net/ipx/af_ipx.c                        |  4 +--
 net/netfilter/nfnetlink_log.c           |  8 +++---
 net/netfilter/nfnetlink_queue.c         |  8 +++---
 net/packet/af_packet.c                  |  8 --
 net/sunrpc/svcsock.c                    | 10 ++--
 20 files changed, 85 insertions(+), 98 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4ff3940..24dcbb3 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -27,6 +27,7 @@
 #include
 #include
 #include
 #include
+#include
 #define HAVE_ALLOC_SKB /* For the drivers to know */
 #define HAVE_ALIGNABLE_SKB /* Ditto 8)*/
@@ -156,11 +157,6 @@ struct skb_shared_info {
 #define SKB_DATAREF_SHIFT 16
 #define SKB_DATAREF_MASK ((1 << SKB_DATAREF_SHIFT) - 1)
-struct skb_timeval {
-	u32 off_sec;
-	u32 off_usec;
-};
-
 enum {
 	SKB_FCLONE_UNAVAILABLE,
@@ -233,7 +229,7 @@ struct sk_buff {
 	struct sk_buff *prev;
 	struct sock *sk;
-	struct skb_timeval tstamp;
+	ktime_t tstamp;
 	struct net_device *dev;
 	struct net_device *input_dev;
@@ -1360,26 +1356,14 @@ extern void skb_add_mtu(int mtu);
  */
 static inline void skb_get_timestamp(const struct sk_buff *skb, struct timeval *stamp)
 {
-	stamp->tv_sec = skb->tstamp.off_sec;
-	stamp->tv_usec = skb->tstamp.off_usec;
+	*stamp = ktime_to_timeval(skb->tstamp);
 }
-/**
- * skb_set_timestamp - set timestamp of a skb
- * @skb: skb to set stamp of
- * @stamp: pointer to struct timeval to get stamp from
- *
- * Timestamps are stored in the skb as offsets to a base timestamp.
- * This function converts a struct timeval to an offset and stores
- * it in the skb.
- */
-static inline void skb_set_timestamp(struct sk_buff *skb, const struct timeval *stamp)
+static inline void __net_timestamp(struct sk_buff *skb)
 {
-	skb->tstamp.off_sec = stamp->tv_sec;
-	skb->tstamp.off_usec = stamp->tv_usec;
+	skb->tstamp = ktime_get_real();
 }
-extern void __net_timestamp(struct sk_buff *skb);
 extern __sum16 __skb_checksum_complete(struct sk_buff *skb);
diff --git a/include/net/sock.h b/include/net/sock.h
index f352d22..59af9fc 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -244,7 +244,7 @@ #define sk_prot __sk_common.skc_prot
 	struct sk_filter *sk_filter;
 	void *sk_protinfo;
 	struct timer_list sk_timer;
-	struct timeval sk_stamp;
+	ktime_t sk_stamp;
 	struct socket *sk_socket;
 	void *sk_user_data;
 	struct page *sk_sndmsg_page;
@@ -1307,19 +1307,19 @@ static inlin
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Sun, Mar 04, 2007 at 11:02:48PM -0800, Greg KH wrote:
> On Mon, Mar 05, 2007 at 12:42:29AM -0600, Matt Mackall wrote:
> > On Sun, Mar 04, 2007 at 05:16:25PM -0800, Greg KH wrote:
> > > On Sun, Mar 04, 2007 at 04:08:57PM -0600, Matt Mackall wrote:
> > > > Recent kernels are having troubles with wireless for me. Two seemingly
> > > > related problems:
> > > >
> > > > a) NetworkManager seems oblivious to the existence of my IPW2200
> > > > b) Manual iwconfig waits for 60s and then reports:
> > > >
> > > > Error for wireless request "Set Encode" (8B2A) :
> > > > SET failed on device eth1 ; Operation not supported.
> > >
> > > Do you have CONFIG_SYSFS_DEPRECATED enabled? If not, please do as that
> > > will keep you from having to change any userspace code.
> >
> > No, it's disabled. Will test once I'm done tracking down the iwconfig
> > problem. From the help text for SYSFS_DEPRECATED:
> >
> > If you are using a distro that was released in 2006 or
> > later, it should be safe to say N here.
> >
> > If we need an as-yet-unreleased HAL without it, I would say the above
> > should be changed to 2008 or so. If Debian actually cuts a release in
> > the next few months, you might make that 2010.
>
> Well, just because Debian has such a slow release cycle, should the rest
> of the world be forced to follow suit? :)
>
> When I originally wrote that, I thought Debian would have already done
> their release, my mistake...

That's not the point. The point is that Debian/unstable as of _this morning_ doesn't work. For reference, I'm running the latest releases of both hal (0.5.8.1-6.1) and network-manager (0.6.4-6).

And there are people telling me I need a copy of HAL out of git that hasn't even been released for Debian to package. Debian isn't the problem here.

If it is indeed the case that HAL needs to be upgraded here, the clock on deprecating these features can't even start counting until a usable HAL version is released. And then you need to give it at least a year after that before you can start recommending people disable it.

--
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Extensible hashing and RCU
On 3/4/07, David Miller <[EMAIL PROTECTED]> wrote:
> From: "Michael K. Edwards" <[EMAIL PROTECTED]>
>
> > Before I implement, I design. Before I design, I analyze. Before I
> > analyze, I prototype. Before I prototype, I gather requirements.
>
> How the heck do you ever get to writing ANYTHING if you work that way?

Oh, c'mon, the whole design process takes maybe 3 weeks for something I'm going to spend 3 months implementing. And half the time the client mistakes the prototype for the implementation and tries to run with it -- with the foreseeable consequences. Smaller projects are a lot less formal but the principle still holds: every time I've implemented without actually doing the homework leading up to a design, I've had cause to regret it. That goes double for problems that already have off-the-shelf solutions and only need improvement in ease of use and robustness in hostile environments.

> I certainly would never have written one single line of Linux kernel
> code if I had to go through that kind of sequence to actually get to
> writing code.

Lots of _code_ gets written as a side effect of each stage of that sequence. But none of that code should be mistaken for the _implementation_. Not when the implementation is going to ship inside hundreds of thousands of widgets that people are going to stick in their ears, or is going to monitor the integrity of an intercontinental telecoms network. Most shipping code, including most code that I have written and large swathes of the Linux kernel, has never really gone beyond the prototype stage. There's no shame in that; the shame would be in refusing to admit it.

> And that's definitely not the "Linux way". You code up ideas as soon
> as you come up with one that has a chance of working, and you see what
> happens. Sure, you'll throw a lot away, but at least you will "know"
> instead of "think".

I call that process "prototyping". I used to do a lot of it; but I have since found that thinking first is actually more efficient. There is very little point in prototyping the square wheel again and again and again. And especially given that Linux already has plenty of nice, round wheels, there aren't that many places left where you can impress the world by replacing a square wheel with a hexagonal one. Replacing wheels with maglev tracks requires a real design phase.

> You have to try things, "DO" stuff, not just sit around and theorize
> and design things and shoot down ideas on every negative minute detail
> you can come up with before you type any code in. That mode of
> development doesn't inspire people and get a lot of code written.

I'll cop to the "negative" part of this, at least to some degree, but not the rest. "Naive hashes DDoS easily" is not a minute detail. "Hash tables don't provide the operations characteristic of priority queues" is not a minute detail. "The benchmarks offered do not accurately model system impact" is not a minute detail. If I were not sufficiently familiar with some of Evgeniy's other contributions to the kernel to think him capable of responding to these critiques with a better design, I would not have bothered.

> I definitely do not think others should use this
> design/prototype/analyze/blah/blah way of developing as an example,
> instead I think folks should use people like Ingo Molnar as an example
> of a good Linux developer. People like Ingo rewrite the scheduler one
> night because of a tiny cool idea, and even if only 1 out of 10 hacks
> like that turn out to be useful, his work is invaluable and since he's
> actually trying to do things and writing lots of code this inspires
> other people.

Ingo Molnar is certainly a good, nay an excellent, Linux developer. His prototypes are brilliant even when they're under-designed by my way of thinking. By all means, readers should go thou and do likewise. And when someone presses an alternative design, "show me the code" is a fair response.

At present, I am not offering code, nor design, nor even a prototype. But I am starting to figure out which bits of the Linux kernel really are implementations based on a solid design. Any prototyping I wind up doing is going to be bolted firmly onto those implementations, not floating in the surrounding sea of prototype code; and any analysis I offer based on that prototype will reflect, to the best of my ability, an understanding of its weaknesses as well as its strengths. If that analysis passes community scrutiny, and if I remain interested in the project, perhaps I will go through with design and implementation as well.

This, incidentally, seems very similar to the process that Robert Olsson and Stefan Nilsson have gone through with their trie/hash project. Although I haven't tried it out yet and don't have any basis for an independent opinion, the data and analysis provided in their paper are quite convincing. Any prototyping I might do would probably build on their work, perhaps adding a more explicit DDoS-survival strategy based on a priori
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Mon, Mar 05, 2007 at 12:42:29AM -0600, Matt Mackall wrote:
> On Sun, Mar 04, 2007 at 05:16:25PM -0800, Greg KH wrote:
> > On Sun, Mar 04, 2007 at 04:08:57PM -0600, Matt Mackall wrote:
> > > Recent kernels are having troubles with wireless for me. Two seemingly
> > > related problems:
> > >
> > > a) NetworkManager seems oblivious to the existence of my IPW2200
> > > b) Manual iwconfig waits for 60s and then reports:
> > >
> > > Error for wireless request "Set Encode" (8B2A) :
> > > SET failed on device eth1 ; Operation not supported.
> >
> > Do you have CONFIG_SYSFS_DEPRECATED enabled? If not, please do as that
> > will keep you from having to change any userspace code.
>
> No, it's disabled. Will test once I'm done tracking down the iwconfig
> problem. From the help text for SYSFS_DEPRECATED:
>
> If you are using a distro that was released in 2006 or
> later, it should be safe to say N here.
>
> If we need an as-yet-unreleased HAL without it, I would say the above
> should be changed to 2008 or so. If Debian actually cuts a release in
> the next few months, you might make that 2010.

Well, just because Debian has such a slow release cycle, should the rest of the world be forced to follow suit? :)

When I originally wrote that, I thought Debian would have already done their release, my mistake...

thanks,

greg k-h
Re: [PATCH] NET : convert network timestamps to ktime_t
David Miller wrote:
From: Eric Dumazet <[EMAIL PROTECTED]>
Date: Fri, 02 Mar 2007 23:46:14 +0100

Stephen Hemminger wrote:
You missed a couple of spots.

Arg yes...
...
-	}
-	skb_get_timestamp(skb, &svsk->sk_sk->sk_stamp);
+	svsk->sk_sk->sk_stamp = (skb->tstamp.tv64 != 0) ? skb->tstamp
+				: ktime_get_real();

Well, if we want to stay in the spirit of the old code, we probably want to use current_kernel_time() (+ timespec_to_ktime()), because it's less expensive. And also setting the skb tstamp, no ?

Can you guys cook up an integrated patch with all the missing cases fixed up as desired, so I can add this to net-2.6.22, thanks?

Yes, I will send the patch against net-2.6.22 this morning.
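For readers following the conversion, here is a minimal userspace sketch of the idea, assuming a signed 64-bit nanosecond count stands in for ktime_t. All demo_ names are made up for illustration and are not kernel API; the kernel helpers named in the patch (ktime_to_timeval(), ktime_get_real()) are not reproduced here. The last function mirrors the "use the skb stamp if it is set, otherwise take the current time" selection from the svcsock hunk quoted above, with the current time passed in by the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical stand-in for ktime_t: one signed 64-bit nanosecond count. */
typedef struct {
	int64_t tv64;
} demo_ktime_t;

#define DEMO_NSEC_PER_SEC  1000000000LL
#define DEMO_NSEC_PER_USEC 1000LL

/* Convert a (non-negative, normalized) struct timeval to nanoseconds. */
static demo_ktime_t demo_timeval_to_ktime(struct timeval tv)
{
	assert(tv.tv_usec >= 0 && tv.tv_usec < 1000000);
	demo_ktime_t kt = {
		(int64_t)tv.tv_sec * DEMO_NSEC_PER_SEC +
		(int64_t)tv.tv_usec * DEMO_NSEC_PER_USEC
	};
	return kt;
}

/* Convert back; sub-microsecond digits are dropped (only non-negative
 * values are handled in this sketch). */
static struct timeval demo_ktime_to_timeval(demo_ktime_t kt)
{
	struct timeval tv;

	tv.tv_sec = (time_t)(kt.tv64 / DEMO_NSEC_PER_SEC);
	tv.tv_usec = (suseconds_t)((kt.tv64 % DEMO_NSEC_PER_SEC) /
				   DEMO_NSEC_PER_USEC);
	return tv;
}

/* A zero tv64 is treated as "no timestamp yet", as in the quoted hunk. */
static demo_ktime_t demo_stamp_or_now(demo_ktime_t skb_stamp, demo_ktime_t now)
{
	return skb_stamp.tv64 != 0 ? skb_stamp : now;
}
```

The single 64-bit field is what gives the 8-byte size the patch description relies on, compared with the 16-byte struct timeval on 64bit platforms.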
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Sun, Mar 04, 2007 at 05:16:25PM -0800, Greg KH wrote:
> On Sun, Mar 04, 2007 at 04:08:57PM -0600, Matt Mackall wrote:
> > Recent kernels are having troubles with wireless for me. Two seemingly
> > related problems:
> >
> > a) NetworkManager seems oblivious to the existence of my IPW2200
> > b) Manual iwconfig waits for 60s and then reports:
> >
> > Error for wireless request "Set Encode" (8B2A) :
> > SET failed on device eth1 ; Operation not supported.
>
> Do you have CONFIG_SYSFS_DEPRECATED enabled? If not, please do as that
> will keep you from having to change any userspace code.

No, it's disabled. Will test once I'm done tracking down the iwconfig problem. From the help text for SYSFS_DEPRECATED:

  If you are using a distro that was released in 2006 or
  later, it should be safe to say N here.

If we need an as-yet-unreleased HAL without it, I would say the above should be changed to 2008 or so. If Debian actually cuts a release in the next few months, you might make that 2010.

--
Mathematics is the supreme nostalgia of our time.
Re: [1/6] 2.6.21-rc2: known regressions
On Sun, Mar 04, 2007 at 11:01:33PM -0500, Mark Lord wrote:
> Adrian Bunk wrote:
> >
> >Subject: Bluetooth RFComm locks up the machine (device_move() related)
> >References : http://lkml.org/lkml/2007/3/4/64
> >Submitter : Mark Lord <[EMAIL PROTECTED]>
> >Caused-By : Marcel Holtmann <[EMAIL PROTECTED]>
> > commit c1a3313698895d8ad4760f98642007bf236af2e8
> >Status : unknown
>
> A 2-line patch exists for fs/sysfs/dir.c to address this.
> Waiting on Greg to apply it or substitute something prettier. ;)

I want to see if Marcel agrees with it, as he did the original patch in that area.

thanks,

greg k-h
Re: [PATCH][UDP]: Fix "whitespace" cleanup
From: "Arnaldo Carvalho de Melo" <[EMAIL PROTECTED]>
Date: Mon, 5 Mar 2007 00:11:49 -0300

> Hi David, Stephen
>
>Whitespace cleanups have to pass the compile test too ;-) This
> is just in net-2.6.22 tho :-)
>
> Signed-off-by: Arnaldo Carvalho de Melo <[EMAIL PROTECTED]>

I think I'll just revert Stephen's patch, both revisions were done very sloppily, and this lack of compile testing basically proves it. :-/
Re: [1/6] 2.6.21-rc2: known regressions
Adrian Bunk wrote:
> Subject: Bluetooth RFComm locks up the machine (device_move() related)
> References : http://lkml.org/lkml/2007/3/4/64
> Submitter : Mark Lord <[EMAIL PROTECTED]>
> Caused-By : Marcel Holtmann <[EMAIL PROTECTED]>
>  commit c1a3313698895d8ad4760f98642007bf236af2e8
> Status : unknown

A 2-line patch exists for fs/sysfs/dir.c to address this.
Waiting on Greg to apply it or substitute something prettier. ;)

Cheers
Re: [1/6] 2.6.21-rc2: known regressions
On Mon, Mar 05, 2007 at 02:50:31AM +0100, Adrian Bunk wrote:
> Subject: kref refcounting breakage
> References : http://lkml.org/lkml/2007/3/2/67
> Submitter : Andrew Morton <[EMAIL PROTECTED]>
> Handled-By : Greg KH <[EMAIL PROTECTED]>
> Status : unknown

I'm working on tracking this down still...

> Subject: wireless breakage (ipw2200, iwconfig, NetworkManager)
> References : http://lkml.org/lkml/2007/3/4/135
> Submitter : Matt Mackall <[EMAIL PROTECTED]>
> Caused-By : Greg Kroah-Hartman <[EMAIL PROTECTED]> (?)
> commit 43cb76d91ee85f579a69d42bc8efc08bac560278 (?)
> Handled-By : Johannes Berg <[EMAIL PROTECTED]>
> Status : unknown

I really think this is a CONFIG_SYSFS_DEPRECATED issue (not being set), but I want to get Matt to confirm either way before saying whether this is a real issue or not.

thanks,

greg k-h
Re: [PATCH][UDP]: Fix "whitespace" cleanup
On 3/5/07, Arnaldo Carvalho de Melo <[EMAIL PROTECTED]> wrote:
> Hi David, Stephen
>
> Whitespace cleanups have to pass the compile test too ;-) This
> is just in net-2.6.22 tho :-)
>
> Signed-off-by: Arnaldo Carvalho de Melo <[EMAIL PROTECTED]>

There is one more problem:

-	__u16 destp = ntohs(inet->dport);
-	__u16 srcp= ntohs(inet->sport);
+	__u16 dest = ntohs(inet->dport);
+	__u16 srcp = ntohs(inet->sport);

It changes 'destp' to dest, and there is already a dest variable, updated patch attached,

Signed-off-by: Arnaldo Carvalho de Melo <[EMAIL PROTECTED]>

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 5309dd5..45b58ab 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -196,7 +196,7 @@ gotit:
 			if (sk2->sk_hash == snum &&
 			    sk2 != sk &&
 			    (!sk2->sk_reuse || !sk->sk_reuse) &&
-			    || sk2->sk_bound_dev_if == sk->sk_bound_dev_if) &&
+			    (sk2->sk_bound_dev_if == sk->sk_bound_dev_if) &&
 			    (*saddr_comp)(sk, sk2))
 				goto fail;
 		}
@@ -1660,7 +1660,7 @@ static void udp4_format_sock(struct sock *sp, char *tmpbuf, int bucket)
 	struct inet_sock *inet = inet_sk(sp);
 	__be32 dest = inet->daddr;
 	__be32 src = inet->rcv_saddr;
-	__u16 dest = ntohs(inet->dport);
+	__u16 destp = ntohs(inet->dport);
 	__u16 srcp = ntohs(inet->sport);
 	sprintf(tmpbuf, "%4d: %08X:%04X %08X:%04X"
[PATCH][UDP]: Fix "whitespace" cleanup
Hi David, Stephen

Whitespace cleanups have to pass the compile test too ;-) This is just in net-2.6.22 tho :-)

Signed-off-by: Arnaldo Carvalho de Melo <[EMAIL PROTECTED]>

- Arnaldo

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 5309dd5..2e9fd1e 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -196,7 +196,7 @@ gotit:
 			if (sk2->sk_hash == snum &&
 			    sk2 != sk &&
 			    (!sk2->sk_reuse || !sk->sk_reuse) &&
-			    || sk2->sk_bound_dev_if == sk->sk_bound_dev_if) &&
+			    (sk2->sk_bound_dev_if == sk->sk_bound_dev_if) &&
 			    (*saddr_comp)(sk, sk2))
 				goto fail;
 		}
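The condition in the hunk above can be read in isolation as a predicate. What follows is only a sketch under stated assumptions: struct demo_sock is a hypothetical, cut-down stand-in for the three struct sock fields the hunk touches, the comparator is a stub, and the expression mirrors the "+" side of the patch as posted. It does not claim to match what the code looked like before the whitespace cleanup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the struct sock fields used in the hunk. */
struct demo_sock {
	unsigned int hash;	/* plays the role of sk_hash */
	int reuse;		/* plays the role of sk_reuse */
	int bound_dev_if;	/* plays the role of sk_bound_dev_if */
};

typedef bool (*demo_saddr_cmp)(const struct demo_sock *,
			       const struct demo_sock *);

/* Stub comparator for the example: every address pair "matches". */
static bool demo_saddr_always_equal(const struct demo_sock *a,
				    const struct demo_sock *b)
{
	(void)a;
	(void)b;
	return true;
}

/* True when sk2 would block sk from taking port number snum. */
static bool demo_bind_conflict(const struct demo_sock *sk,
			       const struct demo_sock *sk2,
			       unsigned int snum, demo_saddr_cmp saddr_comp)
{
	assert(sk != NULL && sk2 != NULL && saddr_comp != NULL);
	return sk2->hash == snum &&
	       sk2 != sk &&
	       (!sk2->reuse || !sk->reuse) &&
	       (sk2->bound_dev_if == sk->bound_dev_if) &&
	       saddr_comp(sk, sk2);
}
```

Every clause is fully parenthesized and joined by &&, so a stray leading || like the one on the "-" line cannot slip through: the compiler rejects it, which is the point of the compile-test remark.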
Re: Wifi support for iproute2
Stephen Hemminger wrote:
> Don't waste your time with a tool that uses the existing wext API.
> But a tool that could use cfg80211 would be useful. After the wireless
> summit in Jan, I put it on my "interesting ideas" list.

Ok. I would be happy if anyone would notify me when this is available without going too far from a mainstream kernel.

Stephan
Re: [1/6] 2.6.21-rc2: known regressions
On Mon, 5 Mar 2007 02:50:31 +0100 Adrian Bunk <[EMAIL PROTECTED]> wrote:

> This email lists some known regressions in 2.6.21-rc2 compared to 2.6.20
> that are not yet fixed in Linus' tree.

We seem to have broken an unusually large amount of stuff this time.

partial post-mortem:

- The ACPICA merge landed in -mm super-late: basically it was in mainline a week afterwards and saw only a single -mm release.

  Part of the reason for this short period in -mm was that ACPICA had its paws all over x86_64 code and conflicted badly with significant changes in the x86_64 tree. That happens sometimes. But when it does, the mess lands in my lap rather than in the laps of the perpetrators.

  Lesson: keep the code well-factored so that different subsystems don't soil each others' kennels.

- The hrtimers/dynticks stuff is simply hard: timekeeping, low-level x86, even APICs. These are areas in which things break a lot, so churning it was inevitably going to cause problems.

  Lesson: none, I think. Low-level x86 support is just hard, and changing it breaks things.

So that accounts for _some_ of the damage, but I wonder if there's more to it than that.
[2.6 patch] make drivers/net/s2io.c:vlan_strip_flag static
This patch makes the needlessly global vlan_strip_flag static.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---
--- linux-2.6.21-rc2-mm1/drivers/net/s2io.c.old 2007-03-04 21:37:59.0 +0100
+++ linux-2.6.21-rc2-mm1/drivers/net/s2io.c 2007-03-04 21:38:14.0 +0100
@@ -316,7 +316,7 @@
 }
 
 /* A flag indicating whether 'RX_PA_CFG_STRIP_VLAN_TAG' bit is set or not */
-int vlan_strip_flag;
+static int vlan_strip_flag;
 
 /* Unregister the vlan */
 static void s2io_vlan_rx_kill_vid(struct net_device *dev, unsigned long vid)
[-mm patch] drivers/net/bonding/bond_main.c:make 3 functions static
This patch makes the following needlessly global functions static:
- bond_mode_name()
- bond_sethwaddr()
- bond_mii_monitor()

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

 drivers/net/bonding/bond_main.c |7 ---
 drivers/net/bonding/bonding.h |3 ---
 2 files changed, 4 insertions(+), 6 deletions(-)

--- linux-2.6.21-rc2-mm1/drivers/net/bonding/bonding.h.old 2007-03-04 21:33:14.0 +0100
+++ linux-2.6.21-rc2-mm1/drivers/net/bonding/bonding.h 2007-03-04 21:34:46.0 +0100
@@ -301,13 +301,10 @@
 void bond_destroy_slave_symlinks(struct net_device *master, struct net_device *slave);
 int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev);
 int bond_release(struct net_device *bond_dev, struct net_device *slave_dev);
-int bond_sethwaddr(struct net_device *bond_dev, struct net_device *slave_dev);
-void bond_mii_monitor(struct work_struct *work);
 void bond_loadbalance_arp_mon(struct work_struct *work);
 void bond_activebackup_arp_mon(struct work_struct *work);
 void bond_set_mode_ops(struct bonding *bond, int mode);
 int bond_parse_parm(char *mode_arg, struct bond_parm_tbl *tbl);
-const char *bond_mode_name(int mode);
 void bond_select_active_slave(struct bonding *bond);
 void bond_change_active_slave(struct bonding *bond, struct slave *new_active);
 void bond_register_arp(struct bonding *);
--- linux-2.6.21-rc2-mm1/drivers/net/bonding/bond_main.c.old 2007-03-04 21:33:29.0 +0100
+++ linux-2.6.21-rc2-mm1/drivers/net/bonding/bond_main.c 2007-03-04 21:34:56.0 +0100
@@ -187,7 +187,7 @@
 /* General routines -*/
 
-const char *bond_mode_name(int mode)
+static const char *bond_mode_name(int mode)
 {
 	switch (mode) {
 	case BOND_MODE_ROUNDROBIN :
@@ -1200,7 +1200,8 @@
 /*-- IOCTL --*/
 
-int bond_sethwaddr(struct net_device *bond_dev, struct net_device *slave_dev)
+static int bond_sethwaddr(struct net_device *bond_dev,
+			  struct net_device *slave_dev)
 {
 	dprintk("bond_dev=%p\n", bond_dev);
 	dprintk("slave_dev=%p\n", slave_dev);
@@ -2014,7 +2015,7 @@
 /* Monitoring ---*/
 
 /* this function is called regularly to monitor each slave's link. */
-void bond_mii_monitor(struct work_struct *work)
+static void bond_mii_monitor(struct work_struct *work)
 {
 	struct bonding *bond = container_of(work, struct bonding, mii_work.work);
 	struct net_device *bond_dev = bond->dev;
[2.6 patch] drivers/net/qla3xxx.c: make 2 functions static
This patch makes two needlessly global functions static.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---
--- linux-2.6.21-rc2-mm1/drivers/net/qla3xxx.c.old 2007-03-04 21:41:27.0 +0100
+++ linux-2.6.21-rc2-mm1/drivers/net/qla3xxx.c 2007-03-04 21:41:39.0 +0100
@@ -1797,14 +1797,14 @@
 	atomic_inc(&qdev->tx_count);
 }
 
-void ql_get_sbuf(struct ql3_adapter *qdev)
+static void ql_get_sbuf(struct ql3_adapter *qdev)
 {
 	if (++qdev->small_buf_index == NUM_SMALL_BUFFERS)
 		qdev->small_buf_index = 0;
 	qdev->small_buf_release_cnt++;
 }
 
-struct ql_rcv_buf_cb *ql_get_lbuf(struct ql3_adapter *qdev)
+static struct ql_rcv_buf_cb *ql_get_lbuf(struct ql3_adapter *qdev)
 {
 	struct ql_rcv_buf_cb *lrg_buf_cb = NULL;
 	lrg_buf_cb = &qdev->lrg_buf[qdev->lrg_buf_index];
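As a small illustration of what these "make it static" patches change, the sketch below keeps a helper at internal linkage and exports a single entry point. It is a standalone example, not driver code: struct demo_adapter, DEMO_NUM_SMALL_BUFFERS and demo_consume() are hypothetical names, and only the wrap-around logic is modelled on the ql_get_sbuf() body shown in the hunk.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NUM_SMALL_BUFFERS 4	/* hypothetical ring size */

/* Hypothetical stand-in for the two counters ql_get_sbuf() updates. */
struct demo_adapter {
	unsigned int small_buf_index;
	unsigned int small_buf_release_cnt;
};

/* Internal linkage: invisible to other translation units, so the name
 * cannot clash with another file and the compiler may inline it freely. */
static void demo_get_sbuf(struct demo_adapter *qdev)
{
	if (++qdev->small_buf_index == DEMO_NUM_SMALL_BUFFERS)
		qdev->small_buf_index = 0;	/* wrap around the ring */
	qdev->small_buf_release_cnt++;
}

/* External linkage: the one function other files are meant to call.
 * Returns the ring index after consuming count buffers. */
unsigned int demo_consume(struct demo_adapter *qdev, unsigned int count)
{
	assert(qdev != NULL);
	while (count--)
		demo_get_sbuf(qdev);
	return qdev->small_buf_index;
}
```

A declaration that no header needs, as with the three prototypes removed from bonding.h in the previous patch, is usually the sign that a function can be made static in this way.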
[1/6] 2.6.21-rc2: known regressions
This email lists some known regressions in 2.6.21-rc2 compared to 2.6.20 that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either the submitter of one of the bugs, the maintainer of an affected subsystem or driver, a patch of yours caused a breakage, or I'm considering you in any other way possibly involved with one or more of these issues.

Due to the huge amount of recipients, please trim the Cc when answering.


Subject    : kwin dies silently
References : http://lkml.org/lkml/2007/2/28/112
Submitter  : Sid Boyce <[EMAIL PROTECTED]>
Status     : unknown


Subject    : resume: slab error in verify_redzone_free(): cache `size-512':
             memory outside object was overwritten
References : http://lkml.org/lkml/2007/2/24/41
Submitter  : Pavel Machek <[EMAIL PROTECTED]>
Handled-By : Marcel Holtmann <[EMAIL PROTECTED]>
Status     : unknown


Subject    : bluetooth hardlocks
References : http://lkml.org/lkml/2007/3/2/85
Submitter  : Pavel Machek <[EMAIL PROTECTED]>
Status     : unknown


Subject    : Bluetooth RFComm locks up the machine (device_move() related)
References : http://lkml.org/lkml/2007/3/4/64
Submitter  : Mark Lord <[EMAIL PROTECTED]>
Caused-By  : Marcel Holtmann <[EMAIL PROTECTED]>
             commit c1a3313698895d8ad4760f98642007bf236af2e8
Status     : unknown


Subject    : kref refcounting breakage
References : http://lkml.org/lkml/2007/3/2/67
Submitter  : Andrew Morton <[EMAIL PROTECTED]>
Handled-By : Greg KH <[EMAIL PROTECTED]>
Status     : unknown


Subject    : wireless breakage (ipw2200, iwconfig, NetworkManager)
References : http://lkml.org/lkml/2007/3/4/135
Submitter  : Matt Mackall <[EMAIL PROTECTED]>
Caused-By  : Greg Kroah-Hartman <[EMAIL PROTECTED]> (?)
             commit 43cb76d91ee85f579a69d42bc8efc08bac560278 (?)
Handled-By : Johannes Berg <[EMAIL PROTECTED]>
Status     : unknown


Subject    : forcedeth: skb_over_panic
References : http://bugzilla.kernel.org/show_bug.cgi?id=8058
Submitter  : Albert Hopkins <[EMAIL PROTECTED]>
Status     : unknown
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Sun, Mar 04, 2007 at 06:25:50PM -0600, Matt Mackall wrote:
> On Mon, Mar 05, 2007 at 12:39:24AM +0100, Johannes Berg wrote:
> > [adding linux-wireless to CC]
> >
> > On Sun, 2007-03-04 at 16:08 -0600, Matt Mackall wrote:
> > > Recent kernels are having troubles with wireless for me. Two seemingly
> > > related problems:
> >
> > I don't think they are related actually.
> >
> > > a) NetworkManager seems oblivious to the existence of my IPW2200
> >
> > This is due to the recent sysfs restructuring I think. IIRC the fix is
> > to upgrade hal to a current git version.
>
> If that's the cause, the fix is to back out whatever was done to break
> userspace. Breaking userspace is not ok. Upgrading from 2.6.x to
> 2.6.x+1 should not entail replacing substantial parts of userspace,
> especially with NOT-EVEN-FRAKKING-RELEASED-YET CODE.

I should not have broken any userspace if CONFIG_SYSFS_DEPRECATED is enabled with that patch. If that is enabled, and that patch still causes problems, please let me know.

thanks,

greg k-h
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Sun, Mar 04, 2007 at 04:08:57PM -0600, Matt Mackall wrote:
> Recent kernels are having troubles with wireless for me. Two seemingly
> related problems:
>
> a) NetworkManager seems oblivious to the existence of my IPW2200
> b) Manual iwconfig waits for 60s and then reports:
>
> Error for wireless request "Set Encode" (8B2A) :
> SET failed on device eth1 ; Operation not supported.

Do you have CONFIG_SYSFS_DEPRECATED enabled? If not, please do as that will keep you from having to change any userspace code.

thanks,

greg k-h
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Sun, Mar 04, 2007 at 04:45:25PM -0800, Andrew Morton wrote:
> On Sun, 4 Mar 2007 18:25:50 -0600 Matt Mackall <[EMAIL PROTECTED]> wrote:
>
> > On Mon, Mar 05, 2007 at 12:39:24AM +0100, Johannes Berg wrote:
> > > [adding linux-wireless to CC]
> > >
> > > On Sun, 2007-03-04 at 16:08 -0600, Matt Mackall wrote:
> > > > Recent kernels are having troubles with wireless for me. Two seemingly
> > > > related problems:
> > >
> > > I don't think they are related actually.
> > >
> > > > a) NetworkManager seems oblivious to the existence of my IPW2200
> > >
> > > This is due to the recent sysfs restructuring I think. IIRC the fix is
> > > to upgrade hal to a current git version.
> >
> > If that's the cause, the fix is to back out whatever was done to break
> > userspace. Breaking userspace is not ok. Upgrading from 2.6.x to
> > 2.6.x+1 should not entail replacing substantial parts of userspace,
> > especially with NOT-EVEN-FRAKKING-RELEASED-YET CODE.
>
> yep. Adrian, I think we should track this as a blocking regression, at
> least until we've fully understood the implications and had the usual
> arguments.

I'm currently tracking it as one of the 31 2.6.21-rc regressions that are not yet fixed in Linus' tree, and for me each of them is a blocker until proven otherwise.

Whether Linus releases 2.6.21 despite blocking regressions is a different question...

cu
Adrian

--

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Sun, 4 Mar 2007 18:25:50 -0600 Matt Mackall <[EMAIL PROTECTED]> wrote: > On Mon, Mar 05, 2007 at 12:39:24AM +0100, Johannes Berg wrote: > > [adding linux-wireless to CC] > > > > On Sun, 2007-03-04 at 16:08 -0600, Matt Mackall wrote: > > > Recent kernels are having troubles with wireless for me. Two seemingly > > > related problems: > > > > I don't think they are related actually. > > > > > a) NetworkManager seems oblivious to the existence of my IPW2200 > > > > This is due to the recent sysfs restructuring I think. IIRC the fix is > > to upgrade hal to a current git version. > > If that's the cause, the fix is to back out whatever was done to break > userspace. Breaking userspace is not ok. Upgrading from 2.6.x to > 2.6.x+1 should not entail replacing substantial parts of userspace, > especially with NOT-EVEN-FRAKKING-RELEASED-YET CODE. yep. Adrian, I think we should track this as a blocking regression, at least until we've fully understood the implications and had the usual arguments. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [EMAIL PROTECTED]: Mail delivery failed: returning message to sender]
In the current code SID 0 indicates that the socket is to be un-bound. Supporting unbinding of the socket was intended to permit the PPPoE session to be reconnected without closing/reopening the socket; which would mean that you'd have to re-bind the PPPoE/PPP channel bindings. Thus it is conceivable to swap or renegotiate PPPoE connection underneath a PPP connection, hypothetically if anyone ever considered doing so.

Is that worth it? I don't know. One could eliminate that disconnect behavior and I don't think anyone would care.

I'll concede that a SID of 0 could appear from outer space. I've never seen that happening. The only way I see this being an issue is if a PPPoE server insists on giving you SID 0 and only SID 0 repeatedly. And I've never seen *that* happening.

If you'd really like to pursue this, I'll be happy to review and ack patches in this regard. However, I don't see what there is to be actually gained by pursuing this. I'm open to being convinced; what is the motivation behind this? If there is a real problem here I'll be glad to get involved in fixing it myself.

-- Michal Ostrowski <[EMAIL PROTECTED]>

On Sun, 2007-03-04 at 15:56 +0100, Florian Zumbiehl wrote:
> - Forwarded message from Mail Delivery System <[EMAIL PROTECTED]> -
>
> Date: Sun, 04 Mar 2007 15:52:51 +0100
> From: Mail Delivery System <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: Mail delivery failed: returning message to sender
> Delivery-date: Sun, 04 Mar 2007 15:54:05 +0100
> Auto-Submitted: auto-generated
>
> This message was created automatically by mail delivery software.
>
> A message that you sent could not be delivered to one or more of its
> recipients. This is a permanent error. The following address(es) failed:
>
> [EMAIL PROTECTED]
> SMTP error from remote mailer after MAIL FROM:<[EMAIL PROTECTED]>
> SIZE=5299:
> host mx1.earthlink.net [209.86.93.226]: 550 550 Dynamic/zombied/spam IPs
> blocked.
Write [EMAIL PROTECTED] > > -- This is a copy of the message, including all the headers. -- > > Return-path: <[EMAIL PROTECTED]> > Received: from florz.florz.dyndns.org ([192.168.0.121]) > by rain.florz.dyndns.org with esmtp (Exim 4.50) > id 1HNs4n-0006c7-E0; Sun, 04 Mar 2007 15:52:41 +0100 > Received: from florz by florz.florz.dyndns.org with local (Exim 3.35 #1 > (Debian)) > id 1HNs4k-xd-00; Sun, 04 Mar 2007 15:52:38 +0100 > Date: Sun, 4 Mar 2007 15:52:38 +0100 > From: Florian Zumbiehl <[EMAIL PROTECTED]> > To: Michal Ostrowski <[EMAIL PROTECTED]> > Cc: David Miller <[EMAIL PROTECTED]>, netdev@vger.kernel.org, > [EMAIL PROTECTED] > Subject: Re: Session ID 0 with PPPoE > Message-ID: <[EMAIL PROTECTED]> > References: <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> > Mime-Version: 1.0 > Content-Type: text/plain; charset=us-ascii > Content-Disposition: inline > In-Reply-To: <[EMAIL PROTECTED]> > User-Agent: Mutt/1.5.9i > > Hi, > > > >>From the RFC: > > > > 5.4 The PPPoE Active Discovery Session-confirmation (PADS) packet > > > >When the Access Concentrator receives a PADR packet, it prepares to > >begin a PPP session. It generates a unique SESSION_ID for the PPPoE > >session and replies to the Host with a PADS packet. The > >DESTINATION_ADDR field is the unicast Ethernet address of the Host > >that sent the PADR. The CODE field is set to 0x65 and the SESSION_ID > >MUST be set to the unique value generated for this PPPoE session. > > > >The PADS packet contains exactly one TAG of TAG_TYPE Service-Name, > >indicating the service under which Access Concentrator has accepted > >the PPPoE session, and any number of other TAG types. > > > >If the Access Concentrator does not like the Service-Name in the > >PADR, then it MUST reply with a PADS containing a TAG of TAG_TYPE > >Service-Name-Error (and any number of other TAG types). In this case > >the SESSION_ID MUST be set to 0x. 
> > > > > > > > As you can see from the last paragraph, a session id of 0 implies a > > rejection of the PADR. Thus, you can't possibly get a PADS packet that > > completes and initiates a valid session if the session id is 0. > > > > Note that the RFC does not prohibit all other aspects of the PADS to be > > structured as if it were a valid success response; the only condition > > and requirement of a failure mode here is the session id. > > | [...] then it MUST reply with a PADS containing a TAG of TAG_TYPE > | Service-Name-Error [...] > > !?! > > To my understanding, the indicator is the Service-Name-Error tag, and > the RFC only states that if such a tag is present (indicating that > the AC "doesn't like" the requested service name and thus rejects the > session request), the session id field must be 0x - not that the > session id field may not be 0x if this tag is not present (which > would indicate that this is a valid session). > > > Also 0x is reserved for future
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On 3/5/07, Matt Mackall <[EMAIL PROTECTED]> wrote: > This is due to the recent sysfs restructuring I think. IIRC the fix is > to upgrade hal to a current git version. If that's the cause, the fix is to back out whatever was done to break userspace. Breaking userspace is not ok. Upgrading from 2.6.x to 2.6.x+1 should not entail replacing substantial parts of userspace, especially with NOT-EVEN-FRAKKING-RELEASED-YET CODE. I will try a new HAL when it shows up in Debian/unstable and not a moment sooner. But you're running a kernel that's not in Debian/unstable so this seems a bit hypocritical. When you work with bleeding edge kernels you have to be prepared to work around things. Hell for ages git wasn't in Debian - unstable even, udev would break things etc. Just my 2c worth. -- Web: http://wand.net.nz/~iam4 Blog: http://iansblog.jandi.co.nz WAND Network Research Group - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
On Mon, Mar 05, 2007 at 12:39:24AM +0100, Johannes Berg wrote: > [adding linux-wireless to CC] > > On Sun, 2007-03-04 at 16:08 -0600, Matt Mackall wrote: > > Recent kernels are having troubles with wireless for me. Two seemingly > > related problems: > > I don't think they are related actually. > > > a) NetworkManager seems oblivious to the existence of my IPW2200 > > This is due to the recent sysfs restructuring I think. IIRC the fix is > to upgrade hal to a current git version. If that's the cause, the fix is to back out whatever was done to break userspace. Breaking userspace is not ok. Upgrading from 2.6.x to 2.6.x+1 should not entail replacing substantial parts of userspace, especially with NOT-EVEN-FRAKKING-RELEASED-YET CODE. I will try a new HAL when it shows up in Debian/unstable and not a moment sooner. > > b) Manual iwconfig waits for 60s and then reports: > > That one's strange. > > > A second attempt to enable WEP via iwconfig succeeds and network > > connectivity is normal. However, NetworkManager still ignores the > > device at this point. > > I'd think it's a ipw bug but I have no idea if that was even touched > during this time. > > > Bisect with Mercurial points to this patch: > > > > $ hg bisect bad > > The first bad revision is: > > changeset: 46985:f701b96bb2f7 > > user:Greg Kroah-Hartman <[EMAIL PROTECTED]> > > date:Wed Feb 07 10:37:11 2007 -0800 > > summary: Network: convert network devices to use struct device > > instead of class_device > > > > which corresponds to 43cb76d91ee85f579a69d42bc8efc08bac560278 in git. > > Yup, sysfs breakage/hal stuff. Can you try with a recent hal? And maybe > try to bisect the iwconfig stop thing if you've got enough time... Will double-check the iwconfig tests. It's been masked by NetworkManager for a while. -- Mathematics is the supreme nostalgia of our time. 
Re: Fun splitting reassembled UDP skbuffs
From: David Howells <[EMAIL PROTECTED]>
Date: Sun, 04 Mar 2007 14:53:35 +0000

> Is it possible to mix the likes of skb_clone(), pskb_pull() and pskb_trim()?

It should work, pskb_pull() and pskb_trim() check for cloning and COW the data area as-needed.

I think you just have a refcounting error somewhere, based upon your SLAB corruption trace.
Re: [RFC NET_SCHED 00/03]: ktime + nano-second clock resolution for packet schedulers
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Sun, 4 Mar 2007 20:05:30 +0100 (MET)

> These patches convert the GETTIMEOFDAY packet scheduler clock source to
> ktime (based on Stephen's patch) and add support for using nano-second
> clock resolution. I chose a scalar time representation within the packet
> schedulers instead of ktime_t since it minimizes the ktime_to_ns() calls
> in most cases, it allows to clean up pkt_sched.h quite a bit and HFSC
> needs it anyway.
>
> Unlike my previous attempt at this, these patches keep old iproute
> versions working with nano-second resolution with the exception of HFSC.
> I'm not sure what to do about HFSC yet, so just RFC for now.

This looks great to me.

Frankly, I think now that we have ktime and all of the proper generic infrastructure to do this stuff properly, I think we should just use ktime for the packet scheduler across the board and just delete all of that old by-hand timekeeping selection crap from pkt_sched.h
Re: [PATCH] NET : convert network timestamps to ktime_t
From: Eric Dumazet <[EMAIL PROTECTED]>
Date: Fri, 02 Mar 2007 23:46:14 +0100

> Stephen Hemminger wrote:
> > You missed a couple of spots.
>
> Arg yes...

...

> > -	}
> > -	skb_get_timestamp(skb, &svsk->sk_sk->sk_stamp);
> > +	svsk->sk_sk->sk_stamp = (skb->tstamp.tv64 != 0) ? skb->tstamp
> > +				: ktime_get_real();
>
> Well, if we want to stay in the spirit of old code, we probably want to use
> current_kernel_time() (+ timespec_to_ktime()), because it's less expensive.
>
> And also setting the skb tstamp, no ?

Can you guys cook up an integrated patch with all the missing cases fixed up as desired, so I can add this to net-2.6.22, thanks?
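The fallback discussed above (use the packet's stamp if it was set, otherwise take the current time) can be sketched in userspace. This is only an illustration under stated assumptions: `stamp_ns_t`, `stamp_now()`, `stamp_or_now()` and `stamp_to_usec_pair()` are hypothetical names, not kernel APIs, and C11 `timespec_get()` stands in for `ktime_get_real()`.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical userspace stand-in for a ktime_t-style stamp:
 * signed 64-bit nanoseconds since the epoch, 0 meaning "not stamped yet". */
typedef int64_t stamp_ns_t;

/* Read the wall clock with nanosecond resolution (C11 timespec_get). */
static stamp_ns_t stamp_now(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return (stamp_ns_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Keep the packet's own stamp when it has one, else fall back to "now". */
static stamp_ns_t stamp_or_now(stamp_ns_t pkt_stamp)
{
	return pkt_stamp != 0 ? pkt_stamp : stamp_now();
}

/* Convert back to a (seconds, microseconds) pair for timeval-style users. */
static void stamp_to_usec_pair(stamp_ns_t s, int64_t *sec, int64_t *usec)
{
	*sec = s / 1000000000LL;
	*usec = (s % 1000000000LL) / 1000;
}
```

The single 64-bit scalar is what gives the 8-byte footprint mentioned in the patch description; the microsecond pair is only derived when an old-style consumer asks for it.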
Re: [PATCH] div64_64 consolidate (rev3)
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Fri, 2 Mar 2007 17:29:20 -0800

> Here is the current version of the 64 bit divide common code.
> Since it is used by three times by networking code, can we put it net-2.6.22
> tree?
>
> Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>

Applied to net-2.6.22, thanks.
Re: [PATCH] NET : keep sk_backlog near sk_lock
From: Eric Dumazet <[EMAIL PROTECTED]>
Date: Thu, 22 Feb 2007 15:50:15 +0100

> sk_backlog is a critical field of struct sock. (known famous words)
>
> It is (ab)used in hot paths, in particular in release_sock(), tcp_recvmsg(),
> tcp_v4_rcv(), sk_receive_skb().
>
> It really makes sense to place it next to sk_lock, because sk_backlog is only
> used after sk_lock locked (and thus memory cache line in L1 cache). This
> should reduce cache misses and sk_lock acquisition time.
>
> (In theory, we could only move the head pointer near sk_lock, and leaving tail
> far away, because 'tail' is normally not so hot, but keep it simple :) )
>
> Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]>

Applied, thanks a lot Eric.
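The layout argument in the quoted patch description can be checked mechanically. The sketch below is not the kernel's struct sock: `demo_sock`, `demo_lock`, `demo_backlog` and the 64-byte `CACHE_LINE` are assumptions chosen for illustration, and the check presumes the structure starts on a cache-line boundary.

```c
#include <stddef.h>

/* Assumed cache line size; real hardware varies. */
#define CACHE_LINE 64

/* Hypothetical stand-ins, not the kernel's definitions. */
struct demo_lock    { int slock; int owner; void *wq; };
struct demo_backlog { void *head; void *tail; };

struct demo_sock {
	struct demo_lock    lock;      /* taken first on the hot path */
	struct demo_backlog backlog;   /* only touched while lock is held */
	char                cold[256]; /* rarely used fields live further away */
};

/* Return 1 if two byte offsets fall in the same cache line, assuming the
 * enclosing structure itself is aligned to a cache-line boundary. */
static int same_cache_line(size_t a, size_t b)
{
	return a / CACHE_LINE == b / CACHE_LINE;
}

/* Does taking the lock also pull the backlog head into the cache? */
static int backlog_head_is_hot(void)
{
	return same_cache_line(offsetof(struct demo_sock, lock),
			       offsetof(struct demo_sock, backlog.head));
}
```

With the two members adjacent, the load that acquires the lock brings the backlog head in with it; moving `backlog` after `cold` in this toy layout would make `backlog_head_is_hot()` return 0.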
Re: [PATCH][BUG][SECURITY] Re: Weird problem with PPPoE on tap interface
From: Florian Zumbiehl <[EMAIL PROTECTED]>
Date: Sun, 4 Mar 2007 13:09:39 +0100

> > From: Florian Zumbiehl <[EMAIL PROTECTED]>
> > Date: Sun, 4 Mar 2007 02:55:16 +0100
> >
> > > Below you find a slightly changed version of the patch
> >
> > I already applied your first patch, so if you have any
> > fixes to submit please provide them as relative patches
> > to your original change.
> >
> > Thank you.
>
> Here you go ...

Applied, thanks a lot Florian.
Re: Wifi support for iproute2
On Sun, 2007-03-04 at 02:26 +0100, Stephan Maka wrote:
> Everyone told me dscape is near, but I hope iwlib will be ported then.

Just a few short notes to add to Stephen's reply.

dscape (or mac80211 now, it's been renamed) is going to migrate away from wireless extensions, in fact, wext is going to loom in the "backward compatibility" corner. That's quite some time off, but there's no point in writing new tools for it anyway.

Within the kernel, cfg80211 is (hopefully!) going to replace wext, with the main userspace API being nl80211 (but some sysfs stuff too.)

If you look at http://git.sipsolutions.net/pynl80211.git/ (usable as both a git and gitweb url) you'll find a python nl80211 tool I hacked up for some basic stuff to configure things via nl80211. Most of that doesn't actually work yet, of course, since mac80211 still uses wext internally, but as soon as we get enough manpower that'll change.

johannes
Re: Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
[adding linux-wireless to CC] On Sun, 2007-03-04 at 16:08 -0600, Matt Mackall wrote: > Recent kernels are having troubles with wireless for me. Two seemingly > related problems: I don't think they are related actually. > a) NetworkManager seems oblivious to the existence of my IPW2200 This is due to the recent sysfs restructuring I think. IIRC the fix is to upgrade hal to a current git version. > b) Manual iwconfig waits for 60s and then reports: That one's strange. > A second attempt to enable WEP via iwconfig succeeds and network > connectivity is normal. However, NetworkManager still ignores the > device at this point. I'd think it's a ipw bug but I have no idea if that was even touched during this time. > Bisect with Mercurial points to this patch: > > $ hg bisect bad > The first bad revision is: > changeset: 46985:f701b96bb2f7 > user:Greg Kroah-Hartman <[EMAIL PROTECTED]> > date:Wed Feb 07 10:37:11 2007 -0800 > summary: Network: convert network devices to use struct device > instead of class_device > > which corresponds to 43cb76d91ee85f579a69d42bc8efc08bac560278 in git. Yup, sysfs breakage/hal stuff. Can you try with a recent hal? And maybe try to bisect the iwconfig stop thing if you've got enough time... johannes signature.asc Description: This is a digitally signed message part
[NET] Fix PCnet32 performance bug on non-coherent architectures
The PCnet32 driver always passed the size of the largest possible packet to pci_dma_sync_single_for_cpu and pci_dma_sync_single_for_device. This results in fairly large "collateral damage" in the caches and makes the flush operation itself much slower. On a system with a 40MHz CPU this patch increases network bandwidth by about 12%.

Signed-off-by: Ralf Baechle <[EMAIL PROTECTED]>

diff --git a/drivers/net/pcnet32.c b/drivers/net/pcnet32.c
index 36f9d98..4d94ba7 100644
--- a/drivers/net/pcnet32.c
+++ b/drivers/net/pcnet32.c
@@ -1234,14 +1234,14 @@ static void pcnet32_rx_entry(struct net_device *dev,
 				skb_put(skb, pkt_len);	/* Make room */
 				pci_dma_sync_single_for_cpu(lp->pci_dev,
 							    lp->rx_dma_addr[entry],
-							    PKT_BUF_SZ - 2,
+							    pkt_len,
 							    PCI_DMA_FROMDEVICE);
 				eth_copy_and_sum(skb,
 						 (unsigned char *)(lp->rx_skbuff[entry]->data),
 						 pkt_len, 0);
 				pci_dma_sync_single_for_device(lp->pci_dev,
 							       lp->rx_dma_addr[entry],
-							       PKT_BUF_SZ - 2,
+							       pkt_len,
 							       PCI_DMA_FROMDEVICE);
 			}
 			lp->stats.rx_bytes += skb->len;
Recent wireless breakage (ipw2200, iwconfig, NetworkManager)
Recent kernels are having troubles with wireless for me. Two seemingly related problems: a) NetworkManager seems oblivious to the existence of my IPW2200 b) Manual iwconfig waits for 60s and then reports: Error for wireless request "Set Encode" (8B2A) : SET failed on device eth1 ; Operation not supported. During this time, my keyboard in X is unresponsive, but everything else seems to be functioning properly. Queued keypresses eventually show up. Alt-sysrq-w gives: ieee80211_crypt: registered algorithm 'WEP' SysRq : Show Blocked State freesibling task PCstack pid father child younger older events/0 D C0102D3C 0 4 1 5 3 (L-TLB) c1d0bf1c 0046 c1d0ac20 c0102d3c c1d0aa70 0022 000a c1d0aa70 ca8ed618 0034 0cd3 c1d0ab7c 0287 0002 f581f040 0002 c1d0bf34 0246 f7b5cb04 c1d0aa70 c038224e c0102c3a Call Trace: [] __switch_to+0x11b/0x143 [] __mutex_lock_slowpath+0xfb/0x1e2 [] __switch_to+0x19/0x143 [] ipw_bg_link_down+0x19/0xbd [ipw2200] [] ipw_bg_link_down+0x0/0xbd [ipw2200] [] run_workqueue+0x97/0x156 [] worker_thread+0x105/0x12e [] default_wake_function+0x0/0xc [] worker_thread+0x0/0x12e [] kthread+0xa0/0xc9 [] kthread+0x0/0xc9 [] kernel_thread_helper+0x7/0x10 === ipw2200/0 D 0020 0 1985 6 2260 1983 (L-TLB) f7981f24 0046 0001 0020 c1cdf8c0 000a f7d09030 e1fdc8c3 0034 093a f7d0913c 0086 0020 f7c4a740 0086 f7981f3c 0246 f7b5cb04 f7d09030 c038224e f7d09030 c0496550 Call Trace: [] __mutex_lock_slowpath+0xfb/0x1e2 [] __sched_text_start+0x4b3/0x56b [] ipw_bg_gather_stats+0x0/0x27 [ipw2200] [] ipw_bg_gather_stats+0x17/0x27 [ipw2200] [] run_workqueue+0x97/0x156 [] worker_thread+0x105/0x12e [] default_wake_function+0x0/0xc [] worker_thread+0x0/0x12e [] kthread+0xa0/0xc9 [] kthread+0x0/0xc9 [] kernel_thread_helper+0x7/0x10 === ieee80211_crypt_wep: could not allocate crypto API arc4 eth1: could not initialize WEP: load module ieee80211_crypt_wep ADDRCONF(NETDEV_UP): eth1: link is not ready A second attempt to enable WEP via iwconfig succeeds and network connectivity is normal. 
However, NetworkManager still ignores the device at this point. Bisect with Mercurial points to this patch: $ hg bisect bad The first bad revision is: changeset: 46985:f701b96bb2f7 user:Greg Kroah-Hartman <[EMAIL PROTECTED]> date:Wed Feb 07 10:37:11 2007 -0800 summary: Network: convert network devices to use struct device instead of class_device which corresponds to 43cb76d91ee85f579a69d42bc8efc08bac560278 in git. -- "Love the dolphins," she advised him. "Write by W.A.S.T.E.." - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Extensible hashing and RCU
From: "Michael K. Edwards" <[EMAIL PROTECTED]>
Date: Sun, 4 Mar 2007 02:02:36 -0800

> On 3/3/07, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:
> > Btw, you could try to implement something you have written above to show
> > its merits, so that it would not be an empty words :)
>
> Before I implement, I design. Before I design, I analyze. Before I
> analyze, I prototype. Before I prototype, I gather requirements.

How the heck do you ever get to writing ANYTHING if you work that way?

I certainly would never have written one single line of Linux kernel code if I had to go through that kind of sequence to actually get to writing code.

And that's definitely not the "Linux way". You code up ideas as soon as you come up with one that has a chance of working, and you see what happens. Sure, you'll throw a lot away, but at least you will "know" instead of "think".

You have to try things, "DO" stuff, not just sit around and theorize and design things and shoot down ideas on every negative minute detail you can come up with before you type any code in. That mode of development doesn't inspire people and get a lot of code written.

I definitely do not think others should use this design/prototype/analyze/blah/blah way of developing as an example, instead I think folks should use people like Ingo Molnar as an example of a good Linux developer. People like Ingo rewrite the scheduler one night because of a tiny cool idea, and even if only 1 out of 10 hacks like that turn out to be useful, his work is invaluable and since he's actually trying to do things and writing lots of code this inspires other people.
[RFC IPROUTE 07/08]: Handle different kernel clock resolutions
[IPROUTE]: Handle different kernel clock resolutions

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

---
commit 0e0db5d408bdac33eadd9d947c0e6904df26ab8f
tree 950a6f47287bec01e5996bec3f1141b60f3f6f6a
parent 5950296ff76ba81593928a2ee89757d69b2acba9
author Patrick McHardy <[EMAIL PROTECTED]> Sat, 03 Mar 2007 20:30:50 +0100
committer Patrick McHardy <[EMAIL PROTECTED]> Sat, 03 Mar 2007 20:30:50 +0100

 tc/tc_core.c |   26 +++---
 1 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/tc/tc_core.c b/tc/tc_core.c
index e27254e..58155fb 100644
--- a/tc/tc_core.c
+++ b/tc/tc_core.c
@@ -23,9 +23,8 @@ #include
 
 #include "tc_core.h"
 
-static __u32 t2us=1;
-static __u32 us2t=1;
 static double tick_in_usec = 1;
+static double clock_factor = 1;
 
 int tc_core_time2big(long time)
 {
@@ -48,12 +47,12 @@ long tc_core_tick2time(long tick)
 
 long tc_core_time2ktime(long time)
 {
-	return time;
+	return time * clock_factor;
 }
 
 long tc_core_ktime2time(long ktime)
 {
-	return ktime;
+	return ktime / clock_factor;
 }
 
 unsigned tc_calc_xmittime(unsigned rate, unsigned size)
@@ -98,16 +97,29 @@ int tc_calc_rtable(unsigned bps, __u32 *
 
 int tc_core_init()
 {
-	FILE *fp = fopen("/proc/net/psched", "r");
+	FILE *fp;
+	__u32 clock_res;
+	__u32 t2us;
+	__u32 us2t;
 
+	fp = fopen("/proc/net/psched", "r");
 	if (fp == NULL)
 		return -1;
 
-	if (fscanf(fp, "%08x%08x", &t2us, &us2t) != 2) {
+	if (fscanf(fp, "%08x%08x%08x", &t2us, &us2t, &clock_res) != 3) {
 		fclose(fp);
 		return -1;
 	}
 	fclose(fp);
-	tick_in_usec = (double)t2us/us2t;
+
+	/* compatibility hack: for old iproute binaries (ignoring
+	 * the kernel clock resolution) the kernel advertises a
+	 * tick multiplier of 1000 in case of nano-second resolution,
+	 * which really is 1. */
+	if (clock_res == 1000000000)
+		t2us = us2t;
+
+	clock_factor = (double)clock_res / TIME_UNITS_PER_SEC;
+	tick_in_usec = (double)t2us / us2t * clock_factor;
 	return 0;
 }
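The arithmetic in tc_core_init() can be tried outside of tc with a small stand-alone sketch. It parses a psched-style line from a string instead of /proc/net/psched; `parse_psched()`, `struct clock_scale` and the `units_per_sec` parameter are hypothetical names introduced here, and the nanosecond resolution value 1000000000 is an assumption (the constant appears truncated in this archive).

```c
#include <stdio.h>

/* Hypothetical result type: the two scale factors the patch computes. */
struct clock_scale {
	double tick_in_usec;  /* kernel ticks per internal time unit */
	double clock_factor;  /* kernel clock units per internal time unit */
};

/* Parse three hex fields (t2us, us2t, clock_res) and derive the factors.
 * units_per_sec plays the role of TIME_UNITS_PER_SEC.
 * Returns 0 on success, -1 on malformed input. */
static int parse_psched(const char *line, double units_per_sec,
			struct clock_scale *cs)
{
	unsigned int t2us, us2t, clock_res;

	if (sscanf(line, "%08x%08x%08x", &t2us, &us2t, &clock_res) != 3)
		return -1;
	if (us2t == 0 || units_per_sec <= 0)
		return -1;

	/* Compatibility hack from the patch: with a nanosecond kernel clock
	 * the advertised tick multiplier is only there for old binaries,
	 * so treat it as 1. The value 1000000000 is assumed here. */
	if (clock_res == 1000000000u)
		t2us = us2t;

	cs->clock_factor = (double)clock_res / units_per_sec;
	cs->tick_in_usec = (double)t2us / us2t * cs->clock_factor;
	return 0;
}
```

For example, the line "000003e8 00000040 000f4240" with units_per_sec = 1000000 yields a clock_factor of 1 and 15.625 ticks per unit, which matches the old t2us/us2t behaviour.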
[RFC IPROUTE 08/08]: Increase internal clock resolution to nsec
[IPROUTE]: Increase internal clock resolution to nsec

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

---
commit 8a76c51f61dd3881bba35f8c73aaa92eabaf50da
tree d4b252b801a14ee19ed77d4a06daaacd8c17b495
parent 0e0db5d408bdac33eadd9d947c0e6904df26ab8f
author Patrick McHardy <[EMAIL PROTECTED]> Sat, 03 Mar 2007 20:30:50 +0100
committer Patrick McHardy <[EMAIL PROTECTED]> Sat, 03 Mar 2007 20:30:50 +0100

 tc/tc_core.h |    2 +-
 tc/tc_util.c |    7 ++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/tc/tc_core.h b/tc/tc_core.h
index a139da6..28bf97a 100644
--- a/tc/tc_core.h
+++ b/tc/tc_core.h
@@ -4,7 +4,7 @@ #define _TC_CORE_H_ 1
 #include
 #include
 
-#define TIME_UNITS_PER_SEC	1000000
+#define TIME_UNITS_PER_SEC	1000000000
 
 int tc_core_time2big(long time);
 long tc_core_time2tick(long time);
diff --git a/tc/tc_util.c b/tc/tc_util.c
index a7e4257..a07c6aa 100644
--- a/tc/tc_util.c
+++ b/tc/tc_util.c
@@ -228,6 +228,9 @@ int get_time(unsigned *time, const char
 		else if (strcasecmp(p, "us") == 0 || strcasecmp(p, "usec")==0 ||
 			 strcasecmp(p, "usecs") == 0)
 			t *= TIME_UNITS_PER_SEC/1000000;
+		else if (strcasecmp(p, "ns") == 0 || strcasecmp(p, "nsec")==0 ||
+			 strcasecmp(p, "nsecs") == 0)
+			t *= TIME_UNITS_PER_SEC/1000000000;
 		else
 			return -1;
 	}
@@ -245,8 +248,10 @@ void print_time(char *buf, int len, __u3
 		snprintf(buf, len, "%.1fs", tmp/TIME_UNITS_PER_SEC);
 	else if (tmp >= TIME_UNITS_PER_SEC/1000)
 		snprintf(buf, len, "%.1fms", tmp/(TIME_UNITS_PER_SEC/1000));
+	else if (tmp >= TIME_UNITS_PER_SEC/1000000)
+		snprintf(buf, len, "%.1fus", tmp/(TIME_UNITS_PER_SEC/1000000));
 	else
-		snprintf(buf, len, "%uus", time);
+		snprintf(buf, len, "%uns", time);
 }
 
 char * sprint_time(__u32 time, char *buf)
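The suffix handling that this patch extends with "ns" can be illustrated with a table-driven sketch. This is not tc's get_time(): `parse_time()`, `struct time_unit` and the `units_per_sec` parameter are hypothetical, the result is a double rather than an unsigned, and this sketch rejects a bare number without a suffix, which is a simplification made here.

```c
#include <stdlib.h>
#include <strings.h>

/* Hypothetical table: how many of each unit make up one second. */
struct time_unit { const char *suffix; double per_sec; };

static const struct time_unit time_units[] = {
	{ "s",  1.0 }, { "sec",  1.0 }, { "secs",  1.0 },
	{ "ms", 1e3 }, { "msec", 1e3 }, { "msecs", 1e3 },
	{ "us", 1e6 }, { "usec", 1e6 }, { "usecs", 1e6 },
	{ "ns", 1e9 }, { "nsec", 1e9 }, { "nsecs", 1e9 },
};

/* Convert a string such as "1.5ms" into internal time units, where
 * units_per_sec plays the role of TIME_UNITS_PER_SEC.
 * Returns 0 on success, -1 on a missing number or unknown suffix. */
static int parse_time(const char *str, double units_per_sec, double *out)
{
	char *end;
	double t = strtod(str, &end);
	size_t i;

	if (end == str)
		return -1;	/* no leading number at all */

	for (i = 0; i < sizeof(time_units) / sizeof(time_units[0]); i++) {
		if (strcasecmp(end, time_units[i].suffix) == 0) {
			*out = t * units_per_sec / time_units[i].per_sec;
			return 0;
		}
	}
	return -1;		/* bare number or unknown suffix */
}
```

With units_per_sec set to 1e9, "250ns" maps to 250 internal units and "1.5ms" to 1500000; with the older microsecond resolution (1e6), "10us" maps to 10.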
[RFC IPROUTE 06/08]: Add sprint_ticks() function and use in CBQ
[IPROUTE]: Add sprint_ticks() function and use in CBQ Add helper function to print ticks to avoid assumptions about clock resolution in CBQ. Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]> --- commit 5950296ff76ba81593928a2ee89757d69b2acba9 tree aca072937195b2011c9f64a305716ddfc1b40c66 parent c2cf24282b2a051942b18fbf894a9c1b490d925c author Patrick McHardy <[EMAIL PROTECTED]> Sat, 03 Mar 2007 20:30:50 +0100 committer Patrick McHardy <[EMAIL PROTECTED]> Sat, 03 Mar 2007 20:30:50 +0100 tc/q_cbq.c |7 --- tc/q_netem.c |6 -- tc/tc_util.c |5 + tc/tc_util.h |1 + 4 files changed, 10 insertions(+), 9 deletions(-) diff --git a/tc/q_cbq.c b/tc/q_cbq.c index 913b26a..f2b4ce8 100644 --- a/tc/q_cbq.c +++ b/tc/q_cbq.c @@ -418,6 +418,7 @@ static int cbq_print_opt(struct qdisc_ut struct tc_cbq_wrropt *wrr = NULL; struct tc_cbq_fopt *fopt = NULL; struct tc_cbq_ovl *ovl = NULL; + SPRINT_BUF(b1); if (opt == NULL) return 0; @@ -500,17 +501,17 @@ static int cbq_print_opt(struct qdisc_ut if (lss && show_details) { fprintf(f, "\nlevel %u ewma %u avpkt %ub ", lss->level, lss->ewma_log, lss->avpkt); if (lss->maxidle) { - fprintf(f, "maxidle %luus ", tc_core_tick2time(lss->maxidle>>lss->ewma_log)); + fprintf(f, "maxidle %s ", sprint_ticks(lss->maxidle>>lss->ewma_log, b1)); if (show_raw) fprintf(f, "[%08x] ", lss->maxidle); } if (lss->minidle!=0x7fff) { - fprintf(f, "minidle %luus ", tc_core_tick2time(lss->minidle>>lss->ewma_log)); + fprintf(f, "minidle %s ", sprint_ticks(lss->minidle>>lss->ewma_log, b1)); if (show_raw) fprintf(f, "[%08x] ", lss->minidle); } if (lss->offtime) { - fprintf(f, "offtime %luus ", tc_core_tick2time(lss->offtime)); + fprintf(f, "offtime %s ", sprint_ticks(lss->offtime, b1)); if (show_raw) fprintf(f, "[%08x] ", lss->offtime); } diff --git a/tc/q_netem.c b/tc/q_netem.c index 6035c4f..67a2ff3 100644 --- a/tc/q_netem.c +++ b/tc/q_netem.c @@ -120,12 +120,6 @@ static int get_ticks(__u32 *ticks, const return 0; } -static char *sprint_ticks(__u32 ticks, char *buf) -{ - 
return sprint_usecs(tc_core_tick2usec(ticks), buf); -} - - static int netem_parse_opt(struct qdisc_util *qu, int argc, char **argv, struct nlmsghdr *n) { diff --git a/tc/tc_util.c b/tc/tc_util.c index b73fae9..a7e4257 100644 --- a/tc/tc_util.c +++ b/tc/tc_util.c @@ -255,6 +255,11 @@ char * sprint_time(__u32 time, char *buf return buf; } +char * sprint_ticks(__u32 ticks, char *buf) +{ + return sprint_time(tc_core_tick2time(ticks), buf); +} + int get_size(unsigned *size, const char *str) { double sz; diff --git a/tc/tc_util.h b/tc/tc_util.h index b713cf1..eade72d 100644 --- a/tc/tc_util.h +++ b/tc/tc_util.h @@ -57,6 +57,7 @@ extern char * sprint_size(__u32 size, ch extern char * sprint_qdisc_handle(__u32 h, char *buf); extern char * sprint_tc_classid(__u32 h, char *buf); extern char * sprint_time(__u32 time, char *buf); +extern char * sprint_ticks(__u32 ticks, char *buf); extern char * sprint_percent(__u32 percent, char *buf); extern void print_tcstats_attr(FILE *fp, struct rtattr *tb[], char *prefix, struct rtattr **xstats); - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC IPROUTE 04/08]: Introduce TIME_UNITS_PER_SEC to represent internal clock resolution
[IPROUTE]: Introduce TIME_UNITS_PER_SEC to represent internal clock resolution Introduce TIME_UNITS_PER_SEC and conversion functions between internal resolution and resolution expected by the kernel (currently implemented as NOPs, only needed by HFSC, which currently always uses microseconds). Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]> --- commit d81b168f4017e45343cdd723e653401058f728e9 tree 7e28041ff488f863e79d8745fde85aaed30dd4ac parent 8b41013e2abb8eb6d3c960911d2ce137b40ccd50 author Patrick McHardy <[EMAIL PROTECTED]> Sat, 03 Mar 2007 20:30:50 +0100 committer Patrick McHardy <[EMAIL PROTECTED]> Sat, 03 Mar 2007 20:30:50 +0100 tc/q_hfsc.c | 12 ++-- tc/q_tbf.c|8 tc/tc_cbq.c |4 ++-- tc/tc_core.c | 14 -- tc/tc_core.h |4 tc/tc_estimator.c |2 +- tc/tc_util.c | 14 +++--- 7 files changed, 36 insertions(+), 22 deletions(-) diff --git a/tc/q_hfsc.c b/tc/q_hfsc.c index 4e8c09b..f7a30f2 100644 --- a/tc/q_hfsc.c +++ b/tc/q_hfsc.c @@ -226,7 +226,7 @@ hfsc_print_sc(FILE *f, char *name, struc fprintf(f, "%s ", name); fprintf(f, "m1 %s ", sprint_rate(sc->m1, b1)); - fprintf(f, "d %s ", sprint_usecs(sc->d, b1)); + fprintf(f, "d %s ", sprint_usecs(tc_core_ktime2time(sc->d), b1)); fprintf(f, "m2 %s ", sprint_rate(sc->m2, b1)); } @@ -320,7 +320,7 @@ hfsc_get_sc1(int *argcp, char ***argvp, return -1; sc->m1 = m1; - sc->d = d; + sc->d = tc_core_time2ktime(d); sc->m2 = m2; *argvp = argv; @@ -367,13 +367,13 @@ hfsc_get_sc2(int *argcp, char ***argvp, return -1; } - if (dmax != 0 && ceil(umax * 100.0 / dmax) > rate) { + if (dmax != 0 && ceil(1.0 * umax * TIME_UNITS_PER_SEC / dmax) > rate) { /* * concave curve, slope of first segment is umax/dmax, * intersection is at dmax */ - sc->m1 = ceil(umax * 100.0 / dmax); /* in bps */ - sc->d = dmax; + sc->m1 = ceil(1.0 * umax * TIME_UNITS_PER_SEC / dmax); /* in bps */ + sc->d = tc_core_time2ktime(dmax); sc->m2 = rate; } else { /* @@ -381,7 +381,7 @@ hfsc_get_sc2(int *argcp, char ***argvp, * is at dmax - umax / rate */ sc->m1 = 0; - sc->d = 
ceil(dmax - umax * 100.0 / rate); /* in usec */ + sc->d = tc_core_time2ktime(ceil(dmax - umax * TIME_UNITS_PER_SEC / rate)); sc->m2 = rate; } diff --git a/tc/q_tbf.c b/tc/q_tbf.c index cbfdcd8..a102696 100644 --- a/tc/q_tbf.c +++ b/tc/q_tbf.c @@ -161,9 +161,9 @@ static int tbf_parse_opt(struct qdisc_ut } if (opt.limit == 0) { - double lim = opt.rate.rate*(double)latency/100 + buffer; + double lim = opt.rate.rate*(double)latency/TIME_UNITS_PER_SEC + buffer; if (opt.peakrate.rate) { - double lim2 = opt.peakrate.rate*(double)latency/100 + mtu; + double lim2 = opt.peakrate.rate*(double)latency/TIME_UNITS_PER_SEC + mtu; if (lim2 < lim) lim = lim2; } @@ -245,9 +245,9 @@ static int tbf_print_opt(struct qdisc_ut if (show_raw) fprintf(f, "limit %s ", sprint_size(qopt->limit, b1)); - latency = 100*(qopt->limit/(double)qopt->rate.rate) - tc_core_tick2usec(qopt->buffer); + latency = TIME_UNITS_PER_SEC*(qopt->limit/(double)qopt->rate.rate) - tc_core_tick2usec(qopt->buffer); if (qopt->peakrate.rate) { - double lat2 = 100*(qopt->limit/(double)qopt->peakrate.rate) - tc_core_tick2usec(qopt->mtu); + double lat2 = TIME_UNITS_PER_SEC*(qopt->limit/(double)qopt->peakrate.rate) - tc_core_tick2usec(qopt->mtu); if (lat2 > latency) latency = lat2; } diff --git a/tc/tc_cbq.c b/tc/tc_cbq.c index 0abcc9d..c7b3a2d 100644 --- a/tc/tc_cbq.c +++ b/tc/tc_cbq.c @@ -38,7 +38,7 @@ unsigned tc_cbq_calc_maxidle(unsigned bn if (vxmt > maxidle) maxidle = vxmt; } - return tc_core_usec2tick(maxidle*(1< #include +#define TIME_UNITS_PER_SEC 100 + int tc_core_usec2big(long usec); long tc_core_usec2tick(long usec); long tc_core_tick2usec(long tick); +long tc_core_time2ktime(long time); +long tc_core_ktime2time(long ktime); unsigned tc_calc_xmittime(unsigned rate, unsigned size); unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks); int tc_calc_rtable(unsigned bps, __u32 *rtab, int cell_log, unsigned mtu, unsigned mpu); diff --git a/tc/tc_estimator.c b/tc/tc_estimator.c index 434db0f..e559add 100644 --- 
a/tc/tc_estimator.c +++ b/tc/tc_estimator.c @@ -26,7 +26,7 @@ #includ
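As a rough illustration of the umax/dmax conversion touched by the q_hfsc.c hunks above, here is a minimal standalone sketch. It assumes TIME_UNITS_PER_SEC is 1000000 (the commit message says HFSC currently always uses microseconds) and uses 64-bit integer division in place of ceil() on doubles. The names struct sc, div_round_up() and sc_from_umax_dmax() are hypothetical and are not iproute2 identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the internal time unit is one microsecond. */
#define TIME_UNITS_PER_SEC 1000000ULL

/* Hypothetical stand-in for an HFSC service curve (m1, d, m2). */
struct sc {
	uint64_t m1;	/* slope of the first segment, bytes per second */
	uint64_t d;	/* length of the first segment, internal time units */
	uint64_t m2;	/* slope of the second segment, bytes per second */
};

/* Integer ceiling division, used here instead of ceil() on doubles. */
static uint64_t div_round_up(uint64_t n, uint64_t d)
{
	assert(d != 0);
	return (n + d - 1) / d;
}

/*
 * Convert (umax bytes, dmax time units, rate bytes/sec) into a
 * two-segment curve, following the structure of the hunk above.
 */
struct sc sc_from_umax_dmax(uint64_t umax, uint64_t dmax, uint64_t rate)
{
	struct sc sc = { 0, 0, rate };

	assert(rate != 0);
	if (dmax != 0 && div_round_up(umax * TIME_UNITS_PER_SEC, dmax) > rate) {
		/* concave curve: first segment has slope umax/dmax, ends at dmax */
		sc.m1 = div_round_up(umax * TIME_UNITS_PER_SEC, dmax);
		sc.d = dmax;
	} else {
		/* convex curve: first segment is flat, ends at dmax - umax/rate */
		uint64_t t = umax * TIME_UNITS_PER_SEC / rate;

		/* clamp at zero rather than letting the subtraction wrap */
		sc.d = dmax > t ? dmax - t : 0;
	}
	return sc;
}
```

With umax = 1500 bytes and rate = 100000 bytes/sec, a dmax of 10000 units takes the concave branch and a dmax of 30000 units takes the convex one. Widening to 64 bits is a choice of this sketch, made so that umax * TIME_UNITS_PER_SEC cannot overflow for 32-bit inputs.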
[RFC IPROUTE 05/08]: Replace "usec" by "time" in function names
[IPROUTE]: Replace "usec" by "time" in function names Rename functions containing "usec" since they don't necessarily return usec units anymore. Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]> --- commit c2cf24282b2a051942b18fbf894a9c1b490d925c tree 128aa960a599aee0725d723b621db651e02ffa74 parent d81b168f4017e45343cdd723e653401058f728e9 author Patrick McHardy <[EMAIL PROTECTED]> Sat, 03 Mar 2007 20:30:50 +0100 committer Patrick McHardy <[EMAIL PROTECTED]> Sat, 03 Mar 2007 20:30:50 +0100 tc/m_estimator.c |4 ++-- tc/q_cbq.c |6 +++--- tc/q_hfsc.c |6 +++--- tc/q_netem.c |8 tc/q_tbf.c |8 tc/tc_cbq.c |4 ++-- tc/tc_core.c | 14 +++--- tc/tc_core.h |6 +++--- tc/tc_util.c | 14 +++--- tc/tc_util.h |6 +++--- 10 files changed, 38 insertions(+), 38 deletions(-) diff --git a/tc/m_estimator.c b/tc/m_estimator.c index d931551..a9e5dbc 100644 --- a/tc/m_estimator.c +++ b/tc/m_estimator.c @@ -45,12 +45,12 @@ int parse_estimator(int *p_argc, char ** duparg("estimator", *argv); if (matches(*argv, "help") == 0) est_help(); - if (get_usecs(&A, *argv)) + if (get_time(&A, *argv)) invarg("estimator", "invalid estimator interval"); NEXT_ARG(); if (matches(*argv, "help") == 0) est_help(); - if (get_usecs(&time_const, *argv)) + if (get_time(&time_const, *argv)) invarg("estimator", "invalid estimator time constant"); if (tc_setup_estimator(A, time_const, est) < 0) { fprintf(stderr, "Error: estimator parameters are out of range.\n"); diff --git a/tc/q_cbq.c b/tc/q_cbq.c index a56..913b26a 100644 --- a/tc/q_cbq.c +++ b/tc/q_cbq.c @@ -500,17 +500,17 @@ static int cbq_print_opt(struct qdisc_ut if (lss && show_details) { fprintf(f, "\nlevel %u ewma %u avpkt %ub ", lss->level, lss->ewma_log, lss->avpkt); if (lss->maxidle) { - fprintf(f, "maxidle %luus ", tc_core_tick2usec(lss->maxidle>>lss->ewma_log)); + fprintf(f, "maxidle %luus ", tc_core_tick2time(lss->maxidle>>lss->ewma_log)); if (show_raw) fprintf(f, "[%08x] ", lss->maxidle); } if (lss->minidle!=0x7fff) { - fprintf(f, "minidle %luus ", 
tc_core_tick2usec(lss->minidle>>lss->ewma_log)); + fprintf(f, "minidle %luus ", tc_core_tick2time(lss->minidle>>lss->ewma_log)); if (show_raw) fprintf(f, "[%08x] ", lss->minidle); } if (lss->offtime) { - fprintf(f, "offtime %luus ", tc_core_tick2usec(lss->offtime)); + fprintf(f, "offtime %luus ", tc_core_tick2time(lss->offtime)); if (show_raw) fprintf(f, "[%08x] ", lss->offtime); } diff --git a/tc/q_hfsc.c b/tc/q_hfsc.c index f7a30f2..b190c71 100644 --- a/tc/q_hfsc.c +++ b/tc/q_hfsc.c @@ -226,7 +226,7 @@ hfsc_print_sc(FILE *f, char *name, struc fprintf(f, "%s ", name); fprintf(f, "m1 %s ", sprint_rate(sc->m1, b1)); - fprintf(f, "d %s ", sprint_usecs(tc_core_ktime2time(sc->d), b1)); + fprintf(f, "d %s ", sprint_time(tc_core_ktime2time(sc->d), b1)); fprintf(f, "m2 %s ", sprint_rate(sc->m2, b1)); } @@ -303,7 +303,7 @@ hfsc_get_sc1(int *argcp, char ***argvp, if (matches(*argv, "d") == 0) { NEXT_ARG(); - if (get_usecs(&d, *argv) < 0) { + if (get_time(&d, *argv) < 0) { explain1("d"); return -1; } @@ -346,7 +346,7 @@ hfsc_get_sc2(int *argcp, char ***argvp, if (matches(*argv, "dmax") == 0) { NEXT_ARG(); - if (get_usecs(&dmax, *argv) < 0) { + if (get_time(&dmax, *argv) < 0) { explain1("dmax"); return -1; } diff --git a/tc/q_netem.c b/tc/q_netem.c index cfd1799..6035c4f 100644 --- a/tc/q_netem.c +++ b/tc/q_netem.c @@ -108,15 +108,15 @@ static int get_ticks(__u32 *ticks, const { unsigned t; - if(get_usecs(&t, str)) + if(get_time(&t, str)) return -1; - if (tc_core_usec2big(t)) { - fprintf(stderr, "Illegal %d usecs (too large)\n", t); + if (tc_core_time2big(t)) { + fprintf(stderr, "Illegal %u time (too large)\n", t); return -1; } - *ticks = tc_core_usec2tick(t); + *ticks = tc_core_time2tick(t); return 0; } diff --git a/tc/q_tbf.c b/tc/q_tbf.c index a102696..1fc05f4 100644 --- a/tc/q_tbf.c +++ b/tc/q_tbf.c @@ -67,7 +67,7 @@ static int tbf_parse_opt(struct qdisc_u
[RFC IPROUTE 03/08]: Introduce tc_calc_xmitsize and use where appropriate
[IPROUTE]: Introduce tc_calc_xmitsize and use where appropriate

Add tc_calc_xmitsize() as complement to tc_calc_xmittime(), which
calculates the size that can be transmitted at a given rate during a
given time.

Replace all expressions of the form "size = rate*tc_core_tick2usec(time))/1000000"
by tc_calc_xmitsize() calls.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

---
commit 8b41013e2abb8eb6d3c960911d2ce137b40ccd50
tree 85068a3019ad77563ae82b9277bd13f3a6de19ba
parent a367af8046d31e986740aac45677c8fe8910c293
author Patrick McHardy <[EMAIL PROTECTED]> Sat, 03 Mar 2007 20:30:50 +0100
committer Patrick McHardy <[EMAIL PROTECTED]> Sat, 03 Mar 2007 20:30:50 +0100

 tc/m_police.c |    2 +-
 tc/q_htb.c    |    4 ++--
 tc/q_tbf.c    |    4 ++--
 tc/tc_core.c  |    5 +++++
 tc/tc_core.h  |    1 +
 5 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/tc/m_police.c b/tc/m_police.c
index 93f317c..36a7719 100644
--- a/tc/m_police.c
+++ b/tc/m_police.c
@@ -331,7 +331,7 @@ #endif
 	fprintf(f, " police 0x%x ", p->index);
 	fprintf(f, "rate %s ", sprint_rate(p->rate.rate, b1));
-	buffer = ((double)p->rate.rate*tc_core_tick2usec(p->burst))/1000000;
+	buffer = tc_calc_xmitsize(p->rate.rate, p->burst);
 	fprintf(f, "burst %s ", sprint_size(buffer, b1));
 	fprintf(f, "mtu %s ", sprint_size(p->mtu, b1));
 	if (show_raw)
diff --git a/tc/q_htb.c b/tc/q_htb.c
index d5f85c3..53e3f78 100644
--- a/tc/q_htb.c
+++ b/tc/q_htb.c
@@ -259,9 +259,9 @@ static int htb_print_opt(struct qdisc_ut
 		fprintf(f, "quantum %d ", (int)hopt->quantum);
 	}
 	fprintf(f, "rate %s ", sprint_rate(hopt->rate.rate, b1));
-	buffer = ((double)hopt->rate.rate*tc_core_tick2usec(hopt->buffer))/1000000;
+	buffer = tc_calc_xmitsize(hopt->rate.rate, hopt->buffer);
 	fprintf(f, "ceil %s ", sprint_rate(hopt->ceil.rate, b1));
-	cbuffer = ((double)hopt->ceil.rate*tc_core_tick2usec(hopt->cbuffer))/1000000;
+	cbuffer = tc_calc_xmitsize(hopt->ceil.rate, hopt->cbuffer);
 	if (show_details) {
 		fprintf(f, "burst %s/%u mpu %s overhead %s ",
 			sprint_size(buffer, b1),
diff --git a/tc/q_tbf.c b/tc/q_tbf.c
index b50519f..cbfdcd8 100644
--- a/tc/q_tbf.c
+++ b/tc/q_tbf.c
@@ -218,7 +218,7 @@ static int tbf_print_opt(struct qdisc_ut
 	if (RTA_PAYLOAD(tb[TCA_TBF_PARMS]) < sizeof(*qopt))
 		return -1;
 	fprintf(f, "rate %s ", sprint_rate(qopt->rate.rate, b1));
-	buffer = ((double)qopt->rate.rate*tc_core_tick2usec(qopt->buffer))/1000000;
+	buffer = tc_calc_xmitsize(qopt->rate.rate, qopt->buffer);
 	if (show_details) {
 		fprintf(f, "burst %s/%u mpu %s ", sprint_size(buffer, b1),
 			1rate.mpu, b2));
@@ -230,7 +230,7 @@ static int tbf_print_opt(struct qdisc_ut
 	if (qopt->peakrate.rate) {
 		fprintf(f, "peakrate %s ", sprint_rate(qopt->peakrate.rate, b1));
 		if (qopt->mtu || qopt->peakrate.mpu) {
-			mtu = ((double)qopt->peakrate.rate*tc_core_tick2usec(qopt->mtu))/1000000;
+			mtu = tc_calc_xmitsize(qopt->peakrate.rate, qopt->mtu);
 			if (show_details) {
 				fprintf(f, "mtu %s/%u mpu %s ", sprint_size(mtu, b1),
 					1 peakrate.mpu, b2));
diff --git a/tc/tc_core.c b/tc/tc_core.c
index 90a097d..1ca4583 100644
--- a/tc/tc_core.c
+++ b/tc/tc_core.c
@@ -51,6 +51,11 @@ unsigned tc_calc_xmittime(unsigned rate,
 	return tc_core_usec2tick(1000000*((double)size/rate));
 }
 
+unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks)
+{
+	return ((double)rate*tc_core_tick2usec(ticks))/1000000;
+}
+
 /*
    rtab[pkt_len>>cell_log] = pkt_xmit_time
  */
diff --git a/tc/tc_core.h b/tc/tc_core.h
index 65611b6..ff00f92 100644
--- a/tc/tc_core.h
+++ b/tc/tc_core.h
@@ -8,6 +8,7 @@ int tc_core_usec2big(long usec);
 long tc_core_usec2tick(long usec);
 long tc_core_tick2usec(long tick);
 unsigned tc_calc_xmittime(unsigned rate, unsigned size);
+unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks);
 int tc_calc_rtable(unsigned bps, __u32 *rtab, int cell_log, unsigned mtu, unsigned mpu);
 
 int tc_setup_estimator(unsigned A, unsigned time_const, struct tc_estimator *est);
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
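To make the relationship between the two helpers concrete, here is a minimal standalone sketch of the pair. It assumes microsecond time units and treats the tick/usec conversion as the identity, which is not true in general for tc_core_usec2tick()/tc_core_tick2usec(). The names calc_xmittime(), calc_xmitsize(), tick_from_usec() and usec_from_tick() are hypothetical stand-ins.

```c
#include <assert.h>

/* Assumption: one internal time unit is one microsecond. */
#define TIME_UNITS_PER_SEC 1000000.0

/* Assumption: tick <-> usec conversions are the identity in this sketch. */
static double tick_from_usec(double usec) { return usec; }
static double usec_from_tick(double tick) { return tick; }

/* Ticks needed to send `size` bytes at `rate` bytes per second. */
unsigned calc_xmittime(unsigned rate, unsigned size)
{
	assert(rate != 0);
	return (unsigned)tick_from_usec(TIME_UNITS_PER_SEC * ((double)size / rate));
}

/* Bytes that can be sent at `rate` bytes per second during `ticks`. */
unsigned calc_xmitsize(unsigned rate, unsigned ticks)
{
	return (unsigned)(((double)rate * usec_from_tick(ticks)) / TIME_UNITS_PER_SEC);
}
```

Both functions truncate toward zero, so a round trip through them can come back slightly smaller than the original size. For example, 1024 bytes at 1048576 bytes/sec maps to 976 ticks, and 976 ticks maps back to 1023 bytes.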
[RFC IPROUTE 02/08]: Use tc_calc_xmittime() where appropriate
[IPROUTE]: Use tc_calc_xmittime() where appropriate Replace expressions of the form "tc_core_usec2tick(100 * size/rate)" by tc_calc_xmittime(). The CBQ case deserves an extra comment: when called with bnwd=rate, tc_cbq_calc_maxidle() behaves identical to tc_calc_xmittime(): unsigned tc_cbq_calc_maxidle(...) { double g = 1.0 - 1.0/(1< --- commit a367af8046d31e986740aac45677c8fe8910c293 tree 056436a16da372e602f43259609e54bb3e7dce16 parent b17ff630348093476b7679b421aba1797f3d6466 author Patrick McHardy <[EMAIL PROTECTED]> Sat, 03 Mar 2007 20:30:47 +0100 committer Patrick McHardy <[EMAIL PROTECTED]> Sat, 03 Mar 2007 20:30:47 +0100 tc/q_cbq.c |2 +- tc/tc_core.c |2 +- tc/tc_red.c |2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/tc/q_cbq.c b/tc/q_cbq.c index bc7e8ba..a56 100644 --- a/tc/q_cbq.c +++ b/tc/q_cbq.c @@ -147,7 +147,7 @@ static int cbq_parse_opt(struct qdisc_ut if (ewma_log < 0) ewma_log = TC_CBQ_DEF_EWMA; lss.ewma_log = ewma_log; - lss.maxidle = tc_cbq_calc_maxidle(r.rate, r.rate, avpkt, lss.ewma_log, 0); + lss.maxidle = tc_calc_xmittime(r.rate, avpkt); lss.change = TCF_CBQ_LSS_MAXIDLE|TCF_CBQ_LSS_EWMA|TCF_CBQ_LSS_AVPKT; lss.avpkt = avpkt; diff --git a/tc/tc_core.c b/tc/tc_core.c index 10c375e..90a097d 100644 --- a/tc/tc_core.c +++ b/tc/tc_core.c @@ -76,7 +76,7 @@ int tc_calc_rtable(unsigned bps, __u32 * sz += overhead; if (sz < mpu) sz = mpu; - rtab[i] = tc_core_usec2tick(100*((double)sz/bps)); + rtab[i] = tc_calc_xmittime(bps, sz); } return cell_log; } diff --git a/tc/tc_red.c b/tc/tc_red.c index 385e7af..8f9bde0 100644 --- a/tc/tc_red.c +++ b/tc/tc_red.c @@ -71,7 +71,7 @@ int tc_red_eval_ewma(unsigned qmin, unsi int tc_red_eval_idle_damping(int Wlog, unsigned avpkt, unsigned bps, __u8 *sbuf) { - double xmit_time = tc_core_usec2tick(100*(double)avpkt/bps); + double xmit_time = tc_calc_xmittime(bps, avpkt); double lW = -log(1.0 - 1.0/(1
[RFC IPROUTE 01/08]: tbf: fix latency printing
[IPROUTE]: tbf: fix latency printing

The calculated latency is already in usecs, the additional tick2usec
conversion breaks the calculation with jiffies or tsc clock source.

Example:

# tc qdisc add dev dummy0 root tbf latency 20ms burst 10k rate 50mbit
# tc qdisc show dev dummy0
qdisc tbf 8002: rate 50000Kbit burst 10Kb lat 15.4ms

Fixed:

# tc qdisc show dev dummy0
qdisc tbf 8002: rate 50000Kbit burst 10Kb lat 20ms

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

---
commit b17ff630348093476b7679b421aba1797f3d6466
tree 95acccdac41e6eeeb0a9b270b43d8e0c747f2524
parent 40076f622e0aacb2b792d3ac1b5d12aa97c4da9c
author Patrick McHardy <[EMAIL PROTECTED]> Sat, 03 Mar 2007 19:31:16 +0100
committer Patrick McHardy <[EMAIL PROTECTED]> Sat, 03 Mar 2007 19:31:16 +0100

 tc/q_tbf.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tc/q_tbf.c b/tc/q_tbf.c
index b8251cb..b50519f 100644
--- a/tc/q_tbf.c
+++ b/tc/q_tbf.c
@@ -251,7 +251,7 @@ static int tbf_print_opt(struct qdisc_ut
 		if (lat2 > latency)
 			latency = lat2;
 	}
-	fprintf(f, "lat %s ", sprint_usecs(tc_core_tick2usec(latency), b1));
+	fprintf(f, "lat %s ", sprint_usecs(latency, b1));
 
 	return 0;
 }
[RFC IPROUTE 00/08]: Time cleanups + nano-second clock resolution support
This patchset consists of four parts:

- minor TBF time conversion fix

- consolidation of time calculations: consolidate commonly used
  expressions with the goal of making it easier to audit for integer
  overflows when increasing the internally used clock resolution.

- support for detecting the clock resolution used by the kernel and
  converting time values as necessary.

- finally, increase the internally used clock resolution to nano-seconds

These patches have been tested (well, TBF and HFSC) with both old kernels
and patched kernels using nano-second resolution.

 tc/m_estimator.c  |    4 +--
 tc/m_police.c     |    2 -
 tc/q_cbq.c        |   15 +++--
 tc/q_hfsc.c       |   18 +++
 tc/q_htb.c        |    4 +--
 tc/q_netem.c      |   14 +++-
 tc/q_tbf.c        |   22 +--
 tc/tc_cbq.c       |    8 +++
 tc/tc_core.c      |   61 ++
 tc/tc_core.h      |   13 +++
 tc/tc_estimator.c |    2 -
 tc/tc_red.c       |    2 -
 tc/tc_util.c      |   40 ++-
 tc/tc_util.h      |    7 +++---
 14 files changed, 125 insertions(+), 87 deletions(-)

Patrick McHardy:
      [IPROUTE]: tbf: fix latency printing
      [IPROUTE]: Use tc_calc_xmittime() where appropriate
      [IPROUTE]: Introduce tc_calc_xmitsize and use where appropriate
      [IPROUTE]: Introduce TIME_UNITS_PER_SEC to represent internal clock resolution
      [IPROUTE]: Replace "usec" by "time" in function names
      [IPROUTE]: Add sprint_ticks() function and use in CBQ
      [IPROUTE]: Handle different kernel clock resolutions
      [IPROUTE]: Increase internal clock resolution to nsec
[RFC NET_SCHED 03/03]: Add support for nano-second clock resolution
[NET_SCHED]: Add support for nano-second clock resolution Add support to nano-second clock resolution with ktime as clock source. Since the ABI uses clock ticks in some places and all clock sources previously used micro-second resolution, this changes the API. To avoid breakage with old iproute versions, a clock multiplier of 1000 is advertised in /proc/net/psched, which keeps everything but HFSC working properly (modulo integer overflows). New iproute versions can detect support for nano-second resolution by reading the third value in /proc/net/psched and ignore the multiplier. Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]> --- commit 7978ac74b18bf1e9b01284a15ea5b54442005bb4 tree 0c1782e884f2c8e791b35df15901c9cc9e64513e parent 8e4951375c3678b4720de46791c39064d8633fca author Patrick McHardy <[EMAIL PROTECTED]> Fri, 02 Mar 2007 03:48:49 +0100 committer Patrick McHardy <[EMAIL PROTECTED]> Sun, 04 Mar 2007 19:54:04 +0100 include/net/pkt_sched.h | 13 ++--- net/sched/Kconfig | 17 + net/sched/sch_api.c |4 ++-- net/sched/sch_hfsc.c| 19 +-- 4 files changed, 42 insertions(+), 11 deletions(-) diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h index b25cc6c..84731cb 100644 --- a/include/net/pkt_sched.h +++ b/include/net/pkt_sched.h @@ -51,12 +51,19 @@ static inline void *qdisc_priv(struct Qd typedef u64psched_time_t; typedef long psched_tdiff_t; -#ifdef CONFIG_NET_SCH_CLK_GETTIMEOFDAY -#include - +#ifdef CONFIG_NET_SCH_CLK_NSEC_RESOLUTION +#define PSCHED_CLOCK_RESOLUTIONNSEC_PER_SEC +#define PSCHED_US2NS(x)(x) +#define PSCHED_NS2US(x)(x) +#else +#define PSCHED_CLOCK_RESOLUTIONUSEC_PER_SEC /* Avoid doing 64 bit divide by 1000 */ #define PSCHED_US2NS(x)((s64)(x) << 10) #define PSCHED_NS2US(x)((x) >> 10) +#endif + +#ifdef CONFIG_NET_SCH_CLK_GETTIMEOFDAY +#include #define PSCHED_GET_TIME(stamp) \ ((stamp) = PSCHED_NS2US(ktime_to_ns(ktime_get( diff --git a/net/sched/Kconfig b/net/sched/Kconfig index f4544dd..11f9bfd 100644 --- a/net/sched/Kconfig +++ 
b/net/sched/Kconfig @@ -102,6 +102,23 @@ config NET_SCH_CLK_CPU endchoice +config NET_SCH_CLK_NSEC_RESOLUTION + bool "Use nano-second resolution (EXPERIMENTAL)" + depends on EXPERIMENTAL + depends on NET_SCH_CLK_GETTIMEOFDAY + help + This option enables nano-second resolution for the packet scheduler + clock source. + + To take full advantage of the increased precision, an iproute version + using nano-seconds internally is needed. + + Note: enabling the option might cause misbehaviour because of + integer overflows. It will also break HFSC unless a current + version of iproute is used. + + If unsure, say N. + comment "Queueing/Scheduling" config NET_SCH_CBQ diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c index 503db48..1a1652d 100644 --- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -1181,9 +1181,9 @@ static int psched_tick_per_us; #ifdef CONFIG_PROC_FS static int psched_show(struct seq_file *seq, void *v) { - seq_printf(seq, "%08x %08x %08x %08x\n", + seq_printf(seq, "%08x %08x %08lx %08x\n", psched_tick_per_us, psched_us_per_tick, - 100, HZ); + PSCHED_CLOCK_RESOLUTION, HZ); return 0; } diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c index 7df1003..79ccc27 100644 --- a/net/sched/sch_hfsc.c +++ b/net/sched/sch_hfsc.c @@ -383,7 +383,7 @@ cftree_update(struct hfsc_class *cl) * Clock source resolution (CONFIG_NET_SCH_CLK_*) * JIFFIES: for 48<=HZ<=1534 resolution is between 0.63us and 1.27us. * CPU: resolution is between 0.5us and 1us. - * GETTIMEOFDAY: resolution is 1.024us. + * GETTIMEOFDAY: resolution is 1.024us, 1ns with NET_SCH_CLK_NSEC_RESOLUTION. * * sm and ism are scaled in order to keep effective digits. 
* SM_SHIFT and ISM_SHIFT are selected to keep at least 4 effective @@ -395,18 +395,25 @@ cftree_update(struct hfsc_class *cl) * * bits/sec 100Kbps 1Mbps 10Mbps 100Mbps1Gbps * +--- + * bytes/ns 12.5e-6125e-6 1250e-612500e-6 125000e-6 * bytes/0.5us 6.25e-362.5e-3625e-3 6250e-e62500e-3 * bytes/us 12.5e-3125e-3 1250e-312500e-3 125000e-3 * bytes/1.024us 12.8e-3128e-3 1280e-312800e-3 128000e-3 * bytes/1.27us 15.875e-3 158.75e-3 1587.5e-3 15875e-3 158750e-3 * + * ns/byte 8 8000 80080 8 * 0.5us/byte16016 1.60.16 0.016 * us/byte 80 8 0.80.08 0.008 * 1.024us/byte 78.125 7.8125 0.781250.078125 0.0078125 * 1.27us/byte 63 6.30.63
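The commit message above says that newer iproute versions can detect nano-second support from the third value in /proc/net/psched. A minimal userspace sketch of that check could look as follows. It parses a line in the four-field hex layout printed by psched_show() and compares the third field against one billion. The struct and function names are hypothetical, and this is not the code iproute uses.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for the four hex fields of /proc/net/psched. */
struct psched_info {
	unsigned tick_per_us;
	unsigned us_per_tick;
	unsigned long resolution;	/* third field: clock units per second */
	unsigned hz;
};

/* Parse one line in the "%08x %08x %08lx %08x" layout; 0 on success. */
int psched_parse(const char *line, struct psched_info *info)
{
	assert(line && info);
	if (sscanf(line, "%x %x %lx %x", &info->tick_per_us,
		   &info->us_per_tick, &info->resolution, &info->hz) != 4)
		return -1;
	return 0;
}

/* Nonzero if the advertised resolution is nano-seconds. */
int psched_is_nsec(const struct psched_info *info)
{
	return info->resolution == 1000000000UL;
}
```

A caller would read the first line of /proc/net/psched with fgets() and pass it to psched_parse(). Keeping the parser separate from the file access makes it easy to exercise with made-up input strings.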
[RFC TIME 01/03]: Add jiffies_to_nsecs/nsecs_to_jiffies
[TIME]: Add jiffies_to_nsecs/nsecs_to_jiffies

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

---
commit 861b3967e1bb01335e544220df779f91b20a27e7
tree cd877a5091f461008ec708b801a2fa04d3c26004
parent 2ff7354fe888f46f6629b57e463b0a1eb956c02b
author Patrick McHardy <[EMAIL PROTECTED]> Fri, 02 Mar 2007 02:49:43 +0100
committer Patrick McHardy <[EMAIL PROTECTED]> Sun, 04 Mar 2007 19:31:38 +0100

 include/linux/jiffies.h |    2 ++
 kernel/time.c           |   26 ++
 2 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
index c080f61..e01f8f2 100644
--- a/include/linux/jiffies.h
+++ b/include/linux/jiffies.h
@@ -263,8 +263,10 @@ #endif
  */
 extern unsigned int jiffies_to_msecs(const unsigned long j);
 extern unsigned int jiffies_to_usecs(const unsigned long j);
+extern unsigned int jiffies_to_nsecs(const unsigned long j);
 extern unsigned long msecs_to_jiffies(const unsigned int m);
 extern unsigned long usecs_to_jiffies(const unsigned int u);
+extern unsigned long nsecs_to_jiffies(const unsigned int n);
 extern unsigned long timespec_to_jiffies(const struct timespec *value);
 extern void jiffies_to_timespec(const unsigned long jiffies,
 				struct timespec *value);
diff --git a/kernel/time.c b/kernel/time.c
index c6c80ea..736f5e1 100644
--- a/kernel/time.c
+++ b/kernel/time.c
@@ -500,6 +500,18 @@ #endif
 }
 EXPORT_SYMBOL(jiffies_to_usecs);
 
+unsigned int jiffies_to_nsecs(const unsigned long j)
+{
+#if HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ)
+	return (NSEC_PER_SEC / HZ) * j;
+#elif HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC)
+	return (j + (HZ / NSEC_PER_SEC) - 1)/(HZ / NSEC_PER_SEC);
+#else
+	return (j * NSEC_PER_SEC) / HZ;
+#endif
+}
+EXPORT_SYMBOL(jiffies_to_nsecs);
+
 /*
  * When we convert to jiffies then we interpret incoming values
  * the following way:
@@ -569,6 +581,20 @@ #endif
 }
 EXPORT_SYMBOL(usecs_to_jiffies);
 
+unsigned long nsecs_to_jiffies(const unsigned int n)
+{
+	if (n > jiffies_to_nsecs(MAX_JIFFY_OFFSET))
+		return MAX_JIFFY_OFFSET;
+#if HZ <= NSEC_PER_SEC && !(NSEC_PER_SEC % HZ)
+	return (n + (NSEC_PER_SEC / HZ) - 1) / (NSEC_PER_SEC / HZ);
+#elif HZ > NSEC_PER_SEC && !(HZ % NSEC_PER_SEC)
+	return n * (HZ / NSEC_PER_SEC);
+#else
+	return (n * HZ + NSEC_PER_SEC - 1) / NSEC_PER_SEC;
+#endif
+}
+EXPORT_SYMBOL(nsecs_to_jiffies);
+
 /*
  * The TICK_NSEC - 1 rounds up the value to the next resolution. Note
  * that a remainder subtract here would not do the right thing as the
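The rounding behaviour of the two conversions is easiest to see outside the kernel. The following is a minimal userspace sketch of the common case where HZ divides NSEC_PER_SEC evenly. HZ is fixed at 250 purely as an assumption, the function names are hypothetical, and 64-bit types are used deliberately, since a 32-bit nanosecond count wraps after roughly 4.29 seconds.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define HZ 250ULL	/* assumption: tick rate that divides NSEC_PER_SEC */

/* Nanoseconds covered by j ticks. */
uint64_t ticks_to_nsecs(uint64_t j)
{
	return (NSEC_PER_SEC / HZ) * j;
}

/* Ticks needed to cover n nanoseconds, rounding up to a whole tick. */
uint64_t nsecs_to_ticks(uint64_t n)
{
	const uint64_t per_tick = NSEC_PER_SEC / HZ;

	assert(per_tick != 0);
	return (n + per_tick - 1) / per_tick;
}
```

With HZ at 250 one tick is 4000000 ns, so a single nanosecond already rounds up to one full tick and 4000001 ns rounds up to two.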
[RFC NET_SCHED 02/03]: Replace gettimeofday clocksource by ktime
[NET_SCHED]: Replace gettimeofday clocksource by ktime Using a monotonic clock avoids glitches when NTP adjusts the clock, additionally it will allow us to take advantage of the higher resolution in the future. This patch also gets rid of the non-scalar representation, which allows to clean up a lot of the mess in pkt_sched.h and results in less ktime_to_ns() calls in most cases. Based on patch by Stephen Hemminger <[EMAIL PROTECTED]> Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]> --- commit 8e4951375c3678b4720de46791c39064d8633fca tree c3e0567265dc1d13ae52601970d829cb3c8190b2 parent 861b3967e1bb01335e544220df779f91b20a27e7 author Patrick McHardy <[EMAIL PROTECTED]> Fri, 02 Mar 2007 03:44:09 +0100 committer Patrick McHardy <[EMAIL PROTECTED]> Sun, 04 Mar 2007 19:44:41 +0100 include/net/pkt_sched.h | 128 --- kernel/hrtimer.c|1 net/sched/sch_api.c |7 ++- net/sched/sch_hfsc.c| 18 +-- 4 files changed, 30 insertions(+), 124 deletions(-) diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h index f6afee7..b25cc6c 100644 --- a/include/net/pkt_sched.h +++ b/include/net/pkt_sched.h @@ -37,11 +37,6 @@ static inline void *qdisc_priv(struct Qd The things are not so bad, because we may use artifical clock evaluated by integration of network data flow in the most critical places. - - Note: we do not use fastgettimeofday. - The reason is that, when it is not the same thing as - gettimeofday, it returns invalid timestamp, which is - not updated, when net_bh is active. */ /* General note about internal clock. @@ -53,22 +48,23 @@ static inline void *qdisc_priv(struct Qd may be read from /proc/net/psched. 
*/ +typedef u64psched_time_t; +typedef long psched_tdiff_t; #ifdef CONFIG_NET_SCH_CLK_GETTIMEOFDAY +#include -typedef struct timeval psched_time_t; -typedef long psched_tdiff_t; +/* Avoid doing 64 bit divide by 1000 */ +#define PSCHED_US2NS(x)((s64)(x) << 10) +#define PSCHED_NS2US(x)((x) >> 10) -#define PSCHED_GET_TIME(stamp) do_gettimeofday(&(stamp)) -#define PSCHED_US2JIFFIE(usecs) usecs_to_jiffies(usecs) -#define PSCHED_JIFFIE2US(delay) jiffies_to_usecs(delay) +#define PSCHED_GET_TIME(stamp) \ + ((stamp) = PSCHED_NS2US(ktime_to_ns(ktime_get( -#else /* !CONFIG_NET_SCH_CLK_GETTIMEOFDAY */ - -typedef u64psched_time_t; -typedef long psched_tdiff_t; +#define PSCHED_US2JIFFIE(usecs) nsecs_to_jiffies(PSCHED_US2NS((usecs))) +#define PSCHED_JIFFIE2US(delay) PSCHED_NS2US(jiffies_to_nsecs((delay))) -#ifdef CONFIG_NET_SCH_CLK_JIFFIES +#elif defined(CONFIG_NET_SCH_CLK_JIFFIES) #if HZ < 96 #define PSCHED_JSCALE 14 @@ -83,11 +79,11 @@ #define PSCHED_JSCALE 10 #endif #define PSCHED_GET_TIME(stamp) ((stamp) = (get_jiffies_64()<>PSCHED_JSCALE) #define PSCHED_JIFFIE2US(delay) ((delay)< extern psched_tdiff_t psched_clock_per_hz; @@ -107,106 +103,24 @@ do { \ (stamp) = cur>>psched_clock_scale; \ } \ } while (0) + #define PSCHED_US2JIFFIE(delay) (((delay)+psched_clock_per_hz-1)/psched_clock_per_hz) #define PSCHED_JIFFIE2US(delay) ((delay)*psched_clock_per_hz) -#endif /* CONFIG_NET_SCH_CLK_CPU */ - -#endif /* !CONFIG_NET_SCH_CLK_GETTIMEOFDAY */ - -#ifdef CONFIG_NET_SCH_CLK_GETTIMEOFDAY -#define PSCHED_TDIFF(tv1, tv2) \ -({ \ - int __delta_sec = (tv1).tv_sec - (tv2).tv_sec; \ - int __delta = (tv1).tv_usec - (tv2).tv_usec; \ - if (__delta_sec) { \ - switch (__delta_sec) { \ - default: \ - __delta = 0; \ - case 2: \ - __delta += USEC_PER_SEC; \ - case 1: \ - __delta += USEC_PER_SEC; \ - } \ - } \ - __delta; \ -}) - -static inline int -psched_tod_diff(int delta_sec, int bound) -{ - int delta; - - if (bound <= USEC_PER_SEC || delta_sec > (0x7FFF/USEC_PER_SEC)-1) - return bound; - delta = 
delta_sec * USEC_PER_SEC; - if (delta > bound || delta < 0) - delta = bound; - return delta; -} - -#define PSCHED_TDIFF_SAFE(tv1, tv2, bound) \ -({ \ - int __delta_sec = (tv1).tv_sec - (tv2).tv_sec; \ - int __delta = (tv1).tv_usec - (tv2).tv_usec; \ - switch (__delta_sec) { \ - default: \ - __delta = psched_tod_diff(__delta_sec, bound); break; \ - case 2: \ - __delta += USEC_PER_SEC; \ - case 1: \ - __delta += USEC_PER_SEC; \ - case 0: \ - if (__delta > bound || __delta < 0) \ - __delta = bound; \ - } \ - __delta; \ -}) - -#define PSCHED_
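The "Avoid doing 64 bit divide by 1000" comment in the patch above means the scheduler's "microsecond" is really 1024 ns. A small userspace sketch shows the size of that approximation. The SKETCH_* macros mirror the shape of PSCHED_US2NS/PSCHED_NS2US, while the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as PSCHED_US2NS/PSCHED_NS2US in the patch above. */
#define SKETCH_US2NS(x) ((int64_t)(x) << 10)
#define SKETCH_NS2US(x) ((x) >> 10)

/* Exact conversion, for comparison. */
int64_t exact_ns_to_us(int64_t ns)
{
	assert(ns >= 0);
	return ns / 1000;
}

/* Shift-based conversion: one unit is 1024 ns rather than 1000 ns. */
int64_t shifted_ns_to_us(int64_t ns)
{
	assert(ns >= 0);
	return SKETCH_NS2US(ns);
}
```

One second of nanoseconds comes out as 976562 shifted units instead of 1000000, an error of about 2.4%, which matches the 1.024us resolution quoted for this clock source in the HFSC comments.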
[RFC NET_SCHED 00/03]: ktime + nano-second clock resolution for packet schedulers
These patches convert the GETTIMEOFDAY packet scheduler clock source to
ktime (based on Stephen's patch) and add support for using nano-second
clock resolution. I chose a scalar time representation within the packet
schedulers instead of ktime_t since it minimizes the ktime_to_ns() calls
in most cases, it allows to clean up pkt_sched.h quite a bit and HFSC
needs it anyway.

Unlike my previous attempt at this, these patches keep old iproute
versions working with nano-second resolution with the exception of HFSC.
I'm not sure what to do about HFSC yet, so just RFC for now.

 include/linux/jiffies.h |    2
 include/net/pkt_sched.h |  141 ++--
 kernel/hrtimer.c        |    1
 kernel/time.c           |   26
 net/sched/Kconfig       |   17 +
 net/sched/sch_api.c     |   11 ++-
 net/sched/sch_hfsc.c    |   37 +---
 7 files changed, 100 insertions(+), 135 deletions(-)

Patrick McHardy:
      [TIME]: Add jiffies_to_nsecs/nsecs_to_jiffies
      [NET_SCHED]: Replace gettimeofday clocksource by ktime
      [NET_SCHED]: Add support for nano-second clock resolution
Fun splitting reassembled UDP skbuffs
Hi Dave,

Is it possible to mix the likes of skb_clone(), pskb_pull() and pskb_trim()?
I've tried to do this and I seem to end up with skb refcounting errors amongst
other things. As it happens, the UDP packet was fragmented, and so the skbuff
I've got is non-linear.

The problem I've got to deal with is that RxRPC permits you to glue packets
together into one UDP datagram (a jumbogram). If the UDP datagram wasn't
fragmented, you'd end up with a single packet looking like this:

	+---------------+
	| IP header     |
	+---------------+
	| UDP header    |
	+---------------+
	| RxRPC header  | <--- has the RXRPC_JUMBO_PACKET flag set
	+---------------+
	:               :
	: Data          :
	:               :
	+---------------+
	| Jumbo header  | <--- has the RXRPC_JUMBO_PACKET flag set
	+---------------+
	:               :
	: Data          :
	:               :
	+---------------+
	| Jumbo header  | <--- has the RXRPC_JUMBO_PACKET flag set
	+---------------+
	:               :
	: Data          :
	:               :
	+---------------+
	| Jumbo header  |
	+---------------+
	:               :
	: Data          :
	:               :
	+---------------+

This is equivalent to four lots of this:

	+---------------+
	| IP header     |
	+---------------+
	| UDP header    |
	+---------------+
	| RxRPC header  |
	+---------------+
	:               :
	: Data          :
	:               :
	+---------------+

With the sequence number and serial number advanced by 1 in each successive
packet and the flags and checksums set appropriately for each.

Theoretically, it is permissible for an intervening agency (such as a router)
to split a jumbo packet back into its constituent data packets and send them
on. If that doesn't happen, then the intended receiver has to reconstitute
them.

For reference, the RxRPC header looks like this:

	struct rxrpc_header {
		__be32	epoch;
		__be32	cid;
		__be32	callNumber;
		__be32	seq;
		__be32	serial;
		u8	type;
		u8	flags;
		u8	userStatus;
		u8	securityIndex;
		__be16	_rsvd;		// security checksum
		__be16	serviceId;
	};

And the jumbo header looks like this:

	struct rxrpc_jumbo_header {
		u8	flags;
		u8	pad;
		__be16	_rsvd;		// security checksum
	};

The RxRPC headers are reconstituted by fixing the sequence and serial numbers
as previously mentioned, and then replacing the flags and checksum fields in
the RxRPC header with the replacements included in the Jumbo header for each
packet.

So, I have a function that looks like this:

	void rxrpc_process_jumbo_packet(struct rxrpc_call *call,
					struct sk_buff *jumbo)
	{
		struct rxrpc_jumbo_header jhdr;
		struct rxrpc_skb_priv *sp;
		struct sk_buff *part;

		kenter(",{%u}", jumbo->data_len);

		sp = rxrpc_skb(jumbo);

		do {
			sp->hdr.flags &= ~RXRPC_JUMBO_PACKET;

			/* make a clone to represent the first subpacket in
			 * what's left of the jumbo packet */
			part = skb_clone(jumbo, GFP_ATOMIC);
			if (!part) {
				/* simply ditch the tail in the event of
				 * ENOMEM */
				pskb_trim(jumbo, RXRPC_JUMBO_DATALEN);
				break;
			}

			pskb_trim(part, RXRPC_JUMBO_DATALEN);

			if (!pskb_pull(jumbo, RXRPC_JUMBO_DATALEN))
				goto protocol_error;

			if (skb_copy_bits(jumbo, 0, &jhdr, sizeof(jhdr)) < 0)
				goto protocol_error;
			if (!pskb_pull(jumbo, sizeof(jhdr)))
				BUG();

			sp->hdr.seq = htonl(ntohl(sp->hdr.seq) + 1);
			sp->hdr.serial = htonl(ntohl(sp->hdr.serial) + 1);
			sp->hdr.flags = jhdr.flags;
			sp->hdr._rsvd = jhdr._rsvd;

			kproto("Rx DATA Jumbo %%%u", ntohl(sp->hdr.serial) - 1);

			rxrpc_fast_process_packet(call, part);
			part = NULL;

		} while (sp->hdr.flags & RXRPC_JUMBO_PACKET);

		rxrpc_fast_process_packet(call, jumbo);
		kleave
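Independent of the skb question, the splitting logic described above can be sketched in plain userspace C over a flat byte buffer. This is only an illustration of the walk over "data, jumbo header, data, ..." and says nothing about skb_clone()/pskb_pull() semantics. JUMBO_FLAG and the tiny JUMBO_DATALEN are assumed stand-ins for RXRPC_JUMBO_PACKET and RXRPC_JUMBO_DATALEN, struct subpacket and split_jumbo() are hypothetical names, and only the sequence number is tracked, not the serial number or checksum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JUMBO_FLAG	0x20u	/* assumption: stand-in for RXRPC_JUMBO_PACKET */
#define JUMBO_DATALEN	8u	/* assumption: tiny stand-in for RXRPC_JUMBO_DATALEN */
#define JUMBO_HDRLEN	4u	/* flags, pad, 16-bit checksum */

/* Hypothetical description of one reconstituted subpacket. */
struct subpacket {
	const uint8_t *data;
	size_t len;
	uint32_t seq;
	uint8_t flags;
};

/*
 * Walk the payload that follows the RxRPC header.  Returns the number of
 * subpackets written to out[], or -1 if the payload is truncated or out[]
 * is too small.
 */
int split_jumbo(const uint8_t *buf, size_t len, uint32_t seq, uint8_t flags,
		struct subpacket *out, int max)
{
	int n = 0;

	assert(buf && out);
	while (flags & JUMBO_FLAG) {
		if (n >= max || len < JUMBO_DATALEN + JUMBO_HDRLEN)
			return -1;
		out[n].data = buf;
		out[n].len = JUMBO_DATALEN;
		out[n].seq = seq;
		out[n].flags = (uint8_t)(flags & ~JUMBO_FLAG);
		n++;
		/* the jumbo header supplies the next subpacket's flags */
		flags = buf[JUMBO_DATALEN];
		buf += JUMBO_DATALEN + JUMBO_HDRLEN;
		len -= JUMBO_DATALEN + JUMBO_HDRLEN;
		seq++;
	}
	if (n >= max)
		return -1;
	/* whatever remains is the final, non-jumbo subpacket */
	out[n].data = buf;
	out[n].len = len;
	out[n].seq = seq;
	out[n].flags = flags;
	return n + 1;
}
```

The subpackets here are just views into the original buffer, so nothing is copied or reference-counted. That sidesteps the part of the problem the question is actually about.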
Re: Session ID 0 with PPPoE
Hi,

> From the RFC:
>
> 5.4 The PPPoE Active Discovery Session-confirmation (PADS) packet
>
>    When the Access Concentrator receives a PADR packet, it prepares to
>    begin a PPP session. It generates a unique SESSION_ID for the PPPoE
>    session and replies to the Host with a PADS packet. The
>    DESTINATION_ADDR field is the unicast Ethernet address of the Host
>    that sent the PADR. The CODE field is set to 0x65 and the SESSION_ID
>    MUST be set to the unique value generated for this PPPoE session.
>
>    The PADS packet contains exactly one TAG of TAG_TYPE Service-Name,
>    indicating the service under which Access Concentrator has accepted
>    the PPPoE session, and any number of other TAG types.
>
>    If the Access Concentrator does not like the Service-Name in the
>    PADR, then it MUST reply with a PADS containing a TAG of TAG_TYPE
>    Service-Name-Error (and any number of other TAG types). In this case
>    the SESSION_ID MUST be set to 0x0000.
>
> As you can see from the last paragraph, a session id of 0 implies a
> rejection of the PADR. Thus, you can't possibly get a PADS packet that
> completes and initiates a valid session if the session id is 0.
>
> Note that the RFC does not prohibit all other aspects of the PADS to be
> structured as if it were a valid success response; the only condition
> and requirement of a failure mode here is the session id.

| [...] then it MUST reply with a PADS containing a TAG of TAG_TYPE
| Service-Name-Error [...]

!?!

To my understanding, the indicator is the Service-Name-Error tag, and
the RFC only states that if such a tag is present (indicating that the
AC "doesn't like" the requested service name and thus rejects the
session request), the session id field must be 0x0000 - not that the
session id field may not be 0x0000 if this tag is not present (which
would indicate that this is a valid session).

> Also 0xffff is reserved for future use. Thus it cannot be used as a
> sentinel value to indicate an invalid session id.

Well, currently it could (IMO, a connect() specifying 0xffff as the
session ID should fail anyway as of now as it is not a valid session id
as per the RFC - and 0xffff in the session id field could be used to
mean basically anything at the protocol level in the future) - however
that probably wouldn't be a good choice for extensibility reasons: If at
some point, a protocol session id field of 0xffff does somehow mean
something that would sensibly be represented as 0xffff in the session id
field of the internal data structure, one would have to change the code
again. So I guess the session id simply shouldn't be overloaded, not
even with an indication of its validity.

> Changing this code would require that the user-space component be
> synchronized with this change; as the socket interface implies that 0 is
> an invalid/unbound session id.

Well, either that or the indication as to whether the session id is
currently valid should be stored in some different way.

> Lots of badness will occur if 0 is allowed as a session id, and nothing
> will be gained because it can't possibly be a valid session id.

Well, if that was the case, sure. But I still don't see any reason why
it can't be.

Florian
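The suggestion above, storing the "is this socket bound" indication separately from the session id, can be sketched as follows. This is a hypothetical userspace illustration only: struct session_slot, slot_bind() and slot_unbind() are invented names and do not correspond to the kernel's PPPoE socket structures or to its userspace ABI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-socket state; not the kernel's PPPoE structures. */
struct session_slot {
	bool bound;	/* validity is tracked separately from the id */
	uint16_t sid;	/* any 16-bit value except 0xffff when bound */
};

/* Bind to a session id; returns 0 on success, -1 if rejected. */
int slot_bind(struct session_slot *s, uint16_t sid)
{
	assert(s);
	if (s->bound || sid == 0xffff)	/* 0xffff is reserved by RFC 2516 */
		return -1;
	s->bound = true;
	s->sid = sid;
	return 0;
}

/* Drop the binding without assigning any special meaning to sid. */
void slot_unbind(struct session_slot *s)
{
	assert(s);
	s->bound = false;
}
```

With the flag kept apart, session id 0x0000 is just another value. The cost, as noted in the thread, is that any interface which already uses 0 to mean "unbound" would have to change in step.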
Re: Session ID 0 with PPPoE
From the RFC:

5.4 The PPPoE Active Discovery Session-confirmation (PADS) packet

   When the Access Concentrator receives a PADR packet, it prepares to
   begin a PPP session.  It generates a unique SESSION_ID for the PPPoE
   session and replies to the Host with a PADS packet.  The
   DESTINATION_ADDR field is the unicast Ethernet address of the Host
   that sent the PADR.  The CODE field is set to 0x65 and the SESSION_ID
   MUST be set to the unique value generated for this PPPoE session.

   The PADS packet contains exactly one TAG of TAG_TYPE Service-Name,
   indicating the service under which Access Concentrator has accepted
   the PPPoE session, and any number of other TAG types.

   If the Access Concentrator does not like the Service-Name in the
   PADR, then it MUST reply with a PADS containing a TAG of TAG_TYPE
   Service-Name-Error (and any number of other TAG types).  In this case
   the SESSION_ID MUST be set to 0x0000.

As you can see from the last paragraph, a session id of 0 implies a
rejection of the PADR.  Thus, you can't possibly get a PADS packet that
completes and initiates a valid session if the session id is 0.

Note that the RFC does not prohibit all other aspects of the PADS to be
structured as if it were a valid success response; the only condition
and requirement of a failure mode here is the session id.

Also 0xffff is reserved for future use.  Thus it cannot be used as a
sentinel value to indicate an invalid session id.

Changing this code would require that the user-space component be
synchronized with this change; as the socket interface implies that 0 is
an invalid/unbound session id.

Lots of badness will occur if 0 is allowed as a session id, and nothing
will be gained because it can't possibly be a valid session id.

--
Michal Ostrowski <[EMAIL PROTECTED]>

On Sat, 2007-03-03 at 21:07 -0800, David Miller wrote:
> From: Florian Zumbiehl <[EMAIL PROTECTED]>
> Date: Sun, 4 Mar 2007 03:30:00 +0100
>
> > I noticed that the PPPoE code doesn't allow session id 0x0000 to be used
> > for an actual session but rather considers 0 a special value denoting
> > that the socket is unbound. Now, when reading RFC 2516, I couldn't really
> > find anything that would forbid 0x0000 as a session id. Only 0xffff "is
> > reserved for future use and MUST NOT be used", while 0x0000 is specified
> > as the only allowed value for the session id field on certain types of
> > packets, but neither can I find any statement that forbids 0x0000 as
> > an ordinary session identifier, nor can I find any reasons that would
> > prevent PPPoE from functioning properly with a session id of 0x0000.
> >
> > Does anyone of you see any reason why a server would not be allowed to
> > select 0x0000 as the session id for a PPPoE session?
>
> I can't, feel free to provide a patch to remove this limitation
> if it's important to you.
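Editorial sketch: the disagreement in this thread is over which field of a PADS signals rejection, the SESSION_ID or the Service-Name-Error TAG. The helper below walks the TAG list of a discovery payload so that the tag-based check can be made explicitly. The function names are hypothetical; the tag type values are the ones listed in RFC 2516, Appendix A. It is a stand-alone illustration, not code from the kernel or from any PPPoE daemon.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TAG_END_OF_LIST         0x0000 /* RFC 2516, Appendix A */
#define TAG_SERVICE_NAME        0x0101 /* RFC 2516, Appendix A */
#define TAG_SERVICE_NAME_ERROR  0x0201 /* RFC 2516, Appendix A */

/* Read a 16-bit value stored in network (big-endian) byte order. */
static uint16_t get_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Walk the TAG list (TAG_TYPE, TAG_LENGTH, TAG_VALUE triples) of a
 * PPPoE discovery payload.
 * Returns 1 if a tag of the given type is present, 0 if it is not,
 * and -1 if a tag header or value runs past the end of the payload.
 */
static int pppoe_find_tag(const uint8_t *payload, size_t len, uint16_t type)
{
	size_t off = 0;

	assert(payload != NULL || len == 0);
	while (off < len) {
		uint16_t tag_type, tag_len;

		if (len - off < 4)
			return -1;              /* truncated tag header */
		tag_type = get_be16(payload + off);
		tag_len = get_be16(payload + off + 2);
		off += 4;
		if (tag_len > len - off)
			return -1;              /* value overruns payload */
		if (tag_type == type)
			return 1;
		if (tag_type == TAG_END_OF_LIST)
			break;                  /* no further tags follow */
		off += tag_len;
	}
	return 0;
}
```

A receiver following the tag-based reading would treat a PADS as a rejection when `pppoe_find_tag(payload, len, TAG_SERVICE_NAME_ERROR)` returns 1, independently of the SESSION_ID value; a receiver following the other reading would test the SESSION_ID for 0 instead.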
Re: [PATCH][BUG][SECURITY] Re: Weird problem with PPPoE on tap interface
Sorry for the late reply; I've been on the road the past few days.

I ACK the patch.

I'll need to think about it some more, but we could probably go a step
further and eliminate the MAC address from the hash as well.

--
Michal Ostrowski <[EMAIL PROTECTED]>

On Sat, 2007-03-03 at 21:08 -0800, David Miller wrote:
> From: Florian Zumbiehl <[EMAIL PROTECTED]>
> Date: Sun, 4 Mar 2007 02:55:16 +0100
>
> > Below you find a slightly changed version of the patch
>
> I already applied your first patch, so if you have any
> fixes to submit please provide them as relative patches
> to your original change.
>
> Thank you.
Re: [PATCH][BUG][SECURITY] Re: Weird problem with PPPoE on tap interface
> From: Florian Zumbiehl <[EMAIL PROTECTED]>
> Date: Sun, 4 Mar 2007 02:55:16 +0100
>
> > Below you find a slightly changed version of the patch
>
> I already applied your first patch, so if you have any
> fixes to submit please provide them as relative patches
> to your original change.
>
> Thank you.

Here you go ...

--- linux-2.6.20/drivers/net/pppoe.c.orig	2007-03-04 13:06:01.0 +0100
+++ linux-2.6.20/drivers/net/pppoe.c	2007-03-04 02:11:51.0 +0100
@@ -140,7 +140,7 @@
 	ret = item_hash_table[hash];
-	while (ret && !(cmp_addr(&ret->pppoe_pa, sid, addr) && ret->pppoe_dev->ifindex == ifindex))
+	while (ret && !(cmp_addr(&ret->pppoe_pa, sid, addr) && ret->pppoe_ifindex == ifindex))
 		ret = ret->next;
 	return ret;
@@ -153,7 +153,7 @@
 	ret = item_hash_table[hash];
 	while (ret) {
-		if (cmp_2_addr(&ret->pppoe_pa, &po->pppoe_pa) && ret->pppoe_dev->ifindex == po->pppoe_dev->ifindex)
+		if (cmp_2_addr(&ret->pppoe_pa, &po->pppoe_pa) && ret->pppoe_ifindex == po->pppoe_ifindex)
 			return -EALREADY;
 		ret = ret->next;
@@ -174,7 +174,7 @@
 	src = &item_hash_table[hash];
 	while (ret) {
-		if (cmp_addr(&ret->pppoe_pa, sid, addr) && ret->pppoe_dev->ifindex == ifindex) {
+		if (cmp_addr(&ret->pppoe_pa, sid, addr) && ret->pppoe_ifindex == ifindex) {
 			*src = ret->next;
 			break;
 		}
@@ -529,7 +529,7 @@
 	po = pppox_sk(sk);
 	if (po->pppoe_pa.sid) {
-		delete_item(po->pppoe_pa.sid, po->pppoe_pa.remote, po->pppoe_dev->ifindex);
+		delete_item(po->pppoe_pa.sid, po->pppoe_pa.remote, po->pppoe_ifindex);
 	}
 	if (po->pppoe_dev)
@@ -577,7 +577,7 @@
 		pppox_unbind_sock(sk);
 		/* Delete the old binding */
-		delete_item(po->pppoe_pa.sid,po->pppoe_pa.remote,po->pppoe_dev->ifindex);
+		delete_item(po->pppoe_pa.sid,po->pppoe_pa.remote,po->pppoe_ifindex);
 		if(po->pppoe_dev)
 			dev_put(po->pppoe_dev);
@@ -597,6 +597,7 @@
 			goto end;
 		po->pppoe_dev = dev;
+		po->pppoe_ifindex = dev->ifindex;
 		if (!(dev->flags & IFF_UP))
 			goto err_put;
--- linux-2.6.20/include/linux/if_pppox.h.orig	2007-02-09 10:21:19.0 +0100
+++ linux-2.6.20/include/linux/if_pppox.h	2007-03-04 02:14:24.0 +0100
@@ -114,6 +114,7 @@
 #ifdef __KERNEL__
 struct pppoe_opt {
 	struct net_device *dev;		/* device associated with socket*/
+	int ifindex;			/* ifindex of device associated with socket */
 	struct pppoe_addr pa;		/* what this socket is bound to*/
 	struct sockaddr_pppox relay;	/* what socket data will be
 					   relayed to (PPPoE relaying) */
@@ -132,6 +133,7 @@
 	unsigned short num;
 };
 #define pppoe_dev	proto.pppoe.dev
+#define pppoe_ifindex	proto.pppoe.ifindex
 #define pppoe_pa	proto.pppoe.pa
 #define pppoe_relay	proto.pppoe.relay
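Editorial sketch of what the relative patch changes: the hash lookups compare a copy of the interface index stored in the socket's own data instead of reading it through the device pointer. Presumably this matters because `pppoe_dev` can be NULL at some of the call sites (the surrounding code tests `if (po->pppoe_dev)` right after calling `delete_item()`). The types and names below (`pppoe_item`, `net_device_stub`, `find_item`, `MAC_LEN`) are simplified, hypothetical stand-ins, not the kernel structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6 /* length of an Ethernet address */

/* Simplified stand-in for struct net_device; illustrative only. */
struct net_device_stub {
	int ifindex;
};

/* Simplified stand-in for one entry of the PPPoE socket hash chain. */
struct pppoe_item {
	struct pppoe_item *next;
	struct net_device_stub *dev;  /* may be NULL; never dereferenced below */
	int ifindex;                  /* cached copy of the device's ifindex   */
	uint16_t sid;                 /* session id                            */
	unsigned char remote[MAC_LEN];/* peer MAC address                      */
};

/*
 * Search one hash chain for the entry matching (sid, addr, ifindex).
 * Only the cached ifindex is compared, so entries whose dev pointer
 * is NULL are handled safely.
 */
static struct pppoe_item *find_item(struct pppoe_item *bucket, uint16_t sid,
				    const unsigned char *addr, int ifindex)
{
	struct pppoe_item *it;

	assert(addr != NULL);
	for (it = bucket; it != NULL; it = it->next) {
		if (it->sid == sid &&
		    memcmp(it->remote, addr, MAC_LEN) == 0 &&
		    it->ifindex == ifindex)
			return it;
	}
	return NULL;
}
```

Two entries with the same session id and peer address but different interface indexes stay distinct in this lookup, which is the property the original (already applied) patch introduced; the relative patch above only changes where the index is read from.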
Re: Extensible hashing and RCU
On 3/3/07, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:
> Btw, you could try to implement something you have written above to show
> its merits, so that it would not be an empty words :)

Before I implement, I design. Before I design, I analyze. Before I
analyze, I prototype. Before I prototype, I gather requirements.
Before I gather requirements, I think -- and the only way I know how
to think about technical matters is to write down my intuitions and
compare them against the sea of published research on the topic. I'm
only partway through thinking about RCU and DDoS, especially as this
is on the fringe of my professional expertise and the appropriate
literature is not currently at my fingertips.

The only times that I make exceptions to the above sequence are 1,
when someone is paying me well to do so (usually to retrofit some kind
of sanity onto a pile of crap someone else wrote) and 2, when I really
feel like it. At present neither exception applies here, although I
may yet get so het up about threadlets that I go into a coding binge
(which may or may not produce an RCU splay tree as a side effect). I
wouldn't hold my breath if I were you, though; it's the first of what
promises to be a string of fine weekends, and if I binge on anything
this spring it's likely to be gardening.

Cheers,
- Michael