Re: Netchannles: first stage has been completed. Further ideas.

2006-07-31 Thread David Miller
From: Rusty Russell [EMAIL PROTECTED] Date: Fri, 28 Jul 2006 15:54:04 +1000 (1) I am imagining some Grand Unified Flow Cache (Olsson trie?) that holds (some subset of?) flows. A successful lookup immediately after packet comes off NIC gives destiny for packet: what route, (optionally) what

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-28 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED] Date: Thu, 27 Jul 2006 11:54:19 -0700 I think we sell our existing stack short. I agree. There are lots of opportunities left to look more closely at actual real performance bottlenecks and improve incrementally. But it requires tools, time, faster

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-27 Thread David Miller
From: Rusty Russell [EMAIL PROTECTED] Date: Thu, 27 Jul 2006 15:46:12 +1000 Yes, my first thought back in January was how netfilter would interact with this in a sane way. One answer is don't: once someone registers on any hook we go into slow path. Another is to run the hooks in socket

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-27 Thread Alexey Kuznetsov
Hello! On Thu, Jul 27, 2006 at 03:46:12PM +1000, Rusty Russell wrote: Of course, it means rewriting all the userspace tools, documentation, and creating a complete new infrastructure for connection tracking and NAT, but if that's what's required, then so be it. That's what I love to hear. Not

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-27 Thread Evgeniy Polyakov
Hello, Alexey. On Thu, Jul 27, 2006 at 08:33:35PM +0400, Alexey Kuznetsov ([EMAIL PROTECTED]) wrote: First, it was stated that suggested implementation performs better and even much better. I am asking why do we see such improvement? I am absolutely not satisfied with statement It is better.

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-27 Thread Stephen Hemminger
On Wed, 26 Jul 2006 23:00:28 -0700 (PDT) David Miller [EMAIL PROTECTED] wrote: From: Rusty Russell [EMAIL PROTECTED] Date: Thu, 27 Jul 2006 15:46:12 +1000 Yes, my first thought back in January was how netfilter would interact with this in a sane way. One answer is don't: once someone

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-27 Thread Alexey Kuznetsov
Hello! kernel thread takes 100% cpu (with preemption Preemption, you tell... :-) I begged you to spend 1 minute of your time to press ^Z. Did you? Alexey - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-27 Thread Rusty Russell
On Thu, 2006-07-27 at 20:33 +0400, Alexey Kuznetsov wrote: Hello! On Thu, Jul 27, 2006 at 03:46:12PM +1000, Rusty Russell wrote: Of course, it means rewriting all the userspace tools, documentation, and creating a complete new infrastructure for connection tracking and NAT, but if that's

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-27 Thread Evgeniy Polyakov
On Fri, Jul 28, 2006 at 12:56:51AM +0400, Alexey Kuznetsov ([EMAIL PROTECTED]) wrote: Hello! kernel thread takes 100% cpu (with preemption Preemption, you tell... :-) I begged you to spend 1 minute of your time to press ^Z. Did you? What would you expect from non-preemptible kernel?

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-27 Thread David Miller
From: Evgeniy Polyakov [EMAIL PROTECTED] Date: Fri, 28 Jul 2006 09:17:25 +0400 What would you expect from non-preemptible kernel? Hard lockup, no acks, no soft irqs. Why does pressing Ctrl-Z on the user process stop kernel soft irq processing?

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-27 Thread Evgeniy Polyakov
On Thu, Jul 27, 2006 at 10:34:00PM -0700, David Miller ([EMAIL PROTECTED]) wrote: From: Evgeniy Polyakov [EMAIL PROTECTED] Date: Fri, 28 Jul 2006 09:17:25 +0400 What would you expect from non-preemptible kernel? Hard lockup, no acks, no soft irqs. Why does pressing Ctrl-Z on the user

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-27 Thread Rusty Russell
On Wed, 2006-07-26 at 23:00 -0700, David Miller wrote: From: Rusty Russell [EMAIL PROTECTED] Date: Thu, 27 Jul 2006 15:46:12 +1000 Yes, my first thought back in January was how netfilter would interact with this in a sane way. One answer is don't: once someone registers on any hook we

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-26 Thread Rusty Russell
On Wed, 2006-07-19 at 03:01 +0400, Alexey Kuznetsov wrote: Hello! Can I ask couple of questions? Just as a person who looked at VJ's slides once and was confused. And startled, when found that it is not considered as another joke of genius. :-) Hi Alexey! About locks: is

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-26 Thread David Miller
From: Rusty Russell [EMAIL PROTECTED] Date: Thu, 27 Jul 2006 12:17:51 +1000 On Wed, 2006-07-19 at 03:01 +0400, Alexey Kuznetsov wrote: About locks: is completely lockless (there is one irq lock when skb is queued/dequeued into netchannels queue in hard/soft irq, Equivalent

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-26 Thread Rusty Russell
On Wed, 2006-07-26 at 22:17 -0700, David Miller wrote: I read this as we will be able to get around the problems but no specific answer as to how. I am an optimist too but I want to start seeing concrete discussion about the way in which the problems will be dealt with. Alexey has some

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-24 Thread Stephen Hemminger
On Wed, 19 Jul 2006 13:01:50 -0700 (PDT) David Miller [EMAIL PROTECTED] wrote: From: Stephen Hemminger [EMAIL PROTECTED] Date: Wed, 19 Jul 2006 15:52:04 -0400 As a related note, I am looking into fixing inet hash tables to use RCU. IBM had posted a patch a long time ago, which would be

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-24 Thread Alexey Kuznetsov
Hello! Also, there is some code for refcnt's in it that looks wrong. Yes, it is disgusting. rcu does not allow to increase socket refcnt in lookup routine. Ben's version looks cleaner here, it does not touch refcnt in rcu lookups. But it is dubious too: do_time_wait: + sock_hold(sk);

RE: Netchannles: first stage has been completed. Further ideas.

2006-07-22 Thread Caitlin Bestler
[EMAIL PROTECTED] wrote: Evgeniy Polyakov wrote: On Thu, Jul 20, 2006 at 02:21:57PM -0700, Ben Greear ([EMAIL PROTECTED]) wrote: Out of curiosity, is it possible to have the single producer logic if you have two+ ethernet interfaces handling frames for a single TCP connection? (I am

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-21 Thread Evgeniy Polyakov
On Thu, Jul 20, 2006 at 09:55:04PM -0700, David Miller ([EMAIL PROTECTED]) wrote: From: Alexey Kuznetsov [EMAIL PROTECTED] Date: Fri, 21 Jul 2006 02:59:08 +0400 Moving protocol (no matter if it is TCP or not) closer to user allows naturally control the dataflow - when user can read that

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-21 Thread Evgeniy Polyakov
On Thu, Jul 20, 2006 at 02:21:57PM -0700, Ben Greear ([EMAIL PROTECTED]) wrote: Out of curiosity, is it possible to have the single producer logic if you have two+ ethernet interfaces handling frames for a single TCP connection? (I am assuming some sort of multi-path routing logic...) I do

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-21 Thread Evgeniy Polyakov
On Fri, Jul 21, 2006 at 11:19:00AM +0400, Evgeniy Polyakov ([EMAIL PROTECTED]) wrote: On Thu, Jul 20, 2006 at 02:21:57PM -0700, Ben Greear ([EMAIL PROTECTED]) wrote: Out of curiosity, is it possible to have the single producer logic if you have two+ ethernet interfaces handling frames for

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-21 Thread Evgeniy Polyakov
On Fri, Jul 21, 2006 at 09:40:32AM +1200, Ian McDonald ([EMAIL PROTECTED]) wrote: If we consider netchannels as how Van Jacobson described them, then mutex is not needed, since it is impossible to have several readers or writers. But in socket case even if there is only one userspace

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-21 Thread David Miller
From: Evgeniy Polyakov [EMAIL PROTECTED] Date: Fri, 21 Jul 2006 11:10:10 +0400 On Thu, Jul 20, 2006 at 09:55:04PM -0700, David Miller ([EMAIL PROTECTED]) wrote: Correct, and too large delay even results in retransmits. You can say that RTT will be adjusted by delay of ACK, but if user

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-21 Thread Evgeniy Polyakov
On Fri, Jul 21, 2006 at 12:47:13AM -0700, David Miller ([EMAIL PROTECTED]) wrote: Correct, and too large delay even results in retransmits. You can say that RTT will be adjusted by delay of ACK, but if user context switches cleanly at the beginning, resulting in near immediate ACKs,

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-21 Thread David Miller
From: Evgeniy Polyakov [EMAIL PROTECTED] Date: Fri, 21 Jul 2006 13:06:11 +0400 Receiving side, no matter if it is socket or netchannel, will drop packets (socket due to queue overfull, netchannels will not drop, but will not ack (its maximum queue len is 1mb)). So both approaches behave

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-21 Thread Evgeniy Polyakov
On Fri, Jul 21, 2006 at 02:19:55AM -0700, David Miller ([EMAIL PROTECTED]) wrote: From: Evgeniy Polyakov [EMAIL PROTECTED] Date: Fri, 21 Jul 2006 13:06:11 +0400 Receiving side, no matter if it is socket or netchannel, will drop packets (socket due to queue overfull, netchannels will not

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-21 Thread David Miller
From: Evgeniy Polyakov [EMAIL PROTECTED] Date: Fri, 21 Jul 2006 13:39:09 +0400 On Fri, Jul 21, 2006 at 02:19:55AM -0700, David Miller ([EMAIL PROTECTED]) wrote: From: Evgeniy Polyakov [EMAIL PROTECTED] Date: Fri, 21 Jul 2006 13:06:11 +0400 Receiving side, no matter if it is socket

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-21 Thread Ben Greear
Evgeniy Polyakov wrote: On Thu, Jul 20, 2006 at 02:21:57PM -0700, Ben Greear ([EMAIL PROTECTED]) wrote: Out of curiosity, is it possible to have the single producer logic if you have two+ ethernet interfaces handling frames for a single TCP connection? (I am assuming some sort of multi-path

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-21 Thread Rick Jones
All this talk reminds me of one thing, how expensive tcp_ack() is. And this expense has nothing to do with TCP really. The main cost is purging and freeing up the skbs which have been ACK'd in the retransmit queue. So tcp_ack() sort of inherits the cost of freeing a bunch of SKBs which haven't

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-21 Thread Evgeniy Polyakov
On Fri, Jul 21, 2006 at 09:14:39AM -0700, Ben Greear ([EMAIL PROTECTED]) wrote: Out of curiosity, is it possible to have the single producer logic if you have two+ ethernet interfaces handling frames for a single TCP connection? (I am assuming some sort of multi-path routing logic...) I do

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-21 Thread David Miller
From: Rick Jones [EMAIL PROTECTED] Date: Fri, 21 Jul 2006 09:26:42 -0700 All this talk reminds me of one thing, how expensive tcp_ack() is. And this expense has nothing to do with TCP really. The main cost is purging and freeing up the skbs which have been ACK'd in the retransmit queue.

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-20 Thread Evgeniy Polyakov
Hello! Hello, Alexey. [ Sorry for long delay, there are some problems with mail servers, so I can not access them remotely, so I create mail by hand, hopefully thread will not be broken. ] There is no socket spinlock anymore. Above lock is skb_queue lock which is held inside

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-20 Thread Evgeniy Polyakov
Hello. [ Sorry for long delay, there are some problems with mail servers, so I can not access them remotely, so I create mail by hand, hopefully thread will not be broken. ] Your description makes it sound as if you would take a huge leap, changing all in-kernel code _and_ the userspace

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-20 Thread Alexey Kuznetsov
Hello! Small question first: userspace, but also there are big problems, like one syscall per ack, I do not see redundant syscalls. Is not it expected to send ACKs only after receiving data as you said? What is the problem? Now boring things: There is no BH protocol processing at all, so

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-20 Thread Evgeniy Polyakov
On Thu, Jul 20, 2006 at 08:41:00PM +0400, Alexey Kuznetsov ([EMAIL PROTECTED]) wrote: Hello! Hello, Alexey. Small question first: userspace, but also there are big problems, like one syscall per ack, I do not see redundant syscalls. Is not it expected to send ACKs only after receiving

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-20 Thread Ben Greear
Evgeniy Polyakov wrote: Backlog is actually not a protection, but a thing equivalent to netchannel. The difference is only that it tries to process something immediately, when it is safe. You can omit this and push everything to backlog(=netchannel), which is processed only by syscalls, if you

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-20 Thread Ian McDonald
If we consider netchannels as how Van Jacobson described them, then mutex is not needed, since it is impossible to have several readers or writers. But in socket case even if there is only one userspace consumer, that lock must be held to protect against bh (or introduce several queues and

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-20 Thread Alexey Kuznetsov
Hello! Moving protocol (no matter if it is TCP or not) closer to user allows naturally control the dataflow - when user can read that data(and _this_ is the main goal), user acks, when it can not - it does not generate ack. In theory To all that I remember, in theory absence of feedback

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-20 Thread David Miller
From: Alexey Kuznetsov [EMAIL PROTECTED] Date: Fri, 21 Jul 2006 02:59:08 +0400 Moving protocol (no matter if it is TCP or not) closer to user allows naturally control the dataflow - when user can read that data(and _this_ is the main goal), user acks, when it can not - it does not generate

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-19 Thread Jörn Engel
On Tue, 18 July 2006 23:08:01 +0400, Evgeniy Polyakov wrote: On Tue, Jul 18, 2006 at 02:15:17PM +0200, Jörn Engel ([EMAIL PROTECTED]) wrote: Your description makes it sound as if you would take a huge leap, changing all in-kernel code _and_ the userspace interface in a single patch.

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-19 Thread Alexey Kuznetsov
Hello! There is no socket spinlock anymore. Above lock is skb_queue lock which is held inside skb_dequeue/skb_queue_tail calls. Lock is named differently, but it is still here. BTW for UDP even the name is the same. Equivalent of socket user lock. No, it is an equivalent for hash lock

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-19 Thread Stephen Hemminger
As a related note, I am looking into fixing inet hash tables to use RCU.

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-19 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED] Date: Wed, 19 Jul 2006 15:52:04 -0400 As a related note, I am looking into fixing inet hash tables to use RCU. IBM had posted a patch a long time ago, which would be not so hard to munge into the current tree. See if you can spot it in the archives :)

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-19 Thread Stephen Hemminger
On Wed, 19 Jul 2006 13:01:50 -0700 (PDT) David Miller [EMAIL PROTECTED] wrote: From: Stephen Hemminger [EMAIL PROTECTED] Date: Wed, 19 Jul 2006 15:52:04 -0400 As a related note, I am looking into fixing inet hash tables to use RCU. IBM had posted a patch a long time ago, which would be

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-18 Thread David Miller
From: Evgeniy Polyakov [EMAIL PROTECTED] Date: Tue, 18 Jul 2006 12:16:26 +0400 I would ask to push netchannel support into -mm tree, but I expect in advance that having two separate TCP stacks (one of which can contain some bugs (I mean atcp.c)) is not that good idea, so I understand possible

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-18 Thread Evgeniy Polyakov
On Tue, Jul 18, 2006 at 01:34:37AM -0700, David Miller ([EMAIL PROTECTED]) wrote: Perhaps I am mistaken with my priorities, but I tend to hit all the easy patches and bug fixes first, before significant new work. And even in the realm of new work, your things require the most serious

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-18 Thread Christian Borntraeger
Hello Evgeniy, +asmlinkage long sys_netchannel_control(void __user *arg) [...] + if (copy_from_user(ctl, arg, sizeof(struct unetchannel_control))) + return -ERESTARTSYS; ^^^ [...] + if (copy_to_user(arg, ctl, sizeof(struct

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-18 Thread Evgeniy Polyakov
On Tue, Jul 18, 2006 at 01:16:18PM +0200, Christian Borntraeger ([EMAIL PROTECTED]) wrote: Hello Evgeniy, +asmlinkage long sys_netchannel_control(void __user *arg) [...] + if (copy_from_user(ctl, arg, sizeof(struct unetchannel_control))) + return -ERESTARTSYS;

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-18 Thread Jörn Engel
On Tue, 18 July 2006 12:16:26 +0400, Evgeniy Polyakov wrote: Current tests with the latest netchannel patch show that netchannels outperforms sockets in any type of bulk transfer (big-sized, small-sized, sending, receiving) over 1gb wire. I omit graphs and numbers here, since I posted it

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-18 Thread Christian Borntraeger
On Tuesday 18 July 2006 13:51, Evgeniy Polyakov wrote: I think this should be -EFAULT instead of -ERESTARTSYS, right? I have no strong feeling on what must be returned in that case. As far as I see, copy*user can fail due to absence of the next destination page, so -ERESTARTSYS makes sense,
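
[ Archive note: the return-value convention discussed in this sub-thread can be sketched in userspace as below. This is only an illustration and not code from the netchannel patch. The struct layout, stub_copy_from_user() and netchannel_control_sketch() are hypothetical stand-ins. The stub mimics the one property that matters here: like the kernel's copy_from_user(), it returns the number of bytes it could not copy, and a NULL source pointer stands in for a faulting user address. ]

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in; the real struct unetchannel_control layout is not
 * shown in this thread. */
struct unetchannel_control_stub {
    int cmd;
    int fd;
};

/* Userspace stub (an assumption, for illustration only). Returns the number
 * of bytes NOT copied, as the kernel helper does; NULL models a bad user
 * pointer. */
static unsigned long stub_copy_from_user(void *to, const void *from,
                                         unsigned long n)
{
    if (from == NULL)
        return n;          /* nothing could be copied */
    memcpy(to, from, n);
    return 0;              /* everything was copied */
}

/* Sketch of the control entry point: a failed copy from user memory is
 * reported as -EFAULT rather than -ERESTARTSYS. */
static long netchannel_control_sketch(const void *arg)
{
    struct unetchannel_control_stub ctl;

    if (stub_copy_from_user(&ctl, arg, sizeof(ctl)))
        return -EFAULT;

    /* ... act on ctl.cmd here ... */
    return 0;
}
```

[ A non-zero result from the copy helper means the user pointer was bad, which retrying the syscall would not fix; that is why -EFAULT is the usual choice. ]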

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-18 Thread Evgeniy Polyakov
On Tue, Jul 18, 2006 at 02:15:17PM +0200, Jörn Engel ([EMAIL PROTECTED]) wrote: Your description makes it sound as if you would take a huge leap, changing all in-kernel code _and_ the userspace interface in a single patch. Am I wrong? Or am I right and would it make sense to extract small

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-18 Thread David Miller
From: Evgeniy Polyakov [EMAIL PROTECTED] Date: Tue, 18 Jul 2006 23:11:37 +0400 Actually userspace will not see ERESTARTSYS, when it is returned from syscall. This is true only when a signal is pending. It is the signal dispatch code that fixes up the return value either by changing it to

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-18 Thread Alexey Kuznetsov
Hello! Can I ask couple of questions? Just as a person who looked at VJ's slides once and was confused. And startled, when found that it is not considered as another joke of genius. :-) About locks: is completely lockless (there is one irq lock when skb is queued/dequeued into

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-18 Thread David Miller
From: Alexey Kuznetsov [EMAIL PROTECTED] Date: Wed, 19 Jul 2006 03:01:21 +0400 The only improvement in this area suggested in VJ's slides is a lock-free producer-consumer ring. It is missing in your patch and I could guess it is not big loss, it is unlikely to improve something significantly
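
[ Archive note: for readers unfamiliar with the term, a lock-free producer-consumer ring of the general kind mentioned above can be sketched with C11 atomics as below. This is a generic single-producer/single-consumer ring written for illustration, assuming exactly one producer thread and one consumer thread. It is not taken from VJ's slides or from the netchannel patch, and every name in it (spsc_ring, ring_push, ring_pop, RING_SIZE) is hypothetical. ]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8u /* hypothetical capacity; must be a power of two */

struct spsc_ring {
    void *slots[RING_SIZE];
    _Atomic size_t head; /* next slot to fill; stored only by the producer */
    _Atomic size_t tail; /* next slot to drain; stored only by the consumer */
};

static void ring_init(struct spsc_ring *r)
{
    assert((RING_SIZE & (RING_SIZE - 1)) == 0);
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side. Returns false when the ring is full. */
static bool ring_push(struct spsc_ring *r, void *item)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SIZE)
        return false;
    r->slots[head & (RING_SIZE - 1)] = item;
    /* Release: the slot write becomes visible before the new head does. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side. Returns false when the ring is empty. */
static bool ring_pop(struct spsc_ring *r, void **item)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return false;
    *item = r->slots[tail & (RING_SIZE - 1)];
    /* Release: the slot read completes before the slot is handed back. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

[ Each index has a single writer, so neither side needs a lock or a compare-and-swap. With more than one producer or consumer this scheme no longer holds, which is the case raised elsewhere in the thread for sockets and multiple interfaces. ]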

Re: Netchannles: first stage has been completed. Further ideas.

2006-07-18 Thread Evgeniy Polyakov
On Wed, Jul 19, 2006 at 03:01:21AM +0400, Alexey Kuznetsov ([EMAIL PROTECTED]) wrote: Hello! Hello, Alexey. Can I ask couple of questions? Just as a person who looked at VJ's slides once and was confused. And startled, when found that it is not considered as another joke of genius. :-)