Re: [patch 13/26] Xen-paravirt_ops: Consistently wrap paravirt ops callsites to make them patchable

2007-03-18 Thread David Miller
From: Rusty Russell [EMAIL PROTECTED]
Date: Sat, 17 Mar 2007 21:33:58 +1100

 On Fri, 2007-03-16 at 13:38 -0700, Jeremy Fitzhardinge wrote:
  David Miller wrote:
   Perhaps the problem can be dealt with using ELF relocations.
  
   There is another case, discussed yesterday on netdev, where run-time
   resolution of ELF relocations would be useful (for
   very-very-very-read-only variables) so if it can solve this problem
   too it would be nice to have a generic infrastructure for it.
  
  That's an interesting idea.  Have you or anyone else looked at what it
  would take to code up?
  
  For this case, I guess you'd walk the relocs looking for references into
  the paravirt_ops structure.  You'd need to check that was a reference
  from an indirect jump or call instruction, which would identify a
  patchable callsite.  The offset into the pv_ops structure would identify
  which operation is involved.
 
 I wrote a whole email on ways to do this, BUT...

The idea is _NOT_ that you go look for references to the paravirt_ops
structure members; that would be stupid, and you wouldn't be able to
use the most efficient addressing mode on a given cpu. You'd be
patching up indirect calls and crap like that.  Just say no...

Instead you get rid of paravirt ops completely, and you call functions
whose symbol name will not resolve in the initial kernel link.

You do an initial link of the kernel, look for the unresolved symbols
in the ELF relocation tables (just like the linker does), and put
those references into a table that is used to patch things up.  You
can use standard ELF relocation code to handle this, exactly like the
code we already have for module loading in the kernel.

This idea is about 15 years old, sparc32 has been doing exactly this
via something called btfixup to handle the page table, TLB, and
cache differences of 15 different cpu+cache type combinations.
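
The table-driven patching described above can be modelled in user space.
The following is only a minimal sketch under stated assumptions, not
btfixup and not real ELF relocation processing: function-pointer slots
stand in for call instructions, and the names struct fixup,
struct symbol_def, apply_fixups, native_op and hypervisor_op are all
hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef long (*op_fn)(long);

/* One entry per call site that referenced a symbol left unresolved
 * at pre-link time (hypothetical layout, for illustration only). */
struct fixup {
    const char *symbol; /* name of the unresolved symbol */
    op_fn *site;        /* slot to patch; stands in for the call instruction */
};

/* A definition chosen at boot for a given symbol name. */
struct symbol_def {
    const char *name;
    op_fn fn;
};

/* Two hypothetical backends for the same operation. */
long native_op(long x) { return x + 1; }
long hypervisor_op(long x) { return x + 100; }

/* Walk the fixup table and point every matching site at the chosen
 * definition.  Returns the number of sites that were patched. */
size_t apply_fixups(struct fixup *tab, size_t ntab,
                    const struct symbol_def *defs, size_t ndefs)
{
    size_t patched = 0;

    for (size_t i = 0; i < ntab; i++) {
        for (size_t j = 0; j < ndefs; j++) {
            if (strcmp(tab[i].symbol, defs[j].name) == 0) {
                *tab[i].site = defs[j].fn;
                patched++;
                break;
            }
        }
    }
    return patched;
}
```

A real implementation would rewrite the call instruction in place using
the architecture's relocation types, so each call site ends up as a
direct call rather than an indirect one through a pointer.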

 #define pv_patch(call, args...) \
   asm volatile(:); 
   call(args);
   asm volatile(8889:
[ stuff to put 8889,  and call in fixup section ]

Please, use ELF and its powerful and clean existing way to
do this. :-)

  What are the netdev requirements?
 
 Reading Ben LaHaise's (very cool!) patch, it's not clear that using
 reloc postprocessing is going to be clearer than open-coding it as he
 has done.

Ben's case can be handled in the same way.  Just do not define the
symbols, pre-link, look for the references in the relocation tables,
and run through that when you do the set_very_readonly() or
install_paravirt_ops() thing.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization


Re: [patch 13/26] Xen-paravirt_ops: Consistently wrap paravirt ops callsites to make them patchable

2007-03-19 Thread David Miller
From: Linus Torvalds [EMAIL PROTECTED]
Date: Mon, 19 Mar 2007 20:18:14 -0700 (PDT)

   Please don't subject us to another couple months of hair-pulling only
   to have Linus yank the thing out again, there are certainly more
   useful things to spend time on :-)
 
 Good call. Dwarf2 unwinding simply isn't worth doing. But I won't yank it 
 out, I simply won't merge it. It was more than just totally buggy code, it 
 was an inability of the people to understand that even bugfree code 
 isn't enough - you have to be able to also handle buggy data.

Thank you.


Re: [patch 13/26] Xen-paravirt_ops: Consistently wrap paravirt ops callsites to make them patchable

2007-04-12 Thread David Miller
From: Paul Mackerras [EMAIL PROTECTED]
Date: Wed, 21 Mar 2007 11:03:14 +1100

 Linus Torvalds writes:
 
  We should just do this natively. There's been several tests over the years 
  saying that it's much more efficient to do sti/cli as a simple store, and 
  handling the "oops, we got an interrupt while interrupts were disabled" as
  a special case.
  
  I have this dim memory that ARM has done it that way for a long time 
  because it's so expensive to do a real cli/sti.
  
  And I think -rt does it for other reasons. It's just more flexible.
 
 64-bit powerpc does this now as well.

I was curious about this so I had a look.

There appear to be three pieces of state used to manage this
on powerpc: PACASOFTIRQEN(r13), PACAHARDIRQEN(r13) and the
SOFTE() in the stackframe.

Plus there is all of this complicated logic on trap entry and
exit to manage these three values properly.

local_irq_restore() doesn't look like a simple piece of code
either.  Logically it should be simple, update the software
binary state, and if enabling see if any interrupts came in
while we were disabled so we can run them.

Given all of that, is it really cheaper than just flipping the
bit in the cpu control register? :-/
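
The "logically simple" scheme described above can be sketched as a
user-space model.  This is a minimal illustration under stated
assumptions, not the powerpc code: struct cpu_model and every function
name here are hypothetical, and interrupts held back by the hardware
are simply counted as pending.

```c
#include <stdbool.h>

/* Hypothetical model of one CPU's interrupt state. */
struct cpu_model {
    bool soft_enabled; /* software flag: may handlers run right now? */
    bool hard_enabled; /* models the enable bit in the cpu control register */
    int pending;       /* interrupts noted while soft-disabled */
    int handled;       /* handlers actually run */
};

void cpu_model_init(struct cpu_model *c)
{
    c->soft_enabled = true;
    c->hard_enabled = true;
    c->pending = 0;
    c->handled = 0;
}

/* Disabling is a plain store; the control register is not touched. */
void soft_irq_disable(struct cpu_model *c)
{
    c->soft_enabled = false;
}

/* Trap entry: run the handler, or note the interrupt and hard-disable. */
void interrupt_arrives(struct cpu_model *c)
{
    if (c->soft_enabled) {
        c->handled++;
        return;
    }
    /* Rare slow path: remember it and stop further deliveries. */
    c->pending++;
    c->hard_enabled = false;
}

/* Restore: update the software state, then replay anything that came in. */
void soft_irq_restore(struct cpu_model *c, bool enable)
{
    c->soft_enabled = enable;
    if (!enable)
        return;
    while (c->pending > 0) {
        c->pending--;
        c->handled++;
    }
    c->hard_enabled = true;
}
```

In this model the common disable/enable pair costs two stores and one
test, and the expensive control register write only happens when an
interrupt actually arrives inside the disabled window.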


Re: [1/2] [NET] link_watch: Move link watch list into net_device

2007-05-09 Thread David Miller
From: Herbert Xu [EMAIL PROTECTED]
Date: Tue, 8 May 2007 22:13:22 +1000

 [NET] link_watch: Move link watch list into net_device
 
 These days the link watch mechanism is an integral part of the
 network subsystem as it manages the carrier status.  So it now
 makes sense to allocate some memory for it in net_device rather
 than allocating it on demand.
 
 In fact, this is necessary because we can't tolerate a memory
 allocation failure since that means we'd have to potentially
 throw a link up event away.
 
 It also simplifies the code greatly.
 
 In doing so I discovered a subtle race condition in the use
 of singleevent.  This race condition still exists (and is
 somewhat magnified) without singleevent but it's now plugged
 thanks to an smp_mb__before_clear_bit.
 
 Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Applied, thanks Herbert.


Re: [1/2] [NET] link_watch: Move link watch list into net_device

2007-05-10 Thread David Miller
From: Jeremy Fitzhardinge [EMAIL PROTECTED]
Date: Thu, 10 May 2007 15:00:05 -0700

 Herbert Xu wrote:
  [NET] link_watch: Move link watch list into net_device
 
  These days the link watch mechanism is an integral part of the
  network subsystem as it manages the carrier status.  So it now
  makes sense to allocate some memory for it in net_device rather
  than allocating it on demand.
 
 I think there's a problem with one of these two patches.

Yes, there are :-)

Did you catch the follow-on bug fixes?



Re: [1/2] [NET] link_watch: Move link watch list into net_device

2007-05-10 Thread David Miller
From: Jeremy Fitzhardinge [EMAIL PROTECTED]
Date: Thu, 10 May 2007 15:22:17 -0700

 Andrew Morton wrote:
  Five minutes after boot is when jiffies wraps.  Are you sure it's
  a list-screwup rather than a jiffy-wrap screwup?

 
 
 Hm, it's suggestive, isn't it?  Apparently they've already fixed this in
 the sekret networking clubhouse, so I'll need to track it down.

I'm not so certain now that we know it's the jiffies wrap point :-)

The fixes in question are attached below and they were posted and
discussed on netdev:


commit fe47cdba83b3041e4ac1aa1418431020a4afe1e0
Author: Herbert Xu [EMAIL PROTECTED]
Date:   Tue May 8 23:22:43 2007 -0700

[NET] link_watch: Eliminate potential delay on wrap-around

When the jiffies wrap around or when the system boots up for the first
time, down events can be delayed indefinitely since we no longer
update linkwatch_nextevent when only urgent events are processed.

This patch fixes this by setting linkwatch_nextevent when a
wrap-around occurs.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
Signed-off-by: David S. Miller [EMAIL PROTECTED]

diff --git a/net/core/link_watch.c b/net/core/link_watch.c
index b5f4579..4674ae5 100644
--- a/net/core/link_watch.c
+++ b/net/core/link_watch.c
@@ -101,8 +101,10 @@ static void linkwatch_schedule_work(unsigned long delay)
return;
 
/* If we wrap around we'll delay it by at most HZ. */
-   if (delay > HZ)
+   if (delay > HZ) {
+   linkwatch_nextevent = jiffies;
delay = 0;
+   }
 
schedule_delayed_work(&linkwatch_work, delay);
 }

commit 4cba637dbb9a13706494a1c85174c8e736914010
Author: Herbert Xu [EMAIL PROTECTED]
Date:   Wed May 9 00:17:30 2007 -0700

[NET] link_watch: Always schedule urgent events

Urgent events may be delayed if we already have a non-urgent event
queued for that device.  This patch changes this by making sure that
an urgent event is always looked at immediately.

I've replaced the LW_RUNNING flag by LW_URGENT since whether work
is scheduled is already kept track by the work queue system.

The only complication is that we have to provide some exclusion for
the setting linkwatch_nextevent which is available in the actual
work function.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
Signed-off-by: David S. Miller [EMAIL PROTECTED]

diff --git a/net/core/link_watch.c b/net/core/link_watch.c
index 4674ae5..a5e372b 100644
--- a/net/core/link_watch.c
+++ b/net/core/link_watch.c
@@ -26,7 +26,7 @@
 
 
 enum lw_bits {
-   LW_RUNNING = 0,
+   LW_URGENT = 0,
 };
 
 static unsigned long linkwatch_flags;
@@ -95,18 +95,41 @@ static void linkwatch_add_event(struct net_device *dev)
 }
 
 
-static void linkwatch_schedule_work(unsigned long delay)
+static void linkwatch_schedule_work(int urgent)
 {
-   if (test_and_set_bit(LW_RUNNING, &linkwatch_flags))
+   unsigned long delay = linkwatch_nextevent - jiffies;
+
+   if (test_bit(LW_URGENT, &linkwatch_flags))
return;
 
-   /* If we wrap around we'll delay it by at most HZ. */
-   if (delay > HZ) {
-   linkwatch_nextevent = jiffies;
+   /* Minimise down-time: drop delay for up event. */
+   if (urgent) {
+   if (test_and_set_bit(LW_URGENT, &linkwatch_flags))
+   return;
delay = 0;
}
 
-   schedule_delayed_work(&linkwatch_work, delay);
+   /* If we wrap around we'll delay it by at most HZ. */
+   if (delay > HZ)
+   delay = 0;
+
+   /*
+* This is true if we've scheduled it immeditately or if we don't
+* need an immediate execution and it's already pending.
+*/
+   if (schedule_delayed_work(&linkwatch_work, delay) == !delay)
+   return;
+
+   /* Don't bother if there is nothing urgent. */
+   if (!test_bit(LW_URGENT, &linkwatch_flags))
+   return;
+
+   /* It's already running which is good enough. */
+   if (!cancel_delayed_work(&linkwatch_work))
+   return;
+
+   /* Otherwise we reschedule it again for immediate exection. */
+   schedule_delayed_work(&linkwatch_work, 0);
 }
 
 
@@ -123,7 +146,11 @@ static void __linkwatch_run_queue(int urgent_only)
 */
if (!urgent_only)
linkwatch_nextevent = jiffies + HZ;
-   clear_bit(LW_RUNNING, &linkwatch_flags);
+   /* Limit wrap-around effect on delay. */
+   else if (time_after(linkwatch_nextevent, jiffies + HZ))
+   linkwatch_nextevent = jiffies;
+
+   clear_bit(LW_URGENT, &linkwatch_flags);
 
spin_lock_irq(&lweventlist_lock);
next = lweventlist;
@@ -166,7 +193,7 @@ static void __linkwatch_run_queue(int urgent_only)
}
 
if (lweventlist)
-   linkwatch_schedule_work(linkwatch_nextevent - jiffies);
+   

Re: [1/2] [NET] link_watch: Move link watch list into net_device

2007-05-10 Thread David Miller
From: Jeremy Fitzhardinge [EMAIL PROTECTED]
Date: Thu, 10 May 2007 15:45:42 -0700

 David Miller wrote:
  I'm not so certain now that we know it's the jiffies wrap point :-)
 
  The fixes in question are attached below and they were posted and
  discussed on netdev:

 
 Yep, this patch gets rid of my spinning thread.  I can't find this patch
 or any discussion on marc.info; is there a better netdev list archive?

I don't see it there either... let me check my mail archive...

Indeed, they were posted to netdev but were blocked by the vger
regexp filters on the keyword "urgent", so the postings never made it
to the list.  I removed that filter regexp so that never happens
again, sorry.


Re: [PATCH 1/3] skb_partial_csum_set

2008-01-15 Thread David Miller
From: Rusty Russell [EMAIL PROTECTED]
Date: Tue, 15 Jan 2008 21:41:55 +1100

 Implement skb_partial_csum_set, for setting partial csums on untrusted 
 packets.
 
 Use it in virtio_net (replacing buggy version there), it's also going
 to be used by TAP for partial csum support.
 
 Signed-off-by: Rusty Russell [EMAIL PROTECTED]

Looks fine to me.

Acked-by: David S. Miller [EMAIL PROTECTED]

If you like I can merge this into my net-2.6.25 tree, or alternatively
if it makes your life easier then you can handle it yourself.


Re: [PATCH RFC 4/5] tun: vringfd xmit support.

2008-04-07 Thread David Miller
From: Rusty Russell [EMAIL PROTECTED]
Date: Mon, 7 Apr 2008 17:24:51 +1000

 On Monday 07 April 2008 15:13:44 Herbert Xu wrote:
  On second thought, this is not going to work.  The network stack
  can clone individual pages out of this skb and put it into a new
  skb.  Therefore whatever scheme we come up with will either need
  to be page-based, or add a flag to tell the network stack that it
  can't clone those pages.
 
 Erk... I'll put in the latter for now.   A page-level solution is not really 
 an option: if userspace hands us mmaped pages for example.

Keep in mind that the core of the TCP stack really depends
upon being able to slice and dice paged SKBs as it pleases
in order to send packets out.

In fact, it also does such splitting during SACK processing.

It really is a base requirement for efficient TSO support.
Otherwise the above operations would be so incredibly
expensive we might as well rip all of the TSO support out.


Re: [PATCH 2/5] /dev/vring: simple userspace-kernel ringbuffer interface.

2008-04-19 Thread David Miller
From: Rusty Russell [EMAIL PROTECTED]
Date: Sun, 20 Apr 2008 02:41:14 +1000

 If only there were some kind of, I don't know... summit... for kernel 
 people... 

I'm starting to disbelieve the myth that because we can discuss
technical issues on mailing lists, we should talk primarily about
process issues during the kernel summit.

There is a distinct advantage to discussing and hashing things out in
person.  You can't say "screw you, your idea sucks" when you're face
to face with the other person, whereas online it's way too easy.


Re: [6/6] [VIRTIO] net: Allow receiving SG packets

2008-04-21 Thread David Miller
From: Rusty Russell [EMAIL PROTECTED]
Date: Tue, 22 Apr 2008 05:06:16 +1000

 I'm not sure what the right number is here.  Say worst case is header which 
 goes over a page boundary then MAX_SKB_FRAGS in the skb, but for some reason 
 that already has a +2:
 
 /* To allow 64K frame to be packed as single skb without frag_list */
 #define MAX_SKB_FRAGS (65536/PAGE_SIZE + 2)
 
 Unless someone explains, I'll change the xmit sg to 2+MAX_SKB_FRAGS as well.

MAX_SKB_FRAGS + 1 is what you ought to need.

MAX_SKB_FRAGS is only accounting for the skb frag pages.
If you want to know how many segments skb->data might
consume as well, you have to add one.

skb->data is linear, therefore it's not possible to need
more than one scatterlist entry for it.
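
The arithmetic can be written out as follows.  This is a minimal sketch:
the page size is passed in as a parameter (an assumption made for
illustration, since the kernel uses the compile-time PAGE_SIZE), and the
helper names max_skb_frags and max_sg_entries are hypothetical.

```c
/* Mirrors the quoted definition: MAX_SKB_FRAGS (65536/PAGE_SIZE + 2),
 * with the page size as a run-time parameter for illustration. */
unsigned int max_skb_frags(unsigned int page_size)
{
    return 65536u / page_size + 2u;
}

/* Worst-case scatterlist entries for one skb: one entry for the linear
 * skb->data area plus one per page fragment. */
unsigned int max_sg_entries(unsigned int page_size)
{
    return max_skb_frags(page_size) + 1u;
}
```

With 4096-byte pages that is 16 + 2 = 18 fragments, so 19 scatterlist
entries in the worst case.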


Re: [6/6] [VIRTIO] net: Allow receiving SG packets

2008-04-21 Thread David Miller
From: Rusty Russell [EMAIL PROTECTED]
Date: Tue, 22 Apr 2008 12:50:27 +1000

 But I was curious as to why the +2 in the MAX_SKB_FRAGS definition?

To be honest I have no idea.

When Alexey added the TSO changeset way back then, it had the
+2, from the history-2.6 tree:

commit 80223d5186f73bf42a7e260c66c9cb9f7d8ec9cf
Author: Alexey Kuznetsov [EMAIL PROTECTED]
Date:   Wed Aug 28 11:52:03 2002 -0700

[NET]: Add TCP segmentation offload core infrastructure.

 ...
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a812681..9b6e6ad 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -109,7 +109,8 @@ struct sk_buff_head {
 
 struct sk_buff;
 
-#define MAX_SKB_FRAGS 6
+/* To allow 64K frame to be packed as single skb without frag_list */
+#define MAX_SKB_FRAGS (65536/PAGE_SIZE + 2)
 
 typedef struct skb_frag_struct skb_frag_t;
 


Re: [PATCH 5/5] Remove now unused structs from kvm_para.h

2008-06-03 Thread David Miller

You sent these patches to kvm-owner, ie. the mailing list owner, and
not the list itself which would be plain kvm.



Re: [PATCH 1/4] tun: Interface to query tun/tap features.

2008-07-01 Thread David Miller
From: Max Krasnyansky [EMAIL PROTECTED]
Date: Tue, 01 Jul 2008 21:59:02 -0700

 Dave, do you want me to put all outstanding TUN patches into a git tree so
 that you can pull them in one shot ?
 Otherwise if you're ok with applying them one by one please apply this one.
 
 Acked-by: Max Krasnyansky [EMAIL PROTECTED]

I'll apply Rusty's patches after I give them a review too.

Thanks Max.


Re: [PATCH 1/3] tun: Interface to query tun/tap features.

2008-07-03 Thread David Miller
From: Rusty Russell [EMAIL PROTECTED]
Date: Thu, 3 Jul 2008 11:32:12 +1000

 The problem with introducing checksum offload and gso to tun is they
 need to set dev->features to enable GSO and/or checksumming, which is
 supposed to be done before register_netdevice(), ie. as part of
 TUNSETIFF.
 ...
 Signed-off-by: Rusty Russell [EMAIL PROTECTED]

Applied to net-next-2.6, thanks!


Re: [PATCH] tun: Fix/rewrite packet filtering logic

2008-07-14 Thread David Miller
From: David Miller [EMAIL PROTECTED]
Date: Mon, 14 Jul 2008 22:16:02 -0700 (PDT)

 It doesn't apply cleanly to net-next-2.6, as I just tried to
 stick this into my tree.

Ignore this, I did something stupid.


Re: [PATCH] tun: Fix/rewrite packet filtering logic

2008-07-14 Thread David Miller
From: Max Krasnyansky [EMAIL PROTECTED]
Date: Sat, 12 Jul 2008 01:52:54 -0700

 This is on top of the latest and greatest :). Assuming virt folks are ok with
 the API this should go into 2.6.27.

Really? :-)

It doesn't apply cleanly to net-next-2.6, as I just tried to
stick this into my tree.


Re: [PATCH] tun: Fix/rewrite packet filtering logic

2008-07-22 Thread David Miller
From: Jeff Garzik [EMAIL PROTECTED]
Date: Tue, 22 Jul 2008 19:41:47 -0400

 looks mostly OK, but stuff like the above should be
 
   (void __user *) arg
 
 Did you check this with sparse (Documentation/sparse.txt)?

Jeff, I already added this particular patch to the tree
a week or so ago.


Re: [PATCH 1/1] tun: TUNGETIFF interface to query name and flags

2008-08-15 Thread David Miller
From: Max Krasnyansky [EMAIL PROTECTED]
Date: Fri, 15 Aug 2008 11:00:19 -0700

 Rusty Russell wrote:
  On Thursday 14 August 2008 00:30:16 Mark McLoughlin wrote:
  A very simple approach is attached; I did consider doing a TUNGETFLAGS
  that would return tun->flags, but I think it's nicer to have a companion
  to TUNGETIFF since it also allows one to query the interface name from
  the file descriptor.
  
  This seems really sensible to me.
  
  If Max acks it, I'd say Dave should merge it.
 
 Makes perfect sense to me.
 Definitely Ack. It has zero impact on existing user and I'd be ok if this goes
  in during .27-rc series.

I've applied Mark's patch, thanks.


Re: [PATCH] virtio_net: large tx MTU support

2008-11-26 Thread David Miller
From: Mark McLoughlin [EMAIL PROTECTED]
Date: Wed, 26 Nov 2008 13:58:11 +

 We don't really have a max tx packet size limit, so allow configuring
 the device with up to 64k tx MTU.
 
 Signed-off-by: Mark McLoughlin [EMAIL PROTECTED]

Rusty, ACK?

If so, I'll toss this into net-next-2.6, thanks!


Re: [PATCH] AF_VMCHANNEL address family for guest-host communication.

2008-12-14 Thread David Miller
From: Gleb Natapov g...@redhat.com
Date: Sun, 14 Dec 2008 13:50:55 +0200

 It is undesirable to use TCP/IP for this purpose since network
 connectivity may not exist between host and guest and if it exists the
 traffic can be not routable between host and guest for security reasons
 or TCP/IP traffic can be firewalled (by mistake) by unsuspecting VM user.

I don't really accept this argument, sorry.

If you can't use TCP because it might be security protected or
misconfigured, adding this new stream protocol thing is not one
bit better.  It doesn't make any sense at all.

Also, if TCP could be misconfigured this new thing could just as
easily be screwed up too.  And I wouldn't be surprised to see a whole
bunch of SELINUX and netfilter features proposed later for this and
then we're back to square one.

You guys really need to rethink this.  Either a stream protocol is a
workable solution to your problem, or it isn't.

And don't bring up any "virtualization is special because..."
arguments in your reply, because virtualization has nothing to do
with my objections stated above.


Re: [PATCH] AF_VMCHANNEL address family for guest-host communication.

2008-12-15 Thread David Miller
From: Anthony Liguori anth...@codemonkey.ws
Date: Mon, 15 Dec 2008 09:02:23 -0600

 There is already an AF_IUCV for s390.

This is a scarecrow and irrelevant to this discussion.

And this is exactly why I asked that any arguments in this thread
avoid talking about virtualization technology and why it's special.

This proposed patch here is asking to add new infrastructure for
hypervisor facilities that will be _ADDED_ and over which we have
complete control.

Whereas the S390 folks have to deal with existing infrastructure which
is largely outside of their control.  So if they implement access
mechanisms for that, it's fine.

I would be doing the same thing if I added a protocol socket layer for
accessing the Niagara hypervisor virtualization channels.


Re: [PATCH] AF_VMCHANNEL address family for guest-host communication.

2008-12-15 Thread David Miller
From: Anthony Liguori anth...@codemonkey.ws
Date: Mon, 15 Dec 2008 14:44:26 -0600

 We want this communication mechanism to be simple and reliable as we
 want to implement the backends drivers in the host userspace with
 minimum mess.

One implication of your statement here is that TCP is unreliable.
That's absolutely not true.

 Within the guest, we need the interface to be always available and
 we need an addressing scheme that is hypervisor specific.  Yes, we
 can build this all on top of TCP/IP.  We could even build it on top
 of a serial port.  Both have their down-sides wrt reliability and
 complexity.

I don't know of any zero-copy through the hypervisor mechanisms for
serial ports, but I know we do that with the various virtualization
network devices.

 Do you have another recommendation?

I don't have to make alternative recommendations until you can
show that what we have can't solve the problem acceptably, and
TCP emphatically can.


Re: [2/2] tun: Fix sk_sleep races when attaching/detaching

2009-04-20 Thread David Miller
From: Herbert Xu herb...@gondor.apana.org.au
Date: Mon, 20 Apr 2009 16:35:50 +0800

 On Thu, Apr 16, 2009 at 07:09:52PM +0800, Herbert Xu wrote:
 
 tun: Fix sk_sleep races when attaching/detaching
 
 That patch doesn't apply anymore because of contextual changes
 caused by the first patch.  Here's an update.
 
 tun: Fix sk_sleep races when attaching/detaching

Do you think these two patches are ready to go into net-2.6
now?

Thanks.


Re: [RFC] virtio: orphan skbs if we're relying on timer to free them

2009-05-18 Thread David Miller
From: Rusty Russell ru...@rustcorp.com.au
Date: Mon, 18 May 2009 22:18:47 +0930

 We check for finished xmit skbs on every xmit, or on a timer (unless
 the host promises to force an interrupt when the xmit ring is empty).
 This can penalize userspace tasks which fill their sockbuf.  Not much
 difference with TSO, but measurable with large numbers of packets.
 
 There are a finite number of packets which can be in the transmission
 queue.  We could fire the timer more than every 100ms, but that would
 just hurt performance for a corner case.  This seems neatest.
 ...
 Signed-off-by: Rusty Russell ru...@rustcorp.com.au

If this is so great for virtio it would also be a great idea
universally, but we don't do it.

What you're doing by orphan'ing is creating a situation where a single
UDP socket can loop doing sends and monopolize the TX queue of a
device.  The only control we have over a sender for fairness in
datagram protocols is that send buffer allocation.

I'm guilty of doing this too in the NIU driver, also because there I
lack a TX queue empty interrupt and this can keep TCP sockets from
getting stuck.

I think we need a generic solution to this issue because it is getting
quite common to see cases where the packets in the TX queue of a
device can sit there indefinitely.


Re: [RFC] virtio: orphan skbs if we're relying on timer to free them

2009-05-21 Thread David Miller
From: Rusty Russell ru...@rustcorp.com.au
Date: Thu, 21 May 2009 16:27:05 +0930

 On Tue, 19 May 2009 12:10:13 pm David Miller wrote:
 What you're doing by orphan'ing is creating a situation where a single
 UDP socket can loop doing sends and monopolize the TX queue of a
 device.  The only control we have over a sender for fairness in
 datagram protocols is that send buffer allocation.
 
 Urgh, that hadn't even occurred to me.  Good point.

Now this all is predicated on this actually mattering. :-)

You could argue that the scheduler as well as the size of the
TX queue should be limiting and enforcing fairness.

Someone really needs to test this.  Just skb_orphan() every packet
at the beginning of dev_hard_start_xmit(), then run some test
program with two clients looping out UDP packets to see if one
can monopolize the device and get a significantly larger amount
of TX resources than the other.  Repeat for 3, 4, 5, etc. clients.

 I haven't thought this through properly, but how about a hack where
 we don't orphan packets if the ring is over half full?

That would also work.  And for the NIU case this would be great
because I DO have a marker bit for triggering interrupts in the TX
descriptors.  There's just no "all empty" interrupt on TX (who
designs these things? :( ).
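
The half-full heuristic discussed above amounts to a one-line policy
check.  A minimal sketch, assuming the driver can count in-flight TX
descriptors; the function name should_orphan_early is hypothetical and
does not come from any posted patch.

```c
#include <stdbool.h>

/* Orphan the skb at xmit time only while the TX ring is at most half
 * full.  Past that point the skb stays charged to the sending socket,
 * so its send buffer limit still throttles the sender. */
bool should_orphan_early(unsigned int in_flight, unsigned int ring_size)
{
    if (in_flight > ring_size)
        return false;
    /* Equivalent to 2 * in_flight <= ring_size, without overflow. */
    return in_flight <= ring_size - in_flight;
}
```

Packets queued past the halfway mark would then need a completion
interrupt or timer to release them, which is where a marker bit in the
TX descriptors helps.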

 Then I guess we could overload the watchdog as a more general
 timer-after-no-xmit?

Yes, but it means that teardown of a socket can be delayed up to
the amount of that timer.  Factor in all of this crazy
round_jiffies() stuff people do these days and it could cause
pauses for real use cases and drive users batty.

Probably the most profitable avenue is to see if this is a real issue
after all (see above).  If we can get away with having the socket
buffer represent socket -- device space only, that's the most ideal
solution.  It will probably also improve performance a lot across the
board, especially on NUMA/SMP boxes as our TX complete events tend to
be in different places than the SKB producer.


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-06-02 Thread David Miller
From: Patrick Ohly patrick.o...@intel.com
Date: Mon, 01 Jun 2009 21:47:22 +0200

 On Fri, 2009-05-29 at 23:44 +0930, Rusty Russell wrote:
 This patch adds skb_orphan to the start of dev_hard_start_xmit(): it
 can be premature in the NETDEV_TX_BUSY case, but that's uncommon.
 
 Would it be possible to make the new skb_orphan() at the start of
 dev_hard_start_xmit() conditionally so that it is not executed for
 packets that are to be time stamped?
 
 As discussed before
 (http://article.gmane.org/gmane.linux.network/121378/), the skb->sk
 socket pointer is required for sending back the send time stamp from
 inside the device driver. Calling skb_orphan() unconditionally as in
 this patch would break the hardware time stamping of outgoing packets.

Indeed, we need to check that case, at a minimum.

And there are potentially other problems.  For example, I
wonder how this interacts with the new TX MMAP af_packet support
in net-next-2.6 :-/



Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-06-02 Thread David Miller
From: Rusty Russell ru...@rustcorp.com.au
Date: Tue, 2 Jun 2009 23:38:29 +0930

 On Tue, 2 Jun 2009 04:55:53 pm David Miller wrote:
 From: Patrick Ohly patrick.o...@intel.com
 Date: Mon, 01 Jun 2009 21:47:22 +0200

  On Fri, 2009-05-29 at 23:44 +0930, Rusty Russell wrote:
  This patch adds skb_orphan to the start of dev_hard_start_xmit(): it
  can be premature in the NETDEV_TX_BUSY case, but that's uncommon.
 
  Would it be possible to make the new skb_orphan() at the start of
  dev_hard_start_xmit() conditionally so that it is not executed for
  packets that are to be time stamped?
 
  As discussed before
  (http://article.gmane.org/gmane.linux.network/121378/), the skb->sk
  socket pointer is required for sending back the send time stamp from
  inside the device driver. Calling skb_orphan() unconditionally as in
  this patch would break the hardware time stamping of outgoing packets.

 Indeed, we need to check that case, at a minimum.

 And there are potentially other problems.  For example, I
 wonder how this interacts with the new TX MMAP af_packet support
 in net-next-2.6 :-/
 
 I think I'll do this in the driver for now, and let's revisit doing it 
 generically later?

That might be the best course of action for the time being.
This whole area is a rat's nest.


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-06-03 Thread David Miller
From: Rusty Russell ru...@rustcorp.com.au
Date: Thu, 4 Jun 2009 13:24:57 +0930

 On Thu, 4 Jun 2009 06:32:53 am Eric Dumazet wrote:
 Also, taking a reference on socket for each xmit packet in flight is very
 expensive, since it slows down receiver in __udp4_lib_lookup(). Several
 cpus are fighting for sk->refcnt cache line.
 
 Now we have decent dynamic per-cpu, we can finally implement bigrefs.  More 
 obvious for device counts than sockets, but perhaps applicable here as well?

It might be very beneficial for longer lasting, active, connections, but
for high connection rates it's going to be a lose in my estimation.


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-06-03 Thread David Miller
From: Eric Dumazet eric.duma...@gmail.com
Date: Thu, 04 Jun 2009 06:54:24 +0200

 We also can avoid the sock_put()/sock_hold() pair for each tx packet,
 to only touch sk_wmem_alloc (with appropriate atomic_sub_return() in
 sock_wfree() and atomic_dec_test in sk_free)
 
 We could initialize sk->sk_wmem_alloc to one instead of 0, so that
 sock_wfree() could just synchronize itself with sk_free()

Excellent idea Eric.

 Patch will follow after some testing

I look forward to it :-)
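
A minimal sketch of the biased-counter idea quoted above, written as userspace C11 rather than kernel code. The toy_* names and the struct are hypothetical; only sk_wmem_alloc, sock_wfree() and sk_free() come from the thread, and the patch Eric promised may well look different.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sock; only the write-memory counter is modelled. */
struct toy_sock {
	atomic_int wmem_alloc;	/* bytes charged to in-flight tx packets, plus a bias of 1 */
	bool freed;		/* set by whoever drops the counter to zero */
};

static void toy_sock_init(struct toy_sock *sk)
{
	atomic_init(&sk->wmem_alloc, 1);	/* the bias of 1 stands for the socket itself */
	sk->freed = false;
}

/* Charge a packet of `size` bytes to the socket at transmit time. */
static void toy_charge(struct toy_sock *sk, int size)
{
	assert(size > 0);
	atomic_fetch_add(&sk->wmem_alloc, size);
}

/* Drop `n` units; the caller that brings the counter to zero does the final free. */
static void toy_release(struct toy_sock *sk, int n)
{
	if (atomic_fetch_sub(&sk->wmem_alloc, n) == n)
		sk->freed = true;
}

/* Packet destructor (the role of sock_wfree()): uncharge the packet. */
static void toy_wfree(struct toy_sock *sk, int size)
{
	toy_release(sk, size);
}

/* Socket teardown (the role of sk_free()): drop the initial bias. */
static void toy_sk_free(struct toy_sock *sk)
{
	toy_release(sk, 1);
}
```

With this layout no separate hold/put pair is needed per packet: the socket is freed by whichever of toy_sk_free() and the last toy_wfree() runs second.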


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-07-03 Thread David Miller
From: Herbert Xu herb...@gondor.apana.org.au
Date: Fri, 3 Jul 2009 15:55:30 +0800

 Calling skb_orphan like this should be forbidden.  Apart from the
 problems already raised, it is a sign that the driver is trying to
 paper over a more serious issue of not cleaning up skbs in a timely manner.
 
 Yes skb_orphan will work for the cases where calling the skb
 destructor allows forward progress, but for the cases where you
 really need the skb to be freed (e.g., iSCSI or Xen), this
 simply doesn't work.
 
 So anytime someone tries to propose such a solution it is a sign
 that they have bigger problems.

Agreed, but alas we are foaming at the mouth until we have a truly
usable alternative.

In particular the case of handling a device without usable TX
completion event indications is still quite troublesome.


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-07-03 Thread David Miller
From: Herbert Xu herb...@gondor.apana.org.au
Date: Sat, 4 Jul 2009 11:08:30 +0800

 On Fri, Jul 03, 2009 at 08:02:54PM -0700, David Miller wrote:

 In particular the case of handling a device without usable TX
 completion event indications is still quite troublesome.

 Which particular devices do you have in mind?

NIU

I basically can't defer interrupts because the chip supports
per-TX-desc interrupt indications but it lacks an all TX queue sent
event.  So if I, say, tell it to interrupt every 1/4 of the TX queue
then up to 1/4 of the queue can have packets stuck in there
if TX activity all of a sudden ceases.

The only thing I've come up with to be able to mitigate interrupts is
to use an hrtimer of some sort.  But that's going to be hard to get
right, and who knows what kind of latencies will be introduced for TX
completion packet freeing unless I am very careful.

And finally this belongs in generic code, not in the NIU driver,
whatever we come up with.  Especially since my understanding is that
this is similar to what Rusty needs.
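
To make the stuck-packet problem concrete, here is a toy model, assuming a ring that raises a completion interrupt only on every (ring_size / 4)-th descriptor. Nothing here is NIU driver code; all names are made up for illustration.

```c
#include <assert.h>

/* Hypothetical TX ring that signals completion only on every mark_every-th descriptor. */
struct toy_txring {
	unsigned int mark_every;	/* interrupt interval, here ring_size / 4 */
	unsigned int sent;		/* descriptors the hardware has transmitted */
	unsigned int reclaimed;		/* descriptors whose skbs the driver has freed */
};

static void toy_txring_init(struct toy_txring *r, unsigned int ring_size)
{
	assert(ring_size >= 4);
	r->mark_every = ring_size / 4;
	r->sent = 0;
	r->reclaimed = 0;
}

/* Hardware sends one packet; the driver reclaims only when an interrupt fires. */
static void toy_tx_one(struct toy_txring *r)
{
	r->sent++;
	if (r->sent % r->mark_every == 0)
		r->reclaimed = r->sent;	/* interrupt handler frees everything sent so far */
}

/* Packets that were sent but whose skbs are still held by the driver. */
static unsigned int toy_stuck(const struct toy_txring *r)
{
	return r->sent - r->reclaimed;
}
```

If traffic stops right after a burst, toy_stuck() stays non-zero until something else (a timer, or the next transmit) triggers a reclaim, which is the gap the hrtimer idea is meant to cover.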



Re: [PATCH] net/bridge: Add 'hairpin' port forwarding mode

2009-08-13 Thread David Miller
From: Fischer, Anna anna.fisc...@hp.com
Date: Thu, 13 Aug 2009 16:55:16 +

 This patch adds a 'hairpin' (also called 'reflective relay') mode
 port configuration to the Linux Ethernet bridge kernel module.
 A bridge supporting hairpin forwarding mode can send frames back
 out through the port the frame was received on.
 
 Hairpin mode is required to support basic VEPA (Virtual
 Ethernet Port Aggregator) capabilities.
 
 You can find additional information on VEPA here:
 http://tech.groups.yahoo.com/group/evb/
 http://www.ieee802.org/1/files/public/docs2009/new-hudson-vepa_seminar-20090514d.pdf
 http://www.internet2.edu/presentations/jt2009jul/20090719-congdon.pdf
 
 An additional patch 'bridge-utils: Add 'hairpin' port forwarding mode'
 is provided to allow configuring hairpin mode from userspace tools.
 
 Signed-off-by: Paul Congdon paul.cong...@hp.com
 Signed-off-by: Anna Fischer anna.fisc...@hp.com

Applied to net-next-2.6


Re: [PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

2009-08-18 Thread David Miller
From: Herbert Xu herb...@gondor.apana.org.au
Date: Wed, 19 Aug 2009 13:19:26 +1000

 I'm in the process of repeating the same experiment with cxgb3
 which hopefully should let me turn interrupts off on descriptors
 while still reporting completion status.

Ok, I look forward to seeing your work however it turns out.

Once I see what you've done, I'll give it a spin on niu.


Re: [PATCH 2.6.31-rc9] net: VMware virtual Ethernet NIC driver: vmxnet3

2009-09-28 Thread David Miller
From: Shreyas Bhatewara sbhatew...@vmware.com
Date: Mon, 28 Sep 2009 16:56:45 -0700

 +   uint32_t rxdIdx:12;/* Index of the RxDesc */

Don't use uint32_t et al. sized types, use u32 and friends
throughout.


Re: [PATCH 2.6.32-rc1] net: VMware virtual Ethernet NIC driver: vmxnet3

2009-09-30 Thread David Miller
From: Stephen Hemminger shemmin...@vyatta.com
Date: Wed, 30 Sep 2009 17:39:23 -0700

 Why not use NETIF_F_LRO and ethtool to control LRO support?

In fact, you must, in order to handle bridging and routing
correctly.

Bridging and routing is illegal with LRO enabled, so the kernel
automatically issues the necessary ethtool commands to disable
LRO in the relevant devices.

Therefore you must support the ethtool LRO operation in order to
support LRO at all.


Re: [PATCH 2.6.32-rc1] net: VMware virtual Ethernet NIC driver: vmxnet3

2009-10-01 Thread David Miller
From: Shreyas Bhatewara sbhatew...@vmware.com
Date: Wed, 30 Sep 2009 14:34:57 -0700 (PDT)

 +{
 + struct vmxnet3_adapter *adapter = netdev_priv(netdev);
 + u8 *base;
 + int i;
 +
 + VMXNET3_WRITE_BAR1_REG(adapter, VMXNET3_REG_CMD,
 VMXNET3_CMD_GET_STATS);
 +
 + /* this does assume each counter is 64-bit wide */
 +
 + base = (u8 *)&adapter->tqd_start->stats;
 + for (i = 0; i < ARRAY_SIZE(vmxnet3_tq_dev_stats); i++)
 + *buf++ = *(u64 *)(base + vmxnet3_tq_dev_stats[i].offset);
 +
 + base = (u8 *)&adapter->tx_queue.stats;
 + for (i = 0; i < ARRAY_SIZE(vmxnet3_tq_driver_stats); i++)
 + *buf++ = *(u64 *)(base + vmxnet3_tq_driver_stats[i].offset);
 +
 + base = (u8 *)&adapter->rqd_start->stats;

There's a lot of code like this that isn't indented properly.  Either
that or your email client has corrupted the patch by breaking up long
lines or similar.

Another example:

 +static int
 +vmxnet3_set_rx_csum(struct net_device *netdev, u32 val)
 +{
 + struct vmxnet3_adapter *adapter = netdev_priv(netdev);
 +
 + if (adapter->rxcsum != val) {
 + adapter->rxcsum = val;
 + if (netif_running(netdev)) {
 + if (val)
 + adapter->shared->devRead.misc.uptFeatures |=
 + UPT1_F_RXCSUM;
 + else
 + adapter->shared->devRead.misc.uptFeatures &=
 + ~UPT1_F_RXCSUM;
 +
 + VMXNET3_WRITE_BAR1_REG(adapter, VMXNET3_REG_CMD,
 + VMXNET3_CMD_UPDATE_FEATURE);
 + }
 + }
 + return 0;
 +}

Yikes! :-)
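
For comparison, a userspace sketch of the same set/clear logic as the quoted function, laid out with tab indentation in the usual kernel style. The struct, the flag value and the function name are stand-ins chosen for illustration, not the driver's definitions, and the register write is modelled by a simple counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;	/* userspace stand-in for the kernel's u32 */

#define TOY_F_RXCSUM 0x0001u	/* hypothetical value, not the real UPT1_F_RXCSUM */

/* Hypothetical stand-in for the adapter state the quoted function touches. */
struct toy_adapter {
	u32 rxcsum;		/* cached on/off setting */
	int running;		/* stands in for netif_running() */
	u32 features;		/* stands in for shared->devRead.misc.uptFeatures */
	unsigned int cmds;	/* counts "update feature" commands issued */
};

static int toy_set_rx_csum(struct toy_adapter *adapter, u32 val)
{
	assert(adapter != NULL);

	if (adapter->rxcsum != val) {
		adapter->rxcsum = val;
		if (adapter->running) {
			if (val)
				adapter->features |= TOY_F_RXCSUM;
			else
				adapter->features &= ~TOY_F_RXCSUM;

			adapter->cmds++;	/* models the UPDATE_FEATURE command write */
		}
	}
	return 0;
}
```

Each nesting level gets one tab, so the set (|=) and clear (&= ~) branches are easy to tell apart at a glance.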



Re: [PATCH 2.6.32-rc4] net: VMware virtual Ethernet NIC driver: vmxnet3

2009-10-13 Thread David Miller
From: Shreyas Bhatewara sbhatew...@vmware.com
Date: Mon, 12 Oct 2009 15:18:42 -0700 (PDT)

 
 Ethernet NIC driver for VMware's vmxnet3
 
 From: Shreyas Bhatewara sbhatew...@vmware.com
 
 This patch adds driver support for VMware's virtual Ethernet NIC: vmxnet3
 Guests running on VMware hypervisors supporting vmxnet3 device will thus have
 access to improved network functionalities and performance.
 
 Signed-off-by: Shreyas Bhatewara sbhatew...@vmware.com
 Signed-off-by: Bhavesh Davda bhav...@vmware.com
 Signed-off-by: Ronghua Zhang rong...@vmware.com

Ok, looks good, applied to net-2.6, thanks!


Re: [PATCHv7 1/3] tun: export underlying socket

2009-11-04 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Tue, 3 Nov 2009 19:24:00 +0200

 Assuming it's okay with davem, I think it makes sense to merge this
 patch through Rusty's tree because vhost is the first user of the new
 interface.  Posted here for completeness.

I'm fine with that, please add my:

Acked-by: David S. Miller da...@davemloft.net


Re: [PATCHv3 0/4] macvlan: add vepa and bridge mode

2009-11-26 Thread David Miller
From: Patrick McHardy ka...@trash.net
Date: Thu, 26 Nov 2009 17:26:17 +0100

 Arnd Bergmann wrote:
 Version 2 description:
 The patch to iproute2 has not changed, so I'm not including
 it this time. Patch 4/4 (the netlink interface) is basically
 unchanged as well but included for completeness.
 
 The other changes have moved forward a bit, to the point where
 I find them a lot cleaner and am more confident in the code
 being ready for inclusion. The implementation hardly resembles
 Eric's original patch now, so I've dropped his signed-off-by.
 
 Please take a look and ack if you are happy so we can get it
 into 2.6.33.
 
 Looks good to me, nice work.
 
 Acked-by: Patrick McHardy ka...@trash.net
 
 for the entire series.

All applied to net-next-2.6, thanks everyone!


Re: [RFC] macvlan: add tap device backend

2009-12-14 Thread David Miller
From: Arnd Bergmann a...@arndb.de
Date: Mon, 14 Dec 2009 13:00:36 +0100

 c) prepare a combined patch for net-next.git, or

This is probably fine.

I'll be taking patches into net-next-2.6 right after Linus
releases 2.6.33-rc1.


Re: [PATCH 0/2] virtio net improvements

2010-02-02 Thread David Miller
From: Rusty Russell ru...@rustcorp.com.au
Date: Wed, 3 Feb 2010 09:57:06 +1030

 On Fri, 29 Jan 2010 11:46:43 pm Rusty Russell wrote:
 Hi Dave,
 
 Nice driver optimization from Shirley, but requires a new virtio hook.
 Do you want to take both?  I have nothing else overlapping it.
 
 Dave, any news on this?

Just slowly creeping up the backlog :-)



Re: [PATCH 1/2] virtio: Add ability to detach unused buffers from vrings

2010-02-02 Thread David Miller
From: Rusty Russell ru...@rustcorp.com.au
Date: Fri, 29 Jan 2010 23:49:05 +1030

 From: Shirley Ma mashi...@us.ibm.com
 
 There's currently no way for a virtio driver to ask for unused
 buffers, so it has to keep a list itself to reclaim them at shutdown.
 This is redundant, since virtio_ring stores that information.  So
 add a new hook to do this.
 
 Signed-off-by: Shirley Ma x...@us.ibm.com
 Signed-off-by: Amit Shah amit.s...@redhat.com
 Signed-off-by: Rusty Russell ru...@rustcorp.com.au

Applied.


Re: [PATCH 2/2] virtio_net: Defer skb allocation in receive path Date: Wed, 13 Jan 2010 12:53:38 -0800

2010-02-02 Thread David Miller
From: Rusty Russell ru...@rustcorp.com.au
Date: Fri, 29 Jan 2010 23:50:04 +1030

 From: Shirley Ma mashi...@us.ibm.com
 
 virtio_net receives packets from its pre-allocated vring buffers, then it 
 delivers these packets to upper layer protocols as skb buffs. So it's not
 necessary to pre-allocate skb for each mergeable buffer, then free extra
 skbs when buffers are merged into a large packet. This patch has deferred 
 skb allocation in receiving packets for both big packets and mergeable buffers
 to reduce skb pre-allocations and skb frees. It frees unused buffers by
 calling detach_unused_buf in vring, so recv skb queue is not needed.
 
 Signed-off-by: Shirley Ma x...@us.ibm.com
 Signed-off-by: Rusty Russell ru...@rustcorp.com.au

Applied.


Re: [PATCH 0/3 v4] macvtap driver

2010-02-03 Thread David Miller
From: Arnd Bergmann a...@arndb.de
Date: Sat, 30 Jan 2010 23:22:15 +0100

 This is the fourth version of the macvtap driver,
 based on the comments I got for the last version
 I got a few days ago. Very few changes:
 
 * release netdev in chardev open function so
   we can destroy it properly.
 * Implement TUNSETSNDBUF
 * fix sleeping call in rcu_read_lock
 * Fix comment in namespace isolation patch
 * Fix small context difference to make it apply
   to net-next
 
 I can't really test here while travelling, so please
 give it a go if you're interested in this driver.

All applied to net-next-2.6, thanks!


Re: [PATCH] vhost-net: switch to smp barriers

2010-02-14 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Sat, 13 Feb 2010 19:39:11 +0200

 Dave, I see it's marked not applicable:
 http://patchwork.ozlabs.org/patch/44207/
 the patch applies to net-next as of
 b3b3f04fb587ecb61b5baa6c1c5f0e666fd12d73.
 Can this be queued up please?
 Should I resubmit with Rusty's ack?

Sorry about that, I must have thought Rusty would queue
it up.

I'll fix the state to under-review and process it in my
backlog.

Thanks.


Re: [PATCH 3/3] vhost: fix get_user_pages_fast error handling

2010-02-23 Thread David Miller

Just for the record I'm generally not interested in vhost
patches.

If it's a specific network one that will be merged via
the networking tree, yes please CC: me.

But if it's a bunch of changes to vhost.c and other pieces
of infrastructure, feel free to leave me out of it.  It just
clutters my already overflowing inbox.

Thanks.


Re: [PATCH 3/3] vhost: fix get_user_pages_fast error handling

2010-02-23 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Wed, 24 Feb 2010 07:37:37 +0200

 Dave, so while Rusty's on vacation, what's the best way to get vhost
 infrastructure fixes in? Are you ok with getting pull requests and
 merging them into net-next?  That should keep the clutter in your inbox
 to the minimum.
 
 Of course network changes would still go the usual way.

Well, who is providing oversight of vhost work while he's
gone?  Has he, implicitly or explicitly, appointed a maintainer
while he's away?


Re: [PATCH 3/3] vhost: fix get_user_pages_fast error handling

2010-02-23 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Wed, 24 Feb 2010 09:34:25 +0200

 Implicitly, I guess. He said if there's an issue Michael Tsirkin is the
 best person to resolve it, this was wrt merging his virtio&lguest tree.
 He didn't mention vhost, I wrote all of vhost though, there shouldn't be
 an issue with that.

That's good enough for me.

Feel free to setup a tree for me to pull from.


Re: [GIT PULL] vhost-net fixes for 2.6.34

2010-02-28 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Sun, 28 Feb 2010 20:44:40 +0200

 The following changes since commit 655ffee284dfcf9a24ac0343f3e5ee6db85b85c5:
   Jiri Pirko (1):
 wireless: convert to use netdev_for_each_mc_addr
 
 are available in the git repository at:
 
   git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost

Pulled, thanks.


Re: [GIT PULL] vhost-net fixes for issues in 2.6.34-rc1

2010-03-20 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Thu, 18 Mar 2010 11:53:55 +0200

 The following tree includes patches fixing issues with vhost-net in
 2.6.34-rc1.  Please pull them for 2.6.34.

Pulled, thanks a lot.


Re: [GIT PULL] vhost-net fix for 2.6.34-rc3

2010-04-07 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Wed, 7 Apr 2010 20:35:02 +0300

 David,
 The following tree includes a patch fixing an issue with vhost-net in
 2.6.34-rc3.  Please pull for 2.6.34.

Pulled, thanks Michael.


Re: [GIT PULL] first round of vhost-net enhancements for net-next

2010-05-03 Thread David Miller
From: David Miller da...@davemloft.net
Date: Mon, 03 May 2010 15:07:29 -0700 (PDT)

 From: Michael S. Tsirkin m...@redhat.com
 Date: Tue, 4 May 2010 00:32:45 +0300
 
 The following tree includes a couple of enhancements that help vhost-net.
 Please pull them for net-next. Another set of patches is under
 debugging/testing and I hope to get them ready in time for 2.6.35,
 so there may be another pull request later.
 
 Pulled, thanks.

Nevermind, reverted.

Do you even compile test what you send to people?

drivers/net/macvtap.c: In function ‘macvtap_ioctl’:
drivers/net/macvtap.c:713: warning: control reaches end of non-void function

You're really batting 1000 today Michael...
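
The thread does not show macvtap_ioctl() itself, so the following is only a generic illustration of that class of warning, with made-up command numbers and a made-up function name: a non-void function whose switch has no returning default path falls off the end, and gcc reports "control reaches end of non-void function".

```c
#include <assert.h>
#include <errno.h>

#define TOY_CMD_GET 1	/* hypothetical ioctl-style command numbers */
#define TOY_CMD_SET 2

/*
 * Every path through a non-void function must return a value. Without the
 * default case below, an unknown command would leave the switch and reach
 * the closing brace with no return statement.
 */
static long toy_ioctl(unsigned int cmd, long arg)
{
	switch (cmd) {
	case TOY_CMD_GET:
		return 0;
	case TOY_CMD_SET:
		return arg >= 0 ? 0 : -EINVAL;
	default:
		return -ENOTTY;	/* explicit return for unrecognised commands */
	}
}
```

Building with warnings enabled before sending a pull request catches this kind of omission.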

Re: [GIT PULL] first round of vhost-net enhancements for net-next

2010-05-03 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Tue, 4 May 2010 00:32:45 +0300

 The following tree includes a couple of enhancements that help vhost-net.
 Please pull them for net-next. Another set of patches is under
 debugging/testing and I hope to get them ready in time for 2.6.35,
 so there may be another pull request later.

Pulled, thanks.


Re: [GIT PULL] amended: first round of vhost-net enhancements for net-next

2010-05-06 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Tue, 4 May 2010 14:21:01 +0300

 This is an amended pull request: I have rebased the tree to the
 correct patches. This has been through basic tests and seems
 to work fine here.
 
 The following tree includes a couple of enhancements that help vhost-net.
 Please pull them for net-next. Another set of patches is under
 debugging/testing and I hope to get them ready in time for 2.6.35,
 so there may be another pull request later.

Pulled, thanks Michael.


Re: [PATCHv2] vhost-net: add dhclient work-around from userspace

2010-06-29 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Mon, 28 Jun 2010 13:08:07 +0300

 Userspace virtio server has the following hack
 so guests rely on it, and we have to replicate it, too:
 
 Use port number to detect incoming IPv4 DHCP response packets,
 and fill in the checksum for these.
 
 The issue we are solving is that on linux guests, some apps
 that use recvmsg with AF_PACKET sockets, don't know how to
 handle CHECKSUM_PARTIAL;
 The interface to return the relevant information was added
 in 8dc4194474159660d7f37c495e3fc3f10d0db8cc,
 and older userspace does not use it.
 One important user of recvmsg with AF_PACKET is dhclient,
 so we add a work-around just for DHCP.
 
 Don't bother applying the hack to IPv6 as userspace virtio does not
 have a work-around for that - let's hope guests will do the right
 thing wrt IPv6.
 
 Signed-off-by: Michael S. Tsirkin m...@redhat.com

Yikes, this is awful too.

Nothing in the kernel should be mucking around with protocol packets
like this by default.  In particular, what the heck does port 67 mean?
Locally I can use it for whatever I want for my own purposes, I don't
have to follow the conventions for service ports as specified by the
IETF.

But I can't have the packet checksum state be left alone for port 67
traffic on a box using virtio because you have this hack there.

And yes it's broken on machines using the qemu thing, but at least the
hack there is restricted to userspace.

I really don't want anything in the kernel that looks like this.

These applications are broken, and we've provided a way for them to
work properly.  What's the point of having fixed applications if
all of these hacks grow like fungus over every virtualization transport?

It just means that people won't fix the apps, since they don't have
to.  There is no incentive, and the mechanism we created to properly
handle this loses its value.

At best, you can write a netfilter module that mucks up the packet
checksum state in these situations.  At least in that case, you can
make it generic (it mangles iff a packet matches a certain rule,
so for your virtio guests you'd make it match for DHCP frames) instead
of being some hard-coded DHCP thing by design.

And since this is so cleanly separated and portable you don't even
need to push it upstream.  It's a temporary workaround for a temporary
problem.  You can just delete it as soon as the majority of guests
have the fixed dhcp.  The qemu crap should disappear similarly.
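
A rough userspace sketch of the two pieces such a rule-driven workaround would need: a configurable match (instead of a hard-coded port) and the checksum computation itself. This is not netfilter code and uses no netfilter API; the toy_* names are hypothetical, and the checksum routine covers only the RFC 1071 ones'-complement sum, not the UDP pseudo-header handling a real implementation would also need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum over `len` bytes, treating data as big-endian words. */
static uint16_t toy_inet_csum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;	/* pad the odd byte with zero */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);	/* fold carries back in */
	return (uint16_t)~sum;
}

/* Hypothetical administrator-supplied rule: match UDP packets by destination port. */
struct toy_rule {
	uint16_t udp_dport;
};

static int toy_rule_matches(const struct toy_rule *rule, uint16_t udp_dport)
{
	assert(rule != NULL);
	return rule->udp_dport == udp_dport;
}
```

The point of keeping the port in a rule is that whoever configures the host decides which traffic gets its checksum filled in, so port 67 is not special to the kernel.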


Re: [PATCHv2] vhost-net: add dhclient work-around from userspace

2010-06-30 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Tue, 29 Jun 2010 16:04:39 +0300

 Since using the module involves updating the management tools
 as well, if we go down this route it will be much less painful
 for everyone to push it upstream.

Ok, you can make your case to Patrick McHardy and if he'll merge
it into his netfilter GIT tree I guess I'll have to take it :)


Re: [GIT PULL net-2.6] vhost-net: more error handling fixes

2010-07-02 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Thu, 1 Jul 2010 19:41:27 +0300

 David,
 The following tree includes more fixes dealing with
 error handling in vhost-net. It is on top of net-2.6.
 Please merge it for 2.6.35.
 Thanks!
 
 The following changes since commit 38000a94a902e94ca8b5498f7871c6316de8957a:
 
   sky2: enable rx/tx in sky2_phy_reinit() (2010-06-23 14:37:04 -0700)
 
 are available in the git repository at:
   git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost-net
 
 Michael S. Tsirkin (2):
   vhost: break out of polling loop on error
   vhost: add unlikely annotations to error path

Pulled.


Re: [Pv-drivers] RFC: Network Plugin Architecture (NPA) for vmxnet3

2010-07-14 Thread David Miller
From: Pankaj Thakkar pthak...@vmware.com
Date: Wed, 14 Jul 2010 10:18:22 -0700

 The plugin is guest agnostic and hence we did not want to rely on
 any kernel provided functions.

While I disagree entirely with this kind of approach, even that
doesn't justify what you're doing here.

memcpy() and memset() are on a much more fundamental ground than
kernel provided functions.  They had better be available no matter
where you build this thing.

And doing what you're doing is foolish on so many levels.  One more
duplication of code, one more place for unnecessary bugs to live, one
more place that might need optimizations and thus require duplication
of even more work people have done over the years.


Re: [GIT PULL] vhost-net fixes

2010-07-16 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Fri, 16 Jul 2010 15:25:30 +0300

 David, please pull the following fixes for 2.6.35.
 Thanks!
 
 The following changes since commit 91a72a70594e5212c97705ca6a694bd307f7a26b:
 
   net/core: neighbour update Oops (2010-07-14 18:02:16 -0700)
 
 are available in the git repository at:
   git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost-net
 
 Michael S. Tsirkin (2):
   vhost-net: avoid flush under lock
   vhost: avoid pr_err on condition guest can trigger

Pulled, thanks!


Re: [GIT PULL net-next-2.6] vhost-net patchset for 2.6.36

2010-08-02 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Wed, 28 Jul 2010 16:32:31 +0300

 The following changes since commit 4cfa580e7eebb8694b875d2caff3b989ada2efac:
 
   r6040: Fix args to phy_mii_ioctl(). (2010-07-21 21:10:49 -0700)
 
 are available in the git repository at:
   git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost-net-next
 

Pulled, thanks Michael.


Re: [GIT PULL net-2.6] vhost-net: 2.6.36 regression fixes

2010-09-09 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Mon, 6 Sep 2010 14:36:06 +0300

 The following tree includes more regression fixes for vhost-net
 in 2.6.36.  It is on top of net-2.6.
 Please merge it for 2.6.36.

Pulled, thanks Michael.


Re: [GIT PULL net-2.6] vhost-net: fix range checking

2010-09-20 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Mon, 20 Sep 2010 19:42:22 +0200

   git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost-net

Pulled, thanks!


Re: [GIT PULL net-next-2.6] vhost-net patchset for 2.6.37

2010-10-06 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Tue, 5 Oct 2010 20:27:32 +0200

 It looks like it was a quiet cycle for vhost-net:
 probably because most of energy was spent on bugfixes
 that went in for 2.6.36.
 People are working on multiqueue, tracing but I'm not
 sure it'll get done in time for 2.6.37 - so here's
 a tree with a single patch that helps windows guests
 which we definitely want in the next kernel.
 
 Please merge for 2.6.37.

Pulled, thanks Michael.


Re: [GIT PULL net-2.6] vhost-net: access_ok fix

2010-10-21 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Tue, 19 Oct 2010 16:59:01 +0200

 David,
 Not sure if it's too late for 2.6.36 - in case it's not, the following tree
 includes a last minute bugfix for vhost-net, found by code inspection.
 It is on top of net-2.6.
 Thanks!
 
 The following changes since commit b0057c51db66c5f0f38059f242c57d61c4741d89:
 
   tg3: restore rx_dropped accounting (2010-10-11 16:06:24 -0700)
 
 are available in the git repository at:
   git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost-net

Even though it's too late, I've pulled this.


Re: [GIT PULL net-2.6] vhost-net: rcu fixup

2010-11-28 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Thu, 25 Nov 2010 14:23:01 +0200

 Please merge the following fix for 2.6.36.
 Thanks!
 
 The following changes since commit a27e13d370415add3487949c60810e36069a23a6:
 
   econet: fix CVE-2010-3848 (2010-11-24 11:51:47 -0800)
 
 are available in the git repository at:
   git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost-net
 
 Michael S. Tsirkin (1):
   vhost/net: fix rcu check usage
 

Pulled, thanks Michael.


Re: [GIT PULL net-2.6] vhost-net: logging fixup

2010-12-12 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Sun, 12 Dec 2010 12:09:43 +0200

 Please merge the following fix for 2.6.37.
 It is also applicable to -stable.
 Thanks!
 
 The following changes since commit a19faf0250e09b16cac169354126404bc8aa342b:
 
   net: fix skb_defer_rx_timestamp() (2010-12-10 16:20:56 -0800)

Pulled, thanks.


Re: [GIT PULL net-next-2.6] vhost-net: tools, cleanups, optimizations

2010-12-14 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Tue, 14 Dec 2010 14:23:26 +0200

 On Mon, Dec 13, 2010 at 12:44:13PM +0200, Michael S. Tsirkin wrote:
 Please merge the following tree for 2.6.38.
 Thanks!
 
 Rusty Acked it as is, so please pull the below.
 Thanks very much!
 
 The following changes since commit ad1184c6cf067a13e8cb2a4e7ccc407f947027d0:
 
   net: au1000_eth: remove unused global variable. (2010-12-11 12:01:48 -0800)
 
 are available in the git repository at:
   git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost-net-next

Pulled, thanks a lot.


Re: [PULL] vhost-net: 2.6.38 - warning fix

2011-02-01 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Tue, 1 Feb 2011 17:44:40 +0200

 git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost-net

Pulled, thanks Michael.


Re: [PATCH] virtio_net: Add schedule check to napi_enable call

2011-02-10 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Thu, 10 Feb 2011 19:57:26 +0200

 On Thu, Feb 10, 2011 at 12:32:50PM +1030, Rusty Russell wrote:
 From: Bruce Rogers brog...@novell.com
 
 Under harsh testing conditions, including low memory, the guest would
 stop receiving packets. With this patch applied we no longer see any
 problems in the driver while performing these tests for extended periods
 of time.
 
 Make sure napi is scheduled subsequent to each napi_enable.
 
 Signed-off-by: Bruce Rogers brog...@novell.com
 Signed-off-by: Olaf Kirch o...@suse.de
 Cc: sta...@kernel.org
 Signed-off-by: Rusty Russell ru...@rustcorp.com.au
 
 Rusty, so this is 2.6.38 material - you'll send this to Linus? Or DaveM?

Don't worry I'll apply this to net-2.6, thanks.


Re: [PATCH] virtio_net: Add schedule check to napi_enable call

2011-02-10 Thread David Miller
From: Rusty Russell ru...@rustcorp.com.au
Date: Thu, 10 Feb 2011 12:32:50 +1030

 From: Bruce Rogers brog...@novell.com
 
 Under harsh testing conditions, including low memory, the guest would
 stop receiving packets. With this patch applied we no longer see any
 problems in the driver while performing these tests for extended periods
 of time.
 
 Make sure napi is scheduled subsequent to each napi_enable.
 
 Signed-off-by: Bruce Rogers brog...@novell.com
 Signed-off-by: Olaf Kirch o...@suse.de
 Cc: sta...@kernel.org
 Signed-off-by: Rusty Russell ru...@rustcorp.com.au

Applied, thanks everyone.


Re: [PATCH] vhost: copy_from_user -> __copy_from_user

2011-03-06 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Sun, 6 Mar 2011 13:33:49 +0200

 copy_from_user is pretty high on perf top profile,
 replacing it with __copy_from_user helps.
 It's also safe because we do access_ok checks during setup.
 
 Signed-off-by: Michael S. Tsirkin m...@redhat.com

Is Rusty going to take this or should I?


Re: [PULL net-2.6] vhost: cleanups and fixes

2011-03-20 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Thu, 17 Mar 2011 16:04:04 +0200

 The following changes since commit 1fc050a13473348f5c439de2bb41c8e92dba5588:
 
   ipv4: Cache source address in nexthop entries. (2011-03-07 20:54:48 -0800)
 
 are available in the git repository at:
   git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost-net-next
 
 Jason Wang (3):
   vhost-net: check the support of mergeable buffer outside the receive loop
   vhost-net: Unify the code of mergeable and big buffer handling
   vhost: lock receive queue, not the socket
 
 Krishna Kumar (1):
   vhost: Cleanup vhost.c and net.c
 
 Michael S. Tsirkin (2):
   vhost: copy_from_user -> __copy_from_user
   vhost-net: remove unlocked use of receive_queue

Pulled, thanks.


Re: [PATCH] virtio_net: convert to hw_features

2011-04-01 Thread David Miller
From: Michał Mirosław mirq-li...@rere.qmqm.pl
Date: Thu, 31 Mar 2011 13:01:35 +0200 (CEST)

 Signed-off-by: Michał Mirosław mirq-li...@rere.qmqm.pl

Applied.

Re: [PATCH RESEND] net: convert xen-netfront to hw_features

2011-04-01 Thread David Miller
From: Michał Mirosław mirq-li...@rere.qmqm.pl
Date: Thu, 31 Mar 2011 13:01:35 +0200 (CEST)

 Not tested in any way. The original code for offload setting seems broken
 as it resets the features on every netback reconnect.
 
 This will set GSO_ROBUST at device creation time (earlier than connect time).
 
 RX checksum offload is forced on - so advertise as it is.
 
 Signed-off-by: Michał Mirosław mirq-li...@rere.qmqm.pl

Applied.

Re: [PATCH RESEND] net: convert xen-netfront to hw_features

2011-04-04 Thread David Miller
From: Ian Campbell ian.campb...@eu.citrix.com
Date: Mon, 4 Apr 2011 13:29:19 +0100

From 0b56469abe56efae415b4603ef508ce9aec0e4c1 Mon Sep 17 00:00:00 2001
 From: Ian Campbell ian.campb...@citrix.com
 Date: Mon, 4 Apr 2011 10:58:50 +0100
 Subject: [PATCH] xen: netfront: assume all hw features are available until backend connection setup
 
 We need to assume that all features will be available when registering the
 netdev; otherwise they are omitted from the initial set of
 dev->wanted_features. When we connect to the backend we reduce the set as
 necessary due to the call to netdev_update_features() in xennet_connect().
 
 Signed-off-by: Ian Campbell ian.campb...@citrix.com

I've applied this, thanks Ian.


Re: Signed bit field; int have_hotplug_status_watch:1

2011-04-06 Thread David Miller
From: Ian Campbell ian.campb...@eu.citrix.com
Date: Mon, 4 Apr 2011 09:26:24 +0100

 Subject: [PATCH] xen: netback: use unsigned type for one-bit bitfield.
 
 Fixes error from sparse:
   CHECK   drivers/net/xen-netback/xenbus.c
 drivers/net/xen-netback/xenbus.c:29:40: error: dubious one-bit signed bitfield
 
 int have_hotplug_status_watch:1;
 
 Reported-by: Dr. David Alan Gilbert li...@treblig.org
 Signed-off-by: Ian Campbell ian.campb...@citrix.com

Applied to net-next-2.6, thanks.
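
For readers wondering why sparse calls a one-bit signed bitfield dubious:
such a field has only a sign bit, so it can hold 0 or -1 but never 1. Below
is a minimal user-space sketch, not the netback code; the struct and
function names are hypothetical, and storing 1 into the signed field is
implementation-defined (GCC and Clang give -1).

```c
#include <assert.h>

/* Hypothetical stand-ins for the netback structure; only the field types matter. */
struct flags_signed {
	signed int have_watch:1;    /* one bit, and that bit is the sign bit */
};

struct flags_unsigned {
	unsigned int have_watch:1;  /* one bit, holds 0 or 1 */
};

/* Store v (0 or 1) in a signed one-bit field and read it back. */
int store_in_signed_bit(int v)
{
	struct flags_signed f = { 0 };

	assert(v == 0 || v == 1);
	f.have_watch = v;           /* 1 does not fit; becomes -1 on GCC/Clang */
	return f.have_watch;
}

/* Store v (0 or 1) in an unsigned one-bit field and read it back. */
int store_in_unsigned_bit(int v)
{
	struct flags_unsigned f = { 0 };

	assert(v == 0 || v == 1);
	f.have_watch = v;
	return f.have_watch;
}
```

A test such as "if (flag == 1)" therefore silently fails with the signed
variant, which is the kind of surprise the unsigned type avoids.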


Re: [PATCH] xen: drop anti-dependency on X86_VISWS

2011-04-06 Thread David Miller
From: Ian Campbell ian.campb...@eu.citrix.com
Date: Mon, 4 Apr 2011 10:55:55 +0100

 You mean the !X86_VISWS I presume? It doesn't make sense to me either.

No, I think 32-bit x86 allmodconfig elides XEN because of its X86_TSC
dependency.

And, well, you could type make allmodconfig on your tree and see for
yourself instead of asking me :-)


Re: [PATCHv2 00/14] virtio and vhost-net performance enhancements

2011-05-19 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Fri, 20 May 2011 02:10:07 +0300

 Rusty, I think it will be easier to merge vhost and virtio bits in one
 go. Can it all go in through your tree (Dave in the past acked
 sending a very similar patch through you so should not be a problem)?

And in case you want an explicit ack for the net bits:

Acked-by: David S. Miller da...@davemloft.net

:-)


Re: [PATCH] virtio_net: introduce VIRTIO_NET_HDR_F_DATA_VALID

2011-06-11 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Fri, 10 Jun 2011 14:28:28 +0300

 On Fri, Jun 10, 2011 at 06:56:17PM +0800, Jason Wang wrote:
 There's no need for the guest to validate the checksum if it has already
 been validated by the host NIC. So this patch introduces a new flag,
 VIRTIO_NET_HDR_F_DATA_VALID, which is used to bypass checksum
 examination in the guest. The backend (tap/macvtap) may set this flag when
 it meets skbs with CHECKSUM_UNNECESSARY, to save CPU utilization.
 
 No feature negotiation is needed, as old drivers just ignore this flag.
 
 This wasn't required by the spec, but maybe it should be.
 
 Iperf shows a 12%-30% performance improvement for UDP traffic. For TCP,
 there is no difference when GRO is on, as it produces skbs with partial
 checksums. But when GRO is disabled, an improvement of 20% or even higher
 could be measured by netperf.
 
 Signed-off-by: Jason Wang jasow...@redhat.com
 
 
 Acked-by: Michael S. Tsirkin m...@redhat.com

Applied to net-next-2.6
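
As a rough illustration of the flag handling described in the quoted patch,
here is a user-space sketch of how a receiver might map the virtio-net
header flags to a checksum state. The two flag values are the ones from the
virtio-net header; the enum and the function name are hypothetical stand-ins
for the skb checksum states, not the driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits of the virtio-net header. */
#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1
#define VIRTIO_NET_HDR_F_DATA_VALID 2

/* Hypothetical local mirror of the skb checksum states. */
enum csum_state {
	CSUM_NONE,         /* guest has to verify the checksum itself */
	CSUM_UNNECESSARY,  /* already validated, verification is skipped */
	CSUM_PARTIAL       /* checksum still has to be completed */
};

/* Decide how much checksum work is left for a received packet. */
enum csum_state rx_csum_state(uint8_t hdr_flags)
{
	if (hdr_flags & VIRTIO_NET_HDR_F_NEEDS_CSUM)
		return CSUM_PARTIAL;
	if (hdr_flags & VIRTIO_NET_HDR_F_DATA_VALID)
		return CSUM_UNNECESSARY;
	return CSUM_NONE;
}

/* Small self-check showing the three outcomes. */
void rx_csum_selfcheck(void)
{
	assert(rx_csum_state(0) == CSUM_NONE);
	assert(rx_csum_state(VIRTIO_NET_HDR_F_DATA_VALID) == CSUM_UNNECESSARY);
	assert(rx_csum_state(VIRTIO_NET_HDR_F_NEEDS_CSUM) == CSUM_PARTIAL);
}
```

A receiver that predates the flag simply never tests the second bit and
falls through to full verification, which is why no feature negotiation is
needed.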


Re: [PATCH] virtio-net: per cpu 64 bit stats (v2)

2011-06-17 Thread David Miller
From: Stephen Hemminger shemmin...@vyatta.com
Date: Wed, 15 Jun 2011 12:36:29 -0400

 Use per-cpu variables to maintain 64 bit statistics.
 
 Signed-off-by: Stephen Hemminger shemmin...@vyatta.com

I'll apply this, thanks.
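
The idea behind per-cpu statistics can be sketched in user space as one
counter slot per CPU, summed only when somebody reads the totals. This is an
assumption-laden illustration, not the virtio-net patch: NR_SLOTS, the struct
and the function names are hypothetical, and the extra synchronization the
kernel needs so that 32-bit readers see consistent 64-bit values is left out.

```c
#include <assert.h>
#include <stdint.h>

#define NR_SLOTS 4   /* hypothetical stand-in for the number of CPUs */

struct rx_stats {
	uint64_t packets;
	uint64_t bytes;
};

static struct rx_stats per_slot[NR_SLOTS];

/* Writer side: each CPU touches only its own slot, so nothing is shared. */
void count_rx(unsigned int slot, uint64_t len)
{
	assert(slot < NR_SLOTS);
	per_slot[slot].packets++;
	per_slot[slot].bytes += len;
}

/* Reader side: fold all slots into one 64-bit total on demand. */
struct rx_stats sum_rx(void)
{
	struct rx_stats total = { 0, 0 };

	for (unsigned int i = 0; i < NR_SLOTS; i++) {
		total.packets += per_slot[i].packets;
		total.bytes += per_slot[i].bytes;
	}
	return total;
}
```

The hot path pays for one local increment, and the cost of walking all slots
is moved to the rare statistics read.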


Re: [PATCH 4/4] xen/netback: Add module alias for autoloading

2011-06-30 Thread David Miller
From: Konrad Rzeszutek Wilk konrad.w...@oracle.com
Date: Thu, 30 Jun 2011 12:39:54 -0400

 On Wed, Jun 29, 2011 at 02:41:32PM +0200, Bastian Blank wrote:
 Add xen-backend:vif module alias to the xen-netback module. This allows
 automatic loading of the module.
 
 Dave,
 
 Could you queue this up for 3.1 please? I've the other two patches in my
 tree for 3.1 and the block patch ready for Jens.

Done.


Re: Large Patch Series in Email

2011-07-15 Thread David Miller
From: Michael Witten mfwit...@gmail.com
Date: Fri, 15 Jul 2011 22:55:29 -

 On Sat, 16 Jul 2011 00:09:03 +0300, Dan Carpenter wrote:
 
 On Fri, Jul 15, 2011 at 06:25:55PM -, Michael Witten wrote:
   Do not send more than 15 patches at once to the vger
   mailing lists!!!

 ... Don't be a whinge bucket.
 
 Or be respectful of bandwidth, differing email environments, and the
 official guidelines for submitting patches, which I will quote again:
 
 If you cannot condense your patch set into a smaller set
 of patches, then only post say 15 or so at a time and wait
 for review and integration.
 ...
 Do not send more than 15 patches at once to the vger
 mailing lists!!!

Indeed, it really sucks when people send huge patch sets. Do
not do it.

If the official SubmittingPatches document isn't convincing
enough, then maybe me (the vger postmaster) saying it will.


Re: [PATCHv9] vhost: experimental tx zero-copy support

2011-07-17 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Sun, 17 Jul 2011 22:36:14 +0300

 The below is what I came up with. We add the feature enabled
 by default ...

s/enabled/disabled/  Well, at least you got it right in the
commit message where it counts :-)


Re: [PATCHv11] vhost: vhost TX zero-copy support

2011-07-18 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Mon, 18 Jul 2011 16:48:46 +0300

From: Shirley Ma mashi...@us.ibm.com
 
 This adds experimental zero copy support in vhost-net,
 disabled by default. To enable, set
 experimental_zcopytx module option to 1.
 
 This patch maintains the outstanding userspace buffers in the
 sequence in which they are delivered to vhost. The outstanding userspace
 buffers will be marked as done once the lower device's DMA of those buffers
 has finished. This is monitored through the last-reference kfree_skb
 callback. Two buffer indices are used for this purpose.
 
 The vhost-net device passes the userspace buffers info to the lower device
 skb through message control. DMA done status check and guest
 notification are handled by handle_tx: in the worst case, all buffers
 in the vq are in pending/done status, so we need to notify the guest to
 release DMA done buffers first before we get any new buffers from the
 vq.
 
 One known problem is that if the guest stops submitting
 buffers, buffers might never get used until some
 further action, e.g. device reset. This does not
 seem to affect linux guests.
 
 Signed-off-by: Shirley x...@us.ibm.com
 Signed-off-by: Michael S. Tsirkin m...@redhat.com

Applied, thanks!
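
The two-index bookkeeping described in the quoted commit message can be
sketched in user space roughly as follows. This is only an illustration
under stated assumptions: the ring size, the struct and all function names
are hypothetical, and the caller is assumed to keep fewer than RING_SIZE
buffers in flight.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 8  /* hypothetical; in-flight buffers must stay below this */

struct zc_ring {
	bool dma_done[RING_SIZE]; /* set once the lower device is done with a buffer */
	unsigned int upend_idx;   /* next slot to hand to the lower device */
	unsigned int done_idx;    /* oldest slot not yet reported to the guest */
};

/* Record a newly submitted buffer and return its slot. */
unsigned int zc_submit(struct zc_ring *r)
{
	unsigned int slot = r->upend_idx;

	r->dma_done[slot] = false;
	r->upend_idx = (slot + 1) % RING_SIZE;
	return slot;
}

/* Completion callback: DMA may finish in any order. */
void zc_complete(struct zc_ring *r, unsigned int slot)
{
	assert(slot < RING_SIZE);
	r->dma_done[slot] = true;
}

/*
 * Report finished buffers to the guest strictly in submission order and
 * return how many were reported. A finished buffer behind an unfinished
 * one has to wait.
 */
unsigned int zc_signal_used(struct zc_ring *r)
{
	unsigned int n = 0;

	while (r->done_idx != r->upend_idx && r->dma_done[r->done_idx]) {
		r->done_idx = (r->done_idx + 1) % RING_SIZE;
		n++;
	}
	return n;
}
```

With two buffers submitted and only the second one completed, nothing is
reported yet; once the first completes, both are released together.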


Re: [PATCH repost] Fix panic in virtnet_remove

2011-07-21 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Wed, 20 Jul 2011 17:31:15 +0300

 On Wed, Jul 20, 2011 at 07:26:02PM +0530, Krishna Kumar wrote:
 Fix a panic in virtnet_remove. unregister_netdev has already
 freed up the netdev (and virtnet_info) due to dev->destructor
 being set, while virtnet_info is still required. Remove
 virtnet_free altogether, and move the freeing of the per-cpu
 statistics from virtnet_free to virtnet_remove.
 
 Tested patch below.
 
 Signed-off-by: Krishna Kumar krkum...@in.ibm.com
 
 Also note that the crash was apparently introduced by
 3fa2a1df909482cc234524906e4bd30dee3514df in net-next,
 so this is a net-next only patch.
 
 Stephen, was there any special reason to free the memory
 in the destructor like you did?
 
 Acked-by: Michael S. Tsirkin m...@redhat.com

Applied.


Re: [PULL net] vhost-net: zercopy mode fixes

2011-07-22 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Fri, 22 Jul 2011 09:00:46 +0300

 The following includes vhost-net fixes - both in the
 experimental zero copy mode.
 Please pull for 3.1.
 Thanks!
 

Where is this "the following"?  I don't see any GIT URL to pull
from or anything :-)


Re: [PULL net (try 2)] vhost-net: zercopy mode fixes

2011-07-22 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Fri, 22 Jul 2011 09:32:38 +0300

 Fixing a corrupted pull request sent earlier.
 Sorry about the noise!
 
 The following includes vhost-net fixes - both in the
 experimental zero copy mode.
 Please pull for 3.1.

Pulled, thanks!


Re: [RFC 0/0] Introducing a generic socket offload framework

2011-08-18 Thread David Miller

I'm not reading any RFC without any example code, sorry.


Re: [RFC PATCH net-next v2] enable virtio_net to return bus_info in ethtool -i consistent with emulated NICs

2011-11-16 Thread David Miller
From: r...@tardy.cup.hp.com (Rick Jones)
Date: Mon, 14 Nov 2011 16:17:08 -0800 (PST)

 From: Rick Jones rick.jon...@hp.com
 
 Add a new .bus_name to virtio_config_ops then modify virtio_net to 
 call through to it in an ethtool .get_drvinfo routine to report
 bus_info in ethtool -i output which is consistent with other
 emulated NICs and the output of lspci.  
 
 Signed-off-by: Rick Jones rick.jon...@hp.com
 
 ---
 
 The changes to drivers/lguest/lguest_device.c, drivers/s390/kvm/kvm_virtio.c,
 and drivers/virtio/virtio_mmio.c code inspected only, not compiled.

Applied, thanks Rick.


Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first

2011-11-24 Thread David Miller
From: Krishna Kumar2 krkum...@in.ibm.com
Date: Fri, 25 Nov 2011 09:39:11 +0530

 Jason Wang jasow...@redhat.com wrote on 11/25/2011 08:51:57 AM:

 My description is not clear again :(
 I mean the same vhost thread:

 vhost thread #0 transmits packets of flow A on processor M
 ...
 vhost thread #0 moves to another processor N and starts to transmit packets
 of flow A
 
 Thanks for clarifying. Yes, binding vhosts to CPU's
 makes the incoming packet go to the same vhost each
 time. BTW, are you doing any binding and/or irqbalance
 when you run your tests? I am not running either at
 this time, but thought both might be useful.

So are we going with this patch or are we saying that vhost binding
is a requirement?
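
For context, "use rxhash first" in the subject can be pictured with the
user-space sketch below: prefer the flow hash, fall back to the recorded
receive queue, and only then to the first queue. This is an assumption about
the ordering implied by the patch title, not the actual
macvtap_get_queue(); the function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pick a queue index for a packet.
 *   rxhash:     flow hash, 0 meaning "no hash available"
 *   rx_queue:   receive queue recorded for the packet, or -1 if none
 *   num_queues: number of open queues, must be greater than 0
 */
unsigned int pick_queue(uint32_t rxhash, int rx_queue, unsigned int num_queues)
{
	assert(num_queues > 0);

	if (rxhash)                 /* same flow, same hash, same queue */
		return rxhash % num_queues;
	if (rx_queue >= 0)          /* no hash: follow the recorded rx queue */
		return (unsigned int)rx_queue % num_queues;
	return 0;                   /* nothing to go on: first queue */
}
```

Because the hash depends only on the flow, the chosen queue stays the same
even if the transmitting thread moves to another processor, which is the
case discussed in the quoted exchange.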


Re: [PATCH] vhost-net: Acquire device lock when releasing device

2011-11-26 Thread David Miller
From: Sasha Levin levinsasha...@gmail.com
Date: Fri, 18 Nov 2011 11:19:42 +0200

 Device lock should be held when releasing a device, and specifically
 when calling vhost_dev_cleanup(). Otherwise, RCU complains about it:
 ...
 Cc: Michael S. Tsirkin m...@redhat.com
 Cc: k...@vger.kernel.org
 Cc: virtualization@lists.linux-foundation.org
 Cc: net...@vger.kernel.org
 Signed-off-by: Sasha Levin levinsasha...@gmail.com

Michael et al., are you guys going to gather this fix or should I
apply it directly to the net tree?

Thanks.


Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first

2011-12-07 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Wed, 7 Dec 2011 18:10:02 +0200

 On Fri, Nov 25, 2011 at 01:35:52AM -0500, David Miller wrote:
 From: Krishna Kumar2 krkum...@in.ibm.com
 Date: Fri, 25 Nov 2011 09:39:11 +0530
 
  Jason Wang jasow...@redhat.com wrote on 11/25/2011 08:51:57 AM:
 
  My description is not clear again :(
  I mean the same vhost thread:
 
  vhost thread #0 transmits packets of flow A on processor M
  ...
  vhost thread #0 moves to another processor N and starts to transmit packets
  of flow A
  
  Thanks for clarifying. Yes, binding vhosts to CPU's
  makes the incoming packet go to the same vhost each
  time. BTW, are you doing any binding and/or irqbalance
  when you run your tests? I am not running either at
  this time, but thought both might be useful.
 
 So are we going with this patch or are we saying that vhost binding
 is a requirement?
 
 OK we didn't come to a conclusion so I would be inclined
 to merge this patch as is for 3.2, and revisit later.
 One question though: do these changes affect userspace
 in any way? For example, will this commit us to
 ensure that a single flow gets a unique hash even
 for strange configurations that transmit the same flow
 from multiple cpus?

Once you sort this out, reply with an Acked-by: for me, thanks.


Re: [PATCH v3 REPOST] xen-netfront: delay gARP until backend switches to Connected

2011-12-09 Thread David Miller
From: Laszlo Ersek ler...@redhat.com
Date: Fri,  9 Dec 2011 12:38:58 +0100

 These two together provide complete ordering. Sub-condition (1) is
 satisfied by pvops commit 43223efd9bfd.

I don't see this commit in Linus's tree, so I doubt it's valid for
me to apply this as a bug fix to my 'net' tree since the precondition
pvops commit isn't upstream as far as I can tell.

Where did you intend me to apply this patch, and how did you expect
the dependent commit to make its way into the tree so that this
fix is complete?

BTW, you should always explicitly specify the answers to all the
questions in the previous paragraph, otherwise (like right now) we
go back and forth wasting time establishing these facts.

The way to say which tree the patch is intended for is to specify
it in the Subject like, f.e. [PATCH net-next v3 REPOST] ...



Re: [PATCH v3 REPOST] xen-netfront: delay gARP until backend switches to Connected

2011-12-09 Thread David Miller
From: Ian Campbell ian.campb...@citrix.com
Date: Fri, 9 Dec 2011 21:23:00 +

 On Fri, 2011-12-09 at 18:45 +, David Miller wrote:
 From: Laszlo Ersek ler...@redhat.com
 Date: Fri,  9 Dec 2011 12:38:58 +0100
 
  These two together provide complete ordering. Sub-condition (1) is
  satisfied by pvops commit 43223efd9bfd.
 
 I don't see this commit in Linus's tree,
 
 The referenced commit is in
 git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git#xen/next-2.6.32
 which some people call the pvops tree, but there's no reason to expect someone
 outside the Xen world to know that...
 
 A better reference would have been 6b0b80ca7165 in
 git://xenbits.xen.org/people/ianc/linux-2.6.git#upstream/dom0/backend/netback-history
 which is the precise branch that was flattened to make f942dc2552b8, which
 is the upstream commit that added netback, so this change is already in
 upstream.

I want the commit message fixed so someone seeing the commit ID can
figure out what it actually refers to.


Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first

2011-12-20 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Tue, 20 Dec 2011 13:15:12 +0200

 On Wed, Dec 07, 2011 at 01:52:35PM -0500, David Miller wrote:
 Once you sort this out, reply with an Acked-by: for me, thanks.
 
 Acked-by: Michael S. Tsirkin m...@redhat.com

Applied.


Re: [RFC PATCH v1 1/2] virtio_net: Pass gfp flags when allocating rx buffers.

2012-01-05 Thread David Miller
From: Rusty Russell ru...@rustcorp.com.au
Date: Thu, 05 Jan 2012 10:40:02 +1030

 On Wed, 04 Jan 2012 14:52:32 -0800, Mike Waychison mi...@google.com wrote:
 Currently, the refill path for RX buffers will always allocate the
 buffers as GFP_ATOMIC, even if we are in process context.  This will
 fail to apply memory pressure as the worker thread will not contribute
 to the freeing of memory.
 
 Fix this by changing add_recvbuf_small to use the gfp variant allocator,
 __netdev_alloc_skb_ip_align().
 
 Signed-off-by: Mike Waychison mi...@google.com
 
 OK, this is a no-brainer.  Thanks!  Dave, can you pick this up?
 
 Acked-by: Rusty Russell ru...@rustcorp.com.au

Applied.


Re: [PATCH] vhost-net: add module alias (v2.1)

2012-01-12 Thread David Miller
From: Stephen Hemminger shemmin...@vyatta.com
Date: Wed, 11 Jan 2012 21:30:38 -0800

 Subject: vhost-net: add module alias (v2.1)
 
 By adding some module aliases, programs (or users) won't have to explicitly
 call modprobe. Vhost-net will always be available if built into the kernel.
 It does require assigning a permanent minor number for depmod to work.
 
 Also:
   - use C99 style initialization.
   - add missing entry in documentation for loop-control
 
 Signed-off-by: Stephen Hemminger shemmin...@vyatta.com

ACKs, NACKs?  What is happening here?


Re: [PATCH] vhost-net: add module alias (v2.1)

2012-01-13 Thread David Miller
From: Kay Sievers kay.siev...@vrfy.org
Date: Fri, 13 Jan 2012 05:19:05 +0100

 On Fri, Jan 13, 2012 at 05:07, David Miller da...@davemloft.net wrote:
 From: Stephen Hemminger shemmin...@vyatta.com
 Date: Wed, 11 Jan 2012 21:30:38 -0800

 Subject: vhost-net: add module alias (v2.1)

 By adding some module aliases, programs (or users) won't have to explicitly
 call modprobe. Vhost-net will always be available if built into the kernel.
 It does require assigning a permanent minor number for depmod to work.

 Also:
   - use C99 style initialization.
   - add missing entry in documentation for loop-control

 Signed-off-by: Stephen Hemminger shemmin...@vyatta.com

 ACKs, NACKs?  What is happening here?
 
 In general, static minors are acceptable and very useful to make
 on-demand loading of kernel modules work. They should be used only
 for single-instance devices though, which usually means: One single
 static device name associated with a module.
 
 That looks all fine here, and for what it's worth:
   Acked-By: Kay Sievers kay.siev...@vrfy.org

Ok, applied, thanks everyone.


Re: [PATCH] vhost-net: add module alias (v2.1)

2012-01-16 Thread David Miller
From: Stephen Hemminger shemmin...@vyatta.com
Date: Mon, 16 Jan 2012 07:52:36 -0800

 On Mon, 16 Jan 2012 12:26:45 +
 Alan Cox a...@linux.intel.com wrote:
 
   ACKs, NACKs?  What is happening here?
  
  I would like an Ack from Alan Cox who switched vhost-net
  to a dynamic minor in the first place, in commit
  79907d89c397b8bc2e05b347ec94e928ea919d33.
 
 Sorry dev...@lanana.org isn't yet back from the kernel hack incident.
 
 I don't read netdev so someone needs to summarise the issue and send me
 a copy of the patch to look at.
 
 Alan
 
 Subject: vhost-net: add module alias (v2.1)
 
 By adding some module aliases, programs (or users) won't have to explicitly
 call modprobe. Vhost-net will always be available if built into the kernel.
 It does require assigning a permanent minor number for depmod to work.
 
 Also:
   - use C99 style initialization.
   - add missing entry in documentation for loop-control
 
 Signed-off-by: Stephen Hemminger shemmin...@vyatta.com

I already applied your first patch, so you need to give me something
relative to apply on top of your original one.

And it also shows that you're really not generating these patches
against current 'net', otherwise you'd have noticed your other patch
already there.

