Re: [PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

2015-01-28 Thread Hannes Frederic Sowa
On Mi, 2015-01-28 at 15:43 +0200, Michael S. Tsirkin wrote:
 On Wed, Jan 28, 2015 at 11:34:02AM +0100, Hannes Frederic Sowa wrote:
  Hi,
  
  On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
   On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
Hello,

On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
 On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic Sowa wrote:
  On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
   On 01/27/2015 08:47 AM, Hannes Frederic Sowa wrote:
On Di, 2015-01-27 at 10:42 +0200, Michael S. Tsirkin wrote:
On Tue, Jan 27, 2015 at 02:47:54AM +, Ben Hutchings wrote:
On Mon, 2015-01-26 at 09:37 -0500, Vladislav Yasevich wrote:
If the IPv6 fragment id has not been set and we perform
fragmentation due to UFO, select a new fragment id.
When we store the fragment id into skb_shinfo, set the bit
in the skb so we can re-use the selected id.
This preserves the behavior of UFO packets generated on the
host and solves the issue of id generation for packet sockets
and tap/macvtap devices.
   
This patch moves ipv6_select_ident() back in to the header 
file.  
It also provides the helper function that sets skb_shinfo() frag
id and sets the bit.
   
It also makes sure that we select the fragment id when doing
just gso validation, since it's possible for the packet to
come from an untrusted source (VM) and be forwarded through
a UFO enabled device which will expect the fragment id.
   
CC: Eric Dumazet eduma...@google.com
Signed-off-by: Vladislav Yasevich vyase...@redhat.com
---
 include/linux/skbuff.h |  3 ++-
 include/net/ipv6.h     |  2 ++
 net/ipv6/ip6_output.c  |  4 ++--
 net/ipv6/output_core.c |  9 ++++++++-
 net/ipv6/udp_offload.c | 10 +++++++++-
 5 files changed, 23 insertions(+), 5 deletions(-)
   
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 85ab7d7..3ad5203 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -605,7 +605,8 @@ struct sk_buff {
   __u8 ipvs_property:1;
   __u8 inner_protocol_type:1;
   __u8 remcsum_offload:1;
-  /* 3 or 5 bit hole */
+  __u8 ufo_fragid_set:1;
[...]
   
Doesn't the flag belong in struct skb_shared_info, rather 
than struct
sk_buff?  Otherwise this looks fine.
   
Ben.
   
Hmm we seem to be out of tx flags.
Maybe ip6_frag_id == 0 should mean not set.

Maybe that is the best idea. Definitely the ufo_fragid_set bit 
should
move into the skb_shared_info area.
   
   That's what I originally wanted to do, but had to move and grow 
   txflags thus
   skb_shinfo ended up growing.  I wanted to avoid that, so stole an 
   skb flag.
   
   I considered treating fragid == 0 as unset, but a 0 fragid is 
   perfectly valid
   from the protocol perspective and could actually be generated by 
   the id generator
   functions.  This may cause us to call the id generation multiple 
   times.
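
The trade-off described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct frag_state`, `frag_id_with_flag()`, `frag_id_with_sentinel()` and `zero_gen()` are hypothetical names, not kernel code, and the struct merely stands in for the frag id stored in skb_shinfo() plus the proposed ufo_fragid_set bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-packet state discussed in the thread. */
struct frag_state {
    uint32_t frag_id;      /* plays the role of the frag id in skb_shinfo() */
    bool     frag_id_set;  /* plays the role of the ufo_fragid_set bit */
};

typedef uint32_t (*id_generator)(void);

/* Flag variant: the generator runs at most once per packet. */
static uint32_t frag_id_with_flag(struct frag_state *st, id_generator gen)
{
    if (!st->frag_id_set) {
        st->frag_id = gen();
        st->frag_id_set = true;
    }
    return st->frag_id;
}

/* Sentinel variant: 0 means "unset", so a generated 0 triggers regeneration. */
static uint32_t frag_id_with_sentinel(struct frag_state *st, id_generator gen)
{
    if (st->frag_id == 0)
        st->frag_id = gen();
    return st->frag_id;
}

/* Worst-case generator for the sentinel scheme: it always returns 0. */
static unsigned int zero_gen_calls;

static uint32_t zero_gen(void)
{
    zero_gen_calls++;
    return 0;
}
```

With `zero_gen` as the generator, two lookups on the same packet invoke the generator once in the flag variant and twice in the sentinel variant, which is the repeated id generation mentioned above.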
  
  Are there plans in the long run to let virtio_net transmit auxiliary
  data to the other end so we can clean all of this up one day?
  
  I don't like the whole situation: looking into the virtio_net 
  headers
  just adding a field for ipv6 fragmentation ids to those small 
  structs
  seems bloated, not doing it feels incorrect. :/
  
  Thoughts?
  
  Bye,
  Hannes
 
 I'm not sure - what will be achieved by generating the IDs guest side 
 as
 opposed to host side?  It's certainly harder to get hold of entropy
 guest-side.

It is not only about entropy but about uniqueness.  Also fragmentation
ids should not be discoverable,
   
   I believe predictable is the language used by the IETF draft.
   
so there are several aspects:

I see fragmentation id generation still as security critical:
When Eric patched the frag id generator in 04ca6973f7c1a0d (ip: make IP
identifiers less predictable) I could patch my kernels and use the
patch regardless of the machine being virtualized or not. It was not
dependent on the hypervisor.
   
   And now it's even easier - just patch the hypervisor, and all VMs
   automatically benefit.
  
  Sometimes the hypervisor is not under my control. You would need to
  patch both kernels in your case - non gso frames would still get the
  fragmentation id generated in the host 

Re: [PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

2015-01-28 Thread Vlad Yasevich
On 01/28/2015 05:34 AM, Hannes Frederic Sowa wrote:
 Hi,
 
 On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
 On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
 Hello,

 On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
 On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic Sowa wrote:
 On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
 On 01/27/2015 08:47 AM, Hannes Frederic Sowa wrote:
 On Di, 2015-01-27 at 10:42 +0200, Michael S. Tsirkin wrote:
 On Tue, Jan 27, 2015 at 02:47:54AM +, Ben Hutchings wrote:
 On Mon, 2015-01-26 at 09:37 -0500, Vladislav Yasevich wrote:
 If the IPv6 fragment id has not been set and we perform
 fragmentation due to UFO, select a new fragment id.
 When we store the fragment id into skb_shinfo, set the bit
 in the skb so we can re-use the selected id.
 This preserves the behavior of UFO packets generated on the
 host and solves the issue of id generation for packet sockets
 and tap/macvtap devices.

 This patch moves ipv6_select_ident() back in to the header file.  
 It also provides the helper function that sets skb_shinfo() frag
 id and sets the bit.

 It also makes sure that we select the fragment id when doing
 just gso validation, since it's possible for the packet to
 come from an untrusted source (VM) and be forwarded through
 a UFO enabled device which will expect the fragment id.

 CC: Eric Dumazet eduma...@google.com
 Signed-off-by: Vladislav Yasevich vyase...@redhat.com
 ---
  include/linux/skbuff.h |  3 ++-
  include/net/ipv6.h     |  2 ++
  net/ipv6/ip6_output.c  |  4 ++--
  net/ipv6/output_core.c |  9 ++++++++-
  net/ipv6/udp_offload.c | 10 +++++++++-
  5 files changed, 23 insertions(+), 5 deletions(-)

 diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
 index 85ab7d7..3ad5203 100644
 --- a/include/linux/skbuff.h
 +++ b/include/linux/skbuff.h
 @@ -605,7 +605,8 @@ struct sk_buff {
    __u8 ipvs_property:1;
    __u8 inner_protocol_type:1;
    __u8 remcsum_offload:1;
 -  /* 3 or 5 bit hole */
 +  __u8 ufo_fragid_set:1;
 [...]

 Doesn't the flag belong in struct skb_shared_info, rather than struct
 sk_buff?  Otherwise this looks fine.

 Ben.

 Hmm we seem to be out of tx flags.
 Maybe ip6_frag_id == 0 should mean not set.

 Maybe that is the best idea. Definitely the ufo_fragid_set bit should
 move into the skb_shared_info area.

 That's what I originally wanted to do, but had to move and grow txflags 
 thus
 skb_shinfo ended up growing.  I wanted to avoid that, so stole an skb 
 flag.

 I considered treating fragid == 0 as unset, but a 0 fragid is perfectly 
 valid
 from the protocol perspective and could actually be generated by the id 
 generator
 functions.  This may cause us to call the id generation multiple times.

 Are there plans in the long run to let virtio_net transmit auxiliary
 data to the other end so we can clean all of this up one day?

 I don't like the whole situation: looking into the virtio_net headers
 just adding a field for ipv6 fragmentation ids to those small structs
 seems bloated, not doing it feels incorrect. :/

 Thoughts?

 Bye,
 Hannes

 I'm not sure - what will be achieved by generating the IDs guest side as
 opposed to host side?  It's certainly harder to get hold of entropy
 guest-side.

 It is not only about entropy but about uniqueness.  Also fragmentation
 ids should not be discoverable,

 I believe predictable is the language used by the IETF draft.

 so there are several aspects:

 I see fragmentation id generation still as security critical:
 When Eric patched the frag id generator in 04ca6973f7c1a0d (ip: make IP
 identifiers less predictable) I could patch my kernels and use the
 patch regardless of the machine being virtualized or not. It was not
 dependent on the hypervisor.

 And now it's even easier - just patch the hypervisor, and all VMs
 automatically benefit.
 
 Sometimes the hypervisor is not under my control. You would need to
 patch both kernels in your case - non gso frames would still get the
 fragmentation id generated in the host kernel.

Why would non-gso frames need a frag id?  We are talking only UDP IPv6
here, so there is no frag id generation if the packet doesn't need to
be fragmented.
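
As a rough sketch of that condition: an id is only needed once the datagram no longer fits into the path MTU. The helper name `ipv6_needs_frag_id()` is hypothetical, and the check assumes no extension headers and ignores the atomic-fragment case raised elsewhere in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

#define IPV6_HDR_LEN 40u  /* fixed IPv6 header size in bytes */

/*
 * l4_len is the UDP header plus payload. A fragment header, and therefore
 * a fragment id, is only needed when the whole packet exceeds the MTU.
 */
static bool ipv6_needs_frag_id(size_t l4_len, size_t mtu)
{
    return IPV6_HDR_LEN + l4_len > mtu;
}
```

For a 1500-byte MTU, 1460 bytes of UDP header and payload still fit into one packet, while 1461 bytes do not.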

 
 I think that is the same reasoning why we
 don't support TOE.
 If we use one generator in the hypervisor in an OpenStack-like setting,
 the host deals with quite a lot of overlay networks. A lot of default
 configurations use the same addresses internally, so on the hypervisor
 the frag id generators would interfere by design.
 I could come up with an attack scenario for DNS servers (again :) ):

 You are sitting next to a DNS server on the same hypervisor and can send
 packets without source validation (because that is handled later on in
 case of openvswitch when the packet is put into the corresponding
 overlay network). You emit a gso packet with the same source 

Re: [PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

2015-01-28 Thread Michael S. Tsirkin
On Wed, Jan 28, 2015 at 11:34:02AM +0100, Hannes Frederic Sowa wrote:
 Hi,
 
 On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
  On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
   Hello,
   
   On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic Sowa wrote:
 On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
  On 01/27/2015 08:47 AM, Hannes Frederic Sowa wrote:
   On Di, 2015-01-27 at 10:42 +0200, Michael S. Tsirkin wrote:
   On Tue, Jan 27, 2015 at 02:47:54AM +, Ben Hutchings wrote:
   On Mon, 2015-01-26 at 09:37 -0500, Vladislav Yasevich wrote:
   If the IPv6 fragment id has not been set and we perform
   fragmentation due to UFO, select a new fragment id.
   When we store the fragment id into skb_shinfo, set the bit
   in the skb so we can re-use the selected id.
   This preserves the behavior of UFO packets generated on the
   host and solves the issue of id generation for packet sockets
   and tap/macvtap devices.
  
   This patch moves ipv6_select_ident() back in to the header 
   file.  
   It also provides the helper function that sets skb_shinfo() 
   frag
   id and sets the bit.
  
   It also makes sure that we select the fragment id when doing
   just gso validation, since it's possible for the packet to
   come from an untrusted source (VM) and be forwarded through
   a UFO enabled device which will expect the fragment id.
  
   CC: Eric Dumazet eduma...@google.com
   Signed-off-by: Vladislav Yasevich vyase...@redhat.com
   ---
include/linux/skbuff.h |  3 ++-
include/net/ipv6.h     |  2 ++
net/ipv6/ip6_output.c  |  4 ++--
net/ipv6/output_core.c |  9 ++++++++-
net/ipv6/udp_offload.c | 10 +++++++++-
5 files changed, 23 insertions(+), 5 deletions(-)
  
   diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
   index 85ab7d7..3ad5203 100644
   --- a/include/linux/skbuff.h
   +++ b/include/linux/skbuff.h
   @@ -605,7 +605,8 @@ struct sk_buff {
      __u8 ipvs_property:1;
      __u8 inner_protocol_type:1;
      __u8 remcsum_offload:1;
   -  /* 3 or 5 bit hole */
   +  __u8 ufo_fragid_set:1;
   [...]
  
   Doesn't the flag belong in struct skb_shared_info, rather than 
   struct
   sk_buff?  Otherwise this looks fine.
  
   Ben.
  
   Hmm we seem to be out of tx flags.
   Maybe ip6_frag_id == 0 should mean not set.
   
   Maybe that is the best idea. Definitely the ufo_fragid_set bit 
   should
   move into the skb_shared_info area.
  
  That's what I originally wanted to do, but had to move and grow 
  txflags thus
  skb_shinfo ended up growing.  I wanted to avoid that, so stole an 
  skb flag.
  
  I considered treating fragid == 0 as unset, but a 0 fragid is 
  perfectly valid
  from the protocol perspective and could actually be generated by 
  the id generator
  functions.  This may cause us to call the id generation multiple 
  times.
 
 Are there plans in the long run to let virtio_net transmit auxiliary
 data to the other end so we can clean all of this up one day?
 
 I don't like the whole situation: looking into the virtio_net headers
 just adding a field for ipv6 fragmentation ids to those small structs
 seems bloated, not doing it feels incorrect. :/
 
 Thoughts?
 
 Bye,
 Hannes

I'm not sure - what will be achieved by generating the IDs guest side as
opposed to host side?  It's certainly harder to get hold of entropy
guest-side.
   
   It is not only about entropy but about uniqueness.  Also fragmentation
   ids should not be discoverable,
  
  I believe predictable is the language used by the IETF draft.
  
   so there are several aspects:
   
   I see fragmentation id generation still as security critical:
   When Eric patched the frag id generator in 04ca6973f7c1a0d (ip: make IP
   identifiers less predictable) I could patch my kernels and use the
   patch regardless of the machine being virtualized or not. It was not
   dependent on the hypervisor.
  
  And now it's even easier - just patch the hypervisor, and all VMs
  automatically benefit.
 
 Sometimes the hypervisor is not under my control. You would need to
 patch both kernels in your case - non gso frames would still get the
 fragmentation id generated in the host kernel.

Confused. You would have to patch both kernels *in your case*.
If it's all done by host, then it's in a single place, on host.

   I think that is the same reasoning why we
   don't support TOE.
   If we use one generator in the hypervisor in an OpenStack-like setting,
   the host 

Re: [PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

2015-01-28 Thread Hannes Frederic Sowa
Hello,

On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
 On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic Sowa wrote:
  On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
   On 01/27/2015 08:47 AM, Hannes Frederic Sowa wrote:
On Di, 2015-01-27 at 10:42 +0200, Michael S. Tsirkin wrote:
On Tue, Jan 27, 2015 at 02:47:54AM +, Ben Hutchings wrote:
On Mon, 2015-01-26 at 09:37 -0500, Vladislav Yasevich wrote:
If the IPv6 fragment id has not been set and we perform
fragmentation due to UFO, select a new fragment id.
When we store the fragment id into skb_shinfo, set the bit
in the skb so we can re-use the selected id.
This preserves the behavior of UFO packets generated on the
host and solves the issue of id generation for packet sockets
and tap/macvtap devices.
   
This patch moves ipv6_select_ident() back in to the header file.  
It also provides the helper function that sets skb_shinfo() frag
id and sets the bit.
   
It also makes sure that we select the fragment id when doing
just gso validation, since it's possible for the packet to
come from an untrusted source (VM) and be forwarded through
a UFO enabled device which will expect the fragment id.
   
CC: Eric Dumazet eduma...@google.com
Signed-off-by: Vladislav Yasevich vyase...@redhat.com
---
 include/linux/skbuff.h |  3 ++-
 include/net/ipv6.h     |  2 ++
 net/ipv6/ip6_output.c  |  4 ++--
 net/ipv6/output_core.c |  9 ++++++++-
 net/ipv6/udp_offload.c | 10 +++++++++-
 5 files changed, 23 insertions(+), 5 deletions(-)
   
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 85ab7d7..3ad5203 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -605,7 +605,8 @@ struct sk_buff {
   __u8 ipvs_property:1;
   __u8 inner_protocol_type:1;
   __u8 remcsum_offload:1;
-  /* 3 or 5 bit hole */
+  __u8 ufo_fragid_set:1;
[...]
   
Doesn't the flag belong in struct skb_shared_info, rather than struct
sk_buff?  Otherwise this looks fine.
   
Ben.
   
Hmm we seem to be out of tx flags.
Maybe ip6_frag_id == 0 should mean not set.

Maybe that is the best idea. Definitely the ufo_fragid_set bit should
move into the skb_shared_info area.
   
   That's what I originally wanted to do, but had to move and grow txflags 
   thus
   skb_shinfo ended up growing.  I wanted to avoid that, so stole an skb 
   flag.
   
   I considered treating fragid == 0 as unset, but a 0 fragid is perfectly 
   valid
   from the protocol perspective and could actually be generated by the id 
   generator
   functions.  This may cause us to call the id generation multiple times.
  
  Are there plans in the long run to let virtio_net transmit auxiliary
  data to the other end so we can clean all of this up one day?
  
  I don't like the whole situation: looking into the virtio_net headers
  just adding a field for ipv6 fragmentation ids to those small structs
  seems bloated, not doing it feels incorrect. :/
  
  Thoughts?
  
  Bye,
  Hannes
 
 I'm not sure - what will be achieved by generating the IDs guest side as
 opposed to host side?  It's certainly harder to get hold of entropy
 guest-side.

It is not only about entropy but about uniqueness. Also fragmentation
ids should not be discoverable, so there are several aspects:

I see fragmentation id generation still as security critical:
When Eric patched the frag id generator in 04ca6973f7c1a0d (ip: make IP
identifiers less predictable) I could patch my kernels and use the
patch regardless of the machine being virtualized or not. It was not
dependent on the hypervisor. I think that is the same reasoning why we
don't support TOE.

If we use one generator in the hypervisor in an OpenStack-like setting,
the host deals with quite a lot of overlay networks. A lot of default
configurations use the same addresses internally, so on the hypervisor
the frag id generators would interfere by design.

I could come up with an attack scenario for DNS servers (again :) ):

You are sitting next to a DNS server on the same hypervisor and can send
packets without source validation (because that is handled later on in
case of openvswitch when the packet is put into the corresponding
overlay network). You emit a gso packet with the same source and
destination addresses as the DNS server would do and would get a
fragmentation id which is linearly (+ time delta) incremented depending
on the source and destination address. With such a leak you could start
trying to attack and spoof DNS responses (fragmentation attacks etc.).

See also details on this kind of attack in the description of commit
04ca6973f7c1a0d.
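
The leak described above can be modelled with a deliberately simplified generator. This is a toy sketch, not the kernel's algorithm: `toy_bucket()` and `toy_shared_frag_id()` are hypothetical, the addresses are small tokens rather than 128-bit IPv6 addresses, and the time delta is left out.

```c
#include <stdint.h>

#define TOY_BUCKETS 16u

/* One counter per hash bucket, shared by every sender on the host. */
static uint32_t toy_counters[TOY_BUCKETS];

/* Toy hash over small address tokens; real IPv6 addresses are 128 bits. */
static uint32_t toy_bucket(uint32_t src, uint32_t dst)
{
    return (src * 31u + dst) % TOY_BUCKETS;
}

/* Returns the current counter value for the flow, then increments it. */
static uint32_t toy_shared_frag_id(uint32_t src, uint32_t dst)
{
    return toy_counters[toy_bucket(src, dst)]++;
}
```

In this model, a sender that obtains an id for a given (src, dst) pair knows that the next packet generated for the same pair, by anyone sharing the generator, gets that id plus one. That is the kind of observation the scenario above depends on.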

AFAIK IETF tried with IPv6 to push fragmentation id generation to the
end hosts, that's also the reason for the introduction of atomic
fragments 

Re: [PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

2015-01-28 Thread Michael S. Tsirkin
On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
 Hello,
 
 On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
  On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic Sowa wrote:
   On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
On 01/27/2015 08:47 AM, Hannes Frederic Sowa wrote:
 On Di, 2015-01-27 at 10:42 +0200, Michael S. Tsirkin wrote:
 On Tue, Jan 27, 2015 at 02:47:54AM +, Ben Hutchings wrote:
 On Mon, 2015-01-26 at 09:37 -0500, Vladislav Yasevich wrote:
 If the IPv6 fragment id has not been set and we perform
 fragmentation due to UFO, select a new fragment id.
 When we store the fragment id into skb_shinfo, set the bit
 in the skb so we can re-use the selected id.
 This preserves the behavior of UFO packets generated on the
 host and solves the issue of id generation for packet sockets
 and tap/macvtap devices.

 This patch moves ipv6_select_ident() back in to the header file.  
 It also provides the helper function that sets skb_shinfo() frag
 id and sets the bit.

 It also makes sure that we select the fragment id when doing
 just gso validation, since it's possible for the packet to
 come from an untrusted source (VM) and be forwarded through
 a UFO enabled device which will expect the fragment id.

 CC: Eric Dumazet eduma...@google.com
 Signed-off-by: Vladislav Yasevich vyase...@redhat.com
 ---
  include/linux/skbuff.h |  3 ++-
  include/net/ipv6.h     |  2 ++
  net/ipv6/ip6_output.c  |  4 ++--
  net/ipv6/output_core.c |  9 ++++++++-
  net/ipv6/udp_offload.c | 10 +++++++++-
  5 files changed, 23 insertions(+), 5 deletions(-)

 diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
 index 85ab7d7..3ad5203 100644
 --- a/include/linux/skbuff.h
 +++ b/include/linux/skbuff.h
 @@ -605,7 +605,8 @@ struct sk_buff {
    __u8 ipvs_property:1;
    __u8 inner_protocol_type:1;
    __u8 remcsum_offload:1;
 -  /* 3 or 5 bit hole */
 +  __u8 ufo_fragid_set:1;
 [...]

 Doesn't the flag belong in struct skb_shared_info, rather than 
 struct
 sk_buff?  Otherwise this looks fine.

 Ben.

 Hmm we seem to be out of tx flags.
 Maybe ip6_frag_id == 0 should mean not set.
 
 Maybe that is the best idea. Definitely the ufo_fragid_set bit should
 move into the skb_shared_info area.

That's what I originally wanted to do, but had to move and grow txflags 
thus
skb_shinfo ended up growing.  I wanted to avoid that, so stole an skb 
flag.

I considered treating fragid == 0 as unset, but a 0 fragid is perfectly 
valid
from the protocol perspective and could actually be generated by the id 
generator
functions.  This may cause us to call the id generation multiple times.
   
   Are there plans in the long run to let virtio_net transmit auxiliary
   data to the other end so we can clean all of this up one day?
   
   I don't like the whole situation: looking into the virtio_net headers
   just adding a field for ipv6 fragmentation ids to those small structs
   seems bloated, not doing it feels incorrect. :/
   
   Thoughts?
   
   Bye,
   Hannes
  
  I'm not sure - what will be achieved by generating the IDs guest side as
  opposed to host side?  It's certainly harder to get hold of entropy
  guest-side.
 
 It is not only about entropy but about uniqueness.  Also fragmentation
 ids should not be discoverable,

I believe predictable is the language used by the IETF draft.

 so there are several aspects:
 
 I see fragmentation id generation still as security critical:
 When Eric patched the frag id generator in 04ca6973f7c1a0d (ip: make IP
 identifiers less predictable) I could patch my kernels and use the
 patch regardless of the machine being virtualized or not. It was not
 dependent on the hypervisor.

And now it's even easier - just patch the hypervisor, and all VMs
automatically benefit.

 I think that is the same reasoning why we
 don't support TOE.
 If we use one generator in the hypervisor in an OpenStack-like setting,
 the host deals with quite a lot of overlay networks. A lot of default
 configurations use the same addresses internally, so on the hypervisor
 the frag id generators would interfere by design.
 I could come up with an attack scenario for DNS servers (again :) ):
 
 You are sitting next to a DNS server on the same hypervisor and can send
 packets without source validation (because that is handled later on in
 case of openvswitch when the packet is put into the corresponding
 overlay network). You emit a gso packet with the same source and
 destination addresses as the DNS server would do and would get a
 fragmentation id which is linearly (+ time delta) incremented depending
 on the source and destination address. With 

Re: [PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

2015-01-28 Thread Hannes Frederic Sowa
Hi,

On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
 On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
  Hello,
  
  On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
   On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic Sowa wrote:
On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
 On 01/27/2015 08:47 AM, Hannes Frederic Sowa wrote:
  On Di, 2015-01-27 at 10:42 +0200, Michael S. Tsirkin wrote:
  On Tue, Jan 27, 2015 at 02:47:54AM +, Ben Hutchings wrote:
  On Mon, 2015-01-26 at 09:37 -0500, Vladislav Yasevich wrote:
  If the IPv6 fragment id has not been set and we perform
  fragmentation due to UFO, select a new fragment id.
  When we store the fragment id into skb_shinfo, set the bit
  in the skb so we can re-use the selected id.
  This preserves the behavior of UFO packets generated on the
  host and solves the issue of id generation for packet sockets
  and tap/macvtap devices.
 
  This patch moves ipv6_select_ident() back in to the header file. 
   
  It also provides the helper function that sets skb_shinfo() frag
  id and sets the bit.
 
  It also makes sure that we select the fragment id when doing
  just gso validation, since it's possible for the packet to
  come from an untrusted source (VM) and be forwarded through
  a UFO enabled device which will expect the fragment id.
 
  CC: Eric Dumazet eduma...@google.com
  Signed-off-by: Vladislav Yasevich vyase...@redhat.com
  ---
   include/linux/skbuff.h |  3 ++-
   include/net/ipv6.h     |  2 ++
   net/ipv6/ip6_output.c  |  4 ++--
   net/ipv6/output_core.c |  9 ++++++++-
   net/ipv6/udp_offload.c | 10 +++++++++-
   5 files changed, 23 insertions(+), 5 deletions(-)
 
  diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
  index 85ab7d7..3ad5203 100644
  --- a/include/linux/skbuff.h
  +++ b/include/linux/skbuff.h
  @@ -605,7 +605,8 @@ struct sk_buff {
     __u8 ipvs_property:1;
     __u8 inner_protocol_type:1;
     __u8 remcsum_offload:1;
  -  /* 3 or 5 bit hole */
  +  __u8 ufo_fragid_set:1;
  [...]
 
  Doesn't the flag belong in struct skb_shared_info, rather than 
  struct
  sk_buff?  Otherwise this looks fine.
 
  Ben.
 
  Hmm we seem to be out of tx flags.
  Maybe ip6_frag_id == 0 should mean not set.
  
  Maybe that is the best idea. Definitely the ufo_fragid_set bit 
  should
  move into the skb_shared_info area.
 
 That's what I originally wanted to do, but had to move and grow 
 txflags thus
 skb_shinfo ended up growing.  I wanted to avoid that, so stole an skb 
 flag.
 
 I considered treating fragid == 0 as unset, but a 0 fragid is 
 perfectly valid
 from the protocol perspective and could actually be generated by the 
 id generator
 functions.  This may cause us to call the id generation multiple 
 times.

Are there plans in the long run to let virtio_net transmit auxiliary
data to the other end so we can clean all of this up one day?

I don't like the whole situation: looking into the virtio_net headers
just adding a field for ipv6 fragmentation ids to those small structs
seems bloated, not doing it feels incorrect. :/

Thoughts?

Bye,
Hannes
   
   I'm not sure - what will be achieved by generating the IDs guest side as
   opposed to host side?  It's certainly harder to get hold of entropy
   guest-side.
  
  It is not only about entropy but about uniqueness.  Also fragmentation
  ids should not be discoverable,
 
 I believe predictable is the language used by the IETF draft.
 
  so there are several aspects:
  
  I see fragmentation id generation still as security critical:
  When Eric patched the frag id generator in 04ca6973f7c1a0d (ip: make IP
  identifiers less predictable) I could patch my kernels and use the
  patch regardless of the machine being virtualized or not. It was not
  dependent on the hypervisor.
 
 And now it's even easier - just patch the hypervisor, and all VMs
 automatically benefit.

Sometimes the hypervisor is not under my control. You would need to
patch both kernels in your case - non gso frames would still get the
fragmentation id generated in the host kernel.

  I think that is the same reasoning why we
  don't support TOE.
  If we use one generator in the hypervisor in an OpenStack-like setting,
  the host deals with quite a lot of overlay networks. A lot of default
  configurations use the same addresses internally, so on the hypervisor
  the frag id generators would interfere by design.
  I could come up with an attack scenario for DNS servers (again :) ):
  
  You are sitting next to a DNS server on the same 

Re: memory barriers in virtq.lua?

2015-01-28 Thread Nikolay Nikolaev
Hello Michael,


On Tue, Jan 27, 2015 at 6:01 PM, Michael S. Tsirkin m...@redhat.com wrote:
 Hi Nikolay,
 I poked at src/lib/virtio/virtq.lua a bit -
 I was surprised to find no explicit CPU memory
 barriers in the virtq implementation.
 These are typically required when using virtio
 on smp machines - the spec actually mentions where
 barriers are necessary.
 Are the barriers implicit somehow for lua?
 I'd be curious to learn.



thanks for looking at our code and providing your feedback.

The virtq.lua implements the virtq operations from a device point of
view. We compile this with LuaJIT which is guaranteed to not reorder
operations [1]. We also target the x86 architecture, which is
guaranteed to not reorder stores [2]:
Stores Are Seen in a Consistent Order by Other Processors.
We rely on both of these facts and don't use barriers in the virtq code.
However, I do agree that we'll have to add barriers once we switch to
other architectures and/or LuaJIT implements ordering optimisations.
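
For comparison, a portable version of the ordering relied on above could look roughly like the following C11 sketch. It is an assumption-laden illustration: `struct used_ring`, `device_push_used()` and `driver_pop_used()` are simplified, hypothetical names, not the virtio spec layout and not the virtq.lua code. The release store makes the element visible before the index that publishes it. On x86 such a store is an ordinary store plus a compiler barrier, which matches the reasoning above, while weaker architectures get a real barrier.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u

struct used_elem {
    uint32_t id;
    uint32_t len;
};

/* Simplified used ring: not the exact layout from the virtio spec. */
struct used_ring {
    struct used_elem ring[RING_SIZE];
    _Atomic uint16_t idx;  /* number of elements published so far */
};

/* Device side: write the element first, then publish it via idx. */
static void device_push_used(struct used_ring *u, uint32_t id, uint32_t len)
{
    uint16_t idx = atomic_load_explicit(&u->idx, memory_order_relaxed);

    u->ring[idx % RING_SIZE] = (struct used_elem){ .id = id, .len = len };
    /* Release: the element store cannot be reordered after this store. */
    atomic_store_explicit(&u->idx, (uint16_t)(idx + 1), memory_order_release);
}

/* Driver side: read idx with acquire before reading the element. */
static int driver_pop_used(struct used_ring *u, uint16_t *last_seen,
                           struct used_elem *out)
{
    uint16_t idx = atomic_load_explicit(&u->idx, memory_order_acquire);

    if (idx == *last_seen)
        return 0;  /* nothing new */
    *out = u->ring[*last_seen % RING_SIZE];
    (*last_seen)++;
    return 1;
}
```

This only covers the store-then-store and load-then-load ordering between the element and the index. A store followed by a load of a different location, which x86 may reorder, would need a full fence and is outside this sketch.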

Finally, I checked the virtio 1.0 spec again and didn't see any
explicit mention of memory barriers regarding the device side of
the spec. There are several places where memory barriers are mentioned
and these all are about the driver. Maybe they are omitted because
they are implicit somehow? Please clarify.

regards,
Nikolay Nikolaev

[1] 
https://www.freelists.org/post/luajit/Compiler-loadstore-barrier-volatile-pointer-barriers-in-general,1
[2] 
http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3a-part-1-manual.pdf
 - 8.2.3.7


 Thanks,

 --
 MST


Re: [PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

2015-01-28 Thread Hannes Frederic Sowa
On Mi, 2015-01-28 at 18:48 +0200, Michael S. Tsirkin wrote:
 On Wed, Jan 28, 2015 at 05:15:49PM +0100, Hannes Frederic Sowa wrote:
  Hi,
  
  On Mi, 2015-01-28 at 18:00 +0200, Michael S. Tsirkin wrote:
   On Wed, Jan 28, 2015 at 11:34:02AM +0100, Hannes Frederic Sowa wrote:
Hi,

On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
 On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
  Hello,
  
  On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
   On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic Sowa 
   wrote:
On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
 On 01/27/2015 08:47 AM, Hannes Frederic Sowa wrote:
  On Di, 2015-01-27 at 10:42 +0200, Michael S. Tsirkin wrote:
  On Tue, Jan 27, 2015 at 02:47:54AM +, Ben Hutchings 
  wrote:
  On Mon, 2015-01-26 at 09:37 -0500, Vladislav Yasevich 
  wrote:
  If the IPv6 fragment id has not been set and we perform
  fragmentation due to UFO, select a new fragment id.
  When we store the fragment id into skb_shinfo, set the 
  bit
  in the skb so we can re-use the selected id.
  This preserves the behavior of UFO packets generated on 
  the
  host and solves the issue of id generation for packet 
  sockets
  and tap/macvtap devices.
 
  This patch moves ipv6_select_ident() back in to the 
  header file.  
  It also provides the helper function that sets 
  skb_shinfo() frag
  id and sets the bit.
 
  It also makes sure that we select the fragment id when 
  doing
  just gso validation, since it's possible for the packet 
  to
  come from an untrusted source (VM) and be forwarded 
  through
  a UFO enabled device which will expect the fragment id.
 
  CC: Eric Dumazet eduma...@google.com
  Signed-off-by: Vladislav Yasevich vyase...@redhat.com
  ---
   include/linux/skbuff.h |  3 ++-
   include/net/ipv6.h     |  2 ++
   net/ipv6/ip6_output.c  |  4 ++--
   net/ipv6/output_core.c |  9 ++++++++-
   net/ipv6/udp_offload.c | 10 +++++++++-
   5 files changed, 23 insertions(+), 5 deletions(-)
 
  diff --git a/include/linux/skbuff.h 
  b/include/linux/skbuff.h
  index 85ab7d7..3ad5203 100644
  --- a/include/linux/skbuff.h
  +++ b/include/linux/skbuff.h
  @@ -605,7 +605,8 @@ struct sk_buff {
   __u8ipvs_property:1;
   __u8inner_protocol_type:1;
   __u8remcsum_offload:1;
  -/* 3 or 5 bit hole */
  +__u8ufo_fragid_set:1;
  [...]
 
  Doesn't the flag belong in struct skb_shared_info, rather 
  than struct
  sk_buff?  Otherwise this looks fine.
 
  Ben.
 
  Hmm we seem to be out of tx flags.
  Maybe ip6_frag_id == 0 should mean not set.
  
  Maybe that is the best idea. Definitely the ufo_fragid_set 
  bit should
  move into the skb_shared_info area.
 
 That's what I originally wanted to do, but had to move and 
 grow txflags thus
 skb_shinfo ended up growing.  I wanted to avoid that, so 
 stole an skb flag.
 
 I considered treating fragid == 0 as unset, but a 0 fragid is 
 perfectly valid
 from the protocol perspective and could actually be generated 
 by the id generator
 functions.  This may cause us to call the id generation 
 multiple times.

Are there plans in the long run to let virtio_net transmit auxiliary
data to the other end so we can clean all of this up one day?

I don't like the whole situation: looking into the virtio_net 
headers
just adding a field for ipv6 fragmentation ids to those small 
structs
seems bloated, not doing it feels incorrect. :/

Thoughts?

Bye,
Hannes
   
   I'm not sure - what will be achieved by generating the IDs guest 
   side as
   opposed to host side?  It's certainly harder to get hold of 
   entropy
   guest-side.
  
  It is not only about entropy but about uniqueness.  Also 
  fragmentation
  ids should not be discoverable,
 
 I believe predictable is the language used by the IETF draft.
 
  so there are several aspects:
  
  I see fragmentation id generation still as security critical:
  When Eric patched the frag id generator in 04ca6973f7c1a0d (ip: 
  make IP
  identifiers less predictable) I could 

Re: [Qemu-devel] [PATCH RFC v6 05/20] virtio: support more feature bits

2015-01-28 Thread David Gibson
On Wed, Jan 28, 2015 at 04:59:45PM +0100, Cornelia Huck wrote:
 On Thu, 22 Jan 2015 12:43:43 +1100
 David Gibson da...@gibson.dropbear.id.au wrote:
 
  On Thu, Dec 11, 2014 at 02:25:07PM +0100, Cornelia Huck wrote:
   With virtio-1, we support more than 32 feature bits. Let's extend both
   host and guest features to 64, which should suffice for a while.
   
   vhost and migration have been ignored for now.
  
  [snip]
  
   diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
   index f6c0379..08141c7 100644
   --- a/include/hw/virtio/virtio.h
   +++ b/include/hw/virtio/virtio.h
   @@ -55,6 +55,12 @@
/* A guest should never accept this.  It implies negotiation is broken. 
   */
#define VIRTIO_F_BAD_FEATURE 30

   +/* v1.0 compliant. */
   +#define VIRTIO_F_VERSION_1  32
  
  This is already in the kernel header, isn't it?

 
 Yes. But nearly all files include this header but not the kernel
 header.

Can't you change that?  Or have this file include the kernel header?

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [Qemu-devel] [PATCH RFC v6 07/20] virtio: allow virtio-1 queue layout

2015-01-28 Thread David Gibson
On Wed, Jan 28, 2015 at 05:07:01PM +0100, Cornelia Huck wrote:
 On Thu, 22 Jan 2015 13:06:09 +1100
 David Gibson da...@gibson.dropbear.id.au wrote:
 
  On Thu, Dec 11, 2014 at 02:25:09PM +0100, Cornelia Huck wrote:
   For virtio-1 devices, we allow a more complex queue layout that doesn't
   require descriptor table and rings on a physically-contiguous memory area:
   add virtio_queue_set_rings() to allow transports to set this up.
   
   Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
   ---
hw/virtio/virtio-mmio.c|3 +++
hw/virtio/virtio.c |   53 
   
include/hw/virtio/virtio.h |3 +++
3 files changed, 40 insertions(+), 19 deletions(-)
   
   diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
   index 43b7e02..0c9b63b 100644
   --- a/hw/virtio/virtio-mmio.c
   +++ b/hw/virtio/virtio-mmio.c
   @@ -244,8 +244,11 @@ static void virtio_mmio_write(void *opaque, hwaddr 
   offset, uint64_t value,
case VIRTIO_MMIO_QUEUENUM:
DPRINTF("mmio_queue write %d max %d\n", (int)value,
   VIRTQUEUE_MAX_SIZE);
virtio_queue_set_num(vdev, vdev->queue_sel, value);
   +/* Note: only call this function for legacy devices */
  
  It's not clear to me if this is an assertion that this *does* only
  call the function for legacy devices or a fixme, that it *should* only
  call the function for legacy devices.
 
 It's more like a note to whoever takes the virtio-mmio legacy device
 code and writes a virtio-1 virtio-mmio device.
 
 Does
 /* Note: this function must only be called for legacy devices */
 make that intention clearer?

Yes, I think that's better.

 
  
   +virtio_queue_update_rings(vdev, vdev->queue_sel);
break;
case VIRTIO_MMIO_QUEUEALIGN:
   +/* Note: this is only valid for legacy devices */
virtio_queue_set_align(vdev, vdev->queue_sel, value);
break;
case VIRTIO_MMIO_QUEUEPFN:
 
 (...)
 
/* virt queue functions */
   -static void virtqueue_init(VirtQueue *vq)
   +void virtio_queue_update_rings(VirtIODevice *vdev, int n)
  
  Perhaps something in the name to emphasise that this is only for v1.0
  devices?
 
 virtio_queue_legacy_update_rings()? Maybe a bit long...

There aren't many callers, so I think long is ok in this case.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

2015-01-28 Thread Michael S. Tsirkin
On Wed, Jan 28, 2015 at 10:27:47AM -0500, Vlad Yasevich wrote:
 On 01/28/2015 09:45 AM, Hannes Frederic Sowa wrote:
  Hi,
  
  On Mi, 2015-01-28 at 09:16 -0500, Vlad Yasevich wrote:
  On 01/28/2015 05:34 AM, Hannes Frederic Sowa wrote:
  Hi,
 
  On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
  On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
  Hello,
 
  On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
  On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic Sowa wrote:
  On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
  On 01/27/2015 08:47 AM, Hannes Frederic Sowa wrote:
  On Di, 2015-01-27 at 10:42 +0200, Michael S. Tsirkin wrote:
  On Tue, Jan 27, 2015 at 02:47:54AM +, Ben Hutchings wrote:
  On Mon, 2015-01-26 at 09:37 -0500, Vladislav Yasevich wrote:
  If the IPv6 fragment id has not been set and we perform
  fragmentation due to UFO, select a new fragment id.
  When we store the fragment id into skb_shinfo, set the bit
  in the skb so we can re-use the selected id.
  This preserves the behavior of UFO packets generated on the
  host and solves the issue of id generation for packet sockets
  and tap/macvtap devices.
 
  This patch moves ipv6_select_ident() back in to the header file. 
   
  It also provides the helper function that sets skb_shinfo() frag
  id and sets the bit.
 
  It also makes sure that we select the fragment id when doing
  just gso validation, since it's possible for the packet to
  come from an untrusted source (VM) and be forwarded through
  a UFO enabled device which will expect the fragment id.
 
  CC: Eric Dumazet eduma...@google.com
  Signed-off-by: Vladislav Yasevich vyase...@redhat.com
  ---
   include/linux/skbuff.h |  3 ++-
   include/net/ipv6.h |  2 ++
   net/ipv6/ip6_output.c  |  4 ++--
   net/ipv6/output_core.c |  9 -
   net/ipv6/udp_offload.c | 10 +-
   5 files changed, 23 insertions(+), 5 deletions(-)
 
  diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
  index 85ab7d7..3ad5203 100644
  --- a/include/linux/skbuff.h
  +++ b/include/linux/skbuff.h
  @@ -605,7 +605,8 @@ struct sk_buff {
   __u8ipvs_property:1;
   __u8inner_protocol_type:1;
   __u8remcsum_offload:1;
  -/* 3 or 5 bit hole */
  +__u8ufo_fragid_set:1;
  [...]
 
  Doesn't the flag belong in struct skb_shared_info, rather than 
  struct
  sk_buff?  Otherwise this looks fine.
 
  Ben.
 
  Hmm we seem to be out of tx flags.
  Maybe ip6_frag_id == 0 should mean not set.
 
  Maybe that is the best idea. Definitely the ufo_fragid_set bit 
  should
  move into the skb_shared_info area.
 
  That's what I originally wanted to do, but had to move and grow 
  txflags thus
  skb_shinfo ended up growing.  I wanted to avoid that, so stole an 
  skb flag.
 
  I considered treating fragid == 0 as unset, but a 0 fragid is 
  perfectly valid
  from the protocol perspective and could actually be generated by the 
  id generator
  functions.  This may cause us to call the id generation multiple 
  times.
 
  Are there plans in the long run to let virtio_net transmit auxiliary
  data to the other end so we can clean all of this up one day?
 
  I don't like the whole situation: looking into the virtio_net headers
  just adding a field for ipv6 fragmentation ids to those small structs
  seems bloated, not doing it feels incorrect. :/
 
  Thoughts?
 
  Bye,
  Hannes
 
  I'm not sure - what will be achieved by generating the IDs guest side 
  as
  opposed to host side?  It's certainly harder to get hold of entropy
  guest-side.
 
  It is not only about entropy but about uniqueness.  Also fragmentation
  ids should not be discoverable,
 
  I believe predictable is the language used by the IETF draft.
 
  so there are several aspects:
 
  I see fragmentation id generation still as security critical:
  When Eric patched the frag id generator in 04ca6973f7c1a0d (ip: make IP
  identifiers less predictable) I could patch my kernels and use the
  patch regardless of the machine being virtualized or not. It was not
  dependent on the hypervisor.
 
  And now it's even easier - just patch the hypervisor, and all VMs
  automatically benefit.
 
  Sometimes the hypervisor is not under my control. You would need to
  patch both kernels in your case - non gso frames would still get the
  fragmentation id generated in the host kernel.
 
  Why would non-gso frames need a frag id?  We are talking only UDP IPv6
  here, so there is no frag id generation if the packet doesn't need to
  be fragmented.
  
  E.g. raw sockets still can generate fragments locally. It is also a
  valid setup to have multiple interfaces in one machine, one that is UFO
  enabled and one that isn't. In that case, fragmentation id generation
  happens on different hosts which I want to avoid.
 
 OK, so you are concerned about both host and guest generating fragment
 

Re: [PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

2015-01-28 Thread Hannes Frederic Sowa
Hi,

On Mi, 2015-01-28 at 09:16 -0500, Vlad Yasevich wrote:
 On 01/28/2015 05:34 AM, Hannes Frederic Sowa wrote:
  Hi,
  
  On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
  On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
  Hello,
 
  On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
  On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic Sowa wrote:
  On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
  On 01/27/2015 08:47 AM, Hannes Frederic Sowa wrote:
  On Di, 2015-01-27 at 10:42 +0200, Michael S. Tsirkin wrote:
  On Tue, Jan 27, 2015 at 02:47:54AM +, Ben Hutchings wrote:
  On Mon, 2015-01-26 at 09:37 -0500, Vladislav Yasevich wrote:
  If the IPv6 fragment id has not been set and we perform
  fragmentation due to UFO, select a new fragment id.
  When we store the fragment id into skb_shinfo, set the bit
  in the skb so we can re-use the selected id.
  This preserves the behavior of UFO packets generated on the
  host and solves the issue of id generation for packet sockets
  and tap/macvtap devices.
 
  This patch moves ipv6_select_ident() back in to the header file.  
  It also provides the helper function that sets skb_shinfo() frag
  id and sets the bit.
 
  It also makes sure that we select the fragment id when doing
  just gso validation, since it's possible for the packet to
  come from an untrusted source (VM) and be forwarded through
  a UFO enabled device which will expect the fragment id.
 
  CC: Eric Dumazet eduma...@google.com
  Signed-off-by: Vladislav Yasevich vyase...@redhat.com
  ---
   include/linux/skbuff.h |  3 ++-
   include/net/ipv6.h |  2 ++
   net/ipv6/ip6_output.c  |  4 ++--
   net/ipv6/output_core.c |  9 -
   net/ipv6/udp_offload.c | 10 +-
   5 files changed, 23 insertions(+), 5 deletions(-)
 
  diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
  index 85ab7d7..3ad5203 100644
  --- a/include/linux/skbuff.h
  +++ b/include/linux/skbuff.h
  @@ -605,7 +605,8 @@ struct sk_buff {
 __u8ipvs_property:1;
 __u8inner_protocol_type:1;
 __u8remcsum_offload:1;
  -  /* 3 or 5 bit hole */
  +  __u8ufo_fragid_set:1;
  [...]
 
  Doesn't the flag belong in struct skb_shared_info, rather than 
  struct
  sk_buff?  Otherwise this looks fine.
 
  Ben.
 
  Hmm we seem to be out of tx flags.
  Maybe ip6_frag_id == 0 should mean not set.
 
  Maybe that is the best idea. Definitely the ufo_fragid_set bit should
  move into the skb_shared_info area.
 
  That's what I originally wanted to do, but had to move and grow 
  txflags thus
  skb_shinfo ended up growing.  I wanted to avoid that, so stole an skb 
  flag.
 
  I considered treating fragid == 0 as unset, but a 0 fragid is 
  perfectly valid
  from the protocol perspective and could actually be generated by the 
  id generator
  functions.  This may cause us to call the id generation multiple times.
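
To make the trade-off above concrete, here is a minimal standalone sketch, not the patch itself. The struct, the toy generator and both helper names are hypothetical stand-ins (assumptions for illustration only); only the idea of a "set" bit versus a zero sentinel comes from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-packet state discussed in the thread. */
struct pkt_state {
    uint32_t frag_id;     /* models skb_shinfo(skb)->ip6_frag_id */
    bool     frag_id_set; /* models the proposed ufo_fragid_set bit */
};

static unsigned int gen_calls; /* counts generator invocations */

/* Toy generator (assumption): yields 0, 1, 2, ... so the first id is 0,
 * which is a legal fragment id on the wire. */
static uint32_t toy_select_ident(void)
{
    return gen_calls++;
}

/* Sentinel approach: treat frag_id == 0 as "not set". */
static uint32_t frag_id_sentinel(struct pkt_state *p)
{
    assert(p);
    if (p->frag_id == 0)
        p->frag_id = toy_select_ident();
    return p->frag_id;
}

/* Flag approach: an explicit bit records that an id was chosen. */
static uint32_t frag_id_flagged(struct pkt_state *p)
{
    assert(p);
    if (!p->frag_id_set) {
        p->frag_id = toy_select_ident();
        p->frag_id_set = true;
    }
    return p->frag_id;
}
```

With this toy generator, calling frag_id_sentinel() twice on a fresh packet runs the generator twice and the id changes between the calls, while frag_id_flagged() runs it once and keeps returning 0. The cost of the flag is the extra bit of state that the thread is arguing about.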
 
  Are there plans in the long run to let virtio_net transmit auxiliary
  data to the other end so we can clean all of this up one day?
 
  I don't like the whole situation: looking into the virtio_net headers
  just adding a field for ipv6 fragmentation ids to those small structs
  seems bloated, not doing it feels incorrect. :/
 
  Thoughts?
 
  Bye,
  Hannes
 
  I'm not sure - what will be achieved by generating the IDs guest side as
  opposed to host side?  It's certainly harder to get hold of entropy
  guest-side.
 
  It is not only about entropy but about uniqueness.  Also fragmentation
  ids should not be discoverable,
 
  I believe predictable is the language used by the IETF draft.
 
  so there are several aspects:
 
  I see fragmentation id generation still as security critical:
  When Eric patched the frag id generator in 04ca6973f7c1a0d (ip: make IP
  identifiers less predictable) I could patch my kernels and use the
  patch regardless of the machine being virtualized or not. It was not
  dependent on the hypervisor.
 
  And now it's even easier - just patch the hypervisor, and all VMs
  automatically benefit.
  
  Sometimes the hypervisor is not under my control. You would need to
  patch both kernels in your case - non gso frames would still get the
  fragmentation id generated in the host kernel.
 
 Why would non-gso frames need a frag id?  We are talking only UDP IPv6
 here, so there is no frag id generation if the packet doesn't need to
 be fragmented.

E.g. raw sockets still can generate fragments locally. It is also a
valid setup to have multiple interfaces in one machine, one that is UFO
enabled and one that isn't. In that case, fragmentation id generation
happens on different hosts which I want to avoid.

I haven't looked closely but mismatch of MTUs on interfaces seems like
it could lead to unwanted fragmentation, e.g. see is_skb_forwardable
which is mostly always true for gso frames, so we never stop them on
bridges etc.
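
A rough userspace model of the check referred to above, written from memory of the kernel helper rather than copied from it, so treat the details as an assumption. The struct and function names are hypothetical; the point is only that a GSO frame passes regardless of its length.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_VLAN_HLEN 4 /* room for one VLAN tag (assumed allowance) */

/* Hypothetical, simplified stand-ins for net_device and sk_buff. */
struct dev_model {
    bool         up;              /* interface is up */
    unsigned int mtu;
    unsigned int hard_header_len; /* link-layer header size */
};

struct pkt_model {
    unsigned int len;
    bool         is_gso; /* models skb_is_gso(skb) */
};

/* Models the forwardability test: oversized frames are rejected
 * unless they are GSO frames, which are always let through. */
static bool model_is_forwardable(const struct dev_model *dev,
                                 const struct pkt_model *pkt)
{
    assert(dev && pkt);
    if (!dev->up)
        return false;
    if (pkt->len <= dev->mtu + dev->hard_header_len + MODEL_VLAN_HLEN)
        return true;
    return pkt->is_gso;
}
```

For a device with mtu 1500 and a 14-byte header, a 9000-byte non-GSO frame is refused while a 9000-byte GSO frame is accepted, so the segmentation (and with UFO the fragmentation) happens further down the path.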

  I think that is the same 

Re: [Qemu-devel] [PATCH RFC v6 05/20] virtio: support more feature bits

2015-01-28 Thread Cornelia Huck
On Thu, 22 Jan 2015 12:43:43 +1100
David Gibson da...@gibson.dropbear.id.au wrote:

 On Thu, Dec 11, 2014 at 02:25:07PM +0100, Cornelia Huck wrote:
  With virtio-1, we support more than 32 feature bits. Let's extend both
  host and guest features to 64, which should suffice for a while.
  
  vhost and migration have been ignored for now.
 
 [snip]
 
  diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
  index f6c0379..08141c7 100644
  --- a/include/hw/virtio/virtio.h
  +++ b/include/hw/virtio/virtio.h
  @@ -55,6 +55,12 @@
   /* A guest should never accept this.  It implies negotiation is broken. */
   #define VIRTIO_F_BAD_FEATURE   30
   
  +/* v1.0 compliant. */
  +#define VIRTIO_F_VERSION_1  32
 
 This is already in the kernel header, isn't it?
 

Yes. But nearly all files include this header but not the kernel
header.
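
For context on why the feature words have to grow to 64 bits, a small standalone sketch follows. The two bit numbers are the ones in the quoted hunk; the helper names are hypothetical and are not QEMU functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit numbers as they appear in the quoted hunk. */
#define VIRTIO_F_BAD_FEATURE 30
#define VIRTIO_F_VERSION_1   32

/* Hypothetical helper: mask for one feature bit in a 64-bit word. */
static uint64_t feature_mask(unsigned int bit)
{
    assert(bit < 64);
    return UINT64_C(1) << bit;
}

/* Hypothetical helper: test one feature bit. */
static bool has_feature(uint64_t features, unsigned int bit)
{
    return (features & feature_mask(bit)) != 0;
}

/* What a legacy 32-bit feature field would retain. */
static uint32_t truncate_to_legacy(uint64_t features)
{
    return (uint32_t)features;
}
```

Bit 32 does not survive truncate_to_legacy(), which is the reason a 32-bit field cannot carry VIRTIO_F_VERSION_1 at all.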



Re: [PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

2015-01-28 Thread Michael S. Tsirkin
On Wed, Jan 28, 2015 at 11:34:02AM +0100, Hannes Frederic Sowa wrote:
 Hi,
 
 On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
  On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
   Hello,
   
   On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic Sowa wrote:
 On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
  On 01/27/2015 08:47 AM, Hannes Frederic Sowa wrote:
   On Di, 2015-01-27 at 10:42 +0200, Michael S. Tsirkin wrote:
   On Tue, Jan 27, 2015 at 02:47:54AM +, Ben Hutchings wrote:
   On Mon, 2015-01-26 at 09:37 -0500, Vladislav Yasevich wrote:
   If the IPv6 fragment id has not been set and we perform
   fragmentation due to UFO, select a new fragment id.
   When we store the fragment id into skb_shinfo, set the bit
   in the skb so we can re-use the selected id.
   This preserves the behavior of UFO packets generated on the
   host and solves the issue of id generation for packet sockets
   and tap/macvtap devices.
  
   This patch moves ipv6_select_ident() back in to the header 
   file.  
   It also provides the helper function that sets skb_shinfo() 
   frag
   id and sets the bit.
  
   It also makes sure that we select the fragment id when doing
   just gso validation, since it's possible for the packet to
   come from an untrusted source (VM) and be forwarded through
   a UFO enabled device which will expect the fragment id.
  
   CC: Eric Dumazet eduma...@google.com
   Signed-off-by: Vladislav Yasevich vyase...@redhat.com
   ---
include/linux/skbuff.h |  3 ++-
include/net/ipv6.h |  2 ++
net/ipv6/ip6_output.c  |  4 ++--
net/ipv6/output_core.c |  9 -
net/ipv6/udp_offload.c | 10 +-
5 files changed, 23 insertions(+), 5 deletions(-)
  
   diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
   index 85ab7d7..3ad5203 100644
   --- a/include/linux/skbuff.h
   +++ b/include/linux/skbuff.h
   @@ -605,7 +605,8 @@ struct sk_buff {
  __u8ipvs_property:1;
  __u8inner_protocol_type:1;
  __u8remcsum_offload:1;
   -  /* 3 or 5 bit hole */
   +  __u8ufo_fragid_set:1;
   [...]
  
   Doesn't the flag belong in struct skb_shared_info, rather than 
   struct
   sk_buff?  Otherwise this looks fine.
  
   Ben.
  
   Hmm we seem to be out of tx flags.
   Maybe ip6_frag_id == 0 should mean not set.
   
   Maybe that is the best idea. Definitely the ufo_fragid_set bit 
   should
   move into the skb_shared_info area.
  
  That's what I originally wanted to do, but had to move and grow 
  txflags thus
  skb_shinfo ended up growing.  I wanted to avoid that, so stole an 
  skb flag.
  
  I considered treating fragid == 0 as unset, but a 0 fragid is 
  perfectly valid
  from the protocol perspective and could actually be generated by 
  the id generator
  functions.  This may cause us to call the id generation multiple 
  times.
 
 Are there plans in the long run to let virtio_net transmit auxiliary
 data to the other end so we can clean all of this up one day?
 
 I don't like the whole situation: looking into the virtio_net headers
 just adding a field for ipv6 fragmentation ids to those small structs
 seems bloated, not doing it feels incorrect. :/
 
 Thoughts?
 
 Bye,
 Hannes

I'm not sure - what will be achieved by generating the IDs guest side as
opposed to host side?  It's certainly harder to get hold of entropy
guest-side.
   
   It is not only about entropy but about uniqueness.  Also fragmentation
   ids should not be discoverable,
  
  I believe predictable is the language used by the IETF draft.
  
   so there are several aspects:
   
   I see fragmentation id generation still as security critical:
   When Eric patched the frag id generator in 04ca6973f7c1a0d (ip: make IP
   identifiers less predictable) I could patch my kernels and use the
   patch regardless of the machine being virtualized or not. It was not
   dependent on the hypervisor.
  
  And now it's even easier - just patch the hypervisor, and all VMs
  automatically benefit.
 
 Sometimes the hypervisor is not under my control.

In that case doing things like extending virtio
is out of the question too, isn't it?
It needs hypervisor changes.

 You would need to
 patch both kernels in your case - non gso frames would still get the
 fragmentation id generated in the host kernel.
 
   I think that is the same reasoning why we
   don't support TOE.
   If we use one generator in the hypervisor in an openstack alike setting,
   the host deals with 

Re: [Qemu-devel] [PATCH RFC v6 07/20] virtio: allow virtio-1 queue layout

2015-01-28 Thread Cornelia Huck
On Thu, 22 Jan 2015 13:06:09 +1100
David Gibson da...@gibson.dropbear.id.au wrote:

 On Thu, Dec 11, 2014 at 02:25:09PM +0100, Cornelia Huck wrote:
  For virtio-1 devices, we allow a more complex queue layout that doesn't
  require descriptor table and rings on a physically-contiguous memory area:
  add virtio_queue_set_rings() to allow transports to set this up.
  
  Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
  ---
   hw/virtio/virtio-mmio.c|3 +++
   hw/virtio/virtio.c |   53 
  
   include/hw/virtio/virtio.h |3 +++
   3 files changed, 40 insertions(+), 19 deletions(-)
  
  diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
  index 43b7e02..0c9b63b 100644
  --- a/hw/virtio/virtio-mmio.c
  +++ b/hw/virtio/virtio-mmio.c
  @@ -244,8 +244,11 @@ static void virtio_mmio_write(void *opaque, hwaddr 
  offset, uint64_t value,
   case VIRTIO_MMIO_QUEUENUM:
   DPRINTF("mmio_queue write %d max %d\n", (int)value,
  VIRTQUEUE_MAX_SIZE);
   virtio_queue_set_num(vdev, vdev->queue_sel, value);
  +/* Note: only call this function for legacy devices */
 
 It's not clear to me if this is an assertion that this *does* only
 call the function for legacy devices or a fixme, that it *should* only
 call the function for legacy devices.

It's more like a note to whoever takes the virtio-mmio legacy device
code and writes a virtio-1 virtio-mmio device.

Does
/* Note: this function must only be called for legacy devices */
make that intention clearer?

 
  +virtio_queue_update_rings(vdev, vdev->queue_sel);
   break;
   case VIRTIO_MMIO_QUEUEALIGN:
  +/* Note: this is only valid for legacy devices */
   virtio_queue_set_align(vdev, vdev->queue_sel, value);
   break;
   case VIRTIO_MMIO_QUEUEPFN:

(...)

   /* virt queue functions */
  -static void virtqueue_init(VirtQueue *vq)
  +void virtio_queue_update_rings(VirtIODevice *vdev, int n)
 
 Perhaps something in the name to emphasise that this is only for v1.0
 devices?

virtio_queue_legacy_update_rings()? Maybe a bit long...

 
   {
  -hwaddr pa = vq->pa;
  +VRing *vring = &vdev->vq[n].vring;
   
  -vq->vring.desc = pa;
  -vq->vring.avail = pa + vq->vring.num * sizeof(VRingDesc);
  -vq->vring.used = vring_align(vq->vring.avail +
  - offsetof(VRingAvail, ring[vq->vring.num]),
  - vq->vring.align);
  +if (!vring->desc) {
  +/* not yet setup - nothing to do */
  +return;
  +}
  +vring->avail = vring->desc + vring->num * sizeof(VRingDesc);
  +vring->used = vring_align(vring->avail +
  +  offsetof(VRingAvail, ring[vring->num]),
  +  vring->align);
 
 Would it make sense to implement this in terms of
 virtio_queue_set_rings()?

Perhaps a bit confusing, since that would re-write desc.
 
   }
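
The legacy layout computed in the hunk above can be reproduced standalone as a sanity check. This is only a sketch: the struct and function names are hypothetical, and the element sizes (16-byte descriptors, a 4-byte avail header followed by 16-bit ring entries) are the usual legacy split-ring sizes, stated here as an assumption rather than taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a legacy ring descriptor; 16 bytes on common ABIs. */
struct vring_desc_model {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
};

/* Guest-physical addresses of the three ring parts. */
struct vring_layout {
    uint64_t desc;
    uint64_t avail;
    uint64_t used;
};

/* Round addr up to a power-of-two alignment. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Derive avail and used from desc, as the legacy contiguous layout does. */
static struct vring_layout legacy_layout(uint64_t desc, unsigned int num,
                                         uint64_t align)
{
    struct vring_layout l;

    l.desc = desc;
    l.avail = desc + (uint64_t)num * sizeof(struct vring_desc_model);
    /* avail ring: u16 flags, u16 idx, then num u16 entries */
    l.used = align_up(l.avail + 2 * sizeof(uint16_t) +
                      (uint64_t)num * sizeof(uint16_t), align);
    return l;
}
```

For a 256-entry queue at 0x10000 with 4096-byte alignment this gives avail at 0x11000 and used at 0x12000. With virtio-1 the transport sets the three addresses independently instead, which is what virtio_queue_set_rings() is for.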



Re: [PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

2015-01-28 Thread Hannes Frederic Sowa
Hi,

On Mi, 2015-01-28 at 18:00 +0200, Michael S. Tsirkin wrote:
 On Wed, Jan 28, 2015 at 11:34:02AM +0100, Hannes Frederic Sowa wrote:
  Hi,
  
  On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
   On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
Hello,

On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
 On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic Sowa wrote:
  On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
   On 01/27/2015 08:47 AM, Hannes Frederic Sowa wrote:
On Di, 2015-01-27 at 10:42 +0200, Michael S. Tsirkin wrote:
On Tue, Jan 27, 2015 at 02:47:54AM +, Ben Hutchings wrote:
On Mon, 2015-01-26 at 09:37 -0500, Vladislav Yasevich wrote:
If the IPv6 fragment id has not been set and we perform
fragmentation due to UFO, select a new fragment id.
When we store the fragment id into skb_shinfo, set the bit
in the skb so we can re-use the selected id.
This preserves the behavior of UFO packets generated on the
host and solves the issue of id generation for packet sockets
and tap/macvtap devices.
   
This patch moves ipv6_select_ident() back in to the header 
file.  
It also provides the helper function that sets skb_shinfo() 
frag
id and sets the bit.
   
It also makes sure that we select the fragment id when doing
just gso validation, since it's possible for the packet to
come from an untrusted source (VM) and be forwarded through
a UFO enabled device which will expect the fragment id.
   
CC: Eric Dumazet eduma...@google.com
Signed-off-by: Vladislav Yasevich vyase...@redhat.com
---
 include/linux/skbuff.h |  3 ++-
 include/net/ipv6.h |  2 ++
 net/ipv6/ip6_output.c  |  4 ++--
 net/ipv6/output_core.c |  9 -
 net/ipv6/udp_offload.c | 10 +-
 5 files changed, 23 insertions(+), 5 deletions(-)
   
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 85ab7d7..3ad5203 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -605,7 +605,8 @@ struct sk_buff {
 __u8ipvs_property:1;
 __u8inner_protocol_type:1;
 __u8remcsum_offload:1;
-/* 3 or 5 bit hole */
+__u8ufo_fragid_set:1;
[...]
   
Doesn't the flag belong in struct skb_shared_info, rather 
than struct
sk_buff?  Otherwise this looks fine.
   
Ben.
   
Hmm we seem to be out of tx flags.
Maybe ip6_frag_id == 0 should mean not set.

Maybe that is the best idea. Definitely the ufo_fragid_set bit 
should
move into the skb_shared_info area.
   
   That's what I originally wanted to do, but had to move and grow 
   txflags thus
   skb_shinfo ended up growing.  I wanted to avoid that, so stole an 
   skb flag.
   
   I considered treating fragid == 0 as unset, but a 0 fragid is 
   perfectly valid
   from the protocol perspective and could actually be generated by 
   the id generator
   functions.  This may cause us to call the id generation multiple 
   times.
  
  Are there plans in the long run to let virtio_net transmit auxiliary
  data to the other end so we can clean all of this up one day?
  
  I don't like the whole situation: looking into the virtio_net 
  headers
  just adding a field for ipv6 fragmentation ids to those small 
  structs
  seems bloated, not doing it feels incorrect. :/
  
  Thoughts?
  
  Bye,
  Hannes
 
 I'm not sure - what will be achieved by generating the IDs guest side 
 as
 opposed to host side?  It's certainly harder to get hold of entropy
 guest-side.

It is not only about entropy but about uniqueness.  Also fragmentation
ids should not be discoverable,
   
   I believe predictable is the language used by the IETF draft.
   
so there are several aspects:

I see fragmentation id generation still as security critical:
When Eric patched the frag id generator in 04ca6973f7c1a0d (ip: make IP
identifiers less predictable) I could patch my kernels and use the
patch regardless of the machine being virtualized or not. It was not
dependent on the hypervisor.
   
   And now it's even easier - just patch the hypervisor, and all VMs
   automatically benefit.
  
  Sometimes the hypervisor is not under my control.
 
 In that case doing things like extending virtio
 is out of the question too, isn't it?
 It needs hypervisor changes.

Sure, but I would like to have the fragmentation id generator to reside
inside the end-host kernel. Hypervisor 

Re: [PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

2015-01-28 Thread Vlad Yasevich
On 01/28/2015 09:45 AM, Hannes Frederic Sowa wrote:
 Hi,
 
 On Mi, 2015-01-28 at 09:16 -0500, Vlad Yasevich wrote:
 On 01/28/2015 05:34 AM, Hannes Frederic Sowa wrote:
 Hi,

 On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
 On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
 Hello,

 On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
 On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic Sowa wrote:
 On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
 On 01/27/2015 08:47 AM, Hannes Frederic Sowa wrote:
 On Di, 2015-01-27 at 10:42 +0200, Michael S. Tsirkin wrote:
 On Tue, Jan 27, 2015 at 02:47:54AM +, Ben Hutchings wrote:
 On Mon, 2015-01-26 at 09:37 -0500, Vladislav Yasevich wrote:
 If the IPv6 fragment id has not been set and we perform
 fragmentation due to UFO, select a new fragment id.
 When we store the fragment id into skb_shinfo, set the bit
 in the skb so we can re-use the selected id.
 This preserves the behavior of UFO packets generated on the
 host and solves the issue of id generation for packet sockets
 and tap/macvtap devices.

 This patch moves ipv6_select_ident() back in to the header file.  
 It also provides the helper function that sets skb_shinfo() frag
 id and sets the bit.

 It also makes sure that we select the fragment id when doing
 just gso validation, since it's possible for the packet to
 come from an untrusted source (VM) and be forwarded through
 a UFO enabled device which will expect the fragment id.

 CC: Eric Dumazet eduma...@google.com
 Signed-off-by: Vladislav Yasevich vyase...@redhat.com
 ---
  include/linux/skbuff.h |  3 ++-
  include/net/ipv6.h |  2 ++
  net/ipv6/ip6_output.c  |  4 ++--
  net/ipv6/output_core.c |  9 -
  net/ipv6/udp_offload.c | 10 +-
  5 files changed, 23 insertions(+), 5 deletions(-)

 diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
 index 85ab7d7..3ad5203 100644
 --- a/include/linux/skbuff.h
 +++ b/include/linux/skbuff.h
 @@ -605,7 +605,8 @@ struct sk_buff {
__u8ipvs_property:1;
__u8inner_protocol_type:1;
__u8remcsum_offload:1;
 -  /* 3 or 5 bit hole */
 +  __u8ufo_fragid_set:1;
 [...]

 Doesn't the flag belong in struct skb_shared_info, rather than 
 struct
 sk_buff?  Otherwise this looks fine.

 Ben.

 Hmm we seem to be out of tx flags.
 Maybe ip6_frag_id == 0 should mean not set.

 Maybe that is the best idea. Definitely the ufo_fragid_set bit should
 move into the skb_shared_info area.

 That's what I originally wanted to do, but had to move and grow 
 txflags thus
 skb_shinfo ended up growing.  I wanted to avoid that, so stole an skb 
 flag.

 I considered treating fragid == 0 as unset, but a 0 fragid is perfectly
 valid from the protocol perspective and could actually be generated by the
 id generator functions.  This may cause us to call the id generation
 multiple times.
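
A minimal sketch of the trade-off described above, assuming a toy generator
that legitimately returns 0. The names pkt_state, toy_select_ident,
get_id_with_flag and get_id_with_sentinel are hypothetical and are not the
kernel's types or helpers; the sketch only contrasts an explicit "id is set"
flag with treating 0 as "unset".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-packet state; NOT struct sk_buff or
 * struct skb_shared_info. */
struct pkt_state {
	uint32_t frag_id;	/* chosen IPv6 fragment id */
	bool frag_id_set;	/* explicit "id has been chosen" flag */
	unsigned int gen_calls;	/* how often the generator ran */
};

/* Toy generator (assumption, for illustration only): always returns 0,
 * which is a valid fragment id on the wire. */
static uint32_t toy_select_ident(struct pkt_state *p)
{
	assert(p != NULL);
	p->gen_calls++;
	return 0;
}

/* Scheme A: an explicit flag records that an id was chosen, so the
 * generator runs at most once per packet, whatever value it returns. */
static uint32_t get_id_with_flag(struct pkt_state *p)
{
	if (!p->frag_id_set) {
		p->frag_id = toy_select_ident(p);
		p->frag_id_set = true;
	}
	return p->frag_id;
}

/* Scheme B: 0 doubles as "unset". If the generator returns 0, every
 * later lookup runs the generator again. */
static uint32_t get_id_with_sentinel(struct pkt_state *p)
{
	if (p->frag_id == 0)
		p->frag_id = toy_select_ident(p);
	return p->frag_id;
}
```

Calling each helper twice on a zero-initialised pkt_state runs the toy
generator once under scheme A and twice under scheme B, which is the repeated
id generation mentioned above.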

 Are there plans in the long run to let virtio_net transmit auxiliary
 data to the other end so we can clean all of this up one day?

 I don't like the whole situation: looking at the virtio_net headers,
 just adding a field for IPv6 fragmentation ids to those small structs
 seems bloated, but not doing it feels incorrect. :/

 Thoughts?

 Bye,
 Hannes

 I'm not sure - what will be achieved by generating the IDs guest side as
 opposed to host side?  It's certainly harder to get hold of entropy
 guest-side.

 It is not only about entropy but about uniqueness.  Also fragmentation
 ids should not be discoverable,

 I believe "predictable" is the language used by the IETF draft.

 so there are several aspects:

 I see fragmentation id generation still as security critical:
 When Eric patched the frag id generator in 04ca6973f7c1a0d (ip: make IP
 identifiers less predictable) I could patch my kernels and use the
 patch regardless of the machine being virtualized or not. It was not
 dependent on the hypervisor.

 And now it's even easier - just patch the hypervisor, and all VMs
 automatically benefit.

 Sometimes the hypervisor is not under my control. You would need to
 patch both kernels in your case - non gso frames would still get the
 fragmentation id generated in the host kernel.

 Why would non-gso frames need a frag id?  We are talking only UDP IPv6
 here, so there is no frag id generation if the packet doesn't need to
 be fragmented.
 
 E.g. raw sockets still can generate fragments locally. It is also a
 valid setup to have multiple interfaces in one machine, one that is UFO
 enabled and one that isn't. In that case, fragmentation id generation
 happens on different hosts which I want to avoid.

OK, so you are concerned about both host and guest generating fragment
ids.  Host would do it for GSO frames and guest would do it for fragmented
frames.  Yes, there is room for collision, which is why we are aiming to
fix this with fragment id passing through virtio_net.  However, I am still
trying to 

Re: [PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

2015-01-28 Thread Ben Hutchings
On Wed, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
 On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
[...]
  I see fragmentation id generation still as security critical:
  When Eric patched the frag id generator in 04ca6973f7c1a0d (ip: make IP
  identifiers less predictable) I could patch my kernels and use the
  patch regardless of the machine being virtualized or not. It was not
  dependent on the hypervisor.
 
 And now it's even easier - just patch the hypervisor, and all VMs
 automatically benefit.
[...]

You are advocating that the hypervisor should continue to act as a
middle-box that quietly modifies packets.  This may be useful to protect
guests that have poor fragment ID generation, but then that should be an
optional netfilter module and *not* the default.  The default should be
that UFO has no effect on the packet headers on the wire, and therefore
that the fragment ID is chosen by the IPv6 stack in the guest.
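
A minimal sketch of that default, assuming the id chosen by the guest's IPv6
stack reaches the segmenter: every fragment carries the same supplied id and
only the offset, length and M flag differ. The names frag_desc and
segment_with_id are hypothetical and not kernel code; the 8-octet offset
units follow the IPv6 Fragment header definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one on-wire fragment; not a kernel type. */
struct frag_desc {
	uint32_t id;		/* Identification: identical in all fragments */
	uint16_t offset;	/* Fragment Offset, in 8-octet units */
	uint16_t len;		/* bytes of fragmentable payload carried */
	int more;		/* M flag: 1 if more fragments follow */
};

/* Split `total` payload bytes into fragments of at most `max_payload`
 * bytes, copying the caller-supplied id into each one unchanged.
 * Returns the number of fragments written, or 0 if `out` is too small. */
static size_t segment_with_id(uint32_t guest_id, size_t total,
			      size_t max_payload,
			      struct frag_desc *out, size_t out_cap)
{
	/* Every fragment except the last must be a multiple of 8 octets. */
	size_t step = max_payload & ~(size_t)7;
	size_t off = 0, n = 0;

	assert(step > 0);
	while (off < total) {
		size_t len = (total - off < step) ? total - off : step;

		if (n == out_cap)
			return 0;
		out[n].id = guest_id;	/* never regenerated here */
		out[n].offset = (uint16_t)(off / 8);
		out[n].len = (uint16_t)len;
		out[n].more = (off + len < total);
		off += len;
		n++;
	}
	return n;
}
```

For example, a 3000-byte payload with 1452 bytes of room per fragment yields
three fragments of 1448, 1448 and 104 bytes at offsets 0, 181 and 362, all
with the id passed in by the caller.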

Ben.

-- 
Ben Hutchings
Teamwork is essential - it allows you to blame someone else.



Re: [PATCH 1/3] ipv6: Select fragment id during UFO/GSO segmentation if not set.

2015-01-28 Thread Michael S. Tsirkin
On Wed, Jan 28, 2015 at 05:15:49PM +0100, Hannes Frederic Sowa wrote:
 Hi,
 
 On Mi, 2015-01-28 at 18:00 +0200, Michael S. Tsirkin wrote:
  On Wed, Jan 28, 2015 at 11:34:02AM +0100, Hannes Frederic Sowa wrote:
   Hi,
   
   On Mi, 2015-01-28 at 11:46 +0200, Michael S. Tsirkin wrote:
On Wed, Jan 28, 2015 at 09:25:08AM +0100, Hannes Frederic Sowa wrote:
 Hello,
 
 On Di, 2015-01-27 at 18:08 +0200, Michael S. Tsirkin wrote:
  On Tue, Jan 27, 2015 at 05:02:31PM +0100, Hannes Frederic Sowa wrote:
   On Di, 2015-01-27 at 09:26 -0500, Vlad Yasevich wrote:
On 01/27/2015 08:47 AM, Hannes Frederic Sowa wrote:
 On Di, 2015-01-27 at 10:42 +0200, Michael S. Tsirkin wrote:
 On Tue, Jan 27, 2015 at 02:47:54AM +, Ben Hutchings wrote:
 On Mon, 2015-01-26 at 09:37 -0500, Vladislav Yasevich wrote:
 If the IPv6 fragment id has not been set and we perform
 fragmentation due to UFO, select a new fragment id.
 When we store the fragment id into skb_shinfo, set the bit
 in the skb so we can re-use the selected id.
 This preserves the behavior of UFO packets generated on the
 host and solves the issue of id generation for packet sockets
 and tap/macvtap devices.

 This patch moves ipv6_select_ident() back into the header file.
 It also provides the helper function that sets skb_shinfo() frag
 id and sets the bit.

 It also makes sure that we select the fragment id when doing
 just gso validation, since it's possible for the packet to
 come from an untrusted source (VM) and be forwarded through
 a UFO enabled device which will expect the fragment id.

 CC: Eric Dumazet eduma...@google.com
 Signed-off-by: Vladislav Yasevich vyase...@redhat.com
 ---
  include/linux/skbuff.h |  3 ++-
  include/net/ipv6.h |  2 ++
  net/ipv6/ip6_output.c  |  4 ++--
  net/ipv6/output_core.c |  9 -
  net/ipv6/udp_offload.c | 10 +-
  5 files changed, 23 insertions(+), 5 deletions(-)

 diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
 index 85ab7d7..3ad5203 100644
 --- a/include/linux/skbuff.h
 +++ b/include/linux/skbuff.h
 @@ -605,7 +605,8 @@ struct sk_buff {
    __u8 ipvs_property:1;
    __u8 inner_protocol_type:1;
    __u8 remcsum_offload:1;
 -  /* 3 or 5 bit hole */
 +  __u8 ufo_fragid_set:1;
 [...]

 Doesn't the flag belong in struct skb_shared_info, rather than struct
 sk_buff?  Otherwise this looks fine.

 Ben.

 Hmm we seem to be out of tx flags.
 Maybe ip6_frag_id == 0 should mean not set.
 
 Maybe that is the best idea. Definitely the ufo_fragid_set bit should
 move into the skb_shared_info area.

That's what I originally wanted to do, but had to move and grow txflags,
thus skb_shinfo ended up growing.  I wanted to avoid that, so stole an skb
flag.

I considered treating fragid == 0 as unset, but a 0 fragid is perfectly
valid from the protocol perspective and could actually be generated by the
id generator functions.  This may cause us to call the id generation
multiple times.
   
   Are there plans in the long run to let virtio_net transmit auxiliary
   data to the other end so we can clean all of this up one day?
   
   I don't like the whole situation: looking at the virtio_net headers,
   just adding a field for IPv6 fragmentation ids to those small structs
   seems bloated, but not doing it feels incorrect. :/
   
   Thoughts?
   
   Bye,
   Hannes
  
  I'm not sure - what will be achieved by generating the IDs guest side as
  opposed to host side?  It's certainly harder to get hold of entropy
  guest-side.
 
 It is not only about entropy but about uniqueness.  Also fragmentation
 ids should not be discoverable,

I believe "predictable" is the language used by the IETF draft.

 so there are several aspects:
 
 I see fragmentation id generation still as security critical:
 When Eric patched the frag id generator in 04ca6973f7c1a0d (ip: make IP
 identifiers less predictable) I could patch my kernels and use the
 patch regardless of the machine being virtualized or not. It was not
 dependent on the hypervisor.

And now it's even easier - just patch the hypervisor, and all VMs
automatically benefit.
   
   Sometimes the hypervisor is not under my