subject:"Re\: \[PATCH 06\/13\]\: \[IPV4\/6\]\: Netfilter IPsec input hooks"

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-12-18 Thread YOSHIFUJI Hideaki / 吉藤英明

In article <[EMAIL PROTECTED]> (at Sun, 18 Dec 2005 23:59:35 +0100), Patrick 
McHardy <[EMAIL PROTECTED]> says:

> YOSHIFUJI Hideaki wrote:
:
> >  BTW, we're now using full of skb->cb
> > (and we are even exceeding it w/ mobile-ipv6 extensions)...
> 
> Not in mainline so far, so maybe we can fit your extensions
> and my patches without the mobile extensions, that apparently
> exceed the CB anyway, in there for now. Can I look at those
> patches somewhere? BTW, other fields in the IP6CB seem to
> store offsets in u16 fields, is this OK for nhoff too? I
> thought with jumbo options I need to use a u32 field.

Well, don't mind too much.
I just wanted to note that we're about to exceed skb->cb.
I will probably enlarge skb->cb if needed.

And, yes, good point.
I think nhoff should be u32 because it is critical.
In theory, u32 is definitely needed for others as well...

--yoshfuji
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-12-18 Thread YOSHIFUJI Hideaki / 吉藤英明

In article <[EMAIL PROTECTED]> (at Sun, 18 Dec 2005 15:27:01 +0100), Patrick 
McHardy <[EMAIL PROTECTED]> says:

> YOSHIFUJI Hideaki wrote:
> > In article <[EMAIL PROTECTED]> (at Tue, 22 Nov 2005 02:14:26 +0100), 
> > Patrick McHardy <[EMAIL PROTECTED]> says:
> > 
> > 
> >>The easiest way would be to store nhoff somewhere in the skb and
> >>use it to continue at the next header. But I still hope there is
> >>a way without keeping data in the skb.
> > 
> > 
> > We've coded up this.
> 
> How about this patch instead? It eliminates the nhoffp argument
> to IPv6 protocol handlers by storing it in the IP6CB, which allows
> to call ip6_input_finish a second time and have it skip already
> parsed headers and also gets rid of the manual hopopts skipping.

The idea to store IP6CB itself seems sane to me.

BTW, we're now using full of skb->cb
(and we are even exceeding it w/ mobile-ipv6 extensions)...

--yoshfuji
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-12-04 Thread Patrick McHardy


Herbert Xu wrote:

On Sun, Dec 04, 2005 at 11:06:02PM +0100, Patrick McHardy wrote:

If there is a DNAT in the way, this will jump to the very start of
the stack.  So if we have a hostile IPsec peer, and the DNAT rules
are such that this can occur, then we could be in trouble (especially
because policy/selector verification does not occur until all IPsec
has been done so we can't check inner address validitiy at this point).


We could return NET_XMIT_BYPASS from ip_xfrm_transport_hook(), although
it looks a bit ugly to use NET_XMIT* on the input path.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-12-04 Thread Herbert Xu

On Sun, Dec 04, 2005 at 11:06:02PM +0100, Patrick McHardy wrote:
>
> >I'm worried about this bit.  This looks like it'll go back to the top
> >of the IP stack with the existing call chain.  So could grow as the
> >number of transforms increase.
> 
> Its not so bad. It adds ip_xfrm_transport_hook and
> ip_local_deliver_finish to the call stack, but since two subsequent
> transport mode SAs are always processed at once it can't take this
> path again without calling netif_rx in between.

If there is a DNAT in the way, this will jump to the very start of
the stack.  So if we have a hostile IPsec peer, and the DNAT rules
are such that this can occur, then we could be in trouble (especially
because policy/selector verification does not occur until all IPsec
has been done so we can't check inner address validitiy at this point).
 
> Besides the double counting, packets also appear on the packet sockets
> after transport mode decapsulation with the original approach. For
> IPv6 there's also the double-parsing of extension header issue.

Having the packets appear twice on AF_PACKET is probably desirable :)

I'll need to think about the double-parsing though.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-12-04 Thread Patrick McHardy


Herbert Xu wrote:

On Sun, Nov 20, 2005 at 04:31:36PM +, Patrick McHardy wrote:


@@ -145,7 +149,17 @@ int xfrm4_rcv_encap(struct sk_buff *skb,
netif_rx(skb);
return 0;
} else {
+#ifdef CONFIG_NETFILTER
+   __skb_push(skb, skb->data - skb->nh.raw);
+   skb->nh.iph->tot_len = htons(skb->len);
+   ip_send_check(skb->nh.iph);
+
+   NF_HOOK(PF_INET, NF_IP_PRE_ROUTING, skb, skb->dev, NULL,
+   ip_xfrm_transport_hook);
+   return 0;
+#else
return -skb->nh.iph->protocol;
+#endif



I'm worried about this bit.  This looks like it'll go back to the top
of the IP stack with the existing call chain.  So could grow as the
number of transforms increase.


Its not so bad. It adds ip_xfrm_transport_hook and
ip_local_deliver_finish to the call stack, but since two subsequent
transport mode SAs are always processed at once it can't take this
path again without calling netif_rx in between.


Perhaps we need to play a dst_input/netif_rx trick here.

Actually, was there a problem with your original netif_rx approach
apart from the issue with double counting?


Besides the double counting, packets also appear on the packet sockets
after transport mode decapsulation with the original approach. For
IPv6 there's also the double-parsing of extension header issue.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-30 Thread Herbert Xu

On Sun, Nov 20, 2005 at 04:31:36PM +, Patrick McHardy wrote:
>
> @@ -145,7 +149,17 @@ int xfrm4_rcv_encap(struct sk_buff *skb,
>   netif_rx(skb);
>   return 0;
>   } else {
> +#ifdef CONFIG_NETFILTER
> + __skb_push(skb, skb->data - skb->nh.raw);
> + skb->nh.iph->tot_len = htons(skb->len);
> + ip_send_check(skb->nh.iph);
> +
> + NF_HOOK(PF_INET, NF_IP_PRE_ROUTING, skb, skb->dev, NULL,
> + ip_xfrm_transport_hook);
> + return 0;
> +#else
>   return -skb->nh.iph->protocol;
> +#endif

I'm worried about this bit.  This looks like it'll go back to the top
of the IP stack with the existing call chain.  So could grow as the
number of transforms increase.

Perhaps we need to play a dst_input/netif_rx trick here.

Actually, was there a problem with your original netif_rx approach
apart from the issue with double counting?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-23 Thread YOSHIFUJI Hideaki / 吉藤英明

Hello.

In article <[EMAIL PROTECTED]> (at Tue, 22 Nov 2005 02:14:26 +0100), Patrick 
McHardy <[EMAIL PROTECTED]> says:

> The easiest way would be to store nhoff somewhere in the skb and
> use it to continue at the next header. But I still hope there is
> a way without keeping data in the skb.

We've coded up this.

Though we have still another issue (call chain issue) to resolve,
we're getting closer to the goal.
i.e. we should continue the loop for common case.

Signed-off-by: YOSHIFUJI Hideaki <[EMAIL PROTECTED]>
Signed-off-by: Yasuyuki Kozawai <[EMAIL PROTECTED]>

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 8c5d600..1101851 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -272,6 +272,9 @@ struct sk_buff {
void(*destructor)(struct sk_buff *skb);
 #ifdef CONFIG_NETFILTER
__u32   nfmark;
+#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
+   unsigned intnf_nhoff;
+#endif
struct nf_conntrack *nfct;
 #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
struct sk_buff  *nfct_reasm;
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 44b979a..0531d0a 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -463,6 +463,8 @@ extern int  ip6_dst_lookup(struct sock 
 extern int ip6_output(struct sk_buff *skb);
 extern int ip6_forward(struct sk_buff *skb);
 extern int ip6_input(struct sk_buff *skb);
+extern int ip6_input_finish2(struct sk_buff *skb,
+ unsigned int nhoff);
 extern int ip6_mc_input(struct sk_buff *skb);
 
 /*
diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index e84b3cd..cd0606a 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -134,31 +134,13 @@ out:
  * Deliver the packet to the host
  */
 
-
-static inline int ip6_input_finish(struct sk_buff *skb)
+int ip6_input_finish2(struct sk_buff *skb, unsigned int nhoff)
 {
struct inet6_protocol *ipprot;
struct sock *raw_sk;
-   unsigned int nhoff;
-   int nexthdr;
+   unsigned int nexthdr;
u8 hash;
 
-   skb->h.raw = skb->nh.raw + sizeof(struct ipv6hdr);
-
-   /*
-*  Parse extension headers
-*/
-
-   nexthdr = skb->nh.ipv6h->nexthdr;
-   nhoff = offsetof(struct ipv6hdr, nexthdr);
-
-   /* Skip hop-by-hop options, they are already parsed. */
-   if (nexthdr == NEXTHDR_HOP) {
-   nhoff = sizeof(struct ipv6hdr);
-   nexthdr = skb->h.raw[0];
-   skb->h.raw += (skb->h.raw[1]+1)<<3;
-   }
-
rcu_read_lock();
 resubmit:
if (!pskb_pull(skb, skb->h.raw - skb->data))
@@ -221,6 +203,26 @@ discard:
return 0;
 }
 
+static inline int ip6_input_finish(struct sk_buff *skb)
+{
+   unsigned int nhoff;
+
+   skb->h.raw = skb->nh.raw + sizeof(struct ipv6hdr);
+
+   /*
+*  Parse extension headers
+*/
+
+   nhoff = offsetof(struct ipv6hdr, nexthdr);
+
+   /* Skip hop-by-hop options, they are already parsed. */
+   if (skb->nh.ipv6h->nexthdr == NEXTHDR_HOP) {
+   nhoff = sizeof(struct ipv6hdr);
+   skb->h.raw += (skb->h.raw[1]+1)<<3;
+   }
+
+   return ip6_input_finish2(skb, nhoff);
+}
 
 int ip6_input(struct sk_buff *skb)
 {
diff --git a/net/ipv6/ipv6_syms.c b/net/ipv6/ipv6_syms.c
index 1648278..6051783 100644
--- a/net/ipv6/ipv6_syms.c
+++ b/net/ipv6/ipv6_syms.c
@@ -15,6 +15,7 @@ EXPORT_SYMBOL(ndisc_mc_map);
 EXPORT_SYMBOL(register_inet6addr_notifier);
 EXPORT_SYMBOL(unregister_inet6addr_notifier);
 EXPORT_SYMBOL(ip6_route_output);
+EXPORT_SYMBOL(ip6_input_finish2);
 EXPORT_SYMBOL(addrconf_lock);
 EXPORT_SYMBOL(ipv6_setsockopt);
 EXPORT_SYMBOL(ipv6_getsockopt);
diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c
index 9987416..2e3b28d 100644
--- a/net/ipv6/xfrm6_input.c
+++ b/net/ipv6/xfrm6_input.c
@@ -9,6 +9,7 @@
  * IPv6 support
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -17,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static inline void ipip6_ecn_decapsulate(struct sk_buff *skb)
@@ -28,6 +30,25 @@ static inline void ipip6_ecn_decapsulate
IP6_ECN_set_ce(inner_iph);
 }
 
+#ifdef CONFIG_NETFILTER
+static inline int xfrm6_rcv_spi_finish2(struct sk_buff *skb)
+{
+   __skb_pull(skb, skb->h.raw - skb->nh.raw);
+   return ip6_input_finish2(skb, skb->nf_nhoff);
+}
+
+static inline int xfrm6_rcv_spi_finish(struct sk_buff *skb)
+{
+   if (skb->dst == NULL) {
+   ip6_route_input(skb);
+   return dst_input(skb);
+   }
+
+   return NF_HOOK(PF_INET6, NF_IP6_LOCAL_IN, skb, skb->dev, NULL,
+  xfrm6_rcv_spi_finish2);
+}
+#endif
+
 int xfrm6_rcv_spi(struct sk_buff **pskb, unsigned int *

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-21 Thread Patrick McHardy

Kazunori Miyazawa wrote:
> Patrick McHardy wrote:
> 
>>The problem is that netfilter hooks take ownership of the skb, so the
>>caller can't touch it afterwards. It would be possible, but it would
>>become very ugly. Can I assume that you would also be satisfied if
>>the double-parsing of extension headers is fixed in some other way?
> 
> 
> My concern is cost to look up routing table and parse extention headers twice.
> I think the latter will be critical issue if some extention header makes
> some state in the stack.

The routing lookup is not done unless someone resets skb->dst (i.e.
nfqueue). I'm looking into fixing the extension header problem,
so I hope that will resolve all issues.

> IMHO, we will call the loop part of ip6_input_finish to inject the skb
> to upper layer directly in ip6_xfrm_transport_hook.
> But it may make duplicate codes and raise other issues...

The easiest way would be to store nhoff somewhere in the skb and
use it to continue at the next header. But I still hope there is
a way without keeping data in the skb.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-21 Thread Kazunori Miyazawa

Patrick McHardy wrote:
> Kazunori Miyazawa wrote:
> 
>>Hello Patrick,
>>
>>I have a comment about the patch on the IPv6 input process.
>>The kernel applied your patch will always call ip6_rcv_finish
>>when enabling netfilter and receiving a esp packet so that
>>it will always look up the routing table and parse
>>extention headers twice a packet.
>>It will probably cost.
>>
>>Your ip_xfrm_transport_hook is a good idea, I think.
> 
> 
> Yes, not passing the packets through the entire stack seems like
> the right thing to do.
> 
> 
>>We could call ip6_rcv_finish if the netfilter changed the addresses
>>or otherwise we can continue the loop to avoid the cost in a similar
>>way because we can know the change with checking skb->dst.
>>
>>If you fix the point, your change will resolve our issues which were
>>mentioned in the previous mail from Kozakai-san.
> 
> 
> The problem is that netfilter hooks take ownership of the skb, so the
> caller can't touch it afterwards. It would be possible, but it would
> become very ugly. Can I assume that you would also be satisfied if
> the double-parsing of extension headers is fixed in some other way?

My concern is cost to look up routing table and parse extention headers twice.
I think the latter will be critical issue if some extention header makes
some state in the stack.

IMHO, we will call the loop part of ip6_input_finish to inject the skb
to upper layer directly in ip6_xfrm_transport_hook.
But it may make duplicate codes and raise other issues...

I'm also consider the issue.

Thank you Patrick,

--
Kazunori Miyazawa
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-21 Thread Patrick McHardy


David S. Miller wrote:

I've read over Patrick's two most recent postings of these patches
and I think they are generally sane and I cannot find any holes in
them.  Herbert brought up the legitimate concern about defragmentation,
but I think that's a detail and does not take away from the structural
soundness of Patrick's approach.


I think we implicitly agreed on moving the POST_ROUTING hook before
fragmentation and change the user-visible behaviour of the mangle
POSTROUTING chain. At least neither Harald not Rusty objected to
the patch :)
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-21 Thread Patrick McHardy


Yasuyuki KOZAKAI wrote:

I don't see why it is confusing. Plain text packets are visible before
encapsulation (and they have to be because we don't necessarily know
if packets will be encapsulated at the time the hooks are called in
case the policy lookup after NAT returns a policy), plain text packets
are visible after decapsulation. With different hooks we can't have
symetrical behaviour because of the case I mentioned above, and that
would be confusing IMO.


Well, what I worried about was just ease to use, not internal processing.
I suspected that users can correctly configure IPsec and packet filtering.
Just doing "iptables -P INPUT -j DROP; iptables -A INPUT -p esp -j ACCEPT"
will drop all input packets and this is different behavior with current
kernel, for example. So I just imagined many people would say "Why ?".

But as you said in other mail, probably this is my needles fear, isn't this ?
As you know, I'm basically worrier. :)


:) I think its OK since in tunnel mode the decapsulated packets already
need to be allowed by the ruleset, so I think its not too confusing to
expect the same in transport mode.


Then, why don't we make xfrm{4,6}_rcv_spi() return 0 (1 if IPv6)
so that ip_local_deliver_finish()/ip6_input_finish() can continue to process
headers if packets have not been mangled ? Is this difficult or impossible to
implement ?


I'm not sure I understand. Do you propose to check if the packet was
mangled after the PRE_ROUTING hook (if it was not stolen or queued)
and if not return directly to ip6_input_finish()?


Yes.


Where would the LOCAL_IN hook be called?


Ah, indeed. But we can add it just before return directly to
ip6_input_finish().


Please see my mail to Kazunori. The hooks take ownership of the
skb, changing this would become pretty ugly because of queued
or stolen packets. And it would still leave one path where packets
are not directly returned, so I think the double-parsing should
be handled otherwise.


AFAICT statistics are not affected by my patches, except for the
iptables counters. The double parsing of extension headers is indeed
a problem I forgot about, it looks like we need to carry nhoff around
if it can't be derived from the packet.


Of cause that is one solution.


I'm going to try that if I can't come up with something better.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-21 Thread Patrick McHardy

Kazunori Miyazawa wrote:
> Hello Patrick,
> 
> I have a comment about the patch on the IPv6 input process.
> The kernel applied your patch will always call ip6_rcv_finish
> when enabling netfilter and receiving a esp packet so that
> it will always look up the routing table and parse
> extention headers twice a packet.
> It will probably cost.
> 
> Your ip_xfrm_transport_hook is a good idea, I think.

Yes, not passing the packets through the entire stack seems like
the right thing to do.

> We could call ip6_rcv_finish if the netfilter changed the addresses
> or otherwise we can continue the loop to avoid the cost in a similar
> way because we can know the change with checking skb->dst.
> 
> If you fix the point, your change will resolve our issues which were
> mentioned in the previous mail from Kozakai-san.

The problem is that netfilter hooks take ownership of the skb, so the
caller can't touch it afterwards. It would be possible, but it would
become very ugly. Can I assume that you would also be satisfied if
the double-parsing of extension headers is fixed in some other way?
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-21 Thread Yasuyuki KOZAKAI


Hi,

From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Mon, 21 Nov 2005 07:52:36 +0100

> I don't see why it is confusing. Plain text packets are visible before
> encapsulation (and they have to be because we don't necessarily know
> if packets will be encapsulated at the time the hooks are called in
> case the policy lookup after NAT returns a policy), plain text packets
> are visible after decapsulation. With different hooks we can't have
> symetrical behaviour because of the case I mentioned above, and that
> would be confusing IMO.

Well, what I worried about was just ease to use, not internal processing.
I suspected that users can correctly configure IPsec and packet filtering.
Just doing "iptables -P INPUT -j DROP; iptables -A INPUT -p esp -j ACCEPT"
will drop all input packets and this is different behavior with current
kernel, for example. So I just imagined many people would say "Why ?".

But as you said in other mail, probably this is my needles fear, isn't this ?
As you know, I'm basically worrier. :)

> > And this can be said about IPv6 input path. If packets have not been mangled
> > (this is ordinary case because ip6tables doesn't have neither NAT nor
> > target module to mangle addresses directly), they don't have to re-route
> > and don't have to re-visit ip6_input_finish().
> > 
> > In the other way, if their addresses have been mangled, it's necessary to
> > re-route. I agree re-visiting ip6_input_finish() in this case.
> 
> Same goes for ip6_input_finish as for ip_local_deliver_finish(),
> the packet would continue its path there anyway. Do you actually
> mean ip6_rcv_finish()?

No. I mean ip6_input_finish(). calling ip6_input_finish() twice causes
problem at processing of IPv6 extension headers. This is different point
between IPv4 and IPv6.

Please note that I don't mean IPv4 processing in your patch has problem.
I think it will work. What I wanted to do was just avoiding double processing
of extension headers and synchronizing IPv4/IPv6 behavior as possible.

> > Then, why don't we make xfrm{4,6}_rcv_spi() return 0 (1 if IPv6)
> > so that ip_local_deliver_finish()/ip6_input_finish() can continue to process
> > headers if packets have not been mangled ? Is this difficult or impossible 
> > to
> > implement ?
> 
> I'm not sure I understand. Do you propose to check if the packet was
> mangled after the PRE_ROUTING hook (if it was not stolen or queued)
> and if not return directly to ip6_input_finish()?

Yes.

>Where would the
> LOCAL_IN hook be called?

Ah, indeed. But we can add it just before return directly to
ip6_input_finish().

> > This can solve the issue about twice processing of statistics and IPv6
> > extension headers. Because ip_local_deliver_finish()/ip6_input_finish() can
> > continue to process extension headers after ESP/AH in ordinary case.
> 
> AFAICT statistics are not affected by my patches, except for the
> iptables counters. The double parsing of extension headers is indeed
> a problem I forgot about, it looks like we need to carry nhoff around
> if it can't be derived from the packet.

Of cause that is one solution.

-- Yasuyuki Kozakai
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-21 Thread YOSHIFUJI Hideaki / 吉藤英明

Hello.

In article <[EMAIL PROTECTED]> (at Mon, 21 Nov 2005 17:31:41 +0900), Kazunori 
Miyazawa <[EMAIL PROTECTED]> says:

> Your ip_xfrm_transport_hook is a good idea, I think.
> 
> We could call ip6_rcv_finish if the netfilter changed the addresses
> or otherwise we can continue the loop to avoid the cost in a similar
> way because we can know the change with checking skb->dst.

Well, I agree.

In article <[EMAIL PROTECTED]> (at Sun, 20 Nov 2005 17:31:36 +0100), Patrick 
McHardy <[EMAIL PROTECTED]> says:

> diff --git a/net/ipv4/netfilter.c b/net/ipv4/netfilter.c
> index b93e7cd..3c39296 100644
> --- a/net/ipv4/netfilter.c
> +++ b/net/ipv4/netfilter.c
> @@ -105,6 +105,26 @@ int ip_dst_output(struct sk_buff *skb)
>   return dst_output(skb);
>  }
>  EXPORT_SYMBOL(ip_dst_output);
> +
> +/*
> + * okfn for transport mode xfrm_input.c hook. Basically a copy of
> + * ip_rcv_finish without statistics and option parsing.
> + */
> +int ip_xfrm_transport_hook(struct sk_buff *skb)
> +{
> + struct iphdr *iph = skb->nh.iph;
> +
> + if (likely(skb->dst == NULL)) {
> + int err = ip_route_input(skb, iph->daddr, iph->saddr, iph->tos,
> +  skb->dev);
> + if (unlikely(err))
> + goto drop;
> + }
> + return dst_input(skb);
> +drop:
> + kfree_skb(skb);
> + return NET_RX_DROP;
> +}
>  #endif /* CONFIG_XFRM */
>  
:
> @@ -129,7 +133,16 @@ int xfrm6_rcv_spi(struct sk_buff **pskb,
>   netif_rx(skb);
>   return -1;
>   } else {
> +#ifdef CONFIG_NETFILTER
> + skb->nh.ipv6h->payload_len = htons(skb->len);
> + __skb_push(skb, skb->data - skb->nh.raw);
> +
> + NF_HOOK(PF_INET6, NF_IP6_PRE_ROUTING, skb, skb->dev, NULL,
> + ip6_rcv_finish);
> + return -1;
> +#else
>   return 1;
> +#endif
>   }
>  

Probably, we can do similarly for ipv6; e.g.:

int ip6_xfrm_transport_hook(struct sk_buff *skb)
{
#if 0 /* We NEVER support NAT. :-) */
 if (likely(skb->dst == NULL)) {
int err = ip6_route_input()
if (unlikely(err))
 goto drop;
 }
#endif
 __skb_pull(skb, skb->h.raw - skb->nh.raw);
 return NET_RX_SUCCESS;
drop:
 kfree_skb(skb);
 return NET_RX_DROP;
}

:

  } else {
#ifdef CONFIG_NETFILTER
 skb->nh.ipv6h->payload_len = htons(skb->len);
 skb->h.raw = skb->data;
 __skb_push(skb, skb->data - skb->nh.raw);

 if (NF_HOOK(PF_INET6, NF_IP6_PRE_ROUTING, skb, skb->dev, NULL,
 ip6_xfrm_transport_hook) == NET_RX_DROP)
 return -1;
#endif
 return 1;
  }

Then, we can continue parsing extension headers, I think.

-- 
YOSHIFUJI Hideaki @ USAGI Project  <[EMAIL PROTECTED]>
GPG-FP  : 9022 65EB 1ECF 3AD1 0BDF  80D8 4807 F894 E062 0EEA
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-21 Thread Kazunori Miyazawa

Hello Patrick,

I have a comment about the patch on the IPv6 input process.
The kernel applied your patch will always call ip6_rcv_finish
when enabling netfilter and receiving a esp packet so that
it will always look up the routing table and parse
extention headers twice a packet.
It will probably cost.

Your ip_xfrm_transport_hook is a good idea, I think.

We could call ip6_rcv_finish if the netfilter changed the addresses
or otherwise we can continue the loop to avoid the cost in a similar
way because we can know the change with checking skb->dst.

If you fix the point, your change will resolve our issues which were
mentioned in the previous mail from Kozakai-san.

Patrick McHardy wrote:
> [IPV4/6]: Netfilter IPsec input hooks
> 
> When the innermost transform uses transport mode the decapsulated packet
> is not visible to netfilter. Pass the packet through the PRE_ROUTING and
> LOCAL_IN hooks again before handing it to upper layer protocols to make
> netfilter-visibility symetrical to the output path.
> 
> Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
> 
> ---
> commit 08cf39d5d7d8b942431a6529daa3ab69ecfb34b5
> tree 6f2a1a85f915b1b6f89ad50cf3d8855f21a561b6
> parent b847425c693f43a63137d18e36e5c8cf0187c175
> author Patrick McHardy <[EMAIL PROTECTED]> Sat, 19 Nov 2005 21:50:22 +0100
> committer Patrick McHardy <[EMAIL PROTECTED]> Sat, 19 Nov 2005 21:50:22 +0100
> 
>  include/linux/netfilter_ipv4.h |2 +-
>  include/net/ipv6.h |2 ++
>  net/ipv4/netfilter.c   |   20 
>  net/ipv4/xfrm4_input.c |   14 ++
>  net/ipv6/ip6_input.c   |2 +-
>  net/ipv6/xfrm6_input.c |   13 +
>  6 files changed, 51 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/netfilter_ipv4.h b/include/linux/netfilter_ipv4.h
> index fdc4a95..e9103fe 100644
> --- a/include/linux/netfilter_ipv4.h
> +++ b/include/linux/netfilter_ipv4.h
> @@ -79,7 +79,7 @@ enum nf_ip_hook_priorities {
>  
>  #ifdef __KERNEL__
>  extern int ip_route_me_harder(struct sk_buff **pskb);
> -
> +extern int ip_xfrm_transport_hook(struct sk_buff *skb);
>  #endif /*__KERNEL__*/
>  
>  #endif /*__LINUX_IP_NETFILTER_H*/
> diff --git a/include/net/ipv6.h b/include/net/ipv6.h
> index 6addb4d..4fbfe43 100644
> --- a/include/net/ipv6.h
> +++ b/include/net/ipv6.h
> @@ -414,6 +414,8 @@ extern intipv6_rcv(struct sk_buff 
> *sk
>struct packet_type *pt,
>struct net_device *orig_dev);
>  
> +extern int   ip6_rcv_finish(struct sk_buff *skb);
> +
>  /*
>   *   upper-layer output functions
>   */
> diff --git a/net/ipv4/netfilter.c b/net/ipv4/netfilter.c
> index b93e7cd..3c39296 100644
> --- a/net/ipv4/netfilter.c
> +++ b/net/ipv4/netfilter.c
> @@ -105,6 +105,26 @@ int ip_dst_output(struct sk_buff *skb)
>   return dst_output(skb);
>  }
>  EXPORT_SYMBOL(ip_dst_output);
> +
> +/*
> + * okfn for transport mode xfrm_input.c hook. Basically a copy of
> + * ip_rcv_finish without statistics and option parsing.
> + */
> +int ip_xfrm_transport_hook(struct sk_buff *skb)
> +{
> + struct iphdr *iph = skb->nh.iph;
> +
> + if (likely(skb->dst == NULL)) {
> + int err = ip_route_input(skb, iph->daddr, iph->saddr, iph->tos,
> +  skb->dev);
> + if (unlikely(err))
> + goto drop;
> + }
> + return dst_input(skb);
> +drop:
> + kfree_skb(skb);
> + return NET_RX_DROP;
> +}
>  #endif /* CONFIG_XFRM */
>  
>  /*
> diff --git a/net/ipv4/xfrm4_input.c b/net/ipv4/xfrm4_input.c
> index 2d3849c..d90cd93 100644
> --- a/net/ipv4/xfrm4_input.c
> +++ b/net/ipv4/xfrm4_input.c
> @@ -11,6 +11,8 @@
>  
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -137,6 +139,8 @@ int xfrm4_rcv_encap(struct sk_buff *skb,
>   memcpy(skb->sp->x+skb->sp->len, xfrm_vec, xfrm_nr*sizeof(struct 
> sec_decap_state));
>   skb->sp->len += xfrm_nr;
>  
> + nf_reset(skb);
> +
>   if (decaps) {
>   if (!(skb->dev->flags&IFF_LOOPBACK)) {
>   dst_release(skb->dst);
> @@ -145,7 +149,17 @@ int xfrm4_rcv_encap(struct sk_buff *skb,
>   netif_rx(skb);
>   return 0;
>   } else {
> +#ifdef CONFIG_NETFILTER
> + __skb_push(skb, skb->data - skb->nh.raw);
> + skb->nh.iph->tot_len = htons(skb->len);
> + ip_send_check(skb->nh.iph);
> +
> + NF_HOOK(PF_INET, NF_IP_PRE_ROUTING, skb, skb->dev, NULL,
> + ip_xfrm_transport_hook);
> + return 0;
> +#else
>   return -skb->nh.iph->protocol;
> +#endif
>   }
>  
>  drop_unlock:
> diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
> index 33d3b0e..e84b3cd 100644
> --- a/net/ipv6/ip6_input.c
> +++ b/net/ipv6/ip6_input.c
> @@ -48,7 +48,7 @@
>  
>  
>  
> -static inline int i

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-20 Thread Herbert Xu

David S. Miller <[EMAIL PROTECTED]> wrote:
> 
> I've read over Patrick's two most recent postings of these patches
> and I think they are generally sane and I cannot find any holes in
> them.  Herbert brought up the legitimate concern about defragmentation,
> but I think that's a detail and does not take away from the structural
> soundness of Patrick's approach.

Yes I agree completely.  The new IPsec stack has been around for three
years and it's about time that we have proper netfilter support for it.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-20 Thread David S. Miller

From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Mon, 21 Nov 2005 07:52:36 +0100

> I don't see why it is confusing. Plain text packets are visible before
> encapsulation (and they have to be because we don't necessarily know
> if packets will be encapsulated at the time the hooks are called in
> case the policy lookup after NAT returns a policy), plain text packets
> are visible after decapsulation. With different hooks we can't have
> symetrical behaviour because of the case I mentioned above, and that
> would be confusing IMO.

I think this is a very important point.

I can see no serious argument against this behavior, especially for
output.  On input, there is an argument of paranoia about seeing
plaintext packets, but administrator could do this anyways with
tcpdump or custom kernel module if this system is the decapsulation
point.

I've read over Patrick's two most recent postings of these patches
and I think they are generally sane and I cannot find any holes in
them.  Herbert brought up the legitimate concern about defragmentation,
but I think that's a detail and does not take away from the structural
soundness of Patrick's approach.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-20 Thread Patrick McHardy


Yasuyuki KOZAKAI wrote:

At first, now I could agree to use same name for hooks before/after xfrm
processing, if it's important to keep compatibility than to avoid difficulty
to use. Even now I think it's confusing to pass packets before/after xfrm to
same hook, and believe it's ideal to use different name for them.
But people can configure correctly according to you, then my concern might be
needless fear.


I don't see why it is confusing. Plain text packets are visible before
encapsulation (and they have to be because we don't necessarily know
if packets will be encapsulated at the time the hooks are called in
case the policy lookup after NAT returns a policy), plain text packets
are visible after decapsulation. With different hooks we can't have
symetrical behaviour because of the case I mentioned above, and that
would be confusing IMO.


Next is about re-visiting stack, I'm beginning to understand your patch.
It looks natural to re-route after DNAT on input path of IPv4. That's really
needed if packets have been mangled.

But is there any reason that we have to allow packets to re-visit
ip_local_deliver_finish() in the case that they have not been mangled
at PRE_ROUTING ? I think no. This situation is like ip_nat_out().


My patches don't change when and if packets will reach
ip_local_deliver_finish(), they just add a possibility for rerouting.
Currently the transforms are called from ip_local_deliver_finish() and
in transport mode the decapsulated packet continues its path in
ip_local_deliver_finish(), with my patches they will go through two
netfilter hooks and continue the exact same codepath, given that
they are not NATed to a foreign destination IP on their way.


And this can be said about IPv6 input path. If packets have not been mangled
(this is ordinary case because ip6tables doesn't have neither NAT nor
target module to mangle addresses directly), they don't have to re-route
and don't have to re-visit ip6_input_finish().

In the other way, if their addresses have been mangled, it's necessary to
re-route. I agree re-visiting ip6_input_finish() in this case.


Same goes for ip6_input_finish as for ip_local_deliver_finish(),
the packet would continue its path there anyway. Do you actually
mean ip6_rcv_finish()?


Then, why don't we make xfrm{4,6}_rcv_spi() return 0 (1 if IPv6)
so that ip_local_deliver_finish()/ip6_input_finish() can continue to process
headers if packets have not been mangled ? Is this difficult or impossible to
implement ?


I'm not sure I understand. Do you propose to check if the packet was
mangled after the PRE_ROUTING hook (if it was not stolen or queued)
and if not return directly to ip6_input_finish()? Where would the
LOCAL_IN hook be called?


This can solve the issue about twice processing of statistics and IPv6
extension headers. Because ip_local_deliver_finish()/ip6_input_finish() can
continue to process extension headers after ESP/AH in ordinary case.


AFAICT statistics are not affected by my patches, except for the
iptables counters. The double parsing of extension headers is indeed
a problem I forgot about, it looks like we need to carry nhoff around
if it can't be derived from the packet.

Regards
Patrick

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

2005-11-20 Thread Yasuyuki KOZAKAI

Hi, Patrick,

From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Sun, 20 Nov 2005 17:31:36 +0100

> [IPV4/6]: Netfilter IPsec input hooks
> 
> When the innermost transform uses transport mode the decapsulated packet
> is not visible to netfilter. Pass the packet through the PRE_ROUTING and
> LOCAL_IN hooks again before handing it to upper layer protocols to make
> netfilter-visibility symetrical to the output path.
> 
> Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

At first, now I could agree to use same name for hooks before/after xfrm
processing, if it's important to keep compatibility than to avoid difficulty
to use. Even now I think it's confusing to pass packets before/after xfrm to
same hook, and believe it's ideal to use different name for them.
But people can configure correctly according to you, then my concern might be
needless fear.

Next is about re-visiting stack, I'm beginning to understand your patch.
It looks natural to re-route after DNAT on input path of IPv4. That's really
needed if packets have been mangled.

But is there any reason that we have to allow packets to re-visit
ip_local_deliver_finish() in the case that they have not been mangled
at PRE_ROUTING ? I think no. This situation is like ip_nat_out().

And this can be said about IPv6 input path. If packets have not been mangled
(this is ordinary case because ip6tables doesn't have neither NAT nor
target module to mangle addresses directly), they don't have to re-route
and don't have to re-visit ip6_input_finish().

In the other way, if their addresses have been mangled, it's necessary to
re-route. I agree re-visiting ip6_input_finish() in this case.

Then, why don't we make xfrm{4,6}_rcv_spi() return 0 (1 if IPv6)
so that ip_local_deliver_finish()/ip6_input_finish() can continue to process
headers if packets have not been mangled ? Is this difficult or impossible to
implement ?

This can solve the issue about twice processing of statistics and IPv6
extension headers. Because ip_local_deliver_finish()/ip6_input_finish() can
continue to process extension headers after ESP/AH in ordinary case.

In special case, if some codes mangle IPv6 addresses, that's the codes
to take care of the possibility of re-visiting ip6_input_finish().

What do you think ?

# Please note that these are just my opinions and other USAGI guys might have
# other opinions.

Regards,

-- Yasuyuki Kozakai
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

Re: [PATCH 06/13]: [IPV4/6]: Netfilter IPsec input hooks

19 matches

Site Navigation

Mail list logo

Footer information