Re: [Swan] IPsec PFP support on linux

2017-05-02 Thread Sowmini Varadhan
On (05/02/17 09:58), Paul Wouters wrote:
>I think you want to use Opportunistic IPsec, eg see 
>https://libreswan.org/wiki/HOWTO:_Opportunistic_IPsec

I dont think that what I want is opportunistic ipsec..

taking an extreme example, I can set up the ipsec tunnels with 
esp-null for *.5001 and 5001.* (and that's how I'm able to
trigger quick mode in my experiment).

But I want each 5-tuple to/from 5001 to get its own SPI. That
is not happening in my case. 

If I used OE, I would *never* get a per-5-tuple SPI, right?
(I can give this a try later today but the rfc definition of OE is
not what I have in mind- I really want PFP, not OE).

>Note that IKEv2 also allows you to define one connection and instantiate
> a connection based on the trigger packet whose src/dst proto/port are
>included in the IKEv2 packet as traffic selectors. See RFC7296 and
>"narrowing".

yes, the rfc's permit this, I'm not sure how to set up my 
SPD for this though.




IPsec PFP support on linux

2017-05-02 Thread Sowmini Varadhan
I have a question about linux support for IPsec PFP (as defined in
rfc 4301). I am assuming this exists, and is accessible from uspace,
in which case I need some hints on how to set it up.

Assuming I have a server listening at port 5001 that I want to
secure via ipsec. Suppose I want to make sure that each TCP/UDP 5-tuple
sending packets to port 5001 gets its own SA.

RFC4301 has this:

   - SPD-S: For traffic that is to be protected using IPsec, the
 entry consists of the values of the selectors that apply to the
 traffic to be protected via AH or ESP, controls on how to
 create SAs based on these selectors, ...

and further down
  If IPsec processing is specified for
  an entry, a "populate from packet" (PFP) flag may be asserted for
  one or more of the selectors in the SPD entry (Local IP address;
  Remote IP address; Next Layer Protocol; and, depending on Next
  Layer Protocol, Local port and Remote port, or ICMP type/code, or
  Mobility Header type).  If asserted for a given selector X, the
  flag indicates that the SA to be created should take its value for
  X from the value in the packet.  Otherwise, the SA should take its
  value(s) for X from the value(s) in the SPD entry.

A google search produces a discarded patch
  http://marc.info/?l=linux-netdev=119746758904140
but its not clear to me how to set this up (if PFP works fine,
as suggested by Herbert's response above)

I tried experimenting with IP_XFRM_POLICY from a simple udp client but
(a) that seems to require a SPI and reqid to set up the SPD 
(b) I see the SADB_ACQUIRE upcall being triggered after the local port
is bound (and SADB entry is set up for the lport).  But ike phase2
does not converge for the lport specific sadb added
by the bind (even in quick mode)

My understanding is that pluto shoud be generating spi's to make sure
they are sufficiently unique/random etc. so (a) makes me think I'm
either not setting this up or not using this correctly.

Any hints/sample code/RTFMs would be helpful (documentation for
IP_XFRM_POLICY seems scant, afaict). I'd be happy to share my 
udp client program, if it can provide more context to my question.

--Sowmini



Re: crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex

2017-03-15 Thread Sowmini Varadhan
On (03/15/17 10:08), Dmitry Vyukov wrote:
> After I've applied the patch these reports stopped to happen, and I
> have not seem any other reports that look relevant.
> However, it there was one, but it looks like a different issue and it
> was probably masked by massive amounts of original deadlock reports:

Yes, this looks like a valid deadlock.

I think there may be some ->dumpit callbacks that take the rtnl_lock
and do not unlock it before return, e.g., I see nl80211_dump_interface()
doing this at 

   2612 rtnl_lock();
   2613 if (!cb->args[2]) {
 :
   2619 ret = nl80211_dump_wiphy_parse(skb, cb, );
   2620 if (ret)
   2621 return ret;

afaict, nl80211_dump_wiphy_parse does not itself do rtnl_unlock on error.


If that's the case then we'd run into the circular locking dependancy
flagged by lockdep. 

Disclaimer: I did not check every single ->dumpit, there may be more
than one of these..






Re: crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex

2017-03-14 Thread Sowmini Varadhan
On (03/14/17 09:14), Dmitry Vyukov wrote:
> Another one now involving rds_tcp_listen_stop
   :
> kworker/u4:1/19 is trying to acquire lock:
>  (sk_lock-AF_INET){+.+.+.}, at: [] lock_sock
> include/net/sock.h:1460 [inline]
>  (sk_lock-AF_INET){+.+.+.}, at: []
> rds_tcp_listen_stop+0x5c/0x150 net/rds/tcp_listen.c:288
> 
> but task is already holding lock:
>  (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0x17/0x20
> net/core/rtnetlink.c:70

Is this also a false positive?

genl_lock_dumpit takes the genl_lock and then waits on the rtnl_lock
(e.g., out of tipc_nl_bearer_dump).

netdev_run_todo takes the rtnl_lock and then wants lock_sock()
for the TCP/IPv4 socket. 

Why is lockdep seeing a circular dependancy here? Same pattern
seems to be happening  for 
  http://www.spinics.net/lists/netdev/msg423368.html
and maybe also http://www.spinics.net/lists/netdev/msg423323.html?

--Sowmini

> Chain exists of:
>   sk_lock-AF_INET --> genl_mutex --> rtnl_mutex
> 
>  Possible unsafe locking scenario:
> 
>CPU0CPU1
>
>   lock(rtnl_mutex);
>lock(genl_mutex);
>lock(rtnl_mutex);
>   lock(sk_lock-AF_INET);
> 
>  *** DEADLOCK ***
> 
> 4 locks held by kworker/u4:1/19:
>  #0:  ("%s""netns"){.+.+.+}, at: []
> __write_once_size include/linux/compiler.h:283 [inline]
>  #0:  ("%s""netns"){.+.+.+}, at: [] atomic64_set
> arch/x86/include/asm/atomic64_64.h:33 [inline]
>  #0:  ("%s""netns"){.+.+.+}, at: [] atomic_long_set
> include/asm-generic/atomic-long.h:56 [inline]
>  #0:  ("%s""netns"){.+.+.+}, at: [] set_work_data
> kernel/workqueue.c:617 [inline]
>  #0:  ("%s""netns"){.+.+.+}, at: []
> set_work_pool_and_clear_pending kernel/workqueue.c:644 [inline]
>  #0:  ("%s""netns"){.+.+.+}, at: []
> process_one_work+0xab3/0x1c10 kernel/workqueue.c:2089
>  #1:  (net_cleanup_work){+.+.+.}, at: []
> process_one_work+0xb07/0x1c10 kernel/workqueue.c:2093
>  #2:  (net_mutex){+.+.+.}, at: []
> cleanup_net+0x22b/0xa90 net/core/net_namespace.c:429
>  #3:  (rtnl_mutex){+.+.+.}, at: []
> rtnl_lock+0x17/0x20 net/core/rtnetlink.c:70
> 


Re: Git bisected regression for ipsec/aead

2016-08-27 Thread Sowmini Varadhan
On (08/25/16 16:49), Herbert Xu wrote:
> 
> On Fri, Aug 19, 2016 at 03:21:24PM -0400, Sowmini Varadhan wrote:
> >7271b33cb87e80f3a416fb031ad3ca87f0bea80a is the first bad commit

> This bisection doesn't make much sense as this patch just causes
> cryptd to be used a little more more frequently.  But it does
> point the finger at cryptd.

On additional testing, I think this might be related to some
subtle race/timing issue so that git-bisect may not necessarily
be able to pin-point the correct bad-commit: if I add a few 
printks in other parts of the IPsec stack (and change the timing), 
the problem does not reproduce. Let me try to collect more data
on this. 

Meanwhile, if you can see some bug in the commit above, then
it probably makes sense to fix it upstream anyway.

--Sowmini

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git bisected regression for ipsec/aead

2016-08-25 Thread Sowmini Varadhan
On (08/25/16 16:49), Herbert Xu wrote:
> This bisection doesn't make much sense as this patch just causes
> cryptd to be used a little more more frequently.  But it does
> point the finger at cryptd.
:
> So we have list corruption here, possibly caused by use-after-free.
> I did spot one bug in the AEAD cryptd path where we end up using
> the wrong tfm to track refcnt.
> 
> Does this patch help?

Unfortunately no, I still see the panic.

--Sowmini

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Git bisected regression for ipsec/aead

2016-08-19 Thread Sowmini Varadhan

Hi Herbert,

In the process of testing ipsec I ran into panics (details below)
with  the algorithm
"aead rfc4106(gcm(aes)) 0x1234567890123456789012345678901234567890 64"

git-bisect analyzed this down to

   7271b33cb87e80f3a416fb031ad3ca87f0bea80a is the first bad commit
   commit 7271b33cb87e80f3a416fb031ad3ca87f0bea80a
   Author: Herbert Xu 
   Date:   Tue Jun 21 16:55:16 2016 +0800

crypto: ghash-clmulni - Fix cryptd reordering
 :

Could you please take a look? here are additional details:

To reproduce the panic, I set up ipsec as follows, on 2 test machines

  # #set up laddr to be local interface address, faddr as peer's addres.
  # ip x p add dir out src $laddr dst $faddr proto tcp \
   tmpl proto esp src $laddr dst $faddr spi 0x0001 \
   mode transport reqid 1
  # ip x p add dir in src $laddr dst $faddr proto tcp \
   tmpl proto esp dst $laddr src $faddr spi 0x0001 \
   mode transport reqid 1
  # ip x s add proto esp src $laddr dst $faddr spi 0x0001 \
   mode transport reqid 1 replay-window 32 \
   aead 'rfc4106(gcm(aes))' 0x1234567890123456789012345678901234567890 64 \
   sel src $laddr dst $faddr proto tcp
  # ip x s add proto esp dst $laddr src $faddr spi 0x0001 \
   mode transport reqid 1 replay-window 32 \
   aead 'rfc4106(gcm(aes))' 0x1234567890123456789012345678901234567890 64 \
   sel src $laddr dst $faddr proto tcp

Then run iperf i.e., start "iperf -s" on one node (server), and 
"iperf -c $faddr -P 1" on the on the other (client). The client will
panic with something like this in the dmesg:

[  124.627594] BUG: unable to handle kernel paging request at 000100c5
[  124.627612] [ cut here ]
[  124.627620] WARNING: CPU: 3 PID: 0 at lib/list_debug.c:62 __list_del_entry+0x
86/0xd0
[  124.627621] list_del corruption. next->prev should be 88085cebd168, but w
as ff8d
[  124.627622] Modules linked in:
   :
   :
[  124.627650] CPU: 3 PID: 0 Comm: swapper/3 Tainted: GE   
4.7.0-rc1-ipsec-offload-api2+ #15
[  124.627651] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS 
GRNDSDP1.86B.0046.R00.1502111331 02/11/2015
[  124.627666]  [] dump_stack+0x51/0x78
[  124.627667]  [] ? __list_del_entry+0x86/0xd0
[  124.627673]  [] __warn+0xfd/0x120
[  124.627676]  [] warn_slowpath_fmt+0x49/0x50
[  124.627677]  [] __list_del_entry+0x86/0xd0
[  124.627683]  [] detach_tasks+0x1ab/0x280
[  124.627685]  [] load_balance+0x32b/0x860
[  124.627691]  [] ? enqueue_hrtimer+0x49/0xa0
[  124.627693]  [] ? run_timer_softirq+0x4c/0x300
[  124.627695]  [] rebalance_domains+0x144/0x290
[  124.627696]  [] run_rebalance_domains+0x49/0x60
[  124.627701]  [] __do_softirq+0xeb/0x2d8
[  124.627703]  [] ? hrtimer_interrupt+0xb8/0x170
[  124.627706]  [] irq_exit+0xa5/0xb0
[  124.627708]  [] smp_apic_timer_interrupt+0x46/0x60
[  124.627709]  [] apic_timer_interrupt+0x7f/0x90
[  124.627709]   
[  124.627716]  [] ? cpuidle_enter_state+0xc9/0x2d0
[  124.627718]  [] ? cpuidle_enter_state+0xbb/0x2d0
[  124.627719]  [] ? menu_select+0x103/0x3a0
[  124.627721]  [] cpuidle_enter+0x17/0x20
[  124.627723]  [] call_cpuidle+0x2e/0x40
[  124.627724]  [] cpuidle_idle_call+0x68/0x100
[  124.627725]  [] cpu_idle_loop+0x155/0x240
[  124.627726]  [] cpu_startup_entry+0x21/0x30
[  124.627732]  [] start_secondary+0x73/0x80
[  124.627733] ---[ end trace d9352c1808e65391 ]---
[  124.640240] paging request
[  124.640557]  at 000100c5
 :
[  124.640809] IP: [] account_system_time+0x66/0x130
[  124.641146] PGD 85a8c3067 PUD 0 
[  124.641533] Thread overran stack, or stack corrupted
[  124.641795] Oops:  [#1] SMP
[  124.642049] Modules linked in: seqiv esp4 xfrm4_mode_transport 
sha256_generic drbg ansi_cprng ctr ghash_generic gf128mul ghash_clmulni_intel 
cryptd gcm autofs4 8021q garp stp llc sunrpc cpufreq_ondemand ipv6 iTCO_wdt 
iTCO_vendor_support pcspkr i40e i2c_i801 i2c_core sg lpc_ich mfd_core xhci_pci 
xhci_hcd ixgbe dca hwmon mdio hed wmi ipmi_si ipmi_msghandler acpi_cpufreq 
acpi_pad ext4(E) mbcache(E) jbd2(E) sd_mod(E) sr_mod(E) cdrom(E) ahci(E) 
libahci(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
[  124.647568] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRNDSDP1
.86B.0046.R00.1502111331 02/11/2015
[  124.648027] task: 88085f344100 ti: 88085f348000 task.ti: 880855cb
b2e0
[  124.648293] RIP: 0010:[]  [] 
account_system_time+0x66/0x130
[  124.648814] RSP: 0018:88087ec03d68  EFLAGS: 00010086
[  124.649075] RAX: 0001 RBX: 88085f344100 RCX: ff8d
[  124.649342] RDX: 0001 RSI: 0002 RDI: 
[  124.649609] RBP: 88087ec03d88 R08: 0001 R09: 880855cbb2a8
[  124.649877] R10:  R11: 0001 R12: 
[  124.650143] R13: 88085f32edd8 R14: 88087ec0fc80 R15: 001cdb6654dd
[  124.650409] FS:  () 

Re: [PATCH v2] crypto: hash - Fix page length clamping in hash walk

2016-05-04 Thread Sowmini Varadhan
On (05/04/16 12:08), Steffen Klassert wrote:
> > > Sowmini, could you please doublecheck with your test setup?
> > 
> > Don't bother, my patch was crap.  Here's one that's more likely
> > to work:
> 
> This one is much better, works here :)

The patch seems to work, but the caveat is that the original
bug (iperf segv) was not always reproducible on demand - it depended
on memory and other config. 

But looks like Steffen has a reliable way to reproduce the original
problem, so lets go with this patch for now.

--Sowmini

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] lib/mpi: Fix kernel unaligned access in mpi_write_to_sgl

2016-05-04 Thread Sowmini Varadhan
On (05/04/16 10:32), Herbert Xu wrote:
> 
> Please base it on cryptodev.
> 

Looks like this got fixed in cryptodev by commit cece762f6f3c 
("lib/mpi: mpi_write_sgl(): fix out-of-bounds stack access")

Thanks,
--Sowmini
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] lib/mpi: Fix kernel unaligned access in mpi_write_to_sgl

2016-05-03 Thread Sowmini Varadhan
On (05/03/16 16:12), Herbert Xu wrote:
> 
> Sorry, but your patch doesn't apply against the current tree at all.
> Please rebase it if it is still needed.

Hello,

I had based my patch off of net-next, which is where I do my work.

I'd be happy to rebase it on the "current tree", 
but given that mpicoder.c does not have an entry in MAINTAINERS, 
please clarify what you mean by "current tree" in this case.

do you mean

 git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
or
 git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git
or 
 git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

(which are the three possible candidates I can see in MAINTAINERS).

It would be nice to get this bug fixed, since the fix is fairly
obvious, and the nuisance factor from the generated "unaligned
access" messages on the impacted non-intel platforms is quite high, 

thanks,
--Sowmini
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] lib/mpi: Fix kernel unaligned access in mpi_write_to_sgl

2016-04-27 Thread Sowmini Varadhan

Commit 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers") added
mpi_write_to_sgl() which generates traps due to unaligned
access on some platforms like sparc. Fix this by using
the get_unaligned* and put_unaligned* functions.

Fixes: 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com>
---
v2: tadeusz.struk comments: Predicate on BYTES_PER_MPI_LIMB.

 lib/mpi/mpicoder.c |   21 +
 1 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index eb15e7d..b61eb6b 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include "mpi-internal.h"
+#include 
 
 #define MAX_EXTERN_MPI_BITS 16384
 
@@ -405,10 +406,22 @@ int mpi_write_to_sgl(MPI a, struct scatterlist *sgl, 
unsigned *nbytes,
p -= sizeof(alimb);
continue;
} else {
-   mpi_limb_t *limb1 = (void *)p - sizeof(alimb);
-   mpi_limb_t *limb2 = (void *)p - sizeof(alimb)
-   + lzeros;
-   *limb1 = *limb2;
+   mpi_limb_t tmp;
+#if BYTES_PER_MPI_LIMB == 4
+   tmp = get_unaligned_be32((void *)p -
+sizeof(alimb) +
+lzeros);
+   put_unaligned_be32(tmp, (void *)p -
+  sizeof(alimb));
+#elif BYTES_PER_MPI_LIMB == 8
+   tmp = get_unaligned_be64((void *)p -
+sizeof(alimb) +
+lzeros);
+   put_unaligned_be64(tmp, (void *)p -
+  sizeof(alimb));
+#else
+#error please implement for this limb size.
+#endif
p -= lzeros;
y = lzeros;
}
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


console noise after commit c1e9b3b0eea

2016-04-19 Thread Sowmini Varadhan

Hi Anatoly,

after commit c1e9b3b0eea1 ("hwrng: n2 - Attach on T5/M5, T7/M7 SPARC CPUs")
I get a *lot* of console noise on my T5-2, of the form:

n2rng f028f21c: Selftest failed on unit 0
n2rng f028f21c: Test buffer slot 0 [0x]
n2rng f028f21c: Test buffer slot 1 [0xe63f56d6a22eb116]
n2rng f028f21c: Test buffer slot 2 [0xe63f56d6a22eb116]
n2rng f028f21c: Test buffer slot 3 [0xe63f56d6a22eb116]
n2rng f028f21c: Test buffer slot 4 [0xe63f56d6a22eb116]
n2rng f028f21c: Test buffer slot 5 [0xe63f56d6a22eb116]
n2rng f028f21c: Test buffer slot 6 [0xe63f56d6a22eb116]
n2rng f028f21c: Test buffer slot 7 [0xe63f56d6a22eb116]

Why/when is your commit needed on my T5-2?

I'm not sure how this was tested, but if you need to revise it and test
on sparc, please let me know- I think it needs more work on sparc.

--Sowmini

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ipsec impact on performance

2015-12-08 Thread Sowmini Varadhan
On (12/08/15 12:32), Steffen Klassert wrote:
> 
> Would be nice if you could share the results. Comments are

Sure, not a problem. Give me some time though, I'm also looking
into the skb_cow_data and other memory-management issues that
were flagged on this thread. 

I'll have all this info by netdev, at the latest.

--Sowmini


--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ipsec impact on performance

2015-12-07 Thread Sowmini Varadhan
On (12/07/15 09:40), Steffen Klassert wrote:
> 
> I've pushed it to
> 
> https://git.kernel.org/cgit/linux/kernel/git/klassert/linux-stk.git/log/?h=net-next-ipsec-offload
> 
> It is just example code, nothing that I would show usually.
> But you asked for it, so here is it :)

that's fine, I dont expect more at this point, just want to 
test-drive it, and see how it compares to my approach. 

thanks!

> The GRO part seems to work well, the GSO part is just a hack at the 
> moment.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ipsec impact on performance

2015-12-03 Thread Sowmini Varadhan
On (12/03/15 09:45), Steffen Klassert wrote:
> pcrypt(echainiv(authenc(hmac(sha1-ssse3),cbc-aes-aesni)))
> 
> Result:
> 
> iperf -c 10.0.0.12 -t 60
> 
> Client connecting to 10.0.0.12, TCP port 5001
> TCP window size: 45.0 KByte (default)
> 
> [  3] local 192.168.0.12 port 39380 connected with 10.0.0.12 port 5001
> [ ID] Interval   Transfer Bandwidth
> [  3]  0.0-60.0 sec  32.8 GBytes  4.70 Gbits/sec
> 
> I provide more informatios as soon as the code is available.

that's pretty good compared to the baseline. 
I'd like to try out our patches, when they are ready.

I think you may get some more improvement if you manually pin the irq 
and iperf to specific cpus (at least that was my observation for transp 
mode) 

--Sowmini
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ipsec impact on performance

2015-12-03 Thread Sowmini Varadhan
On (12/03/15 14:33), David Miller wrote:
> 
> Doesn't skb_cow_data() contribute significantly to the ESP base cost,
> especially for TCP packets?

Indeed. For esp-null, it's about half of the total time spent
in esp_output (for one run that I just instrumented with perf 
tracepoints 2.5 ms compared to 5.8 ms)

It never goes into the slow path of skb_cow_data (the path
with comment about mincer fragments) because whether or not you
do this after GSO, TCP makes sure to fit within the MSS, so (unless you
have Jumbo enabled?) you'd send it something that does not have
a fraglist.

Was the cow_data call just intended to handle the fraglist case? 
Or is there something else more generic going on here (it's hard
to tell because esp_output doesnt have too many comments to explain
what it thinks its doing, as it juggles different len fields around)

> I mean, we're copying every TCP data frame.
> 
> If this is the case, even with GSO/whatever offloads, I expect that
> performance will be roughly halfed.

The other obvious "low-hanging fruit" is to address the TODO in the
comment above esp_alloc_tmp.

--Sowmini
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ipsec impact on performance

2015-12-02 Thread Sowmini Varadhan
On (12/02/15 12:41), David Laight wrote:
> You are getting 0.7 Gbps with ass-ccm-a-128, scale the esp-null back to
> that and it would use 7/18*71 = 27% of the cpu.
> So 69% of the cpu in the a-128 case is probably caused by the
> encryption itself.
> Even if the rest of the code cost nothing you'd not increase
> above 1Gbps.

Fortunately, the situation is not quite hopeless yet.

Thanks to Rick Jones for supplying the hints for this, but with
some careful manual pinning of irqs and iperf processes to cpus,
I can get to 4.5 Gbps for the esp-null case.

Given that the [clear traffic + GSO without GRO] gets me about 5-7 Gbps,
the 4.5 Gbps is not that far off (and at that point, the nickel-and-dime
tweaks may help even more).

For AES-GCM, I'm able to go from 1.8 Gbps (no GSO) to 2.8 Gbps.
Still not great, but proves that we haven't yet hit any upper bounds
yet.

I think a lot of the manual tweaking of irq/process placement
is needed because the existing rps/rfs flow steering is looking
for TCP/UDP flow numbers to do the steering. It can just as easily
use the IPsec SPI numbers to do this, and that's another place where
we can make this more ipsec-friendly.

--Sowmini

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ipsec impact on performance

2015-12-02 Thread Sowmini Varadhan
On (12/02/15 13:07), Tom Herbert wrote:
> That's easy enough to add to flow dissector, but is SPI really
> intended to be used an L4 entropy value? We would need to consider the

yes. To quote https://en.wikipedia.org/wiki/Security_Parameter_Index
"This works like port numbers in TCP and UDP connections. What it means
 is that there could be different SAs used to provide security to one
 connection. An SA could therefore act as a set of rules."

> effects of running multiple TCP connections over an IPsec. Also, you
> might want to try IPv6, the flow label should provide a good L4 hash
> for RPS/RFS, it would be interesting to see what the effects are with
> IPsec processing. (ESP/UDP could also if RSS/ECMP is critical)

IPv6 would be an interesting academic exercise, but it's going
to be a while before we get RDS-TCP to go over IPv6.

--Sowmini

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ipsec impact on performance

2015-12-02 Thread Sowmini Varadhan
On (12/02/15 14:01), Tom Herbert wrote:
> No, please don't persist is this myopic "we'll get to IPv6 later"
> model! IPv6 is a real protocol, it has significant deployment of the
> Internet, and there are now whole data centers that are IPv6 only
> (e.g. FB), and there are plenty of use cases of IPSEC/IPv6 that could
> benefit for performance improvements just as much IPv4. This vendor
> mentality that IPv6 is still not important simply doesn't help
> matters. :-(

Ok, I'll get you the numbers for this later, and sure, if we do
this, we should solve the ipv6 problem too.

BTW, the ipv6 nov3 paths have severe alignment issues. I flagged
this a long time ago http://www.spinics.net/lists/netdev/msg336257.html

I think all of it is triggered by mld. Someone needs to do
something about that too. I dont think those paths are using 
NET_ALIGN very well, and I dont think this is the most wholesome
thing for perf.

--Sowmini
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ipsec impact on performance

2015-12-02 Thread Sowmini Varadhan
On (12/02/15 07:53), Steffen Klassert wrote:
> 
> I'm currently working on a GRO/GSO codepath for IPsec too. The GRO part
> works already. I decapsulate/decrypt the packets on layer2 with a esp GRO
> callback function and reinject them into napi_gro_receive(). So in case
> the decapsulated packet is TCP, GRO can aggregate big packets.

Would you be able to share your patch with me? I'd like to give that a try
just to get preliminary numbers (and I could massage it as needed
for transport mode too).

> My approach to GSO is a bit different to yours. I focused on tunnel mode,
> but transport mode should work too. I encapsulate the big GSO packets
> but don't do the encryption. Then I've added a esp_gso_segment() function,
> so the (still not encrypted ESP packets) get segmented with GSO. Finally I
> do encryption for all segments. This works well as long as I do sync crypto.
> The hard part is when crypto returns async. This is what I'm working on now.
> I hope to get this ready during the next weeks that I can post a RFC version
> and some numbers.

I see. My thought for attacking tunnel mode would have been to 
callout the esp code at the tail of gre_gso_segment, but I did not
yet consider this carefully - clearly you've spent more time on it,
and know more about all the gotchas there.

> Also I tried to consider the IPsec GRO/GSO codepath as a software fallback.
> So I added hooks for the encapsulation, encryption etc. If a NIC can do
> IPsec, it can use this hooks to prepare the packets the way it needs it.
> There are NICs that can do IPsec, it's just that our stack does not support
> it.

yes, this is one of the things I wanted to bring up at netdev 1.1.
Evidently many of the 10G NICS (Niantic, Twinville, Sageville) already
support ipsec offload but that feature is not enabled for BSD or linux
because the stack does not support it (though Microsoft does. The intel
folks pointed me at this doc:
https://msdn.microsoft.com/en-us/library/windows/hardware/ff556996%28v=vs.85%29.aspx)

But quite independant of h/w offload, the s/w stack can already do
a very good job for 10G with just GSO and GRO, so being able to extend
that path to do encryption after segmentation should at least bridge
the huge gap between the ipsec and non-ipsec mech.

And that gap should be as small as possible for esp-null, so that
the only big hit we take is for the complexity of encryption itself!

> Another thing, I thought about setting up an IPsec BoF/workshop at
> netdev1.1. My main topic is GRO/GSO for IPsec. I'll send out a mail
> to the list later this week to see if there is enough interest and
> maybe some additional topics.

Sounds like an excellent idea. I'm certainly interested.

--Sowmini
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ipsec impact on performance

2015-12-02 Thread Sowmini Varadhan
On (12/02/15 11:56), David Laight wrote:
> > Gbps  peak cpu util
> > esp-null 1.8   71%
> > aes-gcm-c-2561.6   79%
> > aes-ccm-a-1280.7   96%
> > 
> > That trend made me think that if we can get esp-null to be as close
> > as possible to GSO/GRO, the rest will follow closely behind.
> 
> That's not how I read those figures.
> They imply to me that there is a massive cost for the actual encryption
> (particularly for aes-ccm-a-128) - so whatever you do to the esp-null
> case won't help.

I'm not a crypto expert, but my understanding is that the CCM mode
is the "older" encryption algorithm, and GCM is the way of the future.
Plus, I think the GCM mode has some type of h/w support (hence the
lower cpu util)

I'm sure that crypto has a cost, not disputing that, but my point
was that 1.8 -> 1.6 -> 0.7 is a curve with a much gentler slope than
the 9 Gbps (clear traffic, GSO, GRO) 
-> 4 Gbps (clear, no gro, gso) 
   -> 1.8 (esp-null)
That steeper slope smells of s/w perf that we need to resolve first,
before getting into the work of faster crypto?

> One way to get a view of the cost of the encryption (and copies)
> is to do the operation twice.

I could also just instrument it with perf tracepoints, if that 
data is interesting

--Sowmini


--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ipsec impact on performance

2015-12-02 Thread Sowmini Varadhan
On (12/02/15 12:41), David Laight wrote:
> 
> Also what/how are you measuring cpu use.
> I'm not sure anything on Linux gives you a truly accurate value
> when processes are running for very short periods.

I was using mpstat, while running iperf. Should I be using
something else? or running it for longer intervals?

but I hope we are not doomed at 1 Gbps, or else security itself would
come at a very unattractive cost. Anyway, even aside from crypto.
we need to have some way to add TCP options (that depend on the 
contents of the tcp header) etc post-GSO, in the interest of not 
ossifying the stack.

> On an SMP system you also get big effects when work is switched
> between cpus. I've got some tests that run a lot faster if I
> put all but one of the cpus into a busy-loop in userspace
> (eg: while :; do :; done)!

yes Rick Jones also pointed the same thing to me, and one of the
things I was going to try out later today is to instrument the
effects of pinning irqs and iperf threads to a specific cpu.

--Sowmini


--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ipsec impact on performance

2015-12-01 Thread Sowmini Varadhan
On (12/01/15 16:56), David Ahern wrote:
> 
> Using iperf3 and AH with NULL algorithm between 2 peers connected by
> a 10G link.
> 
I'm using esp-null, not AH, and iperf2, which I understand is 
quite different from, and more aggressive than, iperf3 (though I'm not 
sure that it matters for this single-stream case).

> With AH I get ~1.5 Gbps with MTU at 1500:

But yes, I get approx that too. 

The "good" news is that I can get about 3 Gbps with my patch. So one
could say that I've 2x-ed the perf. Except that:

The "bad" news is that even GSO/GRO can do way better, so we
need to be able to extend that perf to also be available 
to some key TCP and IP extensions (like md5 and ipsec, maybe)
and beyond (i.e need to de-ossify the stack so we can extend
TCP/IP features without sacrificing perf along the way).

The not-so-great news is that I see that just adding perf tracepoints
(not even enabling them!) seems to make a small diff (3 Gbps vs 3.2 Gbps) 
to my numbers. Is that mere standard-deviation, or something 
one should be aware of, about perf?

> iperf3 runs about 60% CPU and ksoftirqd/2 is at 86%.

yes, not surprising. You really need to compare this to GSO/GRO
for a pure-s/w,  apples-apples comparison.

> Bumping the MTU to 9000:

Yes that's not always an option. See also the comments from Eric/Rick
about latency [http://lists.openwall.net/netdev/2015/11/24/111]. 

--Sowmini
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


ipsec impact on performance

2015-12-01 Thread Sowmini Varadhan

I instrumented iperf with and without ipsec, just using esp-null, 
and 1 thread, to keep things simple. I'm seeing some pretty dismal 
performance numbers with ipsec, and trying to think of ways to
improve this. Here are my findings, please share feedback.

I suspect that a big part of the problem is the implicit loss of GSO,
and this is made worse by some inefficiencies in the xfrm code:
for single stream iperf (to avoid effects of rx-hash), I see the
following on a 10G p2p ethernet link.
 8.5-9.5 Gbps clear traffic, TSO disabled, so GSO, GRO is in effect
 3-4 Gbps clear traffic, with both TSO/GSO disabled
 1.8-2 Gbps for esp-null.
So the above numbers suggest that losing TSO/GSO results in one
big drop in performance, and then there's another cliff for the 
clear -> esp-null transition. And those cliffs apply even if you are
merely doing TCP-MD5 or AO for basic protection of the TCP connection.

I tried moving things about a bit to defer the ipsec after GSO - I'll 
share my experimental patch as an RFC in a separate thread. (Disclaimer:
the patch is just an experiment at this point).

In that patch, I'm only focussing on esp-null and transp-mode ipsec
for now, just to get some basic performance numbers to see if this is 
at all interesting.  Essentially my hack mainly involves the following

- don't disable TSO in sk_setup_caps() if a dst->header_len is found
- in xfrm4_output, if GSO is applicable, bail out without esp header 
  addition - that will get done after skb_segment()
- at the end of tcp_gso_segment() (when tcp segment is available),
  set things up for xfrm_output_one and trigger the esp_output..
  I have to be very careful about setting up skb pointers here, since
  it looks like esp_output overloads the mac_header pointer e.g., for
  setting up the ip protocol field 

If I do all these things, the ipsec+iperf improves slightly- for
esp-null, I move from approx 1.8 Gbps  to about 3 Gbps, but clearly,
this is still quite far from the 8 - 9 Gbps that I can get with just
GSO+GRO for non-ipsec traffic.

There are some inefficiencies that I can see in the xfrm code,
that I am inheriting in my patch, e.g.,:
  memory management in the xfrm code has room for improvement. Every
  pass through xfrm_transport_output ends up doing a (avoidable?) memmove,
  and each pass through esp_output ends up doing a kmalloc/free of the
  "tmp" buffer. 
But these are all still relatively small things - tweaking them 
doesnt get me significantly past the 3 Gbps limit. Any suggestions
on how to make this budge (or design criticism of the patch) would
be welcome.

--Sowmini

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ipsec impact on performance

2015-12-01 Thread Sowmini Varadhan
On (12/01/15 10:17), Rick Jones wrote:
> 
> What do the perf profiles show?  Presumably, loss of TSO/GSO means
> an increase in the per-packet costs, but if the ipsec path
> significantly increases the per-byte costs...

For ESP-null, there's actually very little work to do - we just
need to add the 8 byte ESP header with an spi and a seq#.. no
crypto work to do.. so the overhead *should* be minimal, else
we've painted ourself into a corner where we can't touch anything
including TCP options like md5.

perf profiles: I used perf tracepoints to instrument latency.
Yes, there is function call overhead for the xfrm path. So, for example,
the stack ends up being like this:
  :
  e5d2f2 ip_finish_output ([kerne.kallsyms])
  75d6d0 ip_output ([kernel.kallsyms])
  7c08ad xfrm_output_resume ([kernel.kallsyms])
  7c0aae xfrm_output ([kernel.kallsyms])
  7b1bdd xfrm4_output_finish ([kernel.kallsyms])
  7b1c7e __xfrm4_output ([kernel.kallsyms])
  7b1dbe xfrm4_output ([kernel.kallsyms])
  75bac4 ip_local_out ([kernel.kallsyms])
  75c012 ip_queue_xmit ([kernel.kallsyms])
  7736a3 tcp_transmit_skb ([kernel.kallsyms])
  :
where the detour into xfrm has been indented out, and esp_output
gets called out of xfrm_output_resume(). And as I said, there's
some nickels-and-dimes of perf to be squeezed out from 
better memory management in xfrm, but the fact that it doesnt move
beyond 3 Gbps strikes me as some other bottleneck/serialization.

> Short of a perf profile, I suppose one way to probe for per-packet
> versus per-byte would be to up the MTU.  That should reduce the
> per-packet costs while keeping the per-byte roughly the same.

actually the hack/rfc I sent out does help (in that it almost
doubles the existing 1.8 Gbps). Problem is that this cliff is much
steeper than that, and there's more hidden somewhere.

--Sowmini
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ipsec impact on performance

2015-12-01 Thread Sowmini Varadhan
On (12/01/15 10:18), Tom Herbert wrote:
> >  8.5-9.5 Gbps clear traffic, TSO disabled, so GSO, GRO is in effect
> >  3-4 Gbps clear traffic, with both TSO/GSO disabled
> >  1.8-2 Gbps for esp-null.
> 
> Are you losing checksum offload also?

I tried with both checksum offload on and off.
For the GSO case, doesnt make a huge difference to perf. 
For my patch, I disable h/w cksum offload, so that I can leverage
from the existing cksum calculations in the existing GSO code. That helps
a bit (goes from 3 Gbps -> 3.2 Gbps, but I need a 2x jump here)


> Thanks for the nice data! We could certainly implement GRO/GSO for
> esp-null to get your numbers up but I don't think that would be very
> useful to anyone. Do you have the performance numbers using real
> encryption?

I was using esp-null merely to not have the crypto itself perturb
the numbers (i.e., just focus on the s/w overhead for now), but here
are the numbers for the stock linux kernel stack
Gbps  peak cpu util
esp-null 1.8   71%
aes-gcm-c-2561.6   79%
aes-ccm-a-1280.7   96%

That trend made me think that if we can get esp-null to be as close
as possible to GSO/GRO, the rest will follow closely behind.

So is my patch in the right direction? Anything obvious I am missing?
I'd like to budge that number beyond 3 Gbps :-)

--Sowmini
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ipsec impact on performance

2015-12-01 Thread Sowmini Varadhan
On (12/01/15 10:50), Rick Jones wrote:
> 
> Something of a longshot, but are you sure you are still getting
> effective CKO/GRO on the receiver?

Good question. With ipsec, GRO (like GSO) gets implicitly disabled.

But when I explictly disable GRO on receiver, leaving only GSO
on sender, I can still get about 6 Gbps for clear traffic.

Thus if I could get closer to 6 Gbps for my patch, I'd at least
know that it was just GRO that was missing.. but I'm only getting 
3 Gbps, which makes me think I missed something else on the sender
itself.

--Sowmini

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 2/2] Crypto kernel tls socket

2015-11-23 Thread Sowmini Varadhan
On (11/23/15 09:43), Dave Watson wrote:
> Currently gcm(aes) represents ~80% of our SSL connections.
> 
> Userspace interface:
> 
> 1) A transform and op socket are created using the userspace crypto interface
> 2) Setsockopt ALG_SET_AUTHSIZE is called
> 3) Setsockopt ALG_SET_KEY is called twice, since we need both send/recv keys
> 4) ALG_SET_IV cmsgs are sent twice, since we need both send/recv IVs.
>To support userspace heartbeats, changeciphersuite, etc, we would also need
>to get these back out, use them, then reset them via CMSG.
> 5) ALG_SET_OP cmsg is overloaded to mean FD to read/write from.

[from patch 0/2:]
> If a non application-data TLS record is seen, it is left on the TCP
> socket and an error is returned on the ALG socket, and the record is
> left for userspace to manage.

Interesting approach.

FWIW, I was hoping to discuss solutions for securing traffic tunnelled
over L3 at netdev 1.1, so hopefully we'll be able to go over the
trade-offs there. 

I'm trying to see how your approach would fit with the RDS-type of
use-case. RDS-TCP is mostly similar in concept to kcm,
except that rds has its own header for multiplexing, and has no 
dependancy on BPF for basic things like re-assembling the datagram. 
If I were to try to use this for RDS-TCP, the tls_tcp_read_sock() logic
would be merged into the recv_actor callback for RDS, right?  Thus tls
control-plane message could be seen in the middle of the
data-stream, so we really have to freeze the processing of the data
stream till the control-plane message is processed?

I'm concerned about the possiblilites for async that can happen when
we separate the data-plane from the control-plane (uspace tls
does not have to deal with this), but we now have control plane
separated from data-plane. (And IPsec/IKE has plenty of headaches
from this sort of thing already)

In the tls.c example that you have, the opfd is generated from
the accept() on the AF_ALG socket- how would this work if I wanted
my opfd to be a PF_RDS or a PF_KCM or similar?

One concern is that this patchset provides a solution for the "80%"
case but what about the other 20% (and the non x86 platforms)?
E.g., if I get a cipher-suite request outside the aes-ni, what would
happen (punt to uspace?)

--Sowmini

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 2/2] Crypto kernel tls socket

2015-11-23 Thread Sowmini Varadhan
On (11/23/15 13:43), Dave Watson wrote:
> 
> For kcm, opfd is the fd you would pass along in kcm_attach.
> For rds, it looks like you'd want to use opfd as the sock instead of
> the new one created by sock_create_kern in rds_tcp_conn_connect.

I see.

It's something to consider, and it would certainly secure the
RDS header and app data, but TLS by itself may not be
enough- we'd need to protect the TCP control plane as well, and 
at the moment, I'm finding that even using esp-null (or AO, or MD5,
for that matter) means that I lose GSO, and perf tanks. I'll try to
put all my data together for this for netdev 1.1.


> > E.g., if I get a cipher-suite request outside the aes-ni, what would
> > happen (punt to uspace?)
> >
> > --Sowmini
> 
> Right, bind() would fail and you would fallback to uspace.

That's the approach that Solaris KSSL took, back in 1999. It quickly
became obsolete, again more details in netdev 1.1.

--Sowmini
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next 2/2] xfrm: Fix unaligned access in xfrm_notify_sa() for DELSA

2015-10-21 Thread Sowmini Varadhan
On (10/21/15 06:22), David Miller wrote:
> memcpy() _never_ works for avoiding unaligned accessed.
> 
> I repeat, no matter what you do, no matter what kinds of casts or
> fancy typing you use, memcpy() _never_ works for this purpose.
  :
> There is one and only one portable way to access unaligned data,
> and that is with the get_unaligned() and put_unaligned() helpers.

ok. I'll fix it up to use the *_unaligned functions and resend this 
out later today.

--Sowmini

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next 2/2] xfrm: Fix unaligned access in xfrm_notify_sa() for DELSA

2015-10-21 Thread Sowmini Varadhan
On (10/21/15 06:54), Sowmini Varadhan wrote:
> But __alignof__(*p) is 8 on sparc, and without the patch I get
> all types of unaligned access. So what do you suggest as the fix?

Even though the alignment is, in fact, 8 (and that comes from
struct xfrm_lifetime_cfg), if uspace is firmly attached to the 4 byte
alignment, I think we can retain that behavior and still avoid
unaligned access in the kernel with the following (admittedly ugly hack).
Can you please take a look? I tested it with 'ip x m' and a transport 
mode tunnel on my sparc.


diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 158ef4a..ca4e7f0 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -2620,7 +2620,7 @@ static inline size_t xfrm_sa_len(struct xfrm_state *x)
 static int xfrm_notify_sa(struct xfrm_state *x, const struct km_event *c)
 {
struct net *net = xs_net(x);
-   struct xfrm_usersa_info *p;
+   struct xfrm_usersa_info *p, tmp;
struct xfrm_usersa_id *id;
struct nlmsghdr *nlh;
struct sk_buff *skb;
@@ -2659,11 +2659,16 @@ static int xfrm_notify_sa(struct xfrm_state *x, const 
struct km_event *c)
if (attr == NULL)
goto out_free_skb;
 
-   p = PTR_ALIGN(nla_data(attr), __alignof__(*p));
+   p = nla_data(attr);
+   err = copy_to_user_state_extra(x, , skb);
+   if (err)
+   goto out_free_skb;
+   memcpy((u8 *)p, , sizeof(tmp));
+   } else {
+   err = copy_to_user_state_extra(x, p, skb);
+   if (err)
+   goto out_free_skb;
}
-   err = copy_to_user_state_extra(x, p, skb);
-   if (err)
-   goto out_free_skb;
 
nlmsg_end(skb, nlh);


--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next 2/2] xfrm: Fix unaligned access in xfrm_notify_sa() for DELSA

2015-10-21 Thread Sowmini Varadhan
On (10/21/15 08:57), Steffen Klassert wrote:
> > --- a/net/xfrm/xfrm_user.c
> > +++ b/net/xfrm/xfrm_user.c
> > @@ -2659,7 +2659,7 @@ static int xfrm_notify_sa(struct xfrm_state *x, const 
> > struct km_event *c)
> > if (attr == NULL)
> > goto out_free_skb;
> >  
> > -   p = nla_data(attr);
> > +   p = PTR_ALIGN(nla_data(attr), __alignof__(*p));
> 
> Hm, this breaks userspace notifications on 64-bit systems.
> Userspace expects this to be aligned to 4, with your patch
> it is aligned to 8 on 64-bit.
> 
> Without your patch I get the correct notification when deleting a SA:
> 

But __alignof__(*p) is 8 on sparc, and without the patch I get
all types of unaligned access. So what do you suggest as the fix?

(and openswan/pluto dont flag any errors with the patch, which is
a separate bug).

--Sowmini

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH net-next 2/2] xfrm: Fix unaligned access in xfrm_notify_sa() for DELSA

2015-10-19 Thread Sowmini Varadhan
On sparc, deleting established SAs (e.g., by restarting ipsec
at the peer) results in unaligned access messages via
xfrm_del_sa -> km_state_notify -> xfrm_send_state_notify().
Use an aligned pointer to xfrm_usersa_info for this case.

Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com>
---
 net/xfrm/xfrm_user.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index a8de9e3..158ef4a 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -2659,7 +2659,7 @@ static int xfrm_notify_sa(struct xfrm_state *x, const 
struct km_event *c)
if (attr == NULL)
goto out_free_skb;
 
-   p = nla_data(attr);
+   p = PTR_ALIGN(nla_data(attr), __alignof__(*p));
}
err = copy_to_user_state_extra(x, p, skb);
if (err)
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] xfrm/crypto: unaligned access fixes

2015-10-19 Thread Sowmini Varadhan
A two-part patchset that fixes some "unaligned access" warnings
that showed up my sparc test machines with ipsec set up.


Sowmini Varadhan (2):
  crypto/x509: Fix unaligned access in x509_get_sig_params()
  Fix unaligned access in xfrm_notify_sa() for DELSA

 crypto/asymmetric_keys/x509_public_key.c |5 +++--
 net/xfrm/xfrm_user.c |2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: unaligned access in pkcs7_verify

2015-10-13 Thread Sowmini Varadhan
On (10/12/15 21:32), Herbert Xu wrote:
> 
> .. pkcs7_verify definitely
> shouldn't place the structure after the digest without aligning the
> pointer.  So something like your patch is needed (but please use
> alignof instead of sizeof).  Also don't put in digest_size but
> instead align the pointer like
> 
>   desc = PTR_ALIGN(digest + digest_size, ...)

I tried the patch below for 24+ hours, and it ran without mishaps. 
But given that the panics I saw earlier (page faults typically
triggered  by openswan's pluto) occurred unpredictably, it would be 
useful to have an expert-review of this code.

Thanks,
--Sowmini

diff --git a/crypto/asymmetric_keys/pkcs7_verify.c 
b/crypto/asymmetric_keys/pkcs7_verify.c
index d20c0b4..958ac01 100644
--- a/crypto/asymmetric_keys/pkcs7_verify.c
+++ b/crypto/asymmetric_keys/pkcs7_verify.c
@@ -46,14 +46,15 @@ static int pkcs7_digest(struct pkcs7_message *pkcs7,
return (PTR_ERR(tfm) == -ENOENT) ? -ENOPKG : PTR_ERR(tfm);
 
desc_size = crypto_shash_descsize(tfm) + sizeof(*desc);
-   sinfo->sig.digest_size = digest_size = crypto_shash_digestsize(tfm);
-
+   sinfo->sig.digest_size = crypto_shash_digestsize(tfm);
+   digest_size = crypto_shash_digestsize(tfm) +
+   ALIGN(crypto_shash_digestsize(tfm), __alignof__(*desc));
ret = -ENOMEM;
digest = kzalloc(digest_size + desc_size, GFP_KERNEL);
if (!digest)
goto error_no_desc;
 
-   desc = digest + digest_size;
+   desc = PTR_ALIGN(digest + digest_size, __alignof__(*desc));
desc->tfm   = tfm;
desc->flags = CRYPTO_TFM_REQ_MAY_SLEEP;
 
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] crypto/pkcs7_verify: Fix unaligned access in pkcs7_verify()

2015-10-13 Thread Sowmini Varadhan

On sparc, we see unaligned access messages on each modprobe[-r]:

Kernel unaligned access at TPC[6ad9b4] pkcs7_verify [..]
Kernel unaligned access at TPC[6a5484] crypto_shash_finup [..]
Kernel unaligned access at TPC[6a5390] crypto_shash_update [..]
Kernel unaligned access at TPC[10150308] sha1_sparc64_update [..]
Kernel unaligned access at TPC[101501ac] __sha1_sparc64_update [..]

These ware triggered by mod_verify_sig() invocations of pkcs_verify(), and
are are being caused by an unaligned desc at (sha1, digest_size is 0x14)
desc = digest + digest_size;

To fix this, pkcs7_verify needs to make sure that desc is pointing
at an aligned value past the digest_size, and kzalloc appropriately,
taking alignment values into consideration.

Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com>
---
 crypto/asymmetric_keys/pkcs7_verify.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/crypto/asymmetric_keys/pkcs7_verify.c 
b/crypto/asymmetric_keys/pkcs7_verify.c
index d20c0b4..325575c 100644
--- a/crypto/asymmetric_keys/pkcs7_verify.c
+++ b/crypto/asymmetric_keys/pkcs7_verify.c
@@ -49,11 +49,12 @@ static int pkcs7_digest(struct pkcs7_message *pkcs7,
sinfo->sig.digest_size = digest_size = crypto_shash_digestsize(tfm);
 
ret = -ENOMEM;
-   digest = kzalloc(digest_size + desc_size, GFP_KERNEL);
+   digest = kzalloc(ALIGN(digest_size, __alignof__(*desc)) + desc_size,
+GFP_KERNEL);
if (!digest)
goto error_no_desc;
 
-   desc = digest + digest_size;
+   desc = PTR_ALIGN(digest + digest_size, __alignof__(*desc));
desc->tfm   = tfm;
desc->flags = CRYPTO_TFM_REQ_MAY_SLEEP;
 
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: unaligned access in pkcs7_verify

2015-10-12 Thread Sowmini Varadhan
On (10/12/15 21:32), Herbert Xu wrote:
> Thanks.  We have two bugs here.  First of all pkcs7_verify definitely
> shouldn't place the structure after the digest without aligning the
> pointer.  So something like your patch is needed (but please use
> alignof instead of sizeof).  Also don't put in digest_size but
> instead align the pointer like
> 
>   desc = PTR_ALIGN(digest + digest_size, ...)

That patch might not be rock-solid by itself though.  I was seeing
some panics/crashes when I was running with that patch, so I backed 
it off temporarily - should sinfo->sig.digest_size be set to the aligned
value?

> The sparc sha algorithms themselves need to declare the alignment
> that they require.  Currently they claim to be able to handle any
> alignment which appears to not be the case.

How would one do that correctly? I'm not a crypto expert, but I can
help test the patch..

--Sowmini
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: unaligned access in pkcs7_verify

2015-10-08 Thread Sowmini Varadhan
On (10/08/15 21:15), Herbert Xu wrote:
> > desc_size = crypto_shash_descsize(tfm) + sizeof(*desc);
> > -   sinfo->sig.digest_size = digest_size = crypto_shash_digestsize(tfm);
> > +   sinfo->sig.digest_size = digest_size = 
> > +   ALIGN(crypto_shash_digestsize(tfm), sizeof (*desc));
  :
> What hash algorithm were you using?

Algorithm is sha1. From printk, crypto_shash_descsize(tfm) comes out
to 0x60, digest_size to 0x14. Stack trace (for each modprobe [-r]) is 

pkcs7_verify+0x1d0/0x5e0
system_verify_data+0x54/0xb4
mod_verify_sig+0xa0/0xc4
load_module+0x48/0x16a0
SyS_init_module+0x114/0x128
linux_sparc_syscall+0x34/0x44

--Sowmini


--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


unaligned access in pkcs7_verify

2015-10-02 Thread Sowmini Varadhan

Hi,

I'm getting a lot of unaligned access messages each time I 
do "modprobe [-r] " on sparc:

Kernel unaligned access at TPC[6ad9b4] pkcs7_verify+0x1ec/0x5e0
Kernel unaligned access at TPC[6a5484] crypto_shash_finup+0xc/0x5c
Kernel unaligned access at TPC[6a5390] crypto_shash_update+0xc/0x54
Kernel unaligned access at TPC[10150308] sha1_sparc64_update+0x14/0x5c 
[sha1_sparc64]
Kernel unaligned access at TPC[101501ac] __sha1_sparc64_update+0xc/0x98 
[sha1_sparc64]

Looks like these are being caused by an unaligned desc at

desc = digest + digest_size;

Doing this:

--- a/crypto/asymmetric_keys/pkcs7_verify.c
+++ b/crypto/asymmetric_keys/pkcs7_verify.c
@@ -46,7 +46,8 @@ static int pkcs7_digest(struct pkcs7_message *pkcs7,
return (PTR_ERR(tfm) == -ENOENT) ? -ENOPKG : PTR_ERR(tfm);
 
desc_size = crypto_shash_descsize(tfm) + sizeof(*desc);
-   sinfo->sig.digest_size = digest_size = crypto_shash_digestsize(tfm);
+   sinfo->sig.digest_size = digest_size = 
+   ALIGN(crypto_shash_digestsize(tfm), sizeof (*desc));
 
ret = -ENOMEM;
digest = kzalloc(digest_size + desc_size, GFP_KERNEL);

makes the unaliagned message go away, but I dont know if
sinfo->sig.digest_size needs to be set to the (unaligned) raw 
value of crypto_shash_digestsize() itself, and how to verify that
this doesnt break something else in crypto

--Sowmini
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html