Re: Call for testers: FPU changes

2010-11-20 Thread Kostik Belousov
On Sat, Nov 20, 2010 at 01:30:54AM -0500, Mike Tancsa wrote:
 On 11/16/2010 4:43 AM, Kostik Belousov wrote:
  On Mon, Nov 15, 2010 at 10:42:50PM -0500, Mike Tancsa wrote:
  On 11/15/2010 4:13 PM, Kostik Belousov wrote:
 
  Patch is at
  http://people.freebsd.org/~kib/misc/releng_8_fpu.1.patch
 
 
 I did some more tests post commit today using the aesni kld taken
 directly from HEAD.  BTW, do you plan to MFC this as well ?
Sure, I will merge aesni(4), it was the only reason to work on the
kern_fpu in stable/8.

I want a pause between the KPI and driver MFCs, to ease the handling
of a possible mismerge or the fixing of latent bugs (since stable has a much
larger testing base than HEAD).
 
 Results at the bottom of http://www.tancsa.com/fpu.html
 
 It certainly makes a difference with geli. IPSEC and userland stuff, not
 so much. The CPU itself is crazy fast, so it's hard to see a difference
 in things like ssh and even ipsec didn't yield any differences.  For ssh
 and userland stuff I guess once there is an aesni userland engine, this
 would probably help over the cryptodev interface.

Yes, encoding/decoding small blocks has a large overhead from the loop
setup code.

Thank you.
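
A rough way to picture the small-block overhead mentioned above is a fixed
setup cost per request plus a per-byte cost. The sketch below is only a toy
model: the struct, the function names and the numbers used with it are
assumptions chosen for illustration, not measurements from aesni(4) or
padlock(4).

```c
/*
 * Toy cost model (an assumption for illustration, not a measurement):
 * each request pays a fixed setup cost plus a per-byte cost.
 */
struct cost_model {
	double setup_ns;	/* hypothetical fixed cost per request */
	double per_byte_ns;	/* hypothetical cost per byte processed */
};

/* Effective throughput in bytes per nanosecond for a given block size. */
static double
effective_throughput(const struct cost_model *m, double block_bytes)
{
	return (block_bytes / (m->setup_ns + m->per_byte_ns * block_bytes));
}

/* Fraction of the total request time that is spent in setup. */
static double
setup_fraction(const struct cost_model *m, double block_bytes)
{
	return (m->setup_ns / (m->setup_ns + m->per_byte_ns * block_bytes));
}
```

With a hypothetical 1000 ns setup cost and 1 ns per byte, setup accounts for
about 98% of the time on 16-byte blocks and about 11% on 8192-byte blocks.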




Re: Call for testers: FPU changes

2010-11-19 Thread Mike Tancsa
On 11/16/2010 4:43 AM, Kostik Belousov wrote:
 On Mon, Nov 15, 2010 at 10:42:50PM -0500, Mike Tancsa wrote:
 On 11/15/2010 4:13 PM, Kostik Belousov wrote:

 Patch is at
 http://people.freebsd.org/~kib/misc/releng_8_fpu.1.patch


I did some more tests post commit today using the aesni kld taken
directly from HEAD.  BTW, do you plan to MFC this as well ?

Results at the bottom of http://www.tancsa.com/fpu.html

It certainly makes a difference with geli. IPSEC and userland stuff, not
so much. The CPU itself is crazy fast, so it's hard to see a difference
in things like ssh and even ipsec didn't yield any differences.  For ssh
and userland stuff I guess once there is an aesni userland engine, this
would probably help over the cryptodev interface.

---Mike
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Call for testers: FPU changes

2010-11-18 Thread Mike Tancsa
On 11/17/2010 11:35 AM, Kostik Belousov wrote:
 
 Meantime, the similar change may be beneficial for padlock(4) too.
 If you are going to test it, please note that most likely, openssl padlock
 engine does not use padlock(4), I do not know for sure.


I did some more tests since someone said they had problems with geli,
ipsec and padlock. In the simple tests I did, I didn't find any
regressions or speed differences.  Info appended to

http://www.tancsa.com/fpu.html

I also compared to stock RELENG_8 and padlock and didn't find any issues.


---Mike



Re: Call for testers: FPU changes

2010-11-18 Thread Daryl Richards
Not sure if this is the kind of testing you were looking for; but I've
run both mprime and boinc/setiathome for the last two days without any
problem...

It's not a notebook so I can't test suspend/resume..

On 10-11-15 4:13 PM, Kostik Belousov wrote:
 Hello,
 this is a call for testers of the merge of fpu_kern_enter/leave(9)
 to RELENG_8. The changes are required to fix some issues with VIA
 padlock engine, and to actually merge aesni(4) to RELENG_8.
 
 Please look for possible FPU context handling regressions.
 Reports from the users of VIA padlock hardware are also needed.
 Any user who has suspend/resume magically working on the
 8 branch, please test that the patch does not make things
 worse.
 
 Please note that the pre-release freeze will start in 2 weeks, so 
 I need to get testing results relatively quickly to be in time for 8.2.
 
 Patch is at
 http://people.freebsd.org/~kib/misc/releng_8_fpu.1.patch
 
 Thanks in advance.

-- 
Daryl Richards
Isle Technical Services Inc.


Re: Call for testers: FPU changes

2010-11-17 Thread Kostik Belousov
On Tue, Nov 16, 2010 at 08:46:23PM -0500, Mike Tancsa wrote:
 On 11/16/2010 5:19 PM, Kostik Belousov wrote:
  Would your conclusion be that the patch seems to increase the throughput
  of the aesni(4) ?
  
  I think that on small-sized blocks, when using aesni(4), the dominating
  factor is the copyin/copyout of the data to/from the kernel address
  space. Still would be interesting to compare the full output
  of openssl speed on aesni(4) with and without the patch I posted.
 
 Hi,
   There does seem to be some improvement on large blocks.  But there are
 some freakishly fast times. On other sizes, there seems to be no
 difference in speed.
 
 I did 20 runs. Updated stats at http://www.tancsa.com/fpu.html

Thank you. Indeed, I think that the test units are too small, so that
random system events can cause the variation. Nonetheless, the patch seems
to help, so I committed it.

In the meantime, a similar change may be beneficial for padlock(4) too.
If you are going to test it, please note that most likely the openssl padlock
engine does not use padlock(4); I do not know for sure.

diff --git a/sys/crypto/via/padlock.c b/sys/crypto/via/padlock.c
index 77e059b..ba63093 100644
--- a/sys/crypto/via/padlock.c
+++ b/sys/crypto/via/padlock.c
@@ -170,7 +170,7 @@ padlock_newsession(device_t dev, uint32_t *sidp, struct cryptoini *cri)
 	struct padlock_session *ses = NULL;
 	struct cryptoini *encini, *macini;
 	struct thread *td;
-	int error;
+	int error, saved_ctx;
 
 	if (sidp == NULL || cri == NULL)
 		return (EINVAL);
@@ -238,10 +238,18 @@ padlock_newsession(device_t dev, uint32_t *sidp, struct cryptoini *cri)
 
 	if (macini != NULL) {
 		td = curthread;
-		error = fpu_kern_enter(td, &ses->ses_fpu_ctx, FPU_KERN_NORMAL);
+		if (!is_fpu_kern_thread(0)) {
+			error = fpu_kern_enter(td, &ses->ses_fpu_ctx,
+			    FPU_KERN_NORMAL);
+			saved_ctx = 1;
+		} else {
+			error = 0;
+			saved_ctx = 0;
+		}
 		if (error == 0) {
 			error = padlock_hash_setup(ses, macini);
-			fpu_kern_leave(td, &ses->ses_fpu_ctx);
+			if (saved_ctx)
+				fpu_kern_leave(td, &ses->ses_fpu_ctx);
 		}
 		if (error != 0) {
 			padlock_freesession_one(sc, ses, 0);
diff --git a/sys/crypto/via/padlock_cipher.c b/sys/crypto/via/padlock_cipher.c
index 0ae26c8..1456ddf 100644
--- a/sys/crypto/via/padlock_cipher.c
+++ b/sys/crypto/via/padlock_cipher.c
@@ -205,7 +205,7 @@ padlock_cipher_process(struct padlock_session *ses, struct cryptodesc *enccrd,
 	struct thread *td;
 	u_char *buf, *abuf;
 	uint32_t *key;
-	int allocated, error;
+	int allocated, error, saved_ctx;
 
 	buf = padlock_cipher_alloc(enccrd, crp, &allocated);
 	if (buf == NULL)
@@ -250,14 +250,21 @@ padlock_cipher_process(struct padlock_session *ses, struct cryptodesc *enccrd,
 	}
 
 	td = curthread;
-	error = fpu_kern_enter(td, &ses->ses_fpu_ctx, FPU_KERN_NORMAL);
+	if (!is_fpu_kern_thread(0)) {
+		error = fpu_kern_enter(td, &ses->ses_fpu_ctx, FPU_KERN_NORMAL);
+		saved_ctx = 1;
+	} else {
+		error = 0;
+		saved_ctx = 0;
+	}
 	if (error != 0)
 		goto out;
 
 	padlock_cbc(abuf, abuf, enccrd->crd_len / AES_BLOCK_LEN, key, cw,
 	    ses->ses_iv);
 
-	fpu_kern_leave(td, &ses->ses_fpu_ctx);
+	if (saved_ctx)
+		fpu_kern_leave(td, &ses->ses_fpu_ctx);
 
 	if (allocated) {
 		crypto_copyback(crp->crp_flags, crp->crp_buf, enccrd->crd_skip,
diff --git a/sys/crypto/via/padlock_hash.c b/sys/crypto/via/padlock_hash.c
index 58c58b2..0fe182b 100644
--- a/sys/crypto/via/padlock_hash.c
+++ b/sys/crypto/via/padlock_hash.c
@@ -366,17 +366,24 @@ padlock_hash_process(struct padlock_session *ses, struct cryptodesc *maccrd,
     struct cryptop *crp)
 {
 	struct thread *td;
-	int error;
+	int error, saved_ctx;
 
 	td = curthread;
-	error = fpu_kern_enter(td, &ses->ses_fpu_ctx, FPU_KERN_NORMAL);
+	if (!is_fpu_kern_thread(0)) {
+		error = fpu_kern_enter(td, &ses->ses_fpu_ctx, FPU_KERN_NORMAL);
+		saved_ctx = 1;
+	} else {
+		error = 0;
+		saved_ctx = 0;
+	}
 	if (error != 0)
 		return (error);
 	if ((maccrd->crd_flags & CRD_F_KEY_EXPLICIT) != 0)
 		padlock_hash_key_setup(ses, maccrd->crd_key, maccrd->crd_klen);
 
 	error = padlock_authcompute(ses, maccrd, crp->crp_buf, crp->crp_flags);
-	fpu_kern_leave(td, &ses->ses_fpu_ctx);
+	if (saved_ctx)
+		fpu_kern_leave(td, &ses->ses_fpu_ctx);
 	return (error);
 }
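
The pattern that the patch above repeats in each function can be sketched in
userland as follows. This is a mock-up, not kernel code:
mock_fpu_kern_enter(), mock_fpu_kern_leave(), mock_is_kern_fpu_thread and
with_fpu() are hypothetical stand-ins for fpu_kern_enter(9),
fpu_kern_leave(9) and is_fpu_kern_thread(0). It also assumes that
is_fpu_kern_thread(0) returning true means the current thread already owns a
kernel FPU context, so the save/restore can be skipped.

```c
#include <stdbool.h>

/* Userland stand-ins for the kernel KPI; all of these are mock-ups. */
struct mock_fpu_ctx {
	int depth;		/* nesting level of enter without leave */
};

/* What is_fpu_kern_thread(0) is assumed to report for this thread. */
static bool mock_is_kern_fpu_thread;
static int mock_enter_calls, mock_leave_calls;

static int
mock_fpu_kern_enter(struct mock_fpu_ctx *ctx)
{
	ctx->depth++;
	mock_enter_calls++;
	return (0);
}

static void
mock_fpu_kern_leave(struct mock_fpu_ctx *ctx)
{
	ctx->depth--;
	mock_leave_calls++;
}

/*
 * Run work() with the FPU usable, saving the context only when the
 * current thread does not already own a kernel FPU context.
 */
static int
with_fpu(struct mock_fpu_ctx *ctx, int (*work)(void *), void *arg)
{
	int error, saved_ctx;

	if (!mock_is_kern_fpu_thread) {
		error = mock_fpu_kern_enter(ctx);
		if (error != 0)
			return (error);
		saved_ctx = 1;
	} else {
		saved_ctx = 0;
	}
	error = work(arg);
	/* Leave only if this call was the one that entered. */
	if (saved_ctx)
		mock_fpu_kern_leave(ctx);
	return (error);
}

/* Example work item: counts invocations and reports success. */
static int
mock_cipher(void *arg)
{
	*(int *)arg += 1;
	return (0);
}
```

Calling with_fpu() with mock_is_kern_fpu_thread set to false pairs one enter
with one leave; with it set to true, the work runs without touching the
context at all.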
 



Re: Call for testers: FPU changes

2010-11-17 Thread Mike Tancsa
On 11/17/2010 11:35 AM, Kostik Belousov wrote:
 Meantime, the similar change may be beneficial for padlock(4) too.
 If you are going to test it, please note that most likely, openssl padlock
 engine does not use padlock(4), I do not know for sure.
 
 diff --git a/sys/crypto/via/padlock.c b/sys/crypto/via/padlock.c
 index 77e059b..ba63093 100644
 --- a/sys/crypto/via/padlock.c
 +++ b/sys/crypto/via/padlock.c

Patch applied cleanly


Full results at the bottom of
http://www.tancsa.com/fpu.html

On large blocks, version 1 vs the above patch show no significant
difference.  This is with openssl using the cryptodev engine. I also
compared to the openssl padlock engine which gave interesting results!



0(via)# cat version1.txt | sed -e 's/k//g' | awk '{print $6}' > 1
0(via)# cat version2.txt | sed -e 's/k//g' | awk '{print $6}' > 2
0(via)# ministat 1 2
x 1
+ 2
    N           Min           Max        Median           Avg        Stddev
x  30     2591851.6     6645345.1     4326340.6     4227917.6     1083181.2
+  30     2574883.9     8830282.8     4033610.4     4241195.6     1519334.8
No difference proven at 95.0% confidence

0(via)# cat version1.txt | sed -e 's/k//g' | awk '{print $5}' > 1
0(via)# cat version2.txt | sed -e 's/k//g' | awk '{print $5}' > 2
0(via)# ministat 1 2
    N           Min           Max        Median           Avg        Stddev
x  30     1124673.3     2320883.7     1527677.1     1550631.9      295165.4
+  30     1069788.2     2508865.7     1594506.2     1588193.2     389414.33
No difference proven at 95.0% confidence
0(via)#
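
To check the "No difference proven" lines by hand, the comparison can be
approximated from the summary rows alone. The sketch below assumes a pooled
two-sample Student's t test, which as far as I know is what ministat applies;
the struct and function names are made up for this example, and the critical
value (about 2.0 for 95%, two-sided, 58 degrees of freedom) has to be
supplied by the caller.

```c
/* Summary of one sample, as printed in a ministat row. */
struct sample_summary {
	double n;		/* number of data points */
	double avg;		/* arithmetic mean */
	double stddev;		/* sample standard deviation */
};

/*
 * Square of the pooled two-sample Student's t statistic.  Working with
 * t^2 avoids a square root, so no math library is needed.
 */
static double
pooled_t_squared(const struct sample_summary *a,
    const struct sample_summary *b)
{
	double df, pooled_var, diff;

	df = a->n + b->n - 2.0;
	pooled_var = ((a->n - 1.0) * a->stddev * a->stddev +
	    (b->n - 1.0) * b->stddev * b->stddev) / df;
	diff = a->avg - b->avg;
	return (diff * diff / (pooled_var * (1.0 / a->n + 1.0 / b->n)));
}

/*
 * Nonzero when the means differ at the given two-sided critical value.
 * The caller supplies t_crit for the right degrees of freedom.
 */
static int
means_differ(const struct sample_summary *a,
    const struct sample_summary *b, double t_crit)
{
	return (pooled_t_squared(a, b) > t_crit * t_crit);
}
```

Feeding in the first pair of rows above (N=30, averages 4227917.6 and
4241195.6) gives a t statistic far below 2.0, consistent with ministat's
verdict.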






Re: Call for testers: FPU changes

2010-11-17 Thread Kostik Belousov
On Wed, Nov 17, 2010 at 02:18:50PM -0500, Mike Tancsa wrote:
 On 11/17/2010 11:35 AM, Kostik Belousov wrote:
  Meantime, the similar change may be beneficial for padlock(4) too.
  If you are going to test it, please note that most likely, openssl padlock
  engine does not use padlock(4), I do not know for sure.
  
  diff --git a/sys/crypto/via/padlock.c b/sys/crypto/via/padlock.c
  index 77e059b..ba63093 100644
  --- a/sys/crypto/via/padlock.c
  +++ b/sys/crypto/via/padlock.c
 
 Patch applied cleanly
 
 
 Full results at the bottom of
 http://www.tancsa.com/fpu.html
 
 On large blocks, version 1 vs the above patch show no significant
 difference.  This is with openssl using the cryptodev engine. I also
 compared to the openssl padlock engine which gave interesting results!
 
 
 
 0(via)# cat version1.txt | sed -e 's/k//g' | awk '{print $6}' > 1
 0(via)# cat version2.txt | sed -e 's/k//g' | awk '{print $6}' > 2
 0(via)# ministat 1 2
 x 1
 + 2
     N           Min           Max        Median           Avg        Stddev
 x  30     2591851.6     6645345.1     4326340.6     4227917.6     1083181.2
 +  30     2574883.9     8830282.8     4033610.4     4241195.6     1519334.8
 No difference proven at 95.0% confidence
 
 0(via)# cat version1.txt | sed -e 's/k//g' | awk '{print $5}' > 1
 0(via)# cat version2.txt | sed -e 's/k//g' | awk '{print $5}' > 2
 0(via)# ministat 1 2
     N           Min           Max        Median           Avg        Stddev
 x  30     1124673.3     2320883.7     1527677.1     1550631.9      295165.4
 +  30     1069788.2     2508865.7     1594506.2     1588193.2     389414.33
 No difference proven at 95.0% confidence
 0(via)#
 

Thank you once more.

If nothing new pops up, I will commit the MFC tomorrow.
Unfortunately, no suspend/resume testers appeared, so be it.




Re: Call for testers: FPU changes

2010-11-16 Thread Kostik Belousov
On Mon, Nov 15, 2010 at 10:42:50PM -0500, Mike Tancsa wrote:
 On 11/15/2010 4:13 PM, Kostik Belousov wrote:
  
  Patch is at
  http://people.freebsd.org/~kib/misc/releng_8_fpu.1.patch
 
 
 Hi,
   One small failure on the patch
 
 The text leading up to this was:
 --
 |Index: pc98/include/npx.h
 |===
 |--- pc98/include/npx.h (revision 215253)
 |+++ pc98/include/npx.h (working copy)
 --
 Patching file pc98/include/npx.h using Plan A...
 Hunk #1 failed at 1.
 1 out of 1 hunks failed--saving rejects to pc98/include/npx.h.rej
This is because our patch(1) in base is somewhat old, I believe.
The diff was generated by svn diff from an up-to-date stable/8
checkout, and the reason for the failure is the expanded $FreeBSD$ tags.

Newer GNU patch, available in ports, handles this correctly,
reporting that the patches applied with fuzz.
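
The failure described above comes from the diff context carrying one form of
the $FreeBSD$ keyword line while the file on disk carries another. As an
illustration only, a hypothetical helper such as collapse_freebsd_tag() below
(not part of patch(1) or svn) could normalize an expanded keyword back to its
bare form before two lines are compared.

```c
#include <stddef.h>
#include <string.h>

/*
 * Collapse an expanded "$FreeBSD: ... $" keyword in line to the bare
 * "$FreeBSD$" form, writing the result to out (of size outsz).
 * Returns 1 if a keyword was collapsed, 0 if the line was copied
 * unchanged, and -1 if out is too small.
 */
static int
collapse_freebsd_tag(const char *line, char *out, size_t outsz)
{
	static const char expanded[] = "$FreeBSD:";
	static const char bare[] = "$FreeBSD$";
	const char *start, *end;
	size_t prefix, tail;

	start = strstr(line, expanded);
	end = (start != NULL) ?
	    strchr(start + sizeof(expanded) - 1, '$') : NULL;
	if (end == NULL) {
		/* No complete expanded keyword: copy the line as-is. */
		if (strlen(line) + 1 > outsz)
			return (-1);
		strcpy(out, line);
		return (0);
	}
	prefix = (size_t)(start - line);
	tail = strlen(end + 1);
	if (prefix + sizeof(bare) - 1 + tail + 1 > outsz)
		return (-1);
	memcpy(out, line, prefix);
	memcpy(out + prefix, bare, sizeof(bare) - 1);
	/* Copy the remainder of the line including its terminating NUL. */
	memcpy(out + prefix + sizeof(bare) - 1, end + 1, tail + 1);
	return (1);
}
```

Given a line of the form " * $FreeBSD: <path> <rev> <date> <user> $", the
helper writes " * $FreeBSD$" and returns 1; lines without the keyword are
copied unchanged.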

 
 
 I tested with openssl and openvpn and all seems to work great on the via
 board and my i5 board!!  Simple test details at
 
 http://www.tancsa.com/fpu.html
 
 I will try out geli and some more extensive tests tomorrow
 
 Thanks for porting this back to RELENG_8 !
This is actually somewhat puzzling. Does openssl in base automatically
use crypto(4) ?

Also, could you please redo the speed tests for aesni(4) with the
following patch applied over the driver sources ?

Thank you !

diff --git a/sys/crypto/aesni/aesni_wrap.c b/sys/crypto/aesni/aesni_wrap.c
index 36c66ea..3fd397c 100644
--- a/sys/crypto/aesni/aesni_wrap.c
+++ b/sys/crypto/aesni/aesni_wrap.c
@@ -246,14 +246,21 @@ int
 aesni_cipher_setup(struct aesni_session *ses, struct cryptoini *encini)
 {
 	struct thread *td;
-	int error;
+	int error, saved_ctx;
 
 	td = curthread;
-	error = fpu_kern_enter(td, &ses->fpu_ctx, FPU_KERN_NORMAL);
+	if (!is_fpu_kern_thread(0)) {
+		error = fpu_kern_enter(td, &ses->fpu_ctx, FPU_KERN_NORMAL);
+		saved_ctx = 1;
+	} else {
+		error = 0;
+		saved_ctx = 0;
+	}
 	if (error == 0) {
 		error = aesni_cipher_setup_common(ses, encini->cri_key,
 		    encini->cri_klen);
-		fpu_kern_leave(td, &ses->fpu_ctx);
+		if (saved_ctx)
+			fpu_kern_leave(td, &ses->fpu_ctx);
 	}
 	return (error);
 }
@@ -264,16 +271,22 @@ aesni_cipher_process(struct aesni_session *ses, struct cryptodesc *enccrd,
 {
 	struct thread *td;
 	uint8_t *buf;
-	int error, allocated;
+	int error, allocated, saved_ctx;
 
 	buf = aesni_cipher_alloc(enccrd, crp, &allocated);
 	if (buf == NULL)
 		return (ENOMEM);
 
 	td = curthread;
-	error = fpu_kern_enter(td, &ses->fpu_ctx, FPU_KERN_NORMAL);
-	if (error != 0)
-		goto out;
+	if (!is_fpu_kern_thread(0)) {
+		error = fpu_kern_enter(td, &ses->fpu_ctx, FPU_KERN_NORMAL);
+		if (error != 0)
+			goto out;
+		saved_ctx = 1;
+	} else {
+		saved_ctx = 0;
+		error = 0;
+	}
 
 	if ((enccrd->crd_flags & CRD_F_KEY_EXPLICIT) != 0) {
 		error = aesni_cipher_setup_common(ses, enccrd->crd_key,
@@ -311,7 +324,8 @@ aesni_cipher_process(struct aesni_session *ses, struct cryptodesc *enccrd,
 			    ses->iv);
 		}
 	}
-	fpu_kern_leave(td, &ses->fpu_ctx);
+	if (saved_ctx)
+		fpu_kern_leave(td, &ses->fpu_ctx);
 	if (allocated)
 		crypto_copyback(crp->crp_flags, crp->crp_buf, enccrd->crd_skip,
 		    enccrd->crd_len, buf);




Re: Call for testers: FPU changes

2010-11-16 Thread Mike Tancsa
On 11/16/2010 4:43 AM, Kostik Belousov wrote:
 On Mon, Nov 15, 2010 at 10:42:50PM -0500, Mike Tancsa wrote:
 On 11/15/2010 4:13 PM, Kostik Belousov wrote:

 Patch is at
 http://people.freebsd.org/~kib/misc/releng_8_fpu.1.patch


 Hi,
  One small failure on the patch

 The text leading up to this was:
 --
 |Index: pc98/include/npx.h
 |===
 |--- pc98/include/npx.h (revision 215253)
 |+++ pc98/include/npx.h (working copy)
 --
 Patching file pc98/include/npx.h using Plan A...
 Hunk #1 failed at 1.
 1 out of 1 hunks failed--saving rejects to pc98/include/npx.h.rej
 This is because our patch(1) in base is somewhat old, I believe.
 The diff was generated by svn diff from the up to date stable/8
 checkout, and the reason for failure is expanded $FreeBSD$ tags.
 
 Newer gnu patch, available in ports, handles this correctly,
 reporting about patches applied with fuzz.
 


 I tested with openssl and openvpn and all seems to work great on the via
 board and my i5 board!!  Simple test details at

 http://www.tancsa.com/fpu.html

 I will try out geli and some more extensive tests tomorrow

 Thanks for porting this back to RELENG_8 !
 This is actually somewhat puzzling. Does openssl in base automatically
 use crypto(4) ?


I force it via openssl.cnf


0(achinetboot)% tail -11 /etc/ssl/openssl.cnf

openssl_conf = openssl_def

[openssl_def]
engines = openssl_engines

[openssl_engines]
padlock = cryptodev_engine

[cryptodev_engine]
default_algorithms = ALL
0(achinetboot)%


The limiting factor here for ssh seems to be the 100Mb link my i5 box is
on. Here are the results with and without aesni loaded:

0(achinetboot)% /usr/bin/time scp -c aes128-cbc test.bin mdtan...@10.255.255.1:/dev/null
test.bin                                      100%   88MB  11.0MB/s   00:08
        8.14 real         0.44 user         0.57 sys
0(achinetboot)% /usr/bin/time scp -c aes128-cbc test.bin mdtan...@10.255.255.1:/dev/null
test.bin                                      100%   88MB  11.0MB/s   00:08
        8.15 real         1.46 user         0.36 sys
0(achinetboot)%

I will move it to gigabit to get a better test shortly.

 
 Also, could you please redo the speed tests for aesni(4) with the
 following patch applied over the driver sources ?
 
 Thank you !
 
 diff --git a/sys/crypto/aesni/aesni_wrap.c b/sys/crypto/aesni/aesni_wrap.c
 index 36c66ea..3fd397c 100644
 --- a/sys/crypto/aesni/aesni_wrap.c
 +++ b/sys/crypto/aesni/aesni_wrap.c
 @@ -246,14 +246,21 @@ int



patch -p2 < a
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--
|diff --git a/sys/crypto/aesni/aesni_wrap.c b/sys/crypto/aesni/aesni_wrap.c
|index 36c66ea..3fd397c 100644
|--- a/sys/crypto/aesni/aesni_wrap.c
|+++ b/sys/crypto/aesni/aesni_wrap.c
--
Patching file crypto/aesni/aesni_wrap.c using Plan A...
Hunk #1 succeeded at 246.
Hunk #2 succeeded at 271.
Hunk #3 succeeded at 324.
Hmm...  Ignoring the trailing garbage.
done


Seems to work ok



0(achinetboot)# kldload aesni
0(achinetboot)#  openssl speed -evp aes-128-cbc
To get the most accurate results, try to run this
program when this computer is idle.
Doing aes-128-cbc for 3s on 16 size blocks: 2587085 aes-128-cbc's in 0.39s
Doing aes-128-cbc for 3s on 64 size blocks: 2425301 aes-128-cbc's in 0.38s
Doing aes-128-cbc for 3s on 256 size blocks: 1925353 aes-128-cbc's in 0.19s
Doing aes-128-cbc for 3s on 1024 size blocks: 1098255 aes-128-cbc's in 0.11s
Doing aes-128-cbc for 3s on 8192 size blocks: 152631 aes-128-cbc's in 0.05s
OpenSSL 0.9.8n 24 Mar 2010
built on: date not available
options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: cc
available timing options: USE_TOD HZ=128 [sysconf value]
timing function used: getrusage
The 'numbers' are in 1000s of bytes per second processed.
type                 16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc        105979.48k   404781.84k  2632455.13k  9955323.90k 27619906.16k
0(achinetboot)#

But there is a LOT of variation between runs for some reason.

I added the different runs to http://www.tancsa.com/fpu.html
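
One thing worth keeping in mind when reading the table above: each figure is
(operations x block size) / time, and the time used is the "in 0.05s" style
value obtained via getrusage, not the nominal 3s run length. The helper below
just redoes that arithmetic. kbytes_per_second() is a name invented here, and
the reading that the measured time leaves out most of the work when the
cipher runs in the kernel is an assumption, not something established in this
thread.

```c
/*
 * Throughput in the units openssl speed prints: thousands of bytes per
 * second, given an operation count, a block size in bytes and a time
 * in seconds.
 */
static double
kbytes_per_second(double ops, double block_bytes, double seconds)
{
	return (ops * block_bytes / seconds / 1000.0);
}
```

For the 8192-byte column, kbytes_per_second(152631, 8192, 0.05) is roughly
25,000,000k, in the same range as the reported 27619906.16k, while dividing
by the full 3 seconds gives roughly 417,000k, a factor of 60 lower.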





Re: Call for testers: FPU changes

2010-11-16 Thread Kostik Belousov
On Tue, Nov 16, 2010 at 05:08:30PM -0500, Mike Tancsa wrote:
 On 11/16/2010 4:43 AM, Kostik Belousov wrote:
  On Mon, Nov 15, 2010 at 10:42:50PM -0500, Mike Tancsa wrote:
  On 11/15/2010 4:13 PM, Kostik Belousov wrote:
 
  Patch is at
  http://people.freebsd.org/~kib/misc/releng_8_fpu.1.patch
 
 
  Hi,
 One small failure on the patch
 
  The text leading up to this was:
  --
  |Index: pc98/include/npx.h
  |===
  |--- pc98/include/npx.h (revision 215253)
  |+++ pc98/include/npx.h (working copy)
  --
  Patching file pc98/include/npx.h using Plan A...
  Hunk #1 failed at 1.
  1 out of 1 hunks failed--saving rejects to pc98/include/npx.h.rej
  This is because our patch(1) in base is somewhat old, I believe.
  The diff was generated by svn diff from the up to date stable/8
  checkout, and the reason for failure is expanded $FreeBSD$ tags.
  
  Newer gnu patch, available in ports, handles this correctly,
  reporting about patches applied with fuzz.
  
 
 
  I tested with openssl and openvpn and all seems to work great on the via
  board and my i5 board!!  Simple test details at
 
  http://www.tancsa.com/fpu.html
 
  I will try out geli and some more extensive tests tomorrow
 
  Thanks for porting this back to RELENG_8 !
  This is actually somewhat puzzling. Does openssl in base automatically
  use crypto(4) ?
 
 
 I force it via openssl.cnf
 
 
 0(achinetboot)% tail -11 /etc/ssl/openssl.cnf
 
 openssl_conf = openssl_def
 
 [openssl_def]
 engines = openssl_engines
 
 [openssl_engines]
 padlock = cryptodev_engine
 
 [cryptodev_engine]
 default_algorithms = ALL
 0(achinetboot)%
Ah, that explains the results.

 
 
 The limiting factor here for ssh seems to be the 100Mb link my i5 box is
 on. Here are the results with and without aesni loaded:
 
 0(achinetboot)% /usr/bin/time scp -c aes128-cbc test.bin mdtan...@10.255.255.1:/dev/null
 test.bin                                      100%   88MB  11.0MB/s   00:08
         8.14 real         0.44 user         0.57 sys
 0(achinetboot)% /usr/bin/time scp -c aes128-cbc test.bin mdtan...@10.255.255.1:/dev/null
 test.bin                                      100%   88MB  11.0MB/s   00:08
         8.15 real         1.46 user         0.36 sys
 0(achinetboot)%
 
 I will move it to gigabit to get a better test shortly.
 
  
  Also, could you please redo the speed tests for aesni(4) with the
  following patch applied over the driver sources ?
  
  Thank you !
  
  diff --git a/sys/crypto/aesni/aesni_wrap.c b/sys/crypto/aesni/aesni_wrap.c
  index 36c66ea..3fd397c 100644
  --- a/sys/crypto/aesni/aesni_wrap.c
  +++ b/sys/crypto/aesni/aesni_wrap.c
  @@ -246,14 +246,21 @@ int
 
 
 
  patch -p2 < a
 Hmm...  Looks like a unified diff to me...
 The text leading up to this was:
 --
 |diff --git a/sys/crypto/aesni/aesni_wrap.c b/sys/crypto/aesni/aesni_wrap.c
 |index 36c66ea..3fd397c 100644
 |--- a/sys/crypto/aesni/aesni_wrap.c
 |+++ b/sys/crypto/aesni/aesni_wrap.c
 --
 Patching file crypto/aesni/aesni_wrap.c using Plan A...
 Hunk #1 succeeded at 246.
 Hunk #2 succeeded at 271.
 Hunk #3 succeeded at 324.
 Hmm...  Ignoring the trailing garbage.
 done
 
 
 Seems to work ok
 
 
 
 0(achinetboot)# kldload aesni
 0(achinetboot)#  openssl speed -evp aes-128-cbc
 To get the most accurate results, try to run this
 program when this computer is idle.
 Doing aes-128-cbc for 3s on 16 size blocks: 2587085 aes-128-cbc's in 0.39s
 Doing aes-128-cbc for 3s on 64 size blocks: 2425301 aes-128-cbc's in 0.38s
 Doing aes-128-cbc for 3s on 256 size blocks: 1925353 aes-128-cbc's in 0.19s
 Doing aes-128-cbc for 3s on 1024 size blocks: 1098255 aes-128-cbc's in 0.11s
 Doing aes-128-cbc for 3s on 8192 size blocks: 152631 aes-128-cbc's in 0.05s
 OpenSSL 0.9.8n 24 Mar 2010
 built on: date not available
 options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
 compiler: cc
 available timing options: USE_TOD HZ=128 [sysconf value]
 timing function used: getrusage
 The 'numbers' are in 1000s of bytes per second processed.
 type                 16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
 aes-128-cbc        105979.48k   404781.84k  2632455.13k  9955323.90k 27619906.16k
 0(achinetboot)#
 
 But there is a LOT of variation between runs for some reason.
 
 I added the different runs to http://www.tancsa.com/fpu.html
 
 
Mike, thank you again.

Would your conclusion be that the patch seems to increase the throughput
of the aesni(4) ?

I think that on small-sized blocks, when using aesni(4), the dominating
factor is the copyin/copyout of the data to/from the kernel address
space. Still would be interesting to compare the full output
of openssl speed on aesni(4) with and without the patch I posted.




Re: Call for testers: FPU changes

2010-11-16 Thread Mike Tancsa
On 11/16/2010 5:19 PM, Kostik Belousov wrote:
 Would your conclusion be that the patch seems to increase the throughput
 of the aesni(4) ?
 
 I think that on small-sized blocks, when using aesni(4), the dominating
 factor is the copyin/copyout of the data to/from the kernel address
 space. Still would be interesting to compare the full output
 of openssl speed on aesni(4) with and without the patch I posted.

Hi,
There does seem to be some improvement on large blocks.  But there are
some freakishly fast times. On other sizes, there seems to be no
difference in speed.

I did 20 runs. Updated stats at http://www.tancsa.com/fpu.html

---Mike



Call for testers: FPU changes

2010-11-15 Thread Kostik Belousov
Hello,
this is a call for testers of the merge of fpu_kern_enter/leave(9)
to RELENG_8. The changes are required to fix some issues with VIA
padlock engine, and to actually merge aesni(4) to RELENG_8.

Please look for possible FPU context handling regressions.
Reports from the users of VIA padlock hardware are also needed.
Any user who has suspend/resume magically working on the
8 branch, please test that the patch does not make things
worse.

Please note that the pre-release freeze will start in 2 weeks, so 
I need to get testing results relatively quickly to be in time for 8.2.

Patch is at
http://people.freebsd.org/~kib/misc/releng_8_fpu.1.patch

Thanks in advance.




Re: Call for testers: FPU changes

2010-11-15 Thread Mike Tancsa
On 11/15/2010 4:13 PM, Kostik Belousov wrote:
 
 Patch is at
 http://people.freebsd.org/~kib/misc/releng_8_fpu.1.patch


Hi,
One small failure on the patch

The text leading up to this was:
--
|Index: pc98/include/npx.h
|===
|--- pc98/include/npx.h (revision 215253)
|+++ pc98/include/npx.h (working copy)
--
Patching file pc98/include/npx.h using Plan A...
Hunk #1 failed at 1.
1 out of 1 hunks failed--saving rejects to pc98/include/npx.h.rej


I tested with openssl and openvpn and all seems to work great on the via
board and my i5 board!!  Simple test details at

http://www.tancsa.com/fpu.html

I will try out geli and some more extensive tests tomorrow

Thanks for porting this back to RELENG_8 !

---Mike