Re: [viff-devel] PRSS zero- and double-sharings

2008-08-01 Thread Mikkel Krøigård
Well, there are many more or less interesting conclusions to draw from your
benchmark, Martin. Not surprisingly, matrix multiplication turns out to be
expensive. It is worth considering using a non-naive algorithm for this
multiplication, but I am not convinced there is very much to gain.
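
For reference, the "naive" algorithm under discussion is schoolbook matrix multiplication, which costs O(n^3) field multiplications. A minimal sketch (hypothetical illustration, not VIFF's actual matrix.py code):

```python
# Schoolbook O(n^3) matrix multiplication -- the naive algorithm
# whose cost is being discussed (illustrative sketch, not VIFF code).
def mat_mul(a, b):
    rows, inner, cols = len(a), len(b), len(b[0])
    assert all(len(row) == inner for row in a), "dimension mismatch"
    # Each of the rows*cols result entries needs `inner` multiplications.
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(rows)]

print(mat_mul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

Asymptotically faster algorithms (e.g. Strassen's) exist, but for the small matrices involved here the constant factors may well eat the gain.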

One thing I really do find interesting about the table is the amount of time
spent in inc_pc_wrapper. Perhaps it is possible to improve this somehow?

___
viff-devel mailing list (http://viff.dk/)
viff-devel@viff.dk
http://lists.viff.dk/listinfo.cgi/viff-devel-viff.dk


Re: [viff-devel] PRSS zero- and double-sharings

2008-08-01 Thread Martin Geisler
Mikkel Krøigård [EMAIL PROTECTED] writes:

 Well, there are many more or less interesting conclusions to draw
 from your benchmark, Martin. Not surprisingly, matrix multiplication
 turns out to be expensive.

Hmm... I did see that there were a bunch of calls to __mul__ in
matrix.py, but I thought they came from the initialization of _hyper
in ActiveRuntime. But the initialization of _hyper does not use any
matrix multiplications, so this is wrong?!

When the mul method uses prss_get_triple, then _hyper should never be
used or even initialized and so there should be no matrix stuff going
on... I think I measured the wrong code somehow :-)

 One thing I really do find interesting about the table is the amount
 of time spent in inc_pc_wrapper. Perhaps it is possible to improve
 this somehow?

It only used 0.5 seconds of its own time -- the 21 seconds are the
total time spent in the child-calls made by inc_pc_wrapper. Since it
wraps all important functions it's clear that the cumulative time will
be big:

       ncalls  tottime  percall  cumtime  percall
   48003/6003    0.518    0.000   21.195    0.004

But any optimization would be good -- if we can save a tiny bit for
each of the 48000 calls it might add up :-)
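
The tottime/cumtime distinction can be seen in a small self-contained sketch (the function names below are made up for the example; this is not VIFF code): a thin wrapper does almost no work itself, so its tottime stays small, while its cumtime absorbs everything it calls.

```python
import cProfile
import io
import pstats

def expensive_child(n):
    # All the real work happens here.
    return sum(i * i for i in range(n))

def thin_wrapper(n):
    # Almost no work of its own: tiny tottime, but cumtime
    # includes the whole child call -- like inc_pc_wrapper.
    return expensive_child(n)

prof = cProfile.Profile()
prof.enable()
for _ in range(100):
    thin_wrapper(10000)
prof.disable()

out = io.StringIO()
# Restrict the report to our two functions via a regex.
pstats.Stats(prof, stream=out).sort_stats("cumulative").print_stats("wrapper|child")
print(out.getvalue())
```

In the report, thin_wrapper shows a near-zero tottime but a cumtime matching expensive_child's, which is exactly the pattern in the table above.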

-- 
Martin Geisler


Re: [viff-devel] PRSS zero- and double-sharings

2008-08-01 Thread Mikkel Krøigård
  Well, there are many more or less interesting conclusions to draw
  from your benchmark, Martin. Not surprisingly, matrix multiplication
  turns out to be expensive.

 Hmm... I did see that there were a bunch of calls to __mul__ in
 matrix.py, but I thought they came from the initialization of _hyper
 in ActiveRuntime. But the initialization of _hyper does not use any
 matrix multiplications, so this is wrong?!
No, the initialization of _hyper should definitely not use matrix multiplications.

 When the mul method uses prss_get_triple, then _hyper should never be
 used or even initialized and so there should be no matrix stuff going
 on... I think I measured the wrong code somehow :-)
Well, somehow matrices were multiplied. And not only that, a lot of time was
spent doing it :)

  One thing I really do find interesting about the table is the amount
  of time spent in inc_pc_wrapper. Perhaps it is possible to improve
  this somehow?

 It only used 0.5 seconds of its own time -- the 21 seconds are the
 total time spent in the child-calls made by inc_pc_wrapper. Since it
 wraps all important functions it's clear that the cumulative time will
 be big:

    ncalls  tottime  percall  cumtime  percall
48003/6003    0.518    0.000   21.195    0.004

 But any optimization would be good -- if we can save a tiny bit for
 each of the 48000 calls it might add up :-)
My observation was just that wrapping is somewhat expensive. I do not quite know
what to do about this. We have already discussed the alternatives to using this
method, and they were not particularly promising.


Re: [viff-devel] PRSS zero- and double-sharings

2008-08-01 Thread Martin Geisler
Martin Geisler [EMAIL PROTECTED] writes:

 Strangely the time for preprocessing has not improved... It stayed
 at an average time of about *20 ms* for a multiplication triple both
 before and after the change -- I don't understand that :-(

I do now! :-)

It turned out that the preprocessing was still done with the old code.
Using the new PRSS-based code brings the time down to *5 ms* per
triple, so it cuts the time by a factor of four.

Thanks to Mikkel for making me think about why there were matrix
multiplications in the profile information...

-- 
Martin Geisler


Re: [viff-devel] PRSS zero- and double-sharings

2008-08-01 Thread Martin Geisler
Mikkel Krøigård [EMAIL PROTECTED] writes:

 It only used 0.5 seconds of its own time -- the 21 seconds are the
 total time spent in the child-calls made by inc_pc_wrapper. Since
 it wraps all important functions it's clear that the cumulative time
 will be big:

    ncalls  tottime  percall  cumtime  percall
48003/6003    0.518    0.000   21.195    0.004

 But any optimization would be good -- if we can save a tiny bit for
 each of the 48000 calls it might add up :-)

 My observation was just that wrapping is somewhat expensive. I do
 not quite know what to do about this. We have already discussed the
 alternatives to using this method, and they were not particularly
 promising.

Function calls are unfortunately quite expensive in Python. One idea
is to use Psyco (http://tracker.viff.dk/issue51), since it can
supposedly generate optimized code on the fly and maybe even do
inlining -- I'm not sure.
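
The per-call cost of an extra wrapper frame can be measured directly with the standard library's timeit (a standalone sketch with made-up function names, not using VIFF or Psyco):

```python
import timeit

def work(x):
    return x + 1

def wrapped(x):
    # One extra Python stack frame per call, mimicking a
    # pass-through wrapper such as inc_pc_wrapper.
    return work(x)

n = 200000
direct_time = timeit.timeit(lambda: work(1), number=n)
wrapped_time = timeit.timeit(lambda: wrapped(1), number=n)
print("direct:  %.4f s" % direct_time)
print("wrapped: %.4f s" % wrapped_time)
# The wrapped version typically takes noticeably longer, which is the
# overhead being paid on each of the 48000 wrapped calls.
```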

-- 
Martin Geisler


Re: [viff-devel] PRSS zero- and double-sharings

2008-07-31 Thread Martin Geisler
Martin Geisler [EMAIL PROTECTED] writes:

 I'm thinking that there might be some unfortunate overhead in the
 preprocessing book-keeping. We should try running benchmark.py under
 a profiler to see where the time is spent.

There is now support for a --profile flag, and running benchmark.py
with it gives (1000 actively secure multiplications, 4 players):

 2706417 function calls (2113774 primitive calls) in 37.755 CPU seconds

   Ordered by: internal time, call count
   List reduced from 301 to 40 due to restriction <40>

          ncalls  tottime  percall  cumtime  percall filename:lineno(function)
              58    6.642    0.115   34.640    0.597 selectreactor.py:82(doSelect)
    230033/17014    6.011    0.000   29.850    0.002 defer.py:306(_runCallbacks)
   305044/249023    3.099    0.000    9.937    0.000 defer.py:168(addCallbacks)
           35000    1.637    0.000    8.042    0.000 runtime.py:1088(mul)
           46504    1.338    0.000    4.218    0.000 runtime.py:184(__init__)
          105115    1.310    0.000    1.780    0.000 field.py:371(__mul__)
          188307    1.269    0.000    1.269    0.000 field.py:339(__init__)
            2000    1.053    0.001   12.017    0.006 matrix.py:128(__mul__)
    117027/21014    0.992    0.000   27.129    0.001 defer.py:283(_startRunCallbacks)
           72000    0.848    0.000    1.486    0.000 field.py:342(__add__)
   192023/158017    0.820    0.000    8.283    0.000 defer.py:185(addCallback)
    117027/21014    0.807    0.000   27.282    0.001 defer.py:229(callback)
           37000    0.799    0.000    2.382    0.000 runtime.py:137(clone)
          115512    0.762    0.000    1.368    0.000 runtime.py:70(__init__)
           15006    0.603    0.000    0.603    0.000 runtime.py:574(_expect_data)
    102008/67006    0.525    0.000    8.953    0.000 runtime.py:208(_callback_fired)
      48003/6003    0.518    0.000   21.195    0.004 runtime.py:353(inc_pc_wrapper)
           36000    0.514    0.000    6.201    0.000 runtime.py:742(add)
            9000    0.494    0.000    1.285    0.000 shamir.py:95(recombine)
           43504    0.373    0.000    5.085    0.000 runtime.py:216(gather_shares)
           19006    0.340    0.000    1.051    0.000 runtime.py:306(sendData)
              70    0.336    0.005   27.819    0.397 basic.py:345(dataReceived)
           15009    0.328    0.000   27.483    0.002 runtime.py:267(stringReceived)
          117026    0.287    0.000    0.287    0.000 defer.py:162(__init__)
           19009    0.272    0.000    0.517    0.000 abstract.py:164(write)
            6000    0.263    0.000    0.455    0.000 matrix.py:201(transpose)
            3000    0.249    0.000    2.507    0.001 runtime.py:719(exchange)
            2000    0.249    0.000    1.053    0.001 shamir.py:30(share)
           36000    0.233    0.000    1.174    0.000 runtime.py:754(lambda)
      15000/4000    0.203    0.000    3.775    0.001 runtime.py:534(schedule_callback)
           15006    0.197    0.000    1.249    0.000 runtime.py:608(_expect_share)
           19009    0.195    0.000    0.712    0.000 basic.py:357(sendString)
           34000    0.187    0.000    0.697    0.000 runtime.py:1102(lambda)
            2000    0.178    0.000    1.254    0.001 prss.py:69(convert_replicated_shamir)
             504    0.176    0.000    0.263    0.001 defer.py:458(__init__)
           88000    0.167    0.000    0.167    0.000 matrix.py:72(__getitem__)
            6000    0.160    0.000    0.160    0.000 prss.py:260(__init__)
            6000    0.155    0.000    0.155    0.000 prss.py:326(__call__)
           37000    0.150    0.000    4.225    0.000 runtime.py:144(split_result)
           34000    0.146    0.000    5.500    0.000 runtime.py:105(__rmul__)

I'm not really sure how to interpret this... If any of you want to
investigate, then here's a link to the manual:

  http://docs.python.org/lib/module-profile.html

:-)


-- 
Martin Geisler