Re: [viff-devel] Two-threaded VIFF
Marcel Keller writes:

> Hi friends of VIFF,
>
> I've finally completed the patch for a two-threaded VIFF where most of
> the VIFF code runs in a separate thread. The patch is against the tip
> of my repository: http://hg.viff.dk/mkeller

Okay. Your clone is about 70 commits and 6 weeks behind the current
version. There have been some changes in the ShareExchanger code since
then which your code also touches. You should therefore pull in the
latest code and merge/rebase your patch.

But also, please look at splitting this up into smaller pieces. There
are a number of changes which do not logically belong together and
which should therefore be put into separate patches. Quoting from a
post on the Mercurial mailing list:

  http://www.selenic.com/pipermail/mercurial-devel/2008-August/007632.html

  First, the probability of a patch being acceptable is the *product*
  of the probability of each individual feature being acceptable.
  Which means it's a good idea to submit each feature separately.

  Second, the probability of there being a bug in the patch is the sum
  of there being a bug in each feature. Which means you want each
  feature in a separate commit for locating regressions.

  Third, the difficulty of reviewing a patch increases more than
  linearly with the number of features included. Which means we really
  don't want to review patches with more than one feature.

In this case we're changing the basic design of how things are
processed in VIFF -- we're introducing threads and locks, which are
traditionally the source of many funny errors. Therefore we need some
good documentation for this:

* what is waiting on what?
* where can we deadlock -- how do we avoid this?
* what are the hidden assumptions? (I saw that putting (None, None)
  into the deferred_queue signals that the thread should stop.)
* maybe more :-)

> It turned out to be not so straightforward as I thought. I had to
> use recursion, as in the hack, but I refined it to ensure that the
> recursion limit isn't exceeded.
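For the record, the (None, None) convention described above is the classic sentinel pattern for stopping a consumer thread. A minimal stdlib-only sketch of the idea (the names `work_queue`, `worker_loop`, and the (callback, result) item shape are my own for illustration, not VIFF's actual code):

```python
import queue
import threading

work_queue = queue.Queue()
SENTINEL = (None, None)

def worker_loop():
    """Consumer loop: pops (callback, result) pairs and runs each
    callback in this thread. A (None, None) pair is the agreed-upon
    sentinel telling the thread to shut down."""
    while True:
        callback, result = work_queue.get()
        if callback is None:
            break  # sentinel received: stop the thread
        callback(result)

def stop_worker():
    # Enqueue the sentinel; FIFO order guarantees all earlier work
    # items are processed before the loop exits.
    work_queue.put(SENTINEL)
```

The hidden assumption worth documenting is exactly the one flagged above: no real work item may ever have None in its first slot, or it will be mistaken for the shutdown signal.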
Please document this somewhere, maybe in doc/runtime.txt or maybe a
new document which describes the design. There is a saying that if it
was hard to write the code, then it will be even harder to read it
afterwards. Therefore, if you put all your cleverness into writing the
code, you will not be able to understand it when reading it again :-)

> Benchmarks:
> Unfortunately, this solution is slower than the hack, e.g. one AES
> block encryption takes 4 seconds compared to 3 seconds with the
> hack.

The hack used a single thread but pumped messages out quicker by
directly calling the send-stuff function in the reactor, right? Then
it's no big surprise that it was faster, since it had no locking to
do.

> On the other hand, the preprocessing time in the actively secure
> multiplication is linear and not quadratic, whereas the online time
> is significantly larger:
>
>          two-threaded       hack           original
> (n,t)   online preproc.  online preproc.  online preproc.
> (4,1)      6      22       4      17        4      20
> (7,2)     10      37       6      29        6      42
> (10,3)    13      53       8      42        8      82
> (13,4)    17      68      10      56       10     136
> (16,5)    20      84      12      68       12     208
> (19,6)    23     106      13      83       14     287
> (22,7)    26     120      15      98       17     377

It is a shame that things become slower, but maybe that is the price
we have to pay for this more elaborate design.

> I did some profiling and didn't find an obvious reason why the
> two-threaded version is slower. Therefore, I guess that the reason
> is the multi-threading implementation of Python (which could be
> better, as mentioned in the discussion about the hack). The guess is
> also supported by the fact that having a separate thread for every
> callback, which I also tried, turned out to be really slow.
>
> Unit tests:
> All unit tests pass, even the previously skipped
> test_multiple_callbacks.

Excellent!

> This is because I added the @increment_pc decorator to
> schedule_callback(). This of course changes the program counters
> heavily, but I didn't experience any problems.
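To make the @increment_pc discussion above concrete: the decorator's job is to give every decorated call a unique, reproducible program counter, so callbacks scheduled at different call sites agree across players. A simplified sketch of the mechanism (this is my own illustration of the idea, not the code from viff/runtime.py; the `schedule_callback` body here only records the PC instead of wiring up a real Deferred):

```python
import functools

def increment_pc(method):
    """Bump the last program-counter component on entry and descend
    one level for the duration of the call, so each decorated call
    site gets a distinct, deterministic PC."""
    @functools.wraps(method)
    def inc_pc_wrapper(self, *args, **kwargs):
        self.program_counter[-1] += 1
        self.program_counter.append(0)
        try:
            return method(self, *args, **kwargs)
        finally:
            self.program_counter.pop()
    return inc_pc_wrapper

class Runtime:
    def __init__(self):
        # The PC is a list of ints naming a node in the program tree.
        self.program_counter = [0]

    @increment_pc
    def schedule_callback(self, deferred, func):
        # In VIFF this would attach func to the Deferred; here we just
        # return the PC under which the callback would later run.
        return list(self.program_counter)
```

Decorating schedule_callback() itself is what changes the counters "heavily": every scheduled callback now consumes a PC slot of its own, which is presumably why test_multiple_callbacks can be un-skipped.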
> Best regards,
> Marcel

> diff -r e2759515f57f apps/aes.py
> --- a/apps/aes.py Thu Mar 05 21:02:57 2009 +0100
> +++ b/apps/aes.py Tue Apr 21 17:25:45 2009 +0200
> @@ -82,7 +82,7 @@
>          rt.shutdown()
>
>      g = gather_shares(opened_ciphertext)
> -    g.addCallback(fin)
> +    rt.schedule_callback(g, fin)
>
>  def share_key(rt):
>      key = []
> diff -r e2759515f57f apps/benchmark.py
> --- a/apps/benchmark.py Thu Mar 05 21:02:57 2009 +0100
> +++ b/apps/benchmark.py Tue Apr 21 17:25:45 2009 +0200
> @@ -91,6 +91,7 @@
>      print "Total time used: %.3f sec" % (stop-start)
>      print "Time per %s operation: %.0f ms" % (w
[viff-devel] Two-threaded VIFF
Hi friends of VIFF,

I've finally completed the patch for a two-threaded VIFF where most of
the VIFF code runs in a separate thread. The patch is against the tip
of my repository: http://hg.viff.dk/mkeller

It turned out to be not so straightforward as I thought. I had to use
recursion, as in the hack, but I refined it to ensure that the
recursion limit isn't exceeded.

Benchmarks:
Unfortunately, this solution is slower than the hack, e.g. one AES
block encryption takes 4 seconds compared to 3 seconds with the hack.
On the other hand, the preprocessing time in the actively secure
multiplication is linear and not quadratic, whereas the online time is
significantly larger:

         two-threaded       hack           original
(n,t)   online preproc.  online preproc.  online preproc.
(4,1)      6      22       4      17        4      20
(7,2)     10      37       6      29        6      42
(10,3)    13      53       8      42        8      82
(13,4)    17      68      10      56       10     136
(16,5)    20      84      12      68       12     208
(19,6)    23     106      13      83       14     287
(22,7)    26     120      15      98       17     377

I did some profiling and didn't find an obvious reason why the
two-threaded version is slower. Therefore, I guess that the reason is
the multi-threading implementation of Python (which could be better,
as mentioned in the discussion about the hack). The guess is also
supported by the fact that having a separate thread for every
callback, which I also tried, turned out to be really slow.

Unit tests:
All unit tests pass, even the previously skipped
test_multiple_callbacks. This is because I added the @increment_pc
decorator to schedule_callback(). This of course changes the program
counters heavily, but I didn't experience any problems.
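The "refined recursion" mentioned above has to solve a standard problem: deferred callbacks that trigger further callbacks naturally recurse, and deep chains can blow past sys.getrecursionlimit(). One general way to bound this is to recurse only up to a threshold and queue anything deeper, draining the queue from the outermost frame. A sketch of that scheme (illustrative only; the class and method names are invented, and this is not the actual patch):

```python
import sys

class BoundedRecursion:
    """Run nested calls, but flatten them into a loop once a depth
    threshold is reached, so the interpreter's recursion limit is
    never hit."""

    def __init__(self, max_depth=100):
        assert max_depth < sys.getrecursionlimit()
        self.max_depth = max_depth
        self.depth = 0
        self.pending = []

    def call(self, func, *args):
        if self.depth >= self.max_depth:
            # Near the limit: queue the call instead of recursing.
            self.pending.append((func, args))
            return
        self.depth += 1
        try:
            func(*args)
            if self.depth == 1:
                # Only the outermost frame drains queued calls, so
                # the stack repeatedly unwinds instead of growing.
                while self.pending:
                    f, a = self.pending.pop(0)
                    f(*a)
        finally:
            self.depth -= 1
```

With this, a chain of 10,000 nested calls only ever occupies about max_depth stack frames at a time.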
Best regards,
Marcel

diff -r e2759515f57f apps/aes.py
--- a/apps/aes.py Thu Mar 05 21:02:57 2009 +0100
+++ b/apps/aes.py Tue Apr 21 17:25:45 2009 +0200
@@ -82,7 +82,7 @@
         rt.shutdown()

     g = gather_shares(opened_ciphertext)
-    g.addCallback(fin)
+    rt.schedule_callback(g, fin)

 def share_key(rt):
     key = []
diff -r e2759515f57f apps/benchmark.py
--- a/apps/benchmark.py Thu Mar 05 21:02:57 2009 +0100
+++ b/apps/benchmark.py Tue Apr 21 17:25:45 2009 +0200
@@ -91,6 +91,7 @@
     print "Total time used: %.3f sec" % (stop-start)
     print "Time per %s operation: %.0f ms" % (what, 1000*(stop-start) / count)
     print "*" * 6
+    sys.stdout.flush()

 operations = ["mul", "compToft05", "compToft07", "eq"]

@@ -174,7 +175,7 @@
     # run with no preprocessing. So they are quite brittle.
     if self.operation == operator.mul:
         key = ("generate_triples", (Zp,))
-        desc = [(i, 1, 0) for i in range(2, 2 + count)]
+        desc = [(2, 2, 2 * count + 1, 2, i, 1, 0) for i in range(1, count + 1)]
         program_desc.setdefault(key, []).extend(desc)
     elif isinstance(self.rt, ComparisonToft05Mixin):
         key = ("generate_triples", (GF256,))
@@ -228,7 +229,8 @@
     if seconds > 0:
         print "Starting test in %d" % seconds
         sys.stdout.flush()
-        reactor.callLater(1, self.countdown, None, seconds - 1)
+        time.sleep(1)
+        self.countdown(None, seconds - 1)
     else:
         print "Starting test now"
         sys.stdout.flush()
@@ -255,6 +257,7 @@
     def run_test(self, _):
         c_shares = []
         record_start("parallel test")
+        sys.stdout.flush()
         while self.a_shares and self.b_shares:
             a = self.a_shares.pop()
             b = self.b_shares.pop()
diff -r e2759515f57f apps/millionaires.py
--- a/apps/millionaires.py Thu Mar 05 21:02:57 2009 +0100
+++ b/apps/millionaires.py Tue Apr 21 17:25:45 2009 +0200
@@ -97,10 +97,10 @@
         # the argument (which is None since self.results_ready does
         # not return anything), so we throw it away using a lambda
         # expressions which ignores its first argument.
-        results.addCallback(lambda _: runtime.synchronize())
+        runtime.schedule_callback(results, lambda _: runtime.synchronize())

         # The next callback shuts the runtime down, killing the
         # connections between the players.
-        results.addCallback(lambda _: runtime.shutdown())
+        runtime.schedule_callback(results, lambda _: runtime.shutdown())

     def results_ready(self, results):
         # Since this method is called as a callback above, the results

diff -r e2759515f57f viff/active.py
--- a/viff/active.py Thu Mar 05 21:02:57 2009 +0100
+++ b/viff/active.py Tue Apr 21 17:25:45 2009 +0200
@@ -501,6 +501,7 @@
         result = Share(self, share_x.field)
         # This is the Deferred we will do processing on.
         triple = self.get_triple(share_x.field)
+
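A note on why the benchmark.py hunk can replace reactor.callLater with a plain time.sleep: once the VIFF code runs in its own thread, blocking there no longer stalls the reactor, but any work that must touch the reactor from that thread has to be handed over thread-safely (in Twisted, via reactor.callFromThread, the one reactor entry point documented as thread-safe). A stdlib-only toy model of that hand-over mechanism, with all names invented for illustration:

```python
import queue
import threading

class MiniReactor:
    """Toy model of the relevant reactor behaviour: a loop that only
    executes functions in its own thread. Other threads hand work over
    via call_from_thread, the analogue of Twisted's
    reactor.callFromThread."""

    def __init__(self):
        self._calls = queue.Queue()
        self.running = True

    def call_from_thread(self, func, *args):
        # Thread-safe: just enqueue; the reactor thread executes it.
        self._calls.put((func, args))

    def stop(self):
        # Stop by scheduling the flag flip in the reactor thread.
        self.call_from_thread(setattr, self, "running", False)

    def run(self):
        while self.running:
            func, args = self._calls.get()
            func(*args)
```

In this model the "VIFF thread" may sleep or block freely and then call call_from_thread() to ship data out, which is presumably the documentation-worthy invariant behind several of the hunks above.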