Martin Geisler <m...@daimi.au.dk> writes:

> I'll try and write a mail to them to explain our problem in more
> detail. Maybe your short patch didn't provide enough information when
> taken out of context.

I've included a longer answer from Jean-Paul Calderone below -- you can
read it in context here:

  http://thread.gmane.org/gmane.comp.python.twisted/18029/focus=18089

Jean-Paul Calderone <exar...@divmod.com> writes:

> On Sat, 07 Mar 2009 19:38:46 +0100, Martin Geisler <m...@daimi.au.dk> wrote:
>> Jean-Paul Calderone <exar...@divmod.com> writes:
>>
>> Hi,
>>
>> Thanks for the answer. I'm also with the VIFF project, and I would
>> like to explain a bit more about the background for Marcel's hack.
>>
>>> [snip]
>>>
>>> So you're doing a ton of work all at once now and you want to split
>>> up that ton of work into smaller pieces and do it a little at a
>>> time?
>>
>> Sort of. We have overloaded the arithmetic operators in our library,
>> so people will expect to be able to write:
>>
>>   # xs and ys are big lists of our objects
>>   dot_product = 0
>>   for (x, y) in zip(xs, ys):
>>       dot_product += x * y
>>
>> Here each multiplication involves network traffic and returns a
>> Deferred. We would like the network traffic for the first
>> multiplication to begin immediately, *before* the remaining
>> multiplications are done.
>>
>> Doing all the multiplications up front makes the code block the
>> reactor and use an awful lot of RAM. If we let each multiplication
>> trigger the sending of its data immediately, and if we process
>> incoming messages along the way, memory can be reclaimed for the
>> earlier multiplications, and the above calculation should run in
>> constant memory.
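(An aside for viff-devel readers: the toy sketch below shows the kind
of overloading we mean. It is not the real VIFF code -- the Share class
here is made up for illustration, and deferLater merely fakes the
network round-trip:

    from twisted.internet import defer, reactor, task

    class Share(defer.Deferred):
        # Toy stand-in for a share: a Deferred with an overloaded "*".

        def __mul__(self, other):
            result = Share()
            both = defer.gatherResults([self, other])
            # Pretend the product needs a network round-trip; in the
            # real library this is where protocol messages go out.
            both.addCallback(
                lambda ab: task.deferLater(reactor, 0.1,
                                           lambda: ab[0] * ab[1]))
            both.chainDeferred(result)
            return result

    def local_share(value):
        # A share whose value is already known locally.
        s = Share()
        s.callback(value)
        return s

    xs = [local_share(v) for v in [1, 2, 3]]
    ys = [local_share(v) for v in [4, 5, 6]]
    product = xs[0] * ys[0]  # traffic starts here; result is a Share

The real library overloads __add__ and friends in the same style: the
"*" starts the traffic and immediately hands back a Deferred-like
object.)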
> Hm. I would bet the constant would be pretty high, though. Things
> will only balance out once the network is keeping up with the local
> for loop. Actually, I'm not sure why it would be constant at all.
> Won't the local for loop always run much faster than any network
> operations can happen? In this case, memory usage will go towards
> K(local) * set size - K(remote) * set size, or just O(set size); that
> is to say, linear.
>
>> Sending and processing data in a more even flow makes our benchmark
>> results better and more consistent from one run to the next.
>
> It seems like what you'll really benefit from most is pipelining with
> a maximum pipeline depth that's not too big (so as to conserve
> memory) but not too small (so as to avoid wasting time on network
> round trip time).
>
>>> If that's the case, then you don't need to modify the reactor, you
>>> just need to split up the work your code is doing. There are a lot
>>> of techniques for doing this. coiterate and inlineCallbacks are two
>>> solutions which are closest to "cookie cutter" (ie, you have the
>>> least flexibility in deciding how to use them).
>>
>> Right -- we might be able to use these techniques. I haven't looked
>> at coiterate yet. With inlineCallbacks I guess the code would look
>> something like this:
>>
>>   # xs and ys are big lists of our objects
>>   dot_product = 0
>>   for (x, y) in zip(xs, ys):
>>       dot_product += (yield x * y)
>>
>> which is not so bad, except that it destroys the nice illusion that
>> x and y behave like normal integers even though the multiplication
>> involves network traffic.
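(Spelled out as a self-contained program, that variant might look like
the sketch below. The net_mul function is made up for illustration --
it fakes the round-trip with deferLater. Note that the yield suspends
the loop until each product has arrived, so only one multiplication is
in flight at a time:

    from twisted.internet import defer, reactor, task

    def net_mul(x, y):
        # Made-up stand-in for an overloaded "*": returns a Deferred
        # that fires with x * y after a fake network round-trip.
        return task.deferLater(reactor, 0.01, lambda: x * y)

    @defer.inlineCallbacks
    def dot(xs, ys):
        dot_product = 0
        for (x, y) in zip(xs, ys):
            # Suspend the loop here until this product has arrived.
            dot_product += (yield net_mul(x, y))
        defer.returnValue(dot_product)

    def report(result):
        print "dot product:", result
        reactor.stop()

    dot([1, 2, 3], [4, 5, 6]).addCallback(report)
    reactor.run()

So this version never blocks the reactor, but it also never keeps more
than one multiplication on the wire.)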
> Perhaps with coiterate this might look something like...
>
>   def op(xs, ys):
>       dot_product = 0
>       for (x, y) in zip(xs, ys):
>           dot_product += x * y
>           yield
>
>       yield dot_product
>
>   d = coiterate(op(xs, ys))
>
> This is flawed, but maybe it can be fixed. What's good about it is
> that it doesn't try to drive the reactor re-entrantly. :) It also
> avoids the yield in the += and * operations, which somewhat preserves
> your illusion of normalcy (I'm not saying I like that illusion, but
> I'll leave that aside for now).
>
> What's bad about it is that it still lets the local loop run
> arbitrarily far ahead of the results from the network, giving you
> unbounded memory usage. To fix that, it should yield a Deferred every
> once in a while. The reason I leave it flawed here is that it's a
> little tricky to figure out which Deferred to yield. What would be
> great would be to yield the Deferred for an operation which precedes
> the most recently executed operation by some number of operations.
> The number of operations by which it precedes the most recent will be
> your pipeline depth (roughly). The effect this has on coiterate is to
> cause your local loop to cease to execute until that Deferred has
> fired. By making it a Deferred for an operation some time /before/
> the most recent operation, you keep the network busy and avoid
> wasting time on round trip times. Once the Deferred fires, your loop
> gets run a few more times, which has the effect of putting more work
> into the pipeline -- until you've got enough extra work to keep
> things from stalling again, and then coiterate puts you to sleep for
> a while again.
>
>>> You have a very long, steep, uphill battle to convince me that
>>> adding support for re-entrant iteration is a good idea.
>>
>> One problem I can think of is the memory usage associated with a
>> very deep recursion. Since there is no such thing as tail call
>> optimization in Python, each level in the recursion will hold onto
>> its local variables even though they might not be needed any more.
>>
>> Are there other general problems with having a re-entrant reactor?
>
> One general problem is simply that the reactor is not written with
> this in mind. So with the current implementation, even including the
> patch Marcel contributed, it doesn't work, period. When I say
> "doesn't work", I mean that there are cases which will simply result
> in incorrect behavior, though there may be other cases where
> everything does work out as you expect. Going along with this, it's
> quite a bit harder to test that things work when re-entrancy is
> allowed or expected than when it isn't, so even if all of the current
> reactor implementations were made re-entrant, there would likely be a
> higher rate of defects related to this in the future, simply because
> it's harder.
>
> This isn't limited solely to the reactor, either. A re-entrant
> reactor almost certainly means that application code will be invoked
> re-entrantly. This is actually the case already, unfortunately, and
> it is a bit of an open question as to what should be done about it.
> A very common bug I find (and write! :() in Twisted applications is
> failure to handle various forms of re-entrancy correctly. This is an
> existing issue with Twisted, not related to this proposed change, but
> making this change would only make the problem worse and essentially
> destroy any hope of ever making it better.
>
> Actually, that general problem is so general that I can't really
> think of any others to discuss, so I'll leave it at that for now. :)
> If you want, I can probably share some specific examples of how
> re-entrancy leads to surprising bugs in unexpected places (probably
> from real programs, too :/).
>
> Jean-Paul
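That pipelining idea sounds like exactly what we want. To check that I
understand it, here is my rough attempt at a bounded pipeline on top of
coiterate. Again, net_mul is a made-up stand-in that fakes the network
multiplication, and depth is the pipeline depth Jean-Paul describes:

    from collections import deque

    from twisted.internet import reactor, task

    def net_mul(x, y):
        # Made-up stand-in for an overloaded "*" with a fake round-trip.
        return task.deferLater(reactor, 0.01, lambda: x * y)

    def dot(xs, ys, depth=10):
        total = [0]        # no "nonlocal" in Python 2, hence the list
        pending = deque()  # Deferreds for multiplications in flight

        def accumulate(product):
            total[0] += product

        def op():
            for (x, y) in zip(xs, ys):
                d = net_mul(x, y)  # network traffic starts here
                d.addCallback(accumulate)
                pending.append(d)
                if len(pending) >= depth:
                    # Sleep until the oldest outstanding product is in;
                    # the newer ones keep the network busy meanwhile.
                    yield pending.popleft()
                else:
                    yield  # give the reactor a turn without blocking
            # Drain whatever is still in flight.
            while pending:
                yield pending.popleft()

        return task.coiterate(op()).addCallback(lambda _: total[0])

If that is roughly right, memory stays capped at about depth
outstanding multiplications, while a depth of a few round-trips' worth
of work should keep the network from idling.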
-- 
Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient
SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.