Re: [viff-devel] Say hello to viff.boost
Hi friends of VIFF,

My repository now also features a C implementation of viff.field. It uses GMP in GFElement. The benchmarks show a speedup between 100 and over 200 percent (the latter in multiplication triple generation with hyperinvertible matrices). The instructions to use it are the same.

Best regards, Marcel

Marcel Keller wrote:
> Hi friends of VIFF,
> I've implemented Share and ShareList in C, based on a C implementation of Deferred. Using the C extension, benchmark.py and aes.py show a speedup between 50 and 100 percent. The code is in my repository: http://hg.viff.dk/mkeller
> To use the extension, first compile and install it:
>
>   $ python setup.py install [--prefix PREFIX]
>
> Then, add the following two lines to your program before importing anything from viff or twisted:
>
>   import viff.boost
>   viff.boost.install(with_viff_reactor=True|False)
>
> If the parameter is True, the VIFF reactor is also installed, so this doesn't have to be done separately.
> There is a notable difference to the standard implementation of Deferred: errbacks are disabled, and exceptions raised in callbacks are not caught. This is for the following reasons:
> - The implementation was inefficient (too many isinstance() calls).
> - Errbacks are only used for error reporting in VIFF. However, I prefer VIFF to crash (and print a traceback) immediately if there is an exception, instead of wrapping the exception and reporting it later, which makes debugging harder.
> I'm open to comments on this issue. It shouldn't be hard to reimplement errbacks or to implement something different, like a global errback. For compatibility, all Deferred methods are still present. If trial is modified to load the C extension, all tests succeed except a few doctests.
> Best regards, Marcel

___ viff-devel mailing list (http://viff.dk/) viff-devel@viff.dk http://lists.viff.dk/listinfo.cgi/viff-devel-viff.dk
Re: [viff-devel] Value overflow in Toft07
Dear Lars,

Thanks for pointing it out. It is now fixed in the official repository.

Best regards, Marcel

Lars Krapf wrote:
> Hello VIFF team,
> I would like to suggest the following patch to viff/comparison.py:
>
>   159c159
>   < l = self.options.security_parameter + math.log(dst_field.modulus, 2)
>   ---
>   > l = int(self.options.security_parameter + math.log(dst_field.modulus, 2))
>
> Otherwise the l used in the next line:
>
>   this_mask = rand.randint(0, (2**l) - 1)
>
> is a float, and we get "34, Value out of Range" exceptions for big l.
> Best greetings, Lars Krapf
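The failure mode behind this patch can be reproduced with plain stdlib Python (the parameter values below are made up for illustration; the variable names merely mirror viff/comparison.py):

```python
import math
import random

# Hypothetical stand-ins for VIFF's security parameter and field modulus.
security_parameter = 32
modulus = 2**1100 + 7

# Without the int() cast, l is a float ...
l_float = security_parameter + math.log(modulus, 2)
assert isinstance(l_float, float)

# ... so 2**l_float is evaluated in floating point, which overflows once
# l exceeds roughly 1024 (the exponent range of a C double):
try:
    2 ** l_float
    overflowed = False
except OverflowError:
    overflowed = True
assert overflowed

# With the cast, 2**l is exact arbitrary-precision integer arithmetic,
# and randint works for any size of l:
l = int(security_parameter + math.log(modulus, 2))
mask = random.randint(0, (2**l) - 1)
assert 0 <= mask < 2**l
```

This also shows why the bug only bites for big fields: for small moduli, 2**l fits in a double and the float result merely loses precision silently.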
Re: [viff-devel] VIFF and random numbers
Thomas P Jakobsen wrote:
> The urandom is OS-specific:
>   "This function returns random bytes from an OS-specific randomness source. The returned data should be unpredictable enough for cryptographic applications, though its exact quality depends on the OS implementation. On a UNIX-like system this will query /dev/urandom, and on Windows it will use CryptGenRandom."
> I don't know whether this will be good enough. There is a paper describing various flaws: http://www.pinkas.net/PAPERS/gpr06.pdf
> If not, I guess we'll have to use some external package (openssl?) or implement our own algorithm.

viff.util.rand is used to make all randomness replayable, which has already helped me find bugs triggered by certain randomness. I would like to keep this feature in the future; therefore, I would vote against a random number generator that uses noise for anything more than generating a seed. If the environment variable VIFF_SEED is set to the empty string, VIFF already uses /dev/urandom, by way of random.SystemRandom, for random number generation. This possibility is not documented, however.

Best regards, Marcel

On Tue, Jul 6, 2010 at 15:40, Ivan Bjerre Damgård i...@cs.au.dk wrote:
> It is not good to use the wrong kind of PRG; it should be fixed as soon as possible. But do we know that os.urandom will be OK on any platform, or is this OS-dependent at the end of the day?
> - Ivan
> On 06/07/2010, at 15.22, Thomas P Jakobsen wrote:
>> VIFF itself, as well as most protocols implemented in VIFF, uses the viff.util package for random number generation. This package in turn uses the random package in the Python standard library. This means that random numbers are generated using a Mersenne twister. As far as I can see, this is a problem, since Mersenne twister PRNGs are generally not suited for cryptographic usage. E.g., they are not known to pass the next-bit test or to withstand state compromise extensions; see http://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator.
>> One solution would be to use the os.urandom() function instead. This has specifically been designed to produce cryptographically secure random numbers. (We should probably keep the old random generator, too. It is probably faster, and not all random numbers used in VIFF and VIFF programs need to be cryptographically secure.)
>> Let me know what you think about this. Kind regards, Thomas
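The replayable-randomness convention Marcel describes can be sketched in plain Python. This is a simplified stand-in for viff.util's behaviour as described above, not VIFF's actual code; the function name and the exact VIFF_SEED handling are assumptions for illustration:

```python
import random

def make_rand(seed_env=None):
    """Return a PRNG following the VIFF_SEED convention described above.

    - empty string: draw randomness from the OS entropy source
      (random.SystemRandom, i.e. /dev/urandom or CryptGenRandom);
    - unset (None): pick a fresh seed from the OS source, which a real
      implementation would record so the run can be replayed later;
    - anything else: use it as the seed of a deterministic, replayable
      Mersenne-twister instance.
    """
    if seed_env == "":
        return random.SystemRandom()
    if seed_env is None:
        seed_env = str(random.SystemRandom().randint(0, 2**64))
    return random.Random(seed_env)

# Two runs with the same seed are replayable: identical random choices.
a = make_rand("my-seed")
b = make_rand("my-seed")
assert [a.randint(0, 100) for _ in range(5)] == [b.randint(0, 100) for _ in range(5)]
```

The point of the seeded path is exactly the debugging property mentioned above: a failing run can be reproduced bit-for-bit, which a noise-driven generator cannot offer.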
Re: [viff-devel] VIFF reactor
Hi Joel,

> Is it still necessary to run `viff.reactor.install()` as described in http://www.mail-archive.com/viff-devel@viff.dk/msg00657.html in order to utilize the VIFF reactor?
> - If so, would it be possible to fix that?

I don't see a good way to do that, for the following reasons:
- To change the default reactor properly, it would be necessary to change the Twisted source code. That is not an elegant way.
- The VIFF reactor could be installed in the runtime module. However, if it were done that way, viff.runtime would have to be imported before importing the reactor, which is a possible source of errors. Furthermore, that would prevent other reactors, like the GtkReactor, from being used.

> - If not, then the example apps would need to be updated accordingly.

As far as I can see, all example apps are updated in VIFF 1.0.

Best regards, Marcel
[viff-devel] OrlandiRuntime implementation
Hi Claudio and Jesper,

In the code review of the OrlandiRuntime we found two points we want to discuss with you.

Step 3 of the triple generation protocol says: Coin-flip a subset \fancy_T \subset \fancy_M of size \lambda(2d + 1)M. The current implementation loops over the elements in \fancy_M and decides fifty-fifty for every element whether it should be in \fancy_T, until there are enough elements in \fancy_T. I however think that this choice should be biased according to the sizes of \fancy_M and \fancy_T, i.e. every element of \fancy_M should be put in \fancy_T with probability \lambda(2d + 1)M / ((1 + \lambda)(2d + 1)M) = \lambda / (1 + \lambda). Furthermore, I think the probability should be updated whenever \fancy_M is processed completely and \fancy_T still is not big enough. Maybe it would be easier to choose \lambda(2d + 1)M times a random element of the remaining ones in \fancy_M instead.

In step 6, the implementation generates a distributed random number in Z_p to determine a partition where one element should be put if there is still space there. I suggest using the randomness more efficiently to save communication. E.g., if a random number in Z_p is smaller than M^l with l \le log_M(p), one can extract l random numbers in Z_M. The same method could probably be used in step 3 as well.

Best regards, Marcel
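Both suggestions are easy to sketch in plain Python. The set sizes and modulus below are toy values, and `random.sample` stands in for the coin-flipping protocol, which in the real runtime is of course a distributed coin-flip:

```python
import random

# Toy parameters (hypothetical, for illustration only).
lam, d, M = 1, 1, 4                                # lambda, degree, triple count
pool = list(range((1 + lam) * (2 * d + 1) * M))    # the set \fancy_M
test_size = lam * (2 * d + 1) * M                  # |\fancy_T|

# Suggestion 1: instead of repeated fifty-fifty passes over \fancy_M,
# draw exactly test_size distinct elements -- this guarantees the size
# of \fancy_M \ \fancy_T in one pass.
T = random.sample(pool, test_size)
assert len(set(T)) == test_size

# Suggestion 2: extract several numbers in Z_M from one number in Z_p.
# If r < M**l with l <= log_M(p), the l base-M digits of r are
# independent and uniform in Z_M.
def extract_digits(r, M, l):
    digits = []
    for _ in range(l):
        r, digit = divmod(r, M)
        digits.append(digit)
    return digits

p = 2**31 - 1                 # toy modulus
l = 0
while M ** (l + 1) <= p:      # largest l with M**l <= p
    l += 1
r = random.randrange(M ** l)  # a coin-flipped value known to be < M**l
assert all(0 <= x < M for x in extract_digits(r, M, l))
```

With M = 4 and a 31-bit p this yields l = 15, i.e. fifteen uniform values in Z_M from a single distributed random number instead of fifteen separate coin-flips.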
Re: [viff-devel] OrlandiRuntime implementation
Claudio Orlandi wrote:
> On Wed, Nov 4, 2009 at 5:46 AM, Marcel Keller mkel...@cs.au.dk wrote:
>> Step 3 of the triple generation protocol says: Coin-flip a subset \fancy_T \subset \fancy_M of size \lambda(2d + 1)M. [...] Maybe it would be easier to choose \lambda(2d + 1)M times a random element of the remaining ones in \fancy_M instead.
> So you could loop over the elements of M and include them in T with probability \lambda/(1+\lambda), but then you're not guaranteed that you have enough triplets left in M.

Exactly, that's why I suggested the solution which in that case would restart with an adapted probability.

> Choosing the right number of random elements from M and including them in T is the right choice (or any solution that guarantees the size of M \ T to be 2(2d+1)M). I don't know if the typo is in your email or in Step 3, but the size of the set M should be (1+\lambda)2(2d+1)M --- you are missing a 2.

I think the 2 is missing in your report and the code.

>> In step 6, the implementation generates a distributed random number in Z_p to determine a partition where one element should be put if there is still space there. I suggest using the randomness more efficiently to save communication. E.g., if a random number in Z_p is smaller than M^l with l \le log_M(p), one can extract l random numbers in Z_M. The same method could probably be used in step 3 as well.
> Right. Actually, if you want to save communication here, you can also use a PRG with the distributed random number as the seed.
> If you have time, there are a lot of things that need to be done here. Janus already knows some of them, but said he can't work on them. Here is something nice we should do to reduce the online time by a factor of (at least) 3:
> In the protocol there are Pedersen commitments with three base elements g, h_1, h_2. To commit to x using randomness r_1, r_2, one computes g^x h_1^r_1 h_2^r_2. The protocol currently does it the following way:
>   1) compute a = g^x
>   2) compute b = h_1^r_1
>   3) compute c = h_2^r_2
>   4) compute d = a*b*c
> The complexity is dominated by the three exponentiations, each implemented using some ladder (square-and-multiply). There is no need to do three exponentiations if one does something a bit smarter:
> - precompute g001 = g^0 h_1^0 h_2^1, g010 = g^0 h_1^1 h_2^0, ..., g111 = g^1 h_1^1 h_2^1
> Now, to compute g^x h_1^r_1 h_2^r_2, one does the ladder by looking simultaneously at the i-th bits of x, r_1 and r_2. The complexity of this is just one ladder, so virtually the same as just one of the above exponentiations.
> One can go further and precompute more elements, for instance g000 ... g333, and now look at 2 bits of each of x, r_1, r_2 for every square in the square-and-multiply. This saves another factor of 2 in time, but you need to store a bigger table of size 8^2. What is the best strategy given our architecture? How many powers should we precompute?

The commitments are computed by an external elliptic curve library, so I can't answer anything here. Janus should know more about it.

Regards, Marcel
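The trick Claudio describes (simultaneous square-and-multiply, often attributed to Straus and Shamir) can be sketched over a plain multiplicative group mod a prime. The group, bases and exponents below are toy stand-ins; the real protocol uses an elliptic-curve group via an external library:

```python
# Sketch: Pedersen-style commitment g^x * h1^r1 * h2^r2 mod p computed
# with a single ladder over the joint bits of (x, r1, r2).

p = 2**127 - 1          # toy prime modulus (not VIFF's actual group)
g, h1, h2 = 3, 5, 7     # toy "base elements"

# Precompute the 8 products g^bx * h1^b1 * h2^b2 for bits bx, b1, b2,
# indexed as (bx << 2) | (b1 << 1) | b2 -- this is Claudio's g000..g111.
table = [0] * 8
for bx in range(2):
    for b1 in range(2):
        for b2 in range(2):
            table[(bx << 2) | (b1 << 1) | b2] = (g**bx * h1**b1 * h2**b2) % p

def commit(x, r1, r2):
    """One square-and-multiply ladder instead of three exponentiations."""
    result = 1
    bits = max(x.bit_length(), r1.bit_length(), r2.bit_length())
    for i in range(bits - 1, -1, -1):
        result = result * result % p                                  # one squaring
        idx = (((x >> i) & 1) << 2) | (((r1 >> i) & 1) << 1) | ((r2 >> i) & 1)
        result = result * table[idx] % p                              # one multiplication
    return result

# Agrees with the naive three-exponentiation method:
assert commit(13, 7, 200) == pow(g, 13, p) * pow(h1, 7, p) * pow(h2, 200, p) % p
```

The 2-bit variant Claudio mentions is the same ladder with one squaring step per two bit positions and a 4^3 = 64-entry table (g000 ... g333), trading memory for roughly another factor of 2.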
Re: [viff-devel] OrlandiRuntime implementation
Claudio Orlandi wrote:
> On Wed, Nov 4, 2009 at 10:15 AM, Marcel Keller mkel...@cs.au.dk wrote:
>> I think the 2 is missing in your report and the code.
> Wow. Then it really shouldn't work! The factors are:
> - 1+\lambda: because in the test we check a fraction \lambda/(1+\lambda) of the triplets
> - 2: because there is a test to check the correctness of the triplets that burns one triplet to ensure the other one is good
> - 2d+1: because we do the Shamir secret sharing multiplication

Isn't the burning already done in TripleTest, so you really have just (\lambda+1)(2d+1)M triples left when doing the coin-flipping for the test set?

> I guess at some point I should also give a look at the code...
> [quoted discussion of the randomness extraction and of the precomputation trick for Pedersen commitments, as in the previous messages]
>> The commitments are computed by an external elliptic curve library, so I can't answer anything here. Janus should know more about it.
> Actually, I think that the elliptic curve library offers methods for squaring and multiplication, not for the commitments as an object... I seem to remember that Janus implemented them.

In any case, the commitments are completely implemented in C, and I haven't looked at that code.
Re: [viff-devel] Noisy preprocessing
Actually, I meant the update method in runtime.py, line 807, which doesn't output program counters.

Janus Dam Nielsen wrote:
> Hi Marcel,
> Occasionally I need to see which program counters are needed, so I can save them and use the NeededDataBenchmark strategy. This suggests that we should make it an option.
> On 28/10/2009, at 20.01, Marcel Keller wrote:
>> Hi Janus, do you still need the timing output in the update callback in Runtime.preprocess()? It makes benchmarking the usual runtimes very noisy because the update callback is called many times there.
>> Best regards, Marcel

Janus Dam Nielsen
Research and Innovation Specialist, PhD
Centre for IT-Security, The Alexandra Institute Ltd.
[viff-devel] Noisy preprocessing
Hi Janus, do you still need the timing output in the update callback in Runtime.preprocess()? It makes benchmarking the usual runtimes very noisy because the update callback is called many times there. Best regards, Marcel
[viff-devel] Orlandi preprocessing
Hi Janus, I remember you saying today that the preprocessing in the OrlandiRuntime is more efficient per item the more items are requested. Is that correct? I ask because in my optimizations, I limited the items being preprocessed per call in order to save memory. I would of course drop that because it doesn't really affect the other cases. Best regards, Marcel
Re: [viff-devel] Optimizing preprocessing
Martin Geisler wrote:
> Janus Dam Nielsen janus.niel...@alexandra.dk writes:
>> Hi Marcel, I am not opposed to your suggestion. However, I would like to point out that in VIFF you compute on shares, not field elements!
> Well, we've actually made the outer runtime interfaces in such a way that add, mul, xor, etc. accept integers, FieldElements and Shares. The methods then wrap their input as needed -- or they *don't* wrap it if that leads to a shortcut (e.g., constant multiplication).

I agree (see also my answer).

>> Computing directly on the field elements is hacking the abstractions of VIFF. Computation on field elements, or rather on the representation of a Share, can be useful as an optimization; however, this optimization should be confined within applications or runtimes, and should not cross interface boundaries, as I fear you are suggesting.
> I think we are in agreement: public methods on the runtimes will keep returning Shares. Methods used internally in runtimes can return other things as needed.
> To me it sounds like a better API to require preprocessing functions to return a list of Deferreds: [D(?), D(?), ...], instead of a Deferred list of tuples containing Deferreds :-) I think it will simplify the interface nicely, at least for consumers. Using simpler types also leads to less memory usage, which has a positive effect on performance, as Marcel notes. So let's go for it.

So this makes 2 votes in favour and 1 against. Maybe we should have a meeting to discuss it. What do you think?

Best regards, Marcel
[viff-devel] Say hello to viff.reactor
Hi friends of VIFF,

I've now implemented a special reactor for VIFF, based on the standard SelectReactor. The new reactor should make non-trivial programs considerably faster; e.g., parallel computation of 10 AES blocks drops from over 6 seconds to 2.3 seconds per block (3 players, passive security). The code is in my repository: http://hg.viff.dk/mkeller

To use the new reactor, simply install it before importing the reactor:

  import viff.reactor
  viff.reactor.install()
  from twisted.internet import reactor

If you use other reactors such as Gtk2Reactor or WxReactor, VIFF will still work, but without any speed improvement.

Best regards, Marcel
[viff-devel] Non-default reactors
Hi friends of VIFF,

Who is working with non-default reactors, such as gtk2reactor? How do you deal with the fact that VIFF might block the reactor for a long time, which might render the GUI non-responsive? I'm currently working on a special reactor for VIFF to get more control over the scheduling.

Best regards, Marcel
Re: [viff-devel] Confusing behaviour?
I think the problem is the following: every player sends its ID on a new connection, and the connection is considered to be set up when this ID arrives. So it may occur that a player has received the IDs from all other players but still has to send its own ID to other player(s). But nothing is sent before all callbacks are executed. Since the player considers all connections to be made, the main callback, which blocks, is also executed. The problem is therefore related to what I'm working on, because in a two-threaded solution the Twisted reactor is (nearly) never blocked.

However, your problem could probably be fixed by synchronizing before blocking:

  sync = runtime.synchronize()
  sync.schedule_callback(blocking_code)

Thomas P Jakobsen wrote:
> Hi all,
> When I execute the attached VIFF protocol on three servers, I would expect all three to ask me to press enter. When all three servers have done that, I would expect the computation of c to start and the servers to eventually finish.
> I've run the protocol several times on Linux and Windows. What happens is this: sometimes it works out as expected, but at other times only two of the servers will ask the user to press enter. In some cases, the third server will ask its user to press enter as soon as one of the two other servers presses enter. At other times, the third server will only wake up and ask its user to press enter when both of the first two servers have pressed enter.
> I find this behaviour confusing (..twisted?) and wonder whether it is a bug or a feature. If it's a bug, could it be related to the issue discussed in the thread "Mystery of the quadratic running time solved?"? I must admit that I haven't followed that thread in detail...
> By the way, the same thing seems to happen if the protocol, instead of asking the user to press enter, does some local computation like querying a database, sleeping, computing primes or the like. Sometimes, one of the servers will sit still until one or both of the other servers have finished their local jobs. This was how I initially stumbled across the problem.
> Best regards, Thomas
Re: [viff-devel] Mystery of the quadratic running time solved?
Hi Ivan,

> I just wanted to say that I think it would be great if you would implement a version of your proposed two-threaded solution. I do not have a firm grasp of all the programming details, but it does seem that the overall idea is converging, and that some time soon the best way to judge the idea is to go ahead and do it.

OK, I will start doing it next Monday.

> Note that Alexandra is committed to using a non-trivial amount of resources on developing VIFF and related software. Although this may not mean that lots of man-hours are available just now, it might be possible that they could help you. It may also be more efficient that one guy does it in the first iteration; I'll leave this up to you.

Since it doesn't seem that complicated to me, I will try it on my own first. But I'll keep that in mind in case I'm wrong.

> Note that myself, Martin and Jakob will not be at the meeting Thursday. Maybe we should postpone the meeting until next week?

Me neither; I'm on holiday until the weekend. Therefore, I would agree to postpone the meeting.

Best regards, Marcel
Re: [viff-devel] Mystery of the quadratic running time solved?
> You're talking about this two-threaded solution as if it is something that exists and will solve all our problems...

No, for now it's just an idea in my mind, a proposal for the next meeting, and a strong feeling that it's the right way to do it.

> But I still haven't seen it, and I would like to see how you can cleanly separate the work of the two threads. I'm afraid that the threads will alternate between working and waiting on each other in such a way that we could just as well use one thread.

My idea is that the Twisted main loop runs in one thread and most of the VIFF code in the other. The Twisted thread only waits for I/O, and the VIFF thread only waits if there is no work to be done. If the group decides to give this idea a try, I would be happy to implement it. I would claim that it can be done in one or two weeks.

> Also, threading in Python is unfortunately not optimal :-( The Python VM will use a single thread to execute the bytecode, even when using multiple threads in Python. It is only if the threads block for I/O that you will see a performance increase from using multiple threads.

I'm aware that Python only runs on one CPU; a friend pointed that out to me today. However, the Twisted thread mentioned above would block on I/O.
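The division of labour Marcel proposes (an I/O thread that only waits for the network, and a worker thread that only waits when idle) can be illustrated with a plain queue. This is a simplified stand-in, not VIFF's actual reactor code; the "I/O thread" is simulated by the main thread enqueuing items:

```python
import queue
import threading

# Work items handed from the "I/O thread" (standing in for the Twisted
# reactor) to the "VIFF thread" (standing in for the MPC scheduler).
work = queue.Queue()
results = []

def viff_thread():
    # Blocks only when there is no work to be done.
    while True:
        item = work.get()
        if item is None:              # sentinel: shut down
            break
        results.append(item * item)   # stand-in for processing a share

worker = threading.Thread(target=viff_thread)
worker.start()

# The I/O thread would enqueue incoming shares as they arrive; here we
# just simulate a few arrivals and then shut down.
for share in range(5):
    work.put(share)
work.put(None)
worker.join()

assert results == [0, 1, 4, 9, 16]
```

Because the worker never touches the network and the I/O thread never computes, neither thread holds the GIL while the other has useful work, which is exactly the property Marcel relies on in his reply about Python running on one CPU.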
Re: [viff-devel] Mystery of the quadratic running time solved?
> Indeed we did not know (well, I didn't) back then that the data was not sent immediately by Twisted, and I was starting to wonder yesterday whether the hack would make a difference. Lucky for us, it apparently does :)

That is not the only problem. To free the memory of the shares and to send out further shares, the incoming shares must also be processed as soon as possible. This is even trickier, because incoming shares might trigger code that calls functions sending out data, which activates the Twisted reactor again and can therefore lead to too deep a recursion. I think I have a solution for that; it just wasn't necessary to implement it for now because the hack worked anyway.