Hi Everyone,

BLAKE2 (https://blake2.net/blake2.pdf) has a tree implementation option that can execute in parallel. The parallel tree code is missing from the RFC (RFC 7693), but I'm guessing it's probably useful in some situations.

I'm working on a BLAKE2 implementation, but it _will not_ include the tree implementation for the initial cut-in. However, I want to ensure we can add the tree code in the future without changing the existing interface or breaking clients.

I'd like some feedback on the stub implementation for the tree-based code.

I'm not considering the option where we perform all the parallel processing in the "core library" using Windows threads, pthreads or OpenMP tasks. I don't feel it's within the purview of the library. However, we probably want to provide an OpenMP test case for the tree-based code to ensure it works as expected.

*****

Crypto++ follows the Init/Update/Final pattern:

    // Constructor follows RAII and performs Init()

    // Existing....
    void Update(const byte* inString, size_t length);

    // Existing
    void Final(byte* digest);
    void TruncatedFinal(byte* digest, size_t length);

*****

Now we need to decide how to carve in the tree-based code. I think four things need to be performed, similar to Hadoop processing:

  (1) Data needs to be partitioned
  (2) Data needs to be packaged
  (3) Data needs to be processed
  (4) Results need to be combined

(1) is the client's responsibility. (2) is the client's responsibility, but we need to provide the data structure. (3) is our responsibility. (4) is the client's responsibility.

With that said, I'm thinking there should be a structure to represent the packaged data. I'm thinking it should be an inner class, similar to an iterator:

  (a) BLAKE2::TreeNode

I think we need an additional Update method, and that method should be const:

  (b) bool BLAKE2::Update(TreeNode& node, bool throwOnError) const

  (c) TreeNode will likely need to be an IN/OUT parameter. IN, it will hold a reference to the existing state of the BLAKE2 hash. The BLAKE2 object state is constant. The TreeNode state is mutable.

  (d) Update will perform the processing on the node, and only modify the members of TreeNode. It will not modify the state of the BLAKE2 object.

  (e) OUT, the TreeNode will provide the result of processing, meaning the transformed data.

  (f) Update will return true/false to indicate success/failure. If throwOnError is true, then Update will throw on failure instead. I think this is important because async methods and parallel processing complicate catching a throw and matching an error to the offending code.

  (g) After the function returns, the results need to be combined and the BLAKE2 object needs to be updated. This is where we update the BLAKE2 object state, if necessary.

*****

Does anyone see any gaps or problems with (a) - (g)?

Does anyone have an idea of what should happen after (f) to accomplish (g)? What should the (g) function look like? What is its signature?

Sorry about the long write-up. I know it's not easy to parse. Thanks in advance.

Jeff
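
P.S. To make (a) - (f) a bit more concrete, here is a rough sketch of the shape described above. It is only an illustration under stated assumptions, not Crypto++ code: `ToyHash`, `MakeNode`, `State` and the `TreeNode` members are hypothetical names, the node holds a copy of the parent state rather than a reference, and the mixing step is FNV-1a standing in for the BLAKE2 compression function. The combine step (g) is left out on purpose, since that is the open question.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

typedef unsigned char byte;

// Hypothetical stand-in for the BLAKE2 class. The mixing function below is
// FNV-1a, NOT BLAKE2; only the shape of the API is of interest here.
class ToyHash
{
public:
    // (a) Inner class that packages one partition of the data.
    struct TreeNode
    {
        const byte*   input;   // IN: the partition to process
        std::size_t   length;  // IN: partition length in bytes
        std::uint64_t state;   // IN/OUT: starts as a copy of the parent state,
                               //         ends as the node's result (c), (e)
        bool          done;    // OUT: true once processing has succeeded
    };

    ToyHash() : m_state(14695981039346656037ULL) {}

    // (2) Package a partition. The node snapshots the parent state, so the
    // parent object is never touched afterwards.
    TreeNode MakeNode(const byte* input, std::size_t length) const
    {
        TreeNode node = { input, length, m_state, false };
        return node;
    }

    // (b), (d), (f) const Update: modifies only the node, never *this.
    // Returns false on failure, or throws instead if throwOnError is true.
    bool Update(TreeNode& node, bool throwOnError) const
    {
        if (node.input == nullptr && node.length != 0)
        {
            if (throwOnError)
                throw std::invalid_argument("TreeNode: null input");
            return false;
        }

        // Placeholder mixing step (FNV-1a), standing in for compression.
        for (std::size_t i = 0; i < node.length; ++i)
            node.state = (node.state ^ node.input[i]) * 1099511628211ULL;

        node.done = true;
        return true;
    }

    // Accessor used only to observe that Update leaves the parent unchanged.
    std::uint64_t State() const { return m_state; }

private:
    std::uint64_t m_state;
};
```

In this sketch `Update` never reads or writes the `ToyHash` object, so separate threads could each call it on their own `TreeNode` against the same const object without synchronization. Whether the real tree mode allows a node to be that independent of the parent is something the actual design would have to confirm.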