Hi Everyone,

BLAKE2 (https://blake2.net/blake2.pdf) has a tree implementation option 
that can execute in parallel. The parallel tree code is missing from the 
RFC (RFC 7693), but I'm guessing it's probably useful in some situations.

I'm working on a BLAKE2 implementation, but it _will not_ include the tree 
implementation for the initial cut-in. However, I want to ensure we can add 
the tree code in the future without changing things or breaking clients. 
I'd like some feedback on the stub implementation for the tree-based code.

I'm not considering the option where we perform all the parallel processing 
in the "core library" using Windows threads, pthreads or OpenMP tasks. I 
don't feel it's within the purview of the library. However, we probably want 
to provide an OpenMP test case for the tree-based code to ensure it works as 
expected.

*****

Crypto++ follows the Init/Update/Final pattern:

    // Constructor follows RAII and performs Init()

    // Existing
    void Update(const byte* inString, size_t length);

    // Existing
    void Final(byte* digest);
    void TruncatedFinal(byte* digest, size_t length);
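
For anyone less familiar with the pattern, here is a minimal sketch of how 
it behaves from the client's side. ToyHash is a hypothetical stand-in that 
uses 32-bit FNV-1a so the example builds without Crypto++; it is not BLAKE2 
and not the library's class. Only the shapes of Update, Final and 
TruncatedFinal follow the declarations above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef unsigned char byte;

// Hypothetical stand-in for a hash class. The mixing step is FNV-1a,
// NOT BLAKE2; it only exists to make the call pattern concrete.
class ToyHash
{
public:
    static const std::size_t DIGESTSIZE = 4;

    // Constructor follows RAII and performs Init()
    ToyHash() { Init(); }

    // Absorb more input; may be called any number of times
    void Update(const byte* inString, std::size_t length)
    {
        for (std::size_t i = 0; i < length; ++i)
        {
            m_state ^= inString[i];
            m_state *= 16777619u;
        }
    }

    // Write the full digest and restart for the next message
    void Final(byte* digest) { TruncatedFinal(digest, DIGESTSIZE); }

    // Write the first 'length' digest bytes and restart
    void TruncatedFinal(byte* digest, std::size_t length)
    {
        assert(length <= DIGESTSIZE);
        for (std::size_t i = 0; i < length; ++i)
            digest[i] = static_cast<byte>(m_state >> (8 * i));
        Init();
    }

private:
    void Init() { m_state = 2166136261u; }

    std::uint32_t m_state;
};
```

The point to notice is that Update mutates the object, so two threads cannot 
safely feed the same object at once. That is the constraint the tree-based 
code has to work around.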

*****

Now we need to decide how to carve in the tree-based code. I think four 
things need to be performed, similar to Hadoop processing:

    (1) Data needs to be partitioned
    (2) Data needs to be packaged
    (3) Data needs to be processed
    (4) Results need to be combined

(1) is the client's responsibility. (2) is the client's responsibility, but 
we need to provide the data structure. (3) is our responsibility. (4) is 
the client's responsibility.
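
To make (1) and (2) concrete, here is a rough sketch of what the client side 
could look like. Chunk and Partition are hypothetical names I made up for 
illustration, and fixed-size chunks are an assumption; the real package type 
and the real leaf sizes would come from the tree parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef unsigned char byte;

// Hypothetical package for one partition. A placeholder for whatever
// data structure the library ends up providing.
struct Chunk
{
    const byte* data;    // start of this partition (not owned)
    std::size_t length;  // number of bytes in this partition
    std::size_t index;   // position among the partitions, so results
                         // can be combined in the original order
};

// Steps (1) and (2): split [data, data+length) into pieces of at most
// chunkSize bytes and package each piece. Returns an empty vector when
// there is nothing to split or chunkSize is 0.
std::vector<Chunk> Partition(const byte* data, std::size_t length,
                             std::size_t chunkSize)
{
    assert(data != nullptr || length == 0);

    std::vector<Chunk> chunks;
    if (chunkSize == 0)
        return chunks;

    std::size_t offset = 0, index = 0;
    while (offset < length)
    {
        const std::size_t remaining = length - offset;
        const std::size_t take = remaining < chunkSize ? remaining : chunkSize;
        const Chunk c = { data + offset, take, index };
        chunks.push_back(c);
        offset += take;
        ++index;
    }
    return chunks;
}
```

Each Chunk only points into the caller's buffer, so the buffer has to outlive 
the processing in step (3).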

With that said, I'm thinking there should be a structure to represent the 
packaged data. I'm thinking it should be an inner class, similar to an 
iterator:

(a)   BLAKE2::TreeNode

I think we need an additional Update method, and that method should be 
const:

(b)    bool BLAKE2::Update(TreeNode& node, bool throwOnError) const

(c) TreeNode will likely need to be an IN/OUT parameter. IN, it will hold a 
reference to the existing state of the BLAKE2 hash. The BLAKE2 object state 
is constant. The TreeNode state is mutable.

(d) Update will perform the processing on the node, and only modify the 
members of TreeNode. It will not modify the state of the BLAKE2 object.

(e) OUT, the TreeNode will provide the result of processing, meaning the 
transformed data.

(f) Update will return true/false to indicate success/failure. If 
throwOnError is true, then Update will throw on failure instead. I think 
this is important because async methods and parallel processing complicate 
catching a throw and matching an error to the offending code.

(g) After the function returns, the results need to be combined and the 
BLAKE2 object needs to be updated. This is where we update the BLAKE2 
object state, if necessary.
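
Here is a rough sketch of (a) through (f) so there is something concrete to 
poke at. Everything in it is an assumption: ToyTreeHash, the TreeNode 
members and the seed are hypothetical, and FNV-1a stands in for the BLAKE2 
compression function. It leaves (g) out on purpose.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

typedef unsigned char byte;

// Hypothetical stand-in for the BLAKE2 class; the mixing is FNV-1a.
class ToyTreeHash
{
public:
    // (a) Packaged data as an inner type. The client fills the IN
    // members; Update fills the OUT members.
    struct TreeNode
    {
        const byte* data;      // IN: partition to process (not owned)
        std::size_t length;    // IN: size of the partition
        std::uint32_t result;  // OUT: transformed data for this node
        bool done;             // OUT: true once result is valid
    };

    explicit ToyTreeHash(std::uint32_t seed = 2166136261u) : m_seed(seed) {}

    // (b), (d), (f): const member, modifies only the node. Returns
    // false on failure, or throws instead when throwOnError is true.
    bool Update(TreeNode& node, bool throwOnError) const
    {
        if (node.data == nullptr && node.length != 0)
        {
            if (throwOnError)
                throw std::invalid_argument("TreeNode: null data");
            return false;
        }

        std::uint32_t h = m_seed;  // (c) object state is only read
        for (std::size_t i = 0; i < node.length; ++i)
        {
            h ^= node.data[i];
            h *= 16777619u;
        }

        node.result = h;  // (e) result travels back in the node
        node.done = true;
        return true;
    }

private:
    std::uint32_t m_seed;
};

// Usage sketch: one shared const object, independent nodes.
void Demo()
{
    const ToyTreeHash hash;
    const byte left[] = { 'a', 'b' };
    ToyTreeHash::TreeNode good = { left, sizeof(left), 0, false };
    ToyTreeHash::TreeNode bad = { nullptr, 4, 0, false };

    assert(hash.Update(good, false) && good.done);
    assert(!hash.Update(bad, false) && !bad.done);
}
```

Because Update is const and writes only to its own node, calls on different 
nodes do not share mutable state in this sketch, which is the property a 
parallel loop would depend on.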

*****

Does anyone see any gaps or problems with (a) - (g)?

Does anyone have an idea of what should happen after (f) to accomplish (g)? 
What should the (g) function look like? What is its signature?

Sorry about the long write-up. I know it's not easy to parse. Thanks in 
advance.

Jeff

-- 
You received this message because you are subscribed to the "Crypto++ Users" 
Google Group.
To unsubscribe, send an email to cryptopp-users-unsubscr...@googlegroups.com.
More information about Crypto++ and this group is available at 
http://www.cryptopp.com.