Larry Hastings <la...@hastings.org> added the comment:
Jack O'Connor: > Was any of the experimental C extension code [...] useful to you? > I was wondering if it could be possible to copy blake3module.c from > there verbatim. I concede I didn't even look at it. The glue code to mate the library with the CPython runtime wasn't the hard part. And, Python has its own way of working (Argument Clinic etc). I did what seemed easiest, which was to start with CPython's BLAKE2 support and hack it up. Given that I had it working in maybe half a day, this seems reasonable. I should take a peek at your experimental build though, just to confirm I got the interfaces right. > The setup.py build there also has working Windows and NEON support. The CPython build process doesn't use setup.py to build extensions on Windows. I haven't looked recently, but back in the day they were built like any other DLL, using its own "project file". This will require someone to use the MSVS GUI to create a new project, add files, etc etc. I assume they can just use the .asm files for the SIMD extensions so hopefully this will be straightforward. As for NEON support, the technique I used theoretically should Just Work. I look forward to hearing back from someone building on ARM. (I have a cross-compiler setup for building for MiSTer, which is an ARM platform with NEON support. But the cross-compiler is a royal PITA to use... it's a Docker image, and requires a bunch of manual configuration, and I haven't touched any of that in more than a year.) You seem to have done at least a partial code review of my PR, given your comments to follow. I appreciate it! I tried to add you as a "reviewer" but GitHub wouldn't permit it--I assume this is some sort of permissions problem. Still, it'd be nice if you were able to do future code reviews using the GitHub review tool; are you permitted to use that for my PR? > - My derive_key_context parameter requires a string and refuses to > accept bytes. This is consistent with our Rust and C APIs (though the C > API does include a _raw version specifically for bindings, which we're > using here). I was considering going the other way with it actually, requiring bytes. Note that Python has first-class support for hard-coded bytes strings: b = blake3.blake3(derive_key_context=b"My Funny Valentine (1984)") The C interface takes "char *", not a "wchar_t *", and this seemed like the most direct and relatable way to reflect that. But I'm not militant about this, and I'm willing to change the interface to require an actual string (as in Unicode). I note that your C API already dictates that Unicode be encoded as UTF-8, so we can do that, and if the encoding fails the user can deal with it. > - I've included an `AUTO` constant that provides a special value (-1) > for the `max_threads` argument, and I explicitly don't support > `max_threads=0`. I can do that too; again, I prefer the 0 there, but I'm not militant about it. However, it would make sense to me if you had that constant somewhere in the BLAKE3 C .h files, which AFAICT you currently don't. > - I enforce that the `data` arguments are positional-only and that the > other keyword arguments are keyword-only. I think this is consistent > with the rest of hashlib. I suspect hashlib is mostly like that, due to being chock full of legacy code. But I don't see why that's necessary. I think permitting "data" to be a named argument is fine. So unless you have a strong conviction about it--which I bet you don't--I'll leave it as positional-or-keyword. There are rare circumstances where positional-only arguments are useful; this isn't one of them. > - I include a `.reset()` method. I don't mind adding that. > - Unrelated to tests: I haven't made any attempt to zero memory in my > `dealloc` function. But if that's what other hashlib functions do, > then I'm certainly in favor of doing it here too. I inherited that from the BLAKE2 code I carved up to make the BLAKE3 version. And yeah, it made sense to me, so I kept it. Christian Heimes: > GH-31686 is a massive patch set. I'm feeling uncomfortable adding > such much new code for a new hashing algorithm. Did you ask the > Steering Council for approval? I didn't. Like most hashing algorithms, BLAKE3 doesn't allocate memory and doesn't perform any I/O. All it does is meditate on the data you pass in, and write to various pre-allocated fixed-size buffers. As large codebases go this seems pretty harmless, almost inert. The Modules/_blake3/impl directory is about 32kloc. I note that the Modules/_blake2/impl directory you checked in in 2016 is about 21kloc, and you didn't require Steering Council (or BDFL) approval for that. As (former) Steering Council member Barry Warsaw says: JFDI! > The platform detection and compiler flag logic must be added to > configure.ac instead of setup.py. Erlend and I are in the process > of making setup.py optional. I plan to remove it entirely along > with distutils in 3.12. "must"? What empowers you to make such an edict? Currently extensions are built with setup.py, full stop. There are configuration settings in setup.py for extension modules--which my patch extends to support "blake3"--but none of the actual building work is done inside configure. My patch works within the current build system, and I see nothing inappropriate about how I solved the problem. Furthermore, I don't know anything about modifying configure.ac files, and I'm not going to learn it just to get this patch merged. If you feel strongly about it I encourage you to submit changes to my PR. Or, wait until the 3.12 development cycle, when you say you plan to rewrite it all anyway. But it's not necessary to move the platform detection and compiler flag logic into configure before merging my PR. ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue39298> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com