Sounds good, hopefully others will find the Python library as useful as I have!
There is already a basic unit test for the vector of sketches; see here:
https://github.com/apache/incubator-datasketches-cpp/blob/master/python/tests/vector_of_kll_test.py

It tests the vector of floats sketches, tests initializing the vector of ints sketches, and tests updating with 2D (which works) and 3D (which fails, as expected) arrays. It's based heavily on the kll unit tests, so I think it is a sufficient set of tests, but no harm in adding more :)

Michael

________________________________
From: Jon Malkin <[email protected]>
Sent: Wednesday, August 5, 2020 2:28 PM
To: [email protected] <[email protected]>
Subject: KDD tutorial and C++ 2.1 plans

Hi everyone,

Daniel Ting and I have a tutorial on sketching at KDD later this month. He's covering more of the theory (in a minimally-mathy way) of sketching, while I'm talking more about the practical aspects -- and including a demo via jupyter notebook using python.

BUT...I also kind of cheated a bit in my demo. I added a vector input method to KLL since it sped up adding data quite substantially. It's probably bad to have a demo rely on an unreleased API though :)

Since this seems like a great chance to expose more people on the python side to our library, I'm proposing to do a few things between now and the tutorial date. That'll be August 26, I believe.

1. Add descriptions to all the python methods
2. Add a basic unit test to the newly added vectorized input (if I didn't do that already, I forget)
3. Release v2.1

I'm willing to handle the changes since I don't think it'll be that much work, but it should significantly improve usability.

Following that proposal, I think v2.1 would have a few minor cleanup items, but mostly be python changes. Michael contributed the vector_of_kll object and then all the documentation changes.

Let me know, in particular, if you have any objections.

jon
