Some thoughts and questions to clarify your strategy:

- Is your plan to use this repo only for .sk files, i.e., just data? With shared code, we would have to partition sections of the repo by language. Nonetheless, when more than one repo shares the same language (~4 for Java now), there is an opportunity to have a place here for shared run-time code (e.g., hash functions, common math functions, common bit-twiddling code, etc.). If we ever want to do this, we might want a name that is more neutral than "testsuite".
- Is your plan to choose one language to generate the .sk files? If so, which language? That means at least one language would have to implement all sketches. Right now that would be either Java or C++.
- What do you mean by "unstable snapshots"? How are you measuring "unstable"?

On Wed, Jan 7, 2026 at 5:02 PM tison <[email protected]> wrote:

> Looks like some sketches are unstable for generating. Not sure if we
> should make them stable; or, if that's impossible, ignore the diff and
> optionally check they are still logically compatible.
>
> The unstable snapshots are:
>
> * cpp_generated_files/bf_n0_h3_cpp.sk
> * cpp_generated_files/bf_n0_h5_cpp.sk
> * cpp_generated_files/bf_n10000_h3_cpp.sk
> * cpp_generated_files/bf_n10000_h5_cpp.sk
> * cpp_generated_files/bf_n2000000_h3_cpp.sk
> * cpp_generated_files/bf_n2000000_h5_cpp.sk
> * cpp_generated_files/bf_n30000000_h3_cpp.sk
> * cpp_generated_files/bf_n30000000_h5_cpp.sk
> * cpp_generated_files/kll_double_n1000000_cpp.sk
> * cpp_generated_files/kll_double_n100000_cpp.sk
> * cpp_generated_files/kll_double_n10000_cpp.sk
> * cpp_generated_files/kll_double_n1000_cpp.sk
> * cpp_generated_files/kll_float_n1000000_cpp.sk
> * cpp_generated_files/kll_float_n100000_cpp.sk
> * cpp_generated_files/kll_float_n10000_cpp.sk
> * cpp_generated_files/kll_float_n1000_cpp.sk
> * cpp_generated_files/kll_string_n1000000_cpp.sk
> * cpp_generated_files/kll_string_n100000_cpp.sk
> * cpp_generated_files/kll_string_n10000_cpp.sk
> * cpp_generated_files/kll_string_n1000_cpp.sk
> * cpp_generated_files/quantiles_double_n1000000_cpp.sk
> * cpp_generated_files/quantiles_double_n100000_cpp.sk
> * cpp_generated_files/quantiles_double_n10000_cpp.sk
> * cpp_generated_files/quantiles_double_n1000_cpp.sk
> * cpp_generated_files/quantiles_string_n1000000_cpp.sk
> * cpp_generated_files/quantiles_string_n100000_cpp.sk
> * cpp_generated_files/quantiles_string_n10000_cpp.sk
> * cpp_generated_files/quantiles_string_n1000_cpp.sk
> * cpp_generated_files/req_float_n1000000_cpp.sk
> * cpp_generated_files/req_float_n100000_cpp.sk
> * cpp_generated_files/req_float_n10000_cpp.sk
> * cpp_generated_files/req_float_n1000_cpp.sk
> * cpp_generated_files/varopt_sketch_long_n1000000_cpp.sk
> * cpp_generated_files/varopt_sketch_long_n100000_cpp.sk
> * cpp_generated_files/varopt_sketch_long_n10000_cpp.sk
> * cpp_generated_files/varopt_sketch_long_n1000_cpp.sk
> * cpp_generated_files/varopt_sketch_long_n100_cpp.sk
> * cpp_generated_files/varopt_sketch_long_sampling_cpp.sk
> * cpp_generated_files/varopt_union_double_sampling_cpp.sk
>
> * java_generated_files/bf_n0_h3_java.sk
> * java_generated_files/bf_n0_h5_java.sk
> * java_generated_files/bf_n10000_h3_java.sk
> * java_generated_files/bf_n10000_h5_java.sk
> * java_generated_files/bf_n2000000_h3_java.sk
> * java_generated_files/bf_n2000000_h5_java.sk
> * java_generated_files/bf_n30000000_h3_java.sk
> * java_generated_files/bf_n30000000_h5_java.sk
> * java_generated_files/kll_double_n1000000_java.sk
> * java_generated_files/kll_double_n100000_java.sk
> * java_generated_files/kll_double_n10000_java.sk
> * java_generated_files/kll_double_n1000_java.sk
> * java_generated_files/kll_float_n1000000_java.sk
> * java_generated_files/kll_float_n100000_java.sk
> * java_generated_files/kll_float_n10000_java.sk
> * java_generated_files/kll_float_n1000_java.sk
> * java_generated_files/kll_long_n1000000_java.sk
> * java_generated_files/kll_long_n100000_java.sk
> * java_generated_files/kll_long_n10000_java.sk
> * java_generated_files/kll_long_n1000_java.sk
> * java_generated_files/kll_string_n1000000_java.sk
> * java_generated_files/kll_string_n100000_java.sk
> * java_generated_files/kll_string_n10000_java.sk
> * java_generated_files/kll_string_n1000_java.sk
> * java_generated_files/quantiles_double_n1000000_java.sk
> * java_generated_files/quantiles_double_n100000_java.sk
> * java_generated_files/quantiles_double_n10000_java.sk
> * java_generated_files/quantiles_double_n1000_java.sk
> * java_generated_files/quantiles_string_n1000000_java.sk
> * java_generated_files/quantiles_string_n100000_java.sk
> * java_generated_files/quantiles_string_n10000_java.sk
> * java_generated_files/quantiles_string_n1000_java.sk
> * java_generated_files/req_float_n1000000_java.sk
> * java_generated_files/req_float_n100000_java.sk
> * java_generated_files/req_float_n10000_java.sk
> * java_generated_files/req_float_n1000_java.sk
> * java_generated_files/varopt_sketch_long_n1000000_java.sk
> * java_generated_files/varopt_sketch_long_n100000_java.sk
> * java_generated_files/varopt_sketch_long_n10000_java.sk
> * java_generated_files/varopt_sketch_long_n1000_java.sk
> * java_generated_files/varopt_sketch_long_n100_java.sk
> * java_generated_files/varopt_sketch_long_sampling_java.sk
> * java_generated_files/varopt_union_double_sampling_java.sk
>
> Best,
> tison.
>
> tison <[email protected]> wrote on Thu, Jan 8, 2026 at 08:53:
> >
> > Hi,
> >
> > Following up on the discussion [1], I'd like to seek consensus to
> > rename our existing but unused repo
> > https://github.com/apache/datasketches-java-common to
> > datasketches-testsuite to hold shared snapshot generator and
> > (optional) serde tests.
> >
> > [1] https://github.com/apache/datasketches-rust/issues/10
> >
> > Here is a repo link that would replace the current content [2]. It
> > contains:
> >
> > a. A script (gensnaps.py) to generate sketch snapshots for some
> > language impls.
> > b. Checked-in snapshots that can guard the generator behavior, and for
> > language serde tests to easily download the snaps instead of
> > generating in place.
> > c. (Optionally) Run some basic snap tests. Not included yet.
> >
> > [2] https://github.com/tisonkun/datasketches-testsuite
> >
> > If we can reach a consensus, I'll open an INFRA ticket to ask the
> > INFRA team to do the rename.
> >
> > What do you think?
> >
> > Best,
> > tison.
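
One possible way to pin down "unstable" in the question above, offered only as a sketch: run the generator twice into two separate directories and report every .sk file whose bytes differ between the runs. The thread does not say how instability was actually measured, so the byte-for-byte definition, the function names (digest, unstable_snapshots), and the two-directory layout are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def unstable_snapshots(run_a: Path, run_b: Path) -> list[str]:
    """List .sk files (as relative POSIX paths) that differ between two runs.

    Hypothetical helper: "unstable" is assumed here to mean "not
    byte-identical across two generator runs". A file that exists in
    only one of the two runs is reported as well.
    """
    files_a = {p.relative_to(run_a).as_posix() for p in run_a.rglob("*.sk")}
    files_b = {p.relative_to(run_b).as_posix() for p in run_b.rglob("*.sk")}

    # Files present in exactly one run cannot be stable.
    unstable = set(files_a ^ files_b)

    # For files present in both runs, compare content digests.
    for rel in files_a & files_b:
        if digest(run_a / rel) != digest(run_b / rel):
            unstable.add(rel)

    return sorted(unstable)
```

A byte-level comparison like this only flags a difference; it cannot tell whether two differing files are still logically compatible. That second check would need each language's own deserializer.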
