Hi Henrik,

Thank you for preparing these commits! I will test them on the Kafka Streams benchmark results in the next couple of days. I am especially curious about the p-value fix.

Best,
Alex

On Sun, Dec 8, 2024 at 12:30 PM Henrik Ingo <hen...@nyrkio.com> wrote:

> Hi all
>
> I didn't yet receive powers to push the entire hunter repo into
> https://github.com/apache/hunter
>
> ...and since there isn't any code or branches there, I also cannot create a
> pull request just yet. But... I did spend this Sunday afternoon preparing a
> branch that is rebased on top of what I expect to become the canonical
> upstream:
>
> https://github.com/nyrkio/hunter/tree/to-asf-upstream2
>
> Apparently our fork is 6 commits ahead. I will submit separate PRs once
> there is a repo in place to submit PRs against. But you can already take a
> sneak peek from above.
>
> The two most interesting ones should be:
> - Add to_json() and from_json() serialization methods
> - Incremental Hunter: Since Datastax introduced an approach where we only
> consider the w (window length) closest points, it turns out that the next
> logical step is an optimization: since the common case is that new test
> results are appended to the end, we can limit the recompute to the 1-2
> last windows at the tail end. Everything before that point is guaranteed to
> not have changed, because the newly added test result is not inside the
> window that is considered as input for the algorithm.
>
> Other changes are fixes or an additional test/benchmark. Some are quite
> interesting still. Especially I can tell with confidence that Piotr never
> thought of using a p-value of 0.1 or higher :-) (You can look at the patch
> yourself to find out.)
>
> henrik
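
For anyone following the thread, here is a rough sketch of the incremental idea described in the quoted message. It is not taken from the hunter code base: the function names (`score_at`, `full_scores`, `append_incremental`) are hypothetical, and the statistic is a simple stand-in (difference of window means) rather than the test hunter actually uses. It only illustrates why appending a point leaves everything outside the last window untouched.

```python
from statistics import mean


def score_at(series, i, w):
    """Stand-in statistic (an assumption, not hunter's real test):
    gap between the means of the up-to-w points before and after index i."""
    left = series[max(0, i - w):i]
    right = series[i:i + w]
    return abs(mean(right) - mean(left))


def full_scores(series, w):
    """Recompute the score of every candidate split point from scratch."""
    return {i: score_at(series, i, w) for i in range(1, len(series))}


def append_incremental(series, scores, new_value, w):
    """Append one result and refresh only the candidates whose window sees it.

    Candidate i reads series[i - w : i + w], so the new point at index n
    is visible only to candidates i >= n - w + 1. Earlier scores are reused.
    Returns the first index that had to be recomputed.
    """
    n = len(series)  # index the new point will occupy
    series.append(new_value)
    first_affected = max(1, n - w + 1)
    for i in range(first_affected, n + 1):
        scores[i] = score_at(series, i, w)
    return first_affected


if __name__ == "__main__":
    w = 3
    series = [10.0, 10.2, 9.9, 10.1]
    scores = full_scores(series, w)
    for value in [10.0, 12.0, 12.1, 11.9, 12.2, 12.0]:
        first = append_incremental(series, scores, value, w)
        print(f"appended {value}: recomputed candidates {first}..{len(series) - 1}")
```

In this simplified version each append recomputes at most w candidates, regardless of how long the series has grown. The real branch works on whole windows rather than individual split points, so the bookkeeping there will differ.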