Re: [sympy] Contributing new SymPy benchmarks (comparative & non-time metrics)

Aaron Meurer Thu, 06 Oct 2022 12:32:07 -0700

On Thu, Sep 29, 2022 at 4:33 PM Aaron Meurer <[email protected]> wrote:
>
> On Wed, Sep 28, 2022 at 2:04 PM Sam Brockie <[email protected]> wrote:
> >
> > Hi All,
> >
> > I'd like to begin adding some additional benchmarks to SymPy to help inform 
> > the code generation work that I'm doing as part of the CZI grant.
> >
> > I'm aware of the benchmarks in the benchmarks repository. My understanding 
> > is that these are run using airspeed velocity as part of the CI, and track 
> > how the performance of a particular benchmark has changed relative to the 
> > most recent SymPy release and the master branch.
> >
> > There are two other types of benchmark that I think might be useful:
> >
> > 1. Comparison of multiple ways to do equivalent computations
> >
> > Below is a contrived example in which there are two functions, option_1 and 
> > option_2, that produce the same result but have different implementations.
> >
> > >>> from sympy import Matrix, symbols
> > >>>
> > >>> def option_1(a, b):
> > ...     return Matrix([a+b, a*b]).jacobian(Matrix([a, b]))
> > ...
> > >>> def option_2(a, b):
> > ...     return Matrix([[(a+b).diff(a), (a+b).diff(b)],
> > ...                    [(a*b).diff(a), (a*b).diff(b)]])
> > ...
> > >>> a, b = symbols("a, b")
> > >>> option_1(a, b) == option_2(a, b)
> > True
> >
> > A benchmark in this case would time the execution of both option_1 and 
> > option_2 (for a range of inputs), compare the relative speeds, and report 
> > the differences. As this type of benchmark is not comparing the same 
> > benchmark across different SymPy versions, I believe that airspeed velocity 
> > may not be the best tool to use here.
> >
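A rough sketch of the kind of comparison described above, for illustration only. It uses the standard-library timeit module rather than pytest-benchmark or asv, and the helper name compare_timings is hypothetical, not something in the benchmarks repository. SymPy caches intermediate results, so repeated calls on the same input probably measure the cached path; treat the numbers as relative, not absolute.

```python
import timeit

from sympy import Matrix, symbols


def option_1(a, b):
    # Built-in Jacobian of the vector [a + b, a*b] with respect to [a, b].
    return Matrix([a + b, a*b]).jacobian(Matrix([a, b]))


def option_2(a, b):
    # The same Jacobian assembled entry by entry with explicit derivatives.
    return Matrix([[(a + b).diff(a), (a + b).diff(b)],
                   [(a*b).diff(a), (a*b).diff(b)]])


def compare_timings(funcs, args, repeat=5, number=20):
    """Return {function name: best observed seconds per call}."""
    results = {}
    for func in funcs:
        times = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
        # The minimum over repeats is the least noisy estimate.
        results[func.__name__] = min(times) / number
    return results


a, b = symbols("a, b")
timings = compare_timings([option_1, option_2], (a, b))
fastest = min(timings, key=timings.get)
for name, seconds in timings.items():
    ratio = seconds / timings[fastest]
    print(f"{name}: {seconds:.2e} s per call ({ratio:.2f}x the fastest)")
```

Reporting each option relative to the fastest one keeps the output meaningful across machines, which is the point of a comparative benchmark.
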
> > I see this type of benchmark as being useful for: (1) determining which 
> > algorithm to use when implementing a new function or refactoring an 
> > existing function; and (2) ensuring that an implementation remains superior 
> > to alternatives as changes are made elsewhere in SymPy.
> >
> > I have had success in the past implementing these sorts of benchmarks using 
> > pytest-benchmark. Is there currently anything similar anywhere in SymPy? 
> > Would the sympy/sympy_benchmarks repository be the best place to contribute 
> > PRs for these sorts of benchmarks? Does anyone have any differing opinions 
> > about how and where these should be implemented, or the value of this type 
> > of benchmark?
>
> For now let's just add these to the benchmarks repo
> https://github.com/sympy/sympy_benchmarks. asv is currently limited in
> what it is able to do, but we shouldn't let that stop us from writing
> useful benchmarks. The important thing is to write the benchmark down,
> in a way that it can at least be run in some capacity. Better tooling
> around it, CI, etc. can come later.
>
> There have been some recent discussions about improving it and other
> benchmarking tooling among some other projects in the ecosystem, and
> I'll make sure to keep you involved in the conversations.


These discussions are happening publicly over at
https://github.com/airspeed-velocity/asv/issues/1219. I encourage
everyone here to join that discussion and note what you'd like to
see in the existing Python benchmarking tooling.

Aaron Meurer

>
> The reason we have a separate benchmarking repo is that it makes it
> easier to run benchmarks across different versions of SymPy. Also,
> unlike tests, it doesn't really make sense to ship benchmarks with the
> SymPy releases.
>
> >
> > 2. Measurement of non-time metrics
> >
> > Below is another contrived example in which common subexpression 
> > elimination is used on an expression, y, and it is shown that the result of 
> > cse(y) involves fewer operations than the original expression.
> >
> > >>> from sympy import count_ops, cse, exp, sin, symbols
> > >>>
> > >>> a, b = symbols("a, b")
> > >>> y = (sin(a/b) + (a/b) - exp(b)) * ((a/b) - exp(b))
> > >>>
> > >>> count_ops(y)
> > 10
> > >>> count_ops(cse(y))
> > 6
> >
> > A benchmark in this case would count the number of operations in the return 
> > value from cse(y) and compare this to 6. Assuming that the implementation 
> > of the cse function has been changed, if the number of operations is six 
> > then we know that its performance hasn’t been changed by the refactor. If 
> > the count is greater than six a regression has taken place. If the count is 
> > less than six the performance of the function has been improved. 
> > Benchmarking for a range of inputs would obviously be required.
> >
> > I see this type of benchmark as being useful for: (1) measuring SymPy’s 
> > performance in instances where timing code snippets isn’t necessarily the 
> > best, or only valuable, indicator of performance; and (2) ensuring 
> > regressions haven’t occurred during refactoring.
> >
> > I believe this type of benchmark can be implemented using airspeed 
> > velocity’s track prefix. Or perhaps this type of benchmark would be best 
> > implemented as regression tests in the sympy/sympy repository’s test suite, 
> > comparing the non-time metrics to hard-coded values.
>
> I would say both things are useful. The main benefit of having it in
> the asv benchmarks is that we can see how things changed over time,
> whereas having it in the test suite prevents regressions.
>
> One thing I would say for this sort of thing in asv is that it's
> possible that the implementation of count_ops itself might change or
> have changed. So it might be a good idea to write a simple version of
> count_ops just for use in the benchmark.
>
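To make the suggestion above concrete, here is a minimal sketch of an asv "track" benchmark that does not depend on count_ops. The counter below is an assumption of mine: it counts every non-atomic node in the expression tree, so its numbers will differ from count_ops (it will not reproduce the 10 and 6 quoted above). The names simple_op_count, cse_op_count and track_cse_op_count are hypothetical; only the track_ prefix and the unit attribute come from asv.

```python
from sympy import cse, exp, preorder_traversal, sin, symbols


def simple_op_count(expr):
    """Count non-atomic nodes (Add, Mul, Pow, function calls, ...)."""
    # Atoms such as symbols and numbers have empty args and are skipped.
    return sum(1 for node in preorder_traversal(expr) if node.args)


def cse_op_count(expr):
    """Total node count over the cse() replacements and reduced expressions."""
    replacements, reduced = cse(expr)
    total = sum(simple_op_count(sub) for _, sub in replacements)
    total += sum(simple_op_count(e) for e in reduced)
    return total


def track_cse_op_count():
    # asv records the returned number for each benchmarked commit.
    a, b = symbols("a, b")
    y = (sin(a/b) + (a/b) - exp(b)) * ((a/b) - exp(b))
    return cse_op_count(y)


# Label shown next to the tracked value in the asv web interface.
track_cse_op_count.unit = "operations"
```

Because the counter lives in the benchmark file, a change to count_ops in SymPy would not shift the tracked values; only changes to cse (or to how expressions are represented internally) would.
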
> >
> > As before, is there currently anything similar anywhere in SymPy? Should 
> > PRs for these sorts of benchmarks be contributed to the 
> > sympy/sympy_benchmarks repository using airspeed velocity's track or as 
> > regression tests in the sympy/sympy repository? Does anyone have any 
> > differing opinions about how and where these should be implemented, or the 
> > value of this type of benchmark?
>
> I think it's valuable. We could do similar things with functions like
> simplify(), and potentially even use it to track features being
> implemented (e.g., how many of a suite of integrals is SymPy able to
> compute across different versions). Again, asv is somewhat limited in
> what it can do, but I'm hopeful that can be improved in the future.
>
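The integral-suite idea above could be sketched along these lines. This is only an illustration: the list of integrands is a made-up sample rather than an existing suite, and the names INTEGRANDS, count_solved and track_integrals_solved are hypothetical. It assumes that an integral counts as "not computed" when the result still contains an unevaluated Integral.

```python
from sympy import Integral, cos, exp, integrate, sin, symbols

x = symbols("x")

# Illustrative sample only; a real suite would be much larger.
INTEGRANDS = [x**2, sin(x)*cos(x), x*exp(x), exp(-x**2)]


def count_solved(integrands, var):
    """Count integrands for which integrate() returns a closed form."""
    solved = 0
    for integrand in integrands:
        result = integrate(integrand, var)
        # An unevaluated Integral left in the result means SymPy gave up.
        if not result.has(Integral):
            solved += 1
    return solved


def track_integrals_solved():
    # asv would record this count for each benchmarked SymPy commit.
    return count_solved(INTEGRANDS, x)


# Label shown next to the tracked value in the asv web interface.
track_integrals_solved.unit = "integrals"
```

Plotted over SymPy versions, a count like this would rise as new integration features land and drop if a change breaks a previously working case.
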
> Aaron Meurer
>
> >
> > Sam
> >
> > --
> > You received this message because you are subscribed to the Google Groups 
> > "sympy" group.
> > To unsubscribe from this group and stop receiving emails from it, send an 
> > email to [email protected].
> > To view this discussion on the web visit 
> > https://groups.google.com/d/msgid/sympy/36e40796-3caa-4aa1-9753-1606773e9288n%40googlegroups.com.

