Hi SymPy community,

I am Naman Rajput, a student interested in the GSoC 2026 project “Classical 
Mechanics: Efficient Equations of Motion Generation.” Before writing my 
proposal, I spent the past week setting up a dev install and profiling 
KanesMethod. I wanted to share my findings and get mentor feedback early.


*SETUP*

SymPy 1.15.0.dev (editable install) · Windows 11 · Python 3.x
Benchmark: N-link planar pendulum; cold-cache timing of
KanesMethod.kanes_equations()


*SCALING RESULTS*

  N    time      vs N=3    local alpha (from previous N)
  3    0.081s    1.0x      (baseline)
  4    0.124s    1.5x      ≈ 1.5
  5    0.301s    3.7x      ≈ 4.0

The local exponent alpha ≈ 4.0 at N=4→5 is consistent with the expected O(N^4)–O(N^5) behavior 
from symbolic expression swell in _form_frstar. Projecting to N=7 (a 
typical robot manipulator) gives estimated runtimes of 10–30 seconds per 
kanes_equations() call, making iterative model development impractical.
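
The local exponents can be reproduced from the three timings above. This is a minimal sketch, assuming time ∝ N^alpha between consecutive sizes; the helper name local_exponent is only illustrative.

```python
import math

# Cold-cache timings from the table above (seconds), keyed by number of links.
timings = {3: 0.081, 4: 0.124, 5: 0.301}


def local_exponent(n0, t0, n1, t1):
    """Exponent alpha such that t1 / t0 == (n1 / n0) ** alpha."""
    return math.log(t1 / t0) / math.log(n1 / n0)


sizes = sorted(timings)
exponents = {
    (a, b): local_exponent(a, timings[a], b, timings[b])
    for a, b in zip(sizes, sizes[1:])
}

for (a, b), alpha in exponents.items():
    print(f"N={a}->{b}: alpha = {alpha:.2f}")
```

With only three data points the exponent is a rough local estimate, so more sizes would be needed before trusting an extrapolation.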

Side finding: SymPy’s @cacheit mechanism can mask true EoM generation cost 
by up to 3x within the same session. Reliable timing requires clear_cache() 
between runs or a fresh process.
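
A cold-cache measurement along these lines can be sketched as follows. This is only an illustration: it uses SymPy's bundled n_link_pendulum_on_cart model as a stand-in for the pendulum benchmark, so the timed region includes model construction as well as kanes_equations(). The helper name time_cold is hypothetical.

```python
import time

from sympy.core.cache import clear_cache
from sympy.physics.mechanics.models import n_link_pendulum_on_cart


def time_cold(n, repeats=3):
    """Best-of-`repeats` wall time with SymPy's global cache cleared each run."""
    best = float("inf")
    for _ in range(repeats):
        clear_cache()  # drop @cacheit results left over from earlier runs
        start = time.perf_counter()
        # Stand-in workload: builds the model and calls kanes_equations().
        n_link_pendulum_on_cart(n=n)
        best = min(best, time.perf_counter() - start)
    return best


cold = time_cold(2)
print(f"n=2 cold-cache time: {cold:.3f}s")
```

Running each size in a fresh interpreter process is the stricter alternative, since clear_cache() does not reset every piece of module-level state.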


*LINE PROFILER RESULTS*

I monkey-patched _form_frstar, _form_fr, msubs, and time_derivative at 
runtime using line_profiler without modifying any SymPy source files. 

Top confirmed bottlenecks:

*In _form_frstar (kane.py):*

  L486  body.masscenter.acc(N)                  27.3% of runtime
        Triggers time_derivative twice (vel → acc chain), uncached.
        Recomputed fresh on every kanes_equations() call per body.

  L515  fr_star = -(MM * msubs(...) + nonMM)     22.6% of runtime
        Symbolic N×N matrix multiply on fully expanded MM.
        CSE before this step could reduce effective expression size.

  L496  MM[j,k] += M * tmp_vel.dot(...)          11.1% of runtime
        O(N²) inner loop — 27 dot products at N=3, grows as N² × bodies.
        Partial velocities recomputed independently from _form_fr.

*In time_derivative (vector/functions.py):*

  L207  ang_vel_in(frame) ^ Vector([v])           51.4% of td runtime
        Rotating-frame cross product, called 42 times for a 3-link system.

  L203  express(v[0], frame, ...).diff(t)         39.6% of td runtime
        Symbolic diff + frame re-expression on already-expressed vectors.
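
For context, these two lines appear to be the two branches of the standard rotating-frame derivative rule (transport theorem). It is written here for a vector v with components in a frame A, differentiated in a frame N; A and N are illustrative labels, not names taken from the profile output.

```latex
\frac{{}^{N}d\mathbf{v}}{dt}
  = \frac{{}^{A}d\mathbf{v}}{dt}
  + {}^{N}\boldsymbol{\omega}^{A} \times \mathbf{v}
```

L203 covers components already expressed in the target frame, where only the measure numbers are differentiated. L207 supplies the ω × v term for components expressed in any other frame.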


*PROPOSED OPTIMIZATIONS*

1. Cache acc() per (body, frame) pair on the KanesMethod instance.
   Eliminates the 27.3% cost at L486 after the first computation.

2. Share partial velocities between _form_fr and _form_frstar.
   Both methods compute them independently today. A pre-computed cache
   eliminates the O(N²) redundancy at L496.

3. Apply sympy.cse() before the final matrix multiply at L515.
   Extracts repeated sub-expressions from MM before the N×N multiply,
   reducing the effective expression size.
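
Optimization 3 can be illustrated on a toy example. This is a minimal sketch: the matrix M and vector v below are made up for illustration and are not the mass matrix or substituted vector from kane.py.

```python
from sympy import Matrix, cos, cse, sin, symbols, zeros

x, y = symbols("x y")
shared = sin(x + y)  # sub-expression repeated across several entries

# Toy stand-ins for MM and the vector it multiplies.
M = Matrix([[shared**2 + cos(x), shared * cos(x)],
            [shared * cos(x), shared**2]])
v = Matrix([x, y])

# cse returns (replacements, reduced_exprs); one Matrix in, one Matrix out.
replacements, (M_reduced,) = cse(M)

# Multiply the compact form; entries now reference the cse symbols.
product = M_reduced * v

# Substitute back (last replacement first) to recover the full expressions.
restored = product
for sym, sub in reversed(replacements):
    restored = restored.subs(sym, sub)

print(replacements)
print(product)
```

Whether this pays off inside _form_frstar depends on how much of the cost comes from the multiply itself rather than from msubs, which would need to be measured.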


*QUESTIONS FOR MENTORS*

1. Has partial velocity caching been attempted before? I want to check
   for prior correctness issues with dependent or auxiliary speeds.

2. Where should a benchmark suite live? I plan to build an asv benchmark
   for the n-link pendulum — is sympy/benchmarks/ the right location?

3. For acc() caching, should the cache live on the KanesMethod instance
   (reset per kanes_equations() call) or be module-level keyed by
   expression identity?
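
To make the instance-level option in question 3 concrete, here is a minimal sketch. AccCache is a hypothetical helper, not an existing SymPy class, and the one-link kinematics below exist only to exercise it.

```python
from sympy.physics.mechanics import Point, ReferenceFrame, dynamicsymbols


class AccCache:
    """Hypothetical helper (not part of SymPy): memoizes Point.acc(frame)."""

    def __init__(self):
        self._store = {}
        self.misses = 0

    def acc(self, point, frame):
        key = (point, frame)  # Point and ReferenceFrame hash by identity
        if key not in self._store:
            self._store[key] = point.acc(frame)
            self.misses += 1
        return self._store[key]

    def clear(self):
        """Reset, e.g. at the start of each kanes_equations() call."""
        self._store.clear()


# One-link example: P is fixed in rotating frame A, a unit distance from O.
q = dynamicsymbols("q")
N = ReferenceFrame("N")
A = N.orientnew("A", "Axis", [q, N.z])
O = Point("O")
O.set_vel(N, 0)
P = O.locatenew("P", A.x)
P.v2pt_theory(O, N, A)

cache = AccCache()
first = cache.acc(P, N)   # computed through the vel -> acc chain
second = cache.acc(P, N)  # served from the cache
```

A cache like this is only valid while the kinematics stay unchanged. Calling set_vel or set_acc on a point afterwards would leave a stale entry, which is one argument for scoping it to a single kanes_equations() call.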

I am currently going through open issues labeled ‘mechanics’ on GitHub to find 
one suitable for a small first PR this week.

Thank you for the detailed codebase. I look forward to any feedback before 
I finalize my proposal.

Best regards,
Naman Rajput
[GitHub: https://github.com/NamanRajput-git/]
[HBTU Kanpur, India]
