Oscar Benjamin writes:

 > The break even point where both implementations take equal time is
 > around about 5% density. What that means is that for a 1000 x 1000
 > matrix with 10% of elements nonzero it is faster to ask flint to
 > construct an enormous dense matrix and perform a huge number of
 > arithmetic operations (mostly involving zeros) than it is to use a
 > pure Python implementation that has more favourable asymptotic
 > complexity and theoretically computes the result with 100x fewer
 > arithmetic "operations". In this situation there is a sliding scale
 > where the faster the Python interpreter gets the less often you
 > benefit from calling the C routine in the first place.
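For concreteness, the kind of pure-Python sparse multiply being compared
here looks roughly like this (a dict-of-keys sketch, not SymPy's actual
implementation):

    # Sketch only -- not SymPy's code.  Dict-of-keys sparse product:
    # the work scales with the nonzero entries that actually meet,
    # rather than with n**3 as in the dense case.
    def sparse_matmul(A, B):
        """A, B: {(i, j): value} mappings of the nonzero entries."""
        # Group B's nonzeros by row so each entry of A only visits
        # the entries of B it can multiply with.
        B_rows = {}
        for (k, j), b in B.items():
            B_rows.setdefault(k, []).append((j, b))
        C = {}
        for (i, k), a in A.items():
            for j, b in B_rows.get(k, ()):
                C[(i, j)] = C.get((i, j), 0) + a * b
        return C

At 10% density each operand has about 0.1*n**2 nonzeros, so the expected
number of scalar multiplications is roughly 0.01*n**3 -- the "100x fewer
operations" above -- and yet the per-operation interpreter overhead can
still hand the win to the dense C path.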
Sure, but what's also happening here is that you're optimizing programmer
cost by not writing the sparse algorithm in C, C++, or Rust.

I haven't done the math, but my guess is that to double the nonzero
density at which the two implementations break even you would need to
roughly double the speed of the Python runtime, and I don't think that's
going to happen any time soon.  As far as I can see, any reasonably
anticipatable speedup is quite marginal for you (a 10% speedup in
arithmetic may be no more than a dream, but even that would only get you
from 5% to 5.5% -- is that really a big deal?).

 > This happens because it works out faster from the perspective of
 > pure Python code that is encumbered by interpreter overhead and has
 > a limited range of C routines to choose from. If the interpreter
 > overhead is less then the options to choose from are improved.

Absolutely.  But the real problem you face is that nobody is writing
routines for sparse matrices in languages that compile to machine code
(or, worse, not even wrapping already-available C libraries).

 > In the case of SymPy/flint if the maximum speed gain of flint was only
 > 10x then I might not bother using it at all to avoid the complexity of
 > having multiple implementations to choose from and external
 > dependencies etc.

Sure, but my guesstimate is that getting there would require something
like a 90% speedup in Python arithmetic.  Really, is that going to
happen?

I feel your pain (even though for me it's quite theoretical: my own data
is dense, even impenetrable).  But I just don't see even potential 10% or
20% speedups in Python overcoming the generic need for programmers to
either (1) accept the practical limits to the size of data they can work
with in Python or (2) bite the bullet and write C (or wrap it with
ctypes) that can do the calculations 100x as fast as a well-tuned Python
program.

I'm all for Mark's work, and I'm glad somebody's willing to pay him some
approximation to what he's worth, even though I probably won't benefit
myself (nor will my students).  But I really don't see the economics of
individual programmers changing very much -- 90% of us will just use the
tried-and-true packages (some of which, like NumPy and Pandas, are
already accelerated), 9% will think for ten minutes and choose (1) or (2)
above, and 1% will do the careful break-even analysis you do and write
(and deal with the annoyances of) hybrid code.

Steve
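P.S.  For anyone who does pick option (2), the wrapping side is less work
than it sounds.  A minimal ctypes sketch (the library name, function name,
and CSR-style signature are hypothetical, just to show the shape of the
glue):

    import ctypes

    # Hypothetical compiled routine:
    #   void spmv(const double *values, const int32_t *col_index,
    #             const int32_t *row_start, const double *x,
    #             double *y, int32_t nrows);
    # computing y = A @ x for a CSR-format matrix A.
    lib = ctypes.CDLL("./libsparse.so")      # hypothetical library
    lib.spmv.restype = None
    lib.spmv.argtypes = [
        ctypes.POINTER(ctypes.c_double),     # values
        ctypes.POINTER(ctypes.c_int32),      # col_index
        ctypes.POINTER(ctypes.c_int32),      # row_start
        ctypes.POINTER(ctypes.c_double),     # x
        ctypes.POINTER(ctypes.c_double),     # y (output)
        ctypes.c_int32,                      # nrows
    ]

    def spmv(values, col_index, row_start, x, nrows):
        """Thin Python wrapper: build C arrays, call the C routine."""
        vals = (ctypes.c_double * len(values))(*values)
        cols = (ctypes.c_int32 * len(col_index))(*col_index)
        rows = (ctypes.c_int32 * len(row_start))(*row_start)
        xin = (ctypes.c_double * len(x))(*x)
        yout = (ctypes.c_double * nrows)()
        lib.spmv(vals, cols, rows, xin, yout, nrows)
        return list(yout)

The annoyance is not the dozen lines of glue, it's writing, building, and
shipping the C on the other side of it.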