On Mon, Jul 26, 2010 at 11:31 AM, Nicolas M. Thiery <nicolas.thi...@u-psud.fr> wrote: > On Fri, Jul 23, 2010 at 01:05:00PM -0700, William Stein wrote: >> During the last few weeks I've spent time optimizing code, among >> other things. > > Great! > >> A surprising number of random things I look at in Sage are too slow >> by a factor of 10-100, usually due to programming style. One way to >> help to remedy that would be to write a book or article about how to >> program Cython + Python *efficiently*, and avoid the numerous traps. >> Basically, one false move in Python when implementing mathematics >> algorithms can knock you down by a factor of 10-100 in speed. > > Definitely. > >> As an example of this, try multiplying polynomials in ZZ['x'] in >> Sage right now: >> >> sage: R.<x> = ZZ[] >> sage: timeit('x*x',number=10^6) >> 1000000 loops, best of 3: 1.96 µs per loop >> >> FLINT could do that multiply in a few hundred nanoseconds, so why does >> it take 2 microseconds (2000 nanoseconds)? Because the code is >> complicated and general, but was written without any thought about >> efficiency. By making a few changes (really making the code less >> clever), I get: >> >> sage: timeit('x*x', number=10^6) >> 1000000 loops, best of 3: 687 ns per loop >> >> The categories implementation as it is in Sage must be changed or >> removed. There, I got your attention. > > Let's see about this. > > That's before any category code: > > zephyr> /opt/sage-4.2.1/sage > sage: R.<x> = ZZ[] > sage: timeit('x*x',number=10^6) > 1000000 loops, best of 3: 1.14 µs per loop > > That's after the integration of the main category patches: > > zephyr-~monoid>/opt/sage-4.3/sage -br main > > sage: R.<x> = ZZ[] > sage: timeit('x*x',number=10^6) > 1000000 loops, best of 3: 1.12 µs per loop > > That's after #7921: Categories for extension types via __getattr___: > > zephyr-~monoid>/opt/sage-4.3.2/sage -br main > sage: R.<x> = ZZ[] > sage: timeit('x*x',number=10^6) > 1000000 loops, best of 3: 1.13 µs per loop > > That's now: > > zephyr-~monoid>/opt/sage-4.4.4/sage -br main > sage: R.<x> = ZZ[] > sage: timeit('x*x',number=10^6) > 1000000 loops, best of 3: 1.19 µs per loop > > Please don't blame this one on categories. The overhead is in the > coercion code which was written by our favorite Cython expert.
I'm talking about the __getattr__ that you added. Simply commenting out the __getattr__ function leads to a significant speedup. This has been replicated by several other people, and this thread contains a deep investigation into why this is happening (though things still aren't clear). (But you realized all this below.) > Now, should the coercion code be changed or removed because it is too > generic and horribly slow? First note that we are speaking about a > speed factor of 2 or 3, not 10 or 100. And this is for very low level > arithmetic; the overhead would be negligible on anything more > complicated (if not just larger polynomials). So the right thing to do > is simply to override the generic code by optimized code for those > element classes which have very cheap arithmetic. That's what has > been done for ZZ, ... and as far as I know everybody is happy about > this approach. The point -- which you miss -- is that the way the category idea is implemented in Sage surprisingly causes *all* attribute lookups to slowdown substantially. It can easily be the case that a whole program is just a sequence of attribute lookups, so the performance penalty of slowing this down is significant. > speed factor of 2 or 3, not 10 or 100 A factor of 2 or 3 is very significant when it happens uniformly across the board. >> A few months ago some code was added to Sage by the combinat folks >> called "categories". I recently learned ObjectiveC, and found out >> that this idea of "categories" goes back to smalltalk at least, and >> was fully implemented in ObjectiveC in the mid 1980s. It has little >> to do with category theory in the sense of mathematics, though is >> related. Basically, it is a way to attach a bunch of generic >> methods to a class without using inheritance. > > That's a very reductive view. And actually quite inaccurate: most of > Sage's parents and elements *do* get their generic methods from the > categories, through class inheritance. There is indeed one exception: > extension types (i.e. Cython classes), which currently use the getattr > hack (because Cython does not support yet any form of multiple > inheritance). Do you know anything about Objective C or smalltalk or "categories" in those languages? > Anyway that's just an implementation detail. Of course, categories in > Sage are strongly rooted in object oriented ideas which date back from > smalltalk. But what the categories really bring to Sage is about > interactions between parents, elements, and morphisms, as well as > generic tests, functorial constructions, etc, all of which have much > to do with mathematics. > > >> The combinat folks implemented this programming idea in Sage not by >> changing the Python languge, but by defining a custom __getattr__ >> method at sticking it in the base classes for Sage parents and >> elements. > > Two notes: > > - About the getattr hack, any blame should go to me personally and > the reviewer (Robert), not to the "combinat folks". I don't really care about blame; I just want to find a way to resolve this problem. I've slowed down/messed up etc., so many things in Sage over the years, it isn't even funny. > > - I never liked this hack. Actually the original plan for the > implementation of categories before I jumped in was to > systematically use this hack. And I had to argue a lot to advance > instead the dynamic class approach. I would *love* to be able to > get rid of this hack for extension classes as well. > >> The problem: this means that every single method lookup on every >> single Sage object seem to now have about 1500nanoseconds added to >> it in completely unnecessary overhead. Most of what Sage does is >> call functions, so this is a huge, huge problem. > > Wow, I confirm this: > > Without: > > sage: sage: sage: sage: QQ.testit = 'foobar' > timeit('QQ.testit',number=10^6) > timeit('QQ._list',number=10^6) > timeit('QQ.gens',number=10^6) > timeit('QQ.an_element',number=10^6) > > 1000000 loops, best of 3: 1.7 µs per loop > 1000000 loops, best of 3: 422 ns per loop > 1000000 loops, best of 3: 236 ns per loop > 1000000 loops, best of 3: 586 ns per loop > > With: > > sage: QQ.testit = 'foobar' > sage: timeit('QQ.testit',number=10^6) > sage: timeit('QQ._list',number=10^6) > sage: timeit('QQ.gens',number=10^6) > sage: timeit('QQ.an_element',number=10^6) > 1000000 loops, best of 3: 2.68 µs per loop > 1000000 loops, best of 3: 1.45 µs per loop > 1000000 loops, best of 3: 1.2 µs per loop > 1000000 loops, best of 3: 1.56 µs per loop > > Which is totally unacceptable indeed. I'm very happy you can replicate this. > Ironically, extension types are far less affected, if at all: > > Without: > > sage: timeit('ZZ._list',number=10^6) > sage: timeit('ZZ.gens',number=10^6) > sage: timeit('ZZ.an_element',number=10^6) > 1000000 loops, best of 3: 312 ns per loop > 1000000 loops, best of 3: 170 ns per loop > 1000000 loops, best of 3: 391 ns per loop > > With: > > sage: timeit('ZZ._list',number=10^6) > sage: timeit('ZZ.gens',number=10^6) > sage: timeit('ZZ.an_element',number=10^6) > 1000000 loops, best of 3: 318 ns per loop > 1000000 loops, best of 3: 180 ns per loop > 1000000 loops, best of 3: 391 ns per loop I think the impact depends a lot on how many python def'd methods the class has. E.g., I was testing with the huge number field classes, and the slowdown is much greater than with testing with a naked class that just derives from Element. I did not notice any real difference between otherwise identical Cython and Python classes. > > > We had done quite some benchmarking with Robert (see e.g. the IRC log > around mid-January). But somehow we let this slip through: Python's > doc for __getattr__ states: > > «Note that if the attribute is found through the normal mechanism, > __getattr__() is not called.» > > We checked that indeed __getattr__ was only called when the attribute > is not found. And we ran benchmark to make sure that there was no > overhead for extension types. Our mistake was to assume that there > would be no overhead either for Python classes. Ouch! Mea culpa. > > >> This is a major problem. The best solution I can think of at >> present is to remove these methods from the top level of Sage, then >> add objects like Parent_cat and Element_cat, and put them there. >> Then change all the combinat objects to derive from those classes. >> I'm not sure this would work though. > > I see three options: > > 1 Simply revert #7921. Most of the combinat code is plain Python, so > won't be affected. What will break are access to generic methods > and generic TestSuite's for extension types. This will break some > of the recently added tests, but hopefully not too many > (e.g. TestSuite calls, or Rob's tests for multiplication tables, > things around permutation groups, ...) > > 2 Since only extension types need the getattr hack, make sure that > __getattr__ is only defined for those. I would love it. Anyone a > good idea on how to achieve this? The problem is that all parents > and elements classes derive from Parent and Element, which are the > natural places to put the getattr hack. I'm dubious that this is not an issue for Cython classes. > 3 Implement partial multiple inheritance in Cython > > We discussed Option 3 quite some with Robert during the categories > brainstorm in Nancy (October 2008). As far as I remember from the > discussions, full multiple inheritance (including merging of data > structure and method lookup table) was impossible; on the other hand, > there was no technical obstructions for allowing a Cython class to > inherit from multiple purely abstract classes besides the main base > class branch, if one accepts the price of a virtual lookup for the > methods defined there. Essentially, this would boil down to > implementing a variant of the getattr trick, but within the Cython > compiler, and with full control so as to make sure that standard > attribute and method lookups would not be affected. That sounds like implementing what is called "categories" in ObjectiveC. ObjectiveC is an extension to C that was done in the late 1980s to add features like one has in Smalltalk. It's built on C structs, like Cython is, and does *not* have multiple inheritance. Categories make up for this. > Robert: can you confirm? Are you still thinking about it? If yes, do > you have an idea of a possible timeline? > > Best, > Nicolas > > > PS: currently, the "combinat folks" are focusing on features rather > than pure speed. We can screw up here and there, but we are taking > great care in the design to make sure to leave full room for later > optimizations. In particular, a lot of the "generic approach", is > designed to concentrate all the time critical code in very few spots, > which will be easy to locate and optimize when someone will have time > for it (e.g. CombinatorialFreeModule). > > > -- > Nicolas M. Thiéry "Isil" <nthi...@users.sf.net> > http://Nicolas.Thiery.name/ > > -- > You received this message because you are subscribed to the Google Groups > "sage-combinat-devel" group. > To post to this group, send email to sage-combinat-de...@googlegroups.com. > To unsubscribe from this group, send email to > sage-combinat-devel+unsubscr...@googlegroups.com. > For more options, visit this group at > http://groups.google.com/group/sage-combinat-devel?hl=en. > > -- William Stein Professor of Mathematics University of Washington http://wstein.org -- You received this message because you are subscribed to the Google Groups "sage-combinat-devel" group. To post to this group, send email to sage-combinat-de...@googlegroups.com. To unsubscribe from this group, send email to sage-combinat-devel+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sage-combinat-devel?hl=en.