On Thu, Jul 18, 2013 at 9:03 PM, Joachim Durchholz <[email protected]> wrote:
> Am 19.07.2013 01:25, schrieb Aaron Meurer:
>
>>>>> 3. Use relative imports.
>>>>>
>>>> Actually, now that we have dropped Python 2.5 support,
>>>
>>>
>>> Oh did we? I missed the final decision then.
>>
>>
>> The final decision was made a while ago to keep support until 0.7.3,
>> but that's been released since last week. All remaining support was
>> removed in https://github.com/sympy/sympy/pull/2273, except for some of
>> the things mentioned in
>> https://github.com/sympy/sympy/pull/2273#issuecomment-20853941
>> (relative imports being one of those things).
>
>
> Ah, I see I've been pushing at an open door with the proposal to use
> relative imports :-)
>
>
>>>> I thought that most C uses could be solved by moving imports inside
>>>> functions.
>>>
>>>
>>> Those with real cycles: yes.
>>> I'd prefer to use normal importing where an inside-function import isn't
>>> needed (see below).
>>
>>
>> You could also use bottom imports (moving the imports to the bottom of
>> the file), but frankly, I'd rather have imports inside functions,
>> unless the function is called enough that it's a performance hit.
>
>
> What problems would that solve?
>
> Specifying the superclass of a class is going to break with that: the
> superclass won't have been imported yet when the class statement runs.
> So I don't think that everything can be moved to the end, and we'd end up
> with two import lists, one at the top and one at the bottom. That's a
> definite disadvantage.
>
>
>>> ... hm well, okay, what we *can* profile is whether this has an impact on
>>> a
>>> function that does very little other than the import.
>>> If the impact is negligible even then, that would give us more design
>>> alternatives.
>>> Still, I'm of a static school (not the best mindset for Python, I know),
>>> and
>>> I prefer to be able to see the dependencies at a glance in the header.
>>
>>
>> Importing can have a performance hit. See for example
>>
>> https://github.com/sympy/sympy/commit/769dccca81006fe6e0a3f71085241b3f85e6ff99.
>
>
> Sounds like in-function imports do have a noticeable impact.
> That benchmark shows a potential slowdown of at least 600% from a local
> import, possibly more if the optimized function actually reached the
> import statement in a substantial fraction of calls, or if the function
> is doing a lot of other work.
> Sheesh, that doesn't look good. We'd really have to determine for each
> function how often it is really called in practice, which is going to slow
> the whole activity to a crawl (and also bind a lot of attention in the
> future).
> With that background, I'd want to avoid in-function imports as far as
> possible.
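The kind of difference being described can be sketched with a quick microbenchmark. This is only a rough illustration of the general effect, not the benchmark from the commit linked above:

```python
import timeit

# Module-level import: each call only pays for a local/global name lookup.
from math import sqrt

def with_local_import(x):
    # Re-executes the import statement on every call; even with the module
    # already in sys.modules, the import machinery still does work each time.
    from math import sqrt
    return sqrt(x)

def with_global_import(x):
    # Uses the name bound once at module level.
    return sqrt(x)

n = 100000
t_local = timeit.timeit(lambda: with_local_import(2.0), number=n)
t_global = timeit.timeit(lambda: with_global_import(2.0), number=n)
print("local import: %.3fs, global import: %.3fs" % (t_local, t_global))
```

The absolute numbers vary by interpreter and machine; the point is only that the per-call import overhead is paid on every invocation of the hot function.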
Yeah, well there's a reason C was created in the first place.
>
> Maybe the classes should be initialized in an on-demand fashion.
> Possibly like this:
> - Have a getter for each class attribute (or other attribute) where
> initialization is expensive or risky.
> - The getter is supposed to run only once, namely the first time the
> attribute is accessed.
> - The getter is specified using a lambda. The getter runs the lambda and
> assigns whatever result the lambda emits to the raw variable.
> - The getter deletes itself, exposing the raw variable to any future access.
> - As the final polish, the getter returns the lambda's result that got
> assigned to the member variable.
>
> I'm not sure what the best way to write such a getter would be; it could be
> a raw function as in
>
> _a = lazyInit('_a', lambda: ...)
>
> or maybe a decorator could be made to work:
>
> @lazyInit
> _a = lambda: ...
>
> or
>
> @lazyInit
> def _a():
> ...
>
> The main challenge is whether the decorator knows the name of the
> attribute/function it is being applied to; I dimly remember that that's the
> case without further ado in decorators, so the last approach looks best to
> me.
Yeah, this is possible. f.func_name will give the name.
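A minimal sketch of the self-removing getter described above, written as a non-data descriptor. The names `lazyInit` and `_a` are just the placeholders from this thread; I've used `__name__` (the Python 3 spelling of `func_name`). It caches per instance; a class-level variant would `setattr` on the class instead:

```python
class lazyInit(object):
    """Non-data descriptor: runs the wrapped function on first access,
    then stores the result on the instance. The stored attribute shadows
    the descriptor, so every later access is a plain lookup."""
    def __init__(self, func):
        self.func = func
        # The decorator learns the attribute name from the function itself.
        self.name = func.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.func(obj)
        # Cache on the instance; future lookups bypass __get__ entirely.
        setattr(obj, self.name, value)
        return value

class Expensive(object):
    @lazyInit
    def _a(self):
        # Imagine an expensive or import-cycle-sensitive computation here.
        return 42

e = Expensive()
print(e._a)  # first access runs the function and caches the result
print(e._a)  # plain instance-attribute lookup, no recomputation
```

This is essentially what the stackoverflow recipe mentioned below does as well; the self-deletion step from the proposal is achieved implicitly, since the cached instance attribute wins over a non-data descriptor.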
>
> ... oh. stackoverflow for the win.
> See
> http://stackoverflow.com/questions/3237678/how-to-create-decorator-for-lazy-initialization-of-a-property
> , the code is already available in precanned form. @read_only_lazyprop seems
> to do all that we need.
You'll also want to profile that, at least for stuff in the core.
Also consider that you might not speed things up as much as you think
because things might already be cached by the global cache.
This solves the problems for attributes of classes, but what about
stuff that's at the module level?
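For the record, module-level attributes can also be computed on demand. The sketch below uses the module `__getattr__` hook from PEP 562, which is Python 3.7+ and so wasn't an option at the time of this thread; it builds a throwaway module object purely to keep the example self-contained (in real code the hook would live at the top of the module's own file):

```python
import math
import sys
import types

# Synthetic module used only for demonstration.
mod = types.ModuleType("lazymod")

def _lazy_getattr(name):
    # PEP 562: called only when normal attribute lookup on the module fails.
    if name == "SQRT_PI":
        value = math.sqrt(math.pi)
        mod.__dict__[name] = value  # cache; later lookups skip this hook
        return value
    raise AttributeError(name)

mod.__getattr__ = _lazy_getattr
sys.modules["lazymod"] = mod

import lazymod
print(lazymod.SQRT_PI)  # computed on first access, cached afterwards
```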
>
>
>> You're right that it only matters for functions that are called a lot,
>> though (but _sympify is such a function).
>
>
> Oh yes, that thing is pervasive.
> I guess the module-level imports that replace C. should be done on the core
> functions first, so that any cycles that we might still have to solve with
> in-function imports would happen elsewhere.
> I hope to get rid of all cycles via lazy initialization though.
>
>
>> No, it doesn't matter if it's slow when the class is used. The class
>> attribute is rarely called. It's only called if those particular
>> functions are used, and in that case, it is nothing compared to the
>> rest of the algorithm.
>
>
> Unless that function is being called in a tight loop.
> Or doing a recursion with many levels and a high multiplicity.
> Or if user code calls into that function a gazillion times, possibly
> indirectly, so the user doesn't even notice.
For this particular algorithm, it's not though.
>
> I find the prospect of in-function imports quite scary, and would prefer to
> keep it at arm's length as a tool of last resort.
>
>
>> On the other hand, you can really feel even the
>> smallest things at import time, because that multiplies up over
>> various libraries to make loading things like pylab slower
>
>
> I noticed that all import statements are run twice.
> Well, maybe not all, just those that I added logging code to; it was just
> half a dozen cases where I did that. Still, the output would always appear
> twice when run from bin/test.
> I haven't investigated this effect at all, but if this affects not just
> testing but also normal use, that could be a contributing factor to start-up
> times.
>
> I'll probably take a look as soon as I get around to doing more Python work.
> Might take a while though, so feel free to investigate yourself at your
> leisure :-)
>
>
>> (and even without considering performance, running things like sqrt(pi)
>> at import time makes debugging the core a huge pain, because if sqrt(pi)
>> raises an exception, then you can't even import sympy).
>
>
> Oh. How would calculating a constant like sqrt(pi) ever raise an exception?
When you break the assumptions system, which is used in Pow.__new__
(and tons of other places).
>
>
>> This actually,
>>
>> not performance, was the true motivation behind that commit).
>
>
> I can believe that :-)
>
> Seems like that simple C. replacement mission is growing as we go along.
> Things on my list right now are:
> - Find out why that import seems to run twice
> - Agree on and implement a @lazyProperty decorator
> - Decorate all properties (and possibly other stuff) with @lazyProperty
> (includes performance testing to see what parts of Sympy initialization
> really take so long)
> - Replace all imports with relative imports
> - Possibly replace all imports with imports from the declaring module
> - Get rid of all those C. references... the smallest part of it all
>
> I'm looking forward to doing all that, so no trouble here - just a heads-up
> that it's going to take longer than initially expected.
Aaron Meurer
--
You received this message because you are subscribed to the Google Groups
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sympy.
For more options, visit https://groups.google.com/groups/opt_out.