On Mon, 1 Feb 2016 at 11:51 Yury Selivanov <yselivanov...@gmail.com> wrote:
> Hi Brett,
>
> On 2016-02-01 2:30 PM, Brett Cannon wrote:
> > On Mon, 1 Feb 2016 at 11:11 Yury Selivanov <yselivanov...@gmail.com> wrote:
> >
> >     Hi,
> >
> [..]
> >
> >     What's next?
> >     ------------
> >
> >     First, I'd like to merge the new LOAD_METHOD opcode, see issue 26110
> >     [1].  It's a very straightforward optimization, the patch is small
> >     and easy to review.
> >
> > +1 from me.
> >
> >     Second, I'd like to merge the new opcode cache, see issue 26219 [5].
> >     All unittests pass.  Memory usage increase is very moderate (<1 MB
> >     for the entire test suite), and the performance increase is
> >     significant.  The only potential blocker for this is PEP 509
> >     approval (which I'd be happy to assist with).
> >
> > Since it improves performance across the board as well as eliminates
> > the various tricks people use to cache globals and built-ins, a big +1
> > from me.  I guess that means Victor needs to ask for pronouncement on
> > PEP 509.
>
> Great!  AFAIK Victor still needs to update the PEP with some changes
> (globally unique ma_version).  My patch includes the latest
> implementation of PEP 509, and it works fine (no regressions, no broken
> unittests).  I can also assist with reviewing Victor's implementation
> if the PEP is accepted.
>
> > BTW, where does LOAD_ATTR fit into all of this?
>
> The LOAD_ATTR optimization doesn't use any of PEP 509's new stuff (if I
> understand your question correctly).  It's based on the following
> assumptions (the same ones that make JITs work so well):
>
> 1. Most classes don't implement __getattribute__.
>
> 2. A lot of attributes are stored in objects' __dict__s.
>
> 3. Most attributes aren't shadowed by descriptors/getters-setters; most
> code just uses "self.attr".
>
> 4. An average method/function works on objects of the same type, which
> means that those objects were constructed in a very similar (if not
> identical) fashion.
>
> For instance:
>
>     class F:
>         def __init__(self, name):
>             self.name = name
>         def say(self):
>             print(self.name)  # <- For all F instances,
>                               #    the offset of 'name' in `F().__dict__`
>                               #    will be the same
>
> If LOAD_ATTR gets too many cache misses (20 in my current patch) it gets
> deoptimized, and the default implementation is used.  So if the code is
> very dynamic there's no improvement, but no performance penalty either.
>
> In my patch, I use the cache to store (for LOAD_ATTR specifically):
>
> - a pointer to the object's type
> - type->tp_version_tag
> - the last successful __dict__ offset
>
> The first two fields are used to make sure that we have objects of the
> same type.  If the type changes, we deoptimize the opcode immediately.
> Then we try the offset.  If it succeeds, we have a cache hit.  If not,
> that's fine; we'll try another few times before deoptimizing the opcode.

So this is a third "next step" that has its own issue?

-Brett

> >     What do you think?
> >
> > It all looks great to me!
>
> Thanks!
>
> Yury
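
[Editor's note: for readers following along, below is a rough Python sketch
of the per-opcode caching scheme Yury describes, not his actual C patch.
All names here (LoadAttrCache, MAX_MISSES, load_attr) are made up for
illustration.  The cache remembers the object's type and the last successful
__dict__ offset, falls back to a normal lookup on a miss, and deoptimizes
after too many misses.  The real cache also checks type->tp_version_tag,
which plain Python code cannot observe, so that guard is only noted in a
comment.]

    # Hypothetical Python model of the LOAD_ATTR inline cache; the real
    # implementation lives in CPython's eval loop and works on raw
    # __dict__ offsets.  Here the "offset" is modelled as an index into
    # the instance dict's (name, value) entries.

    MAX_MISSES = 20  # deoptimization threshold mentioned in the mail


    class LoadAttrCache:
        def __init__(self):
            self.tp = None          # cached pointer to the object's type
            self.offset = None      # last successful "__dict__ offset"
            self.misses = 0
            self.deoptimized = False

        def load_attr(self, obj, name):
            if self.deoptimized:
                return getattr(obj, name)      # default implementation

            tp = type(obj)
            slots = list(vars(obj).items())    # stand-in for the C-level dict

            # Guard: if the type changed (the C code also compares
            # tp->tp_version_tag here), deoptimize immediately.
            if self.tp is not None and tp is not self.tp:
                self.deoptimized = True
                return getattr(obj, name)

            # Fast path: try the cached offset first.
            if self.offset is not None and self.offset < len(slots):
                key, value = slots[self.offset]
                if key == name:
                    return value               # cache hit

            # Cache miss: count it, fall back to a normal lookup, and
            # remember the offset for next time.
            self.misses += 1
            if self.misses >= MAX_MISSES:
                self.deoptimized = True
            self.tp = tp
            for i, (key, value) in enumerate(slots):
                if key == name:
                    self.offset = i
                    return value
            return getattr(obj, name)          # descriptors, class attrs, etc.

With the F class from Yury's example, the first
LoadAttrCache().load_attr(F('spam'), 'name') call misses and records
offset 0; subsequent calls on F instances return the value straight from
the cached offset.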