[pypy-commit] extradoc extradoc: add the result of discussions from leysin (unedited)

fijal Mon, 29 Feb 2016 01:52:20 -0800

Author: fijal
Branch: extradoc
Changeset: r5612:de191b0da0b9
Date: 2016-02-29 10:51 +0100
http://bitbucket.org/pypy/extradoc/changeset/de191b0da0b9/


Log:    add the result of discussions from leysin (unedited)

diff --git a/planning/sprint-leysin-2016-notes.rst b/planning/sprint-leysin-2016-notes.rst
new file mode 100644
--- /dev/null
+++ b/planning/sprint-leysin-2016-notes.rst
@@ -0,0 +1,323 @@
+Tasks
+=====
+
+- mercurial benchmarks on PyPy runner exists, some benchmarks
+- mercurial porting C extensions to cffi MORE PROGRESS (fijal)
+- fix multiple inheritance resolution in cpyext (arigo, cfbolz around)
+- py3k work FIXING EVEN MORE TESTS, MERGED DEFAULT (AGAIN) (manuel, ronan)
+- register allocator, more information is now available, FIRST PROTOTYPE (remi, richard if remi has time), created an issue
+- clean up project lists (ronan, fijal)
+- test optimizeopt chain with hypothesis (cfbolz, fijal to discuss)
+- try fix speed center (richard, fijal to get him access), created issue
+- go skiing (marmoute)
+- go shopping
+- turn won't manage into issues (all)
+- start with new binary jit-log-opt (richard, fijal)
+- fixing stm (remi)
+- fix buffer API (arigo, fijal)
+
+
+
+
+won't manage
+--------------------
+
+- VMProf on OS X, fix bugs (can't reproduce)
+- jit leaner frontend
+- live ranges in JIT viewer
+- fix logging to not take time MESS
+- continue refactoring the annotator
+- add warmup time to VMProf
+- use probabilistic data structures for guard_value, WE HAVE A PLAN
+- single-run branch progress
+- update setup.py and upload rpython to pip
+
+
+done:
+---------
+- dict-proxy with cpyext DONE
+- fix bug in FixedSizeArray DONE
+- compress resume data more, play with hypothesis (cfbolz, arigo, fijal) DONE
+- maps reordering DONE
+- take funding calls off the website, write blog post DONE
+- fix lxml on cpyext-gc-support-2 (ronan, arigo) DONE, MERGED
+- apply vmprof to a non pypy lang (cfbolz, fijal around) DONE
+- talk benchmark statistics (cfbolz, mattip, ronan) DONE
+- merging default into stm (remi) LESS MESS, MERGING DONE
+- cpyext-gc-support-2 blog post (mattip, arigo) DONE
+- get data about test duration DONE
+- start a bit of a dev process document
+- merging cpyext-ext, numpy-on-cpyext NEXT NEXT SEGFAULT, IMPORTS NUMPY WITH ONE HACK
+- fix tests
+- a script to upload to bitbucket IN PROGRESS
+- have a test in rpython that checks against imports from pypy (cfbolz)
+- make snowperson (cfbolz, fijal)
+- general wizardry (cfbolz, arigo, samuele not around) 
+
+
+too many bridges
+-------------------------
+
+Problems:
+ -  pypy py.test is slow
+ - most bridges come from guard_value(map) (then guard_class)
+
+Steps:
+ - detect the situation (cardinality estimation)
+ - trace a general version
+ - look at all promotes in pypy, to see whether the general version is good
+ - in particular, we need the general version for maps
+ - make maps give up if the object is too big
+
+Research:
+ - how to deal with method invocations of the same method on different classes
+ - 
+
+
+Python3
+=========
+  - add more rposix features, use less replacements of os.XXX
+  - merge py3.3 -> py3k and create py3.5
+  - solve the speed issue
+  - utf8 & unicode problems
+   - list of things we suspect are slow on pypy3k:
+     * unicode & utf8 strings and dictionaries of those strings, potential solution
+       is not to use rpython unicode type
+     * itertools stuff is slower than python 3
+  - manuel & ronan go and work and SFC
+  - what to do with crowdfunding
+
+
+Ideas around Mercurial
+======================
+(notes about "new" features that could be useful in pypy)
+
+- clone bundle,
+- share.pool,
+- people version,
+
+
+
+summer of code
+==============
+
+- volunteers from the pypy side: fijal, ronan, richard, remi, backup: armin
+- looking for students: richard, remi
+- unicode stuff as project
+
+
+
+cpyext+numpy
+============
+
+- two approaches:
+  - micronumpy: basically works, but no story for cpyext, bit of a dead end
+  - using numpy code with cpyext, with hooks into micronumpy
+
+- safe (but maybe slow) default, everything just works
+- hard part: hijack some of the functionality and replace it with micronumpy code
+- ––> Bucharest?
+
+
+
+tooling
+=======
+
+technical problems:
+- too many tools (vmprof, jitviewer, stmlog)
+- too many output formats (vmprof, jit-log-opt, stmlog*2)
+- jit-log-opt output format is brittle
+- parsing debug_merge_point is brittle
+- no good fallbacks
+- a lot of it is pypy-specific
+- trace identification is not unique
+
+
+consolidation goals:
+- better format for jit-log-opt (keeping a way to show the old ascii output)
+- having a programmatic way to turn on trace dumps
+- combining vmprof/jitviewer
+- documentation/tutorial
+
+future cool features:
+- memory
+- warmup time
+- extensible events
+- corresponding web app changes
+- navigation in jitviewer
+- way to compare runs
+- rpython functions where ops are coming from
+- threading and forking support
+
+
+volunteers:
+- Maciek
+- Matti
+- Richard
+- Sebastian
+
+
+steps:
+- collect interesting examples
+- embed jit-log-opt into vmprof-file
+- web stuff
+
+buildbot:
+- script/url to start/stop master
+- account for matti
+
+
+
+
+unstucking benchmarking
+=======================
+
+problems:
+- py3k what benchmarks are there, where would we run them (and store the results)
+- split benchmark running
+- comparisons are broken (javascript exception)
+- old version with custom hacks that are not backed up??
+- access to raw data
+- store all the raw data
+- benchmarks are too quick on jit / too slow on interp
+- inconsistent approach to warmup
+- we don't have errors
+- what to do with historical data
+- what to do with branch data
+
+simple steps to improve the situation:
+- revive single run branch
+- fix comparison (simple if you know JS)
+- add an api to get the data
+- upload json files to buildbot
+
+harder steps to improve the situation:
+- idea: tooling sprint
+- move to new machine
+- rerun benchmarks
+- upgrade benchmarks (particularly the libraries)
+- larger benchmarks
+- make unreliable benchmarks reliable
+- automatic slowdown reporting
+
+
+volunteer:
+- fijal?, cfbolz?, arigo?
+- start a bit during the sprint (Thursday)
+
+
+code quality & failing tests
+============================
+
+problems:
+- tests fail for too long
+- general instability of recent releases (mostly the fault of unrolling)
+- some non-modular impenetrable code:
+  - ll2ctypes
+  - unroll
+  - cpyext
+  - structure of the jit optimizing chain
+- tests are slow
+
+solutions:
+- ll2ctypes: use cffi (see other discussion)
+- unroll: reducing features is the only idea we currently have
+- on the process level:
+  - release candidates
+  - RC PPAs?
+  - don't merge default into release branch
+  - be more principled about bugfix releases
+  - do the bugfix also on the latest release branch
+  - reduce the overhead of doing bugfix releases:
+    - look into automated bitbucket uploading
+- use hypothesis more!
+- run our own tests on pypy!
+- run tests in parallel
+
+Bitbucket related questions
+===========================
+Bitbucket:
+- "We are not unhappy with bitbucket; Much better than anything we have before"
+- API to upload binary [question asked]
+- limited bandwidth to upload
+- limited bandwidth to download
+- push speed
+- clone speed [cloning under a minute on the way]
+- email notification "not usable": [improvement planned]
+  - a mail per push (not per commit)
+  - format → trimmed log message//trimmed diff.
+  - "From committer" wanted.
+- blocking --force (prevent multiple heads) [on their roadmap]
+- Comment on random commit/pull request.
+
+rffi discussion
+===============
+
+what do we want in the end:
+1) an interface like cffi used at the interpreter level and in rpython/*.
+2) rtyping, gctransformer use lltype objects
+
+problems:
+
+- interface & implementation of ll2ctypes
+- difference between translated pypy and test env
+- deprecated api (rawffi, rawffi_alt)
+- no special support for rffi in the annotator?? (seems unclear)
+
+how do we go forward:
+
+- create small examples (e.g. crypt module) that use cffi for testing, and at a later point see
+  how we can support full translation.
+  rffi.llexternal -> some variants release the gil, some don't. how do we re-add the possibility
+  of doing the same using cffi?
+
+Example:
+
+sandbox_safe: where do we put the flag so that the annotator understands that?
+- preprocess step in cdef
+  common agreement to use pragma to define these flags (e.g.
+    #pragma sandboxmode on -> off)
+
+volunteer for the first small module:
+maybe scope for gsoc? manuel after merging py3.3
+
+
+Numpy Hijacking
+------------------------
+
+Start over with a different module that uses the multiarray type from numpy instead of W_NDimArray
+Make it use indexing for the first step, start copying methods from micronumpy
+use raw_virtual
+
+
+a fast trace hook
+------------------------
+
+
+
+support virtualenvs natively
+---------------------------------------
+  * needed to implement venv module on Python 3.3+
+  * consider backporting to 2.7 to help virtualenv
+  * poke Donal to mail rationale to pypy-dev
+  
+  
+STM
+--------
+
+Problems:
+ - performance
+ - what kind of conflicts are reasonable?
+ - how many conflicts are still ok?
+ - very slow warmup
+ - too many major collections
+ - what's the overhead of tiny transactions?
+ - need more data!
+ - maybe shorter transactions?
+ - measure, measure, measure
+ - talk to Intel
+
+ideas:
+ - find an application that we can speed up
+ - write a framework for that
+ - try to find a real-world something