On Thu, 2012-10-18 at 02:15 -0700, Brian Harring wrote:
> If folks haven't looked at python_generate_wrapper_scripts in 
> python.eclass, I'd suggest doing so.  For examples of its usage, grep 
> for 'python_generate_wrapper_scripts' in /usr/bin/; any place you see 
> it, look for <that-script-name>-${PYTHON_TARGETS} (for example, 
> /usr/bin/sphinx-build{,-2.7,-3.2}).
> 
> Each usage there is a separate custom script for that specific binary; 
> if there is a bug in the script, well, we're screwed- requires 
> re-merging the package.
> 
> This setup, at least on my hardware, adds about .04s to every 
> invocation; that is ignoring the inode cost for each, and the issue if 
> a bug ever appears in the script generation code (in which case we're 
> screwed- would require re-merging the package).
> 
> In parallel, we've got python-wrapper (ls /usr/bin/python -l); this is 
> provided by eselect-python and basically discerns what the active 
> python version is, and uses that in the absence of any directives.  
> This is implemented in C, and is reasonably sane; the overhead for 
> that is basically nonexistent.
> 
> Roughly, I'm proposing we do away with the python eclass's 
> python_generate_wrapper_scripts generation of a script, instead having 
> that just symlink to a binary provided by eselect-python that handles 
> this.  This centralizes the implementation (fix in one spot), and 
> would allow a C version to be used- basically eliminating the 
> overhead.
> 
> 
> There's a trick to this; currently, those generated scripts hardcode 
> the allowed/known python versions for that package.  We obviously have 
> to preserve that; I propose we shove it into the symlink path.
> 
> Basically, we add a /usr/libexec/python directory; within it, we have 
> a wrapper binary (explained below), and a set of symlinks pointing at 
> the root of that directory.  To cover our current python versions, the 
> following would suffice:
> 
> for x in {2.{4,5,6,7},3.{0,1,2,3,4}}-cpy 2.5-jython 2.7-pypy-1.{7,8} \
>   2.7-pypy-1.9; do
>   ln -s ./ "/usr/libexec/python/$x"
> done
> 
> While that seems insane, there is a reason; via that, we can encode 
> the allowed versions into the symlink.  Using pkgcore's pquery for 
> example (which should support cpy: 2.5, 2.6, 2.7, 3.1, 3.2, 3.3) 
> instead of a wrapper script at /usr/bin/pquery, we'd have thus:
> 
> targets=( 2.{5,6,7}-cpy 3.{1,2,3}-cpy )
> targets=$(IFS=/;echo -n "${targets[*]}")
> # This results in
> # targets=2.5-cpy/2.6-cpy/2.7-cpy/3.1-cpy/3.2-cpy/3.3-cpy
> ln -s "/usr/libexec/python/${targets}/wrapper" \
>   /usr/bin/pquery
> 
> /usr/libexec/python/wrapper upon invocation, takes a look at argv[0]; 
> sees how it was invoked basically.  This will be the /usr/bin/whatever 
> pathway.  It reads the symlink, in the process getting the allowed 
> versions and preferred order of the versions.
> 
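
To make the lookup concrete, here is a rough shell sketch of the parsing
step only (the real wrapper would be C, per the proposal).  The function
name allowed_targets is made up for illustration; the /usr/libexec/python
prefix and the trailing /wrapper component are assumed from the pquery
example above.

```shell
# Hypothetical helper, not part of any eclass: print the targets encoded in
# a wrapper symlink, one per line, in the order they appear in the link.
allowed_targets() (
    # readlink without -f returns the link text as stored, so the encoded
    # components survive (canonicalizing would collapse them away).
    target=$(readlink "$1") || exit 1
    target=${target#/usr/libexec/python/}   # assumed prefix
    target=${target%/wrapper}               # assumed final component
    printf '%s\n' "$target" | tr '/' '\n'
)

# Demo in a scratch directory; the link may dangle, only its text is read.
demo_dir=$(mktemp -d)
ln -s /usr/libexec/python/2.5-cpy/2.6-cpy/2.7-cpy/3.1-cpy/3.2-cpy/3.3-cpy/wrapper \
    "$demo_dir/pquery"
allowed_targets "$demo_dir/pquery"
rm -rf "$demo_dir"
```

How the wrapper then chooses among the listed targets is not shown here;
this only covers recovering the list and its order from the link.
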
> A few notes: the vast majority of filesystems will store the symlink 
> target in the inode itself if at all possible; doing so avoids 
> allocating a separate data block, and is limited only by the length 
> of the target.  The target length overall is capped by PATH_MAX, which 
> can range from 256 to 4k (or higher).
> 
> For the pquery example above, that comes out to ~73 bytes for the 
> symlink pathway; well under PATH_MAX.
> 
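
The length estimate can be checked directly from the shell.  This is
just a sketch that rebuilds the example link target from the pquery
snippet and measures it; the variable names link_target and target_len
are mine, not part of the proposal.

```shell
# Rebuild the example symlink target and measure its length.
targets="2.5-cpy/2.6-cpy/2.7-cpy/3.1-cpy/3.2-cpy/3.3-cpy"
link_target="/usr/libexec/python/${targets}/wrapper"
target_len=${#link_target}
echo "symlink target length: ${target_len}"

# Report PATH_MAX for the root filesystem where getconf is available.
if path_max=$(getconf PATH_MAX / 2>/dev/null); then
    echo "PATH_MAX: ${path_max}"
fi
```

For this example the target comes out in the mid-70s of characters, in
line with the rough figure quoted above and far below any PATH_MAX in
the range mentioned.
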
> For the scenarios where PATH_MAX caps the symlink pathway, or for 
> whatever reason we don't want to use that trick, a tree of files 
> contained within /usr/libexec/python/ holding the allowed versions for 
> the matching pathway would suffice.
> 
> Either proposal here would be far faster than what we've got now; also 
> will use less space (ancillary benefit).
> 
> One subtle aspect here is that if we did this, it makes it possible to 
> avoid losing the invocation information- currently if you did 
> `/usr/bin/python3.2 $(which sphinx-build) blah`, because of how things 
> are implemented now (specifically the two layers of wrappers)- you'll 
> get python2.7 running that sphinx-build invocation.
> 
> This is wrong (it's directly discarding what the invocation 
> requested), although you're only going to see it for scripts that 
> do python introspection.
> 
> Via doing the restructuring I'm mentioning above, that issue can be 
> fixed, while making things faster/saner.
> 
> On a related note; we currently install multiple versions of the same 
> script- the only difference being the shebang.  If one ignores the 
> shebang, in some cases this is necessary- where the script is 2to3 
> translated, and the code for py2k vs py3k differs.  For most, the only 
> difference is in the shebang however.
> 

What if the invoking script does not need to be 2to3 translated (super
minimal python code) but the remaining python libs do?

> While it's minor in space savings, it's possible to eliminate that 
> redundancy via a shebang target that looks at the pathway it was 
> invoked via.  Fairly easy actually, and basically zero overhead if 
> done.
> 
> Either way, thoughts?
> 
> What I'm proposing isn't perfect, but I'm of the view it's a step up 
> from what's in place now- and via centralizing this crap, makes it 
> easier to change/maintain this going forward as necessary.
> ~harring
> 

+1 from me.

Eclean has been checking the name it was invoked with since long before
I did the major rewrite.  From that, it cleans either distfiles or
packages when invoked via the eclean-dist or eclean-pkg symlinks.  If
invoked as eclean itself, it looks for the target in the arguments.  So
Brian's proposal is not something totally new that has never been done
before...
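
For anyone who has not seen the pattern, dispatching on the invocation
name looks roughly like this.  It is an illustrative shell sketch, not
eclean's actual code; the function name pick_action is hypothetical.

```shell
# Illustrative only: choose an action from the name the tool was invoked as.
# $1 is the invocation path (what $0 would hold); the rest are its arguments.
pick_action() {
    invoked_as=${1##*/}   # strip any leading directory, like basename
    shift
    case $invoked_as in
        eclean-dist) echo distfiles ;;
        eclean-pkg)  echo packages ;;
        eclean)
            # Generic name: take the target from the arguments instead.
            case ${1-} in
                distfiles|packages) echo "$1" ;;
                *) echo "usage: eclean <distfiles|packages>" >&2; return 1 ;;
            esac ;;
        *) echo "unknown invocation name: $invoked_as" >&2; return 1 ;;
    esac
}

pick_action /usr/bin/eclean-dist
pick_action /usr/bin/eclean packages
```

A real script would pass "$0" and "$@" to such a function; the symlinks
only change the name it sees, not the code that runs.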

While this proposes something a little different, it is still very much
along the same lines and, in my opinion, a much better solution.
-- 
Brian Dolbec <dol...@gentoo.org>
