Re: [gentoo-portage-dev] [PATCH] isolated-functions.sh: eliminate loop in has()
On 4/22/16 9:07 AM, rindeal wrote:
> From edc6df44de4e0f22322062c7c7e1b973bd89f4cd Mon Sep 17 00:00:00 2001
> From: Jan Chren
> Date: Fri, 22 Apr 2016 14:21:08 +0200
> Subject: [PATCH] isolated-functions.sh: eliminate loop in has()
>
> Looping is slow and clutters debug log.
> Still this wouldn't matter that much if has() wasn't one of the most used
> functions.

do you have any benchmarks? what you say makes sense, but i'm not sure of the implementation details of "$A" == "*${B}*", so it's hard to say.

>
> Thus this patch should bring a general improvement.
> ---
>  bin/isolated-functions.sh | 10 --
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/bin/isolated-functions.sh b/bin/isolated-functions.sh
> index e320f71..6900f99 100644
> --- a/bin/isolated-functions.sh
> +++ b/bin/isolated-functions.sh
> @@ -463,14 +463,12 @@ hasv() {
>  }
>
>  has() {
> -	local needle=$1
> +	local needle=$'\a'"$1"$'\a'

why the ascii bell? just because you'd never expect it in a parameter to has?

>  	shift
> +	local IFS=$'\a'
> +	local haystack=$'\a'"$@"$'\a'

you want "$*" here, not "$@"

>
> -	local x
> -	for x in "$@"; do
> -		[ "${x}" = "${needle}" ] && return 0
> -	done
> -	return 1
> +	[[ "${haystack}" == *"${needle}"* ]]
>  }
>
>  __repo_attr() {
> --
> 2.7.3
>

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
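For readers following the review, here is a rough Python model of the two has() variants being compared. It is only a sketch of the logic, not the bash code itself: the names has_loop, has_joined and DELIM are made up for illustration, and real bash behaviour (IFS, "$*" expansion) should be checked separately. It shows that the joined-string variant agrees with the loop for ordinary arguments but can differ at the edges, for example with an empty needle and an empty list, or with a needle that contains the delimiter.

```python
DELIM = "\a"  # ASCII bell, the separator the patch assumes never appears in an item


def has_loop(needle, *haystack):
    """Loop-style check: exact comparison against each item."""
    return any(item == needle for item in haystack)


def has_joined(needle, *haystack):
    """Joined-string check: wrap both sides in DELIM, then substring-test."""
    wrapped_needle = DELIM + needle + DELIM
    wrapped_haystack = DELIM + DELIM.join(haystack) + DELIM
    return wrapped_needle in wrapped_haystack


if __name__ == "__main__":
    # Ordinary lookups: both variants agree.
    print(has_loop("b", "a", "b", "c"), has_joined("b", "a", "b", "c"))
    # A needle that is only a substring of an item: both say no.
    print(has_loop("b", "ab", "bc"), has_joined("b", "ab", "bc"))
    # Empty needle against an empty list: the variants disagree.
    print(has_loop(""), has_joined(""))
    # Needle containing the delimiter: the variants disagree again.
    print(has_loop("a\ab", "a", "b"), has_joined("a\ab", "a", "b"))
```

Whether those edge cases matter for real callers of has() is a separate question; the sketch only makes them visible.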
[gentoo-portage-dev] [PATCH] pym/portage/util/locale.py: add a C module to check locale
From: "Anthony G. Basile" The current method to check for the system locale is to use python's ctype.util.find_library() to construct a full library path to the system libc.so which is then passed to ctypes.CDLL(). However, this gets bogged down in implementation dependant details and fails with musl. We work around this design flaw in ctypes with a small python module written in C called 'portage_c_check_locale', and only fall back on the current ctype-based check when this module is not available. Since this is the first python module written in C included in portage, as a side effect, we introduce the machinary for future modules in setup.py. X-Gentoo-bug: 571444 X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=571444 --- pym/portage/util/locale.py | 36 +--- setup.py | 9 ++- src/check_locale.c | 139 + 3 files changed, 174 insertions(+), 10 deletions(-) create mode 100644 src/check_locale.c diff --git a/pym/portage/util/locale.py b/pym/portage/util/locale.py index 2a15ea1..ed97f61 100644 --- a/pym/portage/util/locale.py +++ b/pym/portage/util/locale.py @@ -30,17 +30,17 @@ locale_categories = ( _check_locale_cache = {} -def _check_locale(silent): +def _ctypes_check_locale(): """ - The inner locale check function. + Check for locale using ctypes. 
""" libc_fn = find_library("c") if libc_fn is None: - return None + return (None, "") libc = LoadLibrary(libc_fn) if libc is None: - return None + return (None, "") lc = list(range(ord('a'), ord('z')+1)) uc = list(range(ord('A'), ord('Z')+1)) @@ -48,9 +48,6 @@ def _check_locale(silent): ruc = [libc.toupper(c) for c in lc] if lc != rlc or uc != ruc: - if silent: - return False - msg = ("WARNING: The LC_CTYPE variable is set to a locale " + "that specifies transformation between lowercase " + "and uppercase ASCII characters that is different than " + @@ -71,11 +68,32 @@ def _check_locale(silent): msg.extend([ " %s -> %s" % (chars(uc), chars(rlc)), " %28s: %s" % ('expected', chars(lc))]) + return (False, msg) + + return (True, "") + + +def _check_locale(silent): + """ + The inner locale check function. + """ + + try: + from portage_c_check_locale import _c_check_locale + (ret, msg) = _c_check_locale() + except ImportError: + writemsg_level("!!! Unable to import portage_c_check_locale\n", + level=logging.WARNING, noiselevel=-1) + (ret, msg) = _ctypes_check_locale() + + if ret: + return True + + if not silent: writemsg_level("".join(["!!! %s\n" % l for l in msg]), level=logging.ERROR, noiselevel=-1) - return False - return True + return False def check_locale(silent=False, env=None): diff --git a/setup.py b/setup.py index b066fae..ed353a3 100755 --- a/setup.py +++ b/setup.py @@ -4,7 +4,7 @@ from __future__ import print_function -from distutils.core import setup, Command +from distutils.core import setup, Command, Extension from distutils.command.build import build from distutils.command.build_scripts import build_scripts from distutils.command.clean import clean @@ -41,6 +41,11 @@ x_scripts = { ], } +x_c_helpers = { + 'portage_c_check_locale' : [ + 'src/check_locale.c' + ], +} class x_build(build): """ Build command with extra build_man call. 
""" @@ -636,6 +641,8 @@ setup( ['$sysconfdir/portage/repo.postsync.d', ['cnf/repo.postsync.d/example']], ], + ext_modules = [Extension(name=n, sources=m) for n, m in x_c_helpers.items()], + cmdclass = { 'build': x_build, 'build_man': build_man, diff --git a/src/check_locale.c b/src/check_locale.c new file mode 100644 index 000..a995028 --- /dev/null +++ b/src/check_locale.c @@ -0,0 +1,139 @@ +/* Copyright 2005-2015 Gentoo Foundation + * Distributed under the terms of the GNU General Public License v2 + */ + +#include +#include +#include + +static PyObject * portage_c_check_locale(PyObject *, PyObject *); + +static PyMethodDef CheckLocaleMethods[] = { + {"_c_check_locale&quo
Re: [gentoo-portage-dev] [PATCH] pym/portage/util/locale.py: add a C module to check locale
On 5/17/16 8:47 AM, Anthony G. Basile wrote: > From: "Anthony G. Basile" > > The current method to check for the system locale is to use python's > ctype.util.find_library() to construct a full library path to the > system libc.so which is then passed to ctypes.CDLL(). However, > this gets bogged down in implementation dependant details and > fails with musl. > > We work around this design flaw in ctypes with a small python module > written in C called 'portage_c_check_locale', and only fall back on > the current ctype-based check when this module is not available. > > Since this is the first python module written in C included in portage, > as a side effect, we introduce the machinary for future modules in > setup.py. > > X-Gentoo-bug: 571444 > X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=571444 > --- I didn't want to clutter the commit message, but I'd like to add that I tested this on glibc, uclibc and musl. I also removed the module to make sure the fallback worked. BTW, I made two typos in the commit message where I wrote ctype rather than ctypes. I'll fix that on revision (if necessary). -- Anthony G. Basile, Ph. D. Chair of Information Technology D'Youville College Buffalo, NY 14201 (716) 829-8197
Re: [gentoo-portage-dev] [PATCH] pym/portage/util/locale.py: add a C module to check locale
On 5/17/16 9:38 AM, Brian Dolbec wrote: > On Tue, 17 May 2016 09:02:55 -0400 > "Anthony G. Basile" wrote: > >> On 5/17/16 8:47 AM, Anthony G. Basile wrote: >>> From: "Anthony G. Basile" >>> >>> The current method to check for the system locale is to use python's >>> ctype.util.find_library() to construct a full library path to the >>> system libc.so which is then passed to ctypes.CDLL(). However, >>> this gets bogged down in implementation dependant details and >>> fails with musl. >>> >>> We work around this design flaw in ctypes with a small python module >>> written in C called 'portage_c_check_locale', and only fall back on >>> the current ctype-based check when this module is not available. >>> >>> Since this is the first python module written in C included in >>> portage, as a side effect, we introduce the machinary for future >>> modules in setup.py. >>> >>> X-Gentoo-bug: 571444 >>> X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=571444 >>> --- >> >> I didn't want to clutter the commit message, but I'd like to add that >> I tested this on glibc, uclibc and musl. I also removed the module to >> make sure the fallback worked. > > > I don't think this clutters the commit message and is good to > and should be included in it. *Everything* should be tested. I wanted to reassure people. I can add a comment on revision. > >> >> BTW, I made two typos in the commit message where I wrote ctype rather >> than ctypes. I'll fix that on revision (if necessary). >> >> > > Not being familiar with building/testing c code, is this testable in a > checkout by simply running setup.py build and running the build dir > code? I believe snakeoil has something along those lines for it's > compiled code/pkgcore testing. You can use PYTHONPATH to aim it to the right module path, but its variable for each arch/python version. > > Overall I like it, but I'll let others review the actual code > implementation since I'm not an experienced "C" coder. 
In 30 > years, I never did more than hello world a few times in C. And the > pascal coding I did in College is more like python than c. > > -- Anthony G. Basile, Ph. D. Chair of Information Technology D'Youville College Buffalo, NY 14201 (716) 829-8197
Re: [gentoo-portage-dev] [PATCH] pym/portage/util/locale.py: add a C module to check locale
On 5/17/16 8:47 AM, Anthony G. Basile wrote:
> +
> +	try:
> +		from portage_c_check_locale import _c_check_locale
> +		(ret, msg) = _c_check_locale()
> +	except ImportError:
> +		writemsg_level("!!! Unable to import portage_c_check_locale\n",
> +			level=logging.WARNING, noiselevel=-1)
> +		(ret, msg) = _ctypes_check_locale()
> +

actually there's an error here. the msg returned from _c_check_locale() is a string, while what's returned from _ctypes_check_locale() is a list of strings which gets joined later. i kept going back and forth on the code and mixed it up. i'll fix it on revision, but i'll wait a bit for other comments first.

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
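One way to picture the mismatch described above is a small normalizing helper that accepts either shape of msg. This is only an illustrative sketch: format_locale_message is a hypothetical name, not something from the patch, and the "!!! " prefix is copied from the writemsg_level call in the posted code.

```python
def format_locale_message(msg):
    """Return one printable string whether msg is a str or a list of lines.

    Hypothetical helper, shown only to illustrate the str-vs-list mismatch.
    """
    if isinstance(msg, str):
        lines = msg.splitlines()
    else:
        lines = list(msg)
    # Same per-line prefix the original code applied before logging.
    return "".join("!!! %s\n" % line for line in lines)


if __name__ == "__main__":
    print(format_locale_message(["first line", "second line"]), end="")
    print(format_locale_message("first line\nsecond line"), end="")
```

Both calls produce the same text, which is the property the two check functions would need to share for the caller to treat them interchangeably.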
Re: [gentoo-portage-dev] [PATCH] pym/portage/util/locale.py: add a C module to check locale
On 5/18/16 2:50 AM, Alexander Berntsen wrote:
> On 17/05/16 14:47, Anthony G. Basile wrote:
>> Since this is the first python module written in C included in
>> portage, as a side effect, we introduce the machinary for future
>> modules in setup.py.
> Split it into two commits.
>

Read the code, you really can't.

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
Re: [gentoo-portage-dev] [PATCH] pym/portage/util/locale.py: add a C module to check locale
On 5/17/16 9:38 AM, Brian Dolbec wrote: > > Overall I like it, but I'll let others review the actual code > implementation since I'm not an experienced "C" coder. In 30 > years, I never did more than hello world a few times in C. And the > pascal coding I did in College is more like python than c. > > I updated the patch and its working fine with python3.4 but with 2.7 I'm hitting the following error which makes no sense to me. What's even worse, is that I can manually import the module from a python2.7 shell and it works as intended, but it doesn't in the context of portage. As part of my debugging I stripped the module down to what amounts to a "hello world" and I still hit this. So *any* python module written in C is going to have this issue. Anyone have a clue? Traceback (most recent call last): File "/usr/lib/python-exec/python2.7/emerge", line 50, in retval = emerge_main() File "/usr/lib64/python2.7/site-packages/_emerge/main.py", line 1157, in emerge_main action=myaction, args=myfiles, opts=myopts) File "/usr/lib64/python2.7/site-packages/portage/proxy/objectproxy.py", line 31, in __call__ return result(*args, **kwargs) File "/usr/lib64/python2.7/site-packages/_emerge/actions.py", line 2387, in load_emerge_config root_trees["root_config"] = RootConfig(settings, root_trees, setconfig) File "/usr/lib64/python2.7/site-packages/_emerge/RootConfig.py", line 27, in __init__ self.sets = self.setconfig.getSets() File "/usr/lib64/python2.7/site-packages/portage/_sets/__init__.py", line 284, in getSets self._parse() File "/usr/lib64/python2.7/site-packages/portage/_sets/__init__.py", line 230, in _parse optdict[oname] = parser.get(sname, oname) File "/usr/lib64/python2.7/ConfigParser.py", line 623, in get return self._interpolate(section, option, value, d) File "/usr/lib64/python2.7/ConfigParser.py", line 691, in _interpolate self._interpolate_some(option, L, rawval, section, vars, 1) File "/usr/lib64/python2.7/ConfigParser.py", line 723, in _interpolate_some option, 
section, rest, var) InterpolationMissingOptionError: Bad value substitution: section: [usersets] option : directory key: portage_configroot rawval : etc/portage/sets -- Anthony G. Basile, Ph. D. Chair of Information Technology D'Youville College Buffalo, NY 14201 (716) 829-8197
Re: [gentoo-portage-dev] [PATCH] pym/portage/util/locale.py: add a C module to check locale
On 5/18/16 10:26 AM, Brian Dolbec wrote:
> On Wed, 18 May 2016 06:08:20 -0400
> "Anthony G. Basile" wrote:
>
>> On 5/18/16 2:50 AM, Alexander Berntsen wrote:
>>> On 17/05/16 14:47, Anthony G. Basile wrote:
>>>> Since this is the first python module written in C included in
>>>> portage, as a side effect, we introduce the machinary for future
>>>> modules in setup.py.
>>> Split it into two commits.
>>>
>> Read the code, you really can't.
>>
>
> I think he means the setup.py change that adds the extension
> capability, then add the locale changes and new module.
> That seems like a logical split to me.
>

Actually it's got nothing to do with the module. It's an independent bug. Here's how you can reproduce:

1. edit /etc/locale.gen and uncomment all the locales
2. run locale-gen
3. use eselect locale to choose 'turkish'
4. use eselect python to choose 2.7
5. emerge =sys-apps/gradm or anything

No patch, nothing. You'll hit it with 2.2.28 and above.

Brian, can you see if you hit it there?

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
Re: [gentoo-portage-dev] [PATCH] pym/portage/util/locale.py: add a C module to check locale
On 5/18/16 10:51 AM, Anthony G. Basile wrote: > On 5/18/16 10:26 AM, Brian Dolbec wrote: >> On Wed, 18 May 2016 06:08:20 -0400 >> "Anthony G. Basile" wrote: >> >>> On 5/18/16 2:50 AM, Alexander Berntsen wrote: >>>> On 17/05/16 14:47, Anthony G. Basile wrote: >>>>> Since this is the first python module written in C included in >>>>> portage, as a side effect, we introduce the machinary for future >>>>> modules in setup.py. >>>> Split it into two commits. >>>> >>> Read the code, you really can't. >>> >> >> I think he means the setup.py change that adds the extension >> capability, then add the locale changes and new module. >> That seems like a logical split to me. >> > > Actually its got nothing to do with the module. Its an independent bug. > Here's how you can reproduce: > > 1. edit /etc/locale.gen and uncomment all the locales > > 2. run locale-gen > > 3. use eselect locale to choose 'turkish' > > 4. use eselect python to choose 2.7 > > 5. emerge =sys-apps/gradm or anything > > No patch nothing. You'll hit it with 2.2.28 and above. > > Brian, can you see if you hit it there. > oops sorry I replied to the wrong email. I can do that split. -- Anthony G. Basile, Ph. D. Chair of Information Technology D'Youville College Buffalo, NY 14201 (716) 829-8197
[gentoo-portage-dev] [PATCH 2/2] pym/portage/util/locale.py: add a C module to check locale
From: "Anthony G. Basile" The current method to check for the system locale is to use python's ctypes.util.find_library() to construct a full library path to the system libc.so which is then passed to ctypes.CDLL(). However, this gets bogged down in implementation dependant details and fails with musl. We work around this design flaw in ctypes with a small python module written in C called 'portage_c_check_locale', and only fall back on the current ctypes-based check when this module is not available. This has been tested on glibc, uClibc and musl systems. X-Gentoo-bug: 571444 X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=571444 Signed-off-by: Anthony G. Basile --- pym/portage/util/locale.py | 40 + setup.py | 6 +- src/check_locale.c | 144 + 3 files changed, 178 insertions(+), 12 deletions(-) create mode 100644 src/check_locale.c diff --git a/pym/portage/util/locale.py b/pym/portage/util/locale.py index 2a15ea1..fc5d052 100644 --- a/pym/portage/util/locale.py +++ b/pym/portage/util/locale.py @@ -30,17 +30,17 @@ locale_categories = ( _check_locale_cache = {} -def _check_locale(silent): +def _ctypes_check_locale(): """ - The inner locale check function. + Check for locale using ctypes. """ libc_fn = find_library("c") if libc_fn is None: - return None + return (None, "") libc = LoadLibrary(libc_fn) if libc is None: - return None + return (None, "") lc = list(range(ord('a'), ord('z')+1)) uc = list(range(ord('A'), ord('Z')+1)) @@ -48,9 +48,6 @@ def _check_locale(silent): ruc = [libc.toupper(c) for c in lc] if lc != rlc or uc != ruc: - if silent: - return False - msg = ("WARNING: The LC_CTYPE variable is set to a locale " + "that specifies transformation between lowercase " + "and uppercase ASCII characters that is different than " + @@ -71,11 +68,32 @@ def _check_locale(silent): msg.extend([ " %s -> %s" % (chars(uc), chars(rlc)), " %28s: %s" % ('expected', chars(lc))]) - writemsg_level("".join(["!!! 
%s\n" % l for l in msg]), - level=logging.ERROR, noiselevel=-1) - return False + msg = "".join(["!!! %s\n" % l for l in msg]), + return (False, msg) + + return (True, "") + + +def _check_locale(silent): + """ + The inner locale check function. + """ + + try: + from portage_c_check_locale import _c_check_locale + (ret, msg) = _c_check_locale() + except ImportError: + writemsg_level("!!! Unable to import portage_c_check_locale\n", + level=logging.WARNING, noiselevel=-1) + (ret, msg) = _ctypes_check_locale() + + if ret: + return True + + if not silent: + writemsg_level(msg, level=logging.ERROR, noiselevel=-1) - return True + return False def check_locale(silent=False, env=None): diff --git a/setup.py b/setup.py index 25429bc..e44ac41 100755 --- a/setup.py +++ b/setup.py @@ -47,7 +47,11 @@ x_scripts = { # Dictionary custom modules written in C/C++ here. The structure is # key = module name # value = list of C/C++ source code, path relative to top source directory -x_c_helpers = {} +x_c_helpers = { + 'portage_c_check_locale' : [ + 'src/check_locale.c', + ], +} class x_build(build): """ Build command with extra build_man call. """ diff --git a/src/check_locale.c b/src/check_locale.c new file mode 100644 index 000..9762ef2 --- /dev/null +++ b/src/check_locale.c @@ -0,0 +1,144 @@ +/* Copyright 2005-2015 Gentoo Foundation + * Distributed under the terms of the GNU General Public License v2 + */ + +#include +#include +#include + +static PyObject * portage_c_check_locale(PyObject *, PyObject *); + +static PyMethodDef CheckLocaleMethods[] = { + {"_c_check_locale", portage_c_check_locale, METH_NOARGS, "Check the system locale."}, + {NULL, NULL, 0, NULL} +}; + +#if PY_MAJOR_VERSION >= 3 +static struct PyModuleDef moduledef = { + PyModuleDef_HEAD_INIT, + "portage_c_check_locale", /* m_name */
[gentoo-portage-dev] [PATCH 1/2] setup.py: add stub for building custom modules in C/C++
From: "Anthony G. Basile" Currently portage doesn't include any custom modules written in C/C++. This commit introduces stub code for building such modules in setup.py. Signed-off-by: Anthony G. Basile --- setup.py | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/setup.py b/setup.py index 75c4bcb..25429bc 100755 --- a/setup.py +++ b/setup.py @@ -4,7 +4,7 @@ from __future__ import print_function -from distutils.core import setup, Command +from distutils.core import setup, Command, Extension from distutils.command.build import build from distutils.command.build_scripts import build_scripts from distutils.command.clean import clean @@ -30,6 +30,9 @@ import sys # TODO: # - smarter rebuilds of docs w/ 'install_docbook' and 'install_epydoc'. +# Dictionary of scripts. The structure is +# key = location in filesystem to install the scripts +# value = list of scripts, path relative to top source directory x_scripts = { 'bin': [ 'bin/ebuild', 'bin/egencache', 'bin/emerge', 'bin/emerge-webrsync', @@ -41,6 +44,10 @@ x_scripts = { ], } +# Dictionary custom modules written in C/C++ here. The structure is +# key = module name +# value = list of C/C++ source code, path relative to top source directory +x_c_helpers = {} class x_build(build): """ Build command with extra build_man call. """ @@ -636,6 +643,8 @@ setup( ['$sysconfdir/portage/repo.postsync.d', ['cnf/repo.postsync.d/example']], ], + ext_modules = [Extension(name=n, sources=m) for n, m in x_c_helpers.items()], + cmdclass = { 'build': x_build, 'build_man': build_man, -- 2.7.3
Re: [gentoo-portage-dev] [PATCH 2/2] pym/portage/util/locale.py: add a C module to check locale
On 5/19/16 8:43 AM, Anthony G. Basile wrote:
> From: "Anthony G. Basile"
>
> This has been tested on glibc, uClibc and musl systems.

To be clear here, I needed the patch from bug #583412 to fix an independent bug in portage + python2.7 + turkish (and possibly other) locale. Arfrever is going to commit that upstream, so it'll eventually trickle down to us.

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
Re: [gentoo-portage-dev] [PATCH 2/2] pym/portage/util/locale.py: add a C module to check locale
On 5/19/16 9:38 AM, Michał Górny wrote: > Dnia 19 maja 2016 14:43:38 CEST, "Anthony G. Basile" > napisał(a): >> From: "Anthony G. Basile" >> >> The current method to check for the system locale is to use python's >> ctypes.util.find_library() to construct a full library path to the >> system libc.so which is then passed to ctypes.CDLL(). However, >> this gets bogged down in implementation dependant details and >> fails with musl. >> >> We work around this design flaw in ctypes with a small python module >> written in C called 'portage_c_check_locale', and only fall back on >> the current ctypes-based check when this module is not available. >> >> This has been tested on glibc, uClibc and musl systems. >> >> X-Gentoo-bug: 571444 >> X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=571444 >> Signed-off-by: Anthony G. Basile > > To be honest, I don't like this. Do we really want to duplicate all this, > including duplicating the complete messages? This really looks messy and > unmaintainable. > > The only reason libc functions were being used was to workaround the hacky > built-in case conversion functions. And this is the only part where CDLL was > really used. > > So please change this to provide trivial wrappers around the C library > functions, and use them alternatively to ones obtained using CDLL. With a > single common code doing all the check logic and messaging. > > ctypes is itself a problem and so hacky python built-ins were replaced by more hacky python. CDLL shouldn't be used at all. The other problem is that non utf-8 encodings cause problems much earlier in the portage codebase than the test for a sane environment, which is a bigger problem than this small issue. No one checked what happens with python2.7 + portage + exotic locale. Since I don't know where this might head in the future, I kinda like the standalone potential. The only repeated code here is the message. 
Nonetheless, I can reduce this to just two functions, and do something like the following. I assume that's what you're suggesting:

    try:
        from portage_c_convert_case import _c_toupper, _c_tolower
        libc_toupper = _c_toupper
        libc_tolower = _c_tolower
    except ImportError:
        libc_fn = find_library("c")
        if libc_fn is None:
            return None
        libc = LoadLibrary(libc_fn)
        if libc is None:
            return None
        libc_toupper = libc.toupper
        libc_tolower = libc.tolower

Incidentally, another approach, one that I use in bash, is as follows. I think it requires bash4, and I'm not sure how to pull this into python gracefully.

    l="abcdefghijklmnopqrstuvwxyz"
    u="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    ru=${l^^}
    rl=${u,,}
    [[ $l == $rl ]] && echo "lower case okay" || echo "lower case bad"
    [[ $u == $ru ]] && echo "upper case okay" || echo "upper case bad"

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
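The "single common code" idea discussed in this exchange can be sketched as a check that takes the two conversion functions as parameters, so it does not care whether they come from a C module or from ctypes. Everything below is illustrative: case_roundtrip_ok, the pure-Python ASCII converters and the dotted-I stand-in (with its arbitrary code 0xDD) are assumptions made for the example, not code from the patch.

```python
def case_roundtrip_ok(tolower, toupper):
    """True if the given converters map a-z <-> A-Z the way ASCII expects."""
    lc = list(range(ord('a'), ord('z') + 1))
    uc = list(range(ord('A'), ord('Z') + 1))
    rlc = [tolower(c) for c in uc]
    ruc = [toupper(c) for c in lc]
    return lc == rlc and uc == ruc


def ascii_tolower(c):
    """Plain ASCII lowercasing on an integer character code."""
    return c + 32 if ord('A') <= c <= ord('Z') else c


def ascii_toupper(c):
    """Plain ASCII uppercasing on an integer character code."""
    return c - 32 if ord('a') <= c <= ord('z') else c


def dotted_i_toupper(c):
    """Stand-in for a locale where toupper('i') is not 'I' (0xDD is illustrative)."""
    return 0xDD if c == ord('i') else ascii_toupper(c)


if __name__ == "__main__":
    print(case_roundtrip_ok(ascii_tolower, ascii_toupper))
    print(case_roundtrip_ok(ascii_tolower, dotted_i_toupper))
```

With this shape, the import-or-fallback block only has to pick which pair of callables to pass in, and the comparison and messaging live in one place.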
[gentoo-portage-dev] [PATCH 2/2] pym/portage/util/locale.py: add a C module to help check locale
From: "Anthony G. Basile" The current method to check for a sane system locale is to use python's ctypes.util.find_library() to construct a full library path to the system libc.so and pass that path to ctypes.CDLL() so we can call toupper() and tolower() directly. However, this gets bogged down in implementation details and fails with musl. We work around this design flaw in ctypes with a small python module written in C which provides thin wrappers to toupper() and tolower(), and only fall back on the current ctypes-based check when this module is not available. This has been tested on glibc, uClibc and musl systems. X-Gentoo-bug: 571444 X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=571444 Signed-off-by: Anthony G. Basile --- pym/portage/util/locale.py | 32 ++- setup.py | 6 ++- src/portage_c_convert_case.c | 94 3 files changed, 121 insertions(+), 11 deletions(-) create mode 100644 src/portage_c_convert_case.c diff --git a/pym/portage/util/locale.py b/pym/portage/util/locale.py index 2a15ea1..85ddd2b 100644 --- a/pym/portage/util/locale.py +++ b/pym/portage/util/locale.py @@ -11,6 +11,7 @@ from __future__ import absolute_import, unicode_literals import locale import logging import os +import sys import textwrap import traceback @@ -34,18 +35,26 @@ def _check_locale(silent): """ The inner locale check function. """ - - libc_fn = find_library("c") - if libc_fn is None: - return None - libc = LoadLibrary(libc_fn) - if libc is None: - return None + try: + from portage_c_convert_case import _c_toupper, _c_tolower + libc_tolower = _c_tolower + libc_toupper = _c_toupper + except ImportError: + writemsg_level("!!! 
Unable to import portage_c_convert_case\n!!!\n", + level=logging.WARNING, noiselevel=-1) + libc_fn = find_library("c") + if libc_fn is None: + return None + libc = LoadLibrary(libc_fn) + if libc is None: + return None + libc_tolower = libc.tolower + libc_toupper = libc.toupper lc = list(range(ord('a'), ord('z')+1)) uc = list(range(ord('A'), ord('Z')+1)) - rlc = [libc.tolower(c) for c in uc] - ruc = [libc.toupper(c) for c in lc] + rlc = [libc_tolower(c) for c in uc] + ruc = [libc_toupper(c) for c in lc] if lc != rlc or uc != ruc: if silent: @@ -62,7 +71,10 @@ def _check_locale(silent): "as LC_CTYPE in make.conf.") msg = [l for l in textwrap.wrap(msg, 70)] msg.append("") - chars = lambda l: ''.join(chr(x) for x in l) + if sys.version_info.major >= 3: + chars = lambda l: ''.join(chr(x) for x in l) + else: + chars = lambda l: ''.join(chr(x).decode('utf-8', 'replace') for x in l) if uc != ruc: msg.extend([ " %s -> %s" % (chars(lc), chars(ruc)), diff --git a/setup.py b/setup.py index 25429bc..8b6b408 100755 --- a/setup.py +++ b/setup.py @@ -47,7 +47,11 @@ x_scripts = { # Dictionary custom modules written in C/C++ here. The structure is # key = module name # value = list of C/C++ source code, path relative to top source directory -x_c_helpers = {} +x_c_helpers = { + 'portage_c_convert_case' : [ + 'src/portage_c_convert_case.c', + ], +} class x_build(build): """ Build command with extra build_man call. 
""" diff --git a/src/portage_c_convert_case.c b/src/portage_c_convert_case.c new file mode 100644 index 000..f60b0c2 --- /dev/null +++ b/src/portage_c_convert_case.c @@ -0,0 +1,94 @@ +/* Copyright 2005-2016 Gentoo Foundation + * Distributed under the terms of the GNU General Public License v2 + */ + +#include +#include + +static PyObject * portage_c_tolower(PyObject *, PyObject *); +static PyObject * portage_c_toupper(PyObject *, PyObject *); + +static PyMethodDef ConvertCaseMethods[] = { + {"_c_tolower", portage_c_tolower, METH_VARARGS, "Convert to lower case using system locale."}, + {"_c_toupper", portage_c_toupper, METH_VARARGS, "Convert to upper case using system locale."}, + {NULL, NULL, 0, NULL} +}; + +#if PY_MAJOR_VERSION >= 3 +static struct PyModuleDef moduledef = { + PyModuleDef_HEAD_INIT, + "portage_c_convert_case",
[gentoo-portage-dev] [PATCH 1/2] setup.py: add stub for building custom modules in C/C++
From: "Anthony G. Basile" Currently portage doesn't include any custom modules written in C/C++. This commit introduces stub code for building such modules in setup.py. Signed-off-by: Anthony G. Basile --- setup.py | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/setup.py b/setup.py index 75c4bcb..25429bc 100755 --- a/setup.py +++ b/setup.py @@ -4,7 +4,7 @@ from __future__ import print_function -from distutils.core import setup, Command +from distutils.core import setup, Command, Extension from distutils.command.build import build from distutils.command.build_scripts import build_scripts from distutils.command.clean import clean @@ -30,6 +30,9 @@ import sys # TODO: # - smarter rebuilds of docs w/ 'install_docbook' and 'install_epydoc'. +# Dictionary of scripts. The structure is +# key = location in filesystem to install the scripts +# value = list of scripts, path relative to top source directory x_scripts = { 'bin': [ 'bin/ebuild', 'bin/egencache', 'bin/emerge', 'bin/emerge-webrsync', @@ -41,6 +44,10 @@ x_scripts = { ], } +# Dictionary custom modules written in C/C++ here. The structure is +# key = module name +# value = list of C/C++ source code, path relative to top source directory +x_c_helpers = {} class x_build(build): """ Build command with extra build_man call. """ @@ -636,6 +643,8 @@ setup( ['$sysconfdir/portage/repo.postsync.d', ['cnf/repo.postsync.d/example']], ], + ext_modules = [Extension(name=n, sources=m) for n, m in x_c_helpers.items()], + cmdclass = { 'build': x_build, 'build_man': build_man, -- 2.7.3
Re: [gentoo-portage-dev] [PATCH 2/2] pym/portage/util/locale.py: add a C module to help check locale
On 5/23/16 2:44 AM, Michał Górny wrote: > On Sun, 22 May 2016 13:04:40 -0400 > "Anthony G. Basile" wrote: > >> From: "Anthony G. Basile" >> >> The current method to check for a sane system locale is to use python's >> ctypes.util.find_library() to construct a full library path to the >> system libc.so and pass that path to ctypes.CDLL() so we can call >> toupper() and tolower() directly. However, this gets bogged down in >> implementation details and fails with musl. >> >> We work around this design flaw in ctypes with a small python module >> written in C which provides thin wrappers to toupper() and tolower(), >> and only fall back on the current ctypes-based check when this module >> is not available. >> >> This has been tested on glibc, uClibc and musl systems. >> >> X-Gentoo-bug: 571444 >> X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=571444 >> >> Signed-off-by: Anthony G. Basile >> --- >> pym/portage/util/locale.py | 32 ++- >> setup.py | 6 ++- >> src/portage_c_convert_case.c | 94 >> >> 3 files changed, 121 insertions(+), 11 deletions(-) >> create mode 100644 src/portage_c_convert_case.c >> >> diff --git a/pym/portage/util/locale.py b/pym/portage/util/locale.py >> index 2a15ea1..85ddd2b 100644 >> --- a/pym/portage/util/locale.py >> +++ b/pym/portage/util/locale.py >> @@ -11,6 +11,7 @@ from __future__ import absolute_import, unicode_literals >> import locale >> import logging >> import os >> +import sys >> import textwrap >> import traceback >> >> @@ -34,18 +35,26 @@ def _check_locale(silent): >> """ >> The inner locale check function. >> """ >> - >> -libc_fn = find_library("c") >> -if libc_fn is None: >> -return None >> -libc = LoadLibrary(libc_fn) >> -if libc is None: >> -return None >> +try: >> +from portage_c_convert_case import _c_toupper, _c_tolower >> +libc_tolower = _c_tolower >> +libc_toupper = _c_toupper > > Now I'm being picky... 
> but if you named the functions toupper()
> and tolower(), you could actually import the whole module as 'libc'
> and have less code!

I see what you're saying, and it's tempting because it's elegant, but I'm afraid of a clash of names. I've got a bad feeling this will get us into trouble later. Let me play with this and see what happens.

>
> Also it would be nice to actually make the module more generic. There
> are more places where we use CDLL, and all of them could eventually be
> supported by the module (unshare() would be much better done in C, for
> example).

Yeah, I get your point here. Let me convince myself first.

>
>> +	except ImportError:
>> +		writemsg_level("!!! Unable to import portage_c_convert_case\n!!!\n",
>> +			level=logging.WARNING, noiselevel=-1)
>
> Do we really want to warn verbosely about this? I think it'd be
> a pretty common case for people running the git checkout.

This should stay. It's good to know that the module is not being imported and silently falling back on the ctypes stuff.

1) it's only going to happen on the rare occasion that you're using something like a turkish locale and can't import the module.

2) people who do a git checkout should add PYTHONPATH=build/lib.linux-x86_64-3.4 to their env to test the module.

I can add something to testpath. Users will have to be instructed to run `./setup.py build`, and then the script should read something like this:

    unamem=$(uname -m)
    pythonversion=$(python --version 2>&1 | cut -c8-)
    pythonversion=${pythonversion%\.*}
    portagedir=$(dirname "${BASH_SOURCE[0]}")
    export PATH="${portagedir}/bin:${PATH}"
    export PYTHONPATH="${portagedir}/build/lib.linux-${unamem}-${pythonversion}:${portagedir}/pym${PYTHONPATH:+:}${PYTHONPATH}"
    export PYTHONWARNINGS=d,i::ImportWarning

BTW, the original code must have a bug in it. It reads

    export PYTHONPATH=PYTHONPATH="$(dirname $BASH_SOURCE[0])/pym:${PYTHONPATH:+:}${PYTHONPATH}"

The double PYTHONPATH=PYTHONPATH= can't be right.
> >> +libc_fn = find_library("c") >> +if libc_fn is None: >> +return None >> +libc = LoadLibrary(libc_fn) >> +if libc is None: >> +return None >> +libc_tolower = libc.tolower >> +
[gentoo-portage-dev] [PATCH 1/3] pym/portage/util/locale.py: fix decoding for python2 with some locales
From: "Anthony G. Basile" When using python2 with some locales, like turkish, chr() is passed values not in range(128) which cannot be decoded as ASCII, thus throwing a UnicodeDecodeError exception. We use _unicode_decode() from portage.util to address this. Signed-off-by: Anthony G. Basile --- pym/portage/util/locale.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/pym/portage/util/locale.py b/pym/portage/util/locale.py index 2a15ea1..093eb86 100644 --- a/pym/portage/util/locale.py +++ b/pym/portage/util/locale.py @@ -15,7 +15,7 @@ import textwrap import traceback import portage -from portage.util import writemsg_level +from portage.util import _unicode_decode, writemsg_level from portage.util._ctypes import find_library, LoadLibrary @@ -62,7 +62,7 @@ def _check_locale(silent): "as LC_CTYPE in make.conf.") msg = [l for l in textwrap.wrap(msg, 70)] msg.append("") - chars = lambda l: ''.join(chr(x) for x in l) + chars = lambda l: ''.join(_unicode_decode(chr(x)) for x in l) if uc != ruc: msg.extend([ " %s -> %s" % (chars(lc), chars(ruc)), -- 2.7.3
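To illustrate the failure mode this patch describes: a byte value above 127 cannot be decoded as ASCII, so joining such values into a message needs an explicit, forgiving decode. The sketch below is a standalone stand-in, not portage's _unicode_decode(); the function name chars() mirrors the lambda in the patch, while the utf-8 default and the "replace" error handler are assumptions made for the example.

```python
def chars(codes, encoding="utf-8"):
    """Join integer byte values (0-255) into text without raising.

    Each value is turned into a single byte and decoded on its own;
    bytes that are not valid in the chosen encoding become U+FFFD
    instead of triggering UnicodeDecodeError.
    """
    return "".join(
        bytes(bytearray([code])).decode(encoding, "replace") for code in codes
    )
```

With plain ASCII input this behaves like the original lambda; with a value such as 0xDD (which a Turkish single-byte locale can return from toupper()) it yields a replacement character rather than an exception.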
[gentoo-portage-dev] [PATCH 3/3] pym/portage/util/locale.py: add a C module to help check locale
From: "Anthony G. Basile" The current method to check for a sane system locale is to use python's ctypes.util.find_library() to construct a full library path to the system libc.so and pass that path to ctypes.CDLL() so we can call toupper() and tolower() directly. However, this gets bogged down in implementation details and fails with musl. We work around this design flaw in ctypes with a small python module written in C which provides thin wrappers to toupper() and tolower(), and only fall back on the current ctypes-based check when this module is not available. This has been tested on glibc, uClibc and musl systems. X-Gentoo-bug: 571444 X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=571444 Signed-off-by: Anthony G. Basile --- pym/portage/util/locale.py | 16 ++- setup.py | 6 +++- src/portage_util_libc.c| 70 ++ 3 files changed, 84 insertions(+), 8 deletions(-) create mode 100644 src/portage_util_libc.c diff --git a/pym/portage/util/locale.py b/pym/portage/util/locale.py index 093eb86..5b09945 100644 --- a/pym/portage/util/locale.py +++ b/pym/portage/util/locale.py @@ -34,13 +34,15 @@ def _check_locale(silent): """ The inner locale check function. """ - - libc_fn = find_library("c") - if libc_fn is None: - return None - libc = LoadLibrary(libc_fn) - if libc is None: - return None + try: + from portage.util import libc + except ImportError: + libc_fn = find_library("c") + if libc_fn is None: + return None + libc = LoadLibrary(libc_fn) + if libc is None: + return None lc = list(range(ord('a'), ord('z')+1)) uc = list(range(ord('A'), ord('Z')+1)) diff --git a/setup.py b/setup.py index 25429bc..5ca8156 100755 --- a/setup.py +++ b/setup.py @@ -47,7 +47,11 @@ x_scripts = { # Dictionary custom modules written in C/C++ here. 
The structure is # key = module name # value = list of C/C++ source code, path relative to top source directory -x_c_helpers = {} +x_c_helpers = { + 'portage.util.libc' : [ + 'src/portage_util_libc.c', + ], +} class x_build(build): """ Build command with extra build_man call. """ diff --git a/src/portage_util_libc.c b/src/portage_util_libc.c new file mode 100644 index 000..00b09c2 --- /dev/null +++ b/src/portage_util_libc.c @@ -0,0 +1,70 @@ +/* Copyright 2005-2016 Gentoo Foundation + * Distributed under the terms of the GNU General Public License v2 + */ + +#include +#include +#include + +static PyObject * _libc_tolower(PyObject *, PyObject *); +static PyObject * _libc_toupper(PyObject *, PyObject *); + +static PyMethodDef LibcMethods[] = { + {"tolower", _libc_tolower, METH_VARARGS, "Convert to lower case using system locale."}, + {"toupper", _libc_toupper, METH_VARARGS, "Convert to upper case using system locale."}, + {NULL, NULL, 0, NULL} +}; + +#if PY_MAJOR_VERSION >= 3 +static struct PyModuleDef moduledef = { + PyModuleDef_HEAD_INIT, + "libc", /* m_name */ + "Module for converting case using the system locale", /* m_doc */ + -1, /* m_size */ + LibcMethods,/* m_methods */ + NULL, /* m_reload */ + NULL, /* m_traverse */ + NULL, /* m_clear */ + NULL, /* m_free */ +}; +#endif + +PyMODINIT_FUNC + +#if PY_MAJOR_VERSION >= 3 +PyInit_libc(void) +{ + PyObject *m; + m = PyModule_Create(&moduledef); + return m; +} +#else +initlibc(void) +{ + Py_InitModule("libc", LibcMethods); +} +#endif + + +static PyObject * +_libc_tolower(PyObject *self, PyObject *args) +{ + int c; + + if (!PyArg_ParseTuple(args, "i", &c)) + return NULL; + + return Py_BuildValue("i", tolower(c)); +} + + +static PyObject * +_libc_toupper(PyObject *self, PyObject *args) +{ + int c; + + if (!PyArg_ParseTuple(args, "i", &c)) + return NULL; + + return Py_BuildValue("i", toupper(c)); +} -- 2.7.3
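Because the C module exposes tolower() and toupper() under the same names as the ctypes handle, the check itself does not need to know which backend it got. A rough sketch of that idea follows, assuming only that the backend has those two callables taking and returning ints; case_conversion_is_sane() and the two stand-in classes are hypothetical names for illustration, not code from the patch.

```python
import string

LOWER = [ord(c) for c in string.ascii_lowercase]
UPPER = [ord(c) for c in string.ascii_uppercase]


def case_conversion_is_sane(libc):
    """Return True if a-z and A-Z map onto each other under this backend."""
    ruc = [libc.toupper(c) for c in LOWER]
    rlc = [libc.tolower(c) for c in UPPER]
    return ruc == UPPER and rlc == LOWER


class AsciiLibc(object):
    """Stand-in backend that behaves like the C locale."""

    @staticmethod
    def toupper(c):
        return c - 32 if ord("a") <= c <= ord("z") else c

    @staticmethod
    def tolower(c):
        return c + 32 if ord("A") <= c <= ord("Z") else c


class TurkishLikeLibc(AsciiLibc):
    """Stand-in that mimics a locale where 'i' does not uppercase to 'I'."""

    @staticmethod
    def toupper(c):
        if c == ord("i"):
            return 0xDD  # capital dotted I in ISO-8859-9
        return AsciiLibc.toupper(c)
```

In real use the argument would be whatever the try/except in _check_locale() ended up binding to libc; the stand-ins just make the behaviour easy to see without touching the process locale.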
[gentoo-portage-dev] [PATCH 2/3] setup.py: add stub for building custom modules in C/C++
From: "Anthony G. Basile" Currently portage doesn't include any custom modules written in C/C++. This commit introduces stub code for building such modules in setup.py. Signed-off-by: Anthony G. Basile --- setup.py | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/setup.py b/setup.py index 75c4bcb..25429bc 100755 --- a/setup.py +++ b/setup.py @@ -4,7 +4,7 @@ from __future__ import print_function -from distutils.core import setup, Command +from distutils.core import setup, Command, Extension from distutils.command.build import build from distutils.command.build_scripts import build_scripts from distutils.command.clean import clean @@ -30,6 +30,9 @@ import sys # TODO: # - smarter rebuilds of docs w/ 'install_docbook' and 'install_epydoc'. +# Dictionary of scripts. The structure is +# key = location in filesystem to install the scripts +# value = list of scripts, path relative to top source directory x_scripts = { 'bin': [ 'bin/ebuild', 'bin/egencache', 'bin/emerge', 'bin/emerge-webrsync', @@ -41,6 +44,10 @@ x_scripts = { ], } +# Dictionary custom modules written in C/C++ here. The structure is +# key = module name +# value = list of C/C++ source code, path relative to top source directory +x_c_helpers = {} class x_build(build): """ Build command with extra build_man call. """ @@ -636,6 +643,8 @@ setup( ['$sysconfdir/portage/repo.postsync.d', ['cnf/repo.postsync.d/example']], ], + ext_modules = [Extension(name=n, sources=m) for n, m in x_c_helpers.items()], + cmdclass = { 'build': x_build, 'build_man': build_man, -- 2.7.3
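The x_c_helpers stub is just a mapping from module name to source list, which the list comprehension turns into Extension objects. The sketch below shows the same transformation as plain keyword dictionaries so it can be read without distutils at hand; extension_specs() is a hypothetical helper, and the sample entry is borrowed from patch 3/3 of this series.

```python
def extension_specs(c_helpers):
    """Turn {module name: [sources]} into keyword dicts for Extension()."""
    specs = []
    for name, sources in sorted(c_helpers.items()):
        # Copy the list so later edits to the spec do not alter the mapping.
        specs.append({"name": name, "sources": list(sources)})
    return specs


# Sample mapping, as it looks once patch 3/3 fills in the stub.
x_c_helpers = {
    "portage.util.libc": [
        "src/portage_util_libc.c",
    ],
}
```

In setup.py each dict would be expanded as Extension(**spec); with an empty x_c_helpers the result is an empty list, so the stub on its own builds nothing.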
Re: [gentoo-portage-dev] [PATCH 2/2] pym/portage/util/locale.py: add a C module to help check locale
On 5/23/16 10:25 AM, Michał Górny wrote: > On Mon, 23 May 2016 08:08:18 -0400 > "Anthony G. Basile" wrote: > >> On 5/23/16 2:44 AM, Michał Górny wrote: >>> On Sun, 22 May 2016 13:04:40 -0400 >>> "Anthony G. Basile" wrote: >>> >>>> From: "Anthony G. Basile" >>>> >>>> The current method to check for a sane system locale is to use python's >>>> ctypes.util.find_library() to construct a full library path to the >>>> system libc.so and pass that path to ctypes.CDLL() so we can call >>>> toupper() and tolower() directly. However, this gets bogged down in >>>> implementation details and fails with musl. >>>> >>>> We work around this design flaw in ctypes with a small python module >>>> written in C which provides thin wrappers to toupper() and tolower(), >>>> and only fall back on the current ctypes-based check when this module >>>> is not available. >>>> >>>> This has been tested on glibc, uClibc and musl systems. >>>> >>>> X-Gentoo-bug: 571444 >>>> X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=571444 >>>> >>>> Signed-off-by: Anthony G. Basile >>>> --- >>>> pym/portage/util/locale.py | 32 ++- >>>> setup.py | 6 ++- >>>> src/portage_c_convert_case.c | 94 >>>> >>>> 3 files changed, 121 insertions(+), 11 deletions(-) >>>> create mode 100644 src/portage_c_convert_case.c >>>> >>>> diff --git a/pym/portage/util/locale.py b/pym/portage/util/locale.py >>>> index 2a15ea1..85ddd2b 100644 >>>> --- a/pym/portage/util/locale.py >>>> +++ b/pym/portage/util/locale.py >>>> @@ -11,6 +11,7 @@ from __future__ import absolute_import, unicode_literals >>>> import locale >>>> import logging >>>> import os >>>> +import sys >>>> import textwrap >>>> import traceback >>>> >>>> @@ -34,18 +35,26 @@ def _check_locale(silent): >>>>""" >>>>The inner locale check function. 
>>>>""" >>>> - >>>> - libc_fn = find_library("c") >>>> - if libc_fn is None: >>>> - return None >>>> - libc = LoadLibrary(libc_fn) >>>> - if libc is None: >>>> - return None >>>> + try: >>>> + from portage_c_convert_case import _c_toupper, _c_tolower >>>> + libc_tolower = _c_tolower >>>> + libc_toupper = _c_toupper >>> >>> Now I'm being picky... but if you named the functions toupper() >>> and tolower(), you could actually import the whole module as 'libc' >>> and have less code! >> >> I see what you're saying, and its tempting because its elegant, but I'm >> afraid of a clash of names. I've got a bad feeling this will get us >> into trouble later. >> >> Let me play with this and see what happens. > > I don't think this will be problematic since things like this happen > in Python all the time ;-). And after all, C function names can be > different than Python function names. It works fine so my last set of patches adopts this approach. > >>> Also it would be nice to actually make the module more generic. There >>> are more places where we use CDLL, and all of them could eventually be >>> supported by the module (unshare() would be much better done in C, for >>> example). >> >> Yeah I get your point here. Let me convince myself first. > > I've got a killer argument: right now we hardcode constants from Linux > headers in the Python code! > > Not that I'm asking you to actually add code for that as well. Just > rename the module to something more generic like portage.util.libc ;-). Well you might as well point me in this direction since I'm working on this now. > >>>> + except ImportError: >>>> + writemsg_level("!!! Unable to import >>>> portage_c_convert_case\n!!!\n", >>>> + level=logging.WARNING, noiselevel=-1) >>> >>> Do we really want to warn verbosely about this? I think it'd be >>> a pretty common case for people running the git checkout. >> >> This should stay. Its good to know that the module
Re: [gentoo-portage-dev] [PATCH 3/3] pym/portage/util/locale.py: add a C module to help check locale
On 5/29/16 2:30 AM, Michał Górny wrote: > On Fri, 27 May 2016 10:26:44 -0400 > "Anthony G. Basile" wrote: > >> From: "Anthony G. Basile" >> >> The current method to check for a sane system locale is to use python's >> ctypes.util.find_library() to construct a full library path to the >> system libc.so and pass that path to ctypes.CDLL() so we can call >> toupper() and tolower() directly. However, this gets bogged down in >> implementation details and fails with musl. >> >> We work around this design flaw in ctypes with a small python module >> written in C which provides thin wrappers to toupper() and tolower(), >> and only fall back on the current ctypes-based check when this module >> is not available. >> >> This has been tested on glibc, uClibc and musl systems. >> >> X-Gentoo-bug: 571444 >> X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=571444 >> >> Signed-off-by: Anthony G. Basile >> --- >> pym/portage/util/locale.py | 16 ++- >> setup.py | 6 +++- >> src/portage_util_libc.c| 70 >> ++ >> 3 files changed, 84 insertions(+), 8 deletions(-) >> create mode 100644 src/portage_util_libc.c >> >> diff --git a/pym/portage/util/locale.py b/pym/portage/util/locale.py >> index 093eb86..5b09945 100644 >> --- a/pym/portage/util/locale.py >> +++ b/pym/portage/util/locale.py >> @@ -34,13 +34,15 @@ def _check_locale(silent): >> """ >> The inner locale check function. 
>> """ >> - >> -libc_fn = find_library("c") >> -if libc_fn is None: >> -return None >> -libc = LoadLibrary(libc_fn) >> -if libc is None: >> -return None >> +try: >> +from portage.util import libc >> +except ImportError: >> +libc_fn = find_library("c") >> +if libc_fn is None: >> +return None >> +libc = LoadLibrary(libc_fn) >> +if libc is None: >> +return None >> >> lc = list(range(ord('a'), ord('z')+1)) >> uc = list(range(ord('A'), ord('Z')+1)) >> diff --git a/setup.py b/setup.py >> index 25429bc..5ca8156 100755 >> --- a/setup.py >> +++ b/setup.py >> @@ -47,7 +47,11 @@ x_scripts = { >> # Dictionary custom modules written in C/C++ here. The structure is >> # key = module name >> # value = list of C/C++ source code, path relative to top source directory >> -x_c_helpers = {} >> +x_c_helpers = { >> +'portage.util.libc' : [ >> +'src/portage_util_libc.c', >> +], >> +} >> >> class x_build(build): >> """ Build command with extra build_man call. """ >> diff --git a/src/portage_util_libc.c b/src/portage_util_libc.c >> new file mode 100644 >> index 000..00b09c2 >> --- /dev/null >> +++ b/src/portage_util_libc.c >> @@ -0,0 +1,70 @@ >> +/* Copyright 2005-2016 Gentoo Foundation >> + * Distributed under the terms of the GNU General Public License v2 >> + */ >> + >> +#include >> +#include >> +#include >> + >> +static PyObject * _libc_tolower(PyObject *, PyObject *); >> +static PyObject * _libc_toupper(PyObject *, PyObject *); >> + >> +static PyMethodDef LibcMethods[] = { >> +{"tolower", _libc_tolower, METH_VARARGS, "Convert to lower case using >> system locale."}, >> +{"toupper", _libc_toupper, METH_VARARGS, "Convert to upper case using >> system locale."}, >> +{NULL, NULL, 0, NULL} >> +}; >> + >> +#if PY_MAJOR_VERSION >= 3 >> +static struct PyModuleDef moduledef = { >> +PyModuleDef_HEAD_INIT, >> +"libc", /* >> m_name */ >> +"Module for converting case using the system locale", /* >> m_doc */ >> +-1, /* >> m_size */ >> +LibcMethods,/* >> m_methods */ >> +NULL, /* >> m_reload */ >> 
+NULL, /* >> m_traverse */ >>
[gentoo-portage-dev] [PATCH 3/3] pym/portage/util/locale.py: add a C module to help check locale
From: "Anthony G. Basile" The current method to check for a sane system locale is to use python's ctypes.util.find_library() to construct a full library path to the system libc.so and pass that path to ctypes.CDLL() so we can call toupper() and tolower() directly. However, this gets bogged down in implementation details and fails with musl. We work around this design flaw in ctypes with a small python module written in C which provides thin wrappers to toupper() and tolower(), and only fall back on the current ctypes-based check when this module is not available. This has been tested on glibc, uClibc and musl systems. X-Gentoo-bug: 571444 X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=571444 Signed-off-by: Anthony G. Basile --- pym/portage/util/locale.py | 16 ++- setup.py | 6 +++- src/portage_util_libc.c| 68 ++ 3 files changed, 82 insertions(+), 8 deletions(-) create mode 100644 src/portage_util_libc.c diff --git a/pym/portage/util/locale.py b/pym/portage/util/locale.py index 093eb86..5b09945 100644 --- a/pym/portage/util/locale.py +++ b/pym/portage/util/locale.py @@ -34,13 +34,15 @@ def _check_locale(silent): """ The inner locale check function. """ - - libc_fn = find_library("c") - if libc_fn is None: - return None - libc = LoadLibrary(libc_fn) - if libc is None: - return None + try: + from portage.util import libc + except ImportError: + libc_fn = find_library("c") + if libc_fn is None: + return None + libc = LoadLibrary(libc_fn) + if libc is None: + return None lc = list(range(ord('a'), ord('z')+1)) uc = list(range(ord('A'), ord('Z')+1)) diff --git a/setup.py b/setup.py index 25429bc..5ca8156 100755 --- a/setup.py +++ b/setup.py @@ -47,7 +47,11 @@ x_scripts = { # Dictionary custom modules written in C/C++ here. 
The structure is # key = module name # value = list of C/C++ source code, path relative to top source directory -x_c_helpers = {} +x_c_helpers = { + 'portage.util.libc' : [ + 'src/portage_util_libc.c', + ], +} class x_build(build): """ Build command with extra build_man call. """ diff --git a/src/portage_util_libc.c b/src/portage_util_libc.c new file mode 100644 index 000..977b954 --- /dev/null +++ b/src/portage_util_libc.c @@ -0,0 +1,68 @@ +/* Copyright 2005-2016 Gentoo Foundation + * Distributed under the terms of the GNU General Public License v2 + */ + +#include +#include +#include + +static PyObject * _libc_tolower(PyObject *, PyObject *); +static PyObject * _libc_toupper(PyObject *, PyObject *); + +static PyMethodDef LibcMethods[] = { + {"tolower", _libc_tolower, METH_VARARGS, "Convert to lower case using system locale."}, + {"toupper", _libc_toupper, METH_VARARGS, "Convert to upper case using system locale."}, + {NULL, NULL, 0, NULL} +}; + +#if PY_MAJOR_VERSION >= 3 +static struct PyModuleDef moduledef = { + PyModuleDef_HEAD_INIT, + "libc", /* m_name */ + "Module for converting case using the system locale", /* m_doc */ + -1, /* m_size */ + LibcMethods,/* m_methods */ + NULL, /* m_reload */ + NULL, /* m_traverse */ + NULL, /* m_clear */ + NULL, /* m_free */ +}; + +PyMODINIT_FUNC +PyInit_libc(void) +{ + PyObject *m; + m = PyModule_Create(&moduledef); + return m; +} +#else +PyMODINIT_FUNC +initlibc(void) +{ + Py_InitModule("libc", LibcMethods); +} +#endif + + +static PyObject * +_libc_tolower(PyObject *self, PyObject *args) +{ + int c; + + if (!PyArg_ParseTuple(args, "i", &c)) + return NULL; + + return Py_BuildValue("i", tolower(c)); +} + + +static PyObject * +_libc_toupper(PyObject *self, PyObject *args) +{ + int c; + + if (!PyArg_ParseTuple(args, "i", &c)) + return NULL; + + return Py_BuildValue("i", toupper(c)); +} -- 2.7.3
[gentoo-portage-dev] Resending patchset for C module to help check locale
As per Brian's request on IRC, here's the patchset for adding a C module to help check locales. -- Anthony G. Basile, Ph. D. Chair of Information Technology D'Youville College Buffalo, NY 14201 (716) 829-8197 From 76131d44e1731c876840e7925ccff2468ea157a7 Mon Sep 17 00:00:00 2001 From: "Anthony G. Basile" Date: Wed, 25 May 2016 08:48:32 -0400 Subject: [PATCH 1/3] pym/portage/util/locale.py: fix decoding for python2 plus some locales When using python2 with some locales, like turkish, chr() is passed values not in range(128) which cannot be decoded as ASCII, thus throwing a UnicodeDecodeError exception. We use _unicode_decode() from portage.util to address this. Signed-off-by: Anthony G. Basile --- pym/portage/util/locale.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/pym/portage/util/locale.py b/pym/portage/util/locale.py index 2a15ea1..093eb86 100644 --- a/pym/portage/util/locale.py +++ b/pym/portage/util/locale.py @@ -15,7 +15,7 @@ import textwrap import traceback import portage -from portage.util import writemsg_level +from portage.util import _unicode_decode, writemsg_level from portage.util._ctypes import find_library, LoadLibrary @@ -62,7 +62,7 @@ def _check_locale(silent): "as LC_CTYPE in make.conf.") msg = [l for l in textwrap.wrap(msg, 70)] msg.append("") - chars = lambda l: ''.join(chr(x) for x in l) + chars = lambda l: ''.join(_unicode_decode(chr(x)) for x in l) if uc != ruc: msg.extend([ " %s -> %s" % (chars(lc), chars(ruc)), -- 2.7.3 From 831af672827380a928e8e07c3fb5ecd01545e3c9 Mon Sep 17 00:00:00 2001 From: "Anthony G. Basile" Date: Thu, 19 May 2016 06:52:43 -0400 Subject: [PATCH 2/3] setup.py: add stub for building custom modules in C/C++ Currently portage doesn't include any custom modules written in C/C++. This commit introduces stub code for building such modules in setup.py. Signed-off-by: Anthony G. 
Basile --- setup.py | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/setup.py b/setup.py index 2220d23..8d20355 100755 --- a/setup.py +++ b/setup.py @@ -4,7 +4,7 @@ from __future__ import print_function -from distutils.core import setup, Command +from distutils.core import setup, Command, Extension from distutils.command.build import build from distutils.command.build_scripts import build_scripts from distutils.command.clean import clean @@ -30,6 +30,9 @@ import sys # TODO: # - smarter rebuilds of docs w/ 'install_docbook' and 'install_epydoc'. +# Dictionary of scripts. The structure is +# key = location in filesystem to install the scripts +# value = list of scripts, path relative to top source directory x_scripts = { 'bin': [ 'bin/ebuild', 'bin/egencache', 'bin/emerge', 'bin/emerge-webrsync', @@ -41,6 +44,10 @@ x_scripts = { ], } +# Dictionary custom modules written in C/C++ here. The structure is +# key = module name +# value = list of C/C++ source code, path relative to top source directory +x_c_helpers = {} class x_build(build): """ Build command with extra build_man call. """ @@ -636,6 +643,8 @@ setup( ['$sysconfdir/portage/repo.postsync.d', ['cnf/repo.postsync.d/example']], ], + ext_modules = [Extension(name=n, sources=m) for n, m in x_c_helpers.items()], + cmdclass = { 'build': x_build, 'build_man': build_man, -- 2.7.3 From ee31675efc06d1df6806cfcdfc4102271e596c96 Mon Sep 17 00:00:00 2001 From: "Anthony G. Basile" Date: Fri, 27 May 2016 09:47:34 -0400 Subject: [PATCH 3/3] pym/portage/util/locale.py: add a C module to help check locale The current method to check for a sane system locale is to use python's ctypes.util.find_library() to construct a full library path to the system libc.so and pass that path to ctypes.CDLL() so we can call toupper() and tolower() directly. However, this gets bogged down in implementation details and fails with musl. 
We work around this design flaw in ctypes with a small python module written in C which provides thin wrappers to toupper() and tolower(), and only fall back on the current ctypes-based check when this module is not available. This has been tested on glibc, uClibc and musl systems. X-Gentoo-bug: 571444 X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=571444 Signed-off-by: Anthony G. Basile --- pym/portage/util/locale.py | 16 ++- setup.py | 6 +++- src/portage_util_libc.c| 68 ++ 3 files changed, 82 insertions(+),
Re: [gentoo-portage-dev] [PATCH] file_copy: replace loff_t with off_t for portability (bug 617778)
On 5/7/17 7:50 PM, Zac Medico wrote: > The loff_t type is a GNU extension, so use the portable off_t > type instead. Also, enable Large File Support macros in setup.py, > for 64-bit offsets. > > Reported-by: Patrick Steinhardt > X-Gentoo-bug: 617778 > X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=617778 > --- > setup.py | 5 - > src/portage_util_file_copy_reflink_linux.c | 6 +++--- > 2 files changed, 7 insertions(+), 4 deletions(-) > > diff --git a/setup.py b/setup.py > index e993177..1ba6f87 100755 > --- a/setup.py > +++ b/setup.py > @@ -676,7 +676,10 @@ setup( > ['$sysconfdir/portage/repo.postsync.d', > ['cnf/repo.postsync.d/example']], > ], > > - ext_modules = [Extension(name=n, sources=m) for n, m in > x_c_helpers.items()], > + ext_modules = [Extension(name=n, sources=m, > + extra_compile_args=['-D_FILE_OFFSET_BITS=64', > + '-D_LARGEFILE_SOURCE', '-D_LARGEFILE64_SOURCE']) > + for n, m in x_c_helpers.items()], > > cmdclass = { > 'build': x_build, > diff --git a/src/portage_util_file_copy_reflink_linux.c > b/src/portage_util_file_copy_reflink_linux.c > index b031d96..2fb17a0 100644 > --- a/src/portage_util_file_copy_reflink_linux.c > +++ b/src/portage_util_file_copy_reflink_linux.c > @@ -66,7 +66,7 @@ initreflink_linux(void) > * (errno is set appropriately). > */ > static ssize_t > -cfr_wrapper(int fd_out, int fd_in, loff_t *off_out, size_t len) > +cfr_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len) > { > #ifdef __NR_copy_file_range > return syscall(__NR_copy_file_range, fd_in, NULL, fd_out, > @@ -96,7 +96,7 @@ cfr_wrapper(int fd_out, int fd_in, loff_t *off_out, size_t > len) > * reaches EOF. > */ > static off_t > -do_lseek_data(int fd_out, int fd_in, loff_t *off_out) { > +do_lseek_data(int fd_out, int fd_in, off_t *off_out) { > #ifdef SEEK_DATA > /* Use lseek SEEK_DATA/SEEK_HOLE for sparse file support, > * as suggested in the copy_file_range man page. 
> @@ -189,7 +189,7 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args) > ssize_t buf_bytes, buf_offset, copyfunc_ret; > struct stat stat_in, stat_out; > char* buf; > -ssize_t (*copyfunc)(int, int, loff_t *, size_t); > +ssize_t (*copyfunc)(int, int, off_t *, size_t); > > if (!PyArg_ParseTuple(args, "ii", &fd_in, &fd_out)) > return NULL; This looks good to me. I tested it on amd64 and it works fine. -- Anthony G. Basile, Ph.D. Gentoo Linux Developer [Hardened] E-Mail: bas...@freeharbor.net GnuPG FP : 1FED FAD9 D82C 52A5 3BAB DC79 9384 FA6E F52D 4BBA GnuPG ID : F52D4BBA
[gentoo-portage-dev] xattr wrapper for install, bug #465000
Hi everyone,

A while back, I wrote a python wrapper for install to preserve xattrs.
It's installed in LIBDIR/portage/bin/install.py. It is *painfully* slow.
For a package like moodle with 16650 .php files, none of which probably
need any xattrs set, it takes about 30 mins to install.

I rewrote the wrapper in C. Replacing the python wrapper with the C
wrapper, the same install drops from about 30 mins to 2 mins. Mike and I
did some back and forth about how best to write it. The latest version is
pretty much done at https://bugs.gentoo.org/show_bug.cgi?id=465000#c56

We need to get that integrated into portage. 1) I'm not 100% sure how to
do that. 2) We may want to install it at /usr/bin/install-xattr because
I'm sure it will be useful for more than just portage.

Comments?

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
Re: [gentoo-portage-dev] xattr wrapper for install, bug #465000
On 01/27/2014 09:02 AM, viv...@gmail.com wrote:
> On 01/26/14 23:53, Anthony G. Basile wrote:
>> Hi everyone,
>>
>> A while back, I wrote a python wrapper for install to preserve xattrs.
>> Its installed in LIBDIR/portage/bin/install.py. It is *painfully* slow.
>> For a package like moodle with 16650 .php files, none of which probably
>> need any xattr's set, it takes about 30 mins to install.
>>
>> I rewrote the wrapper in C. Replacing the python wrapper with the C
>> wrapper, the same example reduces from about 30 mins to 2 mins. Mike and
>> I did some back and forth about how best to write it. The latest version
>> is pretty much done at https://bugs.gentoo.org/show_bug.cgi?id=465000#c56
>>
>> We need to get that integrated into portage. 1) I'm not 100% sure how to
>> do that. 2) We may want to install it at /usr/bin/install-xattr because
>> I'm sure it will be useful for more than just portage.
>>
>> Comments?
>
> patch install from coreutils (and then upstream changes) is not an
> option? they already support selinux contexts anyway
>
> install-xattr could be useful and /usr/bin would be a good option IMHO

Been there and I even had a patch ready. Upstream answer was '\0'. The
only people who engaged in the discussion were Gentoo devs.

Would patching coreutils have been the better approach? Some people might
argue install and cp and mv etc. should just copy contents, to keep these
utilities as simple as possible. Although, as you say, install can copy
selinux contexts, and cp can copy xattr attributes. So what's the problem
with extending install's functionality to include arbitrary xattr
attributes?

Anyhow, seeing as upstream is uninterested, I prefer this wrapper to
maintaining a local patch against coreutils.

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
Re: [gentoo-portage-dev] [PATCH] ebuild-helpers/xattr/install: use install-xattr
On 06/02/14 12:13, Alec Warner wrote:
> On Sat, May 31, 2014 at 4:36 PM, wrote:
>> From: "Anthony G. Basile"
>>
>> Currently bin/ebuild-helpers/xattr/install uses
>> ${PORTAGE_BIN_PATH}/install.py as a wrapper to coreutils' install to
>> preserve a file's extended attributes when installing, usually during
>> src_install(). This is needed, for instance, when preserving xattr based
>> PaX flags, bug #465000. However the python wrapper is very slow, comment
>> #42 of bug #465000. A C wrapper was developed and tested, bugs #501534
>> and #511984. This patch checks for the existence of the C wrapper, and
>> uses it, falling back on the python wrapper only if not found, or if
>> overridden by ${PORTAGE_INSTALL_XATTR_IMPLEMENTATION}.
>> ---
>> bin/ebuild-helpers/xattr/install | 27 +--
>> 1 file changed, 25 insertions(+), 2 deletions(-)
>>
>> diff --git a/bin/ebuild-helpers/xattr/install b/bin/ebuild-helpers/xattr/install
>> index f51f621..9b5d346 100755
>> --- a/bin/ebuild-helpers/xattr/install
>> +++ b/bin/ebuild-helpers/xattr/install
>> @@ -4,9 +4,32 @@
>>  PORTAGE_BIN_PATH=${PORTAGE_BIN_PATH:-/usr/lib/portage/bin}
>>  PORTAGE_PYM_PATH=${PORTAGE_PYM_PATH:-/usr/lib/portage/pym}
>> +INSTALL_XATTR=${EPREFIX}/usr/bin/install-xattr
>>  # Use safe cwd, avoiding unsafe import for bug #469338.
>>  export __PORTAGE_HELPER_CWD=${PWD}
>>  cd "${PORTAGE_PYM_PATH}"
>>  export __PORTAGE_HELPER_PATH=${BASH_SOURCE[0]}
>> -PYTHONPATH=${PORTAGE_PYTHONPATH:-${PORTAGE_PYM_PATH}} \
>> -	exec "${PORTAGE_PYTHON:-/usr/bin/python}" "${PORTAGE_BIN_PATH}/install.py" "$@"
>> +
>> +
>> +if [[ ${PORTAGE_INSTALL_XATTR_IMPLEMENTATION} == "c" ]]; then
>> +	implementation="c"
>> +elif [[ ${PORTAGE_INSTALL_XATTR_IMPLEMENTATION} == "python" ]]; then
>> +	implementation="python"
>> +else
>> +	# If PORTAGE_INSTALL_XATTR_IMPLEMENTATION is not set then we'll autodetect
>
> This doesn't run only if it is unset; it runs if it is unset, or if it is
> set but not to 'c' or 'python'.
>
> -A

Easy fix. I have another issue with install-xattr that needs to be
addressed so it plays nice with the way portage wants to do things. I'll
resubmit when that's done.

>> +	if [[ -x "${INSTALL_XATTR}" ]]; then
>> +		implementation="c"
>> +	else
>> +		implementation="python"
>> +	fi
>> +fi
>> +
>> +if [[ "${implementation}" == "c" ]]; then
>> +	exec "${INSTALL_XATTR}" "$@"
>> +elif [[ "${implementation}" == "python" ]]; then
>> +	PYTHONPATH=${PORTAGE_PYTHONPATH:-${PORTAGE_PYM_PATH}} \
>> +		exec "${PORTAGE_PYTHON:-/usr/bin/python}" "${PORTAGE_BIN_PATH}/install.py" "$@"
>> +else
>> +	echo "Unknown implementation for PORTAGE_INSTALL_XATTR_IMPLEMENTATION"
>> +	exit -1
>> +fi
>> --
>> 1.8.5.5

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
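The selection logic discussed in this review, including Alec's point about values other than 'c' or 'python', can be written down as a small pure function. This is a sketch in Python rather than the helper's own bash, under the assumption that any unrecognised value should be treated like an unset one; the function names are hypothetical and EPREFIX handling is left out.

```python
import os


def choose_implementation(requested, c_wrapper_available):
    """Pick 'c' or 'python' for the install wrapper.

    An explicit 'c' or 'python' request is honoured as-is. Anything else
    (unset, empty, or an unknown value) falls through to autodetection.
    """
    if requested in ("c", "python"):
        return requested
    return "c" if c_wrapper_available else "python"


def choose_from_environment(environ=None, wrapper="/usr/bin/install-xattr"):
    """Apply the same rule to the environment and an executable check."""
    if environ is None:
        environ = os.environ
    requested = environ.get("PORTAGE_INSTALL_XATTR_IMPLEMENTATION")
    return choose_implementation(requested, os.access(wrapper, os.X_OK))
```

Written this way, the result is always one of the two known values, which also shows that the final "Unknown implementation" branch in the patch as posted cannot be reached.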
Re: [gentoo-portage-dev] Next team meeting
On 06/11/14 11:58, Brian Dolbec wrote:
> The next team meeting scheduler: http://whenisgood.net/xnq98br
> the results: http://whenisgood.net/xnq98br/results/9zqwgaj
>
> On the agenda: Team lead election, also co-lead, release co-ordinator
> next release plugin-sync finalization report and a couple decisions to
> make. repoman rewrite: round one breakup has started subslots,
> backtracking, resolver <== we really need to get a sub team together to
> handle the bugs, learn the code, create a new code model to work towards
> implementing. Sebastian is all alone on this. He needs help.

Can I be invited to discuss xattrs + portage? I want to co-ordinate with
the team on how best to integrate this, since some of the machinery is
unnecessarily complex. I will take about 5 mins.

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
Re: [gentoo-portage-dev] [PATCH] Generate soname dependency metadata (282639)
connection to the standard. They should all be in provided by elfutils, but I'm sure standard is documented somewhere officially. Even if you just say "see " that might be enough to clue people where to look for these definitions. diff --git a/pym/portage/util/elf/header.py b/pym/portage/util/elf/header.py new file mode 100644 index 000..3310eeb --- /dev/null +++ b/pym/portage/util/elf/header.py @@ -0,0 +1,62 @@ +# Copyright 2015 Gentoo Foundation +# Distributed under the terms of the GNU General Public License v2 + +import collections + +from ..endian.decode import (decode_uint16_le, decode_uint32_le, + decode_uint16_be, decode_uint32_be) +from .constants import (E_ENTRY, E_MACHINE, EI_CLASS, ELFCLASS32, + ELFCLASS64, ELFDATA2LSB, ELFDATA2MSB) + +class ELFHeader(object): + + __slots__ = ('e_flags', 'e_machine', 'ei_class', 'ei_data') + + @classmethod + def read(cls, f): + """ + @param f: an open ELF file + @type f: file + @rtype: ELFHeader + @return: A new ELFHeader instance containing data from f + """ + f.seek(EI_CLASS) + ei_class = ord(f.read(1)) + ei_data = ord(f.read(1)) + + if ei_class == ELFCLASS32: + width = 32 + elif ei_class == ELFCLASS64: + width = 64 + else: + width = None + + if ei_data == ELFDATA2LSB: + uint16 = decode_uint16_le + uint32 = decode_uint32_le + elif ei_data == ELFDATA2MSB: + uint16 = decode_uint16_be + uint32 = decode_uint32_be + else: + uint16 = None + uint32 = None + + if width is None or uint16 is None: + e_machine = None + e_flags = None + else: + f.seek(E_MACHINE) + e_machine = uint16(f.read(2)) + + # E_ENTRY + 3 * sizeof(uintN) + e_flags_offset = E_ENTRY + 3 * width // 8 + f.seek(e_flags_offset) + e_flags = uint32(f.read(4)) + + obj = cls() + obj.e_flags = e_flags + obj.e_machine = e_machine + obj.ei_class = ei_class + obj.ei_data = ei_data + + return obj Looks good. I'm going to perf test this but I don't think it will be too big a hit. 
I don't know how we would get to this point in the code tree, but let me ask you this: am I right in thinking you won't hit ELFHeader.read() with every file that's being installed by portage? You'll only get here for ELF objects? Correct?

diff --git a/pym/portage/util/endian/__init__.py b/pym/portage/util/endian/__init__.py
new file mode 100644
index 000..4725d33
--- /dev/null
+++ b/pym/portage/util/endian/__init__.py
@@ -0,0 +1,2 @@
+# Copyright 2015 Gentoo Foundation
+# Distributed under the terms of the GNU General Public License v2
diff --git a/pym/portage/util/endian/decode.py b/pym/portage/util/endian/decode.py
new file mode 100644
index 000..ec0dcec
--- /dev/null
+++ b/pym/portage/util/endian/decode.py
@@ -0,0 +1,56 @@
+# Copyright 2015 Gentoo Foundation
+# Distributed under the terms of the GNU General Public License v2
+
+def decode_uint16_be(data):
+    """
+    Decode an unsigned 16-bit integer with big-endian encoding.
+
+    @param data: string of bytes of length 2
+    @type data: bytes
+    @rtype: int
+    @return: unsigned integer value of the decoded data
+    """
+    return (ord(data[0:1]) << 8) + ord(data[1:2])
+
+def decode_uint16_le(data):
+    """
+    Decode an unsigned 16-bit integer with little-endian encoding.
+
+    @param data: string of bytes of length 2
+    @type data: bytes
+    @rtype: int
+    @return: unsigned integer value of the decoded data
+    """
+    return ord(data[0:1]) + (ord(data[1:2]) << 8)
+
+def decode_uint32_be(data):
+    """
+    Decode an unsigned 32-bit integer with big-endian encoding.
+
+    @param data: string of bytes of length 4
+    @type data: bytes
+    @rtype: int
+    @return: unsigned integer value of the decoded data
+    """
+    return (
+        (ord(data[0:1]) << 24) +
+        (ord(data[1:2]) << 16) +
+        (ord(data[2:3]) << 8) +
+        ord(data[3:4])
+    )
+
+def decode_uint32_le(data):
+    """
+    Decode an unsigned 32-bit integer with little-endian encoding.
+
+    @param data: string of bytes of length 4
+    @type data: bytes
+    @rtype: int
+    @return: unsigned integer value of the decoded data
+    """
+    return (
+        ord(data[0:1]) +
+        (ord(data[1:2]) << 8) +
+        (ord(data[2:3]) << 16) +
+        (ord(data[3:4]) << 24)
+    )

Endian fun.

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
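As a cross-check on the shift-and-add decoders in the patch above, here is a minimal sketch that compares two of them against Python's standard struct module. The *_struct helper names are hypothetical and not part of the patch; the two decode functions are copied from the quoted diff.

```python
import struct

def decode_uint16_be(data):
    # Shift-and-add form, as in the quoted patch.
    return (ord(data[0:1]) << 8) + ord(data[1:2])

def decode_uint32_le(data):
    # Shift-and-add form, as in the quoted patch.
    return (
        ord(data[0:1]) +
        (ord(data[1:2]) << 8) +
        (ord(data[2:3]) << 16) +
        (ord(data[3:4]) << 24)
    )

def decode_uint16_be_struct(data):
    # Hypothetical helper: ">H" is a big-endian unsigned 16-bit integer.
    return struct.unpack(">H", data)[0]

def decode_uint32_le_struct(data):
    # Hypothetical helper: "<I" is a little-endian unsigned 32-bit integer.
    return struct.unpack("<I", data)[0]

if __name__ == "__main__":
    sample16 = b"\x12\x34"
    sample32 = b"\x78\x56\x34\x12"
    print(decode_uint16_be(sample16) == decode_uint16_be_struct(sample16))
    print(decode_uint32_le(sample32) == decode_uint32_le_struct(sample32))
```

Slicing with data[0:1] rather than indexing with data[0] keeps ord() working on both Python 2 str and Python 3 bytes, which is presumably why the patch is written that way.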
Re: [gentoo-portage-dev] [PATCH] Generate soname dependency metadata (282639)
On 01/29/15 21:02, Zac Medico wrote:

+#hppa_{32,64}
+#ia_{32,64}
+#m68k_{32,64}
+#mips_{eabi32,eabi64,n32,n64,o32,o64}
+#ppc_{32,64}
+#s390_{32,64}
+#sh_{32,64}
+#sparc_{32,64}
+#x86_{32,64,x32}
+#
+# NOTES:
+#
+# * The ABIs referenced by some of the above *_32 and *_64 categories
+# may be imaginary, but they are listed anyway, since the goal is to
+# establish a naming convention that is as consistent and uniform as
+# possible.
+#
+# * The Elf header's e_ident[EI_OSABI] byte is completely ignored,
+# since OS-independence is one of the goals. The assumption is that,
+# for given installation, we are only interested in tracking multilib
+# ABIs for a single OS.

If you run readelf -h on (say) bash in any of our stage3 tarballs you always get "OS/ABI: UNIX - System V" irrespective of arch and abi. I don't know what you would get on BSD, but the field is totally irrelevant for our purposes despite the name. As far as I can tell, it is totally invariant across arches and abis. You can even unpack the stage3s on an amd64 host and run readelf from the host on the chroot target and you'll get the elf header, so you don't need access to native hardware. The comment suggests that there might be some interesting information there, but there isn't. Maybe I'm just reading too much into it.

Well, a quick google search seems to indicate that FreeBSD uses EI_OSABI. I was specifically thinking about FreeBSD when I wrote that comment, because I was aware that Gentoo/FBSD was using ELF, and I just assumed that they would have a different EI_OSABI than Linux.

Even there you'll get "UNIX - System V".
I don't have a freebsd system ready to go, but here's what I get from my openbsd system:

# uname -a
OpenBSD obi.dis 5.6 GENERIC.MP#333 amd64
# readelf -h /bin/sh
ELF Header:
  Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class: ELF64
  Data: 2's complement, little endian
  Version: 1 (current)
  OS/ABI: UNIX - System V
  ABI Version: 0
  Type: EXEC (Executable file)
  Machine: Advanced Micro Devices X86-64
  Version: 0x1
  Entry point address: 0x400260
  Start of program headers: 64 (bytes into file)
  Start of section headers: 442512 (bytes into file)
  Flags: 0x0
  Size of this header: 64 (bytes)
  Size of program headers: 56 (bytes)
  Number of program headers: 7
  Size of section headers: 64 (bytes)
  Number of section headers: 17
  Section header string table index: 16

I have *never* seen the OS/ABI be anything different. That's what struck me about your comment. Anyhow, we're far afield. I might install freebsd later in a vm just to have one handy and check.

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
Re: [gentoo-portage-dev] [PATCH] Generate soname dependency metadata (282639)
On 01/30/15 18:19, Zac Medico wrote:
On 01/30/2015 03:01 PM, Anthony G. Basile wrote:
On 01/29/15 21:02, Zac Medico wrote:

+#hppa_{32,64}
+#ia_{32,64}
+#m68k_{32,64}
+#mips_{eabi32,eabi64,n32,n64,o32,o64}
+#ppc_{32,64}
+#s390_{32,64}
+#sh_{32,64}
+#sparc_{32,64}
+#x86_{32,64,x32}
+#
+# NOTES:
+#
+# * The ABIs referenced by some of the above *_32 and *_64 categories
+# may be imaginary, but they are listed anyway, since the goal is to
+# establish a naming convention that is as consistent and uniform as
+# possible.
+#
+# * The Elf header's e_ident[EI_OSABI] byte is completely ignored,
+# since OS-independence is one of the goals. The assumption is that,
+# for given installation, we are only interested in tracking multilib
+# ABIs for a single OS.

If you run readelf -h on (say) bash in any of our stage3 tarballs you always get "OS/ABI: UNIX - System V" irrespective of arch and abi. I don't know what you would get on BSD, but the field is totally irrelevant for our purposes despite the name. As far as I can tell, it is totally invariant across arches and abis. You can even unpack the stage3s on an amd64 host and run readelf from the host on the chroot target and you'll get the elf header, so you don't need access to native hardware. The comment suggests that there might be some interesting information there, but there isn't. Maybe I'm just reading too much into it.

Well, a quick google search seems to indicate that FreeBSD uses EI_OSABI. I was specifically thinking about FreeBSD when I wrote that comment, because I was aware that Gentoo/FBSD was using ELF, and I just assumed that they would have a different EI_OSABI than Linux.

Even there you'll get "UNIX - System V".
I don't have a freebsd system ready to go, but here's what I get from my openbsd system:

# uname -a
OpenBSD obi.dis 5.6 GENERIC.MP#333 amd64
# readelf -h /bin/sh
ELF Header:
  Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class: ELF64
  Data: 2's complement, little endian
  Version: 1 (current)
  OS/ABI: UNIX - System V
  ABI Version: 0
  Type: EXEC (Executable file)
  Machine: Advanced Micro Devices X86-64
  Version: 0x1
  Entry point address: 0x400260
  Start of program headers: 64 (bytes into file)
  Start of section headers: 442512 (bytes into file)
  Flags: 0x0
  Size of this header: 64 (bytes)
  Size of program headers: 56 (bytes)
  Number of program headers: 7
  Size of section headers: 64 (bytes)
  Number of section headers: 17
  Section header string table index: 16

I have *never* seen the OS/ABI be anything different. That's what struck me about your comment. Anyhow, we're far afield. I might install freebsd later in a vm just to have one handy and check.

I just loop-mounted a GhostBSD iso, and here's what I found:

# readelf -h /media/iso/bin/sh
ELF Header:
  Magic: 7f 45 4c 46 01 01 01 09 00 00 00 00 00 00 00 00
  Class: ELF32
  Data: 2's complement, little endian
  Version: 1 (current)
  OS/ABI: UNIX - FreeBSD
  ABI Version: 0
  Type: EXEC (Executable file)
  Machine: Intel 80386
  Version: 0x1
  Entry point address: 0x804a100
  Start of program headers: 52 (bytes into file)
  Start of section headers: 119888 (bytes into file)
  Flags: 0x0
  Size of this header: 52 (bytes)
  Size of program headers: 32 (bytes)
  Number of program headers: 8
  Size of section headers: 40 (bytes)
  Number of section headers: 28
  Section header string table index: 27

Yeah I just confirmed that. I installed amd64 fbsd 10.1. I've used obsd for years and noticed the "UNIX - System V" and just thought it was the same for all *bsd systems. This is the only time I've seen a different OS/ABI.

Anyhow, I did some perf testing.
I concentrated on www-apps/moodle (which is a huge package of some 19000 files but no elfs) and app-emulation/wine (which has 11000 elf objects), and I found no appreciable performance hit. Other tests show that PROVIDES, REQUIRES and NEEDED.ELF.2 are correctly being generated.

Once committed, I'll rebuild @system and see if we get the correct linkage graph. I have scripts to build such a graph from readelf -d and so I can compare.

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
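The difference between the two readelf outputs quoted in this message comes down to a single byte of e_ident. A minimal sketch, assuming only the standard ELF layout (EI_OSABI at index 7, value 0 for System V, value 9 for FreeBSD); the function name osabi_from_magic and the name table are illustrative and not part of portage:

```python
EI_OSABI = 7  # index of the OS/ABI byte within e_ident (standard ELF layout)

# Only the two values seen in this thread; the ELF spec defines more.
OSABI_NAMES = {0: "UNIX - System V", 9: "UNIX - FreeBSD"}

def osabi_from_magic(magic_hex):
    """Return the EI_OSABI byte from a readelf-style 'Magic:' hex string."""
    e_ident = bytes.fromhex(magic_hex.replace(" ", ""))
    if e_ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    return e_ident[EI_OSABI]

if __name__ == "__main__":
    openbsd = "7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00"
    ghostbsd = "7f 45 4c 46 01 01 01 09 00 00 00 00 00 00 00 00"
    for magic in (openbsd, ghostbsd):
        value = osabi_from_magic(magic)
        print(value, OSABI_NAMES.get(value, "unknown"))
```

The same magic strings also carry EI_CLASS (index 4: 02 for ELF64, 01 for ELF32) and EI_DATA (index 5: 01 for little endian), the two bytes ELFHeader.read() reads in the patch.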
Re: [gentoo-portage-dev] [PATCH] Generate soname dependency metadata (282639)
On 02/01/15 15:00, Zac Medico wrote:
On 01/31/2015 08:04 AM, Anthony G. Basile wrote:

Yeah I just confirmed that. I installed amd64 fbsd 10.1. I've used obsd for years and noticed the "UNIX - System V" and just thought it was the same for all *bsd systems. This is the only time I've seen a different OS/ABI.

Hmm, that's interesting. I just noticed that on wikipedia they have a table listing 9 different constants [1].

Oh I know there are different values possible here! But are they used in practice? Note that wikipedia also says "It is often set to 0 regardless of the target platform."

Anyhow, I did some perf testing. Concentrating on www-apps/moodle (which is a huge package of some 19000 files but no elfs), and app-emulation/wine (which has 11000 elf objects), I found no appreciable performance hit.

Good. If you have enough memory, the ELF headers are likely to be in the buffer cache when portage probes them for the multilib category. So, if you're lucky, it will just read from the buffer cache instead of from the disk.

Other tests show that PROVIDES, REQUIRES and NEEDED.ELF.2 are correctly being generated.

Great, thanks for validating this.

Once committed, I'll rebuild @system and see if we get the correct linkage graph. I have scripts to build such a graph from readelf -d and so I can compare.

Excellent. Also, I've been making lots of progress on soname dependency resolution in portage [2]. Pretty soon, it should be ready to merge into the integration branch [3], for wider testing via the corresponding overlay [4].

[1] http://en.wikipedia.org/wiki/Executable_and_Linkable_Format#File_header
[2] https://github.com/zmedico/portage/tree/binpkg-soname-deps
[3] https://github.com/zmedico/portage/tree/binpkg-support-integration
[4] https://github.com/zmedico/portage-binpkg-support-overlay

Nice!

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
Re: [gentoo-portage-dev] Does temp need g+w?
On 02/14/15 11:15, Zac Medico wrote:
On 02/14/2015 04:18 AM, Jan Sever wrote:

Hi all, does temp directory in /var/tmp/portage/$cat/$pkg really need g+w permission?

Well, that g+w bit is part of the FEATURES=userpriv implementation.

I have to use two versions of hardened kernel, one with disabled CONFIG_GRKERNSEC_TPE_ALL (for emerge) and one with enabled (for normal run).

If you have portage-2.2.15 or later, then it has g-w in $T as discussed here: https://bugs.gentoo.org/show_bug.cgi?id=519566

We went through a lot of trouble with that, so yes, it's needed.

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
[gentoo-portage-dev] Pre RFC on RFC: Add compiler information to exported a Package Manger's Cached Information.
Hi everyone,

I'm not sure about the following RFC, which I was thinking of sending out to gentoo-dev@, but I think I want to bounce it off the portage developers first for opinions. So I guess this is a Pre-RFC to an RFC.

RFC: Add compiler information to exported a Package Manger's Cached Information.

Hi everyone,

I'd like to propose adding compiler information to portage's VDB on a per package basis so that we record the version of the compiler a package is built with. The motivation and details of what is to be cached and exported (as per GLEP 64) I spell out below. The implementation specifics are left to the package maintainers.

Currently we cache dynamic linkage information about a package's linkable and executable objects so we can obtain a dependency graph connecting shared objects to the shared objects and executables that consume them. This is useful for deciding what packages need to be rebuilt when a library package is updated in a way that breaks backwards compatibility. Recently the portage team expanded the cached linkage information to include multilib ABI identifiers to distinguish between different executable ABIs present on a system. [1,2] Since the consumer of a library can only link against a library file of the same ABI, this information is necessary to properly construct the dependencies.

I propose extending this to include the ABI information for C++ libraries. While C++ ABIs are of a different quality than the executable ABIs of a multilib system, the dependency issue is the same: the consumer of a library can only link against a library file of the same ABI. Differences in C++ ABIs arise because there is no guaranteed compatibility between shared and executable objects built under different C++ standards. [3] Also, there is no guaranteed ABI compatibility even with the same standard when built with different versions of GCC differing in minor bumps. [4]

Since in Gentoo we allow users to supply their own CFLAGS/CXXFLAGS, thus specifying the C++ standard via the -std= flag, and because we allow them to switch between versions of a compiler and even between different compilers, we introduce the possibility of breakage due to differing C++ ABIs. While changes in ABI are normally reflected in changes in the SONAME of the library, GCC upstream is not willing to do so for libstdc++ for other design reasons [4]. So to identify possible mismatches in C++ ABIs, it is necessary to record any user supplied CFLAGS/CXXFLAGS which change the default C++ standard, as well as the compiler used and its version.

The flags are already cached in VDB and can be compared to the compiler's default C++ standard obtained from `$CC -x c++ -E -P - <<< __cplusplus`. Thus only the compiler and its version need to be added to the VDB cache. This should be done on a per package basis in VDB///COMPILER. The contents of this file are to be parsed from the first line of the output of `$CC --version` for the compiler used to build the package. The format should be as follows: eg. "clang 3.3" or "gcc 4.8.3 (Gentoo Hardened 4.8.3 p1.1, pie-0.5.9)"

Per GLEP 64, this information should be made available for utilities to help identify C++ ABI mismatches.

Finally, a limitation of the above should be noted. Since the CFLAGS/CXXFLAGS cached are only those supplied by the user, it does not cover situations where the package build system or ebuild supply their own -std= flag. Since this information cannot and should not be cached by the package manager, utilities used to find any mismatches in C++ ABI must provide for this intelligence.

Refs.

[1] https://bugs.gentoo.org/show_bug.cgi?id=534206.
[2] http://cgit.gentooexperimental.org/proj/portage.git/commit/?id=f1c1b8a77eebf7713b32e5f9945690f60f4f46de
[3] This can lead to breakage between libraries and their consumers. For example, see https://bugs.gentoo.org/show_bug.cgi?id=513386.
[4] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61758. "It is totally unsupported (and unlikely to work) to mix C++11 code built with GCC 4.x and 4.y, for any x!=y" The same incompatibilities may be introduced by clang as well. -- Anthony G. Basile, Ph. D. Chair of Information Technology D'Youville College Buffalo, NY 14201 (716) 829-8197
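The check described in the RFC, whether user-supplied flags override the compiler's default C++ standard, could be sketched roughly as follows. This is only an illustration: the function name effective_std is hypothetical, and the assumption that the last -std= on the command line takes effect reflects how GCC-style drivers generally behave rather than anything stated in the RFC.

```python
import shlex

def effective_std(flags):
    """Return the value of the last -std= flag in a CFLAGS/CXXFLAGS string,
    or None if the flags leave the compiler's default standard in place.

    Assumption: when -std= is given more than once, the last one wins.
    """
    std = None
    for token in shlex.split(flags):
        if token.startswith("-std="):
            std = token[len("-std="):]
    return std

if __name__ == "__main__":
    print(effective_std("-O2 -pipe"))
    print(effective_std("-O2 -std=c++11 -pipe"))
    print(effective_std("-std=c++98 -O2 -std=gnu++11"))
```

A utility built on this would still need the compiler name and version from the proposed COMPILER file, and, as the RFC notes, it cannot see a -std= flag injected by the package's own build system.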
Re: [gentoo-portage-dev] Pre RFC on RFC: Add compiler information to exported a Package Manger's Cached Information.
On 02/22/15 01:30, Zac Medico wrote:
On 02/21/2015 10:22 PM, Zac Medico wrote:

If we put the real/canonical libstdc++.so path in the DT_NEEDED section, then it will automatically work with existing soname dependency support.

Actually, we'd also have to add a way for you to put the full path of the libstdc++.so in PROVIDES. For example: PROVIDES_ABSOLUTE="/usr/lib/gcc/*/*/libstdc++.so.6"

I guess I don't understand how this would work exactly. What if someone has gcc-4.8.3, builds library libfoo.so which uses C++, then upgrades to gcc-4.9, removes 4.8 and tries to build bar, which is also written in C++ and links against libfoo.so? We would have mismatching ABIs. How would this catch it and trigger the correct rebuilds?

Unless I'm misunderstanding your *'s in that line. Are you using PROVIDES_ABSOLUTE as a way of recording what version of the compiler libfoo.so was built with? So that you'd have a line that says libfoo.so links against /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libstdc++.so, so that parsing that line gives 4.8.3?

Also if you had the absolute path in VDB somewhere, like in PROVIDES, then you don't need it in the elf's rpath, which would make me feel better.

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
Re: [gentoo-portage-dev] Pre RFC on RFC: Add compiler information to exported a Package Manger's Cached Information.
On 02/25/15 14:51, Anthony G. Basile wrote:
On 02/22/15 01:30, Zac Medico wrote:
On 02/21/2015 10:22 PM, Zac Medico wrote:

If we put the real/canonical libstdc++.so path in the DT_NEEDED section, then it will automatically work with existing soname dependency support.

Actually, we'd also have to add a way for you to put the full path of the libstdc++.so in PROVIDES. For example: PROVIDES_ABSOLUTE="/usr/lib/gcc/*/*/libstdc++.so.6"

I guess I don't understand how this would work exactly. What if someone has gcc-4.8.3, builds library libfoo.so which uses C++, then upgrades to gcc-4.9, removes 4.8 and tries to build bar, which is also written in C++ and links against libfoo.so? We would have mismatching ABIs. How would this catch it and trigger the correct rebuilds?

Unless I'm misunderstanding your *'s in that line. Are you using PROVIDES_ABSOLUTE as a way of recording what version of the compiler libfoo.so was built with? So that you'd have a line that says libfoo.so links against /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libstdc++.so, so that parsing that line gives 4.8.3?

Also if you had the absolute path in VDB somewhere, like in PROVIDES, then you don't need it in the elf's rpath, which would make me feel better.

Actually no, you'd still need rpath for the elf itself, otherwise you can still link against the wrong version of libstdc++.so. Note in my following example that even though I build test.cpp with 4.7.3, I still wind up linking against 4.8.3.
yellow tmp # cat test.cpp
#include <iostream>
using namespace std;
int main()
{
    cout << "hello owrld" << endl ;
}
yellow tmp # gcc-config -l
 [1] x86_64-pc-linux-gnu-4.7.3 *
 [2] x86_64-pc-linux-gnu-4.7.3-hardenednopie
 [3] x86_64-pc-linux-gnu-4.7.3-hardenednopiessp
 [4] x86_64-pc-linux-gnu-4.7.3-hardenednossp
 [5] x86_64-pc-linux-gnu-4.7.3-vanilla
 [6] x86_64-pc-linux-gnu-4.8.3
 [7] x86_64-pc-linux-gnu-4.8.3-hardenednopie
 [8] x86_64-pc-linux-gnu-4.8.3-hardenednopiessp
 [9] x86_64-pc-linux-gnu-4.8.3-hardenednossp
 [10] x86_64-pc-linux-gnu-4.8.3-vanilla
yellow tmp # g++ -o go test.cpp
yellow tmp # ldd go
    linux-vdso.so.1 (0x033f63717000)
    libstdc++.so.6 => /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libstdc++.so.6 (0x033f631cf000)
    libm.so.6 => /lib64/libm.so.6 (0x033f62ecb000)
    libgcc_s.so.1 => /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libgcc_s.so.1 (0x033f62cb4000)
    libc.so.6 => /lib64/libc.so.6 (0x033f628f8000)
    /lib64/ld-linux-x86-64.so.2 (0x033f634f6000)
yellow tmp # g++ -Wl,-rpath,/usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/ -o go test.cpp
yellow tmp # ldd go
    linux-vdso.so.1 (0x036035212000)
    libstdc++.so.6 => /usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/libstdc++.so.6 (0x036034ccf000)
    libm.so.6 => /lib64/libm.so.6 (0x0360349cb000)
    libgcc_s.so.1 => /usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/libgcc_s.so.1 (0x0360347b4000)
    libc.so.6 => /lib64/libc.so.6 (0x0360343f8000)
    /lib64/ld-linux-x86-64.so.2 (0x036034ff1000)

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197
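The comparison made by eye in the ldd output above (which GCC version's libstdc++.so.6 the binary actually resolves to) can be sketched as a small parser. The function name and regular expression below are illustrative assumptions that only handle the Gentoo-style /usr/lib/gcc/CHOST/VERSION/ layout shown in the transcript:

```python
import re

# Matches lines such as:
#   libstdc++.so.6 => /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libstdc++.so.6 (0x...)
LIBSTDCXX_RE = re.compile(
    r"libstdc\+\+\.so\.6 => /usr/lib/gcc/"
    r"(?P<chost>[^/]+)/(?P<version>[^/]+)/libstdc\+\+\.so\.6"
)

def libstdcxx_gcc_version(ldd_output):
    """Return the GCC version directory that libstdc++.so.6 resolves to,
    or None if no matching line is found in the ldd output."""
    match = LIBSTDCXX_RE.search(ldd_output)
    return match.group("version") if match else None

if __name__ == "__main__":
    without_rpath = (
        "libstdc++.so.6 => /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/"
        "libstdc++.so.6 (0x033f631cf000)"
    )
    with_rpath = (
        "libstdc++.so.6 => /usr/lib/gcc/x86_64-pc-linux-gnu/4.7.3/"
        "libstdc++.so.6 (0x036034ccf000)"
    )
    print(libstdcxx_gcc_version(without_rpath))
    print(libstdcxx_gcc_version(with_rpath))
```

Note that this reports what the dynamic loader resolves at run time on the current system, not which compiler built the binary, which is exactly the discrepancy the example in the message demonstrates.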
Re: [gentoo-portage-dev] Pre RFC on RFC: Add compiler information to exported a Package Manger's Cached Information.
On 02/25/15 15:38, Zac Medico wrote:
On 02/25/2015 12:01 PM, Anthony G. Basile wrote:
On 02/25/15 14:51, Anthony G. Basile wrote:
On 02/22/15 01:30, Zac Medico wrote:
On 02/21/2015 10:22 PM, Zac Medico wrote:

If we put the real/canonical libstdc++.so path in the DT_NEEDED section, then it will automatically work with existing soname dependency support.

Actually, we'd also have to add a way for you to put the full path of the libstdc++.so in PROVIDES. For example: PROVIDES_ABSOLUTE="/usr/lib/gcc/*/*/libstdc++.so.6"

I guess I don't understand how this would work exactly. What if someone has gcc-4.8.3, builds library libfoo.so which uses C++, then upgrades to gcc-4.9, removes 4.8 and tries to build bar, which is also written in C++ and links against libfoo.so? We would have mismatching ABIs. How would this catch it and trigger the correct rebuilds?

Unless I'm misunderstanding your *'s in that line. Are you using PROVIDES_ABSOLUTE as a way of recording what version of the compiler libfoo.so was built with? So that you'd have a line that says libfoo.so links against /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.3/libstdc++.so, so that parsing that line gives 4.8.3?

Also if you had the absolute path in VDB somewhere, like in PROVIDES, then you don't need it in the elf's rpath, which would make me feel better.

Actually no, you'd still need rpath for the elf itself, otherwise you can still link against the wrong version of libstdc++.so. Note in my following example that even though I build test.cpp with 4.7.3, I still wind up linking against 4.8.3.

If DT_NEEDED contains the absolute libstdc++.so path, it's guaranteed to link against the correct version, regardless of rpath.

How do you get DT_NEEDED to contain the absolute libstdc++.so path when building?

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197