Re: [gentoo-portage-dev] [PATCH 2/2] pym/portage/util/locale.py: add a C module to check locale

2016-05-20 Thread Michał Górny
On Fri, 20 May 2016 06:59:14 -0400
"Anthony G. Basile"  wrote:

> On 5/19/16 9:38 AM, Michał Górny wrote:
> > On 19 May 2016 14:43:38 CEST, "Anthony G. Basile"
> >  wrote:
> >> From: "Anthony G. Basile" 
> >>
> >> The current method to check for the system locale is to use python's
> >> ctypes.util.find_library() to construct a full library path to the
> >> system libc.so which is then passed to ctypes.CDLL().  However,
> >> this gets bogged down in implementation-dependent details and
> >> fails with musl.
> >>
> >> We work around this design flaw in ctypes with a small python module
> >> written in C called 'portage_c_check_locale', and only fall back on
> >> the current ctypes-based check when this module is not available.
> >>
> >> This has been tested on glibc, uClibc and musl systems.
> >>
> >> X-Gentoo-bug: 571444
> >> X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=571444
> >> Signed-off-by: Anthony G. Basile   
> > 
> > To be honest, I don't like this. Do we really want to duplicate all this, 
> > including duplicating the complete messages? This really looks messy and 
> > unmaintainable.
> > 
> > The only reason libc functions were being used was to work around the hacky
> > built-in case conversion functions. And this is the only part where CDLL
> > was really used.
> >
> > So please change this to provide trivial wrappers around the C library
> > functions, and use them as an alternative to the ones obtained using CDLL,
> > with a single common code path doing all the check logic and messaging.
> > 
> >   
> 
> ctypes is itself a problem and so hacky python built-ins were replaced
> by more hacky python.  CDLL shouldn't be used at all.  The other problem
> is that non utf-8 encodings cause problems much earlier in the portage
> codebase than the test for a sane environment, which is a bigger problem
> than this small issue.  No one checked what happens with python2.7 +
> portage + exotic locale.  Since I don't know where this might head in
> the future, I kinda like the standalone potential.  The only repeated
> code here is the message.  Nonetheless, I can reduce this to just two
> functions, and do something like the following.  I assume that's what
> you're suggesting:
> 
> try:
>     from portage_c_convert_case import _c_toupper, _c_tolower
>     libc_toupper = _c_toupper
>     libc_tolower = _c_tolower
> except ImportError:
>     libc_fn = find_library("c")
>     if libc_fn is None:
>         return None
>     libc = LoadLibrary(libc_fn)
>     if libc is None:
>         return None
>     libc_toupper = libc.toupper
>     libc_tolower = libc.tolower
> 
> 
> Incidentally, another approach, one that I use in bash is as follows.  I
> think it requires bash4, and I'm not sure how to pull this into python
> gracefully.
> 
> l="abcdefghijklmnopqrstuvwxyz"
> u="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
> ru=${l^^}
> rl=${u,,}
> [[ $l == $rl ]] && echo "lower case okay" || echo "lower case bad"
> [[ $u == $ru ]] && echo "upper case okay" || echo "upper case bad"

Exactly as pointed out above. You have to import tolower()
and toupper() from libc as Python case conversion functions don't give
the same results as 'pure' libc used by bash.
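
To illustrate the point, here is a minimal sketch, assuming Python 3, where
str.upper() and str.lower() follow Unicode rules rather than LC_CTYPE. The
helper name builtin_roundtrip_ok is made up for this example and is not part
of portage. It runs the same ASCII round-trip as the bash snippet, but with
the Python built-ins:

```python
import locale

LOWER = "abcdefghijklmnopqrstuvwxyz"
UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def builtin_roundtrip_ok(locale_name=""):
    """ASCII case round-trip using str.upper()/str.lower() under LC_CTYPE=locale_name.

    Hypothetical helper for illustration only.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        try:
            locale.setlocale(locale.LC_CTYPE, locale_name)
        except locale.Error:
            pass  # requested locale is not installed; keep the current one
        # On Python 3 these conversions do not consult LC_CTYPE at all.
        return LOWER.upper() == UPPER and UPPER.lower() == LOWER
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)


if __name__ == "__main__":
    print(builtin_roundtrip_ok())
    print(builtin_roundtrip_ok("tr_TR.UTF-8"))
```

Because the built-ins pass regardless of the locale, they cannot reveal a
locale whose libc toupper()/tolower() mapping differs from POSIX, which is
why the check has to call into libc.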

-- 
Best regards,
Michał Górny





Re: [gentoo-portage-dev] [PATCH 2/2] pym/portage/util/locale.py: add a C module to check locale

2016-05-20 Thread Anthony G. Basile
On 5/19/16 9:38 AM, Michał Górny wrote:
> On 19 May 2016 14:43:38 CEST, "Anthony G. Basile"
>  wrote:
>> From: "Anthony G. Basile" 
>>
>> The current method to check for the system locale is to use python's
>> ctypes.util.find_library() to construct a full library path to the
>> system libc.so which is then passed to ctypes.CDLL().  However,
>> this gets bogged down in implementation-dependent details and
>> fails with musl.
>>
>> We work around this design flaw in ctypes with a small python module
>> written in C called 'portage_c_check_locale', and only fall back on
>> the current ctypes-based check when this module is not available.
>>
>> This has been tested on glibc, uClibc and musl systems.
>>
>> X-Gentoo-bug: 571444
>> X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=571444
>> Signed-off-by: Anthony G. Basile 
> 
> To be honest, I don't like this. Do we really want to duplicate all this, 
> including duplicating the complete messages? This really looks messy and 
> unmaintainable.
> 
> The only reason libc functions were being used was to work around the hacky
> built-in case conversion functions. And this is the only part where CDLL was
> really used.
>
> So please change this to provide trivial wrappers around the C library
> functions, and use them as an alternative to the ones obtained using CDLL,
> with a single common code path doing all the check logic and messaging.
> 
> 

ctypes is itself a problem and so hacky python built-ins were replaced
by more hacky python.  CDLL shouldn't be used at all.  The other problem
is that non utf-8 encodings cause problems much earlier in the portage
codebase than the test for a sane environment, which is a bigger problem
than this small issue.  No one checked what happens with python2.7 +
portage + exotic locale.  Since I don't know where this might head in
the future, I kinda like the standalone potential.  The only repeated
code here is the message.  Nonetheless, I can reduce this to just two
functions, and do something like the following.  I assume that's what
you're suggesting:

try:
    from portage_c_convert_case import _c_toupper, _c_tolower
    libc_toupper = _c_toupper
    libc_tolower = _c_tolower
except ImportError:
    libc_fn = find_library("c")
    if libc_fn is None:
        return None
    libc = LoadLibrary(libc_fn)
    if libc is None:
        return None
    libc_toupper = libc.toupper
    libc_tolower = libc.tolower
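
A slightly fuller sketch of the same idea, with the check logic kept in one
place and only the two conversion functions swapped. The assumptions are
these: portage_c_convert_case is the module proposed above and does not exist
unless it is built, plain ctypes.CDLL stands in for portage's LoadLibrary
helper, and the names get_case_functions and ascii_case_is_sane are
illustrative only.

```python
import ctypes
import ctypes.util


def get_case_functions():
    """Return (toupper, tolower) working on integer character codes, or None."""
    try:
        # Proposed C extension from this thread; absent unless it was built.
        from portage_c_convert_case import _c_toupper, _c_tolower
        return _c_toupper, _c_tolower
    except ImportError:
        pass

    # Fallback: locate libc through ctypes (the step reported to fail on musl).
    libc_fn = ctypes.util.find_library("c")
    if libc_fn is None:
        return None
    try:
        libc = ctypes.CDLL(libc_fn)
    except OSError:
        return None
    return libc.toupper, libc.tolower


def ascii_case_is_sane(toupper, tolower):
    """Single shared check: does ASCII case conversion round-trip as in POSIX?"""
    lc = list(range(ord('a'), ord('z') + 1))
    uc = list(range(ord('A'), ord('Z') + 1))
    rlc = [tolower(c) for c in uc]
    ruc = [toupper(c) for c in lc]
    return lc == rlc and uc == ruc


if __name__ == "__main__":
    fns = get_case_functions()
    if fns is None:
        print("no libc case functions available")
    else:
        print("sane" if ascii_case_is_sane(*fns) else "broken")
```

The result only reflects the user's locale if LC_CTYPE has been set from the
environment before the check runs. The messaging would then live next to
ascii_case_is_sane() and be written once.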


Incidentally, another approach, one that I use in bash is as follows.  I
think it requires bash4, and I'm not sure how to pull this into python
gracefully.

l="abcdefghijklmnopqrstuvwxyz"
u="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
ru=${l^^}
rl=${u,,}
[[ $l == $rl ]] && echo "lower case okay" || echo "lower case bad"
[[ $u == $ru ]] && echo "upper case okay" || echo "upper case bad"

-- 
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197



Re: [gentoo-portage-dev] [PATCH 2/2] pym/portage/util/locale.py: add a C module to check locale

2016-05-19 Thread Michał Górny
On 19 May 2016 14:43:38 CEST, "Anthony G. Basile"
 wrote:
>From: "Anthony G. Basile" 
>
>The current method to check for the system locale is to use python's
>ctypes.util.find_library() to construct a full library path to the
>system libc.so which is then passed to ctypes.CDLL().  However,
>this gets bogged down in implementation-dependent details and
>fails with musl.
>
>We work around this design flaw in ctypes with a small python module
>written in C called 'portage_c_check_locale', and only fall back on
>the current ctypes-based check when this module is not available.
>
>This has been tested on glibc, uClibc and musl systems.
>
>X-Gentoo-bug: 571444
>X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=571444
>Signed-off-by: Anthony G. Basile 

To be honest, I don't like this. Do we really want to duplicate all this, 
including duplicating the complete messages? This really looks messy and 
unmaintainable.

The only reason libc functions were being used was to work around the hacky
built-in case conversion functions. And this is the only part where CDLL was
really used.

So please change this to provide trivial wrappers around the C library
functions, and use them as an alternative to the ones obtained using CDLL,
with a single common code path doing all the check logic and messaging.


>---
> pym/portage/util/locale.py |  40 +
> setup.py   |   6 +-
> src/check_locale.c | 144 +
> 3 files changed, 178 insertions(+), 12 deletions(-)
> create mode 100644 src/check_locale.c
>
>diff --git a/pym/portage/util/locale.py b/pym/portage/util/locale.py
>index 2a15ea1..fc5d052 100644
>--- a/pym/portage/util/locale.py
>+++ b/pym/portage/util/locale.py
>@@ -30,17 +30,17 @@ locale_categories = (
> _check_locale_cache = {}
> 
> 
>-def _check_locale(silent):
>+def _ctypes_check_locale():
>   """
>-  The inner locale check function.
>+  Check for locale using ctypes.
>   """
> 
>   libc_fn = find_library("c")
>   if libc_fn is None:
>-      return None
>+      return (None, "")
>   libc = LoadLibrary(libc_fn)
>   if libc is None:
>-      return None
>+      return (None, "")
> 
>   lc = list(range(ord('a'), ord('z')+1))
>   uc = list(range(ord('A'), ord('Z')+1))
>@@ -48,9 +48,6 @@ def _check_locale(silent):
>   ruc = [libc.toupper(c) for c in lc]
> 
>   if lc != rlc or uc != ruc:
>-      if silent:
>-          return False
>-
>       msg = ("WARNING: The LC_CTYPE variable is set to a locale " +
>           "that specifies transformation between lowercase " +
>           "and uppercase ASCII characters that is different than " +
>@@ -71,11 +68,32 @@ def _check_locale(silent):
>           msg.extend([
>               "  %s -> %s" % (chars(uc), chars(rlc)),
>               "  %28s: %s" % ('expected', chars(lc))])
>-      writemsg_level("".join(["!!! %s\n" % l for l in msg]),
>-          level=logging.ERROR, noiselevel=-1)
>-      return False
>+      msg = "".join(["!!! %s\n" % l for l in msg])
>+      return (False, msg)
>+
>+  return (True, "")
>+
>+
>+def _check_locale(silent):
>+  """
>+  The inner locale check function.
>+  """
>+
>+  try:
>+      from portage_c_check_locale import _c_check_locale
>+      (ret, msg) = _c_check_locale()
>+  except ImportError:
>+      writemsg_level("!!! Unable to import portage_c_check_locale\n",
>+          level=logging.WARNING, noiselevel=-1)
>+      (ret, msg) = _ctypes_check_locale()
>+
>+  if ret:
>+      return True
>+
>+  if not silent:
>+      writemsg_level(msg, level=logging.ERROR, noiselevel=-1)
> 
>-  return True
>+  return False
> 
> 
> def check_locale(silent=False, env=None):
>diff --git a/setup.py b/setup.py
>index 25429bc..e44ac41 100755
>--- a/setup.py
>+++ b/setup.py
>@@ -47,7 +47,11 @@ x_scripts = {
> # Dictionary custom modules written in C/C++ here.  The structure is
> #   key   = module name
> #   value = list of C/C++ source code, path relative to top source directory
>-x_c_helpers = {}
>+x_c_helpers = {
>+  'portage_c_check_locale' : [
>+      'src/check_locale.c',
>+  ],
>+}
> 
> class x_build(build):
>   """ Build command with extra build_man call. """
>diff --git a/src/check_locale.c b/src/check_locale.c
>new file mode 100644
>index 000..9762ef2
>--- /dev/null
>+++ b/src/check_locale.c
>@@ -0,0 +1,144 @@
>+/* Copyright 2005-2015 Gentoo Foundation
>+ * Distributed under the terms of the GNU General Public License v2
>+ */
>+
>+#include 
>+#include 
>+#include 
>+
>+static PyObject * portage_c_check_locale(PyObject *, PyObject *);
>+
>+static PyMethodDef CheckLocaleMethods[] = {
>+  {"_c_check_locale", portage_c_check_locale, METH

Re: [gentoo-portage-dev] [PATCH 2/2] pym/portage/util/locale.py: add a C module to check locale

2016-05-19 Thread Anthony G. Basile
On 5/19/16 8:43 AM, Anthony G. Basile wrote:
> From: "Anthony G. Basile" 

> 
> This has been tested on glibc, uClibc and musl systems.

To be clear here, I needed the patch from bug #583412 to fix an
independent bug in portage + python2.7 + turkish (and possibly other)
locale.  Arfrever is going to commit that upstream, so it'll eventually
trickle down to us.


-- 
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197



[gentoo-portage-dev] [PATCH 2/2] pym/portage/util/locale.py: add a C module to check locale

2016-05-19 Thread Anthony G. Basile
From: "Anthony G. Basile" 

The current method to check for the system locale is to use python's
ctypes.util.find_library() to construct a full library path to the
system libc.so which is then passed to ctypes.CDLL().  However,
this gets bogged down in implementation-dependent details and
fails with musl.

We work around this design flaw in ctypes with a small python module
written in C called 'portage_c_check_locale', and only fall back on
the current ctypes-based check when this module is not available.

This has been tested on glibc, uClibc and musl systems.

X-Gentoo-bug: 571444
X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=571444
Signed-off-by: Anthony G. Basile 
---
 pym/portage/util/locale.py |  40 +
 setup.py   |   6 +-
 src/check_locale.c | 144 +
 3 files changed, 178 insertions(+), 12 deletions(-)
 create mode 100644 src/check_locale.c

diff --git a/pym/portage/util/locale.py b/pym/portage/util/locale.py
index 2a15ea1..fc5d052 100644
--- a/pym/portage/util/locale.py
+++ b/pym/portage/util/locale.py
@@ -30,17 +30,17 @@ locale_categories = (
 _check_locale_cache = {}
 
 
-def _check_locale(silent):
+def _ctypes_check_locale():
    """
-   The inner locale check function.
+   Check for locale using ctypes.
    """
 
    libc_fn = find_library("c")
    if libc_fn is None:
-       return None
+       return (None, "")
    libc = LoadLibrary(libc_fn)
    if libc is None:
-       return None
+       return (None, "")
 
    lc = list(range(ord('a'), ord('z')+1))
    uc = list(range(ord('A'), ord('Z')+1))
@@ -48,9 +48,6 @@ def _check_locale(silent):
    ruc = [libc.toupper(c) for c in lc]
 
    if lc != rlc or uc != ruc:
-       if silent:
-           return False
-
        msg = ("WARNING: The LC_CTYPE variable is set to a locale " +
            "that specifies transformation between lowercase " +
            "and uppercase ASCII characters that is different than " +
@@ -71,11 +68,32 @@ def _check_locale(silent):
            msg.extend([
                "  %s -> %s" % (chars(uc), chars(rlc)),
                "  %28s: %s" % ('expected', chars(lc))])
-       writemsg_level("".join(["!!! %s\n" % l for l in msg]),
-           level=logging.ERROR, noiselevel=-1)
-       return False
+       msg = "".join(["!!! %s\n" % l for l in msg])
+       return (False, msg)
+
+   return (True, "")
+
+
+def _check_locale(silent):
+   """
+   The inner locale check function.
+   """
+
+   try:
+       from portage_c_check_locale import _c_check_locale
+       (ret, msg) = _c_check_locale()
+   except ImportError:
+       writemsg_level("!!! Unable to import portage_c_check_locale\n",
+           level=logging.WARNING, noiselevel=-1)
+       (ret, msg) = _ctypes_check_locale()
+
+   if ret:
+       return True
+
+   if not silent:
+       writemsg_level(msg, level=logging.ERROR, noiselevel=-1)
 
-   return True
+   return False
 
 
 def check_locale(silent=False, env=None):
diff --git a/setup.py b/setup.py
index 25429bc..e44ac41 100755
--- a/setup.py
+++ b/setup.py
@@ -47,7 +47,11 @@ x_scripts = {
 # Dictionary custom modules written in C/C++ here.  The structure is
 #   key   = module name
 #   value = list of C/C++ source code, path relative to top source directory
-x_c_helpers = {}
+x_c_helpers = {
+   'portage_c_check_locale' : [
+       'src/check_locale.c',
+   ],
+}
 
 class x_build(build):
    """ Build command with extra build_man call. """
diff --git a/src/check_locale.c b/src/check_locale.c
new file mode 100644
index 000..9762ef2
--- /dev/null
+++ b/src/check_locale.c
@@ -0,0 +1,144 @@
+/* Copyright 2005-2015 Gentoo Foundation
+ * Distributed under the terms of the GNU General Public License v2
+ */
+
+#include 
+#include 
+#include 
+
+static PyObject * portage_c_check_locale(PyObject *, PyObject *);
+
+static PyMethodDef CheckLocaleMethods[] = {
+   {"_c_check_locale", portage_c_check_locale, METH_NOARGS, "Check the system locale."},
+   {NULL, NULL, 0, NULL}
+};
+
+#if PY_MAJOR_VERSION >= 3
+static struct PyModuleDef moduledef = {
+   PyModuleDef_HEAD_INIT,
+   "portage_c_check_locale",   /* m_name */
+   "Module for checking the system locale for portage",/* m_doc */
+   -1, /* m_size */
+   CheckLocaleMethods, /* m_methods */
+   NULL,   /* m_reload */
+   NULL,   /* m_traverse