[issue41842] Add codecs.unregister() to unregister a codec search function

2020-09-23 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 23.09.2020 16:49, STINNER Victor wrote:
> 
> STINNER Victor  added the comment:
> 
>> Just found an internal API which already takes care of
>> unregistering a search function: _PyCodec_Forget().
>>
>> All that needs to be done is to expose this as codecs.unregister()
>> and add the clearing of the lookup cache.
> 
> Yeah, I saw this function, but it's related to the cache, not to the list of 
> search functions.

Ah, right. I just looked at the first occurrence of codec_search_path :-)

>> BTW: While you're at it, having a way to access the search function
>> list from Python would be nice as well, since this would then open
>> up the possibility to reorder search functions.
> 
> I didn't hear anyone (ok, apart you) who requested to order search functions.

This has come up in the past from people who wanted to override
builtin codecs with their own versions.

> I dislike the idea of exposing it, since it introduces the risk that someone 
> "unregisters" a search function simply by removing it from the list, without 
> invalidating the cache.
> 
> I prefer to hide the internals to ensure that the cache remains consistent.

Sure, a function would merely return a tuple with the entries,
not the list itself, e.g. in pseudo code:

def get_search_path():
    return tuple(interp->codec_search_path)

For replacing the vanilla setup, this is not needed, since only
one search function gets registered (the builtin one), so rather
low priority, I guess.
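
To illustrate the API under discussion, here is a minimal sketch. It assumes codecs.unregister() is available (it was added in Python 3.10; the hasattr() check keeps the sketch runnable on older versions). The codec name "demo_utf8" and the search function demo_search() are made up for the example.

```python
import codecs

def demo_search(name):
    # Hypothetical search function: serves one made-up alias for UTF-8.
    if name == "demo_utf8":
        utf8 = codecs.lookup("utf-8")
        return codecs.CodecInfo(utf8.encode, utf8.decode, name="demo_utf8")
    return None

codecs.register(demo_search)
found_before = codecs.lookup("demo_utf8").name

found_after = None  # stays None if unregister() is not available
if hasattr(codecs, "unregister"):
    # Removes the search function and clears the lookup cache, so the
    # cached "demo_utf8" entry cannot outlive its search function.
    codecs.unregister(demo_search)
    try:
        codecs.lookup("demo_utf8")
        found_after = True
    except LookupError:
        found_after = False
```

Because removal and cache invalidation happen in one call, the inconsistency Victor is worried about cannot arise from Python code.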

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Sep 23 2020)
>>> Python Projects, Coaching and Support ...https://www.egenix.com/
>>> Python Product Development ...https://consulting.egenix.com/


::: We implement business ideas - efficiently in both time and costs :::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   https://www.egenix.com/company/contact/
 https://www.malemburg.com/

--

___
Python tracker 
<https://bugs.python.org/issue41842>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41420] Academic Free License v. 2.1 link is not found and is obsolete

2020-09-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

The Adobe form itself also still lists the broken URL. Only Ewa or Betsy can 
fix this, I suppose. I'll write them an email.

--

___
Python tracker 
<https://bugs.python.org/issue41420>
___



[issue41420] Academic Free License v. 2.1 link is not found and is obsolete

2020-09-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Fixed https://www.python.org/psf/contrib/ to point to 
https://spdx.org/licenses/AFL-2.1.html instead. The contrib-form page 
(https://www.python.org/psf/contrib/contrib-form/) already had this change, but 
the PDF you can download from there still lists the old link.

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue41420>
___



[issue40789] C-level destructor in PySide2 breaks gen_send_ex, which assumes it's safe to call Py_DECREF with a live exception

2020-05-27 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 27.05.2020 05:56, Nathaniel Smith wrote:
> In CPython in general, it could be worked around by not invoking deallocators 
> with a live exception... I'm actually pretty surprised that this is even 
> possible! It seems like having a live exception when you start executing 
> arbitrary Python code would be bad. So maybe that's the real bug? Adding both 
> "asyncio" and "memory management" interest groups to the nosy.

Exception handlers can execute arbitrary Python code, so it's not
surprising that objects get allocated, deallocated, etc.

What you're describing sounds more like a problem with the PySide2
code not being reentrant. Clearing exceptions always has to be done
with some care. It's normally only applied to replace the exception
with a more specific one, when the exception is expected and handled
in the C code, or when there is no way to report the exception back
up the stack.

Note: Even the PyErr_Print() can result in Python code being
executed and because it's likely that PySide2 objects are part
of the stack trace, even PySide2 methods may be called as a result.

--

___
Python tracker 
<https://bugs.python.org/issue40789>
___



[issue35967] Better platform.processor support

2020-05-18 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Thanks, Jason. I'll have a closer look at the issue and report back later this 
week.

--

___
Python tracker 
<https://bugs.python.org/issue35967>
___



[issue40578] Deprecate numeric item access for platform.uname()

2020-05-13 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

I am closing this issue, since deprecations should really only be used when no 
other means are possible.

The namedtuples returned by platform.uname() do support index access and so any 
implementation change altering this is surprising and backwards incompatible, 
potentially breaking existing code which makes reasonable use of the index 
interface (the namedtuple and processor attribute were introduced in Python 3.3, 
so code written for prior versions may well still use the perfectly reasonable 
index approach).

You are essentially suggesting to change the return type, since you want to 
remove a standard tuple interface.
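
As a small illustration of the two access styles in question (a sketch only; the variable names are mine), both of these are expected to keep working on the same return value:

```python
import platform

u = platform.uname()

# Index-based access, as written for older Python versions.
system_by_index = u[0]
machine_by_index = u[4]

# Attribute-based access on the same object.
system_by_name = u.system
machine_by_name = u.machine

# Plain tuple unpacking relies on the standard tuple interface as well.
system, node, release, version, machine, processor = u
```

Code of the first and third kind is what a deprecation of index access would break.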

--
resolution:  -> rejected
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue40578>
___



[issue35967] Better platform.processor support

2020-05-13 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Reopening the ticket, since the implementation makes backwards incompatible 
changes to platform.uname(): see https://bugs.python.org/issue40570 for a 
discussion on a better approach to lazy evaluation of getting the processor 
information.

Before we head on into implementation details, could you please point me to the 
motivation why only the processor detail of uname() needs lazy evaluation ?

Thanks.

--
resolution: fixed -> 
status: closed -> open

___
Python tracker 
<https://bugs.python.org/issue35967>
___



[issue40570] len(platform.uname()) has changed in Python 3.9

2020-05-13 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Hi Jason,

I think I have to review the whole set of changes again to understand what your 
motivation is/was. 

For https://bugs.python.org/issue35967 I had already stated that your use case 
is not special enough to make the platform.py logic more complex.

BTW: Please don't open several different tickets for the same problem. It 
doesn't really help to see what is going on. I'll reopen the issue35967 to 
continue the discussion there.

Thanks.

--

___
Python tracker 
<https://bugs.python.org/issue40570>
___



[issue40570] len(platform.uname()) has changed in Python 3.9

2020-05-09 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Ok, let me add some more context: When I wrote the uname interface
I was aware that calling the API will take some resources. That's
why I added the cache. IMO, that was enough as optimization.

Now, you added a late binding optimization for the whole uname return
tuple to save the effort of going out to the system and figure out
the value using separate APIs or even shell access.

I think this would have been better implemented in the various
uname() consumers
(https://github.com/python/cpython/blob/77c614624b6bf2145bef69830d0f499d8b55ec0c/Lib/platform.py#L898
and below), using a variant of the uname() API, say _uname(),
which leaves out the processor information for those APIs which don't
need it and only provide the late binding in processor() (which could
then also fill in the cache value for uname()).

The uname() API would then still do the full lookup, but applications
could then use the specialized API to query only the information
they need.

I don't think that deprecating standard tuple access is an option
for the uname() return value, since it's documented to be a tuple.

--

___
Python tracker 
<https://bugs.python.org/issue40570>
___



[issue40570] len(platform.uname()) has changed in Python 3.9

2020-05-09 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Hi Jason,

to achieve better backwards compatibility, it's probably better to use
the approach taken for CodecInfo in the codecs.py module:

class CodecInfo(tuple):
    """Codec details when looking up the codec registry"""

    def __new__(cls, encode, decode, streamreader=None, streamwriter=None,
                incrementalencoder=None, incrementaldecoder=None, name=None,
                *, _is_text_encoding=None):
        self = tuple.__new__(cls, (encode, decode, streamreader,
                                   streamwriter))
        self.name = name
        self.encode = encode
        self.decode = decode
        self.incrementalencoder = incrementalencoder
        self.incrementaldecoder = incrementaldecoder
        self.streamwriter = streamwriter
        self.streamreader = streamreader
        if _is_text_encoding is not None:
            self._is_text_encoding = _is_text_encoding
        return self

    def __repr__(self):
        return "<%s.%s object for encoding %s at %#x>" % \
               (self.__class__.__module__, self.__class__.__qualname__,
                self.name, id(self))

This used to be a 4 entry tuple and was extended to hold additional
fields. To the outside, it still looks like a 4-tuple in all aspects,
but attribute access permits accessing the additional fields.
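
The effect can be checked directly against the codec registry. A short sketch (the variable names are only for illustration):

```python
import codecs

info = codecs.lookup("utf-8")

# To old-style callers this is still a plain 4-tuple.
encode, decode, streamreader, streamwriter = info
entries = len(info)

# The additional fields are reachable only as attributes.
name = info.name
has_incremental = info.incrementalencoder is not None
same_encoder = info[0] is info.encode
```

Code written against the original 4-tuple keeps working unchanged, while newer code can use the extra attributes.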

--

___
Python tracker 
<https://bugs.python.org/issue40570>
___



[issue40505] getpath.c doesn't know about lib64

2020-05-04 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

No, I have not opened a bug report on OpenSUSE. Since the OS uses bi-arch 
throughout, using lib64 is the natural thing to use for libdir on the OS.

I think the issue lies with getpath.c only, since it makes an assumption about 
the libdir config value, which doesn't necessarily hold in practice. The libdir 
config value can be changed via the standard --libdir configure parameter, so 
getpath.c should really not assume that the default setting is always used.

Having a variable in Python 3.9 is nice, but perhaps we can still make Python 
3.7 and 3.8 work as well.

The correct approach to building the full lib_python path is not to use 
exec_prefix + "/lib/python" + VERSION, but instead to use libdir + "/python" + 
VERSION (which corresponds to the Makefile variable BINLIBDEST). libdir would 
have to be taken from the config variable LIBDIR.
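
For illustration, the same path construction can be sketched in Python via sysconfig. This is only a sketch of the idea, not what getpath.c does; LIBDIR and BINLIBDEST are read from the build configuration and may be None on platforms without a Makefile-based build (e.g. Windows).

```python
import sysconfig

version = sysconfig.get_python_version()      # "MAJOR.MINOR"
libdir = sysconfig.get_config_var("LIBDIR")   # may be None

# libdir + "/python" + VERSION instead of exec_prefix + "/lib/python" + VERSION
lib_python = None
if libdir:
    lib_python = libdir + "/python" + version

binlibdest = sysconfig.get_config_var("BINLIBDEST")
print("libdir based path:", lib_python)
print("BINLIBDEST:       ", binlibdest)
```

On a build configured with a non-default --libdir, the first line of output shows the directory that a hard-coded "lib/python" would miss.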

--

___
Python tracker 
<https://bugs.python.org/issue40505>
___



[issue40505] getpath.c doesn't know about lib64

2020-05-04 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Just to clarify: the CONFIG_SITE script on OpenSUSE causes configure to use 
lib64, not the Python configure script itself.

--

___
Python tracker 
<https://bugs.python.org/issue40505>
___



[issue40505] getpath.c doesn't know about lib64

2020-05-04 Thread Marc-Andre Lemburg


New submission from Marc-Andre Lemburg :

On platforms which configure identifies as a bi-arch platform, libdir is set to 
${exec_prefix}/lib64, which results in the C extensions getting installed in 
e.g. /usr/local/lib64/python3.8/lib-dynload/.

However, the getpath.c routines use a fixed "lib/python" VERSION (see 
https://github.com/python/cpython/blob/3.8/Modules/getpath.c#L1200) path to 
build sys.path. As a result, the built Python binary cannot load the builtin C 
extensions.

A work-around on OpenSUSE is to set CONFIG_SITE="" when configuring Python. 
This disables the bi-arch support and has libdir default to ${exec_prefix}/lib 
again.

Looking at the master branch, this may already have been fixed for 3.9, since a 
PLATLIBDIR variable is used instead. The patch would have to be backported to 
earlier Python versions as well.
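
A quick way to check whether a given build is affected might look like the following sketch. It assumes sys.platlibdir (available from Python 3.9 on, hence the getattr() fallback) and the Makefile variable DESTSHARED, which may be None on platforms without a Makefile-based build.

```python
import sys
import sysconfig

# Platform library directory name, e.g. "lib" or "lib64".
platlibdir = getattr(sys, "platlibdir", "lib")

# Directory the C extensions were installed into, per the build config.
destshared = sysconfig.get_config_var("DESTSHARED")

# If this is False, the interpreter cannot find its own C extensions.
on_sys_path = (destshared in sys.path) if destshared else None

print("platlibdir: ", platlibdir)
print("DESTSHARED: ", destshared)
print("on sys.path:", on_sys_path)
```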

--
components: Interpreter Core
messages: 368086
nosy: lemburg
priority: normal
severity: normal
status: open
title: getpath.c doesn't know about lib64
type: behavior
versions: Python 3.5, Python 3.6, Python 3.7, Python 3.8

___
Python tracker 
<https://bugs.python.org/issue40505>
___



[issue21081] missing vietnamese codec TCVN 5712:1993 in Python

2020-04-28 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Found an "Unlink" button at the bottom of the message view. This appears to 
remove the messages from the issue.

--

___
Python tracker 
<https://bugs.python.org/issue21081>
___



[issue21081] missing vietnamese codec TCVN 5712:1993 in Python

2020-04-28 Thread Marc-Andre Lemburg


Change by Marc-Andre Lemburg :


--
Removed message: https://bugs.python.org/msg367514

___
Python tracker 
<https://bugs.python.org/issue21081>
___



[issue21081] missing vietnamese codec TCVN 5712:1993 in Python

2020-04-28 Thread Marc-Andre Lemburg


Change by Marc-Andre Lemburg :


--
Removed message: https://bugs.python.org/msg367515

___
Python tracker 
<https://bugs.python.org/issue21081>
___



[issue21081] missing vietnamese codec TCVN 5712:1993 in Python

2020-04-28 Thread Marc-Andre Lemburg


Change by Marc-Andre Lemburg :


--
Removed message: https://bugs.python.org/msg320603

___
Python tracker 
<https://bugs.python.org/issue21081>
___



[issue21081] missing vietnamese codec TCVN 5712:1993 in Python

2020-04-28 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

I have marked the messages as spam. Can't seem to remove them, though.

--

___
Python tracker 
<https://bugs.python.org/issue21081>
___



[issue39891] [difflib] Improve get_close_matches() to better match when casing of words are different

2020-03-08 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

It looks like Brian is expecting some kind of normalization of the strings 
before they enter the function, e.g. convert to lowercase, remove extra 
whitespace, convert diacritics to regular letters, combinations of such 
normalizations, etc.

Since both "word" and "possibilities" would have to be normalized, I think it's 
better to let the application deal with this efficiently than try to come up 
with a new function or add a normalize keyword function parameter.
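
A minimal sketch of what doing this in the application could look like. The helper names normalize() and close_matches_normalized() are made up for the example, and the particular normalization (case folding, whitespace collapsing, stripping diacritics) is just one possible choice.

```python
import difflib
import unicodedata

def normalize(text):
    # Decompose characters, drop combining marks (diacritics),
    # fold case and collapse runs of whitespace.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return " ".join(text.casefold().split())

def close_matches_normalized(word, possibilities, n=3, cutoff=0.6):
    # Normalize each possibility once and remember the original spelling.
    by_norm = {}
    for p in possibilities:
        by_norm.setdefault(normalize(p), p)
    hits = difflib.get_close_matches(
        normalize(word), list(by_norm), n=n, cutoff=cutoff)
    return [by_norm[h] for h in hits]
```

The application can cache the normalized possibilities across calls, which a normalize parameter on get_close_matches() could not do.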

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue39891>
___



[issue37751] In codecs, function 'normalizestring' should convert both spaces and hyphens to underscores.

2020-01-14 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Just to clarify: the change in the C implementation was the breaking change. 
The patch just restores the previous behavior: 
https://github.com/python/cpython/blob/master/Lib/encodings/__init__.py#L43

Please note that external codec packages should not rely on the semantics of 
the Python stdlib encodings package's search function. They should really 
register their own search function: 
https://docs.python.org/3.9/library/codecs.html#codecs.register

It's good practice to always only use ASCII lower case chars and the underscore 
for codec names.
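
A minimal sketch of such a package-level search function. The codec name "mypkg_latin1" and the function mypkg_search() are hypothetical; the codec simply reuses the Latin-1 implementation.

```python
import codecs
import re

CODEC_NAME = "mypkg_latin1"   # hypothetical codec name

# ASCII lower case letters, digits and the underscore only.
assert re.fullmatch(r"[a-z0-9_]+", CODEC_NAME)

def mypkg_search(name):
    # The registry passes in the lower-cased encoding name.
    if name != CODEC_NAME:
        return None
    latin1 = codecs.lookup("latin-1")
    return codecs.CodecInfo(
        encode=latin1.encode,
        decode=latin1.decode,
        incrementalencoder=latin1.incrementalencoder,
        incrementaldecoder=latin1.incrementaldecoder,
        streamreader=latin1.streamreader,
        streamwriter=latin1.streamwriter,
        name=CODEC_NAME)

codecs.register(mypkg_search)

encoded = "caf\u00e9".encode(CODEC_NAME)
decoded = encoded.decode("MYPKG_LATIN1")   # lookup is case-insensitive
```

With a name like this, the package does not depend on how the stdlib encodings package treats spaces or hyphens.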

--

___
Python tracker 
<https://bugs.python.org/issue37751>
___



[issue37848] More fully implement Unicode's case mappings

2019-08-14 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

The Unicode implementation is deliberately not locale specific and
this should not change.

If a locale specific mapping is requested, this should be done
explicitly by e.g. providing a parameter to str.lower() / upper() /
title().
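
To sketch the difference: str.lower() applies the same mapping regardless of the process locale, and a locale specific variant would have to be asked for explicitly. The helper lower() below and its locale parameter are hypothetical; it only covers the Turkic dotted/dotless i as an example.

```python
def lower(text, locale=None):
    """Hypothetical helper: lower-casing with an explicit locale."""
    if locale in ("tr", "az"):
        # Turkic languages: I -> dotless i, dotted capital I -> i
        text = text.replace("I", "\u0131").replace("\u0130", "i")
    return text.lower()

default_result = lower("ISTANBUL")
turkish_result = lower("ISTANBUL", locale="tr")
```

The default call gives the locale independent result; only the explicit locale argument changes the mapping.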

--

___
Python tracker 
<https://bugs.python.org/issue37848>
___



[issue37760] Refactor makeunicodedata.py: dedupe parsing, use dataclass

2019-08-13 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

BTW: Since when do we use type annotations in Python's stdlib ?

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue37760>
___



[issue37751] In codecs, function 'normalizestring' should convert both spaces and hyphens to underscores.

2019-08-12 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Jordon is right. Conversion has to be to underscores, not hyphens. I guess this 
bug was introduced when the normalization function was converted to C.

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue37751>
___



[issue37406] Disable runtime checks in release mode

2019-06-26 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Given that extensions call these APIs, I find it highly risky to
disable these checks in any version of the Python runtime and
am -1 on such a change.

Using assert() in C is a pretty bad alternative, since this crashes
the whole process. It should really only be used where no other
means of error handling are possible. Python's exception mechanism
is a much better way to signal and handle such errors at the
application level.

--
nosy: +lemburg
title: Disable debug runtime checks in release mode -> Disable runtime checks 
in release mode

___
Python tracker 
<https://bugs.python.org/issue37406>
___



[issue35551] Encoding and alias issues

2019-06-05 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

1. Background for "tactis":

https://github.com/python/cpython/commit/4fd73f0465ba11c22f0986d04cf91b387ed22c47

# The codecs for these encodings are not distributed with the
# Python core, but are included here for reference, since the
# locale module relies on having these aliases available.

This codec was available as separate package at the time. Later the CJK codecs 
got added to the stdlib, but this codec was not.

I guess it's fine to remove the alias.

2. If the mappings are identical, just leaving one and making the other an 
alias is fine. Same for aliases of those mapping names.

3. I think we had already resolved this some time ago.

--

___
Python tracker 
<https://bugs.python.org/issue35551>
___



[issue36346] Prepare for removing the legacy Unicode C API

2019-03-18 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 18.03.2019 22:33, Stefan Behnel wrote:
> 
> I had also looked through the unrelated changes, and while, yes, they are 
> unrelated, they seemed to be correct and reasonable modernisations of the 
> code base while touching it. They could be moved to a separate PR, but there 
> is a relatively high risk of conflicts, so I'm ok with keeping them in here 
> for now.

I don't think changing sequence iteration to list iteration only
is something that should be hidden in a wchar_t removal PR.

My guess is that these changes have made it into the PR by mistake.
They deserve a separate PR and discussion.

--

___
Python tracker 
<https://bugs.python.org/issue36346>
___



[issue36346] Prepare for removing the legacy Unicode C API

2019-03-18 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

I'd change the title of this bpo item to "Prepare for removing the wchar_t 
caching in the Unicode C API".

Note that the wchar_t caching was put in place to allow for external 
applications and C code to easily and efficiently interface with Python. By 
removing it you will slow down such code significantly, esp. on Linux and 
Windows where wchar_t code is fairly common (one of the reasons we added UCS4 
in Python was to make the interaction with Linux wchar_t code more efficient).

This should be clearly mentioned as part of the change and the compile time 
flags.


BTW: You have a few other changes in the PR which don't have anything to do 
with the intended removal:

-envsize = PySequence_Fast_GET_SIZE(keys);
-if (PySequence_Fast_GET_SIZE(values) != envsize) {
+envsize = PyList_GET_SIZE(keys);
+if (PyList_GET_SIZE(values) != envsize) {

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue36346>
___



[issue35967] Better platform.processor support

2019-03-16 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 08.03.2019 18:50, Jason R. Coombs wrote:
> 
>> It's also easy to bypass that by simply seeding the global cache
>> for uname(): _uname_cache. 
>> Or you could monkey-patch the platform module
>> in your utility to work around the circular reference.
> 
> I don't think these options are possible in the general case. It was what I 
> attempted to do in the first place, but could not. Consider the situation 
> where a namespace package is present or where a script uses pkg_resources to 
> bootstrap itself (a very common case), or any other case where 
> `platform.(anything)` is invoked before the "bypass" or "monkey-patch" has a 
> chance to run. This happens when running the test suite for `cmdix` because 
> pytest invokes pkg_resources to search for entry points and that code invokes 
> `platform.system` (or similar) to evaluate environment markers long before 
> the cmdix code has been imported.

I don't quite follow: since you are the author of the tool, you can of
course have your uname.py import platform and then apply one of the
above tricks, e.g.

"""
#!/usr/bin/env python3
import platform

# Seed uname cache to avoid calling uname
platform._uname_cache = platform.uname_result(
    system='Linux',
    node='moon',
    release='5.99.99',
    version='#1 SMP 2020',
    machine='x86_64',
    processor='x86_64')

print ('Hello from uname.py')
print ('platform.uname() = %r' % (platform.uname(),))
"""

> Here's what happens:
> 
> `platform.(anything)` runs `platform.uname` and `platform.uname` invokes 
> `uname -p` in a subprocess _unconditionally_. Python doesn't provide hooks to 
> monkey-patch that out before it gets invoked.

This is only true for the platform APIs which need information from
uname. Not in general.

>> Or you could call your utility something else.
> 
> The point of this utility is to supply "coreutils" using Python. It's derived 
> from an abandoned project called "pycoreutils", one purpose of which is to 
> provide the core utilities on a minimal Linux distribution that doesn't have 
> uname. Another is to supply coreutils on Windows. Having an alternate name 
> isn't really viable when the purpose is to supply that interface.
> 
> 
> I do think your considerations are reasonable, and I'm close to giving up. I 
> look forward to your feedback on the 'resolved-late' branch.

I don't have anything against making calling of uname lazy.
I also don't have anything against returning useful information
rather than "unknown".

Your PR is missing tests, though, to support that it actually
returns the same values as before for a set of common platforms.

--

___
Python tracker 
<https://bugs.python.org/issue35967>
___



[issue36297] Remove unicode_internal codec

2019-03-15 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 15.03.2019 17:55, Serhiy Storchaka wrote:
> Is it for debugging only?

No, you can use it to store Unicode objects as-is without any
encoding/decoding, but after the recent changes to the internals
of the Unicode implementation it's not all that useful anymore,
since we now have per object state which is not reflected by the
codec.

--

___
Python tracker 
<https://bugs.python.org/issue36297>
___



[issue36297] Remove unicode_internal codec

2019-03-15 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 15.03.2019 17:35, Serhiy Storchaka wrote:
> 
> What is the purpose of the unicode-internal codec at first place?

It provides a fast and direct access to the internal representation of
Unicode used in Python to the outside world.

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue36297>
___



[issue35967] Better platform.processor support

2019-03-08 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 08.03.2019 18:00, Jason R. Coombs wrote:
> 
>> Perhaps adding a more capable API to interface to /proc/cpuinfo
> would be a good idea.
> 
> The core concern I want to address is that it's not possible to use any 
> function in the platform module without invoking "uname -p", and thus it's 
> not possible to implement "uname" in Python. No amount of supplementary 
> interfaces will help with that.

I don't know where you get that idea from. The uname family of APIs
do use "uname -p" on platforms where this exists, but the other
ones don't.

It's also easy to bypass that by simply seeding the global cache
for uname(): _uname_cache. Or you could call your utility
something else. Or you could monkey-patch the platform module
in your utility to work around the circular reference.

To be clear: I do not consider your use case to be particularly common
enough to warrant changes to the module, but would welcome additions
which bring more or better functionality to the module, e.g. having
the processor variable return a meaningful value where it previously did
not (i.e. uname() returning '' for the processor entry), or adding
another API to provide more detailed information.

--

___
Python tracker 
<https://bugs.python.org/issue35967>
___



[issue35967] Better platform.processor support

2019-03-08 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Jason: StackExchange does have lots of good hints, but they are not always
correct. In this case, the hint is clearly wrong. uname -p has been
available on many Unix installations for decades.

I started writing the module back in 1999 and even then, the support
was already working on the systems I used at the time, and several
others, as you can see from this page:

https://www.egenix.com/www2002/python/mxCGIPython.html

The module was originally created to come up with a good name to
use for identifying platform binaries coming out of my mxCGIPython
project.

Note that the processor is not always needed to determine whether
software runs on a machine or not. The "uname -m" output often
is enough, but there are cases where e.g. compiler options are
used which produce code that only works on particular processors.

Perhaps adding a more capable API to interface to /proc/cpuinfo
would be a good idea.

--

___
Python tracker 
<https://bugs.python.org/issue35967>
___



[issue35967] Better platform.processor support

2019-03-08 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Thanks. It would be good to do some before/after tests on popular
platforms, e.g. a few Linuxes, MacOS, Windows.

--

___
Python tracker 
<https://bugs.python.org/issue35967>
___



[issue35967] Better platform.processor support

2019-03-08 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

As the documentation says, the API is intended as a fairly portable 
implementation of the Unix uname helper across platforms. It's fine to redirect 
this directly to e.g. /proc output instead of using the executable, but 
whatever you do here, the output of platform.uname() needs to stay compatible 
with what the function returned prior to such a change, which usually means: 
with the output of the uname helper on a system.

Could you please check that the output remains the same on most systems?
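
One way to run such a check on a POSIX system is sketched below. It pairs each field of platform.uname() with the corresponding os.uname() field, which reflects what the uname helper reports. The helper name uname_field_pairs is made up for this illustration, and the processor field is left out because os.uname() has no counterpart for it.

```python
import os
import platform


def uname_field_pairs():
    """Pair each platform.uname() field with the matching os.uname() field.

    Returns an empty dict where os.uname() is unavailable (e.g. Windows).
    """
    if not hasattr(os, "uname"):
        return {}
    p = platform.uname()
    o = os.uname()
    return {
        "system": (p.system, o.sysname),
        "node": (p.node, o.nodename),
        "release": (p.release, o.release),
        "version": (p.version, o.version),
        "machine": (p.machine, o.machine),
    }


# Print one line per field so before/after runs can be diffed.
for field, (from_platform, from_os) in uname_field_pairs().items():
    marker = "ok" if from_platform == from_os else "DIFFERS"
    print(f"{field:8} {marker:8} {from_platform!r} {from_os!r}")
```

Saving this output before and after a change, on each platform of interest, gives a simple regression check.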

Thanks.

--

___
Python tracker 
<https://bugs.python.org/issue35967>
___



[issue35616] Change references to '4.0'.

2019-01-05 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Why not change the wording to read "... will be considered for removal in the 
next major Python release"?

Note that removal of Py_UNICODE APIs will not only break compatibility with 
Python 2, but also with the early Python 3 releases.

And please also consider that we may see another change in the Unicode 
implementation... I've heard discussions about using UTF-8 as internal 
representation to address the issues with the current unified approach.

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue35616>
___



[issue35544] unicode.encode docstring says return value can be unicode

2018-12-20 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

You can install any codec you like and those essentially decide
on what to return as type. However, the unicode methods only
allow strings or unicode to be returned in Python 2.
In Python 3, .encode() only allows bytes.

You can still get the full codec encode/decode functionality
via the codecs encode/decode methods in Python 3.
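
A small Python 3 illustration of that difference, using the rot13 and base64 codecs that ship with the standard library:

```python
import codecs

# str.encode() only accepts text encodings and always returns bytes.
utf8_bytes = "hello".encode("utf-8")
print(type(utf8_bytes).__name__)

# A str-to-str codec such as rot13 is rejected by str.encode() ...
try:
    "hello".encode("rot13")
except LookupError as exc:
    print("str.encode refused:", exc)

# ... but is still reachable through the functions in the codecs module.
rot13_text = codecs.encode("hello", "rot13")        # str -> str
decoded_bytes = codecs.decode(b"aGVsbG8=\n", "base64")  # bytes -> bytes
print(rot13_text, decoded_bytes)
```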

-- 
Marc-Andre Lemburg
eGenix.com


--

___
Python tracker 
<https://bugs.python.org/issue35544>
___



[issue35348] Problems with handling the file command output in platform.architecture()

2018-12-17 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Guys, please read the doc-string of the platform.architecture() function (or 
ask the person who wrote most of the module). It clearly refers to inspecting a 
specific executable and only uses the Python interpreter as default.

The running process can provide some sane defaults, but is not necessarily 
using the same values as the given executable.

The function does not support multi-architecture executables. This is simply 
out of scope for the function.
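
To make the distinction concrete, here is a short sketch that queries the executable file and the running process separately. Using struct.calcsize("P") for the process is just one common way to get the pointer size, not something platform.architecture() promises to match.

```python
import platform
import struct
import sys

# Inspects an executable file; sys.executable is only the default argument.
bits, linkage = platform.architecture()
print("interpreter binary:", bits, linkage)

# Pointer size of the *running* process, independent of any file on disk.
process_bits = struct.calcsize("P") * 8
print("running process:", f"{process_bits}bit")

# Any other executable can be inspected by passing its path explicitly.
print(platform.architecture(sys.executable))
```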

Victor: AFAIK, I still own this module, so if you want to deprecate something, 
please ping me first.

--

___
Python tracker 
<https://bugs.python.org/issue35348>
___



[issue35389] Use gnu_get_libc_version() in platform.libc_ver()?

2018-12-05 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Nice. I never liked the "parse the executable approach", but there wasn't 
anything better available at the time.

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue35389>
___



[issue35346] Modernize Lib/platform.py code

2018-11-29 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Ok, let me add some history here:

When I created the platform module it was clear that this would be
a module which will frequently need updates, since platforms evolve
faster than Python does.

I had developed this with a larger number of contributors outside
the stdlib for a while and then there was a request to add it to the
stdlib.

Now in order to keep the module more or less up-to-date, it still
required regular updates, so the plan was to have it updated in the
current versions of Python, but allow it to be used in older Python
versions as well. That was the compromise to have it in the stdlib
and not external. Otherwise, I would not have added it to the stdlib.

This is why it has a special status and keeps backwards compatibility
much longer than other code in the stdlib.

This worked quite well, but for some systems such as the Linux
distros, it was impossible to keep up with the development in that
mode. Well, actually, there were multiple reasons why this part
failed: 1. Linux distros did not have a standard when I added
the code, 2. Then some distros started two or three different ones,
3. Distros started to use multiple standards with conflicting data,
4. New distros became popular more often than we could update the
code.

That's why I was fine with removing the code again and leaving this
part to a PyPI package.

Does it make more sense now ?

--

___
Python tracker 
<https://bugs.python.org/issue35346>
___



[issue35346] Modernize Lib/platform.py code

2018-11-29 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Please keep Python 2.7 compatibility. It should be possible to copy the module 
back into Python 2.7 and use it there. This is not hard to do and allows it to 
fulfill its purpose as a platform detection module even while part of the stdlib.

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue35346>
___



[issue31873] Inconsistent capitalization of proper noun - Unicode.

2018-11-04 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

If you want to do this correctly, you have to check each case:

* if "unicode object" refers to a C PyUnicode object, it's probably better to 
use "PyUnicode object"
* if "unicode object" refers to a C PyObject object, with type "unicode", it's 
probably better to leave it as is
* if "unicode object" refers to a Python unicode object, it's probably better 
to call it "Unicode string object" or just "string object" in Python 3
* if "unicode object" does not indicate whether Python or C is meant, "Unicode 
object" is probably better

--

___
Python tracker 
<https://bugs.python.org/issue31873>
___



[issue25416] Add encoding aliases from the (HTML5) Encoding Standard

2018-11-02 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Please note that we can only add aliases if the encodings are indeed the same. 
Given that WhatWG has made changes to several standard encodings, this is 
especially important, since our codecs are mostly based on what the Unicode 
consortium defines as these encodings.

Tests for aliases can be minimal: just verify that the codecs subsystem detects 
them and results in the correct codec being used. There's no need to download 
any WhatWG specs for this.
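
A minimal version of such a test could look like the sketch below. The helper resolves_to is hypothetical; codecs.lookup() and the .name attribute of the returned CodecInfo are the standard library API.

```python
import codecs


def resolves_to(alias, canonical_name):
    """Return True if `alias` is found and maps to the expected codec."""
    try:
        return codecs.lookup(alias).name == canonical_name
    except LookupError:
        return False


print(resolves_to("latin1", "iso8859-1"))
print(resolves_to("utf8", "utf-8"))
print(resolves_to("no-such-alias", "utf-8"))
```

This only verifies the alias mapping. Whether the aliased encoding really is identical to the WhatWG definition still has to be checked by hand before the alias is added.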

--

___
Python tracker 
<https://bugs.python.org/issue25416>
___



[issue22232] str.splitlines splitting on non-\r\n characters

2018-10-05 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Why not simply add a new parameter, so that people who want
ASCII-only line breaks can continue to use .splitlines()?

I think it would be less than ideal to have one method break on
all Unicode line breaks and another only on ASCII ones.

--

___
Python tracker 
<https://bugs.python.org/issue22232>
___



[issue18291] codecs.open interprets FS, RS, GS as line ends

2018-10-05 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 05.10.2018 14:06, Serhiy Storchaka wrote:
> 
> Then this particularity of codecs streams should be explicitly documented.

Yes, probably. Such extensions of scope for different character
types in Unicode vs. ASCII are a common gotcha when moving from
Python 2 to 3. The same applies to e.g. upper/lower
case conversion, conversion to numeric values, the various .is*()
methods, etc.

> codecs.open() was advertised as a way of writing portable code for Python 2 
> and 3, and it can still be used in many old programs.

AFAIR, we changed this to recommend io.open() instead,
after the io module was rewritten in C.

Before that we did indeed advertise codecs.open() as a way to
write code which produces Unicode in a similar way as io does
in Python 3 (they were never fully identical, though).

--

___
Python tracker 
<https://bugs.python.org/issue18291>
___



[issue18291] codecs.open interprets FS, RS, GS as line ends

2018-10-05 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Sorry, I probably wasn't clear: the codecs interface is a direct 
interface to the Unicode codecs and thus has to work according to 
what Unicode defines.

Your PR changes this to be non-compliant and does this for all codecs.
That's a major backwards and Unicode incompatible change and I'm -1
on such a change for the stated reasons.

If people want to have ASCII only line break handling, they should
use the io module, which only uses the codecs and can apply different
logic (as it does).
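
The difference can be seen with a small sketch that writes one line containing U+001C (FS) to a temporary file and reads it back through both interfaces. The file contents are made up for the illustration.

```python
import codecs
import io
import os
import tempfile
import warnings

# One line of text containing U+001C (FS), terminated by an ASCII newline.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"a\x1cb\n")
    path = tmp.name

try:
    with warnings.catch_warnings():
        # Recent Python versions may flag codecs.open() as deprecated.
        warnings.simplefilter("ignore", DeprecationWarning)
        with codecs.open(path, encoding="utf-8") as f:
            codecs_lines = f.readlines()  # splits on Unicode line breaks

    with io.open(path, encoding="utf-8") as f:
        io_lines = f.readlines()  # splits on \n, \r and \r\n only
finally:
    os.remove(path)

print(codecs_lines)
print(io_lines)
```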

Please note that many file formats were not defined for Unicode,
and it's only natural that using Unicode codecs on them will
result in some differences compared to the ASCII world. Line breaks
are one of those differences, but there are plenty others as well,
e.g. potentially breaking combining characters or bidi sections,
different ideas about upper and lower case handling, different
interpretations of control characters, etc.

The approach to this has to be left with the applications dealing
with these formats. The stdlib has to stick to standards and
clear documentation.

--

___
Python tracker 
<https://bugs.python.org/issue18291>
___



[issue22232] str.splitlines splitting on non-\r\n characters

2018-10-05 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

I am -1 on changing the default behavior. The Unicode standard defines what a 
linebreak code point is (all code points with character properties Zl or 
bidirectional property B) and we adhere to that. This may confuse parsers 
coming from the ASCII world, but that's really a problem with those parsers 
assuming that .splitlines() only splits on ASCII line breaks, i.e. they are not 
written in a Unicode compatible way.

As mentioned in https://bugs.python.org/issue18291 we could add a parameter to 
.splitlines(), but this would render the method not much faster than re.split().

Using re.split() is not a work-around in this case, it's an explicit way of 
defining the characters you want to split lines on, if the standard defining 
your file format only accepts ASCII line break characters.

Since there are many such file formats, perhaps adding a parameter 
asciionly=True/False would make sense. .splitlines() could then be made to only 
split on ASCII linebreak characters. This new parameter would then have to 
default to False to maintain compatibility with Unicode and all previous 
releases.
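
For comparison, this is roughly what the explicit re.split() variant looks like next to .splitlines(). The pattern name ASCII_LINEBREAKS and the sample text are made up for the illustration; the asciionly parameter discussed above does not exist.

```python
import re

# CRLF is listed first so that it counts as a single line break.
ASCII_LINEBREAKS = re.compile(r"\r\n|\r|\n")

text = "a\x1cb\nc\r\nd\u2028e"

unicode_lines = text.splitlines()            # also splits on FS and U+2028
ascii_lines = ASCII_LINEBREAKS.split(text)   # splits on CR, LF, CRLF only

print(unicode_lines)
print(ascii_lines)
```

One difference to keep in mind: re.split() returns a trailing empty string when the text ends with a line break, whereas .splitlines() does not.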

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue22232>
___



[issue18291] codecs.open interprets FS, RS, GS as line ends

2018-10-05 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

The Unicode .splitlines() splits strings on what Unicode defines as linebreak 
characters (all code points with character properties Zl or bidirectional 
property B).

This is different from what typical CSV file parsers or other parsers built for 
ASCII text files treat as newline. They usually only break on CR, CRLF, LF, 
so the use of .splitlines() in this context is wrong, not the method itself.

It may make sense to extend .splitlines() to accept a list of linebreak 
characters to break on, but that would make it a lot slower and the same can 
already be had by using re.split() on Unicode strings.

Closing this as won't fix.

--
resolution:  -> wont fix
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue18291>
___



[issue34538] Remove encouragement to author a base class for all Exception subclasses in a module

2018-10-01 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Just as extra data point:

It is fairly common to have a shared exception class which is then used as a 
mixin class together with the standard exception classes, so that you can indeed 
identify the source of an exception and catch errors based on the source (e.g. 
say you want to catch database errors coming from MySQL specifically).

The Python DB-API also requires creating a separate hierarchy for this purpose.

Overall, I wouldn't call this a non-best practice. It depends on the use case, 
whether it's useful or not.
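
A sketch of the mixin pattern, with made-up class and function names (ExampleDBError, ExampleDBValueError and parse_port are not taken from any real package):

```python
class ExampleDBError(Exception):
    """Hypothetical marker class: every error raised by this package."""


class ExampleDBValueError(ExampleDBError, ValueError):
    """Also a ValueError, so generic handlers keep working."""


def parse_port(text):
    """Convert text to an int, re-raising failures as a package error."""
    try:
        return int(text)
    except ValueError as exc:
        raise ExampleDBValueError(f"invalid port: {text!r}") from exc


# The same error can be caught by its source or by its standard type.
for handler_type in (ExampleDBError, ValueError):
    try:
        parse_port("not-a-number")
    except handler_type as exc:
        print(f"caught via {handler_type.__name__}: {exc}")
```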

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue34538>
___



[issue34763] Python lacks 0x4E17

2018-09-21 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

We use the Unicode database for these methods. Could you please check whether 
the database marks the character as numeric ?

If yes, we may need to check the database generation.

Otherwise, there isn't much we can do, since we use the Unicode database as 
reference.
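
A quick way to see what the bundled database says about a character is sketched below. The helper numeric_info is made up; what it reports for U+4E17 depends on the Unicode database version shipped with the interpreter, so no result is claimed for it here.

```python
import unicodedata


def numeric_info(ch):
    """Return (isnumeric, numeric value or None) for one character."""
    return ch.isnumeric(), unicodedata.numeric(ch, None)


# Version of the Unicode database this interpreter was built with.
print(unicodedata.unidata_version)

for ch in ("5", "\u00bd", "\u4e09", "\u4e17", "A"):
    print(f"U+{ord(ch):04X}", numeric_info(ch))
```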

Thanks
-- 
Marc-Andre Lemburg

Sent from my phone. 
See http://www.egenix.com/company/ for contact information
and impressum.

On 21 September 2018 18:38:05 GMT+02:00, Serhiy Storchaka 
 wrote:
> 
> Change by Serhiy Storchaka :
> 
> 
> --
> nosy: +lemburg
> 
> ___
> Python tracker 
> <https://bugs.python.org/issue34763>
> ___

--

___
Python tracker 
<https://bugs.python.org/issue34763>
___



[issue26544] platform.libc_ver() returns incorrect version number

2018-08-21 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

I added a comment to the PR, but other than that I think it's good to go.

--

___
Python tracker 
<https://bugs.python.org/issue26544>
___



[issue33598] ActiveState Recipes links in docs, and the apparent closure of Recipes

2018-05-25 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

I'd suggest to contact ActiveState first before jumping to conclusions.

--
nosy: +lemburg

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33598>
___



[issue28167] remove platform.linux_distribution()

2018-05-16 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

Hi Petr, I'm fine with this. Maintaining the necessary logic is not 
really possible in the stdlib. It's better to have a PyPI module for this which 
can be updated much more easily.

Thanks.

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue28167>
___



[issue20087] Mismatch between glibc and X11 locale.alias

2018-05-06 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

Thanks, Serhiy.

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue20087>
___



[issue33278] libexpat uses HAVE_SYSCALL_GETRANDOM instead of HAVE_GETRANDOM_SYSCALL

2018-04-14 Thread Marc-Andre Lemburg

New submission from Marc-Andre Lemburg <m...@egenix.com>:

See https://github.com/python/cpython/blob/3.6/Modules/expat/xmlparse.c#L87

The Python configure script tests and sets the variable HAVE_GETRANDOM_SYSCALL.

The solution would be to have Python's config script define 
HAVE_SYSCALL_GETRANDOM as well, in case it detects the function.

--
components: XML
messages: 315289
nosy: lemburg
priority: normal
severity: normal
status: open
title: libexpat uses HAVE_SYSCALL_GETRANDOM instead of HAVE_GETRANDOM_SYSCALL
versions: Python 3.6, Python 3.7, Python 3.8

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33278>
___



[issue32846] Deletion of large sets of strings is extra slow

2018-02-17 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

Reminds me of some experiments someone did a while ago as part of the
GIL removal attempts where the ref count integers are all kept in a
separate array. The intent there was to be able to do locking on
a single array rather than on individual decref cells.

This would solve the issue with having to jump around in memory
to decref all objects, but I'm not sure whether the overall win
would be a lot, since deallocation of the memory blocks typically
requires accessing the block itself as well (to update the block
chain list pointers), unless the memory allocator uses some
smart cache local block management as well (I believe that pymalloc
does, but could be wrong).

In any case, this sounds like a fun experiment for a GSoC student.
Perhaps the PSF could donate an AWS EC2 instance with enough RAM to
do the experiments.

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32846>
___



[issue32516] Add a shared library mechanism for win32

2018-01-17 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

On 17.01.2018 02:52, xoviat wrote:
> 
> xoviat <xov...@gmail.com> added the comment:
> 
> For the record, moving the DLL path manipulation code into the interpreter 
> would address the concern that importing a module would not manipulate the 
> search path because the behavior would move into Python itself.

Can't you simply place the DLLs into the PythonXX\DLLs\ directory ?

That's where Python itself keeps external DLLs (and several PYDs)
and it won't change after installation of Python.

Or create a special container package on PyPI into which you place
the DLLs and add dependencies to this in all other packages.

You can then load the DLL via win32 LoadLibrary either using the
Python win32 tools or ctypes:

https://docs.python.org/3.7/library/ctypes.html
http://timgolden.me.uk/pywin32-docs/win32api__LoadLibrary_meth.html
https://www.programcreek.com/python/example/51388/win32api.LoadLibrary
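
With ctypes, the explicit load could be sketched roughly as follows. The helper names and the file name example.dll are hypothetical; the assumption is that the library ships in the same directory as the package's modules.

```python
import ctypes
import os
import sys


def bundled_library_path(package_file, library_name):
    """Build the path of a library shipped next to a package's modules."""
    package_dir = os.path.dirname(os.path.abspath(package_file))
    return os.path.join(package_dir, library_name)


def load_bundled_library(package_file, library_name):
    """Load the library by full path so no search path is consulted."""
    path = bundled_library_path(package_file, library_name)
    if sys.platform == "win32":
        return ctypes.WinDLL(path)  # goes through the Win32 loader
    return ctypes.CDLL(path)  # goes through dlopen() on POSIX systems


# Typical call from a package's __init__.py, before importing the extension:
# _handle = load_bundled_library(__file__, "example.dll")
```

Loading by absolute path keeps the process-wide DLL search path untouched, which was the concern raised in this ticket.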

FWIW: I think this ticket has shown plenty of options for possible
solutions, including many which do not manipulate the path.

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32516>
___



[issue32516] Add a shared library mechanism for win32

2018-01-16 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

Probably better overall to go with a conda package which puts
the DLLs in a central location and manages the dependencies.

You can then load the DLL in the package before loading the PYD
and you're all set. Whether in an __init__.py or elsewhere is
really up to the package.

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32516>
___



[issue31900] localeconv() should decode numeric fields from LC_NUMERIC encoding, not from LC_CTYPE encoding

2018-01-15 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

Sounds like a good compromise :-)

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31900>
___



[issue31900] localeconv() should decode numeric fields from LC_NUMERIC encoding, not from LC_CTYPE encoding

2018-01-15 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

Indeed. The major problem with all libc locale functions is that they are not 
thread safe. The GIL does help a bit protecting against corrupted data, though.

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31900>
___



[issue31900] localeconv() should decode numeric fields from LC_NUMERIC encoding, not from LC_CTYPE encoding

2018-01-15 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

Ok, it seems that the C setlocale() itself does not follow the conventions set 
forth for environment variables:

http://pubs.opengroup.org/onlinepubs/7908799/xsh/setlocale.html

(see the example at the bottom)

So the behavior shown by Python's setlocale() is fine.

However, that still doesn't magically make this work:

locale.setlocale(locale.LC_ALL, 'C.UTF-8')
locale.setlocale(locale.LC_NUMERIC, 'fr_FR.ISO8859-1')

If LC_NUMERIC uses a different encoding than LC_ALL, there's really no surprise 
in having numeric formatting fail. localeconv() will output the set encoding 
for the numeric string conversion and Python will decode this using the locale 
encoding set by LC_ALL. If those two are different, you run into problems.

I would not consider this a bug in Python, but rather in the locale settings 
passed to setlocale().
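
An application that wants to guard against this situation could compare the encodings of the two categories before relying on localeconv(). The sketch below uses made-up helper names and assumes that locale.getlocale() can parse the active locale names; it is demonstrated with the "C" locale only, since other locales may not be installed.

```python
import locale


def category_encodings():
    """Return the encodings currently set for LC_CTYPE and LC_NUMERIC."""
    ctype_encoding = locale.getlocale(locale.LC_CTYPE)[1]
    numeric_encoding = locale.getlocale(locale.LC_NUMERIC)[1]
    return ctype_encoding, numeric_encoding


def numeric_decoding_is_safe():
    """True when both categories report the same encoding."""
    ctype_encoding, numeric_encoding = category_encodings()
    return ctype_encoding == numeric_encoding


locale.setlocale(locale.LC_ALL, "C")
print(category_encodings(), numeric_decoding_is_safe())
print(repr(locale.localeconv()["decimal_point"]))
```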

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31900>
___



[issue31900] localeconv() should decode numeric fields from LC_NUMERIC encoding, not from LC_CTYPE encoding

2018-01-15 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

I just wanted to note that the description and title may cause a wrong 
interpretation of what should happen:

If you first set LC_ALL and then one of the other categories such as 
LC_NUMERIC, locale C functions will still use the LC_ALL setting for 
everything. LC_NUMERIC does not override the LC_ALL setting.

I tested this on OpenSUSE and get the same wrong results. Apparently, 
locale.localeconv() does not respect the above order. That's a bug.

I'm not sure whether the OP's quoted behavior is a bug, though, since if the 
locale encoding is not UTF-8, you cannot really expect using UTF-8 numeric 
separators to output correctly.

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31900>
___



[issue32516] Add a shared library mechanism for win32

2018-01-15 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

From experience with doing something similar in egenix-pyopenssl, I recommend 
putting the DLLs into the same directory as the PYD file on Windows. If you 
want to be extra safe, you can explicitly load the DLL, but normally this is 
not needed.

On Linux and other OSes, it's best to dlopen() to explicitly load the lib, 
since rpath and OS search paths are not always reliable.

--
nosy: +lemburg

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32516>
___



[issue31900] localeconv() should decode numeric fields from LC_NUMERIC encoding, not from LC_CTYPE encoding

2018-01-15 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

Just FYI: LC_ALL has precedence over all other more specific LC_* settings:

http://pubs.opengroup.org/onlinepubs/7908799/xbd/envvar.html
http://man7.org/linux/man-pages/man7/locale.7.html

Please confirm the bug without having LC_ALL or LANG set. Thanks.

--
nosy: +lemburg

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31900>
___



[issue31895] Native hijri calendar support

2017-10-30 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

There are other PyPI packages available for this specific calendar as well, 
e.g. https://pypi.python.org/pypi/umalqurra/

Perhaps you could send Neil a PR to make the calculation more accurate ?!

In any case, the stdlib is not meant to cover everything, only a basic subset 
of functionality, so adding support for more than just one calendar is out of 
scope.

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31895>
___



[issue31895] Native hijri calendar support

2017-10-30 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

I agree with Steven: It's best to use a PyPI package for calendar support such 
as https://pypi.python.org/pypi/convertdate/.

We only have the standard Gregorian calendar support in the datetime and 
calendar modules.

--
nosy: +lemburg

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31895>
___



[issue31803] time.clock() should emit a DeprecationWarning

2017-10-26 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

On 26.10.2017 13:46, Serhiy Storchaka wrote:
> 
> It is very bad, that the function with such attractive name has different 
> meaning on Windows and Unix. I'm sure that virtually all uses of clock() are 
> broken because its behavior on other platform than used by the author of the 
> code.

Not really, no. People who write cross-platform code are well
aware of the differences of time.clock() and people who just
write for one platform know how the libc function of the same
name works on their platform.

Unix: http://man7.org/linux/man-pages/man3/clock.3.html
Windows: https://msdn.microsoft.com/en-us/library/4e2ess30.aspx

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31803>
___



[issue31803] time.clock() should emit a DeprecationWarning

2017-10-25 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

On 25.10.2017 01:31, STINNER Victor wrote:
> 
> Marc-Andre: "Yes, to avoid yet another Python 2/3 difference. It should be 
> replaced with the appropriate variant on Windows and non-Windows platforms. 
> From Serhiy's response that's time.process_time() on non-Windows platforms 
> and time.perf_counter() on Windows."
> 
> I don't understand why you mean by "replaced with". Do you mean modify the 
> implementation of the time.clock()?

What I meant is that time.clock() is replaced with the higher
accuracy timers corresponding to the current time.clock()
implementation on the various platforms in order to retain
backwards compatibility.

In other words:

import sys
import time

if sys.platform == 'win32':
    time.clock = time.perf_counter
else:
    time.clock = time.process_time

I know that time.clock() behaves differently on different platforms,
but this fact has been known for a long time and is being used by
Python code out there for timing purposes.
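
The difference between the two replacement clocks is easy to demonstrate. The sketch below uses a made-up helper, measure, and avoids time.clock() itself, which was later removed in Python 3.8.

```python
import time


def measure(clock, seconds=0.2):
    """Return how much time `clock` reports across a sleep."""
    start = clock()
    time.sleep(seconds)
    return clock() - start


wall = measure(time.perf_counter)  # includes the time spent sleeping
cpu = measure(time.process_time)   # CPU time only, so close to zero
print(f"perf_counter: {wall:.3f}s  process_time: {cpu:.3f}s")
```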

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31803>
___



[issue31803] time.clock() should emit a DeprecationWarning

2017-10-24 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

On 24.10.2017 11:23, STINNER Victor wrote:
> 
> Marc-Andre Lemburg: "Thanks for pointing that out. I didn't know."
> 
> Do you still think that we need to modify time.clock() rather than 
> deprecating it?

Yes, to avoid yet another Python 2/3 difference. It should be
replaced with the appropriate variant on Windows
and non-Windows platforms. From Serhiy's response that's
time.process_time() on non-Windows platforms and time.perf_counter()
on Windows.

The documentation can point to the new functions and recommend
these over time.clock().

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31803>
___



[issue31803] time.clock() should emit a DeprecationWarning

2017-10-22 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

On 22.10.2017 15:14, Serhiy Storchaka wrote:
> 
> Serhiy Storchaka <storchaka+cpyt...@gmail.com> added the comment:
> 
> On non-Windows platforms clock() returns the processor time, perf_counter() 
> does include time elapsed during sleep.
> 
> >>> import time
> >>> start = time.clock(); time.sleep(1); print(time.clock() - start)
> 9.7001374e-05
> >>> start = time.perf_counter(); time.sleep(1); print(time.perf_counter() - start)
> 1.000714950998372

Thanks for pointing that out. I didn't know.

Is there a different clock with similar accuracy we can use
to only count CPU time on Unix ?
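
A rough way to see the difference described in the quoted session is to time
a sleep with both clocks. This is only a sketch, assuming Python 3.3 or later
where time.process_time() and time.perf_counter() exist; the helper name
measure() is made up for this example.

```python
import time

def measure(clock, pause=0.05):
    """Return the delta reported by `clock` across a sleep of `pause` seconds."""
    start = clock()
    time.sleep(pause)
    return clock() - start

# perf_counter() includes time spent sleeping; process_time() counts only CPU time
wall = measure(time.perf_counter)
cpu = measure(time.process_time)
print("perf_counter delta:", wall)
print("process_time delta:", cpu)
```

On a typical system the process_time() delta should be far smaller than the
perf_counter() delta, since the process does almost no work while sleeping.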

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31803>
___



[issue31803] time.clock() should emit a DeprecationWarning

2017-10-22 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

On 20.10.2017 19:46, Serhiy Storchaka wrote:
> 
> This will break code that depends on the current behavior on non-Windows 
> platforms. And this will contradict the expectation of non-Windows 
> programmers. If we change the behavior of clock() on non-Windows platforms, it 
> should be done only after a period of emitting FutureWarning.

Could you explain which behavior is changed by this on non-Windows
platforms ?

time.clock() only switches to a more accurate clock. That's pretty
much it. Which clock it used on which platform was platform
dependent anyway, so there's no real change in behavior.

For benchmarking and other measurements, time.time() was recommended
on Unix and time.clock() on Windows. time.clock() never had good
resolution on Unix, so the situation only improves by using
a more accurate clock.

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31803>
___



[issue31803] time.clock() should emit a DeprecationWarning

2017-10-18 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

On 18.10.2017 11:45, STINNER Victor wrote:
> Marc-Andre, Ethan: What do you think of removing the deprecation warning from 
> the C code (my last commit), leaving the deprecation warning in the documentation, 
> and modifying time.clock() to become an alias to time.perf_counter()?
> 
> By alias, I really mean time.clock = time.perf_counter, so 
> time.clock.__name__ would say "perf_counter".

That's what I think would be a better solution, since the
absolute value of time.clock() is never used, only the difference.

If you then get better accuracy in that difference, things
can only get better, so this is not really a backwards compatibility
issue (nothing gets worse).

Not sure whether the function name would cause an incompatibility
issue. I doubt it, but if it does we could have time.clock()
as a function which then simply calls time.perf_counter().
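
The wrapper variant mentioned above could look roughly like the following.
This is a minimal sketch, not the code that went into CPython; the module-level
function clock() here is a hypothetical stand-in that keeps its own __name__
while delegating to time.perf_counter().

```python
import time

def clock():
    """Hypothetical stand-in for time.clock() that delegates to perf_counter()."""
    return time.perf_counter()

# Only the difference between two readings is meaningful
start = clock()
elapsed = clock() - start
print(clock.__name__, elapsed)
```

Unlike a plain alias (time.clock = time.perf_counter), the wrapper still
reports "clock" as its name, at the cost of one extra function call.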

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31803>
___



[issue31803] time.clock() should emit a DeprecationWarning

2017-10-17 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <m...@egenix.com> added the comment:

time.clock() is used in a lot of code. Why can't we simply replace the 
functionality with one of the other functions ?

The documentation certainly allows for such a change, since it pretty much just 
says that only the delta between two values has a meaning.

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31803>
___



[issue31530] Python 2.7 readahead feature of file objects is not thread safe

2017-09-20 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 20.09.2017 17:22, Guido van Rossum wrote:
> 
>> Why not simply document the fact that read ahead in Python 2.7
>> is not thread-safe and leave it at that ?
> 
> Program bugs should not crash the interpreter. (ctypes excepted.)

Ideally not, agreed :-)

--
title: [2.7] Python 2.7 readahead feature of file objects is not thread safe -> 
Python 2.7 readahead feature of file objects is not thread safe

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31530>
___



[issue31530] [2.7] Python 2.7 readahead feature of file objects is not thread safe

2017-09-20 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

Ah, didn't see Benjamin's patch: much better solution :-)

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31530>
___



[issue31530] [2.7] Python 2.7 readahead feature of file objects is not thread safe

2017-09-20 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

Why not simply document the fact that read ahead in Python 2.7
is not thread-safe and leave it at that ?

.next() and .readline() already don't work well together, so this
would just add one more case.

--
nosy: +lemburg

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31530>
___



[issue31332] Building modules by Clang with Microsoft CodeGen

2017-09-03 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

You will have to also create a new compiler class for the compiler. If this is 
more or less the same clang as used on Unix and MacOS, chances are high, the 
UnixCCompiler class already supports most of it. Only some changes related to 
paths may be necessary.

That said, standard CPython is compiled with VC++ so you will likely get better 
compatibility by compiling extension modules with the same compiler.

--
nosy: +lemburg

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue31332>
___



[issue30717] Add unicode grapheme cluster break algorithm

2017-08-03 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 03.08.2017 15:05, Guillaume Sanchez wrote:
> 
> Guillaume Sanchez added the comment:
> 
> I have a few criticisms of that proto-PEP
> 
> http://mail.python.org/pipermail/python-dev/2001-July/015938.html
> 
> In particular, the fact that all those functions return an index prevents any 
> state keeping.

If you want state keeping for iterating over multiple 
parts of the string, you can use an iterator.

The APIs were inspired by the standard string.find() APIs, that's
why they work on indexes and don't return Unicode strings. As
such, they serve a different use case than an iterator.

With the APIs, scanning would always start at the given index
in the string and move forward/backward to the start of the next
.
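
To illustrate the two API styles discussed here, the sketch below pairs an
index-based function in the spirit of string.find() with an iterator built on
top of it. The names next_boundary() and iter_clusters() are hypothetical and
are not taken from the proto-PEP, and the boundary rule is deliberately
simplified (a base character plus trailing combining marks), so this is not
the full Unicode grapheme cluster break algorithm.

```python
import unicodedata

def next_boundary(text, index):
    """Return the index where the cluster starting at `index` ends.

    Simplified rule (assumption): a cluster is one character plus any
    combining marks that follow it.
    """
    end = index + 1
    while end < len(text) and unicodedata.combining(text[end]):
        end += 1
    return end

def iter_clusters(text):
    """Iterator variant: keeps the scan position between steps."""
    index = 0
    while index < len(text):
        end = next_boundary(text, index)
        yield text[index:end]
        index = end

sample = "e\u0301a"  # 'e' + COMBINING ACUTE ACCENT, then 'a'
print(next_boundary(sample, 0))
print(list(iter_clusters(sample)))
```

The index-based function carries no state of its own, while the iterator
remembers where the previous cluster ended.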

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30717>
___



[issue27788] platform module's version number doesn't match its docstring

2017-03-24 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:


New changeset 6059ce45aa96f52fa0150e68ea655fbfdc25609a by Marc-Andre Lemburg 
(Matthias Bussonnier) in branch 'master':
bpo-27788 : synchronise platform.py version number (#246)
https://github.com/python/cpython/commit/6059ce45aa96f52fa0150e68ea655fbfdc25609a


--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27788>
___



[issue20087] Mismatch between glibc and X11 locale.alias

2017-03-17 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

The main purpose of the alias table is to support normalization and this is 
used for getdefaultencoding() which was created to be able to determine the 
default encoding based on what X.org uses as default without doing temporary 
setlocale() tricks.

Now, normalization also happens when passing a locale value to the underlying 
setlocale(), mainly to avoid many common bugs due to setlocale() being 
extremely picky about the locale value. A side effect of this is that 
normalization will also kick in to add the encoding in case no encoding is 
given in the parameter.
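
The normalization step can be tried out directly with locale.normalize(),
which looks names up in Python's alias table. A small sketch; the exact
results depend on the alias table shipped with the Python version in use, so
no output is shown here.

```python
import locale

# normalize() only consults Python's alias table; it does not call setlocale()
for name in ("de_DE", "en_US.utf8", "fr_FR@euro"):
    print(name, "->", locale.normalize(name))
```

For a name given without an encoding, such as "de_DE", the result should come
back with an encoding appended.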

Note that no normalization is necessary to simply activate the default 
locale configured on the system. In such a case, you'd run setlocale(LC_ALL, '') 
and get what's configured.

If you run the lib C setlocale() with a locale without encoding, the encoding 
used by the system depends entirely on what's configured on the system. The SUPPORTED 
file only gives a hint at what glibc thinks it should install by default, but 
any admin or distributor could change these settings simply by running 
localedef with some other encoding (charmap in locale speak).

I suppose that we could resolve some of the confusion by adding a parameter to 
disable this normalization in setlocale().

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20087>
___



[issue20087] Mismatch between glibc and X11 locale.alias

2017-03-10 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 10.03.2017 08:37, Benjamin Peterson wrote:
> 
> Do you believe this program should work?
> 
> import locale, os
> for l in open("/usr/share/i18n/SUPPORTED"):
>     alias, encoding = l.strip().split()
>     locale.setlocale(locale.LC_ALL, alias)
>     try:
>         enc = locale.getlocale()[1]
>     except ValueError:
>         continue # not in table
>     normalized = enc.replace("ISO", "ISO-"). \
>                      replace("_", "-"). \
>                      replace("euc", "EUC-"). \
>                      replace("big5", "big5-").upper()
>     assert normalized == locale.nl_langinfo(locale.CODESET)
> 
> After my change it does—the encoding returned from getlocale() is the one 
> actually being used by glibc. It fails dramatically on earlier versions of 
> Python (for example on the en_IN example from #29571.) I don't understand why 
> Python needs to editorialize whatever choices libc or the system 
> administrator has made.

Your program essentially tests what alias is configured
on your particular system. It will fail on older systems
(with a different or no version of SUPPORTED), it will fail on
systems that do not have all locales installed, it will
fail on systems that use the X.org aliases table as basis
rather than some list of supported locales of glibc, or
custom alias tables.

What we want in Python is a consistent mapping of aliases to locales
across all (Unix based) Python installations, just like what we
have for encoding aliases, and those mappings should be taken
from a supported alias database, not a list of default installations
on some glibc version.

Also note that a lot of these discussions are really academic,
since locales should always be specified with encoding.

While Unix gravitates to UTF-8 for all system related things,
users still use other encodings a lot for their daily operations,
as you can see in the X.org aliases file.

This is why defaulting to UTF-8 for locales (as e.g.
is done for many locales in the glibc default installs) is not
a good idea. Locales affect user work products. What's fine for
command line interfacing or piping is not necessarily
fine for e.g. documents created by users.

So to answer your question: No, I don't believe that SUPPORTED
has any authority for our purposes and thus don't think that
the program can be considered a valid test case.

The SUPPORTED file can serve as an extra resource for fixing bugs
in the table, but nothing more.

> Is getlocale() expected to return something different from the underlying C 
> locale?

getlocale() will return whatever is currently configured via
setlocale().

Of course, it can return something different from what some glibc
SUPPORTED lists as default installation encoding, if you don't provide
the encoding when using setlocale(), but it will always default
to the same locale and encoding on all platforms where you
run Python.

> In fact, why have this table at all instead of using nl_langinfo to return 
> the encoding for the current locale?

The table is meant to normalize locale names and enrich
them with default encodings from a well known database of
such aliases, where necessary. As mentioned above the locale setting
should ideally include the encoding as well, so that any such
guesses are not necessary.

Regarding nl_langinfo():

nl_langinfo() will only work if you have called
setlocale() already, since a process always starts up in
the C locale without this call.

If you don't have a problem with calling setlocale() for
testing the default locale settings (e.g. Python is not
embedded, you don't have other threads running, no
APIs which use locale information called yet, setlocale()
was already called to setup the locale, etc.),
you can use the approach taken by getpreferredencoding(),
which is to temporarily set the locale to the default.
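
The "temporarily set the locale" approach can be sketched as follows. This is
an illustration of the pattern only, not the actual implementation of
getpreferredencoding(); the function name default_codeset() is made up, and
the caveats listed above (threads, embedding) apply to it in full.

```python
import locale

def default_codeset():
    """Return the codeset of the user's default locale, or None if unavailable.

    Not thread-safe: setlocale() changes state for the whole process.
    """
    if not hasattr(locale, "nl_langinfo"):
        return None  # e.g. on Windows
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, "")  # switch to the user's default
        return locale.nl_langinfo(locale.CODESET)
    except locale.Error:
        return None  # the default locale is not installed
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # restore the old setting

print(default_codeset())
```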

Going forward, I think that the following changes make
sense:

* from ISO8859-1 to ISO8859-15 (the -15 version adds
  the Euro sign)

* casing changes e.g. 'zh_CN.gb2312' to 'zh_CN.GB2312'

* fixes which undo removal of modifiers such as
  'uz_uz@cyrillic' -> 'uz_UZ.UTF-8' to 'uz_UZ.UTF-8@cyrillic'

As for the other changes: please undo them and also
revert the unconditional use of glibc mappings overriding
the X.org ones, as mentioned earlier in the thread.

We can readd some of the modifications later on if there's
evidence that they actually do make sense.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20087>
___



[issue29783] Modify codecs.open() to use the io module instead of codecs.StreamReaderWriter()

2017-03-10 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 10.03.2017 15:17, STINNER Victor wrote:
> 
> The codecs.StreamReaderWriter() class still has old unfixed issues like the 
> issue #12508 (open since 2011). This issue is even seen as a security 
> vulnerability by the owasp-pysec project:
> https://github.com/ebranca/owasp-pysec/wiki/Unicode-string-silently-truncated

The issue should be fixed. Patches welcome :-)

The reason for the problem is the UTF-8 decoder (and other
decoders) expecting an extension to the codec decoder API,
which is not implemented in its StreamReader class (it simply
uses the base class). It's not a problem of the base class, but
that of the codec.

And no: it doesn't have anything to do with codecs.open()
or the StreamReaderWriter class.

> I propose to modify codecs.open() to reuse the io module: call io.open() with 
> newline=''. The io module is now battle-tested and handles well many corner 
> cases of incremental codecs with multibyte encodings.

-1. People who want to use the io module should use it directly.

> With this change, codecs.open() cannot be used with non-text encodings... but 
> I'm not sure that this feature ever worked in Python 3:
> 
> $ ./python -bb
> Python 3.7.0a0
> >>> import codecs
> >>> f = codecs.open('test', 'w', encoding='rot13')
> >>> f.write('hello')
> TypeError: a bytes-like object is required, not 'str'
> >>> f.write(b'hello')
> TypeError: a bytes-like object is required, not 'dict'

That's a bug in the rot13 codec, not a feature. codecs.open()
works just fine with 'hex' and 'base64'.
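
The stream layer of a bytes-to-bytes codec can be exercised without touching
the file system. The sketch below uses codecs.getwriter() on an in-memory
stream instead of codecs.open(), which is an assumption made to keep the
example self-contained; it is not the reproduction used in the report.

```python
import codecs
import io

# Wrap an in-memory binary stream with the writer of the bytes-to-bytes 'hex' codec
raw = io.BytesIO()
writer = codecs.getwriter("hex")(raw)
writer.write(b"hello")
print(raw.getvalue())
```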

> The next step would be to deprecate the codecs.StreamReaderWriter class and 
> the codecs.open(). But my latest attempt to deprecate them was the PEP 400 
> and it wasn't a full success, so I now prefer to move step by step :-)

I'm still -1 on the deprecations in PEP 400. You are essentially
suggesting to replace the complete codecs subsystem with the
io module, but forgetting that all codecs use StreamWriter and
StreamReader as base classes.

StreamReaderWriter is just an amalgamation of the two
classes StreamReader and StreamWriter, nothing more. It's
a completely harmless class in codecs.py.
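
As a small illustration of that amalgamation, a StreamReaderWriter can be
assembled by hand from a codec's reader and writer classes. A minimal sketch
using an in-memory stream (the io.BytesIO backing is an assumption chosen for
brevity):

```python
import codecs
import io

raw = io.BytesIO()
stream = codecs.StreamReaderWriter(
    raw, codecs.getreader("utf-8"), codecs.getwriter("utf-8")
)
stream.write("h\u00e9llo")  # encoded by the codec's StreamWriter
stream.seek(0)
text = stream.read()        # decoded by the codec's StreamReader
print(text)
```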

The codecs sub system has a clean design. If used correctly
and maintained with more care, it works really well. Trying
to rip things out won't make it better. Fixing implementations,
where the appropriate care was not applied, is a much better
strategy.

I'm tired of having to fight these fights every few years.
Can't we just stop having them, please ?

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29783>
___



[issue20087] Mismatch between glibc and X11 locale.alias

2017-03-09 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 09.03.2017 11:47, Serhiy Storchaka wrote:
> 
> The SUPPORTED file from glibc is used for determining the default encoding  
> for locales that don't include it explicitly. For example en_IN uses UTF-8 
> rather than ISO8859-1.

No, the glibc locales don't say anything about default encodings
used in a locale:

http://manpages.ubuntu.com/manpages/wily/en/man5/locale.5.html

These encodings are just used for determining the default
set of locale.encoding variants to install on the system,
nothing more:

https://github.com/bminor/glibc/blob/73dfd088936b9237599e4ab737c7ae2ea7d710e1/localedata/Makefile#L204

glibc does have a locale.alias file:

https://github.com/bminor/glibc/blob/73dfd088936b9237599e4ab737c7ae2ea7d710e1/intl/locale.alias

which uses the X.org format, but this is completely out of
date and declared obsolete.

Serhiy: If you believe that there's anything authoritative about
the glibc SUPPORTED file in terms of defining the commonly
used encoding in a locale, please provide references. These
should also clarify why the glibc encoding is the correct one
compared to the X.org mapping.

It doesn't help, trying to interpret things into such build
files. We need a database that is being actively maintained
and has a track record of representing what people actually
use in their locales. The only one I know is the X.org one.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20087>
___



[issue20087] Mismatch between glibc and X11 locale.alias

2017-03-09 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 09.03.2017 08:15, Benjamin Peterson wrote:
> 
> "eo_XX" is just something that appears in the X11 locale.alias file. My 
> change doesn't add that; it was already there. (for Esperanto, which I 
> suppose explains the "XX")

Yes, I know. That was an example of a bug in the X.org list.

> Most of the changes you identify are the glibc aliases taking precedence over the 
> X11 ones. e.g., glibc has "fi_FI ISO-8859-1" while the X11 locale list has 
> "fi_FI.ISO8859-15". That seems correct to me as far as the intent of this 
> change is concerned.

No, it's not correct. ISO-8859-1 is the older version of Latin-1
without the Euro sign. ISO8859-15 adds it.

> How do you propose to pick and choose what we use from the X11 locale alias 
> list?

We have to go through the list one by one to check whether
the mapping update makes sense and is correct.

This will be difficult in a few cases where the glibc mapping
switches to UTF-8 from an ISO encoding. We'll have to find
evidence that this change does indeed make sense.

My take on this is that the X.org folks know better than the
glibc folks, since the former have to deal with end users that
rely on the locale settings a lot more than applications
using glibc for getting an initial locale setting right.

Also note that you are parsing the SUPPORTED file from
glibc (in slightly processed form):

https://github.com/bminor/glibc/blob/master/localedata/SUPPORTED

This file does not provide a locale alias mapping as
the routine in makelocalealias.py suggests. Instead it's
a list of locales to install by default:

https://github.com/bminor/glibc/blob/73dfd088936b9237599e4ab737c7ae2ea7d710e1/localedata/Makefile

In glibc you can define both the locale and the encoding separately
when creating a locale using localedef and the file simply provides
the default parameters to pass to this tool.

As such, I don't see how you can derive a default alias
meaning from the file.

It's simply an indication of what glibc would have installed
in case it were installed from source, but that's hardly ever
the case. On today's systems only a bare subset of locales
is installed and more added as necessary, so you rarely have
all the locales defined in SUPPORTED installed on a system.

So the file doesn't even provide a hint at what could
be installed on the system ("locale -a" gives you that list).

Here's the history:

https://github.com/bminor/glibc/commits/master/localedata/SUPPORTED

It's merely a list of additions and removals from the
default set. Nothing more. It does provide a list of
known and supported locales, but no usable or authoritative
encoding information (locales are defined using Unicode, so
the encoding is a parameter and not predefined).

Overall, I believe the file is pretty useless to use as
basis for an alias table providing encoding information.
It may provide some ideas for corrections, but should not
override the X.org one by default.

On the other hand, you have the locale.alias master file:

https://cgit.freedesktop.org/xorg/lib/libX11/tree/nls/locale.alias.pre

together with the history of why changes were made and when.
This is an authoritative resource and people are making changes
against it from the user perspective.

I'd suggest making the override optional in makelocalealias.py
via a command line switch and using this for manually adding
or fixing X.org entries.

If you absolutely want to parse the glibc file by default as
well, please only let it add new entries, not override existing
ones. As we've seen in the patch, those overrides need to be
carefully reviewed.
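
The "add only, do not override" behavior could be expressed along these lines.
This is a sketch with plain dicts, not code from makelocalealias.py; the
function name merge_aliases() and its override flag are hypothetical, and the
sample entries are taken from the values discussed in this thread.

```python
def merge_aliases(xorg, glibc, override=False):
    """Combine two alias tables (plain dicts in this sketch).

    By default glibc entries only fill gaps; existing X.org entries win.
    With override=True the glibc values replace the X.org ones.
    """
    merged = dict(xorg)
    for name, target in glibc.items():
        if override or name not in merged:
            merged[name] = target
    return merged

xorg = {"fi_fi": "fi_FI.ISO8859-15"}
glibc = {"fi_fi": "fi_FI.ISO8859-1", "bhb_in": "bhb_IN.UTF-8"}
print(merge_aliases(xorg, glibc))
print(merge_aliases(xorg, glibc, override=True))
```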

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20087>
___



[issue20087] Mismatch between glibc and X11 locale.alias

2017-03-08 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

Why was the PR merged while we were still discussing it ?

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20087>
___



[issue20087] Mismatch between glibc and X11 locale.alias

2017-03-08 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 08.03.2017 10:37, Serhiy Storchaka wrote:
> 
> The problem is that that table can give incorrect results for non-Linux 
> platforms (or for Linux with old glibc).

Sure, it's a best effort approach.

Also note that on today's systems you often don't have the full set of
locales available anymore - instead these have to either be installed
separately or generated on the target system.

Our locale database works on all these systems, regardless of
what's installed or not.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20087>
___



[issue20087] Mismatch between glibc and X11 locale.alias

2017-03-08 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 08.03.2017 07:27, Benjamin Peterson wrote:
> 
> Why is the X11 locale alias map used at all? It seems like it can only create 
> confusion with libc.

Because it was the only such maintained mapping available at the
time. It's also used for the X.org system, which has a rather strong
focus on user interfaces where locales matter a lot, unlike
the lib C :-)

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20087>
___



[issue20087] Mismatch between glibc and X11 locale.alias

2017-03-08 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 08.03.2017 08:20, Serhiy Storchaka wrote:
> 
> Serhiy Storchaka added the comment:
> 
> Not all platforms use glibc 2.24 as libc.

True. Many don't even use glibc.

> Ideally most of entries should even not exist. We should ask libc for the 
> default encoding if it is not included in the locale name. The aliases table 
> should be used only for mapping commonly used but unsupported by libc locales 
> to supported by libc locales.

I think you have a wrong understanding of what this alias table
is used for: we need it to determine the lib C compatible locale
name without using lib C APIs such as setlocale(), since these are
not thread safe and have side-effects for the whole process.

The alias table is there to avoid having to go to the lib C
to ask it indirectly for more details. Unfortunately, there are
no cross-platform lib C APIs which would allow querying these
details without also changing the local settings of the process.

I know that Python still plays the usual "save current locale,
run setlocale(), revert to previous locale" trick in a couple
of places and this works if Python is the only thread running,
but it doesn't when embedded into other applications.

Regarding the patch: we cannot simply use the output from the
script to set new values. The changes have to be manually
reviewed as well.

E.g. this entry in the table is clearly a typo:

'en_zw.utf8':   'en_ZS.UTF-8',

(it should read en_ZW.UTF-8)

This entry appears wrong as well:

'eo':   'eo_XX.ISO8859-3',

(XX is not a valid country ISO code)
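
Part of such a review could be automated. The sketch below flags entries whose
territory code changes between alias and target, which would catch the
en_ZS typo; the function name suspicious_entries() is hypothetical. It would
not catch the eo_XX case, since the 'eo' alias carries no territory to compare
against, so a manual pass is still needed.

```python
def suspicious_entries(aliases):
    """Return alias names whose territory differs from the target's territory."""
    flagged = []
    for name, target in aliases.items():
        # Strip modifier and encoding, e.g. 'en_zw.utf8' -> 'en_zw'
        name_base = name.split("@")[0].split(".")[0]
        target_base = target.split("@")[0].split(".")[0]
        if "_" not in name_base or "_" not in target_base:
            continue  # no territory to compare
        if name_base.split("_")[1].lower() != target_base.split("_")[1].lower():
            flagged.append(name)
    return flagged

table = {"en_zw.utf8": "en_ZS.UTF-8", "de_de": "de_DE.ISO8859-1"}
print(suspicious_entries(table))
```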

How should we go about this ? Mark all the problems in the PR ?

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20087>
___



[issue20087] Mismatch between glibc and X11 locale.alias

2017-03-07 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 07.03.2017 18:23, Serhiy Storchaka wrote:
> 
> Serhiy Storchaka added the comment:
> 
>> 'cy_GB.ISO8859-1' to 'cy_GB.ISO8859-14'
> 
> Looks as just fixing an error. The default West-European ISO8859-1 is changed 
> to Celtic cy_GB.ISO8859-14. This looks better option for Welsh.
> 
>> 'tg_TJ.KOI8-C' to 'tg_TJ.KOI8-T'
> 
> KOI8-C is not supported by Python, but KOI8-T is supported. I don't know what 
> KOI8-C means, there are several rarely used incompatible encodings with this 
> name.

While all this may make sense, I'm missing some more reasoning
behind the differences between X.org and glibc.

This change also looks strange:

-'ka_ge':'ka_GE.GEORGIAN-ACADEMY',
+'ka_ge':'ka_GE.GEORGIAN_PS',
 'ka_ge.georgianacademy':'ka_GE.GEORGIAN-ACADEMY',
 'ka_ge.georgianps': 'ka_GE.GEORGIAN-PS',
 'ka_ge.georgianrs': 'ka_GE.GEORGIAN-ACADEMY',

Why is GEORGIAN_PS written with an underscore whereas the other
mappings use dashes ?

Or this one:

-'fi_fi':'fi_FI.ISO8859-15',
+'fi_fi':'fi_FI.ISO8859-1',

Why would a locale switch away from an encoding having
the Euro sign to one without it ?

Or why is this latin variant removed:

-'nan_tw@latin': 'nan_TW.UTF-8@latin',

Why should Russians switch back to ISO ?

-'ru_ru':'ru_RU.UTF-8',
+'ru_ru':'ru_RU.ISO8859-5',

or from ISO to KOI ?

-'russian':  'ru_RU.ISO8859-5',
+'russian':  'ru_RU.KOI8-R',

The more I look at these changes, the more I believe we
should not simply take everything we find in the files
for granted. They obviously both have bugs.

>> I also don't understand why some "xx.utf-8" locale mappings were removed - I 
>> don't think we should remove those, unless they are no longer needed due to 
>> some other logic implying these mappings.
> 
> The aliases table is a table of exceptions. Removed entries no longer are 
> exceptional.

It's not a table of exceptions, it's a table mapping commonly
used locale settings to ones which the lib C understands :-)

But regardless, I checked the code and it is already
smart enough to convert lib C incompatible spellings such
as "utf8" to "UTF-8", so these entries can indeed be
removed, but only if the locale is otherwise listed.

In some cases, it's probably better to drop the ".utf8"
to have more generic mappings, e.g.

+'bhb_in.utf8':  'bhb_IN.UTF-8',

or

 'de_li.utf8':   'de_LI.UTF-8',

though I'd expect that mapping to be:

 'de_li':   'de_LI.ISO8859-1',

as for all other "de" entries.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20087>
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue20087] Mismatch between glibc and X11 locale.alias

2017-03-07 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

I agree that it's reasonable to have glibc's aliases override
the X.org ones, but this patch makes some pretty significant changes to 
Python's default assumptions with respect to default encodings for several 
locales.

While some changes obviously make sense (e.g. 'ca_AD.ISO8859-1' to 
'ca_AD.ISO8859-15'), others are less clear (e.g. 'cy_GB.ISO8859-1' to 
'cy_GB.ISO8859-14' or 'tg_TJ.KOI8-C' to 'tg_TJ.KOI8-T' or several of the moves 
from ISO encodings to UTF-8). Is there some reference for why glibc chose 
different values than X.org for these ?

I also don't understand why some "xx.utf-8" locale mappings were removed - I 
don't think we should remove those, unless they are no longer needed due to some 
other logic implying these mappings.

Since these are major changes, we need an appropriate warning in the NEWS file 
(and the "What's New" document) and an update of the top comment (under "### 
Database") to mention that the glibc database takes precedence and where to 
find it.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20087>
___



[issue29724] Itertools docs propose a harmful “speedup” without any explanation

2017-03-05 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

The localization using keyword parameters is a very old trick to avoid global 
lookups. It does give a noticeable speedup, esp. when the localized variables 
are used in tight loops or the function itself is used in such loops.

The 5% speedup Steven measured matches my experience with this trick as well. 
In some cases, it can provide a more dramatic speedup, but this depends a lot 
on how the code is written.
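
A minimal sketch of the trick, assuming a loop that calls math.sqrt; the function names are made up for illustration and the actual speedup will vary with the code and the Python version:

```python
import math
import timeit

def total_global(values):
    # math is looked up as a global and sqrt as an attribute on every iteration.
    total = 0.0
    for v in values:
        total += math.sqrt(v)
    return total

def total_local(values, sqrt=math.sqrt):
    # sqrt is bound once, when the function is defined, and is then
    # accessed as a fast local variable inside the loop.
    total = 0.0
    for v in values:
        total += sqrt(v)
    return total

def compare(number=200):
    """Time both variants on the same data (results are machine dependent)."""
    data = list(range(10000))
    return {
        fn.__name__: timeit.timeit(lambda: fn(data), number=number)
        for fn in (total_global, total_local)
    }
```

The downside is that the extra keyword parameter becomes part of the function signature, so a caller can override it by accident.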

--
nosy: +lemburg

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29724>
___



[issue27788] platform module's version number doesn't match its docstring

2017-02-24 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

Hmm, not sure why the merge is not showing up on the ticket.

Here's the link: 
https://github.com/python/cpython/commit/6059ce45aa96f52fa0150e68ea655fbfdc25609a

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27788>
___



[issue27788] platform module's version number doesn't match its docstring

2017-02-24 Thread Marc-Andre Lemburg

Changes by Marc-Andre Lemburg <m...@egenix.com>:


--
assignee:  -> lemburg
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27788>
___



[issue27788] platform module's version number doesn't match its docstring

2017-02-24 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

The purpose of __version__ in the platform module is to allow the module to be 
used with other Python versions as well (applications can then detect which 
version of the module is available).

So I think it's good to keep it around.
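
As a rough illustration of that use case, an application could check the attribute defensively. The helper name is hypothetical, and the getattr() default is an assumption to cover copies of the module that do not define __version__:

```python
import platform

def platform_module_version():
    """Return platform.__version__ if the module defines it, else None."""
    return getattr(platform, "__version__", None)

print("platform module version:", platform_module_version())
```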

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27788>
___



[issue10735] platform.architecture() gives misleading results for OS X multi-architecture executables

2017-02-20 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

The term "linkage" is probably a misnomer... "execformat" would be more correct:

 * https://en.wikipedia.org/wiki/Comparison_of_executable_file_formats

Too late to change, I guess.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10735>
___



[issue10735] platform.architecture() gives misleading results for OS X multi-architecture executables

2017-02-20 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

I think there's a misunderstanding in what platform.architecture() is meant 
for. The purpose is to find out more details about the executable you pass to 
it, e.g. whether it's a 32-bit or 64-bit binary, or whether it's an ELF or PE 
binary. And it's a best effort API, just as most other platform APIs - this is 
also the reason why most of them have parameters available to modify the 
default return values.
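
For reference, a short sketch of how the API is called. The "unknown" defaults are an arbitrary choice for this example; they are only returned when the information cannot be determined:

```python
import platform
import sys

# Inspect the interpreter binary; the result is a (bits, linkage) tuple of strings.
bits, linkage = platform.architecture(sys.executable)
print("bits:", bits, "linkage:", repr(linkage))

# The bits/linkage parameters provide the values to fall back on when the
# executable cannot be analysed (best effort, so no guarantees either way).
bits_or_default, linkage_or_default = platform.architecture(
    sys.executable, bits="unknown", linkage="unknown"
)
```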

It doesn't work with multi-architecture executables. We'd need a new API for 
this.

Regarding returning multiple architectures in the linkage return value: I'm not 
sure whether that's a good idea. The architectures are not necessarily of 
different linkage types. In fact on Macs, the correct value is "Mach-O". The 
API should probably return this instead of the default empty string.

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10735>
___



[issue29605] platform.architecture() with Python2.7-32 misreports architecture on macOS.

2017-02-20 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

Ok, thanks for the clarification. So if I understand correctly, the main change 
in Python 3 is that the executable path points to the stub launcher, not the 
binary itself.

In any case, a new function would have to be added to the platform module to 
query multiple architectures available in a binary and probably another one to 
return the architecture that Python runs.
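
Until such a function exists, the pointer width of the running interpreter can be queried directly instead of inspecting the executable file. This is a sketch of that workaround, not the proposed new API, and the helper name is made up:

```python
import struct
import sys

def interpreter_bits():
    """Return the pointer width (in bits) of the running interpreter."""
    return struct.calcsize("P") * 8

# Equivalent check based on the largest value a Py_ssize_t can hold.
is_64bit = sys.maxsize > 2**32

print(interpreter_bits(), "bit interpreter; 64-bit:", is_64bit)
```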

--

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29605>
___



[issue29605] platform.architecture() with Python2.7-32 misreports architecture on macOS.

2017-02-20 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

Thanks for the report, but there really isn't much we can do, since the API is 
not geared up for handling executables which contain binaries for multiple 
architectures.

AFAIK, the Python 3 binaries available from python.org are no longer built as 
universal binaries, so the problem doesn't show with those.

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29605>
___



[issue29585] site.py imports relatively large `sysconfig` module.

2017-02-17 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

I don't think rewriting parts of site.py in C is a good idea. It's a rather 
maintenance-intensive module.

However, optimizing access is certainly something that's possible, e.g. by 
placing the few variables that are actually needed by site.py into a bootstrap 
module for sysconfig, which only contains the few variables needed by 
interpreter startup.

Alternatively, sysconfig data could be made available via a C lookup function, 
with the complete dictionary only being created on demand. get_config_var() 
already is such a lookup API which could be used as front-end.
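
For illustration, this is how the existing front-end is used today. The second variable name is deliberately made up to show the behaviour for unknown names; how the lookup is implemented underneath differs between Python versions:

```python
import sysconfig

# Look up a single variable by name instead of fetching the whole dictionary.
prefix = sysconfig.get_config_var("prefix")

# Unknown names return None rather than raising an exception.
missing = sysconfig.get_config_var("HYPOTHETICAL_UNDEFINED_VARIABLE")

print("prefix:", prefix)
print("missing:", missing)
```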

--
nosy: +lemburg

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29585>
___



[issue29580] "Built-in Functions" not being functions

2017-02-16 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

While "functions" may not be accurate anymore, they are all callables.

Historically, those callables were functions. Later on some of the built-ins 
were replaced with type objects.

Regarding your last comment: It is common in Python to write "func()" for 
callables. The "()" signals the callable property.
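
A quick check in the interpreter shows the distinction; the selection of built-ins here is arbitrary:

```python
import types

def describe(obj):
    """Return (is_callable, is_type, is_builtin_function) for an object."""
    return (
        callable(obj),
        isinstance(obj, type),
        isinstance(obj, types.BuiltinFunctionType),
    )

# len and print are real built-in functions; int and range are type objects,
# yet all four are listed under "Built-in Functions" and all are callable.
for obj in (len, print, int, range):
    print(obj.__name__, describe(obj))
```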

--
nosy: +lemburg

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29580>
___



[issue29574] python-3.6.0.tgz permissions borked

2017-02-16 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

Indeed, there are two issues with the .tgz file:

 * it uses "staff" as group for all files (which will likely exist on some 
systems), but this appears unrelated in your case
 * all subdirs have go-x set, which prevents changing into the dir if you're 
not in the staff group, which is more of an issue

In your case, the system does not seem to have a staff group, but the numeric 
IDs stored in the .tgz file map to another user/group. As a result, you don't 
get access.

When creating .tgz files for redistribution, it's usually better to explicitly 
set the owner and group to either something that's not likely to exist on 
target machines or to root.root via --owner=root --group=root.

As a user, you can work around this by using the options --no-same-owner 
--no-same-permissions when extracting the archive.
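
The same owner/group reset can be done when building an archive from Python with the tarfile module. This is a sketch under the assumption that the archive is created with tarfile rather than GNU tar; the function names and paths are hypothetical:

```python
import tarfile

def reset_owner(tarinfo):
    """Store every archive member as owned by root:root (uid/gid 0)."""
    tarinfo.uid = tarinfo.gid = 0
    tarinfo.uname = tarinfo.gname = "root"
    return tarinfo

def make_archive(archive_path, source_dir, arcname):
    """Create a gzip-compressed tarball of source_dir with neutral ownership."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # The filter is applied to each member before it is written.
        tar.add(source_dir, arcname=arcname, filter=reset_owner)
```

The filter only changes ownership; directory permission bits such as the go-x problem above would have to be adjusted separately via tarinfo.mode.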

--
nosy: +lemburg

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29574>
___


