[Python-ideas] Re: Adding pep8-casing-compliant aliases for the entire stdlib

2021-11-11 Thread Christian Heimes

On 11/11/2021 14.41, Matt del Valle wrote:

So the scope of my suggestion is as follows:

- lowercase types become PascalCase (e.g., `str` -> `Str`, 
`collections.defaultdict` -> `collections.DefaultDict`)


- lowercase attributes/functions/methods become snake_case (no changes 
for names that only contain a single word, so `str.lower()` would be 
unaffected, but `str.removeprefix()` would get the alias 
`str.remove_prefix()`)


- pep8 and the python docs are updated to state that the pep8-compliant 
forms of stdlib names should be strongly preferred over the legacy 
names, and that IDEs and linters should include (configurable?) weak 
warnings to discourage the use of legacy-cased stdlib names


- `help()` would be special-cased for builtin types to no longer display 
any current non-pep8-compliant names, and the python docs would also no 
longer show them, instead only making a note at the top of the page as 
with the `threading` module.



Given the horrors of the python 2.7 schism I don't think there's any 
rush to officially deprecate or remove the current non-pep8 names at 
all. I think that's the sort of thing that can happily and fully be 
kicked down the road.


If we add aliases and they see widespread adoption to the point where 
the non-pep8 forms are barely ever even seen out in the wild then maybe 
in 10 or 20 years time when the steering council is deliberating on a 
new major python version they can consider rolling the removal of legacy 
badly-cased names into it. And if not then no big deal.



Adding new APIs or replacing existing APIs for PEP 8 style-compliance 
would be a violation of PEP 8. The PEP 8 document states that 
consistency, readability, and backwards compatibility are more important 
than naming and style conventions.


By **not** applying PEP 8 style to existing code, CPython stays PEP 8 
compliant.


https://www.python.org/dev/peps/pep-0008/#a-foolish-consistency-is-the-hobgoblin-of-little-minds

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/QBNMSU6AI2UUZ5ZEFDM7SD5VOGZSZ27B/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Fwd: Simple curl/wget-like download functionality in urllib (like http offers server)

2021-10-19 Thread Christian Heimes

On 19/10/2021 00.06, Chris Angelico wrote:

On Tue, Oct 19, 2021 at 9:00 AM Cameron Simpson  wrote:

The problem with a "download()" method is that it is almost never what
you need. There are too many ways to want to do it, and one almost
_never_ wants to suck the download itself into memory as you do above
with read() because downloads are often large, sometimes very large.

You also don't always want to put it into a file.



OTOH, if you *do* want to put it into a file, it should be possible to
take advantage of zero-copy APIs to reduce unnecessary transfers. I'm
not sure if there's a way to do that with requests. Ideally, what you
want is os.sendfile() but it'd need to be cleanly wrapped by the
library itself.


Splicing APIs like sendfile() require a kernel socket. You cannot do 
sendfile() with userspace sockets like OpenSSL sockets, e.g. for HTTPS.


Recent Linux kernels and OpenSSL 3.0.0 have a new feature called kTLS. 
Kernel TLS uses OpenSSL to establish the TLS connection and then handles 
payload transfer in the kernel to enable zero-copy sendfile().
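
Independent of zero-copy tricks, the memory concern raised above can be
addressed by streaming the body to disk in chunks. A minimal sketch (the
URL, file name, and chunk size are placeholders, not a proposed API):

    import shutil
    import urllib.request

    url = "https://example.org/big-file.tar.gz"  # placeholder URL
    with urllib.request.urlopen(url) as response, \
            open("big-file.tar.gz", "wb") as out:
        # Copy in 64 KiB chunks instead of calling response.read() once
        # and buffering the whole payload in memory.
        shutil.copyfileobj(response, out, length=64 * 1024)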


Christian



___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/4IWZOPHWKQKKGAZPHYFH3D667W64VSCV/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: os.workdir() context manager

2021-09-15 Thread Christian Heimes
On 15/09/2021 11.56, Marc-Andre Lemburg wrote:
> - Chris mentioned that library code should not be changing the
>   CWD. In an ideal world, I'd agree, but there are cases where
>   you don't have an option and you need to change the CWD in order
>   to make certain things work, e.g. you don't control the code you
>   want to run, but need to make it work in a specific directory
>   by changing the CWD and then passing relative paths to the
>   code.

This seems rather hypothetical to me. Can you provide a real-world
example where you cannot use absolute paths?

Christian

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/UIBLP5GNXIVAE52IUJCCTLOFVQ7LOMYS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: os.workdir() context manager

2021-09-15 Thread Christian Heimes
On 15/09/2021 09.21, Eric V. Smith wrote:
> On 9/15/2021 3:02 AM, Christian Heimes wrote:
>> On 15/09/2021 01.55, Guido van Rossum wrote:
>>> I know where I'd file a bug. :-)
>>>
>>> "Bug magnet" is an extremely subjective pejorative term. When the
>>> *better* way to do things (os.workdir()) is harder than the *easy* way
>>> to do (os.chdir()), which is the real bug magnet?
>> The "better way" to handle current working directory is to use the
>> modern *at() variants of syscalls, e.g. openat() instead of open(). The
>> variants take an additional file descriptor dirfd that is used as the
>> current working directory for the syscall.
> 
> While I generally agree, the only times I've written a context manager
> like os.workdir() is when running an executable with subprocess.call(),
> and the executable requires that its current directory be set to some
> specific directory. So while I don't use this functionality very often,
> there are times when nothing else will do. I realize I could handle this
> temporary working directory with yet another executable (including a
> shell), but using a context manager is just easier, and I only use this
> in single-threaded programs.

You don't have to change the current working directory of your process
in order to run a child process in a different working directory.
subprocess.call() and other functions in the subprocess module accept a
"cwd" argument that lets you run a process with a different working
directory. The "cwd" argument is thread safe.
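
A minimal sketch of that pattern (the command and directory are
placeholders):

    import subprocess

    # The child process runs with /tmp as its working directory;
    # the parent's cwd is untouched.
    subprocess.run(["ls", "-l"], cwd="/tmp", check=True)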


> And I'm not crazy about the name "workdir". To me, it sounds like it
> returns something, not sets and resets something. But I don't have a
> particularly great alternative in mind: in my own code I've used
> "change_dir", which isn't awesome either.

Speaking with almost 25 years of experience in Unix system scripting:
changing the current working directory during the runtime of a process
is problematic and can lead to bugs. In general, applications should only
change their working directory once, right after start, and not rely on
the cwd. It's better to normalize paths, use absolute paths, or use the
*at() syscalls with dirfd.

In my opinion a workdir() context manager would only be useful for small
quick-n-dirty scripts.
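
For reference, the helper under discussion is roughly the following sketch
(not an actual or proposed stdlib API); it has exactly the process-global,
thread-unsafe behaviour described above:

    import contextlib
    import os

    @contextlib.contextmanager
    def workdir(path):
        old = os.getcwd()
        os.chdir(path)          # process-wide side effect
        try:
            yield
        finally:
            os.chdir(old)       # restore, even if the body raises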

Christian


___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/DGGZHKSJGVRBRARZF7T63FFR2OOGF4XC/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: os.workdir() context manager

2021-09-15 Thread Christian Heimes
On 15/09/2021 01.55, Guido van Rossum wrote:
> I know where I'd file a bug. :-)
> 
> "Bug magnet" is an extremely subjective pejorative term. When the
> *better* way to do things (os.workdir()) is harder than the *easy* way
> to do (os.chdir()), which is the real bug magnet?

The "better way" to handle current working directory is to use the
modern *at() variants of syscalls, e.g. openat() instead of open(). The
variants take an additional file descriptor dirfd that is used as the
current working directory for the syscall.
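
A minimal sketch of that pattern from Python (the directory and file names
are placeholders; dir_fd support is platform-dependent, see
os.supports_dir_fd):

    import os

    dirfd = os.open("/var/log", os.O_RDONLY)
    try:
        # "syslog" is resolved relative to /var/log, independent of the
        # process-wide current working directory.
        fd = os.open("syslog", os.O_RDONLY, dir_fd=dirfd)
        with os.fdopen(fd) as f:
            first_line = f.readline()
    finally:
        os.close(dirfd)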

Christian

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/FDCKFPVBBSNLBDBTI577NP55JN2DBWCE/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: disallow assignment to unknown ssl.SSLContext attributes

2021-06-28 Thread Christian Heimes
On 28/06/2021 20.36, Brendan Barnwell wrote:
> On 2021-06-28 07:03, Thomas Grainger wrote:
>>> >but in this case the object is security sensitive, and security
>>> should be much more rigorous in ensuring correctness.
>> It looks like there's a consensus being reached, should I create a bpo?
> 
> If we're going to make backwards-incompatible changes to SSLContext,
> might it be a good idea to make a cleaner, more Pythonic API while we're
> at it so that people are discouraged from doing attribute-setting at
> all?  Why not have the class accept only valid options at creation time
> and raise an error if any unexpected arguments are passed?  Is there
> even any reason to allow changing the SSLContext parameters after
> creation, or could we just freeze them on instance creation and make
> people create a separate context if they want a different configuration?
>  I think any of these would be better than the current setup that
> expects people to adjust the options by manually setting attributes one
> by one after instance creation.

There won't be any backwards incompatible changes to SSLContext in the near
future. There might be an additional API based on the PEP 543 [1]
configuration object if we find time to implement it for 3.11.

Christian


[1] https://www.python.org/dev/peps/pep-0543/#configuration


___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/QSHMLYTJE3PKRTJLXXJKJFITRZRJFAMI/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: disallow assignment to unknown ssl.SSLContext attributes

2021-06-25 Thread Christian Heimes
On 25/06/2021 20.17, Guido van Rossum wrote:
> On Fri, Jun 25, 2021 at 8:22 AM Bluenix wrote:
> 
> I am not fully aware of how ssl.SSLContext is used, but adding
> __slots__ would prevent this. You would see an error similar to:
> AttributeError: 'MyClass' object has no attribute 'my_attribute'
> 
> 
> That's a reasonable solution, except that it's not backwards compatible.
> It's possible that there is code out there that for some reason adds
> private attributes to an SSLContext instance, and using __slots__ would
> break such usage. (They could perhaps fix their code by using a dummy
> subclass, but that could well become a non-trivial change to their code,
> depending on where they get their SSLContext instances.)
> 
> So unless there's evidence that nobody does that, we're stuck with the
> status quo. I'm adding Christian Heimes to the thread in case he has a
> hunch either way.

I agree, it is a backwards incompatible change. Also, __slots__ won't
work. The class has class attributes that can be modified on instances,
and you cannot have attributes that are both class and instance attributes
with __slots__. We'd have to override __setattr__() and block unknown
attributes on exact instances of ssl.SSLContext.
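
A rough sketch of that approach, shown on a subclass for illustration only
(the real change would have to live in the ssl module itself and account
for any attributes the module sets internally):

    import ssl

    class StrictSSLContext(ssl.SSLContext):
        def __setattr__(self, name, value):
            # Reject names not defined on the class, but only for exact
            # instances so subclasses keep the current behaviour.
            if type(self) is StrictSSLContext and not hasattr(type(self), name):
                raise AttributeError(f"unknown attribute {name!r}")
            super().__setattr__(name, value)

    ctx = StrictSSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.verify_mode = ssl.CERT_REQUIRED   # ok, known attribute
    ctx.verfy_mode = ssl.CERT_REQUIRED    # AttributeError: the typo is caught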

Christian


___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CCIK7ASYNPYD4QTO462LZHTSKTD6FJKN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: [Python-Dev] Re: Have virtual environments led to neglect of the actual environment?

2021-02-24 Thread Christian Heimes
On 24/02/2021 11.52, Stéfane Fermigier wrote:
> I love pipx and I'm glad it exists at this point because it make 
> 
> The main issue is that each virtualenv takes space, lots of space.
> 
> I have currently 57 apps installed via pipx on my laptop, and the 57
> environments take almost 1 GB already.
> 
>  ~  cd .local/pipx/venvs/
>  ~/.l/p/venvs  ls
> abilian-tools/  concentration/  gitlabber/      pygount/        sphinx/
> ansible/        cookiecutter/   httpie/         pyinfra/        tentakel/
> assertize/      cruft/          isort/          pylint/         tlv/
> autoflake/      cython/         jupyterlab/     pyre-check/     towncrier/
> black/          dephell/        lektor/         pytype/         tox/
> borgbackup/     docformatter/   md2pdf/         pyupgrade/      twine/
> borgmatic/      flake8/         medikit/        radon/          virtualenv/
> bpytop/         flit/           mypy/           re-ver/         virtualfish/
> check-manifest/ flynt/          nox/            sailboat/       vulture/
> clone-github/   gh-clone/       pdoc3/          salvo/
> cloneall/       ghtop/          pdocs/          shed/
> com2ann/        gitchangelog/   pybetter/       sixer/
>  ~/.l/p/venvs  du -sh .
> 990M.
>  ~/.l/p/venvs  ls | wc
>       57      57     475
> 
> There is probably a clever way to reuse common packages (probably via
> clever symlinking) and reduce the footprint of these installations. 

There are tools like https://rdfind.pauldreik.se/rdfind.1.html that
create hard links to deduplicate files. Some file systems have
deduplication baked in, too.
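
A rough sketch of the hard-linking idea in Python (illustrative only; real
tools such as rdfind also compare contents byte-for-byte and handle
permissions, devices, and other edge cases):

    import hashlib
    import os
    from pathlib import Path

    def dedupe(root):
        seen = {}
        for path in Path(root).rglob("*"):
            if not path.is_file() or path.is_symlink():
                continue
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            original = seen.setdefault(digest, path)
            if original is not path:
                # Same content: replace the duplicate with a hard link.
                path.unlink()
                os.link(original, path)

    dedupe(os.path.expanduser("~/.local/pipx/venvs"))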

Christian
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CO6GV2CRDKBJMLE7DZVVQ4AMIPSKPMCJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Allowing -b (BytesWarning) to be activated in other ways

2020-07-16 Thread Christian Heimes
On 16/07/2020 16.38, Shai Berger wrote:
> Hi Pythonistas,
> 
> The -b flag, which turns on checks which emit BytesWarnings on
> operations mixing bytes and str objects, is very useful.
> 
> However, the only way to set this flag is via the Python invocation.
> This limits its usability in contexts where the user's control of the
> Python invocation is limited, e.g when using Python embedded in another
> executable (such as uwsgi). There appears to be no function which can
> set the flag, and no environment variable which controls it.
> 
> Up to Python 3.7, the extension module provided by the bytes-warning
> package[1] works around this (it exposes a function which allows
> setting the flag from within Python). But with 3.8 (and I suspect,
> because of PEP-587[2] related changes), this fails silently and the sys
> flag bytes_warning remains unaffected.
> 
> Can we have a non-invocation way to control this flag?

You can use ctypes to modify bytes warnings. I'm using this trick in
FreeIPA. It works up to Python 3.7. For Python 3.8 and newer you have to
modify the bytes_warning member of the current interpreter's configuration.

https://github.com/freeipa/freeipa/blob/53d472b490ac7a14fc78516b448d4aa312b79b7f/ipalib/__init__.py#L886-L912
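
A rough sketch of the ctypes trick (CPython-specific, Python 3.7 and
earlier; it assumes the exported Py_BytesWarningFlag global is reachable
through ctypes.pythonapi):

    import ctypes
    import warnings

    # 1 corresponds to -b; add an "error" filter as well to mimic -bb.
    ctypes.c_int.in_dll(ctypes.pythonapi, "Py_BytesWarningFlag").value = 1
    warnings.simplefilter("default", BytesWarning)

    b"abc" == "abc"   # now emits a BytesWarning at runtime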

Christian
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/6APQ6R5F425UKKPCE6T776DFPYB36GZ6/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: EVENTFD(2) support

2020-06-17 Thread Christian Heimes
On 16/06/2020 07.56, doods...@gmail.com wrote:
> Can we implement eventfd(2) as documented here 
> ?
> 
> It would only be available on the Linux platform, and one of the benefits 
> would be the ability to create synchronisation primitives on said platform 
> that can block on normal threads, and be awaited on in coroutines (without 
> busy looping inside said coroutine).
> 
> Currently the best place I can think of to put it would be in one of the 
> Networking and Interprocess Communication modules (possibly `select` or 
> `socket`?). The fact that it's Linux only shouldn't be an issue, since much 
> of the contents of `select` is OS dependent.

We usually expose low-level, file-descriptor-related functions in the os
module and then provide high-level wrappers in Python. This approach is
the most flexible and allows 3rd parties to build on top of the raw file
descriptor, too.

I opened a BPO and created https://github.com/python/cpython/pull/20930
to implement a low-level interface to glibc's eventfd() function.
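
A rough sketch of how the low-level interface looks (os.eventfd() and
friends, which landed in Python 3.10, Linux only):

    import os

    efd = os.eventfd(0, os.EFD_CLOEXEC)
    os.eventfd_write(efd, 1)          # signal: add 1 to the 64-bit counter
    value = os.eventfd_read(efd)      # returns 1 and resets the counter
    os.close(efd)

The raw file descriptor can be registered with select/selectors or with an
asyncio event loop via add_reader(), which is what makes it usable for
cross-thread and coroutine signalling.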

Christian
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BC7ZRPW5D7UCDGVKRJWD2GFUYWL2UHZB/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: An HTTP API to list versions of Python

2020-05-27 Thread Christian Heimes
On 27/05/2020 18.59, Antoine Pitrou wrote:
> On Tue, 26 May 2020 22:19:12 -0400
> Kyle Stanley  wrote:
>>
>>> It could become more detailed about each minor versions, git tag, links
>>> to changelogs, links to the repositories, to the docs, download links,
>>> and so on.
>>
>> I don't know that it needs to be said, but for now, I think we should start
>> with a minimalist approach by keeping the API focused on reducing the
>> number of *existing* locations to update, rather than predicting what might
>> sort of fields might be useful to include. Otherwise, it could very well
>> end up becoming more work to maintain compared to what it actually saves.
> 
> Unless unusual fields are required in the returned information, how
> about using PyPI as the information store?  That way, you don't have to
> design a new API and implement a new backend...
> 
> (that doesn't mean PyPI needs to host any downloadable files for Pyhon,
> by the way - just the metadata)

Barry and Guido own the Python project on PyPI,
https://pypi.org/project/Python/ . There hasn't been an update since
2.5.0 in 2007.

Does PyPI still support empty releases without files? I wouldn't mind
uploading all sources to PyPI, too.

Christian
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZRVS4UO5YBA37LFJATKGGM6GMCTL3OZD/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Comparison operator support (>= and <=) for type

2019-06-18 Thread Christian Heimes
On 17/06/2019 16.47, Guido van Rossum wrote:
> Type theorists apparently have chosen to use the <: notation, and
> presumably for the same reason.

Can we call it "party hat operator", please? <:-)

Christian
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/GVQPS43UAKXPNXIEQO5IQXLIFEHNB3SD/
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Logical tracebacks

2019-04-15 Thread Christian Heimes
On 15/04/2019 22.07, Antoine Pitrou wrote:
> 
> Hello,
> 
> I apologize because I'm only going to throw a very vague idea and I
> don't currently have time or motivation to explore it myself.  But I
> think it may prove interesting for other people and perhaps spur some
> concrete actionable proposal.
> 
> With the growing complexity of Python software stacks, the length of
> tracebacks is continuously growing and is frequently making debugging
> errors and issues more tedious than it should be.  This is a
> language-agnostic problem.  Java software is often mocked for its
> ridiculously long tracebacks, but Python might come close in the future.
> 
> Especially since Python is often the language of choice for non
> computer science professionals, including but not only as a teaching
> language, this would be a problem worth solving.  We already recognized
> the issue some years ago, and even implemented a focussed fix for one
> specific context: the elision of importlib frames when an import error
> occurs:
> https://bugs.python.org/issue15110
> 
> However, there are many contexts where implementation details would
> benefit from being hidden from tracebacks (the classical example being
> the internals of framework or middleware code, such as Django, Dask,
> etc.).  We would therefore have to define some kind of protocol by
> which tracebacks can be enumerated, not only as frames, but as logical
> execution blocks, comprised of one or several frames each, whose
> boundaries would reflect the boundaries of the various logical
> execution layers (again: framework, middleware...) involved in the
> frame stack.  We would probably also need some flag(s) to disable the
> feature in cases where the full stack frame is wanted (I imagine
> elaborate UIs could also allow switching back and forth from both
> representations).
> 
> This would need a lot more thinking, and perhaps exploring what kind of
> hacks already exist in the wild to achieve similar functionality.
> Again, I'm just throwing this around for others to play with.

Zope has had a feature like that for more than a decade. Code could define
the variables __traceback_info__ and __traceback_supplement__ in local
scope, which would then be used by the traceback formatter to annotate
the traceback with additional information. I think it was also possible
to hide frames with a similar technique.

https://zopeexceptions.readthedocs.io/en/latest/narr.html
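
A minimal sketch of the convention (zope.exceptions picks up the local
variable when formatting the traceback; plain CPython tracebacks ignore it):

    def connect(host, port):
        __traceback_info__ = ("connecting to", host, port)
        raise OSError("connection refused")  # annotated by Zope's formatter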


___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Using sha512 instead of md5 on python.org/downloads

2018-12-08 Thread Christian Heimes
On 08/12/2018 05.55, Gregory P. Smith wrote:
> 
> On Fri, Dec 7, 2018 at 3:38 PM Steven D'Aprano wrote:
> 
> On Fri, Dec 07, 2018 at 01:25:19PM -0800, Nathaniel Smith wrote:
> 
> > For this specific purpose, md5 is just as good as a proper hash. But all
> > else being equal, it would still be better to use a proper hash, just so
> > people don't have to go through the whole security analysis to check that.
> 
> I don't understand what you are trying to say here about "the whole
> security analysis" to check "that". What security analysis, and
> what is "that"?
> 
> It seems to me that moving to a cryptographically-secure hash would give
> many people a false sense of security, that just because the hash matched,
> the download was not only not corrupted, but not compromised as well. For
> those two purposes:
> 
> - testing for accidental corruption;
> - testing for deliberate compromise;
> 
> md5 and sha512 are precisely equivalent: both are sufficient for the
> first, and useless for the second. But a crypto-hash can give a false
> sense of security. The original post in this thread is evidence of that.
> 
> As such, I don't think we should move to anything stronger than md5.
> 
> 
> If we switched to sha2+ or listed 8 different hashes at once in the
> announcement text so that nobody can find the actual link content, we'd
> stop having people pipe up and complain that we used md5 for something. 
> Less mailing list threads like this one seems like a benefit. :P
> 
> Debian provides all of the popular FIPS hashes, in side files, so people
> can use whatever floats their boat for a content integrity check:
>  https://cdimage.debian.org/debian-cd/current/ppc64el/iso-cd/

By the way, it's a common misunderstanding that FIPS forbids MD5 in
general. FIPS is more complicated than black and white lists of
algorithms. FIPS also takes into account how an algorithm is used. For
example, if I recall correctly, AES-GCM is only allowed in network
communication protocols but not for persistent storage.

Simply speaking:
In FIPS mode, MD5 is still allowed in **non-security contexts**. You
cannot use MD5 to make any security claims like file integrity. However,
you are still allowed to use MD5 as a non-secure hash function to detect
file corruption. The design and documentation must clearly state that
you are only guarding against accidental file corruption caused by
network or hardware issues, but not as protection against a malicious attacker.
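
A minimal sketch of such a non-security integrity check (the file name and
published digest are placeholders; on a FIPS-enabled build you would pass
usedforsecurity=False, available since Python 3.9):

    import hashlib

    def file_digest(path, algorithm="md5", chunk_size=64 * 1024):
        h = hashlib.new(algorithm)
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()

    expected = "..."  # digest published on the download page (placeholder)
    if file_digest("Python-3.7.1.tgz") != expected:
        print("download corrupted, fetch it again")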

Christian

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?)

2018-06-23 Thread Christian Heimes
On 2018-06-23 21:55, Ezequiel Brizuela [aka EHB or qlixed] wrote:
> 
> 
> On Sat, 23 Jun 2018 at 10:58, Stephan Houben wrote:
> 
> Would it not be much simpler and more secure to just disable core dumps?
> 
> /etc/security/limits.conf on Linux.
> 
> If the attacker can cause and read a core dump, the game seems over
> anyway since sooner or later he will catch the core dump at a time
> the string was not yet deleted.
> 
> 
> The thing is that this could be leaked in other ways, not just in a core.
> Additionally, there is the case when you need a core to debug the issue:
> you could be sharing sensitive info without knowing it.
> Also, disabling core generation is not always an option.

If you have core dumps enabled, then memory wiping will not help against
accidental leakage of sensitive data.

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?)

2018-06-23 Thread Christian Heimes
On 2018-06-23 07:21, Nathaniel Smith wrote:
> On Fri, Jun 22, 2018 at 6:45 PM, Steven D'Aprano  wrote:
>> On Sat, Jun 23, 2018 at 01:33:59PM +1200, Greg Ewing wrote:
>>> Chris Angelico wrote:
>>>> Downside:
>>>> You can't say "I'm done with this string, destroy it immediately".
>>>
>>> Also it would be hard to be sure there wasn't another
>>> copy of the data somewhere from a time before you
>>> got around to marking the string as sensitive, e.g.
>>> in a file buffer.
>>
>> Don't let the perfect be the enemy of the good.
> 
> That's true, but for security features it's important to have a proper
> analysis of the threat and when the mitigation will and won't work;
> otherwise, you don't know whether it's even "good", and you don't know
> how to educate people on what they need to do to make effective use of
> it (or where it's not worth bothering).
> 
> Another issue: I believe it'd be impossible for this proposal to work
> correctly on implementations with a compacting GC (e.g., PyPy),
> because with a compacting GC strings might get copied around in memory
> during their lifetime. And crucially, this might have already happened
> before the interpreter was told that a particular string object
> contained sensitive data. I'm guessing this is part of why Java and C#
> use a separate type.
> 
> There's a lot of prior art on this in other languages/environments,
> and a lot of experts who've thought hard about it. Python-{ideas,dev}
> doesn't have a lot of security experts, so I'd very much want to see
> some review of that work before we go running off designing something
> ad hoc.
> 
> The PyCA cryptography library has some discussion in their docs:
> https://cryptography.io/en/latest/limitations/
> 
> One possible way to move the discussion forward would be to ask the
> pyca devs what kind of API they'd like to see in the interpreter, if
> any.

A while ago, I spent a good amount of time investigating memory wiping
for the hashlib and hmac modules. Although I was only interested in
performing memory wiping in C code [1], I eventually gave up. It was too
annoying to create a platform- and architecture-independent implementation.
Because compilers do funny things and memset_s() isn't universally
available yet, it requires code like

   static void * (* const volatile __memset_vp)(void *, int, size_t) =
(memset);

or assembler code like

   asm volatile("" : : "r"(s) : "memory");

to just work around compiler optimization. This doesn't even handle CPU
architecture, virtual memory, paging, core dumps, debuggers or other
things that can read memory or dump memory to disk.


I honestly believe that memory wiping with the current standard memory
allocator won't do the trick. It might be possible to implement a 90%
solution with a special memory allocator. Said allocator would use a
specially configured mmap memory arena and perform wiping on realloc()
and free(). The secure area can be prevented from swapping with mlock(),
protected with mprotect(), and possibly hardware-encrypted with
pkey_mprotect(). It's just a 90% secure solution, because the data will
eventually land in public buffers.

If you need to protect sensitive data like private keys, then don't load
them into the memory of your current process. It's that simple. :) Bugs like
Heartbleed were an issue because the private key was in the same process
space as the TLS/SSL code. Solutions like gpg-agent, ssh-agent, TPM,
HSM, Linux's keyring, and AF_ALG sockets all aim to offload operations
with private key material into a secure subprocess, kernel space, or
special hardware.


[1] https://bugs.python.org/issue17405

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?)

2018-06-23 Thread Christian Heimes
On 2018-06-23 15:57, Stephan Houben wrote:
> Would it not be much simpler and more secure to just disable core dumps?
> 
> /etc/security/limits.conf on Linux.
> 
> If the attacker can cause and read a core dump, the game seems over
> anyway since sooner or later he will catch the core dump at a time the
> string was not yet deleted.

That's not sufficient. You'd also need to ensure that the memory page is
never paged to disk or visible to gdb, ptrace, or any other kind of
debugger. POSIX has mprotect(), but it doesn't necessarily work with
malloc()ed memory and requires mmap()ed memory.

Christian


___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Unified TLS API for Python

2017-02-04 Thread Christian Heimes
On 2017-02-04 19:18, Cory Benfield wrote:
> 
>> On 3 Feb 2017, at 18:30, Steve Dower  wrote:
>>
>> On 02Feb2017 0601, Cory Benfield wrote:
>>>
>>> 4. Eventually, integrating the two backends above into the standard
>>> library so that it becomes possible to reduce the reliance on OpenSSL.
>>> This would allow future Python implementations to ship with all of their
>>> network protocol libraries supporting platform-native TLS
>>> implementations on Windows and macOS. This will almost certainly require
>>> new PEPs. I’ll probably volunteer to maintain a SecureTransport library,
>>> and I have got verbal suggestions from some other people who’d be
>>> willing to step up and help with that. Again, we’d need help with
>>> SChannel (looking at you, Steve).
>>
>> I'm always somewhat interested in learning a new API that I've literally 
>> never looked at before, so yeah, count me in :) (my other work was using the 
>> trust APIs directly, rather than the secure socket APIs).
>>
>> PyCon US sprints? It's not looking like I'll be able to set aside too much 
>> time before then, but I've already fenced off that time.
> 
> That might be a really good idea.
> 
> With feedback from Nathaniel and Glyph I’m going back to the drawing board a 
> bit with this PEP to see if we can reduce the amount of work needed from 
> backends, so this may shrink down to something that can feasibly be done in a 
> sprint.
> 
> For those who are interested, the refactoring proposed by Nathaniel and Glyph 
> is to drop the abstract TLSWrappedSocket class, and instead replace it with a 
> *concrete* TLSWrappedSocket class that is given a socket and an 
> implementation of TLSWrappedBuffers. This would essentially mean that 
> implementations only need to write a TLSWrappedBuffers implementation and 
> would get a TLSWrappedSocket essentially for free (with the freedom to 
> provide a complete reimplementation of TLSWrappedSocket if they need to).
> 
> I’m going to investigate the feasibility of that by writing a stub 
> TLSWrappedBuffers for SecureTransport and then writing the TLSWrappedSocket 
> implementation to validate what it looks like. Assuming the end result looks 
> generally suitable, I’ll come back with a new draft. But the TL;DR is that if 
> we do that all we need to implement on the Windows side is one-or-two 
> classes, and we’d be ready to go. That’d be really nice.

At first I was a bit worried that you were planning to change the
semantics of socket wrapping. The PEP doesn't go into detail about how
wrap_socket() works internally. How about you add a paragraph stating that
the function wraps OS-level file descriptors (POSIX) or socket handles
(Windows)?

For some TLS libraries it's an optimization to reduce memory copies and
the overhead of extra GIL lock/unlock calls. Other TLS libraries (NSS, Linux
kernel TLS with AF_KTLS) can only operate on file descriptors. In fact,
NSS can only operate on NSPR PRFileDesc objects, but NSPR has an API to wrap
an OS-level fd in a PRFileDesc.

By the way, how can a TLS implementation announce that it does not
provide buffer wrapping?

Christian



___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Re: [Python-ideas] incremental hashing in __hash__

2016-12-30 Thread Christian Heimes
On 2016-12-30 20:59, Matthias Bussonnier wrote:
> On Fri, Dec 30, 2016 at 5:24 PM, Nick Coghlan  wrote:
>>
>> I understood the "iterhash" suggestion to be akin to itertools.accumulate:
>>
>> >>> for value, tally in enumerate(accumulate(range(10))): print(value, 
>> ...
> 
> It reminds me of hmac[1]/hashlib[2], with the API :  h.update(...)
> before a .digest().
> It is slightly lower level than a `from_iterable`, but would be a bit
> more flexible.
> If the API were kept similar things would be easier to remember.

Hi,

I'm the author of PEP 456 (SipHash24) and one of the maintainers of the
hashlib module.

Before we come up with a new API or recipe, I would like to understand
the problem first. Why does the original poster consider hash(large_tuple) a
performance issue? If you have an object with lots of members that
affect both __hash__ and __eq__, then __hash__ is really the least of your
concerns. The hash has to be computed just once and then stays the
same over the lifetime of the object. Once computed, the hash can be
cached.

On the other hand, __eq__ is called at least once for every successful
hash lookup. In the worst case it is called n-1 times for a dict of size n,
for both a match *and* a hashmap miss. Every __eq__ call has to compare
between 1 and m member attributes. For a dict with 1,000 elements with 1,000
members each, that's just 1,000 hash computations with roughly 8 kB of
memory allocation, but almost a million comparisons in the worst case.
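
A rough sketch of the caching pattern described above (assuming the members
are immutable after construction):

    class Record:
        def __init__(self, *members):
            self._members = tuple(members)
            self._hash = None

        def __eq__(self, other):
            return isinstance(other, Record) and self._members == other._members

        def __hash__(self):
            # Computed once, reused for every subsequent dict/set lookup.
            if self._hash is None:
                self._hash = hash(self._members)
            return self._hash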

A hasher object adds further overhead, e.g. object allocation and the
creation of a bound method for each call. It's also less CPU-cache friendly
than the linear data structure of a tuple.

Christian

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Expose reasons for SSL/TLS cert verification failures

2016-09-09 Thread Christian Heimes
On 2016-09-09 12:23, Chi Hsuan Yen wrote:
> Hi Python enthusiasts,
> 
> Currently _ssl.c always reports CERTIFICATE_VERIFY_FAILED for any
> certification verification errors. In OpenSSL, it's possible to tell
> from different reasons that lead to CERTIFICATE_VERIFY_FAILED. For
> example, https://expired.badssl.com/ reports
> X509_V_ERR_CERT_HAS_EXPIRED, and https://self-signed.badssl.com/ reports
> X509_V_ERR_DEPTH_ZERO_SELF_SIGNED_CERT. Seems CPython does not expose
> such information yet? I hope it can be added to CPython. For example,
> creating a new exception class SSLCertificateError, which is a subclass
> of SSLError, that provides error codes like
> X509_V_ERR_DEPTH_ZERO_SELF_SIGNED_CERT. Any ideas?
> 
> The attachment is a naive try to printf some information about a
> verification failure. It's just a proof-of-concept and does not provide
> any practical advantage :)

I'm planning to add a proper validation hook to 3.7. I haven't had time
to design and implement it for 3.6.
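
For reference, Python 3.7 did grow ssl.SSLCertVerificationError, which
exposes OpenSSL's verification result; a rough sketch (the host name is
just an example):

    import socket
    import ssl

    ctx = ssl.create_default_context()
    try:
        with socket.create_connection(("expired.badssl.com", 443)) as sock:
            with ctx.wrap_socket(sock, server_hostname="expired.badssl.com"):
                pass
    except ssl.SSLCertVerificationError as exc:
        # e.g. 10 / "certificate has expired" for X509_V_ERR_CERT_HAS_EXPIRED
        print(exc.verify_code, exc.verify_message)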

Christian


___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/