Re: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?)

2018-06-22 Thread Nathaniel Smith
On Fri, Jun 22, 2018 at 6:45 PM, Steven D'Aprano  wrote:
> On Sat, Jun 23, 2018 at 01:33:59PM +1200, Greg Ewing wrote:
>> Chris Angelico wrote:
>> >Downside:
>> >You can't say "I'm done with this string, destroy it immediately".
>>
>> Also it would be hard to be sure there wasn't another
>> copy of the data somewhere from a time before you
>> got around to marking the string as sensitive, e.g.
>> in a file buffer.
>
> Don't let the perfect be the enemy of the good.

That's true, but for security features it's important to have a proper
analysis of the threat and when the mitigation will and won't work;
otherwise, you don't know whether it's even "good", and you don't know
how to educate people on what they need to do to make effective use of
it (or where it's not worth bothering).

Another issue: I believe it'd be impossible for this proposal to work
correctly on implementations with a compacting GC (e.g., PyPy),
because with a compacting GC strings might get copied around in memory
during their lifetime. And crucially, this might have already happened
before the interpreter was told that a particular string object
contained sensitive data. I'm guessing this is part of why Java and C#
use a separate type.

There's a lot of prior art on this in other languages/environments,
and a lot of experts who've thought hard about it. Python-{ideas,dev}
doesn't have a lot of security experts, so I'd very much want to see
some review of that work before we go running off designing something
ad hoc.

The PyCA cryptography library has some discussion in their docs:
https://cryptography.io/en/latest/limitations/

One possible way to move the discussion forward would be to ask the
pyca devs what kind of API they'd like to see in the interpreter, if
any.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?)

2018-06-22 Thread Guido van Rossum
On Fri, Jun 22, 2018 at 9:11 PM Chris Angelico  wrote:

> How will other Pythons handle this?
>

It could be optional behavior. ISTR that in Jython, strings are pretty much
just Java strings.

Does Java have such a feature? If not, do Java apps worry about this? If
not, perhaps Python needn't either.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?)

2018-06-22 Thread Chris Angelico
On Sat, Jun 23, 2018 at 2:00 PM, Terry Reedy  wrote:
> On 6/22/2018 8:45 PM, Chris Angelico wrote:
>
>> Would it suffice to flag the string as "this contains sensitive data,
>> please overwrite its buffer when it gets deallocated"? The only
>> difference, in your example, would be that the last print would show
>> the original data, and the wipe would happen afterwards. Advantages of
>> this approach include that getpass can automatically flag the string
>> as sensitive, and the "sensitive" flag can infect other strings (so
>> <> would be automatically flagged to be wiped). Downside:
>> You can't say "I'm done with this string, destroy it immediately".
>
>
> But one can be careful about creating references, and in current CPython,
> deleting the last reference does mean destroy, and possibly wipe,
> immediately.
>

Yes, you can, for the most part. It's certainly possible to get stung
(eg exceptions retaining locals), but mostly it should be fine.
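
The "exceptions retaining locals" case can be sketched roughly like this. It is an illustration only: the function names and the stand-in password are made up, and the behaviour shown is that of tracebacks holding on to frames.

```python
def handle_login():
    # Stand-in for a value read from getpass; purely illustrative.
    secret = "hunter2"
    raise ValueError("login failed")


def find_secret_in_traceback():
    try:
        handle_login()
    except ValueError as exc:
        # The traceback keeps the failed frame alive, and the frame keeps
        # its locals, so the secret is still reachable after the function
        # that owned it has exited.
        tb = exc.__traceback__
        while tb.tb_next is not None:
            tb = tb.tb_next
        return tb.tb_frame.f_locals.get("secret")


print(find_secret_in_traceback())
```

So "deleting the last reference" only helps if nothing such as a stored exception or a logged traceback is quietly holding another one.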

How will other Pythons handle this?

ChrisA


Re: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?)

2018-06-22 Thread Terry Reedy

On 6/22/2018 8:45 PM, Chris Angelico wrote:


> Would it suffice to flag the string as "this contains sensitive data,
> please overwrite its buffer when it gets deallocated"? The only
> difference, in your example, would be that the last print would show
> the original data, and the wipe would happen afterwards. Advantages of
> this approach include that getpass can automatically flag the string
> as sensitive, and the "sensitive" flag can infect other strings (so
> <> would be automatically flagged to be wiped). Downside:
> You can't say "I'm done with this string, destroy it immediately".


But one can be careful about creating references, and in current 
CPython, deleting the last reference does mean destroy, and possibly 
wipe, immediately.



--
Terry Jan Reedy



Re: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?)

2018-06-22 Thread Steven D'Aprano
On Sat, Jun 23, 2018 at 01:33:59PM +1200, Greg Ewing wrote:
> Chris Angelico wrote:
> >Downside:
> >You can't say "I'm done with this string, destroy it immediately".
> 
> Also it would be hard to be sure there wasn't another
> copy of the data somewhere from a time before you
> got around to marking the string as sensitive, e.g.
> in a file buffer.

Don't let the perfect be the enemy of the good. We know there's at least 
one place that a string could leak private information. Just because 
there could hypothetically be other such places, doesn't make it useless 
to wipe that known potential leak.

Attackers are not always omniscient. Even if an application leaks 
private data in ten places, some attacker may only know of, or be 
capable of, attacking *one* leak. If we can, we ought to plug it, and 
leave those hypothetical other leaks for another day.

(Burglars can lift the tiles off my roof, climb into the ceiling, and 
hence down into my house. Nevertheless I still lock my front door.)


-- 
Steve


Re: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?)

2018-06-22 Thread Chris Angelico
On Sat, Jun 23, 2018 at 11:30 AM, Guido van Rossum  wrote:
> Chris's proposal can be implemented, it would set a hidden flag. Hopefully
> there's room for the flag without increasing the object header size.

If I'm reading the include file correctly, the 'state' bitstruct has
eight bits with defined meanings, and then 24 of padding to ensure
alignment. Allocating one of those bits to say "sensitive" should be
100% backward-compatible.

ChrisA


Re: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?)

2018-06-22 Thread Greg Ewing

Chris Angelico wrote:

> Downside:
> You can't say "I'm done with this string, destroy it immediately".


Also it would be hard to be sure there wasn't another
copy of the data somewhere from a time before you
got around to marking the string as sensitive, e.g.
in a file buffer.

--
Greg



Re: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?)

2018-06-22 Thread Terry Reedy

On 6/22/2018 8:31 PM, Ezequiel Brizuela [aka EHB or qlixed] wrote:
> As all the string in python are immutable, is impossible to overwrite
> the value

Not if one uses ctypes.  Is that what you did?

>   Well I already do it:
>
> https://github.com/qlixed/python-memwiper/
>
> But i hit a lot of problems in the road, I was working on me free time
> over the last year on this and make it "almost" work, but that is not
> relevant to the proposal.

I think it is.  A very small fraction of Python users need such wiping.

And I doubt that it can be complete.  For instance, I suspect that a
password entered into getpass first exists in OS form before being
copied into a Python string object.  Wiping the Python string would not
wipe the original copy.  So this really should be attacked at the OS
level, not the language level.  I have read that phones use separate
memory for critical data to try to protect it.


--
Terry Jan Reedy




Re: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?)

2018-06-22 Thread Guido van Rossum
A wipe() method that mutates a string while it can still be referenced
elsewhere is unacceptable -- it breaks an abstraction that is widely
assumed.

Chris's proposal can be implemented, it would set a hidden flag. Hopefully
there's room for the flag without increasing the object header size.


On Fri, Jun 22, 2018 at 5:46 PM Chris Angelico  wrote:

> On Sat, Jun 23, 2018 at 10:31 AM, Ezequiel Brizuela [aka EHB or
> qlixed]  wrote:
> >   I propose to make the required changes on the string objects to add an
> > option to overwrite the underlying buffer. To do so:
> >
> >   * Add a wiped as an attribute that is read-only to be set when the string
> > is overwritten.
> >   * Add a wipe() method that overwrites the internal string buffer.
>
> Since strings are immutable, it's entirely possible for them to be
> shared in various ways. Having the string be wiped while still
> existing seems to be a risky approach.
>
> > So this will work like this:
> >
> > >>> pwd =getpass.getpass('Set your password:') # could be other sensitive data.
> > >>> encrypted_pwd = crypt.crypt(pwd)  # crypt() just as example.
> > >>> pwd.wiped  # Check if pwd was wiped.
> > False
> > >>> pwd.wipe()  # Overwrite the underlying buffer
> > >>> pwd.wiped  # Check if pwd was wiped.
> > True
> > >>> print(pwd)  # Print noise (or empty str?)
> > >>> del pwd  # Now is in hands of the GC.
>
> Would it suffice to flag the string as "this contains sensitive data,
> please overwrite its buffer when it gets deallocated"? The only
> difference, in your example, would be that the last print would show
> the original data, and the wipe would happen afterwards. Advantages of
> this approach include that getpass can automatically flag the string
> as sensitive, and the "sensitive" flag can infect other strings (so
> <> would be automatically flagged to be wiped). Downside:
> You can't say "I'm done with this string, destroy it immediately".
>
> ChrisA
>


-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-ideas] Secure string disposal (maybe other inmutable seq types too?)

2018-06-22 Thread Chris Angelico
On Sat, Jun 23, 2018 at 10:31 AM, Ezequiel Brizuela [aka EHB or
qlixed]  wrote:
>   I propose to make the required changes on the string objects to add an
> option to overwrite the underlying buffer. To do so:
>
>   * Add a wiped as an attribute that is read-only to be set when the string
> is overwritten.
>   * Add a wipe() method that overwrites the internal string buffer.

Since strings are immutable, it's entirely possible for them to be
shared in various ways. Having the string be wiped while still
existing seems to be a risky approach.

> So this will work like this:
>
> >>> pwd =getpass.getpass('Set your password:') # could be other sensitive data.
> >>> encrypted_pwd = crypt.crypt(pwd)  # crypt() just as example.
> >>> pwd.wiped  # Check if pwd was wiped.
> False
> >>> pwd.wipe()  # Overwrite the underlying buffer
> >>> pwd.wiped  # Check if pwd was wiped.
> True
> >>> print(pwd)  # Print noise (or empty str?)
> >>> del pwd  # Now is in hands of the GC.

Would it suffice to flag the string as "this contains sensitive data,
please overwrite its buffer when it gets deallocated"? The only
difference, in your example, would be that the last print would show
the original data, and the wipe would happen afterwards. Advantages of
this approach include that getpass can automatically flag the string
as sensitive, and the "sensitive" flag can infect other strings (so
<> would be automatically flagged to be wiped). Downside:
You can't say "I'm done with this string, destroy it immediately".

ChrisA


[Python-ideas] Secure string disposal (maybe other inmutable seq types too?)

2018-06-22 Thread Ezequiel Brizuela [aka EHB or qlixed]
Since all strings in Python are immutable, it is impossible to overwrite the
value or to do a "secure disposal" (overwrite-then-free) of a string
using something like:

>>> a = "something to hide"
>>> a =  "x"*len(a)

This leaves both "something to hide" and "x" repeated len(a) times in the
process memory.
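
A small sketch of the problem, and of the closest thing available without any language change: rebinding a name never touches the old object, whereas a mutable bytearray can be overwritten in place. The helper name wipe here is hypothetical and is not the proposed str method. It also does nothing about copies made elsewhere (for example by the OS, or by converting the buffer to str).

```python
def wipe(buf: bytearray) -> None:
    """Overwrite a mutable buffer in place with zero bytes."""
    for i in range(len(buf)):
        buf[i] = 0


a = "something to hide"
original = a            # second reference to the same str object
a = "x" * len(a)        # rebinding builds a new str; the old one is untouched
print(original)         # still the original text

secret = bytearray(b"something to hide")
view = secret           # another reference to the same buffer
wipe(secret)
print(view)             # every byte has been overwritten
```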

- Who cares? Why is this relevant?
  Well, if you handle sensitive information like CC numbers, passwords,
PINs, or other such data, you want to minimize the chance of leaking
any of it.

- How can this "leak" happen?
  If someone gets a core/memory dump of an app handling sensitive information,
all the information in that dump is exposed!

- Well, so what can we do about this?
  I propose to make the required changes to the string objects to add an
option to overwrite the underlying buffer. To do so:

  * Add a read-only attribute, wiped, that is set when the string
is overwritten.
  * Add a wipe() method that overwrites the internal string buffer.

So this will work like this:

>>> pwd = getpass.getpass('Set your password:')  # could be other sensitive data.
>>> encrypted_pwd = crypt.crypt(pwd)  # crypt() just as example.
>>> pwd.wiped  # Check if pwd was wiped.
False
>>> pwd.wipe()  # Overwrite the underlying buffer
>>> pwd.wiped  # Check if pwd was wiped.
True
>>> print(pwd)  # Print noise (or empty str?)
>>> del pwd  # Now it is in the hands of the GC.

The wipe method immediately overwrites the underlying string buffer and sets
wiped to True, so that if the string is used again this can be
checked to confirm that the change was made by a wipe and not by another
procedure. Initially the idea is to use the Unicode NULL code point to
overwrite the string, but this could be changed to let the user parametrize
it through the wipe() method.
An alternative is to add a new exception, "WipedError", that could be
raised when the string is accessed again, but I found this approach too
disruptive for normal/standard string workflows.

Quick & Dirty FAQ:

- You're doing it wrong! The correct code to do that in a secure way is:
>>> pwd = crypt.crypt(getpass.getpass('Set your password'))
Don't you know that, fool?

  Well, no: the code still generates a temporary string in memory to pass to
crypt. But now this string is lying there and can't be reached for an
overwrite with wipe().


- Why not create a new type like in C# or Java?

  I see that this tends to disrupt the usual workflow of string usage. Also,
the idea here is not to offer secure storage of strings in memory, because
there are already a few mechanisms to achieve that with the current Python base. I
just want to have the ability to overwrite the buffer.


- Why not use one of the standard overwrite algorithms like DoD5220 or
MIL-STD-414?

  These standards are usually oriented toward persistent
storage, especially magnetic media, where the data could be "easily"
recovered. But this could be an option, implemented by adding the
ability to plug in a function that does the overwrite work inside the wipe method.


- This is far beyond the almost implementation-agnostic definition of
the Python language. How about you make a module with this functionality and
leave the language as is?

  Well, I already did:

https://github.com/qlixed/python-memwiper/

  But I hit a lot of problems along the way. I have been working on this in my
free time over the last year and made it "almost" work, but that is not
relevant to the proposal.
  I think that this kind of security issue needs to be tackled from within
the language itself, especially when the language has a GC. I firmly believe that
security and protections need to be part of the "batteries included" offer
of Python. And I think that this is one little thing that could help a lot
to secure our apps.
  Let me know what you think!

~ Ezequiel (Ezekiel) Brizuela [ aka Qlixed ] ~


Re: [Python-ideas] staticmethod and classmethod should be callable

2018-06-22 Thread Random832
On Thu, Jun 21, 2018, at 05:00, INADA Naoki wrote:
> When Python 4, I think we can even throw away classmethod and staticmethod
> object.
> PyFunction can have binding flag instead, like METH_CLASS and METH_STATIC
> for PyCFunction.
> classmethod and staticmethod is just a function which modify the flag.

I can't remember the details, but I remember once having a reason to need to 
use staticmethod to store an attribute which happened to be a function.
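
One situation of that kind, sketched with made-up names (this is a guess at the sort of case meant, not the original one): a plain function stored as a class attribute turns into a method when looked up on an instance, and staticmethod is the usual way to prevent that.

```python
def shout(text):
    return text.upper()


class Plain:
    # A function stored directly on the class becomes a method:
    # instances pass themselves as the first argument.
    formatter = shout


class Wrapped:
    # staticmethod suppresses that binding, so the function is
    # stored and retrieved unchanged.
    formatter = staticmethod(shout)


print(Wrapped().formatter("hi"))
try:
    Plain().formatter("hi")
except TypeError as exc:
    print("TypeError:", exc)
```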


Re: [Python-ideas] String and bytes bitwise operations

2018-06-22 Thread INADA Naoki
Hi Terry,

Thanks, but I didn't care because my password is not so long.
I just wanted to illustrate real-world bytes xor usage.
BTW, the new MySQL auth methods (sha256 and caching_sha2)
use bytes xor too.

From a performance point of view, websocket masking is performance
critical.  Tornado uses an extension module just for it.  If bytearray ^= bytes
were supported, websocket frame masking might look like:

  frame ^= mask * ((len(frame)+3)//4)  # mask is 4 bytes long
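
For comparison, here is a sketch of how the same masking can be written without such an operator, by XORing the whole frame as one big integer. The function name mask_frame is made up for this example, and it is not claimed to be what Tornado does.

```python
def mask_frame(frame: bytes, mask: bytes) -> bytes:
    """XOR each payload byte with the 4-byte mask, repeating the mask."""
    if len(mask) != 4:
        raise ValueError("mask must be exactly 4 bytes")
    # Repeat the mask so it covers the frame, then trim to the same length.
    repeated = (mask * ((len(frame) + 3) // 4))[:len(frame)]
    # XOR both buffers as arbitrary-precision integers in one operation.
    n = int.from_bytes(frame, "big") ^ int.from_bytes(repeated, "big")
    return n.to_bytes(len(frame), "big")


masked = mask_frame(b"HELLO", b"worl")
print(masked)
print(mask_frame(masked, b"worl"))   # masking twice restores the payload
```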

On Sat, Jun 23, 2018 at 1:26 AM Terry Reedy  wrote:

> On 6/22/2018 7:08 AM, INADA Naoki wrote:
> > Bitwise xor is used for "masking" code like these:
> >
> >
> https://github.com/PyMySQL/PyMySQL/blob/37eba60439039eff17b32ef1a63b45c25ea28cec/pymysql/connections.py#L139-L146
>
> This points to a function _my_crypt that is O(n*n) because it builds the
> result by repeated bytes concatenation (result += ...).  Accumulating into
> a bytearray instead makes it O(n).
> ---
> import random
> import struct
> import timeit
>
> range_type = range
>
> def _my_crypt(message1, message2):
>     length = len(message1)
>     result = b''
>     for i in range_type(length):
>         x = (struct.unpack('B', message1[i:i+1])[0] ^
>              struct.unpack('B', message2[i:i+1])[0])
>         result += struct.pack('B', x)
>     return result
>
> def _my_crypt2(message1, message2):
>     length = len(message1)
>     result = bytearray()
>     for i in range_type(length):
>         x = (struct.unpack('B', message1[i:i+1])[0] ^
>              struct.unpack('B', message2[i:i+1])[0])
>         result += struct.pack('B', x)
>     return bytes(result)
>
> def make(n):
>     result = bytearray()
>     for i in range(n):
>         result.append(random.randint(0, 255))
>     return result
>
> for m in (10, 100, 1000, 10_000, 100_000, 1000_000):
>     m1 = make(m)
>     m2 = make(m)
>
>     n = 1000_000 // m
>     print(f'bytes len {m}, timeit reps {n}')
>     print('old ', timeit.timeit('_my_crypt(m1, m2)', number = n,
>                                 globals=globals()))
>     print('new ', timeit.timeit('_my_crypt2(m1, m2)', number = n,
>                                 globals=globals()))
> ---
> prints
>
> bytes len 10, timeit reps 100000
> old  1.227759412998
> new  1.217421230999
> bytes len 100, timeit reps 10000
> old  1.145566423
> new  1.092400212003
> bytes len 1000, timeit reps 1000
> old  1.286030619002
> new  1.116868583999
> bytes len 10000, timeit reps 100
> old  1.654334465003
> new  1.118191714
> bytes len 100000, timeit reps 10
> old  4.256849211005
> new  1.126613756011
> bytes len 1000000, timeit reps 1
> old  60.65123814404
> new  1.131502019992
>
> I tried to submit this to https://github.com/PyMySQL/PyMySQL/issues/new
> but [Submit] does not work for me.
>
> --
> Terry Jan Reedy
>
>


-- 
INADA Naoki  


Re: [Python-ideas] String and bytes bitwise operations

2018-06-22 Thread Terry Reedy

On 6/22/2018 7:08 AM, INADA Naoki wrote:

> Bitwise xor is used for "masking" code like these:
>
> https://github.com/PyMySQL/PyMySQL/blob/37eba60439039eff17b32ef1a63b45c25ea28cec/pymysql/connections.py#L139-L146


This points to a function _my_crypt that is O(n*n) because it builds the
result by repeated bytes concatenation (result += ...).  Accumulating into
a bytearray instead makes it O(n).

---
import random
import struct
import timeit

range_type = range

def _my_crypt(message1, message2):
    length = len(message1)
    result = b''
    for i in range_type(length):
        x = (struct.unpack('B', message1[i:i+1])[0] ^
             struct.unpack('B', message2[i:i+1])[0])
        result += struct.pack('B', x)
    return result

def _my_crypt2(message1, message2):
    length = len(message1)
    result = bytearray()
    for i in range_type(length):
        x = (struct.unpack('B', message1[i:i+1])[0] ^
             struct.unpack('B', message2[i:i+1])[0])
        result += struct.pack('B', x)
    return bytes(result)

def make(n):
    result = bytearray()
    for i in range(n):
        result.append(random.randint(0, 255))
    return result

for m in (10, 100, 1000, 10_000, 100_000, 1000_000):
    m1 = make(m)
    m2 = make(m)

    n = 1000_000 // m
    print(f'bytes len {m}, timeit reps {n}')
    print('old ', timeit.timeit('_my_crypt(m1, m2)', number = n,
                                globals=globals()))
    print('new ', timeit.timeit('_my_crypt2(m1, m2)', number = n,
                                globals=globals()))

---
prints

bytes len 10, timeit reps 100000
old  1.227759412998
new  1.217421230999
bytes len 100, timeit reps 10000
old  1.145566423
new  1.092400212003
bytes len 1000, timeit reps 1000
old  1.286030619002
new  1.116868583999
bytes len 10000, timeit reps 100
old  1.654334465003
new  1.118191714
bytes len 100000, timeit reps 10
old  4.256849211005
new  1.126613756011
bytes len 1000000, timeit reps 1
old  60.65123814404
new  1.131502019992

I tried to submit this to https://github.com/PyMySQL/PyMySQL/issues/new
but [Submit] does not work for me.
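
As a further possibility (a sketch only, assuming Python 3, where iterating over bytes yields ints), the struct round trip can be dropped altogether. The name xor_bytes is made up for this example. Note that zip stops at the shorter input rather than failing, which differs from the loop above.

```python
def xor_bytes(message1: bytes, message2: bytes) -> bytes:
    # Iterating over bytes yields ints, so no struct packing is needed;
    # bytes() consumes the generator in a single linear pass.
    return bytes(a ^ b for a, b in zip(message1, message2))


print(xor_bytes(b"HELLO", b"world"))
```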

--
Terry Jan Reedy



Re: [Python-ideas] staticmethod and classmethod should be callable

2018-06-22 Thread Nick Coghlan
On 21 June 2018 at 03:27, Serhiy Storchaka  wrote:
> 20.06.18 20:07, Guido van Rossum wrote:
>>
>> Maybe we're misunderstanding each other? I would think that calling the
>> classmethod object directly would just call the underlying function, so this
>> should have to call utility() with a single arg. This is really the only
>> option, since the descriptor doesn't have any context.
>>
>> In any case it should probably `def utility(cls)` in that example to
>> clarify that the first arg to a class method is a class.
>
>
> Sorry, I missed the cls parameter in the definition of utility().
>
> class Spam:
>     @classmethod
>     def utility(cls, arg):
>         ...
>
>     value = utility(???, arg)
>
> What should be passed as the first argument to utility() if the Spam class
> (as well as its subclasses) is not yet defined?

That would depend on the definition of `utility` (it may simply not be
useful to call it in the class body, which is also the case with most
instance methods).
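
To make the class-body situation concrete, here is a small sketch. The return value and the choice of object as a stand-in for cls are arbitrary, purely for illustration. It only uses the existing __func__ attribute and does not depend on whether classmethod objects themselves are callable.

```python
class Spam:
    @classmethod
    def utility(cls, arg):
        return (cls.__name__, arg)

    # Inside the class body the name is bound to the raw classmethod
    # object; Spam itself does not exist yet, so nothing can bind it.
    kind = type(utility).__name__

    # The plain function is available as __func__, but the caller has to
    # choose what to pass as cls (here, arbitrarily, the built-in object).
    early = utility.__func__(object, "arg")


print(Spam.kind)
print(Spam.early)
print(Spam.utility("arg"))
```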

The more useful symmetry improvement is to the consistency of
behaviour between instance methods on class instances and the
behaviour of class methods on classes themselves.

So I don't think this is a huge gain in expressiveness, but I do think
it's a low cost consistency improvement that should make it easier to
start unifying more of the descriptor handling logic internally.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] [issue33865] [EASY] Missing code page aliases: "unknown encoding: 874"

2018-06-22 Thread Ronald Oussoren
On 21 Jun 2018, at 09:17, Stephen J. Turnbull  wrote:
> Ronald Oussoren writes:
>> Possibly just for the “cp…” encodings, but IMHO only if we confirm
>> that the code to look for the preferred encoding returns a codepage
>> number on Windows and changing that code leads to worse results
>> than adding numeric aliases for the “cp…” encodings.
>
> Almost all of the CPxxx encodings have multiple aliases[1], so I just
> don't see the point unless numeric-only code page designations are
> baked in to default "locales"[2] in official releases by major OS
> vendors.  And probably not even then, since it should be easy enough
> to provide a proper "locale" and/or PYTHONIOENCODING setting.

The user shouldn’t have to do anything other than install Python. IMHO
we’re doing something wrong when the python interpreter doesn’t start up
with a default system configuration (when the user explicitly sets a bogus
PYTHONIOENCODING or locale all bets are off, although even then
warning about and then ignoring bad settings would be more user-friendly
than the current behavior).

> Of course we should help the reporter figure out what's going on and
> help them fix it with appropriate system configuration.  If that
> doesn't work, then (and *only then*) we could think about doing a
> stupid thing.

The issue is making slow progress. I’m not a Windows user myself and
therefore cannot easily experiment with what’s going on (other than by
reading the code).

Ronald


Re: [Python-ideas] String and bytes bitwise operations

2018-06-22 Thread INADA Naoki
Bitwise xor is used for "masking" code like these:

https://github.com/PyMySQL/PyMySQL/blob/37eba60439039eff17b32ef1a63b45c25ea28cec/pymysql/connections.py#L139-L146
https://github.com/tornadoweb/tornado/blob/0b2b055061eb4754c80a8d6bc28614b86954e336/tornado/util.py#L470-L471
https://github.com/tornadoweb/tornado/blob/master/tornado/speedups.c#L5

I think implementing it in C is really helpful for protocol library authors.

On Thu, May 17, 2018 at 7:54 PM Ken Hilton  wrote:

> Hi all,
>
> We all know the bitwise operators: & (and), | (or), ^ (xor), and ~ (not).
> We know how they work with numbers:
>
> 420 ^ 502
>
> 110100100
> 111110110
> == XOR ==
> 001010010
> = 82
>
> But it might be useful in some cases to (let's say) xor a string (or
> bytestring):
>
> HELLO ^ world
>
> 01001000 01000101 01001100 01001100 01001111
> 01110111 01101111 01110010 01101100 01100100
> =================== XOR ====================
> 00111111 00101010 00111110 00100000 00101011
> = ?*> +
>
> Currently, that's done with this expression for strings:
>
> >>> ''.join(chr(ord(a) ^ ord(b)) for a, b in zip('HELLO', 'world'))
> '?*> +'
>
> and this expression for bytestrings:
>
> >>> bytes(a ^ b for a, b in zip(b'HELLO', b'world'))
> b'?*> +'
>
> It would be much more convenient, however, to allow a simple xor of a
> string:
>
> >>> 'HELLO' ^ 'world'
> '?*> +'
>
> or bytestring:
>
> >>> b'HELLO' ^ b'world'
> b'?*> +'
>
> (All of this applies to other bitwise operators, of course.)
> Compatibility issues are a no-brainer - currently, bitwise operators for
> strings raise TypeErrors.
>
> Thanks.
>
> Suggesting,
> Ken Hilton
>


-- 
INADA Naoki  