[issue40791] hmac.compare_digest could try harder to be constant-time.

2020-05-27 Thread Devin Jeanpierre


Change by Devin Jeanpierre :


--
keywords: +patch
pull_requests: +19700
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/20444

___
Python tracker 
<https://bugs.python.org/issue40791>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40791] hmac.compare_digest could try harder to be constant-time.

2020-05-27 Thread Devin Jeanpierre


New submission from Devin Jeanpierre :

`hmac.compare_digest` (via `_tscmp`) does not mark the accumulator variable 
`result` as volatile, which means that the compiler is allowed to short-circuit 
the comparison loop as long as it still reads from both strings.

In particular, when `result` is non-volatile, the compiler is allowed to change 
the loop from this:


```c
for (i = 0; i < length; i++) {
    result |= *left++ ^ *right++;
}
return (result == 0);
```

into (the moral equivalent of) this:


```c
for (i = 0; i < length; i++) {
    result |= *left++ ^ *right++;
    if (result) {
        /* finish the volatile reads, then bail out early */
        for (; ++i < length;) {
            *left++; *right++;
        }
        return 0;  /* result != 0, so (result == 0) is false */
    }
}
return (result == 0);
```

(Code not tested.)

This might not seem like much, but it cuts out almost all of the data 
dependencies between `result`, `left`, and `right`, which in theory frees 
the CPU to race ahead using out-of-order execution: it could execute code 
that depends on the result of `_tscmp` even while `_tscmp` is still 
performing the volatile reads. (I have not actually benchmarked this. :)) 
In other words, this weird short-circuiting could actually improve 
performance, and that, in turn, means it would break the constant-time 
guarantees.

(This is different from saying that it _would_ increase performance, but 
marking it volatile removes the worry.)

(Prior art/discussion: 
https://github.com/google/tink/commit/335291c42eecf29fca3d85fed6179d11287d253e )


I propose two changes, one trivial, and one that's more invasive:

1) Make `result` a `volatile unsigned char` instead of `unsigned char`. 

2) When SSL is available, instead use `CRYPTO_memcmp` from OpenSSL/BoringSSL. 
We are, in effect, "rolling our own crypto". The SSL libraries are more 
strictly audited for timing issues, down to actually checking the generated 
machine code. As tools improve, those libraries will grow to use those tools. 
If we use their functions, we get the benefit of those audits and improvements.
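For context, `_tscmp` backs the public `hmac.compare_digest` API. A minimal sketch of the intended use (key and message values here are purely illustrative):

```python
import hashlib
import hmac

key = b"server-side-secret"  # illustrative values, not from the issue
message = b"payload"
expected = hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(msg: bytes, claimed: str) -> bool:
    digest = hmac.new(key, msg, hashlib.sha256).hexdigest()
    # compare_digest rather than ==: ordinary string equality
    # short-circuits at the first differing byte, leaking timing
    # information about the expected digest.
    return hmac.compare_digest(digest, claimed)

print(verify(message, expected))      # True
print(verify(b"tampered", expected))  # False
```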

--
components: Library (Lib)
messages: 370053
nosy: Devin Jeanpierre
priority: normal
severity: normal
status: open
title: hmac.compare_digest could try harder to be constant-time.
versions: Python 3.10, Python 3.5, Python 3.6, Python 3.7, Python 3.8, Python 3.9




Re: Users banned

2018-07-15 Thread Devin Jeanpierre
On Sun, Jul 15, 2018 at 5:09 PM Jim Lee  wrote:
> That is, of course, the decision of the moderators - but I happen to
> agree with both Christian and Ethan.  Banning for the simple reason of a
> dissenting opinion is censorship, pure and simple.  While Bart may have
> been prolific in his arguments, he never spoke in a toxic or
> condescending manner, or broke any of the rules of conduct.  I cannot
> say the same for several who engaged with him.

+1000  It seems to me like the python-list moderators are rewarding
people for being bullies, by banning the people they were bullying.
The behavior on the list the past few days has been unforgivably
toxic, and that has nothing to do with the behavior of Bart et al.

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Kindness

2018-07-13 Thread Devin Jeanpierre
On Fri, Jul 13, 2018 at 10:49 AM Mark Lawrence  wrote:
>
> On 13/07/18 16:16, Bart wrote:
> > On 13/07/2018 13:33, Steven D'Aprano wrote:
> >> On Fri, 13 Jul 2018 11:37:41 +0100, Bart wrote:
> >>
> >>> (** Something so radical I've been using them elsewhere since forever.)
> >>
> >> And you just can't resist making it about you and your language.
> >
> > And you can't resist having a personal dig.
> >
>
> You are a troll and should have been banned from this list years ago.

This exchange is unacceptable. I don't know who Bart is or what their
language is, but they left a basically OK (if kinda edgy) comment, and
this was immediately escalated into a series of personal attacks.

Not everyone is as familiar with the surrounding context. To me, it
looks like you are all bullying someone. Even if you think it is
justified, consider how it looks to others. I am afraid of ever
getting on your "bad side", and I bet the same is true of other
non-Bart people, too.

-- Devin


Re: Thread-safe way to add a key to a dict only if it isn't already there?

2018-07-07 Thread Devin Jeanpierre
On Sat, Jul 7, 2018 at 6:49 AM Marko Rauhamaa  wrote:
> Is that guaranteed to be thread-safe? The documentation
> (<https://docs.python.org/3/library/stdtypes.html#dict.setdefault>)
> makes no such promise.

It's guaranteed to be thread-safe because all of Python's core
containers are thread-safe: whatever behaviors and invariants they
document also hold in multithreaded code. (Python does not take the
approach of other languages, whose "thread-compatible" containers have
undefined behavior if mutated from multiple threads simultaneously.)
It isn't guaranteed to be _atomic_ by the documentation, but I bet no
Python implementation would make dict.setdefault non-atomic.

There's no good description of the threading rules for Python data
structures anywhere. ISTR there was a proposal to give Python some
defined rules around thread-safety a couple of years ago (to help with
things like GIL-less python projects), but I guess nothing ever came
of it.
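As a concrete illustration of the insert-if-absent pattern under discussion (a toy sketch; `get_connection` and the host name are made up, and it assumes the setdefault atomicity argued above, which holds in CPython):

```python
import threading

cache = {}

def get_connection(host):
    # One dict operation: setdefault stores the new value only if the
    # key is absent and returns whichever value ends up in the dict,
    # so there is no check-then-act race between threads.
    return cache.setdefault(host, object())  # object() stands in for a real connection

results = []

def worker():
    results.append(get_connection("db.example.com"))

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# If setdefault is atomic, every thread got the very same object back.
print(len({id(conn) for conn in results}))  # 1
```

Note that the default argument is still evaluated eagerly by every caller; for expensive defaults, an explicit lock remains the safer tool.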

-- Devin


Re: Why is the use of an undefined name not a syntax error?

2018-04-01 Thread Devin Jeanpierre
On Sun, Apr 1, 2018 at 2:38 PM, Chris Angelico  wrote:
> On Mon, Apr 2, 2018 at 7:24 AM, David Foster  wrote:
>> My understanding is that the Python interpreter already has enough 
>> information when bytecode-compiling a .py file to determine which names 
>> correspond to local variables in functions. That suggests it has enough 
>> information to identify all valid names in a .py file and in particular to 
>> identify which names are not valid.
>>
>
> It's not as simple as you think. Here's a demo. Using all of the
> information available to the compiler, tell me which of these names
> are valid and which are not:

This feels like browbeating to me. Just because a programmer finds it
hard to figure out manually, doesn't mean a computer can't do it
automatically. And anyway, isn't the complexity of reviewing such code
an argument in favor of automatic detection, rather than against?

For example, whether or not "except Exception:" raises an error
depends on what kind of scope we are in and what variable declarations
exist in this scope (in a global or class scope, all lookups are
dynamic and go up to the builtins, whereas in a function body this
would have resulted in an unbound local exception because it uses fast
local lookup). What a complex thing. But easy for a computer to
detect, actually -- it's right in the syntax tree (and bytecode) what
kind of lookup it is, and what paths lead to defining it, and a fairly
trivial control flow analysis would discover if it will always, never,
or sometimes raise a NameError -- in the absence of "extreme dynamism"
like mutating the builtins and so on. :(

Unfortunately, the extreme dynamism can't really be eliminated as a
possibility, and there's no rule that says "just because this will
always raise an exception, we can fail at compile-time instead". Maybe
a particular UnboundLocalError was on purpose, after all. Python
doesn't know.  So probably this can't ever sensibly be a compile
error, even if it's a fantastically useful lint warning.

-- Devin


Re: Why is the use of an undefined name not a syntax error?

2018-04-01 Thread Devin Jeanpierre
> But if it is cheap to detect a wide variety of name errors at compile time, 
> is there any particular reason it is not done?

From my perspective, it is done, but by tools that give better output
than Python's parser. :)

Linters (like pylint) are better than syntax errors here, because they
collect all of the undefined variables, not just the first one. Maybe
Python could/should be changed to give more detailed errors of this
kind as well. e.g. Clang parse errors for C and C++ are much more
thorough and will report all of your typos, not just the first one.

> P.S. Here are some uncommon language features that interfere with identifying 
> all valid names. In their absence, one might expect an invalid name to be a 
> syntax error:

Also, if statements, depending on what you mean by "invalid":

def foo(x):
    if x:
        y = 3
    return y  # will raise UnboundLocalError if not x

-- Devin


Re: binary decision diagrams

2017-12-20 Thread Devin Jeanpierre
On Mon, Dec 18, 2017 at 5:00 AM, Wild, Marcel, Dr 
 wrote:
> Hello everybody:
> I really don't know anything about Python (I'm using Mathematica) but with 
> the help of others learned that
>
> g=expr2bdd(f)
>
> makes the BDD (=binary decision diagram)  g of a Boolean function f.  But 
> what is the easiest (fool-proof) way to print out a diagram of g ?

Python doesn't come with support for (ro)bdds built-in. You're
probably thinking of this library, which includes visualization
instructions:

http://pyeda.readthedocs.io/en/latest/bdd.html

-- Devin


[issue29505] Submit the re, json, & csv modules to oss-fuzz testing

2017-09-11 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

Oops, so it is. I can't read apparently.

I'll spend my time on making more fuzz tests in the meantime.

--




[issue29505] Submit the re, json, & csv modules to oss-fuzz testing

2017-09-11 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

kcc strongly disagrees though. Copying latest comment:

"""
fwiw - I object to us running any of this internally at Google. We need to be 
part of the main oss-fuzz project pulling from upstream revisions. Doing this 
testing within our blackhole of internal stuff adds more work for us internally 
(read: which we're not going to do) and wouldn't provide results feedback to 
the upstream CPython project in a useful timely manner.

We must figure out how to get this to build and run on the external oss-fuzz 
infrastructure
"""

--




[issue29505] Submit the re, json, & csv modules to oss-fuzz testing

2017-09-11 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

> i'd rather make this work in oss-fuzz on cpython.  can you point me to how 
> oss-fuzz works and what it wants to do so i can better understand what it 
> needs?

I don't have any details except for what's in the PR to oss-fuzz 
(https://github.com/google/oss-fuzz/pull/731). My understanding matches what 
you've said so far:

Python is built to one directory (/out/), but then needs to be run from another 
directory (/out/ is renamed to /foo/bar/baz/out/). We need python to still 
work. I have no idea how to do this.

The only suggestion on #python-dev IRC was to statically link a libpython.a, 
but this doesn't avoid needing to import libraries like "encodings" 
dynamically, so they still need to be locatable on disk.

Is there a way to build python so that it doesn't use absolute paths to 
everything, and so that the install can be moved at will? Or is there a way to 
tell it that it was moved at runtime? (I am unconvinced PYTHONPATH is a 
maintainable solution, if it works at all...)


oss-fuzz is not going to change away from its model (I asked if they could, 
they said no), so we're stuck with making Python compatible with it one way or 
another.  This is why I am so drawn to running the test internally on Google's 
infrastructure anyway: we already _did_ all this work, via hermetic 
python. Doing it a second time, but worse, seems annoying.

--




[issue29505] Submit the re, json, & csv modules to oss-fuzz testing

2017-09-08 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

So here's an interesting issue: oss-fuzz requires that the built location be 
movable. IOW, we build Python into $OUT, and then the $OUT directory gets moved 
somewhere else and the fuzz test gets run from there. This causes problems 
because Python can no longer find where the modules it needs are (encodings for 
example).

First thought: wouldn't it be nice if we could make a prepackaged and hermetic 
executable that we can move around freely?

Second thought: isn't that "Hermetic Python", as used within Google?

Third thought: doesn't Google have an internal fuzz testing environment we can 
use, instead of oss-fuzz?

So unless someone says this is a bad idea, I'd propose we not run these in 
oss-fuzz and instead run them in Google proper. The alternative is if there's a 
way to make it easy to move Python around -- is there a way to build it s.t. 
the import path is relative and so on?

--




[issue29505] Submit the re, json, & csv modules to oss-fuzz testing

2017-09-07 Thread Devin Jeanpierre

Changes by Devin Jeanpierre <jeanpierr...@gmail.com>:


--
keywords: +patch
pull_requests: +3434
stage: test needed -> patch review




[issue29505] Submit the re, json, & csv modules to oss-fuzz testing

2017-09-06 Thread Devin Jeanpierre

Changes by Devin Jeanpierre <jeanpierr...@gmail.com>:


--
pull_requests: +3412




[issue29505] Submit the re, json, & csv modules to oss-fuzz testing

2017-09-06 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

Huh. I would not have predicted that.

https://gcc.gnu.org/onlinedocs/cpp/Defined.html

I'll send a fix.

--




[issue29505] Submit the re, json, & csv modules to oss-fuzz testing

2017-07-25 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

I think they misspoke; it's normal with fuzzing to test against master. The 
current draft of the code runs this git pull before building/launching any 
tests:

git clone --depth 1 https://github.com/python/cpython.git cpython

Speaking of which, I forgot to update this bug thread with the followup PR to 
actually run CPython's fuzz tests (when they exist): 
https://github.com/google/oss-fuzz/pull/731. That's where I grabbed the git 
clone statement from. I think that will be merged after some version of PR 2878 
lands in CPython (still in code review / broken).



For Python 2 I guess it's different, and we will test against the 2.7 branch, 
right?

--




[issue29505] Submit the re, json, & csv modules to oss-fuzz testing

2017-07-25 Thread Devin Jeanpierre

Changes by Devin Jeanpierre <jeanpierr...@gmail.com>:


--
pull_requests: +2929




[issue17870] Python does not provide PyLong_FromIntMax_t() or PyLong_FromUintMax_t() function

2017-06-15 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

Oh, to be clear on this last point:

> Hum, who else needs such function except of you?

Right now there is no way to convert an int that might be > 64 bits, into a 
python long, except really bizarre shenanigans, unless we want to rely on 
implementation-defined behavior.

This would be fine if it were easy to implement, but it isn't -- as we've both 
agreed, there's no good way to do this, and it is significantly easier to add 
this to CPython than to implement this from outside of CPython. And I do think 
there is merit in writing code that doesn't rely on implementation-defined 
behavior.

I also think it's simpler -- imagine if we just didn't care about all these int 
types! Phew.

Ack that this isn't "strong rationale" per your standards, so do whatever is 
right for this bug.

--




[issue17870] Python does not provide PyLong_FromIntMax_t() or PyLong_FromUintMax_t() function

2017-06-15 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

> Making two C functions public is very different from supporting intmax_t. I 
> expect a change of a few lines, whereas my intmax_t patch modified a lot of 
> code.

I requested either a way to create from intmax_t, or from bytes. We have two 
existing functions (that I didn't know about) to do the latter, so it would fix 
this bug report to just make those public, from my POV.
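At the Python level, the byte-array route already round-trips integers of any width; a small sketch of the analogous operation (values illustrative):

```python
# Any fixed-width integer, however wide, can round-trip through bytes --
# the Python-level analogue of the byte-array conversion discussed here.
value = -(2 ** 100)  # wider than any C long long
raw = value.to_bytes(16, "little", signed=True)
restored = int.from_bytes(raw, "little", signed=True)
print(restored == value)  # True
```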

--




[issue17870] Python does not provide PyLong_FromIntMax_t() or PyLong_FromUintMax_t() function

2017-06-15 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

> Devin, I asked you for a strong rationale to add the feature. I don't see 
> such rationale, so this issue will be closed again.

I guess we have different definitions of "strong rationale". Clearer criteria 
would help.

>> It may be better to make _PyLong_FromByteArray() and _PyLong_AsByteArray() 
>> public.
> That makes sense. I suggest to open a new issue for that.

This request was part of the original bug report, so why open a new issue?

> PyLong_FromIntMax_t(myinteger) would be great. Or maybe even better would be 
> PyLong_FromBytes(, sizeof(myinteger)) ?

--




[issue17870] Python does not provide PyLong_FromIntMax_t() or PyLong_FromUintMax_t() function

2017-06-14 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

> Write your own C extension to do that. Sorry, I don't know what is the best 
> way to write such C extension.

If everyone who wants to convert intptr_t to a python int has to write their 
own function, then why not just include it in the C-API?

Having support for intmax_t means we never have to have this conversation ever 
again, because it should work for all int types.

Reopening since this use-case doesn't sound solved yet.

--
resolution: rejected -> 
status: closed -> open




[issue17870] Python does not provide PyLong_FromIntMax_t() or PyLong_FromUintMax_t() function

2017-06-14 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

> I wrote my first patch in 2013, but I still fail to find a very good example 
> where intmax_t would be an obvious choice. So I have to agree and I will now 
> close the issue.

Hold on, nobody ever answered the question in the OP. How would you convert an 
intptr_t (e.g. Rust's int type) to a Python int?

You can't use FromVoidPtr because of signedness. You can use FromLongLong, but 
that's implementation-defined.

If what we should be using is FromLongLong for all "really big ints", why not 
just rename FromLongLong to FromIntMax and call it a day?



There is no standard relationship between long long and most other int types -- 
all we know is that it's at least 64 bits, but an int type can perfectly 
reasonably be e.g. 80 bits or 128 bits or similar. I think it *is* a worthwhile 
goal to allow programmers to write C code that has as little 
implementation-defined or undefined behavior as possible.


If that isn't considered a worthwhile goal, maybe we should reconsider using 
such a dangerous and pointy language as C. :)

--




[issue29505] Submit the re, json, & csv modules to oss-fuzz testing

2017-05-09 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

https://github.com/google/oss-fuzz/pull/583 is the PR to oss-fuzz to add the 
project. I'm working on actual tests to be submitted here.

--




[issue29505] Submit the re, json, & csv modules to oss-fuzz testing

2017-05-02 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

Aha, I found an existing issue!

For adding to oss-fuzz, is there a contact email we can use that is connected 
to a google account? I am tempted to just put gregory.p.smith on there if not. 
:)




I can volunteer to fuzz some interesting subset of the stdlib. The list I've 
come up with (by counting uses in my code) is:

the XML parser (which seems to be written in C)
struct (unpack)
the various builtins that parse strings (like int())
hashlib
binascii
datetime's parsing
json


I'd also suggest the ast module, since people do use ast.literal_eval on 
untrusted strings, but I probably won't do that one myself.



I wrote a fuzz test for json via upstream simplejson, but the bug on github is 
getting stale: https://github.com/simplejson/simplejson/issues/163

Should I add it to CPython instead?



> We should investigate creating fuzz targets for the Python re module (_sre.c) 
> at a minimum.

If we prioritize based on security risk, I'd argue that this is lower priority 
than things like json's speedup extension module, because people should 
generally not pass untrusted strings to the re module: it's very easy to DOS a 
service with regexes unless you're using RE2 or similar -- which is fuzzed.  In 
contrast, json is supposed to accept untrusted input and people do that very 
often.

(OTOH, I would be willing to bet that fuzzing re will yield more bugs than 
fuzzing json.)

--
nosy: +Devin Jeanpierre




Re: tempname.mktemp functionality deprecation

2017-04-29 Thread Devin Jeanpierre
On Sat, Apr 29, 2017 at 11:45 AM, Tim Chase
 wrote:
> Unfortunately, tempfile.mktemp() is described as deprecated
> since 2.3 (though appears to still exist in the 3.4.2 that is the
> default Py3 on Debian Stable). While the deprecation notice says
> "In version 2.3 of Python, this module was overhauled for enhanced
> security. It now provides three new functions, NamedTemporaryFile(),
> mkstemp(), and mkdtemp(), which should eliminate all remaining need
> to use the insecure mktemp() function", as best I can tell, all of
> the other functions/objects in the tempfile module return a file
> object, not a string suitable for passing to link().
>
> So which route should I pursue?
>
> - go ahead and use tempfile.mktemp() ignoring the deprecation?
>
> - use a GUID-named temp-file instead for less chance of collision?
>
> - I happen to already have a hash of the file contents, so use
>   the .hexdigest() string as the temp-file name?
>
> - some other solution I've missed?

I vote the last one: you can read the .name attribute of the returned
file(-like) object from NamedTemporaryFile to get a path to a file,
which can be passed to other functions.

I guess ideally, one would use linkat instead of os.link[*], but that's
platform-specific and not exposed in Python AFAIK. Maybe things would
be better if all the functions that accept filenames should also
accept files, and do the best job they can? (if a platform supports
using the fd instead, use that, otherwise use f.name).

.. *: 
http://stackoverflow.com/questions/17127522/create-a-hard-link-from-a-file-handle-on-unix/18644492#18644492
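The .name route looks roughly like this (a sketch; delete=False and the link destination are illustrative choices, not the only way to do it):

```python
import os
import tempfile

# NamedTemporaryFile returns a file object whose .name is a real path
# on disk, so the path can be handed to path-based APIs such as os.link.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"contents")
    tmp.flush()
    link_path = tmp.name + ".lnk"  # illustrative destination
    os.link(tmp.name, link_path)   # hard link while the file still exists

print(os.path.exists(link_path))  # True
os.unlink(link_path)
os.unlink(tmp.name)
```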

-- Devin


[issue29986] Documentation recommends raising TypeError from tp_richcompare

2017-04-04 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

Yeah, I agree there might be a use-case (can't find one offhand, but in 
principle), but I think it's rare enough that you're more likely to be led 
astray from reading this note -- almost always, NotImplemented does what you 
want.

In a way this is a special case of being able to raise an exception at all, 
which is mentioned earlier ("if another error occurred it must return NULL and 
set an exception condition.")

--




[issue29986] Documentation recommends raising TypeError from tp_richcompare

2017-04-04 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

Sorry, forgot to link to docs because I was copy-pasting from the PR:

https://docs.python.org/2/c-api/typeobj.html#c.PyTypeObject.tp_richcompare

https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_richcompare

> Note: If you want to implement a type for which only a limited set of 
> comparisons makes sense (e.g. == and !=, but not < and friends), directly 
> raise TypeError in the rich comparison function.

--




[issue29986] Documentation recommends raising TypeError from tp_richcompare

2017-04-04 Thread Devin Jeanpierre

New submission from Devin Jeanpierre:

I am not sure when TypeError is the right choice. Definitely, most of the time 
I've seen it done, it causes trouble, and NotImplemented usually does something 
better.

For example, see the work in https://bugs.python.org/issue8743 to get set to 
interoperate correctly with other set-like classes --- a problem caused by the 
use of TypeError instead of returning NotImplemented (e.g. 
https://hg.python.org/cpython/rev/3615cdb3b86d).

This advice seems to conflict with the usual and expected behavior of objects 
from Python: e.g. object().__lt__(1) returns NotImplemented rather than raising 
TypeError, despite < not "making sense" for object. Similarly for file objects 
and other uncomparable classes. Even complex numbers only return NotImplemented!


>>> 1j.__lt__(1j)
NotImplemented


If this note should be kept, this section could use a decent explanation of the 
difference between "undefined" (should return NotImplemented) and "nonsensical" 
(should apparently raise TypeError). Perhaps a reference to an example from the 
stdlib.
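The same contrast is visible from pure Python with a toy class (illustrative, not from the thread): returning NotImplemented for unsupported comparisons lets the normal fallback machinery run, while ordering still fails with TypeError on its own.

```python
class Currency:
    """Toy type for which only == / != make sense, not < and friends."""
    def __init__(self, code):
        self.code = code

    def __eq__(self, other):
        if not isinstance(other, Currency):
            # Let Python try the reflected operation on `other`, then
            # fall back to the default identity answer -- no TypeError.
            return NotImplemented
        return self.code == other.code

print(Currency("USD") == Currency("USD"))  # True
print(Currency("USD") == "USD")            # False, via the fallback
try:
    Currency("USD") < Currency("EUR")
except TypeError:
    print("ordering raises TypeError without any __lt__")
```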

--
assignee: docs@python
components: Documentation
messages: 291144
nosy: Devin Jeanpierre, docs@python
priority: normal
pull_requests: 1167
severity: normal
status: open
title: Documentation recommends raising TypeError from tp_richcompare




Re: Clickable hyperlinks

2017-01-05 Thread Devin Jeanpierre
Sadly, no. :(  Consoles (and stdout) are just text, not hypertext. The way to 
make an URL clickable is to use a terminal that makes URLs clickable, and print 
the URL:

print("%s: %s" % (description, url))

-- Devin

On Tue, Jan 3, 2017 at 11:46 AM, Deborah Swanson  
wrote:

> Excel has a formula:
>
> =HYPERLINK(url,description)
>
> that will put a clickable link into a cell.
>
> Does python have an equivalent function? Probably the most common use
> for it would be output to the console, similar to a print statement, but
> clickable.
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>





Re: Looking for ideas to improve library API

2015-11-28 Thread Devin Jeanpierre
Documentation is all you can do.

-- Devin

On Thu, Nov 26, 2015 at 5:35 AM, Chris Lalancette <clalance...@gmail.com> wrote:
> On Thu, Nov 26, 2015 at 7:46 AM, Devin Jeanpierre
> <jeanpierr...@gmail.com> wrote:
>> Why not take ownership of the file object, instead of requiring users
>> to manage lifetimes?
>
> Yeah, I've kind of been coming to this conclusion.  So my question
> then becomes: how do I "take ownership" of it?  I already keep a
> reference to it, but how would I signal to the API user that they
> should no longer use that file object (other than documentation)?
>
> Thanks,
> Chris


Re: Looking for ideas to improve library API

2015-11-26 Thread Devin Jeanpierre
Why not take ownership of the file object, instead of requiring users
to manage lifetimes?
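A rough sketch of what taking ownership could look like for an API shaped like pyiso's (hypothetical methods and behavior, not the real library):

```python
import io

class Iso:
    """Hypothetical sketch, NOT the real pyiso API: the object owns
    every file it is given and closes them all itself."""

    def __init__(self):
        self._owned = []

    def open(self, fp):
        self._owned.append(fp)   # ownership transfers to the library

    def add_fp(self, fp):
        self._owned.append(fp)   # same: the caller must not close fp

    def write(self, out):
        out.write(b"iso image bytes")  # placeholder for the real work

    def close(self):
        for fp in self._owned:
            fp.close()
        self._owned = []

    # Context-manager support: `with Iso() as p: ...` closes everything.
    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()

with Iso() as p:
    src = io.BytesIO(b"original iso")
    p.open(src)
    out = io.BytesIO()
    p.write(out)
print(src.closed, out.closed)  # -> True False
```

The library closes what it owns; `out` still belongs to the caller, since the library only writes to it.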

-- Devin

On Wed, Nov 25, 2015 at 12:52 PM, Chris Lalancette
 wrote:
> Hello,
>  I'm currently developing a library called pyiso (
> https://github.com/clalancette/pyiso), used for manipulating ISO disk
> images.  I'm pretty far along with it, but there is one part of the API
> that I really don't like.
> Typical usage of the library is something like:
>
> import pyiso
>
> p = pyiso.PyIso()  # create the object
> f = open('/path/to/original.iso', 'rb')
> p.open(f)  # parse all of the metadata from the input ISO
> fp = open('/path/to/file/to/add/to/iso', 'rb')
> p.add_fp(fp)  # add a new file to the ISO
> out = open('/path/to/modified.iso', 'wb')
> p.write(out)  # write out the modified ISO to another file
> out.close()
> fp.close()
> f.close()
>
> This currently works OK.  The problem ends up being the file descriptor
> lifetimes.  I want the user to be able to do multiple operations to the
> ISO, and I also don't want to read the entire ISO (and new files) into
> memory.  That means that internal to the library, I take a reference to the
> file object that the user passes in during open() and add_fp().  This is
> fine, unless the user decides to close the file object before calling the
> write method, at which point the write complains of I/O to a closed file.
> This is especially problematic when it comes to using context managers,
> since the user needs to leave the context open until they call write().
>   I've thought of a couple ways to deal with this:
>
> 1.  Make a copy of the file object internal to the library, using os.dup()
> to copy the file descriptor.  This is kind of nasty, especially since I
> want to support other kinds of file objects (think StringIO).
> 2.  Just document the fact that the user needs to leave the file objects
> open until they are done.  This is simple, but not super user-friendly.
>
> I'm looking for any ideas of how to do this better, or something I missed.
> Any input is appreciated!
>
> Thanks,
> Chris Lalancette


Re: Should non-security 2.7 bugs be fixed?

2015-07-20 Thread Devin Jeanpierre
I think you're missing the line where I said all the relevant
conversation happened in IRC, and that you should refer to logs.

On Sun, Jul 19, 2015 at 11:25 PM, Terry Reedy tjre...@udel.edu wrote:
 On 7/19/2015 9:20 PM, Devin Jeanpierre wrote:

 Search your logs for https://bugs.python.org/issue17094
 http://bugs.python.org/issue5315

 I was most frustrated by the first case --

 the patch was (informally) rejected

 By 'the patch', I presume you mean current-frames-cleanup.patch
 by Stefan Ring, who said it is "certainly not the most complete solution,
 but it solves my problem". It was reviewed a month later by a core dev, who
 said it had two defects.  Do you expect us to apply defective patches?

No, I meant my patch. It was discussed in IRC, and I gave the search
term to grep for. (The issue URL.)

 in favor of the right fix,


 "Right" is your word. Natali simply uploaded an alternate patch that did not
 have the defects cited.  It went through 4 versions, two by Pitrou, before
 the commit and close 2 months later, with the comment "Hopefully there
 aren't any applications relying on the previous behaviour."

No, "right" is the word used by members of #python-dev, referring to
Antoine's fix.

 Two years later, last May, you proposed and uploaded a patch with what looks
 to be a new and different approach.  It has been ignored.  In the absence of
 a core dev focused on 2.7, I expect that this will continue. Too bad you did
 not upload it in Feb 2013, before the review and fix started.

I'm not sure what you're implying here. It couldn't be helped.

 and http://bugs.python.org/issue5315

 Another fairly obscure issue for most of us. Five years ago, this was turned
 into a doc issue, but no patch was ever submitted for either 2.x or 3.x.
 Again, no particular prejudice against 2.x.

 In May, you posted a bugfix which so far has been ignored.  Not too
 surprising.  I submitted a ping and updated the versions.  If anyone
 responds, you might be asked for a patch against 3.4 or 3.5.

Again, the prejudice was expressed in IRC. It was ignored because you
can just use asyncio in 3.x, and because the bug was old.

-- Devin


Re: Should non-security 2.7 bugs be fixed?

2015-07-19 Thread Devin Jeanpierre
On Sat, Jul 18, 2015 at 9:45 PM, Steven D'Aprano st...@pearwood.info wrote:
 It gets really boring submitting 2.7-specific patches, though, when
 they aren't accepted, and the committers have such a hostile attitude
 towards it. I was told by core devs that, instead of fixing bugs in
 Python 2, I should just rewrite my app in Python 3.

 Really? Can you point us to this discussion?

Yes, really. It was on #python-dev IRC.

 If you are right, and that was an official pronouncement, then it seems that
 non-security bug fixes to 2.7 are forbidden.

I never said it was a pronouncement, or official. It wasn't. I have no
idea where you got that idea from, given that I specifically have said
that I think non-security bug fixes are allowed.

 I suspect though that it's not quite that black and white. Perhaps there was
 some doubt about whether or not the patch in question was fixing a bug or
 adding a feature (a behavioural change). Or the core dev in question was
 speaking for themselves, not for all.

They weren't speaking for all. And, I never said they were. Nor did I
imply that they were.

Search your logs for https://bugs.python.org/issue17094 and
http://bugs.python.org/issue5315

I was most frustrated by the first case -- the patch was (informally)
rejected in favor of the "right" fix, and the "right" fix was
(informally) rejected because it changed behavior, leaving me only
with the option of absurd workarounds of a bug in Python, or moving to
python 3.

 It has even been
 implied that bugs in Python 2 are *good*, because that might help with
 Python 3 adoption.

 Really? Can you point us to this discussion?

 As they say on Wikipedia, Citation Needed. I would like to see the context
 before taking that at face value.

Of course, it was a joke. The format of the joke goes like this:
people spend a lot of time debugging and writing bugfixes for Python
2.7, and you say:

  <dev2> guido wants all python 3 features in python 2, so ssbr` maybe
choose the right time to ask a backport ;-)
  <dev1> oh. if i would be paid to contribute to cpython, i would
probably be ok to backport anything from python 3 to python 2
  <dev1> since i'm not paid for that, i will to kill python 2, it must
suffer a lot

And that's about as close to logs as I am comfortable posting. Grep
your logs for that, too.



I don't like how this is being redirected to "surely you
misunderstood" or "I don't believe you". The fact that some core devs
are hostile to 2.x development is really bleedingly obvious, you
shouldn't need quotes or context thrown at you. The rhetoric almost
always shies _just_ short of ceasing bugfixes (until 2020, when that
abruptly becomes a cracking good idea), e.g. "2.7 is here until
2020, please don't call it a waste."

I don't want to argue over who said what. I am sure everyone meant the
best, and I misunderstood them given a complicated context and a rough
day. Let's end this thread here, please.

-- Devin


Re: Should non-security 2.7 bugs be fixed?

2015-07-19 Thread Devin Jeanpierre
On Sun, Jul 19, 2015 at 8:05 PM, Steven D'Aprano st...@pearwood.info wrote:
 On Mon, 20 Jul 2015 11:20 am, Devin Jeanpierre wrote:
 I was most frustrated by the first case -- the patch was (informally)
 rejected in favor of the right fix, and the right fix was
 (informally) rejected because it changed behavior, leaving me only
 with the option of absurd workarounds of a bug in Python, or moving to
 python 3.

 In the first case, 17094, your comments weren't added until TWO YEARS after
 the issue was closed. It's quite possible that nobody has even noticed
 them. In the second case, the issue is still open. So I don't understand
 your description above: there's no sign that the patch in 17094 was
 rejected, the patch had bugs and it was fixed and applied to 3.4. It wasn't
 applied to 2.7 for the reasons explained in the tracker: it could break
 code that is currently working.

 For the second issue, it has neither been applied nor rejected.

I meant search your #python-dev IRC logs, where this was discussed.

As far as whether people notice patches after an issue is closed,
Terry Reedy answered yes earlier in the thread. If the answer is
actually no, then we should fix how bugs are handled post-closure,
in case e.g. someone posts a followup patch that fixes a remaining
case, and so on.

 you
 shouldn't need quotes or context thrown at you. The rhetoric almost
 always shies _just_ short of ceasing bugfixes (until 2020, when that
 abruptly becomes a cracking good idea). e.g. in 2.7 is here until
 2020, please don't call it a waste.

 Right. So you take an extended ten year maintenance period for Python 2.7 as
 evidence that the core devs are *hostile* to maintaining 2.7? That makes no
 sense to me.

That isn't what I said at all.

 If you want to say that *some individuals* who happen to have commit rights
 are hostile to Python 2.7, I can't really argue with that. Individuals can
 have all sorts of ideas and opinions. But the core devs as a group are very
 supportive of Python 2.7, even going to the effort of back-porting
 performance improvements.

I do want to say that. It doesn't help that those same individuals are
the only core devs I have interacted with while trying to patch 2.7.

-- Devin


Re: Should non-security 2.7 bugs be fixed?

2015-07-18 Thread Devin Jeanpierre
Considering CPython is officially accepting performance improvements
to 2.7, surely bug fixes are also allowed?

I have contributed both performance improvements and bug fixes to 2.7.
In my experience, the problem is not the lack of contributors, it's
the lack of code reviewers.

I think this is something everyone should care about. The really great
thing about working on a project like Python is that not only do you
help the programmers who use Python, but also the users who use the
software that those programmers create. Python 2.7 is important in the
software ecosystem of the world. Fixing bugs and making performance
improvements can sometimes significantly help the 1B people who use
the software written in Python 2.7.

-- Devin

On Sat, Jul 18, 2015 at 4:36 PM, Terry Reedy tjre...@udel.edu wrote:
 I asked the following as an off-topic aside in a reply on another thread. I
 got one response which presented a point I had not considered.  I would like
 more viewpoints from 2.7 users.

 Background: each x.y.0 release normally gets up to 2 years of bugfixes,
 until x.(y+1).0 is released.  For 2.7, released summer 2010, the bugfix
 period was initially extended to 5 years, ending about now.  At the spring
 pycon last year, the period was extended to 10 years, with an emphasis on
 security and build fixed.  My general question is what other fixes should be
 made?  Some specific forms of this question are the following.

 If the vast majority of Python programmers are focused on 2.7, why are
 volunteers to help fix 2.7 bugs so scarce?

 Does they all consider it perfect (or sufficient) as is?

 Should the core developers who do not personally use 2.7 stop backporting,
 because no one cares if they do?

 --
 Terry Jan Reedy



Re: Should non-security 2.7 bugs be fixed?

2015-07-18 Thread Devin Jeanpierre
On Sat, Jul 18, 2015 at 6:34 PM, Terry Reedy tjre...@udel.edu wrote:
 On 7/18/2015 8:27 PM, Mark Lawrence wrote:
 On 19/07/2015 00:36, Terry Reedy wrote:
 Programmers don't much like doing maintainance work when they're paid to
 do it, so why would they volunteer to do it?

 Right.  So I am asking: if a 3.x user volunteers a 3.x patch and a 3.x core
 developer reviews and edits the patch until it is ready to commit, why
 should either of them volunteer to do a 2.7 backport that they will not use?

Because it helps even more people. The reason people make upstream
contributions is so that the world benefits. If you only wanted to
help yourself, you'd just patch CPython locally, and not bother
contributing anything upstream.

 I am suggesting that if there are 10x as many 2.7-only programmers as 3.x-only
 programmers, and none of the 2.7 programmers is willing to do the backport
 *of an already accepted patch*, then maybe it should not be done at all.

That just isn't true. I have backported 3.x patches. Other people have
backported entire modules.

It gets really boring submitting 2.7-specific patches, though, when
they aren't accepted, and the committers have such a hostile attitude
towards it. I was told by core devs that, instead of fixing bugs in
Python 2, I should just rewrite my app in Python 3. It has even been
implied that bugs in Python 2 are *good*, because that might help with
Python 3 adoption.

 Then even if you do the
 work to fix *ANY* bug there is no guarantee that it gets committed.

 I am discussing the situation where there *is* a near guarantee (if the
 backport works and does not break anything and has not been so heavily
 revised as to require a separate review).

That is not how I have experienced contribution to CPython. No, the
patches are *not* guaranteed, and in my experience they are not likely
to be accepted.

If the issue was closed as fixed before I contributed the backported
patch, does anyone even see it?

-- Devin


Re: Pure Python Data Mangling or Encrypting

2015-06-27 Thread Devin Jeanpierre
On Fri, Jun 26, 2015 at 11:16 PM, Steven D'Aprano st...@pearwood.info wrote:
 On Sat, 27 Jun 2015 02:05 pm, Devin Jeanpierre wrote:

 On Fri, Jun 26, 2015 at 8:38 PM, Steven D'Aprano st...@pearwood.info
 wrote:
 Now you say that the application encrypts the data, except that the user
 can turn that option off.

 Just make the AES encryption mandatory, not optional. Then the user
 cannot upload unencrypted malicious data, and the receiver cannot read
 the data. That's two problems solved.

 No, because another application could pretend to be the file-sending
 application, but send unencrypted data instead of encrypted data.

 Did you stop reading my post when you got to that? Because I went on to say:

At that point I quit in frustration, yeah.

 Actually, the more I think about this, the more I come to think that the
 only way this can be secure is for both the sending client application and
 the receiving client appl to both encrypt the data. The sender can't
 trust the receiver not to read the files, so the sender has to encrypt; the
 receiver can't trust the sender not to send malicious files, so the
 receiver has to encrypt too.

When you realize you've said something completely wrong, you should
edit your email.

-- Devin


Re: Pure Python Data Mangling or Encrypting

2015-06-27 Thread Devin Jeanpierre
On Sat, Jun 27, 2015 at 6:18 PM, Steven D'Aprano st...@pearwood.info wrote:
 On Sun, 28 Jun 2015 06:30 am, Devin Jeanpierre wrote:

 On Fri, Jun 26, 2015 at 11:16 PM, Steven D'Aprano st...@pearwood.info
 wrote:
 On Sat, 27 Jun 2015 02:05 pm, Devin Jeanpierre wrote:

 On Fri, Jun 26, 2015 at 8:38 PM, Steven D'Aprano st...@pearwood.info
 wrote:
 Now you say that the application encrypts the data, except that the
 user can turn that option off.

 Just make the AES encryption mandatory, not optional. Then the user
 cannot upload unencrypted malicious data, and the receiver cannot read
 the data. That's two problems solved.

 No, because another application could pretend to be the file-sending
 application, but send unencrypted data instead of encrypted data.

 Did you stop reading my post when you got to that? Because I went on to
 say:

 At that point I quit in frustration, yeah.

 Actually, the more I think about this, the more I come to think that the
 only way this can be secure is for both the sending client application
  and the receiving client app to both encrypt the data. The sender can't
 trust the receiver not to read the files, so the sender has to encrypt;
 the receiver can't trust the sender not to send malicious files, so the
 receiver has to encrypt too.

 When you realize you've said something completely wrong, you should
 edit your email.

 If both the sender and receiver encrypt the data, how is is completely
 wrong to say that encrypting data should be mandatory?

That isn't what I was calling completely wrong. This is:

 Just make the AES encryption mandatory, not optional. Then the user
 cannot upload unencrypted malicious data, and the receiver cannot read
 the data. That's two problems solved.

The user can still upload unencrypted malicious data by writing their
own client that doesn't have mandatory AES encryption.

You realized this later in the email, apparently, which is why you
should have edited your own email to delete your original, insecure,
suggestion. :(

That said, I appreciate the work you've done here asking for a
specific threat model and pushing back on the idea that it's up to
python-list to prove something is insecure, not the other way around.
That's important. I think, for the same reasons, it's also important
to be really careful what cryptosystems we discuss, and not suggest or
appear to suggest ones that won't work.


P.S. FWIW, the base64 idea has a lot of promise and is probably
fundamentally better than a crypto algorithm. With something along the
lines of base64 -- say, encoding a file using just the letters 'a' and
'b' -- one might try to make it it literally impossible to write bad
things to disk, whereas with any crypto, it is always possible to
obtain the key, so one has to be careful with key management to
prevent/mitigate that.  (One might add: why not both? Beats me. I like
using extension modules.)
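A toy sketch of that two-letter idea (my construction, purely illustrative): each input byte expands to eight 'a'/'b' characters, so no attacker-chosen byte value ever reaches the disk, at an 8x size cost:

```python
# Toy sketch, illustrative only: encode arbitrary bytes using just the
# letters 'a' and 'b' (one letter per bit, most significant bit first).
def ab_encode(data):
    return "".join("ab"[(byte >> i) & 1]
                   for byte in data for i in range(7, -1, -1))

def ab_decode(text):
    out = bytearray()
    for k in range(0, len(text), 8):
        byte = 0
        for ch in text[k:k + 8]:
            byte = (byte << 1) | (ch == "b")
        out.append(byte)
    return bytes(out)

encoded = ab_encode(b"hi")
print(encoded)             # only 'a' and 'b' appear
print(ab_decode(encoded))  # -> b'hi'
```

Unlike a substitution table there is no key to manage: the output alphabet itself guarantees nothing executable-looking is ever written.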

P.P.S.: of course, I'm not an expert.

-- Devin


Re: Pure Python Data Mangling or Encrypting

2015-06-26 Thread Devin Jeanpierre
Johannes, I agree with a lot of what you say, but can you please have
less of a mean attitude?

-- Devin

On Fri, Jun 26, 2015 at 3:42 PM, Johannes Bauer dfnsonfsdu...@gmx.de wrote:
 On 26.06.2015 23:29, Jon Ribbens wrote:

 While you seem to think that Steven is rampaging about nothing, he does
 have a fair point: You consistently were vague about whether you want to
 have encryption, authentication or obfuscation of data. This suggests
 that you may not be so sure yourself what it is you actually want.

 He hasn't been vague, you and Steven just haven't been paying
 attention.

 Bullshit. Even the topic indicates that he doesn't know what he wants:
 "data mangling" or "encryption" -- which one is it?

 You always play around with the 256! which would be a ridiculously high
 security margin (1684 bits of security, w!). You totally ignore that
 the system can be broken in a linear fashion.

 No, it can't, because the attacker does not have access to the
 ciphertext.

 Or so you claim.

 I could go into detail about how the assumption that the ciphertext is
 secret is not a smart one in the context of cryptography. And how side
 channels and other leakage may affect overall system security. But I'm
 going to save my time on that. I do get paid to review cryptographic
 systems and part of the job is dealing with belligerent people who have
 read Schneier's blog and think they can outsmart anyone else. Since I
 don't get paid to convice you, it's absolutely fine that you think your
 substitution scheme is the grand prize.

 Nobody assumes you're a moron. But it's safe to assume that you're a
 crypto layman, because only laymen have no clue on how difficult it is
 to get cryptography even remotely right.

 Amateur crypto is indeed a bad idea. But what you're still not getting
 is that what he's doing here *isn't crypto*.

 So the topic says Encrypting. If you look really closely at the word,
 the part crypt might give away to you that cryptography is involved.

 He's just trying to avoid
 letting third parties write completely arbitrary data to the disk.

 There's your requirement. Then there's obviously some kind of
 implication when a third party *can* write arbitrary data to disk. And
 your other solution to that problem...

 You
 know what would be a perfectly good solution to his problem? Base 64
 encoding. That would solve the issue pretty much completely, the only
 reason it's not an ideal solution is that it of course increases the
 size of the data.

 ...wow.

 That's a nice interpretation of "not letting a third party write
 completely arbitrary data". According to your definition, this would be:
 "It's okay if the attacker can control 6 of 8 bits."

 That people in 2015 actually defend inventing a substitution-cipher
 cryptosystem sends literally shivers down my spine.

 Nobody is defending such a thing, you just haven't understood what
 problem is being solved here.

 Oh I understand your solutions plenty well. The only thing I don't
 understand is why you don't own a Fields medal yet for your
 groundbreaking work on bulletproof obfuscation.

 Cheers,
 Johannes

 --
 "Where exactly did you predict the quake again?"
 "Well, at least not publicly!"
 "Ah, the latest and to this day most ingenious prank of our great
 cosmologists: the secret prediction."
  - Karl Kaos über Rüdiger Thomas in dsa hidbv3$om2$1...@speranza.aioe.org


Re: Pure Python Data Mangling or Encrypting

2015-06-26 Thread Devin Jeanpierre
On Fri, Jun 26, 2015 at 8:38 PM, Steven D'Aprano st...@pearwood.info wrote:
 Now you say that the application encrypts the data, except that the user can
 turn that option off.

 Just make the AES encryption mandatory, not optional. Then the user cannot
 upload unencrypted malicious data, and the receiver cannot read the data.
 That's two problems solved.

No, because another application could pretend to be the file-sending
application, but send unencrypted data instead of encrypted data.

-- Devin


Re: Pure Python Data Mangling or Encrypting

2015-06-25 Thread Devin Jeanpierre
On Thu, Jun 25, 2015 at 2:57 AM, Chris Angelico ros...@gmail.com wrote:
 On Thu, Jun 25, 2015 at 7:41 PM, Devin Jeanpierre
 jeanpierr...@gmail.com wrote:
 I know that the OP doesn't propose using ROT-13, but a classical
 substitution cipher isn't that much stronger.

 Yes, it is. It requires the attacker being able to see something about
 the ciphertext, unlike ROT13. But it is reasonable to suppose that
 maybe the attacker can trigger the file getting executed, at which
 point maybe you can deduce from the behavior what the starting bytes
 are...?


 If a symmetric cipher is being used and the key is known, anyone can
 simply perform a decryption operation on the desired bytes, get back a
 pile of meaningless encrypted junk, and submit that. When it's
 encrypted with the same key, voila! The cleartext will reappear.

 Asymmetric ciphers are a bit different, though. AIUI you can't perform
 a decryption without the private key, whereas you can encrypt with
 only the public key. So you ought to be safe on that one; the only way
 someone could deliberately craft input that, when encrypted with your
 public key, produces a specific set of bytes, would be to brute-force
 it. (But I might be wrong on that. I'm no crypto expert.)

Yes, so it should be random.
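The known-key attack described above can be demonstrated with a toy XOR "cipher" (my construction, illustrative only; the same argument applies to any symmetric cipher, AES included, once the key is known):

```python
# Toy demonstration of the known-key problem: if the attacker knows the
# storage key, "encrypt before writing" stops protecting the disk.
KEY = b"fixed-storage-key"   # assume the attacker has learned this

def xor_cipher(data, key=KEY):
    # XOR with a repeating key is its own inverse
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

target = b"#!/bin/sh\necho owned\n"   # bytes the attacker wants on disk
payload = xor_cipher(target)          # attacker pre-"decrypts" the target
stored = xor_cipher(payload)          # receiver encrypts before writing
print(stored == target)  # -> True: the mangling step was defeated
```

This is exactly why the key (or table) must be random per recipient and unknown to the sender.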

-- Devin


Re: Pure Python Data Mangling or Encrypting

2015-06-25 Thread Devin Jeanpierre
On Thu, Jun 25, 2015 at 2:25 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 On Thursday 25 June 2015 14:27, Devin Jeanpierre wrote:
 The original post said that the sender will usually send files they
 encrypted, unless they are malicious. So if the sender wants them to
 be encrypted, they already are.

 The OP *hopes* that the sender will encrypt the files. I think that's a
 vanishingly faint hope, unless the application itself encrypts the file.

 Most people don't have any encryption software beyond password-protecting
 zip files. Zip 2.0 legacy encryption is crap, and there are plenty of tools
 available to break it. Winzip has an extension for 128-bit and 256-bit AES
 encryption, both of which are probably strong enough unless you're targeted
 by the NSA, but the weak link in the chain is the idea that people will
 encrypt the software before sending it. Even if they have the tools,
 laziness being the defining characteristic of most people, they won't use
 them.

You're right, I was supposing that since they wrote the server, they
also wrote the client, and were just protecting from the protocol
itself being weak.

 I know that the OP doesn't propose using ROT-13, but a classical
 substitution cipher isn't that much stronger.

Yes, it is. It requires the attacker being able to see something about
the ciphertext, unlike ROT13. But it is reasonable to suppose that
maybe the attacker can trigger the file getting executed, at which
point maybe you can deduce from the behavior what the starting bytes
are...?

 I don't think any of us *really* understand his use-case or the potential
 threats, but to my way of thinking, you can never have too strong a cipher
 or underestimate the risk of users taking short-cuts.

This is truth. It would be nice if something like keyczar came in the stdlib.

(Otherwise, users of Python take shortcuts and use randomized
substitution ciphers instead of AES.)

-- Devin


Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Devin Jeanpierre
How about a random substitution cipher? This will be ultra-weak, but
fast (using bytes.translate/bytes.maketrans) and seems to be the kind
of thing you're asking for.
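A minimal sketch of that suggestion (illustrative; the table must stay secret from senders, and this is mangling, not encryption):

```python
import random

# Random byte-substitution table applied with bytes.translate:
# fast, reversible, and deliberately NOT real crypto.
perm = list(range(256))
random.shuffle(perm)   # one random permutation per recipient

enc_table = bytes.maketrans(bytes(range(256)), bytes(perm))
dec_table = bytes.maketrans(bytes(perm), bytes(range(256)))

def mangle(data):
    return data.translate(enc_table)

def unmangle(data):
    return data.translate(dec_table)

chunk = b"2MB of untrusted payload would go here"
assert unmangle(mangle(chunk)) == chunk   # lossless round trip
```

`bytes.translate` runs in C, so this keeps up with I/O even for the 2MB chunks described, with no third-party dependencies.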

-- Devin

On Tue, Jun 23, 2015 at 12:02 PM, Randall Smith rand...@tnr.cc wrote:
 Chunks of data (about 2MB) are to be stored on machines using a peer-to-peer
 protocol.  The recipient of these chunks can't assume that the payload is
 benign.  While the data senders are supposed to encrypt data, that's not
 guaranteed, and I'd like to protect the recipient against exposure to
 nefarious data by mangling or encrypting the data before it is written to
 disk.

 My original idea was for the recipient to encrypt using AES.  But I want to
 keep this software pure Python "batteries included" and not require
 installation of other platform-dependent software.  Pure Python AES and even
 DES are just way too slow.  I don't know that I really need encryption here,
 but some type of fast mangling algorithm where a bad actor sending a payload
 can't guess the output ahead of time.

 Any ideas are appreciated.  Thanks.

 -Randall



Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Devin Jeanpierre
On Wed, Jun 24, 2015 at 9:07 PM, Steven D'Aprano st...@pearwood.info wrote:
 But just sticking to the three above, the first one is partially mitigated
 by allowing virus scanners to scan the data, but that implies that the
 owner of the storage machine can spy on the files. So you have a conflict
 here.

If it's encrypted malware, and you can't decrypt it, there's no threat.

 Honestly, the *only* real defence against the spying issue is to encrypt the
 files. Not obfuscate them with a lousy random substitution cipher. The
 storage machine can keep the files as long as they like, just by making a
 copy, and spend hours bruteforcing them. They *will* crack the substitution
 cipher. In pure Python, that may take a few days or weeks; in C, hours or
 days. If they have the resources to throw at it, minutes. Substitution
 ciphers have not been effective encryption since, oh, the 1950s, unless you
 use a one-time pad. Which you won't be.

The original post said that the sender will usually send files they
encrypted, unless they are malicious. So if the sender wants them to
be encrypted, they already are.

While the data senders are supposed to encrypt data, that's not
guaranteed, and I'd like to protect the recipient against exposure to
nefarious data by mangling or encrypting the data before it is written
to disk.

The cipher is just to keep the sender from being able to control what
is on disk.

I am usually very oppositional when it comes to rolling your own
crypto, but am I alone here in thinking the OP very clearly laid out
their case?

-- Devin


Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Devin Jeanpierre
FWIW most of the objections below also apply to JSON, so this doesn't
just have to be about repr/literal_eval. I'm definitely a huge
proponent of widespread use of something like protocol buffers, both
for production code and personal hacky projects.

On Wed, Jun 10, 2015 at 2:36 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 On Wednesday 10 June 2015 14:48, Devin Jeanpierre wrote:

 [...]
 and literal_eval is not a great idea.

 * the common serializer (repr) does not output a canonical form, and
   can serialize things in a way that they can't be deserialized

 For literals, the canonical form is that understood by Python. I'm pretty
 sure that these have been stable since the days of Python 1.0, and will
 remain so pretty much forever:

The problem is that there are two different ways repr might write out
a dict equal to {'a': 1, 'b': 2}. This can make tests brittle -- e.g.
it's why doctest fails badly at examples involving dictionaries. Text
format protocol buffers output everything sorted, so that you can do
textual diffs for compatibility tests and such.
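A concrete illustration (my example, not from the original mail): two equal dicts can repr differently, so a canonical form has to impose an ordering of its own, e.g. `json.dumps(..., sort_keys=True)`:

```python
import json

# Two equal dicts whose reprs differ: on CPython 3.7+ repr follows
# insertion order, so equality does not imply identical output.
d1 = {'a': 1, 'b': 2}
d2 = {'b': 2, 'a': 1}
assert d1 == d2
print(repr(d1), repr(d2))   # different text for equal values

# A canonical serializer imposes an order itself:
print(json.dumps(d1, sort_keys=True) == json.dumps(d2, sort_keys=True))  # -> True
```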

At work, one thing we do in places is mock out services using golden
expected protobuf responses, so that you can test that the server
returns exactly that, and test what the client does with that,
separately. These are checked into perforce in text format.

 * there is no schema
 * there is no well understood migration story for when the data you
   load and store changes

 literal_eval is not a serialisation format itself. It is a primitive
 operation usable when serialising. E.g. you might write out a simple Unix-
 style rc file of key:value pairs:

-snip-

 split on = and call literal_eval on the value.

 This is a perfectly reasonable light-weight solution for simple
 serialisation needs.
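As a concrete illustration of the quoted rc-file idea (my sketch, hypothetical format: one "key = value" pair per line, values parsed with ast.literal_eval):

```python
import ast

# Minimal rc-file parser: comments and blank lines skipped, values
# come back as real Python objects, never executed as code.
def parse_rc(text):
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        key, _, value = line.partition('=')
        config[key.strip()] = ast.literal_eval(value.strip())
    return config

cfg = parse_rc("""
# example rc file
retries = 3
hosts = ['a.example', 'b.example']
debug = False
""")
print(cfg['retries'] + 1)  # -> 4: a real int, not the string "3"
```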

I could spend a bunch of time writing yet another config file format,
or I could use text format protocol buffers, YAML, or TOML and call it
a day.

 * it encourages the use of eval when literal_eval becomes inconvenient
   or insufficient

 I don't think so. I think that people who make the effort to import ast and
 call ast.literal_eval are fully aware of the dangers of eval and aren't
 silly enough to start using eval.

The problem is when you have your config file format using python
literals, and another programmer wants to deal with it and doesn't
look at your codebase, and things like that. When transferring data,
this can happen a lot, since you are often not the user of the data
you wrote, and you can't control how others consume it. They might use
eval even if you didn't mean for them to. For example, in JavaScript,
this was once a common problem for services exposing JSON, and it
still happens even now.

 * It is not particularly well specified or documented compared to the
   alternatives.
 * The types you get back differ in python 2 vs 3

 Doesn't matter. The types you *write* are different in Python 2 vs 3, so of
 course the types you read back differ too.

In a shared 2/3 codebase, if I write bytes I expect to get bytes, and
if I write unicode I expect to get unicode. (There is a third category
of thing, which should be bytes on 2.x and string on 3.x, but it's
probably best to handle that outside of the deserializer). If you
thread it through repr and literal_eval using different versions for
each, unicode in python 3 becomes bytes in python 2, and vice versa.
So it makes migrating to Python 3 even harder.

 For most apps, the alternatives are better. Irmen's serpent library is
 strictly better on every front, for example. (Except potentially
 security, who knows.)

 Beyond simple needs, like rc files, literal_eval is not sufficient. You
 can't use it to deserialise arbitrary objects. That might be a feature, but
 if you need something more powerful than basic ints, floats, strings and a
 few others, literal_eval will not be powerful enough.

No, it is powerful enough. After all, JSON has the same limitations.
Protobuf only adds enums and structs to JSON's types, and it's
potentially the most-used serialization format in the world by
operations per second.

Serialization libraries/formats usually need handholding to serialize
complex Python objects into simple serializable types. (Except pickle,
and that's the very reason it's insecure, per the earlier discussion in
this thread.)

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Devin Jeanpierre
On Wed, Jun 10, 2015 at 4:39 PM, Devin Jeanpierre
jeanpierr...@gmail.com wrote:
 On Wed, Jun 10, 2015 at 4:25 PM, Terry Reedy tjre...@udel.edu wrote:
 On 6/10/2015 6:10 PM, Devin Jeanpierre wrote:

 The problem is that there are two different ways repr might write out
 a dict equal to {'a': 1, 'b': 2}. This can make tests brittle


 Not if one compares objects rather than string representations of objects.
 I am strongly of the view that code and tests should be written to directly
 compare objects as much as possible.

 For serialization formats that always output the same string for the
 same data (like text format protos), there is no practical difference
 between the two, except that if you're comparing text, you can easily
 supply a diff to update one to match the other.

Ugh, there's also the fiddly difference between what goes in and what
you read. A serialized data structure might contain lots of data that
is ignored by the deserializer (in protobuf), or it might contain data
which can't be loaded by the deserializer or produces weird /
incorrect results. Being able to inspect and test the serialized data
separately from the deserialized data is useful in that regard, so
that you know where the failure lies, but it's sort of fuzzy.

Some examples of where this crops up: pickles after you've moved a
class, JSON encoders that try to be clever and output invalid JSON,
protocol buffers with unexpected fields.

Overall, though, the diff thing is probably the bigger reason everyone
wants to do this sort of thing with serialized data. If you do it
right and are principled about it, I don't see a problem with it.

-- Devin


Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Devin Jeanpierre
On Wed, Jun 10, 2015 at 4:46 PM, Terry Reedy tjre...@udel.edu wrote:
 On 6/10/2015 7:39 PM, Devin Jeanpierre wrote:

 On Wed, Jun 10, 2015 at 4:25 PM, Terry Reedy tjre...@udel.edu wrote:

 On 6/10/2015 6:10 PM, Devin Jeanpierre wrote:

 The problem is that there are two different ways repr might write out
 a dict equal to {'a': 1, 'b': 2}. This can make tests brittle


 You commented about *tests*

 Not if one compares objects rather than string representations of
 objects.
 I am strongly of the view that code and tests should be written to
 directly
 compare objects as much as possible.


 I responded about *tests*

 For serialization formats that always output the same string for the
 same data (like text format protos), there is no practical difference
 between the two, except that if you're comparing text, you can easily
 supply a diff to update one to match the other.


 Serialization is a different issue.

Yes, tests of code that uses serialization (caching, RPCs, etc.).

I mentioned above a sort of test that divides tests of a client and
server along RPC boundaries by providing fake queries and responses,
and testing that those are the queries and responses given by the
client and server. This way you don't need to actually start the
client and server to test them both and their interactions. This is
one example, there are other uses, but they go along the same lines.
For example, one can also imagine testing that a serialized structure
is identical across version changes, so that it's guaranteed to be
forwards/backwards compatible. It is not enough to test that the
deserialized form is, because it might differ substantially, as long
as the communicated serialized structure is the same.

-- Devin


Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Devin Jeanpierre
On Wed, Jun 10, 2015 at 4:25 PM, Terry Reedy tjre...@udel.edu wrote:
 On 6/10/2015 6:10 PM, Devin Jeanpierre wrote:

 The problem is that there are two different ways repr might write out
 a dict equal to {'a': 1, 'b': 2}. This can make tests brittle


 Not if one compares objects rather than string representations of objects.
 I am strongly of the view that code and tests should be written to directly
 compare objects as much as possible.

For serialization formats that always output the same string for the
same data (like text format protos), there is no practical difference
between the two, except that if you're comparing text, you can easily
supply a diff to update one to match the other.
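json in the standard library can be given the same stable-text property; this is an analogy to what text-format protos provide, not protobuf itself:

```python
import json

def canonical(data):
    # sort_keys plus a fixed indent pins down one textual form per value,
    # so equal data always serializes to identical text.
    return json.dumps(data, sort_keys=True, indent=2)

a = canonical({'b': 2, 'a': 1})
b = canonical({'a': 1, 'b': 2})
assert a == b  # golden-file diffs stay stable across runs
```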

-- Devin


Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Devin Jeanpierre
Snipped aplenty.

On Wed, Jun 10, 2015 at 8:21 PM, Steven D'Aprano st...@pearwood.info wrote:
 On Thu, 11 Jun 2015 08:10 am, Devin Jeanpierre wrote:
 [...]
 I could spend a bunch of time writing yet another config file format,
 or I could use text format protocol buffers, YAML, or TOML and call it
 a day.

 Writing a rc parser is so trivial that it's almost easier to just write it
 than it is to look up the APIs for YAML or JSON, to say nothing of the
 rigmarole of defining a protocol buffer config file, compiling it,
 importing the module, and using that.

-snip

 That's a basic, *but acceptable*, rc parser written in literally under a
 minute. At the risk of ending up with egg on my face, I reckon that it's so
 simple and so obviously correct that I can tell it works correctly without
 even testing it. (Famous last words, huh?)

I won't try to egg you. That said, you have to write tests. Also,
everyone who uses it has to learn the format and API; it may have
corner cases you aren't aware of; it has to get ported to Python 3 if
you wrote it for Python 2; the parsing errors are obscure and might
need improvement; and so on. There's a place for this, but I suspect
it is small compared to the set of places where it merely seemed like
a good idea at the time.

 Beyond simple needs, like rc files, literal_eval is not sufficient. You
 can't use it to deserialise arbitrary objects. That might be a feature,
 but if you need something more powerful than basic ints, floats, strings
 and a few others, literal_eval will not be powerful enough.

 No, it is powerful enough. After all, JSON has the same limitations.

 In the sense that you can build arbitrary objects from a combination of a
 few basic types, yes, literal_eval is powerful enough if you are prepared
 to re-invent JSON, YAML, or protocol buffer.

 But I'm not talking about re-inventing what already exists. If I want JSON,
 I'll use JSON, not spend weeks or months re-writing it from scratch. I
 can't do this:

 class MyClass:
 pass

 a = MyClass()
 serialised = repr(a)
 b = ast.literal_eval(serialised)
 assert a == b

I don't understand. You can't do that in JSON, YAML, XML, or protocol
buffers, either. They only provide a small set of types, comparable to
(but smaller) than the set of types you get from literal_eval/repr.

-- Devin


Re: enhancement request: make py3 read/write py2 pickle format

2015-06-09 Thread Devin Jeanpierre
There's a lot of subtle issues with pickle compatibility. e.g.
old-style vs new-style classes. It's kinda hard and it's better to
give up. I definitely agree it's better to use something else instead.
For example, we switched to using protocol buffers, which have much
better compatibility properties and are a bit more testable to boot
(since text format protobufs are always output in a canonical (sorted)
form.)

-- Devin

On Tue, Jun 9, 2015 at 11:35 AM, Chris Warrick kwpol...@gmail.com wrote:
 On Tue, Jun 9, 2015 at 8:08 PM, Neal Becker ndbeck...@gmail.com wrote:
 One of the most annoying problems with py2/3 interoperability is that the
 pickle formats are not compatible.  There must be many who, like myself,
 often use pickle format for data storage.

 It certainly would be a big help if py3 could read/write py2 pickle format.
 You know, backward compatibility?

 Don’t use pickle. It’s unsafe — it executes arbitrary code, which
 means someone can give you a pickle file that will delete all your
 files or eat your cat.

 Instead, use a safe format that has no ability to execute code, like
 JSON. It will also work with other programming languages and
 environments if you ever need to talk to anyone else.

 But, FYI: there is backwards compatibility if you ask for it, in the
 form of protocol versions. That’s all you should know — again, don’t
 use pickle.

 --
 Chris Warrick https://chriswarrick.com/
 PGP: 5EAAEA16


Re: enhancement request: make py3 read/write py2 pickle format

2015-06-09 Thread Devin Jeanpierre
Passing around data that can be put into ast.literal_eval is
synonymous with passing around data that can be put into eval. It
sounds like a trap.

Other points against JSON / etc.: the lack of schema makes it easier
to stuff anything in there (not as easily as pickle, mind), and by
returning a plain dict, it becomes easier to require a field than to
allow a field to be missing, which is bad for robustness and bad for
data format migrations. (Protobuf (v3) has schemas and gives every
field a default value.)

For human readable serialized data, text format protocol buffers are
seriously underrated. (Relatedly: underdocumented, too.)

/me lifts head out of kool-aid and gasps for air

-- Devin

On Tue, Jun 9, 2015 at 5:17 PM, Irmen de Jong irmen.nos...@xs4all.nl wrote:
 On 10-6-2015 1:06, Chris Angelico wrote:
 On Wed, Jun 10, 2015 at 6:07 AM, Devin Jeanpierre
 jeanpierr...@gmail.com wrote:
 There's a lot of subtle issues with pickle compatibility. e.g.
 old-style vs new-style classes. It's kinda hard and it's better to
 give up. I definitely agree it's better to use something else instead.
 For example, we switched to using protocol buffers, which have much
 better compatibility properties and are a bit more testable to boot
 (since text format protobufs are always output in a canonical (sorted)
 form.)

 Or use JSON, if your data fits within that structure. It's easy to
 read and write, it's human-readable, and it's safe (no chance of
 arbitrary code execution). Forcing yourself to use a format that can
 basically be processed by ast.literal_eval() is a good discipline -
 means you don't accidentally save/load too much.

 ChrisA


 I made a specialized serializer for this, which is more expressive than
 JSON. It outputs python literal expressions that can be directly parsed
 by ast.literal_eval(). You can find it on pypi
 (https://pypi.python.org/pypi/serpent). It's the default serializer of
 Pyro, and it includes a Java and .NET version as an added bonus.


 Irmen




Re: enhancement request: make py3 read/write py2 pickle format

2015-06-09 Thread Devin Jeanpierre
On Tue, Jun 9, 2015 at 8:52 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 On Wednesday 10 June 2015 10:47, Devin Jeanpierre wrote:

 Passing around data that can be put into ast.literal_eval is
 synonymous with passing around data taht can be put into eval. It
 sounds like a trap.

 In what way?

I misspoke: instead of "synonymous", I meant "also means".
(Implication, not equivalence.)

 For human readable serialized data, text format protocol buffers are
 seriously underrated. (Relatedly: underdocumented, too.)

 Ironically, literal_eval is designed to process text-format protocols using
 human-readable Python syntax for common data types like int, str, and dict.

Protocol buffers are a specific technology, not an abstract concept,
and literal_eval is not a great idea.

* the common serializer (repr) does not output a canonical form, and
  can serialize things in a way that they can't be deserialized
* there is no schema
* there is no well understood migration story for when the data you
  load and store changes
* it is not usable from other programming languages
* it encourages the use of eval when literal_eval becomes inconvenient
  or insufficient
* It is not particularly well specified or documented compared to the
  alternatives.
* The types you get back differ in python 2 vs 3

For most apps, the alternatives are better. Irmen's serpent library is
strictly better on every front, for example. (Except potentially
security, who knows.)

At least it's better than pickle, security wise. Reliability wise,
repr is a black hole, so no dice. :(
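The "black hole" claim is concrete: repr happily emits text that literal_eval cannot read back.

```python
import ast

nan_text = repr(float('nan'))    # 'nan' -- a valid repr, but not a literal
try:
    ast.literal_eval(nan_text)
except ValueError:
    print("nan does not round-trip")

obj_text = repr(object())        # '<object object at 0x...>'
try:
    ast.literal_eval(obj_text)
except (ValueError, SyntaxError):
    print("arbitrary objects do not round-trip")
```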

-- Devin


[issue15138] base64.urlsafe_b64**code are too slow

2015-05-30 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

Here's a backport of the patch to 2.7. It's pretty rad, and basically identical 
to how YouTube monkeypatches base64.

Not sure what will happen to this patch. According to recent discussion on the 
list (e.g. https://mail.python.org/pipermail/python-dev/2015-May/140380.html ), 
performance improvements are open for inclusion in 2.7 if anyone wants to 
bother with merging this in and taking on the review / maintenance burden.

I'm OK with just publishing it for others to merge in with their own private 
versions of Python. It is only relevant if you use base64 a lot. :)

--
nosy: +Devin Jeanpierre
Added file: http://bugs.python.org/file39568/base64_27.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15138
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17094] sys._current_frames() reports too many/wrong stack frames

2015-05-29 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

The patch I'm providing with this comment has a ... really hokey test case, and 
a two line + whitespace diff for pystate.c . The objective of the patch is only 
to have _current_frames report the correct frame for any live thread. It 
continues to report dead threads' frames, up until they would conflict with a 
live thread. IMO it's the minimal possible fix for this aspect of the bug, and 
suitable for 2.7.x. Let me know what you think.

--
Added file: http://bugs.python.org/file39564/_current_frames_27_setdefault.diff




[issue17094] sys._current_frames() reports too many/wrong stack frames

2015-05-28 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

This bug also affects 2.7. The main problem I'm dealing with is 
sys._current_frames will then return wrong stack frames for existing threads. 
One fix to just this would be to change how the dict is created, to keep newer 
threads rather than tossing them.

Alternatively, we could backport the 3.4 fix.

Thoughts?

--
nosy: +Devin Jeanpierre




[issue23275] Can assign [] = (), but not () = []

2015-05-27 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

[a, b] = (1, 2) is also fine.

--




[issue5315] signal handler never gets called

2015-05-25 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

Adding haypo since apparently he's been touching signals stuff a lot lately, 
maybe has some useful thoughts / review? :)

--
nosy: +haypo




[issue24283] Print not safe in signal handlers

2015-05-25 Thread Devin Jeanpierre

New submission from Devin Jeanpierre:

The code attached runs a while loop that prints, and has a signal handler that 
also prints. There is a thread that constantly fires off signals, but this is 
just to ensure the condition for the bug happens -- this is a bug with signal 
handling, not threads -- I can trigger a RuntimeError (... with a missing 
message?) by commenting out the threading lines and instead running a separate
process: "while true; do kill -s SIGUSR1 4687; done".

Traceback:

$ python3 threading_print_test.py
hello
world
Traceback (most recent call last):
  File "/usr/local/google/home/devinj/Downloads/threading_print_test.py", line 36, in <module>
    main()
  File "/usr/local/google/home/devinj/Downloads/threading_print_test.py", line 30, in main
    print("world")
  File "/usr/local/google/home/devinj/Downloads/threading_print_test.py", line 13, in print_hello
    print("hello")
RuntimeError: reentrant call inside <_io.BufferedWriter name='stdout'>

--
files: threading_print_test.py
messages: 244020
nosy: Devin Jeanpierre, haypo
priority: normal
severity: normal
status: open
title: Print not safe in signal handlers
Added file: http://bugs.python.org/file39491/threading_print_test.py




[issue24283] Print not safe in signal handlers

2015-05-25 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

It doesn't do any of those things in Python 2, to my knowledge. Why aren't we 
willing to make this work?

--




[issue5315] signal handler never gets called

2015-05-24 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

Agree with Charles-François's second explanation. This makes it very hard to 
reliably handle signals -- basically everyone has to remember to use 
set_wakeup_fd, and most people don't. For example, gunicorn is likely 
vulnerable to this because it doesn't use set_wakeup_fd. I suspect most code 
using select + signals is wrong.

I've attached a patch which fixes the issue for select(), but not any other 
functions. If it's considered a good patch, I can work on the rest of the 
functions in the select module. (Also, tests for the details of the behavior.)

Also the patch is pretty hokey, so I'd appreciate feedback if it's going to go 
in. :)

--
keywords: +patch
nosy: +Devin Jeanpierre
Added file: http://bugs.python.org/file39489/select_select.diff




[issue24235] ABCs don't fail metaclass instantiation

2015-05-18 Thread Devin Jeanpierre

New submission from Devin Jeanpierre:

If a subclass has abstract methods, it fails to instantiate... unless it's a 
metaclass, and then it succeeds.

>>> import abc
>>> class A(metaclass=abc.ABCMeta):
...     @abc.abstractmethod
...     def foo(self): pass
...
>>> class B(A): pass
...
>>> B()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class B with abstract methods foo
>>> class C(A, type): pass
...
>>> class c(metaclass=C): pass
...
>>> C('', (), {})
<class '__main__.'>


--
components: Library (Lib)
messages: 243540
nosy: Devin Jeanpierre
priority: normal
severity: normal
status: open
title: ABCs don't fail metaclass instantiation
versions: Python 2.7, Python 3.4




[issue24144] Docs discourage use of binascii.unhexlify etc.

2015-05-07 Thread Devin Jeanpierre

New submission from Devin Jeanpierre:

Maybe the functions should be split up into those you shouldn't need to call 
directly, and those you should? I find it unlikely that you're supposed to use 
codecs.encode(..., 'hex') and codecs.decode(..., 'hex') instead of binascii 
(the only other thing, AFAIK, that works in both 2 and 3).

Relevant quote starts with: "Normally, you will not use these functions
directly"

https://docs.python.org/2/library/binascii
https://docs.python.org/3/library/binascii

--
assignee: docs@python
components: Documentation
messages: 242737
nosy: Devin Jeanpierre, docs@python
priority: normal
severity: normal
status: open
title: Docs discourage use of binascii.unhexlify etc.
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6




Re: Pickle based workflow - looking for advice

2015-04-13 Thread Devin Jeanpierre
On Mon, Apr 13, 2015 at 10:58 AM, Fabien fabien.mauss...@gmail.com wrote:
 Now, to my questions:
 1. Does that seem reasonable?

A big issue is the use of pickle, which is:

* Often suboptimal performance wise (e.g. you can't load only subsets
of the data)
* Makes forwards/backwards compatibility very difficult
* Can make python 2/3 migrations harder
* Creates data files which are difficult to analyze/fix by hand if
they get broken
* Is schemaless, and can accidentally include irrelevant data you
didn't mean to store, making all of the above worse.
* Means you have to be very careful who wrote the pickles, or you open
a remote code execution vulnerability. It's common for people to
forget that code is unsafe, and get themselves pwned. Security is
always better if you don't do anything bad in the first place, than if
you do something bad but try to manage the context in which the bad
thing is done.
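The code-execution point is worth making vivid. A sketch with a deliberately harmless payload; a real attacker would return os.system or similar from __reduce__:

```python
import pickle

class Evil:
    def __reduce__(self):
        # pickle stores this (callable, args) pair and *calls it at load
        # time*; eval of "40 + 2" is harmless, but any callable works.
        return (eval, ("40 + 2",))

payload = pickle.dumps(Evil())
print(pickle.loads(payload))  # 42 -- loads() executed code chosen by the writer
```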

Cap'n Proto might be a decent alternative that gives you good
performance by letting you process only the bits of the file you want
to. It is also not a walking security nightmare.

 2. Should Watershed be an object or should it be a simple dictionary? I
 thought that an object could be nice, because it could take care of some
 operations such as plotting and logging. Currently I defined a class
 Watershed, but its attributes are defined and filled by A, B and C (this
 seems a bit wrong to me).

It is usually very confusing for attributes to be defined anywhere
other than __init__. It's even more confusing for them to be defined
by some random other function living somewhere else.

 I could give more responsibilities to this class
 but it might become way too big: since the whole purpose of the tool is to
 work on watersheds, making a Watershed class actually sounds like a code
 smell (http://en.wikipedia.org/wiki/God_object)

Whether they are methods or not doesn't make this any more or less of
a god object -- if it stores all this data used by all these different
things, it is already a bit off.

 3. The operation A opens an external file, reads data out of it and writes
 it in Watershed object. Is it a bad idea to multiprocess this? (I guess it
 is, since the file might be read twice at the same time)

That does sound like a bad idea, for the reason you gave. It might be
possible to read it once, and share it among many processes.

-- Devin


Re: You must register a new account to report a bug (was: Python 2 to 3 conversion - embrace the pain)

2015-03-16 Thread Devin Jeanpierre
On Sun, Mar 15, 2015 at 11:17 PM, Ben Finney ben+pyt...@benfinney.id.au wrote:
 Sadly becoming the norm. People will run a software project and just
 assume that users will be willing to go through a registration process
 for every project just to report a bug.

Registering for github is a lot easier than creating a reproducible
test case. I agree that we should minimize friction, but friction will
always exist. In GitHub's case, the additional friction is amortized
over all github projects (and there are lots of those).

Other things that can make bug reporting frustrating:

* Slow triage / ignored bug reports
* Automated bug report handling (closing all extant bugs every N months)
* Passive-aggressively requiring hours of work to create an isolated
  system (e.g. brand new install of Ubuntu) before bug reports are
  accepted.
* Dismissing bug reports as WAI (working as intended) without explanation,
  or with a poor one ("we talked about this and decided we disagree")
* Dismissing bugs as not worth fixing
* Passing the buck ("This is a bug in XYlib, WAI")
* Insulting bug reporters

IMO registration is not nearly as big a deal as the others. If nothing
else, because it's a one-time cost per project at most, whereas all
the other issues (potentially) rear their head with every single bug
report.

-- Devin


Re: Design thought for callbacks

2015-02-21 Thread Devin Jeanpierre
On Fri, Feb 20, 2015 at 9:42 PM, Chris Angelico ros...@gmail.com wrote:
 No, it's not. I would advise using strong references - if the callback
 is a closure, for instance, you need to hang onto it, because there
 are unlikely to be any other references to it. If I register a
 callback with you, I expect it to be called; I expect, in fact, that
 that *will* keep my object alive.

For that matter, if the callback is a method, you need to hang onto
it, because method wrappers are generated on demand, so the method
would be removed from the valid callbacks instantly.

Weak references for callbacks are broken.
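A sketch of why: attribute access builds a fresh bound-method wrapper each time, so a weak reference to it dies immediately (Python 3.4+ added weakref.WeakMethod for exactly this case). Listener is an illustrative class, not an API from this thread:

```python
import weakref

class Listener:
    def on_event(self):
        return "got event"

listener = Listener()

# The wrapper returned by `listener.on_event` has no other references,
# so it is collected as soon as weakref.ref() returns.
ref = weakref.ref(listener.on_event)
print(ref())  # None -- a weakref-based callback registry silently drops it

wm = weakref.WeakMethod(listener.on_event)
print(wm()())  # "got event" -- still works while listener is alive
```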

-- Devin


Re: meaning of: line, =

2015-02-06 Thread Devin Jeanpierre
Sorry for late reply, I somehow missed this email.

On Thu, Feb 5, 2015 at 8:59 AM, Rustom Mody rustompm...@gmail.com wrote:
 The reason I ask: I sorely miss haskell's pattern matching in python.

 It goes some way:

 >>> ((x,y),z) = ((1,2),3)
 >>> x,y,z
 (1, 2, 3)

 But not as far as I would like:

 >>> ((x,y),3) = ((1,2),3)
   File "<stdin>", line 1
 SyntaxError: can't assign to literal

 [Haskell]

 Prelude> let (x, (y, (42, z, "Hello"))) = (1, (2, (42, 3, "Hello")))
 Prelude> (x,y,z)
 (1,2,3)

Yeah, but Haskell is ludicrous.

Prelude> let (x, 2) = (1, 3)
Prelude>

Only non-falsifiable patterns really make sense as the left hand side
of an assignment in a language without exceptions, IMO. Otherwise you
should use a match/case statement. (Of course, Python does have
exceptions...)

-- Devin


Re: meaning of: line, =

2015-02-05 Thread Devin Jeanpierre
On Thu, Feb 5, 2015 at 8:08 AM, Ian Kelly ian.g.ke...@gmail.com wrote:
 On Thu, Feb 5, 2015 at 2:40 AM, Steven D'Aprano
 steve+comp.lang.pyt...@pearwood.info wrote:
 Devin Jeanpierre wrote:
 On Wed, Feb 4, 2015 at 1:18 PM, Chris Angelico ros...@gmail.com wrote:
 >>> [result] = f()
 >>> result
 42

 Huh, was not aware of that alternate syntax.

 Nor are most people. Nor is Python, in some places -- it seems like
 people forgot about it when writing some bits of the grammar.

 Got an example where you can use a,b but not [a,b] or (a,b)?

 >>> def f(a, (b, c)):
 ...     print a, b, c
 ...
 >>> f(3, [4, 5])
 3 4 5
 >>> def g(a, [b, c]):
   File "<stdin>", line 1
     def g(a, [b, c]):
              ^
 SyntaxError: invalid syntax

 Although to be fair, the first syntax there is no longer valid either
 in Python 3.

As Ian rightly understood, I was referring to differences between [a,
b, ...] and (a, b, ...).

Here's another example, one that still exists in Python 3:

>>> [] = ''
>>> () = ''
  File "<stdin>", line 1
SyntaxError: can't assign to ()

The syntax explicitly blacklists (), but forgets to blacklist [].

-- Devin


Re: meaning of: line, =

2015-02-05 Thread Devin Jeanpierre
On Wed, Feb 4, 2015 at 1:18 PM, Chris Angelico ros...@gmail.com wrote:
 On Thu, Feb 5, 2015 at 4:36 AM, Peter Otten __pete...@web.de wrote:
 Another alternative is to put a list literal on the lefthand side:

 >>> def f(): yield 42
 ...
 >>> [result] = f()
 >>> result
 42

 Huh, was not aware of that alternate syntax.

Nor are most people. Nor is Python, in some places -- it seems like
people forgot about it when writing some bits of the grammar.

I'd suggest not using it.

 (If you're worried: neither the list nor the tuple will be created; the
 bytecode is identical in both cases)

 It can't possibly be created anyway. Python doesn't have a notion of
 an "assignable thing that, when assigned to, will assign to something
 else" like C's pointers or C++'s references. There's nothing that you
 could put into the list that would have this behaviour.

C pointers don't do that either. It's really just references. (C
pointers aren't any more action-at-a-distance than Python attributes.)

Anyway, it could create a new list in Python, because Python can do
whatever it wants. But it doesn't, because as you say, that wouldn't
do anything.

-- Devin


Re: dunder-docs (was Python is DOOMED! Again!)

2015-02-03 Thread Devin Jeanpierre
On Mon, Feb 2, 2015 at 6:07 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 Run this code:

 # === cut ===

 class K(object):
 def f(self): pass

 def f(self): pass

 instance = K()
 things = [instance.f, f.__get__(instance, K)]
 from random import shuffle
 shuffle(things)
 print(things)

 # === cut ===


 You allege that one of these things is a method, and the other is not. I
 challenge you to find any behavioural or functional difference between the
 two. (Object IDs don't count.)

 If you can find any meaningful difference between the two, I will accept
 that methods have to be created as functions inside a class body.

In this particular case, there is none. What if the body of the
method were super().f()?

Some methods can be defined outside of the body and still work exactly
the same, but others won't.
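A concrete sketch (classes A/B/C are illustrative): zero-argument super() only works for functions compiled inside a class body, because compilation in a class body is what creates the __class__ cell that super() consults.

```python
class A:
    def f(self):
        return "A.f"

class B(A):
    def f(self):
        # Compiled inside the class body: the __class__ cell exists.
        return "B via " + super().f()

def g(self):
    # Compiled at module level: no __class__ cell for super() to find.
    return "C via " + super().f()

class C(A):
    pass

C.f = g

print(B().f())  # B via A.f
try:
    C().f()
except RuntimeError as err:
    print("outside-the-body version fails:", err)
```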

 Otherwise you are reduced to claiming that there is some sort of mystical,
 undetectable essence or spirit that makes one of those two objects a
 real method and the other one a fake method, even though they have the same
 type, the same behaviour, and there is no test that can tell you which is
 which.

It isn't mystical. There are differences in semantics of defining
methods inside or outside of a class that apply in certain situations
(e.g. super(), metaclasses). You have cherrypicked an example that
avoids them.

If one wants to say "A method can (...) by using super()", then
methods must be defined to only exist inside of class bodies.

Obviously, once you construct the correct runtime values, behavior
might be identical. The difference is in whether you can do different
things, not in behavior.

 For an example we can all agree on, this is not an instance of
 collections.Iterable, but the docs claim it is iterable:
 https://docs.python.org/2/glossary.html#term-iterable

 class MyIterable(object):
 def __getitem__(self, i): return i

 Iterable is a generic term, not a type. Despite the existence of the
 collections.Iterable ABC, iterable refers to any type which can be
 iterated over, using either of two different protocols.

 As I said above, if you wanted to argue that method was a general term for
 any callable attached to an instance or class, then you might have a point.
 But you're doing something much weirder: you are arguing that given two
 objects which are *identical* save for their object ID, one might be called
 a method, and the other not, due solely to where it was created. Not even
 where it was retrieved from, but where it was created.

 If you believe that method or not depends on where the function was
 defined, then this will really freak you out:


 py> class Q:
 ...     def f(self): pass  # f defined inside the class
 ...
 py> def f(self): pass  # f defined outside the class
 ...
 py> f, Q.f = Q.f, f  # Swap the inside f and the outside f.
 py> instance = Q()
 py> instance.f  # Uses outside f, so not a real method!
 <bound method Q.f of <__main__.Q object at 0xb7b8fcec>>
 py> MethodType(f, instance)  # Uses inside f, so is a real method!
 <bound method Q.f of <__main__.Q object at 0xb7b8fcec>>

You are really missing the point, if you think that surprises me.

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: dunder-docs (was Python is DOOMED! Again!)

2015-02-03 Thread Devin Jeanpierre
On Mon, Feb 2, 2015 at 6:20 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 Devin Jeanpierre wrote:
 Oops, I just realized why such a claim might be made: the
 documentation probably wants to be able to say that any method can use
 super(). So that's why it claims that it isn't a method unless it's
 defined inside a class body.

 You can use super anywhere, including outside of classes. The only thing you
 can't do is use the Python 3 super hack which automatically fills in the
 arguments to super if you don't supply them. That is compiler magic which
 truly does require the function to be defined inside a class body. But you
 can use super outside of classes:

Obviously, I was referring to no-arg super.

Please assume good faith and non-ignorance on my part.

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: dunder-docs (was Python is DOOMED! Again!)

2015-02-02 Thread Devin Jeanpierre
On Mon, Feb 2, 2015 at 4:06 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 On Sun, Feb 1, 2015 at 11:15 PM, Steven D'Aprano
 steve+comp.lang.pyt...@pearwood.info wrote:
 Both K.f and K.g are methods, even though only one meets the definition
 given in the glossary. The glossary is wrong.

I agree, it's oversimplified and has made a useless distinction here.

 Even if it is so defined, the definition is wrong. You can define methods
 on an instance. I showed an example of an instance with its own personal
 __dir__ method, and showed that dir() ignores it if the instance belongs
 to a new-style class but uses it if it is an old-style class.

 You didn't define a method, you defined a callable attribute.

 That is wrong. I defined a method:

 py> from types import MethodType
 py> type(instance.f) is MethodType
 True


 instance.f is a method by the glossary definition. Its type is identical to
 types.MethodType, which is what I used to create a method by hand.

You are assuming that they are both methods, just because they are
instances of a type called MethodType. This is like assuming that a
Tree() object is made out of wood.

The documentation is free to define things in terms other than types
and be correct. There are many properties of functions-on-classes that
callable instance attributes that are instances of MethodType do not
have, as we've already noticed. isinstance can say one thing, and the
documentation another, and both can be right, because they are saying
different things.


For an example we can all agree on, this is not an instance of
collections.Iterable, but the docs claim it is iterable:
https://docs.python.org/2/glossary.html#term-iterable

class MyIterable(object):
def __getitem__(self, i): return i

The docs are not wrong, they are just making a distinction for
humans that is separate from the python types involved. This is OK.

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: dunder-docs (was Python is DOOMED! Again!)

2015-02-02 Thread Devin Jeanpierre
On Mon, Feb 2, 2015 at 5:00 AM, Devin Jeanpierre jeanpierr...@gmail.com wrote:
 On Mon, Feb 2, 2015 at 4:06 AM, Steven D'Aprano
 steve+comp.lang.pyt...@pearwood.info wrote:
 On Sun, Feb 1, 2015 at 11:15 PM, Steven D'Aprano
 steve+comp.lang.pyt...@pearwood.info wrote:
 Both K.f and K.g are methods, even though only one meets the definition
 given in the glossary. The glossary is wrong.

 I agree, it's oversimplified and has made a useless distinction here.

Oops, I just realized why such a claim might be made: the
documentation probably wants to be able to say that any method can use
super(). So that's why it claims that it isn't a method unless it's
defined inside a class body.

-- Devin

 Even if it is so defined, the definition is wrong. You can define methods
 on an instance. I showed an example of an instance with its own personal
 __dir__ method, and showed that dir() ignores it if the instance belongs
 to a new-style class but uses it if it is an old-style class.

 You didn't define a method, you defined a callable attribute.

 That is wrong. I defined a method:

 py> from types import MethodType
 py> type(instance.f) is MethodType
 True


 instance.f is a method by the glossary definition. Its type is identical to
 types.MethodType, which is what I used to create a method by hand.

 You are assuming that they are both methods, just because they are
 instances of a type called MethodType. This is like assuming that a
 Tree() object is made out of wood.

 The documentation is free to define things in terms other than types
 and be correct. There are many properties of functions-on-classes that
 callable instance attributes that are instances of MethodType do not
 have, as we've already noticed. isinstance can say one thing, and the
 documentation another, and both can be right, because they are saying
 different things.


 For an example we can all agree on, this is not an instance of
 collections.Iterable, but the docs claim it is iterable:
 https://docs.python.org/2/glossary.html#term-iterable

 class MyIterable(object):
 def __getitem__(self, i): return i

 The docs are not wrong, they are just making a distinction for
 humans that is separate from the python types involved. This is OK.

 -- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python is DOOMED! Again!

2015-02-01 Thread Devin Jeanpierre
On Sun, Feb 1, 2015 at 8:31 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 Paul Rubin wrote:
 It's completely practical: polymorphism and type inference get you the
 value you want with usually no effort on your part.

 But it's the usually that bites you.

 If I have an arbitrary pointer, and I want to check if it is safe to
 dereference, how do I do it? Surely I'm not expected to write something
 like:

 if type(ptr) == A:
 if ptr != Anil: ...
 if type(ptr) == B:
 if ptr != Bnil: ...

 etc. That would be insane. So how does Haskell do this?

Haskell has different nulls in the same sense Java does: there's one
keyword, whose type varies by context. Unlike Java, there is no way at
all to cast different nulls to different types.  Haskell has return
value polymorphism and generics, so it's very easy for a function to
return values of different types depending on type parameters. So this
isn't even compiler hackery, it's ordinary.

Also, you don't dereference in Haskell, you unpack. Python and Haskell code:

if x is None:
    print("Not found!")
else:
    print x

case x of
    Nothing -> putStrLn "Not found"
    Just y  -> putStrLn (show y)

Both of these work whenever x is something that can be null and can be
shown -- in Haskell, that's anything of type Maybe T, where you have
access to a Show implementation for T.  In Python, None is its own
type/value, in Haskell there is an incompatible Nothing for each T.

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python is DOOMED! Again!

2015-02-01 Thread Devin Jeanpierre
On Sun, Feb 1, 2015 at 8:34 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 Devin Jeanpierre wrote:

 It's really only dynamically typed languages that have a single null
 value of a single type. Maybe I misunderstand the original statement.

 Pascal is statically typed and has a single null pointer compatible with all
 pointer types. C has a single nil pointer compatible with all pointer
 types. I expect that the Modula and Oberon family of languages copied
 Pascal, which probably copied Algol.

No, C has a NULL macro which evaluates to something which coerces to
any pointer type and will be the null value of that type. But there's
one null value per type. The C standard makes no guarantees that they
are compatible in any way, e.g. they can be of different sizes. On
some systems, the null function pointer will have a size of N, where
the null int pointer will have a size of M, where N != M -- so these
are clearly not the same null value.

I don't know Pascal, but I wouldn't be surprised if something similar
held, as nonuniform pointer sizes were a thing once.

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python is DOOMED! Again!

2015-02-01 Thread Devin Jeanpierre
On Sun, Feb 1, 2015 at 2:27 PM, Paul Rubin no.email@nospam.invalid wrote:
 Devin Jeanpierre jeanpierr...@gmail.com writes:
 That said, Haskell (and the rest) do have a sort of type coercion, of
 literals at compile time (e.g. 3 can be an Integer or a Double
 depending on how you use it.)

 That's polymorphism, not coercion.

OK, yes, that fits better into how Haskell works. After all, that's
how Nothing works. If 3 is just a (magic) constructor, then it's no different.

 The compiler figures out at compile
 time what type of 3 you actually mean: there is never an automatic
 runtime conversion.  sqrt(3) works because sqrt expects a floating
 argument so the compiler deduces that the 3 that you wrote denotes a
 float.  sqrt(3+length(xs)) has to fail because length returns an int, so
 3+length(xs) is an int, and you can't pass an int to sqrt.

 BTW it's weird that in this thread, and in the programmer community at
 large, int->string is considered worse than int->float

 Hehe, though int->string leads to plenty of weird bugs.

 Haskell's idiomatic substitute for a null pointer is a Nothing value
 For that matter, how is this (first part) different from, say, Java?

 In Java, functions expecting to receve sensible values can get null by
 surprise.  In Haskell, if a term can have a Nothing value, that has to
 be reflected in its type.  Haskell's bug-magnet counterpart to Java's
 null values is Bottom, an artifact of lazy evaluation.  E.g. you can
 write
x = 3 / 0
 someplace in your program, and the program will accept this and run
 merrily until you try to actually print something that depends on x,
 at which point it crashes.

This isn't a difference in whether there are multiple nulls, though.

I answered my own question later, by accident: Java nulls are castable
to each other if you do it explicitly (routing through Object -- e.g.
(Something)((Object)((SomeOtherThing) null))).

So in that sense, there is only one null, just with some arbitrary
compiler distinctions you can break through if you try hard enough.

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: dunder-docs (was Python is DOOMED! Again!)

2015-02-01 Thread Devin Jeanpierre
-- Devin

On Sun, Feb 1, 2015 at 11:15 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 Gregory Ewing wrote:

 Steven D'Aprano wrote:
 [quote]
 If the object has a method named __dir__(), this method will
 be called and must return the list of attributes.
 [end quote]

 The first inaccuracy is that like all (nearly all?) dunder methods,
 Python only looks for __dir__ on the class, not the instance itself.

 It says "method", not "attribute", so technically
 it's correct. The methods of an object are defined
 by what's in its class.

 Citation please. I'd like to see where that is defined.

https://docs.python.org/3/glossary.html#term-method

 Even if it is so defined, the definition is wrong. You can define methods on
 an instance. I showed an example of an instance with its own personal
 __dir__ method, and showed that dir() ignores it if the instance belongs to
 a new-style class but uses it if it is an old-style class.

You didn't define a method, you defined a callable attribute.
Old-style classes will call those for special method overriding,
because it's the simplest thing to do. New-style classes look methods
up on the class as an optimization, but it also really complicates the
attribute semantics. The lookup strategy is explicitly defined in the
docs. pydoc is, like always, incomplete or inaccurate. See
https://docs.python.org/2/reference/datamodel.html#special-method-names
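A two-line demonstration of that class-only lookup for special methods (new-style class; the names are made up):

```python
class C(object):
    pass

obj = C()
obj.__len__ = lambda: 42        # callable attribute on the instance

try:
    len(obj)                    # implicit special-method lookup skips it
except TypeError as e:
    print("instance attribute ignored:", e)

C.__len__ = lambda self: 42     # on the class, it counts
print(len(obj))                 # 42
```

An old-style (classic) class in Python 2 would have honoured the instance attribute; the new-style rule is exactly the optimization described above.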

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: RAII vs gc (was fortran lib which provide python like data type)

2015-01-31 Thread Devin Jeanpierre
On Fri, Jan 30, 2015 at 1:28 PM, Sturla Molden sturla.mol...@gmail.com wrote:
 in Python. It actually corresponds to

 with Foo() as bar:
 suite

The problem with with statements is that they only handle the case of
RAII with stack allocated variables, and can't handle transfer of
ownership cleanly.

Consider the case of a function that opens a file and returns it:

def myfunction(name, stuff):
f = open(name)
f.seek(stuff) # or whatever
return f

def blahblah():
with myfunction('hello', 12) as f:
 

This code is wrong, because if an error occurs during seek in
myfunction, the file is leaked.

The correct myfunction is as follows:

def myfunction(name, stuff)
f = open(name)
try:
f.seek(stuff)
except:
f.close()
raise

Or whatever. (I would love a close_on_error context manager, BTW.)

With RAII, the equivalent C++ looks nearly exactly like the original
(bad) Python approach, except it uses unique_ptr to store the file,
and isn't broken. (Modern) C++ makes this easy to get right. But
then, this isn't the common case.

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python is DOOMED! Again!

2015-01-31 Thread Devin Jeanpierre
Sorry, sort of responding to both of you.

On Sat, Jan 31, 2015 at 10:12 PM, Paul Rubin no.email@nospam.invalid wrote:
 Steven D'Aprano steve+comp.lang.pyt...@pearwood.info writes:
 Some degree of weakness in a type system is not necessarily bad. Even the
 strongest of languages usually allow a few exceptions, such as numeric
 coercions.

 Haskell doesn't have automatic coercions of any sort.  You have to call
 a conversion function if you want to turn an Int into an Integer.

Yeah. In fact, it isn't very compatible with the ML/Haskell type
system to automatically convert, because it does weird things to type
inference and type unification. So this is common in that language
family.

That said, Haskell (and the rest) do have a sort of type coercion, of
literals at compile time (e.g. 3 can be an Integer or a Double
depending on how you use it.)

BTW it's weird that in this thread, and in the programmer community at
large, int->string is considered worse than int->float, when the
former is predictable and reversible, while the latter is lossy and
can cause subtle bugs. Although at least we don't have ten+ types with
sixty different spellings which change from platform to platform, and
all of which automatically coerce despite massive and outrageous
differences in representable values. (Hello, C.)

 I've never come across a language that has pointers which insists on
 having a separate Nil pointer for ever pointer type

 Haskell's idiomatic substitute for a null pointer is a Nothing value
 (like Python's None) and there's a separate one for every type.  The FFI
 offers actual pointers (Foreign.Ptr) and there is a separate nullPtr
 for every type.

For that matter, how is this (first part) different from, say, Java?

It's really only dynamically typed languages that have a single null
value of a single type. Maybe I misunderstand the original statement.

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ANN: unpyc3 - a python bytecode decompiler for Python3

2015-01-29 Thread Devin Jeanpierre
On Wed, Jan 28, 2015 at 4:34 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 Devin Jeanpierre wrote:
 Git doesn't help if you lose your files in between commits,

 Sure it does? You just lose the changes made since the previous commit, but
 that's no different from restoring from backup. The restored file is only
 as up to date as the last time a backup was taken.

Yeah. My point here is that Drive/Dropbox take snapshots at much
shorter intervals than any reasonable person will commit with a DVCS,
so you lose much less.

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ANN: unpyc3 - a python bytecode decompiler for Python3

2015-01-28 Thread Devin Jeanpierre
On Wed, Jan 28, 2015 at 1:40 PM, Chris Angelico ros...@gmail.com wrote:
 On Thu, Jan 29, 2015 at 5:47 AM, Chris Kaynor ckay...@zindagigames.com wrote:
 I use Google Drive for it for all the stuff I do at home, and use SVN
 for all my personal projects, with the SVN depots also in Drive. The
 combination works well for me, I can transfer between my desktop and
 laptop freely, and have full revision history for debugging issues.

 I just do everything in git, no need for either Drive or something as
 old as SVN. Much easier. :)

Git doesn't help if you lose your files in between commits, or if you
lose the entire directory between pushes.

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: multiprocessing module backport from 3 to 2.7 - spawn feature

2015-01-28 Thread Devin Jeanpierre
On Wed, Jan 28, 2015 at 10:06 AM, Skip Montanaro
skip.montan...@gmail.com wrote:
 On Wed, Jan 28, 2015 at 7:07 AM, Andres Riancho andres.rian...@gmail.com
 wrote:
 The feature I'm specially interested in is the ability to spawn
 processes [1] instead of forking, which is not present in the 2.7
 version of the module.

 Can you explain what you see as the difference between spawn and fork in
 this context? Are you using Windows perhaps? I don't know anything obviously
 different between the two terms on Unix systems.

On Unix, if you fork without exec*, and had threads open, threads
abruptly terminate, resulting in completely broken mutex state etc.,
which leads to deadlocks or worse if you try to acquire resources in
the forked child process. So in such circumstances, multiprocessing
(in 2.7) is not a viable option. But 3.x adds a feature, spawn, that
lets you fork+exec instead of just forking.

I too would be interested in such a backport. I considered writing
one, but haven't had a strong enough need yet.
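In 3.4+ the selection looks like this (a sketch; `work` is a stand-in task):

```python
import multiprocessing as mp

def work(x):
    return x * x

if __name__ == "__main__":
    # Ask for fork+exec ("spawn") instead of bare fork -- Python 3.4+.
    # Avoids the broken-mutex problem when the parent has threads.
    ctx = mp.get_context("spawn")
    with ctx.Pool(2) as pool:
        results = pool.map(work, [1, 2, 3])
    print(results)   # [1, 4, 9]
```

Under "spawn", children start a fresh interpreter and re-import the main module, which is why the target function must be defined at module level and the pool creation must sit behind the `__main__` guard.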

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ANN: unpyc3 - a python bytecode decompiler for Python3

2015-01-28 Thread Devin Jeanpierre
I distrust any backup strategy that requires explicit action by the
user. I've seen users fail too often. (Including myself.)

-- Devin

On Wed, Jan 28, 2015 at 2:02 PM, Chris Angelico ros...@gmail.com wrote:
 On Thu, Jan 29, 2015 at 8:52 AM, Devin Jeanpierre
 jeanpierr...@gmail.com wrote:
 Git doesn't help if you lose your files in between commits, or if you
 lose the entire directory between pushes.

 So you commit often and push immediately. Solved.

 ChrisA
 --
 https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ANN: unpyc3 - a python bytecode decompiler for Python3

2015-01-28 Thread Devin Jeanpierre
FWIW I put all my source code inside Dropbox so that even things I
haven't yet committed/pushed to Bitbucket/Github are backed up. So far
it's worked really well, despite using Dropbox on both Windows and
Linux. (See also: Google Drive, etc.)

(Free) Dropbox has a 30 day recovery time limit, and I think Google
Drive has a trash bin, as well as a 29 day recovery for emptied trash
items.

That said, hindsight is easier than foresight. I'm glad you were able
to recover your files!

-- Devin

On Wed, Jan 28, 2015 at 10:09 AM,  n.poppel...@xs4all.nl wrote:
 Last night I accidentally deleted a group of *.py files 
 (stupid-stupid-stupid!).

 Thanks to unpyc3 I have reconstructed all but one of them so far from the 
 *.pyc files that were in the directory __pycache__. Many thanks!!!

 -- Nico
 --
 https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: An object is an instance (or not)?

2015-01-27 Thread Devin Jeanpierre
On Tue, Jan 27, 2015 at 9:37 PM,  random...@fastmail.us wrote:
 On Tue, Jan 27, 2015, at 16:06, Mario Figueiredo wrote:
 That error message has me start that thread arguing that the error is
 misleading because the Sub object does have the __bases__ attribute.
 It's the Sub instance object that does not have it.

 What do you think "Sub object" means?

 Sub itself is not a Sub object, it is a type object. "instance" is
 implicit in the phrase "foo object".

Yes. Unfortunately, it's still not really completely clear. "Sub
instance" would avoid this confusion for everyone.

I think the only reason to avoid instance in the past would have
been the old-style object confusion, as Ben Finney pointed out. (BTW I
agree with literally every single thing he said in this thread, it's
really amazing.)

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


[issue23322] parser module docs missing second example

2015-01-25 Thread Devin Jeanpierre

New submission from Devin Jeanpierre:

The port to reST missed the second example: 
https://docs.python.org/release/2.5/lib/node867.html

This is still referred to in the docs, so it is not deliberate. For example, 
the token module docs say "The second example for the parser module shows how
to use the symbol module":
https://docs.python.org/3.5/library/token.html#module-token

There is no second example, nor any use of the symbol module, in the docs: 
https://docs.python.org/3.5/library/parser.html

--
assignee: docs@python
components: Documentation
messages: 234716
nosy: Devin Jeanpierre, docs@python
priority: normal
severity: normal
status: open
title: parser module docs missing second example
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23322
___



Re: Alternative to multi-line lambdas: Assign-anywhere def statements

2015-01-24 Thread Devin Jeanpierre
On Sat, Jan 24, 2015 at 11:55 AM, Chris Angelico ros...@gmail.com wrote:
 That's still only able to assign to a key of a dictionary, using the
 function name. There's no way to represent fully arbitrary assignment
 in Python - normally, you can assign to a name, an attribute, a
 subscripted item, etc. (Augmented assignment is a different beast
 altogether, and doesn't really make sense with functions.) There's no
 easy way to say @stash(dispatch_table_a['asdf']) and have that end
 up assigning to exactly that.

Obviously, nobody will be happy until you can do:

def call(*a, **kw): return lambda f: f(*a, **kw)

@call()
def x, y ():
yield 1
yield 2

Actually, maybe not even then.

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Alternative to multi-line lambdas: Assign-anywhere def statements

2015-01-24 Thread Devin Jeanpierre
On Sat, Jan 24, 2015 at 5:58 PM, Ethan Furman et...@stoneleaf.us wrote:
 On 01/24/2015 11:55 AM, Chris Angelico wrote:
 On Sun, Jan 25, 2015 at 5:56 AM, Ethan Furman et...@stoneleaf.us wrote:
 If the non-generic is what you're concerned about:

 # not tested
 dispatch_table_a = {}
 dispatch_table_b = {}
 dispatch_table_c = {}

 class dispatch:
     def __init__(self, dispatch_table):
         self.dispatch = dispatch_table
     def __call__(self, func):
         self.dispatch[func.__name__] = func
         return func

 @dispatch(dispatch_table_a)
 def foo(...):
     pass

 That's still only able to assign to a key of a dictionary, using the
 function name.

 This is a Good Thing.  The def statement populates a few items, __name__ 
 being one of them.  One of the reasons lambda
 is not encouraged is because its name is always '<lambda>', which just ain't 
 helpful when the smelly becomes air borne!  ;)

Actually, in this case you'd probably want the function's __name__ to
be something different, since it'd be confusing if all three dispatch
tables had a 'foo' entry, using functions whose name was 'foo'.

No reason a function's name can't be "dispatch_table_a['foo']"
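Indeed, __name__ accepts any string, identifier or not (illustrative only):

```python
def foo(x):
    return x + 1

# Any str is accepted, even one that isn't a valid identifier.
foo.__name__ = "dispatch_table_a['foo']"
print(foo.__name__)
```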

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Trees

2015-01-20 Thread Devin Jeanpierre
There are similarly many kinds of hash tables.

For a given use case (e.g. a sorted dict, or a list with efficient
removal, etc.), there's a few data structures that make sense, and a
library (even the standard library) doesn't have to expose which one
was picked as long as the performance is good.

-- Devin

On Tue, Jan 20, 2015 at 12:15 PM, Ken Seehart k...@seehart.com wrote:
 Exactly. There are over 23,000 different kinds of trees. There's no way you
 could get all of them to fit in a library, especially a standard one.
 Instead, we prefer to provide people with the tools they need to grow their
 own trees.

 http://caseytrees.org/programs/planting/ctp/
 http://www.ncsu.edu/project/treesofstrength/treefact.htm
 http://en.wikipedia.org/wiki/Tree

 On 1/19/2015 3:01 PM, Mark Lawrence wrote:

 On 19/01/2015 22:06, Zachary Gilmartin wrote:

 Why aren't there trees in the python standard library?


 Probably because you'd never get agreement as to which specific tree and
 which specific implementation was the most suitable for inclusion.


 --
 https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Trees

2015-01-19 Thread Devin Jeanpierre
On Mon, Jan 19, 2015 at 3:08 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 Zachary Gilmartin wrote:

 Why aren't there trees in the python standard library?

 Possibly because they aren't needed? Under what circumstances would you use
 a tree instead of a list or a dict or combination of both?

 That's not a rhetorical question. I am genuinely curious, what task do you
 have that you think must be solved by a tree?

In general, any time you want to maintain a sorted list or mapping,
balanced search tree structures come in handy.

Here's an example task: suppose you want to represent a calendar,
where timeslots can be reserved for something. Calendar events are not
allowed to intersect.

The most important query is: What events are there that intersect with
the timespan between datetimes d1 and d2? (To draw a daily agenda,
figure out if you should display an alert to the user that an event is
ongoing or imminent, etc.)

You also want to be able to add a new event to the calendar, that
takes place between d1 and d2, and to remove a event.

I leave it to the reader to implement this using a sorted map. (hint:
sort by start.)

This maybe seems contrived, but I've used this exact datatype, or a
remarkably similar one, in a few different circumstances: sequenced
actions of characters in a strategy game, animation, motion
planning...

There are a few possible implementations using Python data structures.
You can do it using a linear scan, which gets a little slow pretty
quickly. You can make insertion slow (usually OK) by sorting on
insertion, but if you ever forget to resort your list you will get a
subtle bug you might not notice for a while. And so on. It's better in
every way to use the third-party blist module, so why bother?
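For the curious, here is a flat sketch of the calendar on top of the stdlib bisect module (names invented; this is the keep-it-sorted-on-insertion variant, with the forget-to-resort bug designed out):

```python
import bisect
import datetime as dt

class Calendar:
    """Non-overlapping events kept sorted by start time."""

    def __init__(self):
        self._starts = []   # sorted start times
        self._events = []   # (start, end, label), parallel to _starts

    def add(self, start, end, label):
        i = bisect.bisect_left(self._starts, start)
        # Reject events that would intersect a neighbour.
        if i > 0 and self._events[i - 1][1] > start:
            raise ValueError("overlaps previous event")
        if i < len(self._events) and self._events[i][0] < end:
            raise ValueError("overlaps next event")
        self._starts.insert(i, start)
        self._events.insert(i, (start, end, label))

    def intersecting(self, d1, d2):
        """Events whose [start, end) intersects [d1, d2)."""
        i = bisect.bisect_left(self._starts, d1)
        if i > 0 and self._events[i - 1][1] > d1:
            i -= 1          # previous event spills into the window
        out = []
        while i < len(self._events) and self._events[i][0] < d2:
            out.append(self._events[i])
            i += 1
        return out
```

Insertion is O(n) here because of list.insert; backing the same interface with a balanced tree (or blist) changes only the internals, which is exactly the point about the stdlib not needing to expose which structure was picked.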

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


[issue23275] Can assign [] = (), but not () = []

2015-01-19 Thread Devin Jeanpierre

New submission from Devin Jeanpierre:

>>> [] = ()
>>> () = []
  File "<stdin>", line 1
SyntaxError: can't assign to ()

This contradicts the assignment grammar, which would make both illegal: 
https://docs.python.org/3/reference/simple_stmts.html#assignment-statements

--
components: Interpreter Core
messages: 234324
nosy: Devin Jeanpierre
priority: normal
severity: normal
status: open
title: Can assign [] = (), but not () = []
type: behavior

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23275
___



Re: Hello World

2015-01-17 Thread Devin Jeanpierre
Sorry for necro.

On Sat, Dec 20, 2014 at 10:44 PM, Chris Angelico ros...@gmail.com wrote:
 On Sun, Dec 21, 2014 at 5:31 PM, Terry Reedy tjre...@udel.edu wrote:
 Just to be clear, writing to sys.stdout works fine in Idle.
 import sys; sys.stdout.write('hello ')
 hello  #2.7

 In 3.4, the number of chars? bytes? is returned and written also.

 Whether you mean something different by 'stdout' or not, I am not sure.  The
 error is from writing to a non-existent file descriptor.

 That's because sys.stdout is replaced. But stdout itself, file
 descriptor 1, is not available:

It surprises me that IDLE, and most other shells, don't dup2
stdout/err/in so that those FDs talk to IDLE.
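A sketch of what that dup2 would look like (POSIX-flavoured; the helper name is made up), redirecting the file descriptor itself so that C extensions and children writing to fd 1 are captured too:

```python
import os
import tempfile

def redirect_fd1_to(path):
    """Point file descriptor 1 at `path`; return a dup of the old fd 1."""
    saved = os.dup(1)            # keep a copy of the real stdout
    target = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    os.dup2(target, 1)           # fd 1 now points at the file
    os.close(target)
    return saved                 # caller restores with os.dup2(saved, 1)

path = tempfile.mkstemp()[1]
saved = redirect_fd1_to(path)
os.write(1, b"hello from fd 1\n")   # bypasses sys.stdout entirely
os.dup2(saved, 1)                    # restore the real stdout
os.close(saved)
print(open(path).read(), end="")     # hello from fd 1
```

Replacing sys.stdout only intercepts writes that go through the Python object; dup2 is what it takes to catch writes to the descriptor.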

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: PyWart: Poor Documentation Examples

2015-01-10 Thread Devin Jeanpierre
On Sat, Jan 10, 2015 at 6:32 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 At the point you are demonstrating reduce(), if the reader doesn't
 understand or can't guess the meaning of n = 4, n+1 or range(), they
 won't understand anything you say.

 Teachers need to understand that education is a process of building upon
 that which has come before. If the teacher talks down to the student by
 assuming that the student knows nothing, and tries to go back to first
 principles for every little thing, they will never get anywhere.

Agree wholeheartedly. That said, I do think reduce(operator.mul, [1,
2, 3, 4]) actually _is_ a better example, since it cuts right to the
point.
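That is (Python 3 spelling):

```python
from functools import reduce   # a builtin in Python 2
import operator

# ((1 * 2) * 3) * 4
print(reduce(operator.mul, [1, 2, 3, 4]))   # 24
```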

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Decimals and other numbers

2015-01-09 Thread Devin Jeanpierre
On Fri, Jan 9, 2015 at 2:20 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
-snip-
 I don't understand what you're trying to say here. You can't just
 arbitrarily declare that 0**1 equals something other than 0 (or for that
 matter, doesn't equal anything at all).

You can, actually. It's just silly. (Similarly, you can declare that
0**0 is something other than 1 (or for that matter, doesn't equal
anything at all), but it's silly.)

 Can we agree that 0**1 is well-defined and get back to 0**0?

Believe it or not I actually misread your whole thing and thought we
were talking about 0**0. Otherwise I would've been much briefer...

 Not quite. I agree that, *generally speaking* having 0**0 equal 1 is the
 right answer, or at least *a* right answer, but not always. It depends on
 how you get to 0**0...

 You don't get to a number. Those are limits. Limits and arithmetic
 are different.

 (Well, sort of. :)

 Yes, sort of :-)

I was alluding to the definition of the reals.

 Of course you can get to numbers. We start with counting, that's a way
 to get to the natural numbers, by applying the successor function
 repeatedly until we reach the one we want. Or you can get to pi by
 generating an infinite sequence of closer and closer approximations. Or an
 infinite series. Or an infinite product. Or an infinite continued fraction.
 All of these ways to get to pi converge on the same result.

Yes, all numbers can be represented as a converging limit. However,
that does not mean that the way you compute the result of a function
like x**y is by taking the limit as its arguments approach the input:
that procedure works only for continuous functions. x**y is not
continuous at 0, so this style of computation cannot give you an
answer.

 If 0**0 has a value, we can give that number a name. Let's call it Q. There
 are different ways to evaluate Q:

 lim x -> 0 of sin(x)/x  gives 1

 lim x -> 0 of x**0  gives 1

 lim x -> 0 of 0**x  gives 0

This is a proof that f(x, y) = x**y is not continuous around 0, 0. It
is not a proof that it is undefined at 0, 0, in fact, it says nothing
about the value.
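The discontinuity is easy to check numerically; a minimal sketch:

```python
# Approach (0, 0) along two different paths: x**0 is 1 for every x > 0,
# while 0**x is 0 for every x > 0. The paths disagree, so x**y cannot be
# continuous at the origin -- but that says nothing about the value 0**0.
for x in (1e-3, 1e-6, 1e-9):
    assert x ** 0 == 1.0
    assert 0 ** x == 0.0
print(0 ** 0)  # Python's answer: 1
```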

 0**0 = 0**(5-5) = 0**5 / 0**5 = 0/0  gives indeterminate

Here is a nearly identical proof that 0**1 is indeterminate: 0 =
0**1 = 0**(5 - 4) = 0**5 / 0**4 = 0/0 gives indeterminate.

The fact that you can construct a nonsensical expression from an
expression doesn't mean the original expression was nonsensical. In
this case, your proof was invalid, because 0**(X-Y) is not equivalent
to 0**X/0**Y.

 So we have a problem. Since all these ways to get to Q fail to converge,
 the obvious answer is to declare that Q doesn't exist and that 0**0 is
 indeterminate, and that is what many mathematicians do:

That isn't what indeterminate means.

 However, this begs the question of what we mean by 0**0.

 In the case of m**n, with both m and n positive integers, there is an
 intuitively obvious definition for exponentiation: repeated multiplication.
 But there's no obvious meaning for exponentiation when both m and n are
 zero, hence we (meaning, mathematicians) have to define what it means. So
 long as that definition doesn't lead to contradiction, we can make any
 choice we like.

Sorry, I don't follow. n**0 as repeated multiplication makes perfect
sense: we don't perform any multiplications, but if we did, we'd be
multiplying 'n's. 0**m as repeated multiplication makes perfect sense:
whatever we multiply, it's a bunch of 0s. Why doesn't 0**0 make sense?
We don't perform any multiplications, but if we did, we'd be
multiplying 0s.

If we don't perform any multiplications, the things we didn't multiply
don't matter. Whether they are fives, sevens, or zeroes, the answer is
the same: 1.

 Since you can get different results depending on the method you use to
 calculate it, the technically correct result is that 0**0 is
 indeterminate.

 No, only limits are indeterminate. Calculations not involving limits
 cannot be indeterminate.

 Do tell me what 0/0 equals, if it is not indeterminate.

0/0 is undefined, it isn't indeterminate.

Indeterminate forms are a way of expressing limits where you have
performed a lossy substitution. That is: the limit as x approaches a
of 0/0 is an indeterminate form.

 In the real number system, infinity does not exist. It only exists in
 limits or extended number systems.

 Yes, you are technically correct, the best kind of correct.

 I'm just sketching an informal proof. If you want to make it rigorous by
 using limits, be my guest. It doesn't change the conclusion.

No, the point is that limits are irrelevant.

As has been proven countless times, x**y is not continuous
around the origin. This has no bearing on whether it takes a value at
the origin.


 [...]
 Arguably, *integer* 0**0 could be zero, on the basis that you can't take
 limits of integer-valued quantities, and zero times itself zero times
 surely has to be zero.

 No. No no no. On natural numbers no other thing makes sense than 1.

[issue23086] Add start and stop parameters to the Sequence.index() ABC mixin method

2015-01-09 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

I inferred from Serhiy's comment that if you override __iter__ to be efficient 
and not use __getitem__, this overridden behavior used to pass on to index(), 
but wouldn't after this patch.

--

Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23086



Re: Decimals and other numbers

2015-01-09 Thread Devin Jeanpierre
On Fri, Jan 9, 2015 at 7:05 PM, Gregory Ewing
greg.ew...@canterbury.ac.nz wrote:
 It's far from clear what *anything* multiplied by
 itself zero times should be.

 A better way of thinking about what x**n for integer
 n means is this: Start with 1, and multiply it by
 x n times. The result of this is clearly 1 when n
 is 0, regardless of the value of x.

 5**4 = 5*5*5*5 = 625

 No:

 5**4 = 1*5*5*5*5
 5**3 = 1*5*5*5
 5**2 = 1*5*5
 5**1 = 1*5
 5**0 = 1

I never liked that, it seemed too arbitrary. How about this explanation:

Assume that we know how to multiply a nonempty list of numbers, so
product([a]) == a, product([a, b]) == a * b, and so on.

def product(nums):
    if len(nums) == 0:
        return ???
    return reduce(operator.mul, nums)

It should be the case that given a list of factors A and B,
product(A + B) == product(A) * product(B)   (associativity).
We should let this rule apply even if A or B is the empty list,
otherwise our rules are kind of stupid.

Therefore, product([] + X) == product([]) * product(X)
But since [] + X == X, product([] + X) == product(X)

There's only one number like that: product([]) == 1

(Of course if you choose not to have the full associativity rule for
empty products, then anything is possible.)
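A quick check of the associativity rule using math.prod (Python 3.8+), which settled on the same convention for the empty product:

```python
import math

# Splitting the factor list anywhere, including an empty left half, must
# not change the product; the only empty-product value that satisfies
# product([] + X) == product([]) * product(X) is 1.
nums = [2, 3, 4]
assert math.prod([]) * math.prod(nums) == math.prod([] + nums)
print(math.prod([]))  # 1
```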

-- Devin


[issue23201] Decimal(0)**0 is an error, 0**0 is 1, but Decimal(0) == 0

2015-01-09 Thread Devin Jeanpierre

Devin Jeanpierre added the comment:

Does the spec have a handy list of differences to floats anywhere, or do you 
have to internalize the whole thing?
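For reference, the discrepancy in the issue title:

```python
from decimal import Decimal, InvalidOperation

# int and float agree that 0**0 is 1; the decimal module follows the
# General Decimal Arithmetic spec and signals InvalidOperation instead.
print(0 ** 0)      # 1
print(0.0 ** 0.0)  # 1.0
try:
    result = Decimal(0) ** 0
except InvalidOperation:
    result = "InvalidOperation"
print(result)
```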

--

Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23201



Re: Decimals and other numbers

2015-01-09 Thread Devin Jeanpierre
Marko, your argument is: this function x**y(a, x) must be continuous
on [0, inf), and to be continuous at 0, 0**0 must be a. Since there
are many possible values of a, this is not a justification; it is a
proof by contradiction that the premise was faulty: x**y(a, x)
doesn't have to be continuous after all.

0**0 is 1, which makes some functions continuous and some functions
not, and who cares? It's 1 because that's what is demanded by
combinatorial definitions of exponentiation, and its origins in the
domain of the natural numbers. Knuth says that, thought of
combinatorially on the naturals, x**y counts the number of mappings
from a set of y elements to a set of x elements. Clearly there's only one
mapping from the empty set to itself: the empty mapping. Number theory
demands that performing multiplication among an empty bag of numbers
gives you the result of 1 -- even if the empty bag is an empty bag of
zeroes instead of an empty bag of fives. The result does not change.

Either of those ideas about exponentiation can be thought of as
descriptions of its behavior, or as definitions. They completely
describe its behavior on the naturals, from which we derive its
behavior on the reals.
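Marko's construction in the quoted message below checks out numerically; a sketch:

```python
import math

# For a fixed target a in (0, 1], set y = log base x of a. Then x**y == a
# for every x in (0, 1), while y itself shrinks toward 0 as x -> 0+. The
# limit of x**y along this path is therefore a, not 1.
a = 0.5
for x in (1e-10, 1e-100, 1e-300):
    y = math.log(a, x)
    assert abs(x ** y - a) < 1e-9
    print(x, y)
```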

-- Devin

On Thu, Jan 8, 2015 at 11:28 PM, Marko Rauhamaa ma...@pacujo.net wrote:
 Devin Jeanpierre jeanpierr...@gmail.com:

 If 0**0 is defined, it must be 1.

 You can justify any value a within [0, 1]. For example, choose

y(a, x) = log(a, x)

 Then,

 lim[x -> 0+] y(a, x) = 0

 and:

 lim[x -> 0+] x**y(a, x) = a

 For example,

 >>> import math
 >>> a = 0.5
 >>> x = 1e-100
 >>> y = math.log(a, x)
 >>> y
 0.0030102999566398118
 >>> x**y
 0.5


 Marko


Re: Decimals and other numbers

2015-01-09 Thread Devin Jeanpierre
On Fri, Jan 9, 2015 at 12:58 AM, Devin Jeanpierre
jeanpierr...@gmail.com wrote:
 Arguably, *integer* 0**0 could be zero, on the basis that you can't take
 limits of integer-valued quantities, and zero times itself zero times
 surely has to be zero.

I should have responded in more detail here, sorry.

If you aren't performing any multiplication, why does it matter what
numbers you are multiplying? Doing no multiplications of five is the
same as doing no multiplications of two is the same as doing no
multiplications of... 0.

You could define it to be 0 only if an empty bag of zeroes were
somehow different from an empty bag of fives, but it's hard to imagine
what that difference would be. It surely does *not* have to be zero.

Obviously, this kind of ridiculousness comes naturally to Java and C++
programmers, with their statically typed collections. It's no surprise
that's where the Decimal spec came from. ;)

-- Devin


Re: Decimals and other numbers

2015-01-09 Thread Devin Jeanpierre
On Fri, Jan 9, 2015 at 12:49 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
 Devin Jeanpierre wrote:

 On Thu, Jan 8, 2015 at 6:43 PM, Dave Angel da...@davea.name wrote:
 What you don't say is which behavior you actually expected.  Since 0**0
 is undefined mathematically, I'd expect either an exception or a NAN
 result.

 It can be undefined, if you choose for it to be. You can also choose
 to not define 0**1, of course.

 No you can't -- that would make arithmetic inconsistent. 0**1 is perfectly
 well defined as 0 however you look at it:

 lim of x -> 0 of x**1 = 0
 lim of y -> 1 of 0**y = 0

This is a misunderstanding of limits. Limits are allowed to differ
from the actual evaluated result when you substitute the limit point:
that's what it means to be discontinuous.

What you call making  arithmetic inconsistent, I call making the
function inside the limit discontinuous at 0.


 If 0**0 is defined, it must be 1. I
 Googled around to find a mathematician to back me up, here:
 http://arxiv.org/abs/math/9205211 (page 6, ripples).

 Not quite. I agree that, *generally speaking* having 0**0 equal 1 is the
 right answer, or at least *a* right answer, but not always. It depends on
 how you get to 0**0...

You don't get to a number. Those are limits. Limits and arithmetic
are different.

(Well, sort of. :)

 Since you can get different results depending on the method you use to
 calculate it, the technically correct result is that 0**0 is
 indeterminate.

No, only limits are indeterminate. Calculations not involving limits
cannot be indeterminate.

-snip-
 log(Q) = 0*-inf

 What is zero times infinity? In the real number system, that is
 indeterminate, again because it depends on how you calculate it

In the real number system, infinity does not exist. It only exists in
limits or extended number systems.

 : naively it
 sounds like it should be 0, but infinity is pretty big and if you add up
 enough zeroes in the right way you can actually get something non-zero.
 There's no one right answer. So if the log of Q is indeterminate, then so
 must be Q.

 But there are a host of good reasons for preferring 0**0 = 1. Donald Knuth
 writes (using ^ for power):

 Some textbooks leave the quantity 0^0 undefined, because the
 functions 0^x and x^0 have different limiting values when x
 decreases to 0. But this is a mistake. We must define x^0=1
 for all x , if the binomial theorem is to be valid when x=0,
 y=0, and/or x=-y. The theorem is too important to be arbitrarily
 restricted! By contrast, the function 0^x is quite unimportant.

 More discussion here:

 http://mathforum.org/dr.math/faq/faq.0.to.0.power.html

I've already been citing Knuth. :P

 I expected 1, nan, or an exception, but more importantly, I expected
 it to be the same for floats and decimals.

 Arguably, *integer* 0**0 could be zero, on the basis that you can't take
 limits of integer-valued quantities, and zero times itself zero times
 surely has to be zero.

No. No no no. On natural numbers no other thing makes sense than 1.
All of the definitions of exponentiation for natural numbers require
it, except for those derived from analytical notions of
exponentiation. (Integers just give you ratios of natural
exponentials, so again no.)

-- Devin

