[Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Georg Brandl
Hey all,

based on the feedback so far, I revised the PEP.  There is now
a much simpler rule for allowed underscores, with no exceptions.
This made the grammar simpler as well.

---

PEP: 515
Title: Underscores in Numeric Literals
Version: $Revision$
Last-Modified: $Date$
Author: Georg Brandl
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Feb-2016
Python-Version: 3.6

Abstract and Rationale
======================

This PEP proposes to extend Python's syntax so that underscores can be used in
integral, floating-point and complex number literals.

This is a common feature of other modern languages, and can aid readability of
long literals, or literals whose value should clearly separate into parts, such
as bytes or words in hexadecimal notation.

Examples::

    # grouping decimal numbers by thousands
    amount = 10_000_000.0

    # grouping hexadecimal addresses by words
    addr = 0xDEAD_BEEF

    # grouping bits into bytes in a binary literal
    flags = 0b_0011__0100_1110

    # making the literal suffix stand out more
    imag = 1.247812376e-15_j


Specification
=============

The current proposal is to allow one or more consecutive underscores following
digits and base specifiers in numeric literals.

The production list for integer literals would therefore look like this::

   integer: decimalinteger | octinteger | hexinteger | bininteger
   decimalinteger: nonzerodigit (digit | "_")* | "0" ("0" | "_")*
   nonzerodigit: "1"..."9"
   digit: "0"..."9"
   octinteger: "0" ("o" | "O") "_"* octdigit (octdigit | "_")*
   hexinteger: "0" ("x" | "X") "_"* hexdigit (hexdigit | "_")*
   bininteger: "0" ("b" | "B") "_"* bindigit (bindigit | "_")*
   octdigit: "0"..."7"
   hexdigit: digit | "a"..."f" | "A"..."F"
   bindigit: "0" | "1"

For floating-point and complex literals::

   floatnumber: pointfloat | exponentfloat
   pointfloat: [intpart] fraction | intpart "."
   exponentfloat: (intpart | pointfloat) exponent
   intpart: digit (digit | "_")*
   fraction: "." intpart
   exponent: ("e" | "E") ["+" | "-"] intpart
   imagnumber: (floatnumber | intpart) ("j" | "J")
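As a rough illustration only, the draft productions above can be transcribed into regular expressions. This is a sketch, not the posted patch: the names ``NUMBER`` and ``is_draft_literal`` are made up for this example, and the patterns only check the shape of a literal, not its value.

```python
import re

# Regex transcription of the draft productions (illustrative sketch only;
# none of these names come from the PEP or from the patch on the tracker).
DECIMAL = r"(?:[1-9][0-9_]*|0[0_]*)"
OCT = r"0[oO]_*[0-7][0-7_]*"
HEX = r"0[xX]_*[0-9a-fA-F][0-9a-fA-F_]*"
BIN = r"0[bB]_*[01][01_]*"
INTEGER = rf"(?:{OCT}|{HEX}|{BIN}|{DECIMAL})"

INTPART = r"[0-9][0-9_]*"
FRACTION = rf"\.{INTPART}"
EXPONENT = rf"[eE][+-]?{INTPART}"
POINTFLOAT = rf"(?:(?:{INTPART})?{FRACTION}|{INTPART}\.)"
EXPONENTFLOAT = rf"(?:{POINTFLOAT}|{INTPART}){EXPONENT}"
FLOAT = rf"(?:{EXPONENTFLOAT}|{POINTFLOAT})"
IMAG = rf"(?:{FLOAT}|{INTPART})[jJ]"

NUMBER = re.compile(rf"(?:{IMAG}|{FLOAT}|{INTEGER})")


def is_draft_literal(text):
    """Return True if text has the shape of a numeric literal in the draft grammar."""
    return NUMBER.fullmatch(text) is not None


# The four examples from the Abstract are all accepted by this sketch.
examples = ["10_000_000.0", "0xDEAD_BEEF", "0b_0011__0100_1110", "1.247812376e-15_j"]
accepted = [is_draft_literal(e) for e in examples]
```

Under this reading a trailing underscore such as ``123_456_`` is accepted, while an underscore directly after the decimal point or the ``e`` is not.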


Alternative Syntax
==================

Underscore Placement Rules
--------------------------

Instead of the liberal rule specified above, the use of underscores could be
limited.  Common rules are (see the "other languages" section):

* Only one consecutive underscore allowed, and only between digits.
* Multiple consecutive underscores allowed, but only between digits.

A less common rule would be to allow underscores only every N digits (where N
could be 3 for decimal literals, or 4 for hexadecimal ones).  This is
unnecessarily restrictive, especially considering the separator placement is
different in different cultures.

Different Separators
--------------------

A proposed alternate syntax was to use whitespace for grouping.  Although
strings are a precedent for combining adjoining literals, the behavior can lead
to unexpected effects which are not possible with underscores.  Also, no other
language is known to use this rule, except for languages that generally
disregard any whitespace.

C++14 introduces apostrophes for grouping, which is not considered due to the
conflict with Python's string literals. [1]_


Behavior in Other Languages
===========================

Those languages that do allow underscore grouping implement a large variety of
rules for allowed placement of underscores.  This is a listing placing the known
rules into three major groups.  In cases where the language spec contradicts the
actual behavior, the actual behavior is listed.

**Group 1: liberal**

This group is the least homogeneous: the rules vary slightly between languages.
All of them allow trailing underscores.  Some allow underscores after non-digits
like the ``e`` or the sign in exponents.

* D [2]_
* Perl 5 (underscores basically allowed anywhere, although docs say it's more
  restricted) [3]_
* Rust (allows between exponent sign and digits) [4]_
* Swift (although textual description says "between digits") [5]_

**Group 2: only between digits, multiple consecutive underscores**

* C# (open proposal for 7.0) [6]_
* Java [7]_

**Group 3: only between digits, only one underscore**

* Ada [8]_
* Julia (but not in the exponent part of floats) [9]_
* Ruby (docs say "anywhere", in reality only between digits) [10]_


Implementation
==============

A preliminary patch that implements the specification given above has been
posted to the issue tracker. [11]_


Open Questions
==============

This PEP currently only proposes changing the literal syntax.  The following
extensions are open for discussion:

* Allowing underscores in string arguments to the ``Decimal`` constructor.  It
  could be argued that these are akin to literals, since there is no Decimal
  literal available (yet).

* Allowing underscores in string arguments to ``int()`` with base argument 0,
  

Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Paul Moore
On 10 February 2016 at 23:14, Steven D'Aprano  wrote:
> On Wed, Feb 10, 2016 at 10:53:09PM +, Paul Moore wrote:
>> On 10 February 2016 at 22:20, Georg Brandl  wrote:
>> > This came up in python-ideas, and has met mostly positive comments,
>> > although the exact syntax rules are up for discussion.
>>
>> +1 on the PEP. Is there any value in allowing underscores in strings
>> passed to the Decimal constructor as well? The same sorts of
>> justifications would seem to apply. It's perfectly arguable that the
>> change for Decimal would be so rarely used as to not be worth it,
>> though, so I don't mind either way in practice.
>
> Let's delay making any change to string conversions for now, and that
> includes Decimal. We can also do this:
>
> Decimal("123_456_789.0_12345_67890".replace("_", ""))
>
> for those who absolutely must include underscores in their numeric
> strings. The big win is for numeric literals, not numeric string
> conversions.

Good point. Maybe add this as an example in the PEP to explain why
conversions are excluded. But I did only mean the Decimal constructor,
which I think of more as a "decimal literal" - whereas int() and
float() are (in my mind at least) conversion functions and as such
should not be coupled to literal format (for example, 0x0001 notation
isn't supported by int())
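For reference, the workaround Steven suggests in the quoted text can be wrapped in a small helper. This is only a sketch; the name ``decimal_from_grouped`` is hypothetical and not part of the ``decimal`` module.

```python
from decimal import Decimal


def decimal_from_grouped(text):
    """Strip grouping underscores, then hand the cleaned string to Decimal."""
    return Decimal(text.replace("_", ""))


value = decimal_from_grouped("123_456_789.0_12345_67890")
```

The helper does not validate where the underscores sit; it simply deletes them before conversion.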

Paul
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Georg Brandl
On 02/10/2016 11:42 PM, Glenn Linderman wrote:
> On 2/10/2016 2:20 PM, Georg Brandl wrote:
>> This came up in python-ideas, and has met mostly positive comments,
>> although the exact syntax rules are up for discussion.
>>
>> cheers,
>> Georg
>>
>> 
>>
>> PEP: 515
>> Title: Underscores in Numeric Literals
>> Version: $Revision$
>> Last-Modified: $Date$
>> Author: Georg Brandl
>> Status: Draft
>> Type: Standards Track
>> Content-Type: text/x-rst
>> Created: 10-Feb-2016
>> Python-Version: 3.6
>>
>> Abstract and Rationale
>> ==
>>
>> This PEP proposes to extend Python's syntax so that underscores can be used 
>> in
>> integral and floating-point number literals.
>>
>> This is a common feature of other modern languages, and can aid readability 
>> of
>> long literals, or literals whose value should clearly separate into parts, 
>> such
>> as bytes or words in hexadecimal notation.
>>
>> Examples::
>>
>> # grouping decimal numbers by thousands
>> amount = 10_000_000.0
>>
>> # grouping hexadecimal addresses by words
>> addr = 0xDEAD_BEEF
>>
>> # grouping bits into bytes in a binary literal
>> flags = 0b_0011__0100_1110
> 
> +1
> 
> You don't mention potential restrictions that decimal numbers should permit 
> them
> only every three places, or hex ones only every 2 or 4, and your binary 
> example
> mentions grouping into bytes, but actually groups into nybbles.
> 
> But such restrictions would be annoying: if it is useful to the coder to use
> them, that is fine. But different situation may find other placements more
> useful... particularly in binary, as it might want to match widths of various
> bitfields.
> 
> Adding that as a rejected consideration, with justifications, would be 
> helpful.

I added a short paragraph.

Thanks for the feedback,
Georg





Re: [Python-Dev] Time for a change of random number generator?

2016-02-11 Thread Steven D'Aprano
On Thu, Feb 11, 2016 at 01:08:41PM +1300, Greg Ewing wrote:
> The Mersenne Twister is no longer regarded as quite state-of-the art
> because it can get into states that produce long sequences that are
> not very random.
> 
> There is a variation on MT called WELL that has better properties
> in this regard. Does anyone think it would be a good idea to replace
> MT with WELL as Python's default rng?
> 
> https://en.wikipedia.org/wiki/Well_equidistributed_long-period_linear

I'm not able to judge the claims about which PRNG is better (perhaps Tim 
Peters has an opinion?) but if we do change, I'd like to see the 
existing random.Random moved to random.MT_Random for backwards 
compatibility and compatibility with other software which uses MT. Not 
necessarily saying that we have to keep it around forever (after all, we 
did dump the Wichmann-Hill PRNG some time ago) but we ought to keep it 
for at least a couple of releases.


-- 
Steve


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Georg Brandl
On 02/11/2016 10:10 AM, Paul Moore wrote:
> On 10 February 2016 at 23:14, Steven D'Aprano  wrote:
>> On Wed, Feb 10, 2016 at 10:53:09PM +, Paul Moore wrote:
>>> On 10 February 2016 at 22:20, Georg Brandl  wrote:
>>> > This came up in python-ideas, and has met mostly positive comments,
>>> > although the exact syntax rules are up for discussion.
>>>
>>> +1 on the PEP. Is there any value in allowing underscores in strings
>>> passed to the Decimal constructor as well? The same sorts of
>>> justifications would seem to apply. It's perfectly arguable that the
>>> change for Decimal would be so rarely used as to not be worth it,
>>> though, so I don't mind either way in practice.
>>
>> Let's delay making any change to string conversions for now, and that
>> includes Decimal. We can also do this:
>>
>> Decimal("123_456_789.0_12345_67890".replace("_", ""))
>>
>> for those who absolutely must include underscores in their numeric
>> strings. The big win is for numeric literals, not numeric string
>> conversions.
> 
> Good point. Maybe add this as an example in the PEP to explain why
> conversions are excluded. But I did only mean the Decimal constructor,
> which I think of more as a "decimal literal" - whereas int() and
> float() are (in my mind at least) conversion functions and as such
> should not be coupled to literal format (for example, 0x0001 notation
> isn't supported by int())

Actually, it is.  Just not without a base argument, because the default
base is 10.  But both with base 0 and base 16, '0x' prefixes are allowed.
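A quick check of the behaviour described here, as a sketch; the helper name ``parse_prefixed`` is made up for illustration.

```python
def parse_prefixed(text, base):
    """Return int(text, base), or None if int() rejects the string."""
    try:
        return int(text, base)
    except ValueError:
        return None


explicit_hex = parse_prefixed("0x0001", 16)  # prefix is allowed with base 16
inferred = parse_prefixed("0x0001", 0)       # base 0 infers the base from the prefix
default_base = parse_prefixed("0x0001", 10)  # the default base 10 rejects the prefix
```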

That's why I'm leaning towards supporting the underscores.  In any case
I'm preparing the implementation.

Georg



Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Serhiy Storchaka

On 11.02.16 00:20, Georg Brandl wrote:

> **Group 1: liberal (like this PEP)**
>
> * D [2]_
> * Perl 5 (although docs say it's more restricted) [3]_
> * Rust [4]_
> * Swift (although textual description says "between digits") [5]_
>
> **Group 2: only between digits, multiple consecutive underscores**
>
> * C# (open proposal for 7.0) [6]_
> * Java [7]_
>
> **Group 3: only between digits, only one underscore**
>
> * Ada [8]_
> * Julia (but not in the exponent part of floats) [9]_
> * Ruby (docs say "anywhere", in reality only between digits) [10]_


C++ is in this group too.

The documentation of Perl explicitly says that Perl is in this group too 
(23__500 is not legal). Perhaps there is a bug in the Perl implementation. 
And maybe Swift is intended to be in this group.


I think we should follow the majority of languages and use simple rule: 
"only between digits".


I have provided an implementation.



Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Georg Brandl
On 02/11/2016 12:04 AM, Victor Stinner wrote:
> It looks like the implementation https://bugs.python.org/issue26331
> only changes the Python parser.
> 
> What about other functions converting strings to numbers at runtime
> like int(str) and float(str)? Paul also asked for Decimal(str).

I added these as "Open Questions" to the PEP.

For Decimal, it's probably a good idea.  For int(), it should only be
allowed with base argument = 0.  For float() and complex(), probably.

Georg



Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Barry Warsaw
On Feb 11, 2016, at 09:22 AM, Georg Brandl wrote:

>based on the feedback so far, I revised the PEP.  There is now
>a much simpler rule for allowed underscores, with no exceptions.
>This made the grammar simpler as well.

I'd be +1, but there's something missing from the PEP: what the underscores
*mean*.  You describe the syntax nicely, but not the semantics.

From reading the examples, I'd guess that the underscores are semantically
transparent, meaning that the resulting value is the same if you just removed
the underscores and interpreted the resulting literal.

Right or wrong, could you please add a paragraph explaining the meaning of the
underscores?
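The reading guessed at here can be sketched in code: delete the underscores, then interpret what remains as an ordinary literal. This is an assumption about the intended semantics, not something the draft states, and the helper name ``value_without_underscores`` is hypothetical.

```python
import ast


def value_without_underscores(literal_text):
    """Evaluate a numeric literal after deleting every underscore (assumed semantics)."""
    return ast.literal_eval(literal_text.replace("_", ""))


amount = value_without_underscores("10_000_000.0")
addr = value_without_underscores("0xDEAD_BEEF")
flags = value_without_underscores("0b_0011__0100_1110")
imag = value_without_underscores("1.247812376e-15_j")
```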

Cheers,
-Barry


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Glenn Linderman

On 2/11/2016 12:22 AM, Georg Brandl wrote:

> Hey all,
>
> based on the feedback so far, I revised the PEP.  There is now
> a much simpler rule for allowed underscores, with no exceptions.
> This made the grammar simpler as well.

+1 overall

> Examples::
>
>     # grouping decimal numbers by thousands
>     amount = 10_000_000.0
>
>     # grouping hexadecimal addresses by words
>     addr = 0xDEAD_BEEF
>
>     # grouping bits into bytes in a binary literal

nybbles, not bytes, is shown... which is more readable, and does group
into bytes also.

>     flags = 0b_0011__0100_1110

+1 on 0b_ and 0X_ and, especially, 0O_ (but why anyone would use
uppercase base designators is beyond me, as it is definitely less readable)

>     # making the literal suffix stand out more
>     imag = 1.247812376e-15_j

+1 on _j



Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Serhiy Storchaka

On 11.02.16 10:22, Georg Brandl wrote:

> Abstract and Rationale
> ======================
>
> This PEP proposes to extend Python's syntax so that underscores can be used in
> integral, floating-point and complex number literals.
>
> This is a common feature of other modern languages, and can aid readability of
> long literals, or literals whose value should clearly separate into parts, such
> as bytes or words in hexadecimal notation.


I have strong preference for more strict and simpler rule, used by most 
other languages -- "only between two digits". Main arguments:


1. Simple rule is easier to understand, remember and recognize. I care 
not about the complexity of the implementation (there is no large 
difference), but about cognitive complexity.


2. Most languages use this rule. It is better to follow non-formal 
standard than invent the rule that differs from rules in every other 
language. This will help programmers that use multiple languages.


I have provided an alternative patch and can provide an alternative PEP 
if it is needed.



> The production list for integer literals would therefore look like this::
>
>    integer: decimalinteger | octinteger | hexinteger | bininteger
>    decimalinteger: nonzerodigit (digit | "_")* | "0" ("0" | "_")*
>    nonzerodigit: "1"..."9"
>    digit: "0"..."9"
>    octinteger: "0" ("o" | "O") "_"* octdigit (octdigit | "_")*

octinteger: "0" ("o" | "O") octdigit (["_"] octdigit)*

>    hexinteger: "0" ("x" | "X") "_"* hexdigit (hexdigit | "_")*

hexinteger: "0" ("x" | "X") hexdigit (["_"] hexdigit)*

>    bininteger: "0" ("b" | "B") "_"* bindigit (bindigit | "_")*

bininteger: "0" ("b" | "B") bindigit (["_"] bindigit)*

>    octdigit: "0"..."7"
>    hexdigit: digit | "a"..."f" | "A"..."F"
>    bindigit: "0" | "1"
>
> For floating-point and complex literals::
>
>    floatnumber: pointfloat | exponentfloat
>    pointfloat: [intpart] fraction | intpart "."
>    exponentfloat: (intpart | pointfloat) exponent
>    intpart: digit (digit | "_")*

intpart: digit (["_"] digit)*

>    fraction: "." intpart
>    exponent: ("e" | "E") ["+" | "-"] intpart
>    imagnumber: (floatnumber | intpart) ("j" | "J")



> **Group 1: liberal**
>
> This group is the least homogeneous: the rules vary slightly between languages.
> All of them allow trailing underscores.  Some allow underscores after non-digits
> like the ``e`` or the sign in exponents.
>
> * D [2]_
> * Perl 5 (underscores basically allowed anywhere, although docs say it's more
>   restricted) [3]_
> * Rust (allows between exponent sign and digits) [4]_
> * Swift (although textual description says "between digits") [5]_
>
> **Group 2: only between digits, multiple consecutive underscores**
>
> * C# (open proposal for 7.0) [6]_
> * Java [7]_
>
> **Group 3: only between digits, only one underscore**
>
> * Ada [8]_
> * Julia (but not in the exponent part of floats) [9]_
> * Ruby (docs say "anywhere", in reality only between digits) [10]_


This classification is misleading. The difference between groups 2 and 3 
is less than between different languages in group 1. To be fair, groups 
2 and 3 should be united in one group. C++ should be included in this 
group. Perl 5 and Swift should be either included in both groups or 
excluded from any group, because they have inconsistencies between the 
documentation and the implementation or between different parts of the 
documentation.


With correct classification it is obvious what variant is the most popular.




Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Ethan Furman

On 02/11/2016 10:50 AM, Serhiy Storchaka wrote:
> I have strong preference for more strict and simpler rule, used by
> most other languages -- "only between two digits". Main arguments:

> 2. Most languages use this rule. It is better to follow non-formal
> standard than invent the rule that differs from rules in every other
> language. This will help programmers that use multiple languages.

If Python followed other languages in everything:

1) Python would not need to exist; and
2) Python would suck  ;)

If our rule is more permissive than other languages then cross-language 
developers can still use the same style in both languages, without 
penalizing those who want to use the extra freedom in Python.


--
~Ethan~


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Glenn Linderman

On 2/11/2016 11:01 AM, Ethan Furman wrote:

> On 02/11/2016 10:50 AM, Serhiy Storchaka wrote:
> > I have strong preference for more strict and simpler rule, used by
> > most other languages -- "only between two digits". Main arguments:
>
> > 2. Most languages use this rule. It is better to follow non-formal
> > standard than invent the rule that differs from rules in every other
> > language. This will help programmers that use multiple languages.
>
> If Python followed other languages in everything:
>
> 1) Python would not need to exist; and
> 2) Python would suck  ;)
>
> If our rule is more permissive than other languages then
> cross-language developers can still use the same style in both
> languages, without penalizing those who want to use the extra freedom
> in Python.


Ditto.

If people need an idea to shoot down, regarding literal constants, and 
because I couldn't find a Python-Non-Ideas list to post this in, here is 
one.  Note that it is unambiguous, does not conflict with existing 
binary literals, but otherwise sucks.  Please vote this idea down with 
emphasis:


Base 64 decoding literals:

print( 0b64_CjMy_NTM0_Mjkw_NQ )
325342905


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Nick Coghlan
On 11 February 2016 at 19:59, Victor Stinner  wrote:
> 2016-02-11 9:11 GMT+01:00 Georg Brandl :
>> On 02/11/2016 12:04 AM, Victor Stinner wrote:
>>> It looks like the implementation https://bugs.python.org/issue26331
>>> only changes the Python parser.
>>>
>>> What about other functions converting strings to numbers at runtime
>>> like int(str) and float(str)? Paul also asked for Decimal(str).
>>
>> I added these as "Open Questions" to the PEP.
>
> Ok nice. Now another question :-)
>
> Would it be useful to add an option to repr(int) and repr(float), or a
> formatter to int.__format__() and float.__format__() to add an
> underscore for thousands.

Given that str.format supports a thousands separator:

>>> "{:,d}".format(100000000)
'100,000,000'

it might be reasonable to permit "_" in place of "," in the format specifier.

However, I'm not sure when you'd use it aside from code generation,
and you can already insert the thousands separator and then replace
"," with "_".
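The existing workaround mentioned here can be sketched as a one-line helper; the name ``group_with_underscores`` is invented for this example.

```python
def group_with_underscores(n):
    """Format an integer with ',' as thousands separator, then swap in '_'."""
    return "{:,d}".format(n).replace(",", "_")


grouped = group_with_underscores(100000000)
```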

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Petr Viktorin
On 02/11/2016 11:07 AM, Nick Coghlan wrote:
> On 11 February 2016 at 19:59, Victor Stinner  wrote:
>> 2016-02-11 9:11 GMT+01:00 Georg Brandl :
>>> On 02/11/2016 12:04 AM, Victor Stinner wrote:
>>>> It looks like the implementation https://bugs.python.org/issue26331
>>>> only changes the Python parser.
>>>>
>>>> What about other functions converting strings to numbers at runtime
>>>> like int(str) and float(str)? Paul also asked for Decimal(str).
>>>
>>> I added these as "Open Questions" to the PEP.
>>
>> Ok nice. Now another question :-)
>>
>> Would it be useful to add an option to repr(int) and repr(float), or a
>> formatter to int.__format__() and float.__format__() to add an
>> underscore for thousands.
> 
> Given that str.format supports a thousands separator:
> 
> >>> "{:,d}".format(100000000)
> '100,000,000'
> 
> it might be reasonable to permit "_" in place of "," in the format specifier.
> 
> However, I'm not sure when you'd use it aside from code generation,
> and you can already insert the thousands separator and then replace
> "," with "_".

It would make "SI style" [0] numbers a little bit more straightforward
to generate, since the order of operations wouldn't matter.
Currently it's:

"{:,}".format(1234.5678).replace(',', ' ').replace('.', ',')

Also it would make numbers with decimal comma and dot as separator a bit
easier to generate. Currently, that's (from PEP 378):

format(n, "6,f").replace(",", "X").replace(".", ",").replace("X", ".")

[0] https://en.wikipedia.org/wiki/Decimal_mark#Examples_of_use
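To make the two one-liners above concrete, here they are wrapped in helper functions with sample values. The names ``si_style`` and ``dot_grouped`` are made up for this sketch.

```python
def si_style(x):
    """Space as thousands separator, comma as decimal mark."""
    return "{:,}".format(x).replace(",", " ").replace(".", ",")


def dot_grouped(x):
    """Dot as thousands separator, comma as decimal mark (recipe from PEP 378)."""
    return format(x, "6,f").replace(",", "X").replace(".", ",").replace("X", ".")


si = si_style(1234.5678)
dotted = dot_grouped(1234.5678)
```

The second helper needs the temporary ``X`` placeholder because the two characters are being swapped; with an underscore option in the format specifier the order of the replacements would no longer matter.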




Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Georg Brandl
On 02/11/2016 11:17 AM, Serhiy Storchaka wrote:

>> **Group 3: only between digits, only one underscore**
>>
>> * Ada [8]_
>> * Julia (but not in the exponent part of floats) [9]_
>> * Ruby (docs say "anywhere", in reality only between digits) [10]_
> 
> C++ is in this group too.
> 
> The documentation of Perl explicitly says that Perl is in this group too 
> (23__500 is not legal). Perhaps there is a bug in Perl implementation. 
> And may be Swift is intended to be in this group.
> 
> I think we should follow the majority of languages and use simple rule: 
> "only between digits".
> 
> I have provided an implementation.

Thanks for the alternate patch.  I used the two-function approach you took
in ast.c for my latest revision.

I still think that some cases (like two of the examples in the PEP,
0b__ and 1.5_j) are worth having, and therefore a more relaxed
rule is preferable.

cheers,
Georg




Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Steven D'Aprano
On Wed, Feb 10, 2016 at 08:41:27PM -0800, Andrew Barnert wrote:

> And honestly, are you really claiming that in your opinion, "123_456_" 
> is worse than all of their other examples, like "1_23__4"?

Yes I am, because 123_456_ looks like you've forgotten to finish typing 
the last group of digits, while 1_23__4 merely looks like you have no 
taste.

> They're both presented as something the syntax allows, and neither one 
> looks like something I'd ever want to write, much less promote in a 
> style guide or something, but neither one screams out as something 
> that's so heinous we need to complicate the language to ensure it 
> raises a SyntaxError. Yes, that's my opinion, but do you really have a 
> different opinion about any part of that?

I don't think the rule "underscores must occur between digits" is 
complicating the specification. It is *less* complicated to explain this 
rule than to give a whole lot of special cases:

- can you use a leading or trailing underscore?
- can an underscore follow the base prefix 0b 0o 0x?
- can an underscore precede or follow the decimal place?
- can an underscore precede or follow a + or - sign?
- can an underscore precede or follow the e|E exponent symbol?
- can an underscore precede or follow the j suffix for complex numbers?

versus 

- underscores can only appear between (hex)digits.

I'm not sure why you seem to think that "only between digits" is more 
complex than the alternative -- to me it is less complex, with no 
special cases to memorise, just one general rule.

Of course, if (generic) you think that it is a feature to be able to put 
underscores before the decimal point, after the E exponent, etc. then 
you will dislike my suggested rule. That's okay, but in that case, it is 
not because of "simplicity|complexity" but because (generic) you want to 
be able to write things which my rule would prohibit.



-- 
Steve


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Steven D'Aprano
On Thu, Feb 11, 2016 at 08:07:56PM +1000, Nick Coghlan wrote:

> Given that str.format supports a thousands separator:
> 
> >>> "{:,d}".format(100000000)
> '100,000,000'
> 
> it might be reasonable to permit "_" in place of "," in the format specifier.

+1


> However, I'm not sure when you'd use it aside from code generation,
> and you can already insert the thousands separator and then replace
> "," with "_".

It's not always easy or convenient to call .replace(",", "_") on the 
output of format:

"With my help, the {} caught {:,d} ants.".format("aardvark", 100000000)

would need to be re-written as something like:

py> "With my help, the {} caught {} ants.".format("aardvark", 
... "{:,d}".format(100000000).replace(",", "_"))
'With my help, the aardvark caught 100_000_000 ants.'



-- 
Steve


[Python-Dev] fullOfEels, assistant program for writing Python extension modules in C

2016-02-11 Thread Hugh Fisher
I've written a Python program named fullOfEels to speed up the first
stages of writing Python extension modules in C.

It is not a replacement for SWIG, SIP, or ctypes. It's for the case
where you want to work in the opposite direction, specifying a Python
API and then writing an implementation in C. (A small niche maybe, but
I hope it isn't just me who sometimes works this way.)

The input is a Python module specifying what it should do but not how,
with all the functions, classes, and methods being just pass. The
output is a pair of .h and .c files with all the boilerplate C code
required: module initialization, class type structs, C method
functions and method tables.

Downloadable from
https://bitbucket.org/hugh_fisher/fullofeels

All feedback and suggestions welcome.

-- 

cheers,
Hugh Fisher


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Steven D'Aprano
On Thu, Feb 11, 2016 at 06:03:34PM +, Brett Cannon wrote:
> On Thu, 11 Feb 2016 at 02:13 Steven D'Aprano  wrote:
> 
> > On Wed, Feb 10, 2016 at 08:41:27PM -0800, Andrew Barnert wrote:
> >
> > > And honestly, are you really claiming that in your opinion, "123_456_"
> > > is worse than all of their other examples, like "1_23__4"?
> >
> > Yes I am, because 123_456_ looks like you've forgotten to finish typing
> > the last group of digits, while 1_23__4 merely looks like you have no
> > taste.
> >
> 
> OK, but the keyword in your sentence is "taste". 

I disagree. The key *idea* in my sentence is that the trailing 
underscore looks like a programming error. In my opinion, avoiding that 
impression is important enough to make trailing underscores a syntax 
error.

I've seen a few people vote +1 for things like 123_j and 1.23_e99, but I 
haven't seen anyone in favour of trailing underscores. Does anyone think 
there is a good case for allowing trailing underscores?


> If we update PEP 8 for our
> needs to say "Numerical literals should not have multiple underscores in a
> row or have a trailing underscore" then this is taken care of. We get a
> dead-simple rule for when underscores can be used, the implementation is
> simple, and we get to have more tasteful usage in the stdlib w/o forcing
> our tastes upon everyone or complicating the rules or implementation.

I think this is a misrepresentation of the alternative. As I see it, we 
have two alternatives:

- one or more underscores can appear AFTER the base specifier or any digit;
- one or more underscores can appear BETWEEN two digits.

To describe the second alternative as "complicating the rules" is, I 
think, grossly unfair. And if Serhiy's proposal is correct, the 
implementation is also no more complicated:

# underscores after digits
octinteger: "0" ("o" | "O") "_"* octdigit (octdigit | "_")*
hexinteger: "0" ("x" | "X") "_"* hexdigit (hexdigit | "_")*
bininteger: "0" ("b" | "B") "_"* bindigit (bindigit | "_")*

# underscores between digits
octinteger: "0" ("o" | "O") octdigit (["_"] octdigit)*
hexinteger: "0" ("x" | "X") hexdigit (["_"] hexdigit)*
bininteger: "0" ("b" | "B") bindigit (["_"] bindigit)*
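
The two hexinteger productions above can be compared with a small
sketch. The regular expressions and the helper name below are
illustrative translations of the quoted grammar, not part of any patch.
Note that the "between digits" grammar as written permits at most one
underscore between two digits.

```python
import re

# "After" rule: underscores may follow the base specifier or any digit.
HEX_AFTER = re.compile(r"0[xX]_*[0-9a-fA-F][0-9a-fA-F_]*")

# "Between" rule: an optional single underscore, only between two digits.
HEX_BETWEEN = re.compile(r"0[xX][0-9a-fA-F](?:_?[0-9a-fA-F])*")


def accepted(literal):
    """Report which of the two candidate rules accepts the literal."""
    return {
        "after": bool(HEX_AFTER.fullmatch(literal)),
        "between": bool(HEX_BETWEEN.fullmatch(literal)),
    }


for lit in ("0xDEAD_BEEF", "0x_DEAD", "0xDEAD_", "0xDE__AD", "0_xDEAD"):
    print(lit, accepted(lit))
```

Only the first literal passes both rules. The trailing, doubled and
after-the-specifier underscores are exactly where the two rules differ.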


The idea that the second alternative "forc[es] our tastes on everyone" 
while the first does not is bogus. The first alternative also prohibits 
things which are a matter of taste:

# prohibited in both alternatives
0_xDEADBEEF
0._1234
1.2e_99
-_1
1j_


I think that there is broad agreement that:

- the basic idea is sound
- leading underscores followed by digits are currently legal 
  identifiers and this will not change
- underscores should not follow the sign - +
- underscores should not follow the decimal point .
- underscores should not follow the exponent e|E
- underscores will not be permitted inside the exponent (even if 
  it is harmless, it's silly to write 1.2e9_9)
- underscores should not follow the complex suffix j

and only minor disagreement about:

- whether or not underscores will be allowed after the base 
  specifier 0x 0o 0b
- whether or not underscores will be allowed before the decimal 
  point, exponent and complex suffix.

Can we have a show of hands, in favour or against the above two? And 
then perhaps Guido can rule on this one way or the other and we can get 
back to arguing about more important matters? :-)

In case it isn't obvious, I prefer to say No to allowing underscores 
after the base specifier, or before the decimal point, exponent and 
complex suffix.
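
The point about leading underscores can be checked directly: a name
such as _1 is already an identifier, so -_1 parses as the negation of a
variable rather than as a number. A minimal sketch (the helper name is
made up for illustration):

```python
import ast


def parse_negation(source):
    """Parse a unary expression and describe its operand."""
    node = ast.parse(source, mode="eval").body
    return type(node).__name__, type(node.operand).__name__, node.operand.id


# A leading underscore followed by digits is an ordinary identifier.
print("_1".isidentifier())

# "-_1" is unary minus applied to the name "_1", not a numeric literal.
print(parse_negation("-_1"))
```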


-- 
Steve


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Martin Panter
On 11 February 2016 at 11:12, Chris Angelico  wrote:
> On Thu, Feb 11, 2016 at 7:22 PM, Georg Brandl  wrote:

>> The following extensions are open for discussion:
>>
>> * Allowing underscores in string arguments to the ``Decimal`` constructor.
>>   It could be argued that these are akin to literals, since there is no
>>   Decimal literal available (yet).
>>
>> * Allowing underscores in string arguments to ``int()`` with base argument 0,
>>   ``float()`` and ``complex()``.
>
> I'm -0.5 on both of these, with the caveat that if either gets done,
> both should be. Decimal() shouldn't be different from int() just
> because there's currently no way to express a Decimal literal; if
> Python 3.7 introduces such a literal, there'd be this weird rule
> difference that has to be maintained for backward compatibility, and
> has no justification left.

I would be weakly in favour of all relevant constructors being updated
to match the new syntax. The main reason is just consistency, and that
the documentation already kind of guarantees that the literal syntax
is supported (definitely for int and float; for complex it is too
vague).

To be consistent, the following minor extensions of the syntax should
be allowed, which are not legal Python literals: int("0_001"),
int("J_00", 20), float("0_001"), complex("0_001").
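
For reference, this is how those calls behave on an interpreter that
implements the accepted form of PEP 515 (Python 3.6 and later). On older
versions the same calls raise ValueError. The helper is_valid_literal is
only an illustration.

```python
def is_valid_literal(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<literal>", "eval")
    except SyntaxError:
        return False
    return True


# String arguments to the constructors, as listed above.
constructor_results = {
    'int("0_001")': int("0_001"),
    'int("J_00", 20)': int("J_00", 20),  # J is digit 19 in base 20
    'float("0_001")': float("0_001"),
    'complex("0_001")': complex("0_001"),
}

for call, value in constructor_results.items():
    print(call, "->", value)

# The same text is not a legal literal because of the leading zero.
print("0_001 as a literal:", is_valid_literal("0_001"))
```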

Maybe also with non-ASCII digits. However I tried writing Arabic-Indic
digits (U+0660 etc) and my web browser split the number apart when I
inserted an underscore. Maybe a right-to-left thing. But it can be
written with Devanagari digits U+0966, U+0967: int("१_०००") (= 1_000). Non-ASCII
digits are apparently intentionally supported, but not documented:
.

> (As a side point, I would be fully in favour of Decimal literals. I'd
> also be in favour of something like "from __future__ import
> fraction_literals" so 1/2 would evaluate to Fraction(1,2) rather than
> 0.5. Hence I'm inclined *not* to support underscores in Decimal().)

Seems more like an argument to have the support in Decimal()
consistent with float() etc, i.e. all or nothing.


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Martin Panter
On 12 February 2016 at 00:16, Steven D'Aprano  wrote:
> On Thu, Feb 11, 2016 at 06:03:34PM +, Brett Cannon wrote:
>> On Thu, 11 Feb 2016 at 02:13 Steven D'Aprano  wrote:
>>
>> > On Wed, Feb 10, 2016 at 08:41:27PM -0800, Andrew Barnert wrote:
>> >
>> > > And honestly, are you really claiming that in your opinion, "123_456_"
>> > > is worse than all of their other examples, like "1_23__4"?
>> >
>> > Yes I am, because 123_456_ looks like you've forgotten to finish typing
>> > the last group of digits, while 1_23__4 merely looks like you have no
>> > taste.
>> >
>>
>> OK, but the keyword in your sentence is "taste".
>
> I disagree. The key *idea* in my sentence is that the trailing
> underscore looks like a programming error. In my opinion, avoiding that
> impression is important enough to make trailing underscores a syntax
> error.
>
> I've seen a few people vote +1 for things like 123_j and 1.23_e99, but I
> haven't seen anyone in favour of trailing underscores. Does anyone think
> there is a good case for allowing trailing underscores?
>
>
>> If we update PEP 8 for our
>> needs to say "Numerical literals should not have multiple underscores in a
>> row or have a trailing underscore" then this is taken care of. We get a
>> dead-simple rule for when underscores can be used, the implementation is
>> simple, and we get to have more tasteful usage in the stdlib w/o forcing
>> our tastes upon everyone or complicating the rules or implementation.
>
> I think this is a misrepresentation of the alternative. As I see it, we
> have two alternatives:
>
> - one or more underscores can appear AFTER the base specifier or any digit;
+1

> - one or more underscores can appear BETWEEN two digits.
-0

Having underscores between digits is the main usage, but I don’t see
much harm in the more liberal version, unless that makes the
specification or implementation too complex. Allowing stuff like
0x_100, 4.7_e3, and 1_j seems of slightly more benefit IMO than
disallowing 1_000_.

> To describe the second alternative as "complicating the rules" is, I
> think, grossly unfair. And if Serhiy's proposal is correct, the
> implementation is also no more complicated:
>
> # underscores after digits
> octinteger: "0" ("o" | "O") "_"* octdigit (octdigit | "_")*
> hexinteger: "0" ("x" | "X") "_"* hexdigit (hexdigit | "_")*
> bininteger: "0" ("b" | "B") "_"* bindigit (bindigit | "_")*
>
> # underscores between digits
> octinteger: "0" ("o" | "O") octdigit (["_"] octdigit)*
> hexinteger: "0" ("x" | "X") hexdigit (["_"] hexdigit)*
> bininteger: "0" ("b" | "B") bindigit (["_"] bindigit)*
>
>
> The idea that the second alternative "forc[es] our tastes on everyone"
> while the first does not is bogus. The first alternative also prohibits
> things which are a matter of taste:
>
> # prohibited in both alternatives
> 0_xDEADBEEF
> 0._1234
> 1.2e_99
> -_1

This one is already a valid variable identifier name.

> 1j_
>
>
> I think that there is broad agreement that:
>
> - the basic idea is sound
> - leading underscores followed by digits are currently legal
>   identifiers and this will not change
+1 to both
> - underscores should not follow the sign - +
> - underscores should not follow the decimal point .
> - underscores should not follow the exponent e|E
No strong opinion on these from me
> - underscores will not be permitted inside the exponent (even if
>   it is harmless, it's silly to write 1.2e9_9)
-0, it seems like a needless inconsistency, unless it somehow hurts
the implementation
> - underscores should not follow the complex suffix j
No opinion

> and only minor disagreement about:
>
> - whether or not underscores will be allowed after the base
>   specifier 0x 0o 0b
+0

> - whether or not underscores will be allowed before the decimal
>   point, exponent and complex suffix.
No opinion about directly before decimal point; +0 before exponent or
imaginary (complex) suffix.

> Can we have a show of hands, in favour or against the above two? And
> then perhaps Guido can rule on this one way or the other and we can get
> back to arguing about more important matters? :-)
>
> In case it isn't obvious, I prefer to say No to allowing underscores
> after the base specifier, or before the decimal point, exponent and
> complex suffix.


Re: [Python-Dev] fullOfEels, assistant program for writing Python extension modules in C

2016-02-11 Thread Hugh Fisher
On Fri, Feb 12, 2016 at 3:30 AM, Nathaniel Smith  wrote:
> You're almost certainly aware of this, but just to double check since you
> don't mention it in the email: cython is also a great tool for handling
> similar situations. Not quite the same since in addition to generating all
> the boilerplate for you it then lets you use almost-python to actually write
> the C implementations as well, and I understand that with your tool you
> write the actual implementations in C. But probably also worth considering
> in cases where you'd consider this tool, so wanted to make sure it was on
> your radar.

Yes, cython is a fine tool and I wouldn't try to dissuade anyone from
using it if it works for them.

FullOfEels is for when the implementation should be hidden altogether.
Most often this is because of cross-platform differences or coding
horrors, but could also be handy for teaching when it's easier to just
give students plain Python modules to look at.

Thanks for replying.

-- 

cheers,
Hugh Fisher


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Steven D'Aprano
On Thu, Feb 11, 2016 at 08:50:09PM +0200, Serhiy Storchaka wrote:

> I have strong preference for more strict and simpler rule, used by most 
> other languages -- "only between two digits". Main arguments:
> 
> 1. Simple rule is easier to understand, remember and recognize. I care 
> not about the complexity of the implementation (there is no large 
> difference), but about cognitive complexity.
> 
> 2. Most languages use this rule. It is better to follow an informal
> standard than to invent a rule that differs from the rules in every other
> language. This will help programmers who use multiple languages.
> 
> I have provided an alternative patch and can provide an alternative PEP 
> if it is needed.

I don't think an alternative PEP is needed, but I hope that your 
alternative gets a fair treatment in the PEP.


> >The production list for integer literals would therefore look like this::
> >
> >integer: decimalinteger | octinteger | hexinteger | bininteger
> >decimalinteger: nonzerodigit (digit | "_")* | "0" ("0" | "_")*
> >nonzerodigit: "1"..."9"
> >digit: "0"..."9"
> >octinteger: "0" ("o" | "O") "_"* octdigit (octdigit | "_")*
> 
>  octinteger: "0" ("o" | "O") octdigit (["_"] octdigit)*
> 
> >hexinteger: "0" ("x" | "X") "_"* hexdigit (hexdigit | "_")*
> 
>  hexinteger: "0" ("x" | "X") hexdigit (["_"] hexdigit)*
> 
> >bininteger: "0" ("b" | "B") "_"* bindigit (bindigit | "_")*
> 
>  bininteger: "0" ("b" | "B") bindigit (["_"] bindigit)*

To me, Serhiy's versions (starting with single > symbols) are not only 
simpler to learn, but have a simpler (or at least shorter) 
implementation too.


[...] 
> >**Group 3: only between digits, only one underscore**
> >
> >* Ada [8]_
> >* Julia (but not in the exponent part of floats) [9]_
> >* Ruby (docs say "anywhere", in reality only between digits) [10]_
> 
> This classification is misleading. The difference between groups 2 and 3
> is less than that between different languages in group 1. To be fair, groups
> 2 and 3 should be united in one group. C++ should be included in this 
> group. Perl 5 and Swift should be either included in both groups or 
> excluded from any group, because they have inconsistencies between the 
> documentation and the implementation or between different parts of the 
> documentation.
> 
> With correct classification it is obvious what variant is the most popular.

It is not obvious to me what you think the correct classification is.

If you disagree with Georg's classification, would you reclassify the 
languages, and if there is agreement that you are correct, he can update 
the PEP?




-- 
Steve


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Andrew Barnert via Python-Dev
On Thursday, February 11, 2016 10:35 AM, Jeff Hardy  wrote:

>On Thu, Feb 11, 2016 at 10:15 AM, Andrew Barnert via Python-Dev 
> wrote:
>
>>That's a good point: we need style rules for PEP 8.


...
>>It might be simpler to write a "whitelist" than a "blacklist" of all the ugly 
>>things people might come up with, and then just give a bunch of examples 
>>instead of a bunch of rules. Something like this:
>>
>>While underscores can legally appear anywhere in the digit string, you should 
>>never use them for purposes other than visually separating meaningful digit 
>>groups like thousands, bytes, and the like.
>>
>>123456_789012: ok (millions are groups, but thousands are more common, 
>> and 6-digit groups are readable, but on the edge)
>>123_456_789_012: better
>>123_456_789_012_: bad (trailing)
>>1_2_3_4_5_6: bad (too many)
>>1234_5678: ok if code is intended to deal with east-Asian numerals (where 
>> 10000 is a standard grouping), bad otherwise
>>3__141_592_654: ok if this represents a fixed-point fraction (obviously 
>> bad otherwise)
>>123.456_789e123: good
>>123.456_789e1_23: bad (never useful in exponent)
>>0x1234_5678: good
>>0o123_456: good
>>0x123_456_789: bad (3 hex digits is usually not a meaningful group)
>

>I imagine that for whatever "bad" grouping you can suggest, someone, 
>somewhere, has a legitimate reason to use it. 

That's exactly why we should just have bad examples in the style guide, rather 
than coming up with style rules that try to strongly discourage them (or making 
them syntax errors).
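
To make the "good" groupings above concrete: the underscores never
change the value, and on Python 3.6 and later the "_" format option
produces the thousands and four-hex-digit groupings automatically. That
format option comes from the accepted PEP, not from this discussion, so
treat the sketch as illustrative.

```python
# Grouping is purely visual: both spellings denote the same integer.
plain = 123456789012
grouped = 123_456_789_012
print(plain == grouped)

# The "_" format option groups decimal digits by thousands ...
decimal_text = format(plain, "_")
print(decimal_text)

# ... and hexadecimal digits in groups of four.
hex_text = format(0x12345678, "_x")
print(hex_text)
```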

>Any rule more complex than "Use underscores in numeric literals only when they 
>improve clarity" is unnecessarily prescriptive.

Your rule doesn't need to be stated at all. It's already a given that you 
shouldn't add semantically-meaningless characters anywhere unless they improve 
clarity.

I don't think saying that they're for "visually separating meaningful digit 
groups like thousands, bytes, and the like" is unnecessarily prescriptive. If 
someone comes up with a legitimate use for something we've never anticipated, 
it will almost certainly just be a way of grouping digits that's meaningful in 
a way we didn't anticipate. And, if not, it's just a style guideline, so it 
doesn't have to apply 100% of the time. If someone really comes up with 
something that has nothing to do with grouping digits, all the style guideline 
will do is make them stop and think about whether it really is a good use of 
underscores--and, if it is, they'll go ahead and do it.


Re: [Python-Dev] Windows: Remove support of bytes filenames in the os module?

2016-02-11 Thread Stephen J. Turnbull
Executive summary:

My experience is that having bytes APIs in the os module is very
useful.  But perhaps higher-level functions like os.scandir can do
without (I present no arguments either way on that, just acknowledge
it).

Andrew Barnert writes:

 > Anyway, Windows CDs can't cause this problem.

My bad.  I meant archival Mac CDs (or perhaps they were taken from a
network filesystem) which is where I see MacRoman, and Windows (ie,
FAT-formatted) USB drives, which is where I see Shift JIS.  The point
here is not what is technically possible or even standard, it's that
though what I see in practice may not *require* bytes APIs, it's *very
convenient* to have them (especially interactively).

 > The same thing is true with NTFS external drives, VFAT USB drives,
 > etc. Generally, it's usually not Windows media on *nix systems that
 > break Python 2 unicode; it's native *nix filesystems where users
 > mix locales.

IMHO, Python 2 unicode is not breakable, let alone broken. ;-)  Mailman
2 has managed to almost get to a state where you can't get it to raise
a Unicode exception (except where deliberately used as EAFP), let
alone one that is not handled (before the catch-all "except Exception"
that keeps the daemon running).  And that's in an application whose
original encoding support assumed standard conformance by design in a
realm where spammers and junior high school hackers regularly violate
the most ancient of RFCs (the restriction to ASCII in headers goes
back to a 6xx RFC at the latest!)  Python 2 Unicode turns out to have
been an excellent compromise between the needs of backward
compatibility with uniformly encoded bytestreams for Europe, and the
forward-looking needs of a globalizing Internet.  (But you knew that! 
:-)  As I wrote earlier, the world is broken, or at least Japan.  The
world "got bettah", thus Python 3.  And most of the time Python 3 is
wonderful in Japan (specifically, it's trivial to get recalcitrant
students to use best I18N practice).

My point is that *where I live* the experience is very different.
There are *no* Japanese who use *nix (other than Mac OS X) for
paperwork in my neighborhood.  Shift JIS filenames *are* from Windows
media recently written, though probably not by Microsoft-provided
software.  Bytes APIs are a very useful tool in dealing with these
issues, at least in the hands of someone who has become expert in
dealing with them.
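
As a rough illustration of why the bytes APIs are convenient here: a
directory can be listed as raw bytes and each name decoded by trial.
The helper names and the candidate encoding order below are assumptions
made for this sketch, not a recommendation from the thread.

```python
import os


def decode_name(raw, candidates=("utf-8", "shift_jis", "mac_roman")):
    """Try each candidate encoding in order; return (text, encoding used)."""
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Nothing fit: keep the undecodable bytes visible as escapes.
    return raw.decode("ascii", "backslashreplace"), None


def list_names(directory):
    """List a directory through the bytes API and decode each name."""
    raw_names = os.listdir(os.fsencode(directory))  # bytes in, bytes out
    return [decode_name(raw) for raw in raw_names]


# A Shift JIS file name is not valid UTF-8, so the second candidate wins.
sample = "日本語.txt".encode("shift_jis")
print(decode_name(sample))
```

The order of the candidates matters, since a permissive single-byte
encoding placed first would accept almost anything.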

I suspect the same is true of China, except that like their business
partner Apple they are in a position to legislate uniformity, and do.
(Unfortunately that's GB18030, not Unicode.)  So maybe they're better
off than a place that coined the phrase "politics that can't decide".

I admit I've not yet used os.scandir, let alone its bytes API.  Perhaps
we can, and perhaps we should, restrict the bytes API in the os module
to a few basic functions, and require that the environment be sane for
cases where we want to use higher-level or optimized functions.

 > > You contradict yourself! ;-)
 > 
 > I'm perfectly happy to have been wrong earlier. And if catching
 > myself before someone else did makes me a flip-flopper, well, I'm
 > not running for president. :P

I consider that the most important qualification for President,
especially if your name is Trump or Sanders.  That's one of the things
I respect most about Python: with a few (negligible) exceptions, minds
change to fit the facts.

And, BTW, EAFP applies here, too.  Make mistakes on the mailing lists
before you commit them to code.  Please!



[Python-Dev] Duelling PEPs not needed [was: PEP 515: Underscores in Numeric Literals]

2016-02-11 Thread Stephen J. Turnbull
Serhiy Storchaka writes:

 > I suspect that my arguments can be lost [without a competing PEP].

Send Georg a patch for his PEP, that's where they belong, since only
one of the two PEPs could be approved, and they would be 95% the same
otherwise.  If he doesn't apply it (he's allowed to move it to the
"rejected arguments" section, though), or the decision silently goes
against you, speak up then -- that would be a problem IMO.

Or you could offer to BD1P!  (If you're selected, I hope you change
your mind! :-)





Re: [Python-Dev] Time for a change of random number generator?

2016-02-11 Thread Stephen J. Turnbull
Steven D'Aprano writes:

 > Peters has an opinion?) but if we do change, I'd like to see the 
 > existing random.Random moved to random.MT_Random for backwards 
 > compatibility and compatibility with other software which uses MT. Not 
 > necessarily saying that we have to keep it around forever (after all, we 
 > did dump the Wichmann-Hill PRNG some time ago) but we ought to keep it 
 > for at least a couple of releases.

I think we should keep it around forever.  Even my slowest colleagues
are learning that they should record their seeds and PRNG algorithms
for reproducibility's sake. :-)  For that matter, restore Wichmann-Hill.
Both should be clearly marked as "use only for reproducing previous
bitstreams" (eg, in a package random.deprecated_generators).
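
A minimal sketch of the practice described above: record the seed
together with the generator used, then rebuild the same stream later.
The record layout and function names are hypothetical, and
random.deprecated_generators is only a suggestion in this thread, so it
is not used here.

```python
import random


def make_run_record(seed):
    """Describe a run well enough to reproduce its random stream."""
    return {"algorithm": "random.Random (Mersenne Twister)", "seed": seed}


def replay(record, count):
    """Recreate the generator from the record and draw `count` values."""
    rng = random.Random(record["seed"])
    return [rng.random() for _ in range(count)]


record = make_run_record(20160211)
first = replay(record, 3)
second = replay(record, 3)
print(first == second)  # the same seed and algorithm give the same stream
```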


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Glenn Linderman

On 2/11/2016 4:16 PM, Steven D'Aprano wrote:
> On Thu, Feb 11, 2016 at 06:03:34PM +, Brett Cannon wrote:
>> On Thu, 11 Feb 2016 at 02:13 Steven D'Aprano  wrote:
>>
>> > On Wed, Feb 10, 2016 at 08:41:27PM -0800, Andrew Barnert wrote:
>> >
>> > > And honestly, are you really claiming that in your opinion, "123_456_"
>> > > is worse than all of their other examples, like "1_23__4"?
>> >
>> > Yes I am, because 123_456_ looks like you've forgotten to finish typing
>> > the last group of digits, while 1_23__4 merely looks like you have no
>> > taste.
>> >
>>
>> OK, but the keyword in your sentence is "taste".
>
> I disagree. The key *idea* in my sentence is that the trailing
> underscore looks like a programming error. In my opinion, avoiding that
> impression is important enough to make trailing underscores a syntax
> error.
>
> I've seen a few people vote +1 for things like 123_j and 1.23_e99, but I
> haven't seen anyone in favour of trailing underscores. Does anyone think
> there is a good case for allowing trailing underscores?
>
>
>> If we update PEP 8 for our
>> needs to say "Numerical literals should not have multiple underscores in a
>> row or have a trailing underscore" then this is taken care of. We get a
>> dead-simple rule for when underscores can be used, the implementation is
>> simple, and we get to have more tasteful usage in the stdlib w/o forcing
>> our tastes upon everyone or complicating the rules or implementation.
>
> I think this is a misrepresentation of the alternative. As I see it, we
> have two alternatives:
>
> - one or more underscores can appear AFTER the base specifier or any digit;
> - one or more underscores can appear BETWEEN two digits.
>
> To describe the second alternative as "complicating the rules" is, I
> think, grossly unfair. And if Serhiy's proposal is correct, the
> implementation is also no more complicated:
>
> # underscores after digits
> octinteger: "0" ("o" | "O") "_"* octdigit (octdigit | "_")*
> hexinteger: "0" ("x" | "X") "_"* hexdigit (hexdigit | "_")*
> bininteger: "0" ("b" | "B") "_"* bindigit (bindigit | "_")*


# underscores after digits
octinteger: "0" ("o" | "O") (octdigit | "_")*
hexinteger: "0" ("x" | "X") (hexdigit | "_")*
bininteger: "0" ("b" | "B") (bindigit | "_")*


An extra side effect is that there are more ways to write zero: 0x, 0b,
0o, 0X, 0B, 0O, 0x_, 0b_, 0o_, etc.
Most people write 0 anyway, so those forms would be bad style, but
allowing them makes the implementation simpler.
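
The side effect can be seen by translating the simplified hexinteger
production into a regular expression. This is an illustrative sketch
only. The names are made up, and it checks acceptance, not the value a
digitless form would get.

```python
import re

# Simplified rule: base specifier followed by any mix of digits and "_".
SIMPLIFIED_HEX = re.compile(r"0[xX][0-9a-fA-F_]*")


def accepted_by_simplified_rule(literal):
    """Return True if the simplified production matches the whole literal."""
    return bool(SIMPLIFIED_HEX.fullmatch(literal))


for lit in ("0x", "0x_", "0x_FF", "0xFF_", "0_xFF"):
    print(lit, accepted_by_simplified_rule(lit))
```

The digitless forms 0x and 0x_ are accepted, which is where the extra
spellings of zero come from.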




> # underscores between digits
> octinteger: "0" ("o" | "O") octdigit (["_"] octdigit)*
> hexinteger: "0" ("x" | "X") hexdigit (["_"] hexdigit)*
> bininteger: "0" ("b" | "B") bindigit (["_"] bindigit)*
>
>
> The idea that the second alternative "forc[es] our tastes on everyone"
> while the first does not is bogus. The first alternative also prohibits
> things which are a matter of taste:
>
> # prohibited in both alternatives
> 0_xDEADBEEF
> 0._1234
> 1.2e_99
> -_1
> 1j_
>
>
> I think that there is broad agreement that:
>
> - the basic idea is sound
> - leading underscores followed by digits are currently legal
>   identifiers and this will not change
> - underscores should not follow the sign - +
> - underscores should not follow the decimal point .
> - underscores should not follow the exponent e|E
> - underscores will not be permitted inside the exponent (even if
>   it is harmless, it's silly to write 1.2e9_9)
> - underscores should not follow the complex suffix j
>
> and only minor disagreement about:
>
> - whether or not underscores will be allowed after the base
>   specifier 0x 0o 0b


+1 to allow underscores after the base specifier.


> - whether or not underscores will be allowed before the decimal
>   point, exponent and complex suffix.


+1 to allow them. There may be cases where they are useful, and if it is 
not useful, it would not be used.  I really liked someone's style guide 
proposal: use of underscore within numeric constants should only be done 
to aid readability.  However, pre-judging what aids readability to one 
person's particular taste is inappropriate.



> Can we have a show of hands, in favour or against the above two? And
> then perhaps Guido can rule on this one way or the other and we can get
> back to arguing about more important matters? :-)
>
> In case it isn't obvious, I prefer to say No to allowing underscores
> after the base specifier, or before the decimal point, exponent and
> complex suffix.

I think it was obvious :)  And I think we disagree. And yes, there are 
more important matters. But it was just a couple days ago when I wrote a 
big constant in some new code that I was thinking how nice it would be 
if I could put a delimiter in there... so I'll be glad for the feature 
when it is available.


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread David Mertz
Great PEP overall. We definitely don't want the restriction to grouping
numbers only in threes. South Asian crore use grouping in twos.

https://en.m.wikipedia.org/wiki/Crore
On Feb 11, 2016 7:04 PM, "Glenn Linderman"  wrote:

> [...]

Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Glenn Linderman

On 2/11/2016 7:56 PM, David Mertz wrote:

> Great PEP overall. We definitely don't want the restriction to
> grouping numbers only in threes. South Asian crore use grouping in twos.
>
> https://en.m.wikipedia.org/wiki/Crore

Interesting... 3 digits in the least significant group, and _then_ by
twos. Wouldn't have predicted that one! Never bumped into that notation
before!
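
For illustration only, a small sketch of that grouping written as a literal. It assumes an interpreter that implements the PEP (Python 3.6 or later); the variable names are made up for the example.

```python
# Assumes Python 3.6+, where underscores in numeric literals are accepted.
one_crore = 1_00_00_000      # Indian-style grouping: three digits, then twos
ten_million = 10_000_000     # the same value grouped by thousands

print(one_crore == ten_million)
```

Both spellings denote the same integer; only the visual grouping differs.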
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Time for a change of random number generator?

2016-02-11 Thread Andrew Barnert via Python-Dev
On Thursday, February 11, 2016 7:20 PM, Stephen J. Turnbull 
 wrote:



> I think we should keep it around forever.  Even my slowest colleagues
> are learning that they should record their seeds and PRNG algorithms
> for reproducibility's sake. :-)

+1

> For that matter, restore Wichmann-Hill.

So you can write code that works on 2.3 and 3.6, but not 3.5?

I agree that it shouldn't have gone away, but I think it may be too late for 
adding it back to help too much.

> Both should be clearly marked as "use only for reproducing previous
> bitstreams" (eg, in a package random.deprecated_generators).


I like the random.deprecated_generators idea.


Re: [Python-Dev] Time for a change of random number generator?

2016-02-11 Thread Tim Peters
[Greg Ewing ]
> The Mersenne Twister is no longer regarded as quite state-of-the art
> because it can get into states that produce long sequences that are
> not very random.
>
> There is a variation on MT called WELL that has better properties
> in this regard. Does anyone think it would be a good idea to replace
> MT with WELL as Python's default rng?

I don't think so, because I've seen no groundswell of discontent about
the Twister among Python users.  Perhaps I'm missing some?  Changes
are disruptive and people argue about RNGs with religious zeal, so I
favor making a change in this area only when it's compelling.  It was
compelling to move away from Wichmann-Hill when the Twister was
introduced:  WH was waay behind the state of the art at the time,
its limitations were causing real problems, and there was
near-universal adoption of the Twister around the world.  The Twister
was a game changer.

When the time comes for a change, I'd be more inclined to (as Robert
Kern already said) look at PCG and Random123.  Like the Twister, WELL
requires massive internal state, and fails the same kinds of
randomness tests (while the suggested alternatives fail none to
date).  WELL does escape "zeroland" faster, but still much slower than
PCG or Random123 (which appear to have no systematic attractors).  The
alternatives require much smaller state, and at least PCG much simpler
code.

Note that the seeding function used by Python doesn't take the
user-supplied seed as-is (only __setstate__ does):  it runs rounds of
pseudo-random bit dispersion, to make it highly unlikely that an
initial state with lots of zeroes is produced.  While the Twister
escapes zeroland very slowly, the flip side is that it also
transitions _to_ zeroland very slowly.  It's quite possible that
nobody has ever fallen into such a state (short of contriving to via
__setstate__).  Falling into zeroland was a very real problem in the
Twister's very early days, which is why its authors added the
bit-dispersal code to the seeding function.  Python was wise to wait
until they did.

It's prudent to wait for someone else to find the early surprises in
PCG and Random123 too ;-)
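
A minimal sketch of the seed-versus-state distinction described above, using only the standard `random` module. The seed value and variable names are arbitrary choices for the example.

```python
import random

rng = random.Random(12345)      # seed() disperses the bits of the user-supplied seed
saved = rng.getstate()          # snapshot of the full internal state
first = [rng.random() for _ in range(3)]

rng.setstate(saved)             # restores the state exactly as given, no dispersion
second = [rng.random() for _ in range(3)]

print(first == second)
print(len(saved[1]))            # the Twister's state: 624 words plus a position index
```

Restoring a saved state replays the same outputs, and the size of that state tuple gives a feel for what "massive internal state" means here.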


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Andrew Barnert via Python-Dev
On Thursday, February 11, 2016 8:10 PM, Glenn Linderman  
wrote:

> On 2/11/2016 7:56 PM, David Mertz wrote:
>
>> Great PEP overall. We definitely don't want the restriction to grouping
>> numbers only in threes. South Asian crore use grouping in twos.
>> https://en.m.wikipedia.org/wiki/Crore
>
> Interesting... 3 digits in the least significant group, and _then_
> by twos. Wouldn't have predicted that one! Never bumped into that
> notation before!


The first time I used underscore separators in any language, it was a test 
script for a server that wanted social security numbers as integers instead of 
strings, like 123_45_6789.[^1] 

Which is why I suggested the style guideline should just say "meaningful 
grouping of digits", rather than try to predict what counts as "meaningful" for 
every program.


[^1] Of course in Python, it's usually trivial to stick a shim in between the 
database and the model thingy so I could just pass in "123-45-6789", so I don't 
expect to ever need this specific example.


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Glenn Linderman

On 2/11/2016 8:22 PM, Andrew Barnert wrote:

> On Thursday, February 11, 2016 8:10 PM, Glenn Linderman wrote:
>
>> On 2/11/2016 7:56 PM, David Mertz wrote:
>>
>>> Great PEP overall. We definitely don't want the restriction to grouping
>>> numbers only in threes. South Asian crore use grouping in twos.
>>>
>>> https://en.m.wikipedia.org/wiki/Crore
>>
>> Interesting... 3 digits in the least significant group, and _then_
>> by twos. Wouldn't have predicted that one! Never bumped into that
>> notation before!
>
> The first time I used underscore separators in any language, it was a test
> script for a server that wanted social security numbers as integers instead
> of strings, like 123_45_6789.[^1]
>
> Which is why I suggested the style guideline should just say "meaningful
> grouping of digits", rather than try to predict what counts as "meaningful"
> for every program.
>
> [^1] Of course in Python, it's usually trivial to stick a shim in between
> the database and the model thingy so I could just pass in "123-45-6789", so
> I don't expect to ever need this specific example.



Yes, I had thought of the Social Security Number possibility also, 
although having them as constants in a program seems a bit unusual. Test 
script, fake numbers, yeah, I guess so.


Re: [Python-Dev] Time for a change of random number generator?

2016-02-11 Thread Chris Angelico
On Fri, Feb 12, 2016 at 3:12 PM, Andrew Barnert via Python-Dev
 wrote:
> On Thursday, February 11, 2016 7:20 PM, Stephen J. Turnbull 
>  wrote:
>
>
>
>> I think we should keep it around forever.  Even my slowest colleagues
>> are learning that they should record their seeds and PRNG algorithms
>> for reproducibility's sake. :-)
>
> +1
>
>> For that matter, restore Wichmann-Hill.
>
> So you can write code that works on 2.3 and 3.6, but not 3.5?
>
> I agree that it shouldn't have gone away, but I think it may be too late for 
> adding it back to help too much.

You're probably right, but the point isn't to make the same code run,
necessarily. It's to make things verifiable. Suppose I do some
scientific research that involves a pseudo-random number component,
and I publish my results ("Monte Carlo analysis produced these
results, blah blah, using this seed, etc, etc"). If you want to come
back later and say "I think there was a bug in your code", you need to
be able to generate the exact same PRNG sequence. I published my
algorithm and my seed, so you should in theory be able to recreate
that sequence; but if you have to reimplement the same algorithm,
that's a lot of unnecessary work that could have been replaced with
"from random.deprecated_generators import WichmannHill as Random".
(Plus there's the whole question of "was your reimplemented PRNG
buggy" - or, for that matter, "was the original PRNG buggy". Using the
exact same code eliminates even that.)

So I'm +1 on keeping Mersenne Twister even after it's been replaced as
the default PRNG, -0 on reinstating something that hasn't been used in
well over a decade, and -1 on replacing MT today - I'm not seeing
strong arguments in favour of changing.
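
To make the reproducibility point concrete, here is a toy sketch (not anyone's actual research code): a Monte Carlo estimate driven by an explicitly seeded generator. The function name `estimate_pi` and the seed are invented for the example.

```python
import random

def estimate_pi(seed, samples=10000):
    """Toy Monte Carlo run driven by an explicitly seeded generator."""
    rng = random.Random(seed)          # private generator, not the global one
    inside = sum(1 for _ in range(samples)
                 if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return 4.0 * inside / samples

# Same seed and same algorithm give the same result, so a reader can re-run it.
print(estimate_pi(2016) == estimate_pi(2016))
```

That equality only holds as long as the generator behind `random.Random` stays available, which is the argument for keeping old generators around.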

ChrisA


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Andrew Barnert via Python-Dev
On Feb 11, 2016, at 00:22, Georg Brandl  wrote:
> 
> Allowing underscores in string arguments to the ``Decimal`` constructor.  It
>  could be argued that these are akin to literals, since there is no Decimal
>  literal available (yet).

I'm +1 on this. Partly for consistency (see below)--but also, one of the use 
cases for Decimal is when you need more precision than float, meaning you'll 
often have even more digits to separate.

> * Allowing underscores in string arguments to ``int()`` with base argument 0,
>  ``float()`` and ``complex()``.

+1, because these are actually defined in terms of literals. For example, under 
int, "Base 0 means to interpret exactly as a code literal". This isn't actually 
quite true, because "-2" is not an integer literal but is accepted here--but 
see float for an example that *is* rigorously defined, and still defers to 
literal syntax and semantics.
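
A short sketch of what these two open questions would look like in practice. It assumes an interpreter in which both were resolved in favour of allowing underscores, which is the case in Python 3.6 and later.

```python
# Assumes Python 3.6+, where these constructors accept underscores.
from decimal import Decimal

print(int("0xdead_beef", 0) == 0xDEAD_BEEF)      # base 0: the prefix decides the base
print(int("-2", 0))                              # a sign is accepted, though "-2" is not a literal
print(float("10_000_000.0") == 10_000_000.0)
print(Decimal("1_000.50") == Decimal("1000.50"))
```

In each case the string is read the way the corresponding literal would be, which is the consistency argument made above.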


Re: [Python-Dev] fullOfEels, assistant program for writing Python extension modules in C

2016-02-11 Thread Nathaniel Smith
You're almost certainly aware of this, but just to double check since you
don't mention it in the email: cython is also a great tool for handling
similar situations. Not quite the same since in addition to generating all
the boilerplate for you it then lets you use almost-python to actually
write the C implementations as well, and I understand that with your tool
you write the actual implementations in C. But probably also worth
considering in cases where you'd consider this tool, so wanted to make sure
it was on your radar.
On Feb 11, 2016 8:21 AM, "Hugh Fisher"  wrote:

> I've written a Python program named fullOfEels to speed up the first
> stages of writing Python extension modules in C.
>
> It is not a replacement for SWIG, SIP, or ctypes. It's for the case
> where you want to work in the opposite direction, specifying a Python
> API and then writing an implementation in C. (A small niche maybe, but
> I hope it isn't just me who sometimes works this way.)
>
> The input is a Python module specifying what it should do but not how,
> with all the functions, classes, and methods being just pass. The
> output is a pair of .h and .c files with all the boilerplate C code
> required: module initialization, class type structs, C method
> functions and method tables.
>
> Downloadable from
> https://bitbucket.org/hugh_fisher/fullofeels
>
> All feedback and suggestions welcome.
>
> --
>
> cheers,
> Hugh Fisher


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Barry Warsaw
On Feb 11, 2016, at 05:57 PM, Georg Brandl wrote:

>D'oh :)  I added (hopefully) clarifying wording.

I saw the diff - perfect!  Thanks.

-Barry


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Andrew Barnert via Python-Dev
On Feb 11, 2016, at 02:13, Steven D'Aprano  wrote:
> 
>> On Wed, Feb 10, 2016 at 08:41:27PM -0800, Andrew Barnert wrote:

>> They're both presented as something the syntax allows, and neither one 
>> looks like something I'd ever want to write, much less promote in a 
>> style guide or something, but neither one screams out as something 
>> that's so heinous we need to complicate the language to ensure it 
>> raises a SyntaxError. Yes, that's my opinion, but do you really have a 
>> different opinion about any part of that?
> 
> I don't think the rule "underscores must occur between digits" is 
> complicating the specification.

That rule isn't in the specification in the PEP, except as one of the 
alternatives rejected for being "too restrictive". It's also not the rule you 
were suggesting in your previous email, where you insisted that you 
wanted something "more liberal". I also don't understand why you're presenting 
this whole thing as an argument against my response, which was suggesting that 
whatever rule we choose should be simpler than what's in the PEP, when that's 
also (apparently, now) your position.

> It is *less* complicated to explain this 
> rule than to give a whole lot of special cases

Sure. Your rule is about as complicated as the Swift rule, and both are much 
less complicated than the PEP. I'm fine with either one, because, as I said, 
the edge cases don't matter to me nearly as much as having a rule that's easy 
to keep in my head and easy to lex. The only reason I specifically proposed the 
Swift rule instead of one of the other simple rules is that it seemed the most 
"liberal", which the PEP was in favor of, and it has precedent in more 
other languages. But, in favor of your version, almost every language uses some 
variation of "you can put underscores between digits" as the "tutorial-level" 
explanation and rationale.
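
A rough sketch of how small the two candidate rules are when written as lexer patterns. It covers integers only, follows the integer productions from the draft PEP for the liberal rule, and reads "between digits" as "a single underscore between two digits"; the names `DRAFT` and `STRICT` are invented for the example.

```python
import re

# Draft PEP rule for integers: underscores may follow digits and base specifiers.
DRAFT = re.compile(
    r"(?:[1-9][0-9_]*"                      # decimalinteger
    r"|0[0_]*"                              # zero
    r"|0[oO]_*[0-7][0-7_]*"                 # octinteger
    r"|0[xX]_*[0-9a-fA-F][0-9a-fA-F_]*"     # hexinteger
    r"|0[bB]_*[01][01_]*)"                  # bininteger
)

# Stricter alternative: a single underscore, only between two digits.
STRICT = re.compile(
    r"(?:[1-9](?:_?[0-9])*"
    r"|0(?:_?0)*"
    r"|0[oO][0-7](?:_?[0-7])*"
    r"|0[xX][0-9a-fA-F](?:_?[0-9a-fA-F])*"
    r"|0[bB][01](?:_?[01])*)"
)

for text in ["123_456", "123_456_", "1_23__4", "0x_DEAD_BEEF", "_1", "0_xDEADBEEF"]:
    draft_ok = bool(DRAFT.fullmatch(text))
    strict_ok = bool(STRICT.fullmatch(text))
    print(f"{text:14} draft={draft_ok} strict={strict_ok}")
```

Either pattern fits in a few lines, which is the point: the disagreement is about edge cases like `123_456_`, not about lexing difficulty.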




Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Georg Brandl
On 02/11/2016 05:52 PM, Steve Dower wrote:
> On 11Feb2016 0651, Barry Warsaw wrote:
>> On Feb 11, 2016, at 09:22 AM, Georg Brandl wrote:
>>
>>> based on the feedback so far, I revised the PEP.  There is now
>>> a much simpler rule for allowed underscores, with no exceptions.
>>> This made the grammar simpler as well.
>>
>> I'd be +1, but there's something missing from the PEP: what the underscores
>> *mean*.  You describe the syntax nicely, but not the semantics.
>>
>>  From reading the examples, I'd guess that the underscores are semantically
>> transparent, meaning that the resulting value is the same if you just removed
>> the underscores and interpreted the resulting literal.
>>
>> Right or wrong, could you please add a paragraph explaining the meaning of
>> the underscores?
> 
> Glad I kept reading the thread this far - just pretend I also wrote 
> exactly the same thing as Barry.

D'oh :)  I added (hopefully) clarifying wording.
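
A minimal sketch of the reading Barry describes, assuming an interpreter that implements the PEP (Python 3.6 or later): the literal has the same value as it would with the underscores removed.

```python
# Assumes Python 3.6+; the underscores carry no meaning of their own.
amount = 10_000_000.0
addr = 0xDEAD_BEEF

print(amount == 10000000.0)
print(addr == 0xDEADBEEF)
# Stripping the underscores from the source text gives the same value.
print(float("10_000_000.0".replace("_", "")) == amount)
```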

Thanks,
Georg




Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Steve Dower

On 11Feb2016 0651, Barry Warsaw wrote:
> On Feb 11, 2016, at 09:22 AM, Georg Brandl wrote:
>
>> based on the feedback so far, I revised the PEP.  There is now
>> a much simpler rule for allowed underscores, with no exceptions.
>> This made the grammar simpler as well.
>
> I'd be +1, but there's something missing from the PEP: what the underscores
> *mean*.  You describe the syntax nicely, but not the semantics.
>
> From reading the examples, I'd guess that the underscores are semantically
> transparent, meaning that the resulting value is the same if you just removed
> the underscores and interpreted the resulting literal.
>
> Right or wrong, could you please add a paragraph explaining the meaning of the
> underscores?

Glad I kept reading the thread this far - just pretend I also wrote
exactly the same thing as Barry.


Cheers,
Steve



Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Brett Cannon
On Thu, 11 Feb 2016 at 02:13 Steven D'Aprano  wrote:

> On Wed, Feb 10, 2016 at 08:41:27PM -0800, Andrew Barnert wrote:
>
> > And honestly, are you really claiming that in your opinion, "123_456_"
> > is worse than all of their other examples, like "1_23__4"?
>
> Yes I am, because 123_456_ looks like you've forgotten to finish typing
> the last group of digits, while 1_23__4 merely looks like you have no
> taste.
>

OK, but the keyword in your sentence is "taste". If we update PEP 8 for our
needs to say "Numerical literals should not have multiple underscores in a
row or have a trailing underscore" then this is taken care of. We get a
dead-simple rule for when underscores can be used, the implementation is
simple, and we get to have more tasteful usage in the stdlib w/o forcing
our tastes upon everyone or complicating the rules or implementation.


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Brett Cannon
On Thu, 11 Feb 2016 at 00:23 Georg Brandl  wrote:

> Hey all,
>
> based on the feedback so far, I revised the PEP.  There is now
> a much simpler rule for allowed underscores, with no exceptions.
> This made the grammar simpler as well.
>
> ---
>
> PEP: 515
> Title: Underscores in Numeric Literals
> Version: $Revision$
> Last-Modified: $Date$
> Author: Georg Brandl
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 10-Feb-2016
> Python-Version: 3.6
>
> Abstract and Rationale
> ==
>
> This PEP proposes to extend Python's syntax so that underscores can be used
> in integral, floating-point and complex number literals.
>
> This is a common feature of other modern languages, and can aid readability
> of long literals, or literals whose value should clearly separate into
> parts, such as bytes or words in hexadecimal notation.
>
> Examples::
>
> # grouping decimal numbers by thousands
> amount = 10_000_000.0
>
> # grouping hexadecimal addresses by words
> addr = 0xDEAD_BEEF
>
> # grouping bits into bytes in a binary literal
> flags = 0b_0011__0100_1110
>
> # making the literal suffix stand out more
> imag = 1.247812376e-15_j
>
>
> Specification
> =
>
> The current proposal is to allow one or more consecutive underscores
> following digits and base specifiers in numeric literals.
>

+1 from me. Nice and simple! And we can always update PEP 8 to disallow any
usage that we deem ugly.


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Andrew Barnert via Python-Dev
On Feb 11, 2016, at 09:39, Terry Reedy  wrote:
> 
> If trailing _ is allowed, to simplify the implementation, I would like PEP 8, 
> while on the subject, to say something like "While trailing _s on numbers are 
> allowed, to simplify the implementation, they serve no purpose and are 
> strongly discouraged".

That's a good point: we need style rules for PEP 8.

But I think everything that's just obviously pointless (like putting an 
underscore between every pair of digits, or sprinkling underscores all over a 
huge number to make ASCII art), or already handled by other guidelines (e.g., 
using a ton of underscores to "line up a table" is the same as using a ton of 
spaces, which is already discouraged) doesn't really need to be covered. And I 
think trailing underscores probably fall into that category.

It might be simpler to write a "whitelist" than a "blacklist" of all the ugly 
things people might come up with, and then just give a bunch of examples 
instead of a bunch of rules. Something like this:

While underscores can legally appear anywhere in the digit string, you should 
never use them for purposes other than visually separating meaningful digit 
groups like thousands, bytes, and the like.

123456_789012: ok (millions are groups, but thousands are more common, and 
6-digit groups are readable, but on the edge)
123_456_789_012: better
123_456_789_012_: bad (trailing)
1_2_3_4_5_6: bad (too many)
1234_5678: ok if code is intended to deal with east-Asian numerals (where 
groups of four digits are standard), bad otherwise
3__141_592_654: ok if this represents a fixed-point fraction (obviously bad 
otherwise)
123.456_789e123: good
123.456_789e1_23: bad (never useful in exponent)
0x1234_5678: good
0o123_456: good
0x123_456_789: bad (3 hex digits is usually not a meaningful group)
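
A quick way to check which spellings an interpreter actually accepts, sketched with `ast.literal_eval`. The helper name `parses_as_number` is invented here, and the sketch only tries the uncontroversial examples, since which of the "bad" ones are rejected outright depends on the rule finally adopted. It assumes Python 3.6 or later.

```python
import ast

def parses_as_number(source):
    """Return True if `source` is accepted as a numeric literal by this interpreter."""
    try:
        value = ast.literal_eval(source)
    except (SyntaxError, ValueError):
        return False
    return isinstance(value, (int, float, complex))

for literal in ["123_456_789_012", "1_2_3_4_5_6", "0x1234_5678", "123.456_789e123"]:
    print(literal, parses_as_number(literal))
```

Note that this says nothing about style: `1_2_3_4_5_6` parses fine and is still a bad idea.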

The one case that seems contentious is "123_456_j". Honestly, I don't care 
which way that goes, and I'd be fine if the PEP left out any mention of it, but 
if people feel strongly one way or the other, the PEP could just give it as a 
good or a bad example and that would be enough to clarify the intention.



Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Victor Stinner
2016-02-11 9:11 GMT+01:00 Georg Brandl :
> On 02/11/2016 12:04 AM, Victor Stinner wrote:
>> It looks like the implementation https://bugs.python.org/issue26331
>> only changes the Python parser.
>>
>> What about other functions converting strings to numbers at runtime
>> like int(str) and float(str)? Paul also asked for Decimal(str).
>
> I added these as "Open Questions" to the PEP.

Ok nice. Now another question :-)

Would it be useful to add an option to repr(int) and repr(float), or a
formatter to int.__format__() and float.__format__(), to add an
underscore for thousands? Currently, we have the "n" format which
depends on the current LC_NUMERIC locale:

>>> '{:n}'.format(1234)
'1234'
>>> import locale; locale.setlocale(locale.LC_ALL, '')
'fr_FR.UTF-8'
>>> '{:n}'.format(1234)
'1 234'

My idea:

>>> (1234).__repr__(pep515=True)
'1_234'
>>> (1234.0).__repr__(pep515=True)
'1_234.0'

or maybe:

>>> '{:pep515}'.format(1234)
'1_234'
>>> '{:pep515}'.format(1234.0)
'1_234.0'

I don't think that it would be a good idea to modify repr() default
behaviour, it would likely break a lot of applications.
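
For comparison, a sketch of locale-independent grouping through the format mini-language. The "," option already exists; the "_" option shown here is the spelling that Python 3.6 ended up with, not something proposed in this message, so treat it as an assumption about the interpreter version.

```python
# Assumes Python 3.6+ for the "_" grouping option.
print('{:,}'.format(1234))          # '1,234'   comma grouping, independent of locale
print('{:_}'.format(1234))          # '1_234'   underscore grouping
print('{:_}'.format(1234.0))        # '1_234.0'
print('{:_x}'.format(0xDEADBEEF))   # 'dead_beef'   hex digits are grouped by four
```

This leaves repr() untouched, which avoids the compatibility concern raised above.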

Victor


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Chris Angelico
On Thu, Feb 11, 2016 at 7:22 PM, Georg Brandl  wrote:
> * Allowing underscores in string arguments to the ``Decimal`` constructor.  It
>   could be argued that these are akin to literals, since there is no Decimal
>   literal available (yet).
>
> * Allowing underscores in string arguments to ``int()`` with base argument 0,
>   ``float()`` and ``complex()``.

I'm -0.5 on both of these, with the caveat that if either gets done,
both should be. Decimal() shouldn't be different from int() just
because there's currently no way to express a Decimal literal; if
Python 3.7 introduces such a literal, there'd be this weird rule
difference that has to be maintained for backward compatibility, and
has no justification left.

(As a side point, I would be fully in favour of Decimal literals. I'd
also be in favour of something like "from __future__ import
fraction_literals" so 1/2 would evaluate to Fraction(1,2) rather than
0.5. Hence I'm inclined *not* to support underscores in Decimal().)
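
To show what the side point is about, a sketch using today's constructors from the standard `fractions` and `decimal` modules. No such `__future__` import or Decimal literal exists; this only illustrates the difference in results.

```python
from decimal import Decimal
from fractions import Fraction

print(1 / 2)                                               # 0.5, a binary float
print(Fraction(1, 2))                                      # 1/2, an exact rational
print(0.1 + 0.2 == 0.3)                                    # False with binary floats
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))   # True with Decimal
```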

ChrisA


Re: [Python-Dev] Time for a change of random number generator?

2016-02-11 Thread Robert Kern

On 2016-02-11 00:08, Greg Ewing wrote:
> The Mersenne Twister is no longer regarded as quite state-of-the art
> because it can get into states that produce long sequences that are
> not very random.
>
> There is a variation on MT called WELL that has better properties
> in this regard. Does anyone think it would be a good idea to replace
> MT with WELL as Python's default rng?
>
> https://en.wikipedia.org/wiki/Well_equidistributed_long-period_linear


There was a side-discussion about this during the secrets module proposal 
discussion.


WELL would not be my first choice. It escapes the excess-0 islands faster than 
MT, but still suffers from them. More troubling to me is that it is a linear 
feedback shift register, like MT, and all LFSRs quickly fail the linear 
complexity test in BigCrush.


xorshift* shares some of these flaws, but is significantly stronger and 
dominates WELL in most (all?) relevant dimensions.


  http://xorshift.di.unimi.it/

I'm favorable to the PCG family these days, though xorshift* and Random123 are 
reasonable alternatives.


  http://www.pcg-random.org/
  https://www.deshawresearch.com/resources_random123.html
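
To give a feel for how little state and code these alternatives need, here is a sketch of the 64-bit xorshift* variant as it is commonly described. The shift amounts and multiplier are quoted from memory and should be checked against the xorshift page linked above; the function name and generator-style interface are choices made for this example.

```python
MASK64 = (1 << 64) - 1

def xorshift64star(state):
    """Yield 64-bit outputs of xorshift64*; `state` must be a nonzero 64-bit int."""
    if not 0 < state <= MASK64:
        raise ValueError("state must be a nonzero 64-bit integer")
    while True:
        # Three xor-shift steps scramble the single 64-bit word of state.
        state ^= state >> 12
        state ^= (state << 25) & MASK64
        state ^= state >> 27
        # The multiplication is the "*" that distinguishes it from plain xorshift.
        yield (state * 0x2545F4914F6CDD1D) & MASK64

gen = xorshift64star(1)
print([next(gen) for _ in range(3)])
```

The whole state is one 64-bit integer, against 624 words for the Twister. This is an illustration only, not a proposal for what `random` should use.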

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco



[Python-Dev] Python 3.2.7 and 3.3.7

2016-02-11 Thread Georg Brandl
Hi all,

I'm planning to release 3.2.7 and 3.3.7 at the end of February.
There will be a release candidate on Feb 20, and the final on
Feb 27, if there is no holdup.

These are both security (source-only) releases.  3.2.7 will be the
last release from the 3.2 series.

If you know of any patches that should go in, make sure to commit
them in time or notify me.

Thanks,
Georg



Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Ethan Furman

On 02/11/2016 09:19 AM, Serhiy Storchaka wrote:
> On 11.02.16 14:14, Georg Brandl wrote:
>
>> I still think that some cases (like two of the examples in the PEP,
>> 0b__ and 1.5_j) are worth having, and therefore a more relaxed
>> rule is preferable.
>
> Should I write an alternative PEP for strong rule?


Please don't.

A style guide recommendation which allows for variations when necessary 
is much better -- consenting adults, remember?


--
~Ethan~



Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Serhiy Storchaka

On 11.02.16 14:14, Georg Brandl wrote:
> On 02/11/2016 11:17 AM, Serhiy Storchaka wrote:
>
>>> **Group 3: only between digits, only one underscore**
>>>
>>> * Ada [8]_
>>> * Julia (but not in the exponent part of floats) [9]_
>>> * Ruby (docs say "anywhere", in reality only between digits) [10]_
>>
>> C++ is in this group too.
>>
>> The documentation of Perl explicitly says that Perl is in this group too
>> (23__500 is not legal). Perhaps there is a bug in the Perl implementation.
>> And maybe Swift is intended to be in this group.
>>
>> I think we should follow the majority of languages and use a simple rule:
>> "only between digits".
>>
>> I have provided an implementation.
>
> Thanks for the alternate patch.  I used the two-function approach you took
> in ast.c for my latest revision.
>
> I still think that some cases (like two of the examples in the PEP,
> 0b__ and 1.5_j) are worth having, and therefore a more relaxed
> rule is preferable.


Should I write an alternative PEP for strong rule?




Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Terry Reedy

On 2/11/2016 2:45 AM, Georg Brandl wrote:

Thanks for grabbing this issue and moving it forward.  I will like being
able to write or read 200_000_000 and be sure I am right without
counting 0s.

> Based on the feedback so far, I have an easier rule in mind that I will base
> the next PEP revision on.  It's basically
>
> "One or more underscores allowed anywhere after a digit or a base specifier."
>
> This preserves my preferred non-restrictive cases (0b__, 1.5_j) and
> disallows more controversial versions like "1.5e_+_2".


I like both choices above.  I don't like trailing underscores for two 
reasons.


1. The stated purpose of adding '_'s is to visually separate.  Trailing 
underscores do not do that.  They serve no purpose.
2. Trailing _s are used to turn keywords (class) into identifiers 
(class_).  To me, 123_ mentally clashes with this usage.


If trailing _ is allowed, to simplify the implementation, I would like 
PEP 8, while on the subject, to say something like "While trailing _s on 
numbers are allowed, to simplify the implementation, they serve no 
purpose and are strongly discouraged".


--
Terry Jan Reedy



Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Georg Brandl
On 02/11/2016 06:19 PM, Serhiy Storchaka wrote:

>> Thanks for the alternate patch.  I used the two-function approach you took
>> in ast.c for my latest revision.
>>
>> I still think that some cases (like two of the examples in the PEP,
>> 0b__ and 1.5_j) are worth having, and therefore a more relaxed
>> rule is preferable.
> 
> Should I write an alternative PEP for strong rule?

That seems excessive for a minor point.  Let's collect feedback for
a few days, and we can also collect some informal votes.

In the end, I suspect that Guido will let us know about his preference for
one of the possibilities, and when he does, I will update the PEP accordingly.

cheers,
Georg



Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Serhiy Storchaka

On 11.02.16 19:40, Georg Brandl wrote:
> On 02/11/2016 06:19 PM, Serhiy Storchaka wrote:
>
>>> Thanks for the alternate patch.  I used the two-function approach you took
>>> in ast.c for my latest revision.
>>>
>>> I still think that some cases (like two of the examples in the PEP,
>>> 0b__ and 1.5_j) are worth having, and therefore a more relaxed
>>> rule is preferable.
>>
>> Should I write an alternative PEP for strong rule?
>
> That seems excessive for a minor point.  Let's collect feedback for
> a few days, and we can also collect some informal votes.


I suspect that my arguments can be lost otherwise.



Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Andrew Barnert via Python-Dev
On Feb 11, 2016, at 10:15, Andrew Barnert via Python-Dev wrote:
> 
> That's a good point: we need style rules for PEP 8.

One more point: should the tutorial mention underscores? It looks like the 
intro docs for a lot of the other languages do. And it would only take one 
short sentence in 3.1.1 Numbers to say that you can use underscores to make 
large numbers like 123_456.789_012 more readable.
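
As a rough idea of what such a tutorial sentence could be paired with, here is a short sketch. It assumes an interpreter that implements the proposal (Python 3.6 or later); on older interpreters these lines are syntax errors.

```python
# Underscores only affect how the literal reads; the value is unchanged.
big = 123_456.789_012
print(big == 123456.789012)        # True

population = 1_000_000
print(population)                  # 1000000

# The same applies to other bases, e.g. grouping a hex address by words.
addr = 0xDEAD_BEEF
print(addr == 0xDEADBEEF)          # True
```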


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals

2016-02-11 Thread Jeff Hardy
On Thu, Feb 11, 2016 at 10:15 AM, Andrew Barnert via Python-Dev <python-dev@python.org> wrote:

> On Feb 11, 2016, at 09:39, Terry Reedy wrote:
> >
> > If trailing _ is allowed, to simplify the implementation, I would like
> > PEP 8, while on the subject, to say something like "While trailing _s on
> > numbers are allowed, to simplify the implementation, they serve no purpose
> > and are strongly discouraged".
>
> That's a good point: we need style rules for PEP 8.
>
> But I think everything that's just obviously pointless (like putting an
> underscore between every pair of digits, or sprinkling underscores all over
> a huge number to make ASCII art), or already handled by other guidelines
> (e.g., using a ton of underscores to "line up a table" is the same as using
> a ton of spaces, which is already discouraged) doesn't really need to be
> covered. And I think trailing underscores probably fall into that category.
>
> It might be simpler to write a "whitelist" than a "blacklist" of all the
> ugly things people might come up with, and then just give a bunch of
> examples instead of a bunch of rules. Something like this:
>
> While underscores can legally appear anywhere in the digit string, you
> should never use them for purposes other than visually separating
> meaningful digit groups like thousands, bytes, and the like.
>
> 123456_789012: ok (millions are groups, but thousands are more common,
> and 6-digit groups are readable, but on the edge)
> 123_456_789_012: better
> 123_456_789_012_: bad (trailing)
> 1_2_3_4_5_6: bad (too many)
> 1234_5678: ok if code is intended to deal with east-Asian numerals
> (where 10000 is a standard grouping), bad otherwise
> 3__141_592_654: ok if this represents a fixed-point fraction
> (obviously bad otherwise)
> 123.456_789e123: good
> 123.456_789e1_23: bad (never useful in exponent)
> 0x1234_5678: good
> 0o123_456: good
> 0x123_456_789: bad (3 hex digits is usually not a meaningful group)
>
> The one case that seems contentious is "123_456_j". Honestly, I don't care
> which way that goes, and I'd be fine if the PEP left out any mention of it,
> but if people feel strongly one way or the other, the PEP could just give
> it as a good or a bad example and that would be enough to clarify the
> intention.
>

I imagine that for whatever "bad" grouping you can suggest, someone,
somewhere, has a legitimate reason to use it. Any rule more complex than
"Use underscores in numeric literals only when they improve clarity" is
unnecessarily prescriptive.
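
For illustration only, here is a sketch of what a mechanical version of the quoted grouping examples might look like. The function `is_uniformly_grouped` and its `size` parameter are hypothetical and not taken from the PEP or PEP 8. It looks only at a plain digit string (no base prefix, fraction, or exponent) and assumes that "meaningful" means equal-sized groups with a possibly shorter leftmost group.

```python
def is_uniformly_grouped(digits: str, size: int) -> bool:
    """Hypothetical check: do underscores split `digits` into groups of `size`?

    The leftmost group may be shorter, as in 12_345_678.
    """
    if "_" not in digits:
        return True  # no underscores, so there is nothing to check
    groups = digits.split("_")
    if "" in groups:
        return False  # leading, trailing, or doubled underscore
    head, *rest = groups
    return len(head) <= size and all(len(g) == size for g in rest)


# Digit strings taken from the quoted examples, with an assumed group size.
for text, size in [
    ("123_456_789_012", 3),
    ("123_456_789_012_", 3),
    ("1_2_3_4_5_6", 3),
    ("1234_5678", 4),
    ("3__141_592_654", 3),
]:
    print(text, is_uniformly_grouped(text, size))
```

Under these assumptions the first and fourth strings pass and the others fail. That includes `3__141_592_654`, which the quoted list calls acceptable for a fixed-point fraction, so a fixed rule of this kind does reject at least one use that someone considers legitimate.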

- Jeff