Re: [Python-ideas] Input characters in strings by decimals (Was: Proposal for default character representation)

2016-12-07 Thread Random832
On Wed, Dec 7, 2016, at 22:06, Mikhail V wrote:
> So you were caught up with hex from the beginning, as I see ;)
> I, on the contrary, in the dark times of learning programming
> (that was C), always oriented myself by decimal codes
> and don't regret it now.

C doesn't support decimal in string literals either, only octal and hex
(incidentally octal seems to have been much more common in the
environments where C was first invented). I can think of one context
where decimal is used for characters, actually, now that I think about
it. ANSI/ISO standards for 8-bit character sets often use a 'split'
decimal format (i.e. DEL = 7/15 rather than 0x7F or 127.)
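
As a rough sketch of that 'split' notation in Python (the helper name
`split_notation` is made up for illustration, and the unpadded column/row
form is assumed because that is how it is written above): the column is the
high four bits of the code and the row is the low four bits.

```python
def split_notation(code: int) -> str:
    """Format an 8-bit character code as column/row, e.g. 127 -> '7/15'."""
    if not 0 <= code <= 0xFF:
        raise ValueError("expected an 8-bit character code")
    column, row = divmod(code, 16)  # high nibble, low nibble
    return f"{column}/{row}"


print(split_notation(0x7F))      # DEL
print(split_notation(ord("A")))  # 'A' is 0x41
```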
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Input characters in strings by decimals (Was: Proposal for default character representation)

2016-12-07 Thread Mikhail V
On 8 December 2016 at 03:32, Matthias welp  wrote:
> Dear Mikhail,
>
> With python3.6 you can use format strings to get very close to your
> desired behaviour:
>
> f"{48:c}" == "0"
> f"{<int>:c}" == chr(<int>)
>
> It works with variables too:
>
> charvalue = 48
> f"{charvalue:c}" == chr(charvalue) # == "0"
>

Waaa! This works!

>
> I hope this helps solve your apparent usability problem.

Big big thanks, I didn't know this feature, but I have googled a lot
about "input characters as decimals", so was it just added?
More evidence that Python rules!

I'll rewrite some code, hope it'll have no side issues.

Mikhail


Re: [Python-ideas] Input characters in strings by decimals (Was: Proposal for default character representation)

2016-12-07 Thread Mikhail V
On 8 December 2016 at 03:36, Alexander Belopolsky
 wrote:
>
> On Wed, Dec 7, 2016 at 9:07 PM, Mikhail V  wrote:
>>
>> it somehow settled in
>> peoples' minds that hex reference should be preferred, for no solid reason
>> IMO.
>
> I may be showing my age, but all the facts that I remember about ASCII codes
> are in hex:
>
> 1. SPACE is 0x20 followed by punctuation symbols.
> 2. Decimal digits start at 0x30 with '0' = 0x30, '1' = 0x31, ...
> 3. @ is 0x40 followed by upper-case letter: 'A' = 0x41, 'B' = 0x42, ...
> 4. Lower-case letters are offset by 0x20 from the uppercase ones: 'a' =
> 0x61, 'b' = 0x62, ...
>
> Unicode is also organized around hexadecimal codes with various scripts
> positioned in sections that start at round hexadecimal numbers.  For example
> Cyrillic is at 0x0400 through 0x04FF.
>
> The only decimal fact I remember about Unicode is that the largest
> code-point is 1114111 - a palindrome!

As an aside, I've just noticed that in my example:
s = "first cyrillic letters: \{1040}\{1041}\{1042}"
s = "first cyrillic letters: \u0410\u0411\u0412"

the hex and decimal codes are made up of same digits, such a peculiar
coincidence...
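
A quick way to check that coincidence, as a sketch: for each of the three
decimal codes, compare the sorted decimal digits with the sorted digits of
the 4-digit hex form used after \u. The variable names are only illustrative.

```python
codes = (1040, 1041, 1042)

for code in codes:
    # decimal code, 4-digit hex form as used after \u, and the character itself
    print(code, f"{code:04X}", chr(code))

# One flag per code: True when both notations use exactly the same digits
same_digits = [sorted(str(code)) == sorted(f"{code:04X}") for code in codes]
print(same_digits)
```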

So you were caught up with hex from the beginning, as I see ;)
I, on the contrary, in the dark times of learning programming
(that was C), always oriented myself by decimal codes
and don't regret it now.


Re: [Python-ideas] Input characters in strings by decimals (Was: Proposal for default character representation)

2016-12-07 Thread Alexander Belopolsky
On Wed, Dec 7, 2016 at 9:07 PM, Mikhail V  wrote:
>
> it somehow settled in
> people's minds that hex reference should be preferred, for no solid
> reason IMO.

I may be showing my age, but all the facts that I remember about ASCII
codes are in hex:

1. SPACE is 0x20 followed by punctuation symbols.
2. Decimal digits start at 0x30 with '0' = 0x30, '1' = 0x31, ...
3. @ is 0x40 followed by upper-case letter: 'A' = 0x41, 'B' = 0x42, ...
4. Lower-case letters are offset by 0x20 from the uppercase ones: 'a' =
0x61, 'b' = 0x62, ...

Unicode is also organized around hexadecimal codes with various scripts
positioned in sections that start at round hexadecimal numbers.  For
example Cyrillic is at 0x0400 through 0x04FF
<http://unicode.org/charts/PDF/U0400.pdf>.

The only decimal fact I remember about Unicode is that the largest
code-point is 1114111 - a palindrome!
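
These facts are easy to check from a Python prompt. A minimal sketch using
only built-ins and `sys.maxunicode`; the variable `largest` is just a
convenience name:

```python
import sys

# The ASCII landmarks listed above
assert ord(" ") == 0x20
assert ord("0") == 0x30 and ord("9") == 0x39
assert ord("@") == 0x40 and ord("A") == 0x41
assert ord("a") - ord("A") == 0x20

# Flipping the 0x20 bit toggles the case of an ASCII letter
assert chr(ord("q") ^ 0x20) == "Q"

# The largest code point, in hex and in decimal
largest = sys.maxunicode
assert largest == 0x10FFFF == 1114111
assert str(largest) == str(largest)[::-1]  # reads the same backwards

print("all checks passed")
```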

Re: [Python-ideas] Input characters in strings by decimals (Was: Proposal for default character representation)

2016-12-07 Thread Matthias welp
Dear Mikhail,

With python3.6 you can use format strings to get very close to your
desired behaviour:

f"{48:c}" == "0"
f"{<int>:c}" == chr(<int>)

It works with variables too:

charvalue = 48
f"{charvalue:c}" == chr(charvalue) # == "0"


This is only 1 character overhead + 1 character extra per char
formatted compared to your example. And as an extra you can use
hex strings (f"{0x30:c}" == "0") and any other integer literal you might want.

I don't see the added value of making character escapes in a non-default
way only (chars escaped + 1) bytes shorter, with the added maintenance
and development cost.

I think that you can do a lot with f-strings, and using the built-in formatting
options you can already get the behaviour you want in Python 3.6, months
earlier than the next opportunity (Python 3.7).
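
Putting the pieces above together, a short sketch (it needs Python 3.6 or
later for f-strings; the `str.format` line shows the same `c` presentation
type for earlier versions, and the variable names are only illustrative):

```python
charvalue = 48

digit = f"{charvalue:c}"          # format an int as the character with that code
from_hex = f"{0x30:c}"            # any integer literal works, hex included
older = "{:c}".format(charvalue)  # same presentation type without f-strings

s = f"first cyrillic letters: {1040:c}{1041:c}{1042:c}"

print(digit, from_hex, older)
print(s)
```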

Check out the formatting options for integers and other built-in types here:
https://docs.python.org/3.6/library/string.html#format-specification-mini-language

I hope this helps solve your apparent usability problem.


-Matthias

On 8 December 2016 at 03:07, Mikhail V  wrote:
> On 8 December 2016 at 01:57, Nick Timkovich  wrote:
>>> hex notation not so readable and anyway decimal is kind of standard way to
>>> represent numbers
>>
>>
>> Can you cite some examples of Unicode reference tables I can look up a
>> decimal number in? They seem rare; perhaps in a list as a secondary column,
>> but they're not organized/grouped decimally. Readability counts, and
>> introducing a competing syntax will make it harder for others to read.
>
> There were links to such a table in the previous discussion. Google
> "unicode table decimal" and it will be the first link.
> I think most online tables include decimals as well, usually as tuples
> of 8-bit decimals.
> Also, earlier the decimal code was the first column in most tables, but
> it somehow settled in people's minds that the hex reference should be
> preferred, for no solid reason IMO.
> One reason, I think, is the HTML standards, which started to use it in
> HTML files long ago and had much influence later, but one should
> understand that this is just for brevity in most cases. The other reason
> is that file viewers show hex by default, but that is just a misfortune:
> the hex notation gives nothing besides brevity and 4-bit word alignment,
> unfortunately, at least in its current typeface.
> This was actually discussed in that thread.
> Many people also think they are cool hackers if they do everything in hex :)
> In some cases it is worth it, but not in this case IMO. Mainly for
> bitwise stuff, but then one should look into binary/trinary/quaternary
> representation, depending on the nature of the operations and the hardware.
>
> Yes, the Unicode table pagination corresponds to the hex reference,
> but that hardly plays any positive role in real applications. Most of the
> time I need to look in my code and perform number operations on
> *specific* ranges and codes, not on whole pages of the table. This could
> only play a role if I did low-level filtering of large files and wanted
> to filter data by a character's page, but that is the only positive thing
> I can think of, and I don't think it is directly relevant to Python.
>
> Imagine some cryptography exercise: you take 27 units, you just give
> them numbers (0..26) and you do calculations. Yes, you can view the
> results as hex numbers, but I don't do it, and most people don't and
> should not, since why? It is ugly and not readable.


Re: [Python-ideas] Input characters in strings by decimals (Was: Proposal for default character representation)

2016-12-07 Thread Mikhail V
On 8 December 2016 at 01:57, Nick Timkovich  wrote:
>> hex notation not so readable and anyway decimal is kind of standard way to
>> represent numbers
>
>
> Can you cite some examples of Unicode reference tables I can look up a
> decimal number in? They seem rare; perhaps in a list as a secondary column,
> but they're not organized/grouped decimally. Readability counts, and
> introducing a competing syntax will make it harder for others to read.

There were links to such a table in the previous discussion. Google
"unicode table decimal" and it will be the first link.
I think most online tables include decimals as well, usually as tuples
of 8-bit decimals.
Also, earlier the decimal code was the first column in most tables, but
it somehow settled in people's minds that the hex reference should be
preferred, for no solid reason IMO.
One reason, I think, is the HTML standards, which started to use it in
HTML files long ago and had much influence later, but one should
understand that this is just for brevity in most cases. The other reason
is that file viewers show hex by default, but that is just a misfortune:
the hex notation gives nothing besides brevity and 4-bit word alignment,
unfortunately, at least in its current typeface.
This was actually discussed in that thread.
Many people also think they are cool hackers if they do everything in hex :)
In some cases it is worth it, but not in this case IMO. Mainly for
bitwise stuff, but then one should look into binary/trinary/quaternary
representation, depending on the nature of the operations and the hardware.

Yes, the Unicode table pagination corresponds to the hex reference,
but that hardly plays any positive role in real applications. Most of the
time I need to look in my code and perform number operations on
*specific* ranges and codes, not on whole pages of the table. This could
only play a role if I did low-level filtering of large files and wanted
to filter data by a character's page, but that is the only positive thing
I can think of, and I don't think it is directly relevant to Python.

Imagine some cryptography exercise: you take 27 units, you just give
them numbers (0..26) and you do calculations. Yes, you can view the
results as hex numbers, but I don't do it, and most people don't and
should not, since why? It is ugly and not readable.
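
For instance, a toy shift cipher over such a 27-unit set might look like the
sketch below. The alphabet (26 letters plus space) and the function name
`shift` are assumptions for illustration; nothing above says which 27 units
are meant.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz "  # assumed 27 units, numbered 0..26


def shift(text: str, key: int) -> str:
    """Shift every unit of text by key positions, wrapping modulo 27."""
    numbers = [ALPHABET.index(ch) for ch in text]           # units -> 0..26
    shifted = [(n + key) % len(ALPHABET) for n in numbers]  # plain decimal arithmetic
    return "".join(ALPHABET[n] for n in shifted)


secret = shift("hello world", 3)
print(secret)
print(shift(secret, -3))  # shifting back restores the input
```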


Re: [Python-ideas] Input characters in strings by decimals (Was: Proposal for default character representation)

2016-12-07 Thread Terry Reedy

On 12/7/2016 7:22 PM, Mikhail V wrote:
> On 8 December 2016 at 01:13, Nick Timkovich  wrote:
>> Out of curiosity, why do you prefer decimal values to refer to Unicode code
>> points? Most references, http://unicode.org/charts/PDF/U0400.pdf (official)
>> or https://en.wikibooks.org/wiki/Unicode/Character_reference/-0FFF ,
>> prefer to refer to them by hexadecimal as the planes and ranges are broken
>> up by hex values.
>
> Well, there was a huge discussion in October, see the subject name.
> I just didn't want it to go in that direction again.
> So in short, hex notation is not so readable, and anyway decimal is
> kind of the standard way to represent numbers, and I treat a string as a number array
> when I am processing it, so hex is simply redundant and not needed for me.


I sympathize with your preference, but ... Perhaps the hex numbers would
bother you less if you thought of them as 'serial numbers'.  It is
standard for 'serial numbers' to include letters.  It is also common for
digit-letter serial numbers to have meaningful fields, as do the hex
versions of unicode serial numbers.  The decimal versions are
meaningless except as strict sequencers.
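
To illustrate the 'fields' idea, here is a small sketch that splits a code
point into parts that are visible directly in its hex digits. The helper
name `describe` and the plane/row/cell breakdown are my own choices for the
example, not an official API.

```python
import unicodedata


def describe(code_point: int) -> str:
    """Show the hex 'fields' of a code point next to its character name."""
    plane = code_point >> 16        # 0 for the Basic Multilingual Plane
    row = (code_point >> 8) & 0xFF  # group of 256 code points within the plane
    cell = code_point & 0xFF        # position within that group
    name = unicodedata.name(chr(code_point))
    return f"U+{code_point:04X}: plane {plane}, row {row:02X}, cell {cell:02X}, {name}"


print(describe(1040))  # the decimal 1040 hides that this is row 04 (Cyrillic)
```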


--
Terry Jan Reedy




Re: [Python-ideas] Input characters in strings by decimals (Was: Proposal for default character representation)

2016-12-07 Thread MRAB

On 2016-12-07 23:52, Mikhail V wrote:

> In past discussion about inputting and printing characters,
> I was proposing decimal notation instead of hex.
> Since the discussion was lost in off-topic talks, I'll try to
> summarise my idea better.
>
> I use ASCII only for code input (there are good reasons for that).
> Here I'll use Python 3.6, and Windows 7, so I can use print() with unicode
> directly and it works now in system console.
>
> Suppose I only start programming and want to do some character manipulation.
> The very first thing I would probably start with is a simple output for
> latin and cyrillic capital letters:
>
> caps_lat = ""
> for o in range(65, 91):
>     caps_lat = caps_lat + chr(o)
> print(caps_lat)
>
> caps_cyr = ""
> for o in range(1040, 1072):
>     caps_cyr = caps_cyr + chr(o)
> print(caps_cyr)
>
>
> Which prints:
> ABCDEFGHIJKLMNOPQRSTUVWXYZ
> АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ
>
>
> Say, I want now to input something direct in code:
>
> s = "first cyrillic letters: " + chr(1040) + chr(1041) + chr(1042)
>
> Which works fine and has a clean look. However, it is not very convenient
> because of much typing and also, if I generate such strings,
> adds a bit more complexity. But in general it is fine, and I use this
> method currently.
>
> =
> Proposal: I would want to have a possibility to input it *by decimals*:
>
> s = "first cyrillic letters: \{1040}\{1041}\{1042}"
> or:
> s = "first cyrillic letters: \(1040)\(1041)\(1042)"
>
>
> =
>
It's usually the case that escapes are \ followed by an ASCII-range 
letter or digit; \ followed by anything else makes it a literal, even if 
it's a metacharacter, e.g. " terminates a string that starts with ", but 
\" is a literal ", so I don't like \{...}.


Perl doesn't have \u... or \U..., it has \x{...} instead, and Python 
already has \N{...}, so:


s = "first cyrillic letters: \d{1040}\d{1041}\d{1042}"

might be better, but I'm still -1 because hex is usual when referring to 
Unicode codepoints.
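
For comparison, a small sketch of what already works today: the existing
\N{...} and \u escapes next to chr() with a decimal code. Only built-ins and
the standard `unicodedata` module are used; the variable names are
illustrative.

```python
import unicodedata

by_name = "\N{CYRILLIC CAPITAL LETTER A}"  # existing named escape
by_hex = "\u0410"                          # existing 16-bit hex escape
by_decimal = chr(1040)                     # decimal code via chr()

print(by_name == by_hex == by_decimal)
print(unicodedata.name(by_decimal))

# lookup() is the runtime counterpart of the \N{...} escape
second = unicodedata.lookup("CYRILLIC CAPITAL LETTER BE")
print(second == chr(1041))
```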



Re: [Python-ideas] Input characters in strings by decimals (Was: Proposal for default character representation)

2016-12-07 Thread Mikhail V
On 8 December 2016 at 01:13, Nick Timkovich  wrote:
> Out of curiosity, why do you prefer decimal values to refer to Unicode code
> points? Most references, http://unicode.org/charts/PDF/U0400.pdf (official)
> or https://en.wikibooks.org/wiki/Unicode/Character_reference/-0FFF ,
> prefer to refer to them by hexadecimal as the planes and ranges are broken
> up by hex values.

Well, there was a huge discussion in October, see the subject name.
I just didn't want it to go in that direction again.
So in short, hex notation is not so readable, and anyway decimal is
kind of the standard way to represent numbers, and I treat a string as a number array
when I am processing it, so hex is simply redundant and not needed for me.

Mikhail


Re: [Python-ideas] Input characters in strings by decimals (Was: Proposal for default character representation)

2016-12-07 Thread Nick Timkovich
Out of curiosity, why do you prefer decimal values to refer to Unicode code
points? Most references, http://unicode.org/charts/PDF/U0400.pdf (official)
or https://en.wikibooks.org/wiki/Unicode/Character_reference/-0FFF ,
prefer to refer to them by hexadecimal as the planes and ranges are broken
up by hex values.

On Wed, Dec 7, 2016 at 5:52 PM, Mikhail V  wrote:

> In past discussion about inputting and printing characters,
> I was proposing decimal notation instead of hex.
> Since the discussion was lost in off-topic talks, I'll try to
> summarise my idea better.
>
> I use ASCII only for code input (there are good reasons for that).
> Here I'll use Python 3.6, and Windows 7, so I can use print() with unicode
> directly and it works now in system console.
>
> Suppose I only start programming and want to do some character
> manipulation.
> The very first thing I would probably start with is a simple output for
> latin and cyrillic capital letters:
>
> caps_lat = ""
> for o in range(65, 91):
>     caps_lat = caps_lat + chr(o)
> print(caps_lat)
>
> caps_cyr = ""
> for o in range(1040, 1072):
>     caps_cyr = caps_cyr + chr(o)
> print(caps_cyr)
>
>
> Which prints:
> ABCDEFGHIJKLMNOPQRSTUVWXYZ
> АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ
>
>
> Say, I want now to input something direct in code:
>
> s = "first cyrillic letters: " + chr(1040) + chr(1041) + chr(1042)
>
> Which works fine and has a clean look. However, it is not very convenient
> because of much typing and also, if I generate such strings,
> adds a bit more complexity. But in general it is fine, and I use this
> method currently.
>
> =
> Proposal: I would want to have a possibility to input it *by decimals*:
>
> s = "first cyrillic letters: \{1040}\{1041}\{1042}"
> or:
> s = "first cyrillic letters: \(1040)\(1041)\(1042)"
>
> =
>
> This is more compact and does not seem to conflict much with
> current Python escape characters in string literals.
> So backslash is a start of some escaping in most cases.
>
> For me the most important thing is that in this way I would avoid
> any presence of hex numbers in strings, which I find very good
> for readability, and for me it is very convenient since I use decimals
> for processing everywhere (and encourage everyone to do so).
>
> So this is my proposal, any comments on this are appreciated.
>
>
> PS:
>
> Currently Python 3 supports these in addition to \x:
> (from https://docs.python.org/3/howto/unicode.html)
> """
> If you can’t enter a particular character in your editor or want to keep
> the source code ASCII-only for some reason, you can also use escape
> sequences in string literals.
>
> >>> "\N{GREEK CAPITAL LETTER DELTA}"  # Using the character name
> >>> "\u0394"  # Using a 16-bit hex value
> >>> "\U00000394"  # Using a 32-bit hex value
>
> """
> So I have many possibilities, and all of them strangely contradict
> my image of intuitive and readable. Well, using the character name is readable,
> but seriously not much of a practical solution for input, though it could be
> very useful for printing a description of a character.
>
>
> Mikhail

[Python-ideas] Input characters in strings by decimals (Was: Proposal for default character representation)

2016-12-07 Thread Mikhail V
In past discussion about inputting and printing characters,
I was proposing decimal notation instead of hex.
Since the discussion was lost in off-topic talks, I'll try to
summarise my idea better.

I use ASCII only for code input (there are good reasons for that).
Here I'll use Python 3.6, and Windows 7, so I can use print() with unicode
directly and it works now in system console.

Suppose I only start programming and want to do some character manipulation.
The very first thing I would probably start with is a simple output for
latin and cyrillic capital letters:

caps_lat = ""
for o in range(65, 91):
    caps_lat = caps_lat + chr(o)
print(caps_lat)

caps_cyr = ""
for o in range(1040, 1072):
    caps_cyr = caps_cyr + chr(o)
print(caps_cyr)


Which prints:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ
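
The same two strings can also be built without the explicit loops. A short
sketch with `str.join`, using the same decimal ranges as above:

```python
# Join the characters for each decimal range in a single expression
caps_lat = "".join(chr(o) for o in range(65, 91))
caps_cyr = "".join(map(chr, range(1040, 1072)))

print(caps_lat)
print(caps_cyr)
```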


Say, I want now to input something direct in code:

s = "first cyrillic letters: " + chr(1040) + chr(1041) + chr(1042)

Which works fine and has a clean look. However, it is not very convenient
because of much typing and also, if I generate such strings,
adds a bit more complexity. But in general it is fine, and I use this
method currently.

=
Proposal: I would want to have a possibility to input it *by decimals*:

s = "first cyrillic letters: \{1040}\{1041}\{1042}"
or:
s = "first cyrillic letters: \(1040)\(1041)\(1042)"

=

This is more compact and does not seem to conflict much with
current Python escape characters in string literals.
So backslash is a start of some escaping in most cases.

For me the most important thing is that in this way I would avoid
any presence of hex numbers in strings, which I find very good
for readability, and for me it is very convenient since I use decimals
for processing everywhere (and encourage everyone to do so).
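
Until something like this exists in the language, the effect can be
approximated at run time. The sketch below is only an emulation under stated
assumptions: the helper `expand_decimal_escapes` is hypothetical, and the
input has to be a raw string so that the backslashes reach the function
unchanged.

```python
import re

# Matches a backslash, an opening brace, decimal digits and a closing brace
_DECIMAL_ESCAPE = re.compile(r"\\\{(\d+)\}")


def expand_decimal_escapes(text: str) -> str:
    r"""Replace every \{NNNN} in text with chr(NNNN)."""
    return _DECIMAL_ESCAPE.sub(lambda match: chr(int(match.group(1))), text)


s = expand_decimal_escapes(r"first cyrillic letters: \{1040}\{1041}\{1042}")
print(s)
```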

So this is my proposal, any comments on this are appreciated.


PS:

Currently Python 3 supports these in addition to \x:
(from https://docs.python.org/3/howto/unicode.html)
"""
If you can’t enter a particular character in your editor or want to keep
the source code ASCII-only for some reason, you can also use escape
sequences in string literals.

>>> "\N{GREEK CAPITAL LETTER DELTA}"  # Using the character name
>>> "\u0394"  # Using a 16-bit hex value
>>> "\U00000394"  # Using a 32-bit hex value

"""
So I have many possibilities, and all of them strangely contradict
my image of intuitive and readable. Well, using the character name is readable,
but seriously not much of a practical solution for input, though it could be
very useful for printing a description of a character.
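
A small check that these spellings all denote the same one-character string,
together with the decimal form via chr(). Note that \u takes exactly four
hex digits and \U exactly eight. The variable names are only illustrative.

```python
delta_by_name = "\N{GREEK CAPITAL LETTER DELTA}"
delta_u = "\u0394"      # \u takes exactly 4 hex digits
delta_U = "\U00000394"  # \U takes exactly 8 hex digits
delta_chr = chr(916)    # 0x394 is 916 in decimal

print(delta_by_name == delta_u == delta_U == delta_chr)
print(len(delta_U), ord(delta_U))
```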


Mikhail

Re: [Python-ideas] PEP: Distributing a Subset of the Standard Library

2016-12-07 Thread M.-A. Lemburg
On 07.12.2016 13:57, Nick Coghlan wrote:
> On 7 December 2016 at 18:33, M.-A. Lemburg  wrote:
>> I know that you started this thread focusing on the stdlib,
>> but for the purpose of distributors, the scope goes far
>> beyond just the stdlib.
>>
>> Basically any Python module or package which the distribution can
>> provide should be usable as basis for a nice error message pointing to
>> the package to install.
> 
> The PEP draft covered two questions:
> 
> - experienced redistributors breaking the standard library up into pieces
> - optional modules for folks building their own Python (even if
> they're new to that)
> 
>> Now, it's the distribution which knows which modules/packages
>> are available, so we don't need a list of stdlib modules
>> in Python to help with this.
> 
> Right, that's the case that we realised can be covered entirely by the
> suggestion "patch site.py to install a different default
> sys.excepthook()"
> 
>> A list of stdlib modules may still be useful, but it comes
>> with its own set of problems, which should be irrelevant
>> for this use case: some stdlib modules are optional and
>> only available if the system provides (and Python can find)
>> certain libs (or header files during compilation).
> 
> While upstream changes turned out not to be necessary for the
> "distributor breaking up the standard library" use case, they may
> still prove worthwhile in making import errors more informative in the
> case of "I just built my own Python from upstream sources and didn't
> notice (or didn't read) the build message indicating that some modules
> weren't built".
> 
> Given the precedent of the sysconfig metadata generation, providing
> some form of machine-readable build-time-generated module manifest
> should be pretty feasible if someone was motivated to implement it,
> and we already have the logic to track which optional modules weren't
> built in order to generate the message at the end of the build
> process.

True, but the build process only covers C extensions. Writing
the information somewhere for Python to pick up would be easy,
though (just dump the .failed* lists somewhere).

For pure Python modules, I suppose the install process could
record all installed modules.

Put all this info into a generated "_sysconfigstdlib" module,
import this into sysconfig and you're set.

Still, in all the years I've been using Python I never ran
into a situation where I was interested in such information.

For cases where a module is optional, you usually write
a try...except and handle this on a case-by-case basis.

That's safer than relying on some build time generated
list, since the Python binary may well have been built
on a different machine than the one the application is
currently running on and so, even if an optional module
is listed as built successfully, it may still fail to
import.
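
A minimal sketch of that case-by-case pattern, using sqlite3 as the optional
module. The fallback behaviour and the helper name `open_cache` are
assumptions made up for the example:

```python
try:
    import sqlite3  # optional: needs the SQLite library when Python is built
except ImportError:
    sqlite3 = None  # remember that the feature is unavailable


def open_cache(path=":memory:"):
    """Return a database connection, or None when sqlite3 is unavailable."""
    if sqlite3 is None:
        return None
    return sqlite3.connect(path)


connection = open_cache()
print("sqlite3 available:", connection is not None)
```

The check happens at import time on the machine that actually runs the
application, which is why it does not depend on what the build machine
recorded.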

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Dec 07 2016)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...   http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...   http://zope.egenix.com/


::: We implement business ideas - efficiently in both time and costs :::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/
  http://www.malemburg.com/



Re: [Python-ideas] PEP: Distributing a Subset of the Standard Library

2016-12-07 Thread Stephen J. Turnbull
Nick Coghlan writes:

 > While upstream changes turned out not to be necessary for the
 > "distributor breaking up the standard library" use case, they may
 > still prove worthwhile in making import errors more informative in the
 > case of "I just built my own Python from upstream sources and didn't
 > notice (or didn't read) the build message indicating that some modules
 > weren't built".

This case-by-case line of argument gives me a really bad feeling.  Do
we have to play whack-a-mole with every obscure message that pops up
that somebody might not be reading?  OK, this is a pretty common and
confusing case, but surely there's something more systematic (and
flexible vs. turning every error message into a complete usage manual
... which tl;dr) we can do.

One way to play would be an interactive checklist-based diagnostic
module (ie, a "rule-based expert system") that could be plugged into
IDEs or even into sys.excepthook.  Given Python's excellent
introspective facilities, with a little care the rule interpreter
could be designed with access to namespaces to provide additional
detail or tweak rule priority.  We could even build in a learning
engine to give priority to users' habitual bugs (including typical
mistaken diagnoses).

That said, I don't have time to work on it :-(, so feel free to ignore
me.  And I grant that since AFAIK we have zero existing code for the
engine and rule database, it might be a good idea to do something for
some particular obscure errors in the 3.7 timeframe.



Re: [Python-ideas] PEP: Distributing a Subset of the Standard Library

2016-12-07 Thread Nick Coghlan
On 7 December 2016 at 18:33, M.-A. Lemburg  wrote:
> I know that you started this thread focusing on the stdlib,
> but for the purpose of distributors, the scope goes far
> beyond just the stdlib.
>
> Basically any Python module or package which the distribution can
> provide should be usable as basis for a nice error message pointing to
> the package to install.

The PEP draft covered two questions:

- experienced redistributors breaking the standard library up into pieces
- optional modules for folks building their own Python (even if
they're new to that)

> Now, it's the distribution which knows which modules/packages
> are available, so we don't need a list of stdlib modules
> in Python to help with this.

Right, that's the case that we realised can be covered entirely by the
suggestion "patch site.py to install a different default
sys.excepthook()"

> A list of stdlib modules may still be useful, but it comes
> with its own set of problems, which should be irrelevant
> for this use case: some stdlib modules are optional and
> only available if the system provides (and Python can find)
> certain libs (or header files during compilation).

While upstream changes turned out not to be necessary for the
"distributor breaking up the standard library" use case, they may
still prove worthwhile in making import errors more informative in the
case of "I just built my own Python from upstream sources and didn't
notice (or didn't read) the build message indicating that some modules
weren't built".

Given the precedent of the sysconfig metadata generation, providing
some form of machine-readable build-time-generated module manifest
should be pretty feasible if someone was motivated to implement it,
and we already have the logic to track which optional modules weren't
built in order to generate the message at the end of the build
process.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] PEP: Distributing a Subset of the Standard Library

2016-12-07 Thread M.-A. Lemburg
I know that you started this thread focusing on the stdlib,
but for the purpose of distributors, the scope goes far
beyond just the stdlib.

Basically any Python module or package which the distribution can
provide should be usable as basis for a nice error message pointing to
the package to install.

Now, it's the distribution which knows which modules/packages
are available, so we don't need a list of stdlib modules
in Python to help with this.

The helper function (whether called via sys.excepthook() or perhaps
a new sys.importerrorhook()) would then check the imported
module name against this list and write out the message
pointing the user to the missing package.
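
A rough sketch of such a helper on top of the existing sys.excepthook. The
module-to-package table and its package names are hypothetical placeholders
for whatever list a distribution would ship, and the function names are made
up for the example; there is no sys.importerrorhook() today.

```python
import sys

# Hypothetical data a distributor would provide: module name -> package name
MODULE_TO_PACKAGE = {
    "tkinter": "python3-tkinter",
    "sqlite3": "python3-sqlite3",
}


def hint_for(exc):
    """Return an install hint for a failed import, or None."""
    if isinstance(exc, ImportError) and exc.name:
        top_level = exc.name.partition(".")[0]
        package = MODULE_TO_PACKAGE.get(top_level)
        if package:
            return f"Module {top_level!r} is not installed; try installing {package!r}."
    return None


def import_error_hook(exc_type, exc, tb):
    """Print the normal traceback, then a hint for known missing modules."""
    sys.__excepthook__(exc_type, exc, tb)
    hint = hint_for(exc)
    if hint:
        print(hint, file=sys.stderr)


sys.excepthook = import_error_hook
```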

A list of stdlib modules may still be useful, but it comes
with its own set of problems, which should be irrelevant
for this use case: some stdlib modules are optional and
only available if the system provides (and Python can find)
certain libs (or header files during compilation).

For a distribution there are no optional stdlib modules,
since the distributor will know the complete list of available
modules in the distribution, including their external
dependencies.

In other words: Python already provides all the necessary
logic to enable implementing the suggested use case.


On 07.12.2016 06:24, Nick Coghlan wrote:
> On 7 December 2016 at 02:50, Tomas Orsava  wrote:
>> So using _sysconfigdata as inspiration, it would likely be possible to
>> provide a "sysconfig.get_missing_modules()" API that the default
>> sys.excepthook() could use to report that a particular import didn't
>> work because an optional standard library module hadn't been built.
>>
>> Quite interesting. And sysconfig.get_missing_modules() wouldn't even have to
>> be generated during the build process, because it would be called only when
>> the import has failed, at which point it is obvious Python was built without
>> said component (like _sqlite3). So do you see that as an acceptable
>> solution?
> 
> Oh, I'd missed that - yes, the sysconfig API could potentially be
> something like `sysconfig.get_stdlib_modules()` and
> `sysconfig.get_optional_modules()` instead of specifically reporting
> which ones were missed by the build process. There'd still be some
> work around generating the manifests backing those APIs at build time
> (including getting them right for Windows as well), but it would make
> some other questions that are currently annoying to answer relatively
> straightforward (see
> http://stackoverflow.com/questions/6463918/how-can-i-get-a-list-of-all-the-python-standard-library-modules
> for more on that)
> 
>> Do you prefer the one you suggested previously?
> 
> The only strong preference I have around how this is implemented is
> that I don't want to add complex single-purpose runtime infrastructure
> for the task. For all of the other specifics, I think it makes sense
> to err on the side of "What will be easiest to maintain over time?"
> 
>> Alternatively, can the contents of site.py be generated during the build
>> process? Because if some modules couldn't be built, a custom implementation
>> of sys.excepthook might be generated there with the data for the modules
>> that failed to be built.
> 
> We don't really want site.py itself to be auto-generated (although it
> could be updated to use Argument Clinic selectively if we deemed that
> to be an appropriate thing to do), but there's no problem with
> generating either data modules or normal importable modules that get
> accessed from site.py.
> 
> Cheers,
> Nick.
> 

-- 
Marc-Andre Lemburg
eGenix.com
