Re: [Python-ideas] Repurpose `assert' into a general-purpose check

2018-01-17 Thread Ethan Furman

On 01/17/2018 10:59 PM, Steven D'Aprano wrote:
> On Thu, Jan 18, 2018 at 05:22:06PM +1100, Chris Angelico wrote:
>> I haven't yet seen any justification for syntax here. The nearest I've
>> seen is that this "ensure" action is more like:
>>
>> try:
>>     cond = x >= 0
>> except BaseException:
>>     raise AssertionError("x must be positive")
>> else:
>>     if not cond:
>>         raise AssertionError("x must be positive")
>>
>> Which, IMO, is a bad idea, and I'm not sure anyone was actually
>> advocating it anyway.
>
> My understanding is that Sylvain was advocating for that.


Agreed.  Which, as has been pointed out, is an incredibly bad idea.

--
~Ethan~

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Repurpose `assert' into a general-purpose check

2018-01-17 Thread Chris Angelico
On Thu, Jan 18, 2018 at 4:21 PM, Steve Barnes  wrote:
> 1. For asserts that should not be disabled we could have an always
> qualifier optionally added to assert, either as "assert condition
> exception always" or "assert always condition exception", that disables
> the optimisation for that specific exception. This would make it clearer
> that the developer needs this specific check always. Alternatively, we
> could consider a scoped flag, say keep_asserts, that sets the same.

But if they're never to be compiled out, why do they need special syntax?

assert always x >= 0, "x must be positive"

can become

if x < 0: raise ValueError("x must be positive")

I haven't yet seen any justification for syntax here. The nearest I've
seen is that this "ensure" action is more like:

try:
    cond = x >= 0
except BaseException:
    raise AssertionError("x must be positive")
else:
    if not cond:
        raise AssertionError("x must be positive")

Which, IMO, is a bad idea, and I'm not sure anyone was actually
advocating it anyway.
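
To make the objection concrete, here is a minimal sketch comparing the plain check with the exception-swallowing variant. The helper names check_plain, check_swallowing and error_name are hypothetical and exist only for this illustration.

```python
def check_plain(x):
    # Plain check: never compiled out, and unrelated errors propagate unchanged.
    if x < 0:
        raise ValueError("x must be positive")
    return x


def check_swallowing(x):
    # The "ensure"-style variant: any failure while evaluating the
    # condition is reported as the same AssertionError.
    try:
        cond = x >= 0
    except BaseException:
        raise AssertionError("x must be positive")
    else:
        if not cond:
            raise AssertionError("x must be positive")
    return x


def error_name(func, arg):
    # Return the class name of the exception func(arg) raises, or None.
    try:
        func(arg)
    except Exception as exc:
        return type(exc).__name__
    return None
```

Passing a string by mistake shows the difference: check_plain surfaces a TypeError that points at the real bug, while check_swallowing reports the misleading "x must be positive".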

ChrisA


Re: [Python-ideas] Repurpose `assert' into a general-purpose check

2018-01-17 Thread Steve Barnes


On 17/01/2018 17:19, Nikolas Vanderhoof wrote:
> I think having a means for such validations separate from assertions 
> would be helpful.
> However, I agree with Steven that 'validate' would be a bad keyword choice.
> Besides breaking compatibility with programs that use 'validate', it 
> would break wsgiref.validate in the standard library.

To me it looks like this discussion has basically split into two 
separate use cases:

1. Using assert in a way that it will not (ever) get turned off.
2. The specific case of ensuring that a variable/parameter is an 
instance of a specific type.

and I would like to suggest two separate possible syntaxes that might 
make sense.

1. For asserts that should not be disabled we could have an always 
qualifier optionally added to assert, either as "assert condition 
exception always" or "assert always condition exception", that disables 
the optimisation for that specific exception. This would make it clearer 
that the developer needs this specific check always. Alternatively, we 
could consider a scoped flag, say keep_asserts, that sets the same.

2. For the specific, and to me more desirable, use case of ensuring type 
compliance how about an ensure keyword, (or possibly function), with a 
syntax of "ensure var type" or "ensure(var, type)" which goes a little 
further by attempting to convert the type of var to type and only if var 
cannot be converted raises a type exception. This second syntax could, 
of course, be implemented as a library function rather than a change to 
python itself. Either option could have an optional exception to raise, 
with the default being a type error.
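
As a rough sketch of the library-function form of option 2, something along these lines might do. The parameter names and the choice to return the converted value are assumptions made for illustration, not part of the proposal.

```python
def ensure(value, type_, exc_type=TypeError):
    """Return value as type_, converting if needed; raise exc_type on failure."""
    if isinstance(value, type_):
        return value
    try:
        # Attempt the conversion; only a failed conversion is an error.
        return type_(value)
    except (TypeError, ValueError) as err:
        raise exc_type(
            f"cannot convert {value!r} to {type_.__name__}"
        ) from err
```

With this sketch, ensure("42", int) returns the integer 42, while ensure("abc", int) raises TypeError and keeps the underlying ValueError as its __cause__.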

-- 
Steve (Gadget) Barnes
Any opinions in this message are my personal opinions and do not reflect 
those of my employer.


Re: [Python-ideas] Repurpose `assert' into a general-purpose check

2018-01-17 Thread Nick Coghlan
On 18 January 2018 at 07:46, Steven D'Aprano  wrote:
> To justify a keyword, it needs to do something special that a built-in
> function can't do, like delayed evaluation (without wrapping the
> expression in a function).

My reaction to these threads for a while has been "We should just add
a function for unconditional assertions in expression form", and I
finally got around to posting that to the issue tracker rather than
leaving it solely in mailing list posts:
https://bugs.python.org/issue32590

The gist of the idea is to add a new ensure() builtin along the lines of:

class ValidationError(AssertionError):
    pass

_MISSING = object()

def ensure(condition, msg=_MISSING, exc_type=ValidationError):
    if not condition:
        if msg is _MISSING:
            msg = condition
        raise exc_type(msg)

There's no need to involve the compiler if you're never going to
optimise the code out, and code-rewriters like the one in pytest can
be taught to recognise "ensure(condition)" as being comparable to an
assert statement.
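
For illustration, here is how such a builtin might be used. This sketch repeats the definition so it runs on its own; set_balance is a hypothetical caller, not something from the tracker issue.

```python
class ValidationError(AssertionError):
    pass


_MISSING = object()


def ensure(condition, msg=_MISSING, exc_type=ValidationError):
    if not condition:
        if msg is _MISSING:
            msg = condition
        raise exc_type(msg)


def set_balance(amount):
    # Unlike an assert statement, this check still runs under "python -O".
    ensure(amount >= 0, "amount must be non-negative")
    return amount
```

Because ValidationError subclasses AssertionError, existing "except AssertionError" handlers would still catch a failed ensure(), and exc_type lets a caller pick a different exception where that fits better.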

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Support WHATWG versions of legacy encodings

2018-01-17 Thread Nathaniel Smith
On Wed, Jan 17, 2018 at 10:13 AM, Rob Speer  wrote:
> I'm going to push back on the idea that this should only be used for
> decoding, not encoding.
>
> The use case I started with -- showing people how to fix mojibake using
> Python -- would *only* use these codecs in the encoding direction. To fix
> the most common case of mojibake, you encode it as web-1252 and decode it as
> UTF-8 (because you got the data from someone who did the opposite).

It's also nice to be able to parse some HTML data, make a few changes
in memory, and then serialize it back to HTML. Having this crash on
random documents is rather irritating, esp. if these documents are
standards-compliant HTML as in this case.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-ideas] Repurpose `assert' into a general-purpose check

2018-01-17 Thread Steven D'Aprano
On Wed, Jan 17, 2018 at 12:19:51PM -0500, Nikolas Vanderhoof wrote:

> I think having a means for such validations separate from assertions would
> be helpful.

What semantics would this "validate" statement have, and how would it be 
different from what we can write now?


if not condition: raise SomeException(message)

validate condition, SomeException, message
# or some other name

Unless it does something better than a simple "if ... raise", there's 
not much point in adding a keyword just to save a few keystrokes.

To justify a keyword, it needs to do something special that a built-in 
function can't do, like delayed evaluation (without wrapping the 
expression in a function).
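
A small sketch of what "delayed evaluation" means here, assuming hypothetical helpers validate and validate_lazy: an ordinary function receives its arguments already evaluated, so laziness has to be spelled out with callables.

```python
def validate(condition, exc_type=ValueError, message="validation failed"):
    # Ordinary function: the caller has already evaluated both
    # `condition` and `message` before this body runs.
    if not condition:
        raise exc_type(message)


def validate_lazy(condition, exc_type=ValueError,
                  make_message=lambda: "validation failed"):
    # Delayed evaluation must be requested explicitly by passing callables.
    if not condition():
        raise exc_type(make_message())


calls = []


def expensive_message():
    # Records each time the message is actually built.
    calls.append("built")
    return "x must be non-negative"


x = 5
validate(x >= 0, ValueError, expensive_message())             # message built anyway
validate_lazy(lambda: x >= 0, ValueError, expensive_message)  # message never built
```

After both calls, the message has been built exactly once, by the eager version. A keyword could hide the lambda wrapping; a function cannot.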


-- 
Steve


Re: [Python-ideas] Support WHATWG versions of legacy encodings

2018-01-17 Thread Rob Speer
I'm going to push back on the idea that this should only be used for
decoding, not encoding.

The use case I started with -- showing people how to fix mojibake using
Python -- would *only* use these codecs in the encoding direction. To fix
the most common case of mojibake, you encode it as web-1252 and decode it
as UTF-8 (because you got the data from someone who did the opposite).

I have implemented some decode-only codecs (such as CESU-8), for exactly
the reason of "why would you want more text in this encoding", but the
situation is different here.
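
A minimal sketch of that encode-then-decode repair, using Python's built-in cp1252 codec as a stand-in because the proposed web-1252 codec is not in the stdlib. The function name fix_mojibake is hypothetical.

```python
def fix_mojibake(text, legacy_encoding="cp1252"):
    # Reverse a "UTF-8 bytes decoded as a legacy charset" mistake:
    # re-encode with the legacy charset, then decode as UTF-8.
    return text.encode(legacy_encoding).decode("utf-8")


# "caf\u00e9" encoded as UTF-8 is 63 61 66 C3 A9; decoding those bytes as
# cp1252 yields "caf\u00c3\u00a9", the familiar two-character garble.
garbled = "caf\u00c3\u00a9"
fixed = fix_mojibake(garbled)

# U+00C1 is C3 81 in UTF-8. Byte 0x81 is unassigned in Python's cp1252, so
# the garbled form produced by a WHATWG-style decoder cannot be encoded back.
try:
    fix_mojibake("\u00c3\u0081")
    stdlib_handles_c1 = True
except UnicodeEncodeError:
    stdlib_handles_c1 = False
```

The second case is the gap being discussed: the repair only works for every input if the codec can also encode the C1 control characters back to their original bytes.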

On Wed, 17 Jan 2018 at 13:00 Chris Barker  wrote:

> On Tue, Jan 16, 2018 at 9:30 PM, Stephen J. Turnbull <
> turnbull.stephen...@u.tsukuba.ac.jp> wrote:
>
>> In what context?  WHAT-WG's encoding standard is *all about browsers*.
>> If a codec is feeding text into a process that renders them all as
>> glyphs for a human to look at, that's one thing.  The codec doesn't
>> want to fatal there, and the likely fallback glyph is something from
>> the control glyphs block if even windows-125x doesn't have a glyph
>> there.  I guess it sort of makes sense.
>>
>
> sure it does -- and python is not a browser, and python itself has
> nothing visual -- but we sure want to be able to write code that produces
> visual representations of maybe messy text...
>
> if you're feeding a program
>
> ...
>
>> the codec has no idea when or how that's
>> going to get interpreted.
>
>
> sure -- which is why others have suggested that if WHATWG is supported,
> then it *should* only be used for decoding, not encoding. But we are
> supposed to be consenting adults here -- I see no reason to prevent
> encoding -- maybe it would be useful for testing???
>
> (as with JSON data, which I believe is
>> "supposed" to be UTF-8, but many developers use the legacy charsets
>> they're used to and which are often embedded in the underlying
>> databases etc, ditto XML),
>
>
> OK -- if developers do the wrong thing, then they do the wrong thing -- we
> can't prevent that!
>
> And Python's lovely "text is unicode" model actually makes that hard to do
> wrong. But we do need a way to decode messy text, and then send it off to
> JSON or whatever properly encoded.
>
> -CHB
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR(206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115   (206) 526-6317   main reception
>
> chris.bar...@noaa.gov


Re: [Python-ideas] Support WHATWG versions of legacy encodings

2018-01-17 Thread Chris Barker
On Tue, Jan 16, 2018 at 9:30 PM, Stephen J. Turnbull <
turnbull.stephen...@u.tsukuba.ac.jp> wrote:

> In what context?  WHAT-WG's encoding standard is *all about browsers*.
> If a codec is feeding text into a process that renders them all as
> glyphs for a human to look at, that's one thing.  The codec doesn't
> want to fatal there, and the likely fallback glyph is something from
> the control glyphs block if even windows-125x doesn't have a glyph
> there.  I guess it sort of makes sense.
>

sure it does -- and python is not a browser, and python itself has nothing
visual -- but we sure want to be able to write code that produces visual
representations of maybe messy text...

if you're feeding a program

...

> the codec has no idea when or how that's
> going to get interpreted.


sure -- which is why others have suggested that if WHATWG is supported, then
it *should* only be used for decoding, not encoding. But we are supposed to
be consenting adults here -- I see no reason to prevent encoding -- maybe
it would be useful for testing???

(as with JSON data, which I believe is
> "supposed" to be UTF-8, but many developers use the legacy charsets
> they're used to and which are often embedded in the underlying
> databases etc, ditto XML),


OK -- if developers do the wrong thing, then they do the wrong thing -- we
can't prevent that!

And Python's lovely "text is unicode" model actually makes that hard to do
wrong. But we do need a way to decode messy text, and then send it off to
JSON or whatever properly encoded.

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Python-ideas] Repurpose `assert' into a general-purpose check

2018-01-17 Thread Nikolas Vanderhoof
I think having a means for such validations separate from assertions would
be helpful.
However, I agree with Steven that 'validate' would be a bad keyword choice.
Besides breaking compatibility with programs that use 'validate', it would
break wsgiref.validate in the standard library.

On Tue, Jan 16, 2018 at 2:22 PM, Juancarlo Añez  wrote:

> Perhaps the OP can look into Python macro libraries to get the wanted
> syntax?
>
> https://github.com/lihaoyi/macropy
>
> On Tue, Jan 16, 2018 at 2:35 PM, Paul Moore  wrote:
>
>> On 16 January 2018 at 17:36, Sylvain MARIE
>>  wrote:
>> > (trying with direct reply this time)
>> >
>> >> Why do you do this? What's the requirement for delaying evaluation of
>> the condition?
>> >
>> > Thanks for challenging my poorly chosen examples :)
>> >
>> > The primary requirement is about *catching*
>> unwanted/uncontrolled/heterogeneous exceptions happening in the
>> underlying functions that are combined together to provide the validation
>> means, so as to provide a uniform/consistent outcome however diverse the
>> underlying functions are (they can return booleans or raise exceptions, or
>> both).
>> >
>> > In your proposal, if 'is_foo_compliant' raises an exception, it will
>> not be caught by 'assert_valid', therefore the ValidationError will not be
>> raised. So this is not what I want as an application developer.
>>
>> Ah, OK. But nothing in your proposal for a new statement suggests you
>> wanted that, and assert doesn't work like that, so I hadn't realised
>> that's what you were after.
>>
>> You could of course simply do:
>>
>> def assert_valid(expr, help_msg):
>>     # Catch exceptions in expr() as you see fit
>>     if not expr():
>>         raise ValidationError(help_msg)
>>
>> assert_valid(lambda: 0 <= surf < 1 and is_foo_compliant(surf),
>>              help_msg="surface should be 0=<surf<1 or foo compliant")
>>
>> No need for a whole expression language :-)
>>
>> Paul
>>
>
>
>
> --
> Juancarlo *Añez*
>


Re: [Python-ideas] Repurpose `assert' into a general-purpose check

2018-01-17 Thread Sylvain MARIE
(trying with direct reply this time)

> Why do you do this? What's the requirement for delaying evaluation of the 
> condition?

Thanks for challenging my poorly chosen examples :)

The primary requirement is about *catching* unwanted/uncontrolled/heterogeneous 
exceptions happening in the underlying functions that are combined together to 
provide the validation means, so as to provide a uniform/consistent outcome 
however diverse the underlying functions are (they can return booleans or raise 
exceptions, or both).

In your proposal, if 'is_foo_compliant' raises an exception, it will not be 
caught by 'assert_valid', therefore the ValidationError will not be raised. So 
this is not what I want as an application developer.
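
A rough sketch of the behaviour described above, assuming a hypothetical assert_valid that accepts any number of validator callables. The ValidationError class and the two sample validators are illustrative stand-ins, not the real library.

```python
class ValidationError(Exception):
    """Hypothetical uniform error type for this sketch."""


def assert_valid(name, value, *validators, help_msg=None):
    """Run each validator on value; report every failure as ValidationError."""
    for validator in validators:
        label = getattr(validator, "__name__", repr(validator))
        message = help_msg or f"{name} failed check {label}"
        try:
            outcome = validator(value)
        except Exception as err:
            # Keep the original error reachable via __cause__.
            raise ValidationError(message) from err
        # Assumption: a validator returning None signals success by not raising.
        if outcome is not None and not outcome:
            raise ValidationError(message)
    return value


def is_number(x):
    # Sample validator that signals failure by raising.
    if not isinstance(x, (int, float)):
        raise TypeError("not a number")


def is_fraction(x):
    # Sample validator that signals failure by returning False.
    return 0 <= x < 1
```

This sketch catches Exception rather than BaseException, so KeyboardInterrupt and SystemExit still propagate, and "raise ... from err" preserves the underlying failure for debugging.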

--
Sylvain

-----Original Message-----
From: Paul Moore [mailto:p.f.mo...@gmail.com] 
Sent: Tuesday, 16 January 2018 18:01
To: Sylvain MARIE 
Cc: Python-Ideas 
Subject: Re: [Python-ideas] Repurpose `assert' into a general-purpose check

I fixed the reply-to this time, looks like you're still getting messed up by 
Google Groups.

On 16 January 2018 at 16:25, smarie
 wrote:
> Let's consider this example where users want to define on-the-fly one 
> of the validation functions, and combine it with another with a 'or':
>
> assert_valid('surface', surf, or_(lambda x: (x >= 0) & (x < 
> 1), is_foo_compliant), help_msg="surface should be 0=<surf<1 or foo compliant")
>
> How ugly for something so simple ! I tried to make it slightly more 
> compact by developing a mini lambda syntax but it obviously makes it slower.

Why do you do this? What's the requirement for delaying evaluation of the 
condition? A validate statement in Python wouldn't be any better able to do 
that, so it'd be just as ugly with a statement. There's no reason I can see why 
I'd ever need delayed evaluation, so what's wrong with just

assert_valid(0 <= surf < 1 and is_foo_compliant(surf), 
             help_msg="surface should be 0=<surf<1 or foo compliant")

> There are three reasons why having a 'validate' statement would 
> improve
> this:
>
>  * no more parenthesis: more elegant and readable
>  * inline use of python (1): no more use of lambda or mini_lambda, no 
> performance overhead
>  * inline use of python (2): composition would not require custom 
> function composition operators such as 'or_' (above) or mini-lambda 
> composition anymore, it could be built-in in any language element used 
> after 
>
> resulting in
>
> validate (surf >= 0) & (surf < 1) or 
> is_foo_compliant(surf), "surface should be 0=<surf<1 or foo compliant"
>
> (I removed the variable name alias 'surface' since I don't know if it 
> should remain or not)
>
> Elegant, isn't it ?

No more so than my function version, but yes far more so than yours...

Paul



Re: [Python-ideas] Support WHATWG versions of legacy encodings

2018-01-17 Thread Soni L.



On 2018-01-17 03:30 AM, Stephen J. Turnbull wrote:

Soni L. writes:

  > This is surprising to me because I always took those encodings to
  > have those fallbacks [to raw control characters].

ISO-8859-1 implementations do, for historical reasons AFAICT.  And
they frequently produce mojibake and occasionally wilder behavior.
Most legacy encodings don't, and their standards documents frequently
leave the behavior undefined for control character codes (which means
you can error on them) and define use of unassigned codes as an error.

  > It's pretty wild to think someone wouldn't want them.

In what context?  WHAT-WG's encoding standard is *all about browsers*.
If a codec is feeding text into a process that renders them all as
glyphs for a human to look at, that's one thing.  The codec doesn't
want to fatal there, and the likely fallback glyph is something from
the control glyphs block if even windows-125x doesn't have a glyph
there.  I guess it sort of makes sense.

If you're feeding a program (as with JSON data, which I believe is
"supposed" to be UTF-8, but many developers use the legacy charsets
they're used to and which are often embedded in the underlying
databases etc, ditto XML), the codec has no idea when or how that's
going to get interpreted.  In one application I've maintained, an
editor, it has to deal with whatever characters are sent to it, but we
preferred to take charset designations seriously because users were
able to flexibly change those if they wanted to, so the error handler
is some form of replacement with a human-readable representation (not
pass-through), except for the usual HT, CR, LF, FF, and DEL (and ESC
in encodings using ISO 2022 extensions).  Mostly users would use the
editor to remove or replace invalid codes, although of course they
could just leave them in (and they would be converted from display
form to the original codes on output).

In another, a mailing list manager, codes outside the defined
repertoires were a recurring nightmare that crashed server processes
and blocked queues.  It took a decade before we sealed the last known
"leak" and I am not confident there are no leaks left.

So I don't actually have experience of a use case for control
character pass-through, and I wouldn't even automate the superset
substitutions if I could avoid it.  (In the editor case, I would
provide a dialog saying "This is supposed to be iso-8859-1, but I'm
seeing C1 control codes.  Would you like me to try windows-1252, which
uses those codes for graphic characters?")
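
The check behind such a dialog could be sketched roughly as follows. The function names are hypothetical, and Python's cp1252 codec stands in for windows-1252.

```python
def has_c1_controls(text):
    # C1 control characters occupy U+0080 through U+009F.
    return any("\x80" <= ch <= "\x9f" for ch in text)


def decode_latin1_with_hint(data):
    """Decode as iso-8859-1; if C1 controls appear, also offer a cp1252 reading."""
    text = data.decode("iso-8859-1")  # never fails: every byte maps to a code point
    if not has_c1_controls(text):
        return text, None
    try:
        suggestion = data.decode("cp1252")
    except UnicodeDecodeError:
        # Bytes such as 0x81 are unassigned in Python's cp1252.
        suggestion = None
    return text, suggestion
```

The function only produces a suggestion; whether to accept it is left to the user, which matches the "ask, don't automate" stance taken above.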

So to my mind, the use case here is relatively restricted (writing
user display interfaces) and does not need to be in the stdlib, and
would constitute an attractive nuisance there (developers would say
"these users will stop complaining about inability to process their
dirty data if I use a WHAT-WG version of a codec, then they don't have
to clean up").  I don't have an objection to supporting even that use
case, but I don't see why that support needs to be available in the
stdlib.



We use control characters as formatting/control characters on IRC all 
the time.


ISO-8859-1 explicitly defines control characters in the \x80-\x9F range, 
IIRC.


Windows codepages implicitly define control characters in that range, 
but they're still technically defined. It's a de-facto standard for 
those encodings.


I think python should follow the (de-facto) standard. This is it.