Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread julien tayon
I wrote a library specifically for the validator case that also amends the
documentation. By default, if the name of the function plus its arguments
speaks for itself, then only that is added to the docstring.
For example, @require_odd_numbers() would add "require_odd_numbers" at the end
of __doc__. It is also possible to add docstring templates.
https://github.com/jul/check_arg
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Adrien Ricocotam
Hi everyone, first participation in Python’s mailing list, don’t be too hard on 
me

Some suggested above to change the definition of len in the long term. Then I 
think it could be interesting to define len such as :

- If has a finite length : return that length (the way it works now)
- If has a  length that is infinity : return infinity
- If has no length : return None

There’s an issue with this solution: returning None adds complexity to the 
use of len. I therefore suggest having a wrapper over __len__ methods that 
raises the current error.

But there’s still a problem with infinite-length objects. If people write:

for i in range(len(infinite_list)):
    # Something

It’s not clear whether people actually want to do this. It’s open to 
discussion, and it is just a suggestion.

If we now consider map, then the length of a map (or filter, or any other 
generator based on an iterator) is the same as that of the underlying 
iterator, which could be either infinite or undefined.
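The three cases listed above can be sketched roughly as follows. This is only an illustration under assumptions: the names proposed_len and strict_len are hypothetical, and the check for "known infinite" objects is limited to itertools.count because there is no general way to detect an infinite iterator.

```python
import itertools
import math


def proposed_len(obj):
    """Return a finite length, math.inf, or None (hypothetical helper)."""
    # Illustrative only: itertools.count is one iterator known to be infinite.
    if isinstance(obj, itertools.count):
        return math.inf
    try:
        return len(obj)  # finite length, exactly as len() works today
    except TypeError:
        return None  # the object has no length at all


def strict_len(obj):
    """Wrapper that keeps today's behaviour: raise when there is no length."""
    n = proposed_len(obj)
    if n is None:
        raise TypeError(f"object of type {type(obj).__name__!r} has no len()")
    return n
```

Note that range(math.inf) raises a TypeError, so the loop over range(len(infinite_list)) would still need separate handling.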

Cheers

> On 29 Nov 2018, at 06:06, Anders Hovmöller  wrote:
> 
> 
>>> +1.  Throwing away information is almost always a bad idea.
>> 
>> "Almost always"? Let's take this seriously, and think about the 
>> consequences if we actually believed that. If I created a series of 
>> integers:
> 
> "Almost". It’s part of my sentence. I have known about addition for many 
> years in fact :)
> 
> / Anders
> 


Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread Marko Ristin-Kaufmann
Hi Abe,
Thanks for your suggestions! We actually already considered the two
alternatives you propose.

*Multiple predicates per decorator. *The problem is that you cannot easily
toggle or describe individual contracts. While you can hack
your way through it (considering the arguments in the sequence, for
example), we found it clearer to have separate decorators. Moreover,
tracebacks are much easier to read, which is important when you debug a
program.
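To make the point about separate decorators concrete, here is a minimal sketch. It is a toy stand-in written for illustration, not icontract's actual implementation or API: the require decorator, its enabled flag and the crop function are all assumptions.

```python
import functools
import inspect


def require(condition, description="", enabled=True):
    """Toy precondition decorator (hypothetical, for illustration only)."""
    def decorator(func):
        if not enabled:
            return func  # a disabled contract adds no wrapper and no call overhead
        sig = inspect.signature(func)
        cond_params = inspect.signature(condition).parameters

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            # Pass the condition only the arguments it names.
            cond_args = {name: bound.arguments[name] for name in cond_params}
            if not condition(**cond_args):
                raise ValueError(f"Precondition violated: {description}")
            return func(*args, **kwargs)
        return wrapper
    return decorator


# One decorator per contract: each has its own description and its own switch.
@require(lambda x: x >= 0, "x must be non-negative")
@require(lambda width: width > 0, "width must be positive", enabled=False)
def crop(x, width):
    return (x, x + width)
```

With one contract per decorator, the violated description appears directly in the traceback, and a single contract can be switched off without touching the others.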

*AST magic. *The problem with any approach based on parsing (be it parsing
the code or the description) is that parsing is slow so you end up spending
a lot of cycles on contracts which might not be enabled (many contracts are
applied only in the testing environment, not in production). Hence you
must have an approach that offers practically zero overhead cost to
importing a module when its contracts are turned off.

Decoding byte-code does not work as current decoding libraries cannot keep
up with the changes in the language and the compiler, hence they are always
lagging behind.

*Practicality of decorators. *We have retrospective meetings at the company
and I frequently survey the opinions related to the contracts (explicitly
asking about the readability and maintainability) -- so far nobody has had any
difficulties and nobody was bothered by the noisy syntax. The decorator
syntax is simply not beautiful, no discussion about that. But when it comes
to maintenance, there's a linter included (
https://github.com/Parquery/pyicontract-lint), and if you want contracts
rendered in an appealing way, there's a documentation tool for Sphinx (
https://github.com/Parquery/sphinx-icontract). The linter facilitates
maintainability a lot and the Sphinx tool gives you nice documentation for a
library so that you don't even have to look into the source code that often
if you don't want to.

We need to be careful not to mistake issues of aesthetics for practical
issues. Something might not be beautiful, but can be useful unless it's
unreadable.

*Conclusion. *What we do need at this moment, IMO, is a broad practical
experience of using contracts in Python. Once you make a change to the
language, it's impossible to undo. In contrast to what has been suggested
in the previous discussions (including my own voiced opinions), I actually
now don't think that introducing a language change would be beneficial *at
this precise moment*. We don't know what the use cases are, and there is no
practical experience to base the language change on.

I'd prefer to hear from people who actually use contracts in their
professional Python programming -- apart from the noisy syntax, how was the
experience? Did it help you catch bugs (and how many)? Were there big
problems with maintainability? Could you easily refactor? What were the
limits of the contracts you encountered? What kind of snapshot mechanism do
we need? How did you deal with multi-threading? And so on.

The icontract library is already practically usable and, if you don't use
inheritance, dpcontracts is usable as well.  I would encourage everybody to
try out programming with contracts using an existing library and just hold
their nose when writing the noisy syntax. Once we have unearthed deeper
problems related to contracts, I think it will be much easier and much more
convincing to write a proposal for introducing contracts in the core
language. If I had to write a proposal right now, it would be based only on
the experience of writing a humble 100K code base by a team of 5-10 people.
Not very convincing.


Cheers,
Marko

On Thu, 29 Nov 2018 at 02:26, Abe Dillon  wrote:

> Marko, I have a few thoughts that might improve icontract.
> First, multiple clauses per decorator:
>
> @pre(
>     lambda x: x >= 0,
>     lambda y: y >= 0,
>     lambda width: width >= 0,
>     lambda height: height >= 0,
>     lambda x, width, img: x + width <= width_of(img),
>     lambda y, height, img: y + height <= height_of(img))
> @post(
>     lambda self: (self.x, self.y) in self,
>     lambda self: (self.x+self.width-1, self.y+self.height-1) in self,
>     lambda self: (self.x+self.width, self.y+self.height) not in self)
> def __init__(self, img: np.ndarray, x: int, y: int, width: int, height: int) -> None:
>     self.img = img[y : y+height, x : x+width].copy()
>     self.x = x
>     self.y = y
>     self.width = width
>     self.height = height
>
> def __contains__(self, pt: Tuple[int, int]) -> bool:
>     x, y = pt
>     return (self.x <= x < self.x + self.width) and (self.y <= y < self.y + self.height)
>
>
> You might be able to get away with some magic by decorating a method just
> to flag it as using contracts:
>
>
> @contract  # <- does byte-code and/or AST voodoo
> def __init__(self, img: np.ndarray, x: int, y: int, width: int, height: int) -> None:
>     pre(x >= 0,
>         y >= 0,
>         width >= 0,
>         height >= 0,
>         x + width <= width_of(img),
>         y + height <= height_of(img))
>
> 

Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread Marko Ristin-Kaufmann
Hi,

> Property based testing is not about just generating random values till the
> heat death of the universe, but generating sensible values in a
> configurable way to cover all equivalence classes we can think of. If my
> function takes two floating point numbers as arguments, hypothesis
> "strategies" won't try all possible combinations of all possible floating
> point values, but instead all possible combinations of interesting values
> (NaN, Infinity, too big, too small, positive, negative, zero, None, decimal
> fractions, etc..), something that an experienced programmer probably would
> end up doing by himself with a lot of test cases, but that can be better
> done with less effort by the automation provided by the hypothesis package.
>

Exactly. A tool can go a step further and, based on the assertions and
contracts, generate the tests automatically or prove that certain
properties of the program always hold. I would encourage people interested
in automatic testing to have a look at the scientific literature on the
topic (formal static analysis). Abstract interpretation has already been
mentioned: https://en.wikipedia.org/wiki/Abstract_interpretation. For some
bleeding-edge work, have a look at what this lab does with machine
learning: https://eth-sri.github.io/publications/


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Anders Hovmöller

>> +1.  Throwing away information is almost always a bad idea.
> 
> "Almost always"? Let's take this seriously, and think about the 
> consequences if we actually believed that. If I created a series of 
> integers:

"Almost". It’s part of my sentence. I have known about addition for many years 
in fact :)

/ Anders



Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread Abe Dillon
One thought I had pertains to a very narrow sub-set of cases, but may
provide a starting point. For the cases where a precondition, invariant, or
postcondition only involves a single parameter, attribute, or the return
value (respectively) and it's reasonably simple, one could write it as an
expression acting directly on the type annotation:

def encabulate(
        reactive_inductance: 1 >= float > 0,   # description
        capacitive_diractance: int > 1,  # description
        delta_winding: bool  # description
        ) -> len(Set[DingleArm]) > 0:  # ??? I don't know how you would handle more complex objects...
    do_stuff
    with_things
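
The annotation syntax above is not valid Python today, but a rough approximation of the idea can be sketched with typing.Annotated (added in Python 3.9, so after this thread). Everything here is an assumption made for illustration: the check_annotations decorator is hypothetical, the predicates are plain lambdas stored as annotation metadata, and the function body and return type are placeholders.

```python
import functools
import inspect
from typing import Annotated, get_type_hints


def check_annotations(func):
    """Run every callable stored in Annotated metadata against its argument."""
    hints = get_type_hints(func, include_extras=True)
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        for name, value in bound.arguments.items():
            # Plain annotations such as bool carry no __metadata__ attribute.
            for predicate in getattr(hints.get(name), "__metadata__", ()):
                if not predicate(value):
                    raise ValueError(f"{name}={value!r} violates its annotation")
        return func(*args, **kwargs)
    return wrapper


@check_annotations
def encabulate(
    reactive_inductance: Annotated[float, lambda v: 0 < v <= 1],
    capacitive_diractance: Annotated[int, lambda v: v > 1],
    delta_winding: bool,
) -> float:
    return reactive_inductance * capacitive_diractance  # placeholder body
```

This only covers single-parameter preconditions, which is the narrow subset of cases described above; the return-value condition is left out.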


Anyway. Just more food for thought...


On Tue, Nov 27, 2018 at 10:47 PM Abe Dillon  wrote:

> I've been pulling a lot of ideas from the recent discussion on design by
> contract (DBC), the elegance and drawbacks of doctests, and the amazing
> talk given by Hillel Wayne at this year's PyCon entitled "Beyond Unit
> Tests: Taking your Tests to the Next Level".
>
> To recap a lot of previous discussions:
>
> - Documentation should tell you:
> A) What a variable represents
> B) What kind of thing a variable is
> C) The acceptable values a variable can take
>
> - Typing and Tests can partially take the place of documentation by
> filling in B and C (respectively) and sometimes A can be inferred from
> decent naming and context.
>
> - Contracts can take the place of many tests (especially when combined
> with a library like hypothesis)
>
> - Contracts/assertions can provide "stable" documentation in the sense
> that it can't get out of sync with the code.
>
> - Attempts to implement contracts using standard Python syntax are verbose
> and noisy because they rely heavily on decorators that add a lot of
> repetitive preamble to the methods being decorated. They may also require a
> metaclass which restricts their use to code that doesn't already use a
> metaclass.
>
> - There was some discussion about the importance of "what a variable
> represents" which pointed to this article by Philip J. Guo (author of
> the magnificent pythontutor.com). I believe Guo's usage of "in-the-small"
> and "in-the-large" are confusing because a well decoupled program shouldn't
> yield functions that know or care how they're being used in the grand
> machinations of your project. The examples he gives are of functions that
> could use a doc string and some type annotations, but don't actually say
> how they relate to the rest of the project.
>
> One thing that caught me about Hillel Wayne's talk was that some of his
> examples were close to needing practically no code. He starts with:
>
> def tail(lst: List[Any]) -> List[Any]:
>   assert len(lst) > 0, "precondition"
>   result = lst[1:]
>   assert [lst[0]] + result == lst, "postcondition"
>   return result
>
> He then re-writes the function using a contracts library:
>
> @require("lst must not be empty", lambda args: len(args.lst) > 0)
> @ensure("result is tail of lst", lambda args, result: [args.lst[0]] +
> result == args.lst)
> def tail(lst: List[Any]) -> List[Any]:
>   return lst[1:]
>
> He then writes a unit test for the function:
>
> @given(lists(integers(), 1))
> def test_tail(lst):
>   tail(lst)
>
> What strikes me as interesting is that the test pretty-much doesn't need
> to be written. The 'given' statement should be redundant based on the type
> annotation and the precondition. Anyone who knows hypothesis, just imagine
> the @require is a hypothesis 'assume' call. Furthermore, hypothesis should
> be able to build strategies for more complex objects based on class
> invariants and attribute types:
>
> @invariant("no overdrafts", lambda self: self.balance >= 0)
> class Account:
>   def __init__(self, number: int, balance: float = 0):
>     super().__init__()
>     self.number: int = number
>     self.balance: float = balance
>
> A library like hypothesis should be able to generate valid account
> objects. Hypothesis also has stateful testing, but I think
> the implementation could use some work. As it is, you have to inherit from a
> class that uses a metaclass AND you have to pollute your class's name-space
> with helper objects and methods.
>
> If we could figure out a cleaner syntax for defining invariants,
> preconditions, and postconditions we'd be half-way to automated testing
> UTOPIA! (ok, maybe I'm being a little over-zealous)
>
> I think there are two missing pieces to this testing problem: side-effect
> verification and failure verification.
>
> Failure verification should test that the expected exceptions get thrown
> when known bad data is passed in or when an object is put in a known
> illegal state. This should be doable by allowing Hypothesis to probe the

Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread Abe Dillon
OK. I know I made a mistake by saying, "computers are very good at
*exhaustively* searching multidimensional spaces." I should have said,
"computers are very good at enumerating examples from multi-dimensional
spaces" or something to that effect. Now that we've had our fun, can you
guys please continue in a forked conversation so it doesn't derail the
conversation?

On Wed, Nov 28, 2018 at 7:47 PM David Mertz  wrote:

> I was assuming it was a Numba-ized function since it's purely numeric. ;-)
>
> FWIW, the theoretical size of Python ints is limited by the fact that
> 'int.bit_length()' returns a platform-native int. So my system cannot store
> ints larger than (2**(2**63-1)). It'll take a lot more memory than my measly
> 4GiB to store that number though.
>
> So yes, that's way longer than heat-death-of-universe even before 128-bit
> machines are widespread.
>
> On Wed, Nov 28, 2018, 6:43 PM Antoine Pitrou 
>>
>> But Python integers are variable-sized, and their size is basically
>> limited by available memory or address space.
>>
>> Let's take a typical 64-bit Python build, assuming 4 GB RAM available.
>> Let's also assume that 90% of those 4 GB can be readily allocated for
>> Python objects (there's overhead, etc.).
>>
>> Also let's take a look at the Python integer representation:
>>
>> >>> sys.int_info
>> sys.int_info(bits_per_digit=30, sizeof_digit=4)
>>
>> This means that every 4 bytes of integer object store 30 bit of actual
>> integer data.
>>
>> So, how many bits has the largest allocatable integer on that system,
>> assuming 90% of 4 GB are available for allocation?
>>
>> >>> nbits = (2**32)*0.9*30/4
>> >>> nbits
>> 28991029248.0
>>
>> Now how many possible integers are there in that number of bits?
>>
>> >>> x = 1 << int(nbits)
>> >>> x.bit_length()
>> 28991029249
>>
>> (yes, that number was successfully allocated in full.  And the Python
>> process occupies 3.7 GB RAM at that point, which validates the estimate.)
>>
>> Let's try to have a readable approximation of that number.  Convert it
>> to a float perhaps?
>>
>> >>> float(x)
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> OverflowError: int too large to convert to float
>>
>> Well, of course.  So let's just extract a power of 10:
>>
>> >>> math.log10(x)
>> 8727169408.819794
>> >>> 10**0.819794
>> 6.603801339268099
>>
>> (yes, math.log10() works on non-float-convertible integers.  I'm
>> impressed!)
>>
>> So the number of representable integers on that system is approximately
>> 6.6e8727169408.  Let's hope the Sun takes its time.
>>
>> (and of course, what is true for ints is true for any variable-sized
>> input, such as strings, lists, dicts, sets, etc.)
>>
>> Regards
>>
>> Antoine.
>>
>>
>> Le 29/11/2018 à 00:24, David Mertz a écrit :
>> > That's easy, Antoine. On a reasonable modern multi-core workstation, I
>> > can do 4 billion additions per second. A year is just over 30 million
>> > seconds. For 32-bit ints, I can whiz through the task in only 130,000
>> > years. We have at least several hundred million years before the sun
>> > engulfs us.
>> >
>> > On Wed, Nov 28, 2018, 5:09 PM Antoine Pitrou wrote:
>> >
>> > On Wed, 28 Nov 2018 15:58:24 -0600
>> > Abe Dillon wrote:
>> > > Thirdly, Computers are very good at exhaustively searching
>> > > multidimensional spaces.
>> >
>> > How long do you think it will take your computer to exhaustively search
>> > the space of possible input values to a 2-integer addition function?
>> >
>> > Do you think it can finish before the Earth gets engulfed by the Sun?
>> >
>> > Regards
>> >
>> > Antoine.
>> >
>> >


Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread David Mertz
I was assuming it was a Numba-ized function since it's purely numeric. ;-)

FWIW, the theoretical size of Python ints is limited by the fact that
'int.bit_length()' returns a platform-native int. So my system cannot store
ints larger than (2**(2**63-1)). It'll take a lot more memory than my measly
4GiB to store that number though.

So yes, that's way longer than heat-death-of-universe even before 128-bit
machines are widespread.

On Wed, Nov 28, 2018, 6:43 PM Antoine Pitrou 
> But Python integers are variable-sized, and their size is basically
> limited by available memory or address space.
>
> Let's take a typical 64-bit Python build, assuming 4 GB RAM available.
> Let's also assume that 90% of those 4 GB can be readily allocated for
> Python objects (there's overhead, etc.).
>
> Also let's take a look at the Python integer representation:
>
> >>> sys.int_info
> sys.int_info(bits_per_digit=30, sizeof_digit=4)
>
> This means that every 4 bytes of integer object store 30 bit of actual
> integer data.
>
> So, how many bits has the largest allocatable integer on that system,
> assuming 90% of 4 GB are available for allocation?
>
> >>> nbits = (2**32)*0.9*30/4
> >>> nbits
> 28991029248.0
>
> Now how many possible integers are there in that number of bits?
>
> >>> x = 1 << int(nbits)
> >>> x.bit_length()
> 28991029249
>
> (yes, that number was successfully allocated in full.  And the Python
> process occupies 3.7 GB RAM at that point, which validates the estimate.)
>
> Let's try to have a readable approximation of that number.  Convert it
> to a float perhaps?
>
> >>> float(x)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> OverflowError: int too large to convert to float
>
> Well, of course.  So let's just extract a power of 10:
>
> >>> math.log10(x)
> 8727169408.819794
> >>> 10**0.819794
> 6.603801339268099
>
> (yes, math.log10() works on non-float-convertible integers.  I'm
> impressed!)
>
> So the number of representable integers on that system is approximately
> 6.6e8727169408.  Let's hope the Sun takes its time.
>
> (and of course, what is true for ints is true for any variable-sized
> input, such as strings, lists, dicts, sets, etc.)
>
> Regards
>
> Antoine.
>
>
> Le 29/11/2018 à 00:24, David Mertz a écrit :
> > That's easy, Antoine. On a reasonable modern multi-core workstation, I
> > can do 4 billion additions per second. A year is just over 30 million
> > seconds. For 32-bit ints, I can whiz through the task in only 130,000
> > years. We have at least several hundred million years before the sun
> > engulfs us.
> >
> > On Wed, Nov 28, 2018, 5:09 PM Antoine Pitrou wrote:
> >
> > On Wed, 28 Nov 2018 15:58:24 -0600
> > Abe Dillon wrote:
> > > Thirdly, Computers are very good at exhaustively searching
> > > multidimensional spaces.
> >
> > How long do you think it will take your computer to exhaustively search
> > the space of possible input values to a 2-integer addition function?
> >
> > Do you think it can finish before the Earth gets engulfed by the Sun?
> >
> > Regards
> >
> > Antoine.
> >
> >


Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread Abe Dillon
Marko, I have a few thoughts that might improve icontract.
First, multiple clauses per decorator:

@pre(
    lambda x: x >= 0,
    lambda y: y >= 0,
    lambda width: width >= 0,
    lambda height: height >= 0,
    lambda x, width, img: x + width <= width_of(img),
    lambda y, height, img: y + height <= height_of(img))
@post(
    lambda self: (self.x, self.y) in self,
    lambda self: (self.x+self.width-1, self.y+self.height-1) in self,
    lambda self: (self.x+self.width, self.y+self.height) not in self)
def __init__(self, img: np.ndarray, x: int, y: int, width: int, height: int) -> None:
    self.img = img[y : y+height, x : x+width].copy()
    self.x = x
    self.y = y
    self.width = width
    self.height = height

def __contains__(self, pt: Tuple[int, int]) -> bool:
    x, y = pt
    return (self.x <= x < self.x + self.width) and (self.y <= y < self.y + self.height)


You might be able to get away with some magic by decorating a method just
to flag it as using contracts:


@contract  # <- does byte-code and/or AST voodoo
def __init__(self, img: np.ndarray, x: int, y: int, width: int, height: int) -> None:
    pre(x >= 0,
        y >= 0,
        width >= 0,
        height >= 0,
        x + width <= width_of(img),
        y + height <= height_of(img))

    # this would probably be declared at the class level
    inv(lambda self: (self.x, self.y) in self,
        lambda self: (self.x+self.width-1, self.y+self.height-1) in self,
        lambda self: (self.x+self.width, self.y+self.height) not in self)

    self.img = img[y : y+height, x : x+width].copy()
    self.x = x
    self.y = y
    self.width = width
    self.height = height

That might be super tricky to implement, but it saves you some lambda
noise. Also, I saw a forked thread in which you were considering some sort
of transpiler with similar syntax to the above example. That also works.
Another thing to consider is that the role of descriptors overlaps
somewhat with the role of invariants. I don't know what to do with that
knowledge, but it seems like it might be useful.

Anyway, I hope those half-baked thoughts have *some* value...

On Wed, Nov 28, 2018 at 1:12 AM Marko Ristin-Kaufmann <
marko.ris...@gmail.com> wrote:

> Hi Abe,
>
>> I've been pulling a lot of ideas from the recent discussion on design by
>> contract (DBC), the elegance and drawbacks of doctests, and the amazing
>> talk given by Hillel Wayne at this year's PyCon entitled "Beyond Unit
>> Tests: Taking your Tests to the Next Level".
>>
>
> Have you looked at the recent discussions regarding design-by-contract on
> this list (
> https://groups.google.com/forum/m/#!topic/python-ideas/JtMgpSyODTU
> and the following forked threads)?
>
> You might want to have a look at static checking techniques such as
> abstract interpretation. I hope to be able to work on such a tool for
> Python in some two years from now. We can stay in touch if you are
> interested.
>
> Re decorators: to my own surprise, using decorators in a larger code base
> is completely practical, including the readability and maintenance of the
> code. It's neither as ugly nor as problematic as it might seem at first
> glance.
>
> We use our https://github.com/Parquery/icontract at the company. Most of
> the design choices come from practical issues we faced -- so you might want
> to read the doc even if you don't plan to use the library.
>
> Some of the aspects we still haven't figured out are: how to approach
> multi-threading (locking around the whole function with an additional
> decorator?) and granularity of contract switches (right now we use
> always/optimized, production/non-optimized and testing/slow, but it seems
> that a larger system requires finer categories).
>
> Cheers Marko
>
>
>
>


Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread Marcos Eliziario
But nobody is talking about exhausting the combinatorial space of all
possible values. Property-based testing looks like fuzz testing but it is
not quite the same thing.

Property-based testing is not about just generating random values till the
heat death of the universe, but generating sensible values in a
configurable way to cover all equivalence classes we can think of. If my
function takes two floating point numbers as arguments, hypothesis
"strategies" won't try all possible combinations of all possible floating
point values, but instead all possible combinations of interesting values
(NaN, Infinity, too big, too small, positive, negative, zero, None, decimal
fractions, etc..), something that an experienced programmer probably would
end up doing by himself with a lot of test cases, but that can be better
done with less effort by the automation provided by the hypothesis package.
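
As a rough illustration of "combinations of interesting values", here is a minimal stdlib-only sketch. It is not how Hypothesis works internally: the INTERESTING_FLOATS list, the add function and the check_commutative helper are assumptions chosen for the example.

```python
import itertools
import math

# A hand-picked set of edge cases, standing in for a generated strategy.
INTERESTING_FLOATS = [
    0.0, -0.0, 1.0, -1.0, 0.1,
    1e308, -1e308, 5e-324,
    math.inf, -math.inf, math.nan,
]


def add(a, b):
    return a + b


def check_commutative(func, values):
    """Return every pair (a, b) for which func(a, b) != func(b, a)."""
    failures = []
    for a, b in itertools.product(values, repeat=2):
        left, right = func(a, b), func(b, a)
        # NaN never compares equal to itself, so treat NaN/NaN as agreement.
        both_nan = math.isnan(left) and math.isnan(right)
        if not (both_nan or left == right):
            failures.append((a, b))
    return failures
```

Eleven values give only 121 pairs, yet they cover overflow to infinity, inf + -inf, signed zero and NaN, which a purely random search would rarely hit.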

It could well be that, just by using such a tool, a naive programmer ends up
convinced that he would be better served by sticking to Decimal
arithmetic :-)




Em qua, 28 de nov de 2018 às 21:43, Antoine Pitrou 
escreveu:

>
> But Python integers are variable-sized, and their size is basically
> limited by available memory or address space.
>
> Let's take a typical 64-bit Python build, assuming 4 GB RAM available.
> Let's also assume that 90% of those 4 GB can be readily allocated for
> Python objects (there's overhead, etc.).
>
> Also let's take a look at the Python integer representation:
>
> >>> sys.int_info
> sys.int_info(bits_per_digit=30, sizeof_digit=4)
>
> This means that every 4 bytes of integer object store 30 bit of actual
> integer data.
>
> So, how many bits has the largest allocatable integer on that system,
> assuming 90% of 4 GB are available for allocation?
>
> >>> nbits = (2**32)*0.9*30/4
> >>> nbits
> 28991029248.0
>
> Now how many possible integers are there in that number of bits?
>
> >>> x = 1 << int(nbits)
> >>> x.bit_length()
> 28991029249
>
> (yes, that number was successfully allocated in full.  And the Python
> process occupies 3.7 GB RAM at that point, which validates the estimate.)
>
> Let's try to have a readable approximation of that number.  Convert it
> to a float perhaps?
>
> >>> float(x)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> OverflowError: int too large to convert to float
>
> Well, of course.  So let's just extract a power of 10:
>
> >>> math.log10(x)
> 8727169408.819794
> >>> 10**0.819794
> 6.603801339268099
>
> (yes, math.log10() works on non-float-convertible integers.  I'm
> impressed!)
>
> So the number of representable integers on that system is approximately
> 6.6e8727169408.  Let's hope the Sun takes its time.
>
> (and of course, what is true for ints is true for any variable-sized
> input, such as strings, lists, dicts, sets, etc.)
>
> Regards
>
> Antoine.
>
>
> Le 29/11/2018 à 00:24, David Mertz a écrit :
> > That's easy, Antoine. On a reasonable modern multi-core workstation, I
> > can do 4 billion additions per second. A year is just over 30 million
> > seconds. For 32-bit ints, I can whiz through the task in only 130,000
> > years. We have at least several hundred million years before the sun
> > engulfs us.
> >
> > On Wed, Nov 28, 2018, 5:09 PM Antoine Pitrou wrote:
> >
> > On Wed, 28 Nov 2018 15:58:24 -0600
> > Abe Dillon wrote:
> > > Thirdly, Computers are very good at exhaustively searching
> > > multidimensional spaces.
> >
> > How long do you think it will take your computer to exhaustively search
> > the space of possible input values to a 2-integer addition function?
> >
> > Do you think it can finish before the Earth gets engulfed by the Sun?
> >
> > Regards
> >
> > Antoine.
> >
> >


-- 
Marcos Eliziário Santos


Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread Antoine Pitrou

But Python integers are variable-sized, and their size is basically
limited by available memory or address space.

Let's take a typical 64-bit Python build, assuming 4 GB RAM available.
Let's also assume that 90% of those 4 GB can be readily allocated for
Python objects (there's overhead, etc.).

Also let's take a look at the Python integer representation:

>>> sys.int_info
sys.int_info(bits_per_digit=30, sizeof_digit=4)

This means that every 4 bytes of an integer object store 30 bits of actual
integer data.

So, how many bits has the largest allocatable integer on that system,
assuming 90% of 4 GB are available for allocation?

>>> nbits = (2**32)*0.9*30/4
>>> nbits
28991029248.0

Now how many possible integers are there in that number of bits?

>>> x = 1 << int(nbits)
>>> x.bit_length()
28991029249

(yes, that number was successfully allocated in full.  And the Python
process occupies 3.7 GB RAM at that point, which validates the estimate.)

Let's try to have a readable approximation of that number.  Convert it
to a float perhaps?

>>> float(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: int too large to convert to float

Well, of course.  So let's just extract a power of 10:

>>> math.log10(x)
8727169408.819794
>>> 10**0.819794
6.603801339268099

(yes, math.log10() works on non-float-convertible integers.  I'm impressed!)

So the number of representable integers on that system is approximately
6.6e8727169408.  Let's hope the Sun takes its time.

(and of course, what is true for ints is true for any variable-sized
input, such as strings, lists, dicts, sets, etc.)
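
The same estimate can be sketched without allocating a multi-gigabyte
integer, because log10(2**n) equals n*log10(2). This is only an illustration
under the assumptions stated above (4 GB of RAM, 90% of it usable, 30 data
bits per 4-byte digit); the variable names are made up for the example.

```python
import math

# Assumptions taken from the estimate above: 4 GB of RAM, 90% of it usable,
# and 30 bits of integer data stored in every 4-byte digit.
nbits = int((2**32) * 0.9 * 30 / 4)

# log10(2**nbits) == nbits * log10(2), so no huge integer has to be built.
log10_count = nbits * math.log10(2)
exponent = math.floor(log10_count)
mantissa = 10 ** (log10_count - exponent)

print(f"about {mantissa:.1f}e{exponent} representable integers")
```

Working in the log domain keeps the memory use constant, at the cost of a
few digits of precision in the mantissa.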

Regards

Antoine.


Le 29/11/2018 à 00:24, David Mertz a écrit :
> That's easy, Antoine. On a reasonable modern multi-core workstation, I
> can do 4 billion additions per second. A year is just over 30 million
> seconds. For 32-bit ints, I can whiz through the task in only 130,000
> years. We have at least several hundred million years before the sun
> engulfs us.
> 
> On Wed, Nov 28, 2018, 5:09 PM Antoine Pitrou wrote:
> 
> On Wed, 28 Nov 2018 15:58:24 -0600
> Abe Dillon <abedil...@gmail.com> wrote:
> > Thirdly, Computers are very good at exhaustively searching
> > multidimensional spaces.
> 
> How long do you think it will take your computer to exhaustively search
> the space of possible input values to a 2-integer addition function?
> 
> Do you think it can finish before the Earth gets engulfed by the Sun?
> 
> Regards
> 
> Antoine.
> 


Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread Chris Angelico
On Thu, Nov 29, 2018 at 10:25 AM David Mertz  wrote:
>
> That's easy, Antoine. On a reasonable modern multi-core workstation, I can do 
> 4 billion additions per second. A year is just over 30 million seconds. For 
> 32-bit ints, I can whiz through the task in only 130,000 years. We have at 
> least several hundred million years before the sun engulfs us.
>

Python ints are not 32-bit ints. Have fun. :)

ChrisA


Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread David Mertz
That's easy, Antoine. On a reasonable modern multi-core workstation, I can
do 4 billion additions per second. A year is just over 30 million seconds.
For 32-bit ints, I can whiz through the task in only 130,000 years. We have
at least several hundred million years before the sun engulfs us.
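
A quick check of this arithmetic, as a sketch that assumes "the task" means
adding every pair of 32-bit operands (2**64 additions) at the stated rate.
The constant names are invented for the example. Under exactly those
assumptions the result is roughly 146 years, so the 130,000-year figure
presumably rests on a slower rate or a different input space.

```python
ADDS_PER_SECOND = 4e9                    # "4 billion additions per second"
SECONDS_PER_YEAR = 365.25 * 24 * 3600    # just over 30 million seconds

pairs = (2**32) ** 2                     # every (a, b) pair of 32-bit ints
years = pairs / ADDS_PER_SECOND / SECONDS_PER_YEAR

print(f"{years:.0f} years to add every pair of 32-bit ints")
```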

On Wed, Nov 28, 2018, 5:09 PM Antoine Pitrou wrote:

> On Wed, 28 Nov 2018 15:58:24 -0600
> Abe Dillon wrote:
> > Thirdly, Computers are very good at exhaustively searching
> > multidimensional spaces.
>
> How long do you think it will take your computer to exhaustively search
> the space of possible input values to a 2-integer addition function?
>
> Do you think it can finish before the Earth gets engulfed by the Sun?
>
> Regards
>
> Antoine.
>


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Abe Dillon
I raised a related problem a while back when I found that random.sample can
only take a sequence. The example I gave was randomly sampling points on a
2D grid to initialize a board for Conway's Game of Life:

>>> def random_board(height: int, width: int, ratio: float = 0.5) -> Set[Tuple[int, int]]:
...     """Produce a set of points randomly chosen from a height x width grid."""
...     all_points = itertools.product(range(height), range(width))
...     num_samples = int(ratio*height*width)
...     return set(random.sample(all_points, num_samples))
...
>>> random_board(height=5, width=10, ratio=0.25)
TypeError: Population must be a sequence or set.  For dicts, use list(d).

It seems like there should be some way to pass along the information that
the size *is* known, but I couldn't think of any way of passing that info
along without adding massive amounts of complexity everywhere.
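
One workaround for the grid case, sketched here as an assumption rather than
a general answer: range() is a sequence with a known length, so
random.sample can draw flat indices from it, and divmod can turn each index
back into a (row, column) pair. It sidesteps the problem only because the
grid size is easy to compute up front.

```python
import random
from typing import Set, Tuple


def random_board(height: int, width: int, ratio: float = 0.5) -> Set[Tuple[int, int]]:
    """Sample grid points without materializing the whole grid."""
    num_samples = int(ratio * height * width)
    # range() supports len() and indexing, so random.sample accepts it.
    flat_indices = random.sample(range(height * width), num_samples)
    # divmod(i, width) maps a flat index back to (row, column).
    return {divmod(i, width) for i in flat_indices}


board = random_board(height=5, width=10, ratio=0.25)
print(len(board))
```

This does not help with arbitrary iterators, which is the harder problem
described above.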

If map is able to support len() under certain circumstances, it makes sense
that other iterators and generators would be able to do the same. You might
even want a way to annotate a generator function with logic about how it
might support len().

I don't have an answer to this problem, but I hope this provides some sense
of the scope of what you're asking.

On Mon, Nov 26, 2018 at 3:36 PM Kale Kundert  wrote:

> I just ran into the following behavior, and found it surprising:
>
> >>> len(map(float, [1,2,3]))
> TypeError: object of type 'map' has no len()
>
> I understand that map() could be given an infinite sequence and therefore
> might not always have a length.  But in this case, it seems like map()
> should've known that its length was 3.  I also understand that I can just
> call list() on the whole thing and get a list, but the nice thing about
> map() is that it doesn't copy data, so it's unfortunate to lose that
> advantage for no particular reason.
>
> My proposal is to delegate map.__len__() to the underlying iterable.
> Similarly, map.__getitem__() could be implemented if the underlying
> iterable supports item access:
>
> class map:
>
>     def __init__(self, func, iterable):
>         self.func = func
>         self.iterable = iterable
>
>     def __iter__(self):
>         yield from (self.func(x) for x in self.iterable)
>
>     def __len__(self):
>         return len(self.iterable)
>
>     def __getitem__(self, key):
>         return self.func(self.iterable[key])
>
> Let me know if there are any downsides to this that I'm not seeing.  From my
> perspective, it seems like there would be only a number of (small)
> advantages:
>
> - Less surprising
> - Avoid some unnecessary copies
> - Backwards compatible
>
> -Kale


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Greg Ewing

E. Madison Bray wrote:

So I might want to check:

    finite_definite = True
    for it in my_map.iters:
        try:
            len(it)
        except TypeError:
            finite_definite = False

    if finite_definite:
        my_seq = list(my_map)
    else:
        ...  # some other algorithm


If map is being passed into your function, you can still do this
check before calling map.

If the user is doing the mapping themselves, then in Python 2 it
would have blown up anyway before your function even got called,
so nothing is any worse.

--
Greg




Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Greg Ewing

E. Madison Bray wrote:

I still believe
that the actual proposal of making the arguments to a map(...) call
accessible from Python as attributes of the map object (ditto filter,
zip, etc.) is useful in its own right, rather than just having this
completely opaque iterator.


But it will only help if the user passes a map object in particular,
and not some other kind of iterator. Also it won't help if the
inputs to the map are themselves iterators that aren't amenable to
inspection. This smells like exposing an implementation detail of your
function in its API.

I don't see how it would help with your Sage port either, since
the original code only got the result of the mapping and wouldn't
have been able to inspect the underlying iterables.

I wonder whether it's too late to redefine map() so that it returns
a view object instead of an iterator, as was done when merging
dict.{items, iteritems} etc.

Alternatively, add a mapped() builtin that returns a view.
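
A rough sketch of what such a view could look like, assuming it wraps a
single sequence. The `mapped` class below is a hypothetical pure-Python
illustration, not an existing builtin, and it deliberately ignores the
multi-iterable form of map().

```python
from collections.abc import Sequence


class mapped(Sequence):
    """Hypothetical lazy view that applies func to items of a sequence."""

    def __init__(self, func, seq):
        self._func = func
        self._seq = seq

    def __len__(self):
        # The length is delegated to the underlying sequence.
        return len(self._seq)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing returns another view over the sliced sequence.
            return mapped(self._func, self._seq[index])
        return self._func(self._seq[index])


view = mapped(float, [1, 2, 3])
print(len(view), list(view), list(view))
```

Unlike an iterator, the view can be traversed more than once, because
Sequence derives iteration from __getitem__ and __len__.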

--
Greg


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Steven D'Aprano
On Wed, Nov 28, 2018 at 02:53:50PM -0500, Terry Reedy wrote:

> One of the guidelines in the Zen of Python is
> "Special cases aren't special enough to break the rules."
> 
> This proposal claims that the Python 3 built-in iterator class 'map' is 
> so special that it should break the rule that iterators in general 
> cannot and therefore do not have .__len__ methods because their size may 
> be infinite, unknowable until exhaustion, or declining with each 
> .__next__ call.
> 
> For iterators, 3.4 added an optional __length_hint__ method.  This makes 
> sense for iterators, like tuple_iterator, list_iterator, range_iterator, 
> and dict_keyiterator, based on a known finite collection.  At the time, 
> map.__length_hint__ was proposed and rejected as problematic, for 
> obvious reasons, and insufficiently useful.

Thanks for the background Terry, but doesn't that suggest that sometimes 
special cases ARE special enough to break the rules? *wink*

Unfortunately, I don't think it is obvious why map.__length_hint__ is 
problematic. It only needs to return the *maximum* length, or some 
sentinel (zero?) to say "I don't know". It doesn't need to be accurate, 
unlike __len__ itself.

Perhaps we should rethink the decision not to give map() and filter() a 
length hint?


[...]
> What makes the map class special among all built-in iterator classes? 
> It appears not to be a property of the class itself, as an iterator 
> class, but of its name.  In Python 2, 'map' was bound to a different 
> implementation of the map idea, a function that produced a list, which 
> has a length.  I suspect that if Python 3 were the original Python, we 
> would not have this discussion.

No, in fairness, I too have often wanted to know the length of an 
arbitrary iterator, including map(), without consuming it. In general 
this is an unsolvable problem, but sometimes it is (or at least, at first 
glance *seems*) solvable. map() is one of those cases.

If we could solve it, that would be great -- but I'm not convinced that 
it is solvable, since the solution seems worse than the problem it aims 
to solve. But I live in hope that somebody cleverer than me can point 
out the flaws in my argument.


[...]
> If a function is documented as requiring a list, or a sequence, or a 
> length object, it is a user bug to pass an iterator.  The only thing 
> special about map and filter as errors is the rebinding of the names 
> between Py2 and Py3, so that the same code may be good in 2.x and bad in 
> 3.x.
> 
> Perhaps 2.7, in addition to future imports of text as unicode and print 
> as a function, should have had one to make map and filter be the 3.x 
> iterators.

I think that's future_builtins:

[steve@ando ~]$ python2.7 -c "from future_builtins import *; print map(len, [])"
<itertools.imap object at 0x...>

But that wouldn't have helped E. Madison Bray or SageMath, since their 
difficulty is not their own internal use of map(), but their users' use 
of map().

Unless they simply ban any use of iterators at all, which I imagine will 
be a backwards-incompatible change (and for that matter an excessive 
overreaction for many uses), SageMath can't prevent users from providing 
map() objects or other iterator arguments.



-- 
Steve


Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread Abe Dillon
[Antoine Pitrou]

> How long do you think it will take your computer to exhaustively search
> the space of possible input values to a 2-integer addition function?
> Do you think it can finish before the Earth gets engulfed by the Sun?


Yes, ok. I used the word "exhaustively" wrong. Sorry about that.

I don't think humans are made of a magical substance that can exhaustively
search the space of possible pairs of integers before the heat-death of the
universe.
I think humans use strategies based, hopefully, in logic to come up with
test examples, and that it's often more valuable to capture said strategies
in code than to make a human run the algorithms. In cases where
domain-knowledge helps inform the search strategy, there should be
easy-to-use tools to build a domain-specific search strategy.

On Wed, Nov 28, 2018 at 4:09 PM Antoine Pitrou  wrote:

> On Wed, 28 Nov 2018 15:58:24 -0600
> Abe Dillon  wrote:
> > Thirdly, Computers are very good at exhaustively searching
> > multidimensional spaces.
>
> How long do you think it will take your computer to exhaustively search
> the space of possible input values to a 2-integer addition function?
>
> Do you think it can finish before the Earth gets engulfed by the Sun?
>
> Regards
>
> Antoine.
>


Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread Antoine Pitrou
On Wed, 28 Nov 2018 15:58:24 -0600
Abe Dillon  wrote:
> Thirdly, Computers are very good at exhaustively searching multidimensional
> spaces.

How long do you think it will take your computer to exhaustively search
the space of possible input values to a 2-integer addition function?

Do you think it can finish before the Earth gets engulfed by the Sun? 

Regards

Antoine.




Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Steven D'Aprano
On Wed, Nov 28, 2018 at 05:37:39PM +0100, Anders Hovmöller wrote:
> 
> 
> > I just mentioned that porting effort for background.  I still believe
> > that the actual proposal of making the arguments to a map(...) call
> > accessible from Python as attributes of the map object (ditto filter,
> > zip, etc.) is useful in its own right, rather than just having this
> > completely opaque iterator.
> 
> +1.  Throwing away information is almost always a bad idea.

"Almost always"? Let's take this seriously, and think about the 
consequences if we actually believed that. If I created a series of 
integers:

a = 23
b = 0x17
c = 0o27
d = 0b10111
e = int('1b', 12)

your assertion would say it is a bad idea to throw away the information 
about how they were created, and hence we ought to treat all five values 
as distinct and distinguishable. So much for the small integer cache...
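
A quick check that the five spellings above really are indistinguishable
once evaluated (a minimal illustration; nothing here depends on the small
integer cache):

```python
a = 23
b = 0x17
c = 0o27
d = 0b10111
e = int('1b', 12)

# All five expressions evaluate to equal int objects; how each one was
# spelled in the source is not recorded anywhere on the value.
assert a == b == c == d == e
print({a, b, c, d, e})
```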

Perhaps every single object we create ought to hold onto an AST 
representing the literal or expression which was used to create it.

Let's not exaggerate the benefit, and ignore the costs, of "throwing 
away information". Sometimes we absolutely do want to throw away 
information, or at least make it inaccessible to the consumer of our 
data structures.

Sometimes the right thing to do is *not* to open up interfaces unless 
there is a clear need for it to be open. Doing so adds bloat to the 
interface, prevents many changes in implementation including potential 
optimizations, and may carry significant memory burdens.

Bringing this discussion back to the concrete proposal in this thread, 
as I said earlier, I want to agree with this proposal. I too like the 
idea of having map (and filter, and zip...) objects expose their 
arguments, and for the same reason: "it might be useful some day".

But every time we scratch beneath the surface and try to think about how 
and when we would actually use that information, we run into conceptual 
and practical problems which suggest strongly to me that doing this 
will turn it into a serious bug magnet, an anti-feature which sounds 
good but causes more problems than it solves.

I'm really hoping someone can convince me this is a good idea, but so 
far the proposal seems like an attractive nuisance and not a feature.


> We should have information preservation and transparency be general 
> design goals imo. Not because we can see the obvious use now but 
> because it keeps the door open to discover uses later.

While that is a reasonable position to take in some circumstances, 
in others it goes completely against YAGNI.


-- 
Steve


Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread Abe Dillon
[Antoine Pitrou]

> I think utopia is the word here.  Fuzz testing can be useful, but it's
> not a replacement for manual testing of carefully selected values.


First, they aren't mutually exclusive. It's trivial to add manually
selected cases to a hypothesis test.
Second, in my experience, people rarely choose between carefully selected
optimal values and fuzz testing; they usually choose between manually
selected trivial values or no test at all.
Thirdly, Computers are very good at exhaustively searching multidimensional
spaces. If your tool sucks so bad at that that a human can do it better,
then your tool needs work. Improving the tool saves way more time than
reverting to manual testing.

There was a post long ago (I think I read it on Digg.com for some
indication) about how to run a cloud-based system correctly. One of the
controversial practices the article advocated was disabling ssh on the
machine instances. The rationale is that you never want to waste your time
fiddling with an instance that's not behaving properly. In cloud-systems,
instances should not be special. If they fail, blow them away and bring up
another. If the failure persists, it's a problem with the *system* not the
instance. If you care about individual instances YOU'RE DOING IT WRONG. You
need to re-design the system.

On Wed, Nov 28, 2018 at 8:19 AM Antoine Pitrou  wrote:

> On Tue, 27 Nov 2018 22:47:06 -0600
> Abe Dillon  wrote:
> >
> > If we could figure out a cleaner syntax for defining invariants,
> > preconditions, and postconditions we'd be half-way to automated testing
> > UTOPIA! (ok, maybe I'm being a little over-zealous)
>
> I think utopia is the word here.  Fuzz testing can be useful, but it's
> not a replacement for manual testing of carefully selected values.
>
> Also, the idea that fuzz testing will automatically find edge cases in
> your code is idealistic.  It depends on the algorithm you've
> implemented and the distribution of values chosen by the tester.
> Showcasing trivially wrong examples (such as an addition function that
> always returns 0, or a tail function that doesn't return the tail)
> isn't very helpful for a real-world analysis, IMHO.
>
> In the end, you have to be rigorous when writing tests, and for most
> non-trivial functions it requires that you devise the distribution of
> input values depending on the implemented algorithm, not leave that
> distribution to a third-party library that knows nothing about your
> program.
>
> Regards
>
> Antoine.
>


Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread Abe Dillon
[Steven D'Aprano]

> You should look at the state of the art in Design By Contract. In
> Eiffel, DBC is integrated in the language:
> https://www.eiffel.com/values/design-by-contract/introduction/
>
> https://www.eiffel.org/doc/eiffel/ET-_Design_by_Contract_%28tm%29%2C_Assertions_and_Exceptions
>
> Eiffel uses a rather Pythonic block structure to define invariants.
> The syntax is not identical to Python's (Eiffel eschews the colons) but
> it also comes close to executable pseudo-code.


Thank you! I forgot to mention this (or look into how other languages solve
this problem).
I saw your example syntax in the recent DBC main thread and liked it a lot.

One thought I keep coming back to is this comparison between doc string
formats.
It seems obvious that the "Sphinxy" style is the noisiest, most verbose,
and ugliest format.
Instead of putting ":arg ...:" and ":type ...:" for each parameter and the
return value, it makes much more sense to open up an Args: section and use
a concise notation for type.

The decorator-based pre and post conditions seem like they suffer from the
same redundant, noisy, verbosity problem as the Sphinxy docstring format
but make it worse by putting all that noise before the function declaration
itself.

It makes sense to me that a docstring might have a markdown-style syntax
like

def format_exception(etype, value):
    """Format the exception with a traceback.

    Args:
        etype (str): what etype represents
            [some constraint on etype](precondition)
            [another constraint on etype](in_line_precondition?)
        value (int): what value represents
            [some constraint on value](precondition)
        [some constraints across multiple params](precondition)

    Returns:
        What the return value represents  # usually very similar to the description at the top
        [some constraint on return](postcondition)
    """
    ...


That ties most bits of the documentation to some code that enforces the
correctness of the documentation. And if it's a little noisy, we could take
another page from markdown's book and offer alternate ways to reference
precondition and postcondition logic. I'm worried that such a style would
carry a lot of the same drawbacks as doctest.


Also, my sense of coding style has been heavily influenced by [this talk](
https://vimeo.com/74316116), particularly the part where he shoves a
mangled Hamlet Soliloquy into the margins, so now many of my functions
adopt the following style:

def someDescriptiveName(
        arg1: SomeType,
        arg2: AnotherType[Thing],
        ...
        argN: SomeOtherType = default_value) -> ReturnType:
    """
    what the function does

    Args:
        arg1: what arg1 represents
        arg2: what arg2 represents
        ...
    """
    ...

This highlights a rather obvious duplication of code. We declare an
arguments section in code and list all the arguments, then we do so again
in the doc string.
If you want your doc string to stay in sync with the code, this duplication
is a problem. It makes more sense to tie the documentation for an argument
to said argument:

def someDescriptiveName(  # what the function does
        arg1: SomeType,
            # what arg1 represents
        arg2: AnotherType[Thing],
            # what arg2 represents
        ...
        argN: SomeOtherType = default_value
            # what argN represents
        ) -> ReturnType:  # what the return value represents
    ...

I think it especially makes sense if you consider the preconditions,
postconditions, and invariants as a sort-of extension of typing in the
sense that typing narrows the set of acceptable values to a set of types
and contracts restrict that set further.
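
To make the "contracts narrow the set further than types do" idea concrete
with today's syntax, here is a minimal sketch. The `require` decorator and
the `scale` function are hypothetical; this is neither icontract's API nor
the d-string proposal, just an illustration of a precondition layered on top
of a type annotation.

```python
import functools
import inspect


def require(predicate, description):
    """Hypothetical decorator: check predicate against the call's arguments."""
    def decorator(func):
        signature = inspect.signature(func)
        # Names of the arguments the predicate wants to inspect.
        wanted = inspect.signature(predicate).parameters

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            relevant = {name: bound.arguments[name] for name in wanted}
            if not predicate(**relevant):
                raise ValueError(f"precondition failed: {description}")
            return func(*args, **kwargs)
        return wrapper
    return decorator


# The annotation says "any float"; the contract narrows that to [0, 1].
@require(lambda ratio: 0.0 <= ratio <= 1.0, "ratio must be within [0, 1]")
def scale(value: float, ratio: float) -> float:
    return value * ratio
```

Calling scale(10.0, 0.5) passes the check, while scale(10.0, 2.0) raises
ValueError even though 2.0 satisfies the type annotation.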

I hope that clarifies my thought process. I don't like the d-strings that I
proposed. I'd prefer syntax closer to Eiffel, but the above is the line of
thought I was following to arrive at d-strings.



Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Greg Ewing

E. Madison Bray wrote:

if I have a function that used to take, say,
a list as an argument, and it receives a `map` object, I now have to
be able to deal with map()s, and I may have checks I want to perform
on the underlying iterables before, say, I try to iterate over the
`map`.


This sounds like a backwards way to address the issue. If you
have a function that expects a list in particular, it's up to
its callers to make sure they give it one. Instead of making the
function do a bunch of looking before it leaps, it would be
better to define something like

   def lmap(f, *args): return list(map(f, *args))

and then replace 'map' with 'lmap' elsewhere in your code.

--
Greg


Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread Abe Dillon
[Marko Ristin-Kaufmann]
>
> Have you looked at the recent discussions regarding design-by-contract on
> this list


I tried to read through them all before posting, but I may have missed some
of the forks. There was a lot of good discussion!

[Marko Ristin-Kaufmann]

> You might want to have a look at static checking techniques such as
> abstract interpretation. I hope to be able to work on such a tool for
> Python in some two years from now. We can stay in touch if you are
> interested.


I'll look into that! I'm very interested!

[Marko Ristin-Kaufmann]

> Re decorators: to my own surprise, using decorators in a larger code base
> is completely practical including the  readability and maintenance of the
> code. It's neither that ugly nor problematic as it might seem at first look.


Interesting. In the thread you linked on DBC, it seemed like Steve D'Aprano
and David Mertz (and possibly others) were put off by the verbosity and
noisiness of the decorator-based solution you provided with icontract
(though I think there are ways to streamline that solution). It seems like
syntactic support could offer a more concise and less noisy implementation.

One thing that I can get on a soap-box about is the benefit of putting the
most relevant information to the reader in the order of top to bottom and
left to right whenever possible. I've written many posts about this. I
think a lot of Python syntax gets this right. It would have been easy to
follow the same order as for-loops when designing comprehensions, but
expressions allow you some freedom to order things differently, so now
comprehensions read:

squares = ...
# squares is

squares = [...
# squares is a list

squares = [number*number...
# squares is a list of number squared

squares = [number*number for number in numbers]
# squares is a list of number squared 'from' numbers

I think decorators sort-of break this rule because they can put a lot of
less important information (like, that a function is logged or timed)
before more important information (like the function's name, signature,
doc-string, etc...). It's not a huge deal because they tend to be
de-emphasized by my IDE and there typically aren't dozens of them on each
function, but I definitely prefer Eiffel's syntax over
decorators for that reason.

I understand that syntax changes have a very high bar for very good
reasons. Hillel Wayne's PyCon talk got me thinking that we might be close
enough to a really great solution to a wide variety of testing problems
that it might justify some new syntax or perhaps someone has an idea that
wouldn't require new syntax that I didn't think of.

[Marko Ristin-Kaufmann]

> Some of the aspects we still haven't figured out are: how to approach
> multi-threading (locking around the whole function with an additional
> decorator?) and granularity of contract switches (right now we use
> always/optimized, production/non-optimized and testing/slow, but it seems
> that a larger system requires finer categories).


Yeah... I don't know anything about testing concurrent or parallel code.

On Wed, Nov 28, 2018 at 1:12 AM Marko Ristin-Kaufmann <
marko.ris...@gmail.com> wrote:

> Hi Abe,
>
>> I've been pulling a lot of ideas from the recent discussion on design by
>> contract (DBC), the elegance and drawbacks of doctests, and the amazing
>> talk given by Hillel Wayne at this year's PyCon entitled "Beyond Unit
>> Tests: Taking your Tests to the Next Level".
>>
>
> Have you looked at the recent discussions regarding design-by-contract on
> this list (
> https://groups.google.com/forum/m/#!topic/python-ideas/JtMgpSyODTU
> and the following forked threads)?
>
> You might want to have a look at static checking techniques such as
> abstract interpretation. I hope to be able to work on such a tool for
> Python in some two years from now. We can stay in touch if you are
> interested.
>
> Re decorators: to my own surprise, using decorators in a larger code base
> is completely practical including the  readability and maintenance of the
> code. It's neither that ugly nor problematic as it might seem at first look.
>
> We use our https://github.com/Parquery/icontract at the company. Most of
> the design choices come from practical issues we faced -- so you might want
> to read the doc even if you don't plan to use the library.
>
> Some of the aspects we still haven't figured out are: how to approach
> multi-threading (locking around the whole function with an additional
> decorator?) and granularity of contract switches (right now we use
> always/optimized, production/non-optimized and testing/slow, but it seems
> that a larger system requires finer categories).
>
> Cheers Marko
>

Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Terry Reedy

On 11/28/2018 9:27 AM, E. Madison Bray wrote:

On Mon, Nov 26, 2018 at 10:35 PM Kale Kundert  wrote:


I just ran into the following behavior, and found it surprising:


>>> len(map(float, [1,2,3]))

TypeError: object of type 'map' has no len()

I understand that map() could be given an infinite sequence and therefore might 
not always have a length.  But in this case, it seems like map() should've 
known that its length was 3.  I also understand that I can just call list() on 
the whole thing and get a list, but the nice thing about map() is that it 
doesn't copy data, so it's unfortunate to lose that advantage for no particular 
reason.

My proposal is to delegate map.__len__() to the underlying iterable.  


One of the guidelines in the Zen of Python is
"Special cases aren't special enough to break the rules."

This proposal claims that the Python 3 built-in iterator class 'map' is 
so special that it should break the rule that iterators in general 
cannot and therefore do not have .__len__ methods because their size may 
be infinite, unknowable until exhaustion, or declining with each 
.__next__ call.


For iterators, 3.4 added an optional __length_hint__ method.  This makes 
sense for iterators, like tuple_iterator, list_iterator, range_iterator, 
and dict_keyiterator, based on a known finite collection.  At the time, 
map.__length_hint__ was proposed and rejected as problematic, for 
obvious reasons, and insufficiently useful.


The proposal above amounts to adding an unspecified __length_hint__ 
misnamed as __len__.  Won't happen.  Instead, proponents should define 
and test one or more specific implementations of __length_hint__ in map 
subclass(es).
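
As one possible shape for such an experiment, here is a sketch of a map
subclass. The names `hinted_map` and `_iterators` are hypothetical, and
taking the minimum of the inputs' hints is an assumption about what a
sensible hint would be, not a settled design. operator.length_hint() falls
back to 0 for inputs that offer no hint, such as generators.

```python
import operator


class hinted_map(map):
    """Hypothetical map subclass that reports a length hint."""

    def __new__(cls, func, *iterables):
        # Keep references to the same iterators that map will consume,
        # so the hint shrinks as items are drawn.
        iterators = [iter(it) for it in iterables]
        self = super().__new__(cls, func, *iterators)
        self._iterators = iterators
        return self

    def __length_hint__(self):
        # map stops at the shortest input, so use the smallest hint.
        return min(operator.length_hint(it) for it in self._iterators)


m = hinted_map(float, [1, 2, 3])
print(operator.length_hint(m))
next(m)
print(operator.length_hint(m))
```

The hint is advisory only: len(m) still raises TypeError, which keeps the
iterator protocol unchanged.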



I mostly agree with the existing objections, though I have often found
myself wanting this too, especially now that `map` does not simply
return a list.


What makes the map class special among all built-in iterator classes? 
It appears not to be a property of the class itself, as an iterator 
class, but of its name.  In Python 2, 'map' was bound to a different 
implementation of the map idea, a function that produced a list, which 
has a length.  I suspect that if Python 3 were the original Python, we 
would not have this discussion.



> As a simple counter-proposal which I believe has fewer issues, I would
> really like it if the built-in `map()` and `filter()` at least
> provided a Python-level attribute to access the underlying iterables.


This proposes to make map (and filter) special in a different way, by 
adding other special (dunder) attributes.  In general, built-in 
callables do not attach their args to their output, for obvious reasons.
If they do, they do not expose them.  If input data must be saved, the 
details are implementation dependent.  A C-coded callable would not 
necessarily save information in the form of Python objects.


Again, it seems to me that the only thing special about these two, 
versus the other iterators left in itertools, is the history of the names.



> This is necessary because if I have a function that used to take, say,
> a list as an argument, and it receives a `map` object, I now have to
> be able to deal with map()s,


If a function is documented as requiring a list, or a sequence, or a 
length object, it is a user bug to pass an iterator.  The only thing 
special about map and filter as errors is the rebinding of the names 
between Py2 and Py3, so that the same code may be good in 2.x and bad in 
3.x.


Perhaps 2.7, in addition to future imports of text as unicode and print 
as a function, should have had one to make map and filter be the 3.x 
iterators.


Perhaps Sage needs something like

def size_map(func, *iterables):
    for it in iterables:
        if not hasattr(it, '__len__'):
            raise TypeError(f'iterable {repr(it)} has no size')
    return map(func, *iterables)

https://docs.python.org/3/library/functions.html#map says
"map(function, iterable, ...)
Return an iterator [...]"

The wording is intentional.  The fact that map is a class and the 
iterator an instance of the class is a CPython implementation detail. 
Another implementation could use the generator function equivalent given 
in the Python 2 itertools doc, or a translation thereof.  I don't know 
what pypy and other implementations do.  The fact that CPython itertools 
callables are (now) C-coded classes instead of Python-coded generator 
functions, or C translations thereof (which is tricky) is for 
performance and ease of maintenance.


--
Terry Jan Reedy

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Anders Hovmöller



> I just mentioned that porting effort for background.  I still believe
> that the actual proposal of making the arguments to a map(...) call
> accessible from Python as attributes of the map object (ditto filter,
> zip, etc.) is useful in its own right, rather than just having this
> completely opaque iterator.

+1.  Throwing away information is almost always a bad idea. For example, that 
was fixed for classes and kwargs in 3.6, which removed a lot of fiddly 
workarounds.

Throwing away data needlessly is also why 2to3, baron, Parso and probably many 
more had to reimplement a Python parser instead of using the built-in one. 

We should have information preservation and transparency be general design 
goals imo. Not because we can see the obvious use now but because it keeps the 
door open to discover uses later. 

/ Anders


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Jonathan Fine
Hi Madison

Is there a URL somewhere where I can view code written to port sage to
Python3? I've already found
https://trac.sagemath.org/search?q=python3

And because I'm a bit interested in cluster algebra, I went to
https://git.sagemath.org/sage.git/commit/?id=3a6f494ac1d4dbc1e22b0ecbebdbc639f6c7f6d3

Is this a good example of the change required? Are there other examples
worth looking at?

-- 
Jonathan


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread E. Madison Bray
I should add, I know the history here of bitterness surrounding Python 3
complaints and this is not one. I defend most things Python 3 and have
ported many projects (Sage just being the largest by orders of magnitude,
with every Python 3 porting quirk represented and often magnified). I agree
with the new iterable map(), filter(), and zip() and welcomed that change.
But I think making them more introspectable would be a useful enhancement.

On Wed, Nov 28, 2018, 17:16 E. Madison Bray wrote:

> Probably the most prevalent reason it made things *worse* is that many
> functions that can take collections as arguments--in fact probably
> most--were never written to accept arbitrary iterables in the first place.
> Perhaps they should have been, but the majority of that was before my time
> so I and others who worked on the Python 3 port were stuck with that.
>
> Sure the fix is simple enough: check if the object is iterable (itself not
> always as simple as one might assume) and then call list() on it. But we're
> talking thousands upon thousands of functions that need to be updated where
> examples involving map previously would have just worked.
>
> But on top of the obvious workarounds I would now like to do things like
> protect users, where possible, from doing things like passing arbitrarily
> sized data to relatively flimsy C libraries, or as I mentioned in my last
> message make new optimizations that weren't possible before.
>
> Of course this isn't always possible in some cases where dealing with an
> arbitrary opaque iterator, or some pathological cases. But I'm concerned
> more about doing the best we can in the most common cases (lists, tuples,
> vectors, etc) which are *vastly* more common.
>
> I use SageMath as an example but I'm sure others could come up with their
> own clever use cases. I know there are other cases where I've wanted to at
> least try to get the len of a map, at least in cases where it was
> unambiguous (for example making a progress meter or something)
>
> On Wed, Nov 28, 2018, 16:33 Steven D'Aprano wrote:
>> On Wed, Nov 28, 2018 at 04:14:24PM +0100, E. Madison Bray wrote:
>>
>> > For example, some function that used to expect some finite-sized
>> > sequence such as a list or tuple is now passed a "map", possibly
>> > wrapping one or more iterable of arbitrary, possibly non-finite size.
>> > For the purposes of some algorithm I have this is not useful and I
>> > need to convert it to a sequence anyways but don't want to do that
>> > without some guarantee that I won't blow up the user's memory usage.
>> > So I might want to check:
>> >
>> > finite_definite = True
>> > for it in my_map.iters:
>> >     try:
>> >         len(it)
>> >     except TypeError:
>> >         finite_definite = False
>> >
>> > if finite_definite:
>> >     my_seq = list(my_map)
>> > else:
>> >     # some other algorithm
>>
>> But surely you didn't need to do this just because of *map*. Users could
>> have passed an infinite, unsized iterable going back to Python 1 days
>> with the old sequence protocol. They certainly could pass a generator or
>> other opaque iterator apart from map. So I'm having trouble seeing why
>> the Python 2/3 change to map made things worse for SageMath.
>>
>> But in any case, this example comes back to the question of len again,
>> and we've already covered why this is problematic. In case you missed
>> it, let's take a toy example which demonstrates the problem:
>>
>>
>> def mean(it):
>>     if isinstance(it, map):
>>         # Hypothetical attribute access to the underlying iterable.
>>         n = len(it.iterable)
>>         return sum(it)/n
>>
>>
>> Now let's pass a map object to it:
>>
>> data = [1, 2, 3, 4, 5]
>> it = map(lambda x: x, data)
>> assert len(it.iterable) == 5
>> next(it); next(it); next(it)
>>
>> assert mean(it) == 4.5
>> # fails, as it actually returns 9/5 instead of 9/2
>>
>>
>> --
>> Steve


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread E. Madison Bray
Probably the most prevalent reason it made things *worse* is that many
functions that can take collections as arguments--in fact probably
most--were never written to accept arbitrary iterables in the first place.
Perhaps they should have been, but the majority of that was before my time
so I and others who worked on the Python 3 port were stuck with that.

Sure the fix is simple enough: check if the object is iterable (itself not
always as simple as one might assume) and then call list() on it. But we're
talking thousands upon thousands of functions that need to be updated where
examples involving map previously would have just worked.

But on top of the obvious workarounds I would now like to do things like
protect users, where possible, from doing things like passing arbitrarily
sized data to relatively flimsy C libraries, or as I mentioned in my last
message make new optimizations that weren't possible before.

Of course this isn't always possible, for example when dealing with an
arbitrary opaque iterator or with some pathological cases. But I'm concerned
more about doing the best we can in the most common cases (lists, tuples,
vectors, etc) which are *vastly* more common.

I use SageMath as an example but I'm sure others could come up with their
own clever use cases. I know there are other cases where I've wanted to at
least try to get the len of a map, at least in cases where it was
unambiguous (for example making a progress meter or something)

On Wed, Nov 28, 2018, 16:33 Steven D'Aprano wrote:

> On Wed, Nov 28, 2018 at 04:14:24PM +0100, E. Madison Bray wrote:
>
> > For example, some function that used to expect some finite-sized
> > sequence such as a list or tuple is now passed a "map", possibly
> > wrapping one or more iterable of arbitrary, possibly non-finite size.
> > For the purposes of some algorithm I have this is not useful and I
> > need to convert it to a sequence anyways but don't want to do that
> > without some guarantee that I won't blow up the user's memory usage.
> > So I might want to check:
> >
> > finite_definite = True
> > for it in my_map.iters:
> >     try:
> >         len(it)
> >     except TypeError:
> >         finite_definite = False
> >
> > if finite_definite:
> >     my_seq = list(my_map)
> > else:
> >     # some other algorithm
>
> But surely you didn't need to do this just because of *map*. Users could
> have passed an infinite, unsized iterable going back to Python 1 days
> with the old sequence protocol. They certainly could pass a generator or
> other opaque iterator apart from map. So I'm having trouble seeing why
> the Python 2/3 change to map made things worse for SageMath.
>
> But in any case, this example comes back to the question of len again,
> and we've already covered why this is problematic. In case you missed
> it, let's take a toy example which demonstrates the problem:
>
>
> def mean(it):
>     if isinstance(it, map):
>         # Hypothetical attribute access to the underlying iterable.
>         n = len(it.iterable)
>         return sum(it)/n
>
>
> Now let's pass a map object to it:
>
> data = [1, 2, 3, 4, 5]
> it = map(lambda x: x, data)
> assert len(it.iterable) == 5
> next(it); next(it); next(it)
>
> assert mean(it) == 4.5
> # fails, as it actually returns 9/5 instead of 9/2
>
>
> --
> Steve


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread E. Madison Bray
One thing I'd like to add real quick to this (I'm on my phone so apologies
for crappy quoting):

Although there are existing cases where there is a loss of efficiency over
Python 2 map() when dealing with the opaque, iterable Python 3 map(), the
latter also presents many opportunities for enhancements that weren't
possible before.

For example, previously a user might pass map(func, some_list) where func
is some pure function and the iterable is almost always a list of some
kind. Previously that map() call would be evaluated (often slowly) first.

But now we can treat a map as something a little more formal, as a
container for a function and one or more iterables, which happens to have
this special functionality when you iterate over it, but is otherwise just
a special container. This is technically already the case, we just can't
directly access it as a container. If we could, it would be possible to
implement various optimizations that might not otherwise have been
obvious to the user. This is especially the case if the iterable is a
simple list, which is something we can check. The function in this case
very likely might actually be a C function that was wrapped with Cython. I
can easily convert this on the user's behalf to a simple C loop or possibly
even some other more optimal vectorized code.

These are application-specific special cases of course, but many such cases
become easily accessible if map() and friends are usable as specialized
containers.
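
A rough sketch of the kind of dispatch described above, assuming the function and the iterable had already been pulled out of the map object (which is exactly the part that is not possible today). The names `FAST_PATHS`, `batch_sqrt` and `evaluate` are hypothetical, and the "batch" implementation is a plain Python stand-in for a compiled or vectorized loop.

```python
import math


def batch_sqrt(values):
    # Stand-in for a C loop or vectorized routine over the whole list.
    return [math.sqrt(v) for v in values]


# Hypothetical registry: known pure function -> batch implementation.
FAST_PATHS = {math.sqrt: batch_sqrt}


def evaluate(func, iterable):
    """Use a batch implementation when the inputs are simple enough."""
    fast = FAST_PATHS.get(func) if isinstance(iterable, list) else None
    if fast is not None:
        return fast(iterable)
    # Anything unrecognised falls back to ordinary element-wise mapping.
    return list(map(func, iterable))


print(evaluate(math.sqrt, [4.0, 9.0]))  # takes the batch path
print(evaluate(abs, [-1, 2]))           # takes the generic path
```

The registry lookup needs the function to be hashable, which holds for ordinary functions and built-ins but would need a guard for unusual callables.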

On Wed, Nov 28, 2018, 16:31 E. Madison Bray wrote:

> On Wed, Nov 28, 2018 at 4:24 PM Chris Angelico wrote:
> >
> > On Thu, Nov 29, 2018 at 2:19 AM E. Madison Bray wrote:
> > >
> > > On Wed, Nov 28, 2018 at 4:14 PM Steven D'Aprano wrote:
> > > >
> > > > On Wed, Nov 28, 2018 at 04:04:33PM +0100, E. Madison Bray wrote:
> > > >
> > > > > That effort is already mostly done and adding a helper function would
> > > > > not have worked as users *passing* map(...) as an argument to some
> > > > > function just expect it to work.
> > > >
> > > > Ah, that's what I was missing.
> > > >
> > > > But... surely the function will still work if they pass an opaque
> > > > iterator *other* than map() and/or filter?
> > > >
> > > > it = (func(x) for x in something if condition(x))
> > > > some_sage_function(it)
> > >
> > > That one is admittedly tricky.  For that matter it might be nice to
> > > have more introspection of generator expressions too, but there at
> > > least we have .gi_code if nothing else.
> >
> > Considering that a genexp can do literally anything, I doubt you'll
> > get anywhere with that introspection.
> >
> > > But those are a far less common example in my case, whereas map() is
> > > *everywhere* in math code :)
> >
> > Perhaps then, the problem is that math code treats "map" as something
> > that is more akin to "instrumented list" than it is to a generator. If
> > you know for certain that you're mapping a low-cost pure function over
> > an immutable collection, the best solution may be to proxy through to
> > the original list than to generate values on the fly. And if that's
> > the case, you don't want the Py3 map *or* the Py2 one, although the
> > Py2 one can behave this way, at the cost of crazy amounts of
> > efficiency.
>
> Yep, that's a great example where it might be possible to introspect a
> given `map` object and take it apart to do something more efficient
> with it.  This is less of a problem with internal code where it's easy
> to just not use map() at all, and that is often the case.  But a lot
> of the people who develop code for Sage are mathematicians, not
> engineers, and they may not be aware of this, so they write code that
> passes `map()` objects to more internal machinery.  And users will do
> the same even more so.
>
> I can (and have) written horrible C-level hacks--not for this specific
> issue, but others like it--and am sometimes tempted to do the same
> here :(
>


Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread Marcos Eliziario
>
> In the end, you have to be rigorous when writing tests, and for most
> non-trivial functions it requires that you devise the distribution of
> input values depending on the implemented algorithm, not leave that
> distribution to a third-party library that knows nothing about your
> program.
>

Indeed.
But the great thing about the "hypothesis" tool is that it allows me to
somewhat automate the generation of sets of input values based on my
specific requirements derived from my knowledge of my program.
It allows me to think about what the reasonable distribution of values
is for each argument of a function, by using existing strategies, using
their arguments, or combining and extending them, and then letting the tool do
the grunt work of running the test for lots of different equivalence classes
of argument values.
I think that as long as the tool user keeps what you said in mind and uses
the tool accordingly it can be a great helper, and probably even force the
average programmer to think more rigorously about the input values to be
tested, not to mention the whole class of trivial mistakes and
forgetfulness we are all bound to be subject to when writing test cases.

Best,



Em qua, 28 de nov de 2018 às 12:18, Antoine Pitrou 
escreveu:

> On Tue, 27 Nov 2018 22:47:06 -0600
> Abe Dillon  wrote:
> >
> > If we could figure out a cleaner syntax for defining invariants,
> > preconditions, and postconditions we'd be half-way to automated testing
> > UTOPIA! (ok, maybe I'm being a little over-zealous)
>
> I think utopia is the word here.  Fuzz testing can be useful, but it's
> not a replacement for manual testing of carefully selected values.
>
> Also, the idea that fuzz testing will automatically find edge cases in
> your code is idealistic.  It depends on the algorithm you've
> implemented and the distribution of values chosen by the tester.
> Showcasing trivially wrong examples (such as an addition function that
> always returns 0, or a tail function that doesn't return the tail)
> isn't very helpful for a real-world analysis, IMHO.
>
> In the end, you have to be rigorous when writing tests, and for most
> non-trivial functions it requires that you devise the distribution of
> input values depending on the implemented algorithm, not leave that
> distribution to a third-party library that knows nothing about your
> program.
>
> Regards
>
> Antoine.
>
>


-- 
Marcos Eliziário Santos
mobile/whatsapp/telegram: +55(21) 9-8027-0156
skype: marcos.elizia...@gmail.com
linked-in : https://www.linkedin.com/in/eliziario/


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Steven D'Aprano
On Wed, Nov 28, 2018 at 04:14:24PM +0100, E. Madison Bray wrote:

> For example, some function that used to expect some finite-sized
> sequence such as a list or tuple is now passed a "map", possibly
> wrapping one or more iterable of arbitrary, possibly non-finite size.
> For the purposes of some algorithm I have this is not useful and I
> need to convert it to a sequence anyways but don't want to do that
> without some guarantee that I won't blow up the user's memory usage.
> So I might want to check:
> 
> finite_definite = True
> for it in my_map.iters:
>     try:
>         len(it)
>     except TypeError:
>         finite_definite = False
>
> if finite_definite:
>     my_seq = list(my_map)
> else:
>     # some other algorithm

But surely you didn't need to do this just because of *map*. Users could 
have passed an infinite, unsized iterable going back to Python 1 days 
with the old sequence protocol. They certainly could pass a generator or 
other opaque iterator apart from map. So I'm having trouble seeing why 
the Python 2/3 change to map made things worse for SageMath.

But in any case, this example comes back to the question of len again, 
and we've already covered why this is problematic. In case you missed 
it, let's take a toy example which demonstrates the problem:


def mean(it):
    if isinstance(it, map):
        # Hypothetical attribute access to the underlying iterable.
        n = len(it.iterable)
        return sum(it)/n


Now let's pass a map object to it:

data = [1, 2, 3, 4, 5]
it = map(lambda x: x, data)
assert len(it.iterable) == 5
next(it); next(it); next(it)

assert mean(it) == 4.5
# fails, as it actually returns 9/5 instead of 9/2
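
Since real map objects have no `.iterable` attribute, the toy example cannot be run as written. The sketch below reproduces it with a small stand-in class; `InspectableMap`, `naive_mean` and `safe_mean` are invented names used only to make the failure observable.

```python
class InspectableMap:
    """Hypothetical iterator that, unlike map, exposes its source iterable."""

    def __init__(self, func, iterable):
        self.func = func
        self.iterable = iterable
        self._it = map(func, iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)


def naive_mean(it):
    # Trusts the length of the source, ignoring items already consumed.
    n = len(it.iterable)
    return sum(it) / n


def safe_mean(it):
    # Counts only the items that are actually still available.
    values = list(it)
    return sum(values) / len(values)


def partly_consumed():
    it = InspectableMap(lambda x: x, [1, 2, 3, 4, 5])
    next(it); next(it); next(it)
    return it


print(naive_mean(partly_consumed()))  # divides 4 + 5 by the stale length 5
print(safe_mean(partly_consumed()))   # divides 4 + 5 by the 2 items left
```

The safe version pays for correctness by materializing the remaining items, which is the cost the length shortcut was trying to avoid.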


-- 
Steve


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread E. Madison Bray
On Wed, Nov 28, 2018 at 4:24 PM Chris Angelico  wrote:
>
> On Thu, Nov 29, 2018 at 2:19 AM E. Madison Bray  wrote:
> >
> > On Wed, Nov 28, 2018 at 4:14 PM Steven D'Aprano  wrote:
> > >
> > > On Wed, Nov 28, 2018 at 04:04:33PM +0100, E. Madison Bray wrote:
> > >
> > > > That effort is already mostly done and adding a helper function would
> > > > not have worked as users *passing* map(...) as an argument to some
> > > > function just expect it to work.
> > >
> > > Ah, that's what I was missing.
> > >
> > > But... surely the function will still work if they pass an opaque
> > > iterator *other* than map() and/or filter?
> > >
> > > it = (func(x) for x in something if condition(x))
> > > some_sage_function(it)
> >
> > That one is admittedly tricky.  For that matter it might be nice to
> > have more introspection of generator expressions too, but there at
> > least we have .gi_code if nothing else.
>
> Considering that a genexp can do literally anything, I doubt you'll
> get anywhere with that introspection.
>
> > But those are a far less common example in my case, whereas map() is
> > *everywhere* in math code :)
>
> Perhaps then, the problem is that math code treats "map" as something
> that is more akin to "instrumented list" than it is to a generator. If
> you know for certain that you're mapping a low-cost pure function over
> an immutable collection, the best solution may be to proxy through to
> the original list than to generate values on the fly. And if that's
> the case, you don't want the Py3 map *or* the Py2 one, although the
> Py2 one can behave this way, at the cost of crazy amounts of
> efficiency.

Yep, that's a great example where it might be possible to introspect a
given `map` object and take it apart to do something more efficient
with it.  This is less of a problem with internal code where it's easy
to just not use map() at all, and that is often the case.  But a lot
of the people who develop code for Sage are mathematicians, not
engineers, and they may not be aware of this, so they write code that
passes `map()` objects to more internal machinery.  And users will do
the same even more so.

I can (and have) written horrible C-level hacks--not for this specific
issue, but others like it--and am sometimes tempted to do the same
here :(


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Chris Angelico
On Thu, Nov 29, 2018 at 2:19 AM E. Madison Bray  wrote:
>
> On Wed, Nov 28, 2018 at 4:14 PM Steven D'Aprano  wrote:
> >
> > On Wed, Nov 28, 2018 at 04:04:33PM +0100, E. Madison Bray wrote:
> >
> > > That effort is already mostly done and adding a helper function would
> > > not have worked as users *passing* map(...) as an argument to some
> > > function just expect it to work.
> >
> > Ah, that's what I was missing.
> >
> > But... surely the function will still work if they pass an opaque
> > iterator *other* than map() and/or filter?
> >
> > it = (func(x) for x in something if condition(x))
> > some_sage_function(it)
>
> That one is admittedly tricky.  For that matter it might be nice to
> have more introspection of generator expressions too, but there at
> least we have .gi_code if nothing else.

Considering that a genexp can do literally anything, I doubt you'll
get anywhere with that introspection.

> But those are a far less common example in my case, whereas map() is
> *everywhere* in math code :)

Perhaps then, the problem is that math code treats "map" as something
that is more akin to "instrumented list" than it is to a generator. If
you know for certain that you're mapping a low-cost pure function over
an immutable collection, the best solution may be to proxy through to
the original list than to generate values on the fly. And if that's
the case, you don't want the Py3 map *or* the Py2 one, although the
Py2 one can behave this way, at the cost of crazy amounts of
efficiency.
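
As a sketch of what "proxy through to the original list" could look like, assuming a pure, cheap function over a sequence that is not mutated: `MappedView` is a hypothetical name, not an existing class, and it computes each item on access instead of storing results.

```python
from collections.abc import Sequence


class MappedView(Sequence):
    """Hypothetical read-only view applying func to a sequence on access."""

    def __init__(self, func, seq):
        self._func = func
        self._seq = seq

    def __len__(self):
        # The length is simply that of the underlying sequence.
        return len(self._seq)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing yields another lazy view over the sliced sequence.
            return MappedView(self._func, self._seq[index])
        return self._func(self._seq[index])


view = MappedView(float, [1, 2, 3])
print(len(view), view[-1], list(view))
```

Because it is a Sequence rather than an iterator, it can be iterated repeatedly and indexed, at the price of calling the function again on every access.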

ChrisA


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread E. Madison Bray
On Wed, Nov 28, 2018 at 4:14 PM Steven D'Aprano  wrote:
>
> On Wed, Nov 28, 2018 at 04:04:33PM +0100, E. Madison Bray wrote:
>
> > That effort is already mostly done and adding a helper function would
> > not have worked as users *passing* map(...) as an argument to some
> > function just expect it to work.
>
> Ah, that's what I was missing.
>
> But... surely the function will still work if they pass an opaque
> iterator *other* than map() and/or filter?
>
> it = (func(x) for x in something if condition(x))
> some_sage_function(it)

That one is admittedly tricky.  For that matter it might be nice to
have more introspection of generator expressions too, but there at
least we have .gi_code if nothing else.

But those are a far less common example in my case, whereas map() is
*everywhere* in math code :)


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread E. Madison Bray
On Wed, Nov 28, 2018 at 4:04 PM Steven D'Aprano  wrote:
>
> On Wed, Nov 28, 2018 at 03:27:25PM +0100, E. Madison Bray wrote:
>
> > I mostly agree with the existing objections, though I have often found
> > myself wanting this too, especially now that `map` does not simply
> > return a list.  This problem alone (along with the same problem for
> > filter) has had a ridiculously outsized impact on the Python 3 porting
> > effort for SageMath, and I find it really irritating at times.
> >
> > As a simple counter-proposal which I believe has fewer issues, I would
> > really like it if the built-in `map()` and `filter()` at least
> > provided a Python-level attribute to access the underlying iterables.
> > This is necessary because if I have a function that used to take, say,
> > a list as an argument, and it receives a `map` object, I now have to
> > be able to deal with map()s, and I may have checks I want to perform
> > on the underlying iterables before, say, I try to iterate over the
> > `map`.
> >
> > Exactly what those checks are and whether or not they're useful may be
> > highly application-specific, which is why say a generic `map.__len__`
> > is not workable.  However, if I can at least inspect those iterables I
> > can make my own choices on how to handle the map.
>
> Can you give a concrete example of what you would do in practice? I'm
> having trouble thinking of how and when this sort of thing would be
> useful. Aside from extracting the length of the iterable(s), under what
> circumstances would you want to bypass the call to map() or filter() and
> access the iterables directly?

For example, some function that used to expect some finite-sized
sequence such as a list or tuple is now passed a "map", possibly
wrapping one or more iterable of arbitrary, possibly non-finite size.
For the purposes of some algorithm I have this is not useful and I
need to convert it to a sequence anyways but don't want to do that
without some guarantee that I won't blow up the user's memory usage.
So I might want to check:

finite_definite = True
for it in my_map.iters:
    try:
        len(it)
    except TypeError:
        finite_definite = False

if finite_definite:
    my_seq = list(my_map)
else:
    # some other algorithm

Of course, some arbitrary object could lie about its __len__ but I'm
not concerned about pathological cases here.  There may be other
opportunities for optimization as well that are otherwise hidden.

Either way, I don't see any reason to hide this data; it's a couple of
slot attributes and instantly better introspection capability.
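
For what it's worth, the same check can be written today for code that still has the function and the iterables in hand, before they disappear into a map object. This is only a sketch of that situation: `materialize_if_sized` is a made-up helper name, and it uses `collections.abc.Sized` in place of the try/len test above.

```python
from collections.abc import Sized


def materialize_if_sized(func, iterables):
    """Return a list when every input reports a length, else None."""
    if all(isinstance(it, Sized) for it in iterables):
        return list(map(func, *iterables))
    # Caller is expected to fall back to a streaming algorithm.
    return None


print(materialize_if_sized(float, ([1, 2, 3],)))
print(materialize_if_sized(float, ((x for x in [1, 2, 3]),)))
```

What the sketch cannot do is recover `func` and `iterables` from a map that someone else constructed, which is the gap the proposed attributes would close.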


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Steven D'Aprano
On Wed, Nov 28, 2018 at 04:04:33PM +0100, E. Madison Bray wrote:

> That effort is already mostly done and adding a helper function would
> not have worked as users *passing* map(...) as an argument to some
> function just expect it to work.

Ah, that's what I was missing.

But... surely the function will still work if they pass an opaque 
iterator *other* than map() and/or filter?

it = (func(x) for x in something if condition(x))
some_sage_function(it)


You surely don't expect to be able to peer inside every and any iterator
that you are given? So if you have to handle the opaque iterator case 
anyway, how is it *worse* when the user passes map() or filter() instead 
of a generator like the above?


> I just mentioned that porting effort for background.  I still believe
> that the actual proposal of making the arguments to a map(...) call
> accessible from Python as attributes of the map object (ditto filter,
> zip, etc.) is useful in its own right, rather than just having this
> completely opaque iterator.

Perhaps...

I *want* to agree with this, but I'm having trouble thinking of when and 
how it would be useful. Some concrete examples would help justify it.


-- 
Steve


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread E. Madison Bray
On Wed, Nov 28, 2018 at 4:04 PM Steven D'Aprano  wrote:
>
> On Wed, Nov 28, 2018 at 03:27:25PM +0100, E. Madison Bray wrote:
>
> > I mostly agree with the existing objections, though I have often found
> > myself wanting this too, especially now that `map` does not simply
> > return a list.  This problem alone (along with the same problem for
> > filter) has had a ridiculously outsized impact on the Python 3 porting
> > effort for SageMath, and I find it really irritating at times.
>
> *scratches head*
>
> I presume that SageMath worked fine with Python 2 map and filter? You
> can have them back again:
>
> # put this in a module called py2
> _map = map
> def map(*args):
>     return list(_map(*args))
>
>
> And similarly for filter. The only annoying part is to import this new
> map at the start of every module that needs it, but while that's
> annoying, I wouldn't call it a "ridiculously outsized impact". It's one
> line at the top of each module.
>
> from py2 import map, filter
>
>
> What am I missing?

Large amounts of context; size of code base.


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread E. Madison Bray
On Wed, Nov 28, 2018 at 3:54 PM Chris Angelico  wrote:
>
> On Thu, Nov 29, 2018 at 1:46 AM Jonathan Fine  wrote:
> >
> > On Wed, Nov 28, 2018 at 2:28 PM E. Madison Bray wrote:
> >
> > > I mostly agree with the existing objections, though I have often found
> > > myself wanting this too, especially now that `map` does not simply
> > > return a list.  This problem alone (along with the same problem for
> > > filter) has had a ridiculously outsized impact on the Python 3 porting
> > > effort for SageMath, and I find it really irritating at times.
> >
> > I'm a mathematician, so understand your concerns. Here's what I hope
> > is a helpful suggestion.
> >
> > Create a module, say sage.itertools that contains (not tested)
> >
> >    def py2map(iterable):
> >        return list(map(iterable))
>
> With the nitpick that the arguments should be (func, *iterables)
> rather than just the single iterable, yes, this is a viable transition
> strategy. In fact, it's very similar to what 2to3 would do, except
> that 2to3 would do it at the call site. If any Py3 porting process is
> being held up significantly by this, I would strongly recommend giving
> 2to3 an eyeball - run it on some of your code, then either accept its
> changes or just learn from the diffs. It's not perfect (nothing is),
> but it's a useful tool.

That effort is already mostly done and adding a helper function would
not have worked as users *passing* map(...) as an argument to some
function just expect it to work.  The only alternative would have been
replacing the builtin map with something else at the globals level.
2to3 is mostly useless since a major portion of Sage is written in
Cython anyways.

I just mentioned that porting effort for background.  I still believe
that the actual proposal of making the arguments to a map(...) call
accessible from Python as attributes of the map object (ditto filter,
zip, etc.) is useful in its own right, rather than just having this
completely opaque iterator.
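
To make the idea concrete, here is a minimal pure-Python sketch of what such attributes might look like. The class name `InspectableMap` and the attribute names `func` and `iterables` are hypothetical; the built-in `map` exposes nothing of the kind today.

```python
class InspectableMap:
    """Hypothetical stand-in for map() that remembers its arguments."""

    def __init__(self, func, *iterables):
        if not iterables:
            raise TypeError("InspectableMap() must have at least two arguments")
        self.func = func              # assumed attribute name
        self.iterables = iterables    # assumed attribute name
        self._it = map(func, *iterables)

    def __iter__(self):
        return self

    def __next__(self):
        # Delegate to the real map object, so evaluation stays lazy.
        return next(self._it)


m = InspectableMap(float, [1, 2, 3])
print(m.func is float)         # True
print(len(m.iterables[0]))     # 3, without consuming the iterator
print(list(m))                 # [1.0, 2.0, 3.0]
```

A receiving function could then look at `m.iterables` before deciding whether to iterate, copy, or reject the object.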


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Steven D'Aprano
On Wed, Nov 28, 2018 at 03:27:25PM +0100, E. Madison Bray wrote:

> I mostly agree with the existing objections, though I have often found
> myself wanting this too, especially now that `map` does not simply
> return a list.  This problem alone (along with the same problem for
> filter) has had a ridiculously outsized impact on the Python 3 porting
> effort for SageMath, and I find it really irritating at times.

*scratches head*

I presume that SageMath worked fine with Python 2 map and filter? You 
can have them back again:

# put this in a module called py2
_map = map
def map(*args):
    return list(_map(*args))


And similarly for filter. The only annoying part is to import this new 
map at the start of every module that needs it, but while that's 
annoying, I wouldn't call it a "ridiculously outsized impact". It's one 
line at the top of each module.

from py2 import map, filter
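
For completeness, a sketch of what the whole `py2` module could contain, with `filter` handled the same way. The private names `_map` and `_filter` are only illustrative. This is an approximation rather than a faithful emulation: Python 2's `filter` returned a str or tuple when given one, and Python 2's `map` padded shorter iterables with None, neither of which is reproduced here.

```python
# py2.py: eager versions of map and filter (sketch)
import builtins

_map = builtins.map
_filter = builtins.filter


def map(func, *iterables):
    """Return a list, as Python 2's map did in the common cases."""
    return list(_map(func, *iterables))


def filter(func, iterable):
    """Return a list, as Python 2's filter did for list input."""
    return list(_filter(func, iterable))


print(map(str.upper, ["a", "b"]))       # ['A', 'B']
print(filter(None, [0, 1, "", "x"]))    # [1, 'x']
```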


What am I missing?



> As a simple counter-proposal which I believe has fewer issues, I would
> really like it if the built-in `map()` and `filter()` at least
> provided a Python-level attribute to access the underlying iterables.
> This is necessary because if I have a function that used to take, say,
> a list as an argument, and it receives a `map` object, I now have to
> be able to deal with map()s, and I may have checks I want to perform
> on the underlying iterables before, say, I try to iterate over the
> `map`.
> 
> Exactly what those checks are and whether or not they're useful may be
> highly application-specific, which is why, say, a generic `map.__len__`
> is not workable.  However, if I can at least inspect those iterables I
> can make my own choices on how to handle the map.

Can you give a concrete example of what you would do in practice? I'm 
having trouble thinking of how and when this sort of thing would be 
useful. Aside from extracting the length of the iterable(s), under what 
circumstances would you want to bypass the call to map() or filter() and 
access the iterables directly?


> Exposing the underlying iterables to Python also has dangers in that I
> could directly call `next()` on them and possibly create some
> confusion, but consenting adults and all that...

I don't think that's worse than what we can already do if you hold onto 
a reference to the underlying iterable:

py> a = [1, 2, 3]
py> it = map(lambda x: x+100, a)
py> next(it)
101
py> a.insert(0, None)
py> next(it)
101



-- 
Steve


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Chris Angelico
On Thu, Nov 29, 2018 at 1:46 AM Jonathan Fine  wrote:
>
> On Wed, Nov 28, 2018 at 2:28 PM E. Madison Bray  wrote:
>
> > I mostly agree with the existing objections, though I have often found
> > myself wanting this too, especially now that `map` does not simply
> > return a list.  This problem alone (along with the same problem for
> > filter) has had a ridiculously outsized impact on the Python 3 porting
> > effort for SageMath, and I find it really irritating at times.
>
> I'm a mathematician, so understand your concerns. Here's what I hope
> is a helpful suggestion.
>
> Create a module, say sage.itertools that contains (not tested)
>
>    def py2map(iterable):
>        return list(map(iterable))

With the nitpick that the arguments should be (func, *iterables)
rather than just the single iterable, yes, this is a viable transition
strategy. In fact, it's very similar to what 2to3 would do, except
that 2to3 would do it at the call site. If any Py3 porting process is
being held up significantly by this, I would strongly recommend giving
2to3 an eyeball - run it on some of your code, then either accept its
changes or just learn from the diffs. It's not perfect (nothing is),
but it's a useful tool.
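
A sketch of the helper with that signature fix applied (the name `py2map` comes from the suggestion quoted above; the rest is illustrative):

```python
def py2map(func, *iterables):
    """Eagerly evaluate map() and return a list, as Python 2 did."""
    return list(map(func, *iterables))


print(py2map(float, [1, 2, 3]))                     # [1.0, 2.0, 3.0]
print(py2map(lambda a, b: a * b, [1, 2], [3, 4]))   # [3, 8]
```

The call-site equivalent is simply `list(map(func, xs))`.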

ChrisA


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread Jonathan Fine
On Wed, Nov 28, 2018 at 2:28 PM E. Madison Bray  wrote:

> I mostly agree with the existing objections, though I have often found
> myself wanting this too, especially now that `map` does not simply
> return a list.  This problem alone (along with the same problem for
> filter) has had a ridiculously outsized impact on the Python 3 porting
> effort for SageMath, and I find it really irritating at times.

I'm a mathematician, so understand your concerns. Here's what I hope
is a helpful suggestion.

Create a module, say sage.itertools that contains (not tested)

    def py2map(iterable):
        return list(map(iterable))

The porting to Python 3 (for map) is now reduced to writing

from .itertools import py2map as map

at the head of each module.

Please let me know if this helps.

-- 
Jonathan


Re: [Python-ideas] __len__() for map()

2018-11-28 Thread E. Madison Bray
On Mon, Nov 26, 2018 at 10:35 PM Kale Kundert  wrote:
>
> I just ran into the following behavior, and found it surprising:
>
> >>> len(map(float, [1,2,3]))
> TypeError: object of type 'map' has no len()
>
> I understand that map() could be given an infinite sequence and therefore 
> might not always have a length.  But in this case, it seems like map() 
> should've known that its length was 3.  I also understand that I can just 
> call list() on the whole thing and get a list, but the nice thing about map() 
> is that it doesn't copy data, so it's unfortunate to lose that advantage for 
> no particular reason.
>
> My proposal is to delegate map.__len__() to the underlying iterable.  
> Similarly, map.__getitem__() could be implemented if the underlying iterable 
> supports item access:

I mostly agree with the existing objections, though I have often found
myself wanting this too, especially now that `map` does not simply
return a list.  This problem alone (along with the same problem for
filter) has had a ridiculously outsized impact on the Python 3 porting
effort for SageMath, and I find it really irritating at times.

As a simple counter-proposal which I believe has fewer issues, I would
really like it if the built-in `map()` and `filter()` at least
provided a Python-level attribute to access the underlying iterables.
This is necessary because if I have a function that used to take, say,
a list as an argument, and it receives a `map` object, I now have to
be able to deal with map()s, and I may have checks I want to perform
on the underlying iterables before, say, I try to iterate over the
`map`.

Exactly what those checks are and whether or not they're useful may be
highly application-specific, which is why, say, a generic `map.__len__`
is not workable.  However, if I can at least inspect those iterables I
can make my own choices on how to handle the map.
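
As a rough illustration of the kind of check this would allow, assume a map-like object that exposes its inputs through a hypothetical `iterables` attribute. The built-in `map` has no such attribute today, so the sketch uses a small stand-in class, `MapWithInputs`, which is also an invented name.

```python
from collections.abc import Sized


class MapWithInputs:
    """Stand-in for a map object with a hypothetical `iterables` attribute."""

    def __init__(self, func, *iterables):
        self.func = func
        self.iterables = iterables
        self._it = map(func, *iterables)

    def __iter__(self):
        return self._it


def known_length(obj):
    """Return how many items obj will yield, or None if that is unknown."""
    if isinstance(obj, Sized):
        return len(obj)
    inputs = getattr(obj, "iterables", None)
    if inputs and all(isinstance(it, Sized) for it in inputs):
        # map stops at the shortest input
        return min(len(it) for it in inputs)
    return None


print(known_length([1, 2, 3]))                              # 3
print(known_length(MapWithInputs(float, [1, 2, 3])))        # 3
print(known_length(MapWithInputs(float, iter([1, 2, 3]))))  # None
print(known_length(map(float, [1, 2, 3])))                  # None
```

The same trick would not give a length for `filter`, since the predicate decides how many items survive; the caller has to decide what is meaningful for each case.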

Exposing the underlying iterables to Python also has dangers in that I
could directly call `next()` on them and possibly create some
confusion, but consenting adults and all that...


Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread Antoine Pitrou
On Tue, 27 Nov 2018 22:47:06 -0600
Abe Dillon  wrote:
> 
> If we could figure out a cleaner syntax for defining invariants,
> preconditions, and postconditions we'd be half-way to automated testing
> UTOPIA! (ok, maybe I'm being a little over-zealous)

I think utopia is the word here.  Fuzz testing can be useful, but it's
not a replacement for manual testing of carefully selected values.

Also, the idea that fuzz testing will automatically find edge cases in
your code is idealistic.  It depends on the algorithm you've
implemented and the distribution of values chosen by the tester.
Showcasing trivially wrong examples (such as an addition function that
always returns 0, or a tail function that doesn't return the tail)
isn't very helpful for a real-world analysis, IMHO.

In the end, you have to be rigorous when writing tests, and for most
non-trivial functions it requires that you devise the distribution of
input values depending on the implemented algorithm, not leave that
distribution to a third-party library that knows nothing about your
program.
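
A small sketch of the point, using only the standard library. The function `is_valid_percent` and its deliberate off-by-one bug are invented for illustration: inputs drawn uniformly from a wide range are very unlikely to land on the single failing value, while boundary values chosen from knowledge of the algorithm expose it at once.

```python
import random


def is_valid_percent(x):
    """Intended: True for integers 0..100 inclusive. Deliberate bug: rejects 100."""
    return 0 <= x < 100


def reference(x):
    """The specification the implementation is supposed to meet."""
    return 0 <= x <= 100


def failures(inputs):
    """Return the inputs on which the implementation disagrees with the spec."""
    return [x for x in inputs if is_valid_percent(x) != reference(x)]


rng = random.Random(0)
random_inputs = [rng.randint(-10**9, 10**9) for _ in range(1000)]
boundary_inputs = [-1, 0, 1, 99, 100, 101]

# Uniform sampling over two billion integers rarely hits exactly 100.
print(len(failures(random_inputs)))
# Hand-picked boundaries find the bug.
print(failures(boundary_inputs))      # [100]
```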

Regards

Antoine.




Re: [Python-ideas] [Brainstorm] Testing with Documented ABCs

2018-11-28 Thread Steven D'Aprano
On Tue, Nov 27, 2018 at 10:47:06PM -0600, Abe Dillon wrote:

> If we could figure out a cleaner syntax for defining invariants,
> preconditions, and postconditions we'd be half-way to automated testing
> UTOPIA! (ok, maybe I'm being a little over-zealous)

You should look at the state of the art in Design By Contract. In 
Eiffel, DBC is integrated in the language:

https://www.eiffel.com/values/design-by-contract/introduction/

https://www.eiffel.org/doc/eiffel/ET-_Design_by_Contract_%28tm%29%2C_Assertions_and_Exceptions


Eiffel uses a rather Pythonic block structure to define invariants. 
The syntax is not identical to Python's (Eiffel eschews the colons) but 
it also comes close to executable pseudo-code.

I trust this syntax requires little explanation:

    require
        ... preconditions, tested on function entry
    do
        ... body of the function
    ensure
        ... postconditions, tested on function exit
    end

There is a similar invariant block for classes.
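
For comparison, a rough Python approximation of the same require/do/ensure shape using a decorator. The decorator name `contract` and its `require`/`ensure` keyword arguments are made up for this sketch; they are not taken from Eiffel, Cobra, or any existing library.

```python
import functools


def contract(require=None, ensure=None):
    """Check a precondition on entry and a postcondition on exit."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if require is not None and not require(*args, **kwargs):
                raise AssertionError("precondition failed")
            result = func(*args, **kwargs)
            if ensure is not None and not ensure(result, *args, **kwargs):
                raise AssertionError("postcondition failed")
            return result
        return wrapper
    return decorator


@contract(
    require=lambda n: n >= 0,
    ensure=lambda result, n: result * result <= n < (result + 1) ** 2,
)
def integer_sqrt(n):
    return int(n ** 0.5)


print(integer_sqrt(17))   # 4
```

Unlike the Eiffel form, the conditions here sit outside the function body and there is no class-level invariant; it only shows the entry and exit checks.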

Cobra is a language which intentionally modeled its syntax on Python. It 
too has contracts integrated with the language:

http://cobra-language.com/how-to/DeclareContracts/

http://cobra-language.com/trac/cobra/wiki/Contracts




-- 
Steve