Re: Different "look and feel" of some built-in functions

2021-09-24 Thread Chris Angelico
On Sat, Sep 25, 2021 at 11:11 AM Oscar Benjamin
 wrote:
>
> On Sat, 25 Sept 2021 at 02:01, Chris Angelico  wrote:
>>
>> On Sat, Sep 25, 2021 at 10:56 AM Oscar Benjamin
>>  wrote:
>> >
>> > On Sat, 25 Sept 2021 at 00:37, Greg Ewing 
>> > wrote:
>> > > I suppose they could be fiddled somehow to make it possible, but
>> > > that would be turning them into special cases that break the rules.
>> > > It would be better to provide separate functions, as was done with
>> > > sum().
>> > >
>> >
>> > A separate union function would be good. Even in a situation where all
>> > inputs are assured to be sets the set.union method fails the base case:
>> >
>> > >>> sets = []
>> > >>> set.union(*sets)
>> > Traceback (most recent call last):
>> >   File "<stdin>", line 1, in <module>
>> > TypeError: descriptor 'union' of 'set' object needs an argument
>> >
>> > In the case of intersection perhaps the base case should be undefined.
>> >
>>
>> Rather than calling the unbound method, why not just call it on an
>> empty set? That defines your base case as well.
>>
>> set().union(*sets)
>
>
> That is indeed what I do but it seems unnatural compared to simply 
> union(*sets). It shouldn't be necessary to create a redundant empty set just 
> to compute the union of some other sets. If there was a union function then I 
> don't think I would ever use the union method.
>

Maybe, but if you start with a set, then you define the base case, and
it also is quite happy to take non-set arguments:

>>> set().union([1,2,3], map(int, "234"), {3:"three",4:"four",5:"five"})
{1, 2, 3, 4, 5}
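
If a standalone function were wanted anyway, a minimal sketch (a
hypothetical helper, not something in the standard library) could be:

def union(*iterables):
    """Union of any number of iterables; union() of nothing is set()."""
    result = set()
    for it in iterables:
        result |= set(it)
    return result

union() then gives set() for the empty case, and union([1, 2],
map(int, "23")) gives {1, 2, 3}, mirroring what set().union(*...)
already does.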

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Different "look and feel" of some built-in functions

2021-09-24 Thread Chris Angelico
On Sat, Sep 25, 2021 at 10:56 AM Oscar Benjamin
 wrote:
>
> On Sat, 25 Sept 2021 at 00:37, Greg Ewing 
> wrote:
>
> > On 25/09/21 10:15 am, Steve Keller wrote:
> > > BTW, I like how the min() and max() functions allow both ways of being
> > > called.
> >
> > That wouldn't work for set.union and set.intersection, because as
> > was pointed out, they're actually methods, so set.union(some_seq)
> > is a type error:
> >
> >  >>> a = {1, 2}
> >  >>> b = {3, 4}
> >  >>> set.union([a, b])
> > Traceback (most recent call last):
> >File "<stdin>", line 1, in <module>
> > TypeError: descriptor 'union' for 'set' objects doesn't apply to a
> > 'list' object
> >
> > I suppose they could be fiddled somehow to make it possible, but
> > that would be turning them into special cases that break the rules.
> > It would be better to provide separate functions, as was done with
> > sum().
> >
>
> A separate union function would be good. Even in a situation where all
> inputs are assured to be sets the set.union method fails the base case:
>
> >>> sets = []
> >>> set.union(*sets)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: descriptor 'union' of 'set' object needs an argument
>
> In the case of intersection perhaps the base case should be undefined.
>

Rather than calling the unbound method, why not just call it on an
empty set? That defines your base case as well.

set().union(*sets)
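
For instance, with the empty list from above:

>>> sets = []
>>> set().union(*sets)
set()
>>> set().union(*[{1, 2}, {2, 3}])
{1, 2, 3}

so the empty case comes out as the empty set instead of a TypeError.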

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: XML Considered Harmful

2021-09-24 Thread Chris Angelico
On Sat, Sep 25, 2021 at 8:53 AM dn via Python-list
 wrote:
>
> On 25/09/2021 06.59, Peter J. Holzer wrote:
> > There are a gazillion formats and depending on your needs one of them
> > might be perfect. Or you may have to define your own bespoke format (I
> > mean, nobody (except Matt Parker) tries to represent images or videos as
> > CSVs: There's PNG and JPEG and WEBP and H.264 and AV1 and whatever for
> > that).
> >
> > Of the three formats discussed here my take is:
> >
> > CSV: Good for tabular data of a single data type (strings). As soon as
> > there's a second data type (numbers, dates, ...) you leave standard
> > territory and are into "private agreements".
> >
> > JSON: Has a few primitive data types (bool, number, string) and two
> > compound types (list, dict(string -> any)). Still missing many
> > frequently used data types (e.g. dates) and has no standard way to
> > denote composite types. But it's simple, and if it's sufficient for your
> > needs, use it.
> >
> > XML: Originally invented for text markup, and that shows. Can represent
> > different types (via tags), can define those types (via DTD and/or
> > schemas), can identify schemas in a globally-unique way and you can mix
> > them all in a single document (and there are tools available to validate
> > your files). But those features make it very complex (you almost
> > certainly don't want to write your own parser) and you really have to
> > understand the data model (especially namespaces) to use it.
>
> and YAML?

Invented because there weren't enough markup languages, so we needed another?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Different "look and feel" of some built-in functions

2021-09-24 Thread Chris Angelico
On Sat, Sep 25, 2021 at 3:42 AM Stefan Ram  wrote:
>
> "Dieter Maurer"  writes:
> >A list is ordered. Therefore, it is important where
> >in this order an element is added. Thus, for a list,
> >`append` is a better name than `add` -- because it already
> >tells us in the name where it adds the new element.
>
>   In a collection of texts, which is not large but mixed from
>   many different fields and genres, I find (using a Python
>   script, of course) eight hits for "added to the list" :
>
> |s and rock n roll can be added to the list. As - Taylor, 2012
> | of opinion was probably added to the list tow - from a dictionary
> |gg and his wife might be added to the list of  - Sir Walter Scott
> |ships when your name was added to the list. In - Percy Bysshe Shelley
> |em that wealth should be added to the list. No - Henry
> |n was discovered and was added to the list of  - from a dictionary
> |nd said his name must be added to the list, or - Mark Twain
>
>   . There was no hit for "appended to the list".
>
>   When one says "add something to a list", it is usually understood
>   that one adds it at the /end/. In the case of traditional written
>   lists it is not possible in any other way.
>

When I add something to the shopping list, it is not added to the end.
It is added anywhere that there is room. If you care about the
sequence, you would say "add to the end". Or, using the more technical
and briefer word, "append". Most of the lists you're seeing there are
not being added to the end of; for instance, I would guess that quite
a few of them are inherently sorted lists, so you would be adding a
person's name in affabeck lauder, or adding something to a particular
ranked position, or whatever else. This is not the same thing.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Different "look and feel" of some built-in functions

2021-09-24 Thread Chris Angelico
On Fri, Sep 24, 2021 at 11:47 PM Steve Keller  wrote:
>
> Why do some built-in Python functions feel so differently:
>
> For example sum(), all(), any() expect exactly one argument which is a
> sequence to operate on, i.e. a list, an iterator or a generator etc.
>
> sum([1,2,3,4])
> sum(range(1, 101))
> sum(2**i for i in range(10))
> all([True, False])
> any(even, {1,2,3,4})
>
> while other functions like set.union() and set.intersection() work on
> a list of arguments but not on a sequence:
>
> set.intersection({1,2,3}, {3,4,5})
>
> but
>
> set.union(map(...))   # does not work
> set.intersection(list(...))   # does not work
>
> and you have to use a * instead
>
> set.union(*map(...))
>
> etc.
>
> Is this just for historical reason?  And wouldn't it be possible and
> desirable to have more consistency?
>

The ones you're calling off the set class are actually meant to be methods.

>>> s = {1,2,3}
>>> s.intersection({3,4,5})
{3}

They expect a set, specifically, as the first argument, because
normally that one goes before the dot. If you want to call the unbound
method with two arguments, that's fine, but it's not the intended use,
so you have to basically fudge it to look like a normal method call :)
That's why it doesn't take a sequence.
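
To make the relationship concrete:

>>> s = {1,2,3}
>>> s.intersection({3,4,5})          # normal bound method call
{3}
>>> set.intersection(s, {3,4,5})     # the same call, written unbound
{3}
>>> set.intersection(*[s, {3,4,5}])  # a sequence has to be unpacked first
{3}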

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Type annotation pitfall

2021-09-24 Thread Chris Angelico
On Fri, Sep 24, 2021 at 11:43 PM Peter Saalbrink
 wrote:
>
> I don't think this has anything to do with typing or providing type hints.
> The type hint is the `: set` part, not the `= set()` part.
> You can declare the type without assigning to the variable.
> Indeed, as you already said, `x` is a class property here, and is shared
> amongst instances of the class.
> It might be a good idea to move the attribute assignment to the `__init__`
> method.
>
> In the following way, you can safely provide the type hint:
>
> ```python
> class Foo:
> x: set
>
> def __init__(self, s):
> self.x = set()
> if s:
> self.x.add(s)
> ```
>

To be clear, this isn't a case of "never use mutables as class
attributes"; often you *want* a single mutable object to be shared
among instances of a class (so they can all find each other, perhaps).
If you want each instance to have its own set, you construct a new set
every object initialization; if you want them to all use the same set,
you construct a single set and attach it to the class. Neither is
wrong, they just have different meanings.
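
A small sketch of the two intents side by side (illustrative only; the
class names are mine):

class PerInstance:
    def __init__(self):
        self.x = set()       # a fresh set per instance

class Shared:
    x = set()                # one set on the class, shared by all

a, b = PerInstance(), PerInstance()
a.x.add(1)
assert b.x == set()          # b has its own, still-empty set

c, d = Shared(), Shared()
c.x.add(1)
assert d.x == {1}            # d sees the same shared set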

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: XML Considered Harmful

2021-09-23 Thread Chris Angelico
On Fri, Sep 24, 2021 at 1:44 PM Dan Stromberg  wrote:
>
>
> On Thu, Sep 23, 2021 at 8:12 PM Chris Angelico  wrote:
>>
>> One good hybrid is to take a subset of Python syntax (so it still
>> looks like a Python script for syntax highlighting etc), and then
>> parse that yourself, using the ast module. For instance, you can strip
>> out comments, then look for "VARNAME = ...", and parse the value using
>> ast.literal_eval(), which will give you a fairly flexible file format
>> that's still quite safe.
>
>
> Restricting Python with the ast module is interesting, but I don't think I'd 
> want to bet my career on the actual safety of such a thing.  Given that Java 
> bytecode was a frequent problem inside web browsers, imagine all the 
> messiness that could accidentally happen with a subset of Python syntax from 
> untrusted sources.
>
> ast.literal_eval might be a little better - or a list of such, actually.

Uhh, I specifically mention literal_eval in there :) Simple text
parsing followed by literal_eval for the bulk of it is a level of
safety that I *would* bet my career on.

> Better still to use JSON or ini format - IOW something designed for the 
> purpose.

It all depends on how human-editable it needs to be. JSON has several
problems in that respect, including some rigidities, and a lack of
support for comments. INI format doesn't have enough data types for
many purposes. YAML might be closer, but it's not for every situation
either.

That's why we have options.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: XML Considered Harmful

2021-09-23 Thread Chris Angelico
On Fri, Sep 24, 2021 at 12:22 PM Stefan Ram  wrote:
>
> dn  writes:
> >With that, why not code it as Python expressions, and include the module?
>
>   This might create a code execution vulnerability if such
>   files are exchanged between multiple parties.
>
>   If code execution vulnerabilities and human-readability are
>   not an issue, then one could also think about using pickle.
>
>   If one ignores security concerns for a moment, serialization into
>   a text format and subsequent deserialization can be as easy as:
>
> |>>> eval( str( [1, (2, 3)] ))
> |[1, (2, 3)]
>

One good hybrid is to take a subset of Python syntax (so it still
looks like a Python script for syntax highlighting etc), and then
parse that yourself, using the ast module. For instance, you can strip
out comments, then look for "VARNAME = ...", and parse the value using
ast.literal_eval(), which will give you a fairly flexible file format
that's still quite safe.
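
A rough sketch of that idea (the helper name and file layout are mine,
and it leans on ast.parse to find the assignments rather than
hand-rolled text scanning):

import ast

def load_config(text):
    """Read NAME = <literal> lines; comments are dropped by the parser."""
    config = {}
    for node in ast.parse(text).body:
        if (isinstance(node, ast.Assign) and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            config[node.targets[0].id] = ast.literal_eval(node.value)
    return config

print(load_config("retries = 3  # comment\npaths = ['/tmp', '/var']"))
# {'retries': 3, 'paths': ['/tmp', '/var']}

Values that aren't plain literals make ast.literal_eval raise, which is
where the safety comes from.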

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: XML Considered Harmful

2021-09-23 Thread Chris Angelico
On Fri, Sep 24, 2021 at 7:11 AM Eli the Bearded <*@eli.users.panix.com> wrote:
>
> In comp.lang.python, Christian Gollwitzer   wrote:
> > Am 22.09.21 um 16:52 schrieb Michael F. Stemper:
> >> On 21/09/2021 19.30, Eli the Bearded wrote:
> >>> Yes, CSV files can model that. But it would not be my first choice of
> >>> data format. (Neither would JSON.) I'd probably use XML.
> >> Okay. 'Go not to the elves for counsel, for they will say both no
> >> and yes.' (I'm not actually surprised to find differences of opinion.)
>
> Well, I have a recommendation with my answer.
>
> > It's the same as saying "CSV supports images". Of course it doesn't, it's
> > a text file, but you could encode a JPEG as base64 and then put this
> > string into the cell of a CSV table. That definitely isn't what a sane
> > person would understand as "support".
>
> I'd use one of the netpbm formats instead of JPEG. PBM for one bit
> bitmaps, PGM for one channel (typically grayscale), PPM for three
> channel RGB, and PAM for anything else (two channel gray plus alpha,
> CMYK, RGBA, HSV, YCbCr, and more exotic formats). JPEG is tricky to
> map to CSV since it is a three channel format (YCbCr), where the
> channels are typically not at the same resolution. Usually Y is full
> size and the Cb and Cr channels are one quarter size ("4:2:0 chroma
> subsampling"). The unequal size of the channels does not lend itself
> to CSV, but I can't say it's impossible.
>

Examine prior art, and I truly do mean art, from Matt Parker:

https://www.youtube.com/watch?v=UBX2QQHlQ_I

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: XML Considered Harmful

2021-09-23 Thread Chris Angelico
On Thu, Sep 23, 2021 at 10:55 PM Mats Wichmann  wrote:
>
> On 9/22/21 10:31, Dennis Lee Bieber wrote:
>
> >   If you control both the data generation and the data consumption,
> > finding some format  ...
>
> This is really the key.  I rant at people seeming to believe that csv is
> THE data interchange format, and it's about as bad as it gets at that,
> if you have a choice.  xml is noisy but at least (potentially)
> self-documenting, and ought to be able to recover from certain errors.
> The problem with csv is that a substantial chunk of the world seems to
> live inside Excel, and so data is commonly both generated in csv so it
> can be imported into excel and generated in csv as a result of exporting
> from excel, so the parts often are *not* in your control.
>
> Sigh.

The only people who think that CSV is *the* format are people who
habitually live in spreadsheets. People who move data around the
internet, from program to program, are much more likely to assume that
JSON is the sole format. Of course, there is no single ultimate data
interchange format, but JSON is a lot closer to one than CSV is.

(Or to be more precise: any such thing as a "single ultimate data
interchange format" will be so generic that it isn't enough to define
everything. For instance, "a stream of bytes" is a universal data
interchange format, but that's not ultimately a very useful claim.)

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Polymorphic imports

2021-09-22 Thread Chris Angelico
On Thu, Sep 23, 2021 at 4:20 AM Dennis Lee Bieber  wrote:
>
> The other alternative may be
> https://docs.python.org/3/library/functions.html#__import__
>

I wouldn't recommend calling a dunder. If you just want to pass a text
string and get back a module, importlib is a better choice.
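
For example (the module name here is borrowed from the earlier thread,
so treat it as illustrative):

import importlib

def load_backend(name):
    """Import a module chosen at runtime from a plain string."""
    return importlib.import_module(name)

module_a = load_backend("client.module_a_prime")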

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Polymorphic imports

2021-09-21 Thread Chris Angelico
On Wed, Sep 22, 2021 at 6:05 AM <2qdxy4rzwzuui...@potatochowder.com> wrote:
>
> On 2021-09-22 at 05:10:02 +1000,
> Chris Angelico  wrote:
>
> > You can dynamically import modules using importlib.import_module(),
> > but an easier way might just be a conditional import:
> >
> > # client/__init__.py
> > if some_condition:
> > import module_a_default as module_a
> > else:
> > import module_a_prime as module_a
> >
> > Now everything that refers to client.module_a.whatever will get the
> > appropriate one, either the original or the alternate.
>
> +1
>
> > Alternatively, since you are talking about paths, it might be easiest
> > to give everything the same name, and then use sys.path to control
> > your import directories. Not sure which would work out best.
>
> -1
>
> Please don't do that.  Mutable shared and/or global state (i.e.,
> sys.path) is the root of all evil.  And homegrown crypto and date
> libraries.  And those funny red hats.

All depends on whether this is a script/application or a library. If
it's a library, then I agree, don't mutate sys.path, don't change the
working directory, etc, etc, etc. But applications are free to do
those sorts of things. I don't know what the OP's purpose here is, and
it's entirely possible that sys.path switching is the cleanest way to
do it.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Polymorphic imports

2021-09-21 Thread Chris Angelico
On Wed, Sep 22, 2021 at 4:59 AM Travis Griggs  wrote:
>
> I guess this is kind of like mocking for testing. I have a simple module 
> that's imported in a number of other spots in my program. There's a condition 
> in the OS/filesystem where I'd like to import a polymorphically compatible 
> variant of the same module. Can this be accomplished in a sort of 
> once-and-only once spot?
>
> For example, consider something like this:
>
> client/
>   module_a
>   module_a_prime
> lib/
>   paths
>   lib_a
>   lib_b
>   ...
> model/
>   model_a
>   model_b
>   ...
> top_level_a
> top_level_b
> ...
>
>
> I have a number of imports of module_a. I have a paths module that isolates 
> all of my file system access, and that's where the determination can be made 
> which one to use, so I tried to do something like:
>
> def dynamic_client_module():
>return client.module_a_prime if the_condition_occurs else client.module_a
>
>
> Hoping that I could do something like
>
> from lib import paths
> import paths.dynamic_client_module()
>
> But this seems to not work. Import can only take real modules? Not 
> programmatic ones?
>
> Is there a Not-Too-Evil-Way(tm) to add a level of programmatic indirection in 
> the import declarations? Or some other trick from a different angle?

You can dynamically import modules using importlib.import_module(),
but an easier way might just be a conditional import:

# client/__init__.py
if some_condition:
import module_a_default as module_a
else:
import module_a_prime as module_a

Now everything that refers to client.module_a.whatever will get the
appropriate one, either the original or the alternate.

Alternatively, since you are talking about paths, it might be easiest
to give everything the same name, and then use sys.path to control
your import directories. Not sure which would work out best.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: issue for setup pandas

2021-09-21 Thread Chris Angelico
On Tue, Sep 21, 2021 at 11:53 PM Fady Victor Mikhael Abdelmalk
 wrote:
>
>
> Dear Python Team,
>
> I got the below issue when trying to install python on my user. Kindly assist 
> to know how can I solved.
>
>
> WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, 
> status=None)) after connection broken by 
> 'NewConnectionError(' at 0x01ABB0BDE6A0>: Failed to establish a new connection: [Errno 11001] 
> getaddrinfo failed')': /simple/pandas/
> WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, 
> status=None)) after connection broken by 
> 'NewConnectionError(' at 0x01ABB0BDEDC0>: Failed to establish a new connection: [Errno 11001] 
> getaddrinfo failed')': /simple/pandas/
> WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, 
> status=None)) after connection broken by 
> 'NewConnectionError(' at 0x01ABB0C07070>: Failed to establish a new connection: [Errno 11001] 
> getaddrinfo failed')': /simple/pandas/
>

Looks like a problem with your internet connection. When you try to
install things like pandas, they have to be downloaded from the
internet. If your firewall is blocking this, you'll have to grant
permission before the installation can continue.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ANN: Dogelog Runtime, Prolog to the Moon (2021)

2021-09-20 Thread Chris Angelico
On Tue, Sep 21, 2021 at 3:58 AM Mostowski Collapse  wrote:
>
> I read the following, and you should also know:
>
> > Python's [] is implemented as an array, not a linked list.
> > Although resizing is O(n), appending to it is amortized O(1),
> > because resizes happen very rarely.
> https://stackoverflow.com/a/5932364/502187
>
> The list type doesn't have an O(1) operation to remove
> an element during the sweep. The list type, unlike what its name
> suggests, is in Python an array.
>
> These arrays are not so expensive when you append()
> an element. Because they are allocated with some excess
> capacity. And they grow exponentially.
>
> So, amortized, you can append() a lot of elements to
> a Python list, which is an array. But you cannot poke
> holes into it so cheaply. So if you have this scenario:
>
> Before:
>  - [ A1, .., An , B, C1, .., Cm ]
>
> After:
>  - [ A1, .., An , C1, .., Cm ]
>
> You have to copy C1,..,Cm one position down. On the other
> hand, when scanning the singly linked list, removing the
> element is just pointer swizzling.
>
> The code is here, the positive if-then-else branch keeps
> the element, the negative if-then-else branch drops the
> element. Thats quite standard algorithm for linked lists:
>
>  /* pointer swizzling */
> while temp is not None:
>     term = temp
>     temp = term.tail
>     if (term.flags & MASK_VAR_MARK) != 0:
>         term.flags &= ~MASK_VAR_MARK
>         if back is not None:
>             back.tail = term
>         else:
>             trail = term
>         back = term
>     else:
>         term.instantiated = NotImplemented
>         term.tail = None
>         count -= 1
>
> https://github.com/jburse/dogelog-moon/blob/main/devel/runtimepy/drawer/machine.py#L163
>
> There is nothing wrong with implementing a singly linked list
> in Python; maybe you just haven't seen one before. If I did
> indeed use Python lists, which are arrays, I would presumably
> get a much slower sweep_trail(), and that would show up. But
> currently it doesn't show up. It happens that hundreds of
> elements get swept; if that were done by copying inside a
> Python list, which is an array, it would get much more
> expensive, and the extremely cheap Prolog garbage collection,
> as it stands now, wouldn't be that cheap anymore.
>
> You can try it yourself. With Python lists my sweep_trail()
> would need frequent resizes, each O(n), so that sweep_trail()
> becomes O(n^2), whereas the current implementation is only O(n).
>

How about, instead: Use a Python list, and instead of removing
individual items one by one, filter out the ones you don't want, using
a list comprehension? That would be O(n) to completely remove all the
ones you don't want, instead of O(n) for each individual removal.
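
As a rough sketch, reusing the names from the posted sweep code (so
this is illustrative, not a drop-in patch):

def sweep_trail(trail):
    # trail is assumed to be an ordinary Python list here
    survivors = []
    for term in trail:
        if term.flags & MASK_VAR_MARK:     # marked: unmark and keep
            term.flags &= ~MASK_VAR_MARK
            survivors.append(term)
        else:                              # unmarked: drop the reference
            term.instantiated = NotImplemented
    return survivors                       # one O(n) pass, no per-item deletes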

Also, have you actually benchmarked a version that uses Python's
lists, or are you simply assuming that the removals will be slow?
Implementing your own singly-linked list was clearly suboptimal, but
have you tried writing simpler code and seeing if it is also faster?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ANN: Dogelog Runtime, Prolog to the Moon (2021)

2021-09-20 Thread Chris Angelico
On Tue, Sep 21, 2021 at 3:51 AM Mostowski Collapse  wrote:
>
> sympy also builds a language on top of Python.
> pandas also builds a language on top of Python.
>
> Is there some pope who says this isn't
> allowed? I don't think so; otherwise sympy, pandas, etc.
>
> wouldn't exist. I don't understand your argument.
>

That's not the same thing as reimplementing your own low-level
features on top of a high level language.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ANN: Dogelog Runtime, Prolog to the Moon (2021)

2021-09-20 Thread Chris Angelico
On Mon, Sep 20, 2021 at 9:50 PM Peter J. Holzer  wrote:
> > Let Python be Python, don't try to build your own language on top of
> > it.
>
> Well, he's writing a Prolog interpreter, so building his own language on
> top of Python is sort of the point. I think a better way to put it is
> "Don't try to write Python as if it was C".

Fair point. Or combining them both: Writing a language interpreter in
Python as if you were writing it in C, and then complaining that it is
slow, is only going to elicit "well uhh yes?" responses.

Languages like NetRexx (and, I think, Jython, although I can't find
any definitive and current answers) are slightly different from their
"parent" languages, because they make good use of their implementation
languages' features. This Prolog interpreter might not even need to be
different in functionality, but its implementation would be different,
and it could take advantage of the underlying garbage collection.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ANN: Dogelog Runtime, Prolog to the Moon (2021)

2021-09-19 Thread Chris Angelico
On Mon, Sep 20, 2021 at 3:19 AM Mostowski Collapse  wrote:
>
> I am refering to:
>
> Greg Ewing schrieb:
>  > where [w] is a weak reference object. Then you could periodically
>  > scan the trail looking for dead weakref objects and remove the
>  > corresponding [*] node from the list.
>  >
>  > You can also attach callbacks to weakref objects that are triggered
>  > when the referenced object dies. You might be able to make use of
>  > that to remove items from the trail instead of the periodic scanning.
>
> Question to Chris Angelico: If I stay with my
> sweep_trail(), which is the periodic scanning,
> I can use a singly linked list.
>
> On the other hand, if I used the trigger
> from Python, I would possibly need a doubly linked
> list to remove an element.
>
> Chris Angelico, is there a third option that I have
> overlooked? A singly linked list uses less space
> than a doubly linked list, which is why I go with the scan.
>

I don't know. I don't understand your code well enough to offer advice
like that, because *your code is too complicated* and not nearly clear
enough.

But however it is that you're doing things, the best way is almost
always to directly refer to objects. Don't fiddle around with creating
your own concept of a doubly-linked list and a set of objects; just
refer directly to the objects. Let Python be Python, don't try to
build your own language on top of it.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ANN: Dogelog Runtime, Prolog to the Moon (2021)

2021-09-18 Thread Chris Angelico
On Sun, Sep 19, 2021 at 11:46 AM Mostowski Collapse  wrote:
>
> Yeah, it seems weak references could indeed spare
> me mark_term(). But then I am still left with sweep_trail().
> I did not yet measure what takes more time mark_term()
> or sweep_trail(). The displayed "gc" is the sum of both.
>
> From what I have seen, a very large trail is practically reduced
> to a zero trail during Prolog GC, so I am assuming that
> mark_term() is not the workhorse. Usually mark_term()
> only marks what is not-Garbage, and sweep_trail()
>
> has to deal with Garbage and not-Garbage. And there
> is usually a lot of Garbage, much more than not-Garbage.
> Finding the objects that survive, is like finding the needle
> in the haystack, except we do not have to scan the

If you stop referring to something, it is garbage. Python will dispose of it.

You literally need to do nothing at all, and let the language take
care of things.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ANN: Dogelog Runtime, Prolog to the Moon (2021)

2021-09-16 Thread Chris Angelico
On Fri, Sep 17, 2021 at 7:17 AM Mostowski Collapse  wrote:
>
> About Exceptions: That's just building ISO core
> standard Prolog error terms.
>
> About Garbage Collection: That's just Prolog
> garbage collection, which does shrink some
> singly linked lists, which an ordinary
> programming language GC cannot do,
>

Okay, so you're building your own garbage collection on top of
Python's, and you're wondering why it's slow?

Change your code to not try to implement one language inside another,
and you'll see a massive performance improvement.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ANN: Dogelog Runtime, Prolog to the Moon (2021)

2021-09-16 Thread Chris Angelico
On Fri, Sep 17, 2021 at 3:20 AM Mostowski Collapse  wrote:
>
> Compound is not used for boxing. Integers and floats
> are represented directly. Also integers are not mapped to
> floats. But maybe compound could be a little flattened,
>

"Boxing" in this case isn't about ints and floats, since Java-like
bizarrenesses simply don't happen in Python; I'm talking about the way
that you frequently build up a Compound object for various situations
(even for throwing an error - you have a function that constructs a
generic Exception, and then buries a Compound inside it), and then
you're frequently checking if something is an instance of Compound.
All these constant back-and-forths are extremely expensive, since
they're not part of your algorithm at all.

At the very least, use tuples instead of Compounds, but it would be far
better to ask fewer questions about your data and do more things by
tidying up your algorithm. Unfortunately, I can't really advise with
any detail, because you have code like this:

###
# Mark a term.
#
# @param term The term.
##
def mark_term(term):

What does that even mean?! I get it, you have a term, and you're
marking it. Whatever that mark means. The comments add absolutely
nothing that the function header didn't tell me. Are you implementing
your own garbage collection on top of Python's? Or something else?
It's extremely hard to give any sort of recommendations when your code
is hard to read, and nearly all of the comments are nothing more than
restating what can be seen in the next line of code. Also, with the
number of globals you're using, tracing the purpose of your functions
is not easy.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ANN: Dogelog Runtime, Prolog to the Moon (2021)

2021-09-15 Thread Chris Angelico
On Thu, Sep 16, 2021 at 7:59 AM Mostowski Collapse  wrote:
>
> BTW: I could already make it faster, by not repeatedly
> accessing .arg anymore. It went down from ca.:
>
> 171'000 ms
>
> To this here:
>
> 140'000 ms
>
> But only in the cold run. In the warm run it went back
> to 171'000 ms. Possibly when my code is faster,
> it will create objects more faster, and kill the Python GC.
>
> Or it was because my Laptop went into screen black?
> And throttled the CPU. Not sure.
>

Instead of worrying about all these details, start by simplifying your
code. Focus on clean, simple, readable code, and don't microoptimize.
Specifically, focus on the core arithmetic that you're trying to do,
and get rid of all the bookkeeping overhead; most of that is a waste
of time. I mentioned earlier the repeated boxing and unboxing in
"Compound" objects - have you changed anything with those?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ANN: Dogelog Runtime, Prolog to the Moon (2021)

2021-09-15 Thread Chris Angelico
On Thu, Sep 16, 2021 at 5:15 AM Mostowski Collapse  wrote:
>
> If you find a "wonky" spot, I can replace it by "non-wonky"
> code. I noticed some differences between Python Dicts
> and JavaScript objects. Python tends to throw more exceptions.
>
> So in Python I now do the following:
>
>peek = kb.get(functor, NotImplemented)
>if peek is not NotImplemented:
>
> In JavaScript I can directly do:
>
> peek = kb[functor];
> if (peek !== undefined)
>
> But if get() in Python is implemented under the hood with
> exception handling, i.e. using the exception-prone [] and
> then, in case an exception is thrown, returning the
>
> default value, then Python's get() will probably be quite slow,
> since exceptions are usually slow.
>

No, you're thinking in terms of microoptimizations. Exception handling
isn't THAT slow. I'm talking more about how everything's getting
packaged up and then unpackaged (the repeated use of the "Compound"
class looks highly suboptimal), rather than reworking your algorithm
to behave more cleanly.
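
If the dict.get-versus-try/except question ever really matters, timeit
answers it directly (a sketch; the absolute numbers depend entirely on
the machine):

import timeit

kb = {"foo/2": object()}

lookup_get = 'kb.get("bar/2", NotImplemented)'
lookup_try = """
try:
    kb["bar/2"]
except KeyError:
    pass
"""

print(timeit.timeit(lookup_get, globals=globals()))
print(timeit.timeit(lookup_try, globals=globals()))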

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ANN: Dogelog Runtime, Prolog to the Moon (2021)

2021-09-15 Thread Chris Angelico
On Thu, Sep 16, 2021 at 3:17 AM Mostowski Collapse  wrote:
>
> I really wonder why my Python implementation
> is a factor of 40 slower than my JavaScript implementation.
> Structurally it's the same code.
>

Very hard to know. Your code is detailed and complicated. Do they
produce identical results? Are you using the same sort of
floating-point data everywhere, or is one integer and the other float?
What's going on with all the globals, the continuations, etc? My
suspicion is that you're trying to write weird, wonky Python code, and
then are surprised that it doesn't perform well.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on floating-point numbers

2021-09-11 Thread Chris Angelico
On Sun, Sep 12, 2021 at 1:07 AM Peter J. Holzer  wrote:
> If you have any "decimals" (i.e decimal digits to the right of your
> decimal point) then the input values won't be exactly representable and
> the nearest representation will use all available bits, thus losing some
> precision with most additions.

That's an oversimplification, though - numbers like 12345.03125 can be
perfectly accurately represented, since the fractional part is a
(negative) power of two.

The perceived inaccuracy of floating point numbers comes from an
assumption that a string of decimal digits is exact, and the
computer's representation of it is not. If I put this in my code:

ONE_THIRD = 0.3

then you know full well that it's not accurate, and that's nothing to
do with IEEE floating-point! The confusion comes from the fact that
one fifth (0.2) can be represented precisely in decimal, and not in
binary.
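
For instance:

>>> (12345.03125).as_integer_ratio()
(395041, 32)
>>> (0.2).as_integer_ratio()
(3602879701896397, 18014398509481984)

The first is exactly 12345 + 1/32, so no rounding was needed; the
second shows that one fifth has no finite binary expansion.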

Once you accept that "perfectly representable numbers" aren't
necessarily the ones you expect them to be, 64-bit floats become
adequate for a huge number of tasks. Even 32-bit floats are pretty
reliable for most tasks, although I suspect that there's little reason
to use them now - would be curious to see if there's any performance
benefit from restricting to the smaller format, given that most FPUs
probably have 80-bit or wider internal registers.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on writing a while loop for rolling two dice

2021-09-11 Thread Chris Angelico
On Sat, Sep 11, 2021 at 3:26 PM dn via Python-list
 wrote:
>
> On 31/08/2021 01.50, Chris Angelico wrote:
> > On Mon, Aug 30, 2021 at 11:13 PM David Raymond  
> > wrote:
> >>
> >>> def how_many_times():
> >>>   x, y = 0, 1
> >>>   c = 0
> >>>   while x != y:
> >>> c = c + 1
> >>> x, y = roll()
> >>>   return c, (x, y)
> >>
> >> Since I haven't seen it used in answers yet, here's another option using 
> >> our new walrus operator
> >>
> >> def how_many_times():
> >> roll_count = 1
> >> while (rolls := roll())[0] != rolls[1]:
> >> roll_count += 1
> >> return (roll_count, rolls)
> >>
> >
> > Since we're creating solutions that use features in completely
> > unnecessary ways, here's a version that uses collections.Counter:
> >
> > def how_many_times():
> > return next((count, rolls) for count, rolls in
> > enumerate(iter(roll, None)) if len(Counter(rolls)) == 1)
> >
> > Do I get bonus points for it being a one-liner that doesn't fit in
> > eighty characters?
>
>
> Herewith my claim to one-liner fame (assuming such leads in any way to
> virtue or fame)
>
> It retains @Peter's preference for a more re-usable roll_die() which
> returns a single event, cf the OP's roll() which returns two results).
>
>
> import itertools, random
>
> def roll_die():
> while True:
> yield random.randrange(1, 7)
>
> def how_many_times():
> return list( itertools.takewhile( lambda r:r[ 0 ] != r[ 1 ],
>   zip( roll_die(), roll_die() )
>   )
>)
>
> Also, a claim for 'bonus points' because the one-liner will fit within
> 80-characters - if only I didn't have that pernicious and vile habit of
> coding a more readable layout.
>
> It doesn't use a two-arg iter, but still rates because it does use a
> relatively-obscure member of the itertools library...
>

Nice, but that's only going to give you the ones that don't match. You
can then count those, and that's a start, but how do you capture the
matching rolls?

I smell another opportunity for gratuitous use of a language feature:
nonlocal. In a lambda function. Which may require shenanigans of epic
proportions.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Connecting python to DB2 database

2021-09-04 Thread Chris Angelico
On Sun, Sep 5, 2021 at 1:26 PM DFS  wrote:
>
> On 9/3/2021 9:50 AM, Chris Angelico wrote:
> > On Fri, Sep 3, 2021 at 11:37 PM DFS  wrote:
> >>
> >> On 9/3/2021 1:47 AM, Chris Angelico wrote:
> >>> On Fri, Sep 3, 2021 at 3:42 PM DFS  wrote:
> >>>>
> >>>> Having a problem with the DB2 connector
> >>>>
> >>>> test.py
> >>>> 
> >>>> import ibm_db_dbi
> >>>> connectstring =
> >>>> 'DATABASE=xxx;HOSTNAME=localhost;PORT=5;PROTOCOL=TCPIP;UID=xxx;PWD=xxx;'
> >>>> conn = ibm_db_dbi.connect(connectstring,'','')
> >>>>
> >>>> curr  = conn.cursor
> >>>> print(curr)
> >>>
> >>> According to PEP 249, what you want is conn.cursor() not conn.cursor.
> >>>
> >>> I'm a bit surprised as to the repr of that function though, which
> >>> seems to be this line from your output:
> >>>
> >>> 
> >>>
> >>> I'd have expected it to say something like "method cursor of
> >>> Connection object", which would have been an immediate clue as to what
> >>> needs to be done. Not sure why the repr is so confusing, and that
> >>> might be something to report upstream.
> >>>
> >>> ChrisA
> >>
> >>
> >> Thanks.  I must've done it right, using conn.cursor(), 500x.
> >> Bleary-eyed from staring at code too long I guess.
> >
> > Cool cool! Glad that's working.
> >
> >> Now can you get DB2 to accept ; as a SQL statement terminator like the
> >> rest of the world?   They call it "An unexpected token"...
> >>
> >
> > Hmm, I don't know that the execute() method guarantees to allow
> > semicolons. Some implementations will strip a trailing semi, but they
> > usually won't allow interior ones, because that's a good way to worsen
> > SQL injection vulnerabilities. It's entirely possible - and within the
> > PEP 249 spec, I believe - for semicolons to be simply rejected.
>
>
> The default in the DB2 'Command Line Plus' tool is that semicolons aren't
> "allowed".
>

Yeah, but that's a REPL feature. In Python, the end of a command is
signalled by the end of the string; if you want a multiline command,
you have a string with multiple lines in it. Command line SQL has to
either force you to one line, or have some means of distinguishing
between "more text coming" and "here's a command, run it".

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on floating-point numbers

2021-09-04 Thread Chris Angelico
On Sun, Sep 5, 2021 at 12:58 PM Greg Ewing  wrote:
>
> On 5/09/21 2:42 am, Hope Rouselle wrote:
> > Here's what I did on this case.  The REPL is telling me that
> >
> >7.23 = 2035064081618043/281474976710656
>
> If 7.23 were exactly representable, you would have got
> 723/1000.
>
> Contrast this with something that *is* exactly representable:
>
>  >>> 7.875.as_integer_ratio()
> (63, 8)
>
> and observe that 7875/1000 == 63/8:
>
>  >>> from fractions import Fraction
>  >>> Fraction(7875,1000)
> Fraction(63, 8)
>
> In general, to find out whether a decimal number is exactly
> representable in binary, represent it as a ratio of integers
> where the denominator is a power of 10, reduce that to lowest
> terms, and compare with the result of as_integer_ratio().
>

Or let Python do that work for you!

>>> from fractions import Fraction
>>> Fraction("7.875") == Fraction(7.875)
True
>>> Fraction("7.8") == Fraction(7.8)
False

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on floating-point numbers

2021-09-04 Thread Chris Angelico
On Sun, Sep 5, 2021 at 12:55 PM Hope Rouselle  wrote:
>
> Julio Di Egidio  writes:
>
> > On Thursday, 2 September 2021 at 16:51:24 UTC+2, Christian Gollwitzer wrote:
> >> Am 02.09.21 um 16:49 schrieb Julio Di Egidio:
> >> > On Thursday, 2 September 2021 at 16:41:38 UTC+2, Peter Pearson wrote:
> >> >> On Thu, 02 Sep 2021 10:51:03 -0300, Hope Rouselle wrote:
> >> >
> >> >>> 39.60000000000001
> >> >>
> >> >> Welcome to the exciting world of roundoff error:
> >> >
> >> > Welcome to the exiting world of Usenet.
> >> >
> >> > *Plonk*
> >>
> >> Pretty harsh, isn't it? He gave a concise example of the same inaccuracy
> >> right afterwards.
> >
> > And I thought you were not seeing my posts...
> >
> > Given that I have already given a full explanation, you guys, that you
> > realise it or not, are simply adding noise for the usual pub-level
> > discussion I must most charitably guess.
> >
> > Anyway, just my opinion.  (EOD.)
>
> Which is certainly appreciated --- as a rule.  Pub-level noise is pretty
> much unavoidable in investigation, education.  Being wrong is, too,
> unavoidable in investigation, education.  There is a point we eventually
> publish at the most respected journals, but that's a whole other
> interval of the time-line.  IOW, chill out! :-D (Give us a C-k and meet
> us up in the next thread.  Oh, my, you're not a Gnus user: you are a
> G2/1.0 user.  That's pretty scary.)
>

I'm not a fan of the noise level in a pub, but I have absolutely no
problem with arguing these points out. And everyone (mostly) in this
thread is being respectful. I don't mind when someone else is wrong,
especially since - a lot of the time - I'm wrong too (or maybe I'm the
only one who's wrong).

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on floating-point numbers

2021-09-04 Thread Chris Angelico
On Sun, Sep 5, 2021 at 12:44 PM Hope Rouselle  wrote:
>
> Chris Angelico  writes:
>
> > On Fri, Sep 3, 2021 at 4:29 AM Hope Rouselle  wrote:
> >>
> >> Just sharing a case of floating-point numbers.  Nothing needed to be
> >> solved or to be figured out.  Just bringing up conversation.
> >>
> >> (*) An introduction to me
> >>
> >> I don't understand floating-point numbers from the inside out, but I do
> >> know how to work with base 2 and scientific notation.  So the idea of
> >> expressing a number as
> >>
> >>   mantissa * base^{power}
> >>
> >> is not foreign to me. (If that helps you to perhaps instruct me on
> >> what's going on here.)
> >>
> >> (*) A presentation of the behavior
> >>
> >> >>> import sys
> >> >>> sys.version
> >> '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
> >> bit (AMD64)]'
> >>
> >> >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> >> >>> sum(ls)
> >> 39.599999999999994
> >>
> >> >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> >> >>> sum(ls)
> >> 39.60000000000001
> >>
> >> All I did was to take the first number, 7.23, and move it to the last
> >> position in the list.  (So we have a violation of the commutativity of
> >> addition.)
> >
> > It's not about the commutativity of any particular pair of operands -
> > that's always guaranteed.
>
> Shall we take this seriously?  It has to be about the commutativity of
> at least one particular pair because it is involved with the
> commutativity of a set of pairs. If various pairs are involved, then at
> least one is involved.  IOW, it is about the commutativity of some pair
> of operands and so it could not be the case that it's not about the
> commutativity of any.  (Lol.  I hope that's not too insubordinate.  I
> already protested against a claim for associativity in this thread and
> now I'm going for the king of the hill, for whom I have always been so
> grateful!)

No, that is not the case. Look at the specific pairs of numbers that get added.

ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]

>>> 7.23 + 8.41
15.64
>>> _ + 6.15
21.79
>>> _ + 2.31
24.099999999999998
>>> _ + 7.73
31.83
>>> _ + 7.77
39.599999999999994

And with the other list:

ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]

>>> 8.41 + 6.15
14.56
>>> _ + 2.31
16.87
>>> _ + 7.73
24.6
>>> _ + 7.77
32.370000000000005
>>> _ + 7.23
39.60000000000001

If commutativity is being violated, then there should be some
situation where you could have written "7.73 + _" instead of "_ +
7.73" or equivalent, and gotten a different result. But that is simply
not the case. What you are seeing is NOT commutativity, but the
consequences of internal rounding, which is a matter of associativity.

> Alright.  Thanks so much for this example.  Here's a new puzzle for me.
> The REPL makes me think that both 21.79 and 2.31 *are* representable
> exactly in Python's floating-point datatype because I see:
>
> >>> 2.31
> 2.31
> >>> 21.79
> 21.79
>
> When I add them, the result obtained makes me think that the sum is
> *not* representable exactly in Python's floating-point number:
>
> >>> 21.79 + 2.31
> 24.099999999999998
>
> However, when I type 24.10 explicitly, the REPL makes me think that
> 24.10 *is* representable exactly:
>
> >>> 24.10
> 24.1
>
> I suppose I cannot trust the appearance of the representation?  What's
> really going on there?  (Perhaps the trouble appears while Python is
> computing the sum of the numbers 21.79 and 2.31?)  Thanks so much!

The representation is a conversion from the internal format into
decimal digits. It is rounded for convenience of display, because you
don't want it to look like this:

>>> print(Fraction(24.10))
3391773469363405/140737488355328

Since that's useless, the repr of a float rounds it to the shortest
plausible number as represented in decimal digits. This has nothing to
do with whether it is exactly representable, and everything to do with
displaying things usefully in as many situations as possible :)
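
A quick way to see both views of the same float:

>>> 0.1
0.1
>>> format(0.1, ".20f")
'0.10000000000000000555'
>>> (0.1).as_integer_ratio()
(3602879701896397, 36028797018963968)

The repr is the shortest decimal string that round-trips to the same
float; the other two show the value that is actually stored.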

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: The sqlite3 timestamp conversion between unixepoch and localtime can't be done according to the timezone setting on the machine automatically.

2021-09-04 Thread Chris Angelico
On Sun, Sep 5, 2021 at 12:39 PM Alan Gauld via Python-list
 wrote:
>
> On 03/09/2021 18:37, Chris Angelico wrote:
>
> >>>> Without DST the schools opened in the dark so all the kids
> >>>> had to travel to school in the dark and the number of
> >>>> traffic accidents while crossing roads jumped.
> >
> > Are you saying that you had DST in winter, or that, when summer *and*
> > DST came into effect, there was more light at dawn? Because a *lot* of
> > people confuse summer and DST, and credit DST with the natural effects
> > of the season change.
>
> OK, I see the confusion. What I should point out was that the
> experiment involved us staying on DST and not reverting to UTC
> in the winter - that unified us with most of the EU apparently...
>
> So although I'm saying DST it was really the non-reversion from
> DST to UTC that caused problems. Arguably, if we just stayed on
> UTC and didn't have DST at all there would be no issue - except
> we'd be an hour out of sync with the EU. (Post Brexit that may
> not be seen as a problem!! :-)

Oh, I see what you mean.

When I complain about DST, I'm complaining about the repeated changes
of UTC offset. Whether you either stay on UTC+0 or stay on UTC+1, it's
basically the same, doesn't make a lot of difference. "Abolishing DST"
and "staying on summer time permanently" are effectively the same.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on floating-point numbers

2021-09-04 Thread Chris Angelico
On Sun, Sep 5, 2021 at 1:04 PM Hope Rouselle  wrote:
> The same question in other words --- what's a trivial way for the REPL
> to show me such cycles occur?
>
> >>> 7.23.as_integer_ratio()
> (2035064081618043, 281474976710656)
>
> Here's what I did on this case.  The REPL is telling me that
>
>   7.23 = 2035064081618043/281474976710656
>
> If that were true, then 7.23 * 281474976710656 would have to equal
> 2035064081618043.  So I typed:
>
> >>> 7.23 * 281474976710656
> 2035064081618043.0
>
> That agrees with the falsehood.  I'm getting no evidence of the problem.
>
> When take control of my life out of the hands of misleading computers, I
> calculate the sum:
>
>844424930131968
>  +5629499534213120
> 197032483697459200
> ==
> 203506408161804288
> =/= 203506408161804300
>
> How can I save the energy spent on manual verification?
>

What you've stumbled upon here is actually a neat elegance of
floating-point, and an often-forgotten fundamental of it: rounding
occurs exactly the same regardless of the scale. The number 7.23 is
represented with a certain mantissa, and multiplying it by some power
of two doesn't change the mantissa, only the exponent. So the rounding
happens exactly the same, and it comes out looking equal!

The easiest way, in Python, to probe this sort of thing is to use
either fractions.Fraction or decimal.Decimal. I prefer Fraction, since
a float is fundamentally a rational number, and you can easily see
what's happening. You can construct a Fraction from a string, and
it'll do what you would expect; or you can construct one from a float,
and it'll show you what that float truly represents.

It's often cleanest to print fractions out rather than just dumping
them to the console, since the str() of a fraction looks like a
fraction, but the repr() looks like a constructor call.

>>> Fraction(0.25)
Fraction(1, 4)
>>> Fraction(0.1)
Fraction(3602879701896397, 36028797018963968)

If it looks like the number you put in, it was perfectly
representable. If it looks like something of roughly that many digits,
it's probably not the number you started with.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on floating-point numbers

2021-09-04 Thread Chris Angelico
On Sun, Sep 5, 2021 at 12:50 PM Hope Rouselle  wrote:
>
> Christian Gollwitzer  writes:
>
> > Am 02.09.21 um 15:51 schrieb Hope Rouselle:
> >> Just sharing a case of floating-point numbers.  Nothing needed to be
> >> solved or to be figured out.  Just bringing up conversation.
> >> (*) An introduction to me
> >> I don't understand floating-point numbers from the inside out, but I
> >> do
> >> know how to work with base 2 and scientific notation.  So the idea of
> >> expressing a number as
> >>mantissa * base^{power}
> >> is not foreign to me. (If that helps you to perhaps instruct me on
> >> what's going on here.)
> >> (*) A presentation of the behavior
> >>
> > import sys
> > sys.version
> >> '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
> >> bit (AMD64)]'
> >>
> > ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> > sum(ls)
> >> 39.599999999999994
> >>
> > ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> > sum(ls)
> >> 39.60000000000001
> >> All I did was to take the first number, 7.23, and move it to the
> >> last
> >> position in the list.  (So we have a violation of the commutativity of
> >> addition.)
> >
> > I believe it is not commutativity, but associativity, that is
> > violated.
>
> Shall we take this seriously?  (I will disagree, but that doesn't mean I
> am not grateful for your post.  Quite the contrary.)  It in general
> violates associativity too, but the example above couldn't be referring
> to associativity because the second sum above could not be obtained from
> associativity alone.  Commutativity is required, applied to five pairs
> of numbers.  How can I go from
>
>   7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
>
> to
>
>   8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23?
>
> Perhaps only through various application of commutativity, namely the
> ones below. (I omit the parentheses for less typing.  I suppose that
> does not create much trouble.  There is no use of associativity below,
> except for the intended omission of parentheses.)
>
>  7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
>= 8.41 + 7.23 + 6.15 + 2.31 + 7.73 + 7.77
>= 8.41 + 6.15 + 7.23 + 2.31 + 7.73 + 7.77
>= 8.41 + 6.15 + 2.31 + 7.23 + 7.73 + 7.77
>= 8.41 + 6.15 + 2.31 + 7.73 + 7.23 + 7.77
>= 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23.
>

Show me the pairs of numbers. You'll find that they are not the same
numbers. Commutativity is specifically that a+b == b+a and you won't
find any situation where that is violated.

As soon as you go to three or more numbers, what you're doing is
changing which numbers get added first, which is this:

a + (b + c) != (a + b) + c

and this can most certainly be violated due to intermediate rounding.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on floating-point numbers

2021-09-04 Thread Chris Angelico
On Sun, Sep 5, 2021 at 12:48 PM Hope Rouselle  wrote:
>
> Chris Angelico  writes:
>
> > On Fri, Sep 3, 2021 at 4:58 AM Hope Rouselle  wrote:
> >>
> >> Hope Rouselle  writes:
> >>
> >> > Just sharing a case of floating-point numbers.  Nothing needed to be
> >> > solved or to be figured out.  Just bringing up conversation.
> >> >
> >> > (*) An introduction to me
> >> >
> >> > I don't understand floating-point numbers from the inside out, but I do
> >> > know how to work with base 2 and scientific notation.  So the idea of
> >> > expressing a number as
> >> >
> >> >   mantissa * base^{power}
> >> >
> >> > is not foreign to me. (If that helps you to perhaps instruct me on
> >> > what's going on here.)
> >> >
> >> > (*) A presentation of the behavior
> >> >
> >> >>>> import sys
> >> >>>> sys.version
> >> > '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
> >> > bit (AMD64)]'
> >> >
> >> >>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> >> >>>> sum(ls)
> >> > 39.599999999999994
> >> >
> >> >>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> >> >>>> sum(ls)
> >> > 39.60000000000001
> >> >
> >> > All I did was to take the first number, 7.23, and move it to the last
> >> > position in the list.  (So we have a violation of the commutativity of
> >> > addition.)
> >>
> >> Suppose these numbers are prices in dollar, never going beyond cents.
> >> Would it be safe to multiply each one of them by 100 and therefore work
> >> with cents only?  For instance
> >
> > Yes and no. It absolutely *is* safe to always work with cents, but to
> > do that, you have to be consistent: ALWAYS work with cents, never with
> > floating point dollars.
> >
> > (Or whatever other unit you choose to use. Most currencies have a
> > smallest-normally-used-unit, with other currency units (where present)
> > being whole number multiples of that minimal unit. Only in forex do
> > you need to concern yourself with fractional cents or fractional yen.)
> >
> > But multiplying a set of floats by 100 won't necessarily solve your
> > problem; you may have already fallen victim to the flaw of assuming
> > that the numbers are represented accurately.
>
> Hang on a second.  I see it's always safe to work with cents, but I'm
> only confident to say that when one gives me cents to start with.  In
> other words, if one gives me integers from the start.  (Because then, of
> course, I don't even have floats to worry about.)  If I'm given 1.17,
> say, I am not confident that I could turn this number into 117 by
> multiplying it by 100.  And that was the question.  Can I always
> multiply such IEEE 754 dollar amounts by 100?
>
> Considering your last paragraph above, I should say: if one gives me an
> accurate floating-point representation, can I assume a multiplication of
> it by 100 remains accurately representable in IEEE 754?

Humans usually won't give you IEEE 754 floats. What they'll usually
give you is a text string. Let's say you ask someone to type in the
prices of various items, the quantities thereof, and the shipping. You
take strings like "1.17" (or "$1.17"), and you parse that into the
integer 117.

> Hm, I think I see what you're saying.  You're saying multiplication and
> division in IEEE 754 is perfectly safe --- so long as the numbers you
> start with are accurately representable in IEEE 754 and assuming no
> overflow or underflow would occur.  (Addition and subtraction are not
> safe.)
>

All operations are equally valid. Anything that causes rounding can
cause loss of data, and that can happen with multiplication/division
as well as addition/subtraction. But yes, with the caveats you give,
everything is safe.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: The sqlite3 timestamp conversion between unixepoch and localtime can't be done according to the timezone setting on the machine automatically.

2021-09-03 Thread Chris Angelico
On Sat, Sep 4, 2021 at 3:33 AM Alan Gauld via Python-list
 wrote:
>
> On 02/09/2021 19:30, Chris Angelico wrote:
>
> >> Without DST the schools opened in the dark so all the kids
> >> had to travel to school in the dark and the number of
> >> traffic accidents while crossing roads jumped.
> >
> > How do they manage in winter?
>
> That was the winter. Sunrise wasn't till 10:00 or so
> and the schools open at 9. With DST sunrise became
> 9:00 and with pre-dawn light it is enough to see by.

Are you saying that you had DST in winter, or that, when summer *and*
DST came into effect, there was more light at dawn? Because a *lot* of
people confuse summer and DST, and credit DST with the natural effects
of the season change.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on floating-point numbers

2021-09-03 Thread Chris Angelico
On Sat, Sep 4, 2021 at 12:08 AM o1bigtenor  wrote:
> Hmmm - - - I would suggest that you haven't looked into
> taxation yet!
> In taxation you get a rational number that MUST be multiplied by
> the amount in currency.

(You can, of course, multiply a currency amount by any scalar. Just
not by another currency amount.)

> The error rate here is stupendous.
> Some organizations track each transaction with its taxes rounded.
> Then some track using untaxed amounts and then calculate the taxes
> on the whole (when you have 2 or 3 or 4 (dunno about more but
> who knows there are some seriously tax loving jurisdictions out there))
> the differences between adding amounts and then calculating taxes
> and calculating taxes on each amount and then adding all items
> together can have some 'interesting' differences.
>
> So financial data MUST be able to handle rational numbers.
> (I have been bit by the differences enumerated in the previous!)

The worst problem is knowing WHEN to round. Sometimes you have to do
intermediate rounding in order to make something agree with something
else :(

But if you need finer resolution than the cent, I would still
recommend trying to use fixed-point arithmetic. The trouble is
figuring out exactly how much precision you need. Often, 1c precision
is actually sufficient.
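
A tiny sketch of why the rounding point matters, using integer cents and
an exact Fraction for a made-up 7% rate (none of this is anyone's real
tax rule):

from fractions import Fraction

RATE = Fraction(7, 100)
items = [723, 841, 615]                  # line items in integer cents

def round_cents(x):
    return int(x + Fraction(1, 2))       # round half up, exactly

per_line = sum(round_cents(p * RATE) for p in items)
on_total = round_cents(sum(items) * RATE)
# The two totals can legitimately differ, which is why the jurisdiction's
# rule about *where* to round has to be followed to the letter.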

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Connecting python to DB2 database

2021-09-03 Thread Chris Angelico
On Fri, Sep 3, 2021 at 11:37 PM DFS  wrote:
>
> On 9/3/2021 1:47 AM, Chris Angelico wrote:
> > On Fri, Sep 3, 2021 at 3:42 PM DFS  wrote:
> >>
> >> Having a problem with the DB2 connector
> >>
> >> test.py
> >> 
> >> import ibm_db_dbi
> >> connectstring =
> >> 'DATABASE=xxx;HOSTNAME=localhost;PORT=5;PROTOCOL=TCPIP;UID=xxx;PWD=xxx;'
> >> conn = ibm_db_dbi.connect(connectstring,'','')
> >>
> >> curr  = conn.cursor
> >> print(curr)
> >
> > According to PEP 249, what you want is conn.cursor() not conn.cursor.
> >
> > I'm a bit surprised as to the repr of that function though, which
> > seems to be this line from your output:
> >
> > 
> >
> > I'd have expected it to say something like "method cursor of
> > Connection object", which would have been an immediate clue as to what
> > needs to be done. Not sure why the repr is so confusing, and that
> > might be something to report upstream.
> >
> > ChrisA
>
>
> Thanks.  I must've done it right, using conn.cursor(), 500x.
> Bleary-eyed from staring at code too long I guess.

Cool cool! Glad that's working.

> Now can you get DB2 to accept ; as a SQL statement terminator like the
> rest of the world?   They call it "An unexpected token"...
>

Hmm, I don't know that the execute() method guarantees to allow
semicolons. Some implementations will strip a trailing semi, but they
usually won't allow interior ones, because that's a good way to worsen
SQL injection vulnerabilities. It's entirely possible - and within the
PEP 249 spec, I believe - for semicolons to be simply rejected.
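
The portable habit is one statement per execute() call, no terminator,
with values passed as parameters. Roughly (table and column names made
up, and check ibm_db_dbi.paramstyle for the placeholder this driver
actually wants - I'm assuming qmark here):

cur = conn.cursor()
cur.execute("SELECT name, price FROM items WHERE price > ?", (100,))
rows = cur.fetchall()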

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on floating-point numbers

2021-09-03 Thread Chris Angelico
On Fri, Sep 3, 2021 at 10:42 PM jak  wrote:
>
> Il 03/09/2021 09:07, Julio Di Egidio ha scritto:
> > On Friday, 3 September 2021 at 01:22:28 UTC+2, Chris Angelico wrote:
> >> On Fri, Sep 3, 2021 at 8:15 AM Dennis Lee Bieber  
> >> wrote:
> >>> On Fri, 3 Sep 2021 04:43:02 +1000, Chris Angelico 
> >>> declaimed the following:
> >>>
> >>>> The naive summation algorithm used by sum() is compatible with a
> >>>> variety of different data types - even lists, although it's documented
> >>>> as being intended for numbers - but if you know for sure that you're
> >>>> working with floats, there's a more accurate algorithm available to
> >>>> you.
> >>>>
> >>>>>>> math.fsum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
> >>>> 39.6
> >>>>>>> math.fsum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
> >>>> 39.6
> >>>>
> >>>> It seeks to minimize loss to repeated rounding and is, I believe,
> >>>> independent of data order.
> >>>
> >>> Most likely it sorts the data so the smallest values get summed first,
> >>> and works its way up to the larger values. That way it minimizes the 
> >>> losses
> >>> that occur when denormalizing a value (to set the exponent equal to that 
> >>> of
> >>> the next larger value).
> >>>
> >> I'm not sure, but that sounds familiar. It doesn't really matter
> >> though - the docs just say that it is an "accurate floating point
> >> sum", so the precise algorithm is an implementation detail.
> >
> > The docs are quite misleading there, it is not accurate without further 
> > qualifications.
> >
> > <https://docs.python.org/3.8/library/math.html#math.fsum>
> > <https://code.activestate.com/recipes/393090/>
> >
>
> https://en.wikipedia.org/wiki/IEEE_754

I believe the definition of "accurate" here is that, if you take all
of the real numbers represented by those floats, add them all together
with mathematical accuracy, and then take the nearest representable
float, that will be the exact value that fsum will return. In other
words, its accuracy is exactly as good as the final result can be.
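
You can check that against the fractions module, since Fraction(float)
is exact (a quick sketch):

from fractions import Fraction
import math

ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
exact = sum(Fraction(x) for x in ls)   # true sum of the values actually stored
print(float(exact))     # nearest float to that exact sum -> 39.6
print(math.fsum(ls))    # fsum returns that same float    -> 39.6
print(sum(ls))          # naive sum has picked up rounding error -> 39.594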

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Connecting python to DB2 database

2021-09-02 Thread Chris Angelico
On Fri, Sep 3, 2021 at 3:42 PM DFS  wrote:
>
> Having a problem with the DB2 connector
>
> test.py
> 
> import ibm_db_dbi
> connectstring =
> 'DATABASE=xxx;HOSTNAME=localhost;PORT=5;PROTOCOL=TCPIP;UID=xxx;PWD=xxx;'
> conn = ibm_db_dbi.connect(connectstring,'','')
>
> curr  = conn.cursor
> print(curr)

According to PEP 249, what you want is conn.cursor() not conn.cursor.

I'm a bit surprised as to the repr of that function though, which
seems to be this line from your output:



I'd have expected it to say something like "method cursor of
Connection object", which would have been an immediate clue as to what
needs to be done. Not sure why the repr is so confusing, and that
might be something to report upstream.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on floating-point numbers

2021-09-02 Thread Chris Angelico
On Fri, Sep 3, 2021 at 8:15 AM Dennis Lee Bieber  wrote:
>
> On Fri, 3 Sep 2021 04:43:02 +1000, Chris Angelico 
> declaimed the following:
>
> >
> >The naive summation algorithm used by sum() is compatible with a
> >variety of different data types - even lists, although it's documented
> >as being intended for numbers - but if you know for sure that you're
> >working with floats, there's a more accurate algorithm available to
> >you.
> >
> >>>> math.fsum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
> >39.6
> >>>> math.fsum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
> >39.6
> >
> >It seeks to minimize loss to repeated rounding and is, I believe,
> >independent of data order.
> >
>
> Most likely it sorts the data so the smallest values get summed first,
> and works its way up to the larger values. That way it minimizes the losses
> that occur when denormalizing a value (to set the exponent equal to that of
> the next larger value).
>

I'm not sure, but that sounds familiar. It doesn't really matter
though - the docs just say that it is an "accurate floating point
sum", so the precise algorithm is an implementation detail.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: The sqlite3 timestamp conversion between unixepoch and localtime can't be done according to the timezone setting on the machine automatically.

2021-09-02 Thread Chris Angelico
On Fri, Sep 3, 2021 at 8:01 AM Alan Gauld via Python-list
 wrote:
>
> On 02/09/2021 19:28, Chris Angelico wrote:
>
> >> Except for the places that don't follow the IANA scheme and/or
> >> dynamically change their time settings on a whim. To be complete
> >> you need the ability to manually override too.
> >>
> >
> > What places are those?
>
> Mainly small non-tech oriented places such as small pacific islands
> or principalities with local governance - such as by a group of
> tribal elders. I mentioned earlier the example of Andorra announcing
> on the Friday night before a DST change that they were deferring
> it for a week to preserve the skiing conditions. But we came across
> several similar situations in dealing with multi-national service centres.
>
> > IANA maintains the database by noticing changes
> > and announcements, and updating the database.
>
> But don't those have to be electronic in nature? How, for example
> would it pick up the radio news announcement mentioned above?

Can't find the specific example of Andorra, but there have been plenty
of times when someone's reported a time change to the IANA list and
it's resulted in the zonedata being updated. "It" picks up changes by
people reporting them, because IANA is, ultimately, a group of people.

> > governments need to "opt in" or anything. Stuff happens because people
> > do stuff, and people do stuff because they want to be able to depend
> > on timezone conversions.
>
> Umm, they do DST because it makes their lives easier - more daylight,
> extra work time. etc. The needs of, or impact on, computers in these
> kinds of small localities and communities are way down the pecking order.

Yes, and what happens when those changes make other people's lives
harder because tzdata is out of date? People change tzdata to be
up-to-date. It's not the local government that maintains tzdata.

> > There ARE times when a government makes a change too quickly to get
> > updates out to everyone, especially those who depend on an OS-provided
> > copy of tzdata, so I agree with the "on a whim" part. Though,
> > fortunately, that's rare.
>
> I agree it is very rare and if you only operate in mainstream
> localities you probably never see it as an issue, it's only
> when you need to support "off grid" locations that manual
> control becomes important. Also the problems we had were about
> 15 years ago, things may be better ordered nowadays. (I've been
> retired for 7 years so can't speak of more recent events)

Oh, fifteen years ago. That explains why searching the tz-announce
list didn't help - the archive doesn't go back past 2012. The actual
tzdata files can be found going all the way back [1] but you'd have to
dig through to find the announcement in question. In any case, I'm
sure you'll find an update there; if not immediately, then after the
event, because tzdata cares a LOT about historical times. For
instance, this recent announcement [2] had one change to upcoming
times, but a number of corrections to past times.

So a manual override is ONLY necessary when (a) tzdata hasn't caught
up yet, or (b) tzdata has been updated, but you're using an old
version (maybe your OS hasn't caught up yet, and you can't get it from
PyPI).
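
(On the Python side, the PyPI route looks like this with 3.9+'s
zoneinfo: it reads the system zoneinfo files if present, and otherwise
falls back to the tzdata package, which you can upgrade independently of
the OS. Rough sketch:)

# pip install tzdata    # the IANA database as an ordinary, upgradable package
import zoneinfo
from datetime import datetime

zoneinfo.reset_tzpath(to=())   # optional: ignore a possibly-stale OS copy
print(datetime.now(zoneinfo.ZoneInfo("Africa/Cairo")))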

ChrisA

[1] https://data.iana.org/time-zones/releases/
[2] https://mm.icann.org/pipermail/tz-announce/2020-December/63.html
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: The sqlite3 timestamp conversion between unixepoch and localtime can't be done according to the timezone setting on the machine automatically.

2021-09-02 Thread Chris Angelico
On Fri, Sep 3, 2021 at 4:40 AM Alan Gauld via Python-list
 wrote:
>
> On 31/08/2021 23:31, Chris Angelico wrote:
>
> > Ah, good to know. I think that actually makes a lot of sense; in the
> > US, they try to let everyone pretend that the rest of the world
> > doesn't exist ("we always change at 2AM"), but in Europe, they try to
> > synchronize for the convenience of commerce ("everyone changes at 1AM
> > UTC").
>
> There's another gotcha with DST changes. The EU and USA have different
> dates on which they change to DST.
>
> In one of them (I can't recall which is which) they change on the 4th
> weekend of October/March in the other they change on the last weekend.
>
> That means on some years (when there are 5 weekends) there is a
> week when one has changed and the other hasn't. That caused us
> a lot of head scratching the first time we encountered it because
> our service centres in the US and EU were getting inconsistent
> time reporting and some updates showing as having happened in
> the future!
>

I live in Australia. You folks all change in *the wrong direction*.
Twice a year, there's a roughly two-month period that I call "DST
season", when different countries (or groups of countries) switch DST.
It is a nightmare to schedule anything during that time.

The ONLY way is to let the computer handle it. Don't try to predict
ANYTHING about DST manually.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: The sqlite3 timestamp conversion between unixepoch and localtime can't be done according to the timezone setting on the machine automatically.

2021-09-02 Thread Chris Angelico
On Fri, Sep 3, 2021 at 4:26 AM Alan Gauld via Python-list
 wrote:
>
> On 31/08/2021 22:32, Chris Angelico wrote:
>
> > If we could abolish DST world-wide, life would be far easier. All the
> > rest of it would be easy enough to handle.
> We tried that in the UK for 2 years back in the '70s and very
> quickly reverted to DST when they realized that the number
> of fatalities among young children going to school doubled
> during those two years.
>
> Without DST the schools opened in the dark so all the kids
> had to travel to school in the dark and the number of
> traffic accidents while crossing roads jumped.

How do they manage in winter? Can that be solved with better street
lighting? That was fifty years ago now, and the negative consequences
of DST are far stronger now.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on floating-point numbers

2021-09-02 Thread Chris Angelico
On Fri, Sep 3, 2021 at 4:58 AM Hope Rouselle  wrote:
>
> Hope Rouselle  writes:
>
> > Just sharing a case of floating-point numbers.  Nothing needed to be
> > solved or to be figured out.  Just bringing up conversation.
> >
> > (*) An introduction to me
> >
> > I don't understand floating-point numbers from the inside out, but I do
> > know how to work with base 2 and scientific notation.  So the idea of
> > expressing a number as
> >
> >   mantissa * base^{power}
> >
> > is not foreign to me. (If that helps you to perhaps instruct me on
> > what's going on here.)
> >
> > (*) A presentation of the behavior
> >
> > >>> import sys
> > >>> sys.version
> > '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
> > bit (AMD64)]'
> >
> > >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> > >>> sum(ls)
> > 39.594
> >
> > >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> > >>> sum(ls)
> > 39.61
> >
> > All I did was to take the first number, 7.23, and move it to the last
> > position in the list.  (So we have a violation of the commutativity of
> > addition.)
>
> Suppose these numbers are prices in dollar, never going beyond cents.
> Would it be safe to multiply each one of them by 100 and therefore work
> with cents only?  For instance

Yes and no. It absolutely *is* safe to always work with cents, but to
do that, you have to be consistent: ALWAYS work with cents, never with
floating point dollars.

(Or whatever other unit you choose to use. Most currencies have a
smallest-normally-used-unit, with other currency units (where present)
being whole number multiples of that minimal unit. Only in forex do
you need to concern yourself with fractional cents or fractional yen.)

But multiplying a set of floats by 100 won't necessarily solve your
problem; you may have already fallen victim to the flaw of assuming
that the numbers are represented accurately.

> --8<---cut here---start->8---
> >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> >>> sum(map(lambda x: int(x*100), ls)) / 100
> 39.6
>
> >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> >>> sum(map(lambda x: int(x*100), ls)) / 100
> 39.6
> --8<---cut here---end--->8---
>
> Or multiplication by 100 isn't quite ``safe'' to do with floating-point
> numbers either?  (It worked in this case.)

You're multiplying and then truncating, which risks a round-down
error. Try adding a half onto them first:

int(x * 100 + 0.5)

But that's still not a perfect guarantee. Far safer would be to
consider monetary values to be a different type of value, not just a
raw number. For instance, the value $7.23 could be stored internally
as the integer 723, but you also know that it's a value in USD, not a
simple scalar. It makes perfect sense to add USD+USD, it makes perfect
sense to multiply USD*scalar, but it doesn't make sense to multiply
USD*USD.
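
A very rough sketch of that idea - a thin wrapper over integer cents,
nothing production-grade (positive amounts only, names made up):

from dataclasses import dataclass

@dataclass(frozen=True)
class USD:
    cents: int                          # exact count of cents

    def __add__(self, other):
        if not isinstance(other, USD):
            return NotImplemented       # USD + scalar makes no sense
        return USD(self.cents + other.cents)

    def __mul__(self, scale):
        if isinstance(scale, USD):
            return NotImplemented       # USD * USD makes no sense
        return USD(round(self.cents * scale))

    def __str__(self):
        return f"${self.cents // 100}.{self.cents % 100:02d}"

print(USD(723) + USD(841))    # $15.64
print(USD(723) * 3)           # $21.69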

> I suppose that if I multiply it by a power of two, that would be an
> operation that I can be sure will not bring about any precision loss
> with floating-point numbers.  Do you agree?

Assuming you're nowhere near 2**53, yes, that would be safe. But so
would multiplying by a power of five. The problem isn't precision loss
from the multiplication - the problem is that your input numbers
aren't what you think they are. That number 7.23, for instance, is
really

>>> 7.23.as_integer_ratio()
(2035064081618043, 281474976710656)

... the rational number 2035064081618043 / 281474976710656, which is
very close to 7.23, but not exactly so. (The numerator would have to
be ...8042.88 to be exactly correct.) There is nothing you can do at
this point to regain the precision, although a bit of multiplication
and rounding can cheat it and make it appear as if you did.

Floating point is a very useful approximation to real numbers, but
real numbers aren't the best way to represent financial data. Integers
are.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on writing a while loop for rolling two dice

2021-09-02 Thread Chris Angelico
On Fri, Sep 3, 2021 at 4:51 AM Hope Rouselle  wrote:
>
> Chris Angelico  writes:
>
> > On Mon, Aug 30, 2021 at 11:13 PM David Raymond  
> > wrote:
> >>
> >> > def how_many_times():
> >> >   x, y = 0, 1
> >> >   c = 0
> >> >   while x != y:
> >> > c = c + 1
> >> > x, y = roll()
> >> >   return c, (x, y)
> >>
> >> Since I haven't seen it used in answers yet, here's another option using 
> >> our new walrus operator
> >>
> >> def how_many_times():
> >> roll_count = 1
> >> while (rolls := roll())[0] != rolls[1]:
> >> roll_count += 1
> >> return (roll_count, rolls)
> >>
> >
> > Since we're creating solutions that use features in completely
> > unnecessary ways, here's a version that uses collections.Counter:
> >
> > def how_many_times():
> > return next((count, rolls) for count, rolls in
> > enumerate(iter(roll, None)) if len(Counter(rolls)) == 1)
> >
> > Do I get bonus points for it being a one-liner that doesn't fit in
> > eighty characters?
>
> Lol.  You do not.  In fact, this should be syntax error :-D --- as I
> guess it would be if it were a lambda expression?

It got split across lines when I posted it, but if I did this in a
program, I'd make it a single long line. That said, though - Python
doesn't mind if you mess up the indentation inside a parenthesized
expression. Even broken like this, it WILL work. It just looks even
uglier than it does with proper indentation :)

BTW, this sort of thing is great as an anti-plagiarism check. If a
student ever turns in an abomination like this, you can be extremely
confident that it was copied from some programming site/list.
Especially since I've used the two-arg version of iter() in there -
that's quite a rarity.

Hm.

My mind is straying to evil things.

The two-arg iter can do SO much more than I'm using it for here.

By carefully designing the second argument, we could make something
that is equal to anything whose two elements are equal, which would
then terminate the loop. This... could be a lot worse than it seems.

I'll leave it as an exercise for the reader to figure out how to
capture the matching elements for return.
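
(For the curious, the core of that trick is nothing more than an object
whose __eq__ does the comparison - very roughly:)

class Doubles:
    def __eq__(self, rolls):
        # equal to any pair whose two elements match
        return rolls[0] == rolls[1]

for rolls in iter(roll, Doubles()):
    pass   # loops until roll() returns a double - which iter() then
           # swallows, hence the exercise about capturing it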

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on floating-point numbers

2021-09-02 Thread Chris Angelico
On Fri, Sep 3, 2021 at 4:29 AM Hope Rouselle  wrote:
>
> Just sharing a case of floating-point numbers.  Nothing needed to be
> solved or to be figured out.  Just bringing up conversation.
>
> (*) An introduction to me
>
> I don't understand floating-point numbers from the inside out, but I do
> know how to work with base 2 and scientific notation.  So the idea of
> expressing a number as
>
>   mantissa * base^{power}
>
> is not foreign to me. (If that helps you to perhaps instruct me on
> what's going on here.)
>
> (*) A presentation of the behavior
>
> >>> import sys
> >>> sys.version
> '3.8.10 (tags/v3.8.10:3d8993a, May  3 2021, 11:48:03) [MSC v.1928 64 bit 
> (AMD64)]'
>
> >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> >>> sum(ls)
> 39.594
>
> >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> >>> sum(ls)
> 39.61
>
> All I did was to take the first number, 7.23, and move it to the last
> position in the list.  (So we have a violation of the commutativity of
> addition.)
>

It's not about the commutativity of any particular pair of operands -
that's always guaranteed. What you're seeing here is the results of
intermediate rounding. Try this:

>>> def sum(stuff):
... total = 0
... for thing in stuff:
... total += thing
... print(thing, "-->", total)
... return total
...
>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>> sum(ls)
7.23 --> 7.23
8.41 --> 15.64
6.15 --> 21.79
2.31 --> 24.098
7.73 --> 31.83
7.77 --> 39.594
39.594
>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>> sum(ls)
8.41 --> 8.41
6.15 --> 14.56
2.31 --> 16.87
7.73 --> 24.6
7.77 --> 32.375
7.23 --> 39.61
39.61
>>>

Nearly all floating-point confusion stems from an assumption that the
input values are exact. They usually aren't. Consider:

>>> from fractions import Fraction
>>> for n in ls: print(n, Fraction(*n.as_integer_ratio()))
...
8.41 2367204554136617/281474976710656
6.15 3462142213541069/562949953421312
2.31 5201657569612923/2251799813685248
7.73 2175801569973371/281474976710656
7.77 2187060569041797/281474976710656
7.23 2035064081618043/281474976710656

Those are the ACTUAL values you're adding. Do the same exercise with
the partial sums, and see where the rounding happens. It's probably
happening several times, in fact.
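
One way to do that exercise - compare each float partial sum against the
correctly rounded exact sum at the same point:

from fractions import Fraction

total_f, total_x = 0.0, Fraction(0)
for x in [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]:
    total_f += x               # float partial sum, rounded at every step
    total_x += Fraction(x)     # exact partial sum of the stored values
    print(total_f, float(total_x) == total_f)

Any False in that output marks a step where the running float total has
drifted away from the best possible answer.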

The naive summation algorithm used by sum() is compatible with a
variety of different data types - even lists, although it's documented
as being intended for numbers - but if you know for sure that you're
working with floats, there's a more accurate algorithm available to
you.

>>> math.fsum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
39.6
>>> math.fsum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
39.6

It seeks to minimize loss to repeated rounding and is, I believe,
independent of data order.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on writing a while loop for rolling two dice

2021-09-02 Thread Chris Angelico
On Fri, Sep 3, 2021 at 4:33 AM Hope Rouselle  wrote:
> Yeah.  Here's a little context.  I came across this by processing a list
> of exercises.  (I'm teaching a course --- you know that by now, I
> guess.)  So the first thing I observed was the equal volume of work
> dedicated to while loops and for loops --- so I decided to compared
> which appeared more often in a certain sample of well-written Python
> code.  It turns out the for loop was much more frequent.  Students have
> been reporting too much work in too little time, so I decided to reduce
> the number of exercises involving while loops.  When I began to look at
> the exercises, to see which ones I'd exclude, I decided to exclude them
> all --- lol! --- except for one.  The one that remained was this one
> about rolling dice until a satisfying result would appear.  (All other
> ones were totally more naturally written with a for loop.)
>
> So if I were to also write this with a for-loop, it'd defeat the purpose
> of the course's moment.  Besides, I don't think a for-loop would improve
> the readability here.

It's on the cusp. When you ask someone to express the concept of "do
this until this happens", obviously that's a while loop; but as soon
as you introduce the iteration counter, it becomes less obvious, since
"iterate over counting numbers until this happens" is a quite viable
way to express this. However, if the students don't know
itertools.count(), they'll most likely put in an arbitrary limit (like
"for c in range(1)"), which you can call them out for.

> But I thought your protest against the while-True was very well put:
> while-True is not too readable for a novice.  Surely what's readable or
> more-natural /to someone/ is, well, subjective (yes, by definition).
> But perhaps we may agree that while rolling dice until a certain
> success, we want to roll them while something happens or doesn't happen.
> One of the two.  So while-True is a bit of a jump.  Therefore, in this
> case, the easier and more natural option is to say while-x-not-equal-y.

That may be the case, but in Python, I almost never write "while
True". Consider the two while loops in this function:

https://github.com/Rosuav/shed/blob/master/autohost_manager.py#L92

Thanks to Python's flexibility and efficient compilation, these loops
are as descriptive as those with actual conditions, while still
behaving exactly like "while True". (The inner loop, "more pages",
looks superficially like it should be a for loop - "for page in
pages:" - but the data is coming from successive API calls, so it
can't know.)

> I don't see it.  You seem to have found what we seem to agree that it
> would be the more natural way to write the strategy.  But I can't see
> it.  It certainly isn't
>
> --8<---cut here---start->8---
> def how_many_times_1():
>   c, x, y = 0, None, None
>   while x != y:
> c = c + 1
> x, y = roll()
>   return c, x, y
> --8<---cut here---end--->8---
>
> nor
>
> --8<---cut here---start->8---
> def how_many_times_2():
>   c, x, y = 0, None, None
>   while x == y:
> c = c + 1
> x, y = dados()
>   return c, x, y
> --8<---cut here---end--->8---
>
> What do you have in mind?  I couldn't see it.

You're overlaying two loops here. One is iterating "c" up from zero,
the other is calling a function and testing its results. It's up to
you which of these should be considered the more important, and which
is a bit of extra work added onto it. With the counter as primary, you
get something like this:

for c in itertools.count():
x, y = roll()
if x == y: return c, x, y

With the roll comparison as primary, you get this:

c, x, y = 0, 0, 1
while x != y:
x, y = roll()
c += 1
return c, x, y

Reworking the second into a do-while style (Python doesn't have that,
so we have to write it manually):

c = 0
while "x and y differ":
x, y = roll()
c += 1
if x == y: break
return c, x, y

And at this point, it's looking pretty much identical to the for loop
version. Ultimately, they're all the same and you can pick and choose
elements from each of them.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: The sqlite3 timestamp conversion between unixepoch and localtime can't be done according to the timezone setting on the machine automatically.

2021-09-02 Thread Chris Angelico
On Fri, Sep 3, 2021 at 4:22 AM Alan Gauld via Python-list
 wrote:
>
> On 31/08/2021 22:13, Chris Angelico wrote:
>
> > But ultimately, it all just means that timezones are too hard for
> > humans to handle, and we MUST handle them using IANA's database. It is
> > the only way.
>
> Except for the places that don't follow the IANA scheme and/or
> dynamically change their time settings on a whim. To be complete
> you need the ability to manually override too.
>

What places are those? IANA maintains the database by noticing changes
and announcements, and updating the database. I don't think
governments need to "opt in" or anything. Stuff happens because people
do stuff, and people do stuff because they want to be able to depend
on timezone conversions.

There ARE times when a government makes a change too quickly to get
updates out to everyone, especially those who depend on an OS-provided
copy of tzdata, so I agree with the "on a whim" part. Though,
fortunately, that's rare.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: The sqlite3 timestamp conversion between unixepoch and localtime can't be done according to the timezone setting on the machine automatically.

2021-09-02 Thread Chris Angelico
On Fri, Sep 3, 2021 at 4:18 AM Dennis Lee Bieber  wrote:
>
> On Tue, 31 Aug 2021 16:53:14 -0500, 2qdxy4rzwzuui...@potatochowder.com
> declaimed the following:
>
> >On 2021-09-01 at 07:32:43 +1000,
> >Chris Angelico  wrote:
> >> If we could abolish DST world-wide, life would be far easier. All the
> >> rest of it would be easy enough to handle.
> >
> >Agreed.
> >
>
> Unfortunately, most of the proposals in the US seem to be that
> /standard time/ would be abolished, and DST would rule year-round. Hence
> putting the center of the time zone one hour off from solar mean noon;
> which is what the time zones were originally based upon.
>

I'd be fine with that, honestly. It's a bit 'off', but it isn't THAT
big a deal. If some city in the US decides that it wants its timezone
to be a few hours AHEAD of UTC, more power to them, just as long as
it's consistent year-round.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: The sqlite3 timestamp conversion between unixepoch and localtime can't be done according to the timezone setting on the machine automatically.

2021-08-31 Thread Chris Angelico
On Wed, Sep 1, 2021 at 9:20 AM dn via Python-list
 wrote:
>
> On 01/09/2021 09.13, Chris Angelico wrote:
> > On Wed, Sep 1, 2021 at 6:38 AM dn via Python-list
> >  wrote:
> >>> Yeah. I do recommend making good use of the IANA tzinfo database
> >>> though (especially since Python 3.9 made that a bit easier to access),
> >>> as it's usually easier to get people to tell you what city/state
> >>> they're in, rather than whether daylight time will be active or not.
> >>> (It might take a little bit of translation to figure out that, for
> >>> instance, New Brunswick CA is America/Halifax, but that's not too hard
> >>> usually.) Letting tzinfo do all the work means you don't have to fret
> >>> about anyone's daylight saving transition dates, or whether they've
> >>> decided to change their clocks by half an hour to be different from
> >>> Japan's clocks, or to have DST not applicable during Ramadan, or to
> >>> have double DST, or double-negative DST. And yes, those are all real,
> >>> because you can't make up anything as insane as actual clock politics.
> >>
> >> So, given that it is a NUMERIC calculation, dispense with "New Brunswick
> >> CA is America/Halifax"; and avoid "Atlantic Time", "Atlantic Standard
> >> Time", "Atlantic Daylight Time", "AT", "ADT", or "AST", and express the
> >> time numerically: "17:00-3"
> >>
> >> Given that, someone at UTC-4 knows that his/her rendez-vous will be
> >> "1600", and I can figure it to be "0800" for me:
> >>
> >> 1700 - -3 = 20:00 (to calculate UTC), then UTC-4 = 16:00
> >> and
> >> 1700 - -3 = 20:00 (to calculate UTC), then UTC+12 = 32:00,
> >>   rounding to 24hrs: 08:00
> >>   (the next day)
> >
> > No, that's not reliable... because of that abomination called Daylight
> > Saving Time. Since I used New Brunswick, and since she's just gone
> > online, I'll use a specific example:
> >
> > DeviCat livestreams at 6pm every Tuesday (and other times, but I'm
> > going to focus on a weekly event here). Since she lives in NB, Canada,
> > she defines that time by what IANA refers to as America/Halifax.
> >
> > I want to be there at the start of each stream, since I'm one of her
> > moderators. But I live in a suburb of Melbourne - my clock shows what
> > IANA calls Australia/Melbourne.
> >
> > To turn this into a purely mathematical calculation, you have to know
> > exactly when she will go on or off DST, and when I will go on or off.
> > Trying to turn it into an offset is going to fail badly as soon as you
> > talk about "next Tuesday" and one of us is shifting DST this weekend.
> >
> > That's why it's better to let Python (or something) handle the whole
> > thing. Don't concern yourself with exactly what the hour differences
> > are, or which way DST is going, or anything; just convert Halifax time
> > to Melbourne time.
>
> OK, I admit it: I am so lazy that I don't use my fingers (nor my toes!)
> but expect my poor, over-worked (and under-paid) computer to calculate
> it all for me...
>
>
> I should have split the earlier explanation of two calculations, more
> clearly:
>
> Devicat can declare the start as "6pm" ("localisation")
> and state that the time-zone is UTC-3
> - or as @MRAB suggested, translate it to "21:00 UTC"
> ("internationalisation")
>
> You (@Chris) then perform the second-half calculation, by adjusting the
> UTC-value to your time-zone.
>
> - and were I to attend, would personalise ("localise") the time
> similarly - but using my locality's (different) UTC-offset.

Gotcha gotcha. Unfortunately that, while theoretically easier, is not
correct; she streams at 6pm every week, which means that the UTC time
is *different* in June and December.

> I agree, the idea of 'Summer Time' is a thorough pain - even more-so
> when the host publishes in local-time but forgets that there will be a
> "spring forward" or "fall back" between the time of publication and the
> meeting-date!

Right. Which is basically guaranteed when it's a recurring event.

> Accordingly, when the Néo-Brunswickoise publishes "6pm", all the locals
> will be happy.
>
> If she adds UTC, or the locally-applicable UTC-offset (for Summer-Time,
> or not), the international community can make their own and personal

Re: urgent

2021-08-31 Thread Chris Angelico
On Wed, Sep 1, 2021 at 9:03 AM Barry  wrote:
>
>
>
> > On 31 Aug 2021, at 16:53, jak  wrote:
> >
> > Il 31/08/2021 03:05, Python ha scritto:
> >> Hari wrote:
> >>> i was download ur python software but it is like boring user interface for
> >>> me like young student to learn ,can u have any updates?
> >> God, let me die please...
> >
> > Oh no, please don't speak in that way ... evidently now that python has
> > reached its tenth version its prompt is a little boring. It may need to
> > be replaced. You could open a competition notice to vote on the new
> > prompt. I would vote for:
> >
> > :^P>
>
> The big problem with >>> is that it means a third level quote in email 
> clients.
> So when people cut-n-paste REPL output it’s formatted badly by email clients.
> A prompt that avoided that issue would be nice.
>
> >>> print(“this is not a quoted reply”)
>

Welp, gonna have to convince people that the Python 3000 decision
needs to be reversed :)

https://www.python.org/dev/peps/pep-3099/#interactive-interpreter

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: The sqlite3 timestamp conversion between unixepoch and localtime can't be done according to the timezone setting on the machine automatically.

2021-08-31 Thread Chris Angelico
On Wed, Sep 1, 2021 at 8:22 AM MRAB  wrote:
>
> [snip]
> In the EU, DST in the member states changes at the same time. It's not
> like the US where it ripples across the timezones, so the differences
> vary during the change. It all happens in one go.
>

Ah, good to know. I think that actually makes a lot of sense; in the
US, they try to let everyone pretend that the rest of the world
doesn't exist ("we always change at 2AM"), but in Europe, they try to
synchronize for the convenience of commerce ("everyone changes at 1AM
UTC").

A quick browse of Wikipedia suggests that some European countries
(outside of the EU, which mandates DST transitions) have constant
year-round UTC offsets. In theory, there could be a non-EU country
that observes DST with different dates, but I can't find any examples.
Here's hoping, hehe.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: The sqlite3 timestamp conversion between unixepoch and localtime can't be done according to the timezone setting on the machine automatically.

2021-08-31 Thread Chris Angelico
On Wed, Sep 1, 2021 at 7:54 AM <2qdxy4rzwzuui...@potatochowder.com> wrote:
>
> On 2021-09-01 at 07:32:43 +1000,
> Chris Angelico  wrote:
>
> > On Wed, Sep 1, 2021 at 7:17 AM <2qdxy4rzwzuui...@potatochowder.com> wrote:
>
> > > What about Phoenix?  In the winter, it's the same time there as it is in
> > > San Francisco, but in the summer, it's the same time there as it is in
> > > Denver (Phoenix doesn't observe Daylight Saving Time).
> >
> > I prefer to say: In winter, San Francisco (or Los Angeles) is the same
> > as Phoenix, but in summer, Los Angeles changes its clocks away, and
> > Denver changes to happen to be the same as Phoenix.
>
> Not exactly.  Sort of.  Phoenix and Denver are both in America/Denver
> (aka US/Mountain), but only Denver observes DST.  San Francisco and Los
> Angeles are both in America/Los_Angeles, and both observe DST.

America/Phoenix is a separate time zone from America/Denver. During
winter they represent the same time, but during summer, Phoenix
doesn't change its offset, and Denver does.

(San Francisco isn't an IANA timezone; the city precisely follows Los
Angeles time.)

> > ... I think Egypt (Africa/Cairo) is currently in the lead for weirdest
> > timezone change ...
>
> Yeah, I read about that somewhere.  Remember when the Pope declared that
> September should skip a bunch of days?

Well, that's from transitioning from the Julian calendar to the
Gregorian. The same transition was done in different countries at
different times. The Pope made the declaration for the Catholic church
in 1582, and all countries whose official religion was Catholic
changed at the same time; other countries chose their own schedules
for the transition. Notably, Russia converted in 1918, immediately
after the "October Revolution", which happened on the 25th of October
on the Julian calendar, but the 7th of November on the Gregorian.

> Way back in the 1990s, I was working with teams in Metro Chicago, Tel
> Aviv, and Tokyo (three separate teams, three really separate time zones,
> at least two seaprate DST transition dates).  I changed my wristwatch to
> 24 hour time (and never looked back).  I tried UTC for a while, which
> was cute, but confusing.

I tried UTC for a while too, but it became easier to revert to local
time for my watch and just do the conversions directly.

Perhaps, in the future, we will all standardize on UTC, and daylight
time will be a historical relic. And then, perhaps, we will start
getting frustrated at relativity-based time discrepancies ("it's such
a pain scheduling anything with someone on Mars, their clocks move
faster than ours do!").

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: The sqlite3 timestamp conversion between unixepoch and localtime can't be done according to the timezone setting on the machine automatically.

2021-08-31 Thread Chris Angelico
On Wed, Sep 1, 2021 at 7:17 AM <2qdxy4rzwzuui...@potatochowder.com> wrote:
>
> On 2021-09-01 at 08:36:55 +1200,
> dn via Python-list  wrote:
>
> > ... there is less consideration about working-out what time it is in
> > Pune cf Kolkata, than between (say) San Francisco and Denver -
> > although they are in the same country, are they in the same time-zone,
> > or not?  (they aren't!)
>
> What about Phoenix?  In the winter, it's the same time there as it is in
> San Francisco, but in the summer, it's the same time there as it is in
> Denver (Phoenix doesn't observe Daylight Saving Time).

I prefer to say: In winter, San Francisco (or Los Angeles) is the same
as Phoenix, but in summer, Los Angeles changes its clocks away, and
Denver changes to happen to be the same as Phoenix.

> And then there's Indiana, a medium size state that tends to get ignored
> (they used to advertise "there's more than just corn in Indiana").  Most
> of Indiana is in US/Eastern, but the cities that are (for practical
> purposes) suburbs of Chicago are in US/Central (aka America/Chicago).

At least the US has governed DST transitions. As I understand it, any
given city has to follow one of the standard time zones, and may
EITHER have no summer time, OR transition at precisely 2AM/3AM local
time on the federally-specified dates. (I think the EU has also
mandated something similar for member states.)

If we could abolish DST world-wide, life would be far easier. All the
rest of it would be easy enough to handle.

> ChrisA is right; you can't make this [stuff] up.

Yeah. And if you think you've heard it all, sign up for the
tzdata-announce mailing list and wait for the next phenomenon. I think
Egypt (Africa/Cairo) is currently in the lead for weirdest timezone
change, for (with short notice) announcing that they'd have DST during
summer but not during Ramadan. Since "summer" is defined by a solar
calendar and "Ramadan" is defined by a lunar calendar, that means the
DST exclusion might happen entirely in winter (no effect), at one end
or other of summer (shortens DST, just changes the dates), or in the
middle of summer (DST on, DST off, DST on, DST off, in a single year).
But they will, at some point, be eclipsed by an even more bizarre
timezone change. I don't dare try to predict what will happen, because
I know that the reality will be even worse

> Having lived in the United States my entire life (and being a nerd), I
> can confirm that (1) I'm used to it and handle it as well as possible,
> but (2) many people are not and don't.

Yup, absolutely. I've been working internationally for a number of
years now, so my employment has been defined by a clock that isn't my
own. I got used to it and developed tools and habits, but far too many
people don't, and assume that simple "add X hours" conversions
suffice.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: The sqlite3 timestamp conversion between unixepoch and localtime can't be done according to the timezone setting on the machine automatically.

2021-08-31 Thread Chris Angelico
On Wed, Sep 1, 2021 at 6:38 AM dn via Python-list
 wrote:
> > Yeah. I do recommend making good use of the IANA tzinfo database
> > though (especially since Python 3.9 made that a bit easier to access),
> > as it's usually easier to get people to tell you what city/state
> > they're in, rather than whether daylight time will be active or not.
> > (It might take a little bit of translation to figure out that, for
> > instance, New Brunswick CA is America/Halifax, but that's not too hard
> > usually.) Letting tzinfo do all the work means you don't have to fret
> > about anyone's daylight saving transition dates, or whether they've
> > decided to change their clocks by half an hour to be different from
> > Japan's clocks, or to have DST not applicable during Ramadan, or to
> > have double DST, or double-negative DST. And yes, those are all real,
> > because you can't make up anything as insane as actual clock politics.
>
> So, given that it is a NUMERIC calculation, dispense with "New Brunswick
> CA is America/Halifax"; and avoid "Atlantic Time", "Atlantic Standard
> Time", "Atlantic Daylight Time", "AT", "ADT", or "AST", and express the
> time numerically: "17:00-3"
>
> Given that, someone at UTC-4 knows that his/her rendez-vous will be
> "1600", and I can figure it to be "0800" for me:
>
> 1700 - -3 = 20:00 (to calculate UTC), then UTC-4 = 16:00
> and
> 1700 - -3 = 20:00 (to calculate UTC), then UTC+12 = 32:00,
>   rounding to 24hrs: 08:00
>   (the next day)

No, that's not reliable... because of that abomination called Daylight
Saving Time. Since I used New Brunswick, and since she's just gone
online, I'll use a specific example:

DeviCat livestreams at 6pm every Tuesday (and other times, but I'm
going to focus on a weekly event here). Since she lives in NB, Canada,
she defines that time by what IANA refers to as America/Halifax.

I want to be there at the start of each stream, since I'm one of her
moderators. But I live in a suburb of Melbourne - my clock shows what
IANA calls Australia/Melbourne.

To turn this into a purely mathematical calculation, you have to know
exactly when she will go on or off DST, and when I will go on or off.
Trying to turn it into an offset is going to fail badly as soon as you
talk about "next Tuesday" and one of us is shifting DST this weekend.

That's why it's better to let Python (or something) handle the whole
thing. Don't concern yourself with exactly what the hour differences
are, or which way DST is going, or anything; just convert Halifax time
to Melbourne time.
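
(Which, with 3.9+, is a couple of lines per event. Sketching it with one
particular Tuesday hard-coded:)

from datetime import datetime
from zoneinfo import ZoneInfo

stream = datetime(2021, 9, 7, 18, 0, tzinfo=ZoneInfo("America/Halifax"))
print(stream.astimezone(ZoneInfo("Australia/Melbourne")))
# -> 2021-09-08 07:00:00+10:00; rerun it after either region changes its
# clocks and the conversion shifts automatically.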

> For many of us, the mental-calculations are relatively easy to manage.
> For Python the code is trivial. Computation is easier than terminology
> 'translation' (particularly when one has to research the terms first!
> - did you know what "ADT" meant?)

I asked DeviCat what country and province ("state" in other regions)
she lived in, and then confirmed with her that Halifax time was what
her clock showed. The term "ADT" was never relevant.

In a lot of situations, you don't even need to ask the human - you can
let the web browser or desktop app report the timezone. The app can
say something like "In order to schedule this event,  will need to
know your time zone. Is that okay?" and then send the IANA timezone
name.

> Teasing @Chris: I'm not sure why it should be amusing that two entities
> called 'Ireland' should have different time-zones (pot?kettle) - after
> all, does "Western Australia" use the same time-zone as "South
> Australia"? For that matter, the same as the bulk of the Australian
> population?

Western Australia uses Australia/Perth timezone, and South Australia
uses Australia/Adelaide. They're different base times from the east
coast where I am by two hours, and half an hour, respectively; and
they have different DST rules.

On the east coast, we all have the same winter time, but in summer,
Melbourne, Sydney, and Hobart move clocks forward, but Brisbane
doesn't.

> The time-zone which perplexes me most, is India. This because it is not
> only a different hour, but also requires a 30-minute off-set - operating
> at UTC+5:30!

Yup, we got that too... Adelaide is half an hour back from Melbourne
(UTC+9:30). But it gets worse. Kathmandu is on a quarter hour. And the
Chatham Islands (part of New Zealand) are 12:45 ahead of UTC in
winter, and then they add an hour of DST in summer, putting them at
UTC+13:45.

> Fortunately, like China, the entire country (officially) operates in the
> same time-zone. Accordingly, there is less consideration about
> working-out what time it is in Pune cf Kolkata, than between (say) San
> Francisco and Denver - although they are in the same country, are they
> in the same time-zone, or not?
> (they aren't!)

That would be convenient for working within China, but on the flip
side, it means that geographically-nearby locations can have vastly
different clocks. Oh, and 

Re: Struggling to understand timedelta representation when applying an offset for an hour earlier - why is days = -1?

2021-08-31 Thread Chris Angelico
On Wed, Sep 1, 2021 at 1:55 AM dcs3spp via Python-list
 wrote:
>
> Hi,
>
> I wonder if anyone can help
>
> I am struggling to understand the representation of timedelta when used in 
> conjunction with astimezone.
>
> Given the code below, in a python interactive interpreter, I am trying to 
> calculate a resultant datetime an hour earlier from a UTC datetime
>
> ```bash
> >>> dt = datetime(2021, 8, 22, 23, 59, 31, tzinfo=timezone.utc)
> >>> hour_before=dt.astimezone(timezone(-timedelta(seconds=3600)))
> >>> hour_before
> datetime.datetime(2021, 8, 22, 22, 59, 31, 
> tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=82800)))
> ```
>
> I cannot understand why the resultant datetime.timedelta is days=-1, 
> seconds=82800 (23 hours) .
>
> Why is it not an hour earlier as seconds=-3600? Why is days = -1 when the 
> resultant calculated date is the same, year, day, month??

It's consistent with modulo arithmetic:

>>> x = -3600
>>> x // 86400
-1
>>> x % 86400
82800

>>> help(datetime.timedelta)
...
 |  --
 |  Data descriptors defined here:
 |
 |  days
 |  Number of days.
 |
 |  microseconds
 |  Number of microseconds (>= 0 and less than 1 second).
 |
 |  seconds
 |  Number of seconds (>= 0 and less than 1 day).
 |
 |  --

The sub-day portions are guaranteed to be zero or above, meaning that
a small negative offset is described as "a day ago, plus 23 hours"
rather than "an hour ago". It's the exact same thing, though.

If you would prefer to see ALL components negative, just negate the
timedelta and then negate each component; that will give you an
equivalent timedelta.

>>> datetime.timedelta(seconds=-3600)
datetime.timedelta(days=-1, seconds=82800)
>>> -datetime.timedelta(seconds=-3600)
datetime.timedelta(seconds=3600)
>>> datetime.timedelta(seconds=-3600-86400)
datetime.timedelta(days=-2, seconds=82800)
>>> -datetime.timedelta(seconds=-3600-86400)
datetime.timedelta(days=1, seconds=3600)

Hope that explains it!

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Create a real-time interactive TUI using Python.

2021-08-31 Thread Chris Angelico
On Wed, Sep 1, 2021 at 1:59 AM hongy...@gmail.com  wrote:
>
> I want to know whether python can be used to create real-time interactive 
> TUI, as hstr [1] does.
>
> [1] https://github.com/dvorka/hstr
>

Yes.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: The sqlite3 timestamp conversion between unixepoch and localtime can't be done according to the timezone setting on the machine automatically.

2021-08-31 Thread Chris Angelico
On Tue, Aug 31, 2021 at 8:55 PM MRAB  wrote:
>
> On 2021-08-31 02:16, dn via Python-list wrote:
> > On 31/08/2021 11.07, Dennis Lee Bieber wrote:
> >> On Sun, 29 Aug 2021 19:49:19 -0700 (PDT), "hongy...@gmail.com"
> >>  declaimed the following:
> > ...
> >
> >>  Might have helped to mention you were in China... To me, CST is North
> >> America Central Standard Time (and I'd have expected this time of year to
> >> see CDT - Central Daylight Time)... That led me on a weird meaningless side
> >> track...
> > ...
> >
> >>  I'm in EDT (Eastern Daylight Time) -- so 4 hours behind UTC.
> >
> >
> > Which is correct?
> >
> > CST in China
> > https://www.timeanddate.com/time/zones/cst-china
> >
> > CST in North America
> > https://www.timeanddate.com/time/zones/cst
> >
> > and not to mention Cuba
> > https://www.timeanddate.com/time/zones/
> >
> [snip]
> What annoys me is when someone starts that a webinar will start at, say,
> xx ET. I have to know which country that person is in and whether
> daylight savings is currently in effect (EST or EDT?) so that I can
> convert to my local time.

If someone says "ET", then I would assume they mean America/New_York -
it seems that only in the US do people so utterly assume that everyone
else is in the same country. In Europe, I hear people say "CEST" and
such (though I still prefer "Europe/Prague" or whatever country
they're in), so the only issue there is that they don't always say
"CEDT" when it's daylight time.

> It's so much easier to use UTC.
>
> I know what timezone I'm in and whether daylight savings is currently in
> effect here, so I know the local offset from UTC.

Yeah. I do recommend making good use of the IANA tzinfo database
though (especially since Python 3.9 made that a bit easier to access),
as it's usually easier to get people to tell you what city/state
they're in, rather than whether daylight time will be active or not.
(It might take a little bit of translation to figure out that, for
instance, New Brunswick CA is America/Halifax, but that's not too hard
usually.) Letting tzinfo do all the work means you don't have to fret
about anyone's daylight saving transition dates, or whether they've
decided to change their clocks by half an hour to be different from
Japan's clocks, or to have DST not applicable during Ramadan, or to
have double DST, or double-negative DST. And yes, those are all real,
because you can't make up anything as insane as actual clock politics.

(I find the Ireland situation particularly amusing. Northern Ireland,
being part of the UK, operates on London time, with clocks advancing
one hour for summer. The Republic of Ireland, on the other hand, has a
standard time which is one hour later than Greenwich's, but then they
subtract an hour during winter, returning to standard time in summer.
So when the rest of Europe adds an hour, Ireland stops subtracting
one. Clocks in Belfast and Dublin always show the same times.)

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on writing a while loop for rolling two dice

2021-08-30 Thread Chris Angelico
On Tue, Aug 31, 2021 at 12:28 AM Peter Otten <__pete...@web.de> wrote:
>
> On 30/08/2021 15:50, Chris Angelico wrote:
>
> > def how_many_times():
> >  return next((count, rolls) for count, rolls in
> > enumerate(iter(roll, None)) if len(Counter(rolls)) == 1)
>
>
> That's certainly the most Counter-intuitive version so far;)

Thank you, I appreciate that :)

> > Do I get bonus points for it being a one-liner that doesn't fit in
> > eighty characters?
>
> Nah, but you'll get an honorable mention when you run it through
> pycodestyle without line break...
>

Are there any linters that warn against "unintuitive use of
two-argument iter()"?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on writing a while loop for rolling two dice

2021-08-30 Thread Chris Angelico
On Mon, Aug 30, 2021 at 11:13 PM David Raymond  wrote:
>
> > def how_many_times():
> >   x, y = 0, 1
> >   c = 0
> >   while x != y:
> > c = c + 1
> > x, y = roll()
> >   return c, (x, y)
>
> Since I haven't seen it used in answers yet, here's another option using our 
> new walrus operator
>
> def how_many_times():
> roll_count = 1
> while (rolls := roll())[0] != rolls[1]:
> roll_count += 1
> return (roll_count, rolls)
>

Since we're creating solutions that use features in completely
unnecessary ways, here's a version that uses collections.Counter:

def how_many_times():
return next((count, rolls) for count, rolls in
enumerate(iter(roll, None)) if len(Counter(rolls)) == 1)

Do I get bonus points for it being a one-liner that doesn't fit in
eighty characters?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on writing a while loop for rolling two dice

2021-08-29 Thread Chris Angelico
On Mon, Aug 30, 2021 at 9:53 AM dn via Python-list
 wrote:
>
> On 29/08/2021 22.24, Chris Angelico wrote:
> > On Sun, Aug 29, 2021 at 8:14 PM dn via Python-list
> >  wrote:
> >> Efficiency:
> >> - wonder how max( d ) == min( d ) compares for speed with the set() type
> >> constructor?
> >
> > That may or may not be an improvement.
> >
> >> - alternately len( d ) < 2?
> >> - or len( d ) - 1 coerced to a boolean by the if?
> >
> > Neither of these will make any notable improvement. The work is done
> > in constructing the set, and then you're taking the length. How you do
> > the comparison afterwards is irrelevant.
>
> It was far too late for either of us (certainly this little boy) to be
> out-and-coding - plus an excellent illustration of why short-names are a
> false-economy which can quickly (and easily) lead to "technical debt"!
>
>
> The "d" is a tuple (the 'next' returned from the zip-output object)
> consisting of a number of die-throw results). Thus, can toss that into
> len() without (any overhead of) conversion to a set.

Oh. Well, taking the length of the tuple is fast... but useless. The
point was to find out whether all the values in it were the same :)

Conversion to set tests this because the length of the set is the
number of unique elements; checking max and min works because two
scans will tell you if they're all the same; using all with a
generator stops early if you find a difference, but requires
back-and-forth calls into Python code; there are various options, and
the choice probably won't make a material performance difference
anyway :)
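
Spelled out for a tuple d of die rolls, the options look like this (all
equivalent for deciding "did we roll a double?"):

d = (3, 3)

len(set(d)) == 1          # one unique value: build a set, take its length
min(d) == max(d)          # two scans, no extra container
all(x == d[0] for x in d) # stops at the first mismatch, but calls back
                          # into Python for each element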

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: PEP Idea: Real private attribute

2021-08-29 Thread Chris Angelico
On Mon, Aug 30, 2021 at 5:49 AM Mehrzad Saremi  wrote:
>
> No, a class ("the class that I'm lexically inside") cannot be accessed from
> outside of the class. This is why I'm planning to offer it as a core
> feature because only the parser would know. There's apparently no elegant
> solution if you want to implement it yourself. You'll need to write
> self.__privs__[__class__, "foo"], whenever you want to use the feature and
> even wrapping it in superclasses won't remedy it, because the parent class
> isn't aware which class you're inside. It seems to me name mangling must
> have been an ad-hoc solution in a language that it doesn't really fit when
> it could have been implemented in a much more cogent way.
>

If the parent class isn't aware which class you're in, how is the
language going to define it?

Can you give a full run-down of the semantics of your proposed privs,
and how it's different from something like you just used above -
self.__privs__[__class__, "foo"] - ? If the problem is the ugliness
alone, then say so; but also, how this would work with decorators,
since you specifically mention them as a use-case.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on writing a while loop for rolling two dice

2021-08-29 Thread Chris Angelico
On Sun, Aug 29, 2021 at 8:14 PM dn via Python-list
 wrote:
> Efficiency:
> - wonder how max( d ) == min( d ) compares for speed with the set() type
> constructor?

That may or may not be an improvement.

> - alternately len( d ) < 2?
> - or len( d ) - 1 coerced to a boolean by the if?

Neither of these will make any notable improvement. The work is done
in constructing the set, and then you're taking the length. How you do
the comparison afterwards is irrelevant.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: PEP Idea: Real private attribute

2021-08-28 Thread Chris Angelico
On Sun, Aug 29, 2021 at 7:40 AM Mehrzad Saremi  wrote:
>
> Python currently uses name mangling for double-underscore attributes. Name
> mangling is not an ideal method to avoid name conflicting. There are
> various normal programming patterns that can simply cause name conflicting
> in double-underscore members. A typical example is when a class is
> re-decorated using the same decorator. The decorator can not take
> double-underscore members without name conflicts. For example:
>
> ```
> @custom_decorator("a")
> @custom_decorator("b")
> class C:
> pass
> ```
>
> The `@custom_decorator` wrapper may need to hold private members, but
> Python's current name conflict resolution does not provide any solution and
> the decorator cannot hold private members without applying tricky
> programming methods.
>
> Another example is when a class inherits from a base class of the same name.
>
> ```
> class View:
> """A class representing a view of an object; similar to
> numpy.ndarray.view"""
> pass
>
> class Object:
> class View(View):
> """A view class costumized for objects of type Object"""
> pass
> ```
>
> Again, in this example, class `Object.View` can't take double-underscore
> names without conflicting with `View`'s.
>
> My idea is to introduce real private members (by which I do not mean to be
> inaccessible from outside the class, but to be guaranteed not to conflict
> with other private members of the same object). These private members are
> started with triple underscores and are stored in a separate dictionary
> named `__privs__`. Unlike `__dict__` that takes 'str' keys, `__privs__`
> will be a double layer dictionary that takes 'type' keys in the first
> level, and 'str' keys in the second level.
>
> For example, assume that the user runs the following code:
> ```
> class C:
> def __init__(self, value):
> self.___member = value
>
> c = C("my value")
> ```
>
> On the last line, Python's attribute setter creates a new entry in the
> dictionary with key `C`, adds the value "my value" to a new entry with the
> key 'member'.
>
> The user can then retrieve `c.___member` by invoking the `__privs__`
> dictionary:
>
> ```
> print(c.__privs__[C]['member'])  # prints 'my value'
> ```
>
> Note that, unlike class names, class objects are unique and there will not
> be any conflicts. Python classes are hashable and can be dictionary keys.
> Personally, I do not see any disadvantage of using __privs__ over name
> mangling/double-underscores. While name mangling does not truly guarantee
> conflict resolution, __privs__ does.

Not entirely sure how it would know the right type to use (subclassing
makes that tricky), but whatever your definition is, there's nothing
stopping you from doing it yourself. Don't forget that you have
__class__ available if you need to refer to "the class that I'm
lexically inside" (that's how the zero-arg super() function works), so
you might do something like self.__privs__[__class__, "foo"] to refer
to a thing.
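
Something along these lines, as a rough do-it-yourself sketch (no new
syntax, just __class__ and an ordinary dict; the attribute name is an
example only):

class C:
    def __init__(self, value):
        if not hasattr(self, "__privs__"):
            self.__privs__ = {}
        # __class__ is C here, even if self is an instance of a subclass,
        # so two classes can both store a "member" key without clashing.
        self.__privs__[__class__, "member"] = value

    def get_member(self):
        return self.__privs__[__class__, "member"]

c = C("my value")
print(c.get_member())   # my value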

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on the popularity of loops while and for

2021-08-28 Thread Chris Angelico
On Sun, Aug 29, 2021 at 7:40 AM Hope Rouselle  wrote:
>
> I'd like get a statistic of how often each loop is used in practice.
>
> I was trying to take a look at the Python's standard libraries --- those
> included in a standard installation of Python 3.9.6, say --- to see
> which loops are more often used among while and for loops.  Of course,
> since English use the preposition ``for'' a lot, that makes my life
> harder.  Removing comments is easy, but removing strings is harder.  So
> I don't know yet what I'll do.
>
> Have you guys ever measured something like that in a casual or serious
> way?  I'd love to know.  Thank you!

For analysis like this, I recommend using the Abstract Syntax Tree:

https://docs.python.org/3/library/ast.html

You can take a Python source file, parse it to the AST, and then walk
that tree to see what it's using. That will avoid any false positives
from the word "for" coming up in the wrong places.
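
A rough sketch of the sort of count I mean (file handling kept
minimal; the filename is just an example):

import ast

def count_loops(source):
    counts = {"for": 0, "while": 0}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.For, ast.AsyncFor)):
            counts["for"] += 1
        elif isinstance(node, ast.While):
            counts["while"] += 1
    return counts

with open("some_module.py") as f:
    print(count_loops(f.read()))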

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on writing a while loop for rolling two dice

2021-08-28 Thread Chris Angelico
On Sun, Aug 29, 2021 at 7:37 AM Hope Rouselle  wrote:
>
> How should I write this?  I'd like to roll two six-sided dice until I
> get the same number on both.  I'd like to get the number of times I
> tried.  Here's a primitive I'm using:
>
> --8<---cut here---start->8---
> >>> x, y = roll()
> >>> x
> 6
> >>> y
> 6 # lucky
>
> >>> x, y = roll()
> >>> x
> 4
> >>> y
> 1 # unlucky
> --8<---cut here---end--->8---
>
> Here's my solution:
>
> --8<---cut here---start->8---
> def how_many_times():
>   x, y = 0, 1
>   c = 0
>   while x != y:
>     c = c + 1
>     x, y = roll()
>   return c, (x, y)
> --8<---cut here---end--->8---
>
> Why am I unhappy?  I'm wish I could confine x, y to the while loop.  The
> introduction of ``x, y = 0, 1'' must feel like a trick to a novice.  How
> would you write this?  Thank you!

Your loop, fundamentally, is just counting. So let's just count.

def how_many_times():
    for c in itertools.count():
        ...

Inside that loop, you can do whatever you like, including returning
immediately if you have what you want. I'll let you figure out the
details. :)
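
(If you just want to see one possible ending - spoiler warning - it
could look like this, assuming roll() returns a pair as in your code:)

import itertools

def how_many_times():
    for c in itertools.count(1):
        x, y = roll()
        if x == y:
            return c, (x, y)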

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: basic auth request

2021-08-25 Thread Chris Angelico
On Thu, Aug 26, 2021 at 12:48 AM Jon Ribbens via Python-list
 wrote:
>
> On 2021-08-25, Chris Angelico  wrote:
> > On Thu, Aug 26, 2021 at 12:16 AM Jon Ribbens via Python-list
> > wrote:
> >> There are so many trusted CAs these days that the chances of them all
> >> being secure approaches zero - they are not all equal yet they are all
> >> equally trusted. Which is why a change of CA on a site you have visited
> >> before is potentially suspicious.
> >
> > Do any popular web browsers notify you if that happens? I've certainly
> > never noticed it with any that I use (and I've transitioned several
> > sites from one CA to another).
>
> There was, if the site was using "HTTP Public Key Pinning". But
> that appears to have now been removed in favour of "Certificate
> Transparency", which to me seems to be a system very much based
> on the "problem: horse gone; solution: shut stable door" principle.
>
> Another attempt at combatting this problem is DNS CAA records,
> which are a way of politely asking all CAs in the world except the
> ones you choose "please don't issue a certificate for my domain".
> By definition someone who had hacked a CA would pay no attention
> to that request, of course.

True, but that would still prevent legit CAs from unwittingly
contributing to an attack. But it still wouldn't help if someone can
do any sort of large-scale DNS attack, which is kinda essential for
most of this to matter anyway (it doesn't matter if an attacker has a
fake cert if all traffic goes to the legit site anyway).

> > I've come to the conclusion that most security threats don't bother
> > most people, and that security *warnings* bother nearly everyone, so
> > real authentication of servers doesn't really matter all that much.
> > *Encryption* does still have value, but you'd get that with a
> > self-signed cert too.
>
> Encryption without knowing who you're encrypting *to* is worthless,
> it's pretty much functionally equivalent to not encrypting.

Somewhat. It does prevent various forms of MitM attack. It's all about
adding extra difficulty for an attacker, so I wouldn't say
"worthless" just because it isn't 100% reliable.

Earlier I posited a hypothetical approach wherein the server would
sign a new cert using the old cert, and would then be able to present
that upon request. Are there any massive glaring problems with that?
(Actually, I'm pretty sure there will be. Lemme reword. What massive
glaring problems can you see with that?) It would require servers to
retain a chain of certificates, and to be able to provide that upon
request. It wouldn't even need a change to HTTP per se - could be
something like "https://your.host.example/cert_proof.txt; the same way
that robots.txt is done. In theory, that would allow a client to, at
the cost of retaining the one last-seen cert for each site, have
confidence that the site is the same one that was previously seen.

But, maybe we're just coming back to "it doesn't matter and nobody
really cares".

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: basic auth request

2021-08-25 Thread Chris Angelico
On Thu, Aug 26, 2021 at 12:16 AM Jon Ribbens via Python-list
 wrote:
>
> On 2021-08-25, Chris Angelico  wrote:
> > On Wed, Aug 25, 2021 at 5:20 PM Barry Scott  wrote:
> >> Only if this threat model matters to you or your organisation.
> >> Personal its low down of the threats I watch out for.
> >>
> >> The on-line world and the real-world are the same here.
> >>
> >> If a business changes hands then do you trust the new owners?
> >>
> >> Nothing we do with PKI certificates will answer that question.
> >
> > Fair enough; but a closer parallel would be walking up to a
> > previously-familiar street vendor and seeing a different person there.
> > Did the business change hands, or did some random dude hop over the
> > counter and pretend to be a new owner?
> >
> > But you're right, it's not usually a particularly high risk threat.
> > Still, it does further weaken the value of named SSL certificates and
> > certificate authorities; there's not actually that much difference if
> > the server just gave you a self-signed cert. In theory, the CA is
> > supposed to protect you against someone doing a DNS hack and
> > substituting a different server, in practice, anyone capable of doing
> > a large-scale DNS hack is probably capable of getting a very
> > legit-looking SSL cert for the name as well.
>
> There are so many trusted CAs these days that the chances of them all
> being secure approaches zero - they are not all equal yet they are all
> equally trusted. Which is why a change of CA on a site you have visited
> before is potentially suspicious.

Do any popular web browsers notify you if that happens? I've certainly
never noticed it with any that I use (and I've transitioned several
sites from one CA to another).

I've come to the conclusion that most security threats don't bother
most people, and that security *warnings* bother nearly everyone, so
real authentication of servers doesn't really matter all that much.
*Encryption* does still have value, but you'd get that with a
self-signed cert too.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: basic auth request

2021-08-25 Thread Chris Angelico
On Wed, Aug 25, 2021 at 5:20 PM Barry Scott  wrote:
>
> Only if this threat model matters to you or your organisation.
> Personal its low down of the threats I watch out for.
>
> The on-line world and the real-world are the same here.
>
> If a business changes hands then do you trust the new owners?
>
> Nothing we do with PKI certificates will answer that question.

Fair enough; but a closer parallel would be walking up to a
previously-familiar street vendor and seeing a different person there.
Did the business change hands, or did some random dude hop over the
counter and pretend to be a new owner?

But you're right, it's not usually a particularly high risk threat.
Still, it does further weaken the value of named SSL certificates and
certificate authorities; there's not actually that much difference if
the server just gave you a self-signed cert. In theory, the CA is
supposed to protect you against someone doing a DNS hack and
substituting a different server, in practice, anyone capable of doing
a large-scale DNS hack is probably capable of getting a very
legit-looking SSL cert for the name as well.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: PyQt5 is not recognized from python 3.8 installation in python 3.10

2021-08-22 Thread Chris Angelico
On Mon, Aug 23, 2021 at 4:31 AM Mohsen Owzar  wrote:
> How can I get all the packages available in 3.8 version also available for 
> 3.10 version without any new installation in 3.10 for each all already 
> existing packages?
>

You can't. With compiled binaries, especially, it's important to
install into each version separately - there can be minor differences
which will be taken care of by the installer. Normally, that's not a
problem, other than that you have to install each one again; the best
way would be to keep track of your package dependencies in a file
called requirements.txt, and then you can simply install from that
(python3 -m pip install -r requirements.txt) into the new version.

As to PyQt5 specifically, I don't know what the issue here is, but I
tried it on my system and it successfully installed version 5.15.4.
Are you using the latest version of pip? There might be some
other requirements. Alternatively, I'm seeing a potential red flag
from this line:

> C:\Qt\4.7.4\bin\qmake.exe -query

You're trying to install Qt5, but maybe it's coming across a Qt4
installation? Not sure if that's significant or not.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: basic auth request

2021-08-22 Thread Chris Angelico
On Sun, Aug 22, 2021 at 8:30 PM Barry Scott  wrote:
>
>
>
> On 22 Aug 2021, at 10:37, Chris Angelico  wrote:
>
> When it comes to security, one thing I'm very curious about is why we
> don't have any sort of certificate renewal verification. My browser
> could retain the certificates of some web site (or of all web sites,
> even - they're not THAT large), and if the site presents a different
> cert, it could show the previously retained one and challenge the
> server "prove that you're the same guy". This proof would consist of
> the latest cert, signed by the older cert's key (or possibly a chain
> that can construct such a proof, which would allow the server to
> simply retain each new cert signed by the one previous cert, forming a
> line - or a tree if necessary). My suspicion is that it'd add little
> above simply having a valid cert, but if people are paranoid, surely
> that's a better place to look?
>
>
> The web site proves it owners the hostname and/or IP address using its 
> certificate.
> You use your trust store to show that you can trust that certificate.
>
> The fact that a certificate changes is not a reason to stop trusting a site.
>
> So it does not add anything.
>
> The pain point in PKI is revocation. The gold standard is for a web site to 
> use OCSP stapling.
> But that is rare sadly. And because of issues with revocation lists, 
> (privacy, latency, need to
> fail open on failiure, DoD vector, etc) this is where the paranoid should 
> look.
>

Fair point. Let me give you a bit of context.

Recently, the owner/operator of a site (I'll call it
https://demo.example/ ) died. Other people, who have been using the
site extensively, wish for it to continue. If the domain registration
expires, anyone can reregister it, and can then generate a completely
new certificate for the common name "demo.example", and web browsers
will accept that. The old cert may or may not have expired, but it
won't be revoked.

As far as I can tell, a web browser with default settings will happily
accept the change of ownership. It won't care that the IP address,
certificate, etc, have all changed. It just acknowledges that some CA
has signed some certificate with the right common name. And therein is
the vulnerability. (NOTE: I'm not saying that this is a real and
practical vulnerability - this is theoretical only, and a focus for
the paranoid.)

This is true even if the old cert were one of those enhanced
certificates that some CAs try to upsell you to ("Extended Validation"
and friends). Even if, in the past, your bank was secured by one of
those certs, your browser will still accept a perfectly standard cert
next time. Which, in my opinion, renders those (quite pricey)
certificates no more secure than something from Let's Encrypt that has
no validation beyond ownership of DNS.

Of course, you can pin a certificate. You can ask your browser to warn
you if it's changed *at all*. But since certs expire, that's highly
impractical, hence wondering why we don't have a system for using the
old cert to prove ownership of the new one.

So how is a web browser supposed to distinguish between (a) normal
operation in which certs expire and are replaced, and (b) legit or
non-legit ownership changes? (Of course the browser can't tell you
whether the ownership change is legit, but out-of-band info can help
with that.)

Or does it really matter that little?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: from foo import bar and the ast module

2021-08-22 Thread Chris Angelico
On Mon, Aug 23, 2021 at 12:26 AM Dan Stromberg  wrote:
>
>
> On Sun, Aug 22, 2021 at 7:14 AM Chris Angelico  wrote:
>>
>> On Mon, Aug 23, 2021 at 12:08 AM Dan Stromberg  wrote:
>> >
>> > In 'from foo import bar':
>> >
>> > With the ast module, I see how to get bar, but I do not yet see how to get
>> > the foo.
>> >
>> > There are clearly ast.Import and ast.ImportFrom, but I do not see the foo
>> > part in ast.ImportFrom.
>> >
>> > ?
>>
>> >>> import ast
>> >>> ast.dump(ast.parse("from foo import bar"))
>> "Module(body=[ImportFrom(module='foo', names=[alias(name='bar')],
>> level=0)], type_ignores=[])"
>> >>> ast.parse("from foo import bar").body[0].module
>> 'foo'
>
>
> With 'from . import bar', I get a module of None.
>
>  Does this seem strange?
>

No; it's just the AST so it can't bring in any additional information.
To distinguish package-relative imports, use the level attribute:

>>> ast.dump(ast.parse("from . import bar").body[0])
"ImportFrom(names=[alias(name='bar')], level=1)"
>>> ast.dump(ast.parse("from .foo import bar").body[0])
"ImportFrom(module='foo', names=[alias(name='bar')], level=1)"
>>> ast.dump(ast.parse("from foo.bar import bar").body[0])
"ImportFrom(module='foo.bar', names=[alias(name='bar')], level=0)"
>>> ast.dump(ast.parse("from .foo.bar import bar").body[0])
"ImportFrom(module='foo.bar', names=[alias(name='bar')], level=1)"
>>> ast.dump(ast.parse("from ..foo.bar import bar").body[0])
"ImportFrom(module='foo.bar', names=[alias(name='bar')], level=2)"

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: from foo import bar and the ast module

2021-08-22 Thread Chris Angelico
On Mon, Aug 23, 2021 at 12:08 AM Dan Stromberg  wrote:
>
> In 'from foo import bar':
>
> With the ast module, I see how to get bar, but I do not yet see how to get
> the foo.
>
> There are clearly ast.Import and ast.ImportFrom, but I do not see the foo
> part in ast.ImportFrom.
>
> ?

>>> import ast
>>> ast.dump(ast.parse("from foo import bar"))
"Module(body=[ImportFrom(module='foo', names=[alias(name='bar')],
level=0)], type_ignores=[])"
>>> ast.parse("from foo import bar").body[0].module
'foo'

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: basic auth request

2021-08-22 Thread Chris Angelico
On Sun, Aug 22, 2021 at 6:45 PM Peter J. Holzer  wrote:
>
> On 2021-08-22 05:04:43 +1000, Chris Angelico wrote:
> > On Sun, Aug 22, 2021 at 4:55 AM Martin Di Paola
> >  wrote:
> > > HTTPS ensures encryption so the content, including the Basic Auth
> > > username and password, is secret for any external observer.
> > >
> > > But it is *not* secret for the receiver (the server): if it was
> > > compromised an adversary will have access to your password. It is much
> > > easier to print a captured password than cracking the hashes.
> > >
> > > Other authentication mechanisms exist, like OAuth, which are more
> > > "secure".
>
> OAuth is "an /authorization protocol/, rather than an /authentication
> protocol/" [Wikipedia].
>
> > If your server is compromised in that way, *all is lost*.
>
> If "you" are the service provider, yes. but if "you" are the user, no.

If "your server" is compromised, then you are the service provider,
are you not? I'm not sure what "your server" would mean if "you" are
the user.

But okay. Suppose I log in to Random Service 1, using a user name and
password, and also to Random Service 2, using OAuth. What happens if
those servers get compromised?

1) Someone knows the login credentials that I created for that
service. If I've used the same password that I also use at my bank,
then I am in big trouble. It is, largely, my fault.

2) Someone has access to my login token and the client ID/secret
associated with it. That attacker can now impersonate me to the OAuth
provider, to the exact extent that the scopes permit. At absolute
least, the attacker gets to know a lot about who I am on some entirely
separate service.

I'm talking here about a complete and utter compromise, the sort where
neither SSL encryption nor proper password hashing would protect my
details, since that's what was being claimed.

Which is actually worse? Is it as clear-cut?

> From a user's perspective "all" is much more than the data (including
> username and password) associated with that particular service. So if
> one service is compromised, not all is lost, but only a bit (of course,
> depending on the importance of the service, that bit may be little or
> big; a random web forum probably doesn't matter. Your bank account
> probably does).
>
> So assuming that many people reuse passwords (which of course they
> shouldn't and thanks to password is becoming rarer, but is still
> distressingly common),

True, but reuse of passwords is something under the user's control.
OAuth scope selection is partly under the service's control, and
partly under the provider's (some providers have extremely coarse
scopes, widening the attack).

> there are three levels of security (from highest
> to lowest) in this scenario:
>
> 1: The secret known to the user is never transmitted to the server at
>all, the client only proves that the secret is known. This is the
>case for TLS client authentication (which AFAIK all browsers support
>but is a real pain in the ass to set up, so it's basically never
>used) and for SCRAM (which isn't part of HTTP(S) but could be
>implemented in JavaScript).

This would be great, if nobody minded (a) setting up a unique client
certificate for every site, or (b) allowing the ultimate in remote
tracking cookie whereby any server could recognize you by your TLS
certificate.

> 2: The secret is transmitted on login but never stored. This limits the
>damage to users who logged in while the server was compromised. This
>is the case for Basic Authentication combined with a probperly salted
>hashed storage.

Current best prac, and what I'd generally recommend to most people.
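
For concreteness, a rough standard-library sketch of that kind of
storage (the scrypt parameters are illustrative only, not a
recommendation):

import hashlib, hmac, os

def hash_password(password, salt=None):
    salt = salt or os.urandom(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return salt, digest          # store both; the plaintext is never kept

def check_password(password, salt, digest):
    candidate = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return hmac.compare_digest(candidate, digest)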

> 3: The secret is stored on the server. When the server is compromised,
>all user's passwords are known. This is (AFAIK) the case for Digest
>and NTLM.

I'm not sure what effects wide-spread Digest/NTLM usage would have on
password managers and the risks of compromise to them, but the way
things currently are, I would prefer salted/hashed passwords, such
that a data breach doesn't mean compromise of all historical data.

> So given the choice between Basic Auth and Digest or NTLM (over HTTPS in
> all cases) I would prefer Basic Auth. Ideally I would use SCRAM or a
> public key method, but I admit that my security requirements were never
> high enough to actually bother to do that (actually, I used SSL client
> side auth once, 20 years ago, ...).
>

I would, of course, prefer something like form fill-out over Basic,
but that's due to UI concerns rather than security ones.

SCRAM seems tempting, but in a context of web browsers, I'm not sure
that it would be worth the hassle.

When it comes to securit

Re: basic auth request

2021-08-21 Thread Chris Angelico
On Sun, Aug 22, 2021 at 4:55 AM Martin Di Paola
 wrote:
>
> While it is correct to say that Basic Auth without HTTPS is absolutely
> insecure, using Basic Auth *and* HTTPS is not secure either.
>
> Well, the definition of "secure" depends of your threat model.

Yes. Which makes statements like "not secure" rather suspect :)

> HTTPS ensures encryption so the content, including the Basic Auth
> username and password, is secret for any external observer.
>
> But it is *not* secret for the receiver (the server): if it was
> compromised an adversary will have access to your password. It is much
> easier to print a captured password than cracking the hashes.
>
> Other authentication mechanisms exist, like OAuth, which are more
> "secure".

If your server is compromised in that way, *all is lost*. If an
attacker is actually running code on your server, listening to your
sockets, after everything's decrypted, then *shut that server down*. I
don't think there is ANY security model that can handle this - if
you're using OAuth, and the server is compromised, then your client ID
and client secret are just as visible to the attacker as passwords
would be.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on perhaps unloading modules?

2021-08-21 Thread Chris Angelico
On Sun, Aug 22, 2021 at 4:37 AM Hope Rouselle  wrote:
>
> Greg Ewing  writes:
>
> > On 21/08/21 1:36 pm, Hope Rouselle wrote:
> >> I wish I could restrict their syntax too, though, but I fear that's
> >> not possible.  For instance, it would be very useful if I could
> >> remove loops.
> >
> > Actually you could, using ast.parse to get an AST and then walk
> > it looking for things you don't want to allow.
>
> Very interesting!  Thanks very much.  That would let me block them,
> though the ideal would be a certain python.exe binary that simply blows
> a helpful syntax error when they use something the course doesn't allow.
> But surely the course could provide students with a certain module or
> procedure which would validate their work.  (Don't turn in unless you
> pass these verifications.)
>
> > You could also play around with giving them a custom set of
> > builtins and restricting what they can import. Just be aware
> > that such things are not foolproof, and a sufficiently smart
> > student could find ways around them. (Although if they know
> > that much they probably deserve to pass the course anyway!)
>
> So true!  If they can get around such verifications, they should present
> their work at an extra-curricular sessions.

Agreed... if they do it knowingly. On the other hand, if they just
turn in code copied blindly from Stack Overflow...
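
(As an aside, the kind of pre-submission check being discussed is only
a few lines - a rough sketch, with the banned constructs chosen
arbitrarily here:)

import ast

def check_no_loops(source):
    """Reject a submission that uses for/while loops."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.For, ast.AsyncFor, ast.While)):
            raise SyntaxError("loops are not allowed (line %d)" % node.lineno)

check_no_loops("total = sum(range(10))")       # passes silently
# check_no_loops("for i in range(10): pass")   # would raise SyntaxError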

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: basic auth request

2021-08-17 Thread Chris Angelico
On Wed, Aug 18, 2021 at 7:15 AM Barry  wrote:
>
>
>
> > On 17 Aug 2021, at 19:25, Chris Angelico  wrote:
> >
> > On Wed, Aug 18, 2021 at 4:16 AM Barry Scott  wrote:
> >> Oh and if you have the freedom avoid Basic Auth as its not secure at all.
> >>
> >
> > That's usually irrelevant, since the alternative is most likely to be
> > form fill-out, which is exactly as secure. If you're serving over
> > HTTPS, the page is encrypted, and that includes the headers; if you're
> > not, then it's not encrypted, and that includes the form body.
>
> There is digest and Ntlm that do not reveal the password.
>

And they require that the password be stored decryptably on the
server, which is a different vulnerability. It's all a matter of which
threat is more serious to you. Fundamentally, basic auth is no better
or worse than any of the other forms - it's just different.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: basic auth request

2021-08-17 Thread Chris Angelico
On Wed, Aug 18, 2021 at 4:16 AM Barry Scott  wrote:
> Oh and if you have the freedom avoid Basic Auth as its not secure at all.
>

That's usually irrelevant, since the alternative is most likely to be
form fill-out, which is exactly as secure. If you're serving over
HTTPS, the page is encrypted, and that includes the headers; if you're
not, then it's not encrypted, and that includes the form body.

There are other issues with basic auth, but security really isn't one.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Cyclic imports

2021-08-17 Thread Chris Angelico
On Wed, Aug 18, 2021 at 4:10 AM Barry Scott  wrote:
>
> def allImports( self, module_name ):
>     for line in f:
>         words = line.strip().split()
>         if words[0:1] == ['import']:
>             all_imports.append( words[1] )
>

This will work for a lot of programs, but it depends too much on
coding style. If you feel like trying something a little more
adventurous, I'd recommend looking into the ast module:

>>> import ast
>>> ast.parse("""
... import foo
... import bar, baz
... from quux import spam
... try: import hello
... except ImportError: import goodbye
... """)

>>> m = _
>>> ast.dump(m)
"Module(body=[Import(names=[alias(name='foo')]),
Import(names=[alias(name='bar'), alias(name='baz')]),
ImportFrom(module='quux', names=[alias(name='spam')], level=0),
Try(body=[Import(names=[alias(name='hello')])],
handlers=[ExceptHandler(type=Name(id='ImportError', ctx=Load()),
body=[Import(names=[alias(name='goodbye')])])], orelse=[],
finalbody=[])], type_ignores=[])"

If you ast.parse() the text of a Python script, you'll get a Module
that has all the top-level code in a list. It's then up to you how
much you dig into that. For instance, a simple try/except like this is
fairly common, but if something's inside a FunctionDef, you might want
to ignore it. Or maybe just ignore everything that isn't an Import or
FromImport, which would be a lot easier, but would miss the try/except
example.

The main advantage of ast.parse() is that it no longer cares about
code layout, and it won't be fooled by an import statement inside a
docstring, or anything like that. It's also pretty easy to handle
multiple variants (note how "import bar, baz" just has two things in
the names list).
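
A rough sketch of the simpler variant (walk everything and keep just
the module names; note that "from . import x" shows up with
module=None):

import ast

def modules_imported(source):
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            names.add(node.module or ".")
    return names

print(modules_imported("import foo, bar\nfrom quux import spam"))
# {'foo', 'bar', 'quux'}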

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: basic auth request

2021-08-17 Thread Chris Angelico
On Wed, Aug 18, 2021 at 3:04 AM Robin Becker  wrote:
>
> While porting an ap from python2.7 to python3 I see this
>
> base64string = base64.b64encode('%s:%s' % (wsemail, wspassword))
> request.add_header("Authorization", "Basic %s" % base64string)
>
> in python3.x I find this works
>
> base64string = base64.b64encode(('%s:%s' % (wsemail, 
> wspassword)).encode('ascii')).decode('ascii')
> request.add_header("Authorization", "Basic %s" % base64string)
>
> but I find the conversion to and from ascii irksome. Is there a more direct 
> way to create the basic auth value?
>
> As an additional issue I find I have no clear idea what encoding is allowed 
> for the components of a basic auth input.

Hmm, I'm not sure what type your wsemail and wspassword are, but one
option would be to use bytes everywhere (assuming your text is all
ASCII).

wsemail = b"robin@becker.example"
wspassword = b"correct-battery-horse-staple"

base64string = base64.b64encode(b'%s:%s' % (wsemail, wspassword))

But otherwise, it's probably safest to keep using the encode and
decode. As to the appropriate encoding, that's unclear according to
the standard, but UTF-8 is probably acceptable. ASCII is also safe, as
it'll safely error out if it would be ambiguous.

https://stackoverflow.com/questions/7242316/what-encoding-should-i-use-for-http-basic-authentication
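
If it's mainly the repetition that's irksome, one option is to tuck
the round trip into a tiny helper (a sketch, reusing the names from
your snippet):

import base64

def basic_auth_value(user, password, encoding="utf-8"):
    token = base64.b64encode(("%s:%s" % (user, password)).encode(encoding))
    return "Basic " + token.decode("ascii")

request.add_header("Authorization", basic_auth_value(wsemail, wspassword))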

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Regarding inability of Python Module Winsound to produce beep in decimal frequency

2021-08-16 Thread Chris Angelico
On Tue, Aug 17, 2021 at 1:50 PM Eryk Sun  wrote:
>
> On 8/16/21, Chris Angelico  wrote:
> > On Tue, Aug 17, 2021 at 11:44 AM Eryk Sun  wrote:
> >
> >> Yes, the PC speaker beep does not get used in Windows 7+. The beep
> >> device object is retained for compatibility, but it redirects the
> >> request to a task in the user's session (which could be a remote
> >> desktop session) that generates a WAV buffer in memory and plays it
> >> via PlaySound().
> >
> > That seems a bizarre way to handle it.
>
> Check the documentation [1]:
>
> In Windows 7, Beep was rewritten to pass the beep to the default sound
> device for the session. This is normally the sound card, except when
> run under Terminal Services, in which case the beep is rendered on the
> client.
>

Huh. Okay. Then I withdraw the concern from this list, and instead lay
it at Microsoft's feet. That is, I maintain, a bizarre choice. Surely
there are better ways to trigger audio on the sound card?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Regarding inability of Python Module Winsound to produce beep in decimal frequency

2021-08-16 Thread Chris Angelico
On Tue, Aug 17, 2021 at 11:44 AM Eryk Sun  wrote:
>
> On 8/16/21, Roel Schroeven  wrote:
> >
> > We're not necessarily talking about the PC speaker here: (almost) all
> > computers these days have sound cards (mostly integrated on the
> > motherboard) that are much more capable than those one-bit PC speakers.
>
> Yes, the PC speaker beep does not get used in Windows 7+. The beep
> device object is retained for compatibility, but it redirects the
> request to a task in the user's session (which could be a remote
> desktop session) that generates a WAV buffer in memory and plays it
> via PlaySound().

That seems a bizarre way to handle it. What happens if you have
multiple sound cards? Or if a program needs to get the user's
attention despite headphones being connected to the sound card? That's
what the PC speaker is for, and I would be quite surprised if Windows
just says "nahh that doesn't matter".

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on perhaps unloading modules?

2021-08-16 Thread Chris Angelico
On Tue, Aug 17, 2021 at 4:02 AM Greg Ewing  wrote:
> The second best way would be to not use import_module, but to
> exec() the student's code. That way you don't create an entry in
> sys.modules and don't have to worry about somehow unloading the
> module.

I would agree with this. If you need to mess around with modules and
you don't want them to be cached, avoid the normal "import" mechanism,
and just exec yourself a module's worth of code.
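
A minimal sketch of what that can look like (the submission text is
assumed to already be in a string):

def load_submission(source):
    namespace = {"__name__": "submission"}
    exec(compile(source, "<submission>", "exec"), namespace)
    return namespace   # whatever the student defined lives in here

ns = load_submission("def roll():\n    return 3, 3\n")
print(ns["roll"]())    # (3, 3)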

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: some problems for an introductory python test

2021-08-16 Thread Chris Angelico
On Tue, Aug 17, 2021 at 3:51 AM Hope Rouselle  wrote:
>
> Chris Angelico  writes:
> >> Wow, I kinda feel the same as you here.  I think this justifies perhaps
> >> using a hardware solution.  (Crazy idea?! Lol.)
> >
> > uhhh Yes. Very crazy idea. Can't imagine why anyone would ever
> > think about doing that.
>
> Lol.  Really?  I mean a certain panic button.  You know the GNU Emacs.
> It has this queue with the implications you mentioned --- as much as it
> can.  (It must of course get the messages from the system, otherwise it
> can't do anything about it.)  And it has the panic button C-g.  The
> keyboard has one the highest precedences in hardware interrupts, doesn't
> it not?  A certain very important system could have a panic button that
> invokes a certain debugger, say, for a crisis-moment.
>
> But then this could be a lousy engineering strategy.  I am not an expert
> at all in any of this.  But I'm surprised with your quick dismissal. :-)
>
> > Certainly nobody in his right mind would have WatchCat listening on
> > the serial port's Ring Indicator interrupt, and then grab a paperclip
> > to bridge the DTR and RI pins on an otherwise-unoccupied serial port
> > on the back of the PC. (The DTR pin was kept high by the PC, and could
> > therefore be used as an open power pin to bring the RI high.)
>
> Why not?  Misuse of hardware?  Too precious of a resource?
>
> > If you're curious, it's pins 4 and 9 - diagonally up and in from the
> > short
> > corner. http://www.usconverters.com/index.php?main_page=page=61=0
>
> You know your pins!  That's impressive.  I thought the OS itself could
> use something like that.  The fact that they never do... Says something,
> doesn't it?  But it's not too obvious to me.
>
> > And of COURSE nobody would ever take an old serial mouse, take the
> > ball out of it, and turn it into a foot-controlled signal... although
> > that wasn't for WatchCat, that was for clipboard management between my
> > app and a Windows accounting package that we used. But that's a
> > separate story.
>
> Lol.  I feel you're saying you would. :-)

This was all a figure of speech, and the denials were all tongue in
cheek. Not only am I saying we would, but we *did*. All of the above.
The Ring Indicator trick was one of the best, since we had very little
other use for serial ports, and it didn't significantly impact the
system during good times, but was always reliable when things went
wrong.

(And when I posted it, I could visualize the port and knew which pins
to bridge, but had to go look up a pinout to be able to say their pin
numbers and descriptions.)

> I heard of Python for the first time in the 90s.  I worked at an ISP.
> Only one guy was really programming there, Allaire ColdFusion.  But, odd
> enough, we used to say we would ``write a script in Python'' when we
> meant to say we were going out for a smoke.  I think that was precisely
> because nobody knew that ``Python'' really was.  I never expected it to
> be a great language.  I imagined it was something like Tcl.  (Lol, no
> offense at all towards Tcl.)

Haha, that's a weird idiom!

Funny you should mention Tcl.

https://docs.python.org/3/library/tkinter.html

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Regarding inability of Python Module Winsound to produce beep in decimal frequency

2021-08-14 Thread Chris Angelico
On Sun, Aug 15, 2021 at 1:02 PM John O'Hagan  wrote:
>
> > On 2021-08-13 17:17, Chris Angelico wrote:
> > > Is it really? In my experience, no human ear can distinguish 277Hz
> > > from 277.1826Hz when it's played on a one-bit PC speaker, which the
> > > Beep function will be using.
>
> Rounding to integer frequencies will produce disastrously out-of-tune
> notes in a musical context! Particularly for low notes, where a whole
> semitone is only a couple of Hz difference. Even for higher notes, when
> they're played together any inaccuracies are much more apparent.

But before you advocate that too hard, check to see the *real*
capabilities of a one-bit PC speaker. You go on to give an example
that uses PyAudio and a sine wave, not the timer chip's "beep"
functionality.

Try getting some recordings of a half dozen or so computers making a
beep at 440Hz. Then do some analysis on the recordings and see whether
they're actually within 1Hz of that.
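
(One way to do that analysis, as a rough sketch - it assumes numpy is
available and the recording is a mono 16-bit WAV of a second or more,
since the frequency resolution is rate/len(samples):)

import wave
import numpy as np

with wave.open("beep.wav", "rb") as w:
    rate = w.getframerate()
    data = w.readframes(w.getnframes())

samples = np.frombuffer(data, dtype=np.int16).astype(float)
spectrum = np.abs(np.fft.rfft(samples))
freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
print("Dominant frequency: %.2f Hz" % freqs[spectrum.argmax()])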

(And that's aside from the fact that quite a number of computers will
show up completely silent, due to either not having an internal
speaker, or not letting you use it.)

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Regarding inability of Python Module Winsound to produce beep in decimal frequency

2021-08-13 Thread Chris Angelico
On Sat, Aug 14, 2021 at 2:11 AM Terry Reedy  wrote:
>
> On 8/13/2021 6:53 AM, Umang Goswami wrote:
> > Hi There, Hope you find this mail in good health.
> >
> > I am Umang Goswami, a Python developer and student working on a huge
> > project for automation of music instruments. I am producing the musical
> > notes using the Beep function of Winsound Module(
> > https://docs.python.org/3/library/winsound.html) by passing frequency as a
> > argument to the function.
> >
> > Now whenever i provide frequency of any note in decimal(for example
> > 277.1826 for C4 note) it shows following error:
> > Traceback (most recent call last):
> >File "C:\Users\Umang Goswami\Desktop\Umang  Goswami\test.py", line 2, in
> > 
> >  winsound.Beep(111.11,11)
> > TypeError: integer argument expected, got float
> >
> > Now I have  to round up the frequencies. This is hurting the quality,
> > accuracy ,authenticity and future of the project. Almost all the notes have
> > the frequencies  in decimal parts. Rounding up means changing semitones and
> > quatertones thus whole note itself. This problem is technically making my
> > program useless.
> >

Is it really? In my experience, no human ear can distinguish 277Hz
from 277.1826Hz when it's played on a one-bit PC speaker, which the
Beep function will be using.
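
(A quick back-of-the-envelope check, using the usual 1200*log2
definition of cents - 100 cents to a semitone, and a few cents is
roughly the limit of what trained listeners notice:)

import math

def cents(f1, f2):
    return 1200 * math.log2(f2 / f1)

print(cents(277, 277.1826))   # ~1.1 cents - inaudible
print(cents(55, 56))          # ~31 cents - rounding a low note by 1Hz is very audible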

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: some problems for an introductory python test

2021-08-12 Thread Chris Angelico
On Fri, Aug 13, 2021 at 5:03 AM Grant Edwards  wrote:
>
> On 2021-08-12, Hope Rouselle  wrote:
>
> >> OS/2 had all kinds of amazing features (for its time). [...] Plus,
> >> it had this fancy concept of "extended attributes"; on older
> >> systems (like MS-DOS's "FAT" family), a file might be Read-Only,
> >> Hidden, a System file, or needing to be Archived, and that was it -
> >> but on HPFS, you could attach arbitrary data like "File type:
> >> DeScribe Word Processor" or "Double click action: Run
> >> CASMake.cmd". This allowed the GUI to store all kinds of
> >> information *on the file itself* instead of needing hidden files
> >> (or, in Windows' case, the registry) to track that kind of thing.
> >
> > Yeah, that's kinda nice.  Isn't that a UNIX design?  A file is a
> > sequence of bytes?  Users decide what to put in them?
>
> I think what he's talking about is allowing the user to attach
> arbitrary _metadata_ to the file -- metadata that exists separately
> and independently from the normal data that's just a "sequence of
> bytes". IOW, something similar to the "resource fork" that MacOS used
> to have. https://en.wikipedia.org/wiki/Resource_fork

Correct. OS/2's EAs are name/value pairs (with the value potentially
being a set of values - think how a Python dict maps keys to values,
but the values could be lists), with a few names having significance
to the system, like .TYPE and .LONGNAME (used on file systems that
didn't support longnames - yes, that's possible, since EAs could be
stored in a hidden file on a FAT disk).

> > So OS/2 was taking advantage of that to integrate it well with the
> > system.  Windows was doing the same, but integrating the system with
> > files in odd ways --- such as a registry record to inform the system
> > which programs open which files?  (That does sound more messy.)
>
> Windows never had filesystems that supported metadata like OS/2 and
> MacOS did. The registry was an ugly hack that attempted (very poorly)
> to make up for that lack of metadata.

Very poor indeed - it was very very common back then for Windows
programs to blat themselves all over the registry and then leave it
all behind when you nuke that thing. With EAs, it's all part of the
file itself and will be cleaned up by a simple directory removal.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: some problems for an introductory python test

2021-08-12 Thread Chris Angelico
On Fri, Aug 13, 2021 at 2:15 AM Hope Rouselle  wrote:
>
> Chris Angelico  writes:
>
> > History lesson!
> >
> > Once upon a time, IBM and Microsoft looked at what Intel was
> > producing, and went, hey, we need to design an operating system that
> > can take advantage of the fancy features of this 80286 thing. So they
> > collaborate on this plan to make a 16-bit protected mode OS.
> > Unfortunately, things didn't work out too well, partly because this
> > was when Microsoft was at its most monopolistic, and they ended up
> > parting company. IBM continued to make OS/2, but Microsoft took their
> > part of the code and made Windows NT out of it.
>
> How is it possible that Microsoft would take part of the code of OS/2?
> Did IBM just hand it to them?

I presume both companies had all of the code. It was a matter of
licensing, though. There were a few components that were saddled with
awkward restrictions due to the dual ownership (for instance, HPFS386
was Microsoft-controlled, but vanilla HPFS was fine - kinda like the
difference between LZW and LZ77).

> > (Aside: Windows NT's 16-bit applications and OS/2's 16-bit
> > applications were actually identical and compatible. Unfortunately,
> > Win32 introduced a very new API, so as soon as everyone moved to
> > 32-bit everything, the schism became problematic. But it was actually
> > possible to package up a single .EXE file with a 16-bit MS-DOS loader,
> > a Win32 loader, and an OS/2 32-bit loader, all happily coexisting.
>
> Beautiful. :-) So if all libraries were around in each system, they had
> perfect compatibility?

The 16-bit loaders were fine, but the 32-bit loaders were different,
so this trick basically meant having three different copies of the
code wrapped up in a single executable.

> > Plus, it had this fancy
> > concept of "extended attributes"; on older systems (like MS-DOS's
> > "FAT" family), a file might be Read-Only, Hidden, a System file, or
> > needing to be Archived, and that was it - but on HPFS, you could
> > attach arbitrary data like "File type: DeScribe Word Processor" or
> > "Double click action: Run CASMake.cmd". This allowed the GUI to store
> > all kinds of information *on the file itself* instead of needing
> > hidden files (or, in Windows' case, the registry) to track that kind
> > of thing.
>
> Yeah, that's kinda nice.  Isn't that a UNIX design?  A file is a
> sequence of bytes?  Users decide what to put in them?  So OS/2 was
> taking advantage of that to integrate it well with the system.  Windows
> was doing the same, but integrating the system with files in odd ways
> --- such as a registry record to inform the system which programs open
> which files?  (That does sound more messy.)

Something like that, but with a lot more metadata. Modern OSes don't
seem to work that way any more.

> UNIX's execve() is able to read the first line of an executable and
> invoke its interpreter.  I guess OS/2 was doing precisely that in a
> different way?

Kinda, but instead of having the choice of interpreter be inside the
file contents itself, the choice was in the file's metadata. Still
part of the file, but if you open and read the file, it isn't any
different.

> > The default command interpreter and shell on OS/2 was fairly primitive
> > by today's standards, and was highly compatible with the MS-DOS one,
> > but it also had the ability to run REXX scripts. REXX was *way* ahead
> > of its time. It's a shell language but remarkably well suited to
> > building GUIs and other tools (seriously, can you imagine designing a
> > GUI entirely in a bash script??).
>
> I cannot imagine.  I always wondered what REXX was about --- I saw
> programs sometimes written in some website whose name is something like
> Rosetta Code.  REXX looked so weird.  (``Who would program in that?'')
> But I see now there is a context to it.

Yeah. It was a strange choice by today's standards, but back then,
most of my GUI programs were written in REXX.

https://en.wikipedia.org/wiki/VX-REXX
http://www.edm2.com/0206/vrexx.html

(There were other tools too - VisPro REXX, VREXX, DrDialog, and
various others - but VX-REXX was where most of my dev work happened.)

> > Probably the most
> > notable feature, by today's standards, was that it had a single input
> > queue. ... This means that, if the response to a
> > keystroke is to change focus, then *even in a slow and lagged out
> > system*, subsequent keys WOULD be sent to the new target window. That
> > was AWESOME, and I really miss it. Except that I also don't. Because
> > if a single application is having issues, now your entire keyboard and
> > mouse is locked up... wh

Re: Is there a better way to create a list of None objects?

2021-08-12 Thread Chris Angelico
On Thu, Aug 12, 2021 at 6:59 PM Stephen Tucker  wrote:
>
> Hi,
>
> I thought I'd share the following piece of code that I have recently written
> (a) to check that what I have done is reasonable - even optimum,
> (b) to inform others who might be wanting to do similar things, and
> (c) to invite comment from the community.
>
> ---
>
> #
>
> # Yes: Create an empty list of Band Limits for this language
>
> #
>
> # Note. The rather complicated logic on the right-hand side of the
>
> #   assignment below is used here because none of the following
>
> #   alternatives had the desired effect:
>
> #
>
> # Logic              Effect
>
> #
>
> # [None * 8]         TypeError: unsupported operand type(s) for *: ...
>
> # [(None) * 8]       TypeError: unsupported operand type(s) for *: ...
>
> # [((None)) * 8]     TypeError: unsupported operand type(s) for *: ...
>
> # [(None,) * 8]      [(None, None, None, None, None, None, None, None)]
>
> # list ((None) * 8)  TypeError: unsupported operand type(s) for *: ...
>
> #
>
> diclll_BLim [thisISO_] = list ((None,) * 8)
>

Why not just:

[None] * 8

?
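
A quick interactive check that it behaves the same as the workaround
(output shown for illustration):

>>> [None] * 8
[None, None, None, None, None, None, None, None]
>>> [None] * 8 == list((None,) * 8)
True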

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: some problems for an introductory python test

2021-08-11 Thread Chris Angelico
On Thu, Aug 12, 2021 at 9:23 AM Dennis Lee Bieber  wrote:
>
> On Thu, 12 Aug 2021 06:15:28 +1000, Chris Angelico 
> declaimed the following:
>
>
> >The default command interpreter and shell on OS/2 was fairly primitive
> >by today's standards, and was highly compatible with the MS-DOS one,
> >but it also had the ability to run REXX scripts. REXX was *way* ahead
> >of its time. It's a shell language but remarkably well suited to
> >building GUIs and other tools (seriously, can you imagine designing a
> >GUI entirely in a bash script??). It had features that we'd consider
> >fairly normal or even primitive by Python's standards, but back then,
> >Python was extremely new and didn't really have very much mindshare.
> >REXX offered arbitrary-precision arithmetic, good databasing support,
> >a solid C API that made it easy to extend, integrations with a variety
> >of other systems... this was good stuff for its day. (REXX is still
> >around, but these days, I'd rather use Python.)
> >
> I was spoiled by the Amiga variant of REXX. Most current
> implementations (well, Regina is the only one I've looked at) can just pass
> command to the default shell. The Amiga version took advantage of Intuition
> Message Ports (OS supported IPC). That allowed it to "address
> " any application that defined an ARexx port, allowing ARexx
> to be used as a scripting language for that application (and with multiple
> applications, one could easily fetch data from app1 and feed it to app2).
> ARexx did not, to my memory, implement arbitrary precision math.

The same functionality was available in OS/2, but not heavily used.
You could 'address cmd commandname' to force something to be
interpreted as a shell command, but that was about it. However, I
built a MUD that used REXX as its scripting language, and the default
destination was sending text back to the person who sent the command;
and you could, of course, still 'address cmd' to run a shell command.

> I've not seen anything equivalent in my light perusal of the Win32 API
> (the various guide books aren't layed out in any way to be a reference),
> and Linux seems to use UNIX sockets for IPC... No way to search for a
> connection point by name...
>

Win32 doesn't really have it. Unix sockets are kinda there but you
identify something by a path to the socket, not the name of the
application. But I think dbus is probably the closest to what you're
thinking of.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: some problems for an introductory python test

2021-08-11 Thread Chris Angelico
On Thu, Aug 12, 2021 at 7:25 AM Rob Cliffe via Python-list
 wrote:
>
> On 11/08/2021 19:10, MRAB wrote:
> > On 2021-08-11 18:10, Wolfram Hinderer via Python-list wrote:
> >>
> >>
> >> Am 11.08.2021 um 05:22 schrieb Terry Reedy:
> >>> Python is a little looser about whitespace than one might expect
> >>> from reading 'normal' code when the result is unambiguous in that it
> >>> cannot really mean anything other than what it does.  Two other
> >>> examples:
> >>>
> >>> >>> if3: print('yes!')
> >>> yes!
> >>> >>> [0]  [0]
> >>> 0
> >>
> >> Not sure what you mean here - is it a joke? The first looks like an if
> >> statement, but isn't. The missing space *does* make a difference. (Try
> >> "if0" instead.)
> >>
> > I see what you mean. It's a type annotation:
> >
> > var: type
> >
> > where the "type" is a print statement!
> >
> >> The second is normal indexing, which allows white space. I wouldn't
> >> consider that surprising, but maybe I should? (Honest question, I really
> >> don't know.)
> >>
> I looked at the if3 example, and I was gobsmacked.  I momentarily
> assumed that "if3" was parsed as "if 3", although that clearly makes no
> sense ("if3" is a valid identifier).
> Then I saw the "if0" example and I was even more gobsmacked, because it
> showed that my assumption was wrong.
> I've never used type annotations, I've never planned to used them. And
> now that all is revealed, I'm afraid that my reaction is: I'm even more
> inclined never to use them, because these examples are (to me) so confusing.

Don't judge a feature based on its weirdest example. Based on this
example, you should avoid ever using the len() built-in function:

>>> def show_count(n, word):
... return "{{}} {{:{0}.{0}}}".format(len(word)-(n==1)).format(n, word)
...
>>> show_count(0, "things")
'0 things'
>>> show_count(1, "things")
'1 thing'
>>> show_count(5, "things")
'5 things'
>>> show_count(2, "widgets")
'2 widgets'
>>> show_count(1, "widgets")
'1 widget'

Any syntax can be abused. And the same thing would happen in any other
context. The only difference is that, in a declaration like "if3:
print()", the name if3 doesn't have to have been assigned already,
avoiding this problem:

>>> {
... if3: print("Hello")
... }
Traceback (most recent call last):
  File "", line 2, in 
NameError: name 'if3' is not defined

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: some problems for an introductory python test

2021-08-11 Thread Chris Angelico
On Thu, Aug 12, 2021 at 5:00 AM Hope Rouselle  wrote:
>
> Chris Angelico  writes:
>
> > On Wed, Aug 11, 2021 at 4:18 AM Hope Rouselle  
> > wrote:
> >>
> >> Chris Angelico  writes:
> >>
> >> [...]
> >>
> >> >> not disagreeing... and yeah I could have thought deeper about the
> >> >> answer, but I still think "notthing has been OOP" -> "yes it has, they
> >> >> just didn't realize it"  was worth mentioning
> >> >
> >> > Oh yes, absolutely agree.
> >>
> >> At the same time, inside the machine nothing is OOP --- so all the OOP
> >> is no OOP at all and they just didn't realize it?  This seems to show
> >> that OOP is about perspective.  An essential thing for OOP is the
> >> keeping of states.  Closures can keep state, so having procedures as
> >> first-class values allows us to say we are doing OOP too.  (Arguments of
> >> procedures are messages and function application is message passing,
> >> with closures keeping a state --- and all the rest of OOP can be
> >> implemented with enough such functional technology.)  In summary, OOP
> >> should not be defined as some special syntax, otherwise there is no OOP
> >> in ``2 + 2''.
> >>
> >> Having said that, I totally agree with all the nitpicking.
> >
> > Object orientation is a particular abstraction concept. It's not a
> > feature of the machine, it's a feature of the language that you write
> > your source code in. I've done OOP using IDL and CORBA, writing my
> > code in C and able to subclass someone else's code that might have
> > been written in some other language. [1] Central tenets of OOP
> > (polymorphism, inheritance, etc) can be implemented at a lower level
> > using whatever makes sense, but *at the level that you're writing*,
> > they exist, and are useful.
> >
> > Data types, variables, control flow, these are all abstractions. But
> > they're such useful abstractions that we prefer to think that way in
> > our code. So, *to us*, those are features of our code. To the
> > computer, of course, they're just text that gets processed into
> > actually-executable code, but that's not a problem.
>
> Total agreement.
>
> > So I would say that (in Python) there IS object orientation in "2 +
> > 2", and even in the Python C API, there is object orientation, despite
> > C not normally being considered an object-oriented language.
>
> But would you say that 2 + 2 is also an illustration of object
> orientation in any other language too?

In some languages it is; in others, it's not. For instance, REXX
doesn't have polymorphism. You can add two numbers together using x+y,
or you can concatenate two strings with x||y. There's no concept of
doing the same operation (spelled the same way) on different data
types. (Partly that's because REXX doesn't actually *have* data types,
but it does a pretty good job of simulating strings, bignums, floats,
arrays, mappings, etc.)

But in many modern programming languages, yes, it would be a
demonstration of some of the object orientation features.

> Regarding C, I have many times said that myself.  If I wrote assembly,
> I'm sure I would do my best to create things like procedures --- a label
> and some bureaucracy to get arguments in and a return value out.

Oh absolutely. But you'd also have the option to shortcut things if
you wanted to. For better or for worse.

> > [1] And boy oh boy was that good fun. The OS/2 Presentation Manager
> > had a wealth of power available. Good times, sad that's history now.
>
> I know OS/2 only by name.  I never had the pleasure of using it.  In
> fact, I don't even know how it looks.  I must be a little younger than
> you are.  But not much younger, because I kinda remember its name.  Was it
> a system that might have thought of competing against Microsoft Windows?
> :-) That's what my memory tells me about it.

History lesson!

Once upon a time, IBM and Microsoft looked at what Intel was
producing, and went, hey, we need to design an operating system that
can take advantage of the fancy features of this 80286 thing. So they
collaborate on this plan to make a 16-bit protected mode OS.
Unfortunately, things didn't work out too well, partly because this
was when Microsoft was at its most monopolistic, and they ended up
parting company. IBM continued to make OS/2, but Microsoft took their
part of the code and made Windows NT out of it.

(Aside: Windows NT's 16-bit applications and OS/2's 16-bit
applications were actually identical and compatible. Unfortunately,
Win32 introduced a very new API, so as soon as everyone moved to
32-bit everything

Re: some problems for an introductory python test

2021-08-10 Thread Chris Angelico
On Wed, Aug 11, 2021 at 4:18 AM Hope Rouselle  wrote:
>
> Chris Angelico  writes:
>
> [...]
>
> >> not disagreeing... and yeah I could have thought deeper about the
> >> answer, but I still think "nothing has been OOP" -> "yes it has, they
> >> just didn't realize it"  was worth mentioning
> >
> > Oh yes, absolutely agree.
>
> At the same time, inside the machine nothing is OOP --- so all the OOP
> is no OOP at all and they just didn't realize it?  This seems to show
> that OOP is about perspective.  An essential thing for OOP is the
> keeping of states.  Closures can keep state, so having procedures as
> first-class values allows us to say we are doing OOP too.  (Arguments of
> procedures are messages and function application is message passing,
> with closures keeping a state --- and all the rest of OOP can be
> implemented with enough such functional technology.)  In summary, OOP
> should not be defined as some special syntax, otherwise there is no OOP
> in ``2 + 2''.
>
> Having said that, I totally agree with all the nitpicking.

Object orientation is a particular abstraction concept. It's not a
feature of the machine, it's a feature of the language that you write
your source code in. I've done OOP using IDL and CORBA, writing my
code in C and able to subclass someone else's code that might have
been written in some other language. [1] Central tenets of OOP
(polymorphism, inheritance, etc) can be implemented at a lower level
using whatever makes sense, but *at the level that you're writing*,
they exist, and are useful.

Data types, variables, control flow, these are all abstractions. But
they're such useful abstractions that we prefer to think that way in
our code. So, *to us*, those are features of our code. To the
computer, of course, they're just text that gets processed into
actually-executable code, but that's not a problem.

So I would say that (in Python) there IS object orientation in "2 +
2", and even in the Python C API, there is object orientation, despite
C not normally being considered an object-oriented language.
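
(As an aside, the closures-as-objects idea quoted above is easy to
make concrete - a toy sketch, every name in it is invented:

>>> def make_counter():
...     count = 0                  # state captured by the closure
...     def handle(message):       # "message passing" is just an argument
...         nonlocal count
...         if message == "increment":
...             count += 1
...         return count
...     return handle
...
>>> c = make_counter()
>>> c("increment")
1
>>> c("increment")
2

No class statement in sight, yet there's state and behaviour behind a
single name - which is why it makes more sense to define OOP by the
abstraction than by the syntax.)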

ChrisA

[1] And boy oh boy was that good fun. The OS/2 Presentation Manager
had a wealth of power available. Good times, sad that's history now.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: some problems for an introductory python test

2021-08-10 Thread Chris Angelico
On Wed, Aug 11, 2021 at 4:18 AM Hope Rouselle  wrote:
>
> I totally agree with you but I didn't know that even numbers were like
> that in Python.  In fact, I still don't quite believe it...
>
> >>> 2.__add__(3)
> SyntaxError: invalid syntax

Yeah, that's because "2." looks like the beginning of a float.

> But then I tried:
>
> >>> (2).__add__(3)
> 5
>
> Now I do believe it! :-)  Awesome.  I had no idea.

You can also do it this way:

>>> x = 2
>>> x.__add__(3)
5

But don't teach this; it ignores several problems. Notably, a subclass
should be able to (re)define operators:

>>> class FancyNumber(int):
... def __repr__(self): return "FancyNumber(%d)" % self
... def __add__(self, other): return type(self)(int(self) + other)
... def __radd__(self, other): return type(self)(other + int(self))
...
>>> FancyNumber(2) + 3
FancyNumber(5)
>>> 2 + FancyNumber(3)
FancyNumber(5)
>>> FancyNumber(2).__add__(3)
FancyNumber(5)
>>> (2).__add__(FancyNumber(3))
5

With the + operator, you always get a FancyNumber back. Explicitly
calling the dunder method fails. (General principle: Dunder methods
are for defining, not for calling, unless you're in the process of
defining a dunder.)

> (*) More opinions
>
> So, yeah, the idea of a course like that is to try to simplify the world
> to students, but it totally backfires in my opinion.  There is so much
> syntax to learn, so many little details...  We spend the entire semester
> discussing these little details.

Agreed, and if you try to teach all the syntax, you're inevitably
going to get bogged down like that.

> I posted here recently a study of the semantics of slices.  When I
> finally got it, I concluded it's not very simple.  The course introduces
> a few examples and expects students to get it all from these examples.
> I would rather not introduce slices but teach students enough to write
> procedures that give them the slices.  The slices are the fish; learning
> to write the procedures is the know-how.  (I'm fine with even teaching
> them how to write procedures to add or subtract numbers [and the rest of
> arithmetic], although none of them would find mysterious what is the
> meaning of arithmetic expressions such as 3 + 6 - 9, even taking
> precedence of operators into account.  That's syntax they already know.
> If we teach them the syntax of procedures, we could be essentially done
> with syntax.)

A language like Python is useful, not because every tiny part of it is
easy to explain, but because the language *as a whole* does what is
expected of it. Toy languages are a lot easier to explain on a
concrete level (look up Brainf*ck - it has very few operations and
each one can be very simply defined), and that might be tempting in
terms of "hey, we can teach the entire language in ten minutes", but
they're utterly useless for students.

Instead, my recommendation would be: Entice students with power.
Pretend you're teaching some aspect of magic at a wizards' school and
make sure your students feel massively OP. Sure, they might not
understand what makes those cantrips work, but they know that they DO
work. You can have students casting print calls left and right, doing
complex incantations with list comprehensions, and even defining their
own classes, all without ever worrying about the details of how it all
works. Then once the "wow" is achieved, you can delve into the details
of how things actually function, transforming progressively from
"wizardry" to "science".

And to be quite frank, a lot of code IS magic to most programmers.
Which isn't a problem. Do you REALLY need to understand how your CPU
does register renaming in order to benefit from it? Or do you need to
know every detail of the TCP packet header before using the internet?
Not likely. Those details hide away under the covers, and we make
happy use of them. There's infinite knowledge out there, and you - and
your students - are free to dip into exactly as much as you're
comfortable with.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: some problems for an introductory python test

2021-08-10 Thread Chris Angelico
On Wed, Aug 11, 2021 at 4:14 AM Hope Rouselle  wrote:
>
> Chris Angelico  writes:
>
> > On Tue, Aug 10, 2021 at 7:25 AM Hope Rouselle  
> > wrote:
> >> I came up with the following question.  Using strings of length 5
> >> (always), write a procedure histogram(s) that consumes a string and
> >> produces a dictionary whose keys are each substrings (of the string) of
> >> length 1 and their corresponding values are the number of times each
> >> such substrings appear.  For example, histogram("aaaaa") = {"a": 5}.
> >> Students can "loop through" the string by writing out s[0], s[1], s[2],
> >> s[3], s[4].
> >
> > In other words, recreate collections.Counter? Seems decent, but you'll
> > need to decide whether you want them to use defaultdict, use
> > __missing__, or do it all manually.
>
> Yes, the course introduces very little so there is a lot of recreation
> going on.  Hm, I don't know defaultdict and I don't know how to use
> __missing__.  The course does introduce dict.get(), though.  If students
> use dict.get(), then the procedure could essentially be:
>
> def histogram(s):
>   d = {}
>   d[s[0]] = d.get(s[0], 0) + 1
>   d[s[1]] = d.get(s[1], 0) + 1
>   d[s[2]] = d.get(s[2], 0) + 1
>   d[s[3]] = d.get(s[3], 0) + 1
>   d[s[4]] = d.get(s[4], 0) + 1
>   return d

There's nothing wrong with getting students to recreate things, but
there are so many different levels on which you could do this, which
will leave your more advanced students wondering what's legal. :) Here
are several ways to do the same thing:

>>> s = "hello world"

>>> from collections import Counter; Counter(s)
Counter({'l': 3, 'o': 2, 'h': 1, 'e': 1, ' ': 1, 'w': 1, 'r': 1, 'd': 1})

>>> from collections import defaultdict
>>> hist = defaultdict(int)
>>> for ltr in s: hist[ltr] += 1
...
>>> hist
defaultdict(<class 'int'>, {'h': 1, 'e': 1, 'l': 3, 'o': 2, ' ': 1,
'w': 1, 'r': 1, 'd': 1})

>>> class Incrementer(dict):
... def __missing__(self, key): return 0
...
>>> hist = Incrementer()
>>> for ltr in s: hist[ltr] += 1
...
>>> hist
{'h': 1, 'e': 1, 'l': 3, 'o': 2, ' ': 1, 'w': 1, 'r': 1, 'd': 1}

>>> hist = {}
>>> for ltr in s: hist[ltr] = hist.get(ltr, 0) + 1
...
>>> hist
{'h': 1, 'e': 1, 'l': 3, 'o': 2, ' ': 1, 'w': 1, 'r': 1, 'd': 1}

>>> hist = {}
>>> for ltr in s:
... if ltr in hist: hist[ltr] += 1
... else: hist[ltr] = 1
...
>>> hist
{'h': 1, 'e': 1, 'l': 3, 'o': 2, ' ': 1, 'w': 1, 'r': 1, 'd': 1}

>>> hist = {}
>>> for ltr in s:
... try: hist[ltr] += 1
... except KeyError: hist[ltr] = 1
...
>>> hist
{'h': 1, 'e': 1, 'l': 3, 'o': 2, ' ': 1, 'w': 1, 'r': 1, 'd': 1}

A Counter shows the values ranked, all the others retain insertion
order, but they all get to the same result.

It seems *very* strange to have an exercise like this without looping.
That seems counterproductive. But if you're expecting them to not use
loops, you'll want to also be very clear about what other features
they're allowed to use - or alternatively, stipulate what they ARE
allowed to use, eg "Use only indexing and the get() method".

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: some problems for an introductory python test

2021-08-09 Thread Chris Angelico
On Tue, Aug 10, 2021 at 1:41 PM Mats Wichmann  wrote:
>
>
> On 8/9/21 6:34 PM, Chris Angelico wrote:
>
> > If you want to highlight the OOP nature of Python, rather than looking
> > at magic methods, I'd first look at polymorphism. You can add a pair
> > of integers; you can add a pair of tuples; you can add a pair of
> > strings. Each one logically adds two things together and gives a
> > result, and they're all spelled the exact same way. Dunder methods are
> > a way for custom classes to slot into that same polymorphism, but the
> > polymorphism exists first and the dunders come later.
> >
> > ChrisA
> >
>
> not disagreeing... and yeah I could have thought deeper about the
> answer, but I still think "nothing has been OOP" -> "yes it has, they
> just didn't realize it"  was worth mentioning

Oh yes, absolutely agree.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: some problems for an introductory python test

2021-08-09 Thread Chris Angelico
On Tue, Aug 10, 2021 at 8:19 AM Mats Wichmann  wrote:
> Even if you do
>
> x = 2 + 3
>
> you're actually creating an integer object with a value of 2, and
> calling its add method to add the integer object with the value of 3 to
> it. The syntax hides it, but in a way it's just convenience that it does
> so...
>
>  >>> 2 + 3
> 5
>  >>> x = 2
>  >>> x.__add__(3)
> 5
>
>
> sorry for nitpicking :)  But... don't be afraid of letting them know
> it's OOP, and it''s not huge and complex and scary!
>

Since we're nitpicking already, "2 + 3" isn't the same as
"(2).__add__(3)"; among other things, it's able to call
(3).__radd__(2) instead. Plus there's technicalities about type slots
and such.
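
A quick way to see the difference, with throwaway class names:

>>> class Left:
...     def __add__(self, other): return NotImplemented
...
>>> class Right:
...     def __radd__(self, other): return "handled by __radd__"
...
>>> Left() + Right()
'handled by __radd__'
>>> Left().__add__(Right())
NotImplemented

The + operator falls back to the right operand's __radd__; the
explicit method call just hands you the NotImplemented.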

If you want to highlight the OOP nature of Python, rather than looking
at magic methods, I'd first look at polymorphism. You can add a pair
of integers; you can add a pair of tuples; you can add a pair of
strings. Each one logically adds two things together and gives a
result, and they're all spelled the exact same way. Dunder methods are
a way for custom classes to slot into that same polymorphism, but the
polymorphism exists first and the dunders come later.
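
For instance:

>>> 2 + 3
5
>>> (1, 2) + (3, 4)
(1, 2, 3, 4)
>>> "spam" + "eggs"
'spameggs'

Same spelling, three different types, three type-appropriate results.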

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: some problems for an introductory python test

2021-08-09 Thread Chris Angelico
On Tue, Aug 10, 2021 at 7:25 AM Hope Rouselle  wrote:
> I came up with the following question.  Using strings of length 5
> (always), write a procedure histogram(s) that consumes a string and
> produces a dictionary whose keys are each substrings (of the string) of
> length 1 and their corresponding values are the number of times each
> such substrings appear.  For example, histogram("aaaaa") = {"a": 5}.
> Students can "loop through" the string by writing out s[0], s[1], s[2],
> s[3], s[4].

In other words, recreate collections.Counter? Seems decent, but you'll
need to decide whether you want them to use defaultdict, use
__missing__, or do it all manually.

> I think you get the idea.  I hope you can provide me with creativity.  I
> have been looking at books, but every one I look at they introduce loops
> very quickly and off they go.  Thank you!

Probably because loops are kinda important? :)

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on slices, negative indices, which are the equivalent procedures?

2021-08-09 Thread Chris Angelico
On Tue, Aug 10, 2021 at 7:24 AM Jack Brandom  wrote:
>
> Greg Ewing  writes:
>
> > On 6/08/21 12:00 pm, Jack Brandom wrote:
> >> It seems
> >> that I'd begin at position 3 (that's "k" which I save somewhere), then I
> >> subtract 1 from 3, getting 2 (that's "c", which I save somewhere), then
> >> I subtract 1 from 2, getting 1 (that's "a", ...), then I subtract 1 from
> >> 1, getting 0 (that's J, ...), so I got "kcaJ" but my counter is 0 not
> >> -13, which was my stopping point.
> >
> > You need to first replace any negative or missing indices with
> > equivalent indices measured from the start of the string.
> >
> > When you do that in this example, you end up iterating backwards from 3
> > and stopping at -1.
>
> Yeah, that makes sense now.  But it sucks that the rule for replacing
> negative indices is sometimes missing index and sometimes positive
> index.  (That is, we can't always use positive indices.  Sometimes we
> must use no index at all.  I mean that's how it looks to my eyes.)

Sometimes, the index you need to use is the value None. You cannot
use a positive number to indicate the position to the left of zero -
at least, not if you consider numbers to be on a number line.
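
A few cases side by side, using the same string as earlier in the
thread:

>>> s = "Jack Brandom"
>>> s[3::-1]        # omitted stop: go all the way back to the start
'kcaJ'
>>> s[3:None:-1]    # None is the explicit spelling of "no stop"
'kcaJ'
>>> s[3:0:-1]       # any non-negative stop excludes index 0
'kca'
>>> s[3:-1:-1]      # and -1 means the last character, so this is empty
''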

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: on slices, negative indices, which are the equivalent procedures?

2021-08-06 Thread Chris Angelico
On Sat, Aug 7, 2021 at 5:22 AM Boris Dorestand  wrote:
>
> Jach Feng  writes:
>
> >> > s = "Jack Brandom"
> >> > s[3 : -13 : -1]
> >> >> 'kcaJ'
> >> >> I have no idea how to replace that -13 with a positive index. Is it
> >> >> possible at all?
> > That's not possible because a positive index is relative to the leftmost 
> > item 0
>
> And the middle index is always exclusive, so we can't go to the left of
> 0 and remain positive.  Okay, I think that answers it.  It's not
> possible at all.
>

An absent index isn't the same as any specific positive value, so,
yeah, it's not possible to replace it with a positive index. It IS
possible to replace it with None.

>>> s = "Jack Brandom"
>>> s[3:None:-1]
'kcaJ'

You could implement equivalent logic in your function.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

