Re: Lprint = ( Lisp-style printing ( of lists and strings (etc.) ) in Python )

2024-06-01 Thread Peter J. Holzer via Python-list
On 2024-05-30 21:47:14 -0700, HenHanna via Python-list wrote:
> [('the', 36225), ('and', 17551), ('of', 16759), ('i', 16696), ('a', 15816),
> ('to', 15722), ('that', 11252), ('in', 10743), ('it', 10687)]
> 
> ((the 36225) (and 17551) (of 16759) (i 16696) (a 15816) (to 15722) (that
> 11252) (in 10743) (it 10687))
> 
> 
> i think the latter is easier-to-read, so i use this code
>    (by Peter Norvig)

This doesn't work well if your strings contain spaces:

Lprint(
    [
        ["Just", "three", "words"],
        ["Just", "three words"],
        ["Just three", "words"],
        ["Just three words"],
    ]
)

prints:

((Just three words) (Just three words) (Just three words) (Just three words))

Output is often a compromise between readability and precision.


> def lispstr(exp):
>     # "Convert a Python object back into a Lisp-readable string."
>     if isinstance(exp, list):

This won't work for your example, since you have a list of tuples, not a
list of lists and a tuple is not an instance of a list.

>         return '(' + ' '.join(map(lispstr, exp)) + ')'
>     else:
>         return str(exp)
> 
> def Lprint(x): print(lispstr(x))
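
One possible way to address both points (strings containing spaces, and tuples not being lists) is sketched below. This is a variation for illustration, not Norvig's code: it treats tuples like lists and puts double quotes around strings, which costs a little readability and gains precision. The escaping rule is an assumption.

```python
def lispstr(exp):
    """Render nested lists and tuples Lisp-style, quoting strings."""
    if isinstance(exp, (list, tuple)):
        return "(" + " ".join(map(lispstr, exp)) + ")"
    if isinstance(exp, str):
        # Assumption: backslash-escaped double quotes are precise enough.
        escaped = exp.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return str(exp)


def Lprint(x):
    print(lispstr(x))
```

With this version ["Just", "three words"] and ["Just three", "words"] no longer print identically.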

I like to use pprint, but it's lacking support for user-defined types. I
should be able to add a method (maybe __pprint__?) to my classes which
handles proper formatting (with line breaks and indentation).

hp
-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: From JoyceUlysses.txt -- words occurring exactly once

2024-06-01 Thread Peter J. Holzer via Python-list
On 2024-05-30 19:26:37 -0700, HenHanna via Python-list wrote:
> hard to decide what to do with hyphens
>and apostrophes
>  (I'd,  he's,  can't, haven't,  A's  and  B's)

Especially since the same character is used as both an apostrophe and a
closing quotation mark. And while that's pretty unambiguous between two
characters it isn't at the end of a word:

This is Alex’ house.
This type of building is called an ‘Alex’ house.
The sentence ‘We are meeting at Alex’ house’ contains an apostrophe.

(using proper Unicode quotation marks. It gets worse if you stick to
ASCII.)

Personally I like to use U+0027 APOSTROPHE as an apostrophe and U+2018
LEFT SINGLE QUOTATION MARK and U+2019 RIGHT SINGLE QUOTATION MARK as
single quotation marks[1], but despite the suggestive names, this is not
the common typographical convention, so your texts are unlikely to make
this distinction.
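
To make the ambiguity concrete, here is a small sketch that names the three characters and flags every U+2019 that follows a word character without being followed by one. Those are exactly the spots where a program cannot tell an apostrophe from a closing quotation mark. The function names are made up for this example.

```python
import re
import unicodedata

# U+2019 directly after a word character and not followed by one.
WORD_FINAL_U2019 = re.compile(r"(?<=\w)\u2019(?!\w)")


def describe(ch):
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def ambiguous_positions(text):
    """Return the indexes of U+2019 characters that could be either role."""
    return [m.start() for m in WORD_FINAL_U2019.finditer(text)]
```

A U+2019 between two letters (as in "can’t") is not reported, since it can only be an apostrophe there.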

hp

[1] Which I use rarely, anyway.



Re: Terminal Emulator (Posting On Python-List Prohibited)

2024-05-20 Thread Peter J. Holzer via Python-list
On 2024-05-20 00:26:03 +0200, Roel Schroeven via Python-list wrote:
> Skip Montanaro via Python-list schreef op 20/05/2024 om 0:08:
> > > Modern debian (ubuntu) and fedora block users installing using pip.
> > 
> > Even if you're telling it to install in ~/.local? I could see not allowing
> > to run it as root.
> 
> I assumed pip install --user would work, but no. I tried it (on Debian 12
> (bookworm)):
> 
> > $ pip install --user docopt
> > error: externally-managed-environment
> > 
> > × This environment is externally managed
> > ╰─> To install Python packages system-wide, try apt install
> >     python3-xyz, where xyz is the package you are trying to
> >     install.
> > 
> >     If you wish to install a non-Debian-packaged Python package,
> >     create a virtual environment using python3 -m venv path/to/venv.
> >     Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make
> >     sure you have python3-full installed.
> > 
> >     If you wish to install a non-Debian packaged Python application,
> >     it may be easiest to use pipx install xyz, which will manage a
> >     virtual environment for you. Make sure you have pipx installed.
> > 
> >     See /usr/share/doc/python3.11/README.venv for more information.
> > 
> > note: If you believe this is a mistake, please contact your Python
> > installation or OS distribution provider. You can override this, at the
> > risk of breaking your Python installation or OS, by passing
> > --break-system-packages.
> > hint: See PEP 668 for the detailed specification.
> 
> Exactly the same output for sudo pip install.

This message (quoted in all its glory) is too long to be useful. The
important bit is at the end:

> > You can override this, at the risk of breaking your Python
> > installation or OS, by passing --break-system-packages.

(I admit I didn't see this the first time I got this message)

python3 -m pip install --user --break-system-packages 
does indeed install into ~/.local/lib/python3.XX/site-packages.

This is inconvenient, but otoh I have accidentally installed packages into
~/.local in the past, so maybe it's good to make that more explicit.
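
For the curious, a script can predict whether pip will refuse. As far as I understand PEP 668, the distribution marks the environment with a file named EXTERNALLY-MANAGED in the stdlib directory, and the check only applies outside a virtual environment. The sketch below follows that reading; the function names are mine, and pip's real logic may differ in detail.

```python
import sys
import sysconfig
from pathlib import Path


def externally_managed_marker():
    """Path where PEP 668 expects the marker file (assumption: default scheme)."""
    return Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"


def in_virtualenv():
    """True if the running interpreter belongs to a venv."""
    return sys.prefix != sys.base_prefix


def pip_would_refuse():
    """Rough prediction of the externally-managed-environment error."""
    return not in_virtualenv() and externally_managed_marker().is_file()
```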

hp



venvs vs. package management (was: Terminal Emulator (Posting On Python-List Prohibited))

2024-05-19 Thread Peter J. Holzer via Python-list
On 2024-05-18 20:12:33 +0200, Piergiorgio Sartor via Python-list wrote:
> On 18/05/2024 20.04, Mats Wichmann wrote:
> > So venvs make managing all that pretty convenient. Dunno why everybody's
> > so down on venvs...
> 
> Only people which are *not* using python... :-)
> 
> In my experience, venvs is the only possible
> way to use python properly.

That very much depends on what you mean by properly.

Personally, I use venvs a lot. But most of the reasons have more to do
with team culture than technical constraints. In a different situation
(e.g. if all our developers used Linux and preferably the same version)
I could see myself using venvs much less or maybe not at all.

> The dependency nightmare created by python, pip and all the rest
> cannot be resolved otherwise.

That's what package management on Linux is for. Sure, it means that you
won't have the newest version of anything and some packages not at all,
but you don't have to care about dependencies. Or updates.

(Missing packages can be a problem: Is there a script to automatically
generate .deb packages from PyPI? I haven't looked recently ...)

> It seems backward compatibility is a taboo...

I have recently written a script which checks out the newest version of
the project, creates a fresh venv using a requirements.txt without
version numbers and runs the test suite. If there is any action required
(either because a test fails or because there is a newer version of any
dependent package) it will create a ticket in redmine. Oh, and this
script runs on a staging server which has the same Linux distribution
(and hence the same Python version) as the production server.
Seems to work, but that is only necessary because we are using venvs. If
we relied on the distro's package management that would basically be a
non-issue.
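
A rough sketch of what such a check could look like follows. It is not the actual script: git, pytest, the directory layout and the function names are all assumptions, and the part that opens a ticket is left out.

```python
import subprocess
import sys
from pathlib import Path


def check_commands(workdir, python=sys.executable):
    """Build the commands for one check run (POSIX venv layout assumed)."""
    workdir = Path(workdir)
    venv_dir = workdir / "venv"
    venv_python = str(venv_dir / "bin" / "python")
    return [
        # Assumption: the project lives in a git checkout.
        ["git", "-C", str(workdir), "pull", "--ff-only"],
        # Start from an empty venv every time.
        [python, "-m", "venv", "--clear", str(venv_dir)],
        # requirements.txt has no version numbers, so this pulls the newest releases.
        [venv_python, "-m", "pip", "install", "-r", str(workdir / "requirements.txt")],
        # Assumption: the test suite runs under pytest.
        [venv_python, "-m", "pytest", str(workdir)],
    ]


def run_check(workdir):
    """Run the commands in order; return the first failing one, or None."""
    for cmd in check_commands(workdir):
        if subprocess.run(cmd, check=False).returncode != 0:
            return cmd
    return None
```

Whatever run_check() returns would then decide whether a ticket gets created.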

hp



Re: Terminal Emulator (Posting On Python-List Prohibited)

2024-05-18 Thread Peter J. Holzer via Python-list
On 2024-05-16 19:46:07 +0100, Gordinator via Python-list wrote:
> To be fair, the problem is the fact that they use Windows (but I guess Linux
> users have to deal with venvs, so we're even.

I don't think Linux users have to deal with venvs any more than Windows
users. Maybe even less because many distributions come with a decent
set of Python packages.

hp



Re: Terminal Emulator

2024-05-18 Thread Peter J. Holzer via Python-list
On 2024-05-14 22:37:17 +0200, Mirko via Python-list wrote:
> Am 14.05.24 um 19:44 schrieb Gordinator via Python-list:
> > I wish to write a terminal emulator in Python. I am a fairly competent
> > Python user, and I wish to try a new project idea. What references can I
> > use when writing my terminal emulator? I wish for it to be a true
> > terminal emulator as well, not just a Tk text widget or something like
> > that.
> > 
> > If you have any advice, please do let me know!
> 
> 
> Not sure, what you mean with:
> 
> > true terminal emulator as well, not just a Tk text widget or
> > something like that
> If you want to write a GUI terminal, than that *is* a terminal emulator and
> *has* a text widget as its visible core. If you want to write something like
> getty which runs on the virtual terminals (Ctrl+Alt+F*) than that is a
> terminal (not a terminal emulator).

Getty isn't a terminal (unless there is another program of the same
name). It's a small program to set up a serial communication line.
Basically it waits for the modem to connect, sets the relevant
parameters (bit rate, byte width, parity, ...) and then hands over to
login. Of course in the case of a linux console there is no modem and no
serial line involved, so it doesn't have much to do. (Of course this
raises the question whether the Linux console is a terminal or a
terminal emulator ...)

hp



Re: Terminal Emulator

2024-05-18 Thread Peter J. Holzer via Python-list
On 2024-05-14 16:03:33 -0400, Grant Edwards via Python-list wrote:
> On 2024-05-14, Alan Gauld via Python-list  wrote:
> > On 14/05/2024 18:44, Gordinator via Python-list wrote:
> >
> >> I wish to write a terminal emulator in Python. I am a fairly
> >> competent Python user, and I wish to try a new project idea. What
> >> references can I use when writing my terminal emulator? I wish for
> >> it to be a true terminal emulator as well, not just a Tk text
> >> widget or something like that.
> >
> > The first thing is to decide which terminal.
> 
> If you want to make life easier, make it a superset of a terminal that
> already exists in the terminfo database.
> 
> Going with some sort of ANSI terminal will probably provide
> operability even with dumb apps which ignore $TERM and just spit out
> basic ANSI escape sequences.

And if you want to go for a superset, xterm might be one of the more
useful: https://www.xfree86.org/current/ctlseqs.html
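
To give an idea of the first step in any such emulator, here is a minimal sketch that splits a stream into plain text and CSI control sequences (ESC [ followed by parameter bytes 0x30-0x3F, intermediate bytes 0x20-0x2F and a final byte 0x40-0x7E). It only recognises the sequences; acting on them is the real work. The function name and the tuple layout are invented for this example.

```python
import re

# ESC [ parameters intermediates final
CSI = re.compile(r"\x1b\[([0-?]*)([ -/]*)([@-~])")


def tokenize(data):
    """Split data into ("text", s) and ("csi", params, final) tuples."""
    tokens = []
    pos = 0
    for m in CSI.finditer(data):
        if m.start() > pos:
            tokens.append(("text", data[pos:m.start()]))
        tokens.append(("csi", m.group(1), m.group(3)))
        pos = m.end()
    if pos < len(data):
        tokens.append(("text", data[pos:]))
    return tokens
```

A real emulator would also have to cope with sequences split across reads and with the other escape types (OSC, DCS, ...), which this sketch ignores.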

> If you really want to break trail, you could invent your own control
> sequences, which means you'll have to write terminfo and/or termcap
> entries as well as the terminal emulator.

Right. A saner model than ANSI and its supersets might be a good idea
conceptually. But I'd expect quite a few programs to break.


> > A VT100 is very different from a 3270. And even a VT330 is quite
> > different from a VT100 although sharing a common subset of control
> > codes. And if you start looking at graphical terminals things get
> > even more interesting!
> 
> "Intersting" is putting it mildly...

Yup. Also there aren't many programs which use, e.g. xterm's
pixel-graphics capabilities.

OTOH, there is something like domterm[1], which can (theoretically)
display anything a browser can display.

hp


[1] https://domterm.org/index.html



Re: Python Dialogs

2024-05-04 Thread Peter J. Holzer via Python-list
On 2024-05-02 16:34:38 +0200, Loris Bennett via Python-list wrote:
> r...@zedat.fu-berlin.de (Stefan Ram) writes:
> >   Me (indented by 2) and the chatbot (flush left). Lines lengths > 72!
> 
> Is there a name for this kind of indentation, i.e. the stuff you are
> writing not being flush left?

Ramism.

> It is sort of contrary to what I think of as "normal" indentation.

Stefan is well known for doing everything contrary to normal convention.

hp



Re: xkcd.com/353 ( Flying with Python )

2024-03-31 Thread Peter J. Holzer via Python-list
On 2024-03-31 12:27:34 -0600, Mats Wichmann via Python-list wrote:
> On 3/30/24 10:31, MRAB via Python-list wrote:
> > On 2024-03-30 11:25, Skip Montanaro via Python-list wrote:
> > > > > https://xkcd.com/1306/
> > > > >   what does  SIGIL   mean?
> > > > 
> > > > I think its' a Perl term, referring to the $/@/# symbols in front of
> > > > identifiers.

[You cut out a lot of context here]

> > I wouldn't consider '@' to be a sigil any more than I would a unary minus.
> 
> Nonetheless, Perl folk do use that term, specifically.

I'm pretty sure he's referring to the use of @ in Python to denote a
decorator here. Which is a totally different thing than a Perl sigil.

hp



Re: xkcd.com/353 ( Flying with Python )

2024-03-31 Thread Peter J. Holzer via Python-list
On 2024-03-30 17:58:08 +, Alan Gauld via Python-list wrote:
> On 30/03/2024 07:04, Greg Ewing via Python-list wrote:
> > On 30/03/24 7:21 pm, HenHanna wrote:
> >> https://xkcd.com/1306/
> >>   what does  SIGIL   mean?
> > 
> > I think its' a Perl term, referring to the $/@/# symbols in front of
> > identifiers.

Correct (although strictly speaking they are in front of an expression,
not an identifier).

> There seem to be several derivation sources including a fantasy world
> city suspended above a very thin, tall steeple
>
> Personally, I know SIGIL as an opensource EPUB editor!

Well, it's an ordinary English word of Latin origin (sigillum means
literally "small sign") in use since the 15th century. No need to go
hunting for proper names.

> None of them seem to have any direct connection to the xkcd cartoon.

In my opinion the connection to Perl sigils is very direct.

hp




Re: Configuring an object via a dictionary

2024-03-17 Thread Peter J. Holzer via Python-list
On 2024-03-17 17:15:32 +1300, dn via Python-list wrote:
> On 17/03/24 12:06, Peter J. Holzer via Python-list wrote:
> > On 2024-03-16 08:15:19 +, Barry via Python-list wrote:
> > > > On 15 Mar 2024, at 19:51, Thomas Passin via Python-list 
> > > >  wrote:
> > > > I've always like writing using the "or" form and have never gotten bit
> > > 
> > > I, on the other hand, had to fix a production problem that using “or” 
> > > introducted.
> > > I avoid this idiom because it fails on falsy values.
> > 
> > Perl has a // operator (pronounced "err"), which works like || (or),
> > except that it tests whether the left side is defined (not None in
> > Python terms) instead of truthy. This still isn't bulletproof but I've
> > found it very handy.
> 
> 
> So, if starting from:
> 
> def method( self, name=None, ):
> 
>  rather than:
> 
> self.name = name if name else default_value
> 
> ie
> 
> self.name = name if name is True else default_value

These two lines don't have the same meaning (for the reason you outlined
below). The second line is also not very useful.



> the more precise:
> 
> self.name = name if name is not None or default_value
> 
> or:
> 
> self.name = default_value if name is None or name

Those are syntax errors. I think you meant to write "else" instead of
"or".

Yes, exactly. That's the semantics of Perl's // operator.

JavaScript has a ?? operator with similar semantics (slightly
complicated by the fact that JavaScript has two "nullish" values).
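
For reference, the variants discussed above written as valid Python, side by side. DEFAULT_NAME and the function names are placeholders for this sketch.

```python
DEFAULT_NAME = "anonymous"  # hypothetical default value


def with_or(name=None):
    # Replaces every falsy value ("", 0, [], ...), not just None.
    return name or DEFAULT_NAME


def with_none_test(name=None):
    # Only None is replaced; this is what Perl's // does with undef.
    return name if name is not None else DEFAULT_NAME


def with_none_test_inverted(name=None):
    # Same meaning, condition turned around.
    return DEFAULT_NAME if name is None else name
```

The difference only shows for falsy values that are not None: with_or("") returns the default, the other two return the empty string.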

hp



Re: Configuring an object via a dictionary

2024-03-16 Thread Peter J. Holzer via Python-list
On 2024-03-16 08:15:19 +, Barry via Python-list wrote:
> > On 15 Mar 2024, at 19:51, Thomas Passin via Python-list 
> >  wrote:
> > I've always like writing using the "or" form and have never gotten bit
> 
> I, on the other hand, had to fix a production problem that using “or” 
> introducted.
> I avoid this idiom because it fails on falsy values.

Perl has a // operator (pronounced "err"), which works like || (or),
except that it tests whether the left side is defined (not None in
Python terms) instead of truthy. This still isn't bulletproof but I've
found it very handy.
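
Python has no such operator, but a small helper gets close. The sketch below is only an illustration (first_defined is a made-up name); it returns the first argument that is not None, the way a chain of // would return the first defined value.

```python
def first_defined(*values, default=None):
    """Return the first argument that is not None, else default."""
    for value in values:
        if value is not None:
            return value
    return default
```

Unlike "or", it keeps legitimate falsy values: first_defined(None, 0, 5) gives 0, whereas None or 0 or 5 gives 5.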

hp



Re: A Single Instance of an Object?

2024-03-11 Thread Peter J. Holzer via Python-list
On 2024-03-11 16:53:00 -0400, Ivan "Rambius" Ivanov via Python-list wrote:
> I am refactoring some code and I would like to get rid of a global
> variable. Here is the outline:

...

> The global cache variable made unit testing of the lookup(key) method
> clumsy, because I have to clean it after each unit test. I refactored
> it as:
> 
> class Lookup:
>     def __init__(self):
>         self.cache = {}
> 
>     def lookup(key):
>         if key in self.cache:
>             return self.cache[key]
> 
>         value = None
> 
>         cmd = f"mycmd {key}"
>         proc = subprocess(cmd, capture_output=True, text=True, check=False)
>         if proc.returncode == 0:
>             value = proc.stdout.strip()
>         else:
>             logger.error("cmd returned error")
> 
>         self.cache[key] = value
>         return value
> 
> Now it is easier to unit test, and the cache is not global. However, I
> cannot instantiate Lookup inside the while- or for- loops in main(),
> because the cache should be only one. I need to ensure there is only
> one instance of Lookup - this is why I made it a global variable, so
> that it is accessible to all functions in that script and the one that
> actually needs it is 4 levels down in the call stack.
[...]
> I am looking for the same behaviour as logging.getLogger(name).
> logging.getLogger("myname") will always return the same object no
> matter where it is called as long as the name argument is the same.
> 
> How would you advise me to implement that?

Just add a dict of Lookup objects to your module:

lookups = {}

def get_lookup(name):
    if name not in lookups:
        lookups[name] = Lookup()
    return lookups[name]

Then (assuming your module is also called "lookup"), in all other modules
do

import lookup

lo = lookup.get_lookup("whatever")

...
v = lo.lookup("a key")

In your test cases where you need many different lookup tables use

lo = lookup.get_lookup("test1")
...
lo = lookup.get_lookup("test2")
...
lo = lookup.get_lookup("test3")

hp

PS: You don't have to put that in a separate module but I think it's a
lot cleaner that way.
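
Put together as one self-contained sketch: to keep it runnable without the external command, Lookup here takes a compute callable in place of the subprocess call. That parameter (and its default) is an assumption made for the example, not part of the original code.

```python
class Lookup:
    def __init__(self, compute):
        # Hypothetical stand-in for running "mycmd {key}" in a subprocess.
        self.compute = compute
        self.cache = {}

    def lookup(self, key):
        if key not in self.cache:
            self.cache[key] = self.compute(key)
        return self.cache[key]


_lookups = {}


def get_lookup(name, compute=str.upper):
    """Return the Lookup registered under name, creating it on first use."""
    if name not in _lookups:
        _lookups[name] = Lookup(compute)
    return _lookups[name]
```

Like logging.getLogger(), get_lookup("test1") hands out the same object every time, while get_lookup("test2") starts with an empty cache.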



Re: Testing (sorry)

2024-02-20 Thread Peter J. Holzer via Python-list
On 2024-02-19 11:38:54 -0500, Thomas Passin via Python-list wrote:
> On 2/19/2024 9:17 AM, Grant Edwards via Python-list wrote:
> > On 2024-02-19, Thomas Passin  wrote:
> > > > About 24 hours later, all of my posts (and the confirmation e-mails)
> > > > all showed up in a burst at the same time on two different unrelated
> > > > e-mail accounts.
> > > > 
> > > > I still have no clue what was going on...
> > > 
> > > Sometimes a post of mine will not show up for hours or even half a day.
> > > They are all addressed directly to the list.  Sometimes my email
> > > provider sends me a notice that the message bounced.  Those notices say
> > > that the address wasn't available when the transmission was tried.
> 
> Here is a typical bounce message that I get:
> 
> : host mail.python.org[188.166.95.178] said:
> 450-4.3.2
> Service currently unavailable 450 4.3.2

This is a *temporary* error. Your provider's server should retry
delivering the message for a decent amount of time (3 to 7 days is
customary).

Your provider's server may send you a notification that the mail cannot
currently be delivered and that it will keep trying. Such notifications
are usually sent after a much shorter period (a few hours).
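
The distinction is encoded in the first digit of the SMTP reply code: 4xx means "try again later", 5xx means "give up". A tiny sketch (the function names are mine):

```python
def classify_smtp_reply(code):
    """Classify an SMTP reply code by its first digit."""
    kinds = {
        2: "success",
        3: "intermediate",
        4: "transient failure",   # the sender should queue and retry
        5: "permanent failure",   # the sender should bounce the message
    }
    return kinds.get(int(code) // 100, "unknown")


def should_retry(code):
    """True for temporary errors such as 450."""
    return int(code) // 100 == 4
```

So the 450 above falls into the retry category, while something like 550 would be a real bounce.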

hp



Re: Testing (sorry)

2024-02-18 Thread Peter J. Holzer via Python-list
[Replying to the list *and* Grant]

On 2024-02-17 19:38:04 -0500, Grant Edwards via Python-list wrote:
> Today I noticed that nothing I've posted to python-list in past 3
> weeks has shown up on the list.

January 29th, AFAICS. And end of December before that.

> I don't know how to troubleshoot this other than sending test
> messages.  Obviously, if this shows up on the list, then I've gotten
> it to work...

This did show up and 3 other test messages with very similar text
as well. 

Also there was a whole flurry of almost but not quite identical messages
from you in the "nan" thread.

hp



Re: Using my routines as functions AND methods

2024-01-06 Thread Peter J. Holzer via Python-list
On 2024-01-03 23:17:34 -0500, Thomas Passin via Python-list wrote:
> On 1/3/2024 8:00 PM, Alan Gauld via Python-list wrote:
> > On 03/01/2024 22:47, Guenther Sohler via Python-list wrote:
> > > Hi,
> > > 
> > > In my cpython i have written quite some functions to modify "objects".
> > > and their python syntax is e.g.\
> > > 
> > > translate(obj, vec). e.g whereas obj is ALWAYS first argument.
                ^^^
> > 
> > > However, I also want to use these functions as class methods without 
> > > having
> > > to
> > > write the function , twice. When using the SAME function as a methos, the
> > > args tuple must insert/contain "self" in the first location, so i have
> > > written a function to do that:
> > 
> > I'm probably missing something obvious here but can't you
> > just assign your function to a class member?
> > 
> > def myFunction(obj, ...): ...
                   ^^^
> > 
> > class MyClass:
> >  myMethod = myFunction
> > 
> > 
> > Then you can call it as
> > 
> > myObject = MyClass()
> > myObject.myMethod()
> > 
> > A naive example seems to work but I haven't tried anything
> > complex so there is probably a catch. But sometimes the simple
> > things just work?
> 
> That works if you assign the function to a class instance, but not if you
> assign it to a class.
> 
> def f1(x):
>     print(x)

You omitted the first argument (obj).

That should be 

def f1(obj, x):
    print(x)


> f1('The plain function')

> 
> class Class1:
>     pass

o = Class1()
f1(o, 'The plain function')

works for me.


> class Class2:
>     pass
> 
> c1 = Class1()
> c1.newfunc = f1
> c1.newfunc('f1 assigned to instance') # Works as intended

Now this doesn't work any more (but the OP doesn't want that anyway,
AFAICT).


> Class2.newfunc = f1
> c2 = Class2()
> c2.newfunc('f1 assigned to class')  # Complains about extra argument

But this does.
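
The whole thing in one runnable piece, as a sketch with made-up names (translate and Shape only stand in for the OP's functions and classes):

```python
def translate(obj, vec):
    """Plain function whose first argument is always the object."""
    return (obj.name, vec)


class Shape:
    def __init__(self, name):
        self.name = name


# Assigned to the class: lookup on an instance binds it as a method,
# so the instance is passed as obj automatically.
Shape.translate = translate

box = Shape("box")
as_function = translate(box, (1, 0, 0))
as_method = box.translate((1, 0, 0))

# Assigned to an instance: no binding happens, obj must be passed by hand.
box.moved = translate
unbound = box.moved(box, (0, 1, 0))
```

Calling box.moved((0, 1, 0)) without the explicit object raises a TypeError, because vec is then missing.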

hp



Re: How/where to store calibration values - written by program A, read by program B

2023-12-30 Thread Peter J. Holzer via Python-list
On 2023-12-29 09:01:24 -0800, Grant Edwards via Python-list wrote:
> On 2023-12-28, Peter J. Holzer via Python-list  wrote:
> > On 2023-12-28 05:20:07 +, rbowman via Python-list wrote:
> >> On Wed, 27 Dec 2023 03:53:42 -0600, Greg Walters wrote:
> >> > The biggest caveat is that the shared variable MUST exist before it can
> >> > be examined or used (not surprising).
> >> 
> >> There are a few other questions. Let's say config.py contains a variable 
> >> like 'font' that is a user set preference or a calibration value 
> >> calculated by A to keep with the thread title. Assuming both scripts are 
> >> running, how does the change get propagated to B after it is set in A
> >
> > It isn't. The variable is set purely in memory. This is a mechanism to
> > share a value between multiple modules used by the same process, not to
> > share between multiple processes (whether they run the same or different
> > scripts)
> >
> >> and written to the shared file?
> >
> > Nothing is ever written to a file.
> 
> Then how does it help the OP to propogate clibration values from one
> program to another or from one program run to the next run?

It doesn't. See his second mail in this thread, where he explains it in
a bit more detail. I think he might be a bit confused in his
terminology.

hp



Re: How/where to store calibration values - written by program A, read by program B

2023-12-28 Thread Peter J. Holzer via Python-list
On 2023-12-28 05:20:07 +, rbowman via Python-list wrote:
> On Wed, 27 Dec 2023 03:53:42 -0600, Greg Walters wrote:
> > The biggest caveat is that the shared variable MUST exist before it can
> > be examined or used (not surprising).
> 
> There are a few other questions. Let's say config.py contains a variable 
> like 'font' that is a user set preference or a calibration value 
> calculated by A to keep with the thread title. Assuming both scripts are 
> running, how does the change get propagated to B after it is set in A

It isn't. The variable is set purely in memory. This is a mechanism to
share a value between multiple modules used by the same process, not to
share between multiple processes (whether they run the same or different
scripts)

> and written to the shared file?

Nothing is ever written to a file.

You could of course write python files from a python script (in fact I
do this), but that's not what this pattern is about, AFAICS.

hp



Re: Python 3.12.1, Windows 11: shebang line #!/usr/bin/env python3 doesn't work any more

2023-12-23 Thread Peter J. Holzer via Python-list
On 2023-12-22 22:56:45 -0500, Thomas Passin via Python-list wrote:
> In my experience one should always make sure to know what version of Python
> is being used, at least if there is more than one version installed on the
> computer.  Even on Linux using a shebang line can be tricky, because you are
> likely to get the system's version of Python,

You are not "likely" to get the system's version of Python, you get the
version of Python you specify. If you specify "/usr/bin/python3", that's
the system's version of Python. If you specify something else, you get
something else. If you specify "/usr/bin/env python3", you get whatever
the user has in their PATH first.


> and that often is not what you want.  OTOH you don't want to go
> symlinking python3 to some other version of python because then the OS
> system may not work right.  So either you have to specify the Python
> version in the shebang,

This. In my considered opinion installed scripts should work regardless
of the user's PATH, so they must have the correct interpreter in the
shebang. That specifying the correct shebang pulls in the complete
virtual environment is IMHO a great feature of Python.

I've written a small script "install-python" which basically just copies
a file and adjusts the shebang line.
<https://git.hjp.at:3000/hjp/install-python/src/branch/master/install-python>
for the use in Makefiles etc.
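
The core of the idea fits in a few lines. The following is not the linked script, just a sketch of the shebang adjustment under the assumption that the interpreter path is known at install time; with_shebang is a name invented here.

```python
def with_shebang(source, interpreter):
    """Return source with its first line replaced (or prefixed) by a shebang."""
    shebang = f"#!{interpreter}\n"
    if source.startswith("#!"):
        # Drop the existing shebang line, keep everything after it.
        _, _, rest = source.partition("\n")
        return shebang + rest
    return shebang + source
```

Pointing the shebang at the python inside a venv is enough for the installed script to run with that venv's packages.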

> or just specify the right version
> on the command line.  In that case you might as well not have included the
> shebang line at all.

Right. However, that's not how scripts are usually invoked on Unix.
Using /usr/bin/env in the command line is supposed to fix that but of
course it assumes that your interpreter is actually called python3.

hp



Re: How/where to store calibration values - written by program A, read by program B

2023-12-09 Thread Peter J. Holzer via Python-list
On 2023-12-06 07:23:51 -0500, Thomas Passin via Python-list wrote:
> On 12/6/2023 6:35 AM, Barry Scott via Python-list wrote:
> > Personally I would not use .ini style these days as the format does not 
> > include type of the data.
> 
> Neither does JSON.

Well, it distinguishes between some primitive types (string, number,
boolean, null) and provides two container types (dict/object,
list/array). As long as those types are sufficient, JSON includes them.
If you need anything else, you're on your own.
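
A quick illustration with Python's json module; the calibration values are invented for the example. The primitive types survive a round trip, a tuple comes back as a list, and something like a date is rejected outright.

```python
import json
from datetime import date

# Hypothetical calibration data.
calibration = {
    "gain": 1.25,
    "offset": -3,
    "enabled": True,
    "label": "probe A",
    "note": None,
    "range": (0, 10),   # tuple: JSON only knows arrays
}

restored = json.loads(json.dumps(calibration))


def is_json_serializable(value):
    """True if json.dumps() accepts value without extra help."""
    try:
        json.dumps(value)
    except TypeError:
        return False
    return True
```

For the date you would have to pick a string format yourself and convert on both sides.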

    hp



Re: Newline (NuBe Question)

2023-11-26 Thread Peter J. Holzer via Python-list
On 2023-11-25 08:32:24 -0600, Michael F. Stemper via Python-list wrote:
> On 24/11/2023 21.45, avi.e.gr...@gmail.com wrote:
> > Of course, for serious work, some might suggest avoiding constructs like a
> > list of lists and switch to using modules and data structures [...]
> 
> Those who would recommend that approach do not appear to include Mr.
> Rossum, who said:
>   Avoid overengineering data structures.
>         ^^^^

The key point here is *over*engineering. Don't make things more
complicated than they need to be. But also don't make them simpler than
necessary.

>   Tuples are better than objects (try namedtuple too though).

If Guido thought that tuples would always be better than objects, then
Python wouldn't have objects. Why would he add such a complicated
feature to the language if he thought it was useless?

The (unspoken?) context here is "if tuples are sufficient, then ..."
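
As a small illustration of the "try namedtuple too" part (the Person
fields here are invented for the example):

```python
from collections import namedtuple

# Plain tuple: compact, but the meaning of each position lives in your head.
fred = ("Fred", "Main Street 1", 42)

# namedtuple: still a tuple, but the fields have names.
Person = namedtuple("Person", ["name", "address", "age"])
fred_nt = Person("Fred", "Main Street 1", 42)

print(fred[0], fred_nt.name)   # positional vs. named access
print(fred_nt == fred)         # a namedtuple compares equal to the plain tuple
name, address, age = fred_nt   # and unpacks the same way
```

Once the data needs behaviour or validation of its own, a class is
probably no longer overengineering.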

hp



Re: Code improvement question

2023-11-17 Thread Peter J. Holzer via Python-list
On 2023-11-17 07:48:41 -0500, Thomas Passin via Python-list wrote:
> On 11/17/2023 6:17 AM, Peter J. Holzer via Python-list wrote:
> > Oh, and Python (just like Perl) allows you to embed whitespace and
> > comments into Regexps, which helps readability a lot if you have to
> > write long regexps.
> > 
[...]
> > > > > re.findall(r'\b[0-9]{2,7}-[0-9]{2}-[0-9]{2}\b', txt)
> > 
> > \b         - a word boundary.
> > [0-9]{2,7} - 2 to 7 digits
> > -          - a hyphen-minus
> > [0-9]{2}   - exactly 2 digits
> > -          - a hyphen-minus
> > [0-9]{2}   - exactly 2 digits
> > \b         - a word boundary.
> > 
> > Seems quite straightforward to me. I'll be impressed if you can write
> > that in Python in a way which is easier to read.
> 
> And the re.VERBOSE (also re.X) flag can always be used so the entire
> expression can be written line-by-line with comments nearly the same
> as the example above

Yes. That's what I alluded to above.

hp



Re: Code improvement question

2023-11-17 Thread Peter J. Holzer via Python-list
On 2023-11-16 11:34:16 +1300, Rimu Atkinson via Python-list wrote:
> > > Why don't you use re.findall?
> > > 
> > > re.findall(r'\b[0-9]{2,7}-[0-9]{2}-[0-9]{2}\b', txt)
> > 
> > I think I can see what you did there but it won't make sense to me - or
> > whoever looks at the code - in future.
> > 
> > That answers your specific question. However, I am in awe of people who
> > can just "do" regular expressions and I thank you very much for what
> > would have been a monumental effort had I tried it.
> 
> I feel the same way about regex. If I can find a way to write something
> without regex I very much prefer to as regex usually adds complexity and
> hurts readability.

I find "straight" regexps very easy to write. There are only a handful
of constructs which are all very simple and you just string them
together. But then I've used regexps for 30+ years, so of course they
feel natural to me.

(Reading regexps may be a bit harder, exactly because they are too
simple: There is no abstraction, so a complicated pattern results in a
long regexp.)

There are some extensions to regexps which are conceptually harder, like
lookahead and lookbehind or nested contexts in Perl. I may need the
manual for those (especially because they are new(ish) and every
language uses a different syntax for them) or avoid them altogether.

Oh, and Python (just like Perl) allows you to embed whitespace and
comments into Regexps, which helps readability a lot if you have to
write long regexps.
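
A sketch of what that can look like with the re.VERBOSE flag, applied to
the pattern from this thread. The name DATE_LIKE and the sample text are
made up for the example:

```python
import re

# Same pattern as the one-liner in this thread, spread over several lines.
# With re.VERBOSE, whitespace in the pattern is ignored and "#" starts a comment.
DATE_LIKE = re.compile(r"""
    \b          # a word boundary
    [0-9]{2,7}  # 2 to 7 digits
    -           # a hyphen-minus
    [0-9]{2}    # exactly 2 digits
    -           # a hyphen-minus
    [0-9]{2}    # exactly 2 digits
    \b          # a word boundary
""", re.VERBOSE)

txt = "Orders 2023-11-17 and 12-34-56 shipped; 1-22-33 did not."
print(DATE_LIKE.findall(txt))
```

The compiled pattern should behave exactly like the compact
r'\b[0-9]{2,7}-[0-9]{2}-[0-9]{2}\b'; only the layout differs.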


> You might find https://regex101.com/ to be useful for testing your regex.
> You can enter in sample data and see if it matches.
> 
> If I understood what your regex was trying to do I might be able to suggest
> some python to do the same thing. Is it just removing numbers from text?

Not "removing" them (as I understood it), but extracting them (i.e.
finding and collecting them).

> > > re.findall(r'\b[0-9]{2,7}-[0-9]{2}-[0-9]{2}\b', txt)

\b         - a word boundary.
[0-9]{2,7} - 2 to 7 digits
-          - a hyphen-minus
[0-9]{2}   - exactly 2 digits
-          - a hyphen-minus
[0-9]{2}   - exactly 2 digits
\b         - a word boundary.

Seems quite straightforward to me. I'll be impressed if you can write
that in Python in a way which is easier to read.

hp



Re: xor operator

2023-11-15 Thread Peter J. Holzer via Python-list
On 2023-11-15 12:26:32 +0200, Dom Grigonis wrote:
> 
> Thank you,
> 
> 
> test2 = [True] * 100 + [False] * 2
> test2i = list(range(100))
> 
> %timeit len(set(test2i)) == 1   # 1.6 µs ± 63.6 ns per loop (mean ± std. dev. 
> of 7 runs, 1,000,000 loops each)
> %timeit all(test2)  # 386 ns ± 9.58 ns per loop (mean ± std. dev. 
> of 7 runs, 1,000,000 loops each)
> 
> test2s = set(test2i)
> %timeit len(test2s) == 1# 46.1 ns ± 1.65 ns per loop (mean ± std. 
> dev. of 7 runs, 10,000,000 loops each)
> 
> If you pre-convert to set it is obviously faster. However, set
> operation is most likely going to be part of the procedure. In which
> case it ends up to be significantly slower.

Obviously, if you convert a list to a set just to count the elements
it's going to be slow. My suggestion was to use the set *instead* of the
list. I don't know whether that's possible in your situation, because
you haven't told us anything about it. All I'm suggesting is taking a
step back and reconsidering your choice of data structure.

hp


Re: xor operator

2023-11-14 Thread Peter J. Holzer via Python-list
On 2023-11-14 00:11:30 +0200, Dom Grigonis via Python-list wrote:
> Benchmarks:
> test1 = [False] * 100 + [True] * 2
> test2 = [True] * 100 + [False] * 2
> 
> TIMER.repeat([
> lambda: xor(test1), # 0.0168
> lambda: xor(test2), # 0.0172
> lambda: xor_ss(test1),  # 0.1392
> lambda: xor_ss(test2),  # 0.0084
> lambda: xor_new(test1), # 0.0116
> lambda: xor_new(test2), # 0.0074
> lambda: all(test1), # 0.0016
> lambda: all(test2)  # 0.0046
> ])
> Your first function is fairly slow.
> Second one deals with short-circuiting, but is super slow on full search.
> 
> `xor_new` is the best what I could achieve using python builtins.
> 
> But builtin `all` has the best performance.

One question worth asking is if a list of bool is the best data
structure for the job. This is essentially a bitmap, and a bitmap is
equivalent to a set of integers. len(s) == 1 is also a fairly quick
operation if s is small. On my system, len(test1s) == 1 (where test1s is
{100, 101}) is about as fast as all(test1) and len(test2s) == 1 (where
test2s is set(range(100))) is about twice as fast as all(test2).
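
A minimal sketch of the two representations side by side. The helper
names are invented for the example, and no timings are claimed here:

```python
# Bitmap as a list of bool: positions 100 and 101 are set.
test1 = [False] * 100 + [True] * 2

# The same information as a set of the indices that are True.
test1s = {i for i, bit in enumerate(test1) if bit}

def exactly_one_list(bits):
    """Exactly-one test on a list of bool; has to look at every element."""
    return sum(bits) == 1

def exactly_one_set(indices):
    """Exactly-one test on a set of indices; len() does not scan anything."""
    return len(indices) == 1

print(sorted(test1s))
print(exactly_one_list(test1), exactly_one_set(test1s))

# "Clearing a bit" is a discard on the set representation.
test1s.discard(101)
print(exactly_one_set(test1s))
```

The point is that the set is kept up to date as bits change, rather
than being rebuilt from the list before every test.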

If you are willing to stray from the standard library, you could e.g.
use pyroaring instead of sets: This is about as fast as all(test1)
whether there are two bits set or a hundred.

hp



Re: Python 3.12.0 venv not working with psycopg2

2023-10-02 Thread Peter J. Holzer via Python-list
On 2023-10-02 19:44:12 +0300, אורי via Python-list wrote:
> I have an issue since about 5 months now. Python 3.12.0 venv not working
> with psycopg2 on Windows. I created 2 issues on GitHub but they were
> closed. I checked today with the new Python release but it's still not
> working.
> 
> https://github.com/psycopg/psycopg2/issues/1578
> https://github.com/python/cpython/issues/104830

You will have to come up with a *minimal* test case which reproduces the
problem. Expecting people to download and test your massive application
is unreasonable.

    hp



Re: path to python in venv

2023-09-27 Thread Peter J. Holzer via Python-list
On 2023-09-27 20:32:25 -, Jon Ribbens via Python-list wrote:
> On 2023-09-27, Larry Martell wrote:
> > On Wed, Sep 27, 2023 at 12:42 PM Jon Ribbens via Python-list wrote:
> >> On 2023-09-27, Larry Martell wrote:
> >> > lrwxrwxrwx 1 larrymartell larrymartell  7 Sep 27 11:21 python -> python3
> >> > lrwxrwxrwx 1 larrymartell larrymartell 16 Sep 27 11:21 python3 -> /usr/bin/python3
[...]
> I'm a bit surprised your symlinks are as shown above though - mine
> link from python to python3.11 to /usr/bin/python3.11, so it wouldn't
> change the version of python used even if I installed a different
> system python version.

That's probably because you created the venvs with "python3.11 -m venv ...".
The symlink points to the command you used to create it:

% python3 -m venv venv
% ll venv/bin/python*
lrwxrwxrwx 1 hjp hjp  7 Aug 29  2022 venv/bin/python -> python3*
lrwxrwxrwx 1 hjp hjp 12 Aug 29  2022 venv/bin/python3 -> /bin/python3*
lrwxrwxrwx 1 hjp hjp  7 Aug 29  2022 venv/bin/python3.10 -> python3*

% python3.10 -m venv venv
% ll venv/bin/python*
lrwxrwxrwx 1 hjp hjp 10 Sep 28 00:45 venv/bin/python -> python3.10*
lrwxrwxrwx 1 hjp hjp 10 Sep 28 00:45 venv/bin/python3 -> python3.10*
lrwxrwxrwx 1 hjp hjp 15 Sep 28 00:45 venv/bin/python3.10 -> /bin/python3.10*
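
The same question can be asked from inside Python. This is only a
sketch; what it prints depends entirely on how the interpreter running
it was started:

```python
import os
import sys

# Path of the interpreter as it was invoked (inside a venv: venv/bin/python...).
print("interpreter   :", sys.executable)

# Following all symlinks shows which real binary that ends up at.
print("resolves to   :", os.path.realpath(sys.executable))

# In a venv, sys.prefix points at the venv and sys.base_prefix at the base install.
print("inside a venv :", sys.prefix != sys.base_prefix)
```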

hp



Re: The GIL and PyEval_RestoreThread

2023-09-27 Thread Peter Ebden via Python-list
The thread variable I'm passing in is the one I originally got from calling
Py_NewInterpreter. I'd assumed that I didn't need to particularly track the
one I get back from SaveThread since it should always be the one I restored
previously (which does seem to be the case).

> It looks like you're resuming the same thread twice. As it's already
> resumed the second time, no wonder it's not blocking!

That isn't how I read the docs though? It says "If the lock has been
created, the current thread must not have acquired it, otherwise deadlock
ensues." That suggests to me that it should try to acquire the GIL again
and wait until it can (although possibly also that it's not an expected use
and Python thread states are expected to be more 1:1 with C threads).

On Wed, Sep 27, 2023 at 3:53 AM MRAB via Python-list wrote:

> On 2023-09-26 14:20, Peter Ebden via Python-list wrote:
> > Hi all,
> >
> > I've been working on embedding Python and have an interesting case around
> > locking with PyEval_RestoreThread which wasn't quite doing what I expect,
> > hoping someone can explain what I should expect here.
> >
> > I have a little example (I'm running this in parallel from two different
> > threads; I have some more C code for that but I don't think it's super
> > interesting):
> >
> > void run_python(PyThreadState* thread) {
> >LOG("Restoring thread %p...", thread);
> >PyEval_RestoreThread(thread);
> >LOG("Restored thread %p", thread);
> >PyRun_SimpleString("import time; print('sleeping'); time.sleep(3.0)");
> >LOG("Saving thread...");
> >PyThreadState* saved_thread = PyEval_SaveThread();
> >LOG("Saved thread %p", saved_thread);
> > }
> >
> > This produces output like
> > 11:46:48.110058893: Restoring thread 0xabc480...
> > 11:46:48.110121656: Restored thread 0xabc480
> > 11:46:48.110166060: Restoring thread 0xabc480...
> > sleeping
> > 11:46:48.110464194: Restored thread 0xabc480
> > sleeping
> > 11:46:51.111307541: Saving thread...
> > 11:46:51.111361075: Saved thread 0xabc480
> > 11:46:51.113116633: Saving thread...
> > 11:46:51.113177605: Saved thread 0xabc480
> >
> > The thing that surprises me is that both threads seem to be able to pass
> > PyEval_RestoreThread before either reaches the corresponding
> > PyEval_SaveThread call, which I wasn't expecting to happen; I assumed
> > that since RestoreThread acquires the GIL, that thread state would
> > remain locked until it's released.
> >
> > I understand that the system occasionally switches threads, which I guess
> > might well happen with that time.sleep() call, but I wasn't expecting the
> > same thread to become usable somewhere else. Maybe I am just confusing
> > things by approaching the same Python thread from multiple OS threads
> > concurrently and should be managing my own locking around that?
> >
> Storing the result of PyEval_SaveThread in a local variable looks wrong
> to me.
>
> In the source for the regex module, I release the GIL with
> PyEval_SaveThread and save its result. Then, when I want to claim the
> GIL, I pass that saved value to PyEval_RestoreThread.
>
> You seem to be releasing the GIL and discarding the result, so which
> thread are you resuming when you call PyEval_RestoreThread?
>
> It looks like you're resuming the same thread twice. As it's already
> resumed the second time, no wonder it's not blocking!
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>

-- 
Thought Machine Group Limited, a company registered in England & Wales.
Registered number: 4277. 
Registered Office: 5 New Street Square, 
London EC4A 3TW 
<https://maps.google.com/?q=5+New+Street+Square,+London+EC4A+3TW=gmail=g>.


The content of this email is confidential and intended for the recipient 
specified in message only. It is strictly forbidden to share any part of 
this message with any third party, without a written consent of the sender. 
If you received this message by mistake, please reply to this message and 
follow with its deletion, so that we can ensure such a mistake does not 
occur in the future.
-- 
https://mail.python.org/mailman/listinfo/python-list


The GIL and PyEval_RestoreThread

2023-09-26 Thread Peter Ebden via Python-list
Hi all,

I've been working on embedding Python and have an interesting case around
locking with PyEval_RestoreThread which wasn't quite doing what I expect,
hoping someone can explain what I should expect here.

I have a little example (I'm running this in parallel from two different
threads; I have some more C code for that but I don't think it's super
interesting):

void run_python(PyThreadState* thread) {
  LOG("Restoring thread %p...", thread);
  PyEval_RestoreThread(thread);
  LOG("Restored thread %p", thread);
  PyRun_SimpleString("import time; print('sleeping'); time.sleep(3.0)");
  LOG("Saving thread...");
  PyThreadState* saved_thread = PyEval_SaveThread();
  LOG("Saved thread %p", saved_thread);
}

This produces output like
11:46:48.110058893: Restoring thread 0xabc480...
11:46:48.110121656: Restored thread 0xabc480
11:46:48.110166060: Restoring thread 0xabc480...
sleeping
11:46:48.110464194: Restored thread 0xabc480
sleeping
11:46:51.111307541: Saving thread...
11:46:51.111361075: Saved thread 0xabc480
11:46:51.113116633: Saving thread...
11:46:51.113177605: Saved thread 0xabc480

The thing that surprises me is that both threads seem to be able to pass
PyEval_RestoreThread before either reaches the corresponding
PyEval_SaveThread call, which I wasn't expecting to happen; I assumed that
since RestoreThread acquires the GIL, that thread state would remain locked
until it's released.

I understand that the system occasionally switches threads, which I guess
might well happen with that time.sleep() call, but I wasn't expecting the
same thread to become usable somewhere else. Maybe I am just confusing
things by approaching the same Python thread from multiple OS threads
concurrently and should be managing my own locking around that?
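
One part of this can be checked from pure Python: time.sleep() releases
the GIL while it blocks, so a second thread can acquire it in the
meantime. The sketch below uses ordinary threading threads, not the
embedding API, so it says nothing about whether sharing one
PyThreadState between two OS threads is valid:

```python
import threading
import time

def worker():
    # A blocking call: the GIL is released for the duration of the sleep.
    time.sleep(0.5)

start = time.perf_counter()
threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# If the GIL were held during sleep, this would take about 1 second.
print(f"two 0.5 s sleeps took {elapsed:.2f} s")
```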

Thanks in advance,

Peter



dateutil on PyPI (was: PEP668 / pipx and "--editable" installs)

2023-09-20 Thread Peter J. Holzer via Python-list
On 2023-09-20 13:31:14 +, c.buhtz--- via Python-list wrote:
> Dear Peter,
> 
> maybe we have a misunderstanding.
> 
> Am 20.09.2023 14:43 schrieb Peter J. Holzer via Python-list:
> > > > > "dateutil" is not available from PyPi for Python 3.11
> > 
> > That's quite a curious thing to write if you are aware that dateutil is
> > in fact available from PyPi for Python 3.11.
> 
> Do I miss something here?
> 
> See https://pypi.org/project/dateutils/ and also the open Issue about the
> missing support for Python 3.11

You are missing at least an "s". "dateutils" is not "dateutil".

But dateutils can also be installed from PyPI.

> https://github.com/dateutil/dateutil/issues/1233 ?

So it hasn't been tagged with any version >= 3.10 yet. That doesn't mean
it isn't available. It requires a Python version >= 3.3, but 3.11 is >=
3.3, so that's not a problem.

As I demonstrated, it installs just fine.

hp



Re: PEP668 / pipx and "--editable" installs

2023-09-20 Thread Peter J. Holzer via Python-list
On 2023-09-18 18:56:35 +, c.buhtz--- via Python-list wrote:
> On 2023-09-18 10:16 "Peter J. Holzer via Python-list"
>  wrote:
> > On 2023-09-15 14:15:23 +, c.buhtz--- via Python-list wrote:
> > > I tried to install it via "pipx install -e .[develop]". It's
> > > pyproject.toml has a bug: A missing dependency "dateutil". But
> > > "dateutil" is not available from PyPi for Python 3.11 (the default
> > > in Debian 12). But thanks to great Debian they have a
> > > "python3-dateutil" package. I installed it.
> > 
> > This can be installed via pip:
> 
> I'm aware of this.

You wrote:

> > > "dateutil" is not available from PyPi for Python 3.11

That's quite a curious thing to write if you are aware that dateutil is
in fact available from PyPi for Python 3.11.

> But this is not the question.

I know. That's why I labeled my comment as a "Sidenote".

hp



Re: PEP668 / pipx and "--editable" installs

2023-09-18 Thread Peter J. Holzer via Python-list
On 2023-09-15 14:15:23 +, c.buhtz--- via Python-list wrote:
> I tried to install it via "pipx install -e .[develop]". It's pyproject.toml
> has a bug: A missing dependency "dateutil". But "dateutil" is not available
> from PyPi for Python 3.11 (the default in Debian 12). But thanks to great
> Debian they have a "python3-dateutil" package. I installed it.

Sidenote:
PyPI does have several packages with "dateutil" in their name. From the
version number (2.8.2) I guess that "python-dateutil" is the one
packaged in Debian 12.

This can be installed via pip:

% lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 12 (bookworm)
Release:        12
Codename:       bookworm

(dateutil) % pip install python-dateutil
Collecting python-dateutil
  Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
     247.7/247.7 kB 3.1 MB/s eta 0:00:00
Collecting six>=1.5
  Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: six, python-dateutil
Successfully installed python-dateutil-2.8.2 six-1.16.0
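
Part of the confusion is that the name on PyPI ("python-dateutil") and
the name you import ("dateutil") differ. A sketch for checking both from
the running interpreter; the report() helper is made up for this
example, and the output depends on what is installed:

```python
import sys
from importlib import metadata, util

def report(distribution, module):
    """Return a one-line status for a PyPI distribution and its import name."""
    try:
        version = metadata.version(distribution)
    except metadata.PackageNotFoundError:
        version = None  # this distribution is not installed here
    importable = util.find_spec(module) is not None
    return f"{distribution}: version={version}, import {module!r} works={importable}"

print("Python", ".".join(map(str, sys.version_info[:3])))
print(report("python-dateutil", "dateutil"))
print(report("dateutils", "dateutils"))
```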

hp



Re: Postgresql equivalent of Python's timeit?

2023-09-17 Thread Peter J. Holzer via Python-list
On 2023-09-17 11:01:43 +0200, Albert-Jan Roskam via Python-list wrote:
> On Sep 15, 2023 19:45, "Peter J. Holzer via Python-list" wrote:
>
> > On 2023-09-15 17:42:06 +0200, Albert-Jan Roskam via Python-list wrote:
> > > This is more related to Postgresql than to Python, I hope this is ok.
> > > I want to measure Postgres queries N times, much like Python timeit
> > > (https://docs.python.org/3/library/timeit.html). I know about EXPLAIN
> > > ANALYZE and psql \timing, but there's quite a bit of variation in the
> > > times. Is there a timeit-like function in Postgresql?
> >
> > Why not simply call it n times from Python?
> >
> > (But be aware that calling the same query n times in a row is likely to
> > be unrealistically fast because most of the data will already be in
> > memory.)
>
> Thanks, I'll give this a shot. Hopefully the caching is not an issue if I
> don't re-use the same database connection.

There is some per-session caching, but the bulk of it is shared between
sessions or even in the operating system. And you wouldn't want to get
rid of these caches either (which you could do by rebooting or - a bit
faster - restarting postgres and dropping the caches
(/proc/sys/vm/drop_caches on Linux)), because that would make the
benchmark unrealistically slow (unless you want to establish some
worst-case baseline). During normal operations some data will be cached,
but probably not all of it and it will change depending on workload and
possibly other factors.

I think Avi's advice to wait for a few minutes between repetitions is
good. Of course that means that you can't just time the whole thing but
have to time each query separately and then compute the average. (On the
bright side that also gives you the opportunity to compute standard
deviation, min, max, quantiles, etc.)
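
A rough sketch of that approach. To keep it runnable without a server
it uses sqlite3 from the standard library as a stand-in for PostgreSQL;
the table, the query and the time_query() helper are all invented for
the example:

```python
import sqlite3
import statistics
import time

def time_query(conn, sql, repetitions=5, pause=0.0):
    """Run sql `repetitions` times and return the individual durations in seconds."""
    durations = []
    for _ in range(repetitions):
        start = time.perf_counter()
        conn.execute(sql).fetchall()  # fetch everything so the query really runs
        durations.append(time.perf_counter() - start)
        time.sleep(pause)             # in a real benchmark: minutes, not zero
    return durations

# Stand-in database; with PostgreSQL this would be a driver connection instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", ((i,) for i in range(10_000)))

durations = time_query(conn, "SELECT count(*) FROM t WHERE n % 7 = 0")
print("mean     :", statistics.mean(durations))
print("stdev    :", statistics.stdev(durations))
print("min/max  :", min(durations), max(durations))
print("quartiles:", statistics.quantiles(durations, n=4))
```

Keeping the individual durations instead of one total is what makes the
pause between repetitions possible in the first place.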

hp



Re: `time.perf_counter_ns` always a 64-bit int?

2023-09-16 Thread Peter J. Holzer via Python-list
On 2023-09-15 21:48:37 +, rmlibre--- via Python-list wrote:
> I'd like to capture the output of `time.perf_counter_ns()` as an 8-byte
> timestamp.
> 
> I'm aware that the docs provide an undefined start value for that clock.
> I'm going to assume that means it can't be expected to fit within 8
> bytes.

Theoretically this is true. The reference point could be the switch to
the Gregorian calendar in the Vatican, the beginning of the Christian
era or the founding of Babylon, all of which were more than 2**63
nanoseconds ago. However, using one of these dates would be impractical
and defeat the purpose of the performance counters, which are supposed
to be high resolution, monotonic and independent of political
influences. So the reference point is usually the time the system was
booted or something similar.

> However, it would be rather convenient if it could.

Unless you expect your system to have an uptime in excess of 292 years,
don't worry.
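
The 292 years are just the range of a signed 64-bit integer counted in
nanoseconds. A quick check, assuming a year of 365.25 days and a
little-endian signed "q" field as the 8-byte format:

```python
import struct
import time

# Range of a signed 64-bit nanosecond counter, expressed in years.
NS_PER_YEAR = 365.25 * 24 * 60 * 60 * 10**9
years = 2**63 / NS_PER_YEAR
print(f"{years:.1f} years")

# Packing fails loudly (struct.error) if the value ever stops fitting.
stamp = time.perf_counter_ns()
packed = struct.pack("<q", stamp)
print(len(packed), "bytes")
print(struct.unpack("<q", packed)[0] == stamp)
```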

    hp



Re: Postgresql equivalent of Python's timeit?

2023-09-15 Thread Peter J. Holzer via Python-list
On 2023-09-15 17:42:06 +0200, Albert-Jan Roskam via Python-list wrote:
>This is more related to Postgresql than to Python, I hope this is ok.
>I want to measure Postgres queries N times, much like Python timeit
>(https://docs.python.org/3/library/timeit.html). I know about EXPLAIN
>ANALYZE and psql \timing, but there's quite a bit of variation in the
>times. Is there a timeit-like function in Postgresql?

Why not simply call it n times from Python?

(But be aware that calling the same query n times in a row is likely to be
unrealistically fast because most of the data will already be in
memory.)

hp



Re: Passing info to function used in re.sub

2023-09-04 Thread Peter J. Holzer via Python-list
On 2023-09-03 18:10:29 +0200, Jan Erik Moström via Python-list wrote:
> I want to replace some text using a regex-pattern, but before creating
> replacement text I need to some file checking/copying etc. My code
> right now look something like this:
> 
> def fix_stuff(m):
>   # Do various things that involves for info
>   # that what's available in m
>   replacement_text = m.group(1) + global_var1 + global_var2
>   return replacement_text
> 
> and the call comes here
> 
> global_var1 = "bla bla"
> global_var2 = "pff"
> 
> new_text = re.sub(im_pattern,fix_stuff,md_text)
> 
> 
> The "problem" is that I've currently written some code that works but
> it uses global variables ... and I don't like global variables. I
> assume there is a better way to write this, but how?

If you use fix_stuff only inside one other function, you could make it
local to that function so that it will capture the local variables of
the outer function:

import re

def demo():

    local_var1 = "bla bla"
    local_var2 = "pff"

    def fix_stuff(m):
        # Do various things that involve more info
        # than what's available in m
        replacement_text = m.group(1) + local_var1 + local_var2
        return replacement_text

    for md_text in ("aardvark", "barbapapa", "ba ba ba ba barbara ann"):
        new_text = re.sub(r"(a+).*?(b+)", fix_stuff, md_text)
        print(md_text, new_text)

demo()
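
If fix_stuff has to stay at module level, another option is to give it
extra parameters and bind them with functools.partial. This is a sketch
of an alternative, not what the code above does; the parameter names
prefix and suffix are made up:

```python
import re
from functools import partial

def fix_stuff(m, prefix, suffix):
    # m is supplied by re.sub; prefix and suffix are bound in advance.
    return m.group(1) + prefix + suffix

# partial() returns a callable that takes only the match object.
replacer = partial(fix_stuff, prefix="bla bla", suffix="pff")

new_text = re.sub(r"(a+).*?(b+)", replacer, "barbapapa")
print(new_text)
```

A lambda (lambda m: fix_stuff(m, "bla bla", "pff")) would do the same
job; either way no global variables are needed.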

hp



Re: What sort of exception when a class can't find something?

2023-08-31 Thread Peter J. Holzer via Python-list
On 2023-08-31 21:32:04 +0100, Chris Green via Python-list wrote:
> What sort of exception should a class raise in __init__() when it
> can't find an appropriate set of data for the parameter passed in to
> the class instantiation?
> 
> E.g. I have a database with some names and address in and have a
> class Person that gets all the details for a person given their
> name.
> 
>  
>  
>  person.Person('Fred')
>  ...
>  ...
> 
> 
> If Fred doesn't exist in the database what sort of exception should
> there be?  Is it maybe a ValueError?

If you are going for a builtin exception, I think KeyError is the most
appropriate: It should be a LookupError, since the lookup failed and a
database is more like a mapping than a sequence.

But it would probably be best to define your own exception for that.
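
One way to combine both ideas is to derive the new exception from
LookupError. A sketch, with a dict standing in for the real database
and all names (PersonNotFoundError, FAKE_DB) invented for the example:

```python
class PersonNotFoundError(LookupError):
    """Raised when no record exists for the requested name."""

# Stand-in for the real database: a dict keyed by name (an assumption).
FAKE_DB = {"Alice": {"address": "1 Example Road"}}

class Person:
    def __init__(self, name, db=FAKE_DB):
        try:
            record = db[name]
        except KeyError:
            # Hide the internal KeyError; callers only see the domain exception.
            raise PersonNotFoundError(f"no person named {name!r}") from None
        self.name = name
        self.address = record["address"]

try:
    Person("Fred")
except LookupError as exc:  # also catches PersonNotFoundError
    print(type(exc).__name__, exc)
```

Callers who only care that a lookup failed can catch LookupError, while
callers who want to be specific can catch PersonNotFoundError.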

hp



Re: Using "textwrap" package for unwrappable languages (Japanese)

2023-08-30 Thread Peter J. Holzer via Python-list
On 2023-08-30 13:18:25 +, c.buhtz--- via Python-list wrote:
> Am 30.08.2023 14:07 schrieb Peter J. Holzer via Python-list:
> > another caveat: Japanese characters are usually double-width. So
> > (unless your line length is 130 characters for English) you would
> > want to add that line break every 32 characters.
> 
> I don't get your calculation here. Original line length is 130 but for
> "double-width" characters you would break at 32 instead of 65?

No, I wrote "*unless* your original line length was 130 characters".

I assumed that you want your line to be 65 latin characters wide since
this is what fits nicely on an A4 (or letter) page with a bit of a
margin on both sides. Or on an 80 character terminal screen or window.
And it's also generally considered to be a good line length for
readability.

But Asian "full width" or "wide" characters are twice as wide, so you
can fit only half as many in a single line. Hence 65 // 2 = 32.

But that was only my assumption. I considered it possible that you
started with 130 characters per line (many terminals back in the day had
a 132 character mode, and that's also approximately the line length in
landscape mode or when using a compressed typeface - so 132 is also a
common length limit, although rarely for text (too wide to read
comfortably) and more for code, tables, etc.), divided that by two and
arrived at 65 Japanese characters per line that way. So I mentioned that
to indicate that I had considered the possibility but concluded that it
probably wasn't what you meant.

(And as usual when I write a short sentence to clarify something
I wind up writing 4 paragraphs clarifying the clarification :-/)

> Then I will do something like this
> 
> unicodedata.east_asian_width(mystring[0])
> 
> W is "wide". But there is also "F" (full-width).
> What is the difference between "wide" and "full-width"?

I'm not an expert on Japanese typography by any means. But they have
some full width variants of latin characters and halfwidth variants of
katakana characters. I assume that the categories 'F' and 'H' are for
those, while "normal" Japanese characters are "W":

>>> unicodedata.east_asian_width("\N{DIGIT ONE}")
'Na'
>>> unicodedata.east_asian_width("\N{FULLWIDTH DIGIT ONE}")
'F'
>>> unicodedata.east_asian_width("\N{KATAKANA LETTER ME}")
'W'
>>> unicodedata.east_asian_width("\N{HALFWIDTH KATAKANA LETTER ME}")
'H'
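
Building on that, a rough sketch of breaking a string by display width
instead of character count. It assumes that 'W' and 'F' characters take
two columns and everything else one, ignores combining characters and
Japanese line-breaking rules, and both helper names are made up:

```python
import unicodedata

def char_width(ch):
    """Assumed display width: 2 columns for wide/fullwidth, else 1."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def display_width(text):
    return sum(char_width(ch) for ch in text)

def wrap_by_width(text, width=65):
    """Break text into lines of at most `width` columns, at any character."""
    lines, current, current_width = [], "", 0
    for ch in text:
        w = char_width(ch)
        if current and current_width + w > width:
            lines.append(current)
            current, current_width = "", 0
        current += ch
        current_width += w
    if current:
        lines.append(current)
    return lines

sample = "日本語" * 20  # 60 wide characters
for line in wrap_by_width(sample):
    print(len(line), display_width(line), line)
```

With a limit of 65 columns the first line ends up with 32 wide
characters, which is the 65 // 2 from above.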

hp



Re: Using "textwrap" package for unwrappable languages (Japanese)

2023-08-30 Thread Peter J. Holzer via Python-list
On 2023-08-30 11:32:02 +, c.buhtz--- via Python-list wrote:
> I do use "textwrap" package to wrap longer texts passages. Works well with
> English.
> But the source string used is translated via gettext before it is wrapped.
> 
> Using languages like Japanese or Chinese would IMHO result in unwrapped
> text. Japanese rules do allow to break a line nearly where ever you want.
> 
> How can I handle it with "textwrap"?
> 
> At runtime I don't know which language is really used. So I'm not able to
> decide using "textwrap" or just inserting "\n" every 65 characters.

I don't have a solution but want to add another caveat: Japanese
characters are usually double-width. So (unless your line length is 130
characters for English) you would want to add that line break every 32
characters. (unicodedata.east_asian_width() seems to be the canonical
way to find the width of a character, but it returns a code (like 'W'
or 'Na'), not a number.)
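
A rough sketch of what "break every N columns" could look like, for text
that may be broken anywhere. It assumes 'W' and 'F' characters take two
columns and all others one; break_anywhere and char_columns are names
invented for this example, and real Japanese line breaking has rules (e.g.
about punctuation at line starts) that this ignores.

```python
import unicodedata

def char_columns(ch):
    # Assumption: 'W' and 'F' characters take two columns, everything else one.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def break_anywhere(text, columns=65):
    """Split text into lines of at most `columns` display columns."""
    lines, current, used = [], [], 0
    for ch in text:
        w = char_columns(ch)
        if current and used + w > columns:
            # the next character would not fit: start a new line
            lines.append("".join(current))
            current, used = [], 0
        current.append(ch)
        used += w
    if current:
        lines.append("".join(current))
    return lines

print("\n".join(break_anywhere("\N{KATAKANA LETTER ME}" * 5, columns=4)))
```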

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Where is the error?

2023-08-06 Thread Peter J. Holzer via Python-list
Mostly, error messages got a lot better in Python 3.10, but this one had
me scratching my head for a few minutes.

Consider this useless and faulty script:


r = {
    "x": (1 + 2 + 3)
    "y": (4 + 5 + 6)
    "z": (7 + 8 + 9)
}


Python 3.9 (and earlier) reports:


  File "/home/hjp/tmp/foo", line 3
"y": (4 + 5 + 6)
^
SyntaxError: invalid syntax


This isn't great, but experience with lots of programming languages
tells me that an error is noticed where or after it actually occurs, so
it's easy to see that there is a comma missing just before the "y".

Python 3.10 and 3.11 report:


  File "/home/hjp/tmp/foo", line 2
"x": (1 + 2 + 3)
  ^^
SyntaxError: invalid syntax. Perhaps you forgot a comma?


The error message is now a lot better, of course, but the fact that it
points at the expression *before* the error completely threw me. The
underlined expression is clearly not missing a comma, nor is there an
error before that. My real program was a bit longer of course, so I
checked the lines before that to see if I forgot to close any
parentheses. Took me some time to notice the missing comma *after* the
underlined expression.
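
For anyone who wants to compare interpreters without creating a file, the
reported position can be read from the SyntaxError object. A small sketch
(locate_syntax_error is a name made up here); which line number comes out
depends on the Python version running it:

```python
import sys

FAULTY_SOURCE = '''r = {
    "x": (1 + 2 + 3)
    "y": (4 + 5 + 6)
    "z": (7 + 8 + 9)
}
'''

def locate_syntax_error(source):
    """Return (lineno, msg) of the SyntaxError raised when compiling source."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as err:
        return err.lineno, err.msg
    return None  # the source compiled without a syntax error

lineno, msg = locate_syntax_error(FAULTY_SOURCE)
print(sys.version_info[:2], "reports line", lineno, "-", msg)
```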

Is this "clairvoyant" behaviour a side-effect of the new parser or was
that a deliberate decision?

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Working with paths

2023-07-16 Thread Peter Slížik via Python-list
Hello,

I finally had a look at the pathlib module. (Should have done it long ago,
but anyway...). Having in mind the replies from my older thread (File
system path annotations), what is the best way to support all possible path
types?

def doit(path: str | bytes | os.PathLike):
    match path:
        case str() as path:
            print("string")

        case bytes() as path:
            print("bytes")

        case os.PathLike() as path:
            print("os.PathLike")

Should I branch on the individual types or is there a more elegant way?
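
One possible alternative to branching, sketched here as an assumption
about what "support all path types" needs to mean: os.fspath() accepts
str, bytes and os.PathLike objects and returns str or bytes, and
os.fsdecode() goes one step further and always returns str. The function
name describe is made up for the example.

```python
import os
from pathlib import Path

def describe(path):
    """Normalize any supported path type and report what came out."""
    raw = os.fspath(path)      # str and bytes pass through; PathLike objects are unwrapped
    text = os.fsdecode(path)   # always a str, decoded with the filesystem encoding
    return type(raw).__name__, text

print(describe("data/a.txt"))
print(describe(b"data/a.txt"))
print(describe(Path("a.txt")))
```

If the code only ever passes the value on to open() or os.stat(), even
that is unnecessary, since those accept all three kinds directly.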

Peter
-- 
https://mail.python.org/mailman/listinfo/python-list


Best practices for using super()

2023-07-04 Thread Peter Slížik via Python-list
As a follow-up to my yesterday's question - are there any recommendations
on the usage of super()?

It's clear that super() can be used to invoke parent's:
 - instance methods
 - static methods
 - constants ("static" attributes in the parent class, e.g. super().NUMBER).

This all works, but are there situations in which calling them explicitly
using a parent class name is preferred?
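
To make the three cases concrete, here is a minimal sketch (the class and
attribute names are invented). The practical difference, as far as I
understand it, is that super() follows the MRO of the instance's class,
while an explicit Parent.something always pins that one class:

```python
class Parent:
    NUMBER = 42

    @staticmethod
    def helper():
        return "Parent.helper"

    def greet(self):
        return "Parent.greet"


class Child(Parent):
    NUMBER = 7

    def greet(self):
        # super() follows the MRO of type(self), so this keeps working
        # if another class is later inserted between Child and Parent.
        return "Child -> " + super().greet()

    def report(self):
        # Static methods and class attributes are reachable the same way;
        # Parent.NUMBER names the class explicitly instead.
        return super().helper(), super().NUMBER, Parent.NUMBER


c = Child()
print(c.greet(), c.report())
```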

Best regards,
Peter
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Multiple inheritance and a broken super() chain

2023-07-04 Thread Peter Slížik via Python-list
>
> Also, you might find that because of the MRO, super() in your Bottom
> class would actually give you what you want.
>

I knew this, but I wanted to save myself some refactoring, as the legacy
code used different signatures for Left.__init__() and Right.__init__().

I realized the formatting of my code examples was completely removed; sorry
for that.

Best regards,
Peter
-- 
https://mail.python.org/mailman/listinfo/python-list


Multiple inheritance and a broken super() chain

2023-07-03 Thread Peter Slížik via Python-list
Hello.

The legacy code I'm working with uses a classic diamond inheritance. Let me
call the classes *Top*, *Left*, *Right*, and *Bottom*.
This is a trivial textbook example. The classes were written in the
pre-super() era, so all of them initialized their parents and Bottom
initialized both Left and Right in this order.

The result was expected: *Top* was initialized twice:

Top.__init__() Left.__init__() Top.__init__() Right.__init__()
Bottom.__init__()

Now I replaced all parent init calls with *super()*. After this, Top was
initialized only once.

Top.__init__() Right.__init__() Left.__init__() Bottom.__init__()

But at this point, I freaked out. The code is complex and I don't have the
time to examine its inner workings. And before, everything worked correctly
even though Top was initialized twice. So I decided to break the superclass
chain and use super() only in classes inheriting from a single parent. My
intent was to keep the original behavior but use super() where possible to
make the code more readable.

class Top:
    def __init__(self):
        print("Top.__init__()")

class Left(Top):
    def __init__(self):
        super().__init__()
        print("Left.__init__()")

class Right(Top):
    def __init__(self):
        super().__init__()
        print("Right.__init__()")

class Bottom(Left, Right):
    def __init__(self):
        Left.__init__(self) # Here I'm calling both parents manually
        Right.__init__(self)
        print("Bottom.__init__()")

b = Bottom()


The result has surprised me:

Top.__init__() Right.__init__() Left.__init__() Top.__init__()
Right.__init__() Bottom.__init__()

Now, as I see it, from the super()'s point of view, there are two
inheritance chains, one starting at Left and the other at Right. But
*Right.__init__()* is called twice. What's going on here?
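
A way to look at this, offered as a sketch rather than a definitive
explanation: super() inside Left does not mean "the parent of Left", it
means "the class after Left in the MRO of type(self)". With a Bottom
instance that next class is Right. Recording the calls in a list instead
of printing them makes the order easy to check:

```python
calls = []

class Top:
    def __init__(self):
        calls.append("Top")

class Left(Top):
    def __init__(self):
        super().__init__()   # next class after Left in type(self).__mro__
        calls.append("Left")

class Right(Top):
    def __init__(self):
        super().__init__()   # next class after Right in type(self).__mro__
        calls.append("Right")

class Bottom(Left, Right):
    def __init__(self):
        Left.__init__(self)   # runs Left, which chains to Right, then Top
        Right.__init__(self)  # runs Right again, which chains to Top
        calls.append("Bottom")

mro_names = [cls.__name__ for cls in Bottom.__mro__]
Bottom()
print(mro_names)
print(calls)
```

So the first Right.__init__() comes from the super() call inside Left, and
the second one from the explicit call in Bottom.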

Thanks,
Peter
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Bug in io.TextIOWrapper?

2023-06-19 Thread Peter J. Holzer via Python-list
On 2023-06-20 02:15:00 +0900, Inada Naoki via Python-list wrote:
> stream.flush() doesn't mean final output.
> Try stream.close()

After close() the value isn't available any more:

Python 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import io
>>> buffer = io.BytesIO()
>>> stream = io.TextIOWrapper(buffer, encoding='idna')
>>> stream.write('abc.example.com')
15
>>> stream.close()
>>> buffer.getvalue()
Traceback (most recent call last):
  File "", line 1, in 
ValueError: I/O operation on closed file.
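
For comparison, a sketch with plain utf-8 instead of idna (so it says
nothing about what the idna codec does with its buffered data):
detach() flushes the wrapper and hands back the underlying buffer
without closing it, so the value is still readable afterwards.

```python
import io

buffer = io.BytesIO()
stream = io.TextIOWrapper(buffer, encoding="utf-8")
stream.write("abc.example.com")
stream.detach()               # flushes, then releases buffer without closing it
data = buffer.getvalue()
print(data)

closed_buffer = io.BytesIO()
closed_stream = io.TextIOWrapper(closed_buffer, encoding="utf-8")
closed_stream.write("abc.example.com")
closed_stream.close()         # also closes closed_buffer
try:
    closed_buffer.getvalue()
except ValueError as err:
    print("after close():", err)
```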

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Should NoneType be iterable?

2023-06-19 Thread Peter Bona via Python-list
Hi

I am wondering if there has been any discussion of why NoneType is not iterable.
My feeling is that it should be.
Sometimes I am using API calls which return None.
If there is a return value (which is iterable) I am using a for loop to iterate.

Now I am getting 'TypeError: 'NoneType' object is not iterable'.

(Examples are taken from here 
https://rollbar.com/blog/python-typeerror-nonetype-object-is-not-iterable/)
Example 1:
mylist = None
for x in mylist:
    print(x)  # raises TypeError: 'NoneType' object is not iterable
Solution: extra if statement
if mylist is not None:
    for x in mylist:
        print(x)


I think Python should handle this case gracefully: if code iterates over
None, the loop body should not run at all and execution should proceed with the next statement.
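
For reference, the behaviour described above can be approximated today
without the extra if statement. A sketch, where fetch_items is a made-up
stand-in for an API call that returns a list or None:

```python
def fetch_items(found):
    """Stand-in for an API call that returns a list or None (hypothetical)."""
    return ["a", "b"] if found else None

def collect(found):
    seen = []
    # `x or ()` substitutes an empty tuple for None (and for any other falsy value)
    for item in fetch_items(found) or ():
        seen.append(item)
    return seen

print(collect(True), collect(False))
```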

Has this been discussed or proposed?

Thanks
Peter


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: File system path annotations

2023-06-19 Thread Peter Slížik via Python-list
Thank you, Roel. You've answered all my questions.

> [PEP 519]: ...as that can be represented with typing.Union[str, bytes,
> os.PathLike] easily enough and the hope is users
> will slowly gravitate to path objects only.

I read a lot on Python and, frankly, I don't see this happening. People on
the Internet keep using *str* as their path representation choice.
Presumably, programmers don't feel the need to bother with a complex
solution if the simplest option works just fine.

Peter
-- 
https://mail.python.org/mailman/listinfo/python-list


File system path annotations

2023-06-19 Thread Peter Slížik via Python-list
Hello,

what is the preferred way of annotating file system paths?

This StackOverflow answer <https://stackoverflow.com/a/58541858/1062139>
(and a few others) recommend using the

str | os.PathLike

union.

However, byte arrays can be used as paths too, and including them
would make the annotation quite long.

I also believe that str conforms to the PathLike definition. Please,
correct me if I'm wrong.

And finally - using paths in Python programs is so common, that one
would expect to have a special type (or type alias) in typing. Am I
missing something?
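
To make the question concrete, this is the kind of alias I have in mind,
together with a quick runtime check of the str/PathLike relationship. The
alias name AnyPath and the function size_of are made up for this example;
as far as I can tell the typing module itself does not provide such an
alias.

```python
import os
from pathlib import Path
from typing import Union

# Hypothetical alias name; not something the typing module provides.
AnyPath = Union[str, bytes, os.PathLike]

def size_of(path: AnyPath) -> int:
    """Return the size in bytes of the file or directory at `path`."""
    return os.stat(path).st_size

str_is_pathlike = isinstance("setup.py", os.PathLike)         # str has no __fspath__
path_is_pathlike = isinstance(Path("setup.py"), os.PathLike)  # pathlib paths do
print(str_is_pathlike, path_is_pathlike)
```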

My apologies if I'm asking the obvious, but after some googling I came
to the conclusion that information on this topic is surprisingly
limited to a few StackOverflow questions.

Best regards,

Peter
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Log File

2023-05-31 Thread Peter J. Holzer
On 2023-05-31 00:22:11 -0700, ahsan iqbal wrote:
> Why we need a log file ?

A log file contains information about what your program was doing. You
use it to check that your program was performing as intended after the
fact. This is especially useful for tracking down problems with programs
which are either long-running or are running unattended.

For example, if a user tells me that they were having a problem
yesterday evening I can read the log file to see what my program was
doing at the time which will help me to pin down the problem.

The important part to remember is that a log file will only contain
information the program writes. If you didn't think to log some
information then it won't be in the log file and you can't look it up
afterwards. On the other hand you don't want to log too much information
because that might cause performance problems, it might fill up your
disks and it will be a chore to find the relevant information in a sea
of irrelevant details. So deciding what to log is a bit of an art.

> If i read a large text file than how log file help me in this regard?

It won't help you with reading the file.

However, you might want to log some information about operation, e.g.
when you started (log files usually contain time stamps), when you
ended, how large the file was, etc. That way you can compare different
runs (has the file increased in size over the last month? Was reading
especially slow yesterday?). You could also log a status message every
once in a while (e.g. every 100 MB or every 10 lines). That will
give you reassurance that the program is working and a rough estimate
when it will be finished. Or you can log any other information you think
might be useful.
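
A minimal sketch of what that could look like with the standard logging
module. The logger name, the function name count_lines and the reporting
interval are arbitrary choices for the example:

```python
import logging
import time

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",  # time stamp on every line
)
log = logging.getLogger("reader")   # hypothetical logger name

def count_lines(path, report_every=100_000):
    """Read a text file line by line, logging progress along the way."""
    started = time.monotonic()
    log.info("start reading %s", path)
    count = 0
    with open(path, encoding="utf-8", errors="replace") as fh:
        for count, _line in enumerate(fh, start=1):
            if count % report_every == 0:
                log.info("%d lines read so far", count)
    log.info("finished %s: %d lines in %.1f s",
             path, count, time.monotonic() - started)
    return count
```

By default basicConfig() writes to stderr; passing filename="reader.log"
(again just an example name) sends the same messages to a file instead.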

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: getting rid of the recursion in __getattribute__

2023-05-25 Thread Peter Otten

On 24/05/2023 15:37, A KR wrote:

It is perfectly explained in the standards here [1] saying that:


In order to avoid infinite recursion in this method, its implementation should 
always call the base class method with the same name to access any attributes 
it needs, for example, object.__getattribute__(self, name).


Therefore, I wrote a code following what the standard says:


class Sample():
    def __init__(self):
        self.a = -10

    def __getattribute__(self, name):
        if name == 'a':
            return object.__getattribute__(self, name)

        raise AttributeError()

s = Sample()
result = s.a
print(result)

I did not fall into recursion, and the output was
-10


While this works it's not how I understand the recommended pattern. I'd
rather treat "special" attributes first and then use the
__getattribute__ method of the base class as a fallback:

>>> class Demo:
        def __getattribute__(self, name):
            if name == "answer":
                return 42
            return super().__getattribute__(name)

That way your special attributes,

>>> d = Demo()
>>> d.answer
42


missing attributes

>>> d.whatever
Traceback (most recent call last):
  File "", line 1, in 
    d.whatever
  File "", line 5, in __getattribute__
    return super().__getattribute__(name)
AttributeError: 'Demo' object has no attribute 'whatever'

and "normal" attributes are treated as expected

>>> d.question = "What's up?"
>>> d.question
"What's up?"

Any "special" attributes in the superclass would also remain accessible.
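
A small sketch of that last point, with an invented Base class and an
invented attribute name base_special: because Demo falls back to
super().__getattribute__(), the lookup passes through Base before it
reaches object.

```python
class Base:
    def __getattribute__(self, name):
        if name == "base_special":     # hypothetical "special" attribute of Base
            return "from Base"
        return super().__getattribute__(name)

class Demo(Base):
    def __getattribute__(self, name):
        if name == "answer":           # handle Demo's own special attribute first
            return 42
        return super().__getattribute__(name)   # falls through to Base, then object

d = Demo()
d.question = "What's up?"
print(d.answer, d.base_special, d.question)
```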



However, when I try the code without deriving from a class:

class AnyClassNoRelation:
    pass

class Sample():
    def __init__(self):
        self.a = -10

    def __getattribute__(self, name):
        if name == 'a':
            return AnyClassNoRelation.__getattribute__(self, name)

        raise AttributeError()

s = Sample()

result = s.a
print(result)
and calling __getattribute__ via any class (in this example class 
AnyClassNoRelation) instead of object.__getattribute__(self, name) as the 
standard says call using the base class, I get the same output: no recursion 
and -10.

So my question:

How come this is possible (having the same output without using the base 
class's __getattribute__? Although the standards clearly states that 
__getattribute__ should be called from the base class.



AnyClassNoRelation does not override __getattribute__, so

>>> AnyClassNoRelation.__getattribute__ is object.__getattribute__
True


There is no sanity check whether a method that you call explicitly is
actually in an object's inheritance tree,

>>> class NoRelation:
        def __getattribute__(self, name):
            return name.upper()


>>> class Demo:
        def __getattribute__(self, name):
            return "<{}>".format(NoRelation.__getattribute__(self, name))


>>> Demo().some_arg
'<SOME_ARG>'

but the only purpose I can imagine of actually calling "someone else's"
method is to confuse the reader...


In order to avoid infinite recursion in this method, its implementation should 
always call the base class method with the same name to access any attributes 
it needs, for example, object.__getattribute__(self, name).


Literally, I can call __getattribute__ with anyclass (except Sample cause it 
will be infinite recursion) I define and it works just fine. Could you explain 
me why that happens?



--
https://mail.python.org/mailman/listinfo/python-list


Re: OT: Addition of a .= operator

2023-05-24 Thread Peter J. Holzer
On 2023-05-24 12:10:09 +1200, dn via Python-list wrote:
> Perhaps more psychology rather than coding?

Both. As they say, coding means writing for other people first, for
the computer second. So that means anticipating what will be least
confusing for that other person[1] who's going to read that code.

hp

[1] Which is often yourself, a few months older. Or it could be an
experienced colleague who's very familiar with the codebase. Or a new
colleague trying to understand what this is all about (possibly while
learning Python).

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Addition of a .= operator

2023-05-24 Thread Peter J. Holzer
On 2023-05-24 08:51:19 +1000, Chris Angelico wrote:
> On Wed, 24 May 2023 at 08:48, Peter J. Holzer  wrote:
> > Yes, that probably wasn't the best example. I sort of deliberately
> > avoided method chaining here to make my point that you don't have to
> > invent a new variable name for every intermediate result, but of course
> > that backfired because in this case you don't need a variable name at
> > all. I should have used regular function calls ...
> >
> 
> In the context of a .= operator, though, that is *in itself* an
> interesting data point: in order to find an example wherein the .=
> operator would be plausible, you had to make the .= operator
> unnecessary.

Another communication failure on my part, I'm afraid: I was going off on
a tangent about variable naming and didn't intend to show anything about
the usefulness (or lack thereof) of a .= operator.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Addition of a .= operator

2023-05-23 Thread Peter J. Holzer
On 2023-05-24 07:12:32 +1000, Chris Angelico wrote:
> On Wed, 24 May 2023 at 07:04, Peter J. Holzer  wrote:
> > But I find it easier to read if I just reuse the same variable name:
> >
> > user = request.GET["user"]
> > user = str(user, encoding="utf-8")
> > user = user.strip()
> > user = user.lower()
> > user = orm.user.get(name=user)
> >
> > Each instance only has a livetime of a single line (or maybe two or
> > three lines if I have to combine variables), so there's little risk of
> > confusion, and reusing the variable name makes it very clear that all
> > those intermediate results are gone and won't be used again.
> >
> 
> Small side point: You can make use of the bytes object's decode()
> method to make the chaining much more useful here, rather than the
> str() constructor.
> 
> This sort of code might be better as a single expression. For example:
> 
> user = (
>     request.GET["user"]
>     .decode("utf-8")
>     .strip()
>     .lower()
> )
> user = orm.user.get(name=user)

Yes, that probably wasn't the best example. I sort of deliberately
avoided method chaining here to make my point that you don't have to
invent a new variable name for every intermediate result, but of course
that backfired because in this case you don't need a variable name at
all. I should have used regular function calls ...
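
Something along these lines, for instance. The three helper functions are
invented for illustration and the input is a made-up byte string; the
point is only that with plain function calls there is no method chain to
fall back on, so reusing the name avoids inventing three more:

```python
import unicodedata

def decode(raw):
    return str(raw, encoding="utf-8")

def collapse_spaces(text):
    # strip leading/trailing whitespace and squeeze inner runs to one space
    return " ".join(text.split())

def normalize(text):
    return unicodedata.normalize("NFC", text).casefold()

user = b"  Jos\xc3\xa9   Mart\xc3\xadnez "
user = decode(user)
user = collapse_spaces(user)
user = normalize(user)
print(user)
```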

hp


-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Addition of a .= operator

2023-05-23 Thread Peter J. Holzer
On 2023-05-21 20:30:45 +0100, Rob Cliffe via Python-list wrote:
> On 20/05/2023 18:54, Alex Jando wrote:
> > So what I'm suggesting is something like this:
> > 
> > 
> > hash = hashlib.sha256(b'word')
> > hash.=hexdigest()
> > 
> > num = Number.One
> > num.=value
> > 
> It seems to me that this would encourage bad style.  When you write
>     num = num.value
> you are using num with two different meanings (an object and an
> attribute of it).

I think that's ok if it's the same thing at a high level.

I sometimes have a chain of transformations (e.g. first decode it, then
strip extra spaces, then normalize spelling, then look it up in a
database and replace it with the record, ...). Technically, of course
all these intermediate objects are different, and I could make that
explicit by using different variable names:

user_param = request.GET["user"]
user_decoded = str(user_param, encoding="utf-8")
user_stripped = user_decoded.strip()
user_normalized = user_stripped.lower()
user_object = orm.user.get(name=user_normalized)

But I find it easier to read if I just reuse the same variable name:

user = request.GET["user"]
user = str(user, encoding="utf-8")
user = user.strip()
user = user.lower()
user = orm.user.get(name=user)

Each instance only has a lifetime of a single line (or maybe two or
three lines if I have to combine variables), so there's little risk of
confusion, and reusing the variable name makes it very clear that all
those intermediate results are gone and won't be used again.

hp


-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Addition of a .= operator

2023-05-20 Thread Peter J. Holzer
On 2023-05-20 10:54:59 -0700, Alex Jando wrote:
> I have many times had situations where I had a variable of a certain
> type, all I cared about it was one of it's methods.
> 
> For example:
> 
> 
> hash = hash.hexdigest()
> 
> num = num.value
> 
> 
> So what I'm suggesting is something like this:
> 
> 
> hash.=hexdigest()
> 
> num.=value
> 

I actually needed to read those twice to get their meaning. I think

hash .= hexdigest()
num .= value

would have been clearer (yes, I nag my colleagues about white-space,
too).

Do you have any examples (preferably from real code) where you don't
assign to a simple variable? I feel that
    x += 1
isn't much of an improvement over
    x = x + 1
but
    self.data[line+len(chars)-1] += after
is definitely an improvement over
    self.data[line+len(chars)-1] = self.data[line+len(chars)-1] + after
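
Besides readability there is also a small semantic difference: with the
augmented form the target expression is evaluated only once. A sketch
that counts the evaluations (index() is a made-up stand-in for something
like line+len(chars)-1):

```python
evaluations = 0

def index():
    """Stand-in for a non-trivial target expression; counts how often it runs."""
    global evaluations
    evaluations += 1
    return 0

data = ["abc"]

data[index()] += "def"                  # target expression evaluated once
augmented = evaluations

evaluations = 0
data[index()] = data[index()] + "ghi"   # target expression evaluated twice
spelled_out = evaluations
print(data, augmented, spelled_out)
```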

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Problem with accented characters in mailbox.Maildir()

2023-05-09 Thread Peter J. Holzer
On 2023-05-08 23:02:18 +0200, jak wrote:
> Peter J. Holzer ha scritto:
> > On 2023-05-06 16:27:04 +0200, jak wrote:
> > > Chris Green ha scritto:
> > > > Chris Green  wrote:
> > > > > A bit more information, msg.get("subject", "unknown") does return a
> > > > > string, as follows:-
> > > > > 
> > > > >   Subject: 
> > > > > =?utf-8?Q?aka_Marne_=C3=A0_la_Sa=C3=B4ne_(Waterways_Continental_Europe)?=
> > [...]
> > > > ... and of course I now see the issue!  The Subject: with utf-8
> > > > characters in it gets spaces changed to underscores.  So searching for
> > > > '(Waterways Continental Europe)' fails.
> > > > 
> > > > I'll either need to test for both versions of the string or I'll need
> > > > to change underscores to spaces in the Subject: returned by msg.get().
[...]
> > > 
> > > subj = email.header.decode_header(raw_subj)[0]
> > > 
> > > subj[0].decode(subj[1])
[...]
> > email.header.decode_header returns a *list* of chunks and you have to
> > process and concatenate all of them.
> > 
> > Here is a snippet from a mail to html converter I wrote a few years ago:
> > 
> > def decode_rfc2047(s):
> >  if s is None:
> >  return None
> >  r = ""
> >  for chunk in email.header.decode_header(s):
[...]
> >  r += chunk[0].decode(chunk[1])
[...]
> >  return r
[...]
> > 
> > I do have to say that Python is extraordinarily clumsy in this regard.
> 
> Thanks for the reply. In fact, I gave that answer because I did
> not understand what the OP wanted to achieve. In addition, the
> OP opened a second thread on the similar topic in which I gave a
> more correct answer (subject: "What do these '=?utf-8?' sequences
> mean in python?", date: "Sat, 6 May 2023 14:50:40 UTC").

Right. I saw that after writing my reply. I should have read all
messages, not just that thread before replying.

> the OP, I discovered that the MAME is not the only format used
> to compose the subject.

Not sure what "MAME" is. If it's a typo for MIME, then the base64
variant of RFC 2047 is just as much a part of it as the quoted-printable
variant.

> This made me think that a library could not delegate to the programmer
> the burden of managing all these exceptions,

email.header.decode_header handles both variants, but it produces bytes
sequences which still have to be decoded to get a Python string.


> then I have further investigated to discover that the library also
> provides the conversion function beyond that of coding and this makes
> our labors vain:
> 
> --
> from email.header import decode_header, make_header
> 
> subject = make_header(decode_header( raw_subject ))
> --

Yup. I somehow missed that. That's a lot more convenient than calling
decode in a loop (or generator expression). Depending on what you want
to do with the subject you may have to wrap that in a call to str(), but
it's still a one-liner.
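
Applied to the Subject from the start of this thread, that would look
roughly like this (raw_subject is just split over two string literals to
keep the line short):

```python
from email.header import decode_header, make_header

raw_subject = (
    "=?utf-8?Q?aka_Marne_=C3=A0_la_Sa=C3=B4ne_"
    "(Waterways_Continental_Europe)?="
)

# make_header() returns a Header object; str() turns it into a plain string
subject = str(make_header(decode_header(raw_subject)))
print(subject)
print("(Waterways Continental Europe)" in subject)
```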

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Problem with accented characters in mailbox.Maildir()

2023-05-08 Thread Peter J. Holzer
On 2023-05-06 16:27:04 +0200, jak wrote:
> Chris Green ha scritto:
> > Chris Green  wrote:
> > > A bit more information, msg.get("subject", "unknown") does return a
> > > string, as follows:-
> > > 
> > >  Subject: 
> > > =?utf-8?Q?aka_Marne_=C3=A0_la_Sa=C3=B4ne_(Waterways_Continental_Europe)?=
[...]
> > ... and of course I now see the issue!  The Subject: with utf-8
> > characters in it gets spaces changed to underscores.  So searching for
> > '(Waterways Continental Europe)' fails.
> > 
> > I'll either need to test for both versions of the string or I'll need
> > to change underscores to spaces in the Subject: returned by msg.get().

You need to decode the Subject properly. Unfortunately the Python email
module doesn't do that for you automatically. But it does provide the
necessary tools. Don't roll your own unless you've read and understood
the relevant RFCs.

> 
> This is probably what you need:
> 
> import email.header
> 
> raw_subj =
> '=?utf-8?Q?aka_Marne_=C3=A0_la_Sa=C3=B4ne_(Waterways_Continental_Europe)?='
> 
> subj = email.header.decode_header(raw_subj)[0]
> 
> subj[0].decode(subj[1])
> 
> 'aka Marne à la Saône (Waterways Continental Europe)'

You are on the right track, but that works only because the example
consists of a single encoded word. This is not always the case (and
indeed not what the RFC recommends).

email.header.decode_header returns a *list* of chunks and you have to
process and concatenate all of them.

Here is a snippet from a mail to html converter I wrote a few years ago:

import email.header

def decode_rfc2047(s):
    if s is None:
        return None
    r = ""
    for chunk in email.header.decode_header(s):
        if chunk[1]:
            try:
                r += chunk[0].decode(chunk[1])
            except (LookupError, UnicodeDecodeError):
                # unknown or wrong charset: fall back to windows-1252
                r += chunk[0].decode("windows-1252")
        elif isinstance(chunk[0], bytes):
            r += chunk[0].decode('us-ascii')
        else:
            r += chunk[0]
    return r

(this is maybe a bit more forgiving than the OP needs, but I had to deal
with malformed mails)

I do have to say that Python is extraordinarily clumsy in this regard.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What do these '=?utf-8?' sequences mean in python?

2023-05-08 Thread Peter Pearson
On Sat, 6 May 2023 14:50:40 +0100, Chris Green  wrote:
[snip]
> So, what do those =?utf-8? and ?= sequences mean?  Are they part of
> the string or are they wrapped around the string on output as a way to
> show that it's utf-8 encoded?

Yes, "=?utf-8?" signals "MIME header encoding".

I've only blundered about briefly in this area, but I think you
need to make sure that all header values you work with have been
converted to UTF-8 before proceeding.  
Here's the code that seemed to work for me:

import email.header

def mime_decode_single(pair):
    """Decode a single (bytestring, charset) pair.
    """
    b, charset = pair
    result = b if isinstance(b, str) else b.decode(
        charset if charset else "utf-8")
    return result

def mime_decode(s):
    """Decode a MIME-header-encoded character string.
    """
    decoded_pairs = email.header.decode_header(s)
    return "".join(mime_decode_single(d) for d in decoded_pairs)



-- 
To email me, substitute nowhere->runbox, invalid->com.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Question regarding unexpected behavior in using __enter__ method

2023-04-21 Thread Peter Otten

On 21/04/2023 00:44, Lorenzo Catoni wrote:

Dear Python Mailing List members,

I am writing to seek your assistance in understanding an unexpected
behavior that I encountered while using the __enter__ method. I have
provided a code snippet below to illustrate the problem:

```
>>> class X:
...     __enter__ = int
...     __exit__ = lambda *_: None
...
>>> with X() as x:
...     pass
...
>>> x
0
```
As you can see, the __enter__ method does not throw any exceptions and
returns the output of "int()" correctly. However, one would normally expect
the input parameter "self" to be passed to the function.

On the other hand, when I implemented a custom function in place of the
__enter__ method, I encountered the following TypeError:

```
>>> def myint(*a, **kw):
...     return int(*a, **kw)
...
>>> class X:
...     __enter__ = myint
...     __exit__ = lambda *_: None
...
>>> with X() as x:
...     pass
...
Traceback (most recent call last):
   File "", line 1, in 
   File "", line 2, in myint
TypeError: int() argument must be a string, a bytes-like object or a real
number, not 'X'
```
Here, the TypeError occurred because "self" was passed as an input
parameter to "myint". Can someone explain why this unexpected behavior
occurs only in the latter case?


Cameron is right, it's the descriptor protocol. Technically

inst.attr

invokes attr.__get__(...) if it exists:

>>> class A:
        def __get__(self, *args): return args


>>> class B: pass

>>> class X:
        a = A()
        b = B()


>>> x = X()
>>> x.b
<__main__.B object at 0x02C2E388>
>>> x.a
(<__main__.X object at 0x02C2E280>, <class '__main__.X'>)

Python functions support the descriptor protocol

>>> hasattr(lambda: None, "__get__")
True

while builtin functions don't:

>>> hasattr(ord, "__get__")
False


I'm unsure whether to regard int as a class or a function, but as there
is no __get__

>>> hasattr(int, "__get__")
False

it behaves like builtin functions in this case.
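
If the goal is to make a Python function behave like int here, wrapping
it in staticmethod() should suppress the binding of self. A sketch, with
invented class names:

```python
def myint(*args, **kwargs):
    return int(*args, **kwargs)

class UsesInt:
    __enter__ = int                    # no __get__, so never bound to the instance
    __exit__ = lambda *_: None

class UsesStaticMyInt:
    __enter__ = staticmethod(myint)    # staticmethod suppresses the binding of self
    __exit__ = lambda *_: None

with UsesInt() as a, UsesStaticMyInt() as b:
    pass
print(a, b)
print(hasattr(myint, "__get__"), hasattr(int, "__get__"))
```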

--
https://mail.python.org/mailman/listinfo/python-list


Re: Cannot install pkg_resources using pip

2023-04-17 Thread Peter J. Holzer
On 2023-04-16 17:03:43 -0400, Thomas Passin wrote:
> On 4/16/2023 4:42 PM, Rich Shepard wrote:
> > Python3-3.9.10 installed on this Slackware64-14.2 desktop.
[...]
> > # pip install setuptools
> > bash: /usr/bin/pip: /usr/bin/python3.7: bad interpreter: No such file or
> > directory
> > 
> > There is no python3.7 here:
> > # ls /usr/bin/python3.7
> > ls: cannot access '/usr/bin/python3.7': No such file or directory
> > 
> > How do I clean this up?
> 
> What is there to clean up?

There is a version of pip installed for a version of python which isn't
installed. That's definitely not useful, so it should be cleaned up.

As to how to do that:

Find out which package /usr/bin/pip belongs to and deinstall or upgrade
this package. How to find that package is a Slackware question, not a
Python question. And since Rich wrote that he's been comfortably using
Slackware for 20 years, I'll trust that he knows how to do that and just
needed a little nudge in the right direction.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Weak Type Ability for Python

2023-04-15 Thread Peter J. Holzer
On 2023-04-13 03:28:37 +0100, MRAB wrote:
> On 2023-04-13 03:12, avi.e.gr...@gmail.com wrote:
> > I suspect the OP is thinking of languages like PERL or JAVA which guess for
> > you and make such conversions when it seems to make sense.
> > 
> In the case of Perl, there are distinct operators for addition and string
> concatenation, with automatic type conversion (non-numeric strings have a
> numeric value of 0, which can hide bugs).

You get a warning for that, though.

hp



Re: Weak Type Ability for Python

2023-04-15 Thread Peter J. Holzer
On 2023-04-14 10:19:03 +1000, Chris Angelico wrote:
> The entire Presentation Manager and Workplace Shell (broadly
> equivalent to a Linux "desktop manager", I think? Kinda?) were object
> oriented; you would have a WPDataFile for every, well, data file, but
> some of those might be subclasses of WPDataFile. And it was fairly
> straight-forward to write your own subclass of WPDataFile, and there
> was an API to say "if you would ever create a WPDataFile, instead
> create one of my class instead". This brilliant technique allowed
> anyone to enhance the desktop in any way, quite impressive especially
> for its time. I've yearned for that ever since, in various systems,
> although I'm aware that it would make quite a mess of Python if you
> could say "class EnhancedInt(int): ..." and then "any time you would
> create an int, create an EnhancedInt instead". A bit tricky to
> implement.

Or alternatively you might be able to add or replace methods on the
existing int class. So 5 is still just an int, but now (5 + "x") calls
the modified __add__ method which knows how to add a string to an int.

Might make even more of a mess ;-).
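
For what it's worth, CPython doesn't allow either variant for the built-in int. A rough sketch of what is possible today (EnhancedInt is just the hypothetical name from Chris's mail, and the string handling is an arbitrary choice for the example): a subclass can override __add__, but plain literals stay plain ints, and assigning to int.__add__ is rejected.

```python
class EnhancedInt(int):
    """Hypothetical int subclass whose + also accepts strings."""

    def __add__(self, other):
        if isinstance(other, str):
            return str(self) + other
        # Everything else is left to the normal int behaviour.
        return super().__add__(other)


print(EnhancedInt(5) + "x")  # uses the overridden __add__

try:
    5 + "x"                  # a literal is still a plain int
except TypeError as exc:
    print("plain int:", exc)

try:
    int.__add__ = EnhancedInt.__add__  # built-in types can't be patched
    patched = True
except TypeError as exc:
    patched = False
    print("cannot patch int:", exc)
```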

hp



Re: Weak Type Ability for Python

2023-04-13 Thread Peter J. Holzer
On 2023-04-13 08:25:51 -0700, Grant Edwards wrote:
> On 2023-04-13, Cameron Simpson  wrote:
> > On 12Apr2023 22:12, avi.e.gr...@gmail.com  wrote:
> >
> >>I suspect the OP is thinking of languages like PERL or JAVA which guess 
> >>for you and make such conversions when it seems to make sense.
> >
> > JavaScript guesses. What a nightmare.
> 
> So does PHP.

Not in this case. Like Perl (Not PERL) it has different operators for
concatenation and addition. So $a + $b is always addition, never
concatenation.

Well, at least for numbers and strings. For arrays it's a (somewhat bizarre)
union.

hp



Re: Christoph Gohlke and compiled packages

2023-04-11 Thread Peter J. Holzer
On 2023-04-11 12:54:05 +0100, Oscar Benjamin wrote:
> Certainly for the more widely used libraries like numpy installing
> binaries with pip is not a problem these days on Windows or other
> popular OS. I notice that psycopg2 *only* provides binaries for
> Windows and not e.g. OSX or Linux

For Linux there is a separate package psycopg2-binary on PyPI.
That split happened a few years ago and I forgot why it was necessary.
For the distributions I use (Debian and Ubuntu) both packages work (but
for the source package I need to install the necessary development
packages first).

hp



Re: built-in pow() vs. math.pow()

2023-03-31 Thread Peter J. Holzer
On 2023-03-31 07:39:25 +0100, Barry wrote:
> On 30 Mar 2023, at 22:30, Chris Angelico  wrote:
> > It's called math.pow. That on its own should be a strong indication
> > that it's designed to work with floats.
> 
> So long as you know that the math module is provided to give access
> the C math.h functions.
> 

Well, that's the first line in the docs:

| This module provides access to the mathematical functions defined by
| the C standard.

Of course a Python programmer may not necessarily know what mathematical
functions the C standard defines or even what C is.
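
A short illustration of the practical difference (just a sketch using the two functions under discussion): the built-in pow() stays in exact integer arithmetic and has a three-argument modular form, while math.pow() converts to float like C's pow() and can overflow.

```python
import math

exact = pow(2, 100)          # arbitrary-precision int
approx = math.pow(2, 100)    # always a float
modular = pow(2, 100, 1000)  # three-argument form, built-in only

print(type(exact).__name__, exact)
print(type(approx).__name__, approx)
print(modular)

try:
    math.pow(10, 400)        # too large for a C double
    overflowed = False
except OverflowError as exc:
    overflowed = True
    print("math.pow:", exc)

print(len(str(pow(10, 400))))  # the built-in handles it as an int
```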

    hp



Re: When is logging.getLogger(__name__) needed?

2023-03-31 Thread Peter Otten

On 31/03/2023 15:01, Loris Bennett wrote:

Hi,

In my top level program file, main.py, I have

   def main_function():

       parser = argparse.ArgumentParser(description="my prog")

       ...

       args = parser.parse_args()
       config = configparser.ConfigParser()

       if args.config_file is None:
           config_file = DEFAULT_CONFIG_FILE
       else:
           config_file = args.config_file

       config.read(config_file)

       logging.config.fileConfig(fname=config_file)
       logger = logging.getLogger(__name__)

       do_some_stuff()

       my_class_instance = myprog.MyClass()

   def do_some_stuff():

       logger.info("Doing stuff")

This does not work, because 'logger' is not known in the function
'do_some_stuff'.

However, if in 'my_prog/my_class.py' I have

   class MyClass:

       def __init__(self):

           logger.debug("created instance of MyClass")

this 'just works'.


Take another look at your code -- you'll probably find


   logger = logging.getLogger(__name__)


on the module level in my_class.py.


to 'do_some_stuff', but why is this necessary in this case but not in
the class?


Your problem has nothing to do with logging -- it's about visibility
("scope") of names:

>>> def use_name():
        print(name)


>>> def define_name():
        name = "Loris"


>>> use_name()
Traceback (most recent call last):
  File "", line 1, in 
    use_name()
  File "", line 2, in use_name
    print(name)
NameError: name 'name' is not defined

Binding (=assigning to) a name inside a function makes it local to that
function. If you want a global (module-level) name you have to say so:

>>> def define_name():
        global name
        name = "Peter"


>>> define_name()
>>> use_name()
Peter
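
Applied to the logging case, the usual way out is to bind the logger at module level rather than with a global statement. A minimal sketch, assuming a single-file program; logging.basicConfig() merely stands in for your logging.config.fileConfig() call so that the example is self-contained:

```python
import logging

# Module-level binding: visible to every function in this module.
logger = logging.getLogger(__name__)


def do_some_stuff():
    logger.info("Doing stuff")
    return logger.name


def main_function():
    # Stand-in for logging.config.fileConfig(fname=config_file).
    logging.basicConfig(level=logging.INFO)
    return do_some_stuff()


if __name__ == "__main__":
    main_function()
```

Configuring logging inside main_function() is fine; only the name 'logger' has to be bound where do_some_stuff() can see it.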



Re: How does a method of a subclass become a method of the base class?

2023-03-27 Thread Peter J. Holzer
On 2023-03-27 01:53:49 +0200, Jen Kris via Python-list wrote:
> But that brings up a new question.  I can create a class instance with
> x = BinaryConstraint(), but what happens when I have a line like
> "EqualityConstraint(prev, v, Strength.REQUIRED)"?

If that is the whole statement it will create a new object of class
EqualityConstraint and immediately discard it. That may have some useful
side effect (for example the object may add itself to a list of
constraints) but this is not apparent from this line.
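
A sketch of such a side effect, assuming the constructor registers the new object somewhere (the registry list and the constructor signature are invented for the example, they are not taken from your program):

```python
class EqualityConstraint:
    registry = []  # hypothetical list of all constraints created so far

    def __init__(self, left, right):
        self.left = left
        self.right = right
        # The side effect: the object adds itself to the list.
        EqualityConstraint.registry.append(self)


# The result of the call is not assigned to anything ...
EqualityConstraint("prev", "v")

# ... but the object survives because the list still refers to it.
print(len(EqualityConstraint.registry))
```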

    hp



Re: How does a method of a subclass become a method of the base class?

2023-03-26 Thread Peter J. Holzer
On 2023-03-26 19:43:44 +0200, Jen Kris via Python-list wrote:
> The base class:
> 
> 
> class Constraint(object):
[...]
>     def satisfy(self, mark):
>         global planner
>         self.choose_method(mark)
> 
> The subclass:
> 
> class UrnaryConstraint(Constraint):
[...]
>     def choose_method(self, mark):
>         if self.my_output.mark != mark and \
>            Strength.stronger(self.strength, self.my_output.walk_strength):
>             self.satisfied = True
>         else:
>             self.satisfied = False
> 
> The base class Constraint doesn’t have a "choose_method" class method,
> but it’s called as self.choose_method(mark) on the final line of
> Constraint shown above. 
> 
> My question is:  what makes "choose_method" a method of the base
> class,

Nothing. choose_method isn't a method of the base class.

> called as self.choose_method instead of
> UrnaryConstraint.choose_method?  Is it super(UrnaryConstraint,
> self).__init__(strength) or just the fact that Constraint is its base
> class? 

This works only if satisfy() is called on a subclass of Constraint which
actually implements this method.

If you do something like

x = UrnaryConstraint()
x.satisfy(whatever)

Then x is a member of class UrnaryConstraint and will have a
choose_method() method which can be called.


> Also, this program also has a class BinaryConstraint that is also a
> subclass of Constraint and it also has a choose_method class method
> that is similar but not identical:
...
> When called from Constraint, it uses the one at UrnaryConstraint.  How
> does it know which one to use? 

By inspecting self. If you call x.satisfy() on an object of class
UrnaryConstraint, then self.choose_method will be the choose_method from
UrnaryConstraint. If you call it on an object of class BinaryConstraint,
then self.choose_method will be the choose_method from BinaryConstraint.
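
A stripped-down sketch of that lookup (the method bodies are placeholders, not the ones from your program, and I've spelled the class UnaryConstraint here):

```python
class Constraint:
    def satisfy(self, mark):
        # Looked up on the object self refers to, not on Constraint.
        return self.choose_method(mark)


class UnaryConstraint(Constraint):
    def choose_method(self, mark):
        return f"unary: {mark}"


class BinaryConstraint(Constraint):
    def choose_method(self, mark):
        return f"binary: {mark}"


print(UnaryConstraint().satisfy(1))
print(BinaryConstraint().satisfy(1))

try:
    Constraint().satisfy(1)  # the base class has no choose_method
except AttributeError as exc:
    print("base class:", exc)
```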

hp

PS: Pretty sure there's one "r" too many in UrnaryConstraint.



Re: Fwd: Friday finking: IDE 'macro expansions'

2023-03-18 Thread Peter J. Holzer
On 2023-03-18 16:06:49 +, Alan Gauld wrote:
> On 18/03/2023 12:15, Peter J. Holzer wrote:
> >> I think you might be meaning TurboPascal, Delphi's forerunner. It just
> >> had a compiler and text editor.
> > 
> > I'd still classify Turbo Pascal as an IDE. It wasn't a standalone
> > compiler you would invoke on source files you wrote with some other
> 
> It had both

I didn't mention that because I think it is irrelevant to the question
whether Turbo Pascal was an IDE or not.

What is relevant IMNSHO is that it did indeed provide an "integrated
environment" for "developing", combining all those tools which were
traditionally separate in one user interface.

> Indeed, but it was intrinsic to Delphi (even though you could
> write non GUI apps too, but they required extra effort.)
> Eclipse et al have GUI builders available as extras, in Delphi
> (and Lazurus) it is hard to avoid.

This is starting to sound like "Delphi is the only True™ IDE".

hp



Re: Debugging reason for python running unreasonably slow when adding numbers

2023-03-18 Thread Peter J. Holzer
On 2023-03-15 17:09:52 +, Weatherby,Gerard wrote:
> Sum is faster than iteration in the general case.

I'd say this is the special case, not the general case.

> def sum1():
>     s = 0
>     for i in range(100):
>         s += i
>     return s
> 
> def sum2():
>     return sum(range(100))

Here you already have the numbers you want to add.

The OP needed to compute those numbers first.

    hp



Re: Fwd: Friday finking: IDE 'macro expansions'

2023-03-18 Thread Peter J. Holzer
On 2023-03-18 08:46:42 +, Alan Gauld wrote:
> On 17/03/2023 17:55, Thomas Passin wrote:
> >> I used Delphi and Smalltalk/V which both pretty much only exist within
> >> their own IDEs and I used their features extensively.
> > 
> > Back when Delphi first came out, when I first used it, I don't remember 
> > any IDE; one just used a text editor.
> 
> I think you might be meaning TurboPascal, Delphi's forerunner. It just
> had a compiler and text editor.

I'd still classify Turbo Pascal as an IDE. It wasn't a standalone
compiler you would invoke on source files you wrote with some other
tool. It was a single program where you would write your code, compile
it, see the errors directly in the source code. I think it even had a
debugger which would also use the same editor window (Turbo C did).


> But Delphi from day 1 was an IDE designed to compete with Visual
> Basic. Everything was geared around the GUI builder.

Turbo Pascal predated GUIs, so it wouldn't have a GUI builder. Also not
everything you develop needs a GUI (in fact I haven't written a real
application (i.e. not a learning project) with a traditional desktop GUI
for 20 years) so the presence or absence of a GUI builder isn't an
essential criterion on whether something is or is not an IDE.

hp



Re: Debugging reason for python running unreasonably slow when adding numbers

2023-03-14 Thread Peter J. Holzer
On 2023-03-14 16:48:24 +0900, Alexander Nestorov wrote:
> I'm working on an NLP and I got bitten by an unreasonably slow
> behaviour in Python while operating with small amounts of numbers.
> 
> I have the following code:
[...]
>       # 12x slower than equivalent JS
>       sum_ = 0
>       for key in input:
>           v = weights[key]
>           sum_ += v
> 
>       # 20x slower than equivalent JS
>       #sum_ = reduce(lambda acc, key: acc + weights[key], input)

Not surprising. Modern JavaScript implementations have a JIT compiler.
CPython doesn't.

You may want to try PyPy if your code uses tight loops like that.

Or alternatively it may be possible to use numpy to do these operations.
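
Short of switching interpreters or pulling in numpy, it may also be worth measuring whether pushing the loop into sum() helps. This is only a sketch with made-up data (weights and keys are placeholders for your structures); whether it is noticeably faster for your input is something you would have to time yourself.

```python
weights = {"a": 0.5, "b": 1.5, "c": 2.0}  # placeholder data
keys = ["a", "c", "a", "b"]


def loop_sum(keys, weights):
    # The explicit loop from the original post.
    total = 0
    for key in keys:
        total += weights[key]
    return total


def generator_sum(keys, weights):
    # Same result, the additions happen inside sum().
    return sum(weights[key] for key in keys)


def map_sum(keys, weights):
    # Lookup and addition both happen without a Python-level loop body.
    return sum(map(weights.__getitem__, keys))


print(loop_sum(keys, weights), generator_sum(keys, weights), map_sum(keys, weights))
```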

hp



Re: Fast full-text searching in Python (job for Whoosh?)

2023-03-08 Thread Peter J. Holzer
On 2023-03-08 00:12:04 -0500, Thomas Passin wrote:
> On 3/7/2023 7:33 AM, Dino wrote:
> > in fact it's a dilemma I am facing now. My back-end returns 10
> > entries (I am limiting to max 10 matches server side for reasons you
> > can imagine). As the user keeps typing, should I restrict the
> > existing result set based on the new information or re-issue a API
> > call to the server? Things get confusing pretty fast for the user.
> > You don't want too many cooks in kitchen, I guess.
> > Played a little bit with both approaches in my little application.
> > Re-requesting from the server seems to win hands down in my case.
> > I am sure that them google engineers reached spectacular levels of UI
> > finesse with stuff like this.
> 
> Subject of course to trying this out, I would be inclined to send a much
> larger list of responses to the client, and let the client reduce the number
> to be displayed.  The latency for sending a longer list will be smaller than
> establishing a new connection or even reusing an old one to send a new,
> short list of responses.

That depends very much on how long that list can become. If it's 200
matches - sure, send them all, even if the client will display only 10
of them. Probably even for 2000. But if you might get 20 million matches
you surely don't want to send them all to the client.

hp



Re: Fast full-text searching in Python (job for Whoosh?)

2023-03-07 Thread Peter J. Holzer
On 2023-03-07 04:05:19 +, rbowman wrote:
> On Mon, 6 Mar 2023 21:55:37 -0500, Dino wrote:
> > ne issue that was also correctly foreseen by some is that there's going
> > to be a new request at every user key stroke. Known problem. JavaScript
> > programmers use a trick called "debounceing" to be reasonably sure that
> > the user is done typing before a request is issued:
> > 
> > https://schier.co/blog/wait-for-user-to-stop-typing-using-javascript
> 
> That could be annoying. My use case is address entry. When the user types

It can be. The delay is short but noticeable.

A somewhat smarter strategy is to send each query as soon as the user
hits a key but keep track of what you sent and received and discard
responses for obsolete requests (This is necessary because if you first
send "ma" and then "mas", the response to the first query might arrive
after the response to the second query and you don't want to display
"mansion" if the user already typed "mas".)
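
The bookkeeping for that is small. A sketch in Python (in a browser this would of course be JavaScript; the class and method names are invented): every request gets a sequence number, and a response is dropped if a response to a newer request has already been displayed.

```python
class LatestOnly:
    """Hypothetical helper that ignores responses to obsolete requests."""

    def __init__(self):
        self._next_id = 0
        self._shown_id = -1
        self.displayed = None

    def send(self, query):
        # In a real client this would also fire off the request.
        request_id = self._next_id
        self._next_id += 1
        return request_id

    def receive(self, request_id, results):
        if request_id < self._shown_id:
            return False  # a newer response is already on screen
        self._shown_id = request_id
        self.displayed = results
        return True


box = LatestOnly()
id_ma = box.send("ma")
id_mas = box.send("mas")
print(box.receive(id_mas, ["mast", "master"]))  # arrives first, is shown
print(box.receive(id_ma, ["mansion", "mast"]))  # arrives late, is discarded
print(box.displayed)
```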

hp



Re: Which more Pythonic - self.__class__ or type(self)?

2023-03-04 Thread Peter J. Holzer
On 2023-03-04 12:38:22 -0500, avi.e.gr...@gmail.com wrote:
> Of course each language has commonly used idioms as C with pointer
> arithmetic and code like *p++=*q++ but my point is that although I live near
> a  seaway and from where C originated, I am not aware of words like "c-way"
> or "scenic" as compared to the way people keep saying "pythonic".

Oh, you're talking about the term, not the concept? 

You may have something there. I remember lots of discussions about
"idiomatic C" or "idiomatic Perl", but not about "C-nic" (nice pun, btw)
or "Perlish" code. The Python community may be unique in having invented
an adjective for that.

hp



Re: Which more Pythonic - self.__class__ or type(self)?

2023-03-03 Thread Peter J. Holzer
On 2023-03-03 13:51:11 -0500, avi.e.gr...@gmail.com wrote:
> I do not buy into any concept about something being pythonic or not.
> 
> Python has grown too vast and innovated quite a  bit, but also borrowed from
> others and vice versa.
> 
> There generally is no universally pythonic way nor should there be. Is there
> a C way

Oh, yes. Definitely.

> and then a C++ way and an R way or JavaScript

JavaScript has a quite distinctive style. C++ is a big language (maybe
too big for a single person to grok completely) so there might be
several "dialects". I haven't seen enough R code to form an opinion.

> or does only python a language with a philosophy of what is the
> pythonic way?

No. Even before Python existed there was the adage "a real programmer
can write FORTRAN in any language", indicating that idiomatic usage of a
language is not governed by syntax and library alone, but there is a
cultural element: People writing code in a specific language also read
code by other people in that language, so they start imitating each
other, just like speakers of natural languages imitate each other.
Someone coming from another language will often write code which is
correct but un-idiomatic, and you can often guess which language they
come from (they are "writing FORTRAN in Python"). Also quite similar to
natural languages where you can guess the native language of an L2
speaker by their accent and phrasing.

hp



Re: How to escape strings for re.finditer?

2023-03-02 Thread Peter J. Holzer
On 2023-03-01 01:01:42 +0100, Peter J. Holzer wrote:
> On 2023-02-28 15:25:05 -0500, avi.e.gr...@gmail.com wrote:
> > I had no doubt the code you ran was indented properly or it would not work.
> > 
> > I am merely letting you know that somewhere in the process of copying
> > the code or the transition between mailers, my version is messed up.
> 
> The problem seems to be at your end. Jen's code looks ok here.
[...]
> I have no idea why it would join only some lines but not others.

Actually I do have an idea now, since I noticed something similar at
work today: Outlook has an option "remove additional line breaks from
text-only messages" (translated from German) in the "Email / Message
Format" section. You want to make sure this is off if you are reading
mails where line breaks might be important[1].

hp

[1] Personally I'd say you shouldn't use Outlook if you are reading
mails where line breaks (or other formatting) are important, but ...



Re: How to escape strings for re.finditer?

2023-02-28 Thread Peter J. Holzer
On 2023-03-01 01:01:42 +0100, Peter J. Holzer wrote:
> On 2023-02-28 15:25:05 -0500, avi.e.gr...@gmail.com wrote:
> > It happens to be easy for me to fix but I sometimes see garbled code I
> > then simply ignore.
> 
> Truth to be told, that's one reason why I rarely read your mails to the
> end. The long lines and the triple-spaced paragraphs make it just too
> uncomfortable.

Hmm, since I was now paying a bit more attention to formatting problems
I saw that only about half of your messages have those long lines
although all seem to be sent with the same mailer. Don't know what's
going on there.

    hp




Cryptic software announcements (was: ANN: DIPY 1.6.0)

2023-02-28 Thread Peter J. Holzer
[This isn't specifically about DIPY, I've noticed the same thing in
other announcements]

On 2023-02-28 13:48:56 -0500, Eleftherios Garyfallidis wrote:
> Hello all,
> 
> 
> We are excited to announce a new release of DIPY: DIPY 1.6.0 is out from
> the oven!

That's nice, but what is DIPY?


> In addition, registration for the oceanic DIPY workshop 2023 (April 24-28)
> is now open! Our comprehensive program is designed to equip you with the
> skills and knowledge needed to master the latest techniques and tools in
> structural and diffusion imaging.

Ok, so since the workshop is about ".., tools in structural and
diffusion imaging", DIPY is probably such a tool.

However, without this incidental announcement I wouldn't have any idea
what it is or if it would be worth my time clicking at any of the links.


I think it would be a good idea if software announcements would include
a single paragraph (or maybe just a single sentence) summarizing what
the software is and does.

hp



Re: How to escape strings for re.finditer?

2023-02-28 Thread Peter J. Holzer
On 2023-02-28 15:25:05 -0500, avi.e.gr...@gmail.com wrote:
> Jen,
> 
>  
> 
> I had no doubt the code you ran was indented properly or it would not work.
> 
>  
> 
> I am merely letting you know that somewhere in the process of copying
> the code or the transition between mailers, my version is messed up.

The problem seems to be at your end. Jen's code looks ok here.

The content type is text/plain, no format=flowed or anything which would
affect the interpretation of line endings. However, after
base64-decoding it only contains unix-style LF line endings, not CRLF
line endings. That might throw your mailer off, but I have no idea why
it would join only some lines but not others.
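
If you want to check this on your side, something along these lines should do. It is a sketch using the standard email package on a tiny hand-made message; the message here is constructed for the example and is not Jen's actual mail.

```python
import base64
from email import message_from_string

body = "def f():\n    return 1\n"  # stand-in for the posted code
raw = (
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + base64.b64encode(body.encode("utf-8")).decode("ascii")
    + "\n"
)

msg = message_from_string(raw)
payload = msg.get_payload(decode=True)   # bytes after base64 decoding

content_type = msg.get_content_type()    # text/plain
flowed = msg.get_param("format")         # None unless format=flowed is set
crlf = payload.count(b"\r\n")
bare_lf = payload.count(b"\n") - crlf

print(content_type, flowed, crlf, bare_lf)
```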

> It happens to be easy for me to fix but I sometimes see garbled code I
> then simply ignore.

Truth be told, that's one reason why I rarely read your mails to the
end. The long lines and the triple-spaced paragraphs make it just too
uncomfortable.

hp



Re: Why doesn't Python (error msg) tell me WHAT the actual (arg) values are ?

2023-02-25 Thread Peter J. Holzer
On 2023-02-25 21:58:18 +, Weatherby,Gerard wrote:
> I only use asserts for things I know to be true.

Yeah, that's what asserts are for. Or rather for things that you *think*
are true.

> In other words, a failing assert means I have a hole in my program
> logic.

Yes, if you include your assumptions in your definition of "logic".


> For that use, the default behavior –telling me which line the assert
> is on, is more than sufficient. Depending on the circumstance, I’ll
> re-run the code with a breakpoint or replace the assert with an
> informative f-string Exception.

That may not always be practical. Things that we know (or think) are
true often *are* true in most cases (otherwise we wouldn't think
so). So the case where the assumption fails may not be easily
reproducible and the more information you can get post-mortem the
better. For example, in C on Linux a failed assertion causes a core
dump. So you can inspect the complete state of the program.

hp



Re: Is there a more efficient threading lock?

2023-02-25 Thread Peter J. Holzer
On 2023-02-25 09:52:15 -0600, Skip Montanaro wrote:
> BLOB_LOCK = Lock()
> 
> def get_terms(text):
> with BLOB_LOCK:
> phrases = TextBlob(text, np_extractor=EXTRACTOR).noun_phrases
> for phrase in phrases:
> yield phrase
> 
> When I monitor the application using py-spy, that with statement is
> consuming huge amounts of CPU.

Another thought:

How accurate is py-spy? Is it possible that it assigns time actually
spent in 
phrases = TextBlob(text, np_extractor=EXTRACTOR).noun_phrases
to
with BLOB_LOCK:
?

hp



Re: Is there a more efficient threading lock?

2023-02-25 Thread Peter J. Holzer
On 2023-02-25 09:52:15 -0600, Skip Montanaro wrote:
> I have a multi-threaded program which calls out to a non-thread-safe
> library (not mine) in a couple places. I guard against multiple
> threads executing code there using threading.Lock. The code is
> straightforward:
> 
> from threading import Lock
> 
> # Something in textblob and/or nltk doesn't play nice with no-gil, so just
> # serialize all blobby accesses.
> BLOB_LOCK = Lock()
> 
> def get_terms(text):
> with BLOB_LOCK:
> phrases = TextBlob(text, np_extractor=EXTRACTOR).noun_phrases
> for phrase in phrases:
> yield phrase
> 
> When I monitor the application using py-spy, that with statement is
> consuming huge amounts of CPU.

Which OS is this?

> Does threading.Lock.acquire() sleep anywhere?

On Linux it calls futex(2), which does sleep if it can't get the lock
right away. (Of course if it does get the lock, it will return
immediately which may use a lot of CPU if you are calling it a lot.)
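
To get a feeling for the uncontended case you could time it in isolation. A rough sketch (the iteration count is arbitrary, and the number you get depends on the machine and the Python build, so I'm not quoting one):

```python
import timeit
from threading import Lock

lock = Lock()


def locked_noop():
    # Acquire and release with nobody else competing for the lock.
    with lock:
        pass


n = 100_000
seconds = timeit.timeit(locked_noop, number=n)
print(f"{seconds / n * 1e9:.0f} ns per uncontended acquire/release")
```

If that number is tiny compared to what py-spy attributes to the with statement, the time is probably spent waiting for other threads (or is misattributed), not in the lock itself.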

hp




Re: Why doesn't Python (error msg) tell me WHAT the actual (arg) values are ?

2023-02-25 Thread Peter J. Holzer
On 2023-02-25 09:10:06 -0500, Thomas Passin wrote:
> On 2/25/2023 1:13 AM, Peter J. Holzer wrote:
> > On 2023-02-24 18:19:52 -0500, Thomas Passin wrote:
> > > Sometimes you can use a second parameter to assert if you know what kind 
> > > of
> > > error to expect:
[...]
> > > With type errors, assert may actually give you the information needed:
> > > 
> > > > > > c = {"a": a, "b": 2}
> > > > > > assert a > c
> > > Traceback (most recent call last):
> > >File "", line 1, in 
> > > TypeError: '>' not supported between instances of 'list' and 'dict'
> > 
> > Actually in this case it isn't assert which gives you the information,
> > it's evaluating the expression itself. You get the same error with just
> >  a > c
> > on a line by its own.
> 
> In some cases.  For my example with an explanatory string, you wouldn't want
> to write code like that after an ordinary line of code, at least not very
> often.  The assert statement allows it syntactically.

Yes, but if an error in the expression triggers an exception (as in this
case) the explanatory string will never be displayed.
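
A quick demonstration, reusing the lists from your example (the message texts are made up): in the first assert the comparison itself raises, so the message is never evaluated; in the second the expression is merely false, so the message does show up.

```python
a = [1, 2, 3]
b = [4, 5]
c = {"a": a, "b": 2}

type_error = None
try:
    assert a > c, "a should be greater than c"
except TypeError as exc:
    type_error = exc  # raised by a > c, the message is lost
print(repr(type_error))

assertion_error = None
try:
    assert len(a) == len(b), f"len(a): {len(a)} != len(b): {len(b)}"
except AssertionError as exc:
    assertion_error = exc  # here the message is part of the exception
print(repr(assertion_error))
```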

hp



Re: Why doesn't Python (error msg) tell me WHAT the actual (arg) values are ?

2023-02-24 Thread Peter J. Holzer
On 2023-02-24 18:19:52 -0500, Thomas Passin wrote:
> On 2/24/2023 2:47 PM, dn via Python-list wrote:
> > On 25/02/2023 08.12, Peter J. Holzer wrote:
> > > On 2023-02-24 16:12:10 +1300, dn via Python-list wrote:
> > > > In some ways, providing this information seems appropriate.
> > > > Curiously, this does not even occur during an assert exception -
> > > > despite the value/relationship being the whole point of using
> > > > the command!
> > > > 
> > > >  x = 1
> > > >  assert x == 2
> > > > 
> > > > AssertionError (and that's it)
> 
> Sometimes you can use a second parameter to assert if you know what kind of
> error to expect:
> 
> >>> a = [1,2,3]
> >>> b = [4,5]
> >>> assert len(a) == len(b), f'len(a): {len(a)} != len(b): {len(b)}'
> Traceback (most recent call last):
>   File "", line 1, in 
> AssertionError: len(a): 3 != len(b): 2

Yup. That's very useful (but I tend to forget that).


> With type errors, assert may actually give you the information needed:
> 
> >>> c = {"a": a, "b": 2}
> >>> assert a > c
> Traceback (most recent call last):
>   File "", line 1, in 
> TypeError: '>' not supported between instances of 'list' and 'dict'

Actually in this case it isn't assert which gives you the information,
it's evaluating the expression itself. You get the same error with just
a > c
on a line by its own.

hp



Re: Why doesn't Python (error msg) tell me WHAT the actual (arg) values are ?

2023-02-24 Thread Peter J. Holzer
On 2023-02-25 08:47:00 +1300, dn via Python-list wrote:
> That said, have observed coders 'graduating' from other languages, making
> wider use of assert - assumed to be more data (value) sanity-checks than
> typing, but ...
> 
> Do you use assert frequently?

Not very often, but I do use it. Sometimes for its intended purpose
(i.e. to guard against bugs or wrong assumptions), sometimes just to
guard incomplete or sloppy code (e.g. browsing through some projects I
find
assert len(data["structure"]["dimensions"]["observation"]) == 1
(incomplete code - I didn't bother to implement multiple observations)
and
assert(header[0] == "Monat (MM)")
(the code below is sloppy. Instead of fixing it I just made the original
programmer's assumptions explicit)
and of course
assert False
(this point should never be reached)).

hp



Re: Why doesn't Python (error msg) tell me WHAT the actual (arg) values are ?

2023-02-24 Thread Peter J. Holzer
On 2023-02-24 16:12:10 +1300, dn via Python-list wrote:
> In some ways, providing this information seems appropriate. Curiously, this
> does not even occur during an assert exception - despite the
> value/relationship being the whole point of using the command!
> 
> x = 1
> assert x == 2
> 
> AssertionError (and that's it)

Pytest is great there. If an assertion in a test case fails it analyzes
the expression to give you various levels of details:

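    ============================= test session starts ==============================
    platform linux -- Python 3.10.6, pytest-6.2.5, py-1.10.0, pluggy-0.13.0
    rootdir: /home/hjp/tmp/t
    plugins: cov-3.0.0, anyio-3.6.1
    collected 1 item

    test_a.py F                                                              [100%]

    =================================== FAILURES ===================================
    ____________________________________ test_a ____________________________________

        def test_a():
            a = [1, 2, 3]
            b = {"a": a, "b": 2}

    >       assert len(a) == len(b)
    E       AssertionError: assert 3 == 2
    E        +  where 3 = len([1, 2, 3])
    E        +  and   2 = len({'a': [1, 2, 3], 'b': 2})

    test_a.py:7: AssertionError
    =========================== short test summary info ============================
    FAILED test_a.py::test_a - AssertionError: assert 3 == 2
    ============================== 1 failed in 0.09s ===============================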

hp



Re: Why doesn't Python (error msg) tell me WHAT the actual (arg) values are ?

2023-02-24 Thread Peter J. Holzer
On 2023-02-23 20:32:26 -0700, Michael Torrie wrote:
> On 2/23/23 01:08, Hen Hanna wrote:
> >  Python VM  is seeing an "int" object (123)   (and telling me that)
> >  ...   so it should be easy to print that "int" object What does
> >  Python VMknow ?   and when does it know it ?
> It knows there is an object and its name and type.  It knows this from
> the first moment you create the object and bind a name to it.
> > it seems like  it is being playful, teasing (or mean),and
> > hiding  the ball from me
> 
> Sorry you aren't understanding.  Whenever you print() out an object,
> python calls the object's __repr__() method to generate the string to
> display.  For built-in objects this is obviously trivial. But if you
> were dealing an object of some arbitrary class, there may not be a
> __repr__() method

Is this even possible? object has a __repr__ method, so all other
classes would inherit that if they don't define one themselves. I guess
it's possible to explicitly remove it ...

> which would cause an exception, or if the __repr__()
> method itself raised an exception,

Yup. That is possible and has happened to me several times - of course
always in a situation where I really needed that output ...
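
A small sketch of the two cases discussed here; the class names Plain and
Broken are made up for illustration. A class without its own __repr__
inherits object.__repr__, while a __repr__ that raises makes print() fail,
because print() goes through str(), which falls back to __repr__.

```python
class Plain:
    """Defines no __repr__, so object.__repr__ is inherited."""


class Broken:
    def __repr__(self):
        # Simulates a __repr__ that fails, e.g. on a half-initialised object.
        raise RuntimeError("repr failed")


plain_text = repr(Plain())  # something like "<...Plain object at 0x...>"
print(plain_text)

try:
    print(Broken())  # str() falls back to __repr__, which raises
except RuntimeError as exc:
    failure = str(exc)
else:
    failure = None

print("printing Broken() raised:", failure)
```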

hp



Re: Why doesn't Python (error msg) tell me WHAT the actual (arg) values are ?

2023-02-24 Thread Peter J. Holzer
On 2023-02-22 15:46:09 -0800, Hen Hanna wrote:
> On Wednesday, February 22, 2023 at 12:05:34 PM UTC-8, Hen Hanna wrote:
> > > py bug.py 
> > Traceback (most recent call last): 
> > File "C:\Usenet\bug.py", line 5, in <module>
> > print( a + 12 ) 
> > TypeError: can only concatenate str (not "int") to str 
> > 
> > 
> > Why doesn't Python (error msg) do the obvious thing and tell me 
> > WHAT the actual (offending, arg) values are ? 
> > 
> > In many cases, it'd help to know what string the var A had , when the error 
> > occurred. 
> >  i wouldn't have to put print(a) just above, to see. 
> > 
> > ( pypy doesn't do that either, but Python makes programming
> > (debugging) so easy that i hardly feel any inconvenience.)

That seems like a non-sequitur to me. If you hardly feel any
inconvenience, why argue so forcefully?
And why is pypy relevant here?


> i  see that my example   would be clearER  with this one-line  change:
> 
> 
>   >  py   bug.py
> 
>Traceback (most recent call last):
> 
>   File "C:\Usenet\bug.py", line 5, in <module>
>  map( Func,fooBar(  X,  Y,  X +  
> Y  ))
>  
>TypeError: can only concatenate str (not "int") to str
> 
> 
> i hope that   NOW   a few of you  can  see this as a genuine,  (reasonable)  
> question.

That doesn't seem a better example to me. There is still only one
subexpression (X + Y) where that error can come from, so I know that X
is a str and Y is an int.

A better example would be something like 

x = (a + b) * (c + d)

In this case it could be either (a + b) or (c + d) which caused the
error. But what I really want to know here is the names of the involved
variables, NOT the values. If the error message told me that the values
were 'foo' and 12.3, I still wouldn't be any wiser. The problem here of
course is that the operands aren't necessarily simple variables as in
this example - they may be arbitrarily complex expressions. However, it
might be sufficient to mark the operator which caused the exception:

|   ...
|   File "/home/hjp/tmp/./foo", line 4, in f
|     return (a + b) * (c + d)
|                         ^
| TypeError: can only concatenate str (not "int") to str

would tell me that (c + d) caused the problem and therefore that c must
be a str which it obviously shouldn't be.
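
For reference, a sketch that reproduces the situation; the function f and its
arguments are hypothetical. Python 3.11 and later include fine-grained error
locations in tracebacks (PEP 657), so when the source is available the
printed traceback marks the failing subexpression much as suggested above;
older versions print only the line.

```python
import traceback


def f(a, b, c, d):
    # Either addition could fail; only (c + d) does for the call below.
    return (a + b) * (c + d)


try:
    f(1, 2, "foo", 12)
except TypeError:
    report = traceback.format_exc()

# On Python 3.11+ run from a file, the source line in the report is
# followed by marker characters under the failing subexpression.
print(report)
```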

hp



Re: semi colonic

2023-02-24 Thread Peter J. Holzer
On 2023-02-23 15:56:54 -0500, avi.e.gr...@gmail.com wrote:
> I am not sure it is fair to blame JSON for a design choice.

We can't blame JSON (it has no agency), but as you say, it was a
choice. And we can absolutely blame Doug for making that choice!

hp



Re: File write, weird behaviour

2023-02-19 Thread Peter J. Holzer
On 2023-02-19 12:59:43 -0500, Thomas Passin wrote:
> On 2/19/2023 11:57 AM, Axy via Python-list wrote:
> > Looks like the data to be written is buffered, so actual write takes
> > place after readlines(), when close() flushes buffers.
> > 
> > See io package documentation, BufferedIOBase.
> > 
> > The solution is file.flush() after file.write()
> 
> Another possibility, from the Python docs:
> 
> "...TextIOWrapper (i.e., files opened with mode='r+') ... To disable
> buffering in TextIOWrapper, consider using the write_through flag for
> io.TextIOWrapper.reconfigure()"

That actually doesn't help (I tried it before writing my answer). The
binary layer below the text layer also buffers ...

> Also from the docs:
> 
> "Warning: Calling f.write() without using the with keyword or calling
> f.close() might result in the arguments of f.write() not being completely
> written to the disk, even if the program exits successfully."

He does call file.close():

> > > file.close()

so that doesn't seem relevant.

hp



Re: File write, weird behaviour

2023-02-19 Thread Peter J. Holzer
On 2023-02-19 16:57:02 +, Axy via Python-list wrote:
> Looks like the data to be written is buffered, so actual write takes place
> after readlines(), when close() flushes buffers.
> 
> See io package documentation, BufferedIOBase.
> 
> The solution is file.flush() after file.write()

Or alternatively, file.seek() to the intended position when switching
between reading and writing. (The C standard says you have to do this. I
can't find it in the Python docs, but apparently Python behaves the
same.)

> On 19/02/2023 14:03, Azizbek Khamdamov wrote:
> > Example 2 (weird behaviour)
> > 
> > file = open("D:\Programming\Python\working_with_files\cities.txt",
> > 'r+') ## contains list cities
> > # the following code DOES NOT add new record TO THE BEGINNING of the
> > file IF FOLLOWED BY readline() and readlines()# Expected behaviour:
> > new content should be added to the beginning of the file (as in
> > Example 1)
> > file.write("new city\n")

Also note that you can't ADD anything at the beginning (or in the
middle) of a file. You will overwrite existing content if you try this.
You can only add at the end of the file. If you want to insert
something, you have to rewrite everything from that position.

(So typically, for small files you wouldn't update a file in place, you
would just replace it completely. For large data sets which need to be
updated you would generally use some kind of database.)
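
A self-contained sketch of both points, using a temporary file with made-up
contents rather than the original cities.txt. Writing at the start of a file
opened with 'r+' overwrites the existing bytes; seeking before switching to
reading makes the result visible; and a real insert means rewriting the
whole file.

```python
import os
import tempfile

ORIGINAL = "Amsterdam\nBerlin\n"  # hypothetical file contents
OPTS = {"encoding": "utf-8", "newline": "\n"}  # no newline translation

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cities.txt")

    with open(path, "w", **OPTS) as f:
        f.write(ORIGINAL)

    # Writing in 'r+' mode starts at offset 0 and OVERWRITES what is there.
    with open(path, "r+", **OPTS) as f:
        f.write("new city\n")
        f.seek(0)  # reposition before switching from writing to reading
        overwritten = f.readlines()

    # To really insert at the beginning, read everything and rewrite it.
    with open(path, "w", **OPTS) as f:
        f.write(ORIGINAL)
    with open(path, "r", **OPTS) as f:
        old = f.read()
    with open(path, "w", **OPTS) as f:
        f.write("new city\n" + old)
    with open(path, "r", **OPTS) as f:
        inserted = f.readlines()

print(overwritten)  # "Amsterdam" is gone, only its newline survives
print(inserted)     # the new line precedes the untouched old content
```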

hp



Re: Comparing caching strategies

2023-02-18 Thread Peter J. Holzer
On 2023-02-18 15:59:32 -0500, Thomas Passin wrote:
> On 2/18/2023 2:59 PM, avi.e.gr...@gmail.com wrote:
> > I do not know the internals of any Roaring Bitmap implementation so all I
> > did gather was that once the problem is broken into accessing individual
> > things I chose to call zones for want of a more specific name, then each
> > zone is stored in one of an unknown number of ways depending on some logic.
> 
> Somewhat tangential, but back quite a while ago there was a C library called
> IIRC "julia list".

ITYM Judy arrays. They were mentioned here already.

> It implemented lists in several different ways, some quite
> sophisticated, depending on the size and usage pattern.  I remembered
> it as soon as I took a look at Roaring Bitmap and saw that the latter
> can use different representations depending on size and access
> patterns.

Yes, Roaring Bitmaps are somewhat similar. Judy arrays are more
versatile (they support more data types) and in many ways more
sophisticated, despite being 10+ years older. OTOH Roaring Bitmaps are a
lot simpler which may have contributed to their popularity.

hp



Re: Precision Tail-off?

2023-02-18 Thread Peter J. Holzer
On 2023-02-18 03:52:51 +, Oscar Benjamin wrote:
> On Sat, 18 Feb 2023 at 01:47, Chris Angelico  wrote:
> > On Sat, 18 Feb 2023 at 12:41, Greg Ewing via Python-list
> > > To avoid it you would need to use an algorithm that computes nth
> > > roots directly rather than raising to the power 1/n.
> > >
> >
> > It's somewhat curious that we don't really have that. We have many
> > other inverse operations - addition and subtraction (not just "negate
> > and add"), multiplication and division, log and exp - but we have
> > exponentiation without an arbitrary-root operation. For square roots,
> > that's not a problem, since we can precisely express the concept
> > "raise to the 0.5th power", but for anything else, we have to raise to
> > a fractional power that might be imprecise.
> 
> Various libraries can do this. Both SymPy and NumPy have cbrt for cube roots:

Yes, but that's a special case. Chris was talking about arbitrary
(integer) roots. My calculator has a button labelled [x√y], but my
processor doesn't have an equivalent operation. Come to think of it, it
doesn't even have a y**x operation - just some simpler operations
which can be used to implement it. GCC doesn't inline pow(y, x) on
x86/64 - it just calls the library function.
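
As an aside, for integers an exact arbitrary root needs no floating point at
all. The following is only a sketch with a hypothetical helper name (iroot);
it finds the floor of the k-th root by binary search on Python's
arbitrary-precision ints, so the inexact exponent 1/k never appears.
(For square roots the standard library already offers math.isqrt.)

```python
def iroot(n, k):
    """Return the largest integer r with r**k <= n (n >= 0, k >= 1)."""
    if n < 0 or k < 1:
        raise ValueError("need n >= 0 and k >= 1")
    lo = 0
    # n < 2**bit_length, so the root is below 2**(bit_length // k + 1).
    hi = 1 << (n.bit_length() // k + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** k <= n:
            lo = mid       # mid is still a valid candidate
        else:
            hi = mid - 1   # mid is too large
    return lo


print(iroot(27, 3))
print(iroot(10 ** 30, 3))
```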

hp



Re: Comparing caching strategies

2023-02-18 Thread Peter J. Holzer
On 2023-02-17 18:08:09 -0500, avi.e.gr...@gmail.com wrote:
> Analogies I am sharing are mainly for me to wrap my head around an idea by
> seeing if it matches any existing ideas or templates and is not meant to be
> exact. Fair enough?

Yeah. But if you are venting your musings into a public space you
shouldn't be surprised if people react to them. And we can only react to
what you write, not what you think.

> But in this case, from my reading, the analogy is rather reasonable.

Although that confused me a bit. You are clearly responding to something
I thought about but which you didn't quote below. Did I just think about
it and not write it, but you responded anyway because you're a mind
reader? Nope, it just turns out you (accidentally) deleted that sentence.


> The implementation of Roaring Bitmaps seems to logically break up the
> space of all possible values it looks up into multiple "zones" that
> are somewhat analogous to individual files,

That part is correct. But you presented that in the form of a
performance/space tradeoff, writing about "trying multiple methods" to
find the smallest, and that that makes compression slower. That may be
the case for pkzip, but it's not what RB does: Instead it uses a very
simple heuristic: If there are less than 25% of the bits set in a zone,
it uses a list of offsets, otherwise a plain bitmap. (according to
their 2016 paper which I just skimmed through again - maybe the
implementation is a bit more complex now). So I think your description
would lead the reader to anticipate problems which aren't there and
probably miss ones which are there. So I'll stay with my "not completely
wrong but not very helpful" assessment.
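
To make the heuristic concrete, here is a toy sketch. It is not the Roaring
Bitmap implementation: the function names, the 16-bit zone size, and the way
the 25% threshold is applied are all assumptions for illustration. Each zone
is stored either as a sorted list of offsets or as a plain bitmap, chosen by
a single density test rather than by trying several encodings.

```python
from collections import defaultdict

ZONE_BITS = 16               # assumption: each zone covers 2**16 values
ZONE_SIZE = 1 << ZONE_BITS


def build_zones(values, dense_fraction=0.25):
    """Group 32-bit ints by their high bits and pick a storage per zone."""
    groups = defaultdict(set)
    for v in values:
        groups[v >> ZONE_BITS].add(v & (ZONE_SIZE - 1))

    zones = {}
    for high, lows in groups.items():
        if len(lows) < dense_fraction * ZONE_SIZE:
            zones[high] = ("array", sorted(lows))   # sparse: list of offsets
        else:
            bitmap = 0
            for low in lows:
                bitmap |= 1 << low                   # dense: one bit per value
            zones[high] = ("bitmap", bitmap)
    return zones


def contains(zones, v):
    kind, data = zones.get(v >> ZONE_BITS, ("array", []))
    low = v & (ZONE_SIZE - 1)
    if kind == "array":
        return low in data
    return bool((data >> low) & 1)


sparse = build_zones([1, 70000, 70001])
dense = build_zones(range(ZONE_SIZE))
print(sparse)
print(dense[0][0], contains(dense, 12345), contains(sparse, 2))
```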


> I did not raise the issue and thus have no interest in promoting this
> technology nor knocking it down. Just wondering what it was under the hood
> and whether I might even have a need for it. I am not saying Social Security
> numbers are a fit, simply that some form of ID number might fit.

Yeah, that's the point: Any form of ID which is a small-ish integer
number fits.

And maybe it's just because I work with databases a lot, but
representing things with numeric ids (in relational databases we call
them "surrogate keys") is such a basic operation that it doesn't warrant
more than a sentence or two.


> If a company has a unique ID number for each client and uses it
> consistently, then an implementation that holds a set stored this way
> of people using product A, such as house insurance, and those using
> product B, such as car insurance, and perhaps product C is an umbrella
> policy, might easily handle some queries such as who uses two or all
> three (intersections of sets) or who might merit a letter telling them
> how much they can save if they subscribed to two or all three as a way
> to get more business. Again, just  a made-up example I can think
> about. A company which has a million customers over the years will
> have fairly large sets as described. 

A much better example. This is indeed how you would use roaring bitmaps.
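
A sketch of those queries, with Python's built-in set standing in for a
roaring bitmap (the set operations are the same idea; only the storage
differs). The customer IDs and product names are invented.

```python
# Hypothetical customer IDs per product.
house = {1001, 1002, 1003, 1007}
car = {1002, 1003, 1004}
umbrella = {1003, 1005}

# Customers holding all three products.
all_three = house & car & umbrella

# Customers holding at least two products.
at_least_two = (house & car) | (house & umbrella) | (car & umbrella)

# Customers with exactly one product: candidates for the savings letter.
only_one = (house | car | umbrella) - at_least_two

print(sorted(all_three))
print(sorted(at_least_two))
print(sorted(only_one))
```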


> What is helpful to me in thinking about something will naturally often not
> be helpful to you or others but nothing you wrote makes me feel my first
> take was in any serious way wrong. It still makes sense to me.
> 
> And FYI, the largest integer in signed 32 bits is 2_147_483_647

I know. It's been some time since I could do hexadecimal arithmetic
in my head but the limits of common data types are still burned into
my brain ;-).

> which is 10 digits. A Social Security number look like xxx-xx- at
> this time which is only 9 digits.

These are US social security numbers. Other countries have other
schemes. For example, Austrian SSNs have 10 digits, so you would need 34
bits to represent them exactly. However, they (obviously) contain some
redundancy (one of the digits is a checksum, and there aren't 99
days in a century) so it could algorithmically be compressed down to 26
bits. But you probably wouldn't do that because in almost any real
application you wouldn't use the SSN as a primary key (some people don't
have one, and there have been mixups resulting in two people getting the
same SSN).
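
The bit counts are easy to check; this short sketch only verifies the
arithmetic for arbitrary 9-digit and 10-digit numbers and says nothing about
the actual SSN formats.

```python
INT32_MAX = 2 ** 31 - 1

digits_in_int32_max = len(str(INT32_MAX))       # decimal digits of 2_147_483_647
bits_for_9_digits = (10 ** 9 - 1).bit_length()   # largest 9-digit number
bits_for_10_digits = (10 ** 10 - 1).bit_length() # largest 10-digit number

print(INT32_MAX, digits_in_int32_max)
print(bits_for_9_digits, bits_for_10_digits)
```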


> As for my other EXAMPLE, I fail to see why I need to provide a specific need
> for an application. I don't care what they need it for. The thought was
> about whether something that does not start as an integer can be uniquely
> mapped into and out of integers of size 32 bits.

That's what confused me. You seemed to concentrate on the "map things to
integers" part which has been solved for decades and is absolutely
incidental to roaring bitmaps and completely ignored what you would be
using them for.

So I thought I was missing something, but it seems I wasn't.

hp


Re: Precision Tail-off?

2023-02-17 Thread Peter J. Holzer
On 2023-02-17 14:39:42 +, Weatherby,Gerard wrote:
> IEEE did not define a standard for floating point arithmetics. They
> designed multiple standards, including a decimal float point one.
> Although decimal floating point (DFP) hardware used to be
> manufactured, I couldn’t find any current manufacturers.

Doesn't IBM make them any more? Their POWER processors used to implement
decimal FP (starting with POWER8, if I remember correctly).

hp



Re: Precision Tail-off?

2023-02-17 Thread Peter J. Holzer
On 2023-02-17 10:27:08 +, Stephen Tucker wrote:
> This is a hugely controversial claim, I know, but I would consider this
> behaviour to be a serious deficiency in the IEEE standard.
> 
> Consider an integer N consisting of a finitely-long string of digits in
> base 10.
> 
> Consider the infinitely-precise cube root of N (yes I know that it could
> never be computed

However, computers exist to compute. Something which can never be
computed is outside of the realm of computing.

> unless N is the cube of an integer, but this is a mathematical
> argument, not a computational one), also in base 10. Let's call it
> RootN.
> 
> Now consider appending three zeroes to the right-hand end of N (let's call
> it NZZZ) and NZZZ's infinitely-precise cube root (RootNZZZ).
> 
> The *only *difference between RootN and RootNZZZ is that the decimal point
> in RootNZZZ is one place further to the right than the decimal point in
> RootN.

No. In mathematics there is no such thing as a decimal point. The only
difference is that RootNZZZ is RootN*10. But there is nothing special
about 10. You could multiply your original number by 512 and then the
new cube root would differ by a factor of 8 (which would show up as
shifted "binary point"[1] in binary but completely different digits in
decimal) or you could multiply by 1728 and then you would need base 12
to get the same digits with a shifted "duodecimal point".
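
A sketch of the binary case, using an arbitrary sample value: multiplying a
float by 8 (a power of two) leaves the binary significand untouched and only
moves the "binary point", whereas multiplying by 10 produces a different
significand.

```python
import math

x = 4.979338592181744  # sample value, roughly the cube root of 123.456789

m, e = math.frexp(x)          # x == m * 2**e with 0.5 <= m < 1
m8, e8 = math.frexp(x * 8)    # scaling by 2**3: same m, exponent + 3
m10, e10 = math.frexp(x * 10) # scaling by 10: significand changes

print(x.hex())
print((x * 8).hex())
print((x * 10).hex())
```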

hp

[1] It's really unfortunate that the point which separates the integer
and the fractional part of a number is called a "decimal point" in
English. Makes it hard to talk about non-integer numbers in other
bases.



Re: Precision Tail-off?

2023-02-17 Thread Peter J. Holzer
On 2023-02-17 08:38:58 -0700, Michael Torrie wrote:
> On 2/17/23 03:27, Stephen Tucker wrote:
> > Thanks, one and all, for your reponses.
> > 
> > This is a hugely controversial claim, I know, but I would consider this
> > behaviour to be a serious deficiency in the IEEE standard.
> 
> No matter how you do it, there are always tradeoffs and inaccuracies
> moving from real numbers in base 10 to base 2.

This is phrased ambiguously. So just to clarify:

Real numbers are not in base 10. Or base 2 or base 37 or base e. A
positional system (which uses a base) is just a convenient way to write
a small subset of real numbers. By using any base you limit yourself to
rational numbers (no e or π or √2) and in fact only those rational
numbers where the denominator is a power of the base.

Converting numbers from one base to another with any finite precision
will generally involve rounding - so do that as little as possible.


> That's just the nature of the math.  Any binary floating point
> representation is going to have problems.

Any decimal floating point representation is also going to have
problems.

There is nothing magical about base 10. It's just what we are used to
(which also means that we are used to the rounding errors and aren't
surprised by them as much).
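
A short sketch of both halves of this argument, using only the standard
library: the binary float 0.1 is not exactly 1/10, and a decimal float with
Python's default 28-digit context cannot hold 1/3 exactly either.

```python
from decimal import Decimal
from fractions import Fraction

# Binary floating point: 1/10 has no finite base-2 expansion.
binary_tenth = Fraction(0.1)          # exact value of the float 0.1
exact_tenth = Fraction(1, 10)

# Decimal floating point: 1/3 has no finite base-10 expansion.
decimal_third = Decimal(1) / Decimal(3)
round_trip = decimal_third * 3        # falls just short of 1

print(binary_tenth == exact_tenth)
print(decimal_third, round_trip)
```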

> Also we weren't clear on this, but the IEEE standard is not just
> implemented in software. It's the way your CPU represents floating point
> numbers in silicon.  And in your GPUs (where speed is preferred to
> precision).  So it's not like Python could just arbitrarily do something
> different unless you were willing to pay a huge penalty for speed.

I'm pretty sure that compared to the interpreter overhead of CPython the
overhead of a software FP implementation (whether binary or decimal)
would be rather small, maybe negligible.


> > Perhaps this observation should be brought to the attention of the IEEE. I
> > would like to know their response to it.
> Rest assured the IEEE committee that formalized the format decades ago
> knew all about the limitations and trade-offs.  Over the years CPUs have
> increased in capacity and now we can use 128-bit floating point numbers

The very first IEEE compliant processor (the Intel 8087) had an 80 bit
extended type (in fact it did all computations in 80 bit and only
rounded down to 64 or 32 bits when storing the result). By the 1990s, 96
and 128 bit was quite common.

> which mitigate some of the accuracy problems by simply having more
> binary digits. But the fact remains that some rational numbers in
> decimal are irrational in binary,

Be careful: "Rational" and "irrational" have a standard meaning in
mathematics and it's independent of base.

hp



Re: Comparing caching strategies

2023-02-17 Thread Peter J. Holzer
On 2023-02-17 00:07:12 -0500, avi.e.gr...@gmail.com wrote:
> Roaring bitmaps claim to be an improvement not only over uncompressed
> structures but some other compressed versions but my reading shows it
> may be limited to some uses. Bitsets in general seem to be useful only
> for a largely contiguous set of integers where each sequential bit
> represents whether the nth integer above the lowest is in the set or
> not.

They don't really have to be that contiguous. As long as your integers
fit into 32 bits you're fine.

> Of course, properly set up, this makes Unions and Intersections
> and some other operations fairly efficient. But sets are not the same
> as dictionaries and often you are storing other data types than
> smaller integers.

Of course. Different data types are useful for different applications.

> Many normal compression techniques can require lots of time to
> uncompress to find anything. My impression is that Roaring Bitmaps is
> a tad like the pkzip software that tries various compression
> techniques on each file and chooses whatever seems to work better on
> each one. That takes extra time when zipping, but when unzipping a
> file, it goes directly to the method used to compress it as the type
> is in a header and just expands it one way.

While not completely wrong, I don't think this comparison is very
helpful.


> My impression is that Roaring bitmaps breaks up the range of integers
> into smaller chunks and depending on what is being stored in that
> chunk, may leave it as an uncompressed bitmap, or a list of the sparse
> contents, or other storage methods and can search each version fairly
> quickly. 

It's been a few years since I looked at the implementation, but that's
the gist of it.


> So, I have no doubt it works great for some applications such as
> treating social security numbers as integers.

Not sure what you mean here. You mean storing sets of social security
numbers as roaring bitmaps? You might be able to do that (Last time I
looked RB's were limited to 32 bits, which may not be enough to
represent SSNs unmodified), but are set operations on SSNs something you
routinely do?

> It likely would be overkill to store something like the components of
> an IP address between 0 and 255 inclusive.
> 
> But having said that, there may well be non-integer data that can be
> mapped into and out of integers. As an example, airports or radio
> stations have names like LAX or WPIX. If you limit yourself to ASCII
> letters then every one of them can be stored as a 32-bit integer,
> perhaps with some padding.

Again: What would be the application?

hp



Re: Precision Tail-off?

2023-02-17 Thread Peter Pearson
On Fri, 17 Feb 2023 10:27:08, Stephen Tucker wrote:[Head-posting undone.]
> On Thu, Feb 16, 2023 at 6:49 PM Peter Pearson 
> wrote:
>> On Tue, 14 Feb 2023 11:17:20 +, Oscar Benjamin wrote:
>> > On Tue, 14 Feb 2023 at 07:12, Stephen Tucker 
>> wrote:
>> [snip]
>> >> I have just produced the following log in IDLE (admittedly, in Python
>> >> 2.7.10 and, yes I know that it has been superseded).
>> >>
>> >> It appears to show a precision tail-off as the supplied float gets
>> bigger.
>> [snip]
>> >>
>> >> For your information, the first 20 significant figures of the cube root
>> in
>> >> question are:
>> >>49793385921817447440
>> >>
>> >> Stephen Tucker.
>> >> --
>> >> >>> 123.456789 ** (1.0 / 3.0)
>> >> 4.979338592181744
>> >> >>> 1234567890. ** (1.0 / 3.0)
>> >> 49793385921817.36
>> >
>> > You need to be aware that 1.0/3.0 is a float that is not exactly equal
>> > to 1/3 ...
>> [snip]
>> > SymPy again:
>> >
>> > In [37]: a, x = symbols('a, x')
>> >
>> > In [38]: print(series(a**x, x, Rational(1, 3), 2))
>> > a**(1/3) + a**(1/3)*(x - 1/3)*log(a) + O((x - 1/3)**2, (x, 1/3))
>> >
>> > You can see that the leading relative error term from x being not
>> > quite equal to 1/3 is proportional to the log of the base. You should
>> > expect this difference to grow approximately linearly as you keep
>> > adding more zeros in the base.
>>
>> Marvelous.  Thank you.
[snip]
> Now consider appending three zeroes to the right-hand end of N (let's call
> it NZZZ) and NZZZ's infinitely-precise cube root (RootNZZZ).
>
> The *only *difference between RootN and RootNZZZ is that the decimal point
> in RootNZZZ is one place further to the right than the decimal point in
> RootN.
>
> None of the digits in RootNZZZ's string should be different from the
> corresponding digits in RootN.
>
> I rest my case.
[snip]


I believe the pivotal point of Oscar Benjamin's explanation is
that within the constraints of limited-precision binary floating-point
numbers, the exponent of 1/3 cannot be represented precisely, and
is in practice represented by something slightly smaller than 1/3;
and accordingly, when you multiply your argument by 1000, its
not-quite-cube-root gets multiplied by something slightly smaller
than 10, which is why the number of figures matching the "right"
answer gets steadily smaller.

Put slightly differently, the crux of the problem lies not in the
complicated process of exponentiation, but simply in the failure
to represent 1/3 exactly.  The fact that the exponent is slightly
less than 1/3 means that you would observe the steady loss of
agreement that you report, even if the exponentiation process
were perfect.
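
A sketch of that argument in numbers. The bases below are hypothetical
inputs, not the ones from the original log. The stored exponent falls short
of 1/3 by a tiny fixed amount, and the first-order relative error of the
result is that shortfall times the natural log of the base, so it grows
steadily as zeros are appended.

```python
import math
from fractions import Fraction

stored = Fraction(1.0 / 3.0)          # exact value of the float exponent
shortfall = Fraction(1, 3) - stored   # how far below 1/3 it lies

print("shortfall:", float(shortfall))

for zeros in (0, 9, 18, 27):
    base = 123456789 * 10 ** zeros
    # First-order estimate: a**x is too small by about shortfall * ln(a).
    predicted = float(shortfall) * math.log(base)
    print(f"{zeros:2d} zeros appended: predicted relative error {predicted:.3e}")
```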

-- 
To email me, substitute nowhere->runbox, invalid->com.

