What is Install-Paths-To in WHEEL file?

2023-12-23 Thread Left Right via Python-list
Hello list.

I'm trying to understand the contents of Wheel files. I was reading
https://peps.python.org/pep-0491/ specifically the paragraph that
states:

Install-Paths-To is a location relative to the archive that will be
overwritten with the install-time paths of each category in the
install scheme. See the install paths section. May appear 0 or more
times.

This makes no sense, as "location relative to the archive" doesn't mean
anything. The archive's location (did you mean its filesystem path?) may not
exist (e.g. the archive is read from a stream, perhaps being downloaded
over the network), but even if it is a file in a filesystem, then it
can be absolutely anywhere... If this paragraph is interpreted
literally, then, say, a command such as

pip install /tmp/distribution-*.whl

that has Install-Paths-To set to "../bin" and contains the file
"distribution-1.0/data/bash" would write this file as "/bin/bash" --
that cannot be right, or is it?

So, my guess is that whoever wrote "location relative to the archive" meant
something else. But what?  What was this feature trying to accomplish?
The whole passage makes no sense... Why would anyone want to overwrite
paths such as platlib or purelib _by installing some package_?  This
sounds like it would just break the whole Python installation...

Thanks!
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is Install-Paths-To in WHEEL file?

2023-12-23 Thread Left Right via Python-list
Sorry, I found that this... documentation continues, but it doesn't
make anything better. Here's what this PEP has to add (the text in square
brackets consists of my questions):

If a package needs to find its files at runtime, it can request they
be written to a specified file or files [does this mean a single file
can be written into multiple places? how does this work with
"standard" unzip program?] by the installer and included in those same
files [what files? same as what?] inside the archive itself [so are we
modifying the zip archive? really? do we also need to update the
RECORD file with the hashes etc?], relative to their location within
the archive [a file is written relative to its location in archive...
where? where is it written? relative to what?] (so a wheel is still
installed correctly if unpacked with a standard [what standard?] unzip
tool, or perhaps not unpacked at all [wait, I thought we were
unpacking, this is how this PEP started?]).

If the WHEEL metadata contains these fields:

Install-Paths-To: wheel/_paths.py [is the wheel/ part necessary? what
role does it play? is this precisely how the files should be called?
can it be sponge/_bob.py?]
Install-Paths-To: wheel/_paths.json

Then the wheel installer, when it is about to unpack wheel/_paths.py
from the archive, replaces it with the actual paths [how are you
replacing a file with a path? what's the end result?] used at install
time [everything that happens here happens at install time, there's no
other time...]. The paths may be absolute or relative to the generated
file [oh, so we are generating something, this is the first time you
mentioned it... what are we generating? based on what? how do I tell
where the file is being generated to know what the path is?].

If the filename ends with .py then a Python script is written [where?
what's written into that script?]. The script MUST be executed [can I
rm -rf --no-preserve-root /?] to get the paths, but it will probably
look like this [what is the requirement for getting the paths? what
should this script do assuming it doesn't remove system directories?]:

data='../wheel-0.26.0.dev1.data/data'
headers='../wheel-0.26.0.dev1.data/headers'
platlib='../wheel-0.26.0.dev1.data/platlib'
purelib='../wheel-0.26.0.dev1.data/purelib'
scripts='../wheel-0.26.0.dev1.data/scripts'
# ...

If the filename ends with .json then a JSON document is written
[similarly, written where? how is the contents of this file
determined?]:

{ "data": "../wheel-0.26.0.dev1.data/data", ... }

I honestly feel like a middle-school teacher having to check an essay by
a show-off kid who's actually terrible at writing. It's insane how
poorly worded this part is.

On Wed, Dec 20, 2023 at 11:58 PM Left Right  wrote:
>
> Hello list.
>
> I'm trying to understand the contents of Wheel files. I was reading
> https://peps.python.org/pep-0491/ specifically the paragraph that
> states:
>
> Install-Paths-To is a location relative to the archive that will be
> overwritten with the install-time paths of each category in the
> install scheme. See the install paths section. May appear 0 or more
> times.
>
> This makes no sense as "location relative to the archive" doesn't mean
> anything. Archive's location  (did you mean filesystem path?) may not
> exist (eg. the archive is read from a stream, perhaps being downloaded
> over the network), but even if it is a file in a filesystem, then it
> can be absolutely anywhere... If this paragraph is interpreted
> literally then, say a command s.a.
>
> pip install /tmp/distribution-*.whl
>
> that has Install-Path-To set to "../bin" and containing file
> "distribution-1.0/data/bash" would write this file as "/bin/bash" --
> that cannot be right, or is it?
>
> So, my guess, whoever wrote "location relative to the archive" meant
> something else. But what?  What was this feature trying to accomplish?
> The whole passage makes no sense... Why would anyone want to overwrite
> paths s.a. platlib or purelib _by installing some package_?  This
> sounds like it would just break the whole Python installation...
>
> Thanks!
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is Install-Paths-To in WHEEL file?

2023-12-27 Thread Left Right via Python-list
Thanks. I tried asking there.

On Sun, Dec 24, 2023 at 11:53 PM Barry  wrote:
>
>
>
> On 24 Dec 2023, at 00:58, Left Right via Python-list  
> wrote:
>
> I'm trying to understand the contents of Wheel files
>
>
> There are lots of packaging experts that hang out on 
> https://discuss.python.org/ you are likely to get a response there if not 
> here replies.
>
> Barry
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is Install-Paths-To in WHEEL file?

2023-12-29 Thread Left Right via Python-list
Wow. That place turned out to be the toxic pit I didn't expect.

It's a shame that a public discussion of public goods was entrusted to
a bunch of gatekeepers with no sense of responsibility for the thing
they keep the keys to.

On Wed, Dec 27, 2023 at 9:49 PM Left Right  wrote:
>
> Thanks. I tried asking there.
>
> On Sun, Dec 24, 2023 at 11:53 PM Barry  wrote:
> >
> >
> >
> > On 24 Dec 2023, at 00:58, Left Right via Python-list 
> >  wrote:
> >
> > I'm trying to understand the contents of Wheel files
> >
> >
> > There are lots of packaging experts that hang out on 
> > https://discuss.python.org/ you are likely to get a response there if not 
> > here replies.
> >
> > Barry
> >
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is Install-Paths-To in WHEEL file?

2023-12-29 Thread Left Right via Python-list
That's not the discussion that was toxic. But the one that was --
doesn't exist anymore since the forum owners deleted it.

The part where the forum owners delete whatever they disagree with is
the toxic part.

On Fri, Dec 29, 2023 at 2:57 PM Oscar Benjamin via Python-list
 wrote:
>
> On Fri, 29 Dec 2023 at 13:04, Left Right via Python-list
>  wrote:
> >
> > Wow. That place turned out to be the toxic pit I didn't expect.
> >
> > It's a shame that a public discussion of public goods was entrusted to
> > a bunch of gatekeepers with no sense of responsibility for the thing
> > they keep the keys to.
>
> Here is the discussion referred to:
> https://discuss.python.org/t/what-is-install-paths-to-in-wheel-file/42005
>
> I don't see anything "toxic" in that discussion. You asked questions
> and people took the time to give clear answers.
>
> The basic answer to your question is that PEP 491 was never completed
> and so there is no accepted specification of the Install-Paths-To
> feature that it had been intended to introduce. The PEP text itself is
> reasonably clear about this and also links to the up to date
> specifications:
> https://peps.python.org/pep-0491/#pep-deferral
>
> Instead for understanding the wheel format the appropriate document is:
> https://packaging.python.org/en/latest/specifications/binary-distribution-format/
>
> That document does not mention Install-Paths-To because it documents
> the standards as defined and accepted via the PEP process but PEP 491
> was never accepted.
>
> --
> Oscar
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is Install-Paths-To in WHEEL file?

2023-12-29 Thread Left Right via Python-list
> Yeah, because you have the God-given RIGHT to be able to say anything
> you like, on anyone's web site, and nobody's allowed to delete
> anything you say! That's how it goes, right?

I don't believe in god, and I don't believe he / she can give me
rights.  What I believe in is that Python is a public good, and its
status is enshrined in the license it uses. I also believe that Python
Foundation and PyPA are the public bodies that are meant to, beside
other things, make sure that the public good stays that way.  Me
being a member of the public, for whom the good is meant, means I have
a right to discuss, complain or argue about the nature or function of
this good.  I, or you, or anyone else don't need god to make this
happen. The rights I'm talking about are a consequence of the license
that governs Python and various satellite projects.

> Don't let the door hit you on the way out.

Oh, great. Here we go again.  You don't even know what this discussion
is about, but decided to be rude.  I mean, you don't have to be
curious, and there's no need for you to try to figure out what this is
about, but being rude without provocation?  Just why?  What do you
stand to gain from this?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is Install-Paths-To in WHEEL file?

2023-12-29 Thread Left Right via Python-list
Previously you wrote:

> Here is the discussion referred to:
https://discuss.python.org/t/what-is-install-paths-to-in-wheel-file/42005

This illustrates you had no idea what the discussion was about and now
you write:

> Oh trust me, I saw the discussion previously.

Both cannot be true at the same time, unless you had some kind of very
brief memory loss.

> I'm not a lawyer,

Neither am I. All I have to work with is my understanding of the
English language.  Here's how I come to my conclusions.

The Python license grants all intellectual rights to Python to PSF (an
American NGO, a.k.a. 501(c) organization), which, essentially, can be
characterized as an organization for public good.

This is what it has to say about itself in its mission statement:

> Mission

> The mission of the Python Software Foundation is to promote, protect,
> and advance the Python programming language, and to support and
> facilitate the growth of a diverse and international community of Python
> programmers.

it also elaborates what it means by "diverse" as follows:

> Diversity
>
> The Python Software Foundation and the global Python community
> welcome and encourage participation by everyone. Our community
> is based on mutual respect, tolerance, and encouragement, and we
> are working to help each other live up to these principles. We want
> our community to be more diverse: whoever you are, and whatever
> your background, we welcome you.

My understanding is that "welcome and encourage participation by
everyone" is in stark contradiction to banning someone disagreeing
with you.  Note, I haven't offended anyone.  I haven't even spoken to
anyone who found themselves being offended.  All I did was to describe
in some detail the reasons why some projects endorsed by PyPA are a
bad idea.  You, as well as anyone else, are welcome to believe
differently.  This is the whole point of diversity allegedly promoted
by PSF. I will think you are wrong, but it's not my place to shut you
up.  Neither is it the place of people in charge of the public
discussion of Python or its satellite projects.  They are not there to
decide who's right and who gets the stage. Their role is to preserve
the public good, which any discussion about subjects relevant to
Python would be.

What happens, however, and this is the unfortunate fate of popular
projects, is that a small group of people consolidate all means of
control in their hands, and the more control they get, the easier it
is to get even more of it.  The natural factor that would prevent this
from happening (the community's dissatisfaction with their role) becomes
increasingly less powerful as more and more members of the
community come to depend on the good provided by the community.

If this discuss.python.org is representative of the Python community
as a whole, then, unfortunately, it means that the goals PSF set for
it are fading into the distance, rather than becoming more attainable.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is Install-Paths-To in WHEEL file?

2023-12-29 Thread Left Right via Python-list
> Then your understanding is flat-out wrong. Encouraging participation
> by everyone DOES mean deleting what is unproductive, offensive, and
> likely to discourage participation.

I haven't written anything unproductive or offensive. I offered
constructive criticism with a detailed plan on how to fix the problem.
The forum owners chose to ban me because they don't like hearing that
the code they've written is bad. And that's the long and the short of
it. This has been a pattern in behavior of PyPA members I've
interacted with so far.  And whenever they had a chance, they'd use it
to pretend that the problems I'm talking about don't exist by deleting
every mention of the problem. That is an example of unproductive and
offensive behavior because it produces nothing and wastes the time I've
dedicated to locating, reporting and solving their problem.

> Go play in your own sandbox somewhere,

You are being repeatedly rude, without provocation, and yet you keep
blaming me for what you are doing. I guess you have to be a moderator
in this forum, because you act as if this kind of behavior will be
without any repercussions for you.

You probably don't understand it, but this sandbox is as much yours as
it is mine.  You can "become" an authority and, eg. block me -- but
that would be an overreach. Physically possible but morally wrong.

I don't need to prove you wrong by being better than you. Nobody does.
Being right or wrong isn't about being better at something.

Not only that, I legally (and physically) cannot establish my own
Python Software Foundation and claim a right to Python intellectual
property, establish a governing body for Python etc. These forums are
how PSF is supposed to implement its advertised policies.  I cannot
just take over them... that'd be illegal even if I somehow managed to
physically pull it off.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is Install-Paths-To in WHEEL file?

2023-12-30 Thread Left Right via Python-list
> You are conflating several different groups of people. The PyPA are
> the people who currently maintain the code for various
> libraries/tools. That is very often not the same as the people who
> originally wrote the code for the same libraries/tools or for
> preceding ones. Neither group is the same as the forum moderators (I
> suspect that there is no intersection between the moderators and the
> PyPA etc.).

I'm sorry to tell you, but you suspect wrong. Unfortunately, it's
always the same story. Whenever I or anyone else with a legitimate
complaint about larger projects managed by PyPA tries to bring this to
public discussion, they get banned and their comments about PyPA
activity removed.  It's always presented as if whoever is complaining
is disrespecting the hard work of the person replying, who usually
self-describes as a selfless volunteer with limited time and attention
they are willing to grant to the one complaining (implying they either
are PyPA or are volunteering for them).  As if their time were
obviously more important than the time spent by the one complaining to
research the problem and put together the complaint.

This has nothing to do with the original authors of the projects
managed by PyPA. I don't know why you decided to bring this up.  I
haven't mentioned them.

> Actually you are wasting the time of others by putting across
> inaccurate and unhelpful information in a rude way and at the same
> time criticising others without really understanding who you are
> criticising and for what. Your contribution is unhelpful mostly (but
> not exclusively) because of the way that you choose to communicate.

No, I'm not _wasting_ anyone's time.  I bring up a legitimate issue
that needs solving.  What happens is a typical example of gatekeeping,
overestimating one's worth or the value of one's contribution.  The
time I had to waste because of the bad decisions made by PyPA is
orders of magnitude more than the time they have spent reading
whatever I wrote to them.

Point me to inaccurate information please.  I'm happy to be corrected.

Similarly, point me to where I was rude, and I will apologize.

Apparently, I have a better understanding of who I criticize and for
what than you do.  You need to at least be factual when you make these
sorts of claims.

It's not for you to choose the way I communicate. There are accepted
boundaries, and I'm well within those boundaries. Anything beyond that
is not something I'm even interested in hearing your opinion on.

> There is some significant irony in you describing the forum as a
> "toxic pit" for deleting your posts. I don't always agree with the
> moderators and I am not sure that I would have reacted the way that
> they did but these threads remind me precisely why moderation
> (including deleting posts such as yours) is needed to *prevent* a
> forum from turning into a toxic pit.

You, as well as the moderators of the toxic pit forum are confused
about what it means to have a good discussion. The discussion that is
currently happening around PyPA projects and ideas is broken because
the PyPA side of the discussion is unwilling to acknowledge how bad
they are at doing their job. Whenever any serious criticism of their
work surfaces, they deal with it by deleting the criticism, never
through addressing the problem.

You can be the most polite and humble person in the world, but as soon
as you bring up the subject of the quality of their decisions, you are
automatically excluded from discussion.

The only criticism anyone is allowed to have is the kind that doesn't
touch on any major projects. It's possible to point out typos in
documentation or to address similarly inconsequential defects at the
smaller code unit level, but it's not possible to call for a revision
of ideas behind libraries or PEPs. For instance, as soon as you
mention the comically awful idea of pyproject.toml in a bad light, you
get a ban.

I believe this comes from the place of insecurity in one's ideas, and
has nothing to do with how polite the criticism is. And that's when
instruments like "code of conduct" are called upon to delete the
inconvenient criticism. This is what creates toxic communities like
StackOverflow or similarly built social networks which endow their
moderators with way too much power over other users.  The other
extreme of anarchy, similar to 4chan, doesn't suffer from this
problem; it sometimes results in grotesque gore or other _unpleasant_
things, but it isn't toxic in the same way gatekeeping is.

This is how I understand and use the word "toxic".  The
discuss.python.org forum is just as toxic as StackOverflow -- I don't have a
metric precise enough to tell who's worse. I believe that this format
is a very unfortunate choice for public discussion where there isn't
an inherent division between owners and non-owners.  Where giving the
keys to the common good to a small group of people creates such a
division.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What is Install-Paths-To in WHEEL file?

2024-01-01 Thread Left Right via Python-list
> others do not and so your notion of what is "accepted"
> is not universally shared.

Why should I or anyone else care about what "others" think?  The
important question is whether what I do is right. And the answer is
"yes". That's why there are rules in the first place instead of
polling.

> if you want to influence anything

Usually, when I interact with representatives of Python community I
have two goals:

1. Typically, I need to show to someone who's paying my salary why
something produced by this community doesn't work. I.e. say, I need to
convince a project manager on a project I'm helping maintain that
deploying using "pip install" is a bad idea. I write an explanation
which I share with the PM and the PyPA people in the bug tracker.
They predictably block me out of fear or frustration.  This gives me a
proof that the thing doesn't work (well), and I'm allowed to do it the
right way. Just like in your previous remark: majority could be a good
initial heuristic, but proof is still a lot better.

2. At this point, I have no hope of convincing the prominent members
of Python community how awful a lot of their decisions are.  There are
plenty of socially constructed obstacles on this way.  The reason I do
this is posterity.  There are plenty of people who aren't influenced
by the internal developments of Python community (outside of it) and
they can see much of its development for what it is: commenting on
this development honestly will help them make an informed choice.
It's also important that those who will come after us will learn about
this contradiction.  Too many bad projects with bad design outlived
their good counterparts due to popularity caused by chance.  And today
those better designs and ideas are as good as lost.  For example, Unix
outlived and "overpowered" plenty of better operating systems of its
time. But most programmers today would have no idea what those systems
were and how they were different.  Similarly, x86 ISA.  And plenty
more.

Python changed from its early days of trying to be funny and generally
welcoming of many contradicting ideas and opinions into a Lord of the
Flies community that no longer tolerates differences of opinion.  It's
lost the spirit of "playful cleverness" (as RMS would put it), and
became a "don't think, do as I say" community. I want to make sure
those who come to learn about Python will not miss this aspect of its
history.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Extract lines from file, add to new files

2024-01-11 Thread Left Right via Python-list
By the way, in an attempt to golf this problem, I discovered this,
which seems like a parser problem:

This is what Python tells me about its grammar:

with_stmt:
| 'with' '(' ','.with_item+ ','? ')' ':' block
| 'with' ','.with_item+ ':' [TYPE_COMMENT] block
| ASYNC 'with' '(' ','.with_item+ ','? ')' ':' block
| ASYNC 'with' ','.with_item+ ':' [TYPE_COMMENT] block

with_item:
| expression 'as' star_target &(',' | ')' | ':')
| expression

From which I figured why not something like this:

with (open('example.txt', 'r'), open('emails.txt', 'w'),
open('salutations.txt', 'w')) as e, m, s:
for line in e:
if line.strip():
(m if '@' in line else s).write(line)

Which, surprise, parses! But it seems like its parse is wrong,
because running this I get:

❯ python ./split_emails.py
Traceback (most recent call last):
  File "/home/?/doodles/python/./split_emails.py", line 1, in 
with (open('example.txt', 'r'), open('emails.txt', 'w'),
open('salutations.txt', 'w')) as e, m, s:
TypeError: 'tuple' object does not support the context manager protocol

It seems to me it shouldn't have been parsed as a tuple. The
parentheses should've been interpreted just as decoration.

NB. I'm using 3.11.6.

On Thu, Jan 11, 2024 at 10:20 PM Thomas Passin via Python-list
 wrote:
>
> On 1/11/2024 1:27 PM, MRAB via Python-list wrote:
> > On 2024-01-11 18:08, Rich Shepard via Python-list wrote:
> >> It's been several years since I've needed to write a python script so I'm
> >> asking for advice to get me started with a brief script to separate names
> >> and email addresses in one file into two separate files:
> >> salutation.txt and
> >> emails.txt.
> >>
> >> An example of the input file:
> >>
> >> Calvin
> >> cal...@example.com
> >>
> >> Hobbs
> >> ho...@some.com
> >>
> >> Nancy
> >> na...@herown.com
> >>
> >> Sluggo
> >> slu...@another.com
> >>
> >> Having extracted salutations and addresses I'll write a bash script using
> >> sed and mailx to associate a message file with each name and email
> >> address.
> >>
> >> I'm unsure where to start given my lack of recent experience.
> >>
> >  From the look of it:
> >
> > 1. If the line is empty, ignore it.
> >
> > 2. If the line contains "@", it's an email address.
> >
> > 3. Otherwise, it's a name.
>
> You could think about a single Python script that looks through your
> input file and constructs all the message files without ever writing
> separate salutation and address files at all.  Then you wouldn't need to
> write the sed and mailx scripts.  It shouldn't be much harder than
> peeling out the names and addresses into separate files.
>
> If you haven't written any Python for some years, the preferred way to
> read and write files is using a "with" statement, like this:
>
> with open('email_file.txt', encoding = 'utf-8') as f:
>  lines = f.readlines()
>  for line in lines:
>  if not line.strip():  # Skip blank lines
>  continue
>  # Do something with this line
>
> You don't need to close the file because when the "with" block ends the
> file will be closed for you.
>
> If the encoding is not utf-8 and you know what it will be, use that
> encoding instead.
>
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Extract lines from file, add to new files

2024-01-11 Thread Left Right via Python-list
Ah, never mind. I need to be more careful; there isn't an "'as'
star_target" after the first rule.

On Thu, Jan 11, 2024 at 10:33 PM Left Right  wrote:
>
> By the way, in an attempt to golf this problem, I discovered this,
> which seems like a parser problem:
>
> This is what Python tells me about its grammar:
>
> with_stmt:
> | 'with' '(' ','.with_item+ ','? ')' ':' block
> | 'with' ','.with_item+ ':' [TYPE_COMMENT] block
> | ASYNC 'with' '(' ','.with_item+ ','? ')' ':' block
> | ASYNC 'with' ','.with_item+ ':' [TYPE_COMMENT] block
>
> with_item:
> | expression 'as' star_target &(',' | ')' | ':')
> | expression
>
> From which I figured why not something like this:
>
> with (open('example.txt', 'r'), open('emails.txt', 'w'),
> open('salutations.txt', 'w')) as e, m, s:
> for line in e:
> if line.strip():
> (m if '@' in line else s).write(line)
>
> Which, surprise, parsers! But it seems like it's parse is wrong,
> because running this I get:
>
> ❯ python ./split_emails.py
> Traceback (most recent call last):
>   File "/home/?/doodles/python/./split_emails.py", line 1, in 
> with (open('example.txt', 'r'), open('emails.txt', 'w'),
> open('salutations.txt', 'w')) as e, m, s:
> TypeError: 'tuple' object does not support the context manager protocol
>
> It seems to me it shouldn't have been parsed as a tuple. The
> parenthesis should've been interpreted just as a decoration.
>
> NB. I'm using 3.11.6.
>
> On Thu, Jan 11, 2024 at 10:20 PM Thomas Passin via Python-list
>  wrote:
> >
> > On 1/11/2024 1:27 PM, MRAB via Python-list wrote:
> > > On 2024-01-11 18:08, Rich Shepard via Python-list wrote:
> > >> It's been several years since I've needed to write a python script so I'm
> > >> asking for advice to get me started with a brief script to separate names
> > >> and email addresses in one file into two separate files:
> > >> salutation.txt and
> > >> emails.txt.
> > >>
> > >> An example of the input file:
> > >>
> > >> Calvin
> > >> cal...@example.com
> > >>
> > >> Hobbs
> > >> ho...@some.com
> > >>
> > >> Nancy
> > >> na...@herown.com
> > >>
> > >> Sluggo
> > >> slu...@another.com
> > >>
> > >> Having extracted salutations and addresses I'll write a bash script using
> > >> sed and mailx to associate a message file with each name and email
> > >> address.
> > >>
> > >> I'm unsure where to start given my lack of recent experience.
> > >>
> > >  From the look of it:
> > >
> > > 1. If the line is empty, ignore it.
> > >
> > > 2. If the line contains "@", it's an email address.
> > >
> > > 3. Otherwise, it's a name.
> >
> > You could think about a single Python script that looks through your
> > input file and constructs all the message files without ever writing
> > separate salutation and address files at all.  Then you wouldn't need to
> > write the sed and mailx scripts.  It shouldn't be much harder than
> > peeling out the names and addresses into separate files.
> >
> > If you haven't written any Python for some years, the preferred way to
> > read and write files is using a "with" statement, like this:
> >
> > with open('email_file.txt', encoding = 'utf-8') as f:
> >  lines = f.readlines()
> >  for line in lines:
> >  if not line.strip():  # Skip blank lines
> >  continue
> >  # Do something with this line
> >
> > You don't need to close the file because when the "with" block ends the
> > file will be closed for you.
> >
> > If the encoding is not utf-8 and you know what it will be, use that
> > encoding instead.
> >
> > --
> > https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Extract lines from file, add to new files

2024-01-12 Thread Left Right via Python-list
To people discussing BNF:

The grammar language Python uses is *very* far from BNF. It's more
similar to PEG, but even then it's still quite far.  Python's grammar
is just its own thing, which makes it harder to read, if you are
already familiar with other more popular formats.

I've also found bugs in Python parser before, so had this turned out
to be a real issue, this wouldn't have been the first time.  There are
plenty of weird corners in Python grammar that allow unexpected
programs to parse (and sometimes even run!), and these are very often
connected to assignments, because, in general, assignments in Python
are very elaborate and hard to describe / conceptualize about.  The
most popular example I've even seen used in coding interviews (which I
think is a silly gimmick, but that's kind of the whole point of a lot
of these interviews...) is:

x = [...]
for x[i] in x: print(i)

Which is not an assignment by itself, but the "weirdness" results from
the loop syntax sharing definitions with the "destructuring bind"
style of assignment (i.e. where the left-hand side can be an arbitrarily
complex expression).

I was surprised, for example, to learn that "as" in "with_stmt" isn't
shared with "as" in "except_block" (so, from the grammar perspective,
these are two different keywords), and that asterisk in "except_block"
isn't shared with "star_target" (also weird, since you'd think these
should be the same thing).  In general, and by and large, if you look
at Python's grammar there are many "weird" choices that it makes to
describe the language which seem counterintuitive to the programmer
who tries to learn the language from examples (i.e. the context-dependent
meaning of parentheses, of the asterisk, of the period, etc.)  Having been
exposed to this, you'd start to expect that some of this weirdness
will eventually result in bugs, or at least in unexpected behavior.



Anyways. To the OP: I'm sorry to hijack your question. Below is the
complete program:

with (
open('example.txt', 'r') as e,
open('emails.txt', 'w') as m,
open('salutations.txt', 'w') as s,
):
for line in e:
if line.strip():
(m if '@' in line else s).write(line)

it turned out to be not quite the golfing material I was hoping for.
But, perhaps a somewhat interesting aspect of this program you don't
see used a lot in the wild is the parentheses in the "with" head.  So,
it's not a total write-off from the learning perspective.  I.e. without
looking at the grammar, had I been given this code in a coding interview
question, I wouldn't be quite sure whether this code would work or
not: one way to interpret what's going on here is to think that the
expression inside parentheses is a tuple, and since tuples aren't
context managers, it wouldn't have worked (or maybe not even parsed, as
"as" wouldn't be allowed inside a tuple definition; since there's no
"universal as-expression" in Python, it's hard to tell what the rules
are).  But, it turns out there's a form of "with" that has parentheses
for decoration purposes, and that's why it parses and works to the
desired effect.

Since it looks like you are doing this for educational reasons, I
think there's a tiny bit of value to my effort.

On Fri, Jan 12, 2024 at 8:08 AM Grizzy Adams via Python-list
 wrote:
>
> Thursday, January 11, 2024  at 10:44, Rich Shepard via Python-list wrote:
> Re: Extract lines from file, add to (at least in part)
>
> >On Thu, 11 Jan 2024, MRAB via Python-list wrote:
>
> >> From the look of it:
> >> 1. If the line is empty, ignore it.
> >> 2. If the line contains "@", it's an email address.
> >> 3. Otherwise, it's a name.
>
> If that is it all? a simple Grep would do (and save on the blank line)
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Extract lines from file, add to new files

2024-01-12 Thread Left Right via Python-list
> surprising for me:

Surprise is subjective, it's based on personal experience. Very few
languages allow arbitrarily complex expressions in the same place they
allow variable introduction. The fact that "i" is not defined is
irrelevant to this example.  Most programmers who haven't memorized the
Python grammar by heart, but expect the language to behave similarly to
the languages in the same category, would be surprised that this code is
valid (i.e. can be parsed); whether it results in an error or not is of
no consequence.

> There's no destructuring going on here

I use the term "destructuring" in the same way Hyperspec uses it.
It's not a Python term.  I don't know what you call the same thing in
Python.  I'm not sure what you understand from it.

On Sat, Jan 13, 2024 at 12:37 AM Greg Ewing via Python-list
 wrote:
>
> On 13/01/24 12:11 am, Left Right wrote:
> >  x = [...]
> >  for x[i] in x: print(i)
>
> I suspect you've misremembered something, because this doesn't
> do anything surprising for me:
>
>  >>> x = [1, 2, 3]
>  >>> for x[i] in x: print(i)
> ...
> Traceback (most recent call last):
>File "", line 1, in 
> NameError: name 'i' is not defined
>
> There's no destructuring going on here, just assignment to a
> sequence item.
>
> --
> Greg
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Extract lines from file, add to new files

2024-01-12 Thread Left Right via Python-list
Actually, after some Web searching, I think, based on this:
https://docs.python.org/3/reference/simple_stmts.html#grammar-token-python-grammar-augtarget
that in Python this is called an "augmented assignment target". The term
isn't in the glossary, but neither are many others.

On Sat, Jan 13, 2024 at 1:45 AM Left Right  wrote:
>
> > surprising for me:
>
> Surprise is subjective, it's based on personal experience. Very few
> languages allow arbitrary complex expressions in the same place they
> allow variable introduction. The fact that "i" is not defined is
> irrelevant to this example.  Most programmers who haven't memorized
> Python grammar by heart, but expect the language to behave similar to
> the languages in the same category would be surprised this code is
> valid (i.e. can be parsed), whether it results in error or not is of
> no consequence.
>
> > There's no destructuring going on here
>
> I use the term "destructuring" in the same way Hyperspec uses it.
> It's not a Python term.  I don't know what you call the same thing in
> Python.  I'm not sure what you understand from it.
>
> On Sat, Jan 13, 2024 at 12:37 AM Greg Ewing via Python-list
>  wrote:
> >
> > On 13/01/24 12:11 am, Left Right wrote:
> > >  x = [...]
> > >  for x[i] in x: print(i)
> >
> > I suspect you've misremembered something, because this doesn't
> > do anything surprising for me:
> >
> >  >>> x = [1, 2, 3]
> >  >>> for x[i] in x: print(i)
> > ...
> > Traceback (most recent call last):
> >File "", line 1, in 
> > NameError: name 'i' is not defined
> >
> > There's no destructuring going on here, just assignment to a
> > sequence item.
> >
> > --
> > Greg
> > --
> > https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Extract lines from file, add to new files

2024-01-14 Thread Left Right via Python-list
> What do you mean?
>
> for x in lambda: ...:
>   ...
>
> Perfectly grammatical.

1. You put the lambda definition in the wrong place (it should be in
the left-hand side, or as Python calls it "star_targets", but you put
it into "star_expressions", which would be where the right-hand side
is drawn from).
2. You used what Python calls "lambdadef" in place of what Python
calls "function_def". I.e. lambda definition and function definition
are two different things, at least as far as grammar is considered.

So, you solved a different problem.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Extract lines from file, add to new files

2024-01-14 Thread Left Right via Python-list
> What do you mean by this? Most languages I've worked with allow
> variables to be initialized with arbitrary expressions, and a lot of
> languages allow narrowly-scoped variables.

I'm talking about the *left*-hand side of the assignment, not the
right-hand side. Initialization with an arbitrary expression -- the arbitrary
expression is on the right. So, that's beside the point.

Here are examples of languages that don't have a feature analogous to
"augmented assignment target":

* Java
* C
* Shell

Examples of languages with limited use of destructuring:

* Haskell
* JavaScript
* Ruby
* Common Lisp

Examples of languages with a superset of destructuring:

* Prolog family of languages (in Prolog it's called "unification")

What is the problem with Python's "augmented assignment target"? -- It
is used in places where syntactically it is more common to introduce
variables (and in languages with the limited use of destructuring,
it's only possible to introduce variables in this context).  For
example, when destructuring an Array in JavaScript, the left-hand side
is restricted syntactically to a very small subset of the language
that, for example, excludes function application.  Typically, it's not
possible to use already defined variables in the left-hand side of the
variable definition, even if destructuring assignment is possible.
Prolog is an example of the opposite, where already defined variables
are allowed on both sides of unification, but Prolog doesn't have
function application in the same sense Python has, so it's still OK.

In general, in languages that aren't like Prolog, conceptually, it's
possible to either *define* variables (with optional initialization)
or to *reuse* them (in the context of assignment that usually looks
similar to initialization), but not both.  The fact that in Python you
can do both in the same place is surprising, eg. in the context of
loops.  Any language that distinguishes between expressions and
statements would have conceptual difficulties with allowing a mix in
the same context.  Typically, variable introduction is a statement in
such languages (as is the case in Python), so using an expression in
the same place as a variable introduction is strange.  To make this
shorter, Python allows:

for <new variable> in ...: ...

and

for <arbitrary assignment-target expression> in ...: ...

which is unexpected, especially since the first form is a lot more
popular. Because the limited subset of expressions is desirable in
this context, many languages try to "cram" it into this box.  C, after
some standard iterations caved in and allowed statements in the
initialization component of the for loop (but only variable
declaration statements), for example. Other languages like JavaScript
developed a special subset of language for the purpose of describing
the relationship between multiple components of the object being
assigned as variables.

In every case, from the language development perspective, this looks
clumsy and unnecessary, as it is usually easy to write programs that
are exactly equivalent but don't require such workarounds. But, in
practice, programmers want to save a few keystrokes, and this pushes
the language authors to add such "features".  Python is furthermore
unique in how the workaround creates a lot of opportunities for abuse.
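
To make the contrast concrete, here is a minimal sketch of both forms
(the names i, d and 'last' are made up purely for illustration):

# Form 1: the target is a plain name; a variable is introduced (or rebound).
for i in range(3):
    pass
print(i)   # 2

# Form 2: the target is an arbitrary assignment-target expression -- here a
# subscription on an already existing object; no new variable is introduced.
d = {}
for d['last'] in range(3):
    pass
print(d)   # {'last': 2}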

> The Python term, at least colloquially, is "tuple unpacking."

Well, why use a colloquialism if there's a language specification? Also,
there weren't any tuples used in my example, at least not explicitly
("i" could've been a tuple, but that wasn't specified).
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Extract lines from file, add to new files

2024-01-14 Thread Left Right via Python-list
> Second time to ameliorate wording-dispute in this thread! The original
> phrase was: "[modified] BNF". Some of us have worked with various forms
> and evolutions of BNF since back in the days of COBOL-60 proposals, and
> know it when we see it!

OK, here are the conceptual differences between what Python grammar
language does and what you'd expect from anything that's based on BNF,
modified or not:

Python isn't a context-free language, so the grammar that is used to
describe it doesn't actually describe the language... so, it's a
"pretend grammar" that ignores indentation.  BNF is supposed to be
used to describe the language, it's not a "pretend" or "pseudo"
grammar, in a way we have at least two established grammar for
pseudo-code.

BNF and derivatives don't have an inherent mechanism for tiebreaks.
The mechanism is necessary because BNF rules can be tried in any
order.  Some grammar languages derived from BNF declare ambiguous
grammars invalid, some allow ambiguity, but say that the longest
prefix wins, and if there's still ambiguity after that, then such
grammar is invalid, some have special constructs to define "priority"
etc. My reading of Python grammar is that it works like PEG, where
rules are tried in the order they are defined.  This makes it less
expressive, but easier to work with.  This is, probably, the most
fundamental difference between the BNF family and the PEG family.

BNF and family languages rarely incorporate elements of Perl-like
regular expression parsing in the language (i.e. things like
lookaheads, lookbehinds etc.) This is more typical of the PEG family.

On top of this, the Python grammar language has a bunch of
"inventions" that are unique to it (I've never seen any other grammar
language use '.' in the same way Python uses it).  So, there's that
too.

Having worked with a bunch of different grammar languages, the one
used for Python isn't a recognizable BNF derivative.  I think the
authors used this as a description in the same way as today a lot of
programmers would use the word "IDE" to describe any text editor or
"REST" to describe any kind of server-client protocol over HTTP and so
on.  Or, how we'd use "Xerox" to name a copier machine, even if that
company didn't manufacture it, and even if the tech used for copying
is completely different.  And that's why I wrote that the grammar is
actually more like PEG, adding that it's neither, but seems to fall
more into that later category.

> Yes it is hard to read - and even harder to learn-from;

This wasn't my point.  My point is that it's hard to learn languages
that are "one off" in the group languages that all share a similar set
of rules. The difficulty comes from the surprise caused by the unique
use, not because there's something inherently difficult about reading
grammar languages.  In fact, however you look at Python's grammar
language, in a sense, it's a lot easier to read than Python itself
because it has significantly fewer rules.  Of course, the number of
rules doesn't entirely capture the difficulty, but it's a useful
metric.

> In Python, everything is an object. As long as the LHS is a legal-object
> which  makes sense for the situation, it can be used.

This is a very interesting statement... I don't think you are fully
aware of what it might mean :)  Here are just a few questions for you
to ponder:

* What is Python? Is it only Python 3.12?  Is Python 3.11 not Python?
How far back do you go to draw the line?
* What makes something an "object"? Is it the ability to dispatch on?
Is it the inheritance from "object" type?
* What elements of the language do you consider part of the language
that can be included in your "all" set.  Do types belong in that set?
Do expressions belong in that set?  What about comments?

Depending on how you answer these questions, you'd have some further
problems to deal with.  For example, historically, Python had plenty
of things that didn't inherit from "object" but acted similar to one.
I believe "module" objects were among the last to transition into
object inheritance lane, which might have happened some time around
Python 3.5.  Of course, there are plenty of things that are "in
Python", at least due to its grammar, that are hard to describe as
objects (eg. comments).  So, you'd have to make a special subset of
Python language that (eg. excludes comments) to claim that everything
is an object.

Most importantly, however, regardless of what you understand to be an
object, or how you decide to answer any of those questions: what value
does such a claim possibly have? Especially, given the context...

Furthermore, I'm absolutely convinced that what governs the
restrictions on the left-hand side isn't whether it's understood
to be an object, but the grammar rules, which are unaware of the
concept of objects.  For example, you may say "functions in Python are
objects", but you cannot put a function definition in the head of the
for loop clause.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Extract lines from file, add to new files

2024-01-14 Thread Left Right via Python-list
> You said function. I made a function. You said "head of a for loop
> clause". I put it there. Problem was underspecified.

I also wrote a lot of letters; if you combine them very liberally,
without any regard to the order in which they were written or the
context in which they were used, you may come up with very surprising
findings.

> But if you're trying to tell me that a def statement should be a valid
> assignment target,

Why not just read what I wrote and work from there?  No, I didn't
write anything even remotely similar to this...  I don't want function
definition to be an assignment target.  I was giving an example of how
Python grammar works, how the rules govern what can or cannot be used
in a particular place...

In other words, if you aren't sure you understand the question, why
are you trying to reply to it? Is your goal to learn the meaning of
the question by giving arbitrary replies and hoping that the author of
the question restates it so that you understand it?  If so, I believe,
the better strategy would be to simply ask to restate the question.
Will save you the round-trip.

> You provided a way to create an anonymous function and that was not enough.
> I wonder if you could throw in the new := walrus operator to similarly make
> a named lambda function in a similar way.

The person you are replying to didn't understand the question and has
written something irrelevant.  It's not about being "enough".  I
honestly don't know why they are spending so much energy replying to
my messages :|

> Python grew and there was regular pressure to add keywords which might break
> existing programs. So, yes, sometimes, a keyword was re-used in a different
> context.

Why are keywords relevant to this?

> How often do you really think anyone out there NEEDS to define a function in
> the context mentioned?

This isn't about programmers writing programs that aren't about the
language.  It's about programmers who write language-related tools,
like linters, formatters etc.  I.e. the programmers who need to
consider any possible grammar production. And the reason I mentioned
function definition is this, again: a function definition is a
statement.  Python grammar rules prevent function definition from
appearing in left-hand side of the head of the for loop.  However, a
variable declaration, which is also a statement, is allowed there.
Programmers like grammar rules to be consistent, and it's surprising
if a particular larger context allows both statements and expressions.
I also explained why and how language authors would make a decision to
break this consistency: it saves some keystrokes for the programmers.
I.e. allows for shorter programs, while doesn't add any new abilities
to the language.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Question about garbage collection

2024-01-17 Thread Left Right via Python-list
So, here's some info about how to see what's going on with Python's
memory allocation: https://docs.python.org/3/library/tracemalloc.html
. I haven't looked into this in a long time, but it used to be the
case that you needed to compile native modules (and probably Python
itself?) so that instrumentation is possible (I think incref / decref
macros should give you a hint, because they would have to naturally
report some of that info).

Anyways.  The problem of tracing memory allocation / deallocation in
Python can be roughly split into these categories:

1. Memory legitimately claimed by objects created through Python
runtime, but not reclaimed due to programmer error. I.e. the
programmer wrote a program that keeps references to objects which it
will never use again.
2. Memory claimed through native objects obtained by means of
interacting with Python's allocator.  When working with Python C API
it's best to interface with Python allocator to deal with dynamic
memory allocation and release.  However, it's somewhat cumbersome, and
some module authors simply might not know about it, or wouldn't want
to use it because they prefer a different allocator.  Sometimes
library authors don't implement memory deallocation well.  Which
brings us to:
3. Memory claimed by any user-space code that is associated with the
Python process. This can be for example shared libraries loaded by
means of Python bindings, that is on top of the situation described
above.
4. System memory associated with the process.  Some system calls need
to allocate memory on the system side.  Typical examples are opening
files, creating sockets etc.  Typically, the system will limit the
number of such objects, and the user program will hit the numerical
limit before it hits the memory limit, but it can also happen that
this will manifest as a memory problem (one example I ran into was
trying to run conda-build and it would fail due to enormous amounts of
memory it requested, but the specifics of the failure were due to it
trying to create new sub-processes -- another system resource that
requires memory allocation).

There isn't a universal strategy to cover all these cases.  But, if
you have reasons to suspect (4), for example, you'd probably start by
using strace utility (on Linux) to see what system calls are executed.

For something like the (3), you could try to utilize Valgrind (but
it's a lot of work to set it up).  It's also possible to use jemalloc
to profile a program, but you would have to build Python with its
allocator modified to use jemalloc (I've seen an issue in the Python
bug tracker where someone wrote a script to do that, so it should be
possible).  Both of these are quite labor intensive and not trivial to
set up.

(2) could be often diagnosed with tracemalloc Python module and (1) is
something that can be helped with Python's gc module.
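
For cases (1) and (2), a minimal sketch of what that might look like
(the choice of ten lines and the placement of the calls are arbitrary):

import gc
import tracemalloc

tracemalloc.start()   # start recording allocations made through Python's allocator

# ... run the code suspected of leaking ...

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics('lineno')[:10]:   # ten biggest allocation sites
    print(stat)

gc.collect()          # force a collection pass
print(gc.garbage)     # uncollectable objects the collector found but could not free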

It's always better though to have an actual error and work from there.
Or, at least, have some monitoring data that suggests that your
application memory use increases over time.  Otherwise you could be
spending a lot of time chasing problems you don't have.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Is there a way to implement the ** operator on a custom object

2024-02-09 Thread Left Right via Python-list
In order for the "splat" operator to work, the type of the object must
populate slot `tp_as_mapping` with a struct of this type:
https://docs.python.org/3/c-api/typeobj.html#c.PyMappingMethods and
have some non-null implementations of the methods this struct is
supposed to contain.

I can do this in C, but I cannot think of a way to do this in Python
proper. Defining all the methods mentioned in PyMappingMethods doesn't
seem to do it. You could try to research this further, and if, indeed
defining all the methods of PyMappingMethods on the Python side
doesn't produce an object that behaves like a proper mapping, you
could probably file a bug report for that.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Is there a way to implement the ** operator on a custom object

2024-02-09 Thread Left Right via Python-list
> Looks like it can simply be done in Python, no tp_as_mapping needed.

It's not that it isn't needed. You've just shown a way to add it using
Python code.

But, more to the point: extending collections.abc.Mapping may or may
not be possible in OP's case.

Also, if you are doing this through inheritance, this seems really
convoluted: why not just inherit from dict? -- fewer methods to
implement, less stuff to import, etc.
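
For reference, a minimal sketch of the pure-Python route (going through
collections.abc.Mapping; the class and key names are made up):

from collections.abc import Mapping

class Pair(Mapping):
    """A tiny read-only mapping with two fixed keys."""
    def __init__(self, a, b):
        self._data = {'a': a, 'b': b}
    def __getitem__(self, key):
        return self._data[key]
    def __iter__(self):
        return iter(self._data)
    def __len__(self):
        return len(self._data)

def show(**kwargs):
    print(kwargs)

show(**Pair(1, 2))     # {'a': 1, 'b': 2}
print({**Pair(1, 2)})  # {'a': 1, 'b': 2} -- also works in a dict display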
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: the name ``wheel''

2024-03-21 Thread Left Right via Python-list
I believe that the name "Wheel" was a reference to "reinventing the
wheel". But I cannot find a quote to support this claim. I think the
general sentiment was that it was the second attempt by the Python
community to come up with a packaging format (first being Egg), and so
they were reinventing the wheel, in a way.

I cannot speak to the other question though: I don't know. This is
however also a common practice on Linux, where Python is often
installed in order to enable system tools, which, in turn, don't need
a Python package manager to function. Not sure why this would be the
case in MS Windows.

On Thu, Mar 21, 2024 at 4:51 PM Johanne Fairchild via Python-list
 wrote:
>
> Why is a whl-package called a ``wheel''?  Is it just a pronunciation for
> the extension WHL or is it really a name?
>
> Also, it seems that when I install Python on Windows, it doesn't come
> with pip ready to run.  I had to say
>
>   python -m ensurepip
>
> and then I saw that a pip on a whl-package was installed.  Why doesn't
> the official distribution make pip ready to run by default?  Thank you!
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Making 'compiled' modules work with multiple python versions on Linux

2024-04-01 Thread Left Right via Python-list
It sounds weird that symbols from Limited API are _missing_ (I'd
expect them to be there no matter what library version you link with).
But, I haven't done this myself, so, what do I know?  It would help
though to see the actual error.

That aside: why do you want to do this? One side effect of doing what
you want will be the "weird" name of your wheel archive. Weird in a
sense that virtually nobody does that.  And when virtually nobody does
something, you are almost guaranteed to be the first to find bugs, and
then be the one whose bug reports are shoved into the backlog and
never looked at again.

You, kind of, are already walking into the world of pain trying to
make Python binary packages, and then you also want them to be
cross-platform, and then you want them to be usable by different
versions of Python... Unless it's for your own amusement, I'd just
have a package per version of Python. Maintenance-wise it's going to
be a lot easier.

On Fri, Mar 29, 2024 at 10:13 AM Barry via Python-list
 wrote:
>
>
>
> > On 28 Mar 2024, at 16:13, Olivier B. via Python-list 
> >  wrote:
> >
> > But on Linux, it seems that linking to libpython3.so instead of
> > libpython3.11.so.1.0 does not have the same effect, and results in
> > many unresolved python symbols at link time
> >
> > Is this functionality only available on Windows?
>
> Python limited API works on linux, but you do not link against the .so on 
> linux I recall.
>
> You will have missed that libpython3.so is a symlink to libpython3.11.so.1.0.
>
> Windows build practices do not translate one-to-one to linux, or macOS.
>
> Barry
>
>
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: how to discover what values produced an exception?

2024-05-06 Thread Left Right via Python-list
From a practical perspective: not all values are printable, and if
printing a value itself raises an error you lose the original error,
so going overboard with printing values in error messages is usually
not such a hot idea.

But if you want the values, you'd have to examine the stack and
extract them from the local variables of the offending frame. It's
easier to do this interactively (unless you are in a multithreaded
environment, where pdb is broken). You could also try to collect this
information automatically, by unwinding the traceback to the frame
that generated the error and examining its locals, but doing that
reliably for every error is tedious, error-prone and quite a bit of
work.
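
For illustration, a small (made-up) sketch of the "unwind the stack and
look at the locals" approach:

import sys, traceback

def compare():
    a, b = (0, 0), 4
    return a < b              # raises TypeError

try:
    compare()
except TypeError:
    tb = sys.exc_info()[2]
    while tb.tb_next:         # walk to the innermost frame
        tb = tb.tb_next
    print("locals where it failed:", tb.tb_frame.f_locals)
    traceback.print_exc()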

On Fri, May 3, 2024 at 6:58 PM Johanne Fairchild via Python-list
 wrote:
>
> How to discover what values produced an exception?  Or perhaps---why
> doesn't the Python traceback show the values involved in the TypeError?
> For instance:
>
> --8<>8---
> >>> (0,0) < 4
> Traceback (most recent call last):
>   File "", line 1, in 
> TypeError: '<' not supported between instances of 'tuple' and 'int'
> --8<>8---
>
> It could have said something like:
>
> --8<>8---
> TypeError: '<' not supported between instances of 'tuple' and 'int'
>   in (0,0) < 4.
> --8<>8---
>
> We would know which were the values that caused the problem, which would
> be very helpful.
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Version of NymPy

2024-05-16 Thread Left Right via Python-list
Let me try to answer this properly, instead of "simply".

The "problematic" part of your question is "with my Anaconda
distribution". Anaconda distribution comes with the conda program that
manages installed packages. A single Anaconda distribution may have
multiple NumPy versions installed at the same time, although only one
will be available to the Python process (note that this means that
sub-processes created in this Python process won't necessarily have
the same version of NumPy!). To make matters worse, it's common for
Anaconda users to use pip to install packages.

Now, Anaconda has a concept of virtual environments independent of
Python's venv module. In order to create such environments it can be
configured to either link (usually hard-link) installed packages into
the dedicated directory, or to copy these packages into the said
directory. This will become important once you inevitably ask the
follow-up question: "why do I have this version of NumPy?".

Unfortunately, you also need to consider setuptools. The traditional
setup.py install command may install multiple versions of the same
package into the same directory. Even worse, "pip install -e", the
glorified "setup.py develop", complicates things even further by
adding the "installed" package to a .pth file, which can, again,
create ambiguities about where the package actually resolves from.

On top of this, there's an environment variable: PYTHONPATH that can
be set to add an arbitrary number of source directories for Python to
look up package location.

So, suppose you ran:

python -c "import numpy; numpy.__version__"

and then ran a Jupyter notebook and discovered that the version of
NumPy in that notebook is not the one you just saw in the previous
output... What went wrong?

You then may try:

conda list numpy

and get yet another answer.  And then you run

pip show numpy

And the answer is still different!

Of course, it's not necessary that all these answers are different.
And, in most circumstances they are going to be consistent... but they
don't have to be!

Below is the list of typical problems I encountered in my attempts to
resolve similar problems for Python users. Of course, this list is not
exhaustive.

1. If the version in the Jupyter notebook differs from the one in the
environment in which the Jupyter server was started, you need to look
for the Jupyter kernel definition. There are many ways in which the
Jupyter kernel definition may alter the module lookup locations, but
the most common one is that using Python from the active virtual
environment isn't the default for the default Jupyter kernel.

2. If installed modules in conda environments are hard-linked, and at
some point pip or setuptools were used to install extra packages, you
might have "botched" unrelated environments by overwriting files
through hard-links without you even knowing that.

3. conda will happily install outdated versions of conda into virtual
environments it creates. The format of conda virtual environments
changed over time, and older versions of conda are unaware of the new
format, while newer versions are unaware of the old format. If you
happen to run the conda command from the wrong environment you may get
unexpected results (especially if both the new and the old version of
conda have created environments with the same name!) To avoid this,
you'd want to deactivate all conda environments activated so far until
you are at least in the base environment.

4. Usually, some diagnostic information can be gleaned from printing
the value of the PYTHONPATH environment variable, the sys.path list
(inside Python), and sysconfig.get_path('platlib') (then looking into
that directory for duplicate packages with different versions or for
.pth files); see the sketch after this list. If you discover
anomalies, try to figure out if you had to use
pip to install packages (this may indirectly mean using setuptools).
Similarly, running "conda build" may indirectly result in running
setuptools commands. Also, some popular tools like to do bad things to
packages and virtual environments: pytest and tox come to mind. pytest
can manipulate module lookup locations (and you will need to dissect
its configuration to figure this out). tox, afaik, is unaware of conda
virtual environments, and will try to create venv-style virtual
environments, which will have all the same problems as using pip in
conda environments does.

5. Final warning: no matter how ridiculous this is: the current
directory in Python is added to the module lookup path, and it
*precedes* every other lookup location. If, accidentally, you placed a
numpy.py in the current directory of your Python process -- that is
going to be the numpy module you import.  To make this even worse,
this behavior used to depend on whether you start Python with PDB
active or not (with PDB, the current working directory wasn't added to
the path, and module imports resolved differently). I'm not quite sure
which version of Python fixed that.
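
As a starting point for the checks in point 4, something like this,
run in the same interpreter (or Jupyter kernel) you are puzzled about,
usually narrows things down:

import sys, sysconfig
import numpy

print("executable :", sys.executable)
print("numpy      :", numpy.__version__, "from", numpy.__file__)
print("platlib    :", sysconfig.get_path("platlib"))
print("sys.path:")
for entry in sys.path:
    print("   ", entry)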

On Wed, May 15, 2024 at

Re: venvs vs. package management

2024-05-20 Thread Left Right via Python-list
There are several independent problems here:

1. Very short release cycle. This is independent of the Python venv
module but is indirectly influenced by Python's own release cycle.
Package maintainers don't have time for proper testing, they are
encouraged to release a bunch of new (and poorly tested) versions, and
they never get a break. So, when you install the latest, there will be
something else broken.  There's never a window to properly test
anything.

2. Python made a very short-sighted decision about how imports work.
Python doesn't have a concept of "application", and therefore there's
no way to specify dependencies per application (and imports import
anything that's available, not versioned). That's why every Python
application ends up carrying its own Python, with the version of its
own dependencies around. Python's venv module is just an
acknowledgement of this design flaw.  I.e. the proper solution
would've been a concept of application and per-application dependency
specification, but instead we got this thing that doesn't really work
(esp. when native modules and shared libraries are considered), but it
"works" often enough to be useful.

3. The Python community grew to be very similar to what PHP 4 was,
where there were several "poisonous" examples, which were very popular
on the Web, which popularized a way of working with MySQL databases
that was very conducive to SQL injections. Python has spread very bad
ideas about project management. Similar to how PHP came up with
mysql_real_escape() and mysql_this_time_promise_for_real_escape() and
so on functions, Python came up with bad solutions to the problems
that had to be fixed by removing bad functionality (or, perhaps,
education). So, for example, it's very common to use requirements.txt,
which is generated by running pip freeze (both practices are bad
ideas). Then PyPA came up with a bunch of bad ideas in response to
problems like this, eg. pyproject.toml.  In an absurd way very much
mirroring the situation between makefiles and makefiles generated by
autotools, today Python developers are very afraid of doing simple
things when it comes to project infrastructure (it absolutely has to
be a lot of configuration fed into another configuration, processed by
a bunch of programs to generate even more configuration...) And most
Python programmers don't really know how the essential components of
all of this infrastructure work. They rely on a popular / established
pattern of insane multi-step configuration generation to do simple
things. And the tradition thus developed is so strong, that it became
really cultish. This, of course, negatively contributes to the overall
quality of Python packages and tools to work with them.

Unfortunately, the landscape of Python today is very diverse.  There's
no universally good solution to package management because it's broken
in the place where nobody is allowed to fix it.  Commercial and
non-commercial bodies alike rely on people with a lot of experience
and knowledge of particular Python gotchas to get things done. (Hey,
that's me!) And in different cases, the answer to the problem will be
different. Sometimes venv is good enough. Other times you may want a
container or a vm image. Yet in a different situation you may want a
PyPA or conda package... and there's more.

On Sun, May 19, 2024 at 4:05 PM Piergiorgio Sartor via Python-list
 wrote:
>
> On 19/05/2024 08.49, Peter J. Holzer wrote:
> [...]
> > That's what package management on Linux is for. Sure, it means that you
> > won't have the newest version of anything and some packages not at all,
> > but you don't have to care about dependencies. Or updates.
>
> Well, that doesn't work as well.
> Distributions do not pack everything, this
> also depending on licenses.
>
> Sometimes, or often, you need to use the
> *latest* version of something, due to some
> bugfix or similar.
>
> The distribution does not always keep up
> to date everything, so you're stuck.
>
> The only solution is a venv, with all
> needed packages for the given task.
>
> Typical problem with PyTorch / TensorFlow.
>
> In case of trouble, the first answer is:
> "Check with the latest (nightly) release".
>
> Which means installing something *outside*
> the Linux distribution support.
> And this impossible, because this will pull
> in dependencies like crazy, which are not
> (yet) in the Linux distribution path.
>
> Saying it differently, the latest greatest
> update is not a wish, it's a must...
>
> So, long story short, the only solution I
> know are venvs...
>
> Of course, other solutions are welcome!
>
> bye,
>
> --
>
> piergiorgio
>
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Serializing pydantic enums

2024-05-28 Thread Left Right via Python-list
Most Python objects aren't serializable into JSON. Pydantic isn't
special in this sense.

What can you do about this? -- Well, if this is a one-off situation,
then maybe just do it by hand.

If this is a recurring problem: json.dumps() takes a cls argument, the
encoder class used to do the serialization. Extend json.JSONEncoder
and override its default() method to handle the types the encoder
doesn't know about. I believe the official docs have some information
about this too.
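
A minimal sketch (re-declaring the enum from your example so it's
self-contained):

import json
from enum import Enum

class FinishReason(Enum):
    stop = 'stop'

class EnumEncoder(json.JSONEncoder):
    def default(self, obj):          # called for objects json doesn't know
        if isinstance(obj, Enum):
            return obj.value
        return super().default(obj)

print(json.dumps({"finish_reason": FinishReason.stop}, cls=EnumEncoder))
# -> {"finish_reason": "stop"}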

On Tue, May 28, 2024 at 2:50 PM Larry Martell via Python-list
 wrote:
>
> Just getting started with pydantic. I have this example code:
>
> class FinishReason(Enum):
> stop = 'stop'
>
> class Choice(BaseModel):
> finish_reason: FinishReason = Field(...)
>
>
> But I cannot serialize this:
>
> json.dumps(Choice(finish_reason=FinishReason.stop).dict())
> *** TypeError: Object of type FinishReason is not JSON serializable
>
>
> I get the object not the value:
>
> (Pdb) Choice(finish_reason=FinishReason.stop)
> Choice(finish_reason=)
>
>
> Also tried it with .value, same result.
>
> What am I missing here?
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: python for irc client

2024-07-04 Thread Left Right via Python-list
Hi.

Just FYI, I use Erc (in Emacs). I'm not a very advanced user, perhaps,
but I never felt like I miss anything. That's not to stop you from
making your own, but if you just need a decent text client for IRC,
then there's already at least one.

On Thu, Jul 4, 2024 at 11:30 AM inhahe via Python-list
 wrote:
>
> On Thu, Jul 4, 2024 at 5:22 AM inhahe  wrote:
>
> >
> >
> > On Thu, Jul 4, 2024 at 5:14 AM Daniel via Python-list <
> > python-list@python.org> wrote:
> >
> >>
> >> In your wisdom, would python be a good environment to accomplish this?
> >
> >
>
> > I think Python would be a great language to write an IRC client in, it's a
> > rapid-development language, and also Python is particularly good for text
> > manipulation and the IRC protocol is textual rather than binary.
> >
>
> Oh yeah, I forgot I was going to mention that Twisted has already done a
> lot of the dirty work for you if you make it in Python...they have twisted.
> words.protocols.irc, which implements the IRC protocol. (I don't know if
> it's up to date and supports ircv3, though.)
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Best (simplest) way to share data between processes

2024-07-08 Thread Left Right via Python-list
If resource usage isn't an issue, then the _easy_ thing to do, which
is also easy to get right, is to have one server do all the
h/w-related reading and have the clients talk to that server. Use
whatever server technology you feel most confident with, e.g. Python's
http.server module. The basic HTTPServer from that module runs in a
single thread and thus processes all requests sequentially, so you get
synchronization for free.

Then, the rest of the scripts that need to talk to h/w will instead be
talking to this server.
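
To make that concrete, a rough sketch using only the standard library
(read_adc() stands in for the real ADS1115 code; host and port are
arbitrary):

import json
from http.server import HTTPServer, BaseHTTPRequestHandler

def read_adc():
    # stand-in for the real I2C / ADS1115 reading code
    return {"battery_voltage": 12.6, "charge_current": 1.4}

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps(read_adc()).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# HTTPServer serves one request at a time, so I2C access is serialized.
HTTPServer(("127.0.0.1", 8080), Handler).serve_forever()

The other scripts then fetch http://127.0.0.1:8080/ (e.g. with
urllib.request.urlopen()) and decode the JSON, instead of touching the
I2C bus themselves.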

Again, this isn't an _efficient_ solution... but, sometimes you don't
need one. And this one is easy to make, easy to debug, easy to expand.
But, if instead you were looking for a more efficient solution, then,
the general idea that allows the http server to work in this case
would still apply: have a single synchronization program that takes
requests asynchronously, and orders them. So, a basic TCP server would
also work as well as a UNIX socket. Your idea with holding a lock on a
file would also work (in fact, plenty of Linux utilities work that
way, eg. apt-get or yum).



If you don't want to change the existing scripts, then instead of
running them directly you could run them through batch:
https://man7.org/linux/man-pages/man1/batch.1p.html . This is a very
simple queuing program available on Linux. It will take care
of synchronization by putting the scripts you want to run in a queue
and executing them one at a time.

On Sun, Jul 7, 2024 at 11:12 PM Chris Green via Python-list
 wrote:
>
> I have a Raspberry Pi in my boat that uses I2C to read a number of
> voltages and currents (using ADS1115 A2D) so I can monitor the battery
> condition etc.
>
> At present various different scripts (i.e. processes) just read the
> values using the I2C bus whenever they need to but I'm pretty sure
> this (quite rarely) results in false readings because two processes
> try to read at the same time.
>
> Thus I'm looking for ways to prevent simultaneous access.
>
> One fairly obvious way is to have single process/script which reads
> the A2D values continuously and writes them to a file.  All other
> scripts then read from the file as needed, a simple file lock can then
> be used to prevent simultaneous access (well, simultaneous access when
> the writing process is writing).
>
> Is this the simplest approach?  Are there better ways using
> multiprocess?  (They look more complicated though).
>
> The I2C bus itself has a mutex but I don't think this guarantees that
> (for example) an A2D reading is atomic because one reading takes more
> than one I2C bus access.
>
> Would a mutex of some sort around each I2C transaction (i.e. complete
> A2D reading) be a better way to go?
>
> --
> Chris Green
> ·
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: troglodytes

2024-08-14 Thread Left Right via Python-list
Hahah, as someone with extensive experience of being banned by various
CoC-waving Python online communities' authorities I really enjoyed the
saga. Watching little men grasp for power on the Web to squash their
opponents never ceases to amuse me.

On Tue, Aug 13, 2024 at 4:56 PM Michael Torrie via Python-list
 wrote:
>
> On 8/13/24 3:24 AM, Robin Becker via Python-list wrote:
> > I am clearly one of the troglodytes referred to in recent discussions 
> > around the PSF. I've been around in python land
> > for far too long, my eyesight fails etc etc.
> >
> > I feel strongly that a miscarriage of justice has been made in the 3-month 
> > banning of a famous python developer from
> > some areas of discourse.
> >
> > I have had my share of disagreements with others in the past and have been 
> > sometimes violent or disrespectful in emails.
> >
> > I might have been in the kill list of some, but never banned from any 
> > mailing lists.
> >
> > Honest dialogue is much better than imposed silence.
> >
> > -- grumblingly-yrs --
> > Robin Becker
>
> Agreed.  Here's a good summary of the issue:
> https://chrismcdonough.substack.com/p/the-shameful-defenestration-of-tim
>
> The PSF has really screwed this up.  Really embarrassing, frankly.  And
> sad.
>
>
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: troglodytes

2024-08-14 Thread Left Right via Python-list
> Why do you have to belittle other people?

Who says I have to? I like to! I like to see people driven by all
sorts of low and reprehensible motives being punished for it. I don't
know if I need to explain this motivation further. I think it's a very
natural feeling. Human nature if you will. The primordial sense of
justice, that was later developed into a bunch of different theories
of how justice might work.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: bring back nntp library to python3

2024-08-14 Thread Left Right via Python-list
> it became simple and straightforward to
> download and install packages.

I think the right word for this is "delusional". But people get
offended when other people use the right words. Instead they want a
grotesque round-about way of saying the same thing...

So, the grotesque round-about way of saying this, if you are still
reading that is... Even if pip, setuptools and friends worked well
(which they don't) there are big problems that these tools cannot
solve:

* Network partition
* Version mismatch
* Competition between different installer tools
* Increased requirement for vetting and validation
* Shortening shelf life of existing projects

But hey, this is just letting off steam.  Nobody cares.  The decision
was already made and it won't be unmade.  And, in the grand scheme of
things this is a drop in a bucket of the awful decisions that were
guiding Python in the last decade or so.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Best Practice Virtual Environment

2024-10-08 Thread Left Right via Python-list
Hi.  The advice here is from a perspective of someone who does this
professionally, for large, highly loaded systems.  This doesn't
necessarily apply to your case / not to the full extent.

> Debian (or even Python3 itself) doesn't allow to pip install required 
> packages system wide, so I have to use virtual environments even there. But 
> is it right, that I have to do that for every single user?

1. Yes, you can install packages system-wide with pip, but you don't need to.

2. pip is OK for installing requirements once, to figure out what they
are (in a dev environment).  It's bad for a production environment:
it's slow, inconsistent, and insecure. For context: pip's dependency
resolution is especially slow when installing local interdependent
packages; sometimes it can take up to a minute per package.
Inconsistency comes from pip not verifying package checksums and
signatures by default: if a package was re-uploaded without a version
bump, pip will treat it as the same package.  On top of that, for some
packages pip has to resort to building from source, in which case
nobody can guarantee the end result.  Insecurity comes from pip
allowing out-of-index package downloads during install: you can
distribute your package through PyPI while one of its dependencies
points at a random Web site in a country with very permissive laws
(and, essentially, just puts malware on your computer).  It's
impossible to properly audit such setups because the outside Web site
doesn't have to provide any security guarantees.


To package anything Linux-related, use the packaging mechanism
provided by the flavor of Linux you are using.  In the case of Debian,
use DEB. Don't use virtual environments for this (it's possible to
roll the entire virtual environment into a DEB package, but that's a
bad idea). The reason to do this is so that your package plays nice
with other Python packages shipped as DEB packages. It gives your
users a consistent interface for installing packages, and it avoids
the situation where an out-of-band tool has already installed files
into the same path where dpkg later tries to install the same files
from a legitimate package.  If you package the whole
virtual environment, you might run into problems with locating native
libraries linked from Python native modules.  You will make it hard to
audit the installation, especially when it comes to certificates, TLS
etc. stuff that, preferably, should be handled in a centralized way by
the OS.

Of course, countless times I've seen developers do the exact opposite
of what I'm suggesting here. Also, the big actors in the industry s.a.
Microsoft and Amazon do the exact opposite of what I suggest. I have
no problem acknowledging this and still maintaining that they are
wrong and I'm right :) But, you don't have to trust me!
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python crash together with threads

2024-10-03 Thread Left Right via Python-list
> whereas I am quite sure that program flows do not overlap.

You can never be sure of this in Python. Virtually all objects in
Python are allocated on the heap, so instantiating a float, doing
simple arithmetic etc. -- all of this requires synchronization,
because it allocates memory from a shared pool.

The documentation of _PyThreadState_GET states that callers must hold
the GIL. Does your code do that? A thread that was created outside of
Python (e.g. a QThread) has to call PyGILState_Ensure() before it
touches the C API, and PyGILState_Release() when it is done. It's not
possible to divine from the stack trace whether you do that, but you'd
probably know.

On Wed, Oct 2, 2024 at 3:29 PM Guenther Sohler via Python-list
 wrote:
>
> My  Software project  is working fine in most of the cases
> (www.pythonscad.org)
> however I am right now isolating a scenario, which makes it crash
> permanently.
>
> It does not happen with Python 3.11.6 (and possibly below), it happens with
> 3.12 and above
> It does not happen when not using Threads.
>
> However due to the architecture of the program I am forced to evaluate some
> parts in main thread and some parts in a dedicated Thread. The Thread is
> started with QThread(QT 5.0)
> whereas I am quite sure that program flows do not overlap.
>
> When I just execute my 1st very simple Python function inside the newly
> created thread, like:
>
>  PyObject *a = PyFloat_FromDouble(3.3);
>
> my program crashes with this Stack trace
>
> 0  0x7f6837fe000f in _PyInterpreterState_GET () at
> ./Include/internal/pycore_pystate.h:179
> #1  get_float_state () at Objects/floatobject.c:38
> #2  PyFloat_FromDouble (fval=3.2998) at
> Objects/floatobject.c:136
> #3  0x015a021f in python_testfunc() ()
> #4  0x01433301 in CGALWorker::work() ()
> #5  0x00457135 in CGALWorker::qt_static_metacall(QObject*,
> QMetaObject::Call, int, void**) ()
> #6  0x7f68364d0f9f in void doActivate(QObject*, int, void**) ()
> at /lib64/libQt5Core.so.5
> #7  0x7f68362e66ee in QThread::started(QThread::QPrivateSignal) () at
> /lib64/libQt5Core.so.5
> #8  0x7f68362e89c4 in QThreadPrivate::start(void*) () at
> /lib64/libQt5Core.so.5
> #9  0x7f6835cae19d in start_thread () at /lib64/libc.so.6
> #10 0x7f6835d2fc60 in clone3 () at /lib64/libc.so.6
>
>
> I suspect, that this is a Null pointer here
>See also _PyInterpreterState_Get()
>and _PyGILState_GetInterpreterStateUnsafe(). */
> static inline PyInterpreterState* _PyInterpreterState_GET(void) {
> PyThreadState *tstate = _PyThreadState_GET();
> #ifdef Py_DEBUG
> _Py_EnsureTstateNotNULL(tstate);
> #endif
> # <<--- suspect state is nullpointer
> return tstate->interp;
> }
>
> any clues , whats going on here, and how I can mitigate that ?
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: shutil.rmtree() fails when used in Fedora (rpm) "mock" environment

2024-10-24 Thread Left Right via Python-list
From reading the code where the exception comes from, this is how I
interpret the author's intention: they build a plain list used as a
stack, whose elements are 4-tuples.  The important part is that the
first element of each tuple is the operation rmtree() should perform
next, and it has to be one of a few known filesystem functions
(os.lstat, os.close or os.rmdir).  The code raising the exception
checks that it is one of those and, if it isn't, crashes.

There is, however, a problem with testing equality (identity, strictly
speaking) between functions: the os.lstat seen at comparison time may
not be the same object as the one that was pushed onto the stack, e.g.
if the "os" module was somehow loaded twice.  I'm not sure whether
that's a real possibility with how Python works... but maybe in some
cases, like multithreaded environments, it could happen...

To investigate this, I'd edit the file with the assertion and make it
print the actual value found in os.lstat and func.  My guess is that
they are both somehow "lstat", but with different memory addresses.
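
For example, something like this in place of the bare assert (just a
throwaway debugging patch to the local copy of shutil.py):

if func is not os.lstat:
    raise AssertionError(
        "func=%r (id=%#x) vs os.lstat=%r (id=%#x)"
        % (func, id(func), os.lstat, id(os.lstat)))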

On Thu, Oct 24, 2024 at 4:06 PM Christian Buhtz via Python-list
 wrote:
>
> Hello,
> I am upstream maintainer of "Back In Time" [1] investigating an issue a
> distro maintainer from Fedora reported [2] to me.
>
> On one hand Fedora seems to use a tool called "mock" to build packages
> in a chroot environment.
> On the other hand the test suite of "Back In Time" does read and write
> to the real file system.
> One test fails because a temporary directory is cleaned up using
> shutil.rmtree(). Please see the output below.
>
> I am not familiar with Fedora and "mock". So I am not able to reproduce
> this on my own.
> It seems the Fedora maintainer also has no clue how to solve it or why
> it happens.
>
> Can you please have a look (especially at the line "assert func is
> os.lstat").
> Maybe you have an idea what is the intention behind this error raised by
> an "assert" statement inside "shutil.rmtree()".
>
> Thanks in advance,
> Christian Buhtz
>
> [1] -- 
> [2] -- 
>
> __ General.test_ctor_defaults
> __
> self = 
>  def test_ctor_defaults(self):
>  """Default values in constructor."""
> >   with TemporaryDirectory(prefix='bit.') as temp_name:
> test/test_uniquenessset.py:47:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> _ _ _ _
> /usr/lib64/python3.13/tempfile.py:946: in __exit__
>  self.cleanup()
> /usr/lib64/python3.13/tempfile.py:950: in cleanup
>  self._rmtree(self.name, ignore_errors=self._ignore_cleanup_errors)
> /usr/lib64/python3.13/tempfile.py:930: in _rmtree
>  _shutil.rmtree(name, onexc=onexc)
> /usr/lib64/python3.13/shutil.py:763: in rmtree
>  _rmtree_safe_fd(stack, onexc)
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> _ _ _ _
> stack = []
> onexc = .onexc at
> 0xb39bc860>
>  def _rmtree_safe_fd(stack, onexc):
>  # Each stack item has four elements:
>  # * func: The first operation to perform: os.lstat, os.close or
> os.rmdir.
>  #   Walking a directory starts with an os.lstat() to detect
> symlinks; in
>  #   this case, func is updated before subsequent operations and
> passed to
>  #   onexc() if an error occurs.
>  # * dirfd: Open file descriptor, or None if we're processing the
> top-level
>  #   directory given to rmtree() and the user didn't supply
> dir_fd.
>  # * path: Path of file to operate upon. This is passed to
> onexc() if an
>  #   error occurs.
>  # * orig_entry: os.DirEntry, or None if we're processing the
> top-level
>  #   directory given to rmtree(). We used the cached stat() of
> the entry to
>  #   save a call to os.lstat() when walking subdirectories.
>  func, dirfd, path, orig_entry = stack.pop()
>  name = path if orig_entry is None else orig_entry.name
>  try:
>  if func is os.close:
>  os.close(dirfd)
>  return
>  if func is os.rmdir:
>  os.rmdir(name, dir_fd=dirfd)
>  return
>
>  # Note: To guard against symlink races, we use the standard
>  # lstat()/open()/fstat() trick.
> >   assert func is os.lstat
> E   AssertionError
> /usr/lib64/python3.13/shutil.py:663: AssertionError
>
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: shutil.rmtree() fails when used in Fedora (rpm) "mock" environment

2024-10-24 Thread Left Right via Python-list
> The stack is created on line 760 with os.lstat and entries are appended
> on lines 677 (os.rmdir), 679 (os.close) and 689 (os.lstat).
>
> 'func' is popped off the stack on line 651 and check in the following lines.
>
> I can't see anywhere else where something else is put onto the stack or
> an entry is replaced.

But how do you know this code isn't executed from different threads?
What I suspect is that the "os" module ends up imported twice, so
there are two references to "os.lstat".  Normally this wouldn't cause
a problem, because the two functions behave identically and carry no
state, but once you compare them the identity test fails, because they
are distinct objects loaded at different memory locations.

I don't know of any specific mechanism for forcing the interpreter to
import the same module multiple times, but if that was possible (which
in principle it is), then it would explain the behavior.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Printing UTF-8 mail to terminal

2024-10-31 Thread Left Right via Python-list
There's quite a lot of misuse of terminology around terminal / console
/ shell.  Please, correct me if I'm wrong, but it looks like you are
printing that on MS Windows, right?  MS Windows doesn't have or use
terminals (that's more of a Unix-related concept). And, by "terminal"
I mean terminal emulator (i.e. a program that emulates the behavior of
a physical terminal). You can, of course, find some terminal programs
for windows (eg. mintty), but I doubt that that's what you are dealing
with.

What MS Windows users usually end up using is the console.  If you
run, eg. cmd.exe, it will create a process that displays a graphical
console.  The console uses an encoding scheme to represent the text
output.  I believe that the default on MS Windows is to use some
single-byte encoding. This answer from SE family site tells you how to
set the console encoding to UTF-8 permanently:
https://superuser.com/questions/269818/change-default-code-page-of-windows-console-to-utf-8
, which, I believe, will solve your problem with how the text is
displayed.

On Thu, Oct 31, 2024 at 5:19 PM Loris Bennett via Python-list
 wrote:
>
> Hi,
>
> I have a command-line program which creates an email containing German
> umlauts.  On receiving the mail, my mail client displays the subject and
> body correctly:
>
>   Subject: Übung
>
>   Sehr geehrter Herr Dr. Bennett,
>
>   Dies ist eine Übung.
>
> So far, so good.  However, when I use the --verbose option to print
> the mail to the terminal via
>
>   if args.verbose:
>   print(mail)
>
> I get:
>
>   Subject: Übungsbetreff
>
>   Sehr geehrter Herr Dr. Bennett,
>
>   Dies ist eine =C3=9Cbung.
>
> What do I need to do to prevent the body from getting mangled?
>
> I seem to remember that I had issues in the past with a Perl version of
> a similar program.  As far as I recall there was an issue with fact the
> greeting is generated by querying a server, whereas the body is being
> read from a file, which lead to oddities when the two bits were
> concatenated.  But that might just have been a Perl thing.
>
> Cheers,
>
> Loris
>
> --
> This signature is currently under constuction.
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Printing UTF-8 mail to terminal

2024-11-01 Thread Left Right via Python-list
> Windows does now. They implemented this feature over the last few years.
> Indeed they took inspiration from how linux does this.
>
> You might find https://devblogs.microsoft.com/commandline/ has interesting 
> articles about this.

I don't have MS Windows. My wife does, but I don't want to bother her
with this kind of testing. Does this Windows Terminal support the use
of programs like tmux? Last time I searched for a way to run tmux on
Windows, the best I could find was mintty.  None of the "native" MS
"solutions" could do it.

Anyways, the OP said they were using an actual terminal (emulator) on
Ubuntu, and it looks like their problem is more about extracting
information from the email message than about the terminal's
capabilities. Also, it looks like there was already an answer
regarding message.get_body().
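
For reference, a minimal sketch of that approach (assuming the message
is parsed with the modern email.policy.default; raw_bytes is a
placeholder for wherever the message bytes come from):

from email import message_from_bytes, policy

msg = message_from_bytes(raw_bytes, policy=policy.default)
print(msg["Subject"])
body = msg.get_body(preferencelist=("plain",))
print(body.get_content())   # decodes quoted-printable/base64 and the charset
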
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: FileNotFoundError thrown due to file name in file, rather than file itself

2024-11-11 Thread Left Right via Python-list
Poor error reporting is a very common problem in programming, and
Python is nothing special in this regard.  Of course, it would have
been better if the error message said which file wasn't found.  But
these problems usually stack, like in your code.  Unfortunately, it's
your duty, as the language user, to anticipate them and act
accordingly.  Now you've learned that the file you believed to be the
source of the error isn't the only candidate -- so adjust your code to
differentiate between those two (and potentially other) cases.  There's
very little else you can do besides that.
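
For example, one way to tell the two cases apart (untested sketch; the
paths are the ones from your post):

import configparser, logging.config, sys

config_file = "/usr/local/etc/sc_mailer"     # path from the original post

try:
    with open(config_file):                  # is the config itself there?
        pass
except FileNotFoundError:
    print(f"Error: configuration file {config_file} not found. Exiting.")
    sys.exit(1)

try:
    logging.config.fileConfig(config_file)   # may fail on the log file inside
except FileNotFoundError as exc:
    print(f"Error: {exc.filename} (referenced by {config_file}) "
          "cannot be opened. Exiting.")
    sys.exit(1)

config = configparser.ConfigParser()
config.read(config_file)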

NB. At the system level, the error carries no information about which
file wasn't found: the failing call simply returns a numeric code (the
famous ENOENT).  Python does attach the offending path to the
exception (the .filename attribute you used to find the culprit), but
a hand-written message like yours has to include it explicitly.
That's why I never print the vanilla FileNotFoundError as-is; I always
re-raise it with a customized version that incorporates the missing
file's path in the error message.

NB2. It's always a bad idea to print logs to files.  Any sysadmin /
ops / infra person worth their salt will tell you that.  The only
place the logs should go to is the standard error.  There are true and
tried tools that can pick up logs from that point on, and do with them
whatever your heart desires.  That is, of course, unless you are
creating system tools for universal log management (in which case, I'd
question the choice of Python as a suitable language for such a task).
Unfortunately, even though this has been common knowledge for decades,
it's still elusive in the world of application development :|

On Mon, Nov 11, 2024 at 4:00 PM Loris Bennett via Python-list
 wrote:
>
> Hi,
>
> I have the following in my program:
>
> try:
> logging.config.fileConfig(args.config_file)
> config = configparser.ConfigParser()
> config.read(args.config_file)
> if args.verbose:
> print(f"Configuration file: {args.config_file}")
> except FileNotFoundError:
> print(f"Error: configuration file {args.config_file} not found.  
> Exiting.")
> sys.exit(0)
>
> and when I ran the program I got the error
>
>   Error: configuration file /usr/local/etc/sc_mailer not found.  Exiting.
>
> However, this file *does* exist and *can* be read.  By checking the
> 'filename' attribute of the exception I discovered that the problem was
> the log file defined *in* the config file, namely
>
>   [handler_fileHandler]
>   class=FileHandler
>   level=DEBUG
>   formatter=defaultFormatter
>   args=('/var/log/my_prog.log', 'a')
>
> This log file did not exist.  The exception is thrown by
>
>   logging.config.fileConfig(args.config_file)
>
> My questions are:
>
> 1. Should I be surprised by this behaviour?
> 2. In terms of generating a helpful error message, how should one
>distinguish between the config file not existing and the log file not
>existing?
>
> Cheers,
>
> Loris
>
> --
> This signature is currently under constuction.
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: shutil.rmtree() fails when used in Fedora (rpm) "mock" environment

2024-10-24 Thread Left Right via Python-list
> > > The stack is created on line 760 with os.lstat and entries are appended
> > > on lines 677 (os.rmdir), 679 (os.close) and 689 (os.lstat).
> > >
> > > 'func' is popped off the stack on line 651 and check in the following 
> > > lines.
> > >
> > > I can't see anywhere else where something else is put onto the stack or
> > > an entry is replaced.

But _rmtree_safe_fd() compares func to a *dynamically* resolved
reference: os.lstat. If the reference to os changed (or the os module
object was given a new lstat attribute) between the time os.lstat was
pushed onto the stack and the time of the comparison, the comparison
would fail.  To illustrate the idea:

os.lstat = lambda x: x # thread 1
stack.append((os.lstat, ...)) # thread 1
os.lstat = lambda x: x # thread 2
func, *_ = stack.pop() # thread 1
assert func is os.lstat # thread 1 (failure!)

The only question is: is it possible to modify os.lstat like that, and
if so, how?

Other alternatives include a malfunctioning "is" operator,
malfunctioning module cache... all those are a lot less likely.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: shutil.rmtree() fails when used in Fedora (rpm) "mock" environment

2024-10-24 Thread Left Right via Python-list
> What is the probability of replacing os.lstat, os.close or os.rmdir from
> another thread at just the right time?

If the thread does "import os", and its start is logically connected
to calling _rmtree_safe_fd(), I'd say there's a very good chance! That
is, again, granted that the reference to os.lstat *can* be modified in
this way.

But, before we keep guessing any further, it'd be best if OP could get
us the info on what's stored in "func" and "os.lstat" at the time the
assertion fails.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: FileNotFoundError thrown due to file name in file, rather than file itself

2024-11-12 Thread Left Right via Python-list
> I am not entirely convinced by NB2.  I am, in fact, a sort of sysadmin
> person and most of my programs write to a log file.  The programs are
> also moderately complex, so a single program might access a database,
> query an LDAP server, send email etc., so potentially quite a lot can go
> wrong.  They are also not programs whose output I would pipe to another
> command.  What would be the advantage of logging to stderr?  Quite apart
> from that, I find having a log file a useful for debugging when I am
> developing.

First, the problem with writing to files is that there is no way to
make these logs reliable.  This is what I mean by saying these are
unreliable: since logs are designed to grow indefinitely, the natural
response to this design property is log rotation.  But, it's
impossible to reliably rotate a log file.  There's always a chance
that during the rotation some log entries will be written to the file
past the point of rotation, but prior to the point where the next logs
volume starts.

There are similar reliability problems with writing to Unix or
Internet sockets, databases etc.  For different reasons, but at the
end of the day, whoever wants logs, they want them to be reliable.
Both simplicity and convention selected for stderr as the only and the
best source of logging output.

Programs that write their output to log files will always irritate
their users because users will have to do some detective work to
figure out where those files are, and in some cases they will have to
do administrative work to make sure that the location where the
program wants to store the log files is accessible, has enough free
space, is speedy enough etc.  So, from the ops perspective, whenever I
come across a program that tries to write logs to anything other than
stderr, I make an earnest effort to throw that program into the gutter
and never touch it again.  It's too much headache to babysit every
such program, to remember the location of the log files of every such
program, the required permissions, to provision storage.  If you are
in that line of work, you just want all logs to go to the same place
(journal), where you can later filter / aggregate / correlate and
perform other BI tasks as your heart desires.

Of course, if you only administer your own computer, and you have low
single digits programs to run, and their behavior doesn't change
frequently, and you don't care to drop some records every now and
then... it's OK to log to files directly from a program.  But then you
aren't really in the sysadmin / infra / ops category, as you are more
of a hobby enthusiast.

Finally, if you want your logs to end up in a file while the program
only writes to stderr, your shell gives you a really simple way of
redirecting stderr to a file (e.g. "my_prog 2>>my_prog.log").  So,
really, there's no excuse for a program to write its logs to files
directly.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: How to stop a specific thread in Python 2.7?

2024-09-26 Thread Left Right via Python-list
That's one of the "disadvantages" of threads: you cannot safely kill a
thread. Of course you could try, but that's never a good idea. The
reason is that threads share memory: a killed thread might be holding
locks that will never be unlocked, or might have (partially) modified
shared state in a way that makes it unusable to the other threads.

So... if you really want to kill a thread, I'm sorry to say this: you
will have to bring down the whole process.  That's not
Python-specific; it's just how threads are designed.  What you *can*
do is ask a thread to stop itself (see the sketch below).
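
A minimal sketch of that cooperative approach (note that an Event is
just a flag -- set() doesn't kill anything, it only affects threads
that actually check it):

import threading
import time

def worker(stop_event):
    while not stop_event.is_set():   # the thread checks its own flag
        time.sleep(0.1)              # ... one small unit of work ...

stop1 = threading.Event()            # one Event per thread
thread1 = threading.Thread(target=worker, args=(stop1,))
thread1.start()

stop1.set()                          # asks only thread1 to finish
thread1.join()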

On Wed, Sep 25, 2024 at 7:26 PM marc nicole via Python-list
 wrote:
>
> Hello guys,
>
> I want to know how to kill a specific running thread (say by its id)
>
> for now I run and kill a thread like the following:
> # start thread
> thread1 = threading.Thread(target= self.some_func(), args=( ...,), )
> thread1.start()
> # kill the thread
> event_thread1 = threading.Event()
> event_thread1.set()
>
> I know that set() will kill all running threads, but if there was thread2
> as well and I want to kill only thread1?
>
> Thanks!
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API

2024-09-30 Thread Left Right via Python-list
Whether and to what degree you can stream JSON depends on the
structure of the JSON. In the general case JSON cannot be streamed,
although in practice it usually can be.

Imagine a pathological case of this shape: 1... <60GB of digits>. This
is still a valid JSON (it doesn't have any limits on how many digits a
number can have). And you cannot parse this number in a streaming way
because in order to do that, you need to start with the least
significant digit.

Typically, however, JSON can be parsed incrementally. The format is
conceptually very simple to write a parser for, and there are plenty
of parsers that do it incrementally, for example this one:
https://pypi.org/project/json-stream/ . But I'd encourage you to write
one yourself.  It's fun, and the resulting parser should end up under
some 50 LoC.  It also lets you tailor the output more closely to what
you need.
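
For illustration, a rough sketch of the "existing incremental parser"
route (untested; the endpoint URL, the top-level key and process() are
placeholders, and json-stream plus requests are assumed to be
installed):

import gzip
import requests
import json_stream

ENDPOINT = "https://api.example.com/vulnerabilities/export"   # placeholder

def process(item):
    pass   # placeholder for the real per-record handling

with requests.get(ENDPOINT, stream=True) as resp:
    resp.raise_for_status()
    # resp.raw yields the compressed bytes; GzipFile decompresses lazily
    with gzip.GzipFile(fileobj=resp.raw) as stream:
        data = json_stream.load(stream)          # lazy, forward-only view
        for item in data["vulnerabilities"]:     # placeholder key
            process(item)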

On Mon, Sep 30, 2024 at 8:44 AM Asif Ali Hirekumbi via Python-list
 wrote:
>
> Thanks Abdur Rahmaan.
> I will give it a try !
>
> Thanks
> Asif
>
> On Mon, Sep 30, 2024 at 11:19 AM Abdur-Rahmaan Janhangeer <
> arj.pyt...@gmail.com> wrote:
>
> > Idk if you tried Polars, but it seems to work well with JSON data
> >
> > import polars as pl
> > pl.read_json("file.json")
> >
> > Kind Regards,
> >
> > Abdur-Rahmaan Janhangeer
> > about  | blog
> > 
> > github 
> > Mauritius
> >
> >
> > On Mon, Sep 30, 2024 at 8:00 AM Asif Ali Hirekumbi via Python-list <
> > python-list@python.org> wrote:
> >
> >> Dear Python Experts,
> >>
> >> I am working with the Kenna Application's API to retrieve vulnerability
> >> data. The API endpoint provides a single, massive JSON file in gzip
> >> format,
> >> approximately 60 GB in size. Handling such a large dataset in one go is
> >> proving to be quite challenging, especially in terms of memory management.
> >>
> >> I am looking for guidance on how to efficiently stream this data and
> >> process it in chunks using Python. Specifically, I am wondering if there’s
> >> a way to use the requests library or any other libraries that would allow
> >> us to pull data from the API endpoint in a memory-efficient manner.
> >>
> >> Here are the relevant API endpoints from Kenna:
> >>
> >>- Kenna API Documentation
> >>
> >>- Kenna Vulnerabilities Export
> >>
> >>
> >> If anyone has experience with similar use cases or can offer any advice,
> >> it
> >> would be greatly appreciated.
> >>
> >> Thank you in advance for your help!
> >>
> >> Best regards
> >> Asif Ali
> >> --
> >> https://mail.python.org/mailman/listinfo/python-list
> >>
> >
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API

2024-10-01 Thread Left Right via Python-list
> What am I missing?  Handwavingly, start with the first digit, and as
> long as the next character is a digit, multipliy the accumulated result
> by 10 (or the appropriate base) and add the next value.  Oh, and handle
> scientific notation as a special case, and perhaps fail spectacularly
> instead of recovering gracefully in certain edge cases.  And in the
> pathological case of a single number with 60 billion digits, run out of
> memory (and complain loudly to the person who claimed that the file
> contained a "dataset").  But why do I need to start with the least
> significant digit?

You probably forgot that it has to be _streaming_. Suppose you parse
the first digit: can you hand this information over to an external
function to process the parsed data? -- No! because you don't know the
magnitude yet.  What about two digits? -- Same thing.  You cannot
leave the parser code until you know the magnitude (otherwise the
information is useless to the external code).

So, even if you have enough memory and don't care about special cases
like scientific notation: yes, you will be able to parse it, but it
won't be a streaming parser.

On Mon, Sep 30, 2024 at 9:30 PM Left Right  wrote:
>
> > Streaming won't work because the file is gzipped.  You have to receive
> > the whole thing before you can unzip it. Once unzipped it will be even
> > larger, and all in memory.
>
> GZip is specifically designed to be streamed.  So, that's not a
> problem (in principle), but you would need to have a streaming GZip
> parser, quick search in PyPI revealed this package:
> https://pypi.org/project/gzip-stream/ .
>
> On Mon, Sep 30, 2024 at 6:20 PM Thomas Passin via Python-list
>  wrote:
> >
> > On 9/30/2024 11:30 AM, Barry via Python-list wrote:
> > >
> > >
> > >> On 30 Sep 2024, at 06:52, Abdur-Rahmaan Janhangeer via Python-list 
> > >>  wrote:
> > >>
> > >>
> > >> import polars as pl
> > >> pl.read_json("file.json")
> > >>
> > >>
> > >
> > > This is not going to work unless the computer has a lot more the 60GiB of 
> > > RAM.
> > >
> > > As later suggested a streaming parser is required.
> >
> > Streaming won't work because the file is gzipped.  You have to receive
> > the whole thing before you can unzip it. Once unzipped it will be even
> > larger, and all in memory.
> > --
> > https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API

2024-09-30 Thread Left Right via Python-list
> Streaming won't work because the file is gzipped.  You have to receive
> the whole thing before you can unzip it. Once unzipped it will be even
> larger, and all in memory.

GZip is specifically designed to be streamed.  So, that's not a
problem (in principle), but you would need to have a streaming GZip
parser, quick search in PyPI revealed this package:
https://pypi.org/project/gzip-stream/ .

On Mon, Sep 30, 2024 at 6:20 PM Thomas Passin via Python-list
 wrote:
>
> On 9/30/2024 11:30 AM, Barry via Python-list wrote:
> >
> >
> >> On 30 Sep 2024, at 06:52, Abdur-Rahmaan Janhangeer via Python-list 
> >>  wrote:
> >>
> >>
> >> import polars as pl
> >> pl.read_json("file.json")
> >>
> >>
> >
> > This is not going to work unless the computer has a lot more the 60GiB of 
> > RAM.
> >
> > As later suggested a streaming parser is required.
>
> Streaming won't work because the file is gzipped.  You have to receive
> the whole thing before you can unzip it. Once unzipped it will be even
> larger, and all in memory.
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API

2024-10-01 Thread Left Right via Python-list
> If I recognize the first digit, then I *can* hand that over to an
> external function to accumulate the digits that follow.

And what is that external function going to do with this information?
The point is you didn't parse anything if you just sent the digit.
You just delegated the parsing further. Parsing is only meaningful if
you extracted some information, but your idea is, essentially "what if
I do nothing?".

> Under that constraint, I'm not sure I can parse anything.  How can I
parse a string (and hand it over to an external function) until I've
found the closing quote?

Nobody says that parsing a number is the only pathological case.  You,
however, exaggerate by saying you cannot parse _anything_. You can
parse booleans or null, for example.  There's no problem there.

Again, I think you misunderstand what streaming is for. Let me remind:
it's for processing information as it comes, potentially,
indefinitely. This has far more important implications than what you
find in computer science. For example, some mathematicians use the
same argument to show that real numbers are either fiction or useless:
consider adding two real numbers (where real numbers are potentially
infinite strings of decimal digits after the period) -- there's no way
to prove that such an addition is possible because you would need an
infinite proof for that (because you need to start adding from the
least significant digit).

In principle, any language that has infinite words will have the same
problem with streaming. If you ever pondered h/w or low-level
protocols s.a. SCSI or IP, you'd see that they are specifically
designed in such a way as to never have infinite words (because they
must be amenable to streaming). Consider also an interesting
consequence of SCSI not being able to have infinite words: this means,
besides other things that fsync() is nonsense! :) If you aren't
familiar with the concept: UNIX filesystem API suggests that it's
possible to destage arbitrary large file (or a chunk of file) to disk.
But SCSI is built of finite "words" and to describe an arbitrary large
file you'd need to list all the blocks that constitute the file!  And
that's why fsync() and family are so hated by people who deal with
storage: the only way to implement fsync() in compliance with the
standard is to sync _everything_ (and it hurts!)

On Tue, Oct 1, 2024 at 5:49 PM Dan Sommers via Python-list
 wrote:
>
> On 2024-09-30 at 21:34:07 +0200,
> Regarding "Re: Help with Streaming and Chunk Processing for Large JSON Data 
> (60 GB) from Kenna API,"
> Left Right via Python-list  wrote:
>
> > > What am I missing?  Handwavingly, start with the first digit, and as
> > > long as the next character is a digit, multipliy the accumulated result
> > > by 10 (or the appropriate base) and add the next value.  Oh, and handle
> > > scientific notation as a special case, and perhaps fail spectacularly
> > > instead of recovering gracefully in certain edge cases.  And in the
> > > pathological case of a single number with 60 billion digits, run out of
> > > memory (and complain loudly to the person who claimed that the file
> > > contained a "dataset").  But why do I need to start with the least
> > > significant digit?
> >
> > You probably forgot that it has to be _streaming_. Suppose you parse
> > the first digit: can you hand this information over to an external
> > function to process the parsed data? -- No! because you don't know the
> > magnitude yet.  What about two digits? -- Same thing.  You cannot
> > leave the parser code until you know the magnitude (otherwise the
> > information is useless to the external code).
>
> If I recognize the first digit, then I *can* hand that over to an
> external function to accumulate the digits that follow.
>
> > So, even if you have enough memory and don't care about special cases
> > like scientific notation: yes, you will be able to parse it, but it
> > won't be a streaming parser.
>
> Under that constraint, I'm not sure I can parse anything.  How can I
> parse a string (and hand it over to an external function) until I've
> found the closing quote?
>
> How much state can a parser maintain (before it invokes an external
> function) and still be considered streaming?  I fear that we may be
> getting hung up on terminology rather than solving the problem at hand.
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API

2024-10-02 Thread Left Right via Python-list
> One single IP packet is all you can parse.

I worked for an undisclosed company which manufactures h/w for ISPs
(4- and 8-unit boxes you mount on a rack in a datacenter).
Essentially, big-big routers.  So, I had the pleasure of writing
software that parses IP _protocol_, and let me tell you: you have no
idea what you just wrote.

But, like I wrote earlier: you don't understand the distinction
between languages and words.  And in general, are just being stubborn
and rude because you are trying to prove a point to someone you don't
like, but, in reality, you just look more and more ridiculous.

On Thu, Oct 3, 2024 at 12:51 AM Chris Angelico  wrote:
>
> On Thu, 3 Oct 2024 at 08:48, Left Right  wrote:
> >
> > > You can't validate an IP packet without having all of it. Your notion
> > > of "streaming" is nonsensical.
> >
> > Whoa, whoa, hold your horses! "nonsensical" needs a little bit of
> > justification :)
> >
> > It seems you don't understand the difference between words and
> > languages! In my examples, IP _protocol_ is the language, sequences of
> > IP packets are the words in the language. A language is amenable to
> > streaming if its words are repetitions of fixed-length sequences of
> > symbols of the alphabet.  This is, essentially, like saying that the
> > words themselves are regular.
>
> One single IP packet is all you can parse. You're playing shenanigans
> with words the way Humpty Dumpty does. IP packets are not sequences,
> they are individuals.
>
> ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API

2024-10-02 Thread Left Right via Python-list
> You can't validate an IP packet without having all of it. Your notion
> of "streaming" is nonsensical.

Whoa, whoa, hold your horses! "nonsensical" needs a little bit of
justification :)

It seems you don't understand the difference between words and
languages! In my examples, IP _protocol_ is the language, sequences of
IP packets are the words in the language. A language is amenable to
streaming if its words are repetitions of fixed-length sequences of
symbols of the alphabet.  This is, essentially, like saying that the
words themselves are regular.
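
To make this concrete, here is a minimal sketch (my own illustration,
with an assumed record size) of a consumer of such a language: a
stream of fixed-length records can be processed one record at a time,
with bounded memory:

RECORD_SIZE = 512   # assumed, fixed "word" length

def iter_records(stream):
    """Yield complete fixed-size records from a blocking binary stream."""
    while True:
        record = stream.read(RECORD_SIZE)
        if len(record) < RECORD_SIZE:
            return          # end of stream (possibly a truncated tail)
        yield record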

So, the follow-up question from you to me should be: how come strictly
context-free languages can still be parsed with streaming parsers?
The answer is that it's possible to approximate context-free languages
with regular languages.  In fact, this is a very interesting subject,
which unfortunately is usually overlooked in automata classes.  It's
interesting in the sense that it's very accessible to students who
have already mastered the regular and context-free formalisms.

So, streaming parsers (e.g. SAX) are written for a regular language
that approximates XML.  This works because in practice we will almost
never encounter more than N nesting levels in an XML document, more
than N characters in an element name, etc. (for some large enough N),
which is what allows us to carve a regular language out of a
context-free one.
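
As an illustration -- a minimal sketch using the standard library's
SAX interface, with a made-up file name -- the handler only ever sees
one event at a time and keeps whatever bounded state it chooses to
(here, the nesting depth):

import xml.sax

class DepthHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0

    def startElement(self, name, attrs):
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def endElement(self, name):
        self.depth -= 1

handler = DepthHandler()
xml.sax.parse("huge.xml", handler)   # reads and parses incrementally
print("deepest nesting seen:", handler.max_depth)

Nothing in the handler grows with the size of the document, which is
exactly the property the regular approximation buys you.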

NB. "Nonsensical" has a very precise meaning, when it comes to
discussing the truth value of a proposition, which I think you also
somehow didn't know about.  You seem to use "nonsensical" as a synonym
to "wrong".  But, unbeknownst to you, you said something else.  You
actually implied that there's no way to tell if my notion of streaming
is correct or not.

But, for the future reference: my notion of streaming is correct, and
you would do better learning some materials about it before jumping to
conclusions.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API

2024-10-02 Thread Left Right via Python-list
> By that definition of "streaming", no parser can ever be streaming,
> because there will be some constructs that must be read in their
> entirety before a suitably-structured piece of output can be
> emitted.

In the same email you replied to, I gave examples of languages for
which parsers can be streaming (in general): SCSI or IP. For some
languages (eg. everything in the context-free family) streaming
parsers are _in general_ impossible, because there are pathological
cases like the one with parsing numbers. But this doesn't mean that
you cannot come up with a parser that is only useful _sometimes_.
And, in practice, languages like XML or JSON do well with streaming,
even though in general it's impossible.
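
For example, here is a rough sketch of mine (standard library only,
and assuming the input is a sequence of concatenated JSON documents,
e.g. one object per record, rather than one giant array) that streams
top-level JSON values out of chunked input.  Note that the
pathological case is still there: a single enormous number or string
is buffered until it is complete.

import json

def iter_json_values(chunks):
    """Yield complete top-level JSON values as they become parseable."""
    decoder = json.JSONDecoder()
    buf = ""
    for chunk in chunks:
        buf += chunk
        while True:
            buf = buf.lstrip()
            if not buf:
                break
            try:
                value, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                break       # value still incomplete: wait for more input
            yield value
            buf = buf[end:]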

I'm sorry if this comes as a surprise.  On the one hand, I don't want
to sound condescending; on the other hand, this is something that
you'd typically study in an automata theory class.  Well, not exactly
in the very same words, but you should be able to figure this stuff
out if you had taken that class.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: FileNotFoundError thrown due to file name in file, rather than file itself

2024-11-13 Thread Left Right via Python-list
> On any Unix system this is untrue.  Rotating a log file is quite simple:

I realized I posted this without cc'ing the list:
http://jdebp.info/FGA/do-not-use-logrotate.html .

The link above gives a more detailed description of why log rotation
on Unix systems is not only not simple but is, in fact, unreliable.

NB. Also, it really rubs me the wrong way when the word "standard" is
used to mean "common" (instead of "as described in a standards
document").  And when it comes to popular tools, "common" is often
wrong, because the tool is commonly used by amateurs rather than
experts.  In other words, you only reinforced what I wrote initially:
plenty of application developers don't know how to do logging well.
It also appears that they would lecture infra / ops people on how to
do something they aren't experts on, while the latter are :)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: subprocess.Popen does not launch audacity

2025-01-10 Thread Left Right via Python-list
I just tried this:

>>> import subprocess
>>> subprocess.run('which audacity', shell=True)
/usr/bin/audacity
CompletedProcess(args='which audacity', returncode=0)
>>> proc = subprocess.Popen('/usr/bin/audacity',
...                         stdout=subprocess.PIPE, stderr=subprocess.PIPE,
...                         stdin=subprocess.PIPE, close_fds=True)
>>> proc.returncode
>>> proc.pid
53308
>>> proc.kill()
>>> proc.returncode
0
>>>

And I saw the interface of the program... So, in principle, what you
tried should work.

What happens if, in a separate terminal, you run:

$ ps auxwww | grep audacity

Are there any processes running?

If your script fails, what is the error?

If it doesn't, can you run this:

$ strace python ./audacity-test.py

where audacity-test.py looks like this:

import subprocess

proc = subprocess.Popen('/usr/local/bin/audacity',
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                        stdin=subprocess.PIPE, close_fds=True)
print(proc.returncode)
print(proc.pid)
proc.wait()
proc.kill()
print(proc.returncode)

Then, you should see something like:

clone(child_stack=NULL,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x7f377d83b750) = 53932
close(10)   = 0
close(8)= 0
close(6)= 0
close(3)= 0
read(9, "", 5)  = 0
close(9)= 0
write(1, "None\n", 5None
)   = 5
write(1, "53932\n", 653932
)  = 6
wait4(53932,

(the process id you are waiting for is going to be different of
course, but the important part is that you find the clone() call that
returns the process id your code is waiting on.)

And, if it doesn't look like the above, then show what *does* it look like.

On Fri, Jan 10, 2025 at 10:03 PM Tim Johnson via Python-list
 wrote:
>
>
> On 1/10/25 11:32, MRAB via Python-list wrote:
> >> ,,, snipped
>
> >> Below is the pertinent code:
> >>
> >>Popen(choice, stdout=PIPE, stderr=PIPE,
> >> stdin=PIPE, close_fds=True)
> >>
> >> My guess is my argument list is either insufficient or an argument is
> >> causing the problem, but am unsure of which.
> >>
> >> I have been retired from python programming for ten years, and am pretty
> >> rusty, but it is still fun. There are plenty
> >>
> >> of other ways to successfully launch audacity but it would be great to
> >> make it work from this script.
> >>
> >
> > What is the value of 'choice'?
> >
> > You could try printing out the value of 'choice' for one that works
> > and the one that doesn't and then try them again interactively from
> > the Python prompt with the given values. That should eliminate all but
> > the essential code for easier debugging.
>
> choice is /usr/local/bin/audacity, which is the correct path for
> audacity on my system. As far as I can see, that string has no hidden bytes.
>
> Invoking /usr/local/bin/audacity from the command line launches audacity
> and so does choosing  with dmenu_run. which -a audacity shows only that
> item.
>
> Maybe I need to isolate the function call and start stripping out
> parameters. I should have time to do that later today.
>
> Thanks
>
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Using pipx for packages as opposed to applications

2025-01-12 Thread Left Right via Python-list
What would be the intended use?  If this is for other Debian users,
then why not make a Debian package?  If it's for yourself, why do you
need to automate it?

To be fair, I don't see the point of tools like pipx.  I have never
used it and cannot imagine a scenario where I'd want to.  It seems
like there's always a better way to do whatever this tool claims to be
able to do...

Also, you say that you want it in its own environment: then what
difference does it make if it's on Debian or anywhere else?  If you
are distributing a library, it makes sense to incorporate it into the
user's infrastructure.  Either you do the integration, or let users
decide how to best integrate it.  If you provide them with the
environment that they *must* use, that's going to be the worst of both
worlds: users won't be able to use the library in the environment
created by them, nor will this library integrate with the other
libraries provided by the system.  So, it's hard to imagine why your
users would want that.

On Sun, Jan 12, 2025 at 12:47 AM Chris Green via Python-list
 wrote:
>
> Can one use pipx to wrap the process of creating an independent
> environment for a python package as opposed to a runnable application?
>
> E.g. I want to install and use pksheet but, as it's not available from
> the Debian repositories, I'll have to install it from PyPi.  So I
> should put it in its own environment. Can pipx help me with this?
>
> --
> Chris Green
> ·
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Strategies for avoiding having to use --break-system-packages with pip

2025-01-14 Thread Left Right via Python-list
I wouldn't trust pip to install anything into my system.  It's not a
reliable enough program that I'd recommend anyone use it for things
they might depend on.

My typical course of action is to create a virtual environment for the
package I need, install the package into that virtual environment
using pip or whatever the package wants to be installed with, then
investigate and refine the dependencies I need (it's very common in
the Python world to incorrectly specify dependencies, to require a lot
of unnecessary dependencies, or to depend on packages in the wrong
way).  After I've figured out what exactly I need to get the package
to work, I copy the dependencies together with the package to the
platlib directory of the Python I'm using for this task.
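
(For reference, and as a sketch rather than a recommendation of any
particular layout: the interpreter will tell you where its platlib is.)

import sysconfig
print(sysconfig.get_paths()["platlib"])   # e.g. .../lib/python3.X/site-packages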

If platlib happens to be in the system directories or anywhere else: I
wouldn't care about that. In the process of installing the program, I
would've learned about the nature and the layout of dependencies, so
that I could make informed decisions about what goes where and whether
anything is needed or is in danger of being affected by system updates
or endangers system updates.

So far, this has worked every time.  This worked for me personally,
for small companies that do internal deployments and for big companies
that distribute Python together with the entire Linux distribution in
this way.  The key ingredient is to know what you are doing and to be
able to anticipate the bad things. Tools like pip take away from the
users the need to know what they are doing in hopes of reducing
complexity on the part of the users' knowledge necessary to accomplish
program installation.  However, there's no free lunch, and tools like
pip bring an extra layer of complexity in trying to do what they
claim.  This layer of complexity is usually revealed when these tools
fail to do what they claim, and users are forced to learn what they
actually need to know and, on top of that, how to troubleshoot
programs like pip.

From working in infra / automation, I get the knowledge about program
/ package installation "for free" on the job, so I don't see a point in
using a tool that automates that for me beyond what I already
described. I'm probably also biased because of that, but I still think
that learning to do the thing is more important than learning how to
use the tool that does the thing.

On Tue, Jan 14, 2025 at 4:42 PM Chris Green via Python-list
 wrote:
>
> I have a (relatively) clean Debian 12 installation running on my two
> workhorse systems, a desktop server at home and my laptop that travels
> around with me.
>
> I moved from Xubuntu to Debian on both these systems a few months ago.
>
> I ran Xubuntu for many years and acquired a whole lot of python
> packages installed with pip, as root.  For the last couple of years I
> had to use the --break-system-packages option to get things installed.
>
> As far as I'm aware I never hit any dependency problems doing this.
> It's probably because things I installed with pip were mostly quite
> small, specialised, packages that I used in just one or two utility
> programs that I had written myself.  In quite a few cases these were
> realated to image processing and such things.
>
>
> So far I've managed to keep my Debian 12 installations 'pip free', I
> haven't even got pip installed.  However I may have just come across
> something that would at least be very useful and it comes from PyPi.
> (It's tkintertable if that's of any interest or relevance)
>
>
> What are my options?
>
> Just install it using pip as root and --break-system-packages,
> what's likely to break?
>
> Use a virtual environment, what do I have to do then to make using
> my program (that uses tkintertable) 'transparent', i.e. I just
> want to be able to run the program from the command prompt like
> any other program.
>
> Download tkintertable from git into my development environment and
> use that.  My PYTHONPATH will need to point to it but I can't see
> any further issues with doing this.
>
> Anything else?  As far as I can see using pipx doesn't help me at
> all (see recent thread here).
>
> --
> Chris Green
> ·
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Version of OpenSSl ?

2025-02-09 Thread Left Right via Python-list
HI Vincent.

You need the development headers for the OpenSSL library, not just the
compiled library.  On Ubuntu, the packages with headers are typically
named xxx-dev, where xxx is the package that provides the library.  I
don't have an Ubuntu box at hand right now, but try looking for
something like libssl-dev or openssl-dev.
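
As an aside, once a build does succeed you can check which OpenSSL the
interpreter itself was linked against -- it is not necessarily the same
library that the openssl command on $PATH reports:

import ssl
print(ssl.OPENSSL_VERSION)   # the OpenSSL this interpreter was built against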

On Sun, Feb 9, 2025 at 9:35 AM Vincent Vande Vyvre via Python-list
 wrote:
>
> Hi,
>
> Trying to compile Python-3.12.9 on Ubuntu-24.04
>
> The compilation is complete without errors but I have this message:
> 
> The necessary bits to build these optional modules were not found:
> _hashlib  _ssl  nis
> To find the necessary bits, look in configure.ac and config.log.
>
> Could not build the ssl module!
> Python requires a OpenSSL 1.1.1 or newer
> 
>
> But I have a more newer version:
>
> ---
> $ openssl version
> OpenSSL 3.0.13 30 Jan 2024 (Library: OpenSSL 3.0.13 30 Jan 2024)
> ---
>
> What can I do for that ?
>
> Vincent.
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: RE Version of OpenSSl ?

2025-02-09 Thread Left Right via Python-list
So, this is how I know where my SSL headers are found, for instance:

➜  cpython git:(3.12) gcc -I. -I./Include -H ./Modules/_ssl.c 2>&1 | grep evp.h
.. /usr/include/openssl/evp.h

(this was executed from the repository root).

Can you see if you get something similar?

Also... just as a sanity check: did you run ./configure? There's a lot
of twisted logic in there trying to find OpenSSL. And, unfortunately,
error reporting is very poor. (The error you are getting comes way,
way after everything bad has already happened and all knowledge of how
it happened is lost). Just all around atrocious error handling.


On Sun, Feb 9, 2025 at 5:51 PM Vincent Vande Vyvre via Python-list
 wrote:
>
> >
> > In case this helps you find the correct package to install:
> >
> > $ python3 -c "if True:
> > > import ssl
> > > print('Ok.')
> > > "
> > Ok.
> >
> > $ cat /etc/lsb-release
> > DISTRIB_ID=Ubuntu
> > DISTRIB_RELEASE=24.04
> > DISTRIB_CODENAME=noble
> > DISTRIB_DESCRIPTION="Ubuntu 24.04.1 LTS"
> >
> > $ apt list --installed | grep ssl
> >
> > WARNING: apt does not have a stable CLI interface. Use with caution in
> > scripts.
> >
> > libssl-dev/noble-updates,noble-security,now 3.0.13-0ubuntu3.4 amd64
> > [installed]
> > libssl3t64/noble-updates,noble-security,now 3.0.13-0ubuntu3.4 amd64
> > [installed,automatic]
> > libxmlsec1t64-openssl/noble,now 1.2.39-5build2 amd64 [installed,automatic]
> > openssl/noble-updates,noble-security,now 3.0.13-0ubuntu3.4 amd64
> > [installed,automatic]
> > ssl-cert/noble,noble,now 1.1.2ubuntu1 all [installed,automatic]
>
> Thanks Jason, I have near the same result of you.
> I need to explain the context.
> I'm on a new machine with a fresh install of Ubuntu 24.04 wich embed Python 
> 3.12.3, no problem with that.
>
> As I'm maintainer of some Python modules published on PyPI, I've the habit of 
> testing my modules in different virtual environments. For now Python 3.11, 
> 3.12 and 3.13.
>
> So, I've maybe found a solution:
>
> I've create in my home a dir named /opt, download into it the latest version 
> of openssl-1.1.1 and uncompress it.(*)
> -
> $ cd opt/openssl-1.1.1w
> $ ./config && make && make test
> $ mkdir $HOME/opt/lib
> $ mv $HOME/opt/openssl-1.1.1w/libcrypto.so.1.1 $HOME/opt/lib/
> $ mv $HOME/opt/openssl-1.1.1w/libssl.so.1.1 $HOME/opt/lib/
> $ export LD_LIBRARY_PATH=$HOME/opt/lib:$LD_LIBRARY_PATH
> --
> And rerun the compilation of 3.12.9 without problem with openssl.
>
> (*) https://openssl-library.org/source/old/1.1.1/index.html
>
> Vincent.
>
>
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python recompile

2025-03-02 Thread Left Right via Python-list
I think Python compiles with -fPIC by default. Something else must have
happened to the OP's checkout to cause these errors. The OP needs to
describe what they were doing in more detail so we can properly
understand the problem.

On Sun, Mar 2, 2025 at 10:10 PM Lew Pitcher via Python-list
 wrote:
>
>
> First off, this isn't really on-topic for comp.lang.c, as it is a question 
> regarding a linker, interacting
> with the results of various options given to a specific compiler.
>
> However...
>
> On Sun, 02 Mar 2025 14:35:08 +, The Doctor wrote:
>
> > How do I compensate for
> >
> > ld: error: relocation R_X86_64_32 cannot be used against symbol 
> > '_PyRuntime'; recompile with -fPIC
>
> The error message tells you exactly how to fix the problem: recompile the 
> module using the
>   -fPIC
> option to the compiler. -fPIC tells your compiler to generate a specific type 
> of position-independant
> code, which your linker (apparently) requires for a specific type of 
> relocation.
>
>
> [snip]
>
> HTH
> --
> Lew Pitcher
> "In Skills We Trust"
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pip installs to unexpected place

2025-04-17 Thread Left Right via Python-list
> Also... when installing stuff with pip --user, it is always a package
> that is not installed for the system (usually not even available for
> the system). How can that "break system packages"?

pip installs dependencies, and those dependencies may disagree with
the system packages on versions.

This is a difference between how conda works and how pip works. Conda is an
actual package manager: it ensures that all packages in a particular
environment agree on version requirements. pip will break your
environment in subsequent installs because it doesn't keep track of
what was installed before.

On top of this, pip may, in general, cause any amount of damage to
your system regardless of where or how you install it because by
default it's allowed to build wheels from source packages. The build
may run whatever code, including formatting hard drives, mining
bitcoin etc. The reason it doesn't happen very often is that package
maintainers kind of trust each other to be nice. There aren't really
any safeguards to prevent malicious actors from doing this, but you
would have to want to install their package for some reason.
-- 
https://mail.python.org/mailman/listinfo/python-list