Re: Short, perfect program to read sentences of webpage

2021-12-08 Thread Cameron Simpson
On 08Dec2021 23:17, Stefan Ram  wrote:
>  Regexps might have their disadvantages, but when I use them,
>  it is clearer for me to do all the matching with regexps
>  instead of mixing them with Python calls like str.isupper.
>  Therefore, it is helpful for me to have a regexp to match
>  upper and lower case characters separately. Some regexp
>  dialects support "\p{Lu}" and "\p{Ll}" for this.

Aye. I went looking for that in the Python re module docs and could not 
find them. So the compromise is to match any word, then test the word with 
isupper() (or whatever is appropriate).
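
A minimal sketch of that compromise, using only the standard library: match words with \w and classify them afterwards with str methods. The sample sentence and the function name are made up for illustration.

```python
import re

def capitalised_words(text):
    """Return the words whose first character is uppercase per str.isupper()."""
    # \b\w+\b matches whole words; for str patterns \w is Unicode-aware,
    # so accented letters need no hand-written character list.
    words = re.findall(r"\b\w+\b", text)
    return [w for w in words if w[0].isupper()]

print(capitalised_words("\u00c9mile visited \u00c5rhus and z\u00fcrich in 2021."))
```

The regexp only finds the words; the case test stays in Python, where the Unicode tables are already available.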

>  I have not yet incorporated (all) your advice into my code,
>  but I came to the conclusion myself that the repetition of
>  long sequences like r"A-ZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝ" and
>  not using f strings to insert other strings was especially
>  ugly.

The tricky bit with f-strings and regexps is that \w{3,5} means from 3 
through 5 "word characters". So if you've got those in an f-string 
you're off to double-the-brackets land, a bit like double backslash land 
and non-raw-strings.
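
To make the brace doubling concrete, here is a small sketch; the variable names low and high are only for illustration.

```python
import re

low, high = 3, 5

# In an f-string, {{ and }} produce literal braces, so the regexp
# quantifier {3,5} has to be spelled {{{low},{high}}}.
pattern = rf"\b\w{{{low},{high}}}\b"
print(pattern)  # \b\w{3,5}\b

print(re.findall(pattern, "a few short words, extraordinarily"))
```

The rf prefix keeps the string raw as well, so the backslashes do not need doubling on top of the braces.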

Otherwise, yes f-strings are a nice way to compose things.

Cheers,
Cameron Simpson 
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Odd locale error that has disappeared on reboot.

2021-12-08 Thread Inada Naoki
On Wed, Dec 8, 2021 at 2:52 AM Chris Green  wrote:
>
>
> At 03:40 last night it suddenly started throwing the following error every
> time it ran:-
>
> Fatal Python error: initfsencoding: Unable to get the locale encoding
> LookupError: unknown encoding: UTF-8
>
> Current thread 0xb6f8db40 (most recent call first):
> Aborted
>
> Running the program from the command line produced the same error.
> Restarting the Pi system has fixed the problem.
>

This error means Python cannot find its standard library. There are
a few possible causes.

* PYTHONHOME is set to the wrong value.
  PYTHONHOME is very rarely useful. Don't set it unless you know how to
  solve this kind of problem yourself.

* Your Python installation is broken.
  Some files have been deleted or overwritten. You need to do a *clean*
  install of Python again.
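
A small diagnostic along these lines may help compare a healthy state with a broken one. It is only a sketch: it has to be run while the interpreter still starts (for example after the reboot), and the function name is hypothetical.

```python
import codecs
import os
import sys

def describe_install():
    """Collect settings that determine where Python looks for its standard library."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "PYTHONHOME": os.environ.get("PYTHONHOME"),  # normally unset
        # The UTF-8 codec lives in the stdlib "encodings" package, which is
        # what the interpreter failed to find in the error above.
        "utf-8 codec": codecs.lookup("UTF-8").name,
    }

for key, value in describe_install().items():
    print(f"{key}: {value}")
```

If PYTHONHOME shows a value in the cron environment, or the prefix points at a directory that is not always mounted, that would be worth checking first.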

Best,
-- 
Inada Naoki  


Re: Short, perfect program to read sentences of webpage

2021-12-08 Thread MRAB

On 2021-12-08 23:17, Stefan Ram wrote:

Cameron Simpson  writes:
Instead, consider the \b (word boundary) and \w (word character) 
markers, which will let you break strings up, and then maybe test the 
results with str.isupper().


   Thanks for your comments, most or all of them are
   valid, and I will try to take them into account!

   Regexps might have their disadvantages, but when I use them,
   it is clearer for me to do all the matching with regexps
   instead of mixing them with Python calls like str.isupper.
   Therefore, it is helpful for me to have a regexp to match
   upper and lower case characters separately. Some regexp
   dialects support "\p{Lu}" and "\p{Ll}" for this.

If you want "\p{Lu}" and "\p{Ll}", have a look at the 'regex' module on 
PyPI:


https://pypi.org/project/regex/
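
If installing a third-party module is not an option, the standard library's unicodedata module can approximate these classes outside the regexp. A sketch, with a hypothetical helper name:

```python
import unicodedata

def has_category(ch, category):
    """True if the single character ch has the given Unicode general category."""
    return unicodedata.category(ch) == category

text = "\u00c5ngstr\u00f6m \u00fcber Z\u00fcrich"
uppercase = [ch for ch in text if has_category(ch, "Lu")]
lowercase = [ch for ch in text if has_category(ch, "Ll")]
print(uppercase)
print(len(lowercase))
```

With the regex module installed, a pattern such as r"\p{Lu}\p{Ll}+" expresses the same test inside the regexp itself.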

[snip]


[RELEASE] Python 3.11.0a3 is available

2021-12-08 Thread Pablo Galindo Salgado
You can tell that we are slowly getting closer to the first beta as the
number of release blockers that we need to fix on every release starts to
increase. But we did it! Thanks to Steve Dower, Ned
Deily, Christian Heimes, Łukasz Langa and Mark Shannon, who helped get
things ready for this release :)

Go get the new version here:

https://www.python.org/downloads/release/python-3110a3/

**This is an early developer preview of Python 3.11**

# Major new features of the 3.11 series, compared to 3.10

Python 3.11 is still in development.  This release, 3.11.0a3, is the third
of seven planned alpha releases.

Alpha releases are intended to make it easier to test the current state of
new features and bug fixes and to test the release process.

During the alpha phase, features may be added up until the start of the
beta phase (2022-05-06) and, if necessary, may be modified or deleted up
until the release candidate phase (2022-08-01).  Please keep in mind that
this is a preview release and its use is **not** recommended for production
environments.

Many new features for Python 3.11 are still being planned and written.
Among the major new features and changes so far:

* [PEP 657](https://www.python.org/dev/peps/pep-0657/) -- Include
Fine-Grained Error Locations in Tracebacks
* [PEP 654](https://www.python.org/dev/peps/pep-0654/) -- Exception Groups
and except*
* The [Faster CPython Project](https://github.com/faster-cpython) is
already yielding some exciting results: this version of CPython 3.11 is
~12% faster on the geometric mean of the
[PyPerformance benchmarks](https://speed.python.org), compared to 3.10.0.
* Hey, **fellow core developer,** if a feature you find important is
missing from this list, let me know.

The next pre-release of Python 3.11 will be 3.11.0a4, currently scheduled
for Monday, 2022-01-03.

# More resources

* [Online Documentation](https://docs.python.org/3.11/)
* [PEP 664](https://www.python.org/dev/peps/pep-0664/), 3.11 Release
Schedule
* Report bugs at [https://bugs.python.org](https://bugs.python.org).
* [Help fund Python and its community](/psf/donations/).

# And now for something completely different

Rayleigh scattering, named after the nineteenth-century British physicist
Lord Rayleigh, is the predominantly elastic scattering of light or other
electromagnetic radiation by particles much smaller than the wavelength of
the radiation. For light frequencies well below the resonance frequency of
the scattering particle, the amount of scattering is inversely proportional
to the fourth power of the wavelength. Rayleigh scattering results from the
electric polarizability of the particles. The oscillating electric field of
a light wave acts on the charges within a particle, causing them to move at
the same frequency. The particle, therefore, becomes a small radiating
dipole whose radiation we see as scattered light. The particles may be
individual atoms or molecules; the scattering can occur when light travels
through transparent solids and liquids but is most prominently seen in gases.

The strong wavelength dependence of the scattering means that shorter
(blue) wavelengths are scattered more strongly than longer (red)
wavelengths. This results in the indirect blue light coming from all
regions of the sky.
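
As a rough worked example of the inverse fourth-power law, the sketch below compares two illustrative wavelengths. The values of 450 nm for blue and 700 nm for red are assumptions chosen for the arithmetic, not figures from the announcement.

```python
def rayleigh_ratio(short_nm, long_nm):
    """Relative scattering strength of the shorter wavelength, assuming intensity ~ 1/wavelength**4."""
    return (long_nm / short_nm) ** 4

# Illustrative wavelengths only: about 450 nm for blue and 700 nm for red.
print(round(rayleigh_ratio(450, 700), 1))
```

Under these assumptions blue light is scattered roughly six times more strongly than red.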

# We hope you enjoy those new releases!

Thanks to all of the many volunteers who help make Python Development and
these releases possible! Please consider supporting our efforts by
volunteering yourself or through organization contributions to the Python
Software Foundation.

Your friendly release team,
Pablo Galindo @pablogsal
Ned Deily @nad
Steve Dower @steve.dower


Re: Short, perfect program to read sentences of webpage

2021-12-08 Thread Peter J. Holzer
On 2021-12-09 09:42:07 +1100, Cameron Simpson wrote:
> On 08Dec2021 21:41, Stefan Ram  wrote:
> >Julius Hamilton  writes:
> >>This is a really simple program which extracts the text from webpages and
> >>displays them one sentence at a time.
> >
> >  Our teacher said NLTK will not come up until next year, so
> >  I tried to do with regexps. It still has bugs, for example
> >  it can not tell the dot at the end of an abbreviation from
> >  the dot at the end of a sentence!
> 
> This is almost a classic demo of why regexps are a poor tool as a first 
> choice. You can do much with them, but they are cryptic and bug prone.

I don't think that's the problem here. The problem is that natural languages
just aren't regular languages. In fact I'm not sure that they fit
anywhere within the Chomsky hierarchy (but if they aren't type-0, that
would be a strong argument against the possibility of human-level AI).

In English, if a sentence ends with an abbreviation you write only a
single dot. So if you look at these two fragments:

For matching strings, numbers, etc. Python provides regular
expressions.

Let's say you want to match strings, numbers, etc. Python provides
regular expressions for these tasks.

In the second case the dot ends a sentence; in the first it doesn't. But to
distinguish those cases you need to at least parse the sentences at the
syntax level (which regular expressions can't do), maybe even understand
them semantically.
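
The ambiguity is easy to reproduce. The sketch below applies one naive splitting rule (an assumption for illustration, not anyone's actual code in this thread) to both fragments:

```python
import re

# Naive rule: a sentence ends at a dot followed by whitespace and a capital letter.
SENTENCE_BREAK = re.compile(r"(?<=\.)\s+(?=[A-Z])")

one_sentence = "For matching strings, numbers, etc. Python provides regular expressions."
two_sentences = ("Let's say you want to match strings, numbers, etc. "
                 "Python provides regular expressions for these tasks.")

for text in (one_sentence, two_sentences):
    parts = SENTENCE_BREAK.split(text)
    print(len(parts), parts)
```

Both fragments are cut after "etc.", which is right for the second and wrong for the first; the characters around the dot are identical, so no rule that only looks at them can tell the two apart.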

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




Re: Short, perfect program to read sentences of webpage

2021-12-08 Thread Cameron Simpson
On 08Dec2021 21:41, Stefan Ram  wrote:
>Julius Hamilton  writes:
>>This is a really simple program which extracts the text from webpages and
>>displays them one sentence at a time.
>
>  Our teacher said NLTK will not come up until next year, so
>  I tried to do with regexps. It still has bugs, for example
>  it can not tell the dot at the end of an abbreviation from
>  the dot at the end of a sentence!

This is almost a classic demo of why regexps are a poor tool as a first 
choice. You can do much with them, but they are cryptic and bug prone.

I am not seeking to mock you, but trying to make apparent why regexps 
are to be avoided a lot of the time. They have their place.

You've read the whole re module docs I hope:

https://docs.python.org/3/library/re.html#module-re

>import re
>import urllib.request
>uri = r'''http://example.com/article''' # replace this with your URI!
>request = urllib.request.Request( uri )
>resource = urllib.request.urlopen( request )
>cs = resource.headers.get_content_charset()
>content = resource.read().decode( cs, errors="ignore" )
>content = re.sub( r'''[\r\n\t\s]+''', r''' ''', content )

You're not multiline, so I would recommend a plain raw string:

content = re.sub( r'[\r\n\t\s]+', r' ', content )

No need for \r in the class, \s covers that. From the docs:

  \s
For Unicode (str) patterns:

  Matches Unicode whitespace characters (which includes [ 
  \t\n\r\f\v], and also many other characters, for example the 
  non-breaking spaces mandated by typography rules in many 
  languages). If the ASCII flag is used, only [ \t\n\r\f\v] is 
  matched.
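
A quick check of that behaviour, as a sketch with a made-up input string:

```python
import re

messy = "one\r\ntwo\t\tthree\u00a0four"   # \u00a0 is a non-breaking space

print(re.sub(r"\s+", " ", messy))
# With re.ASCII the non-breaking space is no longer treated as whitespace.
print(repr(re.sub(r"\s+", " ", messy, flags=re.ASCII)))
```

So \s on its own already covers \r, \n and \t, plus the non-breaking space that a hand-written class would miss.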

>upper = r"[A-ZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝ]" # "[\\p{Lu}]"
>lower = r"[a-zµàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ]" # "[\\p{Ll}]"

This is very fragile - you have an arbitrary set of additional uppercase 
characters, almost certainly incomplete, and visually hard to inspect 
for completeness.

Instead, consider the \b (word boundary) and \w (word character) 
markers, which will let you break strings up, and then maybe test the 
results with str.isupper().

>digit = r"[0-9]" #"[\\p{Nd}]"

There's a \d character class for this; in str patterns it matches any 
Unicode decimal digit, not just 0-9.
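
A short sketch of the difference between \d and [0-9], using Arabic-Indic digits as the test input:

```python
import re

ascii_digits = "2021"
arabic_indic = "\u0662\u0660\u0662\u0661"   # the same number in Arabic-Indic digits

print(bool(re.fullmatch(r"\d+", arabic_indic)))            # True
print(bool(re.fullmatch(r"[0-9]+", arabic_indic)))         # False
print(bool(re.fullmatch(r"\d+", arabic_indic, re.ASCII)))  # False
print(int(arabic_indic) == int(ascii_digits))              # True
```

Use [0-9] or the re.ASCII flag if only ASCII digits should be accepted.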

>firstwordstart = upper;
>firstwordnext = "(?:[a-zµàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ-])";

Again, an inline arbitrary list of characters. This is fragile.

>wordcharacter = "[A-ZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝa-zµàáâãäåæçèéêëìíîïð\
>ñòóôõöøùúûüýþÿ0-9-]"

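Again inline. Why not construct it from the pieces? Plain concatenation 
of the three bracketed classes would match three characters in a row, so 
strip the brackets and wrap the result in a single class:

wordcharacter = "[" + upper[1:-1] + lower[1:-1] + digit[1:-1] + "-]"

but I recommend \w instead (it already covers digits, so there is no 
need for [\w\d]).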

>addition = "(?:(?:[']" + wordcharacter + "+)*[']?)?"

As a matter of good practice with regexp strings, use raw quotes:

addition = r"(?:(?:[']" + wordcharacter + r"+)*[']?)?"

even when there are no backslashes.

Seriously, doing this with regexps is difficult. A useful exercise for 
learning regexps, but in the general case not the first tool to reach 
for.

Cheers,
Cameron Simpson 


Re: Python child process in while True loop blocks parent

2021-12-08 Thread Peter J. Holzer
On 2021-12-08 18:11:48 +0100, Jen Kris via Python-list wrote:
> To recap, I'm using a pair of named pipes for IPC between C and
> Python.  Python runs as a child process after fork-execv.  The Python
> program continues to run concurrently in a while True loop, and
> responds to requests from C at intervals, and continues to run until
> it receives a signal from C to exit.  C sends signals to Python, then
> waits to receive data back from Python.  My problem was that C was
> blocked when Python started. 
> 
> The solution was twofold:  (1) for Python to run concurrently it must
> be a multiprocessing loop (from the multiprocessing module),

I don't see how this could achieve anything. It starts another (third)
process, but then it just does all the work in that process and just
waits for it. Doing the same work in the original Python process should
have exactly the same effect.

> and (2) Python must terminate its write strings with \n, or read will
> block in C waiting for something that never comes.

That's also strange. You are using os.write in Python and read in C,
both of which shouldn't care about newlines.
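
This is easy to check within a single process. The sketch below uses an anonymous pipe from os.pipe() as a stand-in for the named pipes, and os.read as a stand-in for the C read call:

```python
import os

read_fd, write_fd = os.pipe()

os.write(write_fd, b"no newline here")   # goes straight to the pipe, unbuffered
data = os.read(read_fd, 1024)            # returns whatever bytes are available
print(data)

os.close(write_fd)
os.close(read_fd)
```

The read returns as soon as any bytes are available, newline or not. If the C side blocks, it is more likely using a line-oriented call such as fgets on top of the descriptor.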

> The multiprocessing module sidesteps the GIL; without multiprocessing
> the GIL will block all other threads once Python starts. 

Your Python interpreter runs in a different process than your C code.
There is absolutely no way the GIL could block threads in your C
program. And your Python code doesn't need to use more than one thread
or process.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




Re: Short, perfect program to read sentences of webpage

2021-12-08 Thread Jon Ribbens via Python-list
On 2021-12-08, Julius Hamilton  wrote:
> 1. The HTML extraction is not perfect. It doesn’t produce as clean text as
> I would like. Sometimes random links or tags get left in there. And the
> sentences are sometimes randomly broken by newlines.

Oh. Leaving tags in suggests you are doing this very wrongly. Python
has plenty of open source libraries you can use that will parse the
HTML reliably into tags and text for you.

> 2. Neither is the segmentation perfect. I am currently researching
> developing an optimal segmenter with tools from Spacy.
>
> Brevity is greatly valued. I mean, anyone who can make the program more
> perfect, that’s hugely appreciated. But if someone can do it in very few
> lines of code, that’s also appreciated.

It isn't something that can be done in a few lines of code. There's the
spaces issue you mention for example. Nor is it something that can
necessarily be done just by inspecting the HTML alone. To take a trivial
example:

  powergenitalia  = powergen  italia

but:

  powergenitalia= powergenitalia

but the second with the addition of:

  span { display: block }

is back to "powergen  italia". So you need to parse and apply styles
(including external stylesheets) as well. Potentially you may also need
to execute JavaScript on the page, which means you also need a JavaScript
interpreter and a DOM implementation. Basically you need a complete
browser to do it on general web pages.
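
A sketch of the point, using the standard library's html.parser. The markup in the two calls is hypothetical (the tags of the example above were lost in this archive), and the set of block-level tags is hard-coded because no CSS is consulted:

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Collect text nodes, treating a hard-coded set of tags as block-level."""
    BLOCK_TAGS = {"div", "p"}   # assumption: stylesheets are ignored entirely

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)

def extract(html):
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    # Collapse the runs of spaces introduced around block tags.
    return " ".join("".join(parser.parts).split())

print(extract("<div>powergen</div><div>italia</div>"))
print(extract("<span>powergen</span><span>italia</span>"))
```

The second result is only correct as long as no stylesheet turns the spans into blocks, which this parser has no way of knowing.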


Re: Python child process in while True loop blocks parent

2021-12-08 Thread Cameron Simpson
On 08Dec2021 18:11, Jen Kris  wrote:
>Python must terminate its write strings with \n, or read will block in 
>C waiting for something that never comes.

There are two aspects to this:
- if upstream is reading "lines of text" then you need a newline to 
  terminate the lines
- you (probably) should flush the output pipe (Python to C) after the 
  newline

I see you're using file descriptors and os.write() to send data. This is 
unbuffered, so there is nothing to flush, so you have not encountered 
the second point above.

But if you shift to using a Python file object for your output (eg 
f=os.fdopen(pw, "w"), which would let you use print() or any number of 
other things which work with Python files), your file object would have 
a buffer, and normally that would not be sent to the pipe until it was 
full.

So your deadlock issue has 2 components:
- you need to terminate your records for upstream (C) to see complete 
  records. Your records are lines, so you need a newline character.
- you need to ensure the whole record has been sent upstream (been 
  written to the pipe). If you use a buffered Python file object for 
  your output, you need to flush it at your synchronisation points or 
  upstream will not receive the buffer contents. That synchronisation 
  point for you is the end of the record.

Hopefully this makes the flow considerations more clear.
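
The buffering effect can be observed in a single process. This sketch assumes a POSIX system (select on pipes does not work on Windows) and uses an anonymous pipe in place of the named one:

```python
import os
import select

r, w = os.pipe()
out = os.fdopen(w, "w")          # buffered text file object over the write end

print("record 1", file=out)      # newline-terminated, but still in the buffer
ready_before, _, _ = select.select([r], [], [], 0)

out.flush()                      # synchronisation point: push the record out
ready_after, _, _ = select.select([r], [], [], 0)
data = os.read(r, 1024)

print(bool(ready_before), bool(ready_after), data)
out.close()
os.close(r)
```

Before the flush the reader has nothing to read even though the record is complete; after it, the whole line arrives at once.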

Cheers,
Cameron Simpson 


Re: Short, perfect program to read sentences of webpage

2021-12-08 Thread Cameron Simpson
Assorted remarks inline below:

On 08Dec2021 20:39, Julius Hamilton  wrote:
>deepreader.py:
>
>import sys
>import requests
>import html2text
>import nltk
>
>url = sys.argv[1]

I might spell this:

cmd, url = sys.argv

which enforces exactly one argument. And since you don't care about the 
command name, maybe:

_, url = sys.argv

because "_" is a conventional name for "a value we do not care about".
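
A sketch of how that unpacking behaves, wrapped in a hypothetical helper so that a wrong argument count produces a usage message instead of a traceback:

```python
def parse_argv(argv):
    """Return the single URL argument, or exit with a usage message."""
    try:
        _, url = argv          # fails unless there is exactly one argument
    except ValueError:
        raise SystemExit(f"usage: {argv[0]} URL")
    return url

print(parse_argv(["deepreader.py", "http://example.com/article"]))
```

In the real script the call would be parse_argv(sys.argv).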

>sentences = nltk.sent_tokenize(html2text.html2text(requests.get(url).text))

Neat!

># Activate an elementary reader interface for the text
>for index, sentence in enumerate(sentences):

I would be inclined to count from 1, so "enumerate(sentences, 1)".
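
For example, with a made-up list of sentences:

```python
sentences = ["First sentence.", "Second sentence.", "Third sentence."]

# start=1 makes the counter read 1/3, 2/3, 3/3 instead of 0/3, 1/3, 2/3
labels = [f"{index}/{len(sentences)}: {sentence}"
          for index, sentence in enumerate(sentences, 1)]
print("\n".join(labels))
```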

>  # Print the sentence
>  print("\n" + str(index) + "/" + str(len(sentences)) + ": " + sentence +
>"\n")

Personally, since print() adds a trailing newline, I would drop the 
final +"\n". If you want an additional blank line, I would put it in the 
input() call below:

>  # Wait for user key-press
>  x = input("\n> ")

You're not using "x". Just discard input()'s return value:

input("\n> ")

>A lot of refining is possible, and I’d really like to see how some more
>experienced people might handle it.
>
>1. The HTML extraction is not perfect. It doesn’t produce as clean text as
>I would like. Sometimes random links or tags get left in there.

Maybe try beautifulsoup instead of html2text? The module name is "bs4".

>And the
>sentences are sometimes randomly broken by newlines.

I would flatten the newlines. Either the simple:

sentence = sentence.strip().replace("\n", " ")

or maybe better:

sentence = " ".join(sentence.split())
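
The difference between the two shows up when a sentence contains runs of whitespace. A sketch with a made-up input:

```python
sentence = "  Regexps are\ncryptic   and\n\nbug prone.  "

simple = sentence.strip().replace("\n", " ")
normalised = " ".join(sentence.split())

print(repr(simple))
print(repr(normalised))
```

The replace() version keeps the doubled spaces; the split/join version collapses every run of whitespace to a single space.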

Cheers,
Cameron Simpson 


Re: Short, perfect program to read sentences of webpage

2021-12-08 Thread MRAB

On 2021-12-08 19:39, Julius Hamilton wrote:

Hey,

This is something I have been working on for a very long time. It’s one of
the reasons I got into programming at all. I’d really appreciate if people
could input some advice on this.

This is a really simple program which extracts the text from webpages and
displays them one sentence at a time. It’s meant to help you study dense
material, especially documentation, with much more focus and comprehension.
I actually hope it can be of help to people who have difficulty reading. I
know it’s been of use to me at least.

This is a minimally acceptable way to pull it off currently:

deepreader.py:

import sys
import requests
import html2text
import nltk

url = sys.argv[1]

# Get the html, pull out the text, and sentence-segment it in one line of code

sentences = nltk.sent_tokenize(html2text.html2text(requests.get(url).text))

# Activate an elementary reader interface for the text

for index, sentence in enumerate(sentences):

   # Print the sentence
   print("\n" + str(index) + "/" + str(len(sentences)) + ": " + sentence +
"\n")


You can shorten that with format strings:

print("\n{}/{}: {}\n".format(index, len(sentences), sentence))

or even:

print(f"\n{index}/{len(sentences)}: {sentence}\n")


   # Wait for user key-press
   x = input("\n> ")


EOF



That’s it.

A lot of refining is possible, and I’d really like to see how some more
experienced people might handle it.

1. The HTML extraction is not perfect. It doesn’t produce as clean text as
I would like. Sometimes random links or tags get left in there. And the
sentences are sometimes randomly broken by newlines.

2. Neither is the segmentation perfect. I am currently researching
developing an optimal segmenter with tools from Spacy.

Brevity is greatly valued. I mean, anyone who can make the program more
perfect, that’s hugely appreciated. But if someone can do it in very few
lines of code, that’s also appreciated.

Thanks very much,
Julius





Short, perfect program to read sentences of webpage

2021-12-08 Thread Julius Hamilton
Hey,

This is something I have been working on for a very long time. It’s one of
the reasons I got into programming at all. I’d really appreciate if people
could input some advice on this.

This is a really simple program which extracts the text from webpages and
displays them one sentence at a time. It’s meant to help you study dense
material, especially documentation, with much more focus and comprehension.
I actually hope it can be of help to people who have difficulty reading. I
know it’s been of use to me at least.

This is a minimally acceptable way to pull it off currently:

deepreader.py:

import sys
import requests
import html2text
import nltk

url = sys.argv[1]

# Get the html, pull out the text, and sentence-segment it in one line of code

sentences = nltk.sent_tokenize(html2text.html2text(requests.get(url).text))

# Activate an elementary reader interface for the text

for index, sentence in enumerate(sentences):

  # Print the sentence
  print("\n" + str(index) + "/" + str(len(sentences)) + ": " + sentence +
"\n")

  # Wait for user key-press
  x = input("\n> ")


EOF



That’s it.

A lot of refining is possible, and I’d really like to see how some more
experienced people might handle it.

1. The HTML extraction is not perfect. It doesn’t produce as clean text as
I would like. Sometimes random links or tags get left in there. And the
sentences are sometimes randomly broken by newlines.

2. Neither is the segmentation perfect. I am currently researching
developing an optimal segmenter with tools from Spacy.

Brevity is greatly valued. I mean, anyone who can make the program more
perfect, that’s hugely appreciated. But if someone can do it in very few
lines of code, that’s also appreciated.

Thanks very much,
Julius


Re: python problem

2021-12-08 Thread Mats Wichmann

On 12/8/21 11:18, Larry Warner wrote:

I am new at Python. I have installed Python 3.10.1 and the latest Pycharm.
When I attempt to execute anything via Pycharm or the command line, I
receive a message it can not find Python.

I do not know where Python was loaded or where to find and to update PATH
to the program.


Since you don't mention which operating system you are using or where 
you got your Python from, it's hard for anyone to help.


Do the general notes here help?

https://docs.python.org/3/using/

This problem usually strikes on Windows, where all the commands don't go 
into a small number of directories which are searched by default. 
Following the Windows link in the page above should describe some 
approaches to dealing with that.
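
Once any Python prompt can be started (for instance from IDLE, assuming the installer added it to the Start menu), a few lines show where the interpreter lives and what PATH contains. A sketch:

```python
import os
import shutil
import sys

print("running interpreter:", sys.executable)

# shutil.which() searches PATH the way the shell would; None means "not found".
for name in ("python", "python3", "py"):
    print(f"{name}: {shutil.which(name)}")

print("PATH entries:")
for entry in os.environ.get("PATH", "").split(os.pathsep):
    print("  ", entry)
```

The directory shown for the running interpreter is the one that would need to be on PATH, or be selected as the interpreter in PyCharm.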




Re: How to package a Python command line app?

2021-12-08 Thread Manfred Lotz
Hi Loris,

On Wed, 08 Dec 2021 15:38:48 +0100
"Loris Bennett"  wrote:

> Hi Manfred,
> 
> Manfred Lotz  writes:
> 
> > There are many possibilities to package a Python app, and I have to
> > admit I am pretty confused.
> >
> > Here is what I have:
> >
> > A Python command line app which requires some packages which are
> > not in the standard library.
> >
> > I am on Linux and like to have an executable (perhaps a zip file
> > with a shebang; whatever) which runs on different Linux systems.
> >
> > Different means
> > - perhaps different glibc versions
> > - perhaps different Python versions
> >
> > In my specific case this is: 
> > - RedHat 8.4 with Python 3.6.8
> > - Ubuntu 20.04 LTS with Python 3.8.10 
> > - and finally Fedora 33 with Python 3.9.9
> >
> >
> > Is this possible to do? If yes which tool would I use for this?  
> 
> I use poetry[1] on CentOS 7 to handle all the dependencies and create
> a wheel which I then install to a custom directory with pip3.
> 
> You would checkout the repository with your code on the target system,
> start a poetry shell using the Python version required, and then build
> the wheel.  From outside the poetry shell you can set PYTHONUSERBASE
> and then install with pip3.  You then just need to set PYTHONPATH
> appropriately where ever you want to use your software.
> 

In my case it could happen that I do not have access to the target
system but want to hand over the Python app to somebody else. This
person just wants to run it.


> Different Python versions shouldn't be a problem.  If some module
> depends on a specific glibc version, then you might end up in standard
> dependency-hell territory, but you can pin module versions of
> dependencies in poetry, and you could also possibly use different
> branches within your repository to handle that.
> 

I try to avoid using modules which depend on a specific glibc. 

Although it seems that it doesn't really help for my use case, I will
play with poetry to get a better understanding of its capabilities.
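
For the "zip file with a shebang" idea from the original question, the standard library's zipapp module builds exactly that kind of archive. The sketch below is an assumption about how it could be used here, not something proposed in this thread; the directory name myapp and its one-line __main__.py are placeholders.

```python
import pathlib
import subprocess
import sys
import tempfile
import zipapp

with tempfile.TemporaryDirectory() as tmp:
    src = pathlib.Path(tmp, "myapp")          # hypothetical application directory
    src.mkdir()
    (src / "__main__.py").write_text('print("hello from the archive")\n')

    target = pathlib.Path(tmp, "myapp.pyz")
    # interpreter= writes the shebang line at the start of the archive
    zipapp.create_archive(src, target, interpreter="/usr/bin/env python3")

    result = subprocess.run([sys.executable, str(target)],
                            capture_output=True, text=True, check=True)
    output = result.stdout
    print(output, end="")
```

Pure-Python dependencies can be copied into the source directory with pip install --target before building. This does not address compiled extensions or differing glibc versions, and the receiving system still needs a compatible Python.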

-- 
Thanks a lot,
Manfred





Re: How to package a Python command line app?

2021-12-08 Thread Loris Bennett
Hi Manfred,

Manfred Lotz  writes:

> There are many possibilities to package a Python app, and I have to admit
> I am pretty confused.
>
> Here is what I have:
>
> A Python command line app which requires some packages which are not in
> the standard library.
>
> I am on Linux and like to have an executable (perhaps a zip file with a
> shebang; whatever) which runs on different Linux systems.
>
> Different means
> - perhaps different glibc versions
> - perhaps different Python versions
>
> In my specific case this is: 
> - RedHat 8.4 with Python 3.6.8
> - Ubuntu 20.04 LTS with Python 3.8.10 
> - and finally Fedora 33 with Python 3.9.9
>
>
> Is this possible to do? If yes which tool would I use for this?

I use poetry[1] on CentOS 7 to handle all the dependencies and create a
wheel which I then install to a custom directory with pip3.

You would check out the repository with your code on the target system,
start a poetry shell using the Python version required, and then build
the wheel.  From outside the poetry shell you can set PYTHONUSERBASE and
then install with pip3.  You then just need to set PYTHONPATH
appropriately wherever you want to use your software.

Different Python versions shouldn't be a problem.  If some module
depends on a specific glibc version, then you might end up in standard
dependency-hell territory, but you can pin module versions of
dependencies in poetry, and you could also possibly use different
branches within your repository to handle that.

HTH

Loris   

Footnotes: 
[1]  https://python-poetry.org

-- 
This signature is currently under construction.


Re: Odd locale error that has disappeared on reboot.

2021-12-08 Thread Chris Green
Julio Di Egidio  wrote:
> On 07/12/2021 16:28, Chris Green wrote:
> > I have a very short Python program that runs on one of my Raspberry
> > Pis to collect temperatures from a 1-wire sensor and write them to a
> > database:-
> > 
> >  #!/usr/bin/python3
> >  #
> >  #
> >  # read temperature from 1-wire sensor and store in database with date 
> > and time
> >  #
> >  import sqlite3
> >  import time
> > 
> >  ftxt = 
> > str(open("/sys/bus/w1/devices/28-01204e1e64c3/w1_slave").read(100))
> >  temp = (float(ftxt[ftxt.find("t=") +2:]))/1000
> >  #
> >  #
> >  # insert date, time and temperature into the database
> >  #
> >  tdb = 
> > sqlite3.connect("/home/chris/.cfg/share/temperature/temperature.db")
> >  cr = tdb.cursor()
> >  dt = time.strftime("%Y-%m-%d %H:%M")
> > cr.execute("Insert INTO temperatures (DateTime, Temperature) VALUES(?, 
> round(?, 2))", (dt, temp) 
> >  )
> >  tdb.commit()
> >  tdb.close()
> > 
> > It's run by cron every 10 minutes.
> > 
> > 
> > At 03:40 last night it suddenly started throwing the following error every
> > time it ran:-
> > 
> >  Fatal Python error: initfsencoding: Unable to get the locale encoding
> >  LookupError: unknown encoding: UTF-8
> > 
> >  Current thread 0xb6f8db40 (most recent call first):
> >  Aborted
> > 
> > Running the program from the command line produced the same error.
> > Restarting the Pi system has fixed the problem.
> > 
> > 
> > What could have caused this?  I certainly wasn't around at 03:40! :-)
> > There aren't any automatic updates enabled on the system, the only
> > thing that might have been going on was a backup as that Pi is also
> > my 'NAS' with a big USB drive connected to it.  The backups have been
> > running without problems for more than a year.  Looking at the system
> > logs shows that a backup was started at 03:35 so I suppose that *could*
> > have provoked something but I fail to understand how.
> 
> Since it's a one-off, doesn't sound like a system problem.  The easiest 
> might be that you try-catch that call and retry when needed, and I'd 
> also check that 'ftxt' is what it should be: "external devices" may 
> fail, including when they do produce output...
> 
Well it repeated every ten minutes through the night until I rebooted
the system, so it wasn't really a "one off".  It hasn't repeated since
though.
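
On the separate suggestion to check 'ftxt', a defensive parse might look like the sketch below. The two-line sample is an assumption about the usual w1_slave layout, and the function name is hypothetical. It would not have prevented the locale error, which happens before any of the script runs.

```python
def parse_w1_slave(text):
    """Return the temperature in degrees Celsius, or None if the reading looks invalid."""
    lines = text.strip().splitlines()
    # Assumption: a good reading has two lines, the first ending in "YES"
    # (CRC check passed) and the second containing "t=<millidegrees>".
    if len(lines) < 2 or not lines[0].endswith("YES"):
        return None
    _, sep, millidegrees = lines[1].partition("t=")
    if not sep:
        return None
    try:
        return int(millidegrees) / 1000
    except ValueError:
        return None

sample = ("72 01 4b 46 7f ff 0e 10 57 : crc=57 YES\n"
          "72 01 4b 46 7f ff 0e 10 57 t=23125\n")
print(parse_w1_slave(sample))
print(parse_w1_slave(""))
```

Returning None lets the caller skip the database insert for a bad reading instead of storing a bogus value.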

-- 
Chris Green


How to package a Python command line app?

2021-12-08 Thread Manfred Lotz
There are many possibilities to package a Python app, and I have to admit
I am pretty confused.

Here is what I have:

A Python command line app which requires some packages which are not in
the standard library.

I am on Linux and like to have an executable (perhaps a zip file with a
shebang; whatever) which runs on different Linux systems.

Different means
- perhaps different glibc versions
- perhaps different Python versions

In my specific case this is: 
- RedHat 8.4 with Python 3.6.8
- Ubuntu 20.04 LTS with Python 3.8.10 
- and finally Fedora 33 with Python 3.9.9


Is this possible to do? If yes which tool would I use for this?


-- 
Manfred



Re: For a hierarchical project, the EXE file generated by "pyinstaller" does not start.

2021-12-08 Thread Mohsen Owzar
Chris Angelico schrieb am Dienstag, 7. Dezember 2021 um 19:16:54 UTC+1:
> On Wed, Dec 8, 2021 at 4:49 AM Mohsen Owzar  wrote: 
> > *** 
> > GPIOContrl.py 
> > *** 
> > class GPIOControl: 
> >     def my_print(self, args): 
> >         if print_allowed == 1: 
> >             print(args) 
> > 
> >     def __init__(self):
> Can't much help with your main question as I don't do Windows, but one 
> small side point: Instead of having a my_print that checks if printing 
> is allowed, you can conditionally replace the print function itself. 
> 
> if not print_allowed: 
>     def print(*args, **kwargs): pass 
> 
> ChrisA

Thanks Chris
Your answer didn't help me to solve my problem, but gave me another idea to 
write a conditional print statement.
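
A sketch of that idea, written as a factory function so the builtin is not shadowed globally; the name make_print is hypothetical. In a module one would assign print = make_print(print_allowed) near the top.

```python
import io
from contextlib import redirect_stdout

def make_print(print_allowed):
    """Return the real print, or a silent stand-in with the same call signature."""
    if print_allowed:
        return print

    def silent(*args, **kwargs):
        pass

    return silent

# Capture stdout to show that only the allowed call produces output.
buffer = io.StringIO()
with redirect_stdout(buffer):
    make_print(True)("shown")
    make_print(False)("hidden")
captured = buffer.getvalue()
print(repr(captured))
```

The check happens once, when the function is chosen, rather than on every call.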

Regards
Mohsen


Re: HTML extraction

2021-12-08 Thread Pieter van Oostrum
Roland Mueller  writes:

> But isn't bs4 only for SOAP content?
> Can bs4 or lxml cope with HTML code that does not comply with XML as the
> following fragment?
>
> <p>A
> <p>B
> <hr>
>

bs4 can do it, but lxml wants correct XML.
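
The same contrast between an XML parser and an HTML parser can be seen with the standard library alone. In this sketch the fragment with unclosed p and hr tags is an assumption (the tags in the archived messages were lost), and TagLogger is a hypothetical helper:

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

fragment = "<p>A<p>B<hr>"   # assumed fragment: unclosed <p> and <hr> tags

class TagLogger(HTMLParser):
    """Record every start tag the HTML parser reports."""
    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

try:
    ET.fromstring(fragment)
    xml_ok = True
except ET.ParseError as exc:
    xml_ok = False
    print("XML parser:", exc)

logger = TagLogger()
logger.feed(fragment)
logger.close()
print("HTML parser saw:", logger.tags)
```

lxml also ships a separate HTML parser in lxml.html, which is more forgiving than etree.fromstring.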

Jupyter console 6.4.0

Python 3.9.9 (main, Nov 16 2021, 07:21:43) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.29.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from bs4 import BeautifulSoup as bs

In [2]: soup = bs('<p>A<p>B<hr>')

In [3]: soup.p
Out[3]: <p>A</p>

In [4]: soup.find_all('p')
Out[4]: [<p>A</p>, <p>B</p>]

In [5]: from lxml import etree

In [6]: root = etree.fromstring('<p>A<p>B<hr>')
Traceback (most recent call last):

  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3444, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  File "/var/folders/2l/pdng2d2x18d00m41l6r2ccjrgn/T/ipykernel_96220/3376613260.py", line 1, in <module>
    root = etree.fromstring('<p>A<p>B<hr>')

  File "src/lxml/etree.pyx", line 3237, in lxml.etree.fromstring

  File "src/lxml/parser.pxi", line 1896, in lxml.etree._parseMemoryDocument

  File "src/lxml/parser.pxi", line 1777, in lxml.etree._parseDoc

  File "src/lxml/parser.pxi", line 1082, in lxml.etree._BaseParser._parseUnicodeDoc

  File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc

  File "src/lxml/parser.pxi", line 725, in lxml.etree._handleParseResult

  File "src/lxml/parser.pxi", line 654, in lxml.etree._raiseParseError

  File "<string>", line 1
XMLSyntaxError: Premature end of data in tag hr line 1, line 1, column 13
-- 
Pieter van Oostrum 
www: http://pieter.vanoostrum.org/
PGP key: [8DAE142BE17999C4]


python problem

2021-12-08 Thread Larry Warner
I am new to Python. I have installed Python 3.10.1 and the latest PyCharm.
When I attempt to execute anything via PyCharm or the command line, I
receive a message that it cannot find Python.

I do not know where Python was installed, or how to find it and update PATH
to point to it.

Larry


Re: Odd locale error that has disappeared on reboot.

2021-12-08 Thread Dieter Maurer
Chris Green wrote at 2021-12-7 15:28 +:
>I have a very short Python program that runs on one of my Raspberry
>Pis to collect temperatures from a 1-wire sensor and write them to a
>database:-
> ...
>At 03:40 last night it suddenly started throwing the following error every
>time it ran:-
>
>Fatal Python error: initfsencoding: Unable to get the locale encoding
>LookupError: unknown encoding: UTF-8
>
>Current thread 0xb6f8db40 (most recent call first):
>Aborted
>
>Running the program from the command line produced the same error.
>Restarting the Pi system has fixed the problem.

Python does not have its own locale database but uses that of the operating
system. From my point of view, the operating system state seems
to have got corrupted. A restart apparently has ensured a clean state again.
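
As a rough aid for checking this after the fact, the sketch below gathers the encoding settings Python took from the operating system. It uses only standard-library calls; the function name encoding_report is an assumption made for this example, and it can only run once the interpreter starts at all, so it will not help while the fatal error is still occurring.

```python
import codecs
import locale
import sys


def encoding_report():
    """Collect the encoding settings Python derived from the OS."""
    report = {
        # Encoding used for file names, decided during interpreter startup.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding the locale suggests for text data.
        "preferred": locale.getpreferredencoding(False),
        # Calling setlocale() without a new value only queries the setting.
        "lc_ctype": locale.setlocale(locale.LC_CTYPE),
    }
    # codecs.lookup() raises LookupError when a codec cannot be found,
    # the same kind of failure named in the error message above.
    report["utf8_codec"] = codecs.lookup("UTF-8").name
    return report
```

Printing encoding_report() from the same account and environment as the failing job would show whether the locale settings look sane there.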


Re: HTML extraction

2021-12-08 Thread Dieter Maurer
Roland Mueller wrote at 2021-12-7 22:55 +0200:
> ...
>Can bs4 or lxml cope with HTML code that does not comply with XML as the
>following fragment?

`lxml` comes with an HTML parser, which can be configured to parse leniently.


ANN: distlib 0.3.4 released on PyPI

2021-12-08 Thread Vinay Sajip via Python-list
I've recently released version 0.3.4 of distlib on PyPI [1]. For newcomers,
distlib is a library of packaging functionality which is intended to be
usable as the basis for third-party packaging tools.

The main changes in this release are as follows:

* Fixed #153: Raise warnings in get_distributions() if bad metadata seen, but keep
  going.

* Fixed #154: Determine Python versions correctly for Python >= 3.10.

* Updated launcher executables with changes to handle duplication logic.

Code relating to support for Python 2.6 was also removed (support for Python 2.6 was
dropped in an earlier release, but supporting code wasn't removed until now).

A more detailed change log is available at [2].

Please try it out, and if you find any problems or have any suggestions for improvements,
please give some feedback using the issue tracker! [3]

Regards,

Vinay Sajip

[1] https://pypi.org/project/distlib/0.3.4/
[2] https://distlib.readthedocs.io/en/0.3.4/
[3] https://bitbucket.org/pypa/distlib/issues/new



Re: Python child process in while True loop blocks parent

2021-12-08 Thread Jen Kris via Python-list
I started this post on November 29, and there have been helpful comments since 
then from Barry Scott, Cameron Simpson, Peter Holzer and Chris Angelico.  
Thanks to all of you.  

I've found a solution that works for my purpose, and I said earlier that I 
would post the solution I found. If anyone has a better solution I would 
appreciate any feedback. 

To recap, I'm using a pair of named pipes for IPC between C and Python.  Python 
runs as a child process after fork-execv.  The Python program continues to run 
concurrently in a while True loop, and responds to requests from C at 
intervals, and continues to run until it receives a signal from C to exit.  C 
sends signals to Python, then waits to receive data back from Python.  My 
problem was that C was blocked when Python started. 

The solution was twofold:  (1) for Python to run concurrently it must be a 
multiprocessing loop (from the multiprocessing module), and (2) Python must 
terminate its write strings with \n, or read will block in C waiting for 
something that never comes.  The multiprocessing module sidesteps the GIL; 
without multiprocessing the GIL will block all other threads once Python 
starts. 

Originally I used epoll() on the pipes.  Cameron Simpson and Barry Scott advised 
against epoll, and for this case they are right.  Blocking pipes work here, and 
epoll is too much overhead for watching a single file descriptor. 

This is the Python code now:

#!/usr/bin/python3
from multiprocessing import Process
import os

print("Python is running")

child_pid = os.getpid()
print('child process id:', child_pid)

def f(a, b):

    print("Python now in function f")

    pr = os.open('/tmp/Pipe_01', os.O_RDONLY)
    print("File Descriptor1 Opened " + str(pr))
    pw = os.open('/tmp/Pipe_02', os.O_WRONLY)
    print("File Descriptor2 Opened " + str(pw))

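    while True:

        # Blocks until C writes a request to the pipe.
        v = os.read(pr, 64)
        print("Python read from pipe pr")
        print(v)

        # C sends b'99' as the exit request.
        if v == b'99':
            os.close(pr)
            os.close(pw)
            print("Python is terminating")
            os._exit(os.EX_OK)

        # os.read() returns bytes, so compare against a bytes literal.
        if v != b"Send child PID":
            # The trailing \n lets the reader in C see a complete message.
            os.write(pw, b"OK message received\n")
            print("Python wrote back")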

if __name__ == '__main__':
    a = 0
    b = 0
    p = Process(target=f, args=(a, b,))
    p.start()
    p.join()

The variables a and b are not currently used in the body, but they will be 
later. 

This is the part of the C code that communicates with Python:

    fifo_fd1 = open(fifo_path1, O_WRONLY);
    fifo_fd2 = open(fifo_path2, O_RDONLY);

    status_write = write(fifo_fd1, py_msg_01, sizeof(py_msg_01));
    if (status_write < 0) perror("write");

    status_read = read(fifo_fd2, fifo_readbuf, sizeof(py_msg_01));
    if (status_read < 0) perror("read");
    printf("C received message 1 from Python\n");
    printf("%.*s",(int)buf_len, fifo_readbuf);

    status_write = write(fifo_fd1, py_msg_02, sizeof(py_msg_02));
    if (status_write < 0) perror("write");

    status_read = read(fifo_fd2, fifo_readbuf, sizeof(py_msg_02));
    if (status_read < 0) perror("read");
    printf("C received message 2 from Python\n");
    printf("%.*s",(int)buf_len, fifo_readbuf);

    // Terminate Python multiprocessing
    printf("C is sending exit message to Python\n");
    status_write = write(fifo_fd1, py_msg_03, 2);

    printf("C is closing\n");
    close(fifo_fd1);
    close(fifo_fd2);

Screen output:

Python is running
child process id: 5353
Python now in function f
File Descriptor1 Opened 6
Thread created 0
File Descriptor2 Opened 7
Process ID: 5351
Parent Process ID: 5351
I am the parent
Core joined 0
I am the child
Python read from pipe pr
b'Hello to Python from C\x00\x00'
Python wrote back
C received message 1 from Python
OK message received
Python read from pipe pr
b'Message to Python 2\x00\x00'
Python wrote back
C received message 2 from Python
OK message received
C is sending exit message to Python
C is closing
Python read from pipe pr
b'99'
Python is terminating

Python runs on a separate thread (created with pthreads) because I want the 
flexibility of using this same basic code as a stand-alone .exe, or for a C 
extension from Python called with ctypes.  If I use it as a C extension then I 
want the Python code on a separate thread because I can't have two instances of 
the Python interpreter running on one thread, and one instance will already be 
running on the main thread, albeit "suspended" by the call from ctypes. 

So that's my solution:  (1) Python multiprocessing module; (2) Python strings 
written to the pipe must be terminated with \n. 
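
The two points can be tried out without the C side using the sketch below, in which a Python parent stands in for the C program. This is an illustration under stated assumptions rather than the code from this post: the FIFOs live in a temporary directory instead of /tmp, the names responder, read_line and run_demo are hypothetical, and the "fork" start method is requested explicitly, so it is POSIX only.

```python
import multiprocessing
import os
import tempfile


def responder(request_path, reply_path):
    """Child side: answer each request with a newline-terminated reply."""
    pr = os.open(request_path, os.O_RDONLY)
    pw = os.open(reply_path, os.O_WRONLY)
    try:
        while True:
            v = os.read(pr, 64)
            # b"" means the writer closed its end; b"99" is the exit request.
            if v in (b"", b"99"):
                break
            os.write(pw, b"OK message received\n")
    finally:
        os.close(pr)
        os.close(pw)


def read_line(fd):
    """Read one byte at a time until the newline terminator arrives."""
    chunks = []
    while True:
        byte = os.read(fd, 1)
        if byte in (b"", b"\n"):
            return b"".join(chunks)
        chunks.append(byte)


def run_demo():
    """Parent side, standing in for the C program."""
    with tempfile.TemporaryDirectory() as tmp:
        request_path = os.path.join(tmp, "Pipe_01")
        reply_path = os.path.join(tmp, "Pipe_02")
        os.mkfifo(request_path)
        os.mkfifo(reply_path)

        # Assumption: use the fork start method, as on Linux by default.
        ctx = multiprocessing.get_context("fork")
        child = ctx.Process(target=responder, args=(request_path, reply_path))
        child.start()

        # Open in the same order as the child to avoid blocking forever.
        fd_out = os.open(request_path, os.O_WRONLY)
        fd_in = os.open(reply_path, os.O_RDONLY)
        replies = []
        for message in (b"Hello to Python from C", b"Message to Python 2"):
            os.write(fd_out, message)
            replies.append(read_line(fd_in))

        os.write(fd_out, b"99")  # ask the child to exit
        child.join()
        os.close(fd_out)
        os.close(fd_in)
        return replies
```

Calling run_demo() returns the replies with the newline stripped. The reader knows a message is complete because of the terminator, not because of how many bytes happened to arrive in one read.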

Thanks again to all who commented. 



Dec 6, 2021, 13:33 by ba...@barrys-emacs.org:

>
>
>
>> On 6 Dec 2021, at 21:05, Jen Kris <jenk...@tutanota.com> wrote:
>>
>> Here is what I don't understand from what you said.  "The child process is 
>> created with a single thread—the one that called fork()."  To me that 
>> implies that the thread that called fork() is the same thre