Re: Short, perfect program to read sentences of webpage
On 08Dec2021 23:17, Stefan Ram wrote:
> Regexps might have their disadvantages, but when I use them,
> it is clearer for me to do all the matching with regexps
> instead of mixing them with Python calls like str.isupper.
> Therefore, it is helpful for me to have a regexp to match
> upper and lower case characters separately. Some regexp
> dialects support "\p{Lu}" and "\p{Ll}" for this.

Aye. I went looking for that in the Python re module docs and could not find them. So the compromise is to match any word, then test the word with isupper() (or whatever is appropriate).

> I have not yet incorporated (all) your advice into my code,
> but I came to the conclusion myself that the repetition of
> long sequences like r"A-ZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝ" and
> not using f strings to insert other strings was especially
> ugly.

The tricky bit with f-strings and regexps is that \w{3,5} means from 3 through 5 "word characters". So if you've got those in an f-string you're off to double-the-brackets land, a bit like double backslash land and non-raw-strings.

Otherwise, yes, f-strings are a nice way to compose things.

Cheers,
Cameron Simpson
--
https://mail.python.org/mailman/listinfo/python-list
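To make the "double-the-brackets" point concrete, here is a minimal sketch. The names `low`, `high` and the sample sentence are made up for illustration. In a raw f-string, `{{` and `}}` produce literal braces, while `{low}` and `{high}` are substituted.

```python
import re

low, high = 3, 5  # hypothetical bounds, only for this illustration

# {{ and }} become literal braces; {low} and {high} are interpolated.
pattern = rf"\b\w{{{low},{high}}}\b"

# Find every word that is 3 to 5 word characters long.
words = re.findall(pattern, "a bee flew over seven gardens")
```

The resulting pattern is the same text you would have typed by hand as `r"\b\w{3,5}\b"`; only the spelling inside the f-string changes.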
Re: Odd locale error that has disappeared on reboot.
On Wed, Dec 8, 2021 at 2:52 AM Chris Green wrote:
>
> At 03:40 last night it suddenly started throwing the following error every
> time it ran:-
>
>     Fatal Python error: initfsencoding: Unable to get the locale encoding
>     LookupError: unknown encoding: UTF-8
>
>     Current thread 0xb6f8db40 (most recent call first):
>     Aborted
>
> Running the program from the command line produced the same error.
> Restarting the Pi system has fixed the problem.
>

This error means Python cannot find its standard library. There are some possibilities.

* You set the wrong PYTHONHOME.
  PYTHONHOME is very rarely useful. It should not be used unless you are able to solve this kind of problem yourself.

* Your Python installation is broken.
  Some files were deleted or overwritten. You need to do a *clean* install of Python again.

Best,
--
Inada Naoki
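A small sketch of how one might check the two possibilities above. The function name is hypothetical. A badly broken interpreter will fail before it gets this far, so this is mainly useful for comparing a working and a suspect environment (for example a cron job's environment against an interactive shell).

```python
import codecs
import os
import sys


def describe_python_environment():
    """Collect the settings that decide where Python finds its standard library."""
    return {
        # Normally unset; a wrong value makes Python look in the wrong place.
        "PYTHONHOME": os.environ.get("PYTHONHOME"),
        # The installation prefix this interpreter is actually using.
        "prefix": sys.prefix,
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Raises LookupError if the codec cannot be found.
        "utf8_codec": codecs.lookup("UTF-8").name,
    }


for key, value in describe_python_environment().items():
    print(f"{key}: {value}")
```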
Re: Short, perfect program to read sentences of webpage
On 2021-12-08 23:17, Stefan Ram wrote: Cameron Simpson writes: Instead, consider the \b (word boundary) and \w (word character) markers, which will let you break strings up, and then maybe test the results with str.isupper(). Thanks for your comments, most or all of them are valid, and I will try to take them into account! Regexps might have their disadvantages, but when I use them, it is clearer for me to do all the matching with regexps instead of mixing them with Python calls like str.isupper. Therefore, it is helpful for me to have a regexp to match upper and lower case characters separately. Some regexp dialects support "\p{Lu}" and "\p{Ll}" for this. If you want "\p{Lu}" and "\p{Ll}", have a look at the 'regex' module on PyPI: https://pypi.org/project/regex/ [snip] -- https://mail.python.org/mailman/listinfo/python-list
[RELEASE] Python 3.11.0a3 is available
You can tell that we are slowly getting closer to the first beta as the number of release blockers that we need to fix on every release starts to increase. But we did it! Thanks to Steve Dower, Ned Deily, Christian Heimes, Łukasz Langa and Mark Shannon, who helped get things ready for this release :)

Go get the new version here:

https://www.python.org/downloads/release/python-3110a3/

**This is an early developer preview of Python 3.11**

# Major new features of the 3.11 series, compared to 3.10

Python 3.11 is still in development. This release, 3.11.0a3, is the third of seven planned alpha releases.

Alpha releases are intended to make it easier to test the current state of new features and bug fixes and to test the release process.

During the alpha phase, features may be added up until the start of the beta phase (2022-05-06) and, if necessary, may be modified or deleted up until the release candidate phase (2022-08-01). Please keep in mind that this is a preview release and its use is **not** recommended for production environments.

Many new features for Python 3.11 are still being planned and written. Among the major new features and changes so far:

* [PEP 657](https://www.python.org/dev/peps/pep-0657/) -- Include Fine-Grained Error Locations in Tracebacks
* [PEP 654](https://www.python.org/dev/peps/pep-0654/) -- Exception Groups and except*
* The [Faster CPython Project](https://github.com/faster-cpython) is already yielding some exciting results: this version of CPython 3.11 is ~12% faster on the geometric mean of the [PyPerformance benchmarks](speed.python.org), compared to 3.10.0.
* Hey, **fellow core developer,** if a feature you find important is missing from this list, let me know.

The next pre-release of Python 3.11 will be 3.11.0a4, currently scheduled for Monday, 2022-01-03.

# More resources

* [Online Documentation](https://docs.python.org/3.11/)
* [PEP 664](https://www.python.org/dev/peps/pep-0664/), 3.11 Release Schedule
* Report bugs at [https://bugs.python.org](https://bugs.python.org).
* [Help fund Python and its community](/psf/donations/).

# And now for something completely different

Rayleigh scattering, named after the nineteenth-century British physicist Lord Rayleigh, is the predominantly elastic scattering of light or other electromagnetic radiation by particles much smaller than the wavelength of the radiation. For light frequencies well below the resonance frequency of the scattering particle, the amount of scattering is inversely proportional to the fourth power of the wavelength. Rayleigh scattering results from the electric polarizability of the particles. The oscillating electric field of a light wave acts on the charges within a particle, causing them to move at the same frequency. The particle, therefore, becomes a small radiating dipole whose radiation we see as scattered light. The particles may be individual atoms or molecules; it can occur when light travels through transparent solids and liquids but is most prominently seen in gases.

The strong wavelength dependence of the scattering means that shorter (blue) wavelengths are scattered more strongly than longer (red) wavelengths. This results in the indirect blue light coming from all regions of the sky.

# We hope you enjoy those new releases!

Thanks to all of the many volunteers who help make Python Development and these releases possible! Please consider supporting our efforts by volunteering yourself or through organization contributions to the Python Software Foundation.

Your friendly release team,

Pablo Galindo @pablogsal
Ned Deily @nad
Steve Dower @steve.dower
Re: Short, perfect program to read sentences of webpage
On 2021-12-09 09:42:07 +1100, Cameron Simpson wrote:
> On 08Dec2021 21:41, Stefan Ram wrote:
> >Julius Hamilton writes:
> >>This is a really simple program which extracts the text from webpages and
> >>displays them one sentence at a time.
> >
> > Our teacher said NLTK will not come up until next year, so
> > I tried to do with regexps. It still has bugs, for example
> > it can not tell the dot at the end of an abbreviation from
> > the dot at the end of a sentence!
>
> This is almost a classic demo of why regexps are a poor tool as a first
> choice. You can do much with them, but they are cryptic and bug prone.

I don't think that's the problem here. The problem is that natural languages just aren't regular languages. In fact I'm not sure that they fit anywhere within the Chomsky hierarchy (but if they aren't type-0, that would be a strong argument against the possibility of human-level AI).

In English, if a sentence ends with an abbreviation you write only a single dot. So if you look at these two fragments:

    For matching strings, numbers, etc. Python provides regular expressions.

    Let's say you want to match strings, numbers, etc. Python provides regular expressions for these tasks.

In the second case the dot ends a sentence; in the first it doesn't. But to distinguish those cases you need to at least parse the sentences at the syntax level (which regular expressions can't do), maybe even understand them semantically.

        hp

--
Peter J. Holzer | h...@hjp.at | http://www.hjp.at/
"Story must make more sense than reality."
  -- Charles Stross, "Creative writing challenge!"
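The two fragments above can be fed to a deliberately naive regexp splitter to see the problem. This is only an illustrative sketch; `naive_split` is a made-up helper, not anything from the thread or from NLTK.

```python
import re


def naive_split(text):
    """Split after a dot that is followed by whitespace and a capital letter."""
    return re.split(r"(?<=\.)\s+(?=[A-Z])", text)


first = "For matching strings, numbers, etc. Python provides regular expressions."
second = ("Let's say you want to match strings, numbers, etc. "
          "Python provides regular expressions for these tasks.")

# Both fragments are cut at "etc. Python", although only the second one
# really contains two sentences.
first_parts = naive_split(first)
second_parts = naive_split(second)
```

The pattern sees exactly the same characters around the dot in both cases, so no amount of tuning at that level can tell them apart.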
Re: Short, perfect program to read sentences of webpage
On 08Dec2021 21:41, Stefan Ram wrote: >Julius Hamilton writes: >>This is a really simple program which extracts the text from webpages and >>displays them one sentence at a time. > > Our teacher said NLTK will not come up until next year, so > I tried to do with regexps. It still has bugs, for example > it can not tell the dot at the end of an abbreviation from > the dot at the end of a sentence! This is almost a classic demo of why regexps are a poor tool as a first choice. You can do much with them, but they are cryptic and bug prone. I am not seeking to mock you, but trying to make apparent why regexps are to be avoided a lot of the time. They have their place. You've read the whole re module docs I hope: https://docs.python.org/3/library/re.html#module-re >import re >import urllib.request >uri = r'''http://example.com/article''' # replace this with your URI! >request = urllib.request.Request( uri ) >resource = urllib.request.urlopen( request ) >cs = resource.headers.get_content_charset() >content = resource.read().decode( cs, errors="ignore" ) >content = re.sub( r'''[\r\n\t\s]+''', r''' ''', content ) You're not multiline, so I would recommend a plain raw string: content = re.sub( r'[\r\n\t\s]+', r' ', content ) No need for \r in the class, \s covers that. From the docs: \s For Unicode (str) patterns: Matches Unicode whitespace characters (which includes [ \t\n\r\f\v], and also many other characters, for example the non-breaking spaces mandated by typography rules in many languages). If the ASCII flag is used, only [ \t\n\r\f\v] is matched. >upper = r"[A-ZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝ]" # "[\\p{Lu}]" >lower = r"[a-zµàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ]" # "[\\p{Ll}]" This is very fragile - you have an arbitrary set of additional uppercase characters, almost certainly incomplete, and visually hard to inspect for completeness. 
Instead, consider the \b (word boundary) and \w (word character) markers, which will let you break strings up, and then maybe test the results with str.isupper().

>digit = r"[0-9]" #"[\\p{Nd}]"

There's a \d character class for this; in str patterns it covers non-ASCII decimal digits too.

>firstwordstart = upper;
>firstwordnext = "(?:[a-zµàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ-])";

Again, an inline arbitrary list of characters. This is fragile.

>wordcharacter = "[A-ZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝa-zµàáâãäåæçèéêëìíîïð\
>ñòóôõöøùúûüýþÿ0-9-]"

Again inline. Why not construct it?

    wordcharacter = upper + lower + digit

but I recommend \w instead, or for this: [\w\d]

>addition = "(?:(?:[']" + wordcharacter + "+)*[']?)?"

As a matter of good practice with regexp strings, use raw quotes:

    addition = r"(?:(?:[']" + wordcharacter + r"+)*[']?)?"

even when there are no backslashes.

Seriously, doing this with regexps is difficult. A useful exercise for learning regexps, but in the general case not the first tool to reach for.

Cheers,
Cameron Simpson
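A minimal sketch of the "match any word, then test it in Python" approach described above. The function name and the sample sentence are invented for illustration.

```python
import re


def capitalized_words(text):
    """Return the words whose first character is an uppercase letter."""
    # \w is Unicode-aware for str patterns, so no hand-written
    # list of accented letters is needed.
    words = re.findall(r"\b\w+\b", text)
    return [word for word in words if word[0].isupper()]


found = capitalized_words("Émile met Åsa in Zürich on monday.")
```

Because `str.isupper()` consults the Unicode database, this also handles uppercase letters that the hand-typed character classes above would miss.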
Re: Python child process in while True loop blocks parent
On 2021-12-08 18:11:48 +0100, Jen Kris via Python-list wrote:
> To recap, I'm using a pair of named pipes for IPC between C and
> Python. Python runs as a child process after fork-execv. The Python
> program continues to run concurrently in a while True loop, and
> responds to requests from C at intervals, and continues to run until
> it receives a signal from C to exit. C sends signals to Python, then
> waits to receive data back from Python. My problem was that C was
> blocked when Python started.
>
> The solution was twofold: (1) for Python to run concurrently it must
> be a multiprocessing loop (from the multiprocessing module),

I don't see how this could achieve anything. It starts another (third) process, but then it just does all the work in that process and just waits for it. Doing the same work in the original Python process should have exactly the same effect.

> and (2) Python must terminate its write strings with \n, or read will
> block in C waiting for something that never comes.

That's also strange. You are using os.write in Python and read in C, neither of which should care about newlines.

> The multiprocessing module sidesteps the GIL; without multiprocessing
> the GIL will block all other threads once Python starts.

Your Python interpreter runs in a different process than your C code. There is absolutely no way the GIL could block threads in your C program. And your Python code doesn't need to use more than one thread or process.

        hp

--
Peter J. Holzer | h...@hjp.at | http://www.hjp.at/
"Story must make more sense than reality."
  -- Charles Stross, "Creative writing challenge!"
Re: Short, perfect program to read sentences of webpage
On 2021-12-08, Julius Hamilton wrote:
> 1. The HTML extraction is not perfect. It doesn’t produce as clean text as
> I would like. Sometimes random links or tags get left in there. And the
> sentences are sometimes randomly broken by newlines.

Oh. Leaving tags in suggests you are doing this very wrongly. Python has plenty of open source libraries you can use that will parse the HTML reliably into tags and text for you.

> 2. Neither is the segmentation perfect. I am currently researching
> developing an optimal segmenter with tools from Spacy.
>
> Brevity is greatly valued. I mean, anyone who can make the program more
> perfect, that’s hugely appreciated. But if someone can do it in very few
> lines of code, that’s also appreciated.

It isn't something that can be done in a few lines of code. There's the spaces issue you mention, for example. Nor is it something that can necessarily be done just by inspecting the HTML alone. To take a trivial example:

    powergenitalia = powergen italia

but:

    powergenitalia= powergenitalia

but the second with the addition of:

    span { display: block }

is back to "powergen italia". So you need to parse and apply styles (including external stylesheets) as well. Potentially you may also need to execute JavaScript on the page, which means you also need a JavaScript interpreter and a DOM implementation. Basically you need a complete browser to do it on general web pages.
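As a rough sketch of "parse the HTML into tags and text", the standard library's `html.parser` is enough to drop tags and skip script and style content. The class and function names below are invented for this example, and it deliberately ignores CSS and JavaScript, so it shows the limitation described above as much as the technique.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def extract_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join the text nodes with spaces and collapse runs of whitespace.
    return " ".join(" ".join(parser.chunks).split())


plain = extract_text("<p>Hello <b>world</b></p><script>var x = 1;</script>")
guess = extract_text("<span>powergen</span><span>italia</span>")
```

Note that `guess` comes out as "powergen italia" only because this sketch always puts a space between text nodes; a browser would render those two spans without a gap unless a stylesheet says otherwise, which is exactly why the markup alone cannot settle the question.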
Re: Python child process in while True loop blocks parent
On 08Dec2021 18:11, Jen Kris wrote:
>Python must terminate its write strings with \n, or read will block in
>C waiting for something that never comes.

There are two aspects to this:

- if upstream is reading "lines of text" then you need a newline to terminate the lines
- you (probably) should flush the output pipe (Python to C) after the newline

I see you're using file descriptors and os.write() to send data. This is unbuffered, so there is nothing to flush, so you have not encountered the second point above. But if you shift to using a Python file object for your output (eg f=os.fdopen(pw, "w"), which would let you use print() or any number of other things which work with Python files), your file object would have a buffer, and normally that would not be sent to the pipe until it was full.

So your deadlock issue has 2 components:

- you need to terminate your records for upstream (C) to see complete records. Your records are lines, so you need a newline character.
- you need to ensure the whole record has been sent upstream (been written to the pipe). If you use a buffered Python file object for your output, you need to flush it at your synchronisation points or upstream will not receive the buffer contents. That synchronisation point for you is the end of the record.

Hopefully this makes the flow considerations more clear.

Cheers,
Cameron Simpson
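The two components can be tried out inside a single process with an anonymous pipe. This is only a sketch: the real setup in the thread uses named pipes with a C program on the other end, and `send_record` is a made-up helper.

```python
import os


def send_record(record):
    """Write one newline-terminated record through a pipe and read it back."""
    read_fd, write_fd = os.pipe()
    with os.fdopen(write_fd, "w") as out, os.fdopen(read_fd, "r") as inp:
        # print() appends the newline that terminates the record.
        print(record, file=out)
        # Without this flush the text stays in the file object's buffer
        # and the readline() below would wait forever.
        out.flush()
        return inp.readline()


received = send_record("temperature 21.5")
```

With plain `os.write()` on the descriptor there is no Python-side buffer, which is why the flush step never showed up as a problem in the original program.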
Re: Short, perfect program to read sentences of webpage
Assorted remarks inline below:

On 08Dec2021 20:39, Julius Hamilton wrote:
>deepreader.py:
>
>import sys
>import requests
>import html2text
>import nltk
>
>url = sys.argv[1]

I might spell this:

    cmd, url = sys.argv

which enforces exactly one argument. And since you don't care about the command name, maybe:

    _, url = sys.argv

because "_" is a conventional name for "a value we do not care about".

>sentences = nltk.sent_tokenize(html2text.html2text(requests.get(url).text))

Neat!

># Activate an elementary reader interface for the text
>for index, sentence in enumerate(sentences):

I would be inclined to count from 1, so "enumerate(sentences, 1)".

>  # Print the sentence
>  print("\n" + str(index) + "/" + str(len(sentences)) + ": " + sentence + "\n")

Personally, since print() adds a trailing newline, I would drop the final +"\n". If you want an additional blank line, I would put it in the input() call below:

>  # Wait for user key-press
>  x = input("\n> ")

You're not using "x". Just discard input()'s return value:

    input("\n> ")

>A lot of refining is possible, and I’d really like to see how some more
>experienced people might handle it.
>
>1. The HTML extraction is not perfect. It doesn’t produce as clean text as
>I would like. Sometimes random links or tags get left in there.

Maybe try beautifulsoup instead of html2text? The module name is "bs4".

>And the
>sentences are sometimes randomly broken by newlines.

I would flatten the newlines. Either the simple:

    sentence = sentence.strip().replace("\n", " ")

or maybe better:

    sentence = " ".join(sentence.split())

Cheers,
Cameron Simpson
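Putting the numbering and whitespace remarks together, a small sketch without the network and NLTK parts. The sample sentences and the helper name `flatten` are made up.

```python
def flatten(sentence):
    """Collapse newlines and repeated whitespace into single spaces."""
    return " ".join(sentence.split())


# Stand-in for the output of the sentence tokenizer.
sentences = ["First sentence,\nbroken by a newline.", "  Second   one. "]

# Count from 1 so the display reads "1/2", "2/2".
numbered = [
    f"{index}/{len(sentences)}: {flatten(sentence)}"
    for index, sentence in enumerate(sentences, 1)
]

for line in numbered:
    print(line)
```

`str.split()` with no argument splits on any run of whitespace, so it handles tabs and leading or trailing blanks as well as newlines, which the `replace("\n", " ")` variant does not.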
Re: Short, perfect program to read sentences of webpage
On 2021-12-08 19:39, Julius Hamilton wrote:
> Hey,
>
> This is something I have been working on for a very long time. It’s one of the reasons I got into programming at all. I’d really appreciate if people could input some advice on this.
>
> This is a really simple program which extracts the text from webpages and displays them one sentence at a time. It’s meant to help you study dense material, especially documentation, with much more focus and comprehension. I actually hope it can be of help to people who have difficulty reading. I know it’s been of use to me at least.
>
> This is a minimally acceptable way to pull it off currently:
>
> deepreader.py:
>
> import sys
> import requests
> import html2text
> import nltk
>
> url = sys.argv[1]
>
> # Get the html, pull out the text, and sentence-segment it in one line of code
> sentences = nltk.sent_tokenize(html2text.html2text(requests.get(url).text))
>
> # Activate an elementary reader interface for the text
> for index, sentence in enumerate(sentences):
>
>     # Print the sentence
>     print("\n" + str(index) + "/" + str(len(sentences)) + ": " + sentence + "\n")

You can shorten that with format strings:

    print("\n{}/{}: {}\n".format(index, len(sentences), sentence))

or even:

    print(f"\n{index}/{len(sentences)}: {sentence}\n")

>     # Wait for user key-press
>     x = input("\n> ")
>
> EOF
>
> That’s it.
>
> A lot of refining is possible, and I’d really like to see how some more experienced people might handle it.
>
> 1. The HTML extraction is not perfect. It doesn’t produce as clean text as I would like. Sometimes random links or tags get left in there. And the sentences are sometimes randomly broken by newlines.
>
> 2. Neither is the segmentation perfect. I am currently researching developing an optimal segmenter with tools from Spacy.
>
> Brevity is greatly valued. I mean, anyone who can make the program more perfect, that’s hugely appreciated. But if someone can do it in very few lines of code, that’s also appreciated.
>
> Thanks very much,
> Julius
Short, perfect program to read sentences of webpage
Hey,

This is something I have been working on for a very long time. It’s one of the reasons I got into programming at all. I’d really appreciate if people could input some advice on this.

This is a really simple program which extracts the text from webpages and displays them one sentence at a time. It’s meant to help you study dense material, especially documentation, with much more focus and comprehension. I actually hope it can be of help to people who have difficulty reading. I know it’s been of use to me at least.

This is a minimally acceptable way to pull it off currently:

deepreader.py:

    import sys
    import requests
    import html2text
    import nltk

    url = sys.argv[1]

    # Get the html, pull out the text, and sentence-segment it in one line of code
    sentences = nltk.sent_tokenize(html2text.html2text(requests.get(url).text))

    # Activate an elementary reader interface for the text
    for index, sentence in enumerate(sentences):

        # Print the sentence
        print("\n" + str(index) + "/" + str(len(sentences)) + ": " + sentence + "\n")

        # Wait for user key-press
        x = input("\n> ")

EOF

That’s it.

A lot of refining is possible, and I’d really like to see how some more experienced people might handle it.

1. The HTML extraction is not perfect. It doesn’t produce as clean text as I would like. Sometimes random links or tags get left in there. And the sentences are sometimes randomly broken by newlines.

2. Neither is the segmentation perfect. I am currently researching developing an optimal segmenter with tools from Spacy.

Brevity is greatly valued. I mean, anyone who can make the program more perfect, that’s hugely appreciated. But if someone can do it in very few lines of code, that’s also appreciated.

Thanks very much,
Julius
Re: python problem
On 12/8/21 11:18, Larry Warner wrote: I am new at Python. I have installed Python 3.10.1 and the latest Pycharm. When I attempt to execute anything via Pycharm or the command line, I receive a message it can not find Python. I do not know where Python was loaded or where to find and to update PATH to the program. Since you don't mention which operating system you are using or where you got your Python from, it's hard for anyone to help. Do the general notes here help? https://docs.python.org/3/using/ This problem usually strikes on Windows, where all the commands don't go into a small number of directories which are searched by default. Following the Windows link in the page above should describe some approaches to dealing with that. -- https://mail.python.org/mailman/listinfo/python-list
Re: How to package a Python command line app?
Hi Loris,

On Wed, 08 Dec 2021 15:38:48 +0100 "Loris Bennett" wrote:
> Hi Manfred,
>
> Manfred Lotz writes:
>
> > There are many possibilities to package a Python app, and I have to
> > admit I am pretty confused.
> >
> > Here is what I have:
> >
> > A Python command line app which requires some packages which are
> > not in the standard library.
> >
> > I am on Linux and like to have an executable (perhaps a zip file
> > with a shebang; whatever) which runs on different Linux systems.
> >
> > Different means
> > - perhaps different glibc versions
> > - perhaps different Python versions
> >
> > In my specific case this is:
> > - RedHat 8.4 with Python 3.6.8
> > - Ubuntu 20.04 LTS with Python 3.8.10
> > - and finally Fedora 33 with Python 3.9.9
> >
> >
> > Is this possible to do? If yes which tool would I use for this?
>
> I use poetry[1] on CentOS 7 to handle all the dependencies and create
> a wheel which I then install to a custom directory with pip3.
>
> You would checkout the repository with your code on the target system,
> start a poetry shell using the Python version required, and then build
> the wheel. From outside the poetry shell you can set PYTHONUSERBASE
> and then install with pip3. You then just need to set PYTHONPATH
> appropriately where ever you want to use your software.
>

In my case it could happen that I do not have access to the target system but want to hand over the Python app to somebody else. This person just wants to run it.

> Different Python versions shouldn't be a problem. If some module
> depends on a specific glibc version, then you might end up in standard
> dependency-hell territory, but you can pin module versions of
> dependencies in poetry, and you could also possibly use different
> branches within your repository to handle that.
>

I try to avoid using modules which depend on a specific glibc.

Although it seems that it doesn't really help for my use case, I will play with poetry to get a better understanding of its capabilities.
-- Thanks a lot, Manfred -- https://mail.python.org/mailman/listinfo/python-list
Re: How to package a Python command line app?
Hi Manfred,

Manfred Lotz writes:

> There are many possibilities to package a Python app, and I have to admit
> I am pretty confused.
>
> Here is what I have:
>
> A Python command line app which requires some packages which are not in
> the standard library.
>
> I am on Linux and like to have an executable (perhaps a zip file with a
> shebang; whatever) which runs on different Linux systems.
>
> Different means
> - perhaps different glibc versions
> - perhaps different Python versions
>
> In my specific case this is:
> - RedHat 8.4 with Python 3.6.8
> - Ubuntu 20.04 LTS with Python 3.8.10
> - and finally Fedora 33 with Python 3.9.9
>
>
> Is this possible to do? If yes which tool would I use for this?

I use poetry[1] on CentOS 7 to handle all the dependencies and create a wheel which I then install to a custom directory with pip3.

You would check out the repository with your code on the target system, start a poetry shell using the Python version required, and then build the wheel. From outside the poetry shell you can set PYTHONUSERBASE and then install with pip3. You then just need to set PYTHONPATH appropriately wherever you want to use your software.

Different Python versions shouldn't be a problem. If some module depends on a specific glibc version, then you might end up in standard dependency-hell territory, but you can pin module versions of dependencies in poetry, and you could also possibly use different branches within your repository to handle that.

HTH

Loris

Footnotes:
[1] https://python-poetry.org

--
This signature is currently under construction.
Re: Odd locale error that has disappeared on reboot.
Julio Di Egidio wrote: > On 07/12/2021 16:28, Chris Green wrote: > > I have a very short Python program that runs on one of my Raspberry > > Pis to collect temperatures from a 1-wire sensor and write them to a > > database:- > > > > #!/usr/bin/python3 > > # > > # > > # read temperature from 1-wire sensor and store in database with date > > and time > > # > > import sqlite3 > > import time > > > > ftxt = > > str(open("/sys/bus/w1/devices/28-01204e1e64c3/w1_slave").read(100)) > > temp = (float(ftxt[ftxt.find("t=") +2:]))/1000 > > # > > # > > # insert date, time and temperature into the database > > # > > tdb = > > sqlite3.connect("/home/chris/.cfg/share/temperature/temperature.db") > > cr = tdb.cursor() > > dt = time.strftime("%Y-%m-%d %H:%M") > > cr.execute("Insert INTO temperatures (DateTime, Temperature) VALUES(?, > round(?, 2))", (dt, temp) > > ) > > tdb.commit() > > tdb.close() > > > > It's run by cron every 10 minutes. > > > > > > At 03:40 last night it suddenly started throwing the following error every > > time it ran:- > > > > Fatal Python error: initfsencoding: Unable to get the locale encoding > > LookupError: unknown encoding: UTF-8 > > > > Current thread 0xb6f8db40 (most recent call first): > > Aborted > > > > Running the program from the command line produced the same error. > > Restarting the Pi system has fixed the problem. > > > > > > What could have caused this? I certainly wasn't around at 03:40! :-) > > There aren't any automatic updates enabled on the system, the only > > thing that might have been going on was a backup as that Pi is also > > my 'NAS' with a big USB drive connected to it. The backups have been > > running without problems for more than a year. Looking at the system > > logs shows that a backup was started at 03:35 so I suppose that *could* > > have provoked something but I fail to understand how. > > Since it's a one-off, doesn't sound like a system problem. 
The easiest > might be that you try-catch that call and retry when needed, and I'd > also check that 'ftxt' is what it should be: "external devices" may > fail, including when they do produce output... > Well it repeated every ten minutes through the night until I rebooted the system, so it wasn't really a "one off". It hasn't repeated since though. -- Chris Green
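The suggestion to check that 'ftxt' is what it should be could look roughly like this. It is a sketch under assumptions: the sample text below is my recollection of what a DS18B20 sensor typically writes to w1_slave (a CRC line ending in YES, then a line with t= in millidegrees), and `parse_w1_slave` is a made-up name. It would not have prevented the locale error, which happens before any of the script runs.

```python
def parse_w1_slave(text):
    """Return the temperature in degrees Celsius, or raise ValueError."""
    lines = text.strip().splitlines()
    # Assumed format: the first line ends in YES when the CRC check passed.
    if len(lines) < 2 or not lines[0].rstrip().endswith("YES"):
        raise ValueError("CRC check failed or output incomplete")
    marker = lines[1].find("t=")
    if marker == -1:
        raise ValueError("no temperature field found")
    return int(lines[1][marker + 2:]) / 1000


# Hypothetical sample reading, not taken from the original poster's sensor.
sample = (
    "72 01 4b 46 7f ff 0e 10 57 : crc=57 YES\n"
    "72 01 4b 46 7f ff 0e 10 57 t=23125\n"
)
temperature = parse_w1_slave(sample)
```

A caller could catch ValueError, wait briefly and read the file again, and only write to the database once a reading parses cleanly.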
How to package a Python command line app?
There are many possibilities to package a Python app, and I have to admit I am pretty confused.

Here is what I have:

A Python command line app which requires some packages which are not in the standard library.

I am on Linux and like to have an executable (perhaps a zip file with a shebang; whatever) which runs on different Linux systems.

Different means
- perhaps different glibc versions
- perhaps different Python versions

In my specific case this is:
- RedHat 8.4 with Python 3.6.8
- Ubuntu 20.04 LTS with Python 3.8.10
- and finally Fedora 33 with Python 3.9.9

Is this possible to do? If yes which tool would I use for this?

--
Manfred
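For the "zip file with a shebang" idea, the standard library's zipapp module builds exactly that kind of archive. Nobody in the thread proposed it, so treat the following as one possible sketch rather than the recommended answer. The directory and file names are invented, and the tiny `__main__.py` stands in for a real application.

```python
import pathlib
import subprocess
import sys
import tempfile
import zipapp


def build_and_run_demo():
    """Build a .pyz archive from a throwaway source tree and execute it."""
    with tempfile.TemporaryDirectory() as tmp:
        source = pathlib.Path(tmp) / "myapp"          # hypothetical app directory
        source.mkdir()
        (source / "__main__.py").write_text('print("hello from the archive")\n')

        target = pathlib.Path(tmp) / "myapp.pyz"
        # The interpreter argument becomes the shebang line of the archive.
        zipapp.create_archive(str(source), str(target),
                              interpreter="/usr/bin/env python3")

        # Run it with the current interpreter so the demo does not
        # depend on what "python3" resolves to on this machine.
        result = subprocess.run([sys.executable, str(target)],
                                capture_output=True, text=True, check=True)
        return result.stdout


output = build_and_run_demo()
```

Third-party pure-Python packages can be placed in the source directory before building (for example with pip's --target option). Packages with compiled extensions cannot be imported from inside a zip archive, and the code still has to be compatible with the oldest Python among the target systems, so this does not address the glibc concern.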
Re: For a hierarchical project, the EXE file generated by "pyinstaller" does not start.
Chris Angelico wrote on Tuesday, 7 December 2021 at 19:16:54 UTC+1:
> On Wed, Dec 8, 2021 at 4:49 AM Mohsen Owzar wrote:
> > ***
> > GPIOContrl.py
> > ***
> > class GPIOControl:
> >     def my_print(self, args):
> >         if print_allowed == 1:
> >             print(args)
> >
> >     def __init__(self):
> Can't much help with your main question as I don't do Windows, but one
> small side point: Instead of having a my_print that checks if printing
> is allowed, you can conditionally replace the print function itself.
>
> if not print_allowed:
>     def print(*args, **kwargs): pass
>
> ChrisA

Thanks Chris,

Your answer didn't help me to solve my problem, but gave me another idea for writing a conditional print statement.

Regards
Mohsen
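A slightly more testable variation on the same idea: choose the function once and bind it to a name, instead of checking a flag on every call. `make_print` is a made-up helper and the messages are placeholders.

```python
import contextlib
import io


def make_print(print_allowed):
    """Return the real print, or a do-nothing stand-in with the same call shape."""
    if print_allowed:
        return print

    def silent_print(*args, **kwargs):
        pass

    return silent_print


# Capture stdout to show that only the enabled variant produces output.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    make_print(False)("GPIO debug message")   # discarded
    make_print(True)("GPIO pin set")          # printed
captured = buffer.getvalue()
```

At module level, `print = make_print(print_allowed)` would have the same effect as the conditional `def print` in the quoted suggestion.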
Re: HTML extraction
Roland Mueller writes: > But isn't bs4 only for SOAP content? > Can bs4 or lxml cope with HTML code that does not comply with XML as the > following fragment? > > A > B > > bs4 can do it, but lxml wants correct XML. Jupyter console 6.4.0 Python 3.9.9 (main, Nov 16 2021, 07:21:43) Type 'copyright', 'credits' or 'license' for more information IPython 7.29.0 -- An enhanced Interactive Python. Type '?' for help. In [1]: from bs4 import BeautifulSoup as bs In [2]: soup = bs('AB') In [3]: soup.p Out[3]: A In [4]: soup.find_all('p') Out[4]: [A, B] In [5]: from lxml import etree In [6]: root = etree.fromstring('AB') Traceback (most recent call last): File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3444, in run_code exec(code_obj, self.user_global_ns, self.user_ns) File "/var/folders/2l/pdng2d2x18d00m41l6r2ccjrgn/T/ipykernel_96220/3376613260.py", line 1, in root = etree.fromstring('AB') File "src/lxml/etree.pyx", line 3237, in lxml.etree.fromstring File "src/lxml/parser.pxi", line 1896, in lxml.etree._parseMemoryDocument File "src/lxml/parser.pxi", line 1777, in lxml.etree._parseDoc File "src/lxml/parser.pxi", line 1082, in lxml.etree._BaseParser._parseUnicodeDoc File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc File "src/lxml/parser.pxi", line 725, in lxml.etree._handleParseResult File "src/lxml/parser.pxi", line 654, in lxml.etree._raiseParseError File "", line 1 XMLSyntaxError: Premature end of data in tag hr line 1, line 1, column 13 -- Pieter van Oostrum www: http://pieter.vanoostrum.org/ PGP key: [8DAE142BE17999C4] -- https://mail.python.org/mailman/listinfo/python-list
python problem
I am new to Python. I have installed Python 3.10.1 and the latest PyCharm. When I attempt to execute anything via PyCharm or the command line, I receive a message that it cannot find Python. I do not know where Python was installed, or how to find it and update PATH to point to the program.

Larry
Re: Odd locale error that has disappeared on reboot.
Chris Green wrote at 2021-12-7 15:28 +:
>I have a very short Python program that runs on one of my Raspberry
>Pis to collect temperatures from a 1-wire sensor and write them to a
>database:-
> ...
>At 03:40 last night it suddenly started throwing the following error every
>time it ran:-
>
>Fatal Python error: initfsencoding: Unable to get the locale encoding
>LookupError: unknown encoding: UTF-8
>
>Current thread 0xb6f8db40 (most recent call first):
>Aborted
>
>Running the program from the command line produced the same error.
>Restarting the Pi system has fixed the problem.

Python does not have its own locale database; it uses that of the operating system. From my point of view, the operating system state seems to have become corrupted. A restart has apparently restored a clean state.
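As a rough diagnostic, the encodings Python takes from the operating system can be listed and checked against the codec registry. This is only a sketch: `encoding_report` is a hypothetical helper, and in the failure quoted above the interpreter aborts during startup, so a script like this could only be run once the system is healthy again.

```python
import codecs
import locale
import sys


def encoding_report():
    """Map each relevant encoding name to its codec name, or None if unknown."""
    names = {
        "UTF-8",                              # the name from the error message
        locale.getpreferredencoding(False),   # what the OS locale reports
        sys.getfilesystemencoding(),          # what Python uses for file names
    }
    report = {}
    for name in names:
        try:
            report[name] = codecs.lookup(name).name
        except LookupError:
            # Same exception type as in the quoted startup failure.
            report[name] = None
    return report


report = encoding_report()
for name, codec_name in sorted(report.items()):
    print(f"{name!r} -> {codec_name!r}")
```

Any `None` in the report would mean the codec registry cannot resolve that name.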
Re: HTML extraction
Roland Mueller wrote at 2021-12-7 22:55 +0200:
> ...
>Can bs4 or lxml cope with HTML code that does not comply with XML as the
>following fragment?

`lxml` comes with an HTML parser, which can be configured to check loosely.
ANN: distlib 0.3.4 released on PyPI
I've recently released version 0.3.4 of distlib on PyPI [1]. For newcomers, distlib is a library of packaging functionality which is intended to be usable as the basis for third-party packaging tools.

The main changes in this release are as follows:

* Fixed #153: Raise warnings in get_distributions() if bad metadata seen, but keep going.
* Fixed #154: Determine Python versions correctly for Python >= 3.10.
* Updated launcher executables with changes to handle duplication logic.

Code relating to support for Python 2.6 was also removed (support for Python 2.6 was dropped in an earlier release, but supporting code wasn't removed until now).

A more detailed change log is available at [2].

Please try it out, and if you find any problems or have any suggestions for improvements, please give some feedback using the issue tracker! [3]

Regards,

Vinay Sajip

[1] https://pypi.org/project/distlib/0.3.4/
[2] https://distlib.readthedocs.io/en/0.3.4/
[3] https://bitbucket.org/pypa/distlib/issues/new
Re: Python child process in while True loop blocks parent
I started this post on November 29, and there have been helpful comments since then from Barry Scott, Cameron Simpson, Peter Holzer and Chris Angelico. Thanks to all of you.

I've found a solution that works for my purpose, and I said earlier that I would post the solution I found. If anyone has a better solution I would appreciate any feedback.

To recap, I'm using a pair of named pipes for IPC between C and Python. Python runs as a child process after fork-execv. The Python program continues to run concurrently in a while True loop, responds to requests from C at intervals, and keeps running until it receives a signal from C to exit. C sends signals to Python, then waits to receive data back from Python. My problem was that C was blocked when Python started.

The solution was twofold: (1) for Python to run concurrently it must be a multiprocessing loop (from the multiprocessing module), and (2) Python must terminate the strings it writes with \n, or read will block in C waiting for something that never comes. The multiprocessing module sidesteps the GIL; without multiprocessing the GIL will block all other threads once Python starts.

Originally I used epoll() on the pipes. Cameron Simpson and Barry Scott advised against epoll, and for this case they are right. Blocking pipes work here, and epoll is too much overhead for watching a single file descriptor.
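The newline point can be illustrated in isolation. The sketch below rests on assumptions: it uses an anonymous `os.pipe()` rather than the named pipes from the post, both ends are Python rather than C, and `read_line` is a hypothetical helper. It only shows how a trailing \n lets the reading side know where one message ends.

```python
import os


def read_line(fd):
    """Read from fd one byte at a time until a newline (inclusive) or EOF."""
    chunks = []
    while True:
        byte = os.read(fd, 1)
        if not byte:          # EOF: the writer closed its end
            break
        chunks.append(byte)
        if byte == b"\n":     # the delimiter marks the end of one message
            break
    return b"".join(chunks)


read_fd, write_fd = os.pipe()

# Two messages are written back to back; the newline keeps them separable.
os.write(write_fd, b"OK message received\n")
os.write(write_fd, b"second reply\n")

first = read_line(read_fd)
second = read_line(read_fd)

os.close(write_fd)
os.close(read_fd)
```

Reading one byte at a time is slow but keeps the example short; wrapping the descriptor with `os.fdopen` and calling `readline()` would be the more usual choice.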
This is the Python code now:

#!/usr/bin/python3
from multiprocessing import Process
import os

print("Python is running")

child_pid = os.getpid()
print('child process id:', child_pid)

def f(a, b):

    print("Python now in function f")

    # Open the read end first, then the write end.
    pr = os.open('/tmp/Pipe_01', os.O_RDONLY)
    print("File Descriptor1 Opened " + str(pr))
    pw = os.open('/tmp/Pipe_02', os.O_WRONLY)
    print("File Descriptor2 Opened " + str(pw))

    while True:

        v = os.read(pr, 64)
        print("Python read from pipe pr")
        print(v)

        # C sends b'99' as the exit message.
        if v == b'99':
            os.close(pr)
            os.close(pw)
            print("Python is terminating")
            os._exit(os.EX_OK)

        # os.read() returns bytes, so compare against a bytes literal.
        if v != b"Send child PID":
            os.write(pw, b"OK message received\n")
            print("Python wrote back")

if __name__ == '__main__':
    a = 0
    b = 0
    p = Process(target=f, args=(a, b,))
    p.start()
    p.join()

The variables a and b are not currently used in the body, but they will be later.

This is the part of the C code that communicates with Python:

fifo_fd1 = open(fifo_path1, O_WRONLY);
fifo_fd2 = open(fifo_path2, O_RDONLY);

status_write = write(fifo_fd1, py_msg_01, sizeof(py_msg_01));
if (status_write < 0) perror("write");

status_read = read(fifo_fd2, fifo_readbuf, sizeof(py_msg_01));
if (status_read < 0) perror("read");
printf("C received message 1 from Python\n");
printf("%.*s",(int)buf_len, fifo_readbuf);

status_write = write(fifo_fd1, py_msg_02, sizeof(py_msg_02));
if (status_write < 0) perror("write");

status_read = read(fifo_fd2, fifo_readbuf, sizeof(py_msg_02));
if (status_read < 0) perror("read");
printf("C received message 2 from Python\n");
printf("%.*s",(int)buf_len, fifo_readbuf);

// Terminate Python multiprocessing
printf("C is sending exit message to Python\n");
status_write = write(fifo_fd1, py_msg_03, 2);

printf("C is closing\n");
close(fifo_fd1);
close(fifo_fd2);

Screen output:

Python is running
child process id: 5353
Python now in function f
File Descriptor1 Opened 6
Thread created 0
File Descriptor2 Opened 7
Process ID: 5351
Parent Process ID: 5351
I am the parent
Core joined 0
I am the child
Python read from pipe pr
b'Hello to Python from C\x00\x00'
Python wrote back
C received message 1 from Python
OK message received
Python read from pipe pr
b'Message to Python 2\x00\x00'
Python wrote back
C received message 2 from Python
OK message received
C is sending exit message to Python
C is closing
Python read from pipe pr
b'99'
Python is terminating

Python runs on a separate thread (created with pthreads) because I want the flexibility of using this same basic code as a stand-alone .exe, or as a C extension called from Python with ctypes. If I use it as a C extension then I want the Python code on a separate thread because I can't have two instances of the Python interpreter running on one thread, and one instance will already be running on the main thread, albeit "suspended" by the call from ctypes.

So that's my solution: (1) Python multiprocessing module; (2) Python strings written to the pipe must be terminated with \n.

Thanks again to all who commented.

Dec 6, 2021, 13:33 by ba...@barrys-emacs.org:

>> On 6 Dec 2021, at 21:05, Jen Kris <jenk...@tutanota.com> wrote:
>>
>> Here is what I don't understand from what you said. "The child process is
>> created with a single thread—the one that called fork()." To me that
>> implies that the thread that called fork() is the same thre