[Python-announce] [RELEASE] The first Python 3.11 beta (3.11.0b1) is available - Feature freeze is here

2022-05-07 Thread Pablo Galindo Salgado
We did it, team!! After quite a bumpy release process and a bunch of
last-minute fixes, we have reached **beta 1** and **feature freeze**. What a
ride, eh? You can get the shiny new release artefacts from here:

https://www.python.org/downloads/release/python-3110b1/

## This is a beta preview of Python  3.11

Python 3.11 is still in development. 3.11.0b1 is the first of four planned
beta release previews. Beta release previews are intended to give the wider
community the opportunity to test new features and bug fixes and to prepare
their projects to support the new feature release.

We **strongly encourage** maintainers of third-party Python projects to
**test with 3.11** during the beta phase and report issues found to [the
Python bug tracker](https://bugs.python.org) as soon as possible.  While
the release is planned to be feature complete entering the beta phase, it
is possible that features may be modified or, in rare cases, deleted up
until the start of the release candidate phase (Monday, 2022-08-01).  Our
goal is to have no ABI changes after beta 4 and as few code changes as
possible after 3.11.0rc1, the first release candidate.  To achieve that, it
will be **extremely important** to get as much exposure for 3.11 as
possible during the beta phase.

Please keep in mind that this is a preview release and its use is **not**
recommended for production environments.

# Major new features of the 3.11 series, compared to 3.10

Among the major new features and changes so far:

* [PEP 657](https://www.python.org/dev/peps/pep-0657/) -- Include
Fine-Grained Error Locations in Tracebacks
* [PEP 654](https://www.python.org/dev/peps/pep-0654/) -- Exception Groups
and except*
* [PEP 673](https://www.python.org/dev/peps/pep-0673/) -- Self Type
* [PEP 646](https://www.python.org/dev/peps/pep-0646/) -- Variadic Generics
* [PEP 680](https://www.python.org/dev/peps/pep-0680/) -- tomllib: Support
for Parsing TOML in the Standard Library
* [PEP 675](https://www.python.org/dev/peps/pep-0675/) -- Arbitrary Literal
String Type
* [PEP 655](https://www.python.org/dev/peps/pep-0655/) -- Marking individual
TypedDict items as required or potentially-missing
* [bpo-46752](https://bugs.python.org/issue46752) -- Introduce task groups
to asyncio
* The Faster CPython Project is already yielding some exciting results.
Python 3.11 is 10-60% faster than Python 3.10. On average, we measured a
1.22x speedup on the standard benchmark suite. See Faster CPython
for details.
 * Hey, **fellow core developer,** if a feature you find important is
missing from this list, let me know.
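
For maintainers preparing a project for 3.11, one low-risk way to try a new standard-library module while still supporting older interpreters is to gate the import. The sketch below uses `tomllib` from PEP 680; the helper name `parse_config` and the sample document are made up for illustration.

```python
try:
    import tomllib  # in the standard library from Python 3.11 (PEP 680)
except ModuleNotFoundError:
    tomllib = None  # older interpreter: no built-in TOML parser


def parse_config(text: str) -> dict:
    """Parse a TOML document, failing clearly on Python older than 3.11."""
    if tomllib is None:
        raise RuntimeError("tomllib requires Python 3.11 or newer")
    return tomllib.loads(text)


if tomllib is not None:
    # Hypothetical sample document, only here to exercise the helper.
    config = parse_config('name = "demo"\nretries = 3\n')
    print(config["name"], config["retries"])
```

Running a test suite under both 3.10 and 3.11.0b1 with a gate like this is one way to exercise the new code path without dropping the older interpreter.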

The next pre-release of Python 3.11 will be 3.11.0b2, currently scheduled
for Monday, 2022-05-30.

# More resources

* [Online Documentation](https://docs.python.org/3.11/)
* [PEP 664](https://www.python.org/dev/peps/pep-0664/), 3.11 Release
Schedule
* Report bugs at [https://bugs.python.org](https://bugs.python.org).
* [Help fund Python and its community](/psf/donations/).

# And now for something completely different

The holographic principle is a tenet of string theories and a supposed
property of quantum gravity that states that the description of a volume of
space can be thought of as encoded on a lower-dimensional boundary to the
region—such as a light-like boundary like a gravitational horizon. First
proposed by Gerard 't Hooft, it was given a precise string-theory
interpretation by Leonard Susskind, who combined his ideas with previous
ones of 't Hooft and Charles Thorn. Leonard Susskind said, “The
three-dimensional world of ordinary experience––the universe filled with
galaxies, stars, planets, houses, boulders, and people––is a hologram, an
image of reality coded on a distant two-dimensional (2D) surface.” As
pointed out by Raphael Bousso, Thorn observed in 1978 that string theory
admits a lower-dimensional description in which gravity emerges from it in
what would now be called a holographic way.

The holographic principle was inspired by black hole thermodynamics, which
conjectures that the maximal entropy in any region scales with the radius
squared, and not cubed as might be expected. In the case of a black hole,
the insight was that the informational content of all the objects that have
fallen into the hole might be entirely contained in surface fluctuations of
the event horizon. The holographic principle resolves the black hole
information paradox within the framework of string theory. However, there
exist classical solutions to the Einstein equations that allow values of
the entropy larger than 

Re: tail

2022-05-07 Thread Chris Angelico
On Sun, 8 May 2022 at 07:19, Stefan Ram  wrote:
>
> MRAB  writes:
> >On 2022-05-07 19:47, Stefan Ram wrote:
> ...
> >>def encoding( name ):
> >>    path = pathlib.Path( name )
> >>    for encoding in( "utf_8", "latin_1", "cp1252" ):
> >>        try:
> >>            with path.open( encoding=encoding, errors="strict" )as file:
> >>                text = file.read()
> >>            return encoding
> >>        except UnicodeDecodeError:
> >>            pass
> >>    return "ascii"
> >>Yes, it's potentially slow and might be wrong.
> >>The result "ascii" might mean it's a binary file.
> >"latin-1" will decode any sequence of bytes, so it'll never try
> >"cp1252", nor fall back to "ascii", and falling back to "ascii" is wrong
> >anyway because the file could contain 0x80..0xFF, which aren't supported
> >by that encoding.
>
>   Thank you! It's working for my specific application where
>   I'm reading from a collection of text files that should be
>   encoded in either utf_8, latin_1, or ascii.
>

In that case, I'd exclude ASCII from the check, and just check UTF-8,
and if that fails, decode as Latin-1. Any ASCII files will decode
correctly as UTF-8, and any file will decode as Latin-1.

I've used this exact fallback system when decoding raw data from
Unicode-naive servers - they accept and share bytes, so it's entirely
possible to have a mix of encodings in a single stream. As long as you
can define the span of a single "unit" (say, a line, or a chunk in
some form), you can read as bytes and do the exact same "decode as
UTF-8 if possible, otherwise decode as Latin-1" dance. Sure, it's not
perfectly ideal, but it's about as good as you'll get with a lot of
US-based servers. (Depending on context, you might use CP-1252 instead
of Latin-1, but you might need errors="replace" there, since
Windows-1252 has some undefined byte values.)
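
The "decode as UTF-8 if possible, otherwise decode as Latin-1" dance described above might be sketched like this. The function name `decode_unit` and the sample byte stream are illustrative assumptions, not code from the thread.

```python
def decode_unit(raw: bytes) -> str:
    """Decode one unit (for example a line) as UTF-8, else as Latin-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this cannot fail.
        return raw.decode("latin-1")


# A made-up stream that mixes encodings line by line.
stream = b"plain ascii\ncaf\xc3\xa9 in UTF-8\ncaf\xe9 in Latin-1\n"
for line in stream.split(b"\n"):
    print(decode_unit(line))
```

Pure ASCII lines take the UTF-8 branch, which is why a separate ASCII check is unnecessary.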

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python/New/Learn

2022-05-07 Thread Grant Edwards
On 2022-05-06, dn  wrote:

> The problem with some of the advice given in this thread, eg using
> StackOverflow or YouTube videos, is that a beginner (particularly)
> has no measure of the material's quality. Both platforms are riddled
> with utter-junk - even 'dangerous' advice.

And the "quality level" of such online forum answers seems to vary
widely by subject. They're not nearly as bad for Python as they are
for PHP and Javascript. Your chances of finding a correct answer to a
PHP question are virtually nil.  [It doesn't help that the "correct
answer" often changes between versions -- even minor ones.]

--
Grant
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: tail

2022-05-07 Thread Chris Angelico
On Sun, 8 May 2022 at 04:37, Marco Sulla  wrote:
>
> On Sat, 7 May 2022 at 19:02, MRAB  wrote:
> >
> > On 2022-05-07 17:28, Marco Sulla wrote:
> > > On Sat, 7 May 2022 at 16:08, Barry  wrote:
> > >> You need to handle the file in bin mode and do the handling of line 
> > >> endings and encodings yourself. It’s not that hard for the cases you 
> > >> wanted.
> > >
> > > >>> "\n".encode("utf-16")
> > > b'\xff\xfe\n\x00'
> > > >>> "".encode("utf-16")
> > > b'\xff\xfe'
> > > >>> "a\nb".encode("utf-16")
> > > b'\xff\xfea\x00\n\x00b\x00'
> > > >>> "\n".encode("utf-16").lstrip("".encode("utf-16"))
> > > b'\n\x00'
> > >
> > > Can I use the last trick to get the encoding of a LF or a CR in any 
> > > encoding?
> >
> > In the case of UTF-16, it's 2 bytes per code unit, but those 2 bytes
> > could be little-endian or big-endian.
> >
> > As you didn't specify which you wanted, it defaulted to little-endian
> > and added a BOM (U+FEFF).
> >
> > If you specify which endianness you want with "utf-16le" or "utf-16be",
> > it won't add the BOM:
> >
> >  >>> # Little-endian.
> >  >>> "\n".encode("utf-16le")
> > b'\n\x00'
> >  >>> # Big-endian.
> >  >>> "\n".encode("utf-16be")
> > b'\x00\n'
>
> Well, ok, but I need a generic method to get LF and CR for any
> encoding an user can input.
> Do you think that
>
> "\n".encode(encoding).lstrip("".encode(encoding))
>
> is good for any encoding?

No, because it is only useful for stateless encodings. Any encoding
which uses "shift bytes" that cause subsequent bytes to be interpreted
differently will simply not work with this naive technique. Also,
you're assuming that the byte(s) you get from encoding LF will *only*
represent LF, which is also not true for a number of other encodings -
they might always encode LF to the same byte sequence, but could use
that same byte sequence as part of a multi-byte encoding. So, no, for
arbitrarily chosen encodings, this is not dependable.
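
A small illustration of the second point, using UTF-16-LE as one convenient example (the sample characters are chosen only to produce the collision): the bytes that encode LF can also show up straddling two other characters, so a naive byte search reports a newline that is not there.

```python
lf = "\n".encode("utf-16-le")       # b'\n\x00'

# Two characters, neither of which is a line feed:
# U+0A41 encodes as 41 0A and U+0100 encodes as 00 01.
text = "\u0a41\u0100"
data = text.encode("utf-16-le")     # b'A\n\x00\x01'

print("\n" in text)                 # the text has no line feed
print(data.find(lf))                # yet the byte pattern is found at an odd offset


def find_aligned(buf: bytes, needle: bytes, width: int) -> int:
    """Search only at offsets that are multiples of the code-unit width."""
    for pos in range(0, len(buf) - len(needle) + 1, width):
        if buf[pos:pos + len(needle)] == needle:
            return pos
    return -1


print(find_aligned(data, lf, 2))    # aligned search does not match here
```

An aligned search helps for fixed-width code units, but it does nothing for stateful encodings, which is the first objection above.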

> Furthermore, is there a way to get the
> encoding of an opened file object?

Nope. That's fundamentally not possible. Unless you mean in the
trivial sense of "what was the parameter passed to the open() call?",
in which case f.encoding will give it to you; but to find out the
actual encoding, no, you can't.

The ONLY way to 100% reliably decode arbitrary text is to know, from
external information, what encoding it is in. Every other scheme
imposes restrictions. Trying to do something that works for absolutely
any encoding is a doomed project.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: tail

2022-05-07 Thread MRAB

On 2022-05-07 19:47, Stefan Ram wrote:
> Marco Sulla  writes:
>> Well, ok, but I need a generic method to get LF and CR for any
>> encoding an user can input.
>
>    "LF" and "CR" come from US-ASCII. It is theoretically
>    possible that there might be some encodings out there
>    (not for Unicode) that are not based on US-ASCII and
>    have no LF or no CR.
>
>> is good for any encoding? Furthermore, is there a way to get the
>> encoding of an opened file object?
>
>    I have written a function that might be able to detect one
>    of few encodings based on a heuristic algorithm.
>
> def encoding( name ):
>     path = pathlib.Path( name )
>     for encoding in( "utf_8", "latin_1", "cp1252" ):
>         try:
>             with path.open( encoding=encoding, errors="strict" )as file:
>                 text = file.read()
>             return encoding
>         except UnicodeDecodeError:
>             pass
>     return "ascii"
>
>    Yes, it's potentially slow and might be wrong.
>    The result "ascii" might mean it's a binary file.

"latin-1" will decode any sequence of bytes, so it'll never try 
"cp1252", nor fall back to "ascii", and falling back to "ascii" is wrong 
anyway because the file could contain 0x80..0xFF, which aren't supported 
by that encoding.
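
The claim is easy to check by trying each candidate encoding against every possible byte value. This is a minimal sketch; the helper name `can_decode` is made up for the illustration.

```python
all_bytes = bytes(range(256))   # every possible byte value, once


def can_decode(data: bytes, encoding: str) -> bool:
    """Report whether strict decoding succeeds for the given encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


for enc in ("utf_8", "latin_1", "cp1252", "ascii"):
    print(enc, can_decode(all_bytes, enc))
```

Only "latin_1" accepts the whole range, so any loop that tries it before other encodings never reaches them.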

--
https://mail.python.org/mailman/listinfo/python-list


Re: tail

2022-05-07 Thread MRAB

On 2022-05-07 19:35, Marco Sulla wrote:

On Sat, 7 May 2022 at 19:02, MRAB  wrote:
>
> On 2022-05-07 17:28, Marco Sulla wrote:
> > On Sat, 7 May 2022 at 16:08, Barry  wrote:
> >> You need to handle the file in bin mode and do the handling of line
> >> endings and encodings yourself. It’s not that hard for the cases you
> >> wanted.
> >
> > >>> "\n".encode("utf-16")
> > b'\xff\xfe\n\x00'
> > >>> "".encode("utf-16")
> > b'\xff\xfe'
> > >>> "a\nb".encode("utf-16")
> > b'\xff\xfea\x00\n\x00b\x00'
> > >>> "\n".encode("utf-16").lstrip("".encode("utf-16"))
> > b'\n\x00'
> >
> > Can I use the last trick to get the encoding of a LF or a CR in any
> > encoding?
>
> In the case of UTF-16, it's 2 bytes per code unit, but those 2 bytes
> could be little-endian or big-endian.
>
> As you didn't specify which you wanted, it defaulted to little-endian
> and added a BOM (U+FEFF).
>
> If you specify which endianness you want with "utf-16le" or "utf-16be",
> it won't add the BOM:
>
>  >>> # Little-endian.
>  >>> "\n".encode("utf-16le")
> b'\n\x00'
>  >>> # Big-endian.
>  >>> "\n".encode("utf-16be")
> b'\x00\n'

Well, ok, but I need a generic method to get LF and CR for any
encoding an user can input.
Do you think that

"\n".encode(encoding).lstrip("".encode(encoding))

is good for any encoding?
'.lstrip' is the wrong method to use because it treats its argument as a 
set of characters, so it might strip off too many characters. A better 
choice is '.removeprefix'.
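
A short demonstration of the difference, assuming Python 3.9 or later (where `bytes.removeprefix` is available). The payload is chosen so that it begins with the same two bytes as the BOM.

```python
import codecs

bom = codecs.BOM_UTF16_LE                    # b'\xff\xfe'
payload = "\ufeff\n".encode("utf-16-le")     # b'\xff\xfe\n\x00'
encoded = bom + payload

# lstrip treats its argument as a set of byte values and keeps stripping.
stripped = encoded.lstrip(bom)

# removeprefix removes the exact prefix once, and only if it is present.
trimmed = encoded.removeprefix(bom)

print(stripped)
print(trimmed)
```

Here `lstrip` also eats the first character of the payload, while `removeprefix` leaves the payload intact.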

Furthermore, is there a way to get the encoding of an opened file object?


How was the file opened?


If it was opened as a text file, use the '.encoding' attribute (which 
just tells you what encoding was specified when it was opened, and you'd 
be assuming that it's the correct one).



If it was opened as a binary file, all you know is that it contains 
bytes, and determining the encoding (assuming that it is a text file) is 
down to heuristics (i.e. guesswork).
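
To make the first case concrete, here is a small sketch with a throwaway temporary file (the file name is arbitrary): the `.encoding` attribute reports what `open()` was told, even when the bytes on disk were written in a different encoding.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")

    # The bytes on disk are UTF-8 ...
    with open(path, "wb") as f:
        f.write("caf\u00e9\n".encode("utf-8"))

    # ... but the file is opened with a different, wrong encoding.
    with open(path, encoding="latin-1") as f:
        declared = f.encoding      # echoes the name passed to open()
        text = f.read()            # decodes without error, but incorrectly

print(declared)
print(text == "caf\u00e9\n")       # the round trip did not preserve the text
```

No exception is raised because Latin-1 accepts any byte sequence, so the mistake only shows up as mojibake in the decoded text.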


--
https://mail.python.org/mailman/listinfo/python-list


Re: tail

2022-05-07 Thread Dennis Lee Bieber
On Sat, 7 May 2022 20:35:34 +0200, Marco Sulla
 declaimed the following:

>Well, ok, but I need a generic method to get LF and CR for any
>encoding an user can input.

Other than EBCDIC, <LF> and <CR> AS BYTES should appear as x0A and x0D
in any of the 8-bit encodings (ASCII, ISO-8859-x, CP, UTF-8). I believe
those bytes also appear in UTF-16 -- BUT, they will have a null (x00) byte
associated with them as padding; as a result, you can not search for just
x0Dx0A (Windows line end convention) -- they may be x00x0Dx00x0A or
x0Dx00x0Ax00 depending on endianness, cf.
https://docs.microsoft.com/en-us/cpp/text/support-for-unicode?view=msvc-170

For EBCDIC <CR> is still x0D, but <LF> is x25 (and there is a separate
<NL> [new line] at x15)
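
These byte values can be checked directly from Python. The sketch below uses `cp037` as one EBCDIC code page that ships with Python; other EBCDIC variants may differ.

```python
CRLF = "\r\n"

encodings = ("ascii", "latin-1", "cp1252", "utf-8",
             "utf-16-le", "utf-16-be", "cp037")

# Map each encoding name to the bytes it produces for CR followed by LF.
table = {enc: CRLF.encode(enc) for enc in encodings}

for enc, raw in table.items():
    print(f"{enc:10} {raw.hex(' ')}")
```

The 8-bit ASCII-compatible encodings all give `0d 0a`, the two UTF-16 variants add a null byte on one side or the other, and `cp037` gives `0d 25`.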


-- 
Wulfraed Dennis Lee Bieber AF6VN
wlfr...@ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: tail

2022-05-07 Thread Marco Sulla
On Sat, 7 May 2022 at 19:02, MRAB  wrote:
>
> On 2022-05-07 17:28, Marco Sulla wrote:
> > On Sat, 7 May 2022 at 16:08, Barry  wrote:
> >> You need to handle the file in bin mode and do the handling of line 
> >> endings and encodings yourself. It’s not that hard for the cases you 
> >> wanted.
> >
> > >>> "\n".encode("utf-16")
> > b'\xff\xfe\n\x00'
> > >>> "".encode("utf-16")
> > b'\xff\xfe'
> > >>> "a\nb".encode("utf-16")
> > b'\xff\xfea\x00\n\x00b\x00'
> > >>> "\n".encode("utf-16").lstrip("".encode("utf-16"))
> > b'\n\x00'
> >
> > Can I use the last trick to get the encoding of a LF or a CR in any 
> > encoding?
>
> In the case of UTF-16, it's 2 bytes per code unit, but those 2 bytes
> could be little-endian or big-endian.
>
> As you didn't specify which you wanted, it defaulted to little-endian
> and added a BOM (U+FEFF).
>
> If you specify which endianness you want with "utf-16le" or "utf-16be",
> it won't add the BOM:
>
>  >>> # Little-endian.
>  >>> "\n".encode("utf-16le")
> b'\n\x00'
>  >>> # Big-endian.
>  >>> "\n".encode("utf-16be")
> b'\x00\n'

Well, ok, but I need a generic method to get LF and CR for any
encoding an user can input.
Do you think that

"\n".encode(encoding).lstrip("".encode(encoding))

is good for any encoding? Furthermore, is there a way to get the
encoding of an opened file object?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: tail

2022-05-07 Thread MRAB

On 2022-05-07 17:28, Marco Sulla wrote:
> On Sat, 7 May 2022 at 16:08, Barry  wrote:
>> You need to handle the file in bin mode and do the handling of line
>> endings and encodings yourself. It’s not that hard for the cases you wanted.
>
> >>> "\n".encode("utf-16")
> b'\xff\xfe\n\x00'
> >>> "".encode("utf-16")
> b'\xff\xfe'
> >>> "a\nb".encode("utf-16")
> b'\xff\xfea\x00\n\x00b\x00'
> >>> "\n".encode("utf-16").lstrip("".encode("utf-16"))
> b'\n\x00'
>
> Can I use the last trick to get the encoding of a LF or a CR in any encoding?


In the case of UTF-16, it's 2 bytes per code unit, but those 2 bytes 
could be little-endian or big-endian.


As you didn't specify which you wanted, it defaulted to little-endian 
and added a BOM (U+FEFF).


If you specify which endianness you want with "utf-16le" or "utf-16be", 
it won't add the BOM:


>>> # Little-endian.
>>> "\n".encode("utf-16le")
b'\n\x00'
>>> # Big-endian.
>>> "\n".encode("utf-16be")
b'\x00\n'
--
https://mail.python.org/mailman/listinfo/python-list


Re: tail

2022-05-07 Thread Dan Stromberg
I believe I'd do something like:

#!/usr/local/cpython-3.10/bin/python3

"""
Output the last 10 lines of a potentially-huge file.

O(n).  But technically so is scanning backward from the EOF.

It'd be faster to use a dict, but this has the advantage of working for
huge num_lines.
"""

import dbm
import os
import sys

tempfile = f'/tmp/{os.path.basename(sys.argv[0])}.{os.getpid()}'

db = dbm.open(tempfile, 'n')

num_lines = 10

for cur_lineno, line in enumerate(sys.stdin):
    db[str(cur_lineno)] = line.encode('utf-8')
    max_lineno = cur_lineno
    str_age_out_lineno = str(cur_lineno - num_lines - 1)
    if str_age_out_lineno in db:
        del db[str_age_out_lineno]

for lineno in range(max_lineno, max_lineno - num_lines, -1):
    str_lineno = str(lineno)
    if str_lineno not in db:
        break
    print(db[str(lineno)].decode('utf-8'), end='')

db.close()
os.unlink(tempfile)


On Sat, Apr 23, 2022 at 11:36 AM Marco Sulla 
wrote:

> What about introducing a method for text streams that reads the lines
> from the bottom? Java has also a ReversedLinesFileReader with Apache
> Commons IO.
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: tail

2022-05-07 Thread Marco Sulla
On Sat, 7 May 2022 at 16:08, Barry  wrote:
> You need to handle the file in bin mode and do the handling of line endings 
> and encodings yourself. It’s not that hard for the cases you wanted.

>>> "\n".encode("utf-16")
b'\xff\xfe\n\x00'
>>> "".encode("utf-16")
b'\xff\xfe'
>>> "a\nb".encode("utf-16")
b'\xff\xfea\x00\n\x00b\x00'
>>> "\n".encode("utf-16").lstrip("".encode("utf-16"))
b'\n\x00'

Can I use the last trick to get the encoding of a LF or a CR in any encoding?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python/New/Learn

2022-05-07 Thread o1bigtenor
On Sat, May 7, 2022 at 3:29 AM Peter J. Holzer  wrote:
>
> On 2022-05-07 14:07:53 +1200, Greg Ewing wrote:
> > On 7/05/22 12:27 pm, Stefan Ram wrote:
> > >So, one might actually be able to learn the pronunciation
> > >of a foreign language from text in a book better than from
> > >an audio tape (or an audio file or a video with sound)!
> >
> > Such books would certainly help, but I don't think there's any
> > substitute for actually hearing the sounds if you want to be
> > able to understand the spoken language.
>
> I think "learning to understand the spoken language" and "learning to
> speak without a (foreign) accent" are two different things. I agree that
> the former needs exposure to actual people talking (preferably in real
> life, where people talk fast, slur endings, omit words, hem and haw,
> talk over each other ...). For learning to speak without an accent, just
> listening (or talking) to native speakers is probably not sufficient for
> the reasons Stefan mentioned plus another one: Outside of a classroom
> people usually won't correct your mistakes unless you say something
> truly incomprehensible or unintentionally funny. However I don't think a
> book is sufficient either: Most people are probably even worse at
> observing the position of their various mouth parts while speaking than
> at listening, so without feedback from a native speaker (preferably a
> trained voice coach) they can't really tell whether they are doing it
> right.
>
>

Hmmm - - - - fascinating discussion on language learning.
I would suggest that adults CAN learn other languages.
One factor that hasn't been mentioned is the musicality of
the individual. I added 3 languages, to varying degrees, as
an adult and have done some functioning in about 5 more
than the 3 that I developed through childhood and early youth.
The use of the IPA (International Phonetic Alphabet) was
encouraged in my formal studies, to be done on my own
time and fashion, and its use was very very helpful for
much in this area.

As far as learning to speak without an accent - - - that does not
necessarily coincide with actual knowledge - - - imo there is
a definite difference between a language 'in the ear' and 'in the
mind'. I have found each language to feel different in both the
mouth AND in the brain (and have found the differences quite
fascinating).

Thank you for very very interesting discussions - - - lol - - - which
perhaps should have early on been relabelled as OT - - - grin!

Pace
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: tail

2022-05-07 Thread Barry


> On 7 May 2022, at 14:24, Marco Sulla  wrote:
> 
> On Sat, 7 May 2022 at 01:03, Dennis Lee Bieber  wrote:
>> 
>>    Windows also uses <CR><LF> for the EOL marker, but Python's I/O system
>> condenses that to just <LF> internally (for TEXT mode) -- so using the
>> length of a string so read to compute a file position may be off-by-one for
>> each EOL in the string.
> 
> So there's no way to reliably read lines in reverse in text mode using
> seek and read, but the only option is readlines?

You need to handle the file in bin mode and do the handling of line endings and 
encodings yourself. It’s not that hard for the cases you wanted.
Figure out which line ending is in use from the CR LF, LF, CR.
Once you have a line decode it before returning it.

The only OS I know that used CR was Classic Mac OS.
If you do not care about that then you can split on NL and strip any trailing 
CR.
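
The approach described above could look roughly like the following sketch. It assumes an ASCII-compatible encoding such as UTF-8 or Latin-1, where the byte 0x0A can only mean LF; the function name `tail_lines` and its parameters are illustrative, not an existing API.

```python
import os


def tail_lines(path, n=10, encoding="utf-8", block_size=4096):
    """Return the last n lines of a file, reading backwards in binary mode."""
    if n <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        # Keep prepending blocks until more than n newlines have been seen,
        # which guarantees the last n lines are complete, or until the start
        # of the file is reached.
        while pos > 0 and data.count(b"\n") <= n:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # the file ended with a newline
    # Strip a trailing CR (Windows line endings) and decode only whole lines.
    return [line.rstrip(b"\r").decode(encoding) for line in lines[-n:]]
```

Decoding happens only after the split, so a block boundary that falls in the middle of a multi-byte UTF-8 character does no harm. Files that use a bare CR as the line ending are not handled.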

Barry


> -- 
> https://mail.python.org/mailman/listinfo/python-list
> 

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: tail

2022-05-07 Thread Avi Gross via Python-list

Marco,
I think it was made clear from the start that "text" files in the classic sense 
have no random access method at any higher level than reading a byte at some 
offset from the beginning of the file, or back from the end when it has not 
grown.
The obvious fact is that most of the time the lines are not of fixed widths and 
you have heard about multiple byte encodings and how the ends of lines can vary.

When files get long enough that just reading them from the start as a whole, or 
even in chunks, gets too expensive, some people might consider some other 
method. Log files can go on for years so it is not uncommon to start a new one 
periodically and have a folder with many of them in some order. To get the last 
few lines simply means finding the last file and reading it, or if it is too 
short, getting the penultimate one too.
And obviously a database or other structure might work better which might make 
each "line" a record and index them.
But there are ways to create your own data that get around this such as using 
an encoding with a large but fixed width for every character albeit you need 
more storage space. But if the goal is a general purpose tool, 
internationalization from ASCII has created a challenge for lots of such tools.
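
As an illustration of the fixed-width idea, UTF-32 stores every code point in exactly four bytes, so the position of the k-th character is simple arithmetic. This is only a sketch; the helper `char_at` is a made-up name, and it indexes code points rather than user-perceived characters.

```python
WIDTH = 4  # bytes per code point in UTF-32


def char_at(path, index):
    """Read the code point at a given index from a UTF-32-LE file."""
    with open(path, "rb") as f:
        f.seek(index * WIDTH)          # direct jump, no scanning needed
        return f.read(WIDTH).decode("utf-32-le")
```

The cost is the one mentioned above: mostly-ASCII text takes four times the storage it would need in UTF-8.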


-----Original Message-----
From: Marco Sulla 
To: Dennis Lee Bieber 
Cc: python-list@python.org
Sent: Sat, May 7, 2022 9:21 am
Subject: Re: tail

On Sat, 7 May 2022 at 01:03, Dennis Lee Bieber  wrote:
>
>        Windows also uses <CR><LF> for the EOL marker, but Python's I/O system
> condenses that to just <LF> internally (for TEXT mode) -- so using the
> length of a string so read to compute a file position may be off-by-one for
> each EOL in the string.

So there's no way to reliably read lines in reverse in text mode using
seek and read, but the only option is readlines?
-- 
https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: tail

2022-05-07 Thread Marco Sulla
On Sat, 7 May 2022 at 01:03, Dennis Lee Bieber  wrote:
>
> Windows also uses <CR><LF> for the EOL marker, but Python's I/O system
> condenses that to just <LF> internally (for TEXT mode) -- so using the
> length of a string so read to compute a file position may be off-by-one for
> each EOL in the string.

So there's no way to reliably read lines in reverse in text mode using
seek and read, but the only option is readlines?
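
The mismatch being discussed can be reproduced with a throwaway file (the file name here is arbitrary): in text mode with the default newline handling, each CR LF pair is read back as a single "\n", so the length of the string is smaller than the number of bytes on disk.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "crlf.txt")

    # Write Windows-style line endings explicitly, in binary mode.
    with open(path, "wb") as f:
        f.write(b"one\r\ntwo\r\n")

    size_on_disk = os.path.getsize(path)

    # Text mode with default newline handling translates "\r\n" to "\n".
    with open(path, encoding="ascii") as f:
        text = f.read()

print(size_on_disk, len(text))   # the two numbers differ by one per line
```

This is why offsets computed from decoded strings cannot be passed to `seek()` safely, and why the replies in this thread recommend working in binary mode.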
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python/New/Learn

2022-05-07 Thread Peter J. Holzer
On 2022-05-07 14:07:53 +1200, Greg Ewing wrote:
> On 7/05/22 12:27 pm, Stefan Ram wrote:
> >So, one might actually be able to learn the pronunciation
> >of a foreign language from text in a book better than from
> >an audio tape (or an audio file or a video with sound)!
> 
> Such books would certainly help, but I don't think there's any
> substitute for actually hearing the sounds if you want to be
> able to understand the spoken language.

I think "learning to understand the spoken language" and "learning to
speak without a (foreign) accent" are two different things. I agree that
the former needs exposure to actual people talking (preferably in real
life, where people talk fast, slur endings, omit words, hem and haw,
talk over each other ...). For learning to speak without an accent, just
listening (or talking) to native speakers is probably not sufficient for
the reasons Stefan mentioned plus another one: Outside of a classroom
people usually won't correct your mistakes unless you say something
truly incomprehensible or unintentionally funny. However I don't think a
book is sufficient either: Most people are probably even worse at
observing the position of their various mouth parts while speaking than
at listening, so without feedback from a native speaker (preferably a
trained voice coach) they can't really tell whether they are doing it
right.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Instatiating module / Reusing module of command-line tool

2022-05-07 Thread Cameron Simpson
On 06May2022 14:11, Loris Bennett  wrote:
>r...@zedat.fu-berlin.de (Stefan Ram) writes:
>>   If you need a class, you can write a class.
>>
>>   When one imports a module, the module actually gets executed.
>>   That's why people write "if __name__ == '__main__':" often.
>>   So, everything one wants to be done at import time can be
>>   written directly into the body of one's module.
>
>So if I have a module which relies on having internal data being set
>from outside, then, even though the program only ever has one instance
>of the module, different runs, say test and production, would require
>different internal data and thus different instances.  Therefore a class
>seems more appropriate and it is more obvious to me how to initialise
>the objects (e.g. by having the some main function which can read
>command-line arguments and then just pass the arguments to the
>constructor.
>
>I suppose that the decisive aspect is that my module needs
>initialisation and thus should to be a class.  Your examples in the
>other posting of the modules 'math' and 'string' are different, because
>they just contain functions and no data.

Yeah, I do this quite a bit. So I might have the core class which does 
it all:

    class Thing:
        def __init__(self, whatever...):


and if I'm exercising this from the command line I'll write a main 
function:

    def main(argv):
        cmd = argv.pop(0)
        ... use the arguments to specify data files or modes or whatever ...
        obj = Thing(...init the thing...)
        obj.do_something(...)

That is usually the top thing, after the imports but before everything 
else. Then right down the bottom:

    if __name__ == '__main__':
        sys.exit(main(sys.argv))

for running the module in command line mode:

    python3 -m the.module.name args here ...

That way you can import it elsewhere for the "thing" class and also do 
basic command line stuff with it directly.
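
Filled in with placeholder details, the pattern might look like the sketch below. The `mode` argument, its "production" default, and the body of `do_something` are assumptions made only so the example runs; the `if __name__ == '__main__':` guard shown earlier would go at the bottom of the module unchanged.

```python
class Thing:
    """Holds the per-run data that would otherwise be module-level state."""

    def __init__(self, mode):
        self.mode = mode

    def do_something(self):
        return f"running in {self.mode} mode"


def main(argv):
    cmd = argv.pop(0)                          # program name
    mode = argv[0] if argv else "production"   # hypothetical default
    obj = Thing(mode)
    print(f"{cmd}: {obj.do_something()}")
    return 0


# A test run and a production run are simply two separate instances.
main(["thing", "test"])
main(["thing"])
```

Because the data lives on the instance, test code can build a `Thing("test")` directly without going through the command line at all.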

Cheers,
Cameron Simpson 
way I'll probably write a class for a command line.
-- 
https://mail.python.org/mailman/listinfo/python-list