Re: on a tail-recursive square-and-multiply

2023-11-09 Thread Julieta Shem via Python-list
Julieta Shem writes: [...] > I agree. By the way, I once read or watched an interview with Guido van > Rossum and he was asked why not to tail-call optimize Python and the > answer he gave --- IIRC --- was that tail-call optimization makes it > harder for a beginner to unders

Re: on a tail-recursive square-and-multiply

2023-11-08 Thread Julieta Shem via Python-list
Greg Ewing writes: > On 8/11/23 2:26 pm, Julieta Shem wrote: >> For the first time I'm trying to write a tail-recursive >> square-and-multiply and, even though it /seems/ to work, I'm not happy >> with what I wrote and I don't seem to understand it so well. > > Step

Re: on a tail-recursive square-and-multiply

2023-11-07 Thread Greg Ewing via Python-list
On 8/11/23 2:26 pm, Julieta Shem wrote: For the first time I'm trying to write a tail-recursive square-and-multiply and, even though it /seems/ to work, I'm not happy with what I wrote and I don't seem to understand it so well. Stepping back a bit, why do you feel the need to write this tail

Re: on a tail-recursive square-and-multiply

2023-11-07 Thread Michael Torrie via Python-list
On 11/7/23 18:26, Julieta Shem via Python-list wrote: > For the first time I'm trying to write a tail-recursive > square-and-multiply and, even though it /seems/ to work, I'm not happy > with what I wrote and I don't seem to understand it so well. > > --8<-

on a tail-recursive square-and-multiply

2023-11-07 Thread Julieta Shem via Python-list
For the first time I'm trying to write a tail-recursive square-and-multiply and, even though it /seems/ to work, I'm not happy with what I wrote and I don't seem to understand it so well. --8<---cut here---start->8--- def sam(b, e, m, acc = 1): if
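The snippet above cuts off right after the signature, so the poster's actual body is not shown. As a hedged illustration only, one common way to write a tail-recursive square-and-multiply with that signature is sketched below. The body is an assumption, not the code from the message. It maintains the invariant that `acc * b**e` modulo `m` never changes between calls.

```python
def sam(b, e, m, acc=1):
    """Return (acc * b**e) % m for integers e >= 0 and m >= 1.

    Illustrative body only; the original message is truncated after `if`.
    """
    if e == 0:
        # Nothing left to multiply in: the accumulator holds the answer.
        return acc % m
    if e % 2 == 1:
        # Odd exponent: move one factor of b into the accumulator.
        return sam(b, e - 1, m, (acc * b) % m)
    # Even exponent: square the base and halve the exponent.
    return sam((b * b) % m, e // 2, m, acc)


if __name__ == "__main__":
    print(sam(2, 10, 1000))
    print(sam(3, 200, 13) == pow(3, 200, 13))
```

CPython does not eliminate tail calls, so each step still consumes a stack frame. The depth grows only with the number of bits in `e`, which keeps it small for realistic exponents.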

Re: Precision Tail-off?

2023-02-18 Thread Oscar Benjamin
On Sat, 18 Feb 2023 at 11:19, Peter J. Holzer wrote: > > On 2023-02-18 03:52:51 +, Oscar Benjamin wrote: > > On Sat, 18 Feb 2023 at 01:47, Chris Angelico wrote: > > > On Sat, 18 Feb 2023 at 12:41, Greg Ewing via Python-list > > > > To avoid it you would need to use an algorithm that computes

Re: Precision Tail-off?

2023-02-18 Thread Peter J. Holzer
On 2023-02-18 03:52:51 +, Oscar Benjamin wrote: > On Sat, 18 Feb 2023 at 01:47, Chris Angelico wrote: > > On Sat, 18 Feb 2023 at 12:41, Greg Ewing via Python-list > > > To avoid it you would need to use an algorithm that computes nth > > > roots directly rather than raising to the power 1/n.
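The quoted remark about computing nth roots directly, rather than raising to the power 1/n, can be illustrated with integer arithmetic. The sketch below is not taken from the thread: `iroot` is a hypothetical helper that uses Newton's iteration on Python integers, so it never passes through a binary float.

```python
def iroot(x, n):
    """Return floor(x ** (1/n)) for integers x >= 0 and n >= 1.

    Hypothetical helper for illustration; uses only integer arithmetic.
    """
    if x < 0 or n < 1:
        raise ValueError("x must be >= 0 and n must be >= 1")
    if x < 2:
        return x
    # Start from a power of two that is guaranteed to be >= the true root.
    r = 1 << ((x.bit_length() + n - 1) // n)
    while True:
        # Newton step for f(r) = r**n - x, rounded down.
        y = ((n - 1) * r + x // r ** (n - 1)) // n
        if y >= r:
            # The estimate stopped decreasing: r is the floor of the root.
            return r
        r = y


if __name__ == "__main__":
    print(iroot(10**30, 3))
```

Because every intermediate value is an exact integer, the result for a perfect cube such as `10**30` is exact regardless of how many digits the input has.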

Re: Precision Tail-off?

2023-02-17 Thread Oscar Benjamin
On Sat, 18 Feb 2023 at 01:47, Chris Angelico wrote: > > On Sat, 18 Feb 2023 at 12:41, Greg Ewing via Python-list > wrote: > > > > On 18/02/23 7:42 am, Richard Damon wrote: > > > On 2/17/23 5:27 AM, Stephen Tucker wrote: > > >> None of the digits in RootNZZZ's string should be different from the

Re: Precision Tail-off?

2023-02-17 Thread Michael Torrie
On 2/17/23 15:03, Grant Edwards wrote: > Every fall, the groups were again full of a new crop of people who had > just discovered all sorts of bugs in the way > implemented floating point, and pointing them to a nicely written > document that explained it never did any good. But to be fair,

Re: Precision Tail-off?

2023-02-17 Thread Chris Angelico
On Sat, 18 Feb 2023 at 12:41, Greg Ewing via Python-list wrote: > > On 18/02/23 7:42 am, Richard Damon wrote: > > On 2/17/23 5:27 AM, Stephen Tucker wrote: > >> None of the digits in RootNZZZ's string should be different from the > >> corresponding digits in RootN. > > > > Only if the storage

Re: Precision Tail-off?

2023-02-17 Thread Greg Ewing via Python-list
On 18/02/23 7:42 am, Richard Damon wrote: On 2/17/23 5:27 AM, Stephen Tucker wrote: None of the digits in RootNZZZ's string should be different from the corresponding digits in RootN. Only if the storage format was DECIMAL. Note that using decimal wouldn't eliminate this particular problem,

Re: Precision Tail-off?

2023-02-17 Thread Grant Edwards
On 2023-02-17, Mats Wichmann wrote: > And... this topic as a whole comes up over and over again, like > everywhere. That's an understatement. I remember it getting rehashed over and over again in various USENET groups 35 years ago when the VAX 11/780 BSD machine on which I read news

Re: Precision Tail-off?

2023-02-17 Thread Mats Wichmann
On 2/17/23 11:42, Richard Damon wrote: On 2/17/23 5:27 AM, Stephen Tucker wrote: The key factor here is IEEE floating point is storing numbers in BINARY, not DECIMAL, so a multiply by 1000 will change the representation of the number, and thus the possible resolution errors. Store you
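The point about binary storage can be checked from the standard library. The following is a small sketch, not code from the thread: `Decimal(float)` and `Fraction(float)` convert exactly, so they expose the value that is really stored, and comparing the true product with the float product shows the rounding introduced by multiplying by 1000. The variable names are illustrative.

```python
from decimal import Decimal
from fractions import Fraction

x = 0.1

# Exact conversions: these show the binary value actually stored for 0.1.
stored = Decimal(x)
exact = Fraction(x)

# Every finite float is an integer divided by a power of two.
denominator_is_power_of_two = exact.denominator & (exact.denominator - 1) == 0

# The true product of the stored value and 1000 is not representable,
# so the float multiplication has to round it.
rounded_product = Fraction(x * 1000)
true_product = exact * 1000

if __name__ == "__main__":
    print(stored)
    print(denominator_is_power_of_two)
    print(rounded_product == true_product)
```

The stored value differs slightly from the decimal literal, and scaling it by 1000 produces a result that must be rounded again, which is why the digits of a scaled input need not line up with those of the original.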

Re: Precision Tail-off?

2023-02-17 Thread Grant Edwards
On 2023-02-17, Richard Damon wrote: > [...] > >> Perhaps this observation should be brought to the attention of the IEEE. I >> would like to know their response to it. > > That is why they have developed the Decimal Floating point format, to > handle people with those sorts of problems. > > They

Re: Precision Tail-off?

2023-02-17 Thread Peter J. Holzer
On 2023-02-17 14:39:42 +, Weatherby,Gerard wrote: > IEEE did not define a standard for floating point arithmetics. They > designed multiple standards, including a decimal float point one. > Although decimal floating point (DFP) hardware used to be > manufactured, I couldn’t find any current

Re: Precision Tail-off?

2023-02-17 Thread Peter J. Holzer
On 2023-02-17 10:27:08 +, Stephen Tucker wrote: > This is a hugely controversial claim, I know, but I would consider this > behaviour to be a serious deficiency in the IEEE standard. > > Consider an integer N consisting of a finitely-long string of digits in > base 10. > > Consider the

Re: Precision Tail-off?

2023-02-17 Thread Peter J. Holzer
On 2023-02-17 08:38:58 -0700, Michael Torrie wrote: > On 2/17/23 03:27, Stephen Tucker wrote: > > Thanks, one and all, for your responses. > > > > This is a hugely controversial claim, I know, but I would consider this > > behaviour to be a serious deficiency in the IEEE standard. > > No matter

Re: Precision Tail-off?

2023-02-17 Thread Oscar Benjamin
On Fri, 17 Feb 2023 at 10:29, Stephen Tucker wrote: > > Thanks, one and all, for your responses. > > This is a hugely controversial claim, I know, but I would consider this > behaviour to be a serious deficiency in the IEEE standard. [snip] > > Perhaps this observation should be brought to the

Re: Precision Tail-off?

2023-02-17 Thread Richard Damon
On 2/17/23 5:27 AM, Stephen Tucker wrote: Thanks, one and all, for your responses. This is a hugely controversial claim, I know, but I would consider this behaviour to be a serious deficiency in the IEEE standard. Consider an integer N consisting of a finitely-long string of digits in base 10.

Re: Precision Tail-off?

2023-02-17 Thread Michael Torrie
On 2/17/23 03:27, Stephen Tucker wrote: > Thanks, one and all, for your responses. > > This is a hugely controversial claim, I know, but I would consider this > behaviour to be a serious deficiency in the IEEE standard. No matter how you do it, there are always tradeoffs and inaccuracies moving

Re: Precision Tail-off?

2023-02-17 Thread Peter Pearson
[snip] >> >> I have just produced the following log in IDLE (admittedly, in Python >> >> 2.7.10 and, yes I know that it has been superseded). >> >> >> >> It appears to show a precision tail-off as the supplied float gets >> bigger. >&g

RE: Precision Tail-off?

2023-02-17 Thread avi.e.gross
? -Original Message- From: Python-list On Behalf Of Stephen Tucker Sent: Friday, February 17, 2023 5:27 AM To: python-list@python.org Subject: Re: Precision Tail-off? Thanks, one and all, for your responses. This is a hugely controversial claim, I know, but I would consider this behaviour

Re: Precision Tail-off?

2023-02-17 Thread Weatherby,Gerard
until a few years ago, but they seem to have gone dark: https://twitter.com/SilMinds From: Python-list on behalf of Thomas Passin Date: Friday, February 17, 2023 at 9:02 AM To: python-list@python.org Subject: Re: Precision Tail-off? *** Attention: This is an external email. Use caution

Re: Precision Tail-off?

2023-02-17 Thread Thomas Passin
+, Oscar Benjamin wrote: On Tue, 14 Feb 2023 at 07:12, Stephen Tucker wrote: [snip] I have just produced the following log in IDLE (admittedly, in Python 2.7.10 and, yes I know that it has been superseded). It appears to show a precision tail-off as the supplied float gets bigger. [snip

Re: Precision Tail-off?

2023-02-17 Thread Stephen Tucker
cker. > > > On Thu, Feb 16, 2023 at 6:49 PM Peter Pearson > wrote: > >> On Tue, 14 Feb 2023 11:17:20 +, Oscar Benjamin wrote: >> > On Tue, 14 Feb 2023 at 07:12, Stephen Tucker >> wrote: >> [snip] >> >> I have just produced the following log i

Re: Precision Tail-off?

2023-02-17 Thread Stephen Tucker
en Tucker > wrote: > [snip] > >> I have just produced the following log in IDLE (admittedly, in Python > >> 2.7.10 and, yes I know that it has been superseded). > >> > >> It appears to show a precision tail-off as the supplied float gets > bigger. > [snip]

Re: Precision Tail-off?

2023-02-16 Thread Peter Pearson
On Tue, 14 Feb 2023 11:17:20 +, Oscar Benjamin wrote: > On Tue, 14 Feb 2023 at 07:12, Stephen Tucker wrote: [snip] >> I have just produced the following log in IDLE (admittedly, in Python >> 2.7.10 and, yes I know that it has been superseded). >> >> It appears t

Re: Precision Tail-off?

2023-02-15 Thread Weatherby,Gerard
) 8.881784197001252e-16 1E-99 From: Python-list on behalf of Michael Torrie Date: Tuesday, February 14, 2023 at 5:52 PM To: python-list@python.org Subject: Re: Precision Tail-off? *** Attention: This is an external email. Use caution responding, opening attachments or clicking on links. *** On 2

Re: Precision Tail-off?

2023-02-14 Thread Michael Torrie
On 2/14/23 00:09, Stephen Tucker wrote: > I have two questions: > 1. Is there a straightforward explanation for this or is it a bug? To you 1/3 may be an exact fraction, and the definition of raising a number to that power means a cube root which also has an exact answer, but to the computer, 1/3

Re: Precision Tail-off?

2023-02-14 Thread Weatherby,Gerard
Use Python3 Use the decimal module: https://docs.python.org/3/library/decimal.html From: Python-list on behalf of Stephen Tucker Date: Tuesday, February 14, 2023 at 2:11 AM To: Python Subject: Precision Tail-off? *** Attention: This is an external email. Use caution responding, opening

Re: Precision Tail-off?

2023-02-14 Thread Oscar Benjamin
On Tue, 14 Feb 2023 at 07:12, Stephen Tucker wrote: > > Hi, > > I have just produced the following log in IDLE (admittedly, in Python > 2.7.10 and, yes I know that it has been superseded). > > It appears to show a precision tail-off as the supplied float gets bigger. > >

Precision Tail-off?

2023-02-13 Thread Stephen Tucker
Hi, I have just produced the following log in IDLE (admittedly, in Python 2.7.10 and, yes I know that it has been superseded). It appears to show a precision tail-off as the supplied float gets bigger. I have two questions: 1. Is there a straightforward explanation for this or is it a bug? 2

Re: tail

2022-05-19 Thread Cameron Simpson
preter. Try: >> >> time python3 your-tail-prog.py /home/marco/lorem.txt > >Well, I'll try it, but it's not a bit unfair to compare Python startup with C? Yes it is. But timeit goes the other way and only measures the code. Admittedly I'd expect a C tail to be pretty quick a

Re: tail

2022-05-19 Thread Marco Sulla
On Wed, 18 May 2022 at 23:32, Cameron Simpson wrote: > > On 17May2022 22:45, Marco Sulla wrote: > >Well, I've done a benchmark. > >>>> timeit.timeit("tail('/home/marco/small.txt')", globals={"tail":tail}, > >>>> number=10)

Re: tail

2022-05-18 Thread Cameron Simpson
On 17May2022 22:45, Marco Sulla wrote: >Well, I've done a benchmark. >>>> timeit.timeit("tail('/home/marco/small.txt')", globals={"tail":tail}, >>>> number=10) >1.5963431186974049 >>>> timeit.timeit("tail('/

Re: tail

2022-05-18 Thread Marco Sulla
Well, I've done a benchmark. >>> timeit.timeit("tail('/home/marco/small.txt')", globals={"tail":tail}, >>> number=10) 1.5963431186974049 >>> timeit.timeit("tail('/home/marco/lorem.txt')", globals={"tail":tail}, >>

Re: tail

2022-05-16 Thread Marco Sulla
ion" I have ever seen. > > > You're lucky. I've seen much worse (or no one). > > At least with *no* documentation, the source code stands for itself. So I did it well to not put one in the first time. I think that after 100 posts about tail, chunks etc it was clear what that stuff w

Re: tail

2022-05-13 Thread 2QdxY4RzWzUUiLuE
On 2022-05-13 at 12:16:57 +0200, Marco Sulla wrote: > On Fri, 13 May 2022 at 00:31, Cameron Simpson wrote: [...] > > This is nearly the worst "specification" I have ever seen. > You're lucky. I've seen much worse (or no one). At least with *no* documentation, the source code stands for

Re: tail

2022-05-13 Thread Marco Sulla
""" > >A function that "tails" the file. If you don't know what that means, > >google "man tail" > > > >filepath: the file path of the file to be "tailed" > >n: the numbers of lines "tailed" > >chunk_size: oh don

Re: tail

2022-05-12 Thread Cameron Simpson
On 12May2022 19:48, Marco Sulla wrote: >On Thu, 12 May 2022 at 00:50, Stefan Ram wrote: >> There's no spec/doc, so one can't even test it. > >Excuse me, you're very right. > >""" >A function that "tails" the file. If you don't know what that mea

Re: tail

2022-05-12 Thread Dennis Lee Bieber
On Thu, 12 May 2022 22:45:42 +0200, Marco Sulla declaimed the following: > >Maybe. Maybe not. What if the file ends with no newline? https://github.com/coreutils/coreutils/blob/master/src/tail.c Lines 567-569 (also lines 550-557 for "bytes_read" determination) -- Wulfraed

Re: tail

2022-05-12 Thread Marco Sulla
Thank you very much. This helped me to improve the function: import os _lf = b"\n" _err_n = "Parameter n must be a positive integer number" _err_chunk_size = "Parameter chunk_size must be a positive integer number" def tail(filepath, n=10, chunk_size=100):
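The snippet above shows only the signature and error messages of the revised function. The following is a hedged sketch of the general technique discussed in the thread (binary mode, backward reads in chunks, only `b"\n"` treated as a line end). It is not the code from the message: the body is an assumption that reuses the signature and error strings shown.

```python
import os

_lf = b"\n"
_err_n = "Parameter n must be a positive integer number"
_err_chunk_size = "Parameter chunk_size must be a positive integer number"


def tail(filepath, n=10, chunk_size=100):
    """Return the last n lines of the file as bytes (illustrative body)."""
    if n <= 0:
        raise ValueError(_err_n)
    if chunk_size <= 0:
        raise ValueError(_err_chunk_size)

    with open(filepath, "rb") as f:
        pos = f.seek(0, os.SEEK_END)
        data = b""
        # A trailing newline ends the last line instead of starting a new
        # one, so up to n + 1 newlines are needed to isolate n lines.
        while pos > 0 and data.count(_lf) <= n:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data

    # Ignore a final newline, then walk back over n line separators.
    idx = len(data) - 1 if data.endswith(_lf) else len(data)
    for _ in range(n):
        idx = data.rfind(_lf, 0, idx)
        if idx == -1:
            # Fewer than n lines were read: return everything.
            return data
    return data[idx + 1:]
```

Returning bytes leaves decoding to the caller, which matches the thread's observation that seeking to arbitrary positions is only reliable in binary mode. Re-counting newlines on every pass is simple but not optimal for very large `n`.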

Re: tail

2022-05-12 Thread Marco Sulla
On Thu, 12 May 2022 at 00:50, Stefan Ram wrote: > > Marco Sulla writes: > >def tail(filepath, n=10, chunk_size=100): > >if (n <= 0): > >raise ValueError(_err_n) > ... > > There's no spec/doc, so one can't even test it. Excuse me, you're ver

Re: tail

2022-05-11 Thread Avi Gross via Python-list
numpy/pandas in Python often provide functions with names like head or tail as do other languages where data structures with names like data.frame are commonly used. These structures are in some way indexed to make it easy to jump towards the end. Text files are not. Efficiency aside, a 3-year-old

Re: tail

2022-05-11 Thread Avi Gross via Python-list
Just FYI, UNIX had a bunch of utilities that could emulate a vanilla version of tail on a command line. You can use sed, awk and quite a few others to simply show line N to the end of a file or other variations.  Of course the way many things were done back then had less focus on efficiency

Re: tail

2022-05-11 Thread Dennis Lee Bieber
On Thu, 12 May 2022 06:07:18 +1000, Chris Angelico declaimed the following: >I don't understand why this wants to be in the standard library. > Especially as any Linux distribution probably includes the compiled "tail" command, so this would only be of use on Windows

Re: tail

2022-05-11 Thread Chris Angelico
other > tests, and, frankly, I don't want to. I don't want to because I'm > quite sure the implementation is fast, since it reads by chunks and > cache them. I'm not sure it's 100% free of bugs, but the concept is > very simple, since it simply mimics the *nix tail, so it shoul

Re: tail

2022-05-11 Thread Marco Sulla
uite sure the implementation is fast, since it reads by chunks and cache them. I'm not sure it's 100% free of bugs, but the concept is very simple, since it simply mimics the *nix tail, so it should be reliable. > > > I'd very much like to see a CPython implementation of that function. It >

Re: tail

2022-05-11 Thread Chris Angelico
read method). > > I suppose the function is reliable. File is opened in binary mode and only > b"\n" is searched as line end, as *nix tail (and python readline in binary > mode) do. And bytes are returned. The caller can use them as is or convert > them to a string using the

Re: tail

2022-05-11 Thread Marco Sulla
On Mon, 9 May 2022 at 23:15, Dennis Lee Bieber wrote: > > On Mon, 9 May 2022 21:11:23 +0200, Marco Sulla > declaimed the following: > > >Nevertheless, tail is a fundamental tool in *nix. It's fast and > >reliable. Also the tail command can't handle different encodings

Re: tail

2022-05-09 Thread Alan Bawden
Marco Sulla writes: On Mon, 9 May 2022 at 19:53, Chris Angelico wrote: ... Nevertheless, tail is a fundamental tool in *nix. It's fast and reliable. Also the tail command can't handle different encodings? It definitely can't. It works for UTF-8, and all the ASCII compatible single

Re: tail

2022-05-09 Thread Dennis Lee Bieber
On Mon, 9 May 2022 21:11:23 +0200, Marco Sulla declaimed the following: >Nevertheless, tail is a fundamental tool in *nix. It's fast and >reliable. Also the tail command can't handle different encodings? Based upon https://github.com/coreutils/coreutils/blob/master/src/tail.c th

Re: tail

2022-05-09 Thread Chris Angelico
On Tue, 10 May 2022 at 07:07, Barry wrote: > POSIX tail just prints the bytes to the output that it finds between \n bytes. > At no time does it need to care about encodings as that is a problem solved > by the terminal software. I would not expect utf-16 to work with tail on > l

Re: tail

2022-05-09 Thread Barry
he middle of some character. And there are encodings >>>> where you cannot inspect the data to find a character boundary in the >>>> byte stream. >>> >>> Ooook, now I understand what you and Barry mean. I suppose there's no >>> reliable way to tail

Re: tail

2022-05-09 Thread Barry
> On 9 May 2022, at 17:41, r...@zedat.fu-berlin.de wrote: > > Barry Scott writes: >> Why use tiny chunks? You can read 4KiB as fast as 100 bytes > > When optimizing code, it helps to be aware of the orders of > magnitude That is true and well known to me, now show how what I said is

Re: tail

2022-05-09 Thread Chris Angelico
up in the middle of some character. And there are encodings > > > > where you cannot inspect the data to find a character boundary in the > > > > byte stream. > > > > > > Ooook, now I understand what you and Barry mean. I suppose there's no > > > reli

Re: tail

2022-05-09 Thread Marco Sulla
to find a character boundary in the > > > byte stream. > > > > Ooook, now I understand what you and Barry mean. I suppose there's no > > reliable way to tail a big file opened in text mode with a decent > > performance. > > > > Anyway, the previous-previous fu

Re: tail

2022-05-09 Thread 2QdxY4RzWzUUiLuE
On 2022-05-08 at 18:52:42 +, Stefan Ram wrote: > Remember how recently people here talked about how you cannot copy > text from a video? Then, how did I do it? Turns out, for my > operating system, there's a screen OCR program! So I did this OCR > and then manually corrected a few

Re: tail

2022-05-09 Thread Chris Angelico
sized characters. _If_ you did a seek to an arbitrary number > > you can end up in the middle of some character. And there are encodings > > where you cannot inspect the data to find a character boundary in the > > byte stream. > > Ooook, now I understand what you and Barry m

Re: tail

2022-05-09 Thread Marco Sulla
p in the middle of some character. And there are encodings > where you cannot inspect the data to find a character boundary in the > byte stream. Ooook, now I understand what you and Barry mean. I suppose there's no reliable way to tail a big file opened in text mode with a decent performance.

Re: tail

2022-05-09 Thread Dennis Lee Bieber
On Sun, 8 May 2022 22:48:32 +0200, Marco Sulla declaimed the following: > >Emh. I re-quote > >seek(offset, whence=SEEK_SET) >Change the stream position to the given byte offset. > >And so on. No mention of differences between text and binary mode. You ignore that, underneath, Python is

Re: tail

2022-05-09 Thread Greg Ewing
On 9/05/22 7:47 am, Marco Sulla wrote: It will fail if the contents is not ASCII. Why? For some encodings, if you seek to an arbitrary byte position and then read, it may *appear* to succeed but give you complete gibberish. Your method might work for a certain subset of encodings (those

Re: tail

2022-05-08 Thread Cameron Simpson
On 08May2022 22:48, Marco Sulla wrote: >On Sun, 8 May 2022 at 22:34, Barry wrote: >> >> In text mode you can only seek to a value return from f.tell() >> >> otherwise the behaviour is undefined. >> > >> > Why? I don't see any recommendation about it in the docs: >> >

Re: tail

2022-05-08 Thread Marco Sulla
On Sun, 8 May 2022 at 22:34, Barry wrote: > > > On 8 May 2022, at 20:48, Marco Sulla wrote: > > > > On Sun, 8 May 2022 at 20:31, Barry Scott wrote: > >> > >>>> On 8 May 2022, at 17:05, Marco Sulla > >>>> wrote: > >>> &g

Re: tail

2022-05-08 Thread Barry
> On 8 May 2022, at 20:48, Marco Sulla wrote: > > On Sun, 8 May 2022 at 20:31, Barry Scott wrote: >> >>>> On 8 May 2022, at 17:05, Marco Sulla wrote: >>> >>> def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100): >>&g

Re: tail

2022-05-08 Thread Marco Sulla
On Sun, 8 May 2022 at 22:02, Chris Angelico wrote: > > Absolutely not. As has been stated multiple times in this thread, a > fully general approach is extremely complicated, horrifically > unreliable, and hopelessly inefficient. Well, my implementation is quite general now. It's not complicated

Re: tail

2022-05-08 Thread Chris Angelico
On Mon, 9 May 2022 at 05:49, Marco Sulla wrote: > Anyway, apart from my implementation, I'm curious if you think a tail > method is worth it to be a method of the builtin file objects in > CPython. Absolutely not. As has been stated multiple times in this thread, a fully general

Re: tail

2022-05-08 Thread Marco Sulla
On Sun, 8 May 2022 at 20:31, Barry Scott wrote: > > > On 8 May 2022, at 17:05, Marco Sulla wrote: > > > > def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100): > >n_chunk_size = n * chunk_size > > Why use tiny chunks? You can read 4KiB as f

Re: tail

2022-05-08 Thread MRAB
On 2022-05-08 19:15, Barry Scott wrote: On 7 May 2022, at 22:31, Chris Angelico wrote: On Sun, 8 May 2022 at 07:19, Stefan Ram wrote: MRAB writes: On 2022-05-07 19:47, Stefan Ram wrote: ... def encoding( name ): path = pathlib.Path( name ) for encoding in( "utf_8", "latin_1",

Re: tail

2022-05-08 Thread Barry Scott
> On 8 May 2022, at 17:05, Marco Sulla wrote: > > I think I've _almost_ found a simpler, general way: > > import os > > _lf = "\n" > _cr = "\r" > > def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100): >n_chunk_siz

Re: tail

2022-05-08 Thread Chris Angelico
On Mon, 9 May 2022 at 04:15, Barry Scott wrote: > > > > > On 7 May 2022, at 22:31, Chris Angelico wrote: > > > > On Sun, 8 May 2022 at 07:19, Stefan Ram wrote: > >> > >> MRAB writes: > >>> On 2022-05-07 19:47, Stefan Ram wrote: > >> ... > def encoding( name ): > path =

Re: tail

2022-05-08 Thread Barry Scott
> On 7 May 2022, at 22:31, Chris Angelico wrote: > > On Sun, 8 May 2022 at 07:19, Stefan Ram wrote: >> >> MRAB writes: >>> On 2022-05-07 19:47, Stefan Ram wrote: >> ... def encoding( name ): path = pathlib.Path( name ) for encoding in( "utf_8", "latin_1", "cp1252" ):

Re: tail

2022-05-08 Thread Barry Scott
> On 7 May 2022, at 14:40, Stefan Ram wrote: > > Marco Sulla writes: >> So there's no way to reliably read lines in reverse in text mode using >> seek and read, but the only option is readlines? > > I think, CPython is based on C. I don't know whether > Python's seek function directly

Re: tail

2022-05-08 Thread Marco Sulla
I think I've _almost_ found a simpler, general way: import os _lf = "\n" _cr = "\r" def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100): n_chunk_size = n * chunk_size pos = os.stat(filepath).st_size chunk_line_pos = -1 lines_not_foun

Re: tail

2022-05-08 Thread Barry
> On 7 May 2022, at 17:29, Marco Sulla wrote: > > On Sat, 7 May 2022 at 16:08, Barry wrote: >> You need to handle the file in bin mode and do the handling of line endings >> and encodings yourself. It’s not that hard for the cases you wanted. > "\n".encode("utf-16") >

Re: tail

2022-05-07 Thread Chris Angelico
On Sun, 8 May 2022 at 07:19, Stefan Ram wrote: > > MRAB writes: > >On 2022-05-07 19:47, Stefan Ram wrote: > ... > >>def encoding( name ): > >>path = pathlib.Path( name ) > >>for encoding in( "utf_8", "latin_1", "cp1252" ): > >>try: > >>with path.open(

Re: tail

2022-05-07 Thread Chris Angelico
On Sun, 8 May 2022 at 04:37, Marco Sulla wrote: > > On Sat, 7 May 2022 at 19:02, MRAB wrote: > > > > On 2022-05-07 17:28, Marco Sulla wrote: > > > On Sat, 7 May 2022 at 16:08, Barry wrote: > > >> You need to handle the file in bin mode and do the handling of line > > >> endings and encodings

Re: tail

2022-05-07 Thread MRAB
On 2022-05-07 19:47, Stefan Ram wrote: Marco Sulla writes: Well, ok, but I need a generic method to get LF and CR for any encoding an user can input. "LF" and "CR" come from US-ASCII. It is theoretically possible that there might be some encodings out there (not for Unicode) that

Re: tail

2022-05-07 Thread MRAB
On 2022-05-07 19:35, Marco Sulla wrote: On Sat, 7 May 2022 at 19:02, MRAB wrote: > > On 2022-05-07 17:28, Marco Sulla wrote: > > On Sat, 7 May 2022 at 16:08, Barry wrote: > >> You need to handle the file in bin mode and do the handling of line endings and encodings yourself. It’s not that

Re: tail

2022-05-07 Thread Dennis Lee Bieber
On Sat, 7 May 2022 20:35:34 +0200, Marco Sulla declaimed the following: >Well, ok, but I need a generic method to get LF and CR for any >encoding an user can input. Other than EBCDIC, <lf> and <cr> AS BYTES should appear as x0A and x0D in any of the 8-bit encodings (ASCII, ISO-8859-x, CP,

Re: tail

2022-05-07 Thread Marco Sulla
On Sat, 7 May 2022 at 19:02, MRAB wrote: > > On 2022-05-07 17:28, Marco Sulla wrote: > > On Sat, 7 May 2022 at 16:08, Barry wrote: > >> You need to handle the file in bin mode and do the handling of line > >> endings and encodings yourself. It’s not that hard for the cases you > >> wanted. > >

Re: tail

2022-05-07 Thread MRAB
On 2022-05-07 17:28, Marco Sulla wrote: On Sat, 7 May 2022 at 16:08, Barry wrote: You need to handle the file in bin mode and do the handling of line endings and encodings yourself. It’s not that hard for the cases you wanted. "\n".encode("utf-16") b'\xff\xfe\n\x00' "".encode("utf-16")

Re: tail

2022-05-07 Thread Dan Stromberg
I believe I'd do something like: #!/usr/local/cpython-3.10/bin/python3 """ Output the last 10 lines of a potentially-huge file. O(n). But technically so is scanning backward from the EOF. It'd be faster to use a dict, but this has the advantage of working for huge num_lines. """ import

Re: tail

2022-05-07 Thread Marco Sulla
On Sat, 7 May 2022 at 16:08, Barry wrote: > You need to handle the file in bin mode and do the handling of line endings > and encodings yourself. It’s not that hard for the cases you wanted. >>> "\n".encode("utf-16") b'\xff\xfe\n\x00' >>> "".encode("utf-16") b'\xff\xfe' >>>
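The interpreter output quoted above shows that the generic `"utf-16"` codec prepends a byte-order mark on every call. As a small sketch (not code from the thread), the explicit-endianness codecs give the bare newline bytes, and a character such as U+010A shows why searching UTF-16 data for a single `b"\n"` byte is unreliable. The variable names are illustrative.

```python
# The explicit "-le" / "-be" codecs emit only the code units, with no BOM.
newline_le = "\n".encode("utf-16-le")
newline_be = "\n".encode("utf-16-be")

# U+010A (LATIN CAPITAL LETTER C WITH DOT ABOVE) contains the byte 0x0A
# in UTF-16, even though it is not a line break.
false_positive = b"\n" in "\u010a".encode("utf-16-le")

# In UTF-8, every byte of a non-ASCII character has its high bit set,
# so the byte 0x0A can only ever be a real line feed.
utf8_safe = all(byte >= 0x80 for byte in "\u010a".encode("utf-8"))

if __name__ == "__main__":
    print(newline_le, newline_be)
    print(false_positive, utf8_safe)
```

A byte-level tail therefore has to search for the full encoded newline at the correct code-unit alignment when the file is UTF-16, whereas a plain `b"\n"` search is sufficient for UTF-8 and ASCII-compatible single-byte encodings.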

Re: tail

2022-05-07 Thread Barry
> On 7 May 2022, at 14:24, Marco Sulla wrote: > > On Sat, 7 May 2022 at 01:03, Dennis Lee Bieber wrote: >> >>Windows also uses <cr><lf> for the EOL marker, but Python's I/O system >> condenses that to just <lf> internally (for TEXT mode) -- so using the >> length of a string so read to compute

Re: tail

2022-05-07 Thread Avi Gross via Python-list
general purpose tool, internationalization from ASCII has created a challenge for lots of such tools. -Original Message- From: Marco Sulla To: Dennis Lee Bieber Cc: python-list@python.org Sent: Sat, May 7, 2022 9:21 am Subject: Re: tail On Sat, 7 May 2022 at 01:03, Dennis Lee Bieber wrote: > >

Re: tail

2022-05-07 Thread Marco Sulla
On Sat, 7 May 2022 at 01:03, Dennis Lee Bieber wrote: > > Windows also uses <cr><lf> for the EOL marker, but Python's I/O system > condenses that to just <lf> internally (for TEXT mode) -- so using the > length of a string so read to compute a file position may be off-by-one for > each EOL in the

Re: tail

2022-05-06 Thread Dennis Lee Bieber
On Fri, 6 May 2022 21:19:48 +0100, MRAB declaimed the following: >Is the file UTF-8? That's a variable-width encoding, so are any of the >characters > U+007F? > >Which OS? On Windows, it's common/normal for UTF-8 files to start with a >BOM/signature, which is 3 bytes/1 codepoint.

Re: tail

2022-05-06 Thread MRAB
On 2022-05-06 20:21, Marco Sulla wrote: I have a little problem. I tried to extend the tail function, so it can read lines from the bottom of a file object opened in text mode. The problem is it does not work. It gets a starting position that is lower than the expected by 3 characters. So

Re: tail

2022-05-06 Thread Marco Sulla
I have a little problem. I tried to extend the tail function, so it can read lines from the bottom of a file object opened in text mode. The problem is it does not work. It gets a starting position that is lower than the expected by 3 characters. So the first line is read only for 2 chars

Re: tail

2022-05-02 Thread Marco Sulla
On Mon, 2 May 2022 at 00:20, Cameron Simpson wrote: > > On 01May2022 18:55, Marco Sulla wrote: > >Something like this is OK? > [...] > >def tail(f): > >chunk_size = 100 > >size = os.stat(f.fileno()).st_size > > I think you want

Re: tail

2022-05-02 Thread Marco Sulla
Ok, I suppose \n and \r are enough: readline(size=- 1, /) Read and return one line from the stream. If size is specified, at most size bytes will be read. The line terminator is always b'\n' for binary files; for text files, the newline argument to open() can be used to select the line

Re: tail

2022-05-02 Thread Chris Angelico
On Tue, 3 May 2022 at 04:38, Marco Sulla wrote: > > On Mon, 2 May 2022 at 18:31, Stefan Ram wrote: > > > > |The Unicode standard defines a number of characters that > > |conforming applications should recognize as line terminators:[7] > > | > > |LF:Line Feed, U+000A > > |VT:Vertical Tab,

Re: tail

2022-05-02 Thread Marco Sulla
On Mon, 2 May 2022 at 18:31, Stefan Ram wrote: > > |The Unicode standard defines a number of characters that > |conforming applications should recognize as line terminators:[7] > | > |LF:Line Feed, U+000A > |VT:Vertical Tab, U+000B > |FF:Form Feed, U+000C > |CR:Carriage Return,

Re: tail

2022-05-01 Thread Chris Angelico
On Mon, 2 May 2022 at 11:54, Cameron Simpson wrote: > > On 01May2022 23:30, Stefan Ram wrote: > >Dan Stromberg writes: > >>But what about Unicode? Are all 10 bytes newlines in Unicode encodings? > > It seems in UTF-8, when a value is above U+007F, it will be > > encoded with bytes that

Re: tail

2022-05-01 Thread Cameron Simpson
On 01May2022 23:30, Stefan Ram wrote: >Dan Stromberg writes: >>But what about Unicode? Are all 10 bytes newlines in Unicode encodings? > It seems in UTF-8, when a value is above U+007F, it will be > encoded with bytes that always have their high bit set. Aye. Design feature enabling easy

Re: tail

2022-05-01 Thread Chris Angelico
On Mon, 2 May 2022 at 09:19, Dan Stromberg wrote: > > On Sun, May 1, 2022 at 3:19 PM Cameron Simpson wrote: > > > On 01May2022 18:55, Marco Sulla wrote: > > >Something like this is OK? > > > > Scanning backward for a byte == 10 in ASCII or ISO-8859 seems fine. > > But what about Unicode? Are

Re: tail

2022-05-01 Thread Dan Stromberg
On Sun, May 1, 2022 at 3:19 PM Cameron Simpson wrote: > On 01May2022 18:55, Marco Sulla wrote: > >Something like this is OK? > Scanning backward for a byte == 10 in ASCII or ISO-8859 seems fine. But what about Unicode? Are all 10 bytes newlines in Unicode encodings? If not, and you have a

Re: tail

2022-05-01 Thread Cameron Simpson
On 01May2022 18:55, Marco Sulla wrote: >Something like this is OK? [...] >def tail(f): >chunk_size = 100 >size = os.stat(f.fileno()).st_size I think you want os.fstat(). >positions = iter(range(size, -1, -chunk_size)) >next(positions) I was wonder

Re: tail

2022-05-01 Thread Marco Sulla
Something like this is OK? import os def tail(f): chunk_size = 100 size = os.stat(f.fileno()).st_size positions = iter(range(size, -1, -chunk_size)) next(positions) chunk_line_pos = -1 pos = 0 for pos in positions: f.seek(pos) chars = f.read
