Re: tail

2022-05-08 Thread Cameron Simpson
On 08May2022 22:48, Marco Sulla wrote: >On Sun, 8 May 2022 at 22:34, Barry wrote: >> >> In text mode you can only seek to a value returned from f.tell() >> >> otherwise the behaviour is undefined. >> > >> > Why? I don't see any recommendation about it in the docs: >> > https://docs.python.org/3/li
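
To illustrate the point being quoted here: a minimal sketch of the one text-mode seek pattern that is generally considered safe, namely going back to an offset that f.tell() handed out earlier on the same file. The function name read_from_saved_position is made up for this example and is not from the thread.

```python
def read_from_saved_position(path, encoding="utf-8"):
    """Read one line, remember where we are, then come back to that spot.

    Hypothetical helper for illustration only. In text mode the value
    returned by tell() is an opaque cookie, not necessarily a byte or
    character count, so it is only passed back to seek() unchanged.
    """
    with open(path, "r", encoding=encoding, newline="") as f:
        first = f.readline()
        cookie = f.tell()          # opaque position after the first line
        rest_once = f.read()       # read to the end of the file
        f.seek(cookie)             # safe: the offset came from tell()
        rest_again = f.read()      # should see exactly the same text again
    return first, rest_once, rest_again
```

Doing arithmetic on the cookie (for example cookie - 10) is the case the quoted warning is about: with a multi-byte encoding the result may land in the middle of a character.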

Re: tail

2022-05-08 Thread Marco Sulla
On Sun, 8 May 2022 at 22:34, Barry wrote: > > > On 8 May 2022, at 20:48, Marco Sulla wrote: > > > > On Sun, 8 May 2022 at 20:31, Barry Scott wrote: > >> > On 8 May 2022, at 17:05, Marco Sulla > wrote: > >>> > >>> def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100):

Re: tail

2022-05-08 Thread Barry
> On 8 May 2022, at 20:48, Marco Sulla wrote: > > On Sun, 8 May 2022 at 20:31, Barry Scott wrote: >> On 8 May 2022, at 17:05, Marco Sulla wrote: >>> >>> def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100): >>> n_chunk_size = n * chunk_size >> >> Why use tiny chunk

Re: tail

2022-05-08 Thread Marco Sulla
On Sun, 8 May 2022 at 22:02, Chris Angelico wrote: > > Absolutely not. As has been stated multiple times in this thread, a > fully general approach is extremely complicated, horrifically > unreliable, and hopelessly inefficient. Well, my implementation is quite general now. It's not complicated a

Re: tail

2022-05-08 Thread Chris Angelico
On Mon, 9 May 2022 at 05:49, Marco Sulla wrote: > Anyway, apart from my implementation, I'm curious if you think a tail > method is worth it to be a method of the builtin file objects in > CPython. Absolutely not. As has been stated multiple times in this thread, a fully general approach is extre

Re: tail

2022-05-08 Thread Marco Sulla
On Sun, 8 May 2022 at 20:31, Barry Scott wrote: > > > On 8 May 2022, at 17:05, Marco Sulla wrote: > > > > def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100): > >n_chunk_size = n * chunk_size > > Why use tiny chunks? You can read 4KiB as fast as 100 bytes as it's typically >

Re: tail

2022-05-08 Thread MRAB
On 2022-05-08 19:15, Barry Scott wrote: On 7 May 2022, at 22:31, Chris Angelico wrote: On Sun, 8 May 2022 at 07:19, Stefan Ram wrote: MRAB writes: On 2022-05-07 19:47, Stefan Ram wrote: ... def encoding( name ): path = pathlib.Path( name ) for encoding in( "utf_8", "latin_1", "cp1

Re: tail

2022-05-08 Thread Barry Scott
> On 8 May 2022, at 17:05, Marco Sulla wrote: > > I think I've _almost_ found a simpler, general way: > > import os > > _lf = "\n" > _cr = "\r" > > def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100): >n_chunk_size = n * chunk_size Why use tiny chunks? You can read 4K
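
As a rough illustration of the chunk-size remark: a sketch that walks backwards through the file in binary mode using 4 KiB blocks. This is not Marco's implementation; tail_bytes and block_size are names invented for this example, and it assumes lines end in "\n" or "\r\n" in a single-byte or UTF-8 style encoding.

```python
import os

def tail_bytes(path, n=10, block_size=4096):
    """Return the last n lines of a file as a list of bytes objects.

    Hypothetical sketch: reads fixed-size blocks from the end of the
    file until more than n newline bytes have been collected, or the
    start of the file is reached.
    """
    if n <= 0:
        return []
    with open(path, "rb") as f:
        pos = f.seek(0, os.SEEK_END)   # seek() returns the new absolute offset
        data = b""
        # n + 1 newlines guarantee that the n-th line from the end is complete.
        while pos > 0 and data.count(b"\n") <= n:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    return data.splitlines(keepends=True)[-n:]
```

The result is left as bytes on purpose; decoding only the returned lines avoids decoding a block that starts in the middle of a multi-byte character.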

Re: tail

2022-05-08 Thread Chris Angelico
On Mon, 9 May 2022 at 04:15, Barry Scott wrote: > > > > > On 7 May 2022, at 22:31, Chris Angelico wrote: > > > > On Sun, 8 May 2022 at 07:19, Stefan Ram wrote: > >> > >> MRAB writes: > >>> On 2022-05-07 19:47, Stefan Ram wrote: > >> ... > def encoding( name ): > path = pathlib.Path(

Re: tail

2022-05-08 Thread Barry Scott
> On 7 May 2022, at 22:31, Chris Angelico wrote: > > On Sun, 8 May 2022 at 07:19, Stefan Ram wrote: >> >> MRAB writes: >>> On 2022-05-07 19:47, Stefan Ram wrote: >> ... def encoding( name ): path = pathlib.Path( name ) for encoding in( "utf_8", "latin_1", "cp1252" ):

Re: tail

2022-05-08 Thread Barry Scott
> On 7 May 2022, at 14:40, Stefan Ram wrote: > > Marco Sulla writes: >> So there's no way to reliably read lines in reverse in text mode using >> seek and read, but the only option is readlines? > > I think, CPython is based on C. I don't know whether > Python's seek function directly call

Re: tail

2022-05-08 Thread Marco Sulla
I think I've _almost_ found a simpler, general way: import os _lf = "\n" _cr = "\r" def tail(filepath, n=10, newline=None, encoding=None, chunk_size=100): n_chunk_size = n * chunk_size pos = os.stat(filepath).st_size chunk_line_pos = -1 lines_not_found = n with open(filepath

Re: tail

2022-05-08 Thread Barry
> On 7 May 2022, at 17:29, Marco Sulla wrote: > > On Sat, 7 May 2022 at 16:08, Barry wrote: >> You need to handle the file in bin mode and do the handling of line endings >> and encodings yourself. It’s not that hard for the cases you wanted. > "\n".encode("utf-16") > b'\xff\xfe\n\x00'
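
On the b'\xff\xfe\n\x00' output quoted above: the first two bytes are the byte order mark that the "utf-16" codec prepends to every separately encoded string, not part of the newline itself. One possible way to get just the newline bytes for a given encoding, as a sketch (newline_bytes is a hypothetical helper, and it assumes a stateless encoding where encoding "a\n" simply appends to the encoding of "a"):

```python
def newline_bytes(encoding):
    """Return the bytes that encode "\\n" in `encoding`, without any BOM.

    Hypothetical helper: encodes a dummy character with and without a
    trailing newline and keeps only the difference, so a leading BOM
    (as emitted by "utf-16" or "utf-8-sig") cancels out.
    """
    prefix = "a".encode(encoding)
    both = "a\n".encode(encoding)
    return both[len(prefix):]


if __name__ == "__main__":
    for name in ("utf-8", "latin-1", "utf-16-le", "utf-16-be"):
        print(name, newline_bytes(name))
```

When searching a binary file for these bytes in a multi-byte encoding such as UTF-16, matches would also have to be checked for alignment on a code unit boundary, otherwise the pattern can match across two adjacent characters.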