Re: [Python-ideas] Support parsing stream with `re`

2018-10-06 Thread Serhiy Storchaka
06.10.18 10:22, Ram Rachum wrote: I'd like to use the re module to parse a long text file, 1GB in size. I wish that the re module could parse a stream, so I wouldn't have to load the whole thing into memory. I'd like to iterate over matches from the stream without keeping the old matches and

Re: [Python-ideas] Support parsing stream with `re`

2018-10-06 Thread Ram Rachum
It'll load as much as it needs to in order to match or rule out a match on a pattern. If you'd try to match `a.*b` it'll load the whole thing. The use cases that are relevant to a stream wouldn't have these kinds of problems. On Sat, Oct 6, 2018 at 11:22 AM Serhiy Storchaka wrote: > 06.10.18

Re: [Python-ideas] Support parsing stream with `re`

2018-10-06 Thread Ram Rachum
"This is a regular expression problem, rather than a Python problem." Do you have evidence for this assertion, except that other regex implementations have this limitation? Is there a regex specification somewhere that specifies that streams aren't supported? Is there a fundamental reason that

Re: [Python-ideas] Support parsing stream with `re`

2018-10-06 Thread Jonathan Fine
Hi Ram You wrote: > I'd like to use the re module to parse a long text file, 1GB in size. I > wish that the re module could parse a stream, so I wouldn't have to load > the whole thing into memory. I'd like to iterate over matches from the > stream without keeping the old matches and input in

Re: [Python-ideas] Debugging: some problems and possible solutions

2018-10-06 Thread Jonathan Fine
Samuel Colvin wrote: > Python definitely needs a dedicated debug print command. > I've built python devtools which has such a command: > https://github.com/samuelcolvin/python-devtools > Is this the kind of thing you were thinking about? Thank you for this comment, Samuel. And also very much

Re: [Python-ideas] Support parsing stream with `re`

2018-10-06 Thread Jonathan Fine
I wrote: > This is a regular expression problem, rather than a Python problem. Ram wrote: > Do you have evidence for this assertion, except that > other regex implementations have this limitation? Yes. 1. I've already supplied: https://svn.boost.org/trac10/ticket/11776 2.

[Python-ideas] Support parsing stream with `re`

2018-10-06 Thread Ram Rachum
Hi, I'd like to use the re module to parse a long text file, 1GB in size. I wish that the re module could parse a stream, so I wouldn't have to load the whole thing into memory. I'd like to iterate over matches from the stream without keeping the old matches and input in RAM. What do you think?
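For illustration only, here is a minimal sketch of one way to approximate the request above with the existing `re` module: read the stream in chunks and keep only a bounded tail in memory. It is not a feature of `re` and not something proposed in the thread. The function name `iter_stream_matches` and the parameters `max_match_len` and `chunk_size` are hypothetical. The sketch assumes that no match (including any lookahead) ever examines more than `max_match_len` characters from its start, and that the pattern does not depend on text before the match (no lookbehind, `^`, or `\b` at the start). A pattern like `a.*b` breaks that assumption and would still need the whole input.

```python
import io
import re


def iter_stream_matches(stream, pattern, max_match_len, chunk_size=65536):
    """Yield matched substrings from a text stream, chunk by chunk.

    Hypothetical helper: assumes every match is decided within
    max_match_len characters of its start position.
    """
    regex = re.compile(pattern)
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        at_eof = not chunk
        buffer += chunk
        # A match starting at or before `limit` has a full window of
        # max_match_len characters in the buffer, so unseen data cannot
        # change it. At end of input every match is final.
        limit = len(buffer) if at_eof else len(buffer) - max_match_len
        keep_from = 0
        for m in regex.finditer(buffer):
            if m.start() > limit:
                break  # undecided: wait for more data
            yield m.group(0)
            keep_from = m.end()
        if at_eof:
            return
        # Drop everything already decided; keep only the undecided tail.
        buffer = buffer[max(keep_from, limit + 1, 0):]


if __name__ == "__main__":
    sample = io.StringIO("G1 X10 Y20\nG1 X300 Y4\n")
    for text in iter_stream_matches(sample, r"[XY]\d+", max_match_len=8, chunk_size=4):
        print(text)
```

Under those assumptions the buffer never grows much beyond `chunk_size + max_match_len` characters, and old matches are not retained. The sketch yields matched strings rather than match objects because offsets into the discarded input are no longer meaningful.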

Re: [Python-ideas] Support parsing stream with `re`

2018-10-06 Thread Ned Batchelder
On 10/6/18 7:25 AM, Ram Rachum wrote: "This is a regular expression problem, rather than a Python problem." Do you have evidence for this assertion, except that other regex implementations have this limitation? Is there a regex specification somewhere that specifies that streams aren't

Re: [Python-ideas] Support parsing stream with `re`

2018-10-06 Thread Nathaniel Smith
On Sat, Oct 6, 2018 at 12:22 AM, Ram Rachum wrote: > I'd like to use the re module to parse a long text file, 1GB in size. I wish > that the re module could parse a stream, so I wouldn't have to load the > whole thing into memory. I'd like to iterate over matches from the stream > without keeping

Re: [Python-ideas] Support parsing stream with `re`

2018-10-06 Thread Chris Angelico
On Sun, Oct 7, 2018 at 8:01 AM Nathaniel Smith wrote: > > On Sat, Oct 6, 2018 at 12:22 AM, Ram Rachum wrote: > > I'd like to use the re module to parse a long text file, 1GB in size. I wish > > that the re module could parse a stream, so I wouldn't have to load the > > whole thing into memory.

Re: [Python-ideas] Support parsing stream with `re`

2018-10-06 Thread Nathaniel Smith
On Sat, Oct 6, 2018 at 2:04 PM, Chris Angelico wrote: > On Sun, Oct 7, 2018 at 8:01 AM Nathaniel Smith wrote: >> >> On Sat, Oct 6, 2018 at 12:22 AM, Ram Rachum wrote: >> > I'd like to use the re module to parse a long text file, 1GB in size. I >> > wish >> > that the re module could parse a

Re: [Python-ideas] Support parsing stream with `re`

2018-10-06 Thread Chris Angelico
On Sun, Oct 7, 2018 at 9:54 AM Nathaniel Smith wrote: > > On Sat, Oct 6, 2018 at 2:04 PM, Chris Angelico wrote: > > On Sun, Oct 7, 2018 at 8:01 AM Nathaniel Smith wrote: > >> > >> On Sat, Oct 6, 2018 at 12:22 AM, Ram Rachum wrote: > >> > I'd like to use the re module to parse a long text file,

Re: [Python-ideas] Support parsing stream with `re`

2018-10-06 Thread Steven D'Aprano
On Sat, Oct 06, 2018 at 02:00:27PM -0700, Nathaniel Smith wrote: > Fortunately, there's an elegant and natural solution: Just save the > regex engine's internal state when it hits the end of the string, and > then when more data arrives, use the saved state to pick up the search > where we left

Re: [Python-ideas] Support parsing stream with `re`

2018-10-06 Thread Ram Rachum
On Sun, Oct 7, 2018 at 4:40 AM Steven D'Aprano wrote: > I'm sure that Python will never be as efficient as C in that regard > (although PyPy might argue the point) but is there something we can do > to ameliorate this? If we could make char-by-char processing only 10 > times less efficient than

Re: [Python-ideas] Support parsing stream with `re`

2018-10-06 Thread Ram Rachum
Hi Ned! I'm happy to see you here. I'm doing multi-color 3d-printing. The slicing software generates a GCode file, which is a text file of instructions for the printer, each command meaning something like "move the head to coordinates x,y,z while extruding plastic at a rate of w" and lots of