Re: [Python-Dev] Regular expressions: splitting on zero-width patterns

2017-11-28 Thread MRAB
On 2017-11-28 22:27, Guido van Rossum wrote: On Tue, Nov 28, 2017 at 2:23 PM, MRAB wrote: On 2017-11-28 20:04, Serhiy Storchaka wrote: The two largest problems in the re module are splitting on zero-width

Re: [Python-Dev] Regular expressions: splitting on zero-width patterns

2017-11-28 Thread Guido van Rossum
On Tue, Nov 28, 2017 at 2:23 PM, MRAB wrote:
> On 2017-11-28 20:04, Serhiy Storchaka wrote:
>> The two largest problems in the re module are splitting on zero-width
>> patterns and complete and correct support of the Unicode standard. These
>> problems are solved in

Re: [Python-Dev] Regular expressions: splitting on zero-width patterns

2017-11-28 Thread MRAB
On 2017-11-28 20:04, Serhiy Storchaka wrote: The two largest problems in the re module are splitting on zero-width patterns and complete and correct support of the Unicode standard. These problems are solved in regex. regex has many other features, but they are less important. I want to tell

Re: [Python-Dev] Regular expressions: splitting on zero-width patterns

2017-11-28 Thread Guido van Rossum
I trust your instincts and powers of analysis here. Maybe MRAB has some useful feedback on the tar in the honey?
On Tue, Nov 28, 2017 at 12:04 PM, Serhiy Storchaka wrote:
> The two largest problems in the re module are splitting on zero-width
> patterns and complete and

[Python-Dev] Regular expressions: splitting on zero-width patterns

2017-11-28 Thread Serhiy Storchaka
The two largest problems in the re module are splitting on zero-width patterns and complete and correct support of the Unicode standard. These problems are solved in regex. regex has many other features, but they are less important. I want to tell the problem of splitting on zero-width
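
For readers unfamiliar with the issue, here is a minimal sketch of what splitting on a zero-width pattern means. It is not taken from the thread: the helper name `split_camel` and the sample strings are hypothetical, and the results shown assume Python 3.7 or later, where `re.split()` splits on empty matches. At the time of this discussion (Python 3.6), `re.split()` did not split on empty matches, which is the limitation being raised.

```python
import re

# Assumption: Python 3.7+, where re.split() honors empty (zero-width) matches.
# \b matches the empty string at every word boundary.
words = re.split(r'\b', 'Words, words, words.')

# Hypothetical helper: split before each uppercase letter using a
# zero-width lookahead, so no characters are consumed by the separator.
def split_camel(name):
    return re.split(r'(?=[A-Z])', name)

parts = split_camel('splitCamelCase')

print(words)  # ['', 'Words', ', ', 'words', ', ', 'words', '.']
print(parts)  # ['split', 'Camel', 'Case']
```

Because the separator has zero width, every character of the input survives in the output pieces; only the cut points are chosen by the pattern.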