On Sunday, 10 September 2017 at 00:16:10 UTC, Chad Joan wrote:
On Tuesday, 5 September 2017 at 10:50:46 UTC, Dmitry Olshansky wrote:
My burndown list for std.regex:

https://issues.dlang.org/buglist.cgi?bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&bug_status=VERIFIED&component=phobos&list_id=216638&product=D&query_format=advanced&resolution=---&short_desc=regex&short_desc_type=allwordssubstr
...

I was working on std.regex a bit myself, so I created this bug report to capture some of the findings/progress:
https://issues.dlang.org/show_bug.cgi?id=17820

It seems like something you might be interested in, or might even have a small chance of fixing in the course of other things.

Yeah, well-known problem. The solution is to keep a bit of memory cached, e.g. in a TLS variable.
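For readers unfamiliar with the idea, here is a minimal sketch of a thread-local scratch buffer in D. It is not the std.regex code or its eventual fix; `cachedBuffer` and `acquireScratch` are hypothetical names chosen for illustration, and it only shows the general pattern of reusing one allocation per thread.

```d
// Module-level variables in D are thread-local by default,
// so each thread gets its own cached buffer without locking.
private ubyte[] cachedBuffer; // hypothetical name, not from std.regex

/// Returns a scratch slice of `size` bytes, reusing the cached
/// allocation when it is already large enough.
ubyte[] acquireScratch(size_t size)
{
    if (cachedBuffer.length < size)
        cachedBuffer = new ubyte[size]; // grow only when needed
    return cachedBuffer[0 .. size];
}

unittest
{
    auto a = acquireScratch(64);
    auto b = acquireScratch(32);
    // The second, smaller request reuses the first allocation.
    assert(a.ptr is b.ptr);
    assert(b.length == 32);
}
```

The caller must finish with one scratch slice before asking for the next, since both share the same memory.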


...

There are other regex improvements I might be interested in, but I'm not sure I have time to make bug reports for them right now. I might be convinced to fast track them if someone wants to make legitimate effort towards fixing them, otherwise I'll eventually get around to writing the reports and/or making PRs someday.

Examples:

-- Calls to malloc in the CTFE path cause some regexes to fail at compile time. I suspect this happens due to the Captures (n > smallString) condition when the number of possible captures is greater than 3, but I haven't tested it (time consuming...).


Shouldn't be a problem, but please report an example.

-- I remember being unable to iterate over named captures. But I'm not confident that I'm remembering this correctly, and I'm not sure if it's still true.


Would be a nice and simple enhancement.
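Until Captures itself offers such iteration, one possible workaround is to take the group names from the compiled regex and look each one up in the match. The sketch below assumes the `namedCaptures` property on `Regex` is available in your Phobos version; `namedGroups` and the sample pattern are made up for illustration.

```d
import std.regex : matchFirst, regex;

/// Collects every named group of a first match into an associative array.
/// `namedGroups` is a hypothetical helper, not part of std.regex.
string[string] namedGroups(string input)
{
    auto re = regex(`(?P<key>\w+)=(?P<value>\d+)`);
    auto c = matchFirst(input, re);
    string[string] result;
    if (c.empty)
        return result; // no match: return an empty table
    // The names come from the regex; the values come from the captures.
    foreach (name; re.namedCaptures)
        result[name] = c[name];
    return result;
}

unittest
{
    auto groups = namedGroups("answer=42");
    assert(groups["key"] == "answer");
    assert(groups["value"] == "42");
}
```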

-- The Captures struct does not specify what value is returned for submatches that were in the branch of an alternation that wasn't taken or in a repetition that matched 0 or more than 1 times.

As in every engine out there, the value is "", the empty string.
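A small sketch of that behaviour, assuming a reasonably recent std.regex. The function names are hypothetical and the patterns are only illustrative.

```d
import std.regex : matchFirst, regex;

/// Group 1 sits in the alternation branch that is not taken.
string untakenBranch()
{
    auto c = matchFirst("b", regex(`(a)|(b)`));
    return c[1];
}

/// Group 1 sits inside a repetition that matches zero times.
string zeroRepetitions()
{
    auto c = matchFirst("b", regex(`(a)*b`));
    return c[1];
}

unittest
{
    // Both groups did not participate in the match and compare equal to "".
    assert(untakenBranch() == "");
    assert(zeroRepetitions() == "");
}
```

Note that this makes a group that did not participate indistinguishable, by value alone, from a group that matched an empty string.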


-- The Captures struct does not seem to have a way to access all of the strings matched by a submatch in repetition context, not to mention nested repetition contexts.


Just like any other regex library.
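To illustrate: a group inside a repetition keeps only its last iteration, so collecting every piece usually means a second pass with `matchAll` over the inner pattern. This is a sketch with hypothetical helper names and a deliberately simple comma-separated input, not a general answer for nested repetitions.

```d
import std.regex : matchAll, matchFirst, regex;

/// Returns what group 1 holds after the repetition has finished.
string lastIteration(string input)
{
    auto c = matchFirst(input, regex(`(?:(\w),?)+`));
    return c[1];
}

/// Collects every item by matching the inner pattern repeatedly.
string[] allIterations(string input)
{
    string[] items;
    foreach (m; matchAll(input, regex(`\w`)))
        items ~= m.hit;
    return items;
}

unittest
{
    assert(lastIteration("a,b,c") == "c"); // only the final iteration survives
    assert(allIterations("a,b,c") == ["a", "b", "c"]);
}
```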


I'm not sure how much those mentions help without proper bug reports, but at least I got it off my chest (for the time being) without having to spend my whole weekend writing bug reports ;)


Well, they are warmly welcome should you get to it.

...

Dmitry, I appreciate your working towards making the regex module easier to work on. Thanks.

...

I'm curious what you're thinking about when you mention something ambitious like writing a new GC :)
(like this https://imgur.com/cWa4evD)

I can't help but fantasize about cheap ways to get GC allocations to parallelize well and not end up writing an entire generational collector!

ThreadCache can go a long way to help that.

But I doubt I'll ever have the opportunity to work on such things. I hope your GC attempt works out!

Me too. It won't be a trivial effort though.

