On Tue, 2 Aug 2011, Marco van de Voort wrote:

In our previous episode, Michael Van Canneyt said:
I knew I recognised the name of the author. This code is already used in
Lazarus and can be found in components/synedit/synregexpr.pas. I had to work
on it because the original code doesn't run on CPUs requiring alignment.
The patch is attached at http://bugs.freepascal.org/view.php?id=19109.


Still, this is good news: in Lazarus the license is MPL or GPL; now it
can be under the modified LGPL.

If Florian agrees (if I'm correct, he wrote the old unit), we can move the
old regexpr to oldregexpr, and move this one into its place.

There are more contenders.
Yesterday, on IRC, somebody (Rosseaux) offered a native regex unit with PCRE
constructs:

19:52 < rosseaux> https://scm.fluktuation.net/svn/brre/ a feature-complete
                 (with mostly all known regexp features from perl/pcre/etc.)
                 and Unicode8.0-conform-and-UTF8-capable (as optional work
                 mode in addition to the ansichar-bytewise-mode)
                 bytecode-based regular expression engine for object pascal,
                 it has two subengines, a backtracking NFA and a parallel
                 threaded NFA (also called lazy-computed DFA), both engines
                 are cascaded in each another, so that ReDOS attacks are
                 still possible but not more so easy exploitable as like in
                 some other regex engines. It's licensed under the LGPL with
                 static-linking-exception.
20:00 < rosseaux> and it includes shift-or and boyer-moore (if >32
                 subsearchs for static simple regex patterns

20:46 < fpk> rosseaux: did you do any speed comparisons?
20:46 < rosseaux> not yet
20:47 < rosseaux> only feature comparison tests, but i'll do it in the next
                 days
20:49 < fpk> nice work :)
20:49 < rosseaux> the parallel threaded non-backtracking NFA idea is based
                 on the http://swtch.com/~rsc/regexp/regexp1.html article,
                 which I've found with google months ago,
20:49 < rosseaux> thanks :)
20:52 < rosseaux> the UTF8 decoder stuff in BRRE is also a DFA machine and
                 is position-state-hold-based until the whole regexp stuff
                 is in the process, so the UTF8 support should be faster
                 than in some other regexp engines with non-complete UTF8
                 support just as PCRE and so on.
20:55 < rosseaux> and a already-done-compiled BRRE regex can be used in
                 multiple CPU threads at the same time, so it's
                 semi-threadsafe in this sense.
20:56 < rosseaux> so https://anonym...@scm.fluktuation.net/svn/brre/  that
                 should working now
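
For readers who don't know the difference between the two subengines
mentioned in the log, here is a minimal sketch of the idea, not BRRE's actual
code. It is in Python purely for illustration, and every name in it
(NFA, simulate_parallel, simulate_backtracking) is hypothetical. The NFA is
built by hand for the pattern (a|aa)*b and matched against the whole input.
The "parallel" version advances a set of states one character at a time, so
its work is bounded by input length times number of states. The backtracking
version tries one path at a time and can revisit the same positions many
times on input that almost matches.

```python
# Hand-built NFA for the pattern (a|aa)*b (an assumption for illustration).
# Each state maps to a list of (character, next_state) transitions.
NFA = {
    0: [("a", 0), ("a", 1), ("b", 2)],  # state 0: loop hub, 'b' leads to accept
    1: [("a", 0)],                      # state 1: second 'a' of the "aa" branch
    2: [],                              # state 2: accepting state
}
START, ACCEPT = 0, 2


def simulate_parallel(text):
    """Non-backtracking match: track every reachable state at once."""
    current = {START}
    for ch in text:
        following = set()
        for state in current:
            for label, target in NFA[state]:
                if label == ch:
                    following.add(target)
        if not following:
            return False  # no thread survived this character
        current = following
    return ACCEPT in current


def simulate_backtracking(text, state=START, pos=0, counter=None):
    """Backtracking match: follow one path, retry alternatives on failure.

    counter is an optional one-element list used to count visited steps.
    """
    if counter is not None:
        counter[0] += 1
    if pos == len(text):
        return state == ACCEPT
    for label, target in NFA[state]:
        if label == text[pos] and simulate_backtracking(
            text, target, pos + 1, counter
        ):
            return True
    return False


if __name__ == "__main__":
    almost = "a" * 20  # never matches: the final 'b' is missing
    steps = [0]
    print(simulate_parallel(almost))
    print(simulate_backtracking(almost, counter=steps))
    print("backtracking steps:", steps[0])
```

Both functions give the same answers; only the amount of work differs. On
the all-'a' input the backtracking step count grows roughly exponentially
with the length, which is the ReDoS behaviour the log refers to, while the
set-based version does one pass over the input.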


That being said, there is probably room for two packages.

Hmmm. Yes.

But the units would have to be named differently anyhow, to avoid a mess like the one we had with the apache headers.

In order to avoid future name clashes, I would propose to prefix the 'native FPC' one with 'fp'.

Which one that is, is largely irrelevant to me. I haven't had the need for regular expressions yet :)

Michael.
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
