Re: [pcre-dev] Add #ifdef SUPPORT_UCP to pcre_ucd.c

Philip Hazel Sat, 02 May 2009 02:52:16 -0700

On Thu, 30 Apr 2009, David Dennerline wrote:

> Would there be any problem with adding an #ifdef SUPPORT_UCP to pcre_ucd.c
> to prevent the Unicode character table from being included? I tried this
> because I don't need UTF-8 support, and it decreases binary size by 50KB. I
> saw that GET_UCD() is called in pcre_dfa_exec() only if SUPPORT_UCP is
> defined.


I don't see any problem, but then again, I don't see the need. Surely if 
there are no references to the module, it won't get included in the 
binary? I thought that was how libraries worked?

> The program compiles and links correctly, but I wanted to double-check to
> see if there would be any potential instability.

You did not say which operating system you are using. Is it Windows? I 
know nothing about Windows, never having used it. I have just done an 
experiment on Linux, and when I compile with UCP support disabled, 
adding #ifdef SUPPORT_UCP makes no difference at all to the size of the 
binaries for pcretest and pcregrep (though it does reduce the size of 
the pcre_ucd.o compiled module). The binaries are, however, noticeably
smaller than when UCP support is enabled (by more than 50K because a lot 
of other code is cut out as well as the tables).
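
For illustration, the kind of guard being discussed can be sketched as
below. This is a minimal sketch, not the contents of pcre_ucd.c: the names
demo_ucd_table, demo_table_bytes and demo_lookup are hypothetical, and the
50KB size is only a stand-in for the figure quoted above. When a module
like this lives in a static library, a linker normally pulls it in only if
something references its symbols, which is consistent with the guard making
no difference to the final binaries in the experiment.

```c
#include <stddef.h>

/* Hypothetical stand-in for a large Unicode property table.
   The real table in pcre_ucd.c has different names and contents. */
#ifdef SUPPORT_UCP
const unsigned char demo_ucd_table[50 * 1024] = { 1 };
#endif

/* Number of table bytes compiled into this module. */
size_t demo_table_bytes(void)
{
#ifdef SUPPORT_UCP
    return sizeof(demo_ucd_table);
#else
    return 0;               /* table omitted at compile time */
#endif
}

/* Look up a (made-up) property value; returns -1 when the table
   was compiled out. */
int demo_lookup(unsigned int c)
{
#ifdef SUPPORT_UCP
    return demo_ucd_table[c % sizeof(demo_ucd_table)];
#else
    (void)c;                /* unused without the table */
    return -1;
#endif
}
```

Compiling this with and without -DSUPPORT_UCP and comparing the object
file sizes shows the effect on the module itself. Whether the final binary
shrinks depends on whether the linker would have included the module anyway.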

> Second, have there ever been any discussions or plans to implement a
> hybrid NFA/DFA engine that would improve performance for applications
> that do not require back-references (i.e., substitution) or other
> non-DFA-friendly constructs? Something like Henry Spencer's Tcl regular
> expression parser.

There has been no discussion or planning that I am aware of. A while 
before I retired (18 months ago) I did start thinking about the 
possibility of turning the compiled regex into a proper state table for 
a traditional finite state machine that would probably execute faster
than pcre_dfa_exec(). However, I did not get very far (it was very
tricky, as I recall) and I have not picked this up again since.
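
As a rough idea of what a "proper state table" means here, the sketch below
hand-builds a table-driven finite state machine for the toy pattern ab*c,
matched against a whole string. It is an illustration under stated
assumptions only: the table was written by hand, every name (demo_next,
demo_class, demo_dfa_match) is hypothetical, and it does not reflect how
PCRE compiles or executes patterns. The hard part alluded to above,
generating such a table automatically from a compiled regex, is not shown.

```c
/* Table-driven DFA for the toy pattern "ab*c" (whole-string match).
   Illustrative only; not derived from PCRE's compiled pattern format. */
enum { S_START, S_AFTER_A, S_ACCEPT, S_DEAD, N_STATES };
enum { C_A, C_B, C_C, C_OTHER, N_CLASSES };

/* demo_next[state][class] gives the next state. */
static const unsigned char demo_next[N_STATES][N_CLASSES] = {
    /* S_START   */ { S_AFTER_A, S_DEAD,    S_DEAD,   S_DEAD },
    /* S_AFTER_A */ { S_DEAD,    S_AFTER_A, S_ACCEPT, S_DEAD },
    /* S_ACCEPT  */ { S_DEAD,    S_DEAD,    S_DEAD,   S_DEAD },
    /* S_DEAD    */ { S_DEAD,    S_DEAD,    S_DEAD,   S_DEAD },
};

/* Map an input character to its column in the table. */
static int demo_class(char ch)
{
    switch (ch) {
    case 'a': return C_A;
    case 'b': return C_B;
    case 'c': return C_C;
    default:  return C_OTHER;
    }
}

/* Return 1 if the whole string matches ab*c, else 0.
   Each input character costs exactly one table lookup. */
int demo_dfa_match(const char *s)
{
    int state = S_START;
    for (; *s != '\0'; s++)
        state = demo_next[state][demo_class(*s)];
    return state == S_ACCEPT;
}
```

The appeal of this form is that matching never backtracks and never tracks
more than one state at a time, so the cost is one lookup per subject
character regardless of the pattern.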

Philip

-- 
Philip Hazel

-- 
## List details at http://lists.exim.org/mailman/listinfo/pcre-dev
