[lingu-dev] C replacement for substrings.pl

Nanning Buitenhuis Wed, 26 Jul 2006 04:17:46 -0700

Hi,

I wrote a C replacement for substrings.pl.
Although it uses the same algorithm, it is considerably faster
(0.04 s versus 1.09 s of user time on hyphen.us):


$ time ./substrings hyphen.us hyphen.new
0.04user 0.00system 0:00.05elapsed 84%CPU (0avgtext+0avgdata 0maxres)k
0inputs+0outputs (0major+381minor)pagefaults 0swaps

$ time perl substrings.pl hyphen.us hyphen.mashed
1.09user 0.00system 0:01.13elapsed 97%CPU (0avgtext+0avgdata 0maxres)k
0inputs+0outputs (0major+832minor)pagefaults 0swaps

It also fixes a minor bug in combine(): if a sub-pattern occurred twice (or more) in the main pattern, the Perl version changed all occurrences instead of only the last one, which is the correct behavior. The only example in hyphen.us is 'tanta3'.
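
To illustrate the "last occurrence only" behavior, here is a minimal sketch. It is not the actual combine() code: the helper names last_occurrence() and replace_last() are hypothetical, and the strings in the usage note are only meant to resemble the 'tanta3' case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: offset of the LAST occurrence of needle in
 * haystack, or -1 if there is none. Overlapping matches are considered. */
long last_occurrence(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    if (*needle == '\0')
        return -1;              /* treat an empty sub-pattern as "not found" */

    long last = -1;
    for (const char *p = strstr(haystack, needle); p != NULL;
         p = strstr(p + 1, needle))
        last = (long)(p - haystack);
    return last;
}

/* Hypothetical helper: copy haystack to out, replacing only the last
 * occurrence of needle with repl. Returns 0 on success, -1 if needle
 * does not occur or out (capacity cap, including the NUL) is too small. */
int replace_last(const char *haystack, const char *needle,
                 const char *repl, char *out, size_t cap)
{
    long at = last_occurrence(haystack, needle);
    if (at < 0)
        return -1;

    size_t hlen = strlen(haystack);
    size_t nlen = strlen(needle);
    size_t rlen = strlen(repl);
    if (hlen - nlen + rlen + 1 > cap)
        return -1;

    memcpy(out, haystack, (size_t)at);                  /* text before match */
    memcpy(out + at, repl, rlen);                       /* replacement       */
    memcpy(out + at + rlen, haystack + at + nlen,
           hlen - (size_t)at - nlen + 1);               /* tail plus NUL     */
    return 0;
}
```

With these helpers, replace_last("tanta", "ta", "ta3", buf, sizeof buf) leaves "tanta3" in buf, whereas replacing every occurrence of "ta" would give "ta3nta3".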


Other caveats are:
- the output of the C version is sorted in Unicode order
- the input should be UTF-8
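
These two caveats fit together: comparing UTF-8 strings byte by byte orders them by Unicode code point. A minimal sketch of such a sort follows, assuming the patterns are held as an array of NUL-terminated strings. The names cmp_utf8() and sort_patterns() are hypothetical, and this is not necessarily how the C version does it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of char pointers. strcmp() compares
 * bytes as unsigned char, which for valid UTF-8 is code point order. */
static int cmp_utf8(const void *a, const void *b)
{
    const char *sa = *(const char *const *)a;
    const char *sb = *(const char *const *)b;
    return strcmp(sa, sb);
}

/* Hypothetical helper: sort n UTF-8 patterns in place. */
void sort_patterns(const char **patterns, size_t n)
{
    assert(patterns != NULL || n == 0);
    if (n > 1)
        qsort(patterns, n, sizeof *patterns, cmp_utf8);
}
```

Note that this is plain code point order, not a locale-aware collation, so for example every pattern starting with an accented letter sorts after all patterns starting with ASCII letters.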

Anybody interested?
  NaNning.
