On Sat, 13 Jul 2019, I wrote:

> > May be "[^a]" can use the same algorithm as "[^ab]"?
>
> [^a] is optimized into a different (faster) opcode; I will see if this
> can easily produce the same starting code units as [^ab] for tidyness. I
> do not expect it will do much for performance.

Having looked at the code, I have decided to leave this on the Wish List for the moment. Reasons:

(a) I don't think it will give much performance improvement.

(b) It is a surprising amount of work, because [^a] is handled as a special "not a" case and, just as for a plain "a", there are a number of different opcodes for [^a]*, [^a]+, [^a]{1,4} and so on, all of which would need handling.

(c) It gets complicated in the 16-bit and 32-bit cases, and is pointless in the UTF-8 case for values greater than 255 (e.g. [^\x{1234}]), where it would not lock out any starting bytes.

Regards,
Philip

-- 
Philip Hazel

-- 
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
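
To illustrate point (c), here is a minimal Python sketch. It is not PCRE2 code, and the names start_bytes, ALL_START_BYTES and locked_out are hypothetical. It models only the 8-bit UTF-8 case by brute force: for a negated class it collects the first byte of every character that can still match, then reports which possible starting bytes have been ruled out. A single-byte character such as "a" owns its starting byte, whereas the lead byte of a multi-byte character is shared with many other characters, so excluding that one character rules out nothing.

```python
def start_bytes(excluded):
    """Return the set of first bytes of the UTF-8 encodings of every
    code point that is NOT in `excluded` (i.e. a negated class [^...])."""
    firsts = set()
    for cp in range(0x110000):
        # Surrogates cannot be encoded in UTF-8; excluded code points
        # cannot match the negated class.
        if 0xD800 <= cp <= 0xDFFF or cp in excluded:
            continue
        firsts.add(chr(cp).encode("utf-8")[0])
    return firsts


# Every byte that can begin some UTF-8 encoded character.
ALL_START_BYTES = start_bytes(set())


def locked_out(excluded):
    """Bytes that can start a character in general, but cannot start
    any character matching the negated class."""
    return ALL_START_BYTES - start_bytes(excluded)


print(sorted(locked_out({ord("a")})))            # [97]
print(sorted(locked_out({ord("a"), ord("b")})))  # [97, 98]
print(sorted(locked_out({0x1234})))              # []
```

U+1234 is encoded as the bytes E1 88 B4, and the lead byte 0xE1 also begins every other character from U+1000 to U+1FFF, which is why the last result is empty. The 16-bit and 32-bit cases are not modelled here.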