http://bugs.exim.org/show_bug.cgi?id=1416

CJ Dennis <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID

--- Comment #3 from CJ Dennis <[email protected]> 2013-11-19 23:46:12 ---

@Graycode
The equivalent English regex would be:

<?php
// /u is unnecessary but harmless here, as there are no UTF-8 characters
print(preg_replace('/(?<!k)/u', '*', 'k') . "\n");
print(preg_replace('/(?<!k)/u', '*', 'm') . "\n");
?>
--------
Actual output:
*k
*m*

@Philip Hazel
I could see the characters correctly in your reply. The regex looks for every
position that is not preceded by the Sinhala letter for 'k' and inserts a '*'
there. The string to search in is the third argument (Sinhala 'k' and Sinhala
'm'). Yes, /u turns on UTF-8 mode; otherwise PHP treats the pattern as ASCII
(albeit with 8 bits).

You seem to be correct that it's not a PCRE bug. I downloaded pcre-7.0.exe
(I'm using Windows), read the pcre-man.pdf file and constructed a test file
with the regex and the subject strings in it. I got one match for the first
string and two matches for the second. If the bug were present I would expect
to see four matches, not two. I'm therefore assuming that PCRE is behaving
correctly (and that no bug has crept in between 7.0 and 8.32).

I will report this bug on the PHP site. Thanks for your time!

By the way, the input file I used was:
--------
/(?<!(ක))/g8
    ක
    ම
--------
and the output was:
--------
PCRE version 7.0 18-Dec-2006

/(?<!(ක))/g8
    ක
 0: 
    ම
 0: 
 0: 
--------

--
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
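As a cross-check on the match counts discussed above, here is a minimal sketch that counts the empty matches of the same negative lookbehind in Python. It uses Python's built-in re module, which is a different engine from PCRE, so it only illustrates the expected behaviour and says nothing about PHP itself. The helper name match_offsets and the constants are made up for this example. The second half runs the same pattern over the raw UTF-8 bytes, on the assumption that this roughly mirrors what matching without UTF-8 mode looks like. It is not a diagnosis of the PHP problem.

```python
import re

KA = "\u0d9a"  # Sinhala letter 'k' (ක), UTF-8 bytes E0 B6 9A
MA = "\u0db8"  # Sinhala letter 'm' (ම), UTF-8 bytes E0 B6 B8

def match_offsets(pattern, subject):
    """Return the start offset of every match of pattern in subject."""
    return [m.start() for m in re.finditer(pattern, subject)]

# Character mode: each Sinhala letter is one code point, so each
# subject has exactly two candidate positions (before and after it).
CHAR_PATTERN = "(?<!" + KA + ")"
print(match_offsets(CHAR_PATTERN, KA))  # [0]     one match
print(match_offsets(CHAR_PATTERN, MA))  # [0, 1]  two matches

# Byte mode (assumption: comparable to matching without UTF-8 mode).
# Each letter is three bytes, so there are four candidate positions.
BYTE_PATTERN = b"(?<!" + KA.encode("utf-8") + b")"
print(match_offsets(BYTE_PATTERN, KA.encode("utf-8")))  # [0, 1, 2]
print(match_offsets(BYTE_PATTERN, MA.encode("utf-8")))  # [0, 1, 2, 3]
```

In character mode the counts agree with the pcretest run quoted above (one match, then two). In byte mode the second subject yields four matches, because every byte boundary becomes a candidate position.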
