On 1/21/10 10:27 AM, "John Campbell" <jcampbe...@gmail.com> wrote:
> I want to find rows that contain a word that matches a term, accent > insensitive: I am using utf8-general collation everywhere. > > attempt 1: > SELECT * FROM t WHERE txt LIKE '%que%' > Matches que qué, but also matches 'queue' > > attempt 1.5: > SELECT * FROM t WHERE txt LIKE '% que %' OR LIKE 'que %' OR LIKE '% que'; > Almost, but misses "que!" or 'que...' > > attempt2: > SELECT * FROM t WHERE txt REGEXP '[[:<:]]que[[:>:]]' > Matches que, not queue, but doesn't match qué. > > attempt3 > SELECT * FROM t WHERE txt REGEXP > '[[:<:]]q[uùúûüũūŭůűųǔǖǘǚǜ][eèéêëēĕėęě][[:>:]]' > Matches que, queue, qué. (I have no idea why this matches queue, but > the Regex behavior is bizarre with unicode.) > > Does anyone know why the final regex acts weird? "Warning "The REGEXP and RLIKE operators work in byte-wise fashion, so they are not multi-byte safe and may produce unexpected results with multi-byte character sets. In addition, these operators compare characters by their byte values and accented characters may not compare as equal even if a given collation treats them as equal." -- Mysql 11.4.2 > It there a good solution? doesn't look like it. Sphinxsearch might work nicely for you (it does for me) but that may not be an option for you. i generated a Sphinxsearch charset_table config that mimics utf8_general_ci collation. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=arch...@jab.org