[rt-users] Full text indexing failure (invalid byte sequence for encoding UTF8)

2013-02-01 Thread Ben Poliakoff
We're currently running RT 4.0.5-3~bpo60+1 (from Debian backports) with
Postgresql 8.4.12-0squeeze1.

Recently I tried to enable full text search following the instructions
here:

http://blog.bestpractical.com/2011/06/full-text-searching.html

...but ran into this error an hour into the initial rt-fulltext-indexer
--all:

[crit]: error: ERROR:  invalid byte sequence for encoding UTF8: 0xfc
HINT:  This error can also happen if the byte sequence does not
  match the encoding expected by the server, which is controlled by
  client_encoding. at /usr/sbin/rt-fulltext-indexer-4 line 375.
  (/usr/share/request-tracker4/lib/RT.pm:351)

Subsequent runs of the same command end with the same error.

The encoding for the rt4 db has been set to utf8 for as long as I can
recall.  I assume this relates to some data inserted into the db ages
ago when client_encoding was something other than utf8, or in a previous
version of postgresql which might have been less stringent about input.
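
For illustration: 0xfc is the byte that Latin-1 (ISO-8859-1) uses for
"ü", and on its own it is not a valid UTF-8 sequence, which fits the
theory above.  A minimal Python sketch, not taken from RT or PostgreSQL;
the sample string and the helper name is_valid_utf8 are made up for the
example:

```python
# Hypothetical sample: "Müller" as it would be stored if the client
# sent Latin-1 bytes into a database that expects UTF-8.
raw = b"M\xfcller"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(raw))                        # False: 0xfc is not a valid start byte
print(raw.decode("latin-1"))                     # Müller
print(is_valid_utf8("Müller".encode("utf-8")))   # True: stored as 0xc3 0xbc
```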

There is a FAQ about 'invalid byte sequence for encoding' but I'm not
sure that this is the same issue.

Anyone else been through this sort of issue?  Would it be better to take
the question to a postgresql list?

Ben

-- 

pub   4096R/318B6A97 2009-05-11 Ben Poliakoff b...@reed.edu
 Primary key fingerprint: 3F23 EBC8 B73E 92B7 0A67  705A 8219 DCF0 318B 6A97




Re: [rt-users] Full text indexing failure (invalid byte sequence for encoding UTF8)

2013-02-01 Thread Alex Vandiver
On Fri, 2013-02-01 at 17:03 -0800, Ben Poliakoff wrote:
> We're currently running RT 4.0.5-3~bpo60+1 (from Debian backports) with
> Postgresql 8.4.12-0squeeze1.

This is fixed in RT 4.0.9 and above, which resolve the issue by skipping
the attachment with bad data.  The likely cause here is old email that
claimed to be utf-8 but was not, which older Pg allowed into the
database.  RT 4.0.7 and above are better about not trusting such claims,
which prevents the bad data from getting in in the first place.
 - Alex
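
The two behaviours described in the reply (verifying a claimed utf-8
charset on the way in, and skipping bad attachments while indexing) can
be sketched roughly as below.  This is not RT's actual code, which is
Perl; the function names, the Latin-1 fallback, and the (id, bytes)
attachment shape are all assumptions made for the example.

```python
from typing import Iterable, List, Tuple


def decode_claimed_utf8(body: bytes, fallback: str = "latin-1") -> str:
    """Verify a 'charset=utf-8' claim instead of trusting it.

    The Latin-1 fallback is an assumption for this sketch.
    """
    try:
        return body.decode("utf-8")
    except UnicodeDecodeError:
        return body.decode(fallback)


def index_all(
    attachments: Iterable[Tuple[int, bytes]],
) -> Tuple[List[int], List[int]]:
    """Index what is valid UTF-8 and skip the rest instead of aborting."""
    indexed: List[int] = []
    skipped: List[int] = []
    for att_id, content in attachments:
        try:
            content.decode("utf-8")
        except UnicodeDecodeError:
            skipped.append(att_id)  # bad data: note it and carry on
            continue
        indexed.append(att_id)      # a real indexer would store the text here
    return indexed, skipped


if __name__ == "__main__":
    sample = [(1, b"plain ascii"), (2, b"\xfc"), (3, "ü".encode("utf-8"))]
    print(index_all(sample))
```

The point of the second function is only that one bad row no longer
stops the whole run, which is what a failing --all pass needs.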