The server and client encoding are both set to UTF8, and according to this 
http://www.fileformat.info/info/unicode/char/2013/index.htm en dash is a valid 
UTF8 character, but when running a script with insert statement with en dash 
character in it, I got the error below.

mydb=> select name, setting from pg_settings where name like '%encoding%';
      name       | setting
-----------------+---------
client_encoding | UTF8
server_encoding | UTF8

mydb=> \i activity_type.lst
psql:activity_type.lst:379: ERROR:  invalid byte sequence for encoding "UTF8": 
0x96

If I set client_encoding to WIN1252, the same file will be run ok and records 
inserted in, but afterwards the en dash character showed up as "û", instead of 
the en dash character "-"
mydb=> show client_encoding;
client_encoding
-----------------
WIN1252

I created a database with WIN1252 encoding so both server and client encoding 
are WIN1252, loaded the same file in, en dash character still showed up as 
"û"(actually that is on a windows box, on a linux box the character didn't show 
up at all), so the client setting still makes a difference, even though 
client_encoding showed the same value.

Is there any way I will be able to load the en dash character into the 
postgresql database as is? I had worked around it by editing the input file to 
replace en dash with a plain dash, but that's quite some manual work each time 
when a new dump is generated from an oracle database with WE8MSWIN1252 
characterset.


Thanks,
Patricia

Confidentiality Notice::  This email, including attachments, may include 
non-public, proprietary, confidential or legally privileged information.  If 
you are not an intended recipient or an authorized agent of an intended 
recipient, you are hereby notified that any dissemination, distribution or 
copying of the information contained in or transmitted with this e-mail is 
unauthorized and strictly prohibited.  If you have received this email in 
error, please notify the sender by replying to this message and permanently 
delete this e-mail, its attachments, and any copies of it immediately.  You 
should not retain, copy or use this e-mail or any attachment for any purpose, 
nor disclose all or any part of the contents to any other person. Thank you.

Reply via email to