$ iconv -f US-ASCII -t UTF-8 < test.sql > out.sql
iconv: illegal input sequence at position 114500
Any ideas how the job can be accomplished reliably?
Also, my database may contain data in multiple encodings, such as WINDOWS-1251 and WINDOWS-1256, in various places, because the data was inserted by different people using different sources and client software.
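For what it's worth, US-ASCII is a 7-bit encoding, so iconv is expected to stop at the first byte with a value of 0x80 or above; the reported position is presumably the byte offset of that byte. A minimal sketch for listing such bytes follows. The function name `non_ascii_offsets` and the sample data are made up for illustration and are not part of any tool mentioned here.

```python
def non_ascii_offsets(data, limit=10):
    """Return up to `limit` (offset, byte value) pairs for bytes outside 7-bit ASCII."""
    hits = []
    for offset, byte in enumerate(data):
        if byte >= 0x80:  # anything above 0x7F is not valid US-ASCII
            hits.append((offset, byte))
            if len(hits) == limit:
                break
    return hits


# Hypothetical sample line; 0xE9 is "é" in ISO-8859-1 and several Windows code pages.
sample = b"INSERT INTO t VALUES ('caf\xe9');\n"
for offset, byte in non_ascii_offsets(sample):
    print(f"offset {offset}: byte 0x{byte:02x}")
```

On the real dump one would pass `open("test.sql", "rb").read()` instead of the sample and look at what surrounds the reported offsets to guess which encoding produced them.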
You could use a simple program like this (in Python):
    # Try each candidate encoding in turn; the first one that decodes the line wins.
    # "whatever" is a placeholder: replace it with a real codec name.
    with open("your dump", "rb") as source, open("unidump", "wb") as output:
        for line in source:
            for encoding in "utf-8", "iso-8859-15", "whatever":
                try:
                    output.write(line.decode(encoding).encode("utf-8"))
                    break
                except UnicodeError:
                    pass
            else:
                print("No suitable encoding for line...")
I'd say this might work, if UTF-8 cannot absorb an apostrophe inside a multibyte character. Can it?
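On that question: under the UTF-8 encoding rules, every byte of a multibyte sequence has its high bit set (0x80 or above), so an ASCII byte such as the apostrophe (0x27) or a newline should never appear inside one. The sketch below only illustrates this on a handful of sample characters rather than proving it; the helper name `has_only_high_bytes` is invented for the example.

```python
def has_only_high_bytes(ch):
    """True if every byte of the UTF-8 encoding of `ch` is 0x80 or above."""
    return all(byte >= 0x80 for byte in ch.encode("utf-8"))


# A few non-ASCII characters that take 2, 3, or 4 bytes in UTF-8.
samples = ["é", "Ж", "ش", "€", "😀"]
for ch in samples:
    print(repr(ch), ch.encode("utf-8").hex(" "), has_only_high_bytes(ch))

# The only 0x27 byte in the encoded text is the real apostrophe.
text = "l'été"
apostrophes = text.encode("utf-8").count(b"'")
print(apostrophes)
```

So splitting or quoting on ASCII bytes should be safe once the data really is UTF-8.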
Or you could do that to all your tables using SELECTs, but it's going to be painful...
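One caveat with trying encodings in order, for data like the WINDOWS-1251 and WINDOWS-1256 mix mentioned earlier: single-byte code pages assign a character to almost every byte value, so decoding rarely fails and the first single-byte encoding in the list tends to win whether or not it is the right one. UTF-8 is strict, which makes it a useful first check. A small illustration, using Python's `cp1251` and `cp1256` codecs and a made-up sample string:

```python
# Bytes as a WINDOWS-1251 client would have stored the Russian word "Привет".
raw = "Привет".encode("cp1251")

as_1251 = raw.decode("cp1251")  # the intended text
as_1256 = raw.decode("cp1256")  # also decodes without error, but to different characters

# The same bytes are not valid UTF-8, so a strict UTF-8 attempt rejects them.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(as_1251, as_1256, valid_utf8)
```

Telling 1251 rows from 1256 rows would therefore need some extra hint, for example which table or client the row came from, or a check on which script the decoded text ends up in.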
