On Sun, May 14, 2017 at 10:36 PM, Philip Earvolino <pearvol...@gmail.com> wrote:
>
> Hello. I am now working with the mySQL db and the titles do not, apparently,
> have the right encoding (i.e., certain characters do not appear properly).
> The encoding is cp1252 West European (latin1) and the collation is latin1_bin
> which are what is specified in the flat file IMDB export and, I think(?), in
> the imdb sql creation script.

IMDbPY takes the iso-8859-1 plain text files and converts them to utf-8.
If I remember correctly, we don't force the db collations to be utf-8
(and we didn't document it :-/), so if you've created your db and
tables as cp1252, it's normal that the data looks messy.

> Any suggestions?

I don't know what happens if you change your collation
to utf8_unicode_ci (or something like that).
If MySQL doesn't touch the data, great; otherwise you will have
an even bigger mess, I fear.

HTH,
-- 
Davide Alberani <davide.alber...@gmail.com>  [PGP KeyID: 0x3845A3D4AC9B61AD]
 http://www.mimante.net/

_______________________________________________
Imdbpy-help mailing list
Imdbpy-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-help
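
For illustration only, here is a minimal Python sketch of the kind of mess described in the reply: text written as utf-8 but read back as cp1252. It is not IMDbPY code and does not touch MySQL. The sample title and the helper names garble() and recover() are made up for the example, and it assumes the stored bytes really are utf-8 and were not altered by anything else.

```python
def garble(text: str) -> str:
    """Encode text as utf-8, then (wrongly) decode the bytes as cp1252."""
    return text.encode("utf-8").decode("cp1252")


def recover(garbled: str) -> str:
    """Undo garble(): turn the cp1252 characters back into bytes and read them as utf-8.

    This only works if the bytes were not modified in between; it is a
    sketch of the idea, not a repair tool for a real database.
    """
    return garbled.encode("cp1252").decode("utf-8")


if __name__ == "__main__":
    title = "Amélie"  # hypothetical sample title, not taken from the IMDb data
    messy = garble(title)
    print(messy)           # the accented letter shows up as two odd characters
    print(recover(messy))  # the original title again
```

If the round trip does not give back the original text for your own data, the bytes were probably changed somewhere along the way, which is the "even bigger mess" case mentioned above.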