Thanks Ken, but I don't think the locale is relevant here. My locale is set to en_AU.UTF-8, which matches the locale used by the Postgres database. The WordPress data is imported from MySQL; I didn't change any settings there, so I suspect it is Latin1.
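If the MySQL data really is Latin1, one option is to decode the byte strings explicitly at the point of import, rather than letting Python fall back to the 'ascii' codec when bytes and unicode text are mixed. A minimal sketch, assuming the values arrive either as text or as raw bytes; the helper name `to_text` and the encoding order are hypothetical and not part of Mezzanine or kitchen. It is written in Python 3 syntax; under Python 2 the type check would be against `unicode` instead of `str`.

```python
def to_text(value, encodings=("utf-8", "latin-1")):
    """Return value as text, decoding bytes with the first encoding that fits.

    Hypothetical helper for illustration only. UTF-8 is tried first because
    it rejects most byte sequences that are not valid UTF-8; Latin-1 accepts
    any byte sequence, so it acts as the final fallback.
    """
    if isinstance(value, str):
        # Already text, nothing to decode.
        return value
    for encoding in encodings:
        try:
            return value.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Only reached if no listed encoding worked (e.g. Latin-1 was left out).
    return value.decode(encodings[-1], errors="replace")


# Example inputs: a Cyrillic title stored as UTF-8 bytes and a French
# title stored as Latin-1 bytes both come back as text.
russian_title = to_text("Привет".encode("utf-8"))
french_title = to_text("café".encode("latin-1"))
```

Trying UTF-8 before Latin-1 matters: Latin-1 never raises, so putting it first would silently turn UTF-8 encoded Cyrillic into mojibake.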
Cheers,
Paul

On 30 April 2014 22:36, Ken Bolton <[email protected]> wrote:

> Hi Paul,
>
> In my experience, the UnicodeDecodeError only happens if you have not set
> up your locale correctly. The highlighted section of the fabfile, here
> https://github.com/stephenmcd/mezzanine/blob/master/mezzanine/project_template/fabfile.py#L346-L350,
> remedies this problem every time.
>
> hth,
> ken
>
>
> On Wed, Apr 30, 2014 at 5:38 AM, Paul Whipp <[email protected]> wrote:
>
>> I may be joining the translation discussion shortly; I have a site that
>> is using Russian, French, English and Indonesian.
>>
>> I'm importing pages from Wordpress (not blog entries - pages) and I get
>> the dreaded "UnicodeDecodeError: 'ascii' codec can't decode byte..." error
>> in Mezzanine code that joins up the titles and that gets the
>> 'description_from_content' when I save the RichTextPage object created from
>> the wordpress page.
>>
>> USE_I18N is True in settings.
>>
>> Obviously I don't want to lose the Cyrillic characters and I need to get
>> these posts imported. I've tried various options and the best one so far
>> seems to be using kitchen's to_unicode and to_bytes e.g:
>>
>>
>> from kitchen.text.converters import to_bytes, to_unicode
>> ...
>>
>>     def import_page(self, page, pages):
>>         title = to_unicode(page['post_title'])
>>         self.vprint("BEGIN Importing page '{0}'".format(to_bytes(title)), 1)
>>         mezz_page = self.get_or_create(RichTextPage, title=title)
>>         if page['post_parent'] > 0:  # there is a parent
>>             mezz_page.parent = self.get_mezz_page(page['post_parent'], pages)
>>         mezz_page.created = page['post_modified']
>>         mezz_page.updated = page['post_modified']
>>         mezz_page.content = to_unicode(page['post_content'])
>>         mezz_page.save()
>>
>> The parent bit is w.i.p. but this works for the content and title - it
>> retains the cyrillic characters correctly. However it seems unwieldy. Is
>> this approach a good one or should I be doing something else?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Mezzanine Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>>
>> For more options, visit https://groups.google.com/d/optout.
