Thanks Ken,

I don't think the locale is relevant here. My locale is set to
en_AU.UTF-8, which matches the locale used by the Postgres database.
The WordPress data is imported from MySQL. I didn't change any settings
there, so I suspect it is Latin1.
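
For reference, here is a minimal sketch of decoding the raw bytes explicitly
rather than relying on the implicit ASCII codec. The helper name `to_text`
and the encoding order are assumptions for illustration, not part of
Mezzanine or kitchen. Latin1 has no Cyrillic characters, so if the Cyrillic
text survives the import, the bytes are probably UTF-8 even if the MySQL
column is declared as Latin1; that is why UTF-8 is tried first.

```python
def to_text(value, encodings=("utf-8", "latin-1")):
    """Return value as text, trying each encoding in turn.

    Hypothetical helper: the name and the default encoding order are
    assumptions, not an API from Mezzanine or kitchen.
    """
    if not isinstance(value, bytes):
        return value  # already text (or not a string at all)
    for encoding in encodings:
        try:
            return value.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Only reached if no listed encoding worked; substitute bad bytes.
    return value.decode("utf-8", "replace")
```

Something like `title = to_text(page['post_title'])` would then give text
that can be joined and saved without triggering the ASCII codec.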

Cheers,
Paul




On 30 April 2014 22:36, Ken Bolton <[email protected]> wrote:

> Hi Paul,
>
> In my experience, the UnicodeDecodeError only happens if you have not set
> up your locale correctly.  The highlighted section of the fabfile, here
> https://github.com/stephenmcd/mezzanine/blob/master/mezzanine/project_template/fabfile.py#L346-L350,
> remedies this problem every time.
>
> hth,
> ken
>
>
> On Wed, Apr 30, 2014 at 5:38 AM, Paul Whipp <[email protected]> wrote:
>
>> I may be joining the translation discussion shortly; I have a site that
>> is using Russian, French, English and Indonesian.
>>
>> I'm importing pages from WordPress (pages, not blog entries) and I get
>> the dreaded "UnicodeDecodeError: 'ascii' codec can't decode byte..." error
>> in the Mezzanine code that joins up the titles and in the code that gets
>> the 'description_from_content' when I save the RichTextPage object created
>> from the WordPress page.
>>
>> USE_I18N is True in settings.
>>
>> Obviously I don't want to lose the Cyrillic characters and I need to get
>> these posts imported. I've tried various options and the best one so far
>> seems to be using kitchen's to_unicode and to_bytes, e.g.:
>>
>>
>> from kitchen.text.converters import to_bytes, to_unicode
>> ...
>>
>>     def import_page(self, page, pages):
>>         title = to_unicode(page['post_title'])
>>         self.vprint(
>>             "BEGIN Importing page '{0}'".format(to_bytes(title)), 1)
>>         mezz_page = self.get_or_create(RichTextPage, title=title)
>>         if page['post_parent'] > 0:  # there is a parent
>>             mezz_page.parent = self.get_mezz_page(
>>                 page['post_parent'], pages)
>>         mezz_page.created = page['post_modified']
>>         mezz_page.updated = page['post_modified']
>>         mezz_page.content = to_unicode(page['post_content'])
>>         mezz_page.save()
>>
>> The parent bit is a work in progress, but this works for the content and
>> title: it retains the Cyrillic characters correctly. However, it seems
>> unwieldy. Is this approach a good one, or should I be doing something else?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Mezzanine Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
