[Zope-dev] Re: Showstopper UnicodeDecodeError on Zope import???
I have just returned from vacation and am looking at this issue in more detail. Dieter's explanation seems logical, especially considering the following traceback, which I get when I try to move a folder containing these specific CMF object types.

Looking at the objects, they have plain text ids (I assume, by looking only) but unicode titles (again, just by looking in the ZMI). The latter were probably copied and pasted from somewhere into a CMF product's editing screen, which I presume is the problem here.

So, following Dieter's explanation, is it possible to find and identify which objects have the non-unicode/non-ascii ids/titles using some python? I'm assuming that I could then edit the offending object's id/title in the ZMI to "re-unicode" it.

Changing the default character encoding of my production server for this one application just isn't an option, unfortunately.

Thanks, nick

Traceback (innermost last):
* Module ZPublisher.Publish, line 101, in publish
* Module ZPublisher.mapply, line 88, in mapply
* Module ZPublisher.Publish, line 39, in call_object
* Module OFS.CopySupport, line 231, in manage_renameObjects
* Module OFS.CopySupport, line 260, in manage_renameObject
* Module OFS.ObjectManager, line 276, in _setObject
* Module Products.CMFCore.CMFCatalogAware, line 148, in manage_afterAdd
* Module Products.CMFCore.CMFCatalogAware, line 177, in __recurse
* Module Products.CMFCore.CMFCatalogAware, line 148, in manage_afterAdd
* Module Products.CMFCore.CMFCatalogAware, line 177, in __recurse
* Module Products.CMFCore.CMFCatalogAware, line 148, in manage_afterAdd
* Module Products.CMFCore.CMFCatalogAware, line 177, in __recurse
* Module Products.CMFCore.CMFCatalogAware, line 147, in manage_afterAdd
* Module Products.CMFCore.CMFCatalogAware, line 42, in indexObject
* Module Products.CMFPlone.CatalogTool, line 56, in indexObject
* Module Products.CMFCore.CatalogTool, line 235, in catalog_object
* Module Products.ZCatalog.ZCatalog, line 528, in catalog_object
* Module Products.ZCatalog.Catalog, line 381, in catalogObject
* Module Products.ZCTextIndex.ZCTextIndex, line 163, in index_object
* Module Products.ZCTextIndex.ZCTextIndex, line 176, in _index_object
* Module Products.ZCTextIndex.OkapiIndex, line 58, in index_doc
* Module Products.ZCTextIndex.BaseIndex, line 108, in index_doc
* Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
* Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5: ordinal not in range(128)

Dieter Maurer wrote:
> Nick Bower wrote at 2004-10-8 16:41 +0200:
> > ...
> > Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
> > Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
> > UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5: ordinal not in range(128).
>
> In your lexicon operation a unicode and a non-unicode string are put together (this can happen internally during BTree traversal).
>
> Whenever such a thing happens, Python tries to convert the non-unicode string to unicode -- using its default encoding. This fails because the non-unicode string contains bytes that are not convertible with this encoding.
>
> In a later message you reported that setting Python's default encoding to "utf-8" gave you an "unexpected end of data" exception. This means that your non-unicode string is not utf-8 encoded.
>
> You should use as default encoding the encoding that is used for your non-unicode strings. If you do not know it, use an encoding that can map any 8-bit byte. Windows has a few of them (called "cpXXX", for CodePage; I do not know the correct XXX).

___
Zope-Dev maillist - [EMAIL PROTECTED]
http://mail.zope.org/mailman/listinfo/zope-dev
** No cross posts or HTML encoding! **
(Related lists -
http://mail.zope.org/mailman/listinfo/zope-announce
http://mail.zope.org/mailman/listinfo/zope )
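Nick's question above (finding which ids/titles are non-ASCII byte strings) could be approached roughly as follows. This is only a minimal sketch under stated assumptions: it is written for a current Python, where `bytes` stands in for the Python 2 `str` type discussed in the thread, and `find_suspect_values`, the record layout and the sample paths are hypothetical names invented for illustration. Walking the actual Zope object tree to collect the ids and titles is not shown.

```python
def find_suspect_values(records):
    """Return (path, field, value) for every byte string that is not pure ASCII.

    `records` is a hypothetical iterable of (path, {field: value}) pairs;
    collecting them from the ZODB is deliberately left out of this sketch.
    """
    suspects = []
    for path, fields in records:
        for field, value in sorted(fields.items()):
            if not isinstance(value, bytes):
                continue  # already a unicode string, nothing to decode
            try:
                value.decode("ascii")
            except UnicodeDecodeError:
                # This is the kind of value that breaks when it is mixed
                # with unicode under an ASCII default encoding.
                suspects.append((path, field, value))
    return suspects


# Made-up sample data: one clean object, one Latin-1 title, one UTF-8 title.
sample = [
    ("/site/folder/doc1", {"id": b"doc1", "title": "Plain unicode title"}),
    ("/site/folder/doc2", {"id": b"doc2", "title": b"Caf\xe9 menu"}),
    ("/site/folder/doc3", {"id": b"doc3", "title": b"na\xc3\xafve"}),
]

for path, field, value in find_suspect_values(sample):
    print(path, field, repr(value))
```

Only byte strings containing a byte above 0x7f are reported, so the list is a starting point for re-entering those values by hand rather than a fix in itself.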
[Zope-dev] Re: Showstopper UnicodeDecodeError on Zope import???
Wow Dieter - that's a really concise explanation. Knowing this background will allow us to fix the product. Thanks!

Dieter Maurer wrote:
> Nick Bower wrote at 2004-10-8 16:41 +0200:
> > ...
> > Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
> > Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
> > UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5: ordinal not in range(128).
>
> In your lexicon operation a unicode and a non-unicode string are put together (this can happen internally during BTree traversal).
>
> Whenever such a thing happens, Python tries to convert the non-unicode string to unicode -- using its default encoding. This fails because the non-unicode string contains bytes that are not convertible with this encoding.
>
> In a later message you reported that setting Python's default encoding to "utf-8" gave you an "unexpected end of data" exception. This means that your non-unicode string is not utf-8 encoded.
>
> You should use as default encoding the encoding that is used for your non-unicode strings. If you do not know it, use an encoding that can map any 8-bit byte. Windows has a few of them (called "cpXXX", for CodePage; I do not know the correct XXX).
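Dieter's suggestion to "use an encoding that can map any 8-bit byte" can be checked mechanically. The sketch below is an illustration only: `maps_every_byte` and the candidate list are assumptions chosen for this example, not something named in the thread. It tries to decode each of the 256 byte values with a given codec.

```python
def maps_every_byte(encoding):
    """Return True if each of the 256 single-byte values decodes without error."""
    for value in range(256):
        try:
            bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            return False
    return True


# Candidate codecs picked for illustration only.
candidates = ["ascii", "utf-8", "latin-1", "cp1252", "cp437"]
total = [name for name in candidates if maps_every_byte(name)]
print(total)
```

Note that Python's cp1252 codec leaves a handful of byte values undefined, so it does not qualify, whereas latin-1 accepts every byte. An encoding that never raises is not necessarily the right one: if the strings were written in a different encoding, decoding succeeds but yields the wrong characters.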
Re: [Zope-dev] Re: Showstopper UnicodeDecodeError on Zope import???
Maybe try with TextIndexNG :-)

-aj

--On Friday, 8 October 2004 17:42 +0200 Nick Bower <[EMAIL PROTECTED]> wrote:
> If I change Python's default encoding to utf-8 (which I shouldn't have to do anyway), I get the following slightly different error:
>
> Error Type: UnicodeDecodeError
> Error Value: 'utf8' codec can't decode byte 0xef in position 5: unexpected end of data
>
> This is really frustrating... :(
>
> Nick Bower wrote:
> > I'm trying to import a zexp export (a Plone site actually) from a Windows workstation to a Zope server I built on Linux RH9, but it fails with a UnicodeDecodeError:
> >
> > ...
> > Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
> > Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
> > UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5: ordinal not in range(128).
> >
> > What is confusing is that the zexp *will* import to another Linux Zope build I have (on Mdk10), and sys.getdefaultencoding() and locale.getdefaultlocale() on the 2 Linux machines are almost the same (the working one reports "en_GB, utf"/ascii and the non-working one reports "en_US, utf"/ascii), so I think this is a red herring.
> >
> > I can't for the life of me work out what is going wrong, as I built python myself on both Linux machines in exactly the same way.
> >
> > Can anyone help me please???
> >
> > Thanks, Nick
[Zope-dev] Re: Showstopper UnicodeDecodeError on Zope import???
If I change Python's default encoding to utf-8 (which I shouldn't have to do anyway), I get the following slightly different error:

Error Type: UnicodeDecodeError
Error Value: 'utf8' codec can't decode byte 0xef in position 5: unexpected end of data

This is really frustrating... :(

Nick Bower wrote:
> I'm trying to import a zexp export (a Plone site actually) from a Windows workstation to a Zope server I built on Linux RH9, but it fails with a UnicodeDecodeError:
>
> ...
> Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
> Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5: ordinal not in range(128).
>
> What is confusing is that the zexp *will* import to another Linux Zope build I have (on Mdk10), and sys.getdefaultencoding() and locale.getdefaultlocale() on the 2 Linux machines are almost the same (the working one reports "en_GB, utf"/ascii and the non-working one reports "en_US, utf"/ascii), so I think this is a red herring.
>
> I can't for the life of me work out what is going wrong, as I built python myself on both Linux machines in exactly the same way.
>
> Can anyone help me please???
>
> Thanks, Nick
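Both error messages quoted above can be reproduced with a six-byte string whose last byte is 0xef. The input `b"hello\xef"` is a made-up example, not the actual word from Nick's data, and the sketch is written for a current Python, where byte strings must be decoded explicitly rather than being converted implicitly as in the Python 2.3 setup of the thread.

```python
# Hypothetical input: 0xef is the final byte, at index 5.
word = b"hello\xef"


def try_decode(data, encoding):
    """Decode `data`, or describe where and why decoding failed."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "position %d: %s" % (exc.start, exc.reason)


ascii_result = try_decode(word, "ascii")      # byte above 0x7f, so ASCII rejects it
utf8_result = try_decode(word, "utf-8")       # 0xef starts a 3-byte sequence that never completes
latin1_result = try_decode(word, "latin-1")   # latin-1 accepts every byte value

print(ascii_result)
print(utf8_result)
print(repr(latin1_result))
```

In UTF-8, 0xef announces a three-byte sequence, so "unexpected end of data" means the string stopped before the two continuation bytes arrived. That is consistent with the byte string not being valid UTF-8 at the point where it is decoded, whatever its real encoding turns out to be.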
[Zope-dev] Re: Showstopper UnicodeDecodeError on Zope import???
Andreas Jung wrote:
> Are you really sure that the source and destination instance are running
> the *same*
> setup (means same Zope and Plone versions)?
>
> -aj

As sure as one can be when comparing a Plone 2.0.3 all-in-one Windows distribution with a custom-built Zope 2.7.0/Zope 2.7.2 + Plone 2.0.3 (all combinations tried) Unix server environment. (So, basically, not really.)

But surely no-one will advise me that this should matter for the purposes of importing/exporting as long as both are Plone 2.0.3. Besides, the problem is not consistent between Linux servers of the same versions (just compiled on different distros), and AFAIK there's no reason why objects should not be portable in this way: although the exception mentions CMFCore.CMFCatalog* and CMFPlone, the actual failure is way down in ZCTextIndex and ZCatalog, which is Zope. Or am I wrong?

nick

--On Friday, 8 October 2004 16:41 +0200 Nick Bower <[EMAIL PROTECTED]> wrote:
> I'm trying to import a zexp export (a Plone site actually) from a Windows workstation to a Zope server I built on Linux RH9, but it fails with a UnicodeDecodeError:
>
> ...
> Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
> Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5: ordinal not in range(128).
>
> What is confusing is that the zexp *will* import to another Linux Zope build I have (on Mdk10), and sys.getdefaultencoding() and locale.getdefaultlocale() on the 2 Linux machines are almost the same (the working one reports "en_GB, utf"/ascii and the non-working one reports "en_US, utf"/ascii), so I think this is a red herring.
>
> I can't for the life of me work out what is going wrong, as I built python myself on both Linux machines in exactly the same way.
>
> Can anyone help me please???
>
> Thanks, Nick
[Zope-dev] Re: Showstopper UnicodeDecodeError on Zope import???
I should mention that I've tried this on various combinations of Python 2.3.3/2.3.4 and Zope 2.7.0/2.7.2.

Nick Bower wrote:
> I'm trying to import a zexp export (a Plone site actually) from a Windows workstation to a Zope server I built on Linux RH9, but it fails with a UnicodeDecodeError:
>
> ...
> Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
> Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5: ordinal not in range(128).
>
> What is confusing is that the zexp *will* import to another Linux Zope build I have (on Mdk10), and sys.getdefaultencoding() and locale.getdefaultlocale() on the 2 Linux machines are almost the same (the working one reports "en_GB, utf"/ascii and the non-working one reports "en_US, utf"/ascii), so I think this is a red herring.
>
> I can't for the life of me work out what is going wrong, as I built python myself on both Linux machines in exactly the same way.
>
> Can anyone help me please???
>
> Thanks, Nick
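When comparing two builds as described above, it can help to print the interpreter's encoding settings in one fixed format and diff the output from both machines. The following is a small sketch for a current Python; `encoding_report` is a hypothetical helper. It uses `locale.getpreferredencoding()` instead of the `locale.getdefaultlocale()` call mentioned in the thread, because the latter is deprecated in recent Python versions.

```python
import locale
import platform
import sys


def encoding_report():
    """Collect the interpreter settings that influence string conversion."""
    return {
        "python": platform.python_version(),
        "default_encoding": sys.getdefaultencoding(),
        "filesystem_encoding": sys.getfilesystemencoding(),
        "preferred_encoding": locale.getpreferredencoding(False),
    }


report = encoding_report()
for key in sorted(report):
    print("%s = %s" % (key, report[key]))
```

Identical reports on both machines would support Nick's suspicion that the locale difference is a red herring and that the difference lies in the stored data or in the installed products.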