[Zope-dev] Re: Showstopper UnicodeDecodeError on Zope import???

2004-11-05 Thread Nick Bower
I have just returned from vacation and am looking at this issue in more
detail.  Dieter's explanation seems logical, especially considering the
following traceback when I try and move a folder containing these
specific CMF object types.
Looking at the objects, they have plain text ids (I assume by looking
only), but unicode titles (again by just looking in ZMI).  The latter
were probably copied and pasted from somewhere into a CMF product's
editing screen, which I presume is the problem here.
So following Dieter's explanation, is it possible to find and identify
which objects have the non-unicode/non-ascii ids/titles using some
python?  I'm assuming that I could then edit the offending object's
id/title in the ZMI to "re-unicode" it.
Changing the default character encoding of my production server for this
one application just isn't an option unfortunately.
Thanks, nick
Traceback (innermost last):
* Module ZPublisher.Publish, line 101, in publish
* Module ZPublisher.mapply, line 88, in mapply
* Module ZPublisher.Publish, line 39, in call_object
* Module OFS.CopySupport, line 231, in manage_renameObjects
* Module OFS.CopySupport, line 260, in manage_renameObject
* Module OFS.ObjectManager, line 276, in _setObject
* Module Products.CMFCore.CMFCatalogAware, line 148, in manage_afterAdd
* Module Products.CMFCore.CMFCatalogAware, line 177, in __recurse
* Module Products.CMFCore.CMFCatalogAware, line 148, in manage_afterAdd
* Module Products.CMFCore.CMFCatalogAware, line 177, in __recurse
* Module Products.CMFCore.CMFCatalogAware, line 148, in manage_afterAdd
* Module Products.CMFCore.CMFCatalogAware, line 177, in __recurse
* Module Products.CMFCore.CMFCatalogAware, line 147, in manage_afterAdd
* Module Products.CMFCore.CMFCatalogAware, line 42, in indexObject
* Module Products.CMFPlone.CatalogTool, line 56, in indexObject
* Module Products.CMFCore.CatalogTool, line 235, in catalog_object
* Module Products.ZCatalog.ZCatalog, line 528, in catalog_object
* Module Products.ZCatalog.Catalog, line 381, in catalogObject
* Module Products.ZCTextIndex.ZCTextIndex, line 163, in index_object
* Module Products.ZCTextIndex.ZCTextIndex, line 176, in _index_object
* Module Products.ZCTextIndex.OkapiIndex, line 58, in index_doc
* Module Products.ZCTextIndex.BaseIndex, line 108, in index_doc
* Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
* Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5:
ordinal not in range(128)
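
A rough sketch of how the question above (finding the objects whose ids or titles are non-ASCII byte strings) might be approached. It is written for a current Python and would need adapting for the Python 2.3 used in this thread (for example, `str` in place of `bytes`). The `Item` class is a hypothetical stand-in, not a Zope class; the sketch only assumes that real objects offer `getId()`, a `title` attribute and, for folders, `objectValues()`. In a real Zope, acquisition can make the `objectValues` test succeed on non-folders, so the check would probably have to be made on the unwrapped object (`aq_base`).

```python
class Item(object):
    """Hypothetical stand-in for a Zope content object (not a Zope class)."""

    def __init__(self, id, title="", children=()):
        self._id = id
        self.title = title
        self._children = list(children)

    def getId(self):
        return self._id

    def objectValues(self):
        return self._children


def has_high_bytes(value):
    """Return True if value is a byte string holding any byte >= 0x80."""
    if not isinstance(value, bytes):
        return False  # already a unicode string, or not text at all
    return any(b >= 0x80 for b in bytearray(value))


def find_suspects(container, path=()):
    """Yield (path, attribute_name, value) for each non-ASCII byte string.

    Assumption: every object has getId() and may have a title attribute;
    folders additionally have objectValues().
    """
    for obj in container.objectValues():
        obj_path = path + (obj.getId(),)
        candidates = (("id", obj.getId()), ("title", getattr(obj, "title", None)))
        for name, value in candidates:
            if has_high_bytes(value):
                yield obj_path, name, value
        # With real Zope acquisition this test needs the unwrapped object.
        if hasattr(obj, "objectValues"):
            for hit in find_suspects(obj, obj_path):
                yield hit


# Made-up example tree, for illustration only.
site = Item("site", children=[
    Item("news", title=u"News", children=[
        Item("item1", title=b"Caf\xef"),    # byte string with a high byte
        Item("item2", title=u"Caf\u00ef"),  # proper unicode title
    ]),
    Item("about", title=b"About"),          # pure ASCII byte string
])

suspects = list(find_suspects(site))
for hit in suspects:
    print(hit)
```

Only `item1` is reported here: its title is a byte string containing 0xEF, whereas the unicode title and the pure-ASCII byte string are left alone.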

Dieter Maurer wrote:
Nick Bower wrote at 2004-10-8 16:41 +0200:
...
Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5:
ordinal not in range(128).

In your lexicon operation a unicode and a non-unicode string are
put together (this can happen internally during BTree traversal).
Whenever such a thing happens, Python tries to convert the
non-unicode string to unicode, using its default encoding.
This fails because the non-unicode string contains bytes that cannot
be converted with this encoding.
In a later message you reported that setting Python's default
encoding to "utf-8" gave you an "unexpected end of data" exception.
This means that your non-unicode string is not utf-8 encoded.
You should use as default encoding the encoding that is
used for your non unicode strings.
If you do not know it, use an encoding that can map any 8 bit byte.
Windows has a few of them (called "cpXXX" (for CodePage);
I do not know the correct XXX).
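
The explanation above can be illustrated with explicit decoding, which is roughly what Python 2 attempts implicitly when it mixes the two string types. This is only a sketch: the byte string `raw` is made up (it merely places 0xEF at position 5), and `try_decode` is a helper introduced here. As for an encoding that "can map any 8 bit byte": Python's `latin-1` codec accepts all 256 byte values, while its `cp1252` codec leaves a few values undefined.

```python
def try_decode(data, encoding):
    """Return (True, text) on success or (False, reason) on failure."""
    try:
        return True, data.decode(encoding)
    except UnicodeDecodeError as exc:
        return False, exc.reason


raw = b"hello\xef"  # hypothetical data; byte 0xEF sits at position 5

# The conversion Python 2 tries with its default "ascii" encoding.
ascii_result = try_decode(raw, "ascii")

# latin-1 maps every byte to the code point of the same value.
latin1_result = try_decode(raw, "latin-1")

# Feed every possible byte value to both candidate codecs.
all_bytes = bytes(bytearray(range(256)))
latin1_all = try_decode(all_bytes, "latin-1")
cp1252_all = try_decode(all_bytes, "cp1252")

print(ascii_result)
print(latin1_result)
print(latin1_all[0], cp1252_all[0])
```

Decoding with `latin-1` never fails, but it only yields the right characters if the bytes really were written in that encoding; otherwise the error disappears and wrong characters are stored instead.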

___
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://mail.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
http://mail.zope.org/mailman/listinfo/zope-announce
http://mail.zope.org/mailman/listinfo/zope )


[Zope-dev] Re: Showstopper UnicodeDecodeError on Zope import???

2004-10-11 Thread Nick Bower
Wow Dieter - that's a really concise explanation.  It'll allow us to fix 
the product knowing this background.  Thanks!

Dieter Maurer wrote:
Nick Bower wrote at 2004-10-8 16:41 +0200:
...
Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5:
ordinal not in range(128).

In your lexicon operation a unicode and a non-unicode string are
put together (this can happen internally during BTree traversal).
Whenever such a thing happens, Python tries to convert the
non-unicode string to unicode, using its default encoding.
This fails because the non-unicode string contains bytes that cannot
be converted with this encoding.
In a later message you reported that setting Python's default
encoding to "utf-8" gave you an "unexpected end of data" exception.
This means that your non-unicode string is not utf-8 encoded.
You should use as default encoding the encoding that is
used for your non unicode strings.
If you do not know it, use an encoding that can map any 8 bit byte.
Windows has a few of them (called "cpXXX" (for CodePage);
I do not know the correct XXX).


Re: [Zope-dev] Re: Showstopper UnicodeDecodeError on Zope import???

2004-10-08 Thread Andreas Jung
Maybe try with TextIndexNG :-)
-aj
--On Friday, 8 October 2004 17:42 +0200 Nick Bower
<[EMAIL PROTECTED]> wrote:

If I change Python's default encoding to utf-8 (which I shouldn't have to
do anyway), I get the following slightly different error:
Error Type: UnicodeDecodeError
Error Value: 'utf8' codec can't decode byte 0xef in position 5:
unexpected end of data
This is really frustrating... :(
Nick Bower wrote:
I'm trying to import a zexp export (a Plone site actually) from a
windows workstation to a zope server I built on Linux RH9 but it fails
with a UnicodeDecodeError:
...
Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5:
ordinal not in range(128).
What is confusing is that the zexp *will* import to another linux zope
build I have (on Mdk10) and sys.getdefaultencoding() and
locale.getdefaultlocale() on the 2 linux machines are almost the same
(the working one reports "en_GB, utf"/ascii and the non-working one
reports "en_US, utf"/ascii), so I think this is a red herring.  I can't
for the life of me work out what is going wrong as I built python myself
on both Linux machines exactly the same.
Can anyone help me please???
Thanks, Nick


[Zope-dev] Re: Showstopper UnicodeDecodeError on Zope import???

2004-10-08 Thread Nick Bower
If I change Python's default encoding to utf-8 (which I shouldn't have 
to do anyway), I get the following slightly different error:

Error Type: UnicodeDecodeError
Error Value: 'utf8' codec can't decode byte 0xef in position 5: 
unexpected end of data

This is really frustrating... :(
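
A small sketch of what the second error message is consistent with. In UTF-8, 0xEF is the first byte of a three-byte sequence, so a byte string that stops right after it is reported as ending unexpectedly. The sample byte strings and the helper `utf8_failure` are made up for illustration; the actual data in the zexp is not known.

```python
def utf8_failure(data):
    """Return (position, reason) if data is not valid UTF-8, else None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, exc.reason
    return None


# 0xEF as the last byte: the sequence it starts is never completed.
lone = utf8_failure(b"hello\xef")

# 0xEF followed by ordinary ASCII letters also fails, for a different reason.
followed = utf8_failure(b"hello\xefab")

# 0xEF followed by two valid continuation bytes decodes without error.
complete = utf8_failure(b"hello\xef\xbb\xbf")

# The same bytes read as latin-1 give an ordinary accented letter.
latin1_text = b"hello\xef".decode("latin-1")

print(lone)
print(followed)
print(complete)
print(repr(latin1_text))
```

If the string had been written in a single-byte encoding such as latin-1, the byte 0xEF would simply be the letter "ï", which would explain why neither "ascii" nor "utf-8" can decode it.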
Nick Bower wrote:
I'm trying to import a zexp export (a Plone site actually) from a 
windows workstation to a zope server I built on Linux RH9 but it fails 
with a UnicodeDecodeError:

...
Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5:
ordinal not in range(128).

What is confusing is that the zexp *will* import to another linux zope 
build I have (on Mdk10) and sys.getdefaultencoding() and 
locale.getdefaultlocale() on the 2 linux machines are almost the same 
(the working one reports "en_GB, utf"/ascii and the non-working one 
reports "en_US, utf"/ascii), so I think this is a red herring.  I can't 
for the life of me work out what is going wrong as I built python myself 
on both Linux machines exactly the same.

Can anyone help me please???
Thanks, Nick


[Zope-dev] Re: Showstopper UnicodeDecodeError on Zope import???

2004-10-08 Thread Nick Bower
Andreas Jung wrote:
> Are you really sure that the source and destination instance are running
> the *same*
> setup (means same Zope and Plone versions)?
>
> -aj
As sure as one can be when comparing a Plone 2.0.3 all-in-one Windows
distribution with a custom-built Zope 2.7.0/2.7.2 + Plone 2.0.3 (all
combinations tried) Unix server environment.  (So basically, not really.)

But surely no-one will advise me that this should matter for the 
purposes of importing/exporting as long as both are Plone 2.0.3.

Besides, the problem is not consistent between Linux servers running the
same versions (just compiled on different distros) and AFAIK, there's no
reason why objects should not be portable in this way because although
the traceback mentions CMFCore.CMFCatalog* and CMFPlone, the actual
failure is way down in ZCTextIndex and ZCatalog, which are Zope.  Or am I wrong?

nick

--On Friday, 8 October 2004 16:41 +0200 Nick Bower
<[EMAIL PROTECTED]> wrote:

I'm trying to import a zexp export (a Plone site actually) from a windows
workstation to a zope server I built on Linux RH9 but it fails with a
UnicodeDecodeError:
...
Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5:
ordinal not in range(128).
What is confusing is that the zexp *will* import to another linux zope
build I have (on Mdk10) and sys.getdefaultencoding() and
locale.getdefaultlocale() on the 2 linux machines are almost the same
(the working one reports "en_GB, utf"/ascii and the non-working one
reports "en_US, utf"/ascii), so I think this is a red herring.  I can't
for the life of me work out what is going wrong as I built python myself
on both Linux machines exactly the same.
Can anyone help me please???
Thanks, Nick


[Zope-dev] Re: Showstopper UnicodeDecodeError on Zope import???

2004-10-08 Thread Nick Bower
I should mention that I've tried this on various combinations of python 
2.3.3/2.3.4 and zope 2.7.0/2.7.2.

Nick Bower wrote:
I'm trying to import a zexp export (a Plone site actually) from a 
windows workstation to a zope server I built on Linux RH9 but it fails 
with a UnicodeDecodeError:

...
Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5:
ordinal not in range(128).

What is confusing is that the zexp *will* import to another linux zope 
build I have (on Mdk10) and sys.getdefaultencoding() and 
locale.getdefaultlocale() on the 2 linux machines are almost the same 
(the working one reports "en_GB, utf"/ascii and the non-working one 
reports "en_US, utf"/ascii), so I think this is a red herring.  I can't 
for the life of me work out what is going wrong as I built python myself 
on both Linux machines exactly the same.

Can anyone help me please???
Thanks, Nick
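
For comparing two machines as described above, a minimal sketch that collects the encodings an interpreter reports. It uses `locale.getpreferredencoding()` rather than the `locale.getdefaultlocale()` call mentioned in the message, and the function name `encoding_report` is introduced here. Note that on Python 3 the default encoding is always "utf-8", so only on Python 2 does the first value say anything about implicit conversions.

```python
import locale
import sys


def encoding_report():
    """Collect the encodings this interpreter reports for itself."""
    return {
        "default": sys.getdefaultencoding(),
        "locale": locale.getpreferredencoding(False),
        "filesystem": sys.getfilesystemencoding(),
    }


report = encoding_report()
for name in sorted(report):
    print("%-10s %s" % (name, report[name]))
```

Running the same script on both servers and comparing the output line by line would show whether the interpreters really agree on these settings.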