Re: [Zope3-dev] Re: Existential question about BytesWidget v.s. ASCIIWidget

2005-06-13 Thread Jim Fulton

Stéphane Brunet wrote:

Jim Fulton wrote:



Oh, that. Wa.  Christian Theune almost had this fixed last year
at EuroPython.

This is also related to a File object refactoring that someone *almost*
finished recently.



What kind of refactoring ? Is it already in the trunk ?


No,

  http://svn.zope.org/Zope3/branches/jhauser-filefieldwidget/

Note that this happened before the recent (PyCon) Blob work.
It also doesn't reflect a number of comments I made
a few months back.



I don't know what your solution looks like at this point. But I'll note:

- File objects store Bytes data.  Not unicode.

I did not change anything in the IFile interface. Text or not, the 
content is always stored in the Bytes field.



- For text content, File objects want to keep track of an
  encoding.

This is not what I understood of the description of issue 302 on 
Collector (see 
http://mail.zope.org/pipermail/zope3-dev/2004-October/012371.html ).


My solution just converts the text input to UTF-8 before storage in 
the Bytes field. A UTF-8 specific widget (very similar to the future 
ASCIIAreaWidget) is used in the edit form. When the text file is 
displayed, the content is converted to unicode and afterwards to the 
preferred encoding of the user's browser. This works quite well on my 
computer...


This may be an improvement over what we have now.  However, you can't
count on UTF-8 in general.  The content may have been uploaded, in which
case you really don't know what the encoding is unless someone tells you.

You have no idea what the preferred encoding of the user's browser is.
You *should* be able to control what encoding they send back by
specifying an encoding on the generated form.  If you specify the encoding,
then modern browsers should reliably send back the same encoding.
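
To illustrate the point about tracking an encoding, here is a minimal sketch, written in modern Python rather than the Python 2 of this thread. It shows a file-like object that stores bytes and remembers the encoding they were written with. The names `TextFile`, `set_text`, `get_text` and `upload` are hypothetical and are not part of Zope's IFile; the UTF-8 default is an assumption.

```python
from dataclasses import dataclass


@dataclass
class TextFile:
    """Hypothetical stand-in for a File object: raw bytes plus a tracked encoding."""

    data: bytes = b""
    encoding: str = "utf-8"  # assumed default; uploaded content may declare another

    def set_text(self, text: str) -> None:
        # Encode unicode text to bytes before storage.
        self.data = text.encode(self.encoding)

    def get_text(self) -> str:
        # Decode with the recorded encoding instead of guessing.
        return self.data.decode(self.encoding)


def upload(raw: bytes, declared_encoding: str) -> TextFile:
    # For uploaded content the encoding is only known if someone states it.
    return TextFile(data=raw, encoding=declared_encoding)


# Text typed into an edit form is stored with the default encoding.
edited = TextFile()
edited.set_text("Stéphane")

# An upload arrives as bytes in whatever encoding the uploader used.
uploaded = upload("Stéphane".encode("latin-1"), "latin-1")
```

Both objects yield the same text from `get_text()`, although their stored bytes differ. Decoding the uploaded bytes as UTF-8 would fail, which is why assuming UTF-8 for all stored content is not safe.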


I expect that, in the long term (3.2?), we'll need to totally redo
Files to make them sane and to take advantage of ZODB Blobs.


Resolving issue 302 is on the todo list for 3.1.


Frankly, I don't think it should be a show stopper for 3.1.

However, one could take
advantage of this future refactoring in order to merge I18NFile and File 
into a single package. After all, a non-i18n file is just an i18n file 
with a default language...


I think that would be a bad idea.  There are a number of possible approaches
to management of content translation.  Trying to make all files be I18NFiles
would be too great a policy commitment.

BTW, I18NFile is also broken because of encoding problems. I was not 
able to enter non-ASCII characters in the text area... That's pretty 
problematic for an I18N-aware product :-D


It is really just a demo. :/

Jim

--
Jim Fulton   mailto:[EMAIL PROTECTED]   Python Powered!
CTO  (540) 361-1714    http://www.python.org
Zope Corporation http://www.zope.com   http://www.zope.org
___
Zope3-dev mailing list
Zope3-dev@zope.org
Unsub: http://mail.zope.org/mailman/options/zope3-dev/archive%40mail-archive.com



Re: [Zope3-dev] Re: Existential question about BytesWidget v.s. ASCIIWidget

2005-06-13 Thread Stéphane Brunet

Jim Fulton wrote:



Oh, that. Wa.  Christian Theune almost had this fixed last year
at EuroPython.

This is also related to a File object refactoring that someone *almost*
finished recently.


What kind of refactoring ? Is it already in the trunk ?



I don't know what your solution looks like at this point. But I'll note:

- File objects store Bytes data.  Not unicode.

I did not change anything in the IFile interface. Text or not, the 
content is always stored in the Bytes field.



- For text content, File objects want to keep track of an
  encoding.

This is not what I understood of the description of issue 302 on 
Collector (see 
http://mail.zope.org/pipermail/zope3-dev/2004-October/012371.html ).


My solution just converts the text input to UTF-8 before storage in 
the Bytes field. A UTF-8 specific widget (very similar to the future 
ASCIIAreaWidget) is used in the edit form. When the text file is 
displayed, the content is converted to unicode and afterwards to the 
preferred encoding of the user's browser. This works quite well on my 
computer...



I expect that, in the long term (3.2?), we'll need to totally redo
Files to make them sane and to take advantage of ZODB Blobs.

Resolving issue 302 is on the todo list for 3.1. However, one could take 
advantage of this future refactoring in order to merge I18NFile and File 
into a single package. After all, a non-i18n file is just an i18n file 
with a default language...


BTW, I18NFile is also broken because of encoding problems. I was not 
able to enter non-ASCII characters in the text area... That's pretty 
problematic for an I18N-aware product :-D


Stéphane




Re: [Zope3-dev] Re: Existential question about BytesWidget v.s. ASCIIWidget

2005-06-13 Thread Jim Fulton

Stéphane Brunet wrote:

Jim Fulton wrote:



First, you are confusing schema definitions and widgets.  You should
start from the definitions of the field types.


That was a typo... Sorry for the confusion :-P


As Derrick (sort of) suggested, Bytes fields are fields that contain
Python strings, as opposed to Text fields, which contain unicodes.
Bytes values can contain pretty much arbitrary string values.  For
example, a Bytes field could contain image data.

ASCII fields contain only 7-bit ascii data. ASCII fields were introduced
in recognition that many Bytes fields were being used in cases of source
code where the desire was, mainly, to avoid unicode.

I see! I have just found the definition of ASCII fields which are 
derivatives of Bytes field but with a validate function (for the 0-127 
range).
Even if ASCII can be stored in Bytes field, the choice has been made to 
separate the two types of fields in order to add this validation 
function. Right ?


Right
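
The relationship just confirmed, an ASCII field being a Bytes field with an extra range check, can be sketched as follows. This is a standalone illustration in modern Python and is not the zope.schema source; `NotASCII`, `validate_bytes` and `validate_ascii` are hypothetical names.

```python
class NotASCII(ValueError):
    """Hypothetical error raised when a value leaves the 0-127 range."""


def validate_bytes(value: bytes) -> None:
    # A Bytes-like field accepts any byte string, including binary data.
    if not isinstance(value, bytes):
        raise TypeError("expected a byte string")


def validate_ascii(value: bytes) -> None:
    # An ASCII-like field is a Bytes field plus a 7-bit range check.
    validate_bytes(value)
    if any(byte > 127 for byte in value):
        raise NotASCII(value)
```

With this split, `validate_bytes` accepts image data or UTF-8 encoded text unchanged, while `validate_ascii` rejects any value containing a byte above 127.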


There are lots of schemas that are using Bytes that should probably
use ASCII or Text instead.  I would say that most or all occurrences
of BytesLine should use ASCIILine instead. Unfortunately there is no
ASCIILine. Sigh.

The widgets are probably out of sync with these definitions.
I suspect that the Bytes widgets behave the way they do because
they were developed before we had an ASCII type.



If I try to sum up a little bit what I understand :
* concerning fields :
   - Bytes field should be used for raw binary data or byte-friendly 
text encoding (e.g. UTF-8) _except ASCII_.


Right

   - BytesLine field should be used for byte-friendly single-line text 
encoding (e.g. UTF-8) _except ASCII_.


Well, it's hard for me to believe that someone *really* wants to specify a type
that can contain more or less arbitrary binary data except for
a newline.  I don't really think we need BytesLine.


   - ASCII field should be used for multi-line ASCII text.
   - ASCIILine field for single-line ASCII text (e.g. MIME content type 
field in the "File" package), which must be added in zope.schema


Yup


* concerning widgets :
   - ASCIIWidget should be preferred to BytesWidget because its name 
clearly informs about the expected text encoding (although 
Bytes(Area)Widget accepts only ASCII text).


Widget names don't matter.  The widget names should match the
field names.

IMO, the default widget for a bytes line should be a file-upload widget.

   - ASCIIAreaWidget should be added in order to replace BytesAreaWidget 
for multiline ASCII fields.


Yes


   - The need for Bytes(Area)Widgets is just unclear in my mind...


Good. ;)  I see no point in such a thing. An upload widget should be
used instead.



It would be great for someone to try to get this cleaned up. :)

I would be ready to put some work on it as soon as everything is clear 
in my mind about the use of Bytes v.s. ASCII fields and 
Bytes(Area)Widgets. Moreover, this is related to the issue 302 which I 
am trying to solve (the job is almost done concerning the encoding 
problems).


Oh, that. Wa.  Christian Theune almost had this fixed last year
at EuroPython.

This is also related to a File object refactoring that someone *almost*
finished recently.

I don't know what your solution looks like at this point. But I'll note:

- File objects store Bytes data.  Not unicode.

- For text content, File objects want to keep track of an
  encoding.

I expect that, in the long term (3.2?), we'll need to totally redo
Files to make them sane and to take advantage of ZODB Blobs.

Jim




Re: [Zope3-dev] Re: Existential question about BytesWidget v.s. ASCIIWidget

2005-06-13 Thread Stéphane Brunet

Jim Fulton wrote:



First, you are confusing schema definitions and widgets.  You should
start from the definitions of the field types.


That was a typo... Sorry for the confusion :-P


As Derrick (sort of) suggested, Bytes fields are fields that contain
Python strings, as opposed to Text fields, which contain unicodes.
Bytes values can contain pretty much arbitrary string values.  For
example, a Bytes field could contain image data.

ASCII fields contain only 7-bit ascii data. ASCII fields were introduced
in recognition that many Bytes fields were being used in cases of source
code where the desire was, mainly, to avoid unicode.

I see! I have just found the definition of ASCII fields which are 
derivatives of Bytes field but with a validate function (for the 0-127 
range).
Even if ASCII can be stored in Bytes field, the choice has been made to 
separate the two types of fields in order to add this validation 
function. Right ?



There are lots of schemas that are using Bytes that should probably
use ASCII or Text instead.  I would say that most or all occurrences
of BytesLine should use ASCIILine instead. Unfortunately there is no
ASCIILine. Sigh.

The widgets are probably out of sync with these definitions.
I suspect that the Bytes widgets behave the way they do because
they were developed before we had an ASCII type.


If I try to sum up a little bit what I understand :
* concerning fields :
   - Bytes field should be used for raw binary data or byte-friendly 
text encoding (e.g. UTF-8) _except ASCII_.
   - BytesLine field should be used for byte-friendly single-line text 
encoding (e.g. UTF-8) _except ASCII_.

   - ASCII field should be used for multi-line ASCII text.
   - ASCIILine field for single-line ASCII text (e.g. MIME content type 
field in the "File" package), which must be added in zope.schema

* concerning widgets :
   - ASCIIWidget should be preferred to BytesWidget because its name 
clearly informs about the expected text encoding (although 
Bytes(Area)Widget accepts only ASCII text).
   - ASCIIAreaWidget should be added in order to replace 
BytesAreaWidget for multiline ASCII fields.

   - The need for Bytes(Area)Widgets is just unclear in my mind...



It would be great for someone to try to get this cleaned up. :)

I would be ready to put some work on it as soon as everything is clear 
in my mind about the use of Bytes v.s. ASCII fields and 
Bytes(Area)Widgets. Moreover, this is related to the issue 302 which I 
am trying to solve (the job is almost done concerning the encoding 
problems).


Stéphane





Re: [Zope3-dev] Re: Existential question about BytesWidget v.s. ASCIIWidget

2005-06-13 Thread Jim Fulton

Stéphane Brunet wrote:

Derrick Hudson wrote:


On Sun, Jun 12, 2005 at 08:20:36PM -0400, Stéphane Brunet wrote:
[... (read the thread if you want all the background info) ...]

| What is the "raison d'être" of ASCIIWidget v.s. BytesWidget if they 
| expect the same type of input (plain ASCII text) and store it the same 
| type of fields?


My interpretation is this:  BytesWidget will accept any byte in the
range 0x00 - 0xFF whereas the ASCIIWidget is intended to only accept
bytes in the range 0x20 - 0x7E and 0x09.  In other words I expect that
the ASCIIWidget will accept only printable characters from the ASCII
character set (IOW "text") but the BytesWidget will accept a sequence
of any arbitrary 8-bit value.

 

That's what I thought... However, Bytes(Area)Widgets only accept ASCII 
input, even if the Bytes widget accepts any bytes in the 0-255 range.


First, you are confusing schema definitions and widgets.  You should
start from the definitions of the field types.

As Derrick (sort of) suggested, Bytes fields are fields that contain
Python strings, as opposed to Text fields, which contain unicodes.
Bytes values can contain pretty much arbitrary string values.  For
example, a Bytes field could contain image data.

ASCII fields contain only 7-bit ascii data. ASCII fields were introduced
in recognition that many Bytes fields were being used in cases of source
code where the desire was, mainly, to avoid unicode.
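
The two readings of "ASCII" in this exchange differ in what they admit. A minimal sketch in modern Python, using the hypothetical helper names `is_printable_ascii` and `is_seven_bit`, compares Derrick's printable-range interpretation with the 7-bit definition given above:

```python
def is_printable_ascii(value: bytes) -> bool:
    # Derrick's reading: bytes 0x20-0x7E plus tab (0x09) only.
    return all(0x20 <= b <= 0x7E or b == 0x09 for b in value)


def is_seven_bit(value: bytes) -> bool:
    # The field definition described above: any 7-bit value (0-127).
    return all(b <= 0x7F for b in value)


samples = {
    "plain": b"text/plain",
    "two lines": b"line one\nline two",
    "utf-8": "raison d'être".encode("utf-8"),
}
for name, value in samples.items():
    print(name, is_printable_ascii(value), is_seven_bit(value))
```

The multi-line sample is where the two diverge: a newline (0x0A) is 7-bit data but falls outside the printable range, so only the 7-bit definition admits multi-line ASCII text. The UTF-8 sample fails both checks because "ê" encodes to bytes above 127.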

There are lots of schemas that are using Bytes that should probably
use ASCII or Text instead.  I would say that most or all occurrences
of BytesLine should use ASCIILine instead. Unfortunately there is no
ASCIILine. Sigh.

The widgets are probably out of sync with these definitions.
I suspect that the Bytes widgets behave the way they do because
they were developed before we had an ASCII type.

It would be great for someone to try to get this cleaned up. :)

Jim




Re: [Zope3-dev] Re: Existential question about BytesWidget v.s. ASCIIWidget

2005-06-13 Thread Stéphane Brunet

Stéphane Brunet wrote:


Derrick Hudson wrote:


On Sun, Jun 12, 2005 at 08:20:36PM -0400, Stéphane Brunet wrote:
[... (read the thread if you want all the background info) ...]

| What is the "raison d'être" of ASCIIWidget v.s. BytesWidget if they 
| expect the same type of input (plain ASCII text) and store it the same 
| type of fields?


My interpretation is this:  BytesWidget will accept any byte in the
range 0x00 - 0xFF whereas the ASCIIWidget is intended to only accept
bytes in the range 0x20 - 0x7E and 0x09.  In other words I expect that
the ASCIIWidget will accept only printable characters from the ASCII
character set (IOW "text") but the BytesWidget will accept a sequence
of any arbitrary 8-bit value.

 

That's what I thought... However, Bytes(Area)Widgets only accept ASCII 
input, even if the Bytes widget accepts any bytes in the 0-255 range.


Small correction: read ... even if the Bytes _field_ accepts any bytes 
in the 0-255 range.


Stéphane




Re: [Zope3-dev] Re: Existential question about BytesWidget v.s. ASCIIWidget

2005-06-13 Thread Stéphane Brunet

Derrick Hudson wrote:


On Sun, Jun 12, 2005 at 08:20:36PM -0400, Stéphane Brunet wrote:
[... (read the thread if you want all the background info) ...]

| What is the "raison d'être" of ASCIIWidget v.s. BytesWidget if they 
| expect the same type of input (plain ASCII text) and store it the same 
| type of fields?


My interpretation is this:  BytesWidget will accept any byte in the
range 0x00 - 0xFF whereas the ASCIIWidget is intended to only accept
bytes in the range 0x20 - 0x7E and 0x09.  In other words I expect that
the ASCIIWidget will accept only printable characters from the ASCII
character set (IOW "text") but the BytesWidget will accept a sequence
of any arbitrary 8-bit value.

 

That's what I thought... However, Bytes(Area)Widgets only accept ASCII 
input, even if the Bytes widget accepts any bytes in the 0-255 range.


Stéphane