Re: [Zope3-dev] Re: Existential question about BytesWidget v.s. ASCIIWidget
Stéphane Brunet wrote:
> Jim Fulton wrote:
>> Oh, that. Wa. Christian Theune almost had this fixed last year at EuroPython. This is also related to a File object refactoring that someone *almost* finished recently.
>
> What kind of refactoring? Is it already in the trunk?

No, http://svn.zope.org/Zope3/branches/jhauser-filefieldwidget/

Note that this happened before the recent (PyCon) Blob work. It also doesn't reflect a number of comments I made a few months back.

>> I don't know what your solution looks like at this point. But I'll note:
>>
>> - File objects store Bytes data. Not unicode.
>
> I did not change anything in the IFile interface. Text or not, the content is always stored in the Bytes field.
>
>> - For text content, File objects want to keep track of an encoding.
>
> This is not what I understood from the description of issue 302 in the Collector (see http://mail.zope.org/pipermail/zope3-dev/2004-October/012371.html ). My solution just converts the text input to UTF-8 before storage in the Bytes field. A UTF-8 specific widget (very similar to the future ASCIIAreaWidget) is used in the edit form. When the text file is displayed, the content is converted to unicode and afterwards to the preferred encoding of the user's browser. This works quite well on my computer...

This may be an improvement over what we have now. However, you can't count on UTF-8 in general. The content may have been uploaded, in which case you really don't know what the encoding is unless someone tells you. You have no idea what the preferred encoding of the user's browser is. You *should* be able to control what encoding they send back by specifying an encoding on the generated form. If you specify the encoding, then modern browsers should reliably send back the same encoding.

>> I expect that, in the long term (3.2?), we'll need to totally redo Files to make them sane and to take advantage of ZODB Blobs.
>
> Resolving issue 302 is on the todo list for 3.1.

Frankly, I don't think it should be a show stopper for 3.1.

> However, one could take advantage of this future refactoring in order to merge I18NFile and File into a single package. After all, a non-i18n file is just an i18n file with a default language...

I think that would be a bad idea. There are a number of possible approaches to management of content translation. Trying to make all files be I18NFiles would be too great a policy commitment.

> BTW, I18NFile is also broken because of encoding problems. I was not able to enter non-ASCII characters in the text area... That's pretty problematic for an I18N-aware product :-D

It is really just a demo. :/

Jim

--
Jim Fulton  mailto:[EMAIL PROTECTED]  Python Powered!
CTO  (540) 361-1714  http://www.python.org
Zope Corporation  http://www.zope.com  http://www.zope.org
___
Zope3-dev mailing list
Zope3-dev@zope.org
Unsub: http://mail.zope.org/mailman/options/zope3-dev/archive%40mail-archive.com
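Jim's point that a File should store bytes and also keep track of an encoding can be illustrated with a minimal sketch. It is written for Python 3 (the thread predates it; at the time Bytes fields held Python 2 `str` values), and `StoredFile` and `decodes_as_utf8` are hypothetical names for illustration only, not part of Zope's File implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredFile:
    """Hypothetical stand-in for a File object: raw bytes plus a declared encoding."""

    data: bytes
    encoding: Optional[str] = None  # None means nobody told us the encoding

    def text(self) -> str:
        """Decode the stored bytes, refusing to guess when no encoding is recorded."""
        if self.encoding is None:
            raise ValueError("encoding unknown; refusing to guess")
        return self.data.decode(self.encoding)


def decodes_as_utf8(data: bytes) -> bool:
    """Report whether the bytes happen to be valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# An upload that was written as Latin-1, not UTF-8.
uploaded = "Stéphane".encode("latin-1")

# Assuming UTF-8 fails for this upload ...
utf8_ok = decodes_as_utf8(uploaded)

# ... while the recorded encoding recovers the original text.
recovered = StoredFile(uploaded, "latin-1").text()
```

The sketch only shows why "always UTF-8" cannot be assumed for uploaded content; how a real File object would learn its encoding (upload form, content-type header, user input) is left open, as it is in the thread.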
Re: [Zope3-dev] Re: Existential question about BytesWidget v.s. ASCIIWidget
Jim Fulton wrote:
> Oh, that. Wa. Christian Theune almost had this fixed last year at EuroPython. This is also related to a File object refactoring that someone *almost* finished recently.

What kind of refactoring? Is it already in the trunk?

> I don't know what your solution looks like at this point. But I'll note:
>
> - File objects store Bytes data. Not unicode.

I did not change anything in the IFile interface. Text or not, the content is always stored in the Bytes field.

> - For text content, File objects want to keep track of an encoding.

This is not what I understood from the description of issue 302 in the Collector (see http://mail.zope.org/pipermail/zope3-dev/2004-October/012371.html ). My solution just converts the text input to UTF-8 before storage in the Bytes field. A UTF-8 specific widget (very similar to the future ASCIIAreaWidget) is used in the edit form. When the text file is displayed, the content is converted to unicode and afterwards to the preferred encoding of the user's browser. This works quite well on my computer...

> I expect that, in the long term (3.2?), we'll need to totally redo Files to make them sane and to take advantage of ZODB Blobs.

Resolving issue 302 is on the todo list for 3.1. However, one could take advantage of this future refactoring in order to merge I18NFile and File into a single package. After all, a non-i18n file is just an i18n file with a default language...

BTW, I18NFile is also broken because of encoding problems. I was not able to enter non-ASCII characters in the text area... That's pretty problematic for an I18N-aware product :-D

Stéphane
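The pipeline Stéphane describes (encode form text as UTF-8 for storage, decode it for display, then re-encode for the browser) can be sketched roughly as follows. This is Python 3 and purely illustrative: `to_storage` and `to_response` are hypothetical helper names, and the choice to replace unencodable characters with XML character references is an assumption, not something the thread specifies.

```python
def to_storage(form_text: str) -> bytes:
    """Encode edited text as UTF-8 so it fits a bytes-only field."""
    return form_text.encode("utf-8")


def to_response(stored: bytes, charset: str) -> bytes:
    """Decode stored UTF-8 to text, then re-encode it for the response charset.

    Characters the target charset cannot represent become XML character
    references (an assumed policy; other fallbacks are possible).
    """
    text = stored.decode("utf-8")
    return text.encode(charset, errors="xmlcharrefreplace")


stored = to_storage("café")
as_latin1 = to_response(stored, "latin-1")  # é fits in Latin-1
as_ascii = to_response(stored, "ascii")     # é falls back to a character reference
```

Note that this only works when the stored bytes really are UTF-8, which holds for text entered through the edit form but not necessarily for uploaded files.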
Re: [Zope3-dev] Re: Existential question about BytesWidget v.s. ASCIIWidget
Stéphane Brunet wrote:
> Jim Fulton wrote:
>> First, you are confusing schema definitions and widgets. You should start from the definitions of the field types.
>
> That was a typo... Sorry for the confusion :-P
>
>> As Derrick (sort of) suggested, Bytes fields are fields that contain Python strings, as opposed to Text fields, which contain unicodes. Bytes values can contain pretty much arbitrary string values. For example, a Bytes field could contain image data. ASCII fields contain only 7-bit ascii data. ASCII fields were introduced in recognition that many Bytes fields were being used in cases of source code where the desire was, mainly, to avoid unicode.
>
> I see! I have just found the definition of ASCII fields, which are derived from Bytes fields but with a validate function (for the 0-127 range). Even if ASCII can be stored in a Bytes field, the choice has been made to separate the two types of fields in order to add this validation function. Right?

Right

>> There are lots of schemas that are using Bytes that should probably use ASCII or Text instead. I would say that most or all occurrences of BytesLine should use ASCIILine instead. Unfortunately there is no ASCIILine. Sigh. The widgets are probably out of sync with these definitions. I suspect that the Bytes widgets behave the way they do because they were developed before we had an ASCII type.
>
> If I try to sum up what I understand:
>
> * concerning fields:
>
>   - Bytes field should be used for raw binary data or byte-friendly text encoding (e.g. UTF-8) _except ASCII_.

Right

>   - BytesLine field should be used for byte-friendly single-line text encoding (e.g. UTF-8) _except ASCII_.

Well, it's hard for me to believe that someone *really* wants to specify a type that can contain more or less arbitrary binary data except for a newline. I don't really think we need BytesLine.

>   - ASCII field should be used for multi-line ASCII text.
>   - ASCIILine field for single-line ASCII text (e.g. the MIME content type field in the "File" package), which must be added to zope.schema

Yup

> * concerning widgets:
>
>   - ASCIIWidget should be preferred to BytesWidget because its name clearly informs about the expected text encoding (although Bytes(Area)Widget accepts only ASCII text).

Widget names don't matter. The widget names should match the field names. IMO, the default widget for a bytes line should be a file-upload widget.

>   - ASCIIAreaWidget should be added in order to replace BytesAreaWidget for multiline ASCII fields.

Yes

>   - The need for Bytes(Area)Widgets is just unclear in my mind...

Good. ;) I see no point in such a thing. An upload widget should be used instead.

>> It would be great for someone to try to get this cleaned up. :)
>
> I would be ready to put some work into it as soon as everything is clear in my mind about the use of Bytes vs. ASCII fields and Bytes(Area)Widgets. Moreover, this is related to issue 302, which I am trying to solve (the job is almost done concerning the encoding problems).

Oh, that. Wa. Christian Theune almost had this fixed last year at EuroPython. This is also related to a File object refactoring that someone *almost* finished recently.

I don't know what your solution looks like at this point. But I'll note:

- File objects store Bytes data. Not unicode.

- For text content, File objects want to keep track of an encoding.

I expect that, in the long term (3.2?), we'll need to totally redo Files to make them sane and to take advantage of ZODB Blobs.

Jim
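The field taxonomy summarised above (Bytes, BytesLine, ASCII, and the proposed ASCIILine) boils down to two independent constraints: "7-bit only" and "no newline". A minimal Python 3 sketch of that idea follows. It does not use zope.schema; `is_ascii`, `is_single_line`, and `validate` are hypothetical helpers, and treating only `\n` as a line break is an assumption.

```python
def is_ascii(value: bytes) -> bool:
    """True when every byte is in the 7-bit range 0-127."""
    return all(byte < 128 for byte in value)


def is_single_line(value: bytes) -> bool:
    """True when the value contains no newline (only "\\n" is checked here)."""
    return b"\n" not in value


# Which constraints each kind of field adds on top of "any bytes".
CHECKS = {
    "Bytes": [],
    "BytesLine": [is_single_line],
    "ASCII": [is_ascii],
    "ASCIILine": [is_ascii, is_single_line],
}


def validate(kind: str, value: bytes) -> bool:
    """Apply the constraints associated with the named field kind."""
    return all(check(value) for check in CHECKS[kind])


image_like = b"\x89PNG\n\x1a"      # arbitrary binary data
content_type = b"text/plain"       # the kind of value ASCIILine is meant for
```

Seen this way, BytesLine is the odd one out, which matches Jim's remark: it allows arbitrary binary data but forbids one particular byte.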
Re: [Zope3-dev] Re: Existential question about BytesWidget v.s. ASCIIWidget
Jim Fulton wrote:
> First, you are confusing schema definitions and widgets. You should start from the definitions of the field types.

That was a typo... Sorry for the confusion :-P

> As Derrick (sort of) suggested, Bytes fields are fields that contain Python strings, as opposed to Text fields, which contain unicodes. Bytes values can contain pretty much arbitrary string values. For example, a Bytes field could contain image data. ASCII fields contain only 7-bit ascii data. ASCII fields were introduced in recognition that many Bytes fields were being used in cases of source code where the desire was, mainly, to avoid unicode.

I see! I have just found the definition of ASCII fields, which are derived from Bytes fields but with a validate function (for the 0-127 range). Even if ASCII can be stored in a Bytes field, the choice has been made to separate the two types of fields in order to add this validation function. Right?

> There are lots of schemas that are using Bytes that should probably use ASCII or Text instead. I would say that most or all occurrences of BytesLine should use ASCIILine instead. Unfortunately there is no ASCIILine. Sigh. The widgets are probably out of sync with these definitions. I suspect that the Bytes widgets behave the way they do because they were developed before we had an ASCII type.

If I try to sum up what I understand:

* concerning fields:

  - Bytes field should be used for raw binary data or byte-friendly text encoding (e.g. UTF-8) _except ASCII_.
  - BytesLine field should be used for byte-friendly single-line text encoding (e.g. UTF-8) _except ASCII_.
  - ASCII field should be used for multi-line ASCII text.
  - ASCIILine field for single-line ASCII text (e.g. the MIME content type field in the "File" package), which must be added to zope.schema

* concerning widgets:

  - ASCIIWidget should be preferred to BytesWidget because its name clearly informs about the expected text encoding (although Bytes(Area)Widget accepts only ASCII text).
  - ASCIIAreaWidget should be added in order to replace BytesAreaWidget for multiline ASCII fields.
  - The need for Bytes(Area)Widgets is just unclear in my mind...

> It would be great for someone to try to get this cleaned up. :)

I would be ready to put some work into it as soon as everything is clear in my mind about the use of Bytes vs. ASCII fields and Bytes(Area)Widgets. Moreover, this is related to issue 302, which I am trying to solve (the job is almost done concerning the encoding problems).

Stéphane
Re: [Zope3-dev] Re: Existential question about BytesWidget v.s. ASCIIWidget
Stéphane Brunet wrote:
> Derrick Hudson wrote:
>> On Sun, Jun 12, 2005 at 08:20:36PM -0400, Stéphane Brunet wrote:
>> [... (read the thread if you want all the background info) ...]
>> | What is the "raison d'être" of ASCIIWidget vs. BytesWidget if they
>> | expect the same type of input (plain ASCII text) and store it in the same
>> | type of fields?
>>
>> My interpretation is this: BytesWidget will accept any byte in the range 0x00 - 0xFF whereas the ASCIIWidget is intended to only accept bytes in the range 0x20 - 0x7E and 0x09. In other words I expect that the ASCIIWidget will accept only printable characters from the ASCII character set (IOW "text") but the BytesWidget will accept a sequence of any arbitrary 8-bit value.
>
> That's what I thought... However, Bytes(Area)Widgets only accept ASCII input, even if the Bytes widget accepts any bytes in the 0-255 range.

First, you are confusing schema definitions and widgets. You should start from the definitions of the field types.

As Derrick (sort of) suggested, Bytes fields are fields that contain Python strings, as opposed to Text fields, which contain unicodes. Bytes values can contain pretty much arbitrary string values. For example, a Bytes field could contain image data. ASCII fields contain only 7-bit ascii data. ASCII fields were introduced in recognition that many Bytes fields were being used in cases of source code where the desire was, mainly, to avoid unicode.

There are lots of schemas that are using Bytes that should probably use ASCII or Text instead. I would say that most or all occurrences of BytesLine should use ASCIILine instead. Unfortunately there is no ASCIILine. Sigh.

The widgets are probably out of sync with these definitions. I suspect that the Bytes widgets behave the way they do because they were developed before we had an ASCII type.

It would be great for someone to try to get this cleaned up. :)

Jim
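Derrick's reading (printable characters 0x20 - 0x7E plus tab) and Jim's definition (any 7-bit value) are not the same rule. The Python 3 sketch below contrasts them; both function names are hypothetical and neither is taken from zope.schema.

```python
def is_7bit(value: bytes) -> bool:
    """Jim's definition of ASCII field content: every byte is in 0x00-0x7F."""
    return all(byte <= 0x7F for byte in value)


def is_printable_text(value: bytes) -> bool:
    """Derrick's interpretation: only 0x20-0x7E and tab (0x09) are accepted."""
    return all(0x20 <= byte <= 0x7E or byte == 0x09 for byte in value)


two_lines = b"first line\nsecond line"

# A newline (0x0A) is 7-bit, so it passes the first rule but not the second.
passes_7bit = is_7bit(two_lines)
passes_printable = is_printable_text(two_lines)
```

Under the stricter reading, multi-line text would be rejected, so only the 7-bit rule is compatible with using an ASCII field for multi-line content such as source code.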
Re: [Zope3-dev] Re: Existential question about BytesWidget v.s. ASCIIWidget
Stéphane Brunet wrote:
> Derrick Hudson wrote:
>> On Sun, Jun 12, 2005 at 08:20:36PM -0400, Stéphane Brunet wrote:
>> [... (read the thread if you want all the background info) ...]
>> | What is the "raison d'être" of ASCIIWidget vs. BytesWidget if they
>> | expect the same type of input (plain ASCII text) and store it in the same
>> | type of fields?
>>
>> My interpretation is this: BytesWidget will accept any byte in the range 0x00 - 0xFF whereas the ASCIIWidget is intended to only accept bytes in the range 0x20 - 0x7E and 0x09. In other words I expect that the ASCIIWidget will accept only printable characters from the ASCII character set (IOW "text") but the BytesWidget will accept a sequence of any arbitrary 8-bit value.
>
> That's what I thought... However, Bytes(Area)Widgets only accept ASCII input, even if the Bytes widget accepts any bytes in the 0-255 range.

Small correction: read

... even if the Bytes _field_ accepts any bytes in the 0-255 range.

Stéphane
Re: [Zope3-dev] Re: Existential question about BytesWidget v.s. ASCIIWidget
Derrick Hudson wrote:
> On Sun, Jun 12, 2005 at 08:20:36PM -0400, Stéphane Brunet wrote:
> [... (read the thread if you want all the background info) ...]
> | What is the "raison d'être" of ASCIIWidget vs. BytesWidget if they
> | expect the same type of input (plain ASCII text) and store it in the same
> | type of fields?
>
> My interpretation is this: BytesWidget will accept any byte in the range 0x00 - 0xFF whereas the ASCIIWidget is intended to only accept bytes in the range 0x20 - 0x7E and 0x09. In other words I expect that the ASCIIWidget will accept only printable characters from the ASCII character set (IOW "text") but the BytesWidget will accept a sequence of any arbitrary 8-bit value.

That's what I thought... However, Bytes(Area)Widgets only accept ASCII input, even if the Bytes widget accepts any bytes in the 0-255 range.

Stéphane