DO NOT REPLY TO THIS MESSAGE.  INSTEAD, POST ANY RESPONSES TO THE LINK BELOW.

[STR New]

Link: http://www.fltk.org/str.php?L2348
Version: 1.3-current


It seems difficult to support all 8-bit encodings because there are
so many (this fact is the most compelling justification for UTF-8).
However, the case for supporting CP1252 as input has been considered
valid in recent discussions here because a large number of legacy
files use this encoding.

Matt's suggestion to replace each incorrect byte by its escaped
numerical value allows writing back the original data without change.
This is attractive, but seems incompatible with text editing, because
the new parts of the text will be UTF-8 whereas the old ones would
remain in the initial encoding.


Proposal for a new, FLTK-1.1.* -compatible API for 1.3.0:

int Fl_Text_Buffer::insertfile(const char *file, int pos, int buflen=128 *1024,
                Fl_Text_Buffer::read_options flags=5)
        
int Fl_Text_Buffer::loadfile(const char *file, int buflen=128 *1024, 
                Fl_Text_Buffer::read_options flags=5)

int Fl_Text_Buffer::appendfile(const char *file, int buflen=128 *1024, 
                Fl_Text_Buffer::read_options flags=5)
        
flags: ORing of
typedef enum {
        UTF8_cp1252 = 1, // inputs both UTF-8 and CP1252-encoded files
        UTF8 = 2,        // all non-UTF-8 bytes are replaced by the replacement character
        warn = 4         // warn if text buffer differs from initial file content
        } read_options;
        
return values:
        0: success
        1: error opening file for read access
        2: file read error 
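
To make the default flags=5 concrete, here is a minimal standalone sketch.
It does not use FLTK: the enum is a local stand-in that copies the values
from the proposal, and read_status, default_flags, has_flag and
check_default_flags are hypothetical names introduced only for this
illustration.

```cpp
#include <cassert>

// Local stand-in for the proposed Fl_Text_Buffer::read_options.
// Values are taken from the proposal; this is not FLTK code.
enum read_options {
    UTF8_cp1252 = 1,  // accept both UTF-8 and CP1252-encoded input
    UTF8        = 2,  // replace non-UTF-8 bytes by U+FFFD
    warn        = 4   // warn if the buffer differs from the file content
};

// Return codes listed in the proposal (the enumerator names are assumptions).
enum read_status {
    READ_OK         = 0,  // success
    READ_OPEN_ERROR = 1,  // error opening file for read access
    READ_IO_ERROR   = 2   // file read error
};

// The proposed default, flags=5, is UTF8_cp1252 ORed with warn.
const int default_flags = UTF8_cp1252 | warn;

// Hypothetical helper: true if the given option bit is set in flags.
inline bool has_flag(int flags, read_options opt) {
    return (flags & opt) != 0;
}

// Usage check: the default asks for CP1252 fallback plus a warning,
// and does not ask for plain replacement by U+FFFD.
inline void check_default_flags() {
    assert(default_flags == 5);
    assert(has_flag(default_flags, UTF8_cp1252));
    assert(has_flag(default_flags, warn));
    assert(!has_flag(default_flags, UTF8));
}
```

With these values a caller who wants strict UTF-8 with a warning would pass
UTF8 | warn, that is 6, instead of the default.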
        
static const char *warn_message = 
"Displayed text contains the UTF-8 recoding\n"
"of file content which was not UTF-8 encoded.\n"
"Some changes may have occurred";
        
Semantics:
If UTF8_cp1252 is applied, the file is read with fl_utf8decode, which accepts
both UTF-8 and CP1252 input and transcodes the latter to UTF-8.
If UTF8 is applied, all illegal bytes are replaced by the
replacement character '�' (U+FFFD).
If warn is applied, and the Fl_Text_Buffer content differs from the input
file because of UTF-8 transcoding, the text of warn_message is sent to
fl_alert() just before successful completion (return value is 0) of the
file operation. No message is displayed if the file operation returns
non-zero. This would also set the "dirty" flag of the Fl_Text_Buffer
object, if there is one.
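
The semantics above can be sketched without FLTK as follows. This is only
an illustration under stated assumptions, not the proposed implementation:
recode_to_utf8, decode_result, utf8_seq_len, append_utf8 and cp1252_high
are hypothetical names, the enum is a local copy of the proposed values,
and giving UTF8_cp1252 precedence when both encoding flags are set is a
choice made here (the proposal does not say). The "changed" field is the
condition under which warn would trigger the alert.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum read_options { UTF8_cp1252 = 1, UTF8 = 2, warn = 4 };  // local stand-in

// CP1252 bytes 0x80..0x9F as Unicode code points. The five bytes that
// CP1252 leaves undefined map to themselves (an assumption of this sketch).
static const unsigned cp1252_high[32] = {
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
};

// Length of the well-formed UTF-8 sequence starting at s[i], or 0 if the
// bytes at that position are not valid UTF-8.
static std::size_t utf8_seq_len(const std::string &s, std::size_t i) {
    unsigned char c = static_cast<unsigned char>(s[i]);
    std::size_t n;
    if (c < 0x80) return 1;
    else if (c >= 0xC2 && c <= 0xDF) n = 2;
    else if (c >= 0xE0 && c <= 0xEF) n = 3;
    else if (c >= 0xF0 && c <= 0xF4) n = 4;
    else return 0;                          // stray continuation or bad lead
    if (i + n > s.size()) return 0;         // truncated sequence
    for (std::size_t k = 1; k < n; ++k)
        if ((static_cast<unsigned char>(s[i + k]) & 0xC0) != 0x80) return 0;
    unsigned char c1 = static_cast<unsigned char>(s[i + 1]);
    // Reject overlong forms, surrogates and values above U+10FFFF.
    if ((c == 0xE0 && c1 < 0xA0) || (c == 0xED && c1 > 0x9F) ||
        (c == 0xF0 && c1 < 0x90) || (c == 0xF4 && c1 > 0x8F)) return 0;
    return n;
}

// Append a code point from the Basic Multilingual Plane as UTF-8.
static void append_utf8(std::string &out, unsigned cp) {
    assert(cp <= 0xFFFF);  // this sketch never needs more
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

struct decode_result {
    std::string text;  // UTF-8 text that would go into the buffer
    bool changed;      // true if any byte had to be transcoded or replaced
};

// Copy valid UTF-8 through unchanged; handle each invalid byte per flags.
decode_result recode_to_utf8(const std::string &in, int flags) {
    decode_result r;
    r.changed = false;
    std::size_t i = 0;
    while (i < in.size()) {
        std::size_t n = utf8_seq_len(in, i);
        if (n > 0) { r.text.append(in, i, n); i += n; continue; }
        unsigned char c = static_cast<unsigned char>(in[i++]);  // always >= 0x80
        unsigned cp = 0xFFFD;                  // replacement character
        if (flags & UTF8_cp1252)               // treat the byte as CP1252
            cp = (c < 0xA0) ? cp1252_high[c - 0x80] : c;
        append_utf8(r.text, cp);
        r.changed = true;
    }
    return r;
}
```

A caller honoring warn would show warn_message only when the file was read
successfully, the warn bit is set, and changed is true.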

Other options could be added in the future, for other encodings, and 
possibly also for a custom encoding.


