Re: [crossfire] Data File (Maps, Archetypes) Encodings

2007-02-06 Thread Mark Wedel
Christian Hujer wrote:
 Hello dear co-devs.
 
 
 We have a common problem with the text encodings of data files.
 
 Examples:
 * Daimonin used (until a few minutes ago) the ISO paragraph character 0xA7 for
 separating a map's sound spec from its name.
 * Daimonin uses the ISO degree character 0xB0 for highlights in messages.
 * Crossfire uses the a-circumflex character 0xE2 (â) for the name of a wine in
 map /maps/scorn/houses/house3.bas2.

  Not sure if this is still the case, but at one time there were some objects
that also used special characters - Mjølnir comes to mind.

 
 This leads to some problems.
 * Crossfire x11 client displays 0xE2 as a circumflex.
 * Crossfire gtk client displays 0xE2 as ? (tested by Ragnor).

  And it appears that the GTK2 client won't draw the entire line/message that
contains the bad character.
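
A minimal sketch of why one stray byte can take out a whole line, assuming the client hands the bytes to a strict UTF-8 decoder. The sample string is hypothetical (it is not the actual item name from the map), and substituting U+FFFD is only one possible fallback, not something any Crossfire client is known to do.

```python
# Hypothetical message containing the lone ISO-8859 byte 0xE2 (a-circumflex).
raw = b"a bottle of Ch\xe2teau wine"


def decode_strict(data: bytes):
    """Return the decoded text, or None if the bytes are not valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


def decode_lenient(data: bytes) -> str:
    """Keep the line, replacing each undecodable byte with U+FFFD."""
    return data.decode("utf-8", errors="replace")


strict_result = decode_strict(raw)    # the whole line is rejected
lenient_result = decode_lenient(raw)  # only the bad byte is lost
print(strict_result)
print(ascii(lenient_result))  # ascii() keeps the output safe on any terminal
```

With the lenient variant the rest of the message survives and only the bad byte is shown as a replacement character.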

 
 For both projects, it makes sense to rethink the file formats. I see three 
 possible solutions:
 
 1. Use US-ASCII text only.
 That means, only data files with bytes 0x13, 0x20-0x7E are valid.
 Pro: easy
 Pro: stable
 Pro: no changes required.
 Con: very limited solution

  And one that is currently not in use: as demonstrated, we already have some
non-ASCII characters making their way in.

 
 2. Use ISO-8859-15 text.
 That means, bytes 0x13, 0x20-0x7E, 0xA0-0xFF are valid.
 Pro: easy
 Con: clients need special handling for non-ascii chars if they are UTF-8 aware
 and run on UTF-8 systems (e.g. gtk client).
 Con: limited solution
 
 3. Use UTF-8 text.
 That means, only valid UTF-8 streams with Unicodes u0013, u0020-u007E, 
 u00A0-... are valid.
 Pro: future-proof
 Pro: Allows full unicode (e.g. Chinese chars if somebody likes, or even 
 klingon if the underlying system supports it).
 Con: clients need special handling.
 Con: Windows users or users of other ancient OS editions with no good UTF-8 
 support will have more problems than with ISO-8859-15.
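
The three options can be read as three validity checks on a data file's bytes. The sketch below is one way to express them; the function names are made up for illustration. It assumes the line feed (0x0A) is the one control character a line-based file needs; the post lists 0x13, and the sketch does not try to interpret that value.

```python
# Assumption: line feed is the only control character allowed.
LINE_FEED = 0x0A


def is_plain(value: int) -> bool:
    """True for the line feed and for printable ASCII 0x20-0x7E."""
    return value == LINE_FEED or 0x20 <= value <= 0x7E


def valid_us_ascii(data: bytes) -> bool:
    """Option 1: line feed plus bytes 0x20-0x7E only."""
    return all(is_plain(b) for b in data)


def valid_iso_8859_15(data: bytes) -> bool:
    """Option 2: option 1 plus bytes 0xA0-0xFF."""
    return all(is_plain(b) or b >= 0xA0 for b in data)


def valid_utf8(data: bytes) -> bool:
    """Option 3: well-formed UTF-8 whose code points are the line feed,
    U+0020-U+007E, or U+00A0 and above."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return all(is_plain(ord(ch)) or ord(ch) >= 0xA0 for ch in text)
```

Under these checks the lone byte 0xE2 passes option 2 only, while the same character stored as UTF-8 (0xC3 0xA2) passes both options 2 and 3, which is why a byte-level check alone cannot tell the two encodings apart.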
 
 I see two places, where the encoding needs to be specified:
 * Data files
 * Network protocol
 
 My favorite solution would be 3. UTF-8, followed by 1. US-ASCII. I dislike 2. 
 ISO-8859-15 very much.

  #3 probably makes the most sense, and at least for the gtk2 client, looks like
it would actually be handled properly (as the message generated on the wine
bottle is about an invalid utf8 character).

  Also, I'm not sure how easy #2 is - it is easy for a person writing the maps
or archetypes, but as demonstrated, pretty much all clients would have to do
special string handling.

  #3 does make it harder for people putting the strings in (I'd think the map
editors could try to do the right thing in those cases and convert ISO-8859-15
characters to unicode)
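
A rough sketch of what "doing the right thing" on save could look like, assuming legacy files are ISO-8859-15 and everything else is already UTF-8. The function name `to_utf8` is hypothetical and not taken from any existing editor. The check is a heuristic: a few ISO-8859-15 byte sequences also happen to be well-formed UTF-8 and would be left untouched.

```python
def to_utf8(data: bytes) -> bytes:
    """Return data as UTF-8, treating non-UTF-8 input as ISO-8859-15."""
    try:
        data.decode("utf-8")  # already well-formed UTF-8: leave it alone
        return data
    except UnicodeDecodeError:
        # Fall back to the assumed legacy encoding and re-encode.
        return data.decode("iso8859_15").encode("utf-8")


legacy = b"Mj\xf8lnir"        # o-with-stroke stored as the single byte 0xF8
converted = to_utf8(legacy)   # the same name, now with the bytes 0xC3 0xB8
```

Running the function a second time on its own output changes nothing, so an editor could apply it on every save.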

  So I'd vote unicode.  I'd suspect that for clients that don't support utf8,
things won't really be any more broken than right now - the client would display
a funky character instead of the correct one.  But I don't believe that would
break any portion of the clients or protocol.

___
crossfire mailing list
crossfire@metalforge.org
http://mailman.metalforge.org/mailman/listinfo/crossfire


Re: [crossfire] Data File (Maps, Archetypes) Encodings

2007-02-06 Thread tchize
At the precise moment of 02/06/07 09:03, Mark Wedel expressed himself in these terms:

(OT: Did not post for a long time, hello to everyone)

   #3 probably makes the most sense, and at least for the gtk2 client, looks like
 it would actually be handled properly (as the message generated on the wine
 bottle is about invalid utf8 character).

   Also, I'm not sure how easy #2 is - it is easy from a person writing the maps
 or archetypes, but as demonstrated, pretty much all clients would have to do
 special string handling.

Well, it's just a 256-entry table to convert those characters for the
destination system.
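
The 256-entry table could look like the sketch below. It assumes the source bytes are ISO-8859-15 and uses Python's built-in `iso8859_15` codec to fill the table; a C client would hard-code the same 256 values. The names `LATIN9_TO_UNICODE` and `latin9_to_text` are made up for illustration.

```python
# Index = ISO-8859-15 byte value, entry = the matching Unicode character.
LATIN9_TO_UNICODE = [bytes([b]).decode("iso8859_15") for b in range(256)]


def latin9_to_text(data: bytes) -> str:
    """Convert ISO-8859-15 bytes to text with one table lookup per byte."""
    return "".join(LATIN9_TO_UNICODE[b] for b in data)


name = latin9_to_text(b"Mj\xf8lnir")
euro = LATIN9_TO_UNICODE[0xA4]  # 0xA4 is the euro sign in ISO-8859-15
```

The table only covers the decoding direction; a client that also sends text would need the reverse mapping.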

   #3 does make it harder for people putting the strings in (I'd think the map
 editors could try to do the right thing in those cases and convert ISO-8859-15
 characters to unicode)

At least for the java editor, it's not a problem. Java natively supports
UTF-8 strings. On the other hand, java does not easily support
iso-8859-1 or iso-8859-15. Java supports UTF-8 and US-ASCII; support for the
other encoding formats is platform dependent.

   So I'd vote unicode.  I'd suspect that for clients that don't support utf8,
 things won't really be any more broken than right now - the client would display
 a funky character instead of the correct one.  But I don't believe that would
 break any portion of the clients or protocol.

Here is what some special utf-8 characters look like when drawn in
iso-8859-1:
utf-8 stored string: é - à - µ - ù - €
iso-8859-1 read string: é - à - µ - ù - €

Nothing very critical, I think...
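
The effect of reading UTF-8 bytes as ISO-8859-1 can be reproduced with a short sketch, using the same five sample characters. The helper name is hypothetical, and the output is passed through `ascii()` so that non-printing characters show up as escapes.

```python
# The five sample characters: e-acute, a-grave, micro sign, u-grave, euro.
samples = ["\u00e9", "\u00e0", "\u00b5", "\u00f9", "\u20ac"]


def misread_as_latin1(text: str) -> str:
    """Encode as UTF-8, then decode the bytes as if they were ISO-8859-1."""
    return text.encode("utf-8").decode("iso8859_1")


for ch in samples:
    wrong = misread_as_latin1(ch)
    print(ascii(ch), "->", ascii(wrong), f"({len(wrong)} characters)")
```

Each accented letter turns into two characters and the euro sign into three, so the text gets longer and uglier but nothing is lost: re-encoding the misread string as ISO-8859-1 gives back the original UTF-8 bytes.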


Re: [crossfire] Data File (Maps, Archetypes) Encodings

2007-02-06 Thread Nicolas Weeger (Laposte)
Hello.

 My favorite solution would be 3. UTF-8, followed by 1. US-ASCII. I dislike
 2. ISO-8859-15 very much.

Favorite solution is 3, UTF-8 :)
It works nicely even with non utf8-aware functions (strlen and such), and is
international so we'll not have any more issues with that :)
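
A small sketch of why byte-oriented functions keep working, under the assumption that "works" means they never stop early or split on a false match. `c_strlen` below is a stand-in that mimics C's strlen by counting bytes up to the first NUL. UTF-8 never uses a zero byte or any ASCII byte inside a multi-byte character, so the scan is safe, but the result is a byte count, not a character count.

```python
def c_strlen(buffer: bytes) -> int:
    """Count bytes up to the first NUL, as C's strlen does."""
    length = 0
    for byte in buffer:
        if byte == 0:
            break
        length += 1
    return length


name = "Mj\u00f8lnir"                     # 7 characters
encoded = name.encode("utf-8") + b"\x00"  # NUL-terminated, as in C
byte_length = c_strlen(encoded)           # bytes, not characters

# Every byte of a multi-byte UTF-8 sequence has the high bit set.
high_bit_only = all(b >= 0x80 for b in "\u00f8".encode("utf-8"))
```

Code that sizes buffers by strlen stays correct, while code that assumes one byte per displayed character (column alignment, truncation) would still need attention.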

Nicolas
-- 
http://nicolas.weeger.free.fr [A small site of images, texts, code, in short,
random stuff!]



Re: [crossfire] Data File (Maps, Archetypes) Encodings

2007-02-06 Thread Alex Schultz
Nicolas Weeger (Laposte) wrote:
 My favorite solution would be 3. UTF-8, followed by 1. US-ASCII. I dislike
 2. ISO-8859-15 very much.
 

 Favorite solution is 3, UTF-8 :)
 It works nicely even with non utf8-aware functions (strlen and such), and is
 international so we'll not have any more issues with that :)

Agreed again here, UTF-8 seems like a good idea to me. It seems there's a
strong consensus among most people in favor of UTF-8.

Alex
