[wpkg-users] [Bug 269] config.xml file has wrong encoding

2012-04-09 Thread bugzilla-daemon
http://bugzilla.wpkg.org/show_bug.cgi?id=269

Rainer Meier r.me...@wpkg.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |r.me...@wpkg.org
         Resolution|                            |FIXED

--- Comment #2 from Rainer Meier r.me...@wpkg.org  ---
I saved it using Eclipse, which does not seem to write a BOM. It looks like it
was stored as UTF-8 but without a BOM. Opening it with PSPad in UTF-8 mode
does not show any issues with the German, French or Spanish translations. Your
sample also shows that the special characters are encoded using two bytes
(UTF-8). Most likely your editor interpreted the file as ANSI or some other
8-bit charset, which makes each two-byte UTF-8 character show up as two wrong
characters.
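
The effect can be reproduced in a few lines of Python. This is only a sketch:
the sample string is made up, and Windows-1252 is an assumption about which
"ANSI" code page the editor fell back to.

```python
# Hypothetical sample text; any string with a non-ASCII letter would do.
text = "Täst"

# UTF-8 stores "ä" (U+00E4) as the two bytes 0xC3 0xA4.
utf8_bytes = text.encode("utf-8")

# An editor that assumes an 8-bit code page (assumed here: Windows-1252)
# maps each of those two bytes to a character of its own.
misread = utf8_bytes.decode("cp1252")

# Decoding with the correct charset restores the original text.
correct = utf8_bytes.decode("utf-8")

print(utf8_bytes)  # b'T\xc3\xa4st'
print(misread)     # TÃ¤st
print(correct)     # Täst
```

The bytes on disk are identical in both cases; only the charset used to
interpret them differs.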

Just out of curiosity, which editor are you using? It seems to ignore the
encoding declaration in the XML header too.

Nevertheless I've added the BOM header and committed it.
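
For reference, adding the BOM amounts to prepending the three bytes EF BB BF.
A minimal sketch in Python follows; the helper name ensure_utf8_bom and the
sample document are hypothetical and not part of WPKG.

```python
import codecs


def ensure_utf8_bom(data: bytes) -> bytes:
    """Return data with a UTF-8 BOM prepended unless it already has one."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


# Hypothetical stand-in for the bytes of config.xml.
sample = '<?xml version="1.0" encoding="UTF-8"?><config/>'.encode("utf-8")
with_bom = ensure_utf8_bom(sample)

print(with_bom[:3])  # b'\xef\xbb\xbf'
```

Calling the helper a second time leaves the data unchanged, so it is safe to
run on a file that already carries a BOM.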

FIX: Fixed UTF-8 with BOM encoding for config.xml.
 Fixes Bug 269. Thanks to Stefan Pendl.


br,
Rainer

-- 
Configure bugmail: http://bugzilla.wpkg.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the QA contact for the bug.
-
wpkg-users mailing list archives  http://lists.wpkg.org/pipermail/wpkg-users/
___
wpkg-users mailing list
wpkg-users@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/wpkg-users


[wpkg-users] [Bug 269] config.xml file has wrong encoding

2012-04-09 Thread bugzilla-daemon
http://bugzilla.wpkg.org/show_bug.cgi?id=269

Stefan Pendl pendl2mega...@yahoo.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

--- Comment #3 from Stefan Pendl pendl2mega...@yahoo.de  ---
Hi Rainer,

In general Notepad++ is happy with ANSI as well as UTF-8 encoded files, but
this seems not to be the case for this file.

The BOM doesn't hurt and makes sure that any editor, not just specialized
ones, displays and edits the text correctly.

--
Stefan



Re: [wpkg-users] [Bug 269] config.xml file has wrong encoding

2012-04-09 Thread Malte Starostik
Hi Stefan,

On Monday, 9 April 2012, 11:14:43, bugzilla-dae...@bugzilla.wpkg.org wrote:
 http://bugzilla.wpkg.org/show_bug.cgi?id=269
 In general Notepad++ is happy with ANSI as well as UTF-8 encoded files, but
 this seems not to be the case for this file.
 
 The BOM doesn't hurt and makes sure that any editor, not just specialized
 ones, displays and edits the text correctly.

please don't get me wrong, all of what you say in this bug is correct :)
However, there is IMHO a subtle problem with BOMs in any file that is supposed 
to be machine-readable.  Without a BOM, any UTF-8 encoded file will be 
correctly parsed by anything that digests 8-bit ASCII files.  A BOM can break 
this in sometimes hard to debug ways, as it's usually not visible.  Imagine a 
simple key=value based config file:

message=Içh ßiñ €in Täst ☺

Without a BOM, this will be correctly handled by even the most stupid parser.  
As long as the code consuming the data from that parser is aware of the UTF-8 
encoding, all is fine.  But when you add a BOM, the parser sees the key as 
"<BOM>message" rather than "message", finds no match, and fails miserably.
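
A minimal sketch of that failure in Python. The parser below is a made-up,
deliberately naive one and not any real tool; it splits raw bytes on "=" and
knows nothing about encodings.

```python
import codecs


def parse_config(raw: bytes) -> dict:
    """Naive key=value parser working on raw bytes, unaware of encodings."""
    result = {}
    for line in raw.splitlines():
        if b"=" in line:
            key, value = line.split(b"=", 1)
            result[key] = value
    return result


line = "message=Içh ßiñ €in Täst ☺\n".encode("utf-8")

plain = parse_config(line)
with_bom = parse_config(codecs.BOM_UTF8 + line)

# Without a BOM the key is found and the value decodes cleanly as UTF-8.
print(plain[b"message"].decode("utf-8"))

# With a BOM the three invisible bytes become part of the first key.
print(list(with_bom))         # [b'\xef\xbb\xbfmessage']
print(b"message" in with_bom)  # False

# A BOM-aware decoder such as Python's "utf-8-sig" strips the mark first.
cleaned = (codecs.BOM_UTF8 + line).decode("utf-8-sig")
print(cleaned.startswith("message"))  # True
```

The BOM only affects the first key of a file, which is part of what makes
this kind of bug hard to spot.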

Any XML tool must definitely cope with the presence of a BOM.  But then an XML 
file without an explicitly specified encoding and without a BOM must be UTF-8 
encoded anyway.  So as you already said, the BOM helps non-specialized editors.  
Right, but personally I've had those invisible buggers bite me several times 
while they never did me any good ;)
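
Both points can be checked with Python's standard-library XML parser. This is
a sketch only: the miniature document is hypothetical and stands in for
config.xml.

```python
import codecs
import xml.etree.ElementTree as ET

# Hypothetical miniature document with no encoding declaration,
# so an XML parser has to treat it as UTF-8.
doc = "<config><string>Täst</string></config>".encode("utf-8")

# Parse the same bytes once without and once with a leading UTF-8 BOM.
without_bom = ET.fromstring(doc)
with_bom = ET.fromstring(codecs.BOM_UTF8 + doc)

print(without_bom.find("string").text)
print(with_bom.find("string").text)
```

Both calls yield the same text, so for an XML consumer the BOM is neither
required nor harmful; it only matters to editors that guess the charset.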

{Gosh, I do feel like ranting. It's not directed at you, but should emphasize 
why I consider adding a BOM a valid but unfortunate fix:}
<rant>
I'd assert that BOMs are a kludge that should be used very sparingly.  In 
fact, as the byte order is fixed in UTF-8, the BOM that is customary and 
necessary with UTF-16/UCS-2 degenerates into a mere flag marking the text as 
Unicode.  What an epic fail ;) Anything non-UTF-8 should be flagged as 
"Danger: obsolete encoding inside" instead, and those ISO-8859-*, WIN125* and 
whatnot should go die in a fire.  AFAIK among all current OSs, only Windows 
still doesn't default to UTF-8 in text files.  Plus MS has this evil habit of 
assuming Unicode = UCS-2, which totally breaks ASCII compatibility, breaks all 
protocols that can't deal with NUL bytes in text streams etc. (I've seen 
Outlook Express send e-mails with UCS-2 content as text/plain, no charset 
given, no transfer encoding and all those lovely NULs inside...) Yeah, UCS-2 
is a Unicode encoding and fine for internal processing (if all you need is the 
BMP).  But when serializing to a file, UTF-8 is the way to go, no BOMs needed.
</rant>
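
To illustrate the NUL-byte and BMP remarks, here is a small Python sketch.
Python has no separate UCS-2 codec, so "utf-16-le" is used as a stand-in; the
two produce identical bytes for characters inside the BMP.

```python
text = "Test"

as_utf8 = text.encode("utf-8")
as_utf16 = text.encode("utf-16-le")  # two bytes per BMP character, like UCS-2

# Pure ASCII text is byte-for-byte identical in UTF-8.
print(as_utf8 == text.encode("ascii"))  # True

# In the 16-bit encoding every ASCII character drags a NUL byte along.
print(as_utf16)             # b'T\x00e\x00s\x00t\x00'
print(b"\x00" in as_utf16)  # True
print(b"\x00" in as_utf8)   # False

# A character outside the BMP (U+1F600) does not fit into one 16-bit unit:
# UTF-16 needs a surrogate pair, which plain UCS-2 cannot represent.
outside_bmp = "\U0001F600"
print(len(outside_bmp.encode("utf-16-le")))  # 4
print(len(outside_bmp.encode("utf-8")))      # 4
```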

Kind regards,
Malte


[wpkg-users] [Bug 269] config.xml file has wrong encoding

2012-04-08 Thread bugzilla-daemon
http://bugzilla.wpkg.org/show_bug.cgi?id=269

--- Comment #1 from Loren M. Lang lor...@north-winds.org  ---
On 4/8/2012 11:14 AM, bugzilla-dae...@bugzilla.wpkg.org wrote:
 http://bugzilla.wpkg.org/show_bug.cgi?id=269

 Summary: config.xml file has wrong encoding
 Product: WPKG
 Version: other
Platform: PC
  OS/Version: All
  Status: NEW
Severity: major
Priority: P2
   Component: config files
  AssignedTo: man...@wpkg.org
  ReportedBy: pendl2mega...@yahoo.de
   QAContact: wpkg-users@lists.wpkg.org


 The config.xml file is encoded as ANSI, which renders the foreign characters
 for German and Spanish useless.
 The file must be saved as UTF-8 with BOM to correct this.

 Spanish example:

 La utilidad de instalación automática de software está actualizando el
 sistema.
And on that note, Bugzilla should be properly declaring its content 
type and charset for UTF-8 with something like this:

Content-Type: text/plain; charset=utf-8
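
As a rough sketch of how a mail sender could produce such a header, here is
the Python standard-library email package. This is not how Bugzilla itself
builds its mails (it is not a Python application); the message body is just
the Spanish example from the report.

```python
from email.message import EmailMessage

msg = EmailMessage()
# Declaring the charset explicitly makes the library emit a matching
# Content-Type header and a suitable transfer encoding.
msg.set_content(
    "La utilidad de instalación automática de software está actualizando el sistema.",
    charset="utf-8",
)

content_type = msg.get_content_type()
charset = msg.get_content_charset()

print(content_type)  # text/plain
print(charset)       # utf-8
```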



 Thanks in advance,
 Stefan


