On Jul 11, 2012, at 02:04, Henri Gourvest wrote:

> On 10/07/2012 19:04, dhruvbird wrote:
>> Why not utf-8
> It support UTF-8, I don't know why you think it doesn't do

You are currently calling node-xml-lite an "XML ANSI/Unicode SAX parser".


The "Unicode" portion of that description is redundant since by definition all 
XML documents are composed of characters from the Unicode character set:

http://www.w3.org/TR/REC-xml/#charsets

There are numerous character encodings that can represent characters from the 
Unicode character set. All XML parsers must support UTF-8 and UTF-16, and they 
may support others as well. The XML specification provides guidance on how a 
parser can detect which encoding is being used:

http://www.w3.org/TR/REC-xml/#sec-guessing
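
To make that guidance concrete, here is a rough sketch of the leading-byte 
checks it describes. It covers only some of the patterns in that appendix (no 
EBCDIC, no unusual UCS-4 byte orders), and guessXmlEncoding is a name I made 
up for illustration; it is not part of node-xml-lite or any other library.

```javascript
// Partial sketch of the leading-byte table in Appendix F of the XML spec.
// guessXmlEncoding is a hypothetical helper, not an existing API.
function guessXmlEncoding(buf) {
  const startsWith = (...bytes) =>
    buf.length >= bytes.length && bytes.every((b, i) => buf[i] === b);

  // Four-byte marks first: FF FE 00 00 would otherwise look like UTF-16LE.
  if (startsWith(0x00, 0x00, 0xfe, 0xff)) return 'UCS-4BE';
  if (startsWith(0xff, 0xfe, 0x00, 0x00)) return 'UCS-4LE';

  // Byte order marks.
  if (startsWith(0xef, 0xbb, 0xbf)) return 'UTF-8';
  if (startsWith(0xfe, 0xff)) return 'UTF-16BE';
  if (startsWith(0xff, 0xfe)) return 'UTF-16LE';

  // No byte order mark: look at how the leading "<?" is laid out.
  if (startsWith(0x00, 0x3c, 0x00, 0x3f)) return 'UTF-16BE';
  if (startsWith(0x3c, 0x00, 0x3f, 0x00)) return 'UTF-16LE';

  // Anything else is not decided here. A real parser would now read the
  // encoding declaration, and assume UTF-8 if there is none.
  return null;
}
```

A null result means "not decided by the leading bytes", which is the common 
case for documents that start with a plain "<?xml" and rely on the encoding 
declaration or the UTF-8 default.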


"ANSI" does not refer to a particular character encoding. It might refer to the 
various Windows code pages:

http://en.wikipedia.org/wiki/Windows_code_page#ANSI_code_page

By listing "ANSI" in the description, are you trying to say that you explicitly 
support all of those character encodings? Or you might be referring 
specifically to the Windows-1252 character encoding only. If it is your 
intention to indicate what character encodings your parser supports, then to 
reduce confusion, it might be better to list those character encodings by their 
most common unambiguous names.

The Windows code pages are not ISO or ANSI standards, though several of them 
extend ISO standards. For example, Windows-1252 matches ISO-8859-1 except in 
the range 0x80-0x9F, where it assigns printable characters in place of the C1 
control codes. If you're going to claim compatibility with Windows-1252, then 
you're probably also compatible with ISO-8859-1.


"Why not utf-8" was not asking why UTF-8 isn't supported; it goes without 
saying that it is. Rather, it was in response to your question whether you 
should change "ANSI" to "ASCII" in your module's description:

>> On Jul 10, 2012, at 06:25, Henri Gourvest wrote:
>> 
>>> what name should I use instead ?
>>> ASCII ?

"ASCII" of course refers to the 7-bit character encoding of which many other 
character encodings (including the Windows code pages and UTF-8) are 
byte-compatible supersets:

http://en.wikipedia.org/wiki/ASCII

Thus mentioning that you support ASCII is redundant, since by specification you 
are required to support UTF-8, and UTF-8 includes all of ASCII.
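
If it helps to see that in Node terms, here is a small check using only the 
built-in Buffer class. The sample string is arbitrary; any text made up solely 
of ASCII characters should behave the same way.

```javascript
// An ASCII-only string produces the same bytes whether it is encoded as
// ASCII or as UTF-8, which is why "supports ASCII" adds nothing to
// "supports UTF-8".
const text = 'Why not utf-8';

const asAscii = Buffer.from(text, 'ascii');
const asUtf8 = Buffer.from(text, 'utf8');
const asUtf16 = Buffer.from(text, 'utf16le');

// True: the two buffers hold identical byte sequences.
const sameBytes = asAscii.equals(asUtf8);

// UTF-16 is not byte-compatible with ASCII: each of these characters
// takes two bytes there instead of one.
const utf16IsTwiceAsLong = asUtf16.length === 2 * asUtf8.length;

console.log({ sameBytes, utf16IsTwiceAsLong });
```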


~ ~ ~


In the end, it comes down to what you wrote earlier:

On Jul 10, 2012, at 02:42, Henri Gourvest wrote:

> On 10/07/2012 08:02, dhruvbird wrote:
>> What do you mean by "ANSI parser"?
> 
> ANSI = 1 byte/character
> Unicode in Js = 2 byte/character (UCS2)
> It is an ANSI parser because it can parse inputs from a Buffer that is an 
> array of bytes.
> Most of times XML is encoded in UTF8.
> UTF8 is a special ANSI code page where all unicode characters can be encoded. 
> In UTF8 characters can be encoded on one or more bytes.


Based on this, I think you don't mean "ANSI" at all. The American National 
Standards Institute had nothing to do with Unicode or UTF-8. The Unicode 
Standard is maintained by the Unicode Consortium and kept in sync with 
ISO/IEC 10646, published by the International Organization for 
Standardization. UTF-8 was designed by Ken Thompson and Rob Pike, and is 
defined in an RFC (Request For Comments) document, RFC 3629.

I think you're just trying to convey the idea that, in the event of multibyte 
characters, the buffer you're using might contain partial / incomplete / 
fractional characters, and that your parser is able to accommodate that 
situation, presumably by waiting to handle them until the rest of the bytes 
that make up the character have arrived in the buffer. That's great, and I'd 
just suggest that you explain it as such at the beginning of your README, and 
discard the "Unicode", "ANSI" or "ASCII" labels, since that's not what they 
convey. The first paragraph of your README should be written to give users an 
understanding of what your module is for and to clearly differentiate it from 
the alternatives.
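
To illustrate the partial-character situation I'm describing, here is a 
minimal sketch using Node's built-in string_decoder module. I'm not claiming 
this is how node-xml-lite handles it internally; it only shows the general 
idea of holding back incomplete bytes until the rest arrive. The chunk split 
is contrived for the example.

```javascript
const { StringDecoder } = require('string_decoder');

// U+20AC EURO SIGN is three bytes in UTF-8: E2 82 AC.
// Pretend the input arrived split across two chunks.
const firstChunk = Buffer.from([0xe2, 0x82]); // incomplete character
const secondChunk = Buffer.from([0xac]);      // the final byte

// Naive approach: decoding each chunk on its own cannot recover the
// character, so replacement characters (U+FFFD) appear instead.
const naive = firstChunk.toString('utf8') + secondChunk.toString('utf8');

// Buffered approach: StringDecoder keeps the incomplete bytes and emits
// the character only once all of its bytes have been written.
const decoder = new StringDecoder('utf8');
const afterFirst = decoder.write(firstChunk);   // nothing complete yet
const afterSecond = decoder.write(secondChunk); // the whole character
const buffered = afterFirst + afterSecond + decoder.end();

console.log(JSON.stringify({ naive, buffered }));
```

A sentence in the README saying that the parser accepts Buffer chunks and 
copes with characters split across chunk boundaries would tell users more 
than any of the labels do.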


