Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-25 Thread Markus Schaber
Hi, Hannu, Hannu Krosing wrote: Are you sure it's UCS-4 ? I've always thought that XML is what is given in xml tag, and utf-8 if no charset is given. You have to distinguish between the supported charset, and the document encoding. UCS-4 and UTF-8 are both encodings for UNICODE see:

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-25 Thread Markus Schaber
Hi, Bruce, Bruce Momjian wrote: I don't think that any of our SGML documentation is actually in UCS-4 encoding. The source files use nothing beyond plain ASCII (and should remain that way, IMHO) so there isn't any need to inquire very far into exactly what the toolchain thinks the document

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-24 Thread Peter Eisentraut
Alvaro Herrera wrote: On the other hand, I don't understand why DocBook would be Latin-1 only. What would be the point of that limitation? Some googling seems to reveal that people indeed use other charsets, UTF-8 in particular (but also Big5, Latin-2, etc), so apparently this isn't set in

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-24 Thread Hannu Krosing
On a fine day, Sun, 2006-09-24 at 10:20, Peter Eisentraut wrote: Alvaro Herrera wrote: On the other hand, I don't understand why DocBook would be Latin-1 only. What would be the point of that limitation? Some googling seems to reveal that people indeed use other charsets, UTF-8 in

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-24 Thread Markus Schaber
Hi, Hannu, Hannu Krosing wrote: Are you sure it's UCS-4 ? I've always thought that XML is what is given in xml tag, and utf-8 if no charset is given. You have to distinguish between the supported charset, and the document encoding. HTH, Markus -- Markus Schaber | Logical Tracking&Tracing

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-24 Thread David Fetter
On Sun, Sep 24, 2006 at 10:20:22AM +0200, Peter Eisentraut wrote: Alvaro Herrera wrote: On the other hand, I don't understand why DocBook would be Latin-1 only. What would be the point of that limitation? Some googling seems to reveal that people indeed use other charsets, UTF-8 in

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-24 Thread Hannu Krosing
On a fine day, Sun, 2006-09-24 at 14:56, Markus Schaber wrote: Hi, Hannu, Hannu Krosing wrote: Are you sure it's UCS-4 ? I've always thought that XML is what is given in xml tag, and utf-8 if no charset is given. You have to distinguish between the supported charset, and the

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-24 Thread Andrew Dunstan
Hannu Krosing wrote: On a fine day, Sun, 2006-09-24 at 14:56, Markus Schaber wrote: Hi, Hannu, Hannu Krosing wrote: Are you sure it's UCS-4 ? I've always thought that XML is what is given in xml tag, and utf-8 if no charset is given. You have to distinguish between

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-24 Thread Peter Eisentraut
Andrew Dunstan wrote: If we want to quote references, we should quote the XML standard. For example, see here to see the exact charset supported by XML: http://www.w3.org/TR/2006/REC-xml11-20060816/#charsets. The actual cause of the processing problems we have been seeing are the character set

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-24 Thread Hannu Krosing
On a fine day, Mon, 2006-09-25 at 00:23, Peter Eisentraut wrote: Andrew Dunstan wrote: If we want to quote references, we should quote the XML standard. For example, see here to see the exact charset supported by XML: http://www.w3.org/TR/2006/REC-xml11-20060816/#charsets. The

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-24 Thread Tom Lane
Hannu Krosing [EMAIL PROTECTED] writes: I don't think that any of our SGML documentation is actually in UCS-4 encoding. The source files use nothing beyond plain ASCII (and should remain that way, IMHO) so there isn't any need to inquire very far into exactly what the toolchain thinks the

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-24 Thread Bruce Momjian
Tom Lane wrote: Hannu Krosing [EMAIL PROTECTED] writes: I don't think that any of our SGML documentation is actually in UCS-4 encoding. The source files use nothing beyond plain ASCII (and should remain that way, IMHO) so there isn't any need to inquire very far into exactly what the

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-24 Thread Martijn van Oosterhout
On Sun, Sep 24, 2006 at 07:38:20PM -0400, Tom Lane wrote: Hannu Krosing [EMAIL PROTECTED] writes: I don't think that any of our SGML documentation is actually in UCS-4 encoding. The source files use nothing beyond plain ASCII (and should remain that way, IMHO) so there isn't any need to

Re: [HACKERS] pgsql: We're going to have to spell dotless i as plain i, because

2006-09-23 Thread Martijn van Oosterhout
On Fri, Sep 22, 2006 at 12:29:05PM -0300, Tom Lane wrote: Log Message: --- We're going to have to spell dotless i as plain i, because dotless i is not in the character set supported by DocBook nor standard HTML. (Sorry Volkan.) Also replace random character-set references by a

Re: [HACKERS] pgsql: We're going to have to spell dotless i as plain i, because

2006-09-23 Thread Peter Eisentraut
Martijn van Oosterhout wrote: Well you could always use the HTML4 &#305; which most tools should understand. At least browsers have good support for this kind of entity. Please review the recent thread on pgsql-docs before reiterating all the suggestions. -- Peter Eisentraut

Re: [HACKERS] pgsql: We're going to have to spell dotless i as plain i, because

2006-09-23 Thread Martijn van Oosterhout
On Sat, Sep 23, 2006 at 11:54:47AM +0200, Peter Eisentraut wrote: Martijn van Oosterhout wrote: Well you could always use the HTML4 &#305; which most tools should understand. At least browsers have good support for this kind of entity. Please review the recent thread on pgsql-docs before

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-23 Thread Bruce Momjian
Martijn van Oosterhout wrote: -- Start of PGP signed section. On Sat, Sep 23, 2006 at 11:54:47AM +0200, Peter Eisentraut wrote: Martijn van Oosterhout wrote: Well you could always use the HTML4 &#305; which most tools should understand. At least browsers have good support for this kind of

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-23 Thread Martijn van Oosterhout
On Sat, Sep 23, 2006 at 08:49:02AM -0400, Bruce Momjian wrote: That's not how I understand it. The document encoding is only related to how high-bit characters are interpreted, I am told by Peter, but for some reason the toolchain just doesn't support UTF8, even though if you use &#305; in

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-23 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes: So to me (a more docbook novice) it seems like it's the stylesheet that's limiting you to latin1, not the docbook parser. But the stylesheet in question is part of the basic docbook infrastructure, so the above distinction is academic. (Or at

Re: [HACKERS] pgsql: We're going to have to spell dotless i as plain i, because

2006-09-23 Thread Peter Eisentraut
Martijn van Oosterhout wrote: Oh sorry, it wasn't clear from the commit entry. It's not that DocBook doesn't support the character or that it can't be represented. It's just not supported in the document encoding we're using. No, no, and no. The reason that it doesn't work is that the

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-23 Thread Martijn van Oosterhout
On Sat, Sep 23, 2006 at 12:27:51PM -0400, Tom Lane wrote: To my mind the real problem is that one of the principal output formats we are interested in is HTML, and there is no dotless-i entity in any version of the HTML standard. I trust I need not point out again the difference between my

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-23 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes: I created a simple docbook document on my computer with &inodot; and ran openjade over and in the output file it is converted to &#305;. I experimented with that, and openjade didn't complain about it, but it renders in my browser (Safari) as Have

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-23 Thread Alvaro Herrera
Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: I created a simple docbook document on my computer with &inodot; and ran openjade over and in the output file it is converted to &#305;. I experimented with that, and openjade didn't complain about it, but it renders in my

Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-23 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes: So maybe your Openjade is not exactly the same Martijn was using, because what I understood was that Openjade replaced the &inodot; with &#305;, which should work. I think it's more likely that he was running with a non-DocBook stylesheet (his openjade