Re: character
Joerg and Arved, Thanks for sorting this out while I was asleep. I talk about these things in terms of the parser, in spite of the offence it might give to specification purists, because that is where I have spent a lot of my time lately. J.Pietschmann wrote: Don't look at XML AttValue, look at the XSLFO property expression language. Somehow it is implicit that all attributes in a XSLFO document are parsed as expressions which are defined in 5.9 Expressions. This is the critical point. The namespace not only restricts the elements and attributes, but imposes itself on the contents of the attribute values passed in by the XML parser. I need to think about this a bit more, but it seems to me that the recent ruling on string with respect to the format attribute, which makes my flesh creep every time I think about it, disguises an attempt to smuggle part of the Transform namespace's constraints into the Format namespace. They are completely different expression environments, which is why it doesn't work. Has anyone else given this any thought? Peter -- Peter B. West [EMAIL PROTECTED] http://www.powerup.com.au/~pbwest/ Lord, to whom shall we go? - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, email: [EMAIL PROTECTED]
Re: character
Peter B. West wrote: This is the critical point. The namespace not only restricts the elements and attributes, but imposes itself on the contents of the attribute values passed in by the XML parser. Umm, the namespace does not impose anything. It's the XSLFO spec which defines the semantics of some elements and XML attribute values. That said elements happen to be in a certain namespace is not really relevant for getting something formatted. I need to think about this a bit more, but it seems to me that the recent ruling on string with respect to the format attribute, which makes my flesh creep every time I think about it, disguises an attempt to smuggle part of the Transform namespace's constraints into the Format namespace. They are completely different expression environments, which is why it doesn't work. Has anyone else given this any thought? Where does XSLT come into the picture? The whole thing is specified in the XSLFO spec, section 5. The expressions which make up property values in the end come from 5.9ff. The expression language used by XSLT, XPath, is an entirely different beast (I don't think this is much of an advantage). J.Pietschmann - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, email: [EMAIL PROTECTED]
RE: character
-Original Message- From: Peter B. West [mailto:[EMAIL PROTECTED]] Sent: September 30, 2002 11:24 PM To: [EMAIL PROTECTED] Subject: Re: character Arved Sandstrom wrote: -Original Message- From: Tony Graham [mailto:[EMAIL PROTECTED]] Peter B. West wrote at 30 Sep 2002 13:28:18 +1000: Tony Graham wrote: [EMAIL PROTECTED] wrote at 27 Sep 2002 16:44:32 -0300: ... That means -, #12235 , etc are characters, while '1' is not. #12235; is a character reference. '#12235' is how you talk about a character's code point, although the hexadecimal representation is usually preferable. In XSL terms, '1' is a one-character string literal, but while you could claim that it is one character, there's no XSL conversion from a string to a character, so fo:character character='1'/ should fail. Tony, I don't think this gets us out of difficulty. A casual inspection Forgive me, but I wasn't trying to get anybody out of any difficulty, I was just trying to keep the terminology accurate. ... So how do I represent a character? To me, the cleanest, least ambiguous way is to represent a character attribute assignment value with 'character' - a string literal of length 1. Except that you know that that's not specified among the allowed conversions. The interesting thing is that 'character' doesn't appear in the productions in Section 5.9, Expressions, of the XSL Recommendation. Now there's a question for [EMAIL PROTECTED]! I think that you represent a character as a single character, e.g., character=c, or as a numeric character reference, e.g., character=#xA;. I agree with this last, after having digested everything. Point is well taken that we have some points to nitpick with xsl-editors, mostly about disambiguating some of the language. Arved, Help me here. I must be missing something. What is it that you agree with? That the spec, as worded, leaves us with character=c or character=#x63; which amounts to the same thing? Yes, this is what I agree with. If so, fair enough. 
> Do you also agree that c is an NCName? And that character="-" is a parsing error?

Well, the production for NCName doesn't live in isolation, with reference to http://www.w3.org/TR/REC-xml-names/#ns-decl. Yes, c fits the production, but it's really an NCName when you have also declared the namespace.

Why is character="-" a parsing error? The XML Recommendation has at least one example of an attribute value that contains a hyphen. Maybe _I_ am missing something here. ;-)

> As far as I can see, the only immediate ways forward are to descend into the mire of context-dependent parsing (which the editors have recently formally decided that we must do in respect of format) or apply our own disambiguating condition. How are you intending to implement character?

By storing it as a Unicode value according to the XML Rec production

  Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

It will depend on the implementation library. ICU, for example, has UChar and UChar32 types.

Regards,
Arved
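A minimal sketch of the storage idea above, not taken from FOP: it checks a code point against the XML 1.0 Char production, whose last range is [#x10000-#x10FFFF]. The function name is_xml_char is hypothetical.

```python
def is_xml_char(code_point: int) -> bool:
    """Return True if code_point is matched by the XML 1.0 Char production."""
    return (
        code_point in (0x9, 0xA, 0xD)            # tab, line feed, carriage return
        or 0x20 <= code_point <= 0xD7FF          # up to the start of the surrogate blocks
        or 0xE000 <= code_point <= 0xFFFD        # skips FFFE and FFFF
        or 0x10000 <= code_point <= 0x10FFFF     # supplementary planes
    )


# The hyphen and code point 12235 are both legal XML characters.
hyphen_ok = is_xml_char(ord("-"))
kangxi_ok = is_xml_char(12235)
# A surrogate code point and FFFE are excluded by the production.
surrogate_ok = is_xml_char(0xD800)
fffe_ok = is_xml_char(0xFFFE)
```

Storing the value as an integer code point, rather than as a UTF-16 unit, keeps characters outside the Basic Multilingual Plane representable as a single value.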
Re: character
Arved Sandstrom wrote:
> Why is character="-" a parsing error? The XML Recommendation has at least one example of an attribute value that contains a hyphen.

This comes from assuming that every unquoted sequence of characters which is not a number, a measurement or a color has to be interpreted as an NCName, as the grammar suggests, and IIRC an NCName must not start with a hyphen. This means hyphenation-char="-" can't parse as a number, can't parse as a string, can't parse as a color, can't parse as an NCName: parsing error. Interestingly, hyphenation-char="-1" would parse, but certainly can't be converted to a char.

Some other niceties: would hyphenation-char="1*4" make the hyphenation character be 4? Can hyphenation-char="1 div 4" be converted to &#x00BC;? <bg> I know this becomes silly.

>> How are you intending to implement character?
> By storing it as a Unicode value according to the XML Rec production

Functions complicate matters, and something like hyphenation-char="from-table-column('hyphenation-char')" might even make some sense.

J.Pietschmann
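The argument above can be tried out with a small classifier. This is only a sketch under stated assumptions: the patterns are simplified readings of the Number, Literal, Color and NCName tokens of the XSL expression grammar, NCName is approximated with ASCII only, measurements and operators are left out, and the names TOKEN_PATTERNS and classify are hypothetical.

```python
import re

# Simplified, single-token patterns; NCName is an ASCII-only approximation.
TOKEN_PATTERNS = [
    ("Number", re.compile(r"\d+(\.\d*)?|\.\d+")),
    ("Literal", re.compile(r'"[^"]*"' + "|" + r"'[^']*'")),
    ("Color", re.compile(r"#[0-9A-Fa-f]+")),
    ("NCName", re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")),
]


def classify(expression: str):
    """Return the name of the first token type matching the whole string, else None."""
    for name, pattern in TOKEN_PATTERNS:
        if pattern.fullmatch(expression):
            return name
    return None


bare_hyphen = classify("-")       # matches none of the four token types
bare_letter = classify("c")       # a one-letter NCName
quoted_hyphen = classify("'-'")   # a string literal
digit = classify("1")             # a number, not a character
```

A bare hyphen falls through every pattern because the NCName pattern requires a letter or underscore first, which is the point being made about hyphenation-char="-".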
Re: character
Peter B. West wrote:
>> Just for curiosity: what should happen if the following snippet is used:
>>   <fo:page-sequence master-reference="font-size" font-size="20pt">
>>     <fo:flow font-size="from-parent(from-parent('master-reference'))"/>
> This looks OK.

I see potential for an Obfuscated FO Code Contest :-)

J.Pietschmann
RE: character
-----Original Message-----
From: J.Pietschmann [mailto:[EMAIL PROTECTED]]
Sent: October 6, 2002 12:00 PM
To: [EMAIL PROTECTED]
Subject: Re: character

> Arved Sandstrom wrote:
>> Why is character="-" a parsing error? The XML Recommendation has at least one example of an attribute value that contains a hyphen.
> This comes from assuming that every unquoted sequence of characters which is not a number, a measurement or a color has to be interpreted as an NCName, as the grammar suggests, and IIRC an NCName must not start with a hyphen. This means hyphenation-char="-" can't parse as a number, can't parse as a string, can't parse as a color, can't parse as an NCName: parsing error.

Hi Joerg,

Can you cite the specific productions that lead to this conclusion? I am not saying that you are wrong, but I can't find it. I must be tired. ;-)

I just looked at the XML 1.1 production for AttValue, which is

  AttValue ::= '"' ([^<&"] | Reference)* '"' | "'" ([^<&'] | Reference)* "'"

and I see a prohibition here on using a literal '<' or '&' in the attribute value, anywhere. But I see nothing about '-'.

If the grammar of the recommendations leads to the conclusion that character="-" is not OK, then this simply offends my common sense.

Arved
Re: character
Arved Sandstrom wrote: Can you cite the specific productions that lead to this conclusion? I am not saying that you are wrong but I can't find it. I must be tired. ;-) I just looked at the XML 1.1 production for AttValue which is Don't look at XML AttValue, look at the XSLFO property expression language. Somehow it is implicit that all attributes in a XSLFO document are parsed as expressions which are defined in 5.9 Expressions. Refer specifically to 5.9.3 Basics. A single hyphen is not a valid expression according to the XSLFO expression grammar. Maybe some fallbacks are implicit somewhere, I don't know. J.Pietschmann - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, email: [EMAIL PROTECTED]
RE: character
-----Original Message-----
From: J.Pietschmann [mailto:[EMAIL PROTECTED]]
Sent: October 6, 2002 12:39 PM
To: [EMAIL PROTECTED]
Subject: Re: character

> Arved Sandstrom wrote:
>> Can you cite the specific productions that lead to this conclusion? I am not saying that you are wrong but I can't find it. I must be tired. ;-) I just looked at the XML 1.1 production for AttValue which is
> Don't look at XML AttValue, look at the XSLFO property expression language. Somehow it is implicit that all attributes in a XSLFO document are parsed as expressions which are defined in 5.9 Expressions. Refer specifically to 5.9.3 Basics. A single hyphen is not a valid expression according to the XSLFO expression grammar. Maybe some fallbacks are implicit somewhere, I don't know.

An Expr can be a Literal, the production for which is

  '"' [^"]* '"' | "'" [^']* "'"

If I look at the first alternative,

  '"' [^"]* '"'

it seems to me that I have pretty considerable leeway, and "-" isn't ruled out at all.

Arved
Re: character
Arved Sandstrom wrote:
> An Expr can be a Literal, the production for which is
>   '"' [^"]* '"' | "'" [^']* "'"
> If I look at the first alternative, '"' [^"]* '"', it seems to me that I have pretty considerable leeway, and "-" isn't ruled out at all.

Erm, the expression is supposed to be inside the XML attribute quotes. For example, hyphenation-char="'-'" would be ok (literal, second production), but hyphenation-char="-" does not match the literal production, nor any other (except operator). Unless I missed something, of course.

J.Pietschmann
RE: character
-----Original Message-----
From: J.Pietschmann [mailto:[EMAIL PROTECTED]]
Sent: October 6, 2002 1:29 PM
To: [EMAIL PROTECTED]
Subject: Re: character

> Arved Sandstrom wrote:
>> An Expr can be a Literal, the production for which is
>>   '"' [^"]* '"' | "'" [^']* "'"
>> If I look at the first alternative, '"' [^"]* '"', it seems to me that I have pretty considerable leeway, and "-" isn't ruled out at all.
> Erm, the expression is supposed to be inside the XML attribute quotes. For example, hyphenation-char="'-'" would be ok (literal, second production), but hyphenation-char="-" does not match the literal production, nor any other (except operator). Unless I missed something, of course.

And unless _I_ am missing something, "-" precisely matches that production. You are looking at

  "'" [^']* "'"

but I am looking at

  '"' [^"]* '"'

According to the latter I can absolutely do "-".

Arved
Re: character
Arved Sandstrom wrote:
> And unless _I_ am missing something, "-" precisely matches that production. You are looking at "'" [^']* "'" but I am looking at '"' [^"]* '"'. According to the latter I can absolutely do "-".

Well, in hyphenation-char="-" the hyphen is the expression, not the hyphen surrounded by double quotes. As I said, unless I'm missing something, the FO property expression is the value of the XML attribute, which in turn is the hyphen, because the double quotes are part of the XML syntax and are stripped by the XML parser. The XSLFO property expression parser gets the hyphen, without any quotes, double or single. And without the quotes, it does not match either of the two productions for literal. This is the problem here.

Perhaps I should have written that hyphenation-char="'-'" and hyphenation-char='"-"' as well as hyphenation-char='&quot;-&quot;' are legal, while neither hyphenation-char='-' nor hyphenation-char="-" are ok.

J.Pietschmann
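The quote-stripping step described above is easy to observe with any XML parser. The following is a small sketch using Python's standard xml.etree.ElementTree; the fo:block element and the helper name are chosen only for illustration.

```python
import xml.etree.ElementTree as ET

TEMPLATE = '<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" hyphenation-char={}/>'


def expression_seen_by_fo_parser(quoted_attribute: str) -> str:
    """Parse the element and return the attribute value after XML processing."""
    element = ET.fromstring(TEMPLATE.format(quoted_attribute))
    return element.get("hyphenation-char")


# hyphenation-char="-"  : the XML delimiters are gone, only the hyphen remains
bare = expression_seen_by_fo_parser('"-"')
# hyphenation-char="'-'" : the inner single quotes survive as part of the value
single_in_double = expression_seen_by_fo_parser("\"'-'\"")
# hyphenation-char='"-"' : the inner double quotes survive
double_in_single = expression_seen_by_fo_parser("'\"-\"'")
# hyphenation-char='&quot;-&quot;' : entity references expand to double quotes
entity_quoted = expression_seen_by_fo_parser("'&quot;-&quot;'")
```

Only the last three values still carry quotes by the time an expression parser would see them, so only they can match a string-literal production.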
RE: character
-Original Message- From: J.Pietschmann [mailto:[EMAIL PROTECTED]] Sent: October 6, 2002 2:15 PM To: [EMAIL PROTECTED] Subject: Re: character Arved Sandstrom wrote: And unless _I_ am missing something, - precisely matches that production. You are looking at ' [^']* ' but I am looking at '' [^]* '' According to the latter I can absolutely do -. Well, in hyphenation-char=- the hyphen is the expression, not the hyphen surrounded by double quotes. As I said, unless I'm something missing, the FO property expression is the value of the XML attribute, which in turn is the hyphen, because the double quotes are part of the XML syntax and are stripped by the XML parser. The XSLFO property expression parser gets the hyphen, without any quotes, double, or single. And without the quotes, it does not match either of the two productions for literal. This is the problem here. Perhaps I should have written that hyphenation-char='-' and hyphenation-char='-' as well as hyphenation-char='quot;-quot;' are legal, while neiter hyphenation-char='-' nor hyphenation-char=- are ok. Yes, I see your point. I think they screwed up the grammar. As I stated before, I find it ludicrous that character=- would not be OK. Arved - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, email: [EMAIL PROTECTED]
Re: character
Arved Sandstrom wrote:
> I think they screwed up the grammar.

Me too. However, I think it would be really hard to press something which is intuitive, consistent and easy to parse into a single grammar for all XSLFO properties. It seems they fell for the same thing the C preprocessor guys did: the preprocessor is intuitive and easy to implement for the most part, but has the abominable 0xe-12 problem as well as the rather unintuitive argument prescanning hidden in its dark corners.

> As I stated before, I find it ludicrous that character="-" would not be OK.

That's ok, but it would require some extensions to the whole property handling.

J.Pietschmann
Re: character
Tony Graham [EMAIL PROTECTED] wrote:
> [EMAIL PROTECTED] wrote at 27 Sep 2002 16:44:32 -0300:
>> Out of the XML recommendation, section 2.2:
>>   A character is an atomic unit of text as specified by ISO/IEC 10646 [ISO10646]. Legal characters are tab, carriage return, line feed, and the legal graphic characters of Unicode and ISO/IEC 10646.
> XML 1.0 Second Edition removed "graphic" (which I always found confusing but which is good ISO-speak).
>> or, more clearly:
>>   Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
>>   /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
>> That means "-", "&#12235;", etc. are characters, while "'1'" is not.
> "&#12235;" is a character reference. '#12235' is how you talk about a character's code point, although the hexadecimal representation is usually preferable.
> In XSL terms, '1' is a one-character string literal, but while you could claim that it is one character, there's no XSL conversion from a string to a character, so <fo:character character="'1'"/> should fail.

You are correct. What I tried to point out is that '1' IS a string that HAS one character. He who claims it is one character is IMHO seriously misled. "1", on the other hand, IS a character.

The concept of a character, in the XML syntax definition, is that of the symbols allowed in the grammar, the most elementary lexical unit. The term "string" is not formally defined in the recommendation (sadly), but it is used throughout the text to mean "sequence of characters". A string _type_ is defined for attributes, and consists of a quoted literal string (i.e. a sequence of characters delimited by quotes). From the XML point of view, '1' is nothing but a three-character string. And a three-character string is not a character.

The XSL recommendation defines a string datatype that has a rather different scope from the 'literal string' and 'string attribute type' defined in the XML spec, but defines no 'character' datatype, so I think there is no other option but to assume it means the XML definition of a character. So <fo:character character="1"/> is correct, while, as you said, <fo:character character="'1'"/> should fail.

This leaves us with a problem, however: since the character datatype is not defined, there is also no conversion rule which results in a character. You cannot store a character in an xsl:variable because there is no way to specify or retrieve it; variables know only about strings. I find that very disturbing, because it hampers stylesheet coding, in that we cannot specify characters indirectly or do any work with them. I think this should be reported to the editors of XSLT 2.0 so they can provide a clear way out.

=
Marcelo Jaccoud Amaral
Petrobrás (http://www.petrobras.com.br)
mailto:[EMAIL PROTECTED]
voice: +55 21 2534-3485
fax: +55 21 2534-1809
=
Re: character
Tony Graham wrote: [EMAIL PROTECTED] wrote at 27 Sep 2002 16:44:32 -0300: Out of the XML recomendation,section 2.2: A character is an atomic unit of text as specified by ISO/IEC 10646 [ISO10646]. Legal characters are tab, carriage return, line feed, and the legal graphic characters of Unicode and ISO/IEC 10646. XML 1.0 Second Edition removed graphic (which I always found confusing but which is good ISO-speak). or, more clearly: Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x1-#x10] /* any Unicode character, excluding the surrogate blocks, FFFE, and . */ That means -, #12235 , etc are characters, while '1' is not. #12235; is a character reference. '#12235' is how you talk about a character's code point, although the hexadecimal representation is usually preferable. In XSL terms, '1' is a one-character string literal, but while you could claim that it is one character, there's no XSL conversion from a string to a character, so fo:character character='1'/ should fail. Tony, I don't think this gets us out of difficulty. A casual inspection reveals no conversion, either, from an NCName to a character. So an attribute value assignment of a will, I think, parse (in the parser implied by the grammar of XSL expressions) as an NCName (whereas - will parse as an unadorned MINUS sign.) So how do I represent a character? Furthermore, Section 5.11 has q character A single Unicode character. string A sequence of characters. /q If an attribute value assignment of 'a sequence of characters' assigns a sequence of characters, then 'a' must assign a sequence of one character. What's the difference between a single Unicode character and a sequence of one character? Well, one is a sequence, and therefore a string, and there's no XSL conversion, etc. So how do I represent a character? To me, the cleanest, least ambiguous way is to represent a character attribute assignment value with 'character' - a string literal of length 1. Peter -- Peter B. 
West [EMAIL PROTECTED] http://www.powerup.com.au/~pbwest/ Lord, to whom shall we go? - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, email: [EMAIL PROTECTED]
Re: character
Peter B. West wrote at 30 Sep 2002 13:28:18 +1000: Tony Graham wrote: [EMAIL PROTECTED] wrote at 27 Sep 2002 16:44:32 -0300: ... That means -, #12235 , etc are characters, while '1' is not. #12235; is a character reference. '#12235' is how you talk about a character's code point, although the hexadecimal representation is usually preferable. In XSL terms, '1' is a one-character string literal, but while you could claim that it is one character, there's no XSL conversion from a string to a character, so fo:character character='1'/ should fail. Tony, I don't think this gets us out of difficulty. A casual inspection Forgive me, but I wasn't trying to get anybody out of any difficulty, I was just trying to keep the terminology accurate. ... So how do I represent a character? To me, the cleanest, least ambiguous way is to represent a character attribute assignment value with 'character' - a string literal of length 1. Except that you know that that's not specified among the allowed conversions. The interesting thing is that 'character' doesn't appear in the productions in Section 5.9, Expressions, of the XSL Recommendation. Now there's a question for [EMAIL PROTECTED]! I think that you represent a character as a single character, e.g., character=c, or as a numeric character reference, e.g., character=#xA;. Regards, Tony Graham XML Technology Center - Dublinmailto:[EMAIL PROTECTED] Sun Microsystems Ireland Ltd Phone: +353 1 8199708 Hamilton House, East Point Business Park, Dublin 3x(70)19708 - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, email: [EMAIL PROTECTED]
RE: character
-Original Message- From: Tony Graham [mailto:[EMAIL PROTECTED]] Sent: September 30, 2002 10:09 AM To: [EMAIL PROTECTED] Subject: Re: character Peter B. West wrote at 30 Sep 2002 13:28:18 +1000: Tony Graham wrote: [EMAIL PROTECTED] wrote at 27 Sep 2002 16:44:32 -0300: ... That means -, #12235 , etc are characters, while '1' is not. #12235; is a character reference. '#12235' is how you talk about a character's code point, although the hexadecimal representation is usually preferable. In XSL terms, '1' is a one-character string literal, but while you could claim that it is one character, there's no XSL conversion from a string to a character, so fo:character character='1'/ should fail. Tony, I don't think this gets us out of difficulty. A casual inspection Forgive me, but I wasn't trying to get anybody out of any difficulty, I was just trying to keep the terminology accurate. ... So how do I represent a character? To me, the cleanest, least ambiguous way is to represent a character attribute assignment value with 'character' - a string literal of length 1. Except that you know that that's not specified among the allowed conversions. The interesting thing is that 'character' doesn't appear in the productions in Section 5.9, Expressions, of the XSL Recommendation. Now there's a question for [EMAIL PROTECTED]! I think that you represent a character as a single character, e.g., character=c, or as a numeric character reference, e.g., character=#xA;. I agree with this last, after having digested everything. Point is well taken that we have some points to nitpick with xsl-editors, mostly about disambiguating some of the language. Arved Sandstrom - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, email: [EMAIL PROTECTED]
Re: character
Arved Sandstrom wrote: -Original Message- From: Tony Graham [mailto:[EMAIL PROTECTED]] Peter B. West wrote at 30 Sep 2002 13:28:18 +1000: Tony Graham wrote: [EMAIL PROTECTED] wrote at 27 Sep 2002 16:44:32 -0300: ... That means -, #12235 , etc are characters, while '1' is not. #12235; is a character reference. '#12235' is how you talk about a character's code point, although the hexadecimal representation is usually preferable. In XSL terms, '1' is a one-character string literal, but while you could claim that it is one character, there's no XSL conversion from a string to a character, so fo:character character='1'/ should fail. Tony, I don't think this gets us out of difficulty. A casual inspection Forgive me, but I wasn't trying to get anybody out of any difficulty, I was just trying to keep the terminology accurate. ... So how do I represent a character? To me, the cleanest, least ambiguous way is to represent a character attribute assignment value with 'character' - a string literal of length 1. Except that you know that that's not specified among the allowed conversions. The interesting thing is that 'character' doesn't appear in the productions in Section 5.9, Expressions, of the XSL Recommendation. Now there's a question for [EMAIL PROTECTED]! I think that you represent a character as a single character, e.g., character=c, or as a numeric character reference, e.g., character=#xA;. I agree with this last, after having digested everything. Point is well taken that we have some points to nitpick with xsl-editors, mostly about disambiguating some of the language. Arved, Help me here. I must be missing something. What is it that you agree with? That the spec, as worded, leaves us with character=c or character=#x63; which amounts to the same thing? If so, fair enough. Do you also agree that c is an NCName? And that character=- is a parsing error? 
As far as I can see, the only immediate ways forward are to descend into the mire of context dependent parsing (which the editors have recently formally decided that we must do in respect of format) or apply our own disambiguating condition. How are you intending to implement character? Peter -- Peter B. West [EMAIL PROTECTED] http://www.powerup.com.au/~pbwest/ Lord, to whom shall we go? - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, email: [EMAIL PROTECTED]
RE: character
Arved Sandstrom wrote at 26 Sep 2002 19:50:01 -0300: Tony Graham says that character should be a Unicode character, or Char. As in the actual real, encoded thing. Empirical evidence suggests that is the general understanding: grepping the XSL CR test suite shows everybody, FOP included, using literal characters. Problem being, one property with a character datatype is defined in XSLT, which actually says that it's a Char. hyphenation-separator merely says that it's a specification of a Unicode character. I guess that could be interpreted the same way. But character for the character property says _code point_. And that is an integer value. Section 5.11, Property Datatypes, trumps the individual property definitions, since Section 5.11 defines the syntax for specifying the datatypes usable in property values. It says A single Unicode character. Now, the interesting if so far theoretical case is what do you do if you want a hyphenation-separator character that you can only represent in Unicode as the combination of a base character and one or more combining marks? What if your precomposed character gets normalised to a base character and a combining mark before the XSL processor sees it? So IMO the spec is currently very vague on this. Then write to [EMAIL PROTECTED] asking for a clarification. Regards, Tony Graham XML Technology Center - Dublinmailto:[EMAIL PROTECTED] Sun Microsystems Ireland Ltd Phone: +353 1 8199708 Hamilton House, East Point Business Park, Dublin 3x(70)19708 - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, email: [EMAIL PROTECTED]
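The normalisation question raised above can be illustrated with Python's standard unicodedata module. This is only a sketch of the Unicode behaviour, not of anything an XSL processor is specified to do; the sample characters are arbitrary choices.

```python
import unicodedata

# U+00E9 (e with acute) is a single precomposed code point.
precomposed = "\u00e9"

# Canonical decomposition (NFD) splits it into a base letter plus a combining mark.
decomposed = unicodedata.normalize("NFD", precomposed)

# "q" followed by a combining tilde has no precomposed form in Unicode,
# so even canonical composition (NFC) leaves it as two code points.
no_precomposed_form = "q\u0303"
recomposed = unicodedata.normalize("NFC", no_precomposed_form)

lengths = (len(precomposed), len(decomposed), len(recomposed))
```

A property that accepts exactly one code point would therefore reject the decomposed forms, even though a reader sees one character in each case.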
Re: character
Peter B. West wrote at 28 Sep 2002 00:39:34 +1000: ... Tony Graham wrote: ... Section 5.11, Property Datatypes, trumps the individual property definitions, since Section 5.11 defines the syntax for specifying the datatypes usable in property values. It says A single Unicode character. Ok, so it's a character. How, then, is it represented? Is it also a string (of length one), or is it just a literal (length 1), or just an NCName (length 1), or is it something else? What does it look like, and how is the parser going to handle it? A character is a character, and you should go to XML 1.0 for the definition of a character. Also, parser is ambiguous in this context as well as having no XML or XSL meaning. XML defines an XML processor, which is often called a parser for historical reasons, and the XSL Recommendation uses parse without designating a thing called a parser. ... So IMO the spec is currently very vague on this. Then write to [EMAIL PROTECTED] asking for a clarification. Nice dry wit you have Tony. That was a serious suggestion. You do get an answer eventually, even if you don't like the answer. Regards, Tony Graham XML Technology Center - Dublinmailto:[EMAIL PROTECTED] Sun Microsystems Ireland Ltd Phone: +353 1 8199708 Hamilton House, East Point Business Park, Dublin 3x(70)19708 - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, email: [EMAIL PROTECTED]
Re: character
[EMAIL PROTECTED] wrote at 27 Sep 2002 16:44:32 -0300:
> Out of the XML recommendation, section 2.2:
>   A character is an atomic unit of text as specified by ISO/IEC 10646 [ISO10646]. Legal characters are tab, carriage return, line feed, and the legal graphic characters of Unicode and ISO/IEC 10646.

XML 1.0 Second Edition removed "graphic" (which I always found confusing but which is good ISO-speak).

> or, more clearly:
>   Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
>   /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
> That means "-", "&#12235;", etc. are characters, while "'1'" is not.

"&#12235;" is a character reference. '#12235' is how you talk about a character's code point, although the hexadecimal representation is usually preferable.

In XSL terms, '1' is a one-character string literal, but while you could claim that it is one character, there's no XSL conversion from a string to a character, so <fo:character character="'1'"/> should fail.

Regards,

Tony Graham
XML Technology Center - Dublin    mailto:[EMAIL PROTECTED]
Sun Microsystems Ireland Ltd      Phone: +353 1 8199708
Hamilton House, East Point Business Park, Dublin 3    x(70)19708
Re: character
Peter B. West wrote:
> Fopdevs,
> Any comments on the representation and parsing of character type attributes would be gratefully received.

According to 5.11 Property Datatypes, the value is a single Unicode character. I believe the representation is an unceremonious single Unicode character, or an NCName whose string representation has length 1.

I'd parse such attributes as an expression resulting in a string, and bomb if the string is longer than 1. This would accept
  character="'a'"
  character="1 + 1"
  character="from-parent('font-size') - 12"
which may upset purists, or not.

An alternative would be to use a custom parser, which accepts either a single character (NCName of length 1) or any of the functions inherited-property-value(NCName), from-parent(NCName), from-nearest-specified-value(NCName) and from-table-column(NCName) (which might even make a bit of sense for hyphenation-char and for fo:character's character in very, very strange cases).

Just for curiosity: what should happen if the following snippet is used:
  <fo:page-sequence master-reference="font-size" font-size="20pt">
    <fo:flow font-size="from-parent(from-parent('master-reference'))"/>

J.Pietschmann
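The custom-parser alternative could look roughly like the sketch below. Everything here is an assumption for illustration: the function name parse_character_value is hypothetical, NCName is approximated with ASCII only, function arguments are taken as bare NCNames, and any single character (including a hyphen) is accepted rather than only a one-letter NCName.

```python
import re

# ASCII-only approximation of NCName; the real production admits many more characters.
NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"
FUNCTIONS = (
    "inherited-property-value",
    "from-parent",
    "from-nearest-specified-value",
    "from-table-column",
)
CALL = re.compile(
    r"\s*(" + "|".join(re.escape(name) for name in FUNCTIONS) + r")"
    r"\s*\(\s*(" + NCNAME + r")\s*\)\s*"
)


def parse_character_value(value: str):
    """Accept a single character or one of the four property-value functions."""
    if len(value) == 1:  # len() counts code points, so non-BMP characters pass too
        return ("char", value)
    match = CALL.fullmatch(value)
    if match:
        return ("function", match.group(1), match.group(2))
    raise ValueError(f"not a character or a property-value function: {value!r}")


hyphen = parse_character_value("-")
inherited = parse_character_value("from-parent(hyphenation-char)")
```

Checking the single-character case first sidesteps the expression grammar entirely for the common case, which is the design choice that makes character="-" acceptable in this sketch.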
RE: character
-Original Message- From: Peter B. West [mailto:[EMAIL PROTECTED]] Sent: September 26, 2002 11:41 AM To: fop-dev Subject: character Fopdevs, Any comments on the representation and parsing of character type attributes would be gratefully received. This came up on www-xsl-fo, because Eric Bischoff and myself had the same question. Tony Graham says that character should be a Unicode character, or Char. As in the actual real, encoded thing. Problem being, one property with a character datatype is defined in XSLT, which actually says that it's a Char. hyphenation-separator merely says that it's a specification of a Unicode character. I guess that could be interpreted the same way. But character for the character property says _code point_. And that is an integer value. So IMO the spec is currently very vague on this. Regards, Arved - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, email: [EMAIL PROTECTED]
Re: character
Arved, Thanks for this. I vaguely remembered some discussion about this, but I went looking in the xsl-editors archive. That _code point_ had me puzzled as well. I'll be interested in some feedback on this from the editors. See also my response to Joerg. Peter Arved Sandstrom wrote: From: Peter B. West [mailto:[EMAIL PROTECTED]] Fopdevs, Any comments on the representation and parsing of character type attributes would be gratefully received. This came up on www-xsl-fo, because Eric Bischoff and myself had the same question. Tony Graham says that character should be a Unicode character, or Char. As in the actual real, encoded thing. Problem being, one property with a character datatype is defined in XSLT, which actually says that it's a Char. hyphenation-separator merely says that it's a specification of a Unicode character. I guess that could be interpreted the same way. But character for the character property says _code point_. And that is an integer value. So IMO the spec is currently very vague on this. -- Peter B. West [EMAIL PROTECTED] http://www.powerup.com.au/~pbwest/ Lord, to whom shall we go? - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, email: [EMAIL PROTECTED]
Re: Character Encoding
----- Original Message -----
From: J.Pietschmann [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Tuesday, July 09, 2002 9:58 PM
Subject: Re: Character Encoding

> Holger Prause wrote:
>> I use the character sequence &#8722; in an HTML page; it will be displayed as a minus sign. So far so good. Now I want to use that character sequence in FO, but in the
> It is a character reference, not a character sequence.

Yes, you are right.

>> generated PDF it will be displayed as a # sign (which stands for undefined?)
> This means the selected font does not have a glyph for it.

OK, I understand that; it's also written in the FOP FAQ.

>> What can I do to display this character sequence: changing the encoding in the stylesheet (or using <xsl:output/>)?
> The only way is to get a font with a glyph for it and let FOP use it. The mathematical minus is pretty esoteric; you'll probably need a special math font. Rummage through implementations for MathML or TeX distributions. Why can't you use a dash or hyphen?

What I wanted was a dash, but for some reason I chose the character reference &#8722;, which is, as you already said, a mathematical minus. Now I use the character reference for a dash, and it works fine with my font.

Thanks for the quick response.

Bye,
Holger

> J.Pietschmann
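To see which character a numeric reference actually names before hunting for a font, a quick check like the following can help. It is a sketch using only Python's standard html and unicodedata modules; the helper name describe is hypothetical, and it says nothing about which glyphs a given font contains.

```python
import html
import unicodedata


def describe(reference: str) -> str:
    """Expand a character reference and report its code point and Unicode name."""
    character = html.unescape(reference)
    return f"U+{ord(character):04X} {unicodedata.name(character)}"


minus = describe("&#8722;")    # the mathematical minus discussed above
hyphen = describe("&#45;")     # the ordinary keyboard hyphen
en_dash = describe("&#8211;")  # one of the dash characters
```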
Re: Character encoding on other platforms (previously os/390)
[EMAIL PROTECTED] schrieb: I've had a couple folks ask me for the modified code so the proper character encoding is returned on the toString().getBytes() is US-ASCII. This is cool that other people besides me need this. [..] I downloaded this snapshot xml-fop_20020515162132 and I don't see any modification to the code. Is this change going to be incorporated or has been incorporated in a way that I missed? It hasn't been incorporated yet but it's on my todo list and should be in the next maintenance release. Many Thanks, Jason West Christian - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, email: [EMAIL PROTECTED]