Re: character

2002-10-07 Thread Peter B. West

Joerg and Arved,

Thanks for sorting this out while I was asleep.  I talk about these 
things in terms of the parser, in spite of the offence it might give to 
specification purists, because that is where I have spent a lot of my 
time lately.

J.Pietschmann wrote:
 Don't look at XML AttValue, look at the XSLFO property expression language.
 Somehow it is implicit that all attributes in a XSLFO document are parsed
 as expressions which are defined in 5.9 Expressions.

This is the critical point.  The namespace not only restricts the 
elements and attributes, but imposes itself on the contents of the 
attribute values passed in by the XML parser.  I need to think about 
this a bit more, but it seems to me that the recent ruling on string 
with respect to the format attribute, which makes my flesh creep every 
time I think about it, disguises an attempt to smuggle part of the 
Transform namespace's constraints into the Format namespace.  They are 
completely different expression environments, which is why it doesn't 
work.  Has anyone else given this any thought?

Peter
-- 
Peter B. West  [EMAIL PROTECTED]  http://www.powerup.com.au/~pbwest/
Lord, to whom shall we go?


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, email: [EMAIL PROTECTED]




Re: character

2002-10-07 Thread J.Pietschmann

Peter B. West wrote:
 This is the critical point.  The namespace not only restricts the 
 elements and attributes, but imposes itself on the contents of the 
 attribute values passed in by the XML parser.

Umm, the namespace does not impose anything. It's the XSLFO spec which
defines the semantics of some elements and XML attribute values. That
said elements happen to be in a certain namespace is not really relevant
for getting something formatted.

  I need to think about 
 this a bit more, but it seems to me that the recent ruling on string 
 with respect to the format attribute, which makes my flesh creep every 
 time I think about it, disguises an attempt to smuggle part of the 
 Transform namespace's constraints into the Format namespace.  They are 
 completely different expression environments, which is why it doesn't 
 work.  Has anyone else given this any thought?

Where does XSLT come into the picture? The whole thing is specified
in the XSLFO spec, section 5. The expressions which make up property
values in the end come from 5.9ff. The expression language used by
XSLT, XPath, is an entirely different beast (I don't think this is
much of an advantage).

J.Pietschmann






RE: character

2002-10-06 Thread Arved Sandstrom

 -Original Message-
 From: Peter B. West [mailto:[EMAIL PROTECTED]]
 Sent: September 30, 2002 11:24 PM
 To: [EMAIL PROTECTED]
 Subject: Re: character

 Arved Sandstrom wrote:
 -Original Message-
 From: Tony Graham [mailto:[EMAIL PROTECTED]]

 Peter B. West wrote at 30 Sep 2002 13:28:18 +1000:
   Tony Graham wrote:
[EMAIL PROTECTED] wrote at 27 Sep 2002 16:44:32 -0300:
 ...
   That means "-", "&#12235;", etc. are characters, while "'1'" is not.
   
   &#12235; is a character reference.  '#12235' is how you talk about a
   character's code point, although the hexadecimal representation is
   usually preferable.
   
   In XSL terms, '1' is a one-character string literal, but while you
   could claim that it is one character, there's no XSL conversion from a
   string to a character, so <fo:character character="'1'"/> should fail.
  
   Tony,
  
   I don't think this gets us out of difficulty.  A casual inspection
 
 Forgive me, but I wasn't trying to get anybody out of any difficulty,
 I was just trying to keep the terminology accurate.
 
 ...
   So how do I represent a character?
  
   To me, the cleanest, least ambiguous way is to represent a
 character
   attribute assignment value with 'character' - a string literal of
   length 1.
 
 Except that you know that that's not specified among the allowed
 conversions.
 
 The interesting thing is that 'character' doesn't appear in the
 productions in Section 5.9, Expressions, of the XSL Recommendation.
 Now there's a question for [EMAIL PROTECTED]!
 
 I think that you represent a character as a single character, e.g.,
 character="c", or as a numeric character reference, e.g.,
 character="&#xA;".
 
 
  I agree with this last, after having digested everything.
 
  Point is well taken that we have some points to nitpick with
 xsl-editors,
  mostly about disambiguating some of the language.

 Arved,

 Help me here. I must be missing something.  What is it that you agree
 with?  That the spec, as worded, leaves us with
   character="c"
 or
   character="&#x63;"
 which amounts to the same thing?

Yes, this is what I agree with.

 If so, fair enough.  Do you also agree that c is an NCName?  And that
   character="-"
 is a parsing error?

Well, the production for NCName doesn't live in isolation, with reference to
http://www.w3.org/TR/REC-xml-names/#ns-decl. Yes, c fits the production,
but it's really an NCName when you have also declared the namespace.

Why is character="-" a parsing error? The XML Recommendation has at least
one example of an attribute value that contains a hyphen.

Maybe _I_ am missing something here. ;-)

 As far as I can see, the only immediate ways forward are to descend into
 the mire of context dependent parsing (which the editors have recently
 formally decided that we must do in respect of format) or apply our
 own disambiguating condition.  How are you intending to implement
 character?

By storing it as a Unicode value according to the XML Rec production

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
         [#x10000-#x10FFFF]

It will depend on the implementation library. ICU for example has UChar and
UChar32 types.
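
The idea of storing the value as a code point checked against the XML Char production can be sketched roughly as follows. This is only an illustration in Python, not FOP code: the names `XML_CHAR_RANGES`, `is_xml_char` and `character_code_point` are made up for the example, and the ranges are those of the XML 1.0 Char production.

```python
# Inclusive code point ranges from the XML 1.0 Char production.
XML_CHAR_RANGES = (
    (0x9, 0x9),
    (0xA, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
)


def is_xml_char(code_point):
    """Return True if the code point is allowed by the Char production."""
    return any(low <= code_point <= high for low, high in XML_CHAR_RANGES)


def character_code_point(value):
    """Hypothetical helper: map a one-character attribute value to its
    Unicode code point, rejecting anything that is not a single Char."""
    if len(value) != 1:
        raise ValueError("expected exactly one character, got %r" % value)
    code_point = ord(value)
    if not is_xml_char(code_point):
        raise ValueError("U+%04X is not an XML Char" % code_point)
    return code_point
```

Under this reading a bare hyphen is simply U+002D, and whether the property expression grammar lets it through is a separate question.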

Regards,
Arved






Re: character

2002-10-06 Thread J.Pietschmann

Arved Sandstrom wrote:
 Why is character="-" a parsing error? The XML Recommendation has at least
 one example of an attribute value that contains a hyphen.

This comes from assuming that every unquoted sequence of characters which
is not a number, measurement or a color has to be interpreted as NCName,
as the grammar suggests, and IIRC an NCName must not start with a hyphen.
This means
  hyphenation-char="-"
can't parse as number, can't parse as string, can't parse as color, can't
parse as NCName -> parsing error.
Interestingly
  hyphenation-char="-1"
would parse, but certainly can't be converted to a char.
Some other niceties:
  hyphenation-char="1*4"
Would this make the hyphenation character be 4?
Can
  hyphenation-char="1 div 4"
be converted to &#x00BC;? <bg> I know this becomes silly.

How are you intending to implement
character?
 
 By storing it as a Unicode value according to the XML Rec production

Functions complicate matters, and something like
   hyphenation-char="from-table-column('hyphenation-char')"
might even make some sense.

J.Pietschmann
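
The argument that a bare hyphen matches none of the single-token alternatives can be tried out with a rough classifier. The patterns below are simplified, ASCII-only approximations written for this illustration, not the productions of Section 5.9 verbatim; `TOKEN_PATTERNS` and `classify` are hypothetical names, and operators (unary minus, `*`, `div`) are deliberately left out.

```python
import re

# Simplified stand-ins for the token classes named in the message above.
TOKEN_PATTERNS = {
    "number": re.compile(r"\d+(?:\.\d*)?|\.\d+"),
    "literal": re.compile(r"\"[^\"]*\"|'[^']*'"),
    "color": re.compile(r"#[0-9A-Za-z]+"),
    # An NCName may contain hyphens but must not start with one.
    "ncname": re.compile(r"[A-Za-z_][A-Za-z0-9._-]*"),
}


def classify(value):
    """Return the names of the token classes that match the whole value."""
    return [name for name, pattern in TOKEN_PATTERNS.items()
            if pattern.fullmatch(value)]
```

With these patterns `classify("-")` comes back empty, `classify("c")` reports an NCName and `classify("'-'")` reports a literal, which is the situation described in the message. A value such as `-1` also matches no single token here; in the real grammar it is reached through the unary minus production, which this sketch does not model.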






Re: character

2002-10-06 Thread J.Pietschmann

Peter B. West wrote:
 Just for curiosity: what should happen if the following snippet
 is used:
  <fo:page-sequence master-reference="font-size" font-size="20pt">
    <fo:flow font-size="from-parent(from-parent('master-reference'))"/>
 
 
 This looks OK.

I see potential for an Obfuscated FO Code Contest :-)

J.Pietschmann






RE: character

2002-10-06 Thread Arved Sandstrom

 -Original Message-
 From: J.Pietschmann [mailto:[EMAIL PROTECTED]]
 Sent: October 6, 2002 12:00 PM
 To: [EMAIL PROTECTED]
 Subject: Re: character


 Arved Sandstrom wrote:
  Why is character="-" a parsing error? The XML Recommendation has at
  least one example of an attribute value that contains a hyphen.

 This comes from assuming that every unquoted sequence of characters which
 is not a number, measurement or a color has to be interpreted as NCName,
 as the grammar suggests, and IIRC an NCName must not start with a hyphen.
 This means
   hyphenation-char="-"
 can't parse as number, can't parse as string, can't parse as color, can't
 parse as NCName -> parsing error.

Hi Joerg

Can you cite the specific productions that lead to this conclusion? I am not
saying that you are wrong but I can't find it.

I must be tired. ;-) I just looked at the XML 1.1 production for AttValue
which is

AttValue ::= '"' ([^<&"] | Reference)* '"'
          |  "'" ([^<&'] | Reference)* "'"

and I see a prohibition here on using a literal '<' or '&' in the attribute
value, anywhere. But I see nothing about '-'.

If the grammar of the recommendations leads to the conclusion that

character="-"

is not OK, then this just simply offends my common sense.

Arved






Re: character

2002-10-06 Thread J.Pietschmann

Arved Sandstrom wrote:
 Can you cite the specific productions that lead to this conclusion? I am not
 saying that you are wrong but I can't find it.
 
 I must be tired. ;-) I just looked at the XML 1.1 production for AttValue
 which is

Don't look at XML AttValue, look at the XSLFO property expression language.
Somehow it is implicit that all attributes in a XSLFO document are parsed
as expressions which are defined in 5.9 Expressions. Refer specifically
to 5.9.3 Basics. A single hyphen is not a valid expression according to
the XSLFO expression grammar.
Maybe some fallbacks are implicit somewhere, I don't know.

J.Pietschmann






RE: character

2002-10-06 Thread Arved Sandstrom

 -Original Message-
 From: J.Pietschmann [mailto:[EMAIL PROTECTED]]
 Sent: October 6, 2002 12:39 PM
 To: [EMAIL PROTECTED]
 Subject: Re: character

 Arved Sandstrom wrote:
  Can you cite the specific productions that lead to this
 conclusion? I am not
  saying that you are wrong but I can't find it.
 
  I must be tired. ;-) I just looked at the XML 1.1 production
 for AttValue
  which is

 Don't look at XML AttValue, look at the XSLFO property expression
 language.
 Somehow it is implicit that all attributes in a XSLFO document are parsed
 as expressions which are defined in 5.9 Expressions. Refer specifically
 to 5.9.3 Basics. A single hyphen is not a valid expression according to
 the XSLFO expression grammar.
 Maybe some fallbacks are implicit somewhere, I don't know.

An Expr can be a Literal, the production for which is

'"' [^"]* '"'
| "'" [^']* "'"

If I look at the first alternative,

'"' [^"]* '"'

it seems to me that I have pretty considerable leeway, and "-" isn't ruled
out at all.

Arved






Re: character

2002-10-06 Thread J.Pietschmann

Arved Sandstrom wrote:
 An Expr can be a Literal, the production for which is
 
 '"' [^"]* '"'
 | "'" [^']* "'"
 
 If I look at the first alternative,
 
 '"' [^"]* '"'
 
 it seems to me that I have pretty considerable leeway, and "-" isn't ruled
 out at all.

Erm, the expression is supposed to be inside the XML attribute quotes,
for example hyphenation-char="'-'" would be ok (literal, second
production), but hyphenation-char="-" does not match the literal
production, nor any other (except operator). Unless I missed
something, of course.

J.Pietschmann






RE: character

2002-10-06 Thread Arved Sandstrom

 -Original Message-
 From: J.Pietschmann [mailto:[EMAIL PROTECTED]]
 Sent: October 6, 2002 1:29 PM
 To: [EMAIL PROTECTED]
 Subject: Re: character
 
 
 Arved Sandstrom wrote:
  An Expr can be a Literal, the production for which is
  
  '"' [^"]* '"'
  | "'" [^']* "'"
  
  If I look at the first alternative,
  
  '"' [^"]* '"'
  
  it seems to me that I have pretty considerable leeway, and "-" isn't
  ruled out at all.
 
 Erm, the expression is supposed to be inside the XML attribute quotes,
 for example hyphenation-char="'-'" would be ok (literal, second
 production), but hyphenation-char="-" does not match the literal
 production, nor any other (except operator). Unless I missed
 something, of course.

And unless _I_ am missing something, "-" precisely matches that production.

You are looking at

"'" [^']* "'"

but I am looking at

'"' [^"]* '"'

According to the latter I can absolutely do "-".

Arved






Re: character

2002-10-06 Thread J.Pietschmann

Arved Sandstrom wrote:
 And unless _I_ am missing something, "-" precisely matches that production.
 
 You are looking at
 
 "'" [^']* "'"
 
 but I am looking at
 
 '"' [^"]* '"'
 
 According to the latter I can absolutely do "-".

Well, in
   hyphenation-char="-"
the hyphen is the expression, not the hyphen surrounded by double
quotes. As I said, unless I'm missing something, the FO property
expression is the value of the XML attribute, which in turn is the
hyphen, because the double quotes are part of the XML syntax and
are stripped by the XML parser. The XSLFO property expression parser
gets the hyphen, without any quotes, double or single. And without
the quotes, it does not match either of the two productions for literal.
This is the problem here.

Perhaps I should have written that
   hyphenation-char="'-'"
and
   hyphenation-char='"-"'
as well as
   hyphenation-char='&quot;-&quot;'
are legal, while neither
   hyphenation-char='-'
nor
   hyphenation-char="-"
are ok.

J.Pietschmann
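
The two layers described above (XML attribute quoting first, then the property expression) can be observed with any XML parser. The following is a small sketch using Python's standard `xml.etree.ElementTree`; `expression_text`, `is_literal` and `SAMPLES` are names invented for the example, the `fo:block` wrapper is only there to carry the attribute, and `hyphenation-char` is the attribute name as written in this thread.

```python
import re
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"

# The five attribute spellings from the message above, as they would
# appear in the XML source.
SAMPLES = [
    'hyphenation-char="-"',
    '''hyphenation-char="'-'"''',
    """hyphenation-char='"-"'""",
    "hyphenation-char='&quot;-&quot;'",
    "hyphenation-char='-'",
]

# Literal ::= '"' [^"]* '"' | "'" [^']* "'"
LITERAL = re.compile(r"\"[^\"]*\"|'[^']*'")


def expression_text(attribute_markup):
    """Return the attribute value after the XML parser has removed the
    attribute delimiters and resolved entity references."""
    document = '<fo:block xmlns:fo="%s" %s/>' % (FO_NS, attribute_markup)
    return ET.fromstring(document).get("hyphenation-char")


def is_literal(expression):
    """Check the whole expression against the Literal production."""
    return LITERAL.fullmatch(expression) is not None
```

Running `expression_text` over `SAMPLES` shows that the first and last spellings both hand a lone hyphen to the expression layer, so neither can match Literal, while the three middle spellings arrive with their inner quotes intact.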






RE: character

2002-10-06 Thread Arved Sandstrom

 -Original Message-
 From: J.Pietschmann [mailto:[EMAIL PROTECTED]]
 Sent: October 6, 2002 2:15 PM
 To: [EMAIL PROTECTED]
 Subject: Re: character


 Arved Sandstrom wrote:
  And unless _I_ am missing something, "-" precisely matches that
  production.
 
  You are looking at
 
  "'" [^']* "'"
 
  but I am looking at
 
  '"' [^"]* '"'
 
  According to the latter I can absolutely do "-".

 Well, in
    hyphenation-char="-"
 the hyphen is the expression, not the hyphen surrounded by double
 quotes. As I said, unless I'm missing something, the FO property
 expression is the value of the XML attribute, which in turn is the
 hyphen, because the double quotes are part of the XML syntax and
 are stripped by the XML parser. The XSLFO property expression parser
 gets the hyphen, without any quotes, double or single. And without
 the quotes, it does not match either of the two productions for literal.
 This is the problem here.

 Perhaps I should have written that
    hyphenation-char="'-'"
 and
    hyphenation-char='"-"'
 as well as
    hyphenation-char='&quot;-&quot;'
 are legal, while neither
    hyphenation-char='-'
 nor
    hyphenation-char="-"
 are ok.

Yes, I see your point.

I think they screwed up the grammar. As I stated before, I find it ludicrous
that character="-" would not be OK.

Arved






Re: character

2002-10-06 Thread J.Pietschmann

Arved Sandstrom wrote:
 I think they screwed up the grammar.
Me too. However, I think it would be really hard to press
something which is intuitive, consistent as well as easy to
parse into a single grammar for all XSLFO properties. It seems
they fell into the same trap as the C preprocessor guys did, which
is intuitive and easy to implement for the most part, but had
this abominable 0xe-12 problem as well as the rather unintuitive
argument prescanning hidden in its dark corners.

 As I stated before, I find it ludicrous
 that character="-" would not be OK.
That's ok, but it would require some extensions to the whole
property handling.


J.Pietschmann







Re: character

2002-09-30 Thread jaccoud

Tony Graham [EMAIL PROTECTED] wrote:
[EMAIL PROTECTED] wrote at 27 Sep 2002 16:44:32 -0300:
  Out of the XML Recommendation, section 2.2:
  
  A character is an atomic unit of text as specified by ISO/IEC 
  10646 [ISO10646]. Legal characters are tab,
  carriage return, line feed, and the legal graphic characters of Unicode 
  and ISO/IEC 10646.

XML 1.0 Second Edition removed graphic (which I always found
confusing but which is good ISO-speak).

  or, more clearly:
  
  Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | 
  [#x10000-#x10FFFF]
  /* any Unicode character, excluding the surrogate blocks, FFFE, 
  and FFFF. */
  
  
  That means "-", "&#12235;", etc. are characters, while "'1'" is not. 

&#12235; is a character reference. '#12235' is how you talk about a
character's code point, although the hexadecimal representation is
usually preferable.

In XSL terms, '1' is a one-character string literal, but while you
could claim that it is one character, there's no XSL conversion from a
string to a character, so <fo:character character="'1'"/> should fail.


You are correct. What I tried to point out is that '1' IS a string that HAS one character. Whoever claims it is one character is IMHO seriously misled. "1", on the other hand, IS a character. The concept of a character, in the XML syntax definition, is that of the symbols allowed in the grammar, the most elementary lexical unit. The term "string" is not formally defined in the recommendation (sadly), but it is used throughout the text meaning "sequence of characters". A string _type_ is defined for attributes, and consists of a quoted literal string (i.e. a sequence of characters delimited by quotes). From the XML point of view, '1' is nothing but a three-character string. And a three-character string is not a character. 

The XSL recommendation defines a string datatype that has a rather different scope from the 'literal string' and 'string attribute type' defined in the XML spec. But it defines no 'character' datatype, so I think there is no other option but to assume it means the XML definition of a character. So, <fo:character character="1"/> is correct, while, as you said, <fo:character character="'1'"/> should fail.

This leaves us with a problem, however: since the character datatype is not defined, there is also no conversion rule which results in a character. You cannot store a character in an xsl:variable because there is no way to specify or retrieve it; variables know only about strings. I find that very disturbing, because it hampers stylesheet coding, in that we cannot specify characters indirectly or do any work with them. I think this should be reported to the editors of XSLT 2.0 so they can provide a clear way out.


=
Marcelo Jaccoud Amaral
Petrobrás (http://www.petrobras.com.br)
mailto:[EMAIL PROTECTED]
voice: +55 21 2534-3485
fax: +55 21 2534-1809
=



Re: character

2002-09-30 Thread Peter B. West

Tony Graham wrote:
 [EMAIL PROTECTED] wrote at 27 Sep 2002 16:44:32 -0300:
   Out of the XML Recommendation, section 2.2:
   
   A character is an atomic unit of text as specified by ISO/IEC 
   10646 [ISO10646]. Legal characters are tab,
   carriage return, line feed, and the legal graphic characters of Unicode 
   and ISO/IEC 10646.
 
 XML 1.0 Second Edition removed graphic (which I always found
 confusing but which is good ISO-speak).
 
   or, more clearly:
   
   Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | 
   [#x10000-#x10FFFF]
   /* any Unicode character, excluding the surrogate blocks, FFFE, 
   and FFFF. */
   
   
   That means "-", "&#12235;", etc. are characters, while "'1'" is not. 
 
 &#12235; is a character reference.  '#12235' is how you talk about a
 character's code point, although the hexadecimal representation is
 usually preferable.
 
 In XSL terms, '1' is a one-character string literal, but while you
 could claim that it is one character, there's no XSL conversion from a
 string to a character, so <fo:character character="'1'"/> should fail.

Tony,

I don't think this gets us out of difficulty.  A casual inspection 
reveals no conversion, either, from an NCName to a character.  So an 
attribute value assignment of
  a
will, I think, parse (in the parser implied by the grammar of XSL 
expressions) as an NCName (whereas
  -
will parse as an unadorned MINUS sign.)  So how do I represent a character?

Furthermore, Section 5.11 has
<q>
<character>
 A single Unicode character.
<string>
 A sequence of characters.
</q>

If an attribute value assignment of
   'a sequence of characters'
assigns a sequence of characters, then
   'a'
must assign a sequence of one character.

What's the difference between a single Unicode character and a 
sequence of one character?  Well, one is a sequence, and therefore a 
string, and there's no XSL conversion, etc.

So how do I represent a character?

To me, the cleanest, least ambiguous way is to represent a character 
attribute assignment value with 'character' - a string literal of 
length 1.

Peter
-- 
Peter B. West  [EMAIL PROTECTED]  http://www.powerup.com.au/~pbwest/
Lord, to whom shall we go?






Re: character

2002-09-30 Thread Tony Graham

Peter B. West wrote at 30 Sep 2002 13:28:18 +1000:
  Tony Graham wrote:
   [EMAIL PROTECTED] wrote at 27 Sep 2002 16:44:32 -0300:
...
     That means "-", "&#12235;", etc. are characters, while "'1'" is not. 
   
   &#12235; is a character reference.  '#12235' is how you talk about a
   character's code point, although the hexadecimal representation is
   usually preferable.
   
   In XSL terms, '1' is a one-character string literal, but while you
   could claim that it is one character, there's no XSL conversion from a
   string to a character, so <fo:character character="'1'"/> should fail.
  
  Tony,
  
  I don't think this gets us out of difficulty.  A casual inspection 

Forgive me, but I wasn't trying to get anybody out of any difficulty,
I was just trying to keep the terminology accurate.

...
  So how do I represent a character?
  
  To me, the cleanest, least ambiguous way is to represent a character 
  attribute assignment value with 'character' - a string literal of 
  length 1.

Except that you know that that's not specified among the allowed
conversions.

The interesting thing is that 'character' doesn't appear in the
productions in Section 5.9, Expressions, of the XSL Recommendation.
Now there's a question for [EMAIL PROTECTED]!

I think that you represent a character as a single character, e.g.,
character="c", or as a numeric character reference, e.g.,
character="&#xA;".

Regards,


Tony Graham

XML Technology Center - Dublin            mailto:[EMAIL PROTECTED]
Sun Microsystems Ireland Ltd                 Phone: +353 1 8199708
Hamilton House, East Point Business Park, Dublin 3      x(70)19708





RE: character

2002-09-30 Thread Arved Sandstrom

 -Original Message-
 From: Tony Graham [mailto:[EMAIL PROTECTED]]
 Sent: September 30, 2002 10:09 AM
 To: [EMAIL PROTECTED]
 Subject: Re: character

 Peter B. West wrote at 30 Sep 2002 13:28:18 +1000:
   Tony Graham wrote:
[EMAIL PROTECTED] wrote at 27 Sep 2002 16:44:32 -0300:
 ...
   That means "-", "&#12235;", etc. are characters, while "'1'" is not.
   
   &#12235; is a character reference.  '#12235' is how you talk about a
   character's code point, although the hexadecimal representation is
   usually preferable.
   
   In XSL terms, '1' is a one-character string literal, but while you
   could claim that it is one character, there's no XSL conversion from a
   string to a character, so <fo:character character="'1'"/> should fail.
  
   Tony,
  
   I don't think this gets us out of difficulty.  A casual inspection

 Forgive me, but I wasn't trying to get anybody out of any difficulty,
 I was just trying to keep the terminology accurate.

 ...
   So how do I represent a character?
  
   To me, the cleanest, least ambiguous way is to represent a character
   attribute assignment value with 'character' - a string literal of
   length 1.

 Except that you know that that's not specified among the allowed
 conversions.

 The interesting thing is that 'character' doesn't appear in the
 productions in Section 5.9, Expressions, of the XSL Recommendation.
 Now there's a question for [EMAIL PROTECTED]!

 I think that you represent a character as a single character, e.g.,
 character="c", or as a numeric character reference, e.g.,
 character="&#xA;".

I agree with this last, after having digested everything.

Point is well taken that we have some points to nitpick with xsl-editors,
mostly about disambiguating some of the language.

Arved Sandstrom






Re: character

2002-09-30 Thread Peter B. West

Arved Sandstrom wrote:
-Original Message-
From: Tony Graham [mailto:[EMAIL PROTECTED]]


Peter B. West wrote at 30 Sep 2002 13:28:18 +1000:
  Tony Graham wrote:
   [EMAIL PROTECTED] wrote at 27 Sep 2002 16:44:32 -0300:
...
   That means "-", "&#12235;", etc. are characters, while "'1'" is not.
  
   &#12235; is a character reference.  '#12235' is how you talk about a
   character's code point, although the hexadecimal representation is
   usually preferable.
  
   In XSL terms, '1' is a one-character string literal, but while you
   could claim that it is one character, there's no XSL conversion from a
   string to a character, so <fo:character character="'1'"/> should fail.
 
  Tony,
 
  I don't think this gets us out of difficulty.  A casual inspection

Forgive me, but I wasn't trying to get anybody out of any difficulty,
I was just trying to keep the terminology accurate.

...
  So how do I represent a character?
 
  To me, the cleanest, least ambiguous way is to represent a character
  attribute assignment value with 'character' - a string literal of
  length 1.

Except that you know that that's not specified among the allowed
conversions.

The interesting thing is that 'character' doesn't appear in the
productions in Section 5.9, Expressions, of the XSL Recommendation.
Now there's a question for [EMAIL PROTECTED]!

I think that you represent a character as a single character, e.g.,
character="c", or as a numeric character reference, e.g.,
character="&#xA;".
 
 
 I agree with this last, after having digested everything.
 
 Point is well taken that we have some points to nitpick with xsl-editors,
 mostly about disambiguating some of the language.

Arved,

Help me here. I must be missing something.  What is it that you agree 
with?  That the spec, as worded, leaves us with
  character="c"
or
  character="&#x63;"
which amounts to the same thing?

If so, fair enough.  Do you also agree that c is an NCName?  And that
  character="-"
is a parsing error?

As far as I can see, the only immediate ways forward are to descend into 
the mire of context dependent parsing (which the editors have recently 
formally decided that we must do in respect of format) or apply our 
own disambiguating condition.  How are you intending to implement 
character?

Peter
-- 
Peter B. West  [EMAIL PROTECTED]  http://www.powerup.com.au/~pbwest/
Lord, to whom shall we go?






RE: character

2002-09-27 Thread Tony Graham

Arved Sandstrom wrote at 26 Sep 2002 19:50:01 -0300:
  Tony Graham says that character should be a Unicode character, or Char. As
  in the actual real, encoded thing.

Empirical evidence suggests that is the general understanding:
grepping the XSL CR test suite shows everybody, FOP included, using
literal characters.

  Problem being, one property with a character datatype is defined in XSLT,
  which actually says that it's a Char. hyphenation-separator merely says
  that it's a specification of a Unicode character. I guess that could be
  interpreted the same way.
  
  But character for the character property says _code point_. And that is
  an integer value.

Section 5.11, Property Datatypes, trumps the individual property
definitions, since Section 5.11 defines the syntax for specifying the
datatypes usable in property values.  It says "A single Unicode
character."

Now, the interesting if so far theoretical case is what do you do if
you want a hyphenation-separator character that you can only represent
in Unicode as the combination of a base character and one or more
combining marks?  What if your precomposed character gets normalised
to a base character and a combining mark before the XSL processor sees
it?
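
The normalisation question can be made concrete with a short sketch using Python's standard `unicodedata` module. The sample characters are chosen only for illustration, and `code_points` is a hypothetical helper, not part of any XSL processor.

```python
import unicodedata


def code_points(text):
    """List the code points of a string in U+XXXX notation."""
    return ["U+%04X" % ord(ch) for ch in text]


# A precomposed character that canonical decomposition (NFD) splits
# into a base character plus a combining mark.
precomposed = "\u00E9"  # LATIN SMALL LETTER E WITH ACUTE
decomposed = unicodedata.normalize("NFD", precomposed)

# A base character plus combining tilde for which Unicode defines no
# precomposed form, so even NFC leaves it as two code points.
g_tilde = unicodedata.normalize("NFC", "g\u0303")
```

After decomposition the value is two code points long, so a processor that insists on "a single Unicode character" would reject it even though the author typed one character. The second case can never be squeezed into one code point at all.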

  So IMO the spec is currently very vague on this.

Then write to [EMAIL PROTECTED] asking for a clarification.

Regards,


Tony Graham

XML Technology Center - Dublin            mailto:[EMAIL PROTECTED]
Sun Microsystems Ireland Ltd                 Phone: +353 1 8199708
Hamilton House, East Point Business Park, Dublin 3      x(70)19708





Re: character

2002-09-27 Thread Tony Graham

Peter B. West wrote at 28 Sep 2002 00:39:34 +1000:
...
  Tony Graham wrote:
...
   Section 5.11, Property Datatypes, trumps the individual property
   definitions, since Section 5.11 defines the syntax for specifying the
   datatypes usable in property values.  It says "A single Unicode
   character."
  
  Ok, so it's a character.  How, then, is it represented?  Is it also a 
  string (of length one), or is it just a literal (length 1), or just an 
  NCName (length 1), or is it something else?  What does it look like, and 
  how is the parser going to handle it?

A character is a character, and you should go to XML 1.0 for the
definition of a character.

Also, "parser" is ambiguous in this context as well as having no XML
or XSL meaning.  XML defines an "XML processor", which is often called a
"parser" for historical reasons, and the XSL Recommendation uses
"parse" without designating a thing called a "parser".

  ...
  
 So IMO the spec is currently very vague on this.
   
   Then write to [EMAIL PROTECTED] asking for a clarification.
  
  Nice dry wit you have Tony.

That was a serious suggestion.  You do get an answer eventually, even
if you don't like the answer.

Regards,


Tony Graham

XML Technology Center - Dublin            mailto:[EMAIL PROTECTED]
Sun Microsystems Ireland Ltd                 Phone: +353 1 8199708
Hamilton House, East Point Business Park, Dublin 3      x(70)19708





Re: character

2002-09-27 Thread Tony Graham

[EMAIL PROTECTED] wrote at 27 Sep 2002 16:44:32 -0300:
  Out of the XML Recommendation, section 2.2:
  
  A character is an atomic unit of text as specified by ISO/IEC 
  10646 [ISO10646]. Legal characters are tab,
  carriage return, line feed, and the legal graphic characters of Unicode 
  and ISO/IEC 10646.

XML 1.0 Second Edition removed graphic (which I always found
confusing but which is good ISO-speak).

  or, more clearly:
  
  Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | 
  [#x10000-#x10FFFF]
  /* any Unicode character, excluding the surrogate blocks, FFFE, 
  and FFFF. */
  
  
  That means "-", "&#12235;", etc. are characters, while "'1'" is not. 

&#12235; is a character reference.  '#12235' is how you talk about a
character's code point, although the hexadecimal representation is
usually preferable.
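
The distinction between a character reference, the character it denotes and that character's code point can be checked with a few lines of Python. This is only a sketch: the element name `c` and the helper `attribute_value` are invented for the example.

```python
import xml.etree.ElementTree as ET


def attribute_value(markup, name):
    """Return an attribute value as delivered by the XML parser."""
    return ET.fromstring(markup).get(name)


# The same character written as a decimal and as a hexadecimal
# numeric character reference (12235 decimal is 2FCB hexadecimal).
by_decimal_ref = attribute_value('<c character="&#12235;"/>', "character")
by_hex_ref = attribute_value('<c character="&#x2FCB;"/>', "character")

# A quoted '1' inside the attribute: the parser delivers three characters.
quoted_one = attribute_value("""<c character="'1'"/>""", "character")
```

Both references resolve to the same one-character string whose code point is 12235, whereas the quoted `'1'` reaches the application as a three-character string.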

In XSL terms, '1' is a one-character string literal, but while you
could claim that it is one character, there's no XSL conversion from a
string to a character, so <fo:character character="'1'"/> should fail.

Regards,


Tony Graham

XML Technology Center - Dublin            mailto:[EMAIL PROTECTED]
Sun Microsystems Ireland Ltd                 Phone: +353 1 8199708
Hamilton House, East Point Business Park, Dublin 3      x(70)19708





Re: character

2002-09-26 Thread J.Pietschmann

Peter B. West wrote:
 Fopdevs,
 
 Any comments on the representation and parsing of character type 
 attributes would be gratefully received.

According to 5.11 Property Datatypes, the value is a single
Unicode character. I believe the representation is an
unceremonial single Unicode character, or an NCName whose
string representation has the length 1. I'd parse such
attributes as an expression resulting in a string, and
bomb if the string is longer than 1.
This would accept
  character="'a'"
  character="1 + 1"
  character="from-parent('font-size') - 12"
which may upset purists, or not.
An alternative would be to use a custom parser, which accepts
either a single character (NCName of length 1) or any of the
functions inherited-property-value(NCName),
from-parent(NCName), from-nearest-specified-value(NCName)
and from-table-column(NCName)
(might even make a bit of sense for hyphenation-char and
for fo:character's character in very, very strange cases).
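
A rough sketch of the "custom parser" alternative, in Python rather
than FOP's Java and purely for illustration. The name
parse_character_attr, the tuple return shape and the ASCII-only NCName
pattern are all assumptions of this sketch, not anything taken from
FOP or the spec.

```python
import re

# Simplified, ASCII-only stand-in for NCName; the real production is wider.
_NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"

# The four property-value functions named in the message above.
_FUNCTIONS = (
    "inherited-property-value",
    "from-parent",
    "from-nearest-specified-value",
    "from-table-column",
)

# Leniently accepts the argument with or without single quotes.
_CALL = re.compile(
    r"^\s*(" + "|".join(_FUNCTIONS) + r")\(\s*'?(" + _NCNAME + r")'?\s*\)\s*$"
)


def parse_character_attr(value: str):
    """Classify an attribute value as a single character or a function call."""
    if len(value) == 1:
        return ("char", value)
    match = _CALL.match(value)
    if match:
        return ("function", match.group(1), match.group(2))
    raise ValueError(f"not a single character or allowed function: {value!r}")


if __name__ == "__main__":
    for sample in ("a", "from-parent('hyphenation-character')", "1 + 1"):
        try:
            print(sample, "->", parse_character_attr(sample))
        except ValueError as err:
            print(sample, "-> rejected:", err)
```

Unlike the expression-based approach, this rejects "1 + 1" outright
instead of evaluating it and checking the length of the result.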

Just out of curiosity: what should happen if the following snippet
is used:
  <fo:page-sequence master-reference="font-size" font-size="20pt">
    <fo:flow font-size="from-parent(from-parent('master-reference'))"/>

J.Pietschmann






RE: character

2002-09-26 Thread Arved Sandstrom

 -Original Message-
 From: Peter B. West [mailto:[EMAIL PROTECTED]]
 Sent: September 26, 2002 11:41 AM
 To: fop-dev
 Subject: character

 Fopdevs,

 Any comments on the representation and parsing of character type
 attributes would be gratefully received.

This came up on www-xsl-fo, because Eric Bischoff and myself had the same
question.

Tony Graham says that character should be a Unicode character, or Char. As
in the actual real, encoded thing.

Problem being, one property with a character datatype is defined in XSLT,
which actually says that it's a Char. hyphenation-separator merely says
that it's a specification of a Unicode character. I guess that could be
interpreted the same way.

But character for the character property says _code point_. And that is
an integer value.

So IMO the spec is currently very vague on this.
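
To make the character versus code point distinction concrete, a small
Python illustration (the helper names are invented for this note and
imply nothing about which reading the spec intends):

```python
def to_code_point(char: str) -> int:
    """Map a single character to its integer Unicode code point."""
    if len(char) != 1:
        raise ValueError("expected exactly one character")
    return ord(char)


def from_code_point(code_point: int) -> str:
    """Map an integer code point back to the character it identifies."""
    return chr(code_point)


if __name__ == "__main__":
    # The character '1' is code point 49 (0x31), not the integer 1.
    cp = to_code_point("1")
    print(cp, hex(cp), repr(from_code_point(cp)))
```

So an attribute holding the character itself and one holding its code
point would carry different values, which is why the wording matters.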

Regards,
Arved






Re: character

2002-09-26 Thread Peter B. West

Arved,

Thanks for this.  I vaguely remembered some discussion about this, but I 
went looking in the xsl-editors archive.  That _code point_ had me 
puzzled as well.  I'll be interested in some feedback on this from the 
editors.  See also my response to Joerg.

Peter

Arved Sandstrom wrote:
From: Peter B. West [mailto:[EMAIL PROTECTED]]

Fopdevs,

Any comments on the representation and parsing of character type
attributes would be gratefully received.
 
 
 This came up on www-xsl-fo, because Eric Bischoff and myself had the same
 question.
 
 Tony Graham says that character should be a Unicode character, or Char. As
 in the actual real, encoded thing.
 
 Problem being, one property with a character datatype is defined in XSLT,
 which actually says that it's a Char. hyphenation-separator merely says
 that it's a specification of a Unicode character. I guess that could be
 interpreted the same way.
 
 But character for the character property says _code point_. And that is
 an integer value.
 
 So IMO the spec is currently very vague on this.

-- 
Peter B. West  [EMAIL PROTECTED]  http://www.powerup.com.au/~pbwest/
Lord, to whom shall we go?






Re: Character Encoding

2002-07-10 Thread Holger Prause


- Original Message -
From: J.Pietschmann [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Tuesday, July 09, 2002 9:58 PM
Subject: Re: Character Encoding


 Holger Prause wrote:
  I use the character sequence &#8722; in an HTML page and it is displayed
  as a - (minus sign).
 
  So far so good. Now I want to use that character sequence in FO, but in the
 ^ ^ ^ ^ ^ ^ ^
 It is a character reference
Yes, you are right.

  generated PDF it is displayed as a # sign (which stands for undefined?)

 This means the selected font does not have a glyph for it.

OK, I understand that; it's also written in the FOP FAQ.


  What can I do to display this character sequence? Change the encoding in
  the stylesheet (or use <xsl:output/>)?

 The only way is to get a font with a glyph for it and let
 FOP use it. The mathematical minus is pretty esoteric,
 you'll probably need a special math font, rummage through
 implementations for MathML or TeX distributions.
 Why can't you use a dash or hyphen?
What I wanted was a dash, but for some reason I chose the character
reference &#8722;, which is, as you already said, a mathematical minus.
Now I use the character reference for a dash, and it works fine with my
font.
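
For reference, a quick way to see which character a numeric reference
actually names, using only the Python standard library. The helper
name describe_reference is invented here, and the hyphen-minus line is
just a comparison point, not necessarily the reference I ended up
using.

```python
import html
import unicodedata


def describe_reference(reference: str) -> str:
    """Resolve a numeric character reference and name the resulting character."""
    char = html.unescape(reference)
    if len(char) != 1:
        raise ValueError(f"not a single character reference: {reference!r}")
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


if __name__ == "__main__":
    print(describe_reference("&#8722;"))  # the mathematical minus
    print(describe_reference("&#45;"))    # plain hyphen-minus, for comparison
```

Whether the character then shows up in the PDF still depends on the
selected font having a glyph for it.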

Thx for the quick response,

Bye,

Holger


 J.Pietschmann













Re: Character encoding on other platforms (previously os/390)

2002-05-16 Thread Christian Geisert

[EMAIL PROTECTED] schrieb:
 I've had a couple folks ask me for the modified code so the proper character
 encoding is returned on the toString().getBytes()
 is US-ASCII. This is cool that other people besides me need this.

[..]

 I downloaded this snapshot xml-fop_20020515162132 and I don't see any
 modification to the code.
 Is this change going to be incorporated or has been incorporated in a way
 that I missed?

It hasn't been incorporated yet but it's on my todo list and should
be in the next maintenance release.
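
For readers who have not followed the earlier thread: the actual patch
is not shown here, so the following is only a guess at the kind of
problem involved, sketched in Python rather than Java. In Java,
String.getBytes() with no argument uses the platform default charset,
which on an EBCDIC platform is not ASCII. The sample string and the
choice of cp037 as the EBCDIC code page are assumptions.

```python
text = "%PDF-1.3"  # hypothetical sample of ASCII-only output

# Explicit encoding: the same bytes on every platform.
ascii_bytes = text.encode("ascii")

# What a platform-default conversion might yield on an EBCDIC system
# (cp037 is one common EBCDIC code page, chosen here as an assumption).
ebcdic_bytes = text.encode("cp037")

if __name__ == "__main__":
    print(ascii_bytes.hex())
    print(ebcdic_bytes.hex())
    print("identical:", ascii_bytes == ebcdic_bytes)
```

Naming the encoding explicitly (getBytes("US-ASCII") in Java) avoids
depending on the platform default.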

 Many Thanks,
 Jason West 

Christian

