Re: atom, xslt processors (Re: atom:type, xsl:output)

2005-05-30 Thread James Cerra

Aristotle,

> > > Why do you think it is appropriate to have a random PI
> > > processor in between the XML processor and the Atom
> > > processor?
> > 
> > Because the XML is just the grammar serializing the ontology,
> > or data structure, of interest. I see Atom processors as
> > normally the Bridge between the XML document and the particular
> > CMS, IA, or web browser application's internal structure. So
> > an Atom processor would be the last stage before the data is
> > internalized. The intent of _Processing_ Instructions seems to
> > change how the document is processed. That implies to me that
> > they would probably be interpreted between the XML processor
> > and the particular grammar's processor.
> 
> Atom is a transport envelope for potentially a multitude of
> documents. Saying that the enveloped documents should be
> preprocessed before they are unfolded from the envelope is like
> saying that a part of a multipart/mime document which has a
> “Content-Encoding: gzip” header should be uncompressed in situ
> *before* it is extracted from the multipart envelope.

Note that by "change how the document is processed" I mean the Atom document -
i.e. the envelope - not the content of atom:content - the enveloped document. 
Sorry for my imprecise language.  So PIs change how the Atom document is
processed.  Could you give an example of how they would be clearly associated
with the enveloped document (which counters this argument)?



--
Jimmy Cerra
https://nemo.dev.java.net




Re: atom:type, xsl:output

2005-05-30 Thread James Cerra

Aristotle,

> > > >> Depending on an entity reference and not being able to
> > > >> accept the straight replacement text is just wrong.
> > > >
> > > > I agree.  I'm just bringing up possible incompatibilities
> > > > for debate!
> > > 
> > > I don't think that's an incompatibility that deserves
> > > catering for.
> > 
> > Why not?  Why be incompatible for the sake of incompatibility?
> > That's just asking for trouble when Atom is used in ways not
> > originally envisioned (and there always are).
> 
> That’s bogus. Some software generates or processes non-wellformed
> XML. Should Atom cater to that too? If not, where do you draw the
> line between somewhat-broken software that Atom should support
> and really-broken software that it should not?

Depends on the situation.  :-)

I think that Atom should support specifying doctypes since they're in the XML
spec and are well-formed constructs (even if external subsets are not read).
I'd like Atom to allow _any_ well-formed XML document as atom:content
(referenced by @src at least).

> > HTML currently uses doctypes for switching between versions of
> > the language.
> 
> No, it doesn’t. Browsers do so, but only as a hack; the idea is
> that an author who puts a DOCTYPE declaration in their document
> *probably* cares about validity and so his document should
> *probably* be treated as strict standards compliance requires.
> But there is nothing in any spec that justifies this behaviour.

How do you differentiate between XHTML 1.0 Strict and XHTML 1.0 Transitional
without doctypes?

> > There is no other way to properly interpret content without
> > guessing in the current drafts.  That is a worse sin inflicted
> > upon programmers, than allowing the subset that do rendering to
> > do naughty DOCTYPE sniffing.
> 
> You don’t seem to understand XML. XML has something called
> namespaces. Use them.

I understand XML and namespaces.  Let's keep the personal attacks out of this
thread, and not start a flame war, OK?

The problem is that there are specs which use one namespace for several
versions!

> > > > No, any well formed XML processor MUST expand internal
> > > > entities and add replaced attributes as well.  This is NOT
> > > > OPTIONAL.
> > > 
> > > So is expat non-conforming?
> > 
> > If it doesn't, then it is non-conforming.
> 
> Wrong.
> 
> > That's how non-validating XML processors are defined by the
> > spec.  I quote from section 5.1 paragraph 4:
> > 
> > [Definition: While they are not required to check the document
> > for validity, they are REQUIRED to process all the declarations
> > they read in the internal DTD subset and in any parameter
> ^^
> > entity that they read, up to the first reference to a parameter
>  ^^
> > entity that they do not read; that is to say, they MUST use the
>  ^
> > information in those declarations to normalize attribute
> > values, include the replacement text of internal entities, and
> > supply default attribute values.]
> > 
> > There is no room for any other interpretation, I think.
> 
> Nowhere is a non-validating processor REQUIRED to read an
> external DTD in the spec. Only *internal* declarations MUST be
> processed and references resolved.

Um, I was talking about internal declarations!  Non-validating processors are
required to read the external subset declaration (but they don't have to
resolve and parse it) and the internal subset up to any parameter-entity references
they don't read.  So in the following declaration everything must be read
except for anything in the external file:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" [
  
]>

A non-validating processor will report the public and system ids as well as
process the entity declaration.  The declaration of the external subset is
used, in this case, to change the semantics of the XHTML elements (namely,
which version of the standard they conform to).
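
As a rough illustration of the behaviour described above (a sketch, not a statement about what the spec requires of every parser): Python's standard binding to expat, used here without any external-entity handler, reports the DOCTYPE's public and system ids and expands an entity declared in the internal subset, and it never fetches the external DTD. The `greeting` entity and the `inspect` helper are invented for the example.

```python
import xml.parsers.expat

# Hypothetical document: the 'greeting' entity is made up for illustration.
DOC = b"""<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" [
  <!ENTITY greeting "hello">
]>
<html>&greeting; world</html>"""


def inspect(doc: bytes) -> dict:
    """Parse doc with expat and record what the parser reports."""
    seen = {"entities": {}, "text": []}

    def start_doctype(name, system_id, public_id, has_internal_subset):
        # The external subset is only *declared* here; nothing is downloaded.
        seen["doctype"] = (name, public_id, system_id, bool(has_internal_subset))

    def entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        seen["entities"][name] = value

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = start_doctype
    parser.EntityDeclHandler = entity_decl
    # Character data may arrive in several chunks, so collect and join.
    parser.CharacterDataHandler = seen["text"].append
    parser.Parse(doc, True)
    seen["text"] = "".join(seen["text"])
    return seen


result = inspect(DOC)
print(result["doctype"])
print(result["entities"])
print(result["text"])
```

An application sitting on top of the parser could use the reported public id to tell XHTML 1.0 Strict from Transitional, which is the use of the declaration argued for here.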



--
Jimmy Cerra
https://nemo.dev.java.net






Re: atom, xslt processors (Re: atom:type, xsl:output)

2005-05-30 Thread Paul Hoffman


At 4:54 PM +0200 5/30/05, A. Pagaltzis wrote:

> Atom is a transport envelope for potentially a multitude of
> documents. Saying that the enveloped documents should be
> preprocessed before they are unfolded from the envelope is like
> saying that a part of a multipart/mime document which has a
> "Content-Encoding: gzip" header should be uncompressed in situ
> *before* it is extracted from the multipart envelope.
>
> That doesn't make any sense.


+1

--Paul Hoffman, Director
--Internet Mail Consortium



Re: atom, xslt processors (Re: atom:type, xsl:output)

2005-05-30 Thread A. Pagaltzis

* James Cerra <[EMAIL PROTECTED]> [2005-05-30 02:35]:
> For a counter example: For internal use I might create an Atom
> feed from atom entries with <?xml-stylesheet?> attached to
> either the feed or the entries. Furthermore, I could add
> instructions for my stylesheet by defining PIs used in the
> entries. In this case, the PI processor is an xslt processor
> that works before the atom processor. The original contains
> global options and defines the template, while the entries are
> parsed and aggregated into the original file. This way it is
> easier to change the feed's options without editing the XSLT
> source files. The atom processor becomes the last stage in the
> pipe.

So you expect every XML processor to have an XSLT processor
attached so that it will be able to resolve your custom Atom+PIs
superformat to regular Atom?

Regards,
-- 
Aristotle



Re: atom, xslt processors (Re: atom:type, xsl:output)

2005-05-30 Thread A. Pagaltzis

* James Cerra <[EMAIL PROTECTED]> [2005-05-30 02:00]:
> > Why do you think it is appropriate to have a random PI
> > processor in between the XML processor and the Atom
> > processor?
> 
> Because the XML is just the grammar serializing the ontology,
> or data structure, of interest. I see Atom processors as
> normally the Bridge between the XML document and the particular
> CMS, IA, or web browser application's internal structure. So
> an Atom processor would be the last stage before the data is
> internalized. The intent of _Processing_ Instructions seems to
> change how the document is processed. That implies to me that
> they would probably be interpreted between the XML processor
> and the particular grammar's processor.

Atom is a transport envelope for potentially a multitude of
documents. Saying that the enveloped documents should be
preprocessed before they are unfolded from the envelope is like
saying that a part of a multipart/mime document which has a
“Content-Encoding: gzip” header should be uncompressed in situ
*before* it is extracted from the multipart envelope.

That doesn’t make any sense.

Regards,
-- 
Aristotle



Re: atom:type, xsl:output

2005-05-30 Thread A. Pagaltzis

* James Cerra <[EMAIL PROTECTED]> [2005-05-30 03:20]:
> > >> Depending on an entity reference and not being able to
> > >> accept the straight replacement text is just wrong.
> > >
> > > I agree.  I'm just bringing up possible incompatibilities
> > > for debate!
> > 
> > I don't think that's an incompatibility that deserves
> > catering for.
> 
> Why not?  Why be incompatible for the sake of incompatibility?
> That's just asking for trouble when Atom is used in ways not
> originally envisioned (and there always are).

That’s bogus. Some software generates or processes non-wellformed
XML. Should Atom cater to that too? If not, where do you draw the
line between somewhat-broken software that Atom should support
and really-broken software that it should not?

> HTML currently uses doctypes for switching between versions of
> the language.

No, it doesn’t. Browsers do so, but only as a hack; the idea is
that an author who puts a DOCTYPE declaration in their document
*probably* cares about validity and so his document should
*probably* be treated as strict standards compliance requires.
But there is nothing in any spec that justifies this behaviour.

> There is no other way to properly interpret content without
> guessing in the current drafts.  That is a worse sin inflicted
> upon programmers, than allowing the subset that do rendering to
> do naughty DOCTYPE sniffing.

You don’t seem to understand XML. XML has something called
namespaces. Use them.

> > > No, any well formed XML processor MUST expand internal
> > > entities and add replaced attributes as well.  This is NOT
> > > OPTIONAL.
> > 
> > So is expat non-conforming?
> 
> If it doesn't, then it is non-conforming.

Wrong.

> That's how non-validating XML processors are defined by the
> spec.  I quote from section 5.1 paragraph 4:
> 
> [Definition: While they are not required to check the document
> for validity, they are REQUIRED to process all the declarations
> they read in the internal DTD subset and in any parameter
    ^^
> entity that they read, up to the first reference to a parameter
 ^^
> entity that they do not read; that is to say, they MUST use the
 ^
> information in those declarations to normalize attribute
> values, include the replacement text of internal entities, and
> supply default attribute values.]
> 
> There is no room for any other interpretation, I think.

Nowhere is a non-validating processor REQUIRED to read an
external DTD in the spec. Only *internal* declarations MUST be
processed and references resolved.

Regards,
-- 
Aristotle



Re: atom:type, xsl:output

2005-05-29 Thread James Cerra

Henri Sivonen,

> > MSIE conditional comments.  See other person's reply.
> 
> I thought they were for text/html tag soup only. Anyway, I think Atom 
> should not try to enable such bogosity in XML.

I don't think Atom should take a position if it doesn't have to.

> > And yes: I know that Atom is not FTP or HTTP.  I'm looking at the Atom 
> > entry
> > uploader's POV rather than the downloader's perspective.  The Atom API 
> > is a way
> > to simplify managing web sites:
> 
> Is it really? Straight bytes over HTTP PUT seems a lot simpler to me 
> than Base64 in an XML envelope over HTTP PUT.

In this case, the simpler grammar option is a more complex job.  First, Atom
already supports this with every other MIME type including HTML, but not
anything that is XML.  Why?  Second, if your content is a literal document of
*any* type then:

** steps with envelope **
1. PUT doc-with-envelope.atom to http://uri1.com.
2. PUT doc-with-envelope.atom to http://uri2.com with user, pass.

** steps without envelope **
1. PUT doc-referencing-uri1.atom to http://uri1.com.
2. PUT image.png to http://uri1.com/images/image.png.
3. Modify doc-referencing-uri1.atom to reference uri2.
4. PUT doc-referencing-uri2.atom to http://uri2.com with user, pass.
5. PUT image.png to http://uri2.com/_title_of_entry_/image.png with user, pass.

That is, including it in an atom:content element and sending that off to the
server allows you to send that off to another server with ***only changing the
Atom document's PUT uri*** rather than sending the external document according
to each server's implementation (i.e. uri or other means).  Having a standard
way of sending files for a blog - like a picture blog for instance - is really
nice.  Otherwise it is much harder to change servers or atom processors.

> >> Vanilla XML processors don't act on PIs. They report them to the
> >> application--in this case the Atom processor.
> >
> > So Atom Processors MUST be between any XML and PI processor stages?  
> > That
> > sucks!  It means just to display Atom, I can't use a generic XSLT 
> > processor
> > like Saxon or MSIE.
> 
> What PIs do they act on besides the style sheet PI, which must appear 
> in the prolog (so conforming tools should not act on it if it appears 
> within atom:content)?

You are correct about <?xml-stylesheet?> but in general there could be others.
XSLT stylesheets can use them for flexibility, for example.  Better not to make
assumptions about their behavior.

> >> Depending on an entity reference and not being able to accept the
> >> straight replacement text is just wrong.
> >
> > I agree.  I'm just bringing up possible incompatibilities for debate!
> 
> I don't think that's an incompatibility that deserves catering for.

Why not?  Why be incompatible for the sake of incompatibility?  That's just
asking for trouble when Atom is used in ways not originally envisioned (and
there always are).

> >> That's a non-issue. You don't just throw away the DOCTYPE but parse 
> >> the
> >> original document with it and reserialize as a DOCTYPEless fragment.
> >> You don't lose well-formedness or content. You only lose the shallow
> >> attribute data typing provided by the DTD.
> >
> > Well, that data could be valuable.
> 
> IDness is. For practical purposes, the other stuff is not. There are 
> other solutions for IDness.

I don't understand why?

> > With DOCTYPE information associated with embedded content, it becomes 
> > possible to
> > transform entries into valid XML or HTML documents.
> 
> What do you need valid documents for if the validity is just an 
> property internal to the document? That is, the document is valid 
> according to its own rules as opposed to conforming to the rules 
> required by some downstream software.

Valid as in, this is valid HTML 3.2 or that is valid HTML 5 or this is valid
HTML 4.01 transitional.  Right now it is impossible to specify that for
atom:content[@type="html"] content, but that doctype information is essential
for properly interpreting the semantics of the HTML content, and possibly how
to write it out as a web page.  :-(  If you make me guess I'll go mad!  :-)

> > Some apps change their behavior with well-formed and valid modes.
> 
> That seems like a bad idea. Can you give examples?
> 
> > DOCTYPE switching may be evil, but it is currently a necessary evil.
> 
> It is not a necessary evil for XML. I am strongly against any Atom 
> feature motivated by or designed for enabling and/or encouraging 
> doctype sniffing in XML.
> 
> Please see http://hsivonen.iki.fi/doctype/#xml

See reply immediately above.  HTML currently uses doctypes for switching
between versions of the language.  There is no other way to properly interpret
content without guessing in the current drafts.  That is a worse sin inflicted
upon programmers, than allowing the subset that do rendering to do naughty
DOCTYPE sniffing.

> > I see this as low risk (it doesn't break anything and could even be an 
> > OPTIONAL feature)
> 
> O

Re: atom, xslt processors (Re: atom:type, xsl:output)

2005-05-29 Thread James Cerra

Aristotle,

> > (2) RSS and Atom documents are now processed with
> > <?xml-stylesheet?> PIs, and there is no way to formally specify
> > when those instructions should be executed (except for specific
> > instances that don't matter here).  So any <?xml-stylesheet?>
> > nodes, and PI nodes in general, must be assumed to be linked to
> > the document as a whole; I know of no standard granularity
> > mechanism.
> 
> I agree, but I see this as a problem with the xml-stylesheet PI
> rather than as something that Atom must solve. You are right, of
> course, that it would be prudent for the spec to legislate that
> PIs which occur in atom:content with an XML payload should be
> passed thru to the processor for that XML payload.

Note that I was wrong about <?xml-stylesheet?> though I still believe the
property holds for PIs in general.

I think that *all* PIs in the XML document should be considered readable - and
therefore dangerous - before the Atom processor does its magic of informing the
application about the feed's ontology.  Allowing ONLY escaped (either into
strings or with base64 encoding) content in atom:content, or documents linked
via @href, is one way to permit literal, complete XML documents as content (and
not just XML fragments).  This gives them the same treatment as any non-XML
content like images or sounds.  OTOH for XML fragments, I think it would be
nice to allow the content to be a child of atom:content, as currently possible.
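
To make the "escaped into strings" option concrete, here is a small sketch. Assumptions: the RFC 4287 namespace, a placeholder type="text" attribute, and a made-up document with a stylesheet named style.xsl. Because the payload's prolog PI and DOCTYPE travel as escaped character data, the XML processor reading the envelope never sees them as markup, and the complete document comes back unchanged.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"  # assumption: RFC 4287 namespace

# A complete XML document with a prolog PI and a DOCTYPE (names are made up).
COMPLETE_DOC = (
    '<?xml-stylesheet type="text/xsl" href="style.xsl"?>\n'
    '<!DOCTYPE note [<!ENTITY who "world">]>\n'
    "<note>hello &who;</note>"
)


def escape_into_content(document: str) -> str:
    """Carry the document as escaped text inside atom:content."""
    # type="text" is a placeholder; the attribute value is not the point here.
    content = ET.Element(f"{{{ATOM_NS}}}content", {"type": "text"})
    content.text = document  # the serializer escapes <, > and &
    return ET.tostring(content, encoding="unicode")


def recover(serialized: str) -> str:
    """Parse the envelope and return the enclosed document verbatim."""
    return ET.fromstring(serialized).text


wire = escape_into_content(COMPLETE_DOC)
print(wire)
print(recover(wire) == COMPLETE_DOC)
```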

> > Instead, they could be reported to the PI processor before the
> > atom processor. In my mind, any Atom processor (or any
> > language-specific processor) has to sit on top of several other
> > ones:
> > 
> > IO Processor
> > XML Processor
> > PI Processors
> > Atom Processor
> 
> I disagree. The typical setup, IMO, would be
> 
> 1. IO Processor
> 2. XML Processor
> 3. Atom Processor
> 4a. Possibly a processor for that specific XML NS
> 4b. Possibly a PI Processor
> 
> where 4a might or might not itself act on specific PIs.
> 
> F.ex, I would expect that an Atom processor will disregard an
> xml-stylesheet PI attached to a feed as a whole. But it might
> apply an XSL transformation if it is told to process a document
> which is not an Atom feed – under the assumption that this
> transform might generate a valid feed document.

I disagree.  I see any XML application conceptually as a transformer from the
XML processor's trees or events to the internal data structure of the
application.  To use an analogy, I see an XML processor as a lexer and the
grammar's processor as the parser - substituting tokens for XML objects (i.e.
elements with names, attributes, children, base uri, etc.).  XSLT is a case
when the grammar's processor's output are XML objects that can be parsed by
another grammar's processor without having to use an XML processor.  So I see
an Atom processor as taking XML objects and producing an ontology to be
converted into the program's internal data structure.  This is the abstraction
I use; practically the details are usually different (though compatible with
that abstraction).
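
The abstraction can be sketched roughly as follows, assuming the RFC 4287 namespace and a deliberately tiny internal model (the `Feed` and `Entry` classes and the `atom_processor` function are hypothetical names): the XML processor hands over element objects, and the Atom stage turns them into the application's own data structure.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # assumption: RFC 4287 namespace


@dataclass
class Entry:
    title: str
    content_type: str


@dataclass
class Feed:
    title: str
    entries: list = field(default_factory=list)


def atom_processor(root: ET.Element) -> Feed:
    """Turn XML objects (the 'tokens') into the application's structure."""
    feed = Feed(title=root.findtext(f"{ATOM}title", default=""))
    for node in root.iter(f"{ATOM}entry"):
        content = node.find(f"{ATOM}content")
        content_type = content.get("type", "text") if content is not None else "text"
        feed.entries.append(
            Entry(title=node.findtext(f"{ATOM}title", default=""),
                  content_type=content_type)
        )
    return feed


SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry><title>First</title><content type="html">&lt;p&gt;hi&lt;/p&gt;</content></entry>
  <entry><title>Second</title><content type="text">plain</content></entry>
</feed>"""

# ET.fromstring plays the XML processor; atom_processor is the last stage.
feed = atom_processor(ET.fromstring(SAMPLE))
print(feed)
```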

For a counter example: For internal use I might create an Atom feed from atom
entries with <?xml-stylesheet?> attached to either the feed or the entries.
Furthermore, I could add instructions for my stylesheet by defining PIs used in
the entries.  In this case, the PI processor is an xslt processor that works
before the atom processor.  The original contains global options and defines
the template, while the entries are parsed and aggregated into the original
file.  This way it is easier to change the feed's options without editing the
XSLT source files.  The atom processor becomes the last stage in the pipe.



--
Jimmy Cerra
https://nemo.dev.java.net






Re: atom, xslt processors (Re: atom:type, xsl:output)

2005-05-29 Thread James Cerra

Henri Sivonen,

> A conforming xml-stylesheet processor will only act on the PI if it 
> appears in the prolog. For this reason, it would be possible to allow 
> xml-stylesheet PIs in atom:content and spec that they are part of the 
> content document, so that an XML document with the PI in its prolog 
> could be constructed from the content.

You are correct.  I forgot that part of the spec and should have re-read it.

One possible area of incompatibility is where the xml-stylesheet spec says that
they are only allowed in the prolog (as you stated as well).  So if any are in
the application/xml atom:content, then they could be flagged as errors.  The
spec doesn't say whether they must or should be considered errors or warnings. 
:-(

> >> Vanilla XML processors don't act on PIs. They report them to the
> >> application--in this case the Atom processor.
> >
> > Instead, they could be reported to the PI processor before the atom 
> > processor.
> 
> Why do you think it is appropriate to have a random PI processor in 
> between the XML processor and the Atom processor?

Because the XML is just the grammar serializing the ontology, or data
structure, of interest.  I see Atom processors as normally the Bridge between
the XML document and the particular CMS, IA, or web browser application's
internal structure.  So an Atom processor would be the last stage before the
data is internalized.  The intent of _Processing_ Instructions seems to change
how the document is processed.  That implies to me that they would probably be
interpreted between the XML processor and the particular grammar's processor.


--
Jimmy Cerra
https://nemo.dev.java.net






Re: atom:type, xsl:output

2005-05-28 Thread Bill de hÓra


James Cerra wrote:


> So do I.  That's why the spec should state that any XML fragments embedded into
> atom:content as an XML tree become part of that tree.  So PIs become associated
> with the Atom document rather than the content, for example.  If the entry MUST
> be preserved unmodified (Author's intention), the spec should either specify
> that it must be associated externally via @src or encoded via entities or
> base64 like any other non-xml-compatible formats in the current draft.



This is why MIME is still a good idea.

cheers
Bill




Re: atom, xslt processors (Re: atom:type, xsl:output)

2005-05-28 Thread A. Pagaltzis

* James Cerra <[EMAIL PROTECTED]> [2005-05-28 04:00]:
> > Nothing prevents anyone from writing a generator or pre- or
> > postprocessor for Atom documents to cater to the needs of
> > their particular brand of broken software. Wrangling that
> > particular piece of broken software however is their job; not
> > ours nor the Atom specs.
> 
> Yes, but any Atom document MUST be processable from just an XML
> processor, an <?xml-stylesheet?> processor, among other specs
> it uses, and possibly produce something that can be
> transformed Atom's processing model allows.  This is because:

Note that I wrote the above in reference to broken software that
depends on particular entities vs the literal character or on
specific comments, not in reference to PIs. That’s why I quoted
that part of the previous conversation which I quoted.

> (2) RSS and Atom documents are now processed with
> <?xml-stylesheet?> PIs, and there is no way to formally specify
> when those instructions should be executed (except for specific
> instances that don't matter here).  So any <?xml-stylesheet?>
> nodes, and PI nodes in general, must be assumed to be linked to
> the document as a whole; I know of no standard granularity
> mechanism.

I agree, but I see this as a problem with the xml-stylesheet PI
rather than as something that Atom must solve. You are right, of
course, that it would be prudent for the spec to legislate that
PIs which occur in atom:content with an XML payload should be
passed thru to the processor for that XML payload.

Btw, it might well be that the practice of attaching a stylesheet
goes out of fashion again in a year or two, if other browsers
follow Safari’s lead and acquire feed rendering capabilities.

> That's one reason the following is not true:
> 
> > Vanilla XML processors don't act on PIs. They report them to
> > the application--in this case the Atom processor.
> 
> Instead, they could be reported to the PI processor before the
> atom processor. In my mind, any Atom processor (or any
> language-specific processor) has to sit on top of several other
> ones:
> 
> IO Processor
> XML Processor
> PI Processors
> Atom Processor

I disagree. The typical setup, IMO, would be

1. IO Processor
2. XML Processor
3. Atom Processor
4a. Possibly a processor for that specific XML NS
4b. Possibly a PI Processor

where 4a might or might not itself act on specific PIs.

F.ex, I would expect that an Atom processor will disregard an
xml-stylesheet PI attached to a feed as a whole. But it might
apply an XSL transformation if it is told to process a document
which is not an Atom feed – under the assumption that this
transform might generate a valid feed document.

That’s still murky (and I have to go read up on xml-stylesheet
in detail to see how much room for interpretation it allows), but
all of this seems to indicate to me that it is the semantics of
that particular PI which are ill thought out, rather than a
general problem with PI granularity.

Regards,
-- 
Aristotle



Re: atom, xslt processors (Re: atom:type, xsl:output)

2005-05-28 Thread Henri Sivonen


On May 28, 2005, at 04:54, James Cerra wrote:


> Aristotle, Henri Sivonen,
>
> Yes, but any Atom document MUST be processable from just an XML
> processor, an <?xml-stylesheet?> processor,


A conforming xml-stylesheet processor will only act on the PI if it 
appears in the prolog. For this reason, it would be possible to allow 
xml-stylesheet PIs in atom:content and spec that they are part of the 
content document, so that an XML document with the PI in its prolog 
could be constructed from the content.
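
A small sketch of the distinction drawn here, using Python's expat binding (the `classify_pis` helper and the sample feed are invented for illustration, and the namespace is the RFC 4287 one): the XML processor merely reports each PI, and the application decides what to do based on whether the PI was seen in the prolog or inside element content.

```python
import xml.parsers.expat


def classify_pis(doc: bytes) -> list:
    """Return (location, target, data) for every PI the parser reports."""
    depth = 0
    root_seen = False
    found = []

    def start(name, attrs):
        nonlocal depth, root_seen
        depth += 1
        root_seen = True

    def end(name):
        nonlocal depth
        depth -= 1

    def pi(target, data):
        # The parser only reports the PI; classifying it is up to us.
        if not root_seen:
            where = "prolog"
        elif depth > 0:
            where = "content"
        else:
            where = "epilog"
        found.append((where, target, data))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.ProcessingInstructionHandler = pi
    parser.Parse(doc, True)
    return found


SAMPLE = b"""<?xml-stylesheet type="text/xsl" href="feed.xsl"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><content type="application/xml">
    <?xml-stylesheet type="text/css" href="inner.css"?>
    <doc/>
  </content></entry>
</feed>"""

pis = classify_pis(SAMPLE)
for where, target, data in pis:
    print(where, target, data)
```

Under the reading above, a stylesheet processor would act only on the entry classified as "prolog"; the one classified as "content" could be handed on with the payload.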


> (2) RSS and Atom documents are now processed with <?xml-stylesheet?>
> PIs, and there is no way to formally specify when those instructions
> should be executed (except for specific instances that don't matter
> here).


Perhaps it follows that you should not use xml-stylesheet if you don't 
want it to be acted upon.


> So any <?xml-stylesheet?> nodes, and PI nodes in general, must be
> assumed to be linked to the document as a whole;


Only if the PI appears in the prolog.


> That's one reason the following is not true:
>
>> Vanilla XML processors don't act on PIs. They report them to the
>> application--in this case the Atom processor.
>
> Instead, they could be reported to the PI processor before the atom
> processor.


Why do you think it is appropriate to have a random PI processor in 
between the XML processor and the Atom processor?


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/



Re: atom:type, xsl:output

2005-05-28 Thread Henri Sivonen


On May 28, 2005, at 05:28, James Cerra wrote:


> Henri Sivonen,
>
>>> Yes, but MSIE^H^H^H^Hsome xml processors (cough  cough) still
>>> inappropriately use comments for that purpose.
>>
>> I am not familiar with that. What purpose exactly? Why should Atom
>> support it?
>
> MSIE conditional comments.  See other person's reply.


I thought they were for text/html tag soup only. Anyway, I think Atom 
should not try to enable such bogosity in XML.


> And yes: I know that Atom is not FTP or HTTP.  I'm looking at the Atom
> entry uploader's POV rather than the downloader's perspective.  The
> Atom API is a way to simplify managing web sites:


Is it really? Straight bytes over HTTP PUT seems a lot simpler to me 
than Base64 in an XML envelope over HTTP PUT.



>> Vanilla XML processors don't act on PIs. They report them to the
>> application--in this case the Atom processor.
>
> So Atom Processors MUST be between any XML and PI processor stages?
> That sucks!  It means just to display Atom, I can't use a generic XSLT
> processor like Saxon or MSIE.


What PIs do they act on besides the style sheet PI, which must appear 
in the prolog (so conforming tools should not act on it if it appears 
within atom:content)?



>> Depending on an entity reference and not being able to accept the
>> straight replacement text is just wrong.
>
> I agree.  I'm just bringing up possible incompatibilities for debate!


I don't think that's an incompatibility that deserves catering for.

>> That's a non-issue. You don't just throw away the DOCTYPE but parse
>> the original document with it and reserialize as a DOCTYPEless
>> fragment. You don't lose well-formedness or content. You only lose
>> the shallow attribute data typing provided by the DTD.
>
> Well, that data could be valuable.


IDness is. For practical purposes, the other stuff is not. There are 
other solutions for IDness.


> With DOCTYPE information associated with embedded content, it becomes
> possible to transform entries into valid XML or HTML documents.


What do you need valid documents for if the validity is just an 
property internal to the document? That is, the document is valid 
according to its own rules as opposed to conforming to the rules 
required by some downstream software.



> Some apps change their behavior with well-formed and valid modes.


That seems like a bad idea. Can you give examples?


> DOCTYPE switching may be evil, but it is currently a necessary evil.


It is not a necessary evil for XML. I am strongly against any Atom 
feature motivated by or designed for enabling and/or encouraging 
doctype sniffing in XML.


Please see http://hsivonen.iki.fi/doctype/#xml

> I see this as low risk (it doesn't break anything and could even be an
> OPTIONAL feature)


Optional features always have a cost. Also, they tend to become de 
facto mandatory for everyone or de facto useless for everyone.



>> Atom's main purpose is to facilitate software to software
>> communication. When interop or implementation ease and readability
>> conflict, readability should yield.
>
> That's why I proposed they should be externally referenced (or
> base64ed), so the Atom processor doesn't touch them!  This is a hint to
> authors on how to avoid a common error.


Base64 XML in XML was rejected earlier in the working group process.


>>> There is nothing intrinsically bad about DOCTYPE sections
>>
>> Yes, there is. On the Web you cannot trust that the recipient is using
>> an XML processor that processes the DOCTYPE beyond checking the
>> internal subset for well-formedness. An optional feature that does not
>> degrade gracefully when not supported is bad. DOCTYPE is such a
>> feature. In the cases where the DOCTYPE can be gracefully ignored, the
>> DOCTYPE is pointless.
>
> No, any well formed XML processor MUST expand internal entities and add
> replaced attributes as well.  This is NOT OPTIONAL.


So is expat non-conforming?

>> Atom is not claiming that you could embed the literal source of any
>> XML doc. You can embed the stuff that canonicalization would preserve.
>
> Yet you can do that with PNG, PDF, or other content via base 64
> encodings.  It is silly to allow that for them and not XML.


Looks like straight HTTP without an Atom envelope in between suits your 
needs better.


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/



Re: atom:type, xsl:output

2005-05-27 Thread James Cerra

Henri Sivonen,

> > Yes, but MSIE^H^H^H^Hsome xml processors (cough  cough) still 
> > inappropriately
> > use comments for that purpose.
> 
> I am not familiar with that. What purpose exactly? Why should Atom 
> support it?

MSIE conditional comments.  See other person's reply.

> > Then there are example XML documents (i.e. for tutorials) that 
> > sometimes require comments be preserved.
> 
> If you want people to read the source as is, it would make sense to 
> make it text/plain or put the source snippets in  or  in an 
> (X)HTML tutorial.

Except when it is a live example.  Or when posting XHTML or XML content that is
intended to be embedded (via object element) as a separate, child document.  It
seems silly to allow anything NOT xml to be embedded unmodified (i.e. PNGs,
sounds), while no XML content can have that guarantee.

And yes: I know that Atom is not FTP or HTTP.  I'm looking at the Atom entry
uploader's POV rather than the downloader's perspective.  The Atom API is a way
to simplify managing web sites: it would be silly to make it unusable for
associating XML in general while allowing any other type to be unmodified.

> > Finally if you are posting content associated with an entry with the 
> > Atom API, then it is important that any documents uploaded are not 
> > modified.
> 
> I think Atom processors should not be required to preserve 
> serialization artifacts. That is, IMO, they should not be required to 
> care about anything that is not exposed via SAX2 ContentHandler with 
> qNames ignored.

So do I.  That's why the spec should state that any XML fragments embedded into
atom:content as an XML tree become part of that tree.  So PIs become associated
with the Atom document rather than the content, for example.  If the entry MUST
be preserved unmodified (Author's intention), the spec should either specify
that it must be associated externally via @src or encoded via entities or
base64 like any other non-xml-compatible formats in the current draft.


> Vanilla XML processors don't act on PIs. They report them to the 
> application--in this case the Atom processor.

So Atom Processors MUST be between any XML and PI processor stages?  That
sucks!  It means just to display Atom, I can't use a generic XSLT processor
like Saxon or MSIE.

> Depending on an entity reference and not being able to accept the 
> straight replacement text is just wrong.

I agree.  I'm just bringing up possible incompatibilities for debate!

> >>> DOCTYPE declaration(s)
> >>
> >> If DOCTYPE is essential for the receiving app, you've got bigger
> >> problems than Atom. Hardwiring IDness based on namespaces is more
> >> practical than relying on DTD-based data typing.
> >
> > I'm not necessarily talking about external document type definitions.
> > Preventing DOCTYPEs would cause incompatibilities with XML well-formedness
> > processing.  From the spec:
> >
> > "[Definition: While they are not required to check the document for validity,
> > they are REQUIRED to process all the declarations they read in the internal DTD
> > subset and in any parameter entity that they read, up to the first reference to
> > a parameter entity that they do not read; that is to say, they MUST use the
> > information in those declarations to normalize attribute values, include the
> > replacement text of internal entities, and supply default attribute values.]"
> 
> That's a non-issue. You don't just throw away the DOCTYPE but parse the 
> original document with it and reserialize as a DOCTYPEless fragment. 
> You don't lose well-formedness or content. You only lose the shallow 
> attribute data typing provided by the DTD.

Well, that data could be valuable.  Some situations require them, and it would
be really nice to have optional DOCTYPE info kept with the fragment.  With
DOCTYPE information associated with embedded content, it becomes possible to
transform entries into valid XML or HTML documents.  Some apps change their
behavior with well-formed and valid modes.  DOCTYPE switching may be evil, but
it is currently a necessary evil.  I see this as low risk (it doesn't break
anything and could even be an OPTIONAL feature) while easing the pain when
dealing with the existing web.

> Atom's main purpose is to facilitate software to software 
> communication. When interop or implementation ease and readability 
> conflict, readability should yield.

That's why I proposed they should be externally referenced (or base64ed), so
the Atom processor doesn't touch them!  This is a hint to authors on how to
avoid a common error.

> > There is nothing intrinsically bad about DOCTYPE sections
> 
> Yes, there is. On the Web you cannot trust that the recipient is using 
> an XML processor that processes the DOCTYPE beyond checking the 
> internal subset for well-formedness. An optional feature that does not 
> degrade gracefully when not supported is bad. DOCTYPE is such a 
> feature. In the cases where the DOCTYPE 

atom, xslt processors (Re: atom:type, xsl:output)

2005-05-27 Thread James Cerra

Aristotle, Henri Sivonen,

> > > Entities can be flattened.
> > 
> > Again, as with comments, I agree in principle, but in practice
> > some processors depend on them.
> 
> I do not consider it at all wise to legislate anything in the
> Atom spec to address these cases.

That's probably a good idea.

> Nothing prevents anyone from writing a generator or pre- or
> postprocessor for Atom documents to cater to the needs of their
> particular brand of broken software. Wrangling that particular
> piece of broken software however is their job; not ours nor the
> Atom spec's.

Yes, but any Atom document MUST be processable from just an XML processor, an
 processor, among other specs it uses, and possibly produce
something that can be transformed Atom's processing model allows.  This is
because:

(1) The first stage of an Atom processor will probably be an XML processor.

(2) RSS and Atom documents are now processed with  PIs, and
there is no way to formally specify when those instructions should be executed
(except for specific instances that don't matter here).  So any
 nodes, and PI nodes in general, must be assumed to be
linked to the document as a whole; I know of no standard granularity mechanism.
 

That's one reason the following is not true:

> Vanilla XML processors don't act on PIs. They report them to the 
> application--in this case the Atom processor.

Instead, they could be reported to the PI processor before the atom processor. 
In my mind, any Atom processor (or any language-specific processor) has to sit
on top of several other ones:

IO Processor
XML Processor
PI Processors
Atom Processor

An example of an IO processor is an HTTP processor: it reads Atom documents. 
An example of a PI processor is an XSLT processor or CSS renderer.  Forcing the
Atom Processor to be first is practically impossible and counter to the reason
for writing generic processors for IO, XML or PIs at all.



--
Jimmy Cerra
https://nemo.dev.java.net



__ 
Do you Yahoo!? 
Yahoo! Small Business - Try our new Resources site
http://smallbusiness.yahoo.com/resources/



Re: atom:type, xsl:output

2005-05-26 Thread A. Pagaltzis

* Henri Sivonen <[EMAIL PROTECTED]> [2005-05-26 10:20]:
> On May 26, 2005, at 06:23, James Cerra wrote:
> >Yes, but MSIE^H^H^H^Hsome xml processors (cough  cough) still
> >inappropriately use comments for that purpose.
> 
> I am not familiar with that. What purpose exactly? Why should
> Atom support it?

http://msdn.microsoft.com/workshop/author/dhtml/overview/ccomment_ovw.asp

They are parsed in HTML documents; I have no idea if IE actually
respects these in XML documents.

Regards,
-- 
Aristotle



Re: atom:type, xsl:output

2005-05-26 Thread A. Pagaltzis

* James Cerra <[EMAIL PROTECTED]> [2005-05-26 05:35]:
> Yes, but MSIE^H^H^H^Hsome xml processors (cough  cough) still
> inappropriately use comments for that purpose.  Then there are
> example XML documents (i.e. for tutorials) that sometimes
> require comments be preserved.
> 
> > Entities can be flattened.
> 
> Again, as with comments, I agree in principle, but in practice
> some processors depend on them.

I do not consider it at all wise to legislate anything in the
Atom spec to address these cases.

Nothing prevents anyone from writing a generator or pre- or
postprocessor for Atom documents to cater to the needs of their
particular brand of broken software. Wrangling that particular
piece of broken software however is their job; not ours nor the
Atom spec’s.

Regards,
-- 
Aristotle



Re: atom:type, xsl:output

2005-05-26 Thread Henri Sivonen


On May 26, 2005, at 06:23, James Cerra wrote:


Henri Sivonen,


If the intended Atom content contains essential comments,


There should be no such thing as essential comments. The XML spec does
not require XML processors to report comments to the app. Hence,
comments are inappropriate for transferring essential data.


Yes, but MSIE^H^H^H^Hsome xml processors (cough  cough) still inappropriately
use comments for that purpose.


I am not familiar with that. What purpose exactly? Why should Atom 
support it?


Then there are example XML documents (i.e. for tutorials) that 
sometimes require comments be preserved.


If you want people to read the source as is, it would make sense to 
make it text/plain or put the source snippets in  or  in an 
(X)HTML tutorial.


Finally if you are posting content associated with an entry with the 
Atom API, then it is important that any documents uploaded are not 
modified.


I think Atom processors should not be required to preserve 
serialization artifacts. That is, IMO, they should not be required to 
care about anything that is not exposed via SAX2 ContentHandler with 
qNames ignored.



processing instructions,


These could be supported in embedded content if the Atom spec said PIs
in atom:content belong in content and should not be acted upon by the
Atom processor.


I don't like this condition.  Any content will be read by XML processors before
being handed to an Atom processor.  So the generic XML processor will not be
able to differentiate between PIs that are significant - and thus should be
processed by their processors - from those which should be passed through.


Vanilla XML processors don't act on PIs. They report them to the 
application--in this case the Atom processor.
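
The reporting behaviour described above can be observed with a generic SAX
parser. What follows is only a minimal sketch, assuming Python's standard
xml.sax module; the feed, the "example-pi" target and the PIRecorder class are
made up for illustration and come from no draft of the spec. The parser hands
each PI to the application as a (target, data) pair and does nothing else with
it.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PIRecorder(ContentHandler):
    """Records each processing instruction with the element path it appeared under."""

    def __init__(self):
        super().__init__()
        self.path = []   # names of the currently open elements
        self.seen = []   # (path, target, data) for every PI reported

    def startElement(self, name, attrs):
        self.path.append(name)

    def endElement(self, name):
        self.path.pop()

    def processingInstruction(self, target, data):
        # The parser only reports the PI; acting on it is left to this code.
        self.seen.append(("/".join(self.path), target, data))


# Hypothetical feed: one PI in the prolog, one inside the content element.
DOC = b"""<?xml-stylesheet type="text/css" href="feed.css"?>
<entry>
  <content type="application/xml"><?example-pi keep="me"?><doc/></content>
</entry>"""

recorder = PIRecorder()
xml.sax.parseString(DOC, recorder)
for where, target, data in recorder.seen:
    print(repr(where), target, data)
```

Because the handler knows which elements are open when a PI arrives, an
application could treat PIs inside the content element differently from those
in the prolog, if the spec asked it to.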



entities


Entities can be flattened.


Again, as with comments, I agree in principle, but in practice some processors
depend on them.


Depending on an entity reference and not being able to accept the 
straight replacement text is just wrong.



DOCTYPE declaration(s)


If DOCTYPE is essential for the receiving app, you've got bigger
problems than Atom. Hardwiring IDness based on namespaces is more
practical than relying on DTD-based data typing.


I'm not necessarily talking about external document type definitions.
Preventing DOCTYPEs would cause incompatibilities with XML well-formedness
processing.  From the spec:

"[Definition: While they are not required to check the document for validity,
they are REQUIRED to process all the declarations they read in the internal DTD
subset and in any parameter entity that they read, up to the first reference to
a parameter entity that they do not read; that is to say, they MUST use the
information in those declarations to normalize attribute values, include the
replacement text of internal entities, and supply default attribute values.]"


That's a non-issue. You don't just throw away the DOCTYPE but parse the 
original document with it and reserialize as a DOCTYPEless fragment. 
You don't lose well-formedness or content. You only lose the shallow 
attribute data typing provided by the DTD.
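
The parse-then-reserialize step can be sketched as follows. This assumes
Python's standard xml.etree.ElementTree (which uses expat underneath); the
"note" document, the "site" entity and the defaulted "kind" attribute are
invented for the example. The internal subset is applied while parsing, and
the serialized result carries no DOCTYPE.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document with an internal DTD subset.
SOURCE = """<!DOCTYPE note [
  <!ENTITY site "http://example.org/">
  <!ATTLIST note kind CDATA "memo">
]>
<note>See &site; for details.</note>"""

# The parser expands the internal entity and supplies the default attribute.
root = ET.fromstring(SOURCE)

# Serialising the tree yields a fragment without any DOCTYPE declaration.
fragment = ET.tostring(root, encoding="unicode")
print(fragment)
```

The entity's replacement text and the defaulted attribute survive in the
fragment; what is lost is the declaration that said where they came from.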


Some RDF/XML processors already put internal DOCTYPE declarations to make the
content more readable.


Atom's main purpose is to facilitate software to software 
communication. When interop or implementation ease and readability 
conflict, readability should yield.



There is nothing intrinsically bad about DOCTYPE sections


Yes, there is. On the Web you cannot trust that the recipient is using 
an XML processor that processes the DOCTYPE beyond checking the 
internal subset for well-formedness. An optional feature that does not 
degrade gracefully when not supported is bad. DOCTYPE is such a 
feature. In the cases where the DOCTYPE can be gracefully ignored, the 
DOCTYPE is pointless.


Remember that anything compatible with XML MUST allow its optional features to
be in the markup (even if they are just ignored).  So if you disallow DOCTYPE
sections, then you can't claim to support XML.


Atom is not claiming that you could embed the literal source of any XML 
doc. You can embed the stuff that canonicalization would preserve.


Anything that could have any chance of dataloss - in this case, the loss of XML
comments, PIs, and XML and DOCTYPE declarations – is bad.


Would you consider changes in the order of attributes and the white 
space between attributes data loss as well?
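
One way to make "serialization artifacts" concrete is to compare the
namespace-aware SAX events of two serializations. This is a rough sketch,
assuming Python's xml.sax with namespace processing switched on; the namespace
URI, the documents and the EventRecorder class are hypothetical. Prefix choice,
attribute order and whitespace between attributes differ in the source, yet
the recorded events are identical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class EventRecorder(ContentHandler):
    """Records element and text events, ignoring qNames and attribute order."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._text = []

    def _flush_text(self):
        # Character data may arrive in several chunks; join them first.
        if self._text:
            self.events.append(("text", "".join(self._text)))
            self._text = []

    def startElementNS(self, name, qname, attrs):
        self._flush_text()
        # name is a (namespace URI, local name) pair; qname is not recorded.
        self.events.append(("start", name, frozenset(attrs.items())))

    def endElementNS(self, name, qname):
        self._flush_text()
        self.events.append(("end", name))

    def characters(self, content):
        self._text.append(content)


def recorded_events(xml_bytes):
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = EventRecorder()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.events


# Same information, different serialisation (hypothetical documents).
FIRST = b'<x:note xmlns:x="http://example.org/ns" a="1" b="2">hi</x:note>'
SECOND = b'<note   xmlns="http://example.org/ns"   b="2"    a="1">hi</note>'

print(recorded_events(FIRST) == recorded_events(SECOND))
```

Comments, the DOCTYPE and the XML declaration are likewise absent from these
ContentHandler events, which is the sense in which they count as artifacts in
the argument above.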


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/



Re: atom:type, xsl:output

2005-05-25 Thread James Cerra

Henri Sivonen,

> > If the intended Atom content contains essential comments,
> 
> There should be no such thing as essential comments. The XML spec does 
> not require XML processors to report comments to the app. Hence, 
> comments are inappropriate for transferring essential data.

Yes, but MSIE^H^H^H^Hsome xml processors (cough  cough) still inappropriately
use comments for that purpose.  Then there are example XML documents (i.e. for
tutorials) that sometimes require comments be preserved.  So although it is not
required by XML processors, it is nice to not introduce incompatibility with
that optional part of the spec.  Finally if you are posting content associated
with an entry with the Atom API, then it is important that any documents
uploaded are not modified.

(Besides at worst this is a non-issue if no comments are ever essential.)

> > processing instructions,
> 
> These could be supported in embedded content if the Atom spec said PIs 
> in atom:content belong in content and should not be acted upon by the 
> Atom processor.

I don't like this condition.  Any content will be read by XML processors before
being handed to an Atom processor.  So the generic XML processor will not be
able to differentiate between PIs that are significant - and thus should be
processed by their processors - from those which should be passed through.

> > entities
> 
> Entities can be flattened.

Again, as with comments, I agree in principle, but in practice some processors
depend on them.

> > DOCTYPE declaration(s)
> 
> If DOCTYPE is essential for the receiving app, you've got bigger 
> problems than Atom. Hardwiring IDness based on namespaces is more 
> practical than relying on DTD-based data typing.

I'm not necessarily talking about external document type definitions. 
Preventing DOCTYPEs would cause incompatibilities with XML well-formedness
processing.  From the spec:

"[Definition: While they are not required to check the document for validity,
they are REQUIRED to process all the declarations they read in the internal DTD
subset and in any parameter entity that they read, up to the first reference to
a parameter entity that they do not read; that is to say, they MUST use the
information in those declarations to normalize attribute values, include the
replacement text of internal entities, and supply default attribute values.]"

Some RDF/XML processors already put internal DOCTYPE declarations to make the
content more readable.  There is nothing intrinsically bad about DOCTYPE
sections, they provide hints to validating XML processors about validity and
can save time, space, and increase readability.

Remember that anything compatible with XML MUST allow its optional features to
be in the markup (even if they are just ignored).  So if you disallow DOCTYPE
sections, then you can't claim to support XML.

Anything that could have any chance of dataloss - in this case, the loss of XML
comments, PIs, and XML and DOCTYPE declarations – is bad.  Your comment is a
good design pattern for XML based specifications, but in general they don't
have to follow that rule of thumb.  Besides, I don't want the implementors to
get any bad ideas so explicitly stating these things disenfranchises that
behavior from being acceptable from the beginning.

--
Jimmy Cerra
https://nemo.dev.java.net






Re: atom:type, xsl:output

2005-05-25 Thread Henri Sivonen


On May 25, 2005, at 06:43, James Cerra wrote:


If the intended Atom content contains essential comments,


There should be no such thing as essential comments. The XML spec does 
not require XML processors to report comments to the app. Hence, 
comments are inappropriate for transferring essential data.



processing instructions,


These could be supported in embedded content if the Atom spec said PIs 
in atom:content belong in content and should not be acted upon by the 
Atom processor.



entities


Entities can be flattened.


DOCTYPE declaration(s)


If DOCTYPE is essential for the receiving app, you've got bigger 
problems than Atom. Hardwiring IDness based on namespaces is more 
practical than relying on DTD-based data typing.


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/



Re: atom:type, xsl:output

2005-05-25 Thread Henri Sivonen


On May 25, 2005, at 06:46, James Cerra wrote:

After being processed by an XML processor, any entities should be dereferenced to
their text values and placed into the document tree.  So this:







would become the text string:





No. It becomes a tree:
Element atom:content having attribute type="application/xml"
   |
   \ Text: http://hsivonen.iki.fi/
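
The difference between escaped markup and literal child elements can be
checked with any XML parser. A small sketch, assuming Python's
xml.etree.ElementTree; the "content" and "doc" elements are placeholders, not
the examples from the original message:

```python
import xml.etree.ElementTree as ET

# Markup escaped with entity references: the parser yields one text node.
escaped = ET.fromstring('<content type="application/xml">&lt;doc/&gt;</content>')

# Markup inserted literally: the parser yields a child element.
literal = ET.fromstring('<content type="application/xml"><doc/></content>')

print(len(escaped), repr(escaped.text))  # number of child elements, then the text
print(len(literal), literal[0].tag)      # number of child elements, then the child's name
```

In the first case the application sees only a string that happens to look like
markup; in the second it sees a tree.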



Re: atom:type, xsl:output

2005-05-24 Thread James Cerra

Graham,

> > Secondly, XML may be entity (or CDATA) encoded like
> > @type="html" or plain xml like @type="xhtml".  This is
> If I follow you right, you misunderstand. Atom documents are  
> unambiguous. XML has to be inserted literally, with no entity  
> escaping (except for entities that are part of text nodes) allowed.

I'm not so sure.  The spec seems to hint at the answer, but provides no
canonical interpretation like it does with @type="html".

After being processed by an XML processor, any entities should be dereferenced to
their text values and placed into the document tree.  So this:

> > 
> > 
> > 

would become the text string:

> > 
> > http://smallbusiness.yahoo.com/resources/



Re: atom:type, xsl:output

2005-05-24 Thread James Cerra

Aristotle,

> Embedding XML 1.1 documents inside an XML 1.0 document is not
> possible, because...

I understand.  Thanks.

> An Atom processor does not know whether the embedded XML document
> is a valid document of the given type. Only an attached processor
> for the given document type will be able to decide. As such, the
> validity of the Atom feed document is not affected by the media
> type. It will, of course, cause interop problems if you lie about
> the media type, hence SHOULD.

That's convincing.

> > Secondly, XML may be entity (or CDATA) encoded like
> > @type="html" or plain xml like @type="xhtml".  This is because
> > of the "content of atom:content MAY include child elements"
> > phrase.  There is no guarantee if an entity escaped passage is
> > xml or a text node example of an xml document (i.e. an example
> > of an xml document), for example.
> 
> No. Wrong.
> 
> There is nothing saying that the content of atom:content MAY be a
> text node. It MAY only be a child element.

If it may be an element, then it may be something else too.  By the definition
in RFC 2119, this behavior (having one or more child elements according to
20050418 revision) is "truly optional."  There are three other things it could be
(ignoring comments and PIs): a text node, mixed content, or nothing at all.  So
it is still a valid Atom fragment.

> If there is no child element, @src must be set, ie the XML
> document constituting the content is external to the feed.
> 
> I will say, though, that the spec could stand to be more explicit
> in this instance.

I agree with both your commentary on the spec and that your interpretation is
mostly what Atom should support.  Here's how I would write it:

"4. If the value of "type" ends with "+xml" or "/xml" (case-insensitive), the
content of atom:content MUST either consist of no children or one or more XML
1.0 elements, comments, processing instructions, an XML 1.0 text node, or mixed
content [1].  Any comments, processing instructions, or entities found as
children of atom:content MUST be interpreted as being associated with the Atom
document itself.  If the intended Atom content contains essential comments,
processing instructions, entities, DOCTYPE declaration(s), or other properties
that make it unable to be valid atom:content content, then it MAY be specified
in an external file identified by a URI in the "src" attribute, which MAY
be interpreted by an Atom processor as being associated as the Atom content of
the entry.  The Atom content of the entry SHOULD be suitable for handling as
the indicated media type."

I defined a phrase, "Atom content," to represent the content of an entry, that
is referenced by atom:content either through its src attribute or literally as
its children elements and text nodes.  Any comments or processing instructions
should be explicitly associated with the Atom document's xml, rather than the
"Atom content", so style information can be applied (for viewing the raw Atom
XML) and the processing order of processing instructions and comments is well
defined (to solve the problem of which ones to read and which ones to pass?)
for examples.

That paragraph is also very long-winded, but I think it is more precise than
the existing paragraph.  I don't know if it is deterministic, whatever that is,
and I still think some of the attributes of xsl:output should be included as
optional metadata about the atom:content child (doctype, xml declaration, etc.)
if it is a complete document.

--
Jimmy Cerra

[1] 






Re: atom:type, xsl:output

2005-05-24 Thread Tim Bray


On May 24, 2005, at 1:25 AM, Graham wrote:


First off: it is an error to lie about your
media-type, so I would change "SHOULD be suitable for
handling as the indicated media type" to "MUST be
suitable for handling as the indicated media type."



+1


I'm tempted to agree but can't, because the condition of "being  
suitable for handling as" is not deterministic and testable enough to  
merit a MUST. -Tim




Re: atom:type, xsl:output

2005-05-24 Thread Graham


On 24 May 2005, at 5:06 am, James Cerra wrote:


First off: it is an error to lie about your
media-type, so I would change "SHOULD be suitable for
handling as the indicated media type" to "MUST be
suitable for handling as the indicated media type."


+1


Secondly, XML may be entity (or CDATA) encoded like
@type="html" or plain xml like @type="xhtml".  This is
because of the "content of atom:content MAY include
child elements" phrase.  There is no guarantee if an
entity escaped passage is xml or a text node example
of an xml document (i.e. an example of an xml
document), for example.


If I follow you right, you misunderstand. Atom documents are  
unambiguous. XML has to be inserted literally, with no entity  
escaping (except for entities that are part of text nodes) allowed.








This is invalid, since it has no root element. It represents the  
standalone XML document:






This is correct:




Graham



Re: atom:type, xsl:output

2005-05-24 Thread A. Pagaltzis

* James Cerra <[EMAIL PROTECTED]> [2005-05-24 06:35]:
> > > XML 1.1
> > 
> > Not supported.
> 
> I'm confused.  There is nothing inherent in the spec that
> prevents XML 1.1 or any future versions from being supported.
> And why introduce incompatibilities in atom:content that also
> bork with arbitrary XML 1.0 documents too?  I assert this,
> because the spec says in section 4.1.3.3, "Processing Model,"
> that:

You can only embed XML 1.1 documents directly into another
document if that document is also XML 1.1. (Note that I am not
talking about transporting XML 1.1 documents as an opaque
serialized string.)

Embedding XML 1.1 documents inside an XML 1.0 document is not
possible, because
• XML 1.1 allows control characters to be included as entities
  which XML 1.0 forbids in any form,
• an XML 1.0 processor will not understand an embedded XML 1.1
  document,
• etc.
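
The first bullet can be tried out directly. A minimal sketch, assuming
Python's xml.etree.ElementTree, whose underlying parser implements XML 1.0;
the "content" and "doc" elements are placeholders. A character reference to
U+0001 is legal in XML 1.1 but makes an XML 1.0 parser reject the whole
document.

```python
import xml.etree.ElementTree as ET

# U+0001 written as a character reference: allowed by XML 1.1, forbidden by XML 1.0.
CANDIDATE = '<content type="application/xml"><doc>&#1;</doc></content>'

try:
    ET.fromstring(CANDIDATE)
    accepted = True
except ET.ParseError as error:
    accepted = False
    print("rejected:", error)
```

An XML 1.0 feed carrying such content inline would therefore fail to parse
before any Atom-level processing could begin.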

> ] 4.  If the value of "type" ends with "+xml" or "/xml"
> ] (case-insensitive), the content of atom:content MAY
> ] include child elements, and SHOULD be suitable for
> ] handling as the indicated media type.  If the "src"
> ] attribute is not provided, this would normally mean that
> ] the "atom:content" element would contain a single child
> ] element which would serve as the root element of the XML
> ] document of the indicated type.
> 
> First off: it is an error to lie about your media-type, so I
> would change "SHOULD be suitable for handling as the indicated
> media type" to "MUST be suitable for handling as the indicated
> media type."

An Atom processor does not know whether the embedded XML document
is a valid document of the given type. Only an attached processor
for the given document type will be able to decide. As such, the
validity of the Atom feed document is not affected by the media
type. It will, of course, cause interop problems if you lie about
the media type, hence SHOULD.

> Secondly, XML may be entity (or CDATA) encoded like
> @type="html" or plain xml like @type="xhtml".  This is because
> of the "content of atom:content MAY include child elements"
> phrase.  There is no guarantee if an entity escaped passage is
> xml or a text node example of an xml document (i.e. an example
> of an xml document), for example.

No. Wrong.

There is nothing saying that the content of atom:content MAY be a
text node. It MAY only be a child element.

If there is no child element, @src must be set, ie the XML
document constituting the content is external to the feed.

I will say, though, that the spec could stand to be more explicit
in this instance.

Regards,
-- 
Aristotle



Re: atom:type, xsl:output

2005-05-23 Thread James Cerra

Graham,

> > * When @type="html" then the content of the element is a xsd:string
> > [1] of an HTML DIV element plus optional insignificant whitespace
> > around it.  Which version of HTML is defined?  How do you
> > differentiate between HTML 4.01, HTML 3.2, the upcoming HTML 5, or
> > "nonstandard" HTML (like with marquee elements)?
> I believe the answer would be "street HTML", not any
> specific version.

I don't like this, since the behavior of an unknown
version of HTML is obviously undefined.  You can't
expect any HTML processors (used to parse atom content
items) to follow the standards when the standard of
the instance being used is unknown!  There should be a
way to identify the version of html used so that the
behavior of the processor can be explicitly defined.

> > * When @type="mime/type" [2], ONLY for atom:content, then the content
> > (or src document) is that type of document.  Why not allow other
> > elements to use this?
> 
> Because the other elements are for purely textual
> content.

I understand.

> > XML 1.1
> 
> Not supported.

I'm confused.  There is nothing inherent in the spec
that prevents XML 1.1 or any future versions from
being supported.  And why introduce incompatibilities
in atom:content that also bork with arbitrary XML 1.0
documents too?  I assert this, because the spec says
in section 4.1.3.3, "Processing Model," that:

] 4.  If the value of "type" ends with "+xml" or "/xml"
] (case-insensitive), the content of atom:content MAY include child
] elements, and SHOULD be suitable for handling as the indicated
] media type.  If the "src" attribute is not provided, this would
] normally mean that the "atom:content" element would contain a
] single child element which would serve as the root element of the
] XML document of the indicated type.
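
The suffix test in the quoted rule is simple to state in code. A sketch only,
in Python; the function name is invented, and nothing beyond the
case-insensitive "+xml" or "/xml" ending is taken from the spec text:

```python
def is_xml_media_type(type_value: str) -> bool:
    """True when the type ends with "+xml" or "/xml", compared case-insensitively."""
    return type_value.lower().endswith(("+xml", "/xml"))


# A few sample values of the "type" attribute.
for sample in ("application/xhtml+xml", "text/xml", "image/svg+XML", "image/png", "xhtml"):
    print(sample, is_xml_media_type(sample))
```

Note that the short values "html" and "xhtml" do not match this rule; they are
handled by their own cases.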

First off: it is an error to lie about your
media-type, so I would change "SHOULD be suitable for
handling as the indicated media type" to "MUST be
suitable for handling as the indicated media type."

Secondly, XML may be entity (or CDATA) encoded like
@type="html" or plain xml like @type="xhtml".  This is
because of the "content of atom:content MAY include
child elements" phrase.  There is no guarantee if an
entity escaped passage is xml or a text node example
of an xml document (i.e. an example of an xml
document), for example.  So the following could
legally be interpreted as either a text node or as
full-fledged xml:

] 
] 

There is no way to make even an educated guess in
advance without reading the text node with an xml
processor and seeing if it is well-formed.  This
really sucks for processing Atom with XSLT.

In any case, the above example could never be "the
root element of the XML document of the indicated
type" if written as xml instead of entity-escaped
either, since there is that pesky XML declaration. 
Yet it still conforms to what the mime type says it
is.  That's one reason why I think this is a bug in
the Atom spec.  It's also a reason why I wish
atom:content adopted xsl:output attributes for
specifying the doctype/xml declaration and version
info of the content.

Finally: In general there could also be processing
instructions that must be put as children of the
atom:content element, but how should they be
interpreted?  A non-atom aware processor will read
those processing instructions as belonging to the ATOM
DOCUMENT instead of the atom:content's XML CONTENT!

This is a horrible situation, especially when you want
the content of a blog to be an actual XML document
that uses XML declarations, DOCTYPE declarations, and
processing instructions.

There needs to be a way to specify these things
without using an external file referenced with a @src
attribute.

> > RTF
> 
> It is compatible, as a string, though certain obsolete characters are
> not.
> 
> > PNG
> 
> Should be base64 encoded.

I understand, thanks.

> > What should one do when encountering these situations?
> 
> See section 4.1.3.3  Processing Model

Thanks for the pointer.  I can't believe I missed
that.

-- Jimmy Cerra

P.S.  Thank you, Danny Ayers, wherever you are!






Re: atom:type, xsl:output

2005-05-23 Thread Tim Bray


On May 23, 2005, at 2:43 AM, Graham wrote:


* When @type="html" then the content of the element is a xsd:string
[1] of an HTML DIV element plus optional insignificant whitespace
around it.  Which version of HTML is defined?  How do you
differentiate between HTML 4.01, HTML 3.2, the upcoming HTML 5, or
"nonstandard" HTML (like with marquee elements)?


I believe the answer would be "street HTML", not any specific version.


Agreed.  What it says is "SHOULD be suitable for handling as HTML".   
I.e. can be passed to your local HTML rendering control, which is  
doubtless a forgiving tag-soup engine. -Tim




Re: atom:type, xsl:output

2005-05-23 Thread Graham


On 23 May 2005, at 9:14 am, Danny Ayers wrote:


* When @type="html" then the content of the element is a xsd:string
[1] of an HTML DIV element plus optional insignificant whitespace
around it.  Which version of HTML is defined?  How do you
differentiate between HTML 4.01, HTML 3.2, the upcoming HTML 5, or
"nonstandard" HTML (like with marquee elements)?


I believe the answer would be "street HTML", not any specific version.


* When @type="mime/type" [2], ONLY for atom:content, then the content
(or src document) is that type of document.  Why not allow other
elements to use this?


Because the other elements are for purely textual content.


XML 1.1


Not supported.


Turtle


Don't know.


RTF


It is compatible, as a string, though certain obsolete characters are  
not.



PNG


Should be base64 encoded.


What should one do when encountering these situations?


See section 4.1.3.3  Processing Model


Does that mean that if you use "application/xhtml+xml", you can do
(rest of feed omitted for brevity):


Yes. This is why we have a special XHTML mode for fragments.

Graham
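
The processing model referred to above can be sketched roughly as a
dispatch on the type attribute.  This is an assumption-laden
illustration: it follows section 4.1.3.3 as it reads in the later
published RFC 4287, which may differ from the draft under discussion in
this thread, and the names content_handling, XML_MEDIA_TYPES and
encode_binary are hypothetical.

```python
import base64

# XML media types from RFC 3023 that do not end in "/xml" or "+xml".
XML_MEDIA_TYPES = {
    "text/xml-external-parsed-entity",
    "application/xml-external-parsed-entity",
    "application/xml-dtd",
}


def content_handling(type_attr):
    """Return how atom:content is carried for a given type value."""
    # Assumption: a missing type attribute is treated as "text".
    t = (type_attr or "text").strip().lower()
    if t == "text":
        return "plain text"
    if t == "html":
        return "escaped HTML"
    if t == "xhtml":
        return "inline xhtml:div"
    if t in XML_MEDIA_TYPES or t.endswith("+xml") or t.endswith("/xml"):
        return "inline XML"
    if t.startswith("text/"):
        return "plain text"
    # Anything else (for example image/png) is carried as Base64.
    return "base64"


def encode_binary(payload):
    """Base64-encode raw bytes for use as the text of atom:content."""
    return base64.b64encode(payload).decode("ascii")
```

Under this sketch "application/xhtml+xml" falls into the inline XML
branch and "image/png" into the Base64 branch, matching the answers
given above for XHTML and PNG.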



atom:type, xsl:output

2005-05-23 Thread Danny Ayers

[forwarding for Jimmy, he's having mail problems]

From: Jimmy Cerra <[EMAIL PROTECTED]>

I'm a little confused by the type attribute for atom:content and other
elements.  Is the following correct?

* When @type="html" then the content of the element is a xsd:string
[1] of an HTML DIV element plus optional insignificant whitespace
around it.  Which version of HTML is defined?  How do you
differentiate between HTML 4.01, HTML 3.2, the upcoming HTML 5, or
"nonstandard" HTML (like with marquee elements)?

* When @type="xhtml" then the content of the element is a xhtml:div
XML fragment plus optional insignificant whitespace around it.  Again
which version of XHTML is defined?  How do you differentiate between
XHTML 1.0, 1.1, or 2.0?  XHTML 2.0 may have a new namespace, so
will you allow that?  How does requiring XHTML:div impact which XHTML
modules are required (I have a guess)?  And what about exotic versions
like "XHTML+MathML+SVG" or "XHTML 1.1 plus MathML 2.0"?  (Some blogs
actually use "XHTML 1.1 plus MathML 2.0"!)

* When @type="text" then the content of the element is a xsd:string of
a text/plain document.

* When @type="mime/type" [2], ONLY for atom:content, then the content
(or src document) is that type of document.  Why not allow other
elements to use this? What about content that isn't compatible with
XML 1.0 (like XML 1.1, Turtle, RTF, PNG); should it be entity
encoded/put into cdata section like HTML?  What should one do when
encountering these situations?

Does that mean that if you use "application/xhtml+xml", you can do
(rest of feed omitted for brevity):


  
...
 ... 
  


How do you specify xml-stylesheet processing instructions, doctype
and xml declarations, and other data about the content?  PIs may be a
non-issue if they are not read by the XML processor for the Atom feed,
although that isn't guaranteed.

I think you should at least allow @type="xml", as others have
suggested for xml 1.0 content, along with adopting XSLT's output
attributes (perhaps not using @method or @media-type).  This would
ease the pain with XML, and all media/types for @type require the
content to be xsd:string valid (that is, entity encoded or using CDATA
sections).

Sorry for the boatload of questions.

--
Jimmy Cerra
http://pawsgroup.blogspot.com

[1] That is, a text node in the XML tree.

[2] A mime type, noting the exception in 4.1.3.1.
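
On the question above of entity encoding versus CDATA sections: as far
as an XML parser is concerned, both spellings produce the same text node
(the sense of footnote [1]).  A minimal sketch using Python's standard
library; the two sample elements and the text_node helper are
hypothetical and omit the Atom namespace for brevity.

```python
import xml.etree.ElementTree as ET

# The same HTML fragment, once entity-escaped and once in a CDATA section.
# The fragment itself contains the HTML entity "&amp;".
ESCAPED = '<content type="html">&lt;p&gt;AT&amp;amp;T&lt;/p&gt;</content>'
CDATA = '<content type="html"><![CDATA[<p>AT&amp;T</p>]]></content>'


def text_node(xml_source):
    """Return the text node the XML parser hands to the Atom processor."""
    return ET.fromstring(xml_source).text
```

Both text_node(ESCAPED) and text_node(CDATA) yield the identical string,
so the choice between the two is a serialization detail that an Atom
processor never sees.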