Re: Does xml:base apply to type="html" content?

2006-03-30 Thread Eric Scheid

On 31/3/06 3:08 PM, "Antone Roundy" <[EMAIL PROTECTED]> wrote:

>> The escaped HTML content contained within the content element that
>> David was originally concerned with is more than likely a copy of
>> all or part of the elements and content contained inside the body
>> tag of the external document referenced by an associated link
>> element, and therefore there is no guarantee that the xml:base of the atom
>> feed is going to be anywhere even close to accurate.

I'm doing something similar right now, scraping some website that doesn't
provide feeds for what I want. I check the html of the page I scraped and if
they have a <base> element I use that, else I use the URL I used to fetch the page.

The tag soup I extract for each entry contains relative references. I really
don't want to go fixing that tag soup so I just stick that base url into
xml:base for each entry (and not just at the top of the feed, because I'm
scraping paginated results).
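
A minimal sketch of the base-selection step described above, assuming the scraped page is available as a string. The names `BaseFinder` and `base_for_scraped_page` and the example URLs are hypothetical, not taken from the original scraper; only the rule itself (use the page's <base> if present, otherwise the fetch URL) comes from the message.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class BaseFinder(HTMLParser):
    """Records the href of the first <base> element found, if any."""

    def __init__(self):
        super().__init__()
        self.base_href = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so tag soup like <BASE> matches too.
        if tag == "base" and self.base_href is None:
            href = dict(attrs).get("href")
            if href:
                self.base_href = href


def base_for_scraped_page(page_html, fetch_url):
    """Return the URL to place in xml:base for entries scraped from this page."""
    finder = BaseFinder()
    finder.feed(page_html)
    if finder.base_href:
        # A <base href> may itself be relative, so resolve it against the fetch URL.
        return urljoin(fetch_url, finder.base_href)
    return fetch_url
```

The returned value would then be written to xml:base on each entry, so that pages fetched from different URLs each carry their own base.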

e.



Re: Does xml:base apply to type="html" content?

2006-03-30 Thread M. David Peterson
I'm speaking in terms of mashups... If a feed comes from one source, then I
would agree...  but mashups, from both a syndication and an application
standpoint, are becoming the primary focus of EVERY major vendor.  It's in
this scenario that I see the problem of assuming the xml:base in current
context has any value whatsoever.

Pick a planet, any planet, and my point suddenly and immediately becomes
relevant.

On 3/30/06, Antone Roundy <[EMAIL PROTECTED]> wrote:
> On Mar 30, 2006, at 10:00 PM, M. David Peterson wrote:
>> Then it should be a best practice that if they invoke this, the
>> xml:base value should be set upon the "element containing the
>> text", in this case, the content element.  Obviously you can't
>> simply assume that the current xml:base in context has any direct
>> relation, and therefore value to the current entry/content in
>> context, as, using Aristotle's use case (and a billion others just
>> like it -- if not a billion now, it won't be too long before that
>> number is quite realistic, and in fact only scratching the Atom
>> feed surface of the not too distant future), there is no way that
>> one can simply assume that the current @xml:base value is legit.
>
> I disagree.  The best practice should be to set xml:base explicitly
> in any document using relative URIs, and at any point in the document
> where the relative URIs appear, ensure that the xml:base in context
> is the correct base URI by overriding it if necessary.  If this
> practice is followed, and only if this practice is followed, then
> consumers will be able to reliably resolve relative URIs.  I see no
> justification for assuming that the xml:base in context is invalid
> and using some other base URI just because xml:base is set somewhere
> other than the containing element.  It's a pretty sorry world if we
> not only assume, but operate on the assumption that publishers are
> and will continue to be that inept.
>
> Just to amplify one point:
>> you can't simply assume that the current xml:base in context has
>> any direct relation...
>
> What you can't simply assume is that the xml:base in context does
> NOT have any direct relation to the content.  Part of the point of
> XML is that we'll all be better off if consumers rely on publishers
> doing things correctly (in this case, getting xml:base right) and
> hold publishers to it until they get it right.
>
> Antone

--
M. David Peterson
http://www.xsltblog.com/


Re: Does xml:base apply to type="html" content?

2006-03-30 Thread M. David Peterson
Yeah, agreed... In fact, I think at this stage of the game my inexperience
in understanding all that must be considered during the development of a
standard as far reaching as Atom both is and, even more so, will be is
beginning to show through.  Nonetheless, this is an area I want to learn as
much about as I possibly can, so I'm going to simply chill out in the
background and take notes for a bit, as you all obviously understand these
things FAR beyond what I could even imagine at this stage.

My pen and notepad are now firmly in place of where my keyboard once was...
well, speaking in terms just slightly in the future... like right now... :)

On 3/30/06, James M Snell <[EMAIL PROTECTED]> wrote:
> I would agree that, as a best practice, the xml:base should appear on
> the content element, but implementations need to be prepared to use
> whatever the in-scope URI is (e.g. if no xml:base is specified, relative
> refs in the content will be relative to Content-Location or the feed's
> Request URI).  In other words, consumers of the feed *have* to assume
> that the current xml:base in context is going to be correct and
> publishers of the feed simply have to be responsible for Doing The Right
> Thing.
>
> - James
>
> M. David Peterson wrote:
>> Then it should be a best practice that if they invoke this, the
>> xml:base value should be set upon the "element containing the text",
>> in this case, the content element.  Obviously you can't simply assume
>> that the current xml:base in context has any direct relation, and
>> therefore value to the current entry/content in context, as, using
>> Aristotle's use case (and a billion others just like it -- if not a
>> billion now, it won't be too long before that number is quite
>> realistic, and in fact only scratching the Atom feed surface of the
>> not too distant future), there is no way that one can simply assume
>> that the current @xml:base value is legit.
>>
>> It seems to me that this current definition of xml:base didn't take
>> into consideration the fact that the world would soon be revolving
>> around XML mashups, all of which can contain any number of possible
>> combinations of URI's of which may have absolutely nothing even
>> remotely in common with another.
>>
>> Seems like maybe it's time for a quick update to the xml:base
>> definition, as this is not just an issue that affects Atom
>> syndication feeds.
>>
>> On 3/30/06, James M Snell <[EMAIL PROTECTED]> wrote:
>>> In retrospect, it likely would have been a good idea for us to have
>>> covered this in the Atom spec.  The definition of xml:base does
>>> include a statement that "[t]he base URI for a URI reference
>>> appearing in text content is the base URI of the element containing
>>> the text."  That would include URI references contained within the
>>> escaped HTML markup of Text constructs and the content element.
>>>
>>> - James
>>>
>>> Sean Lyndersay wrote:
>>>> This is unfortunate, because HTML itself only allows <base>
>>>> elements in the header (one per page). So if anyone wants to build
>>>> a client that displays more than one item at a time using a
>>>> standard HTML renderer (and most clients render HTML using someone
>>>> else's renderer, not their own), they have to go groveling in HTML
>>>> to do URL fixup (or use iframes).
>>>>
>>>> In my own case (IE7), this isn't that big a deal because we have
>>>> to grovel in HTML for many other reasons, but I suspect it'd be a
>>>> pain for other clients.
>>>>
>>>> My own reading goes like this: Since xml:base is an XML concept,
>>>> it should apply only to relative references in XML content
>>>> (including XHTML). From the XML perspective, the HTML content is
>>>> just a string, so the xml:base should not apply.
>>>>
>>>> Sean
>>>>
>>>> -Original Message-
>>>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tim Bray
>>>> Sent: Thursday, March 23, 2006 10:49 AM
>>>> To: David Powell
>>>> Cc: Atom Syntax
>>>> Subject: Re: Does xml:base apply to type="html" content?
>>>>
>>>> On Mar 23, 2006, at 10:03 AM, David Powell wrote:
>>>>> xml:base applies to type="xhtml" content, but I'm not sure
>>>>> whether it is supposed to apply to escaped type="html" content?
>>>>> I reckon that it does.
>>>>
>>>> RFC4287, section 2:
>>>>
>>>> Any element defined by this specification MAY have an xml:base
>>>> attribute [W3C.REC-xmlbase-20010627].  When xml:base is used in an
>>>> Atom Document, it serves the function described in section 5.1.1
>>>> of [RFC3986], establishing the base URI (or IRI) for resolving any
>>>> relative references found within the effective scope of the
>>>> xml:base attribute.
>>>>
>>>> Seems pretty clear to me.  Yes, the base URI of that HTML is now
>>>> whatever xml:base said it was -Tim
>>
>> --
>> M. David Peterson
>> http://www.xsltblog.co

Re: Does xml:base apply to type="html" content?

2006-03-30 Thread M. David Peterson
Yeah, I 100% agree with you on ALL of this...  The breakdown was more to
showcase "here's the best case scenario that's even remotely possible", but
we all know that remotely possible and real world reliable are near polar
opposites...

Actually, I think what you have laid out here has quite a bit of real world
merit.  Definitely something that can be used to build some sort of
foundation on in regards to best practice type efforts.  In fact, if you
take a look at my last follow-up it's obvious we're both in the same
chapter, you're just near the end, and I just turned the first page. :)

On 3/30/06, Antone Roundy <[EMAIL PROTECTED]> wrote:
> On Mar 30, 2006, at 8:34 PM, M. David Peterson wrote:
>> ...the content element can be basically anything as long as it's either
>>
>> - non-escaped plain text with a @type value set to text,
>> - escaped text, with a @type set to a valid 'text' mime-type
>> - entity escaped with @type set to html,
>> - xhtml wrapped in a properly xhtml namespaced div with @type set
>>   to xhtml,
>> - base64 encoded with @type set to the proper media type, or
>> - it's xml with @type set to a proper XML mime-type.
>>
>> In each of these cases, the only one that should have even a remote
>> chance of the current value of the @xml:base in current context
>> applying to is inline xml.
>>
>> ...
>> The escaped HTML content contained within the content element that
>> David was originally concerned with is more than likely a copy of
>> all or part of the elements and content contained inside the body
>> tag of the external document referenced by an associated link
>> element, and therefore there is no guarantee that the xml:base of
>> the atom feed is going to be anywhere even close to accurate.
>
> On what basis are you concluding that Atom publishers are more likely
> to be smart enough to set xml:base correctly when publishing inline
> XML than when publishing escaped HTML?  What if the source material
> is tag soup HTML?  You could clean it up and turn it into XHTML or
> publish it as is as escaped HTML.  Either option is valid, and may be
> preferable in some situations.  I don't see how any assumptions can
> be made about the publisher's ability to set xml:base correctly based
> on the content type.
>
> If you're assuming that xml:base is going to be set only at the top
> of the Atom document, then it may very well fail to be correct for a
> lot of the content.  But xml:base may also be set on the entry or
> content element, and could easily be set correctly based on the
> publisher's knowledge of the appropriate base URI for the content.
>
> Anyway, theoretical arguments aside, there are two questions to answer
> for the real world:
>
> 1) If you're publishing Atom, in which content @types can you use
> relative URIs with reasonable confidence that consumers will apply
> the base URI correctly?
>
> 2) If you're consuming Atom and you encounter a relative URI, how
> should you choose the appropriate base URI with which to resolve it?
>
> I think there are only three remotely possible answers to #2:
> xml:base (including the URI from which the feed was retrieved if
> xml:base isn't explicitly defined), the URI of the self link, and the
> URI of the alternate link.  Given that Atom explicitly supports
> xml:base, if it's explicitly defined, it's difficult to justify
> ignoring it in favor of anything else.
>
> If xml:base isn't explicitly defined, there may be some justification
> for using the self link rather than the URI from which the feed was
> retrieved.  It's sloppy on the publisher's part, but might be more
> likely to succeed in practice.
>
> The alternate link is only a possible choice if there is at least one
> alternate link, and if either there is only one, or there are more
> than one, and all of them point to documents in the same directory.
> I'd say it's a fairly weak choice.
>
> Conclusion: you've got to resolve relative URIs with respect to
> SOMETHING, and clearly the best choice is xml:base if it's explicitly
> defined. If not, the self link and the URI from which the feed is
> retrieved each have some merit.
>
> If that's the correct answer for #2, then in a reasonably perfect
> world, the answer to #1 should be that relative URIs should be safe
> anywhere as long as you're explicitly (and correctly!) defining
> xml:base.  In the real world, I'd guess that more consuming
> applications will get it right in inline XML than in escaped HTML.

--
M. David Peterson
http://www.xsltblog.com/


Re: Does xml:base apply to type="html" content?

2006-03-30 Thread James M Snell



Antone Roundy wrote:
>[snip]
> 2) If you're consuming Atom and you encounter a relative URI, how should
> you choose the appropriate base URI with which to resolve it?
> 
> I think there are only three remotely possible answers to #2: xml:base
> (including the URI from which the feed was retrieved if xml:base isn't
> explicitly defined), the URI of the self link, and the URI of the
> alternate link.  Given that Atom explicitly supports xml:base, if it's
> explicitly defined, it's difficult to justify ignoring it in favor of
> anything else.
> 

There is no basis in any of the specs for using the URI of the self or
alternate link as a base uri for resolving relative references in the
content.  The process for resolving relative references is very clearly
defined.

> If xml:base isn't explicitly defined, there may be some justification
> for using the self link rather than the URI from which the feed was
> retrieved.  It's sloppy on the publisher's part, but might be more
> likely to succeed in practice.
> 

-1.

> The alternate link is only a possible choice if there is at least one
> alternate link, and if either there is only one, or there are more than
> one, and all of them point to documents in the same directory. I'd say
> it's a fairly weak choice.
> 
> Conclusion: you've got to resolve relative URIs with respect to
> SOMETHING, and clearly the best choice is xml:base if it's explicitly
> defined. If not, the self link and the URI from which the feed is
> retrieved each have some merit.
> 

Wrong. You've got to resolve relative URI's with respect to the proper
base URI.  Let's reserve the sloppy guessing hacks for specs that
actually need them.

- James
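
As an illustration of the "proper base URI" James refers to, here is a rough sketch of how a consumer might compute the in-scope base for each content element by walking down from the retrieval URI and applying every xml:base on the way. The helper names `iter_with_base` and `content_bases`, the sample feed, and the example.org/example.net URLs are all made up for this sketch.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed: one xml:base on the feed, one on an entry, one on a content.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom" xml:base="http://example.org/feed/">
  <entry xml:base="http://example.net/page2/">
    <content type="html">&lt;a href="pic.png"&gt;pic&lt;/a&gt;</content>
  </entry>
  <entry>
    <content type="html" xml:base="posts/">&lt;a href="a.html"&gt;a&lt;/a&gt;</content>
  </entry>
</feed>"""


def iter_with_base(elem, base):
    """Yield (element, in-scope base URI) for elem and all of its descendants."""
    # An xml:base value may itself be relative to the inherited base.
    base = urljoin(base, elem.get(XML_BASE, ""))
    yield elem, base
    for child in elem:
        yield from iter_with_base(child, base)


def content_bases(feed_xml, retrieval_uri):
    """Return the in-scope base URI of every atom:content element, in order."""
    root = ET.fromstring(feed_xml)
    return [base for elem, base in iter_with_base(root, retrieval_uri)
            if elem.tag == ATOM + "content"]
```

Note that nothing here consults the self or alternate link; the only inputs are the retrieval URI and the xml:base attributes in scope.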



Re: Does xml:base apply to type="html" content?

2006-03-30 Thread James Holderness


Sean Lyndersay wrote:

> In my own case (IE7), this isn't that big a deal because
> we have to grovel in HTML for many other reasons, but I suspect
> it'd be a pain for other clients.


Looking at the results of the Atom XmlBaseConformanceTests [1] most of the 
clients tested seemed capable of handling relative references inside HTML to 
some extent. Even the ones that don't necessarily pass all the tests at 
least get enough right to suggest that they're on the right track.


IE7 is actually one of the few clients that I would consider to have failed 
outright. Is the latest beta any better at handling xml:base or do these 
problems still exist?


Regards
James

[1] http://www.intertwingly.net/wiki/pie/XmlBaseConformanceTests



Re: Does xml:base apply to type="html" content?

2006-03-30 Thread Antone Roundy


On Mar 30, 2006, at 10:00 PM, M. David Peterson wrote:
> Then it should be a best practice that if they invoke this, the
> xml:base value should be set upon the "element containing the
> text", in this case, the content element.  Obviously you can't
> simply assume that the current xml:base in context has any direct
> relation, and therefore value to the current entry/content in
> context, as, using Aristotle's use case (and a billion others just
> like it -- if not a billion now, it won't be too long before that
> number is quite realistic, and in fact only scratching the Atom
> feed surface of the not too distant future), there is no way that
> one can simply assume that the current @xml:base value is legit.


I disagree.  The best practice should be to set xml:base explicitly  
in any document using relative URIs, and at any point in the document  
where the relative URIs appear, ensure that the xml:base in context  
is the correct base URI by overriding it if necessary.  If this  
practice is followed, and only if this practice is followed, then  
consumers will be able to reliably resolve relative URIs.  I see no  
justification for assuming that the xml:base in context is invalid  
and using some other base URI just because xml:base is set somewhere  
other than the containing element.  It's a pretty sorry world if we  
not only assume, but operate on the assumption that publishers are  
and will continue to be that inept.


Just to amplify one point:
> you can't simply assume that the current xml:base in context has
> any direct relation...


What you can't simply assume is that the xml:base in context does
NOT have any direct relation to the content.  Part of the point of  
XML is that we'll all be better off if consumers rely on publishers  
doing things correctly (in this case, getting xml:base right) and  
hold publishers to it until they get it right.


Antone
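
On the publishing side, the practice Antone describes could look roughly like the sketch below, which attaches an explicit xml:base to each escaped-HTML content element. The function name `html_content` and the idea of passing in a `source_url` are assumptions for illustration; how a publisher actually knows the right base URI is up to the publisher.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def html_content(entry, html, source_url):
    """Append a type="html" content element that carries its own xml:base.

    source_url is assumed to be the URI the relative references in `html`
    were originally written against.
    """
    content = ET.SubElement(
        entry,
        "{%s}content" % ATOM_NS,
        {"type": "html", XML_BASE: source_url},
    )
    # ElementTree escapes the markup on serialization, giving escaped HTML.
    content.text = html
    return content
```

Setting the attribute on the content element itself overrides whatever base is inherited from the feed or entry, which is the "override it if necessary" step.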



Re: Does xml:base apply to type="html" content?

2006-03-30 Thread James M Snell

I would agree that, as a best practice, the xml:base should appear on
the content element, but implementations need to be prepared to use
whatever the in-scope URI is (e.g. if no xml:base is specified, relative
refs in the content will be relative to Content-Location or the feed's
Request URI).  In other words, consumers of the feed *have* to assume
that the current xml:base in context is going to be correct and
publishers of the feed simply have to be responsible for Doing The Right
Thing.

- James
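
A small sketch of the fallback described above for a feed with no xml:base: start from Content-Location if the response supplied one, otherwise the request URI. The function name `document_base_uri` and the plain-dict representation of headers are assumptions; this follows the rule as stated in the message, and later revisions of HTTP are understood to treat Content-Location differently, so check current specs before relying on it.

```python
from urllib.parse import urljoin


def document_base_uri(request_uri, headers):
    """Base URI of a retrieved feed before any xml:base is applied.

    headers is assumed to be a plain dict of response header names to values.
    """
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "content-location" and value:
            # Content-Location may be relative to the request URI.
            return urljoin(request_uri, value.strip())
    return request_uri
```

Any xml:base found in the document would then be resolved against this value.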

M. David Peterson wrote:
> Then it should be a best practice that if they invoke
> this, the xml:base value should be set upon the "element containing the 
> text", in this case, the content element.
>  Obviously
> you can't simply assume that the current xml:base in context has
> any direct relation, and therefore value to the current entry/content in 
> context, as, using Aristotle's use case (and a billion others just like it
> -- if not a billion now, it won't be too long before that number is
> quite realistic, and in fact only scratching the Atom feed surface of
> the not too distant
> future), there is no way that one can simply assume that the current 
> @xml:base value is legit.
> 
> 
> It seems to me that this current definition of xml:base didn't take into
> consideration the fact that the world would soon be revolving around XML
> mashups, all of which can contain any number of possible combinations of 
> URI's of which may have absolutely nothing even remotely in common with 
> another.
> 
> 
> Seems like maybe it's time for a quick update to the xml:base definition,
> as this is not just an issue that affects Atom syndication feeds.
> 
> On 3/30/06, James M Snell <[EMAIL PROTECTED]> wrote:
>> In retrospect, it likely would have been a good idea for us to have
>> covered this in the Atom spec.  The definition of xml:base does include
>> a statement that "[t]he base URI for a URI reference appearing in text
>> content is the base URI of the element containing the text."  That would
>> include URI references contained within the escaped HTML markup of Text
>> constructs and the content element.
>>
>> - James
>>
>> Sean Lyndersay wrote:
>>> This is unfortunate, because HTML itself only allows <base>
>>> elements in the header (one per page). So if anyone wants to build a
>>> client that displays more than one item at a time using a standard
>>> HTML renderer (and most clients render HTML using someone else's
>>> renderer, not their own), they have to go groveling in HTML to do
>>> URL fixup (or use iframes).
>>>
>>> In my own case (IE7), this isn't that big a deal because we
>>> have to grovel in HTML for many other reasons, but I suspect it'd be
>>> a pain for other clients.
>>>
>>> My own reading goes like this: Since xml:base is an XML concept,
>>> it should apply only to relative references in XML content
>>> (including XHTML). From the XML perspective, the HTML content is
>>> just a string, so the xml:base should not apply.
>>>
>>> Sean
>>>
>>> -Original Message-
>>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tim Bray
>>> Sent: Thursday, March 23, 2006 10:49 AM
>>> To: David Powell
>>> Cc: Atom Syntax
>>> Subject: Re: Does xml:base apply to type="html" content?
>>>
>>> On Mar 23, 2006, at 10:03 AM, David Powell wrote:
>>>> xml:base applies to type="xhtml" content, but I'm not sure whether it
>>>> is supposed to apply to escaped type="html" content? I reckon
>>>> that it does.
>>>
>>> RFC4287, section 2:
>>>
>>> Any element defined by this specification MAY have an xml:base
>>> attribute [W3C.REC-xmlbase-20010627].  When xml:base is used in an
>>> Atom Document, it serves the function described in section
>>> 5.1.1 of [RFC3986], establishing the base URI (or IRI) for resolving
>>> any relative references found within the effective scope of the
>>> xml:base attribute.
>>>
>>> Seems pretty clear to me.  Yes, the base URI of that HTML is now
>>> whatever xml:base said it was -Tim
>
> 
> 
> 
> -- 
> 
> 
> M. David Peterson
> http://www.xsltblog.com/ 



Re: Does xml:base apply to type="html" content?

2006-03-30 Thread Antone Roundy


On Mar 30, 2006, at 8:34 PM, M. David Peterson wrote:

> ...the content element can be basically anything as long as it's either
>
> - non-escaped plain text with a @type value set to text,
> - escaped text, with a @type set to a valid 'text' mime-type
> - entity escaped with @type set to html,
> - xhtml wrapped in a properly xhtml namespaced div with @type set
>   to xhtml,
> - base64 encoded with @type set to the proper media type, or
> - it's xml with @type set to a proper XML mime-type.
>
> In each of these cases, the only one that should have even a remote
> chance of the current value of the @xml:base in current context
> applying to is inline xml.
>
> ...
> The escaped HTML content contained within the content element that
> David was originally concerned with is more than likely a copy of
> all or part of the elements and content contained inside the body
> tag of the external document referenced by an associated link
> element, and therefore there is no guarantee that the xml:base of
> the atom feed is going to be anywhere even close to accurate.


On what basis are you concluding that Atom publishers are more likely  
to be smart enough to set xml:base correctly when publishing inline  
XML than when publishing escaped HTML?  What if the source material  
is tag soup HTML?  You could clean it up and turn it into XHTML or  
publish it as is as escaped HTML.  Either option is valid, and may be  
preferable in some situations.  I don't see how any assumptions can  
be made about the publisher's ability to set xml:base correctly based  
on the content type.


If you're assuming that xml:base is going to be set only at the top  
of the Atom document, then it may very well fail to be correct for a  
lot of the content.  But xml:base may also be set on the entry or
content element, and could easily be set correctly based on the  
publisher's knowledge of the appropriate base URI for the content.


Anyway, theoretical arguments aside, there are two questions to answer
for the real world:


1) If you're publishing Atom, in which content @types can you use  
relative URIs with reasonable confidence that consumers will apply  
the base URI correctly?


2) If you're consuming Atom and you encounter a relative URI, how  
should you choose the appropriate base URI with which to resolve it?


I think there are only three remotely possible answers to #2:  
xml:base (including the URI from which the feed was retrieved if  
xml:base isn't explicitly defined), the URI of the self link, and the  
URI of the alternate link.  Given that Atom explicitly supports  
xml:base, if it's explicitly defined, it's difficult to justify  
ignoring it in favor of anything else.


If xml:base isn't explicitly defined, there may be some justification  
for using the self link rather than the URI from which the feed was  
retrieved.  It's sloppy on the publisher's part, but might be more  
likely to succeed in practice.


The alternate link is only a possible choice if there is at least one  
alternate link, and if either there is only one, or there are more  
than one, and all of them point to documents in the same directory.  
I'd say it's a fairly weak choice.


Conclusion: you've got to resolve relative URIs with respect to  
SOMETHING, and clearly the best choice is xml:base if it's explicitly  
defined. If not, the self link and the URI from which the feed is  
retrieved each have some merit.


If that's the correct answer for #2, then in a reasonably perfect  
world, the answer to #1 should be that relative URIs should be safe  
anywhere as long as you're explicitly (and correctly!) defining  
xml:base.  In the real world, I'd guess that more consuming  
applications will get it right in inline XML than in escaped HTML.
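
For the escaped-HTML case in question #2, a consumer that already knows the in-scope base URI might resolve the relative references roughly as follows. This is only a sketch: `LinkResolver` and `resolve_links` are hypothetical names, the input is assumed to be the content after XML unescaping, and only href and src attributes are handled.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

URI_ATTRS = {"href", "src"}  # assumption: a real client would cover more attributes


class LinkResolver(HTMLParser):
    """Collects href/src values from tag soup, resolved against a base URI."""

    def __init__(self, base):
        super().__init__()
        self.base = base
        self.resolved = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in URI_ATTRS and value:
                self.resolved.append(urljoin(self.base, value))


def resolve_links(html, base):
    """Return the absolute form of every href/src in html, in document order."""
    parser = LinkResolver(base)
    parser.feed(html)
    parser.close()
    return parser.resolved
```

The same base works whether the markup arrived as inline XHTML or escaped HTML; the difference for the consumer is only in how the markup is parsed.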




Re: Does xml:base apply to type="html" content?

2006-03-30 Thread M. David Peterson
Then it should be a best practice that if they invoke this, the xml:base
value should be set upon the "element containing the text", in this case,
the content element.  Obviously you can't simply assume that the current
xml:base in context has any direct relation, and therefore value to the
current entry/content in context, as, using Aristotle's use case (and a
billion others just like it -- if not a billion now, it won't be too long
before that number is quite realistic, and in fact only scratching the Atom
feed surface of the not too distant future), there is no way that one can
simply assume that the current @xml:base value is legit.

It seems to me that this current definition of xml:base didn't take into
consideration the fact that the world would soon be revolving around XML
mashups, all of which can contain any number of possible combinations of
URIs which may have absolutely nothing even remotely in common with one
another.

Seems like maybe it's time for a quick update to the xml:base definition,
as this is not just an issue that affects Atom syndication feeds.

On 3/30/06, James M Snell <[EMAIL PROTECTED]> wrote:
> In retrospect, it likely would have been a good idea for us to have
> covered this in the Atom spec.  The definition of xml:base does include
> a statement that "[t]he base URI for a URI reference appearing in text
> content is the base URI of the element containing the text."  That would
> include URI references contained within the escaped HTML markup of Text
> constructs and the content element.
>
> - James
>
> Sean Lyndersay wrote:
>> This is unfortunate, because HTML itself only allows <base> elements
>> in the header (one per page). So if anyone wants to build a client
>> that displays more than one item at a time using a standard HTML
>> renderer (and most clients render HTML using someone else's renderer,
>> not their own), they have to go groveling in HTML to do URL fixup (or
>> use iframes).
>>
>> In my own case (IE7), this isn't that big a deal because we have to
>> grovel in HTML for many other reasons, but I suspect it'd be a pain
>> for other clients.
>>
>> My own reading goes like this: Since xml:base is an XML concept, it
>> should apply only to relative references in XML content (including
>> XHTML). From the XML perspective, the HTML content is just a string,
>> so the xml:base should not apply.
>>
>> Sean
>>
>> -Original Message-
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tim Bray
>> Sent: Thursday, March 23, 2006 10:49 AM
>> To: David Powell
>> Cc: Atom Syntax
>> Subject: Re: Does xml:base apply to type="html" content?
>>
>> On Mar 23, 2006, at 10:03 AM, David Powell wrote:
>>> xml:base applies to type="xhtml" content, but I'm not sure whether it
>>> is supposed to apply to escaped type="html" content? I reckon that it
>>> does.
>>
>> RFC4287, section 2:
>>
>> Any element defined by this specification MAY have an xml:base
>> attribute [W3C.REC-xmlbase-20010627].  When xml:base is used in an
>> Atom Document, it serves the function described in section 5.1.1 of
>> [RFC3986], establishing the base URI (or IRI) for resolving any
>> relative references found within the effective scope of the xml:base
>> attribute.
>>
>> Seems pretty clear to me.  Yes, the base URI of that HTML is now
>> whatever xml:base said it was -Tim

--
M. David Peterson
http://www.xsltblog.com/
 


Re: Does xml:base apply to type="html" content?

2006-03-30 Thread James M Snell

In retrospect, it likely would have been a good idea for us to have
covered this in the Atom spec.  The definition of xml:base does include
a statement that "[t]he base URI for a URI reference appearing in text
content is the base URI of the element containing the text."  That would
include URI references contained within the escaped HTML markup of Text
constructs and the content element.

- James
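
To make the point about "the base URI of the element containing the text" concrete, the short sketch below shows the same relative reference resolving to different targets depending on which base is in scope, using Python's `urljoin` as a stand-in for RFC 3986 resolution. The `resolve_all` helper and the example.org/example.net URLs are illustrative assumptions; the `http://a/b/c/d;p?q` base and its references come from the examples in RFC 3986, section 5.4.1.

```python
from urllib.parse import urljoin


def resolve_all(base, refs):
    """Resolve each relative reference against base using urljoin."""
    return {ref: urljoin(base, ref) for ref in refs}


# Base URI and references from the normal examples in RFC 3986, section 5.4.1.
RFC_BASE = "http://a/b/c/d;p?q"
rfc_examples = resolve_all(RFC_BASE, ["g", "../g", "/g", "?y"])

# The same reference inside escaped HTML, under two hypothetical xml:base values.
feed_level = urljoin("http://example.org/feed/", "pic.png")
entry_level = urljoin("http://example.net/page2/", "pic.png")
```

If the entry-level xml:base is the correct one, a consumer that ignores it and uses the feed-level value fetches the wrong image, which is why the in-scope base has to be the one applied.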

Sean Lyndersay wrote:
> 
> This is unfortunate, because HTML itself only allows <base> elements in the 
> header (one per page). So if anyone wants to build a client that displays 
> more than one item at a time using a standard HTML renderer (and most clients 
> render HTML using someone else's renderer, not their own), they have to go 
> groveling in HTML to do URL fixup (or use iframes).
> 
> In my own case (IE7), this isn't that big a deal because we have to 
> grovel in HTML for many other reasons, but I suspect it'd be a pain for other 
> clients.
> 
> My own reading goes like this: Since xml:base is an XML concept, it should 
> apply only to relative references in XML content (including XHTML). From the 
> XML perspective, the HTML content is just a string, so the xml:base should 
> not apply.
> 
> Sean
> 
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tim Bray
> Sent: Thursday, March 23, 2006 10:49 AM
> To: David Powell
> Cc: Atom Syntax
> Subject: Re: Does xml:base apply to type="html" content?
> 
> 
> 
> On Mar 23, 2006, at 10:03 AM, David Powell wrote:
> 
>>
>> xml:base applies to type="xhtml" content, but I'm not sure whether it
>> is supposed to apply to escaped type="html" content? I reckon that it
>> does.
> 
> RFC4287, section 2:
> 
> Any element defined by this specification MAY have an xml:base
> attribute [W3C.REC-xmlbase-20010627].  When xml:base is used in an
> Atom Document, it serves the function described in section 5.1.1 of
> [RFC3986], establishing the base URI (or IRI) for resolving any
> relative references found within the effective scope of the xml:base
> attribute.
> 
> Seems pretty clear to me.  Yes, the base URI of that HTML is now
> whatever xml:base said it was -Tim
> 
> 
> 



Re: Does xml:base apply to type="html" content?

2006-03-30 Thread M. David Peterson
Oopps > Canadian *M*ount*ie*Sorry Tim! :)On 3/30/06, M. David Peterson <[EMAIL PROTECTED]> wrote:
> @href attribute *or other attribute or elements who's value CAN or MUST be a URI/IRI* 

Re: Does xml:base apply to type="html" content?

2006-03-30 Thread M. David Peterson
> @href attribute *or other attribute or elements whose value CAN or MUST be a URI/IRI*

On 3/30/06, M. David Peterson <[EMAIL PROTECTED]> wrote:
> I have to wonder why xml:base would apply to anything other than the
> hardline schema-specific @href attribute values of the structured
> document to which the schema directly applies. [...]

Re: Does xml:base apply to type="html" content?

2006-03-30 Thread M. David Peterson
I have to wonder why xml:base would apply to anything other than the
hardline schema-specific @href attribute values of the structured
document to which the schema directly applies. Extending this, a good
portion of an Atom document is fairly rigid in regards to what is and is
not allowed until you reach the content element. Within the content
element can be basically anything, as long as it's either

- non-escaped plain text with @type set to text,
- escaped text with @type set to a valid 'text' MIME type,
- entity-escaped with @type set to html,
- xhtml wrapped in a properly xhtml-namespaced div with @type set to xhtml,
- base64 encoded with @type set to the proper media type, or
- xml with @type set to a proper XML MIME type.

In each of these cases, the only one that should have even a remote
chance of the current @xml:base value in context applying to it is
inline xml. But given that those of us who are inlining xml (that isn't
xhtml pulled from a referenced document) are doing so using a completely
different namespace, schema, etc., the chances of the current @xml:base
value in context even making it into the related xml before being
replaced by another @xml:base value are not all that great. And if it
does? Then its context document is going to be its containing Atom
file, in which xml:base would apply, but to what? It's in a different
namespace and has a different schema that applies to it. That means an
Atom-savvy processor understanding that a particular element or
attribute value is a URI, and therefore applying the current @xml:base
value in context to it, is not something that fits within the confines
of the Atom specification, given that there's no guarantee that a schema
language it even partially understands is going to be applied to the
contained content to act as a URI guide for the now legally Blind as a
BAtom processor. ;)

With all of this stated, if you're not all already sick of me, here's
one last final point to help push you over the edge ;) :D The escaped
HTML content contained within the content element that David was
originally concerned with is more than likely a copy of all or part of
the elements and content contained inside the body tag of the external
document referenced by an associated link element, and therefore there's
no guarantee that the xml:base of the Atom feed is going to be anywhere
even close to accurate. Of course, for the Atom feed to validate
correctly, the link element's @rel value will need to be either
'alternate', 'via', 'related', or a spec-conforming IRI, as 'enclosure',
if inline, is base64 encoded, and 'self'? Well now, that wouldn't apply
correctly to a link/@rel who has a grandparent by the name of feed, now
would it :)

So this all brings us down to the last possible scenario... the @src of
the content element. It would seem to me that if there is an @xml:base
value currently in context, then as soon as it reaches the '>' character
of the opening content element, it no longer has jurisdiction... Kind of
like a Canadian mounty has to call it quits once he/she reaches the
CA/USA borderline... Or something like that anyway :)

Peace, Love, and all the Atomic Joy you can handle is wished upon all of
you :)
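
To make the "xml:base in context" question concrete, here is a minimal Python sketch of how a processor might work out which base URI is in scope for each content element. The feed text, the example.org / example.net URLs, the retrieval URL, and the helper name bases_in_scope are all invented for illustration; only the standard-library calls are real. It assumes the usual XML Base behaviour where a nested xml:base overrides (or, if relative, refines) the inherited one.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the predefined XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical aggregated feed: one entry carries its own absolute
# xml:base, the other refines the feed-level base on <content>.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom" xml:base="http://example.org/planet/">
  <entry xml:base="http://blog.example.net/2006/">
    <content type="html">&lt;a href="03/post.html"&gt;post&lt;/a&gt;</content>
  </entry>
  <entry>
    <content type="html" xml:base="archive/">&lt;img src="pic.png"&gt;</content>
  </entry>
</feed>"""


def bases_in_scope(elem, inherited, found):
    """Collect the base URI in scope for every atom:content element."""
    # An absent xml:base leaves the inherited base unchanged; a relative
    # xml:base is itself resolved against the inherited base.
    base = urljoin(inherited, elem.get(XML_BASE, ""))
    if elem.tag == ATOM + "content":
        found.append(base)
    for child in elem:
        bases_in_scope(child, base, found)
    return found


root = ET.fromstring(FEED)
# The second argument stands in for the URL the feed was retrieved from.
content_bases = bases_in_scope(root, "http://example.org/feed.atom", [])
print(content_bases)
```

Whether the base found this way should then be applied to the escaped HTML inside the element is exactly what this thread is debating; the sketch only shows which value is in scope.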

Re: Does xml:base apply to type="html" content?

2006-03-30 Thread A. Pagaltzis

* Sean Lyndersay <[EMAIL PROTECTED]> [2006-03-31 04:00]:
>This is unfortunate, because HTML itself only allows <base>
>elements in the header (one per page). So if anyone wants to
>build a client that displays more than one item at a time using
>a standard HTML renderer (and most clients render HTML using
>someone else's renderer, not their own), they have to go
>groveling in HTML to do URL fixup (or use iframes).

That’s exactly the problem currently facing Liferea.

However, exempting type="html" content from xml:base
processing won’t help.

If the items can come from multiple feeds, such as is supported
by Liferea, then mixing items from an Atom feed that uses
xml:base and other feeds automatically runs into the same issue.
In that scenario, either the tag soup from the other feeds must
be fixed up so the view can be rendered as XHTML (which supports
xml:base in content), or URL fixup needs to be done on the
content from the Atom feed so it can be passed to a tag soup
renderer.

Regards,
-- 
Aristotle Pagaltzis // 
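
For readers wondering what the "URL fixup" mentioned above might look like, the following is a rough Python sketch that re-serialises a tag soup fragment and resolves href/src values against a base URL before the fragment is handed to a renderer. The class name BaseFixer, the function fix_urls, and the example URLs are made up for this illustration. It is deliberately incomplete: it only looks at href and src, drops comments, and does not treat script/style content specially.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin

# Attributes treated as URL-valued in this sketch (not an exhaustive list).
URL_ATTRS = {"href", "src"}


class BaseFixer(HTMLParser):
    """Re-serialise tag soup, resolving href/src against a base URL."""

    def __init__(self, base):
        super().__init__(convert_charrefs=True)
        self.base = base
        self.out = []

    def _start(self, tag, attrs, closing):
        parts = [tag]
        for name, value in attrs:
            if value is None:          # bare attribute such as "disabled"
                parts.append(name)
                continue
            if name in URL_ATTRS:
                value = urljoin(self.base, value)
            parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + closing)

    def handle_starttag(self, tag, attrs):
        self._start(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._start(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Character references were decoded by the parser; escape again.
        self.out.append(escape(data, quote=False))


def fix_urls(html_fragment, base):
    """Return html_fragment with relative href/src made absolute."""
    fixer = BaseFixer(base)
    fixer.feed(html_fragment)
    fixer.close()
    return "".join(fixer.out)


fragment = '<p>See <a href="../pics/cat.png">cat</a> &amp; <img src="dog.png"></p>'
fixed = fix_urls(fragment, "http://example.org/blog/2006/")
print(fixed)
```

Once every item's references are absolute, items from different feeds can share one page without competing for a single document-wide base.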



RE: Does xml:base apply to type="html" content?

2006-03-30 Thread Sean Lyndersay


This is unfortunate, because HTML itself only allows <base> elements in the 
header (one per page). So if anyone wants to build a client that displays more 
than one item at a time using a standard HTML renderer (and most clients render 
HTML using someone else's renderer, not their own), they have to go groveling 
in HTML to do URL fixup (or use iframes).

In my own case (IE7), this isn't that big a deal because we have to grovel 
in HTML for many other reasons, but I suspect it'd be a pain for other clients.

My own reading goes like this: Since xml:base is an XML concept, it should 
apply only to relative references in XML content (including XHTML). From the 
XML perspective, the HTML content is just a string, so the xml:base should not 
apply.

Sean
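
The "just a string" reading can be seen directly in a generic XML parser. The Python sketch below is an illustration only, with an invented entry and file name: it parses one escaped type="html" construct and one type="xhtml" construct and shows that the first arrives as a single text node while the second arrives as real elements.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
XHTML = "{http://www.w3.org/1999/xhtml}"

# Hypothetical entry with the same link expressed both ways.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <content type="html">&lt;a href="a.html"&gt;a&lt;/a&gt;</content>
  <summary type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml"><a href="a.html">a</a></div></summary>
</entry>"""

entry = ET.fromstring(ENTRY)
html_content = entry.find(ATOM + "content")
xhtml_summary = entry.find(ATOM + "summary")

# Escaped HTML: no child elements, only character data for the XML layer.
print(len(html_content), repr(html_content.text))

# XHTML: the markup is part of the XML tree and can be walked as elements.
xhtml_tags = [el.tag for el in xhtml_summary.iter()]
print(xhtml_tags)
```

An XML-level tool therefore never sees the href inside the escaped HTML; any base URI has to be handed over explicitly to whatever later parses that string.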

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Tim Bray
Sent: Thursday, March 23, 2006 10:49 AM
To: David Powell
Cc: Atom Syntax
Subject: Re: Does xml:base apply to type="html" content?



On Mar 23, 2006, at 10:03 AM, David Powell wrote:

>
>
> xml:base applies to type="xhtml" content, but I'm not sure whether it
> is supposed to apply to escaped type="html" content? I reckon that it
> does.

RFC4287, section 2:

Any element defined by this specification MAY have an xml:base
attribute [W3C.REC-xmlbase-20010627].  When xml:base is used in an
Atom Document, it serves the function described in section 5.1.1 of
[RFC3986], establishing the base URI (or IRI) for resolving any
relative references found within the effective scope of the xml:base
attribute.

Seems pretty clear to me.  Yes, the base URI of that HTML is now
whatever xml:base said it was -Tim
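
As a small illustration of the resolution the quoted RFC text refers to, the Python snippet below resolves a few relative references against a base URI using the standard library's urljoin. The base URI and the references are made-up examples, not values from any feed discussed here.

```python
from urllib.parse import urljoin

# Hypothetical base URI, as it might be established by an xml:base attribute.
base = "http://example.org/2006/03/"

# A few kinds of reference a piece of content might contain.
references = [
    "pic.png",                  # relative path
    "../about",                 # parent directory
    "/feed.atom",               # absolute path on the same host
    "http://other.example/x",   # already absolute, so left untouched
]

resolved = {ref: urljoin(base, ref) for ref in references}
for ref, target in resolved.items():
    print(ref, "->", target)
```

Under Tim's reading, this is the step a consumer would apply to references inside escaped HTML content as well, using whatever xml:base is in scope for the element that contains it.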