from:"David Powell"

Re: Inheritance of license grants by entries in a feed

2007-01-14 Thread David Powell

Sorry for the delay in responding. I disagree that feed elements apply
to the feed document and not the feed itself. I believe that both the
spirit and letter of the specification make it clear that feed
elements are metadata about the feed, not the document, and the typical
behaviour of implementations seems to agree.

I agree that it is important to distinguish between feeds and feed
documents, and this is why I think that feed level inheritance of
licenses should be dropped as it is incompatible with Atom.

Monday, December 18, 2006, 10:22:17 PM, Bob Wyman wrote:

> On 12/17/06, David Powell <[EMAIL PROTECTED]> wrote:
>>  What you can do however, is to specify that feed licenses apply to the
>> "feed", and inherit to the entries in the feed. ... It
>> means that the license applies to all entries in that feed,  not just
>> ones in that specific feed document. This is probably reasonable
>> behaviour for licenses anyway.

> Particularly in the case of licenses, it is very important to
> distinguish between the "feed" or stream of all entries (past,
> present and future) associated with a feed id and the actual feed
> documents that encapsulate subsets of that stream.

> Atom provides no mechanism for associating meta-data with "feeds."

The text of RFC4287 seems to contradict this:

  The "atom:feed" element is the document (i.e., top-level) element of
  an Atom Feed Document, acting as a container for metadata and data
  associated with the feed.

Atom does support inheritance of atom:author and atom:rights elements, but
only because this behaviour is clearly documented in the core
specification, so there can be no doubt about how implementations should
process these elements:

  If an atom:entry element does not contain atom:author elements, then
  the atom:author elements of the contained atom:source element are
  considered to apply. In an Atom Feed Document, the atom:author
  elements of the containing atom:feed element are considered to apply
  to the entry if there are no atom:author elements in the locations
  described above.

Effectively, this inheritance can be implemented by copying the
elements at the feed document parsing stage.
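
As a rough illustration of "copying the elements at the feed document
parsing stage", here is a minimal Python sketch. The function name
resolve_authors and the sample feed are made up for this example; only
the lookup order (entry, then atom:source, then atom:feed) comes from
the RFC text quoted above.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom 1.0 namespace in ElementTree notation

# Hypothetical sample document, invented for this sketch.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <author><name>Feed Author</name></author>
  <entry><id>urn:example:1</id></entry>
  <entry><id>urn:example:2</id><author><name>Entry Author</name></author></entry>
</feed>"""


def resolve_authors(feed_xml):
    """Map each entry id to the author names that apply to it.

    Lookup order: the entry's own atom:author elements, then those of
    its atom:source, then those of the containing atom:feed.
    """
    feed = ET.fromstring(feed_xml)
    feed_authors = feed.findall(ATOM + "author")
    resolved = {}
    for entry in feed.findall(ATOM + "entry"):
        authors = entry.findall(ATOM + "author")
        if not authors:
            source = entry.find(ATOM + "source")
            if source is not None:
                authors = source.findall(ATOM + "author")
        if not authors:
            authors = feed_authors
        entry_id = entry.findtext(ATOM + "id")
        # Copy the names out so each stored entry carries its own author data.
        resolved[entry_id] = [a.findtext(ATOM + "name") for a in authors]
    return resolved
```

Because the copy happens while the document is still in hand, a later
poll with a different feed-level author cannot retroactively change
entries that were already stored.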

You can't just say that the license extension inherits and expect
every implementation out there to implement that.  You'd need an Atom
2.0 to do that: either support for must-understand (which was rejected
from Atom 1.0), or a special feed document extension container.  I
agree that feed document inheritance would be a useful feature, but as
we don't have it we'll have to write out these duplicate elements
longhand.  It isn't such a big deal though.

It also seems obvious to me that feed elements are metadata applied
to the feed based on the expected behaviour of implementations.  As an
example:

I observed a feed that I am subscribed to in Bloglines change its feed
title in the feed list pane. The feed did this by changing the value
of the atom:title element for the feed. Surely you wouldn't say that no
conclusion can be drawn from this change in feed document as to the
intended change in state of the feed?

The whole feed model is based around changing feed documents
communicating the latest state of the feed; for both the entries in
the feed and the feed itself.

In fact the whole web is based around the fact that representations of
resources can vary over time, and that the latest representation is,
the latest.

> Data in one feed document does not apply to entries found in another
> feed document -- or to entries that stand-alone. Feed meta-data
> found in one feed document does not override, complement or
> invalidate feed meta-data found in other feed documents.

If you poll a feed twice you have two feed documents.  These are
obviously related.  If I update an entry element, then the later
version is understood to be a replacement.  You can correlate them by
the entry ids and updated.  If I update a feed element, then the
later version is understood to be the replacement.  What's the
problem?  I don't see how it could work any other way.
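
To make the model concrete, here is a small Python sketch of a stateful
store that treats each polled document as the latest word on the feed.
The FeedStore class and the dict layout of a "document" are assumptions
made for the example, not any real aggregator's API. Timestamps are
assumed to be RFC 3339 strings in UTC with the same precision, so plain
string comparison orders them correctly.

```python
from dataclasses import dataclass, field


@dataclass
class FeedStore:
    """Hypothetical store: one set of feed metadata, one record per entry id."""

    feed_meta: dict = field(default_factory=dict)
    entries: dict = field(default_factory=dict)

    def poll(self, document):
        # The newest document's feed-level metadata replaces the old state.
        self.feed_meta = dict(document["feed"])
        for entry in document["entries"]:
            known = self.entries.get(entry["id"])
            # Assumption: "updated" values are comparable as strings
            # (same UTC offset and precision).
            if known is None or entry["updated"] >= known["updated"]:
                self.entries[entry["id"]] = dict(entry)
```

Note that entries which have dropped out of the current document stay in
the store, and there is only ever one copy of the feed metadata. That is
the behaviour that makes per-document inheritance of unknown extensions
unworkable.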

> This is one of the many reasons we have atom:source -- so that we
> can bind specific feed meta-data to an entry no matter what context
> in which that entry might appear or when it might be read.

Atom doesn't describe the processing model of Atom documents
explicitly enough for me to infer much about the semantics of
atom:source.  If you want each entry (or a group of entries within a
document) to have its own private feed state, then atom:source is one
way to implement that.  Needing to do that is a good sign that you are
abusing feed elements to carry entry metadata though.

> If we had a case where data in one feed document overrides data in
> other feed documents, we'd have a mess. Some of the questions that
> we'd have to answer are:

> + Elements like at

Re: Inheritance of license grants by entries in a feed

2006-12-17 Thread David Powell

Sunday, December 17, 2006, 1:55:39 AM, Bob Wyman wrote:

>  2.3.  Inherited Licenses
> The license on a feed MAY be inherited by entries.

James,

I'm not sure exactly what you are trying to achieve with the
inheritance rule for licenses, but I think that it could do with the
term "feed" being more accurately specified.

Whilst it would be very useful for extensions to be able to support
inheritance rules like the ones that Atom specifies for atom:author
and atom:rights, which cause properties applied to a "feed document"
to inherit to the entries declared within the "feed document", there
is nothing in Atom's specification of extension elements that
supports this short-hand notation. Attempting to emulate this
behaviour in an extension will cause real-world implementations of
feed stores to assign licenses to the wrong entries, or to fail to
assign them at all. This is simply because Atom implementations tend
to give entries and feeds separate life-cycles, and implementations that
maintain a feed-state over multiple pollings of a feed are unlikely to
associate each entry with the set of feed document metadata from each
of the documents that it occurred in.

Eg, if you store a feed in an implementation such as Microsoft's Feed
Engine, only a single set of feed extensions will be associated with
the feed. This will mean that if you change the license in the feed
document when a feed is subsequently polled, intending it only to
apply to the entries within that new feed document, you will
effectively retroactively apply the license to the old entries too.
Atom, unfortunately, doesn't have a way of indicating that an
extension applies at "feed document"-level and MUST be processed at
the feed document parsing stage.

What you can do however, is to specify that feed licenses apply to the
"feed", and inherit to the entries in the feed. This behaviour doesn't
require implementations to be psychic and guess that an unknown
extension needs to be processed at the document parsing stage. It
means that the license applies to all entries in that feed, not just
ones in that specific feed document. This is probably reasonable
behaviour for licenses anyway.

This might be your intention, but I'm not clear from the draft.

-- 
Dave

Re: Atom Entry docs

2006-12-15 Thread David Powell



I've always interpreted a kind of inheritance relationship between
MIME types.

It's never wrong to label an Excel file, an XML document, or an Atom
Feed as application/octet-stream, because all of those types ARE
octet-streams.  It is just not as helpful as it could be.

Likewise, it is never wrong to label an Atom Feed, as application/xml.
It just isn't being optimally helpful.

I think that you have to interpret MIME types this way, otherwise a
generic XML processing application which labelled content as
application/xml would be standards compliant until the day that the
Atom RFC was published when it would become un-compliant overnight
without anyone touching the code. I don't think that that is the way
that standards are supposed to work.

So likewise, I see no harm in inventing application/atomentry+xml.
Sure, applications can still use application/atom+xml for entries,
they just aren't being as helpful as ones that use the more explicit
MIME type.
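
A toy Python sketch of this reading of MIME types as an "is-a" chain.
The PARENTS table is my own illustration of the argument, not anything
defined by the MIME RFCs, and application/atomentry+xml is the
hypothetical type proposed above.

```python
# Hypothetical generalisation table: each type maps to a less specific
# label that is still a true description of the same bytes.
PARENTS = {
    "application/atomentry+xml": "application/atom+xml",
    "application/atom+xml": "application/xml",
    "application/xml": "application/octet-stream",
}


def generalisations(media_type):
    """Return the type followed by progressively more generic labels."""
    chain = [media_type]
    while chain[-1] in PARENTS:
        chain.append(PARENTS[chain[-1]])
    return chain


def is_acceptable_label(label, actual):
    """True if `label` is the actual type or one of its generalisations."""
    return label in generalisations(actual)
```

Under this model, labelling a feed as application/xml is acceptable,
just less helpful, whereas labelling generic XML as application/atom+xml
is not.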


-- 
Dave

Re: Atom Entry docs

2006-12-15 Thread David Powell

Thursday, December 14, 2006, 9:04:00 AM, Henri Sivonen wrote:

> On Dec 13, 2006, at 17:51, Mark Baker wrote:
>> But
>> given that an alternative exists which shouldn't break those servers,
>> why not use it when there's no apparent downside?

> The downside is that implementations that (quite reasonably) assume  
> that application/atom+xml == feed are also reasonable when they  
> ignore unknown media type parameters.

An example would be an HTML page with rel="alternate" links
pointing to a feed and an Atom Entry document.  This seems quite a
reasonable use-case, yet if we don't create a new MIME type, then I'd
expect that all current feed reader implementations would incorrectly
detect the entry document to be a feed, which would be very confusing
for the user if they select the entry document, and their feed reader
attempts to subscribe to the entry document.  Which would either work,
and they would get subscribed to a feed that doesn't update, or they'd
get an error.

> Given the options of a new type or a new parameter, I am +1 on the  
> new type. (Although in general, I don't like the proliferation of  
> application/*+xml types, because apps need to do root sniffing for  
> application/xml anyway.)

Another issue with MIME parameters is that the old MIME RFCs are
inconsistent about the definition of the term "media type", which can
mean top-level type, top-level/sub-type, or top-level/sub-type with
parameters. When Atom was being developed I asked (several times)
whether Atom documents are allowed to contain parameters in places
that they specify media types, and I got conflicting answers. My
perception is that MIME parameters just don't work very well.

Another case for a new type is browser dispatch.  Operating systems
tend not to use MIME types for deciding which application to use.
Whilst a feed reader is a sensible application to open a feed in, an
entry document should probably be opened in an editing/publishing
application.  This is not really possible if we use a MIME parameter
to distinguish the two document types.

-- 
Dave

Re: PaceAtomBidi

2006-10-05 Thread David Powell

I don't have much experience with bidi. I've been having a quick read
up on it, and there seem to be the following features. Correct me if
I’m wrong.

a) Unicode implicitly supports bidi. Write a span containing Hebrew
characters, and it will be laid out right to left. We don’t need to do
anything to support this.

b) and Unicode controls such as RIGHT-TO-LEFT
EMBEDDING, give hints about how to layout neutral characters such as
punctuation marks that are used between flows of ltr and rtl text. We
don’t need to do anything to support these Unicode controls. can be used in Text constructs if required, but such a mechanism
isn’t necessary for the rest of Atom.

HTML benefits from because it is easier to type it in HTML
source than Unicode escapes, but nobody really types Atom source by
hand, and operating systems do support embedding of Unicode escapes
via text widgets (eg, on Windows, install supplemental language
support via Regional Settings control panel - I think?). Allowing dir
to be used anywhere in Atom does not provide the same feature as , because the sole purpose of it is to disambiguate mixed
directions *within* a flow of text, and allowing @dir anywhere in Atom
would not achieve that granularity.

c) and Unicode controls such as RIGHT-TO-LEFT OVERRIDE
allow the Unicode algorithm to be disabled and direction to be
hard-coded. Under what circumstances is this needed? Is it
appropriate for Atom?

d) sets the base direction of the text. One of the
effects of this is that makes paragraphs right
aligned, and table columns to be laid out right-to-left.

I'm not convinced that we need support for this in Atom. I'd like to
hear more opinions. In HTML it makes sense to specify that a block of
text is right-aligned, but in Atom there are no blocks of text; just
fields. Isn't this largely a presentation feature?

Adding an attribute that is supposed to inherit from feed elements to
entry elements is a breaking change to Atom that is likely to lead to
silent data-loss with existing implementations. Implementations would
need to know to inherit the attribute down to entries and store it
with the entries rather than with the feed, so that retrospective
changes don't change the direction of previously polled entries.

I think that inheritance is useful for a hierarchically structured
document like HTML, but confusing for Atom, which is essentially a
bunch of XML-ified structs with a fairly arbitrary hierarchy
representing multiple entities with separate life-cycles. (People deal
with feeds containing a subset of all entries ever, not with feed
documents)

e) in IE at least, this affects not only the page,
but the layout of browser chrome such as scrollbars and Javascript
alerts. I don't think that this feature is appropriate for a syndication
format. The user preferences of the user-agent (probably via the OS)
would probably be better suited to control the application layout.

So, the question is: do we need the ability to mark up the "base
direction" of text in Atom, at a block level? If so, what are the
applications of the "base direction" in Atom's layout-agnostic fields
such as category label?

Also at what granularity do we need to do this? On every element, or
would it be sufficient to use extension elements to describe the base
direction at the entry/feed level (cf: ) ?

--
Dave

Re: PaceAtomBidi

2006-10-04 Thread David Powell



It is hard to discuss the impacts without knowing what status we are
trying to achieve with this and any other proposed changes to Atom.

Are we planning on changing the Atom namespace?

Adding inheritable attributes seems like a breaking change.
Existing compliant implementations will silently corrupt such feeds.


Example (incomplete):


  rtl feed title
  
 this should be right to left
  


If this feed is stored anywhere then, a) the undefined attribute may
get lost; and b) even if it doesn't, stateful implementations won't
know that they have to resolve each entry against it and store the
attribute with the entry data as well as the feed data.

Eg, if a later entry is posted to the same feed, and the feed polled
as:


  rtl feed title
  
 this should be left right
  


A stateful implementation needs to know that this implies a feed with
the latest feed metadata:


  rtl feed title


and two entries:


   this should be right to left


   this should be left right




-- 
Dave

Re: Atom and bidi (was: Re: Atom Syndication Format To Draft Standard?)

2006-10-02 Thread David Powell

Tuesday, October 3, 2006, 1:55:31 AM, I wrote:

> As we depend on Unicode, then we can't really stop people from using
> Unicode bidi. We can't stop people from using HTML/XHTML bidi. Or even
> CSS bidi controls. I think we should think carefully before we
> introduce yet another method for bidi text.

Hmm, that sounded a bit odd. I don't want to "stop people from using
bidi"...

I was trying to say that implementations can support Unicode bidi and
HTML bidi today without any change to the spec, and that they seem
more powerful than an Atom bidi attribute.

-- 
Dave

Re: Atom and bidi (was: Re: Atom Syndication Format To Draft Standard?)

2006-10-02 Thread David Powell

Tuesday, October 3, 2006, 12:20:01 AM, James Snell wrote:

> I think the suggestion of adding a dir attribute is a very good idea.
> The great thing is that it can be done without any significant backwards
> compatibility concerns.  The definition of the attribute is simple enough:

>   atomCommonAttributes =
>   attribute xml:base { atomUri }?,
>   attribute xml:lang { atomLanguageTag }?,
>   attribute dir { "rtl" | "ltr" }?,
>   undefinedAttribute*

In the context of Atom, what's the problem with the Unicode bidi
control characters?

I suspect that browsers and standard OS text input widgets have better
support for Unicode bidi, than they do for a currently non-existing
Atom attribute.

Which elements would this help?

  content, subtitle, summary, rights and title support HTML, so this
  wouldn't be necessary for them.

  updated, published, logo, id, and icon I would guess can cope
  without.

  extensions are responsible for their own namespace, I don't think
  that we need to say what attributes can appear on an extension.

I think [EMAIL PROTECTED], [EMAIL PROTECTED], and [EMAIL PROTECTED] are the only
attributes that would really benefit.

Wouldn't Unicode bidi be more powerful than a single direction
element, that would restrict the field to a single direction?

As we depend on Unicode, then we can't really stop people from using
Unicode bidi. We can't stop people from using HTML/XHTML bidi. Or even
CSS bidi controls. I think we should think carefully before we
introduce yet another method for bidi text.  Especially one that will
be incompatible with all existing Atom consumers.

-- 
Dave

Re: atom license extension (Re: [cc-tab] important heads up)

2006-09-06 Thread David Powell

Wednesday, September 6, 2006, 11:38:13 AM, you wrote:

> So, here's the proposal:

> - Use  for entry licenses -- either on the feed
>   level, setting a default analogous to what atom:rights does, or on
>   the element level.

I think that there are data modelling issues with this approach. I
don't think that the inheritance of extensions from the 'feed
document', to the entries contained within that document is supported
by the spec, nor would it be likely to be supported by typical
implementations of stateful feed stores, frameworks and APIs.

Feeds and entries are separate entities with separate life-cycles.
Stateful feed platforms, such as the Windows feed platform,
typically store a single instance of feed metadata, and a single
instance of entry metadata.

When, for example, a feed title changes, the change applies to the
feed as a whole; it isn't localised to only apply to the entries that
were present in that feed document at the time of the change, because
each entry doesn't typically store its own private copy of feed
metadata.

Implementors can cope with the inheritance of atom:rights and
atom:author, because it is explicitly described in the spec, and
implementors know that they must implement the inheritance at the
document parsing stage, applying the feed-level data to each entry
before they attempt to store the entries in a database or whatever.
But implementations cannot be expected to apply
all feed-level extensions to the entries that they were transmitted
with, just in case any of them might expect to implement feed document
inheritance.

Feed properties are properties of the feed, not the feed document. An
extension can't implement atom:rights/atom:author style inheritance
from the feed document to the contained entries.

-- 
Dave

Re: Language Negotiation

2006-07-26 Thread David Powell

Wednesday, July 26, 2006, 8:33:55 PM, James M Snell wrote:

> Now imagine that we start to apply machine translation to entries, so
> that we can say, give me all entries, but translate them to French, or
> English, etc.  Would that be best done using conneg or separate URIs?

I'd go for separate URLs. Server-driven conneg (SDN) has its uses, but
for a protocol like atom (-syntax), the benefits are very limited, and
the risks and costs are high. With client-driven negotiation (CDN),
the client and server both have a better understanding of what they
are asking for and what is available - there is less risk of anything
going wrong.

The only advantage of SDN, is that it selects what it thinks is the
best feed without any user interaction - but that is something that
only needs to be done once anyway.

With SDN, you'll need to employ the Vary header, which can adversely affect
caching.

Hosted services (eg Bloglines) won't work unless they store multiple
separate lists of entries and feed data for each set of request
headers that the request varies over; this seems very optimistic.

If you want to use a user's language preference to provide them with a
localized feed, I'd do it at the HTML level - use accept headers to
provide the correct autodiscovery link, or even better, to sort the
list of multiple links in a suitable order.
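
A rough Python sketch of the "sort the list of links" idea. The function
names and the (hreflang, href) tuple layout are invented for the
example, and the Accept-Language parsing is deliberately simplified: it
only looks at a leading q parameter and treats malformed weights as 0.

```python
def parse_accept_language(header):
    """Parse an Accept-Language header into (tag, q) pairs (simplified)."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        params = params.strip()
        q = 1.0
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        prefs.append((tag.strip().lower(), q))
    return prefs


def sort_feed_links(links, header):
    """Order (hreflang, href) autodiscovery links by the user's preference.

    All links are kept; the sort is stable, so links with equal weight
    stay in their original order.
    """
    weights = {}
    for tag, q in parse_accept_language(header):
        weights[tag] = max(q, weights.get(tag, 0.0))

    def weight(link):
        lang = link[0].lower()
        primary = lang.split("-")[0]
        return weights.get(lang, weights.get(primary, weights.get("*", 0.0)))

    return sorted(links, key=weight, reverse=True)
```

The point of sorting rather than selecting is that every language
variant keeps its own URL, so the client still sees all the choices and
caches are not affected by the request headers for the feeds themselves.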

What are you going to do about ids BTW? I'd probably mint new ids for
the translated entries and feeds, and employ some sort of
link/extension if you need to be able to associate them.

-- 
Dave

Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests

2006-06-28 Thread David Powell

Wednesday, June 28, 2006, 9:55:29 PM, James Snell wrote:

> David,

> you're right, ideally the xhtml container div would be nothing but the
> div, but if it's not, we still need to be prepared to handle it.  Silent
> data loss sucks, if it's silly data :-)

I'm just looking at it from the perspective of the producer and the
consumer.

In my consumer implementation, I take the resolved base URI of the div
(including any xml:base there), and the language context of the div,
discard the div, and store them both out-of-band of the content, with
namespace prefixes inline. That's probably good enough. Some
post-processing is used to convert the data in the store into a form
that allows it to be safely embedded in an HTML page - I've been
trying XSLT (with TagSoup for HTML content).

I don't think that the div should have lang or base attached, but if
it is there, it is better to use it than ignore it, cause it is likely
there for a reason. I wouldn't produce feeds like that though.

If people start using CSS links in feeds (or even just CSS styling in
aggregators), discarding the div could be important.

If you're going to supply an API for extracting usable
[X]HTML, there are a number of features that consumers might want in
some combination:

* Forcing the XHTML to use a blank namespace prefix to make it DTD
  compatible, and removing unused prefixes.

* Resolving relative references (which will inevitably be a lossy
  process)

* Removing XSS risks (intentionally lossy)

I still keep the original content in a reasonably accurate form
though.

-- 
Dave

Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests

2006-06-28 Thread David Powell

Wednesday, June 28, 2006, 1:22:00 PM, James Snell wrote:

> Hiding the div completely from users of Abdera would mean
> potentially losing important data (e.g. the div may contain an xml:lang
> or xml:base)

I don't think that the div should contain an xml:base, because it
isn't valid to use xml:base in XHTML 1.x. As the xhtml:div is added by
the producer, it should be removed by the consumer, so there shouldn't
be an xml:lang in there either. I wouldn't expect consumers to handle
either consistently, so if you are a producer don't do it. I think in
my implementation I handle lang and base on the div, and store them
out-of-band, but it is more by accident than anything.

I would hope that any other xmlns:* declarations on xhtml:div are
honoured. Namespaces are so core to XML that making any
recommendations about their placement is asking for trouble.

> or forcing me to perform additional processing (pushing the
> in-scope xml:lang/xml:base down to child elements of the div.

I avoid that, it isn't nice as the xml:base will make the XHTML
invalid and browser-dependent. In my RDF implementation, I store the
lang context, base context, content model, and other stuff out-of-band
from the content itself. I do rely on RDF's exclusive canonicalization
rules though, to preserve the in-scope namespace decls.

(I assume that namespace decls aren't strictly allowed in valid XHTML
either? Oh well...)

> It also has ease-of-use ramifications on the API. So I really do
> need a solid answer on this one.

You need to preserve a load of context in addition to the content
string itself, so expect to have to return these extra properties for
each use of Text Constructs in your API.  It is a bit of a
high-barrier to entry really.

(If Atom had been designed in JSON, instead of XML, I wonder if it
would have been more sympathetic to the OO/RDBMS crowd, and whether we
would have bothered with such fine-grained language tagging?)

-- 
Dave

Re: Link rel test cases

2006-05-26 Thread David Powell

Friday, May 26, 2006, 6:57:03 PM, Robert Sayre wrote:

> On 5/26/06, James Holderness <[EMAIL PROTECTED]> wrote:
>>
>> Logically I would assume the simple string comparison in section 5.3.1 of
>> RFC3987, but I was hoping this would be documented somewhere more
>> explicitly. An atom:id is an IRI too, but it explicitly specifies
>> character-by-character, case-sensitive comparisons. By not doing the same
>> for link relations the spec kind of leaves things open to interpretation.

> RFC3986, section 6 and RFC3987, section 5 document this procedure very
> well. The comparison ladder reduces false negatives in exchange for
> processing effort. If you think the ladder is worth climbing in this
> case, go for it.

Yeah, equally if you are a publisher, and don't want to restrict
interoperability to just the set of clients that climb the ladder,
then don't do anything stupid.

We probably should have specified the "simple string comparison"
method; it is more satisfying to call an implementation that publishes
not obviously equivalent IRIs 'wrong', rather than just 'somewhat unwise'.
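
For comparison, a Python sketch of the bottom rung of the ladder next to
one rung up. The function names are mine. The second function only
lower-cases the scheme and authority, which is a small part of the
normalisation described in RFC 3986 section 6, and it assumes there is
no userinfo component (which would be case-sensitive).

```python
from urllib.parse import urlsplit, urlunsplit


def simple_equal(a, b):
    """Simple string comparison: character by character, case-sensitive."""
    return a == b


def case_normalised_equal(a, b):
    """One rung up: treat scheme and host as case-insensitive.

    Assumption: no userinfo in the authority, so lower-casing the whole
    netloc is safe. Path, query and fragment are left untouched.
    """
    def norm(iri):
        parts = urlsplit(iri)
        return urlunsplit((
            parts.scheme.lower(),
            parts.netloc.lower(),
            parts.path,
            parts.query,
            parts.fragment,
        ))

    return norm(a) == norm(b)
```

A publisher who always emits exactly the same string never needs to care
which rung a consumer stops at.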

-- 
Dave

Re: Link rel test cases

2006-05-26 Thread David Powell



Friday, May 26, 2006, 2:31:40 PM, Andreas Sewe wrote:

> But the test cases should IMHO not test whether "ALTERNATE" works, since
> it should not, but whether 
> "http://www.iana.org/assignments/relation/alternate" does. But then 
> again my reading of 4.2.7.2 might be wrong.

I agree. @rel is case sensitive as it is an IRI-ref, therefore
ALTERNATE isn't correct. Any resemblance of HTML's link rel is purely
coincidental.

Actually, as @rel is an IRI-ref with
"http://www.iana.org/assignments/relation/" as its base-IRI, sick
stuff like this should work:

http://www.example.com"; />
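
A quick Python sketch of that reading, using ordinary reference
resolution against the IANA base. expand_rel is a hypothetical helper.
Note that RFC 4287 section 4.2.7.2 words the rule as appending the name
to that string, so treating it as full relative resolution is my
interpretation rather than something the spec spells out.

```python
from urllib.parse import urljoin

IANA_REL_BASE = "http://www.iana.org/assignments/relation/"


def expand_rel(rel):
    """Resolve a rel value against the IANA base, as an IRI reference.

    Assumption: rel is treated as a relative reference; absolute IRIs
    pass through unchanged.
    """
    return urljoin(IANA_REL_BASE, rel)
```

With this, "alternate" expands to the registered IRI, "ALTERNATE"
expands to a different one, and an odd-looking relative path such as
"../relation/alternate" ends up equivalent to plain "alternate".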

-- 
Dave

Re: Feed Thread in Last Call

2006-05-25 Thread David Powell

Tuesday, May 23, 2006, 10:31:37 PM, Tim Bray wrote:

> I would say that furious debates about elements-vs-attributes have
> been going on since the dawn of XML in 1998, but that would be
> untrue; they've been going on since the dawn of XML's precursor SGML
> in 1986. They have never led anywhere. After you've noticed that if
> you need nested element structure you're stuck with elements, and if
> you don't want to have two things with the same names attributes can
> help, there really aren't any deterministic decision procedures.

> I note with pleasure that *all* known XML APIs allow you to retrieve  
> elements and attributes with about the same degree of difficulty.

What Aristotle said.  Yes, I agree, but debate over elements vs
attributes in generic XML isn't relevant.

> So, my conclusion: I disagree with Powell.  Let people put extensions
> wherever they feel like putting them (they will anyhow), remembering  
> that human-readability is a virtue.  If models try to micro-manage  
> the element/attribute thing, those models are broken, don't use  
> them.

> If software arbitrarily discards markup because the markup
> doesn't match its idiosyncratic ideas about elements and attributes,  
> that software is non-conformant and non-interoperable.

I'm a bit surprised that you're saying this. I'm not sure why you
think it is non-conformant to discard extensions. RFC4287 doesn't
describe any conformance criteria for Atom intermediaries.

Over on atom-protocol, there were a number of implementors who are
ignoring/rewriting client-supplied atom:ids - about the most
fundamental way that you could possibly change an atom document - and
you didn't seem to disapprove of this?

People have also said in the past that they planned not to preserve
extensions, and I don't remember there being many objections. If it is
non-conformant and non-interoperable to discard extensions, then why
doesn't APP prohibit it?

>> Extension elements are defined to have both a model and a syntax, but
>> Atom's allowance for foreign attributes to appear anywhere is a case
>> of syntax that has no corresponding model. Atom doesn't really explain
>> what foreign attributes are intended for.

> Extension elements also, as noted above, have *no normative effect*.

I don't know what "no normative effect" means.

> This is only true for software which ignores the fact that RFC4287 is
> specified only in terms of the XML Infoset.  If you lose information  
> because it doesn't match up with some ex post facto model you've  
> dreamed up, you cannot expect to achieve interoperability.

But every implementor is dreaming up ex post facto models.

> Obviously, but the notion that this depends on whether you use an  
> attribute or an element seems really, really bizarre to me.  An  
> intermediary that drops markup it doesn't recognize won't last long  
> in the marketplace, whether those are elements or attributes.

It is easier to support extensibility in 3 places than on every
possible element in the document. Storing entry extensions in a
database table is possible; storing atom:updated attributes is
getting silly.

>> Interoperability should take priority over concerns that 'approach X
>> looks better than Y', and other unjustifiable minor concerns.

> Yes, and interoperability is based on the normative rules in RFC4287,
> right?

Wrong (in my opinion).

There are no normative rules for the behaviour of Atom intermediaries.

Given the absence of such rules, the next best way to ensure
interoperability is to understand the typical behaviour of Atom
implementations: the fact that it is easier for some implementations
to support "Extension Elements" in the 3 defined extension points than
foreign attributes on every element in an Atom document; that this
translates to more implementations supporting extension elements than
attributes; and that this implies that an extension would be more
interoperable if it chooses Section 6.4 markup over undefined foreign
markup.

-- 
Dave

Re: Feed Thread in Last Call

2006-05-20 Thread David Powell

Friday, May 19, 2006, 1:40:43 AM, Lisa Dusseault wrote:

> I've been trying to understand if there's a technical problem with
> the draft's chosen placement of the attributes and the best case
> I've seen is that "that location is technically disallowed by
> RFC4287" , an assertion which is disputed (alas, natural language
> meanings are often disputed).

Hmm, that wasn't my argument either... I fully agree that the
attributes are technically allowed by both the text and RelaxNG of
RFC4287. I don't think anyone is disputing that.

Sorry for repeating myself again, but I'll recap my argument in full
because it seems like it hasn't been understood.  A bit of background:

Modelling
=========

Software that deals with XML such as an XHTML document, doesn't have
much choice but to model the document using generic XML concepts and
tools - Infosets, DOM, SAX, strings containing XML tags, etc.

For Atom though, it is useful to model feeds and entries in terms of
some other data model: OO, RDBMS, WebDAV (I've been doing it as RDF,
but that is a dirty word around these parts). Some of the reasons for
this are:

  Most Atom implementations are concerned with the combined state of a
  feed over time, not the state of an individual feed document, so
  pure XML access is inadequate.

  More robust, and efficient implementations, particularly in the case
  of RDBMS.

  Implementations may be built upon existing systems, such as existing
  content management systems, where a mapping between Atom the XML
  syntax, and the entities in an existing system would need to be
  established.

  Even the best XML APIs are horrible to use when compared to domain
  specific APIs.

Extensions
==========

Atom standardised a minimal set of elements, with the expectation that
any other elements would be created as extensions. It is therefore
important that Atom infrastructure shouldn't get in the way of the
deployment of extensions. Atom uses mustIgnore semantics for
extensions, which allows implementations to copy-through extensions
even if they don't understand their purpose.

Section 6 of RFC4287 is flawed. It is an ugly mix of my (overly)
strict PaceExtensionConstruct proposal[1], and an (overly) liberal
philosophy that the existence of foreign markup anywhere won't break
implementations, so shouldn't be disallowed.

[1] 
http://www.intertwingly.net/wiki/pie/PaceExtensionConstruct?action=recall&date=1105566248

I complained about it here[2], admittedly a long time after IETF last
call, but as we were still making Atom 101 changes about then (eg:
let's allow multiple authors), maybe that wasn't so unreasonable. It
doesn't make any sense to define two very precise classes of extension
element, and then say that any extra markup can go anywhere, without
giving any justification or explanation of why we did this, or any
guidelines on when extension authors should pick each of these options.

[2] http://www.imc.org/atom-syntax/mail-archive/msg15915.html

Unfortunately Simple Extension elements are too unconstrained to fulfil
their original objective, so the difference between them and
Structured Extensions is academic. The intent was that Simple
Extensions were a class of extension whose values were context-free
strings that would be easy to implement, and easy to provide UIs for,
and extension proposers might be encouraged to choose this class of
extension where it was appropriate for improved interoperability. I
don't think that this worked out.

The difference between extension elements, and foreign attributes is
significant however.

atompub's charter states:

> Atom consists of:
> * A conceptual model of a resource
> * A concrete syntax for this model

Extension elements are defined to have both a model and a syntax, but
Atom's allowance for foreign attributes to appear anywhere is a case
of syntax that has no corresponding model. Atom doesn't really explain
what foreign attributes are intended for. It seems like they could be
an extension point, but given that many implementations will have an
application model that isn't based on the XML Infoset (as described
above), it seems very unwise to create an extension proposal which
depends on the precise syntax of an element being preserved.

The intent of Simple Extensions was to provide a class of extension
that was more interoperable; foreign attributes appear to provide a
class of extension (if that is what it is) that will be much less
interoperable.

Some guidance in how to design extensions is definitely missing from
the RFC, perhaps an Informational RFC explaining the issues would be
appropriate.

Relevance of intermediaries
===========================

Most of these issues only apply to Atom intermediaries: agents which
must accept Atom documents, and represent the documents internally
without losing information.  It doesn't matter if a desktop aggregator
drops extensions if it has no UI for displaying them anyway.  But I
think that the class of Atom intermediaries w

Re: Feed Thread in Last Call

2006-05-18 Thread David Powell

Tuesday, May 16, 2006, 4:50:03 AM, James M Snell wrote:

> A few of the individuals on the WG had a problem with the placement of
> the attributes due to various limitations of a few existing Atom 1.0
> implementations.

That doesn't accurately state my problem with FTE. My concern is more
general than compatibility with specific Atom implementations.
The fact that the MS feed platform exhibits the issue is just a
confirmation that the issue is more than a theoretical concern.

An implementor that wants to /consume/ Atom (such as a feed API, or
store), ought to be able to create an accurate implementation just by
reading RFC4287. That document describes what properties entries,
feeds, links, and extension elements have, and how they are
represented in an Atom document. Using this it is possible to create a
database schema or OO API that can represent feed data.

My problem with FTE is that rather than using Atom's extension-point:
the "Extension Element" (which an implementor of RFC4287 is likely to
be capable of preserving), it uses the fact that Atom doesn't prohibit
extra attributes to be included anywhere in the document. Is an API,
feed store, or Atom Protocol implementation expected to preserve every
attribute on every element, in addition to the core elements, and
extension points? This might be practical on a few green-field Atom
implementations, but it wouldn't be practical on systems that need to
be retrofitted to support Atom.

What I see as a problem is that reasonable implementations will not
preserve Atom documents bit-for-bit, so they will need to explicitly
support this draft if they don't want to corrupt data by dropping the
thr:count attributes. By the letter of RFC4287 there is no problem
with the draft, but practically there is something like a layering
concern if an extension requires existing conformant implementations
to be changed.
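
A minimal sketch of that concern, assuming a hypothetical consumer that
models links by the attributes RFC4287 defines and copies extension
elements through untouched. The Link and Entry classes, the ex:
namespace and load_entry are illustrative names only, and the thr:
namespace URI is assumed rather than taken from the draft under
discussion:

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

ATOM = "http://www.w3.org/2005/Atom"
# The attributes RFC4287 defines for atom:link.
LINK_ATTRS = ("href", "rel", "type", "hreflang", "title", "length")


@dataclass
class Link:
    attrs: dict = field(default_factory=dict)


@dataclass
class Entry:
    links: list = field(default_factory=list)
    # Foreign child elements, copied through without being understood.
    extensions: list = field(default_factory=list)


def load_entry(xml_text):
    """Map an entry onto the model; only modelled data survives."""
    root = ET.fromstring(xml_text)
    entry = Entry()
    for child in root:
        if child.tag == "{%s}link" % ATOM:
            # Namespaced attributes have keys like "{uri}count", so they
            # never match the defined attribute names and are lost here.
            kept = {k: v for k, v in child.attrib.items() if k in LINK_ATTRS}
            entry.links.append(Link(kept))
        elif not child.tag.startswith("{%s}" % ATOM):
            entry.extensions.append(child)
    return entry


# Hypothetical input: thr namespace URI and ex:tag are assumptions.
SAMPLE = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:thr="http://purl.org/syndication/thread/1.0"
       xmlns:ex="http://example.org/ext">
  <link rel="replies" href="http://www.example.org/mycommentsfeed.xml"
        thr:count="10"/>
  <ex:tag>x</ex:tag>
</entry>"""

entry = load_entry(SAMPLE)
print(entry.links[0].attrs)       # the thr:count attribute is gone
print(entry.extensions[0].text)   # the extension element survived
```

Nothing in this consumer is non-conformant, yet the count never reaches
whatever sits downstream of it.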

Atom implementations with generic support for links and extensions,
will find that this draft moves the goalposts of what a reasonable
implementation is expected to handle, and I firmly believe that it
should be possible for implementations to provide generic support for
extensions in order to assist their adoption.

Part of the problem might be caused by flaws in Atom's extensibility
model, but that is all the more reason for extensions to be
conservative in their use of it.

> None of the folks I know of that have actually
> implemented support for the extension has had any problems with them.

Does that include any stateful feed APIs, or APP servers that aren't
based on native XML back-ends?

I notice that you said "implemented support" - that is fine for
user-agents etc, but I don't believe that Atom infrastructure should
be required to "implement support" for each new bit of content that
publishers put into their feeds.

-- 
Dave

Re: Feed update mechanism

2006-05-16 Thread David Powell



Tuesday, May 16, 2006, 11:18:04 AM, Sylvain Hellegouarch wrote:

> Hi everyone,

> These days It seems that when UAs request a server to check if a feed has
> changed the server responds with either an HTTP 304 Not Modified status
> code or by returning the updated feed.

> It looks to me as a problem if only one or a couple of entries have been
> updated within the feed as it wastes so much bandwith specially when the
> said feed contains all entries being produced on the server.

Check out "A-IM: feed" [1]. It is already implemented by lots of
aggregators, and it is trivial for client implementors to support.

[1] http://bobwyman.pubsub.com/main/2004/09/using_rfc3229_w.html
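
As a rough sketch of the client side, assuming the server understands
the "feed" instance manipulation from [1]: the client sends A-IM: feed
together with the ETag it saw last time, and treats a 226 (IM Used)
response as a partial feed holding only the new entries. The function
names here are hypothetical:

```python
import urllib.request


def build_delta_request(feed_url, etag=None):
    """Build a conditional GET that also asks for a feed delta."""
    request = urllib.request.Request(feed_url)
    request.add_header("A-IM", "feed")
    if etag is not None:
        # The validator from the previous response tells the server
        # which version of the feed this client already holds.
        request.add_header("If-None-Match", etag)
    return request


def is_delta(status_code):
    """226 means only changed entries were sent; 200 is the full feed."""
    return status_code == 226


request = build_delta_request("http://example.org/feed.atom", '"abc"')
```

A server that doesn't support this just ignores the A-IM header and
answers with 200 or 304 as before, so the client loses nothing by
asking.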


-- 
Dave

Re: Feed thread -09

2006-05-05 Thread David Powell

Friday, May 5, 2006, 12:20:25 AM, A. Pagaltzis wrote:

> * M. David Peterson <[EMAIL PROTECTED]> [2006-05-04 23:30]:
>> Or is something like this simply inviting WAY TOO MANY little
>> things to find justification to plug up the collective inbox of
>> the community members?

> I don’t know. So far during extension development discussions,
> only the missing extensibility for links has stuck out as a real
> sore point in RFC 4287. Other than that, the spec has stood up
> very well with only a few minor errata reported here and there.

> At least, that’s my impression; I don’t know what others think,
> of course.

> Frankly, I would hope there won’t be much interest – cause if
> there is, what else would that mean than that we did a shoddy
> job? :-)

The RFC does a good job of saying what is and isn't a valid Atom
document. What is out of scope of the RFC though, is the behaviour of
processors that provide access to, store, or act as intermediaries for
Atom streams, wrt what elements and extensions are retained.

Not that all processors should behave the same, or that processors
that retain the most information are necessarily "the best" (ones that
retain less information are likely to offer more usable
interfaces), just that extension authors need to be mindful of typical
behaviours of processors, and vice versa. As the processing behaviour
is not defined (not even in APP), application of the robustness
principle is particularly relevant.

I started on a document describing some checklists for the
preservation of data required to meet several arbitrary conformance
levels - intended more as considerations for extension authors, than
for processor implementations.

-- 
Dave

Re: Feed thread update

2006-05-05 Thread David Powell

Friday, May 5, 2006, 4:05:15 AM, Tim Bray wrote:

> Give me a break, we're in the first *days* of something that will be  
> used for at least decades.  Todays' APIs will have a vanishingly- 
> small lifespan in comparison

The issue isn't that an implementation is at fault. The issue is that
an implementation can be not at fault, behave reasonably, and yet
still be incompatible with the extension.

As I said 12 months ago:

| The problem with section 6 is that we define 3 different classes of
| non-Atom markup, but don't say why we've done that, or help authors to
| know which type they should choose when they write a document.
| [...]
| It doesn't really make sense we have thought out how to extend
| atom:feed, atom:entry, atom:author etc, but atom:link is a sub-RSS
| free-for-all.

Section 6 received inadequate review. Perhaps someone can tell me
now, why there are three different classes of non-Atom markup?

Section 6 came about due to a clash between people who see Atom as an
XML document format like XHTML ([1] choice 1), and people who see Atom
as a serialisation of feed and entry concepts, that span multiple XML
documents ([1] choice 2). Section 6 satisfies neither party.

[1] http://www.mnot.net/blog/2004/05/12/on_infosets

The distinction between Simple and Structured extensions was a well
meaning, but misguided, effort to provide a lower bar for extension
support for implementations that would find it unfeasible to preserve
the full Infoset and xml:lang/xml:base context.

> crippling our expressiveness

"...and conservative in what you send".  That is the point here.

What percent of stateful feed stores do you think support all of
Atom's core elements?

How many handle xml:lang and xml:base correctly for core elements? -
less?

How many handle simple extensions? - less?

How many handle structured extensions, including ones that require
that the xml:base and xml:lang context is preserved? - less?

How many handle arbitrary attributes on any Atom element, and
arbitrary XML in atom:link? - less?

How many preserve the order of elements for any future extensions that
choose to make order significant? - less?

How many cope with extensions that follow our example of
"inheritance", and maintain feed extensions as they were at the time
of the document that the entry was last delivered with for the benefit
of "inheritable" feed extensions, whilst also maintaining an
association with the latest feed metadata for the benefit of "true"
feed extensions? - less?

How many preserve the namespace mappings for any future extensions
that choose to use QNames in content? - less?

How many preserve PIs and comments for any future extensions that make
use of them? - less?

Atom doesn't define any conformance levels for the preservation of
content and extensions, so extension designers are forced to consider
the consequences themselves.

If you're happy working with XML APIs over a set of polled feed
documents (together with their preserved URIs, Content-Locations,
entity headers and retrieval dates), then none of this matters.

If you see a benefit of providing a higher level abstraction over
feeds and entries, an OO API, an RDBMS schema, an APP implementation,
or any value-added feed processing service, then it is worth
considering the practicalities of such support and employing a bit of
conservatism.

-- 
Dave

Re: Atom Rank Extensions

2006-05-03 Thread David Powell

Tuesday, May 2, 2006, 10:06:51 PM, James Holderness wrote:

> Just looking at that example, it seems to me that an aggregator that 
> implements Microsoft's simple list extensions would get a full-featured
> representation of that feed without having to know anything at all about
> feed rank and feed history. So why bother with them?

I made a suggestion for avoiding implicit ordering in SLE on the
FEED-TECH list[1]. Basically to allow the cf:treatAs to use the
@element and @data-type attributes that are used in cf:sort and
cf:group elements, so that entries could include a datestamp, or
index, which could be used for the ordering of the list rather than
the natural ordering.

[1] 
http://discuss.microsoft.com/SCRIPTS/WA-MSD.EXE?A2=ind0603d&L=feed-tech&T=0&F=&S=&P=312

Actually - can cf:list be used together with cf:[EMAIL PROTECTED]'true'?,
I think I'll ask...

-- 
Dave

Re: Tools that make use of previous/next/first/last links?

2006-05-03 Thread David Powell



Wednesday, May 3, 2006, 6:48:55 AM, Mark Nottingham wrote:

> If you use URIs like
>http://example.com/feed?start=5&num=10
> changing the directionality of "next" and "previous" will not make  
> what you're doing compatible with feed history.

> Such URIs have a much more fundamental problem -- they don't refer to
> a stable set of entries, and therefore only act as a snapshot of the  
> *current* feed, chopped up into chunks. If the feed changes between  
> accesses, the client will be in an inconsistent state. The client  
> also has to walk through all of the pages every time it fetches the  
> feed; it can't cache them -- which is a primary requirement for feed  
> history.

I think it would be worth recommending the use of stable URIs in the
draft.


-- 
Dave

Re: Atom Rank Extensions

2006-05-02 Thread David Powell

Tuesday, May 2, 2006, 9:12:56 PM, James Snell wrote:

> Does your implementation properly handle the following (contrived) example:

> http://example.org/foo/bar";>
> ...
> 
> http://EXAMPLE.org:80/foo/bar/../comments.xml"; ... />
> ...
> 

I don't think you should do URI normalisation. The ref is being used
as an identifier, you don't do protocol level normalisation on
namespace URIs or Atom ids, so why do it here? The draft should specify
character-by-character comparison of the resolved URIs.

[Er, I mean IRI. Everyone's using IRI datatypes, right?...]
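
A sketch of what that comparison could look like, assuming the only
processing allowed is resolving a relative reference against the base
in scope. same_ref is a hypothetical helper, and urljoin is a URI tool
rather than an IRI one:

```python
from urllib.parse import urljoin


def same_ref(base_uri, ref, href):
    """Resolve both references, then compare character by character."""
    # No case folding, no default-port removal, no other protocol
    # level normalisation: the resolved strings either match or not.
    return urljoin(base_uri, ref) == urljoin(base_uri, href)


BASE = "http://example.org/foo/bar"
print(same_ref(BASE, "comments.xml", "http://example.org/foo/comments.xml"))
print(same_ref(BASE, "http://EXAMPLE.org:80/foo/comments.xml",
               "http://example.org/foo/comments.xml"))
```

Under this rule the second pair is simply two different identifiers,
the same way two namespace URIs that differ only in case are different
namespaces.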

-- 
Dave

Re: Entry types

2006-05-02 Thread David Powell

Monday, May 1, 2006, 8:40:57 PM, James Snell wrote:

> I'm wondering if it would make sense to have a single common "type"
> scheme that could be used consistently across implementations.

"Type" seems a bit vague, this seems to be mainly about describing how
an entry should be processed.  A few possible ways to do that:

a) Using categories and a known categorisation scheme
b) Using an ex:processAs extension
c) Using domain specific extensions, eg 
d) "duck-typing", eg assuming that contact:firstName implies the type.

I think that using category might be an overloading of the semantics
of category. I'm not sure; it probably depends on the circumstances.
Category is really a summary of the set of real-world concepts the
entry is about, it ought to be under control of the publisher. Typical
Atom consumers will probably provide a UI to filter entries by
category. Do consumers want the set of categories for an entry
polluted with physical properties of the entry? This might be ok
sometimes, but not in general.

Can entries have multiple types? Like an entry that includes both the
event details and contact details for the organiser? Will typical Atom
infrastructures be able to dispatch entries to multiple processors, or
will it be like MIME-types, where this is difficult (as in the
dispatching of generic XML types based on namespace, or of RDF)?

Microsoft has a screen saver sample [1] that uses Windows Feed
Platform to display picture feeds. It does this without any typing,
just by examining each entry in the feeds associated with the screen
saver, and selecting the appropriate ones.

[1] http://blogs.msdn.com/rssteam/archive/2006/02/28/540319.aspx

I remember seeing a demo of a calendar feed that synced with Outlook,
I suspect that that also relied on having the plugin decide which
entries it was interested in, rather than the engine dispatch entries
to specific plugins. Engine driven dispatching might be slightly more
performant, but plugin driven dispatching is more flexible and doesn't
need a well-known type extension to key off.

Is it enough to go for the "duck-typing" approach, and not require
explicit typing? But, what if all of the extensions for contacts are
optional? What if some are shared with the extensions for
appointments? If the rules for deciding on the type of an entry aren't
specified, it might be more prone to interop problems, where one
implementation detects an entry as a contact, and another doesn't. I
suppose a quick vcard:* check is pretty easy with XPath.
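
Something like the following would do as a duck-typing check. It walks
the entry's children rather than using XPath, and the vcard namespace
URI is a placeholder, since no contact extension has been defined here:

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for a contact extension.
VCARD_NS = "http://example.org/vcard"


def looks_like_contact(entry):
    """Treat an entry as a contact if any child is in the vcard namespace."""
    prefix = "{%s}" % VCARD_NS
    return any(child.tag.startswith(prefix) for child in entry)


contact = ET.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom" '
    'xmlns:vcard="http://example.org/vcard">'
    "<title>Dave</title><vcard:fn>Dave</vcard:fn></entry>"
)
plain = ET.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom"><title>Hello</title></entry>'
)
print(looks_like_contact(contact), looks_like_contact(plain))
```

The interop worry above still applies: two plugins with slightly
different rules (any vcard element, or only a required one) would
disagree about the same entry.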

-- 
Dave

Re: Feed Thread Draft Updated

2006-04-27 Thread David Powell

Saturday, April 22, 2006, 1:53:26 AM, James M Snell wrote:

> So this is what I've got:

>  count = element thr:count {
>attribute xml:base { atomUri }?,
>attribute ref { atomUri }?,
>attribute updated { date-time }?,
>( nonNegativeInteger )
>  }

I think that is ok.

Aristotle's suggestion is ok, in that it saves a bit of typing in the
common case where there is only one link - but in the case where there
is more than one link, a combined count seems pretty useless: if there
are multiple comment links, then either the consumer can cope with
them and process both the links and counts, or the consumer can't cope
with them and can only process the combined count - but the count
alone without any links to reach the comments isn't very useful, so
why bother with it - consumers that can cope with multiple comments
links will be able to manage addition of the counts if necessary.
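
To illustrate how little work that addition is, here is a sketch that
pairs each thr:count with its replies link by resolved ref and then
sums. The thr namespace URI and the exact shape of thr:count are
assumptions based on the proposal quoted above, and reply_counts is a
hypothetical name:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

ATOM = "{http://www.w3.org/2005/Atom}"
THR = "{http://purl.org/syndication/thread/1.0}"  # assumed namespace


def reply_counts(entry, base_uri):
    """Map each resolved replies link href to its count, or None."""
    counts = {}
    for count in entry.findall(THR + "count"):
        ref = count.get("ref")
        if ref is not None:
            counts[urljoin(base_uri, ref)] = int(count.text.strip())
    result = {}
    for link in entry.findall(ATOM + "link"):
        if link.get("rel") == "replies":
            href = urljoin(base_uri, link.get("href"))
            result[href] = counts.get(href)
    return result


SAMPLE = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:thr="http://purl.org/syndication/thread/1.0">
  <link rel="replies" href="comments.atom"/>
  <link rel="replies" href="trackbacks.atom"/>
  <thr:count ref="comments.atom">3</thr:count>
  <thr:count ref="trackbacks.atom">7</thr:count>
</entry>"""

per_link = reply_counts(ET.fromstring(SAMPLE), "http://example.org/blog/")
total = sum(c for c in per_link.values() if c is not None)
print(per_link, total)
```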

A variation would be to go with your proposal, but to say that ref is
optional if there is only one comments link, but would that be
fragile?  People would have to remember to fix things up if they added
extra comment links - but is that going to be a common occurrence?
And implementations that add comment links will presumably be aware of
the semantics of comment links, and will therefore know about the need
to add missing refs?

Anyway, I think that any of these options is better than the current
situation.

> The value of ref is the href of a replies link appearing in the
> entry/feed/source.  Where that gets nasty, of course, is when the href
> is relative and xml:base is being used to set the Base URI.

xml:base always gets nasty, but I don't see it as being a big problem
in this case.

> The updated spec would have an appendix that would explain that previous
> versions of the extension defined the attributes and that some
> implementations have been deployed that use the attributes.  The spec
> will indicate that those attributes are deprecated in favor of the
> thr:count element but that feed consumers have the discretion of whether
> or not to support them.

Maintaining compatibility with an expired Internet Draft?  I don't
really support that.  Is it really needed:

Stage 1: Implementation X has no support for FTE
Stage 2: Implementation X supports obsolete draft of FTE
Stage 3: Implementation X supports final version of FTE

Nobody is going to consider the obsolete form to be invalid. All that
would happen if consumers don't support it, is that they won't see the
thread counts. This isn't a big deal, as before the implementations
had implemented support for FTE consumers wouldn't see the thread
counts anyway. Eventually those implementations that support the
obsolete form will be updated, and consumers will be able to process
the thread counts.

(I think the moral of this story is, in future, not to specify an
IANA-namespaced link rel until a specification has been finalised.
During development there are bound to be changes, and IANA links don't
allow versioning, or DO-NOT-DEPLOY banners in the URI (like the one we
put in the Atom draft namespace). Although, to be honest, I'd be amazed
if there were any real reluctance to make breaking changes to an
Internet Draft.)

> Does this work?

> - James

-- 
Dave

Re: Feed Thread Draft Updated

2006-04-13 Thread David Powell

Thursday, April 13, 2006, 8:24:48 AM, Thomas Broyer wrote:

>> c. Create a new replies extension element
>>> type="..."
>> hreflang="..."
>> title="..."
>> count="..."
>> when="..." />

> -0.5, it *is* a link

thr:in-reply-to is a "link" too. An extension element was used for
thr:in-reply-to, because atom:link wasn't up to the job. atom:link
isn't up to the job for "replies" either.

I think that slavishly adhering to convention, at the cost of
interoperability is unwise.

I'm bothered about this because I think requiring people to process
undefined foreign markup is harmful to Atom.

With something like an XHTML document, the only sensible way to
process it is by using XML tools. XPath, XSLT, whatever - they work
well.

With Atom, an "Atom Feed Document" alone isn't very useful. Almost all
applications will work in terms of Atom Feeds - streams of entries,
and feed metadata. To process this, you need to think of Atom in terms
of entities: Feeds and Entries, not a set of XML Feed documents.
Implementations are based on OO classes and databases, and model
entries in terms of titles, content, links, extensions etc. The Atom
RFC is clear enough, that you can model an Atom Feed in terms of OO
classes, rather than XML documents without losing any data.

This all falls apart when people sprinkle the XML with undefined
markup that can't be represented outside the context of an XML
document.

-- 
Dave

Re: Feed Thread Draft Updated

2006-04-13 Thread David Powell

Thursday, April 13, 2006, 6:11:32 AM, Eric Scheid wrote:

> atom:link beats thr:replies on the basis that I don't need to understand
> what "replies" are to discover that there is a link from this thing to that
> thing.

> atom processors know what atom:link is, but it wouldn't know what to do with
> this:

>  bar="opaque-string
>qaz="opaque-string"
>zak="opaque-string" />

But what would processors do with an atom:link? Atom Protocol uses
"edit", there have been calls for a "stylesheet". Links aren't
necessarily things that you'd display to users (check HTML out for
examples of that: favicon, P3P, Atom/RSS, GRDDL)

-- 
Dave

Re: Feed Thread Draft Updated

2006-04-12 Thread David Powell

Tuesday, April 11, 2006, 9:20:32 PM, James M Snell wrote:

> I also added a new warning for implementors: "Implementors should note
> that while the Atom Syndication Format does not forbid the inclusion of
> namespaced extension attributes on the Atom link element, neither does
> is explicitly allow for such extensions.  The result of this is that the
> thr:count and thr:when attributes fall outside of the guidelines for
> Atom extensions as defined in Section 6 of [RFC4287] and may not be
> supported by all Atom implementations."

[Apologies to everyone for flogging this dead horse some more. I'm
only doing this, because I care.]

I'll quote your argument from MDP's blog and comment here, rather than
on his dead comment thread, if that's ok:

> Ironically, the primary reason for using link for "replies" is
> because it makes very little sense to duplicate the function of the
> link element. I considered a thr:replies element. Ultimately, it
> would have looked and acted darn near identical to the atom:link
> element (with attributes like type, href, hreflang, etc). One needs
> to consider whether or not it makes sense to introduce a new
> namespaced extension element that duplicates the basic function of
> atom:link just to support the addition of two *optional* and *purely
> advisory* attributes.

You recognize the fact that adding non-standard attributes on
atom:link isn't going to be fully interoperable (that the thr:count
and thr:when attributes might not make it from producer to consumer
intact), but still prefer to make "replies" a link element on the
basis of style and convention.

This seems to be the wrong priority to me.
Replacing <atom:link rel="replies"> with an otherwise identical
<thr:replies> element would fix the interop problem, and wouldn't
introduce any other problems that I can think of. The only reason to
keep the current link element is because it somehow looks better, or
saves copy/pasting the relevant bits of the atom:link spec into the
draft.

In terms of the considerations to the interoperability of running
code, thr:replies seems to beat atom:link in every way. It even
manages to be more concise (you don't need the @rel), and you wouldn't
need to put thr:count and thr:when into a namespace (namespaced
attributes confuse people).

Eg:
 <thr:replies href="http://www.example.org/mycommentsfeed.xml"
   count="10"
   when="2006-02-20T00:00:00Z" />

instead of:

 <atom:link rel="replies" href="http://www.example.org/mycommentsfeed.xml"
   thr:count="10"
   thr:when="2006-02-20T00:00:00Z" />

I don't really buy the justification that the attributes don't matter,
so it is ok if they get lost btw. If I was using an API that didn't
give access to the count attributes, I'd probably be a bit miffed, I'm
unlikely to say "oh it doesn't matter cause they were only advisory,
I'll just load the comments feed, parse the XML, etc, etc, instead".
Yes, thr:count is derived, so it isn't essential, but this doesn't
mean that it isn't useful. It is obviously useful, else it wouldn't be
in the draft.

-- 
Dave

Re: Don't make Feed Extensions inherit!

2006-04-12 Thread David Powell

Wednesday, April 12, 2006, 1:29:00 PM, A. Pagaltzis wrote:

> * David Powell <[EMAIL PROTECTED]> [2006-04-12 13:40]:
>> Reasonable implementations will probably just store the latest
>> versions of feed and entry metadata, something like this:

> Of course, what they *should* do is use `atom:source` so that
> they can preserve all of the feed metadata, including RFC4287
> Sec.6 extensions:

Using atom:source is fair enough for representing loose entries, but I
think it is overkill for representing entries in the context of a
known feed, where the vast majority of feed metadata is going to be
the same for each entry.

I believe feed metadata to be about feeds, and entry metadata to be
about entries. Two separate entities, with their own life-cycle. There
is one feed, if I change the title of it, then the feed title should
change.  I don't expect it to change back to the old title whenever I
look at older entries.

Feed documents are just an implementation detail of the most popular
way of passing this information around.

-- 
Dave

Don't make Feed Extensions inherit! (was Re: Feed License Draft)

2006-04-12 Thread David Powell

Tuesday, April 11, 2006, 11:17:07 PM, James M Snell wrote:

> This was specifically added in response to feedback provided on this
> list.  Although I don't have the link to the original thread, the
> rationale has to do with aggregated feeds.  Specifically, I may publish
> an entry that does not have a license that you turn around and republish
> in an aggregate feed that does have a license. If entries inherited the
> licenses of their parents, that would mean that you would end up
> distributing my content under a different license than what I had
> originally intended, which you, of course, have no right to do.
> Therefore, entries are licensed independently of the feeds in which they
> happen to appear.

Wise.  It is a really bad idea to invent extensions that "inherit" in
the way that atom:author does.  Just think about how people are going
to implement it:

Look at Windows Feed Platform, as an example. It preserves entries
with full fidelity (or should do), and preserves feed metadata with
full fidelity (or should do).

But most reasonable implementations won't preserve the exact feed
documents that each entry came from - they'll probably just preserve
the latest version of the feed metadata.

So if you have:

 Dave
 x

   ...2

   ...1

and later poll again, and get:

 Someone Else
 y

   ...4

   ...3

Then a correct implementation does need to maintain the association
between feed/author and the entries that appeared in that document,
because that is required by the RFC, but that doesn't apply to
extensions.  Reasonable implementations will probably just store the
latest versions of feed and entry metadata, something like this:

 Someone Else
 y

   ...4
   Someone Else

   ...3
   Someone Else

   ...2
   Dave

   ...1
   Dave

If users of this data, attempt to perform "inheritance" on the ex:tag
extension, they will corrupt the entries by assuming that the latest
version of the feed metadata applies retrospectively to all previous
entries.
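
The same scenario as a small sketch, assuming a hypothetical store that
resolves atom:author at poll time (as the RFC requires) but keeps only
the latest feed-level metadata for everything else. FeedStore and its
methods are illustrative names, not any real API:

```python
class FeedStore:
    """Keeps the latest feed metadata plus every entry seen so far."""

    def __init__(self):
        self.feed_meta = {}
        self.entries = {}

    def poll(self, feed_meta, entries):
        # Feed metadata is simply overwritten on each poll.
        self.feed_meta = dict(feed_meta)
        for entry_id, entry in entries.items():
            stored = dict(entry)
            # atom:author inheritance is defined by RFC4287, so it is
            # resolved now, while the source document is still known.
            stored.setdefault("author", feed_meta.get("author"))
            self.entries[entry_id] = stored

    def inherited_tag(self, entry_id):
        # Naive "inheritance" of an extension: fall back to whatever
        # the feed says today, not what it said when the entry arrived.
        entry = self.entries[entry_id]
        return entry.get("ex:tag", self.feed_meta.get("ex:tag"))


store = FeedStore()
store.poll({"author": "Dave", "ex:tag": "x"}, {"1": {}, "2": {}})
store.poll({"author": "Someone Else", "ex:tag": "y"}, {"3": {}, "4": {}})

print(store.entries["1"]["author"])  # still the author it arrived with
print(store.inherited_tag("1"))      # today's feed value, not the original
```

Entry 1 keeps its original author, but the inherited ex:tag comes back
as the new feed's value, which is exactly the corruption described
above.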

-- 
Dave

Re: Does xml:base apply to type="html" content?

2006-03-31 Thread David Powell



Friday, March 31, 2006, 11:02:18 AM, Sean Lyndersay wrote:

> I haven't looked in detail at how IE does on the xml:base
> comformance tests, since the current beta has no support for
> xml:base. In light of that fact, I'm glad we failed outright instead
> of halfway; halfway would have been weird :).

> We're actually implementing xml:base support right now (and in the
> process, fixing the relative URL issue that Sam Ruby pointed out in
> our normalization format), so we'll be broken on those conformance
> tests for while. The fix won't make it out in the next public
> release, but it should make the one after that.

> I'll let you know how we do on those tests when the code is done.

Great. It would be good if you could preserve the "effective" base-URI
of feeds and entries, so that applications using Atom extensions that
contain relative URIRefs can resolve them into URIs. I suppose that it
could be done by pinning an absolute xml:base onto the channel and
item elements.
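
Roughly what I have in mind, sketched against Atom elements rather than
the normalized channel/item format, and assuming the URI the document
was fetched from is known. pin_entry_bases is a hypothetical name:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
ATOM = "{http://www.w3.org/2005/Atom}"


def pin_entry_bases(feed, document_uri):
    """Rewrite xml:base on the feed and each entry as an absolute URI."""
    # A missing xml:base resolves to the enclosing base unchanged.
    feed_base = urljoin(document_uri, feed.get(XML_BASE, ""))
    feed.set(XML_BASE, feed_base)
    for entry in feed.findall(ATOM + "entry"):
        entry.set(XML_BASE, urljoin(feed_base, entry.get(XML_BASE, "")))


SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xml:base="http://example.org/blog/">
  <entry xml:base="2006/03/"><title>a</title></entry>
  <entry><title>b</title></entry>
</feed>"""

feed = ET.fromstring(SAMPLE)
pin_entry_bases(feed, "http://example.org/feeds/atom.xml")
bases = [e.get(XML_BASE) for e in feed.findall(ATOM + "entry")]
print(bases)
```

Once each entry carries an absolute base, an extension that contains a
relative reference can still be resolved after the entry has been
separated from its original feed document.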

-- 
Dave

Re: Does xml:base apply to type="html" content?

2006-03-31 Thread David Powell

Friday, March 31, 2006, 4:34:48 AM, you wrote:

> The escaped HTML content contained within the content element that
> David was originally concerned with is more than likely a copy of
> all or part of the elements and content contained inside the body
> tag of the external document referenced by an associated link
> element, and therefore no guarentee that the xml:base of the atom
> feed is going to be anywhere even close to accurate.

That might be exactly the case where the xml:base is useful: the
content came from different places, had relative URI-refs, so the
xml:base was set on each entry to the source URIs of each document so
that the relative links work[*] in both cases.

[*] in theory.

-- 
Dave

Re: Does xml:base apply to type="html" content?

2006-03-31 Thread David Powell

Friday, March 31, 2006, 3:31:12 AM, A. Pagaltzis wrote:

> In that scenario, either the tag soup from the other feeds must
> be fixed up so the view can be rendered as XHTML (which supports
> xml:base in content)

XHTML 1.0 doesn't support xml:base does it?  As I understand it, only
specs that say that they support xml:base allow you to put xml:base on
their elements, but any spec that allows URIrefs has the concept of a
base-URI, so for envelope specs such as Atom, you'd expect xml:base in
the envelope to set the base-URI for the content.

-- 
Dave

Re: Atom Thread Feed syntax

2006-03-24 Thread David Powell

Friday, March 24, 2006, 3:28:02 AM, James Snell wrote:

> I believe the concern is over the thr:count and thr:when attributes for
> the replies link relation, both of which are optional, and both of which
> provide what I consider to be extra information.  In other words, it's
> ok if an implementation drops them.

Yeah, I agree that an implementation losing those attributes isn't
exactly life-threatening. But is it "*ok* if an implementation drops
them"? Will publishers think that it is ok if some infrastructure drops
those attributes? Will subscribers? Presumably publishers have spent
some effort adding those attributes to the feed, in the hope that
subscribers will get some benefit from them.

> The important bit is the in-reply-to element and the replies link
> rel, both of which fall within the bounds of the Atom extension
> model.

Yeah, I agree.

> I'm most certainly not abandoning the extension constructs.  One of the
> motivations for walking these extension specs through the I-D and
> eventually standards-track process is so that they get their own RFC
> number.  Implementations that choose to support the extension can point
> to RFC4287 *and* RFCwhatever and say, "I support both".  If an
> implementation only says "I support RFC4287" and doesn't say anything
> about RFCwhatever, it's pretty clear what the result would be.
> 
> The most an RFC4287 implementation should be expected to do is adhere to
> the defined extension model.  If that implementation also chooses to
> support other RFC's that go beyond that extension model, so be it.

I find much of section 6 of RFC4287 a bit pointless.  It describes
these classes of extensions, and I just think - so what?

I think it would be good to have a second draft which described a
number of conformance levels for feed infrastructure - things like the
handling, and preservation or not, of extensions. Then publishers
would know what to expect if a popular piece of infrastructure
(Windows RSS, Rome, Feedburner, APP implementations, etc) implements a
given level of compliance. It isn't always preferable for a feed to
reappear exactly as it was after travelling through an API or APP
store; much of the added value in implementations will come from the
transformations that they do. Even producing an archive feed isn't
possible without a reasonable number of transformations (eg: storing
inherited constructs: author, base, lang, rights with the entry).

> That said, the critical parts of the Feed Thread draft (the in-reply-to
> element and the replies link rel) follow the guidelines of the Atom
> extension model.  That is, any RFC4287 implementation *should* be able
> to do something with those elements (even if it's just preserving them).
>  The optional parts of the extension (thr:count and thr:when) fall
> outside of the Atom extension model.  That's ok.  Implementations can
> choose to ignore those things, even completely drop them.

My hope is that implementors will be able to think of Atom in terms of
Entries, Feeds, and everything else; rather than in terms of XML, a
fragile document markup language.

> As for the other extension drafts I put out, keep in mind that most
> should be considered strictly experimental at this time.  That said,
> there is really only one that really falls outside the extension model..
> the Link Extensions draft [1]... which, by definition cannot adhere to
> the extension model given the fact that Atom link elements are actually
> not extensible.

I haven't looked at them all thoroughly. Do you want to extend link
elements rather than use extension elements in these cases because you
expect that link constructs are more likely to provide a UI in
implementations?

I suppose that there are workarounds, eg:

  http://www.example.org/mycommentsfeed.xml";>
10
2006-02-20T00:00:00Z

  http://www.example.org/mycommentsfeed.xml"; />

-- 
Dave

Re: Atom Thread Feed syntax

2006-03-23 Thread David Powell

Thursday, March 23, 2006, 9:39:09 PM, James M Snell wrote:

> Just wanted to follow through on this for everyone.  Given that there
> are vendors getting ready to ship code based on the current rev of the
> spec, I'm *not* going to rename the "id" attribute to "ref".  Yes, I
> know that "id" is confusing to some folks, but we're just talking the
> name of a single attribute and not a critical functional bug.  From this
> point forward, only critical spec bugs will be fixed and I will be
> submitting the spec for consideration as a standards track RFC in the
> not too distant future.

I'm more bothered about the use of undefined markup on the link
element. I know, I know, I keep going on and on about this, but I keep
seeing more drafts that do the same thing and it isn't just a
theoretical problem: Windows Feed Platform does not preserve arbitrary
markup other than proper extension elements. Other feed stores and
servers are likely to do the same (justifiably IMO).

The abandonment of extension constructs in favour of undefined markup
by this draft, and other draft-*-atompub-* drafts would be an
interoperability concern if these drafts were deployed. If you want to
extend Atom, use Extension Elements.

-- 
Dave

Does xml:base apply to type="html" content?

2006-03-23 Thread David Powell



xml:base applies to type="xhtml" content, but I'm not sure whether it
is supposed to apply to escaped type="html" content? I reckon that it
does.

Has anybody come across this? Any opinions?

-- 
Dave

Re: atom:name ... text or html?

2006-03-23 Thread David Powell

Thursday, March 23, 2006, 4:57:11 PM, you wrote:

> On 24/3/06 3:21 AM, "Anne van Kesteren" <[EMAIL PROTECTED]> wrote:

>>> 
>>> 
>> Even if it was "HTML" you couldn't really use the entity, could you? I think
>> you have to use a character reference or the actual character instead, yes.
>> 

> It's true that XML has only a half dozen or so entities defined, meaning
> most interesting entities from html can't exist in XML ... unless maybe they
> are wrapped like in CDATA block like above?

atom:name is not intended to contain HTML; the spec for it doesn't
mention HTML, and it is no more correct to put HTML in it than it is
to put base64'd PDF in there.

> I'm getting the data by scraping an html page, so I'm expecting it to be
> acceptable html code, including html entities.

Your HTML parser should decode the entities for you and return a
string. Your Atom generator should encode or escape the string using
numeric character references.
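
A minimal sketch of that round trip in Python, using only the standard
library. The helper name html_text_to_xml and the sample string are
made up for illustration; it assumes the scraped value is plain text
with HTML entities in it, not markup.

```python
import html
from xml.sax.saxutils import escape


def html_text_to_xml(text_with_entities):
    """Decode HTML entities, then re-escape the string for use in XML."""
    decoded = html.unescape(text_with_entities)   # "&eacute;" -> "é"
    escaped = escape(decoded)                     # "&" -> "&amp;", "<" -> "&lt;"
    # Non-ASCII characters become numeric character references.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


name = html_text_to_xml("Ren&eacute; &amp; Co")
print(name)  # Ren&#233; &amp; Co
```

The result can go straight into atom:name without CDATA and without a
DTD, since XML predefines &amp; and numeric references need no
declaration.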

If you really need to use HTML entities directly, then you could put:

]>

at the top of your feed and get rid of that CDATA. XML processors are
REQUIRED [1] to process internal DTD subsets.

[Hmm, internal DTD subsets completely fail in IE7's feed reader,
throwing up a "friendly error message"]

[1] 

-- 
Dave

Re: Latest IE7 release 'AtomicRSS' output comparison results

2006-03-22 Thread David Powell

Wednesday, March 22, 2006, 5:13:05 AM, M. David Peterson wrote:

> Hey Folks,
>   
> With yesterdays build release of IE7, it seemed appropriate to run
> a quick inventory check to see where things stand in regards to the
> derived MS/RSS conversion of a fairly element/attribute usage
> heavy Atom feed.  Here's the overall breakdown.
> [...]
> Beyond this, it seems that everything else *SHOULD*  be able to map
> back fairly well.

There haven't been many changes to the transformation process in this
build, so all of the 15 issues with the Atom transformation in the old
build are still issues with this one.

http://www.imc.org/atom-syntax/mail-archive/msg17898.html

[Quick summary of actual data-loss:

loss of person extensions, loss of timezones/corruption of times, loss
of [EMAIL PROTECTED], titles are flattened to text without preserving
HTML version, loss of category label, xml:base/xml:lang loss,
inheritance of atom:author and atom:rights not handled
]

The last issue perhaps needs some more explanation:

In Atom, the following two entries are equivalent:

a)

David Powell
[...]

[...]

b)

[...]

David Powell
[...]

The same inheritance also applies to some other elements such as
atom:rights, and xml:base/lang suffer similar issues.

Superficially it seems that there is a problem with IE7's rendering, in
that <http://www.tbray.org/ongoing/ongoing.atom> doesn't display Tim
Bray's name next to the entries.

But, actually the problem is deeper than that. Because you only
preserve the latest instance of feed metadata, if it was up to the
client of the API to examine the feed author, and manually inherit it
every time it wanted to display the author of the entry, then the
entry would inherit the wrong authors if the feed author had been
changed since the entry was polled.

Eg: feed-producing code may put atom:author only on the feed, unless
there are multiple entries in the feed with different authors, in
which case it would add it to the entries too.

Basically you can't require the client of the API (eg IE7) to perform
the inheritance, because they need to inherit the author from the feed
document as it was when the entry appeared in it, not as it is now.

The solution, I expect, is to copy any elements that should inherit
down into the entry during the normalisation process. That way the
display of Tim's feed gets fixed, clients don't need to worry about
inheritance, and author and rights attributions of old entries don't
get mangled by future modifications to the feed document.

Ideally you should perform inheritance from atom:source too, as
described in the RFC.
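
The copy-down step could look roughly like the following Python sketch.
It is not how the Windows feed platform works internally (that is
unknown to me); the function name copy_down_authors and the sample
document are assumptions, and only atom:author is handled here.

```python
import copy
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def copy_down_authors(feed):
    """Give every entry explicit authors, following RFC 4287 section 4.2.1."""
    feed_authors = feed.findall(ATOM + "author")
    for entry in feed.findall(ATOM + "entry"):
        if entry.findall(ATOM + "author"):
            continue  # the entry has its own authors, nothing to inherit
        source = entry.find(ATOM + "source")
        inherited = source.findall(ATOM + "author") if source is not None else []
        if not inherited:
            inherited = feed_authors  # fall back to the containing feed
        for author in inherited:
            entry.append(copy.deepcopy(author))
    return feed


doc = """<feed xmlns="http://www.w3.org/2005/Atom">
  <author><name>David Powell</name></author>
  <entry><title>No author of its own</title></entry>
  <entry><title>Own author</title><author><name>Someone Else</name></author></entry>
</feed>"""

feed = copy_down_authors(ET.fromstring(doc))
names = [e.find(ATOM + "author/" + ATOM + "name").text
         for e in feed.findall(ATOM + "entry")]
print(names)  # ['David Powell', 'Someone Else']
```

Because the authors are stored with each entry at poll time, a later
change to the feed-level author no longer affects entries that were
stored earlier.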

>  The areas that are currently untested, and potentially a point of
> concern (that I can think of off the top of my head, anyway)
>   
>  * undefinedContent of element atom:category

I think it is perfectly reasonable to discard "undefined content"
(such as namespaced attributes on Atom elements). If you want an
extension, use an extension element. If you want to sprinkle
attributes everywhere on the assumption that implementations are going
to preserve whatever document you pass to them verbatim - well, don't
be too disappointed.

>  * any extended usage of xml:base and xml:lang

Proper preservation of these two is essential. In particular the
baseURI for each feed needs to be preserved after resolving it
relative to the document URI (and Content-Location if you're after
extra credit); the baseURI for each entry needs to be preserved after
resolving it relative to the baseURI for the feed.

The baseURI for entries needs to be stored independently of the feed
metadata, otherwise redirecting the feed, or changing its base, will
retroactively break the baseURIs of older entries.

> atom:published

atom:published isn't preserved as an exact copy. It is converted to an
RSS style date (with the time as-is, and the timezone set to GMT even
if it wasn't).

> Overall it seems to me the MS/RSS team has done a pretty fantastic
> job of ensuring a fairly high quality conversion, making exact
> copies of those elements and their associated attributes that did
> not allow for a clean conversion to the MS/RSS format and still
> maintain any hope whatsoever of making the round trip back to its
> original Atom format.

Hear, hear! With all of the high-profile coverage of RSS in the
publicity, I was expecting Atom to be either neglected, or supported
with much lower fidelity than it is now.

Most of the data-loss problems are minor (and therefore easy to fix
:). The serious problems are with inheritance (xml:base, xml:lang,
atom:author, and atom:rights).

-- 
Dave

Re: Atom syndication schema

2006-03-16 Thread David Powell



Thursday, March 16, 2006, 7:31:08 PM, you wrote:

> David Powell wrote:
>> Not sure if this is a known bug, but I just noticed that the RelaxNG
>> grammar doesn't accept "atomCommonAttributes" (eg xml:lang) on the
>> "atom:name" and "atom:uri" and "atom:email" elements used within
>> Person constructs.

> Did you cc me because of my coverage of the matter?

> http://copia.ogbuji.net/blog/2006-02-06/Small_fix_

> If so, I think I said all I have to say about it there.  My fixed RNG is
> still available.

Er, I hadn't read the thread properly, and posted to it having
independently discovered the same bug when I was doing some hard-core
XSLT-ing of Atom.  Doh.

Could you post the errata to the rfc-editor, via:
http://www.rfc-editor.org/errata.html


-- 
Dave

Re: Atom syndication schema

2006-03-15 Thread David Powell

Wednesday, March 15, 2006, 3:21:08 AM, Martin Duerst wrote:

> For atom:uri and atom:email at least, not having xml:lang may
> be seen as a feature.

The spec says that "Any element defined by this specification MAY have
an xml:lang attribute". We chose to limit the effects of xml:lang,
rather than the occurrence of it. Eg: atom:published is allowed
xml:lang, even though it is meaningless. The spec includes a sentence
about element xxx being "Language-Sensitive" when we consider the
language to be relevant. The idea is, if a feed reading framework such
as Microsoft's Windows/IE7 feed platform doesn't preserve xml:lang on
elements that aren't "Language-Sensitive", then they are doing nothing
wrong. Same for, eg: an Atom publishing server backed by a legacy CMS.

> While these often contain pieces from one language or another, they
> are not really in a language.

I agree. Note that this is the case in Atom, because those two
elements are not "Language-Sensitive".

Also note, that atom:uri is an IRI-reference, so it is affected by any
xml:base attributes on that element.

And that atomCommonAttributes also covers extension attributes, which
are also allowed anywhere. They are "undefined", which *I* think means
that implementations need not feel bad about dropping them on the
floor. The official meaning is, er, undefined.

-- 
Dave

Re: Atom syndication schema

2006-03-14 Thread David Powell



Not sure if this is a known bug, but I just noticed that the RelaxNG
grammar doesn't accept "atomCommonAttributes" (eg xml:lang) on the
"atom:name" and "atom:uri" and "atom:email" elements used within
Person constructs.

-- 
Dave

Re: Feed paging and atom:feed/atom:id

2006-03-10 Thread David Powell

Friday, March 10, 2006, 5:44:21 PM, you wrote:

> Are linked feeds required to have unique atom:id values? Or, are they
> required to have the same atom:id values?

> Thoughts?

The history spec frequently uses the phrase "the feed" in the
singular; this implies to me that the ids of the feeds must be the
same to satisfy atom:id semantics.

-- 
Dave

Re: IE7 Atom Handling (was RE: Link rel attribute "stylesheet")

2006-03-01 Thread David Powell


Hi Sean,

I've been testing IE7 beta 2's support for Atom, with the following
test feed:


Also here for easier viewing in IE7



Here are the problems that I have found:


01. Person Extensions

In Atom, extension elements can appear in feeds, entries, and person
constructs.  So atom:author and atom:contributor should preserve any
extension elements.  Currently, the transform only preserves atom:uri,
atom:name, and atom:email.  It should preserve all extensions too.


02. Timezones

atom:updated is converted to RSS's RFC822 pubDate element, but the
timezone information is lost.  Eg: a date such as
"2006-01-01T05:00:00+02:00" is converted to
"Sun, 01 Jan 2006 05:00:00 GMT", which is incorrect.

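For comparison, a conversion that keeps the instant intact can be
sketched with Python's standard library. The helper name
atom_date_to_rfc822 is hypothetical, and the sketch assumes the input
carries a numeric offset (datetime.fromisoformat only accepts a trailing
"Z" from Python 3.11 onwards).

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def atom_date_to_rfc822(atom_date):
    """Convert an Atom (RFC 3339 style) date to an RFC 822 style GMT date."""
    parsed = datetime.fromisoformat(atom_date)
    # Shift to UTC first, so that the "GMT" label is actually true.
    return format_datetime(parsed.astimezone(timezone.utc), usegmt=True)


converted = atom_date_to_rfc822("2006-01-01T05:00:00+02:00")
print(converted)  # Sun, 01 Jan 2006 03:00:00 GMT
```

05:00 at +02:00 is 03:00 GMT, so the transform's output above is two
hours out.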

03. atom:published

While atom:updated is converted to pubDate, atom:published is kept as
atom:published; except, the date format is converted to RFC822 format.
I think that the date format should be kept as-is in ISO8601-style
format.


04. Alternate links for non-HTML types

The entry called "Binary Entry" contains a link of the form:



This link isn't treated as the link for the entry, presumably because
it has a type other than HTML.  If no HTML link can be found for the
alternate link, perhaps it would be worth just choosing any other
alternate link instead.


05. HTML titles

HTML in feed and entry titles is interpreted properly, but flattened
to text. This is presumably deliberate but it does mean that there is
some data loss. Perhaps the original atom namespaced element should be
preserved as well in these cases?


06. atom:generator

atom:generator is converted to RSS's generator.  The uri attribute is
included as an unnamespaced uri, but the version attribute is dropped.
Perhaps both should be preserved, and it might be better to put the
attributes into a namespace?


07. XHTML namespace prefix

More of a rendering problem, but I've included it here because it is
significant:  xhtml content currently only works if the xhtml is in
the default namespace.  If a namespace prefix is used, it fails to be
interpreted correctly.  See the entry entitled:
"Entry with full iana [EMAIL PROTECTED] values"; the link should appear as an
HTML link, but doesn't.
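
As an illustration of why the prefix shouldn't matter, here is a small
Python sketch using a namespace-aware parser. The two sample fragments
and the helper name link_targets are invented for the example; the
point is only that matching on the namespace URI treats both spellings
identically.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"

# Same content, once with XHTML as the default namespace, once prefixed.
default_ns = ('<div xmlns="http://www.w3.org/1999/xhtml">'
              '<a href="http://example.org/">link</a></div>')
prefixed = ('<h:div xmlns:h="http://www.w3.org/1999/xhtml">'
            '<h:a href="http://example.org/">link</h:a></h:div>')


def link_targets(xhtml_source):
    """Return the href of every XHTML <a> element, whatever prefix is used."""
    root = ET.fromstring(xhtml_source)
    return [a.get("href") for a in root.iter(XHTML + "a")]


print(link_targets(default_ns))
print(link_targets(prefixed))
```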


08. IANA URIs for link relations

A bit of a quirky one, but in Atom the rel values are actually URIs
relative to , so
rel="alternate" and
rel="http://www.iana.org/assignments/relation/alternate" should be
treated the same. The same goes for enclosures. See the entry: "Entry
with full iana [EMAIL PROTECTED] values", which should show an enclosure and a
valid entry link.
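
A sketch of the normalisation in Python, assuming the transform sees
the raw rel attribute value (or None when it is absent). The function
name normalise_rel is hypothetical.

```python
IANA_RELATIONS = "http://www.iana.org/assignments/relation/"


def normalise_rel(rel):
    """Map a rel value to its short registered name where possible."""
    if rel is None:
        # RFC 4287: a link without rel is interpreted as "alternate".
        return "alternate"
    if rel.startswith(IANA_RELATIONS):
        # Full IANA form: strip the prefix to get the registered name.
        return rel[len(IANA_RELATIONS):]
    return rel


print(normalise_rel("http://www.iana.org/assignments/relation/alternate"))
print(normalise_rel("enclosure"))
```

Comparing the normalised values means that "alternate" and its full
IANA form select the same link, and likewise for "enclosure".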


09. Category label

atom:category is converted to RSS's category element.  This causes the
"label" attribute to be lost.  It perhaps should be preserved as a
namespaced attribute.

Also, if it is available it might be better to use the "label" rather
than the "term" as the RSS2 category name, because "term" might not be
very human readable; that is the purpose of "label".  See
"Content Source Entry", which causes the WordNet URI to be displayed
in the category filter box.


10. xml:base everywhere

Some handling of xml:base is done if it appears on atom:feed or
atom:entry, but it can appear anywhere. Eg, xml:base on the atom:link
element should affect that link. There are a number of examples of
xml:base being handled wrongly in the example, eg the broken feed
logo.


11. xml:base / xml:lang namespace

I notice that lang and base attributes appear on the transformed feed,
but don't have the "xml:" namespace prefix?  Is this a bug caused by
the weirdness of the implicit "xml:" namespace?


12. Subscription name

An IE7 bug, but I'll mention it here: If the feed title contains a
line-break, the "Subscribe to feed"-dialog doesn't work because the
line-break gets embedded as a hollow-square in the text box and
causes an error. Try subscribing to the test feed, it works if you
remove the hollow-box from the subscription name.


13. xml:base and xml:lang inheritance from atom:feed to entries

xml:base and xml:lang at feed level should apply to all elements
nested within the feed document. However the atom:feed element and its
metadata can obviously change over time. This creates a problem: What
if the atom:feed element contains an xml:base attribute, and it changes?
The feed document as polled can be assumed to be consistent, but it
would be wrong to retroactively apply this new base to old entries.
In order to avoid these problems each entry needs to store the
xml:lang and xml:base context at the time it was last seen in the
document.

I think that if a document has xml:lang set on atom:feed, then this
attribute should be written to all item elements, unless it is
overridden on that atom:entry element. Same for xml:base, except you
might need to resolve the entry base against the feed base.
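
Something along these lines, sketched in Python. The function name
snapshot_context, the document URI and the sample feed are assumptions
for illustration; the sketch writes the attributes onto atom:entry
rather than onto RSS item elements, and it ignores xml:base on elements
below the entry.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

ATOM = "{http://www.w3.org/2005/Atom}"
XML = "{http://www.w3.org/XML/1998/namespace}"


def snapshot_context(feed, document_uri):
    """Write the effective xml:lang and absolute xml:base onto each entry."""
    feed_lang = feed.get(XML + "lang")
    # Feed base: its xml:base (if any) resolved against the document URI.
    feed_base = urljoin(document_uri, feed.get(XML + "base", ""))
    for entry in feed.findall(ATOM + "entry"):
        if feed_lang is not None and entry.get(XML + "lang") is None:
            entry.set(XML + "lang", feed_lang)
        # Entry base: its own xml:base (if any) resolved against the feed base.
        entry.set(XML + "base", urljoin(feed_base, entry.get(XML + "base", "")))
    return feed


doc = """<feed xmlns="http://www.w3.org/2005/Atom"
      xml:lang="en" xml:base="http://example.org/blog/">
  <entry><title>one</title></entry>
  <entry xml:lang="fr" xml:base="fr/"><title>deux</title></entry>
</feed>"""

feed = snapshot_context(ET.fromstring(doc), "http://example.org/feeds/main.atom")
entries = feed.findall(ATOM + "entry")
for entry in entries:
    print(entry.get(XML + "lang"), entry.get(XML + "base"))
```

Each stored entry then carries its own context, so a later change to
the feed-level attributes only affects entries seen after the change.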

Actually if you support feeds redirecting

Re: Fwd: [rss-public] Microsoft Feeds API Enclosure Test

2006-02-23 Thread David Powell

Thursday, February 23, 2006, 6:37:50 AM, you wrote:

> Does someone who has access to an MSFT system care to take a
> look at this?

I have been playing with IE7, and it is interesting to see what
happens when you click on a feed and "view source".

If the feed hasn't been subscribed to, you just see the feed source as
you would expect.

If you have subscribed to the feed however, you see Windows's internal
representation of the feed, which is normalised to a sort of RSS2++. I
assume that this is what is exposed when you use the APIs to access
the XML.

(Hmm - giving access to the XML in this way is a brave move, XML has a
huge surface area for an API, practically any change to the XML
produced by Windows could break client applications, and I didn't find
any documentation for the normalised RSS2++ ).

What is interesting is that Atom is handled (reasonably well), by
converting the Atom to RSS2. The logic seems to be to replace Atom
elements with their RSS2 equivalents where the loss in fidelity is not
too great (eg atom:updated -> pubDate), and to leave the Atom as-is
for awkward cases (eg: [EMAIL PROTECTED]/xml).

There is definitely some loss in fidelity though.  It would be nice to
run an extreme Atom feed through the process to see what gets lost.
xml:base appears to get corrupted, and unless the API provides access
to the baseURI of each entry there is a risk of data loss (as the
xml:base at feed level may change between polls it therefore needs to
be preserved with each entry.)

Does anyone have a bad-ass atom feed with IRIs, binary content,
atom:source, xml:base, xml:lang, extensions etc for testing?

-- 
Dave

Re: More on atom:id handling

2006-02-01 Thread David Powell

Wednesday, February 1, 2006, 3:20:23 PM, Thomas Broyer wrote:

> [CC'ing atom-syntax]

> 2006/2/1, David Powell <[EMAIL PROTECTED]>:
>>
>> Wednesday, February 1, 2006, 6:21:12 AM, James M Snell wrote:
>>
>> > Entries in an Atom feed can share the same atom:id but their
>> > atom:updated values should be different.
>>
>> To be precise, it is "Entries in an Atom Feed Document" not "Entries
>> in an Atom feed".
>>
>> I really really dislike that rule, and don't understand how it was
>> ever accepted, and personally I would be tempted to ignore it.

> IIRC, it was to allow a feed listing "revisions" of the same entry:
> same id, different "updated" values.

I don't have a problem with allowing multiple revisions with the same
atom:id in a single document at all; I think that is a good thing.

On the contrary, I have a problem with preventing multiple revisions
from having the same atom:updated value. It subverts the intent of
atom:updated being a subjective element, and it puts the feed compiler
in an impossible situation. Nothing prohibits the entry author from
producing two different instances with the same atom:updated value,
but given this valid situation, the feed compiler is forced to
silently lose data.

And for what purpose? The restriction is useless anyway. If it is
trying to provide a strict ordering of instances within a feed, it
fails, because the restriction only applies within a feed document.
Two separate polls of a feed document can still produce two different
instances with the same updated value.
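
To show the bind the feed compiler is in, here is a toy Python sketch.
The dict-based entry representation and the function name
compile_feed_document are invented for the example; it assumes a
compiler that takes the rule literally and allows only one instance per
(atom:id, atom:updated) pair in a document.

```python
def compile_feed_document(instances):
    """Keep at most one instance per (atom:id, atom:updated) pair."""
    kept = {}
    dropped = []
    for instance in instances:
        key = (instance["id"], instance["updated"])
        if key in kept and kept[key] != instance:
            # A different instance with the same id and updated value:
            # the rule leaves no way to include it in the same document.
            dropped.append(instance)
        else:
            kept[key] = instance
    return list(kept.values()), dropped


# The author fixed a typo but chose not to change atom:updated.
first = {"id": "tag:example.org,2006:1",
         "updated": "2006-02-01T12:00:00Z", "title": "Helo"}
second = {"id": "tag:example.org,2006:1",
          "updated": "2006-02-01T12:00:00Z", "title": "Hello"}

kept, dropped = compile_feed_document([first, second])
print(len(kept), len(dropped))
```

Both instances are valid output from the entry author, yet one of them
has to be left out of the document.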

It also prevents synchronization applications, such as Microsoft's SSE
from introducing a more discerning date/revision extension, because
nothing is allowed to be more discerning than atom:updated, even
though the specification admits that:

  "not all modifications necessarily result in a changed atom:updated
  value"

-- 
Dave

Re: atom:content's src and server-driven content negotiation

2006-01-19 Thread David Powell

Thursday, January 19, 2006, 11:17:38 AM, Graham Parks wrote:

> The correct thing to do is to pick the one provided by default by the
> server when no content negotiation occurs. eg:

>   http://www.example.com/img"; />

Possibly, but that solution isn't perfect. There is a tradeoff between
supplying an inaccurate type, and supplying no type at all. This TAG
finding [1] discusses the issue quite thoroughly.

[1] http://www.w3.org/2001/tag/doc/mime-respect-20040225

The risk in providing an inaccurate mime-type in the content or link
elements, is that a user-agent such as a mobile device, might not
attempt to fetch the content at all if they don't support the advisory
MIME type, even though if they had requested the type, and conneg had
taken place, they could have been served a different type that they
could have handled.

-- 
Dave

Re: partial xml in atom:content ?

2006-01-17 Thread David Powell

Tuesday, January 17, 2006, 8:39:54 PM, James Holderness wrote:

> This has got nothing to do with second-guessing. Just pretend for a moment
> that there was no such thing as the "xhtml" type. Now the question is what
> is the correct way for an aggregator to display an "application/xhtml+xml"
> document. There's nothing in the spec that says an aggregator can't display
> that document inline. That's not second-guessing, that's an implementation
> choice. The fact that displaying such a document inline turns out to involve
> the same process as the "xhtml" type is irrelevant.

Assuming that the document's /html/head section is irrelevant and
discarding it, even when the publisher has specifically used non-core
types to send the full document, is second-guessing the user though.

Eg: perhaps the publisher is attempting to send an HTML document that
they saved in Word, full of CSS styles, that is intended for printing. [*]

I agree that how you display such content is just an implementation
choice, but if the publisher has specifically used a non-core type to
label content, I think it is a better choice to just treat the content
identically to any other non-core type, and probably display a
download link.

-- 
Dave

Re: partial xml in atom:content ?

2006-01-17 Thread David Powell

Tuesday, January 17, 2006, 9:48:22 PM, I wrote:

> Eg: perhaps the publisher is attempting to send a HTML document that
> they saved in Word, full of CSS styles, that is intended for printing. [*]

[*] Off-topic rant:

Let's hope that the user doesn't attempt to publish their document
as .mhtml, which is bizarrely banned from appearing in Atom. I
suppose it could be base64 encoded and typed as application/octet-stream.

Hmm.

-- 
Dave

Inheritance of atom:rights

2006-01-10 Thread David Powell



From 4.2.10:

>If an atom:entry element does not contain an atom:rights element,
>then the atom:rights element of the containing atom:feed element, if
>present, is considered to apply to the entry.

Is there a reason that this paragraph excludes inheritance from
atom:source?

Section 4.2.1, describing the inheritance of atom:author, does
explicitly allow inheritance from atom:source:

>If an atom:entry element does not contain atom:author elements, then
>the atom:author elements of the contained atom:source element are
>considered to apply.  In an Atom Feed Document, the atom:author
>elements of the containing atom:feed element are considered to apply
>to the entry if there are no atom:author elements in the locations
>described above.

-- 
Dave

Re: [Fwd: Re: todo: add language encoding information]

2005-12-23 Thread David Powell

Friday, December 23, 2005, 10:47:23 AM, Henry Story wrote:

> On 23 Dec 2005, at 10:56, James Holderness wrote:
>>
>> The similarity to the Thread Extension also occurred to me, but I
>> didn't have time to write more about it earlier. My thought was
>> that we could perhaps get by with an extension attribute to the
>> link element that would work for both cases. The link element
>> already has an href for pointing to the feed itself, so all we need
>> is an id to point to a particular entry in the feed. I guess for
>> the Thread Extension you'd also need a new rel value though.
>>
>> An example thread reply-to link would look something like this:
>>
>> >  type="application/atom+xml"
>>  href=http://www.example.org/feed1.atom
>>  x:id="entry1_id" />
>>
>> An example translation link would look something like this:
>>
>> >  type="application/atom+xml"
>>  hreflang="fr"
>>  x:id="french_entry_id" />
>>
>> What do you think?

I assume that there is an href missing from that?

Isn't it likely that the entry will have been dropped from the end of
the feed by the time anyone dereferences it? Would it be better to
point the link at a static Atom Entry document instead?

I've said it before, but I strongly dislike extending standard
elements by adding namespaced attributes. I think that it is a failing
that the specification defines several extension points, then pretty
much says that you can extend anything pretty much anywhere, and fails
to explain why some options might be better than others. It is
difficult to support attribute extensions in an APP server or Atom API
without explicitly coding support for each one. It is quite likely
that some implementations will fail to preserve the namespaced
attribute and forward on a corrupted link element without it.

The spec's statement that "the role of other foreign markup is
undefined by this specification", would just be an unhelpful
comp.lang.c-ism, except C does at least clearly define the meaning and
implications of the term "undefined".

But anyway...

> The problem as I explained a little quickly in my mail yesterday, is  
> that you are relating a entry and an id. Because there can be any  
> number of entries with the same id it won't be clear which entry is  
> the translation.

The description of atom:id in the spec flounders slightly, but I
assumed that the intention was that representations of entries only
vary over time. The use of the word "revision" implies that to me. So,
the relation is to any revision of the entry with that id, though the
latest is probably most relevant.

Unlike HTTP, Atom Syntax can't support conneg, so I don't think that
it would be useful for entries to vary over anything other than time.
If you want to support multi-languages, I think it is better to build
that on top of Atom's infrastructure via linking, rather than
underneath it via stretching the scope of the entry id.

-- 
Dave

RE: How to specify multiple alternative encodings of the same content?

2005-11-07 Thread David Powell


Quoting Lindsley Brett-ABL001 <[EMAIL PROTECTED]>:

> I have raised this question a few times. The issue is separating the
> "resource" from its "representation". A single resource (e.g. a football
> game) may have many representations (audio only, slide show,
> audio/video, etc.). We would need a more sophisticated link mechanism to
> separate resources from representations.
>
> Is anyone else interested in this? Can this be done with an extension
> module?

I'm generally not a fan of adding attributes to atom elements rather than using
the proper extensibility constructs, but how about something like this:





This would support multiple resources, and multiple representations of each
resource in the same entry.  Implementations that didn't support the extension
would still work as before.

Clients could select the best alternative from a full table of preferred MIME
types without the impracticalities of sending a massive Accept: header as would
be required by traditional HTTP conneg.

--
Dave

Re: FYI: Updated Index draft

2005-09-14 Thread David Powell

Monday, September 12, 2005, 5:55:20 PM, James M Snell wrote:

> I've updated the draft that defines an extension that can be used to 
> indicate that the order of entries within a Feed should be considered 
> significant.

How will this interact with the sliding-window/feed-history
interpretation of feeds? The natural order assigned by this extension
seems incompatible with the date order that would be implied by two
feed documents, polled over some period of time.

What should be the order of a merged feed history such as this:

Poll 1:
feed(e1, e2, e3)

Poll 2:
feed(e3, e1, e5)

- where, perhaps, 3 and 1 have been updated. How do you combine
entries sorted by their natural order, with the time-ordered feed
history?
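
The conflict in the example can be made explicit with a few lines of
Python. The helper name order_conflicts is hypothetical; it just lists
the pairs of entries whose relative order differs between two polls.

```python
from itertools import combinations


def order_conflicts(poll_a, poll_b):
    """Return pairs of entries that appear in opposite order in two polls."""
    pos_a = {entry: i for i, entry in enumerate(poll_a)}
    pos_b = {entry: i for i, entry in enumerate(poll_b)}
    shared = [entry for entry in poll_a if entry in pos_b]
    return [(x, y) for x, y in combinations(shared, 2)
            if (pos_a[x] < pos_a[y]) != (pos_b[x] < pos_b[y])]


conflicts = order_conflicts(["e1", "e2", "e3"], ["e3", "e1", "e5"])
print(conflicts)  # [('e1', 'e3')]
```

Poll 1 says e1 comes before e3 and poll 2 says the opposite, and
nothing in either document says where e2 and e5 sit relative to each
other, so any merged order involves guessing.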

How will this interact with entry documents, eg over pubsub?

What about Atom Protocol - I can't imagine how I would publish a feed
with a given natural order. For something like the BBC feeds, some
sort of arbitrary "score" field might be more interoperable with both
entry documents, Atom protocol, and feed history.

I'm probably on my own, but I expected Atom's statement that "This
specification assigns no significance to the order of atom:entry
elements within the feed" was non-negotiable and couldn't be changed
by extensions. This seems more like potential Atom 1.1 material to me
- it doesn't seem to layer on top of the Atom framework so much as
slightly rewrite part of it.

Eg - An Atom library or server that doesn't know about this extension
is free to not preserve the entry order, and yet to retain the
 element, even though this will have corrupted the data.

I think that as implemented, this extension wouldn't be safe to deploy
without must-understand extensions, which Atom 1.0 doesn't support.

Ordered feeds are a useful problem though. Indexes or scores on
entries might work better with entry documents, the protocol, and with
the Atom extension framework, but it still isn't clear how they would
interact with the sliding window.

A couple more minor points:

I'm not sure whether the descending/ascending attribute is necessary?
Given that the extension just presents a natural order (by some
unnamed ordering), why would anyone go to the trouble of presenting
the entries in reversed order, and then label them as descending; why
not just present them in ascending order to begin with?

Would it be useful for the extension to allow the natural ordering to
be named? - so if the ordering is by "Importance", or "Order of
real-life events", or something else, then it could be labelled with a
URI and/or label, so that people don't have to guess the significance
of the natural order.

-- 
Dave

Re: Extensions at the feed level (Was: Re: geolocation in atom:author?)

2005-08-21 Thread David Powell

Sunday, August 21, 2005, 8:46:54 PM, Paul Hoffman wrote:

> At 7:24 PM +0100 8/21/05, Peter Robinson wrote:
>>I do something similar, intending it to mean "the location of the items
>>described by this feed" (when there is a single location).

> Ah, I had missed that. This leads to a question for the mailing list. 
> Does an informative extension that appears at the feed level (as 
> compared to in entries) indicate:

> a) this information pertains to each entry

> b) this information pertains to the feed itself

> c) this information pertains to each entry and to the feed itself

> d) completely unknown unless specified in the extension definition

In my RDF model, feed extensions (together with properties such as
atom:generator), are considered to be properties of the FeedInstance.

EntryInstances are related to FeedInstances using containingFeed and
sourceFeed properties.

(Entries and Feeds can have multiple EntryInstances and
FeedInstances, but that's not really relevant...)

So, feed extensions don't automatically inherit to entries in the
model (unlike atom:author which does), but for a given entry you can
locate its feed and take a look at its extension properties, so it
isn't like the information is lost.

So I'd say b); but as long as you aren't throwing away atom:feed data,
that shouldn't prevent an application using feed extensions to do a)
or c).

I think that the interpretation b) is probably what is supported by
section 6 in the absence of any talk about extension inheritance.
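
A minimal Python sketch of interpretation b), assuming a much-reduced
version of the model described above. The class and field names mirror
FeedInstance, EntryInstance, containingFeed and sourceFeed; the geo
extension name, the ids and the helper feed_level_extension are
hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

ExtensionName = Tuple[str, str]  # (namespace URI, local name)


@dataclass
class FeedInstance:
    feed_id: str
    extensions: Dict[ExtensionName, str] = field(default_factory=dict)


@dataclass
class EntryInstance:
    entry_id: str
    containing_feed: FeedInstance
    source_feed: Optional[FeedInstance] = None
    extensions: Dict[ExtensionName, str] = field(default_factory=dict)


def feed_level_extension(entry: EntryInstance, name: ExtensionName) -> Optional[str]:
    """Interpretation b): read the property from the feed, not the entry."""
    return entry.containing_feed.extensions.get(name)


GEO = ("http://example.org/geo#", "location")  # hypothetical extension

feed = FeedInstance("urn:example:feed", {GEO: "51.5,-0.1"})
entry = EntryInstance("urn:example:entry-1", containing_feed=feed)

print(entry.extensions.get(GEO))         # None: nothing was inherited
print(feed_level_extension(entry, GEO))  # 51.5,-0.1
```

The entry carries no copy of the feed's property, but an application that
wants behaviour a) or c) can still reach it through the containing feed.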

-- 
Dave

Re: Feed History -03

2005-08-16 Thread David Powell

Tuesday, August 16, 2005, 11:14:42 PM, Mark Nottingham wrote:

> E.g., what if I want to have an optional attribute on an empty
> element? Is it "simple" or "complex"?

FYI: The first draft of the proposal used an atom:notation="structured"
attribute on the extension to indicate the extension class which
avoided that problem, but I think it was considered to be too crufty.

-- 
Dave

Re: Feed History -03

2005-08-16 Thread David Powell

Tuesday, August 16, 2005, 8:00:55 PM, Mark Nottingham wrote:

> I very much disagree; relative references should be allowable in  
> simple extensions, and in fact the rationale that Tim gives is the  
> reasoning I assumed regarding Atom extensions; if I had known that  
> the division between simple and complex extensions would be used to  
> justify a constraint on the use of context in simple extensions, I  
> would have objected to it.

The constraint on the context is the main reason for the distinction
between Simple and Structured extensions. Why else would we define two
classes of extension if it didn't affect the processing model, and why
else would we make Simple extensions language-insensitive (a very
similar constraint on context, which is spelt out quite clearly, and
therefore presumably not disputed)?

> If you're using something like RDF to model feeds, you already have a
> number of context-related issues to work through, this isn't an extra
> burden.

Yeah a few, Structured extensions are obviously a big one (where the
lang and base context do need to be preserved). The reason for there
being two classes of extensions was to reduce this burden, so that
implementations based on RDBMS, RDF, or whatever can process a common
class of unknown extensions generically. The burden of requiring the
lang and base context to be preserved in a legacy CMS database along
with each extension, on the off-chance that they might be significant
seemed too great.

If you think that the context shouldn't be constrained, then maybe
you're right - maybe the constraint was unnecessary; but I think that
the current spec does impose that constraint, albeit too subtly, in
Section 6.4.1 paragraph 2, and that constraint is what I originally
intended.

If a significant number of people have not picked up on this (the lack
of rationale probably didn't help), and would have disputed it, then I
guess that we have a problem.

-- 
Dave

Re: More about Extensions

2005-08-11 Thread David Powell


I said:

> I might have misinterpreted your comment, but I'm arguing with Tim for
> saying that SEE's CAN contain relative refs and no clarification is
> needed, and with you for saying that SEE's CANNOT contain relative
> refs and no clarification is needed.  There's a word for that :)

I oversimplified that.  What I really meant is not that rel refs are banned, but
that publishers should not expect rel refs to be processed differently than
strings.

The value of a Simple Extension is a string.  It is the job of an unextended
Atom implementation to transfer those strings.

Extensions can encode what they like in those strings, and if the extension is
supported by the receiver, then it can be decoded.  A publisher can put
numbers, dates, URIs, or escaped XML in one of these strings; they can even put
relative refs in there, but they must only expect the string to be preserved,
not the context of the surrounding XML.  SEE's exist so that simple properties
can be transferred without requiring that the CMS store an Infoset equivalent.
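
As a sketch of what "transfer those strings" could look like in practice,
the following Python uses the standard xml.etree.ElementTree module to pull
Simple Extension elements out of an entry as name/value pairs. The ex:
namespace, the element names and the function simple_extensions are
invented for the example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:ex="http://example.org/ns#"
       xml:base="http://example.org/blog/">
  <title>Example</title>
  <ex:rating>5</ex:rating>
  <ex:related>../other.html</ex:related>
  <ex:geo lat="1.0" long="2.0"/>
</entry>"""


def simple_extensions(parent):
    """Return [((namespace_uri, local_name), value)] for child elements that
    look like Simple Extension elements: foreign namespace, no attributes,
    no child elements."""
    pairs = []
    for child in parent:
        if not child.tag.startswith("{"):
            continue  # un-namespaced element; ignored in this sketch
        ns, local = child.tag[1:].split("}", 1)
        if ns == ATOM_NS:
            continue  # core Atom element, not an extension
        if child.attrib or len(child):
            continue  # has attributes or children: Structured, not Simple
        pairs.append(((ns, local), child.text or ""))
    return pairs


pairs = simple_extensions(ET.fromstring(ENTRY))
for name, value in pairs:
    print(name, repr(value))
```

Note that "../other.html" comes out as a plain string; the xml:base on the
entry is not part of the pair, which is exactly the context a publisher
must not rely on.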

--
Dave

Re: More about Extensions

2005-08-11 Thread David Powell

Wednesday, August 10, 2005, 11:12:30 PM, you wrote:

> Dave: I think I see what you're getting at... correct me if I'm wrong.

> So I decide that my aggregator is going to look for unknown Simple
> Extensions in Atom feeds and display them as a table of name/value
> pairs at the bottom of every entry. And during the display process,
> I'm going to run a regex over the values and linkify any URLs I find.

> When someone's relative references just sit there as plain text and I
> get a complaint, who's to blame?

Another example is:

An AtomPP client publishes entries to an AtomPP server.  The server
stores the entries in a CMS.  The CMS publishes the entries as an Atom
feed.  The CMS shouldn't have to preserve the value and base URI for
each extension attribute.  Simple Extensions were designed with this
sort of scenario in mind.

> (1) Me, for trying to provide generic support for unknown extensions?
> (2) The publisher, for failing to consider non-specific or limited
> support of the extension?
> (3) The complaining user, for expecting too much?

> If it's (3), then I agree with Tim... the spec says what it says, and
> that's fine. Otherwise, there may be a legitimate problem here.

Well I think that it can't be (1). We wouldn't have written sections 6.4.1
and 6.4.2 if we didn't want to support this.

It shouldn't be (3) either. If the publisher puts something into an
Atom document, it shouldn't be ambiguous whether it is a URIRef or
not, even for extensions.  This is what I mean by extensions being
part of the Atom model.

We wouldn't have banned Simple Extensions from containing language
sensitive text, if we were requiring implementations to preserve each
of their base URIs.

This paragraph explains that, but obviously not well enough:

> The element can be interpreted as a simple property (or name/value
> pair) of the parent element that encloses it. The pair consisting of
> the namespace-URI of the element and the local name of the element
> can be interpreted as the name of the property. The character data
> content of the element can be interpreted as the value of the
> property. If the element is empty, then the property value can be
> interpreted as an empty string.

I'm requesting that some clarification be added specific to relative
refs.

-- 
Dave

Re: More about Extensions

2005-08-11 Thread David Powell

Wednesday, August 10, 2005, 11:33:46 PM, Robert Sayre wrote:

> On 8/10/05, David Powell <[EMAIL PROTECTED]> wrote:
>> I think that it is pretty clear, but as Tim disagrees, I think that
>> this is a good indication that we need clarification.

> I think it's good indication that you've argued with everyone, no
> matter what they say. I'm strongly opposed to adding anything like
> you're suggesting. Tim and I agree that the current text is
> sufficient. There's a word for that.

I might have misinterpreted your comment, but I'm arguing with Tim for
saying that SEE's CAN contain relative refs and no clarification is
needed, and with you for saying that SEE's CANNOT contain relative
refs and no clarification is needed.  There's a word for that :)

-- 
Dave

Re: More about Extensions

2005-08-10 Thread David Powell

Wednesday, August 10, 2005, 1:34:44 AM, Tim Bray wrote:

> The problem could hypothetically arise when someone extracts
> properties from the foreign markup, stuffs them in a tuple store, and
> then when the software that knows what to do with it comes along and  
> retrieves it and recognizes the relative URI and can't do much  
> because the base URI is lost.

I expect that many Atom Protocol server implementations will work this
way.

> So... IF you know how to handle some particular extension, AND IF you
> expect to handle it when the extension data has been ripped out of  
> the feed and stored somewhere without any context, THEN you shouldn't
> use a relative reference.

The 1st/2nd "you"s are likely to be a different person from the 3rd
"you". You seem to be saying: "if the consumer knows how to handle
some particular extension [...] then the producer shouldn't use a
relative reference". How does the producer know what the consumer will
do?

(Well of course that is the job of the specification, which is my
point here).

> Alternatively, IF you want to empower extensions to process the
> data they understand, AND IF you want to rip that data out of the
> feed and store it somewhere, THEN it would be smart to provide
> software an interface to retrieve context, such as feed-level
> metadata and the base URI.

It sounds like you disagree with the rationale for Simple vs
Structured extensions in [1]. So can I ask you why you think Atom
defines Simple and Structured Extensions, when according to your
understanding they need to be processed in exactly the same way
(though simple aren't language sensitive - which seems a bit odd if
you believe that)?

I think that the second paragraph of 6.4.1 is pretty clear about what
Simple Extensions mean (the value range of the property is character
data, not character data plus base URI and language context), and that
base URIs would be irrelevant to their processing, but I thought that
it might need some clarification as once people start using them for
URIs there is a risk that people could start using them for relative
refs, given the lack of explanation or rationale for their purpose.

> Sounds like implementor's-guide material to me.

Alternative 1 is not implementable, and alternative 2 suggests
processing that is not compatible with the intended purpose of Simple
Extension elements.

I wasn't entirely convinced that we needed to add this clarification,
but now I am sure that we do.  If there is a belief that
implementations SHOULD preserve the base uri of Simple Extensions then
we need to indicate that this is not true, else Simple Extensions
will be broken from the start.

[1] http://www.imc.org/atom-syntax/mail-archive/msg16643.html

-- 
Dave

Re: More about Extensions

2005-08-10 Thread David Powell

Wednesday, August 10, 2005, 1:30:54 AM, Robert Sayre wrote:

> On 8/9/05, David Powell <[EMAIL PROTECTED]> wrote:
>> 
>> Publishers should expect that relative refs used in atom:link will
>> work, but publishers should expect that relative refs used in Simple
>> Extensions will break.

> Disagree. We have no idea what people will do with this, or where they
> will be deployed. You're suggesting adding implementation advice,
> since the content of a simple extension element is not defined as a
> URI reference.

I want to know whether an implementation is corrupting data if it
discards the base-uri of simple extensions. I think that it is the job
of the spec to tell me this, not the job of "implementation
advice" to decide this.

> By your logic, we have to explicitly clarify that
> atom:updated is not subject to xml:base processing. Sorry, I strongly
> disagree.

Are you saying that we don't need clarification because it is obvious
that simple extensions only contain strings that aren't subject to
base processing?

I think that it is pretty clear, but as Tim disagrees, I think that
this is a good indication that we need clarification.

-- 
Dave

Re: More about Extensions

2005-08-09 Thread David Powell

Tuesday, August 9, 2005, 11:22:14 PM, Robert Sayre wrote:

> What are we going to do, outlaw strings that happen to look like
> relative references?

No, we just need to warn publishers (and extension authors) that the
base URI of Simple Extension elements is not significant, and that
they must not expect it to be preserved.

We do the same regarding xml:lang already by saying that the element
is not Language Sensitive, which means that the language context is
not significant and that publishers must not expect it to be
preserved.

from Section 2:

> The language context is only significant for elements and attributes
> declared to be "Language-Sensitive" by this specification.

I'd suggest adding something similar to Section 6.4.1, eg:

"The base URI is not significant for Simple Extension elements."

> Relative references are fragile, and people understand why they
> break.

Publishers should expect that relative refs used in atom:link will
work, but publishers should expect that relative refs used in Simple
Extensions will break.
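
A small illustration of why the relative ref breaks, using Python's
urllib.parse.urljoin. All URLs, the ex: property name and the dictionary
used as a stand-in for a CMS property table are made up for the example.

```python
from urllib.parse import urljoin

relative_ref = "../archive.html"

# While the publisher's xml:base is still in scope, the reference resolves
# the way the publisher intended.
publisher_base = "http://example.org/blog/2005/"
print(urljoin(publisher_base, relative_ref))
# http://example.org/blog/archive.html

# A generic store keeps only the string value of the Simple Extension.
store = {("http://example.org/ns#", "related"): relative_ref}

# When the entry is republished from a different location, the same string
# resolves against whatever base happens to apply there.
republished_base = "http://cms.example.net/feeds/"
value = store[("http://example.org/ns#", "related")]
print(urljoin(republished_base, value))
# http://cms.example.net/archive.html
```

The stored string is preserved faithfully in both cases; only the meaning
that depended on the original base URI is lost.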

-- 
Dave

More about Extensions

2005-08-09 Thread David Powell



I still believe that relative URIs shouldn't exist in Simple Extension
constructs [1]. I think that the lack of rationale for there being 2-3
classes of extension construct is proving to be harmful.


Prior to the introduction of Section 6, Atom pretty much said you can
include any foreign markup anywhere. I thought that this conflicted
with the claim made by the charter that:

> Atom consists of:
> * A conceptual model of a resource
> * A concrete syntax for this model

I thought that the model should be separable from the syntax, so that
people can use databases and RDF stores as their back-ends rather than
just XML files. And I thought that it was important that extensions
should be part of that model, rather than only be representable in the
syntax, else extensions would be poor-cousins of the core elements.

Restricting Atom extensions to only being simple string name/value
parameters would ensure that they were represented in the model, but
it would have been too limiting.

So the two classes of Extension construct, Simple and Structured, are
a compromise between constraints and flexibility.

The pros and cons of each class are:

Simple Extension constructs:


  + simple string name/value properties of the feed/entry/person. Easy
to implement generically end-to-end in servers/clients so that
extensions can be deployed generically without requiring "boil the
ocean" acceptance.

  + property semantics as described by section 6.4.1.

  + publishing clients could provide an extension editor, where
metadata fields could be added to the client's form, given a
namespace URI and element name.

  + extensions don't need to be defined specifically for Atom. RDF
Vocabularies, RSS extensions, DC, and PRISM already define
properties that are compatible with Atom Simple Extensions.

  + simple, useful mapping to RDF

  - can't represent language sensitive text. This decision was made
because very few RSS extensions contain language sensitive text,
(they tend to contain dates, numbers, tokens, URIs etc - when
language-sensitive text is required Structured Extensions should
be used). Also, the barrier for implementations such as custom
property tables, CRMs, and WebDAV implementations would be high. 

  - can't represent relative URI references, because they are defined
to be strings only, and generic implementations can't know what is
or isn't a URI reference.


Structured Extension constructs:


  + Can support (almost) arbitrary XML.

  - no pre-defined semantics.

  + no pre-defined semantics.

  - clumsy generic mapping to RDF (by preserving the XML blob), though
with extension specific knowledge a better mapping could be used.
  
  + Publishing servers can generically support them by preserving the
blob of XML.

  - Publishing clients can't easily generically support them, as the UI
to edit a chunk of arbitrary XML wouldn't be very user-friendly.
  
  - require at least a mandatory attribute or child in order to exist.
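
To show how a generic server might treat the two classes differently, here
is a hedged Python sketch. It assumes the distinction is made purely on
syntax (no attributes and no children means Simple, anything else means
Structured); store_extension and the ex: elements are hypothetical names.

```python
import xml.etree.ElementTree as ET


def store_extension(element):
    """Return (kind, payload) for a foreign-namespace metadata element.

    Simple: no attributes and no child elements; payload is the string value.
    Structured: anything else; payload is the serialised XML blob.
    """
    if element.attrib or len(element):
        return "structured", ET.tostring(element, encoding="unicode")
    return "simple", element.text or ""


simple = ET.fromstring('<ex:rating xmlns:ex="http://example.org/ns#">5</ex:rating>')
structured = ET.fromstring(
    '<ex:geo xmlns:ex="http://example.org/ns#" lat="1.0" long="2.0"/>'
)

print(store_extension(simple))  # ('simple', '5')
kind, blob = store_extension(structured)
print(kind)                     # structured
print(blob)                     # the element re-serialised as an XML string
```

The Simple case fits in a name/value table; the Structured case can only be
kept generically as an opaque blob, which matches the trade-offs listed
above.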


Namespaced attributes & atom:link children:

  - Not part of the Atom model - only representable by the syntax.

  - Not really practical to support generically; require
"boil the ocean" adoption.
  
  - Really not something I'm keen on as evidenced by this biased
assessment...  Are they really allowed for things other than
future versions of Atom?

  + ...OK, they let you add annotations to elements in a way that
would be difficult to address without an RDF style graph-based
format.


Does that sound about right?

So, can we agree that relative URIRefs aren't allowed in Simple
Extension constructs and add a clarification? Otherwise their
implementation won't satisfy the rationale for their design.

If I'm wrong, and the rationale behind Simple Extensions isn't
important, then can someone explain why there are two classes of
extension?

[1] http://www.imc.org/atom-syntax/mail-archive/msg16598.html

-- 
Dave

Re: Simple Extensions and xml:base

2005-08-06 Thread David Powell


Quoting Tim Bray <[EMAIL PROTECTED]>:

   >Right, but anyone who reads a simple extension out of an Atom feed
   >and finds something they consider to be a relative URI reference, and
   >wants to absolutize the reference, either uses the base URI as
   >established by xml:base, or they are wrong. -Tim
   >

I wasn't suggesting that we needed special xml:base rules.  My point was that it
is not legal for an extension to define its content to be a (relative) URIRef
(although it is OK for an extension to contain an absolute URI).

Simple Extensions are designed to be a simple class of extension that generic
support can be implemented for fairly easily in publishing clients and servers.
 They are intended to be name/value pairs that can be stored outside of an Atom
Infoset.   Generic processing is the reason that the distinction between Simple
and Structured extensions exists.

Generic processing, for implementations with database or RDF backends, can't be
implemented if you need to know whether each extension contains a URIRef or
text before putting the extension in your database.

You can't put URIRefs in Simple Extension Elements because some implementations
will not preserve their base URI (which is legal behavior, because the spec
defines them to contain plain strings, not URIRefs.)

I think that this is already implied by the spec, but it obviously isn't very
clear.

--
Dave

Re: Simple Extensions and xml:base

2005-08-05 Thread David Powell

Quoting Tim Bray <[EMAIL PROTECTED]>:

> On Aug 4, 2005, at 11:21 PM, David Powell wrote:
>
> > We say that Simple Extension Elements are not language sensitive, but
> > we don't say that Simple Extension constructs aren't affected by
> > xml:base.  I think that the implication is that they are not, but it
> > is not very explicit:
>
> They *are* affected by xml:base.  xml:base establishes the base URI
> for wherever it's in-scope, with a specific callout to RFC3986 for
> the semantics.  Anytime you see something that you know is a relative
> URI reference, you have to absolutize it using the base URI, and the
> base URI is what xml:base says it is.  -Tim

Yes, I understand, but I disagree.  xml:base only affects things that are
designated to be URI references.  I wouldn't expect
<atom:title>/index.html</atom:title> to get resolved as a URI, because
atom:title isn't defined to be a URIRef.  Neither are Simple Extension
Elements.

There is a layering.  Implementors of draft-ietf-atompub-format-10 can never
know that an arbitrary Simple Extension Element's value is a URI reference, so
it can never be treated as one; the value is clearly stated to be a
language-insensitive string, not a URIRef, therefore xml:base has no effect.

Specifications for extensions may declare that the string value should be
interpreted by applications as being a URI (or a date, or an integer, or
whatever), but we are at a different layer now; the xml:base of the element
in which the property was declared is as irrelevant as its namespace prefix.

--
Dave

Simple Extensions and xml:base

2005-08-04 Thread David Powell



The intention of Simple Extension Elements is to provide a
simple class of extension that is part of the Atom model, and can
therefore be preserved end-to-end by implementations via publishing
clients, servers, databases, and aggregators.

We say that Simple Extension Elements are not language sensitive, but
we don't say that Simple Extension constructs aren't affected by
xml:base.  I think that the implication is that they are not, but it
is not very explicit:

> [...] Simple Extension elements are not Language-Sensitive. The
> element can be interpreted as a simple property (or name/value pair)
> of the parent element that encloses it. [...] The character data
> content of the element can be interpreted as the value of the
> property. If the element is empty, then the property value can be
> interpreted as an empty string.

It doesn't make sense for Simple Extension Elements to be affected by
xml:base, because that requires extension-specific processing to know
whether the extension is intended to be a URIRef or not; avoiding
extension-specific processing is the reason for Simple Extension
Elements' existence.

Can we add some clarification?

-- 
Dave

Re: Proposed changes for format-11

2005-08-01 Thread David Powell



draft-11:

> This specification does not place any restrictions on what elements
> may be used as Metadata Extensions, but the RelaxNG grammar
> explicitly excludes elements in the Atom namespace. The Atom
> namespace is reserved for future forwards-compatible revisions of
> Atom.

I'm not sure I like this paragraph. It starts by saying that it places
no restriction on the elements, then mentions the RelaxNG, then in the
final sentence, it says that actually there is a restriction after
all. I don't know - perhaps I'm not reading it right, but it sounds
contradictory. It would make more sense to me if everything was
dropped except the last sentence.


Also, Section 6.4 still doesn't permit Extension Elements in atom:source.
I thought that we were going to fix this:
http://www.imc.org/atom-syntax/mail-archive/msg15916.html


And, did we ever get a resolution to the composite MIME types thing
(I'm guessing that we didn't):
http://www.imc.org/atom-syntax/mail-archive/msg15911.html

-- 
Dave

Re: Comments Draft

2005-07-31 Thread David Powell

Sunday, July 31, 2005, 4:47:44 PM, A. Pagaltzis wrote:

> Strictly speaking, per Extensions To the Atom Vocabulary (sec.
> 6.2), an Atom processor must treat the nested link as it would
> treat any other Structured Extension Element (sec. 6.4.2).

Only child elements of atom:entry, atom:feed, and person constructs
(and pending bugfixes: atom:source) can be Structured Extension
Elements (Metadata Elements). Child elements of atom:link are not
Metadata Elements, and neither are foreign-namespaced attributes; they
are undefined markup.

It might be possible to get away with using undefined markup to
communicate between two consenting parties, but I don't think that it
forms a viable platform for extension. It falls outside of the Atom
model, and it can only be preserved between publishing client,
publishing server, feed publication, and aggregator if all parties
either have special case support for the extension, or they preserve
the document's entire Infoset intact along every step in the chain. I
don't believe that the latter is a realistic expectation for Atom
implementations, so it seems reasonable behaviour for a publishing
server to just drop undefined content, such as foreign-namespaced
attributes and atom:link children, and to just publish the link
without them.

Personally, I read "undefined" more like how C uses the word
undefined.

-- 
Dave

Re: Atom namespace, qname-uri-qname roundtripping

2005-07-31 Thread David Powell

Sunday, July 31, 2005, 4:32:11 PM, Graham wrote:

> On 31 Jul 2005, at 4:01 pm, James Cerra wrote:

>> That's apparently what libxml does.  As you can see, with Atom's  
>> namespace it
>> is a mess.  It is also a mess with XHTML's namespace, XSLT's  
>> namespace, and
>> most document-oriented namespaces.

> Were the RDF folks not smart enough to think of this problem and come
> up with a better system or a workaround?

It is an RDF/XML problem, not an RDF problem, and on the list of
RDF/XML problems, there are plenty more serious ones.

Anyway, it is only a problem if you are trying to use ns-qualified
elements that aren't RDF/XML as input to an RDF/XML processor - or
something similar. I'm not really surprised that that doesn't work too
well.

WebDAV does the same thing with namespaces btw[1].

[1] http://lists.w3.org/Archives/Public/w3c-dist-auth/1999OctDec/0343.html

-- 
Dave

Re: Atom namespace, qname-uri-qname roundtripping

2005-07-31 Thread David Powell

Sunday, July 31, 2005, 4:19:40 PM, you wrote:

> I see, thanks for the clarification.

> (I guess atom never intended to allow free -as in speech *and* in
> beer- data mixing anyway, but another namespace would perhaps have
> facilitated the inclusion of atom data in existing rdf tools.)

I would recommend shifting Atom into a separate AtomRDF namespace if
you want to use it with RDF tools. You'll probably need to anyway to
disambiguate things in Atom that are context sensitive, eg link/@type
vs title/@type.

Actually this raises an issue with my Atom/RDF model.  I currently
convert Simple Extension Elements to RDF properties, but I probably
need to create another triple to preserve the namespace URI of the
property - eg:

_:entry "53"
  ns:namespaceURI

otherwise, it won't be possible to round-trip properties.
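
A short Python sketch of the round-tripping problem, assuming the usual
RDF/XML convention of concatenating namespace URI and local name. The
second namespace/local-name pair and the dictionary layout are invented
for the illustration.

```python
def qname_to_uri(namespace_uri, local_name):
    """RDF/XML-style mapping: concatenate namespace URI and local name."""
    return namespace_uri + local_name


ATOM_NS = "http://www.w3.org/2005/Atom"

a = qname_to_uri(ATOM_NS, "title")
b = qname_to_uri("http://www.w3.org/2005/", "Atomtitle")
print(a)       # http://www.w3.org/2005/Atomtitle
print(a == b)  # True: the split point is not recoverable from the URI alone

# Keeping the namespace URI next to the property makes the split recoverable.
stored = {"property": a, "namespaceURI": ATOM_NS}
local_name = stored["property"][len(stored["namespaceURI"]):]
print(local_name)  # title
```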

-- 
Dave

Re: Comments Draft

2005-07-30 Thread David Powell



Sunday, July 31, 2005, 1:09:44 AM, I wrote:

> I don't believe that atom:link _isn't_ usefully extensible other than by

er, that should be "is"


-- 
Dave

Re: Comments Draft

2005-07-30 Thread David Powell

Saturday, July 30, 2005, 9:55:33 PM, Antone Roundy wrote:

> 
> 
> 

I'm not at all keen on extending the link element in this way. Atom
Publishing Servers that don't know about this extension that receive
an entry containing nested links from a publishing client will most
likely drop the content of the link and publish it to clients without
the inner link.

I don't believe that atom:link isn't usefully extensible other than by
creating new @rel values; or if you want something more powerful, use
extension elements.

Atom's extensibility framework has been touted as one of Atom's major
advantages[1] over RSS. Nested link elements sound a bit too much like
[2], but without the namespaces constraint.

[1] http://www.intertwingly.net/wiki/pie/Rss20AndAtom10Compared#x
[2] http://blogs.law.harvard.edu/tech/rss#extendingRss

-- 
Dave

Re: Atom RDF/OWL models

2005-07-26 Thread David Powell

Sunday, July 24, 2005, 9:39:53 AM, Danny Ayers wrote:

> David Powell's full and fairly verbose RDF schema, again I think it's
> an Atom-specific vocab :
> http://djpowell.net/atomrdf/0.1/
> Dates from 2005-03-22, covers draft-05.
> it can be viewed through a nifty little styled-TriX converter:
> http://djpowell.net/rdftrix/

I've just updated my transform to draft-10. The transform attempts to
implement everything in the Atom draft, including things like author
defaulting, xml:base, extension elements, etc...

One of the disadvantages of RSS1.0, I think, is that it doesn't really
discuss mutable entries. If you smush together two instances of an
RSS1.0 feed, the result isn't very useful (you get entries with
multiple rss:description's, and no clue to which one is current). So,
in this schema I've divorced entries from entry instances, and feeds
from feed documents, so that you can archive multiple instances in the
same graph and do some temporal stuff with them. (Eg, you might want
to add an http:pollDate property to the entry and feed instances).

I think that the implementation is complete, but I intend to have a go
at mapping RSS2.0 (and probably the others) onto the same RDF Schema.
If they don't fit well, the schema might need some more updates.

http://djpowell.net/atomrdf/0.1/

-- 
Dave

atom:rights inheritance?

2005-07-23 Thread David Powell



Sorry for the pedantry, but I'm trying to update my Atom/RDF thing, so
pedantry is required...

The inheritance rules for atom:rights are different from the ones for
atom:author.  Is this intentional?  Are they supposed to be treated
differently?

See:

> If an atom:entry element does not contain an atom:rights element, then
> the atom:rights element of the containing atom:feed element, if
> present, is considered to apply to the entry.

vs:

> If an atom:entry element does not contain atom:author elements, then
> the atom:author elements of the contained atom:source element are
> considered to apply. In an Atom Feed Document, the atom:author
> elements of the containing atom:feed element are considered to apply
> to the entry if there are no atom:author elements in the locations
> described above.
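
The difference shows up if the two quoted paragraphs are implemented
literally. The Python below is only a sketch of that literal reading;
entries and feeds are plain dictionaries, and the key names and sample
values are assumptions.

```python
def effective_authors(entry, feed=None):
    """Entry authors, else atom:source authors, else containing feed authors."""
    if entry.get("authors"):
        return entry["authors"]
    source = entry.get("source") or {}
    if source.get("authors"):
        return source["authors"]
    if feed is not None:
        return feed.get("authors", [])
    return []


def effective_rights(entry, feed=None):
    """Entry rights, else containing feed rights; atom:source is not consulted."""
    if entry.get("rights") is not None:
        return entry["rights"]
    if feed is not None:
        return feed.get("rights")
    return None


aggregate_feed = {"authors": ["Aggregator"], "rights": "(c) Aggregator"}
entry = {
    "source": {"authors": ["Original Author"], "rights": "(c) Original Author"},
}

print(effective_authors(entry, aggregate_feed))  # ['Original Author']
print(effective_rights(entry, aggregate_feed))   # (c) Aggregator
```

For an entry copied into an aggregate feed, the author comes from
atom:source but the rights come from the aggregator's feed, which is
presumably not what the original publisher would expect.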

-- 
Dave

Re: Atom 1.0 xml:base/URI funnies

2005-07-18 Thread David Powell

Tuesday, July 19, 2005, 12:44:51 AM, A. Pagaltzis wrote:

> You misunderstood what I said. The point is that regardless of
> how the base URI is determined (whether it is embedded in content
> or otherwise), it *means* that the content it applies to was
> actually found at the base URI. It’s not simply any arbitrary old
> prefix defined for convenience.

Why does xml:base allow for relative base URIs and stacking then? If
xml:base can only describe the actual source URI of the document, then
these features don't make sense.

The example in the xml:base spec [1] uses a relative URI in the
 element, after defining an absolute URI in
xml:base="http://example.org/today/" at the top of the document.
If xml:base can only describe the source URI, then one of them must be
lying?
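
For reference, stacking can be sketched in a few lines of Python with
xml.etree.ElementTree and urllib.parse.urljoin. The document below is a
cut-down stand-in, not the example from the xml:base spec; its element
names and the helper effective_bases are assumptions.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

DOC = """<doc xml:base="http://example.org/today/">
  <list xml:base="/hotpicks/">
    <item>pick1.xml</item>
  </list>
</doc>"""


def effective_bases(element, inherited=""):
    """Yield (element, base URI), resolving each xml:base against the base
    inherited from the parent element."""
    base = urljoin(inherited, element.get(XML_BASE, ""))
    yield element, base
    for child in element:
        yield from effective_bases(child, base)


bases = {el.tag: base for el, base in effective_bases(ET.fromstring(DOC))}
print(bases["doc"])   # http://example.org/today/
print(bases["list"])  # http://example.org/hotpicks/
print(bases["item"])  # http://example.org/hotpicks/
```

The inner, relative xml:base is resolved against the outer one, so two
different base URIs are in effect within a single document.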

[1] http://www.w3.org/TR/xmlbase/#syntax

-- 
Dave

Re: More while we're waiting discussion

2005-07-12 Thread David Powell

Tuesday, July 12, 2005, 12:29:58 AM, James M Snell wrote:

> The third is a non-RDF adaptation of the Creative Commons RSS 1.0 Module
> that uses the Atom link element and provides a machine readable license
> for entries and feeds. It is described @ 
> http://www.snellspace.com/wp/?p=184.

>
>   href="http://www.creativecommons.org/licenses/by-nc/1.0"
>xmlns:lic="...">
>  {URI}
>  {URI}
>   
>

I might be on my own here, but I would advise against extending the
link element in this way. I'd say that it is better to use a
Structured Extension Element. The Metadata Elements of section 6.4
(particularly Simple Extension Elements, though that might not work
well in this case) are designed so that metadata can be added to
entries without requiring every step in the chain to be upgraded, from
the publishing client, to the XML processor and database of the server
and aggregator.

Because the content of atom:link is undefined, there is a risk that
some implementations, particularly Atom server implementations
accepting entries from a publishing client, might just drop the
contents of the element.

-- 
Dave

Re: Roll-up of proposed changes to atompub-format section 5

2005-07-05 Thread David Powell



Tuesday, July 5, 2005, 5:09:40 PM, Paul Hoffman wrote:

> At 11:58 AM -0400 7/5/05, Bob Wyman wrote:
>>   Could we at least put in a sentence that states that including a
>>source element in signed entries is recommended? The implementer's guide
>>would then expand on that with more detail, discussion, etc.

> It's quite late to do this. The IESG is looking at a particular 
> version of the draft, and are making comments on that particular 
> version. It is really important that we only make changes based on 
> IESG members' input, not on yet-more things we find.

Will we still be fixing some of the bugs raised since the last draft
though?

Specifically, what are the resolutions for the disputed ban on
composite MIME types [1], and some of the specific bugs in section 6
[2] (mismatches with the RelaxNG etc)?

[1] http://www.imc.org/atom-syntax/mail-archive/msg15911.html
[2] http://www.imc.org/atom-syntax/mail-archive/msg15915.html

-- 
Dave

Media type clarification

2005-07-05 Thread David Powell



It's been raised before [1] [2], but can we clarify whether a MIME
type in atom:content etc. can contain parameters or not?

MIME is a bit vague about the definition of what a "mime type" is, and
historically applications have been tripped up by unexpected MIME
parameters.

Can we add something like "consisting of a main-type, sub-type, and
optional parameters"?

[1] http://www.imc.org/atom-syntax/mail-archive/msg08283.html
[2] http://www.imc.org/atom-syntax/mail-archive/msg14177.html
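
To show the kind of value in question, here is a small sketch that splits a media type string into main-type, sub-type, and parameters using Python's standard email package. The `split_media_type` helper is a hypothetical name; whether Atom should accept the parameters at all is exactly the open question.

```python
from email.message import Message

def split_media_type(value):
    """Split a media type string into (main-type, sub-type, parameters)."""
    msg = Message()
    msg["Content-Type"] = value
    # get_params() returns the bare type first, then (name, value) pairs.
    params = dict(msg.get_params()[1:])
    return msg.get_content_maintype(), msg.get_content_subtype(), params

with_params = split_media_type("text/html; charset=utf-8")
without_params = split_media_type("image/png")
print(with_params)
print(without_params)
```

An application that compares the whole attribute value against "text/html" would be tripped up by the first form, which is the historical problem mentioned above.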

-- 
Dave

Re: Question on Use Case: Choosing atom:content's type when Archiving Mailing Lists

2005-06-19 Thread David Powell

Sunday, June 19, 2005, 5:07:01 AM, you wrote:

>> The prohibition of composite types in the 08 draft (made many months
>> later) 

> Um, no. One of the drafts reworded the requirement in terms of the new
> MIME draft. Previously, the draft cited RFC2045's "discrete type".
> From format-03:

> "Failing that, it MUST be a MIME media type [RFC2045] in which, to use
> the terminology of Section 5 of [RFC2045], the top level is a discrete
> type."

> We had to, you know, make an editorial change because the new MIME
> draft doesn't use the term "discrete type" anymore.

Ah, thanks. I don't know how I missed that.

OK, process-objection withdrawn, but the problem that Mark highlighted
still exists: Atom prohibits message/rfc822; I don't think that it
should. I'd prefer the solution to be to lift the restriction
completely, than to only lift the restriction for remote content.

It is worth looking at what HTTP says about composite types in RFC 2616:

> In general, HTTP treats a multipart message-body no differently than
> any other media type: strictly as payload.

message/rfc822 isn't mentioned at all by HTTP, and is therefore also
treated as just data. It is pretty commonly used too, for both email
(eg: Download from webmail interfaces), and for MHTML content (such as
the format that can be saved by Word 2000 (I think?)).

To be honest, I'm not bothered about multipart/* being banned. I don't
think that it is particularly useful, but I still don't think that we
need to ban it. What if some really useful multipart/* type gets
defined in future - something along the lines of multipart/appledouble
[1], where the multipart is a blob of content that can be passed to
helper applications as a discrete unit?

message/rfc822 is definitely useful though. There is no reason to ban
it, certainly not with a MUST level constraint. Conceptually it is no
more composite than an application/msword document is composite.

[1] http://www.iana.org/assignments/media-types/multipart/appledouble
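
For reference, the composite/discrete distinction depends only on the top-level type: MIME treats "multipart" and "message" as composite and everything else as discrete. A minimal sketch of the check that the draft wording implies (the function name is an assumption) looks like this:

```python
# Composite top-level media types as defined by MIME (RFC 2046).
COMPOSITE_TOP_LEVEL = {"multipart", "message"}

def is_composite(media_type):
    """Return True if the media type's top-level type is composite."""
    bare = media_type.split(";", 1)[0].strip().lower()   # drop any parameters
    main_type = bare.split("/", 1)[0]
    return main_type in COMPOSITE_TOP_LEVEL

results = {
    t: is_composite(t)
    for t in ("message/rfc822", "multipart/appledouble",
              "application/msword", "application/zip")
}
print(results)
```

The check says nothing about whether the payload actually bundles several things together, which is why a zip archive passes while a single email does not.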

-- 
Dave

Re: Question on Use Case: Choosing atom:content's type when Archiving Mailing Lists

2005-06-19 Thread David Powell

Sunday, June 19, 2005, 5:43:36 AM, Antone Roundy wrote:

> On Saturday, June 18, 2005, at 06:28  PM, David Powell wrote:
>> Atom 0.3 multiparts forced a dubious and complex processing model on
>> everyone wanting to process Atom documents. This problem was solved by
>> their removal in the 03 to 07 drafts.
>>
>> The prohibition of composite types in the 08 draft (made many months
>> later) is something quite different.

> I haven't combed through the archives for messages to support this, but
> had you asked me, I would have thought that we had explicitly decided
> to disallow composite types, not just to get rid of 0.3's multipart 
> stuff.

As Robert pointed out, it has been in all drafts since 03, only it was
worded differently.  My mistake.

>> Composite types don't impose any
>> change in the processing model of user-agents, they are just blobs
>> that get passed to a MIME processor; there is no justification for
>> restricting Atom payloads to a subset of the MIME type space.
> Not all user-agents have a "MIME processor".

Well they all do something with atom:content. They may choose only to
allow the built in types: text, html, and xhtml to be processed; they
may support a fixed set of additional MIME types: image/jpeg and
image/gif; or they may support all MIME types by handing them off to
another component (eg via an OBJECT tag, or just as a download link).
Composite types should be dealt with by processors in exactly the same
way as any other unsupported MIME type.

I'm not suggesting that Atom processors do anything with the
composite, like select alternates or present attachments somehow. They
should just treat it as they would any other exotic MIME type, just
like HTTP does.
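
A rough sketch of that dispatch, assuming a hypothetical aggregator that renders the three built-in types, displays a fixed set of image types, and hands everything else off as a download. The names and return values are made up for illustration.

```python
BUILT_IN = {"text", "html", "xhtml"}          # Atom's built-in content types
INLINE_MEDIA = {"image/jpeg", "image/gif"}    # assumed fixed set this client can show

def choose_handling(content_type):
    """Decide how to present atom:content based only on its type attribute."""
    if content_type in BUILT_IN:
        return "render-inline"
    bare = content_type.split(";", 1)[0].strip().lower()
    if bare in INLINE_MEDIA:
        return "render-image"
    # Anything else, composite or not, is opaque data to this client.
    return "offer-download"

handling = {t: choose_handling(t)
            for t in ("html", "image/gif", "application/msword", "message/rfc822")}
print(handling)
```

Note that message/rfc822 falls through the same branch as application/msword; no composite-specific code path is needed.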

It is unlikely that desktop aggregators would do anything special with
composite types, but we aren't defining a protocol for desktop
aggregators, we are defining a document format and documenting what it
means so that publishers and subscribers have a shared understanding
of it. Allowing message/rfc822 would not cause any additional
interpretability problems, so it is inappropriate to ban it
with a MUST level constraint.

> Given the potential complexity and messiness of composite types, I'm
> opposed to leaving the door open for them, since they won't, I
> think, be needed by a significant proportion of Atom users. This
> brings back bad memories of some really ugly data I saw coming out
> of Cyber Dog when I was working on Claris Emailer (Japanese)--a very
> convoluted and in fact buggy mess of multipart/alternative and
> multipart/mixed with lots of duplicated data which forced us to
> handle composite types incorrectly to avoid losing data. Yes, that's
> just one anecdote. But it's an example of the kinds of ugliness that
> can spill over from one bad implementation and mess things up for
> everybody. Unless there's a use case that meets the so-called 80/20
> threshold, I'd be in favor of requiring publishers to keep things a
> little simpler within Atom feeds. If somebody really needs composite
> types, they can use an extension, and user-agent developers can
> decide whether to support it.

Mail user-agents need to unwrap and process composite types (the one
I'm using now does it in a recursive panels and tabs way which works
quite well).

Atom user-agents don't need to unwrap and process composite types
though. They should just treat them as data. If they don't support
them themselves then they should just treat them as they would if they
saw any other MIME type that they don't support.

As I said in another reply, I'm not too bothered about us banning
multipart/* because they aren't really standalone types, but we really
shouldn't ban publishers from including message/rfc822 data; there
isn't any good reason to.

-- 
Dave

Re: Question on Use Case: Choosing atom:content's type when Archiving Mailing Lists

2005-06-18 Thread David Powell

Saturday, June 18, 2005, 4:40:54 PM, Robert Sayre wrote:

> Incorrect. Multipart content presents an accessibility issue because
> the entry metadata is no longer sufficiently granular. There would
> have to be Atom metadata for each part.[0]

A message/rfc822 email or an MHTML document with embedded images isn't
any more or less granular than an HTML document with linked images or
an application/zip file.

The granularity problem came from Atom 0.3 allowing multiple payloads,
this is different from allowing a single composite payload.

The earlier discussion was about the rejection of Atom-0.3's multiple
content elements, and the quirky multipart/alternative thing for
nesting content elements.

The conclusion was to reject multiple atom:content elements, and to
require text or html in text fields such as title and summary, leaving
atom:content unrestricted.

Placing a restriction on the MIME types that can be contained as the
payload of atom:content is contrary to the earlier conclusion.

-- 
Dave

Re: Question on Use Case: Choosing atom:content's type when Archiving Mailing Lists

2005-06-18 Thread David Powell

Saturday, June 18, 2005, 7:16:50 PM, Tim Bray wrote:

>>> My feeling was that we ruled out composite types in *local* content
>>> [...]
>>>
>>
>> I'm still looking, but my suspicion is that we never did rule them
>> out, and that this restriction crept in during some editorial
>> rephrasing.

> I disagree.  Atom 0.3 had explicit built-in support  
> for multipart, and there was strong (not rough) consensus support for
> retiring that and replacing it with the language in the current draft.

I disagree with the co-chair's call.

Yes, the Atom 0.3 "support" for multipart was broken and there was
strong consensus for removing it, as in the text of drafts 03 to 07.

But, there doesn't seem to have been any discussion or consensus
whatsoever for the prohibition of composite MIME types in content,
such as message/*, that was introduced in 08.

In fact the conclusion of the debate that Robert pointed to [1], was
that the types allowed by atom:content remained unrestricted.

Atom 0.3 multiparts forced a dubious and complex processing model on
everyone wanting to process Atom documents. This problem was solved by
their removal in the 03 to 07 drafts.

The prohibition of composite types in the 08 draft (made many months
later) is something quite different. Composite types don't impose any
change in the processing model of user-agents, they are just blobs
that get passed to a MIME processor; there is no justification for
restricting Atom payloads to a subset of the MIME type space.

The restriction is just arbitrary: it disallows MHTML Word documents
and RFC822 emails, but allows application/msword Word documents and
application/zip.

Let's be clear: composites probably won't be used by bloggers, or
supported by blogging aggregators; but this isn't an excuse for this
explicit blanket ban.

> My recollection of the debate is that it was exclusively focused on
> the problems of multipart in the document, thus I proposed to the WG  
> that we did not in fact have consensus of banning it in external  
> content; the feedback so far is supportive of the notion that that's  
> a bug in the spec.

The earlier debate was on the problems in the 02 draft, which were
solved in the 03 draft. The restriction on the payload of atom:content
made in the 08 draft is a different issue, so the context of the
earlier debate isn't relevant.

[1] http://www.imc.org/atom-syntax/mail-archive/msg09357.html

-- 
Dave

Re: Question on Use Case: Choosing atom:content's type when Archiving Mailing Lists

2005-06-18 Thread David Powell

Friday, June 17, 2005, 6:14:38 PM, Tim Bray wrote:

> My feeling was that we ruled out composite types in *local* content
> [...]

I'm still looking, but my suspicion is that we never did rule them
out, and that this restriction crept in during some editorial
rephrasing.

> [...] for fairly obvious reasons.

I don't know what the obvious reasons are...

Lack of client support? - hardly seems a reason to ban a MIME type,
else there is a huge list of ones that we should ban at IANA.

We are defining a data format here. If publishers want to publish
entries as text, message/rfc822, application/msword, image/jp2, or
whatever, then that is up to them. I don't see how we can justify a
MUST NOT requirement for "composite types".

> The fix is obvious, in 4.1.3.1

> "Failing that, it MUST be a MIME media type.  If the "src" attribute
> is not provided, i.e. the content is local, the value of the "type"
> attribute MUST NOT be a composite type... "

I'm in favour of replacing:

  "Failing that, it MUST be a MIME media type, but MUST NOT be a
  composite type (see Section 4.2.6 of [MIMEREG])."

with:

  "Failing that, it MUST be a MIME media type."

-- 
Dave

Re: Question on Use Case: Choosing atom:content's type when Archiving Mailing Lists

2005-06-18 Thread David Powell

Saturday, June 18, 2005, 1:40:52 PM, Sam Ruby wrote:

>> Can somebody give me a link to where we discussed the requirement that
>> atom:content MUST NOT contain a composite type? I've tried searching my
>> archive but I couldn't find anything at all. The change was introduced
>> in draft-08.
>> 
>> I can't agree that this is a mere spec bug until I can find where it
>> was discussed, and what the intent for this ban was.
>> 
>> I don't really see why we are banning these MIME types from either
>> local or remote content.

> http://www.intertwingly.net/wiki/pie/PaceReformedContent3

Thanks, but is that really the same discussion? That proposal was
incorporated into draft-03. The prohibition of "composite types"
wasn't introduced until draft-08.

One of the main effects of PaceReformedContent3 was banning the
Atom-0.3-style "multipart/alternative" envelope that allowed multiple
<content> elements to be embedded.

Fair enough - but I don't see how this is related to the prohibition
of message/* and multipart/*?

Was this a heavy-handed attempt to prevent people from using Atom
0.3-style alternatives in Atom 1.0, perhaps?

-- 
Dave

Re: Question on Use Case: Choosing atom:content's type when Archiving Mailing Lists

2005-06-18 Thread David Powell

Friday, June 17, 2005, 6:14:38 PM, you wrote:

> Uh, has Mark spotted a dumb bug here that we should fix?  Do we care
> if *remote* content is of a composite MIME type?  My feeling was that
> we ruled out composite types in *local* content, for fairly obvious
> reasons.  The fix is obvious, in 4.1.3.1

> "Failing that, it MUST be a MIME media type.  If the "src" attribute
> is not provided, i.e. the content is local, the value of the "type"
> attribute MUST NOT be a composite type... "

> -Tim

Can somebody give me a link to where we discussed the requirement that
atom:content MUST NOT contain a composite type? I've tried searching my
archive but I couldn't find anything at all. The change was introduced
in draft-08.

I can't agree that this is a mere spec bug until I can find where it
was discussed, and what the intent for this ban was.

I don't really see why we are banning these MIME types from either
local or remote content.

-- 
Dave

Re: Review of Section 6

2005-06-09 Thread David Powell

Friday, June 10, 2005, 1:03:41 AM, Paul Hoffman wrote:

> *All* reworking is not acceptable now.

> [...]

> There is a large difference between suggesting a bunch of reworking 
> and pointing out specific ambiguities. Please do the latter if you 
> find them.

Yes, I understand. In my previous mail I suggested some areas which
might be ambiguous, such as the distinctions between the various types
of extension markup; and whether we need to say that namespace
attributes don't count as attributes for the purposes of 6.4.1/2.
Whether or not this is serious is up for debate...

I posted this purely as a talking point, so that anyone unsure about
section 6, can take a look, see if it matches their understanding of
section 6; and if not, to see whether this is the draft's fault, their
fault, or my fault.

The reason I posted it at all, is because I will be on holiday for the
next week, so I won't have any more input until after then, which will
probably be a bit late.

-- 
Dave

Re: Review of Section 6

2005-06-09 Thread David Powell

Thursday, June 9, 2005, 5:51:57 PM, Tim Bray wrote:

> On Jun 9, 2005, at 9:22 AM, David Powell wrote:

>> Firstly, there are some mismatches between the RelaxNG grammar and the
>> specification text.  I know that the RelaxNG grammar isn't  
>> normative; but this
>> doesn't mean that it can be contradictory:

> I've asked Paul, and in fact we can fix typos and outright bugs later
> on in the process.  If you're right about the Relax mismatch (Rob?
> Norm?) then let's fix that.

> On the other hand, a general re-organization of section 6 is right  
> out; it is our finding that the format-09 draft (modulo errors)  
> reflects the rough consensus of the WG.  If you disagree, the IETF  
> provides appeal procedures.

Last week I thought about how to rework Section 6. Although some of
this reworking might not be acceptable now, I'll post it anyway. It
might be that there are some actual ambiguities in the current draft;
if so, then I guess that ambiguities might be considered to be "bugs"
that still need fixing.

It would be good if people could give section 6 a check to see that it
makes sense, and that the text matches their impression.

This rework itself was done fairly quickly, so it might have made some
things worse, but the major changes are:

1) Namespace attributes are explicitly excluded from the attributes
that determine the class of a Metadata Element.

2) Added namespace rules to section 6.4.1 and 6.4.2.

3) Added atom:source as a valid location for Metadata Elements.

4) Added paragraph about the intent of Metadata Elements.

5) Renamed 6.4 terms to "Metadata Extensions", "Simple Metadata
Extensions", and "Structured Metadata Extensions" for uniformity.

6) Added explicit definition of the term "Atom Vocabulary".

7) Moved sub-sections around so that the section can be read more
linearly.

I won't be available for the next week to discuss this btw.

== Dave's version of Section 6 ==

6.  Extending Atom

6.1  Extensions To the Atom Vocabulary

   Future versions of Atom could add new elements to the Atom
   namespace, and new attributes, in the default namespace, to
   existing Atom-namespaced elements.  Software written to conform to
   this version of the specification will not be able to process such
   markup correctly and, in fact, will not be able to distinguish it
   from markup error.  For the purposes of this discussion,
   unrecognized markup from the Atom vocabulary will be considered
   "foreign markup".

6.2  Extensions From Non-Atom Vocabularies

   This specification describes Atom's XML markup vocabulary.  Markup
   from other vocabularies ("foreign markup") can be used in an Atom
   document.  Note that the atom:content element is designed to support
   the inclusion of arbitrary foreign markup.

6.2.1  Metadata Extensions

   Child elements of atom:entry, atom:feed, atom:source, and Person
   constructs are considered Metadata Extensions, and are described
   below.  Child elements of Person constructs are considered to apply
   to the construct.

   Atom defines two classes of Metadata Extensions.  Simple Metadata
   Extensions are designed to be easier for simpler implementations to
   support.  To prevent sporadic support of an extension by
   implementations that only support Simple Metadata Extensions,
   extension authors SHOULD ensure that an extension is either a
   Simple Metadata Extension, or a Structured Metadata Extension for
   all of the extension's possible values.

6.2.1.1  Simple Metadata Extensions

   A Simple Metadata Extension element MUST NOT have any child
   elements or attributes, other than namespace declarations.  It MUST
   be namespace-qualified, and MUST be defined outside of the Atom
   namespace.  The element MAY contain character data, or be empty.
   Simple Metadata Extensions are not Language-Sensitive.

   simpleExtensionElement =
      element * - atom:* {
         text
      }

   The element can be interpreted as a simple property (or name/value
   pair) of the parent element that encloses it.  The pair consisting of
   the namespace-URI of the element and the local name of the element
   can be interpreted as the name of the property.  The character data
   content of the element can be interpreted as the value of the
   property.  If the element is empty, then the property value can be
   interpreted as an empty string.
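
The property interpretation above can be sketched in a few lines. This is only an illustration under assumptions: the `ex:` namespace, the sample entry, and the helper names are invented, and the Atom namespace URI is the one from the published Atom 1.0 specification. ElementTree does not report namespace declarations as attributes, which happens to match the rule that they are not counted.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

def split_tag(tag):
    """Split ElementTree's '{ns}local' tag into (namespace-URI, local name)."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns, local
    return "", tag

def simple_extension_properties(parent):
    """Return ((namespace-URI, local name), value) pairs for simple extensions."""
    props = []
    for child in parent:
        ns, local = split_tag(child.tag)
        if not ns or ns == ATOM_NS:
            continue                    # must be namespace-qualified and non-Atom
        if len(child) or child.attrib:
            continue                    # has children or attributes: not simple
        props.append(((ns, local), child.text or ""))   # empty element -> ""
    return props

# Hypothetical entry with two simple extensions and one structured extension.
ENTRY_XML = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:ex="http://example.org/ns">
  <title>Hello</title>
  <ex:rating>5</ex:rating>
  <ex:flag/>
  <ex:geo lat="1.0" lon="2.0"/>
</entry>"""

props = simple_extension_properties(ET.fromstring(ENTRY_XML))
print(props)
```

A server could store these pairs as custom properties without knowing anything about the extension that defined them.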

6.2.1.2  Structured Metadata Extensions

   The root element of a Structured Metadata Extension MUST have at
   least one child element or attribute, other than namespace
   declarations.  It MUST be namespace-qualified, and MUST be defined
   outside of the Atom namespace.  It MAY have attributes, it MAY
   contain well-formed XML content (including character data), or it
   MAY be empty.  Structured Metadata Extensions are
   Langu

Review of Section 6

2005-06-09 Thread David Powell


Apologies for the rubbish timing, but I've been reviewing section 6, and found a
number of problems.


Firstly, there are some mismatches between the RelaxNG grammar and the
specification text.  I know that the RelaxNG grammar isn't normative; but this
doesn't mean that it can be contradictory:

a) Section 6.4 omits atom:source as a valid location for Metadata Extensions,
but it is allowed by the RelaxNG in 4.2.11.  I believe that the RelaxNG
reflects our intent to allow extensions to be preserved in atom:source.

b) Section 6.4.1 and Section 6.4.2 don't place any restrictions on what elements
may be used as Metadata Extensions, but the RelaxNG grammar explicitly excludes
elements in the Atom namespace.  The Atom namespace should be reserved for
future forwards-compatible revisions of Atom.

I actually raised these issues before the new draft was published but I didn't
receive any comments.  I am surprised that nothing has been fixed, these two
issues are simply bugs.


c) It isn't the intent of Section 6.4.1/6.4.2 that namespace declarations should
be considered among the "attributes and child elements" that determine whether
an extension is "Simple" or "Structured".  This should be made explicit, as many
implementors don't have control over their placement (luckily).


I also have some problems with the structure of the whole section:

Section 6.3 describes "unknown foreign markup" as markup not defined by the
specification and describes how it should be processed; but Section 6.4
specifies the behaviour of another type of markup.  Is Section 6.4 markup
described by the specification, and therefore distinct from "unknown foreign
markup"?  If so why is it described after describing the catch-all - this
non-linearity is very confusing.  Or is it a subset?  I think that the
definition is rather ambiguous.  I find the distinction between "metadata
elements", "foreign markup", and "unknown foreign markup" to be extremely
confusing.  I wouldn't be surprised if a Venn Diagram of them was totally
different to my interpretation.

It would be better if this section was re-ordered, so that the catch-all
definition of non-Metadata Element extensions was placed after the definition
of Metadata Elements.

The two subsections of Section 6.4 are intended to describe the two subclasses
of Metadata Element.  It would make more sense if the terminology for these
subclasses was uniform.  Why not call them "Metadata Extensions", "Simple
Metadata Extensions", and "Structured Metadata Extensions"; rather than
"Metadata Elements", "Simple Extension Elements", and "Structured Extension
Elements".

I think it is insufficient to describe these subclasses purely in terms of their
syntactic characteristics, rather than in terms of their intent.  It would be
beneficial for publishers if a paragraph was introduced into Section 6.4 to
describe why a publisher/extension author would favor one of these classes. 
The intent is that a given extension would belong to a fixed class, and the
Simple class is designed to offer the advantages of easy generic processing by
simpler implementations (eg storage by servers in custom attributes), amongst
other things.


Finally, on a more fundamental level, I am worried by the class of markup that
is an extension, but isn't a "Metadata Element".  This, I think, is limited to
children of atom:link, and additional attributes on atom elements.

If this markup was reserved for future revisions of Atom, that would be fine,
but it doesn't appear to be the case - I can't really tell?

According to the charter, Atom should have a separate model and syntax.  Metadata
Elements were proposed to ensure that extensions could be part of the model, so
that they could be represented in RDBMS/OO/RDF/WebDAV implementations as custom
properties.

This other class of extension, if it is intended to be one, can only be
represented as part of the Atom's XML syntax.  To provide support for
extensions that use this class of markup would require changes to every stage
of the publishing and processing process.

A better alternative would be if these attribute extensions were just
treated as an alternative way of expressing Simple Metadata Extensions - at
least then some generic support for them could be implemented.  Either that or
they should be reserved for use by future Atom specifications.


There are other areas that should be clarified too.  I assume that "Atom
markup vocabulary" is intended to be current and future elements in the Atom
namespace, and un-namespaced attributes on Atom elements, but this is just my
guess, it isn't defined in the specification.


--
Dave

Re: Problem with Metadata Elements (section 6.4)

2005-05-30 Thread David Powell


Quoting Thomas Broyer <[EMAIL PROTECTED]>:

> The problem come when I use a "plain flowed text" and can then omit the
> "type" attribute:
> By Thomas Broyer and al.
> My extension becomes a Simple Extension Element when processed by an Atom
> Processor, and an Atom Processor having some "generic support" for Simple
> Extension Elements (which is really the intent of SEE) would change its
> behavior when processing it, which is not really wanted.
> However, I don't think I badly designed the byline extension, or this
> would imply that Atom itself is badly designed (and I don't think so).

I don't think that this is a big problem.  The allowed syntax of Structured
Extension Elements is defined by the designer of the Structured Extension.  It
is the designer's responsibility to ensure that Structured Extensions don't
flip into being Simple Extensions, so they need to be designed in a way such
that they have at least one child element or attribute.

In this case the easiest way to do this would be to make @type mandatory, even
when its value is "text".
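
A small sketch of the flip being discussed, using a hypothetical `ex:byline` element (the element name and namespace are assumptions, since the original markup did not survive in the quoted message). The class of the element changes depending solely on whether the type attribute is written out.

```python
import xml.etree.ElementTree as ET

def classify(element):
    """Classify an extension element by the syntactic rule in section 6.4."""
    # ElementTree does not expose xmlns declarations in .attrib, so
    # namespace declarations are not counted as attributes here.
    if len(element) or element.attrib:
        return "structured"
    return "simple"

NS_DECL = 'xmlns:ex="http://example.org/byline"'   # hypothetical namespace

with_type = ET.fromstring(
    f'<ex:byline {NS_DECL} type="text">By A. N. Author</ex:byline>')
without_type = ET.fromstring(
    f'<ex:byline {NS_DECL}>By A. N. Author</ex:byline>')

kinds = (classify(with_type), classify(without_type))
print(kinds)
```

Making the attribute mandatory keeps every instance of the extension in the structured class.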

--
Dave

Re: extension elements inside link elements?

2005-05-28 Thread David Powell

Tuesday, May 24, 2005, 5:26:39 PM, you wrote:

> On 24 May 2005, at 4:07 pm, Robert Sayre wrote:

>> 4.2.9 (editorial):  The atom:link element is explicitly described as
>> empty, which violates the rules in 6 for foreign element extension.
>> Remove "is an empty element that".

> That's not an editorial change, that's newly allowing extension  
> elements in a place most people (such as Paul and myself) assumed  
> they weren't.

If the interpretation of Section 6 that Thomas Broyer has helped me
to hammer out is correct, then:

Extension Elements [6.4], in Atom 1.0, are allowed only as direct
children of atom:entry, atom:feed, Person Constructs, and atom:source.
They must be qualified with a namespace, and it mustn't be the atom
namespace. Extension Elements, atom:content, and atom:link/@rel are
the only User Extensions to Atom.

Future versions of Atom [6.2] may add additional elements to the Atom
namespace, may add attributes (namespaced or otherwise) to existing
Atom elements, and presumably may allow text inside Atom elements
(basically any markup that isn't in the spec or an Extension Element).
(Text as a 6.2 element isn't mentioned, but it probably ought to be
treated as 6.2 by clients.)

This is to allow Atom 1.0 implementations to be forward compatible
with future versions of Atom. Atom 1.0 implementations won't know how
to handle these elements and MUST ignore them, and MUST NOT forward
them. (They could be some sort of link-level control element; or they
might be an element that allows a relative URIRef, which would break
if it was dependent on the source document's base URI.)

So, assuming that that description is roughly correct, and that we can
rework section 6 to make it easier to understand and unambiguous about
the difference between 6.2 and 6.4 markup, then I am +1 to removing
"is an empty element that", because it contradicts section 6.2 which
is supposed to apply everywhere.

-- 
Dave

Re: protocol-04 first reading

2005-05-27 Thread David Powell



Friday, May 27, 2005, 7:18:40 PM, Eric Scheid wrote:

> On 27/5/05 4:49 PM, "Thomas Broyer" <[EMAIL PROTECTED]> wrote:

>> Replace "format-08" with "protocol-04" and you get it ;o)
>> http://ietf.org/internet-drafts/draft-ietf-atompub-protocol-04.txt

> except I've been getting format-nn from http://atompub.org/... ;-)

The HTML version and the diffs seem to be here:



-- 
Dave

Re: extension elements inside link elements?

2005-05-26 Thread David Powell

Thursday, May 26, 2005, 11:16:05 PM, Thomas Broyer wrote:

> David Powell wrote:

>>Thursday, May 26, 2005, 8:50:04 PM, Thomas Broyer wrote:
>>
>>>6.2 deals with the "Atom vocabulary", which is the markup in the Atom
>>>namespace or un prefixed attributes on Atom-namespaced elements (this is
>>>my interpretation, it's not clearly stated in the spec, and I'm quite
>>>sure I already raised this in the past two weeks).
>>>
>>Yes, I proposed that we fix this here too:
>><http://www.imc.org/atom-syntax/mail-archive/msg15743.html>
>>
> It'd be still be necessary to define the term "vocabulary" somewhere, in
> section 6 or in the introduction (section 1).

> As "vocabulary" is only used only in sections 6.1 and 6.2, I suggest
> using "namespace" instead:

> [
> 6.1  Extensions From Non-Atom Namespaces

>This specification describes Atom's XML markup namespace.  Markup
>from other namespaces ("foreign markup") can be used in an Atom
>document.  Note that the atom:content element is designed to support
>the inclusion of arbitrary foreign markup.

> 6.2  Extensions To The Atom Namespace

>Future versions of this specification could add new elements and
>attributes to the Atom markup namespace.  Software written to
>conform to this version of the specification will not be able to
>process such markup correctly and, in fact, will not be able to
>distinguish it from markup error.  For the purposes of this
>discussion, unrecognized markup from the Atom namespace will be
>considered "foreign markup".
> ]

This sounds good, but 6.2 ought to also include unprefixed attributes
on atom elements as well as atom: prefixed ones.

> This however doesn't prevent us adding your sentence to 6.4 to make it
> clearer.

I think 6.3 would be much easier to understand if its position was
swapped with 6.4. When you read 6.3 you assume that everything not
mentioned so far must be "unknown foreign markup", then when you read
6.4, it redefines some of that as "metadata elements". They are
intended to be disjoint, but the ordering suggested to me that 6.4 was
a subset.

Paragraph 2 of 6.3 reinforced my misconception that 6.4 markup is a
subset of "unknown foreign markup", especially as it didn't mention
markup on other elements or attributes as 6.2 proposed. If 6.3 and 6.4
were swapped, then I think paragraph 2 of 6.3 could be trimmed down to
avoid this. Replace paragraph 2 & 3 of 6.3, with:

  When unknown foreign markup is encountered in a Text Construct or
  atom:content element, software SHOULD ignore the markup and process
  any text content of foreign elements as though the surrounding
  markup were not present.

  When unknown foreign markup is encountered elsewhere, software MAY
  bypass the markup and any textual content and MUST NOT change its
  behavior as a result of the markup's presence.
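
To make the first paragraph of that replacement text concrete, here is a
minimal Python sketch of "process the text as though the surrounding markup
were not present". It is only an illustration: the function name and the
x: namespace are made up, and the Atom namespace URI is assumed to be the
one published in RFC 4287.

```python
import xml.etree.ElementTree as ET

# Assumption: the Atom namespace URI from RFC 4287.
ATOM_NS = "http://www.w3.org/2005/Atom"


def text_ignoring_markup(element):
    """Return the character data of a Text Construct, dropping any child tags.

    itertext() walks the element and its descendants in document order and
    yields only text, so unknown markup disappears but its text content stays.
    """
    return "".join(element.itertext())


# Hypothetical input: an atom:title holding markup this processor does not know.
title = ET.fromstring(
    '<title xmlns="http://www.w3.org/2005/Atom" xmlns:x="http://example.com/ns">'
    "Hello <x:blink>brave new</x:blink> world</title>"
)
print(text_ignoring_markup(title))
```

The second paragraph (markup found elsewhere) needs no code at all, which is
rather the point: the processor skips the subtree and carries on unchanged.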

--
Dave

Re: extension elements inside link elements?

2005-05-26 Thread David Powell

Thursday, May 26, 2005, 8:50:04 PM, you wrote:

>>6.2 says that new elements and attributes can be added.
>>
> 6.2 deals with the "Atom vocabulary", which is the markup in the Atom
> namespace or unprefixed attributes on Atom-namespaced elements (this is
> my interpretation, it's not clearly stated in the spec, and I'm quite
> sure I already raised this in the past two weeks).

Yes, I proposed that we fix this here too:
<http://www.imc.org/atom-syntax/mail-archive/msg15743.html>

It is extremely important that clients can discriminate Section 6.2
elements from Section 6.4 elements.

If we don't make this change, then implementations could interpret
atom-namespaced "section 6.2 elements" inside entry/feed/person/source
as "section 6.4 metadata elements". Implementations may then forward
these elements believing that they are harmless metadata, when in fact
they are more likely to be control elements, which definitely should
not be forwarded by implementations that don't understand them.

This is pretty much exactly the same problem as the one that caused
HTTP 1.0 Persistent Connections to fail. [see RFC2068 section 19.7.1]
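
A rough Python sketch of the discrimination I mean, for an intermediary
deciding what it may copy into a republished entry. Everything named here is
an assumption for illustration: the KNOWN_ATOM set is deliberately incomplete,
the atom:expires element is an invented "future" element, and the namespace
URI is the one from RFC 4287.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"  # assumption: RFC 4287 namespace
# Atom elements this hypothetical processor understands (incomplete on purpose).
KNOWN_ATOM = {"id", "title", "updated", "author", "link", "category",
              "content", "summary"}


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag form into (namespace, local)."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns, local
    return "", tag


def may_forward(child):
    """Return True if an intermediary may safely copy `child` downstream."""
    ns, local = split_tag(child.tag)
    if ns == ATOM_NS:
        # Unrecognised Atom-namespace markup is Section 6.2 territory: it could
        # be a control element from a later Atom version, so do not forward it.
        return local in KNOWN_ATOM
    # Namespace-qualified markup from outside Atom is treated as 6.4 metadata.
    return ns != ""


entry = ET.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom" xmlns:ex="http://example.com/ns">'
    "<title>t</title><expires>soon</expires><ex:rating>5</ex:rating></entry>"
)
decisions = [(split_tag(c.tag)[1], may_forward(c)) for c in entry]
print(decisions)
```

The check only works if the two classes are distinguishable by namespace
alone, which is why the text needs to say so.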

-- 
Dave

Re: extension elements inside link elements?

2005-05-26 Thread David Powell

Thursday, May 26, 2005, 7:20:23 PM, Thomas Broyer wrote:

>>But then 6.3 goes on to explain how to process it.
>>This sounds like a contradiction?
>>  
>>
> No, why?

Ok, I'd interpreted "ignoring" it to be processing it, as opposed to
failing.  I'll concede that I misinterpreted that.

> Say I am an Atom Processor and I find these extensions elements:
> 
> 26.58
> -97.83

> Is this something I know how to process?

> * yes: this is "known foreign markup", I process it
> * no: this is "unknown foreign markup", I ignore it without failing
>   or signaling an error

So known 6.4 extensions are "known foreign markup", and unknown
extensions are "unknown foreign markup"?  Ok.
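
In code terms I read that as a lookup followed by a silent skip. A minimal
Python sketch, where the geo namespace, the element names, and the handler
are all hypothetical (only the 26.58 and -97.83 values come from the example
above):

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"   # assumption: RFC 4287 namespace
GEO_NS = "http://example.com/geo"         # hypothetical extension namespace


def handle_lat(elem, state):
    """Handler for the one extension this processor happens to know."""
    state["lat"] = float(elem.text)


# Registry of extensions the processor knows how to process.
HANDLERS = {"{%s}lat" % GEO_NS: handle_lat}


def process_foreign_children(parent):
    """Process known foreign markup; skip unknown foreign markup without error."""
    state, ignored = {}, []
    for child in parent:
        if child.tag.startswith("{%s}" % ATOM_NS):
            continue  # ordinary Atom markup is handled elsewhere
        handler = HANDLERS.get(child.tag)
        if handler is not None:
            handler(child, state)       # "known foreign markup"
        else:
            ignored.append(child.tag)   # "unknown foreign markup"
    return state, ignored


entry = ET.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom" xmlns:geo="http://example.com/geo">'
    "<title>t</title><geo:lat>26.58</geo:lat><geo:long>-97.83</geo:long></entry>"
)
state, ignored = process_foreign_children(entry)
print(state, ignored)
```

Whether markup is "known" is therefore a property of the processor, not of
the document.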

>>Also, if 6.4 extensions are a subset of "unknown[?] foreign markup",

> 6.4 extensions are not a subset of "foreign markup" or "unknown foreign
> markup". 6.4 extensions are the whole "foreign markup" set

6.2 says that new elements and attributes can be added. Earlier
discussion in this thread suggested that these may be inside
atom:link elements. Isn't this a type of markup that isn't a 6.4
extension?

6.4 extensions can only appear in atom:feed, atom:entry, and Person
constructs. [BUG ALERT - they should blatantly be allowed inside
atom:source too, and the inheritance needs explaining]

For the people who were proposing that atom:link shouldn't ban
content, can they tell me which type of Section 6 markup they regard
it to be?  Or were they proposing that Section 6.4 should be extended
to allow Link Extensions.

> (except it doesn't deal with "foreign attributes", but the whole
> section 6 and the whole spec doesn't deal with them).

Yes it does, section 6.2 says that future versions of Atom can add new
attributes.

> If you read carefully the section 6.4, you'll notice that what is not a
> Simple Extension Element is a Structured Extension Element.
> A Simple Extension Element is an element with no attribute and is either
> empty or has text-only content.
> A Structured Extension Element is an element with at least an attribute
> or a child.

Yep, I understand Section 6.4, but these rules only apply directly
inside atom:entry, atom:feed, and Person constructs [and atom:source].
Foreign markup in other places isn't 6.4 markup, so what is it?
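
For reference, the Simple/Structured split quoted above is mechanical enough
to sketch in a few lines of Python. This is an illustration under
assumptions: the function name and the ex: namespace are invented, and it
relies on ElementTree not reporting xmlns declarations as attributes.

```python
import xml.etree.ElementTree as ET


def classify_extension(elem):
    """Classify a Section 6.4 extension element as 'simple' or 'structured'.

    Simple: no attributes and no child elements (empty or text-only content).
    Structured: at least one attribute or at least one child element.
    """
    has_attributes = bool(elem.attrib)   # xmlns declarations are not included
    has_children = len(elem) > 0
    return "structured" if (has_attributes or has_children) else "simple"


NS = 'xmlns:ex="http://example.com/ns"'  # hypothetical extension namespace
samples = [
    ET.fromstring("<ex:rating %s>5</ex:rating>" % NS),
    ET.fromstring('<ex:rating %s scale="10">5</ex:rating>' % NS),
    ET.fromstring("<ex:geo %s><ex:lat>26.58</ex:lat></ex:geo>" % NS),
]
kinds = [classify_extension(e) for e in samples]
print(kinds)
```

Note that the function says nothing about where the element sits, which is
exactly the gap: the same test applied inside atom:link or atom:category
gives an answer that Section 6.4 doesn't license.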

> Is there any other construct I missed?

* Elements in places that they aren't banned such as atom:category, and
depending on the outcome of this thread, atom:link - presumably
intended for future versions of Atom.  Allowed by Section 6.2.

[BUG ALERT!!! The RNG says that the content is empty, the text
doesn't]

* Attributes on any existing elements.  Allowed by Section 6.2.

-- 
Dave

Re: extension elements inside link elements?

2005-05-25 Thread David Powell

Wednesday, May 25, 2005, 10:04:52 PM, Tim Bray wrote:

> I think the notion of foreign markup exists so that we can write the
> extremely-important section 6.3, our MustIgnore assertion.  The point
> is, either software knows what to do with an extension and does it,
> or if not it's not allowed to break and should pass text through
> in contentful contexts. -Tim

Basically, I want future versions of Atom to be able to add markup
anywhere it likes (including on or inside atom:link), and Atom 1.0
should be designed to ignore such "unknown foreign markup".

What I really don't want is "unknown foreign markup" to be used as a
poor-man's "Extension Element".

Does that match the consensus of the WG?

I'm also a bit confused about the terminology in Section 6.3:

> It might be the case that the software is able to process the
> foreign markup correctly and does so. Otherwise, such markup is
> termed "unknown foreign markup".

So "unknown foreign markup" is "foreign markup" that software is
unable to process? But then 6.3 goes on to explain how to process it.
This sounds like a contradiction?

Also, in what cases would software be able to process "foreign markup"
other than by ignoring it as described in section 6.3 or treating it
as a 6.4 extension?

So I don't see how the terms "foreign markup" and "unknown foreign
markup" are any different?

Also, if 6.4 extensions are a subset of "unknown[?] foreign markup",
then the rule "[software] MUST NOT change its behavior as a result of
the markup's presence" doesn't really make sense? Surely extensions
can change the behaviour of software.

-- 
Dave

Re: Comments about Extensions (1)

2005-05-25 Thread David Powell



> Section 6.4:

> The RNGs in this section require Extension Elements to be in a
> namespace that isn't the Atom namespace. This requirement is missing
> from the text.

Just a note:

This proposal doesn't rehash the
"extensions -- Atom NS and unprefixed attributes" thread [1], because it
only applies to "6.4 Extension Elements"; not to "6.2 future
extensions to atom" - so the arguments about change control of the
specification are irrelevant.

Is this just an editorial fix on the basis that it is already in the
RNG, and 6.4.1 already implies the presence of a namespace URI; or do
I need to campaign for consensus on this?

[1] http://www.imc.org/atom-syntax/mail-archive/msg15035.html

-- 
Dave

Re: extension elements inside link elements?

2005-05-25 Thread David Powell

Wednesday, May 25, 2005, 10:04:52 PM, Tim Bray wrote:

> On May 25, 2005, at 1:40 PM, David Powell wrote:

>> What is section 6.3 "unknown foreign markup" for?

> I think the notion of foreign markup exists so that we can write the
> extremely-important section 6.3, our MustIgnore assertion.

Sorry I wasn't clear:

When I said "Section 6.3" markup, I was referring to the subset of
"Section 6.3 markup" that isn't also a "Section 6.4 extension". (we
could do with a term for this)

Section 6.4 extensions are MustIgnore, so what is this other subset of
6.3 markup for? I know that the intent is that Atom parsers
shouldn't fail on it, but is it supposed to be an extension point for
extension authors?

The problem with section 6 is that we define 3 different classes of
non-Atom markup, but don't say why we've done that, or help authors to
know which type they should choose when they write a document.

I'll have a go at writing an editorial proposal to explain things a
bit better.

It doesn't really make sense that we have thought out how to extend
atom:feed, atom:entry, atom:author etc., but atom:link is a sub-RSS
free-for-all.

-- 
Dave

Re: extension elements inside link elements?

2005-05-25 Thread David Powell

Tuesday, May 24, 2005, 9:28:09 PM, Tim Bray wrote:

> On the one hand I agree with Graham; this does feel like a
> substantial change.  On the other, it's hard to see that having stuff
> inside atom:link would do any damage; I bet most software would never
> notice it.  Having said that, I don't agree that it's an editorial
> change and I'd like to see a couple more voices in favor and give
> people a chance to shout "No!" before going ahead and doing it. -Tim

I'm either +1 or -1.  I haven't decided yet, but I'm lying in the road
on this one :)  I can't answer this until something is explained to
me:

What is section 6.3 "unknown foreign markup" for?

* We have content constructs which are extensible by allowing any type
of data to be embedded using the IANA registry of MIME types.

* We have carefully designed an IANA registry for @rel values, and used
the distributed nature of namespace URIs to allow for unregistered
values.

* We have carefully defined two classes of extension construct (Simple
and Structured) to ensure that extensions are disambiguated by
namespace URIs, and that xml:lang issues are considered. Simple
extensions make generic support for this class of extension feasible
for real-world implementations; Structured extensions are there for
those that need the extra power.

And, we have "unknown foreign markup".

I believed that the purpose of "unknown foreign markup" was to define
an error handling strategy for parsers, and to give the spec a bit of
breathing room for things like Atom 1.1.  This is great.

But I'm getting a sneaking suspicion that "unknown foreign
markup" is positioning itself as a third (poorly thought through)
class of extension point.

Please tell me that that isn't the intent?

In the future someone said:

> Hey, I've had an idea for an extension to atom:link.  The text content
> of atom:link could be a tooltip giving a long description for the link.
> It would work like this:
> 
> <link href="http://example.com/dave">
>   This is a picture of my holiday.  I'm the one on the right.
> </link>
> 
> What do you think?

In the future someone said:

> Hey, I've had an idea for an extension to atom:link.  The text content
> of atom:link could be a GUID of an ActiveX control that is capable of
> displaying the linked object.  It would work like this:
> 
> <link href="http://example.com/dave">
>   0A28AB31-3143-4b70-B969-8650BDFDBAC9
> </link>
> 
> What do you think?

-- 
Dave

Re: extension elements inside link elements?

2005-05-24 Thread David Powell

Tuesday, May 24, 2005, 7:50:13 PM, Thomas Broyer wrote:

> David Powell wrote:

>>Whether the draft allowed it or not, atom:link isn't an extension
>>point.
>>  
>>
> Could you explain why?

The following are extension points:

* Adding additional metadata to atom:feed by using Section 6.4
Extension Elements.

* Adding additional metadata to atom:entry by using Section 6.4
Extension Elements.

* Adding additional metadata to Person Constructs by using Section 6.4
Extension Elements.

* Adding new atom:link types by using new @rel attributes.

* Embedding arbitrary data by using atom:content.

Atom processors won't fail if you put elements in other places, but I
didn't think that they were extension points.
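
To show what I mean by using the extension points, here is a small Python
sketch that builds an entry exercising three of them. The ex: namespace, the
element names, and the rel IRI are all invented for the example, the Atom
namespace is assumed to be the RFC 4287 one, and the entry is not complete
(it has no atom:id or atom:updated).

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"   # assumption: RFC 4287 namespace
EX_NS = "http://example.com/ns"           # hypothetical extension namespace


def q(ns, local):
    """Build ElementTree's '{namespace}local' qualified name."""
    return "{%s}%s" % (ns, local)


entry = ET.Element(q(ATOM_NS, "entry"))
ET.SubElement(entry, q(ATOM_NS, "title")).text = "Holiday photos"

# Extension point: a Section 6.4 Extension Element as a child of atom:entry.
ET.SubElement(entry, q(EX_NS, "rating")).text = "5"

# Extension point: a new link type, named by an unregistered @rel IRI.
ET.SubElement(entry, q(ATOM_NS, "link"),
              {"rel": "http://example.com/rel/gallery",
               "href": "http://example.com/dave"})

# Extension point: arbitrary data embedded in atom:content by media type.
content = ET.SubElement(entry, q(ATOM_NS, "content"), {"type": "application/xml"})
ET.SubElement(content, q(EX_NS, "album")).text = "2005"

print(ET.tostring(entry, encoding="unicode"))
```

Nothing in it puts foreign markup on or inside atom:link itself; the link is
extended only through its rel value.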

-- 
Dave

Re: extension elements inside link elements?

2005-05-24 Thread David Powell



Tuesday, May 24, 2005, 8:24:16 PM, Graham wrote:

> On 24 May 2005, at 7:08 pm, David Powell wrote:

>> Whether the draft allowed it or not, atom:link isn't an extension
>> point. That would be Section 6.3 style "unknown foreign markup".

> Unless there's a memo I missed, extensions are a subset of "unknown
> foreign markup".

That is what I said, isn't it?  Some "unknown foreign markup" is an
extension.

-- 
Dave

Comments about Extensions (1)

2005-05-24 Thread David Powell



Section 6.4:

The RNGs in this section require Extension Elements to be in a
namespace that isn't the Atom namespace. This requirement is missing
from the text.

Proposal


Add to section 6.4.1:

> A Simple Extension Element MUST be namespace-qualified. The element
> MUST be defined outside of the Atom namespace.

Add to section 6.4.2:

> The root element of a Structured Extension element MUST be
> namespace-qualified. The element MUST be defined outside of the Atom
> namespace.
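
For implementers, the proposed requirement amounts to a two-step check on the
element name. A minimal Python sketch, assuming the RFC 4287 namespace URI;
the function name and the ex: namespace are hypothetical:

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"  # assumption: RFC 4287 namespace


def extension_name_problem(elem):
    """Return why `elem` cannot be a 6.4 Extension Element, or None if it can.

    ElementTree reports qualified names as '{namespace}local'; a tag without
    the leading brace is in no namespace at all.
    """
    tag = elem.tag
    if not tag.startswith("{"):
        return "not namespace-qualified"
    namespace = tag[1:].split("}", 1)[0]
    if namespace == ATOM_NS:
        return "defined in the Atom namespace"
    return None


candidates = [
    ET.fromstring('<ex:rating xmlns:ex="http://example.com/ns">5</ex:rating>'),
    ET.fromstring("<rating>5</rating>"),
    ET.fromstring('<rating xmlns="http://www.w3.org/2005/Atom">5</rating>'),
]
problems = [extension_name_problem(e) for e in candidates]
print(problems)
```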


-- 
Dave
