Re: RSS extensibility

Antone Roundy Sat, 08 Jan 2005 17:26:29 -0800

On Saturday, January 8, 2005, at 08:00 AM, Henry Story wrote:

Now the problem with the graph above is that it does not make sense to assign a geo location to an entry. This will probably not be allowed by the geo ontology, which will probably specify some location class to be the subject of such a relation. One may want to imagine some weird monster called an Entry-geolocation, but it will have all kinds of weird properties. So let us throw that monster out of the window, even though it makes XML sense: it does not make any other sense at all.
This is probably what was meant:
_f1 ---is a---> <Feed>
 |----head----> _h1
 |               |---author-> _p1 ---is a---> <Person>
 |                             |----email---> [EMAIL PROTECTED]
 |
 |----entry---> _e1 --is a---> <Entry>
                 |----id-----> <http://123>
                 |---date----> "2005-02-02T13:05:04"^^xsd:dateTime
                 |--content--> _p1 ---is a ---> <foaf:Person>
                                |--foaf:home--> _loc ---is a---><geo:Location>
                                |                |---geo:x--->"10.1"
                                |                |---geo:y--->"57.3"
                                |--foaf:mbox--> <mailto:[EMAIL PROTECTED]>

_f2 ---is a---> <Feed>
 |--head--> ....
 |--entry-> _e2 --is a---> <Entry>
             |----id-----> <http://example/123>
             |---date----> "2005-03-02T07:05:04"^^xsd:dateTime
             |--content--> _event1 ---is a ---> <phy:Event>
                            |--phy:date-->"2005-02-02T13:05:04"^^xsd:dateTime
                            |--phy:loc---> _loc2 --is a ---> <geo:Location>
                            |               |--geo:x-------> "10.1"
                            |               |--geo:y-------> "57.3"
                            |--seismo:magnitude-->"7"

[WARNING: What follows is a long exploration of a few issues including the foundations of structural choices for XML. You may want to skim it or ignore it completely.]

Good explanation. I see what you're saying. But I'm not convinced of one thing: that it doesn't make sense to assign a geo location to an entry. Perhaps if my mental model of what an entry is were different, I'd feel differently, but it makes as much sense to me for a publisher to want to attach a location to an entry as it does to attach a title to an entry, for example.

So what is an entry? My first thought is that it is content and metadata about the content. But thinking of it that way, I run into a little problem: is the metadata about the content, or about the entry? For example, is the creation date the creation date of the content or of the entry? It could be both in some cases, but the only thing it is guaranteed to be is the creation date of the entry. So I think we can rule out the idea of an entry being just content and metadata about the content. But there's a third possibility, which coincides with your diagram: an entry is content, metadata about the content, and metadata about the entry itself.

So a few questions arise:

1) What metadata can be attached to the content?

2) What metadata can be attached to the entry?

3) Is there a practical way to always have metadata appear as an attribute on or child of the thing it describes?

To the content: if, as in your _e2, the content is an event, the location where the event occurred makes sense as metadata. I don't think I need to explore this any further.

To the entry: the creation date of the entry makes sense as metadata in a very concrete way. I think it's easy to imagine wanting to attach an ID to the entry, but that's a very different thing from the creation date--it is not an inherent property of an entry, but something we attach to it by choice for convenience in referring to or identifying the entry. Similarly, we can assign a title to the entry, which could even conceivably be different from the title of the content, for example, "An essay about dogs" vs. "An entry that carries as its core content an essay about dogs." The title of the content is more obviously useful metadata, but the title of the entry certainly isn't ruled out. Getting back to geo location--it's easy to imagine what the geo location of content would be (the place where something occurred, the place that something is about, the place where the content was written, etc.). I'll admit, I can't imagine what "the location of an entry" would mean. So, on to question #3.

Looking at your example, you have the content of one entry being a person, and the other being an event. Obviously, a physical person doesn't travel around the world through the wires wherever the entry is transmitted, and the event doesn't occur inside the XML. So I suppose it would be more accurate to state that the content is information about a person or about an event--i.e., it's all metadata. In that case, I'm not quite sure how to even think about the question of the metadata appearing as a child of the content which it describes, because the content is only referred to; it's not actually there. When the content is actually there, as in the case of inline textual content, making the metadata a child of the content that it describes wouldn't really work. For example, if the content is the sentence "This is a pen.", you wouldn't want your XML to look like this...or at least I wouldn't--it would be too difficult to separate the content from its metadata:

<content>
        This is a pen.
        <description>The first English sentence taught to Japanese students.</description>
</content>

So we end up violating the pure tree structure ideal in one way or another:

A) (metadata describing its sibling):

<content>This is a pen</content>
<description>The first English sentence taught to Japanese students.</description>

B) or (...metadata describing its sibling? or just encapsulation of the actual content for parsing convenience):

<content>
        <theactualcontent>This is a pen.</theactualcontent>
        <description>The first English sentence taught to Japanese students.</description>
</content>

C) or (fabricating a method of referring to the content that actually lives elsewhere--possibly in an XML document as in the first example, possibly not, as in the second):

<content about="foo">
        <description>The first English sentence taught to Japanese students.</description>
</content>
<thing id="foo">This is a pen.</thing>

<content about="Antone">
        <description>The guy typing this email.</description>
</content>
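
To make option C a bit more concrete, here is a rough Python sketch of how a consumer might pair each description with the thing it is about. The <entry> wrapper element and the resolve_descriptions helper are my own assumptions for illustration, not part of any feed format; the about/id matching is just the ad hoc convention from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper: <entry> is an assumed root holding option C's markup.
DOC = """
<entry>
  <content about="foo">
    <description>The first English sentence taught to Japanese students.</description>
  </content>
  <thing id="foo">This is a pen.</thing>
  <content about="Antone">
    <description>The guy typing this email.</description>
  </content>
</entry>
"""


def resolve_descriptions(xml_text):
    """Pair each description with the thing its <content about="..."> refers to.

    Returns (about, description, target_text) tuples; target_text is None
    when the referenced thing does not live inside this document.
    """
    root = ET.fromstring(xml_text)
    # Index every element that carries an id attribute.
    by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}
    results = []
    for content in root.iter("content"):
        about = content.get("about")
        if about is None:
            continue
        target = by_id.get(about)
        has_text = target is not None and target.text
        target_text = target.text.strip() if has_text else None
        results.append((about, content.findtext("description"), target_text))
    return results


pairs = resolve_descriptions(DOC)
for about, description, target_text in pairs:
    print(about, "|", description, "|", target_text)
```

The "foo" reference resolves to text inside the document, while "Antone" resolves to nothing, which is the case where the content lives outside the XML altogether.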

I think A and C make the most sense, which is probably why those are the ways most of us have settled on. While A probably makes less sense than C from a purely technical point of view, it's intuitively easy to understand, and is the least verbose, so I for one prefer that syntax. However, it leads to a situation where it's not explicitly clear whether metadata describes its sibling or its parent. As long as things are not allowed to get too convoluted--i.e., children of <entry> can ONLY be metadata about <entry> or <content>, not about any of the other children of <entry>--things don't get unmanageable.
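
As a rough illustration of that rule with syntax A, here is a Python sketch that sorts the children of <entry> into "about the entry" and "about the content". Since the markup itself can't say which is which, the sketch assumes an out-of-band table (DESCRIBES_CONTENT); that table, the split_metadata helper, and the <created> and <location> element names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical convention: children of <entry> that describe the content
# rather than the entry. Syntax A cannot express this, so it is assumed
# to be agreed on out of band.
DESCRIBES_CONTENT = {"description", "location"}

ENTRY = """
<entry>
  <id>http://example/123</id>
  <created>2005-03-02T07:05:04</created>
  <content>This is a pen.</content>
  <description>The first English sentence taught to Japanese students.</description>
</entry>
"""


def split_metadata(xml_text):
    """Return (content, metadata about the entry, metadata about the content)."""
    entry = ET.fromstring(xml_text)
    content = entry.findtext("content")
    about_entry, about_content = {}, {}
    for child in entry:
        if child.tag == "content":
            continue  # the content itself is not metadata
        bucket = about_content if child.tag in DESCRIBES_CONTENT else about_entry
        bucket[child.tag] = child.text
    return content, about_entry, about_content


content, about_entry, about_content = split_metadata(ENTRY)
print(content)
print(about_entry)
print(about_content)
```

Every child lands in exactly one of the two buckets, which is all the "ONLY about <entry> or <content>" restriction buys us; anything describing another sibling would need something like syntax C.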

Anyway, coming full circle to assigning a geo location to an entry...I guess technically, the geo location is probably really the location of the content. But it makes sense to me to have it appear as a child of <entry>. And I generally don't feel the need to differentiate between the location of the content and the location of the entry, because, the content being the central data of the entry, I rarely feel the need to think of them as different things--my mental model sort of mushes entries and their content together.

Whew. I feel better now. Sorry if you read that and didn't get anything out of it.

Antone
