Martin Atkins wrote:
Bill de hOra wrote:
1,2,4,5 I think are duplicates, 3 depends on what activity:published
means. But on reading through the spec, I see that "5.5. The
activity:object Extension Element" is saying to use atom:* elements.
So I think the example XML is buggy, and here's my attempt at a fix by
throwing in xmlns and xml:base declarations:
Note that the top-level entry here represents the activity or action,
while the content of activity:object (which is also an Atom Entry in all
but name) represents the object of the activity.
Therefore these are in fact not duplicates. The entry-level atom:title
is a human-readable description of the activity ("Geraldine posted ...")
while the activity:object atom:title is the title of what Geraldine
posted, which in this case is a photo with the title "My Cat".
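A minimal sketch of the two-title distinction described above, using Python's standard `xml.etree.ElementTree`. The sample entry is hypothetical: the activity namespace is the one used in the snippet later in this thread, while the element layout and the wording of the activity title are assumptions made for illustration, not text copied from the spec.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ACTIVITY = "http://activitystrea.ms/spec/1.0/"  # namespace as it appears later in this thread

# Hypothetical activity entry; layout and title wording are illustrative assumptions.
ENTRY_XML = f"""
<entry xmlns="{ATOM}" xmlns:activity="{ACTIVITY}">
  <title>Geraldine posted a photo</title>
  <activity:object>
    <title>My Cat</title>
  </activity:object>
</entry>
""".strip()


def titles(entry_xml):
    """Return (activity_title, object_title) for one activity entry."""
    entry = ET.fromstring(entry_xml)
    # Direct child of the entry: human-readable description of the activity.
    activity_title = entry.findtext(f"{{{ATOM}}}title")
    # Child of activity:object: title of the thing the activity is about.
    object_title = entry.findtext(f"{{{ACTIVITY}}}object/{{{ATOM}}}title")
    return activity_title, object_title
```

Calling `titles(ENTRY_XML)` yields one title per level, which is why the two elements are not duplicates even though both are atom:title.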
- With that many elements there will be boilerplate data injected into
one or the other. James has already mentioned this, and it is an age-old
problem with markup, one that predates XML and HTML. Given what I know
about event formats in general and SN-specific data like activities,
which I have to aggregate, I think you have too many tags, which leads
me to...
- ...volume. My rule of thumb is that each interesting thing that
happens in a web system causes about 10 events and you should plan to
scale accordingly. I believe social network activity to be like this. My
guess is that the verbosity of the proposal (which I don't normally care
about) will be an adoption factor.
And even if you are right about non-duplication, there are still
two arguments to be had:
- rename all the activity elements so people are not confused about
the purpose of the elements (I don't necessarily agree with this
approach, but it's real)
- either remove atom:source from activity:object or type-qualify
atom:source as being the source of the action and lift it up. I doubt
you need both elements, unless there is a use case where an
intermediary like FriendFeed needs atom:source for something else.
You are correct that in this case there is ambiguity for a few elements:
* If the verb is something other than "post", atom:published becomes the
activity time rather than the published time. The definition of the
"post" verb means that for that verb the activity time and the
publication time are always the same, since it was not published until
it was posted.
* atom:author becomes the activity actor rather than the author of the
object. Again, in the "post" case these are the same, but for other
verbs they are not necessarily the same. Unfortunately existing practice
constrains the solution here; for example, YouTube's "favorite" feeds
have atom:author set to the user that owns the feed, not the user who
authored the video. Therefore the spec currently says that the author
belongs to the activity, not to the object.
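The verb-dependent reading of atom:published and atom:author described above could be sketched as follows. This is only an illustration of the rule as stated in this message; the function name, the dictionary keys, and the "post" verb URI are assumptions, not definitions from the spec.

```python
# Assumed URI for the "post" verb; the thread does not spell it out.
POST_VERB = "http://activitystrea.ms/schema/1.0/post"


def interpret(verb, published, author):
    """Map entry-level atom:published / atom:author onto activity and object.

    The entry-level values always describe the activity. Only for the
    "post" verb can they also be read as describing the object; for any
    other verb the object's own values are unknown from these two
    elements alone, so None is returned for them.
    """
    is_post = verb == POST_VERB
    return {
        "activity_time": published,
        "actor": author,
        "object_published": published if is_post else None,
        "object_author": author if is_post else None,
    }
```

For a "favorite" style verb, for instance, `interpret` reports the feed owner as the actor and leaves the object's author unset, which matches the YouTube behaviour mentioned above.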
In markup-land, these are called co-occurrence constraints. I would try
to avoid them if possible :)
- I suspect activity:type could be a machine tag, and therefore using
atom:category with a scheme could be an option. This would allow a
useful extension point.
I did actually already get feedback in favor of machine tags, but they
made me wary for two reasons:
* They are not yet widely known or adopted, as far as I can tell.
* (and this point relates to the use of atom:category in general) many
existing aggregators display to the user all categories regardless of
scheme, and a design goal of this specification is for the new
annotations to cause no visible difference in legacy software, so that
there will be less resistance to adoption.
Makes sense to me.
- If this was an IETF standards track, I would suggest an IANA
registry for activity:verbs.
Do you intend the IANA registry to replace the use of URIs as the
extension mechanism with simple keywords?
No, just to act as social control as per the link registry. If these
will be acting as switch-on-type controls in code rather than data to be
rendered out, then I can't imagine verb proliferation being beneficial
(what will happen is that the problem Dave Winer has identified with
photo media and which Dare Obasanjo has explained will get pushed around
rather than solved - activity:verb reminds me of soap:action in that
sense).
Right now the main "inspiration" for this spec is the syndication-based
activity aggregators such as FriendFeed, Plaxo, Movable Type Action
Streams, etc, but I agree that it is important to align with OpenSocial
too.
I work on aggregation for mobile systems, and while it's not a format
issue, I suspect the amount of data being served to describe the
activity is high enough to impact services that are not flat-rated. Do
the spec backers want to send that much data down to a handset?
- There is repetition of data - practically speaking there are a lot
more activity events than blog posts, I'd guess easily one order of
magnitude, probably two. Having a uniform means of serving them will
result in more events, maybe getting us to three orders. I imagine
we'll see claims the web will collapse under the weight.
The "implied activity" shorthand is intended to address this. In the
weblog case, one needs only to add a single element to each entry to
mark it up as a weblog entry:
<object-type xmlns="http://activitystrea.ms/spec/1.0/">
http://activitystrea.ms/schema/1.0/blog-entry/
</object-type>
though of course in practice many activity consumers are likely to
present an un-annotated entry much like a weblog entry because that's
what they do today.
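As a rough sketch of how small the "implied activity" change is for a weblog publisher, the following adds the single object-type element to an existing Atom entry with Python's standard `xml.etree.ElementTree`. The namespace and the blog-entry URI are taken from the snippet above; the function name `annotate` and the sample entry in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ACTIVITY = "http://activitystrea.ms/spec/1.0/"
BLOG_ENTRY = "http://activitystrea.ms/schema/1.0/blog-entry/"


def annotate(entry_xml, object_type=BLOG_ENTRY):
    """Return the entry with one activity object-type element appended.

    Nothing already in the entry is changed or removed; the annotation
    is purely additive.
    """
    entry = ET.fromstring(entry_xml)
    marker = ET.SubElement(entry, f"{{{ACTIVITY}}}object-type")
    marker.text = object_type
    return ET.tostring(entry, encoding="unicode")
```

For example, `annotate('<entry xmlns="http://www.w3.org/2005/Atom"><title>Hello</title></entry>')` returns the same entry with one extra child element carrying the blog-entry URI.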
Right, and this approach is closer to how MediaRSS acts with respect to RSS/Atom,
which leaves the door open for broadly consistent profiling of feed data
down the line.
The work for things such as photos and videos is slightly more since
some providers would need to add thumbnail links and so forth, but it
was a very strong requirement that existing feeds would require only
minimal changes -- and, in most cases, only additions -- to hit the
important use-cases for this spec.
I think this is more an abbreviated syntax than an implied activity and
I'd strongly suggest you pick just one and spec it - keeping client and
server dev work to a minimum ups the chance of adoption. [Having an
abbreviated syntax without question damaged RDF/XML].
- Having duplicate places to put data will easily confuse publishers,
and clients will be confused by data acquisition from parent elements.
This is just how things are with XML modelling.
The most important feedback I've taken from this is that the spec should
be re-organised to emphasise the implied activity shorthand rather than
starting with the full "activity entry" case, since I expect only
activity aggregator services to publish the latter, and indeed the
motivation for this separation is that these publishers already have
feeds that represent activities rather than objects.
That and the fact that everyone who uses this spec will have to think
about where to put the data instead of being able to guess right. Reduce
that burden!
Thanks for the very detailed feedback. Six Apart is hosting a meet-up
regarding activity streams publishing this afternoon[1] where we can
take a look at the feedback we've received.
Good luck with it.
Bill