Re: Some Draft Atom-related Specs for Activity Streams

Martin Atkins Sun, 11 Jan 2009 20:36:07 -0800


Bill de hOra wrote:

Martin Atkins wrote:
Bill de hOra wrote:
1,2,4,5 I think are duplicates, 3 depends on what activity:published means. But on reading through the spec, I see that "5.5. The activity:object Extension Element" is saying to use atom:* elements. So I think the example XML is buggy, and here's my attempt at a fix by throwing in xmlns and xml:base declarations:
Note that the top-level entry here represents the activity or action, while the content of activity:object (which is also an Atom Entry in all but name) represents the object of the activity.
Therefore these are in fact not duplicates. The entry-level atom:title is a human-readable description of the activity ("Geraldine posted ...") while the activity:object atom:title is the title of what Geraldine posted, which in this case is a photo with the title "My Cat".
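
To make the two titles concrete, here is a minimal Python sketch of a consumer reading both of them from one entry. The activity namespace URI and the sample entry are assumptions made for illustration, not text copied from the draft; only the Atom namespace is certain.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
# Assumption: namespace URI of the activity extension; check the draft.
ACTIVITY = "http://activitystrea.ms/spec/1.0/"
NS = {"atom": ATOM, "activity": ACTIVITY}

# Hypothetical sample data: the entry-level title wording is made up.
ENTRY_XML = f"""
<entry xmlns="{ATOM}" xmlns:activity="{ACTIVITY}">
  <title>Geraldine posted a photo</title>
  <activity:object>
    <title>My Cat</title>
  </activity:object>
</entry>
"""

def titles(entry_xml):
    """Return (activity description, title of the object)."""
    entry = ET.fromstring(entry_xml.strip())
    # Direct child of the entry: describes the activity itself.
    activity_title = entry.findtext("atom:title", namespaces=NS)
    # Nested inside activity:object: the title of the thing acted upon.
    object_title = entry.findtext("activity:object/atom:title", namespaces=NS)
    return activity_title, object_title

print(titles(ENTRY_XML))
```

A consumer that knows nothing about activities only ever sees the first value, which is why the two cannot share one element.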
- with that many elements there will be boilerplate data injected into one or the other. James has already mentioned this, and it is an age old problem with markup, one that predates XML and HTML. Given what I know about event formats in general and SN specific data like activities which I have to aggregate, I think you have too many tags, which leads me to...
- ...volume. My rule of thumb is that each interesting thing that happens in a web system causes about 10 events and you should plan to scale accordingly. I believe social network activity to be like this. My guess is that verbosity of the proposal (which I don't normally care about) will be an adoption factor.

Activity aggregator services already use the entry-level title to encode a human-readable activity description, and they do this for the benefit of non-activity-aware consumers, so the actual title of the object must be in a distinct element.

My intent with activity:object was to be able to re-use the Atom elements which are already understood rather than inventing new elements. I consider this to be similar in design to atom:source, which re-uses the existing Feed-level elements.

And even if you are right about non-duplication, then there are still two arguments to be had,
- rename all the activity elements so people are not confused about the purpose of the elements (I don't necessarily agree with this approach, but it's real)

So you believe that re-using the Atom elements inside activity:object is confusing? If this is the case, then my design has indeed failed, since my intention was exactly the opposite... that activity:object/atom:title would be understood to mean "the title of the object" based on people's existing understanding of what atom:title means.

- either remove atom:source from activity:object or type qualify atom:source as being the source of the action and lift it up. I have a doubt you need both elements, unless there is a use case where an intermediary like friendfeed needs atom:source for something else.

atom:source inside activity:object tells us where the aggregator (i.e. FriendFeed) found the object.

atom:source as a sibling of activity:object comes into play when the source feed already contains activity entries; FriendFeed is not synthesizing implied activities in this case.

The most readily-available example of such an instance is when I feed the output of Plaxo (another activity aggregator) into FriendFeed. In that case, there ought to be an atom:source sibling to activity:object that refers to the Plaxo feed, while the atom:source inside activity:object would be the source as far as *Plaxo* is concerned.


Without this distinction, the following activity published on Plaxo:
   "Martin posted a photo on Flickr"
would become the following when read into FriendFeed:
   "Martin posted a photo on Plaxo"
or worse,
   "Martin posted 'Martin posted a photo on Flickr' on Plaxo"
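
As a rough illustration of how a consumer might keep the two sources apart, here is a small Python sketch. The sample entry, the activity namespace URI and the "(via ...)" rendering are all assumptions made for this example; only the placement of the two atom:source elements follows the description above.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
# Assumption: namespace URI of the activity extension; check the draft.
ACTIVITY = "http://activitystrea.ms/spec/1.0/"
NS = {"atom": ATOM, "activity": ACTIVITY}

# Hypothetical entry as FriendFeed might republish it after reading Plaxo.
ENTRY_XML = f"""
<entry xmlns="{ATOM}" xmlns:activity="{ACTIVITY}">
  <author><name>Martin</name></author>
  <source><title>Plaxo</title></source>
  <activity:object>
    <source><title>Flickr</title></source>
  </activity:object>
</entry>
"""

def describe(entry_xml):
    """Build a sentence that names the object's source, not the relay."""
    entry = ET.fromstring(entry_xml.strip())
    actor = entry.findtext("atom:author/atom:name", namespaces=NS)
    # Source inside activity:object: where the object was found.
    object_source = entry.findtext(
        "activity:object/atom:source/atom:title", namespaces=NS)
    # Source as a sibling of activity:object: the feed that carried the activity.
    activity_source = entry.findtext("atom:source/atom:title", namespaces=NS)
    sentence = f"{actor} posted a photo on {object_source}"
    if activity_source:
        sentence += f" (via {activity_source})"  # illustrative rendering only
    return sentence

print(describe(ENTRY_XML))
```

With only one atom:source available, the function would have no way to tell Flickr from Plaxo, which is the failure described above.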

Do you believe that this level of indirection is counter-intuitive?

You are correct that in this case there is ambiguity for a few elements:
* If the verb is something other than "post", atom:published becomes the activity time rather than the published time. The definition of the "post" verb means that for that verb the activity time and the publication time are always the same, since it was not published until it was posted.
* atom:author becomes the activity actor rather than the author of the object. Again, in the "post" case these are the same, but for other verbs they are not necessarily the same. Unfortunately existing practice constrains the solution here; for example, YouTube's "favorite" feeds have atom:author set to the user that owns the feed, not the user who authored the video. Therefore the spec currently says that the author belongs to the activity, not to the object.
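
The two rules above can be sketched as a small Python function. The verb URIs and the function name are assumptions made for illustration; the logic only restates the rules as given: published and author always belong to the activity, and they also describe the object only when the verb is "post".

```python
# Assumption: verb URIs under http://activitystrea.ms/; exact paths may differ.
POST = "http://activitystrea.ms/schema/1.0/post"
FAVORITE = "http://activitystrea.ms/schema/1.0/favorite"

def interpret(verb, published, author):
    """Map entry-level atom:published and atom:author to activity roles."""
    is_post = verb == POST
    return {
        # Always true under the current draft: these belong to the activity.
        "activity_time": published,
        "actor": author,
        # Known for the object only when the verb is "post"; otherwise the
        # entry-level elements say nothing about the object itself.
        "object_published": published if is_post else None,
        "object_author": author if is_post else None,
    }

print(interpret(FAVORITE, "2009-01-11T20:36:07-08:00", "Martin"))
```

For a "favorite" entry the object's own author and publication time come back as None, which mirrors the YouTube case: the feed owner is the actor, not the author of the video.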
In markup-land, these are called co-occurrence constraints. I would try to avoid them if possible :)

As I'm sure you will have seen, I've started a separate thread about this, since I believe the problem is more general than AtomActivity.

- If this was an IETF standards track, I would suggest an IANA registry for activity:verbs.
Do you intend the IANA registry to replace the use of URIs as the extension mechanism with simple keywords?
No, just to act as social control as per the link registry. If these will be acting as switch-on-type controls in code rather than data to be rendered out, then I can't imagine verb proliferation being beneficial (what will happen is that the problem Dave Winer has identified with photo media and which Dare Obasanjo has explained will get pushed around rather than solved - activity:verb reminds me of soap:action in that sense).

Okay, I understand. As currently defined, the ActivitySchema spec is effectively the registry and I suppose you need to go through me to get new verbs under http://activitystrea.ms/ added; once the spec is final, it would make sense to have a mechanism like an IANA registry to allow other specifications to extend the core set.

Right now the main "inspiration" for this spec are the syndication-based activity aggregators such as FriendFeed, Plaxo, Movable Type Action Streams, etc, but I agree that it is important to align with OpenSocial too.
I work on aggregation for mobile systems, and while it's not a format issue, I suspect the amount of data being served to describe the activity is high enough to impact services that are not flat-rated. Do the spec backers want to send that much data down to a handset?

This is of course a concern; adding more data will always disadvantage those consumers that are not making use of that data.

If there is an activity-aware aggregator running on a mobile device it will presumably be interested in this extra data, but a generic aggregator will of course ignore it having already spent the data transfer on it.

As far as I can tell, the best way that the industry has found to deal with this problem thus far is to serve different content to constrained devices than to traditional computers, whether done by providing two parallel resources (m.sitename.com vs. www.sitename.com) or by using heuristics to determine whether a user-agent is resource-constrained and automatically delivering the constrained version.

On the other hand, it could be argued that AtomActivity would enable a mobile-based activity aggregator to use a single feed on FriendFeed in place of fetching a number of site feeds individually and mimicking the functionality of FriendFeed client-side, which would presumably be more efficient overall.

The work for things such as photos and videos is slightly more since some providers would need to add thumbnail links and so forth, but it was a very strong requirement that existing feeds would require only minimal changes -- and, in most cases, only additions -- to hit the important use-cases for this spec.
I think this is more an abbreviated syntax than an implied activity and I'd strongly suggest you pick just one and spec it - keeping client and server dev work to a minimum ups the chance of adoption. [Having an abbreviated syntax without question damaged RDF/XML].

The two kinds of entry already exist in the wild, so I'm wary of excluding either from the spec. FriendFeed would be unable to use the implied activity shorthand without changing the content of their feeds as perceived by non-activity aggregators, and the reverse is true for sites such as YouTube and Flickr that currently publish what I refer to as "object entries".
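
A consumer can cope with both kinds of entry in a few lines. The Python sketch below is only an illustration: the namespace URI, the sample entries and the rule that an entry with no activity:object is itself the object are assumptions about how the shorthand would be read.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
# Assumption: namespace URI of the activity extension; check the draft.
ACTIVITY = "http://activitystrea.ms/spec/1.0/"
NS = {"atom": ATOM, "activity": ACTIVITY}

# Hypothetical "object entry", as a photo site might publish it.
OBJECT_ENTRY = f"""
<entry xmlns="{ATOM}"><title>My Cat</title></entry>
"""

# Hypothetical full "activity entry", as an aggregator might publish it.
ACTIVITY_ENTRY = f"""
<entry xmlns="{ATOM}" xmlns:activity="{ACTIVITY}">
  <title>Geraldine posted a photo</title>
  <activity:object><title>My Cat</title></activity:object>
</entry>
"""

def object_title(entry_xml):
    """Return (is_implied_activity, title of the object)."""
    entry = ET.fromstring(entry_xml.strip())
    obj = entry.find("activity:object", NS)
    if obj is None:
        # Shorthand: the entry itself is taken to be the object.
        return True, entry.findtext("atom:title", namespaces=NS)
    # Full form: the object lives inside activity:object.
    return False, obj.findtext("atom:title", namespaces=NS)

print(object_title(OBJECT_ENTRY))
print(object_title(ACTIVITY_ENTRY))
```

Either way the consumer ends up with the same object title, so neither kind of publisher has to change what non-activity-aware readers see.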

- Having duplicate places to put data will easily confuse publishers, as will clients be confused by data acquisition from parent elements. This is just how things are with XML modelling.
The most important feedback I've taken from this is that the spec should be re-organised to emphasise the implied activity shorthand rather than starting with the full "activity entry" case, since I expect only activity aggregator services to publish the latter, and indeed the motivation for this separation is that these publishers already have feeds that represent activities rather than objects.
That and the fact that everyone who uses this spec will have to think about where to put the data instead of being able to guess right. Reduce that burden!

It is my current belief that legacy software (both on the publisher and consumer side) requires at least some duplication in the "activity entry" case, but I will definitely take your feedback on board and see if it can be made clearer and less redundant.


Thanks,
Martin
