Bill de hOra wrote:
Martin Atkins wrote:
Bill de hOra wrote:
1,2,4,5 I think are duplicates, 3 depends on what activity:published
means. But on reading through the spec, I see that "5.5. The
activity:object Extension Element" is saying to use atom:* elements.
So I think the example XML is buggy, and here's my attempt at a fix
by throwing in xmlns and xml:base declarations:
Note that the top-level entry here represents the activity or action,
while the content of activity:object (which is also an Atom Entry in
all but name) represents the object of the activity.
Therefore these are in fact not duplicates. The entry-level atom:title
is a human-readable description of the activity ("Geraldine posted
...") while the activity:object atom:title is the title of what
Geraldine posted, which in this case is a photo with the title "My Cat".
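
To make the two titles concrete, here is a minimal sketch of how a consumer might read them. The activity namespace URI is a placeholder assumed for illustration, not the one the spec defines, and the sample entry is cut down to just the two titles.

```python
import xml.etree.ElementTree as ET

# The Atom namespace is the standard one. The activity namespace below is a
# placeholder assumed for this sketch, not taken from the spec.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "activity": "http://example.org/activity",
}

ENTRY_XML = """
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:activity="http://example.org/activity">
  <title>Geraldine posted ...</title>
  <activity:object>
    <title>My Cat</title>
  </activity:object>
</entry>
"""


def titles(entry_xml):
    """Return (activity description, object title) for one activity entry."""
    entry = ET.fromstring(entry_xml.strip())
    # Entry-level title: human-readable description of the activity.
    activity_title = entry.findtext("atom:title", namespaces=NS)
    # Title inside activity:object: the title of the thing acted upon.
    object_title = entry.findtext("activity:object/atom:title", namespaces=NS)
    return activity_title, object_title


print(titles(ENTRY_XML))
```

A consumer that knows nothing about activities sees only the entry-level title, which is why the object's own title has to live somewhere else.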
- with that many elements there will be boilerplate data injected into
one or the other. James has already mentioned this, and it is an age old
problem with markup, one that predates XML and HTML. Given what I know
about event formats in general and SN specific data like activities
which I have to aggregate, I think you have too many tags, which leads
me to...
- ...volume. My rule of thumb is that each interesting thing that
happens in a web system causes about 10 events and you should plan to
scale accordingly. I believe social network activity to be like this. My
guess is that the verbosity of the proposal (which I don't normally care
about) will be an adoption factor.
Activity aggregator services already use the entry-level title to encode
a human-readable activity description, and they do this for the
benefit of non-activity-aware consumers, so the actual title of the
object must be in a distinct element.
My intent with activity:object was to be able to re-use the Atom
elements which are already understood rather than inventing new
elements. I consider this to be similar in design to atom:source, which
re-uses the existing Feed-level elements.
And even if you are right about non-duplication, then there are still
two arguments to be had,
- rename all the activity elements so people are not confused about the
purpose of the elements (I don't necessarily agree with this approach,
but it's real)
So you believe that re-using the Atom elements inside activity:object is
confusing? If this is the case, then my design has indeed failed, since
my intention was exactly the opposite... that activity:object/atom:title
would be understood to mean "the title of the object" based on people's
existing understanding of what atom:title means.
- either remove atom:source from activity:object or type qualify
atom:source as being the source of the action and lift it up. I doubt
you need both elements, unless there is a use case where an
intermediary like friendfeed needs atom:source for something else.
atom:source inside activity:object tells us where the aggregator (i.e.
FriendFeed) found the object.
atom:source as a sibling of activity:object comes into play when the
source feed already contains activity entries; FriendFeed is not
synthesizing implied activities in this case.
The most readily-available example of such an instance is when I feed
the output of Plaxo (another activity aggregator) into FriendFeed. In
that case, there ought to be an atom:source sibling to activity:object
that refers to the Plaxo feed, while the atom:source inside
activity:object would be the source as far as *Plaxo* is concerned.
Without this distinction, the following activity published on Plaxo:
"Martin posted a photo on Flickr"
would become the following when read into FriendFeed:
"Martin posted a photo on Plaxo"
or worse,
"Martin posted 'Martin posted a photo on Flickr' on Plaxo"
Do you believe that this level of indirection is counter-intuitive?
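
As a rough sketch of the Plaxo-into-FriendFeed case, a consumer could read the two atom:source elements separately as shown here. The activity namespace URI is a placeholder assumed for illustration, and using atom:title inside atom:source to name the feed is a simplification for the example.

```python
import xml.etree.ElementTree as ET

# The activity namespace is an assumed placeholder, not the spec's own URI.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "activity": "http://example.org/activity",
}

# Hypothetical entry as FriendFeed might republish an activity read from Plaxo.
FRIENDFEED_ENTRY = """
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:activity="http://example.org/activity">
  <title>Martin posted a photo on Flickr</title>
  <source><title>Plaxo</title></source>
  <activity:object>
    <source><title>Flickr</title></source>
  </activity:object>
</entry>
"""


def sources(entry_xml):
    """Return (where the activity was found, where the object was found)."""
    entry = ET.fromstring(entry_xml.strip())
    # Sibling of activity:object: the feed that already carried the activity.
    activity_source = entry.findtext("atom:source/atom:title", namespaces=NS)
    # Inside activity:object: the feed the object itself came from.
    object_source = entry.findtext(
        "activity:object/atom:source/atom:title", namespaces=NS
    )
    return activity_source, object_source


print(sources(FRIENDFEED_ENTRY))
```

Keeping the two values apart is what lets the consumer still say "on Flickr" rather than "on Plaxo".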
You are correct that in this case there is ambiguity for a few elements:
* If the verb is something other than "post", atom:published becomes
the activity time rather than the published time. The definition of
the "post" verb means that for that verb the activity time and the
publication time are always the same, since it was not published until
it was posted.
* atom:author becomes the activity actor rather than the author of the
object. Again, in the "post" case these are the same, but for other
verbs they are not necessarily the same. Unfortunately existing
practice constrains the solution here; for example, YouTube's
"favorite" feeds have atom:author set to the user that owns the feed,
not the user who authored the video. Therefore the spec currently says
that the author belongs to the activity, not to the object.
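
The two bullets above can be sketched as a small interpretation rule. The `Entry` type, the short verb names, and the sample values are hypothetical and exist only for this illustration; the spec identifies verbs by URI.

```python
from dataclasses import dataclass

# Hypothetical short verb name for illustration; real verbs are URIs.
POST = "post"


@dataclass
class Entry:
    verb: str
    author: str     # value of atom:author
    published: str  # value of atom:published


def interpret(entry):
    """Say what atom:author and atom:published tell us for a given verb."""
    result = {
        # These two always describe the activity itself.
        "actor": entry.author,
        "activity_time": entry.published,
        # For verbs other than "post" the entry-level elements say nothing
        # about who authored the object or when it was published.
        "object_author": None,
        "object_published": None,
    }
    if entry.verb == POST:
        # Posting is publishing, so actor and time carry over to the object.
        result["object_author"] = entry.author
        result["object_published"] = entry.published
    return result


favorite = interpret(Entry("favorite", "martin", "2008-01-01T00:00:00Z"))
posted = interpret(Entry(POST, "martin", "2008-01-01T00:00:00Z"))
print(favorite["object_author"], posted["object_author"])
```

This mirrors the YouTube "favorite" case: the author on the entry is the user who favorited the video, and the video's own author is simply not recoverable from that element.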
In markup-land, these are called co-occurrence constraints. I would try
to avoid them if possible :)
As I'm sure you will have seen, I've started a separate thread about
this, since I believe the problem is more general than AtomActivity.
- If this was an IETF standards track, I would suggest an IANA
registry for activity:verbs.
Do you intend the IANA registry to replace the use of URIs as the
extension mechanism with simple keywords?
No, just to act as social control as per the link registry. If these
will be acting as switch-on-type controls in code rather than data to be
rendered out, then I can't imagine verb proliferation being beneficial
(what will happen is that the problem Dave Winer has identified with
photo media and which Dare Obasanjo has explained will get pushed around
rather than solved - activity:verb reminds me of soap:action in that
sense).
Okay, I understand. As currently defined, the ActivitySchema spec is
effectively the registry and I suppose you need to go through me to get
new verbs under http://activitystrea.ms/ added; once the spec is final,
it would make sense to have a mechanism like an IANA registry to allow
other specifications to extend the core set.
Right now the main "inspiration" for this spec is the
syndication-based activity aggregators such as FriendFeed, Plaxo,
Movable Type Action Streams, etc, but I agree that it is important to
align with OpenSocial too.
I work on aggregation for mobile systems, and while it's not a format
issue, I suspect the amount of data being served to describe the
activity is high enough to impact services that are not flat-rated. Do
the spec backers want to send that much data down to a handset?
This is of course a concern; adding more data will always disadvantage
those consumers that are not making use of that data.
If there is an activity-aware aggregator running on a mobile device it
will presumably be interested in this extra data, but a generic
aggregator will of course ignore it having already spent the data
transfer on it.
As far as I can tell, the best way that the industry has found to deal
with this problem thus far is to serve different content to constrained
devices than to traditional computers, whether done by providing two
parallel resources (m.sitename.com vs. www.sitename.com) or by using
heuristics to determine whether a user-agent is resource-constrained and
automatically delivering the constrained version.
On the other hand, it could be argued that AtomActivity would enable a
mobile-based activity aggregator to use a single feed on FriendFeed in
place of fetching a number of site feeds individually and mimicking the
functionality of FriendFeed client-side, which would presumably be more
efficient overall.
The work for things such as photos and videos is slightly more since
some providers would need to add thumbnail links and so forth, but it
was a very strong requirement that existing feeds would require only
minimal changes -- and, in most cases, only additions -- to hit the
important use-cases for this spec.
I think this is more an abbreviated syntax than an implied activity and
I'd strongly suggest you pick just one and spec it - keeping client and
server dev work to a minimum ups the chance of adoption. [Having an
abbreviated syntax without question damaged RDF/XML].
The two kinds of entry already exist in the wild, so I'm wary of
excluding either from the spec. FriendFeed would be unable to use the
implied activity shorthand without changing the content of their feeds
as perceived by non-activity aggregators, and the reverse is true for
sites such as YouTube and Flickr that currently publish what I refer to
as "object entries".
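
One way a consumer could handle both kinds of entry is sketched below. It assumes that an entry with no activity:verb is an "object entry" implying a "post" by its author; that rule, the short verb name, and the placeholder activity namespace URI are all assumptions made for the example rather than text from the spec.

```python
import xml.etree.ElementTree as ET

# The activity namespace is an assumed placeholder, not the spec's own URI.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "activity": "http://example.org/activity",
}

# Hypothetical "object entry" of the kind a photo site might publish.
OBJECT_ENTRY = """
<entry xmlns="http://www.w3.org/2005/Atom">
  <title>My Cat</title>
  <author><name>Geraldine</name></author>
</entry>
"""


def normalize(entry_xml):
    """Turn either kind of entry into one activity record."""
    entry = ET.fromstring(entry_xml.strip())
    verb = entry.findtext("activity:verb", namespaces=NS)
    actor = entry.findtext("atom:author/atom:name", namespaces=NS)
    if verb is not None:
        # Full activity entry: the object is described inside activity:object.
        title = entry.findtext("activity:object/atom:title", namespaces=NS)
        return {"verb": verb.strip(), "actor": actor,
                "object_title": title, "implied": False}
    # Object entry: assume an implied "post" whose object is the entry itself.
    title = entry.findtext("atom:title", namespaces=NS)
    return {"verb": "post", "actor": actor,
            "object_title": title, "implied": True}


print(normalize(OBJECT_ENTRY))
```

With something like this on the consumer side, neither kind of publisher has to change what non-activity aggregators see in its feed.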
- Having duplicate places to put data will easily confuse publishers,
and clients will likewise be confused by having to acquire data from
parent elements.
This is just how things are with XML modelling.
The most important feedback I've taken from this is that the spec
should be re-organised to emphasise the implied activity shorthand
rather than starting with the full "activity entry" case, since I
expect only activity aggregator services to publish the latter, and
indeed the motivation for this separation is that these publishers
already have feeds that represent activities rather than objects.
That and the fact that everyone who uses this spec will have to think
about where to put the data instead of being able to guess right. Reduce
that burden!
It is my current belief that legacy software (both on the publisher and
consumer side) requires at least some duplication in the "activity
entry" case, but I will definitely take your feedback on board and see
if it can be made clearer and less redundant.
Thanks,
Martin