On 2 July 2015 at 03:37, Tantek Çelik <tan...@cs.stanford.edu> wrote:

> tl;dr: It's time. Let's land microformats parsing support in Gecko as
> a Q3 Platform deliverable that Gaia can use.
>

Happy to hear this!


> I think there's rough consensus that a subset of OG, as described by
> Ted, satisfies this. Minimizing our exposure to OG (including Twitter
> Cards) is ideal for a number of reasons (backcompat/proprietary
> maintenance etc.).
>

That's certainly a good start. It seems a shame to intentionally filter out
all the extra meta tags used by other Open Graph types like:

   - music.song
   - music.album
   - music.playlist
   - music.radio_station
   - video.movie
   - video.episode
   - video.tv_show
   - article
   - book
   - profile
   - business
   - fitness.course
   - game.achievement
   - place
   - product
   - restaurant.menu

I envisage allowing the community to contribute addons to add extra
experimental card packs for types we don't support out of the box from day
one. Filtering out this data would make it very difficult for them to do
that, for no good reason.
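
As a rough illustration of the unfiltered approach, here is a minimal Python
sketch (not Gecko or Gaia code) that collects every meta property on a page
instead of a fixed whitelist. The class name, function name and sample markup
are assumptions made up for this example.

```python
from html.parser import HTMLParser


class MetaPropertyCollector(HTMLParser):
    """Collects every <meta property="..." content="..."> pair on a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("property")
        content = attr_map.get("content")
        if name and content is not None:
            # Keep repeated properties (e.g. several og:image tags) as a list.
            self.properties.setdefault(name, []).append(content)


def collect_meta_properties(html):
    """Return a dict mapping each meta property name to its list of values."""
    collector = MetaPropertyCollector()
    collector.feed(html)
    return collector.properties


# Hypothetical page markup, for illustration only.
SAMPLE = """
<meta property="og:type" content="music.song">
<meta property="og:title" content="Example Song">
<meta property="music:duration" content="215">
"""

metadata = collect_meta_properties(SAMPLE)
```

Because nothing is discarded, type-specific properties such as
music:duration stay available to an addon even if the built-in cards only
read og:type and og:title.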

I absolutely understand the argument about having to maintain backwards
compatibility with a format if we don't want to promote it going forward
though, which is why I agree we should be conservative when adding built-in
Open Graph types.

> There appear to be multiple options for this, with the best (most
> open, aligned with our mission, already open source interoperably
> implemented, etc.) being microformats.
>

That is your opinion. There may be things you don't like about JSON-LD for
example, but it is a W3C Recommendation created through a standards body
and has open source implementations in just as many languages as
Microformats. There may be other more subjective measures of "open" you're
talking about, but I think it would be better for us all to stick to
arguments about technical merit and adoption statistics when making
comparisons in this case, at the risk of falling into the Not Invented Here
trap.


> "fulfils" mostly in theory. Schema is 99% overdesigned and
> aspirational, most objects and properties not showing up anywhere even
> in search results (except generic testing tools perhaps).
>

> A small handful of Schema objects and subset of properties are
> actually implemented by anyone in anything user-facing.
>

As I mentioned, level of current usage is not the most important criterion
for Gaia's own requirements, but if we're talking about how proven these
schemas are, according to schema.org these are the number of domains which
use the schemas we're talking about:

   - Person - over 1,000,000 domains
   - Event - 100,000 - 250,000 domains
   - ImageObject - over 1,000,000 domains
   - AudioObject - 10,000 - 50,000 domains
   - VideoObject - 100,000 - 200,000 domains
   - RadioChannel - fewer than 10 domains
   - EmailMessage - 100 - 1000 domains
   - Comment - 10,000 - 50,000 domains

The only equivalent data I have for Microformats is for hCard (equivalent
to the Person schema) from a crawl at the end of last year [1], and it has
about the same usage:

   - hCard - 1,095,517 domains

The data also shows that Microdata and RDFa are used on more pages per
domain than Microformats.

I'd say that Microformats looks at best equally unproven on that basis,
though I'm open to new data.


> Everything else is untested, and claiming "fulfils these use cases"
> puts far too much faith in a company known for abandoning their
> overdesigned efforts (APIs, vocabularies, syntaxes!) every few years.
> Google Base / gData / etc. likely "fulfilled" these use cases too.
>

Our Gecko and Gaia code is not going to stop working if Google decides to
use something else. Content authors on the wider web might migrate to newer
vocabularies (or even syntaxes) over time, but that's something we're going
to have to monitor on an ongoing basis anyway.

> Existing interoperably implemented microformats support most of these:
>
> - Contact - http://microformats.org/wiki/h-card
> - Event - http://microformats.org/wiki/h-event
> - Photo - http://microformats.org/wiki/h-entry with u-photo property
> - Song - no current vocabulary - classic hAudio vocabulary could be
> simplified for this
> - Video - http://microformats.org/wiki/h-entry with u-video property
> - Radio station - no current vocabulary - worth researching with
> schema RadioChannel as input
> - Email - http://microformats.org/wiki/h-entry with u-in-reply-to property
> - Message - http://microformats.org/wiki/h-entry
>

OK, so there are actually three Microformats that are useful to us here.
For photos, videos, emails and messages we have to re-use the same h-entry
Microformat and try to figure out from its properties which type of thing
it is. For song and radio station we'd need to invent something new.

This is not very attractive for Firefox OS where we'd like to have clearly
defined types of cards with different card templates. It also makes it
harder for the community to create new types of cards (e.g. via addons)
because they have to reason about "an h-entry with a u-video property" vs.
just a "VideoObject" and have to deal with the ambiguity of "an h-entry
with both a u-photo and u-video property" which could either be a video or
a photo.

An explicit type is important because in our UI we try to say "Pin
Contact", "Pin Photo" or "Pin Video" rather than just "Pin Page".
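
The difference can be sketched in a few lines of Python. This is only an
illustration: the function names and the card type names are hypothetical,
the h-entry input follows the parsed microformats2 JSON shape ("type" plus
"properties"), and the property-to-card mapping is taken from the list quoted
above.

```python
def card_types_from_h_entry(item):
    """Guess card types for an h-entry from which properties are present."""
    props = item.get("properties", {})
    candidates = []
    if "video" in props:
        candidates.append("Video")
    if "photo" in props:
        candidates.append("Photo")
    if "in-reply-to" in props:
        candidates.append("Email")
    # With no distinguishing property we can only call it a generic message.
    return candidates or ["Message"]


def card_type_from_schema(item):
    """Read the card type straight from an explicit schema.org @type."""
    # Hypothetical mapping from schema types to card names.
    names = {"VideoObject": "Video", "ImageObject": "Photo", "Person": "Contact"}
    types = item.get("@type")
    if isinstance(types, str):
        types = [types]
    for type_name in types or []:
        if type_name in names:
            return names[type_name]
    return "Page"


# Hypothetical inputs, for illustration only.
h_entry = {
    "type": ["h-entry"],
    "properties": {
        "photo": ["https://example.com/poster.jpg"],
        "video": ["https://example.com/clip.webm"],
    },
}
video_object = {"@type": "VideoObject", "contentUrl": "https://example.com/clip.webm"}

ambiguous = card_types_from_h_entry(h_entry)
explicit = card_type_from_schema(video_object)
```

Here the h-entry yields two candidates ("Video" and "Photo") and the caller
has to pick one, whereas the explicit type gives a single answer.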


>  The "actions" space has been a difficult and challenging one.
>
> Google's (abandoned) "web intents" was one such effort.
>

Yes, we've had a similar experience with Web Activities.


> Currently the IndieWeb community is pursuing Web Actions (and has them
> working across sites)
>
> http://indiewebcamp.com/webactions
>
> There's likely potential there to connect webactions to be part of the
> format of the post/page to be parsed, consumed, re-used.
>

I agree with Kelly's impression of this. It seems like it's at a very early
stage and loosely defined (even compared with the Action schemas), and I'm
not sure it's trying to solve the same set of problems we're trying to
solve. I don't think this is going to solve our use cases for 2.5.

> It's not just that, but the experience (that any Mozilla engineer who
> was here before Firefox will relay, e.g ping jst sometime if you want
> to hear horror stories) of RDF, triple-stores etc. being a disaster
> for Mozilla, performance, etc. and taking ages to undo.
>

I want to emphasise here that we are not trying to build an RDF parser or
pursue any grand Semantic Web vision. We're just trying to extract user
value from existing metadata on the web by saving a bit of extra JSON in
the Places database and using it to build a richer representation of web
content in our UI.
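
To show how little machinery that needs, here is a minimal Python sketch that
pulls JSON-LD blocks out of a page as plain JSON, with no RDF processing at
all. It is an assumption-laden illustration, not the proposed Gecko or Gaia
code: the class name and sample markup are made up, and storage in Places is
not modelled.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except ValueError:
                pass  # Ignore malformed blocks rather than failing the page.


# Hypothetical page markup, for illustration only.
SAMPLE = """
<script type="application/ld+json">
{"@context": "http://schema.org", "@type": "Person", "name": "Example Person"}
</script>
"""

collector = JsonLdCollector()
collector.feed(SAMPLE)
items = collector.items
# The "bit of extra JSON" that could be saved alongside the page record.
record = json.dumps(items[0])
```

The data is treated as ordinary JSON keyed by @type; nothing here expands
contexts or builds triples.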


> In practice, if you're actually bothering with JSON-LD (not just plain
> JSON), and using or depending on anything triples related, you're
> likely to run into similar problems and objections. It's a very high
> risk path. If you're ignoring all the "LD"ness of JSON-LD, then just
> admit that upfront and use some one-off JSON.
>

I don't want to invent something new if we can avoid it. We've already
invented plenty of new APIs on the B2G project which have not made it onto
a standards track. I want to make a better user agent by creating user
value from the metadata and formats which are already in use on the web.

I agree with Jonas that the vast majority of this work can be done in Gaia;
we just need a couple of new events on the Browser API, which Kan-Ru says he
is happy to see landed. A lot of our Gaia requirements could be
fulfilled by existing Open Graph types if we wanted to re-use those as
Jonas suggests, and we also have the option of creating our own namespace
for additional types. If we were to go down that route we would definitely
need more than the four simple meta tags.

I think there are going to be some missing use cases just using Open Graph
alone though, particularly with regard to actions, which are actually more
of a Linked Data type thing (not just metadata). JSON-LD is an easy tool
for us to play around with for this, which is why I'd like to move forward
on making the small change to the Browser API that would be needed for us
to do that.

I'm very happy for the Microformats work to continue in parallel so that
Gaia developers can use it when it's ready, but the evidence I've seen so
far has not convinced me that it's a great fit for all of our requirements
and it's too much of a risk for us to put all of our 2.5 eggs in that
basket.

So let's move forward on both the Browser API events to give us some
flexibility to start experimenting in Gaia, and the Microformats
implementation in Gecko so that we can evaluate that properly going forward.

Thanks

Ben

P.S. I'm curious whether Microformats has ever been proposed as a
specification to a standards body like the W3C or WHATWG? It seems like it
could benefit from some additional feedback from outside its existing
community.

1. http://webdatacommons.org/structureddata/2014-12/stats/stats.html
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
