Re: [whatwg] Just create a Microformat for it - thoughts on micro-data topic

2009-05-06 Thread Manu Sporny
Ian Hickson wrote:
 On Tue, 5 May 2009, Manu Sporny wrote:
 Creating a Microformat is a very time-consuming prospect, including:

 ... Microformats Due Diligence Rules ...
 
 Are you saying that RDF vocabularies can be created _without_ this due 
 diligence?

What I am saying is that the amount of due diligence that goes into a
particular vocabulary should be determined by the community that will
use the vocabulary.

Some of these will be large communities that require an enormous amount
of due diligence. Others will be very small communities that may not
need as much, or that may follow a process completely different from
the Microformats process. The key here is that a micro-data approach
should give them the flexibility to create vocabularies in a
distributed manner.

Ian Hickson wrote:
 On Tue, 5 May 2009, Ben Adida wrote:
 Ian Hickson wrote:
 Are you saying that RDF vocabularies can be created _without_ this
 due diligence?

 Who decides what the right due diligence is?

 The person writing the vocabulary, presumably.

Your stance is a bit more lax than mine on this. I'd say that it is the
community, not solely the vocabulary author, that determines the right
amount of due diligence. If the community does not see the proper amount
of due diligence going into vocabulary creation, or the vocabulary does
not solve their problem, then they should be free to develop a competing
alternative.

This is especially true because the proper amount of due diligence can
easily become a philosophical argument - each community can have a
perfectly rational argument to do things differently when solving the
same problem.

Your position, that the vocabulary author decides the proper amount of
due diligence, is rejected in the Microformats community. In the
Microformats community, every vocabulary has the same amount of due
diligence applied to it.

I think that this is a good thing for that particular community, but it
does have a number of downsides - scalability being one of them. It
creates a bottleneck - we can only get so many vocabularies through our
centralized, community-based process and the barrier to creating a
vocabulary is very high. As a result, we don't support small community
vocabularies and only support widely established publishing behavior
(contact information, events, audio, recipes, etc).

So, maybe this requirement should be added to the micro-data
requirements list:

If micro-data is going to succeed, it needs to support a mechanism that
provides easy, distributed vocabulary development, publishing and re-use.

-- manu

-- 
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: A Collaborative Distribution Model for Music
http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/


Re: [whatwg] Just create a Microformat for it - thoughts on micro-data topic

2009-05-06 Thread Manu Sporny
Ian Hickson wrote:
 One organization for *all* topics, ever?
 
 I don't think that would really scale. Even for major languages, like 
 HTML, we haven't found a single organisation to be a successful model.

Then you, Ben, and I agree on this particular point:

In order for semantic/micro-data markup to scale well, we must ensure
that distributed vocabulary development, publishing and re-use is a
cornerstone of the solution.

 Manu's list didn't mention anything about a single organisation

Then I wasn't clear enough - I meant that the single organization was
the Microformats community and that the list works for that particular
community, but is not guaranteed to work for all communities.

The single community could just as well be the W3C or WHATWG. Pushing
vocabulary standardization solely through any one of these
organizations would be the wrong solution, so we should be cognizant of
that in this micro-data discussion.

 Surely all of the above apply equally to any RDFa vocabulary just as they 
 would to _any_ vocabulary, regardless of the underlying syntax?

Not necessarily...

 6: Justifying your design is a key part of any language design effort 
 also. Not doing this would lead to a language or vocabulary with 
 unnecessary parts, making it harder to use.

What happens when the people you're justifying your design to are the
gatekeepers? What happens when they don't understand the problem you're
attempting to solve? Or they disagree with you on a philosophical level?
Or they have some sort of political reason to not allow your vocabulary
to see the light of day (think large multi-national vs. little guy)? In
the Microformats community, this stage, especially if one of the
Microformat founders disagrees with your stance, can kill a vocabulary.

 7: With any language, part of designing the vocabulary is defining how to 
 process content that uses it.

Not if there are clear parsing rules and it's easy to separate the
vocabulary from the parsing rules. This should be a requirement for the
micro-data solution:

Separation of concerns between the markup used to express the micro-data
(the HTML markup) and the vocabularies used to express the semantics
(the micro-data vocabularies).
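
A minimal sketch of that separation, assuming RDFa-style "property"
attributes and Python's standard html.parser module. The class and
function names are hypothetical, prefix resolution is ignored, and this
is not a conforming RDFa processor. The point is only that the
extraction code never mentions a specific vocabulary:

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect (property, text) pairs without knowing any vocabulary."""

    def __init__(self):
        super().__init__()
        self.pairs = []   # extracted (property name, text) pairs
        self._stack = []  # one (property or None, text parts) per open element

    def handle_starttag(self, tag, attrs):
        # Assumes well-formed markup with explicit end tags (no void elements).
        self._stack.append((dict(attrs).get("property"), []))

    def handle_data(self, data):
        # Text belongs to the innermost open element, if it carries a property.
        if self._stack and self._stack[-1][0] is not None:
            self._stack[-1][1].append(data)

    def handle_endtag(self, tag):
        if not self._stack:
            return
        name, parts = self._stack.pop()
        if name is not None:
            self.pairs.append((name, "".join(parts).strip()))


def extract_properties(html):
    """Return every (property, text) pair found in the given HTML string."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


if __name__ == "__main__":
    sample = ('<div><span property="dc:title">Kind of Blue</span> by '
              '<span property="foaf:name">Miles Davis</span></div>')
    for name, value in extract_properties(sample):
        print(name, "=", value)
```

The same function handles a term from a vocabulary it has never seen,
which is what lets a small community publish new terms without anyone
writing new parser code.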

 9: The most important practical test of a language is the test of 
 deployment. Getting feedback and writing code is naturally part of writing 
 a format.

This statement is vague, so I'll elaborate a bit to cover its possible
readings:

Writing markup code (i.e., HTML) should be a natural part of writing a
semantic vocabulary meant to be embedded in HTML.

Writing parser code (e.g., Python, Perl, Ruby, C, etc.) should not be a
natural part of writing a semantic vocabulary - they are wholly
different disciplines. Microformats require you to write both markup
code and parser code by design.

 As far as I can tell, the steps above are just the steps one would take 
 for designing any format, language, or vocabulary. Are you saying that 
 creating an RDF vocabulary _doesn't_ involve these steps? How is an RDF 
 vocabulary defined if not using these steps?

I don't believe that Ben is saying that at all - those steps are best
practices and apply generally to most communities. However, they do not
work for all communities and they do not work well when they are
transformed from best practices to a requirement that all vocabularies
must meet in order to be published.

-- manu



[whatwg] Just create a Microformat for it - thoughts on micro-data topic

2009-05-05 Thread Manu Sporny
bcc: Public RDFa Task Force mailing list (but not speaking as a member)

Kyle Weems' recent post[1] on CSSquirrel discusses[2] some of the more
recent rumblings surrounding RDFa and Microformats as potential
micro-data solutions. It specifically addresses a conversation between
Ian and Tantek regarding Microformats:

http://krijnhoetmer.nl/irc-logs/whatwg/20090430#l-693

Since I've seen this argument made numerous times now, and because it
seems like a valid solution to someone who isn't familiar with the
Microformats process, I'm addressing it here. The argument goes
something like this:

It looks like markup problem X can be solved with a simple
Microformat.

This seems like a reasonable answer at first - Microformats, at their
core, are simple tag-based mechanisms for data markup. Most semantic
representation problems can be solved by explicitly tagging content.

What most people fail to see, however, is that this statement
trivializes the actual implementation cost of the solution. A
Microformat is much more than a simple tag-based mechanism and it is far
more difficult to create one than most people realize. Creating a
Microformat is a very time-consuming prospect, including:

  1. Attempting to apply current Microformats to solve your problem.
  2. Gathering examples to show how the content is represented in the
 wild.
  3. Gathering common data formats that encode the sort of content
 you are attempting to express.
  4. Analyzing the data formats and the content.
  5. Deriving common vocabulary terms.
  6. Proposing a draft Microformat and arguing the relevance of each
 term in the vocabulary.
  7. Sorting out parsing rules for the Microformat.
  8. Repeating steps 1-7 until the community is happy.
  9. Testing the Microformat in the wild, getting feedback, writing
 code to support your specific Microformat.
  10. Draft stage - if you didn't give up by this point.

I say this as the primary editor of the hAudio Microformat - it is a
grueling process, certainly not for those without thick skin and a
strong determination to complete even simple vocabularies. Each one of
those steps can take weeks or months to complete.

I'm certainly not knocking the output of the Microformats community -
the documents that come out of the community have usually been vetted
quite thoroughly. However, to hear somebody propose Microformats as a
quick or easy solution makes me cringe every time I hear it.

The hAudio Microformat initiative started over 2 years ago and it's
still going, still not done. So, while it is true that someone may want
to put themselves through the headache of creating a Microformat to
solve a particular markup problem, it is unlikely. One must only look at
our track record - output for the Microformats community is at roughly
10 new vocabularies[3] (not counting rel-vocabularies and vocabularies
not based directly on a previous data format).

Compare that with the roughly 120-150 registered[3], active RDF
vocabularies[4] via prefix.cc. Certainly, quantity != quality. It does
demonstrate, however, that something is causing more people to generate
RDF vocabularies than Microformats vocabularies.

Note that this argument doesn't apply to class-attribute-based semantic
markup, but one should not make the mistake of assuming that it is easy
to create a Microformat.

-- manu

[1] http://www.cssquirrel.com/comic/?comic=16
[2] http://www.cssquirrel.com/2009/05/04/comic-update-html5-manners/
[3] http://microformats.org/wiki/Main_Page#Specifications
[4] http://prefix.cc/popular/all




Re: [whatwg] Just create a Microformat for it - thoughts on micro-data topic

2009-05-05 Thread Ben Adida
Ian Hickson wrote:
 Are you saying that RDF vocabularies can be created _without_ this due 
 diligence?

Who decides what the right due diligence is? One organization for *all*
topics, ever?

An RDF vocabulary can be created by the proper community, e.g. a music
vocabulary by music experts, a copyright vocabulary by copyright
experts, a biomedical vocabulary by biomedical experts, rather than
assuming that one central group should be the bottleneck for all
development.

In other words, RDF vocabularies function like the web does:
decentralized, let the best sites/vocabs win.

-Ben


Re: [whatwg] Just create a Microformat for it - thoughts on micro-data topic

2009-05-05 Thread Ian Hickson
On Tue, 5 May 2009, Ben Adida wrote:
 Ian Hickson wrote:
  Are you saying that RDF vocabularies can be created _without_ this due 
  diligence?
 
 Who decides what the right due diligence is?

The person writing the vocabulary, presumably.


 One organization for *all* topics, ever?

I don't think that would really scale. Even for major languages, like 
HTML, we haven't found a single organisation to be a successful model.


Manu's list didn't mention anything about a single organisation:

On Tue, 5 May 2009, Manu Sporny wrote:

 Creating a Microformat is a very time-consuming prospect, including:
 
   1. Attempting to apply current Microformats to solve your problem.
   2. Gathering examples to show how the content is represented in the
  wild.
   3. Gathering common data formats that encode the sort of content
  you are attempting to express.
   4. Analyzing the data formats and the content.
   5. Deriving common vocabulary terms.
   6. Proposing a draft Microformat and arguing the relevance of each
  term in the vocabulary.
   7. Sorting out parsing rules for the Microformat.
   8. Repeating steps 1-7 until the community is happy.
   9. Testing the Microformat in the wild, getting feedback, writing
  code to support your specific Microformat.
   10. Draft stage - if you didn't give up by this point.

Surely all of the above apply equally to any RDFa vocabulary just as they 
would to _any_ vocabulary, regardless of the underlying syntax?

Consider each of these in turn:

1: You have to make sure you're not reinventing the wheel, whatever 
language or vocabulary you are designing.

2: You have to make sure whatever language or vocabulary you are designing 
is something that your users can use.

3: If you do have to invent a new language or vocabulary, it makes sense 
to build it on the body of knowledge humanity has collected on the subject.

4: You have to study the information collected in steps 2 and 3 to make 
sense of it.

5: Deriving vocabulary names is a key part of any language design effort.

6: Justifying your design is a key part of any language design effort 
also. Not doing this would lead to a language or vocabulary with 
unnecessary parts, making it harder to use.

7: With any language, part of designing the vocabulary is defining how to 
process content that uses it.

8: Defining any language or vocabulary effectively must, clearly, involve 
a feedback loop with community review.

9: The most important practical test of a language is the test of 
deployment. Getting feedback and writing code is naturally part of writing 
a format.

10: You have to specify the language.

As far as I can tell, the steps above are just the steps one would take 
for designing any format, language, or vocabulary. Are you saying that 
creating an RDF vocabulary _doesn't_ involve these steps? How is an RDF 
vocabulary defined if not using these steps?

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'