Re: [whatwg] Just create a Microformat for it - thoughts on micro-data topic
Ian Hickson wrote:
> On Tue, 5 May 2009, Manu Sporny wrote:
> > Creating a Microformat is a very time consuming prospect, including:
> > ... Microformats Due Diligence Rules ...
>
> Are you saying that RDF vocabularies can be created _without_ this due
> diligence?

What I am saying is that the amount of due diligence that goes into a particular vocabulary should be determined by the community that will use the vocabulary. Some of these will be large communities and will require an enormous amount of due diligence, others will be very small communities, which may not require as much due diligence as larger communities, or they may have a completely different process to the Microformats process.

The key here is that a micro-data approach should allow them to have the flexibility to create vocabularies in a distributed manner.

Ian Hickson wrote:
> On Tue, 5 May 2009, Ben Adida wrote:
> > Ian Hickson wrote:
> > > Are you saying that RDF vocabularies can be created _without_ this
> > > due diligence?
> >
> > Who decides what the right due diligence is?
>
> The person writing the vocabulary, presumably.

Your stance is a bit more lax than mine on this. I'd say that it is the community, not solely the vocabulary author, that determines the right amount of due diligence. If the community does not see the proper amount of due diligence going into vocabulary creation, or the vocabulary does not solve their problem, then they should be free to develop a competing alternative.

This is especially true because the proper amount of due diligence can easily become a philosophical argument - each community can have a perfectly rational argument to do things differently when solving the same problem.

Your position, that the vocabulary author decides the proper amount of due diligence, is rejected in the Microformats community. In the Microformats community, every vocabulary has the same amount of due diligence applied to it.

I think that this is a good thing for that particular community, but it does have a number of downsides - scalability being one of them. It creates a bottleneck - we can only get so many vocabularies through our centralized, community-based process and the barrier to creating a vocabulary is very high. As a result, we don't support small community vocabularies and only support widely established publishing behavior (contact information, events, audio, recipes, etc).

So, maybe this requirement should be added to the micro-data requirements list:

If micro-data is going to succeed, it needs to support a mechanism that provides easy, distributed vocabulary development, publishing and re-use.

-- manu

--
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: A Collaborative Distribution Model for Music
http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
Re: [whatwg] Just create a Microformat for it - thoughts on micro-data topic
Ian Hickson wrote:
> > One organization for *all* topics, ever?
>
> I don't think that would really scale. Even for major languages, like
> HTML, we haven't found a single organisation to be a successful model.

Then you, Ben, and I agree on this particular point:

In order for semantic/micro-data markup to scale well, we must ensure that distributed vocabulary development, publishing and re-use is a cornerstone of the solution.

> Manu's list didn't mention anything about a single organisation

Then I wasn't clear enough - I meant that the single organization was the Microformats community and that the list works for that particular community, but is not guaranteed to work for all communities.

You could say that the single community could be the W3C or WHATWG - pushing vocabulary standardization solely through any one of these organizations would be the wrong solution, therefore we should be cognizant of that in this micro-data discussion.

> Surely all of the above apply equally to any RDFa vocabulary just as it
> would to _any_ vocabulary, regardless of the underlying syntax?

Not necessarily...

> 6: Justifying your design is a key part of any language design effort
> also. Not doing this would lead to a language or vocabulary with
> unnecessary parts, making it harder to use.

What happens when the people you're justifying your design to are the gatekeepers? What happens when they don't understand the problem you're attempting to solve? Or they disagree with you on a philosophical level? Or they have some sort of political reason to not allow your vocabulary to see the light of day (think large multi-national vs. little guy)?

In the Microformats community, this stage, especially if one of the Microformat founders disagrees with your stance, can kill a vocabulary.

> 7: With any language, part of designing the vocabulary is defining how
> to process content that uses it.

Not if there are clear parsing rules and it's easy to separate the vocabulary from the parsing rules. This should be a requirement for the micro-data solution:

Separation of concerns between the markup used to express the micro-data (the HTML markup) and the vocabularies used to express the semantics (the micro-data vocabularies).

> 9: The most important practical test of a language is the test of
> deployment. Getting feedback and writing code is naturally part of
> writing a format.

This statement is vague, so I'm elaborating a bit to cover the possible readings of this statement:

Writing markup code (i.e., HTML) should be a natural part of writing a semantic vocabulary meant to be embedded in HTML.

Writing parser code (e.g., Python, Perl, Ruby, C) should not be a natural part of writing a semantic vocabulary - they are wholly different disciplines.

Microformats require you to write both markup code and parser code by design.

> As far as I can tell, the steps above are just the steps one would take
> for designing any format, language, or vocabulary. Are you saying that
> creating an RDF vocabulary _doesn't_ involve these steps? How is an RDF
> vocabulary defined if not using these steps?

I don't believe that Ben is saying that at all - those steps are best practices and apply generally to most communities. However, they do not work for all communities and they do not work well when they are transformed from best practices to a requirement that all vocabularies must meet in order to be published.

-- manu

--
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: A Collaborative Distribution Model for Music
http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
[whatwg] Just create a Microformat for it - thoughts on micro-data topic
bcc: Public RDFa Task Force mailing list (but not speaking as a member)

Kyle Weems' recent post[1] on CSSquirrel discusses[2] some of the more recent rumblings surrounding RDFa and Microformats as potential micro-data solutions. It specifically addresses a conversation between Ian and Tantek regarding Microformats:

http://krijnhoetmer.nl/irc-logs/whatwg/20090430#l-693

Since I've seen this argument made numerous times now, and because it seems like a valid solution to someone who isn't familiar with the Microformats process, I'm addressing it here. The argument goes something like this:

  It looks like markup problem X can be solved with a simple Microformat.

This seems like a reasonable answer at first - Microformats, at their core, are simple tag-based mechanisms for data markup. Most semantic representation problems can be solved by explicitly tagging content.

What most people fail to see, however, is that this statement trivializes the actual implementation cost of the solution. A Microformat is much more than a simple tag-based mechanism and it is far more difficult to create one than most people realize.

Creating a Microformat is a very time-consuming prospect, including:

1. Attempting to apply current Microformats to solve your problem.
2. Gathering examples to show how the content is represented in the wild.
3. Gathering common data formats that encode the sort of content you are attempting to express.
4. Analyzing the data formats and the content.
5. Deriving common vocabulary terms.
6. Proposing a draft Microformat and arguing the relevance of each term in the vocabulary.
7. Sorting out parsing rules for the Microformat.
8. Repeating steps 1-7 until the community is happy.
9. Testing the Microformat in the wild, getting feedback, writing code to support your specific Microformat.
10. Draft stage - if you didn't give up by this point.

I say this as the primary editor of the hAudio Microformat - it is a grueling process, certainly not for those without thick skin and a strong determination to complete even simple vocabularies. Each one of those steps can take weeks or months to complete.

I'm certainly not knocking the output of the Microformats community - the documents that come out of the community have usually been vetted quite thoroughly. However, hearing somebody propose Microformats as a quick or easy solution makes me cringe every time. The hAudio Microformat initiative started over 2 years ago and it's still going, still not done.

So, while it is true that someone may want to put themselves through the headache of creating a Microformat to solve a particular markup problem, it is unlikely.

One need only look at our track record - output for the Microformats community is at roughly 10 new vocabularies[3] (not counting rel-vocabularies and vocabularies not based directly on a previous data format).

Compare that with the roughly 120-150 registered[3], active RDF vocabularies[4] via prefix.cc.

Now certainly, quantity != quality. However, it does demonstrate that something is causing more people to generate RDF vocabularies than Microformats vocabularies.

Note that this argument doesn't apply to class-attribute-based semantic markup, but one should not make the mistake of assuming that it is easy to create a Microformat.

-- manu

[1] http://www.cssquirrel.com/comic/?comic=16
[2] http://www.cssquirrel.com/2009/05/04/comic-update-html5-manners/
[3] http://microformats.org/wiki/Main_Page#Specifications
[4] http://prefix.cc/popular/all

--
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: A Collaborative Distribution Model for Music
http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
Re: [whatwg] Just create a Microformat for it - thoughts on micro-data topic
Ian Hickson wrote:
> Are you saying that RDF vocabularies can be created _without_ this due
> diligence?

Who decides what the right due diligence is? One organization for *all* topics, ever?

An RDF vocabulary can be created by the proper community, i.e. a music vocabulary by music experts, a copyright vocabulary by copyright experts, a biomedical vocabulary by biomedical experts, rather than assuming that one central group should be the centralized bottleneck for all development.

In other words, RDF vocabularies function like the web does: decentralized, let the best sites/vocabs win.

-Ben
Re: [whatwg] Just create a Microformat for it - thoughts on micro-data topic
On Tue, 5 May 2009, Ben Adida wrote:
> Ian Hickson wrote:
> > Are you saying that RDF vocabularies can be created _without_ this due
> > diligence?
>
> Who decides what the right due diligence is?

The person writing the vocabulary, presumably.

> One organization for *all* topics, ever?

I don't think that would really scale. Even for major languages, like HTML, we haven't found a single organisation to be a successful model.

Manu's list didn't mention anything about a single organisation:

On Tue, 5 May 2009, Manu Sporny wrote:
> Creating a Microformat is a very time consuming prospect, including:
>
> 1. Attempting to apply current Microformats to solve your problem.
> 2. Gathering examples to show how the content is represented in the wild.
> 3. Gathering common data formats that encode the sort of content you are
>    attempting to express.
> 4. Analyzing the data formats and the content.
> 5. Deriving common vocabulary terms.
> 6. Proposing a draft Microformat and arguing the relevance of each term
>    in the vocabulary.
> 7. Sorting out parsing rules for the Microformat.
> 8. Repeating steps 1-7 until the community is happy.
> 9. Testing the Microformat in the wild, getting feedback, writing code to
>    support your specific Microformat.
> 10. Draft stage - if you didn't give up by this point.

Surely all of the above apply equally to any RDFa vocabulary just as they would to _any_ vocabulary, regardless of the underlying syntax? Consider each of these in turn:

1: You have to make sure you're not reinventing the wheel, whatever language or vocabulary you are designing.

2: You have to make sure whatever language or vocabulary you are designing is something that your users can use.

3: If you do have to invent a new language or vocabulary, it makes sense to base it on the base of knowledge humanity has collected on the subject.

4: You have to study the information collected in steps 2 and 3 to make sense of it.

5: Deriving vocabulary names is a key part of any language design effort.

6: Justifying your design is a key part of any language design effort also. Not doing this would lead to a language or vocabulary with unnecessary parts, making it harder to use.

7: With any language, part of designing the vocabulary is defining how to process content that uses it.

8: Defining any language or vocabulary effectively must, clearly, involve a feedback loop with community review.

9: The most important practical test of a language is the test of deployment. Getting feedback and writing code is naturally part of writing a format.

10: You have to specify the language.

As far as I can tell, the steps above are just the steps one would take for designing any format, language, or vocabulary. Are you saying that creating an RDF vocabulary _doesn't_ involve these steps? How is an RDF vocabulary defined if not using these steps?

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.