Re: [whatwg] Support for RDFa in HTML5
On 03/12/12 15:00, Benjamin Hawkes-Lewis wrote:
> On Mon, Mar 12, 2012 at 5:16 PM, Philipp Serafin phil...@gmail.com wrote:
>> I admit, I haven't really followed this debate. But out of interest: is it by now actually possible to write any kind of implementation that complies with both the HTML5 *and* the RDFa specification?
> As Ian said, the HTML5 spec allows extensions by other specifications. The HTML5+RDFa specification is making use of that facility. By writing an implementation that complies with HTML5+RDFa, you write an implementation that complies with both.

Yes, what Benjamin and Ian said. Additionally, the HTML+RDFa spec does not support the @version attribute:

"The version attribute is not supported in HTML5 and is non-conforming. However, if an HTML+RDFa document contains the version attribute on the html element, a conforming RDFa Processor must examine the value of this attribute. If the value matches that of a defined version of RDFa, then the processing rules for that version must be used. If the value does not match a defined version, or there is no version attribute, then the processing rules for the most recent version of RDFa 1.1 must be used."

It should also be possible to write a parser implementation that complies with both the HTML5 specification and the HTML+RDFa specification. You may want to read the latest HTML+RDFa specification for details:

http://www.w3.org/TR/2012/WD-rdfa-in-html-20120315/

If, after reading the explanation and document above, you still do not think this is possible, please explain exactly what you feel is not possible and why.

-- manu

-- Manu Sporny (skype: msporny, twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: PaySwarm Website for Developers Launched http://digitalbazaar.com/2012/02/22/new-payswarm-alpha/
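The version-selection rule quoted above can be sketched in a few lines; note that the list of defined version strings below is an illustrative assumption, not taken verbatim from the HTML+RDFa spec.

```javascript
// Sketch of the HTML+RDFa @version detection rule described above.
// DEFINED_VERSIONS is an illustrative assumption; consult the HTML+RDFa
// specification for the actual defined version strings.
const DEFINED_VERSIONS = ["HTML+RDFa 1.0", "HTML+RDFa 1.1"];
const DEFAULT_VERSION = "HTML+RDFa 1.1"; // most recent RDFa 1.1 rules

function selectProcessingRules(versionAttr) {
  // A recognized @version value selects that version's processing rules;
  // an unrecognized or missing value falls back to the RDFa 1.1 rules.
  if (versionAttr !== null && DEFINED_VERSIONS.includes(versionAttr)) {
    return versionAttr;
  }
  return DEFAULT_VERSION;
}
```

In other words, a processor never rejects a document over @version; it simply falls back to the newest rules.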
[whatwg] JSON-LD - Universal Linked Data markup for Web Services
Just in case there are people in this community that haven't seen this yet: on May 29th 2010, Manu Sporny tweeted:

> Just published JSON-LD: http://rdfa.digitalbazaar.com/specs/source/json-ld/ Universal markup of #rdfa #microdata and #microformats via lightweight JSON. #html5 #json #lod

Abstract:

> Developers that embed structured data in their Web pages can choose among a number of languages such as RDFa, Microformats and Microdata. Each of these structured data languages, while incompatible at the syntax level, can be easily mapped to RDF. JSON has proven to be a highly useful object serialization and messaging replacement for SOAP. In an attempt to harmonize the representation of Linked Data in JSON, this specification outlines a common JSON representation format for Linked Data that can be used to represent objects specified via RDFa, Microformats and Microdata.

There is currently some discussion going on in the RDFa Community about this markup mechanism (start of thread):

http://lists.w3.org/Archives/Public/public-rdfa/2010May/0018.html

A great amount of effort was made to ensure that the markup mechanism is compatible with Microdata. Just a heads-up that this exists and that we may attempt to formalize it (via a spec) over the next couple of months.

-- manu

-- Manu Sporny (skype: msporny, twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.2.2 - Good Relations and Ditching Apache+PHP http://blog.digitalbazaar.com/2010/05/06/bitmunk-3-2-2/2/
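For flavor, here is a small object in the spirit of that early draft, along with the kind of prefix expansion it implies. The "#" (context) and "@" (subject) keys, the property names, and the `expandTerm` helper are all illustrative assumptions loosely modeled on the 2010 draft; they are not the final JSON-LD syntax.

```javascript
// A minimal JSON-LD-style object in the spirit of the early 2010 draft.
// The "#" (prefix context) and "@" (subject IRI) keys are illustrative
// assumptions, not the final JSON-LD syntax.
const person = {
  "#": { "foaf": "http://xmlns.com/foaf/0.1/" }, // prefix mappings
  "@": "http://example.com/people#manu",          // subject IRI
  "foaf:name": "Manu Sporny",
  "foaf:homepage": "http://digitalbazaar.com/"
};

// Expanding a prefixed term like "foaf:name" against the context
// yields the full property IRI, as in RDFa's CURIE handling.
function expandTerm(obj, term) {
  const [prefix, suffix] = term.split(":");
  const context = obj["#"] || {};
  return prefix in context ? context[prefix] + suffix : term;
}
```

The point of the design is that the same object can round-trip data expressed in RDFa, Microformats or Microdata, since all three map onto RDF terms.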
[whatwg] HTML5-warnings - request to publish as next heartbeat WD
I took some time this weekend to go through the HTML5 specification and write warning language for features that are currently either controversial or have long-standing bugs logged against them. It is important that we draw attention to the least stable sections of the HTML5 draft in order to align public expectation as we move towards Last Call. The only difference between this draft and Ian's latest draft is the warnings - there are no new technical additions or deletions. Since there are no new technical changes, there is no need to trigger a FPWD.

I am requesting three things of the HTML WG:

1. That this version is published as the next heartbeat Working Draft. Specifically, this is not a FPWD since there are no technical changes and thus there are no additional patent concerns.

2. Two other independent voices to support the publishing of this draft. Without those voices, this proposal cannot be considered for publishing.

3. A poll is created with two options:

[ ] Publish Ian's latest draft to address the heartbeat requirement.
[ ] Publish Ian's latest draft with Manu's warning language to address the heartbeat requirement.

Whichever option receives more than 50% of the vote will be published. A tie will result in Ian's latest draft being published.
Here is the complete diff-marked version:

http://html5.digitalbazaar.com/specs/html5-warnings-diff.html

Here is a link to every warning that was added to the HTML5 specification (this is the easiest way to review the changes):

http://html5.digitalbazaar.com/specs/html5-warnings-diff.html#editor-s-draft-date-08-August-2009
http://html5.digitalbazaar.com/specs/html5-warnings-diff.html#urls
http://html5.digitalbazaar.com/specs/html5-warnings-diff.html#fetch
http://html5.digitalbazaar.com/specs/html5-warnings-diff.html#misinterpreted-for-compatibility
http://html5.digitalbazaar.com/specs/html5-warnings-diff.html#implied-strong-reference
http://html5.digitalbazaar.com/specs/html5-warnings-diff.html#other-metadata-names
http://html5.digitalbazaar.com/specs/html5-warnings-diff.html#microdata
http://html5.digitalbazaar.com/specs/html5-warnings-diff.html#predefined-type
http://html5.digitalbazaar.com/specs/html5-warnings-diff.html#obsolete
http://html5.digitalbazaar.com/specs/html5-warnings-diff.html#the-source-element
http://html5.digitalbazaar.com/specs/html5-warnings-diff.html#alt
http://html5.digitalbazaar.com/specs/html5-warnings-diff.html#the-head-element
http://html5.digitalbazaar.com/specs/html5-warnings-diff.html#navigating-across-documents
http://html5.digitalbazaar.com/specs/html5-warnings-diff.html#the-bb-element

If Ian updates his spec, I can regenerate and republish an updated version of this document within an hour or two. The non-diff-marked specification can be found here:

http://html5.digitalbazaar.com/specs/html5-warnings.html

This version of the specification will be checked into W3C CVS when Mike(tm) clears its addition to the repository.

-- manu

-- Manu Sporny (skype: msporny, twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
[whatwg] Test results for xmlns:foo attribute preservation across all browsers
With thanks to the CTO of our company, Dave Longley, we have run a set of preliminary tests across a number of browsers to determine if and when xmlns:-style attributes are preserved. The test ensures that attributes originating in the markup of an HTML4 document are preserved by the HTML parser and remain accessible in the DOM. The xmlns:-style attributes are then accessed via pure JavaScript and DOM Level 1 mechanisms. Here is the test:

http://html5.digitalbazaar.com/tests/xmlns-attribute-test.html

We have verified that xmlns:-style attributes are preserved in the following browsers: Firefox 3.0.9, Firefox 3.5.1, Chrome 3.0.196, Internet Explorer 7.0, Internet Explorer 8.0, Safari 4.0, Opera 9, Arora 0.7.0, Konqueror 4.2, Epiphany 2.22, and Android 1.5 (T-Mobile G1).

Maciej, I believe that these results were what you were expecting. Ben, Shane, Mark: these results contradict what I asserted this morning during the RDFa telecon.

We have not been able to test a vanilla installation of IE 5.0 or IE 6.0 running on Windows XP SP2. The Multiple IE program is not guaranteed to work - the tests worked for us, but we may have accidentally been using the IE 7 browser engine.

Could members of the communities addressed in this e-mail please:

1. Review the test source code to ensure the test is accurate.
2. Submit test results for browsers that are not in the list above or on the test page. Please specify whether the test worked and include your browser version string (which is available on the test page).

Ian, is there language in the HTML5 specification (I looked and could not find any) that ensures that this current, widely supported browser behavior is documented in the spec?

-- manu

-- Manu Sporny (skype: msporny) (twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
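In spirit, the check boils down to whether DOM Level 1 `getAttribute()` can read an xmlns:-style attribute back out of the DOM. The sketch below illustrates that round-trip; the attribute name is an arbitrary example and, unlike the actual test page, this sketch sets the attribute via script rather than parsing HTML4 markup.

```javascript
// Illustrative version of the check described above: can an xmlns:-style
// attribute be read back via the DOM Level 1 getAttribute() call?
// (The real test verifies that the HTML parser preserves such attributes
// from source markup; this sketch only exercises the attribute round-trip
// on any DOM-like document object.)
function xmlnsAttributePreserved(doc) {
  const el = doc.createElement("div");
  el.setAttribute("xmlns:foaf", "http://xmlns.com/foaf/0.1/");
  return el.getAttribute("xmlns:foaf") === "http://xmlns.com/foaf/0.1/";
}
```

In a browser you would call `xmlnsAttributePreserved(document)` from the test page after the parser has built the DOM.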
Re: [whatwg] New HTML5 spec-editing tools released
Joe D Williams wrote:
> http://wiki.github.com/html5/spec I think this is really great, Manu. However, I think it needs the mode where only the new stuff is provided. That is, content that is identical to the main spec is just not really a factor in evaluating added, revised, or changed material. I think the main button for showing candidate text should be that just the new/revised is shown.

You're right - we do need an easy and automatic mechanism to create diffs between specs in the new HTML5 spec-editing tools. I was eventually going to coordinate with W3C staff to use their HTML diff-marking program (since they do this on a regular basis). There seems to be a variety of programs out there to perform HTML diffs:

http://esw.w3.org/topic/HtmlDiff

> in which, essentially, only the replacement text is shown. That may be an old way to do it, but it provides focus for the reviewer who is familiar with the rest of the document. Then, when the candidate is approved, if everything is hooked in, the editors might well choose to use the process you describe here to publish a complete current draft.

Other things that we could do to make the reviewing process easier:

* Generate very focused HTML diffs, showing only what changed and hiding everything that didn't change. Give minimal context (1-4 paragraphs at most), highlight terms.
* Provide a jump-to-next/previous-change button/index.

Thanks for the feedback, Joe - I'll try to put something together that provides automatic diffs during the coming weeks.

-- manu

-- Manu Sporny (skype: msporny) (twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
[whatwg] New HTML5 spec-editing tools released
The newest version of the microsection spec-editing tools has been made available:

http://wiki.github.com/html5/spec

These tools, microsplit and microjoin, are capable of:

* Taking Ian's latest HTML5 spec as an input document and splitting it up into microsections.
* Re-mixing, removing and adding microsections specified from another source (for example: RDFa, John Foliot's summary suggestions, etc.).
* Producing one or more output specifications (such as Ian's HTML5 spec, HTML5-rdfa, HTML5-johnfoliot-summary, etc.).

This process:

* Does not impact Ian's current editing workflow.
* Empowers additional editors to modify the HTML5 specification without stomping on each other's changes.
* Enables alternate HTML5 specifications to be authored while automatically updating the alternates with Ian's spec changes.
* Is currently used to produce the HTML5+RDFa specification.
* Provides a mechanism that can be used to generate specification language that is specific, and that can be used to form consensus around the HTML5 specification at the W3C.
* Enables thoughtful and well-mannered dissent.

There is even a pretty picture that describes the workflow:

http://wiki.github.com/html5/spec

Anyone is free to clone the repository, use the tools, generate remixed/updated/altered specifications and propose them as alternatives. I am seeking thoughts and suggestions about these tools - how they might help or hinder, as well as improvements that should be considered.

-- manu

-- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
[whatwg] New HTML5 spec GIT collaboration repository
There is now a public GIT repository containing the build process for producing the HTML5 specification. The build process and source for generating the HTML5+RDFa specification are also included. More details here:

http://github.com/html5/spec/

For instructions on how one might use the repository to contribute changes, see the README:

http://github.com/html5/spec/blob/e84bd4bd252ba7ec69cd9ef877eee78d3e90e2e4/README

A couple of quick bullet points:

* Any member of WHAT WG or HTML WG that has agreed to the W3C Patent and Licensing Policy may have commit rights to the repository.
* If you would like to collaborate on tools, test cases, examples or specification text, get a github account (free), join the HTML WG (free) and contact me.
* There are 3 suggestions on etiquette for contributors; please read them. In short - don't stomp on anyone's work without their express permission.

The tools to split and re-assemble the specification (as outlined in the Restructuring HTML5 document[1]) are not yet available. I'll be writing those tools and placing them in the HTML5 git repository in the coming month.

Here is the current process that is used to build the specification:

1. Copy Ian's latest spec from WHAT WG's SVN repository.
2. Apply changes via a Python script to the copy of Ian's spec (such as inserting the RDFa spec text).
3. Run the Anolis post-processor on the newly modified spec.

If you would just like to take the repository for a test spin, do the following:

git clone git://github.com/html5/spec.git

Let me know if there are any bugs, questions or concerns with the current setup. It will hopefully become more usable as the weeks progress.

-- manu

[1] http://html5.digitalbazaar.com/a-new-way-forward/

-- Manu Sporny (skype: msporny, twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
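Step 2 of the build process above - splicing extension text such as the RDFa section into a copy of the base spec source - might look roughly like this. The marker comment, file contents and function name are hypothetical illustrations; the actual build script is written in Python and its internals are not shown here.

```javascript
// Toy sketch of step 2 of the build process: inserting extension text
// (e.g. the RDFa spec text) into a copy of the base spec source at a
// marker comment. The marker string and this function are hypothetical,
// not the actual build script.
function spliceSection(specSource, marker, sectionText) {
  const index = specSource.indexOf(marker);
  if (index === -1) {
    throw new Error("marker not found in spec source: " + marker);
  }
  // Insert the new section immediately after the marker comment.
  const insertAt = index + marker.length;
  return specSource.slice(0, insertAt) + "\n" + sectionText +
         specSource.slice(insertAt);
}
```

The output of this step would then be fed to Anolis (step 3) to resolve cross-references and produce the final document.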
Re: [whatwg] New HTML5 spec GIT collaboration repository
Geoffrey Sneddon wrote:
> Manu Sporny wrote:
>> 3. Running the Anolis post-processor on the newly modified spec.
> Is there any reason you use --allow-duplicate-dfns?

Legacy cruft. There was a time when I had duplicate dfns while attempting to figure something else out. The latest commit to the master branch has it removed - thanks :)

> Likewise, you probably don't want --w3c-compat (the name is slightly misleading, it provides compatibility with the CSS WG's CSS3 Module Postprocessor, not with any W3C pubrules).

Ah, I thought it was required to generate some W3C-specific HTML. Removed as well, thanks for the pointer.

> On the whole I'd recommend running it with: --w3c-compat-xref-a-placement --parser=lxml.html --output-encoding=us-ascii

Done, those are the default flags that the HTML5 git repo now uses to build all of the specifications.

> The latter two options require Anolis 1.1, which is just as stable as 1.0. I believe those options are identical to how Hixie runs it through PMS.

Seeing as building Python eggs and using Mercurial is scary for some people, would it be okay if I included the Anolis app in the HTML5 git repository? Your license allows this, but I thought I'd ask first in case you wanted to collaborate on it in a particular way. I can either track updates from the Mercurial Anolis source repo, or give you commit access to the HTML5 git repo so that you can continue to modify Anolis there. Let me know which you would prefer...

-- manu

-- Manu Sporny (skype: msporny, twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
Re: [whatwg] A New Way Forward for HTML5 (revised)
Peter Kasting wrote:
> On Mon, Jul 27, 2009 at 12:06 PM, John Foliot jfol...@stanford.edu wrote:
>> That said, the barrier to equal entry remains high: http://burningbird.net/node/28
> I don't necessarily agree with most of Shelley's take on the situation. I do agree with the point that we need to make contributing to HTML5 easier for those without the technical skills required for source control. So, this response has nothing to do with the post that John linked to or Shelley's take on the situation (just making those points clear). I'm beginning to suspect that this whole line of conversation is specific to RDFa, which is a discussion I never took part in.

No, it is not specific to RDFa. If it were specific to RDFa, I would have said that it was specific to RDFa and wouldn't have gone to the trouble of writing the Restructuring HTML5 document. The RDFa discussion triggered my current thinking on how this spec is being put together, the halting of the XHTML2 work added to the concern, and others (both inside and outside WHAT WG) helped to focus the issues. They are all aspects of the document, but they are not end-goals.

Here's why I'm not that concerned about RDFa at this point in time: even if it isn't in the HTML5 specification, RDFa can be embedded, as-is, in XHTML5, and there exists an HTML5+RDFa spec that will probably be published as a WD. If this conversation were specific to RDFa, why would we go to the trouble of creating tools to edit the specification when the end-product (HTML5+RDFa) already exists?

As for the discussion on HTML5+RDFa - it's still going on, if you'd like to provide constructive criticism or feedback of any kind.

-- manu

-- Manu Sporny (skype: msporny, twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
Re: [whatwg] New HTML5 spec GIT collaboration repository
Cameron McCormack wrote:
> Manu Sporny:
>> 3. Running the Anolis post-processor on the newly modified spec.
> Geoffrey Sneddon:
>> Is there any reason you use --allow-duplicate-dfns?
> I think it’s because the source file includes the source for multiple specs (HTML 5, Web Sockets, etc.) which, when taken all together, have duplicate definitions. Manu’s Makefile will need to split out the HTML 5 specific parts (between the <!--START html5--> and <!--END html5--> markers). The ‘source-html5 : source’ rule in http://dev.w3.org/html5/spec-template/Makefile will handle that.

What a great answer, Cameron! I wish I had thought of that :) Yes, that will become an issue in time, and I was going to have a chat with Geoffrey about how to modify Anolis to handle that, as well as handling what happens when there are no definitions when building the cross-references (perhaps by having a formatter-warnings section in the file?).

I also spoke too soon, Geoffrey: --allow-duplicate-dfns is needed because of this error when compiling Ian's spec:

The term dom-sharedworkerglobalscope-applicationcache is defined more than once

I'm probably doing something wrong... I haven't had a chance to look at Cameron's Makefile pointer yet, so --allow-duplicate-dfns is in there for now. Here's the latest:

http://github.com/html5/spec/commit/16514d4ec9175fdf6a408789628817d81c44e3a9

-- manu

-- Manu Sporny (skype: msporny, twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
Re: [whatwg] A New Way Forward for HTML5 (revised)
Michael Enright wrote:
> On Thu, Jul 23, 2009 at 2:48 PM, Manu Sporny mspo...@digitalbazaar.com wrote:
>> I can git clone the Linux kernel, mess around with it and submit a patch to any number of kernel maintainers. If that patch is rejected, I can still share the changes with others in the community. Using the same tools as everybody else, I can refine the patch until there is a large enough group of people that agree, and implementation feedback to back up the patch, where I may have another chance of resubmitting the patch for re-review. This mechanism is a fundamental part of the community.
> I think you have incorrectly characterized the kernel maintenance process. For one thing, Linus Torvalds is the gatekeeper.

No, he is not /the/ gatekeeper, he is /a/ gatekeeper - one of several. There is one for each stable version of the Linux kernel:

* 2.0 David Weinehall t...@acc.umu.se
* 2.2 Alan Cox a...@lxorguk.ukuu.org.uk
* 2.4 Marcelo Tosatti mtosa...@redhat.com
* 2.6 Linus Torvalds torva...@osdl.org

There are also over 926 kernel maintainers[1] -- each directly responsible for a different part of the Linux kernel.

> For another thing, there is no sense that some sort of consensus will get a patch accepted if LT finds it deficient.

Linus does not have a hand in every patch that comes through the LKML. Just to clarify, he doesn't even look at many of the patches that come through LKML[2]. When asked if he had looked at MSFT's latest contribution, he replied:

“I haven’t. Mainly because I’m not personally all that interested in driver code (it doesn’t affect anything else), especially when I wouldn’t use it myself. So for things like that, I just trust the maintainers. I tend to look at code when bugs happen, or when it crosses multiple subsystems, or when it’s one of the core subsystems that I’m actively involved in (ie things like VM, core device resource handling, basic kernel code etc).
I’ll likely look at it when the code is actually submitted to me by the maintainers (Greg [Kroah-Hartman], in this case), just out of morbid curiosity.”

If you go back and read what I was saying, the argument I was making wasn't about consensus - it was about providing proper tools for collaboration.

> I think these inaccuracies, and characterization of the Linux kernel process as wide open, greatly degrade your argument.

I don't believe them to be inaccuracies, as I explain and back up with citations above. Where did I assert that the Linux kernel process is wide open? Even if I were completely wrong about how LKML operates, the question that I posed in that e-mail to David still stands: why do we not provide the distributed source control systems and specification generation tools to empower our community members to openly collaborate with one another?

I'm not proposing that we allow people to directly stomp all over Ian's specification - that wouldn't help anything. I am also not suggesting that Ian should change how he authors his HTML5 specification. What I'm proposing is that others should be able to easily create lasting alternate language, modify sections, remove sections or add sections IN THEIR OWN SANDBOX, and generate alternate specifications based on Ian's HTML5 specification. We should provide the tools to enable that.

-- manu

[1] http://www.linux-mag.com/cache/7439/1.html
[2] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob_plain;f=MAINTAINERS;hb=4be3bd7849165e7efa6b0b35a23d6a3598d97465

-- Manu Sporny (skype: msporny, twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
Re: [whatwg] A New Way Forward for HTML5 (revised)
Peter Kasting wrote:
> On Sun, Jul 26, 2009 at 6:26 PM, Manu Sporny mspo...@digitalbazaar.com wrote:
>>> If people sending emails containing proposals, and having the editor directly respond to all of those emails, frequently by changing the spec, does not give you the impression you can impact the specification, I'm not sure what would.
>> Having a distributed source control system in place would provide the tools to generate, modify and submit specification text for HTML5, and the ability to generate alternate HTML5 specification text.
> Are you saying that writing an email is too taxing, but checking text in and out of source control is not?

No, I'm not saying that. Writing an e-mail is great; submitting comments directly from the specification is better. There is nothing wrong with those avenues if one wants to contribute changes via Ian to the HTML5 specification. However, if one wants to disagree in a non-combative way with Ian's specification, or wants to work on a bigger set of changes to the HTML5 specification, we currently don't provide any tools to make that easy, and I think we should. Respectful disagreement is healthy; it can lead to better solutions.

If Ian disagrees with me on some specification text, he can create a solution (Microdata) that he feels is the best way forward and insert it into his specification. However, if I disagree with Ian on some specification text (RDFa), I currently don't have a way to create a specification that people can look at without having to know a great deal about how to build the specification in the first place. Ideally, people would be able to edit the HTML5 specification from a web browser, in their own developer sandbox, and provide alternate language or sections that could be mixed and matched with Ian's specification so that others may know the exact text that is being proposed. If not, see the final paragraph below.
The tools and mechanisms don't exist to do this easily in the HTML5 community. The process is unclear and undocumented. I'm working to resolve these issues.

> Ian has just added a way to submit comments immediately, anonymously, on the spec itself. Does this ameliorate your concern? I can hardly imagine a lower barrier to entry.

No, it doesn't, because I would have a hard time proposing the HTML5+RDFa specification using that interface, or the next HTML5+SVG specification, or other extensions to HTML5. The mechanism that I'm proposing allows people to effectively put their spec text where their mouth is and produce an alternate specification for review, without having to convince Ian to work on the new feature. Ian would still have the final say on whether or not the changes would be integrated into his specification. That wouldn't change.

> It seems like the only thing you could ask for beyond this is the ability to directly insert your own changes into the spec without prior editorial oversight. I think that might be what you're asking for. This seems very unwise.

I'm not proposing that anybody should be able to modify Ian's specification. Ian has made it very clear that he reserves that right. What I'm proposing is that the community should provide tools that are capable of generating multiple /alternate/ HTML5 specifications (based on Ian's specification) for consideration by the community. These might include alternate ways of doing structured data in HTML, a stripped-down version of HTML5, a web-developer-friendly version of the specification, and various other documents that are all based off of Ian's specification. I'll release the first set of tools for doing so tomorrow morning.

-- manu

-- Manu Sporny (skype: msporny, twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
Re: [whatwg] A New Way Forward for HTML5 (revised)
Maciej Stachowiak wrote:
> WebKit also has, arguably, a more open development model than either Linux or HTML5. There are many reviewers with the authority to approve a checkin, even more people with the ability to directly commit to the code after review, and even more people who have submitted some patches but don't yet have commit privileges. There is no single central gatekeeper, either for WebKit as a whole or for any particular version.

Hmm, didn't know there was no central gatekeeper for WebKit... cool.

> My conclusions:
> * Number of mailing list subscribers doesn't necessarily have a direct relationship to either project openness or project impact.
> * For a project with decent levels of both impact and openness, around a thousand mailing list subscribers is within expectations.
> * The Linux Kernel mailing list likely has a huge number of subscribers due to unique social and historical factors, not just due to the development model.

While I don't necessarily agree with your conclusions, I think all of the above are perfectly reasonable possibilities. I don't have any data to prove or disprove any of the conclusions you make above, so this is as far in the conversation as I can go :)

> I would also caution that, by their nature, standards projects are not quite the same thing as software projects. While the way HTML5 has been run is much more in the spirit of open source than many past Web standards, I'm not sure all the lessons can be applied blindly.

I don't think that the lessons should be applied blindly... I think they should be applied selectively and with great care. We don't want to destabilize the way HTML5 is currently being developed - but we do want to improve the process for giving feedback, get more people making meaningful contributions, make the process of contributing more harmonious, and hopefully accelerate the speed at which features can be developed. These are all direct or implied goals in the Restructuring HTML5 proposal[1].
-- manu [1] http://html5.digitalbazaar.com/a-new-way-forward/ -- Manu Sporny (skype: msporny, twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
[whatwg] A New Way Forward for HTML5
By halting the XHTML2 work and announcing more resources for the HTML5 project, the World Wide Web Consortium has sent a clear signal on the future markup language for the Web: it will be HTML5. Unfortunately, the decision comes at a time when many working with Web standards have taken issue with the way the HTML5 specification is being developed. The shutdown of the XHTML2 Working Group has brought to a head a long-standing set of concerns related to how the new specification is being developed.

The following page outlines the current state of development and suggests that there is a more harmonious way to move forward. By adopting some or all of the proposals outlined below, the standards community will ensure that the greatest features for the Web are integrated into HTML5.

http://html5.digitalbazaar.com/a-new-way-forward/

-- manu

-- Manu Sporny (skype: msporny, twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
Re: [whatwg] A New Way Forward for HTML5
as much language that could be re-used in each document as possible. Take the progress element for example:

http://www.whatwg.org/specs/web-apps/current-work/#the-progress-element

Web Developers, in general, don't care about the User Agent Requirements parts of that section, which constitute almost half of the content for the progress element. That is, they don't tend to care (in general) about how the browser does what it does; they would probably rather see examples of how to use the element correctly. Similarly, people that are creating user agents tend to not care about examples (in general) and would like to know more about the DOM interface. So, we have 3 possible microsections:

* progress-element-introduction
* progress-element-ua-requirements
* progress-element-examples

These microsections could be used to generate three separate documents from the same source:

HTML5: The Language - Syntax and Semantics would contain:
* progress-element-introduction
* progress-element-examples

HTML5: Parsing and Processing Rules would contain:
* progress-element-introduction
* progress-element-ua-requirements

HTML5: Ian Hickson's Specification would contain:
* progress-element-introduction
* progress-element-examples
* progress-element-ua-requirements

> I'm a little bit puzzled by the inclusion of the section Problem: Partial Distributed Extensibility: it seems to be a technical issue (although a far-reaching one) in a document otherwise about process issues. I'm not sure it belongs in this discussion.

Other reviewers of the document have said that as well. Perhaps it is not desirable to talk about it in this discussion (which is about process)... but I thought it should be mentioned (and it will be brought up in the coming months). I think that there are enough people who care about it to author some spec language to guarantee that round-tripping between HTML5 and XHTML5 is possible. That is, that xmlns: is preserved in the DOM in HTML5.
Although, you're right, that's a discussion for another time. Finally, regarding the section Problem: Disregarding Input from the Accessibility Community. I think some of the input that has been ignored or has been felt to be ignored is input that is difficult to act on. I agree with the majority of what you had to say. The issue is that, once again, the WAI community was unaware or felt like they were not being given the opportunity to effect change. It's a breakdown in communication on both sides. WHAT WG has not been clear about what we require, or we haven't made it clear to WAI. There are at least 8-10 people in WAI that are confused about why they are being repeatedly rejected - perhaps they're emotionally tied to their solutions, perhaps WHAT WG doesn't get what they're trying to accomplish. Those possibilities are beside the point: There is no easy mechanism for them to edit the specification, and until recently, they were under the general impression that Ian was the gatekeeper for the HTML5 specification. They should be able to edit /something/ lasting, publish it for review, and rise or fall on the merits and accuracy of their specification language. They are not being given the opportunity to do so. I hope this helps clarify some of the positions that the paper was attempting to express. It wasn't meant as an attack on the WHAT WG - it was meant as a feeler document to see how this community would react to the proposed changes. That is, it's good to get feedback before I spend the time (and money) to implement the changes (some of which we'll conclude are unnecessary). -- manu -- Manu Sporny (skype: msporny, twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
[whatwg] HTML5+RDFa first Editor's Draft published
The first public Editor's Draft of RDFa for HTML5 was published earlier today. You can view the draft in two forms: * [1] HTML5+RDFa Section (small 34K HTML document) * [2] Complete HTML5+RDFa Specification (very large 4MB HTML document) This blog post explains how this draft came to be, how it was published via the World Wide Web Consortium, and what it means for the future of RDFa and HTML5: http://blog.digitalbazaar.com/2009/07/13/html5rdfa/ -- manu [1]http://dev.w3.org/html5/rdfa/rdfa-module.html [2]http://dev.w3.org/html5/rdfa/Overview.html#rdfa -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
[whatwg] Link rot is not dangerous (was: Re: Annotating structured data that HTML has no semantics for)
Kristof Zelechovski wrote: Therefore, link rot is a bigger problem for CURIE prefixes than for links. There have been a number of people now that have gone to great lengths to outline how awful link rot is for CURIEs and the semantic web in general. This is a flawed conclusion, based on the assumption that there must be a single vocabulary document in existence, for all time, at one location. This has also led to a false requirement that all vocabularies should be centralized. Here's the fear: if a vocabulary document disappears for any reason, then the meaning of the vocabulary is lost and all triples depending on the lost vocabulary become useless. That fear ignores the fact that we have a highly available document store at our disposal (the Web). Not only that, but these vocabularies will be cached (at Google, at Yahoo, at The Wayback Machine, etc.). If a vocabulary document disappears, which is highly unlikely for popular vocabularies - imagine FOAF disappearing overnight - then there are alternative mechanisms to extract meaning from the triples that will be left on the Web. Here are just two of the possible solutions to the problem outlined:

- The vocabulary is restored at another URL using a cached copy of the vocabulary. The site owner of the original vocabulary either re-uses the vocabulary, or re-directs the vocabulary page to another domain (somebody that will ensure the vocabulary continues to be provided - somebody like the W3C).
- RDFa parsers can be given an override list of legacy vocabularies that will be loaded from disk (from a cached copy). If a cached copy of the vocabulary cannot be found, it can be re-created from scratch if necessary.

The argument that link rot would cause massive damage to the semantic web is just not true. Even if there is minor damage caused, it is fairly easy to recover from it, as outlined above. -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. 
blog: A Collaborative Distribution Model for Music http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
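The "override list" recovery mechanism described in the post above can be sketched in a few lines. This is a minimal illustration, not a real RDFa processor feature; the vocabulary URLs and cache file paths are hypothetical:

```python
# Sketch of a vocabulary override list: before dereferencing a vocabulary
# from the live Web, an RDFa processor consults a local map of vocabulary
# URLs to cached copies. All URLs and file paths below are illustrative.
VOCAB_OVERRIDES = {
    "http://xmlns.com/foaf/0.1/": "cache/foaf.rdf",          # cached copy
    "http://example.org/defunct-vocab#": "cache/defunct.rdf",  # vocab whose domain is gone
}

def resolve_vocabulary(url):
    """Prefer a locally cached copy when the live vocabulary may have
    disappeared; otherwise fall back to the original URL as usual."""
    return VOCAB_OVERRIDES.get(url, url)

resolve_vocabulary("http://xmlns.com/foaf/0.1/")  # -> "cache/foaf.rdf"
```

Because the override map is just a file on the local machine, each deployment can maintain its own copy; nothing about the mechanism requires a central repository.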
Re: [whatwg] Link rot is not dangerous
Kristof Zelechovski wrote: I understand that there are ways to recover resources that disappear from the Web; however, the postulated advantage of RDFa - you can go see what it means - simply does not hold. This is a strawman argument - more below... All this does not imply, of course, that RDFa is no good. It is only intended to demonstrate that the postulated advantage of the CURIE lookup is wishful thinking. That train of logic seems to falsely conclude that if something does not hold true 100% of the time, then it cannot be counted as an advantage. Example: Since the postulated advantage of RAID-5 is that a disk array is unlikely to fail due to a single disk failure, and since it is possible for more than one disk to fail before a recovery is complete, one cannot call running a disk array in RAID-5 mode an advantage over not running RAID at all (because failure is possible). Or: Since the postulated advantage of CURIEs is that you can go see what it means, and it is possible for a CURIE-defined URL to be unavailable, one cannot call it an advantage because it may fail. There are two flaws in the premises and reasoning above, for the CURIE case:

- It is assumed that for something to be called an 'advantage', it must hold true 100% of the time.
- It is assumed that most proponents of RDFa believe that you can go see what it means holds at all times - one would have to be very deluded to believe that.

The recovery mechanism, Web search/cache, would be as good for CURIE URLs as for domain prefixes. Creating a redirect is not always possible and the built-in redirect dictionary (CURIE catalog?) smells of a central repository. Why does having a file sitting on your local machine that lists alternate vocabulary files for CURIEs smell of a central repository? Perhaps you're assuming that the file would be managed by a single entity? If so, it wouldn't need to be, and that was not what I was proposing. Serving the vocabulary from one's own domain is not always possible, e.g. 
in case of reader-contributed content, This isn't clear - could you please clarify what you mean by reader-contributed content? and only guarantees that the vocabulary will be alive while it is supported by the domain owner. This case and its solution were already covered previously. Again - if the domain owner disappears, the domain disappears, or the domain owner doesn't want to cooperate for any reason, one could easily set up an alternate URL and instruct the RDFa processor to re-direct any discovered CURIEs that match the old vocabulary to the new (referenceable) vocabulary. (WHATWG wants HTML documents to be readable 1000 years from now.) Is that really a requirement? What about external CSS files that disappear? External Javascript files that disappear? External SVG files that disappear? All of those have something to do with the document's human/machine readability. Why is HTML5 not susceptible to link rot in the same way that RDFa is susceptible to link rot? Also, why 1000 years? That seems a bit arbitrary. =P It is not always practical either as it could confuse URL-based tools that do not retrieve the resources referenced. Could you give an example of this that wouldn't be a bug in the dereferencing application? How could a non-dereferenceable URL confuse URL-based tools? -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: A Collaborative Distribution Model for Music http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
[whatwg] Link rot is not dangerous
Tab Atkins Jr. wrote: Reversed domains aren't *meant* to link to anything. They shouldn't be parsed at all. They're a uniquifier so that multiple vocabularies can use the same terms without clashing or ambiguity. The Microdata proposal also allows normal URLs, but they are similarly nothing more than a uniquifier. CURIEs, at least theoretically, *rely* on the prefix lookup. After all, how else can you tell that a given relation is really the same as, say, foaf:name? If the domain isn't available, the data will be parsed incorrectly. That's why link rot is an issue. Where in the CURIE spec does it state or imply that, if a domain isn't available, the resulting parsed data will be invalid? -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: A Collaborative Distribution Model for Music http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
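A small sketch may make the disagreement concrete: CURIE expansion is a purely local operation driven by the prefix mappings declared in the document itself, so an unavailable prefix domain does not change the parse. This simplifies full CURIE processing considerably (no default prefix, no safe CURIEs), and the prefix table is illustrative:

```python
# CURIE expansion uses only the prefix map declared in the document;
# no network fetch of the prefix's domain is involved at parse time.
PREFIXES = {"foaf": "http://xmlns.com/foaf/0.1/"}

def expand_curie(curie, prefixes=PREFIXES):
    """Expand prefix:reference to a full URI. Returns None when the
    prefix was never declared (RDFa then emits no triple for it)."""
    prefix, _, reference = curie.partition(":")
    base = prefixes.get(prefix)
    return base + reference if base else None

expand_curie("foaf:name")  # -> "http://xmlns.com/foaf/0.1/name"
expand_curie("n:next")     # -> None: undeclared prefix, no triple
```

Whether the domain behind http://xmlns.com/foaf/0.1/ is reachable never enters into the expansion, which is the crux of Manu's question above.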
[whatwg] Google announces Microformats/RDFa support
Google announces Microformats/RDFa support http://lists.w3.org/Archives/Public/public-rdfa/2009May/0011.html -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: A Collaborative Distribution Model for Music http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
Re: [whatwg] Custom microdata handling added to HTML5 spec
Shelley Powers wrote: Since a new section detailing HTML5's handling of custom microdata has been added to the HTML5 spec http://dev.w3.org/html5/spec/Overview.html#microdata I've only had a brief chance to look over the HTML5 Microdata spec, but there is one big problem that overrides all of the other issues: the HTML5 Microdata spec is in direct conflict with planned RDFa extensions and will almost surely result in spurious triples being generated in RDFa processors in the future. We are currently working[1] on features to dynamically extend the base set of reserved words and the set of pre-defined prefixes through a mechanism called RDFa Profiles[2]. It is proposed that this mechanism would allow authors to do this in their documents:

<div profile="http://example.org/myprofile.html">
  ...
  <span property="description">A description for this page.</span>
  <span about="#me" property="name">Manu Sporny</span>
</div>

Note that 'description' and 'name' are not prefixed, but would be mapped to a full URI in the document listed by @profile. This allows the ease of Microformats-like markup but with all of the rigor of RDFa. The HTML5 microdata proposal, as it stands right now, would create numerous spurious triples if implemented and would violate the purpose of @property as it is being developed in the RDFa community. I'll have more comments on the microdata proposal based on the response to this e-mail. -- manu [1]http://www.w3.org/2009/04/30-rdfa-minutes.html#item04 [2]http://rdfa.info/wiki/RDFa_Profiles -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: A Collaborative Distribution Model for Music http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
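As a rough sketch of the RDFa Profiles idea (a document's @profile points at a page that maps bare terms to full URIs, so property="name" can be used without a prefix), term resolution might look like the following. The term-to-URI mappings are hypothetical choices for what such a profile could define, not taken from the actual proposal:

```python
# Hypothetical contents of the profile document referenced by @profile
# (e.g. http://example.org/myprofile.html); the URI choices are
# illustrative assumptions, not part of the RDFa Profiles draft.
PROFILE_TERMS = {
    "description": "http://purl.org/dc/terms/description",
    "name": "http://xmlns.com/foaf/0.1/name",
}

def resolve_property(value, terms=PROFILE_TERMS):
    """Map an unprefixed @property value through the profile. Terms the
    profile does not define resolve to no URI, and hence no triple."""
    return terms.get(value)

resolve_property("name")  # -> "http://xmlns.com/foaf/0.1/name"
```

The conflict Manu describes follows directly: a microdata processor that assigns its own meaning to bare @property-style terms would clash with terms resolved through a profile like this one.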
Re: [whatwg] Micro-data/Microformats/RDFa Interoperability Requirement
Ian Hickson wrote: On Thu, 7 May 2009, Manu Sporny wrote: That's certainly not what the WHATWG blog stated just 20 days ago for rel=license [...] The WHATWG blog is an open platform on which anyone can post, and content is not vetted for correctness. Mark can sometimes make mistakes. Feel free to post a correction. :-) Well, the problem is that I don't know who to correct - you or Mark. It's unclear to me whether it's the spec that needs correcting or the blog post. For rel-license, the HTML5 spec defines the value to apply to the content and not the page as a whole. This is a recent change to match actual practice and I will be posting about this shortly. Hmm, yes - after re-reading the definitions, they do differ... especially in how the hAudio Microformat uses rel=license. I find the HTML5 one to be very problematic. Microformats rel=license is better, and the RDFa use of rel=license is even better (I can go into the reasoning if those on the list are curious). For example, in HTML5, how do you express 20 items on a page, each with separate licenses? How do you differentiate a page that has 3 primary topics, each with a separate license? In short - what's the purpose of rel=license if a machine can't use it to help the person browsing identify important sections of a page? After all, it's only machine readable, isn't it? What's the sense in having rel=license if a machine can't be sure of the section of the page to which it applies? Surely this is what namespaces were intended for. Uhh, what sort of namespaces are we talking about here? xmlns-style namespaces? The idea of XML Namespaces was to allow people to extend vocabularies with new features without clashing with older features by putting the new names in new namespaces. It seems odd that RDFa, a W3C technology for an XML vocabulary, didn't use namespaces to do it. As you are aware, the RDF in XHTML 1.1 Task Force was created to figure out a way to express RDF via XHTML. 
The standard mechanism for extending XHTML is XHTML Modularization[1]. From the XHTML Modularization spec: This modularization provides a means for subsetting and extending XHTML... this specification is intended for use by language designers as they construct new XHTML Family Markup Languages. We used the standard mechanism, specifically designed to extend XHTML vocabularies, approved by the XHTML working group, the W3C TAG and a number of other W3C and public web groups, to extend the language. Had we been tasked with expressing RDF in XML, we would have called it RDF/XML[2]... :) For example, the way that n:next and next can end up being equivalent in RDFa processors despite being different per HTML rules (assuming an n namespace is appropriately declared). If they end up being equivalent in RDFa, the RDFa author did so explicitly when declaring the 'n' prefix in the default prefix mapping, and we should not second-guess the author's intentions. My only point is that it is not compatible with HTML4 and HTML5, because they end up with different results in the same situation (one can treat two different values as the same, while the other can treat two different values as different). It is only incompatible with HTML5 if this community chooses for it to be incompatible with HTML5. Do you agree or disagree that we shouldn't second-guess the author's intentions if they go out of their way to declare a mapping for 'n'? I don't think that's a relevant question. My point is that it is possible in RDFa to put two strings that have different semantics in HTML4 and yet have them carry the same semantics in RDFa. This means RDFa is not compatible with HTML4. Of course it's relevant - the whole reason there are two strings with the same semantics, in your rather contrived example, is because the author went out of their way to make the statement. This doesn't happen by accident - the web page author intended it to happen. 
More importantly - you've just made the same statement twice in RDFa and once in HTML4. I can't think of a single technically significant negative repercussion for generating a duplicate triple in a corner case. Why does one duplicated triple in a contrived example mean that the entirety of RDFa isn't compatible with HTML4? More importantly, if you see this as an issue, why don't you see the semantic difference between rel=alternate[3] in HTML4 and rel=alternate[4] in HTML5 as being an issue? That case is even worse, exactly the same string - entirely different semantics. If HTML4 validation is a concern, there's even a preliminary HTML4+RDFa DTD that is available: http://www.w3.org/MarkUp/DTD/html4-rdfa-1.dtd I do get your point - but why should we be concerned about it? Browser vendors would not accept having to resolve prefixes in attribute values as part of processing link relations. Why not? You would have to ask them. I tend not to argue with implementor feedback. If they tell me
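To make the per-item licensing question from this thread concrete, here is a minimal sketch of how RDFa scopes rel=license to individual items via @about, together with a toy extractor. The markup, subject identifiers, and license URLs are invented for illustration, and the extractor is far simpler than a real RDFa processor:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: two items on one page, each carrying its own
# license, scoped by RDFa's @about attribute on an ancestor element.
DOC = """<div>
  <div about="#photo1">
    <a rel="license" href="http://example.com/license-a">License A</a>
  </div>
  <div about="#photo2">
    <a rel="license" href="http://example.com/license-b">License B</a>
  </div>
</div>"""

root = ET.fromstring(DOC)
# ElementTree has no parent pointers, so build a child -> parent map.
parents = {child: parent for parent in root.iter() for child in parent}

def subject_of(el):
    """Walk up to the nearest ancestor carrying @about (the RDFa subject)."""
    while el is not None:
        if "about" in el.attrib:
            return el.attrib["about"]
        el = parents.get(el)
    return None  # no @about anywhere: the license applies to the page

licenses = {subject_of(a): a.attrib["href"]
            for a in root.iter("a") if a.attrib.get("rel") == "license"}
print(licenses)
# {'#photo1': 'http://example.com/license-a', '#photo2': 'http://example.com/license-b'}
```

Under a whole-page reading of rel=license, a consumer cannot distinguish these two licenses; with @about scoping, each item keeps its own.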
Re: [whatwg] Helping people seaching for content filtered by license
showing a page based on what Flickr is doing. Just changing rel=license is not going to be sufficient. You will have to express the same information, as shown above, in HTML5 if you would like to support the same functionality that Flickr supports today. This e-mail only covers the first 1/3rd of this single use case - as you can see, there are still many issues to resolve. I'll go through the rest of this use case e-mail when I have some more free time... hope this helps. :) -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: A Collaborative Distribution Model for Music http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
Re: [whatwg] Just create a Microformat for it - thoughts on micro-data topic
Ian Hickson wrote: On Tue, 5 May 2009, Manu Sporny wrote: Creating a Microformat is a very time consuming prospect, including: ... Microformats Due Diligence Rules ... Are you saying that RDF vocabularies can be created _without_ this due diligence? What I am saying is that the amount of due diligence that goes into a particular vocabulary should be determined by the community that will use the vocabulary. Some of these will be large communities and will require an enormous amount of due diligence, others will be very small communities, which may not require as much due diligence as larger communities, or they may have a completely different process to the Microformats process. The key here is that a micro-data approach should allow them to have the flexibility to create vocabularies in a distributed manner. Ian Hickson wrote: On Tue, 5 May 2009, Ben Adida wrote: Ian Hickson wrote: Are you saying that RDF vocabularies can be created _without_ this due diligence? Who decides what the right due diligence is? The person writing the vocabulary, presumably. Your stance is a bit more lax than mine on this. I'd say that it is the community, not solely the vocabulary author, that determines the right amount of due diligence. If the community does not see the proper amount of due diligence going into vocabulary creation, or the vocabulary does not solve their problem, then they should be free to develop a competing alternative. This is especially true because the proper amount of due diligence can easily become a philosophical argument - each community can have a perfectly rational argument to do things differently when solving the same problem. Your position, that the vocabulary author decides the proper amount of due diligence, is rejected in the Microformats community. In the Microformats community, every vocabulary has the same amount of due diligence applied to it. 
I think that this is a good thing for that particular community, but it does have a number of downsides - scalability being one of them. It creates a bottleneck - we can only get so many vocabularies through our centralized, community-based process and the barrier to creating a vocabulary is very high. As a result, we don't support small community vocabularies and only support widely established publishing behavior (contact information, events, audio, recipes, etc). So, maybe this requirement should be added to the micro-data requirements list: If micro-data is going to succeed, it needs to support a mechanism that provides easy, distributed vocabulary development, publishing and re-use. -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: A Collaborative Distribution Model for Music http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
Re: [whatwg] Just create a Microformat for it - thoughts on micro-data topic
Ian Hickson wrote: One organization for *all* topics, ever? I don't think that would really scale. Even for major languages, like HTML, we haven't found a single organisation to be a successful model. Then you, Ben, and I agree on this particular point: In order for semantic/micro-data markup to scale well, we must ensure that distributed vocabulary development, publishing and re-use is a cornerstone of the solution. Manu's list didn't mention anything about a single organisation Then I wasn't clear enough - I meant that the single organization was the Microformats community and that the list works for that particular community, but is not guaranteed to work for all communities. You could say that the single community could be the W3C or WHATWG - pushing vocabulary standardization solely through any one of these organizations would be the wrong solution, therefore we should be cognizant of that in this micro-data discussion. Surely all of the above apply equally to any RDFa vocabulary just as it would to _any_ vocabularly, regardless of the underlying syntax? Not necessarily... 6: Justifying your design is a key part of any language design effort also. Not doing this would lead to a language or vocabulary with unnecessary parts, making it harder to use. What happens when the people you're justifying your design to are the gatekeepers? What happens when they don't understand the problem you're attempting to solve? Or they disagree with you on a philosophical level? Or they have some sort of political reason to not allow your vocabulary to see the light of day (think large multi-national vs. little guy)? In the Microformats community, this stage, especially if one of the Microformat founders disagrees with your stance, can kill a vocabulary. 7: With any language, part of designing the vocabulary is defining how to process content that uses it. Not if there are clear parsing rules and it's easy to separate the vocabulary from the parsing rules. 
This should be a requirement for the micro-data solution: Separation of concerns between the markup used to express the micro-data (the HTML markup) and the vocabularies used to express the semantics (the micro-data vocabularies). 9: The most important practical test of a language is the test of deployment. Getting feedback and writing code is naturally part of writing a format. This statement is vague, so I'm elaborating a bit to cover the possible readings of this statement: Writing markup code (ie: HTML) should be a natural part of writing a semantic vocabulary meant to be embedded in HTML. Writing parser code (ie: Python, Perl, Ruby, C, etc.) should not be a natural part of writing a semantic vocabulary - they are wholly different disciplines. Microformats require you to write both markup code and parser code by design. As far as I can tell, the steps above are just the steps one would take for designing any format, language, or vocabulary. Are you saying that creating an RDF vocabulary _doesn't_ involve these steps? How is an RDF vocabulary defined if not using these steps? I don't believe that Ben is saying that at all - those steps are best practices and apply generally to most communities. However, they do not work for all communities and they do not work well when they are transformed from best practices into a requirement that all vocabularies must meet in order to be published. -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: A Collaborative Distribution Model for Music http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
[whatwg] Just create a Microformat for it - thoughts on micro-data topic
bcc: Public RDFa Task Force mailing list (but not speaking as a member) Kyle Weems' recent post[1] on CSSquirrel discusses[2] some of the more recent rumblings surrounding RDFa and Microformats as potential micro-data solutions. It specifically addresses a conversation between Ian and Tantek regarding Microformats: http://krijnhoetmer.nl/irc-logs/whatwg/20090430#l-693 Since I've seen this argument made numerous times now, and because it seems like a valid solution to someone that isn't familiar with the Microformats process, I'm addressing it here. The argument goes something like this: It looks like markup problem X can be solved with a simple Microformat. This seems like a reasonable answer at first - Microformats, at their core, are simple tag-based mechanisms for data markup. Most semantic representation problems can be solved by explicitly tagging content. What most people fail to see, however, is that this statement trivializes the actual implementation cost of the solution. A Microformat is much more than a simple tag-based mechanism and it is far more difficult to create one than most people realize. Creating a Microformat is a very time consuming prospect, including:

1. Attempting to apply current Microformats to solve your problem.
2. Gathering examples to show how the content is represented in the wild.
3. Gathering common data formats that encode the sort of content you are attempting to express.
4. Analyzing the data formats and the content.
5. Deriving common vocabulary terms.
6. Proposing a draft Microformat and arguing the relevance of each term in the vocabulary.
7. Sorting out parsing rules for the Microformat.
8. Repeating steps 1-7 until the community is happy.
9. Testing the Microformat in the wild, getting feedback, writing code to support your specific Microformat.
10. Draft stage - if you didn't give up by this point. 
I say this as the primary editor of the hAudio Microformat - it is a grueling process, certainly not for those without thick skin and a strong determination to complete even simple vocabularies. Each one of those steps can take weeks or months to complete. I'm certainly not knocking the output of the Microformats community - the documents that come out of the community have usually been vetted quite thoroughly. However, to hear somebody propose Microformats as a quick or easy solution makes me cringe every time I hear it. The hAudio Microformat initiative started over 2 years ago and it's still going, still not done. So, while it is true that someone may want to put themselves through the headache of creating a Microformat to solve a particular markup problem, it is unlikely. One must only look at our track record - output for the Microformats community is at roughly 10 new vocabularies[3] (not counting rel-vocabularies and vocabularies not based directly on a previous data format). Compare that with the roughly 120-150 registered[3], active RDF vocabularies[4] via prefix.cc. Now certainly, quantity != quality, however, it does demonstrate that there is something that is causing more people to generate RDF vocabularies than Microformats vocabularies. Note that this argument doesn't apply to class-attribute-based semantic markup, but one should not make the mistake that it is easy to create a Microformat. -- manu [1] http://www.cssquirrel.com/comic/?comic=16 [2] http://www.cssquirrel.com/2009/05/04/comic-update-html5-manners/ [3] http://microformats.org/wiki/Main_Page#Specifications [4] http://prefix.cc/popular/all -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: A Collaborative Distribution Model for Music http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
[whatwg] Micro-data/Microformats/RDFa Interoperability Requirement
bcc: Public RDFa Task Force mailing list (but not speaking as a member) Thinking out loud... It seems as though there is potential here, based on the recent IRC conversations about the topic[1] and the use cases[2] posted by Ian, that WHATWG's use cases/requirements, and therefore solution, could diverge from both the Microformats community as well as the RDFa community use cases/requirements/solution. There should be a requirement, as Microformats and XHTML1.1+RDFa have required, that a potential solution to this issue should be compatible with both the Microformats and RDFa approaches. This would mean, at a high-level: - Not creating ambiguous cases for parser writers. - Not triggering output in a Microformats/RDFa parser as a side-effect of WHATWG micro-data markup. - Not creating an environment where WHATWG micro-data markup breaks or eliminates Microformats/RDFa markup. I think these are implied since HTML5 has gone to great lengths to provide backward compatibility. However, since I'm not clear on the details of how this community operates, I thought it better to be explicit about the requirement. -- manu [1]http://krijnhoetmer.nl/irc-logs/whatwg/20090430#l-693 [2]http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-April/019374.html -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: A Collaborative Distribution Model for Music http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
Re: [whatwg] Please review use cases relating to embedding micro-data in text/html
Ian Hickson wrote: [bcc'ed previous participants in this discussion] Earlier this year I asked for use cases that HTML5 did not yet cover, with an emphasis on use cases relating to semantic microdata. I list below the use cases and requirements that I derived from the response to that request, and from related discussions. My primary concern right now is in making sure that these are indeed the use cases people care about, so that whatever we add to the spec can be carefully evaluated to make sure it is in fact solving the problems that we want solving. I've looked over the list a couple of times and it's a good introduction to the problem space. For those that are new to the discussion, some of these use cases are covered in more depth on the RDFa wiki[1]. The RDFa wiki includes example markup and Javascript pseudo-code describing consuming applications, but only for a few of the use cases. I'm elaborating on one use case every day until all of them are done (it should take about a month at this rate). Ian, would it help if I continue to elaborate on the RDFa use cases on the RDFa wiki? Or perhaps, I could merge these use cases into the RDFa wiki and elaborate on the WHATWG micro-data use cases first? I'd like to focus my effort on something that will benefit /both/ WHATWG and the RDFa community. Thoughts? -- manu [1] http://rdfa.info/wiki/rdfa-use-cases -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: A Collaborative Distribution Model for Music http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
Re: [whatwg] Alternative method of declaring prefixes in RDFa
Michael(tm) Smith wrote: Michael(tm) Smith m...@w3.org, 2009-01-19 17:40 +0900: Manu Sporny mspo...@digitalbazaar.com, 2009-01-18 19:18 -0500: prefix="foaf=http://xmlns.com/foaf/0.1/" URL for an archived mailing-list discussion about it? OK, I found this: http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009Jan/thread.html#msg74 http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009Jan/0074.html I believe that the thread started here, @prefix is a small part of the conversation: http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2008Sep/0001.html ... it's fairly involved, and a good bit has changed since the discussion back in September. The goal, though, is to provide a non-XML mechanism for declaring CURIE prefixes. -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Website Launch http://blog.digitalbazaar.com/2009/01/16/bitmunk-3-1-website-launch
Re: [whatwg] Alternative method of declaring prefixes in RDFa
Just a couple of clarifications - not trying to convince anybody of anything, just setting the record straight. Henri Sivonen wrote: Even though switching over to 'prefix' in both HTML and XHTML would address the DOM Consistency concern, and using them for RDF-like URI mapping as opposed to XML names would remove the issue of having to pass around compound values, and putting them on the same layer of the layer cake would remove most objections related to qnames-in-content, some usual problems with Namespaces in XML would remain:

* Brittleness under copy-paste due to prefixes potentially being declared far away from the use of the prefix in source.
* Various confusions about the prefix being significant.

There does not seem to be agreement or data to demonstrate just how significant these issues are... to some they're minor, to others major. I'm not saying it isn't an issue. It certainly is an issue, but one that was identified as having little impact. RDFa, by design, does not generate a triple unless it is fairly clear that the author intended to create one. Therefore, if prefix mappings are not specified, no triples are generated. In other words, no bad data is created as a result of a careless cut/paste operation. The author will notice the lack of triple generation when checking the page using a triple debugging tool such as Fuzzbot (assuming that they care).

* The problem of generating nice prefixes algorithmically without maintaining a massive table of known RDF vocabularies.

This is a best-practices issue and one that is a fairly easy problem to solve with a wiki. Here's an example of one solution to your issue: http://rdfa.info/wiki/best-practice-standard-prefix-names

* Negative savings in syntax length when a given prefix is only used a couple of times in a file. 
The cost of specifying the prefix for foaf, when foaf is only used once in a document, is: len('xmlns:foaf="http://xmlns.com/foaf/0.1/"') + len('foaf:') - len('http://xmlns.com/foaf/0.1/') == 18 characters The cost of specifying the prefix for foaf, when foaf is used two times in a document, is: len('xmlns:foaf="http://xmlns.com/foaf/0.1/"') + len('foaf:')*2 - len('http://xmlns.com/foaf/0.1/')*2 == -3 characters So, in general, your setup cost is recouped if you have more than one instance of the prefix in a document... which was one of the stronger reasons for providing a mechanism for specifying prefixes in RDFa. The reason that we used xmlns: was because our charter was to specifically create a mechanism for RDF in XHTML markup. The XML folks would have berated us if we had created a new namespace declaration mechanism without using an attribute that already existed for exactly that purpose. The easy way to avoid accusations of inventing another declaration mechanism is not to have a declaration mechanism. URIs already have namespacing built into their structure. You seem to be taking as a given that there needs to be an indirection mechanism for declaring common URI prefixes. As far as I can tell, an indirection mechanism isn't a hard requirement flowing from the RDF data model. We did not take the @prefix requirement as a given; it was a requirement flowing from the web authoring community (the ones that still code HTML and HTML templates by hand), the use cases, as well as the RDF community. I would expect the HTML5 LC or CR comments to reflect the same requirements if WHATWG were to adopt RDFa without support for CURIEs. After all, N-Triples don't have such a mechanism. You are correct - N-Triples do not... however, Turtle, Notation 3, and RDF/XML do specify a prefixing mechanism. Each does so because it was deemed useful by the people and working groups that created each one of those specifications. -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. 
blog: Bitmunk 3.1 Website Launch http://blog.digitalbazaar.com/2009/01/16/bitmunk-3-1-website-launch
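The character-count argument in the post above is easy to check mechanically. Here is a minimal sketch; the strings are taken directly from the post, nothing else is assumed:

```python
# Sketch of the prefix setup-cost arithmetic from the post above.
# The declaration is paid once per document; each use then saves the
# difference between the full IRI and the short "foaf:" prefix.

IRI = "http://xmlns.com/foaf/0.1/"
DECLARATION = 'xmlns:foaf="' + IRI + '"'   # paid once per document
PREFIX = "foaf:"                           # paid once per use

def net_cost(uses):
    """Extra characters spent (positive) or saved (negative) by
    declaring a prefix instead of writing the full IRI each time."""
    return len(DECLARATION) + uses * len(PREFIX) - uses * len(IRI)

print(net_cost(1))  # 18: a single use does not recoup the declaration
print(net_cost(2))  # -3: two or more uses come out ahead
```

Note that each additional use saves len(IRI) - len(PREFIX) = 21 characters, so the 39-character declaration pays for itself at the second use, which is the break-even point the post describes.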
[whatwg] Alternative method of declaring prefixes in RDFa (was Re: RDFa is to structured data, like canvas is to bitmap and SVG is to vector)
Toby A Inkster wrote: So RDFa, as it is currently defined, does need a CURIE binding mechanism. XML namespaces are used for XHTML+RDFa 1.0, but given that namespaces don't work in HTML, an alternative mechanism for defining them is expected, and for consistency would probably be allowed in XHTML too - albeit in a future version of XHTML+RDFa, as 1.0 is already finalised. (I don't speak for the RDFa task force as I am not a member, but I would be surprised if many of them disagreed with me strongly on this.) Speaking as an RDFa Task Force member - we're currently looking at an alternative prefix binding mechanism, so that this: xmlns:foaf="http://xmlns.com/foaf/0.1/" could also be declared like this in non-XML family languages: prefix="foaf=http://xmlns.com/foaf/0.1/" The thought is that this prefix binding mechanism would be available in both XML and non-XML family languages. The reason that we used xmlns: was because our charter was to specifically create a mechanism for RDF in XHTML markup. The XML folks would have berated us if we had created a new namespace declaration mechanism without using an attribute that already existed for exactly that purpose. That being said, we're now being berated by the WHATWG list for doing the Right Thing per our charter... sometimes you just can't win :) I don't think that the RDFa Task Force is as rigid in its positions as some on this list are claiming... we do understand the issues, are working to resolve issues or educate where possible, and desire an open dialog with WHATWG. -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Website Launch http://blog.digitalbazaar.com/2009/01/16/bitmunk-3-1-website-launch
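For illustration, a consumer of the proposed @prefix attribute could be sketched as follows. The whitespace-separated name=IRI pairing is an assumption based on the single example in the post (RDFa 1.1 as eventually published uses a slightly different "name: IRI" form), so treat this as a sketch of the idea rather than the final syntax:

```python
# Sketch: parse a prefix attribute of the form shown in the post,
# e.g. prefix="foaf=http://xmlns.com/foaf/0.1/", into a mapping.
# The whitespace-separated name=IRI pairing is an assumption here.

def parse_prefix_attribute(value):
    """Turn 'foaf=http://... dc=http://...' into a prefix -> IRI dict."""
    mappings = {}
    for pair in value.split():
        name, sep, iri = pair.partition("=")
        if sep and name and iri:
            mappings[name] = iri
    return mappings

print(parse_prefix_attribute("foaf=http://xmlns.com/foaf/0.1/"))
# {'foaf': 'http://xmlns.com/foaf/0.1/'}
```

The same mapping would be produced whether the attribute arrived via xmlns:foaf in XHTML or via @prefix in HTML, which is the point of making the mechanism available in both families.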
Re: [whatwg] Fuzzbot (Firefox RDFa semantics processor) (was: Trying to work out the problems solved by RDFa)
Calogero Alex Baldacchino wrote: That is, choosing a proper level of integration for RDF(a) support into a web browser might divide success from failure. I don't know what's the best possible level, but I guess the deepest may be the worst, thus starting from an external support through plugins, or scripts to be embedded in a webapp, and working on top of other features might work fine and lead to a better, native support by all vendors, yet limited to an API for custom applications There seems to be a bit of confusion over what RDFa can and can't do, as well as over the current state of the art. We have created an RDFa Firefox plugin called Fuzzbot (for Windows, Linux and Mac OS X) that is a very rough demonstration of how a browser-based RDFa processor might operate. If you're new to RDFa, you can use it to edit and debug RDFa pages in order to get a better sense of how RDFa works. There is a primer[1] to the semantic web and an RDFa basics[2] tutorial on YouTube for the completely uninitiated. The rdfa.info wiki[3] has further information. (sent to public-r...@w3.org earlier this week): We've just released a new version of Fuzzbot[4], this time with packages for all major platforms, which we're going to be using at the upcoming RDFa workshop at the Web Directions North 2009 conference[5]. Fuzzbot uses librdfa as the RDFa processing back-end and can display triples extracted from webpages via the Firefox UI. It is currently most useful when debugging RDFa web page triples. We use it to ensure that the RDFa web pages that we are editing are generating the expected triples - it is part of our suite of Firefox web development plug-ins. 
There are three versions of the Firefox XPI: Windows XP/Vista (i386) http://rdfa.digitalbazaar.com/fuzzbot/download/fuzzbot-windows.xpi Mac OS X (i386) http://rdfa.digitalbazaar.com/fuzzbot/download/fuzzbot-macosx-i386.xpi Linux (i386) - you must have xulrunner-1.9 installed http://rdfa.digitalbazaar.com/fuzzbot/download/fuzzbot-linux.xpi There is also very preliminary support for the Audio RDF and Video RDF vocabularies, demos of which can be found on YouTube[6][7]. To try it out on the Audio RDF vocab, install the plugin, then click on the Fuzzbot icon at the bottom of the Firefox window (in the status bar): http://bitmunk.com/media/6566872 There should be a number of triples that show up in the frame at the bottom of the screen as well as a music note icon that shows up in the Firefox 3 AwesomeBar. To try out the Video RDF vocab, do the same at this URL: http://rdfa.digitalbazaar.com/fuzzbot/demo/video.html Please report any installation or run-time issues (such as the plug-in not working on your platform) to me, or on the librdfa bugs page: http://rdfa.digitalbazaar.com/librdfa/trac -- manu [1] http://www.youtube.com/watch?v=OGg8A2zfWKg [2] http://www.youtube.com/watch?v=ldl0m-5zLz4 [3] http://rdfa.info/wiki [4] http://rdfa.digitalbazaar.com/fuzzbot/ [5] http://north.webdirections.org/ [6] http://www.youtube.com/watch?v=oPWNgZ4peuI [7] http://www.youtube.com/watch?v=PVGD9HQloDI -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Fibers are the Future: Scaling Past 100K Concurrent Requests http://blog.digitalbazaar.com/2008/10/21/scaling-webservices-part-2
Re: [whatwg] Generic Metadata Mechanisms (RDFa feedback summary wikipage)
Ian Hickson wrote: If people who understand this stuff better could fill in the blanks that would be great. Ian, we'll take some time in the coming months to fill in the details and reorganize the page a bit. What you are asking for is in line with what the Microformats community requires for a spec to go forward. All of this information exists for RDFa in the RDFa Primer[1] and the RDFa Specification[2]. We will pick bits and pieces out of each document and format them in a way that makes sense for this community. For example: 1.1 What is the problem we are trying to solve? When visiting a website, it is difficult for browsers to do much more than display the information on the page. Extracting concepts from the page is currently very difficult because a scalable syntax to express those concepts does not exist in HTML5. For example, if one were to visit a music blog, it would be beneficial if the browser, or a browser plugin, could detect all of the songs on the web page and re-mash the data in a way that benefits the person browsing. Actions such as queuing the songs into a playlist in the user's favorite music player, searching for more information on the band behind a particular song, finding the cheapest price for an album listed on the page, auto-blogging details about the song, auto-adding the song to a wish-list, and sharing the song info with a friend all require the ability to reliably extract the data from the page that is being viewed. - There will, of course, be many more examples of the problem following the same format as shown above. Is this what you had in mind for the problem description? If so, give us some time and we'll be able to refine that page in the coming months. -- manu [1] http://www.w3.org/TR/xhtml-rdfa-primer/ [2] http://www.w3.org/TR/rdfa-syntax/ -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
[whatwg] Self-imposed RDFa cool-down period
This will be the last post about RDFa that both Ben and I will be making for at least the next couple of weeks. The conversation started in this community regarding Creative Commons' decision to use RDFa to express not only licensing concerns, but other important metadata surrounding creative works. There are a number of us that were pulled into the discussion by members of this community, and it is unfortunate that our explanations of the semantic data expression requirements have been understood as beat(ing) everyone over the head[1]. That was not our intention and we have endeavored to keep our argumentation logical and civil. We are instituting a self-imposed cool-down period regarding RDFa. The threads are getting very long at this point and the discussion seems to be, against our best intentions, annoying some of the people on this mailing list[1]. The goal of these threads was always to educate the HTML5 community about the requirements of semantic data expression in HTML[2] and the possibilities of semantics expression using RDFa. I believe that there has been enough discussion to see that goal realized among those that have kept an open mind to the idea of semantics expression in HTML. We invite those of you that are interested in HTML5+RDFa (in any form) to continue the discussion on the [EMAIL PROTECTED] mailing list[3]. The RDFa Task Force has two to three more months of W3C process to go through before RDFa becomes an official standard. We must focus our effort on seeing that through first. After RDFa is a W3C standard, we will create a request to integrate it into HTML5 and discuss the matter in an official capacity with the WHATWG community at that time. We'd like to thank those that have read these threads and contributed to the discussion for taking the time to do so. While we may not agree on the methods of making the Web a better place, it is evident that all of us care deeply about that shared goal. 
-- manu [1]http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-August/016135.html [2]http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-August/015957.html [3]http://lists.w3.org/Archives/Public/public-rdfa/ -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
Re: [whatwg] RDFa statement consistency
Kristof Zelechovski wrote: I think RDFa has already happened: you know what it is and how to use it. Yes, you are correct - RDFa has, more or less, already happened. It will be an official W3C standard in the next couple of months and will be supported in XHTML1.1 and XHTML2. Some are currently working to get it integrated into a new HTML4 DTD as well. There's no putting the semantic genie back in the bottle. :) You can embed it in XHTML. Why is having it in HTML necessary for creating statistical models? I was speaking generally in that case because I thought you were speaking generally... this seems to have caused confusion. Apologies for that. If we are to be very specific, you do not /need/ RDFa attributes in HTML5 to create statistical semantic models. You could build the same models from all of the HTML4+RDFa, XHTML1.1+RDFa, and XHTML2 documents out there. It would also be easier to check those documents for NLP/semantic correctness with the RDFa markup embedded in the document. Statistical models are just one approach among the many that would be used to perform NLP correctness verification. You would not be able to depend solely on statistical models. So while you are technically correct, not having any sort of robust semantic expression mechanism in HTML5 deprives the content of multiple paths to validating the document semantics: - The page's NLP-based semantic model verified against the RDFa model. - The page's statistical model used to verify parts of the RDFa model. -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
Re: [whatwg] Vocabulary ambiguity with non-namespaced semantic languages (was: Ghosts from the past and the semantic Web)
Silvia Pfeiffer wrote: I am still struggling with the need for using URIs to address resource properties rather than resources. How do you differentiate Job Title from Book Title in a way that scales to the web? You need unique identification, and a way to explore more information about the data field. Kristof Zelechovski wrote: What is wrong with job-title for Job Title and title for Book Title? (The reason is that Job Title really is not a person's title at all). The correct relations are: A book has exactly one title. A person has several jobs (especially when she is a scientist or a doctor in Poland :-)). Each job has exactly one title. So job title is a shortcut property while book title is a real property. There are several bad consequences of your solution to the namespace problem. Take this scenario, for example - an author disagrees with your vocabulary because it conflicts directly with their own: That's fine, and you have defined title for your own use... but what about my use! I want to use title as a means to identify the title of a job position in my resume vocabulary. There is no need to use job-title, as it is redundant in my vocabulary. Your definition of title (A book has exactly one title) has nothing to do with my definition of title (The title of the position that was held for a period of time). Title most certainly is not a shortcut property in my vocabulary. Without namespacing of some kind, this sort of problem - as found in the Microformats community - becomes more and more prevalent as more vocabularies are created (like the example given above). If you choose namespaces that contain arbitrary separators, such as resume.title, you still have the issue of potential conflicts, and you're inventing a new namespacing mechanism for the web - which is bad design. 
If you use URIs, such as http://purl.org/resume/title, you are using a concept familiar to all users of the Web, re-using the design that the Web uses for uniquely identifying resources, and removing all potential conflicts regarding vocabulary term collision. -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
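The collision argument above can be made concrete with a toy sketch. The resume URI is the one quoted in the post; the Dublin Core URI is brought in here purely for illustration as a real-world "book title" term. Both end in the word "title", yet remain globally distinct:

```python
# Toy illustration: two vocabularies can both name a term "title"
# without colliding, because the full URI - not the trailing word -
# is the real identifier.
job_title = "http://purl.org/resume/title"     # example URI from the post
book_title = "http://purl.org/dc/terms/title"  # Dublin Core's title term

# As bare names, the two terms collide...
assert job_title.rsplit("/", 1)[-1] == "title"
assert book_title.rsplit("/", 1)[-1] == "title"
# ...as URIs, they are unambiguous, and each can be dereferenced
# to discover what its vocabulary means by "title".
assert job_title != book_title
print("bare names collide; URIs do not")
```

This is exactly the property that dotted names like resume.title lack: two authors can mint resume.title independently with no authority to keep them distinct, whereas URI ownership already provides that authority.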
Re: [whatwg] RDFa Features
Kristof Zelechovski wrote: Manu Sporny wrote: 3. We needed a solution that would map cleanly to non-XML family languages. In hindsight, we should have picked @prefix instead of @xmlns to define prefixes, but that ended up going through. We're looking at alternative mechanisms, such as having a @prefix attribute, to declare prefixes in RDFa 1.1. While we are at that, why not resurrect semantic [profile] instead of adding syntactic [prefix]? Here's the definition for @profile from HTML 4.01, it's the same for XHTML 1.0 and 1.1: profile = uri [CT] This attribute specifies the location of one or more meta data profiles, separated by white space. For future extensions, user agents should consider the value to be a list even though this specification only considers the first URI to be significant. Profiles are discussed below in the section on meta data. ... The profile attribute of the HEAD specifies the location of a meta data profile. The value of the profile attribute is a URI. User agents may use this URI in two ways: * As a globally unique name. User agents may be able to recognize the name (without actually retrieving the profile) and perform some activity based on known conventions for that profile. For instance, search engines could provide an interface for searching through catalogs of HTML documents, where these documents all use the same profile for representing catalog entries. * As a link. User agents may dereference the URI and perform some activity based on the actual definitions within the profile (e.g., authorize the usage of the profile within the current HTML document). This specification does not define formats for profiles. This example refers to a hypothetical profile that defines useful properties for document indexing. The properties defined by this profile -- including author, copyright, keywords, and date -- have their values set by subsequent META declarations. 
So, there are several reasons that we didn't use @profile and used xmlns:PREFIX instead. Even if we don't use xmlns:PREFIX, we should use @prefix for the following reasons: 1. The contents of @profile are supposed to be used for known conventions associated with the URI specified, or dereferenced for some activity based on the contents of the reference - neither of which maps cleanly to the concept of prefixes. 2. We need to be able to define prefixes in-line, anywhere in the document if needed - @profile does not allow use outside of the HEAD element. 3. @profile is currently used for specifying the location of GRDDL transforms and other such instructions. In other words, it is used for different purposes than specifying prefixes. Redefining its scope was unnecessary and could break legacy markup. -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
Re: [whatwg] RDFa statement consistency
Kristof Zelechovski wrote: HTML5 is too crucial as a technology to allow arbitrary experimentation. Please refrain from making wildly opinionated and loaded comments such as this without logically backing up your argument, Kristof. Many on this list and off this list would view a number of HTML5 features as arbitrary experimentation. Experimentation is fine, and arbitrary is a matter of opinion when applied in broad strokes. The important thing is to discuss the benefits and drawbacks of each decision without resorting to wording such as yours. I would rather wait for a consistency checker to exist, at least approximately and conceptually, before having alternate content streams in HTML. Maybe that is just me. Yes, it does seem to be just you that is making this argument that the Web must be completely consistent at all times. Perhaps if you could outline exactly what your consistency checker would check, then we could make some progress on whether or not it is achievable. Do you think that we should also have a consistency checker for natural language used in a web page? Is your idea of a valid consistency checker to solve the Natural Language Processing (NLP) problem first? -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
Re: [whatwg] RDFa uses CURIEs, not QNames (was: RDFa statement consistency)
Anne van Kesteren wrote: The idea and premise of RDF is sort of attractive (people being able to do their own thing, unified data model, etc), though I agree with others that the complexity (lengthy URIs, ***qname***/curie cruft) is an issue. We do not use QNames in RDFa - there is no QName/CURIE cruft! We went to great lengths to avoid QNames; please take the time to understand why (it's because of the cruft that you complain about). Here's an excerpt from the section in the CURIE spec that explains why we don't use QNames in RDFa[1]: * CURIEs are designed from the ground up to be used in attribute values. QNames are designed for unambiguously naming elements and attributes. * CURIEs expand to any IRI. QNames are treated as value pairs, but even if those pairs are combined into a string, only a subset of IRIs can be represented. * CURIEs can be used in non-XML grammars, and can even be used in XML languages that do not support XML Namespaces. QNames are limited to XML Namespace-aware XML applications. The syntax document explains each bullet point more clearly in the Introduction section[1]. In other words: 1) CURIEs always map to an IRI. 2) They don't have any constraints on the reference portion (the part after the colon). 3) They can be used outside of XML languages. 4) They were designed specifically for the purpose of compacting IRIs in attribute values. RDFa is not encumbered by any QName cruftiness. -- manu [1]http://www.w3.org/MarkUp/2008/ED-curie-20080617/#s_intro -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
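Point 1 above - that a CURIE always maps to an IRI - amounts to plain string substitution, which is what distinguishes it from a QName's (namespace, local-name) pair. A minimal sketch of the expansion, with an assumed prefix table:

```python
# Minimal sketch of CURIE -> IRI expansion. Unlike a QName, the result
# is an ordinary IRI string, and the reference part after the colon is
# unrestricted: it is simply concatenated onto the prefix's IRI.

PREFIXES = {"foaf": "http://xmlns.com/foaf/0.1/"}  # assumed declarations

def expand_curie(curie, prefixes=PREFIXES):
    prefix, sep, reference = curie.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + reference
    return curie  # no declared prefix; leave the value untouched

print(expand_curie("foaf:name"))
# http://xmlns.com/foaf/0.1/name
```

Because the reference part is unrestricted, a CURIE like foaf:a/b?c=d still expands to a valid IRI, which a QName could never represent; that is the "CURIEs expand to any IRI" bullet in concrete form.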
Re: [whatwg] RDFa Features
Tab Atkins Jr. wrote: Ben Adida wrote: Well, for one, if you've got prefixes, you just need to change where your prefix points :) So that's kinda nice. That's the issue. We're talking *legacy* pages, which means that updates, even fairly easy ones, probably aren't going to happen. For legacy pages, the answer to your question lies here: http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-August/016094.html -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
Re: [whatwg] RDFa Features
Kristof Zelechovski wrote: While Google owns the Web, it is not the core of the Web. If Google goes down, Google users cannot use Google any more. Sure, there are quite a few of them; but Google is a big fish accordingly. On the other hand, if Verizon or InterNIC goes down, we have a blackout, possibly with street riots and people plundering stores. That shows Verizon is an authority, Google is not, although, in general, Google is more useful. I believe in the general sanity in the architecture of the Web. I keep asking these questions because I would like it to stay. Kristof - you will have to be more precise. Could you please outline (in a short-form bulleted list) every specific issue that you have with RDFa? A parallel short-form bulleted list of all RDFa features that you enjoy would also be welcome. I believe that this thread has been going long enough for you to formulate an educated opinion about what you do like and what you don't like about RDFa. Getting such a list together will also help us address your grievances better. Here's an example of what I'd like to see: RDFa Pros: - Allows semantic metadata markup in HTML family languages. - Does not use QNames RDFa Cons: - Mixes semantics with HTML structure - Uses CURIEs to specify prefixes - Does not work like CSS and does not re-use @class and so on... If any of you that have been involved in all of these discussions can weigh in with a list like that, it would be very helpful. I can put together a set of issues that the HTML5 community is concerned about and we can start discussing them in the W3C RDFa Task Force. -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
Re: [whatwg] RDFa Features
The other advantage of unique prefixes over URIs is the one you mention: they are not dereferenceable. As has been mentioned on this list, that means nobody (human or system) will attempt to reference them, either by mistake or in the hope of finding something there. All URLs listed as vocabulary terms in RDFa should be dereferenceable - that's the whole point. What you are listing as an advantage has been identified as a severe disadvantage for users. We are strongly suggesting that the document that is dereferenced have both a machine-readable component (the RDF vocabulary) and a human-readable one (an explanation of the vocabulary and all of its terms). Take a look at the following vocabulary for an example of what should be at the end of an RDFa vocabulary link: http://purl.org/media/video#Recording The vocabulary term above is dereferenceable, and if you put the term into a web browser, you will get both a machine-readable and a human-readable definition of that term. Contrast that functionality with the following as a namespaced vocabulary term that you cannot dereference: foo.blah If you wanted to know what the definition of foo.blah is, there is no way for you to do so other than relying on a search engine to find the vocabulary for you. The page that you land on might not even be the correct page for the vocabulary that was used to mark up the original page that you were viewing. Using URLs allows one to specify, with great accuracy, both for machines and for humans, the meaning behind semantic statements. So unique prefixes have 2 advantages over URIs; therefore they cannot be dismissed as unnecessary merely because URIs exist. The approach wasn't dismissed - we chose a better solution that addressed all of the needs of a reliable knowledge representation system, something that was scalable without needing a central authority, and a system that could be adopted by ordinary folks on the Internet. 
Of course those advantages don't necessarily apply to all users in all situations; there may be users who don't find the above advantageous, and prefer URIs for other reasons. That's OK, because such users can still choose to use a URI as their unique prefix. (And there can be a rule which says you are only allowed to have something which is syntactically a URL as a unique prefix if you own that URL.) You can always do something like xmlns:foo="whatever-..." property="foo:blah-fnurt-jackalope", although doing so is highly frowned upon for the reasons explained above: 1. The link isn't dereferenceable. 2. It uses a namespace mechanism that is not used anywhere else on the web. 3. It has a much higher learning curve than plain old URIs. That suggests that giving users the freedom to use either URIs or any other prefixes of their choice is superior to forcing them to use URIs, surely? I think we do give people the freedom to use any other sort of prefix of their choice, within reason. I do not think that using anything other than a dereferenceable URL is superior, for the reasons outlined above. Does that make sense? Do you have any further concerns with these responses? -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
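A dereferenceable term like the one quoted above is simply a document URL plus a fragment, and a consumer can split the two with the standard library before fetching the vocabulary document. A small sketch:

```python
# A dereferenceable vocabulary term is an ordinary URL plus a fragment:
# fetch the document part, then look the fragment up inside it.
from urllib.parse import urldefrag

term = "http://purl.org/media/video#Recording"  # term from the post
doc_url, fragment = urldefrag(term)
print(doc_url)   # the vocabulary document to fetch
print(fragment)  # the term to look up within that document
```

By contrast, there is nothing mechanical a consumer can do with a bare name like foo.blah; the dot is not a dereference instruction, which is the asymmetry the post is pointing at.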
Re: [whatwg] RDFa Features
Kristof Zelechovski wrote: Has anyone considered having a URI in an entity in order to avoid typing it over and over? e.g. <!DOCTYPE html [ <!ENTITY myVocab "http://www.example.com/vocab/"> ]> and later property="&myVocab;myPred"? Yes, we did discuss this at great length in the RDFa community and to a lesser extent in the Microformats community. The Microformats community decided to not use namespaces at all, and the RDFa community ended up using CURIEs for URI expression. RDFa specifically did not choose QNames for the following reasons: 1. We believe QNames should not be used in attribute values. 2. QNames are really restrictive in what can be used as a reference. 3. QNames do not expand to URIs; they map to a tuple, and RDFa (and many other approaches that use URIs as resources) needs things to map to URIs. The CURIE spec, which is a really cool solution to the problem of compact URI expression in all HTML family languages, explains these points in more detail: http://www.w3.org/MarkUp/2008/ED-curie-20080617/ To answer your question more directly, there were several issues with the ENTITY approach for defining prefixes: 1. The markup is ugly and would be difficult for most regular web folks to grasp. Not many people that modify web pages have to deal with entity declarations, or with using entities outside of the &lt;/&gt; realm. This would harm adoption. 2. It is becoming more and more common, when using online blogging, CMSes, and publishing software, that you don't have access to the entire HTML document as you edit it. The @xmlns:NAMESPACE and @prefix approach takes this limitation into account. It allows one to do the following: ...JANE'S BLOG POST... ...BOB'S BLOG POST... ...MY BLOG POST... <div xmlns:foaf="http://xmlns.com/foaf/0.1/" about="#me"> This is my blog entry. My name is <span property="foaf:name">Křištof Želechovski</span>. </div> ...END MY BLOG POST... 
Keep in mind that we still must modify the systems described above to not strip out RDFa attributes from elements, but that work is merely a small code change to the HTML processing code in most publishing platforms. 3. We needed a solution that would map cleanly to non-XML family languages. In hindsight, we should have picked @prefix instead of @xmlns to define prefixes, but that ended up going through. We're looking at alternative mechanisms, such as having a @prefix attribute, to declare prefixes in RDFa 1.1. -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
Re: [whatwg] RDFa mis-use and abuse (was: RDFa Basics Video (8 minutes))
Kristof Zelechovski wrote: <span about="#jane" typeof="foaf:Person" property="foaf:name">Jane</span> <span about="#jane" property="foaf:loves" resource="#mac">hates</span> <span about="#mac" typeof="foaf:Person" property="foaf:name">Mac</span>. Ugh. That really hurt. (I've corrected @instanceof to @typeof in your example above - @instanceof is old and deprecated - apologies, but the video linked to was made before we made the change to RDFa.) You didn't really specify what your point was, Křištof. I'm guessing that you are alluding to the fact that people will mis-use RDFa. If that was your point, then yes, I agree with you - a small subset of people will mis-use RDFa. However, that is a problem with all technologies - there is ample room for good uses, bad uses and mis-uses when dealing with any new, revolutionary technology. For example, the automobile allowed for a massive shift in the way we use transportation. However, road-related incidents kill roughly 1.2 million people each year, and around 50 million people per year are injured as a result of this technology[1]. Even with this massive cost in life and limb, we continue to use the technology. We continue usage because it is more economically beneficial for us to do so than it is for us to do without the technology. We are privileged in this case because RDFa is certainly not going to cause as much bodily injury as the automobile. :) In your example above, someone has carelessly modified the meaning of the semantic statement but has not modified the meaning of the human statement. They have mis-communicated - something that we do in language all the time. I am not defending their mistake, just acknowledging that mistakes happen and we must deal with the consequences. The question is, how do we address this issue? Your answer seems to be implying People are going to mis-use RDFa, so let's not give them the option of using RDFa! - which is a very extreme and authoritarian way to look at the issue. 
If you believe that most web authors are sheep that are not capable of doing their job or hobby correctly, it will be difficult to trust them to use any web-based technology. People will mis-use HTML5 features. They will abuse the video tag, they will screw up the progress meter, and they will make a mess of the Worker API. This does not mean that you should not provide the mechanism. There will be many others that will gain utility from the feature. That is really what all of us are trying to do here - raise the overall utility of the tools available to web designers. In the case of RDFa, we are going from something that they can barely do today to a full specification of how to express semantics in HTML family languages. So, let's make sure that we don't fall down the dangerous "forbidden technology" rabbit hole - that argument rarely works in the interest of progress, or of those who use it as a means to halt progress. -- manu [1]http://www.who.int/features/2004/road_safety/en/ -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
Re: [whatwg] RDFa statement consistency (was: RDFa Basics Video (8 minutes))
Kristof Zelechovski wrote: We have two options for having both human-readable and machine-readable information in a document: write the structure and generate the text or write the text and recover the structure. At the very least, if you insist on having both, there must be a mechanism to verify that they are consistent. As both Ben and Toby already pointed out, there doesn't have to be any sort of consistency mechanism. The web today works without a logical consistency checking mechanism in any of the current publishing languages. We are not striving for perfect semantic data with RDFa on day one - we are striving for an easy mechanism for expressing semantic data in HTML family languages. It is understood that there may be data that is corrupt, and that is okay. There is an area of semantic web development that deals with the concepts of provenance and validation. You can even apply statistical models to catch logical inconsistencies, but those models need a core set of semantic information to be of any use. RDFa must happen before any sort of statistical model can be created for checking logical consistency between HTML text and semantic data. The other approach is the use of Natural Language Processing (NLP) to address the HTML/RDFa logical consistency issue. Solving the problem of NLP has proven to be quite a tough nut to crack. Computer scientists have been working on the problem for decades and a general-purpose solution is nowhere in sight. The good news is that we don't need to solve this problem, as the web contains logical inconsistencies in its content today without affecting the positive global utility of the system. Assume, however, that the NLP problem is solved. RDFa still provides a mechanism that is useful in a post-NLP web: that of content validation. There will come a time when we mis-state facts on a web page but the machine-generated semantic data on the page is correct. 
The counter to this scenario is also possible - where we state the correct facts on a web page, but the machine-generated semantic data on the page is incorrect. In both cases, the consistency verification process would use both the RDFa data and the NLP data in determining the location of the inconsistency. A human could then be brought in to correct the inconsistency - or a solution could be reasoned out by the computer using the same statistical method as mentioned previously. The possibilities don't stop there - we can utilize RDFa to get to the stage of NLP by using RDFa markup to teach computers how to reason. Think of RDFa as an NLP stepping-stone that could be used by researchers in the field. Building NLP databases is an expensive, manual, and thus time-consuming process. However, humans are publishing data that would be useful to an NLP system every day - in blog posts, Wikipedia entries, e-commerce sites, news feeds, maps and photo streams. RDFa-assisted NLP would be one such approach that researchers could use to get us to the eventual goal of true NLP. I hope that outlines the possibilities and shows how RDFa plays a major part in realizing each scenario outlined above. Any questions on the concepts outlined in this e-mail? -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
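The verification loop described in the post above can be sketched in a few lines. The check below is a deliberately naive stand-in for real NLP (bare substring matching on the predicate label); the function name and the sample sentences are invented for illustration only.

```python
# Toy stand-in for the RDFa/NLP consistency check described above.
# A real system would apply NLP to the page text; here we simply ask whether
# the human-readable prose mentions the predicate asserted in the triple.
def is_consistent(sentence: str, predicate_label: str) -> bool:
    """Naive check: does the visible text contain the predicate's label?"""
    return predicate_label.lower() in sentence.lower()

# Machine statement asserts "loves"; compare against two possible prose texts.
print(is_consistent("Jane loves Mac.", "loves"))  # text and triple agree
print(is_consistent("Jane hates Mac.", "loves"))  # mismatch: flag for review
```

When the check fails, a human (or the statistical model mentioned in the post) would be brought in to decide whether the prose or the markup is wrong.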
Re: [whatwg] RDFa Problem Statement
Kristof Zelechovski wrote: Web browsers are (hopefully) designed so that they run in every culture. If you define a custom vocabulary without considering its ability to describe phenomena of other cultures and try to impose it worldwide, you do more harm than good to the representatives of those cultures. And considering it properly does require much time and effort; I do not think you can have that off the shelf without actually listening to them. Kristof - I believe that you may be conflating the method of expression with the vocabulary. RDFa is the method of expression; a vocabulary uses that method of expression to convey semantics. RDFa is a collection of attributes[1] for HTML family languages that are used to express semantics through the use of a vocabulary. For an example of what an RDF vocabulary page looks like, check out the following: http://purl.org/media/ That page is marked up using RDFa to provide not only a human-readable version of the vocabulary, but also a machine-readable version. In a way, complaining that the Microformats protocol impedes innovation is like saying 'we are big and rich and strong, so either you accommodate or you do not exist'. Not that I do not understand; it is straightforward to say so and it happens all the time. It's easy to miss the effect that the Microformats approach has on innovation because it isn't stated directly in any Microformats literature. I'd like to re-iterate that I have spent many, many hours creating specifications in the Microformats community and have seen this effect first-hand. I'd like to focus not on theory, but on the state of the world as it is right now. Right now, the Microformats process requires everyone to go through our community to create a vocabulary. It is the "we are big and rich and strong, so either you accommodate or you do not exist" approach that you seem to be arguing against. 
If someone were to come along and request that bloodline be added to the hCard format, it would be rejected as a corner-case. So, unless I understood you incorrectly, RDFa provides a more open environment for innovation because it doesn't require any sort of central authority to approve a vocabulary. One of the things that RDFa strives to do, and is successful at doing, is to not give anyone power over what constitutes a valid vocabulary. If that's not what you were attempting to express, you will have to explain it again, please. -- manu [1] http://www.w3.org/TR/rdfa-syntax#rdfa-attributes -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
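The decentralization argument in the post above rests on RDFa's use of URIs: property names are CURIEs (compact URI expressions) that expand to full URIs, so independently minted vocabularies cannot collide and need no central approval. A minimal sketch of that expansion follows; the prefix table is an assumption for illustration (the foaf namespace URI is the real, widely used one, while the media binding is hypothetical).

```python
# Sketch of CURIE expansion: "foaf:name" is shorthand for a full URI, so two
# vocabularies can both define a "name" term without colliding.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",  # real, widely used FOAF namespace
    "media": "http://purl.org/media#",     # hypothetical binding, for illustration
}

def expand_curie(curie: str) -> str:
    """Expand a prefix:reference pair against the prefix table."""
    prefix, _, reference = curie.partition(":")
    return PREFIXES[prefix] + reference

print(expand_curie("foaf:name"))    # http://xmlns.com/foaf/0.1/name
print(expand_curie("media:Audio"))
```

Because the expanded identifiers live in different URI spaces, anyone can publish a vocabulary at a URL they control and start using it immediately - the web, not a committee, decides which vocabularies gain adoption.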
Re: [whatwg] RDFa Problem Statement
Ian, I am addressing these questions both personally and as a representative of our company, Digital Bazaar. I am certainly not speaking in any way for the W3C SWD, RDFa Task Force, or Microformats community. Ian Hickson wrote: On Mon, 25 Aug 2008, Manu Sporny wrote: Web browsers currently do not understand the meaning behind human statements or concepts on a web page. While this may seem academic, it has direct implications on website usability. If web browsers could understand that a particular page was describing a piece of music, a movie, an event, a person or a product, the browser could then help the user find more information about the particular item in question. Is this something that users actually want? These are fairly broad questions, so I will attempt to address them in a general sense. We can go into the details at a later date if that would benefit the group in understanding how RDFa addresses this perceived need. Both the Microformats community and the RDFa community believe that users want a web browser that can help them navigate the web more efficiently. One of the best ways that a browser can provide this functionality is by understanding what the user is currently browsing with more accuracy than what is available today. The Microformats community is currently at 1,145 members on the discussion mailing list and 350 members on the vocabulary specification mailing list. The community has a common goal of making web semantics a ubiquitous technology. It should be noted as well that the Microformats community ARE the users that want this technology. There are very few commercial interests in that community - we have people from all walks of life contributing to the concept that the semantic web is going to make the browsing experience much better by helping computers to understand the human concepts that are being discussed on each page. 
I should also point out that XHTML1.1 and XHTML2 will have RDFa integrated because it is the best technology that we have at this moment to address the issue of web semantics. You don't have to agree with the "best technology" aspect of the statement, just that there is some technology X that has been adopted to provide semantics in HTML. The Semantic Web Deployment group at the W3C also believes this to be a fundamental issue with the evolution of the Web. We are also working on an HTML4 DTD to add RDFa markup to legacy websites. I say this not to make the argument that "everybody is doing it", but to point out that there seems to be a fairly wide representation, both from standards bodies and from web communities, that semantics is a requirement of near-term web technologies. How would this actually work? I don't know if you mean from a societal perspective, a standards perspective, a technological perspective or some other philosophical perspective. I am going to assume that you mean from a technological perspective and a societal perspective since I believe those to be the most important. The technological perspective is the easiest to answer - we have working code, to the tune of nine RDFa parser implementations and two browser plug-ins. Here's the implementation report for RDFa: http://www.w3.org/2006/07/SWD/RDFa/implementation-report/#TestResults To see how it works in practice, the Fuzzbot plug-in shows what we have right now. It's rough, but demonstrates the simplest use case (semantic data on a web page that is extracted and acted upon by the browser): http://www.youtube.com/watch?v=oPWNgZ4peuI All of the code to do this stuff is available under an Open Source license. 
librdfa, one of the many RDFa parsers, is available here: http://rdfa.digitalbazaar.com/librdfa/ and Fuzzbot, the semantic web processor, is available here: http://rdfa.digitalbazaar.com/fuzzbot/ From a societal perspective, it frees up the people working on this problem to focus on creating vocabularies. We're wasting most of our time in the Microformats community arguing over the syntax of the vocabulary expression language - which isn't what we want to talk about - we want to talk about web semantics. More accurately, RDFa relies on technologies that are readily accepted on the web (URIs, URLs, etc.) to express semantic information. So, RDFa frees up users to focus on expressing semantics by creating vocabularies either through a standards body, an ad-hoc group, or individually. Anybody can create a vocabulary; the web then decides which vocabularies are useful and which ones are not. The ones that aren't useful get ignored and the ones that are useful find widespread usage. From a societal perspective, this is how the web already operates, and it is the defining feature that makes the web such a great tool for humanity. Personally I find that if I'm looking at a site with music tracks, say Amazon's MP3 store, I don't have any difficulty working out what the tracks are or interacting with the page. Why would I want to ask
Re: [whatwg] RDFa Features (was: RDFa Problem Statement)
of a standardized semantics expression mechanism in HTML family languages. RDFa not only enables the use cases described in the videos listed above, but all use cases that struggle with enabling web browsers and web spiders to understand the context of the current page. I'm not convinced the problem you describe can be solved in the manner you describe. It seems to rely on getting authors to do something that they have shown themselves incapable of doing over and over since the Web started. It seems like a much better solution would be to get computers to understand what humans are doing already. I agree with your last two points, but not the first. Yes, some authors have shown themselves incapable of marking up certain types of metadata in HTML pages since the dawn of the Web. Yes, a better solution would be to get computers to understand what humans are doing already. Tools and education are vital, but not sufficient, in addressing the laziness issue. RDFa is a tool that we can use to address the issue of machine learning. I won't go into the whole machine learning thing again other than to state that RDFa and machine-based learning are not mutually exclusive. They are mutually beneficial. As for not being convinced, I believe that I am addressing your concerns in a logical manner and am informing this community as to why some of the suggestions that have been made over the past week by members of WHATWG are not the proper approach to the problems that RDFa is addressing. I trust that you will continue to address the issues I am raising in the same logical manner and explain why WHATWG doesn't consider semantics to be an issue that needs to be addressed. It is unfortunate that it takes so much time to convey over a decade of experience and work performed by members of the Semantic Web Deployment workgroup as well as those involved in the RDFa Task Force and Microformats community. 
I trust that all on this list will continue to be receptive to what those that have been involved with these issues have to say as we continue to answer questions posed by members of WHATWG. Even if we ignore that, it doesn't seem like the above discussion would lead one to a set of requirements that would lead one to design a language like RDFa. You will have to clarify this statement and give some examples, Ian. I am sure that there are holes in my explanation, however, we should be careful not to prematurely discount RDFa. It went through a very rigorous process involving many different people from many different communities that settled on one method, out of hundreds of permutations, for semantic expression in HTML. RDFa is going to be published as a W3C Recommendation in the coming months (it just got through the Candidate Recommendation phase and will be transitioning to the Proposed Recommendation phase very shortly). I need to know your thoughts about this topic in a bit more depth - why doesn't it seem like the above discussion would lead one to a set of requirements that would lead to a semantic expression mechanism like RDFa? Thanks for the explanation, by the way. This is by far the most useful explanation of RDFa that I have ever seen. Good, I'm glad it helped as there is much that people do not understand about RDFa. Thanks for listening thus far, and I'm committing time over the next two weeks to answer any questions that this community might have as a result of our work in the Microformats and RDFa communities. RDFa has changed a great deal in the past two years and what most people know of RDF and RDFa predates 2005. 
For those that are interested in all of the gory details, there is an RDFa Primer which is a pretty quick read, available here: http://www.w3.org/TR/xhtml-rdfa-primer/ The full RDFa specification, which is not a quick read, is available here: http://www.w3.org/TR/rdfa-syntax/ -- manu [1]http://microformats.org/wiki/namespaces-inconsistency-issue [2]http://microformats.org/wiki/datetime-design-pattern [3]http://microformats.org/wiki/haudio#Published [4]http://www.bbc.co.uk/blogs/radiolabs/2008/06/removing_microformats_from_bbc.shtml [5]http://microformats.org/wiki/accepted-limitations-of-microformats#Microformat_Scoping_Issue -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
[whatwg] RDFa Basics Video (8 minutes)
Here's a quick 8 minute RDFa Basics video for those of you that are not familiar with how RDF and RDFa work. I'm posting this in an attempt to bring those that are unfamiliar with these concepts up to speed. RDFa Basics (8 minutes) http://www.youtube.com/watch?v=ldl0m-5zLz4 -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
Re: [whatwg] RDFa
Henri Sivonen wrote: ... Since RDF vocabularies use URIs as identifiers, I find creating more microformats (even if they need more one-off speccing) a more appealing way forward from the language usage point of view than importing RDF vocabularies in a generically mappable way. (I can't see how generic mapping can be had without using URIs as identifiers.) It's becoming more and more apparent that several vocal people in this community believe that the Microformats design-as-you-go approach is the best way forward when addressing the semantics expression issue. The people that are not present in this discussion, however, are the Microformats community. I am a frequent contributor to that community and the lead editor of the only Microformat to be pushed through the Microformats process in the past three years. I have taken the time to outline why the "just take the Microformats approach" answer to the question of semantics expression in HTML5 glosses over the details of the problem that is being addressed with RDFa. Neither the Microformats community nor our approaches are going to solve some of the most serious problems surrounding semantics expression for HTML5: http://blog.digitalbazaar.com/2008/08/23/html5-rdfa-and-microformats/ Our work with the Microformats community over the past two years has shown that: 1. Vocabulary term collisions are a real issue and need to be addressed. 2. The Microformats community does not acknowledge vocabularies created outside of the Microformats Process. 3. The Microformats process is slow and centralized, and doesn't afford distributed innovation at the same rate as would be provided by RDFa. While the Microformats approach does solve several semantic web issues, it does not address all of them, nor was it ever intended to be a universal method of semantic data expression. Read the blog post to understand some of the reasoning behind the points made above. All the best, :) -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. 
blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
Re: [whatwg] RDFa Problem Statement (was: Creative Commons Rights Expression Language)
The web has always relied on distributed innovation, and RDFa allows that sort of innovation to continue by solving the tractable problem of a semantics expression mechanism. Microformats has no such general-purpose solution. In short, RDFa addresses the problem of a lack of a standardized semantics expression mechanism in HTML family languages. RDFa not only enables the use cases described in the videos listed above, but all use cases that struggle with enabling web browsers and web spiders to understand the context of the current page. -- manu -- Manu Sporny President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.0 Website Launches http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches