Re: [whatwg] Annotating structured data that HTML has no semantics for
Maciej Stachowiak wrote: On May 14, 2009, at 1:30 PM, Shelley Powers wrote: So, if I'm pushing for RDFa, it's not because I want to win. It's because I have things I want to do now, and I would like to make sure they have a reasonable chance of working a couple of years in the future. And yeah, once SVG is in HTML5, and RDFa can work with HTML5, maybe I wouldn't mind giving old HTML a try again. Lord knows I'd like to use ampersands again. It sounds like your argument comes down to this: you have personally invested in RDFa, therefore having a competing technology is bad, regardless of the technical merits. I don't mean to parody here - I am somewhat sympathetic to this line of argument. Often pragmatic concerns mean that an incremental improvement just isn't worth the cost of switching (for example HTML vs. XHTML). My personal judgment is that we're not past the point of no return on data embedding. There's microformats, RDFa, and then dozens of other serializations of RDF (some of which you cited). This doesn't seem like a space on the verge of picking a single winner, and the players seem willing to experiment with different options. There are not dozens of other serializations of RDF. The point I was trying to make is, I'd rather put my time into something that exists now, than have to watch the wheel re-invented. I'd rather see semantic metadata become a reality. I'm glad that you personally feel that companies will be just peachy keen on having to support multiple parsers to get the same data. On the HTML WG side, I will never support microdata, because no case has been made for its existence. The point is, people in the real world have to use this stuff. It helps them if they have one, generally agreed on approach. As it is, folks have to contend with both RDFa and microformats, but at least we know these have different purposes. From my cursory study, I think microdata could subsume many of the use cases of both microformats and RDFa.
It seems to me that it avoids much of what microformats advocates find objectionable, and provides a good basis for new microformats; but at the same time it seems it can represent a full RDF data model. Thus, I think we have the potential to get one solution that works for everyone. I'm not 100% sure microdata can really achieve this, but I think making the attempt is a positive step. It can't, don't you see? Microdata will only work in HTML5/XHTML5. XHTML 1.1 and yes, 2.0 will be around for years, decades. In addition, XHTML5 already supports RDFa. Supporting XHTML 1.1 has about 0.001% as much value as supporting text/html. XHTML 2.0 is completely irrelevant to the Web, and looks on track to remain so. So I don't find this point very persuasive. I don't think you'll find that the world is breathlessly waiting for HTML5. I think you'll find that XHTML 1.1 will have wider use than HTML5 for the next decade. If not longer. I wouldn't count out XHTML 2.0, either. And in a decade, a lot can change. Why you think something completely brand new, no vendor support, drummed up in a few hours or a day or so is more robust, and a better option than a mature spec in wide use, well frankly boggles my mind. I haven't evaluated it enough to know for sure (as I said). I do think avoiding CURIEs is extremely valuable from the point of view of sane text/html semantics and ease of authoring; and RDF experts seem to think it works fine for representing RDF data models. So tentatively, I don't see any gaping holes. If you see a technical problem, and not just potential competition for the technology you've invested in, then you should definitely cite it. I don't think CURIEs are that difficult, nor impossible no matter the arguments that Henri brings out. I am impressed with your belief in HTML5. But One other detail that it seems not many people have picked up on yet is that microdata proposes a DOM API to extract microdata-based info from a live document on the client side. 
In my opinion this is huge and has the potential to greatly increase author interest in semantic markup. Not really. Can do this now with RDFa in XHTML. And I don't need any new DOM to do it. The power of semantic markup isn't really seen until you take that markup data _outside_ the document. And merge that data with data from other documents. Google rich snippets. Yahoo searchmonkey. Heck, even an application that manages the data from different subsites of one domain. I respectfully disagree. An API to do things client-side that doesn't require an external library is extremely powerful, because it lets content authors easily make use of the very same semantic markup that they are vending for third parties, so they have more incentive to use it and get it right. Sure, we'll have to disagree on this one. Now, it may be that microdata will ultimately fail, either because it is outcompeted by RDFa
Re: [whatwg] Link rot is not dangerous
Dan Brickley wrote: On 15/5/09 18:20, Manu Sporny wrote: Kristof Zelechovski wrote: Therefore, link rot is a bigger problem for CURIE prefixes than for links. There have been a number of people now that have gone to great lengths to outline how awful link rot is for CURIEs and the semantic web in general. This is a flawed conclusion, based on the assumption that there must be a single vocabulary document in existence, for all time, at one location. This has also led to a false requirement that all vocabularies should be centralized. Here's the fear: If a vocabulary document disappears for any reason, then the meaning of the vocabulary is lost and all triples depending on the lost vocabulary become useless. That fear ignores the fact that we have a highly available document store available to us (the Web). Not only that, but these vocabularies will be cached (at Google, at Yahoo, at The Wayback Machine, etc.). IF a vocabulary document disappears, which is highly unlikely for popular vocabularies - imagine FOAF disappearing overnight, then there are alternative mechanisms to extract meaning from the triples that will be left on the web. Here are just two of the possible solutions to the problem outlined: - The vocabulary is restored at another URL using a cached copy of the vocabulary. The site owner of the original vocabulary either re-uses the vocabulary, or re-directs the vocabulary page to another domain (somebody that will ensure the vocabulary continues to be provided - somebody like the W3C). - RDFa parsers can be given an override list of legacy vocabularies that will be loaded from disk (from a cached copy). If a cached copy of the vocabulary cannot be found, it can be re-created from scratch if necessary. The argument that link rot would cause massive damage to the semantic web is just not true. Even if there is minor damage caused, it is fairly easy to recover from it, as outlined above. A few other points: 1.
It's for the community of vocabulary-creators to help each other out w.r.t. hosting/publishing these: I just nudged a friend to put another 5 years on the DNS rental for a popular namespace. I think we should put a bit more structure around these kinds of habit, so that popular namespaces won't drop off the Web through accident. 2. digitally signing the schemas will become part of the story, I'm sure. While it's a bit fiddly, there are advantages to having other mechanisms beyond URI de-referencing for knowing where a schema came from 3. Parties worried about external dependencies when using namespaces can always indirect through their own namespace, whose schema document can declare subclass/subproperty relations to other URIs cheers Dan The most important point to take from all of this, though, is that link rot within the RDF world is an extremely rare and unlikely occurrence. I've been working with RDF for close to a decade, and link rot has never been an issue. One of the very first uses of RDF, in RSS 1.0, for feeds, is still in existence, still viable. You don't have to take my word, check it out yourselves: http://purl.org/rss/1.0/ Even if, and I want to strongly emphasize if link rot does occur, both Manu and Dan have demonstrated multiple ways of ensuring that no meaning is lost, and nothing is broken. However, I hope that people are open enough to take away from their discussions that they are trying to treat this concern respectfully, and trying to demonstrate that there's more than one solution. Not that this forms a proof that Oh my god, if we use RDF, we're doomed! Also don't lose sight that this is really no more serious an issue than, say, a company originating com.sun.* being purchased by another company, named com.oracle.*. And you can't say, Well that's not the same, because it is. The only safe bet is to designate some central authority and give them power over every possible name. 
Then we run the massive risk of this system failing (and this applies to microdata's reverse DNS as well as RDF's URI), or it being taken over by an entity that sees such a data store as a way to make a great profit. We also defeat the very principle on which semantic data on the web abides, and that's true whether you support microdata or RDF. Shelley
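The second recovery option Manu lists above (an override list of legacy vocabularies loaded from a cached copy on disk) could be sketched roughly as follows. This is only an illustration of the idea, not how any actual RDFa parser works: the function name resolve_vocabulary, the offline_fetch stand-in, and the example.org namespace URIs are all assumptions made up for this sketch.

```python
import tempfile
from pathlib import Path


def resolve_vocabulary(uri, fetch, overrides):
    """Return the schema text for `uri`: try the live Web first, then a local override."""
    try:
        return fetch(uri)
    except OSError:
        # The live document is unreachable; fall back to the cached copy, if any.
        cached_path = overrides.get(uri)
        if cached_path is None:
            return None
        return Path(cached_path).read_text(encoding="utf-8")


def offline_fetch(uri):
    # Hypothetical stand-in for an HTTP client; simulates a vanished vocabulary.
    raise OSError(f"cannot retrieve {uri}")


with tempfile.TemporaryDirectory() as cache_dir:
    cached = Path(cache_dir) / "example-vocab.rdf"
    cached.write_text("<rdf:RDF/>", encoding="utf-8")
    # Hypothetical override list: namespace URI -> cached copy on disk.
    overrides = {"http://vocab.example.org/ns#": cached}
    recovered = resolve_vocabulary("http://vocab.example.org/ns#", offline_fetch, overrides)
    missing = resolve_vocabulary("http://other.example.org/ns#", offline_fetch, overrides)
```

Here the vocabulary with a cached copy is recovered even though the fetch fails, while the one without a cached copy comes back as None and would have to be restored by some other route.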
Re: [whatwg] Link rot is not dangerous
Kristof Zelechovski wrote: Classes in com.sun.* are reserved for Java implementation details and should not be used by the general public. CURIE URL are intended for general use. So, I can say Well, it is not the same, because it is not. Cheers, Chris But we're not dealing with Java anymore. We're dealing with using reversed DNS concatenated with some kind of default URI, to create some kind of bastardized URL, which actually is valid, though incredibly painful to see, and can be implied to actually take one to a web address. You don't have to take my word for it -- check out Philip's testing demo for microdata. You get triples with the following: http://www.w3.org/1999/xhtml/custom#com.damowmow.cat http://philip.html5.org/demos/microdata/demo.html#output_ntriples Not only do you face problems with link rot, you also face a significant amount of confusion, as people look at that and go, What the hell is that? Oh, and you can say, Well, but we don't _mean_ anything by it -- but what does that have to do with anything? People don't go running to the spec every time they see something. They look at this thing and think, Oh, a link. I wonder where it goes. You go ahead and try it, and imagine for a moment the confusion when it goes absolutely nowhere. Except that I imagine the W3C folks are getting a little annoyed with the HTML WG now, for allowing this type of thing in, generating a whole bunch of 404 errors for the web master(s). But hey, you've given me another idea. I think I'll create my own vocabulary items, with the reversed DNS http://www.w3.org/1999/xhtml/custom#com.sun.*. No, maybe http://www.w3.org/1999/xhtml/custom#com.opera.*. Nah, how about http://www.w3.org/1999/xhtml/custom#com.microsoft.*. Yeah, that's cool. And there is no mechanism in place to prevent this, because unlike regular URIs, where the domain is actually controlled by a specific entity, you've created the world famous W3C fudge pot. Anything goes. I can't wait for the lawsuits on this one. 
You think that cybersquatting is an issue on the web, or Facebook, or Twitter, wait until you see people use com.microsoft.*. Then there's the vocabulary that was created by foobar.com, that people think, Hey, cool, I'll use that...whatever it is. After all, if you want to play with the RDF kids, your vocabularies have to be usable by other people. But Foobar takes a dive in the dot com pool, and foobar.com gets taken over by a porn establishment. Yeah, I can't wait for people to explain that one to the boss. Just because it doesn't link, won't mean it won't end up on Twitter as a big, huge joke. If you want to find something to criticize, I think it's important to realize that hey, folks, you've just stepped over the line, and you're now in the Zone of Decentralization. Whatever impacts us, babes, impacts all of you. Because if you look at Philip's example, you're going to see the same set of vocabulary URIs we're using for RDF right now, as microdata uses our stuff, too. Including the links that are all trembling on the edge of self-implosion. So the point of all of this is moot. But it was fun. Really fun. Have a great weekend. Shelley
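To make the complaint above concrete, here is a minimal sketch of the mapping that appears to produce URIs like the one quoted from Philip's demo: a fixed base plus the reversed-DNS name. The base URI and the com.damowmow.cat name come from the message; the function name and the com.microsoft.example name are hypothetical, and the real conversion rules in the draft may differ.

```python
# Base URI as it appears in the triple quoted from Philip's demo output.
CUSTOM_BASE = "http://www.w3.org/1999/xhtml/custom#"


def custom_property_uri(reverse_dns_name: str) -> str:
    """Append a reversed-DNS property name to the fixed base (assumed mapping)."""
    return CUSTOM_BASE + reverse_dns_name


# The name from the demo, and a hypothetical one nobody authorised.
demo_uri = custom_property_uri("com.damowmow.cat")
squatted_uri = custom_property_uri("com.microsoft.example")
print(demo_uri)
print(squatted_uri)
```

Note that nothing in this mapping consults DNS or any registry: every name lands under the same w3.org base, which is the lack of ownership control being objected to.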
Re: [whatwg] Link rot is not dangerous
Philip Taylor wrote: On Fri, May 15, 2009 at 6:25 PM, Shelley Powers shell...@burningbird.net wrote: The most important point to take from all of this, though, is that link rot within the RDF world is an extremely rare and unlikely occurrence. That seems to be untrue in practice - see http://philip.html5.org/data/rdf-namespace-status.txt The source data is the list of common RDF namespace URIs at http://ebiquity.umbc.edu/resource/html/id/196/Most-common-RDF-namespaces from three years ago. Out of those 284: * 56 are 404s. (Of those, 37 end with '#', so that URI itself really ought to exist. In the other cases, it'd be possible that only the prefix+suffix URIs are meant to exist. Some of the cases are just typos, but I'm not sure how many.) * 2 are Forbidden. (Of those, 1 looks like a typo.) * 2 are Bad Gateway. * 22 could not connect to the server. (Of those, 2 weren't http:// URIs, and 1 was a typo. The others represent 13 different domains.) (For the URIs which returned Redirect responses, I didn't check what happens when you request the URI it redirected to, so there may be more failures.) Over a quarter of the most common namespace URIs don't resolve successfully today, and most of those look like they should have resolved when they were originally used, so link rot seems to be common. (Major vocabularies like RSS and FOAF are likely to exist for a long time, but they're the easiest cases to handle - we could just pre-define the prefixes rss: and foaf: and have a centralised database mapping them onto schemas/documentation/etc. It seems to me that URIs are most valuable to let any tiny group make one for their rarely-used vocabulary, and be guaranteed no name collisions without needing to communicate with a centralised registry to ensure uniqueness; but it's those cases that are most vulnerable to link rot, and in practice the links appear to fail quite often.) 
(I'm not arguing that link rot is dangerous - just that the numbers indicate it's a common situation rather than an extremely rare exception.) Philip, I don't think the occurrence of link rot causing problems in the RDF world is all that common, but thanks for looking up this data. Actually I will probably quote your info on my next writing at my weblog. I'd like to be dropped from any additional emails in this thread. After all, I have it on good authority I'm not open for rational discussion. So I'll leave this type of thing to you guys. Thanks Shelley
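Philip's "over a quarter" figure can be checked by tallying the counts he quotes. The numbers below are copied from his message; only the variable names are invented, and redirects are left out because he did not follow them.

```python
# Counts quoted from Philip Taylor's check of 284 common RDF namespace URIs.
total = 284
failures = {
    "404 Not Found": 56,
    "403 Forbidden": 2,
    "502 Bad Gateway": 2,
    "could not connect": 22,
}

failed = sum(failures.values())
failure_rate = failed / total
print(f"{failed} of {total} namespace URIs failed ({failure_rate:.1%})")
```

That is 82 failures out of 284, a little under 29%, which matches the "over a quarter" claim. Some of those are typos or non-http URIs by Philip's own account, so the share attributable to genuine link rot is somewhat lower.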
Re: [whatwg] Annotating structured data that HTML has no semantics for
James Graham wrote: jgra...@opera.com wrote: Quoting Philip Taylor excors+wha...@gmail.com: On Sun, May 10, 2009 at 11:32 AM, Ian Hickson i...@hixie.ch wrote: One of the more elaborate use cases I collected from the e-mails sent in over the past few months was the following: USE CASE: Annotate structured data that HTML has no semantics for, and which nobody has annotated before, and may never again, for private use or use in a small self-contained community. [...] To address this use case and its scenarios, I've added to HTML5 a simple syntax (three new attributes) based on RDFa. There's a quickly-hacked-together demo at http://philip.html5.org/demos/microdata/demo.html (works in at least Firefox and Opera), which attempts to show you the JSON serialisation of the embedded data, which might help in examining the proposal. I have a *totally unfinished* demo that does something rather similar at [1]. It is highly likely to break and/or give incorrect results**. If you use it for anything important you are insane :) I have now added extremely preliminary RDF support with output as N3 and RDF/XML courtesy of rdflib. It is certain to be buggy. So much concern about generating RDF, makes one wonder why we didn't just implement RDFa... Shelley
Re: [whatwg] Annotating structured data that HTML has no semantics for
Dan Brickley wrote: On 14/5/09 14:18, Shelley Powers wrote: James Graham wrote: jgra...@opera.com wrote: Quoting Philip Taylor excors+wha...@gmail.com: On Sun, May 10, 2009 at 11:32 AM, Ian Hickson i...@hixie.ch wrote: One of the more elaborate use cases I collected from the e-mails sent in over the past few months was the following: USE CASE: Annotate structured data that HTML has no semantics for, and which nobody has annotated before, and may never again, for private use or use in a small self-contained community. [...] To address this use case and its scenarios, I've added to HTML5 a simple syntax (three new attributes) based on RDFa. There's a quickly-hacked-together demo at http://philip.html5.org/demos/microdata/demo.html (works in at least Firefox and Opera), which attempts to show you the JSON serialisation of the embedded data, which might help in examining the proposal. I have a *totally unfinished* demo that does something rather similar at [1]. It is highly likely to break and/or give incorrect results**. If you use it for anything important you are insane :) I have now added extremely preliminary RDF support with output as N3 and RDF/XML courtesy of rdflib. It is certain to be buggy. So much concern about generating RDF, makes one wonder why we didn't just implement RDFa... Having HTML5-microdata -to- RDF parsers is pretty critical to having test cases that help us all understand where RDFa-Classic and HTML5 diverge. I'm very happy to see this work being done and that there are multiple implementations. As far as I can see, the main point of divergence is around URI abbreviation mechanisms. But also HTML5 might not have a notion equivalent to RDF/RDFa's bNodes construct. The sooner we have these parsers the sooner we'll know for sure. Dan Actually, I believe there are other differences, as others have pointed out. 
http://www.jenitennison.com/blog/node/103 http://realtech.burningbird.net/semantic-web/semantic-web-issues-and-practices/holding-on-html5 Some of the differences have resulted in more modifications to the underlying HTML5 spec, which is curious, because Ian has stated in comments that support for RDF is only a side interest and not the main purpose behind the microdata section. With the statement that support for RDF isn't a particular goal of microdata, Dan, I think you're being optimistic about the good this effort will generate for RDFa. But, more power to you. Shelley
Re: [whatwg] Annotating structured data that HTML has no semantics for
Maciej Stachowiak wrote: On May 14, 2009, at 5:18 AM, Shelley Powers wrote: So much concern about generating RDF, makes one wonder why we didn't just implement RDFa... If it's possible to produce RDF triples from microdata, and if RDF triples of interest can be expressed with microdata, why does it matter if the concrete syntax is the same as RDFa? Isn't the important thing about RDF the data model, not the surface syntax? (I understand that if the microdata syntax offered no advantages over RDFa, then it would be a wasted effort to diverge. But my impression is that you'd object to anything that isn't exactly identical to RDFa, even if it can easily be used in the same way.) Regards, Maciej Because one would assume that one way to accomplish a task would be more attractive to web developers, designers, parser developers, browsers, et al. In addition, one would also assume that one way to accomplish a task would be more attractive in regards to testing, maintaining and moving on in the future. Notice how there is only VHS and not Betamax? Notice the same about Blu-Ray and HD-TV? People won't buy into something while there are competitive specs, and these are competitive in that it makes little sense to use both in a document, though you can now. The point is, people in the real world have to use this stuff. It helps them if they have one, generally agreed on approach. As it is, folks have to contend with both RDFa and microformats, but at least we know these have different purposes. Shelley
Re: [whatwg] Annotating structured data that HTML has no semantics for
Maciej Stachowiak wrote: On May 14, 2009, at 1:04 PM, Shelley Powers wrote: Maciej Stachowiak wrote: On May 14, 2009, at 5:18 AM, Shelley Powers wrote: So much concern about generating RDF, makes one wonder why we didn't just implement RDFa... If it's possible to produce RDF triples from microdata, and if RDF triples of interest can be expressed with microdata, why does it matter if the concrete syntax is the same as RDFa? Isn't the important thing about RDF the data model, not the surface syntax? (I understand that if the microdata syntax offered no advantages over RDFa, then it would be a wasted effort to diverge. But my impression is that you'd object to anything that isn't exactly identical to RDFa, even if it can easily be used in the same way.) Regards, Maciej Because one would assume that one way to accomplish a task would be more attractive to web developers, designers, parser developers, browsers, et al. In addition, one would also assume that one way to accomplish a task would be more attractive in regards to testing, maintaining and moving on in the future. Notice how there is only VHS and not Betamax? Notice the same about Blu-Ray and HD-TV? People won't buy into something while there are competitive specs, and these are competitive in that it makes little sense to use both in a document, though you can now. Physical media do tend to converge due to network effects. I think the effect is less strong for digital file formats. For example, MP3 and AAC are both fairly successful; similarly, MPEG-4, Windows Media and Ogg are all getting some degree of traction. But you may be right that ultimately there will be only one winner. Now, that's the problem with all of this effort...winners and losers. I don't support a spec because it gives me grins and giggles. I have certain tasks I want to do, and I look for what is the technology that has the most support in order to do them. I've long been an adherent to RDF, which isn't really up for debate. 
Originally, I was an RDF/XML person, until the RDF-in-XHTML folks changed my mind. What I see of RDFa is a specification that has been through a very long period of time, testing, commenting, being implemented by major players. I also have tools, right now, that I can use to process the RDFa, as well as support by two major search engine companies. As Dan pointed out earlier, microdata seems to support most of RDF. Well, I know that RDFa does. It makes little sense to me to start from scratch when a mature specification with multi-vendor support already exists. Especially when Drupal 7 rolls out with RDFa baked in. That's 1.7 million sites supporting the spec. Then there's the new Google snippet thing -- who knows how many additional sites we'll now find supporting RDFa. So, if I'm pushing for RDFa, it's not because I want to win. It's because I have things I want to do now, and I would like to make sure they have a reasonable chance of working a couple of years in the future. And yeah, once SVG is in HTML5, and RDFa can work with HTML5, maybe I wouldn't mind giving old HTML a try again. Lord knows I'd like to use ampersands again. The point is, people in the real world have to use this stuff. It helps them if they have one, generally agreed on approach. As it is, folks have to contend with both RDFa and microformats, but at least we know these have different purposes. From my cursory study, I think microdata could subsume many of the use cases of both microformats and RDFa. It seems to me that it avoids much of what microformats advocates find objectionable, and provides a good basis for new microformats; but at the same time it seems it can represent a full RDF data model. Thus, I think we have the potential to get one solution that works for everyone. I'm not 100% sure microdata can really achieve this, but I think making the attempt is a positive step. It can't, don't you see? Microdata will only work in HTML5/XHTML5.
XHTML 1.1 and yes, 2.0 will be around for years, decades. In addition, XHTML5 already supports RDFa. Why you think something completely brand new, no vendor support, drummed up in a few hours or a day or so is more robust, and a better option than a mature spec in wide use, well frankly boggles my mind. I am impressed with your belief in HTML5. But One other detail that it seems not many people have picked up on yet is that microdata proposes a DOM API to extract microdata-based info from a live document on the client side. In my opinion this is huge and has the potential to greatly increase author interest in semantic markup. Not really. Can do this now with RDFa in XHTML. And I don't need any new DOM to do it. The power of semantic markup isn't really seen until you take that markup data _outside_ the document. And merge that data with data from other documents. Google rich snippets. Yahoo searchmonkey. Heck, even an application
Re: [whatwg] Annotating structured data that HTML has no semantics for
Philip Taylor wrote: On Tue, May 12, 2009 at 11:55 AM, Eduard Pascual herenva...@gmail.com wrote: [...] (at least for now: many RDFa-aware agents vs. zero HTML5's microdata -aware agents) HTML5 microdata parsers seem pretty trivial to write - http://philip.html5.org/demos/microdata/demo.html is only about two hundred lines to read all the data and to produce JSON and N3-serialised RDF. It shouldn't take more than a few hours to produce a similar library for other languages, including the time taken to read the spec, so the implementation cost for generic parser libraries doesn't seem like a significant problem. Writing something that will produce triples may be easy, but what's important is that you're producing an RDF model. Philip, I've been looking at your application, and you're not producing the same model for Ian's microdata proposal that is produced using either eRDF or RDFa. I'll have more on this later. The cost of integration with backend RDF-based systems seems more significant - hopefully you could simply replace the frontend RDFa parser with a microdata parser and generate the same RDF triples and it would all work fine, but I don't know whether that's true in practice (because maybe the microdata syntax is too restrictive to represent the vocabularies people want to use, and so they'd have to go to lots of extra effort to create a new vocabulary). [...] there are other cases where separate values might be needed: for example using a street address for the human-readable representation of a location and the exact geographic coordinates as the machine-readable (since not all micro-data parsers can rely on Google Maps's database to resolve street addresses, you know); or using a colored name (such as lime green displayed on lime green color) as the human-readable representation of a color, and the hexcode (like #00FF00) as the machine-readable representation. 
You could replace <span itemprop=color>lime green</span> <span itemprop=location>1 High Street</span> with <meta itemprop=color content=#00FF00><span>lime green</span> <meta itemprop=location.lat content=56.78><meta itemprop=location.long content=-12.34><span>1 High Street</span> to get the desired output. (Not particularly elegant syntax, though.) It's funny, but oddly enough, this discussion reminds me of when I started at Boeing, right after college. I started just when the great debate between SQL and QUEL was ending, in SQL's favor. Most folks still feel that QUEL was the superior option, but SQL won out in the end because it had widespread use, and was supported by more of the (powerful) database companies, and hence the companies using the databases. The same could be said of Betamax versus VHS, and even the recent HDTV and Blu-Ray debates: we can get caught up in issues of superiority and argue the fine points of (mostly) obscure markup until the cows come home, but at some point in time, you have to pick a standard to get behind, or no one will have any confidence in _any_ of the options being proposed--and the concept underlying the competing technologies (or standards) is hindered, perhaps for years. Sorry, I digress. Eduard, looking forward to seeing your own interpretation of the best metadata annotation. Shelley
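The difference between the two itemprop markups Philip describes above (visible text as the value versus a meta element carrying the machine-readable value) can be seen with a small extractor. This is a rough sketch using Python's standard html.parser, not the extraction algorithm from the HTML5 draft: it ignores item nesting and scoping, assumes well-formed markup, and the class and function names are invented for illustration. Attribute values are quoted here for clarity.

```python
from html.parser import HTMLParser

# Elements with no end tag; only meta matters for this sketch.
VOID_TAGS = {"meta", "link", "br", "img", "hr", "input"}


class ItempropCollector(HTMLParser):
    """Collect (itemprop, value) pairs; a simplification, not the spec algorithm."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._depth = 0   # nesting depth of non-void elements
        self._open = []   # (depth, name, text_parts) for open itemprop elements

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("itemprop")
        if tag in VOID_TAGS:
            # A meta element carries its value in the content attribute.
            if tag == "meta" and name is not None:
                self.pairs.append((name, attributes.get("content") or ""))
            return
        self._depth += 1
        if name is not None:
            self._open.append((self._depth, name, []))

    def handle_data(self, data):
        # Text belongs to every itemprop element currently open.
        for _, _, parts in self._open:
            parts.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._open and self._open[-1][0] == self._depth:
            _, name, parts = self._open.pop()
            self.pairs.append((name, "".join(parts).strip()))
        self._depth -= 1


def extract_itemprops(markup):
    collector = ItempropCollector()
    collector.feed(markup)
    collector.close()
    return collector.pairs


text_valued = extract_itemprops(
    '<span itemprop="color">lime green</span> '
    '<span itemprop="location">1 High Street</span>'
)
meta_valued = extract_itemprops(
    '<meta itemprop="color" content="#00FF00"><span>lime green</span> '
    '<meta itemprop="location.lat" content="56.78">'
    '<meta itemprop="location.long" content="-12.34"><span>1 High Street</span>'
)
```

With the first form the consumer gets the human-readable strings; with the second it gets the hex code and coordinates, while the spans remain purely presentational.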
Re: [whatwg] Annotating structured data that HTML has no semantics for
Ian Hickson wrote: On Tue, 12 May 2009, Peter Mika wrote: Just a quick comment on: it uses prefixes, which most authors simply do not understand, and which many implementors end up getting wrong (e.g. SearchMonkey hard-coded certain prefixes in its first implementation, Google's handling of RDF blocks for license declarations is all done with Actually, the problem we see is not so much the prefixes themselves but rather the cumbersome way of specifying namespace prefix definitions using xmlns. So I think it would make sense to have some mechanism for referencing bundles of namespace prefixes ('profiles') or namespace registries, in order to ease authoring. In terms of prefixes, I find that 'com.foaf-project.name' is a lot more difficult to write than 'foaf:name'. Reverse domain names are non-intuitive for non-programmer types (or non-Java programmers). If we can come up with a way of using the string foaf:name without having to declare foaf in each document, I'm totally in agreement. I've considered maybe registering the foaf URL scheme, or using some other punctuation character and having people register prefixes, but I don't know what punctuation character to use (':' and '.' are both taken). But then we would lose the extensibility, which is the power behind all of this. If I remember correctly, Henri had an issue with the DOM when it came to support of namespaces in XHTML, and not in HTML, which was the reason that @prefix or something along those lines was proposed. There was quite positive progress in this regard, too. I don't know what happened to that progress. But regardless, the majority of people will include metadata markup by installing a plug-in or module, and making a couple of choices. And if you put together a good ten-minute tutorial for the average developer, they'll have no problem with foaf:name. Training and clarity of communication is much more important than form, it always has been with technology.
The examples you come up with just don't justify discarding consideration of a capability that just started getting incorporated into Google search. I would say if your fellow Google developers could understand how this all works, there is hope for others. Shelley
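For readers unfamiliar with what a prefix like foaf: actually does, here is a minimal sketch of CURIE-style expansion against a shared prefix table, roughly the 'profile' idea Peter raises above. The FOAF namespace URI is the real one; the function name, the SHARED_PROFILE table, and the error handling are assumptions for illustration, and this does not implement the full CURIE syntax.

```python
# A hypothetical shared 'profile' of prefixes, declared once instead of per document.
SHARED_PROFILE = {
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand_curie(curie, prefixes):
    """Expand 'prefix:reference' to a full URI using the given prefix table."""
    prefix, separator, reference = curie.partition(":")
    if not separator:
        raise ValueError(f"not a CURIE: {curie!r}")
    if prefix not in prefixes:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    return prefixes[prefix] + reference


foaf_name = expand_curie("foaf:name", SHARED_PROFILE)
print(foaf_name)
```

Because the table is just data, anyone can add an entry for a new vocabulary without asking a registry, which is the extensibility Shelley is reluctant to give up; the cost is that an undeclared prefix has no meaning at all.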
Re: [whatwg] Annotating structured data that HTML has no semantics for
Sam Ruby wrote: On Tue, May 12, 2009 at 4:34 PM, Shelley Powers shell...@burningbird.net wrote: I would say if your fellow Google developers could understand how this all works, there is hope for others. if http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009May/0064.html \ - Sam Ruby Ah heck, I've made mistakes with vocabularies too. That's why you ask for feedback. Unfortunately, asking for feedback isn't an option when you're creating secret stuff. I could have wished Google used FOAF or DC, too, but it's a start. Shelley
[whatwg] Custom microdata handling added to HTML5 spec
Since a new section detailing HTML5's handling of custom microdata has been added to the HTML5 spec (tracked here http://html5.org/tools/web-apps-tracker?from=3073&to=3074 and displayed here http://dev.w3.org/html5/spec/Overview.html#microdata and announced here http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-May/019681.html ), I'm assuming my effort to re-examine the use cases Ian has published is irrelevant, and a waste of everyone's time. I will hence discontinue any and all effort associated with this specification. Shelley
[whatwg] Continuing
Sorry for the double emails today. I will continue with revisiting the use cases for the microdata section. One additional component I'll add to the use cases is applying my interpretation of how RDFa might handle the use case, as compared to how it could be handled with Ian's new HTML5 microdata proposal. This will, of course, slow me down a bit. Note, though, that I don't claim to be an expert on either RDFa or Ian's new microdata proposal. My hope is that if I make a mistake, or I'm not clear, folks will respond to my writing with corrections and/or additions. The purpose behind my effort is to open discussion. I will admit, though, that I do have a bias for RDFa, primarily because this is something that's real, today, and that I can use, today. Shelley
[whatwg] microdata use cases and Getting data out of poorly written Web pages
It's difficult to tell where one should comment on the so-called microdata use cases. I'm forced to send to multiple mailing lists. Ian, I would like to see the original request that went into this particular use case. In particular, I'd like to know who originated it, so that we can ensure that the person has read your follow-up, as well as how you condensed the use case down (to check if your interpretation is proper or not). In addition, from my reading of this posting of yours titled [whatwg] Getting data out of poorly written Web pages, is this open for any discussion? It seems to me that you received the original data, generated a use case document from the data, unilaterally, and now you're making unilateral decisions as to whether the use case requires a change in HTML5 or not. Is this what we can expect from all of the use cases? Shelley
Re: [whatwg] microdata use cases and Getting data out of poorly written Web pages
Ian Hickson wrote: On Fri, 8 May 2009, Shelley Powers wrote: It's difficult to tell where one should comment on the so-called microdata use cases. I'm forced to send to multiple mailing lists. Please don't cross-post to the WHATWG list and other lists -- you may pick either one, I read all of them. (Cross-posting results in a lot of confusion because some of the lists only allow members to post, while others allow anyone to post, so we end up with fragmented threads.) But different people respond to the mailings in different ways, depending on the list. This isn't just you, Ian. How can I ensure that the W3C people have access to the same concerns? Ian, I would like to see the original request that went into this particular use case. In particular, I'd like to know who originated it, so that we can ensure that the person has read your follow-up, as well as how you condensed the use case down (to check if your interpretation is proper or not). I did not keep track of where the use cases came from (I generally ignore the source of requests so as to avoid any possible bias). Documenting the originator of a use case is introducing bias? In what universe? If anything, documenting where the use cases come from, and providing access to the original, raw data helps to ensure that bias has not been introduced. More importantly, it gives your teammates a chance to verify your interpretation of the use cases, and provide correction, if needed. However, I can probably figure out some of the sources of a particular scenario if you have a specific one in mind. Could you clarify which scenario or requirement you are particularly interested in? Ian, I think it's important that you provide a place documenting the original raw data. This provides a historical perspective on the decisions going into HTML5 if nothing else. If you need help, I'm willing to help you. You'll need to forward me the emails you received, and send me links to the other locations. 
I'll then put all these into a document and we can work to map to your condensed document. That way there's accountability at all steps in the decision process, as well as transparency. Once I put the document together, we can put with other documents that also provide history of the decision processes. In addition, from my reading of this posting of yours titled [whatwg] Getting data out of poorly written Web pages, is this open for any discussion? Naturally, all input is always welcome. No, I didn't ask if input was welcome. I asked if this was still open for discussion, or if you have made up your mind, and further discussion will just be wasting everyone's time. It seems to me that you received the original data, generated a use case document from the data, unilaterally, and now you're making unilateral decisions as to whether the use case requires a change in HTML5 or not. Is this what we can expect from all of the use cases? Yes. That's not appropriate for a team environment. If my proposals don't actually address the use cases, then please do point out how that is the case. Similarly, if there are missing use cases, please bring them up. All input is always welcome (whether on the lists, or direct e-mail, on blogs, or wherever). None of the text in the HTML5 spec is frozen, it's merely a proposal. If there are use cases that should be addressed that are not addressed then we should address them. Again, how can I? I don't have the original data. (Regarding microdata note that I've so far only sent proposals for three of the 20 use cases that I collected. I've still got a lot to go through.) After digging, I found another one, at http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-May/019620.html Again, though, the writing style indicates the item is closed, and discussion is not welcome. I have to assume that this is how you mentally perceive the item, and therefore though we may respond, the response will make no difference. 
And I can't find the third one. Perhaps you can provide a direct link. I'm concerned, too, about the fact that the discussion for these is happening on the WhatWG group, but not in the HTML WG email list. I've never understood two different email lists, and have felt having both is confusing, and potentially misleading. Regardless, shouldn't this discussion be taking place in the HTML WG, too? Isn't the specification the W3C HTML5 specification, also? I'm just concerned because from what I can see of both groups, interests and concerns differ between the groups. That means only addressing issues in one group, would leave out potentially important discussions in the other group. Shelley
[whatwg] notes on current HTML5 draft
Per Ian Hickson's request, here is the first of my notes on the current HTML 5 draft. In Section 1.6.3, where you compare HTML5 with XHTML2 and XForms, you write: However, XHTML2 and XForms lack features to express the semantics of many of the non-document types of content often seen on the Web. For instance, they are not well-suited for marking up forum sites, auction sites, search engines, online shops, mapping applications, e-mail applications, word processors, real-time strategy games, and the like. This specification aims to extend HTML so that it is also suitable in these contexts. This sounds more like marketing speak than something one would find in a specification. If it's important for an individual to know why they might want to use HTML5 over XHTML2, then the information should be given in detail, rather than in one vague paragraph. In addition, I've not found that the HTML5 specification answers the claims given in the above paragraph. For instance, why would HTML5 be better for a mapping application than XHTML2? Or an auction site? In section 1.7, you write: The DOM5 HTML, HTML5, and XHTML5 representations cannot all represent the same content. For example, namespaces cannot be represented using HTML5, but they are supported in DOM5 HTML and XHTML5. Similarly, documents that use the noscript feature can be represented using HTML5, but cannot be represented with XHTML5 and DOM5 HTML. Comments that contain the string -- can be represented in DOM5 HTML but not in HTML5 and XHTML5. And so forth. 'And so forth' is not something one wants to read in a specification, because we expect precision, and 'and so forth' is vague and imprecise. Since the HTML5 supposedly represents both a HTML and a XHTML serialization technique, perhaps the document can take a lesson from the RDF community and provide a separate document, or at least a section detailing the two different serialization techniques. This would go far, too, in clearing up the confusion regarding XHTML. 
Too many people are making assumptions that XHTML is dead because the XHTML serialization of HTML5 is not spelled out as clearly as it could be. You actually do mix the differences between the two throughout the document, but that, to me, seems to 'clutter' up the spec -- making it difficult to determine what's new in the spec. If the HTML5 document is a new model for web page markup, then the model aspect of the spec should be detailed separately from its various serializations, and that includes any API. Right now, it's difficult to read the specification because it jumps too frequently between the abstract and the implementation, sometimes in one sentence. More later. Shelley
[whatwg] Section 3 semantics and structure
More general comments on the HTML5 draft: In section three, you mix structure and semantics, but the two are not necessarily compatible. For instance, we see an introduction to the Document, and then immediately proceed into a description of Documents in the DOM. Frankly, I don't see how a description of the DOM fits either structure or semantics. To me, structure would be the structure of the markup in the document, and the semantics would be the, well, it's hard to say what it would be, you apply semantics to elements, such as section and header. Whatever it is, it's not DOM related. Then you follow up with Security. What does this have to do with structure or semantics? Perhaps if the intro section was filled in, we would have an understanding of what you mean by structure, and semantics. Right now, though, I see what is basically a bucket of information, somehow grouped under this heading, perhaps because it doesn't fit anywhere else. Now you do a nice description of what you consider as semantics in section 3.3.1, and I would expect this, then, to be followed by a listing of the elements, but again, there's the DOM. There's no cohesive pattern to the document, especially when the different document levels are mixed so haphazardly. I think of a document as a communication between writer and audience. Now there are probably three audiences for HTML5: user agent developers, such as browser companies; web developers, interested in the DOM, scripting events, and so on; and designers or others, more likely interested in the markup. I, as a web developer/designer, am not really interested in the user agent aspects of the specs. Another person who is a designer, may not be interested in the developer or UA aspects. But all of us are forced to go through material addressed to all three audiences just to find the information we need. 
I, a designer interested in learning about the new semantic elements, have to wade through sections on the DOM and security, including cookies, because I'm not sure when I'll be getting to the bits I need. There's no clear demarcation between audiences in the document. More later Shelley
[whatwg] example of serialization problems
Review of HTML5 document: Here's a good example of a potential point of confusion for readers of the spec when it comes to serialization: In section 4.5.8 you introduce the ul element, and then demonstrate it with several child li elements, each of which is shown with an HTML serialization. In section 4.5.9, you introduce the li element, and then demonstrate the li element using a serialization approach that would work with both XHTML and HTML serializations. And still later, in section 4.5.13.1, you again demonstrate li elements using only the HTML serialization format. In all of this is an implicit assumption of the capabilities of your audience, that they understand the differences between the two. Yet, this isn't stated as a prereq for the audience of the document. In fact, you state that a familiarity with XML is helpful, but not required. And as far as I've been able to see, though I may have missed it, discussion about closing tags doesn't take place until section 8. My suggestion would be to include both HTML and XHTML serializations, carefully differentiating between the two. Or to provide separate documents detailing the elements and their serialized form, HTML version and XHTML version, if you want to inter-mix model and serialization technique. As for Section 8, that really is for user agent developers, only. Seriously, I doubt you expect typical web developers or designers to get much from this section. I would almost expect this to be a separate document. What would be helpful is to bring this section up one level in complexity, specifically focused at web developers/designers. More later Shelley
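To make the difference between the two li serializations concrete, here is a minimal sketch using only the Python standard library. The two sample strings are made up for illustration and are not taken from the spec. Note the assumption: html.parser is a lenient tokenizer, not an HTML5 tree builder, so it only stands in for 'an HTML-style reader' here.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Illustrative samples (assumptions, not spec text):
HTML_STYLE = "<ul><li>One<li>Two</ul>"            # end tags omitted, HTML only
XML_STYLE = "<ul><li>One</li><li>Two</li></ul>"   # works in both serializations


class LiCounter(HTMLParser):
    """Counts li start tags as seen by a lenient HTML-style tokenizer."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.count += 1


def count_li_html(markup):
    parser = LiCounter()
    parser.feed(markup)
    parser.close()
    return parser.count


def is_well_formed_xml(markup):
    """True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


for sample in (HTML_STYLE, XML_STYLE):
    print(sample, count_li_html(sample), is_well_formed_xml(sample))
```

Both samples yield two list items to the lenient reader, but only the second is accepted by the XML parser, which is the distinction a reader of the spec would need spelled out.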
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
Dan Brickley wrote: On 18/1/09 20:07, Henri Sivonen wrote: On Jan 18, 2009, at 20:48, Dan Brickley wrote: On 18/1/09 19:34, Henri Sivonen wrote: On Jan 18, 2009, at 01:32, Shelley Powers wrote: Are you then saying that this will be a showstopper, and there will never be either a workaround or compromise? Are the RDFa TF open to compromises that involve changing the XHTML side of RDFa not to use attribute whose qualified name has a colon in them to achieve DOM Consistency by changing RDFa instead of changing parsing? I don't believe the RDFa TF are in a position to singlehandedly rescind a W3C Recommendation, ie. http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014/. What they presumably could do is propose new work items within W3C, which I'd guess would be more likely to be accepted if it had the active enthusiasm of the core HTML5 team. Am cc:'ing TimBL here who might have something more to add. Do you have an alternative design in mind, for expressing the namespace mappings? The simplest thing is not to have mappings but to put the corresponding absolute URI wherever RDFa uses a CURIE. So this would be a kind of interoperability profile of RDFa, where certain features approved of by REC-rdfa-syntax-20081014 wouldn't be used in some hypothetical HTML5 RDFa. If people can control their urge to use namespace abbreviations, and stick to URIs directly, would this make your DOM-oriented concerns go away? Took five minutes to make this change in my template. Ran through validator.nu. Results: Doesn't like the content-type. Didn't like profile on head. Having to remove the profile attribute in my head element limits usability, but I'm not going to throw myself on the sword for this one. Doesn't like property, doesn't like about. These are the RDFa attributes I'm using. The RDF extractor doesn't care that I used the URIs directly. Didn't seem to mind SVG, but a value of none is a valid value for preserveAspectRatio. Shelley cheers, Dan -- http://danbri.org/
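The idea of writing absolute URIs wherever RDFa would use a CURIE can be sketched with a toy extractor. This is not an RDFa processor and not the extractor mentioned above: it is a hypothetical PropertyCollector that only gathers the text of elements carrying a property attribute, ignores about and every other RDFa rule, and assumes every element is explicitly closed. The sample markup and example.org URI are invented for illustration.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collects (property URI, text) pairs; a toy, not a conforming RDFa parser."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._stack = []  # one entry per open element: its property URI or None

    def handle_starttag(self, tag, attrs):
        self._stack.append(dict(attrs).get("property"))

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        # Record text only when the innermost open element has a property.
        if text and self._stack and self._stack[-1]:
            self.pairs.append((self._stack[-1], text))


# Hypothetical page fragment: full URIs, no prefix declarations needed.
SAMPLE = (
    '<p about="http://example.org/me">'
    '<span property="http://xmlns.com/foaf/0.1/name">Alice</span>'
    "</p>"
)

collector = PropertyCollector()
collector.feed(SAMPLE)
collector.close()
print(collector.pairs)
```

Because the property values are already absolute URIs, the extractor needs no namespace context at all, which is the trade-off under discussion: more typing for authors, less machinery for consumers.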
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
Eduard Pascual wrote: On Sun, Jan 18, 2009 at 3:56 PM, Anne van Kesteren ann...@opera.com wrote: On Sun, 18 Jan 2009 16:22:40 +0100, Shelley Powers shell...@burningbird.net wrote: My apologies for not responding sooner to this thread. You see, one of the WhatWG working group members thought it would be fun to add a comment to my Stop Justifying RDF and RDFa web post, which caused the page to break. I am using XHTML at my site, because I want to incorporate inline SVG, in addition to RDFa. An unfortunate consequence of XHTML is its less than forgiving nature regarding playful pranks such as this. I'm assuming the WhatWG member thought the act was clever. It was, indeed. Three people emailed me to let me know the post was breaking while loading the page in a browser, and I made sure to note that such breakage was courtesy of a WhatWG member, who decided that perhaps I should just shut up, here and at my site, about the Important Work people(?) here are doing. Of course, the person only highlighted why it is so important that something such as RDFa, and SVG, and MathML, get a home in HTML5. XHTML is hard to support when you're allowing comments and external input. Typically my filters will catch the accidental input of crappy markup, but not the intentional. Not yet. I'm not an expert at markup, but I know more than the average person. And the average person most likely doesn't have my commitment, either. http://annevankesteren.nl/2009/01/xml-sunday shows the commenter (who by the way seems to be on your side in this debate) simply forgot to escape self-closed / and then WordPress somehow messed up in an attempt to fix it. I don't think anyone tries to make you shut up. Ouch! Thanks Anne for the screenshot, otherwise I wouldn't have known that it was my comment the one causing the issue. My apologies Shelley for that incident. 
I assure you that it was not intentional: it was a quite long post, I used some markup with the intention of making it more readable (like italicizing the quotes), and by the end I messed things up. Thanks to the preview page I noticed some issues, like that I had to escape the sarcasm.../sarcasm for it to display (I'm too used to BBCode, which leaves unrecognized markup as is), but I didn't catch the self-closed / one (nor did the preview page: it showed up without issues). Eduard, no worries. Your comment just demonstrated that a secondary preview after editing is needed to self-catch these types of errors. Sorry for the misunderstanding. That and Anne's image, and trying to wade through the markup and figure out what was going on, because this error should have been caught, put me in an irritated mood. Especially since I have had people deliberately trip up my comments every time I write about XHTML et al (ie the Philipe Anne mentions). But no worries, and I shouldn't have made such a jump in assumption. Shelley
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
Ian Hickson wrote: On Sun, 18 Jan 2009, Shelley Powers wrote: The more use cases there are, the better informed the results will be. The point isn't to provide use cases. The point is to highlight a serious problem with this working group--there is a mindset of what the future of HTML will look like, and the holders of the mindset brook no challenge, tolerate no disagreement, and continually move to quash any possibility of asserting perhaps even the faintest difference of opinion. I'm certainly sad that this is the impression I have given. I'd like to clarify for everyone's sake that this mailing list is definitely open to any proposals, any opinions, any disagreement. The only thing I ask is that people use rational debate, back up their opinions with logical arguments, present research to justify their claims, and derive proposals from user needs. I've been especially critical of you, which isn't fair. At the same time, as you have said yourself, you are a benevolent dictator, which seems to me to not be the best strategy for an inclusive HTML for the future. I know I'm not comfortable with the concept. But I'm also late to this group, and shouldn't disrupt if the strategy works. Regardless, I got the point in the comment. That, combined with this email from Ian, tells us that it doesn't matter how our arguments run, the logic of our debate, the rightness of our cause--he is the final arbiter, and he does not want RDFa. For the record, I am as open to us including a feature like RDFa as I am to us including a feature like MathML, SVG, or indeed anything else. While I may present a devil's advocate position to stimulate critical consideration of proposals, this does not mean that my mind is made up. If my mind was made up, I wouldn't be asking for use cases, and I wouldn't be planning to investigate the issue further in April. There is a fine difference between being the devil's advocate, and the devil's front door made of thick oak, with heavy brass fittings. 
How does one know if one has provided a use case in a format that is more likely to meet a successful outcome, than not? Are the criteria documented somewhere? It's difficult to provide use cases with the twenty questions approach. What are the criteria by which a possible solution to a problem is judged? Is there a consistent set of questions asked? Tests made? A certain number of implementations? Again, is this documented somewhere? I am not paid by Google, or Mozilla, or IBM to continue throwing away my time, arguing for naught. It may be worth pointing out that many of our most active participants are volunteers, not paid by anyone to participate. Indeed I myself spent many years contributing to the standards community while unemployed or while a student. I am sorry you feel that you need to be compensated for your participation in the standards community, and wish you the best of luck in finding a suitable employer. The point I was trying to make, and forgive me if my writing was too subtle, is that it's not the fact that the work will take time, but whether the time will be well spent. Operating in the dark and tossing use cases in hopes they stick against the wall, without understanding criteria is not a particularly good use of time. However, having specific tasks that meet a given goal, and knowing that the goal is stable, and not a moving target, goes a long way to ensuring that the time spent has value. Knowing that one can, with diligence, ensure that the best result occurs is a good use of time. Spitting into the wind, at the whim and whimsy of a benevolent dictator, is not a good use of time. As far as Google goes, we have no corporate opinion either way on the topic of RDFa in HTML5. We do, however, encourage the continued practice of basing decisions on data rather than hopes. Bully for Google. Shelley
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
Ian Hickson wrote: On Sat, 17 Jan 2009, Sam Ruby wrote: But back to expectations. I've seen references elsewhere to Ian being booked through the end of this quarter. I may have misheard, but in any case, my point is the same: if this is awaiting something from Ian, it will be prioritized and dealt with accordingly. For what it's worth, my current plan running up to last call in October includes an item in April for me to go through all the use cases that have by that point been put forward in the data markup space, and to work out for each use case and on the aggregate: 1. Whether there is compelling evidence that users want that use case addressed (e.g. whether there are successful companies addressing that use case using proprietary solutions or ad-hoc extensions to HTML, or whether there are usability studies or some independent market research showing demand from users, or whether it can be demonstrated that users are avoiding the Web because it doesn't address this problem). Again, you've become gatekeeper Ian. You are the one making the decision as to worth. You are the only one, as far as I can see, that is making decisions about what is, or is not included in the next version of HTML. You use I so frequently. Reading through your emails, one can't help wondering if you're the lead singer and everyone else here is nothing more than a faint echo. 2. Whether the use case is being addressed well enough already (e.g. if there are companies addressing this use case adequately, or whether the current solutions really are just hacks with numerous problems). 3. What the requirements are for each use case. 4. What solutions are available to address these use cases. 5. For each solution, whether it addresses the requirements. 6. Whether the relevant implementors are interested in implementing solutions for these use cases (e.g. 
whether authoring tools are willing to expose the feature, whether validator writers want to check for the correctness, whether browser vendors are willing to expose the relevant UI, whether search engine companies are willing to use the data, or whatever else might be appropriate). The more use cases there are, the better informed the results will be. The point isn't to provide use cases. The point is to highlight a serious problem with this working group--there is a mindset of what the future of HTML will look like, and the holders of the mindset brook no challenge, tolerate no disagreement, and continually move to quash any possibility of asserting perhaps even the faintest difference of opinion. My apologies for not responding sooner to this thread. You see, one of the WhatWG working group members thought it would be fun to add a comment to my Stop Justifying RDF and RDFa web post, which caused the page to break. I am using XHTML at my site, because I want to incorporate inline SVG, in addition to RDFa. An unfortunate consequence of XHTML is its less than forgiving nature regarding playful pranks such as this. I'm assuming the WhatWG member thought the act was clever. It was, indeed. Three people emailed me to let me know the post was breaking while loading the page in a browser, and I made sure to note that such breakage was courtesy of a WhatWG member, who decided that perhaps I should just shut up, here and at my site, about the Important Work people(?) here are doing. Of course, the person only highlighted why it is so important that something such as RDFa, and SVG, and MathML, get a home in HTML5. XHTML is hard to support when you're allowing comments and external input. Typically my filters will catch the accidental input of crappy markup, but not the intentional. Not yet. I'm not an expert at markup, but I know more than the average person. And the average person most likely doesn't have my commitment, either. 
Someone earlier said that HTML5 is for web application users, only, and that the rest of us interested in things like RDFa should just use XHTML. In other words, make it good for Google and to hell with the rest of us. This, this is the guiding attitude behind the future of the web? Regardless, I got the point in the comment. That, combined with this email from Ian, tells us that it doesn't matter how our arguments run, the logic of our debate, the rightness of our cause--he is the final arbiter, and he does not want RDFa. I am not paid by Google, or Mozilla, or IBM to continue throwing away my time, arguing for naught. Shelley
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
Anne van Kesteren wrote: On Sun, 18 Jan 2009 16:22:40 +0100, Shelley Powers shell...@burningbird.net wrote: My apologies for not responding sooner to this thread. You see, one of the WhatWG working group members thought it would be fun to add a comment to my Stop Justifying RDF and RDFa web post, which caused the page to break. I am using XHTML at my site, because I want to incorporate inline SVG, in addition to RDFa. An unfortunate consequence of XHTML is its less than forgiving nature regarding playful pranks such as this. I'm assuming the WhatWG member thought the act was clever. It was, indeed. Three people emailed me to let me know the post was breaking while loading the page in a browser, and I made sure to note that such breakage was courtesy of a WhatWG member, who decided that perhaps I should just shut up, here and at my site, about the Important Work people(?) here are doing. Of course, the person only highlighted why it is so important that something such as RDFa, and SVG, and MathML, get a home in HTML5. XHTML is hard to support when you're allowing comments and external input. Typically my filters will catch the accidental input of crappy markup, but not the intentional. Not yet. I'm not an expert at markup, but I know more than the average person. And the average person most likely doesn't have my commitment, either. http://annevankesteren.nl/2009/01/xml-sunday shows the commenter (who by the way seems to be on your side in this debate) simply forgot to escape self-closed / and then WordPress somehow messed up in an attempt to fix it. I don't think anyone tries to make you shut up. (And if we, the evil WHATWG cabal, wanted to break your site, we would've asked Philip` ;-)) You're not seeing all of the markup that caused problems, Anne. The intention was to crash the post. However, I shouldn't have assumed that the person who inserted the markup that caused the problems is a WhatWG member. My apologies. 
Regardless of intent, it does demonstrate, again, why it is important for RDFa, SVG, and MathML to find a home in HTML5. XHTML is a very difficult markup to support when you're allowing outside input. The tools do not do a good job of supporting XHTML, and hence the average person finds such failures to be intimidating, and will immediately return to HTML. Heck, I find the yellow screen of death to be unnerving myself. It's only my interest in inline SVG and RDFa, and basically distributed extensibility, that keeps me trying. And regardless of the fact that I jumped to conclusions about WhatWG membership, I do not believe I was inaccurate with the earlier part of this email. Sam started a new thread in the discussion about the issues of namespace and how, perhaps, we could find a way to work the issues through with RDFa. My god, I use RDFa in my pages, and they load fine with any browser, including IE. I have to believe its incorporation into HTML5 is not the daunting effort that others make it seem to be. However, the debate ended as soon as Ian re-asserted his authority. Shelley
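The difficulty of accepting outside input on an XHTML page can be illustrated with a small guard. This is a sketch under stated assumptions, not how any particular weblog tool works: safe_comment_fragment is a hypothetical helper that keeps a comment's markup only if a strict XML parser accepts it, and otherwise escapes the whole comment so the page stays well-formed. It checks well-formedness only; it does not filter unsafe elements such as scripts.

```python
import xml.etree.ElementTree as ET
from html import escape


def safe_comment_fragment(comment):
    """Return a div that cannot break an XML parse of the surrounding page.

    Hypothetical helper: well-formed comments are kept as markup, anything
    else is escaped and shown as literal text.
    """
    wrapped = f"<div>{comment}</div>"
    try:
        ET.fromstring(wrapped)  # strict check, like an XHTML-as-XML browser parse
        return wrapped
    except ET.ParseError:
        return f"<div>{escape(comment)}</div>"


print(safe_comment_fragment("<em>ok</em>"))
print(safe_comment_fragment("a <sarcasm> b & c"))
```

Running the check again on the final fragment, in effect a second preview after editing, is what would catch the kind of stray markup described in this thread before it reaches readers.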
[whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
The debate about RDFa highlights a disconnect in the decision making related to HTML5. The purpose behind RDFa is to provide a way to embed complex information into a web document, in such a way that a machine can extract this information and combine it with other data extracted from other web pages. It is not a way to document private data, or data that is meant to be used by some JavaScript-based application. The sole purpose of the data is for external extraction and combination. An earlier email between Martin Atkins and Ian Hickson had the following: On Sun, 11 Jan 2009, Martin Atkins wrote: One problem this can solve is that an agent can, given a URL that represents a person, extract some basic profile information such as the person's name along with references to other people that person knows. This can further be applied to allow a user who provides his own URL (for example, by signing in via OpenID) to bootstrap his account from existing published data rather than having to re-enter it. So, to distill that into a list of requirements: - Allow software agents to extract profile information for a person as often exposed on social networking sites from a page that represents that person. - Allow software agents to determine who a person lists as their friends given a page that represents that person. - Allow the above to be encoded without duplicating the data in both machine-readable and human-readable forms. Is this the sort of thing you're looking for, Ian? Yes, the above is perfect. (I cut out the bits that weren't really the problem from the quote above -- the above is what I'm looking for.) The most critical part is allow a user who provides his own URL to bootstrap his account from existing published data rather than having to re-enter it. The one thing I would add would be a scenario that one would like to be able to play out, so that we can see if our solution would enable that scenario. For example: I have an account on social networking site A. 
I go to a new social networking site B. I want to be able to automatically add all my friends from site A to site B. There are presumably other requirements, e.g. site B must not ask the user for the user's credentials for site A (since that would train people to be susceptible to phishing attacks). Also, site A must not publish the data in a manner that allows unrelated users to obtain privacy-sensitive data about the user, for example we don't want to let other users determine relationships that the user has intentionally kept secret [1]. It's important that we have these scenarios so that we can check if the solutions we consider are actually able to solve these problems, these scenarios, within the constraints and requirements we have. It would seem that Ian agrees with a need to both a) provide a way to document complex information in a consistent, machine readable form and that b) the purpose of this data is for external consumption, rather than internal use. Where the disconnect comes in is he believes that RDF, and the web page serialization technique, RDFa, are only one of a set of possible solutions. Yet at the same time, he references how the MathML and SVG people provide sufficient use cases to justify the inclusion of both of these into HTML5. But what is MathML. What does it solve? A way to include mathematical formula into a document in a formatted manner. What is SVG? A way to embed vector graphics into a web page, in such a way that the individual elements described by the graphics can become part of the overall DOM. So, why accept that we have to use MathML in order to solve the problems of formatting mathematical formula? Why not start from scratch, and devise a new approach? So, why accept that we have to use SVG in order to solve the problems of vector graphics? Why not start from scratch, and devise a new approach? Come to think of it, I think we should also question the use of the canvas element. 
After all, if the problem set is that we need the ability to animate graphics in a web page using a non-proprietary technology, then wouldn't something like SVG work for this purpose? Isn't the canvas element redundant? But then, perhaps we should start over from the beginning and just create a new graphics capability from scratch, and reject both canvas and SVG. We don't reject MathML, though. Neither do we reject SVG or canvas. Or any other of a number of entities being included in HTML5, including SQL. Why? Because they have a history of use, extensive documentation as to purpose and behavior, and there are a considerable number of implementations that support the specifications. It doesn't make sense to start from scratch. It makes more sense to make use of what already works. I have to ask, then: why do we isolate RDF, and RDFa for special handling? If we can accept that SQL is a natural database query mechanism, and SVG is a natural for
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
Dan Brickley wrote:

On 17/1/09 19:27, Sam Ruby wrote:

On Sat, Jan 17, 2009 at 11:55 AM, Shelley Powers shell...@burningbird.net wrote:

The debate about RDFa highlights a disconnect in the decision making related to HTML5.

Perhaps. Or perhaps not. I am far from an apologist for Hixie (nor, for that matter, am I a strong advocate for RDF), but I offer the following question and observation.

The purpose behind RDFa is to provide a way to embed complex information into a web document, in such a way that a machine can extract this information and combine it with other data extracted from other web pages. It is not a way to document private data, or data that is meant to be used by some JavaScript-based application. The sole purpose of the data is for external extraction and combination.

So, I take it that it isn't essential that RDFa information be included in the DOM? This is not rhetorical: I honestly don't know the answer to this question.

Good question. I for one expect RDFa to be accessible to JavaScript. http://code.google.com/p/rdfquery/wiki/Introduction - http://rdfquery.googlecode.com/svn/trunk/demos/markup/markup.html is a nice example of code that does something useful in this way.

cheers, Dan

I agree, and appreciate Dan for pointing out a specific instance of use. Apologies for not making the assertion explicit.

Shelley

-- http://danbri.org/
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
Sam Ruby wrote:

On Sat, Jan 17, 2009 at 1:33 PM, Dan Brickley dan...@danbri.org wrote:

On 17/1/09 19:27, Sam Ruby wrote:

On Sat, Jan 17, 2009 at 11:55 AM, Shelley Powers shell...@burningbird.net wrote:

The debate about RDFa highlights a disconnect in the decision making related to HTML5.

Perhaps. Or perhaps not. I am far from an apologist for Hixie (nor, for that matter, am I a strong advocate for RDF), but I offer the following question and observation.

The purpose behind RDFa is to provide a way to embed complex information into a web document, in such a way that a machine can extract this information and combine it with other data extracted from other web pages. It is not a way to document private data, or data that is meant to be used by some JavaScript-based application. The sole purpose of the data is for external extraction and combination.

So, I take it that it isn't essential that RDFa information be included in the DOM? This is not rhetorical: I honestly don't know the answer to this question.

Good question. I for one expect RDFa to be accessible to JavaScript. http://code.google.com/p/rdfquery/wiki/Introduction - http://rdfquery.googlecode.com/svn/trunk/demos/markup/markup.html is a nice example of code that does something useful in this way.

The fact that this works anywhere at all today implies that few, if any, changes to browsers are required in order to support this. Is that a fair statement?

I've not taken a look at the code, but have taken a quick glance at the output using IE8.0.7000.0 beta, Safari 3.2.1/Windows, Chrome 1.0.154.43, Opera 9.63, and Firefox 3.0.5. The page is different (as in less functional) under IE8 and Safari. Is there something that they need to do which is not already covered in the HTML5 specification in order to support this?

I would think we would have to go through the code to see why this specific instance of client-side access of the RDFa isn't working.
The debugger I'm using with IE8 shows the problem is occurring in the jQuery code, not necessarily in anything specific to the RDFa plugin. I know other JavaScript libraries that work with RDFa do work, at least with Safari. For instance: http://www.w3.org/2006/07/SWD/RDFa/impl/js/ Since this library was vetted for IE7, I would assume it works with IE8, too.

Of course, the RDFa attributes aren't incorporated into HTML5, which means their use would result in an invalid document. And of course, if they were incorporated, the issue of namespaces for them would have to be addressed, as namespaces were for MathML and SVG.

Shelley

- Sam Ruby
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
Ian Hickson wrote:

On Sat, 17 Jan 2009, Sam Ruby wrote:

Shelley Powers wrote:

So, why accept that we have to use MathML in order to solve the problems of formatting mathematical formulas? Why not start from scratch, and devise a new approach?

Ian explored (and answered) that here: http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-April/014372.html Key to Ian's decision was the importance of DOM integration for this vocabulary. If DOM integration is essential for RDFa, then perhaps the same principles apply. If not, perhaps some other principles may apply.

Sam's point here bears repeating, because there seems to be an impression that we took on SVG and MathML without any consideration, while RDF is getting an unfair reception. On the contrary, SVG and MathML got the same reception. For MathML, for instance, a number of options were very seriously considered, most notably LaTeX. For SVG, we considered a variety of options including VML. I would encourage people to read the e-mail Sam cited: http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-April/014372.html It's long, but the start of it is a summary of what was considered and shows that the same process derived from use cases was used for SVG and MathML as is being used on this thread here.

I'm not doubting the effort that went into getting MathML and SVG accepted. I've followed the effort associated with SVG since the beginning. I'm not sure if the same procedure was also applied to the canvas object, as well as the SQL query capability. I will assume so.

The point I'm making is that you set a precedent, and a good one I think: giving precedence to what was not invented here. In other words, not re-inventing ways of doing something, but looking for established processes, models, et al. already in place, implemented, vetted, etc., that solve specific problems.
Now that you have accepted a use case, Martin's, and we've established that RDFa solves the problem associated with the use case, the issue then becomes whether there is another data model, already as vetted, documented, and implemented, that would better solve the problem.

I propose that RDFa is the best solution to the use case Martin supplied, and we've shown how it is not a disruptive solution to HTML5. The fact that it is based on RDF, a mature, well-documented, widely used model with many different implementations, is a perk.

Shelley
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
Henri Sivonen wrote:

On Jan 17, 2009, at 20:33, Dan Brickley wrote:

Good question. I for one expect RDFa to be accessible to JavaScript. http://code.google.com/p/rdfquery/wiki/Introduction - http://rdfquery.googlecode.com/svn/trunk/demos/markup/markup.html is a nice example of code that does something useful in this way.

Does this code run the same way on both DOMs parsed from text/html and application/xhtml+xml in existing browsers without at any point branching on a condition that is a DOM difference between text/html-originated and application/xhtml+xml-originated DOMs?

I don't want to look specifically at just the one case, since it is not working in Safari and IE8 and is too complex to debug right at this moment. Generally, though, RDFa is based on reusing a set of attributes already existing in HTML5, and adding a few more. I would assume no differences in the DOM based on XHTML or HTML. The one issue that would occur has to do with the values assigned, not the syntax.

I put together a very crude demonstration of JavaScript access of a specific RDFa attribute, about. It's temporary, but if you go to my main web page, http://realtech.burningbird.net, and look in the sidebar for the click me text, it will traverse each div element looking for an about attribute, and then pop up an alert with the value of the attribute. I would use console rather than alert, but I don't believe all browsers support console, yet.

Access the page using Firefox, which is served the page as XHTML. Access it with IE8, which gets the page as HTML. You can tell the difference because my graphics are based on inline SVG, and will only show if the page is served as XHTML. So, yes, with my quick, crude demonstration, DOM access is the same in both environments.

Shelley
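The traversal described in the message above (visit each div, look for an about attribute, report its value) can be sketched outside the browser. This is only an illustration, not the script on the page mentioned: it swaps the browser DOM and JavaScript for Python's standard-library html.parser, and the sample markup, the AboutCollector class and the collect_about function are all hypothetical names invented for the example.

```python
from html.parser import HTMLParser


class AboutCollector(HTMLParser):
    """Collect the value of the `about` attribute on every div element."""

    def __init__(self):
        super().__init__()
        self.about_values = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag and attribute names already lowercased.
        if tag != "div":
            return
        for name, value in attrs:
            if name == "about":
                self.about_values.append(value)


def collect_about(markup):
    """Return the `about` values found on div elements, in document order."""
    parser = AboutCollector()
    parser.feed(markup)
    parser.close()
    return parser.about_values


# Hypothetical markup, invented for illustration only.
SAMPLE = """
<div about="http://example.org/post/1"><p>First entry</p></div>
<div class="sidebar"><p>No about attribute here</p></div>
<div about="http://example.org/post/2"><p>Second entry</p></div>
"""

if __name__ == "__main__":
    for value in collect_about(SAMPLE):
        print(value)
```

Because about is a plain, unprefixed attribute, a lookup like this has nothing namespace-related to trip over; it says nothing about attributes whose names contain a colon.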
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
Henri Sivonen wrote:

On Jan 17, 2009, at 21:38, Shelley Powers wrote:

I'm not doubting the effort that went into getting MathML and SVG accepted. I've followed the effort associated with SVG since the beginning. I'm not sure if the same procedure was also applied to the canvas object, as well as the SQL query capability. Will assume so.

Note that SVG, MathML and SQL have had different popularity trajectories in top four browser engines than RDF.

SVG is going up. At the time it was included in HTML5 (only to be commented out shortly thereafter), three of the top browser engines implemented SVG for retained-mode vector graphics and their SVG support was actively being improved. (One of the top four engines implemented VML, though.)

At the time MathML was included in HTML5, it was supported by Gecko with renewed investment into it as part of the Cairo migration. Also, Opera added some MathML features at that time. Thus, two of the top four engines had active MathML development going on. Further, one of the major MathML implementations is an ActiveX control for IE.

When SQL was included in HTML5, Apple (in WebKit) and Google (in Gears) had decided to use SQLite for this functionality. Even though Firefox doesn't have a Web-exposed database, Firefox also already ships with embedded SQLite. At that point it would have been futile for HTML5 to go against the flow of implementations.

The story of RDF is very different. Of the top four engines, only Gecko has RDF functionality. It was implemented at a time when RDF was a young W3C REC and things that were W3C RECs were implemented less critically than nowadays. Unlike SVG and MathML, the RDF code isn't actively developed (see hg logs). Moreover, the general direction seems to be away from using RDF data sources in Firefox internally.

Now wait a second, you're changing the parameters of the requirements. Before, the criterion was based on the DOM. Now you're saying that the browsers actually have to do something with it.
Who is to say what the browsers will do with RDF in the future? In addition, is that the criterion for pages on the web -- that every element in them has to result in different behaviors in browsers, only? What about other user agents? That seems to me to be looking for RDFa-sized holes and then throwing them into the criteria, specifically to trip up RDF, and hence, RDFa.

Meanwhile, the feed example you gave--RSS 1.0--shows how the feed spec community knowingly moved away from RDF with RSS 2.0 and Atom. Furthermore, RSS 1.0 usually isn't parsed into an RDF graph but is treated as XML instead. If RSS 1.0 is evidence, it's evidence *against* RDF.

The point I'm making is that you set a precedent, and a good one I think: giving precedence to not invented here. In other words, to not re-invent new ways of doing something, but to look for established processes, models, et al already in place, implemented, vetted, etc, that solve specific problems. Now that you have accepted a use case, Martin's, and we've established that RDFa solves the problem associated with the use case, the issue then becomes is there another data model already as vetted, documented, implemented that would better solve the problem.

Clearly, RDFa wasn't properly vetted--as far as the desire to deploy it in text/html goes--when the outcome was that it ended up using markup that doesn't parse into the DOM the same way in HTML and XML.

SVG and MathML were both created as XML, and hence were not vetted for text/html, either. And yet, here they are. Well, here they'll be, eventually. Come to that -- I don't think the creators of SQL ever actually expected that someday SQL queries would be initiated from HTML pages.

Shelley
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
Sam Ruby wrote:

On Sat, Jan 17, 2009 at 2:38 PM, Shelley Powers shell...@burningbird.net wrote:

I propose that RDFa is the best solution to the use case Martin supplied, and we've shown how it is not a disruptive solution to HTML5.

Others may differ, but my read is that the case is a strong one. But I will caution you that a little patience is in order. SVG is not a done deal yet. I've been involved in a number of standards efforts, and I've never seen a case of "proposed on a Saturday morning, decided on a Saturday afternoon". One demo is not conclusive. Now you mention that there exist a number of libraries. I think that's important. Very important. Possibly conclusive.

I am patient. Look at me: I make extensive use of both SVG and RDF -- that is the mark of a patient woman.

But back to expectations. I've seen references elsewhere to Ian being booked through the end of this quarter. I may have misheard, but in any case, my point is the same: if this is awaiting something from Ian, it will be prioritized and dealt with accordingly. If, however, some of the legwork is done for Ian, this may help accelerate the effort.

First of all, whatever happens has to happen with vetting by the RDF/RDFa folks, if not their active help. This is my way of saying I'd be willing to do much of the legwork, but I want to make sure I don't represent RDFa incorrectly. Secondly, my finances have been caught up in the current downturn, and my first priority has to be the hourly work and odd jobs I'm getting to keep afloat. Which means that I can't always guarantee 20+ hours a week on a task, nor can I travel. Anywhere. But if both are acceptable conditions, I'm willing to help with tasks.

Even little things may help a lot. I know what I'm about to say may be unpopular, but I'll say it anyway: take a few good examples of RDFa and run them through Henri's validator.
The validator will helpfully indicate exactly what areas of the spec would need to be updated in order to accommodate RDFa. The next step would be to take a look at those sections. If the update is obvious and straightforward, perhaps nothing more is required. But if not, researching into the options and making recommendations may help. Tasks including this one. Shelley - Sam Ruby
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
The assumption is incorrect. Please compare http://hsivonen.iki.fi/test/moz/xmlns-dom.html and http://hsivonen.iki.fi/test/moz/xmlns-dom.xhtml Same bytes, different media type.

I put together a very crude demonstration of JavaScript access of a specific RDFa attribute, about. It's temporary, but if you go to my main web page, http://realtech.burningbird.net, and look in the sidebar for the click me text, it will traverse each div element looking for an about attribute, and then pop up an alert with the value of the attribute. I would use console rather than alert, but I don't believe all browsers support console, yet.

This misses the point, because the inconsistency is with attributes named xmlns:foo.

And I also said that we would have to address the issue of namespaces, which actually may require additional effort. I said that the addition of RDFa would mean the addition of some attributes, and we would have to deal with namespace issues. Just like the HTML5 working group is having to deal with namespaces with MathML and SVG. And probably the next dozen or so innovations that come along. That is the price for not having distributed extensibility. One works the issues.

I assume the same could be said of many of the newer additions to HTML5. Are you then saying that this will be a showstopper, and there will never be either a workaround or compromise?

Shelley
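The xmlns:foo inconsistency raised above can be illustrated, loosely, without a browser. The sketch below is an analogy only, not a reproduction of the linked test pages: it feeds the same bytes to Python's standard-library html.parser (which, like a text/html parser, treats xmlns:foo as an ordinary attribute name) and to xml.dom.minidom (a namespace-aware XML parser). The MARKUP string and the helper names html_view and xml_view are hypothetical, made up for the example.

```python
from html.parser import HTMLParser
from xml.dom import minidom

# Hypothetical markup: one element carrying a prefixed namespace declaration.
MARKUP = '<div xmlns:foo="http://example.org/foo">text</div>'


class AttrNames(HTMLParser):
    """Record attribute names exactly as the HTML tokenizer reports them."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        self.names.extend(name for name, _value in attrs)


def html_view(markup):
    """HTML-style parse: the attribute is just a name containing a colon."""
    parser = AttrNames()
    parser.feed(markup)
    parser.close()
    return parser.names


def xml_view(markup):
    """XML parse: the same attribute is split into namespace, prefix, local name."""
    doc = minidom.parseString(markup)
    attr = doc.documentElement.getAttributeNode("xmlns:foo")
    return (attr.namespaceURI, attr.prefix, attr.localName)


if __name__ == "__main__":
    print("HTML-style parse:", html_view(MARKUP))
    print("XML parse:       ", xml_view(MARKUP))
```

Code that looks the attribute up by namespace and local name would find it only in the XML-parsed tree, while code that looks it up by the literal name xmlns:foo depends on how each parser exposes qualified names. That is the kind of branching the earlier question about text/html versus application/xhtml+xml DOMs was asking about.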