Re: [whatwg] Creative Commons Rights Expression Language
On Sep 1, 2008, at 06:20, Karl Dubost wrote:
> On 29 August 2008 at 23:04, Henri Sivonen wrote:
> > Also, having more metadata leads to UI clutter and data entry fatigue that alienates users. In the past, I worked on a content repository project that failed because (among other things) the content upload UI asked for an insane amount (a couple of screenfuls back then; probably a screenful today) of metadata when it didn't occur to system specifiers to invest in full text search. More metadata isn't better. Instead, systems should ask for the least amount of metadata that can possibly work (when the metadata must be entered by humans as opposed to being captured by machines like EXIF data). See also http://www.w3.org/QA/2008/08/the-digital-stakhanovite
>
> hehe. This was a-good-try-but-mischaracterization-from-the-ministry-of-truth

That was uncalled for.

> to associate this article with the rants on metadata :) Let's clarify.

It's an excellent article. Thank you for writing it.

> What I explain in the article is not the volume of metadata, but the volume of items and the context of usage.
>
> 1. Extract anything you can from the data itself (exif, iptc, xmp, modifications, date)

Yes. It's sad how some systems ask the user for a title when the title is already in an HTML or PDF file but it never occurred to the specifiers of the system that files can actually be parsed. It's even sadder to ask the user for keywords because it never occurred to the specifiers of a system that full-text search has been invented.

> 2. Give a possibility in the UI to modify or add data.

Even the *possibility* to add costs UI real estate, so specifiers of a system should be very, very careful in what possibilities they offer.

> In a business environment, you might have to give metadata about a work. I do it in my every day job. I give titles to my emails, I put comments in my cvs commits, etc. etc. These are all constraints. Not adding the data would still work technically.

Sure. However, writing a string that appears in a mailbox list view or in a list view of commits is the baseline of user-entered metadata. Everything else is something *more*. Just because something happens in a business setting where people can be fired doesn't mean that more metadata is better. I've seen metadata fail even in the military, where they thought they could *order* people to enter metadata (and where they have a more elaborate punishment structure than in an ordinary working environment).

> Having a UI cluttered with fields to enter is not a failure of metadata, it is a failure of the project in the social and business constraints of the project.

It's definitely a failure of the project in the social and business constraints. The reason for failure was a line of thought that went something like this: "Metadata is good. Therefore, let's have more of it. Let's model what can be said about the domain. We are in a position to require people to enter the metadata." The process didn't seriously try to find out what the real must-have hard social and business constraints were.

My point is that "metadata is useful" isn't the whole story.

-- 
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/
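To make the point about parsing files instead of asking the user more concrete, here is a minimal sketch in Python. It assumes the uploaded file is HTML and uses only the standard library's `html.parser`; the names `TitleExtractor` and `suggest_title` are hypothetical, chosen for illustration. An upload form could pre-fill its title field with the result and leave the field editable.

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collects the text of the first <title> element (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is considered.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        # Collapse runs of whitespace; return None when nothing was found.
        text = " ".join("".join(self._parts).split())
        return text or None


def suggest_title(html_text):
    """Return a title suggestion parsed from the document, or None."""
    parser = TitleExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.title
```

The same idea would apply to PDF or image files with a suitable parser; those are left out here because they need third-party libraries.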
Re: [whatwg] Creative Commons Rights Expression Language
Sorry for joining naively in a conversation I've not been following, but reading Karl's remarks on facilitating metadata entry for users, together with some discussions in the vicinity of the recent SVGOpen that concerned usability, accessibility, and metadata, made me think the following (which I suppose is rather outside the realm of HTML):

Suppose the user or author (since in an app the distinction is blurred somewhat) is building something like a graph (in the discrete math sense), an image repository, or even a diagram (though the categories of content here are heterogeneous, making the argument a bit more tenuous) using a GUI web app (like Inkscape for diagrams or http://srufaculty.sru.edu/david.dailey/svg/graphs30.svg for graphs). Let's say there are n basic entities (like graphs or images) for which metadata is required. Let us furthermore assume the metadata description language is of order 0, 1, 2, 3, or 4 * and that the minimum number of user operations required to complete the metadata description for a single entity is bounded above by k.

We may then plot a user performance function that estimates the probability, p, that users will actually succeed in entering data (as a function perhaps of not only n and k, but of the user's investment in the process). Clearly, as n and k grow and as the user's investment in the process declines, so does p. We are interested, through the interface, in maximizing p.

I have a hunch (in math it is called a conjecture, but in CHI it is more like a hunch) that not only how, but also when, this conversation between user and software takes place affects the probability. For example, if an artist were using Inkscape to draw SVG, then mandating a conversation about metadata each time a curve or gradient is completed is likely to drive users to AutoCAD for their diagrams, even if wine is served.

In certain cases, it makes most sense to build that conversation as an exit interview. If we will have k phrases to enter (using a grammar of graph theoretic phrases) for each of n objects, then we may wish to build a very comfortable GUI to facilitate that for all the affected entities upon closing the app: "Dear user, you have just completed a schematic drawing for the Intel i-Chore 42x processor, would you now like to a) save b) enter appropriate metadata c) save and enter data d) drink wine?"

The notion is that a GUI enabling this, if it were viewed as a stage or mode of development, could a) rely on the visualization of the opus as thus far created, b) be appropriately rich for the order of the metadata description language, and c) unbundle the data entry process from the creation process, hence allowing diversification of the assignment of tasks to workers (e.g. the familiar phrase of the assessment revolt of 2028: "let the bureaucrats do the bureaucracy!").

That isn't to say that we should not also facilitate the entry of data at each stage of the drawing process, with a sub-interface of the master metadata editor, but given the complexity that some metadata editors may have to convey, the nature of the conversation between user and software may not be allowed to remain entirely casual (that is, wine may need to be upgraded to tequila).

/fwiw
David

(By the way, an Intellectual Property/provenance description language such as the library and visual rights communities work with might be an interesting overlay for the web, provided both free and corporate models (together with ample graph theory) are included.)

* Define the order of a metadata description language as 0 if it consists of simple non-delimited strings, 1 if it consists of delimited strings (with a single delimiter), 2 if the delimiters are parentheses (required to match), 3 if the delimiters act like parentheses of multiple flavors as in XML, and 4 if the language is fully graph theoretic (parenthesized strings plus cross linkages -- footnotes).
- Original Message -
From: Karl Dubost [EMAIL PROTECTED]
To: Henri Sivonen [EMAIL PROTECTED]
Cc: Ben Adida [EMAIL PROTECTED]; Paul Prescod [EMAIL PROTECTED]; Ian Hickson [EMAIL PROTECTED]; WHAT-WG [EMAIL PROTECTED]
Sent: Sunday, August 31, 2008 11:20 PM
Subject: Re: [whatwg] Creative Commons Rights Expression Language

On 29 August 2008 at 23:04, Henri Sivonen wrote:
> Also, having more metadata leads to UI clutter and data entry fatigue that alienates users. In the past, I worked on a content repository project that failed because (among other things) the content upload UI asked for an insane amount (a couple of screenfuls back then; probably a screenful today) of metadata when it didn't occur to system specifiers to invest in full text search. More metadata isn't better. Instead, systems should ask for the least amount of metadata that can possibly work (when the metadata must be entered by humans as opposed to being captured by machines like EXIF data). See also http://www.w3.org/QA/2008
Re: [whatwg] Creative Commons Rights Expression Language
On 29 August 2008 at 23:04, Henri Sivonen wrote:
> Also, having more metadata leads to UI clutter and data entry fatigue that alienates users. In the past, I worked on a content repository project that failed because (among other things) the content upload UI asked for an insane amount (a couple of screenfuls back then; probably a screenful today) of metadata when it didn't occur to system specifiers to invest in full text search. More metadata isn't better. Instead, systems should ask for the least amount of metadata that can possibly work (when the metadata must be entered by humans as opposed to being captured by machines like EXIF data). See also http://www.w3.org/QA/2008/08/the-digital-stakhanovite

hehe. This was a-good-try-but-mischaracterization-from-the-ministry-of-truth to associate this article with the rants on metadata :) Let's clarify.

What I explain in the article is not the volume of metadata, but the volume of items and the context of usage.

1. Extract anything you can from the data itself (EXIF, IPTC, XMP, modifications, date).
2. Give a possibility in the UI to modify or add data.

In a business environment, you might have to give metadata about a work. I do it in my everyday job. I give titles to my emails, I put comments in my CVS commits, etc. These are all constraints. Not adding the data would still work technically.

For my own personal photos, I don't want (or have) the time to enter plenty of metadata. And that's fine. I do, though, add metadata in bulk at a regular pace, for example for location ("all these selected photos were taken in Taiwan"), with the help of GUI tools. Yes, tools save my life.

Having a UI cluttered with fields to enter is not a failure of metadata, it is a failure of the project in the social and business constraints of the project.

-- 
Karl Dubost - W3C
http://www.w3.org/QA/
Be Strict To Be Cool
Re: [whatwg] Creative Commons Rights Expression Language
On Aug 28, 2008, at 15:31, Paul Prescod wrote:
> I don't really understand why there is any debate about the utility of metadata in general. Are you also against microformats? Title elements? The meta element? It seems obvious to me that a) metadata has been a huge success on the web (the success of other techniques like NLP and PageRank notwithstanding) and b) we haven't yet invented every metadata tag we need. I think it is worthwhile to debate whether RDFa is the right solution but do we really want to go back to a debate over whether metadata is valuable or not? This is useful stuff, right?

Some metadata may be useful. A lot of it isn't. Sturgeon's Revelation applies.

I don't know what the right way to find the useful bits is, but just telling people out there to publish metadata and expecting use cases to emerge later isn't a good way, since that approach wastes a lot of people's effort.

(I'm not suggesting that you are telling people to just go publish a lot of stuff. However, the upwards-scalable RDF naming approach and the approach of ignoring triples the consumer doesn't know about seem to be designed for erring on the side of publishing too much, whereas the Microformats Process and the WHATWG approach ask for use cases first.)

One example of useless metadata evangelism that I myself fell for 8 years ago was embedding Dublin Core metadata in HTML. It wasn't nice to realize that I had been tricked into something totally pointless. (The data was redundant with HTML and HTTP native data.)

Also, having more metadata leads to UI clutter and data entry fatigue that alienates users. In the past, I worked on a content repository project that failed because (among other things) the content upload UI asked for an insane amount (a couple of screenfuls back then; probably a screenful today) of metadata when it didn't occur to system specifiers to invest in full text search. More metadata isn't better. Instead, systems should ask for the least amount of metadata that can possibly work (when the metadata must be entered by humans as opposed to being captured by machines like EXIF data).

See also http://www.w3.org/QA/2008/08/the-digital-stakhanovite

-- 
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/
Re: [whatwg] Creative Commons Rights Expression Language
Henri Sivonen wrote:
> I don't know what the right way to find the useful bits is, but just telling people out there to publish metadata and expecting use cases to emerge later isn't a good way, since that approach wastes a lot of people's effort.

In this email you claim there are no use cases. But in another email only 6 hours earlier, you said:

> I'm getting mixed signals about the extent to which RDFa is envisioned to be browser-sensitive. Weren't browsers supposed to do cool stuff with it according to some emails in this thread?

So, clearly, there are use cases we've explained. Here they are again, just in case:

SearchMonkey: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-August/015967.html
Ubiquity: http://lists.w3.org/Archives/Public/www-archive/2008Aug/0127.html
(I can't find the WHATWG link, but it was sent to WHATWG, too...)

> (I'm not suggesting that you are telling people to just go publish a lot of stuff. However, the upwards-scalable RDF naming approach and the approach of ignoring triples the consumer doesn't know about seem to be designed for erring on the side of publishing too much whereas the Microformats Process and the WHATWG approach ask for use cases first.)

You could put it that way, but what RDF is really about is publishing data in a fine-grained enough manner that applications can easily overlap. That's why you can ignore parts of the data if you don't need it. You get a much more loosely-coupled, opportunistic Web that way, which is exactly the kind of opportunity explored by tools like Ubiquity.

> One example of useless metadata evangelism that I myself fell for 8 years ago was embedding Dublin Core metadata in HTML. It wasn't nice to realize that I had been tricked into something totally pointless. (The data was redundant with HTML and HTTP native data.)

I don't think anyone was trying to trick you (did someone make money or acquire fame off your DC markup?), but certainly it's true that the infrastructure wasn't quite there (and the flexibility of adding other vocabularies wasn't there, either). We tried to remedy that with RDFa, and we can already see from the CC uptake and tools that the situation is quite a bit better.

> Also, having more metadata leads to UI clutter and data entry fatigue that alienates users. In the past, I worked on a content repository project that failed because (among other things) the content upload UI asked for an insane amount (a couple of screenfuls back then; probably a screenful today) of metadata when it didn't occur to system specifiers to invest in full text search. More metadata isn't better. Instead, systems should ask for the least amount of metadata that can possibly work (when the metadata must be entered by humans as opposed to being captured by machines like EXIF data). See also http://www.w3.org/QA/2008/08/the-digital-stakhanovite

That is an extremely limited view of how this might be used, as if every web site that wants to publish RDFa is going to prompt users for 20 fields. No. Take Craigslist again: 5 structured fields is plenty to do super interesting stuff, but we'd have to come up with a special microformat for apartment listings if we reject fine-grained metadata like RDF.

And in some cases, you *do* need to be able to output lots of metadata (think publication records at PubMed and other online journals).

Some past approaches to this problem have failed for reasons we believe we've identified:

- repetition of rendered and machine-readable data, leading to staleness
- too hard to include (modifying the HEAD, separate file) for non-savvy web publishers
- vocabularies are monolithic and non-remixable, limiting reuse and the ability of little guys to participate

We've worked hard to address these in RDFa, and the publisher interest we're seeing shows we've done *something* right. Maybe it's time to let go of old ghosts and explore how this new solution may address some of the problems of the past?

-Ben
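The "ignore the parts of the data you don't need" idea discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the triples are written out by hand rather than parsed from RDFa, the predicate URIs are illustrative, and the `Triple` and `select_known` names are hypothetical. The point is that a consumer keeps the predicates it understands and silently skips the rest, so two applications can read the same page for different purposes.

```python
from typing import Iterable, List, NamedTuple, Set


class Triple(NamedTuple):
    """One subject-predicate-object statement (illustrative type)."""
    subject: str
    predicate: str
    obj: str


def select_known(triples: Iterable[Triple], known: Set[str]) -> List[Triple]:
    """Keep only the triples whose predicate this consumer understands."""
    return [t for t in triples if t.predicate in known]


# Hand-written sample data; a real consumer would get these from a parser.
PAGE = "http://example.org/listing/42"
triples = [
    Triple(PAGE, "http://creativecommons.org/ns#license",
           "http://creativecommons.org/licenses/by/3.0/"),
    Triple(PAGE, "http://creativecommons.org/ns#attributionName", "Example Author"),
    Triple(PAGE, "http://example.org/vocab#bedrooms", "2"),  # hypothetical vocabulary
]

# A licensing tool only cares about the licensing predicates.
LICENSE_PREDICATES = {
    "http://creativecommons.org/ns#license",
    "http://creativecommons.org/ns#attributionName",
}
license_view = select_known(triples, LICENSE_PREDICATES)
```

A mapping tool could call `select_known` on the same list with its own set of predicates and never need to know about licensing.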
Re: [whatwg] Creative Commons Rights Expression Language
Ian Hickson wrote:
> > Clearly, and as the voice-over states, the site needs embedded metadata that easily connects what the user is pointing to to the structured data required for mapping.
>
> Since Craigslist doesn't have structured data now, that seems like a verifiably false claim. :-)

Did you listen to the video? It clearly states that they wrote a specific hack for Craigslist, but that they expect this to work more generically.

Site-specific hacks don't scale to the Web. A solution that scales will require a single parser, not site-specific parsers (though site-specific parsers will likely be a transition path). The video's comments about microformats should make that clear.

> In fact, Craigslist is a great example. Given how hostile Craigslist has been to people reusing their data,

You're confusing two issues. Craigslist doesn't want other *web sites* redistributing their data. I doubt they would take issue with users trying to process the data for their own private needs.

Craigslist mostly relies on its "no bots" Terms of Use to prevent other sites from reusing their data. They certainly don't make it too difficult to screen-scrape, given their simple templates.

> what reason do we have to believe that they would ever make their data accessible using RDFa? (Or any other metadata system in fact.)

So, assuming you're right about Craigslist (and I think you're wrong, as mentioned above), in your opinion, there won't be a reasonable number of publishers who want to publish RDFa (or something like it)? Everyone will just obscure their data so it's only human readable?

That's a rather limited view of the potential of the web. Do you not see the value that's unleashed by tools like Ubiquity, and the incentive that web sites will have to plug in?

-Ben
Re: [whatwg] Creative Commons Rights Expression Language
On Wed, 27 Aug 2008, Ben Adida wrote:
> Ian Hickson wrote:
> > > Clearly, and as the voice-over states, the site needs embedded metadata that easily connects what the user is pointing to to the structured data required for mapping.
> >
> > Since Craigslist doesn't have structured data now, that seems like a verifiably false claim. :-)
>
> Did you listen to the video? It clearly states that they wrote a specific hack for Craigslist, but that they expect this to work more generically.

Sure, I'm just debating "needs". It is possible to do it without structured data; indeed, the flagship example here doesn't have any. I'm not saying that that's a better design (on the contrary). It's just the way it is today.

> Site-specific hacks don't scale to the Web. A solution that scales will require a single parser, not site-specific parsers (though site-specific parsers will likely be a transition path.)

To scale to the whole Web, the only thing I can see working is the computers understanding human language. I just don't see the whole Web marking up their data using fine grained semantic markup. We have enough trouble getting them to use h1 and p.

Examine the markup of this page (which I originally stumbled across a few months ago, but which was updated just yesterday):

http://puysl.com/view.htm

This is the level of authoring that we have to deal with if we're targeting the whole Web. That page is a microcosm of specialness, but pages like it abound.

> So, assuming you're right about Craigslist (and I think you're wrong, as mentioned above), in your opinion, there won't be a reasonable number of publishers who want to publish RDFa (or something like it?) Everyone will just obscure their data so it's only human readable?

Not everyone, no. Some, many even, will get the religion and mark up their data in useful ways. But I don't see any evidence to suggest that a critical mass will do so.

> That's a rather limited view of the potential of the web. Do you not see the value that's unleashed by tools like Ubiquity, and the incentive that web sites will have to plug in?

I absolutely see the value. I would absolutely love for the Semantic Web vision to be the future. However, just because I want it to come true doesn't mean it will come true. It fundamentally relies on humans acting in a way that we _know_ they don't. We can't just ignore 18 years of experience with the Web and Web authors and say "well, our idea is so great that authors will all magically make it happen".

I think (some hip) sites will totally plug in, just as they already have, using site-specific scripts that can be downloaded by the users of those sites. I think a few will use simple domain-specific fine grained markup conventions (like Microformats); I think fewer still, possibly many but likely not a critical mass, will use RDF and RDFa.

This mirrors what happens today (e.g. GMail and other big sites have contacts APIs, a small number of sites have hCard, a very few have FOAF).

I don't see that tools like Ubiquity give any incentive to use RDF. The immediate reward from a hard-coded site-specific script is more effective than the compound reward of writing a generic script (typically a harder task), convincing at least one site to rewrite its markup to use a suitable convention, and then debugging the script to work around the bugs that that site has, even if one eventually convinces multiple sites to support the same conventions.

(Also, note that as much as things like Ubiquity are great for people like us, they, like Quicksilver before them, and the Unix command line before that, would totally confuse regular users. The concept of using a site for a single task, and copying the output of that site into another site, resonates with users in a way that "just trust us, if you tell the computer what you want it'll do it somehow" doesn't. If power like Ubiquity is the goal, we haven't yet found the UI for it.)

-- 
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] Creative Commons Rights Expression Language
On Wed, Aug 27, 2008 at 8:17 PM, Ian Hickson [EMAIL PROTECTED] wrote:
> On Wed, 27 Aug 2008, Ben Adida wrote:
> > Consider specifically the Craigslist example, where the user selects a few of the apartments and says map these. Clearly, and as the voice-over states, the site needs embedded metadata that easily connects what the user is pointing to to the structured data required for mapping.
>
> Since Craigslist doesn't have structured data now, that seems like a verifiably false claim. :-)
>
> In fact, Craigslist is a great example. Given how hostile Craigslist has been to people reusing their data, and how unstructured their page is now, what reason do we have to believe that they would ever make their data accessible using RDFa? (Or any other metadata system in fact.)

I don't really understand why there is any debate about the utility of metadata in general. Are you also against microformats? Title elements? The meta element? It seems obvious to me that a) metadata has been a huge success on the web (the success of other techniques like NLP and PageRank notwithstanding) and b) we haven't yet invented every metadata tag we need.

I think it is worthwhile to debate whether RDFa is the right solution but do we really want to go back to a debate over whether metadata is valuable or not?

This is useful stuff, right?

http://googlemapsapi.blogspot.com/2007/06/microformats-in-google-maps.html
http://greasemonkey.makedatamakesense.com/google_hcalendar/

Paul Prescod
Re: [whatwg] Creative Commons Rights Expression Language
On Thu, Aug 28, 2008 at 2:28 AM, Ian Hickson [EMAIL PROTECTED] wrote:
> ...
> > Site-specific hacks don't scale to the Web. A solution that scales will require a single parser, not site-specific parsers (though site-specific parsers will likely be a transition path.)
>
> To scale to the whole Web, the only thing I can see working is the computers understanding human language. I just don't see the whole Web marking up their data using fine grained semantic markup. We have enough trouble getting them to use h1 and p.

When did it become necessary for every new HTML element to be used by every author of every web page on the web? A huge amount of browsing time is spent on the top hundred web sites. If they do it right, it will filter down. If it doesn't, the web is still a better place than if those top hundred sites did not use standards for representing metadata.

> I think (some hip) sites will totally plug in, just as they already have, using site-specific scripts that can be downloaded by the users of those sites. I think a few will use simple domain-specific fine grained markup conventions (like Microformats); I think fewer still, possibly many but likely not a critical mass, will use RDF and RDFa.

Why would hip sites prefer site-specific scripts to standard markups, standard scripts and/or browser features? Is it really logical for each of the top sites to invent their own markup and scripts rather than cooperate on common tools?

> ...
> I don't see that tools like Ubiquity give any incentive to use RDF. The immediate reward from a hard-coded site-specific script is more effective than the compound reward of writing a generic script (typically a harder task), convincing at least one site to rewrite its markup to use a suitable convention, and then debugging the script to work around the bugs that that site has, even if one eventually convinces multiple sites to support the same conventions.

Good point. It turns out that we don't need standards bodies at all. It is also easier *at first* for every site to write their own vector markup or stylesheet language. It is even easier to invent your own networking protocol than to get one standardized (after all, you must invent it before you can get it standardized). I don't see why you believe that metadata is uniquely immune to the forces of standardization.

> This mirrors what happens today (e.g. GMail and other big sites have contacts APIs, a small number of sites have hCard, a very few have FOAF).

HTML has no standard mechanism for embedding contacts. hCard is a sort of de facto mechanism. Given how long it takes Web standards to work their way through the ecosystem, I think it's doing okay. Google supports it on some key sites. Yahoo supports it on some as well. Does it really need to be supported on Bob's Hockey Team site in order to be a success? It should be available and accessible to Bob if he wants the feature, but if not, that's cool too. JavaScript is not necessary for every site out there either.

Paul Prescod
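To illustrate the contrast drawn in this thread between site-specific scripts and one generic script that relies on a shared convention, here is a minimal Python sketch. It assumes only the two hCard class names `vcard` (the card root) and `fn` (the formatted name), assumes reasonably well-nested HTML, and uses the standard library's `html.parser`. The names `HCardNames` and `extract_names` are hypothetical, and this is far from a complete hCard parser.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not tracked on the stack.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}


class HCardNames(HTMLParser):
    """Collects the text of class="fn" elements found inside class="vcard"."""

    def __init__(self):
        super().__init__()
        self._stack = []        # one set of class names per open element
        self._fn_depth = None   # stack depth at which the current fn opened
        self._buffer = []
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            return
        classes = set((dict(attrs).get("class") or "").split())
        self._stack.append(classes)
        inside_vcard = any("vcard" in c for c in self._stack)
        if "fn" in classes and inside_vcard and self._fn_depth is None:
            self._fn_depth = len(self._stack)
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS or not self._stack:
            return
        if self._fn_depth == len(self._stack):
            # Closing the fn element: store its whitespace-normalized text.
            self.names.append(" ".join("".join(self._buffer).split()))
            self._fn_depth = None
        self._stack.pop()

    def handle_data(self, data):
        if self._fn_depth is not None:
            self._buffer.append(data)


def extract_names(html_text):
    """Return the formatted names of all hCards found in the markup."""
    parser = HCardNames()
    parser.feed(html_text)
    parser.close()
    return parser.names
```

The same function works on any page that follows the convention, regardless of the surrounding template, which is the property a site-specific scraper lacks.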
Re: [whatwg] Creative Commons Rights Expression Language
I thought the standard mechanism for embedding contacts is OBJECT[type=text/vcard].

Chris

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Prescod
Sent: Thursday, August 28, 2008 2:51 PM
To: Ian Hickson
Cc: Ben Adida; WHAT-WG
Subject: Re: [whatwg] Creative Commons Rights Expression Language

HTML has no standard mechanism for embedding contacts. hCard is a sort of de facto mechanism. Given how long it takes Web standards to work their way through the ecosystem, I think it's doing okay. Google supports it on some key sites. Yahoo supports it on some as well. Does it really need to be supported on Bob's Hockey Team site in order to be a success? It should be available and accessible to Bob if he wants the feature, but if not, that's cool too. Javascript is not necessary for every site out there either.

Paul Prescod
Re: [whatwg] Creative Commons Rights Expression Language
Ian Hickson wrote:
> > Did you listen to the video? It clearly states that they wrote a specific hack for Craigslist, but that they expect this to work more generically.
>
> Sure, I'm just debating needs. It is possible to do it without structured data, indeed the flagship example here doesn't have any.

The video clearly states that they have a site-specific hack for now, and how it would be better if they could instead parse something like microformats.

It sounds like you're saying "it's not already deployed everywhere, so we don't need to deploy it." We're trying to put together the pieces to make it more easily deployable!

> To scale to the whole Web, the only thing I can see working is the computers understanding human language. I just don't see the whole Web marking up their data using fine grained semantic markup. We have enough trouble getting them to use h1 and p.

As Paul said well, I don't think the feature needs to be used by everyone, not even close. How many publishers will really know how to use the browser SQL? I'd say in the end, the potential number of publishers is lower for browser SQL, because you need serious tech chops to make that work, whereas RDFa is as easy as copying and pasting a chunk of HTML that someone (like CC) gives you into your web page.

(The total number of *end-users* will surely be higher for SQL, given the reach of gmail and Google in general, but you keep referring to difficulty for the *publisher*, so it's important to point out how difficult it's going to be to get offline+browser-SQL working for the average publisher, especially compared to markup like RDFa, which typically requires just modifying a JSP/ASP/etc. template.)

> Examine the markup of this page (which I originally stumbled across a few months ago, but which was updated just yesterday): http://puysl.com/view.htm

And by that reasoning, I think there are a lot of other HTML5 features you need to kill, starting with browser SQL.

> Not everyone, no. Some, many even, will get the religion and mark up their data in useful ways. But I don't see any evidence to suggest that a critical mass will do so.

As I mentioned above, if you're talking about *publishers*, I think many more will find RDFa useful before they find SQL-in-the-browser useful, especially with client-side tools like Ubiquity.

> I absolutely see the value.

Okay, I think that's major progress: we agree that there's value :)

> I would absolutely love for the Semantic Web vision to be the future. However, just because I want it to come true doesn't mean it will come true.

How about letting it happen with a well-thought-out plan that tries to grow semantics out of the existing Web, and seeing if it does succeed? The cost is minimal, a number of publishers are interested, and the tools are easy to build (9 implementations of RDFa parsers already, full test suite, attribute-focused implementation, etc.).

> It fundamentally relies on humans acting in a way that we _know_ they don't.

That's a false comparison. You're going back to the argument that there is no incentive or feedback for users to produce structured data. But I just gave you two very high-profile examples: Ubiquity and SearchMonkey. Both of those provide strong user incentive to play in the structured data space, as long as that space is generic enough for small publishers to hook in. Same tool, many publishers.

> We can't just ignore 18 years

18 years where we didn't have well thought-out metadata schemes for the web, nor the client-side programmability of Firefox to stitch things together. This is not the same old thing.

> I think (some hip) sites will totally plug in, just as they already have, using site-specific scripts that can be downloaded by the users of those sites. I think a few will use simple domain-specific fine grained markup conventions (like Microformats); I think fewer still, possibly many but likely not a critical mass, will use RDF and RDFa.

So you continue to confuse *publishers* and *end-users*. If you're arguing that a small number of publishers means the feature shouldn't be used, then you've got a number of features in HTML5 that need killing (SQL).

> This mirrors what happens today (e.g. GMail and other big sites have contacts APIs, a small number of sites have hCard, a very few have FOAF).

What happens today is limited by what's allowed in HTML. Your argument is circular. We'd like RDFa to validate so people can feel more comfortable adding it to their production sites.

> I don't see that tools like Ubiquity give any incentive to use RDF. The immediate reward from a hard-coded site-specific script is more effective than the compound reward of writing a generic script (typically a harder task), convincing at least one site to rewrite its markup to use a suitable convention, and then debugging the script to work around the bugs that that site has, even if one eventually convinces multiple sites to support the
Re: [whatwg] Creative Commons Rights Expression Language
On Wed, 27 Aug 2008, Ben Adida wrote: Consider specifically the Craigslist example, where the user selects a few of the apartments and says map these. Clearly, and as the voice-over states, the site needs embedded metadata that easily connects what the user is pointing to to the structured data required for mapping. Since Craigslist doesn't have structured data now, that seems like a verifiably false claim. :-) In fact, Craigslist is a great example. Given how hostile Craigslist has been to people reusing their data, and how unstructured their page is now, what reason do we have to believe that they would ever make their data accessible using RDFa? (Or any other metadata system in fact.) (I fully intend to reply to the rest of the e-mails sent on the topic of RDFa in due course, by the way; unfortunately it is not my top priority right now and so I can only spend so much time on it each day. You can track what feedback is outstanding on the topic here: http://www.whatwg.org/issues/#rdfa All those e-mails will get a reply from me in due course. Sorry for the delay.) -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Creative Commons Rights Expression Language
Ian, Thanks for the details. Some questions below. Ian Hickson wrote: The Database stuff was mostly driven by requests from large Web application authors (including, for example, GMail), who wanted to be able to offer their services even while their users were offline. I am quite favorable to the SQL DB in the browser approach, in that I think it is an enabler of many great things. However, it's pretty clear that this need comes from technically capable folks, not from the bulk of users, right? The need from the bulk of users is, at best, I'd like to access my email offline. So then, why is IMAP insufficient? And why SQL? Why not something a little simpler? I ask these questions because there's a parallel to RDFa here. A number of web publishers / application authors want to use RDFa, because they know it will enable many new applications. End-users likely won't know much about RDFa, nor should they. They'll just know that suddenly their browsers can recall articles similarly tagged in their history, that search engines like Yahoo's SearchMonkey can surface significant useful information directly in their search results, that reusing CC-licensed works is now much easier with automated attribution, etc... Pick any single application here, and you could come up with an easier alternative than RDFa. But putting them together, you need something more generic. The SQL database of interoperable web data, in a sense. And that's where RDF (and thus RDFa) comes in. So my question is: what would it take to convince you that we need something more generic than the one-off solutions you and others have been suggesting? How did the gmail proposal become we need more than just IMAP, we need the generic SQL DB, and we will change the way the web browser works forever by enabling offline access. Again, I like the SQL-DB idea, I'm not arguing against it, I'm just wondering how it got through your stringent process. 
And I note that RDFa is a much more modest proposal that requires almost no work for browser implementors. Consider it from our side. How would you feel if you asked a question and I told you the answer was somewhere in the HTML5 spec? Not quite the same thing, ccREL is the complete reasoning for this particular problem, from one party (the equivalent of gmail to SQL-in-browser). We have to address problems that people know they have, or would agree they have if told they had them, because people won't spend any effort to address problems they don't think they have. So, just to be clear, how does that link up with SQL-in-browser? When you say people, do you mean web publishers / application builders? The word problem doesn't appear once in the ccREL paper. Where is the statement of what ccREL is trying to solve? Well, the exact word doesn't have to appear, does it? Here are the first two sentences of the introduction: This paper introduces the Creative Commons Rights Expression Language (ccREL), the standard recommended by Creative Commons (CC) for machine-readable expression of copyright licensing terms and related information. ccREL and its description in this paper supersede all previous Creative Commons recommendations for expressing licensing metadata. From this it's pretty clear that we're trying to express copyright licensing information (with all of the sub-fields it implies and all the possible data types we might license) in machine-readable form. But I'm more concerned about RDFa, since presumably if we addressed the problems of RDFa, ccREL would be automatically resolved. Sure, although if you want to understand the use case, ccREL is fairly important. The ccREL paper is long, wordy, and doesn't really seem to clearly state the answers to the questions I listed above. Interestingly, ccREL has been extremely well received in the non-technical space. 
But I guess you can't please all the people all the time :) I'm really just looking for a simple one-page answer. A one-page answer? That's only possible if you're willing to accept premises like RDF is a good way to express interoperable data on the web. Imagine trying to convince someone about SQL-in-browser when that someone doesn't believe that SQL is the right approach, rather that it should be XML object and XPath. Can you do that in one page? So, if you're willing to start with RDF is a good idea for interoperable web data, then we can probably put together a short proposal. But without a baseline, you're sending me on a fool's errand. -Ben
Re: [whatwg] Creative Commons Rights Expression Language
On Mon, 25 Aug 2008, Ben Adida wrote: Ian Hickson wrote: The Database stuff was mostly driven by requests from large Web application authors (including, for example, GMail), who wanted to be able to offer their services even while their users were offline. However, it's pretty clear that this need comes from technically capable folks, not from the bulk of users, right? The need from the bulk of users is, at best, I'd like to access my email offline. Right, it came from a group of folk who clearly described their problem (you can't browse GMail offline, you can't use Google Reader offline, every time you query the user's data in the server-side database, you have to do a network round-trip) and their requirements (e.g. has to work for Web apps, has to work for a variety of application types, has to work offline, has to be able to support full-text search, has to be able to support structured data, has to be synchronisable). I have no idea what problem RDFa is trying to solve. I have no idea what the requirements are. If you want this seriously considered for HTML5, please write a clear and concise e-mail that explains what the needs are. So my question is: what would it take to convince you that we need something more generic than the one-off solutions you and others have been suggesting? I have no idea what problem you're trying to solve, so it's hard for me to answer this question. We have to address problems that people know they have, or would agree they have if told they had them, because people won't spend any effort to address problems they don't think they have. So, just to be clear, how does that link up with SQL-in-browser? When you say people, do you mean web publishers / application builders? Users. One of the problems, for instance, was that they could not access their GMail while offline. The word problem doesn't appear once in the ccREL paper. Where is the statement of what ccREL is trying to solve? 
Well, the exact word doesn't have to appear, does it? Here are the first two sentences of the introduction: This paper introduces the Creative Commons Rights Expression Language (ccREL), the standard recommended by Creative Commons (CC) for machine-readable expression of copyright licensing terms and related information. ccREL and its description in this paper supersede all previous Creative Commons recommendations for expressing licensing metadata. That's not a problem statement, sorry. It's a description of what it does, but it doesn't say why anyone needs that. From this it's pretty clear that we're trying to express copyright licensing information (with all of the sub-fields it implies and all the possible data types we might license) in machine-readable form. I know _what_ you're trying to do, it's _why_ you're trying to do it that matters. A one-page answer? That's only possible if you're willing to accept premises like RDF is a good way to express interoperable data on the web. The one-page answer should be explaining why you need to express interoperable data on the Web in the first place. It shouldn't even mention RDF. RDF is part of a proposed solution, it's not part of the problem. Imagine trying to convince someone about SQL-in-browser when that someone doesn't believe that SQL is the right approach, rather that it should be XML object and XPath. Can you do that in one page? The point isn't to convince me about the solution. The point is to convince me that there is a problem at all, so that we can consider what solutions might exist. But without a baseline, you're sending me on a fool's errand. I'm trying my best to explain to you how you can get somewhere here. This is a good faith effort at trying to help. If you think my advice is somehow intended to waste your time then I can't help you. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Creative Commons Rights Expression Language
Ian Hickson wrote: ... So, just to be clear, how does that link up with SQL-in-browser? When you say people, do you mean web publishers / application builders? Users. One of the problems, for instance, was that they could not access their GMail while offline. ... So, out of curiosity, where did the requirement to be SQL-based come from? Were different technologies ever considered? Such as XML, triple stores or JCR? BR, Julian
Re: [whatwg] Creative Commons Rights Expression Language
On Mon, 25 Aug 2008, Julian Reschke wrote: Ian Hickson wrote: ... So, just to be clear, how does that link up with SQL-in-browser? When you say people, do you mean web publishers / application builders? Users. One of the problems, for instance, was that they could not access their GMail while offline. ... So, out of curiosity, where did the requirement to be SQL-based come from? Were different technologies ever considered? Such as XML, triple stores or JCR? Using table storage with a SQL query front-end wasn't a requirement, it was a solution, just like using XML, using triple stores, etc could have been. It just happened that SQL solved the problem better. For example, one of the use cases was the ability to easily use the same kind of data model as was being used server-side. We actually tried exposing an XML front-end at one point, but implementors didn't want to implement it (see the old API for what was called globalStorage at the time). -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Creative Commons Rights Expression Language
[I've been asked to bring this back to the WHATWG list, so I'm doing so now. For folks who want to look at the beginning of this thread on www-archive, it begins here: http://lists.w3.org/Archives/Public/www-archive/2008Aug/0024.html ] Kristof Zelechovski wrote: Forcing metadata into content is an incompatible modification. So, that would squarely contradict Ian's point that we can already do this with existing HTML extensibility. But let's dig in for a second. Incompatible with what? What principle of HTML or existing feature of HTML would be broken by adding metadata into content? (Not to mention that Julian is right, the distinction between metadata and data is often irrelevant.) Also, does that mean microformats go against the principle of HTML? After all, they include calendar event markup in the HTML body. -Ben
Re: [whatwg] Creative Commons Rights Expression Language
Ian Hickson wrote: On Fri, 22 Aug 2008, Ben Adida wrote: cc:attributionName, cc:attributionURL, dc:title, dc:type, dc:date, Notice how these are so unique already that you didn't have to give their full names, these short names were enough for everyone to know what you were talking about without risk of clashes. So, you're looking at the web purely as humans browsing web sites? I think Dan Brickley described it well, so I'll just point to his answer and say I agree with it 100%: Actually we can do a fair bit more than simply have human readable strings. For example from the CC case, we've got a sub-property relationship between cc:license and dc:license. RDF often (more often, even) has relationships amongst classes too, and between classes and properties. So for example, the SIOC vocabulary defines a class sioc:User as a subclass of foaf:OnlineAccount; this is mechanically evident from http://rdfs.org/sioc/ns# http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-August/015933.html The idea here is to begin to build a web of data, but to do so by simply sprinkling a bit of metadata into the existing HTML. For the web of data to be useful, some amount of automated data processing has to be possible. We're not simply trying to do spreadsheets in HTML. Data field names mean things, they can be related to other field names, etc... -Ben
Re: [whatwg] Creative Commons Rights Expression Language
On Aug 21, 2008, at 21:53, Ben Adida wrote: Not to mention that our design approach was specifically tailored to be HTML5-friendly. It really isn't HTML5-friendly, since it depends on the namespace mapping context at a node. Henri Sivonen writes: and those additions use a Namespace-dependent anti-pattern, so they aren't portable to HTML. Namespaces are an anti-pattern, really? Says who? The anti-pattern I was referring to was qnames-in-content. (But, I'm not saying that Namespaces in XML were not themselves an anti- pattern. :-) The web is inherently namespaced. Everything you go to is scoped to a URL prefix. There isn't one Paris or one New York, there is wikipedia/paris, and nyc.gov/NewYork. At least in the case of New York, the settlers had the good sense to choose a short disambiguating prefix instead of thinking they were off in a different default namespace like Texas and free to reuse local names causing problems with global map search usability later. So is it the : that bothers you? Is that really relevant? It's not the colon per se, although now that XML and HTML do DOM-wise different things with the colon, the colon is trouble for element and attribute names. Here's what bothers me about namespaces: 1) I need write namespaces URIs several times a day, but the URIs aren't memorable. Mistyping an NS URI would waste even more time as bugs than looking URIs up for copying and pasting, so I look them up for copying and pasting, and it's a huge waste of time. 2) The indirection layer from prefix to URI confuses people. 3) Namespaces not inheriting to attributes confuses people. (I have had to give a crash course in how namespaces work on W3C telecons and f2f meetings! Others have had to do it as well. This point is so confusing that people whose job is working on Web specs get it wrong. I've been told about a professor teaching a class about XML who got it wrong.) 
4) Instead of comparing names against a string literal, you have to compare two datums against two literals. That is, instead of doing "foo-bar".equals(name), you have to do "http://www.example.com/2008/08/namespace#".equals(uri) && "bar".equals(localName). 5) Removing uri,local pairs from XML parsing context makes it hard to write the full name in a compact form. Witness the NSResolver complications with XPath and Selectors DOM APIs. 6) That the prefix is semantically not important confuses people who go and write uninteroperable software thinking that they should be comparing the prefix instead of the URI. 7) The design of namespaces considers parsing. It doesn't consider serialization. Writing an XML serializer that doesn't suck isn't trivial, and one will spend most of the development time on dealing with Namespaces. (The prefixes aren't important but people still have aesthetic opinions about how they should be generated...) 8) Namespaces dropped the HTML ball a decade ago letting the HTML and XML DOMs diverge. 9) Namespaces stuff their syntax into attributes as opposed to having syntax on their own meaning that certain magic attribute names need blacklisting both in parsing and in serialization. 10) Namespaces slow down parsing. (By over 20% with Xerces-J and the Wikipedia front page!) 11) I've spent *a lot* of time writing code that is Namespace-wise excruciatingly correct. Yet, Namespaces have never actually solved a problem for me. My software developer friends complain to me about how Namespaces cause them grief. No one can remember Namespaces solving a real problem. It's like feeding a white elephant. Qnames in content have further problems: They complicate APIs and the application layer when the mapping context needs to leak to the application instead of being a parser-internal thing. 
Under scripted DOM scenarios, there's the issue of the mapping context not getting captured at node creation time thereby making the meaning of qnames brittle under tree mutations. Finally, serializing XML that *may* have qnames in content without the serializer knowing which values are qnames (i.e. writing a generic serializer) is complex. (See also the TAG finding about problems with digital signatures.) Just look at what microformats are forced to do, which is effectively re-inventing ad-hoc namespaces with - separators. That's different. When the prefixes are fixed and go inside a name token without an indirection layer or without the name becoming a tuple, that's fine. You can still do "foo-bar".equals(name). The "namespaces are bad" argument is the most mind-boggling web-tech meme I've seen in a while. It's Namespaces in XML that are bad--not *necessarily* lower-case 'n' namespaces. Also, qnames-in-content are even worse than just Namespaces in XML. making them to identify which CC license they mean, making them understand what permissions they are giving irrevocably to others upon granting a license and
Re: [whatwg] Creative Commons Rights Expression Language
Bonner, Matt wrote: On Wed, Aug 20, 2008 at 5:22 PM, Bonner, Matt wrote: Hola, I see that the Creative Commons has proposed additions to HTML to support licenses (ccREL): http://www.w3.org/Submission/2008/SUBM-ccREL-20080501/ ... Tab Atkins Jr. replied: The whole thing would be best expressed as a microformat, as the entire thing can be made just as machine- and human-readable without having to introduce an entire new addition to html. I think someone is a little confused about the important of CC... then Dan Brickley wrote: I encourage you to (re)-read http://www.w3.org/Submission/2008/SUBM-ccREL-20080501/ ... the spec explains that all of CC's concrete markup requirements are addressed by the HTML additions in the RDFa spec. It does not propose *any* new HTML markup to address CC's specific needs. (big snip) In other words, adding 'about', 'property', 'resource', 'datatype' and 'typeof' and a namespace-URI association convention to HTML5 ... Just so I understand you, are you saying that attributes aren't markup? Because first you say no new markup, then you list 5 attributes to add. Ah, sorry for the unclarity. Attributes are markup. The sentence comes as a whole: I meant that ccREL proposes no new *CC-specific* attributes or elements. They get their job done using general RDFa markup. Second, the Introduction cites RDFa, which footnote 4 describes as an emerging collection of attributes and processing rules for extending XHTML to support RDF. However, the Introduction text and example go on to talk about HTML. Independent of any other discussions, I think it behooves the authors to clarify their intent. Is this for XHTML, HTML or both? Yes, this could be clearer. The group's general line (Ben feel free to correct me) is that this attribute-driven markup style is intended to be largely neutral of its 'carrier' format, but that RDFa-in-XHTML is the only version that is fully specified with implementor tests etc underway. 
For this markup to work in other XML languages would require some more work; for it to be deployed in non-XML HTML (HTML5 etc) requires even more. But the general notion is that these attributes could be deployed in SVG-based, HTML5/6-based etc. languages too, ie. that this isn't a project tightly bound to (some specific version of) XHTML. Of course in a non-XML context, some other mechanism is needed (eg. link rels) to associate abbreviations with URLs. Also in http://www.w3.org/TR/rdfa-syntax/ (now in CR at W3C, http://www.w3.org/TR/2008/CR-rdfa-syntax-20080620/) [[ RDFa is a specification for attributes to be used with languages such as HTML and XHTML to express structured data. [...] This document only specifies the use of the RDFa attributes with XHTML. ]] Does that help? cheers Dan -- http://danbri.org/
Re: [whatwg] Creative Commons Rights Expression Language
12. DOCTYPE declarations have to use prefixes where the corresponding namespaces are yet undeclared. The same problem affects external CSS. This effectively fixes the prefixes, making the redirection to the URL redundant. Chris -----Original Message----- From: [EMAIL PROTECTED] On Behalf Of Henri Sivonen Sent: Friday, August 22, 2008 9:51 AM To: Ben Adida Cc: Tab Atkins Jr.; WHAT-WG; [EMAIL PROTECTED]; Dan Brickley; Bonner, Matt; Ian Hickson Subject: Re: [whatwg] Creative Commons Rights Expression Language Here's what bothers me about namespaces: 1) I need write namespaces URIs several times a day, but the URIs aren't memorable. Mistyping an NS URI would waste even more time as bugs than looking URIs up for copying and pasting, so I look them up for copying and pasting, and it's a huge waste of time. 2) The indirection layer from prefix to URI confuses people. 3) Namespaces not inheriting to attributes confuses people. (I have had to give a crash course in how namespaces work on W3C telecons and f2f meetings! Others have had to do it as well. This point is so confusing that people whose job is working on Web specs get it wrong. I've been told about a professor teaching a class about XML who got it wrong.) 4) Instead of comparing names against a string literal, you have to compare two datums against two literals. That is, instead of doing "foo-bar".equals(name), you have to do "http://www.example.com/2008/08/namespace#".equals(uri) && "bar".equals(localName). 5) Removing uri,local pairs from XML parsing context makes it hard to write the full name in a compact form. Witness the NSResolver complications with XPath and Selectors DOM APIs. 6) That the prefix is semantically not important confuses people who go and write uninteroperable software thinking that they should be comparing the prefix instead of the URI. 7) The design of namespaces considers parsing. It doesn't consider serialization. 
Writing an XML serializer that doesn't suck isn't trivial, and one will spend most of the development time on dealing with Namespaces. (The prefixes aren't important but people still have aesthetic opinions about how they should be generated...) 8) Namespaces dropped the HTML ball a decade ago letting the HTML and XML DOMs diverge. 9) Namespaces stuff their syntax into attributes as opposed to having syntax on their own meaning that certain magic attribute names need blacklisting both in parsing and in serialization. 10) Namespaces slow down parsing. (By over 20% with Xerces-J and the Wikipedia front page!) 11) I've spent *a lot* of time writing code that is Namespace-wise excruciatingly correct. Yet, Namespaces have never actually solved a problem for me. My software developer friends complain to me about how Namespaces cause them grief. No one can remember Namespaces solving a real problem. It's like feeding a white elephant.
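Points 4 and 6 in the list above can be illustrated with a small sketch. It uses Python's standard xml.dom.minidom rather than the Java-style calls in the message, and the document, the prefixes a and b, and the helper is_bar are made up for illustration; only the example namespace URI comes from the message. Two prefixes are bound to the same URI, so a check on the prefix gives the wrong answer while the two-part check on URI and local name gives the right one.

```python
from xml.dom import minidom

# Namespace URI taken from the example in the message above.
NS = "http://www.example.com/2008/08/namespace#"

# Hypothetical document: two different prefixes bound to the same URI.
DOC = (
    '<root xmlns:a="%s" xmlns:b="%s">'
    '<a:bar/><b:bar/>'
    '</root>' % (NS, NS)
)


def is_bar(element):
    """Two-part comparison: namespace URI and local name (point 4)."""
    return element.namespaceURI == NS and element.localName == "bar"


root = minidom.parseString(DOC).documentElement
children = [n for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]

# Both elements have the same expanded name, whatever prefix was used.
matches = [is_bar(child) for child in children]

# Comparing the prefix instead (the mistake in point 6) misses <b:bar/>.
prefix_matches = [child.prefix == "a" for child in children]

print(matches)         # [True, True]
print(prefix_matches)  # [True, False]
```

The sketch only shows the comparison cost and the prefix pitfall; it says nothing about parsing speed or serializer complexity, which the list also raises.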
Re: [whatwg] Creative Commons Rights Expression Language
+cc: Ben Adida Tab Atkins Jr. wrote: On Wed, Aug 20, 2008 at 5:22 PM, Bonner, Matt [EMAIL PROTECTED] wrote: Hola, I see that the Creative Commons has proposed additions to HTML to support licenses (ccREL): http://www.w3.org/Submission/2008/SUBM-ccREL-20080501/ As an example, they offer: <div about="http://lessig.org/blog/" xmlns:cc="http://creativecommons.org/ns#"> This page, by <a property="cc:attributionName" rel="cc:attributionURL" href="http://lessig.org/"> Lawrence Lessig </a>, is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/3.0/"> Creative Commons Attribution License </a>. </div> Unless I missed something in the HTML5 spec, at the least this would add the property attribute to a. Wouldn't ccREL be expressed better using link instead of a? Matt -- Matt Bonner Hewlett-Packard Company The whole thing would be best expressed as a microformat, as the entire thing can be made just as machine- and human-readable without having to introduce an entire new addition to html. I think someone is a little confused about the importance of CC... (Note: the someone is not you, Matt, but the drafters of this proposal. Also, I love CC as much as the next guy, but there's absolutely no reason to extend html to accommodate it, as everything they want to express can be done in existing html and formatted as a microformat.) I encourage you to (re)-read http://www.w3.org/Submission/2008/SUBM-ccREL-20080501/ ... the spec explains that all of CC's concrete markup requirements are addressed by the HTML additions in the RDFa spec. It does not propose *any* new HTML markup to address CC's specific needs. Instead, they're telling the world that CC's needs (including their own requirement for independent extensions) are well-handled by RDFa. RDFa adds a set of attributes; http://www.w3.org/MarkUp/2008/ED-rdfa-syntax-20080403/#rdfa-attributes has a full list. The ccREL spec shows these in an XHTML+RDFa XHTML format. 
There's a strong case to add them to HTML5 too, in my view. In other words, adding 'about', 'property', 'resource', 'datatype' and 'typeof' and a namespace-URI association convention to HTML5 wouldn't merely be addressing the important needs of the Creative Commons community. It would allow the expression of properties defined by any decentralised community, without the need for central coordination. This includes not just CC, but every group worldwide who are extending and customising CC for their own needs. Not just FOAF, but groups extending it for modelling forum posts and social media (eg. SIOC), or opensource projects (DOAP). Not just Dublin Core, but the huge range of projects that extend it to handle educational metadata (which itself varies nationally), rights, aggregation, classification etc. The addition of the RDFa attributes would allow HTML5 to carry structured data expressed in all/any of these vocabularies. The Microformats.org community have done wonderful work and have inspired many others, but it is unfair on them (and unrealistic) to pressure their community, mailing lists and wiki by expecting their process to be a central bottleneck for all markup extensions to HTML. The Web serves a massive and fast growing community, many of whom don't speak English and are whose markup needs aren't core business for Microformats.org. By using RDFa and associating each vocabulary with a URI, we can spread the workload a bit more evenly. Note also that every new vocabulary initiative at Microformats.org creates real and non-trivial work for parser writers, as well as work for vocabulary authors in specifying what it means to mix each pair of vocabularies. For ccREL (and FOAF, Dublin Core, SIOC, DOAP, ...), this is largely handled by RDF/RDFa: it can be freely mixed with any other RDF vocabulary, and reliably parsed by generic parser code. The tradeoff here is that the markup is less hand optimised for beauty than with microformats. 
(When extra-pretty custom markup is important, RDF provides GRDDL as a way of using XSLT to specify a mapping into its common data model.) For more on RDFa, see the primer, http://www.w3.org/2006/07/SWD/RDFa/primer/ For a microformat parser that also handles RDFa, see http://buzzword.org.uk/cognition/ ... or an RDF toolkit that also parses some popular microformats, see http://arc.semsol.org/ For RDFa parsing in Javascript, see http://www.w3.org/2006/07/SWD/RDFa/impl/js/ cheers, Dan ps. my slides from a recent talk on rdf and microformats are here, if anyone's interested. It's more about how enthusiasts from each effort can learn from each other, than about the technical detail: http://www.slideshare.net/danbri/one-big-happy-family/ via http://microformats.eventwax.com/vevent -- http://danbri.org/
Re: [whatwg] Creative Commons Rights Expression Language
On Thu, 21 Aug 2008, Dan Brickley wrote: The Microformats.org community have done wonderful work and have inspired many others, but it is unfair on them (and unrealistic) to pressure their community, mailing lists and wiki by expecting their process to be a central bottleneck for all markup extensions to HTML. I don't think anyone is suggesting that all such ideas should go through the Microformats community. What is being suggested is that instead of adding more features to HTML, the people who want to annotate their HTML documents with metadata, like Creative Commons, merely use some of the many existing HTML extension mechanisms, like class=, rel=, etc. Microformats.org has shown several things; one is that it is important to actually make sure the problem you are solving is one that needs solving, another is that it is possible to use the existing HTML extension mechanisms to mark up very rich semantic data. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
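As a rough illustration of the existing rel= mechanism mentioned above, the sketch below collects rel/href pairs from a reconstruction of the ccREL sample using only Python's standard html.parser. The class name LinkRelCollector and the exact whitespace of the sample are assumptions, and this is not an RDFa processor: it ignores about, property, and the xmlns:cc mapping entirely.

```python
from html.parser import HTMLParser

# Reconstruction of the ccREL sample discussed in this thread (assumed markup).
SAMPLE = (
    '<div about="http://lessig.org/blog/" '
    'xmlns:cc="http://creativecommons.org/ns#">'
    'This page, by <a property="cc:attributionName" rel="cc:attributionURL" '
    'href="http://lessig.org/">Lawrence Lessig</a>, is licensed under a '
    '<a rel="license" href="http://creativecommons.org/licenses/by/3.0/">'
    'Creative Commons Attribution License</a>.</div>'
)


class LinkRelCollector(HTMLParser):
    """Hypothetical helper: records (rel, href) for every <a rel=... href=...>."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and "rel" in attributes and "href" in attributes:
            self.links.append((attributes["rel"], attributes["href"]))


collector = LinkRelCollector()
collector.feed(SAMPLE)
links = collector.links

# The plain rel="license" value is usable as-is; "cc:attributionURL" is only
# an opaque string here, because the prefix mapping was never consulted.
license_urls = [href for rel, href in links if rel == "license"]
print(license_urls)
```

The contrast between the two collected values is the crux of the disagreement in the thread: a fixed token such as license needs no extra context, while a prefixed value needs the namespace declaration to be interpreted.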
Re: [whatwg] Creative Commons Rights Expression Language
On Aug 21, 2008, at 10:49, Dan Brickley wrote: I encourage you to (re)-read http://www.w3.org/Submission/2008/SUBM-ccREL-20080501/ ... the spec explains that all of CC's concrete markup requirements are addressed by the HTML additions in the RDFa spec. The RDFa spec doesn't make any additions to HTML. It only specifies additions to XHTML, and those additions use a Namespace-dependent anti-pattern, so they aren't portable to HTML. In other words, adding 'about', 'property', 'resource', 'datatype' and 'typeof' and a namespace-URI association convention to HTML5 wouldn't merely be addressing the important needs of the Creative Commons community. It seems to me that the Creative Commons community has more pressing needs that aren't related to RDF syntax. Specifically: Making people refer to the license URI at all, making them identify which CC license they mean, making them understand what permissions they are giving irrevocably to others upon granting a license and making them understand what licenses used by others mean (NonCommercial, anyone?). Syntax doesn't solve any of these. People don't know what they are doing when they flip those Flickr settings: http://diveintomark.org/archives/2008/02/05/writing-with-ease#comment-11272 At least in a non-RDF context, pointing to the license by URI seems too hard. See http://intertwingly.net/blog/2008/02/09/Mashups-Smashups#c1202660852 Also note that even CC leadership omits the license URI. I encourage you to examine the last frames of the videos at http://lessig.blip.tv/. The latest video (http://lessig.blip.tv/file/1185352/) works as an example. Whenever the last frame acknowledges the use of CC-licensed photos, it doesn't show the URI of the license. In fact, it doesn't even state in words or icons *which* CC license the photos were used under! 
Getting back to the comment thread on intertwingly.net, a later comment contained this gem: http://intertwingly.net/blog/2008/02/09/Mashups-Smashups#c1202810109

My sarcasm detector isn't quite working, so I can't tell if the comment was *meant* to mock RDF, but the follow-up comment is spot on: http://intertwingly.net/blog/2008/02/09/Mashups-Smashups#c1202870522

> It would allow the expression of properties defined by any decentralised community, without the need for central coordination. This includes not just CC, but every group worldwide who are extending and customising CC for their own needs. Not just FOAF, but groups extending it for modelling forum posts and social media (eg. SIOC), or opensource projects (DOAP). Not just Dublin Core,

Interesting that you mention Dublin Core. It's a great example of why it's bad to just rush embedding an RDF vocabulary into HTML without a semantic overlap unification process like the Microformats Process. Most of the original DC elements duplicate native metadata facilities of HTML and HTTP. There will always be more content using HTML title than DC title, so consumers will be better off being able to consume HTML title. There will always be more consuming apps for HTML title than DC title, so publishers will be better off using HTML titles.

> but the huge range of projects that extend it to handle educational metadata (which itself varies nationally), rights, aggregation, classification etc. The addition of the RDFa attributes would allow HTML5 to carry structured data expressed in all/any of these vocabularies.

RDFa including namespace-URI mappings isn't the only possible way to accomplish RDF embedding into HTML5. RDFa uses CURIEs which take the qnames-in-content anti-pattern and keep digging the hole. I think we shouldn't introduce the complexity of Namespaces and qnames-in-content to HTML5.
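To make the qnames-in-content concern concrete, here is a small Python sketch (an illustration, not part of the original message). It shows that a CURIE such as cc:attributionURL is only meaningful together with a prefix binding that lives elsewhere in the document, whereas a triple written with full URIs, in the style of N-Triples, stands on its own. The PREFIXES table and the helpers expand_curie, curie_survives and ntriple are hypothetical names for this example; the subject and object URIs are taken from the ccREL example discussed in this thread.

```python
# Hypothetical prefix table, standing in for xmlns:cc="..." on an ancestor element.
PREFIXES = {"cc": "http://creativecommons.org/ns#"}


def expand_curie(curie, prefixes):
    """Resolve 'prefix:reference' against the prefix bindings in scope."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"no binding in scope for {curie!r}")
    return prefixes[prefix] + reference


def curie_survives(curie, prefixes):
    """Report whether the CURIE can still be resolved with these bindings."""
    try:
        expand_curie(curie, prefixes)
        return True
    except KeyError:
        return False


def ntriple(subject, predicate, obj):
    """Format one triple whose three terms are all full URIs."""
    return f"<{subject}> <{predicate}> <{obj}> ."


if __name__ == "__main__":
    predicate = expand_curie("cc:attributionURL", PREFIXES)
    # The full-URI line carries everything a consumer needs.
    print(ntriple("http://lessig.org/blog/", predicate, "http://lessig.org/"))
    # Copy the attribute value into a document without the binding and it breaks.
    print(curie_survives("cc:attributionURL", {}))
```

The design point is only that the attribute value and its meaning travel separately when prefixes are used; the sketch makes no claim about how a conforming RDFa processor is implemented.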
Aside: The TAG has a finding saying that qnames-in-content are problematic: http://www.w3.org/2001/tag/doc/qnameids.html

There's an obvious way RDFa could have been adjusted to avoid the ills of Namespaces and qnames-in-content: using full URIs instead of CURIEs. N-Triples demonstrates that RDF triples can be serialized without a prefix binding layer.

Even if RDFa were adjusted to use full URIs, there'd still be the issue of objections to the additional attributes by people who not only think Namespaces are bad but think that embedding RDF in HTML at all is bad. I sent an outline of a possible way to route around this issue to the HTML WG and xml-dev, but my trial balloon got Warnocked: http://lists.w3.org/Archives/Public/public-html/2008Aug/0231.html

Note: I'm not suggesting that it would be good for CC to promote something as complex as that. I wish CC weren't telling people to use RDF with any syntax (or with the NonCommercial license element, but that's off-topic here). However, something like the eRDF5 trial balloon could work for communities who, unlike CC, aren't trying to meet the general public and, therefore, can afford more
Re: [whatwg] Creative Commons Rights Expression Language
Can't you just embed your XML metadata in a SCRIPT element? Chris
Re: [whatwg] Creative Commons Rights Expression Language
If I understand it correctly, we do not have a problem with the colon as a namespace separator. Our problem is that a:x sometimes means the same as b:x and there is no reasonable way to make legacy browsers support this. Different URLs, OTOH, are not expected to mean the same thing even if one is an alias for another.
Chris

-----Original Message-----
From: Ben Adida [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 21, 2008 8:53 PM
To: Dan Brickley
Cc: Tab Atkins Jr.; Bonner, Matt; WHAT-WG; Ian Hickson; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [whatwg] Creative Commons Rights Expression Language

Namespaces are an anti-pattern, really? Says who? The web is inherently namespaced. Everything you go to is scoped to a URL prefix. There isn't one Paris or one New York, there is wikipedia/paris, and nyc.gov/NewYork.

So is it the : that bothers you? Is that really relevant?
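The aliasing problem raised at the top of this message ("a:x sometimes means the same as b:x") can be sketched in a few lines of Python. This is an illustration only: the namespace URIs, the three binding tables and the helper names expand and same_name are all hypothetical. It shows why comparing prefixed names as plain strings, which is all a namespace-unaware consumer can do, gives the wrong answer in both directions.

```python
def expand(qname, bindings):
    """Turn 'prefix:local' into a full name using the bindings in scope."""
    prefix, _, local = qname.partition(":")
    return bindings[prefix] + local


def same_name(qname_1, bindings_1, qname_2, bindings_2):
    """Namespace-aware comparison: compare the expanded names, not the strings."""
    return expand(qname_1, bindings_1) == expand(qname_2, bindings_2)


# Hypothetical bindings from three different documents.
VOCAB = "http://example.org/vocab#"
doc_one = {"a": VOCAB}
doc_two = {"b": VOCAB}
doc_three = {"a": "http://example.org/other#"}

if __name__ == "__main__":
    # Different strings, same meaning.
    print("a:x" == "b:x", same_name("a:x", doc_one, "b:x", doc_two))
    # Same string, different meaning.
    print("a:x" == "a:x", same_name("a:x", doc_one, "a:x", doc_three))
```

A consumer that matches on the literal text (a CSS selector or a simple script in a legacy browser, say) sees only the left-hand comparison in each line.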
Re: [whatwg] Creative Commons Rights Expression Language
I was trying to explain the rejection of namespaces in general because it is a general decision. It is not enough to make sure this particular use case does not cause problems.

AFAIK, a legacy browser that supports custom elements and scripting can be made to display a progress bar. This probably means you are partially right: Lynx, NCSA Mosaic and MacWeb cannot render a progress bar element.
Chris

-----Original Message-----
From: Ben Adida [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 21, 2008 9:36 PM
To: Kristof Zelechovski
Cc: 'Dan Brickley'; 'Tab Atkins Jr.'; 'Bonner, Matt'; 'WHAT-WG'; 'Ian Hickson'; [EMAIL PROTECTED]
Subject: Re: [whatwg] Creative Commons Rights Expression Language

Kristof Zelechovski wrote:
> If I understand it correctly, we do not have a problem with the colon as a namespace separator. Our problem is that a:x sometimes means the same as b:x and there is no reasonable way to make legacy browsers support this.

But... legacy browsers have no way to display a Progress Bar either, right?

RDFa does *not* affect how something is rendered. It just tells you what portions of the page mean what exactly (this is a license, this is a tag, etc...) So we're okay if legacy browsers don't understand it, they can simply ignore it.

In fact, even new browsers can ignore RDFa, leaving the job to an extension. But of course, everyone is much better off if RDFa can be validated in HTML/XHTML.

-Ben
Re: [whatwg] Creative Commons Rights Expression Language
Kristof Zelechovski wrote:
> If I understand it correctly, we do not have a problem with the colon as a namespace separator. Our problem is that a:x sometimes means the same as b:x and there is no reasonable way to make legacy browsers support this.

But... legacy browsers have no way to display a Progress Bar either, right?

RDFa does *not* affect how something is rendered. It just tells you what portions of the page mean what exactly (this is a license, this is a tag, etc...) So we're okay if legacy browsers don't understand it, they can simply ignore it.

In fact, even new browsers can ignore RDFa, leaving the job to an extension. But of course, everyone is much better off if RDFa can be validated in HTML/XHTML.

-Ben
Re: [whatwg] Creative Commons Rights Expression Language
Just a little side-track for the video issues around this thread:

On Fri, Aug 22, 2008 at 4:53 AM, Ben Adida [EMAIL PROTECTED] wrote:
> > Also note that even CC leadership omits the license URI.
>
> So you want a URI in the video content itself? What good would that do?

With links directly in the video, copies of the videos will continue to contain the license, so there is a reason for putting metadata such as the license inside the video. In fact, RDF inside video would be a big step forward in dealing with the DRM issues around videos.

> With ccREL (and specifically RDFa), the surrounding HTML can easily say *this* video is licensed under *that* license.

This is a good solution in the current situation where there is no standard video and video annotation format. If there were a standard video annotation format, we could have the video's DOM accessible directly in the browser, and questions such as "what is the video's license?" could easily be answered directly. I think this may come out of the new W3C proposed video activity http://www.w3.org/QA/2008/04/proposed_activity_for_video_on.html.

Regards,
Silvia.
Re: [whatwg] Creative Commons Rights Expression Language
Silvia Pfeiffer wrote:
> With links directly in the video, copies of the videos will continue to contain the license, so there is a reason for putting metadata such as the license inside the video.

Ah yes, in this case I agree: if the metadata were machine-readable, that would be great. I was talking about the URL not appearing in the actual human-visible content.

I do hope we see some good standardization on embedded metadata.

-Ben
[whatwg] Creative Commons Rights Expression Language
Hola,

I see that the Creative Commons has proposed additions to HTML to support licenses (ccREL): http://www.w3.org/Submission/2008/SUBM-ccREL-20080501/

As an example, they offer:

    <div about="http://lessig.org/blog/" xmlns:cc="http://creativecommons.org/ns#">
      This page, by
      <a property="cc:attributionName" rel="cc:attributionURL" href="http://lessig.org/">Lawrence Lessig</a>,
      is licensed under a
      <a rel="license" href="http://creativecommons.org/licenses/by/3.0/">Creative Commons Attribution License</a>.
    </div>

Unless I missed something in the HTML5 spec, at the least this would add the property attribute to <a>. Wouldn't ccREL be expressed better using <link> instead of <a>?

Matt
--
Matt Bonner
Hewlett-Packard Company
Re: [whatwg] Creative Commons Rights Expression Language
On Wed, 20 Aug 2008, Tab Atkins Jr. wrote:
> > Unless I missed something in the HTML5 spec, at the least this would add the property attribute to <a>. Wouldn't ccREL be expressed better using <link> instead of <a>?
>
> The whole thing would be best expressed as a microformat, as the entire thing can be made just as machine- and human-readable without having to introduce an entire new addition to html.
>
> I think someone is a little confused about the importance of CC... (Note: the someone is not you, Matt, but the drafters of this proposal. Also, I love CC as much as the next guy, but there's absolutely no reason to extend html to accommodate it, as everything they want to express can be done in existing html and formatted as a microformat.)

I tend to agree with Tab here.

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] Creative Commons Rights Expression Language
On 21 Aug 2008, at 07:22, Bonner, Matt wrote:
> I see that the Creative Commons has proposed additions to HTML to support licenses (ccREL): http://www.w3.org/Submission/2008/SUBM-ccREL-20080501/

And here is a practical implementation of it. Click on the logo at the bottom, and it returns the license with information parsed from the initial page: http://joi.ito.com/weblog/2008/08/06/board-report-fr.html#cc

There is also a CC License Validator (in maintenance as of the time of this email): http://validator.creativecommons.org/

--
Karl Dubost - W3C
http://www.w3.org/QA/
Be Strict To Be Cool
Re: [whatwg] Creative Commons Rights Expression Language
Ian Hickson wrote:
> On Wed, 20 Aug 2008, Tab Atkins Jr. wrote:
> > > Unless I missed something in the HTML5 spec, at the least this would add the property attribute to <a>. Wouldn't ccREL be expressed better using <link> instead of <a>?
> >
> > The whole thing would be best expressed as a microformat, as the entire thing can be made just as machine- and human-readable without having to introduce an entire new addition to html.
> >
> > I think someone is a little confused about the importance of CC... (Note: the someone is not you, Matt, but the drafters of this proposal. Also, I love CC as much as the next guy, but there's absolutely no reason to extend html to accommodate it, as everything they want to express can be done in existing html and formatted as a microformat.)
>
> I tend to agree with Tab here.

Thank you both for not calling me confused. :-)

I have no problem with using a microformat. That might offer better extensibility to other file types, too. I just wanted to make sure HTML people were aware of the proposal so that any needed response to it would be timely.

best regards,
Matt
--
Matt Bonner
Hewlett-Packard Company