Re: [whatwg] header for JSON-LD ???
Hypothetically, if search engines were to start picking up JSON-LD from linked files, which link rel type would this group consider most appropriate?

Dan

On 23 July 2017 at 06:12, Jeffrey Yasskin wrote:
> 2¢: This list tends to disapprove of JSON-LD, so you should probably first
> run your proposal by a group that likes JSON-LD. Maybe
> public-rdf-comme...@w3.org referenced from https://www.w3.org/TR/json-ld/?
> Or an issue against https://github.com/json-ld/json-ld.org?
>
> Jeffrey
>
> On Fri, Jul 21, 2017 at 2:21 PM, Michael A. Peters wrote:
>
> > I am (finally) starting to implement JSON-LD on a site, it generates a lot
> > of data that is useless to the non-bot typical user.
> >
> > I'd prefer to only stick it in the head when the client is a crawler that
> > wants it.
> >
> > Wouldn't it be prudent if agents that want JSON-LD can send a standardized
> > header as part of their request so web apps can optionally choose to only
> > send the JSON-LD data to clients that want it? Seems it would be kinder to
> > mobile users on limited bandwidth if they didn't have to download a bunch
> > of JSON that is meaningless to them.
> >
> > Is this the right group to suggest that?
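
A minimal sketch of the opt-in idea from the quoted message, in Python. The thread does not name a header, so this assumes the client signals interest by listing the application/ld+json media type in Accept; the function names wants_json_ld and render_head are hypothetical and exist only for illustration.

```python
def wants_json_ld(headers):
    """Return True if the request headers signal interest in JSON-LD.

    Assumption (not from the thread): the client opts in by listing
    application/ld+json in its Accept header.
    """
    accept = ""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "accept":
            accept = value
    # Drop parameters such as ";q=0.9" and normalise each media type.
    media_types = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    return "application/ld+json" in media_types


def render_head(headers, json_ld):
    """Build the <head> contents, embedding JSON-LD only for opted-in clients."""
    parts = ["<title>Example page</title>"]
    if wants_json_ld(headers):
        parts.append('<script type="application/ld+json">' + json_ld + "</script>")
    return "\n".join(parts)
```

A response that differs by Accept would normally also send Vary: Accept so that shared caches keep the two variants apart.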
Re: [whatwg] inverse property mechanism for Microdata?
On 17 March 2014 21:15, Ian Hickson i...@hixie.ch wrote:

On Mon, 17 Mar 2014, Dan Brickley wrote:

We discussed this (and the -inv suggestion) at schema.org again, and the consensus there was that we'd like to have the search engines proceed with accepting an experimental/proposed 'inverse itemprop' attribute, rather than work around its absence.

So the idea here that the itemprop-up (or whatever -- it would be good to get a more intuitive name, not sure what to call it though) would have to be specified in conjunction with the itemscope= attribute on a top-level microdata item whose element had an ancestor that itself creates an item, and would actually specify a property on the inner item, whose value was the outer item? This is what the example would look like if I'm understanding this right:

  <div itemscope itemtype="http://schema.org/LocalBusiness">
    <h1><span itemprop="name">(Entity A) Beachwalk Beachwear Giftware</span></h1>
    <span itemprop="description">A superb collection of fine gifts and clothing to accent your stay in Mexico Beach.</span>
    Phone: <span itemprop="telephone">850-648-4200</span>
    <div itemscope itemtype="http://schema.org/LocalBusiness" itemprop-up="containedIn">
      <h2><span itemprop="name">(Entity B) The tiny store within a store</span></h2>
      <span itemprop="description">A superb collection of tiny clothes, from the store within the store.</span>
      Phone: <span itemprop="telephone">123-456-7890</span>
    </div>
  </div>

It's not too bad, I guess.

Yes. I notice that the words we were playing with at schema.org relate to the underlying graph data model itemprop-inverse, -reverse etc., whereas your draft name, itemprop-up is about the markup hierarchy.

My main concern is that this seems to solve a very narrow use case for non-tree structures, but doesn't take into account the many, many other non-tree structures.

Yup, there are some cases where this can be addressed through the rigorous use of entity IDs in itemid, as you sketch below. That would be relatively new territory for schema.org and for publishers.
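
As a rough illustration of what a consumer would do with such an attribute, here is a small Python sketch. It assumes a parser has already reduced the markup to (outer, property, inner, inverse) records; that record shape and the name to_triples are hypothetical, not part of any microdata API.

```python
def to_triples(statements):
    """Turn (outer, prop, inner, inverse) records into (subject, prop, object) triples.

    An ordinary itemprop makes the outer item the subject and the nested
    item the value. The proposed inverse attribute (itemprop-up in the
    draft naming) swaps the two, so the nested item becomes the subject.
    """
    triples = []
    for outer, prop, inner, inverse in statements:
        if inverse:
            triples.append((inner, prop, outer))
        else:
            triples.append((outer, prop, inner))
    return triples


# Entity B is nested inside Entity A and carries the inverse marker.
print(to_triples([("Entity A", "containedIn", "Entity B", True)]))
```

The point of the design is that the vocabulary keeps one property name (containedIn) and the markup only says which direction to read it in.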
Perhaps there is an attribute name we can find that would leave the door open to more use cases, e.g. itemprop-backwards rather than itemprop-up. It seems reasonable to try to address relationships between sibling elements too. Something like (trying out -backwards instead of -up, to allow for non-hierarchical usage):

  <div itemid="bigshop" itemscope itemtype="http://schema.org/LocalBusiness">
    <h1><span itemprop="name">(Entity A) Beachwalk Beachwear Giftware</span></h1>
  </div>
  <div itemscope itemtype="http://schema.org/Pharmacy">
    <meta itemprop-backwards="containedIn" itemid="bigshop" />
    <h2><span itemprop="name">Tiny pharmacy store within a store</span></h2>
  </div>

? Can we use itemid in that way, to give a property value too? I don't see itemid used much in the wild and the spec only mentions its use for the item having the property, rather than using when supplying the value of a property.

For example, consider the case of a TV Episode with an Actor:

  <div itemscope itemtype="http://schema.org/Episode">
    ...
    <div itemprop="actor" itemscope itemtype="http://schema.org/Person"> ... </div>
  </div>

...now suppose it's marked up the other way around:

  <div itemscope itemtype="http://schema.org/Person">
    ...
    <div itemprop-up="actor" itemscope itemtype="http://schema.org/Episode"> ... </div>
  </div>

So far so good. But what if there's two episodes with two actors, and the page just lists both episodes and both actors, and wants to cross-reference both episodes to both actors? itemprop-up (or whatever we call it) can't help there. itemref= can help in some simple cases, but as you pointed out, it soon gets out of hand.

Microdata actually already has a solution to this. The vocabulary can define an ID for each item using itemid=, and can define multiple items having the same ID as being the same conceptual item. Thus:

  <!-- first episode -->
  <div itemscope itemtype="http://schema.org/Episode">
    ...
    <div itemprop="actor" itemscope itemtype="http://schema.org/Person" itemid="http://.../person/123"></div>
    <div itemprop="actor" itemscope itemtype="http://schema.org/Person" itemid="http://.../person/456"></div>
  </div>

  <!-- second episode -->
  <div itemscope itemtype="http://schema.org/Episode">
    ...
    <div itemprop="actor" itemscope itemtype="http://schema.org/Person" itemid="http://.../person/123"></div>
    <div itemprop="actor" itemscope itemtype="http://schema.org/Person" itemid="http://.../person/456"></div>
  </div>

  <!-- actors -->
  <div itemscope itemtype="http://schema.org/Person" itemid="http://.../person/123"> ... </div>
  <div itemscope itemtype="http://schema.org/Person" itemid="http://.../person/456"> ... </div>

This also enables the data to be spread across multiple
Re: [whatwg] inverse property mechanism for Microdata?
Hi Ian, HTML people,

On 31 January 2014 23:45, Ian Hickson i...@hixie.ch wrote:

On Fri, 31 Jan 2014, Dan Brickley wrote:

We'd (schema.org 'we') like to make a public proposal to update Microdata with a syntax for expressing inverse properties/relationships. [...]

Here's an example with 'containedIn'. The idea is that we want to express that the LocalBusiness (i.e. Place) Entity B is 'containedIn' Entity A. The example I show here expresses the reverse, incorrectly. So we're looking for a change to the markup that would turn this example into one that said "The LocalBusiness Entity B is containedIn the LocalBusiness Entity A":

  <div itemscope itemtype="http://schema.org/LocalBusiness">
    <h1><span itemprop="name">(Entity A) Beachwalk Beachwear Giftware</span></h1>
    <span itemprop="description">A superb collection of fine gifts and clothing to accent your stay in Mexico Beach.</span>
    Phone: <span itemprop="telephone">850-648-4200</span>
    <div itemprop="containedIn" itemscope itemtype="http://schema.org/LocalBusiness">
      <h2><span itemprop="name">(Entity B) The tiny store within a store</span></h2>
      <span itemprop="description">A superb collection of tiny clothes, from the store within the store.</span>
      Phone: <span itemprop="telephone">123-456-7890</span>
    </div>
  </div>

This is actually possible today:

  <div itemscope itemtype="http://schema.org/LocalBusiness" id="a" itemprop="containedIn">
    <h1><span itemprop="name">(Entity A) Beachwalk Beachwear Giftware</span></h1>
    <span itemprop="description">A superb collection of fine gifts and clothing to accent your stay in Mexico Beach.</span>
    Phone: <span itemprop="telephone">850-648-4200</span>
    <div itemscope itemref="a" itemtype="http://schema.org/LocalBusiness">
      <h2><span itemprop="name">(Entity B) The tiny store within a store</span></h2>
      <span itemprop="description">A superb collection of tiny clothes, from the store within the store.</span>
      Phone: <span itemprop="telephone">123-456-7890</span>
    </div>
  </div>

The trick here is to turn the inner item into the top-level microdata item, and use itemref= to have that inner item point to the outer item.

You're right; it is indeed possible. However it is perhaps a little too clever. I've tried it on a few colleagues, and it didn't 'click' with anyone yet. We discussed this (and the -inv suggestion) at schema.org again, and the consensus there was that we'd like to have the search engines proceed with accepting an experimental/proposed 'inverse itemprop' attribute, rather than work around its absence.

(This works great unless you want two items to refer to the same third item using different properties, but that's something microdata can't do in general, since it's based on a tree structure, not a graph structure. To address that particular problem, you need a vocabulary that defines how itemid= works; at that point, you can just have the same underlying item represented as multiple microdata items in the document by having all the items share the same ID. But how exactly that is to be interpreted is something the vocabulary has to define.)

One response is that the markup could be reorganized.

That's basically what the above does, but without moving the elements around in the DOM. (itemref= is basically all about making the microdata model work around constraints coming from the author's preferred DOM.)

(Yup.)

Another reasonable response to this is 'well, perhaps you should have a property (instead or in addition) called geospatiallyContains, or containerOf or contains, or rev_containedIn for this usage scenario'?

That is another option, similar to the parenthetical itemid= note above -- you could just have the vocabulary define that for every property whose value is an item, the item type that that property can point to has another property with the same name plus a fixed suffix, like -inv, that inverses the relationship. That would make the above look like:

  <div itemscope itemtype="http://schema.org/LocalBusiness">
    <h1><span itemprop="name">(Entity A) Beachwalk Beachwear Giftware</span></h1>
    <span itemprop="description">A superb collection of fine gifts and clothing to accent your stay in Mexico Beach.</span>
    Phone: <span itemprop="telephone">850-648-4200</span>
    <div itemprop="containedIn-inv" itemscope itemtype="http://schema.org/LocalBusiness">
      <h2><span itemprop="name">(Entity B) The tiny store within a store</span></h2>
      <span itemprop="description">A superb collection of tiny clothes, from the store within the store.</span>
      Phone: <span itemprop="telephone">123-456-7890</span>
    </div>
  </div>

This is easier to understand than itemref, but still involves creating 100s of additional properties instead of just one new piece of syntax.

We have tried this and in a few cases we have included pairs of inverse properties in schema.org, e.g. we have alumni and an inverse, alumniOf. In designing schemas we have found it consistently
Re: [whatwg] Supporting more address levels in autocomplete
On 24 Feb 2014 05:17, Charles McCathie Nevile cha...@yandex-team.ru wrote: On Sat, 22 Feb 2014 05:05:06 +0100, Ian Hickson i...@hixie.ch wrote: On Fri, 21 Feb 2014, Kevin Marks wrote: On 21 Feb 2014 17:03, Ian Hickson i...@hixie.ch wrote: Those names come from vcard - if adding a new one, consider how to model it in vcard too. Note that UK addresses can have this too - eg 3 high street, Kenton, Harrow, Middlesex, UK That's actually a bogus UK address. I'm not sure exactly which town you meant that to be in, but official UK addresses never have more than two region levels, and usually only one (the post town). The only time they have two is when the post town has two streets with the same name. The real address, where I grew up, was: 2 Melbury Road, Kenton, Harrow, Middlesex, HA3 9RA Today, the address of that building is: 2 Melbury Rd Harrow HA3 9RA Damn humans, not following specs. Actually UK addresses have a huge amount of leeway, as they are routed by postcode in the main (though I did receive a postcard addressed to Kevin, Sidney, Cambridge once). The post office will deal with all kinds of stuff, sure. But Web forms only have to accept the formal address format, which in the UK only ever has a street, a locality (sometimes), a post town, and a post code. That depends on whether you want to force your customers to think like the Post Office, or whether you prefer to be responsive to your customers. Speaking without data, I suspect that nervousness at not being able to put *what someone thinks* is their address translates fairly readily into a certain amount of failure to proceed with a transaction. Providing specification purity over the concerns of both users and developers trying to use the Web to successfully interact with them seems like a pretty basic mistake to me. Who is using the data? Just post offices? Or taxi drivers, pizza delivery bikers, pedestrians? 
Dan cheers Chaals -- Charles McCathie Nevile - Consultant (web standards) CTO Office, Yandex cha...@yandex-team.ru Find more at http://yandex.com
[whatwg] inverse property mechanism for Microdata?
Hi folks.

I'm relaying this from the schema.org collaboration, probably the main user of HTML's Microdata mechanism. We'd (schema.org 'we') like to make a public proposal to update Microdata with a syntax for expressing inverse properties/relationships. FWIW other notations that schema.org supports (JSON-LD and RDFa) have such mechanisms ([1],[2]).

At schema.org we are repeatedly running into situations where we have a need for properties to be used in reverse direction. There are 630 or so properties defined currently (and a similar number of types); see listing at http://schema.org/docs/full.html. Inverse properties are relatively a cornercase, but a persistent one.

By inverse, I refer to scenarios where there are any pair of properties (relationship types) e.g. foo and bar, such that whenever some entity-1 has a foo relationship to an entity-2, then by definition, entity-2 will have a bar relationship to entity-1. We'd like to avoid the need to give bar a specific name, and instead be able to in effect just say 'the inverse of foo'.

e.g. perhaps entity-1 is a shop, entity-2 is another shop, and foo = containedIn, bar = containsWithin, indicating that the one shop is inside the other. Or perhaps entity-1 is a school, entity-2 is a celebrity, and foo=alumni, bar=alumniOf.

Schema.org would like Microdata syntax to be extended somehow, to allow a single property name to be used regardless of whether the markup nesting structure emphasises entity-1 or entity-2. For more example topics, here are some of the properties we define.

  http://schema.org/containedIn (which relates a smaller place to a larger containing place);
  http://schema.org/member
  http://schema.org/alumni
  http://schema.org/author
  http://schema.org/performerIn
  http://schema.org/worksFor
  http://schema.org/employee
  http://schema.org/founder
  http://schema.org/member

... and various others, often role-related or where two independent entities have a relationship that is being described, and where neither entity is necessarily the primary focus in all markup. For a property like alumni it could reasonably be used either in a paragraph that was describing the educational institution, or describing a (famous) person who attended it.

We would like to have a standard markup convention for using a single named property, i.e. being able to indicate sometimes that it is to be read in reversed direction. In other words we want to avoid having to come up with two different names for each of these situations; and more importantly, to avoid publishers/authors having to remember two names for one situation.

Here's an example with 'containedIn'. The idea is that we want to express that the LocalBusiness (i.e. Place) Entity B is 'containedIn' Entity A. The example I show here expresses the reverse, incorrectly. So we're looking for a change to the markup that would turn this example into one that said "The LocalBusiness Entity B is containedIn the LocalBusiness Entity A":

  <div itemscope itemtype="http://schema.org/LocalBusiness">
    <h1><span itemprop="name">(Entity A) Beachwalk Beachwear Giftware</span></h1>
    <span itemprop="description">A superb collection of fine gifts and clothing to accent your stay in Mexico Beach.</span>
    Phone: <span itemprop="telephone">850-648-4200</span>
    <div itemprop="containedIn" itemscope itemtype="http://schema.org/LocalBusiness">
      <h2><span itemprop="name">(Entity B) The tiny store within a store</span></h2>
      <span itemprop="description">A superb collection of tiny clothes, from the store within the store.</span>
      Phone: <span itemprop="telephone">123-456-7890</span>
    </div>
  </div>

One response is that the markup could be reorganized. For example,

  <div itemscope itemtype="http://schema.org/LocalBusiness">
    <h2><span itemprop="name">(Entity B) The tiny store within a store</span></h2>
    <span itemprop="description">A superb collection of tiny clothes, from the store within the store.</span>
    Phone: <span itemprop="telephone">123-456-7890</span>
    <div itemprop="containedIn" itemscope itemtype="http://schema.org/LocalBusiness">
      <h2><span itemprop="name">(Entity A) Beachwalk Beachwear Giftware</span></h2>
      <span itemprop="description">A superb collection of fine gifts and clothing to accent your stay in Mexico Beach.</span>
      Phone: <span itemprop="telephone">850-648-4200</span>
    </div>
  </div>

We're not so optimistic about this approach, especially when multiple entities are described. Schema.org is widely used but seems generally to be added to existing pages with relatively fixed structure.

Another reasonable response to this is 'well, perhaps you should have a property (instead or in addition) called geospatiallyContains, or containerOf or contains, or rev_containedIn for this usage scenario'?

We have tried this and in a few cases we have included pairs of inverse properties in schema.org, e.g. we have alumni and an inverse, alumniOf. In designing schemas we have found it consistently hard to get even a single natural/intuitive name for each property, and finding a good
[whatwg] Microdata feedback: please state that property value ordering is in the data model, and give usage guidelines
Hello,

Reading http://www.whatwg.org/specs/web-apps/current-work/multipage/links.html#microdata

Section '5.2.3 Names: the itemprop attribute' states something important about Microdata's data model,

"Within an item, the properties are unordered with respect to each other, except for properties with the same name, which are ordered in the order they are given by the algorithm that defines the properties of an item."

... and gives an example

"In the following example, the a property has the values 1 and 2, in that order, ..."

  <div itemscope itemref="x">
    <p itemprop="b">test</p>
    <p itemprop="a">2</p>
  </div>
  <div id="x">
    <p itemprop="a">1</p>
  </div>

However '5.2.1 The microdata model' does not mention anything of this data model feature. If property values are ordered (for some specific property/item context), this should be mentioned when introducing the data model; if only by copying or linking the above sentence ("Within an item, ...").

Is the expectation that Microdata vocabulary authors can decide whether such ordering is meaningful, when they define / describe their properties? For example, in academic publishing where they care about being first named author, the ordering of 'itemprop=author' might seem to matter. 5.2.3 suggests that the ordering information is at least preserved in Microdata's data model. If someone creates an 'author' property for Microdata, should they state that property ordering is meaningful, or is that not their decision?

Thanks,

Dan
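
The ordering rule quoted from 5.2.3 can be sketched in a few lines of Python. This assumes the (name, value) pairs have already been produced, in order, by the spec's "properties of an item" algorithm; the function names group_properties and same_item_data are hypothetical.

```python
def group_properties(pairs):
    """Group (name, value) pairs by name, keeping each name's values in input order."""
    grouped = {}
    for name, value in pairs:
        grouped.setdefault(name, []).append(value)
    return grouped


def same_item_data(pairs_a, pairs_b):
    """Compare two property lists under the quoted rule.

    Dict comparison ignores the order of names, while list comparison
    respects the order of values under the same name.
    """
    return group_properties(pairs_a) == group_properties(pairs_b)


# Moving "b" around does not matter; swapping the two "a" values does.
print(same_item_data([("a", "1"), ("a", "2"), ("b", "test")],
                     [("b", "test"), ("a", "1"), ("a", "2")]))
print(same_item_data([("a", "1"), ("a", "2")], [("a", "2"), ("a", "1")]))
```

Whether a consumer should treat the preserved order as meaningful (first-named author, say) is exactly the open question in the message; the sketch only shows that the data model keeps it.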
Re: [whatwg] Captions, Subtitles and the Video Element
On 17/7/09 15:04, Tab Atkins Jr. wrote:

On Fri, Jul 17, 2009 at 4:15 AM, Ian Hickson i...@hixie.ch wrote:

On Thu, 16 Jul 2009, Jeff Walden wrote:

(For the few authors who really want to go crazy, they can already overlap HTML onto their <video> and do whatever crazy stuff they want to do.)

By way of a use case for at least color and positioning, there's a certain part of the third (?) Austin Powers movie wherein the color and position of foreign-language subtitles plays an important part in the artistic merits (lack thereof, arguably) of the scene. How would you suggest a movie-viewing site use <video> to display these? It seems unreasonable to say that the site must include special-case handling for this particular movie clip's subtitles; it's more likely they would be mangled in some manner and the semantic content (lack thereof) would be lost.

By the way, I have no idea how foreign-language translations of the movie handle this scene. It's possible they simply subtitle the subtitles and avoid the more complicated problems this scene arguably presents.

I think this particular case can be a victim of the 80% rule. I don't remember the exact scene you're referring to, but it's also possible that those subtitles are then an integral part of the content, and should properly be baked into the movie.

Yep, slippery slope. If we're not careful we'll end up requiring a 3d file browsing facility, so that Jurassic Park can be properly represented - http://en.wikipedia.org/wiki/Fsn

cheers,

Dan
Re: [whatwg] Fullscreenable attribute.
On 13/7/09 11:06, Ian Hickson wrote:

On Tue, 16 Jun 2009, Alpha Omega wrote:

I think it would be useful to add fullscreenable (or more refined name) attribute to arbitrary element, so users could be able to full-screen DOM subtrees, that document author marked as fullscreenable.

Usage: User chooses area that he wants to fullscreen, performs UA-specific action there (go to fullscreen in context menu in desktop browsers, or gesture on mobile devices for example), UA goes up in DOM tree until it finds fullscreenable attribute, and then fullscreens this subtree. If fullscreenable attribute is not found, then it is UA authors decision what to do - for example fullscreen entire page.

Should UAs always put users in control of this? ie. everything in principle is fullscreenable, but this indicator would be a strong hint that this chunk of content makes special sense to be treated in this manner.

Use case: Not only solves problem with <video> tag, but also useful for mobile UAs (users could use it to zoom to author defined parts, on pages with complex layouts.), and for interactive webapps in general IMHO.

I think this would be an interesting idea. I haven't any idea what the UI would look like though. I recommend approaching vendors directly and getting their input and experimental implementations, as described here:

http://wiki.whatwg.org/wiki/FAQ#Is_there_a_process_for_adding_new_features_to_the_spec.3F

I like the idea of being able to go full-screen. I'd encourage talking to Web accessibility folk before going too far with a proposal / implementation...

cheers,

Dan
Re: [whatwg] Removing the need for separate feeds
On 22/5/09 09:21, Ian Hickson wrote: On Fri, 22 May 2009, Henri Sivonen wrote: On May 22, 2009, at 09:01, Ian Hickson wrote: USE CASE: Remove the need for feeds to restate the content of HTML pages (i.e. replace Atom with HTML). Did you do some kind of Is this Good for the Web? analysis on this one? That is, do things get better if there's yet another feed format? As far as I can tell, things get better if the feed format and the default output format are the same, yes. Generally, redundant information has tended to lead to problems. Would this include having a mechanism (microdata? xml islands?) that preserves extension markup from Atom feeds? eg. see http://www.ibm.com/developerworks/xml/library/x-extatom1/ cheers, Dan
Re: [whatwg] Removing the need for separate feeds
On 22/5/09 12:36, Toby Inkster wrote:

Eduard Pascual wrote:

For manually authored pages and feeds things would be different; but are there really a significant amount of such cases out there? I can't say I have seen the entire web (who can?), but among what I have seen, I have never encountered any hand authored feed, except for code examples and similar experimental stuff.

Surely this proves the need for a way of extracting feeds from HTML? You never see manually written feeds because people can't be bothered to manually write feeds. So the people who manually author HTML simply don't bother providing feeds at all. If an HTML page can *be* a feed, this allows manually authored HTML pages to be subscribed to in feed readers.

FWIW the W3C homepage works this way since ~2000, http://www.w3.org/2000/08/w3c-synd/

cheers,

Dan
Re: [whatwg] Link rot is not dangerous
On 20/5/09 22:54, Tab Atkins Jr. wrote: On Wed, May 20, 2009 at 2:35 AM, Toby A Inksterm...@tobyinkster.co.uk wrote: And yet, given an example use of the vocabulary, I'm quite certain I can easily find the page I want describing the vocab, even when there are overlaps in prefixes such as with bio. FYN is nearly never necessary for humans. We have the intelligence to craft search queries and decide which returned result is correct. What happens in practice is that many of these perfectly intelligent humans ask in email or IRC questions that are clearly answered directly in the relevant documentation. You can lead humans to the documentation, but you can't make 'em read... cheers, Dan
Re: [whatwg] Link rot is not dangerous
On 18/5/09 10:34, Henri Sivonen wrote:

On May 15, 2009, at 19:20, Manu Sporny wrote:

There have been a number of people now that have gone to great lengths to outline how awful link rot is for CURIEs and the semantic web in general. This is a flawed conclusion, based on the assumption that there must be a single vocabulary document in existence, for all time, at one location.

The flawed conclusion flows out of Follow Your Nose advocacy, and is not flawed if one takes Follow Your Nose seriously. It seems to me that the positions that RDF applications should Follow Their Nose and that link rot is not dangerous (to RDF) are contradictory positions.

That's a strong claim. There is certainly a balance to be found between taking advantage of de-referencable URIs and relying on their de-referencability. De-referencing is a privilege not a right, after all.

If I lost control of xmlns.com tomorrow, and it became un-rescuably owned by offshore spam-virus-malware pirates, that doesn't change history. For nine years, the FOAF documentation has lived there, and we can use URIs to ask other services about what they saw during that period:

http://web.archive.org/web/*/http://xmlns.com/foaf/0.1/

Since there is useful information to know about FOAF properties and terms from its schema and human-oriented docs, it would be a shame if people ignored that. Since domain names can be lost, it would also be a shame if directly de-referencing URIs to the schema was the only way people could find that info. Fortunately, neither is the case.

That link rot hasn't been a practical problem to the Semantic Web community suggests that applications don't really Follow Their Nose in practice. Can anyone point me to a deployed end user application that uses RDF internally and Follows Its Nose?

The search site, sindice.com does this:

"Yes Sindice dereferences URIs it finds in RDF instance data, including class and property URIs. It performs OWL reasoning using the retrieved information, mostly to infer additional triples based on subclass and subproperty relationships. Doing this helps us to increase recall in queries."

(from Richard Cyganiak, who I asked offlist for confirmation)

Whether you consider sindice.com end-user facing or not, I don't know. I put it in roughly the same category as Google's Social Graph API. But it's a non-trivial implementation that aggregates and integrates a lot of data.

BTW here's another use case for identifying properties and classes by URI: we can decentralise the translation of their labels into other languages. Here are some Korean descriptions of FOAF, for example: http://svn.foaf-project.org/foaftown/foaf18n/foaf-kr.rdf

cheers,

Dan
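
The kind of subproperty inference described for Sindice can be sketched as follows. This is a simplified illustration in Python, not Sindice's implementation: triples are plain tuples, the subproperty table is a dict, and the ex: property names and their declared relationships are hypothetical.

```python
def infer_superproperty_triples(triples, sub_property_of):
    """Add the triples implied by subproperty declarations.

    sub_property_of maps a property to its direct superproperties, as a
    schema fetched by following the property's URI might declare them.
    """
    inferred = set(triples)
    queue = list(triples)
    while queue:
        subject, prop, obj = queue.pop()
        for parent in sub_property_of.get(prop, ()):
            candidate = (subject, parent, obj)
            # Skipping known triples also stops loops in cyclic declarations.
            if candidate not in inferred:
                inferred.add(candidate)
                queue.append(candidate)
    return inferred


# Hypothetical schema: ex:bestFriend is a kind of ex:friend, itself a kind of foaf:knows.
schema = {"ex:bestFriend": ["ex:friend"], "ex:friend": ["foaf:knows"]}
data = [("alice", "ex:bestFriend", "bob")]
print(sorted(infer_superproperty_triples(data, schema)))
```

A query for foaf:knows now also finds the statement published with the more specific property, which is the recall gain the quoted description mentions.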
Re: [whatwg] Annotating structured data that HTML has no semanticsfor
On 15/5/09 14:11, Shelley Powers wrote: Kristof Zelechovski wrote: I do not think anybody in WHATWG hates the CURIE tool; however, the following problems have been put forward: Copy-Paste The CURIE mechanism is considered inconvenient because is not copy-paste-resilient, and the associated risk is that semantic elements would randomly change their meaning. Well, no, the elements won't randomly change their meaning. The only risk is copying and pasting them into a document that doesn't provide namespace definitions for the prefixes. Are you thinking that someone will be using different namespaces but the same prefix? Come on -- do you really think that will happen? The most likely case is with Dublin Core, but DC data varies enough already that this isn't too destructive... Dan
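
To make the copy-paste concern concrete, here is a small Python sketch of CURIE expansion against a document's prefix declarations. The function name expand_curie is hypothetical; the two Dublin Core namespace URIs are the published element-set and terms namespaces.

```python
def expand_curie(curie, prefixes):
    """Expand 'prefix:reference' using the prefix map of the enclosing document."""
    prefix, sep, reference = curie.partition(":")
    if not sep:
        raise ValueError("not a CURIE: " + curie)
    if prefix not in prefixes:
        # This is the copy-paste failure: the markup moved, the declaration did not.
        raise KeyError("no namespace declared for prefix: " + prefix)
    return prefixes[prefix] + reference


# The same pasted markup expands differently depending on the host document.
print(expand_curie("dc:title", {"dc": "http://purl.org/dc/elements/1.1/"}))
print(expand_curie("dc:title", {"dc": "http://purl.org/dc/terms/"}))
```

Pasting into a document with no declaration fails outright, while pasting into one that binds the same prefix to a different namespace silently yields a different URI.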
Re: [whatwg] Link rot is not dangerous
On 15/5/09 18:20, Manu Sporny wrote:

Kristof Zelechovski wrote:

Therefore, link rot is a bigger problem for CURIE prefixes than for links.

There have been a number of people now that have gone to great lengths to outline how awful link rot is for CURIEs and the semantic web in general. This is a flawed conclusion, based on the assumption that there must be a single vocabulary document in existence, for all time, at one location. This has also led to a false requirement that all vocabularies should be centralized.

Here's the fear: If a vocabulary document disappears for any reason, then the meaning of the vocabulary is lost and all triples depending on the lost vocabulary become useless.

That fear ignores the fact that we have a highly available document store available to us (the Web). Not only that, but these vocabularies will be cached (at Google, at Yahoo, at The Wayback Machine, etc.).

IF a vocabulary document disappears, which is highly unlikely for popular vocabularies - imagine FOAF disappearing overnight, then there are alternative mechanisms to extract meaning from the triples that will be left on the web. Here are just two of the possible solutions to the problem outlined:

- The vocabulary is restored at another URL using a cached copy of the vocabulary. The site owner of the original vocabulary either re-uses the vocabulary, or re-directs the vocabulary page to another domain (somebody that will ensure the vocabulary continues to be provided - somebody like the W3C).
- RDFa parsers can be given an override list of legacy vocabularies that will be loaded from disk (from a cached copy). If a cached copy of the vocabulary cannot be found, it can be re-created from scratch if necessary.

The argument that link rot would cause massive damage to the semantic web is just not true. Even if there is minor damage caused, it is fairly easy to recover from it, as outlined above.

A few other points:

1. It's for the community of vocabulary-creators to help each other out w.r.t. hosting/publishing these: I just nudged a friend to put another 5 years on the DNS rental for a popular namespace. I think we should put a bit more structure around these kinds of habit, so that popular namespaces won't drop off the Web through accident.

2. digitally signing the schemas will become part of the story, I'm sure. While it's a bit fiddly, there are advantages to having other mechanisms beyond URI de-referencing for knowing where a schema came from

3. Parties worried about external dependencies when using namespaces can always indirect through their own namespace, whose schema document can declare subclass/subproperty relations to other URIs

cheers

Dan
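
The "override list" recovery path from the quoted message might look roughly like this in Python. It is a sketch under assumptions: fetch stands for whatever retrieval function a parser uses, cached_copies is a plain dict of locally stored documents, and load_vocabulary is a hypothetical name, not an RDFa library call.

```python
def load_vocabulary(uri, fetch, cached_copies):
    """Return the vocabulary document for uri, preferring the live copy.

    fetch is any callable that returns the document text or raises OSError
    when the document cannot be retrieved; cached_copies maps vocabulary
    URIs to text saved earlier (the override list of legacy vocabularies).
    """
    try:
        return fetch(uri)
    except OSError:
        if uri in cached_copies:
            return cached_copies[uri]
        # No live copy and no cached copy: let the caller decide what to do.
        raise


def unreachable(uri):
    """Stand-in for a fetch against a domain that has rotted away."""
    raise OSError("cannot reach " + uri)


cache = {"http://xmlns.com/foaf/0.1/": "(cached FOAF schema text)"}
print(load_vocabulary("http://xmlns.com/foaf/0.1/", unreachable, cache))
```

Catching OSError keeps the sketch compatible with a urllib-based fetch, since urllib.error.URLError is a subclass of OSError. Checking a signature on the cached copy, as point 2 suggests, would slot in just before the cached text is returned.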
Re: [whatwg] Annotating structured data that HTML has no semantics for
On 14/5/09 14:18, Shelley Powers wrote: James Graham wrote: jgra...@opera.com wrote: Quoting Philip Taylor excors+wha...@gmail.com: On Sun, May 10, 2009 at 11:32 AM, Ian Hickson i...@hixie.ch wrote: One of the more elaborate use cases I collected from the e-mails sent in over the past few months was the following: USE CASE: Annotate structured data that HTML has no semantics for, and which nobody has annotated before, and may never again, for private use or use in a small self-contained community. [...] To address this use case and its scenarios, I've added to HTML5 a simple syntax (three new attributes) based on RDFa. There's a quickly-hacked-together demo at http://philip.html5.org/demos/microdata/demo.html (works in at least Firefox and Opera), which attempts to show you the JSON serialisation of the embedded data, which might help in examining the proposal. I have a *totally unfinished* demo that does something rather similar at [1]. It is highly likely to break and/or give incorrect results**. If you use it for anything important you are insane :) I have now added extremely preliminary RDF support with output as N3 and RDF/XML courtesy of rdflib. It is certain to be buggy. So much concern about generating RDF, makes one wonder why we didn't just implement RDFa... Having HTML5-microdata -to- RDF parsers is pretty critical to having test cases that help us all understand where RDFa-Classic and HTML5 diverge. I'm very happy to see this work being done and that there are multiple implementations. As far as I can see, the main point of divergence is around URI abbreviation mechanisms. But also HTML5 might not have a notion equivalent to RDF/RDFa's bNodes construct. The sooner we have these parsers the sooner we'll know for sure. Dan
Re: [whatwg] Start position of media resources
On 8/4/09 00:29, Silvia Pfeiffer wrote: The media fragment WG decided that fragment addressing should be done with # and be able to just deliver the actual fragment. Interesting! Do you have a reference for this? I can't understand how this is possible if these are URI references, unless something very non-traditional is happening... cheers, Dan
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
On 17/1/09 23:30, L. David Baron wrote: On Saturday 2009-01-17 22:25 +0200, Henri Sivonen wrote: The story of RDF is very different. Of the top four engines, only Gecko has RDF functionality. It was implemented at a time when RDF was a young W3C REC and stuff that were W3C RECs were implemented less critically than nowadays. Actually, the implementation was well underway *before* RDF was a W3C REC, done by a team led by one of the designers of RDF. In other words, it was in Gecko because there were RDF advocates at Netscape (although advocating, I think, a somewhat different RDF than the current RDF recommendations). Yes, Netscape had this stuff when it was still called MCF. W3C's RDF took ideas from several input activities, including MCF, Microsoft XML-Data, PICS, and requirements from the Dublin Core community. But it looks more like MCF than the others. MCF was originally proposed by R.V.Guha at Apple; it followed him from Apple to Netscape in 1997, and when the Mozilla sources were later thrown over the wall, there was a lot of MCF in there. MCF White Paper, 1996 http://www.guha.com/mcf/wp.html spec, http://www.guha.com/mcf/mcf_spec.html While this was at Apple, there was a product/viewer called HotSauce / Project X, and some early grassroots adoption of MCF as a text format for publishing website summaries. http://web.archive.org/web/19961224042753/http://hotsauce.apple.com/ http://downlode.org/Etext/MCF/macworld_online.html It was at this stage that dialog started with the Library scene and Dublin Core folk, about how it related to their notion of catalogue records, and to the evolving PICS labelling system, format and protocol being built at W3C. eg. http://www.ssrc.hku.hk/tb-issues/TidBITS-355.html#lnk3 http://web.archive.org/web/19980215092626/http://www.ariadne.ac.uk/issue7/mcf/ The MCF/RSS relationship is a whole other story, eg. 
see http://www.scripting.com/midas/mcf.html http://www.scripting.com/frontier/siteMap.mcf http://web.archive.org/web/19990222114619/http://www.xspace.net/hotsauce/sites.html Then the thing moved to Netscape. Tim Bray helped Guha XMLize the spec, which was submitted to W3C in 1997, where it joined the existing efforts to extend PICS to include text labels and more structure - http://www.w3.org/TR/NOTE-pics-ng-metadata http://www.daml.org/committee/minutes/2000-12-07-RDF-design-rationale.ppt http://searchenginewatch.com/2165291 So the June 97 spec was http://www.w3.org/TR/NOTE-MCF-XML/ .. you can see from the figures that the technology was very RDF-shaped, http://www.w3.org/TR/NOTE-MCF-XML/#sec2. Also a tutorial at http://www.w3.org/TR/NOTE-MCF-XML/MCF-tutorial.html Netscape press release accompanying June 13 1997 submission - http://web.archive.org/web/20010308150737/http://cgi.netscape.com/newsref/pr/newsrelease432.html Less than 4 months later, this came out as a W3C Working Draft called RDF: http://www.w3.org/TR/WD-rdf-syntax-971002/ ... in a shape that didn't really change much subsequently. RDF wasn't the same design exactly as MCF but the ancestry is clear enough. And getting back to the original point, yeah Mozilla had MCF sitemaps code in there. Revisiting http://www.prnewswire.com/cgi-bin/stories.pl?ACCT=104STORY=/www/story/9-8-97/312711EDATE= http://www.irt.org/articles/js086/ and the like, it's clear that RDF was very much a child of the 1st browser wars. In retrospect the direction it took within Mozilla didn't do anyone much good. The earliest MCF apps were about public data on the public Web, feeds, sitemaps and so on. But eventually the ambition to be a complete information hub led to MCF/RDF being used for pretty much everything *inside* Mozilla. And I don't think that turned out very well. http://www.mozilla.org/rdf/doc/api.html etc. 
The RDF vocabularies it used were poorly or never documented (I have some guilt there) and when Netscape went away, the incentive to connect to public data on the Web seemed to drop (no more tie-ins with the 'what's related' annotation server, 'dmoz' etc.). RDF drifted from being a Web data format to be consumed *by* the browser, into an engineering tool to be used in the construction *of* the browser, ie. as a datasource abstraction within Mozilla APIs. While I can certainly see the value of having a unified view of mail, news, sitemaps, and so on, the Moz code at the time wasn't really in a position to match up to the language in the press releases. Not making any particular point here beyond connecting up to the MCF heritage... cheers, Dan -- http://danbri.org/
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
On 18/1/09 00:24, Henri Sivonen wrote: No. However, most of the time, when people publish HTML, they do it to elicit browser behavior when a user loads the HTML document in a browser. Most users of the Web barely know what a browser is, let alone HTML. They're just putting information online; perhaps into a closed site (eg. facebook), perhaps into a public-facing site (eg. a blog), or perhaps into 1:1, group or IM messaging (eg. webmail). HTML figures in all these scenarios. Browsers or HTML rendering code too, of course. But I don't think we can jump from that to claims about user intent, any more than their use of the Internet signifies an intent to have their information chopped up into packets and transmitted according to the rules of TCP/IP. The reason for my pedantry here is not to be argumentative, but just to suggest that this (otherwise very natural) thinking leads us to forget about the other major consumers of HTML - search engines. Having their stuff found and linked by others is often a big part of the motivation for putting stuff online. HTML parsing is involved, impact on the needs and interests of mainstream users is involved; but it's not clear whether all/any/many users 'do it to elicit search engine behaviour when indexing the HTML document'. Aren't search engines equally important consumers of HTML? Perhaps they're more simple-minded in their behaviour than a full UI browser. But from the user side, there's only slightly more value in being readable without being findable than vice-versa... cheers, Dan -- http://danbri.org/
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
On 18/1/09 19:34, Henri Sivonen wrote: On Jan 18, 2009, at 01:32, Shelley Powers wrote: Are you then saying that this will be a showstopper, and there will never be either a workaround or compromise? Are the RDFa TF open to compromises that involve changing the XHTML side of RDFa not to use attribute whose qualified name has a colon in them to achieve DOM Consistency by changing RDFa instead of changing parsing? I don't believe the RDFa TF are in a position to singlehandedly rescind a W3C Recommendation, ie. http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014/. What they presumably could do is propose new work items within W3C, which I'd guess would be more likely to be accepted if it had the active enthusiasm of the core HTML5 team. Am cc:'ing TimBL here who might have something more to add. Do you have an alternative design in mind, for expressing the namespace mappings? cheers, Dan -- http://danbri.org/
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
On 18/1/09 20:07, Henri Sivonen wrote: On Jan 18, 2009, at 20:48, Dan Brickley wrote: On 18/1/09 19:34, Henri Sivonen wrote: On Jan 18, 2009, at 01:32, Shelley Powers wrote: Are you then saying that this will be a showstopper, and there will never be either a workaround or compromise? Are the RDFa TF open to compromises that involve changing the XHTML side of RDFa not to use attribute whose qualified name has a colon in them to achieve DOM Consistency by changing RDFa instead of changing parsing? I don't believe the RDFa TF are in a position to singlehandedly rescind a W3C Recommendation, ie. http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014/. What they presumably could do is propose new work items within W3C, which I'd guess would be more likely to be accepted if it had the active enthusiasm of the core HTML5 team. Am cc:'ing TimBL here who might have something more to add. Do you have an alternative design in mind, for expressing the namespace mappings? The simplest thing is not to have mappings but to put the corresponding absolute URI wherever RDFa uses a CURIE. So this would be a kind of interoperability profile of RDFa, where certain features approved of by REC-rdfa-syntax-20081014 wouldn't be used in some hypothetical HTML5 RDFa. If people can control their urge to use namespace abbreviations, and stick to URIs directly, would this make your DOM-oriented concerns go away? cheers, Dan -- http://danbri.org/
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
On 18/1/09 21:04, Shelley Powers wrote: Dan Brickley wrote: On 18/1/09 20:07, Henri Sivonen wrote: On Jan 18, 2009, at 20:48, Dan Brickley wrote: On 18/1/09 19:34, Henri Sivonen wrote: On Jan 18, 2009, at 01:32, Shelley Powers wrote: Are you then saying that this will be a showstopper, and there will never be either a workaround or compromise? Are the RDFa TF open to compromises that involve changing the XHTML side of RDFa not to use attribute whose qualified name has a colon in them to achieve DOM Consistency by changing RDFa instead of changing parsing? I don't believe the RDFa TF are in a position to singlehandedly rescind a W3C Recommendation, ie. http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014/. What they presumably could do is propose new work items within W3C, which I'd guess would be more likely to be accepted if it had the active enthusiasm of the core HTML5 team. Am cc:'ing TimBL here who might have something more to add. Do you have an alternative design in mind, for expressing the namespace mappings? The simplest thing is not to have mappings but to put the corresponding absolute URI wherever RDFa uses a CURIE. So this would be a kind of interoperability profile of RDFa, where certain features approved of by REC-rdfa-syntax-20081014 wouldn't be used in some hypothetical HTML5 RDFa. If people can control their urge to use namespace abbreviations, and stick to URIs directly, would this make your DOM-oriented concerns go away? Took five minutes to make this change in my template. Ran through validator.nu. Results: Doesn't like the content-type. Didn't like profile on head. Having to remove the profile attribute in my head element limits usability, but I'm not going to throw myself on the sword for this one. Doesn't like property, doesn't like about. These are the RDFa attributes I'm using. The RDF extractor doesn't care that I used the URIs directly. This sounds encouraging. Thanks for taking the time to try the experiment, Shelley. But ... 
to be clear, are you putting full URIs in the @property attribute too? In http://www.w3.org/TR/rdfa-syntax/#s_curieprocessing it says '@property, @datatype and @typeof support only CURIE values.' (Can you post an example?) Reading ... Many of the attributes that hold URIs are also able to carry 'compact URIs' or CURIEs. A CURIE is a convenient way to represent a long URI, by replacing a leading section of the URI with a substitution token. It's possible for authors to define a number of substitution tokens as they see fit; the full URI is obtained by locating the mapping defined by a token from a list of in-scope tokens, and then simply concatenating the second part of the CURIE onto the mapped value. ... I guess the fact that @property is supposed to be CURIE-only isn't a problem with parsers since this can be understood as a CURIE with no (or empty) substitution token. cheers, Dan -- http://danbri.org/
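For anyone following along, the CURIE expansion rule quoted above can be sketched in a few lines of Python. This is only an illustration of the "look up the token, then concatenate" step, not the full RDFa processing rules; the function name expand_curie, the dc mapping and the example.org URI are my own assumptions for the example.

```python
def expand_curie(value, prefixes):
    """Expand a CURIE such as 'dc:title' using the in-scope prefix mappings.

    Simplifying assumption: if the part before the first colon is not a
    declared prefix, the value is returned unchanged. That is how a full
    URI placed in @property would pass through in this sketch.
    """
    prefix, sep, reference = value.partition(":")
    if sep and prefix in prefixes:
        # Concatenate the second part of the CURIE onto the mapped value.
        return prefixes[prefix] + reference
    return value


# Hypothetical in-scope mapping, as an xmlns:dc declaration would provide.
prefixes = {"dc": "http://purl.org/dc/elements/1.1/"}

print(expand_curie("dc:title", prefixes))
print(expand_curie("http://example.org/vocab#title", prefixes))
```

Under that assumption a full URI survives untouched so long as nobody has declared a prefix named "http", which is roughly the "CURIE with no substitution token" reading guessed at above.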
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
On 17/1/09 19:27, Sam Ruby wrote: On Sat, Jan 17, 2009 at 11:55 AM, Shelley Powers shell...@burningbird.net wrote: The debate about RDFa highlights a disconnect in the decision making related to HTML5. Perhaps. Or perhaps not. I am far from an apologist for Hixie, (nor for that matter am I a strong advocate for RDF), but I offer the following question and observation. The purpose behind RDFa is to provide a way to embed complex information into a web document, in such a way that a machine can extract this information and combine it with other data extracted from other web pages. It is not a way to document private data, or data that is meant to be used by some JavaScript-based application. The sole purpose of the data is for external extraction and combination. So, I take it that it isn't essential that RDFa information be included in the DOM? This is not rhetorical: I honestly don't know the answer to this question. Good question. I for one expect RDFa to be accessible to Javascript. http://code.google.com/p/rdfquery/wiki/Introduction - http://rdfquery.googlecode.com/svn/trunk/demos/markup/markup.html is a nice example of code that does something useful in this way. cheers, Dan -- http://danbri.org/
Re: [whatwg] Trying to work out the problems solved by RDFa
On 10/1/09 00:37, Ian Hickson wrote: On Fri, 9 Jan 2009, Ben Adida wrote: Is inherent resistance to spam a condition (even a consideration) for HTML5? We have to make sure that whatever we specify in HTML5 actually is going to be useful for the purpose it is intended for. If a feature intended for wide-scale automated data extraction is especially susceptible to spamming attacks, then it is unlikely to be useful for wide-scale automated data extraction. I've been looking at such concerns a bit for RDFa. One issue (shared with HTML in general I think) is user-supplied content (eg. blog comments and 'rel=nofollow' scenarios). Is there any way in HTML5 to indicate that a whole chunk of Web page is from an (in some to-be-defined sense) untrusted source? I see http://www.whatwg.org/specs/web-apps/current-work/#link-type-nofollow The nofollow keyword indicates that the link is not endorsed by the original author or publisher of the page, or that the link to the referenced document was included primarily because of a commercial relationship between people affiliated with the two pages. While I'm unsure about the commercial relationship clause quite capturing what's needed, the basic idea seems sound. Is there any provision (or plans) for applying this notion to entire blocks of markup, rather than just to simple hyperlinks? This would be rather useful for distinguishing embedded metadata that comes from the page author from that included from blog comments or similar. Thanks for any pointers, cheers, Dan -- http://danbri.org/
Re: [whatwg] Trying to work out the problems solved by RDFa
On 3/1/09 14:02, Julian Reschke wrote: Tab Atkins Jr. wrote: ... Well, it'll require an N3 parser where previously none was needed. RDFa requires an RDFa parser as well, and in general *any* metadata requires a parser, so this point is moot. The only metadata that doesn't require a parser is no metadata at all. With RDFa, most of the parsing is done by HTML. So I would call it an RDFa processor. And yes, that doesn't change the fact that code needs to be written. But it affects the type of the code that needs to be written. Somewhat of an aside, but for the curious - here is an RDFa parser/processor app: http://code.google.com/p/rdfquery/wiki/Introduction example: http://rdfquery.googlecode.com/svn/trunk/demos/markup/markup.html js: http://rdfquery.googlecode.com/svn/trunk/jquery.rdfa.js [...] The most successful alternative is nothing at all. ^_^ We can extract copious data from web pages reliably without metadata, either using our human senses (in personal use) or natural-language-based processing (in search engine use). It has not yet been established that sufficient and significant enough problems *exist* to justify a solution, let alone one that requires an addition to html. That is what Ian is specifically looking for. That's what you and Ian claim. Many disagree. My main problem with the natural language processing option is that it feels too close to waiting for Artificial Intelligence. I'd rather add 6 attributes to HTML and get on with life. But perhaps a more practical concern is that it unfairly biases things towards popular languages - lucky English, lucky Spanish, etc., and those that lend themselves more to NLP analysis. The Web is for everyone, and people shouldn't be forced to read and write English to enjoy the latest advances in Web automation. Since HTML5 is going through W3C, such considerations need to be taken pretty seriously. As a note, this isn't the W3C's HTML WG. The WHATWG is independent from the W3C. 
But the WHATWG HTML5 *work* is no longer entirely independent of W3C; the two organizations embarked on a major joint venture. It seems reasonable for members of the WHATWG world to take W3C-oriented considerations seriously, regardless of mailing list. cheers, Dan -- http://danbri.org/
Re: [whatwg] Trying to work out the problems solved by RDFa
On 3/1/09 16:54, Håkon Wium Lie wrote: Also sprach Dan Brickley: My main problem with the natural language processing option is that it feels too close to waiting for Artificial Intelligence. I'd rather add 6 attributes to HTML and get on with life. :-) Another thought re NLP. RDFa (and similar, ...) are formats that can be used for writing down the conclusions of NLP analysis. For example here see the BBC's recent Muddy Boots experiment, using DBPedia (Wikipedia in RDF) data to drive autoclassification / named entity recognition. So here we can agree with Ian and others that text analysis has much to offer, and still use RDFa (or other semantic markup - i'll sidestep that debate for now) as a notation for marking up the words with a machine-friendly indicator of their NLP-guessed meaning. http://www.bbc.co.uk/blogs/journalismlabs/2008/12/muddy_boots.html Personally, I think the 'class' attribute may still be a more compelling option in a less-is-more way. It already exists and can easily be used for styling purposes. Styling is bait for authors to disclose semantics. I'm sure there's mileage to be had there. I'm somehow incapable of writing XSLT so GRDDL hasn't really charmed me, but 'class' certainly corresponds to a lot of meaningful markup. Naturally enough it is stronger at tagging bits of information with a category than at defining relationships amongst the things defined when they're scattered around the page. But that's no reason to dismiss it entirely. Did you see the RDF-EASE draft, http://buzzword.org.uk/2008/rdf-ease/spec? From which comes: Ten second sales pitch: CSS is an external file that specifies how your document should look; RDF-EASE is an external file that specifies what your document means. RDF-EASE uses CSS-based syntax. 
More discussion here, http://lists.w3.org/Archives/Public/semantic-web/2008Dec/0148.html including question of whether it ought to be expressed using css3-namespace, http://lists.w3.org/Archives/Public/semantic-web/2008Dec/0175.html cheers, Dan -- http://danbri.org/
Re: [whatwg] Absent rev?
Ian Hickson wrote: On Tue, 18 Nov 2008, Martin McEvoy wrote: Just one small question: Why has HTML5 dropped the rev= [1] attribute? [1] http://www.w3.org/TR/html5-diff/#absent-attributes We did some studies and found that the attribute was almost never used, and most of the time, when it was used, it was a typo where someone meant to write rel= but wrote rev=. To be precise, the most commonly used value was rev=made, which is equivalent to rel=author and thus was not a convincing use case. The second most common value was rev=stylesheet, which is meaningless and obviously meant to be rel=stylesheet. We therefore determined that authors would benefit more from the validator complaining about this attribute instead of supporting it. (I don't dispute its relative un-used-ness...) Anything that could be done with rev= can be done with rel= with an opposite keyword, so this omission should be easy to handle. So long as there's a plausible inverse defined. This would seem to shift work from HTML5 to relationship vocabulary specs, whether RDFa-oriented or XFN-based: they'll have to name the relationship in both directions now. eg. john.html: <p>See my <a rel="father" href="pa.html">dad's page</a> for details</p> and pa.html: <p>See my <a rel="child" href="john.html">son's page</a> for details</p> are ok in html5, but pa.html: <p>Reader, <a rev="father" href="john.html">i'm his father</a></p> ...isn't. I'm not arguing here that this is right or wrong or good or bad or pretty or ugly, just that the parties defining little relationship vocabularies such as 'parent', 'child', 'father', 'mother', 'brother', 'ex-line-manager', and so on will (now 'rev' is going away) need to think carefully about naming each inverse relationship as well. As you point out, rev= wasn't heavily used anyway; however technologies like microformats and RDFa are relatively new to the Web, and things can take a while to get adopted (eg. XHR/'ajax').
cheers, Dan a personal ps.: for some reason, rev= always made my head hurt slightly to even think about, I guess because there are two senses of a reversed link: the reversed meaning of a link versus the idea of an incoming link / backlink, and the difference is simultaneously both obvious and subtle
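To make the rel/rev point concrete, here is a minimal Python sketch of how a consumer might turn either attribute into the same (subject, relation, object) statement. The function name link_to_triple is hypothetical, and the reading of rev= as "same relation, direction swapped" is the one used in the john.html / pa.html example in this message, not a quotation from any spec.

```python
def link_to_triple(page, href, rel=None, rev=None):
    """Return a (subject, relation, object) triple for one hyperlink.

    rel= reads from the current page to the link target;
    rev= names the same relation but in the opposite direction.
    """
    if rel is not None:
        return (page, rel, href)
    if rev is not None:
        return (href, rev, page)
    raise ValueError("a link needs either rel or rev")


# john.html says: my father's page is pa.html
forward = link_to_triple("john.html", "pa.html", rel="father")

# pa.html says the same thing from the other end, with no inverse term needed
backward = link_to_triple("pa.html", "john.html", rev="father")

print(forward == backward)
```

Without rev=, pa.html can only make a statement from its own end if the vocabulary also defines an inverse term such as 'child', which is the extra work being described.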
Re: [whatwg] Absent rev?
Smylers wrote: Martin McEvoy writes: To be precise, the most commonly used value was rev=made, which is equivalent to rel=author and thus was not a convincing use case. !! rel-author doesn't mean the same as rev-made eg: In which cases doesn't it? If A is the author of B then B was made by A, surely? Then B contributed to the creation of A, yes. Perhaps not on their own. But we need it in the other direction too: can we conclude from { A made B } that { B author A } ? Not if B isn't textual. Authorship is about writing, but there are many other avenues for human creativity (some of which result in things with URLs, eg. software, images, sounds). So there are two complications here, and these are very real world issues, chewing up countless hours in projects like Dublin Core. First is 'a' versus 'the'. Nothing warrants reading 'the' into rel=author. There might be other authors, listed or not listed in their own hyperlink. Or the page pointed to might be a collectively maintained page or group homepage etc. Or a mailto: for a mailing list. Second is non-textual creations. The early Dublin Core specs had a dc:author property. This was changed back in 1996 or so to be dc:creator, since this better includes visual works, museum artifacts and so forth, ie. things that can be made, but which are not (postmodernism aside) conventionally considered texts. Authorship is a notion that doesn't make much sense in a non-textual context. My point in previous mail about shifting work from HTML5 to elsewhere, is that this kind of distinction is subtle for many seemingly obvious pairs of relationship-type names, and that rev= is at least precise in its meaning. cheers, Dan -- http://danbri.org/
Re: [whatwg] Absent rev?
Smylers wrote: Dan Brickley writes: Smylers wrote: Martin McEvoy writes: !! rel-author doesn't mean the same as rev-made eg: In which cases doesn't it? If A is the author of B then B was made by A, surely? Then B contributed to the creation of A, yes. Perhaps not on their own. But we need it in the other direction too: can we conclude from { A made B } that { B author A } ? Not if B isn't textual. Authorship is about writing, but there are many other avenues for human creativity (some of which result in things with URLs, eg. software, images, sounds). Firstly, the term author can be used for at least some of those things; definitely software. Yes, 'software' was a bad example. But Dublin Core certainly did abandon the early term 'author' in favour of 'creator' after a workshop looking at requirements around images, museum artifacts and so on. Secondly, if you think made is more generic than author, then surely linking to such URLs with rel=made is an improvement on using rev=author? I don't associate 'being more generic' as a positive or a negative thing. Sometimes we want specificity, sometimes not. There is value in a 'see also' relationship type; there is value in a 'schoolHomepage' relationship type too. Neither need be better. If I wanted to find written works, then 'author' is a more relevant property than 'made'. If my concern is to find all the things created by some party, then 'made' may be more useful. My point was just that they have a different meaning (although much overlap). First is a versus the. Nothing warrants reading the into rel=author. So presumably also nothing warrants reading the into rel=made? Yup. If syntactic context (eg. via RDFa) associated the string 'made' with a specific definition rather than just the English word, then of course that definition could say anything it wanted - such as 'sole maker of ...' , 'primary maker of', etc. The early Dublin Core specs had a dc:author property. 
This was changed back in 1996 or so to be dc:creator, I agree that creator would be a better term than author. But I think that's irrelevant to needing rev. Without rev, content creators (in every language) will need to go through this dance, hunting through dictionaries and debating subtleties, to make sure that they've identified a suitable pair of words such that { X word1 Y } is true if and only if { Y word1 X }. Which is why I see this in terms of division of labour. Cleaning it out of HTML5 makes work elsewhere... cheers, Dan -- http://danbri.org/
Re: [whatwg] Absent rev?
Dan Brickley wrote: Without rev, content creators (in every language) will need to go through this dance, hunting through dictionaries and debating subtleties, to make sure that they've identified a suitable pair of words such that { X word1 Y } is true if and only if { Y word1 X }. Which is why I see this in terms of division of labour. Cleaning it out of HTML5 makes work elsewhere... Sorry that should've been, { X word1 Y } is true if and only if { Y word2 X }. Dan ps. (since i'm mailing again, sorry) ... in an RDF/XML context, we had this issue in FOAF: we added 'depicts' alongside 'depiction' because the old RDF/XML syntax didn't deal well with inverses
Re: [whatwg] RDFa statement consistency
Henri Sivonen wrote: On Aug 29, 2008, at 11:11, Julian Reschke wrote: Henri Sivonen wrote: I don't believe that is the case. If I've understood history correctly, introducing Namespaces into XML was primarily a requirement stipulated by the RDF community. XML got Pointer, please? http://lists.w3.org/Archives/Public/semantic-web/2007Dec/0116.html W3C Members (or invited experts with the right permissions) can read more of the back story in the original XML WG archives. See http://lists.w3.org/Archives/Member/w3c-xml-wg/1998Jan/0034.html 'URGENT: Proposal to modify or delay XML 1.0 Recommendation', From: Jon Bosak, 12 Jan 1998. This points to a paper, 'Turning XML into a Universal Syntax for Web Data Formats' http://www.w3.org/Member/Meeting/98JanAC/xml-req.html that was put before the Jan 1998 W3C AC Meeting in San Jose. I think it's reasonable to share the abstract here: Concern is shared by members of the RDF, SMIL and Math working groups, and the W3C architecture domain staff, that the XML 1.0 Proposed Recommendation of 8Dec97 does not address the needs as a common base for the transmission of machine-understandable data.. cheers, Dan
Re: [whatwg] RDFa Features
Kristof Zelechovski wrote: This amounts to saying that URLs take precedence over CURIEs and CURIEs can be enclosed in brackets in case of any ambiguity. This sounds ridiculous given the weight you put on avoiding ambiguities and name clashes. Since the author does not control the URL scheme registration process, he can never be sure that a particular prefix is safe, therefore using unsafe CURIEs is just asking for trouble. However, Manu's examples DO NOT use safe CURIEs, nor do any examples I have seen on this discussion. Good heavens! I agree. The 'when there is any possibility of ambiguity' sentence is a bit weak. I don't know the CURIEs spec well; but for cases where the assumption is 'this URI scheme won't be registered', the assumption is dangerous. Dan -----Original Message----- From: Julian Reschke Sent: Wednesday, August 27, 2008 5:33 PM To: Kristof Zelechovski Cc: 'Manu Sporny'; 'Ian Hickson'; 'WHAT-WG' Subject: Re: [whatwg] RDFa Features Kristof Zelechovski wrote: You cannot support both CURIEs and URLs. What happens when someone declares xmlns:http? http://www.w3.org/TR/curie/#sec_2.2.. BR, Julian
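A small Python sketch may help show why the bracketed (safe) CURIE form matters when an attribute accepts both URIs and CURIEs. It assumes the simple rule described in this thread, where an unbracketed value is taken as a URI and a bracketed one as a CURIE; the function name resolve and the example.org mapping are made up for illustration, and real processors do more than this.

```python
def resolve(value, prefixes):
    """Treat '[prefix:reference]' as a safe CURIE; anything else is a URI."""
    if value.startswith("[") and value.endswith("]"):
        prefix, _, reference = value[1:-1].partition(":")
        # Raises KeyError if the prefix was never declared.
        return prefixes[prefix] + reference
    return value


# The awkward case raised above: somebody declares xmlns:http.
# The mapped URI here is a hypothetical placeholder.
prefixes = {"http": "http://example.org/ns#"}

# Unbracketed: read as a plain URI, so the odd declaration does no harm.
print(resolve("http://www.w3.org/", prefixes))

# Bracketed: read as a CURIE, so the 'http' prefix mapping is applied.
print(resolve("[http://www.w3.org/]", prefixes))
```

With the brackets the two readings cannot be confused. With unsafe (unbracketed) CURIEs, a consumer has to guess whether "http" is a scheme or a declared prefix, which is the ambiguity being complained about.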
Re: [whatwg] RDFa
Ian Hickson wrote: On Sat, 23 Aug 2008, Julian Reschke wrote: Again you're confusing HTTP URLs with URIs. Using URIs as identifiers allows lots of identification schemes other than HTTP, in particular ones that are not based on DNS, or that use DNS, but include a timestamp to address the concern of losing a domain name (tag URI scheme). Sure, but most people use HTTP URIs anyway for namespaces. You can use any URI or any system you want with class=. The key is just to make it unique enough that clashes won't happen. In practice, names like dc:title are actually quite unique enough. But people can use much more unique ones if desired, all the way to full URIs. I'm certainly in favour of making mainstream namespace names prettier. But this design worries me, since it requires guesswork and heuristics on the part of consumer code to figure out if class = info.age or museum.acquisitionDate is intended as a URI or not. I'll air the worry first, and then sketch an approach that makes me worry less and which might have some of the characteristics that you value (such as not depending on separate xmlns-like declarations of abbreviations, and not being too ugly to look at). You mentioned earlier that the RDFish practices around downloading and interpreting schemas from the Web is news to you. I'll take up an action to document some of the things we do in that area (eg. with SPARQL for data merging), probably as a blog post. Doing so would help as background on my next point, which is that making it ambiguous whether a URI was declared is something that would need careful security review, to ensure that data consumers are aware that they should not expect property definitions found at the domain to be consistent with the intended meaning of the markup. Sketch of a scenario: 1. Alice deploys class=creationDate.info1979/class to describe a museum artifact. 
She calls it this because it marks up some information about the creation date of some real world thing, and because 'creationDate' is already in use for describing page creation dates, in the CSS library she's using. 2. Bob buys himself the Internet domain creationDate.info and wires up a webserver to respond with an RDFa schema defining creationDate as a sub-property of http://ecommerce.example.com/vocab#priceInEuros. 3. Charlie's code downloads Alice's markup, parses out the RDFa, notices that creationDate.info seems to be de-referenceable, and so goes to fetch the schema. For every triple x creationDate y in the document, it also generates x ecom:priceInEuros y too. Perhaps Bob is selling another museum artifact and wants to make Alice's look more expensive. Or cheaper. Or to make her data look corrupted so that certain consumers won't include her listing. Or maybe he wants to buy the item cheaply and is probing for bugs in Alice's online shopping system. In other words, the fact that Alice's markup only *appears* to be using an Internet domain opens her up to risk that someone will go buy that domain, and put a fake schema there which affects the likely interpretation of her markup. This exposure is increased by our uncertainty about ICANN strategy: we can't rely on the assumption that there are only a tiny handful of TLDs. We can probably rely on them being expensive at the top level, but not on having a hardcoded list enumerating them. [[ Icann has announced it will allow the creation of any new top-level domains, albeit at a considerable cost. As well as opening the door to an influx of new web addresses, Icann has also said that it will allow Japanese, Chinese, Arabic and Cyrillic characters to be used in registrations for the first time. It's a massive increase in the real estate of the internet. It will allow groups, communities and businesses to express their identities online, says Paul Twomey, chief executive of Icann, speaking to the Times.
]] http://www.pcpro.co.uk/news/208833/icann-creates-domain-name-freeforall.html The RDF approach generally has been to make it very clear which chunks of data contain URIs, and whether they can be relative or not. Other markup systems have adopted a similar approach. These share the merit of making such ambiguity much less of a problem (although there are other attacks of course). Lately I've been thinking that perhaps we can get something less ugly than "http://" in the markup, yet specify rules that allow expansion to http:// or https:// while keeping it clear whether the markup author really intends to cite some domain/page as vocabulary documentation. For example <p>I'm <span property="info.foaf/age">1979</span> years old</p> (if FOAF was documented at http://foaf.info/age and we specified the property attribute to use java-style names, and be declared relative to the http:// scheme). Or <p>I'm <span property="foaf/age">1979</span> years old</p> (if I spend $100k at ICANN to buy a tld 'foaf') or <p>I'm <span property="Com.xmlns.foaf.age">1979</span>
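To make the java-style idea above a little more concrete, here is a minimal sketch of the expansion rule as I read it from the first two examples: reverse the dot-separated labels before the first slash and prefix the scheme. The function name expand_property is made up for illustration, and the slash-less third form (Com.xmlns.foaf.age) is deliberately not handled, since its rule isn't spelled out above.

```python
def expand_property(name: str, scheme: str = "http") -> str:
    """Expand a reversed-domain property name into a URI.

    Assumption (not a spec): everything before the first "/" is a
    java-style reversed host name; everything after it is the path.
    """
    host_part, sep, path = name.partition("/")
    # Reverse the dot-separated labels: "info.foaf" -> "foaf.info".
    host = ".".join(reversed(host_part.split(".")))
    return f"{scheme}://{host}{sep}{path}"


print(expand_property("info.foaf/age"))           # FOAF documented at foaf.info
print(expand_property("foaf/age"))                # a hypothetical 'foaf' TLD
print(expand_property("info.foaf/age", "https"))  # same name, https scheme
```

Because the property attribute would always be read this way, a consumer never has to guess whether a value was meant as a URI; a plain class value would simply never be expanded.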
Re: [whatwg] RDFa Problem Statement
Kristof Zelechovski wrote: Web browsers are (hopefully) designed so that they run in every culture. If you define a custom vocabulary without considering its ability to describe phenomena of other cultures and try to impose it worldwide, you do more harm than good to the representatives of those cultures. And considering it properly does require much time and effort; I do not think you can have that off the shelf without actually listening to them. In a way, complaining that the Microformats protocol impedes innovation is like saying 'we are big and rich and strong, so either you accommodate or you do not exist'. Not that I do not understand; it is straightforward to say so and it happens all the time. Chris Let me give a quick example of how this works in RDFland. Each vocabulary defines nothing except classes (types of thing) and properties (aka relationship types). In FOAF for example, we defined Person, Agent, Document, OnlineAccount, Project, Group as classes. And we defined properties too. These tend to have a bit more 'character' than the classes, and carry the distinctive style of each vocabulary. FOAF has properties of Person and Agent such as 'openid', 'homepage', 'weblog' that have as their range (ie. values) instances of the class Document. We also define properties like 'primaryTopic' that relate a page primarily about something to the thing itself. Each class and property is considered to be in the vocabulary whose URI is http://xmlns.com/foaf/0.1/ ... and this is the basis of RDF's division of labour mechanism. See also a squiggly diagram at http://danbri.org/2008/foafspec/foafspec.jpg (apologies that this is currently inaccessible). The SIOC project declares a bunch more classes and properties. Some of these are defined with relationship to Person, Document, OnlineAccount from FOAF; classes that sub-class ours, or properties that cite our FOAF classes as the range or domain. 
DOAP does the same, expanding from the class Project to describe opensource projects. I've talked about this before so won't go on about those schemas. The point about cultural diversity, independent extension etc is made better by the JaUranai FOAF extension that appeared a few years back: http://kota.s12.xrea.com/vocab/uranai They decided that FOAF was nice and all but was lacking some properties important in a Japanese context. So they declare new RDF properties: starsign, bloodtype, and various others that I don't fully understand because they have Japanese names and documentation. Here is blood type's description from the RDF Schema file at http://kota.s12.xrea.com/vocab/uranai/uranai.rdf

  <rdf:Property rdf:about="http://kota.s12.xrea.com/vocab/uranaibloodtype">
    <rdfs:label>血液型</rdfs:label>
    <rdfs:label xml:lang="en">Blood type</rdfs:label>
    <rdfs:comment>血液型を書きます。</rdfs:comment>
    <rdfs:comment xml:lang="en">A blood type.</rdfs:comment>
    <rdfs:domain rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
    <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
    [...]
  </rdf:Property>

This effectively wires in 'bloodtype' to the other classes in use in this wider community. Wherever SIOC or DOAP projects have created a property whose range is Person, we know that Uranai's 'bloodtype' property is also applicable, without needing heavy duty coordination between the SIOC and DOAP authors and the author of Uranai. Furthermore, the fact that all these projects share a common syntactic grammar means that I can simply add a Uranai 'bloodtype' property into my FOAF self-description, and expect each and every RDF parser and SPARQL database to immediately be able to parse and query it - see http://danbri.org/words/2008/02/25/286 for example. As Manu describes in http://blog.digitalbazaar.com/2008/08/23/html5-rdfa-and-microformats/ this is rather different to the Microformats.org approach, which is by intention a monolithic community designing a single, self-consistent product.
Back on my point that RDF vocabulary classes (ie. named types of thing, Person etc) tend to be boring, and the properties more interesting. This is to address the difficulty you mention, ie. ... If you define a custom vocabulary without considering its ability to describe phenomena of other cultures and try to impose it worldwide, you do more harm than good to the representatives of those cultures. So for example in FOAF, we define fairly boring bland classes (like Person, Document) in a way that allows different cultures to attach properties that they care about. It seems bloodtype is more important in Japanese culture than in Western Europe, and the toolset and design provided by RDFa allow independent extension of FOAF in Japan without expensive central bottlenecks. For Creative Commons, they have huge headaches because copyright law varies from country to country; this has informed their redesign and their enthusiasm for RDFa. Hope this helps explain something of where RDFa folk are coming from,
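A rough sketch of the "wiring in" described above, using nothing but rdfs:domain and rdfs:range declarations held in a Python list. Only the FOAF Person URI, the RDFS Literal URI and the Uranai bloodtype declaration come from the text; the two "ex:" properties are hypothetical stand-ins for the kind of thing SIOC or DOAP might declare, not real terms from those vocabularies.

```python
PERSON = "http://xmlns.com/foaf/0.1/Person"
LITERAL = "http://www.w3.org/2000/01/rdf-schema#Literal"

# (property, rdfs:domain, rdfs:range)
# The "ex:" rows are invented for illustration only.
SCHEMA = [
    ("ex:projectMaintainer", "ex:Project", PERSON),
    ("ex:postAuthor", "ex:Post", PERSON),
    ("uranai:bloodtype", PERSON, LITERAL),
]


def applicable_after(prop, schema):
    """List properties whose domain equals the range of `prop`.

    In other words: having followed `prop` to some value, which
    properties can be used to describe that value next?
    """
    ranges = {p: rng for p, _, rng in schema}
    target = ranges[prop]
    return [p for p, dom, _ in schema if dom == target]


print(applicable_after("ex:projectMaintainer", SCHEMA))
print(applicable_after("ex:postAuthor", SCHEMA))
```

Neither hypothetical vocabulary mentions bloodtype, and the Uranai schema mentions neither of them; the shared reference to Person is the only coordination needed.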
Re: [whatwg] RDFa Problem Statement
Ben Adida wrote: Greg Houston wrote: I am not sure if Ben was alluding to this in the last paragraph, but to further complicate things SearchMonkey is not actually using RDF, I think you're confusing two different layers. SearchMonkey parses HTML with microformats, and soon HTML+RDFa, and makes that data available in RDF form to PHP scripts that you or anyone else can write. It does just this today, from actual RDFa. I've been working on an extension that integrates RDFa from the matched pages with additional information from external DataRSS (Atom+OpenSearch+RDFa) feeds. cheers, Dan -- http://danbri.org/
Re: [whatwg] RDFa Problem Statement
Dan Brickley wrote: Ben Adida wrote: Greg Houston wrote: I am not sure if Ben was alluding to this in the last paragraph, but to further complicate things SearchMonkey is not actually using RDF, I think you're confusing two different layers. SearchMonkey parses HTML with microformats, and soon HTML+RDFa, and makes that data available in RDF form to PHP scripts that you or anyone else can write. It does just this today, from actual RDFa. I've been working on an extension that integrates RDFa from the matched pages with additional information from external DataRSS (Atom+OpenSearch+RDFa) feeds. A bit more information from Peter Mika at Yahoo (fwd'd with permission): [[ the key point... is that indeed DataRSS is both Atom and RDFa compatible. RDFa is a set of attributes, we merely invented names for the XML elements that carry them... but you can completely ignore that and get the triples out by running an RDFa parser over it. OpenSearch is another extension you can add in the mix if you want. We turn both microformats and RDFa-in-HTML into DataRSS when used as input for applications so that SearchMonkey applications can abstract away from the original format. We are definitely not Microsoft doing JavaScript, since we are extending formats in the way they were foreseen (Atom extensibility) and complying with standards (RDFa) without adding to them or changing the meaning of constructs. So this is a genuine Semantic Web standards play. Btw, we haven't announced RDFa support officially because we want to get it 100% right before we do... ok maybe 99% ;) ]] cheers, Dan ps. http://labs.mozilla.com/2008/08/introducing-ubiquity/ is a nice case for in-page structured data, whether microformatty/posh or rdfa
Re: [whatwg] RDFa
Tab Atkins Jr. wrote: On Sun, Aug 24, 2008 at 3:10 PM, Julian Reschke [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Tab Atkins Jr. wrote: The point was made before that html5 already has extensive extension mechanisms in place that can address the particular needs of various communities without requiring it to be written explicitly into the spec. I know you've said that your team has reviewed the extension mechanisms and found them lacking, but could you explain why it is insufficient to use @data-rdf-property, @data-rdf-about, etc.? I ask about these specifically because my mail timestamps show that the @data-* class of attributes was introduced April 10th of this year, while the ccRel submission is dated May 1st, and thus it's very likely that these were impossible to consider during your review of existing extension mechanisms. ... Custom data attributes are intended to store custom data private to the page or application, for which there are no more appropriate attributes or elements. -- http://www.w3.org/html/wg/html5/#custom I'm confused. Are you trying to imply that my suggestion is somehow against the spec definition? If so, please accompany your quoting of the spec with an actual explanation of your point. I cannot respond to you when I essentially have to imagine your entire argument for myself first. My homepage at http://danbri.org/ is XHTML / RDFa and has data in RDFa attributes. I'd like to do this in HTML5 +RDFa instead, so I can take advantage of the other new features in HTML5. However the data is very much not private to the page, but designed to be used by a broad range of consumers. For example, Yahoo's SearchMonkey, or Google's Social Graph API. The use of RDF namespaces in that data indicates that we're using shared public schemas, rather than private islands of application-specific data. Perhaps if the Web itself is considered an application, then this is application-specific data. cheers, Dan -- http://danbri.org/
Re: [whatwg] RDFa
Ben Adida wrote: Ian Hickson wrote: Why would it scale any less than URIs? That's basically all URIs are. Why would you reinvent URIs in a way that they can't be de-referenced? Is that really a good design, in your opinion? and it's extremely web-unfriendly, since you can't look up a concept to figure out what it might mean. Sure you can. Just search for it on a search engine. That's sort of good for humans, and that's assuming there's no bug in the search engine algorithm where you get, say, Google-bombed. I'm not sure a web design should be predicated on the existence of Google, especially when it's not clear that Google will always be able to index the entire web (it's not clear Google indexes the entire web even today.) We can reasonably assume the existence of large search engines covering a good part of the public Web. Google being a well known example. But we can't necessarily assume their owners will offer reliable machine-friendly APIs to that data, with terms of service that are sufficiently unconstrained. Google for example switched off machine access via SOAP in favour of an AJAX-based approach back in 2006: http://code.google.com/apis/soapsearch/ [[ Google Code HomeGoogle SOAP Search API As of December 5, 2006, we are no longer issuing new API keys for the SOAP Search API. Developers with existing SOAP Search API keys will not be affected. Depending on your application, the AJAX Search API may be a better choice for you instead. It tends to be better suited for search-based web applications and supports additional features like Video, News, Maps, and Blog search results. ]] We went backwards, from a situation where machines could do lookups against the Google index, to http://code.google.com/apis/ajaxsearch/ which seems really much more focussed on customisation of human-facing Web content. That's really cool but doesn't help with Just search for it on a search engine if you're building things outside the browser. 
Now it turns out we can still do programmatic searches, because the AJAX API does offer a json interface, see http://code.google.com/apis/ajaxsearch/documentation/#fonje eg.: curl -e http://www.my-ajax-site.com 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=Paris%20Hilton' ...however http://code.google.com/apis/ajaxsearch/terms.html warns that "The API is limited to allowing You to host and display Google Search Results on your site ... The API may be used only for services that are accessible to your end users without charge. ... You agree that you will not, and you will not permit your users or other third parties to: (a) modify or replace the text, images, or other content of the Google Search Results, including by (i) changing the order in which the Google Search Results appear, (ii) intermixing Search Results from sources other than Google, or (iii) intermixing other content such that it appears to be part of the Google Search Results; or (b) modify, replace or otherwise disable the functioning of links to Google or third party websites provided in the Google Search Results." ...the constraints are significant. And may change at any time (just as things changed for users of the old SOAP API). Perhaps the miscommunication we have here is that when Ian says "Sure you can. Just search for it on a search engine." he's assuming a human "you", but RDFa people are thinking of this as a scripted operation too, because we know that machine-readable RDF/RDFa vocabulary descriptions exist that make it easier to find equivalencies between the classes and properties used in our data. cheers, Dan -- http://danbri.org/
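For anyone scripting this rather than typing it into curl, here is a small sketch of how the request URL in the example is put together. It only builds the string and makes no network request; the endpoint and the v and q parameters are taken from the curl line, while the function name build_search_url is made up. The two parameters must be joined with "&" for the query to parse.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Endpoint as quoted in the curl example; nothing is fetched here.
BASE = "http://ajax.googleapis.com/ajax/services/search/web"


def build_search_url(query: str, version: str = "1.0") -> str:
    """Return the search URL with properly separated, escaped parameters."""
    # quote_via=quote encodes spaces as %20 (the default would use "+").
    params = urlencode({"v": version, "q": query}, quote_via=quote)
    return f"{BASE}?{params}"


url = build_search_url("Paris Hilton")
print(url)

# Round-trip check: the query string splits back into two parameters.
print(parse_qs(urlsplit(url).query))
```

Whether a script is allowed to do anything useful with the results is a separate question, governed by the terms of service rather than by the URL.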
Re: [whatwg] RDFa
+cc: Paul Miller of Talis, who worked on the AHDS report mentioned below. Henri Sivonen wrote: On Aug 23, 2008, at 02:43, Ben Adida wrote: Why would you reinvent URIs in a way that they can't be de-referenced? To avoid having misleading affordances. http://en.wikipedia.org/wiki/Affordance We want one parser, with variability and innovation in the vocabulary definition only. Having one parser seems appealing compared to using the native mechanisms of each of HTML (meta, link), PDF (document information dictionary), PNG (tEXt chunk), etc. at first, but the vision that tools handle this all when you remix culture already requires the tools to support reading and writing the file formats they remix. When you already have format-native key-value read/write capability, the ability to build and mine RDF *graphs* becomes an additional burden. It may not be obvious to those who haven't followed the history, or who were at school at the time, but many of us did indeed invest a lot of time and effort using name/value metadata structures in HTML. For example, the Dublin Core project began with this technology base beginning back in 1994/5, and the experience of metadata implementors using it was one of the drivers for the creation of RDF. At the time there was no WHATWG to talk to, but the metadata community *did* talk to W3C. See http://dublincore.org/about/history/ Early on, the Dublin Core community found a lot of pressure for feature-creep: new elements/terms to address the needs of various groups who liked Dublin Core, but wanted some specifics added. This situation gave rise to the 'Warwick Framework', defined in 1996 - http://www.dlib.org/dlib/july96/lagoze/07lagoze.html [[ While there was consensus among the attendees that the concept of a simple metadata set is useful, there were a number of fundamental questions concerning the real utility of the Dublin Core as it was defined at the end of the preceding workshop.
Does the very loosely defined Dublin Core really qualify as a standard that can be read and processed programmatically? Should the number of the core elements be expanded, to increase semantic richness, or reduced, to improve ease-of-use by authors and/or web publishers? Will authors reliably attach core metadata elements to their content? Should a core metadata set be restricted to only descriptive cataloging information or should it include other types of metadata such as administrative information, linkage data, and the like? What is the relationship of the Dublin Core to other developing work in metadata schemes, particularly in those areas such as rights management information (terms and conditions)? The workshop attendees concluded that the answer to these questions and the route to progress on the metadata issue lay in the formulation of a higher-level context for the Dublin Core. This context should define how the Core can be combined with other sets of metadata in a manner that addresses the individual integrity, distinct audiences, and separate realms of responsibility of these distinct metadata sets. ]] For an implementor report typical of the experience from this era, ie. with name/value pairs, see the UK Arts and Humanities Data Service document http://ahds.ac.uk/public/metadata/discovery.html which was presented at the Oct'97 Helsinki workshop of the Dublin Core. At the time I was involved with the ROADS internet cataloguing project and can vouch that we hit a similar ceiling with attribute/value metadata. From the appendix, http://ahds.ac.uk/public/metadata/disc_09.html ... here are some of the attribute/value structures they were forced to squash their metadata records into.

  DC.creator.corporateName.1  Canterbury Archaeological Trust
  DC.creator.phone.1          +44 227 462062
  DC.creator.personalName.2   Paul Miller
  DC.creator.affiliation.2    Archaeology Data Service

...this expresses name, affiliation and contact information for a number of contributors to a work.
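To show what a consumer of those flat records has to do, here is a minimal sketch that regroups the numbered DC.creator.* keys quoted above into one record per contributor. The four key/value pairs are the ones from the AHDS appendix; the function name group_by_index and the choice of a dict-of-dicts are my own illustration, not anything the report specifies.

```python
from collections import defaultdict

# The flat attribute/value pairs quoted from the AHDS appendix.
FLAT = {
    "DC.creator.corporateName.1": "Canterbury Archaeological Trust",
    "DC.creator.phone.1": "+44 227 462062",
    "DC.creator.personalName.2": "Paul Miller",
    "DC.creator.affiliation.2": "Archaeology Data Service",
}


def group_by_index(flat):
    """Rebuild per-contributor records from keys ending in a numeric index."""
    records = defaultdict(dict)
    for key, value in flat.items():
        # "DC.creator.phone.1" -> prefix "DC.creator.phone", index "1"
        prefix, _, index = key.rpartition(".")
        # Keep only the last component of the prefix as the field name.
        field = prefix.rsplit(".", 1)[-1]
        records[int(index)][field] = value
    return dict(records)


for index, record in sorted(group_by_index(FLAT).items()):
    print(index, record)
```

The structure (which phone number belongs to which creator) lives only in a naming convention that every consumer has to know about and re-implement, which is the ceiling we kept hitting.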
Another example describes several contributors along with their roles (actor, director, etc). Again the attribute/value representations contained numeric indexes ('DC.creator.role.9') to disambiguate which individual was being described. What barrier is there to building reusable vocabularies? The follow-your-nose principle is missing, which is fairly essential for discovering the meaning of vocabularies (partially automatically, not by doing a Google search.) The partial automation with RDFa doesn't go very far. If a program automatically dereferences http://creativecommons.org/ns# and parses the result as RDFa, the program now has a human-readable string for each property--not exactly something that the program can act on further without human help. Looking at this example, <div id="license" about="#license" typeof="rdf:Property"> <h4>cc:license</h4> A <a rel="rdfs:domain" href="#Work">Work</a> <span
Re: [whatwg] RDFa
Kristof Zelechovski wrote: It seems to me identification and description of various entities is best achieved with LDAP which is hierarchical by design. Why wasn't LDAP adopted for the purpose, given that it is older, widely used and well understood? Work began on LDAP (a simplification from X.500) in 1993; and on Dublin Core (in some ways a simplification of longstanding library cataloguing methods for the Web) in 1994. We might equally ask why it didn't use SGML (it did) or XML (it did that too, after it was invented). There was work on exploring the use of LDAP and X.500 to address Dublin Core's needs, eg. see http://tools.ietf.org/html/draft-hamilton-dcxl-02 although it never really set the world on fire. Why, is probably related to the larger question of why the Web evolved as a technology stack on top of IETF/internet specs rather than on top of X.500 or other work from that world... Dan -- http://danbri.org/
Re: [whatwg] Creative Commons Rights Expression Language
Bonner, Matt wrote: On Wed, Aug 20, 2008 at 5:22 PM, Bonner, Matt wrote: Hola, I see that the Creative Commons has proposed additions to HTML to support licenses (ccREL): http://www.w3.org/Submission/2008/SUBM-ccREL-20080501/ ... Tab Atkins Jr. replied: The whole thing would be best expressed as a microformat, as the entire thing can be made just as machine- and human-readable without having to introduce an entire new addition to html. I think someone is a little confused about the importance of CC... then Dan Brickley wrote: I encourage you to (re)-read http://www.w3.org/Submission/2008/SUBM-ccREL-20080501/ ... the spec explains that all of CC's concrete markup requirements are addressed by the HTML additions in the RDFa spec. It does not propose *any* new HTML markup to address CC's specific needs. (big snip) In other words, adding 'about', 'property', 'resource', 'datatype' and 'typeof' and a namespace-URI association convention to HTML5 ... Just so I understand you, are you saying that attributes aren't markup? Because first you say no new markup, then you list 5 attributes to add. Ah, sorry for the unclarity. Attributes are markup. The sentence comes as a whole: I meant that ccREL proposes no new *CC-specific* attributes or elements. They get their job done using general RDFa markup. Second, the Introduction cites RDFa, which footnote 4 describes as an emerging collection of attributes and processing rules for extending XHTML to support RDF. However, the Introduction text and example go on to talk about HTML. Independent of any other discussions, I think it behooves the authors to clarify their intent. Is this for XHTML, HTML or both? Yes, this could be clearer. The group's general line (Ben feel free to correct me) is that this attribute-driven markup style is intended to be largely neutral of its 'carrier' format, but that RDFa-in-XHTML is the only version that is fully specified with implementor tests etc underway.
For this markup to work in other XML languages would require some more work; for it to be deployed in non-XML HTML (HTML5 etc) requires even more. But the general notion is that these attributes could be deployed in SVG-based, HTML5/6-based etc. languages too, ie. that this isn't a project tightly bound to (some specific version of) XHTML. Of course in a non-XML context, some other mechanism is needed (eg. link rels) to associate abbreviations with URLs. Also in http://www.w3.org/TR/rdfa-syntax/ (now in CR at W3C, http://www.w3.org/TR/2008/CR-rdfa-syntax-20080620/) [[ RDFa is a specification for attributes to be used with languages such as HTML and XHTML to express structured data. [...] This document only specifies the use of the RDFa attributes with XHTML. ]] Does that help? cheers Dan -- http://danbri.org/
Re: [whatwg] Creative Commons Rights Expression Language
+cc: Ben Adida Tab Atkins Jr. wrote: On Wed, Aug 20, 2008 at 5:22 PM, Bonner, Matt [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Hola, I see that the Creative Commons has proposed additions to HTML to support licenses (ccREL): http://www.w3.org/Submission/2008/SUBM-ccREL-20080501/ As an example, they offer:

  <div about="http://lessig.org/blog/" xmlns:cc="http://creativecommons.org/ns#">
    This page, by
    <a property="cc:attributionName" rel="cc:attributionURL" href="http://lessig.org/">Lawrence Lessig</a>,
    is licensed under a
    <a rel="license" href="http://creativecommons.org/licenses/by/3.0/">Creative Commons Attribution License</a>.
  </div>

Unless I missed something in the HTML5 spec, at the least this would add the property attribute to <a>. Wouldn't ccREL be expressed better using <link> instead of <a>? Matt -- Matt Bonner Hewlett-Packard Company The whole thing would be best expressed as a microformat, as the entire thing can be made just as machine- and human-readable without having to introduce an entire new addition to html. I think someone is a little confused about the importance of CC... (Note: the someone is not you, Matt, but the drafters of this proposal. Also, I love CC as much as the next guy, but there's absolutely no reason to extend html to accommodate it, as everything they want to express can be done in existing html and formatted as a microformat.) I encourage you to (re)-read http://www.w3.org/Submission/2008/SUBM-ccREL-20080501/ ... the spec explains that all of CC's concrete markup requirements are addressed by the HTML additions in the RDFa spec. It does not propose *any* new HTML markup to address CC's specific needs. Instead, they're telling the world that CC's needs (including their own requirement for independent extensions) are well-handled by RDFa. RDFa adds a set of attributes; http://www.w3.org/MarkUp/2008/ED-rdfa-syntax-20080403/#rdfa-attributes has a full list. The ccREL spec shows these in an XHTML+RDFa XHTML format.
There's a strong case to add them to HTML5 too, in my view. In other words, adding 'about', 'property', 'resource', 'datatype' and 'typeof' and a namespace-URI association convention to HTML5 wouldn't merely be addressing the important needs of the Creative Commons community. It would allow the expression of properties defined by any decentralised community, without the need for central coordination. This includes not just CC, but every group worldwide who are extending and customising CC for their own needs. Not just FOAF, but groups extending it for modelling forum posts and social media (eg. SIOC), or opensource projects (DOAP). Not just Dublin Core, but the huge range of projects that extend it to handle educational metadata (which itself varies nationally), rights, aggregation, classification etc. The addition of the RDFa attributes would allow HTML5 to carry structured data expressed in all/any of these vocabularies. The Microformats.org community have done wonderful work and have inspired many others, but it is unfair on them (and unrealistic) to pressure their community, mailing lists and wiki by expecting their process to be a central bottleneck for all markup extensions to HTML. The Web serves a massive and fast growing community, many of whom don't speak English and whose markup needs aren't core business for Microformats.org. By using RDFa and associating each vocabulary with a URI, we can spread the workload a bit more evenly. Note also that every new vocabulary initiative at Microformats.org creates real and non-trivial work for parser writers, as well as work for vocabulary authors in specifying what it means to mix each pair of vocabularies. For ccREL (and FOAF, Dublin Core, SIOC, DOAP, ...), this is largely handled by RDF/RDFa: it can be freely mixed with any other RDF vocabulary, and reliably parsed by generic parser code. The tradeoff here is that the markup is less hand optimised for beauty than with microformats.
(When extra-pretty custom markup is important, RDF provides GRDDL as a way of using XSLT to specify a mapping into its common data model.) For more on RDFa, see the primer, http://www.w3.org/2006/07/SWD/RDFa/primer/ For a microformat parser that also handles RDFa, see http://buzzword.org.uk/cognition/ ... or an RDF toolkit that also parses some popular microformats, see http://arc.semsol.org/ For RDFa parsing in Javascript, see http://www.w3.org/2006/07/SWD/RDFa/impl/js/ cheers, Dan ps. my slides from a recent talk on rdf and microformats are here, if anyone's interested. It's more about how enthusiasts from each effort can learn from each other, than about the technical detail: http://www.slideshare.net/danbri/one-big-happy-family/ via http://microformats.eventwax.com/vevent -- http://danbri.org/
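As a rough illustration of the "generic parser code" point made in this message, here is a sketch that pulls rel/href pairs out of the ccREL example using only Python's standard html.parser. It is emphatically not an RDFa parser (it ignores about, property, and prefix resolution); the class name RelCollector and function collect_rels are invented for this sketch, and the sample markup is the Lessig example from the ccREL submission quoted earlier in the thread.

```python
from html.parser import HTMLParser

SAMPLE = """
<div about="http://lessig.org/blog/" xmlns:cc="http://creativecommons.org/ns#">
  This page, by
  <a property="cc:attributionName" rel="cc:attributionURL"
     href="http://lessig.org/">Lawrence Lessig</a>,
  is licensed under a
  <a rel="license"
     href="http://creativecommons.org/licenses/by/3.0/">Creative Commons
     Attribution License</a>.
</div>
"""


class RelCollector(HTMLParser):
    """Collect (rel token, href) pairs from <a> and <link> elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = attributes.get("rel")
        href = attributes.get("href")
        if tag in ("a", "link") and rel and href:
            # rel may hold several space-separated tokens.
            for token in rel.split():
                self.links.append((token, href))


def collect_rels(html):
    parser = RelCollector()
    parser.feed(html)
    parser.close()
    return parser.links


for rel, href in collect_rels(SAMPLE):
    print(rel, href)
```

The code knows nothing about Creative Commons: the same few lines would pick up relationships from any other vocabulary written with the same attributes, which is the division of labour being argued for.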
Re: [whatwg] Question about the PICS label in HTML5
Anne van Kesteren wrote: On Thu, 17 Apr 2008 11:06:46 +0200, Dan Brickley [EMAIL PROTECTED] wrote: http://wiki.whatwg.org/wiki/RelExtensions Erm, 'For the Status section to be changed to Accepted, the proposed keyword must have been through the Microformats process, and been approved by the Microformats community. ' Is that really so? That's the current proposal. I personally think a W3C Recommendation backing it should be enough as well. If these drafts are destined for W3C specs, then yes, please make that change to your process. Microformats.org should be one of several in-routes here. cheers, Dan -- http://danbri.org/
Re: [whatwg] Question about the PICS label in HTML5
Anne van Kesteren wrote: On Thu, 17 Apr 2008 10:37:30 +0200, Phil Archer [EMAIL PROTECTED] wrote: What do we need for HTML 5? Just the link/rel element. A POWDER link will be something like <link rel="powder" href="powder.xml" type="application/xml" /> If the POWDER WG defines the powder relationship and adds powder to the following Wiki page as proposal that should be enough (with a pointer to the definition): http://wiki.whatwg.org/wiki/RelExtensions Erm, 'For the Status section to be changed to Accepted, the proposed keyword must have been through the Microformats process, and been approved by the Microformats community. ' Is that really so? Dan -- http://danbri.org/
Re: [whatwg] Administrivia: new member in the oversight committee
Ian Hickson wrote: On Sun, 30 Mar 2008, Dan Brickley wrote: Ian Hickson wrote: FYI, Anne van Kesteren was just invited to join the WHATWG membership (as defined by our charter, basically that's the small group of people whom I have to answer to in my role as editor). He was invited due to his long involvement in the WHATWG. This oversight group doesn't do much and this won't really change anything; basically the group is there to make sure I don't become evil and biased somehow, and to help direct the group should we decide to take on some new project. Does the committee have a mailing list? Where do they discuss things? Any papertrail? There's no public accountability for this group, no. It's roughly equivalent to W3C staff, except that it is not a paid position. W3C staff report through a variety of documented means to their stakeholders (including at regular events, Web Conference, TPs etc), they have named and documented roles grounded in the W3C Process, a class of document for airing their proposals to the wider community (Team notes) as well as strong internal-transparency via extensive internal email, cvs and irc logging so that new team-members can have access to previous discussions. Is this the equivalence you have in mind? W3C staff as a group culture (nothing personal here; I was one myself for years) also have a tendency to be a little over-secretive, insular, and too often slip into thinking of themselves as having to heroically figure out what to do internally before presenting an external opinion. Get a tight-knit, smart and distributed group of people together with a sense of mission, and that's a hard trait to avoid. I hope you'll lean towards the public accountability side of things here. See also: http://www.whatwg.org/charter Thanks, interesting. Is a version history and change-log available, beyond what can be discerned from http://web.archive.org/web/*/http://www.whatwg.org/charter ?
From the outside it is hard to understand how the charter has evolved over time. cheers, Dan -- http://danbri.org/
Re: [whatwg] Administrivia: new member in the oversight committee
Hi Ian, Ian Hickson wrote: FYI, Anne van Kesteren was just invited to join the WHATWG membership (as defined by our charter, basically that's the small group of people whom I have to answer to in my role as editor). He was invited due to his long involvement in the WHATWG. This oversight group doesn't do much and this won't really change anything; basically the group is there to make sure I don't become evil and biased somehow, and to help direct the group should we decide to take on some new project. Does the committee have a mailing list? Where do they discuss things? Any papertrail? cheers, Dan -- http://danbri.org/
Re: [whatwg] Video codec requirements changed
[snip] How about this permathread gets a @whatwg.org mailing list all of its own? Just a suggestion... dan
Re: [whatwg] sarcasm
Elliotte Harold wrote: It occurs to me that one of the most frequently used nits of pseudo-markup is to indicate sarcasm. For example, <sarcasm>Yeah, George W. Bush has been such a great president.</sarcasm> Should we perhaps formalize this? Is there any benefit to be achieved by adding an explicit sarcasm element to HTML? Seems rather culturally specific. I found from living in Boston for a while, that a British sense of humour often seems harsher and more sarcastic to our gentle US cousins. So I wouldn't burn this into an element name. Some way of citing externally maintained lists might be nice, eg. see the work of http://www.w3.org/2005/Incubator/emotion/charter : "The mission of the Emotion Incubator Group, part of the Incubator Activity, is to investigate the prospects of defining a general-purpose Emotion annotation and representation language, which should be usable in a large variety of technological contexts where emotions need to be represented." cheers, Dan
Re: [whatwg] video, object, Timed Media Elements -- Part I SMIL
Martin Atkins wrote:
> ddailey wrote:
>> On Thu, 22 Mar 2007 13:03:24, Anne van Kesteren wrote
>>>> 1. why not just include SMIL as a part of HTML, much in the same way that it is integrated with SVG? It is an existing W3C reco.
>>>
>>> Reasons for not using t:video were that it was 1) complicated and 2) not used.
>>
>> Thanks Anne... Is there some easy way to resurrect prior discussions of this from the archives somewhere? I would like to try to understand the reasoning here. SMIL doesn't seem complicated to me -- declarative animation is rather charming and the complicatedness is cognitively less demanding than scripting. Its popularity will probably be synergized by rather dramatic increases in use of SVG.
>
> SMIL solves problems far greater than the current aim of <video>, which is a much more modest goal of just being able to embed video interoperably in an HTML document. If you want to do all that fun SMIL stuff, then why not just use SVG? It already does it all. <video> for the simple use cases and SVG+SMIL for the complicated ones doesn't seem too bad a compromise to me.

I've not followed it, ... but there's a SMIL subset integrated with XHTML at http://www.w3.org/TR/XHTMLplusSMIL/ ... if you find SMIL too large, perhaps this or another profile is less intimidating?

Dan
Re: [whatwg] video: togglePause() versus pause()
Alexey Feldgendler wrote:
> On Sun, 18 Mar 2007 22:09:02 +0100, Magnus Kristiansen wrote:
>>> I just played some more with our internal implementation (Opera's) and noticed that our pause() really is like togglePause() in the HTML5 proposal. Looking at the specification I don't see much need for pause() there. Perhaps togglePause() should just become pause() and pause() be removed?
>>
>> I would suggest the opposite. For basic actions like play and pause, play() and pause() are the most natural options. I question whether we need a command to toggle between play/pause at all. Any UI which uses a combined play/resume button has to know which state it is, so it already knows which command is relevant.
>
> +1
>
> What's good for UI (a play/pause toggle button) isn't necessarily good for API. play() should only start playback (and do nothing if it's already playing), pause() should only pause (and do nothing if it's stopped). The spec also mentions a property to find out the current state.

This is an important point. Pause UI is a well known slippery issue (state vs action). An API shouldn't dictate the UI...

Dan
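
The distinction drawn in this thread (idempotent play() and pause() in the API, with any toggle living in the UI) can be sketched roughly as follows. This is a minimal illustration only: the `Player` class, the `PlaybackState` enum and `on_toggle_button` are hypothetical names, not taken from the HTML5 draft.

```python
from enum import Enum


class PlaybackState(Enum):
    PAUSED = "paused"
    PLAYING = "playing"


class Player:
    """Hypothetical media object; names are illustrative, not from the spec."""

    def __init__(self):
        # The readable state property mentioned in the thread.
        self.state = PlaybackState.PAUSED

    def play(self):
        # Idempotent: calling play() while already playing changes nothing.
        self.state = PlaybackState.PLAYING

    def pause(self):
        # Idempotent: calling pause() while already paused changes nothing.
        self.state = PlaybackState.PAUSED


def on_toggle_button(player):
    """UI-level toggle, built on the API by inspecting the current state."""
    if player.state is PlaybackState.PLAYING:
        player.pause()
    else:
        player.play()
```

With this split, a combined play/pause button is a few lines of UI code, while scripts that simply want playback stopped can call pause() without first checking the state.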
Re: [whatwg] W3C restarts HTML effort
Ian Hickson wrote:
> The W3C today publicly announced that they are restarting an HTML specification effort.
>
> http://www.w3.org/2007/03/html-pressrelease
>
> This is great news and a clear validation of the WHATWG effort, which has been leading the maintenance and development of HTML since 2004. I'd like to congratulate everyone who has been involved in the WHATWG work, this really confirms that we have been doing good work.
>
> Surprisingly, the W3C never actually contacted the WHATWG during the chartering process. However, the WHATWG model has clearly had some influence on the creation of this group, and the charter says that the W3C will try to actively pursue convergence with WHATWG:
>
> http://www.w3.org/2007/03/HTML-WG-charter.html#conformance
>
> Hopefully they will get in contact soon. In the meantime, apparently anyone can actually join the W3C effort.
>
> http://www.w3.org/2004/01/pp-impl/40318/instructions
>
> The instructions to join the group are as follows:
>
> 1. Fill in the Public Access Request Form; in the Reason field, put: To apply for participation in the HTML Working Group as an Invited Expert.
>    http://cgi.w3.org/MemberAccess/Public
> 2. When you get a reply back, you should have a username and password. Fill in the W3C Invited Expert Application form.
>    http://www.w3.org/2002/09/wbs/1/ieapp/
> 3. E-mail Dan Connolly and Karl Dubost ([EMAIL PROTECTED], [EMAIL PROTECTED]) asking for approval.
> 4. When you get a reply back, fill in the Joining the HTML Working Group form.
>    http://www.w3.org/2004/01/pp-impl/40318/join
>
> I would encourage everyone interested in working with the HTML working group to go through these steps as soon as possible, so that you will be a member of the group before the work starts.
>
> I have also posted a WHATWG blog entry with this information:
>
> http://blog.whatwg.org/w3c-restarts-html-effort
>
> Cheers,

The charter page also notes

  The HTML Working Group also welcomes participation from non-Members. This may take the form of questions and comments on the mailing list or IRC channel, for which there is no formal requirement, or technical submissions for consideration, for which the participant must agree to Royalty-Free licensing under the W3C Patent Policy.
  -- http://www.w3.org/2007/03/HTML-WG-charter.html#participation

...also

  This group primarily conducts its technical work on a Public mailing list.

In other words, there's a participation level below full WG membership, but acknowledged in the charter. It may suit some folk here. Great to have the WG discussions in the public record too. Easier to find, easier to link to, etc. This is a very healthy level of openness. Good for W3C, good for the Web...

cheers,

Dan
Re: [whatwg] The IMG element, proposing a CAPTION attribute
Elliotte Harold wrote:
> Jeff Seager wrote:
>> A better way would be to semantically attach the caption or cutline to the image itself, so its display is paired naturally. In this way, the width of the cutline would be dictated (unless overruled in the stylesheet) by the width of the image. I'm suggesting that CAPTION be adopted as a new attribute of the IMG element, as it is already for the TABLE element.
>
> I don't think caption should be an attribute, an element maybe, but not an attribute. The problem is that captions can and do have substructure. For instance, a caption might include multiple emphasized or strongly emphasized sections. Attributes just aren't powerful enough for this. Given that, I suspect we're probably better off just using regular paragraphs in text with appropriate CSS instructions rather than introducing a new element.

I agree, attributes are too weak (eg. couldn't support http://www.w3.org/TR/ruby/ ).

Dan
Re: [whatwg] Mathematics in HTML5
* Ian Hickson [2006-06-08 00:28+]
> On Wed, 7 Jun 2006, Michel Fortin wrote:
>> I'd like to try something a little simpler. So here is my idea for a math markup.
>
> I would be very cautious about introducing an entirely new language to do this (even if it is just an extension of HTML4). For something as big as Mathematics, we want to simply re-use an existing language, not invent a new one. Inventing a new language for encoding content with as wide a problem-space as mathematics would require months, as well as the time of domain experts, etc. This work has already been done, e.g. in ISO12083, MathML, LaTeX, and other such languages.

I absolutely agree. It would also be both considerate and sensible (if anyone does want to undertake such a task) to talk to the MathML folks first.

cheers,

Dan
Re: [whatwg] HTML5 Parsing spec first draft ready
* Ian Hickson [2006-02-15 23:02+]
> On Wed, 15 Feb 2006, Dan Brickley wrote:
>> Have you considered defining the parser behaviour in terms of XML concepts?
>
> What would that mean? Could you give an example of what that would look like?

Expressing things in terms of DOM would be one way, assuming there is a mapping to XML infoset from the DOM (which http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/ suggests there is, though perhaps there are DOM version issues here?).

>> If you do get to the test suite stage, there'll be need for some concrete syntax presumably, to express test outputs in?
>
> The output of the parser is a DOM, so the natural form to use as an output concrete syntax is simply a serialised DOM (e.g. an XML file).

If your DOM comes with a standard XMLization, we're golden. Sorry I'm not so up to date on DOM stuff (eg. which DOMs have an XMLization defined, etc.).

>> GRDDL could then say for HTML-ish bytestreams, feed them to the WHATWG algorithm to get XML, and feed that XML to normal GRDDL algorithm to get RDF...
>
> I'm with you up to the step where the output is XML, but I fail to see how the next step is something WHATWG would be interested in. Could you expand on this?

The next step is for people who find value in RDF's abstract graph structure but find the standard RDF/XML syntax unattractive. GRDDL lets folk deploy using XML or XHTML-based formats of their own devising, but map into RDF using XSLT so that RDF tools (eg. databases, SPARQL query engines) can consume and exploit the data. I don't expect this to be directly of interest to WHATWG unless WHATWG find value in RDF. Beyond that, just think of it as another potential user of the parser spec.

http://www.w3.org/2004/01/rdxh/grddl-xml-demo has some demos of GRDDL in action; http://librdf.org/query has some demos of RDF query using SPARQL, from a toolkit that has GRDDL support. So one use case would be to mix natively RDF content with RDFized microformat markup, so we could write queries whose answers draw on information scattered across both formats and potentially multiple documents.

Dan
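
The use case of querying across natively RDF data and data extracted from markup might look roughly like the sketch below. It is only an illustration under stated assumptions: triples are plain Python tuples, `match` is a hypothetical stand-in for a real SPARQL engine, and every URI and value is made-up example data.

```python
def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Made-up data: one triple published natively as RDF...
native = {("http://example.org/people#dan", "name", "Dan")}
# ...and one assumed to have been extracted from microformat-style markup.
from_markup = {
    ("http://example.org/page", "author", "http://example.org/people#dan")
}

# Merging graphs is just set union in this toy model.
merged = native | from_markup


def author_names(triples, page):
    """Join across both sources: page -> author -> name."""
    return [
        name
        for (_, _, person) in match(triples, s=page, p="author")
        for (_, _, name) in match(triples, s=person, p="name")
    ]
```

Neither source alone can answer "what is the name of this page's author"; the join only succeeds once both sets of triples sit in the same graph.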
Re: [whatwg] HTML5 Parsing spec first draft ready
* Ian Hickson [2006-02-13 22:07+]
> So... The first draft of the HTML5 Parsing spec is ready. I plan to start implementing it at some point in the next few months, to see how well it fares.

Any plans for a test suite? eg. pairs of input files and normalised output? (if that makes sense...).

Dan
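
A test suite of "input, normalised output" pairs could be shaped roughly as below. This is a sketch under loud assumptions: Python's standard `html.parser` is used purely as a stand-in and is NOT the WHATWG parsing algorithm, and the `Normaliser`, `normalise`, `CASES` and `run_cases` names are hypothetical.

```python
from html.parser import HTMLParser


class Normaliser(HTMLParser):
    """Stand-in parser (Python's html.parser), not the WHATWG algorithm."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names; we sort attributes
        # by name and always double-quote values (no escaping in this sketch).
        ordered = sorted(attrs, key=lambda kv: kv[0])
        rendered = "".join(' {}="{}"'.format(k, v or "") for k, v in ordered)
        self.out.append("<{}{}>".format(tag, rendered))

    def handle_endtag(self, tag):
        self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        self.out.append(data)


def normalise(markup):
    parser = Normaliser()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


# Each case pairs a messy input with its expected normalised form.
CASES = [
    ("<P CLASS=x>hi</P>", '<p class="x">hi</p>'),
    ("<a href='u' ID=k>t</a>", '<a href="u" id="k">t</a>'),
]


def run_cases(cases):
    """Return (input, passed) for every case."""
    return [(src, normalise(src) == expected) for src, expected in cases]
```

The value of this shape is that the pairs are data: the same case list could be run against any implementation that exposes a `normalise`-style entry point.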
Re: [whatwg] HTML5 Parsing spec first draft ready
+cc: Dan Connolly

* Ian Hickson [2006-02-14 18:41+]
> On Mon, 13 Feb 2006, Dan Brickley wrote:
>> Any plans for a test suite? eg. pairs of input files and normalised output? (if that makes sense...).
>
> I'd strongly recommend people put off creating a test suite until the spec is in more than a first draft, but yes, on the long term this is something we should definitely do.

Yup, I appreciate it's early days. Discussing some related work (GRDDL) in the W3C SemWeb CG, I was wondering whether there is any way your parser spec could be specified as input for a GRDDL transform. GRDDL provides techniques for transforming XML-based languages (including XHTML) into an RDF representation; typically by reference to an XSLT. If the WHATWG parser spec defined itself in terms of some XML-shaped output, the two should chain nicely together.

Have you considered defining the parser behaviour in terms of XML concepts? If you do get to the test suite stage, there'll be need for some concrete syntax presumably, to express test outputs in?

GRDDL could then say for HTML-ish bytestreams, feed them to the WHATWG algorithm to get XML, and feed that XML to normal GRDDL algorithm to get RDF...

Dan
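
The proposed chain (HTML-ish bytes, then XML, then RDF) can be sketched as two composable steps. Everything here is an assumption for illustration: Python's `html.parser` stands in for the WHATWG algorithm and only copes with well-nested input, and `xml_to_triples` is a hand-written Python function standing in for the XSLT that a real GRDDL transform would reference.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TreeBuildingParser(HTMLParser):
    """Stand-in for the parser spec; handles well-nested input only."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.builder = ET.TreeBuilder()

    def handle_starttag(self, tag, attrs):
        self.builder.start(tag, {k: (v or "") for k, v in attrs})

    def handle_endtag(self, tag):
        self.builder.end(tag)

    def handle_data(self, data):
        self.builder.data(data)


def html_to_xml(markup):
    """Step 1: HTML-ish text in, serialised XML out."""
    parser = TreeBuildingParser()
    parser.feed(markup)
    parser.close()
    root = parser.builder.close()
    return ET.tostring(root, encoding="unicode")


def xml_to_triples(xml_text, subject):
    """Step 2 (hypothetical transform): rel/href links become triples."""
    root = ET.fromstring(xml_text)
    return [
        (subject, link.get("rel"), link.get("href"))
        for link in root.iter("a")
        if link.get("rel") and link.get("href")
    ]
```

For example, `xml_to_triples(html_to_xml('<DIV><A REL=author HREF="http://example.org/dan">Dan</A></DIV>'), "http://example.org/page")` yields a single (page, "author", link target) triple. The point of the design is that step 2 never sees tag soup, only XML.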
Re: [whatwg] What exactly is contentEditable for?
Olav Junker Kjær wrote:
> Lachlan Hunt wrote:
>> I'm not disputing the fact that there is an unfortunate demand for embedded WYSIWYG editing in web based CMSs, it is the conceptually broken implementation I'm against.
>
> I don't consider this demand unfortunate. I consider it an essential part of the vision for the web. The writable web or universal canvas or whatever it's called, has been a part of the vision from the beginning (rumor has it that TBL's very first browser was read/write).

Yup, see screenshots etc c/o http://www.w3.org/People/Berners-Lee/WorldWideWeb.html

[[[
The broken X in the Tim's home page window means that the document has been edited and not yet saved. (A dirty flag). As a convenience, pressing Command/Shift/S would save back all modified web pages.
]]]

Dan