Re: [whatwg] Please review use cases relating to embedding micro-data in text/html
On Thu, 23 Apr 2009, Kjetil Kjernsmo wrote:
> I'm searching for new hardware for my desktop and most of the specs I do not care about too much, but I've decided that I want a 45 nm CPU with at least a 1333 MHz FSB and at least 2800 MHz clock frequency, and a thermal energy of at most 65 W. The motherboard needs to have at least 2 PCI ports, unless it has an onboard Wifi card, and it needs to accommodate at least 12 GB of DDR3 RAM, which needs to match the FSB frequency. Furthermore, all components should be well supported by Linux and the RAID controller should have at least RAID acceleration.
>
> This is actually remarkably hard to achieve these days: I need to manually search out all the components independently, and none of them have information about the RAID controllers.

When you say "none of them have information about the RAID controllers", what do you mean? The sites you looked at don't have that information?

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] Please review use cases relating to embedding micro-data in text/html
On Sat, 25 Apr 2009, Charles McCathieNevile wrote:
> On Thu, 23 Apr 2009 22:46:09 +0200, Ian Hickson i...@hixie.ch wrote:
> > * Shouldn't require the consumer to write XSLT or server-side code to process the annotated data.
>
> Does "process" here mean extract from the page, or something more?

Not sure. This requirement originally came from Daniel O'Connor in a blog comment here:

http://realtech.burningbird.net/semantic-web/semantic-markup/stop-justifying-rdfa

...where he said:

> Reasons for RDFa in HTML: [...]
> * I want to provide a machine readable interpretation of my data
> * I do not want to write XSLT, or server side code to transform my data if I don't have to

My interpretation is that he would like to not have to use XSLT to do anything with the data, whether extracting it or analysing it or anything.

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] Please review use cases relating to embedding micro-data in text/html
(Please avoid cross-posting. I've bcc'ed public-html since this e-mail was originally sent to both whatwg and public-html, but the thread has mostly been on the whatwg list so far.)

On Sat, 25 Apr 2009, Charles McCathieNevile wrote:
> > From the point of view of the HTML5 effort, what is needed is use cases, scenarios, and requirements, that don't in any way imply a particular solution, as in the list I posted, so that solutions can be evaluated.
>
> So how do the solutions get proposed, or do you already have a candidate list you have selected? What's the process here?

As with other issues, I intend to carefully examine the many suggestions that have already been put forward (RDFa and the various other forms of RDF, Microformats, the various extension mechanisms in HTML4, various domain-specific solutions, NLP-type solutions, automated search solutions that already exist, etc) as well as considering possible new solutions for specific problems. This will then result in a draft proposal for further discussion.

There's no need to propose solutions yet, though; I'm pretty confident that every possible solution has already been brought up. :-)

(Thanks for your other comments btw, I've taken note of them.)

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] Please review use cases relating to embedding micro-data in text/html
On Thu, Apr 23, 2009 at 10:46 PM, Ian Hickson i...@hixie.ch wrote:
> [...]
> == Exposing known data types in a reusable way
>
> USE CASE: Exposing calendar events so that users can add those events to their calendaring systems.
> [...]
> REQUIREMENTS:
> [...]
> * Should be unlikely to get out of sync with prose on the page.
> * Machine-readable event data shouldn't be on a separate page than human-readable dates.
> [...]
> ---
> USE CASE: Exposing contact details so that users can add people to their address books or social networking sites.
> [...]
> REQUIREMENTS:
> [...]
> * Data should not need to be duplicated between machine-readable and human-readable forms (i.e. the human-readable form should be machine-readable).
> * Machine-readable contact information shouldn't be on a separate page than human-readable contact information.
> [...]
> ---
> USE CASE: Allow users to maintain bibliographies or otherwise keep track of sources of quotes or references.
> [...]
> REQUIREMENTS:
> * Machine-readable bibliographic information shouldn't be on a separate page than human-readable bibliographic information.
> [...]
> ---
> USE CASE: Help people searching for content to find content covered by licenses that suit their needs.
> [...]
> REQUIREMENTS:
> [...]
> * License information should be able to survive from one site to another as the data is transferred.
> [...]
> * Machine-readable licensing information shouldn't be on a separate page than human-readable licensing information.
> [...]
>
> == Annotations
>
> USE CASE: Annotate structured data that HTML has no semantics for, and which nobody has annotated before, and may never again, for private use or use in a small self-contained community.
> [...]
> REQUIREMENTS:
> [...]
> * Machine-readable annotations shouldn't be on a separate page than human-readable annotations.
> [...]
> * The syntax for adding this data should encourage the data to remain accurate when the page is changed.
> * The syntax should be resilient to intentional copy-and-paste authoring: people copying data into the page from a page that already has data should not have to know about any declarations far from the data.
> * The syntax should be resilient to unintentional copy-and-paste authoring: people copying markup from the page who do not know about these features should not inadvertently mark up their page with inapplicable data.
> ---
> [...]
> USE CASE: Site owners want a way to provide enhanced search results to the engines, so that an entry in the search results page is more than just a bare link and snippet of text, and provides additional resources for users straight on the search page without them having to click into the page and discover those resources themselves.
> [...]
> REQUIREMENTS:
> * Information for the search engine should be on the same page as information that would be shown to the user if the user visited the page.
>
> == Cross-site communication
>
> USE CASE: Copy-and-paste should work between Web apps and native apps and between Web apps and other Web apps.

I have noticed (highlighted by the quoted fragments above) quite a bit of recurrence of some of the requirements, namely:
- Information for the machine / agent / whatever should be on the same page as information for the (human) user.
- Copy-paste resilience.
- (In some cases) Data shouldn't be duplicated for humans and for machines (although this is not always achievable, for example with dates).

There is a requirement that has been put forward previously [1], which IMO may interact with these, and didn't show up in Ian's original mail:
- Meta-data (or any additional markup or data used to allow the machine to understand the actual information) shouldn't be redundantly repeated.

Examples: - An author puts up a page with contact information for several people (for example, the people responsible for the website; a list of entities that are somehow related to the website, like sponsors; or a list of friends in a restricted-access social website, such as in Microsoft's Live Spaces). Let's say that author puts this info in a table, with the contact name on the first column, the e-mail address on the second column, and so on, just because that's the kind of job tables are for. Of course, the first row in the table would hold the headers describing what each column means. The author *should* be able to tell the
Re: [whatwg] Please review use cases relating to embedding micro-data in text/html
On Tue, 28 Apr 2009, Eduard Pascual wrote:
> There is a requirement that has been put forward previously [1], which IMO may interact with these, and didn't show up in Ian's original mail:
> - Meta-data (or any additional markup or data used to allow the machine to understand the actual information) shouldn't be redundantly repeated.

Noted, thanks.

It's going to be quite interesting to try to find a solution that actually fits all the requirements for some of these use cases...

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] Please review use cases relating to embedding micro-data in text/html
On Tue, 28 Apr 2009, Kjetil Kjernsmo wrote:
> On Tuesday 28 April 2009, you wrote:
> > When you say "none of them have information about the RAID controllers", what do you mean? The sites you looked at don't have that information?
>
> Ah, sorry, this was unclear. What I mean is that this information is not provided by manufacturers, you can find it summarized at some sites, but often you need information gleaned from various forums around the net.

Ah, ok. So we can't rely on the information being marked up usefully then? I'm trying to work out what the requirements are... So far I have:

USE CASE: Allow the user to perform vertical searches across multiple sites even when the sites don't include the information the user wants.

SCENARIOS:

* Kjetil is searching for new hardware for his desktop and most of the specs he does not care about too much, but he's decided that he wants a 45 nm CPU with at least a 1333 MHz FSB and at least 2800 MHz clock frequency, and a thermal energy of at most 65 W. The motherboard needs to have at least 2 PCI ports, unless it has an onboard Wifi card, and it needs to accommodate at least 12 GB of DDR3 RAM, which needs to match the FSB frequency. Furthermore, all components should be well supported by Linux and the RAID controller should have at least RAID acceleration. None of the manufacturer sites have information about the RAID controllers; that information is only available from various forums.

* Fred is going to buy a property. The property needs to be close to the forest, yet close to a train station that will take him to town in less than half an hour. It needs to have a stable snow-fall in the winter, and access to tracks that are regularly prepared for XC skating. The property should be of a certain size, and in proximity to kindergarten and schools. It needs to have been regulated for residential use and have roads and the usual infrastructure. Furthermore, it needs to be on soil that is suitable for geothermal heating yet have a low abundance of uranium. It should have a good view of the fjord to the southeast.

REQUIREMENTS:

* Performing such searches should be feasible and cheap.
* It should be possible to perform such searches without relying on a third-party to seek out the information.
* The tool that collects information must not require the information to be marked up in some special way, since manufacturers don't include all the information, and users on forums (where the information can sometimes be found) are unlikely to mark it up in some particularly machine-readable way.

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
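
To make the hardware scenario in the message above concrete, the following is a minimal Python sketch of Kjetil's constraints written as predicates over component records. It assumes the specifications have already been gathered into structured records by some means, which is exactly the part the requirements say cannot be relied on. The class names, field names, and sample data are hypothetical and do not come from the thread.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    # Hypothetical record; field names are illustrative only.
    name: str
    process_nm: int
    fsb_mhz: int
    clock_mhz: int
    thermal_watts: int


@dataclass
class Motherboard:
    # Hypothetical record; field names are illustrative only.
    name: str
    pci_ports: int
    onboard_wifi: bool
    max_ddr3_gb: int


def cpu_ok(cpu: Cpu) -> bool:
    """45 nm, FSB >= 1333 MHz, clock >= 2800 MHz, at most 65 W."""
    return (
        cpu.process_nm == 45
        and cpu.fsb_mhz >= 1333
        and cpu.clock_mhz >= 2800
        and cpu.thermal_watts <= 65
    )


def board_ok(board: Motherboard) -> bool:
    """At least 2 PCI ports unless Wifi is onboard; room for >= 12 GB DDR3."""
    enough_pci = board.pci_ports >= 2 or board.onboard_wifi
    return enough_pci and board.max_ddr3_gb >= 12


# Made-up sample data, used only to exercise the predicates.
cpus = [
    Cpu("cpu-a", 45, 1333, 3000, 65),
    Cpu("cpu-b", 65, 1333, 3000, 65),
    Cpu("cpu-c", 45, 1333, 2660, 65),
]
boards = [
    Motherboard("board-a", 1, True, 16),
    Motherboard("board-b", 1, False, 16),
    Motherboard("board-c", 3, False, 8),
    Motherboard("board-d", 2, False, 12),
]

matching_cpus = [c.name for c in cpus if cpu_ok(c)]
matching_boards = [b.name for b in boards if board_ok(b)]
print(matching_cpus, matching_boards)
```

The filtering itself is trivial; the sketch mainly shows that the hard part of the use case is obtaining records like these from manufacturer sites and forum posts, not evaluating the constraints.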
Re: [whatwg] Please review use cases relating to embedding micro-data in text/html
On Thu, 23 Apr 2009 22:46:09 +0200, Ian Hickson i...@hixie.ch wrote:
> USE CASE: Allow users to maintain bibliographies or otherwise keep track of sources of quotes or references.
>
> SCENARIOS:
> ...
> * Chaals could improve the Opera intranet if he had a mechanism for identifying the original source of various parts of a page. (why?)

Because the page is put together by various different people (or processes), so knowing who is responsible for some bit that needs work is important in contacting the right person faster. (This isn't specific to Opera's intranet, of course. That happens to be the one I use most.)

> REQUIREMENTS:
> * Machine-readable bibliographic information shouldn't be on a separate page than human-readable bibliographic information.
> * The information should be convertible into a dedicated form (RDF, JSON, XML, BibTeX) in a consistent manner, so that tools that use this information separate from the pages on which it is found have a standard way of conveying the information.

cheers

--
Charles McCathieNevile
Opera Software, Standards Group
je parle français -- hablo español -- jeg lærer norsk
http://my.opera.com/chaals
Try Opera: http://www.opera.com
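
As an illustration of the "convertible into a dedicated form" requirement quoted in the message above, here is a minimal Python sketch that turns one bibliographic record into both JSON and a BibTeX entry. It assumes the record has already been extracted from a page into a flat dictionary; the function name `to_bibtex`, the citation key, and the sample field values are hypothetical, and the sketch does no escaping of special characters.

```python
import json


def to_bibtex(key: str, entry_type: str, fields: dict) -> str:
    """Render a flat dict of string fields as a single BibTeX entry."""
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        # Each field becomes: name = {value},
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Made-up record standing in for data extracted from a page.
record = {
    "author": "A. Author",
    "title": "An Example Page",
    "year": "2009",
}

bibtex = to_bibtex("example2009", "misc", record)
as_json = json.dumps(record, sort_keys=True)

print(bibtex)
print(as_json)
```

Both outputs are derived from the same record by fixed rules, which is one reading of "in a consistent manner": a tool that works away from the page can rely on the mapping rather than on the page's layout.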
Re: [whatwg] Please review use cases relating to embedding micro-data in text/html
On Thu, 23 Apr 2009 22:46:09 +0200, Ian Hickson i...@hixie.ch wrote:
> * Shouldn't require the consumer to write XSLT or server-side code to process the annotated data.

Does "process" here mean extract from the page, or something more?

cheers

--
Charles McCathieNevile
Opera Software, Standards Group
je parle français -- hablo español -- jeg lærer norsk
http://my.opera.com/chaals
Try Opera: http://www.opera.com
Re: [whatwg] Please review use cases relating to embedding micro-data in text/html
On Fri, 24 Apr 2009 05:53:09 +0200, Ian Hickson i...@hixie.ch wrote:
> On Thu, 23 Apr 2009, Manu Sporny wrote:
> > I've looked over the list a couple of times and it's a good introduction to the problem space.
>
> It's not really intended to be an introduction, so much as a complete list of use cases that people want the spec to cover.
> ...

Oh. Then I think it is probably doomed to be incomplete - users not only do concrete things, but they do lots of different concrete things. This is possibly (probably?) a large enough set from which to derive general principles and clear goals.

> From the point of view of the HTML5 effort, what is needed is use cases, scenarios, and requirements, that don't in any way imply a particular solution, as in the list I posted, so that solutions can be evaluated.
> ...

So how do the solutions get proposed, or do you already have a candidate list you have selected? What's the process here?

cheers

chaals

--
Charles McCathieNevile
Opera Software, Standards Group
je parle français -- hablo español -- jeg lærer norsk
http://my.opera.com/chaals
Try Opera: http://www.opera.com
Re: [whatwg] Please review use cases relating to embedding micro-data in text/html
The contacts section uses "event" where it meant "contact".

On 4/23/09, Ian Hickson i...@hixie.ch wrote:
> [bcc'ed previous participants in this discussion]
>
> Earlier this year I asked for use cases that HTML5 did not yet cover, with an emphasis on use cases relating to semantic microdata. I list below the use cases and requirements that I derived from the response to that request, and from related discussions.
>
> I would appreciate it if people could review this list for errors or important omissions, before I go through the list to work out whether these use cases already have solutions, or whether we should have solutions for these use cases in HTML, or whether we should address these use cases with other technologies, or whatnot.
>
> I encourage people to focus on the use cases themselves, rather than on potential solutions; various solutions to all these use cases have already been argued in great detail and I have already read all those e-mails, blog comments, wiki faqs, etc, carefully.
>
> My primary concern right now is in making sure that these are indeed the use cases people care about, so that whatever we add to the spec can be carefully evaluated to make sure it is in fact solving the problems that we want solving.
>
> == Exposing known data types in a reusable way
>
> USE CASE: Exposing calendar events so that users can add those events to their calendaring systems.
>
> SCENARIOS:
> * A user visits the Avenue Q site and wants to make a note of when tickets go on sale for the tour's stop in his home town. The site says "October 3rd", so the user clicks this and selects "add to calendar", which causes an entry to be added to his calendar.
> * A student is making a timeline of important events in Apple's history. As he reads Wikipedia entries on the topic, he clicks on dates and selects "add to timeline", which causes an entry to be added to his timeline.
> * TV guide listings - browsers should be able to expose to the user's tools (e.g. calendar, DVR, TV tuner) the times that a TV show is on.
> * Paul sometimes gives talks on various topics, and announces them on his blog. He would like to mark up these announcements with proper scheduling information, so that his readers' software can automatically obtain the scheduling information and add it to their calendar. Importantly, some of the rendered data might be more informal than the machine-readable data required to produce a calendar event. Also of importance: Paul may want to annotate his event with a combination of existing vocabularies and a new vocabulary of his own design. (why?)
> * David can use the data in a web page to generate a custom browser UI for adding an event to our calendaring software without using brittle screen-scraping.
>
> REQUIREMENTS:
> * Should be discoverable.
> * Should be compatible with existing calendar systems.
> * Should be unlikely to get out of sync with prose on the page.
> * Shouldn't require the consumer to write XSLT or server-side code to read the calendar information.
> * Machine-readable event data shouldn't be on a separate page than human-readable dates.
> * The information should be convertible into a dedicated form (RDF, JSON, XML, iCalendar) in a consistent manner, so that tools that use this information separate from the pages on which it is found have a standard way of conveying the information.
> * Should be possible for different parts of an event to be given in different parts of the page. For example, a page with calendar events in columns (with each row giving the time, date, place, etc) should still have unambiguous calendar events parseable from it.
> ---
> USE CASE: Exposing contact details so that users can add people to their address books or social networking sites.
>
> SCENARIOS:
> * Instead of giving a colleague a business card, someone gives their colleague a URL, and that colleague's user agent extracts basic profile information such as the person's name along with references to other people that person knows and adds the information into an address book.
> * A scholar and teacher wants other scholars (and potentially students) to be able to easily extract information about who he is to add it to their contact databases.
> * Fred copies the name of one of his Facebook friends and pastes it into his OS address book; the contact information is imported automatically.
> * Fred copies the name of one of his Facebook friends and pastes it into his Webmail's address book feature; the
Re: [whatwg] Please review use cases relating to embedding micro-data in text/html
Ian Hickson wrote:
> [bcc'ed previous participants in this discussion]
>
> Earlier this year I asked for use cases that HTML5 did not yet cover, with an emphasis on use cases relating to semantic microdata. I list below the use cases and requirements that I derived from the response to that request, and from related discussions.
>
> My primary concern right now is in making sure that these are indeed the use cases people care about, so that whatever we add to the spec can be carefully evaluated to make sure it is in fact solving the problems that we want solving.

I've looked over the list a couple of times and it's a good introduction to the problem space. For those that are new to the discussion, some of these use cases are covered in more depth on the RDFa wiki[1].

The RDFa wiki includes example markup and JavaScript pseudo-code describing consuming applications, but only for a few of the use cases. I'm elaborating on one use case every day until all of them are done (it should take about a month at this rate).

Ian, would it help if I continue to elaborate on the RDFa use cases on the RDFa wiki? Or perhaps, I could merge these use cases into the RDFa wiki and elaborate on the WHATWG micro-data use cases first? I'd like to focus my effort on something that will benefit /both/ WHATWG and the RDFa community. Thoughts?

-- manu

[1] http://rdfa.info/wiki/rdfa-use-cases

--
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: A Collaborative Distribution Model for Music
http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
Re: [whatwg] Please review use cases relating to embedding micro-data in text/html
On Thu, 23 Apr 2009, Manu Sporny wrote:
> I've looked over the list a couple of times and it's a good introduction to the problem space.

It's not really intended to be an introduction, so much as a complete list of use cases that people want the spec to cover.

> Ian, would it help if I continue to elaborate on the RDFa use cases on the RDFa wiki? Or perhaps, I could merge these use cases into the RDFa wiki and elaborate on the WHATWG micro-data use cases first? I'd like to focus my effort on something that will benefit /both/ WHATWG and the RDFa community. Thoughts?

From the point of view of the HTML5 effort, what is needed is use cases, scenarios, and requirements, that don't in any way imply a particular solution, as in the list I posted, so that solutions can be evaluated.

The rdfa.info wiki page was invaluable in the creation of the list I posted this morning -- I used that, as well as blog comments and about 15,000 lines' worth of e-mails, in the creation of the list. I tried to make sure every use case mentioned was covered, so if anyone posted a use case to this mailing list, to the wiki, or to blogs on the subject in the past few months, that is not listed in that e-mail, I apologise -- please let me know so that I can add them. (It may be that I didn't understand the use case -- there are a couple I don't get, noted with "(why?)" in the e-mail sent this morning.)

The more concrete the use cases the better. Users do concrete things.

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.