Re: [whatwg] Please review use cases relating to embedding micro-data in text/html

2009-04-28 Thread Ian Hickson
On Thu, 23 Apr 2009, Kjetil Kjernsmo wrote:
 
 I'm searching for new hardware for my desktop and most of the specs I do 
 not care about too much, but I've decided that I want a 45 nm CPU with 
 at least a 1333 MHz FSB and at least 2800 MHz clock frequency, and a 
 thermal energy of at most 65 W. The motherboard needs to have at least 2 
 PCI ports, unless it has an onboard Wifi card, and it needs to 
 accommodate for at least 12 GB of DDR3 RAM, which needs to match the FSB 
 frequency. Furthermore, all components should be well supported by Linux 
 and the RAID controller should have at least RAID acceleration.

 This is actually remarkably hard to achieve these days, I need to 
 manually search out all the components independently, and none of them 
 have information about the RAID controllers.

When you say none of them have information about the RAID controllers, 
what do you mean? The sites you looked at don't have that information?

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Please review use cases relating to embedding micro-data in text/html

2009-04-28 Thread Ian Hickson
On Sat, 25 Apr 2009, Charles McCathieNevile wrote:

 On Thu, 23 Apr 2009 22:46:09 +0200, Ian Hickson i...@hixie.ch wrote:
 
   * Shouldn't require the consumer to write XSLT or server-side code
  to process the annotated data.
 
 Does process here mean extract from the page, or something more?

Not sure. This requirement originally came from Daniel O'Connor in a blog 
comment here:

   
http://realtech.burningbird.net/semantic-web/semantic-markup/stop-justifying-rdfa

...where he said:

   Reasons for RDFa in HTML:
   [...]
   * I want to provide a machine readable interpretation of my data
   * I do not want to write XSLT, or server side code to transform my data 
 if I don't have to

My interpretation is that it means that he would like to not have to use 
XSLT to do anything with the data, whether extracting it or analysing it 
or anything.



Re: [whatwg] Please review use cases relating to embedding micro-data in text/html

2009-04-28 Thread Ian Hickson

(Please avoid cross-posting. I've bcc'ed public-html since this e-mail was 
originally sent to both whatwg and public-html, but the thread has mostly 
been on the whatwg list so far.)

On Sat, 25 Apr 2009, Charles McCathieNevile wrote:
  
  From the point of view of the HTML5 effort, what is needed is use cases,
  scenarios, and requirements, that don't in any way imply a particular
  solution, as in the list I posted, so that solutions can be evaluated.
 
 So how do the solutions get proposed, or do you already have a candidate list
 you have selected? What's the process here?

As with other issues, I intend to carefully examine the many suggestions 
that have already been put forward (RDFa and the various other forms of 
RDF, Microformats, the various extension mechanisms in HTML4, various 
domain-specific solutions, NLP-type solutions, automated search solutions 
that already exist, etc) as well as considering possible new solutions for 
specific problems. This will then result in a draft proposal for further 
discussion. There's no need to propose solutions yet, though; I'm pretty 
confident that every possible solution has already been brought up. :-)

(Thanks for your other comments btw, I've taken note of them.)



Re: [whatwg] Please review use cases relating to embedding micro-data in text/html

2009-04-28 Thread Eduard Pascual
On Thu, Apr 23, 2009 at 10:46 PM, Ian Hickson i...@hixie.ch wrote:
[...]
 Exposing known data types in a reusable way

   USE CASE: Exposing calendar events so that users can add those events to
   their calendaring systems.
[...]
   REQUIREMENTS:
[...]
     * Should be unlikely to get out of sync with prose on the page.
     * Machine-readable event data shouldn't be on a separate page than
       human-readable dates.
[...]
   ---

   USE CASE: Exposing contact details so that users can add people to their
   address books or social networking sites.
[...]
   REQUIREMENTS:
[...]
     * Data should not need to be duplicated between machine-readable and
       human-readable forms (i.e. the human-readable form should be
       machine-readable).
     * Machine-readable contact information shouldn't be on a separate page
       than human-readable contact information.
[...]
   ---

   USE CASE: Allow users to maintain bibliographies or otherwise keep track
   of sources of quotes or references.
[...]
   REQUIREMENTS:

     * Machine-readable bibliographic information shouldn't be on a separate
       page than human-readable bibliographic information.
[...]
   ---

   USE CASE: Help people searching for content to find content covered by
   licenses that suit their needs.
[...]
   REQUIREMENTS:
[...]
     * License information should be able to survive from one site to another
       as the data is transferred.
[...]
     * Machine-readable licensing information shouldn't be on a separate page
       than human-readable licensing information.
[...]
 ==

 Annotations

   USE CASE: Annotate structured data that HTML has no semantics for, and
   which nobody has annotated before, and may never again, for private use or
   use in a small self-contained community.
 [...]
   REQUIREMENTS:
 [...]
     * Machine-readable annotations shouldn't be on a separate page than
       human-readable annotations.
[...]
     * The syntax for adding this data should encourage the data to remain
       accurate when the page is changed.
     * The syntax should be resilient to intentional copy-and-paste
       authoring: people copying data into the page from a page that already
       has data should not have to know about any declarations far from the
       data.
     * The syntax should be resilient to unintentional copy-and-paste
       authoring: people copying markup from the page who do not know about
       these features should not inadvertently mark up their page with
       inapplicable data.

   ---
[...]
   USE CASE: Site owners want a way to provide enhanced search results to the
   engines, so that an entry in the search results page is more than just a
   bare link and snippet of text, and provides additional resources for users
   straight on the search page without them having to click into the page and
   discover those resources themselves.
[...]
   REQUIREMENTS:

     * Information for the search engine should be on the same page as
       information that would be shown to the user if the user visited the
       page.

 ==

 Cross-site communication

   USE CASE: Copy-and-paste should work between Web apps and native apps and
   between Web apps and other Web apps.

I have noticed (highlighted by the quoted fragments above) quite a bit
of recurrence of some of the requirements, namely:
- Information for the machine / agent / whatever should be on the same
page as information for the (human) user.
- copy-paste resilience
- (on some cases) Data shouldn't be duplicated for humans and for
machines (although this is not always achievable, for example with
dates).

There is a requirement that has been put forward previously [1], which
IMO may interact with these, and didn't show up on Ian's original
mail:
- Meta-data (or any additional markup or data used to allow the
machine to understand the actual information) shouldn't be redundantly
repeated.

Examples:

- An author puts up a page with contact information for several
people (for example, the people responsible for the website; a list of
entities that are somehow related to the website, like sponsors; or a
list of friends in a restricted-access social website, such as in
Microsoft's Live Spaces). Let's say that author puts this info in a
table, with the contact name on the first column, the e-mail address
on the second column, and so on, just because that's the kind of job
tables are for. Of course, the first row in the table would hold the
headers describing what each column means.
   The author *should* be able to tell the 

Re: [whatwg] Please review use cases relating to embedding micro-data in text/html

2009-04-28 Thread Ian Hickson
On Tue, 28 Apr 2009, Eduard Pascual wrote:
 
 There is a requirement that has been put forward previously [1], which 
 IMO may interact with these, and didn't show up on Ian's original mail:
 - Meta-data (or any additional markup or data used to allow the machine 
 to understand the actual information) shouldn't be redundantly repeated.

Noted, thanks.

It's going to be quite interesting to try to find a solution that actually 
fits all the requirements for some of these use cases...



Re: [whatwg] Please review use cases relating to embedding micro-data in text/html

2009-04-28 Thread Ian Hickson
On Tue, 28 Apr 2009, Kjetil Kjernsmo wrote:
 On Tuesday 28 April 2009, you wrote:
  When you say none of them have information about the RAID 
  controllers, what do you mean? The sites you looked at don't have 
  that information?
 
 Ah, sorry, this was unclear. What I mean is that this information is not 
 provided by manufacturers, you can find it summarized at some sites, but 
 often you need information gleaned from various forums around the net.

Ah, ok. So we can't rely on the information being marked up usefully then? 
I'm trying to work out what the requirements are... So far I have:


USE CASE: Allow the user to perform vertical searches across multiple 
sites even when the sites don't include the information the user wants.

SCENARIOS:

* Kjetil is searching for new hardware for his desktop and most of the 
specs he does not care about too much, but he's decided that he wants a 45 
nm CPU with at least a 1333 MHz FSB and at least 2800 MHz clock frequency, 
and a thermal energy of at most 65 W. The motherboard needs to have at 
least 2 PCI ports, unless it has an onboard Wifi card, and it needs to 
accommodate for at least 12 GB of DDR3 RAM, which needs to match the FSB 
frequency. Furthermore, all components should be well supported by Linux 
and the RAID controller should have at least RAID acceleration. None of 
the manufacturer sites have information about the RAID controllers; that 
information is only available from various forums.

* Fred is going to buy a property. The property needs to be close to the 
forest, yet close to a train station that will take him to town in less 
than half an hour. It needs to have a stable snow-fall in the winter, and 
access to tracks that are regularly prepared for XC skating. The property 
should be of a certain size, and in proximity to kindergarten and schools. It 
needs to have been regulated for residential use and have roads and the 
usual infrastructure. Furthermore, it needs to be on soil that is suitable 
for geothermal heating yet have a low abundance of uranium. It should have 
a good view of the fjord to the southeast.


REQUIREMENTS:

* Performing such searches should be feasible and cheap.

* It should be possible to perform such searches without relying on a 
third-party to seek out the information.

* The tool that collects information must not require the information to 
be marked up in some special way, since manufacturers don't include all 
the information, and users on forums (where the information can sometimes 
be found) are unlikely to mark it up in some particularly machine-readable 
way.
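
To make the first scenario a little more concrete, the CPU part of Kjetil's 
constraints could be written as a simple filter. This is only a rough 
sketch: the Cpu record, its field names, and the sample entries are all 
hypothetical, reading "thermal energy" as a wattage limit is an assumption, 
and it presumes the data has already been gathered in structured form, which 
is exactly the part the requirements above say we cannot rely on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    """Hypothetical record; the field names are assumptions for illustration."""
    name: str
    process_nm: int
    fsb_mhz: int
    clock_mhz: int
    tdp_watts: int  # assumption: "thermal energy" read as a wattage limit


def matches_kjetil(cpu: Cpu) -> bool:
    """Constraints from the scenario: 45 nm, FSB of at least 1333 MHz,
    clock of at least 2800 MHz, and at most 65 W."""
    return (
        cpu.process_nm == 45
        and cpu.fsb_mhz >= 1333
        and cpu.clock_mhz >= 2800
        and cpu.tdp_watts <= 65
    )


# Made-up sample data, not real products.
CANDIDATES = [
    Cpu("example-a", 45, 1333, 3000, 65),
    Cpu("example-b", 45, 1066, 2930, 65),
    Cpu("example-c", 65, 1333, 3000, 65),
    Cpu("example-d", 45, 1333, 2830, 95),
]

selected = [cpu.name for cpu in CANDIDATES if matches_kjetil(cpu)]
print(selected)
```

The filter itself is trivial; the open question in this use case is where 
records like these would come from when neither manufacturers nor forum 
posters mark the information up.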



Re: [whatwg] Please review use cases relating to embedding micro-data in text/html

2009-04-25 Thread Charles McCathieNevile

On Thu, 23 Apr 2009 22:46:09 +0200, Ian Hickson i...@hixie.ch wrote:

   USE CASE: Allow users to maintain bibliographies or otherwise keep  
  track of sources of quotes or references.

  SCENARIOS:

...

 * Chaals could improve the Opera intranet if he had a mechanism for
   identifying the original source of various parts of a page. (why?)


Because the page is put together by various different people (or  
processes), so knowing who is responsible for some bit that needs work is  
important in contacting the right person faster. (This isn't specific to  
Opera's intranet, of course. That happens to be the one I use most).



  REQUIREMENTS:
* Machine-readable bibliographic information shouldn't be on a  
   separate page than human-readable bibliographic information.

* The information should be convertible into a dedicated form (RDF,
   JSON, XML, BibTeX) in a consistent manner, so that tools that use
   this information separate from the pages on which it is found
   have a standard way of conveying the information.
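
As a rough illustration of what "convertible in a consistent manner" could 
mean, the sketch below takes one bibliographic record and emits it both as 
JSON and as a BibTeX entry. The record, its keys, and the helper names 
to_json and to_bibtex are hypothetical, and nothing here says how the record 
would be extracted from a page in the first place.

```python
import json

# Hypothetical record; keys and values are made up for illustration.
record = {
    "key": "example2009",
    "type": "article",
    "author": "A. N. Author",
    "title": "An Example Title",
    "year": "2009",
}


def to_json(rec: dict) -> str:
    """Serialise the record as JSON with a stable key order."""
    return json.dumps(rec, sort_keys=True)


def to_bibtex(rec: dict) -> str:
    """Serialise the same record as a BibTeX entry.

    Fields are emitted in sorted order so the output is deterministic.
    No escaping of BibTeX special characters is attempted here.
    """
    fields = {k: v for k, v in rec.items() if k not in ("key", "type")}
    body = ",\n".join(
        f"  {name} = {{{value}}}" for name, value in sorted(fields.items())
    )
    return f"@{rec['type']}{{{rec['key']},\n{body}\n}}"


print(to_json(record))
print(to_bibtex(record))
```

The point is only that both outputs are derived from the same record by 
fixed rules, so two tools reading the same page would end up with the same 
data.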


cheers

--
Charles McCathieNevile  Opera Software, Standards Group
je parle français -- hablo español -- jeg lærer norsk
http://my.opera.com/chaals   Try Opera: http://www.opera.com


Re: [whatwg] Please review use cases relating to embedding micro-data in text/html

2009-04-25 Thread Charles McCathieNevile

On Thu, 23 Apr 2009 22:46:09 +0200, Ian Hickson i...@hixie.ch wrote:

 * Shouldn't require the consumer to write XSLT or server-side code  
   to process the annotated data.


Does process here mean extract from the page, or something more?

cheers



Re: [whatwg] Please review use cases relating to embedding micro-data in text/html

2009-04-25 Thread Charles McCathieNevile

On Fri, 24 Apr 2009 05:53:09 +0200, Ian Hickson i...@hixie.ch wrote:


On Thu, 23 Apr 2009, Manu Sporny wrote:


I've looked over the list a couple of times and it's a good introduction
to the problem space.


It's not really intended to be an introduction, so much as a complete  
list of use cases that people want the spec to cover.

...

Oh. Then I think it is probably doomed to be incomplete - users not only  
do concrete things, but they do lots of different concrete things. This is  
possibly (probably?) a large enough set from which to derive general  
principles and clear goals.



From the point of view of the HTML5 effort, what is needed is use cases,
scenarios, and requirements, that don't in any way imply a particular
solution, as in the list I posted, so that solutions can be evaluated.

...

So how do the solutions get proposed, or do you already have a candidate  
list you have selected? What's the process here?


cheers

chaals



Re: [whatwg] Please review use cases relating to embedding micro-data in text/html

2009-04-24 Thread timeless
The contacts section uses event where it meant contact

On 4/23/09, Ian Hickson i...@hixie.ch wrote:

 [bcc'ed previous participants in this discussion]

 Earlier this year I asked for use cases that HTML5 did not yet cover, with
 an emphasis on use cases relating to semantic microdata. I list below the
 use cases and requirements that I derived from the response to that
 request, and from related discussions.

 I would appreciate it if people could review this list for errors or
 important omissions, before I go through the list to work out whether
 these use cases already have solutions, or whether we should have
 solutions for these use cases in HTML, or whether we should address these
 use cases with other technologies, or whatnot.

 I encourage people to focus on the use cases themselves, rather than on
 potential solutions; various solutions to all these use cases have already
 been argued in great detail and I have already read all those e-mails,
 blog comments, wiki faqs, etc, carefully.

 My primary concern right now is in making sure that these are indeed the
 use cases people care about, so that whatever we add to the spec can be
 carefully evaluated to make sure it is in fact solving the problems that
 we want solving.

 ==

 Exposing known data types in a reusable way

USE CASE: Exposing calendar events so that users can add those events to
their calendaring systems.

SCENARIOS:

  * A user visits the Avenue Q site and wants to make a note of when
tickets go on sale for the tour's stop in his home town. The site says
October 3rd, so the user clicks this and selects add to calendar,
which causes an entry to be added to his calendar.
  * A student is making a timeline of important events in Apple's history.
As he reads Wikipedia entries on the topic, he clicks on dates and
selects add to timeline, which causes an entry to be added to his
timeline.
  * TV guide listings - browsers should be able to expose to the user's
tools (e.g. calendar, DVR, TV tuner) the times that a TV show is on.
  * Paul sometimes gives talks on various topics, and announces them on
his blog. He would like to mark up these announcements with proper
scheduling information, so that his readers' software can
automatically obtain the scheduling information and add it to their
calendar. Importantly, some of the rendered data might be more
informal than the machine-readable data required to produce a calendar
event. Also of importance: Paul may want to annotate his event with a
combination of existing vocabularies and a new vocabulary of his own
design. (why?)
  * David can use the data in a web page to generate a custom browser UI
for adding an event to our calendaring software without using brittle
screen-scraping.

REQUIREMENTS:

  * Should be discoverable.
  * Should be compatible with existing calendar systems.
  * Should be unlikely to get out of sync with prose on the page.
  * Shouldn't require the consumer to write XSLT or server-side code to
read the calendar information.
  * Machine-readable event data shouldn't be on a separate page than
human-readable dates.
  * The information should be convertible into a dedicated form (RDF,
JSON, XML, iCalendar) in a consistent manner, so that tools that use
this information separate from the pages on which it is found have a
standard way of conveying the information.
  * Should be possible for different parts of an event to be given in
different parts of the page. For example, a page with calendar events
in columns (with each row giving the time, date, place, etc) should
still have unambiguous calendar events parseable from it.


 ---

USE CASE: Exposing contact details so that users can add people to their
address books or social networking sites.

SCENARIOS:

  * Instead of giving a colleague a business card, someone gives their
colleague a URL, and that colleague's user agent extracts basic
profile information such as the person's name along with references to
other people that person knows and adds the information into an
address book.
  * A scholar and teacher wants other scholars (and potentially students)
to be able to easily extract information about who he is to add it to
their contact databases.
  * Fred copies the name of one of his Facebook friends and pastes it
into his OS address book; the contact information is imported
automatically.
  * Fred copies the name of one of his Facebook friends and pastes it
into his Webmail's address book feature; the 

Re: [whatwg] Please review use cases relating to embedding micro-data in text/html

2009-04-23 Thread Manu Sporny
Ian Hickson wrote:
 [bcc'ed previous participants in this discussion]
 
 Earlier this year I asked for use cases that HTML5 did not yet cover, with 
 an emphasis on use cases relating to semantic microdata. I list below the 
 use cases and requirements that I derived from the response to that 
 request, and from related discussions.

 My primary concern right now is in making sure that these are indeed the 
 use cases people care about, so that whatever we add to the spec can be 
 carefully evaluated to make sure it is in fact solving the problems that 
 we want solving.

I've looked over the list a couple of times and it's a good introduction
to the problem space. For those that are new to the discussion, some of
these use cases are covered in more depth on the RDFa wiki[1]. The RDFa
wiki includes example markup and Javascript pseudo-code describing
consuming applications, but only for a few of the use cases. I'm
elaborating on one use case every day until all of them are done (it
should take about a month at this rate).

Ian, would it help if I continue to elaborate on the RDFa use cases on
the RDFa wiki? Or perhaps, I could merge these use cases into the RDFa
wiki and elaborate on the WHATWG micro-data use cases first? I'd like to
focus my effort on something that will benefit /both/ WHATWG and the
RDFa community. Thoughts?

-- manu

[1] http://rdfa.info/wiki/rdfa-use-cases

-- 
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: A Collaborative Distribution Model for Music
http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/


Re: [whatwg] Please review use cases relating to embedding micro-data in text/html

2009-04-23 Thread Ian Hickson
On Thu, 23 Apr 2009, Manu Sporny wrote:
 
 I've looked over the list a couple of times and it's a good introduction 
 to the problem space.

It's not really intended to be an introduction, so much as a complete list 
of use cases that people want the spec to cover.


 Ian, would it help if I continue to elaborate on the RDFa use cases on 
 the RDFa wiki? Or perhaps, I could merge these use cases into the RDFa 
 wiki and elaborate on the WHATWG micro-data use cases first? I'd like to 
 focus my effort on something that will benefit /both/ WHATWG and the 
 RDFa community. Thoughts?

From the point of view of the HTML5 effort, what is needed is use cases, 
scenarios, and requirements, that don't in any way imply a particular 
solution, as in the list I posted, so that solutions can be evaluated.

The rdfa.info wiki page was invaluable in the creation of the list I 
posted this morning -- I used that, as well as blog comments and about 
15,000 lines' worth of e-mails, in the creation of the list. I tried to 
make sure every use case mentioned was covered, so if anyone posted a use 
case to this mailing list, to the wiki, or to blogs on the subject in the 
past few months, that is not listed in that e-mail, I apologise -- please 
let me know so that I can add them. (It may be that I didn't understand 
the use case -- there's a couple I don't get, noted with (why?) in the 
e-mail sent this morning.)

The more concrete the use cases the better. Users do concrete things.
