Re: [whatwg] header for JSON-LD ???

2017-07-23 Thread Dan Brickley
Hypothetically, if search engines were to start picking up JSON-LD from
linked files, which link rel type would this group consider most
appropriate?

Dan

On 23 July 2017 at 06:12, Jeffrey Yasskin wrote:

> 2¢: This list tends to disapprove of JSON-LD, so you should probably first
> run your proposal by a group that likes JSON-LD. Maybe
> public-rdf-comme...@w3.org referenced from https://www.w3.org/TR/json-ld/?
> Or an issue against https://github.com/json-ld/json-ld.org?
>
> Jeffrey
>
> On Fri, Jul 21, 2017 at 2:21 PM, Michael A. Peters wrote:
>
> > I am (finally) starting to implement JSON-LD on a site; it generates a lot
> > of data that is useless to the typical non-bot user.
> >
> > I'd prefer to only stick it in the head when the client is a crawler that
> > wants it.
> >
> > Wouldn't it be prudent if agents that want JSON-LD can send a standardized
> > header as part of their request, so web apps can optionally choose to only
> > send the JSON-LD data to clients that want it? It seems it would be kinder
> > to mobile users on limited bandwidth if they didn't have to download a
> > bunch of JSON that is meaningless to them.
> >
> > Is this the right group to suggest that?
> >
>
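
A minimal sketch of the server-side idea in the quoted message, in Python. No such request header is standardized anywhere in this thread; purely as an assumption, the sketch treats a client that lists the application/ld+json media type in its Accept header as one that "wants" JSON-LD. The names wants_json_ld and render_head are hypothetical.

```python
import json


def wants_json_ld(headers: dict) -> bool:
    """Return True if the client opted in to JSON-LD.

    Assumption (not from the thread): opting in means listing
    application/ld+json in the Accept request header.
    """
    accept = ""
    for name, value in headers.items():
        if name.lower() == "accept":
            accept = value
    # Drop parameters such as ";q=0.9" and compare bare media types.
    media_types = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    return "application/ld+json" in media_types


def render_head(headers: dict, data: dict) -> str:
    """Build a <head> that embeds JSON-LD only for clients that asked for it."""
    head = "<head><title>Example</title>"
    if wants_json_ld(headers):
        head += '<script type="application/ld+json">' + json.dumps(data) + "</script>"
    return head + "</head>"


crawler = {"Accept": "text/html, application/ld+json;q=0.9"}
browser = {"Accept": "text/html,application/xhtml+xml"}
doc = {"@context": "http://schema.org", "@type": "LocalBusiness"}
print(render_head(crawler, doc))
print(render_head(browser, doc))
```

A real deployment that varies the response on a request header would presumably also need to send a matching Vary header so that caches keep the two variants apart.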


Re: [whatwg] inverse property mechanism for Microdata?

2014-03-19 Thread Dan Brickley
On 17 March 2014 21:15, Ian Hickson i...@hixie.ch wrote:
 On Mon, 17 Mar 2014, Dan Brickley wrote:

 We discussed this (and the -inv suggestion) at schema.org again, and the
 consensus there was that we'd like to have the search engines proceed
 with accepting an experimental/proposed 'inverse itemprop' attribute,
 rather than work around its absence.

 So the idea here is that the itemprop-up (or whatever -- it would be good to
 get a more intuitive name, not sure what to call it though) would have to
 be specified in conjunction with the itemscope= attribute on a top-level
 microdata item whose element had an ancestor that itself creates an item,
 and would actually specify a property on the inner item, whose value was
 the outer item?

 This is what the example would look like if I'm understanding this right:

   <div itemscope itemtype="http://schema.org/LocalBusiness">
     <h1><span itemprop="name">(Entity A) Beachwalk Beachwear &amp;
     Giftware</span></h1>
     <span itemprop="description"> A superb collection of fine gifts and
     clothing to accent your stay in Mexico Beach.</span>
     Phone: <span itemprop="telephone">850-648-4200</span>

     <div itemscope itemtype="http://schema.org/LocalBusiness"
          itemprop-up="containedIn">
       <h2><span itemprop="name">(Entity B) The tiny store within a
       store</span></h2>
       <span itemprop="description"> A superb collection of tiny clothes,
       from the store within the store.</span>
       Phone: <span itemprop="telephone">123-456-7890</span>
     </div>

   </div>

 It's not too bad, I guess.

Yes. I notice that the words we were playing with at schema.org relate
to the underlying graph data model (itemprop-inverse, -reverse etc.),
whereas your draft name, itemprop-up, is about the markup hierarchy.

 My main concern is that this seems to solve a
 very narrow use case for non-tree structures, but doesn't take into
 account the many, many other non-tree structures.

Yup, there are some cases where this can be addressed through the
rigorous use of entity IDs in itemid, as you sketch below. That would
be relatively new territory for schema.org and for publishers. Perhaps
there is an attribute name we can find that would leave the door open
to more use cases, e.g. itemprop-backwards rather than
itemprop-up. It seems reasonable to try to address relationships
between sibling elements too.

Something like (trying out -backwards instead of -up, to allow for
non-hierarchical usage):

<div itemid="bigshop" itemscope itemtype="http://schema.org/LocalBusiness">
  <h1><span itemprop="name">(Entity A) Beachwalk Beachwear &amp;
  Giftware</span></h1>
</div>
<div itemscope itemtype="http://schema.org/Pharmacy">
  <meta itemprop-backwards="containedIn" itemid="bigshop" />
  <h2><span itemprop="name">Tiny pharmacy store within a store</span></h2>
</div>

?

Can we use itemid in that way, to give a property value too? I don't
see itemid used much in the wild, and the spec only mentions its use
for the item having the property, rather than for supplying the
value of a property.

 For example, consider
 the case of a TV Episode with an Actor:

<div itemscope itemtype="http://schema.org/Episode">
 ...
 <div itemprop="actor"
  itemscope itemtype="http://schema.org/Person">
  ...
 </div>
</div>

 ...now suppose it's marked up the other way around:

<div itemscope itemtype="http://schema.org/Person">
 ...
 <div itemprop-up="actor"
  itemscope itemtype="http://schema.org/Episode">
  ...
 </div>
</div>

 So far so good. But what if there's two episodes with two actors, and the
 page just lists both episodes and both actors, and wants to
 cross-reference both episodes to both actors?

 itemprop-up (or whatever we call it) can't help there. itemref= can help
 in some simple cases, but as you pointed out, it soon gets out of hand.

 Microdata actually already has a solution to this. The vocabulary can
 define an ID for each item using itemid=, and can define multiple items
 having the same ID as being the same conceptual item. Thus:

<!-- first episode -->
<div itemscope itemtype="http://schema.org/Episode">
 ...
 <div itemprop="actor"
  itemscope itemtype="http://schema.org/Person"
  itemid="http://.../person/123"></div>
 <div itemprop="actor"
  itemscope itemtype="http://schema.org/Person"
  itemid="http://.../person/456"></div>
</div>

<!-- second episode -->
<div itemscope itemtype="http://schema.org/Episode">
 ...
 <div itemprop="actor"
  itemscope itemtype="http://schema.org/Person"
  itemid="http://.../person/123"></div>
 <div itemprop="actor"
  itemscope itemtype="http://schema.org/Person"
  itemid="http://.../person/456"></div>
</div>

<!-- actors -->
<div itemscope itemtype="http://schema.org/Person"
 itemid="http://.../person/123">
 ...
</div>
<div itemscope itemtype="http://schema.org/Person"
 itemid="http://.../person/456">
 ...
</div>
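
A rough sketch, in Python, of what "multiple items having the same ID being the same conceptual item" could mean for a consumer. This is only one possible reading: as noted in the thread, how shared itemids are interpreted is up to the vocabulary. The dict layout, the merge_by_itemid name and the example.org URLs are assumptions made for illustration (the URLs in the markup above are elided and are left that way).

```python
def merge_by_itemid(items):
    """Merge items that share an itemid into one conceptual item.

    Each item is a dict with an optional "itemid", a "type", and a
    "properties" dict mapping a property name to a list of values.
    Items without an itemid are kept as separate items.
    """
    merged = {}
    anonymous = []
    for item in items:
        item_id = item.get("itemid")
        if item_id is None:
            anonymous.append(item)
            continue
        target = merged.setdefault(
            item_id, {"itemid": item_id, "type": item["type"], "properties": {}}
        )
        for name, values in item.get("properties", {}).items():
            known = target["properties"].setdefault(name, [])
            for value in values:
                if value not in known:
                    known.append(value)
    return list(merged.values()) + anonymous


# Hypothetical extraction result: person 123 appears twice in the page,
# once as a bare reference and once with a name.
items = [
    {"itemid": "http://example.org/person/123", "type": "Person", "properties": {}},
    {"itemid": "http://example.org/person/123", "type": "Person",
     "properties": {"name": ["Actor One"]}},
    {"itemid": "http://example.org/person/456", "type": "Person", "properties": {}},
]
people = merge_by_itemid(items)
print(len(people))
```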

 This also enables the data to be spread across multiple

Re: [whatwg] inverse property mechanism for Microdata?

2014-03-17 Thread Dan Brickley
Hi Ian, HTML people,

On 31 January 2014 23:45, Ian Hickson i...@hixie.ch wrote:
 On Fri, 31 Jan 2014, Dan Brickley wrote:

 We'd (schema.org 'we') like to make a public proposal to update
 Microdata with a syntax for expressing inverse properties/relationships.
 [...]

 Here's an example with 'containedIn'. The idea is that we want to
 express that the LocalBusiness (i.e. Place) Entity B is 'containedIn'
 Entity A. The example I show here expresses the reverse, incorrectly. So
 we're looking for a change to the markup that would turn this example
 into one that said "The LocalBusiness Entity B is containedIn the
 LocalBusiness Entity A":

 <div itemscope itemtype="http://schema.org/LocalBusiness">
   <h1><span itemprop="name">(Entity A) Beachwalk Beachwear &amp;
   Giftware</span></h1>
   <span itemprop="description"> A superb collection of fine gifts and
   clothing to accent your stay in Mexico Beach.</span>
   Phone: <span itemprop="telephone">850-648-4200</span>

   <div itemprop="containedIn" itemscope
        itemtype="http://schema.org/LocalBusiness">
     <h2><span itemprop="name">(Entity B) The tiny store within a
     store</span></h2>
     <span itemprop="description"> A superb collection of tiny clothes,
     from the store within the store.</span>
     Phone: <span itemprop="telephone">123-456-7890</span>
   </div>

 </div>

 This is actually possible today:

  <div itemscope itemtype="http://schema.org/LocalBusiness"
       id="a" itemprop="containedIn">
    <h1><span itemprop="name">(Entity A) Beachwalk Beachwear &amp;
    Giftware</span></h1>
    <span itemprop="description"> A superb collection of fine gifts and
    clothing to accent your stay in Mexico Beach.</span>
    Phone: <span itemprop="telephone">850-648-4200</span>

    <div itemscope itemref="a" itemtype="http://schema.org/LocalBusiness">
      <h2><span itemprop="name">(Entity B) The tiny store within a
      store</span></h2>
      <span itemprop="description"> A superb collection of tiny clothes,
      from the store within the store.</span>
      Phone: <span itemprop="telephone">123-456-7890</span>
    </div>

  </div>

 The trick here is to turn the inner item into the top-level microdata
 item, and use itemref= to have that inner item point to the outer item.

You're right; it is indeed possible. However it is perhaps a little
too clever. I've tried it on a few colleagues, and it didn't 'click'
with anyone yet.

We discussed this (and the -inv suggestion) at schema.org again, and
the consensus there was that we'd like to have the search engines
proceed with accepting an experimental/proposed 'inverse itemprop'
attribute, rather than work around its absence.


 (This works great unless you want two items to refer to the same third
 item using different properties, but that's something microdata can't do
 in general, since it's based on a tree structure, not a graph structure.
 To address that particular problem, you need a vocabulary that defines
 how itemid= works; at that point, you can just have the same underlying
 item represented as multiple microdata items in the document by having all
 the items share the same ID. But how exactly that is to be interpreted is
 something the vocabulary has to define.)

 One response is that the markup could be reorganized.

 That's basically what the above does, but without moving the elements
 around in the DOM. (itemref= is basically all about making the microdata
 model work around constraints coming from the author's preferred DOM.)

(Yup.)


 Another reasonable response to this is 'well, perhaps you should have a
 property (instead or in addition) called "geospatiallyContains", or
 "containerOf" or "contains", or "rev_containedIn" for this usage
 scenario'?

 That is another option, similar to the parenthetical itemid= note above
 -- you could just have the vocabulary define that for every property whose
 value is an item, the item type that that property can point to has
 another property with the same name plus a fixed suffix, like -inv, that
 inverses the relationship. That would make the above look like:

  <div itemscope itemtype="http://schema.org/LocalBusiness">
    <h1><span itemprop="name">(Entity A) Beachwalk Beachwear &amp;
    Giftware</span></h1>
    <span itemprop="description"> A superb collection of fine gifts and
    clothing to accent your stay in Mexico Beach.</span>
    Phone: <span itemprop="telephone">850-648-4200</span>

    <div itemprop="containedIn-inv"
         itemscope itemtype="http://schema.org/LocalBusiness">
      <h2><span itemprop="name">(Entity B) The tiny store within a
      store</span></h2>
      <span itemprop="description"> A superb collection of tiny clothes,
      from the store within the store.</span>
      Phone: <span itemprop="telephone">123-456-7890</span>
    </div>
  </div>

This is easier to understand than itemref, but still involves creating
100s of additional properties instead of just one new piece of syntax.
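
For concreteness, here is a small Python sketch of how a consumer might treat the suffix convention discussed above: a property name ending in "-inv" is rewritten to the forward property with subject and value swapped. The triple representation and the normalize function are assumptions made for illustration, not anything a search engine is known to implement.

```python
INVERSE_SUFFIX = "-inv"  # the fixed suffix floated in the quoted text


def normalize(subject, prop, obj):
    """Rewrite an '-inv' property into the forward property with swapped ends."""
    if prop.endswith(INVERSE_SUFFIX):
        return (obj, prop[: -len(INVERSE_SUFFIX)], subject)
    return (subject, prop, obj)


# In the markup above, the outer item (Entity A) carries containedIn-inv
# with the inner item (Entity B) as its value.
statement = normalize("Entity A", "containedIn-inv", "Entity B")
print(statement)
```

The rewrite itself is mechanical; the cost being pointed out is on the vocabulary side, where every item-valued property would need a second, suffixed name.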

 We have tried this and in a few cases we have included pairs of inverse
 properties in schema.org, e.g. we have alumni and an inverse,
 alumniOf.  In designing schemas we have found it consistently

Re: [whatwg] Supporting more address levels in autocomplete

2014-02-24 Thread Dan Brickley
On 24 Feb 2014 05:17, Charles McCathie Nevile cha...@yandex-team.ru
wrote:

 On Sat, 22 Feb 2014 05:05:06 +0100, Ian Hickson i...@hixie.ch wrote:

 On Fri, 21 Feb 2014, Kevin Marks wrote:

 On 21 Feb 2014 17:03, Ian Hickson i...@hixie.ch wrote:
   Those names come from vcard - if adding a new one, consider how to
   model it in vcard too. Note that UK addresses can have this too - eg
   3 high street, Kenton, Harrow, Middlesex, UK
 
  That's actually a bogus UK address. I'm not sure exactly which town
  you meant that to be in, but official UK addresses never have more
  than two region levels, and usually only one (the post town). The
  only time they have two is when the post town has two streets with the
  same name.

 The real address, where I grew up,  was:
 2 Melbury Road, Kenton, Harrow, Middlesex, HA3 9RA


 Today, the address of that building is:

2 Melbury Rd
Harrow
HA3 9RA


 Damn humans, not following specs. Actually UK addresses have a huge
 amount of leeway, as they are routed by postcode in the main (though I
 did receive a postcard addressed to Kevin, Sidney, Cambridge once).


 The post office will deal with all kinds of stuff, sure. But Web forms
 only have to accept the formal address format, which in the UK only ever
 has a street, a locality (sometimes), a post town, and a post code.


 That depends on whether you want to force your customers to think like
the Post Office, or whether you prefer to be responsive to your customers.
Speaking without data, I suspect that nervousness at not being able to put
*what someone thinks* is their address translates fairly readily into a
certain amount of failure to proceed with a transaction.

 Providing specification purity over the concerns of both users and
developers trying to use the Web to successfully interact with them seems
like a pretty basic mistake to me.

Who is using the data? Just post offices? Or taxi drivers, pizza delivery
bikers, pedestrians?

Dan

 cheers

 Chaals

 --
 Charles McCathie Nevile - Consultant (web standards) CTO Office, Yandex
   cha...@yandex-team.ru Find more at http://yandex.com


[whatwg] inverse property mechanism for Microdata?

2014-01-31 Thread Dan Brickley
Hi folks. I'm relaying this from the schema.org collaboration,
probably the main user of HTML's Microdata mechanism.

We'd (schema.org 'we') like to make a public proposal to update
Microdata with a syntax for expressing inverse
properties/relationships. FWIW other notations that schema.org
supports (JSON-LD and RDFa) have such mechanisms ([1],[2]).

At schema.org we are repeatedly running into situations where we have
a need for properties to be used in reverse direction. There are 630
or so properties defined currently (and a similar number of types);
see listing at http://schema.org/docs/full.html. Inverse properties
are something of a corner case, but a persistent one.

By inverse, I refer to scenarios where there is a pair of
properties (relationship types), e.g. 'foo' and 'bar', such that
whenever some entity-1 has a 'foo' relationship to an entity-2, then by
definition, entity-2 will have a 'bar' relationship to entity-1. We'd
like to avoid the need to give 'bar' a specific name, and instead be
able to in effect just say 'the inverse of foo'.

e.g. perhaps entity-1 is a shop, entity-2 is another shop, and foo =
'containedIn', bar = 'containsWithin', indicating that the one shop
is inside the other. Or perhaps entity-1 is a school, entity-2 is a
celebrity, and foo = 'alumni', bar = 'alumniOf'. Schema.org would like
Microdata syntax to be extended somehow, to allow a single property
name to be used regardless of whether the markup nesting structure
emphasises entity-1 or entity-2.
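
To make the definition concrete, here is a minimal Python sketch of the status quo that the proposal wants to avoid: every inverse has to be given its own name in a lookup table before the reverse statement can be derived. The tuple representation and the add_inverse_triples name are assumptions for illustration; alumni/alumniOf is the pair mentioned in this message.

```python
# Named inverse pairs; each direction needs its own entry.
INVERSE_OF = {"alumni": "alumniOf", "alumniOf": "alumni"}


def add_inverse_triples(triples):
    """Return the triples plus the inverse statement for each known pair."""
    result = list(triples)
    for subject, prop, obj in triples:
        inverse = INVERSE_OF.get(prop)
        if inverse is not None and (obj, inverse, subject) not in result:
            result.append((obj, inverse, subject))
    return result


triples = [("school", "alumni", "celebrity")]
expanded = add_inverse_triples(triples)
print(expanded)
```

A property with no entry in the table (containedIn, say) gets no reverse statement at all, which is the gap a syntax-level "read this property backwards" marker would close.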

For more example topics, here are some of the properties we define.

http://schema.org/containedIn (which relates a smaller place to a
larger containing place);
http://schema.org/member http://schema.org/alumni
http://schema.org/author http://schema.org/performerIn
http://schema.org/worksFor http://schema.org/employee
http://schema.org/founder http://schema.org/member ... and various others,
often role-related or where two independent entities have a
relationship that is being described, and where neither entity is
necessarily the primary focus in all markup.

For a property like alumni it could reasonably be used either in a
paragraph that was describing the educational institution, or
describing a (famous) person who attended it.  We would like to have a
standard markup convention for using a single named property, i.e.
being able to indicate sometimes that it is to be read in reversed
direction. In other words we want to avoid having to come up with two
different names for each of these situations; and more importantly, to
avoid publishers/authors having to remember two names for one
situation.


Here's an example with 'containedIn'. The idea is that we want to
express that the LocalBusiness (i.e. Place) Entity B is 'containedIn'
Entity A. The example I show here expresses the reverse, incorrectly.
So we're looking for a change to the markup that would turn this
example into one that said "The LocalBusiness Entity B is containedIn
the LocalBusiness Entity A":

<div itemscope itemtype="http://schema.org/LocalBusiness">
  <h1><span itemprop="name">(Entity A) Beachwalk Beachwear &amp;
  Giftware</span></h1>
  <span itemprop="description"> A superb collection of fine gifts and clothing
  to accent your stay in Mexico Beach.</span>
  Phone: <span itemprop="telephone">850-648-4200</span>

  <div itemprop="containedIn" itemscope
       itemtype="http://schema.org/LocalBusiness">
    <h2><span itemprop="name">(Entity B) The tiny store within a
    store</span></h2>
    <span itemprop="description"> A superb collection of tiny clothes,
    from the store within the store.</span>
    Phone: <span itemprop="telephone">123-456-7890</span>
  </div>

</div>


One response is that the markup could be reorganized. For example,

  <div itemscope itemtype="http://schema.org/LocalBusiness">
    <h2><span itemprop="name">(Entity B) The tiny store within a
    store</span></h2>
    <span itemprop="description"> A superb collection of tiny clothes,
    from the store within the store.</span>
    Phone: <span itemprop="telephone">123-456-7890</span>
    <div itemprop="containedIn" itemscope
         itemtype="http://schema.org/LocalBusiness">
      <h2><span itemprop="name">(Entity A) Beachwalk Beachwear &amp;
      Giftware</span></h2>
      <span itemprop="description"> A superb collection of fine gifts
      and clothing to accent your stay in Mexico Beach.</span>
      Phone: <span itemprop="telephone">850-648-4200</span>
    </div>
  </div>

We're not so optimistic about this approach, especially when multiple
entities are described. Schema.org is widely used but seems generally
to be added to existing pages with relatively fixed structure.

Another reasonable response to this is 'well, perhaps you should have
a property (instead or in addition) called "geospatiallyContains", or
"containerOf" or "contains", or "rev_containedIn" for this usage
scenario'?

We have tried this and in a few cases we have included pairs of
inverse properties in schema.org, e.g. we have alumni and an
inverse, alumniOf.  In designing schemas we have found it
consistently hard to get even a single natural/intuitive name for each
property, and finding a good 

[whatwg] Microdata feedback: please state that property value ordering is in the data model, and give usage guidelines

2011-06-08 Thread Dan Brickley
Hello,

Reading 
http://www.whatwg.org/specs/web-apps/current-work/multipage/links.html#microdata

Section '5.2.3 Names: the itemprop attribute' states something
important about Microdata's data model,

Within an item, the properties are unordered with respect to each
other, except for properties with the same name, which are ordered in
the order they are given by the algorithm that defines the properties
of an item.

... and gives an example: "In the following example, the 'a' property
has the values '1' and '2', in that order, ..."

<div itemscope itemref="x">
 <p itemprop="b">test</p>
 <p itemprop="a">2</p>
</div>
<div id="x">
 <p itemprop="a">1</p>
</div>

However '5.2.1 The microdata model' does not mention anything of this
data model feature. If property values are ordered (for some specific
property/item context), this should be mentioned when introducing the
data model, if only by copying or linking the above sentence ("Within
an item, ...").

Is the expectation that Microdata vocabulary authors can decide
whether such ordering is meaningful, when they define / describe their
properties?

For example, in academic publishing where they care about being first
named author, the ordering of 'itemprop=author' might seem to
matter. 5.2.3 suggests that the ordering information is at least
preserved in Microdata's data model. If someone creates an 'author'
property for Microdata, should they state that property ordering is
meaningful, or is that not their decision?
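
To illustrate the data model feature in question, here is a minimal Python sketch in which property names are unordered relative to each other while values that share a name keep their order. The (name, value) pair list, its order, and the group_properties name are assumptions for illustration; the sketch does not implement the spec's algorithm for finding the properties of an item.

```python
def group_properties(pairs):
    """Group (name, value) pairs by name, keeping same-name values in order.

    The relative order of different names carries no meaning here; only
    the order of values within one name is preserved.
    """
    grouped = {}
    for name, value in pairs:
        grouped.setdefault(name, []).append(value)
    return grouped


# Pairs in the order assumed to be produced for the example quoted above.
pairs = [("a", "1"), ("b", "test"), ("a", "2")]
item = group_properties(pairs)
print(item["a"])

# Same item: only the interleaving of different names changed.
same = group_properties([("b", "test"), ("a", "1"), ("a", "2")])
# Different item: the two 'a' values swapped places.
different = group_properties([("a", "2"), ("b", "test"), ("a", "1")])
print(item == same, item == different)
```

Under this model, whether swapping two 'author' values changes the meaning is exactly the question a vocabulary would have to answer.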

Thanks,

Dan


Re: [whatwg] Captions, Subtitles and the Video Element

2009-07-17 Thread Dan Brickley

On 17/7/09 15:04, Tab Atkins Jr. wrote:

On Fri, Jul 17, 2009 at 4:15 AM, Ian Hickson i...@hixie.ch wrote:

On Thu, 16 Jul 2009, Jeff Walden wrote:

(For the few authors who really want to go crazy, they can already
overlap HTML onto their <video> and do whatever crazy stuff they want
to do.)

By way of a use case for at least color and positioning, there's a
certain part of the third (?) Austin Powers movie wherein the color and
position of foreign-language subtitles plays an important part in the
artistic merits (lack thereof, arguably) of the scene.  How would you
suggest a movie-viewing site use <video> to display these?  It seems
unreasonable to say that the site must include special-case handling for
this particular movie clip's subtitles; it's more likely they would be
mangled in some manner and the semantic content (lack thereof) would be
lost.

By the way, I have no idea how foreign-language translations of the
movie handle this scene.  It's possible they simply subtitle the
subtitles and avoid the more complicated problems this scene arguably
presents.

I think this particular case can be a victim of the 80% rule.


I don't remember the exact scene you're referring to, but it's also
possible that those subtitles are then an integral part of the
content, and should properly be baked into the movie.


Yep, slippery slope. If we're not careful we'll end up requiring a 3d 
file browsing facility, so that Jurassic Park can be properly 
represented - http://en.wikipedia.org/wiki/Fsn


cheers,

Dan


Re: [whatwg] Fullscreenable attribute.

2009-07-13 Thread Dan Brickley

On 13/7/09 11:06, Ian Hickson wrote:

On Tue, 16 Jun 2009, Alpha Omega wrote:

I think it would be useful to add a fullscreenable (or more refined
name) attribute to arbitrary elements, so users would be able to
full-screen DOM subtrees that the document author marked as
fullscreenable.

Usage: The user chooses the area that he wants to fullscreen and performs a
UA-specific action there (for example, "go to fullscreen" in the context menu
in desktop browsers, or a gesture on mobile devices). The UA goes up the DOM
tree until it finds a fullscreenable attribute, and then fullscreens this
subtree. If no fullscreenable attribute is found, then it is the UA author's
decision what to do, for example fullscreen the entire page.
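
The lookup described above can be sketched in a few lines of Python. The Element class is a tiny stand-in for a DOM node, not a real DOM API, and falling back to the whole page is just the example fallback mentioned in the message; all names are hypothetical.

```python
class Element:
    """Tiny stand-in for a DOM element (not a real DOM API)."""

    def __init__(self, name, parent=None, attributes=()):
        self.name = name
        self.parent = parent
        self.attributes = set(attributes)


def fullscreen_target(start, root):
    """Walk up from the chosen element to the nearest fullscreenable ancestor."""
    node = start
    while node is not None:
        if "fullscreenable" in node.attributes:
            return node
        node = node.parent
    # Nothing was marked: left to the UA; here we fall back to the whole page.
    return root


page = Element("html")
player = Element("div", parent=page, attributes=["fullscreenable"])
video = Element("video", parent=player)
sidebar = Element("aside", parent=page)

print(fullscreen_target(video, page).name)
print(fullscreen_target(sidebar, page).name)
```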


Should UAs always put users in control of this?

ie. everything in principle is fullscreenable, but this indicator 
would be a strong hint that this chunk of content makes special sense to 
be treated in this manner.



Use case: This not only solves the problem with the <video> tag, but is also
useful for mobile UAs (users could use it to zoom to author-defined parts on
pages with complex layouts), and for interactive webapps in general,
IMHO.


I think this would be an interesting idea. I haven't any idea what the UI
would look like though. I recommend approaching vendors directly and
getting their input and experimental implementations, as described here:


http://wiki.whatwg.org/wiki/FAQ#Is_there_a_process_for_adding_new_features_to_the_spec.3F


I like the idea of being able to go full-screen. I'd encourage talking
to Web accessibility folk before going too far with a proposal /
implementation...


cheers,

Dan


Re: [whatwg] Removing the need for separate feeds

2009-05-22 Thread Dan Brickley

On 22/5/09 09:21, Ian Hickson wrote:

On Fri, 22 May 2009, Henri Sivonen wrote:

On May 22, 2009, at 09:01, Ian Hickson wrote:

   USE CASE: Remove the need for feeds to restate the content of HTML pages
   (i.e. replace Atom with HTML).

Did you do some kind of "Is this Good for the Web?" analysis on this
one? That is, do things get better if there's yet another feed format?


As far as I can tell, things get better if the feed format and the default
output format are the same, yes. Generally, redundant information has
tended to lead to problems.


Would this include having a mechanism (microdata? xml islands?) that 
preserves extension markup from Atom feeds? eg. see 
http://www.ibm.com/developerworks/xml/library/x-extatom1/


cheers,

Dan


Re: [whatwg] Removing the need for separate feeds

2009-05-22 Thread Dan Brickley

On 22/5/09 12:36, Toby Inkster wrote:

Eduard Pascual wrote:


For manually authored pages and feeds things would be different; but
are there really a significant amount of such cases out there? I
can't say I have seen the entire web (who can?), but among what I have
seen, I have never encountered any hand authored feed, except for code
examples and similar experimental stuff.


Surely this proves the need for a way of extracting feeds from HTML?

You never see manually written feeds because people can't be bothered to
manually write feeds. So the people who manually author HTML simply
don't bother providing feeds at all.

If an HTML page can *be* a feed, this allows manually authored HTML
pages to be subscribed to in feed readers.


FWIW the W3C homepage has worked this way since ~2000,
http://www.w3.org/2000/08/w3c-synd/


cheers,

Dan



Re: [whatwg] Link rot is not dangerous

2009-05-20 Thread Dan Brickley

On 20/5/09 22:54, Tab Atkins Jr. wrote:

On Wed, May 20, 2009 at 2:35 AM, Toby A Inkster m...@tobyinkster.co.uk wrote:



And yet, given an example use of the vocabulary, I'm quite certain I
can easily find the page I want describing the vocab, even when there
are overlaps in prefixes such as with bio.

FYN is nearly never necessary for humans.  We have the intelligence to
craft search queries and decide which returned result is correct.


What happens in practice is that many of these perfectly intelligent 
humans ask in email or IRC questions that are clearly answered directly 
in the relevant documentation. You can lead humans to the documentation, 
but you can't make 'em read...


cheers,

Dan


Re: [whatwg] Link rot is not dangerous

2009-05-18 Thread Dan Brickley

On 18/5/09 10:34, Henri Sivonen wrote:

On May 15, 2009, at 19:20, Manu Sporny wrote:


There have been a number of people now that have gone to great lengths
to outline how awful link rot is for CURIEs and the semantic web in
general. This is a flawed conclusion, based on the assumption that there
must be a single vocabulary document in existence, for all time, at one
location.


The flawed conclusion flows out of Follow Your Nose advocacy, and is
not flawed if one takes Follow Your Nose seriously.

It seems to me that the positions that RDF applications should Follow
Their Nose and that link rot is not dangerous (to RDF) are
contradictory positions.


That's a strong claim. There is certainly a balance to be found between 
taking advantage of de-referencable URIs and relying on their 
de-referencability. De-referencing is a privilege not a right, after all.


If I lost control of xmlns.com tomorrow, and it became un-rescuably
owned by offshore spam-virus-malware pirates, that doesn't change 
history. For nine years, the FOAF documentation has lived there, and we 
can use URIs to ask other services about what they saw during that 
period: http://web.archive.org/web/*/http://xmlns.com/foaf/0.1/


Since there is useful information to know about FOAF properties and 
terms from its schema and human-oriented docs, it would be a shame if 
people ignored that. Since domain names can be lost, it would also be a 
shame if directly de-referencing URIs to the schema was the only way 
people could find that info. Fortunately, neither is the case.



That link rot hasn't been a practical problem to the Semantic Web
community suggests that applications don't really Follow Their Nose in
practice. Can anyone point me to a deployed end user application that
uses RDF internally and Follows Its Nose?


The search site, sindice.com does this:

"Yes Sindice dereferences URIs it finds in RDF instance data, including
class and property URIs. It performs OWL reasoning using the retrieved
information, mostly to infer additional triples based on subclass and
subproperty relationships. Doing this helps us to increase recall in
queries." (from Richard Cyganiak, whom I asked offlist for confirmation)
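
As a rough illustration of the kind of inference described there (and not a description of how Sindice actually implements it), here is a Python sketch that adds the triples implied by subproperty links. The ex: properties and the subproperty table are hypothetical; in a Follow Your Nose setup the table would come from the dereferenced schema documents.

```python
def infer_superproperty_triples(triples, subproperty_of):
    """Add triples implied by subproperty links, following chains of links."""
    inferred = set(triples)
    pending = list(triples)
    while pending:
        subject, prop, obj = pending.pop()
        for parent in subproperty_of.get(prop, ()):
            new_triple = (subject, parent, obj)
            if new_triple not in inferred:
                inferred.add(new_triple)
                pending.append(new_triple)
    return inferred


# Hypothetical schema facts: ex:bestFriend is a subproperty of ex:friend,
# which is in turn a subproperty of foaf:knows.
subproperty_of = {"ex:bestFriend": ["ex:friend"], "ex:friend": ["foaf:knows"]}
data = [("alice", "ex:bestFriend", "bob")]
result = infer_superproperty_triples(data, subproperty_of)
print(sorted(result))
```

A query for foaf:knows now also finds the ex:bestFriend statement, which is the recall gain mentioned in the quote.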


Whether you consider sindice.com end-user facing or not, I don't know. I'd
put it in roughly the same category as Google's Social Graph API. But it's
a non-trivial implementation that aggregates and integrates a lot of data.


BTW here's another use case for identifying properties and classes by 
URI: we can decentralise the translation of their labels into other 
languages. Here are some Korean descriptions of FOAF, for example: 
http://svn.foaf-project.org/foaftown/foaf18n/foaf-kr.rdf


cheers,

Dan


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-15 Thread Dan Brickley

On 15/5/09 14:11, Shelley Powers wrote:

Kristof Zelechovski wrote:

I do not think anybody in WHATWG hates the CURIE tool; however, the
following problems have been put forward:

Copy-Paste
The CURIE mechanism is considered inconvenient because it is not
copy-paste-resilient, and the associated risk is that semantic elements
would randomly change their meaning.


Well, no, the elements won't randomly change their meaning. The only
risk is copying and pasting them into a document that doesn't provide
namespace definitions for the prefixes. Are you thinking that someone
will be using different namespaces but the same prefix? Come on -- do
you really think that will happen?


The most likely case is with Dublin Core, but DC data varies enough 
already that this isn't too destructive...
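
A small Python sketch of the copy-paste concern: the same CURIE expands to a different URI when the surrounding document binds the prefix differently, and fails outright when the prefix is not declared. The two namespace URIs are the real Dublin Core element and terms namespaces; reading them as the Dublin Core case meant here is an assumption, and expand_curie is a hypothetical helper, not an RDFa parser.

```python
def expand_curie(curie, prefixes):
    """Expand prefix:reference using the prefix mappings of one document."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"no namespace declared for the prefix in {curie!r}")
    return prefixes[prefix] + reference


# Two documents that both use the prefix "dc" but bind it differently.
doc_a = {"dc": "http://purl.org/dc/elements/1.1/"}
doc_b = {"dc": "http://purl.org/dc/terms/"}

print(expand_curie("dc:title", doc_a))
print(expand_curie("dc:title", doc_b))
```

Markup pasted from the first document into the second keeps its text but silently changes the URI it stands for.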


Dan


Re: [whatwg] Link rot is not dangerous

2009-05-15 Thread Dan Brickley

On 15/5/09 18:20, Manu Sporny wrote:

Kristof Zelechovski wrote:

Therefore, link rot is a bigger problem for CURIE
prefixes than for links.


There have been a number of people now that have gone to great lengths
to outline how awful link rot is for CURIEs and the semantic web in
general. This is a flawed conclusion, based on the assumption that there
must be a single vocabulary document in existence, for all time, at one
location. This has also led to a false requirement that all
vocabularies should be centralized.

Here's the fear:

If a vocabulary document disappears for any reason, then the meaning of
the vocabulary is lost and all triples depending on the lost vocabulary
become useless.

That fear ignores the fact that we have a highly available document
store available to us (the Web). Not only that, but these vocabularies
will be cached (at Google, at Yahoo, at The Wayback Machine, etc.).

IF a vocabulary document disappears, which is highly unlikely for
popular vocabularies - imagine FOAF disappearing overnight, then there
are alternative mechanisms to extract meaning from the triples that will
be left on the web.

Here are just two of the possible solutions to the problem outlined:

- The vocabulary is restored at another URL using a cached copy of the
vocabulary. The site owner of the original vocabulary either re-uses the
vocabulary, or re-directs the vocabulary page to another domain
(somebody that will ensure the vocabulary continues to be provided -
somebody like the W3C).
- RDFa parsers can be given an override list of legacy vocabularies that
will be loaded from disk (from a cached copy). If a cached copy of the
vocabulary cannot be found, it can be re-created from scratch if necessary.

The argument that link rot would cause massive damage to the semantic
web is just not true. Even if there is minor damage caused, it is fairly
easy to recover from it, as outlined above.
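
The second option above (an override list backed by cached copies) might look roughly like this in Python. It is a sketch under stated assumptions: load_vocabulary, the overrides mapping and the injected fetch callable are all hypothetical, and no particular RDFa parser is known to expose such an API.

```python
def load_vocabulary(uri, overrides, fetch):
    """Return (text, source) for a vocabulary URI.

    overrides maps vocabulary URIs to locally cached text and is consulted
    first; fetch is any callable that dereferences a URI or raises OSError.
    """
    if uri in overrides:
        return overrides[uri], "override"
    return fetch(uri), "network"


def dead_link(uri):
    """Stand-in for a fetcher hitting a vocabulary that has rotted away."""
    raise OSError("host not found: " + uri)


# Hypothetical cached copy of a legacy vocabulary.
overrides = {"http://xmlns.com/foaf/0.1/": "# cached FOAF schema text"}
text, source = load_vocabulary("http://xmlns.com/foaf/0.1/", overrides, dead_link)
print(source)
```

A vocabulary that is neither overridden nor reachable still fails, which is where the first option (restoring it at another URL) comes in.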


A few other points:

1. It's for the community of vocabulary-creators to help each other out 
w.r.t. hosting/publishing these: I just nudged a friend to put another 5 
years on the DNS rental for a popular namespace. I think we should put a 
bit more structure around these kinds of habit, so that popular 
namespaces won't drop off the Web through accident.


2. digitally signing the schemas will become part of the story, I'm 
sure. While it's a bit fiddly, there are advantages to having other 
mechanisms beyond URI de-referencing for knowing where a schema came from


3. Parties worried about external dependencies when using namespaces can 
always indirect through their own namespace, whose schema document can 
declare subclass/subproperty relations to other URIs.
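
To make point 3 concrete, here is a minimal sketch in Python, assuming triples are plain (subject, predicate, object) tuples. The `example.org` URIs and the `expand_subproperties` helper are invented for illustration; only the rdfs:subPropertyOf URI is real.

```python
SUBPROPERTY_OF = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"


def expand_subproperties(triples):
    """Return the triples plus those implied by rdfs:subPropertyOf.

    For every (s, p, o) where p is declared a subproperty of q,
    (s, q, o) is added, following chains of declarations.
    """
    parents = {}
    for s, p, o in triples:
        if p == SUBPROPERTY_OF:
            parents.setdefault(s, set()).add(o)

    result = set(triples)
    changed = True
    while changed:  # repeat until no new triple is produced
        changed = False
        for s, p, o in list(result):
            for parent in parents.get(p, ()):
                implied = (s, parent, o)
                if implied not in result:
                    result.add(implied)
                    changed = True
    return result


# Data is published with our own property; our own schema maps it outward.
data = {
    ("http://example.org/my#knows", SUBPROPERTY_OF,
     "http://xmlns.com/foaf/0.1/knows"),
    ("http://example.org/alice", "http://example.org/my#knows",
     "http://example.org/bob"),
}
expanded = expand_subproperties(data)
```

The published data only ever mentions the local property, so if the external vocabulary changes or vanishes, the one mapping triple in the local schema is the only thing that needs attention.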


cheers

Dan




Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-14 Thread Dan Brickley

On 14/5/09 14:18, Shelley Powers wrote:

James Graham wrote:

jgra...@opera.com wrote:

Quoting Philip Taylor excors+wha...@gmail.com:


On Sun, May 10, 2009 at 11:32 AM, Ian Hickson i...@hixie.ch wrote:


One of the more elaborate use cases I collected from the e-mails sent in
over the past few months was the following:

USE CASE: Annotate structured data that HTML has no semantics for, and
which nobody has annotated before, and may never again, for private use or
use in a small self-contained community.

[...]

To address this use case and its scenarios, I've added to HTML5 a simple
syntax (three new attributes) based on RDFa.


There's a quickly-hacked-together demo at
http://philip.html5.org/demos/microdata/demo.html (works in at least
Firefox and Opera), which attempts to show you the JSON serialisation
of the embedded data, which might help in examining the proposal.


I have a *totally unfinished* demo that does something rather similar
at [1]. It is highly likely to break and/or give incorrect results**.
If you use it for anything important you are insane :)


I have now added extremely preliminary RDF support with output as N3
and RDF/XML courtesy of rdflib. It is certain to be buggy.


So much concern about generating RDF, makes one wonder why we didn't
just implement RDFa...


Having HTML5-microdata -to- RDF parsers is pretty critical to having 
test cases that help us all understand where RDFa-Classic and HTML5 
diverge. I'm very happy to see this work being done and that there are 
multiple implementations.


As far as I can see, the main point of divergence is around URI 
abbreviation mechanisms. But also HTML5 might not have a notion 
equivalent to RDF/RDFa's bNodes construct. The sooner we have these 
parsers the sooner we'll know for sure.


Dan



Re: [whatwg] Start position of media resources

2009-04-07 Thread Dan Brickley

On 8/4/09 00:29, Silvia Pfeiffer wrote:


The media fragment WG decided that fragment addressing should be done
with # and be able to just deliver the actual fragment.


Interesting! Do you have a reference for this? I can't understand how 
this is possible if these are URI references, unless something very 
non-traditional is happening...


cheers,

Dan


Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-18 Thread Dan Brickley

On 17/1/09 23:30, L. David Baron wrote:

On Saturday 2009-01-17 22:25 +0200, Henri Sivonen wrote:

The story of RDF is very different. Of the top four engines, only Gecko
has RDF functionality. It was implemented at a time when RDF was a young
W3C REC and stuff that were W3C RECs were implemented less critically
than nowadays.


Actually, the implementation was well underway *before* RDF was a
W3C REC, done by a team led by one of the designers of RDF.  In
other words, it was in Gecko because there were RDF advocates at
Netscape (although advocating, I think, a somewhat different RDF
than the current RDF recommendations).


Yes, Netscape had this stuff when it was still called MCF. W3C's RDF 
took ideas from several input activities, including MCF, Microsoft 
XML-Data, PICS, and requirements from the Dublin Core community. But it 
looks more like MCF than the others.


MCF was originally proposed by R.V.Guha at Apple; it followed him from 
Apple to Netscape in 1997, and when the Mozilla sources were later 
thrown over the wall, there was a lot of MCF in there.


MCF White Paper, 1996 http://www.guha.com/mcf/wp.html
spec, http://www.guha.com/mcf/mcf_spec.html

While this was at Apple, there was a product/viewer called HotSauce / 
Project X, and some early grassroots adoption of MCF as a text format 
for publishing website summaries.


http://web.archive.org/web/19961224042753/http://hotsauce.apple.com/
http://downlode.org/Etext/MCF/macworld_online.html

It was at this stage that dialog started with the Library scene and 
Dublin Core folk, about how it related to their notion of catalogue 
records, and to the evolving PICS labelling system, format and protocol 
being built at W3C.

eg.
http://www.ssrc.hku.hk/tb-issues/TidBITS-355.html#lnk3
http://web.archive.org/web/19980215092626/http://www.ariadne.ac.uk/issue7/mcf/
The MCF/RSS relationship is a whole other story, eg. see
http://www.scripting.com/midas/mcf.html
http://www.scripting.com/frontier/siteMap.mcf
http://web.archive.org/web/19990222114619/http://www.xspace.net/hotsauce/sites.html

Then the thing moved to Netscape. Tim Bray helped Guha XMLize the spec, 
which was submitted to W3C in 1997, where it joined the existing efforts 
to extend PICS to include text labels and more structure - 
http://www.w3.org/TR/NOTE-pics-ng-metadata

http://www.daml.org/committee/minutes/2000-12-07-RDF-design-rationale.ppt
http://searchenginewatch.com/2165291

So the June 97 spec was
http://www.w3.org/TR/NOTE-MCF-XML/
.. you can see from the figures that the technology was very RDF-shaped, 
http://www.w3.org/TR/NOTE-MCF-XML/#sec2. Also a tutorial at 
http://www.w3.org/TR/NOTE-MCF-XML/MCF-tutorial.html


Netscape press release accompanying June 13 1997 submission -
http://web.archive.org/web/20010308150737/http://cgi.netscape.com/newsref/pr/newsrelease432.html

Less than 4 months later, this came out as a W3C Working Draft called 
RDF: http://www.w3.org/TR/WD-rdf-syntax-971002/
... in a shape that didn't really change much subsequently. RDF wasn't 
the same design exactly as MCF but the ancestry is clear enough.


And getting back to the original point, yeah Mozilla had MCF sitemaps 
code in there.


Revisiting 
http://www.prnewswire.com/cgi-bin/stories.pl?ACCT=104STORY=/www/story/9-8-97/312711EDATE= 

http://www.irt.org/articles/js086/ and the like, it's clear that RDF was 
very much a child of the 1st browser wars.


In retrospect the direction it took within Mozilla didn't do anyone much 
good. The earliest MCF apps were about public data on the public Web, 
feeds, sitemaps and so on. But eventually the ambition to be a complete 
information hub led to MCF/RDF being used for pretty much everything 
*inside* Mozilla. And I don't think that turned out very well. 
http://www.mozilla.org/rdf/doc/api.html etc. The RDF vocabularies it 
used were poorly or never documented (I have some guilt there) and when 
Netscape went away, the incentive to connect to public data on the Web 
seemed to drop (no more tie-ins with the 'what's related' annotation 
server, 'dmoz' etc.). RDF drifted from being a Web data format to be 
consumed *by* the browser, into an engineering tool to be used in the 
construction *of* the browser, ie. as a datasource abstraction within 
Mozilla APIs. While I can certainly see the value of having a unified 
view of mail, news, sitemaps, and so on, the Moz code at the time wasn't 
really in a position to match up to the language in the press releases.


Not making any particular point here beyond connecting up to the MCF 
heritage...


cheers,

Dan

--
http://danbri.org/




Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-18 Thread Dan Brickley

On 18/1/09 00:24, Henri Sivonen wrote:


No. However, most of the time, when people publish HTML, they do it to
elicit browser behavior when a user loads the HTML document in a browser.


Most users of the Web barely know what a browser is, let alone HTML. 
They're just putting information online; perhaps into a closed site (eg. 
facebook), perhaps into a public-facing site (eg. a blog), or perhaps 
into 1:1, group or IM messaging (eg. webmail). HTML figures in all these 
scenarios. Browsers or HTML rendering code too, of course. But I don't 
think we can jump from that to claims about user intent, any more than 
their use of the Internet signifies an intent to have their information 
chopped up into packets and transmitted according to the rules of TCP/IP.


The reason for my pedantry here is not to be argumentative, but just to 
suggest that this (otherwise very natural) thinking leads us to forget 
about the other major consumers of HTML - search engines. Having their 
stuff found and linked by others is often a big part of the motivation 
for putting stuff online. HTML parsing is involved, impact on the needs 
and interests of mainstream users is involved; but it's not clear 
whether all/any/many users 'do it to elicit search engine behaviour when 
indexing the HTML document'.


Aren't search engines equally important consumers of HTML? Perhaps 
they're more simple-minded in their behaviour than a full UI browser. 
But from the user side, there's only slightly more value in being 
readable without being findable than vice-versa...


cheers,

Dan

--
http://danbri.org/


Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-18 Thread Dan Brickley

On 18/1/09 19:34, Henri Sivonen wrote:

On Jan 18, 2009, at 01:32, Shelley Powers wrote:


Are you then saying that this will be a showstopper, and there will
never be either a workaround or compromise?



Are the RDFa TF open to compromises that involve changing the XHTML side
of RDFa not to use attribute whose qualified name has a colon in them to
achieve DOM Consistency by changing RDFa instead of changing parsing?


I don't believe the RDFa TF are in a position to singlehandedly rescind 
a W3C Recommendation, ie. 
http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014/.


What they presumably could do is propose new work items within W3C, 
which I'd guess would be more likely to be accepted if it had the active 
enthusiasm of the core HTML5 team. Am cc:'ing TimBL here who might have 
something more to add.


Do you have an alternative design in mind, for expressing the namespace 
mappings?


cheers,

Dan

--
http://danbri.org/


Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-18 Thread Dan Brickley

On 18/1/09 20:07, Henri Sivonen wrote:

On Jan 18, 2009, at 20:48, Dan Brickley wrote:


On 18/1/09 19:34, Henri Sivonen wrote:

On Jan 18, 2009, at 01:32, Shelley Powers wrote:


Are you then saying that this will be a showstopper, and there will
never be either a workaround or compromise?



Are the RDFa TF open to compromises that involve changing the XHTML side
of RDFa not to use attribute whose qualified name has a colon in them to
achieve DOM Consistency by changing RDFa instead of changing parsing?


I don't believe the RDFa TF are in a position to singlehandedly
rescind a W3C Recommendation, ie.
http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014/.

What they presumably could do is propose new work items within W3C,
which I'd guess would be more likely to be accepted if it had the
active enthusiasm of the core HTML5 team. Am cc:'ing TimBL here who
might have something more to add.

Do you have an alternative design in mind, for expressing the
namespace mappings?


The simplest thing is not to have mappings but to put the corresponding
absolute URI wherever RDFa uses a CURIE.


So this would be a kind of interoperability profile of RDFa, where 
certain features approved of by REC-rdfa-syntax-20081014 wouldn't be 
used in some hypothetical HTML5 RDFa.


If people can control their urge to use namespace abbreviations, and 
stick to URIs directly, would this make your DOM-oriented concerns go away?


cheers,

Dan

--
http://danbri.org/


Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-18 Thread Dan Brickley

On 18/1/09 21:04, Shelley Powers wrote:

Dan Brickley wrote:

On 18/1/09 20:07, Henri Sivonen wrote:

On Jan 18, 2009, at 20:48, Dan Brickley wrote:


On 18/1/09 19:34, Henri Sivonen wrote:

On Jan 18, 2009, at 01:32, Shelley Powers wrote:


Are you then saying that this will be a showstopper, and there will
never be either a workaround or compromise?



Are the RDFa TF open to compromises that involve changing the XHTML side
of RDFa not to use attribute whose qualified name has a colon in them to
achieve DOM Consistency by changing RDFa instead of changing parsing?


I don't believe the RDFa TF are in a position to singlehandedly
rescind a W3C Recommendation, ie.
http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014/.

What they presumably could do is propose new work items within W3C,
which I'd guess would be more likely to be accepted if it had the
active enthusiasm of the core HTML5 team. Am cc:'ing TimBL here who
might have something more to add.

Do you have an alternative design in mind, for expressing the
namespace mappings?


The simplest thing is not to have mappings but to put the corresponding
absolute URI wherever RDFa uses a CURIE.


So this would be a kind of interoperability profile of RDFa, where
certain features approved of by REC-rdfa-syntax-20081014 wouldn't be
used in some hypothetical HTML5 RDFa.

If people can control their urge to use namespace abbreviations, and
stick to URIs directly, would this make your DOM-oriented concerns go
away?


Took five minutes to make this change in my template. Ran through
validator.nu. Results:

Doesn't like the content-type. Didn't like profile on head. Having to
remove the profile attribute in my head element limits usability, but
I'm not going to throw myself on the sword for this one.

Doesn't like property, doesn't like about. These are the RDFa attributes
I'm using. The RDF extractor doesn't care that I used the URIs directly.


This sounds encouraging. Thanks for taking the time to try the 
experiment,  Shelley. But ... to be clear, are you putting full URIs in 
the @property attribute too? In 
http://www.w3.org/TR/rdfa-syntax/#s_curieprocessing it says '@property, 
@datatype and @typeof support only CURIE values.'


(Can you post an example?)

Reading ...
Many of the attributes that hold URIs are also able to carry 'compact 
URIs' or CURIEs. A CURIE is a convenient way to represent a long URI, by 
replacing a leading section of the URI with a substitution token. It's 
possible for authors to define a number of substitution tokens as they 
see fit; the full URI is obtained by locating the mapping defined by a 
token from a list of in-scope tokens, and then simply concatenating the 
second part of the CURIE onto the mapped value.


... I guess the fact that @property is supposed to be CURIE-only isn't a 
problem with parsers since this can be understood as a CURIE with no (or 
empty) substitution token.
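
For readers who want the quoted expansion rule spelled out, a minimal sketch in Python follows. The `expand_curie` name and the `prefixes` dictionary are assumptions for illustration, and the sketch covers only the simple token-plus-reference case described in the quoted paragraph, not the full set of processing rules in the RDFa spec.

```python
def expand_curie(curie: str, prefixes: dict) -> str:
    """Expand 'token:reference' using the in-scope `prefixes` mapping.

    The token before the first colon is looked up in `prefixes`, and
    the rest of the CURIE is concatenated onto the mapped value.
    """
    token, colon, reference = curie.partition(":")
    if not colon:
        raise ValueError("not a CURIE (no colon): %r" % curie)
    if token not in prefixes:
        # No in-scope mapping: refuse rather than guess.
        raise KeyError("no mapping in scope for token %r" % token)
    return prefixes[token] + reference


in_scope = {"foaf": "http://xmlns.com/foaf/0.1/"}
full_uri = expand_curie("foaf:name", in_scope)
```

Note that a full URI such as http://example.org/x passed to this function would be split at its first colon and treated as token "http", which is exactly the kind of overlap between CURIEs and URIs under discussion.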


cheers,

Dan

--
http://danbri.org/


Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-17 Thread Dan Brickley

On 17/1/09 19:27, Sam Ruby wrote:

On Sat, Jan 17, 2009 at 11:55 AM, Shelley Powers
shell...@burningbird.net  wrote:

The debate about RDFa highlights a disconnect in the decision making related
to HTML5.


Perhaps.  Or perhaps not.  I am far from an apologist for Hixie, (nor
for that matter am I a strong advocate for RDF), but I offer the
following question and observation.


The purpose behind RDFa is to provide a way to embed complex information
into a web document, in such a way that a machine can extract this
information and combine it with other data extracted from other web pages.
It is not a way to document private data, or data that is meant to be used
by some JavaScript-based application. The sole purpose of the data is for
external extraction and combination.


So, I take it that it isn't essential that RDFa information be
included in the DOM?  This is not rhetorical: I honestly don't know
the answer to this question.


Good question. I for one expect RDFa to be accessible to Javascript.

http://code.google.com/p/rdfquery/wiki/Introduction - 
http://rdfquery.googlecode.com/svn/trunk/demos/markup/markup.html is a 
nice example of code that does something useful in this way.


cheers,

Dan

--
http://danbri.org/


Re: [whatwg] Trying to work out the problems solved by RDFa

2009-01-09 Thread Dan Brickley

On 10/1/09 00:37, Ian Hickson wrote:

On Fri, 9 Jan 2009, Ben Adida wrote:

Is inherent resistance to spam a condition (even a consideration) for
HTML5?


We have to make sure that whatever we specify in HTML5 actually is going
to be useful for the purpose it is intended for. If a feature intended for
wide-scale automated data extraction is especially susceptible to spamming
attacks, then it is unlikely to be useful for wide-scale automated data
extraction.


I've been looking at such concerns a bit for RDFa. One issue (shared 
with HTML in general I think) is user-supplied content (eg. blog 
comments and 'rel=nofollow' scenarios).  Is there any way in HTML5 to 
indicate that a whole chunk of Web page is from an (in some 
to-be-defined sense) untrusted source?


I see http://www.whatwg.org/specs/web-apps/current-work/#link-type-nofollow

The nofollow keyword indicates that the link is not endorsed by the 
original author or publisher of the page, or that the link to the 
referenced document was included primarily because of a commercial 
relationship between people affiliated with the two pages.


While I'm unsure about the commercial relationship clause quite 
capturing what's needed, the basic idea seems sound. Is there any 
provision (or plans) for applying this notion to entire blocks of 
markup, rather than just to simple hyperlinks? This would be rather 
useful for distinguishing embedded metadata that comes from the page 
author from that included from blog comments or similar.


Thanks for any pointers,

cheers,

Dan

--
http://danbri.org/


Re: [whatwg] Trying to work out the problems solved by RDFa

2009-01-03 Thread Dan Brickley

On 3/1/09 14:02, Julian Reschke wrote:

Tab Atkins Jr. wrote:

...

Well, it'll require an N3 parser where previously none was needed.


RDFa requires an RDFa parser as well, and in general *any* metadata
requires a parser, so this point is moot. The only metadata that
doesn't require a parser is no metadata at all.


With RDFa, most of the parsing is done by HTML. So I would call it an
RDFa processor. And yes, that doesn't change the fact that code needs
to be written. But it affects the type of the code that needs to be
written.


Somewhat of an aside, but for the curious - here is an RDFa 
parser/processor app:


http://code.google.com/p/rdfquery/wiki/Introduction
example: http://rdfquery.googlecode.com/svn/trunk/demos/markup/markup.html
js: http://rdfquery.googlecode.com/svn/trunk/jquery.rdfa.js

[...]


The most successful alternative is nothing at all. ^_^ We can
extract copious data from web pages reliably without metadata, either
using our human senses (in personal use) or natural-language-based
processing (in search engine use). It has not yet been established
that sufficient and significant enough problems *exist* to justify a
solution, let alone one that requires an addition to html. That is
what Ian is specifically looking for.


That's what you and Ian claim. Many disagree.


My main problem with the natural language processing option is that it 
feels too close to waiting for Artificial Intelligence. I'd rather add 6 
attributes to HTML and get on with life.


But perhaps a more practical concern is that it unfairly biases things 
towards popular languages - lucky English, lucky Spanish, etc., and 
those that lend themselves more to NLP analysis. The Web is for 
everyone, and people shouldn't be forced to read and write English to 
enjoy the latest advances in Web automation. Since HTML5 is going 
through W3C, such considerations need to be taken pretty seriously.



As a note, this isn't the W3C's HTML WG. The WHATWG is independent
from the W3C.


But the WHATWG HTML5 *work* is no longer entirely independent of W3C; 
the two organizations embarked on a major joint venture. It seems 
reasonable for members of the WHATWG world to take W3C-oriented 
considerations seriously, regardless of mailing list.


cheers,

Dan

--
http://danbri.org/


Re: [whatwg] Trying to work out the problems solved by RDFa

2009-01-03 Thread Dan Brickley

On 3/1/09 16:54, Håkon Wium Lie wrote:

Also sprach Dan Brickley:

My main problem with the natural language processing option is that it
feels too close to waiting for Artificial Intelligence. I'd rather add 6
attributes to HTML and get on with life.

:-)


Another thought re NLP. RDFa (and similar, ...) are formats that can be 
used for writing down the conclusions of NLP analysis. For example, 
see the BBC's recent Muddy Boots experiment, using DBPedia (Wikipedia in 
RDF) data to drive autoclassification / named entity recognition. So 
here we can agree with Ian and others that text analysis has much to 
offer, and still use RDFa (or other semantic markup - i'll sidestep that 
debate for now) as a notation for marking up the words with a 
machine-friendly indicator of their NLP-guessed meaning.


http://www.bbc.co.uk/blogs/journalismlabs/2008/12/muddy_boots.html


Personally, I think the 'class' attribute may still be a more
compelling option in a less-is-more way. It already exists and can
easily be used for styling purposes. Styling is bait for authors to
disclose semantics.


I'm sure there's mileage to be had there. I'm somehow incapable of 
writing XSLT so GRDDL hasn't really charmed me, but 'class' certainly 
corresponds to a lot of meaningful markup. Naturally enough it is 
stronger at tagging bits of information with a category than at defining 
relationships amongst the things defined when they're scattered around 
the page. But that's no reason to dismiss it entirely.


Did you see the RDF-EASE draft, 
http://buzzword.org.uk/2008/rdf-ease/spec? From which comes: "Ten second 
sales pitch: CSS is an external file that specifies how your document 
should look; RDF-EASE is an external file that specifies what your 
document means."


RDF-EASE uses CSS-based syntax. More discussion here, 
http://lists.w3.org/Archives/Public/semantic-web/2008Dec/0148.html 
including question of whether it ought to be expressed using 
css3-namespace, 
http://lists.w3.org/Archives/Public/semantic-web/2008Dec/0175.html


cheers,

Dan

--
http://danbri.org/



Re: [whatwg] Absent rev?

2008-11-18 Thread Dan Brickley

Ian Hickson wrote:

On Tue, 18 Nov 2008, Martin McEvoy wrote:

Just one small question

Why Has HTML5 dropped the rev=[1] attribute?

[1] http://www.w3.org/TR/html5-diff/#absent-attributes


We did some studies and found that the attribute was almost never used, 
and most of the time, when it was used, it was a typo where someone meant 
to write rel= but wrote rev=. To be precise, the most commonly used 
value was rev=made, which is equivalent to rel=author and thus was not 
a convincing use case. The second most common value was rev=stylesheet, 
which is meaningless and obviously meant to be rel=stylesheet. We 
therefore determined that authors would benefit more from the validator 
complaining about this attribute instead of supporting it.


(I don't dispute its relative un-used-ness...)

Anything that could be done with rev= can be done with rel= with an 
opposite keyword, so this omission should be easy to handle.


This would seem to shift work from HTML5 to relationship vocabulary 
specs, whether RDFa-oriented or XFN-based: they'll have to name the 
relationship in both directions now.


eg.
john.html:
<p>See my <a rel="father" href="pa.html">dad's page</a> for details</p>
and

pa.html:
<p>See my <a rel="child" href="john.html">son's page</a> for details</p>

are ok in html5 (so long as there's a plausible inverse defined), but

pa.html: <p>Reader, <a rev="father" href="john.html">i'm his father</a></p>

...isn't. I'm not arguing here that this is right or wrong or good or 
bad or pretty or ugly, just that the parties defining little 
relationship vocabularies such as 'parent', 'child', 
'father','mother','brother','ex-line-manager', and so on will (now 'rev' 
is going away) need to think carefully about naming each inverse 
relationship as well. As you point out, rev= wasn't heavily used anyway; 
however technologies like microformats and RDFa are relatively new to 
the Web, and things can take a while to get adopted (eg. XHR/'ajax').
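
A small sketch of the difference, in Python, using the john.html / pa.html example. It reads links the way RDFa does (rel gives current-page, keyword, target; rev swaps the two ends). The function names and the `inverses` table are made up for illustration; the table stands for exactly the extra work a vocabulary author has to do once rev is gone.

```python
def link_to_triple(page, attr, keyword, href):
    """Turn one hyperlink into a (subject, relationship, object) triple."""
    if attr == "rel":
        return (page, keyword, href)
    if attr == "rev":
        # rev states the same relationship in the opposite direction.
        return (href, keyword, page)
    raise ValueError("expected 'rel' or 'rev', got %r" % attr)


def apply_inverse(triple, inverses):
    """Rewrite a triple via a vocabulary-supplied inverse table, if any."""
    s, p, o = triple
    if p in inverses:
        return (o, inverses[p], s)
    return triple


from_john = link_to_triple("john.html", "rel", "father", "pa.html")
from_pa_rev = link_to_triple("pa.html", "rev", "father", "john.html")
from_pa_rel = link_to_triple("pa.html", "rel", "child", "john.html")

# Hypothetical inverse table. Pairing 'child' with 'father' follows the
# example above and is only approximate (the parent could be a mother).
inverses = {"child": "father"}
```

With rev, `from_pa_rev` is identical to `from_john` with no extra knowledge. Without it, `from_pa_rel` only lines up with `from_john` after someone has defined, and a consumer has loaded, the inverse table.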


cheers,

Dan

a personal ps.:
for some reason, rev= always made my head hurt slightly to even think 
about,  I guess because there are two senses of a reversed link: the 
reversed meaning of a link versus the idea of an incoming link / 
backlink, and the difference is simultaneously both obvious and subtle


Re: [whatwg] Absent rev?

2008-11-18 Thread Dan Brickley

Smylers wrote:

Martin McEvoy writes:


To be precise, the most commonly used value was rev=made, which is
equivalent to rel=author and thus was not a convincing use case. 

!! rel-author doesn't mean the same as rev-made eg:


In which cases doesn't it?  If A is the author of B then B was made by
A, surely?


Then B contributed to the creation of A, yes. Perhaps not on their own.

But we need it in the other direction too: can we conclude from { A made 
B } that { B author A } ?


Not if B isn't textual. Authorship is about writing, but there are many 
other avenues for human creativity (some of which result in things with 
URLs, eg. software, images, sounds).


So there are two complications here, and these are very real world 
issues, chewing up countless hours in projects like Dublin Core.


First is "a" versus "the". Nothing warrants reading "the" into 
rel=author. There might be other authors, listed or not listed in their 
own hyperlink. Or the page pointed to might be a collectively maintained 
page or group homepage etc. Or a mailto: for a mailing list.


Second is non-textual creations. The early Dublin Core specs had a 
dc:author property. This was changed back in 1996 or so to be 
dc:creator, since this better includes visual works, museum artifacts 
and so forth, ie. things that can be made, but which are not 
(postmodernism aside) conventionally considered texts. Authorship is a 
notion that doesn't make much sense in a non-textual context.


My point in previous mail about shifting work from HTML5 to elsewhere, 
is that this kind of distinction is subtle for many seemingly obvious 
pairs of relationship-type names, and that rev= is at least precise in 
its meaning.


cheers,

Dan

--
http://danbri.org/


Re: [whatwg] Absent rev?

2008-11-18 Thread Dan Brickley

Smylers wrote:

Dan Brickley writes:


Smylers wrote:


Martin McEvoy writes:


!! rel-author doesn't mean the same as rev-made eg:

In which cases doesn't it?  If A is the author of B then B was made by
A, surely?

Then B contributed to the creation of A, yes. Perhaps not on their own.

But we need it in the other direction too: can we conclude from { A made  
B } that { B author A } ?


Not if B isn't textual. Authorship is about writing, but there are
many other avenues for human creativity (some of which result in
things with URLs, eg. software, images, sounds).


Firstly, the term author can be used for at least some of those things;
definitely software.


Yes, 'software' was a bad example. But Dublin Core certainly did abandon 
the early term 'author' in favour of 'creator' after a workshop looking 
at requirements around images, museum artifacts and so on.



Secondly, if you think made is more generic than author, then surely
linking to such URLs with rel=made is an improvement on using
rev=author?


I don't associate 'being more generic' as a positive or a negative 
thing. Sometimes we want specificity, sometimes not. There is value in a 
'see also' relationship type; there is value in a 'schoolHomepage' 
relationship type too. Neither need be better.


If I wanted to find written works, then 'author' is a more relevant 
property than 'made'. If my concern is to find all the things created by 
some party, then 'made' may be more useful. My point was just that they 
have a different meaning (although much overlap).



First is "a" versus "the". Nothing warrants reading "the" into
rel=author.


So presumably also nothing warrants reading "the" into rel=made?


Yup. If syntactic context (eg. via RDFa) associated the string 'made' 
with a specific definition rather than just the English word, then of 
course that definition could say anything it wanted - such as 'sole 
maker of ...' , 'primary maker of', etc.



The early Dublin Core specs had a dc:author property. This was
changed back in 1996 or so to be dc:creator,


I agree that creator would be a better term than author.  But I think
that's irrelevant to needing rev.


Without rev, content creators (in every language) will need to go 
through this dance, hunting through dictionaries and debating 
subtleties, to make sure that they've identified a suitable pair of 
words such that { X word1 Y } is true if and only if { Y word1 X }. 
Which is why I see this in terms of division of labour. Cleaning it out 
of HTML5 makes work elsewhere...


cheers,

Dan

--
http://danbri.org/



Smylers






Re: [whatwg] Absent rev?

2008-11-18 Thread Dan Brickley

Dan Brickley wrote:

Without rev, content creators (in every language) will need to go 
through this dance, hunting through dictionaries and debating 
subtleties, to make sure that they've identified a suitable pair of 
words such that { X word1 Y } is true if and only if { Y word1 X }. 
Which is why I see this in terms of division of labour. Cleaning it out 
of HTML5 makes work elsewhere...


Sorry that should've been,

{ X word1 Y } is true if and only if { Y word2 X }.

Dan

ps. (since i'm mailing again, sorry) ... in an RDF/XML context, we had 
this issue in FOAF: we added 'depicts' alongside 'depiction' because the 
old RDF/XML syntax didn't deal well with inverses


Re: [whatwg] RDFa statement consistency

2008-08-29 Thread Dan Brickley

Henri Sivonen wrote:

On Aug 29, 2008, at 11:11, Julian Reschke wrote:


Henri Sivonen wrote:

I don't believe that is the case.
If I've understood history correctly, introducing Namespaces into XML 
was primarily a requirement stipulated by the RDF community. XML got


Pointer, please?


http://lists.w3.org/Archives/Public/semantic-web/2007Dec/0116.html


W3C Members (or invited experts with the right permissions) can read 
more of the back story in the original XML WG archives.


See http://lists.w3.org/Archives/Member/w3c-xml-wg/1998Jan/0034.html
'URGENT: Proposal to modify or delay XML 1.0 Recommendation', From: Jon 
Bosak, 12 Jan 1998. This points to a paper,

'Turning XML into a Universal Syntax for Web Data Formats'
http://www.w3.org/Member/Meeting/98JanAC/xml-req.html that was put 
before the Jan 1998 W3C AC Meeting in San Jose. I think it's reasonable 
to share the abstract here: "Concern is shared by members of the RDF, 
SMIL and Math working groups, and the W3C architecture domain staff, 
that the XML 1.0 Proposed Recommendation of 8Dec97 does not address the 
needs as a common base for the transmission of machine-understandable 
data."


cheers,

Dan


Re: [whatwg] RDFa Features

2008-08-27 Thread Dan Brickley

Kristof Zelechovski wrote:

This amounts to saying that URLs take precedence over CURIEs and CURIEs can
be enclosed in brackets in case of any ambiguity.  This sounds ridiculous
given the weight you put on avoiding ambiguities and name clashes.  Since
the author does not control the URL scheme registration process, he can
never be sure that a particular prefix is safe, therefore using unsafe
CURIEs is just asking for trouble.  However, Manu's examples DO NOT use safe
CURIEs, nor do any examples I have seen on this discussion.  Good heavens!


I agree. The "when there is any possibility of ambiguity" sentence is a 
bit weak. I don't know the CURIEs spec well; but for cases where the 
assumption is 'this URI scheme won't be registered', the assumption is 
dangerous.
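
A rough sketch of the kind of check an author or tool could run, in Python. The `known_schemes` set is supplied by the caller and is an assumption of this example; it is not the real scheme registry, and a prefix that is clear today can still clash with a scheme registered later, which is the point above.

```python
def clashing_prefixes(prefixes, known_schemes):
    """Return declared CURIE prefixes that are also known URI scheme names.

    URI scheme names are case-insensitive, so prefixes are compared in
    lower case against the caller-supplied `known_schemes` set.
    """
    schemes = {scheme.lower() for scheme in known_schemes}
    return sorted(p for p in prefixes if p.lower() in schemes)


declared = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "http": "http://example.org/oops#",  # the xmlns:http case
    "Tag": "http://example.org/tags#",
}
clashes = clashing_prefixes(declared, {"http", "mailto", "tag"})
```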


Dan


-Original Message-
From: Julian Reschke [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, August 27, 2008 5:33 PM

To: Kristof Zelechovski
Cc: 'Manu Sporny'; 'Ian Hickson'; 'WHAT-WG'; [EMAIL PROTECTED]
Subject: Re: [whatwg] RDFa Features

Kristof Zelechovski wrote:

You cannot support both CURIEs and URLs.  What happens when someone
declares xmlns:http?


http://www.w3.org/TR/curie/#sec_2.2..

BR, Julian






Re: [whatwg] RDFa

2008-08-26 Thread Dan Brickley

Ian Hickson wrote:

On Sat, 23 Aug 2008, Julian Reschke wrote:

Again you're confusing HTTP URLs with URIs.

Using URIs as identifiers allows lots of identification schemes other 
than HTTP, in particular ones that are not based on DNS, or that use 
DNS, but include a timestamp to address the concern of losing a domain 
name (tag URI scheme).


Sure, but most people use HTTP URIs anyway for namespaces.

You can use any URI or any system you want with class=. The key is just 
to make it unique enough that clashes won't happen. In practice, names 
like dc:title are actually quite unique enough. But people can use much 
more unique ones if desired, all the way to full URIs.


I'm certainly in favour of making mainstream namespace names prettier. 
But this design worries me, since it requires guesswork and heuristics 
on the part of consumer code to figure out if class = info.age or 
museum.acquisitionDate is intended as a URI or not. I'll air the worry 
first, and then sketch an approach that makes me worry less and which 
might have some of the characteristics that you value (such as not 
depending on separate xmlns-like declarations of abbreviations, and not 
being too ugly to look at).


You mentioned earlier that the RDFish practices around downloading and 
interpreting schemas from the Web is news to you. I'll take up an action 
to document some of the things we do in that area (eg. with SPARQL for 
data merging), probably as a blog post.


Doing so would help as background on my next point, which is that making 
it ambiguous whether a URI was declared is something that would need 
careful security review, to ensure that data consumers are aware that 
they should not expect property definitions found at the domain to be 
consistent with the intended meaning of the markup.


Sketch of a scenario:

1. Alice deploys class=creationDate.info1979/class to describe a 
museum artifact. She calls it this because it marks up some information 
about the creation date of some real world thing, and because 
'creationDate' is already in use for describing page creation dates, in 
the CSS library she's using.


2. Bob buys himself the Internet domain creationDate.info and wires up a 
webserver to respond with an RDFa schema defining creationDate as a 
sub-property of http://ecommerce.example.com/vocab#priceInEuros.


3. Charlie's code downloads Alice's markup, parses out the RDFa and, 
noticing that creationDate.info seems to be dereferenceable, goes to 
fetch the schema. For every triple 'x creationDate y' in the document, 
it also generates 'x ecom:priceInEuros y'. Perhaps Bob is selling 
other museum artifacts and wants to make Alice's look more expensive. Or 
cheaper. Or to make her data look corrupted so that certain consumers 
won't include her listing. Or maybe he wants to buy the item cheaply and 
is probing for bugs in Alice's online shopping system.
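
Step 3 can be sketched in a few lines of Python. This is only an illustration of the sub-property inference a consumer like Charlie's might apply, under the assumption that the fetched schema boils down to a property-to-super-property mapping; the subject name 'artifact42' and the data structures are made up for the example.

```python
def apply_subproperties(triples, sub_property_of):
    """Return the input triples plus those implied by sub-property statements.

    triples: set of (subject, property, value) tuples.
    sub_property_of: dict mapping a property to its declared super-property.
    Only one level of inference is applied; chains are not followed.
    """
    inferred = set(triples)
    for subject, prop, value in triples:
        if prop in sub_property_of:
            inferred.add((subject, sub_property_of[prop], value))
    return inferred

# Alice's data: she only meant "created in 1979".
alice_data = {("artifact42", "creationDate.info", "1979")}

# Bob's schema, served from the domain he bought.
bob_schema = {
    "creationDate.info": "http://ecommerce.example.com/vocab#priceInEuros",
}

result = apply_subproperties(alice_data, bob_schema)
```

After the call, the result also says the artifact has a price of 1979 euros, a statement Alice never made.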


In other words, the fact that Alice's markup only *appears* to be using 
an Internet domain opens her up to risk that someone will go buy that 
domain, and put a fake schema there which affects the likely 
interpretation of her markup. This exposure is increased by our 
uncertainty about ICANN strategy: we can't rely on the assumption that 
there are only a tiny handful of TLDs. We can probably rely on them 
being expensive at the top level, but not on having a hardcoded list 
enumerating them.


[[
Icann has announced it will allow the creation of any new top-level 
domains, albeit at a considerable cost.


As well as opening the door to an influx of new web addresses, Icann has 
also said that it will allow Japanese, Chinese, Arabic and Cyrillic 
characters to be used in registrations for the first time.


"It's a massive increase in the real estate of the internet. It will 
allow groups, communities and businesses to express their identities 
online," says Paul Twomey, chief executive of Icann, speaking to the Times.
]] 
http://www.pcpro.co.uk/news/208833/icann-creates-domain-name-freeforall.html



The RDF approach generally has been to make it very clear which chunks 
of data contain URIs, and whether they can be relative or not. Other 
markup systems have adopted a similar approach. These share the merit 
that it makes such ambiguity much less of a problem (although there are 
other attacks of course).


Lately I've been thinking that perhaps we can get something less ugly 
than "http://" in the markup, yet specify rules that allow expansion to 
http:// or https:// while keeping it clear whether the markup author 
really intends to cite some domain/page as vocabulary documentation.


For example <p>I'm <span property="info.foaf/age">1979</span> years old</p>
(if FOAF was documented at http://foaf.info/age and we specified the 
property attribute to use java-style names, and be declared relative to 
the http:// scheme).


Or <p>I'm <span property="foaf/age">1979</span> years old</p>
(if I spend $100k at ICANN to buy a tld 'foaf')

or pI'm span property=Com.xmlns.foaf.age1979/p 

Re: [whatwg] RDFa Problem Statement

2008-08-26 Thread Dan Brickley

Kristof Zelechovski wrote:

Web browsers are (hopefully) designed so that they run in every culture.  If
you define a custom vocabulary without considering its ability to describe
phenomena of other cultures and try to impose it worldwide, you do more harm
than good to the representatives of those cultures.  And considering it
properly does require much time and effort; I do not think you can have that
off the shelf without actually listening to them.
In a way, complaining that the Microformats protocol impedes innovation is
like saying 'we are big and rich and strong, so either you accommodate or
you do not exist'.  Not that I do not understand; it is straightforward to
say so and it happens all the time.
Chris


Let me give a quick example of how this works in RDFland.

Each vocabulary defines nothing except classes (types of thing) and 
properties (aka relationship types). In FOAF for example, we defined 
Person, Agent, Document, OnlineAccount, Project, Group as classes. And 
we defined properties too. These tend to have a bit more 'character' 
than the classes, and carry the distinctive style of each vocabulary. 
FOAF has properties of Person and Agent such as 'openid', 'homepage', 
'weblog' that have as their range (ie. values) instances of the class 
Document. We also define properties like 'primaryTopic' that relate a 
page primarily about something to the thing itself. Each class and 
property is considered to be in the vocabulary whose URI is 
http://xmlns.com/foaf/0.1/ ... and this is the basis of RDF's division 
of labour mechanism. See also a squiggly diagram at 
http://danbri.org/2008/foafspec/foafspec.jpg (apologies that this is 
currently inaccessible).


The SIOC project declares a bunch more classes and properties. Some of 
these are defined with relationship to Person, Document, OnlineAccount 
from FOAF; classes that sub-class ours, or properties that cite our FOAF 
classes as the range or domain. DOAP does the same, expanding from the 
class Project to describe opensource projects. I've talked about this 
before so won't go on about those schemas.


The point about cultural diversity, independent extension etc is made 
better by the JaUranai FOAF extension that appeared a few years back:


http://kota.s12.xrea.com/vocab/uranai

They decided that FOAF was nice and all but was lacking some properties 
important in a Japanese context. So they declare new RDF properties: 
starsign, bloodtype, and various others that I don't fully understand 
because they have Japanese names and documentation. From blood type's 
description from the RDF Schema file at 
http://kota.s12.xrea.com/vocab/uranai/uranai.rdf


<rdf:Property rdf:about="http://kota.s12.xrea.com/vocab/uranaibloodtype">
 <rdfs:label>血液型</rdfs:label>
 <rdfs:label xml:lang="en">Blood type</rdfs:label>
 <rdfs:comment>血液型を書きます。</rdfs:comment>
 <rdfs:comment xml:lang="en">A blood type.</rdfs:comment>
 <rdfs:domain rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
 <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
[...]
</rdf:Property>

This effectively wires in 'bloodtype' to the other classes in use in 
this wider community. Wherever SIOC or DOAP projects have created a 
property whose range is Person, we know that Uranai's 'bloodtype' 
property is also applicable. Without needing heavy duty coordination 
between the SIOC and DOAP authors and the author of Uranai.


Furthermore, the fact that all these projects share a common syntactic 
grammar means that I can simply add a Uranai 'bloodtype' property into 
my FOAF self-description, and expect each and every RDF parser and 
SPARQL database to immediately be able to parse and query it - see 
http://danbri.org/words/2008/02/25/286 for example. As Manu describes in 
http://blog.digitalbazaar.com/2008/08/23/html5-rdfa-and-microformats/ 
this is rather different to the Microformats.org approach, which is by 
intention a monolithic community designing a single, self-consistent 
product.
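
As a rough illustration of why a shared data model makes this mixing cheap, here is a small Python sketch that treats RDF data as plain (subject, property, value) tuples. It is not a real RDF or SPARQL library; the subject URI, the name and homepage values, and the blood type value are all hypothetical, and the bloodtype property URI is simply copied as it appears in the schema excerpt quoted earlier.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
# Property URI copied as it appears in the schema excerpt quoted earlier.
BLOODTYPE = "http://kota.s12.xrea.com/vocab/uranaibloodtype"
ME = "http://example.org/people#me"  # hypothetical subject URI

def match(triples, s=None, p=None, o=None):
    """Return sorted triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# A FOAF self-description (values are made up).
foaf_part = {
    (ME, FOAF + "name", "Example Person"),
    (ME, FOAF + "homepage", "http://example.org/"),
}

# One extra statement using the independently defined vocabulary.
uranai_part = {(ME, BLOODTYPE, "A")}

# Merging is just set union; the query code needs no changes.
merged = foaf_part | uranai_part
about_me = match(merged, s=ME)
blood = match(merged, p=BLOODTYPE)
```

The query function knows nothing about FOAF or Uranai; adding a third vocabulary would mean adding more tuples, not more parser code.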


Back on my point that RDF vocabulary classes (ie. named types of thing, 
Person etc) tend to be boring, and the properties more interesting. This 
is to address the difficulty you mention, ie. "... If you define a 
custom vocabulary without considering its ability to describe phenomena 
of other cultures and try to impose it worldwide, you do more harm than 
good to the representatives of those cultures."


So for example in FOAF, we define fairly boring, bland classes (like 
Person, Document) in a way that allows different cultures to attach 
properties that they care about. It seems bloodtype is more important 
in Japanese culture than in Western Europe, and the toolset and 
design provided by RDFa allow independent extension of FOAF in Japan 
without expensive central bottlenecks. For Creative Commons, they have 
huge headaches because copyright law varies from country to country; 
this has informed their redesign and their enthusiasm for RDFa.


Hope this helps explain something of where RDFa folk are coming from,

Re: [whatwg] RDFa Problem Statement

2008-08-26 Thread Dan Brickley

Ben Adida wrote:

Greg Houston wrote:

I am not sure if Ben was alluding to this in the last paragraph, but to
further complicate things SearchMonkey is not actually using RDF,


I think you're confusing two different layers.

SearchMonkey parses HTML with microformats, and soon HTML+RDFa, and
makes that data available in RDF form to PHP scripts that you or anyone
else can write.


It does just this today, from actual RDFa. I've been working on an 
extension that integrates RDFa from the matched pages with additional 
information from external DataRSS (Atom+OpenSearch+RDFa) feeds.


cheers,

Dan

--
http://danbri.org/



Re: [whatwg] RDFa Problem Statement

2008-08-26 Thread Dan Brickley

Dan Brickley wrote:


Ben Adida wrote:

Greg Houston wrote:

I am not sure if Ben was alluding to this in the last paragraph, but to
further complicate things SearchMonkey is not actually using RDF,


I think you're confusing two different layers.

SearchMonkey parses HTML with microformats, and soon HTML+RDFa, and
makes that data available in RDF form to PHP scripts that you or anyone
else can write.


It does just this today, from actual RDFa. I've been working on an 
extension that integrates RDFa from the matched pages with additional 
information from external DataRSS (Atom+OpenSearch+RDFa) feeds.


A bit more information from Peter Mika at Yahoo (fwd'd with permission):
[[
the key point... is that indeed DataRSS is both Atom and RDFa 
compatible. RDFa is a set of attributes, we merely invented names for 
the XML elements that carry them... but you can completely ignore that 
and get the triples out by running an RDFa parser over it. OpenSearch is 
another extension you can add in the mix if you want.


We turn both microformats and RDFa-in-HTML into DataRSS when used as 
input for applications so that SearchMonkey applications can abstract 
away from the original format.


We are definitely not Microsoft doing JavaScript, since we are extending 
formats in the way they were foreseen (Atom extensibility) and complying 
with standards (RDFa) without adding to them or changing the meaning of 
constructs. So this is a genuine Semantic Web standards play.


Btw, we haven't announced RDFa support officially because we want to get 
it 100% right before we do... ok maybe 99% ;)

]]

cheers,

Dan

ps. http://labs.mozilla.com/2008/08/introducing-ubiquity/ is a nice case 
for in-page structured data, whether microformatty/posh or rdfa


Re: [whatwg] RDFa

2008-08-25 Thread Dan Brickley

Tab Atkins Jr. wrote:
On Sun, Aug 24, 2008 at 3:10 PM, Julian Reschke [EMAIL PROTECTED] wrote:


Tab Atkins Jr. wrote:

The point was made before that html5 already has extensive
extension mechanisms in place that can address the particular
needs of various communities without requiring it to be written
explicitly into the spec.  I know you've said that your team has
reviewed the extension mechanisms and found them lacking, but
could you explain why it is insufficient to use
@data-rdf-property, @data-rdf-about, etc.?  I ask
about these specifically because my mail timestamps show that the
@data-* class of attributes was introduced April 10th of this year,
while the ccRel submission is dated May 1st, and thus it's very
likely that these were impossible to consider during your review of
existing extension mechanisms.
...

"Custom data attributes are intended to store custom data private to
the page or application, for which there are no more appropriate
attributes or elements." -- http://www.w3.org/html/wg/html5/#custom


I'm confused.  Are you trying to imply that my suggestion is somehow 
against the spec definition?  If so, please accompany your quoting of 
the spec with an actual explanation of your point.  I cannot respond to 
you when I essentially have to imagine your entire argument for myself 
first.


My homepage at http://danbri.org/ is XHTML / RDFa and has data in RDFa 
attributes. I'd like to do this in HTML5 +RDFa instead, so I can take 
advantage of the other new features in HTML5. However the data is very 
much not private to the page, but designed to be used by a broad range 
of consumers. For example, Yahoo's SearchMonkey, or Google's Social 
Graph API. The use of RDF namespaces in that data indicates that we're 
using shared public schemas, rather than private islands of 
application-specific data.


Perhaps if the Web itself is considered an application, then this is 
application-specific data.


cheers,

Dan

--
http://danbri.org/




Re: [whatwg] RDFa

2008-08-23 Thread Dan Brickley

Ben Adida wrote:

Ian Hickson wrote:



Why would it scale any less than URIs? That's basically all URIs are.


Why would you reinvent URIs in a way that they can't be de-referenced?
Is that really a good design, in your opinion?

and it's extremely web-unfriendly, since you can't look up a concept to 
figure out what it might mean.

Sure you can. Just search for it on a search engine.


That's sort of good for humans, and that's assuming there's no bug in
the search engine algorithm where you get, say, Google-bombed. I'm not
sure a web design should be predicated on the existence of Google,
especially when it's not clear that Google will always be able to index
the entire web (it's not clear Google indexes the entire web even today.)


We can reasonably assume the existence of large search engines covering 
a good part of the public Web. Google being a well known example. But we 
can't necessarily assume their owners will offer reliable 
machine-friendly APIs to that data, with terms of service that are 
sufficiently unconstrained.


Google for example switched off machine access via SOAP in favour of an 
AJAX-based approach back in 2006:


http://code.google.com/apis/soapsearch/
[[
Google SOAP Search API
As of December 5, 2006, we are no longer issuing new API keys for the 
SOAP Search API. Developers with existing SOAP Search API keys will not 
be affected.
Depending on your application, the AJAX Search API  may be a better 
choice for you instead. It tends to be better suited for search-based 
web applications and supports additional features like Video, News, 
Maps, and Blog search results.

]]

We went backwards, from a situation where machines could do lookups 
against the Google index, to http://code.google.com/apis/ajaxsearch/ 
which seems really much more focussed on customisation of human-facing 
Web content. That's really cool but doesn't help with "Just search for 
it on a search engine" if you're building things outside the browser.


Now it turns out we can still do programmatic searches, because the AJAX 
API does offer a JSON interface, see 
http://code.google.com/apis/ajaxsearch/documentation/#fonje


eg.: curl -e http://www.my-ajax-site.com \
'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=Paris%20Hilton'


...however http://code.google.com/apis/ajaxsearch/terms.html warns that 
"The API is limited to allowing You to host and display Google Search 
Results on your site" ... "The API may be used only for services that 
are accessible to your end users without charge." ...
"You agree that you will not, and you will not permit your users or 
other third parties to: (a) modify or replace the text, images, or other 
content of the Google Search Results, including by (i) changing the 
order in which the Google Search Results appear, (ii) intermixing Search 
Results from sources other than Google, or (iii) intermixing other 
content such that it appears to be part of the Google Search Results; or 
(b) modify, replace or otherwise disable the functioning of links to 
Google or third party websites provided in the Google Search Results."


...the constraints are significant. And may change at any time (just as 
things changed for users of the old SOAP API).


Perhaps the miscommunication we have here is that when Ian says "Sure 
you can. Just search for it on a search engine." he's assuming a human 
"you", but RDFa people are thinking of this as a scripted operation too, 
because we know that machine-readable RDF/RDFa vocabulary descriptions 
exist that make it easier to find equivalencies between the classes and 
properties used in our data.


cheers,

Dan

--
http://danbri.org/


Re: [whatwg] RDFa

2008-08-23 Thread Dan Brickley

+cc: Paul Miller of Talis, who worked on the AHDS report mentioned below.

Henri Sivonen wrote:

On Aug 23, 2008, at 02:43, Ben Adida wrote:


Why would you reinvent URIs in a way that they can't be de-referenced?


To avoid having misleading affordances.
http://en.wikipedia.org/wiki/Affordance

We want one parser, with variability and innovation in the vocabulary 
definition only.


Having one parser seems appealing compared to using the native 
mechanisms of each of HTML (meta, link), PDF (document information 
dictionary), PNG (tEXt chunk), etc. at first, but the vision that tools 
handle this all when you remix culture already requires the tools to 
support reading and writing the file formats they remix. When you 
already have format-native key-value read/write capability, the ability 
to build and mine RDF *graphs* becomes an additional burden.


It may not be obvious to those who haven't followed the history, or who 
were at school at the time, but many of us did indeed invest a lot of 
time and effort using name/value metadata structures in HTML. For 
example, the Dublin Core project began with this technology base 
beginning back in 1994/5, and the experience of metadata implementors 
using it was one of the drivers for the creation of RDF. At the time 
there was no WHATWG to talk to, but the metadata community *did* talk to W3C.


See http://dublincore.org/about/history/

Early on, the Dublin Core community found a lot of pressure for 
feature-creep: new elements/terms to address the needs of various groups 
who liked Dublin Core, but wanted some specifics added. This situation 
gave rise to the 'Warwick Framework', defined in 1996 - 
http://www.dlib.org/dlib/july96/lagoze/07lagoze.html

[[
 While there was consensus among the attendees that the concept of a 
simple metadata set is useful, there were a number of fundamental 
questions concerning the real utility of the Dublin Core as it was 
defined at the end of the preceding workshop. Does the very loosely 
defined Dublin Core really qualify as a standard that can be read and 
processed programmatically? Should the number of the core elements be 
expanded, to increase semantic richness, or reduced, to improve 
ease-of-use by authors and/or web publishers? Will authors reliably 
attach core metadata elements to their content? Should a core metadata 
set be restricted to only descriptive cataloging information or should 
it include other types of metadata such as administrative information, 
linkage data, and the like? What is the relationship of the Dublin Core 
to other developing work in metadata schemes, particularly in those 
areas such as rights management information (terms and conditions)?


The workshop attendees concluded that the answer to these questions and 
the route to progress on the metadata issue lay in the formulation of a 
higher-level context for the Dublin Core. This context should define how 
the Core can be combined with other sets of metadata in a manner that 
addresses the individual integrity, distinct audiences, and separate 
realms of responsibility of these distinct metadata sets.

]]

For an implementor report typical of the experience from this era, ie. 
with name/value pairs, see the UK Arts and Humanities Data Service 
document http://ahds.ac.uk/public/metadata/discovery.html which was 
presented at the Oct'97 Helsinki workshop of the Dublin Core. At the 
time I was involved with the ROADS internet cataloguing project and can 
vouch that we hit a similar ceiling with attribute/value metadata.


From the appendix, http://ahds.ac.uk/public/metadata/disc_09.html ... 
here are some of the attribute/value structures they were forced to squash 
their metadata records into.



DC.creator.corporateName.1
Canterbury Archaeological Trust

DC.creator.phone.1
+44 227 462062


DC.creator.personalName.2
Paul Miller

DC.creator.affiliation.2
Archaeology Data Service

...this expresses name, affiliation and contact information for a number 
of contributors to a work. Another example describes several 
contributors along with their roles (actor, director, etc). Again the 
attribute/value representations contained numeric indexes 
('DC.creator.role.9') to disambiguate which individual was being described.
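
To show what those numeric indexes cost a consumer, here is a minimal Python sketch that regroups the flat pairs above into one record per creator. The naming pattern it assumes ('DC.creator.<field>.<n>') is inferred from the four examples quoted here, not taken from any specification, and the function name is made up.

```python
from collections import defaultdict

def group_indexed(pairs):
    """Regroup flat 'DC.creator.<field>.<n>' pairs into one record per index n."""
    records = defaultdict(dict)
    for name, value in pairs:
        # The last two dotted components carry the field name and the index.
        *_, field, index = name.split(".")
        records[int(index)][field] = value
    return dict(records)

flat = [
    ("DC.creator.corporateName.1", "Canterbury Archaeological Trust"),
    ("DC.creator.phone.1", "+44 227 462062"),
    ("DC.creator.personalName.2", "Paul Miller"),
    ("DC.creator.affiliation.2", "Archaeology Data Service"),
]

creators = group_indexed(flat)
```

Every consumer has to know this convention in advance to recover the structure; in a graph model the grouping is explicit in the data rather than encoded in the attribute names.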




What barrier is there to building reusable vocabularies?


The follow-your-nose principle is missing, which is fairly essential for
discovering the meaning of vocabularies (partially automatically, not by
doing a Google search.)


The partial automation with RDFa doesn't go very far. If a program 
automatically dereferences http://creativecommons.org/ns# and parses the 
result as RDFa, the program now has a human-readable string for each 
property--not exactly something that the program can act on further 
without human help.



Looking at this example,

  <div id="license" about="#license" typeof="rdf:Property">
  <h4>cc:license</h4>
  A <a rel="rdfs:domain" href="#Work">Work</a> <span 

Re: [whatwg] RDFa

2008-08-23 Thread Dan Brickley

Kristof Zelechovski wrote:

It seems to me identification and description of various entities is best
achieved with LDAP which is hierarchical by design.  Why wasn't LDAP adopted
for the purpose, given that it is older, widely used and well understood?


Work began on LDAP (a simplification from X.500) in 1993; and on Dublin 
Core (in some ways a simplification of longstanding library cataloguing 
methods for the Web) in 1994. We might equally ask why it didn't use SGML 
(it did) or XML (it did that too, after it was invented). There was work 
on exploring the use of LDAP and X.500 to address Dublin Core's needs, 
eg. see http://tools.ietf.org/html/draft-hamilton-dcxl-02 although it 
never really caught the world on fire. Why, is probably related to the 
larger question of why the Web evolved as a technology stack on top of 
IETF/internet specs rather than on top of X.500 or other work from that 
world...


Dan

--
http://danbri.org/


Re: [whatwg] Creative Commons Rights Expression Language

2008-08-22 Thread Dan Brickley

Bonner, Matt wrote:

On Wed, Aug 20, 2008 at 5:22 PM, Bonner, Matt wrote:

Hola,

I see that the Creative Commons has proposed additions to HTML
to support licenses (ccREL):
http://www.w3.org/Submission/2008/SUBM-ccREL-20080501/
 ...



Tab Atkins Jr. replied:
The whole thing would be best expressed as a microformat, as the
entire thing can be made just as machine- and human-readable without
having to introduce an entire new addition to html.  I think someone
is a little confused about the importance of CC...


then Dan Brickley wrote:

I encourage you to (re)-read
http://www.w3.org/Submission/2008/SUBM-ccREL-20080501/ ... the spec
explains that all of CC's concrete markup requirements are addressed
by the HTML additions in the RDFa spec. It does not propose *any* new
HTML markup to address CC's specific needs. 


(big snip)


In other words, adding 'about', 'property', 'resource', 'datatype' and
'typeof' and a namespace-URI association convention to HTML5 ...


Just so I understand you, are you saying that attributes aren't markup?
Because first you say no new markup, then you list 5 attributes to add.


Ah, sorry for the unclarity. Attributes are markup. The sentence comes 
as a whole: I meant that ccREL proposes no new *CC-specific* attributes 
or elements. They get their job done using general RDFa markup.


Second, the Introduction cites RDFa, which footnote 4 describes as "an 
emerging collection of attributes and processing rules for extending 
XHTML to support RDF."  However, the Introduction text and example go
on to talk about HTML.  Independent of any other discussions, I think it
behooves the authors to clarify their intent. Is this for XHTML, HTML or 
both?


Yes, this could be clearer. The group's general line (Ben feel free to 
correct me) is that this attribute-driven markup style is intended to be 
largely neutral of its 'carrier' format, but that RDFa-in-XHTML is the 
only version that is fully specified with implementor tests etc 
underway. For this markup to work in other XML languages would require 
some more work; for it to be deployed in non-XML HTML (HTML5 etc) 
requires even more. But the general notion is that these attributes 
could be deployed in SVG-based, HTML5/6-based etc. languages too, ie. 
that this isn't a project tightly bound to (some specific version of) 
XHTML. Of course in a non-XML context, some other mechanism is needed 
(eg. link rels) to associate abbreviations with URLs.


Also in http://www.w3.org/TR/rdfa-syntax/ (now in CR at W3C, 
http://www.w3.org/TR/2008/CR-rdfa-syntax-20080620/)

[[
RDFa is a specification for attributes to be used with languages such as 
HTML and XHTML to express structured data. [...] This document only 
specifies the use of the RDFa attributes with XHTML.

]]

Does that help?

cheers

Dan

--
http://danbri.org/


Re: [whatwg] Creative Commons Rights Expression Language

2008-08-21 Thread Dan Brickley

+cc: Ben Adida

Tab Atkins Jr. wrote:
On Wed, Aug 20, 2008 at 5:22 PM, Bonner, Matt [EMAIL PROTECTED] wrote:


Hola,

I see that the Creative Commons has proposed additions to HTML
to support licenses (ccREL):
http://www.w3.org/Submission/2008/SUBM-ccREL-20080501/

As an example, they offer:

<div about="http://lessig.org/blog/"
     xmlns:cc="http://creativecommons.org/ns#">
   This page, by
   <a property="cc:attributionName" rel="cc:attributionURL"
      href="http://lessig.org/">
     Lawrence Lessig
   </a>,
   is licensed under a
   <a rel="license" href="http://creativecommons.org/licenses/by/3.0/">
     Creative Commons Attribution License
   </a>.
</div>

Unless I missed something in the HTML5 spec, at the least this would add
the property attribute to <a>.  Wouldn't ccREL be expressed better
using <link> instead of <a>?

Matt
--
Matt Bonner
Hewlett-Packard Company


The whole thing would be best expressed as a microformat, as the entire 
thing can be made just as machine- and human-readable without having to 
introduce an entire new addition to html.  I think someone is a little 
confused about the importance of CC...


(Note: the "someone" is not you, Matt, but the drafters of this proposal.  
Also, I love CC as much as the next guy, but there's absolutely no 
reason to extend html to accommodate it, as everything they want to 
express can be done in existing html and formatted as a microformat.)


I encourage you to (re)-read 
http://www.w3.org/Submission/2008/SUBM-ccREL-20080501/ ... the spec 
explains that all of CC's concrete markup requirements are addressed by 
the HTML additions in the RDFa spec. It does not propose *any* new HTML 
markup to address CC's specific needs. Instead, they're telling the 
world that CC's needs (including their own requirement for independent 
extensions) are well-handled by RDFa.


RDFa adds a set of attributes; 
http://www.w3.org/MarkUp/2008/ED-rdfa-syntax-20080403/#rdfa-attributes 
has a full list. The ccREL spec shows these in XHTML+RDFa 
format. There's a strong case to add them to HTML5 too, in my view.


In other words, adding 'about', 'property', 'resource', 'datatype' and 
'typeof' and a namespace-URI association convention to HTML5 wouldn't 
merely be addressing the important needs of the Creative Commons 
community. It would allow the expression of properties defined by any 
decentralised community, without the need for central coordination. This 
includes not just CC, but every group worldwide who are extending and 
customising CC for their own needs. Not just FOAF, but groups extending 
it for modelling forum posts and social media (eg. SIOC), or opensource 
projects (DOAP). Not just Dublin Core, but the huge range of projects 
that extend it to handle educational metadata (which itself varies 
nationally), rights, aggregation, classification etc. The addition of 
the RDFa attributes would allow HTML5 to carry structured data expressed 
in all/any of these vocabularies.


The Microformats.org community have done wonderful work and have 
inspired many others, but it is unfair on them (and unrealistic) to 
pressure their community, mailing lists and wiki by expecting their 
process to be a central bottleneck for all markup extensions to HTML. 
The Web serves a massive and fast growing community, many of whom don't 
speak English and whose markup needs aren't core business for 
Microformats.org. By using RDFa and associating each vocabulary with a 
URI, we can spread the workload a bit more evenly.


Note also that every new vocabulary initiative at Microformats.org 
creates real and non-trivial work for parser writers, as well as work 
for vocabulary authors in specifying what it means to mix each pair of 
vocabularies. For ccREL (and FOAF, Dublin Core, SIOC, DOAP, ...), this 
is largely handled by RDF/RDFa: it can be freely mixed with any other 
RDF vocabulary, and reliably parsed by generic parser code. The tradeoff 
here is that the markup is less hand optimised for beauty than with 
microformats. (When extra-pretty custom markup is important, RDF 
provides GRDDL as a way of using XSLT to specify a mapping into its 
common data model.)
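
As a rough sketch of what "generic parser code" means here, the following Python uses the standard library's html.parser to pull triples out of the ccREL example. It is a deliberately tiny subset and not a conforming RDFa processor: it assumes prefixes are declared with xmlns: attributes, that a single 'about' sets the subject, that elements carrying 'property' are not nested inside same-named elements, and it leaves reserved keywords such as 'license' unexpanded. The class name is made up.

```python
from html.parser import HTMLParser

class TinyRDFa(HTMLParser):
    """Collects (subject, predicate, object) triples from a small RDFa subset."""

    def __init__(self):
        super().__init__()
        self.prefixes = {}    # prefix -> namespace URI, from xmlns:* attributes
        self.subject = ""     # current subject, set by @about
        self.triples = []
        self._pending = None  # (tag, predicate, text parts) for an open @property

    def _expand(self, curie):
        # Expand "prefix:ref" if the prefix is declared; otherwise leave as-is.
        prefix, sep, ref = curie.partition(":")
        if sep and prefix in self.prefixes:
            return self.prefixes[prefix] + ref
        return curie

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        for name, value in attrs.items():
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value
        if "about" in attrs:
            self.subject = attrs["about"]
        if "rel" in attrs and "href" in attrs:
            self.triples.append(
                (self.subject, self._expand(attrs["rel"]), attrs["href"]))
        if "property" in attrs:
            self._pending = (tag, self._expand(attrs["property"]), [])

    def handle_data(self, data):
        if self._pending:
            self._pending[2].append(data)

    def handle_endtag(self, tag):
        if self._pending and tag == self._pending[0]:
            _, predicate, parts = self._pending
            text = " ".join("".join(parts).split())  # normalise whitespace
            self.triples.append((self.subject, predicate, text))
            self._pending = None

doc = '''<div about="http://lessig.org/blog/"
     xmlns:cc="http://creativecommons.org/ns#">
  This page, by
  <a property="cc:attributionName" rel="cc:attributionURL"
     href="http://lessig.org/">Lawrence Lessig</a>,
  is licensed under a
  <a rel="license" href="http://creativecommons.org/licenses/by/3.0/">
    Creative Commons Attribution License</a>.
</div>'''

parser = TinyRDFa()
parser.feed(doc)
parser.close()
```

Nothing in the class mentions Creative Commons; the same code would extract FOAF, SIOC or DOAP properties from a page that declared those namespaces.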


For more on RDFa, see the primer, http://www.w3.org/2006/07/SWD/RDFa/primer/

For a microformat parser that also handles RDFa, see 
http://buzzword.org.uk/cognition/ ... or, for an RDF toolkit that also 
parses some popular microformats, see http://arc.semsol.org/


For RDFa parsing in Javascript, see 
http://www.w3.org/2006/07/SWD/RDFa/impl/js/


cheers,

Dan

ps. my slides from a recent talk on rdf and microformats are here, if 
anyone's interested. It's more about how enthusiasts from each effort 
can learn from each other, than about the technical detail: 
http://www.slideshare.net/danbri/one-big-happy-family/ via 
http://microformats.eventwax.com/vevent



--
http://danbri.org/




Re: [whatwg] Question about the PICS label in HTML5

2008-04-24 Thread Dan Brickley

Anne van Kesteren wrote:

On Thu, 17 Apr 2008 11:06:46 +0200, Dan Brickley [EMAIL PROTECTED] wrote:

  http://wiki.whatwg.org/wiki/RelExtensions


Erm, 'For the Status section to be changed to Accepted, the 
proposed keyword must have been through the Microformats process, and 
been approved by the Microformats community. '


Is that really so?


That's the current proposal. I personally think a W3C Recommendation 
backing it should be enough as well.


If these drafts are destined for W3C specs, then yes, please make that 
change to your process. Microformats.org should be one of several 
in-routes here.


cheers,

Dan

--
http://danbri.org/



Re: [whatwg] Question about the PICS label in HTML5

2008-04-17 Thread Dan Brickley

Anne van Kesteren wrote:

On Thu, 17 Apr 2008 10:37:30 +0200, Phil Archer [EMAIL PROTECTED] wrote:

What do we need for HTML 5?

Just the link/rel element. A POWDER link will be something like

<link rel="powder" href="powder.xml" type="application/xml" />
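As a rough illustration of how a consumer might discover such a link, here is a sketch using only Python's standard html.parser module. The class and function names (LinkRelFinder, find_links) are hypothetical, and the "powder" keyword is simply the one proposed in this thread; nothing here is taken from the POWDER drafts themselves.

```python
from html.parser import HTMLParser


class LinkRelFinder(HTMLParser):
    """Collect href values of <link> elements carrying a given rel keyword."""

    def __init__(self, rel):
        super().__init__()
        self.rel = rel.lower()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing <link ... /> via handle_startendtag.
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel is a space-separated list of keywords, compared case-insensitively.
        keywords = (attributes.get("rel") or "").lower().split()
        if self.rel in keywords and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def find_links(html_text, rel):
    finder = LinkRelFinder(rel)
    finder.feed(html_text)
    finder.close()
    return finder.hrefs


sample = '<head><link rel="powder" href="powder.xml" type="application/xml" /></head>'
found = find_links(sample, "powder")
```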


If the POWDER WG defines the "powder" relationship and adds "powder" 
to the following Wiki page as a proposal, that should be enough (with a 
pointer to the definition):


  http://wiki.whatwg.org/wiki/RelExtensions
Erm, 'For the Status section to be changed to Accepted, the proposed 
keyword must have been through the Microformats process, and been 
approved by the Microformats community. '


Is that really so?

Dan

--
http://danbri.org/



Re: [whatwg] Administrivia: new member in the oversight committee

2008-03-31 Thread Dan Brickley

Ian Hickson wrote:

On Sun, 30 Mar 2008, Dan Brickley wrote:
  

Ian Hickson wrote:


FYI, Anne van Kesteren was just invited to join the WHATWG membership (as
defined by our charter, basically that's the small group of people whom I
have to answer to in my role as editor). He was invited due to his long
involvement in the WHATWG. This oversight group doesn't do much and this
won't really change anything; basically the group is there to make sure I
don't become evil and biased somehow, and to help direct the group should we
decide to take on some new project.
  
Does the committee have a mailing list? Where do they discuss things? 
Any papertrail?



There's no public accountability for this group, no. It's roughly 
equivalent to W3C staff, except that it is not a paid position.
  
W3C staff report through a variety of documented means to their 
stakeholders (including at regular events, Web Conference, TPs etc), 
they have named and documented roles grounded in the W3C Process, a 
class of document for airing their proposals to the wider community 
(Team notes) as well as strong internal-transparency via extensive 
internal email, cvs and irc logging so that new team-members can have 
access to previous discussions.


Is this the equivalence you have in mind?

W3C staff as a group culture (nothing personal here; I was one myself 
for years) also have a tendency to be a little over-secretive, insular, and 
too often slip into thinking of themselves as having to heroically 
figure out what to do internally before presenting an external opinion. 
Get a tight-knit, smart and distributed group of people together with a 
sense of mission, and that's a hard trait to avoid.


I hope you'll lean towards the public accountability side of things here.

See also:
   http://www.whatwg.org/charter

  
Thanks, interesting. Is a version history and change-log available, 
beyond what can be discerned from 
http://web.archive.org/web/*/http://www.whatwg.org/charter ?
From the outside it is hard to understand how the charter has evolved 
over time.


cheers,

Dan

--
http://danbri.org/


Re: [whatwg] Administrivia: new member in the oversight committee

2008-03-30 Thread Dan Brickley

Hi Ian,

Ian Hickson wrote:
FYI, Anne van Kesteren was just invited to join the WHATWG membership 
(as defined by our charter, basically that's the small group of people 
whom I have to answer to in my role as editor). He was invited due to his 
long involvement in the WHATWG. This oversight group doesn't do much and 
this won't really change anything; basically the group is there to make 
sure I don't become evil and biased somehow, and to help direct the group 
should we decide to take on some new project.
  
Does the committee have a mailing list? Where do they discuss things? 
Any papertrail?


cheers,

Dan

--
http://danbri.org/


Re: [whatwg] Video codec requirements changed

2008-01-07 Thread Dan Brickley

[snip]

How about this permathread gets a @whatwg.org mailing list all of its own?

Just a suggestion...

dan


Re: [whatwg] sarcasm

2007-04-24 Thread Dan Brickley

Elliotte Harold wrote:
It occurs to me that one of the most frequently used nits of 
pseudo-markup is to indicate sarcasm. For example,


<sarcasm>Yeah, George W. Bush has been such a great president.</sarcasm>

Should we perhaps formalize this? Is there any benefit to be achieved by 
adding an explicit sarcasm element to HTML?


Seems rather culturally specific. I found from living in Boston for a 
while, that a British sense of humour often seems harsher and more 
sarcastic to our gentle US cousins. So I wouldn't burn this into an 
element name.


Some way of citing externally maintained lists might be nice, eg.
see work of http://www.w3.org/2005/Incubator/emotion/charter

The mission of the Emotion Incubator Group, part of the Incubator 
Activity, is to investigate the prospects of defining a general-purpose 
Emotion annotation and representation language, which should be usable 
in a large variety of technological contexts where emotions need to be 
represented.


cheers,

Dan


Re: [whatwg] video, object, Timed Media Elements -- Part I SMIL

2007-03-22 Thread Dan Brickley

Martin Atkins wrote:

ddailey wrote:


On Thu, 22 Mar 2007 13:03:24, Anne van Kesteren wrote

1. why not just include SMIL as a part of HTML, much in the same way 
that it is integrated with SVG? It is an existing W3C reco.


Reasons for not using t:video were that it was 1) complicated and 
2) not used.


Thanks Anne... Is there some easy way to resurrect prior discussions 
of this from the archives somewhere? I would like to try to understand 
the reasoning here. SMIL doesn't seem complicated to me -- declarative 
animation is rather charming and the complicatedness is cognitively 
less demanding than scripting. Its popularity will probably be 
synergized by rather dramatic increases in use of SVG.




SMIL solves problems far greater than the current aim of <video>, which 
is a much more modest goal of just being able to embed video 
interoperably in an HTML document.


If you want to do all that fun SMIL stuff, then why not just use SVG? It 
already does it all. <video> for the simple use cases and SVG+SMIL for 
the complicated ones doesn't seem too bad a compromise to me.


I've not followed it, ... but there's a SMIL subset integrated with 
XHTML at http://www.w3.org/TR/XHTMLplusSMIL/ ... if you find SMIL too 
large, perhaps this or another profile is less intimidating?


Dan



Re: [whatwg] video: togglePause() versus pause()

2007-03-18 Thread Dan Brickley

Alexey Feldgendler wrote:
On Sun, 18 Mar 2007 22:09:02 +0100, Magnus Kristiansen 
[EMAIL PROTECTED] wrote:


I just played some more with our internal implementation (Opera's) 
and noticed that our pause() really is like togglePause() in the 
HTML5 proposal. Looking at the specification I don't see much need 
for pause() there. Perhaps togglePause() should just become pause() 
and pause() be removed?


I would suggest the opposite. For basic actions like play and pause, 
play() and pause() are the most natural options. I question whether we 
need a command to toggle between play/pause at all. Any UI which uses 
a combined play/resume button has to know which state it is in, so it 
already knows which command is relevant.


+1

What's good for UI (a play/pause toggle button) isn't necessarily good 
for API. play() should only start playback (and do nothing if it's 
already playing), pause() should only pause (and do nothing if it's 
stopped). The spec also mentions a property to find out the current state.


This is an important point. Pause UI is a well known slippery issue 
(state vs action). An API shouldn't dictate the UI...
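The state-versus-action distinction can be sketched as a small model. This is not the HTML5 media API; it is a hypothetical Python stand-in (Player, State and toggle_button_clicked are invented names) showing idempotent play() and pause() plus a readable state, with the toggle living in the UI layer rather than in the API.

```python
from enum import Enum


class State(Enum):
    PAUSED = "paused"
    PLAYING = "playing"


class Player:
    """Hypothetical model of a media element with idempotent commands."""

    def __init__(self):
        self.state = State.PAUSED
        self.transitions = 0  # counts real state changes only

    def play(self):
        # Does nothing if playback is already running.
        if self.state is not State.PLAYING:
            self.state = State.PLAYING
            self.transitions += 1

    def pause(self):
        # Does nothing if playback is already paused.
        if self.state is not State.PAUSED:
            self.state = State.PAUSED
            self.transitions += 1


def toggle_button_clicked(player):
    """UI-level toggle built from the state property and the two commands."""
    if player.state is State.PLAYING:
        player.pause()
    else:
        player.play()


player = Player()
player.play()
player.play()  # second call is a no-op
toggle_button_clicked(player)  # UI decides, based on the current state
```

With this split, a script that calls pause() twice cannot accidentally resume playback, which is the hazard of exposing only a toggle.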


Dan


Re: [whatwg] W3C restarts HTML effort

2007-03-07 Thread Dan Brickley

Ian Hickson wrote:
The W3C today publicly announced that they are restarting an HTML 
specification effort.


   http://www.w3.org/2007/03/html-pressrelease

This is great news and a clear validation of the WHATWG effort, which has 
been leading the maintenance and development of HTML since 2004. I'd like 
to congratulate everyone who has been involved in the WHATWG work, this 
really confirms that we have been doing good work.


Surprisingly, the W3C never actually contacted the WHATWG during the 
chartering process. However, the WHATWG model has clearly had some 
influence on the creation of this group, and the charter says that the W3C 
will try to actively pursue convergence with WHATWG:


   http://www.w3.org/2007/03/HTML-WG-charter.html#conformance

Hopefully they will get in contact soon. In the meantime, apparently 
anyone can actually join the W3C effort.


   http://www.w3.org/2004/01/pp-impl/40318/instructions

The instructions to join the group are as follows:

1. Fill in the Public Access Request Form; in the Reason field, put: To 
apply for participation in the HTML Working Group as an Invited Expert.


   http://cgi.w3.org/MemberAccess/Public

2. When you get a reply back, you should have a username and password. 
Fill in the W3C Invited Expert Application form.


   http://www.w3.org/2002/09/wbs/1/ieapp/

3. E-mail Dan Connolly and Karl Dubost ([EMAIL PROTECTED], [EMAIL PROTECTED]) 
asking for approval.


4. When you get a reply back, fill in the Joining the HTML Working Group 
form.


   http://www.w3.org/2004/01/pp-impl/40318/join

I would encourage everyone interested in working with the HTML working 
group to go through these steps as soon as possible, so that you will be a 
member of the group before the work starts.


I have also posted a WHATWG blog entry with this information:

   http://blog.whatwg.org/w3c-restarts-html-effort

Cheers,


The charter page also notes:

"The HTML Working Group also welcomes participation from non-Members. 
This may take the form of questions and comments on the mailing list or 
IRC channel, for which there is no formal requirement, or technical 
submissions for consideration, for which the participant must agree to 
Royalty-Free licensing under the W3C Patent Policy."

--  http://www.w3.org/2007/03/HTML-WG-charter.html#participation

...also "This group primarily conducts its technical work on a Public 
mailing list."


In other words, there's a participation level below full WG membership, 
but acknowledged in the charter. It may suit some folk here. Great to 
have the WG discussions in the public record too. Easier to find, easier 
to link to, etc.  This is a very healthy level of openness. Good for 
W3C, good for the Web...


cheers,

Dan


Re: [whatwg] The IMG element, proposing a CAPTION attribute

2006-11-10 Thread Dan Brickley

Elliotte Harold wrote:

Jeff Seager wrote:

A better way would be to semantically attach the caption or cutline to 
the image itself, so its display is paired naturally. In this way, the 
width of the cutline would be dictated (unless overruled in the 
stylesheet) by the width of the image. I'm suggesting that CAPTION be 
adopted as a new attribute of the IMG element, as it is already for 
the TABLE element.


I don't think caption should be an attribute, an element maybe, but not 
an attribute.


The problem is that captions can and do have substructure. For instance, 
a caption might include multiple emphasized or strongly emphasized 
sections. Attributes just aren't powerful enough for this.


Given that, I suspect we're probably better off just using regular 
paragraphs in text with appropriate CSS instructions rather than 
introducing a new element.


I agree, attributes are too weak (eg. couldn't support 
http://www.w3.org/TR/ruby/ ).


Dan


Re: [whatwg] Mathematics in HTML5

2006-06-07 Thread Dan Brickley
* Ian Hickson [EMAIL PROTECTED] [2006-06-08 00:28+]
 On Wed, 7 Jun 2006, Michel Fortin wrote:
  
  I'd like to try something a little simpler. So here is my idea for a 
  math markup.
 
 I would be very cautious about introducing an entirely new language to do 
 this (even if it is just an extension of HTML4). For something as big as 
 Mathematics, we want to simply re-use an existing language, not invent a 
 new one. Inventing a new language for encoding content with as wide a 
 problem-space as mathematics would require months, as well as the time of 
 domain experts, etc. This work has already been done, e.g. in ISO12083, 
 MathML, LaTeX, and other such languages.

I absolutely agree. It would also be both considerate and sensible 
(if anyone does want to undertake such a task) to talk to the 
MathML folks first.

cheers,

Dan


Re: [whatwg] HTML5 Parsing spec first draft ready

2006-02-15 Thread Dan Brickley
* Ian Hickson [EMAIL PROTECTED] [2006-02-15 23:02+]
 On Wed, 15 Feb 2006, Dan Brickley wrote:
 
  Have you considered defining the parser behaviour in terms of XML 
  concepts?
 
 What would that mean?
 
 Could you give an example of what that would look like?

Expressing things in terms of DOM would be one way, assuming 
there is a mapping to XML infoset from the DOM (which 
http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/ suggests
there is, though perhaps there are DOM version issues here?).
 
  If you do get to the test suite stage, there'll be need for some 
  concrete syntax presumably, to express test outputs in?
 
 The output of the parser is a DOM, so the natural form to use as an output 
 concrete syntax is simply a serialised DOM (e.g. an XML file).

If your DOM comes with a standard XMLization, we're golden. Sorry I'm 
not so up to date on DOM stuff (eg. which DOMs have an XMLization
defined, etc.).
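To make "serialised DOM as test output" concrete, here is a rough sketch that turns tag soup into well-formed XML using only the Python standard library. It is emphatically not the WHATWG parsing algorithm: the recovery rules (ignore stray end tags, auto-close open elements, a small hard-coded list of void elements) and the names SoupToXml and normalise are assumptions made for illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Small, incomplete set of elements that never have content (assumption).
VOID = {"br", "hr", "img", "input", "link", "meta"}


class SoupToXml(HTMLParser):
    """Build an ElementTree from HTML-ish input with naive error recovery."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ET.Element("root")  # synthetic wrapper element
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        attributes = {name: (value or "") for name, value in attrs}
        element = ET.SubElement(self.stack[-1], tag, attributes)
        if tag not in VOID:
            self.stack.append(element)

    def handle_endtag(self, tag):
        # Ignore end tags with no matching open element.
        if tag not in [e.tag for e in self.stack[1:]]:
            return
        # Otherwise implicitly close everything opened since the match.
        while self.stack[-1].tag != tag:
            self.stack.pop()
        self.stack.pop()

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):  # text after a child goes in that child's tail
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def normalise(html_text):
    parser = SoupToXml()
    parser.feed(html_text)
    parser.close()
    return ET.tostring(parser.root, encoding="unicode")


# One hypothetical test-suite entry: (input, expected normalised output).
case = ("<p>one<b>two</p>three", "<root><p>one<b>two</b></p>three</root>")
```

A test suite in this style would be a list of such pairs, with each expected output being an XML file that any XML tool can load and compare.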

 
  GRDDL could then say for HTML-ish bytestreams, feed them to the WHATWG 
  algorithm to get XML, and feed that XML to normal GRDDL algorithm to get 
  RDF...
 
 I'm with you up to the step where the output is XML, but I fail to see how 
 the next step is something WHATWG would be interested in. Could you expand 
 on this?

The next step is for people who find value in RDF's abstract graph structure
but find the standard RDF/XML syntax unattractive. GRDDL lets folk
deploy using XML or XHTML-based formats of their own devising, but 
map into RDF using XSLT so that RDF tools (eg. databases, SPARQL
query engines) can consume and exploit the data. I don't expect this
to be directly of interest to WHATWG unless WHATWG find value in RDF.
Beyond that, just think of it as another potential user of the parser
spec. http://www.w3.org/2004/01/rdxh/grddl-xml-demo has some demos
of GRDDL in action; http://librdf.org/query has some demos of RDF
query using SPARQL, from a toolkit that has GRDDL support. So one 
use case would be to mix natively RDF content with RDFized microformat
markup, so we could write queries whose answers draw on information
scattered across both formats and potentially multiple documents.

Dan



 -- 
 Ian Hickson   U+1047E)\._.,--,'``.fL
 http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
 Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] HTML5 Parsing spec first draft ready

2006-02-15 Thread Dan Brickley
* Ian Hickson [EMAIL PROTECTED] [2006-02-13 22:07+]
 
 So...
 
 The first draft of the HTML5 Parsing spec is ready.
 
 I plan to start implementing it at some point in the next few months, to 
 see how well it fares.

Any plans for a test suite? eg. pairs of input files and normalised
output? (if that makes sense...).

Dan


Re: [whatwg] HTML5 Parsing spec first draft ready

2006-02-15 Thread Dan Brickley
+cc: Dan Connolly

* Ian Hickson [EMAIL PROTECTED] [2006-02-14 18:41+]
 On Mon, 13 Feb 2006, Dan Brickley wrote:
  
  Any plans for a test suite? eg. pairs of input files and normalised 
  output? (if that makes sense...).
 
 I'd strongly recommend people put off creating a test suite until the spec 
 is in more than a first draft, but yes, on the long term this is 
 something we should definitely do.

Yup, I appreciate it's early days.

Discussing some related work (GRDDL) in the W3C SemWeb CG, I was
wondering whether there is any way your parser spec could be 
specified as input for a GRDDL transform. GRDDL provides techniques for
transforming XML-based languages (including XHTML) into an RDF
representation; typically by reference to an XSLT. If the WHATWG 
parser spec defined itself in terms of some XML-shaped output, the two
should chain nicely together. Have you considered defining the parser
behaviour in terms of XML concepts? If you do get to the test suite
stage, there'll be need for some concrete syntax presumably, to express
test outputs in? GRDDL could then say: "for HTML-ish bytestreams, 
feed them to the WHATWG algorithm to get XML, and feed that XML to the 
normal GRDDL algorithm to get RDF"... 
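The chaining idea can be sketched as two composable steps. Everything below is a stand-in: GRDDL transforms are typically XSLT, which the Python standard library does not provide, so links_to_triples is a hypothetical ElementTree-based substitute that merely turns rel/href pairs into triples, and the parse_to_xml argument stands in for whatever the parser spec eventually produces.

```python
import xml.etree.ElementTree as ET


def links_to_triples(xml_text, base_uri):
    """Stand-in for a GRDDL transform: emit (document, rel, href) triples."""
    root = ET.fromstring(xml_text)
    triples = []
    for element in root.iter():
        rel, href = element.get("rel"), element.get("href")
        if rel and href:
            for keyword in rel.split():
                triples.append((base_uri, keyword, href))
    return triples


def pipeline(raw_bytes, parse_to_xml, transform, base_uri):
    """Step 1: bytes -> XML text. Step 2: XML text -> RDF-like triples."""
    return transform(parse_to_xml(raw_bytes), base_uri)


# For already well-formed input, decoding is a sufficient "parser" stand-in.
document = (
    b'<html><head><link rel="license" '
    b'href="http://creativecommons.org/licenses/by/3.0/"/></head></html>'
)
result = pipeline(
    document,
    lambda raw: raw.decode("utf-8"),
    links_to_triples,
    "http://example.org/doc",  # hypothetical document URI
)
```

The point of the shape is that the second step never needs to know whether its XML came from an XML parser or from an HTML-ish bytestream run through a tolerant one.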

Dan


Re: [whatwg] What exactly is contentEditable for?

2005-08-17 Thread Dan Brickley

Olav Junker Kjær wrote:


Lachlan Hunt wrote:

I'm not disputing the fact that there is an unfortunate demand for 
embedded WYSIWYG editing in web based CMSs, it is the conceptually 
broken implementation I'm against.



I don't consider this demand unfortunate. I consider it an essential
part of the vision for the web. The writable web or universal canvas
or whatever it's called, has been a part of the vision from the beginning
(rumor has it that TBL's very first browser was read/write).


Yup, see screenshots etc c/o 
http://www.w3.org/People/Berners-Lee/WorldWideWeb.html

[[[
The broken X in the Tim's home page window means that the document has 
been edited and not yet saved. (A dirty flag). As a convenience, 
pressing Command/Shift/S would save back all modified web pages.

]]]

Dan