Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-15 Thread Shelley Powers

Maciej Stachowiak wrote:


On May 14, 2009, at 1:30 PM, Shelley Powers wrote:

So, if I'm pushing for RDFa, it's not because I want to win. It's
because I have things I want to do now, and I would like to make sure
they have a reasonable chance of working a couple of years in the future.
And yeah, once SVG is in HTML5, and RDFa can work with HTML5, maybe I
wouldn't mind giving old HTML a try again. Lord knows I'd like to
use ampersands again.


It sounds like your argument comes down to this: you have personally 
invested in RDFa, therefore having a competing technology is bad, 
regardless of the technical merits. I don't mean to parody here - I am 
somewhat sympathetic to this line of argument. Often pragmatic 
concerns mean that an incremental improvement just isn't worth the 
cost of switching (for example HTML vs. XHTML). My personal judgment
is that we're not past the point of no return on data embedding. 
There's microformats, RDFa, and then dozens of other serializations of 
RDF (some of which you cited). This doesn't seem like a space on the 
verge of picking a single winner, and the players seem willing to 
experiment with different options.



There are not dozens of other serializations of RDF.

The point I was trying to make is, I'd rather put my time into something 
that exists now, than have to watch the wheel re-invented. I'd rather 
see semantic metadata become a reality. I'm glad that you personally 
feel that companies will be just peachy keen on having to support 
multiple parsers to get the same data.


On the HTML WG side, I will never support microdata, because no case has 
been made for its existence.




The point is, people in the real world have to use this stuff. It 
helps them if they have one, generally agreed on approach. As it 
is, folks have to contend with both RDFa and microformats, but at 
least we know these have different purposes.


From my cursory study, I think microdata could subsume many of the 
use cases of both microformats and RDFa. It seems to me that it 
avoids much of what microformats advocates find objectionable, and 
provides a good basis for new microformats; but at the same time it 
seems it can represent a full RDF data model. Thus, I think we have 
the potential to get one solution that works for everyone.


I'm not 100% sure microdata can really achieve this, but I think 
making the attempt is a positive step.



It can't, don't you see?

Microdata will only work in HTML5/XHTML5. XHTML 1.1 and yes, 2.0 will 
be around for years, decades. In addition, XHTML5 already supports RDFa.


Supporting XHTML 1.1 has about 0.001% as much value as
supporting text/html. XHTML 2.0 is completely irrelevant to the Web,
and looks on track to remain so. So I don't find this point very 
persuasive.


I don't think you'll find that the world is breathlessly waiting for 
HTML5. I think you'll find that XHTML 1.1 will have wider use than HTML5 
for the next decade. If not longer. I wouldn't count out XHTML 2.0, 
either.  And in a decade, a lot can change.


Why you think something completely brand new, no vendor support, 
drummed up in a few hours or a day or so is more robust, and a better 
option than a mature spec in wide use, well frankly boggles my mind.


I haven't evaluated it enough to know for sure (as I said). I do think 
avoiding CURIEs is extremely valuable from the point of view of sane 
text/html semantics and ease of authoring; and RDF experts seem to 
think it works fine for representing RDF data models. So tentatively, 
I don't see any gaping holes. If you see a technical problem, and not 
just potential competition for the technology you've invested in, then 
you should definitely cite it.


I don't think CURIEs are that difficult, nor impossible no matter the 
arguments that Henri brings out.


I am impressed with your belief in HTML5.

But
One other detail that it seems not many people have picked up on yet 
is that microdata proposes a DOM API to extract microdata-based info 
from a live document on the client side. In my opinion this is huge 
and has the potential to greatly increase author interest in 
semantic markup.




Not really. Can do this now with RDFa in XHTML. And I don't need any 
new DOM to do it.


The power of semantic markup isn't really seen until you take that 
markup data _outside_ the document. And merge that data with data 
from other documents. Google rich snippets. Yahoo searchmonkey. Heck, 
even an application that manages the data from different subsites of 
one domain.


I respectfully disagree. An API to do things client-side that doesn't 
require an external library is extremely powerful, because it lets 
content authors easily make use of the very same semantic markup that 
they are vending for third parties, so they have more incentive to use 
it and get it right.



Sure, we'll have to disagree on this one.


Now, it may be that microdata will ultimately fail, either because 
it is outcompeted by RDFa

Re: [whatwg] Link rot is not dangerous

2009-05-15 Thread Shelley Powers

Dan Brickley wrote:

On 15/5/09 18:20, Manu Sporny wrote:

Kristof Zelechovski wrote:

Therefore, link rot is a bigger problem for CURIE
prefixes than for links.


There have been a number of people now that have gone to great lengths
to outline how awful link rot is for CURIEs and the semantic web in
general. This is a flawed conclusion, based on the assumption that there
must be a single vocabulary document in existence, for all time, at one
location. This has also led to a false requirement that all
vocabularies should be centralized.

Here's the fear:

If a vocabulary document disappears for any reason, then the meaning of
the vocabulary is lost and all triples depending on the lost vocabulary
become useless.

That fear ignores the fact that we have a highly available document
store available to us (the Web). Not only that, but these vocabularies
will be cached (at Google, at Yahoo, at The Wayback Machine, etc.).

IF a vocabulary document disappears, which is highly unlikely for
popular vocabularies - imagine FOAF disappearing overnight, then there
are alternative mechanisms to extract meaning from the triples that will
be left on the web.

Here are just two of the possible solutions to the problem outlined:

- The vocabulary is restored at another URL using a cached copy of the
vocabulary. The site owner of the original vocabulary either re-uses the
vocabulary, or re-directs the vocabulary page to another domain
(somebody that will ensure the vocabulary continues to be provided -
somebody like the W3C).
- RDFa parsers can be given an override list of legacy vocabularies that
will be loaded from disk (from a cached copy). If a cached copy of the
vocabulary cannot be found, it can be re-created from scratch if 
necessary.


The argument that link rot would cause massive damage to the semantic
web is just not true. Even if there is minor damage caused, it is fairly
easy to recover from it, as outlined above.
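
The second option above (an override list backed by cached copies) could look roughly like the following. This is only a minimal sketch: the `LEGACY_VOCABULARIES` table, the `load_vocabulary` helper and the `dead_fetch` stand-in are hypothetical names invented for illustration, not part of any RDFa parser mentioned in this thread, and the cached text is a placeholder.

```python
from typing import Callable, Dict, Optional

# Hypothetical override list: namespace URI -> locally cached vocabulary text.
# The FOAF entry is only an example; the cached text is a placeholder.
LEGACY_VOCABULARIES: Dict[str, str] = {
    "http://xmlns.com/foaf/0.1/": "# cached copy of the FOAF vocabulary",
}


def load_vocabulary(
    namespace: str,
    fetch: Callable[[str], Optional[str]],
    overrides: Dict[str, str] = LEGACY_VOCABULARIES,
) -> Optional[str]:
    """Return the vocabulary document for a namespace.

    Tries the live URL first through the caller-supplied fetch function,
    then falls back to the cached override list if the fetch fails.
    """
    try:
        document = fetch(namespace)
    except OSError:
        document = None  # treat network errors like a missing document
    if document is not None:
        return document
    return overrides.get(namespace)


def dead_fetch(url: str) -> Optional[str]:
    """Stand-in fetcher that simulates a vocabulary whose URL has rotted."""
    return None
```

Calling `load_vocabulary("http://xmlns.com/foaf/0.1/", dead_fetch)` returns the cached text even though the simulated fetch finds nothing, which is the recovery path described above.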


A few other points:

1. It's for the community of vocabulary-creators to help each other 
out w.r.t. hosting/publishing these: I just nudged a friend to put 
another 5 years on the DNS rental for a popular namespace. I think we 
should put a bit more structure around these kinds of habit, so that 
popular namespaces won't drop off the Web through accident.


2. digitally signing the schemas will become part of the story, I'm 
sure. While it's a bit fiddly, there are advantages to having other 
mechanisms beyond URI de-referencing for knowing where a schema came from


3. Parties worried about external dependencies when using namespaces 
can always indirect through their own namespace, whose schema document 
can declare subclass/subproperty relations to other URIs
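
Point 3 can be sketched as a small helper that emits the subproperty declaration as an N-Triples line. The `http://example.org/ns#name` namespace and the `subproperty_triple` function are assumptions made up for illustration; only the `rdfs:subPropertyOf` and FOAF URIs are real.

```python
# Real RDFS property URI used to relate a local property to an external one.
RDFS_SUBPROPERTY_OF = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"


def subproperty_triple(local_property: str, external_property: str) -> str:
    """Build one N-Triples statement: local rdfs:subPropertyOf external."""
    return f"<{local_property}> <{RDFS_SUBPROPERTY_OF}> <{external_property}> ."


# Hypothetical local namespace indirecting to the FOAF "name" property.
print(subproperty_triple("http://example.org/ns#name",
                         "http://xmlns.com/foaf/0.1/name"))
```

Documents would then use the local property, and only the one schema document needs updating if the external URI ever moves.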


cheers

Dan




The most important point to take from all of this, though, is that link 
rot within the RDF world is an extremely rare and unlikely occurrence. 
I've been working with RDF for close to a decade, and link rot has never 
been an issue.


One of the very first uses of RDF, in RSS 1.0, for feeds, is still in 
existence, still viable. You don't have to take my word, check it out 
yourselves:


http://purl.org/rss/1.0/

Even if, and I want to strongly emphasize "if", link rot does occur, both
Manu and Dan have demonstrated multiple ways of ensuring that no meaning
is lost, and nothing is broken. However, I hope that people are open
enough to take away from their discussions that they are trying to
treat this concern respectfully, and trying to demonstrate that there's
more than one solution. Not that this forms a proof that "Oh my god,
if we use RDF, we're doomed!"


Also don't lose sight that this is really no more serious an issue than,
say, a company originating com.sun.* being purchased by another
company, named com.oracle.*. And you can't say, "Well, that's not the
same," because it is.


The only safe bet is to designate some central authority and give them 
power over every possible name. Then we run the massive risk of this 
system failing (and this applies to microdata's reverse DNS as well as 
RDF's URI), or it being taken over by an entity that sees such a data 
store as a way to make a great profit. We also defeat the very principle 
on which semantic data on the web abides, and that's true whether you
support microdata or RDF.


Shelley






Re: [whatwg] Link rot is not dangerous

2009-05-15 Thread Shelley Powers

Kristof Zelechovski wrote:

Classes in com.sun.* are reserved for Java implementation details and should
not be used by the general public.  CURIE URLs are intended for general use.

So, I can say "Well, it is not the same", because it is not.

Cheers,
Chris


  
But we're not dealing with Java anymore. We're dealing with using 
reversed DNS concatenated with some kind of default URI, to create some 
kind of bastardized URL, which actually is valid, though incredibly 
painful to see, and can be implied to actually take one to a web address.


You don't have to take my word for it -- check out Philip's testing demo 
for microdata. You get triples with the following:


http://www.w3.org/1999/xhtml/custom#com.damowmow.cat

http://philip.html5.org/demos/microdata/demo.html#output_ntriples
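
The mapping seen in that output can be sketched as plain string concatenation. This is an assumption based only on the one URI quoted above, not on the spec text; the helper names are hypothetical, and treating the last label as the property name is also a guess.

```python
# Base URI taken from the triple quoted above.
CUSTOM_BASE = "http://www.w3.org/1999/xhtml/custom#"


def custom_property_uri(name: str) -> str:
    """Append a reverse-DNS property name to the custom base URI."""
    return CUSTOM_BASE + name


def claimed_domain(name: str) -> str:
    """Guess the DNS domain a reverse-DNS name claims.

    Assumes the last label is the property name and the rest is the
    reversed domain. Nothing here checks who actually owns the domain.
    """
    labels = name.split(".")
    return ".".join(reversed(labels[:-1]))


print(custom_property_uri("com.damowmow.cat"))
print(claimed_domain("com.damowmow.cat"))
```

Note that neither function can verify that the author controls the claimed domain; the name is just a string.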

Not only do you face problems with link rot, you also face a significant
amount of confusion, as people look at that and go, "What the hell is
that?"


Oh, and you can say, "Well, but we don't _mean_ anything by it" -- but
what does that have to do with anything? People don't go running to the
spec every time they see something. They look at this thing and think,
"Oh, a link. I wonder where it goes." You go ahead and try it, and
imagine for a moment the confusion when it goes absolutely nowhere.
Except that I imagine the W3C folks are getting a little annoyed with
the HTML WG now, for allowing this type of thing in, generating a whole
bunch of 404 errors for the web master(s).


But hey, you've given me another idea. I think I'll create my own 
vocabulary items, with the reversed DNS 
http://www.w3.org/1999/xhtml/custom#com.sun.*. No, maybe 
http://www.w3.org/1999/xhtml/custom#com.opera.*. Nah, how about 
http://www.w3.org/1999/xhtml/custom#com.microsoft.*. Yeah, that's cool. 
And there is no mechanism in place to prevent this, because unlike
regular URIs, where the domain is actually controlled by a specific
entity, you've created the world famous W3C fudge pot. Anything goes.


I can't wait for the lawsuits on this one. You think that cybersquatting 
is an issue on the web, or facebook, or Twitter, wait until you see 
people use com.microsoft.*.


Then there's the vocabulary that was created by foobar.com, that people 
think, "Hey, cool, I'll use that...whatever it is." After all, if you
want to play with the RDF kids, your vocabularies have to be usable by 
other people.


But Foobar takes a dive in the dot com pool, and foobar.com gets taken 
over by a porn establishment. Yeah, I can't wait for people to explain 
that one to the boss. Just because it doesn't link, won't mean it won't 
end up on Twitter as a big, huge joke.


If you want to find something to criticize, I think it's important to 
realize that hey, folks, you've just stepped over the line, and you're 
now in the Zone of Decentralization. Whatever impacts us, babes, impacts 
all of you. Because if you look at Philip's example, you're going to see 
the same set of vocabulary URIs we're using for RDF right now, as 
microdata uses our stuff, too. Including the links that are all 
trembling on the edge of self-implosion.


So the point of all of this is moot.

But it was fun. Really fun. Have a great weekend.

Shelley


Re: [whatwg] Link rot is not dangerous

2009-05-15 Thread Shelley Powers

Philip Taylor wrote:

On Fri, May 15, 2009 at 6:25 PM, Shelley Powers
shell...@burningbird.net wrote:
  

The most important point to take from all of this, though, is that link rot
within the RDF world is an extremely rare and unlikely occurrence.



That seems to be untrue in practice - see
http://philip.html5.org/data/rdf-namespace-status.txt

The source data is the list of common RDF namespace URIs at
http://ebiquity.umbc.edu/resource/html/id/196/Most-common-RDF-namespaces
from three years ago. Out of those 284:
 * 56 are 404s. (Of those, 37 end with '#', so that URI itself really
ought to exist. In the other cases, it'd be possible that only the
prefix+suffix URIs are meant to exist. Some of the cases are just
typos, but I'm not sure how many.)
 * 2 are Forbidden. (Of those, 1 looks like a typo.)
 * 2 are Bad Gateway.
 * 22 could not connect to the server. (Of those, 2 weren't http://
URIs, and 1 was a typo. The others represent 13 different domains.)

(For the URIs which returned Redirect responses, I didn't check what
happens when you request the URI it redirected to, so there may be
more failures.)

Over a quarter of the most common namespace URIs don't resolve
successfully today, and most of those look like they should have
resolved when they were originally used, so link rot seems to be
common.
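
The "over a quarter" figure follows directly from the counts listed above; a quick check of the arithmetic, using only those numbers:

```python
# Failure counts exactly as listed above, out of 284 namespace URIs.
total = 284
failures = {
    "404 Not Found": 56,
    "Forbidden": 2,
    "Bad Gateway": 2,
    "could not connect": 22,
}

failed = sum(failures.values())  # 56 + 2 + 2 + 22 = 82
share = failed / total           # 82 / 284

print(f"{failed} of {total} failed ({share:.1%})")
```

That is 82 of 284, roughly 28.9%, before counting any failures hidden behind redirects.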

(Major vocabularies like RSS and FOAF are likely to exist for a long
time, but they're the easiest cases to handle - we could just
pre-define the prefixes rss: and foaf: and have a centralised
database mapping them onto schemas/documentation/etc. It seems to me
that URIs are most valuable to let any tiny group make one for their
rarely-used vocabulary, and be guaranteed no name collisions without
needing to communicate with a centralised registry to ensure
uniqueness; but it's those cases that are most vulnerable to link rot,
and in practice the links appear to fail quite often.)

(I'm not arguing that link rot is dangerous - just that the numbers
indicate it's a common situation rather than an extremely rare
exception.)

  
Philip, I don't think the occurrence of link rot causing problems in the
RDF world is all that common, but thanks for looking up this data.
Actually I will probably quote your info in my next writing at my weblog.


I'd like to be dropped from any additional emails in this thread. After
all, I have it on good authority I'm not open for rational discussion.
So I'll leave this type of thing to you guys.


Thanks

Shelley


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-14 Thread Shelley Powers

James Graham wrote:

jgra...@opera.com wrote:

Quoting Philip Taylor excors+wha...@gmail.com:


On Sun, May 10, 2009 at 11:32 AM, Ian Hickson i...@hixie.ch wrote:


One of the more elaborate use cases I collected from the e-mails sent in
over the past few months was the following:

  USE CASE: Annotate structured data that HTML has no semantics for, and
  which nobody has annotated before, and may never again, for private use or
  use in a small self-contained community.

[...]

To address this use case and its scenarios, I've added to HTML5 a simple
syntax (three new attributes) based on RDFa.


There's a quickly-hacked-together demo at
http://philip.html5.org/demos/microdata/demo.html (works in at least
Firefox and Opera), which attempts to show you the JSON serialisation
of the embedded data, which might help in examining the proposal.


I have a *totally unfinished* demo that does something rather similar
at [1]. It is highly likely to break and/or give incorrect results**.
If you use it for anything important you are insane :)


I have now added extremely preliminary RDF support with output as N3
and RDF/XML courtesy of rdflib. It is certain to be buggy.


So much concern about generating RDF, makes one wonder why we didn't 
just implement RDFa...


Shelley


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-14 Thread Shelley Powers

Dan Brickley wrote:

On 14/5/09 14:18, Shelley Powers wrote:

James Graham wrote:

jgra...@opera.com wrote:

Quoting Philip Taylor excors+wha...@gmail.com:


On Sun, May 10, 2009 at 11:32 AM, Ian Hickson i...@hixie.ch wrote:


One of the more elaborate use cases I collected from the e-mails sent in
over the past few months was the following:

USE CASE: Annotate structured data that HTML has no semantics for, and
which nobody has annotated before, and may never again, for private use or
use in a small self-contained community.

[...]

To address this use case and its scenarios, I've added to HTML5 a simple
syntax (three new attributes) based on RDFa.


There's a quickly-hacked-together demo at
http://philip.html5.org/demos/microdata/demo.html (works in at least
Firefox and Opera), which attempts to show you the JSON serialisation
of the embedded data, which might help in examining the proposal.


I have a *totally unfinished* demo that does something rather similar
at [1]. It is highly likely to break and/or give incorrect results**.
If you use it for anything important you are insane :)


I have now added extremely preliminary RDF support with output as N3
and RDF/XML courtesy of rdflib. It is certain to be buggy.


So much concern about generating RDF, makes one wonder why we didn't
just implement RDFa...


Having HTML5-microdata -to- RDF parsers is pretty critical to having 
test cases that help us all understand where RDFa-Classic and HTML5 
diverge. I'm very happy to see this work being done and that there are 
multiple implementations.


As far as I can see, the main point of divergence is around URI 
abbreviation mechanisms. But also HTML5 might not have a notion 
equivalent to RDF/RDFa's bNodes construct. The sooner we have these 
parsers the sooner we'll know for sure.


Dan


Actually, I believe there are other differences, as others have pointed 
out.


http://www.jenitennison.com/blog/node/103

http://realtech.burningbird.net/semantic-web/semantic-web-issues-and-practices/holding-on-html5

Some of the differences have resulted in more modifications to the 
underlying HTML5 spec, which is curious, because Ian has stated in 
comments that support for RDF is only a side interest and not the main 
purpose behind the microdata section.


With the statement that support for RDF isn't a particular goal of 
microdata, Dan, I think you're being optimistic about the good this 
effort will generate for RDFa. But, more power to you.


Shelley


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-14 Thread Shelley Powers

Maciej Stachowiak wrote:


On May 14, 2009, at 5:18 AM, Shelley Powers wrote:

So much concern about generating RDF, makes one wonder why we didn't 
just implement RDFa...


If it's possible to produce RDF triples from microdata, and if RDF 
triples of interest can be expressed with microdata, why does it 
matter if the concrete syntax is the same as RDFa? Isn't the important 
thing about RDF the data model, not the surface syntax?


(I understand that if the microdata syntax offered no advantages over 
RDFa, then it would be a wasted effort to diverge. But my impression 
is that you'd object to anything that isn't exactly identical to RDFa, 
even if it can easily be used in the same way.)


Regards,
Maciej


Because one would assume that one way to accomplish a task would be more 
attractive to web developers, designers, parser developers, browsers, et 
al.


In addition, one would also assume that one way to accomplish a task 
would be more attractive in regards to testing, maintaining and moving 
on in the future.


Notice how there is only VHS and not Betamax?

Notice the same about Blu-Ray and HD-TV? People won't buy into something
while there are competitive specs, and these are competitive in that
it makes little sense to use both in a document, though you can now.


The point is, people in the real world have to use this stuff. It helps 
them if they have one, generally agreed on approach. As it is, folks 
have to contend with both RDFa and microformats, but at least we know 
these have different purposes.


Shelley


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-14 Thread Shelley Powers

Maciej Stachowiak wrote:


On May 14, 2009, at 1:04 PM, Shelley Powers wrote:


Maciej Stachowiak wrote:


On May 14, 2009, at 5:18 AM, Shelley Powers wrote:

So much concern about generating RDF, makes one wonder why we 
didn't just implement RDFa...


If it's possible to produce RDF triples from microdata, and if RDF 
triples of interest can be expressed with microdata, why does it 
matter if the concrete syntax is the same as RDFa? Isn't the 
important thing about RDF the data model, not the surface syntax?


(I understand that if the microdata syntax offered no advantages 
over RDFa, then it would be a wasted effort to diverge. But my 
impression is that you'd object to anything that isn't exactly 
identical to RDFa, even if it can easily be used in the same way.)


Regards,
Maciej


Because one would assume that one way to accomplish a task would be 
more attractive to web developers, designers, parser developers, 
browsers, et al.


In addition, one would also assume that one way to accomplish a task 
would be more attractive in regards to testing, maintaining and 
moving on in the future.


Notice how there is only VHS and not Betamax?

Notice the same about Blu-Ray and HD-TV? People won't buy into
something while there are competitive specs, and these are
competitive in that it makes little sense to use both in a
document, though you can now.


Physical media do tend to converge due to network effects. I think the 
effect is less strong for digital file formats. For example, MP3 and 
AAC are both fairly successful; similarly, MPEG-4, Windows Media and 
Ogg are all getting some degree of traction. But you may be right that 
ultimately there will be only one winner.


Now, that's the problem with all of this effort...winners and losers.

I don't support a spec because it gives me grins and giggles. I have 
certain tasks I want to do, and I look for what is the technology that 
has the most support in order to do them.


I've long been an adherent to RDF, which isn't really up for debate. 
Originally, I was an RDF/XML person, until the RDF-in-XHTML folks 
changed my mind.


What I see of RDFa is a specification that has been through a very long 
period of time, testing, commenting, being implemented by major players. 
I also have tools, right now, that I can use to process the RDFa, as 
well as support by two major search engine companies.


As Dan pointed out earlier, microdata seems to support most of RDF. 
Well, I know that RDFa does. It makes little sense to me to start from 
scratch when a mature specification with multi-vendor support already 
exists.


Especially when Drupal 7 rolls out with RDFa baked in. That's 1.7 
million sites supporting the spec. Then there's the new Google snippet 
thing -- who knows how many additional sites we'll now find supporting RDFa.


So, if I'm pushing for RDFa, it's not because I want to win. It's
because I have things I want to do now, and I would like to make sure
they have a reasonable chance of working a couple of years in the future.
And yeah, once SVG is in HTML5, and RDFa can work with HTML5, maybe I
wouldn't mind giving old HTML a try again. Lord knows I'd like to use
ampersands again.




The point is, people in the real world have to use this stuff. It 
helps them if they have one, generally agreed on approach. As it is, 
folks have to contend with both RDFa and microformats, but at least 
we know these have different purposes.


From my cursory study, I think microdata could subsume many of the use 
cases of both microformats and RDFa. It seems to me that it avoids 
much of what microformats advocates find objectionable, and provides a 
good basis for new microformats; but at the same time it seems it can 
represent a full RDF data model. Thus, I think we have the potential 
to get one solution that works for everyone.


I'm not 100% sure microdata can really achieve this, but I think 
making the attempt is a positive step.



It can't, don't you see?

Microdata will only work in HTML5/XHTML5. XHTML 1.1 and yes, 2.0 will be 
around for years, decades. In addition, XHTML5 already supports RDFa.


Why you think something completely brand new, no vendor support, drummed 
up in a few hours or a day or so is more robust, and a better option 
than a mature spec in wide use, well frankly boggles my mind.


I am impressed with your belief in HTML5.

But
One other detail that it seems not many people have picked up on yet 
is that microdata proposes a DOM API to extract microdata-based info 
from a live document on the client side. In my opinion this is huge 
and has the potential to greatly increase author interest in semantic 
markup.




Not really. Can do this now with RDFa in XHTML. And I don't need any new 
DOM to do it.


The power of semantic markup isn't really seen until you take that 
markup data _outside_ the document. And merge that data with data from 
other documents. Google rich snippets. Yahoo searchmonkey. Heck, even an 
application

Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-12 Thread Shelley Powers

Philip Taylor wrote:

On Tue, May 12, 2009 at 11:55 AM, Eduard Pascual herenva...@gmail.com wrote:
  

[...]
(at least for now: many RDFa-aware agents vs. zero HTML5's
microdata-aware agents)



HTML5 microdata parsers seem pretty trivial to write -
http://philip.html5.org/demos/microdata/demo.html is only about two
hundred lines to read all the data and to produce JSON and
N3-serialised RDF. It shouldn't take more than a few hours to produce
a similar library for other languages, including the time taken to
read the spec, so the implementation cost for generic parser libraries
doesn't seem like a significant problem.
  


Writing something that will produce triples may be easy, but what's 
important is that you're producing an RDF model.


Philip, I've been looking at your application, and you're not producing 
the same model for Ian's microdata proposal that is produced using 
either eRDF or RDFa. I'll have more on this later.

The cost of integration with backend RDF-based systems seems more
significant - hopefully you could simply replace the frontend RDFa
parser with a microdata parser and generate the same RDF triples and
it would all work fine, but I don't know whether that's true in
practice (because maybe the microdata syntax is too restrictive to
represent the vocabularies people want to use, and so they'd have to
go to lots of extra effort to create a new vocabulary).

  

[...] there are other cases where
separate values might be needed: for example using a street address
for the human-readable representation of a location and the exact
geographic coordinates as the machine-readable (since not all
micro-data parsers can rely on Google Maps's database to resolve
street addresses, you know); or using a colored name (such as "lime
green" displayed in a lime green color) as the human-readable
representation of a color, and the hexcode (like #00FF00) as the
machine-readable representation.



You could replace
  <span itemprop="color">lime green</span>
  <span itemprop="location">1 High Street</span>
with
  <meta itemprop="color" content="#00FF00"><span>lime green</span>
  <meta itemprop="location.lat" content="56.78"><meta itemprop="location.long" content="-12.34"><span>1 High Street</span>
to get the desired output. (Not particularly elegant syntax, though.)
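
To see what each variant would hand to a consumer, here is a rough sketch using Python's standard `html.parser`. It assumes the markup was meant as ordinary `span` and `meta` elements with quoted attributes (the tags are reconstructed, which is an assumption), and it is not the microdata extraction algorithm from the spec: it ignores item scoping and simply maps each `itemprop` to its `content` attribute or, failing that, its text. `ItempropCollector` and `extract` are hypothetical names.

```python
from html.parser import HTMLParser
from typing import Dict, Optional


class ItempropCollector(HTMLParser):
    """Collect itemprop values: the content attribute if present, else text."""

    def __init__(self) -> None:
        super().__init__()
        self.properties: Dict[str, str] = {}
        self._current: Optional[str] = None  # itemprop whose text is being read

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("itemprop")
        if name is None:
            return
        if "content" in attributes:
            self.properties[name] = attributes["content"] or ""
        else:
            self._current = name
            self.properties[name] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        self._current = None  # stop reading text at any closing tag


def extract(markup: str) -> Dict[str, str]:
    collector = ItempropCollector()
    collector.feed(markup)
    collector.close()
    return collector.properties


BEFORE = ('<span itemprop="color">lime green</span>'
          '<span itemprop="location">1 High Street</span>')
AFTER = ('<meta itemprop="color" content="#00FF00"><span>lime green</span>'
         '<meta itemprop="location.lat" content="56.78">'
         '<meta itemprop="location.long" content="-12.34">'
         '<span>1 High Street</span>')

print(extract(BEFORE))
print(extract(AFTER))
```

With the first variant the machine-readable values are the human-readable strings; with the second they are the hex code and the coordinates, while the visible text stays in the unannotated `span` elements.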

  


It's funny, but oddly enough, this discussion reminds me of when I 
started at Boeing, right after college. I started just when the great 
debate between SQL and QUEL was ending, in SQL's favor. Most folks still 
feel that QUEL was the superior option, but SQL won out in the end 
because it had widespread use, and was supported by more of the 
(powerful) database companies, and hence the companies using the databases.


The same could be said of Betamax versus VHS, and even the recent HDTV 
and Blu-Ray debates: we can get caught up in issues of superiority and 
argue the fine points of (mostly) obscure markup until the cows come 
home, but at some point in time, you have to pick a standard to get 
behind, or no one will have any confidence in _any_ of the options being
proposed--and the concept underlying the competing technologies (or 
standards) is hindered, perhaps for years.


Sorry, I digress. Eduard, looking forward to seeing your own 
interpretation of the best metadata annotation.


Shelley




Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-12 Thread Shelley Powers

Ian Hickson wrote:

On Tue, 12 May 2009, Peter Mika wrote:
  

Just a quick comment on:

  it uses prefixes, which most authors simply do not understand, and
  which many implementors end up getting wrong (e.g. SearchMonkey
  hard-coded certain prefixes in its first implementation, Google's
  handling of RDF blocks for license declarations is all done with

Actually, the problem we see is not so much the prefixes themselves but rather
the cumbersome way of specifying namespace prefix definitions using xmlns. So
I think it would make sense to have some mechanism for referencing bundles of
namespace prefixes ('profiles') or namespace registries, in order to ease
authoring.

In terms of prefixes, I find that 'com.foaf-project.name' is a lot more 
difficult to write than 'foaf:name'. Reverse domain names are 
non-intuitive for non-programmer types (or non-Java programmers).
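
For comparison, expanding a prefixed name against a shared prefix table (the kind of "profile" or registry suggested above) takes only a few lines. This is a sketch under assumptions: the `PREFIXES` table and `expand_curie` helper are hypothetical, and only the FOAF namespace URI is real.

```python
from typing import Dict

# Hypothetical shared prefix table, standing in for a profile or registry.
PREFIXES: Dict[str, str] = {
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand_curie(curie: str, prefixes: Dict[str, str] = PREFIXES) -> str:
    """Expand 'prefix:reference' to a full URI using the prefix table."""
    prefix, separator, reference = curie.partition(":")
    if not separator or prefix not in prefixes:
        raise KeyError(f"undeclared prefix in {curie!r}")
    return prefixes[prefix] + reference


print(expand_curie("foaf:name"))
```

An author would write `foaf:name` and never see the table; an undeclared prefix raises an error instead of being silently misread.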



If we can come up with a way of using the string foaf:name without 
having to declare foaf in each document, I'm totally in agreement. I've 
considered maybe registering the foaf URL scheme, or using some other 
punctuation character and having people register prefixes, but I don't 
know what punctuation character to use (':' and '.' are both taken).


  
But then we would lose the extensibility, which is the power behind all 
of this.


If I remember correctly, Henri had an issue with the DOM when it came to 
support of namespaces in XHTML, and not in HTML, which was the reason 
that @prefix or something along those lines was proposed. There was quite
positive progress in this regard, too. I don't know what happened to 
that progress.


But regardless, the majority of people will include metadata markup by 
installing a plug-in or module, and making a couple of choices. And if 
you put together a good ten-minute tutorial for the average developer, 
they'll have no problem with foaf:name. Training and clarity of 
communication is much more important than form, it always has been with
technology.


The examples you come up with just don't justify discarding 
consideration of a capability that just started getting incorporated 
into Google search. I would say if your fellow Google developers could 
understand how this all works, there is hope for others.


Shelley


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-12 Thread Shelley Powers

Sam Ruby wrote:

On Tue, May 12, 2009 at 4:34 PM, Shelley Powers
shell...@burningbird.net wrote:
  

I would say if your fellow Google developers could understand how this all
works, there is hope for others.



if

http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009May/0064.html

  
\

- Sam Ruby

  
Ah heck, I've made mistakes with vocabularies too. That's why you ask 
for feedback. Unfortunately, asking for feedback isn't an option when 
you're creating secret stuff.


I could have wished Google used FOAF or DC, too, but it's a start.

Shelley


[whatwg] Custom microdata handling added to HTML5 spec

2009-05-10 Thread Shelley Powers
Since a new section detailing HTML5's handling of custom microdata has 
been added to the HTML5 spec (tracked here 
http://html5.org/tools/web-apps-tracker?from=3073&to=3074 
and displayed here http://dev.w3.org/html5/spec/Overview.html#microdata 
and announced here 
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-May/019681.html 
), I'm assuming my effort to re-examine the use cases Ian has published 
is irrelevant, and a waste of everyone's time.


I will hence discontinue any and all effort associated with this 
specification.


Shelley





[whatwg] Continuing

2009-05-10 Thread Shelley Powers

Sorry for the double emails today.

I will continue with revisiting the use cases for the microdata section. 
One additional component I'll add to the use cases is applying my 
interpretation of how RDFa might handle the use case, as compared  to 
how it could be handled with Ian's new HTML5 microdata proposal. This 
will, of course, slow me down a bit.


Note, though, that I don't claim to be an expert on either RDFa or Ian's 
new microdata proposal.  My hope is that if I make a mistake, or I'm not 
clear, folks will respond to my writing with corrections and/or 
additions. The purpose behind my effort is to open discussion. I will 
admit, though, that I do have a bias for RDFa, primarily because this is 
something that's real, today, and that I can use, today.


Shelley


[whatwg] microdata use cases and Getting data out of poorly written Web pages

2009-05-08 Thread Shelley Powers
It's difficult to tell where one should comment on the so-called 
microdata use cases. I'm forced to send to multiple mailing lists.


Ian, I would like to see the original request that went into this 
particular use case. In particular, I'd like to know who originated it, 
so that we can ensure that the person has read your follow-up, as well 
as how you condensed the use case down (to check if your interpretation 
is proper or not).


In addition, from my reading of this posting of yours titled [whatwg] 
Getting data out of poorly written Web pages, is this open for any 
discussion? It seems to me that you received the original data, 
generated a use case document from the data, unilaterally, and now 
you're making unilateral decisions as to whether the use case requires a 
change in HTML5 or not.


Is this what we can expect from all of the use cases?

Shelley





Re: [whatwg] microdata use cases and Getting data out of poorly written Web pages

2009-05-08 Thread Shelley Powers

Ian Hickson wrote:

On Fri, 8 May 2009, Shelley Powers wrote:
  
It's difficult to tell where one should comment on the so-called 
microdata use cases. I'm forced to send to multiple mailing lists.



Please don't cross-post to the WHATWG list and other lists -- you may pick 
either one, I read all of them. (Cross-posting results in a lot of 
confusion because some of the lists only allow members to post, while 
others allow anyone to post, so we end up with fragmented threads.)



  
But different people respond to the mailings in different ways, 
depending on the list. This isn't just you, Ian. How can I ensure that 
the W3C people have access to the same concerns?
Ian, I would like to see the original request that went into this 
particular use case. In particular, I'd like to know who originated it, 
so that we can ensure that the person has read your follow-up, as well 
as how you condensed the use case down (to check if your interpretation 
is proper or not).



I did not keep track of where the use cases came from (I generally ignore 
the source of requests so as to avoid any possible bias).


  
Documenting the originator of a use case is introducing bias? In what 
universe?


If anything, documenting where the use cases come from, and providing 
access to the original, raw data helps to ensure that bias has not been 
introduced. More importantly, it gives your teammates a chance to verify 
your interpretation of the use cases, and provide correction, if needed.


However, I can probably figure out some of the sources of a particular 
scenario if you have a specific one in mind. Could you clarify which 
scenario or requirement you are particularly interested in?



  
Ian, I think it's important that you provide a place documenting the 
original raw data. This provides a historical perspective on the 
decisions going into HTML5 if nothing else.


If you need help, I'm willing to help you. You'll need to forward me the 
emails you received, and send me links to the other locations. I'll then 
put all these into a document and we can work to map to your condensed 
document. That way there's accountability at all steps in the decision 
process, as well as transparency.


Once I put the document together, we can put it with other documents that 
also provide history of the decision processes.
In addition, from my reading of this posting of yours titled [whatwg] 
Getting data out of poorly written Web pages, is this open for any 
discussion?



Naturally, all input is always welcome.


  
No, I didn't ask if input was welcome. I asked if this was still open 
for discussion, or if you have made up your mind, and further 
discussion will just be wasting everyone's time.
It seems to me that you received the original data, generated a use case 
document from the data, unilaterally, and now you're making unilateral 
decisions as to whether the use case requires a change in HTML5 or not.


Is this what we can expect from all of the use cases?



Yes.
  

That's not appropriate for a team environment.
If my proposals don't actually address the use cases, then please do point 
out how that is the case. Similarly, if there are missing use cases, please 
bring them up. All input is always welcome (whether on the lists, or 
direct e-mail, on blogs, or wherever). None of the text in the HTML5 spec 
is frozen, it's merely a proposal. If there are use cases that should be 
addressed that are not addressed then we should address them.


  

Again, how can I? I don't have the original data.
(Regarding microdata note that I've so far only sent proposals for three 
of the 20 use cases that I collected. I've still got a lot to go through.)


  

After digging, I found another one, at

http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-May/019620.html

Again, though, the writing style indicates the item is closed, and 
discussion is not welcome. I have to assume that this is how you 
mentally perceive the item, and therefore though we may respond, the 
response will make no difference.


And I can't find the third one. Perhaps you can provide a direct link.

I'm concerned, too, about the fact that the discussion for these is 
happening on the WhatWG group, but not in the HTML WG email list. I've 
never understood two different email lists, and have felt having both is 
confusing, and potentially misleading. Regardless, shouldn't this 
discussion be taking place in the HTML WG, too?


Isn't the specification the W3C HTML5 specification, also?

I'm just concerned because from what I can see of both groups, interests 
and concerns differ between the groups. That means only addressing 
issues in one group, would leave out potentially important discussions 
in the other group.


Shelley




[whatwg] notes on current HTML5 draft

2009-05-02 Thread Shelley Powers

Per Ian Hickson's request, first of my notes on the current HTML 5 draft

Section 1.6.3, where you compare HTML5 with XHTML2 and XForms, you write

However, XHTML2 and XForms lack features to express the semantics of 
many of the non-document types of content often seen on the Web. For 
instance, they are not well-suited for marking up forum sites, auction 
sites, search engines, online shops, mapping applications, e-mail 
applications, word processors, real-time strategy games, and the like.


This specification aims to extend HTML so that it is also suitable in 
these contexts.


This sounds more like marketing speak than something one would find in a 
specification. If it's important for an individual to know why they 
might want to use HTML5 over XHTML2, then the information should be 
given in detail, rather than in one vague paragraph.


In addition, I've not found that the HTML5 specification answers the 
claims given in the above paragraph. For instance, why would HTML5 be 
better for a mapping application than XHTML2? Or an auction site?


In section 1.7, you write

The DOM5 HTML, HTML5, and XHTML5 representations cannot all 
represent the same content. For example, namespaces cannot be 
represented using HTML5, but they are supported in DOM5 HTML and 
XHTML5. Similarly, documents that use the noscript  feature can be 
represented using HTML5, but cannot be represented with XHTML5 and 
DOM5 HTML. Comments that contain the string -- can be represented 
in DOM5 HTML but not in HTML5 and XHTML5. And so forth.


"And so forth" is not something one wants to read in a specification, 
because we expect precision, and "and so forth" is vague and imprecise.


Since the HTML5 supposedly represents both a HTML and a XHTML 
serialization technique, perhaps the document can take a lesson from the 
RDF community and provide a separate document, or at least a section 
detailing the two different serialization techniques. This would go far, 
too, in clearing up the confusion regarding XHTML. Too many people are 
making assumptions that XHTML is dead because the XHTML serialization 
of HTML5 is not spelled out as clearly as it could be.


You actually do mix the differences between the two throughout the 
document, but that, to me, seems to 'clutter' up the spec -- making it 
difficult to determine what's new in the spec. If the HTML5 document is 
a new model for web page markup, then the model aspect of the spec 
should be detailed separately from its various serializations, and that 
includes any API.


Right now, it's difficult to read the specification because it jumps too 
frequently between the abstract and the implementation, sometimes in one 
sentence.


More later.

Shelley








[whatwg] Section 3 semantics and structure

2009-05-02 Thread Shelley Powers

More general comments on the HTML5 draft:

In section three, you mix structure and semantics, but the two are not 
necessarily compatible.


For instance, we see an introduction to the Document, and then 
immediately proceed into a description of Documents in the DOM. Frankly, 
I don't see how a description of the DOM fits either structure or 
semantics. To me, structure would be the structure of the markup in the 
document, and the semantics would be the, well, it's hard to say what it 
would be, you apply semantics to elements, such as section and header. 
Whatever it is, it's not DOM related.


Then you follow up with Security. What does this have to do with 
structure or semantics?


Perhaps if the intro section was filled in, we would have an 
understanding of what you mean by structure, and semantics. Right now, 
though, I see what is basically a bucket of information, somehow grouped 
under this heading, perhaps because it doesn't fit anywhere else.


Now you do a nice description of what you consider as semantics in 
section 3.3.1, and I would expect this, then, to be followed by a 
listing of the elements, but again, there's the DOM. There's no cohesive 
pattern to the document, especially when the different document levels 
are mixed so haphazardly.


I think of a document as a communication between writer and audience. 
Now there are probably three audiences for HTML5:  user agent 
developers, such as browser companies; web developers, interested in  
the DOM, scripting events, and so on; and designers or others, more 
likely interested in the markup.


I, as a web developer/designer, am not really interested in the user 
agent aspects of the specs. Another person who is a designer, may not be 
interested in the developer or UA aspects. But all of us are forced to 
go through material addressed to all three audiences just to find the 
information we need.


I, a designer interested in learning about the new semantic elements, 
have to wade through sections on the DOM and security, including 
cookies, because I'm not sure when I'll be getting to the bits I need. 
There's no clear demarcation between audiences in the document.


More later

Shelley


[whatwg] example of serialization problems

2009-05-02 Thread Shelley Powers

Review of HTML5 document:

Here's a good example of a potential point of confusion for readers of 
the spec when it comes to serialization:


In section 4.5.8 you introduce the ul element, and then demonstrate it 
with several child li elements, each of which is shown with an HTML 
serialization.


In section 4.5.9, you introduce the li element, and then demonstrate the 
li element using a serialization approach that would work with both 
XHTML and HTML serializations.


And still later, in section 4.5.13.1, you again demonstrate li elements 
using only the HTML serialization format.


In all of this is an implicit assumption of the capabilities of your 
audience, that they understand the differences between the two. Yet, 
this isn't stated as a prereq for the audience of the document. In fact, 
you state that a familiarity with XML is helpful, but not required. And 
as far as I've been able to see, though I may have missed it, 
discussion of closing tags doesn't take place until section 8.


My suggestion would be to include both HTML and XHTML serializations, 
carefully differentiating between the two. Or to provide separate 
documents detailing the elements and their serialized form, HTML version 
and XHTML version, if you want to inter-mix model and serialization 
technique.


As for Section 8, that really is for user agent developers, only. 
Seriously, I doubt you expect typical web developers or designers to get 
much from this section. I would almost expect this to be a separate 
document. What would be helpful is to bring this section up one level in 
complexity, specifically focused at web developers/designers.


More later

Shelley


Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-20 Thread Shelley Powers

Dan Brickley wrote:

On 18/1/09 20:07, Henri Sivonen wrote:

On Jan 18, 2009, at 20:48, Dan Brickley wrote:


On 18/1/09 19:34, Henri Sivonen wrote:

On Jan 18, 2009, at 01:32, Shelley Powers wrote:


Are you then saying that this will be a showstopper, and there will
never be either a workaround or compromise?



Are the RDFa TF open to compromises that involve changing the XHTML side
of RDFa not to use attributes whose qualified names have a colon in them, to
achieve DOM Consistency by changing RDFa instead of changing parsing?


I don't believe the RDFa TF are in a position to singlehandedly
rescind a W3C Recommendation, ie.
http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014/.

What they presumably could do is propose new work items within W3C,
which I'd guess would be more likely to be accepted if it had the
active enthusiasm of the core HTML5 team. Am cc:'ing TimBL here who
might have something more to add.

Do you have an alternative design in mind, for expressing the
namespace mappings?


The simplest thing is not to have mappings but to put the corresponding
absolute URI wherever RDFa uses a CURIE.


So this would be a kind of interoperability profile of RDFa, where 
certain features approved of by REC-rdfa-syntax-20081014 wouldn't be 
used in some hypothetical HTML5 RDFa.


If people can control their urge to use namespace abbreviations, and 
stick to URIs directly, would this make your DOM-oriented concerns go 
away?
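
As a rough illustration of what "stick to URIs directly" means for markup, the Python sketch below takes one element's attributes, drops its xmlns:* declarations, and replaces prefixed values with absolute URIs. The function names and the sample attributes are hypothetical, and the handling is deliberately simplified: it looks at a single element only, whereas real RDFa prefix mappings are inherited from ancestor elements, and it covers only the attributes listed in the code.

```python
# Attributes whose values are treated as space-separated prefixed
# terms in this sketch (a simplification, not a full RDFa processor).
CURIE_ATTRS = ("property", "rel", "rev", "typeof", "datatype")


def expand(token, namespaces):
    """Expand 'prefix:local' if the prefix is declared; else leave as-is."""
    prefix, sep, local = token.partition(":")
    if sep and prefix in namespaces:
        return namespaces[prefix] + local
    return token


def inline_uris(attrs):
    """Return a copy of one element's attributes with xmlns:* removed
    and prefixed values replaced by absolute URIs."""
    namespaces = {
        key[len("xmlns:"):]: value
        for key, value in attrs.items()
        if key.startswith("xmlns:")
    }
    result = {}
    for key, value in attrs.items():
        if key.startswith("xmlns:"):
            continue  # no colon-bearing attribute names in the output
        if key in CURIE_ATTRS:
            value = " ".join(expand(tok, namespaces) for tok in value.split())
        result[key] = value
    return result


before = {
    "xmlns:foaf": "http://xmlns.com/foaf/0.1/",
    "about": "http://example.org/alice#me",
    "property": "foaf:name",
}
print(inline_uris(before))
```

The output carries no attribute whose name contains a colon, which is the part of the markup the DOM-consistency concern is about.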


Took five minutes to make this change in my template. Ran through 
validator.nu. Results:


Doesn't like the content-type. Didn't like profile on head. Having to 
remove the profile attribute in my head element limits usability, but 
I'm not going to throw myself on the sword for this one.


Doesn't like property, doesn't like about. These are the RDFa attributes 
I'm using. The RDF extractor doesn't care that I used the URIs directly.


Didn't seem to mind SVG, but a value of none is a valid value for 
preserveAspectRatio.


Shelley


cheers,

Dan

--
http://danbri.org/





Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-20 Thread Shelley Powers

Eduard Pascual wrote:

On Sun, Jan 18, 2009 at 3:56 PM, Anne van Kesteren ann...@opera.com wrote:
  

On Sun, 18 Jan 2009 16:22:40 +0100, Shelley Powers
shell...@burningbird.net wrote:


My apologies for not responding sooner to this thread. You see, one of the
WhatWG working group members thought it would be fun to add a comment to my
Stop Justifying RDF and RDFa web post, which caused the page to break. I am
using XHTML at my site, because I want to incorporate inline SVG, in
addition to RDFa. An unfortunate consequence of XHTML is its less than
forgiving nature regarding playful pranks such as this.

I'm assuming the WhatWG member thought the act was clever. It was, indeed.
Three people emailed me to let me know the post was breaking while loading
the page in a browser, and I made sure to note that such breakage was
courtesy of a WhatWG member, who decided that perhaps I should just shut up,
here and at my site, about the Important Work people(?) here are doing.

Of course, the person only highlighted why it is so important that
something such as RDFa, and SVG, and MathML, get a home in HTML5. XHTML is
hard to support when you're allowing comments and external input. Typically
my filters will catch the accidental input of crappy markup, but not the
intentional. Not yet. I'm not an expert at markup, but I know more than the
average person. And the average person most likely doesn't have my
commitment, either.
  

http://annevankesteren.nl/2009/01/xml-sunday shows the commentor (who by the
way seems to be on your side in this debate) simply forgot to escape
self-closed / and then WordPress somehow messed up in an attempt to fix
it. I don't think anyone tries to make you shut up.



Ouch! Thanks Anne for the screenshot, otherwise I wouldn't have known
that it was my comment the one causing the issue.
My apologies Shelley for that incident. I assure you that it was not
intentional: it was a quite long post, I used some markup with the
intention of making it more readable (like italicizing the quotes), and
by the end I messed things up. Thanks to the preview page I noticed
some issues, like that I had to escape the <sarcasm>...</sarcasm>
for it to display (I'm too used to BBCode, which leaves unrecognized
markup as is), but I didn't catch the self-closed / one (nor the
preview page did: it showed up without issues).
  
Eduard,  no worries. Your comment just demonstrated that a secondary 
preview after editing is needed to self-catch these types of errors.


Sorry for the misunderstanding. That and Anne's image, and trying to 
wade through the markup and figure out what was going on, because this 
error should have been caught, put me in an irritated mood. Especially 
since I have had people deliberately trip up my comments every time I 
write about XHTML et al (ie the Philipe Anne mentions).


But no worries, and I shouldn't have made such a jump in assumption.

Shelley




Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-20 Thread Shelley Powers

Ian Hickson wrote:

On Sun, 18 Jan 2009, Shelley Powers wrote:
  

The more use cases there are, the better informed the results will be.
  
The point isn't to provide use cases. The point is to highlight a 
serious problem with this working group--there is a mindset of what the 
future of HTML will look like, and the holders of the mindset brook no 
challenge, tolerate no disagreement, and continually move to quash any 
possibility of asserting perhaps even the faintest difference of 
opinion.



I'm certainly sad that this is the impression I have given. I'd like to 
clarify for everyone's sake that this mailing list is definitely open to 
any proposals, any opinions, any disagreement. The only thing I ask is 
that people use rational debate, back up their opinions with logical 
arguments, present research to justify their claims, and derive proposals 
from user needs.


  
I've been especially critical of you, which isn't fair. At the same 
time, as you have said yourself, you are a benevolent dictator, which 
seems to me to not be the best strategy for an inclusive HTML for the 
future.


I know I'm not comfortable with the concept. But I'm also late to this 
group, and shouldn't disrupt if the strategy works.
  
Regardless, I got the point in the comment. That, combined with this 
email from Ian, tells us that it doesn't matter how our arguments run, 
the logic of our debate, the rightness of our cause--he is the final 
arbiter, and he does not want RDFa.



For the record, I am as open to us including a feature like RDFa as I am 
to us including a feature like MathML, SVG, or indeed anything else. While 
I may present a devil's advocate position to stimulate critical 
consideration of proposals, this does not mean that my mind is made up. If 
my mind was made up, I wouldn't be asking for use cases, and I wouldn't 
be planning to investigate the issue further in April.



  
There is a fine difference between being the devil's advocate, and the 
devil's front door made of thick oak, with heavy brass fittings.


How does one know if one has provided a use case in a format that is 
more likely to meet a successful outcome, than not? Are the criteria 
documented somewhere? It's difficult to provide use cases with the 
twenty questions approach.


What are the criteria by which a possible solution to a problem is 
judged? Is there a consistent set of questions asked? Tests made? A 
certain number of implementations? Again, is this documented somewhere?


I am not paid by Google, or Mozilla, or IBM to continue throwing away my 
time, arguing for naught.



It may be worth pointing out that many of our most active participants 
are volunteers, not paid by anyone to participate. Indeed I myself spent 
many years contributing to the standards community while unemployed or 
while a student. I am sorry you feel that you need to be compensated for 
your participation in the standards community, and wish you the best of 
luck in finding a suitable employer.


  
The point I was trying to make, and forgive me if my writing was too 
subtle, is that it's not the fact that the work will take time, but whether 
the time will be well spent.


Operating in the dark and tossing use cases in hopes they stick against 
the wall, without understanding criteria is not a particularly good use 
of time. However, having specific tasks that meet a given goal, and 
knowing that the goal is stable, and not a moving target, goes a long 
way to ensuring that the time spent has value.


Knowing that one can, with diligence, ensure that the best result occurs 
is a good use of time.


Spitting into the wind, at the whim and whimsy of a benevolent dictator, 
is not a good use of time.



As far as Google goes, we have no corporate opinion either way on the 
topic of RDFa in HTML5. We do, however, encourage the continued practice 
of basing decisions on data rather than hopes.


  


Bully for Google.

Shelley


Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-18 Thread Shelley Powers

Ian Hickson wrote:

On Sat, 17 Jan 2009, Sam Ruby wrote:
  
But back to expectations.  I've seen references elsewhere to Ian being 
booked through the end of this quarter.  I may have misheard, but in any 
case, my point is the same: if this is awaiting something from Ian, it 
will be prioritized and dealt with accordingly.



For what it's worth, my current plan running up to last call in October 
includes an item in April for me to go through all the use cases that have 
by that point been put forward in the data markup space, and to work out 
for each use case and on the aggregate:


1. Whether there is compelling evidence that users want that use case 
   addressed (e.g. whether there are successful companies addressing that 
   use case using proprietary solutions or ad-hoc extensions to HTML, or
   whether there are usability studies or some independent market research 
   showing demand from users, or whether it can be demonstrated that users 
   are avoiding the Web because it doesn't address this problem).
  


Again, you've become gatekeeper Ian. You are the one making the decision 
as to worth. You are the only one, as far as I can see, that is making 
decisions about what is, or is not included in the next version of HTML.


You use I so frequently. Reading through your emails, one can't help 
wondering if you're the lead singer and everyone else here is nothing 
more than a faint echo.


2. Whether the use case is being addressed well enough already (e.g. if 
   there are companies addressing this use case adequately, or whether the 
   current solutions really are just hacks with numerous problems).


3. What the requirements are for each use case.

4. What solutions are available to address these use cases.

5. For each solution, whether it addresses the requirements.

6. Whether the relevant implementors are interested in implementing 
   solutions for these use cases (e.g. whether authoring tools are willing 
   to expose the feature, whether validator writers want to check for the 
   correctness, whether browser vendors are willing to expose the relevant 
   UI, whether search engine companies are willing to use the data, or 
   whatever else might be appropriate).


The more use cases there are, the better informed the results will be.

  


The point isn't to provide use cases. The point is to highlight a 
serious problem with this working group--there is a mindset of what the 
future of HTML will look like, and the holders of the mindset brook no 
challenge, tolerate no disagreement, and continually move to quash any 
possibility of asserting perhaps even the faintest difference of opinion.


My apologies for not responding sooner to this thread. You see, one of 
the WhatWG working group members thought it would be fun to add a 
comment to my Stop Justifying RDF and RDFa web post, which caused the 
page to break. I am using XHTML at my site, because I want to 
incorporate inline SVG, in addition to RDFa. An unfortunate consequence 
of XHTML is its less than forgiving nature regarding playful pranks such 
as this.


I'm assuming the WhatWG member thought the act was clever. It was, 
indeed. Three people emailed me to let me know the post was breaking 
while loading the page in a browser, and I made sure to note that such 
breakage was courtesy of a WhatWG member, who decided that perhaps I 
should just shut up, here and at my site, about the Important Work 
people(?) here are doing.


Of course, the person only highlighted why it is so important that 
something such as RDFa, and SVG, and MathML, get a home in HTML5. XHTML 
is hard to support when you're allowing comments and external input. 
Typically my filters will catch the accidental input of crappy markup, 
but not the intentional. Not yet. I'm not an expert at markup, but I 
know more than the average person. And the average person most likely 
doesn't have my commitment, either.


Someone earlier said that HTML5 is for web application users, only, and 
that the rest of us interested in things like RDFa should just use 
XHTML. In other words, make it good for Google and to hell with the rest 
of us. This, this is the guiding attitude behind the future of the web?


Regardless, I got the point in the comment. That, combined with this 
email from Ian, tells us that it doesn't matter how our arguments run, 
the logic of our debate, the rightness of our cause--he is the final 
arbiter, and he does not want RDFa. I am not paid by Google, or Mozilla, 
or IBM to continue throwing away my time, arguing for naught.


Shelley



Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-18 Thread Shelley Powers

Anne van Kesteren wrote:
On Sun, 18 Jan 2009 16:22:40 +0100, Shelley Powers 
shell...@burningbird.net wrote:
My apologies for not responding sooner to this thread. You see, one 
of the WhatWG working group members thought it would be fun to add a 
comment to my Stop Justifying RDF and RDFa web post, which caused the 
page to break. I am using XHTML at my site, because I want to 
incorporate inline SVG, in addition to RDFa. An unfortunate 
consequence of XHTML is its less than forgiving nature regarding 
playful pranks such as this.


I'm assuming the WhatWG member thought the act was clever. It was, 
indeed. Three people emailed me to let me know the post was breaking 
while loading the page in a browser, and I made sure to note that 
such breakage was courtesy of a WhatWG member, who decided that 
perhaps I should just shut up, here and at my site, about the 
Important Work people(?) here are doing.


Of course, the person only highlighted why it is so important that 
something such as RDFa, and SVG, and MathML, get a home in HTML5. 
XHTML is hard to support when you're allowing comments and external 
input. Typically my filters will catch the accidental input of crappy 
markup, but not the intentional. Not yet. I'm not an expert at 
markup, but I know more than the average person. And the average 
person most likely doesn't have my commitment, either.


http://annevankesteren.nl/2009/01/xml-sunday shows the commentor (who 
by the way seems to be on your side in this debate) simply forgot to 
escape self-closed / and then WordPress somehow messed up in an 
attempt to fix it. I don't think anyone tries to make you shut up.



(And if we, the evil WHATWG cabal, wanted to break your site, we 
would've asked Philip` ;-))



You're not seeing all of the markup that caused problems, Anne. The 
intention was to crash the post. However, I shouldn't have assumed that 
the person who inserted the markup that caused the problems is a WhatWG 
member. My apologies.


Regardless of intent, it does demonstrate, again, why it is important 
for RDFa, SVG, and MathML to find a home in HTML5. XHTML is a very 
difficult markup to support when you're allowing outside input. The 
tools do not do a good job of supporting XHTML, and hence the average 
person finds such failures to be intimidating, and will immediately 
return to HTML. Heck, I find the yellow screen of death to be unnerving 
myself. It's only my interest in inline SVG and RDFa, and basically 
distributed extensibility, that keeps me trying.
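
For readers who have not run into this failure mode, the following Python sketch shows the general shape of the problem. It is not the markup from the incident described above; the sample comment, the wrap helper, and the is_well_formed helper are all made up for illustration. An XML parser rejects a page outright once a comment with unbalanced tags is dropped into it, while the same comment passes through once it has been escaped.

```python
import html
import xml.etree.ElementTree as ET

# A made-up visitor comment with an unclosed <i> and a bare <br>.
comment = "I used <i>italics and a stray <br> tag"


def wrap(body):
    """Place the comment body inside a minimal page fragment."""
    return f"<div>{body}</div>"


def is_well_formed(fragment):
    """Return True if an XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Inserted as-is, the comment makes the whole fragment ill-formed.
print(is_well_formed(wrap(comment)))

# Escaped first, the fragment parses and the text survives intact.
safe = html.escape(comment)
print(is_well_formed(wrap(safe)))
print(ET.fromstring(wrap(safe)).text == comment)
```

Escaping everything is the bluntest possible filter, since it also discards markup the commenter intended. A filter that keeps some tags has to repair or reject unbalanced input itself, which is where tools tend to slip.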


And regardless of the fact that I jumped to conclusions about WhatWG 
membership, I do not believe I was inaccurate with the earlier part of 
this email. Sam started a new thread in the discussion about the issues 
of namespace and how, perhaps we could find a way to work the issues 
through with RDFa. My god, I use RDFa in my pages, and they load fine 
with any browser, including IE. I have to believe its incorporation into 
HTML5 is not the daunting effort that others make it seem to be.


However, the debate ended as soon as Ian re-asserted his authority.

Shelley





[whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-17 Thread Shelley Powers
The debate about RDFa highlights a disconnect in the decision making 
related to HTML5.


The purpose behind RDFa is to provide a way to embed complex information 
into a web document, in such a way that a machine can extract this 
information and combine it with other data extracted from other web 
pages. It is not a way to document private data, or data that is meant 
to be used by some JavaScript-based application. The sole purpose of the 
data is for external extraction and combination.
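
As a rough illustration of what "external extraction" can mean in practice, the Python sketch below pulls (subject, property, value) statements out of a small page. It is a sketch under stated assumptions, not an RDFa processor: the sample markup, the example.org URLs, and the ProfileExtractor class are made up for illustration, and the code only looks at a tiny subset of RDFa-style attributes (about, property, rel, href) without following the RDFa processing rules.

```python
from html.parser import HTMLParser


class ProfileExtractor(HTMLParser):
    """Hypothetical helper: collects (subject, predicate, object) tuples.

    Simplification: the most recent `about` value stays the subject for
    everything that follows; real RDFa scopes subjects to element nesting.
    """

    def __init__(self):
        super().__init__()
        self.triples = []
        self._subject = None
        self._pending = None  # property name waiting for its text value

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "about" in a:
            self._subject = a["about"]
        if "rel" in a and "href" in a:
            # A link to another resource, e.g. a friend.
            self.triples.append((self._subject, a["rel"], a["href"]))
        if "property" in a:
            self._pending = a["property"]

    def handle_data(self, data):
        text = data.strip()
        if self._pending and text:
            self.triples.append((self._subject, self._pending, text))
            self._pending = None


# Made-up sample page; the URLs are placeholders.
PAGE = """
<div about="http://example.org/alice#me">
  <span property="foaf:name">Alice</span>
  <a rel="foaf:knows" href="http://example.org/bob#me">Bob</a>
</div>
"""


def extract(markup):
    parser = ProfileExtractor()
    parser.feed(markup)
    parser.close()
    return parser.triples


if __name__ == "__main__":
    for triple in extract(PAGE):
        print(triple)
```

Statements gathered this way from several pages could then be merged into a single set, which is the "combination" half of the purpose described above.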


An earlier email between Martin Atkins and Ian Hickson had the following:

On Sun, 11 Jan 2009, Martin Atkins wrote:

 One problem this can solve is that an agent can, given a URL that
 represents a person, extract some basic profile information such as the
 person's name along with references to other people that person knows.
 This can further be applied to allow a user who provides his own URL
 (for example, by signing in via OpenID) to bootstrap his account from
 existing published data rather than having to re-enter it.

 So, to distill that into a list of requirements:

 - Allow software agents to extract profile information for a person as often
 exposed on social networking sites from a page that represents that person.


 - Allow software agents to determine who a person lists as their friends
 given a page that represents that person.

 - Allow the above to be encoded without duplicating the data in both
 machine-readable and human-readable forms.

 Is this the sort of thing you're looking for, Ian?

Yes, the above is perfect. (I cut out the bits that weren't really the
problem from the quote above -- the above is what I'm looking for.)

The most critical part is "allow a user who provides his own URL to
bootstrap his account from existing published data rather than having to
re-enter it". The one thing I would add would be a scenario that one would
like to be able to play out, so that we can see if our solution would
enable that scenario.

For example:

  I have an account on social networking site A. I go to a new social
  networking site B. I want to be able to automatically add all my
  friends from site A to site B.

There are presumably other requirements, e.g. site B must not ask the
user for the user's credentials for site A (since that would train people
to be susceptible to phishing attacks). Also, site A must not publish the
data in a manner that allows unrelated users to obtain privacy-sensitive
data about the user, for example we don't want to let other users
determine relationships that the user has intentionally kept secret [1].

It's important that we have these scenarios so that we can check if the
solutions we consider are actually able to solve these problems, these
scenarios, within the constraints and requirements we have.


It would seem that Ian agrees both that a) there is a need for a way to 
document complex information in a consistent, machine-readable form and 
that b) the purpose of this data is external consumption, rather 
than internal use. Where the disconnect comes in is that he believes 
RDF, and its web page serialization technique, RDFa, are only one of a 
set of possible solutions.


Yet at the same time, he references how the MathML and SVG people 
provided sufficient use cases to justify the inclusion of both of these 
in HTML5. But what is MathML? What does it solve? It is a way to include 
mathematical formulas in a document in a formatted manner. What is SVG? 
A way to embed vector graphics into a web page, in such a way that the 
individual elements described by the graphics can become part of the 
overall DOM.


So, why accept that we have to use MathML in order to solve the problems 
of formatting mathematical formulas? Why not start from scratch, and 
devise a new approach?


So, why accept that we have to use SVG in order to solve the problems of 
vector graphics? Why not start from scratch, and devise a new approach?


Come to think of it, I think we should also question the use of the 
canvas element. After all, if the problem set is that we need the 
ability to animate graphics in a web page using a non-proprietary 
technology, then wouldn't something like SVG work for this purpose? 
Isn't the canvas element redundant? But then, perhaps we should start 
over from the beginning and just create a new graphics capability from 
scratch, and reject both canvas and SVG.


We don't reject MathML, though. Neither do we reject SVG or canvas. Or 
any other of a number of entities being included in HTML5, including 
SQL. Why? Because they have a history of use, extensive documentation as 
to purpose and behavior, and there are a considerable number of 
implementations that support the specifications. It doesn't make sense 
to start from scratch. It makes more sense to make use of what already 
works.


I have to ask, then: why do we isolate RDF, and RDFa for special 
handling? If we can accept that SQL is a natural database query 
mechanism, and SVG is a natural for 

Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-17 Thread Shelley Powers

Dan Brickley wrote:

On 17/1/09 19:27, Sam Ruby wrote:

On Sat, Jan 17, 2009 at 11:55 AM, Shelley Powers
shell...@burningbird.net  wrote:
The debate about RDFa highlights a disconnect in the decision making related
to HTML5.


Perhaps.  Or perhaps not.  I am far from an apologist for Hixie, (nor
for that matter am I a strong advocate for RDF), but I offer the
following question and observation.

The purpose behind RDFa is to provide a way to embed complex information
into a web document, in such a way that a machine can extract this
information and combine it with other data extracted from other web pages.
It is not a way to document private data, or data that is meant to be used
by some JavaScript-based application. The sole purpose of the data is for
external extraction and combination.


So, I take it that it isn't essential that RDFa information be
included in the DOM?  This is not rhetorical: I honestly don't know
the answer to this question.


Good question. I for one expect RDFa to be accessible to Javascript.

http://code.google.com/p/rdfquery/wiki/Introduction - 
http://rdfquery.googlecode.com/svn/trunk/demos/markup/markup.html is a 
nice example of code that does something useful in this way.


cheers,

Dan



I agree, and appreciate Dan for pointing out a specific instance of use.

Apologies for not making the assertion explicit.

Shelley

--
http://danbri.org/





Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-17 Thread Shelley Powers

Sam Ruby wrote:

On Sat, Jan 17, 2009 at 1:33 PM, Dan Brickley dan...@danbri.org wrote:
  

On 17/1/09 19:27, Sam Ruby wrote:


On Sat, Jan 17, 2009 at 11:55 AM, Shelley Powers
shell...@burningbird.net  wrote:
  

The debate about RDFa highlights a disconnect in the decision making
related
to HTML5.


Perhaps.  Or perhaps not.  I am far from an apologist for Hixie, (nor
for that matter am I a strong advocate for RDF), but I offer the
following question and observation.

  

The purpose behind RDFa is to provide a way to embed complex information
into a web document, in such a way that a machine can extract this
information and combine it with other data extracted from other web
pages.
It is not a way to document private data, or data that is meant to be
used
by some JavaScript-based application. The sole purpose of the data is for
external extraction and combination.


So, I take it that it isn't essential that RDFa information be
included in the DOM?  This is not rhetorical: I honestly don't know
the answer to this question.
  

Good question. I for one expect RDFa to be accessible to Javascript.

http://code.google.com/p/rdfquery/wiki/Introduction -
http://rdfquery.googlecode.com/svn/trunk/demos/markup/markup.html is a nice
example of code that does something useful in this way.



The fact that this works anywhere at all today implies that little, if
any, changes to browsers is required in order to support this.  Is
that a fair statement?

I've not taken a look at the code, but have taken a quick glance at
the output using IE8.0.7000.0 beta, Safari 3.2.1/Windows, Chrome
1.0.154.43, Opera 9.63, and Firefox 3.0.5.

The page is different (as in less functional) under IE8 and Safari.
Is there something that they need to do which is not already covered
in the HTML5 specification in order to support this?
  


I would think we would have to go through the code to see why this 
specific instance of client-side access of the RDFa isn't working. The 
debugger I'm using with IE8 shows the problem is occurring in the jQuery 
code, not necessarily anything specific to the RDFa plugin.


I know that other JavaScript libraries that work with RDFa do work, at 
least with Safari. For instance:


http://www.w3.org/2006/07/SWD/RDFa/impl/js/

Since this library was vetted for IE7, would assume it would work for 
IE8, too.


Of course, the RDFa attributes aren't incorporated into HTML5, which 
means their use would result in an invalid document. And of course, if 
they were incorporated, the issue of namespace for them would have to be 
addressed as namespaces were for MathML and SVG.


Shelley

- Sam Ruby

  




Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-17 Thread Shelley Powers

Ian Hickson wrote:

On Sat, 17 Jan 2009, Sam Ruby wrote:
  

Shelley Powers wrote:

So, why accept that we have to use MathML in order to solve the 
problems of formatting mathematical formula? Why not start from 
scratch, and devise a new approach?
  

Ian explored (and answered) that here:

http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-April/014372.html

Key to Ian's decision was the importance of DOM integration for this 
vocabulary.  If DOM integration is essential for RDFa, then perhaps the 
same principles apply.  If not, perhaps some other principles may apply.



Sam's point here bears repeating, because there seems to be an impression 
that we took on SVG and MathML without any consideration, while RDF is 
getting an unfair reception.


On the contrary, SVG and MathML got the same reception. For MathML, for 
instance, a number of options were very seriously considered, most notably 
LaTeX. For SVG, we considered a variety of options including VML.


I would encourage people to read the e-mail Sam cited:

   http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-April/014372.html

It's long, but the start of it is a summary of what was considered and 
shows that the same process derived from use cases was used for SVG and 
MathML as is being used on this thread here.


  
I'm not doubting the effort that went into getting MathML and SVG 
accepted. I've followed the effort associated with SVG since the beginning.


I'm not sure if the same procedure was also applied to the canvas 
object, as well as the SQL query capability. Will assume so.


The point I'm making is that you set a precedent, and a good one I 
think: giving precedence to not invented here. In other words, to not 
re-invent new ways of doing something, but to look for established 
processes, models, et al already in place, implemented, vetted, etc, 
that solve specific problems. Now that you have accepted a use case, 
Martin's, and we've established that RDFa solves the problem associated 
with the use case, the issue then becomes whether there is another data 
model already as vetted, documented, and implemented that would better 
solve the problem.


I propose that RDFa is the best solution to the use case Martin 
supplied, and we've shown how it is not a disruptive solution to HTML5.


The fact that it is based on RDF, a mature, well documented, widely used 
model with many different implementations is a perk.


Shelley



Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-17 Thread Shelley Powers

Henri Sivonen wrote:

On Jan 17, 2009, at 20:33, Dan Brickley wrote:


Good question. I for one expect RDFa to be accessible to Javascript.

http://code.google.com/p/rdfquery/wiki/Introduction - 
http://rdfquery.googlecode.com/svn/trunk/demos/markup/markup.html is 
a nice example of code that does something useful in this way.



Does this code run the same way on both DOMs parsed from text/html and 
application/xhtml+xml in existing browsers without at any point 
branching on a condition that is a DOM difference between 
text/html-originated and application/xhtml+xml-originated DOMs?


I don't want to specifically look at just the one case, since it is not 
working in Safari and IE8, and is too complex to debug right at this moment.


Generally, though, RDFa is based on reusing a set of attributes already 
existing in HTML5, and adding a few more. I would assume no differences 
in the DOM based on XHTML or HTML. The one issue that would occur has to 
do with the values assigned, not the syntax.


I put together a very crude demonstration of JavaScript access of a 
specific RDFa attribute, about. It's temporary, but if you go to my main 
web page, http://realtech.burningbird.net, and look in the sidebar for 
the click me text, it will traverse each div element looking for an 
about attribute, and then pop up an alert with the value of the 
attribute. I would use console rather than alert, but I don't believe 
all browsers support console, yet.
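
The same kind of traversal can be tried outside a browser. The sketch below is not the script from the page mentioned above; it is a hypothetical Python analogue, assuming only that the goal is to visit every div element and report the value of its about attribute. The class name, function name, and sample markup are invented for illustration.

```python
from html.parser import HTMLParser


class AboutCollector(HTMLParser):
    """Hypothetical helper: records the `about` value of each div element."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this,
        # so <DIV ABOUT="..."> is treated the same as <div about="...">.
        if tag != "div":
            return
        for name, value in attrs:
            if name == "about":
                self.values.append(value)


def about_values(markup):
    collector = AboutCollector()
    collector.feed(markup)
    collector.close()
    return collector.values


# Made-up sample markup for illustration only.
SAMPLE = (
    '<div about="#post-1"><p about="#ignored">text</p></div>'
    "<div>no about attribute here</div>"
    '<DIV ABOUT="#post-2"></DIV>'
)

if __name__ == "__main__":
    for value in about_values(SAMPLE):
        print(value)
```

Only the about values on div elements are reported; the attribute on the p element is skipped, matching the description of a walk over div elements.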


Access the page using Firefox, which is served the page as XHTML. Access 
it with IE8, which gets the page as HTML. You can tell the difference 
because my graphics are based on inline SVG, and will only show if the 
page is served as XHTML.


So, yes, with my quick, crude demonstration, DOM access is the same in 
both environments.


Shelley






Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-17 Thread Shelley Powers

Henri Sivonen wrote:

On Jan 17, 2009, at 21:38, Shelley Powers wrote:

I'm not doubting the effort that went into getting MathML and SVG 
accepted. I've followed the effort associated with SVG since the 
beginning.


I'm not sure if the same procedure was also applied to the canvas 
object, as well as the SQL query capability. Will assume so.


Note that SVG, MathML and SQL have had different popularity 
trajectories in top four browser engines than RDF.


SVG is going up. At the time it was included in HTML5 (only to be 
commented out shortly thereafter), three of the top browser engines 
implemented SVG for retained-mode vector graphics and their SVG 
support was actively being improved. (One of the top four engines 
implemented VML, though.)


At the time MathML was included in HTML5, it was supported by Gecko 
with renewed investment into it as part of the Cairo migration. Also, 
Opera added some MathML features at that time. Thus, two of the top 
four engines had active MathML development going on. Further, one of 
the major MathML implementations is an ActiveX control for IE.


When SQL was included in HTML5, Apple (in WebKit) and Google (in 
Gears) had decided to use SQLite for this functionality. Even though 
Firefox doesn't have a Web-exposed database, Firefox also already 
ships with embedded SQLite. At that point it would have been futile 
for HTML5 to go against the flow of implementations.


The story of RDF is very different. Of the top four engines, only 
Gecko has RDF functionality. It was implemented at a time when RDF was 
a young W3C REC and things that were W3C RECs were implemented less 
critically than nowadays. Unlike SVG and MathML, the RDF code isn't 
actively developed (see hg logs). Moreover, the general direction 
seems to be away from using RDF data sources in Firefox internally.




Now wait a second, you're changing the parameters of the requirements. 
Before, the criteria were based on the DOM. Now you're saying that the 
browsers actually have to do something with it.


Who is to say what the browsers will do with RDF in the future?

In addition, is that the criteria for pages on the web -- that every 
element in them has to result in different behaviors in browsers, only? 
What about other user agents?


That seems to me to be looking for RDFa-sized holes and then throwing 
them into the criteria, specifically to trip up RDF, and hence, RDFa.



Meanwhile, the feed example you gave--RSS 1.0--shows how the feed spec 
community knowingly moved away from RDF with RSS 2.0 and Atom. 
Furthermore, RSS 1.0 usually isn't parsed into an RDF graph but is 
treated as XML instead. If RSS 1.0 is evidence, it's evidence 
*against* RDF.


The point I'm making is that you set a precedent, and a good one I 
think: giving precedence to not invented here. In other words, to 
not re-invent new ways of doing something, but to look for 
established processes, models, et al already in place, implemented, 
vetted, etc, that solve specific problems. Now that you have accepted 
a use case, Martin's, and we've established that RDFa solves the 
problem associated with the use case, the issue then becomes whether 
there is another data model already as vetted, documented, and 
implemented that would better solve the problem.


Clearly, RDFa wasn't properly vetted--as far as the desire to deploy 
it in text/html goes--when the outcome was that it ended up using 
markup that doesn't parse into the DOM the same way in HTML and XML.


SVG and MathML were both created as XML, and hence were not vetted for 
text/html, either. And yet, here they are. Well, here they'll be, 
eventually.


Come to that -- I don't think the creators of SQL actually ever expected 
that someday SQL queries would be initiated from HTML pages.


Shelley



Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-17 Thread Shelley Powers

Sam Ruby wrote:

On Sat, Jan 17, 2009 at 2:38 PM, Shelley Powers
shell...@burningbird.net wrote:
  

I propose that RDFa is the best solution to the use case Martin supplied,
and we've shown how it is not a disruptive solution to HTML5.



Others may differ, but my read is that the case is a strong one.  But
I will caution you that a little patience is in order.  SVG is not a
done deal yet.  I've been involved in a number of standards efforts,
and I've never seen a case of proposed on a Saturday morning, decided
on a Saturday afternoon.  One demo is not conclusive.  Now you
mention that there exists a number of libraries.  I think that's
important.  Very important.  Possibly conclusive.
  
I am patient. Look at me? I make extensive use of both SVG and RDF -- 
that is the mark of a patient woman.

But back to expectations.  I've seen references elsewhere to Ian being
booked through the end of this quarter.  I may have misheard, but in
any case, my point is the same: if this is awaiting something from
Ian, it will be prioritized and dealt with accordingly.  If, however,
some of the legwork is done for Ian, this may help accelerate the
effort.
  
First of all, whatever happens has to happen with vetting by the 
RDF/RDFa folks, if not their active help. This is my way of saying, I'd 
be willing to do much of the legwork, but I want to make sure I don't 
represent RDFa incorrectly.


Secondly, my finances have been caught up in the current downturn, and 
my first priority has to be on the hourly work and odd jobs I'm getting 
to keep afloat. Which means that I can't always guarantee 20+ hours a 
week on a task, nor can I travel. Anywhere.


But if both are acceptable conditions, I'm willing to help with tasks.

Even little things may help a lot.  I know what I'm about to say may
be unpopular, but I'll say it anyway: take a few good examples of RDFa
and run them through Henri's validator.  The validator will helpfully
indicate exactly what areas of the spec would need to be updated in
order to accommodate RDFa.  The next step would be to take a look at
those sections.  If the update is obvious and straightforward, perhaps
nothing more is required.  But if not, researching into the options
and making recommendations may help.

  

Tasks including this one.

Shelley


- Sam Ruby

  




Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-17 Thread Shelley Powers




The assumption is incorrect.

Please compare
http://hsivonen.iki.fi/test/moz/xmlns-dom.html
and
http://hsivonen.iki.fi/test/moz/xmlns-dom.xhtml

Same bytes, different media type.

I put together a very crude demonstration of JavaScript access of a 
specific RDFa attribute, about. It's temporary, but if you go to my 
main web page, http://realtech.burningbird.net, and look in the 
sidebar for the click me text, it will traverse each div element 
looking for an about attribute, and then pop up an alert with the 
value of the attribute. I would use console rather than alert, but I 
don't believe all browsers support console, yet.


This misses the point, because the inconsistency is with attributes 
named xmlns:foo.
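
The "same bytes, different parser" effect can be sketched with Python's standard library, with a loud caveat: these are Python's parsers, not browser DOMs, and the details differ from what the two test pages above show in a browser. The sketch only illustrates that an HTML-style parser reports xmlns:foo as an ordinary attribute name, while a namespace-aware XML parser treats it as a namespace declaration. The markup, the example.org namespace, and the function names are assumptions made up for illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Made-up markup; the foo namespace URL is a placeholder.
MARKUP = (
    '<html xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:foo="http://example.org/foo">'
    '<body><p foo:bar="baz">x</p></body></html>'
)


class AttrNames(HTMLParser):
    """Hypothetical helper: maps each tag name to its attribute names."""

    def __init__(self):
        super().__init__()
        self.names = {}

    def handle_starttag(self, tag, attrs):
        self.names[tag] = [name for name, _ in attrs]


def html_view(markup):
    """Attribute names as seen by a non-namespace-aware HTML parser."""
    parser = AttrNames()
    parser.feed(markup)
    parser.close()
    return parser.names


def xml_view(markup):
    """Attribute names as seen by a namespace-aware XML parser."""
    root = ET.fromstring(markup)
    # Strip the "{namespace}" prefix from tag names for easier comparison.
    return {el.tag.split("}")[-1]: sorted(el.attrib) for el in root.iter()}


if __name__ == "__main__":
    print("html:", html_view(MARKUP))
    print("xml: ", xml_view(MARKUP))
```

In this sketch the HTML-style view lists xmlns and xmlns:foo as attributes of the html element, while ElementTree consumes them as declarations and instead expands foo:bar into a namespace-qualified name. Any script that looks up prefixes through attributes named xmlns:foo therefore has to account for which parser produced the tree.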


And I also said that we would have to address the issue of namespaces, 
which actually may require additional effort. I said that the addition 
of RDFa would mean the addition of some attributes, and we would have to 
deal with namespace issues. Just like the HTML5 working group is having 
to deal with namespaces with MathML and SVG. And probably the next dozen 
or so innovations that come along. That is the price for not having 
distributed extensibility.


One works the issues. I assume the same could be said of many of the 
newer additions to HTML5. Are you then saying that this will be a 
showstopper, and there will never be either a workaround or compromise?


Shelley