Re: [whatwg] Menus and Toolbars

2012-11-28 Thread Eduard Pascual
Greetings,

I'm not a browser implementor, but I have been dealing quite a bit with
menus and other GUI stuff from the web development perspective; so here
are some comments I hope implementors might find interesting:

I have put a first example on http://std.dragon-tech.org/mainmenu.png
All you can see there is doable with plain CSS and HTML. It doesn't even
use any CSS3 features. In fact, the whole thing works fine in IE8.
As Fred Andrews mentioned, menus are quite easy to achieve with no
scripting, as long as one stays within hover-based menus.
Of course, most of the buttons and items in the screenshot require JS to
actually do anything, but that's because of the app's own complexity.
All the stuff in that window should be pretty much accessible (everything
in there is ul's, li's, and img's with matching @alt and @title for the
toolbar, which contains only shortcuts anyway, plus loads of @onclick
attributes that could easily be replaced by a's with @href's in a more
web-ish scenario).
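To make this concrete, the whole hover-menu pattern boils down to nested
lists plus a couple of CSS rules. A minimal sketch (not the app's actual
markup; ids and items are invented for illustration):

```html
<!-- Minimal hover-based menu bar: nested lists + CSS 2.1 rules only,
     which is why this kind of thing works fine even in IE8. -->
<style>
  #menu > li { display: inline-block; position: relative; }
  #menu li ul { display: none; position: absolute; }
  #menu li:hover > ul { display: block; }
</style>
<ul id="menu">
  <li>File
    <ul>
      <li><a href="#new">New</a></li>
      <li><a href="#open">Open</a></li>
    </ul>
  </li>
</ul>
```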

In summary, menu bars and toolbars are pretty much a solved problem.

Now, here is when stuff gets serious:
http://std.dragon-tech.org/contextmenu.png
To be honest, I hate most of the code behind that with a passion... and I
wrote it! I get paid for making stuff work, and thus I make stuff work, no
matter what it takes.
The context menu requires JS to display at all. It overrides the
browser's native menu (which would sometimes be useful). There is a huge,
empty, transparent div covering the whole iframe (the main area on the
screen with the table on it) just so it can catch clicks and dismiss the
menu. The context menu key (that nice thing between right Ctrl and the
Windows key) doesn't do anything (it triggers neither my menu nor the
browser's) when the iframe has focus.
Don't get me wrong, I quite love what I pulled off, and so do most of the
app's users; but I loathe the truckload of crap code I have to feed the
browser for everything to work as intended.

So, in summary, context menus are somewhat achievable with significant
scripting, some creativity, and epic x-browser patience; but it's still a
problem far from solved.
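For the curious, the show/dismiss hack I am complaining about boils down to
something like this (heavily simplified; the ids, inline styles, and menu
items are made up for illustration):

```html
<!-- Sketch of the scripted context-menu hack: an invisible full-size
     overlay catches the click that should dismiss the menu. -->
<div id="overlay" style="position:fixed; top:0; left:0; width:100%;
     height:100%; display:none;"></div>
<ul id="ctxmenu" style="position:absolute; display:none;">
  <li><b>Open</b></li>
  <li>Delete</li>
</ul>
<script>
  document.oncontextmenu = function (e) {
    e = e || window.event;
    var menu = document.getElementById("ctxmenu");
    var overlay = document.getElementById("overlay");
    menu.style.left = e.clientX + "px";
    menu.style.top = e.clientY + "px";
    menu.style.display = overlay.style.display = "block";
    return false; // suppresses the browser's native menu, always
  };
  document.getElementById("overlay").onclick = function () {
    this.style.display = "none";
    document.getElementById("ctxmenu").style.display = "none";
  };
</script>
```

And even all of that still leaves the context menu key broken inside the
iframe.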



As a web developer, what I'd love to see implemented natively would be:

* A mechanism (most probably an attribute + a CSS pseudo-class, or maybe
even recycling :hover) to show click-based menu bars some script-less love.
* A mechanism to deal with context menus in a sane way. It all boils down
to UAs natively handling the showing and dismissing of the menus; and
maybe enabling a mechanism to define a default (did you notice the bolded
option at the start of the menu on the screenshot? Yay, double-clicking
instead of right-clicking does invoke it, skipping the whole "show the
menu" step!). That would cause it to work with any native method the
browser or platform already supports for displaying context menus.

As a user, I would hope any context menu implementation grants me
ultimate control over which menu is used (native vs. app-provided).

Of course, other users, authors, and developers may have other needs, but I
can only talk about the ones that I know about.

Re-capping, my needs could be solved with this (element and attribute names
are meant to be verbose and descriptive, not practical nor final):
- An @enable-toggling-hover-state-by-clicking attribute.
- A "context-menu" or similar element, plus a mechanism to bind it to
other elements on the page. As long as the browser deals with the
show/dismiss logic, I could handle everything else. If it had a
@default-action attribute that would be great, but I could live without
it (or, more likely, toss in a @data-* attribute and loop through
elements bound to the menu to hack in the double-click handlers).
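The @data-* fallback would look roughly like this (attribute and id names
are invented for illustration, in the same spirit as the verbose names
above):

```html
<!-- Hypothetical sketch: elements bound to a context menu get a
     double-click handler that fires the menu's "default" item. -->
<script>
  var menu = document.getElementById("row-menu");
  var bound = document.querySelectorAll("[data-context-menu='row-menu']");
  for (var i = 0; i < bound.length; i++) {
    bound[i].ondblclick = function () {
      // data-default-action holds the id of the default menu item
      var def = document.getElementById(
          menu.getAttribute("data-default-action"));
      if (def && def.onclick) def.onclick();
    };
  }
</script>
```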

Regards,
Eduard Pascual


On 28 November 2012 01:12, Ian Hickson i...@hixie.ch wrote:


 (If you're cc'ed, your opinion likely affects implementations of this and
 so your input is especially requested. See the question at the end. If you
 reply to this, please strip the cc list as the mailing list software will
 otherwise block your post for having too many cc's. Thanks.)

 There's a big section in the spec that tries to do three things:

  * context menus
  * toolbars
  * menu buttons

 Right now it's not implemented by anyone, though Firefox has a variant.

http://whatwg.org/html/#the-menu-element

 This section has two big problems:

 1. Styling of toolbars and menu buttons is just not defined.

 Toolbars could be a purely stylistic issue, to be solved either exclusively
 by CSS, or by CSS plus a component/widget binding model (whatever solution we
 end up with for that).

 Menu buttons are a real widget, though, so we can't just leave them to CSS
 styling of divs, there needs to be some real styling going on. Right
 now, because of the algorithm mentioned in #2 below, this is very
 complicated. I'll get back to this.

 (Styling for context menus is not a big deal, they just use native UI.)


 2. Nobody is implementing

Re: [whatwg] Idea: pseudo-classes :valid and :invalid for whole form?

2011-06-15 Thread Eduard Pascual
2011/6/14 Rafał Miłecki zaj...@gmail.com:
 We already have the required attribute and the :valid plus :invalid
 pseudo-classes, which are nice. However some may want to display an
 additional warning when the form wasn't filled in correctly. Just a
 single warning, not a specific field-related one. Could you consider
 adding a pseudo-class on the form element for such a purpose?

 Example:
 <p id="err">You've to fill all required fields</p>
 form:invalid #err {
 display: block;
 }


This would be more a CSS Selectors concern, and there are already some
ideas at [1] that would address this.

Regards,
Eduard Pascual

[1]: http://wiki.csswg.org/spec/selectors4


Re: [whatwg] Idea: pseudo-classes :valid and :invalid for whole form?

2011-06-15 Thread Eduard Pascual
2011/6/15 Boris Zbarsky bzbar...@mit.edu:
 No, it wouldn't.  The point here is to style based on a _form_ that is
 invalid.  Whether a form is valid or not is up to the language defining
 forms, that being HTML.

Sorry, I assumed the simple definition that a form is invalid if it
contains invalid input elements, and there are proposals on the wiki
to deal with selecting an element based on its children.

Of course, something like form:invalid { ... } would be ideal, but a
syntax like $form :invalid { ... } wouldn't be too bad.

I missed the possibility of a validation script (an onsubmit event
handler) making more complex checks (like "if the first field is X, then
the second field can be anything but Y"). However, there is no clean
way to deal with those cases short of adding onchange handlers on all
inputs:
- If a browser attempts to run the onsubmit handler upon each change,
to update the style of the form as soon as it becomes (in)valid, then
there is a risk of the javascript having some side-effect that is not
intended to happen on each change.
- If a browser waits for a submit attempt, this can easily become
confusing to users (the change in style should be triggered by the
change that actually made the form invalid). That would partially
depend on the actual style applied; but it could be easily implemented
by the document author with a single javascript line (adding a class
or some other attribute to the form), so the feature would be a sort
of glorified syntax sugar.
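That "single javascript line" is essentially the following (markFormValidity
is a made-up helper name; the complex checks themselves are whatever the
page already runs on submit):

```javascript
// Sketch: mirror an onsubmit validation result into a class the
// stylesheet can select on (e.g. a rule like form.invalid #err {...}).
function markFormValidity(form, isValid) {
  form.className = isValid ? "" : "invalid";
  return isValid; // returning false from onsubmit cancels submission
}
```

Wired up as something like `form.onsubmit = function () { return
markFormValidity(this, complexChecksPass(this)); };` (complexChecksPass
being the page's own validation logic).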

Maybe a sort of compromise could be implemented: checking for invalid
individual elements and, if the onsubmit handler is known not to
trigger side-effects, checking it as well. But this sounds more like a
hack.

Is there something I am missing?

Regards,
Eduard Pascual


Re: [whatwg] Idea: pseudo-classes :valid and :invalid for whole form?

2011-06-15 Thread Eduard Pascual
2011/6/15 Boris Zbarsky bzbar...@mit.edu:
 A form need not contain its controls.  Consider this HTML document:

  <!DOCTYPE html>
  <form id="myform" action="something">
    <input type="submit">
  </form>
  <input type="number" value="abracadabra" form="myform">

So, indeed, I was missing something. Disregard my previous posts then.

Regards,
Eduard Pascual


Re: [whatwg] Content-Disposition property for a tags

2011-06-06 Thread Eduard Pascual
On Mon, Jun 6, 2011 at 6:59 PM, Dennis Joachimsthaler den...@efjot.de wrote:
 Yes, I was trying to refer to the verbosity. There's no html attributes
 with dashes in them as far as I know, except for data-, which are user-
 defined. This would kind of break the convention a little. I could think
 about having contentdispo or some shortname like this, it would fit
 better to what we currently have in html.

Maybe "disposition" could work? For the HTTP header, the "content"
part indeed refers to the content of the response; but in the case of
a link, the attribute would be referring to the linked resource
rather than the actual content of the element. So it's more accurate,
we reduce verbosity, and we get rid of the dash, all of this without
having to make the name less explicit or relying on an arbitrary
abbreviation (i.e., why "dispo" and not "disp" or "dispos"? Since there
isn't a clear boundary, it could be harder to remember; but dropping
the "content-" part seems more straightforward).

 Again, html convention: Currently html only has one statement in every
 attribute, except for things like events (which is javascript) and style
 (which is also ANOTHER language: css).

Well, meta elements with an http-equiv attribute normally carry a full
HTTP header (including parameters if needed) in their content
attribute, so I see no issue in taking a similar approach. After all,
HTTP _is_ another language (or protocol, to be more precise, but
protocols are still a kind of language).

 Seems cleaner to me if we stay to the standard and not change the syntax
 rules.
HTTP is also a standard, so we could stick to it. It all boils down to a
choice of which standard we honor above the other. Seeing that HTTP is
an actual standard, rather than a mere convention, and that we are
actually borrowing a feature from it, it looks like the winner to me.

 Please tell me if I missed anything here!
Off the top of my head, @class is defined to be a space-separated
list of class names. Sure, it is a simpler syntax, but it's still a
"multiple content" attribute. I think there are some more cases, but I
can't recall any right now.

Regards,
Eduard Pascual


Re: [whatwg] Content-Disposition property for a tags

2011-06-03 Thread Eduard Pascual
On Fri, Jun 3, 2011 at 2:23 PM, Dennis Joachimsthaler den...@efjot.de wrote:
 This grants the ability for any content provider to use an explicit
 Content-Disposition: inline HTTP header to effectively block
 download links from arbitrary sources.

 True. Is it still so that some browsers ignore the filename part
 of a content-disposition if an inline disposition is used?

Ok, I have never even thought about using the filename argument with
an explicit inline disposition. When I am in control of the headers,
I find it easier to fix the filename with 301/302 redirects, which
also gives me the bonus of some control over how the result should be
cached. In short, I think that responding with a 2xx code _and_
attempting to change what's essentially part of the URI through other
means is a contradiction, and thus a mistake in the best case, or an
attempt to fool the browser into doing something it shouldn't do in the
worst case.
Because of that, I'm ok with whatever way the browser decides to
handle the contradiction. You can read my position about
error-handling on my earlier post some minutes ago.

 Personally, on the case I'm most concerned about (data: URIs used
 for Save log and similar functionalities), there is never a true
 disposition header; so my use cases do not push towards any of the
 options. What I have just written is what I feel is the most
 reasonable approach (the provider of a resource should have some
 control over it above an arbitrary third party).

 Data URIs would very well benefit from this attribute, in my opinion.

 This would also cater to the canvas lovers. Downloading something
 drawn on a canvas instantly? No problem! <a href="data:"
 disposition="attachment" filename="canvas.png">Download me!</a>

Yep, these are the cases I am actually concerned about. But on these
scenarios there is no HTTP header involved, so it doesn't matter (for
them) what takes precedence.
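For reference, what authors have to do today for that canvas case is
roughly this (a sketch assuming a canvas element with id "c" somewhere on
the page; without any disposition hint the browser will typically just
navigate to the image instead of offering to save it):

```html
<a id="save" href="#">Download me!</a>
<script>
  document.getElementById("save").onclick = function () {
    // Build the data: URI at click time, from the canvas contents.
    var canvas = document.getElementById("c");
    this.href = canvas.toDataURL("image/png");
  };
</script>
```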

 This is still one thing that has to be settled though.

 a) How do we call the attribute?

Is there any reason to _not_ call it 'content-disposition'?
Ok, there is one: verbosity. But, personally, I have no issue with
some verbosity if it helps make things blatantly explicit.
So many years of browser vendors reverse-engineering the error
handling in competing products have convinced me that being explicit
is a good thing.

 b) Do we include the filename part directly into the attribute
   or do we create a SECOND attribute just for this?

 People have been posting several formats now. But I don't think we
 actually have *agreed* upon one of those.

What's wrong with using the same format as HTTP? I am not too strongly
attached to that format, but I see no point in making things different
from what we already have. As a minor advantage, implementors can
reuse (or copy-paste) a few lines of parsing code instead of
writing them again, since they already parse the header when they get
it in an HTTP response.
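To illustrate how little parsing is actually involved, here is a rough
sketch of a parser for the header's `type; key=value` shape (simplified:
it ignores the RFC 2231/5987 extended `filename*=` syntax that real
implementations also have to handle):

```javascript
// Simplified Content-Disposition parser:
// 'attachment; filename="x.pdf"'
//   -> { type: "attachment", params: { filename: "x.pdf" } }
function parseDisposition(value) {
  var parts = value.split(";");
  var result = { type: parts[0].trim().toLowerCase(), params: {} };
  for (var i = 1; i < parts.length; i++) {
    var eq = parts[i].indexOf("=");
    if (eq === -1) continue;
    var key = parts[i].slice(0, eq).trim().toLowerCase();
    var val = parts[i].slice(eq + 1).trim();
    // Strip optional surrounding quotes from the value.
    if (val.charAt(0) === '"' && val.charAt(val.length - 1) === '"') {
      val = val.slice(1, -1);
    }
    result.params[key] = val;
  }
  return result;
}
```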

Regards,
Eduard Pascual


Re: [whatwg] Content-Disposition property for a tags

2011-06-03 Thread Eduard Pascual
On Fri, Jun 3, 2011 at 3:24 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 6/3/11 9:16 AM, Eduard Pascual wrote:

 Ok, I have never even thought about using the filename argument with
 an explicit inline disposition. When I am in control of the headers,
 I find it easier to fix the filename with 301/302 redirects

 That doesn't work if the data is dynamically generated.

As a matter of fact, it does. It takes some work, and quite a bit of
creativity with ModRewrite or similar tools, but it is perfectly
achievable. The key is to do the redirection _before_ starting to
generate the data, and to keep enough information in the final URI to
recover the parameters once the script actually gets to data
generation.
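As an illustrative sketch (the rule patterns and paths are invented for a
hypothetical report script; real setups vary), the ModRewrite side can be
as small as:

```apache
# Externally redirect the "ugly" query URI to a pretty path...
RewriteEngine On
RewriteCond %{QUERY_STRING} ^quarter=(Q[1-4])(\d{4})$
RewriteRule ^generate_progress_report\.php$ /ProgressReport_%1_%2? [R=301,L]
# ...then internally map the pretty path back to the script, so the
# parameters are recoverable before any data generation starts.
RewriteRule ^ProgressReport_(Q[1-4])_(\d{4})$ generate_progress_report.php?quarter=$1$2 [L]
```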


 In short, I think that responding with a 2xx code _and_ attempting to
 change what's essentially part of the URI through other means is a
 contradiction

 The filename to save the data as is not part of the URI.

 Think a URI like this:

  http://mysite.org/generate_progress_report.php?quarter=Q12010
Wouldn't that default (in the absence of a Content-Disposition) to
"generate_progress_report.php" as the filename? That's what I meant by
"part of the URI".

 When saving, it would be good to use something like Progress report of Q1
 2010 as the filename.  But that's not part of the URI in any sense.
It would, if the author wanted it to be. Turning that URI into
something like "http://mysite.org/ProgressReport_Q1_2010", for example
(that's what I'd probably do in that scenario), is quite simple to
achieve. A literal URI like "http://mysite.org/Progress report of Q1
2010" would take some extra work to get working right, but is still
doable.

After all, if the author cares about having a reasonable filename, why
wouldn't they care about having a descriptive URI? The filename option
on Content-Disposition headers is just a partial solution to a problem
for which a more powerful solution already exists.

 Note that some browsers will do weird parsing of the query params to attempt
 to extract a useful filename.  That seems strictly worse than just using
 Content-Disposition.
Not on my sites :P My URIs are a useful filename by themselves.

 and thus a mistake on the best case, or some attempt to
 fool the browser into doing something it shouldn't do on the worst
 case.

 I strongly disagree.  I think browsers that use the Content-Disposition
 filename for attachment but not inline are just buggy and should be
 fixed.
Ok, maybe my wording there was too harsh, but for all the
situations I can think of where a filename argument would make sense I
can achieve a better result through URI beautification. I still think
it's a mistake to try to fix a filename but not fix the URI. The
"attempt to fool the browser" part was more about evil sites serving
files with names like hotnudepic.jpg.exe (I have seen real sites in
the past doing things like that, and even worse).
In any case, note that my comment was about what *authors* should do.
Browsers will attempt to do whatever is good for the users, and I'm ok
with that.

 Of course it sounds like your position is that they should not use the
 filename for attachment either... (in which case you disagree not only
 with me, but with most of the web).
Actually, my position is more like I don't care what the browser does
with this because I have no need to use it. Honestly, I hadn't looked
into the filename option of that header until the discussions about
adding this feature to links and/or data: URIs started. data: URIs, by
their very own nature, are not suited for beautification. And even if
this feature gets implemented, the filename part doesn't concern me
too much, since it's just a mere convenience, and the user always has
the final say on what the file name will be (even if a browser didn't
allow changing that, the user could rename the file afterwards).

Regards,
Eduard Pascual


Re: [whatwg] Content-Disposition property for a tags

2011-06-03 Thread Eduard Pascual
On Fri, Jun 3, 2011 at 5:24 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 6/3/11 10:39 AM, Eduard Pascual wrote:

  http://mysite.org/generate_progress_report.php?quarter=Q12010

 Wouldn't that default (in the absence of a Content-disposition) to
 generate_progress_report.php as the filename?

 Depends on the browser.  But yes.  And that's a crappy filename for the
 Q12010 report!
Well, that's a point we agree on: it's indeed a crappy filename. IMO,
this is a direct consequence of a crappy URI.

 Is it now?  You have to do a redirect on the server side, increase latency
 for the user, etc.  For what purpose, given that you just want to specify
 the filename and there is already a mechanism for that?
The better filename is just the smallest of the benefits provided by
the beautification. A semantic URI is an additional aid to navigation
and a noticeable boost to search engine visibility.

 After all, if the author cares about having a reasonable filename, why
 wouldn't they care about having a descriptive URI?

 Because the URI is generated based on a form the user fills out, and no one
 ever sees the actual URI?
For a typical snippet of client-side form validation, one or two extra
lines of JS can beautify in advance for a GET form. In the case of
POST, I would always use the PRG pattern, so the redirection comes
for free. For a GET form with JS disabled, the actual redirection
happens, but this is now a fallback case that should only be triggered
in a minority of scenarios.
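Those "one or two extra lines" amount to building the pretty path from the
form's values before navigating. A sketch with invented names (the path
scheme is whatever the server's rewrite rules expect):

```javascript
// Sketch: map a GET form's values to a "beautified" path, so that
// submitting navigates straight to the pretty URI with no redirect.
function prettyReportPath(fields) {
  // e.g. { quarter: "Q1", year: "2010" } -> "/ProgressReport_Q1_2010"
  return "/ProgressReport_" + encodeURIComponent(fields.quarter) +
         "_" + encodeURIComponent(fields.year);
}
```

Hooked up as something like `form.onsubmit = function () { location.href =
prettyReportPath(readFields(this)); return false; };` (readFields being
another made-up helper that collects the form's values).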

I'm not sure what you mean by "no one ever sees the actual URI": I
work on a daily basis with half a dozen different browsers, and they
all display the URI wherever I navigate. In the case of FF, it won't
even let me open a window without a location bar unless the user
explicitly enables that through the about:config options. So I'd
rather say that most users actually see the URI.

Another question could be whether they _care_ about the URI. On one of
the sites I have worked on, we implemented a beautification system for
SEO purposes: the flow of 10-12 daily mails asking for help with the
site's structure dropped to 10-12 monthly (the navigation was not
perfect, but it wasn't horrible: it just was a rather complex site), a
few users mailed us thanking us for the change, and the
visitor-to-customer conversion ratio for the site doubled within a week.
From that and some other cases (not so extreme, but in a similar
direction) I have reached the conclusion that users are more likely to
pay attention to the URI if it looks simple and clean.

 better by what metric?
By the amount of things it achieves: besides setting the filename
(which I consider only a minor benefit), it improves navigation and
helps SEO (see comments above).

 Actually, my position is more like I don't care what the browser does
 with this because I have no need to use it.

 That's great, and I'm happy you're willing to impose costs on your users so
 you don't have to use it.  But others may wish to make different tradeoffs
 here.
Honestly, if this were coming from someone else, I'd take it as
trolling. But coming from you, I know that's extremely unlikely, so
I'll assume that there has been a misunderstanding at some point,
because that last statement is already taking things too far from
their context. So, please, let me summarize the whole thing, in a
(hopefully) clear way:
1) Most of my sites use some URI beautification techniques to aid both
users' and spiders' navigation (with a significant effort to minimize
the impact on the users).
2) Because of (1), I haven't had any need to ever use the filename
argument on a Content-Disposition header: my beautified URIs already
serve as good enough filenames.
3) Because of (2), I do not hold a strong opinion about how that
argument should be handled on the many different scenarios.

Please accept my apologies if my earlier posts gave a different
idea; the three points above are what I have been trying to express. I
think my English is rather good, but it is not my native language and I
may fail to express my views and/or ideas from time to time.

I wouldn't ever have mentioned this if Dennis Joachimsthaler hadn't asked
about it in his reply to my initial post in this discussion, since I
don't think saying that I stay neutral on something contributes much
to the discussion. I just stepped into the thread to share my
view about how to handle conflicts between HTTP headers and parameters
given in the markup; and this has turned into a nearly pointless side
discussion that doesn't contribute to the main topic. Feel free to
contact me privately or (if you think the discussion will be of
interest to other people here) to branch into a new thread if you want
to go on; but I'd prefer not to derail this thread any further.

Regards,
Eduard Pascual


Re: [whatwg] Why is @scoped required for style as flow content?

2011-03-28 Thread Eduard Pascual
On Mon, Mar 28, 2011 at 7:23 AM, Jukka K. Korpela jkorp...@cs.tut.fi wrote:

 Boris Zbarsky wrote:

  If you can only affect _some_ parts of the body you should in fact
  be using <style scoped>, no?


 No, because the parts might appear around the body so that you cannot use
 any wrapper that contains them all. It would be awkward to use copies of the
 same stylesheet in different <style scoped> elements.


Yes, it would be awkward. There are lots of things on the web that are
awkward, due to compatibility, interoperability, and implementation cost
concerns.
If you look closer at the scenario you describe (having control over sparse
parts of the body, but not the full document), we don't really want to
enable un-scoped stylesheets there: they could easily interfere with (up to
completely screwing up) the parts of the document you are not supposed to
have control over.

So, yes, @scoped is a suboptimal approach (from an authoring perspective)
in the scenario you described, but at least it works. Allowing un-scoped
stylesheets would, in addition to the performance impact, add too much
potential for screwing things up.

If you have some idea that:
1) Better addresses this use-case, and
2) would have a reasonable implementation cost
then, by all means, bring it forward.



 By the way, there are undoubtedly cases where you would want to use the
 same scoped stylesheet for different elements, e.g. for different
 blockquote elements quoting from the same source. Yet another reason for
 dropping style scoped in favor of an attribute referring to an external
 stylesheet.

You can use:
<style scoped>
@import url(whatever);
</style>
And this even enables you to add some specific styles (be they embedded or
from another external sheet) on a per-use basis if the need arises. For
example, a long dissertation relying on many quotes may benefit from adding
stronger visual highlights on the em elements within the last quote to
reinforce the final conclusion.
Thus there is no reason to drop the feature in favor of something that only
covers a subset of the cases.
The only real benefit of your proposal would be to save typing a few extra
characters. With all other things being equal, less typing would be slightly
better, and hence a good tie-breaking factor. But other things are not equal
between style scoped and your suggestion: covering a strictly wider range of
use cases carries a lot more weight than saving a few keystrokes.




  The use case for unscoped style outside head is if you have full
 control over the body but no control over head.


 It is _a_ use case.


Yep, it's just one use case. However, as of now it is *the only* use case
brought forward in the discussion that is justified and not already
addressed by style scoped.
If you know of some other case, please feel free to bring it forth, but try
to provide a justification (a reasoning of why it needs to be addressed by
HTML5) and a rationale on how current solutions fail to address it. The case
you have given may be justified, but it is solved with style scoped, as
explained above.


Regards,
Eduard Pascual


Re: [whatwg] required attribute in label

2010-08-21 Thread Eduard Pascual
On Sat, Aug 21, 2010 at 6:18 PM, Brenton Strine wha...@gmail.com wrote:
 <label class="required">

 and

 <input id="name1" type="text" required><span>&nbsp;</span>

 are effective, but then again this would be too:

 ...</label>*

 It just seems a shame that we have this neat attribute that indicates
 required controls, but we can't actually use it to change the
 presentation adding additional code.

Presentation issues should be addressed by CSS, not by HTML.
Actually, Diego's suggestion:
label + input[required] + span:after { content: " * "; }
seems to be the right approach here (with current CSS selectors).
I'm not considering IE's issue with attribute selectors because your
original proposal (label[required]) would encounter the same problems.

What sense would it make to mark a *label* as required? @required on
label is semantically wrong, and HTML should not compromise
semantics merely for presentation purposes.

On a side note, keep in mind that there have been several proposals on
the CSS lists for "reversed" selectors (i.e., selecting elements based
on what they contain rather than what contains them). So hopefully we
might have something like label:has(+ input[required]):after {
content: " *"; } in the future.

Just my thoughts.

Regards,
Eduard Pascual


Re: [whatwg] Content-Disposition property for a tags

2010-08-05 Thread Eduard Pascual
On Mon, Aug 2, 2010 at 7:37 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 8/2/10 1:15 PM, Aryeh Gregor wrote:

 If you don't agree that this use-case is worth adding the feature for,
 do you think that:
 3) Something else?

 For the use case you describe, it might just make more sense for browsers
 to support Content-Disposition on data: URIs directly somehow...  Maybe.

I'd definitely love that! :P
It would solve my use-case [1], and similar other cases.

Trying to sum things up:
Some applications use "save" buttons. While this may seem redundant with
the browser's ability to save the target of a link when the user
explicitly asks for it, an in-page "save" button is still a clear
call-to-action.
Furthermore, some web apps attempt to mimic the look-and-feel of
comparable desktop apps. There is a vast amount of software that
performs some kind of "saving"; and to implement it on the web we
authors need some way to, at least, hint the browser that a resource
is being navigated as part of some in-page "save" UI element. The
way this is normally achieved is by round-tripping to the server,
uploading (normally POSTing) the data for the server to send it back
with the Content-Disposition header. This is a workaround that
presents many issues:
- It prevents the application from working offline, even if everything
else in the app is purely HTML+CSS+JS-based.
- It may cause significant delays when the volume of the data is large.
- It forces the data to take a trip around the world, creating a
security/privacy vulnerability that could be easily avoided with
client-based downloads (unless the server uses encryption, of course,
but encryption is expensive both CPU- and money-wise).
- It attaches the network connection's unreliability to a feature that
would otherwise "just work".
- It may cause the bill to go through the roof when using the app from a
pay-per-volume connection (such as those provided by many mobile phone
operators).

Of course, some people are worried about this being abused, and it's a
legitimate worry. Though it may seem otherwise, there is no need to _force_
a download. All we (authors) need is a way to _hint_ the browser that
a download is assumed/expected from the app's side. It's entirely up
to the browser (probably taking user preferences into account) how to
deal with such a hint.
Furthermore, I'm strongly convinced that in-page "save" buttons should
behave as closely as possible to "Content-Disposition: attachment":
anything else would go against users' expectations (example: if a user
normally gets a Save/Open/Cancel dialog when accessing a zip file,
then any in-page feature to save a zip file should present the same
dialog).


Just my thoughts.

Regards,
Eduard Pascual


Re: [whatwg] Content-Disposition property for a tags

2010-07-30 Thread Eduard Pascual
On Fri, Jul 30, 2010 at 12:36 PM, Dennis Joachimsthaler den...@efjot.de wrote:
 Hello,

 I have an idea which would be very cool for HTML5.

 Having a Content-Disposition property on a tags which does the same as
 the HTTP Header.
 For example changing the file name of the file to be downloaded or rather
 have a image
 file download rather than it being shown in the browser directly.

 This would avoid constructs such as <a href="hi">Download</a> ("Right click
 and click 'Save target as...' to download").

 It would also eliminate the need to handle such requests with a server
 side scripting engine to change the headers dynamically to enforce
 downloading
 of the content.

 HTML5 can already act on the http headers with the rel=noreferer property.

 Please give me your opinion about this

 Thank you in advance!


I was just about to post asking for something along those lines, but you
were faster than me ^^,

Let me complement the proposal with a use case:
http://stackoverflow.com/questions/3358209/triggering-a-file-download-without-any-server-request

Also, a few comments on the potential/alleged security implications
(based on the above use case):
  - This is already doable via plug-ins such as Flash. So barring this
feature may only improve security (if it improves at all) when Flash
and the like are disabled. This may even get counter-productive if
this feature becomes the one that convinces a user to enable Flash or
some other relatively unsafe plug-in.
  - Sites can already trigger downloads by round-tripping to the
server, so this is more a sort of convenience: the script could send
the data through a POST request and the server send it back with the
Content-Disposition header. In general, this feature would help
remove some delay and improve reliability, but in some cases it'd
be a huge enabler:
  * Offline apps can't rely on a server. However, origin-based
restrictions on downloads could still be applied based on where
the app came from.
  * Many free and 'newbie' hosts allow hosting HTML and some media
files (such as images and sound), but offer no server-side scripting
at all. Hence apps on such hosts currently have no standard mechanism
to trigger a "save" dialog or any equivalent UI feature. In some cases,
these hosts may even forbid Flash, thus making the task entirely
impossible.
  - The user will normally have the ultimate say on whether the file
is saved or not, and where on the file system it will be saved.
Actually, this is the reason why I wanted something like this instead
of using the Storage APIs: I want to give the user full control.

To top things off, note that saving a file to disk is never more
dangerous than letting the UA perform the default action for
a resource: in the worst-case scenario, there would be at least one
further step of confirmation or user action before the saved data has
a chance to do anything (assuming it's in some executable form;
otherwise it will never do anything).
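As an illustration of the convenience being discussed (this is my own sketch, not part of either proposal): a page can already build a data: URL for generated content entirely client-side; the missing piece debated here is a standard way to make the browser *save* it rather than navigate to it. The URL-building half might look like this:

```javascript
// Hypothetical helper: wrap generated text in a data: URL so it could be
// offered for download client-side, with no server round-trip involved.
// Uses btoa() in browsers; Buffer keeps the sketch runnable elsewhere.
function makeDataUrl(text, mime) {
  const b64 = (typeof btoa === "function")
    ? btoa(text)
    : Buffer.from(text, "binary").toString("base64");
  return "data:" + mime + ";base64," + b64;
}
```

A page could then point a link at the returned URL; what this thread asks for is a way to force the "save" behaviour on activation.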

Dennis' proposal allows for more than my use case (actually, my intent
was to propose adding something to data: URLs rather than to <a>
elements, which may point anywhere); but I don't see any reason why
a link with such an attribute would be more dangerous than one without it.

Regards,
Eduard Pascual


Re: [whatwg] Simple Links

2010-07-27 Thread Eduard Pascual
On Tue, Mar 30, 2010 at 11:44 PM, Christoph Päper
christoph.pae...@crissov.de wrote:
 If you think about various syntax variants of wiki systems they’ve got one 
 thing in common that makes them preferable to direct HTML input: easy links! 
 (Local ones at least, whatever that means.) The best known example is 
 probably double square brackets as in Mediawiki, the engine that powers the 
 Wikimediaverse. A link to another article on the same wiki is as simple as 
 “[[Foo]]”, where HTML would have needed “<a href=Foo>Foo</a>”.

 I wonder whether HTML could and should provide some sort of similar 
 shortening, i.e. “<a href>Foo</a>” or even, just maybe, “<a>Foo</a>”. The UA 
 would append the string content, properly encoded, to the base Web address as 
 the hyperlink’s target, thus behave as if it had encountered “<a 
 href=Foo>Foo</a>”.

 I prefer the binary toggle role of the ‘href’ attribute, although it doesn’t 
 work well in the XML serialisation, because it provides better compatibility 
 with existing content and when I see or write “<a>Bar</a>” I rather think of 
 the origin of that element name, ‘anchor’. So I expect it to be equivalent to 
 “<a id>Bar</a>” and “<a name>Bar</a>”, which would be shortcuts for “<a 
 id=Bar>Bar</a>”.

 PS: Square brackets aren’t that simple actually, because on many keyboard 
 layouts they’re not easy to input and might not be found on keytops at all.
 PPS: The serialisation difference is not that important, because XML, unlike 
 HTML, isn’t intended to be written by hand anyway.

Can't this be handled with CSS generated content? I'm not sure I'm
getting the syntax right, but something like this:

a[href]:empty { content: attr(href); }

would pull the href from every empty <a> that has such an attribute (so
it doesn't mess with anchor-only elements) and render it as the
content of the element. Note that href attributes are resolved
relative to whatever base your document defines (this is slightly
better than just appending, since it makes '../whatever'-style URLs
work the right way), so you don't need to (rather, should not) use
absolute URLs for such links.
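To illustrate that resolution point (a generic illustration of standard URL resolution, with example.com as a placeholder): relative href values resolve against the base, so '../whatever' climbs a directory level rather than being blindly appended:

```javascript
// Relative URL resolution against a base, as a UA would do for an href.
// The WHATWG URL API exposes the same behaviour scripts can check against.
const resolved = new URL("../whatever", "http://example.com/a/b/").href;
// The ".." removes the trailing "b/" segment from the base path.
```
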

It seems that you are mainly concerned about avoiding duplication
between the href and the content of the element. Your proposal
puts the text in the content, while the CSS-based solution would put
it in the href; but both state it only once.

Regards,
Eduard Pascual


Re: [whatwg] Canvas and Image problems

2010-05-23 Thread Eduard Pascual
On Sun, May 23, 2010 at 12:16 PM, Schalk Neethling
sch...@ossreleasefeed.com wrote:
 Hi everyone,



 Having a really weird problem that I would like some input on. I am trying
 to draw an image, as well as video, onto canvas. I use the simple code
 below:



 $(document).ready(function() {
   var image = $("#cat").get(0);
   var cv = $("#img_container").get(0);
   var ctx = cv.getContext('2d');
   ctx.drawImage(image, 0, 0);
 });



 When I load up the HTML page in Chrome absolutely nothing happens and I see
 no errors in the JavaScript console. When running it in Firefox 3.6.3 I get
 the following error:



 uncaught exception: [Exception... Component returned failure code:
 0x80040111 (NS_ERROR_NOT_AVAILABLE)
 [nsIDOMCanvasRenderingContext2D.drawImage] nsresult: 0x80040111
 (NS_ERROR_NOT_AVAILABLE) location: JS frame ::
 file:///C:/thelab/HTML5Canvas/scripts/canvas_img.js :: anonymous :: line 9
 data: no]



 For the life of me I cannot see what I am doing wrong with the above. I have
 done console logs to ensure that the code gets the image as well as the
 canvas on the relevant lines and it definitely does. Anything I am
 overlooking?

IIRC, jQuery's document-ready event fires as soon as the
HTML itself is loaded and the DOM is built, unlike the window's onload,
which also waits for all external resources (CSS, JS, images, etc.) to
be available. This is a good way to speed up code that doesn't depend
on external content but, in your case, it seems you are trying to use
the src resource of an img element before it's available. I'm no
jQuery expert, so I can't tell for sure; but you can check it by
running your code from the window's onload event instead of the
document-ready event.
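A hedged sketch of that suggestion (the function name is mine, and `img` here is anything image-like; in a real page you'd pass the actual <img> element): defer drawing until the image has actually loaded, handling the already-cached case via the `complete` flag:

```javascript
// Run `callback` once an image is available: immediately if it already
// finished loading (e.g. served from cache), otherwise on its load event.
function whenImageReady(img, callback) {
  if (img.complete) {
    callback();
  } else {
    img.addEventListener("load", callback, { once: true });
  }
}
```

With something like this, the drawImage call could go inside the callback instead of running straight from the document-ready handler.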

Regards,
Eduard Pascual


Re: [whatwg] The real issue with HTML5's sectioning model

2010-05-01 Thread Eduard Pascual
On Sat, May 1, 2010 at 3:56 AM, Anne van Kesteren ann...@opera.com wrote:
 On Sat, 01 May 2010 10:42:03 +0900, James Robinson jam...@google.com
 wrote:

 Is this sort of reply really necessary?  I have not been following the
 surrounding discussion, but this email showed up as a new thread in my
 mail client.  Based on this tone, I now have no desire to catch up on the
 rest of the discussion.

 My bad. It's just that we've been over this discussion like a gazillion
 times
Really? Then I must have missed something.
Please keep in mind that this was *not* another HTML5 vs. XHTML2
thread, but a discussion of the issues triggered by HTML5's approach
(styling, compatibility, room for future evolution, spec bloat).
The (partial) comparison with XHTML2 was only intended to help
highlight the root of these issues.
Anyway, I'm working on a formal proposal that will describe the
problems in terms of use cases and examples, and my suggested solution
in the form of (mostly) spec-ready text, accompanied by rationales
for each proposed change to the current draft, but *without* any
mention of XHTML2 (it could properly serve as an example to discuss
some concepts in the abstract, but it has no place in a more formal
proposal).

 and it would be nice that if we were to have it again at least we
 started with the correct facts.
Then let's start from *correct* facts. My original statement
about XHTML2's sectioning model was indeed a simplification, but its
goal was to highlight the best aspects of that approach, not
to degrade this into yet another XHTML2 vs. HTML5 discussion. On the
other hand, your statement:

 Which are that XHTML2 had exactly the same
 design as HTML5 has now
is a blatant lie. The key difference between XHTML2's and HTML5's
approaches to sectioning (and the one my suggestion was based on) is
that XHTML2 defines *a single element* to mark up sections
(unsurprisingly named <section>), while HTML5 defines several
(<section>, <nav>, <article>, and so on). This is the root of the
issues I'm trying to get addressed; and it seems a lot saner to solve
them all at the root than to define some kind of esoteric workaround
separately for each.

 but did not solve the problem of mixing h1-h6 with
 section/h. HTML5 did.
I applaud HTML5's overall approach to how mixed implicit and explicit
sectioning is handled. It's an amazing example of the good work
done in HTML5 in the field of compatibility. But the issues here are
about how explicit sectioning is implemented. It is a new feature that
introduces unneeded issues instead of leveraging existing
technologies (including pre-5 X/HTML, CSS, and so on). In other words,
while HTML5's approach to heading handling is good, sound, and
elegant, its approach to sectioning is bloated, ugly, and
revolutionary rather than evolutionary.

On a side note, I'd like to highlight a detail that might have gone
unnoticed. There are, IMO, two kinds (or rather, perspectives) of
backwards compatibility:
The one you'll probably be more familiar with is the UA perspective:
new UAs must be able to handle old content: if a user's favorite sites
break on a new browser, the user won't want to use that browser.
Symmetrically, there is a content-authoring perspective: new content
needs to handle old UAs: if a new site or document breaks on the
user's favorite browser, the user won't use that content, even if they
want to.
For HTML5 (or any other upgrade to a web-related technology) to
succeed, both perspectives need to be observed. *This* is where XHTML2
most blatantly failed: it tried to sidestep the UA perspective through
the assumption of mode-switching, and the features for the authoring
perspective (keeping elements that it was obsoleting, such as h1-6,
img, a, etc.) didn't solve anything at all (due to that same
mode-switching assumption).
HTML5 has done a great job of addressing the UA perspective of
compatibility; but there are some aspects, like the sectioning model,
that trigger serious issues from the authoring perspective.

Regards,
Eduard Pascual


[whatwg] The real issue with HTML5's sectioning model (was: Headings and sections, role of H2-H6 and Should default styles for h1-h6 match the outlining algorithm?)

2010-04-30 Thread Eduard Pascual
 with a more formal and elaborate proposal
around this idea; but I think it'd be good for contributors to share
their opinions in the meanwhile ;-)

Regards,
Eduard Pascual


Re: [whatwg] The real issue with HTML5's sectioning model (was: Headings and sections, role of H2-H6 and Should default styles for h1-h6 match the outlining algorithm?)

2010-04-30 Thread Eduard Pascual
On Fri, Apr 30, 2010 at 10:02 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
 On Fri, Apr 30, 2010 at 11:57 AM, Eduard Pascual herenva...@gmail.com wrote:
 Actually, if we try to implement the outlining algorithm in
 the form of selectors that match each level of headings we have:
 In the case of the h1-only approach, selecting each level of
 heading requires a list of selectors whose length grows as a power of
 the number of sectioning elements (k^(n-1) selectors for heading level
 n). In other words: the top-level heading can be selected with h1, but
 the next level would require section h1, nav h1, aside h1, article h1,
 ...; for the third level we go nuts (the example is limited to section
 and article elements; including all of them would yield a list of far
 too many selectors): section section h1, section article h1, article
 section h1, article article h1. A fourth level would translate into 64
 selectors or more (already quite insane to author), and if we ever
 reach the fifth and further levels, we'll be dealing with hundreds or
 thousands of selectors. And this is still an over-simplification:
 sure, some combinations will never happen, but if we also have to
 select sub-headings inside an hgroup, things get pretty funny.

 Not true.  The CSSWG has a proposal (with unresolved issues, however)
 for an :any() pseudoclass which works perfectly in this case:

 :any(section, nav, aside, article) h1 { foo }
 :any(section, nav, aside, article) :any(section, nav, aside, article) h1 { 
 bar }
 etc.

 (In other words, the <x> selector in the spec is just the :any() selector 
 here.)
There is a subtle, but quite important, difference: something like
section section h1 works on all major browsers today (with a minor JS
hack for IE to deal with unknown elements). :any() currently works
nowhere (ok, we have :-moz-any() courtesy of David Baron, and I really
appreciate it; but it's simply useless for this scenario).

Even if a hypothetical CSS3.1 Selectors spec got fast-tracked and
recommended by tomorrow, a key problem would remain: authors
have no sane means to get HTML5-style headings decently styled (I don't
need perfect, but some degree of *graceful* degradation is generally
needed) on current browsers.
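The combinatorial growth described above can be made concrete with a small generator (purely illustrative; the function name and tag lists are mine): the number of descendant selectors needed for heading level n is k^(n-1), with k sectioning elements.

```javascript
// Enumerate the descendant selectors needed to match a heading at a given
// level under the h1-only model: every ordering of sectioning ancestors.
function headingSelectors(sectioningTags, level) {
  let prefixes = [""];
  for (let i = 1; i < level; i++) {
    prefixes = prefixes.flatMap(p => sectioningTags.map(t => p + t + " "));
  }
  return prefixes.map(p => p + "h1");
}
```

With the four elements named in the quoted example, level 4 already needs 4^3 = 64 selectors.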



 This yields several advantages:
 1) The styling issue improves drastically: any pre-HTML5 will
 understand this (IE would require a bit of javascript anyway) out of
 the box:
 h1 { styling for top-level }
 section h1 { styling for second-level }
 section section h1 { styling for third-level }
 and so on, for as many levels as you need.

 Fixing HTML because CSS is too difficult probably isn't the right
 solution.  Address the problem at the source - CSS needs to be able to
 handle this case well.  Luckily there are further similar use-cases
 that make solving this problem fairly attractive.
Actually, I *am* trying to fix the problem at the source. This is not
about fixing HTML because CSS is too difficult; it's about fixing a
design issue within HTML itself. The problems from the CSS perspective
are only a surface symptom of a deeper issue.


 2) All of a sudden, something like <section kind="aside nav"><h1>See
 also</h1> some indirectly related links...</section> becomes
 possible, plus easy to style, and works happily with the outlining
 algorithm.

 What's the benefit of marking something as both an aside and nav?
Quoting the HTML5 spec:
The nav element represents a section of a page that links to other
pages or to parts within the page: a section with navigation links.
The aside element represents a section of a page that consists of
content that is tangentially related to the content around the aside
element, and which could be considered separate from that content.
Putting that together, a section that links to other pages *and*
consists of content that is tangentially related to the content around
it *must* be *both* a nav and an aside. The most blatant example
(which I thought was clear enough from the example) are See also
sections found on many sites (see a couple of examples on [1] and
[2]). Currently, the closest thing we may get is
<nav><aside>...</aside></nav> or <aside><nav>...</nav></aside>. This
raises several concerns: which one should be the outer, and which the
inner section? How would that interact with the sectioning algorithm?
(ie: would a heading inside there be taken as one level lower than the
expected one?).


 3) Future needs will become easier to solve on future versions of the
 specification, and with significantly smaller costs: for example,
 let's assume a new sectioning element such as attachment becomes a
 widespread need (it would already make sense on sites like web-mail
 services, discussion boards, bug-trackers, and some others...). So a
 new crouton on the soup, which would be treated quite like a generic
 div by pre-HTML6 (or 7, or whatever) browsers. Now, with the
 section+attribute approach, we'd get something like section
 kind=attachment: that'd would still work

Re: [whatwg] Changing punctuation value of input element in telephone state

2010-04-06 Thread Eduard Pascual
On Wed, Apr 7, 2010 at 1:10 AM, Ian Hickson i...@hixie.ch wrote:
 If there was a true standard, then the spec would refer to that, but as
 you say, it's very varied in practice.

There is quite a standard, even if an implicit one: (almost) no punctuation.
Have you ever dialed a "(" or a "-" when phoning someone? In essence,
phone numbers are sequences of digits, and punctuation is only used as
a convenience to enhance readability.
There are two exceptions to this: + and letters are used as
replacement for numbers (the plus sign for the international call
code, the letters for specific digits to enable creating branded
numbers easier to memorize).

Maybe I'm being too hasty with this idea but, since machines don't
need the same readability aids humans do, I'd suggest that
the UA simply remove everything other than + and alphanumeric
characters (and obviously add nothing) when sending the field. I
don't care too much what UAs do when rendering the entered
value (and it's probably fine if the browser adds some
formatting based on its own or the system's regional settings). The
server is only left with replacing letters and +, plus any
application-specific usage of the value itself (which, by then, will
be a string of digits, presumably representing the sequence of digits
to dial).
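A minimal sketch of that suggested normalization (my own illustration, not spec text; as later messages in this thread point out, a complete rule would also need to consider pause characters, "*" and "#"):

```javascript
// Strip readability punctuation from a phone value, keeping only "+" and
// alphanumeric characters, per the suggestion above.
function normalizePhone(raw) {
  return raw.replace(/[^+0-9A-Za-z]/g, "");
}
```
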

Other than that, the only safe alternative would be to leave the
values untouched, so the page can say what it wants, the user honor
it, and the server get it as expected; or gracefully degrade to an
error message that actually points to the user's error (rather than an
error introduced by a UA trying to outsmart the user).

For sites that are ready to sanitize values from a specific locale
but which are accessed through a UA with different settings (e.g. on a
public computer while abroad), the UA adding locale-specific stuff to a
phone value is very likely to render whole forms unusable.

Regards,
Eduard Pascual


Re: [whatwg] Changing punctuation value of input element in telephone state

2010-04-06 Thread Eduard Pascual
On Wed, Apr 7, 2010 at 1:31 AM, Ashley Sheridan
a...@ashleysheridan.co.uk wrote:

 On Wed, 2010-04-07 at 01:28 +0200, Eduard Pascual wrote:

 On Wed, Apr 7, 2010 at 1:10 AM, Ian Hickson i...@hixie.ch wrote:
  If there was a true standard, then the spec would refer to that, but as
  you say, it's very varied in practice.

 There is quite a standard, even if an implicit one: (almost) no punctuation.
 Have you ever dialed a ( or a - when phoning someone? In essence,
 phone numbers are sequences of digits, and punctuation is only used as
 a convenience to enhance readability.
 There are two exceptions to this: + and letters are used as
 replacement for numbers (the plus sign for the international call
 code, the letters for specific digits to enable creating branded
 numbers easier to memorize).

 Maybe I'm being too hasty with this idea but, since machines don't
 really need the same readability aids as humans do, I'd suggest that
 the UA simply removes everything other than + and alphanumeric
 characters (and obviously adds nothing) when sending the field. I
 don't care too much about what they do upon rendering the introduced
 value (and I think it's probably fine if the browser adds some
 formatting based on its own or the system's regional settings). The
 server is only left with replacing letters and +; plus any
 application-specific usage of the value itself (which, by then, will
 be a string of digits; assumedly representing the sequence of digits
 to dial).

 Other than that, the only safe alternative would be to leave the
 values untouched, so the page can say what it wants, the user honor
 it, and the server get it as expected; or gracefully degrade to an
 error message that actually points to the user error (rather than an
 error introduced by an UA trying to be out-smart the user).

 For sites that are ready to sanitize values from a specific locale;
 but which are accessed through an UA with different settings (ie: on a
 public place while abroad), the UA adding locale-specific stuff to a
 phone value is very likely to render whole forms unusables.

 Regards,
 Eduard Pascual

 Phone numbers can also validly include pause characters too. I remember back 
 in the day saving such a number to quickly dial into my voicemail, rather 
 than having to dial in, wait for the automated voice, press a digit, wait for 
 some more robot speaking, press another number, etc.

 Also, not entirely sure, but would asterisks (*) and hashes (#) be included 
 too? I was just going on what digits exist on a standard phone keypad.

So it seems that I was indeed too hasty with my proposal. Let me put
aside the specifics and focus on the idea:

- Issue: there is no explicit standard for representing phone numbers
that works on a world-wide scale.
- Fact: there is an implicit standard that defines what a phone does
(whom it calls) depending on which sequence of keys is pressed.
- Idea: given the need for a standard, and the lack of an explicit
one, use the implicit one that can actually work. I was hasty and
provided an incomplete definition of that implicit standard; but I'm
quite convinced a correct definition could be produced with a bit of
research and effort.

On Wed, Apr 7, 2010 at 1:48 AM, Davis Peixoto davis.peix...@gmail.com wrote:

 Other than that, the only safe alternative would be to leave the
 values untouched, so the page can say what it wants, the user honor
 it, and the server get it as expected; or gracefully degrade to an
 error message that actually points to the user error (rather than an
 error introduced by an UA trying to be out-smart the user).

 This goes in the opposite direction from the initial idea of creating an
 interface that intends to avoid type mismatches, unfortunately.
Actually, it doesn't. It just goes nowhere from the starting point
(pre-HTML5, phone numbers are entered as raw text, which provides no
phone-specific interface).
HTML5's current approach, however, does go in that opposite direction,
since allowing UAs to add whatever they wish is nowhere near avoiding
the mismatches, and it's even guaranteed to trigger them when the UA
fails to second-guess what the page expects. The most obvious scenario
I could come up with is that of a user on a foreign computer (I
quite know what I'm speaking about here, I have struggled so many
times with Irish keyboards to get a 'ñ' through ^^' ); for example:
the user may be attempting to enter a Spanish number on a Spanish
site, while the UA tries to enforce Irish phone-number
conventions: this may break even if the site itself is up for the
battle and attempts to remove all the extraneous characters, since it
could well make sense to prepend a '0' to an Irish number (not very
advisable for a UA to do, but possible, since an extra '0' may need to
be dialed in some situations). Also, things will definitely
break if the site expects Spanish formatting (just some spaces, and
perhaps a pair of brackets) but the UA enforces a different

Re: [whatwg] idea for .zhtml format #html5 #web

2010-04-02 Thread Eduard Pascual
On Fri, Apr 2, 2010 at 6:25 PM, Doug Schepers d...@schepers.cc wrote:
 I don't think it's defined anywhere, but a browser could choose to save
 bundled resources as a self-contained Widget (File  Save as Widget...),
 which would be a great authoring solution for Widgets.

Isn't that the same thing, in essence, as MS did with IE? IIRC, IE had
a choice, in its save dialog, to "Save full page", which packed the
HTML page plus all the CSS, JS, image, and other dependencies within a
.mht (called "meta-HTML") file (which, of course, only IE would be
able to open afterwards).

The fact is that this feature has been removed from more recent
versions of IE (not sure if it was in IE6 or 7). It would be
interesting to know why MS decided that such a feature should be
removed.

At first glance, the only potential issue I see (both with IE's old
MHT format and with any possible zhtml) is XSS: when a downloaded file
is loaded from the local filesystem into the browser, what is its
domain? It may need some same-directory files, but if it tries to
fetch something from its original location that has not been
downloaded, it would be loading content from a domain other than the
local system.
This issue should be addressed if something like that is to be usable:
if we face a choice between broken pages and a security flaw, the idea
will already be a failure. However, I have no idea how to approach this.

Regards,
Eduard Pascual


Re: [whatwg] idea for .zhtml format #html5 #web

2010-04-02 Thread Eduard Pascual
On Fri, Apr 2, 2010 at 9:41 PM, Thomas Broyer t.bro...@gmail.com wrote:
 On Fri, Apr 2, 2010 at 8:39 PM, Eduard Pascual herenva...@gmail.com wrote:

 On Fri, Apr 2, 2010 at 6:25 PM, Doug Schepers d...@schepers.cc wrote:
  I don't think it's defined anywhere, but a browser could choose to save
  bundled resources as a self-contained Widget (File  Save as Widget...),
  which would be a great authoring solution for Widgets.

 Isn't that the same thing, in essence, as MS did with IE? IIRC, IE had
 an choice, on its save dialog, to Save full page, which packed the
 html page + all the CSS, JS, image, and other dependencies within a
 .mht (called meta-HTML) file (which, of course, only IE would be
 able to open afterwards).

 MHTML stands for MIME-encapsulated HTML and is an IETF RFC:
 http://www.rfc-editor.org/rfc/rfc2557.txt

I can't remember for sure where I saw the meta HTML name, but I'm
sure I had seen it somewhere.
Anyway, thanks for the correction.


 The fact is that this feature has been removed from the more recent
 versions of IE (not sure if it was from IE6 or 7). It would be
 interesting to know why MS decided why such a feature should be
 removed.

 Selecting Page - Save as... on IE8 brings the save file dialog with
 the type defaulting to Web Archive, single file (*.mht)

My apologies: vague memory + not testing = stupid post from me ^^;
After a bit of research to refresh my memory, I've found that what MS
removed from IE was the offline favorites feature, and MHT was
portrayed as a better alternative. I just got a 404 Brain Not Found
and mixed things up.

So feel free to simply ignore my previous e-mail, since it was
entirely based on a mistaken assumption.

Regards
Eduard Pascual


Re: [whatwg] Drag-and-drop feedback

2010-01-23 Thread Eduard Pascual
On Sat, Jan 23, 2010 at 11:30 AM, Ian Hickson i...@hixie.ch wrote:
 On Mon, 17 Aug 2009, Jian Li wrote:
 [...]
 The issue that I'm having is that if the DataTransfer object says that
 it has Files, I have no way to determine what type those files are. (In
 this case, I only want to accept image files.) I understand that the
 DataTransfer shouldn't have the content of the files for security
 reasons, but it would be helpful if it did contain the file names and/or
 MIME types.

 I could provide a second attribute with the types of the files, would that
 work? I suppose if we did this, we should remove the Files fake type.
 That might not be a bad idea in general, it's kind of a hack. I'm not sure
 how I feel about having multiple different ways of representing the data
 in a DataTransfer object... It would give a clean precedent for adding
 other features, though, like promises, which some people have requested.
Would it be possible to provide a list of "drag items" (for lack of a
better name) instead of, or in addition to, the current info provided
by the DataTransfer object?
More formally: add a property of type DragItemList that might be
called dragItems. The DragItem type (the building brick for
DragItemList) could provide the data and meta-data for each object
being dragged (that'd be the getData, clearData, and setData methods,
a readonly string "type", and a readonly boolean "isFile").
In principle, that list could actually replace the DataTransfer
object. In order to keep compatibility with existing content, either
of these approaches could work:
1) Actually replace the DataTransfer object with the DragItemList, but
make the DragItemList type implement DataTransfer's interface.
2) Instead of replacing it, add the list as a new
field/property/attribute/whatever-you-call-it on the DataTransfer
object.
This approach would solve the issues of dragging multiple files and
of potential drop targets needing to check the metadata (such as the
type); and, in addition, it would adapt seamlessly if a mechanism for
performing multi-object drags appears in the future. Keeping in mind
how much modern software can already handle multiple selections,
that seems like a quite near-term feature.
For example, in some word processors it's possible to hold the Ctrl
key while dragging the mouse over the text to select additional text
fragments: when such a split selection is dragged or sent to the
clipboard, the text fragments are typically concatenated; but if the
drop or paste target is any kind of list, it would be reasonable (and
in some cases a significant upgrade in usefulness) to import the
fragments as separate entries for the list. As long as the drag (or
cut/copy) source provides some metadata to identify the boundaries of
each fragment, this functionality enhancement would be quite easy to
implement (currently, it is impossible on most contexts).

Anyway, that's just an idea.
BTW, all the type and member names used in the proposal are arbitrary
and can be changed for anything more convenient. The only purpose of
those names is to describe what they represent.
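To make the shape of the idea concrete, here is a rough sketch of what one of those hypothetical DragItem objects could look like (every name here is a placeholder from the proposal above, not a real API):

```javascript
// A possible DragItem: per-object data plus readonly metadata, mirroring
// DataTransfer's getData/setData/clearData on a single dragged object.
function createDragItem(type, isFile) {
  let data = null;
  return {
    get type() { return type; },      // readonly string: the item's type
    get isFile() { return isFile; },  // readonly boolean: is this a file?
    setData(value) { data = value; },
    getData() { return data; },
    clearData() { data = null; },
  };
}
```

A DragItemList would then just be an ordered collection of these, letting drop targets inspect each item's type before deciding whether to accept the drop.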

Regards,
Eduard Pascual


Re: [whatwg] A call for tighter validation standards

2009-10-27 Thread Eduard Pascual
On Fri, Oct 23, 2009 at 4:36 AM, Ian Hickson i...@hixie.ch wrote:
 I think less rigid styles are good, and are what made the Web the success
 that it is. Authors are welcome to use validators that complain about this
 kind of markup, but we should enforce this style on everyone. Some people
 (e.g. me) like being able to omit tags.

I hope you meant "we shouldn't" rather than "we should".

Regards,
Eduard Pascual


Re: [whatwg] framesets

2009-10-11 Thread Eduard Pascual
 be achievable with a bit of
scripting.
C) Insane divs + CSS + Scripting: This essentially meets all
requirements (maybe excluding 4, depending on what the actual
requirement is); although at a high development cost. (This would be
the MSDN style approach.)
D) HTML4 Frameset + HTML5 documents for frame contents: this meets
requirements 1, 2, 3, and 5 out of the box, it's an almost trivial
upgrade from any HTML4 web-app that takes a similar approach, and is
relatively easy to implement.

 Seems to me many developers would regard B & C as hacks. At the very least
 they'd be more awkward than framesets. I think I've already touched on why
 the MSDN approach is undesirable. You are not the first to claim that A
 (tables & iframes) can meet this spec. I'm not an HTML expert (which
 apparently frustrates you) but if A meets the spec, I ought to have been
 able to find a working instance in the past six years or so, don't you
 think? Or do you claim it's entirely fortuitous that the only publicly well
 known solutions for this spec use framesets?
C) is definitely a hack. Even worse, it's a Microsoft hack.
Or rather than a hack, it's a case of re-inventing the wheel: it takes a
purely structural element with no specific semantics or behavior
(<div>) and uses other tools to define the appearance (CSS) and
behavior (scripting). So, of course, it's a heavy and probably
overkill application. Basically, it's quite like migrating MS's
Document Explorer from the Windows platform to the web platform: it
works on the web, but it doesn't reuse much of what web
technologies already provide.
B), on the other hand, is *not* a hack at all: using CSS this way for
these cases is exactly what those CSS properties were made for. Of
course, the CSS approach fails on the resizing task, and that's where a
scripting-based hack can come into play; but CSS by itself is a
legitimate solution for a wide range of use cases, rather than a hack.
It seems that your use case is not within this range, so the CSS
solution doesn't work for you; and that's legitimate: this is why we
are having this discussion, and my hope is that it will lead to the
best possible solution.

 Re D), reasons for opposing removing framesets from HTML5 include: (i)
 removal of a feature from a standard is often followed by further
 degradation of support for it, which would undermine the functionality I
 want HTML to support since framesets are commonly used, for good reasons, to
 meet this use case, (ii) there could be HTML5 features one would want to
 combine with framesets.

 Apropos the strange claims made here that removal of framesets should make
 no difference to present or future frameset use: if removal makes no
 difference whatever, there is no rationale for removing them.
I haven't claimed that the removal should make no difference. What I
have stated is that there is no such removal: Transitional/Strict
doctypes from HTML4 are being updated and combined into a single spec,
thus the version number is increased, leading to the HTML5 term.
The Frameset document type stays untouched. The only reason there
isn't an HTML5 Frameset type is because it would be 100% identical
to the HTML4 Frameset one. There is no point in having two
versions of a spec that are actually the same.
Finally, you don't need to worry too much about how the HTML5 spec
will impact browser support for deprecated and non-updated
features: it simply won't. It's a matter of supply and demand: as long
as there are frameset or font tags out there, browsers will be
able to handle them, regardless of what the spec may say (currently,
the spec just tries to make the different UAs agree on how to handle
these and worse things).

Your point about combining HTML5 features with framesets may be
interesting; but what prevents you from doing so now? The document
shown in each frame can perfectly well be an HTML5 document. Using HTML5
features inside noframes would be quite inadvisable: if a UA can't
handle frames, would you really expect it to handle stuff like video
or gauge?
That covers frame content, and no-frame content. Is there any other
place where you might want to use these features? If so, please,
elaborate (describe which features you'd need, where you'd need to be
able to use them, and why).

Regards,
Eduard Pascual


Re: [whatwg] framesets

2009-10-11 Thread Eduard Pascual
 removed from HTML. It's just not being updated
because there was nothing to add to it, so it stays the same.
Taking all this together, here comes again: the question I have
already asked several times, and you haven't answered: What are you
asking for? What did you want to achieve when you started this thread?
What is your goal in this discussion? Until you provide a clear answer
to that, we are basically stuck.
I have been trying to extract useful stuff (such as use-cases and
requirements) from your mails; but as long as there isn't a proposal to
discuss, there is no point in discussing. Because of this, I'll
abstain from posting in this discussion until there is something to
actually discuss.

Regards,
Eduard Pascual

[1] http://www.google.com/search?q=HTML+resizable+table

PS: BTW, my name is Eduard, not Edouard. I'd appreciate it if you could
avoid mistyping it. Thanks.


Re: [whatwg] framesets

2009-10-10 Thread Eduard Pascual
 to update a version of a standard to a new
version that includes no changes at all.
Keep in mind that Frameset and content are two different document
types, with different content models (for example, a frameset page has
no body). HTML5 currently replaces/updates the Transitional/Strict
document types; it doesn't deal with Frameset because nothing is being
changed on it, so HTML4 Frameset stays valid as the newest (despite
its age) standard for frameset master pages.

Regards,
Eduard Pascual


Re: [whatwg] framesets

2009-10-10 Thread Eduard Pascual
On Sun, Oct 11, 2009 at 12:15 AM, tali garsiel t_gars...@hotmail.com wrote:
 I agree with Peter that this type of document navigation is an extremely 
 common use case.
 I think the use case includes navigation that loads only parts of the page, 
 leaving the other parts static.

 Almost all web applications I know have tabs on top and a tree on the left.

 The idea is not repainting the tabs/tree etc on every click to keep a good 
 user experience.

 In the old days frames were used, then tables + iframes.

 Then iframes were considered bad design and I think most applications use 
 divs + css + Ajax.
Peter's case is quite legitimate; I just asked him to detail it
properly so it can be properly discussed.

Your reference to the old days, however, scares me a bit: in the
old days there weren't many web applications, only web pages. If you
are trying to argue for sites (made of pages or documents) that use
framesets to keep the menus and headers static, I must strongly
oppose that: while neglecting the back button, bookmarking and
deep-linking features inherent to the web may be acceptable for some
specific applications, it is absolutely a bad idea for classic
pages: these elements do not take that much to load (and, if they use
images, the browser will have them cached anyway), and all other
advantages of using frames in this scenario (such as maintaining a
single instance of the menu) are far better handled through
server-side solutions such as includes. Being unable to deep-link to
(or bookmark) a specific page on a site is a serious drawback; hence
any site that considers breaking such capabilities must accurately
weigh the cost of the drawback against the value of the benefits it
expects to achieve.
If you have specific use-cases for Peter's proposal, you're welcome to
bring them forward... oh, wait... what's Peter's proposal?

 I think its important that the W3C specification should provide a good 
 solution for this common case.
You know, solutions don't come out of thin air: they are proposed by
contributors to the lists like you, or me, or anyone else here. If you
know the cases you want to have addressed, then I strongly encourage
you to suggest a solution. The same advice I gave Peter applies: make
sure to describe the use-cases you are addressing, detailing the
requirements (and justifying them when they aren't 100% obvious),
showing where current solutions fail, and showing that your proposal
meets all the requirements.

Honestly, I don't think that this process is ideal, but it's the way
things are normally done here, and I can't think of a better process
(otherwise I'd bring it forward). Some discussion on the abstract can
be useful, but at the end of the day, only solutions that address
use-cases and meet requirements make it into the spec.

Finally, let me insist on a small detail that seems to go unnoticed
regardless of how many times it has been repeated in the last hours:
HTML5 updates/replaces the Transitional and Strict doctypes of HTML4.
HTML4 Frameset isn't being updated as of now, and it stays valid,
legitimate, and current. Using an HTML4 Frameset master page that shows
HTML5 documents in its frames is valid and conformant. In addition,
this seems to address all the use-cases that have been put forward (at
least, it addresses all the ones HTML4 Frameset + HTML4 inside frames
could address).
What is being asked for? What do you (and/or Peter) want to be changed
on the spec, and why? If nobody answers this, there is very little
hope this discussion will go anywhere.

Regards,
Eduard Pascual


Re: [whatwg] framesets

2009-10-09 Thread Eduard Pascual
On Fri, Oct 9, 2009 at 10:17 PM, Peter Brawley p...@artfulsoftware.com wrote:
So why *are*
frames banned, if you can easily replace them with iframes and get the
exact same lousy behavior?  Because iframes also have less evil uses,
and frames don't, I guess?

 Designation of reasonable uses as evil is authoritarian nonsense.

 PB

Both frameset and iframe are a source of several issues.
Everything that can be achieved with frameset can be done through
table+iframe.
What'd be the point of keeping two sources of issues when one can be
enough to cover all use-cases?
Since iframe can handle all the use-cases for frameset, and some
that frameset just can't, it's obvious which one to drop.
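As a concrete illustration of that claim, here is a minimal sketch of the
table+iframe equivalent of a classic two-frame frameset; the file names and
sizes are invented for illustration:

```html
<!-- Roughly equivalent to:
     <frameset cols="200,*">
       <frame src="nav.html" name="nav">
       <frame src="content.html" name="content">
     </frameset> -->
<table style="width: 100%; height: 100%; border-collapse: collapse">
  <tr>
    <td style="width: 200px; vertical-align: top">
      <iframe src="nav.html" name="nav"
              style="width: 100%; height: 100%; border: 0"></iframe>
    </td>
    <td style="vertical-align: top">
      <iframe src="content.html" name="content"
              style="width: 100%; height: 100%; border: 0"></iframe>
    </td>
  </tr>
</table>
```

Links inside nav.html can keep using target="content", just as they would
with frames.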

Regards,
Eduard Pascual


Re: [whatwg] framesets

2009-10-09 Thread Eduard Pascual
On Fri, Oct 9, 2009 at 10:39 PM, Peter Brawley p...@artfulsoftware.com wrote:
 Eduard,

Everything that can be achieved with frameset can be done through
table+iframe.

 If that's so, someone ought to be able to point at some examples.
I just got something even better [1]: it doesn't even use iframe: it
gets away with divs, CSS, and scripting.
table+iframe would make things simpler, since the page would only
need to add a bit of script to handle the resizing.
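That "bit of script" is mostly a clamping calculation plus browser-specific
mouse wiring. The helper below is a hypothetical sketch (the function name
and parameters are invented); only the pure arithmetic is shown, since the
DOM event handling varies by browser:

```javascript
// Hypothetical helper: given the current width of the left pane, the
// horizontal distance the splitter was dragged, and the total width
// available, return the new left-pane width, clamped so that neither
// pane can collapse below a minimum size.
function resizeLeftPane(currentWidth, dragDelta, totalWidth, minPane) {
  var next = currentWidth + dragDelta;
  if (next < minPane) next = minPane;                 // left pane too small
  if (next > totalWidth - minPane) {
    next = totalWidth - minPane;                      // right pane too small
  }
  return next;
}
```

In a real page this would be driven by mousedown/mousemove handlers on a
splitter element, writing the result back to the panes' style.width.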

 Supposing that someone can produce examples, the argument for removing
 frames from HTML5 becomes: frameset has been in HTML till now, but is being
 removed because we do not like it. If you insist on such use cases,
 re-architect them. That's a misuse of standards.
That's not the argument. It would be something more like "frameset has
raised several issues and doesn't solve anything that can't be solved
in a different way. If you insist on using frameset, just forget about
HTML5 validation."
After all, the only purpose of validation is to have a hint of
potential interoperability and accessibility issues. If you are using
frameset, you should already be aware of the issues you might face,
so you don't need a validator to point them out.

What'd be the point of keeping two sources of issues when one can be
enough to cover all use-cases?

 If your premiss is correct, backward compatibility.
Backward compatibility is not handled at the language level, but at
the application level. frameset will not stop working: browsers will
keep handling it as they have until now. Leaving frameset out of the
spec only affects validation.
Furthermore, AFAIK it's entirely valid to have an HTML4 Frameset
doctype for a .html file, and refer from the frameset to files that
use the HTML5 doctype and new features. Since frameset has always
been intended to be used only from a frameset page or master page,
what purpose would allowing them on a non-frameset doctype serve?
Furthermore, since HTML5 adds no features at all to frameset, would
it make any sense to define a HTML5 Frameset doctype?

In summary: you can still use frameset as much as you want, and
trigger either quirks or standards mode on the client side. In
addition, if you manage your files and doctypes properly, you can even
have everything validating. What exactly are you asking for?

Regards,
Eduard Pascual

[1]: 
http://msdn.microsoft.com/en-us/library/system.windows.forms.htmldocument.aspx


Re: [whatwg] framesets

2009-10-09 Thread Eduard Pascual
I have been following this discussion for many hours and it's getting
tiring. In addition, it doesn't seem to be leading anywhere; so please
let me suggest some ideas that may, at least, help it advance:

First: you have been asking for counter-examples (and you have been
given the MSDN one), but you haven't provided any specific example to
begin with. That would make a better starting point.

Second: you reject sound arguments claiming that the use case
requires otherwise, but what's your use-case? Without clearly
specifying the use case, your arguments based on it aren't going to be
taken too seriously: they are basically based on nothing, until you
properly define the case they are supposedly based on. To specify a
use-case in a way that will be taken seriously by the editor and other
contributors:
- Clearly define the problem you are trying to solve. When doing so,
describe the problem in a way that is independent to the solution you
are proposing. For example, if you look in the archives, you'll see
that the use case for RDFa was often defined as "including RDF triples
on webpages": this didn't work because that's not the problem; RDF is
the solution. In the same way, if you describe the need as "allowing
frameset in HTML5" you won't get anywhere: that's your own
suggestion to address the need; but what is the need?
Make sure that your use-case addresses real-world problems.
- Clearly specify and justify the requirements. Keep in mind that
"because the client wants it" is not enough justification; you'd need
an answer to "why does the client want it?". For example, if
someone hired me to build a web app that takes control of the users'
computer, I might come to the lists and ask for a feature to implement
that, based on the point that the client wants it: that'd be
pointless and would go nowhere.

Third: once you have a well-defined use-case (or ideally, several
use-cases), you have a chance to get your proposal being taken
seriously. To do so, specify what you are proposing:
- State why currently existing solutions don't meet the requirements.
As far as this has gone, the only requirements that apparently aren't
met by iframe and other alternatives are the "break deep-linking"
and "break navigation" ones. Besides the fact that you still need
to justify such requirements, that's actually easy to achieve with a
bit of ugly scripting.
- Describe your solution. State clearly how/why it meets each of the
requirements. Also, try to describe the specific changes required on
the spec.

If you manage to do that, your proposal will definitely be taken
much more seriously.

Regards,
Eduard Pascual


Re: [whatwg] Fakepath revisited

2009-09-14 Thread Eduard Pascual
On Mon, Sep 14, 2009 at 3:12 AM, Ian Hickson i...@hixie.ch wrote:
 Here are some bug reports that I believe are caused by this issue:

   
 http://forums.linksysbycisco.com/linksys/board/message?board.id=Wireless_Routers&message.id=135649
   http://www.dslreports.com/forum/r21297706-Re-Tweak-Test-Need-help-tweaking
   
 http://www.rx8club.com/showpost.php?s=42aad353530dfa4add91a1f2a67b2978&p=2822806&postcount=3269
This is factual data, thank you.

   http://blogs.msdn.com/ie/archive/2009/03/20/rtm-platform-changes.aspx
This, in contrast, is not. It's an interesting read, and a good
rationale, but it doesn't show any real-world example of what is
causing the issue. From that page (as from all of Microsoft's posts,
mails, etc. I have seen around the topic), the only proof they give
that the problem actually exists is their own word. I won't say
whether Microsoft's word on something should be trusted (there are
enough MS evangelists and MS haters out there to guarantee
perpetual disagreement on this); I just want to point out that
such word is not the same as factual data.

 I would love more data.
Although I'd also love it, I don't really need it. The few links you
posted, quoted above, are enough to show that this is a real issue.
That's all I was asking. Thanks again.

Please, let me clarify that my examples of filenames containing
backslashes were purely theoretical. I have no factual data to back
them, and I don't really need it. Without actual examples of the need
for fakepath, they stood in the same position as the arguments
in favor of fakepath. Their only goal was to encourage bringing
specific data about the need for fakepath, and that has been achieved.

Now, maybe stepping onto a side topic, I'd like to bring back a separate
request: I think, if fakepath is to be included in the spec, content
authors shouldn't be left to face the risks on their own. Considering that
pre-HTML5 browsers (like IE 6 and 7 or FF2) are going to stay around
for a while, approaches like substr(12) or any other means of just
trimming "C:\fakepath\" just won't work. A lastIndexOf("\\") would
break on any browser that doesn't include the path at all (that's what
fakepath is addressing, after all), as well as on any browser that runs
on a Unix-like system and provides the full path (not sure if there is any
current browser in this category).
Is there any way we content authors can reliably retrieve a filename
from scripts, other than special-casing several versions of each
browser in existence?
More specifically, would .files[0] work on those pre-HTML5 browsers?
If it does, this is a non-issue. However, if it doesn't, I'd like to
suggest adding an algorithm to the spec to deal with this task. Just
like the spec offers algorithms for browsers to deal with
non-compliant existing content, in cases like this it would be equally
valuable to have algorithms for content to deal with non-compliant
existing browsers.
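As a sketch of what such an algorithm might look like (this is my own
illustration, not anything from the spec; the function name is invented),
the cases discussed in this thread could be handled like this:

```javascript
// Recover a usable file name from the value of an <input type=file>,
// whether the browser reports a bare name, a full Windows-style path,
// or the HTML5 "C:\fakepath\" prefix. Illustrative only: as discussed
// in this thread, the fakepath case is inherently ambiguous (a Unix
// file literally named "up\load.txt" cannot be told apart from a
// Windows path ending in "load.txt").
function fileNameFromValue(value) {
  var prefix = "C:\\fakepath\\";
  if (value.indexOf(prefix) === 0) {
    // HTML5 browser: everything after the fixed 12-character prefix
    // is the file name (which may itself contain backslashes).
    return value.substring(prefix.length);
  }
  // Legacy browser reporting a full path: keep the last component.
  var cut = Math.max(value.lastIndexOf("\\"), value.lastIndexOf("/"));
  return cut === -1 ? value : value.substring(cut + 1);
}
```

In browsers that implement the File API, input.files[0].name sidesteps the
string parsing entirely, though older browsers generally do not expose
.files at all.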

I am OK with working around content's brokenness when fixing the
content is not an option; but that shouldn't be done at the expense of
good content and careful authors.

Regards,
Eduard Pascual


Re: [whatwg] Fakepath revisited

2009-09-13 Thread Eduard Pascual
On Sun, Sep 13, 2009 at 11:50 PM, Ian Hickson i...@hixie.ch wrote:
 There are basically only two arguments:

     Aesthetics: Having the fake path is ugly and poor language design.

  Compatibility: Having it increases compatibility with deployed content.

 In HTML5's development, compatibility is a stronger argument than
 aesthetics. Therefore the path stays.

I already posted an example showing how fakepath can easily break
compatibility with well-written sites. I explicitly asked for
counter-arguments to it and none have been provided, but the argument
doesn't seem to be taken into consideration at all.
Hence I'm wondering how compatibility arguments are weighed here:
is compatibility with an unknown-size niche of clearly badly-designed
sites more important than with potentially thousands of well-designed
ones?

Opera has claimed that they are keeping fakepath just because
Microsoft claims some sites need it. Microsoft hasn't revealed the
list of such broken sites, nor even a figure for how many sites are
involved. Yet this group is willing to bend a standard based only
on the claims of a single vendor... not to mention that this is
precisely the vendor that has shown the least commitment over the last
decade in the area of web standards implementation.

In my opinion, this is exactly the same as spitting in the face of
everyone who has ever put an effort into building an interoperable
website.
If there is a real compatibility issue, a claim that is currently held
only by Microsoft, bring some factual data about it. Otherwise,
including fakepath is equivalent to stupidifying the language
(probably at the expense of breaking currently good sites), based
only on a single vendor stating its unwillingness to implement the
non-stupid alternative.

Regards,
Eduard Pascual


Re: [whatwg] Fakepath revisited

2009-09-07 Thread Eduard Pascual
Oops... the following was meant to be a "reply to all" but I hit
"reply" instead; so here goes a copy for the list:

On Mon, Sep 7, 2009 at 8:43 PM, Eduard Pascual herenva...@gmail.com wrote:
 On Mon, Sep 7, 2009 at 5:10 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:

 On Mon, Sep 7, 2009 at 3:24 AM, Alex Henrie alexhenri...@gmail.com wrote:
  Expecting developers to hack out a substring at all will only lead to
  more bad designs. For example, Linux and Mac OS allow filenames to
  contain backslashes. So if the filename was up\load.txt then
  foo.value would be C:\fakepath\up\load.txt which could easily be
  mistaken for load.txt. Fakepath will actually encourage developers
  to fall into this trap, which just goes to show that it is not a
  perfect solution.

 Well, no, not really.  If they're hacking out a substring, they'll
 *hack out a substring*, since the prefix is of a known fixed length.
 Just lop off the first 12 characters, and whatever's left is your
 filename.  Splitting on \ is just plain silly in this instance.

 ~TJ

 That wouldn't work.
 There is a significant number of browsers that include the full path
 in the value. So web authors would need to do *a lot* of guesswork
 if they are to hack a substring from such a value. They have to figure
 out whether they'll be getting a plain file name, a file name with a
 path, or a fakepath, and treat each case separately. If a site tries
 to just substring(12), it will break on any non-HTML5 browser (except
 in the corner case where the value contains a full path and it is
 exactly 12 characters long). If they try to split on "\", they will
 break when a file on a non-Windows system contains that character.

 To put things in a more obvious shape, imagine the following scenarios:

 A file named "up\load.txt" (on a non-Windows OS) is given from an
 HTML5 browser. We get value="C:\fakepath\up\load.txt".
 A file named "load.txt", located at "C:\fakepath\up\", from a
 browser that includes the full path. We get
 value="C:\fakepath\up\load.txt".
 Two different file-names end up yielding the same value string. So,
 basically, it is impossible to reliably recover the name of the file
 from only the value string: there will be ambiguous cases. While the
 examples above may seem corner cases, they are just intended to show
 off the ambiguity issue.

 Ok, so some (horribly-designed) sites break without fakepath. Since
 the HTML5 spec likes so much to include explicit algorithms, is there
 any reliable algorithm that web authors can use to recover the actual
 filename? (Without having to assume that everybody switches
 immediately to HTML5-compliant browsers, of course.) If there isn't,
 then every other site (including all the decently-designed ones) that
 needs/uses the filename would break. What would be the point of keeping
 compatibility with some bad sites if it would break many good sites?

 Regards,
 Eduard Pascual



Re: [whatwg] Fakepath revisited

2009-09-03 Thread Eduard Pascual
On Thu, Sep 3, 2009 at 9:29 AM, Smylers smyl...@stripey.com wrote:
 If one major browser implements non-standard behaviour for compatibility
 with existing content, it would have an advantage with users over other
 browsers -- those other browsers would likely want to implement it, to
 avoid losing market share.  But browsers unilaterally implementing
 'extra compatibility' means other browsers wanting to be similarly
 compatible have to reverse engineer the first browser -- a
 time-consuming and brittle process, which in practice often leads to
 some edge cases where the behaviour is not the same.

Currently, all major browsers implement non-standard behaviour for
compatibility with existing content, to quote your own words: the
quirks modes, to use the more common term. Does the HTML5 spec define
what should trigger each quirks mode and how browsers should behave
when in those modes? If it did, then the fakepath could be treated
as just another quirk, and that would be the end of the problem. But
as far as I know the spec doesn't dig that deep (correct me if I'm
wrong), so there are going to be thousands or (quite likely) millions
of sites that will break unless browsers special-case them as they
currently do with their quirks modes. The barrier to entry for new
browser vendors is already huge in this area, and stupidifying
<input type=file> with fakepaths will *not* solve this.


Re: [whatwg] Web Storage: apparent contradiction in spec

2009-09-03 Thread Eduard Pascual
On Fri, Sep 4, 2009 at 12:33 AM, Ian Hickson i...@hixie.ch wrote:
 Flash's privacy problem can be removed by uninstalling Flash. They're not
 a license to add more privacy problems to the platform.

And permanent storage's potential privacy problems could also be
removed by having separate "Delete cookies from this site" and "Delete
cookies and all other data from this site" buttons side by side.

On Fri, Sep 4, 2009 at 12:36 AM, Peter Kasting pkast...@google.com wrote:
 All the spec really says is that UAs should note to their users that sites
 can keep data about them in Local Storage.  This isn't grounds for a
 tantrum.

The problem is not what the spec says, or is supposed to say, but how
it says it. This long discussion seems to revolve mostly around the
point that the current wording is too likely to be misinterpreted as
"the delete cookies button (or any equivalent UI element) should
also delete all other data stored by the site".

Now, this question is mostly addressed to Ian, as the only one who can
provide a 100% accurate answer: based on the spec text's intent, would
the idea of having separate "Delete cookies" and "Delete everything"
buttons side by side be conformant?
If it would (and a lot of people here seem to be arguing that it
would), then this discussion could easily be put to an end by
tweaking the wording in a way that makes this clearer.

Extra: more mails are flowing into my inbox as I'm writing this.
Trying to reply to all of them would get me into an endless loop, but
the discussion seems to be more about what the spec text is supposed
to mean rather than what it should say. So, please, Ian, whatever it
is supposed to say, could you word it in a way that is clear enough
for everybody?
Discussion about what the spec should say is generally productive but,
IMO, discussion about what it's supposed to mean is
counter-productive: the efforts put by all participants into this
debate would be more useful on other aspects of the language.

Regards,
Eduard Pascual


Re: [whatwg] Web Storage: apparent contradiction in spec

2009-09-03 Thread Eduard Pascual
On Fri, Sep 4, 2009 at 1:23 AM, Ian Hickson i...@hixie.ch wrote:
 On Fri, 4 Sep 2009, Eduard Pascual wrote:
 If it would (and a lot of people here seem to be arguing that it would),
 then this discussion could easily be put to an end by tweaking the
 wording in a way that makes this clearer.

 The wording was tweaked very recently. Is it still not clear?

Sorry, I wasn't aware of the update. The text is definitely clearer
now, even if I don't really like it too much.
I wonder if the people discussing what the spec says or how it
says it are aware of the new wording.


Re: [whatwg] Microdata

2009-08-22 Thread Eduard Pascual
 about the use of widespread prefixes (like
foaf or dc) being used for something weird, or the use of
different prefixes for these vocabularies, I wouldn't mind adding some
default prefix mappings for CRDF to address this.
  2) The "Follow your nose" topic is a bit more complex: IMO, an RDF
application that needs to successfully FYN to work is insane; OTOH, an
application that works just fine, but can FYN and provide *additional*
information *when available*, is quite a good thing. This is an
implementation design issue, and CRDF can't do much here: the best
thing I can think of is to state that applications should *attempt* to
FYN, and use or show to the user the extra info *when successful*, but
must still be able to use a document's information when FYN fails.
Actually, CRDF is already moving in such direction: the type inference
rules (still under construction) try to infer the properties' types
from the vocabularies first, but take the basic type of the value if
that fails (unless, of course, the CRDF code uses explicit typing).

- Entirely new: this is a minor disadvantage against RDFa and
Microformats, but not when compared to Microdata. It is a disadvantage
because for already existing formats there are already existing
implementations, and it is minor because it shouldn't be hard for
browsers (and some other forms of UAs that also handle CSS) to
implement it by reusing most of their CSS-related code. Again, I'd
appreciate some vendors' feedback on this.

That's what I can think of now. Of course, CRDF has some issues: it's
still work in progress, and it lacks implementations and
implementation feedback, but it also provides significant advantages
that, IMO, far outweigh the drawbacks.

Regards,
Eduard Pascual

[1] http://crdf.dragon-tech.org/crdf.pdf
[2] (multiple links: the threads got split for some reason, and the
archives also break threads at month boundaries):
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-May/019733.html
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-May/019857.html
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-June/020284.html
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-July/020877.html


Re: [whatwg] HTML 5 Script Tag

2009-08-06 Thread Eduard Pascual
On Thu, Aug 6, 2009 at 10:36 PM, Cready, James jcre...@rtcrm.com wrote:
 Is there any good reason why a script tag with the src attribute specified
 can't be self-closing?

 I understand the need for a script tag to have an open and close tag when
 you're executing javascript inline:

 <script type="text/javascript">
   alert("Huzzah! I got executed just like you thought I would.");
 </script>

 However "best practices" suggest keeping all your scripts external unless
 you absolutely need to use inline scripting. So it seems odd to have an open
 and close tag when you're including an external JS file since it won't
 execute anything inside the tags anyway:

 <script src="js/behaviour.js" type="text/javascript">
   alert("Boo! I won't get executed just because the src attribute is
 specified. Weak sauce.");
 </script>

 I feel like, if you're including an external JS file, the syntax should look
 more like the <link> tag used to include CSS files or even the <img> tag
 with its src attribute. Both are self-closing and for good reason. Is there
 any possibility of including this syntax in the HTML 5 spec or is there some
 reason this can't be done?

 <script src="js/behaviour.js" type="text/javascript" />

The self-closing "/>" syntax is an XML feature. The trailing slash is
allowed in non-X HTML5 for empty elements only for compatibility with
existing XHTML content served as text/html, but it basically means
nothing. Allowing the slash for elements that are known to be empty is
trivial: it only has to be ignored and things work fine.
The script case is a bit more complex: it can have contents, so a
closing </script> is in principle required. While XML offers the
<script ... /> shortcut for code of the form <script ...></script>,
something like <script src="js/behaviour.js" type="text/javascript" />
would actually be parsed as <script src="js/behaviour.js"
type="text/javascript">, so everything after that will be taken as
part of the script element (and hence won't be shown) until a
</script> is found.

I agree that, for empty elements, self-closing syntax is better than
explicit closing. There could be arguments about self-closing vs.
implicit closing; but when a closing tag is required for an
empty element, most people would prefer the self-closing syntax. The
main issue here, IMO, is that making this trailing slash relevant
within script would make things quite messier for the parser, while
in the general case it just gets ignored. This makes life harder for
implementors and spec writers, and opens the door to some potential
bugs. While I think it's good to put authors above implementors and
spec writers, bugs tend to hurt the user, and the user is above
content authors.

In summary: either use <script ... /> with XHTML5, or use
<script>...</script> in (non-X) HTML.
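To make the two options concrete (the file path is illustrative):

```html
<!-- XHTML5, served with an XML MIME type: self-closing is fine. -->
<script src="js/behaviour.js" type="text/javascript" />

<!-- text/html: the explicit end tag is required. -->
<script src="js/behaviour.js" type="text/javascript"></script>
```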

Regards,
Eduard Pascual


Re: [whatwg] HTML 5 Script Tag

2009-08-06 Thread Eduard Pascual
On Fri, Aug 7, 2009 at 1:14 AM, Cready, James jcre...@rtcrm.com wrote:
 You make a great point. But whether or not you use the XML/XHTML syntax
 or the HTML 4 syntax doesn't matter much. Since like I showed in my
 previous example: the instant you specify a src attribute on your opening
 script tag the browser will not execute anything inside the tags.

You are quite missing the point: see this example:
<!DOCTYPE html>
<html>
<head>
<script src="test.js" />
</head>
<body>
<p>Hello World!</p>
</body>
</html>

Unless you serve this as XHTML (with an XML MIME type), nothing will be
shown in the browser.

Let's see why: text/html triggers the browser's tag-soup parser, which
fixes issues on the fly. The first issue it finds is the extraneous
trailing slash on the script, which it just ignores. Then, </head> is
probably taken as literal text (as if you had typed &lt;/head&gt;), because
the head can't be closed until the script is closed. Then that
<body>...</body> will be taken either as a child of the script or
as just text (I don't really know *all* the details of error-handling
case by case); but in any case it is content of a script inside the
head, so there is no chance it will get rendered. Once the parser
encounters the end of the file, it implicitly closes the currently
open tags, adding something like </script></head></html>.
Summarizing, the above sample is taken by any browser as a page with an
external script that includes some junk content, but with no body at
all.

In the general case, the issue is not the trailing slash of the
self-closing "/>" itself. The issue is that the browser will ignore
everything from <script ... /> until it encounters a </script>.


I like the idea of using <link> to reference external scripts, and it
shouldn't be too hard for vendors to implement. However, there is no
chance of changing how browsers handle <script>. Not with so many
millions of pages relying on that behavior. And you can still use
XHTML5 if you want the "/>" to mean something.

Regards,
Eduard Pascual


Re: [whatwg] Make quoted attributes a conformance criterion

2009-07-27 Thread Eduard Pascual
On Mon, Jul 27, 2009 at 2:53 AM, Jonas Sicking jo...@sicking.cc wrote:
 The more I think about it, the more I'm intrigued by Rob Sayre's idea
 of completely removing the definition of what is conforming. Let the
 spec define UA (or HTML consumer) behavior, and let lint tools fight
 out best practices for authoring.

Besides the point Maciej already made, there is another aspect in
favor of good conformance definitions: web evolution.

Some of the issues, like attribute quoting, may be stylistic, but
there are many where there is a clear boundary between what's right
and what's wrong. For example, <font> is clearly wrong; but too many
legacy webpages use it, so browsers need to support it to render all
that content. If we leave conformance out of the spec, and only define
what browsers are supposed to do, we'd be bringing <font> back to the
web, even for new websites, and this would be clearly wrong (we are
not speaking of assistive technologies only: many pages that rely on
<font> end up unreadable even in common browsers).

Someone could argue that this is just a matter of best practice or
style, and hence could be handled by lint tools; but conformance
criteria in the specification have a lot more weight than any lint
tool. While it may be OK to leave the more arguable aspects to these
tools, things that are obviously wrong should be clearly defined as
non-conformant by the spec.

Just my two cents.

Regards,
Eduard Pascual


Re: [whatwg] A Selector-based metadata proposal (was: Annotating structured data that HTML has no semantics for)

2009-07-27 Thread Eduard Pascual
I have put a new version of the CRDF document up [1]. Here is a
summary of the most significant changes:

* Location: with the migration from Google Pages to Google Sites,
the PDF document can no longer be hosted at its former location. I
wanted to keep this proposal independent from my own website; but
needing a reliable location for the document, I have made room for it
on my site's host.
To avoid having to keep track of two online copies of the document, or
leaving an outdated version online, I have removed the document from
the old location.
My apologies for any inconvenience this might cause.

* Inline content: full sheets are now accepted inside the inline
crdf attribute, whatever it ends up being called; so something like
<div crdf='@namespace foo "http://example.com/foo#"; '></div> should be
doable, mimicking RDFa's ability in XML-based languages to declare
namespaces inline with code like
<div xmlns:foo="http://example.com/foo#"></div>. In addition, a
pseudo-algorithm has been defined that determines whether the
content of the attribute is a full sheet or just a set of
declarations.
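The sheet-versus-declarations distinction could be checked with a very simple test. The actual pseudo-algorithm lives in the CRDF PDF; the heuristic below is only my guess at its shape, under the assumption that a full sheet contains at-rules or selector blocks while a bare declaration set is just "property: value;" pairs:

```python
# Hedged sketch: decide whether an inline crdf attribute value is a full
# sheet or just a set of declarations.  The real rule is defined in the
# CRDF draft; this heuristic is an assumption for illustration only.
def is_full_sheet(attr_value):
    stripped = attr_value.strip()
    # At-rules (e.g. @namespace) and brace-delimited selector blocks
    # only appear in full sheets, never in bare declaration sets.
    return stripped.startswith('@') or '{' in stripped

print(is_full_sheet('@namespace foo "http://example.com/foo#";'))  # True
print(is_full_sheet('foo|MainHeading: contents;'))                 # False
```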

* Inline vs. linked metadata: this brief new section attempts to
explain when each approach is more suitable, and why both need to be
supported by CRDF.

* Conformance requirements: this new section describes what a document
must do to be conformant, and what tools would have to do to be
conformant. It should be taken as an informative summary rather than
as a normative definition (especially the part about tools), and is
mostly intended to give a glimpse of what should be expected from a
hypothetical CRDF-aware browser.

* Microformats compatibility: after some research and lots of trial
and error, it has been found that it is not possible to match the
microformat concept of singular properties with CSS3 Selectors. The
document now suggests an extension (just a pseudo-class named
:singular) to handle this. This is a very new addition and feedback
on it would be highly valuable.

[1] http://crdf.dragon-tech.org/crdf.pdf

Regards,
Eduard Pascual


Re: [whatwg] Make quoted attributes a conformance criteria

2009-07-25 Thread Eduard Pascual
On Fri, Jul 24, 2009 at 9:52 PM, Keryx Webwebmas...@keryx.se wrote:
 On 2009-07-23 20:32, Eduard Pascual wrote:

 While I don't consider a hard requirement would be appropriate, there
 is an audience sector this discussion seems to be ignoring: Authoring
 Tools' developers. IMO, it would be highly desirable to have some
 guidelines for these tools to determine when they *should* quote
 attribute values.


 There is one further rub. Code that initially has been made by authoring
 tools have a tendency to wind up in some front end developers lap, to be
 amended and/or fixed manually at a later stage. That is even more a reason
 for a strong recommendation about quotes.

 Furthermore, I doubt that most people on this list did read my blog post I
 included as an URL when starting this discussion.[1]
I can't speak for others, but I did read your post. And still I am
convinced that a hard requirement to quote all values is not the best
solution. There are some values that MUST be quoted, some that SHOULD
be quoted, and even some that SHOULD NOT be quoted. Those that must be
quoted are already covered by the spec, and validators will yield the
relevant error message when encountering such values unquoted. For
those values that *should* be quoted (those that become more readable
when quoted, or those that could lead to errors when they are later
changed if unquoted), a warning from the validator should be enough.
Finally, there are some values that are better unquoted, such as
attributes that can only take a number (there is no risk of errors,
and the quotes would normally hurt readability more than they help
it). Even in the case of @type on <input>, quotes seem quite
overkill: AFAIK, there is no valid value for this attribute that makes
them strictly needed; so there is no risk of the author changing the
value into something that requires quotes and forgetting to add them
(unless, of course, s/he changes it to something invalid, which will
already bring problems of its own). Since <input> elements tend to be
relatively short, and are often given on a single line of source,
adding boilerplate to them for no purpose doesn't seem to be a good idea.

 In that post I talked about a common scenario. One developer works on the
 business logic. It puts out attribute values. Another developer works on the
 presentation logic. He makes templates. Dev 2 omits the quotes and for a
 long time it might work, since the business logic in question only produces
 single word values. Then there might come a change, because dev 1 - or the
 users of the CMS - suddenly starts to produce longer values. Suddenly things
 break, and since nobody touched the presentation logic code, it might not be
 the first place where the developers look for an error.

 And believe me, lots of back end devs are absolutely clueless about front
 end issues! Yes, they might skip validation completely, but at least such a
 rule of thumb can be implemented more easily into their work flow.
Again, once the pages go through a validator, warnings are hints as
good as errors for detecting the source of the problem.

 I also note that no one who has spoken against my suggestion claims to have
 any teaching experience.
Although I didn't mention it, because I didn't think it was relevant,
I have some teaching experience (I won't claim to have worked my whole
life as a teacher; but I have worked as a private tutor for a few
years). Do you really think this:
"Error: Attribute values must always be quoted"
would be more educational than this:
"Warning: Values for attribute X should be quoted, or errors might
arise if the value is later changed"
? And these are just examples of messages.
Of course, if you just tell your students "all the code you provide
must validate", warnings may go unnoticed. However, you may try
something like "all the code you provide must validate; and warnings
must be addressed or be properly reasoned about". IMO, this kind of
detail marks the difference between training code-typing zombies and
training developers capable of solving problems.

In summary: I considered your arguments from the teaching perspective;
but I consider that the difference between errors and warnings has
more didactic value than a totalitarian validator that simply rejects
safe code based on a seemingly arbitrary rule.


 I see 4 effects that my suggestions might have:

 1. Dismiss completely.
Unlikely. In the worst case, it is at least being discussed.

 2. No new wording, but change the code examples.
Better consistency would be appropriate for some of the examples.
However, there are many values there that are better unquoted
(especially numbers).

 3. Add some words about best practice, but do not enforce quotes as a
 conformance criterion.

 4. Go all the way and do just that.
Again, there is a middle ground between these: making validators issue
warnings for potentially unsafe attributes is, IMO, the sanest
approach here.
Adding some comments about the fact that in case of doubt

Re: [whatwg] due consideration

2009-07-24 Thread Eduard Pascual
On Fri, Jul 24, 2009 at 10:04 AM, Maciej Stachowiakm...@apple.com wrote:
 Ian gives more careful consideration and more thorough responses to comments
 than any other specification editor I have seen in action. I've commented on
 many W3C standards and many times I've seen comments raising serious
 technical issues dismissed without explanation, or just ignored. I have
 never seen that with HTML5.

Is that really enough?

<example>
Let's take a long and well-known controversy as an example: Microdata.
It is true that Ian has given the topic very careful consideration,
and a lot of thought; but what is the result? There are already
several existing solutions that HTML5 could have adopted, most
prominently (and most argued for) RDFa, but also EASE, eRDF, and
others. During the discussions, people who have been working on Web
Semantics for *several years* contributed their knowledge, expertise
on the topic, and ideas.

In the end, Ian opted to create an entirely new solution, disregarding
years of previous work on the subject and the significant base of
already existing RDFa authoring and consuming software. But that
solution has a complexity roughly equivalent to that of RDFa,
has no implementation nor existing content support so far, and can't
even handle all the use cases that RDFa could handle. The only
significant advantage of that proposal was that it used reversed
domain names to identify vocabularies instead of namespace prefixes;
however, there has been a lot of controversy about whether reversed
domains are actually better than namespace prefixes.

Even if we assume that reversed domains are slightly better (it's not
likely that they are much better if there is so much division on the
topic), is that worth the costs of: 1) limiting the range of use
cases that can be handled; 2) requiring new tools to be developed from
scratch; and 3) requiring content to adapt to this new format? These
are huge costs. Especially when we put 2) and 3) together: content
authors will be forced to keep supporting RDFa tools (as long as a
significant part of the audience is still using RDFa-related tools),
so they will need to duplicate metadata to support Microdata as well.
Wasn't duplication one of the issues inline metadata was intended to
prevent?
</example>

<aside>
Please note that my intention is not to bring back this discussion.
It is just an example of a controversy that will be known to most
participants on this list. Actually, I have no intention of stepping
into that debate again for a while.
</aside>

<The point>
I do not doubt Ian's good faith, nor his huge effort in making
HTML5 the best possible thing it might be. However, I doubt the
sanity of having an individual with the final say on any topic,
even above expert groups that have researched and discussed the
topic for years.
Just because the fruit of such long work can't be properly synthesized
in plain-text e-mails doesn't mean that there is not enough value in
it. Going back to the example, there have been a lot of people involved
in RDF and RDFa since 1997. That's already twelve years of continuous
work and research by several people. HTML5 replaces all this effort
(RDF and RDFa) with that of a single person over a few months (Microdata).

Honestly, I can't say for sure which method would be best for HTML;
but I'm still convinced that having a single gatekeeper with absolute
power over the next web standard is, at least, insane.
</The point>

Regards,
Eduard Pascual


Re: [whatwg] Microdata and Linked Data

2009-07-24 Thread Eduard Pascual
On Fri, Jul 24, 2009 at 1:07 PM, Peter Mikapm...@yahoo-inc.com wrote:
 [...]
 #2

 The other area that could be possibly improved is the connection of type
 identifiers with ontologies on the web. I would actually like the notion of
  reverse domain names if

 -- there would be an explicit agreement that they are of the form
 xxx.yyy.zzz.classname
 -- there would be a registry for mappings from xxx.yyy.zzz to URIs.

 For example, org.foaf-project.Person could be linked to
 http://xmlns.com/foaf/0.1/Person by having the mapping from org.foaf-project
 to http://xmlns.com/foaf/0.1/.

 It wouldn't be perfect, the FOAF ontology as you see is not at
 org.foaf-project but at com.xmlns. However, it would be a step in the right
 direction.

 [...]
 #4

 I don't expect that writing full URIs for property names will be appealing
 to users, but of course I'm not a big fan either of defining prefixes
 individually as done in RDFa with the CURIE mechanism. Still, prefixes would
 be useful, e.g. foaf:Person is much shorter to write than
 com.foaf-project.Person and also easier to remember. So would there be a way
 to reintroduce the notion of prefixes, with possibly pointing to a registry
 that defines the mapping from prefixes to namespaces?

 <section id="hedral" namespaces="http://www.w3c.org/registry/"
 item="animal:cat">
 <h1 itemprop="animal:name">Hedral</h1>
 </section>

 Here the registry would define a number of prefixes. However, the mechanism
 would be open in that other organizations or even individuals could maintain
 registries.


IMO, both of these proposals are quite related. However, you added
substantial differences between them that I can't really understand.

For #2 you suggest having a sort of centralized registry of mappings
between the reversed domains and the vocabularies they refer to. What
happens if next year I have to use an unusual vocabulary for my site
that is not included in the registry? Would I have to get the
vocabulary included in the registry before my pages' microdata can be
mapped to the appropriate RDF graph?
On the other hand, in #4 you are opening the gate for independent
entities (be they organizations or individuals) to define the prefixes
they would be using for their pages' metadata: why not apply this to
#2 as well? IMO, it would be more important for #2 than for #4, since
#4 only provides syntactic sugar while #2 enables something that would
be undoable without it (mapping Microdata to arbitrary RDF).
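The registry idea under discussion amounts to a simple lookup table from reversed-domain prefixes to namespace URIs. A small sketch, reusing the org.foaf-project example from the quoted message (the registry contents and function names here are illustrative assumptions, not part of any proposal's actual text):

```python
# Hypothetical registry mapping reversed-domain prefixes to namespace
# URIs, as discussed in the thread (contents are made up for the example).
REGISTRY = {"org.foaf-project": "http://xmlns.com/foaf/0.1/"}

def expand(reversed_name):
    # Split "org.foaf-project.Person" into prefix and class name,
    # then look the prefix up in the registry.
    prefix, _, classname = reversed_name.rpartition(".")
    base = REGISTRY.get(prefix)
    return base + classname if base else None

print(expand("org.foaf-project.Person"))
# -> http://xmlns.com/foaf/0.1/Person
```

The question raised above is exactly what happens when the lookup returns None: a vocabulary absent from a centralized registry simply cannot be mapped.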

About #1, I'm not sure what exactly you are proposing, so I can't
provide much feedback on it. Maybe you could make it a bit clearer:
are you proposing any specific change to the spec? If so, what would
the change be? If not, what are you proposing then?
Finally, about #3: I'm not familiar with the OWL vocabulary, so I
can't say too much about it. But if your second proposal gets into the
spec, then this would become just syntactic sugar, since any property
from any existing RDF vocabulary could be expressed; and if #4 also
got in, the benefit of built-in properties would be minimal compared
to using a reasonably short prefix (such as owl:).

Just my two cents.

Regards,
Eduard Pascual


Re: [whatwg] Make quoted attributes a conformance criteria

2009-07-23 Thread Eduard Pascual
On Thu, Jul 23, 2009 at 5:28 PM, Rimantas Liubertasriman...@gmail.com wrote:
 However, the quotation marks being *sometimes* optional is quite
 dangerous, since an author needs to exactly remember when they are
 needed and when they aren't; and using always quotation marks does
 avoid this problem.

 If author does not remember he can always use quotes and avoid
 this problem. I like the idea of having option to omit quotes valid
 for those who remember.
And this is why I was suggesting that the spec mention that, since
quotes are always allowed, the safest choice in case of doubt is to
use them, rather than making it a hard requirement. For validators, I
think the best approach would be to produce a warning (not an
error) for missing quotes, probably omitting the safest cases (such as
numeric attributes, @type on <input> (which is always a single word),
and so on); so authors who go through the hassle of validating their
pages to detect issues can be made aware of the unquoted attributes
that may bring trouble in the future (i.e. when updating such
attributes).
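The warn-not-error policy argued for here is easy to picture as validator logic. A minimal sketch, where the list of "safe" cases (purely numeric values, and @type on input) is my own assumption taken from the examples in this thread, not anything a real validator implements:

```python
# Sketch of a warn-not-error quoting check.  The safe-list below is an
# assumption based on the examples in this thread, not a real spec rule.
SAFE = {("input", "type")}

def quoting_warnings(tag, attrs):
    """attrs: list of (name, value, was_quoted) as parsed from the tag."""
    out = []
    for name, value, quoted in attrs:
        if quoted:
            continue  # quoted values are always fine
        if value.isdigit() or (tag, name) in SAFE:
            continue  # "safe" unquoted cases: warn about nothing
        out.append(f"warning: unquoted value for @{name} on <{tag}>")
    return out

print(quoting_warnings("img", [("width", "100", False),
                               ("alt", "logo", False)]))
# -> only @alt is flagged; the numeric @width is considered safe
```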

 Again, the point is not that *sometimes* it is safe to omit the
 quotes. The issue is with remembering when it is safe and when it is
 unsafe.

 I think you overestimate the danger.
 So my vote is against such a requirement.
And I think you underestimate it. I have seen newbies do really
horrendous things. Murphy is omnipresent on the web.
Anyway, I don't think voting on this list makes any sense. HTML5 is
not a democratic process, but a totalitarian one with the core of the
WHATWG at the top (see
http://wiki.whatwg.org/wiki/FAQ#How_does_the_WHATWG_work.3F) and Ian
as their hand. So it's not a matter of voting, but of convincing Ian
to change the spec, or of convincing the WHATWG members to replace him
with someone who will change the spec (the latter is quite unlikely to
happen anyway).


Re: [whatwg] Make quoted attributes a conformance criteria

2009-07-23 Thread Eduard Pascual
On Thu, Jul 23, 2009 at 7:35 PM, Aryeh Gregorsimetrical+...@gmail.com wrote:
 Add:
 In order to avoid errors and increase readability, using quotes is highly
 recommended for all non-omitted attribute values.

 I don't think there's any value in having the spec take a stance like
 this.  It's a matter of taste, IMO.

While I don't consider that a hard requirement would be appropriate,
there is an audience sector this discussion seems to be ignoring:
authoring tools' developers. IMO, it would be highly desirable to have
some guidelines for these tools to determine when they *should* quote
attribute values.

On the manual authoring side, I'd like to insist on the idea of
highlighting the safety of always quoting attributes versus the risk
of mistaking a required quotation for an optional one.

Finally, I think we might come up with some wording that worked for both cases.

Regards,
Eduard Pascual


Re: [whatwg] Create my own DTD and specify in DOCTYPE? Re: Validation

2009-07-21 Thread Eduard Pascual
On Tue, Jul 21, 2009 at 10:02 PM, dar...@chaosreigns.com wrote:
 On 07/21, Tab Atkins Jr. wrote:
 HTML5 is not an SGML or XML language.  It does not use a DOCTYPE in

 I thought HTML5 conformed to XML?

 any way.  The !DOCTYPE HTML incantation required at the top of
 HTML5 pages serves the sole purpose of tricking older browsers into
 rendering the document as well as possible.  No checking is made
 against a DTD, official or otherwise.

 I understand that, but the spec says an HTML5 document must include
 !DOCTYPE html.  And I would like, for my own purposes, to be able to
 instead use !DOCTYPE html SYSTEM
 http://www.chaosreigns.com/DTD/html5.dtd; without violating HTML5.

First things first: DTDs are a quite limited mechanism for describing
what a specific XML or SGML language allows.
The decision for HTML5 not to have a DTD was influenced by two
essential factors: first, and most obvious, HTML5 is neither XML nor
SGML (sure, it provides an XML serialization; but if you are using it
you might as well use XML Schema instead of a DTD anyway); and second,
less obvious but no less important: many of the requirements,
restrictions, and so on defined in HTML5 can't be properly described
via DTDs. So, what would be the point in defining a DTD which can't be
used to actually validate the document?

It is possible to go nuts treating your HTML documents as pure XML
(you'd need to ensure that they are well-formed and so on), and use a
DTD (or an XML Schema) with them. To spice it up, toss in an XSLT
stylesheet with the HTML output mode that just outputs the root of
the document, and voilà: you get your pure XML document (served as
text/xml or application/xml) treated as pure HTML by all browsers
(from IE6 onwards). IMHO, quite overkill. The issue here is that
both DTDs and Schemas have limitations, so neither would be enough
to properly validate the document.

This leads to a deeper thought: if DTDs or similar tools
don't really help to validate the document, what is the problem we
are trying to solve with them? IIRC, the original mail stated that the
goal was to differentiate between versions (a hypothetical HTML7 and
HTML9) in order to ensure browser compatibility (with a hypothetical
IE10, which would support HTML7 but not HTML9). Well, if your
validator needs to distinguish between these two versions, there are
already several mechanisms within your reach: you may use custom HTTP
headers, or add a data-html-version="7" or data-html-version="9"
attribute to your <body> tag. In both cases, your documents would
still comply with (current) HTML requirements and document model, and
your validator will have a way to differentiate them. Problem solved.
With no changes to HTML5. And without having to write a DTD that can
get close but will never be able to work properly.
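The @data-* dispatch suggested above is a one-liner for any validator front end. A sketch, where the data-html-version attribute itself is hypothetical (it is this thread's proposal, not an existing HTML attribute):

```python
import re

# Sketch: inspect the hypothetical data-html-version attribute on <body>
# and report which version-specific validator should handle the page.
def detect_version(html, default="5"):
    m = re.search(r'<body\b[^>]*\bdata-html-version="(\d+)"', html)
    return m.group(1) if m else default

print(detect_version('<body data-html-version="7">'))  # "7"
print(detect_version('<body>'))                        # falls back to "5"
```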

Furthermore, either of these approaches has additional benefits.
Let's make a slight change to the original scenario: suppose that IE10
complies with most of HTML7, but fails to render properly one or two
new elements; and maybe it even supports some features introduced by
HTML8 (IMHO, partial support for multiple iterations of the language
is more likely to match reality than perfectly implementing one of
them while providing zero support for the following ones). Why should
you restrain yourself from using those features of HTML8 that are
supported in IE10? With the @data-* approach, you don't have to: you
may instead put something like data-html-subset="IE10-compatible"
on your <body> and there you go. Your validator should be made aware
of what is supported in each subset you are using, and you will be
able to squeeze the most from each browser you wish to support, and
automate the validation as intended in the original use case.

Regards,
Eduard Pascual


Re: [whatwg] Validation

2009-07-20 Thread Eduard Pascual
On Mon, Jul 20, 2009 at 10:27 PM, dar...@chaosreigns.com wrote:

 On 07/20, Nils Dagsson Moskopp wrote:
  Uh-okay. What could various means be ?

 Something like:

 <object src="image.svg">
 <img src="image.png">
 </object>

  Why not use a HTML7 and a HTML9 validator in this case ? The HTML 7
  validator could check all pages and report those that aren't valid HTML
  7. Those pages could then put onto a list that is checked by the HTML 9
  validator.

 Because I don't want to have to tell the validator which pages are HTML7
 and which pages are HTML9, I want it to figure it out automatically.

You don't have to tell the validator which version each page is. All
the previous knowledge included in the setup Nils posted would be
"some pages are HTML7, and some are HTML9"; then you just
feed/send/pipe/whatever all pages to the HTML7 validator: since HTML9
would be a superset of 7, everything that passes this validation is
valid as both HTML7 and HTML9. Then, based on the result, failed pages
would be sent to the HTML9 validator: if they pass, they are HTML9
with features not included in 7; otherwise they are just invalid.
Although it may depend on the specifics of the validation software
used, automating this sequence would be easy in the general case, and
trivial in the best scenario.
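The cascade just described is straightforward to automate. A sketch with stand-in validator callables (in practice each would invoke an external validator per page; the function and label names are this example's assumptions):

```python
# Sketch of the HTML7-then-HTML9 validation cascade described above.
# validate_html7 / validate_html9 are stand-ins for real validators.
def classify(pages, validate_html7, validate_html9):
    result = {}
    for page in pages:
        if validate_html7(page):
            # HTML9 is a superset of HTML7, so this page is valid as both.
            result[page] = "HTML7 (also valid HTML9)"
        elif validate_html9(page):
            result[page] = "HTML9-only"
        else:
            result[page] = "invalid"
    return result

demo = classify(["old.html", "new.html", "bad.html"],
                lambda p: p == "old.html",
                lambda p: p != "bad.html")
print(demo)
```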

Browsers are built incrementally. For example, IE10 is very likely to
render properly any page that IE9 had rendered properly (plus some
that IE9 couldn't handle). And IE9 will handle any page that IE8
handles (plus some that are too much for IE8), just like IE8 handles
any page that IE7 did (in addition to those that use CSS2 features not
supported by IE7), and IE7 renders all the stuff that IE6 renders, and
so on... The same is true for any other browser line: Firefox 3
handles all pages that Firefox 2 did; and the latter included all
those pages that rendered properly in FF1. More of the same with
Opera, Safari, Konqueror, and so on (Chrome isn't too relevant here
because it's quite young). The only problem could happen if, for
example, I (or someone else) built a new browser with only HTML5 in
mind, when trying to open an HTML4 (or earlier) page; but the HTML5
spec already addresses this: to be compliant, a browser must treat any
valid input in a well-defined way; but it also must treat invalid
input in a well-defined way, which is actually defined so that
HTML5-compliant browsers render old and invalid content quite like
current browsers do.

Thus, if after HTML5 some features are deprecated (just like <font>
has been removed from the HTML specs), there will be pages using those
features that will not be valid HTML6, but HTML6 will still define
exactly what browsers are expected to do with them.

It seems that you worry about validation. Actually, there is some
reason to worry: many HTML4 Transitional pages (namely, those that use
<font> or other obsolete aberrations) will be reported as invalid when
processed by an HTML5 (or later) validator. So you should actually
worry about this; but not complain, because it is the best thing a
validator can do: warning you that you are using something (like
<font>) that you shouldn't be using.
Now, don't try to argue that using <font> (or some other obsolete tag)
should be OK because it's valid in HTML4: in HTML4 these things are
already *deprecated*. Every time you see that word in the HTML4 spec,
read it as an apology from the W3C, just like saying "we should have
never added this to HTML; now we regret it and it shouldn't be used".
Of course, a lot of legacy content will no longer validate with HTML5
validators; but where is the issue? It will still render. After all,
no one would expect Don Quixote or Hamlet to be valid according to
modern Spanish and English rules, respectively, but people who know
either language are still able to read them. This is an inherent part
of language evolution; and hence is a needed side-effect of evolving
HTML. And we need to evolve HTML, because the current standard is over
a decade old, and falls short of the web's needs every now and
then.

Just my PoV anyway.

Regards,
Eduard Pascual


Re: [whatwg] do not encourage use of small element for legal text

2009-07-19 Thread Eduard Pascual
On Sun, Jul 19, 2009 at 12:29 PM, Ian Hickson i...@hixie.ch wrote:

 [...]
 On Fri, 3 Jul 2009, Eduard Pascual wrote:
  It's clear that, despite the spec would currently encourage this
  example's markup, it is not a good choice. IMHO, either of these should
  be used instead:
 
  <p>Your 100% satisfaction in the work of SmallCo is guaranteed.
  (Guarantee applies only to commercial buildings.)</p>
 
  or
 
  <small>Your 100% satisfaction in the work of SmallCo is guaranteed.
  (Guarantee applies only to commercial buildings.)</small>

 In practice, if the author wants to make the parenthetical text smaller,
 he will. The question is whether we should encourage such small text to be
 marked up in a way distinguishable from other stylistic spans.

Indeed, making legal text clearly readable should be a goal. However,
I don't think it is much of an HTML5 goal: AFAIK, in most countries
there are general laws that define which kinds of text can hold legal
value in different kinds of media, dealing with details such as
minimum size and color contrast for each medium, maximum speed for
running text (like bottom-screen text in TV ads), and so on. Of
course, these will vary from country to country and/or region to
region; but IMHO general law is the area where legal text should be
handled. Authors hence should seek advice about the actual
requirements for legal text to be legally binding (i.e. by asking
their lawyers), and honor such restrictions when putting a
webpage together.
It is pointless to make specific encouragements or discouragements
about how to include legal text in an HTML5 document: good advice may
backfire if it leads a well-intentioned author to do something that
doesn't match local laws in that regard; and evil-intended authors
will ignore any advice from the spec and just push as much as they can
to the edge, looking for the most hard-to-read-but-still-legal
possible form.

The basic task of HTML (the language itself, not the spec defining it)
is to provide authors with tools to build their documents and pages in
an interoperable way. HTML5 does that job well in the area of small
print, providing the <small> element to mark it up. That's exactly
enough, and IMHO there is no point in trying to go further.


  I'm not sure if the word legalese was intended to refer to all kinds
 of legal text, or just the suspicious or useless ones. In any case, a
  more accurate wording would help.

 This wording is vague intentionally, because it is a vague observation. I
 don't know how we could make it more accurate.
If vagueness is intentional, use something explicitly vague, rather
than a term that some may take as vague but others may take as
catch-all and others seem to even find offensive/derogatory.

  First, leave the formal description The small element represents
  small print or other side comments. as is: IMHO it is accurate and
  simple, and that's quite enough to ask from a spec.
 
  Next, replace the note that reads Small print is typically legalese
  describing disclaimers, caveats, legal restrictions, or copyrights.
  Small print is also sometimes used for attribution. with something
  like this: Small print is often used for *some* forms of legal text
  and for attribution. Defining which kinds of legal text should be on
  small print, however, is out of the scope of HTML.
 
  This makes clear that HTML (technically) allows using small to put
  legal text (or anything else) in small print, but it doesn't encourage
  any specific usage of small print.

 I'm not convinced the suggested text is any better than the current text,
 to be honest. I'm also reluctant to start explicitly saying what is out of
 scope, because there's no end to that list.

I don't fully agree with that argument, but let's leave the scope part
out (it was quite redundant, anyway, just to be on the safe side).

The key in the sentence "Small print is often used for *some* forms of
legal text and for attribution" is the emphasis on "some": this
should be enough for any reader to understand that, if only some forms
go in small print, other forms just don't. The "some" achieves your
intended vagueness, and the emphasis makes such vagueness explicit
enough. The current wording, "small print is typically used for
legalese", is not just vague, but as ambiguous as the term "legalese"
itself: a significant proportion of authors might misunderstand it
and assume that any form of legal text is legalese, so it can be in
small print, but it isn't required to be so (because of the
"typically"). Addressing this potential misunderstanding is the exact
intent of my proposed text.

Regards,
Eduard Pascual


Re: [whatwg] A Selector-based metadata proposal (was: Annotating structured data that HTML has no semantics for)

2009-07-09 Thread Eduard Pascual
On Thu, Jul 9, 2009 at 12:06 AM, Ian Hicksoni...@hixie.ch wrote:
 On Wed, 10 Jun 2009, Eduard Pascual wrote:
 
  I think this is a level of indirection too far -- when something is a
  heading, it should _be_ a heading, it shouldn't be labeled opaquely
  with a transformation sheet elsewhere defining that is maps to the
  heading semantic.

 That doesn't make much sense. When something is a heading, it *is* a
 heading. What do you mean by should be a heading?.

 I mean that a conforming implementation should intrinsically know that the
 content is a heading, without having to do further processing to discover
 this.

 For example, with this CSS and HTML:

   h1 { color: blue; }

   <h1>Introduction</h1>

 ...the HTML processor knows, regardless of what else is going on, that the
 word "Introduction" is part of a heading. It only knows that the word
 should be blue after applying processing rules for CSS.
Now I think I got your point. However, I don't think it is really an
issue. Let's take a variant of your example:

CSS:
h1 { font-size: large; }

CRDF:
h1 { foo|MainHeading: contents; }

HTML:
<h1>Introduction</h1>

If we take the HTML alone (for example, if the CSS and CRDF are in
external files and fail to download), the browser will find an H1
element and will know that it is a first-level heading. It will
also render it large by default (perhaps depending on context; a voice
browser won't render anything as "large"). Now, if the CSS and CRDF
get processed, the browser will *also* know that it has to render it
large (now it's not just falling back to some default; it knows that
the author wanted the heading rendered large), and that it is
whatever the "foo" namespace (or, more precisely, the namespace mapped
by the "foo" prefix) defines as a MainHeading, which will
probably be something quite similar to the browser's own concept of
a first-level heading.

The point here is: the CSS states that the h1 should display
large, even though the browser would display it large in most cases anyway.
Similarly, the CRDF defines the h1 as a MainHeading, even though the
browser already knows it is a heading. Both the CSS and the CRDF
provide redundant information. Of course, someone could attempt to
describe semantics through CRDF that conflict with HTML's, but then
one could also make headings smaller, hide strongs, and enlarge
smalls with CSS.

No matter what CRDF says, a compliant HTML browser will always know
that h1 is a heading (and similarly, will know what other HTML
elements mean). But if what the CRDF says is consistent with what the HTML
says (the main point of metadata is stating things that are true;
false data is almost useless), then RDF tools that are completely
unaware of HTML itself can still know that something is a heading. In the
same way, when CSS is consistent with HTML's semantics (for example
making headings large, strongs bold, or ems italicized), a user
viewing the page can perceive that something is a heading, important,
or emphasized, respectively.

 I think by and large the same should hold for more elaborate semantics.


 (I didn't really agree with your other responses regarding my criticisms
 of your proposal either, but I don't have anything except my opinions to
 go on as far as those go, so I can't argue my case usefully there.)
Most of such responses were based on what is brewing for the next
version of the document, rather than the version actually available,
so I don't think it's worth going further on those points until the
update is ready and up.

  I think CRDF has a bright future in doing the kind of thing GRDDL does,

 I'm not sure about what GRDDL does: I just took a look through the spec,
 and it seems to me that it's just an overcomplication of what XSLT can
 already do; so I'm not sure if I should take that statement as a good or
 a bad thing.

 A good thing.

 GRDDL is a way to take an HTML page and infer RDF information from that
 page despite the page, e.g. by implementing Microformats using XSLT. So
 for example, GRDDL can be used to extract hCard data from an HTML page and
 turn it into RDF data.
Ok. Making metadata available from documents that were not authored
with metadata in mind, and without altering the document itself (at
most adding a link to the header), is one of the use cases CRDF aims
to handle; so it's good news to hear from someone that it's on the
right track to achieving it ^^

  It's an interesting way of converting, say, Microformats to RDF.

 The ability to convert Microformats to RDF was intended (although not
 fully achieved: some "bad" content would be treated differently between
 CRDF and Microformats); and in the same way CRDF also provides the
 ability to define de-centralized Microformats.org-like vocabularies (I'm
 not sure if referring to these as "microformats" would still be
 appropriate).

 I think this is a particularly useful feature; I would encourage you to
 continue to develop this idea as a separate language, and see if there is
 a market

Re: [whatwg] Limit on number of parallel Workers.

2009-07-08 Thread Eduard Pascual
On Wed, Jul 8, 2009 at 1:59 AM, Ian Hickson i...@hixie.ch wrote:

 I include below, for the record, a set of e-mails on the topic of setting
 limits on Workers to avoid DOS attacks.

 As with other such topics, the HTML5 spec allows more or less any
 arbitrary behaviour in the face of hardware limitations. There are a
 variety of different implementation strategies, and these will vary
 based on the target hardware. How to handle a million new workers will be
 different on a system with a million cores and little memory than on a system
 with one core but terabytes of memory, or on a system with 100 slow cores vs
 a system with 10 fast cores.

 I have therefore not added any text to the spec on the matter. Please let
 me know if you think there should really be something in the spec on this.


Wouldn't a per-user setting be the sanest approach for the worker
limit? For example, it would make sense for me to want a low
limit (let's say 10 or so workers) on my laptop's browser, but no
restriction (or a much higher limit, like a few thousand workers) on my
workstation.
Ian's point is key here: what's an appropriate limit for workers
depends almost entirely on hardware resources (and probably also on
implementation efficiency and other secondary aspects), and there is a
*huge* variety of hardware configurations acting as web clients, so
it's just impossible to hardcode a limit in the spec that works
properly for more than a minority. At most, I would suggest a note
like this in the spec: "User agents SHOULD provide the user a way to
limit the number of workers running at a time.", with emphasis on the
"SHOULD" rather than a "MUST", and also on the fact that the final
choice is for users to make. Then it'd be up to each implementor to
decide on default, out-of-the-box limits for their browser (it would
make sense, for example, for Chromium to have a lower default limit than
Firefox, since Chromium's workers are more expensive).
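A user-configurable cap of this kind could be sketched as a small scheduling wrapper. Everything below is illustrative, not part of any spec: the `WorkerPool` name is made up, and the injected `spawn` callback stands in for what a real page would do (something like `(script) => new Worker(script)`), so the queueing logic itself stays testable anywhere:

```javascript
// Sketch of a client-side throttle for spawning Workers under a
// user-configurable limit (e.g. 10 on a laptop, thousands on a workstation).
class WorkerPool {
  constructor(maxWorkers, spawn) {
    this.max = maxWorkers;
    this.spawn = spawn;   // in a browser: (script) => new Worker(script)
    this.active = 0;
    this.queue = [];
  }
  // Spawn immediately if under the limit; otherwise defer until release().
  request(script) {
    if (this.active < this.max) {
      this.active++;
      return this.spawn(script);
    }
    this.queue.push(script);
    return null;
  }
  // Call when a worker terminates; starts the next queued request, if any.
  release() {
    this.active--;
    if (this.queue.length > 0) {
      this.active++;
      this.spawn(this.queue.shift());
    }
  }
}
```

Out of the box, an implementation would only have to pick the default `maxWorkers`; the user setting would override it.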

Just my two cents.

Regards,
Eduard Pascual


Re: [whatwg] Helping people seaching for content filtered by license

2009-06-10 Thread Eduard Pascual
On Fri, May 8, 2009 at 9:57 PM, Ian Hickson i...@hixie.ch wrote:
 [...]
 This has some implications:

  - Each unit of content (recipe in this case) must have its own
   independent page at a distinct URL. This is actually good practice
   anyway today for making content discoverable from search engines, and
   it is compatible with what people already do, so this seems fine.

This is, in a wide range of cases, entirely impossible: while it might
work, and may even be good practice, for content that can be
represented on the web as an HTML document, it is not achievable for
many other formats. Here are some obvious cases:

Pictures (and other media) used on a page: an author might want to
have protected content, but to allow re-use of some media under
certain licenses. A good example of this is online media libraries,
which have a good deal of media available for reuse but obviously
protect the resources that inherently belong to the site (such as the
site's own logo and design elements). Having a separate page to
describe each resource's licensing is not easily achievable, and may
be completely out of reach for small sites that handle all their
content by hand (most prominently, designers' portfolio sites that
offer many of their contents under some attribution license to
promote their work).

Software: I have stated this previously, but here it goes again: just
like with media, it's impossible to simply put a <link
rel="license" ...> on an .msi package or a tarball. Sure, the package
itself will normally include a file with the text of the corresponding
license(s), but this doesn't help make the licensing discoverable
by search engines and other forms of web crawlers. It looks like I
should make a page for each of the products (or even each of the
releases), so I can put the <link> tag there and everybody's happy...
actually, this makes so much sense that I already have such
pages for each of my releases (even if there aren't many as of now);
but I *can't* put the <link> on them, because my software is under
more liberal licenses (mostly GPL) than other elements of the page
(such as the site's logo, which appears everywhere on the site and is
CC-BY-NC-ND), and I obviously don't want such content to appear in
searches for images that I can modify and use commercially, for
example.

Until now, the best approach to this need I have seen is
RDF's triple concept: instead of saying "licensed under Y", I'm
trying to say "X is licensed under Y", and maybe also "and X2 is
licensed under Y2", and this is inherently a triple. I am, however,
open to alternatives (at least on this aspect), as long as they
provide any benefit other than mere validation (which I don't even
care about anymore, btw) over currently deployed and available
solutions. I am not sure whether Microdata can handle this case or not
(after all, it is capable of expressing some RDF triples), but the
fact is that I can make my content discoverable by Google and Yahoo
using ccREL (quite suboptimal, and it wouldn't validate as HTML5, but it
would still work), but I can't do so using Microdata (which is also
suboptimal, would validate as HTML5, but doesn't work anywhere yet).
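The "X is licensed under Y" statements above can be written out as explicit subject/predicate/object triples. The sketch below is illustrative only: all the URLs are placeholders, and `resourcesUnder` is a hypothetical helper, not part of any vocabulary:

```javascript
// Per-resource licensing as explicit triples. A page-level
// <link rel="license"> can only yield one statement whose subject is the
// page itself, which is why it cannot express two different per-resource
// licenses like these. All URLs are illustrative placeholders.
const triples = [
  { subject: "http://example.org/releases/app-1.0.tar.gz",
    predicate: "license",
    object: "http://www.gnu.org/licenses/gpl.html" },
  { subject: "http://example.org/img/logo.png",
    predicate: "license",
    object: "http://creativecommons.org/licenses/by-nc-nd/3.0/" },
];

// A crawler consuming triples can answer "which resources on this page are
// under license Y?" per resource rather than per page:
function resourcesUnder(triples, licensePart) {
  return triples
    .filter(t => t.predicate === "license" && t.object.includes(licensePart))
    .map(t => t.subject);
}
```

This is exactly the granularity that a single page-wide license link cannot provide.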

Regards,
Eduard Pascual


Re: [whatwg] A Selector-based metadata proposal (was: Annotating structured data that HTML has no semantics for)

2009-06-10 Thread Eduard Pascual
 selector for each case that needs it.

From what I have seen, most newbies can handle both class and element
selectors, and the descendant combinator. A good portion of them are
even aware that they can use the descendant combinator to combine selectors of
either type, and even that they can chain this. In general, they use
the few selectors they can manage when that's enough for the job, and
rely on classes and inline styles to simplify their lives.
Even if some users went for trial and error when crafting
their selectors for CRDF, "trial" implies trying, and trying implies
somehow checking that the result matches expectations. Even if it requires
a bit more attention to catch the errors, this can be
simplified: just take a look at this sample (only the relevant parts of
the markup are included, for simplicity):
myFile.crdf:
@namespace foo "http://www.example.com/";
myOverComplexSelector1 {
  color: red; /* the CRDF parser will ignore this rule */
  foo|myProperty: someValue;
}
myOverComplexSelector2 {
  color: blue;
  foo|myProperty: someOtherValue;
}
myFile.html:
<link href="myFile.crdf" type="text/css">
<link href="myFile.crdf" type="text/crdf">
...
For most content, a red or blue color will denote each of the
foo|myProperty values. Mixing CSS with CRDF is ugly (but safe), and
this is not 100% reliable: something may be red or blue without having
the relevant value, or have that value but with the color overridden
by a more specific style. But looking at patterns (and CRDF is
intended to use selectors essentially for clearly patterned cases,
such as tables or dl's), it should allow checking, in most cases,
whether the selector is doing what's expected or not. For example, if
the selector takes the form td:nth-of-type(whatever), this method
allows checking it at a quick glance: even if in some cells the
contents override the color, it can generally be seen whether the
column as a whole is getting the relevant color, and hence the
relevant property value. One could go further, and disable the actual
CSS while testing the CRDF (commenting it out in the HTML, or adding a
"~" to the file name so that its fetch fails), and then there'd be no
risk of style overrides. And, if I tried, I bet I could find other tricks
to visually test CRDF rules. Of course, this doesn't replace accurate
testing, which should always be done, but it is an example of how the
similarities between CRDF and CSS can be exploited to make authors'
lives easier.
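When a quick glance at colors isn't enough, the same check can be done more directly from the browser console. `describeMatches` below is a hypothetical helper (not part of CRDF or any spec); its `root` parameter is anything exposing `querySelectorAll`, such as `document` in a browser or a stub elsewhere:

```javascript
// List what a candidate selector actually matches, as tag names (plus ids),
// so the result can be compared against expectations at a glance.
function describeMatches(root, selector) {
  return Array.from(root.querySelectorAll(selector))
              .map(el => el.tagName + (el.id ? "#" + el.id : ""));
}
// In a browser console one might run, e.g.:
//   describeMatches(document, "td:nth-of-type(2)")
```

If the returned list is all TDs from the intended column, the selector is doing its job regardless of what the styling looks like.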

There is one case left: those authors who overcomplicate things
and just don't test them at all. I have already stated that, while I
am ok with some foolproofing, I'm totally against suicide-proofing: if
someone wants to doom their own page, they don't need the specs' help
to do so, there are plenty of ways; so adding complexity to the specs to
deal with such cases is just a waste.

 I say this despite really wanting Selectors to succeed (disclosure: I'm
 one of the editors of the Selectors specification and spent years working
 on its test suite).
I am aware of your work on the Selectors spec and tests, and that's
why I value your feedback in this area so much.


 I think CRDF has a bright future in doing the kind of thing GRDDL does,
I'm not sure about what GRDDL does: I just took a look through the
spec, and it seems to me that it's just an overcomplication of what
XSLT can already do; so I'm not sure if I should take that statement
as a good or a bad thing.

 and in extracting data from pages that were written by authors who did not
 want to provide semantic data (i.e. screen scraping).
Now that you mention it, I have to agree... I hadn't noticed that
before, but I'll probably make room for this detail in the
document's intro and/or the examples.

 It's an interesting way of converting, say, Microformats to RDF.
The ability to convert Microformats to RDF was intended (although not
fully achieved: some "bad" content would be treated differently
between CRDF and Microformats); and in the same way CRDF also provides
the ability to define de-centralized Microformats.org-like
vocabularies (I'm not sure if referring to these as "microformats"
would still be appropriate).


Once again, I want to thank you for your feedback. Besides some fixes,
your mail has also convinced me to add some clarifications to the
document for some recurrent misconceptions (for example, CRDF doesn't
require, nor even encourage, taking all the semantics out of the main
document: semantics should be kept as close as possible to the content,
as long as this doesn't force redundancy/repetition).

Regards,
Eduard Pascual


Re: [whatwg] Removing the need for separate feeds

2009-05-24 Thread Eduard Pascual
On 5/22/09, Eduard Pascual herenva...@gmail.com wrote:
 [...]
 For manually authored pages and feeds things would be different; but
 are there really a significant number of such cases out there? I
 can't say I have seen the entire web (who can?), but among what I have
 seen, I have never encountered any hand-authored feed, except for code
 examples and similar experimental stuff.

Please, let me clarify: the intent of that comment was not to say "this
is pointless" or anything like that; the goal was to encourage people
to post real-world examples of where this feature would be used, so it
could be properly evaluated.

Now, having seen some of the cases, I must say that this addition
looks like a good idea, but it still needs some work (some issues and
shortcomings have already been highlighted).
There are cases where keeping a separate feed is still a good idea,
most prominently for site-wide feeds (because it's not possible to put
all the relevant stuff into a single HTML document, unless that
document is made for that purpose, but then it would be a separate feed
in itself), and for cases where the traffic on the feed is
significantly higher than for the document and/or the size of the
document is significantly bigger than the feed's. These cases, however,
are simply unaffected by the addition, and shouldn't prevent the
relevant ones from benefiting from it.

Regards,
Eduard Pascual


Re: [whatwg] Removing the need for separate feeds

2009-05-22 Thread Eduard Pascual
On Fri, May 22, 2009 at 9:21 AM, Ian Hickson i...@hixie.ch wrote:
 On Fri, 22 May 2009, Henri Sivonen wrote:
 On May 22, 2009, at 09:01, Ian Hickson wrote:
 
    USE CASE: Remove the need for feeds to restate the content of HTML pages
    (i.e. replace Atom with HTML).

 Did you do some kind of Is this Good for the Web? analysis on this
 one? That is, do things get better if there's yet another feed format?

 As far as I can tell, things get better if the feed format and the default
 output format are the same, yes. Generally, redundant information has
 tended to lead to problems.
IMO, feeds are the exception that proves the rule. While redundant
*source* information easily leads to problems, from what I have seen
the sites using feeds tend to be almost always dynamic: both the HTML
pages and the feeds are generated via server scripts from the *same
set of source data*, normally from a database. This is especially true
for blogs, and any other CMS-based site, since CMSs normally rely a
lot on databases and server-side scripting. So in these cases we don't
actually have redundant information, but just multiple ways to
retrieve the same information.
For manually authored pages and feeds things would be different; but
are there really a significant number of such cases out there? I
can't say I have seen the entire web (who can?), but among what I have
seen, I have never encountered any hand-authored feed, except for code
examples and similar experimental stuff.

Regards,
Eduard Pascual


Re: [whatwg] Exposing known data types in a reusable way

2009-05-21 Thread Eduard Pascual
Interesting.
Despite my PoV against the microdata proposal, I've taken a look at it
and found a minor typo:

Within 5.4.1 vCard, at the end of the "n" property description, the
spec reads:
"The value of the fn property a name in one of the following forms:"
Shouldn't it read:
"The value of the fn property is a name in one of the following forms:"?

Maybe this will grant me a seat for posterity in the acknowledgements
section =P.

On Wed, May 20, 2009 at 1:07 AM, Ian Hickson i...@hixie.ch wrote:

 Some of the use cases I collected from the e-mails sent in over the past
 few months were the following:

   USE CASE: Exposing contact details so that users can add people to their
   address books or social networking sites.

   SCENARIOS:
     * Instead of giving a colleague a business card, someone gives their
       colleague a URL, and that colleague's user agent extracts basic
       profile information such as the person's name along with references to
       other people that person knows and adds the information into an
       address book.
     * A scholar and teacher wants other scholars (and potentially students)
       to be able to easily extract information about who he is to add it to
       their contact databases.
      * Fred copies the name of one of his Facebook friends and pastes it
        into his OS address book; the contact information is imported
        automatically.
      * Fred copies the name of one of his Facebook friends and pastes it
        into his Webmail's address book feature; the contact information is
        imported automatically.
     * David can use the data in a web page to generate a custom browser UI
       for including a person in our address book without using brittle
       screen-scraping.

   REQUIREMENTS:
     * A user joining a new social network should be able to identify himself
        to the new social network in a way that enables the new social network
        to bootstrap his account from existing published data (e.g. from
        another social network) rather than having to re-enter it, without the
        new site having to coordinate (or know about) the pre-existing site,
        without the user having to give either site's credentials to the other,
       and without the new site finding out about relationships that the user
       has intentionally kept secret.
       (http://w2spconf.com/2008/papers/s3p2.pdf)
     * Data should not need to be duplicated between machine-readable and
       human-readable forms (i.e. the human-readable form should be
       machine-readable).
     * Shouldn't require the consumer to write XSLT or server-side code to
       read the contact information.
     * Machine-readable contact information shouldn't be on a separate page
       than human-readable contact information.
     * The information should be convertible into a dedicated form (RDF,
       JSON, XML, vCard) in a consistent manner, so that tools that use this
       information separate from the pages on which it is found have a
       standard way of conveying the information.
     * Should be possible for different parts of a contact to be given in
       different parts of the page. For example, a page with contact details
       for people in columns (with each row giving the name, telephone
       number, etc) should still have unambiguous grouped contact details
       parseable from it.
     * Parsing rules should be unambiguous.
     * Should not require changes to HTML5 parsing rules.


   USE CASE: Exposing calendar events so that users can add those events to
   their calendaring systems.

   SCENARIOS:
      * A user visits the Avenue Q site and wants to make a note of when
        tickets go on sale for the tour's stop in his home town. The site says
        "October 3rd", so the user clicks this and selects "add to calendar",
        which causes an entry to be added to his calendar.
      * A student is making a timeline of important events in Apple's history.
        As he reads Wikipedia entries on the topic, he clicks on dates and
        selects "add to timeline", which causes an entry to be added to his
        timeline.
     * TV guide listings - browsers should be able to expose to the user's
       tools (e.g. calendar, DVR, TV tuner) the times that a TV show is on.
     * Paul sometimes gives talks on various topics, and announces them on
       his blog. He would like to mark up these announcements with proper
       scheduling information, so that his readers' software can
       automatically obtain the scheduling information and add it to their
       calendar. Importantly, some of the rendered data might be more
       informal than the machine-readable data required to produce a calendar
       event.
     * David can use the data in a web page to generate a custom browser UI
       for adding an event to our calendaring software without using brittle
       screen-scraping.
     * http://livebrum.co.uk/: the author would like people to be able to
    

Re: [whatwg] A Selector-based metadata proposal (was: Annotating structured data that HTML has no semantics for)

2009-05-20 Thread Eduard Pascual
Note: I wrote this yesterday. My internet connection wasn't working
properly, but GMail told me the message had been sent and I believed it. Now I
have just noticed that it hadn't; and at least one person has been
confused by the changes in the document. Sorry for this issue, and I
hope this time GMail does send it. What follows is the message as it
should have been sent yesterday:

Update: I have just put up a new version of the CRDF document. The
main changes are:
Section 0. Rationale: several corrections on the claimed limitations
of RDFa, which have been shown to be just limitations of my knowledge
about RDFa.
Section 2. Syntax: the syntax is now more formally defined (although
it still refers to CSS3's Syntax, Values, and Namespace modules for
some stuff). The content model for property values is now fully
defined: resource and reversed support has been added, and explicit
typing capabilities are now more prominent in the document. For
subject definitions, the "none" keyword has been redefined; "blank()"
now handles what "none" previously did, and a syntax has been added to
mimic EASE's nearest-ancestor construct. Finally, a subsection has
been added describing how to handle scenarios where a tool might have
to extract an XML literal from source in a non-XML language.
Section 3. The host language: expanded 3.3 (embedding inline CRDF) to
allow multiple brace-delimited blocks within the attribute value, to
enable stating properties for different subjects while reusing the
same element.
Section 4. The first examples don't make sense anymore after the
changes in section 0. They have been removed, waiting for further
feedback on that section before redoing them.



I'd like to reiterate what I said in the opening message: if someone
can suggest a better place to discuss this document, please let me
know.


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-18 Thread Eduard Pascual
On Mon, May 18, 2009 at 10:38 AM, Henri Sivonen hsivo...@iki.fi wrote:
 On May 14, 2009, at 23:52, Eduard Pascual wrote:

 On Thu, May 14, 2009 at 3:54 PM, Philip Taylor excors+wha...@gmail.com
 wrote:
 It doesn't matter one syntax or another. But if a syntax already
 exists (RDFa), building a new syntax should be properly justified.

 It was at the start of this thread:
 http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-May/019681.html
Ian's initial message goes step by step through the creation of this
new syntax; but it does *not* mention at all *why* it was being created
in the first place. The insight into the choices taken is indeed a
good thing, and I thank Ian for it; but he did not provide insight
into the first choice taken: discarding the multiple options already
available (not only Microformats and RDFa, but also other less
discussed ones such as eRDF, EASE, etc). Sure, there has been a lot of
discussion on this topic, and it's possible that the choice was taken
as part of such discussions. In any case, I think Ian should have
clearly stated the reasons for building a brand new solution when many
others have been out for a while and users have been able to try and
test them.
Please keep in mind that I'm not criticizing the choice itself (at
least, not now), but the lack of information and reasoning behind that
choice.

 As
 of now, the only supposed benefit I have heard of for this syntax is
 that it avoids CURIEs... yet it replaces them with reversed domains??
 Is that a benefit?

 There's no indirection. A decade of Namespaces in XML shows that both
 authors and implementors have trouble getting prefix-based indirection
 right.
Really? I haven't seen any hint of that. Sure, there will be some
people who have trouble understanding namespaces, just like there are
some people who have trouble understanding why something like
<tr><td>foo</td><td>bar</tr></td> is wrong.
Please, could you quote a source for that claim? I could also claim
something like "fifteen years of Java show that reversed domains are
error-prone and harmful", and even argue about it; but this kind of
argument, without a serious analysis or study to back it, is
completely meaningless and definitely subjective.

 (If we were limited to reasoning about something that we don't have
 experience with yet, I might believe that people can't be too inept to use
 prefix-based indirection. However, a decade of actual evidence shows that
 actual behavior defies reasoning here and prefix-based indirection is
 something that both authors and implementors get wrong over and over again.)
Curious: you refer to "a decade of actual evidence", but you fail to
refer to any actual evidence. I'm eager to see that evidence; could
you share it with us? Thank you.

 I have been a Java programmer for some years, and
 still find that convention absurd, horrible, and annoying. I'll agree
 that CURIEs are ugly, and maybe hard to understand, but reversed
 domains are equally ugly and hard to understand.

 Problems shared by CURIEs, URIs and reverse DNS names:
  * Long.
  * Identifiers outlive organization charts.
Ehm, CURIEs aren't really long: the main point of prefixes is to make
them as short as reasonably possible.
Good identifiers outlive bad organization charts. Good organization
outlives bad identifiers. Good organization and good identifiers tend
to outlive the context they are used in.

 Problems that reverse DNS names don't have but CURIEs and URIs do have:
  * "http://": 7 characters of extra length.
  * Affordance of dereferenceability when mere identifier semantics are meant.
A CURIE (at least as typed by an author) doesn't have the "http://":
it is a prefix, a colon, and whatever goes after it. Once resolved
(i.e. after replacing the prefix and colon with what the prefix
represents), what you get is no longer a CURIE, but a URI like the ones
you'd type in your browser or inside a link's href attribute.
Dereferenceability is not a problem in itself: having more than what is
strictly needed can be either irrelevant or an advantage, not a
problem. Of course, it *may* be the cause of some actual problem, but
in that case you should rather describe the problem itself, so it can
be evaluated.
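The resolution step described above fits in a few lines. This is only a sketch: the function name and the `cc` prefix mapping are illustrative, and a real CURIE processor would also have to deal with safe-CURIEs, default prefixes, and strings that are already full URIs:

```javascript
// Minimal sketch of CURIE expansion: look the prefix up in the declared
// mapping and concatenate; the result is a plain URI.
function expandCurie(curie, prefixes) {
  const i = curie.indexOf(":");
  if (i === -1) return null;              // no colon: not a CURIE
  const prefix = curie.slice(0, i);
  if (!(prefix in prefixes)) return null; // undeclared prefix
  return prefixes[prefix] + curie.slice(i + 1);
}
// e.g. expandCurie("cc:license", { cc: "http://creativecommons.org/ns#" })
// yields "http://creativecommons.org/ns#license"
```

The indirection being debated is precisely the `prefixes` lookup: the author types the short form, and only the declared mapping gives it meaning.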

 Problems that reverse DNS names and URIs don't have but CURIEs have:
  * Prefix-based indirection.
Indirection can't be taken as a problem when most currently used RDFa
tools don't use it at all (which proves that they can work without
relying on it). Sure, it's not as big an advantage as some may claim
it to be. But the capability of indirection itself, even if not 100%
guaranteed to work, is an actual advantage. As a real-world
example, I have been able to learn about vocabularies I didn't know by
following the links in prefix declarations in documents using them.
  * Violation of the DOM Consistency Design Principle if xmlns:foo used.
*if* xmlns:foo is used. Very strong emphasis on the conditional, and
on the multiple possibilities that have already been proposed to deal

Re: [whatwg] A Selector-based metadata proposal (was: Annotating structured data that HTML has no semantics for)

2009-05-17 Thread Eduard Pascual
First of all, thanks for the time taken to review the document and to
post your feedback. I truly appreciate it.

On Sat, May 16, 2009 at 2:12 PM, Toby A Inkster m...@tobyinkster.co.uk wrote:
 In part 0.1 you include some HTML and some RDF triples that you'd like to
 mark up in the HTML and conclude that RDFa is incapable of doing that
 without adding extra wrapper elements.

 While adding redundant wrapper elements and empty elements is occasionally
 needed in RDFa (and from what I can tell, the microdata approach is even
 worse at this), the example you give doesn't require any.
I think I already stated this somewhere, but it never hurts to state
it again: like any human, I can make mistakes; and my knowledge of
RDF, RDFa, and even CSS, is definitely far from perfect. So, thanks
for your post, which has actually improved it a little with the
revelation that @property can take multiple values. My apologies for
that wrong example, then; I'll try to fix that part ASAP. Trying to
think about which cases would then require wrappers in RDFa, the only
situation I've come up with is when a value should be reused for
properties about different subjects. And, to my surprise, I just
realized that CRDF in embedded form didn't handle those cases either!
So, my most sincere thanks for highlighting this, since you have
revealed a serious issue in CRDF that will get fixed in the next
iteration of the document (hopefully due for late Tuesday or early
Wednesday).

 Part 0.3 of your document claims that RDFa is designed for XHTML
 exclusively. This is not the case - the designers of RDFa went out of
 their way to make its use feasible in *any* XML or XML-like language. SVG
 Tiny 1.2 includes the RDFa attributes, so RDFa can be used in SVG.
My apologies here for such bad wording, although your reply confirms
the idea behind it: RDFa was part of the "the future is XML"
dream, and thus didn't take proper account of non-X HTML. Not to say
that it was RDFa's fault, since that was a quite widespread belief
(I shared it myself for a long while). But RDFa's XMLish approach is
the root of many issues for tag-soup HTML, perfectly illustrated by
the amount of controversy generated on these lists by the
xmlns:prefix syntax.
I'll make sure to change that wording to better describe the idea
behind it; and I'd like to thank you for highlighting the issue.

 Part 0.3 also states that both Microformats and RDFa require the
 human-readable values to be reused as the machine-
 readable ones.. Actually, RDFa provides @content and @resource which,
 respectively, over-ride human-readable text and human-intended link targets.
Again, my limited knowledge of RDFa has betrayed me. This, added to
Microformats' misuse of abbr as a workaround, means that the issue
itself doesn't exist, at least not as initially perceived. I'm not
sure whether I'll remove that one entirely, or just briefly mention it in
the "Issues with Microformats" section, due to the accessibility
issues with the abbr approach.
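For the record, the @content/@resource behaviour Toby describes looks roughly like this (a hedged sketch of my own, with made-up values):

```html
<!-- Sketch: @content replaces the human-readable text as the
     machine-readable literal; @resource replaces @href as the
     machine-intended link target. -->
<p xmlns:dc="http://purl.org/dc/elements/1.1/">
  Published on
  <span property="dc:date" content="2009-05-16">last Saturday</span> by
  <a rel="dc:creator" resource="http://example.org/id/toby"
     href="/about">Toby</a>.
</p>
```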

 Lastly, and most seriously, CRDF doesn't seem to distinguish between
 literals and resources.
This is definitely an important issue, which Tab already made me aware
of. Fortunately, it's easy to fix; and Tab himself provided a possible
solution, which is very likely to be part of the next version of the
document.

Until I add the fixes to the document, it only remains for me to
reiterate my thanks for your feedback.

Regards,
Eduard Pascual


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-17 Thread Eduard Pascual
On Sat, May 16, 2009 at 10:02 AM, Leif Halvard Silli l...@malform.no wrote:
 [...]
 But may be, after all, it ain't so bad. It is good to have the opportunity.
 :-)
This is exactly the point (at least, IMO): RDFa may be quite good
at embedding inline metadata, but it can't deal at all with describing
the semantics that are inherent to the structure. OTOH, EASE does
the latter quite well, but can't handle the former at all.
That's why I was advocating for a solution that allows either
approach, and even mixing both when appropriate.

On a side note, about the idea of mixing CSS+EASE or CSS+CRDF or
CSS+whatever: my PoV is that these *should* not be mixed; but any
CSS-like semantic description would benefit from some foolproofing,
ensuring that if an author puts CRDF where CSS is expected, it gets
ignored by CSS parsers (and vice versa). In addition, CSS's
error-handling rules make this kind of shielding relatively easy.
OTOH, adding the semantic code
as part of the CSS styling, or trying to consider this as part (or
even as an extension) of the CSS language is wrong by definition:
semantics is not styling; and we should try to make authors aware
enough of the difference.

Regards,
Eduard Pascual


Re: [whatwg] [Fwd: Re: Helping people searching for content filtered by license]

2009-05-15 Thread Eduard Pascual
On Fri, May 15, 2009 at 8:40 AM, Smylers smyl...@stripey.com wrote:
 Nils Dagsson Moskopp writes:

 Am Freitag, den 08.05.2009, 19:57 + schrieb Ian Hickson:

       * Tara runs a video sharing web site for people who want
         licensing information to be included with their videos. When
         Paul wants to blog about a video, he can paste a fragment of
         HTML provided by Tara directly into his blog. The video is
         then available inline in his blog, along with any licensing
         information about the video.
 [...]

 Why does the license information need to be machine-readable in this
 case?  (It may need to be for a different scenario, but that would be
 dealt with separately.)

It would need to be machine-readable for tools like
http://search.creativecommons.org/ to do their job: check the license
against the engine's built-in knowledge of some licenses, and figure
out if it is suitable for the usages the user has requested (like
"search for content I can build upon" or "search for content I can use
commercially"). Ideally, finding the video on either Tara's site *or*
Paul's blog should be enough for a search engine to make it available
to users.

Just my two cents.


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-15 Thread Eduard Pascual
 that the value of the metadata
once taken outside the document is extremely important. Creating a
new way to exploit the data doesn't render the other ways irrelevant.

 With due respect, you're the one who brought competition into this
 discussion by saying there can only be one winner. I don't really think
 that's true, in this case.
With due respect, it was the WHATWG who brought competition into the
whole web spec environment, due to disagreements with the W3C (I find
the WHATWG's reasons quite valid, don't want to discuss on them now);
but now this competition seems to be going to extremes. While some
people here seem to hold the "it's RDFa or nothing" position, there
are others who seem to be in the "everything is fine except RDFa"
camp. Extremes are never good, and this discussion is no
exception.


Regards,
Eduard Pascual


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-15 Thread Eduard Pascual
On Fri, May 15, 2009 at 1:44 PM, Kristof Zelechovski
giecr...@stegny.2a.pl wrote:
 I do not think anybody in WHATWG hates the CURIE tool; however, the
 following problems have been put forward:

 Copy-Paste
        The CURIE mechanism is considered inconvenient because is not
 copy-paste-resilient, and the associated risk is that semantic elements
 would randomly change their meaning.
Copy-paste issues with RDFa and similar syntaxes can take two forms:
The first is orphaned prefixes: when metadata with a given prefix is
copied, but then it's pasted in a context where the prefix is not
defined. If the user that is copy-pasting this stuff really cares
about metadata, s/he would review the code and make the relevant fixes
and/or copy the prefix declarations; the same way when an author is
copy-pasting content and wants to preserve formatting s/he would copy
the CSS stuff. If the user doesn't actually care about the metadata,
then there is no harm, because properties relying on an unmapped
prefix should yield no RDF output at all.
The second form is prefix clashes: this is actually extremely rare.
For example, someone copies code with FOAF metadata, and then pastes
it on another page: which are the chances that user will be using a
foaf: prefix for something else than FOAF? Sure, there are cases where
a clash might happen but, again, these are only likely to appear on
pages by authors who have some idea about metadata, and hence the
author is more than likely to review the code being pasted to prevent
these and other clashes (such as classes that would mean something
completely different under the new page's CSS code, element id
clashes, etc). A last possibility is that the author doesn't have any
idea about metadata at all, but is using a CMS that relies on
metadata. In such case, it would be wise on the CMS's part to
pre-process code fragments and either map the prefix to what they mean
(if it's obvious) or remove the invalid data (if the CMS can't figure
out what it should mean).
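To make the orphaned-prefix case concrete, here is a minimal sketch (my own example, using FOAF only for illustration):

```html
<!-- The xmlns:foaf declaration binds the prefix. If only the inner
     span is copy-pasted into a page that lacks this declaration, the
     prefix is unmapped and a conforming parser emits no triple. -->
<div xmlns:foaf="http://xmlns.com/foaf/0.1/" about="#me">
  <span property="foaf:name">Eduard Pascual</span>
</div>
```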


 Link rot
        CURIE definitions can only be looked up while the CURIE server is
 providing them; the chance of the URL becoming broken is high for
 home-brewed vocabularies.  While the vocabularies can be moved elsewhere, it
 will not always be possible to create a redirect.

Oh, and do reversed domains help at all with this? Ok, with CURIEs
there is a (relatively small) chance for the CURIE to not be
resolvable at a given time; reversed domains have a 100% chance to not
be resolvable at any time: there is always, at least, ambiguity: does
org.example.foo map to foo.example.org, example.org/foo, or
example.org#foo? Even better: what if, under example.org we find a
vocabulary at example.org/foo and another at foo.example.org? (Ok,
that'd be quite unwise, although it might be a legitimate way to keep
deployed and test versions of a vocabulary online at the same time; but
anyway CURIEs can cope with it, while reversed domains can't).
Wherever there are links, there is a chance for broken links: that's
part of the nature of links, and of the evolving nature of the web.
But just because links can break, would you deny the utility of
elements such as a and link? Reversed domains don't face broken links
simply because they are incapable of linking to anything.


Now, please, I'd appreciate it if you reviewed your arguments before
posting them: while the copy-paste issue is a legitimate argument, and
now we can consider whether this copy-paste resilience is worth the
costs of microdata, the link rot argument is just a waste of
everyone's time, including yours. Anyway, thanks for that first
argument: that's exactly what I was asking for, in the hope of moving
this discussion forward.

So, before we start comparing benefits against costs, can someone post
any more benefits, or does the copy-paste-resilience point stand alone
against all the costs and possible issues?

Regards,
Eduard Pascual


[whatwg] A Selector-based metadata proposal (was: Annotating structured data that HTML has no semantics for)

2009-05-14 Thread Eduard Pascual
I have put online a document that describes my idea/proposal for a
selector-based solution to metadata.
The document can be found at http://herenvardo.googlepages.com/CRDF.pdf
Feel free to copy and/or link the file wherever you deem appropriate.

Needless to say, feedback and constructive criticism to the proposal
is always welcome.
(Note: if discussion about this proposal should take place somewhere
else, please let me know.)

Regards,
Eduard Pascual


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-14 Thread Eduard Pascual
On Thu, May 14, 2009 at 3:54 PM, Philip Taylor excors+wha...@gmail.com wrote:
 [...]
 If we restrict literals to strings [...]
But *why* restrict literals to strings? Being unable to state that
2009-05-14 is a date makes that value completely useless: it would
only be useful in contexts where a date is expected (basically,
because it is a date), but it can't be used in such contexts because
the tool retrieving the value has no hint about it being a date. The same
is true for integers, prices (a.k.a. decimals plus a currency symbol),
geographic coordinates, iguana descriptions, and so on.
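In RDFa, for comparison, a typed literal costs one attribute (a minimal sketch, with an arbitrary vocabulary choice):

```html
<!-- Sketch: @datatype marks the value as an xsd:date, so a consumer
     knows it is a date and not just the string "2009-05-14". -->
<p xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
  Released on
  <span property="dc:date" datatype="xsd:date">2009-05-14</span>.
</p>
```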

On Thu, May 14, 2009 at 8:25 PM, Maciej Stachowiak m...@apple.com wrote:

 On May 14, 2009, at 5:18 AM, Shelley Powers wrote:

 So much concern about generating RDF, makes one wonder why we didn't just
 implement RDFa...

 If it's possible to produce RDF triples from microdata, and if RDF triples
 of interest can be expressed with microdata, why does it matter if the
 concrete syntax is the same as RDFa? Isn't the important thing about RDF the
 data model, not the surface syntax?
The concrete syntax doesn't matter in itself. But if a syntax already
exists (RDFa), building a new one should be properly justified. As
of now, the only supposed benefit I have heard of for this syntax is
that it avoids CURIEs... yet it replaces them with reversed domains?
Is that a benefit? I have been a Java programmer for some years, and
still find that convention absurd, horrible, and annoying. I'll agree
that CURIEs are ugly, and maybe hard to understand, but reversed
domains are equally ugly and hard to understand.

 (I understand that if the microdata syntax offered no advantages over RDFa,
 then it would be a wasted effort to diverge.
What are the advantages it offers? I asked about them yesterday, and
no one has answered, so I'm asking again: please enlighten me on this,
because if I see no advantages myself and nobody else tells me about
any, then the only conclusion a rational mind can reach is
that there are none. So that's the position I'm in. I can
easily change my mind if anyone points out some advantage that would
actually help me more than RDFa when trying to add semantics and
metadata to my pages.

 But my impression is that you'd
 object to anything that isn't exactly identical to RDFa, even if it can
 easily be used in the same way.)
Actually, I do object to RDFa itself. Since the very first moment I
saw discussions about it on these lists, I have been trying to
highlight its flaws and to suggest ideas for alternatives.
Now, would you really expect me not to object to what, at least from
my current PoV, is simply worse than RDFa? IMHO, RDFa is just
*passable*, and microdata is too *mediocre* to get a pass. I don't
know about any solution that would be perfect, but I really think that
this community is definitely capable of producing something that is,
at least, *good*.

Of course, these are just my opinions, but I have also explained what
they are based on. I'm eager to change my mind if there is a basis for it.

Regards,
Eduard Pascual


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-13 Thread Eduard Pascual
Let me start with some apologies:

On Tue, May 12, 2009 at 12:55 PM, Eduard Pascual herenva...@gmail.com wrote:
 [...]
 Seeing that solutions are already being discussed
 here, I'm trying to put the ideas into a human-readable document that
 I plan to submit to this list either late today or early tomorrow for
 your review and consideration.
Oops, I'm already late with that. I had some unexpected commitments and
had no time to finish that doc. I still hope, however, to publish it today.

On Tue, May 12, 2009 at 12:55 PM, Eduard Pascual herenva...@gmail.com wrote:
 [...]
 Third issue: also a flaw inherited from RDFa, it can be summarized as
 completelly ignoring the requirement I submitted to this list on April
 28th, in reply to Ian asking us to review the use cases [1].
 [...]
 [1] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-April/019487.html

On Tue, May 12, 2009 at 7:30 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
 Well, he didn't quite *ignore* it - he did explicitly call out that
 requirement to say that his solution didn't solve it at all.

I missed that part of Ian's post, sorry. I really read it from top to bottom,
but it was quite long. I guess I should have re-read it.
Now, after some re-reading, I have noticed a point I should reply to:

On Sun, May 10, 2009 at 12:32 PM, Ian Hickson i...@hixie.ch wrote:
 [...]
 * Any additional markup or data used to allow the machine to understand
   the actual information shouldn't be redundantly repeated (e.g. on each
   cell of a table, when setting it on the column is possible).

 This isn't met at all with the current proposal. Unfortunately the only
 general solutions I could come up with that would allow this were
 selector-based, and in practice authors are still having trouble
 understanding how to use Selectors even with CSS.

First, I'd like to ask for a clarification from Ian: what do you mean by
"authors are still having trouble understanding how to use Selectors"?
If you mean that they have trouble when trying to select something like
"the second cell of the first row that has a 'foo' attribute different from
'bar' within tables that have four or more rows" or even more obscure stuff,
then I should agree: most authors will definitely have trouble dealing with
such complex cases, and I bet many will always have such trouble. However, if
you mean that authors can't deal with simple class, id, and/or
children/descendant selectors, then I think you are seriously
underestimating authors.
On a side note, I'd like to point out that my idea, despite being
Selector-based (actually, I should say CSS-based: it reuses quite a bit
more than selectors), wouldn't require authors to use selectors at all,
at least for the cases that can currently be solved by RDFa (or, FWIW,
with the current Microdata approach in the spec); the same way a page
can be completely styled with CSS without using selectors, via the
style attribute.

On Tue, May 12, 2009 at 1:59 PM, Philip Taylor excors+wha...@gmail.com wrote:
 On Tue, May 12, 2009 at 11:55 AM, Eduard Pascual herenva...@gmail.com wrote:
 [...]
 (at least for now: many RDFa-aware agents vs. zero HTML5's
 microdata -aware agents)

 HTML5 microdata parsers seem pretty trivial to write -
 http://philip.html5.org/demos/microdata/demo.html is only about two
 hundred lines to read all the data and to produce JSON and
 N3-serialised RDF. It shouldn't take more than a few hours to produce
 a similar library for other languages, including the time taken to
 read the spec, so the implementation cost for generic parser libraries
 doesn't seem like a significant problem.

Actually, I was thinking about the cost of deploying implementations,
rather than writing them, since RDFa consumers are already out there
and working. This, however, strays a bit from the original idea: it's
not really a matter of how big the cost is on its own, but of what you
get for that cost. This is probably my own fault, but I still fail to
see what Ian's suggestion offers that RDFa doesn't; so my impression
is that these costs, even if they are small, are buying nothing, and
hence are not worth it. If someone is willing to highlight what makes
this proposal worth the costs (i.e., what makes it better than RDFa),
I'm willing to listen.

On Tue, May 12, 2009 at 2:30 PM, Shelley Powers
shell...@burningbird.net wrote:
 [...] Eduard, looking forward to seeing your own interpretation
 of the best metadata annotation.

Hey, who said my proposal will be, or try to be, the best one?
Definitely, I didn't.
Actually, the reason to submit it here is to have other people
look at it and figure out ways to improve it (and I'm quite sure it
can be improved; I'm human, after all).
Please, let me explicitly state that I don't claim that idea to be
the best solution.
Since neither RDFa, nor Microformats, nor Ian's proposal could solve
my needs, my goal was
to build a solution that solves both my needs, and those solved by
other approaches, as a
proof

Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-12 Thread Eduard Pascual
I don't really like to be harsh, but I have some criticism of this,
and it's going to be quite harsh. However, my goal in pointing out
what I consider such big mistakes is to help HTML5 become as good as
it could be.

First issue: it solves a (major) subset of what RDFa would solve.
However, it has been taken as a requirement to avoid
clashes/incompatibilities with RDFa. In other words, as things stand,
authors will face two options: either use RDFa in HTML5, which would
forsake validation but actually work; or use a less powerful, less
supported alternative (at least for now: many RDFa-aware agents vs.
zero HTML5-microdata-aware agents) that validates but provides no
pragmatic advantages.
IMO, an approach that forces authors to choose between
validity/conformance that doesn't *yet* work and invalid solutions
that actually work is a horrible idea: it encourages authors to
forsake validity if they want things to work.
Wouldn't the RDFa + @prefix solution suggested many times work better
and require less effort (for spec writers, for implementors, and for
content authors)? Keep in mind that I don't think RDFa + @prefix is
the solution we need; I'm just trying to point out that the current
approach is even worse than that.

Second issue: as the decaffeinated RDFa it is, the HTML5 Microdata
approach tends to fail where RDFa itself fails. It's nice that, thanks
to the time element, the problem with trying to reuse human-readable
dates as machine-readable is dodged; but there are other cases where
separate values might be needed: for example using a street address
for the human-readable representation of a location and the exact
geographic coordinates as the machine-readable (since not all
micro-data parsers can rely on Google Maps's database to resolve
street addresses, you know); or using a colored name (such as lime
green displayed on lime green color) as the human-readable
representation of a color, and the hexcode (like #00FF00) as the
machine-readable representation. These are just the cases from the top
of my head, and this can't be considered in any way a complete list.
While *favoring* the reuse of human-readable values as the
machine-readable ones is appropriate, because it's by far the most
common case, *forcing* that reuse is quite a bad idea, because it is
*not* the *only* case.

Third issue: also a flaw inherited from RDFa, it can be summarized as
completely ignoring the requirement I submitted to this list on April
28th, in reply to Ian asking us to review the use cases [1]. I'll try
to illustrate it with an example, inspired by the original use-case:
Let's say someone's marking up a collection of iguanas (or cats, or
even CDs, doesn't really make a difference when illustrating this
issue), making a page for each iguana (or whatever) with all the
details for it; and then making an index page listing the maybe 20
iguanas with their name, picture, and link to the corresponding page.
Adding micro-data to that index, either with RDFa or with Ian's
microdata proposal, would involve stating 20 times in the markup
something like "this is the iguana's picture; this is the iguana's
name; this is the iguana's URL". It would be preferable to be able
to state, just once, something like "each tr (row) in the table
describes an iguana: the imgs are each iguana's picture, the contents
of the a's are the names, and the @href of the a's are the URLs to
their main pages". If I only need to state the table headings once
for the users to understand this concept, why should a micro-data
consumer require me to state it 20 times, once for each row?
Please note how such a page would be quite painful to maintain: any
mistake in the micro-data mark-up would generate invalid data and
require a manual harvest of the data on the page, thus killing the
whole purpose of micro-data. And repeating something 20 (or more)
times brings a lot of chances to put a typo in, to miss an
attribute, or to make some other minor but devastating mistake.
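The "state it once" idea could look roughly like the following. To be clear, this is NOT the actual CRDF grammar, just a hypothetical CSS-like sketch of the approach (all property names invented):

```
/* Hypothetical syntax, for illustration only: one rule set describes
   every row of the iguana index at once. */
#iguanas tr      { describes: iguana; }
#iguanas tr img  { is-property: picture; }
#iguanas tr a    { is-property: name; link-property: main-page; }
```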

Last, but not least, I'm not sure if it was wise to start defining a
solution while some of the requirements seem to be still under
discussion. Actually, I had a possible solution in mind, but I was
holding it back while reviewing it against the requirements being
discussed, so I could adapt it to any requirements I might have
initially missed. Seeing that solutions are already being discussed
here, I'm trying to put the ideas into a human-readable document that
I plan to submit to this list either late today or early tomorrow for
your review and consideration.


Regards,
Eduard Pascual

[1] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-April/019487.html


Re: [whatwg] Please review use cases relating to embedding micro-data in text/html

2009-04-28 Thread Eduard Pascual
 the machine something like the
first column (or the first cell on each row) are the names, the second
column (or 2nd cell on each row) are the e-mail addresses, ...,
rather than, for each contact, having to repeat this is the name,
this is the e-mail address, and so on.

- A website lists a series of software projects or products (from
something as huge as SourceForge to something as small as a company's
site listing its own few products), stating the product's title,
author/s (in case the products have different authors), license,
version, and date of the last release. Again, the author of that site
should be able to tell the machine something like these are the
products, these the authors, these the licenses, ..., rather than
stating this is the product's name, this is the product's author,
this is the product's license, ... for each and every product listed.

Rationale:

I hope it can be noticed how ignoring this need would raise some
serious issues: first and foremost, having to repeat the
meta-information for each entry is tedious and error-prone: if an
author fails to add a meta-data field to the new entry s/he just
added, the whole purpose of using metadata is ruined, since users
would need to manually retrieve the information anyway (wasn't
error-proneness the main reason to require keeping the metadata as
close as possible to the actual information?).
Next, redundant data means larger files, which directly translates into
slower page loads for the user and higher bandwidth costs for the
publisher. There may be some secondary issues from this (for example,
some search engines tend to truncate large files and ignore
everything beyond a certain threshold), but those come from the
needless enlargement of the file; so file bloating is the actual
issue to keep in mind and deal with here.

Additional considerations:

- Fulfilling this requirement could make copy-paste tasks harder to
deal with, but not impossible. Some browsers can preserve the
formatting applied from an external CSS when copying, so preserving
the metadata when it has been defined upon structure would be equally
achievable.
- There *are* cases where repeating the metadata a few times can be
better than having it centralized. I have nothing against any solution
that *allows redundancy*, as long as it *does not enforce redundancy*.
- I want to make clear that there is a difference between having the
human-readable and machine-readable information in the same place
(even reusing the same info for both consumers when doable) and having
the metadata (the data that defines how to interpret the actual data)
there as well. There might even be cases where having the metadata
somewhere else may make sense (for example, in the SourceForge example
above, it would be quite reasonable to have a single file defining how
to retrieve the useful data for each SERP (SEarch Result Page), rather
than defining it on every SERP). Again, I feel that the ideal solution
should allow either practice and force none (after all, from an
author's PoV, more choice means more power, which is always better for
us).

References:

[1] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-August/016037.html
(You might want to review other messages on that thread as well, but I
think this is the one that better describes the actual issue. Also,
keep in mind that, while my intention with this post is to bring the
problem/need into consideration, that thread evolved into discussing
some solution ideas. I think we should have the list of needs and
use-cases properly defined before we start discussing solutions.)

Regards,
Eduard Pascual


Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-01-18 Thread Eduard Pascual
On Sun, Jan 18, 2009 at 3:56 PM, Anne van Kesteren ann...@opera.com wrote:
 On Sun, 18 Jan 2009 16:22:40 +0100, Shelley Powers
 shell...@burningbird.net wrote:

 My apologies for not responding sooner to this thread. You see, one of the
 WhatWG working group members thought it would be fun to add a comment to my
 Stop Justifying RDF and RDFa web post, which caused the page to break. I am
 using XHTML at my site, because I want to incorporate inline SVG, in
 addition to RDFa. An unfortunate consequence of XHTML is its less than
 forgiving nature regarding playful pranks such as this.

 I'm assuming the WhatWG member thought the act was clever. It was, indeed.
 Three people emailed me to let me know the post was breaking while loading
 the page in a browser, and I made sure to note that such breakage was
 courtesy of a WhatWG member, who decided that perhaps I should just shut up,
 here and at my site, about the Important Work people(?) here are doing.

 Of course, the person only highlighted why it is so important that
 something such as RDFa, and SVG, and MathML, get a home in HTML5. XHTML is
 hard to support when you're allowing comments and external input. Typically
 my filters will catch the accidental input of crappy markup, but not the
 intentional. Not yet. I'm not an expert at markup, but I know more than the
 average person. And the average person most likely doesn't have my
 commitment, either.

 http://annevankesteren.nl/2009/01/xml-sunday shows the commentor (who by the
 way seems to be on your side in this debate) simply forgot to escape
 self-closed / and then WordPress somehow messed up in an attempt to fix
 it. I don't think anyone tries to make you shut up.

Ouch! Thanks Anne for the screenshot; otherwise I wouldn't have known
that it was my comment that was causing the issue.
My apologies, Shelley, for that incident. I assure you that it was not
intentional: it was a quite long post, I used some markup with the
intention of making it more readable (like italicizing the quotes), and
by the end I messed things up. Thanks to the preview page I noticed
some issues, like that I had to escape the <sarcasm>...</sarcasm>
for it to display (I'm too used to BBCode, which leaves unrecognized
markup as is), but I didn't catch the self-closed / one (and neither
did the preview page: it showed up without issues).

On Sun, Jan 18, 2009 at 4:15 PM, Shelley Powers
shell...@burningbird.net wrote:
 You're not seeing all of the markup that caused problems, Anne. The
 intention was to crash the post.
I don't really know how much I messed up the markup on that post,
and I only managed to fix the issues that I spotted from the preview
page, so I wouldn't be surprised if there were more issues. Once more,
I would like to clarify that this was not intentional; but, given the
tension arising again from this debate, I can understand your
reaction.


Re: [whatwg] Extracted content element.

2009-01-18 Thread Eduard Pascual
On Sun, Jan 18, 2009 at 9:40 PM, Jamie Rumbelow ja...@jamierumbelow.net wrote:

 Is there a need for a tag in HTML5 to specify extracted content
 semantically, such as the first few paragraphs of a blog post. The extract
 tag could optionally contain a href attribute to the full version of the
 content.

 Just a quick idea, and I wanted some community thought on it.

For the use-case you suggest (an extract from a blog), it seems that
cite (containing the title of the post, linked to the blog) and
blockquote would be a perfect match; maybe adding a class=extract
to the blockquote element for better accuracy.

Greetings,
Eduard Pascual


Re: [whatwg] [WebForms2] custom form validation notifications

2008-11-12 Thread Eduard Pascual
On Wed, Nov 12, 2008 at 1:15 AM, Ian Hickson [EMAIL PROTECTED] wrote:
 On Thu, 23 Oct 2008, Eduard Pascual wrote:
 [...]

 I don't really follow.
Neither do I, and I wrote that :S

Re-reading the conversation, I'm not really sure whether I really
understood Joao's issue and proposal correctly; and without him
providing any clarification, this isn't getting any better... And, in
addition, I couldn't have worded my ideas worse than I did. It seems
that I got a "404 Brain not found" while writing that message.

Summarizing, my suggestion (a possible solution for what I thought to
be the main use-case for this) was to *not* include the
validation-related arguments in the markup (or use catch-all
placeholders), but instead add them from a script upon the page's
onload event. This way, when scripts are available, the javascript
will do whatever it needs, customize everything, and so on; but when
scripts are disabled or not supported, the client won't do anything,
and all the validation will be delegated to the server (that's what is
called *graceful* degradation, in contrast with grace-less stuff
like noscriptYou have no scripting: your browser sux, get a new
one/noscript :P )
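A minimal sketch of that onload approach (the field name and the use of WF2-style @pattern/@required are my own assumptions, not details from Joao's message):

```html
<!-- Without scripting, the bare form submits and the server validates;
     with scripting, constraints are attached once the page has loaded. -->
<form action="/signup" method="post">
  <input name="age" id="age">
  <input type="submit">
</form>
<script>
  window.onload = function () {
    var age = document.getElementById("age");
    age.setAttribute("pattern", "[0-9]+");  // digits only
    age.setAttribute("required", "required");
  };
</script>
```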

 OTOH, I think Joao's idea was more like to relying on visual hints (ie:
 marking the field as red) on cases where an error message popup would be
 redundant and annoying. I think that could be more elegantly handled
 with an empty attribute value for a hypothetical custom-error-message
 attribute (which is not the same as an absent attribute).

 I really don't follow this. Maybe some concrete examples showing the
 problem with the current spec solutions would help.

The main point is that when a page already handles error-reporting
via CSS (for example, marking valid fields in green and invalid ones
in red), further notifications by the UA are redundant and sometimes
(if they take the form of pop-up messages) annoying to the user. The
need would be to disable such messages even when scripts are not
available. I don't really know if there is a way to do that with the
current spec.
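For reference, this is the kind of CSS error-reporting I mean, using the :valid and :invalid pseudo-classes mentioned later in this thread (the colors and the field are just an illustration):

```html
<style>
  /* The field reports its own state, so UA pop-ups add nothing: */
  input:valid   { border: 2px solid green; background: #efe; }
  input:invalid { border: 2px solid red;   background: #fee; }
</style>
<!-- A five-digit zip code; anything else matches :invalid. -->
<input type="text" name="zip" required pattern="[0-9]{5}">
```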
Anyway, as long as browsers don't use pop-ups for this kind of
notification, this shouldn't be an issue. Most browsers already
provide pop-up blocking functionality, so I hope they won't add
pop-ups of their own needlessly. In addition, a user who sees a field
becoming red (or marked as invalid in some other way) isn't likely
to hit the submit button before fixing it, so I don't think this is
too much of an issue anyway. Maybe Joao had something else in mind, or
some specific use-case, but unless he can provide more details, I
think this discussion will lead nowhere.

Greetings,
Eduard Pascual


Re: [whatwg] SPOOFED: Re: SPOOFED: Re: ---

2008-11-10 Thread Eduard Pascual
On Mon, Nov 10, 2008 at 2:57 PM, Pentasis [EMAIL PROTECTED] wrote:
 Hi,

 I seem to have a few problems here, but nothing I cannot handle. For some
 reason I get my e-mails later than I should and they are working on the
 electricity grid here, so I have no power during the day (only at night).
 On the other hand that gave me some time today to think about it and I
 already wrote some stuff down on paper. I will type it tonight. What
 file-type do you prefer, is word 95 ok?

I'd rather suggest plain text or, if you really need some formatting,
HTML. After all, it is quite reasonable to assume that most
people on the list can view HTML documents, but it wouldn't be too
safe to assume they can view Word's proprietary format; and even if they
can, it can sometimes take some extra work to do so (for example,
loading it into Google Docs or the like), and the final document would
be significantly bigger as .doc than as .html.
Just my opinion.


Re: [whatwg] ---

2008-11-10 Thread Eduard Pascual
On Mon, Nov 10, 2008 at 8:46 PM, Matthew Paul Thomas [EMAIL PROTECTED] wrote:
 The earliest surviving HTML draft from 1992 includes the PLAINTEXT and
 LISTING elements, both entirely presentational.
 http://www.w3.org/History/19921103-hypertext/hypertext/WWW/MarkUp/Tags.html
PLAINTEXT was intended to mark the end of hypertext in a document:
the contents beyond it were to be treated as plain text instead,
ignoring anything that could look like markup. That's quite
structural, IMO. The fact that the draft described an obvious
preferable rendering (intended to make the plain text stand out as
plain text) doesn't make it "entirely presentational": if that were
the case, almost all elements described there would be presentational.
The case of LISTING is more of the same: it was intended to denote
listings (such as code, terminal output, or directory listings), hence
the name, and the draft proposed monospaced rendering for this
element because it generally makes these kinds of listings much easier
to read.
In both cases, the elements are quite structural, and the rendering
described in the draft is simply an obvious consequence of the kind of
structure they denote.

 HTML+ in 1993 went further: In many cases it is convenient to indicate
 directly how the text is to be rendered, e.g. as italic, bold, underline or
 strike-through. http://www.w3.org/MarkUp/HTMLPlus/htmlplus_16.html Those
 presentational elements continued into HTML 2.0.
I wouldn't take too seriously specs that never went beyond the draft
stage, but if you insist, it'd be interesting to point out that in the
introduction of the HTML+ proposal those elements were described as
mere hints.
About those elements continuing into HTML 2.0, it's worth saying
that only three of them (i, b, and tt) were actually included in
that version, and for all of them the spec allowed for alternative
representation
(http://www.w3.org/MarkUp/html-spec/html-spec_5.html#SEC5.7.2).

 HTML has always been a dance between structure and presentation. Too
 structural, and humans won't understand it; too presentational, and
 computers won't understand it.
It hasn't been such a dance: there were a few presentational elements
at the beginning, because there was a need to describe the
presentation in such cases; then, when the presentational needs became
more demanding, there was an attempt (known as 3.2) to make the
language presentational, which soundly failed; and thanks to this
failure the W3C realized that HTML couldn't be the solution to these
presentational needs, hence they created CSS and added some hooking
and embedding mechanisms in HTML4. All the presentational stuff that
is still in HTML after 3.2 is only retained for compatibility with
older documents. If there was a dance, it was a quite short one.
Actually, it's easier for most humans to understand a document with a
clear structure than a nonsensical, eye-bleeding presentational
tag soup. Of course, presentation can help humans understand a
document, but it works best when it's used to highlight and emphasize
the structure of the document. I'm afraid you are underestimating
humans.
And, besides that, computers actually do understand completely
presentational markup: they understand that something is italic,
something is underlined, and some chunk of text is red and
right-aligned. Of course, they won't be able to figure out *why*
something is italic, or underlined, or red; but that's not because of
a limitation of the machines, it's simply because the document doesn't
really convey that information.

Besides all of this, may I ask what the point of your message was?
I mean, I hope I understood the contents, but I fail to see the intent
of it: you just quoted a small comment and tried to prove wrong a
point that, even if it is indeed wrong, was not the main point of that
comment; mostly because it was taken as the basis for a comparison.
Actually, after reviewing these prehistoric specs in more depth, I
now find the parallelism between presentation and semantics even more
obvious.

Greetings,
Eduard Pascual


[whatwg] Format issue on the spec: unreadable (or hardly readable) text.

2008-11-09 Thread Eduard Pascual
I can't say for sure whether this is an issue with the spec document
itself, or just a rendering bug in my browser (FF 3.0.3), but here it
goes:
Within section 4.3.1 "The script element", in the algorithm
labeled "Running a script", step 6, the text for the first condition
shows overlapped, each line covering part of the text in the previous
line. I have put a screenshot at
http://herenvardo.googlepages.com/brokentext.png just in case anyone
wants to see it.
So, unless it is actually a FF bug, I hope someone fixes it.
Actually, I *can* read half-visible text like that, but even being
used to doing so it still takes some extra effort from my eyes and
brain. And I wouldn't expect everyone reading the spec to be used to
looking at scrambled text.


Re: [whatwg] SPOOFED: Re: SPOOFED: Re: ---

2008-11-09 Thread Eduard Pascual
On Sun, Nov 9, 2008 at 4:13 PM, Ian Hickson [EMAIL PROTECTED] wrote:
 On Sat, 8 Nov 2008, Eduard Pascual wrote:

 Can somebody put forward any technical argument against this idea?

 For my benefit, could you succintly summarise the changes that this would
 involve to the spec? I'm not sure I really understand the proposal at the
 concrete level.
Sure! I'd like to speak with Pentasis first (he was the one who started
this thread with this suggestion), to make sure we have a similar
idea in mind (or to cover both viewpoints if needed), before putting
something more concrete together; so just give us a couple of days or
so to discuss the details, and we'll do our best to word it in a
succinct and concrete way.

 Also, it would be helpful to have a clear summary of the use cases.
Definitely, it would. I'll try to put together the cases Pentasis and
I have in mind, and also do some research through the web to try to
find those cases that might have gone unnoticed.

So, expect a more formal (and concrete) reply within a few days.


Re: [whatwg] SPOOFED: Re: SPOOFED: Re: ---

2008-11-08 Thread Eduard Pascual
On Wed, Nov 5, 2008 at 8:47 PM, Philipp Serafin [EMAIL PROTECTED] wrote:
 On Wed, Nov 5, 2008 at 4:00 PM, Leons Petrazickis
 [EMAIL PROTECTED] wrote:
 It matters in the sense that web browsers would have to implement both
 approaches for backwards compatibility.


 This depends what you mean when talking about implementing a tag.
 Browsers already load all tags and attributes they encounter into the
 page DOM today , regardless if they know them or not. This is also
 the behavior that HTML5 demands, if I'm not mistaken.
I have just put a sample file together, and checked it on what I have available:
In Dreamweaver (CS3) and FF3, it is rendered perfectly.
In IE7 and Microsoft's Visual Web Developer 2008 Express SP1, it
ungracefully degrades, as I expected: they ignore the entire
<reference> tag, so neither the default styles defined for the tag nor
the specific classes are displayed. I also included title attributes,
for fancy tooltip effects, and they show perfectly in FF; IE of course
doesn't show them because they are part of the ignored <reference>
tag.
And, of course, both authoring tools complain that <reference> is not
valid... but that was to be expected :P

I tried to test this on IE8beta2, but the installation keeps failing ¬¬'.
Would MS be so kind as to ship something that doesn't break? Thanks.
Sarcasm aside, if somebody wants to give this sample a try in other
browsers, I've put it online at
http://herenvardo.googlepages.com/test.html. Note that I have
deliberately abused styling to make the result stand out (ie: to make
it easy to see whether it works or not). Use your browser's "View
source" feature to get an idea of what to expect.

Standards that have tried to make changes like that -- XHTML2 comes to
mind -- have not been as successful as HTML4 [...]

 We can't really make any statements about how successful XHTML2 would
 be on the public web. It's not yet a recommendation (though this would
 probably not change much) and no browser implemented it, so there was
 never opportunity to find out.
And even so, it's being used sporadically :P
In addition, this case can't be compared with XHTML2: it'd give
authors a choice, since <abbr> and similar elements would still be
supported, for backwards compatibility.

 But anyway, this discussion is moot, since many of those tags can't be
 changed due to backwards compatibility.
Actually, it isn't: <abbr> doesn't need to be kept around any more than
<big> or <font> do for the compatibility issue. The spec needs to
define how to handle all of those anyway; but that isn't really tied
to what it defines as conformant and non-conformant.

Now, going back to Pentasis's actual concern: authoring content in a
way that makes sense. Just go ahead: it seems that browsers that deal
with unknown elements in an HTML5-conformant way will handle it
properly, as long as you provide some rendering info in your
style-sheets. You don't really need this spec to define that
element, as long as it works in browsers; and the spec will still
require browsers to process it in a reasonable way (ie: include it in
the DOM, apply relevant styles, and hopefully even apply global
attributes like @class or @title). So, essentially, including an
element like <reference> or not is, from the browser's viewpoint,
irrelevant: their required behavior will be exactly the same either
way. On the author's side, it will only be relevant for authors who
care about conformance and need the element, in which case it will be
good to have it available. Of course, it requires a bit of extra work
by the WG to describe it in the spec (what would it be, one
paragraph?). Validators may face some extra work, but I don't think
anyone with a bit of sanity left would hardcode each element in a
validator; they would instead use some DTD-like description of the
syntax and/or content model, so it isn't really that much work. And
finally, on the assistive technologies' and search engines' side, this
kind of element would allow the contents of webpages to be described
far better, which would be a clear benefit.
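To illustrate (the <reference> element is the made-up example from my test page; the styles are arbitrary): a browser that puts unknown elements into the DOM and applies CSS to them renders this fine, while a browser like IE7, which drops unknown tags, loses the styling and the tooltip.

```html
<style>
  /* Default rendering for the made-up element... */
  reference      { font-style: italic; color: #336; }
  /* ...plus class-specific refinements, as with any element. */
  reference.book { font-weight: bold; }
</style>
<p>See <reference class="book" title="A made-up citation">the
HTML+ draft</reference> for the historical details.</p>
```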

So, in summary, there are some benefits to including this kind of
element, and there are no relevant drawbacks (just a bit of extra,
trivial work for spec and validator writers). Some benefits, and no
*real* issues; the only plausible reason not to include this would be
a desire to hurt the web (which, BTW, some XHTML2 evangelists out
there think is the only goal of the WG). Can somebody put forward any
technical argument against this idea? If somebody does, I'd be glad to
discuss it seriously; but please think about whether what you're going
to say at least makes sense: don't come up with the "implementation"
issue (the inclusion of this kind of element wouldn't require from
implementations anything that the spec doesn't already require) or the
"backwards compatibility" thing (the inclusion of a new element never
affects how other elements should be treated); I've better things to
spend my time on than needlessly looping and replying to 

Re: [whatwg] ---

2008-11-05 Thread Eduard Pascual
 making it a section (with the aside element).
And, for example, what about something that's both navigation and
tangentially related (regardless of whether it is a section or not)?
For example, a list of "see also" stuff on a documentation page: you
would be forced to mark it up as a navigation section inside an
aside section, or as an aside section inside a navigation
section; neither reflects the real structure of the page, but
they are the only ways to represent both semantics. I know these
examples are really simple, and the workarounds wouldn't really hurt
that much; but they should be enough to show how we are stepping into
the same issues with semantics that we did over a decade ago with
presentation. Do we really have to wait to be hurt by the issue before
solving it, when we can see it so clearly approaching? I don't know
about you, but I know I am *not* a masochist, so I don't really want
to get hurt.
Now, to something more specific, we'd need:
1) Some (external to HTML) way to describe semantics. (And no, I don't
think RDF, on its current form, is a solution for this; but maybe the
solution could be based on or inspired by RDF.) That should be to
semantics what CSS is to presentation. And we don't really need to
care about browsers quickly implementing it, or about legacy browsers
that don't implement it, because currently browsers don't care at all
about semantics (at least, not beyond displaying @title values and for
default rendering, and rendering can be dealt with through CSS
anyway).
2) A way to hook these external semantics to arbitrary elements of a
page: we already got @class for this :D
3) A way to add inline semantics when needed. I guess a semantics
attribute would be the most straightforward approach. As for the
format it uses, we should worry about that once we have solved 1).
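For the sake of concreteness, here is one way points 2) and 3) could look, applied to the "see also" case from my previous message. To be clear, everything here is hypothetical: neither the semantics attribute nor its value syntax exists in any spec; this is only a sketch of the shape of the idea.

```html
<!-- Hypothetical markup; @class hooks external semantics (point 2),
     and the invented @semantics attribute adds inline semantics
     (point 3) without forcing one meaning to nest inside the other. -->
<ul class="see-also" semantics="navigation; tangential">
  <li><a href="related.html">A related topic</a></li>
  <li><a href="background.html">Background reading</a></li>
</ul>
```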
If we got that, then we could:
1) Get rid of all the wannabe semantic elements that didn't really
work well enough, sending them to the
deprecated/transitional/supported-for-backwards-compatibility-only
limbo.
2) Get rid of all the *new* wannabe semantic elements that wouldn't
be really serving any purpose (ie: un-bloat the content model)
3) Have the simplest and cleanest markup, the most accurate
presentation mechanisms, and the richest semantic descriptions of the
last 10 (or even more) years, all in one package.

 I agree with you that there are many things in HTML that have a purely
 historic legitimation, such as the h1-h6 elements. h level=n would be
 much more flexible. I personnally often get mad about the IMO totally
 unlogic set of form elements. I would highly appreciate such thigs to be
 cleaned up in a new HTML spec. But of course the task ot those who design
 HTML5 is not to re-invent the wheel, but to evolve the existing HTML in a
 highly backwards-compatible way.
I have already mentioned what I think about the
backwards-compatibility requirement, and the way it's being
approached.

Anyway, I think it's also worth pointing out the issue with headings:
currently, the spec recommends using h1 for all levels of headings,
but that would mess the hell up in current browsers. Hasn't anybody
noticed that?

 I made the experience when I suggested a new set of form elements, that I
 did not get much response on those contributions. The same might happen to
 your suggestions, as they are on a more basic level, than the HTML5 works
 act on. I don't think you can blame the people working on HTML5 for this, as
 they are quite far in the process, and your suggestions do rather set new
 starting points, than contribute to the acutal state of the work.
These are quite different cases: the main issue with form elements is
that their functionality is normally hardcoded in the browser.
Pentasis's suggestions (and even my own) would only significantly
affect the spec itself and validators; and maybe future "smart
browsing" features that aren't yet implemented anyway.


Well, that's been a long enough message, and over 3 hours of typing
and reviewing stuff are now asking me for a cigarette, so I'll post
again soon with the additional comments I was planning to add.
I want to remind you all that this message mostly reflects my point of
view; and if someone disagrees I'm more than willing to pay attention
to your arguments. Also, I think it'd be good to start branching stuff
from here rather than keeping the multi-discussion on this thread.

Regards,
Eduard Pascual


Re: [whatwg] Review of the 3.16 section and the HTMLInputElement interface

2008-11-05 Thread Eduard Pascual
On Thu, Nov 6, 2008 at 12:15 AM, Samuel Santos [EMAIL PROTECTED] wrote:

 On Wed, Nov 5, 2008 at 10:46 PM, Ian Hickson [EMAIL PROTECTED] wrote:

 On Wed, 5 Nov 2008, Samuel Santos wrote:
 
  I find it very hard to convince some clients that in order to have the
  browse button in their language they must configure their browsers. The
  vast majority of them don't even know where they can configure the
  default browser language, and don't feel they should even have to do it.
  It's also strange for them to have all the buttons in their language
  except the browse buttons.

 I understand but why don't they also complain about, say, the title of the
 dialog box that comes up? Or the items on the context menus?

 Why do they use the wrong language browser in the first place?

 In Portugal a lot computers come with the english OS version.
 This means that the browser is in english and configured to have english as
 the default language.

 The problem with the input file button is that the client/user assumes that
 the text that appears in it is the developer's responsibility, like with the
 other button controls.
 In the example you gave he knows that the dialog box is from the UA
 (browser) and has nothing to do with the rest of the application.
 I'm pretty sure that this happens a lot in non-english countries.

I agree with Samuel that this is an issue. In Catalunya, Spanish
software is most often used (both OS and browsers), because a lot of
software is not easily (or at all) available in Catalan (especially
Microsoft software, such as Windows and IE, which account for quite a
big fraction of web surfers). Seeing Spanish strings in pages that are
supposed to be in Catalan is quite annoying (especially when keeping
in mind some historic factors).

I can understand that there may be some security concerns with this
control; but I don't think changing the "Browse" caption poses any
threat. And if there is so much paranoia about this, browsers could be
allowed (or even required) to ask for confirmation when picking a file
if the caption has been changed (something like "Are you sure you
want to submit C:\example.txt to example.com?" should be enough,
and users would easily see such a question as coming from the UA
rather than from the webpage).


[whatwg] Fwd: Review of the 3.16 section and the HTMLInputElement interface

2008-11-05 Thread Eduard Pascual
LOL forgot to add the whatwg list to the To: field ^^;


-- Forwarded message --
From: Eduard Pascual [EMAIL PROTECTED]
Date: Thu, Nov 6, 2008 at 3:31 AM
Subject: Re: [whatwg] Review of the 3.16 section and the
HTMLInputElement interface
To: Samuel Santos [EMAIL PROTECTED]


On Thu, Nov 6, 2008 at 3:09 AM, Samuel Santos [EMAIL PROTECTED] wrote:
 What about allowing the Author to change the control's locale?
 By doing so, the UA can then render the button with the same locale as the
 application without compromising the security.
I was going to suggest this, but I don't think it's really doable:
browsers would need to include all the translations of that caption in
all their versions. In the specific case of IE, considering that
Microsoft tends to license only single-language versions of its
products (if you want it in two languages, you need to pay twice), I'm
afraid this would be an issue (despite the fact that IE is actually
distributed for free, it would still mess with their packaging).
Still, I think that requiring user confirmation whenever something in
the control has been altered (like the button caption) should be
enough: as long as the user knows that s/he is sending that file to
that site, how much does it matter what the control actually looks
like or what the button's caption reads?


Re: [whatwg] Add 'type' attribute to mark

2008-11-01 Thread Eduard Pascual
First of all, I'd like to avoid any misunderstandings: I have nothing
against the mark element itself, although I'm afraid my previous
e-mails may lead one to think otherwise. It could be a really good
addition to HTML but, IMHO, it isn't yet, and I'm trying to show why I
think so.

On Sat, Nov 1, 2008 at 2:57 AM, Ian Hickson [EMAIL PROTECTED] wrote:
 On Sat, 1 Nov 2008, Eduard Pascual wrote:
 [...]
 What's the difference then between mark and span then? I mean, does
 the mark element provide anything that span with an appropriate
 class wouldn't?

 A default style when there's no CSS support, support in accessibility
 tools, the ability for search engines to understand what's going on,
 easier round-tripping between editors, simpler rules in CSS and other
 selector-like environments, etc. The usual benefits of semantics.

Let's take that point by point:

- A default style when there's no CSS support: that's entirely
presentational; and although I may accept it as a side-bonus, I don't
feel presentational arguments are a good basis for including or
excluding a new element.

- Support in accessibility tools: that's plain daydreaming: what kind
of support can an AT provide without any hint on whether the mark'ed
text is a search term, or the line of a code snippet that caused a
crash, or the total price of the orders in a shopping cart, or
whatever other usage authors may come up with?

- "The ability for search engines to understand what's going on"??
Coming from someone else, I'd think they were simply wrong, but
coming from Ian I really hope you were joking: besides the fact that a
SE would be as clueless in that respect as an AT, this could be
extremely easily abused by black-hat SEO, to the point of making the
element completely meaningless to SEs: just a mark { display: none; }
rule and a site becomes able to freely spam "highlighted" keywords
across the entire page. And, although many SEs are capable of checking
CSS sheets, it's almost trivial to achieve the same from JavaScript,
and even to obfuscate the script if any SE managed to figure the trick
out.

- Simpler rules in CSS and other selector-like environments: I simply
can't believe that this came from an editor of the CSS3 Selectors
module. How much simpler is "mark" than ".match", ".crash_line",
".total", and so on? The only difference is a single dot; plus the
fact that classes give much more flexibility than directly styling (or
selecting in any other similar environment) the element could ever
allow. And finally, it's worth mentioning that, as soon as a page
needs to use mark for two or more different purposes, there is no
other way to differentiate them in selectors than classes (and no way
at all for ATs or SEs to differentiate them, BTW).

- "The usual benefits of semantics." Honestly, that's a really good
purpose; it's just that it's not achieved. If you look again at the
counter-arguments above, you should be able to notice that they are
nothing other than the usual drawbacks of a lack of semantics.
Because, simply put, the semantics defined for mark are so vague that,
in practical terms, they are the same as no semantics at all.
Pentasis's initial proposal would be a simple and efficient solution
to this issue: with some sort of type/role/whatever attribute (based
on a well-defined list of allowable values), an AT could tell the user
why some text is marked, a SE could figure out what's really going on,
and a designer could rely on that attribute in selectors instead of
defining classes with an entirely presentational purpose.
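To make the selector argument concrete (the class names are just the examples used above):

```html
<style>
  /* With classes, each purpose gets its own one-dot selector: */
  .match      { background: yellow; }
  .crash_line { background: #fdd; }
  /* With a bare mark, two different purposes on one page collapse
     into a single rule... */
  mark        { background: yellow; }
  /* ...unless classes are added anyway: mark.match, mark.crash_line —
     at which point mark saved exactly one dot per selector. */
</style>
```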

Well, that's my opinion, just wanted to share it.


Re: [whatwg] Add 'type' attribute to mark

2008-10-31 Thread Eduard Pascual
On Fri, Oct 31, 2008 at 4:52 PM, Pentasis [EMAIL PROTECTED] wrote:
 I hope I am doing this right, I am not used to mailing lists ;-)

 Anyway, following some discussions on the web regarding footnotes/side notes
 I have found that there is a need for some form of element to mark these up.
 The most commonly accepted element at the moment seems to be to use the
 small element. But this is clearly a wrong use of semantics.
 As the mark element has different usages defined on it already why not
 include a type attribute (or similar) that defines what it is used for.
 One of these types would then be footnote, others could be (relating to
 what is already in the spec) term, highlight etc. (I am sure others
 would be much better at thinking up names than I am).
 Esp. in light of the fact that the spec states that UA will probably have to
 provide cross-linking would make this an ideal element for footnotes/side
 notes.

 Bert

Although I agree with the overall idea, I have to mention that the
type attribute itself wouldn't be a good match for this purpose: it
is already used for something different (marking the content type of
stuff like script, img, object, the new audio and video, and
so on, often expressed as a MIME type). In general, I think
overloading an attribute with different meanings (semantics) is not a
good idea (we should leave the input case aside from this
generalization, mostly because it has been using the attribute for
over a decade by now). IMHO, a role attribute would match exactly what
you are asking for, although I sent some feedback about it a while ago
and got no responses (it probably went unnoticed, since there were
several discussions running at the time, and a few of them were
quite heated). Maybe now that you are raising this issue I should try
to bring back the relevant parts of those mails?

OTOH, if a type, role, or similar attribute were added, we should
question the need for the mark element (and many others) at all: what
would it provide that a span with the same type or role doesn't?

Also, I've seen some comments suggesting that class should be used for
these purposes, and not just as a hook for CSS. If the spec is clear
enough about these broader semantics of the class attribute, and UAs
are aware of them, the only practical difference between class and
type/role will be whether the author can come up with any arbitrary
value (class), or has to choose from a pre-defined set (type/role).
I'm not sure which approach would be better for this specific case.
Have **you** considered using class for the purpose you are
suggesting? If you have, and you still feel it's not enough, maybe
explaining *why* would be helpful to figure out what the best solution
would be.

Just my thoughts.


Re: [whatwg] Add 'type' attribute to mark

2008-10-31 Thread Eduard Pascual
On Fri, Oct 31, 2008 at 7:29 PM, Ian Hickson [EMAIL PROTECTED] wrote:
 On Fri, 31 Oct 2008, Pentasis wrote:
[...]
 As the mark element has different usages defined on it already why not
 include a type attribute (or similar) that defines what it is used
 for. One of these types would then be footnote, others could be
 (relating to what is already in the spec) term, highlight etc. (I am
 sure others would be much better at thinking up names than I am).

 That's what the class attribute is for.

What's the difference then between mark and span then? I mean,
does the mark element provide anything that span with an
appropriate class wouldn't?


Re: [whatwg] Select elements and radio button/checkbox groups [Was: Form Control Group Labels]

2008-10-29 Thread Eduard Pascual
On Wed, Oct 29, 2008 at 9:49 AM, Markus Ernst [EMAIL PROTECTED] wrote:
 Consider a form with some quite big radio button groups, and now you have do
 add some more options. After you are done, your boss says: Ok, great
 work... but this looks too ugly now, just change it into those dropdown kind
 of things.
Honestly, this seems like a presentational issue to me. Isn't CSS3's
Basic UI module (http://www.w3.org/TR/css3-ui/) enough to handle that?
Correct me if I'm wrong, but it seems that the properties there would
allow you to present a radio-button group as a dropdown menu, and
vice versa.

 To illustrate this, have a look unordered and ordered lists, which are
 similar, too. Consider ul and ol would have the same kind of different
 syntaxes; say, the ul element would work like we know it, but to make an
 ordered list we would have to write something like:

 p type=orderedlist
  listposition value=list position 1
  listposition value=list position 2
 /p

 Now simply changing an ordered into an unordered list would cause an
 annoying amount of re-writing, such as changing a radio button group into a
 select element does.
Even in that case, CSS3 (and I think even CSS2) would perfectly allow
you to render an originally unordered list as an ordered one (with
different choices of numbering style), and vice versa, without
changing anything in the markup.
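For instance, something along these lines (a sketch using CSS 2.1 counters, which is what makes the ul-rendered-as-ol direction work; class names are made up):

```html
<style>
  /* Render a plain ul with decimal numbering, like an ol: */
  ul.numbered { list-style: none; counter-reset: item; }
  ul.numbered li:before {
    counter-increment: item;
    content: counter(item) ". ";
  }
  /* And the reverse: an ol rendered with plain bullets. */
  ol.bulleted { list-style-type: disc; }
</style>
<ul class="numbered"><li>first</li><li>second</li></ul>
<ol class="bulleted"><li>no numbers here</li></ol>
```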

In summary, if you only need to change the presentation, then it's a
CSS issue (and CSS seems to deal well enough with it); and if you are
really changing the semantics and inherent structure of the document,
then the need to non-trivially adjust the markup is unavoidable: after
all, the semantics and structure is what the markup is actually
defining.


Just my thoughts.


Re: [whatwg] video tag : loop for ever

2008-10-29 Thread Eduard Pascual
On Wed, Oct 29, 2008 at 6:16 PM, Jonas Sicking [EMAIL PROTECTED] wrote:
 Maciej (and I think others) have suggested that it would be useful if it was
 possible to allow audio to be used such that a single file can be
 downloaded that contains multiple sound effects, and then use javascript to
 play different sound effects contained in that file at various times.

 For example someone creating a shoot-em-up game might create a file that
 contains the sound for shoot weapon, enemy exploding, player dying,
 and player finishes level. It can then when appropriate use javascript to
 play any one of these sound effects.
Wouldn't multiple audio elements be better here? They'd point to the
very same file, but to different fragments. That would even make the
script less bloated (just selecting each element, instead of
explicitly getting the appropriate fragment from the "master" file
each time you need it). This brings the additional advantage that, in
the event the server does support file fragments, only the actually
required fragments will be downloaded. And, if the server doesn't
support fragments, there is still no reason why the UA would
download the same whole file more than once for a single page (maybe
there is a need to ensure the UA isn't that silly in such cases,
however).
Furthermore, the idea of having multiple sound effects in a single
file is just a matter of packaging, probably mere convenience, and
multiple audio elements would better reflect the actual semantics:
these really are separate sound effects.
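A sketch of the idea, using the #t=start,end temporal-fragment syntax that was later standardized as Media Fragments URI (the file name and times are made up; browser support for fragments on audio arrived well after this thread):

```html
<!-- Each element addresses a slice of the same packaged file. -->
<audio id="shoot"   src="effects.ogg#t=0,1.5"></audio>
<audio id="explode" src="effects.ogg#t=1.5,4"></audio>
<audio id="die"     src="effects.ogg#t=4,7"></audio>
<script>
  // The game script just picks an element and plays it:
  function playEffect(name) {
    document.getElementById(name).play();
  }
</script>
```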


Re: [whatwg] Web forms 2, input type suggestions (Michael A. Puls II)

2008-10-29 Thread Eduard Pascual
On Thu, Oct 30, 2008 at 1:16 AM, Matthew Paul Thomas [EMAIL PROTECTED] wrote:
 On Oct 29, 2008, at 6:40 PM, Kristof Zelechovski wrote:

 Declare INPUT[type=mailing-list] instead of INPUT[type=emails],
 please. Type=emails is ugly and confusing (as it seems to expect
 messages).
 ...

 emails is indeed ugly, but mailing-list would be even worse. A mailing
 list usually has a single address.
What about multi-email, email-list or email-addresses? The last
one is the one I like most, and the most explicit, with the only
(minor) drawback that it's the longest. Anyway, that's just a
suggestion. BTW, the same argument could be made against type=email,
since someone could easily think it expects an entire message.


Re: [whatwg] video tag : loop for ever

2008-10-29 Thread Eduard Pascual
On Thu, Oct 30, 2008 at 12:52 AM, Jonas Sicking [EMAIL PROTECTED] wrote:
 The whole idea was to make a single HTTP request to the server. Doesn't seem
 like your proposal accomplishes that.
Indeed, it doesn't. The recent messages didn't seem to mention that
requirement either.
Anyway, for the case of multiple queries to the same object (ie: an
identical URI, at least up to the # part), I think there should be
some way to ensure that a single request for that object is made, at
least optionally; but I think this would stray off-topic for this
discussion's subject.


Re: [whatwg] WebForms2 validity

2008-10-28 Thread Eduard Pascual
On Tue, Oct 28, 2008 at 8:09 PM, Ian Hickson [EMAIL PROTECTED] wrote:
 On Fri, 9 Feb 2007, Sean Hogan wrote:

 I might be missing something obvious, but...

 When are ValidityState properties updated? And when are CSS pseudo-classes
 (:valid, :invalid, :in-range, :out-of-range) updated?

 Continually (in particular whenever the constraints or the values change
 -- the validity states are defined in terms of those values).


 Many textual input form-controls would begin in one or another invalid
 state (valueMissing, patternMismatch) however authors would typically
 want CSS validity styles to apply only after checkValidity() - either a
 manual one or the automatic one performed by form-submission.

 Why?

I agree with Sean's idea: at least in their initial state, controls
showing up with invalid styling can be quite confusing to many
users. It may depend a lot on the context, and even more on the user:
although the initial (empty) value for a required text field would be
invalid, and it would even make sense to visually convey that fact,
many users may wonder What did I do wrong, if I didn't do anything
yet?. The best solution I can think of for this would be an
additional pseudo-class, such as :default, :initial-value,
:non-modified, or anything like that (I'm really bad at naming stuff,
so please take those only as *examples*, because that's what they
are), which could be used together depending on the needs of each
site or application, like this:
:valid {
/* code for green highlighting */
}
:invalid {
/* code for red highlighting */
}
:default {
/* overriding code to remove highlighting (or to apply white highlighting) */
}
:default:invalid {
/* code for yellow highlighting */
}
That's just an example. The idea is that an application may need to
convey (through styling the validity pseudo-classes) the meanings you
have put something wrong here and you have to provide something
here as different concepts.

Just my thoughts.


Re: [whatwg] [WebForms2] custom form validation notifications

2008-10-23 Thread Eduard Pascual
These are just my thoughts; however, I feel they are worth sharing:

On Thu, Oct 23, 2008 at 4:40 PM, Ian Hickson [EMAIL PROTECTED] wrote:
 You can call setCustomValidity() to set a specific string.
Joao explicitly asked for a way to achieve this **without scripting
enabled**. I think it's quite obvious why setCustomValidity() doesn't
solve that need.
Would having some sort of custom-error-message attribute hurt that
much? (Of course, the name is just an example; I wouldn't seriously
propose that exact name.) It would simply be ignored by current UAs,
and it would not be really hard to implement (actually, it'd be
trivial compared to implementing reg. exp. parsing).

 If the UA has scripting disabled, trying to prevent the default action
 for an invalid event won't work. Too overcome this problem, there could
 be a new attribute which could be called 'notifyoninvalid=true|false'
 with a default value of true, for each control, or for the entire form.
 If the value is false, then the UA wouldn't notify the user in case of
 invalidity. This could then be delegated to some CSS using :invalid;

 If scripting is disabled, why would you not want the user notified? That
 would be pretty bad UI. :-)
That'd be really useful if validation can be delegated to server-side
scripting when no client-side scripting is available. Anyway, I don't
think such an attribute is needed: a page can be authored with a
catch-all validation rule for the field, and then the Javascript
could update that rule upon the page's loading: if scripts are
disabled, the rule wouldn't be updated and would stay as the
catch-all.
OTOH, I think Joao's idea was more about relying on visual hints
(ie: marking the field as red) in cases where an error message popup
would be redundant and annoying. I think that could be more elegantly
handled with an empty attribute value for a hypothetical
custom-error-message attribute (which is not the same as an absent
attribute).


Re: [whatwg] fixing the authentication problem

2008-10-21 Thread Eduard Pascual
On Tue, Oct 21, 2008 at 3:48 PM, Aaron Swartz [EMAIL PROTECTED] wrote:
 There are three costs to SSL:

 1. Purchasing a signed cert.
 2. Configuring the web server.
 3. The CPU time necessary to do the encryption.

 1 could be fixed by less paranoid UAs, 2 could be fixed with better
 software and SNI, and 3 could be fixed by better hardware. But,
 realistically, I don't see any of these things happening.
There is a difference between something having a cost, and that cost
being expensive:
(1) is definitely expensive (I know that first-hand), and most
probably out of reach for any non-revenue website.
(2) is not expensive: currently, many server management software
already handles this decently (I'm right now thinking of CPanel, one
of the most widely deployed utilities of this type, and it allows
installing a certificate with just a few clicks).
(3) Your suggestion is not addressing that point: encryption will
still be done by the client, and decryption by the server.

In addition, for the first cost; I'm still convinced that UAs should
be fixed, because their paranoid behavior is generally wrong. I don't
think this spec should deal with browsers' bugs and paranoias on
aspects that are not strictly HTML-related; even less specify
workarounds to these bugs that require browsers to duplicate the tasks
that are currently showing these bugs. What makes you think browsers
would be less paranoid toward your approach than toward self-signed
certificates? OTOH, changing the messages shown to the user when
self-signed certificates are encountered to be more informative and
less misleading should be far easier than adding a new hook to
trigger encryption (the former only requires reviewing and updating
some texts to something that makes sense, while the latter involves
changes to the way forms are handled, which would require additional
testing and might even give rise to new bugs). That's, however, only
my point of view.


Re: [whatwg] fixing the authentication problem

2008-10-21 Thread Eduard Pascual
On Tue, Oct 21, 2008 at 4:35 PM, Kristof Zelechovski
[EMAIL PROTECTED] wrote:
 Sending any data, including, log-in data, through an unencrypted connection
 is greeted by a warning dialogue box in Internet Explorer.
Only the first time. IIRC, the don't display this again checkbox is
checked by default.

 A similar precaution is taken when the server certificate is not trusted.
Not similar at all: for unencrypted connections, you have the don't
bother me again option, in the form of an obvious checkbox; while
with self-signed certificates you are warned continuously, with the
only option being to install the certificate on your system to trust
it (which is a non-trivial task; out of reach for most average users;
still annoying even for web professionals; and, to top it off, you
need to do it on a site-by-site basis).
It doesn't make any sense for UAs to treat unencrypted connections as
safer than (some) encrypted ones: that's simply wrong.

 The risk of using an invalid certificate is bigger than not using any because
 your level of trust is bigger while you are equally unprotected.
That's, simply put, not true. The level of trust doesn't actually
depend (for average users) on the certificate at all, but on what the
browser says about it.
The level of protection, instead, is independent from the user, and
it's not the same for each case:
On an unencrypted connection, everyone could read the data being sent.
This is no protection at all.
On a connection encrypted with a self-signed certificate, the user can
rest assured that the data is only readable by the server, regardless
of who is actually behind that server. There is some protection here,
even if it is not the maximum possible.
On an encrypted connection with a CA-signed cert, the user has the
protection from encryption (only the server will be able to read the
data), plus the guarantee that the CA has taken care to verify that
the entity in charge of that server is who it claims to be.

 It is not enough to make sure that your credentials do not unintentionally
 leave example.com.
 Consider the following scenario:
 1. You want to update your blog at blog.com
 2. Evil.org poses as blog.com by phishing or DNS poisoning.
 3. You log in to evil.org using your credentials of blog.com.
 4. The bad guys at evil.org use your credentials to post an entry at
 blog.com that you are going to deploy a dirty bomb in NYC.
 5. You travel to the USA and you end up in Guantanamo.
 Nice, eh?
Although I'm not sure what you mean by Evil.org poses as
blog.com, I see no point in Aaron's original suggestion that would
deal with such a case.

In summary, besides UAs' paranoia, I can't see any case where the
suggested feature would provide anything self-signed certificates
don't already provide. And since it involves using public-key
encryption, I don't see any reason why UAs would treat the encryption
keys differently from current SSL certificates.

On Tue, Oct 21, 2008 at 6:08 PM, Andy Lyttle [EMAIL PROTECTED] wrote:
 4. The need for a dedicated IP address, instead of using name-based virtual
 hosts.

 That and #1 are the reasons I don't use it more.
#4 is, again, a cost, but not an expensive one: most of the hosts I
know of offer a dedicated IP for a fee that's just a fraction of the
actual hosting price.
And, about #1, I'd just point to my remarks about self-signed
certificates in this and my previous mail.


Re: [whatwg] fixing the authentication problem

2008-10-21 Thread Eduard Pascual
On Wed, Oct 22, 2008 at 1:28 AM, WeBMartians [EMAIL PROTECTED] wrote:
 Somewhere, is there a definition of trust in this context? I say that in 
 all seriousness; it's not a facetious remark. I feel that
 it might be useful.
I can't speak for others, but just for myself: the way I understand
the term trust (in contrast with security or protection), and
what I meant by it in my previous message, is as a measure of how
confident a user would feel about providing (generally sensitive) data
to a website. Ie: a user that absolutely trusts a site won't hesitate
to provide any kind of data to it; while a user who doesn't trust the
site at all won't knowingly provide any data at all (of course,
s/he'll still be providing a request HTTP header and similar
details, but that's most probably not known by the user; otherwise the
user wouldn't even visit the site). Of course, there is a full range
of grays between these extremes.


Re: [whatwg] Ghosts from the past and the semantic Web

2008-08-28 Thread Eduard Pascual
I think some of you got my point rather better than others; so maybe
I should clarify the idea. I see no issue with having some attributes to
embed semantics inline within the HTML, the same way we have style to
embed presentation. The issue is about *forcing* these semantics,
which are not the structure of the document, into the HTML, which is
intended to represent structure.
Although I tried to simplify using only the CSS parallelism (also,
presentation is what ruined HTML3.2, while behaviour just stood
there), Toby has seen deeper than the example and got the exact point
;-) (although I don't entirely agree with the analogy).

Following on with the parallelism:

style=...: inline styles, quite related to things like
onclick=javascript:goBack(); (inline script statements); this would
be equivalent to the current property= and about=. There is no
issue with them, just that I feel they are not enough. This solves
some cases, but not all.

class=... when used explicitly to tie with a CSS .class selector;
relates to the usages of onclick=javascript:myFunction(); (rather
than doing all the work there, it hooks with a function defined
somewhere else); and there is currently no equivalent for semantics.

style and script are used to define document-wide styles and
behaviors. Once again, we lack something to achieve this for
semantics. Introducing a metadata element as suggested could be a
solution (I'd rather prefer semantics, but that's mostly a matter of
taste), but if somebody has any better idea I'd be glad to hear it.

link rel=stylesheet and script src=... allow importing
stylesheets and scripts that might be shared by several documents. I
guess link could be used to import semantics as well.


On the copy-pasting issue mentioned above, I have to disagree: copying
CSS'd contents from a webpage normally preserves the formatting on
most browsers, so I can't see why other kinds of metadata could be an
issue.

Before finishing, I have come up with a use-case that might help to
illustrate my point. I (hypothetically, because my site is still under
construction) have several projects listed on my website. It'd be a
good idea, on each project's own page, to have embedded metadata, such
as Title, Author, License, and more specific stuff such as target
platform, intended audience, programming language, version number,
release date, and what-not.
Until that point, embedded RDF information does the job quite well.
But I also have a page listing all the projects, with some details
about them. Repeating that markup 20 or 50 times will start bloating
the code quite a bit, and it would be extremely redundant. Ideally, I
would like to be able to define some kind of pattern (be it an XPath
expression, a CSS-like selector, or any other way) to represent, for
example, that the first entry of each project is the title, the
second is the version, then the date, license, and so on. The current
approach for RDF in HTML fails to handle this without extremely
annoying redundancy.

Regards,
Eduard Pascual


[whatwg] Ghosts from the past and the semantic Web (somewhat related to the RDFa discussions)

2008-08-27 Thread Eduard Pascual
This message is quite related to the whole RDFa discussion going on,
but not to any specific message, so it would be confusing to reply
directly to one of such messages.

First of all, HTML is about structure. I want to make this clear
enough from the beginning, because trying to broaden the scope of the
language would only make it incapable of representing structure, and
equally incapable of representing whatever else it is asked to
represent. As long as HTML is kept as a structuring language, HTML
will be good at structuring.
Semantics, although it may be closely related to structure, is not
structure. Presentation is also quite related to structure, after
all, and it was once thought that it would make sense to integrate it
into the language. But then we saw the consequences (HTML 3.2), and
it became clear that presentation had to go out of the language. Let
us not make the same mistakes again.

Don't get me wrong, there is a need for semantics in the Web. Things
like Yahoo OpenSearch, Google Answers, the size of the Microformats
community, and the fact that comments in HTML have been used to
express some semantics not supported by other tools (the Creative
Commons' old approach), are all proof that we need, indeed, a
mechanism to deal with the semantics of webpages.
We have, however, some experience from the past: when the need for
control of presentation arose, some ways to deal with it were
considered: presentational markup, CSS, and, later, XSL.
Presentational markup had serious issues: it stripped HTML of its
structural nature; and it didn't handle the task well enough.
CSS seems to have worked nicely: it moves the presentation away from
the markup (whether in external files or embedded into an isolated
style element), it uses a relatively simple syntax, and then there
are some hooks to relate each part of the markup with its
corresponding presentation information.
XSL, while made for XML rather than HTML, is an example of a tool for
the similar task (styling and presentation), but using it for HTML
would be overkill.

I would like to encourage this community to learn from what has
already been done in the past, check what worked, and see why it
worked; then apply it to the problem at hand. If for presentation CSS
worked (and I really think it did; if somebody disagrees I invite you
to share your opinion), then let's see what made it work:
First of all, and essentially, CSS was independent of HTML, although
they were to be used together. I hope it is already clear by now that
we need to deal with semantics from outside of HTML. RDF is an example
of a mechanism that is independent of HTML.
Next, CSS had a simple syntax, despite the size of its vocabulary:
once you understand the selector { property: value; }, you
understand most of CSS syntax. RDF's XML format is quite verbose
and is not a good example of a simple syntax. But RDFa comes to the
rescue, providing an approach to simplify the syntax.
Last, but not least, CSS was usable with HTML because there were
hooks between the two: the selector's semantics are based on HTML's
structure (and, by extension, that of any other markup language). CSS
was, indeed, intended to represent the presentation of markup
documents.
RDFa provides some hooks; but there is a gotcha: RDFa is not intended
to represent the semantics of a web document, but to embed those
semantics within the document. RDF just represents (semantic)
relationships between concepts; and RDFa puts that representation
inside the document.

Compared to presentational markup, RDFa is just about adding two or
three attributes, versus the bunch of new presentational elements
HTML 3.2 added, so it might work; but I don't think it is a good idea
to intermix the semantics inside the HTML pages.
On one of the arguments about keeping the semantics within the
content, I'd say that the example
<span about="#jane" instanceof="foaf:Person" property="foaf:name">Jane</span>
<span about="#jane" property="foaf:loves" resource="#mac">hates</span>
<span about="#mac" instanceof="foaf:Person" property="foaf:name">Mac</span>
would be as silly as having something like
<span style="color: #FF00FF">This text is green.</span>.
It is not the task of a tool or language to be fool-proof: it is the
task of the user not to be a fool. The same way someone tests pages
in browsers to check that they are shown as expected, they should
also be tested with the appropriate tools (any kind of
semantics-aware UA) to ensure that they convey the expected
semantics; and this applies regardless of where the semantic
information is stored (ie: embedded in the document vs an external
referenced resource).

In summary, I think RDFa might work, and it wouldn't be a too bad
solution, but I don't think it is the best approach either.

Regards,
Eduard Pascual
Software and Web developer.