Re: [whatwg] Trying to work out the problems solved by RDFa

2009-02-03 Thread Calogero Alex Baldacchino

Benjamin Hawkes-Lewis wrote:

On 12/1/09 20:26, Calogero Alex Baldacchino wrote:

I just mean that, as far as I know, there is no official standard
requiring UAs to support (parse and expose through the DOM) attributes
and elements which are not part of the HTML language but are found in
text/html documents.


Perhaps, but then prior to HTML5, much of what practical user agents 
must do with HTML has not been required by any official standard. ;)


RFC 2854 does say that "Due to the long and distributed development of 
HTML, current practice on the Internet includes a wide variety of HTML 
variants. Implementors of text/html interpreters must be prepared to 
be 'bug-compatible' with popular browsers in order to work with many 
HTML documents available the Internet."


http://tools.ietf.org/html/rfc2854

HTML 4.01 does recommend that "[i]f a user agent encounters an element 
it does not recognize, it should try to render the element's content" 
and "[i]f a user agent encounters an attribute it does not recognize, 
it should ignore the entire attribute specification (i.e., the 
attribute and its value)".


http://www.w3.org/TR/html401/appendix/notes.html#h-B.3.2

Clearly these suggestions are incompatible with respect to attributes; 
AFAIK all popular UAs insert unrecognized attributes into the DOM and 
plenty of web content depends on that behaviour.




Very, very true. HTML 4.01 also says the recommended behaviours are meant 
"to facilitate experimentation and interoperability between 
implementations of various versions of HTML", whereas the specification 
"does not define how conforming user agents handle general error 
conditions, including how user agents behave when they encounter 
elements, attributes, attribute values, or entities not specified in 
this document", and "since user agents may vary in how they handle error 
conditions, authors and users must not rely on specific error recovery 
behavior". I just think the last sentence defines a best practice 
everyone should follow, instead of relying on a common quirk that supports 
invalid markup. However, whether something is good or bad practice, 
there will always be authors doing whatever they please, so it is 
quite safe to assume UAs will always expose invalid/unrecognized 
attributes (that's unavoidable, given the need for backward compatibility).
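
To make the point about unrecognized attributes concrete, here is a minimal sketch using Python's standard html.parser module as a stand-in for a tolerant text/html parser. It is not a browser, and the KNOWN_ATTRIBUTES whitelist is a small illustrative assumption, not the HTML 4.01 attribute list. The sketch only shows that a tolerant parser hands unknown attributes to the caller instead of discarding them, which is what scripts in deployed pages rely on.

```python
from html.parser import HTMLParser

# Illustrative assumption: a tiny stand-in for "attributes the language
# defines". A real list would be much longer and element-specific.
KNOWN_ATTRIBUTES = {"class", "id", "title", "lang", "href"}


class AttributeCollector(HTMLParser):
    """Records every attribute that is not in KNOWN_ATTRIBUTES."""

    def __init__(self):
        super().__init__()
        self.unrecognized = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        for name, value in attrs:
            if name not in KNOWN_ATTRIBUTES:
                self.unrecognized.append((tag, name, value))


def unrecognized_attributes(markup):
    collector = AttributeCollector()
    collector.feed(markup)
    collector.close()
    return collector.unrecognized


sample = '<p class="note" property="dc:title" data-rating="5">Hello</p>'
print(unrecognized_attributes(sample))
# prints [('p', 'property', 'dc:title'), ('p', 'data-rating', '5')]
```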




Just like proprietary elements/attributes introduced with user agent 
behaviours (marquee, autocomplete, canvas), scripted uses of data-* 
might suggest new features to be added to HTML, which would then 
become requirements for UAs.


But unlike proprietary elements/attributes introduced with user agent 
behaviors, scripted uses of data-* do not impose new processing 
requirements on UAs.


Therefore, unlike proprietary elements/attributes introduced with user 
agent behaviors, scripted uses of data-* impose _no_ design 
constraints on new features.


Establishing user agent behaviours with data-* attributes, on the 
other hand, imposes almost as many design constraints as establishing 
them with proprietary elements and attributes. (There's just less 
pollution of the primary HTML namespace.)


If no RDFa was in deployment, you could argue it would be less wrong 
(from this perspective) to abuse data-* than introduce new attributes.


Oh, well, I don't want to argue about that. For me the idea of using 
data-rdfa-* can rest in peace, since in practice it is no different 
from using the RDFa attributes as they are, at least as far as they're 
handled by scripts, either client- or server-side. However, I think that:


* at present it is not clear enough what UAs not involved in a 
particular project should do with RDFa attributes, besides exposing their 
content for script processing. A precise behaviour should be defined, 
any class of UAs not required to support it should be clearly 
identified, and caveats on possible problems and their solutions should 
be given, before introducing any new elements/attributes in a formal 
specification;


* actual deployment might be harmed by the use of XML namespaces in the 
HTML serialization.


Also, I see design suggestions more than impositions. If a new (and 
proprietary/private) attribute/element/convention is convincingly 
useful/needed, it gets supported by other UAs and introduced in a 
specification; otherwise, if not too many pages would be broken, it 
might even be redefined with different semantics. A possible process 
involving data-* attributes would/could be: experiment privately, then 
extend the scale by involving other people who find it useful for their 
needs, then get it into the primary namespace of an official 
specification (dropping the data- part and any other useless parts of 
the experimental name), so that existing pages may still work with 
their custom scripts, or easily migrate to the new standard (and 
benefit from the new default support) by running a 
simple regex
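
The migration step mentioned in the paragraph above (dropping an experimental data- prefix once an attribute is standardized) could look roughly like the following sketch. The data-rdfa- prefix is the hypothetical experimental name discussed in this thread, and the function name is made up for illustration. A plain regex cannot tell markup from text content, so text that happens to contain something like " data-rdfa-x=" would be rewritten too; a real migration would probably want a parser.

```python
import re

# Matches an attribute name with the hypothetical experimental prefix
# "data-rdfa-" when preceded by whitespace and followed by "=", which is
# how it appears inside a start tag, e.g.  data-rdfa-property="...".
PREFIXED_ATTRIBUTE = re.compile(r'(?<=\s)data-rdfa-([a-z][a-z0-9-]*)(?=\s*=)')


def drop_experimental_prefix(markup):
    """Rename data-rdfa-foo="..." to foo="..." throughout the markup."""
    return PREFIXED_ATTRIBUTE.sub(r'\1', markup)


before = '<span data-rdfa-property="dc:title" class="x">Title</span>'
print(drop_experimental_prefix(before))
# prints <span property="dc:title" class="x">Title</span>
```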

Re: [whatwg] Trying to work out the problems solved by RDFa

2009-02-03 Thread Calogero Alex Baldacchino

Toby A Inkster wrote:


Another reason the Microformat experience suggests new attributes are 
needed for semantics is the overloading of an attribute (class) 
previously mainly used for private convention so that it is now used 
for public consumption.


Maybe this is true, but, personally, I prefer this approach to the 
addition of new features/attributes/elements to an official 
specification without a clear support requirement for UAs besides just 
parsing. A similar (if not stronger) argument may be raised against the 
reuse of the content attribute in the context of RDFa, which I think 
significantly changes its original semantics: now it would be shared by 
every element, whereas originally it was a meta-specific attribute; now 
it would be part of an RDF _triple_, whereas originally it was (and 
still is) part of a _pair_ when used in conjunction with the name 
attribute, and constitutes a pragma directive in conjunction with the 
http-equiv attribute, which is somehow closer to an XML processing 
instruction than to an RDF triple (the same applies to a link with 
rel=stylesheet, for instance).


Yes, in real life, there are pages that use class=vcard for things 
other than encoding hCard. (They mostly use it for linking to VCF 
files.) Incredibly, I've even come across pages that use class=vcard 
for non-hCard uses, *and* hCard - yes, on the same page! As the 
Microformat/POSHformat space becomes more crowded, accidental 
collisions in class names become ever more likely.




Indeed, that's a possible source of trouble. I think the same would 
happen if people misused prefixes, e.g. if, after merging some content 
from different documents, they got a different namespace bound to a 
previously declared prefix in a scope where both namespaces are involved 
(in an XHTML document). Also, a custom script may distinguish between 
different uses of vcard by means of a further, private class name, 
or by wrapping elements in containers (divs) with proper ids, which 
may be a good solution in some cases and not in others; a more 
generic parser, being specialized by design, has a chance to recognize 
the correct structure for a given format and to discard wrong 
information, which may work fine in some cases, but not in others. As 
always, each choice has its own downsides, and what counts is the 
cost/benefit ratio; it seems that any solution that does not require UA 
support has the lowest cost for UA implementors.
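
As a rough illustration of a parser "recognizing a correct structure", the sketch below separates class=vcard elements that contain a descendant with class fn (hCard requires an fn property) from those that do not, such as a plain link to a VCF file. It uses Python's standard html.parser; the class and function names are made up for this example, and the single fn check is a deliberate simplification, not a full hCard validator.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not stay on the stack.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "param", "source", "track", "wbr"}


class VcardClassifier(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []    # (tag, record or None) for each open element
        self.records = []  # [tag, has_fn] per class=vcard element, in document order

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "fn" in classes:
            # Mark every enclosing class=vcard element as having an fn.
            for _, record in self.stack:
                if record is not None:
                    record[1] = True
        if tag in VOID_ELEMENTS:
            return
        record = None
        if "vcard" in classes:
            record = [tag, False]
            self.records.append(record)
        self.stack.append((tag, record))

    def handle_endtag(self, tag):
        # Ignore stray end tags; otherwise close everything up to the match.
        if tag not in [name for name, _ in self.stack]:
            return
        while self.stack.pop()[0] != tag:
            pass


def classify_vcards(markup):
    """Return (tag, looks_like_hcard) for each class=vcard element."""
    parser = VcardClassifier()
    parser.feed(markup)
    parser.close()
    return [(tag, has_fn) for tag, has_fn in parser.records]


page = ('<a class="vcard" href="alice.vcf">Download</a>'
        '<div class="vcard"><span class="fn">Alice Example</span></div>')
print(classify_vcards(page))
# prints [('a', False), ('div', True)]
```

A check like this works when the colliding use has a different shape; it would not help if an unrelated convention also happened to nest an fn class, which is the "fine in some cases, but not in others" caveat above.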


I do not doubt that XML extensibility (which effectively is the basis of 
CURIEs) has its own benefits: it's flexible and suitable for quick 
development of custom solutions. But it also has its own downsides, 
such as possibly leading to heavy fragmentation and being difficult to 
understand and use for many people (who are usually confused by the 
concept of namespaces), thus potentially causing misuse and errors. 
It doesn't seem that XML extensibility has brought more benefits than 
costs, and evidence of this may lie in the majority of the web not 
having followed the envisioned XML-like evolution.


Anyway, I'm not strongly against RDFa in HTML; rather, I can be quite 
neutral (I could live with it). I'm just not convinced it is worth 
adding it to the spec at this stage, until it is possible to establish 
what UAs must do with the new attributes besides parsing them (and how 
to deal with namespaces while parsing). Also, I'm not fully convinced of 
the need to embed metadata in a page and keep them in sync with that 
page. For instance, it requires every page reporting the same 
information to duplicate the same metadata structure, and this doesn't 
guarantee that the information, in the first place, is in sync with the 
real world (some pages might be out of date, others up to date). 
Instead, a separate file containing metadata, to be linked when 
appropriate, might solve both problems: it doesn't require duplicates, 
and it can have some sort of versioning to keep track of changes and to 
present updated machine-friendly information to help users visiting an 
outdated page (assuming users can trust those metadata). Of course, 
this solution has its own downsides too.
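
One way a consumer could find such a separate metadata file is through a link element in the page head. The sketch below assumes the convention of marking an RDF/XML file with rel="alternate" and type="application/rdf+xml"; the URLs and the function name are placeholders invented for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

RDF_XML = "application/rdf+xml"


class MetadataLinkFinder(HTMLParser):
    """Collects absolute URLs of linked RDF/XML metadata files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.metadata_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").lower().split()
        media_type = (attributes.get("type") or "").strip().lower()
        href = attributes.get("href")
        if "alternate" in rel_tokens and media_type == RDF_XML and href:
            # Resolve relative references against the page URL.
            self.metadata_urls.append(urljoin(self.base_url, href))


def find_metadata_links(markup, base_url):
    finder = MetadataLinkFinder(base_url)
    finder.feed(markup)
    finder.close()
    return finder.metadata_urls


head = ('<link rel="stylesheet" href="style.css">'
        '<link rel="alternate" type="application/rdf+xml" href="meta/page.rdf">')
print(find_metadata_links(head, "http://example.org/docs/page.html"))
# prints ['http://example.org/docs/meta/page.rdf']
```

Several pages reporting the same information could point at the same file this way, instead of each embedding its own copy.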


WBR, Alex




Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

2009-02-03 Thread Calogero Alex Baldacchino

Shelley Powers wrote:



The point I'm making is that you set a precedent, and a good one I 
think: giving precedence to not invented here. In other words, to 
not re-invent new ways of doing something, but to look for established 
processes, models, et al already in place, implemented, vetted, etc, 
that solve specific problems. Now that you have accepted a use case, 
Martin's, and we've established that RDFa solves the problem 
associated with the use case, the issue then becomes *is there another 
data model already as vetted, documented, implemented that would 
better solve the problem*.




RDF in a separate XML-syntax file, perhaps. That is because the use case 
raised a privacy concern about information that must be kept private 
anyway, and that's not a problem solvable at the document level with 
metadata; instead, keeping the relevant metadata in a separate file 
would allow better access control. Also, a separate file would have the 
relevant information ready for use, while embedding it with other 
content would force loading and parsing that other content in search of 
the relevant metadata (possible, of course, and not much of a problem, 
but not as clean and efficient).


Moreover, it should be verified whether social-network service providers 
agree with such a requirement: I might take advantage of a compliant 
implementation to easily migrate from one service to another and leave 
the former, in which case why should a company open up its internal 
infrastructure and database, and invest resources, for the benefit of a 
competitor accessing its data and consuming its bandwidth to capture its 
customers? (This is not the same as the interoperability issue for mail 
clients supporting different address book formats: minor vendors had to 
do that to improve their business, and they didn't need to access a 
competitor's infrastructure.)


Perhaps that might work if personal information and relationships were 
handled by an external service along the lines of an OpenID service, 
allowing automated identification by other services; but this would 
reduce social networks to a kind of front end for such centralized 
management (and service providers might not like that). Also, in this 
case anonymity should be ensured (for instance, I might have met you in 
two different networks, but learned your identity in only one of them, 
and you might wish that no one knew you're the person behind the other 
nickname; this is possible when keeping different information in 
different databases with different access rights, and should be 
preserved when merging such information. On the other hand, if you knew 
my identity, you should be allowed to fill in the blanks somehow).


Shelley Powers wrote:

Anne van Kesteren wrote:
On Sun, 18 Jan 2009 17:15:34 +0100, Shelley Powers 
shell...@burningbird.net wrote:
And regardless of the fact that I jumped to conclusions about WhatWG 
membership, I do not believe I was inaccurate with the earlier part 
of this email. Sam started a new thread in the discussion about the 
issues of namespace and how, perhaps we could find a way to work the 
issues through with RDFa. My god, I use RDFa in my pages, and they 
load fine with any browser, including IE. I have to believe its 
incorporation into HTML5 is not the daunting effort that others make 
it seem to be.'


You ask us to take you seriously and consider your feedback, it would 
be nice if you took what e.g. Henri wrote seriously as well. 
Integrating a new feature in HTML is not a simple task, even if the 
new feature loads and renders fine in Internet Explorer.



Take you guys seriously...OK, yeah.

I don't doubt that the work will be challenging, or problematical. I'm 
not denying Henri's claim. And I didn't claim to be the one who would 
necessarily come up with the solutions, either, but that I would help 
in those instances that I could. 


It seems that you'd expect RDFa to be specced out before solving the 
related problems (so as to push for their solution). I don't think 
that's the right path to follow. Instead, known issues must be solved 
before making a decision, so that the specification can tell developers 
exactly what they must implement; implementation may then point out 
newer (hopefully minor) issues to be solved by refining the spec (which 
is a different task from specifying something known to be, let's say, 
buggy or uncertain).



Everything, as always, IMHO

WBR, Alex





Re: [whatwg] embedding meta data for copy/paste usages - possible use case for RDF-in-HTML?

2009-02-03 Thread Calogero Alex Baldacchino

Hallvord R M Steen wrote:
  

HTML5 already contains elements that can be used to help obtain this
information, such as the title, article and its associated heading h1
to h6 and time.  Obtaining author names might be a little more
difficult, though perhaps hCard might help.



Indeed. And it's not an either-or counter-suggestion to my proposal,
UAs could fall back to extracting such data if more structured meta
data is not available.

  


I think that's a counter-suggestion, instead. If UAs can gather enough 
information from existing markup, they don't need to support further 
metadata processing; if authors can put enough information in a page 
with existing markup (or markup being introduced in the current 
specification), they don't need to learn and use additional metadata to 
repeat the same information. It seems that, in this case, any additional 
metadata-related markup would add complexity to UAs (requiring support) 
without adding advantages over existing solutions.


Therefore, the question moves to the format used to move such 
information to the clipboard, which is a different concern from 
embedding metadata in a page. Also, different use cases should lead to 
different formats (with different kinds of information kept apart in 
different clipboard entries, or bound in a sort of multipart envelope 
serialized in just one entry). A generic format addressing a lot of use 
cases could seem overengineered to developers dealing with a specific 
use case, so a specific format could gain support in other applications 
more easily. Third-party developers could find it easier and more 
consistent to get access to the right information in the right format, 
either by looking for a specific entry (if supported by the OS), or by 
parsing a few headers in a multipart entry, looking for an offset 
associated with a MIME type. The latter would work without requiring 
support from the OS, though an OS could provide facilities to access the 
proper section directly anyway. However, any support for multiple kinds 
of information should be in scope for the OS clipboard API and/or the 
UA, not for a specific application requiring specific data; and, given 
the above, that should not be in scope for HTML5.
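
The "multipart envelope" idea above (a few headers, each giving a MIME type and an offset, followed by the payloads) can be sketched as follows. The wire layout is entirely made up for illustration: one header line per part in the form "type offset length", a blank line, then the concatenated payloads. It also assumes MIME types without parameters, so that a type never contains a space.

```python
def pack_envelope(parts):
    """Serialize (mime_type, payload_bytes) pairs into one clipboard entry."""
    header_lines = []
    body = b""
    for mime_type, payload in parts:
        # Offsets are relative to the start of the body, not of the entry.
        header_lines.append(f"{mime_type} {len(body)} {len(payload)}")
        body += payload
    header = "\n".join(header_lines).encode("ascii") + b"\n\n"
    return header + body


def extract_part(envelope, wanted_type):
    """Return the payload stored under wanted_type, or None if absent."""
    # The first blank line ends the header; the body may contain anything.
    header, _, body = envelope.partition(b"\n\n")
    for line in header.decode("ascii").splitlines():
        mime_type, offset, length = line.split(" ")
        if mime_type == wanted_type:
            start = int(offset)
            return body[start:start + int(length)]
    return None


entry = pack_envelope([
    ("text/plain", b"Alice Example"),
    ("text/x-vcard", b"BEGIN:VCARD\nFN:Alice Example\nEND:VCARD"),
])
print(extract_part(entry, "text/plain"))
# prints b'Alice Example'
```

An address book would ask only for the contact part and a text editor only for text/plain, without either needing to understand the other's format.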



If I copy the name of one of my Facebook friends and paste it into
my OS address book, it would be cool if the contact information was
imported automatically. Or maybe I pasted it in my webmail's address
book feature, and the same import operation happened..
  

I believe this problem is adequately addressed by the hCard microformat and
various browser extensions that are available for some browsers, like
Firefox.  The solution doesn't need to involve a copy and paste operation.
 It just needs a way to select contact info on the page and export it to an
address book.



This is way more complicated for most users. Your last sentence IMO is
not an appropriate way to use the word just, seeing that you need to
find and invoke an export command, handle files, find and invoke an
import command and clear out the duplicated entries.. This is
impossible for several users I can think of, and even for techies like
us doing so repeatedly will eventually be a chore (even if we CAN, it
doesn't mean that's the way we SHOULD be working).
  


It can be improved, but it's the _best_ way to do that, and it should be 
replicated in the copy-and-paste architecture you're proposing. 
Please consider that a basic usability principle says users should be 
able to understand what's going on based on previous experience (that 
is, an interface has to be predictable). Users aren't used to copying 
and pasting anything other than text (in general), so a UA should 
distinguish between a bare copy option and more specific actions (such 
as "copy as quotation", "copy contact info", and so on), with related 
paste options (as needed), so that users can understand and choose what 
they want to do.


On the other hand, the same should happen in a recipient application, 
especially if it supports different kinds of information. If either a 
UA or a recipient application (or both) provided only a simple copy and 
a simple paste option (or fewer options than the kinds supported, based 
on metadata or common markup), it could be confusing for users. Nor 
should applications use metadata to choose what to do, because the user 
might just want to copy and paste some text (or do something else; but 
the user knows what, and so must be free to choose).


That is, what you're proposing is mainly addressed by moving the 
import/export features into a context menu and making them work on a 
selection of text (not by eliminating them and substituting a simpler 
copy-paste architecture), then requiring support from other 
applications and eventually from the operating system, which is 
definitely out of scope for any web-related standard (we can constrain 
web-related applications to improve their interoperability with respect 

Re: [whatwg] rename CanvasPixelArray

2009-01-23 Thread Calogero Alex Baldacchino

Anne van Kesteren wrote:
Wouldn't it make more sense to give this a more generic name, just 
like the object it is associated with? That way we can later reuse it 
for img elements and the like (if we want) without it having to look 
silly and poorly thought out like the rest of the platform. :-P (E.g. 
ImagePixelArray.)





Other elements would need a rendering context anyway, so the same 
reasoning could be applied to CanvasRenderingContext2D, CanvasGradient 
and CanvasPattern. Perhaps the best way to generalize their names is to 
remove the Canvas part (or change it into Graphics or the like; both 
PixelArray and ImagePixelArray could be fine). However, "canvas" might 
be generic enough, when thinking of a graphics context, to refer to any 
object allowing custom rendering, without any confusion or poor 
association (like saying that a canvas is the main element for 
custom/dynamic rendering and that other elements might use the Canvas 
framework to provide similar capabilities). Personally, I'm fine with 
both choices.


WBR, Alex



Re: [whatwg] Spellchecking mark III

2009-01-22 Thread Calogero Alex Baldacchino

Peter Kasting wrote:
On Wed, Jan 21, 2009 at 7:38 PM, Calogero Alex Baldacchino 
alex.baldacch...@email.it wrote:


Why not let the user choose the language, as happens in word
processors? A UA can't accurately tell whether, for instance,
color is correct American English, wrong British English, or
even a correct (truncated) Italian word, while a human can do it
better. Thus a UA could provide an interface to change the
spellchecking language for a selection, or even for each misspelled
word, starting from a hint language, which could be the value of
an element's lang attribute (besides a default value and a
user-preference forced one, the latter bypassing any authored
value). Also, using the lang attribute value as the starting
language to check (if not in conflict with a user preference)
would allow an interactive interface with a script changing that
value according to a user's choice (UAs could also expose a list
of supported languages).


I'm not sure I fully grasped everything here, but what I did grasp 
sounds very much like a cross between what Chromium is doing today and 
what we want to do in the future (I imagine similar things are true 
for other browser vendors).  User specification and page hints are 
both useful tools for a UA.


But I still claim that all of those aspects are outside the scope of 
the spellcheck attribute, and fall into the realm of things that 
should not be in the HTML5 spec as they're very much UA-specific 
behavior.


PK


Probably. However, establishing that the lang attribute gives the 
first-choice language to check would allow a webapp to emulate those 
functionalities to some extent, just by setting a different value for 
the lang attribute of a contenteditable box and some of its subregions 
through a script, at the user's whim. (This wouldn't prevent the UA from 
providing other choices, or from ignoring such behaviour due to a user 
preference, or from using other dictionaries too; that might be 
suggested in a note on usability, I guess.) That is, let's do it 
through script until UAs provide a better solution, which could be 
suggested by scripting hacks based on the lang and spellcheck 
attributes working together at the same granularity.
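
The precedence being suggested (a forced user preference first, then the nearest authored lang value, then a default) can be written down in a few lines. This is only a sketch of the proposal in this thread, not of any UA's actual behaviour; the function name and the "en" default are assumptions, and an empty lang value is treated as "not specified" for simplicity.

```python
def resolve_check_language(element_langs, user_forced=None, default="en"):
    """Pick the starting spellcheck language for a piece of editable text.

    element_langs lists the lang attribute values from the innermost
    element outwards; None or "" stands for an element without a usable
    lang value.
    """
    # A forced user preference bypasses any authored value.
    if user_forced:
        return user_forced
    # Otherwise the nearest element that declares a language wins.
    for lang in element_langs:
        if lang:
            return lang
    # Nothing authored and nothing forced: fall back to the default.
    return default


# A span without lang, inside a paragraph marked "it", inside an "en-GB" box.
print(resolve_check_language([None, "it", "en-GB"]))
# prints it
```

A script in a web-based editor would then only have to change the lang value on a subregion to move the starting point, as described above.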


I think that control over the language to check can improve 
spellchecking at the same granularity as the spellcheck attribute, and 
it can't harm end users more than a wrong assumption about 
spellchecking can. A user would notice wrong checking that doesn't 
match the language in use, and could disable it or do whatever else a 
UA allows (annoying though that is); on the other hand, a user might 
not notice that spellchecking is disabled in a certain area, and so 
might not be aware of errors, unless the UA somehow signalled that 
spellchecking was turned off. Therefore, special care by UAs is needed 
in both cases, yet both features can improve webapps providing a rich 
and/or specialized editor (such as a code editor, where disabling 
spellchecking except for comments may make sense), so why not consider 
both of them, since they're related?


Also, implementation and usage experience could suggest whether it is 
worth exposing UAs' supported languages through DOM APIs (e.g. to allow 
a webapp to create a dynamic list of languages available for checking, 
avoiding static lists that are either incomplete, or too long and 
possibly including unsupported languages), and this would affect either 
the Window or the Navigator interface (or something else in HTML5 scope).


Everything, IMHO.

WBR, Alex



Re: [whatwg] Spellchecking mark III

2009-01-22 Thread Calogero Alex Baldacchino

Kornel Lesiński wrote:
Probably. However, establishing that the lang attribute gives the 
first-choice language to check would allow a webapp to emulate those 
functionalities to some extent, just by setting a different value for 
the lang attribute of a contenteditable box and some of its subregions 
through a script, at the user's whim. (This wouldn't prevent the UA 
from providing other choices, or from ignoring such behaviour due to a 
user preference, or from using other dictionaries too; that might be 
suggested in a note on usability, I guess.) That is, let's do it 
through script until UAs provide a better solution, which could be 
suggested by scripting hacks based on the lang and spellcheck 
attributes working together at the same granularity.


I don't think that applications need ability to precisely control 
spell-checking language. Browser knows best which dictionaries are 
available, and can auto-detect language based on user's preferences, 
page's language and text itself. You can expect that browsers will 
have much more sophisticated and reliable language detection than web 
apps (that's an area where browsers can freely compete).




Browsers can't do better than word processors, which are the state of 
the art in... word processing. At most, browsers can do as well, and, 
to some extent, word processors don't use heuristics while you're 
typing, because no heuristic can guess whether you're *purposely* 
switching between dialects (such as British and American English), or 
whether you just misspelled a word (personally, I dislike even the 
automatic correction of common mistakes in word processors). Word 
processors make a choice when you start writing (or before, based on 
your installation language, for instance), and let you change it for 
the whole document or for each single word. I don't think any heuristic 
auto-detection can be better; rather, no language detection (with the 
user's explicit choice) is more reliable than any sophisticated 
heuristics.
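
A toy example of why a word-by-word check cannot settle the language on its own: the same spelling can be acceptable in several dictionaries. The word lists below are tiny made-up stand-ins for real dictionaries, and the function name is invented for the example.

```python
# Toy word lists; real dictionaries are of course far larger.
WORD_LISTS = {
    "en-US": {"color", "center"},
    "en-GB": {"colour", "centre"},
    "it": {"colore", "color", "centro"},
}


def languages_accepting(word, word_lists=WORD_LISTS):
    """Return the languages whose word list contains the given word."""
    return sorted(lang for lang, words in word_lists.items()
                  if word.lower() in words)


print(languages_accepting("color"))
# prints ['en-US', 'it']
```

With "color" accepted by two lists and rejected by a third, only the writer knows which of the three readings is intended.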


Turning spellchecking on or off makes sense if one can guess how the 
user agent will behave AND if the user agent can recover from misuse, 
so I believe that spellcheck is strictly related to the way a 
spellchecking language is detected, and is half of the problem of 
controlling spellchecking. Otherwise, if it's thought that everything 
should be under the control of the UA, let's state that spellchecking 
must always be on, and be done with it. That is because being annoyed 
by wrong checking (e.g. because the heuristics fail, but it would be 
the same for a wrong lang value) is less harmful than believing one's 
text is correct while being unaware that checking has been disabled by 
the author without asking one's permission. Yet both the lang and 
spellcheck attributes can be useful for controlling spellchecking and 
improving a web-based word processor, and in both cases UAs can recover 
from misuse somehow (e.g. by allowing the user to override authors' 
choices).


Moreover, I think that interactive and script-aware UAs should act as a 
framework for web-based applications, providing as much of a client-only 
application's functionality as possible, thus browsers should include 
new features when possible and reasonable (while trying not to become 
oversized). I agree that spellchecking is a good feature to support in a 
browser; I don't see why a web-based rich text editor should be 
prevented from controlling it on the user's behalf, as happens in word 
processors, given that it's a matter of supporting an existing attribute 
(lang, which could be specified as triggering the UA's heuristics by 
default when unspecified for editable elements) and a new one (spellcheck) 
in conjunction for this purpose (a list of supported dictionaries 
would also be useful).


I also think that features which are not core functionalities for a UA 
should be provided in a basic version (for general use in web pages) and 
as building blocks for web applications, not in a complete version under 
the UA's exclusive control (for instance, a UA could allow the user to 
change the language for some selected text through a context menu 
option, but the right place for an option allowing a (starting) choice 
valid for a whole editable element, in a rich text editor, should be the 
editor interface, which shouldn't be provided by a UA, as a whole or in 
part, or, if the UA provides it, it should be exposed to any webapp to 
be customized and enhanced). That's because a specific application can 
focus on the usability of a specific task better than its underlying, 
general purpose framework (as a browser is, or should be, for a web 
application).


Furthermore, if you agree that a page's language should be used to 
improve auto-detection, why not use an element's lang attribute 
too? With the benefit that it can be changed dynamically to please the user.


Many of your suggestions are just implementation details, which HTML 
shouldn't specify precisely (it could force browsers to use 

Re: [whatwg] Canvas arcTo all points on a line

2009-01-21 Thread Calogero Alex Baldacchino

Philip Taylor wrote:


But I don't know if it makes sense from the perspective of someone
who's got to write an independent implementation of it. Does the above
explanation make more sense than the text in the spec? and if so, does
it seem implementable? If so, it seems best to keep the spec's
behaviour and try to clarify the spec's text. But this doesn't seem
like an important case where users will be unhappy if e.g. the arcTo
call draws nothing when all the points are on the same line, so if
it's still a pain to implement the spec's behaviour then I would be
happy with changing what the spec requires.

  


I haven't checked this part of the spec so far; looking at the image 
you posted, it seems the 3 points are used as control points in some 
algorithm to draw curved lines; personally, thinking of an API 
function to draw arcs, I'd prefer to have the specified points be 
part of the arc itself (e.g., the two outer ones being the endpoints of 
a convex elliptical arc). Anyway, what you say certainly makes sense for 
an arc degenerating to a line (that is, if all points lie on the same 
line). Assuming the slope and the start point of the line 
are known, it is easy to find the intersection between it and a clip 
region (through the midpoint algorithm) -- it should be the same with 
an (x2, y2) point very close to the (x0, y0)--(x1, y1) segment, that is, 
if under a certain threshold one can't draw an arc and instead the 
result must be approximated by a half-infinite line (I think all an 
implementation needs is to remember that an infinite line has been drawn 
and that the last point in the subpath is infinitely far away, so it can 
draw a parallel line when .lineTo() is invoked).
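
To make the threshold idea concrete, here is a minimal sketch of the collinearity test only. It assumes plain (x, y) tuples and an arbitrary tolerance; the function names and the default threshold are mine, not anything taken from the spec or from an existing implementation.

```python
import math


def distance_to_line(p0, p1, p2):
    """Perpendicular distance from p2 to the infinite line through p0 and p1."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    length = math.hypot(x1 - x0, y1 - y0)
    if length == 0.0:
        # p0 and p1 coincide, so fall back to the point-to-point distance.
        return math.hypot(x2 - x0, y2 - y0)
    # |cross product| is twice the triangle area; dividing by the base
    # length gives the height, i.e. the distance from p2 to the line.
    cross = (x1 - x0) * (y2 - y0) - (y1 - y0) * (x2 - x0)
    return abs(cross) / length


def is_degenerate_arc(p0, p1, p2, threshold=1e-6):
    """True when p2 is so close to the p0--p1 line that no arc can be drawn.

    The threshold is a hypothetical tolerance chosen for illustration.
    """
    return distance_to_line(p0, p1, p2) <= threshold
```

An implementation would call something like is_degenerate_arc() first and only then decide whether to draw the arc or fall back to the straight-line case discussed above.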


WBR, Alex




Re: [whatwg] Spellchecking mark III

2009-01-21 Thread Calogero Alex Baldacchino

Aryeh Gregor wrote:

On Wed, Jan 21, 2009 at 4:15 AM, Mikko Rantalainen
mikko.rantalai...@peda.net wrote:
  

If the browser does not know the language of the content, how on earth
is it supposed to *correctly* spellcheck it? I'm daily hitting a
situation where browser is trying to spellcheck content with incorrect
language. I've toggled such automatic spellchecker off and those will
stay off until correct language is detected.



In practice, I think the only way to avoid this problem is for
browsers to implement content-sniffing techniques of some kind to
figure out the language, at least per field but ideally on a
word-by-word basis.  If the browser is set to spellcheck in English
but you start putting in lots of non-Latin characters and every word
is therefore misspelled, the browser should be clever enough to try
switching the spellcheck language, or at least disabling spellcheck
for words that can't possibly be from the language it's checking
against.  More refined heuristics could detect even subtle
differences, like between British and American English, and remember
for next time which one the user usually types in.

  


Why not let the user choose the language, as happens in word 
processors? A UA can't accurately decide whether, for instance, "color" 
is a correct American English word, a misspelled British English one, or 
even a correct (truncated) Italian word, while a human can do it better, 
thus a UA could provide an interface to change the spellchecking language 
for a selection, or even for each misspelled word, starting from a hint 
language, which could be the value of an element's lang attribute 
(beside a default value and a user-preference forced one - the latter 
bypassing any authored value). Also, using the lang attribute value as 
the starting language to check (if not in conflict with a user preference) 
would allow an interactive interface with a script changing that value 
according to the user's choice (UAs could also expose a list of supported 
languages).


A declaration such as lang='und' sounds like telling the user agent to 
do whatever it computes to be a good choice, which is different from 
telling it "don't even try to work out what the language is here, because 
I know you can't guess it"; declaring a value known to be unsupported 
(such as an invented one) to turn off spellchecking sounds like a hack 
needed because we lack a more appropriate feature.
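
The priority order I have in mind (a forced user preference first, then the element's lang value, then a default) could be sketched as follows. This is only an illustration under my own assumptions: the function name, the lowercase comparison and the set of supported dictionaries are hypothetical, and no UA is known to work this way.

```python
def pick_spellcheck_language(user_forced, element_lang, default_lang, supported):
    """Return the starting dictionary, or None when no supported one is found.

    user_forced  -- language forced by a user preference (or None)
    element_lang -- value of the element's lang attribute (or None)
    default_lang -- the UA's default language (or None)
    supported    -- set of lowercase language tags with a dictionary installed
    """
    for candidate in (user_forced, element_lang, default_lang):
        if candidate and candidate.lower() in supported:
            return candidate.lower()
    # Nothing matched, e.g. the author declared an invented language and
    # there is no usable default: no dictionary is picked.
    return None


dictionaries = {"en-gb", "en-us", "it"}
print(pick_spellcheck_language(None, "en-GB", "en-US", dictionaries))
print(pick_spellcheck_language("it", "en-GB", "en-US", dictionaries))
print(pick_spellcheck_language(None, "x-invented", None, dictionaries))
```

The last call shows the "invented value" hack mentioned above: it only disables checking because nothing else is left to fall back on, which is why a dedicated feature would be cleaner.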


Everything IMHO.

WBR, Alex




Re: [whatwg] Issues concerning the base element and xml:base

2009-01-17 Thread Calogero Alex Baldacchino

Ian Hickson wrote:

On Fri, 16 Jan 2009, Calogero Alex Baldacchino wrote:
  
What should happen to a linked style sheet disabled during the first 
cascade and enabled after the base has been changed? Or if it was first 
enabled, then disabled before changing the base, and re-enabled after?



For external resource links created with the link element, the URL is 
resolved when the resource is fetched, which can be delayed if the 
resource doesn't apply yet (e.g. because a media query doesn't yet match). 
This could lead to situations where different user agents had compliant 
behavior, unfortunately, but this is one case where I can't see how to 
avoid it without requiring suboptimal behavior.


  


I understand. Perhaps, if a main (more widespread) behaviour could be 
identified, it might be chosen to normalize the behaviour of newer UAs, 
while possibly breaking fewer existing pages (which may already behave 
differently in different browsers). However, I guess this might require 
a convergence between the HTML and CSS specifications for this purpose (it 
might raise a consistency issue for @import rules, for instance, 
which are in CSS scope).


I don't know whether something like the following might work: establishing 
that a URL, in this case, is resolved any time it is explicitly set (e.g. 
when the document is parsed and when the href value changes), as if the 
resources were fetched immediately (thus not being affected by a 
later change of the base), but without constraining UAs to actually do so 
(an inline style element might be treated as an external resource that has 
already been fetched, thus it would be a matter of associating it with the 
base URI valid at the moment the style was created, which would remain 
valid until the style content is explicitly changed). Though, I guess this 
should be somehow consistent with existing UAs and pages (or, at least, 
with a significant group).
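
The difference between the two policies can be shown with a small sketch. It assumes a hypothetical Link class of my own and uses Python's urllib.parse.urljoin for the resolution step; it is not how any UA is implemented, and the URLs are placeholders.

```python
from urllib.parse import urljoin


class Link:
    """Toy model of a link whose href is relative to the document base."""

    def __init__(self, href, base_when_set):
        self.href = href
        # Policy A: resolve once, at the moment the href is set.
        self.resolved_when_set = urljoin(base_when_set, href)

    def resolve_when_fetched(self, base_at_fetch_time):
        # Policy B: resolve lazily, against whatever base applies at fetch time.
        return urljoin(base_at_fetch_time, self.href)


link = Link("style.css", "http://example.org/a/")
# The base is changed by a script before the style sheet is ever enabled.
later_base = "http://example.org/b/"

print(link.resolved_when_set)                 # unaffected by the base change
print(link.resolve_when_fetched(later_base))  # follows the new base
```

Under policy A every UA ends up with the same URL regardless of when it chooses to fetch; under policy B the result depends on fetch timing, which is where compliant UAs can diverge.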



Anne van Kesteren wrote:

On Fri, 16 Jan 2009 05:15:41 +0100, Ian Hickson i...@hixie.ch wrote:

For external resource links created with the link element, the URL is
resolved when the resource is fetched, which can be delayed if the
resource doesn't apply yet (e.g. because a media query doesn't yet 
match).

This could lead to situations where different user agents had compliant
behavior, unfortunately, but this is one case where I can't see how to
avoid it without requiring suboptimal behavior.


You have the same scenario for inline style elements that are either 
in alternate state or are of a medium that currently does not apply to 
the document. The user agent is not required to parse those CSS blocks 
directly, I believe.




WBR, Alex




Re: [whatwg] Trying to work out the problems solved by RDFa

2009-01-12 Thread Calogero Alex Baldacchino

Benjamin Hawkes-Lewis wrote:



After all, support for unknown attributes/elements has never been a
standard de jure, but more of a quirk


Depends what you mean by support I guess.



I just mean that, as far as I know, there is no official standard 
requiring UAs to support (parse and expose through the DOM) attributes 
and elements which are not part of the HTML language but are found in 
text/html documents. Usually, browsers support them for robustness' sake 
and/or backward compatibility with existing pages, but they might do it 
with significant differences (actually this happens for unknown elements 
but not for unknown attributes, but one shouldn't assume such common 
behavior won't change in the future, or that it will be adopted by 
newer vendors, even if that might be quite a safe assumption). Thus any 
hack to the language /for custom purposes and script elaboration/ should 
be done by means of existing attributes/elements instead of creating 
new ones. I mean, data-rdfa-about might be a bit safer than just 
"about", taking a conservative approach based on the assumption "I know 
what happens today, not what will happen tomorrow" -- before data-* it 
was possible through the class attribute; now data-* can also be used 
for custom hacks.



I really don't see the problem if a *custom* convention became widely
accepted and reused by other people


Then I think you don't agree with the fundamental design principle 
of the data-* attribute. The theory is that extensions to HTML 
benefit from going through a community process like WHATWG or W3C, and 
blessing extension points encourages people to circumvent that 
process, with the result that browsers have to support poorly designed 
features in order to have an interoperable web.




Yet it is *possible* to use data-* attributes to define a proper 
*private* convention by choosing names carefully in order to avoid 
clashes with other private conventions (for instance, a widget might 
need metadata to be put within the host page, and a careful choice of 
data-* names might avoid clashes with other metadata needed by other 
widgets or by the page itself). More people might find a certain 
convention useful and reusable enough for their purposes (because of 
non-clashing names), and the result would be a clearer cow path that 
community cowboys might follow to catch the free problem running away 
from standards.


The *only* difference with data-rdfa-* here would be that a larger 
number of authors/developers would have to agree on such a convention from 
the beginning, but only if they were interested in exchanging the same 
metadata with each other for their respective *custom* uses (through a 
custom script or plugin, either developed independently or shared). From 
this point of view, the only difference between data-rdfa-about and 
"about" - as used for the purposes of SearchMonkey - is that the former 
immediately conforms to the HTML5 spec and, thus, is surely exposed 
through the DOM by every possible HTML5-compliant UA, as happens for 
classes used by Microformats. I've never thought of any requirements for 
UAs that don't come from a clearly traced cow path, in the same way as 
there is no requirement for UAs not involved in SearchMonkey to support 
any kind of metadata - for the purposes of SearchMonkey itself.


Unless one thinks that everyone facing a problem not solved (at all or 
well enough for his purposes) by an official standard should either create 
a private hack, disregarding any possible hacks for similar problems he 
might have happened to find on the web, or start a new community process, 
possibly without knowing whether other people are facing the same problem 
or a similar one, I really can't understand why a *custom* and 
*born-private* convention (possibly within a group of authors/developers) 
that then becomes widely accepted should be a problem, as long as 
it is based on existing, standard features, doesn't require any 
additional support, and results in a possible cow path to be 
standardized later as needed. And I really don't understand why 
class="xyz" is a good hack whereas data-some-thing is not, assuming both 
are designed for and used by cows opening a path ( :-P )



I really can't get, right now, why it should be different, for instance,
from the case of a freely reusable widget using a custom data model
based on private data-* attributes inserted by people in thousands of
websites (the widget with relative metadata, I mean), then liked by
other people and reused in different contexts (the same data model based
on data-*, now)


Reuse of data-* by DHTML widgets would not impose any additional 
requirements on user agents, so it would be fine from the perspective 
elaborated above. It wouldn't change the language by the back door.


Really? Is it so much different from the case of the pattern attribute 
(which addresses, at the UA and language level, a problem earlier solved 
by scripts -- e.g. getting elements by their 

Re: [whatwg] Trying to work out the problems solved by RDFa

2009-01-11 Thread Calogero Alex Baldacchino

Benjamin Hawkes-Lewis wrote:

On 11/1/09 02:51, Calogero Alex Baldacchino wrote:

eRDF might be a working compromise, because it doesn't need any changes
to the spec


It's not possible to author conforming HTML5 that functions as eRDF 
since eRDF requires a 'profile' attribute, but HTML5 has removed the 
attribute.




I didn't notice that before, thanks for the info :-)

However, actually it's the same for RDFa attributes, because they're not 
in the spec. From this point of view, introducing six new attributes, or 
resorting to an older one is not very different, thus (again) why RDFa 
and not eRDF? Or why not both? Or not also RDFa embedded in Atom 
embedded, in turn, in HTML (like SVG or MathML)? It seems to me, for 
instance, that at this stage SearchMonkey might be a reason to consider 
all of them.




; RDFa covers a wider range of RDF semantics, but requires

new attributes and also namespaces (a sort of hybrid between them might
avoid the need to bring namespaces - xmlns:* attributes - into html
serialization).


To avoid xmlns:* attributes, one could drop CURIEs in the text/html 
serialization and use markup like:


<div>
  <div about="http://dbpedia.org/resource/Albert_Einstein">
...
  </div>
</div>

instead of

<div xmlns:db="http://dbpedia.org/">
  <div about="[db:resource/Albert_Einstein]">
...
  </div>
</div>

There's no data loss.



Well, that's a possibility, of course, but that's *not* RDFa as specified 
by W3C; for instance, @property is specified as accepting _only_ CURIEs 
(whereas @about can also accept URIs - and eRDF allows CURIEs, even if 
in a different format from what is specified for RDFa and from what is 
used for XML in general). That is, to do that, not one but _two_ 
specifications would need to be changed: the current HTML5 (which is a 
draft, thus not a problem) and RDFa (which is now a Recommendation, thus, 
might it be more difficult? should a different specification be derived?), 
unless we want that to be just an unofficial, yet widely accepted, 
convention - and I think that one unofficial convention is as good as 
another (any processor conforming to standard RDFa would need deep changes 
to cope with that - it doesn't work in Fuzzbot when CURIEs are expected, 
for instance). I'm the first to say that my suggestion was an ugly hack, 
but at least it would have been working and conformant without changing 
anything.
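
To show why a processor that expects CURIEs chokes on a bare URI, here is a deliberately naive sketch of prefix expansion. It is not the algorithm from the RDFa Recommendation and not Fuzzbot's code; the function name and error handling are hypothetical, and the prefix map stands in for the xmlns:* declarations.

```python
def expand_curie(value, prefixes):
    """Expand 'prefix:reference' (optionally wrapped in [ ]) into a full URI.

    prefixes maps a declared prefix to its namespace URI, playing the role
    of the xmlns:* attributes in scope.
    """
    text = value.strip()
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]  # drop the safe-CURIE brackets
    prefix, separator, reference = text.partition(":")
    if not separator or prefix not in prefixes:
        raise ValueError("no declared prefix in %r" % value)
    return prefixes[prefix] + reference


declared = {"db": "http://dbpedia.org/"}
print(expand_curie("[db:resource/Albert_Einstein]", declared))

try:
    # A full URI where a CURIE is expected: "http" is read as a prefix
    # that was never declared, so the value is rejected.
    expand_curie("http://dbpedia.org/resource/Albert_Einstein", declared)
except ValueError as error:
    print("rejected:", error)
```

A processor written this way would need changes before it could accept the CURIE-free markup suggested above, which is the cost I'm pointing at.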



My suggestion was meant as a means to test RDFa in HTML
documents without changing the spec (perhaps in conjunction with
data-xmlns-*, data-xmlns-prefixes="rdfa foaf whatever" to emulate
namespaces - an ugly hack, I know, but at least would avoid changes to
html serialization, at least in a test phase) -- even if I think that
xml serialization should work better for such rdf metadata.


I really can't see anybody violating the spec in that way rather than 
violating the spec by just adding the RDFa attributes outright, --


Indeed, current specs are violated, and I was just considering a way to 
use RDFa without such violations before deciding whether it's worth 
adding to the spec, no more (and I don't want to push that hack anymore, 
I'm just trying to point out my aim).


--especially given that there are already people publishing these 
attributes in text/html so the namespace has already been polluted 
and we already have services like SearchMonkey not only using these 
attributes but promoting them.


It seems to me that SearchMonkey doesn't promote RDFa more than it 
promotes Microformats, eRDF and dataRSS (RDFa embedded in external Atom 
feeds). It's also a very recent feature, and I really can't guess which 
kind of RDF serialization is going to win the battle (that is, 
choosing one against the others *might* be a premature choice right now, 
as well as introducing all of them).


It may therefore already be problematic for a future version of HTML 
to use these attributes as extension points without breaking existing 
sites. The test is already in progress, for better or worse. HTML5 
conformance checkers don't have to bless this test, of course, any 
more than CSS validators have to give the all clear to vendor-specific 
properties.


It's the same with every possible existing custom (non-standard) 
attribute and element out there, since there is no standard for them, 
and data-* has been created instead; it's also the same for accesskey, 
actually, since it's not in the current spec (whereas it was in HTML4). 
After all, support for unknown attributes/elements has never been a 
standard de jure, but more of a quirk, and there is no guarantee it will 
work fine in the future (just as it currently doesn't work 
consistently for unknown elements across browsers -- there are strong 
differences between IE and other browsers in this respect).


Moreover, the use of such attributes /for the purposes of SearchMonkey/ 
is a very, very custom use case, since they're used just for server-side 
computations, thus no collaboration is required by other UAs; if 
browsers just ignored

Re: [whatwg] Fuzzbot (Firefox RDFa semantics processor)

2009-01-11 Thread Calogero Alex Baldacchino

Toby A Inkster wrote:

Calogero Alex Baldacchino wrote:


The concern is about every kind of metadata with respect to their
possible uses; but, while it's been stated that Microformats (for
instance) don't require any particular support by UAs (thus they're
backward compatible), RDFa would be a completely new feature, thus the html5
specification should say what UAs are expected to do with such new
attributes.


RDFa doesn't require any special support beyond the special support 
that is required for Microformats. i.e. nothing. User agents are free 
to ignore the RDFa attributes. In that sense, RDFa already works in 
pretty much every existing browser, even going back to dinosaurs like 
Mosaic and UdiWWW.


Agents are of course free to offer more than that. Look at what they 
do with Microformats: Firefox for instance offers an API to handle 
Microformats embedded on a page; Internet Explorer offers its Web 
Slices feature.




Well, at the beginning of this thread the possible need to interchange 
RDF metadata and merge triples from different vocabularies was suggested 
as a use case for RDFa serialization of RDF, and this would hint a 
requirement for supporting an RDFa processor in every conforming UA. 
This also opens a question about what else might be needed beside 
collecting triples (is an API to build custom query applications enough, 
or should some query feature be provided by browsers? are there possible 
problems involved (like possible spam through fake metadata in cached 
ads)? possible solutions to prevent or moderate it?).


If, otherwise, nothing special must be done by browsers with RDFa 
attributes, and instead their main use is for script or plugin or 
server-side computations, or for optional support by UAs, they would 
be in no way different from any other kind of custom attribute (thus 
should a validation requirement be "let's accept every attribute"?), 
data-* included, except for the /intended use/, which may make the 
difference but is something only a human can understand, and no 
validator can check (from this point of view, validating RDFa 
attributes, any other attributes, or just html5 attributes and custom 
data-* ones would be the same, as validation would not be a concern, just 
as it isn't for proprietary CSS extensions).


For what concerns html serialization, in particular, I'd consider 
some code like [...] which is rendered properly



Is it though? Try adding the following CSS:

span[property="cal:summary"] { font-weight: bold; }

And you'll see that CSS doesn't cope with a missing ending tag in that 
situation either.


If you miss out a non-optional end tag, then funny things will happen 
- RDFa isn't immune to that problem, but neither is the DOM model, 
CSS, microformats, or anything else that relies on knowing where 
elements end. A better comparison would be a missing </p> tag, which 
is actually allowed in HTML, and HTML-aware RDFa processors can 
generally handle just fine.


That's definitely *not* the same issue. As I've replied in a previous 
mail, people *do not* need proper styling to understand prose, they just 
need to understand the language of the prose, then their /brains/ will cope 
with the rest, thus the above example results in some acceptable 
graceful degradation (it may or may not be the wanted presentation, 
depending on where the closing </span> was to be positioned (it wouldn't 
be the right presentation in this case), but it is not too harmful 
anyway). Bots based on metadata, instead, *do need* reliable metadata to 
work properly, unless they're made smart enough to debug the code 
they're fed (should Artificial Intelligence be a requirement? - no 
sarcasm here).
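
A toy extractor makes the point. The sketch below uses Python's html.parser to collect the text of the element carrying property="cal:summary"; it is my own simplification for illustration, not a real RDFa processor, and the class and helper names are hypothetical.

```python
from html.parser import HTMLParser


class SummaryCollector(HTMLParser):
    """Collects the text inside <span property="cal:summary">."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # how many <span> elements deep inside the summary
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self.depth > 0:
            self.depth += 1          # a span nested inside the summary
        elif dict(attrs).get("property") == "cal:summary":
            self.depth = 1           # the summary span itself

    def handle_endtag(self, tag):
        if tag == "span" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


def summary_text(markup):
    collector = SummaryCollector()
    collector.feed(markup)
    collector.close()
    return " ".join("".join(collector.parts).split())


GOOD = ('<p typeof="cal:Vevent">I\'m holding <span property="cal:summary">'
        'one last summer Barbecue</span>, to meet friends on '
        '<span property="cal:dtstart">September 16th at 4pm</span>.</p>')
# Same markup with the closing tag commented out, as in my example.
BROKEN = GOOD.replace("Barbecue</span>", "Barbecue<!-- </span> -->")

print(summary_text(GOOD))
print(summary_text(BROKEN))
```

A human reader sees nearly the same sentence in both cases, while the extracted summary in the second case swallows the rest of the paragraph, date included.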


If broken/wrong presentation caused by a missing end tag had ever been 
an issue, html-serialization would have been deprecated in favour of 
xml-one (if something really problematic happened, authors would 
notice it on their very first test by opening a page in a browser, 
whereas an extensive and complete debug for triples might be an odd 
problem in a large document). In contrast with that, any break in 
metadata semantics caused by html-serialization can only be a severe 
issue for a metadata-based bot (because it needs accurate metadata, 
while a non-very-accurate presentation is not a great concern for human 
beings in most cases, and if no particular presentation is attached to 
those spans, but instead they're used just to add semantics through 
metadata, as happens when embedding RDF through RDFa attributes, a 
side-effect may arise), thus html-serialization may be more prone to 
side-effects than xml-serialization (which stops on validation errors, 
being in turn a possible cause for side-effects with metadata), from 
this point of view -- that is, since RDFa semantics is more reliable in 
a more well-formed document, xml-serialization might help to debug some 
errors, while it is not a strict requirement for content presentation, 
and instead finding more or less emboldened

Re: [whatwg] Trying to work out the problems solved by RDFa

2009-01-10 Thread Calogero Alex Baldacchino

Toby A Inkster wrote:


It should be noted in this case that RDFa also allows natural language 
parsers to be made more useful. By looking at the RDFa which marks up 
the author's name and website, they may be able to determine that the 
comment has been written by someone other than the page's main author, 
and thus not afford it the same level of trust granted to the rest of 
the page. So the natural language processing can benefit from RDFa.




That's true only if one can assume metadata are trustworthy, but they are 
only if they can be kept under strict control, that is, in a small-scale 
application. On a wider scale, one needs to make the opposite 
assumption, because it would or might be more common to find fake 
metadata with honest content (the prose of an advertisement does not 
lie, but the related metadata can claim it's a different thing to cheat a 
metadata-based UA), either because a site author can be a party to the 
spammer, or because authors can mess up metadata (yeah, they can mess up 
html code too, but either that's not a problem, because a UA can present 
the content anyway, or it is, but it might damage the author more than 
it may harm the user). If metadata are created/used for external 
consumption, they can simply be ignored by authors, who instead may just 
copy-paste code or reuse templates in different contexts, without being 
able to set proper metadata for the new content. Thus UAs can't rely on 
metadata /in general/, while they might /in some/ small-scale scenarios.


WBR, Alex




Re: [whatwg] Fuzzbot (Firefox RDFa semantics processor)

2009-01-10 Thread Calogero Alex Baldacchino

Manu Sporny wrote:

Calogero Alex Baldacchino wrote:
  

That is, choosing a proper level of integration for RDF(a) support into
a web browser might divide success from failure. I don't know what's the
best possible level, but I guess the deepest may be the worst, thus
starting from an external support through plugins, or scripts to be
embedded in a webapp, and working on top of other features might work
fine and lead to a better, native support by all vendors, yet limited to
an API for custom applications



There seems to be a bit of confusion over what RDFa can and can't do as
well as the current state of the art. We have created an RDFa Firefox
plugin called Fuzzbot (for Windows, Linux and Mac OS X) that is a very
rough demonstration of how a browser-based RDFa processor might
operate. If you're new to RDFa, you can use it to edit and debug RDFa
pages in order to get a better sense of how RDFa works.

  


The concern is about every kind of metadata with respect to their 
possible uses; but, while it's been stated that Microformats (for 
instance) don't require any particular support by UAs (thus they're 
backward compatible), RDFa would be a completely new feature, thus the 
html5 specification should say what UAs are expected to do with such new 
attributes.


Shall UAs just accept them and expose an API to extract triples, so 
that a web application can build a query mechanism upon such an API? 
This might work fine, and fulfill small-scale scenarios, such as 
organization-wide data modelling and interchange, as suggested by 
Charles McCathieNevile; this can also be accomplished by an external plugin.


Shall UAs (browsers) also provide an interface to view bare triples (as 
does Fuzzbot), as a kind of debugging tool? As above.


Shall UAs (browsers) also provide metadata-based features, such as a 
query interface to look for content in a local history? This is a wider 
scale application, and also a use case where problems may arise. From 
this angle, metadata can't be assumed to be reliable a priori (instead, 
their reliability is uncertain), nor can users be deemed capable of 
understanding the problem and filtering out wrong/misused/abused metadata 
(in general). This is the scenario where spammy metadata may become an 
issue. For instance, some code like,


<div typeof="foaf:Person">
   <p property="foaf:name" content="Manu Sporny">We sell
   <a href="http://www.cheatingcarseller.com"
      rel="foaf:homepage">cars</a>
   </p>
</div>

would produce the following triples,

_:bnode0 rdf:type http://xmlns.com/foaf/0.1/Person
_:bnode0 foaf:homepage http://www.cheatingcarseller.com
_:bnode0 foaf:name Manu Sporny

(this is exactly what Fuzzbot outputs)

thus, a metadata-based search feature might output a link to a 
metadata-spammy site when queried for "Manu Sporny". That is, cheating 
a metadata-based bot by means of fake metadata can be very easy.
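
As a rough illustration of how such a query would be fooled, here is a sketch that stores the three triples above as plain tuples and runs a naive lookup. The helper name and the tuple representation are my own assumptions; a real feature would use a proper RDF store and query language.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# The triples extracted from the spammy markup, as (subject, predicate, object).
triples = [
    ("_:bnode0", RDF_TYPE, FOAF + "Person"),
    ("_:bnode0", FOAF + "homepage", "http://www.cheatingcarseller.com"),
    ("_:bnode0", FOAF + "name", "Manu Sporny"),
]


def homepages_for(name, graph):
    """Return every foaf:homepage of any subject whose foaf:name matches."""
    subjects = {s for s, p, o in graph if p == FOAF + "name" and o == name}
    return [o for s, p, o in graph if s in subjects and p == FOAF + "homepage"]


print(homepages_for("Manu Sporny", triples))
```

Nothing in the triples tells the query that the name was attached to someone else's advertisement, so the spammy link comes back as a perfectly valid answer.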


Metadata-based features (and this is true for most xml-related 
technologies, such as RDF/RDFa) work fine if properly used. Unfortunately, 
"things must be used properly to work fine" is not the basic principle 
of the web (and this is especially true for html and related 
technologies), which instead has always been about "people will mess 
everything up, but UAs will work fine anyway", that is, robustness 
before all, as far as possible. For what concerns html serialization 
in particular, I'd consider some code like,


<p typeof="cal:Vevent">
 I'm holding
 <span property="cal:summary">
   one last summer Barbecue
 <!-- </span> -->, to meet friends and have a party before the end of 
holidays

 on
 <span property="cal:dtstart" content="2007-09-16T16:00:00-05:00"
   datatype="xsd:dateTime">
   September 16th at 4pm
 </span>.
</p>

(taken from http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014/ and 
purposely modified)


which is rendered properly, but produces,

_:bnode1 rdf:type http://www.w3.org/2002/12/cal/icaltzd#Vevent
_:bnode1 cal:dtstart 2007-09-16T16:00:00-05:00
_:bnode1 cal:summary one last summer Barbecue , to meet friends 
and have a party before the end of holidays on <span 
xmlns:cal="http://www.w3.org/2002/12/cal/icaltzd#" 
xmlns:foaf="http://xmlns.com/foaf/0.1/" 
xmlns:xsd="http://www.w3.org/2001/XMLSchema#" 
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
datatype="xsd:dateTime" datatype="xsd:dateTime" 
content="2007-09-16T16:00:00-05:00" property="cal:dtstart">September 
16th at 4pm</span>


(taken from Fuzzbot keeping namespace declarations in the root element; 
without xmlns:* attributes all triples are lost)
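
To make the difference between a forgiving and a strict parser concrete, 
here is a small Python sketch using only the standard library. It is an 
assumption-laden illustration: html.parser is a lenient tokenizer, not 
an HTML5 tree builder, and the markup is a shortened copy of the 
modified example above. The helper names (TagBalance, lenient_parse, 
strict_parse_fails) are invented for this sketch.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Shortened copy of the modified example: the first </span> is commented out.
MARKUP = (
    '<p typeof="cal:Vevent">I\'m holding '
    '<span property="cal:summary">one last summer Barbecue '
    '<!-- </span> -->, to meet friends '
    'on <span property="cal:dtstart" content="2007-09-16T16:00:00-05:00">'
    'September 16th at 4pm</span>.</p>'
)


class TagBalance(HTMLParser):
    """Counts opening and closing span tags without enforcing nesting."""

    def __init__(self):
        super().__init__()
        self.opened = 0
        self.closed = 0

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.opened += 1

    def handle_endtag(self, tag):
        if tag == "span":
            self.closed += 1


def lenient_parse(markup):
    """Feed the markup to the forgiving parser; it never raises here."""
    parser = TagBalance()
    parser.feed(markup)
    parser.close()
    return parser.opened, parser.closed


def strict_parse_fails(markup):
    """Return True if a strict XML parser rejects the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return True
    return False


print(lenient_parse(MARKUP))       # spans opened vs. spans closed
print(strict_parse_fails(MARKUP))  # whether the XML parser gives up
```

The lenient pass goes through silently with one span left open, while 
the XML parser stops at the mismatched tag, which is the quicker 
feedback discussed here.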


which is not the desired result. Perhaps it might work better as an XML 
feature on a strict XML parser (aborting with an error because of the 
missing end tag), even considering that RDFa relies on namespaces (thus, 
adding RDFa attributes to the HTML5 spec would require some features 
from XML extensibility to be added to the HTML serialization). But RDFa 
in an XHTML document might look like rdfa:about, rdfa:property, 
rdfa:content

Re: [whatwg] Trying to work out the problems solved by RDFa

2009-01-10 Thread Calogero Alex Baldacchino

Kornel Lesiński wrote:

On 09.01.2009, at 01:54, Calogero Alex Baldacchino wrote:


This is why I was thinking about somewhat data-rdfa-about, 
data-rdfa-property, data-rdfa-content and so on, so that, for the 
purposes of an RDFa processor working on top of HTML5 UAs


One can also use <link rel="alternate" href="description.rdf">. I 
don't see why RDF metadata must be in the HTML document. It could be 
in a separate file, maybe embedded in RSS/Atom feeds (RSS1.0 is 
pretty close already).


Websites that have a lot of useful data to share usually keep it in a 
database, and this allows them to easily generate RDF as separate 
documents without risk of getting out of sync with the HTML version.




In principle, I agree (also, Atom 1.0 embedding RDFa as dataRSS is the 
base of SearchMonkey). But if people feel the need to embed metadata in 
their documents and to use them as a distributed database, well, let's 
give them a chance to do so. :-P


eRDF might be a working compromise, because it doesn't need any changes 
to the spec; RDFa covers a wider range of RDF semantics, but requires 
new attributes and also namespaces (a sort of hybrid between them might 
avoid the need to bring namespaces, that is xmlns:* attributes, into the 
HTML serialization). My suggestion was meant as a means to test RDFa in 
HTML documents without changing the spec (perhaps in conjunction with 
data-xmlns-*, data-xmlns-prefixes="rdfa foaf whatever" to emulate 
namespaces: an ugly hack, I know, but it would at least avoid changes to 
the HTML serialization in a test phase) -- even if I think that the 
XML serialization should work better for such RDF metadata.
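
A minimal sketch of how such an emulation might resolve prefixed names, 
in Python. It assumes one possible reading of the hack, namely that each 
data-xmlns-<prefix> attribute on the root element carries the namespace 
IRI for that prefix; data-xmlns-prefixes is simply ignored here. Both 
function names are hypothetical.

```python
MARKER = "data-xmlns-"  # assumed naming convention, not part of any spec


def prefix_map(attrs):
    """Collect data-xmlns-<prefix> attributes into a prefix -> IRI table."""
    return {
        name[len(MARKER):]: value
        for name, value in attrs.items()
        if name.startswith(MARKER) and name != "data-xmlns-prefixes"
    }


def expand_curie(curie, prefixes):
    """Expand 'prefix:local' to a full IRI, or return None if unknown."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        return None
    return prefixes[prefix] + local


# Attributes as they might appear on the root element of a test document.
root_attrs = {
    "data-xmlns-foaf": "http://xmlns.com/foaf/0.1/",
    "data-xmlns-cal": "http://www.w3.org/2002/12/cal/icaltzd#",
    "lang": "en",
}

prefixes = prefix_map(root_attrs)
print(expand_curie("foaf:name", prefixes))
```

An external processor could build this table once per document and then 
expand every prefixed value it meets, without the HTML parser knowing 
anything about namespaces.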


WBR, Alex


--
Caselle da 1GB, trasmetti allegati fino a 3GB e in piu' IMAP, POP3 e SMTP 
autenticato? GRATIS solo con Email.it http://www.email.it/f

Sponsor:
Con Danone Activia, puoi vincere cellulari Nokia e Macbook Air. Scopri come
Clicca qui: http://adv.email.it/cgi-bin/foclick.cgi?mid=8549d=11-1


Re: [whatwg] Trying to work out the problems solved by RDFa

2009-01-09 Thread Calogero Alex Baldacchino

Julian Reschke wrote:

Calogero Alex Baldacchino wrote:

...
This is why I was thinking about something like data-rdfa-about, 
data-rdfa-property, data-rdfa-content and so on, so that, for the 
purposes of an RDFa processor working on top of HTML5 UAs (perhaps in 
a test phase, if needed at all, of course), an element's dataset would 
give access to "rdfa-about" instead of just "about", that is, using 
the prefix "rdfa-" as if it were a namespace prefix in XML (hence, as 
if there were rdfa:about instead of data-rdfa-about in the markup).

...


That clashed with the documented purpose of data-*.


Hmm, I'm not sure there is a clash, since I was suggesting a *custom* 
and essentially *private* mechanism to experiment with RDFa in 
conjunction with the HTML serialization, for the *small-scale* needs of 
some organizations willing to embed RDFa metadata in text/html documents 
and to exchange them with each other, using a convention likely to avoid 
name clashes with other private metadata. I think it's unlikely to 
find data-rdfa-* used with different semantics in the very same page, 
and in a small-scale scenario involving a few *selected* sources of 
RDFa-modelled information, each party would likely know in advance that 
the others are using the same conventions. A document modelled this way 
might be used in conjunction with an external RDFa processor, thus 
avoiding any direct support in a browser.


However, such a convention might be clash-free enough to work on a 
wider scale, thus it might become widespread and provide evidence 
that the web /needs/, or at least /has chosen/, to use RDFa as (one of) 
the most common ways to embed metadata in a document, and that might be 
enough to add native support for the whole range of RDFa attributes, 
possibly along with support for earlier experimental ones (such as 
data-rdfa-* and rdfa:* ones, for backward compatibility). And 
actually I can't see much of a problem if a private-born feature became 
the basis of a widespread and widely accepted convention. I'm not saying 
the spec should name data-rdfa-* as a means to implement RDFa; instead I 
think that, if a general agreement on whether and how RDFa must be 
specified and implemented can't be found, such an experiment might be 
proposed to the semantic web industry while waiting for the results, 
given that a lack of support might prevent any interested party from 
using RDFa and HTML5 together.




*If* we want to support RDFa, why not add the attributes the way they 
are already named???




For instance, to find out by experiment whether it is worth changing 
the "if we want" into "we do want", without requiring an early 
implementation and specification, and without relying on whether and 
what a certain browser vendor might want to experiment with differently 
from others (such a convention would only require support for HTML5 
datasets and a script or a plugin capable of handling them as 
representing RDFa metadata). The point here is that, after the 
introduction of data-* attributes as a means to support custom 
attributes, any browser vendor might decide to drop support for other 
kinds of custom attributes in the HTML serialization (that is, for 
attributes being neither part of the language nor data-* ones). 
Therefore, if vendors (or any of them) decided not to support RDFa 
attributes until they were introduced in a specification, there might be 
no means to experiment with them (in general, that is, cross-browser) 
without resorting either to data-* or to rdfa:* (the latter in XHTML).


Anyway, /in general/ what should a browser do with RDFa metadata, on a 
*wide scale*, other than classifying a portion of the open web (e.g. in 
its local history), possibly allowing users to select trusted sources?


Actually, I don't think that would bring enough benefit to *average* 
users, compared to the risk of getting a lot of spam metadata from 
/heterogeneous/ sources. I really don't expect average users to 
understand how to filter sites based on metadata reliability (and just 
for the purpose of using a metadata-based query interface, because a 
site with wrong metadata might still contain useful information); 
instead they might just try to use a query interface the same way they 
use a default search bar, get wrong results (once spam metadata became 
widespread) and decide the mechanism doesn't work (possibly 
complaining about that). Some kind of antispam filter might help, but I 
think that understanding whether metadata are reliable, that is, whether 
they really correspond to a web page's content, is a hard problem to be 
solved by a bot without a good degree of Artificial Intelligence 
(filtering emails by looking for suspicious patterns is far easier than 
implementing a filter capable of /understanding/ metadata, 
/understanding/ natural language and comparing /semantics/).


Likewise, I don't expect the great majority of web pages to contain 
valid metadata: most people would not care about them, and a potentially 
growing number might copy and paste code containing metadata

Re: [whatwg] Trying to work out the problems solved by RDFa

2009-01-09 Thread Calogero Alex Baldacchino

Ben Adida wrote:

Tab Atkins Jr. wrote:
  

Actually, SearchMonkey is an excellent use case, and provides a
problem statement.



I'm surprised, but very happily so, that you agree.

My confusion stems from the fact that Ian clearly mentioned SearchMonkey
in his email a few days ago, then proceeded to say it wasn't a good use
case.

-Ben

  


It seems to me that's a very custom use case, since metadata are used as 
an optional source of information by the server and don't require any 
collaboration by other kinds of UA (excluding, at most, some custom data 
services). It does require metadata to be embedded in a large number of 
pages, but that's an optional requirement, because search results don't 
rely only on metadata. By contrast, a search engine using the mark 
element to highlight a keyword would require a client UA to understand 
and style it properly; I'd expect it not to work on IE6, for instance, 
because IE browsers deal with unknown elements as if their content were 
misplaced. That is, Yahoo might develop its own data model and work fine 
with sites implementing it; perhaps RDF(a) was chosen because they think 
RDF is a natural way to model data which are sparse in a web page (and 
re-mapping microformats onto RDF might result in an easier 
implementation); anyway, in this case the only UA needing to understand 
RDFa is SearchMonkey itself, thus a client browser might just drop RDFa 
attributes without breaking SearchMonkey's functionality -- at least, 
this is my first impression.


Furthermore, it's a very recent (yet potentially interesting) 
application, so why not wait and see how it grows before choosing to 
include one model or another in the HTML5 specification (or to include 
each one because they are equally widespread)? We would see whether the 
opt-in mechanism effectively prevents spam (e.g. spammers might model 
data based on widely used vocabularies and data services, and find a way 
to make such data available in searches when users ask for additional 
info, for instance through an ad within a page of an accomplice author, 
or by exploiting some kind of error in authors' selection of URLs to be 
crawled for metadata, or the like), and which model becomes the most 
used among RDFa, eRDF, Microformats, Atom embedding dataRSS and whatever 
else Yahoo might decide to support. Moreover, it seems that some XML 
processing is needed to create a custom data service, thus it might be 
natural to use XHTML (possibly along with namespaces and prefixed 
attributes) to provide metadata to such a data service, which might rely 
on an XML parser instead of implementing one from scratch (and an HTML 
parser might not support namespaces for the purpose of exposing them 
through DOM interfaces, as I understand the HTML serialization). The use 
of prefixed RDFa attributes, or perhaps even unprefixed ones, within an 
XML-serialized document shouldn't require a formalization in the HTML5 
spec, as long as there is no strict requirement for UAs to support RDF 
processing, as is the case for the purposes of SearchMonkey and its 
related data services.


WBR, Alex




Re: [whatwg] Trying to work out the problems solved by RDFa

2009-01-09 Thread Calogero Alex Baldacchino

Ben Adida wrote:

Ian Hickson wrote:
  
We have to make sure that whatever we specify in HTML5 actually is going 
to be useful for the purpose it is intended for. If a feature intended for 
wide-scale automated data extraction is especially susceptible to spamming 
attacks, then it is unlikely to be useful for wide-scale automated data 
extraction.



It's no more susceptible to spam than existing HTML, as per my previous
response.

  


Perhaps this is why general purpose search engines do not rely 
(entirely) on metadata and markup semantics to classify content, nor 
does Yahoo with SearchMonkey. SearchMonkey documentation points out that 
metadata never affects page ranks, nor is semantics interpreted for any 
purpose; metadata only affects the additional information presented to 
the user at the user's will, and only if the user chose to get 
information of a certain kind (gathered by a certain data service). Thus 
spammy metadata can be thought of as circumscribed in this case: they 
might corrupt SearchMonkey's additional data, but not the user's overall 
experience with the search engine. From this point of view, SearchMonkey 
is a kind of wide-range but small-scale use case (with respect to each 
tool and each site the user might enable), because the user can easily 
choose which sources to trust (e.g. which data services to use, or which 
sites to look at for additional info), and in any case he can get enough 
info without metadata.


On the other hand, a client UA implementing a feature entirely based on 
metadata couldn't easily circumscribe abused metadata and bring valid 
information to the user's attention, nor could the average user easily 
tell trusted and spammy sites apart, because he wouldn't understand 
the problem (and a site with spammy metadata might still contain 
information the user was interested in previously, or in a different 
context), whereas in SearchMonkey the average user would notice that 
something doesn't work in enhanced results, but he'd also get the basic 
info he was looking for. Thus there are different requirements to be 
taken into account for different scenarios (SearchMonkey and a client UA 
are such different scenarios).


Moreover, SearchMonkey is a kind of centralised service based on 
distributed metadata. By default it doesn't need collaboration by any 
other UA (that is, it doesn't need support for metadata in other 
software), although it allows custom data services to autonomously 
extract metadata, always for the purposes of SearchMonkey. It only 
requires that web sites adhering to the project (or just willing to 
provide additional info) embed some kind of metadata solely for the 
purpose of making them available to SearchMonkey services, or at least 
that authors create appropriate metadata and send them to Yahoo (in the 
form of dataRSS embedded in an Atom document). That is, SearchMonkey 
seems to me a clear example of a use case for metadata not requiring any 
changes to the HTML5 spec, since any kind of supported metadata are used 
by SearchMonkey as if they were custom, private metadata; whatever 
happens to such metadata client-side, even if they're just stripped by a 
browser, doesn't really matter.


Furthermore, SearchMonkey supports several kinds of metadata: not only 
RDFa, but also eRDF, microformats and dataRSS external to the document. 
So, why should SearchMonkey be the reason to introduce explicit support 
for RDFa and not also for eRDF, which doesn't require new attributes, 
but just a parser? One might think one solution is better than the 
other, and this might be true in theory, but what really counts is what 
people find easier to use, and this might be determined by experience 
with SearchMonkey (that is, let's see what people use more often, then 
decide what's more needed).


Moreover, RDFa was designed for XHTML, thus it can't be introduced into 
the HTML serialization just by defining a few new attributes: a 
processor would or might need some knowledge of /namespaces/, thus the 
whole family of *xmlns* attributes (with and without prefixes) should be 
specified for use with the HTML serialization, unless an alternative 
mechanism, similar to the one chosen for eRDF, were defined, and that 
might result in a new, hybrid mechanism (stitching together pieces from 
eRDF and RDFa). But if we introduce xmlns and xmlns:prefix into the HTML 
serialization, why not also prefixed attributes? That is, can RDFa be 
introduced into the HTML serialization "as is", without resorting to the 
whole of XML extensibility? This should be taken into account as well, 
because just adding new attributes to the language might work fine for 
XML-serialized documents, but might not for HTML-serialized ones. This 
means RDFa support might be more difficult than it may seem at first 
glance, whereas it might not be needed for custom and/or small-scale use 
cases (and I think SearchMonkey is one such case).


Nobody is suggesting that user agents derive any behavior from 

Re: [whatwg] Trying to work out the problems solved by RDFa

2009-01-08 Thread Calogero Alex Baldacchino

Charles McCathieNevile wrote:
On Sun, 04 Jan 2009 03:51:53 +1100, Calogero Alex Baldacchino 
alex.baldacch...@email.it wrote:



Charles McCathieNevile wrote:
... it shouldn't be too difficult to create a custom parser, 
conforming to the RDFa spec and making use of data-* attributes...


That is, since RDFa can be emulated somehow in HTML5 and tested 
without changing the current specification, perhaps there isn't a strong 
need for an early adoption of the former, and instead an emulated 
merger might be tested first within the current timeline.


In principle this is possible. But the data-* attributes are designed 
for private usage, and introducing a public usage means creating a 
risk of clashes that pollute RDFa data gathered this way. In other 
words, this is indeed feasible, but one would expect it to show that 
the data generated was unreliable (unless privately nobody is 
interested in basic terms like "about"). 


This is why I was thinking about something like data-rdfa-about, 
data-rdfa-property, data-rdfa-content and so on, so that, for the 
purposes of an RDFa processor working on top of HTML5 UAs (perhaps in a 
test phase, if needed at all, of course), an element's dataset would 
give access to "rdfa-about" instead of just "about", that is, using the 
prefix "rdfa-" as if it were a namespace prefix in XML (hence, as if 
there were rdfa:about instead of data-rdfa-about in the markup).
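
As a rough sketch of what such an external processor might do as its 
first step, the following Python snippet collects data-rdfa-* attributes 
and maps them back to plain RDFa attribute names. It uses the standard 
html.parser module instead of a browser's dataset API, and the class 
name, the collect helper and the sample markup are all made up for 
illustration.

```python
from html.parser import HTMLParser

PREFIX = "data-rdfa-"  # the hypothetical convention proposed above


class DataRdfaCollector(HTMLParser):
    """Collects data-rdfa-* attributes, renamed to their RDFa equivalents."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (tag, {rdfa_attribute: value}) pairs

    def handle_starttag(self, tag, attrs):
        # Keep only attributes following the convention, dropping the prefix
        # so that data-rdfa-property is reported as "property", and so on.
        rdfa = {
            name[len(PREFIX):]: value
            for name, value in attrs
            if name.startswith(PREFIX)
        }
        if rdfa:
            self.found.append((tag, rdfa))


def collect(markup):
    collector = DataRdfaCollector()
    collector.feed(markup)
    collector.close()
    return collector.found


SAMPLE = (
    '<div data-rdfa-typeof="foaf:Person" data-price="12">'
    '<p data-rdfa-property="foaf:name" data-rdfa-content="Manu Sporny">'
    'Hello</p></div>'
)

print(collect(SAMPLE))
```

Other private data-* attributes (data-price in the sample) are left 
alone, which is the kind of separation the prefix is meant to provide.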


This way, the public exposure of RDFa attributes on top of the generic 
and normally private dataset feature might be circumscribed enough to 
avoid clashes. That is, if RDFa shows its best benefits when used to 
address small-scale needs involving trusted/reliable (meta-)data, it 
should be fair to assume all involved parties are aware that each one is 
using RDFa, and aren't just running an RDFa processor in the hope of 
gathering enough information.


From this point of view, it should be quite unlikely to find people 
using data-rdfa-about to express different semantics in the same page 
(whereas data-property might cause ambiguity, for instance), just as 
it is (or should be) quite unlikely to find different namespaces using 
the very same prefix in the same XML document. That is, I think choosing 
a name that includes a namespace prefix for a data-* attribute (and also 
for a class on a generic container such as a div or a span, to tell that 
it represents an external element) can replicate XML extensibility quite 
safely for custom uses, to some extent, without requiring wide support 
for it in text/html documents, since it seems that XHTML extensibility 
is not a major concern, at least not enough to be worth merging it into 
HTML.


Just an idea, though.

However, AIUI, the current XML serialization (XHTML5) allows the use of 
namespaces and prefixed attributes, so couldn't a proper namespace be 
introduced for RDFa attributes, so that they can be used, if needed, in 
XHTML5 documents? I think that might be a valuable choice, because it 
seems to me RDFa attributes can be used to address cases where 
metadata must stay as close as possible to the corresponding data, but a 
mistake in a piece of markup may trigger the adoption agency or foster 
parenting algorithms, possibly causing a separation between metadata 
and content, and thus possibly breaking the reliability of the gathered 
information. From this perspective, a parser stopping at the very first 
error might give quicker feedback than one rearranging misnested 
elements as far as is reasonably possible (which doesn't affect, and 
indeed improves, content presentation and users' direct experience, but 
may cause side effects with metadata).


Also, if the above is true, using namespaced and prefixed attributes 
instead of ones laying in the same namespace shared both by html5 and by 
xhtml5 (in theory) might prevent the use of such metadata in a document 
whose parsing rules might lead to possible side-effects.


Such results have been used to suggest that poorly implemented 
features should be dropped, but this hypothetical case suggests to me 
that the argument is wrong, and that if in the face of reasons why the 
data would be bad people use them, one might expect better usage by 
formalising the status of such features and getting decent 
implementations.




Generally speaking, I think reasoning in terms of "poor implementation" 
vs "rare usage" is much like a dog chasing its own tail, 
because poorly implemented features are necessarily rarely used, and 
rarely used features can't convince UA developers to implement them (in 
general). But, if a feature is widely needed, several hacks may be born, 
thus providing evidence of a global problem to be solved in a certain 
manner by implementing a certain, well-conceived feature.


As far as I've understood it, that's the main guideline for changing the 
current specification, which is moving on the basis of a bullet-tracing 
evolution (perhaps weighted on the need for completely new features, as 
a balance between the need

Re: [whatwg] Trying to work out the problems solved by RDFa

2009-01-08 Thread Calogero Alex Baldacchino

Charles McCathieNevile wrote:
On Mon, 05 Jan 2009 01:21:33 +1100, Henri Sivonen hsivo...@iki.fi 
wrote:

On Jan 2, 2009, at 14:01, Benjamin Hawkes-Lewis wrote:

On 2/1/09 10:38, Henri Sivonen wrote:



Is the problem in the case of recipes that the provider of the page
navigation around the recipe is unwilling to license the navigation 
bits under the same license as the content proper?


I thought Toby's example was that each recipe on the page needed a 
different licence, rather than a distinction between the main 
content area and the navigation.


Oh. That can be solved by giving each recipe its own URI  HTML page 
and scraping those pages instead of summary pages that might contain 
multiple recipes.


Sure. In which case the problem becomes "doing mashups where data 
needs to have different metadata associated is impossible", so the 
requirement is "enable mashups to carry different metadata about bits 
of the content that are from different sources".


A use case for this:

There are mapping organisations and data producers and people who take 
photos, and each may place different policies. Being able to keep that 
policy information helps people with further mashups avoiding 
violating a policy.


For example, if GreatMaps.com has a public domain policy on their 
maps, CoolFotos.org has a policy that you can use data other than 
images for non-commercial purposes, and Johan Ichikawa has a photo 
there of my brother's café, which he has licensed as "must pay money", 
then it would be reasonable for me to copy the map and put it in a 
brochure for the café, but not to copy the data and photo from 
CoolFotos. On the other hand, if I am producing a non-commercial guide 
to cafés in Melbourne, I can add the map and the location of the café 
photo, but not the photo itself.




It seems a scenario where a human should carefully evaluate each licence 
and perhaps put careful, human-readable prose into the mashed-up 
page, or a link to such prose. Metadata may or may not be accurate 
(e.g. may be misplaced and not contain the whole license, or refer to 
the wrong kind of license, different from the one stated in the prose), 
but the whole prose (and perhaps only that) is legally binding for sure 
(I'm not aware of any international law recognizing metadata and/or 
machine-processable/machine-friendly extracted content as a valid legal 
agreement/notice). In your example, Johan Ichikawa might put the "must 
pay money" license in a span containing a metadata reference to a 
Creative Commons license, but only the "must pay money" license is 
surely valid as a legal notice, as far as I can tell.




Another use case:
My wife wants to publish her papers online. She includes an abstract 
of each one in a page, but because they are under different copyright 
rules, she needs to clarify what the rules are. A harvester such as 
the Open Access project can actually collect and index some of them 
with no problem, but may not be allowed to index others. Meanwhile, a 
human finds it more useful to see the abstracts on a page than have to 
guess from a bunch of titles whether to look at each abstract.





I'm not strongly for one solution or the other in this case (an actual 
choice may depend on several considerations, such as the harvesters' 
reputation, or the need to use metadata anyway for private purposes), 
but this case might be addressed by embedding each abstract in an 
iframe, so that human users would get all of them in a single page, 
while a harvester would need to navigate to each page to index/copy it. 
Proper metadata might be put into each page, or each page might have a 
different rule to restrict access (e.g. through a robots file, or the 
Access-Control semantics, or any kind of white- or blacklist), 
especially to prevent a malicious harvester (that is, one deliberately 
ignoring metadata and licenses) from accessing certain content.
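
For the robots-file option, here is a minimal Python sketch using the 
standard urllib.robotparser module. The rules, the example.org URLs and 
the "ExampleHarvester" agent name are invented for illustration. Note 
that robots rules only bind harvesters that choose to honour them, so a 
truly malicious one would still need server-side access control.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site with one abstract page per paper:
# pages under /abstracts/restricted/ must not be harvested.
ROBOTS_TXT = """\
User-agent: *
Disallow: /abstracts/restricted/
"""


def may_harvest(url, agent="ExampleHarvester"):
    """Return True if a cooperating harvester is allowed to fetch the URL."""
    rules = RobotFileParser()
    rules.parse(ROBOTS_TXT.splitlines())
    return rules.can_fetch(agent, url)


print(may_harvest("http://example.org/abstracts/open/paper1.html"))
print(may_harvest("http://example.org/abstracts/restricted/paper2.html"))
```

With one page per abstract, each paper's copyright rule maps onto a 
plain path rule, and the summary page can still show every abstract to 
human readers through iframes.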


WBR, Alex





Re: [whatwg] Trying to work out the problems solved by RDFa

2009-01-03 Thread Calogero Alex Baldacchino

Charles McCathieNevile wrote:

The results of the first set of Microformats efforts were some pretty
cool applications, like the following one demonstrating how a web
browser could forward event information from your PC web browser to
your phone via Bluetooth:

http://www.youtube.com/watch?v=azoNnLoJi-4


It's a technically very interesting application. What has the adoption
rate been like? How does it compare to other solutions to the problem,
like CalDav, iCal, or Microsoft Exchange? Do people publish calendar
events much? There are a lot of Web-based calendar systems, like
MobileMe or WebCalendar. Do people expose data on their Web page that
can be used to import calendar data to these systems?


In some cases this data is indeed exposed to Webpages. However, 
anecdotal evidence (which unfortunately is all that is available when 
trying to study the enormous collections of data in private intranets) 
suggests that this is significantly more valuable when it can be done 
within a restricted access website.


...

In short, RDFa addresses the problem of a lack of a standardized
semantics expression mechanism in HTML family languages.


A standardized semantics expression mechanism is a solution. The lack 
of a solution isn't a problem description. What's the problem that a
standardized semantics expression mechanism solves?


There are many many small problems involving encoding arbitrary data 
in pages - apparently at least enough to convince you that the data-* 
attributes are worth incorporating.


There are many cases where being able to extract that data with a 
simple toolkit from someone else's content, or using someone else's 
toolkit without having to tell them about your data model, solves a 
local problem. The data-* attributes, because they do not represent a 
formal model that can be manipulated, are insufficient to enable 
sharing of tools which can extract arbitrary modelled data.




That's because the data-* attributes are meant to create custom models 
for custom use cases not (necessarily) involving interchange and (let me 
say) agnostic extraction of data. However, data-* attributes might be 
used to emulate support for RDFa attributes, so that each one might be 
mapped to, let's say, a "data-rdfa-attribute" one and vice versa (I 
don't think "data-rdfa-about" vs "about" would make a great difference, 
at least in a test phase, since it wouldn't be much different from 
"rdfa:about", which might be used to embed RDFa attributes in some 
XML language (e.g. an external markup embedded in an XHTML document 
through the extension mechanism)).


It seems there are several problems which may be addressed by RDFa 
(besides other, more custom models) for organization-wide internal use 
and intranet publication, without the explicit requirement of external 
interchange. So, when both HTML5-specific features and RDFa attributes 
are felt to be necessary, it shouldn't be too difficult to create a 
custom parser, conforming to the RDFa spec and making use of data-* 
attributes, to be plugged into a certain browser supporting HTML5 (and 
data-*), first for internal tests and then exposed to the community, so 
that html5+rdfa can be tested on a wider scale (especially once similar 
parsers are provided for all main browsers), looking for widespread 
adoption to point out an effective need to merge RDFa into the HTML5 
spec (or to standardize an approach based on data-* attributes).


That is, since RDFa can be emulated somehow in HTML5 and tested 
without changing the current specification, perhaps there isn't a strong 
need for an early adoption of the former, and instead an emulated 
merger might be tested first within the current timeline.



What is the cost of having different data use specialised formats?


If the data model, or a part of it, is not explicit as in RDF but is 
implicit in code made to treat it (as is the case with using scripts 
to process things stored in arbitrarily named data-* attributes, and 
is also the case in using undocumented or semi-documented XML formats), 
it requires people to understand the code as well as the data model in 
order to use the data. In a corporate situation where hundreds or tens 
of thousands of people are required to work with the same data, this 
makes the data model very fragile.




I'm not sure RDF(a) solves such a problem. AIUI, RDFa just binds (XML) 
properties and attributes (in the form of CURIEs) to RDF concepts, 
modelling a certain kind of relationship, whereas it relies on external 
schemata to define such properties. Any undocumented or semi-documented 
XML format may lead to misuse and, thus, to unreliably modelled data, 
and it is not clear to me how just creating an explicit relationship 
between properties is enough to ensure that a property really represents 
a subject and not a predicate or an object (in its wrongly documented 
schema), if the problem is the correct definition of the properties 
themselves. Perhaps it is enough to parse them, 

Re: [whatwg] Trying to work out the problems solved by RDFa

2009-01-03 Thread Calogero Alex Baldacchino

Dan Brickley wrote:

On 3/1/09 14:02, Julian Reschke wrote:

Tab Atkins Jr. wrote:

The most successful alternative is nothing at all. ^_^ We can
extract copious data from web pages reliably without metadata, either
using our human senses (in personal use) or natural-language-based
processing (in search engine use). It has not yet been established
that sufficient and significant enough problems *exist* to justify a
solution, let alone one that requires an addition to html. That is
what Ian is specifically looking for.


That's what you and Ian claim. Many disagree.


My main problem with the natural language processing option is that it 
feels too close to waiting for Artificial Intelligence. I'd rather add 
6 attributes to HTML and get on with life.


But perhaps a more practical concern is that it unfairly biases things 
towards popular languages - lucky English, lucky Spanish, etc., and 
those that lend themselves more to NLP analysis. *The Web is for 
everyone*, and people shouldn't be forced to read and write English to 
enjoy the latest advances in *Web automation*. Since HTML5 is going 
through W3C, such considerations need to be taken pretty seriously.




My concern is: is RDFa really suitable for everyone and for Web 
automation? My own answer, at first glance, is no. That's because RDF(a) 
can perhaps address nicely very niche needs, where determining how much 
data can be trusted is not a problem, but in general misuses AND 
deliberate abuses may harm automation heavily, since an automaton is 
unlikely to be able to understand whether metadata express the real 
meaning of a web page or not (without a certain degree of AI).


If an external mechanism is needed to determine the trust level for 
metadata, that is, to establish when an automation's results are good or 
bad, such a mechanism may involve human beings at some stage, thus 
breaking automation (this is somewhat similar to the problem of defining 
an oracle machine described by Turing, according to whom such a 
machine isn't an automaton).


On the other hand, a very custom model designed for very custom needs (and 
not requiring wide support) may be less prone to abuse, since it's 
unlikely to find someone willing to cheat himself. Thus, having third 
parties agree on a certain model and related APIs, and implement the APIs 
on their own sides, might be more reliable in some cases (anyway, the third 
parties should agree that their respective metadata are reliable and find a 
way to verify that they really are).


Dan Brickley wrote:

On 3/1/09 16:54, Håkon Wium Lie wrote:

Also sprach Dan Brickley:

My main problem with the natural language processing option is that it
feels too close to waiting for Artificial Intelligence. I'd rather add 6
attributes to HTML and get on with life.

:-)


Another thought re NLP. RDFa (and similar, ...) are formats that can 
be used for writing down the conclusions of NLP analysis. For example 
here see the BBC's recent Muddy Boots experiment, using DBPedia 
(Wikipedia in RDF) data to drive autoclassification / named entity 
recognition. So here we can agree with Ian and others that text 
analysis has much to offer, and still use RDFa (or other semantic 
markup - i'll sidestep that debate for now) as a notation for marking 
up the words with a machine-friendly indicator of their NLP-guessed 
meaning.


http://www.bbc.co.uk/blogs/journalismlabs/2008/12/muddy_boots.html


Personally, I think the 'class' attribute may still be a more
compelling option in a less-is-more way. It already exists and can
easily be used for styling purposes. Styling is bait for authors to
disclose semantics.


I'm sure there's mileage to be had there. I'm somehow incapable of 
writing XSLT so GRDDL hasn't really charmed me, but 'class' certainly 
corresponds to a lot of meaningful markup. Naturally enough it is 
stronger at tagging bits of information with a category than at 
defining relationships amongst the things defined when they're 
scattered around the page. But that's no reason to dismiss it entirely.


Did you see the RDF-EASE draft, 
http://buzzword.org.uk/2008/rdf-ease/spec? From which comes: Ten 
second sales pitch: CSS is an external file that specifies how your 
document should look; *RDF-EASE is an external file that specifies 
what your document means.*


RDF-EASE uses CSS-based syntax. More discussion here, 
http://lists.w3.org/Archives/Public/semantic-web/2008Dec/0148.html 
including question of whether it ought to be expressed using 
css3-namespace, 
http://lists.w3.org/Archives/Public/semantic-web/2008Dec/0175.html


cheers,

Dan

--
http://danbri.org/



My question is: how often can I trust that such a file specifies what your 
document really means, without evaluating its content?


I'd distinguish two cases (not pretending to make a complete classification),

- The semantics described by metadata is used for server-side 
computations: there's no need to evaluate content (since I'm trusting to 
you when 

Re: [whatwg] Trying to work out the problems solved by RDFa

2009-01-03 Thread Calogero Alex Baldacchino

Toby A Inkster wrote:

Calogero Alex Baldacchino wrote:


My concern is: is RDFa really suitable for everyone and for Web
automation? My own answer, at first glance, is no. That's because RDF(a)
can perhaps address nicely very niche needs, where determining how much
data can be trusted is not a problem, but in general misuses AND
deliberate abuses may harm automation heavily


If your agent isn't going to trust the data gleaned from RDFa, then 
why should it trust the data gleaned from the web page's natural 
language? If the page has been authored by a reprobate that cannot be 
trusted to put honest and correct data in a few RDFa attributes, why 
should we trust their prose text?




If you sell computers but your site talks about cars, I'll never buy a 
notebook from you; thus you're not cheating me, but yourself, and 
damaging your business. But if one believes cars are searched for more often 
than computers (just an example), one may use false metadata to cheat 
any UA relying on metadata instead of prose, and take me to a store 
selling computers instead of cars.


Reliability of metadata (with respect to the described data) is an issue 
separate from reliability of content: it's not up to any UA to 
understand AND filter content based on whether the author can be trusted 
to be telling the truth (that would be a form of censorship), but if I ask the 
UA to bring me a page talking about horses, I don't want it to bring me 
a page talking about v.i.a.g.r.a. (that's spam), thus it is up to any UA 
relying on metadata to understand AND filter them based on their 
reliability.


An oft-quoted answer is that the prose text is visible whereas the 
RDFa is somehow invisible. Apart from the fact that UIs which make 
use of data pulled in from RDFa will make this data visible, there is 
also the fact that RDFa, unlike an external RDF/XML file, or some 
metadata embedded in a script block, makes use of as much visible 
data as possible: visible links, visible text, etc.


<p>My name is <span property="foaf:name"
  about="#me">Toby Inkster</span>.</p>

If you can't trust someone to correctly mark up what their name is, 
then why trust them to mark up what deserves emphasis? Why believe 
the address they provide? What if the instance they marked up with 
<dfn> is not really the defining one? What if a <var> is really a 
constant?




I don't really need proper markup to understand that a name is a name, a 
variable is a variable, a definition is a definition, and so on; you can 
use plain text and I'll understand your content the same way. If one 
makes a mistake when combining a <dfn> with an anchor, the result may be 
a broken link, perhaps making me look for a better site. If one is 
misusing <var> or <em>, the worst possible consequence is a bad 
presentation. A bad presentation can be an attempt to cheat a UA (as 
when people put a lot of keywords in a page and style them with the 
same color as the background to cheat search engines), but only 
if it is a deliberate choice, not a misuse (and I'm concerned mainly with 
abuses) -- anyway, it is easier to cheat a UA by means of false 
metadata than to cheat a human by means of wrong markup.


If some markup is like,

<p>We sell <a href="www.cheatingcarseller.com" property="foaf:name" 
content="Toby Inkster">cars</a></p>


in any advertisement, I'll notice it's about cars and I'll choose 
whether to follow it or not, based on my interest at the moment, but if 
I query "Toby Inkster" in a semantic UA blindly relying on metadata, I 
might get a page of a car webstore instead of your homepage (for instance).


Furthermore, I started my replies from a mail by Charles McCathieNevile 
explicitly talking about trusted data and (mainly) small use cases, not 
wide-scale web automation. If there's no agreement about what kind of 
needs are best addressed by RDFa, maybe I have to agree with people 
saying that the technology must grow and become more mature (or, at least, 
better understood) before it is merged into the HTML5 specification (and 
2023 is far enough away to accomplish such a goal :-) ). And I repeat my 
suggestion to map RDFa attributes to data-rdfa-* attributes and build 
RDFa processor plugins for the most common browsers, to test HTML5 and RDFa 
convergence on a wider scale before having browsers natively support 
RDFa in HTML5 documents (for the purpose of a test - but not only - I 
don't think "data-rdfa-property" vs "rdfa:property" vs "property" would 
be much of a problem).
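
To make the data-rdfa-* suggestion a bit more concrete, here is a minimal sketch in Python of what the first stage of such a plugin might do: walk the markup, pick up data-rdfa-* attributes and report them under the plain RDFa attribute names. The prefix, the class name and the helper function are all assumptions made up for this illustration; nothing here is a real RDFa processor, and it does not build triples.

```python
from html.parser import HTMLParser

# Hypothetical prefix for the experiment; it is not defined by any spec.
PREFIX = "data-rdfa-"


class DataRdfaCollector(HTMLParser):
    """Collect data-rdfa-* attributes and expose them under RDFa names."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (tag name, {rdfa attribute: value})

    def handle_starttag(self, tag, attrs):
        # Strip the prefix so "data-rdfa-property" is reported as "property".
        mapped = {
            name[len(PREFIX):]: value
            for name, value in attrs
            if name.startswith(PREFIX)
        }
        if mapped:
            self.found.append((tag, mapped))


def collect_rdfa(markup):
    """Return the RDFa-style attributes found in a piece of markup."""
    parser = DataRdfaCollector()
    parser.feed(markup)
    parser.close()
    return parser.found


sample = (
    '<p>My name is <span data-rdfa-property="foaf:name" '
    'data-rdfa-about="#me">Toby Inkster</span>.</p>'
)
print(collect_rdfa(sample))
```

A real test plugin would then hand the mapped attributes to an existing RDFa processing algorithm; the point is only that the renaming step is mechanical.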


I'm not saying RDFa is a bad thing, or that it is useless, I just don't think 
any kind of markup can perfectly fit the semantics of random content 
for the purposes of a global, wide-scale and automatic classification 
of content.


Best regards,
Alex



Re: [whatwg] Spellchecking mark III

2008-12-30 Thread Calogero Alex Baldacchino

Robert O'Callahan wrote:

2008/12/31 Giovanni Campagna <scampa.giova...@gmail.com>


2008/12/30 timeless <timel...@gmail.com>

On Tue, Dec 30, 2008 at 5:20 PM, Kornel Lesiński
<kor...@geekhood.net> wrote:
 It's useful for fields that contain non-textual content,
 e.g. product ID, license plate number, CAPTCHA answer, etc.
 Browser would mark these as misspelt, which might be
 confusing or at least distracting.

this sounds like something browser vendors need to worry about on
their own and is not a reason to let web pages do anything
about it.


maybe we could just say that spellchecking is disabled when type
is not text (for email, uri and number you have validation) and
when a pattern attribute is specified


That handles some cases, but not others --- e.g. text boxes that 
contain program code.


Rob
--
He was pierced for our transgressions, he was crushed for our 
iniquities; the punishment that brought us peace was upon him, and by 
his wounds we are healed. We all, like sheep, have gone astray, each 
of us has turned to his own way; and the LORD has laid on him the 
iniquity of us all. [Isaiah 53:5-6]

Indeed, that's a valid use case. Anyway, I don't think such a spec 
should or _would_ prevent UAs from giving users a chance to bypass the 
'spellcheck' attribute (e.g., such an attribute may override a UA 
default value, as spec'ed out, but the user may be notified of it, and a 
UA context menu option may allow a different setting, just as a last resort 
in case of misuse or error, as in the example of 'spellcheck="false"' 
applied to a box containing some code).


The language to check might be chosen from several sources, such as the 
'lang' attribute of the contenteditable element itself, if different 
from the document language. For instance, a blog editor's interface 
document might not be translated into a certain language, while still allowing 
content creation in that language and giving the author a chance to set 
the proper language for a spell checker by changing (through script) the 
editor box language.
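
The precedence suggested in the two paragraphs above can be sketched in Python. This is only an illustration: the function names, the argument shapes and the idea of passing the ancestor chain as a list are assumptions made for the example, not something taken from the HTML5 draft or from a real UA.

```python
def should_spellcheck(ua_default, author_attr=None, user_choice=None):
    """Decide whether to spell-check an editable box.

    Assumed precedence: an explicit user choice (e.g. from a context menu)
    beats the author's spellcheck attribute, which beats the UA default.
    """
    if user_choice is not None:
        return user_choice
    if author_attr is not None:
        return author_attr
    return ua_default


def checking_language(lang_chain, document_lang):
    """Pick the dictionary language for an editable box.

    lang_chain holds the lang values found from the editable element
    outwards (None where an element sets no lang); the first one set wins,
    otherwise the document language is used.
    """
    for lang in lang_chain:
        if lang:
            return lang
    return document_lang


# A code box marked spellcheck="false" by the author, re-enabled by the user.
print(should_spellcheck(True, author_attr=False, user_choice=True))
# An editor box whose lang was set by script, inside an English interface.
print(checking_language(["it", None], "en"))
```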


A possible evolution, if required over time, might involve a further 
attribute referencing an external dictionary file, perhaps in a standard 
format, or in a format a UA can recognize (thus, indicating 
alternatives), and using the 'spellcheck' attribute when no appropriate 
language/dictionary can be specified, or to say that just the specified 
dictionary/dictionaries must be used.


Best Regards,
Alex




Re: [whatwg] Spellchecking mark III

2008-12-30 Thread Calogero Alex Baldacchino

Calogero Alex Baldacchino wrote:



The language to check might be chosen from several sources, such as 
the 'lang' attribute of the contenteditable element itself, if 
different from the document language. For instance, a blog editor's 
interface document might not be translated in a certain language, 
whereas allowing content creation in that language and giving the 
author a chance to set the proper language for a spell checker by 
changing (through script) the editor box language.




Or, perhaps, the editor interface might be negotiated based on the 
author's language settings, but he/she might be interested in writing 
content in a foreign language, thus wishing for spellchecking in that 
language (if allowed by a UA's capabilities).


Best Regards,
Alex




Re: [whatwg] Thoughts on video accessibility

2008-12-27 Thread Calogero Alex Baldacchino

Silvia Pfeiffer wrote:

Hi Ian,

Thanks for taking the time to go through all the options, analyse and
understand them - especially on your birthday! :-) Much appreciated!
  


Then, happy birthday to Ian!


[...]
The only real issue that we have with separate files is that the
captions may get lost when people download the video, store it
locally, and share it with friends. Maybe we should consider solving
this differently. Either we could encapsulate into the video container
upon download. Or we could create a zip-file or tarball upon download.
I'd just find it a big mistake to ignore the majority use case in the
standard, which is why I proposed the <text> elements inside the
<video> tag.

[...]


A passing thought: why not also consider a further option for 
embedding everything in a sort of all-in-one HTML page generated on 
the fly when downloading, making it a global container for video and 
text to be consumed by UAs (while maintaining the opportunity to 
download a video as a separate file, of course)? For instance, the video 
itself might become the base64-encoded (or otherwise acceptably encoded) 
value of a data-* attribute (or a more specific attribute) to be decoded 
by a script (also generated on the fly) and served to the video 
engine as a javascript: URL in place of the video src (or, perhaps 
better, the UA might do that itself by supporting the data: scheme 
as a valid source for the video, or a fragment identifier pointing to an 
element following the </video> tag, perhaps a <plaintext> or something else, 
and containing the encoded video); while <text> elements might wrap the 
corresponding timed text file, to be embedded into the page as bare 
text, similarly to script code -- if a certain format contained <text> 
tags, those might be changed into &lt;text&gt; or similar (or perhaps 
the file content might be encoded as well) to avoid conflicts with HTML 
tags.
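
As a rough illustration of the "all-in-one page" idea, the following Python sketch builds such a page from raw video bytes and a timed-text string. It assumes the UA accepts a data: URL as the video source, and it uses a plain <pre> element with a made-up id as the caption container; both choices, like the function name, are assumptions for the example only.

```python
import base64
import html


def build_all_in_one_page(video_bytes, mime_type, captions):
    """Return an HTML page embedding the video and its timed text.

    The video is base64-encoded into a data: URL; the captions are
    escaped so any <text>-like tags become &lt;text&gt; and cannot
    clash with HTML tags.
    """
    encoded = base64.b64encode(video_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"
    safe_captions = html.escape(captions)
    return (
        "<!DOCTYPE html>\n"
        f'<video src="{data_url}" controls></video>\n'
        # Hypothetical container for the embedded timed text.
        f'<pre id="captions">{safe_captions}</pre>\n'
    )


page = build_all_in_one_page(b"abc", "video/ogg", "<text>hi</text>")
print(page)
```

Base64 grows the payload by roughly a third, and the whole attribute value has to be read before playback can start, which matches the seeking and decoding concerns raised for this idea.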


Of course, it's a first-glance idea, and needs further consideration 
of its reliability (e.g. such an HTML page perhaps shouldn't be the 
source set for a video in another page, and an option should be provided 
to extract embedded content; seeking might require sequential decoding 
to reach a desired point, and so on).


Regards, Alex




Re: [whatwg] /html with omitted tags

2008-12-26 Thread Calogero Alex Baldacchino

Philip Taylor wrote:

I can start with a simple document that's probably conforming and that
the validator doesn't complain about:

  <!DOCTYPE html><html><head><title></title></head><body></body></html>

Then I can read the "Writing HTML documents: Optional tags" section, which says:

  A head element's end tag may be omitted if the head element is not
immediately followed by a space character or a comment.

  A body element's start tag may be omitted if the first thing inside
the body element is not a space character or a comment, except if the
first thing inside the body element is a script or style element.

  A body element's end tag may be omitted if the body element is not
immediately followed by a comment.

So I choose to omit the </head><body></body> because I think those
rules say I can do so. I get:

  <!DOCTYPE html><html><head><title></title></html>

But now I get a parse error, which I think is because the </html>
comes in the "in head" insertion mode and is "Any other end tag: Parse
error. Ignore the token.", so something seems wrong.

  


AIUI, omitting those closing tags is a parse error anyway, but in 
certain situations the parser can fix the code automatically because the 
state to enter/remain in is unambiguous. Thus a validator reports a 
parse error, while a browser keeps the error internal and handles it 
when possible.


AIUI, a browser would notice the error but might ignore it since both in

<!DOCTYPE html><html><head><title></title></html>

and in

<!DOCTYPE html><html><head><title></title></head><body></body></html>

there is nothing to show but the background color of the html root 
element as provided by default style sheets (that is, they're 
equivalent), whereas finding a displayable start tag (like a <p>) would 
lead (roughly) to automatic insertion of </head> and <body> and to 
reprocessing the tag in the "in body" insertion mode.


Indeed, section 8.1 says,

'This section only applies to documents, authoring tools, and markup 
generators. In particular, it does not apply to conformance checkers; 
conformance checkers must use the requirements given in the next section 
("parsing HTML documents").'


thus section 8.1.2.4 "Optional tags" does not apply to validators. ;-)

[ space characters and comments may be handled correctly anyway, but 
omitting the </head> tag would result in a different document tree, thus 
authors and authoring tools should take care that


<!DOCTYPE html><html><head><title></title><!-- this is a comment --></html>

and

<!DOCTYPE html><html><head><title></title></head><!-- this is a comment --></html>

are not fully equivalent ]
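
The three optional-tag rules quoted from the draft earlier in this message can be written down as small predicates, which makes it easier to see why the two documents above differ. This is a sketch of the rules exactly as quoted in this thread, not of the parser and not of any later revision of the spec; the function names and the string labels ("space", "comment", and so on) are assumptions made for the example.

```python
def may_omit_head_end_tag(next_thing):
    """</head> may go unless a space character or comment follows the head.

    next_thing is "space", "comment", an element name, or None when
    nothing follows.
    """
    return next_thing not in ("space", "comment")


def may_omit_body_start_tag(first_child):
    """<body> may go unless the body starts with space, comment, script or style."""
    return first_child not in ("space", "comment", "script", "style")


def may_omit_body_end_tag(next_thing):
    """</body> may go unless a comment immediately follows the body."""
    return next_thing != "comment"


# Philip's case: an empty body, nothing but </html> afterwards.
print(may_omit_head_end_tag(None),
      may_omit_body_start_tag(None),
      may_omit_body_end_tag(None))

# The comment case: a comment right after the head forbids omitting </head>.
print(may_omit_head_end_tag("comment"))
```

Under these rules as quoted, all three tags in Philip's document may be dropped, which is why the parse error on </html> looks like a mismatch between the writing rules and the parser.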




Re: [whatwg] /html with omitted tags

2008-12-26 Thread Calogero Alex Baldacchino

Geoffrey Sneddon wrote:


On 26 Dec 2008, at 17:02, Calogero Alex Baldacchino wrote:


Philip Taylor wrote:

I can start with a simple document that's probably conforming and that
the validator doesn't complain about:

 <!DOCTYPE html><html><head><title></title></head><body></body></html>

Then I can read the "Writing HTML documents: Optional tags" section, 
which says:


 A head element's end tag may be omitted if the head element is not
immediately followed by a space character or a comment.

 A body element's start tag may be omitted if the first thing inside
the body element is not a space character or a comment, except if the
first thing inside the body element is a script or style element.

 A body element's end tag may be omitted if the body element is not
immediately followed by a comment.

So I choose to omit the </head><body></body> because I think those
rules say I can do so. I get:

 <!DOCTYPE html><html><head><title></title></html>

But now I get a parse error, which I think is because the </html>
comes in the "in head" insertion mode and is "Any other end tag: Parse
error. Ignore the token.", so something seems wrong.




AIUI, omitting those closing tags is a parse error anyway, but in 
certain situations the parser can fix the code automatically because 
the state to enter/remain in is unambiguous. Thus a validator reports 
a parse error, while a browser keeps the error internal and handles 
it when possible.


The "writing HTML documents" section is meant to give what is a 
conforming HTML document, and those documents are conforming according 
to that. However, conformance checkers are meant to follow the 
parser section (and throw the parse errors that it produces), which in 
these cases differs. Therefore, either the writing section is wrong or 
the parser is wrong to throw the parse errors.



--
Geoffrey Sneddon
http://gsnedders.com/



Hmm, yeah, perhaps a </html> tag should be treated as stated for the "in 
body" insertion mode also in the "in head" and "after head" modes (or better, 
it should be treated as "anything else" for consistency with section 
8.1); this way,


<!DOCTYPE html><html><head><title></title></html>

would be treated as

<!DOCTYPE html><html><head><title></title></head><body></body></html>

without any parse error.

Otherwise, if a document missing a body element is to be considered 
non-conforming, that should be stated in section 8.1.2.4 (since currently 
it seems to be conforming).


Regards,
Alex




Re: [whatwg] Merry Christmas!

2008-12-25 Thread Calogero Alex Baldacchino

Giovanni Campagna wrote:


Probably you didn't notice, but it is 25th December today. Merry 
Christmas and Happy New Year to all members of WHAT and W3C working 
groups!


Giovanni

Merry Christmas to you and to everyone celebrating Christmas!

Happy and holy celebrations to everyone celebrating a religious 
festivity or anything else in this period!


Happy holidays to everyone having holidays in this period for whatever 
reason!


Alex




Re: [whatwg] Phrasing semantics feedback omnibus

2008-12-24 Thread Calogero Alex Baldacchino
First of all, let me state I wasn't (and am not) too strongly concerned 
about the following issues. These were either formal questions, or 
impromptu thoughts inspired by the dialog, perhaps not weighed carefully 
enough where I didn't feel strongly. I guess that was in no way clear in my 
mails. Let me also state that I'm definitely not aiming to argue, but I do 
disagree with some conclusions.


Ian Hickson wrote:


On Fri, 14 Nov 2008, Pentasis wrote:
1) Just because it makes sense to a human (it doesn't to me), does not 
mean it makes sense to a machine.


HTML is ultimately meant for human consumption, not machine consumption. 
Humans write it (sometimes with the help of a machine), humans read it 
(almost always with the help of a machine). We don't need it to make sense 
to a machine, we just need the machine to do what we tell it to so that it 
makes sense to us.





Don't you really consider the machine's role as central in this process? 
HTML is the way (= the *language*) you tell the machine what to do so it 
makes sense to human users. You've given a bare definition of a 
*computer language*, but a computer language is for machine consumption! 
HTML is for human use (= the author/web developer) but for machine (= 
the UA) consumption, the very same way C++ is for human use (= the 
programmer) but for machine (= the compiler) consumption, since both are 
computer languages; the former being a specialized language and the 
latter being a general purpose one is in no way relevant from this point of 
view, since both are computer languages *by definition* (not my own, of 
course...). Only the machine output is for human (end users) 
consumption. How is a human user supposed to consume an HTML 
document if a machine doesn't consume HTML _code_? And how is a 
machine supposed to consume HTML code if it's not designed with 
machine constraints in mind _first_ (e.g. context-freedom), and authors' 
needs second? :-)



On Tue, 25 Nov 2008, Calogero Alex Baldacchino wrote:

[...]


Could you give a concrete example? In all the examples I can think of, 
there is no problem that I can see. For example this:


   <p><b>H</b>ello!</p>

...would be fine in an AT, even if the AT went "bing" as it was saying the 
first part of the word.





What about <p><b>A</b>fter that</p>, if the "bing" followed the <b> 
content (the same way a radio advertisement speaker could read out 
"Intel Inside" followed by the usual jingle do dooDOOdooDO), wouldn't 
that end up as a sound that is difficult to understand? [for a 'bing' 
preceding the <b> content, just shifting tags inside the word causes the same 
problem] Anyway, in a following mail I agreed an AT might by default treat 
such cases as plain text, just ignoring in-word tags whose semantics may 
alter speech (but specifying that certain semantics should be applied only to 
whole words by non-visual UAs wouldn't be an awful idea, I think). 
Perhaps it wasn't clear.


However, I think that a solution, at least partial, can be found for the 
rendering concern (and I'd push for this being done anyway, since there 
are several new elements defined for HTML 5).


Which rendering concern?




The one raised vs my (impromptu and abandoned) idea of new semantic 
elements: backward compatibility with older browsers unaware of such new 
tags (it's the very same for new elements though).



[...]


Actually other than the validator, user agents ignore the DTD altogether.




[other points like the above]

I've acknowledged in other mails my assumptions were definitely wrong, 
and I apologized for that, as far as I remember (did I forget to? if so, 
I apologize now!). Then the discussion moved towards the suitability of 
a kind of foundation style sheet to handle at least new elements 
presentation, and hiding those ones whose semantics might be difficult 
to cope with in older browsers (such as a menu constrained to be a 
contextual menu: a default CSS wouldn't be enough to cope with such), as 
a graceful degradation.


[my own personal conclusion, in my humble opinion, was that the result 
might be unreliable and definitely browser-dependent -- for instance, IE 
family seems to accept a 'custom' tag with its 'custom' attributes, by 
creating a 'proper' (as far as possible) html element, and styles are 
correctly applied to the element too, BUT any content inside the unknown 
tags is extracted and put inside the outer container, as if it were 
misplaced - a partial solution, though apparently not working in IE8, 
consists of adding a script creating an element with the 'custom' tag 
name by calling document.createElement() before unknown tags are parsed, 
but such tells me a foundation style sheet is not a (fully) working 
solution _per_se_, though desirable for consistent cross-browser rendering].


Let's come to the non-typographical interpretation a present-day UA may be 
capable of, as in your example about lynx. This can be a very good 
reason to deem <small> a very good choice. But, are we sure that *every

Re: [whatwg] Phrasing semantics feedback omnibus

2008-12-24 Thread Calogero Alex Baldacchino

Ian Hickson wrote:


On Sun, 30 Nov 2008, Calogero Alex Baldacchino wrote:

[...] an activators element [...]


I encourage you to look at the <command> element in HTML5. I'm waiting for 
implementations of that before looking at access keys.





I've given a closer look to it and (more quickly) to the overall command 
architecture: that's a nice abstraction :-)


Just one thing: a note says a synthetic click doesn't perform the same 
actions as required by the click() method, but those seem suitable as 
pre-click activation steps (or post-click, if needed), possibly 
telling the UA to take note of the synthetic click's source (the user's device 
or document scripts) if that makes any difference, don't they?


And a little aside: it is said context menus should inherit, but the 
inheritance is not yet defined, so let me suggest a convergence with 
events flow: during the capture phase, showing the menu might be 
prevented by stopping the propagation of a triggering event (e.g. by the 
UA because of a user preference, or by an ancestor's handler to make 
custom menus available only under certain conditions -- in order to have 
it working, perhaps each platform-dependent method might be abstracted 
as a 'right click', as some sort of variant on synthetic clicks); at the 
target, if a menu is provided for the element, it is shown as described, 
and the triggering event propagation is stopped (treating the menu show 
as a kind of post-run default action); otherwise, while bubbling, the 
triggering event might cause a context menu being shown as soon as an 
ancestor providing one is found (then stopping the event propagation); 
if no custom contextual menu is provided in the element subtree, the UA 
takes the control and shows a proper menu at the target element (if any 
is provided by the implementation).
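
The lookup order proposed above (capture, target, bubble, then UA default) can be sketched with a toy tree in Python. The Node class, its attributes and the default menu label are all invented for this illustration; it models only the resolution order, not real DOM event dispatch.

```python
class Node:
    """Toy element: may offer a context menu or suppress menus below it."""

    def __init__(self, name, parent=None, menu=None, blocks_menu=False):
        self.name = name
        self.parent = parent
        self.menu = menu                # custom menu offered by this element
        self.blocks_menu = blocks_menu  # ancestor stops the triggering event


def resolve_context_menu(target, ua_default_menu="UA default menu"):
    """Return the menu to show for a 'right click' on target, or None."""
    # Path from the target up to the root.
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = node.parent

    # Capture phase: root first, down to the target's parent. Any ancestor
    # may stop propagation, in which case no menu is shown at all.
    for ancestor in reversed(path[1:]):
        if ancestor.blocks_menu:
            return None

    # At the target, then bubbling: the first element providing a menu wins.
    for node in path:
        if node.menu is not None:
            return node.menu

    # Nobody provided a custom menu: the UA shows its own.
    return ua_default_menu


body = Node("body", menu="page menu")
div = Node("div", parent=body)
span = Node("span", parent=div, menu="span menu")
print(resolve_context_menu(span), "/", resolve_context_menu(div))
```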


For access keys, I've never liked them much, though they exist and 
removing them might break some existing pages, but I'm sure that's been 
considered in depth (and perhaps browsers vendors will support them 
anyway, so they can be re-elaborated in due course).


Personally, I think key events are more flexible than access keys, but 
are affected by the same platform and browser dependence, which might 
be mitigated by defining a few properties exposing the default 
modifiers; I'm not sure if that's an issue for DOM Events or if an 
HTML-specific interface should be defined to be implemented alongside other 
DOM Events interfaces, since HTML 5 specifies interfaces 
that somehow hook into the platform hosting the document (such as Window, 
where an attribute might list the default modifiers, while a boolean on 
an event-specific interface might tell whether access modifiers have 
been pressed -- though I guess an HTML5-side 
DOM solution would need key events to stabilize).


About my half-proposal (no more than half): it was prompted by 
timeless' ideas on generic commands triggered by actions that a user 
might customize AND carry from one UA to another. So I thought, on the 
fly, of something that might be consistent cross-browser and could 
coexist with developers' choices, through an embeddable mechanism: 
either a part of the document, or a separate document to be linked, or 
a default provided by the UA, with some sort of cascading (or 
precedence) rules along the same lines as CSS; perhaps this might be 
considered as a later evolution. The part I considered most valuable 
was what I called (with a temporary name) a 'mousebehavior', describing 
linear movements of a pointing device. Actions might be as simple as a 
succession of movements in different directions (right, left, up and 
down), each detected as the coordinate difference between a 'start' and 
an 'end' point (detecting one direction at a time). I guess this might 
help with certain disabilities, in conjunction with a pointing device 
able to 'rectify' jagged movements: it would pick the start point as a 
mean value within a certain range (the same way a mouse driver treats a 
'mouse up' as a 'click' if the 'mouse down' happened within a few 
pixels), and the end point as the mean value in a range where the user 
rests for a certain interval before moving again.
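
To make the idea a bit more concrete, here is a rough sketch (Python, 
only for illustration). All names and the 10-pixel threshold are my own 
assumptions, not part of any proposal or spec: each "rest" of the 
pointer is given as a cluster of jittery samples, the cluster is 
averaged, and each pair of consecutive averaged points is classified as 
one direction.

```python
from statistics import mean
from typing import List, Tuple

Point = Tuple[float, float]


def mean_point(samples: List[Point]) -> Point:
    """Average a cluster of nearby samples to smooth out jitter."""
    xs, ys = zip(*samples)
    return (mean(xs), mean(ys))


def direction(start: Point, end: Point, threshold: float = 10.0) -> str:
    """Classify one linear movement; 'none' if it is shorter than threshold.

    Screen coordinates are assumed, so y grows downwards.
    """
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    if max(abs(dx), abs(dy)) < threshold:
        return "none"
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"


def gesture(rest_clusters: List[List[Point]]) -> List[str]:
    """Turn a sequence of rest clusters into a list of directions."""
    points = [mean_point(cluster) for cluster in rest_clusters]
    moves = [direction(a, b) for a, b in zip(points, points[1:])]
    return [move for move in moves if move != "none"]
```

How the samples are grouped into rest clusters (the "certain interval" 
above) is left out on purpose; that is exactly the part a device driver 
or UA would have to tune.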


Best regards, and happy holidays to everyone (if having holidays in this 
period)

Alex




[whatwg] A few hints on html5 - part 1

2008-12-16 Thread Calogero Alex Baldacchino
Let me suggest a few hints on the html5 specs; some may be minor or 
less important, others might be useful for a later version of these 
specifications. Let me also apologize if the following points have 
already been discussed and I missed those discussions, or if I've 
misunderstood any part of the specs. -- This was a longer message, but 
the list bot refused it, so I'm splitting it into a few messages (hence 
the subject: part 1, etc.)


First, perhaps the least relevant: in the "Script execution contexts" 
section a request is made for a pair of terms other than "with/without 
script". OK, let me suggest "scriptable/unscriptable" or 
"reachable/unreachable by script", for instance. Just a simple hint, no 
more.


That suggests to me a possible (partial) solution to the events 
section's question about event handling for non-active or 
browsing-context-less documents: since script execution is not allowed 
in such situations, we could state that any event intended exclusively 
for script interaction should never fire, unless some valid reason 
arises to let the event fire and be dispatched to the corresponding 
handler(s), in which case the whole mechanism for deciding whether 
script execution must be allowed should be revisited. Otherwise, if any 
script-related resources are meant to be kept alive in a somewhat 
frozen state (for example, in a previously active document, a 
connection buffer holding the last received, not yet processed content, 
the connection itself to be re-established, or its status), then any 
related event could fire and be "frozen" in a pending state, or just be 
"frozen" or "pending", meaning in a before-firing state, ready to fire 
(and be dispatched) as soon as the document enters a scriptable state 
(i.e. becomes active or gains a browsing context).
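
A minimal sketch of the "frozen until scriptable" idea follows (Python, 
only for illustration). The Document class, its method names and the 
use of plain strings as events are all assumptions of mine, not 
anything defined by the draft.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


class Document:
    """Toy document that freezes script events while it is not scriptable."""

    def __init__(self) -> None:
        self.scriptable = False
        self._pending: Deque[str] = deque()
        self._handlers: Dict[str, List[Callable[[str], None]]] = {}

    def add_handler(self, event_type: str, handler: Callable[[str], None]) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    def _dispatch(self, event_type: str) -> None:
        for handler in self._handlers.get(event_type, []):
            handler(event_type)

    def fire(self, event_type: str) -> bool:
        """Dispatch now if scriptable, otherwise keep the event pending."""
        if not self.scriptable:
            self._pending.append(event_type)
            return False
        self._dispatch(event_type)
        return True

    def set_scriptable(self) -> None:
        """The document becomes active: replay frozen events in order."""
        self.scriptable = True
        while self._pending:
            self._dispatch(self._pending.popleft())
```

The alternative policy (discarding instead of freezing) would simply 
drop the event in fire() rather than appending it to the pending queue.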


Furthermore, the event loop and task queue definitions suggest to me 
that some user agent could implement a sort of all-in-one mechanism to 
handle together (maybe for improved interaction?) both 
implementation-related and script-related events, i.e. queueing both 
types of event (or the related tasks) together, with some precedence 
rule between them, or even, in some cases, having the very same 
event/task first handled by the underlying implementation, then passed 
(wrapped?) to the script-specific mechanism (for instance, when a 
document, or an object inside a document, is fully loaded, a native 
load event is generated to advance/complete the document rendering, and 
then it is wrapped and sent to any script-related handler). In such a 
case, the specification could establish, for clarity's sake, that only 
implementation-related events must fire in a non-scriptable context (an 
inactive or context-less document), if meaningful for the 
implementation, while any script-related event (even the same wrapped 
event, after the underlying processing) must either be discarded (it 
does not fire) or be frozen (if applicable) for possible later 
resumption. Might such a clarification be helpful for such an 
(unrealistic? strange? possible?) implementation, in order to avoid or 
reduce confusion or possible side effects?


Anyway, such considerations might perhaps either be left to the user 
agent implementation, or be deferred to a next version of html5 specs...


For the "How do we allow non-JS event handlers?" concern, let me 
distinguish two different cases:


1) an event handler content attribute is set in the markup:
Let's assert it must conform to the ECMAScript FunctionBody production 
rules by default, unless another language is stated elsewhere as the 
default scripting language for the whole document or for a particular 
element.


As for the whole document, a meta tag could be used, such as '<meta 
http-equiv="Content-Script-Type" 
content="a_valid_scripting_mime_type" />' or the like. If the declared 
mime-type is not supported, it could default to the ECMAScript one.


For the element itself, an attribute could be added both to the markup 
and the DOM, either to declare a script language valid for all the 
script content attributes (e.g. 
'defaultscript="appropriate_mime-type"'), or to define a list of valid 
mime-types (e.g. 'acceptedscripts="first_mimetype;second_mimetype"'). In 
the latter case, for each parsed script content attribute, the first 
declared mime-type should constrain the production rules, or be skipped 
if not supported, moving to the next mime-type upon failure or after 
skipping an unsupported one; if all sequentially applied production 
rules fail, let a SYNTAX_ERR arise (or any other appropriate 
error/exception); if no mime-type is supported, the default script 
language rules are applied (unless that language was listed, and so 
already tried in the previous step), and if even this fails, let an 
appropriate error/exception (maybe SYNTAX_ERR itself) arise. For the 
sake of graceful degradation, the script content attribute whose 
production has failed could be 

[whatwg] A few hints on html5 -- part 2

2008-12-16 Thread Calogero Alex Baldacchino

About the RemoteEventTarget interface

The removeEventSource() method is provided to remove one instance of a 
source (one matching URL) per invocation, but no way is defined to know 
whether other instances are still listed, or whether the operation 
succeeded. Maybe this method could return a boolean value telling 
whether the operation was successful, so that, e.g., all matching URLs 
could be removed at once in a simple iteration calling the method until 
it returns false. A "remove all" method could be considered too.
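
Something like the following is what I have in mind (a Python sketch, 
only for illustration). The class is a toy model holding a plain list 
of URLs; the boolean return value and the remove_all() helper are the 
hypothetical additions, not what the draft currently defines.

```python
from typing import List


class RemoteEventTarget:
    """Toy model: just a list of event source URLs."""

    def __init__(self) -> None:
        self.sources: List[str] = []

    def add_event_source(self, url: str) -> None:
        self.sources.append(url)

    def remove_event_source(self, url: str) -> bool:
        """Remove the first matching URL; report whether one was found."""
        try:
            self.sources.remove(url)
        except ValueError:
            return False
        return True


def remove_all(target: RemoteEventTarget, url: str) -> int:
    """Call remove_event_source() until it returns False; count removals."""
    removed = 0
    while target.remove_event_source(url):
        removed += 1
    return removed
```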


I guess a single RemoteEventTarget can list the same remote source 
several times, to take advantage of more than one connection (maybe 
non-http) to fetch different resources and/or to ask for different 
server-side computations in parallel. However, it might be helpful to 
define either a mechanism to remove a precise source (e.g. by passing 
an index or the like, not just the URL), instead of removing sources in 
order of insertion (that is, the first one encountered is removed, as 
one might assume), or a precise selection algorithm (e.g. skipping an 
active URL). With neither precise targeting nor a precise algorithm, a 
user agent could remove the wrong URL upon request, closing, for 
instance, a connection with a pending get operation: one of a 
RemoteEventTarget's message event handlers could receive an "end" event 
and try to close its connection, but the implementation could, by 
mistake, remove a source URL used by another handler waiting for a 
response; or the method could be invoked from a piece of code outside 
any handler, making the choice even more difficult. Otherwise, an 
algorithm should be defined to switch communications from a closed 
source to another one that is still active.


Following on from the previous hint, let me suggest the following:
- a streamed event should be associated with a numerical index 
representing either the relative position (i.e. indicating it's the Nth 
occurrence) or the absolute position of the source URL in the 
RemoteEventTarget list of event sources; for this purpose, the last 
event id attribute should be considered unreliable;


- a removeEventSource() method variant is provided accepting the index 
as a second parameter;


- when removeEventSource() is invoked without the index argument (e.g. 
to iteratively remove all occurrences), the following algorithm is applied:

1) if the URL resolution fails, return false and abort these steps;
2) pick the first occurrence of the src argument in the list of event 
sources, if any;

3) if no occurrence has been found, return false and abort these steps;
4) if a "remove source as possible" task or a "remove source 
immediately" task has already been queued for src, stop execution and 
return true;

5) queue a "remove source as possible" task and return true;

- when removeEventSource() is invoked with the index argument, follow 
the previous steps but change steps 2) and 3) as follows:
2) pick the source occurrence in the list of event sources corresponding 
to the index argument and compare it with the src argument;

3) if the comparison fails, return false and abort these steps;

A "remove source as possible" task is a task delegated to remove the 
source URL from the list of event sources and to close the related 
connection as soon as any pending event is completely received and 
dispatched to every listening handler and no message has been posted to 
the remote server (otherwise it waits for the response event); a 
"remove source immediately" task is a task performing the same 
operation but without waiting for pending events: as soon as the task 
is executed, the event source is eliminated.


- a pair of removeEventSourceNow() methods is provided with the same 
characteristics as the previous ones, but queueing a "remove source 
immediately" task.


- if needed, an appropriate task source is provided.
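
The steps above can be sketched roughly as follows (Python, only for 
illustration). This is a toy model under several assumptions of mine: 
URL resolution is reduced to rejecting an empty string, step 4 is read 
literally as "one queued task per src URL", and the two task kinds 
differ only by a label, since waiting for pending events is not 
modelled at all.

```python
from typing import Dict, List, Optional, Tuple


class RemoteEventTarget:
    """Toy model of the proposed removeEventSource() steps."""

    def __init__(self) -> None:
        self.sources: List[str] = []
        # src URL -> (task kind, position in self.sources)
        self.tasks: Dict[str, Tuple[str, int]] = {}

    def remove_event_source(self, src: str, index: Optional[int] = None,
                            now: bool = False) -> bool:
        # Step 1: URL resolution (modelled as "must not be empty").
        if not src:
            return False
        # Steps 2-3: locate the occurrence, by first match or by index.
        if index is None:
            if src not in self.sources:
                return False
            position = self.sources.index(src)
        else:
            in_range = 0 <= index < len(self.sources)
            if not in_range or self.sources[index] != src:
                return False
            position = index
        # Step 4: a removal task is already queued for this src.
        if src in self.tasks:
            return True
        # Step 5: queue the removal task.
        kind = "immediately" if now else "as possible"
        self.tasks[src] = (kind, position)
        return True

    def run_tasks(self) -> None:
        """Execute queued removals, highest position first so that
        earlier deletions do not shift the remaining positions."""
        for _kind, position in sorted(self.tasks.values(),
                                      key=lambda task: task[1],
                                      reverse=True):
            del self.sources[position]
        self.tasks.clear()
```

Note that, read this way, a loop calling the method "until it returns 
false" only terminates if queued tasks get a chance to run between 
calls, which is perhaps one more detail the algorithm should settle.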

Regards, Alex




[whatwg] A few hints on html5 - part 3

2008-12-16 Thread Calogero Alex Baldacchino

About the cross-document messaging

Let's consider the following scenario. Some productivity suite (or any 
sort of collection of web applications) is made up of a few different 
top-level/auxiliary browsing contexts - let's call each one a module - 
possibly from different origins, and exploits cross-document 
communication to some extent, e.g. to delegate some computations or 
some shareable communications with a remote server; each module is 
independent and can instantiate the proper auxiliary module(s).


Here we are: as long as the modules are instantiated as auxiliary 
browsing contexts of another module (e.g. through a call to 
'window.open()'), communications are easily established, but what if a 
module is instantiated by the user as a separate top-level browsing 
context, e.g. by opening a new tab or window and recalling the module 
document from a bookmark? I'd suggest the following:


- a mechanism is established to get access, without any restriction, to 
every browsing context for which the user agent can identify a 
non-empty, non-null, non-undefined name attribute, at least with the 
ability to allow cross-origin access to the postMessage() methods. 
For instance, the specifications could clearly state that the Window 
open() method must return a reference to an existing window with the 
specified name when invoked with an empty string or null as the URL 
argument, with no security restriction (security restrictions should 
apply just to the properties of the returned window object). When more 
than one browsing context shares the same name, the current "rules for 
choosing a browsing context given a browsing context name" should apply 
to choose a first result, without checking whether the current browsing 
context is allowed to navigate that browsing context. It might be 
helpful to get instead a list of all browsing contexts with the same 
name, obtained as follows: a Window object is created as a pseudo unit 
of browsing contexts, so that each browsing context is reachable both 
by invoking the XXX4() method and by accessing the frames property; 
each browsing context is wrapped in a Window object with 1) accessible 
postMessage() methods, calling those of the wrapped window, 2) an 
accessible parent attribute referring to the grouping Window object, 
3) a self attribute referring to the wrapped object, accessible if 
access to the wrapped object is allowed by security restrictions, 
4) access denied, without any exception/error arising, to any other 
method/attribute; the first member of the group (i.e. the object 
returned by calling XXX4(0) on the grouping Window) is the wrapper for 
a Window object determined by the "rules for choosing a browsing 
context given a browsing context name" (e.g. the most recently opened, 
or focused, or the one most closely related to the browsing context 
calling open()) and is returned.


- optionally, a few postMessageToAll() methods (with about the same 
arguments as the postMessage() ones) could be considered to let any 
browsing context communicate, through its own Window interface, either 
with any other browsing context (possibly allowing communications with 
the current browsing context as source, see below), or with every 
browsing context having the same name (passed as, let's say, the first 
argument), or with every browsing context in the same domain 
(specified, let's say, as the second argument).
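
A toy model of the two points above (Python, only for illustration). 
Everything here is hypothetical: the UserAgent registry, the 
find_by_name() lookup returning the most recently opened context first 
(just one possible reading of the "rules for choosing a browsing 
context"), and post_message_to_all() with optional name and origin 
filters.

```python
from typing import List, Optional, Tuple


class BrowsingContext:
    def __init__(self, name: str, origin: str) -> None:
        self.name = name
        self.origin = origin
        self.inbox: List[Tuple[str, str]] = []  # (source origin, message)

    def post_message(self, message: str, source: "BrowsingContext") -> None:
        # The receiver always learns the origin of the sender.
        self.inbox.append((source.origin, message))


class UserAgent:
    def __init__(self) -> None:
        self.contexts: List[BrowsingContext] = []

    def open(self, name: str, origin: str) -> BrowsingContext:
        context = BrowsingContext(name, origin)
        self.contexts.append(context)
        return context

    def find_by_name(self, name: str) -> List[BrowsingContext]:
        """All contexts with this name, most recently opened first."""
        if not name:
            return []
        return [c for c in reversed(self.contexts) if c.name == name]

    def post_message_to_all(self, source: BrowsingContext, message: str,
                            name: Optional[str] = None,
                            origin: Optional[str] = None) -> int:
        """Post to every other context matching the optional filters."""
        targets = [c for c in self.contexts
                   if c is not source
                   and (name is None or c.name == name)
                   and (origin is None or c.origin == origin)]
        for target in targets:
            target.post_message(message, source)
        return len(targets)
```

The lookup grants nothing but the ability to post: the receiver still 
sees the sender's origin and can decide whether to trust the message.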


Let's consider another scenario. A site (perhaps a blog) embeds content 
from a forum (or any social network), and uses script code to connect to 
the remote server and keep its content up to date, but also to notify 
the user about any changes in other content the remote server holds as 
subscribed (this scenario can be extended to mail notifications in the 
previous example of a productivity suite, or to groupware). When the 
user navigates to other documents from the site in different browsing 
contexts, each one is aware of the others (perhaps establishing a 
connection through a call to postMessageToAll, or by getting a reference 
by name); to avoid increasing the number of connections per server, any 
subsequent document navigated as a standalone browsing context (after 
the first, or after a certain number) won't connect to the remote 
server, but will communicate with the document that has an active 
remote connection. That is: the first navigated document maintains a 
remote connection and receives notifications as remote events; if it is 
fully active, the notifications are shown to the user, otherwise a 
message is sent to any other known document capable of handling the 
notification, hoping one is fully active; the first document to become 
fully active handles the messages and notifies the other documents that 
any required operation has been performed; when the document(s) 
handling the remote events are about to become inactive (e.g. they 
unload), a message is sent to the remaining documents so they can 
decide (somehow) who's the next dispatcher.
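
A rough sketch of that hand-off (Python, only for illustration). The 
class names are made up, and the "somehow" is resolved here by an 
arbitrary rule of mine: the oldest remaining document becomes the next 
dispatcher.

```python
from typing import List, Optional


class Doc:
    def __init__(self, name: str, fully_active: bool = False) -> None:
        self.name = name
        self.fully_active = fully_active
        self.shown: List[str] = []  # notifications shown to the user


class Group:
    """Documents of one site sharing a single remote connection."""

    def __init__(self) -> None:
        self.docs: List[Doc] = []
        self.dispatcher: Optional[Doc] = None

    def join(self, doc: Doc) -> None:
        self.docs.append(doc)
        if self.dispatcher is None:
            # Only the first document opens the remote connection.
            self.dispatcher = doc

    def on_remote_event(self, notification: str) -> Optional[Doc]:
        """Show the notification in the dispatcher if it is fully active,
        otherwise forward it to the first fully active document."""
        others = [d for d in self.docs if d is not self.dispatcher]
        candidates = ([self.dispatcher] if self.dispatcher else []) + others
        for doc in candidates:
            if doc.fully_active:
                doc.shown.append(notification)
                return doc
        return None

    def unload(self, doc: Doc) -> None:
        """A document goes away; elect a new dispatcher if needed."""
        self.docs.remove(doc)
        if doc is self.dispatcher:
            self.dispatcher = self.docs[0] if self.docs else None
```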


The above could 

[whatwg] A few hints on html5 - part 4

2008-12-16 Thread Calogero Alex Baldacchino

Miscellaneous


The Window interface open method accepts a features argument for 
historical (and backward compatibility) reasons, which, as stated, has 
no actual effect. I was considering the opportunity, instead, of 
maintaining the old functionality as an alternative and redundant 
implementation of the "make application" state. That could work this 
way: any browser feature set as disabled in the features string is 
disabled and not shown in the newly opened window, BUT some element, 
clearly being part of the browser application, is provided to let the 
user enable any hidden feature (either all together, or one by one), so 
as to restore the normal application condition; when a browser 
interface component is hidden, any related key binding is freed from 
the usual capture and redirected to the window's active document, so 
that a fully standalone behaviour is transparently shown to the user 
(the reset element should never be disabled), while when that component 
is re-enabled its normal behaviour is re-established; if the 
application is going full-screen, the user is clearly advised about 
this and allowed to block the operation (if the operation is allowed, 
the reset element should become floating and maybe half-transparent -- 
I was thinking of a possible, future 2D or even 3D web-based game...).


-

The current draft provides a few overloaded methods (like the 
postMessage() variants) differing in the number, type and order of 
their arguments. A first concern could arise over the choice to 
overload functions in IDL interfaces, since some of the possible 
supported/supportable script languages might not provide such a 
feature, making implementation more difficult; however, this could be a 
minor concern, both because a script language with C-like syntax (as 
most are) usually lets functions be overloaded, one way or another, and 
because a different kind of language, not providing overloading, could 
overcome the problem by defining methods with slightly different names 
and binding them to the appropriate interface (but this might lead to a 
longer learning period and, later, to even greater difficulties if such 
names clashed with future standard names). The order and number of 
parameters could be another concern, since a script language could 
(like JavaScript does) allow function overloading by varying the number 
of passed arguments, without caring about argument types, leaving any 
checking and choice of what to do to the inner code (that's closer to a 
C++ function declaration with default arguments than to a full 
overload). This is not a real problem, but perhaps a small improvement 
in the current specs might result from changing the argument order so 
that the argument lists of an overloaded method's two variants, when 
compared, are equal for the first 'x' arguments, where 'x' is the 
length of the shorter list, since this could reduce the translation 
work the script engine must do before calling the underlying 
implementation (i.e., the arguments could be cast to their 
corresponding native types slightly more easily, without first checking 
for the right type, before calling the interface's native 
implementation - the point is: a check is likely to be done by the 
casting routine(s), so couldn't it be avoided before casting?). 
Furthermore, any language lacking overload semantics could expose just 
one method with the whole list of possible arguments, corresponding to 
the IDL-declared method with the longest list, and I think that 
defining IDL methods with some care for argument order would be a 
neater choice.
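
To illustrate the "equal for the first 'x' arguments" rule (Python, 
only for illustration): the checker below compares two argument lists, 
and post() shows how a language without overloading could expose a 
single method when the rule holds. The signatures are hypothetical 
examples of mine, not the ones in the draft.

```python
from typing import List, Optional, Tuple


def share_prefix(variant_a: List[str], variant_b: List[str]) -> bool:
    """True if the shorter argument list is a prefix of the longer one."""
    shorter, longer = sorted((variant_a, variant_b), key=len)
    return longer[:len(shorter)] == shorter


def post(message: str, origin: str,
         port: Optional[str] = None) -> Tuple[str, str, Optional[str]]:
    """Single entry point standing in for both overloads: the extra
    argument of the longer variant simply comes last and is optional."""
    return (message, origin, port)
```

When share_prefix() is false for a pair of variants, a binding has to 
inspect argument types to decide which variant was meant; when it is 
true, counting the arguments is enough.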


-

Current browsers provide facilities to parse xml code (either the 
DOMParser object or a DOM Load and Save Parser). All fail with html tag 
soup, so if for any reason some string of html code must be parsed to 
manipulate its DOM representation before taking any action, a 
workaround must be found (e.g. calling 
document.implementation.createHTMLDocument() and somehow inserting the 
string into such a fake document, then getting the DOM structure - this 
could be quite unreliable too, as a parsing alternative, if any script 
code in that string were executed). Since one of the goals of the html 5 
specifications is the definition of a standard parser, with standard 
parse error management, maybe the opportunity of exposing an 
html-specific parser (skipping script execution) through the DOM might 
be considered.
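
Just to show what "parse tag soup without running scripts" looks like 
from the caller's side, here is a sketch using Python's standard 
html.parser module (only for illustration: it is a lenient tokenizer of 
its own, not the HTML 5 parsing algorithm, and the InertParser name is 
made up). Script content is seen as plain data and deliberately 
dropped, never evaluated.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class InertParser(HTMLParser):
    """Collect start tags and text; ignore whatever is inside <script>."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.tags: List[str] = []
        self.text: List[str] = []
        self._in_script = False

    def handle_starttag(self, tag, attrs) -> None:
        self.tags.append(tag)
        if tag == "script":
            self._in_script = True

    def handle_endtag(self, tag) -> None:
        if tag == "script":
            self._in_script = False

    def handle_data(self, data) -> None:
        # Script bodies reach us as inert data and are skipped.
        if not self._in_script and data.strip():
            self.text.append(data.strip())


def parse_inert(markup: str) -> Tuple[List[str], List[str]]:
    parser = InertParser()
    parser.feed(markup)
    parser.close()  # flush any text still buffered at the end
    return parser.tags, parser.text
```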


-

The current draft states that a script element set through the 
innerHTML property is not executed at all, while it is when added by 
calling document.write() (what about insertAdjacentHTML()?). However, I 
think that allowing script execution in the former case would make the 
innerHTML property a truly live one, with some possible benefit: e.g. 
it could be a way to insert new script elements into the document head 
section from outside the head element (e.g. from an event listener on 
an eventsource, to dynamically 

Re: [whatwg] URL parsing and same-document references [was: Re: Citing multiple blockquote elements in HTML5]

2008-12-13 Thread Calogero Alex Baldacchino

Nils Dagsson Moskopp wrote:

On Friday, 12.12.2008, at 20:36 +0100, Calogero Alex Baldacchino
wrote:
  
The above (except for the 'double check' I was suggesting) is about the 
way Firefox (2.x and 3.0.4) behaves (both href="#foo%20bar" and, in a 
different page, href="./example.html#foo%20bar" match id="foo bar"), 
while IE7 and Opera 9.x perform an exact comparison, and show, in the 
address bar, a URL with any blank spaces left as they are, thus applying 
the relaxation allowed by the URL parsing rules, but not conforming to 
RFC 3986 as a complete URI string.


Whenever I copy-paste a URI from the address bar into any other program,
I am severely annoyed by this, especially when spaces (delimiters!) are
part of the fake URI. A chat or office program, for example, is no
longer able to highlight the fake URI (how could it?), and pasting it
into source code can create all kinds of validation errors. And whenever
I get a bastardized URI via chat or mail, only a part of it is
clickable.

Can someone from the web browser faction please state whether there is
any data to support breaking RFC compatibility? Because as I see it,
it's something that makes it appear nicer, but breaks whenever URIs are
to be transferred / communicated.
  


Actually I'm not from any faction, to be honest. I think the rationale 
may be that people write strange things, both in address bars and in 
html code, so relaxing the rules when parsing a URL makes sense; but I 
think that when resolving and recomposing a whole URI the strictest 
rules should be applied.



Getting to the problem mentioned here, the robustness principle says
that id="foo bar" should be accepted, but is nevertheless invalid -
because a fragment with a space can never be part of a URI.


Indeed, that's not part of a URI, but a dereferenced component: when 
splitting a URI into its components, there is no need to keep %-encoded 
characters (RFC3986 says separated components can be decoded). Thus, 
AIUI, both href="#foo bar" and id="foo bar" respect the conformance 
rules, but when resolving "#foo bar" into a complete, absolute URI, the 
result should always look like 
"http://example.org/something.html#foo%20bar" to be conforming.
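
The distinction can be shown with Python's urllib.parse (only for 
illustration; the helper names are mine, and this is not the algorithm 
of any browser). resolve() accepts the relaxed reference and recomposes 
a complete URI with the fragment %-encoded, leaving existing triplets 
alone; fragment_for_matching() gives the decoded component one would 
compare against an id.

```python
from urllib.parse import quote, unquote, urljoin, urlsplit, urlunsplit

# Characters RFC 3986 allows unescaped in a fragment, plus "%" so that
# triplets which are already encoded are left untouched.
_FRAGMENT_SAFE = "%!$&'()*+,;=:@/?-._~"


def resolve(base: str, reference: str) -> str:
    """Resolve a (possibly sloppy) reference into a complete URI."""
    parts = urlsplit(urljoin(base, reference))
    fragment = quote(parts.fragment, safe=_FRAGMENT_SAFE)
    return urlunsplit(parts._replace(fragment=fragment))


def fragment_for_matching(url: str) -> str:
    """The decoded fragment, as a separate component."""
    return unquote(urlsplit(url).fragment)
```

So the relaxed "#foo bar" and the strict "#foo%20bar" both end up as 
the same complete URI, while the component used for matching is 
"foo bar" in either case.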



Regards,
Alex




Re: [whatwg] URL parsing and same-document references [was: Re: Citing multiple blockquote elements in HTML5]

2008-12-13 Thread Calogero Alex Baldacchino

Nils Dagsson Moskopp wrote:

On Saturday, 13.12.2008, at 19:09 +0100, Calogero Alex Baldacchino
wrote:
  
Actually I'm not from any faction, to be honest. I think the rationale 
may be that people write strange things, both in address bars and in 
html code, so relaxing the rules when parsing a URL makes sense; but I 
think that when resolving and recomposing a whole URI the strictest 
rules should be applied.


Accepting weird input is not a problem here, outputting is. Try writing
a valid URI into the address bar, then get an invalid one displayed.


Greetings
  


Could you give an example, please? I wasn't able to reproduce this in 
IE7 or Opera 9.27 (e.g., 
"http://real.addressofasite.com/index.html#foo%20bar" wasn't changed 
into "http://real.addressofasite.com/index.html#foo bar").


Anyway, I guess you got the point. Relaxed parsing rules are for input 
URLs, but after parsing, a normalization and/or the resolution algorithm 
should be applied, and the displayed URL, being absolute and complete, 
should conform to RFC3986. The current resolution algorithm (section 
2.5.3 of the html5 spec) does not mention fragment identifiers 
explicitly, and, although its 10th step says "Apply any relevant 
conformance criteria of RFC 3986 and RFC 3987, returning an error and 
aborting these steps if appropriate.", step 9 says "Apply the algorithm 
described in RFC 3986 section 5.2 Relative Resolution, using url as the 
potentially relative URI reference (R), and base as the base URI 
(Base)". AIUI, the algorithm described in section 5.2 of rfc3986 might 
be applied to each component of a URI without building a complete URI 
(instead, leaving each part separated and held as a property of an 
object - a component recomposition algorithm is defined in section 5.3 
of rfc3986, but that's not a 'must'); when a single component of a URI 
is to be handled, rfc3986 does not require %-encoding as a 'must', 
hence the freedom of interpretation and the different behaviors in 
different UAs, leading to inconsistent results when copying a URL from 
one UA and pasting it into another. I think a uniform behaviour should 
be defined as standard (and implemented!) instead (the concern you 
raised about copy-paste perhaps leads to a further issue regarding how 
line breaks should be handled by the parsing rules - e.g. stripped like 
leading and trailing characters).


Regards,
Alex




Re: [whatwg] URL parsing and same-document references [was: Re: Citing multiple blockquote elements in HTML5]

2008-12-12 Thread Calogero Alex Baldacchino

Calogero Alex Baldacchino wrote:
Maybe the above needs further clarification. Let me start from the URL 
parsing (and resolving) rules: after the URL is validated, it's 
divided into its components, but nothing is stated about normalization 
and/or %-encoded characters. I think that applying some normalization 
may be useful to parse equivalent URLs in a consistent manner, helpful 
when dealing with the interfaces for URL manipulation, as described in 
section 2.5.5, and, last but not least, an improvement in relative 
reference matching (especially same-document references). A minimum 
requirement, for standardization's sake, may consist of decoding any 
%-encoded characters in the "fragment" production which are part of 
the "unreserved" production as defined in RFC 3986, with the changes 
defined in the HTML 5 specification for URL parsing, restricted to the 
Unicode ranges representing valid characters for an attribute value 
(those which are prohibited neither as 'text' nor as 'character 
references'). This way, a character-for-character comparison between a 
fragment identifier and an id attribute value, which would have been 
equivalent but not matching without the normalization, should succeed 
most of the time, because, as a consequence of the changes applied by 
the current HTML 5 specification to the "unreserved" production, such 
characters might or might not be %-encoded in a valid URL, while an id 
value is likely to contain them non-encoded.
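
A minimal sketch of such a normalization (Python, only for 
illustration). By default only the RFC 3986 "unreserved" characters are 
decoded; the extra parameter is my own stand-in for whatever additional 
characters the HTML 5 changes would allow, which I'm not trying to 
enumerate here.

```python
import re
import string

# RFC 3986 "unreserved": ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
_TRIPLET = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_fragment(fragment: str, extra: str = "") -> str:
    """Decode %-encoded triplets standing for unreserved characters
    (plus any in `extra`); upper-case the hex digits of the others."""
    allowed = UNRESERVED | set(extra)

    def replace(match: "re.Match[str]") -> str:
        char = chr(int(match.group(1), 16))
        return char if char in allowed else match.group(0).upper()

    return _TRIPLET.sub(replace, fragment)
```

With the space added to the allowed set, "foo%20bar" normalizes to 
"foo bar" and so matches id="foo bar" character for character.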


After the above fragment normalization, a character-for-character 
comparison would fail if the id value contained any %-encoded triplet 
representing a decoded character, such as "foo%20bar". Anyway, such a 
value may be a weird thing to deal with, since it can be the %-encoded 
form of "foo bar", but also the decoded form of "foo%2520bar". In 
other words, if we apply the same normalization to two complete URLs 
and then compare them, the result is quite reliable, but if we start 
from a component (such as a fragment identifier stored in an id 
attribute value) it's not easy to tell whether any normalization has 
been applied, and which one, so there is always a chance of false 
positives or false negatives. According to RFC 3986, section 4.4 
"Same-Document Reference", the correct interpretation of a URI as a 
same-document reference cannot be taken as guaranteed, thus the 
mismatch between, for instance, the decoded fragment identifier 
"foo bar" and the id attribute value "foo%20bar", set against (as I 
think) a wide majority of good matches, can be reasonable. Anyway, a 
kind of double check might be considered, such as:


- comparing the %-unescaped fragment identifier with the ID of each 
element in the DOM;
- upon failure, applying a %-unescape algorithm to the ID, then 
comparing again with the fragment identifier and, if matching, marking 
the element as a 'possible choice';
- upon a perfect (exact) match, without unescaping the evaluated 
element ID, choosing such element as the referenced document part 
(currently defined as "the indicated part of the document" in the 
spec) and stopping;
- without any perfect match in the whole document, choosing the first 
'possible choice', if any;
- without any match at all, the search for the referenced document 
part fails.


With respect to a single check for an exact match, the overall 
computational time should increase linearly, thus not being a 
performance issue.
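
The double check can be sketched in a single pass (Python, only for 
illustration). The document is reduced to a list of id values in tree 
order and the function name is made up; the result is the position of 
the chosen element, or None.

```python
from typing import List, Optional
from urllib.parse import unquote


def find_indicated_part(fragment: str, ids: List[str]) -> Optional[int]:
    """Return the position of the element referenced by the fragment."""
    target = unquote(fragment)  # the %-unescaped fragment identifier
    possible: Optional[int] = None
    for position, element_id in enumerate(ids):
        if element_id == target:
            return position  # perfect match: stop here
        if possible is None and unquote(element_id) == target:
            possible = position  # remember the first 'possible choice'
    return possible  # None when nothing matched at all
```

Each id is examined at most once, with at most one extra unescape per 
element, which is where the linear cost comes from.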


Best regards, Alex.


The above (except for the 'double check' I was suggesting) is about the 
way Firefox (2.x and 3.0.4) behaves (both href="#foo%20bar" and, in a 
different page, href="./example.html#foo%20bar" match id="foo bar"), 
while IE7 and Opera 9.x perform an exact comparison, and show, in the 
address bar, a URL with any blank spaces left as they are, thus applying 
the relaxation allowed by the URL parsing rules, but not conforming to 
RFC 3986 as a complete URI string. It seems different browsers 
implement (more or less) different normalization/resolution algorithms, 
leading to different matches, so specifying a uniform behaviour 
(whichever one) might be reasonable and useful. The current resolving 
algorithm, while explicitly asking for %-encoding in a path component 
and for conformance with RFC 3986 in general, doesn't talk about 
fragment identifiers; the referenced algorithm for relative resolution 
(section 5.2 of RFC 3986), AIUI, might not require the creation of a 
complete URI string, but could instead be accomplished by returning an 
object holding a separate string for each URI part, thus not 
necessarily requiring %-encoding and potentially leaving UAs a certain 
degree of freedom. Furthermore, about URL decomposition attributes it 
is said, 'On setting, the new value must first be mutated as described 
by the setter preprocessor column, then mutated by %-escaping any 
characters in the new value that are not valid in the relevant 
component as given by the component column.'; this seems to refer to 
the stricter RFC3986

Re: [whatwg] Use cases for Node.getElementById

2008-12-10 Thread Calogero Alex Baldacchino

Garrett Smith wrote:

On Sat, Dec 6, 2008 at 7:09 PM, Calogero Alex Baldacchino
[EMAIL PROTECTED] wrote:
  

Simon Pieters wrote:


On Fri, 05 Dec 2008 19:19:04 +0100, Calogero Alex Baldacchino
[EMAIL PROTECTED] wrote:

  

[...]


(I'm currently the editor of that proposal, currently located at
http://simon.html5.org/specs/web-dom-core )

  

I'm reading it :-)

And I have a few questions.



I did not see a proposal for Element.getElementById.

I would not care about that much.

I would rather have

Element.getElementsByName.

It is perfectly valid for a document to have multiple elements with the
same name (though not generally a good idea). I've seen this before.

Was this proposed?

Garrett
  
I don't remember exactly which spec stated this first, but I recall a
previous HTML version declaring the 'name' attribute as unique in the
'global scope' (or something like that), meaning the whole document;
then, I remember 'name' being deprecated in favour of 'id'. I think
'getElementsByName' was retained from the past just because form
elements scoped input names in a different manner (while the name of an
anchor, for instance, had to be unique in the whole document), but it
conflicted somewhat with the uniqueness of (at least some) elements'
names. Anyway, this is what I remember (the current specification no
longer defines a name attribute for every element - it's not on the
HTMLElement interface).


However, the issue about Node.getElementById originated from noticing
problems with duplicate ids in existing pages, and the likelihood that
new pages may have duplicate ids too (e.g. by repeatedly cloning and
inserting nodes without taking care of all attributes), and from
thinking about whether such an illegal state should be addressed
somehow. If non-unique identifiers have to be a deliberate and 'careful'
choice, one involving a dedicated attribute, perhaps the class attribute
and the [ HTMLDocument | HTMLElement | whatever_else_implementing
].getElementsByClassName() methods can address that: classes are
non-unique not only across the whole document but also within a single
element, which may have multiple classes listed in its attribute (each
class name being unique in the list), so they might be used for some
non-style-related purposes by just appending a name to the list of
styling classes (just for the sake of clarity, though unnecessary), and
querying it with getElementsByClassName() would work the same way as
resorting to the 'name' attribute and the 'getElementsByName()' method
(perhaps a bit tricky, but it should work fine).
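
To make the class-based approach a bit more concrete, here is a minimal
sketch in Python. The Element class and its get_elements_by_class_name
method are hypothetical stand-ins for the DOM (not a real API); the only
point shown is that matching works on whitespace-separated tokens, so a
'grouping' token can sit next to styling classes.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Element:
    """Toy stand-in for a DOM element (hypothetical, not a real DOM API)."""
    tag: str
    class_attr: str = ""
    children: List["Element"] = field(default_factory=list)

    def descendants(self) -> Iterator["Element"]:
        # Pre-order walk (document order), excluding the element itself.
        for child in self.children:
            yield child
            yield from child.descendants()

    def get_elements_by_class_name(self, name: str) -> List["Element"]:
        # The class attribute is a whitespace-separated list of tokens,
        # so a whole token must match, not a substring.
        return [e for e in self.descendants() if name in e.class_attr.split()]


# "shipping" plays the role a shared 'name' would have played.
root = Element("div", children=[
    Element("input", "field shipping"),
    Element("p", "note", [Element("input", "shipping highlighted")]),
    Element("input", "billing"),
])
shipping = root.get_elements_by_class_name("shipping")
```

Calling the method on any element restricts the search to that element's
subtree, which is what a scoped getElementsByName would have offered.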


~~

@ Simon Pieters (and everyone else on the list, of course).

I was thinking again about 'getElementsByClassName()' being moved to Web
DOM Core: maybe a good place for it would be the Node interface, so as
to have the method working on Documents as well as on Elements; if the
HTMLCollection interface were moved as well, perhaps that could be the
return value, instead of a NodeList, since non-element Nodes should
never be expected to have a class name, I guess (perhaps doing the same
with getElementsByTagName would be consistent, but maybe problematic
because of backward compatibility -- while getElementsByClassName would
be a 'new entry' in the 'realm' of Core interfaces, thus a greater
degree of freedom might be taken, if reasonable, of course - it may
depend on a known need for different, specialized algorithms in Document
and Element nodes, for instance).


Best regards,
Alex




Re: [whatwg] Use cases for Node.getElementById

2008-12-10 Thread Calogero Alex Baldacchino

Garrett Smith ha scritto:

On Wed, Dec 10, 2008 at 8:10 AM, Calogero Alex Baldacchino
[EMAIL PROTECTED] wrote:
  

Garrett Smith ha scritto:


On Sat, Dec 6, 2008 at 7:09 PM, Calogero Alex Baldacchino
[EMAIL PROTECTED] wrote:

  

Simon Pieters ha scritto:



On Fri, 05 Dec 2008 19:19:04 +0100, Calogero Alex Baldacchino
[EMAIL PROTECTED] wrote:


  

[...]



(I'm currently the editor of that proposal, currently located at
http://simon.html5.org/specs/web-dom-core )


  

I'm reading it :-)

And I have a few questions.



I did not see a proposal for Element.getElementById.

I would not care about that much.

I would rather have

Element.getElementsByName.

It is perfectly valid for a document to have multiple elements w/the
same name (though not generally a good idea). I've seen this before.

Was this proposed?

Garrett

  

I don't remember what spec exactly stated this first, but I recall a
previous HTML version declaring the 'name' attribute as unique in the
'global scope' (or something like that),



What?

  

meaning the whole document; then, I
remember 'name' was deprecated in favour of 'id'.



Name is not deprecated. It is, as I said, perfectly valid. How else
are you going to submit form values?

Garrett
  


I was referring to some elements using it as a global identifier, like
a and img, and I apologize for any lack of clarity.


From http://www.w3.org/TR/html401/struct/links.html#adef-name-A

This attribute names the current anchor so that it may be the 
destination of another link. The value of this attribute must be a 
unique anchor name. The scope of this name is the current document. Note 
that this attribute shares the same name space as the id attribute 
http://www.w3.org/TR/html401/struct/global.html#adef-id


From http://www.w3.org/TR/html401/struct/objects.html#adef-name-IMG

This attribute names the element so that it may be referred to from 
style sheets or scripts. Note. This attribute has been included for
backwards compatibility. Applications should use the id attribute to 
identify elements


From http://www.w3.org/TR/html401/struct/links.html#anchors-with-id

The id and name attributes share the same namespace. This means that
they cannot both define an anchor with the same name in the same 
document. It is permissible to use both attributes to specify an 
element's unique identifier for the following elements: A, APPLET, FORM, 
FRAME, IFRAME, IMG and MAP. When both attributes are used on a single 
element, their values must be identical.


You'll find that neither HTML 4.01 nor HTML 5 declares a 'name'
attribute for every element (some HTML 5 elements have lost their older
'name' attribute, though it might still be handled by the parser for
backwards compatibility, e.g. for the a element representing a fragment
of the document).





Re: [whatwg] Use cases for Node.getElementById

2008-12-09 Thread Calogero Alex Baldacchino

ddailey ha scritto:
There are lots of times in which I've needed to examine one document 
by use of a script that resides inside another. Using lists of 
attributes to do that has been rather important, though if those lists 
were accessible as properties of objects rather than as nodes 
themselves (as in some sort of multinary relation rather than as a 
tree structure) that would be fine as well.


Well, attributes shouldn't be accessible as descendants of an Element in
a tree structure, but rather as items of a NamedNodeMap, or directly
through Element.getAttribute()/Element.getAttributeNS(), passing a
string representing the attribute name and getting the value as a
string. Thus, they don't need to be instances of Node (that's about
redefining Attr and a related listing interface to simplify the UA
handling of attributes, which currently are nodes but should be handled
as if they weren't). Any interface replacing Attr for such a purpose
should take care of namespaces and prefixes (which is currently done in
the DOM3 Node interface). Dropping the list of attributes as objects
would require querying each attribute by name, but a list of attributes
seems to be needed in some use cases; a DOMStringMap might be
considered, to represent attributes as a list of string pairs of names
and values, but that couldn't handle namespaces (though it might be
extended to add such a capability), nor would it solve the problem of
defining an interface that gives access to a tuple of (name, value,
namespace) by calling, e.g., an item() method or the like. Otherwise, if
no better alternative can be found, Attrs will continue to be Nodes...
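
Just to illustrate the (name, value, namespace) tuple idea, a rough
sketch in Python follows. Attribute and AttributeList are names I'm
making up here, not interfaces from any spec; the sketch only shows that
a plain record plus a list with an item() method can cover both by-name
access and enumeration without attributes being nodes.

```python
from typing import List, NamedTuple, Optional


class Attribute(NamedTuple):
    """Plain (name, value, namespace) record; deliberately not a node."""
    name: str
    value: str
    namespace: Optional[str] = None


class AttributeList:
    """Hypothetical list-like holder; the names are illustrative only."""

    def __init__(self) -> None:
        self._items: List[Attribute] = []

    def set(self, name: str, value: str, namespace: Optional[str] = None) -> None:
        # Replace an existing (namespace, name) pair, otherwise append.
        for i, attr in enumerate(self._items):
            if (attr.namespace, attr.name) == (namespace, name):
                self._items[i] = Attribute(name, value, namespace)
                return
        self._items.append(Attribute(name, value, namespace))

    def get(self, name: str, namespace: Optional[str] = None) -> Optional[str]:
        for attr in self._items:
            if (attr.namespace, attr.name) == (namespace, name):
                return attr.value
        return None

    def item(self, index: int) -> Optional[Attribute]:
        # Out-of-range indexes yield None instead of raising.
        return self._items[index] if 0 <= index < len(self._items) else None

    def __len__(self) -> int:
        return len(self._items)


attrs = AttributeList()
attrs.set("id", "logo")
attrs.set("href", "#a", "http://www.w3.org/1999/xlink")
attrs.set("id", "banner")  # overwrites the first entry, keeps its position
```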


Learners of this stuff seem to have trouble with the fact that lists 
cannot be indexed through array notation -- i.e., that nodes[i] cannot 
be used in place of nodes.item(i) in some namespaces, but apparently 
can in HTML.




I guess that's a matter of IDL bindings, in part at least, so it might
be solved with clearer specific bindings, as needed. For instance, all
properties (attributes and methods) of a collection-like interface can
be declared [DontEnum], regardless of whether they are defined in the
IDL or created at runtime (i.e. by listing an item as a named property
of the object), with the exception of indexed items: this way, a
collection-like object would always behave as an Array-like object.
Similarly, the collection might work as an associative array for named
items (i.e., the_id = attributes[id] might work as the_id =
attributes.getNamedItem(id) ), but the binding for that might be more
complex, involving a redefinition of the bracket property accessor in
order to look for properties inside an internal list (when it comes to
implementations, such complexity may disappear or be reduced; for
instance, in C++ it might involve an easy overload of the
'operator[]' function).
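
As a rough illustration of redefining the bracket accessor, here is a
small Python sketch (Python's __getitem__ playing the part of an
overloaded operator[]). NamedCollection is a made-up class, not a DOM
interface; it only shows one accessor dispatching to item() for indexes
and to a getNamedItem-like lookup for names.

```python
from typing import Iterator, List, Optional, Tuple, Union


class NamedCollection:
    """Toy collection addressable by position or by name (illustrative)."""

    def __init__(self, pairs: List[Tuple[str, str]]) -> None:
        self._pairs = list(pairs)

    def item(self, index: int) -> Optional[str]:
        if 0 <= index < len(self._pairs):
            return self._pairs[index][1]
        return None

    def get_named_item(self, name: str) -> Optional[str]:
        for key, value in self._pairs:
            if key == name:
                return value
        return None

    def __getitem__(self, key: Union[int, str]) -> Optional[str]:
        # The bracket accessor dispatches on the key type.
        if isinstance(key, int):
            return self.item(key)
        return self.get_named_item(key)

    def __iter__(self) -> Iterator[str]:
        # Only the indexed items are enumerated, never the methods.
        return (value for _, value in self._pairs)

    def __len__(self) -> int:
        return len(self._pairs)


attributes = NamedCollection([("id", "main"), ("class", "wide")])
the_id = attributes["id"]   # same lookup as attributes.get_named_item("id")
first = attributes[0]       # same lookup as attributes.item(0)
```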


Though I have only played a little with compound documents or with 
document fragments, it seems like viewing all nodes as accessible 
through getElementById is awfully dependent on how one finds the 
document associated with the appropriate segment of a mixed NS 
document. In SVG nestled inside HTML, for example, implementations 
have differed in terms of how that document is retrieved as a function 
of browser, and the type of tag (object, iframe, frame, or embed) in 
which the svg is placed. The ability to root one's search directly 
at a certain level in the parent DOM, might help in cases where mixed 
name spaces could lead to conflicts of the assumption of unique id's.


Perhaps what you're asking for is something like
Document.getNSElementById(in DOMString namespaceUri, in DOMString
elementId), to get access to the first element, in a document, whose tag
name has a prefix corresponding to the queried namespace, or which is a
descendant of an element whose tag name is the root element tag name for
the queried namespace (perhaps suitable for HTML 5 embedding svg or
math elements without prefixes). Anyway, you'd have to reach the
correct document first (but you'd have to do so to get the nested
content's root element even with a getElementById(elementId,
rootElement) ). Such a method would involve a separate management of
ids: one to ensure uniqueness in the whole document (i.e. to return the
first match for getElementById regardless of the element namespace),
another to deal with all the elements (either prefixed or just embedded)
coming from the same namespace as if they were in a separate document in
which to look for unique ids (not necessarily to be implemented this
way, just by managing a global map of unique IDs for all the elements in
a document and zero or more secondary maps for all the elements
corresponding to a particular namespace - different from that of the
nesting document). This might add some complexity to the user agent, and
perhaps won't get consensus from implementors, I guess.
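
A minimal sketch of the 'global map plus secondary maps' bookkeeping, in
Python. IdIndex and its method names are hypothetical (getNSElementById
is only a suggestion, not an existing method), and elements are reduced
to simple tuples; it assumes elements are registered in document order.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple

# An element is modelled as (namespace_uri, element_id, label); a real
# implementation would store element references instead.
Element = Tuple[str, str, str]


class IdIndex:
    """Hypothetical index: one global id map plus one map per namespace."""

    def __init__(self) -> None:
        self._global: Dict[str, Element] = {}
        self._by_ns: Dict[str, Dict[str, Element]] = defaultdict(dict)

    def add(self, element: Element) -> None:
        # Elements are assumed to arrive in document order, so
        # setdefault keeps the first match for each id.
        ns, element_id, _ = element
        self._global.setdefault(element_id, element)
        self._by_ns[ns].setdefault(element_id, element)

    def get_element_by_id(self, element_id: str) -> Optional[Element]:
        return self._global.get(element_id)

    def get_ns_element_by_id(self, ns: str, element_id: str) -> Optional[Element]:
        return self._by_ns.get(ns, {}).get(element_id)


HTML = "http://www.w3.org/1999/xhtml"
SVG = "http://www.w3.org/2000/svg"

index = IdIndex()
index.add((HTML, "logo", "html img"))
index.add((SVG, "logo", "svg circle"))  # same id, different namespace
```

The sketch leaves out removals, which would need the same care as for a
single map, once per namespace.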


Regards, Alex.



Re: [whatwg] Thoughts on video accessibility

2008-12-09 Thread Calogero Alex Baldacchino

Silvia Pfeiffer ha scritto:

I heard some complaints about there not being any implementation of
the suggestions I made.

So here goes:

1. out-of-band
There is an example of using srt with ogg in an out-of-band approach here:
http://v2v.cc/~j/jquery.srt/
You will need Firefox3.1 to play it.
The syntax of what Jan implemented is different to what I proposed,
but I wanted to take it forward and make it more generic.

2. in-band
There is also a draft implementation of srt inside Ogg through the
OggText specification, but it's not released yet. It is also not as
relevant to this group as the out-of-band example.

Cheers,
Silvia.

  
As far as I've understood from a first read of your proposal (I'm not
very familiar with the matter), current players/codecs implement
different kinds of bindings with text (either in-band or out-of-band)
and support different formats, so perhaps there is room for both
mechanisms you're proposing:


- the html version, for compatibility with existing media and their
external bindings, for servers not supporting the dynamic creation of
content defined by your ROE format and for people who don't want to, or
can't afford to, modify the way their media are served (e.g. they can't
access the server where the media is stored to add or modify an xml
metadata file, but want to try and bind the media with some text they
can store separately);


- the xml file mainly to drive dynamic content creation, and as a 
gradual replacement of other binding formats.


Any problem arising from the management of separate connections
(possibly to different domains) to get both the audio/video and the
textual resources might perhaps be mitigated by indicating (or
establishing as a default) a time to wait for external text before
starting the playback (in case the text resource fails to load -- e.g.
the server is temporarily offline -- and there is enough buffered
content to start playing before the browser gets any answer for any
other resource) -- when and if the text arrives, its use might be
skipped altogether, or start by synchronizing with the current point in
the media; in the same way, if any problem loading the text arose after
starting the playback, the missing parts might just be skipped (this
would be unlikely to happen if both the media and the text files were
located on the same server).
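
A small sketch of the two behaviours above, in Python and purely as an
illustration: Cue, cues_from and should_wait_for_text are names I'm
inventing, and the cue list is assumed to be sorted by start time with
no overlaps. The first function picks up with the current point in the
media when the text arrives late; the second models the waiting time.

```python
from bisect import bisect_right
from typing import List, NamedTuple


class Cue(NamedTuple):
    start: float  # seconds
    end: float    # seconds
    text: str


def cues_from(cues: List[Cue], current_time: float) -> List[Cue]:
    """Return the cues still worth showing when text arrives at current_time.

    Assumes cues are sorted by start time and do not overlap, so their
    end times are sorted too.
    """
    ends = [cue.end for cue in cues]
    # Every cue whose end is <= current_time has already been missed.
    return cues[bisect_right(ends, current_time):]


def should_wait_for_text(elapsed: float, max_wait: float, text_loaded: bool) -> bool:
    # Hold playback only while the text is missing and the grace period lasts.
    return (not text_loaded) and elapsed < max_wait


cues = [Cue(0, 2, "a"), Cue(3, 5, "b"), Cue(6, 8, "c")]
late = cues_from(cues, 4.0)  # text arrived 4 seconds into playback
```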


Perhaps it might be useful to provide a way to indicate an alternative
media to stream, e.g. an .asx or .rm media which is internally bound to
only one of the supported languages, for when the browser fails to bind
the text resources with the 'primary' media, or the ROE format is not
supported (e.g. it is introduced in a v2 of the spec), or the 'primary'
media is not supported by the browser but the same content is available
in several formats (e.g. a lossless compressed version alongside a lossy
compressed one - the UA might even choose one based on the network
capabilities) -- I know this is possible with source elements, but
perhaps some consideration is needed on whether to relate source
elements and text bindings, i.e. to tell the UA, by means of an
attribute, whether or not to verify that the source supports any of the
declared text resources, preferably one matching the locale (that is,
specifying whether a source is a 'last resort' in case the UA is unable
to bind any other source with the text -- other sources might be chosen
anyway, if no 'last resort' source is supported).


Anyway, the use of subtitles in conjunction with screen readers might be
problematic: a deeper synchronization with the media might be needed in
order to have the text read only during voice pauses, to describe a mute
scene, or to entirely replace the sound, if the text provides a
translation for the speech (I guess this would be nontrivial to do
without putting one's hands inside the media).


Everything, of course, IMHO.
Regards,
Alex




Re: [whatwg] Thoughts on video accessibility

2008-12-09 Thread Calogero Alex Baldacchino

Silvia Pfeiffer ha scritto:

On Wed, Dec 10, 2008 at 6:59 AM, Calogero Alex Baldacchino
[EMAIL PROTECTED] wrote:
  

Anyway, the use of subtitles in conjunction with screen readers might be
problematic: a deeper synchronization with the media might be needed in
order to have the text read just during voice pauses, to describe a mute
scene, or to entirely substitute the sound, if the text provides a
translation for the speech (I guess such would be nontrivial to do without
putting one's hands inside the media).



I cannot see a problem with conflicts between screen reading a web
page and a video on the web page. A blind user would have turned off
the use of captions by default in his/her browser, since they can hear
very well what is going on, just not see it. As long as the video is
not playing, it is only represented as a video (and maybe an alt text
is read out). When the blind user clicks on the video, audio
annotations will be read out by the screen reader in addition to the
native sound. These would be placed into silence segments.

  


I was thinking of a possible lack of synchronization, with annotations
enabled, between the screen reader reading them and the actual duration
of the corresponding silence segments, maybe because the sentences are
not brief enough (e.g. as a consequence of a poorly groomed translation
into a certain language) and/or because of slow reading (depending on
the peculiarities of the language, or the user settings, or both, and in
any case out of the UA's control), resulting in an overlap between the
final part of an annotation being read out and the beginning of the next
non-silence segment, perhaps repeatedly during playback. Maybe this is a
borderline case.



In the case of a video with a non-native language sound track, it's a
bit more complicated. The native sound would need to be turned off and
the screenreader would need to read out the subtitles in the user's
native language as well as the audio annotations in the breaks. This
may not be easy to set up through preferences in the Web browser, but
it should be possible for the user to manually select the right tracks
and turn off the video sound.

Regards,
Silvia.
  
If the base language of the video, or the provided languages, were
indicated somewhere, in the metadata or in the enclosing xml file,
perhaps such a switch might be automated (perhaps the corresponding
preference might be something like 'read subtitles when the media does
not support your language', maybe coupled with the option 'don't read
subtitles when the media's supported language(s) can't be identified').
I was also thinking about 'implied' subtitles, such as those shown in a
film when some characters speak in a different language from the base
language of the rest of the content; in such a case, if distinguishing
'implied' subtitles were possible somehow, it might be nice to turn down
(or off, as needed) the volume and let a voice engine speak them aloud.
I guess a UA with an embedded voice technology (such as Opera Voice, or
FireVox) could do a good job and keep audio and video synchronized in
most cases, but when external software (such as a screen reader) is
involved the scenario might change (usually a screen reader can't be
sped up or slowed down, and stopping it - when reading annotations -
after having fed it some text, if at all possible, might be nontrivial
-- again, I'm not familiar enough with this stuff, so I can only imagine
some borderline scenarios). Anyway, your proposal is nice, and, once it
is widespread, screen reader developers might choose to provide some
kind of support for synchronization (if needed to improve the
accessibility of audio/video content).


Regards, Alex




Re: [whatwg] Use cases for Node.getElementById

2008-12-08 Thread Calogero Alex Baldacchino

Simon Pieters ha scritto:
On Sun, 07 Dec 2008 04:09:01 +0100, Calogero Alex Baldacchino 
[EMAIL PROTECTED] wrote:



I'm reading it :-)

And I have a few questions. First, is it meant as the reference DOM 
Core for HTML 5 only, or in general (for other kinds of markup too)?


In general.


Ok.




Is it the name HTMLCollection that is the problem?



Perhaps. I don't know and can't guess how much 'political consensus'
might be needed to make the specification fly (especially if trying for
convergence with the W3C). Maybe the support of the main browser vendors
is more than enough, and a name is not much of a problem in practice.
Anyway, since that's just a formal/political matter that may be solved
when and if needed, I've just pointed out a possible solution (but I'm
sure you don't need my suggestions to get there, or to find a better one
:-P).




I guess such attribute has been declared on the Element interface 
instead of the HTMLElement one because actually this is the most 
common implementation in current browsers.


Right. Also because it seems useful for not just HTML.



Well, on one hand it duplicates a NodeList of child nodes, but on the 
other hand only Element nodes are listed, and this can be useful in 
practice, I agree. :-)





Anyway, let me suggest [..]


This seems like adding complexity for political reasons.



Hmm, for a script no complexity would be added (I mean, a script engine
embedded in a UA would implement the same interface as the UA, but the
script code would work fine because of runtime-inferred types -- the
instanceof operator, in ECMAScript, might fail indeed, but it may fail
anyway in IE, which seems not to expose the DOM hierarchy of an object).
LiveConnect should work fine as well; any other access to the DOM
through a plugin may require a whole implementation to make new DOM
property types/interfaces available as compile-time known
types/interfaces (I'm not sure, but I think that Java, actually, doesn't
provide access to non-W3C DOM 2 properties -- true, at least, for some
versions of the VM and related DOMAccessProvider objects; I don't know
if there are third-party implementations allowing that -- and I also
guess some non-standard interface might be adopted, in implementations,
to give access to every property of an html document without always
having to cast to the proper interface, since HTMLDocument no longer
inherits from Document). Such an implementation might require object
wrapping at some point (e.g. to maintain consistency between
corresponding data types), so adding a constraint to something that is
likely needed already shouldn't be too expensive for the implementors.
Anyway, yes, that's mainly a formal/political need, and obviously it can
be added as needed (as above, :-P)





I'm not sure what to do with attributes. I'd like to drop support for 
attribute nodes (being moved around, etc), if possible, but keep the 
.attributes list and be able to use .value etc on each attribute.




Good question. Unless .value/.name are moved to the Node interface
(which I guess might be problematic for backward compatibility), a
Node-derived interface is needed to accomplish that, unless the 'nature'
of the list is changed as well (but with similar issues, and without
solving the need for an interface defining the .value and .name
properties)...




I was thinking of just that when I read, in the HTML 5 spec, that
"This specification doesn't preclude an element having multiple IDs,
if other mechanisms (e.g. DOM Core methods) can set an element's ID
in a way that doesn't conflict with the id attribute."


It says this, AIUI, because other specs do make it possible, not 
because it's a good idea that it is possible.


I understand it the same way (and guess such specs might allow custom ID 
attributes).


Personally I think it should not be possible (specifically I think 
'id' should be like 'xml:id' is and all other ways to get an ID-like 
attribute should be dropped).




I agree. But I'm not sure whether that's a 'safe' choice in a general
DOM (maybe it is, considering actual needs; if support for those other
ways were needed in the future, it might always be added in a future
version/revision of Web DOM Core, and wouldn't conflict with the html 5
spec, because of that statement -- backward compatibility would be no
more problematic than it is today for new HTML tags, and changes and
breaks are unavoidable if they're good evolutions - unless there is
already some degree of support for ID-like attributes, in which case the
break might be less safe, but I guess that's not the case).




For this purpose, either the 'isId' property of an Attr node, or a 
mechanism to set an Element's attribute as an alternative ID (or 
both) might be helpful [...]


It's not clear to me why it would be helpful.



If ID-like attributes were to be supported, especially user-defined ones,
providing any mechanism to set such attributes and/or check their 
'nature' might be useful both in script (i.e

Re: [whatwg] Use cases for Node.getElementById

2008-12-08 Thread Calogero Alex Baldacchino

Jonas Sicking ha scritto:
I see the Element interface no longer contains methods to handle Attr
nodes: since those are described as not being child nodes of an
Element in the W3C specifications, will there be any other way to
handle attributes as nodes, is the 'nature' of Attr nodes going to
change, or is there too little use (and/or support) of them, such
that the Attr interface might be quite close to its 'end of life'?


I'm not sure what to do with attributes. I'd like to drop support for 
attribute nodes (being moved around, etc), if possible, but keep the 
.attributes list and be able to use .value etc on each attribute.


Oooh, this is an interesting idea. It'd be great if we could make 
attributes not be nodes but rather some other type of object.


Ideally I'd like for them to not exist at all, and have people just 
use getAttribute(NS) instead. I've never thought that their usefulness 
outweighed their complexity.


/ Jonas


Effectively, Attrs are nodes, but aren't used as 'normal' nodes; that's
complex. Perhaps they could have been defined as not inheriting from
Node since their introduction. If creating two new interfaces, one
replacing Attr (perhaps called Attr as well, but not inheriting from
Node), the other to list attributes (AttributeCollection?), doesn't
raise any issue of backward compatibility, or solves more problems than
it may create, that's not a bad idea. :-)


For sure, getting/setting an attribute as a property of an element,
through getter/setter methods taking and returning strings, is easier
and perhaps the best choice in most cases, but there might be use cases
where the possibility of accessing an element's attributes as a list is
worth it, so perhaps the dropping of Attr should be filed for deeper
analysis and possible adoption in a later version of Web DOM Core (maybe
modifying the Attr interface in the current one)?





Re: [whatwg] Use cases for Node.getElementById

2008-12-07 Thread Calogero Alex Baldacchino

João Eiras ha scritto:


IMO, anyone suggesting a Node.getElementById clearly does not know 
very well how getElementById is supposed to work.
There are ways to traverse a DOM tree currently, either DOM
properties and methods, XPath, selectors API and such.
Considering ids are required to be unique in the context of a single
document, implementations can, and do, implement id lookup using
optimized data structures like a hash table, which is much more
performant than doing traversal.
So if there is a special node in a document, adding an id to it and
getting its reference will be performant (ideally O(1)).


Such a hash table cannot at all remove the need to traverse the DOM
tree for the purpose of a _correct_ implementation of .getElementById. A
DOM tree is a live structure, so the hash table must be checked and
updated each time a node is removed AND each time a node is inserted,
for a couple of reasons, and such an update may require some kind of
tree traversal (i.e. to compare the nodes' relative positions).
Actually, getElementById is being defined as returning the _first_
element with a matching ID, as a graceful degradation in case of
duplicate IDs and to give a better standard (= unique) definition of the
expected behavior in the face of duplicate IDs than what is stated in
DOM 3 Core (which leaves such behavior unspecified -- it's said to be
undefined -- and possibly implementation- or document-specific); this
means that, upon insertion of a new element, this one might be the new
'first' element with a certain id, so its order must be checked and the
hash table updated accordingly. When an element is removed,
independently of the previous scenario, if it was in the hash table it
might just be removed from the table as well, but that wouldn't work
correctly, because there might be a descendant, or an otherwise
following element, with the same id: after the removal, that element
would pass from the 'illegal' state of being a duplicate-ID element to
the 'legal' state of being the current element to be returned by
getElementById, so the existence of such an element must be checked and
the hash table updated accordingly. If there are far more insertions
and/or removals of elements with the id attribute set than calls to
getElementById, the advantage of a live hash table over traversing as
needed can be largely lost; anyway, a traversal can be quite fast,
especially if the DOM structure is implemented as a balanced binary tree
(and I hope you don't wish to implement any kind of non-binary tree as
the base tree structure).
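
The bookkeeping described above can be sketched as follows, in Python
and only as a toy model: Node and Document are invented classes, not how
any browser does it. The table keeps every candidate per id, and the
position comparison needed to pick the _first_ one is exactly the bit of
traversal a plain id -> element hash table can't avoid once duplicates
exist.

```python
from typing import Dict, Iterator, List, Optional, Tuple


class Node:
    def __init__(self, node_id: Optional[str] = None) -> None:
        self.id = node_id
        self.parent: Optional["Node"] = None
        self.children: List["Node"] = []

    def path(self) -> Tuple[int, ...]:
        # Child indexes from the root; such tuples compare in document order.
        steps = []
        node = self
        while node.parent is not None:
            steps.append(node.parent.children.index(node))
            node = node.parent
        return tuple(reversed(steps))

    def subtree(self) -> Iterator["Node"]:
        yield self
        for child in self.children:
            yield from child.subtree()


class Document:
    """Toy document keeping an id -> candidates table alongside the tree."""

    def __init__(self) -> None:
        self.root = Node()
        self._by_id: Dict[str, List[Node]] = {}

    def insert(self, parent: Node, child: Node, index: Optional[int] = None) -> None:
        child.parent = parent
        parent.children.insert(len(parent.children) if index is None else index, child)
        for node in child.subtree():
            if node.id is not None:
                self._by_id.setdefault(node.id, []).append(node)

    def remove(self, child: Node) -> None:
        # Drop the whole removed subtree from the table, then detach it.
        for node in child.subtree():
            if node.id is not None:
                self._by_id[node.id].remove(node)
        child.parent.children.remove(child)
        child.parent = None

    def get_element_by_id(self, element_id: str) -> Optional[Node]:
        candidates = self._by_id.get(element_id)
        if not candidates:
            return None
        # With duplicates, positions must be compared to find the first.
        return min(candidates, key=Node.path)


doc = Document()
a, b, c = Node("x"), Node("x"), Node("y")
doc.insert(doc.root, a)
doc.insert(doc.root, c)
doc.insert(c, b)  # duplicate id "x", later in document order than a
```

Here the cost is paid at lookup time; an implementation could equally
pay it at insertion/removal time by keeping each candidate list sorted.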




If the uniqueness requirement is removed, then getElementById loses
its whole meaning and should actually be removed from the
specification entirely, or else we would need more bloat like
getElementById or getElementListById and whatever.


Do you think that getElementsByTagName and getElementsByClassName are
bloaty and useless too? However, my point was, and is, a different one
(I'm not for Node.getElementById - nor am I strongly against it).




If you really need to get the element with id in a subtree, connected 
or disconnected from the main tree, one can use selectors API, DOM 
traversal, XPath, etc.


Currently, id uniqueness is defined so as to constrain not only a whole
document, but also a disconnected subtree. Then, which API is such a
constraint relevant for? If none, is it worth declaring such a
constraint for disconnected subtrees? Or is there any need for an API
directly handling IDs in disconnected subtrees?


In other words, what's being constrained by id uniqueness in a
disconnected subtree? A disconnected subtree may be a subtree of another
document, different from the one currently handled by a script; in this
case, id uniqueness is relevant for the actual document containing the
subtree (while any other document shouldn't be affected by
cross-document ID clashes). Otherwise, it may be a subtree external to
any document, and in such a case, perhaps, it might be out of scope for
the HTML 5 document specification. I'm starting to think that, for
disconnected subtrees outside any actual html document but consisting of
html elements, at most it might be said that any API dealing with unique
identifiers in a disconnected subtree of html elements must treat the
value of any such element's id attribute as the element's default ID
(the uniqueness of the id value being a consequence of both its nature
as an ID property and the nature of API methods targeting an element's
ID property, but not imposed by the specifications, since currently
there is no such method in the scope of the HTML 5 DOM). As a
consequence, the uniqueness of the id value might be in scope for a DOM
Core specification explicitly willing to handle ID properties in a
disconnected (and 'document-less') subtree of Elements, just because the
id value represents (at least) the first attribute of an HTML element to
be evaluated when looking for an ID property.


Regards, Alex.



Re: [whatwg] Citing multiple blockquote elements in HTML5

2008-12-07 Thread Calogero Alex Baldacchino

Ian Hickson wrote:
What terminology would you prefer rather than subtree? (We can't say 
document, since we are also trying to define conformance rules for 
disconnected subtrees handled from scripts.)
  


I was thinking about that again. Let me suggest something like the 
following (and I only suggest: I'm far from wishing to impose my point 
of view, and I don't want to be pedantic, but I believe that exploring 
every alternative in depth may improve the specification).


The _id_ attribute represents an element's unique identifier in the 
subtree within which the element finds itself and must contain at least 
one character. In this context, a subtree is either a whole document 
tree, or a tree of Node instances containing HTMLElements and 
disconnected from any HTML document; a subtree of a document tree is 
contained in a subtree of the first type, thus id values must be unique 
in the containing document (e.g. a duplicate id inside a document tree 
is always illegal, even if a branch of the document can be isolated 
where the id is unique, unless such branch is removed from the document).


This specification requires the _id_ attribute value to be unique in a 
subtree of the former type, thus a subtree of the latter type (e.g. a 
document fragment manipulated by a script) to be inserted into an HTML 
document must fulfil such requirement, as well as any other requirements 
defined in this specification for conformance purposes. Any API dealing 
with ID properties in any type of subtree must consider the _id_ 
attribute value of an HTMLElement as the element's default ID property; 
however, this specification doesn't preclude an element having multiple 
IDs, if other, API-specific mechanisms can set an element's ID in a way 
that doesn't conflict with the id attribute - then the rest.


One rationale for the above is that, formally, a subtree disconnected 
from any actual HTML document might be out of scope for the current 
specification, which defines conformance rules for HTML documents and 
related contexts (such as a script context or a browsing context, both 
applying to a 'connected' subtree, as far as I've understood), while a 
subtree which is disconnected from a specific HTML document, but is 
contained in another one (thus coinciding with the containing document 
tree), is already covered by the constraint for whole documents.


Another rationale is that the current specification, while relying on at 
least one method affected by ID uniqueness in a document tree (that is, 
DOM Core Document.getElementById), does not provide, nor refer to, any 
API which might be directly affected by the uniqueness of an id 
attribute value in a disconnected subtree; thus such an API may be 
indirectly related to id value uniqueness if ID properties are relevant 
for its facilities, but the subtree itself cannot be constrained by 
conformance rules before its insertion into an actual HTML document.


A further rationale is that a disconnected subtree might contain Node 
instances not implementing the HTMLElement interface, such as a 
DocumentFragment node, but also MathML/SVG elements, which might be 
embedded content elements coming from an HTML document tree, but also 
from a document of a different kind where the embedded content was 
represented by HTML elements; thus, without certain knowledge of the 
subtree's origin, applying an HTML-specific conformance rule might not be 
a correct choice until the subtree is to be inserted into an HTML document.


For the question related to space characters inside an id value, I'd 
suggest,


An ID property is not expected to contain space characters, so the 
value of an _id_ attribute should not contain any space characters. 
However, an id attribute can hold a decoded fragment identifier value 
for the purpose of same-document references, thus space characters are 
tolerated for the purpose of conformance, in order to avoid applying 
restrictions to an otherwise legal fragment identifier value not being 
part of a _URL_.


Everything, of course, IMHO.

Best regards,
Alex.




Re: [whatwg] Use cases for Node.getElementById

2008-12-06 Thread Calogero Alex Baldacchino

Simon Pieters wrote:
On Fri, 05 Dec 2008 19:19:04 +0100, Calogero Alex Baldacchino 
[EMAIL PROTECTED] wrote:



[...]


(I'm currently the editor of that proposal, currently located at 
http://simon.html5.org/specs/web-dom-core )




I'm reading it :-)

And I have a few questions. First, is it meant as the reference DOM Core 
for HTML 5 only, or in general (for other kinds of markup too)?


The 'children' attribute on the Element interface, being an 
HTMLCollection instance, suggests to me that the former might be the 
answer; otherwise, either the reference to a specific document DOM 
interface, or (if such interface were moved into Web DOM Core) the 
reference to a specific DOM in the name of the interface might perhaps 
be problematic (formally, at least). I guess this attribute has been 
declared on the Element interface instead of HTMLElement because that 
is actually the most common implementation in current browsers. 
Anyway, let me suggest (just as a hint; after all, a working draft is 
the right phase to explore any alternative) something like an 
ElementCollection interface with the same properties as HTMLCollection, 
making the latter simply inherit from the former as if it were an alias 
(the same way DocumentFragment inherits from Node). On any browser 
implementing 'children' as an HTMLCollection (without any hierarchy), 
this shouldn't be a problem for scripts, since a scripting language 
usually infers types at runtime; for strongly typed languages (and 
perhaps here we're moving from scripts to plugins), the access strategy 
may be implementation specific but, as long as the hierarchy of 
interfaces (ElementCollection - HTMLCollection) does not change the 
properties of an object implementing HTMLCollection, that shouldn't 
be a lot to work around. For instance, a Java applet (as well as any 
other object implementing LiveConnect) should work fine using 
JSObject without any modification, while direct access to the DOM would 
need a DOMServiceProvider implementation (I'm not aware of any granting 
access to the 'children' attribute, or rather, to any non-W3C DOM 
properties, but I guess that as soon as your proposal became a 
recommendation at least Sun would update its Java APIs accordingly); for 
that purpose, suggesting that any object provided by the user agent as 
implementing either interface should be wrapped by an object also 
implementing the other, for backward compatibility, might be enough 
(anyway, this is no more than a hint, very early feedback).
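
The suggested hierarchy can be sketched as follows. This is only an 
illustration under stated assumptions: `ElementCollection` is the 
interface name proposed in this message, not part of any published 
specification, and `HTMLCollectionSketch` is a hypothetical stand-in 
chosen so as not to shadow the real HTMLCollection. The elements are 
plain objects with an `id`, not DOM elements.

```javascript
// Hypothetical base interface carrying the members HTMLCollection exposes.
class ElementCollection {
  constructor(elements) {
    this._elements = elements; // backing array, an assumption of this sketch
  }
  get length() {
    return this._elements.length;
  }
  item(index) {
    const inRange = index >= 0 && index < this._elements.length;
    return inRange ? this._elements[index] : null;
  }
  namedItem(name) {
    // Simplification: looks up by id only.
    return this._elements.find((el) => el.id === name) || null;
  }
}

// The HTML-specific collection adds nothing: it is an alias by inheritance.
class HTMLCollectionSketch extends ElementCollection {}

const sketchChildren = new HTMLCollectionSketch([{ id: "a" }, { id: "b" }]);
console.log(sketchChildren instanceof ElementCollection);    // true
console.log(sketchChildren instanceof HTMLCollectionSketch); // true
console.log(sketchChildren.length);                          // 2
```

A script that only uses `length`, `item()` and `namedItem()` cannot 
tell which of the two interfaces it was handed, which is the point of 
the suggestion.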


I see the Element interface no longer contains methods to handle Attr 
nodes. Since those are described in W3C specifications as not being 
child nodes of an Element, will there be any other way to handle 
attributes as nodes, is the 'nature' of Attr nodes going to change, or 
is there so little use (and/or support) of them that the Attr interface 
might be quite close to its 'end of life'? Apart from that, I've also 
noted that the 'isId' attribute has been removed from Attr; I was 
thinking of just that when I read, in the HTML 5 spec, that "This 
specification doesn't preclude an element having multiple IDs, if other 
mechanisms (e.g. DOM Core methods) can set an element's ID in a way that 
doesn't conflict with the id attribute." For this purpose, either the 
'isId' property of an Attr node, or a mechanism to set an Element's 
attribute as an alternative ID (or both) might be helpful (anyway, 
having more than one unique identifier to handle for each element in a 
document might cause an increase in duplicated IDs).


The above brings me to the '.getElementsByClassName()' method: if it 
were to be moved from the HTML 5 spec to the Web DOM Core API, and if the 
latter is meant as some kind of replacement for W3C DOM Level 3, perhaps, 
for generality's sake, such a method might be defined as referring to a 
property named CLASS (along the same lines as ID), pointing out that such 
a property might not be bound to an attribute named 'class' (just to make 
the spec ready in case the need to support that sort of document arose 
in the near future, without having to change Web DOM Core, or to derive 
a new version, only for this reason).


But now let's come to your questions (sorry for the digression, 
sometimes I can't help starting this way...)




But the term 'Subtree' raises a problem with HTML 5: actually, the id 
attribute is defined as the element's unique ID in the *subtree* 
within which the element is found. That is, the term subtree refers 
to a whole document tree, but also to a disconnected subtree handled 
by a script (and I haven't yet understood whether such definition refers 
to a document fragment containing nodes detached from any document, or 
a whole document without a browsing context).


AIUI, it could also be a disconnected element.



And I've suggested, in another mail, to clarify it, i.e. telling a 
subtree is either a whole document (to make clearer that 'bodydiv 
id=the_id /div div ... div id=the_id

Re: [whatwg] Early feedback on header association algorithm

2008-12-05 Thread Calogero Alex Baldacchino

Aaron Leventhal wrote:

How about node.getElementByIdInSubtree?

On 12/2/2008 4:07 PM, timeless wrote:
On Tue, Dec 2, 2008 at 10:39 AM, Aaron 
Leventhal[EMAIL PROTECTED]  wrote:
  
Maybe there is a deeper problem if copy & paste doesn't work right 
because of IDs?

Or maybe there should be a node.getDescendantById() method?
 


maybe, but not with that name.

  Results 1 - 10 of about 4,480,000 for Descendent [definition]. 
(0.22 seconds)
  Results 1 - 10 of about 8,370,000 for Descendant [definition]. 
(0.41 seconds)


the wikipedia links are confusing enough

http://en.wikipedia.org/wiki/Descendant links to:
http://en.wiktionary.org/wiki/descendent
which has an also link to http://en.wiktionary.org/wiki/descendant
which has a 'US' audio file

So the web says that '-dant' is favored 2:1 over '-dent', which is a
fairly bad margin considering the spelling errors we've seen in
html/http.

I'd sooner see Node.getElementById and risk the confusion of it
returning fewer nodes than Document.getElementById.


   





That's about the same as moving the getElementById method from the 
Document interface to the Node interface 
(Document inherits from Node, so the subtree actually traversed would 
change based on the node on which the method is invoked; that is, 
'anElement = document.getElementById(anEl)' would work as always, 
while anElement.getElementById(anEl) would look for a descendant of 
'anElement' with the same id), because, essentially, IDs are a common 
feature of all document types, whatever the actual name of the attribute 
representing an ID, so a possible .getElementByIdInSubtree() method 
should be defined on some DOM Core interface, and so would be out 
of scope for HTML 5 (as I've been told .getElementById is; there is a 
'Web DOM Core' specification under construction). But the term 'Subtree' 
raises a problem with HTML 5: actually, the id attribute is defined as 
the element's unique ID in the *subtree* within which the element is 
found. That is, the term subtree refers to a whole document tree, but 
also to a disconnected subtree handled by a script (and I haven't yet 
understood whether such definition refers to a document fragment 
containing nodes detached from any document, or a whole document without 
a browsing context).


Perhaps the possible confusion arising if moving .getElementById() to 
the Node interface might be avoided by leaving it on the document 
interface, and overloading it with, for instance,


Element   getElementById(in DOMString elementId, in Node rootElement);

so a call to document.getElementById would behave as always (or rather, 
as it will be redefined in Web DOM Core, which should be 'pick the first 
element with a matching id'), and would coincide with a call to 
document.getElementById(something, document); while a call to 
document.getElementById(something, anElement) would search for a matching 
ID among the descendants of 'anElement', whether anElement is a node of 
the current document, a node removed from any document or created by a 
script, or a node in another document which both the current document and 
the current script context are allowed to access (but a 'script 
context' is an HTML 5 related concept, so it might be generalized as a 
DOM access context).
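
The behaviour of the proposed two-argument overload can be sketched as 
below. It is a minimal illustration assuming a hypothetical node model 
(plain objects with `id` and `children`); `getElementByIdIn` is an 
invented name standing in for the suggested 
`getElementById(elementId, rootElement)`, and the 'first match in tree 
order' rule is the one this message expects from Web DOM Core.

```javascript
// Hypothetical node model: { id, children }. Not the real DOM.
// Returns the first descendant of rootNode, in tree order (depth-first,
// pre-order), whose id equals elementId, or null if there is none.
function getElementByIdIn(elementId, rootNode) {
  for (const child of rootNode.children || []) {
    if (child.id === elementId) {
      return child;
    }
    const found = getElementByIdIn(elementId, child);
    if (found !== null) {
      return found;
    }
  }
  return null;
}

// Two elements share an id, each inside a different section.
const firstMatch = { id: "the_id", children: [] };
const secondMatch = { id: "the_id", children: [] };
const sectionA = { id: "a", children: [firstMatch] };
const sectionB = { id: "b", children: [secondMatch] };
const wholeDoc = { id: null, children: [sectionA, sectionB] };

// Searching from the document root picks the first element in tree order.
console.log(getElementByIdIn("the_id", wholeDoc) === firstMatch);  // true
// Searching from a subtree root only sees that subtree's descendants.
console.log(getElementByIdIn("the_id", sectionB) === secondMatch); // true
```

Because only descendants are searched, passing a node as root never 
returns that node itself; whether the real method should behave this 
way is left open by the message.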


Regards, Alex




Re: [whatwg] Handling /br in the after head insertion mode

2008-12-04 Thread Calogero Alex Baldacchino

Tommy Thorsen wrote:


For the record, the following markup:

<!doctype html><body></br>

results in:

html
   head
   body
  br

with the current algorithm, because the "in body" insertion mode 
treats </br> as if it was a <br>.



Maybe not fully on topic.

Section 4.5.3 says,

|br| elements must be empty. Any content inside |br| elements must not 
be considered part of the surrounding text.


The first part is clearly an authoring rule. But the second part cannot 
so clearly be one as well, because an author might read it as a 
reference to a parsing rule discarding anything like <br>Something</br> 
(but it isn't). Yet it can't be a parsing rule, since it contrasts 
with the "in body" insertion mode (but not only that), which would turn 
it into <br>Something<br>, thus presenting the content to the end user 
(and obviously it's unlikely that anyone visiting a web page would check 
the HTML code looking for content to ignore :-P). For the purpose of 
validation, the first part should be enough (that is, when a </br> end 
tag is found, an error may be reported to the author). Perhaps the 
second sentence should be modified with references to scripts (e.g. to 
tell that it is wrong to use .innerHTML or .appendChild() on a <br> to 
modify the document) and to styles (e.g. to tell that it's wrong to 
expect any font property to affect the surrounding text), to make it 
more clearly an authoring rule? Or perhaps changed into an example of 
bad markup? Or removed, if it is a source of confusion with parsing 
rules?


Otherwise, I don't follow its meaning (perhaps I'm the only confused 
one). I mean, as far as I know, XML-derived languages require a closing 
tag for every element, while HTML has never had such a requirement per 
se, but that's a matter of syntax, not semantics. And, semantically 
speaking, whatever (but a closing tag) follows, in the markup, an 
element which can't have children obviously consists of one or more 
siblings of that element, while its closing tag (again, that's syntax), 
if misplaced, or not provided for by the syntax rules at all, causes a 
parse error (which may or may not be handled gracefully by the UA; 
that's a matter of parsing rules). That is, declaring an element as empty 
should imply per se that the element cannot have any descendant, so its 
content is not... its content, but a syntax error. Perhaps defining the 
empty content model this way might avoid misunderstandings. Or am I 
making some mistake?


Best Regards, Alex.




[whatwg] URL parsing and same-document references [was: Re: Citing multiple blockquote elements in HTML5]

2008-12-04 Thread Calogero Alex Baldacchino
Calogero Alex Baldacchino wrote:


Maybe the first is wrong, and I'm still unsure of the second. My 
concern is that a character-by-character comparison between an id value 
and a fragment identifier may fail in several ways. What about 
href="#foo bar " and id="foo bar "? The current rules would strip the 
trailing space only for the href, so the match would fail (but we might 
survive broken links). Escaping both and then comparing would succeed, 
as would first escaping and then unescaping the href value before 
comparing (should it be pointed out, somewhere, that a fragment 
identifier must be unescaped before being compared to an id or a name? 
Is it, and I've missed it? Having space characters in the unreserved 
production means they don't need to be escaped, but does it also mean 
they must be decoded from their pct-encoded form, after parsing and for 
resolving?). Likewise, stripping the trailing spaces in both cases would 
succeed, but would fail when comparing id="foo bar " with 
href="#foo bar%20" (which is a valid URL, according to the current 
parsing rules), even with escaping rules (in this case the id value's 
trailing space must stay there). And what about id="foo%20bar" in 
http://foo.example.org/foo.html and href="#foo bar" on the same page, 
or on a page having the same base URL, or a base element with 
href="http://foo.example.org/foo.html"? My point is that, since 
comparisons for matching purposes happen after URL parsing and 
resolution, and the id value is not involved in those steps, 
character-by-character comparisons may fail without a prior 
normalization of both the fragment identifier and the id value (or one 
of them). However, if the above is already solved by the parsing and 
resolving rules and I've misunderstood the spec, I withdraw it all and 
apologize. Or, perhaps, must a valid URL with a valid fragment, which 
is equivalent to but not exactly matching an id value, be considered a 
broken link?


Maybe the above needs further clarification. Let me start from the URL 
parsing (and resolving) rules: after the URL is validated, it's divided 
into its components, but nothing is stated about normalization and/or 
%-encoded characters. I think that applying some normalization may 
be useful to parse equivalent URLs in a consistent manner, helpful when 
dealing with the interfaces for URL manipulation, as described in 
section 2.5.5, and, last but not least, an improvement in matching 
relative references (especially same-document references). A minimum 
requirement, for standardization's sake, may consist of decoding any 
%-encoded characters in the fragment production which are part of the 
unreserved production as defined in RFC 3986, with the changes defined 
in the HTML 5 specification for URL parsing, and restricted to the 
Unicode ranges representing valid characters for an attribute value 
(those which are prohibited neither as 'text' nor as 'character 
references'). This way, a character-for-character comparison between a 
fragment identifier and an id attribute value, which would have been 
equivalent but not matching without the normalization, should succeed 
most of the time, because, as a consequence of the changes applied by 
the current HTML 5 specification to the unreserved production, such 
characters might or might not be %-encoded in a valid URL, while an id 
value is likely to contain them non-encoded.
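
The normalization described above can be sketched as follows. This is 
a simplification under stated assumptions: it uses the standard 
`decodeURIComponent`, which decodes every %-encoded triplet rather than 
only the restricted set of characters proposed in this message, and 
the helper names `normalizeFragment` and `fragmentMatchesId` are 
invented for the example.

```javascript
// Decode %-encoded triplets in a fragment identifier before comparison.
// decodeURIComponent throws URIError on malformed input; leaving the
// fragment untouched in that case is an assumption of this sketch.
function normalizeFragment(fragment) {
  try {
    return decodeURIComponent(fragment);
  } catch (error) {
    return fragment;
  }
}

// Character-for-character comparison after normalizing the fragment only.
function fragmentMatchesId(fragment, idValue) {
  return normalizeFragment(fragment) === idValue;
}

console.log(fragmentMatchesId("foo%20bar", "foo bar")); // true
console.log(fragmentMatchesId("foo bar", "foo bar"));   // true
console.log(fragmentMatchesId("foo bar", "foo%20bar")); // false
```

The last call shows the remaining mismatch: an id value that itself 
contains a %-encoded triplet is never decoded, so it no longer matches 
the decoded fragment.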


After the above fragment normalization, a character-for-character 
comparison would fail if the id value contained any %-encoded triplet 
representing a decoded character, such as "foo%20bar". Anyway, that may 
be a weird thing to deal with, since it can be the %-encoded form of 
"foo bar", but also the decoded form of "foo%2520bar". In other words, 
if we apply the same normalization to two complete URLs and then compare 
them, the result is quite reliable, but if we start from a component 
(such as a fragment identifier stored in an id attribute value) it's not 
easy to tell whether any normalization has been applied, and which one, 
so there is always a chance of false positives or false negatives. 
According to RFC 3986, section 4.4 (Same-Document Reference), 
the correct interpretation of a URI as a same-document reference cannot 
be taken as guaranteed, thus the mismatch between, for instance, the 
decoded fragment identifier "foo bar" and the id attribute value 
"foo%20bar", set against (as I think) a wide majority of good matches, 
can be reasonable. Anyway, a kind of double check might be considered, 
such as:


- comparing the %-unescaped fragment identifier with the ID of each 
element in the DOM;
- upon failure, applying a %-unescape algorithm to the ID, then 
comparing again with the fragment identifier and, if matching, marking 
the element as a 'possible choice';
- upon a perfect (exact) match, without unescaping the evaluated element 
ID, choosing such element as the referenced document part (actually 
defined as the indicated part of the document in the spec) and stopping;
- without any

Re: [whatwg] Citing multiple blockquote elements in HTML5

2008-12-03 Thread Calogero Alex Baldacchino

Ian Hickson wrote:

On Wed, 3 Dec 2008, Calogero Alex Baldacchino wrote:
  
But isn't it worth spending a word everywhere in the spec to tell when 
something is a quirk for backward compatibility, which might go away in 
the future, and when it's not, because that's not needed?



None of the implementation requirements in HTML5 will go away in the 
future. We will always have to define how implementations are to handle all 
inputs, today, tomorrow, and 100 years from now. Authors aren't going to 
stop writing invalid documents, unfortunately; and even if they did, the 
documents that exist today aren't going anywhere. (One of the goals of the 
HTML5 project is to document how someone in 2100 AD, or even 21000 AD, 
should handle Web pages of today, so that today's heritage isn't lost.)



  


Ok, and agreed. Due to the nature of the web (and of web authors' 
practices), a strict conformance requirement (such as it might be for a 
C compiler) will never be a good idea.



I mean, if you allow space characters inside an id value, as a parsing rule,
you can face something like '<div id="foo bar ">', that is an id consisting of
more than one token. Is it good to leave it untouched? Yes? Ok, but what
does it mean for CSS, since there is a reference to it as one reason to
allow space characters? That is, can a browser handle an id selector starting
with the '#' character and being broken by a blank space?



Sure:

   #foo\ bar { ... }

...would match an element with id="foo bar".


  


Right, now I remember... sorry for my mess...

Now, let's say, instead, that a user agent conforming with the HTML 5 
specification must cut off any token after the first one (I know 
that currently "foo bar" is taken as is), that is, <div id="foo bar"> 
becomes <div id="foo">, and <div id=" foo "> is valid too. In such a 
case, skipping any spaces too, and stating the same behaviour for strings 
passed to .getElementById(), could be nice as a graceful degradation for 
documents non-conforming with the rule "the value [of an id attribute] 
must not contain any space characters", but this might fail with CSS 
selectors such as 'div[id="foo bar"]'.
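
The hypothetical 'keep only the first token' behaviour discussed above, 
and the way it would clash with an exact attribute-value comparison, 
can be sketched as below. Both helper names are invented for the 
example; `attributeEquals` only models the exact string comparison of a 
CSS `[id="..."]` selector and is not a selector engine.

```javascript
// Hypothetical rule: the effective ID is the first token of the raw id
// attribute value, split on HTML space characters (space, tab, LF, FF, CR).
function firstIdToken(rawId) {
  const tokens = rawId.split(/[ \t\n\f\r]+/).filter((t) => t.length > 0);
  return tokens.length > 0 ? tokens[0] : "";
}

// Model of an exact attribute-value match such as div[id="foo bar"]:
// the selector value is compared with the whole attribute value.
function attributeEquals(rawAttributeValue, selectorValue) {
  return rawAttributeValue === selectorValue;
}

console.log(firstIdToken("foo bar")); // "foo"
console.log(firstIdToken(" foo "));   // "foo"

// The attribute selector still sees the untouched markup value...
console.log(attributeEquals("foo bar", "foo bar"));               // true
// ...while the truncated ID no longer equals what the selector asks for.
console.log(attributeEquals(firstIdToken("foo bar"), "foo bar")); // false
```

So an ID lookup and an attribute selector written against the same 
markup would disagree, which is the failure the message points at.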



I don't follow you there. What problem are you trying to solve?

  


Just trying to explain why I was suggesting such a behaviour (= 
stripping space characters) in my first message about that. I was 
wrongly ignoring the case of id="foo bar" and just focusing on 
id=" foo ", but not confusing authoring and parsing rules (even if I 
admit I sometimes have strict conformance in mind). If the latter were 
the only naughty boy out there, perhaps stripping spaces might have made 
some sense (though not the best choice without touching other things 
maybe out of scope).
  

Perhaps a compromise, if acceptable for backward compatibility, might be:
- when the id value must be compared to a fragment identifier, strip any
trailing space characters; if the match fails, escape any other space
characters both in the id value and in the fragid and try again;



Why not just do what we do now, and treat the attribute as-is?


  

- when an attribute is defined to hold an url and its value has spaces in its
path/query/fragment, escape them before resolving the url (not sure if
needed);



Again, aren't the current rules for handling URLs as defined in HTML5 
enough?


  
  


Maybe the first is wrong, and I'm still unsure of the second. My concern 
is that a character-by-character comparison between an id value and a 
fragment identifier may fail in several ways. What about 
href="#foo bar " and id="foo bar "? The current rules would strip the 
trailing space only for the href, so the match would fail (but we might 
survive broken links). Escaping both and then comparing would succeed, 
as would first escaping and then unescaping the href value before 
comparing (should it be pointed out, somewhere, that a fragment 
identifier must be unescaped before being compared to an id or a name? 
Is it, and I've missed it? Having space characters in the unreserved 
production means they don't need to be escaped, but does it also mean 
they must be decoded from their pct-encoded form, after parsing and for 
resolving?). Likewise, stripping the trailing spaces in both cases would 
succeed, but would fail when comparing id="foo bar " with 
href="#foo bar%20" (which is a valid URL, according to the current 
parsing rules), even with escaping rules (in this case the id value's 
trailing space must stay there). And what about id="foo%20bar" in 
http://foo.example.org/foo.html and href="#foo bar" on the same page, 
or on a page having the same base URL, or a base element with 
href="http://foo.example.org/foo.html"? My point is that, since 
comparisons for matching purposes happen after URL parsing and 
resolution, and the id value is not involved in those steps, 
character-by-character comparisons may fail without a prior 
normalization of both the fragment identifier and the id value (or one 
of them). However, if the above is already solved by the parsing and 
resolving

Re: [whatwg] Citing multiple blockquote elements in HTML5

2008-12-03 Thread Calogero Alex Baldacchino

Ian Hickson wrote:



It's intended as a replacement for DOM3 Core, I believe.

  
Then, I hope for a convergence with the W3C, as it's one of the goals of 
the WHATWG. I believe neither organization wishes for heavy 
fragmentation of standards.





Re: [whatwg] Citing multiple blockquote elements in HTML5

2008-12-03 Thread Calogero Alex Baldacchino


Jonas Sicking wrote:

 In firefox we now always return the first element with the requested
 ID. I think IE does the same. This seems equally reliable and much
 less likely to cause page breakage or interoperability issues.
  



That's reasonable, and I pointed out that it should be standardized 
(expressing a few doubts about the advisability of doing so through the 
HTML DOM), but I now acknowledge that's out of scope for HTML 5.



 As for CSS, I believe in firefox we make all elements with a given ID
 match the #foo selector. I don't have a strong feeling if this is the
 correct thing to do, or just make the first one match. The only
 concern I have is performance.

 / Jonas
  

That's reasonable too, and I was wondering about possible consistency 
issues with the DOM, but I acknowledge, as above, that this would be out 
of scope for HTML 5.


Regards, Alex.






Re: [whatwg] Citing multiple blockquote elements in HTML5

2008-12-02 Thread Calogero Alex Baldacchino

Ian Hickson wrote:

On Mon, 1 Dec 2008, Calogero Alex Baldacchino wrote:
  
Yes, a hash link (<a href="#foo">) will scroll to the element with an 
id="foo".  If coding properly, you'll virtually *never* use an <a> for 
an actual *anchor*, but rather will target the most semantically 
appropriate element, such as a heading or a container with the 
appropriate @id.
  

Thanks! That's what I was missing in the specification (I should give it a
more accurate reading). Does it apply to every element, covering the cite
element too?



See:
   http://www.whatwg.org/specs/web-apps/current-work/#scroll-to-fragid

Let me know if that doesn't address your use case.

Cheers,
  
Indeed it does, and I find such behaviour more consistent than letting 
just the <a> element with a 'name' or an 'id' be an anchor for 
navigating to a fragment :-)


However, now I have a question. The 3rd step of the algorithm to 
determine the indicated part of the document says,


If there is an element in the DOM that has an ID exactly equal to 
/fragid/, then the first such element in tree order is the indicated 
part of the document; stop the algorithm here.


Shouldn't the id be unique in the whole document? Section 3.3.3.2 says,

The |id| attribute represents its element's unique identifier. The 
value must be unique in the subtree within which the element finds 
itself and must contain at least one character. The value must not 
contain any space characters.


then follows,

If the value is not the empty string, user agents must associate the 
element with the given value (exactly, including any space characters) 
[...]


First of all, isn't it a bit conflicting? Are space characters legal or 
not? If not, perhaps that might say "discarding any space characters" 
for graceful degradation, or, at the beginning of the paragraph, "If the 
value is not the empty string and does not contain any space characters, 
[...]" if such an id is illegal (for graceful degradation's sake, when 
more than one token may be created by skipping any space character, 
either the first token might be chosen, or each token could represent a 
different id, but the latter would require dealing explicitly with 
multiple ids the same way multiple classes are dealt with...). The rest 
of the paragraph says,


for the purposes of ID matching within the subtree the element finds 
itself (e.g. for selectors in CSS or for the |getElementById()| method 
in the DOM).


I guess the above covers, for instance, the case of a document holding 
an element with id="foo" and an iframe whose content document holds 
another element with the very same id; but speaking about subtrees might 
suggest the following is legal:

<body>
<div><p id="foo">something</p><p>something else</p></div>
<div><p>something else from <cite id="foo">Whatever Example</cite></p></div>
</body>

since we can separate two different subtrees, in each of which the id
'foo' is unique. Perhaps that could hold for CSS selectors isolating the
proper subtree (honestly, I don't remember whether that's actually legal
in CSS, though I've always thought it isn't), but it might conflict with
the DOM, because the 'getElementById' method is defined only on the
Document interface, and from this point of view both elements sit in the
same subtree, namely the whole document tree. About such a case, DOM
Level 3 Core says, "If more than one element has an ID attribute with
that value, what is returned is undefined." As a consequence, if the
desired behaviour were to select the first matching id (for consistency
with the use of the first matching id as a fragment identifier in HTML
documents), or in any case to establish a well-defined behaviour when
more than one element has the very same id (I don't think we should
leave the choice to the implementation, because I don't think we want
every browser to deal with clashing ids in its own browser-specific
manner), I suppose the 'getElementById' method should be redefined
accordingly; but that can't be done at the level of the 'Document'
interface until an eventual fourth level of the DOM Core specification,
which is outside the scope of HTML 5.
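
To make the "first matching id in tree order" behaviour concrete, here is
a minimal sketch in Python. It is only an illustration of the rule
discussed above, not the DOM API: the Node class and the
get_element_by_id helper are hypothetical names, and the tree mirrors the
two-div example given earlier.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    # Hypothetical stand-in for a DOM element: a tag, an optional id, children.
    tag: str
    id: Optional[str] = None
    children: list = field(default_factory=list)

def tree_order(node):
    """Yield node and all its descendants in tree (depth-first, pre-order) order."""
    yield node
    for child in node.children:
        yield from tree_order(child)

def get_element_by_id(root, wanted):
    """Return the first node in tree order whose id equals wanted, else None."""
    for node in tree_order(root):
        if node.id == wanted:
            return node
    return None

# The example document: two subtrees, each holding an element with id "foo".
body = Node("body", children=[
    Node("div", children=[Node("p", "foo"), Node("p")]),
    Node("div", children=[Node("p", children=[Node("cite", "foo")])]),
])

print(get_element_by_id(body, "foo").tag)  # prints "p", the first match
```

With this rule the duplicate id on the cite element is simply never
reached, which is the same outcome as the fragment identifier algorithm
quoted above.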


A solution would be to add 'getElementById' to the HTMLDocument
interface, but that might be troublesome, since HTMLDocument no longer
inherits from Document, so I can see two possible scenarios. In the
worst one, a user agent is implemented in a language that doesn't
support multiple inheritance, so either the two methods must be
implemented in the same object under different names (which is bad when
exposing the interface to bindings for scripting languages that support
inheritance and function overriding), or two different objects must be
created, one to deal with HTML documents, the other for generic (i.e.
XML) documents (which is bad in general); in both cases, this means
doubling the code and the maintenance burden. In the other scenario,
multiple inheritance helps us, yet two methods must be defined

Re: [whatwg] Citing multiple blockquote elements in HTML5

2008-12-02 Thread Calogero Alex Baldacchino

Benjamin Hawkes-Lewis wrote:

Calogero Alex Baldacchino wrote:

[...]



I think you're confusing parsing rules that conforming user agents 
must follow to associate identifiers with elements (even when ids are 
duplicated) with the authoring rules that conforming documents must 
follow (ids must be unique).


Ok, so what's what?

When you read "The value must not contain any space characters.", is
that, for you, an authoring rule for conforming documents? Ok.


When you read "*If the value is not the empty string, user agents must
associate the element with the given value (exactly, including any space
characters)* for the purposes of ID matching within the subtree the
element finds itself (e.g. for selectors in CSS or for the
|getElementById()| method in the DOM).", is that, for you, a parsing rule
for conforming user agents? Ok. But isn't it worth spending a word,
wherever it applies in the spec, to say when something is a quirk kept
for backward compatibility, which might go away in the future, and when
it's not, because that's not needed? And when it's a legacy from the
past, shouldn't it be considered in every aspect? After all, wasn't one
of the main goals of HTML 5 to turn unwritten, browser-specific rules
into written, standard behaviours?


I mean, if you allow space characters inside an id value as a parsing
rule, you can face something like '<div id="foo bar ">', that is, an id
consisting of more than one token. Is it good to leave it untouched?
Yes? Ok, but what does that mean for CSS, since CSS is mentioned as one
reason to allow space characters? That is, can a browser handle an id
selector that starts with the '#' character and is broken by a blank
space? Or better, is that legal in CSS? Honestly, again, I don't
remember well; I've never tried anything like that (since it makes no
sense to me), and I think it's illegal. But let's say it's illegal for
conforming style sheets, while existing user agents may or may not allow
it, each with its own behaviour. If we turn a blind eye to
'<div id="foo bar ">' in a piece of HTML 5 code, but leave its CSS
counterpart to each implementation, we'll solve only half of the problem
(where the problem is turning unwritten rules into written, and possibly
improved, standards), won't we? But any kind of CSS quirk would be out
of scope for an HTML specification, and I believe '<div id="foo bar ">'
is trouble (if instead "foo bar" is not a valid id selector for CSS in
any browser, that means we're allowing user agents to parse as valid an
id which is inconsistent with CSS, and so CSS selectors cannot be a
reason to allow space characters inside an id string - at least, with
respect to any direct reference to the identifier value). But it might
be trouble per se, even just for HTML conformance of user agents, since
a URL fragment might contain escaped space characters, but an escaped
space isn't the same thing as the space character itself, so the rule of
exact matching, applied to space characters inside an id, may cause
trouble unless the '<div id="foo bar ">' case is considered extensively.


Now, let's say, instead, that a user agent conforming to the HTML 5
specification must cut off any token after the first one (I know that
currently "foo bar" is taken as is), that is, <div id="foo bar"> becomes
<div id="foo"> and <div id=" foo "> is valid too. In such a case,
skipping any spaces too, and stating the same behaviour for strings
passed to .getElementById(), could be a nice graceful degradation for
documents that don't conform to the rule "the value [of an id attribute]
must not contain any space characters", but it might fail with CSS
selectors such as 'div[id="foo bar"]'.
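
The "keep only the first token" behaviour described above can be
sketched as follows. This is just an illustration in Python, not
something the spec or any browser defines: the helper name
first_id_token is hypothetical, and "space characters" is assumed to
mean the five HTML 5 space characters (space, tab, LF, FF, CR).

```python
import re

# Assumption: the HTML 5 space characters are space, tab, LF, FF and CR.
_SPACE_RUN = re.compile(r"[ \t\n\f\r]+")

def first_id_token(value):
    """Return the first run of non-space characters in value, or None.

    None means the value is empty or made only of space characters,
    so no id would be associated with the element.
    """
    for token in _SPACE_RUN.split(value):
        if token:
            return token
    return None

print(first_id_token("foo bar"))  # prints "foo"
print(first_id_token(" foo "))    # prints "foo"
```

Applying the same helper to the string passed to .getElementById() would
give the symmetric behaviour mentioned above, while an attribute selector
comparing the raw attribute value would still see "foo bar".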


Perhaps a compromise, if acceptable for backward compatibility, might be:
- when the id value must be compared to a fragment identifier, strip any 
trailing space characters; if the match fails, escape any other space 
characters both in the id value and in the fragid and try again;
- when an attribute is defined to hold an url and its value has spaces 
in its path/query/fragment, escape them before resolving the url (not 
sure if needed);
- for the purpose of ID matching through the DOM 'getElementById' 
method, leave the id value untouched;
- for the purpose of ID matching through CSS selectors accessing it as 
an attribute, leave the id value untouched;
- for the purpose of ID matching through CSS selectors accessing it
directly (e.g. '#foo'), either choose the first sequence of non-space
characters or let the match fail (I can't decide which is better, but
perhaps the former would fail as well, since I guess anyone coding
<div id="foo bar"> not only as a fragment identifier, but also for
styling, might have the nice idea to write
"#foo bar { font-weight : bold; }" as well).
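
The first item of the list above (matching an id against a fragment
identifier) could be sketched like this. It is a rough illustration in
Python under stated assumptions: the function names are hypothetical,
trailing spaces are stripped from the id value only, and "escaping" is
taken to mean percent-encoding each of the five HTML 5 space characters.

```python
# Assumption: the HTML 5 space characters are space, tab, LF, FF and CR.
SPACE_CHARACTERS = " \t\n\f\r"

def escape_spaces(text):
    """Percent-encode every space character, leaving everything else alone."""
    for ch in SPACE_CHARACTERS:
        text = text.replace(ch, "%{:02X}".format(ord(ch)))
    return text

def id_matches_fragment(id_value, fragid):
    """Two-step match: exact after trimming, then with spaces escaped."""
    trimmed = id_value.rstrip(SPACE_CHARACTERS)
    if trimmed == fragid:
        return True
    # Second attempt: escape remaining spaces on both sides and retry.
    return escape_spaces(trimmed) == escape_spaces(fragid)

print(id_matches_fragment("foo bar ", "foo%20bar"))  # prints True
print(id_matches_fragment("foo bar", "foo"))         # prints False
```

The other items leave the id value untouched, so for them a plain string
comparison is all that would be needed.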


Anyway, if the id value is also a fragment identifier, which might have 
space characters (since parsing rules prescribe to add such characters 
to the unreserved production), does the (authoring) rule the value

Re: [whatwg] Citing multiple blockquote elements in HTML5

2008-12-02 Thread Calogero Alex Baldacchino

Ian Hickson wrote:
Exactly how getElementById() works is out of scope for HTML5, but in the 
Web DOM Core spec that Simon is working on I imagine he has specced that 
it will pick the first element with a matching ID or some such behavior.


Cheers,
  
Is it meant as something of a break with W3C DOM Core (e.g. redefining
interfaces with somewhat different properties), as a replacement, or
will there be continuity (e.g. inheriting and overriding methods to
redefine their behaviours)?





Re: [whatwg] Citing multiple blockquote elements in HTML5

2008-12-01 Thread Calogero Alex Baldacchino

Tab Atkins Jr. wrote:

[[off list]]

  

Well, in fact, the above could be done as well by 'playing' with anchors
(but is it still possible to set an anchor somewhere in the document, such
as <a id="foo" />? I haven't found examples for that, perhaps I'm missing
something...).



Yes, a hash link (<a href="#foo">) will scroll to the element with an
id="foo".  If coding properly, you'll virtually *never* use an <a> for
an actual *anchor*, but rather will target the most semantically
appropriate element, such as a heading or a container with the
appropriate @id.

~TJ
  
Thanks! That's what I was missing in the specification (I should give
it a more careful reading). Does it apply to every element, including
the <cite> element? If so, there is no need for a new attribute to
relate quoted content to its cited source (especially to relate
several quotations to a single, or a main, complete reference);
something like:


<p>An interesting element is the <code>&lt;cite&gt;</code> element. Its
definition is: <q cite="#cite">The <code>cite</code> element
represents the title of a work (e.g. a book, a paper, an essay, a poem,
a score, a song, a script, a film, a TV show, a game, a sculpture, a
painting, a theatre production, a play, an opera, a musical, an
exhibition, etc). This can be a work that is being quoted or referenced
in detail (i.e. a citation), or it can just be a work that is mentioned
in passing.</q></p>


<p>The <code>cite</code> element's semantics fit well inside a
bibliographic citation, but only refer to the title of the work, not to
the entire citation. In fact, it is said: <q cite="#cite">The
<code>cite</code> element is obviously a key part of any citation in a
bibliography, but it is only used to mark the title</q> [...]</p>

[...]
<p>A complete reference for the <code>cite</code> element is found in
WHATWG <a
href="http://www.whatwg.org/specs/web-apps/current-work/multipage/"><cite>HTML
5</cite></a> draft recommendation, section <a
href="http://www.whatwg.org/specs/web-apps/current-work/multipage/text-level-semantics.html#the-cite-element"><cite
id="cite">4.6.3 The <code>cite</code> element</cite></a></p>


should work fine, while in a scientific paper something like

...<a href="#whref">[whnt02]</a>...

might be an instance of:

<p><b>[whnt02]</b>: <cite id="whref">A new theory on White Holes:
universe regeneration proved.</cite>, John Doe and Jack Someone, 2013,
Science Paper Hall, ISBN:'example_isbn_code'</p>


in a similar way as

... <a href="#jd">John Doe</a>...

is an instance of:

<p>The name <dfn id="jd">John Doe</dfn> is the one commonly used to
indicate a person whose identity is unknown; it may be found in some
examples to indicate a generic person involved in some context, to
indicate whoever else could be involved too, or to focus attention
on the context itself or its related subject, regardless of any real
person involved or the likelihood of the facts happening.</p>


In conclusion, what I was suggesting is already possible, if I'm not
misunderstanding (again?), without any need for additional attributes.
The reverse relationship (from a <cite> to one or more
<q>/<blockquote>), instead, might be more difficult, but I agree with
Ian Hickson that some 'real world' need should arise before addressing it.
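
As a rough illustration of how a tool could follow such same-document
references (an href or cite attribute of the form "#id") to the element
carrying that id, here is a small Python sketch built on the standard
html.parser module. The FragmentIndex class and resolve_references
function are hypothetical names made up for this example, and picking
the first element with a given id is an assumption consistent with the
fragment identifier rule discussed earlier in the thread.

```python
from html.parser import HTMLParser

class FragmentIndex(HTMLParser):
    """Collect element ids and same-document references from markup."""

    def __init__(self):
        super().__init__()
        self.targets = {}     # id -> tag name of the first element carrying it
        self.references = []  # (tag, fragment) pairs found in href/cite

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        element_id = attributes.get("id")
        if element_id and element_id not in self.targets:
            self.targets[element_id] = tag
        for name in ("href", "cite"):
            value = attributes.get(name)
            if value and value.startswith("#"):
                self.references.append((tag, value[1:]))

def resolve_references(markup):
    """Return (referring tag, fragment, target tag or None) triples."""
    index = FragmentIndex()
    index.feed(markup)
    index.close()
    return [(tag, frag, index.targets.get(frag))
            for tag, frag in index.references]

sample = (
    '<p>...<a href="#whref">[whnt02]</a>...</p>'
    '<p><q cite="#whref">universe regeneration</q></p>'
    '<p><cite id="whref">A new theory on White Holes</cite></p>'
)
for triple in resolve_references(sample):
    print(triple)
```

Both the link and the quotation resolve to the same cite element, which
is the "several quotations, one complete reference" case described
above; the reverse lookup would need a second pass over the references.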


BR, Alex




Re: [whatwg] Citing multiple blockquote elements in HTML5

2008-11-30 Thread Calogero Alex Baldacchino

Ian Hickson wrote:

I've removed the offending text.

I don't think we can say that quotes should always come before their 
citations. For example, it's easy to imagine a blog that says:


   <p><cite>Book The First</cite> says:</p>
   <blockquote>...from book 1...</blockquote>
   <p>But <cite>Book The Second</cite> says:</p>
   <blockquote>...from book 2...</blockquote>

...which is equally problematic.

Frankly, I'm not sure this was solving any real problems anyway.

  
I'm not sure I understand the whole function of the <cite> element,
and perhaps I'm harping on ids and references again, but the
relationship between a <cite> and a quotation could be disambiguated by
pairing an id with a reference to that id. For instance, if it made
sense to relate several quotations to a single <cite>, the <cite>
element could hold the id, and every block related to the same source
could refer to it with an attribute, let's call it 'from'.


Q: What problem does it solve?
Uhm... perhaps a first <cite> could be a complete reference, i.e. a book
name along with its author, publisher and ISBN code, or a reference to
another site/blog, its author and a link to the page with the quoted
text. Ok, and now? Let's say any other reference to the same source could
be shorter and need no markup, except for styling, while the quotation
block could remind the reader of the whole source: for instance, when
the user moves the pointer over it, or focuses the blockquote, a tooltip
could present the citation content - as if it were the content of the
title attribute - and a screen reader could speak it aloud after the
quotation, and if the referred <cite> contained a link, a click on the
blockquote content (or any other kind of activation) could open the
linked page in another tab/window.


Certainly, no reliance on natural language can disambiguate such a
relationship (human languages are ambiguous and context-dependent by
definition).


BR, Alex




Re: [whatwg] Fallback styles for legacy user agents [was: Re: Deprecating <small> , <b> ?]

2008-11-30 Thread Calogero Alex Baldacchino

timeless wrote:

i don't really want to spend a lot of time with this, but any feature
authors are provided will be abused.

among my list of things which i wish were never let out of pandora's
box are defining accesskeys (instead of commands) in html, and another
which i'd hope dies on the vine is aural css.

sure it is theoretically nice to let someone encode audio. however on
average it's going to be used more often by advertisers than actual
content developers.
  


Of course, advertisers could avail themselves of aural CSS, but right now
they can embed a voice recording in a Flash ad, targeting a wider audience...



the amount of effort required to invest in a feature which is
generally not useful far exceeds any value offered to the user by the
agent.

in the case of accesskeys, a much more useful research area is to
develop a browser-global way of accessing content which works well
for the device/useragent/user. enabling each site to design its own
poor access keys is much less useful than letting browsers be
configured by their users or share keybindings from one browser to
another for a given site.
  


Accesskeys are an attempt to reproduce offline applications' shortcuts
in web pages; maybe they're not the best mechanism, since they suffer
from a double dependence, on the browser settings and on the underlying
operating system settings. So, even if you could make modifiers
consistent from one browser to another on the same platform, you
couldn't easily do the same across platforms. Yet the overall mechanism
might be improved. One way to achieve this could be using key events to
create a richer environment, and (in any case) establishing a generic
reference to a browser's default modifiers (the ones provided for
accesskeys) could be useful for cross-browser, cross-platform
consistency (perhaps even for your purposes).


As for the rest, do you want to develop a browser (or want one to be
developed) with the option to bypass default accesskeys at the user's
will? That's possible even without direct support in HTML: just add
this option and make it work like any other browser customization, then
store the user's choices somewhere in some kind of profile. Do you want
generic commands, like mouse-key combinations, or mouse gestures too?
That's the same: all there is to do is intercept the command before the
document does and translate it into an activation of the desired control
(i.e., by triggering the activation behavior as defined in section
3.4.1.7 Interactive content). Do you wish to exchange the profile
between browsers? Well, perhaps you're asking for a common profile
format and a local storage shared among browsers... uhm, no, that would
fail cross-platform (i.e. when using a different OS or a different
computer), so another solution should be found... perhaps storing users'
webapp profiles on a remote server could solve that, and new services
could spring up around such a possibility, but perhaps that could lead
to some security concerns, and perhaps, at the moment, it would be
better to leave to each site's developers the choice of default
keybindings and the implementation of a mechanism to let the user
customize the site or webapp, storing the user's choices either locally
or remotely on the site's or webapp's server. Maybe a future version of
HTML could endorse support for this (if it isn't convenient at the
moment). Perhaps an element could have an "activators" attribute holding
the id of an <activatorset> element, which could be something like,


<activatorset id="foo">

   <sequence type="keys">
      <key type="identifier" state="down" value="a_unicode_value_here" />
      <key type="modifier" state="down" value="&defaultModifiers;" />
      <!-- a state of 'down' stands for simultaneous pressing, regardless
           of the order in which items are declared, and pressed keys can
           be released in any order, but after the first release
           (state="up"), any other press must happen after the released
           key - or mouse button -->
      <key type="special" state="down" value="&arrow_left;" />
      <key type="any" state="up" />
      <!-- when every declared item is released, the command fires and
           synthetic click activation steps are run, perhaps adding an
           activation event carrying the proper sequence of keys and
           mouse actions, so the developer can choose among handling the
           (synthetic) click, the activation event or the DOMActivate
           event -->
   </sequence><!-- the first activation sequence is defined, any other
                   is an alternative way to activate a control -->


   <sequence type="mouse">
      <mousebutton type="left" state="down" />
      <mousebutton type="right" state="down" />
      <!-- here state has the same meaning as above: users can press
           them simultaneously, or one first, then the other while still
           holding the first down -->
      <mousebehaviour type="move-right" />
      <mousebehaviour type="wheel" value="-3" />
      <!-- mouse behaviours should happen in order, however the 

Re: [whatwg] Citing multiple blockquote elements in HTML5

2008-11-30 Thread Calogero Alex Baldacchino

Ian Hickson wrote:

On Sun, 30 Nov 2008, Calogero Alex Baldacchino wrote:
  
I'm not sure I understand the whole function of the <cite> element,
and perhaps I'm harping on ids and references again, but the
relationship between a <cite> and a quotation could be disambiguated by
pairing an id with a reference to that id.



Why is the ambiguity a problem?
  


Well, it depends on the uses the <cite> element is targeted at. If the
'only' purpose (and that can be enough) is to provide the semantics of a
citation in a media-independent manner, and more strongly than a
'general purpose italic' can, but regardless of the actual subject taken
from the cited source (which finds proper, independent semantics in the
<blockquote> and <q> elements), the ambiguity shouldn't be a problem:
the end (human) user consuming the document should be able to correctly
relate the cited source to the quoted subject just by inferring it from
the surrounding prose, unless that text were really unintelligible (but
even in this case, disambiguation would be out of the <cite> element's
scope, given the above semantics). Otherwise, if there were any good
reason to explicitly relate the source to the subject, or vice versa,
i.e. to make the relationship intelligible to a user agent (perhaps a
bot that parses a series of articles and groups into one document all
content taken from the same source? - surely there must be better ways
to accomplish that, but perhaps it could make sense for some purpose),
then the ambiguity concern might be addressed by means of a well-defined
relationship in terms of HTML semantics. I just tried to suggest a
solution to a concern I thought you and Sam Kuper were discussing for
some reason, since there is no way to correctly define such a
relationship in terms of relative positions, as you pointed out.



Q: What problem does it solve?
Uhm... perhaps a first <cite> could be a complete reference, i.e. a book name
along with its author, publisher and ISBN code, or a reference to another
site/blog, its author and a link to the page with the quoted text. Ok, and
now? Let's say any other reference to the same source could be shorter and
need no markup, except for styling, while the quotation block could remind
the reader of the whole source: for instance, when the user moves the
pointer over it, or focuses the blockquote, a tooltip could present the
citation content - as if it were the content of the title attribute - and a
screen reader could speak it aloud after the quotation, and if the referred
<cite> contained a link, a click on the blockquote content (or any other kind
of activation) could open the linked page in another tab/window.



That describes how it could be used, but _why_? Is there an actual problem 
that isn't solved today that needs solving?
  


Well, in fact, the above could also be done by 'playing' with anchors
(but is it still possible to set an anchor somewhere in the document,
such as <a id="foo" />? I haven't found examples of that, perhaps I'm
missing something...). Anyway, maybe using a <cite> element as an
anchor, and a related <blockquote> or <q> as the means to reach it,
might convey stronger semantics than generic anchors and hyperlinks
(maybe suitable for an article holding extended - complete - citations
as references in a dedicated section, separate from the rest of its
content), perhaps in a similar fashion to how a <dfn> element can be
referenced through its id by an <a> element.


Best regards,
Alex.




Re: [whatwg] Deprecating <small> , <b> ?

2008-11-29 Thread Calogero Alex Baldacchino

Tab Atkins Jr. wrote:

On Wed, Nov 26, 2008 at 4:48 PM, Calogero Alex Baldacchino
[EMAIL PROTECTED] wrote:
  


[cut]


We don't have to touch parsing at all to accomplish essentially this.  The
issue you're worried about is getting crazy semantics applied to
individual letters.  Semantic parsers (which honestly the average
browser is *not*) can easily just ignore the semantic value of <b> or
<small> or <i> when they don't wrap a full word, assuming that the use
is either stylistic or too complex/subtle to easily capture.

  


Well, that is a 'semantic' solution, equivalent to leaving everything to
the implementation; a 'parsing' solution would solve the 'problem' at
its root, but I acknowledge the question is too marginal to be taken
seriously into account.


   


Agree to disagree, I guess.  I don't find "We hope you'll find <b>Product
A</b> to be the best laundry detergent you've ever used!" to be denoting
emphasis or importance, really.
  

I think 'Product A' is the core of the message, the thing some people are
trying to sell you, the name you *must* remember when you want to buy a
laundry detergent, so that those people become rich. The bold presentation
aims to capture your attention and keep your eyes on it a bit longer; in a
TV/radio spot the name of the product would be spoken with some isolation,
with at least a bit of emphasis, for the same reasons. It denotes
importance, meaning you need to pay special attention to it in order to
understand *what the author wants you to understand*. I think that the same
semantics can be expressed by <strong>, since the importance of a piece of
text is not (only) in its meaning, or in the overall meaning of the message,
or in one's way of taking it as important or not, but (also, or mainly) in
the author's intention to mark it as different from the rest of the content,
as a reading key, to drive your attention and your thoughts as well (ok,
that's like saying that truth is a chimera, but such can be a crude truth
:-P ).



If I was contrasting Product A with another item, I could perhaps
agree.  But we're not, so I don't.  ^_^  However, we're obviously
splitting hairs here.
  


But you're implicitly contrasting 'Product A' with a bunch of generic
items, all the items of the same category your potential buyer might
happen to know (I think this needs some clarification with an example; I
guess you were referring to comparative advertising, which has not been
legal in Italy for several years, so what you wrote - with some
make-up - has always been one of the most common forms of advertisement
here). Anyway, I agree we're splitting hairs, but there's a reason for
me to push these concepts, and I hope I'll be able to make it clear.


  

Well, a foreign-language word, especially if correctly pronounced (by someone
else), can be more or less hard to 'catch', so a bit of emphasis in its
pronunciation might help the listener to correctly distinguish the sounds.



That's stretching quite a bit more than I think is appropriate.  Just
because I use a foreign phrase, does not mean that I'm emphasizing it.
 If I, in audible speech, would put a bit of inflection on the phrase,
that still doesn't mean I'm emphasizing it in anything like the way I
emphasize "I'm <em>not</em> going to the dance with you!".
  


Isn't a bit of inflection also a bit of emphasis in pronunciation?
Perhaps I'm misusing the English term; it sounded correct to me in a
wider sense, but I'll drop that concept, or modify it...



In other words, at most I might slightly stylistically offset the
phrase from my surrounding spoken words, but I wouldn't be
*emphasizing* them.  So the <i> semantics are correct here.  ^_^

  

After all, most of the time bold and italicized text (tries to) reflect our
way of pronouncing sentences, with more or less isolation, more or less
emphasis, quicker or slower, thereby changing their meaning and telling the
listener that some part requires greater or lesser attention, that it is
somehow 'special', with somewhat different grades of 'speciality'. From this
point of view, I think either <b>/<i> can be semantically the very same
thing as <strong>/<em>, or their semantics should be redefined so as to
indicate a different (and lower) grade of 'speciality' on the same
speciality scale, but not a different kind of 'speciality' (i.e., <b>-text
stands out for some - opaque - reason which has nothing in common with
<strong>-text).



You're overreaching your definition of importance and emphasis.  I
don't think it's valuable to denote *everything* that is in some way
special as important or emphatic - you lose a sense of scale.  If you
wish to define the words as such, then sure, <b> and <i> are lesser
grades of importance and emphasis by definition.  By more conventional
definitions, though, they're not, and their stated semantics are fine.

  


Ok, let's define 'special' in a more correct manner. What should a
slight offset be? What does 'outstanding for some reason' mean, in a
less ambiguous definition? How should the offset

Re: [whatwg] Fallback styles for legacy user agents [was: Re: Deprecating <small> , <b> ?]

2008-11-28 Thread Calogero Alex Baldacchino

Benjamin Hawkes-Lewis wrote:

Calogero Alex Baldacchino wrote:
That worked fine on Opera 9 and FF2, but, when tried on IE7, the show
became a little weird... the element was there, the style attribute
was honoured as for any other element (display:block worked), but it
didn't apply to any of its descendants, as if they weren't its
descendants... setting 'display:inline' didn't change much, except that
a line break disappeared; setting 'display:none' didn't make any
descendant disappear... Why?


Note that display values cascade, but do not inherit:

http://www.w3.org/TR/CSS21/visuren.html#propdef-display

http://www.w3.org/TR/CSS21/cascade.html#inheritance



From the first link:

none
This value causes an element to generate *no* boxes in the 
formatting structure (i.e., the element has no effect on layout). 
Descendant elements do not generate any boxes either; this behavior 
*cannot* be overridden by setting the 'display' property on the descendants.


Basically, an element (with 'normal' positioning, at least) should
create its own box inside its parent's box, but if the parent's box
doesn't exist, the child cannot have a box either, so there is no need
to make the display value inheritable in order to make descendant
elements 'disappear' from the formatting structure. Inheritance,
instead, could cause problems (unwanted behaviours) for floating
elements and elements positioned outside the normal flow, so 'inherit'
couldn't be the default value (this is clarified in
http://www.w3.org/TR/CSS21/visuren.html#dis-pos-flo : "If 'display' has
the value 'none', then 'position' and 'float' do not apply. In this
case, the element generates no box.").
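
To illustrate why 'display' need not be inherited for descendants to
vanish, here is a toy model in Python. It is only a sketch of the rule
quoted from CSS 2.1 above, not of any real layout engine; the Element
class and generated_boxes function are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class Element:
    name: str
    # 'inline' is the CSS 2.1 initial value; the value is NOT inherited,
    # so each element keeps its own display unless a rule sets it.
    display: str = "inline"
    children: list = field(default_factory=list)

def generated_boxes(element):
    """Return the names of the elements that generate a box, in tree order."""
    if element.display == "none":
        # No box for this element, hence no parent box for any descendant,
        # whatever display value the descendants themselves carry.
        return []
    boxes = [element.name]
    for child in element.children:
        boxes.extend(generated_boxes(child))
    return boxes

tree = Element("div", "block", [
    Element("p", "none", [Element("em", "inline")]),
    Element("span"),
])
print(generated_boxes(tree))  # prints ['div', 'span']
```

The em element still has display 'inline' in this model, yet it generates
no box, because box generation stops at its 'display:none' parent.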


The 'problem' with IE is the way it treats an unknown element, which
cannot have children, so cascading and inheritance fail. This leads to
the need for a scripting solution along with fallback styles, and perhaps
compromises the usefulness of a foundation style sheet for legacy user
agents (at least, it wouldn't work alone). Still, a uniform default
layout for visual user agents could be desirable.



Perhaps, if a foundation default aural sheet had been provided from
its early standard definition, assistive add-ons could have chosen to
support aural CSS, since the base would have been good and all they
would have had to do was treat values as relative ones, adjusting them
according to their usability studies...


Well, there was at least:

http://www.w3.org/TR/CSS2/sample.html

--
Benjamin Hawkes-Lewis


You see, I don't agree with the reasons behind developers' choice to
ignore aural CSS, because granting the user ( = the listener) or the
software ( = the screen reader) exclusive, full control over speech
constraints cannot be the best way to make the spoken message more
understandable: the author of the (written) message is the only one who
really knows its real meaning, and since we understand a spoken message
by the way it's... spoken, no one can know how to render a message's
meaning aurally better than its author. I guess a non-expert author
could have made everything unintelligible, but I think a good
opportunity has been underestimated from several points of view... For
instance, widespread aural support could have led to the integration of
speech engines into authoring tools, perhaps the same ones used by
screen readers (especially in commercial authoring tools); maybe the
tool could have recorded the author reading a page, compared it with the
way the speech engine read the same page, and suggested correct settings
for pitch, speed and volume, averaging between the author's reading and
the engine's usability constraints... But I know (and agree with
Pentasis) that any aural-style related subject is a marginal discussion
in the scope of HTML.


BR, Alex




Re: [whatwg] accesskey attribute with display:none elements

2008-11-28 Thread Calogero Alex Baldacchino

Olli Pettay wrote:

On 11/27/2008 06:52 PM, Calogero Alex Baldacchino wrote:

Perhaps a *good* rationale could be, if you can't see the control,

There are other modalities than just visual.



Indeed, and the display property applies to each and every one of them 
in the very same way. From http://www.w3.org/TR/CSS21/visuren.html#propdef-display


'display'
Value:        inline | block | list-item | run-in | inline-block | table |
              inline-table | table-row-group | table-header-group |
              table-footer-group | table-row | table-column-group |
              table-column | table-cell | table-caption | none | inherit
Initial:      inline
Applies to:   all elements
Inherited:    no
Percentages:  N/A
*Media:       all*

If you care about any media having trouble with 'display:none' (as might 
be the case for a visual browser + a screen reader), you have to change 
the value for that media. But if one can afford to write different style 
sheets for different media, one can also afford to avoid 'display:none' 
altogether when it comes to interaction, and instead emulate it by 
setting a bunch of other properties so that the element occupies 1px or 
so, without heavily affecting the overall visual layout, and without 
problems with non-visual media (there is another possibility, though it 
works only with CSS 2 compliant browsers: a menu can be absolutely 
positioned, or floated, with a z-index telling whether it's in front of 
or behind another element, which in turn can be opaque, so switching the 
z-index could work as well as switching the display property).



  So, I stand up for

standardizing the disallow accesskey activation for 'display:none'
elements behaviour.

So you're willing to break accesskeys on some websites.



HTML *5* is the next evolution of HTML, which means it's almost a new 
language looking backward with one eye and forward with the other, 
carrying on something from the past and throwing away something else, 
finding some compromises for the transition phase. I think that hiding 
something from the user (whatever the presentation modality), as if 
it weren't in the document at all ('display:none' has stronger 
semantics than just being hidden, invisible, behind something, and so 
on), but expecting the user to interact with it, is not the best 
possible practice, and since, as far as I remember, there have never 
been guarantees that accesskeys work well, a break with old, 
non-standard behaviours would not be fatal. But, however...



Note, I'm not very strongly supporting accesskeys on display:none elements,
but breaking existing web sites doesn't sound good.

-Olli



but the question could be another. The new behaviour of FF3 breaks 
compatibility with existing HTML *4* (or XHTML) sites, without FF3 being 
an HTML *5* *only* browser (perhaps, at some point in the future, HTML 5 
could become the 'older' backward-compatibility basis, just as today's 
browsers provide older features, e.g. document.all or document.layers, 
along with newer DOM features), so that break, though not in 
conflict with any standard, could be deemed a kind of bug. My point now 
is: let's state that *HTML 5* elements cannot be activated through 
accesskeys when they have a display property of 'none', but user agents 
are left free (after all, that's never been a standard) to activate 
non-HTML 5 elements with the property 'display:none' for backward 
compatibility. That should mean the old, non-standard behaviour could be 
turned on for existing websites just by adding a DTD reference to the 
doctype declaration. Does that sound acceptable to you?
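
To make the proposed rule a bit more concrete, here is a minimal sketch of 
the decision a user agent might take. Everything in it is hypothetical: 
mayActivateByAccessKey, its parameters and the allowLegacy switch are 
illustrative names of mine, not part of any spec or browser API.

```javascript
// Hypothetical helper: decides whether an accesskey may activate an element.
// `computedDisplay` is the element's computed 'display' value;
// `isHtml5Document` says whether the page is treated as HTML 5 (assumed input).
function mayActivateByAccessKey(computedDisplay, isHtml5Document, allowLegacy = true) {
  if (computedDisplay !== 'none') {
    return true; // displayed elements are always eligible
  }
  if (isHtml5Document) {
    return false; // proposed rule: 'display:none' elements never activate
  }
  return allowLegacy; // legacy documents: left to the user agent's choice
}
```

With this shape, the backward-compatible behaviour is just the user agent's 
choice for legacy documents, and an HTML 5 document can never opt in to it.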

Regards, Alex.


--

P.S. I keep this separate because it's off topic, but I'd really consider 
something like,


interface HTMLKeyboardEvent {
  readonly attribute boolean activationModifiers;
};

independent from generic DOM keyboard events, yet easily bindable to 
them and quite safe from changes in the DOM 3 Events Working Draft.





Re: [whatwg] accesskey attribute with display:none elements

2008-11-26 Thread Calogero Alex Baldacchino

Olli Pettay wrote:

On 11/26/2008 02:34 AM, Calogero Alex Baldacchino wrote:

A better way to do what you aim for would consist of setting a listener
for key events on a displayable element and choosing a different
operation based on the pressed key(s);

This is not a content-author-friendly way to do it, because different 
browsers/OSes use different keys to activate accesskey targets.

-Olli


On one hand, whoever wishes to write a 'complex' web application with 
keyboard shortcuts should be aware of this concern and try to deal with 
it, since it's even more complex than you wrote, and might affect 
accesskey attributes too. The problem is, keyboard shortcuts have always 
been strongly platform-dependent, while the web is meant to be a somewhat 
cross-platform architecture, so neither accesskeys nor key event 
handling can be 'fully author friendly'. I mean, some browser, on 
some OS, might use the same combination of modifiers (e.g. ctrl+alt 
- just an example, not a real situation) to activate both 
its own controls (which take precedence) and a web page's controls, so 
there is always a chance of choosing an accesskey which won't work on a 
particular platform. Perhaps it was a heavier concern a few (or even 
several) years ago, since nowadays I think most browsers take great care 
over this matter; however, such a problem might arise from time to time, 
e.g. with a new browser version, or a version supporting a new OS (or a 
new OS version), or when using an old browser version apparently 
compatible with a new OS version (but this should be a concern more for 
browser/OS developers than for web developers, since the latter can only 
assume the underlying platform - browser + OS - works correctly, and 
cannot take care of any 'bug' outside their work; yet access keys, 
however they are handled, cannot be thought of as a 'fully and always' 
reliable mechanism, while mouse clicks and tab-key navigation plus 
return-key activation usually are). Perhaps keyboard shortcuts may work 
better in a 'make application state'.


I agree that setting an accesskey attribute is easier to deal with than 
handling key events, and the 'no-dimension, display:inline elements 
trick' is always available; anyway, I think key event handling could 
be improved and become easier to adopt by adding to some interface 
a few constants representing the modifier combination used by the 
browser to activate access keys, so those modifiers could be compared to 
the modifiers 'carried' by the key event (this would require support 
for DOM 3 Events, which I think could be improved/modified too -- if 
something like the above is already present in the HTML5 spec and I've 
missed it, I apologize).
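
A rough sketch of the comparison described above. Note the assumptions: 
ACCESS_KEY_MODIFIERS is an invented constant (no browser exposes such a 
thing, and the real combination differs between browsers and OSes), and 
hasAccessKeyModifiers is an illustrative name. Only the altKey, shiftKey, 
ctrlKey and metaKey flags are taken from the DOM key event itself.

```javascript
// Assumed constant: the modifier combination a browser uses for access keys.
// The value here is purely illustrative.
const ACCESS_KEY_MODIFIERS = { altKey: true, shiftKey: true, ctrlKey: false, metaKey: false };

// Returns true when the event carries exactly the given modifier combination.
function hasAccessKeyModifiers(event, modifiers = ACCESS_KEY_MODIFIERS) {
  return ['altKey', 'shiftKey', 'ctrlKey', 'metaKey']
    .every((name) => Boolean(event[name]) === Boolean(modifiers[name]));
}
```

A key event listener could then call hasAccessKeyModifiers(event) before 
dispatching on the pressed key, instead of hard-coding one platform's 
combination.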


On the other hand, the key event listener could just operate on single 
alphanumeric characters, something like (JavaScript-style)


switch (pressedKey) {
  case 'a': doSomething(); return;
  case 'b': doSomethingElse(); return;
}

so as to bypass any modifier concerns, with some extra care to avoid 
interference with text fields (and to guard against accidental key 
presses by the user - e.g. the very first time a listened-for key is 
pressed, the webapp could just show a notice and list all valid 
shortcuts). Anyway, even in this case there would be a chance of clashing 
with a browser's default behaviour for some keys (e.g. when the key is a 
digit).
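
A slightly fuller sketch of such a listener, under a few assumptions of 
mine: createShortcutHandler is a hypothetical name, the handler is fed 
event-like objects with the usual key, target and modifier fields, and 
skipping events that carry modifiers is just one possible way to stay out 
of the browser's own shortcuts.

```javascript
// Builds a handler mapping single characters to actions.
// Returns true when an action ran, false when the key was ignored.
function createShortcutHandler(actions) {
  const textFields = new Set(['INPUT', 'TEXTAREA', 'SELECT']);
  return function onKey(event) {
    const tag = event.target && event.target.tagName;
    if (textFields.has(tag)) return false; // let the user type normally
    if (event.altKey || event.ctrlKey || event.metaKey) return false; // leave modified keys alone
    if (!Object.prototype.hasOwnProperty.call(actions, event.key)) return false;
    actions[event.key]();
    return true;
  };
}
```

In a page, the returned function could be registered with something like 
document.addEventListener('keydown', handler); the "show a notice the first 
time" idea would live inside the individual actions.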





Re: [whatwg] Fallback styles for legacy user agents [was: Re: Deprecating small , b ?]

2008-11-26 Thread Calogero Alex Baldacchino

Benjamin Hawkes-Lewis wrote:

Calogero Alex Baldacchino wrote:

I know, and agree with the basic reasons; however I think that 
deriving an SGML version (i.e. by adding new entities and elements, as 
needed, to an HTML 4 DTD) should not be very difficult, and could be 
worth the effort (e.g. to gracefully degrade the presentation of a menu 
element meant as a context menu, whose content should not be shown 
until a right click happens - if the u.a. cannot handle it, not 
showing it at all could be a reasonable behaviour). The derived SGML 
version should be aimed just at older browsers, while newer, HTML 
5-aware ones should just ignore any DTD reference. I'd consider this 
possibility, at least in passing - I suspect that the complete break 
with the earlier SGML specifications might bring an undesirable 
side-effect: on one side it solves the problems arising from partial 
SGML support/bad implementation and from browser-specific quirks, 
but on the other side no mechanism is provided to give 
SGML-somehow-based user agents any awareness of the newly 
defined elements.


What SGML-somehow-based user agents? While many web browsers switch 
behavior based on what they detect in the first characters of an HTML 
document (including the doctype declaration), there are no (or at any 
rate, no remotely /popular/ web browsers) that read text/html DTDs in 
the way required for this idea to be workable.


Since all you're proposing is to bake implied STYLE values into the DTD, 
it seems to me your use-case could be served by making an HTML5 
foundation stylesheet publicly available.


Compare:

http://meyerweb.com/eric/thoughts/2007/05/01/reset-reloaded/

http://developer.yahoo.com/yui/base/

--
Benjamin Hawkes-Lewis


Oh, I thought (and hoped) that some basic support was provided... I 
see I was wrong...


The foundation style sheet may be at least a partial solution, but if 
the browser is not aware of an element, I guess its style could not 
be applied at all. Anyway, a standard default style sheet could be 
desirable both to have a standard basic layout on all browsers (as far 
as possible, given possible differences in CSS compliance) and as a 
potential aid for assistive UAs, since the default sheet could cover a 
few basic aural properties.

Regards,
Alex




Re: [whatwg] Solving the login/logout problem in HTML

2008-11-26 Thread Calogero Alex Baldacchino

Martin Atkins wrote:
 Asbjørn Ulsberg wrote:

  [Request 1]

  GET /administration/ HTTP/1.1


  [Response 1]

  HTTP/1.1 401 Unauthorized
  WWW-Authenticate: HTML realm=Administration

  <!DOCTYPE html>
  <html>
    <form action=/login>
      <input name=username>
      <input type=password name=password>
      <input type=submit>
    </form>
  </html>


  [Request 2]

  POST /login HTTP/1.1

  username=admin&password=secret


  [Response 2]

  HTTP/1.1 302 Found
  Authorization: HTML QWxhZGRpbjpvcGVuIHNlc2FtZQ== realm=Administration
  Location: /administration/


  [Request 3]

  GET /administration/ HTTP/1.1
  Authorization: HTML QWxhZGRpbjpvcGVuIHNlc2FtZQ== realm=Administration

  [Response 3]

  HTTP/1.1 200 OK

  <!DOCTYPE html>
  <html>
    ...
    <h1>Welcome!</h1>
  </html>

 The twist here is that it is up to the server to provide the 
authentication token and through the 'Authorization' header, give the 
client a way to authorize future requests.


 Your auth token here seems to me to be equivalent to a session cookie.

 If you change the Authorization header in Response 2 to 
Set-Cookie (and make some syntactic adjustments) then this doesn't 
require any changes to how deployed apps handle sessions today.



Perhaps that token was meant as a cross-session one, surviving until an 
explicit logout





Re: [whatwg] Deprecating small , b ?

2008-11-26 Thread Calogero Alex Baldacchino

Tab Atkins Jr. wrote:
On Tue, Nov 25, 2008 at 3:08 PM, Calogero Alex Baldacchino 
[EMAIL PROTECTED] wrote:


Tab Atkins Jr. wrote:



On Tue, Nov 25, 2008 at 10:24 AM, Calogero Alex Baldacchino
[EMAIL PROTECTED] wrote:








Do you mean that if you had markup like <p><b>W</b>hen I was 
young...</p>, it would be read out as "I was young..."?  If so, that's 
clearly a bug in the reader, and has nothing to do with semantics or the 
lack of it.  There is *no* legitimate interpretation of that markup that 
would lead one to discard the first word.
 


I agree that reading software unable to understand some text with 
unexpected typographic variants should read it as normal text; however, 
I guess the above shows how an unexpected situation can arise when 
looking for non-typographic semantics.





Basically, there is a subset of authors who are morons, and they'll 
screw up anything we do.  Most of us aren't like that, but trying to 
design around that subset is a game you can't win.  Their pages will be 
FUBAR no matter what we do, until browsers' rendering engines are 
literally hooked up to a sentient semantic parser.


Arghh!! Such software would be too smart and would dominate the world... 
It could think: "morons are a bother; human beings generate morons; 
no more human beings means no more bother for me"


 


Ah, the default style could be slightly or very different from the
small one, e.g. the text could be surrounded by parentheses or
hyphens, regardless of the font size (and the new elements could be
designed so as to accept just non-empty strings consisting of more
than one non-spacing character).


We could, but is there any reason to have it do that?  Making the text 
small is a good visual representation of the small print or aside 
semantics.




The concept was (or could be - let me modify it): the more alternative 
visual representations of the aside semantic element we provide, the 
more likely a moron designer is to stay away from it, since he could be 
confused about the style he's creating. Likewise, a rule that "there is 
no default style for the element" could prevent authoring tools from 
just renaming a button used to style some text. But I know, it 
would all fail because the most popular browsers would choose very 
similar renderings (or would they just keep rendering small fonts?).


Anyway, I wouldn't underestimate the latter characteristic (ok, that 
wasn't clear), that is, establishing that the use of the element is legal 
only if it surrounds a piece of text made up of one or more whole words 
(or at least one readable character) and if it's bounded by spacing or 
punctuation characters (that is, the 'semantic element' cannot be part 
of a word). Of course, the misuse concern would just move from the 
messed-up word to a messed-up sentence, but at least, in this case, an 
assistive reader would be less likely to be fouled up and, without any 
need for luck, it could speak out something funny, yet understandable. Of 
course the same could be done by redefining the b and small parsing 
rules, but that would result in a break with a bunch of (possible) legacy 
uses, and if we have to break somehow with the past, why not 
look for some more significant names? - Just saying, not hoping to 
persuade you :-P




Here it's me who doesn't understand. I think that any reason to offset
some text from the surrounding one can be reduced to the different
degree of 'importance' the author gives it, in the same sense as
Smylers used in his mails (that is, not the importance of the
content, but the relevance it gets as a focus of attention - he gave the
example of the English "small print" idiom, and in another mail
clarified that "It's less important in the sense that it isn't the
point of what the author wants users to have conveyed to them; it's
less important to the message. (Of course, to users any caveats in
the small print may be very important indeed!)"). From this point of
view, unless we aimed to use b as an intermediate degree of
relevance between 'normal text' and 'em/strong' (but aren't these
enough to attract a reader's attention?), redefining its semantics
might be redundant and of little utility. (In my crazy mind, this
applies to headings too, since a 'good' heading focuses
attention on the core subject of its following section, so it has to
be marked as an important slice of text). Furthermore, I meant
that strong and em would have been a better choice than b in
Smylers' examples because their *original semantics* is very close
to that of a more relevant text/a text needing greater
attention, while b's *original semantics* is very different and
needs to be redefined for this purpose (but we have still got
possible alternatives

Re: [whatwg] Feeedback on dfn, abbr, and other elements related to cross-references

2008-11-26 Thread Calogero Alex Baldacchino

Ian Hickson wrote:

On Thu, 27 Nov 2008, Calogero Alex Baldacchino wrote:
Perhaps a silly idea: what if abbreviations could work like an img-map 
pair? That is, an abbr without a title could use a, let's 
say, 'ref' attribute indicating the id of a previous abbr element with 
a title, and the former could be 'self-closing' (i.e. 
<abbr ref=#foo />), so by default the UA would substitute it with the 
referenced element's content (the unexpanded abbreviation), and, at the 
user's will (when he/she clicks on the abbreviation, or just rests the 
pointer on it, or navigates to the abbreviation, or according to any 
setting in the browser options) the abbreviation is expanded. (I guess 
the above won't be accepted because of backward compatibility, though)


What problem would this solve? It's not like including the abbreviation 
each time is a great burden.




Right, I withdraw this suggestion. Or perhaps it could work in 
conjunction with dfn as an explicit reference to the definition and a 
shorter alternative to the use of an a element. The same rationale can 
apply to discard this too, however... if the id were put on the abbr 
inside the dfn, instead of setting the dfn id, perhaps it could be a 
reference slightly less prone to mistakes (the more I write, the more 
likely I am to press a wrong key and mess everything up - I'm for 
short syntax when possible/meaningful, but this might not be the case, 
of course). I.e., instead of writing:


<a href=#gdo><abbr title="Garage Door Opener">GDO</abbr></a>

one could just write:

<abbr ref=#gdo />

so that, by default, the user agent automatically inserts the text 
'GDO'; when the user leaves the mouse pointer over it, or focuses it 
otherwise, the text is expanded; and when the user clicks or activates 
it (e.g. by pressing 'return'), the definition is recalled. '#gdo' could 
be the id of the abbreviation inside the dfn element, but it could also 
be the dfn id itself, so both ways (using the a element or the 
abbr one) might exist as alternatives working in a consistent manner 
for different uses (e.g. the a-fashion would be suitable if the 
defined term were plain text or the title of the dfn element itself).
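
Just to illustrate what a user agent would have to do with such a 
reference, here is a small sketch. It is entirely hypothetical: the 'ref' 
attribute does not exist, resolveAbbrRef is an invented name, and the 
definitions object (id mapped to text and title) only stands in for a real 
DOM lookup.

```javascript
// Resolves a hypothetical abbr 'ref' value against already-seen abbreviations.
// `definitions` maps an element id to { text, title } (assumed data shape).
function resolveAbbrRef(ref, definitions) {
  const id = ref.startsWith('#') ? ref.slice(1) : ref;
  if (!Object.prototype.hasOwnProperty.call(definitions, id)) {
    return null; // dangling reference: nothing to substitute
  }
  const def = definitions[id];
  return {
    text: def.text,       // inserted by default (the unexpanded abbreviation)
    expansion: def.title, // shown on hover or focus
    href: '#' + id,       // followed on click or activation
  };
}
```

For the example above, resolving '#gdo' against { gdo: { text: 'GDO', 
title: 'Garage Door Opener' } } would give the text to insert, its 
expansion, and the target of the link.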


Maybe still useless, if so disregard it (and forgive my insistence). 
Best Regards,

Alex.




Re: [whatwg] Deprecating small , b ?

2008-11-25 Thread Calogero Alex Baldacchino

Smylers wrote:

Asbjørn Ulsberg writes:

  

On Mon, 17 Nov 2008 15:26:22 +0100, Smylers [EMAIL PROTECTED]
wrote:



In printed material users are typically given no out-of-band
information about the semantics of the typesetting.  However,
smaller things are less noticeable, and it's generally accepted that
the author of the document wishes the reader to pay less attention
to them than more prominent things.

That works fine with <small>.
  

No, it doesn't, and you explain why yourself here:



User-agents which can't literally render smaller fonts can choose
alternative mechanisms for denoting lower importance to users.
  


I don't see how that explains why small is an inappropriate tag to use
for things which an author wishes to be less noticeable.
[...]
  


Of course that's possible, but, as you noticed too, only by redefining 
the small semantics, and it is not the best choice per se. That's both 
because the original semantics of the small tag was targeted at 
styling and nothing else (the HTML 4 document type definitions declared 
it as a member of the fontstyle entity, while, for instance, strong 
and em were part of the phrase entity), and because the term 'small', 
at first glance, suggests the idea of a typographical function, 
regardless of any other related concept which might be specific to 
English (or any other) culture, but might not be as immediate 
for non-English developers all around the world. As a consequence, since 
any average developer could just rely on the old semantics, being 
intuitively comfortable with it, the semantics redefinition could meet a 
first counter-indication: think of a word written with alternating 
b and small letters, or just a paragraph's first letter highlighted 
by a b; obviously applying the new semantics here would be 
non-trivial (e.g. assistive software for blind users would be fouled up 
by this and give unpredictable results). Although the previous use case 
would be a misuse of the b and small markup, it would still be 
possible, meaning not prohibited, and so creating a new element with 
proper semantics could be a better choice.


But, you're right, we have to deal with backward compatibility, and 
redefining the small and b semantics can be a good compromise, since 
a new element would face some heavy concerns, mainly related to 
rendering and to state-of-the-art implementations in non-visual user 
agents (and the like).


However, I think that a solution, at least a partial one, can be found 
for the rendering concern (and I'd push for this being done anyway, since 
there are several new elements defined for HTML 5). Most user agents are 
capable of interpreting a DTD to some extent, so it could be worth the 
effort to define an HTML 5 specific DTD in addition to the parsing 
rules - which aim to overcome all problems arising from previous 
DTD-only HTML specifications - so that a non-fully-HTML5-compliant 
browser can somehow interpret any new elements. The HTML 5 doctype 
declaration could accept a DTD just for backward compatibility purposes, 
and any fully compliant user agent would just ignore such a DTD. More 
specifically, such a DTD could define default values for some attributes, 
such as the style attribute (to have any new element properly rendered - 
some assistive technologies are capable of interpreting style sheets 
too), and, anyway, there should be a way, in SGML, to create an alias for 
an element (e.g., a new element - let's call it incidental - could be 
aliased to small for better compatibility).


Let's come to the non-typographical interpretation a present-day u.a. may 
be capable of, as in your example about lynx. This can be a very good 
reason to deem small a very good choice. But are we sure that *every* 
existing user agent can do that? If the answer is yes, we can stop here: 
small is a perfect choice. Better: small is all we need, so let's 
stop bothering each other about this matter. But if the answer is no, we 
have to face a number of user agents needing an update to understand the 
new semantics of the small tag, and so, if the new semantics can be 
assumed to be *surely* reliable only with new/updated u.a.'s (that is, 
with those fully compatible with the HTML 5 specifications), that's 
somewhat like starting from scratch, and consequently there is room for a 
new, more appropriate element.



However, you would appreciate that the author had wished for some
particular words to stand out from the surrounding text.
  

That's a job for the style sheet, whether it's provided by the author
or by the user agent.



The style-sheet can only pick out particular words if those words have
been marked-up as special in the document, so it doesn't solve the
problem of how to mark them up.

Further, this isn't using b because the house style is to have all
text in a bold weight (that can be done by style-sheets, and if the
style-sheet is missing all the content is still there); it's using b
to 

Re: [whatwg] media elements: Relative seeking

2008-11-25 Thread Calogero Alex Baldacchino

Eric Carlson wrote:


On Nov 24, 2008, at 2:21 PM, Calogero Alex Baldacchino wrote:

Well, the length attribute could be an indication of such a limit 
and could accept a generic value, such as 'unknown' (or '0', with the 
same meaning - just to have only numerical values) to indicate an 
endless stream (e.g. real-time IPTV): in such a case, any seeking 
operation could be either prohibited or just limited to the amount of 
already-played content which happens to be present in a local cache.


  It is a mistake to assume that media data is present in the local 
cache after it has been played. Some devices have very limited storage 
(eg. small handhelds) and choose to use a very limited non-persistent 
cache, live streams have essentially unbounded size and can't be 
cached, even downloaded content can be so large that clients on a 
desktop class machine may choose not to buffer the entire file.


eric

Ok, I understand that point was too unclear, and I have to write 
something meaningful, lol!
What I meant was: let's assume the user agent has some buffer big 
enough to hold a part of the already-played content; such a buffer could 
be a portion of the download buffer, or some buffer acting like a 
proper, small cache (or one variable in size from u.a. to u.a. - that's 
it: I called such a loosely defined client-side buffer a 'local cache', 
not meaning that it should be persistent); if this is the case (and the 
user agent - or any codec it uses - should know what part of the stream 
has already been played and is still in the buffer, if any), the user 
agent could allow a backward seek limited to the amount of buffered, 
already-played stream, and also buffer a small portion of the upcoming 
stream, not much, just enough to allow delayed playback 
of the stream (in such a scenario, after a while the buffered content 
would/could be discarded, so the 'local' cache would be a small, 
non-persistent one). Of course all of this could (and perhaps should) be 
implementation dependent; I just aimed (in my crazy mind) to briefly 
sketch such a possibility.
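
In code, the 'seek only within what is still buffered' idea boils down to 
clamping the requested position. This is only a sketch under my own 
assumptions: clampSeek is an invented name, and bufferStart/bufferEnd are 
taken to be the bounds, in seconds of stream time, of whatever the user 
agent still holds.

```javascript
// Clamps a requested seek position to the part of the stream still buffered.
// All three arguments are assumed to be seconds of stream time.
function clampSeek(requested, bufferStart, bufferEnd) {
  if (bufferEnd < bufferStart) {
    throw new RangeError('invalid buffer bounds');
  }
  return Math.min(Math.max(requested, bufferStart), bufferEnd);
}
```

So with a buffer covering seconds 10 to 30 of a live stream, a request for 
second 5 would land on 10, and a request for second 40 on 30; whether to 
clamp or simply refuse the seek would remain an implementation choice.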


However, the above suggests something else to me: some user agent 
could offer the option to record a (portion of a) stream (whether 
it is a 'fixed-size' audio/video or a live stream - for the sake of 
this mail I'm disregarding any DRM related concern), so there could be a 
real persistent cache, which could also work as a 'set-top-box' cache 
allowing delayed playback of a live stream (such a cache could even be 
non-local, provided by a remote server as a web-based service, 
possibly related to the streamed content provider; in such a case, 
the user agent would coordinate with the remote cache, e.g. getting 
information about the cache start point, the current position and the 
overall duration). Perhaps the specifications could sketch out such a 
possibility too, yet leave it to the implementation. Or perhaps that's 
out of the specifications' scope, and I'm just wasting my time and 
yours... if so, I apologize.

Regards.




Re: [whatwg] Deprecating small , b ?

2008-11-25 Thread Calogero Alex Baldacchino

Tab Atkins Jr. wrote:



On Tue, Nov 25, 2008 at 10:24 AM, Calogero Alex Baldacchino 
[EMAIL PROTECTED] wrote:



Of course that's possible, but, as you noticed too, only by
redefining the small semantics, and it is not the best choice per se.
That's both because the original semantics of the small tag was
targeted at styling and nothing else (the HTML 4 document type
definitions declared it as a member of the fontstyle entity,
while, for instance, strong and em were part of the phrase
entity), and because the term 'small', at first glance, suggests
the idea of a typographical function, regardless of any other related
concept which might be specific to English (or any other)
culture, but might not be as immediate for non-English
developers all around the world. As a consequence, since any
average developer could just rely on the old semantics, being
intuitively comfortable with it, the semantics redefinition could
meet a first counter-indication: think of a word written
with alternating b and small letters, or just a paragraph's
first letter highlighted by a b; obviously applying the
new semantics here would be non-trivial (e.g. assistive software
for blind users would be fouled up by this and give unpredictable
results). Although the previous use case would be a misuse of the
b and small markup, it would still be possible, meaning not
prohibited, and so creating a new element with proper semantics
could be a better choice.



No matter *what* we do, if there *is* a default style for an element, 
it will be misused by people.  This is a fact of life.  Defining a new 
element which is identical to small in every way except that it 
hasn't been misused *yet* is thus a mug's game, because it *will* be 
misused in the same way as small, and then we just have two 
identical elements for no reason.


I'll start with an example. A while ago I played around with Opera 
Voice. It seemed to be capable of interpreting visual style sheets and 
specifically font styles, so that bold or italic text (set as such in 
the style sheet, not the markup) was spoken differently from 'normal' 
text, but a paragraph's first letter differing from the rest of the word 
(which is a not uncommon typographical choice), as far as I remember, 
caused the whole word to be skipped. This suggests to me that if we 
really want 'cross-presentation' semantics, we have to keep as far as we 
can from anything having *mainly* typographical semantics (as small and b 
have had from their birth). Every language is somehow prone to 
side-effects caused by misuse (e.g. it is possible to cause a big mess in 
software written in a language that allows passing a pointer to a 
function - there are tons of examples of language design issues - yet 
that could be a desirable capability), but appropriate choices for both 
semantics and syntax may help to reduce the likelihood of misuse.


I think that very likely both b and small will carry on their old 
semantics, thus being more prone to misuse with respect to their new one, 
since very likely a lot of developers are, and will remain, more 
comfortable with their original semantics, which is also suggested by 
their names ('b' standing for 'bold' and 'small'... for something small 
on the screen or on paper). Instead, a new element would require the 
developer to make some effort at least to learn about its existence, so 
he would read that the element's primary use is to indicate a different 
importance of a piece of text, so that a non-visual user agent can 
present it in an appropriate manner, and a visual or print user agent can 
render it in different ways. Ah, the default style could be slightly or 
very different from the small one, e.g. the text could be surrounded by 
parentheses or hyphens, regardless of the font size (and the new elements 
could be designed so as to accept just non-empty strings consisting of 
more than one non-spacing character).




Yes, bad markup will foul up semantic agents.  But people will 
*always* write bad markup.  At least with the semantic redefinition we 
get to declare lots of usages that *are* appropriate to be conforming 
without any effort on the author's part.


And really, the type of people who would write a word with alternating 
letters wrapped in <b> and <small> tags are hardly the kind to even 
*care* about semantics.


Let me reverse this approach: what should an assistive user agent do
with such a <b>M</b><small>E</small><b>S</b><small>S</small>? I think
that treating that word as normal text would be a more graceful
degradation than discarding it. If we clearly state that <b> and
<small> have only typographical semantics, while different elements are
provided to differentiate the degree of emphasis of a phrase, an
assistive user agent could behave better, and any author disregarding
semantics would not cause any trouble.

Re: [whatwg] accesskey attribute with display:none elements

2008-11-25 Thread Calogero Alex Baldacchino

Olli Pettay wrote:

Hi all,

currently it isn't specified anywhere (AFAIK) what should happen
if the element which has an accesskey attribute is hidden using
display:none.

HTML4 says the following:
Pressing an access key assigned to an element gives focus to the 
element. The action that occurs when an element receives focus depends 
on the element. For example, when a user activates a link defined by 
the A element, the user agent generally follows the link...

The problem is that focusing and activating isn't the same thing.

FF2, Safari 3.x and Opera 9.6 can activate display:none accesskey 
targets.

FF3 changed the behavior to require visible and focusable element.
IE7 doesn't seem to activate, only focus (at least <a> elements), and
because hidden element isn't really focusable, it doesn't seem to do
anything with elements with display:none.

A simple testcase https://bugzilla.mozilla.org/attachment.cgi?id=339588

I think allowing hidden elements to be activated is useful for web apps,
especially because there isn't any API to add listeners for accesskey
activation.
(Key event listeners could do something similar, but they'd need to
handle all the different browsers and OSes.)
So I prefer what FF2, Safari and Opera do, and would like to change
FF3.1 to work the same way.

Anyway, I hope some behavior could be standardized.

Comments?

br,

-Olli

Maybe the standard behaviour (for both 'display:none' and
'visibility:hidden') could be to just focus the element (and make it
visible) on the first press of the access key (so the user notices
what's happening before activating any 'control'), and then to activate
the element on a second press.
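
As an illustration only, the two-press idea above could be modelled as a tiny state machine. The `Target` shape and the `pressAccessKey` name are hypothetical (they are not any browser's API), and the sketch deliberately ignores how the original display value would be restored.

```typescript
// Hypothetical model of an accesskey target; this is not a DOM type.
interface Target {
  hidden: boolean;  // true while display:none or visibility:hidden applies
  focused: boolean; // true once the target has received focus
}

type Outcome = "revealed-and-focused" | "activated";

// A press on a hidden or unfocused target only reveals and focuses it.
// A press on a target that is already visible and focused activates it.
function pressAccessKey(t: Target): Outcome {
  if (t.hidden || !t.focused) {
    t.hidden = false;
    t.focused = true;
    return "revealed-and-focused";
  }
  return "activated";
}
```

With this model the user always sees the control before anything is activated, which is the point of the proposal.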






Re: [whatwg] Issues relating to the syntax of dates and times

2008-11-25 Thread Calogero Alex Baldacchino

Lachlan Hunt wrote:

Pentasis wrote:

Ian Hickson wrote:

On Tue, 25 Nov 2008, Pentasis wrote:
The primary use cases for these elements are for marking up publication
dates e.g. in blog entries, and for marking event dates in hCalendar
markup. Thus the DOM APIs are likely to be used as ways to generate
interactive calendar widgets or some such.

I agree with this, so disregard my previous remarks on this subject. I
would however recommend dropping the word primary.


Note that what you've quoted was from a note about a potential issue 
with the DOM APIs which appears in the last formally published WD, but 
which has since been removed from the current editor's draft.



I wouldn't want to make people think their particular use case was
excluded. What if someone wanted to use a date to indicate the time an
entry was added, for instance? Hence the word primary.


This confuses me again ;-) Sorry.  Are you saying that examples and 
use-cases will be excluded from the spec?


No.  It's just that the note didn't list all possible use cases and 
that there are other similar use cases for marking up contemporary 
dates which are equally valid.


In other words, the normative section of the spec will be as generic as
possible, while a non-normative section will cover a bunch of use cases
and examples, without claiming to be exhaustive with regard to all
possible use cases. Am I wrong?






Re: [whatwg] Issues relating to the syntax of dates and times

2008-11-25 Thread Calogero Alex Baldacchino

Ian Hickson wrote:

On Tue, 25 Nov 2008, Calogero Alex Baldacchino wrote:
  
In other words, the normative section of the spec will be as generic as
possible, while a non-normative section will cover a bunch of use cases
and examples, without claiming to be exhaustive with regard to all
possible use cases. Am I wrong?



The examples in the spec are already in the spec in their expected final 
location, so you can pretty much see what we intend to do by looking at 
the spec today.


   http://whatwg.org/html5

There will be more examples in time, but that's the only real planned 
difference of relevance here.


  
But an example is just that: an explanation of a specification rule
that adds nothing to its generic formulation except clarification. It
does not cover all possible scenarios, only the most relevant ones for
clarity's sake, and it might be labeled as non-normative. This is what
I meant (specifically a 'logical' separation, not a 'physical'
relocation, as opposed to discarding use cases altogether). Sorry if I
expressed the concept confusingly.





Re: [whatwg] accesskey attribute with display:none elements

2008-11-25 Thread Calogero Alex Baldacchino

Olli Pettay wrote:

On 11/25/2008 11:17 PM, Calogero Alex Baldacchino wrote:
 Maybe, the standard behaviour (for both 'display:none' and
 'visibility:hidden') could be just focusing (and changing visibility)
 after pressing the access key (so the user notices what's happening
 before activating any 'control'), then activating the element after a
 second press.


That isn't what any of the browsers do currently, so I'm not in favor of
this pretty strange behavior.
And how could the browser know how to change the display value?
From display:none to display:inline or display:block or 
display:inline-block or what?

Maybe I replied too quickly, sorry for that. The user agent should
have a default style sheet with a default display value for each
element, so that value could apply (this may lead to unwanted results
if the element had a different display value before the value 'none'
was set, and that case should be handled by script).

I guess what you wish for is something like a shortcut in a desktop
application, letting you access any control in a menu without showing
and exploring the menu. Although this could be desirable behaviour for
a web application, I think it could also be used to trick the user, or
cause an unwanted operation to be performed as a consequence of an
accidental key press; hence the idea of showing the control before
activating it, giving the user a chance to stop the operation. As for
the possible tricks, I guess they might be a minor concern (since there
are far 'better' ways to compromise your navigation).

Anyway, consider that an element styled with 'display:none' is not part
of the formatting structure for any medium, and you cannot access it in
any way (i.e. you cannot click on it, you cannot reach it by pressing
the tab key). It is not just invisible, it is not presented to you at
all, almost as if it weren't in the document tree, so it is not
focusable at all, and accessing it through an access key would be quite
a tricky way to bypass the above 'restrictions' (as if it were forced
into the document layout).

A better way to do what you are aiming for would be to set a listener
for key events on a displayable element and choose a different
operation based on the pressed key(s). A perhaps tricky alternative
would be to use controls with an accesskey attribute properly set and
to 'emulate' the 'display:none' layout by setting their width, height,
margin, etc. to zero, and their display property to 'inline'. I guess
any browser allowing the behaviour you ask for on elements with
'display:none' is perhaps just working around, as a quirk, a certain
misuse of the display property.
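
To make the key-listener suggestion a little more concrete, here is a minimal sketch. The action names, the `actionFor` function and the choice of Alt as the modifier are all assumptions made for illustration; real access-key modifiers differ between browsers and OSes, which is exactly the burden Olli mentions.

```typescript
// Hypothetical action names; a real page would call its own handlers.
type Action = "save" | "open-menu" | "none";

// Only the event fields used here (KeyboardEvent provides both of them).
interface KeyPress {
  key: string;
  altKey: boolean;
}

// Chooses an operation from the pressed key, so no hidden control is needed.
function actionFor(e: KeyPress): Action {
  if (!e.altKey) return "none"; // assumption: Alt is the chosen modifier
  switch (e.key.toLowerCase()) {
    case "s":
      return "save";
    case "m":
      return "open-menu";
    default:
      return "none";
  }
}

// In a browser this would be wired to a displayable element, for example:
// document.body.addEventListener("keydown", (e) => run(actionFor(e)));
// where `run` is a hypothetical dispatcher supplied by the page.
```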





Re: [whatwg] media elements: Relative seeking

2008-11-24 Thread Calogero Alex Baldacchino
----- Original Message -----
From: Eric Carlson <[EMAIL PROTECTED]>
To: Silvia Pfeiffer <[EMAIL PROTECTED]>
Cc: WHAT Working Group <whatwg@lists.whatwg.org>, Maik Merten <[EMAIL PROTECTED]>
Subject: Re: [whatwg] media elements: Relative seeking
Date: 24/11/08 03:17

> Silvia -
>
> On Nov 23, 2008, at 1:40 PM, Silvia Pfeiffer wrote:
>
>> I don't see addition of a duration attribute as much of a problem. We
>> have width and height for images, and sizes for fonts, too, and web
>> developers have learnt how to deal with these in various entities (px,
>> em, pt). I would not have a problem giving web developers the
>> opportunity to report the real duration of a video in an attribute in
>> either bytes or seconds (might be better called: length), which would
>> allow a renderer to display an accurate timeline. It is help for a
>> display mechanism just as width and height are.
>
> Those attributes are different because they change the presentation
> of the element: image width and height are the rendered width and
> height, font-size controls font rendering size, etc. In order for a
> duration attribute to be equivalent we would need for it to limit the
> amount of the file played (like the now-removed 'end' attribute did).
>

Well, the length attribute could indicate such a limit, and it could
accept a generic value such as 'unknown' (or '0' with the same meaning,
to keep all values numerical) to indicate an endless stream (e.g.
realtime IPTV). In that case, any seek operation could either be
prohibited or be restricted to the already-played content that happens
to be in a local cache.
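
A rough sketch of how such a value might be interpreted follows. The 'length' attribute is only a proposal in this thread, and `parseLength`, `seekableEnd` and the `DeclaredLength` type are names invented here for illustration.

```typescript
// Result of reading the proposed (hypothetical) 'length' attribute.
type DeclaredLength =
  | { kind: "finite"; seconds: number }
  | { kind: "endless" };

// 'unknown' and '0' both mean an endless stream; anything that is not a
// non-negative finite number is rejected (null).
function parseLength(value: string): DeclaredLength | null {
  const v = value.trim().toLowerCase();
  if (v === "unknown") return { kind: "endless" };
  if (v === "") return null;
  const n = Number(v);
  if (!Number.isFinite(n) || n < 0) return null;
  return n === 0 ? { kind: "endless" } : { kind: "finite", seconds: n };
}

// For an endless stream, seeking is limited to what is already cached.
function seekableEnd(len: DeclaredLength, cachedSeconds: number): number {
  return len.kind === "endless" ? cachedSeconds : len.seconds;
}
```

Prohibiting seeking altogether for endless streams would correspond to passing a cached amount of zero.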


>> In case of contradiction between the attribute and the actual decoded
>> length, a renderer can still override the length attribute at the time
>> the real length is known. In case of contradiction between the
>> attribute and the estimated length of a video, the renderer should
>> make a call based on the probability of the estimate being correct.
>
> In the case of a file with video or VBR audio the true duration
> literally isn't actually known until *every* frame has been examined.
>
> When would you have the UA decide to switch from the attribute to
> the real duration?


I guess the UA could rely on an external codec, which could provide
facilities to estimate the real duration. In that case it would be as
easy as delegating the averaging and estimation to the codec, and
getting the real or estimated duration as the result of a callback.


> What would you have the UA do if the user seeks to time 90 seconds when
> attribute says a file is 100 seconds long, but the file actually has a
> duration of 80?
>
> eric


Nothing special, in my opinion. Just update any visual time indicator,
for both total and current time, and convert the previous seek value to
a relative (percentage) one, to normalize the requested position with
respect to the real duration. That is, just switch from absolute to
relative seeking as needed; it shouldn't be so difficult, after all.
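
For what it's worth, the conversion described above amounts to something like the following sketch (the function and parameter names are made up for illustration, and clamping to the 0..1 range is an added assumption):

```typescript
// Converts an absolute seek request made against the declared duration
// into the corresponding position within the actual duration.
function normalizeSeek(
  requestedSeconds: number,
  declaredSeconds: number,
  actualSeconds: number
): number {
  if (declaredSeconds <= 0 || actualSeconds <= 0) return 0;
  // The relative position the user asked for, clamped to 0..1.
  const fraction = Math.min(Math.max(requestedSeconds / declaredSeconds, 0), 1);
  return fraction * actualSeconds;
}
```

In Eric's example (seek to 90 seconds, declared duration 100, real duration 80), the request is 90% of the timeline, so the normalized position is about 72 seconds.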

Regards,


Alex


 



Re: [whatwg] media elements: Relative seeking

2008-11-24 Thread Calogero Alex Baldacchino
----- Original Message -----
From: Maik Merten <[EMAIL PROTECTED]>
To: WHATWG Proposals <whatwg@lists.whatwg.org>
Subject: Re: [whatwg] media elements: Relative seeking
Date: 24/11/08 08:45

> Eric Carlson wrote:
>> QuickTime has used this method since it started supporting VBR
>> mp3 in 2000, and in practice it works quite well. I am sure that there
>> are degenerate cases where the initial estimate is way off, but
>> generally it is accurate enough that it isn't a problem. An initial
>> estimate is more likely to be wrong for a very long file, but each pixel
>> represents a larger amount of time in the time slider with a long
>> duration so changes are less noticeable.
>
> Well, I do believe this works fine for audio (which usually hasn't a
> wildly fluctuating bitrate if you e.g. average over a second or two),
> I'm mostly concerned about video. An example for an outrageously off
> estimate would be the trailer for "Generic space-pirate movie".
>
> The first few seconds would be mostly a static green/red/yellow/whatever
> screen ("This pirate movie has been rated ARR!") - this part would
> be coded with like 100 kbit/s or less. The next few scenes (this is a
> trailer, after all) would mostly show exploding ships, genetically
> engineered mutant parrots attacking space-adventurers and a few cuts
> into random love scenes - so this part can be multi-megabit/s. After
> this the bitrate would dramatically decrease again as the last few
> seconds will just show "Summer 2010".
>
> Does QuickTime also handle such content gracefully (e.g. display a
> position slider that doesn't jump around wildly)? Am I overestimating
> the problem?
>
> Maik


The slider should just indicate a relative position (i.e. a percentage)
between 0 and the (currently known) duration of the content. That
duration may be estimated by averaging over a variable time window,
perhaps with the first estimate delayed at the beginning, and adjusted
as the bitrate varies using some heuristic to make the computation more
accurate (or maybe a few consecutive evaluations, taken at fixed and
short intervals, could be averaged to get a better value before
updating anything), so no crazy horse jumping should happen. Silvia
Pfeiffer has proposed a 'length' attribute to indicate the overall
duration in the markup, and I think its value could help to improve
accuracy, even when wrong.
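
A minimal sketch of the "average a few consecutive evaluations" idea, assuming the UA knows the total byte size and how many bytes and seconds it has decoded so far. All names are hypothetical, and a real estimator would likely be smarter about bitrate changes.

```typescript
// Returns a function that turns successive raw duration estimates into a
// moving average over the last `windowSize` evaluations.
function makeDurationEstimator(windowSize: number = 5) {
  const samples: number[] = [];
  return (totalBytes: number, bytesSeen: number, secondsSeen: number): number => {
    if (bytesSeen > 0) {
      // Raw estimate: scale the decoded time by the share of bytes decoded.
      samples.push((totalBytes / bytesSeen) * secondsSeen);
      if (samples.length > windowSize) samples.shift();
    }
    if (samples.length === 0) return 0;
    return samples.reduce((a, b) => a + b, 0) / samples.length;
  };
}

// Relative slider position (0..1) against the current estimate.
function sliderFraction(currentSeconds: number, estimatedDuration: number): number {
  return estimatedDuration > 0 ? Math.min(currentSeconds / estimatedDuration, 1) : 0;
}
```

Because the slider is driven by the averaged value, a single outlying estimate moves it only by a fraction of the raw jump.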


 