Re: [whatwg] Sandboxing to accommodate user generated content.

2008-06-18 Thread Mikko Rantalainen
Frode Børli wrote:
 I have been reading up on past discussions on sandboxing content, and

 My main arguments for having this feature (in one form or another) in
 the browser are:

 - It is future proof. Changes to browsers (for example adding
 expression support to css) will never again require old sanitizers to
 be updated.

Unless some braindead vendor is going to add a scripting-in-sandboxing
feature, which would be equally braindead to unlimited expression support
in CSS. You cannot be future proof unless you trust all the players,
including ALL possible browser vendors.

 If the sanitiser uses a whitelist based approach that forbids everything by
 default, and then only allows known elements and attributes; and in the case
 of the style attribute, known properties and values that are safe, then that
 would also be the case.
 
 I have written a sanitizer for html and it is very difficult -
 especially since browsers have undocumented bugs in their parsing.
 
 Example: <div colspan=&amp;
 style=font-family&#61;expression&#40;alert&#40&quot;hacked&quot&#41&#41
 colspan=&amp;>Red</div>

Every real sanitizer MUST parse the input and generate its internal DOM.
If you then generate known good serialization of that DOM there's no way
your sanitizer would ever output such code. I, too, have written my own
simplified HTML parser that converts all unknown parts to data (that
is, escapes all of the following characters: '&', '<' and '>'). Just
parse the input into a DOM and only after that check it for safe content.

You cannot sanitize HTML using only string replacements without
generating a DOM (all of DOM is not needed in the memory at once, it's
possible to process the input as a stream and handle one tag at a time
and only keep a stack of open tag names in addition).
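The streaming approach described above can be sketched roughly like this. This is an illustrative, much-simplified sketch, not production code; the whitelist contents, the function names and the attribute handling are all invented for the example:

```javascript
// Stream-style whitelist sanitizer sketch: tokenize, keep only known
// tags/attributes, escape everything else as data, and track a stack of
// open tag names so the output is always well formed.
const ALLOWED = { b: [], i: [], a: ['href'], div: [] }; // example whitelist

function escapeData(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

function sanitize(html) {
  const tagRe = /<\/?([a-zA-Z][a-zA-Z0-9]*)((?:\s+[^<>]*?)?)>/g;
  let out = '', last = 0, stack = [], m;
  while ((m = tagRe.exec(html)) !== null) {
    out += escapeData(html.slice(last, m.index)); // text between tags
    last = tagRe.lastIndex;
    const name = m[1].toLowerCase();
    const closing = m[0][1] === '/';
    if (!(name in ALLOWED)) {        // unknown tag: convert to data
      out += escapeData(m[0]);
      continue;
    }
    if (closing) {
      if (stack.includes(name)) {    // pop (and close) down to the match
        let top;
        do { top = stack.pop(); out += `</${top}>`; } while (top !== name);
      }                              // stray close tag: silently dropped
    } else {
      const attrs = [];
      const attrRe = /([a-zA-Z-]+)\s*=\s*"([^"]*)"/g;
      let a;
      while ((a = attrRe.exec(m[2])) !== null) {
        if (ALLOWED[name].includes(a[1].toLowerCase())) {
          attrs.push(` ${a[1]}="${escapeData(a[2])}"`);
        }
      }
      out += `<${name}${attrs.join('')}>`;
      stack.push(name);
    }
  }
  out += escapeData(html.slice(last));
  while (stack.length) out += `</${stack.pop()}>`; // close leftovers
  return out;
}
```

Because the output is re-serialized from the sanitizer's own token stream rather than copied from the input, malformed input such as the expression() example above can never survive into the output.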

 The proof that sanitizing HTML is difficult is the fact that no major
 site even attempts it. Even Wikipedia uses some obscure wiki-language,
 instead of implementing a wysiwyg editor.

Wikipedia does sanitize HTML in the content. It does support its own
wiki-language in addition to HTML. For example, try to input the
following text as-is in the Wikipedia sandbox page and press "Show preview":

***

 Example: <div colspan=&amp;
 style=font-family&#61;expression&#40;alert&#40&quot;hacked&quot&#41&#41
 colspan=&amp;>Red</div>

Some <b>more</b> content <i>here</i>.
***

Works just fine. The content is sanitized and unrecognized parts are
converted to data. Correctly written parts are used as HTML tags.

Trust me, it's really not that hard. The hard part is to decide which
tags, attributes and attribute values you want to allow. And you have
to decide that by yourself - there's no magic silver-bullet safe
feature set that is suitable for every usage and for every site.

If you don't want to go through all this trouble, do not try to allow
HTML or any other markup in user generated content unless you *really*
trust your users.

 Note that sandboxing doesn't entirely remove the need for sanitising user
 generated content on the server, it's just an extra line of defence in case
 something slips through.
 
 Of course. However, the sandbox feature in the browser will be fail safe if
 user generated content is escaped with &lt; and &gt; before being sent
 to the browser - as long as the browser does not have bugs, of course.

That's a pretty big if. If the page author / server application
programmer is always able to escape content correctly, how much harder
is it to correctly escape and sanitize the content anyway?

All this sounds too much like magic_quotes in PHP...
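For reference, the escape-everything path being discussed here is tiny. The function name below is invented for illustration, but the three characters are the ones that matter; this is a sketch, not any library's API:

```javascript
// If no markup at all is allowed in user content, escaping these three
// characters before output makes the content inert in any browser.
// Order matters: '&' must be escaped first, or the other replacements
// would themselves get double-escaped.
function escapeForHtml(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}
```

The point made above stands, though: this only helps if the developer remembers to call it at every output site, which is exactly the failure mode magic_quotes tried (and failed) to paper over.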

 A problem with this approach is that developers might forget to escape
 tags, therefore I think browsers should display a security warning
 message if the character < or > is encountered inside a data tag.
 If a developer forgot to escape the markup at all, then a user could enter
 </data><script>...</script> and do anything they wanted.
 
 Yes, that is my point. That is why I want the sandbox to display a
 severe security warning if the developer has forgotten to escape it.

Isn't that a bit too late? If the developer is not testing his
application before the release what's the point of breaking the whole
site in the user's browser as a result? It will not guard against XSS
because the user generated content can be *first* used to end the
sandbox and *then* used to insert an XSS attack. The browser sees only
valid content in the sandbox and the site is still under XSS attack.

 This method will be safe for all browsers that have ever existed and
 that will ever exist in the future. If new features are introduced in
 some future version of CSS or HTML - the sandbox is still there and
 the applications created today do not need to have their sanitizers
 updated, ever.

That's a pretty bold claim! I guess that a similar claim could have been
said about CSS support before Microsoft added the expression() value
syntax.

Can *you* guarantee that a random browser vendor does not implement
anything stupid for the sandbox content in the future?

-- 
Mikko




Re: [whatwg] Sandboxing to accommodate user generated content.

2008-06-18 Thread Kristof Zelechovski
Let’s sort things out, folks.  There is nothing in the spec to prevent a
browser vendor from formatting the user’s hard drive and draining her bank
account as a bonus when the page displayed contains the string D357R0Y!N0\V!.
The spec does not tell the vendors what not to do, therefore it cannot
guarantee anything in this respect.  The spec provides a reference
implementation and it is our job not to let harmful extensions in here; what
happens in the wild is beyond our control.
IMHO,
Chris

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Mikko Rantalainen
Sent: Wednesday, June 18, 2008 9:20 AM
To: whatwg@lists.whatwg.org
Subject: Re: [whatwg] Sandboxing to accommodate user generated content.

Frode Børli wrote:
 I have been reading up on past discussions on sandboxing content, and

 My main arguments for having this feature (in one form or another) in
 the browser is:

 - It is future proof. Changes to browsers (for example adding
 expression support to css) will never again require old sanitizers to
 be updated.

Unless some braindead vendor is going to add scripting-in-sandboxing
feature which would be equally braindead to unlimited expression support
in css. You cannot be future proof unless you trust all the players
including ALL possible browser vendors.

[snip]

 This method will be safe for all browsers that has ever existed and
 that will ever exist in the future. If new features are introduced in
 some future version of CSS or HTML - the sandbox is still there and
 the applications created today does not need to have their sanitizers
 updated, ever.

That's a pretty bold claim! I guess that a similar claim could have been
said about CSS support before Microsoft added the expression() value
syntax.

Can *you* guarantee that a random browser vendor does not implement
anything stupid for the sandbox content in the future?

-- 
Mikko




[whatwg] Javascript API to query supported codecs for video and audio

2008-06-18 Thread j
Hi,

as it looks like there will not be a common base codec any time soon,
there is a need to be able to detect the supported codecs in javascript.
are there any plans to provide such an interface or is this already
possible?

j



Re: [whatwg] Javascript API to query supported codecs for video and audio

2008-06-18 Thread Anne van Kesteren

On Wed, 18 Jun 2008 12:01:13 +0200, [EMAIL PROTECTED] wrote:

as it looks like there will not be a common base codec any time soon,
there is a need to be able to detect the supported codecs in javascript.
are there any plans to provide such an interface or is this already
possible?


Why is that needed? The elements provide a way to link to multiple codecs  
of which the user agent will then make a choice.



--
Anne van Kesteren
http://annevankesteren.nl/
http://www.opera.com/


Re: [whatwg] Javascript API to query supported codecs for video and audio

2008-06-18 Thread j
On Wed, 2008-06-18 at 12:03 +0200, Anne van Kesteren wrote: 
 Why is that needed? The elements provide a way to link to multiple codecs  
 of which the user agent will then make a choice.
i do not intend to provide multiple codecs since that would require
multiple backend implementations for playing files from an offset,
encoding files in multiple codecs on the server, more disk space etc.

instead i would only use the video tag if the codec i use is supported
and fall back to other means via object / java / flash or whatever to
playback the video or indicate that the user has to install a
qt/dshow/gstreamer plugin. in an ideal world i could use video like i
can use img now and be done with it, but since this is not the case we
need tools to make the best out of video, not knowing what the browser
supports and just hoping that it could work is not an option.

j




Re: [whatwg] Javascript API to query supported codecs for video and audio

2008-06-18 Thread Philip Jägenstedt
It seems to me that it's a good idea to wait with this until we know
more about what will happen with baseline codecs etc.
Implementation-wise it might be less than trivial to return an
exhaustive list of all supported mime-types if the underlying framework
doesn't use the concept of mime-types, but can say when given a few
bytes of the file whether it supports it or not. Allowing JavaScript to
second-guess this doesn't seem great 

On Wed, 2008-06-18 at 12:18 +0200, [EMAIL PROTECTED] wrote:
 On Wed, 2008-06-18 at 12:03 +0200, Anne van Kesteren wrote: 
  Why is that needed? The elements provide a way to link to multiple codecs  
  of which the user agent will then make a choice.
 i do not intend to provide multiple codecs since that would require
 multiple backend implementations for playing files form an offset,
 encoding files in multiple codecs on the server, more disk space etc,
 
 instead i would only use the video tag if the codec i use is supported
 and fall back to other means via object / java / flash or whatever to
 playback the video or indicate that the user has to install a
 qt/dshow/gstreamer plugin. in an ideal world i could use video like i
 can use img now and be done with it, but since this is not the case we
 need tools to make the best out of video, not knowing what the browser
 supports and just hoping that it could work is not an option.
 
 j
 
 
-- 
Philip Jägenstedt
Opera Software



Re: [whatwg] TCPConnection feedback

2008-06-18 Thread Frode Børli
 without informing the user. This would allow a popular page (say a facebook
 profile or banner ad) to perform massive DOS against web servers using
 visitors' browsers without any noticeable feedback (though I guess this is
 also true of current XMLHttpRequest objects).

XMLHttpRequest only allows connections to the origin server IP of the
script that created the object. If a TCPConnection is supposed to be
able to connect to other services, then some sort of mechanism must be
implemented so that the targeted web server can perform some sort of
approval. The method of approval must be engineered in such a way that
the approval process itself cannot be the target of the DoS attack. I
can imagine something implemented on the DNS servers and then some
digital signing of the script using public/private key certificates.

  I propose that there be requirements that limit the amount and type of data
 a client can send before receiving a valid server response.

If the client must send information through the TCPConnection
initially, then we effectively stop existing servers such as
IRC servers from being able to accept connections without needing a
rewrite.

  There should also be a recommendation that UAs display some form of status
 feedback to indicate a background connection is occurring.
Agree.

  HIXIE.3) No existing SMTP server (or any non-TCPConnection server) is
  going to send back the appropriate handshake response.

If TCPConnection is limited to connect only to the origin server, or
servers validated by certificates, then this will never be a problem.
If we take active measures against SMTP, then we should do the same
against POP3, IMAP etc. as well.

  It is always possible that non-http services are running on port 80. One
 logical reason would be as a workaround for strict firewalls. So the main
 defense against abuse is not the port number but the handshake. The original
 TCP Connection spec required the client to send only "Hello\n" and the
 server to send only "Welcome\n". The new proposal complicates things since
 the server/proxy could send any valid HTTP headers and it would be up to the
 UA to determine their validity. Since the script author can also inject URIs
 into the handshake this becomes a potential flaw. Consider the code:

The protocol should not require any data (not even "hello") - it should
function as an ordinary TCP connection, similar to implementations in
Java, C# or any other major programming language. If not, it should be
called something else - as it is not a TCP connection.
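For reference, the earlier draft's handshake that this thread keeps returning to amounts to no more than the following sketch; the function name is invented here, and a real server would of course read from a socket rather than take a string:

```javascript
// The original TCP Connection draft handshake, as described above:
// the client may send only "Hello\n", and a conforming server answers
// "Welcome\n" before any application data flows. Anything else means
// the peer is not speaking the protocol and the connection is dropped.
function serverHandshake(firstLine) {
  return firstLine === 'Hello\n' ? 'Welcome\n' : null; // null: drop it
}
```

The objection above is precisely that this tiny required exchange already makes it something other than a raw TCP connection, since no existing service (IRC, SMTP, POP3) will ever send "Welcome\n" first.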


Re: [whatwg] Javascript API to query supported codecs for video and audio

2008-06-18 Thread Philip Jägenstedt
Sorry, my reply was cut short. Again:

It seems to me that it's a good idea to wait with this until we know
more about what will happen with baseline codecs etc.
Implementation-wise it might be less than trivial to return an
exhaustive list of all supported mime-types if the underlying framework
doesn't use the concept of mime-types, but can say when given a few
bytes of the file whether it supports it or not. Allowing JavaScript to
second-guess this seems like a potential source of incompatibility.
Isn't it sufficient to look for MEDIA_ERR_DECODE and add fallback
content when that happens?

Philip

On Wed, 2008-06-18 at 17:34 +0700, Philip Jägenstedt wrote:
 It seems to me that it's a good idea to wait with this until we know
 more about what will happen with baseline codecs etc.
 Implementation-wise it might be less than trivial to return an
 exhaustive list of all supported mime-types if the underlying framework
 doesn't use the concept of mime-types, but can say when given a few
 bytes of the file whether it supports it or not. Allowing JavaScript to
 second-guess this doesn't seem great 
 
 On Wed, 2008-06-18 at 12:18 +0200, [EMAIL PROTECTED] wrote:
  On Wed, 2008-06-18 at 12:03 +0200, Anne van Kesteren wrote: 
   Why is that needed? The elements provide a way to link to multiple codecs 

   of which the user agent will then make a choice.
  i do not intend to provide multiple codecs since that would require
  multiple backend implementations for playing files form an offset,
  encoding files in multiple codecs on the server, more disk space etc,
  
  instead i would only use the video tag if the codec i use is supported
  and fall back to other means via object / java / flash or whatever to
  playback the video or indicate that the user has to install a
  qt/dshow/gstreamer plugin. in an ideal world i could use video like i
  can use img now and be done with it, but since this is not the case we
  need tools to make the best out of video, not knowing what the browser
  supports and just hoping that it could work is not an option.
  
  j
  
  
-- 
Philip Jägenstedt
Opera Software



Re: [whatwg] Javascript API to query supported codecs for video and audio

2008-06-18 Thread João Eiras
The spec clearly says the following
http://www.whatwg.org/specs/web-apps/current-work/#video1
"User agents should not show this content to the user; it is intended
for older Web browsers which do not support video."

Although we fully understand the reasoning behind this, there's a use
case missing.
The user agent may support video but might not support the file format.
So in this case, video should do fallback, because:
a) video tags are markup and therefore their error handling is not
available to scripts
b) the author may not want to use scripts, or may want to make the
page fully usable without scripting
c) it's a predictable scenario without any written solution


2008/6/18 Philip Jägenstedt [EMAIL PROTECTED]:
 Sorry, my reply was cut short. Again:

 It seems to me that it's a good idea to wait with this until we know
 more about what will happen with baseline codecs etc.
 Implementation-wise it might be less than trivial to return an
 exhaustive list of all supported mime-types if the underlying framework
 doesn't use the concept of mime-types, but can say when given a few
 bytes of the file whether it supports it or not. Allowing JavaScript to
 second-guess this seems like a potential source of incompatibility.
 Isn't it sufficient to look for MEDIA_ERR_DECODE and add fallback
 content when that happens?

 Philip

 On Wed, 2008-06-18 at 17:34 +0700, Philip Jägenstedt wrote:
 It seems to me that it's a good idea to wait with this until we know
 more about what will happen with baseline codecs etc.
 Implementation-wise it might be less than trivial to return an
 exhaustive list of all supported mime-types if the underlying framework
 doesn't use the concept of mime-types, but can say when given a few
 bytes of the file whether it supports it or not. Allowing JavaScript to
 second-guess this doesn't seem great

 On Wed, 2008-06-18 at 12:18 +0200, [EMAIL PROTECTED] wrote:
  On Wed, 2008-06-18 at 12:03 +0200, Anne van Kesteren wrote:
   Why is that needed? The elements provide a way to link to multiple codecs
   of which the user agent will then make a choice.
  i do not intend to provide multiple codecs since that would require
  multiple backend implementations for playing files form an offset,
  encoding files in multiple codecs on the server, more disk space etc,
 
  instead i would only use the video tag if the codec i use is supported
  and fall back to other means via object / java / flash or whatever to
  playback the video or indicate that the user has to install a
  qt/dshow/gstreamer plugin. in an ideal world i could use video like i
  can use img now and be done with it, but since this is not the case we
  need tools to make the best out of video, not knowing what the browser
  supports and just hoping that it could work is not an option.
 
  j
 
 
 --
 Philip Jägenstedt
 Opera Software




Re: [whatwg] Javascript API to query supported codecs for video and audio

2008-06-18 Thread j
On Wed, 2008-06-18 at 17:38 +0700, Philip Jägenstedt wrote:
 Implementation-wise it might be less than trivial to return an
 exhaustive list of all supported mime-types if the underlying framework
 doesn't use the concept of mime-types, but can say when given a few
 bytes of the file whether it supports it or not. Allowing JavaScript to
 second-guess this seems like a potential source of incompatibility.
 Isn't it sufficient to look for MEDIA_ERR_DECODE and add fallback
 content when that happens?
i imagined something that would use the type string used in <source>
so i can do:
 canDecode('video/mp4; codecs="avc1.42E01E, mp4a.40.2"')
or
 canDecode('video/ogg; codecs="theora, vorbis"')

while waiting for MEDIA_ERR_DECODE might be an option,
it sounds to me that that would involve a network connection being made,
the video being buffered and after that the media engine failing; this
takes too long to make a presentation decision based on it.
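A sketch of what such a helper could look like. Note that the canPlayType() method used here is an assumption about how a UA might expose this capability (no such API existed at the time of this thread), and the element is passed in as a parameter purely so the helper is easy to exercise outside a browser:

```javascript
// Hypothetical canDecode() helper matching the shape proposed above.
// It asks a media element whether it thinks it can play the given
// type string (MIME type plus codecs parameter) without fetching
// anything, so the presentation decision can be made up front.
function canDecode(videoEl, type) {
  if (!videoEl || typeof videoEl.canPlayType !== 'function') {
    return false;                      // no media element support at all
  }
  // canPlayType is assumed to answer '', 'maybe' or 'probably';
  // anything non-empty is treated as "worth trying".
  return videoEl.canPlayType(type) !== '';
}

// In a browser this might be used as (illustrative only):
//   if (canDecode(document.createElement('video'),
//                 'video/ogg; codecs="theora, vorbis"')) {
//     /* emit a <video> element */
//   } else {
//     /* fall back to object / java / flash, or prompt for a plugin */
//   }
```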

j



Re: [whatwg] Javascript API to query supported codecs for video and audio

2008-06-18 Thread Henri Sivonen

On Jun 18, 2008, at 13:34, Philip Jägenstedt wrote:


Implementation-wise it might be less than trivial to return an
exhaustive list of all supported mime-types if the underlying framework
doesn't use the concept of mime-types, but can say when given a few
bytes of the file whether it supports it or not



Are MIME types the right way of identification in HTML5 if the
well-known frameworks use something other than MIME types?


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/




[whatwg] vtab as an NCR expansion

2008-06-18 Thread Henri Sivonen
Is it intentional that the vtab change didn't cause a change to vtab  
treatment when expanding NCRs?


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/




Re: [whatwg] TCPConnection feedback

2008-06-18 Thread Shannon

Frode Børli wrote:


XMLHttpRequest only allows connections to the origin server ip of the
script that created the object. If a TCPConnection is supposed to be
able to connect to other services, then some sort of mechanism must be
implemented so that the targeted web server must perform some sort of
approval. The method of approval must be engineered in such a way that
approval process itself cannot be the target of the dos attack. I can
imagine something implemented on the DNS servers and then some digital
signing of the script using public/private key certificates.

  
Using DNS is an excellent idea, though I would debate whether the
certificate is needed in addition to the DNS record. Perhaps the DNS record
could simply list domains authorised to provide scripted access. The
distributed nature and general robustness of DNS servers provides the
most solid protection against denial of service and brute-force cracking,
which are the primary concerns here. Access control should probably be
handled by the host's usual firewall and authentication methods, which is
trivial once the unauthorised redirect issue is dealt with.


The biggest issue I see is that most UAs are probably not wired to read 
DNS records directly. This means adding DNS access and parsing libraries 
for this one feature. Having said that I can see a whole range of 
security issues that could be addressed by DNS access so maybe this is 
something that HTML5 could address as a more general feature. One 
feature that comes to mind would be to advertise expected server outages
or /.'ing via DNS so the UAs could tell the user "Hey, this site might
not respond so maybe come back later."
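As an illustration of the DNS opt-in idea, the UA-side check could be as simple as the sketch below. The TXT-record format ("allow-origin=...") is entirely invented here for the example; nothing like it was specified:

```javascript
// Hypothetical check: the target host publishes a DNS TXT record
// listing the origins whose scripts may open connections to it.
// The UA resolves the record (not shown) and then applies this test.
function originAllowed(txtRecord, origin) {
  const m = /^allow-origin=(.*)$/.exec(txtRecord);
  if (!m) return false;                 // not an opt-in record: deny
  return m[1].split(/\s+/).includes(origin);
}
```

The hard part, as noted above, is not this check but getting DNS resolution and record parsing into UAs that are not wired for it.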


It is worth considering allowing scripts to access devices without said
DNS rules but with a big fat UA warning, requiring user approval.
Something like "This site is attempting to access a remote service or
device at the address 34.45.23.54:101 (POP3). This could be part of a
legitimate service but may also be an attempt to perform a malicious
task. If you do not trust this site you should say no here." This would
address the needs of private networks and home appliances that wish to
utilise TCPConnection services without having the desire or ability to
mess with DNS zone files.




The protocol should not require any data (not even hello - it should
function as an ordinary TCPConnection similar to implementations in
java, c# or any other major programming language. If not, it should be
called something else - as it is not a TCP connection.

  
I agree completely. Just providing async HTTP is a weak use case 
compared to allowing client-side access to millions of existing 
(opted-in) services and gadgets.


Shannon



[whatwg] Any other end tag in after head

2008-06-18 Thread Henri Sivonen
"After head" talks about "any other end tag", but has no definitions for
end tags but "other". Is that intentional?


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/




[whatwg] more drag/drop feedback

2008-06-18 Thread Neil Deakin
The initDragEvent/initDragEvent methods take a DataTransfer as an 
argument. Is it expected that the DataTransfer to use here can be 
created with 'new DataTransfer'?


IE and Safari allow a no-argument form of clearData as well which clears 
all formats.


The description for the 'types' property implies that this should be a 
live list. Why?


The clearData, setData and getData methods should clarify what happens 
if an empty format is supplied.


I still don't understand the purpose of the addElement method. Either 
this should be removed or there should be clarification of the 
difference between addElement and setDragImage


Previously, I said that DragEvents should be UIEvents. I think they 
should instead inherit from MouseEvent.


We have a need to be able to support both dragging multiple items, as 
well as dragging non-string data, for instance dragging a set of files 
as a set of File objects (see http://www.w3.org/TR/file-upload/) from 
the file system onto a page.


For this, we would also like to propose methods like the following,
which are analogous to the existing methods (where Variant is just any
type of object):


/**
* The number of items being dragged.
*/
readonly attribute unsigned long itemCount;

/**
* Holds a list of the format types of the data that is stored for an item
* at the specified index. If the index is not in the range from 0 to
* itemCount - 1, an empty string list is returned.
*/
DOMStringList typesAt(in unsigned long index);

/**
* Remove the data associated with the given format for an item at the
* specified index. The index is in the range from zero to itemCount - 1.
*
* If the last format for the item is removed, the entire item is removed,
* reducing the itemCount by one.
*
* If format is empty, then the data associated with all formats is removed.
* If the format is not found, then this method has no effect.
*
* @throws NS_ERROR_DOM_INDEX_SIZE_ERR if index is greater than or equal to
* itemCount
*/
void clearDataAt(in DOMString format, in unsigned long index);

/*
* setDataAt may only be called with an index argument less than
* itemCount in which case an existing item is modified, or equal to
* itemCount in which case a new item is added, and the itemCount is
* incremented by one.
*
* Data should be added in order of preference, with the most specific
* format added first and the least specific format added last. If data of
* the given format already exists, it is replaced in the same position as
* the old data.
*
* @throws NS_ERROR_DOM_INDEX_SIZE_ERR if index is greater than itemCount
*/
void setDataAt(in DOMString format, in Variant data, in unsigned long index);


/**
* Retrieve the data associated with the given format for an item at the
* specified index, or null if it does not exist. The index should be in the
* range from zero to itemCount - 1.
*
* @throws NS_ERROR_DOM_INDEX_SIZE_ERR if index is greater than or equal to
* itemCount
*/
Variant getDataAt(in DOMString format, in unsigned long index);
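To make the proposed semantics concrete, here is an illustrative JavaScript model of the multi-item methods above. The method names come from the proposal; the behaviour is inferred from its comments, not from any spec, and the class name is invented:

```javascript
// Toy model of the proposed multi-item drag data store: each item is a
// map from format string to data, and the item list shrinks when an
// item's last format is removed, as the proposal's comments describe.
class MultiItemStore {
  constructor() { this.items = []; }           // Array<Map<format, data>>
  get itemCount() { return this.items.length; }
  typesAt(i) {                                  // [] when out of range
    if (i < 0 || i >= this.items.length) return [];
    return Array.from(this.items[i].keys());
  }
  setDataAt(format, data, i) {                  // i === itemCount appends
    if (i > this.items.length) throw new RangeError('index > itemCount');
    if (i === this.items.length) this.items.push(new Map());
    this.items[i].set(format, data);
  }
  getDataAt(format, i) {                        // null when format absent
    if (i >= this.items.length) throw new RangeError('index >= itemCount');
    const v = this.items[i].get(format);
    return v === undefined ? null : v;
  }
  clearDataAt(format, i) {                      // '' clears all formats
    if (i >= this.items.length) throw new RangeError('index >= itemCount');
    if (format === '') { this.items.splice(i, 1); return; }
    this.items[i].delete(format);
    if (this.items[i].size === 0) this.items.splice(i, 1);
  }
}
```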



Re: [whatwg] Creating An Outline oddity

2008-06-18 Thread Geoffrey Sneddon


On 15 Jun 2008, at 04:06, Ian Hickson wrote:


On Sun, 15 Jun 2008, Geoffrey Sneddon wrote:


Having implemented the creating an outline algorithm (see
http://pastebin.ca/1048202), I'm getting some odd results (the only
TODO won't affect HTML 4.01 documents such as the following issues).

Using `<h1>Foo<h2>Bar<h2>Lol`, and looking at the final current
section (this is the root sectioning element, body), it seems I
correctly get the heading of it (Foo), but I only get one subsection:
Bar. As far as I can see, my implementation follows what the spec
says, so it looks as if this is an issue with the spec.

With HTML 5, the current_outlinee at the end is a td element, when it
should be the body element. That really is rather odd.


I don't understand the markup you mean. Could you draw the DOM or  
provide

unambiguous markup for what you're describing? (I don't understand how
Foo is a heading but Bar is a section in your markup.)


The first issue is identical to
http://lists.w3.org/Archives/Public/public-html/2008Mar/0032.html,
which I bullied (sorry, asked) you into fixing yesterday and is now
fixed. The second issue was an implementation bug.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] TCPConnection feedback

2008-06-18 Thread Michael Carter

 The protocol should not require any data (not even hello - it should
 function as an ordinary TCPConnection similar to implementations in
 java, c# or any other major programming language. If not, it should be
 called something else - as it is not a TCP connection.



  I agree completely. Just providing async HTTP is a weak use case compared
 to allowing client-side access to millions of existing (opted-in) services
 and gadgets.

 Shannon


It's clear that we need some kind of opt-in strategy or all web viewers will
become spam bots in short order. While I think it will be extremely useful
to have a raw tcp connection in the browser, and indeed you could use an
external service like dns to handle connection authorization, I think that
it will be much more difficult to drive adoption to that kind of standard.
In the meantime, we need to make the protocol enforce the opt-in.

In that case I agree that the name shouldn't be TCPConnection. I propose
SocketConnection instead.

-Michael Carter


Re: [whatwg] TCPConnection feedback

2008-06-18 Thread Ian Hickson
On Wed, 18 Jun 2008, Michael Carter wrote:
 
 In that case I agree that the name shouldn't be TCPConnection. I propose 
 SocketConnection instead.

I was thinking WebSocket (with the protocol itself called the Web Socket 
Protocol or Web Socket Communication Protocol or some such).

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


[whatwg] Making it possible to do an anchor link to any DOM node

2008-06-18 Thread Lukas Kahwe Smith

Hi,

Currently when linking to specific places in a document one is limited
to the places the original author made linkable via an anchor <a> tag.
While this is a nice touch (though not well exposed by modern
browsers), the reality is that most of the time the person who writes
a document that links to the original page has a better idea of where
exactly he wants to link to.


So I think it should be possible to dynamically get an "implicit
anchor" on essentially anything. This would be specific DOM ids or any
CSS selector or XPath expression. Browsers could be extended to not
only feature a "copy link" context menu, but also "copy link to
element", which would do all the nitty gritty work of pointing to the
element on which the context menu was invoked.


I searched this list to determine if something like this was suggested  
before, the only relevant post [1] I found does not seem to actually  
discuss the same at closer inspection. I just joined this list and I  
have not done a super in-depth study on the web about this. So I hope I
am not boring you all with an old idea.


regards,
Lukas Kahwe Smith
[EMAIL PROTECTED]

[1] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2007-April/010801.html


Re: [whatwg] TCPConnection feedback

2008-06-18 Thread Michael Carter
On Wed, Jun 18, 2008 at 12:59 PM, Ian Hickson [EMAIL PROTECTED] wrote:

 On Wed, 18 Jun 2008, Michael Carter wrote:
 
  In that case I agree that the name shouldn't be TCPConnection. I propose
  SocketConnection instead.

 I was thinking WebSocket (with the protocol itself called the Web Socket
 Protocol or Web Socket Communication Protocol or some such).


That sounds pretty good. Worth noting is that
http://en.wikipedia.org/wiki/Internet_socket suggests that the term
"socket" doesn't refer to TCP/IP specifically; rather it refers to
multiple protocols. You can have a UDP socket or an IP socket just as
easily as you could have a TCP socket. In this context, neither
WebSocket nor SocketConnection really presents a naming problem.

-Michael Carter


Re: [whatwg] Making it possible to do an anchor link to any DOM node

2008-06-18 Thread Ian Hickson
On Wed, 18 Jun 2008, Lukas Kahwe Smith wrote:
 
 So I think it should be possible to dynamically get an implicit anchor 
 on essentially anything. This would be specific DOM id's or any css 
 selector or xpath expression. Browsers could be extended to not only 
 feature a copy link context menu, but also copy link to element, 
 which would do all the nitty gritty work of pointing to the element on 
 which the context menu was invoked.
 
 I searched this list to determine if something like this was suggested 
 before, the only relevant post [1] I found does not seem to actually 
 discuss the same at closer inspection. I just joined this list and I 
 have not done a super indepth study on the web about this. So I hope I 
 am not boring you all with an old idea.

This was discussed recently in the public-html list, I believe. My 
conclusion was that the better way to approach this would be to take the 
XPointer work [1] and extend it to cover HTML DOMs as well as XML. With 
such a language specified, one could then add references to such languages 
to the relevant MIME type RFCs (fragment identifier behaviour has 
historically been defined by MIME type RFCs, not by the language specs 
themselves).

My recommendation therefore would be to approach the XML Core Working 
Group at the W3C and see if there is any interest in developing XPointer 
further.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] TCPConnection feedback

2008-06-18 Thread Michael Carter
 Still I do not believe it should have a specific protocol. If a
 protocol is decided on, and it is allowed to connect to any IP-address
 - then DDOS attacks can still be performed: If one million web
 browsers connect to any port on a single server, it does not matter
 which protocol the client tries to communicate with. The server will
 still have problems.


Aren't there an identical set of objections to the cross-domain
access-control header? Or Microsoft's XDR object? Even without WebSocket,
browsers will be making fully cross-domain requests, and the only question
left is how exactly to implement the security in the protocol. That said,
there's no additional harm in allowing WebSocket to establish cross-domain
connections, but there are many benefits.

-Michael Carter


[whatwg] TCPConnection feedback

2008-06-18 Thread Philipp Serafin
 Still I do not believe it should have a specific protocol.

I think a major problem with raw TCP connections is that they would be
a nightmare for every administrator. If web pages could use every
sort of homebrew protocol on all possible ports, how could you still
sensibly configure a firewall without the danger of accidentally
disabling grandma Mary Sue's web application?

Also keep in mind the issue list Ian brought up in the other mail.
Things like URI-based addressing and virtual hosting would not be
possible with raw TCP. That would make this feature a lot less usable
for authors who do not have full control over their server, as in
shared hosting situations, for example.

 [If a] protocol is decided on, and it is allowed to connect to any IP-address
 - then DDOS attacks can still be performed: If one million web
 browsers connect to any port on a single server, it does not matter
 which protocol the client tries to communicate with. The server will
 still have problems.

Couldn't this already be done today, though? You can already
connect to an arbitrary server on an arbitrary port using forms,
<img>, <script src> and all other references that cannot be
cross-domain protected for backwards-compatibility reasons. The whole
hotlinking issue is basically the result of that.
How would WebSocket connections be more harmful than something like

setInterval(function(){
  var img = new Image();
  img.src = "http://victim.example.com/" + generateLongRandomString();
}, 1000);

for example would?


Re: [whatwg] Making it possible to do an anchor link to any DOM node

2008-06-18 Thread Elliotte Rusty Harold

Lukas Kahwe Smith wrote:

Hi,

Currently when linking to specific places in a document one is limited 
to the places the original author made linkable via an anchor <a> tag. 
While this is a nice touch (though not well exposed by modern browsers), 
the reality is that most of the time the person who writes a document 
that links to the original page has a better idea of where exactly he 
wants to link to.



Actually, what you link to these days is any element with an ID attribute.

What you're proposing is to reinvent XPointer.

--
Elliotte Rusty Harold
[EMAIL PROTECTED]


Re: [whatwg] more drag/drop feedback

2008-06-18 Thread Thomas Broyer
On Wed, Jun 18, 2008 at 4:46 PM, Neil Deakin wrote:
 The initDragEvent/initDragEvent methods take a DataTransfer as an argument.
 Is it expected that the DataTransfer to use here can be created with 'new
 DataTransfer'?

 IE and Safari allow a no-argument form of clearData as well which clears all
 formats.

FWIW, Adobe AIR's Clipboard (which is equivalent to the DataTransfer
object) has a clear() no-argument method.

See:
http://livedocs.adobe.com/flex/3/langref/flash/events/NativeDragEvent.html
http://help.adobe.com/en_US/AIR/1.1/jslr/flash/desktop/Clipboard.html

 The description for the 'types' property implies that this should be a live
 list. Why?

Maybe so that you can keep a reference to it while setData/clearData
is being called? But couldn't you just keep a reference to the
DataTransfer object? It seems that IE doesn't have such a property.
Adobe AIR's WebKit does, but how about Safari?

 The clearData, setData and getData methods should clarify what happens if an
 empty format is supplied.

Out of curiosity, what happens in IE and Safari?

 I still don't understand the purpose of the addElement method.

I think addElement is there to allow e.g. dialog boxes where the box
can be dragged using its title bar: only the title-bar is draggable
but ondragstart it adds the whole dialog box to the list of what is
being dragged...

 Either this
 should be removed or there should be clarification of the difference between
 addElement and setDragImage

setDragImage is only about the drag feedback, not what is being
dragged, if I understand correctly...

 Previously, I said that DragEvents should be UIEvents. I think they should
 instead inherit from MouseEvent.

FWIW, NativeDragEvent in Adobe AIR inherits MouseEvent.

 We have a need to be able to support both dragging multiple items, as well
 as dragging non-string data, for instance dragging a set of files as a set
 of File objects (see http://www.w3.org/TR/file-upload/) from the file system
 onto a page.

That would be new data formats.
That's how Adobe AIR solved the problem. They added Bitmap and File-list
data formats, for which getData returns AIR-specific objects (a
BitmapData and an array of File objects respectively):
http://help.adobe.com/en_US/AIR/1.1/jslr/flash/desktop/ClipboardFormats.html
http://livedocs.adobe.com/air/1/devappshtml/DragAndDrop_2.html#1048911

 For this, we would also like to propose methods like the following, which
 are analagous to the existing methods (where Variant is just any type of
 object):

I don't see a real need for them (others didn't need them while
providing the same features) and they wouldn't be backwards compatible
with IE and Safari, while AFAIK the current draft is.

-- 
Thomas Broyer


[whatwg] Suggestion of an alternative TCPConnection implementation

2008-06-18 Thread Frode Børli
 I think a major problem with raw TCP connections is that they would be
 a nightmare for every administrator. If web pages could use every
 sort of homebrew protocol on all possible ports, how could you still
 sensibly configure a firewall without the danger of accidentally
 disabling grandma Mary Sue's web application?

I don't think so, as long as the web page can only connect to its origin
server. I am certain this problem was also discussed when Java applets
were created.

Web pages should only be allowed to access other servers when the
script has been digitally signed, and when the user has agreed to
giving the script elevated privileges - or there should be a
certificate on the origin server which is checked against DNS records
for each server that the script attempts to connect to.

 Also keep in mind the issue list Ian brought up in the other mail.
 Things like URI-based addressing and virtual hosting would not be
 possible with raw TCP. That would make this feature a lot less usable
 for authors who do not have full control over their server, as in
 shared hosting situations, for example.

Hmm.. There are good arguments both ways. I would like both please :)

So what we want is an HTTP-based protocol which allows the client to
continue communicating with the script that handles the initial
request. I believe a great way to implement this would be to
extend the HTTP protocol (by using the Connection: Keep-Alive
header).

It should be the script on the server that decides if the connection
is persistent. This will avoid most problems with cross-domain
connections, I believe. Let's imagine two PHP scripts on a web server:

/index.php (PHP script)
/persistent.pphp (persistent PHP script)

If the user types in the address http://host.com/persistent.pphp -
then this use case is followed:

1. Client sends GET /persistent.pphp and its headers (including domain
name, cookies etc). After all headers are sent, it expects a
standard HTTP-compliant response.
2. Server checks the Accept: header for HTML 5 support.
3 (alternative flow): If no support is found in the Accept header, an
HTTP 406 Not Acceptable response is sent with an error message saying
that an HTML 5 browser is required.
4 (alternative flow): Server checks the (new) SessionID header to see
whether it should reconnect the client to an existing server-side instance.
5. The server-side script processes the request and may reply with a
complete HTML page (or with simply a Hello message; it is the server-side
script that decides). The server must send Connection: Keep-Alive and
Connection-Type: Persistent headers.
6. The browser renders the response, and a singleton object is
magically available from JavaScript: document.defaultHTMLSocket. This
object allows the client to continue communicating with the script
that generated the page by sending either serialized data in the same
form as GET/POST data or single text lines.

Other use case: User visits /index.php - which will connect to
/persistent.pphp using javascript.

1. JavaScript: mySocket = new HTTPSocket("/persistent.pphp");
2. Exactly the same use case as the previous one is followed, except that
the HTTPSocket object is returned. The initial data sent by the server
must be read using the read() method of the HTTPSocket object.
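
Sketched as a raw exchange, the proposed handshake might look like the
following (SessionID and Connection-Type: Persistent are the new headers
proposed above, not existing HTTP; the session value is a placeholder):

```http
GET /persistent.pphp HTTP/1.1
Host: host.com
Accept: text/html
SessionID: abc123

HTTP/1.1 200 OK
Connection: Keep-Alive
Connection-Type: Persistent
Content-Type: text/html

Hello
```

After the response body, the connection would stay open and further data
could flow in both directions.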


Of course, I have not had time to validate that everything I have
suggested can be used and I would like more people to review this
suggestion - but I think it looks very viable at first glance.


I see one problem, and that is if the connection is lost (for example
because of a proxy server):

This could be fixed by creating a new header meant for storing a client
session id. If we standardize on that, the web server could
automatically map the client back to the correct instance of the
server application, and neither the client nor the server application
need know that the connection was lost.
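
As a rough sketch (names and structure are purely illustrative, not part
of any existing API), the server-side mapping this would require could
look like:

```javascript
// Sketch only: reconnecting a client to its server-side instance via a
// proposed SessionID header. All names here are hypothetical.
const instances = new Map(); // sessionId -> server-side application state

function handleConnection(sessionId) {
  // Reuse the existing instance if this session already has one
  // (i.e. the client reconnected after a dropped connection).
  if (!instances.has(sessionId)) {
    instances.set(sessionId, { id: sessionId, messages: [] });
  }
  return instances.get(sessionId);
}

// A client that reconnects with the same SessionID gets the same
// instance back, so neither side needs to notice the dropped connection:
const first = handleConnection("abc123");
first.messages.push("hello");
const second = handleConnection("abc123");
console.log(second === first);       // true
console.log(second.messages.length); // 1
```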


Any feedback will be appreciated.

 Couldn't this already be done today, though? You can already
 connect to an arbitrary server on an arbitrary port using forms,
 <img>, <script src> and all other references that cannot be
 cross-domain protected for backwards-compatibility reasons. The whole
 hotlinking issue is basically the result of that.
 How would WebSocket connections be more harmful than something like

 setInterval(function(){
   var img = new Image();
   img.src = "http://victim.example.com/" + generateLongRandomString();
 }, 1000);

 for example would?


Yes, that could be done - but I think that it would be a lot more
painful for the server if the connection was made to some port, and
kept open. Handling a request for a non-existing URL can be finished
in microseconds, but if the client just opens a port without being
disconnected, then the server will quickly be overloaded by
too many incoming connections.
--
Best regards / Med vennlig hilsen
Frode Børli
Seria.no

Mobile:
+47 406 16 637
Company:
+47 216 90 000
Fax:
+47 216 91 000



Re: [whatwg] Implementation of a good HTTPSocket (TCP-socket)

2008-06-18 Thread Philipp Serafin
On Thu, Jun 19, 2008 at 12:46 AM, Frode Børli [EMAIL PROTECTED] wrote:

 Web pages should only be allowed to access other servers when the
 script has been digitally signed, and when the user has agreed to
 giving the script elevated privileges - or there should be a
 certificate on the origin server which is checked against DNS records
 for each server that the script attempts to connect to.

What prevents a malicious site from simply getting their own certificate?
As for user prompts, I think we have seen how well that works with
IE's ActiveX controls. I fear malicious sites would just put up a
"Click 'yes' in the next dialog to continue" message, and we're back
to square one.

DNS records sound like a good idea though.

 So what we want is an HTTP-based protocol which allows the client to
 continue communicating with the script that handles the initial
 request.

I absolutely agree that this would be the best way. However, couldn't
we use Michael's proposal for that? It seems to solve the same problems
and is actually compliant HTTP (in theory at least).

I find the SessionID header a very good idea, though. What are the
thoughts on that?

I'm sorry if this has already been discussed, but if we use HTTP, why
can't we use the Access Control spec as an opt-in mechanism that is
a little easier to implement than DNS? If you modify the behaviour a
little, you could even use it against DDOS attacks:

Counter suggestion: When a WebSocket object attempts to connect,
perform Access Control checks the way you would for POST requests.
If the check fails and if the server response contains an
Access-Control-Max-Age header, agents must immediately close the
connection and must not open a connection to that resource again (or,
if Access-Control-Policy-Path is present, to any resource specified)
until the specified time has elapsed.
That way, administrators that are hit by a DDOS can simply put

Access-Control: allow <*> exclude <evilsite.example.com>
Access-Control-Max-Age: 86400
Access-Control-Policy-Path: /

in their server headers and the stream should relatively quickly slow
down to a trickle.
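
Client-side, the back-off rule described above could be as simple as this
sketch (the header names come from the Access Control drafts; the
blocklist logic and all function names are assumptions of this proposal,
not a real browser API):

```javascript
// Sketch of the proposed back-off rule: after a failed Access Control
// check that carried Access-Control-Max-Age, the agent refuses further
// connections to that resource until the max-age has elapsed.
const blockedUntil = new Map(); // resource URL -> unblock timestamp in ms

function noteAccessDenied(resource, maxAgeSeconds, now) {
  blockedUntil.set(resource, now + maxAgeSeconds * 1000);
}

function mayConnect(resource, now) {
  const until = blockedUntil.get(resource);
  return until === undefined || now >= until;
}

// With Access-Control-Max-Age: 86400, the resource stays blocked a day:
noteAccessDenied("http://victim.example.com/", 86400, 0);
console.log(mayConnect("http://victim.example.com/", 1000));     // false
console.log(mayConnect("http://victim.example.com/", 86400001)); // true
```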

What do you think?

With best regards,
Philipp Serafin


Re: [whatwg] TCPConnection feedback

2008-06-18 Thread Shannon



I think a major problem with raw TCP connections is that they would be
a nightmare for every administrator. If web pages could use every
sort of homebrew protocol on all possible ports, how could you still
sensibly configure a firewall without the danger of accidentally
disabling grandma Mary Sue's web application?
  


This already happens. Just yesterday we (an ISP) had a company unable to 
access webmail on port 81 due to an overzealous firewall administrator. 
But how is a web server on port 81 more unsafe than one on 80? It isn't 
the port that matters, it's the applications that may (or may not) be 
using them that need to be controlled. Port-based blocking of whole 
networks is a fairly naive approach today. Consider that the main reasons
for these draconian firewalls are two-fold:
1.) to prevent unauthorised/unproductive activities (at schools, 
libraries or workplaces); and

2.) to prevent viruses connecting out.

Port-blocking to resolve these things doesn't work anymore since:
1.) even without plugins a Web 2.0 browser provides any number of 
games, chat sites and other 'time-wasters'; and
2.) free (or compromised) web hosting can provide viruses with update 
and control mechanisms without creating suspicion by using uncommon 
ports; and
3.) proxies exist (commercial and free) to tunnel any type of traffic 
over port 80.


On the other hand port control interferes with legitimate services (like 
running multiple web servers on a single IP). So what I'm saying here is 
that network admins can do what they want but calling the policy of 
blocking non-standard ports sensible and then basing standards on it 
is another thing. It's pretty obvious that port-based firewalling will 
be obsoleted by protocol sniffing and IP/DNS black/whitelists sooner 
rather than later.


Your argument misses the point anyway. Using your browser as an IRC 
client is no different to downloading mIRC or using a web-based chat 
site. The genie of running arbitrary services from a web client 
escaped the bottle years ago with the introduction of javascript and 
plugins. We are looking at browser as a desktop rather than browser 
as a reader and I don't think that's something that will ever be 
reversed. Since we're on the threshold of the Web Applications age, 
and this is the Web Applications Working Group we should be doing 
everything we can to enable those applications while maintaining 
security. Disarming the browser is a valid goal ONLY once we've 
exhausted the possibility of making it safe.



Also keep in mind the issue list Ian brought up in the other mail.
Things like URI-based addressing and virtual hosting would not be
possible with raw TCP. That would make this feature a lot less usable
for authors who do not have full control over their server, as in
shared hosting situations, for example.
  
I fail to see how virtual hosting will work for this anyway. I mean 
we're not talking about Apache/IIS here, we're talking about custom 
applications, scripts or devices - possibly implemented in firmware or 
a few lines of perl. Adding vhost control to the protocol is just 
silly since the webserver won't ever see the request and the custom
application should be able to use any method it likes to differentiate 
its services. Even URI addressing is silly since again the application 
may have no concept of paths or queries. It is simply a service 
running on a port. The only valid use case for all this added complexity 
is proxying but nobody has tested yet whether proxies will handle this 
(short of enabling encryption, and even that is untested).


I'm thinking here that this proposal is basically rewriting the CGI 
protocol (web server handing off managed request to custom scripts) with 
the ONLY difference being the asynchronous nature of the request. 
Perhaps more consideration might be given to how the CGI/HTTP protocols 
might be updated to allow async communication.


Having said that I still see a very strong use case for low-level 
client-side TCP and UDP. There are ways to manage the security risks 
that require further investigation. Even if it must be kept same-domain 
that is better than creating a new protocol that won't work with 
existing services. Even if that sounds like a feature - it isn't. There 
are better ways to handle access-control for non-WebConnection devices 
than sending garbage to the port.


  

 [If a] protocol is decided on, and it is allowed to connect to any IP-address
 - then DDOS attacks can still be performed: If one million web
 browsers connect to any port on a single server, it does not matter
 which protocol the client tries to communicate with. The server will
 still have problems.



Couldn't this already be done today, though? You can already
connect to an arbitrary server on an arbitrary port using forms,
<img>, <script src> and all other references that cannot be
cross-domain protected for backwards-compatibility reasons. The whole
hotlinking issue is