
Re: [whatwg] input element's value should not be sanitized during parsing

2011-06-14 Thread Ian Hickson

On Fri, 11 Mar 2011, Jonas Sicking wrote:
On Tue, Dec 28, 2010 at 11:46 PM, Ian Hickson i...@hixie.ch wrote:
On Mon, 20 Sep 2010, Mounir Lamouri wrote:

With the current specification, these two elements will not have the
same value:
<input value="foo&#13;bar" type='hidden'>
<input type='hidden' value="foo&#13;bar">

Yes they will. The attribute order has no effect. Elements are created
by the parser with their attributes already set:

# When the steps below require the UA to create an element for a token in
# a particular namespace, the UA must create a node implementing the interface
# appropriate for the element type corresponding to the tag name of the
# token in the given namespace (as given in the specification that defines
# that element, e.g. for an a element in the HTML namespace, this
# specification defines it to be the HTMLAnchorElement interface), with
# the tag name being the name of that element, with the node being in the
# given namespace, and with the attributes on the node being those given
# in the given token.
--
http://www.whatwg.org/specs/web-apps/current-work/complete.html#create-an-element-for-the-token

Except that I don't think this is how any implementation actually works.
Nor do I have any desire to write the implementation this way since it
means duplicating a lot of code. I'd have to add code which implemented
attribute behavior both in some special code path triggered during
element creation, as well as code to react to attribute changes
triggered by attribute changes in setAttribute/removeAttribute.

So far this hasn't been needed and the parsing code basically just calls
setAttribute. Unless there are really good reasons to change this I'd
like to avoid it. So far I haven't heard of any such reasons.

The spec is defined such that attribute setting during element creation is
order-agnostic. I believe this is consistent with what authors expect (in
part based on the confusion I've seen when authors run into cases where
that isn't the case). How you implement that is somewhat orthogonal to how
it is specced; if there are specific things that are hard to implement,
I'm happy to discuss them specifically if you like.

On Tue, 21 Sep 2010, Boris Zbarsky wrote:

Where does it say that it's atomic? I don't see that anywhere (and
in fact, the create an element code in the Gecko parser is most
decidedly non-atomic). Now maybe the spec intends this to be an
atomic operation; if so it needs to say that.

The operation it describes is a single operation: create a node. It
describes various constraints on that operation, one of which is that
the node have the various tokenised attributes set. I don't understand
how creating a node could be anything other than atomic -- either it
exists or it does not.

You're expecting several operations to happen at the same time. We could
certainly manually insert the attributes and their value into the
datastructure inside the element which stores the attribute name/value
pairs. However at some point we need to update all of the state that
these values drive. Things like sticking elements into id-hashes,
storing the calculated type of an input, calculating the effective URI
of an image, etc. This involves several separate pieces of state and so
can't happen all at the same time.

Sure. When those things happen is defined by the spec too.

On Tue, 21 Sep 2010, Jonas Sicking wrote:

Also, it would mean that the following two pieces of code behave
differently:

inp = document.createElement("input");
inp.setAttribute("value", "foo\nbar");
inp.setAttribute("type", "hidden");

and

inp = document.createElement("input");
inp.setAttribute("type", "hidden");
inp.setAttribute("value", "foo\nbar");

This does not seem desirable.

I can't argue that it's desirable, but it's how the Web works, as I
understand it.

Gecko doesn't exhibit this behavior and I don't know of any sites that
don't work in Gecko because of this.

On Wed, 30 Mar 2011, Mounir Lamouri wrote:

FWIW, it does. The first inp.value is 'foobar' while the second is 'foo
bar'. See:
http://software.hixie.ch/utilities/js/live-dom-viewer/saved/900

Though, I do not think this is related to the initial issue which is
about setting attributes while creating the element from the parser.

Right, the behaviour is different when the parser does it. This is per
spec, and seems to match what Firefox does.

--
Ian Hickson U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] input element's value should not be sanitized during parsing

2011-06-14 Thread Jonas Sicking

On Tue, Jun 14, 2011 at 2:00 PM, Ian Hickson i...@hixie.ch wrote:
On Fri, 11 Mar 2011, Jonas Sicking wrote:
On Tue, Dec 28, 2010 at 11:46 PM, Ian Hickson i...@hixie.ch wrote:
On Mon, 20 Sep 2010, Mounir Lamouri wrote:

With the current specification, these two elements will not have the
same value:
<input value="foo&#13;bar" type='hidden'>
<input type='hidden' value="foo&#13;bar">

Yes they will. The attribute order has no effect. Elements are created
by the parser with their attributes already set:

The problem, if I understand things correctly, is that setAttribute is
*not* order agnostic, while the parsing code is expected to be. This
means that we can't use the same code paths for setAttribute and
parsing.

This is not acceptable to us in Gecko. We're not willing to have two
code paths for setting attributes.

/ Jonas

Re: [whatwg] input element's value should not be sanitized during parsing

2011-06-14 Thread Ian Hickson

On Tue, 14 Jun 2011, Jonas Sicking wrote:
 
 The problem, if I understand things correctly, is that setAttribute is 
 *not* order agnostic, while the parsing code is expected to be. This 
 means that we can't use the same code paths for setAttribute and 
 parsing.

You can, you just have to have a special initialisation signal that the 
parser sends to an element after it has set its attributes.
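
One way to picture such a signal is the following minimal sketch. It is a toy model, not real DOM or Gecko code: the class `SketchInput`, the `parserCreated` flag, and the `doneCreating()` method are hypothetical names invented for illustration, and the sanitizer table only covers the two types discussed in this thread. The same `setAttribute` path is used by both the parser and script; the parser merely defers sanitization until it signals that all attributes are in place.

```javascript
// Hypothetical sketch: one attribute-setting code path plus a parser-only
// "done creating" signal. None of these names come from a real engine.
const SANITIZE = new Map([
  ["text", (v) => v.replace(/[\r\n]/g, "")], // strip line breaks
  ["hidden", (v) => v],                      // no sanitization defined
]);

class SketchInput {
  constructor({ parserCreated = false } = {}) {
    this.type = "text";
    this.value = "";
    // While the parser is still setting attributes, reactions are deferred.
    this.deferring = parserCreated;
  }

  // Shared by parser and script.
  setAttribute(name, val) {
    if (name === "type") this.type = SANITIZE.has(val) ? val : "text";
    if (name === "value") this.value = val;
    if (!this.deferring) this.sanitize();
  }

  // The initialisation signal: sent once, after all attributes are set.
  doneCreating() {
    this.deferring = false;
    this.sanitize();
  }

  sanitize() {
    this.value = SANITIZE.get(this.type)(this.value);
  }
}

// Stand-in for "create an element for the token".
function createElementForToken(attributes) {
  const el = new SketchInput({ parserCreated: true });
  for (const [name, val] of attributes) el.setAttribute(name, val);
  el.doneCreating();
  return el;
}

const parsedValueFirst = createElementForToken([["value", "foo\rbar"], ["type", "hidden"]]);
const parsedTypeFirst = createElementForToken([["type", "hidden"], ["value", "foo\rbar"]]);

// Script-created element: every setAttribute call reacts immediately.
const scripted = new SketchInput();
scripted.setAttribute("value", "foo\rbar");
scripted.setAttribute("type", "hidden");
```

In this sketch both parser-created elements keep the carriage return regardless of attribute order, while the script-created element loses it because the value was sanitized while the type was still text.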


 This is not acceptable to us in Gecko. We're not willing to have two 
 code paths for setting attributes.

You already _have_ two code paths. The example you gave shows that the 
parser is order agnostic but the equivalent DOM code is not.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] input element's value should not be sanitized during parsing

2011-03-30 Thread Mounir Lamouri

On 03/12/2011 12:56 AM, Jonas Sicking wrote:
 inp = document.createElement("input");
 inp.setAttribute("value", "foo\nbar");
 inp.setAttribute("type", "hidden");

 and

 inp = document.createElement("input");
 inp.setAttribute("type", "hidden");
 inp.setAttribute("value", "foo\nbar");

 This does not seem desirable.

 I can't argue that it's desirable, but it's how the Web works, as I
 understand it.
 
 Gecko doesn't exhibit this behavior and I don't know of any sites that
 don't work in Gecko because of this.

FWIW, it does. The first inp.value is 'foobar' while the second is 'foo
bar'.
See: http://software.hixie.ch/utilities/js/live-dom-viewer/saved/900

Though, I do not think this is related to the initial issue which is
about setting attributes while creating the element from the parser.

--
Mounir

Re: [whatwg] input element's value should not be sanitized during parsing

2011-03-11 Thread Jonas Sicking

(Sorry to bring back an old thread. Trying to catch up on old to-do's
now that FF4 is almost out the door)

On Tue, Dec 28, 2010 at 11:46 PM, Ian Hickson i...@hixie.ch wrote:
On Mon, 20 Sep 2010, Mounir Lamouri wrote:

With the current specification, these two elements will not have the
same value:
<input value="foo&#13;bar" type='hidden'>
<input type='hidden' value="foo&#13;bar">

Yes they will. The attribute order has no effect. Elements are created
by the parser with their attributes already set:

# When the steps below require the UA to create an element for a token in
# a particular namespace, the UA must create a node implementing the interface
# appropriate for the element type corresponding to the tag name of the
# token in the given namespace (as given in the specification that defines
# that element, e.g. for an a element in the HTML namespace, this
# specification defines it to be the HTMLAnchorElement interface), with
# the tag name being the name of that element, with the node being in the
# given namespace, and with the attributes on the node being those given
# in the given token.
--
http://www.whatwg.org/specs/web-apps/current-work/complete.html#create-an-element-for-the-token

Except that I don't think this is how any implementation actually
works. Nor do I have any desire to write the implementation this way
since it means duplicating a lot of code. I'd have to add code which
implemented attribute behavior both in some special code path
triggered during element creation, as well as code to react to
attribute changes triggered by attribute changes in
setAttribute/removeAttribute.

So far this hasn't been needed and the parsing code basically just
calls setAttribute. Unless there are really good reasons to change
this I'd like to avoid it. So far I haven't heard of any such reasons.

On Tue, 21 Sep 2010, Boris Zbarsky wrote:

Where does it say that it's atomic? I don't see that anywhere (and in
fact, the create an element code in the Gecko parser is most decidedly
non-atomic). Now maybe the spec intends this to be an atomic operation;
if so it needs to say that.

The operation it describes is a single operation: create a node. It
describes various constraints on that operation, one of which is that the
node have the various tokenised attributes set. I don't understand how
creating a node could be anything other than atomic -- either it exists or
it does not.

You're expecting several operations to happen at the same time. We
could certainly manually insert the attributes and their value into
the datastructure inside the element which stores the attribute
name/value pairs. However at some point we need to update all of the
state that these values drive. Things like sticking elements into
id-hashes, storing the calculated type of an input, calculating the
effective URI of an image, etc. This involves several separate pieces
of state and so can't happen all at the same time.

On Tue, 21 Sep 2010, Boris Zbarsky wrote:

That doesn't work if your parser and DOM aren't very very _very_ tightly
coupled, since there are no DOM APIs to atomically set a bunch of
attributes.

The HTML spec in general assumes that the implementation of the parser is
the implementation of the DOM and that you wouldn't use the DOM Core API
to implement the DOM or the parser.

I wouldn't build a parser on the raw DOM API either. But mostly for
performance reasons since we have to do a lot more checks on data that
comes from untrusted script (things like prevent ancestor cycles etc).
But I'd also strongly want to share most of the code path between the
API that the DOM uses and that the parser uses. Not doing that is
going to lead to a lot more bloat and a lot more bugs.

On Tue, 21 Sep 2010, Jonas Sicking wrote:

Also, it would mean that the following two pieces of code behave
differently:

inp = document.createElement("input");
inp.setAttribute("value", "foo\nbar");
inp.setAttribute("type", "hidden");

and

inp = document.createElement("input");
inp.setAttribute("type", "hidden");
inp.setAttribute("value", "foo\nbar");

This does not seem desirable.

I can't argue that it's desirable, but it's how the Web works, as I
understand it.

Gecko doesn't exhibit this behavior and I don't know of any sites that
don't work in Gecko because of this.

/ Jonas

Re: [whatwg] input element's value should not be sanitized during parsing

2010-12-28 Thread Ian Hickson

On Mon, 20 Sep 2010, Mounir Lamouri wrote:

With the current specification, these two elements will not have the
same value:
<input value="foo&#13;bar" type='hidden'>
<input type='hidden' value="foo&#13;bar">

Yes they will. The attribute order has no effect. Elements are created
by the parser with their attributes already set:

# When the steps below require the UA to create an element for a token in
# a particular namespace, the UA must create a node implementing the interface
# appropriate for the element type corresponding to the tag name of the
# token in the given namespace (as given in the specification that defines
# that element, e.g. for an a element in the HTML namespace, this
# specification defines it to be the HTMLAnchorElement interface), with
# the tag name being the name of that element, with the node being in the
# given namespace, and with the attributes on the node being those given
# in the given token.
--
http://www.whatwg.org/specs/web-apps/current-work/complete.html#create-an-element-for-the-token

Depending on how the attributes are read, value will be set before or
after type, thus, changing the value sanitization algorithm.

No, the value sanitization algorithm is invoked separately after the
element is first created:

# When an input element is first created, the element's rendering and
# behavior must be set to the rendering and behavior defined for the type
# attribute's state, and the value sanitization algorithm, if one is
# defined for the type attribute's state, must be invoked.
--
http://www.whatwg.org/specs/web-apps/current-work/complete.html#the-input-element

The following change would fix that bug:
- The specification should add that the value sanitization algorithm
should not be used during parsing/as long as the element hasn't been
created.

I don't understand how it could be run before the element has been
created. It runs on the element! :-)

OR
- The specification should add in the set value content attribute
paragraph that the value sanitization algorithm should not be run during
parsing/if the element hasn't been created.

The set value content attribute paragraph doesn't apply until after the
element has been created, with the attribute already set.

The specifications already require that the value sanitization algorithm
be run when the element is first created.
So, with this change, the element's value will be unsanitized during
parsing and, as soon as parsing is done, the element's value
will be sanitized.

I don't really understand what that means.

By the way, "first created" could probably be changed to a concept from
the specifications. We can guess what that means but there is no strong
notion behind these words AFAIK.

At some point the element is created. How is this ambiguous?

On Tue, 21 Sep 2010, James Graham wrote:

The concept of Creating an Element already exists [1] and is atomic,
that is the element is created with all its attributes in a single
operation. Therefore it is not clear to me how attribute order can make
a difference per spec. Am I missing your point?

[1]
http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#creating-and-inserting-elements

Indeed.

On Tue, 21 Sep 2010, Boris Zbarsky wrote:

The operation it describes is a single operation: create a node. It
describes various constraints on that operation, one of which is that the
node have the various tokenised attributes set. I don't understand how
creating a node could be anything other than atomic -- either it exists or
it does not.

On Tue, 21 Sep 2010, Boris Zbarsky wrote:

That doesn't work if your parser and DOM aren't very very _very_ tightly
coupled, since there are no DOM APIs to atomically set a bunch of
attributes.

The HTML spec in general assumes that the implementation of the parser is
the implementation of the DOM and that you wouldn't use the DOM Core API
to implement the DOM or the parser.

So yes, if the spec implies that this is what's supposed to happen here
then it needs to be _very_ explicit about that.

It's not clear to me how I can be more explicit. Could you elaborate on
what you would like it to say?

On Tue, 21 Sep 2010, Jonas Sicking wrote:

Also, it would mean that the following two pieces of code behave differently:

inp = document.createElement("input");
inp.setAttribute("value", "foo\nbar");
inp.setAttribute("type", "hidden");

and

inp = document.createElement("input");
inp.setAttribute("type", "hidden");
inp.setAttribute("value", "foo\nbar");

This does not seem desirable.

I can't argue that it's desirable, but it's how the Web works, as I
understand it.

--
Ian Hickson

[whatwg] input element's value should not be sanitized during parsing

2010-09-21 Thread Mounir Lamouri

Hi,

For a few days, Firefox's nightly had a bug related to value sanitizing
which happens to be a specification bug.
With the current specification, these two elements will not have the
same value:
<input value="foo&#13;bar" type='hidden'>
<input type='hidden' value="foo&#13;bar">
Depending on how the attributes are read, value will be set before or
after type, thus, changing the value sanitization algorithm. So, the
value sanitization algorithm of input type='text' will be used for one
of these elements and the value will be foobar.
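
To make the order dependence concrete, here is a minimal toy model. It is not real DOM code: `ToyInput`, `SANITIZERS`, and `build` are assumptions made purely for illustration, and only the text and hidden types are modelled. Every attribute change immediately re-runs the sanitization defined for whichever type is current at that moment, which is the behaviour that makes the result depend on attribute order.

```javascript
// Toy model only: each attribute change re-runs the sanitizer for the
// type that is current at that moment.
const SANITIZERS = new Map([
  ["text", (v) => v.replace(/[\r\n]/g, "")], // text: strip line breaks
  ["hidden", (v) => v],                      // hidden: value left untouched
]);

class ToyInput {
  constructor() {
    this.type = "text"; // default type when no type attribute is present
    this.value = "";
  }

  setAttribute(name, val) {
    if (name === "type") this.type = SANITIZERS.has(val) ? val : "text";
    if (name === "value") this.value = val;
    this.value = SANITIZERS.get(this.type)(this.value);
  }
}

// Apply attributes one at a time, in the given order.
function build(order) {
  const inp = new ToyInput();
  for (const [name, val] of order) inp.setAttribute(name, val);
  return inp.value;
}

const valueFirst = build([["value", "foo\nbar"], ["type", "hidden"]]);
const typeFirst = build([["type", "hidden"], ["value", "foo\nbar"]]);

console.log(JSON.stringify(valueFirst)); // "foobar"
console.log(JSON.stringify(typeFirst));  // "foo\nbar"
```

When the value is set first it is sanitized as a text input and the line break is lost for good; when the type is set first the hidden state applies and the line break survives.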

The following change would fix that bug:
- The specification should add that the value sanitization algorithm
should not be used during parsing/as long as the element hasn't been
created.
OR
- The specification should add in the set value content attribute
paragraph that the value sanitization algorithm should not be run during
parsing/if the element hasn't been created.

From a specification point of view, both changes would have the same result.

The specifications already require that the value sanitization algorithm
be run when the element is first created.
So, with this change, the element's value will be unsanitized during
parsing and, as soon as parsing is done, the element's value
will be sanitized.

By the way, "first created" could probably be changed to a concept from
the specifications. We can guess what that means but there is no strong
notion behind these words AFAIK.

Thanks,
--
Mounir

Re: [whatwg] input element's value should not be sanitized during parsing

2010-09-21 Thread James Graham

On Mon, 20 Sep 2010, Mounir Lamouri wrote:

Hi,

For a few days, Firefox's nightly had a bug related to value sanitizing
which happens to be a specification bug.
With the current specification, these two elements will not have the
same value:
<input value="foo&#13;bar" type='hidden'>
<input type='hidden' value="foo&#13;bar">
Depending on how the attributes are read, value will be set before or
after type, thus, changing the value sanitization algorithm. So, the
value sanitization algorithm of input type='text' will be used for one
of these elements and the value will be foobar.

The following change would fix that bug:
- The specification should add that the value sanitization algorithm
should not be used during parsing/as long as the element hasn't been
created.
OR
- The specification should add in the set value content attribute
paragraph that the value sanitization algorithm should not be run during
parsing/if the element hasn't been created.

From a specification point of view, both changes would have the same result.

The concept of Creating an Element already exists [1] and is atomic,
that is the element is created with all its attributes in a single
operation. Therefore it is not clear to me how attribute order can make
a difference per spec. Am I missing your point?

[1]
http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#creating-and-inserting-elements

Re: [whatwg] input element's value should not be sanitized during parsing

2010-09-21 Thread Boris Zbarsky


On 9/21/10 4:06 AM, James Graham wrote:

The concept of Creating an Element already exists [1] and is atomic,


Where does it say that it's atomic?  I don't see that anywhere (and in 
fact, the create an element code in the Gecko parser is most decidedly 
non-atomic).  Now maybe the spec intends this to be an atomic operation; 
if so it needs to say that.


-Boris

Re: [whatwg] input element's value should not be sanitized during parsing

2010-09-21 Thread James Graham


On 09/21/2010 10:12 AM, Boris Zbarsky wrote:

On 9/21/10 4:06 AM, James Graham wrote:

The concept of Creating an Element already exists [1] and is atomic,


Where does it say that it's atomic? I don't see that anywhere (and in
fact, the create an element code in the Gecko parser is most decidedly
non-atomic). Now maybe the spec intends this to be an atomic operation;
if so it needs to say that.


It is described as a single step in the spec, which I take to imply that 
it should behave as a single operation from the point of view of the 
rest of the spec. Of course I am not against this being made clearer.

Re: [whatwg] input element's value should not be sanitized during parsing

2010-09-21 Thread Boris Zbarsky


On 9/21/10 5:09 AM, James Graham wrote:

It is described as a single step in the spec, which I take to imply that
it should behave as a single operation from the point of view of the
rest of the spec.


That doesn't work if your parser and DOM aren't very very _very_ tightly 
coupled, since there are no DOM APIs to atomically set a bunch of 
attributes.


So yes, if the spec implies that this is what's supposed to happen here 
then it needs to be _very_ explicit about that.


-Boris

Re: [whatwg] input element's value should not be sanitized during parsing

2010-09-21 Thread Jonas Sicking

On Tue, Sep 21, 2010 at 9:13 AM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 9/21/10 5:09 AM, James Graham wrote:

 It is described as a single step in the spec, which I take to imply that
 it should behave as a single operation from the point of view of the
 rest of the spec.

 That doesn't work if your parser and DOM aren't very very _very_ tightly
 coupled, since there are no DOM APIs to atomically set a bunch of
 attributes.

 So yes, if the spec implies that this is what's supposed to happen here then
 it needs to be _very_ explicit about that.

Also, it would mean that the following two pieces of code behave differently:

inp = document.createElement("input");
inp.setAttribute("value", "foo\nbar");
inp.setAttribute("type", "hidden");

and

inp = document.createElement("input");
inp.setAttribute("type", "hidden");
inp.setAttribute("value", "foo\nbar");

This does not seem desirable.

/ Jonas

Re: [whatwg] input element's value should not be sanitized during parsing

2010-09-21 Thread Mounir Lamouri

On 09/21/2010 10:18 PM, Jonas Sicking wrote:
 On Tue, Sep 21, 2010 at 9:13 AM, Boris Zbarsky bzbar...@mit.edu wrote:
 Also, it would mean that the following two pieces of code behave differently:
 
 inp = document.createElement("input");
 inp.setAttribute("value", "foo\nbar");
 inp.setAttribute("type", "hidden");
 
 and
 
 inp = document.createElement("input");
 inp.setAttribute("type", "hidden");
 inp.setAttribute("value", "foo\nbar");

 This does not seem desirable.

They do. And I don't see how this can be different.

--
Mounir
