Re: [whatwg] Proposal for window.DocumentType.prototype.toString

2012-11-01 Thread Johan Sundström
On Wed, Oct 31, 2012 at 7:33 PM, Ian Hickson i...@hixie.ch wrote:

 On Wed, 31 Oct 2012, Johan Sundström wrote:
  On Wednesday, October 31, 2012 at 15:02 , Ian Hickson wrote:
   On Tue, 30 Oct 2012, Johan Sundström wrote:
    That said, I would still much enjoy a future where
    javascript:alert(document.doctype) would tell you something rich about
    the page that we today need deep knowledge of document.compatMode and/or
    combinations of XMLSerializer and parsers, or deep study of DocumentType
    refdocs to tease out.
  
   Can you elaborate on that?
 
  Sure – rich as in not [object DocumentType], but

 Well the toString() isn't what matters, it's what you can get from the
 rest of the attributes on the object. Or are you just saying you wish
 .toString() would expose the concatenated string? That would just be a
 convenience method. Is it worth the compat risk?


Yes, this is where our opinions differ. :-) To me, it is the (lack of)
language integration that is the heart of the matter and the source of my
itch – not that what I attempted to do proved impossible to cobble together
with a (perfectly functional!) solution using other documented DOM APIs
scattered about in other disjunct parts of the browser object model, or
pasting together object properties and programmer provided constant strings
to almost reproduce the value sought. My own hack unintentionally got it
wrong in several ways, for example, and I deem that unnecessary brittleness.

From my own experience, the only guaranteed safe, reliable and cross-browser
method for figuring out an object's class name is
Object.prototype.toString.call(object_of_interest), so I would sacrifice
consistency with the DocumentType.prototype.toString behaviour of the past
in an instant for a more useful and intuitive one. If a novice programmer's
expectations of what happens when she uses the object in a string context
are met, I'd call that an improvement here.
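
For illustration, a minimal sketch of that class-name idiom, and of why
overriding toString would not disturb it. The helper name classOf and the
FakeDoctype stand-in are hypothetical, chosen only so the snippet also runs
outside a browser:

```javascript
// Hypothetical helper: extract the class tag from "[object Tag]".
function classOf(value) {
  return Object.prototype.toString.call(value).slice(8, -1);
}

// Hypothetical stand-in for a DocumentType with a friendlier toString.
function FakeDoctype() {}
FakeDoctype.prototype.toString = function () {
  return "<!DOCTYPE html>";
};

var fake = new FakeDoctype();
console.log(String(fake));   // string context uses the overriding toString
console.log(classOf(fake));  // class lookup bypasses the override
console.log(classOf([]));    // built-ins report their own tag
```

In a browser, classOf(document.doctype) should keep reporting DocumentType
whatever DocumentType.prototype.toString returns, since the call goes
straight to Object.prototype.toString.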

  …on apple.com: <!DOCTYPE html>

  …on roxen.com: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01
  Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

 I don't understand how that is different than document.compatMode,
 really, other than the latter not exposing limited quirks mode. But in
 theory at least, this information is already exposed.


It tells me what the doctype is, instead of the name of a bucket the
browser sorts the doctype into for various semantic and standards
compliance (and/or political) reasons.

Both features are excellent when they are the feature you seek, and they
already bear decent names that help with findability and learnability.
I am essentially weary of the long knowledge gap and edit distance
between alert(document.doctype) and alert((new
XMLSerializer).serializeToString(document.doctype)). We have already proved
that those of us in this group can bridge that gap; I aspire to help the
other 99%.

  …on the Firefox default homepage: <!DOCTYPE html [
    <!ENTITY % htmlDTD
      PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      [...]

 This is for XML, right? In HTML the bit in the square brackets would just
 be dropped. It's not clear that it's worth exposing just for XML...

 Anyway, this is the DOM Core spec, so I'll let Anne, Aryeh, and Ms2ger
 give you a proper answer. :-)


It probably is, and it's also where the change would be useful; were SVG
and other DOMs exempt from returning a string serialization, it would be a
substantially less useful change.

-- 
 / Johan Sundström, http://ecmanaut.blogspot.com/


Re: [whatwg] Proposal for window.DocumentType.prototype.toString

2012-10-31 Thread Johan Sundström
On Tue, Oct 30, 2012 at 3:20 AM, Stewart Brodie
stewart.bro...@antplc.com wrote:
 Hi everybody!

 Serializing a complete HTML document DOM to a string is surprisingly
 hard in javascript.

 Does XMLSerializer().serializeToString(document) not meet your requirement?

Ah – good thinking. (new XMLSerializer).serializeToString(document)
does indeed do a pretty excellent job of it, including the crazy hacks
people do with conditional comments outside of the root node, which I
hadn't figured I would be able to piece back together from an already
parsed page.

While I hate to admit it, maybe on some level there is a benefit to so
much of the DOM API being hostile to javascript: it forces you towards
the occasional really well-paved paths like the above, when you can find
them.

My use case was taking as good a snapshot as possible of a live web
page's structure from a non-privileged bookmarklet, for archival
purposes (i.e. essentially what a curl of the page would do). For my
purposes, it is a bonus that I actually get the current state of the
page, with whatever DOM modifications have transpired since it loaded,
rather than what curl would produce, so I think XMLSerializer is a good
friend.
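
A minimal sketch of that snapshot step, for the record. The helper name
snapshotDocument is made up, and the optional Serializer parameter exists
only so the function can be exercised outside a browser; in a page it
falls back to the built-in XMLSerializer:

```javascript
// Hypothetical helper for a bookmarklet-style snapshot of the live DOM.
// `doc` is the document to serialize; `Serializer` defaults to the
// browser's XMLSerializer when omitted.
function snapshotDocument(doc, Serializer) {
  var Ctor = Serializer || XMLSerializer;
  return new Ctor().serializeToString(doc);
}

// In a page or bookmarklet (assumes a browser environment):
//   var markup = snapshotDocument(document);
```

The result reflects the DOM as it is at call time, not the bytes the
server originally sent.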

That said, I would still much enjoy a future where
javascript:alert(document.doctype) would tell you something rich about
the page that we today need deep knowledge of document.compatMode
and/or combinations of XMLSerializer and parsers, or deep study of
DocumentType refdocs to tease out.

Is there a case against it in people using it where they ought to pick
other solutions?

-- 
 / Johan Sundström, http://ecmanaut.blogspot.com/

 --
 Stewart Brodie
 Team Leader - ANT Galio Browser
 ANT Software Limited


Re: [whatwg] Proposal for window.DocumentType.prototype.toString

2012-10-31 Thread Ian Hickson
On Tue, 30 Oct 2012, Johan Sundström wrote:
 
 That said, I would still much enjoy a future where 
 javascript:alert(document.doctype) would tell you something rich about 
 the page that we today need deep knowledge of document.compatMode and/or 
 combinations of XMLSerializer and parsers, or deep study of DocumentType 
 refdocs to tease out.

Can you elaborate on that?

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] Proposal for window.DocumentType.prototype.toString

2012-10-31 Thread Johan Sundström
On Wednesday, October 31, 2012 at 15:02 , Ian Hickson wrote:
 On Tue, 30 Oct 2012, Johan Sundström wrote:
  That said, I would still much enjoy a future where
  javascript:alert(document.doctype) would tell you something rich about  
  the page that we today need deep knowledge of document.compatMode and/or  
  combinations of XMLSerializer and parsers, or deep study of DocumentType  
  refdocs to tease out.
  
 Can you elaborate on that?

Sure – rich as in not [object DocumentType], but

…on apple.com: <!DOCTYPE html>

…on roxen.com: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

…on the Firefox default homepage: <!DOCTYPE html [
  <!ENTITY % htmlDTD
    PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "DTD/xhtml1-strict.dtd">
  %htmlDTD;
  <!ENTITY % globalDTD SYSTEM "chrome://global/locale/global.dtd">
  %globalDTD;
  <!ENTITY % aboutHomeDTD SYSTEM "chrome://browser/locale/aboutHome.dtd">
  %aboutHomeDTD;
<!ENTITY % syncBrandDTD SYSTEM "chrome://browser/locale/syncBrand.dtd">
%syncBrandDTD;

<!-- These strings are used in the about:home page -->

<!ENTITY abouthome.pageTitle "&brandFullName; Start Page">

<!ENTITY abouthome.searchEngineButton.label "Search">

<!-- LOCALIZATION NOTE (abouthome.defaultSnippet1.v1):
 text in <a/> will be linked to the Firefox features page on mozilla.com
-->
<!ENTITY abouthome.defaultSnippet1.v1 "Thanks for choosing Firefox! To get the
most out of your browser, learn more about the <a>latest features</a>.">
<!-- LOCALIZATION NOTE (abouthome.defaultSnippet2.v1):
 text in <a/> will be linked to the featured add-ons on addons.mozilla.org
-->
<!ENTITY abouthome.defaultSnippet2.v1 "It's easy to customize your Firefox
exactly the way you want it. <a>Choose from thousands of add-ons</a>.">

<!ENTITY abouthome.bookmarksButton.label "Bookmarks">
<!ENTITY abouthome.historyButton.label   "History">
<!ENTITY abouthome.settingsButton.label  "Settings">
<!ENTITY abouthome.addonsButton.label    "Add-ons">
<!ENTITY abouthome.appsButton.label      "Marketplace">
<!ENTITY abouthome.downloadsButton.label "Downloads">
<!ENTITY abouthome.syncButton.label      "&syncBrand.shortName.label;">

  <!ENTITY % browserDTD SYSTEM "chrome://browser/locale/browser.dtd">
  %browserDTD;
]>

--  
 / Johan Sundström, http://ecmanaut.blogspot.com/



Re: [whatwg] Proposal for window.DocumentType.prototype.toString

2012-10-31 Thread Ian Hickson
On Wed, 31 Oct 2012, Johan Sundström wrote:
 On Wednesday, October 31, 2012 at 15:02 , Ian Hickson wrote:
  On Tue, 30 Oct 2012, Johan Sundström wrote:
   That said, I would still much enjoy a future where
   javascript:alert(document.doctype) would tell you something rich about  
   the page that we today need deep knowledge of document.compatMode and/or  
   combinations of XMLSerializer and parsers, or deep study of DocumentType  
   refdocs to tease out.
   
  Can you elaborate on that?
 
 Sure – rich as in not [object DocumentType], but

Well the toString() isn't what matters, it's what you can get from the 
rest of the attributes on the object. Or are you just saying you wish 
.toString() would expose the concatenated string? That would just be a 
convenience method. Is it worth the compat risk?


 …on apple.com: <!DOCTYPE html>
 
 …on roxen.com: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
 "http://www.w3.org/TR/html4/loose.dtd">

I don't understand how that is different than document.compatMode, 
really, other than the latter not exposing limited quirks mode. But in 
theory at least, this information is already exposed.


 …on the Firefox default homepage: <!DOCTYPE html [
   <!ENTITY % htmlDTD
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     [...]

This is for XML, right? In HTML the bit in the square brackets would just 
be dropped. It's not clear that it's worth exposing just for XML...

Anyway, this is the DOM Core spec, so I'll let Anne, Aryeh, and Ms2ger 
give you a proper answer. :-)

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] Proposal for window.DocumentType.prototype.toString

2012-10-30 Thread Stewart Brodie
Johan Sundström oyas...@gmail.com wrote:

 Hi everybody!
 
 Serializing a complete HTML document DOM to a string is surprisingly
 hard in javascript.

Does XMLSerializer().serializeToString(document) not meet your requirement?


-- 
Stewart Brodie
Team Leader - ANT Galio Browser
ANT Software Limited


Re: [whatwg] Proposal for window.DocumentType.prototype.toString

2012-10-30 Thread Tim Streater
On 30 Oct 2012 at 10:20, Stewart Brodie stewart.bro...@antplc.com wrote: 

 Johan Sundström oyas...@gmail.com wrote:

 Serializing a complete HTML document DOM to a string is surprisingly
 hard in javascript.

 Does XMLSerializer().serializeToString(document) not meet your requirement?

I was wondering that too. I use it to get the content of an iframe into a 
string, so I can send that off to a database. Seems to work without problems 
(Safari Mac 6.0.1). But I too had to ask how to do that; it wasn't particularly 
obvious that that was what I should have been using (to me at any rate).

--
Cheers  --  Tim


[whatwg] Proposal for window.DocumentType.prototype.toString

2012-10-29 Thread Johan Sundström
Hi everybody!

Serializing a complete HTML document DOM to a string is surprisingly
hard in javascript. As a fairly seasoned javascript hacker I figured
this might do it:

  document.doctype + document.documentElement.outerHTML

It doesn't. No browser has a useful window.DocumentType.prototype.toString
that returns either the original document's <!DOCTYPE ...> as written
before parsing, or a semantically equivalent post-parsing one. Google
Chrome shows one in its devtools, but does not seem to expose a way for
programmers to get at it.

My proposal is we specify this more useful behaviour for
javascript-running browsers, so it does become as simple as above. A
rough sketch of how a polyfill might implement the latter
window.DocumentType.prototype.toString:

  https://gist.github.com/3977584

Even as a polyfill, the above is rather limited, though: I believe
only Firefox implements internalSubset today, and probably only in
XML contexts. The most useful implementation would IMO be a native one
that reproduces the doctype as it was formatted in the source
document.
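
To make the idea concrete, here is a minimal sketch of such a polyfill.
It is not the gist linked above, just an illustration under assumptions:
it builds the string from name, publicId, systemId and (where a browser
provides it) internalSubset, and does no escaping of quotes inside the
identifiers. The function name doctypeToString is made up for this
example.

```javascript
// Hypothetical toString for doctype nodes; `this` is expected to expose
// name, publicId, systemId and optionally internalSubset.
function doctypeToString() {
  var s = "<!DOCTYPE " + this.name;
  if (this.publicId) {
    s += ' PUBLIC "' + this.publicId + '"';
  } else if (this.systemId) {
    s += " SYSTEM";
  }
  if (this.systemId) {
    s += ' "' + this.systemId + '"';
  }
  if (this.internalSubset) {
    s += " [" + this.internalSubset + "]";
  }
  return s + ">";
}

// Install only where a DocumentType interface object exists (browsers).
if (typeof DocumentType !== "undefined") {
  DocumentType.prototype.toString = doctypeToString;
}
```

With that installed, document.doctype + document.documentElement.outerHTML
would at least produce a semantically equivalent doctype, though not the
original formatting.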

Thoughts?

-- 
 / Johan Sundström, http://ecmanaut.blogspot.com/


Re: [whatwg] Proposal for window.DocumentType.prototype.toString

2012-10-29 Thread Boris Zbarsky

On 10/29/12 8:58 PM, Johan Sundström wrote:

Serializing a complete HTML document DOM to a string is surprisingly
hard in javascript.


I thought there were plans to put innerHTML on Document.  Did that go 
nowhere?



As a fairly seasoned javascript hacker I figured
this might do it:

   document.doctype + document.documentElement.outerHTML


This seems lossy in many cases (most obviously: when the HTML uses 
conditional comments, though there are also various XHTML-specific issues).



The most useful implementation would IMO be a native one
that reproduces the doctype as it was formatted in the source
document.


That might be worth doing independent of the serialization issue.

-Boris


Re: [whatwg] Proposal for window.DocumentType.prototype.toString

2012-10-29 Thread Ojan Vafai
On Mon, Oct 29, 2012 at 6:17 PM, Boris Zbarsky bzbar...@mit.edu wrote:

 On 10/29/12 8:58 PM, Johan Sundström wrote:

 Serializing a complete HTML document DOM to a string is surprisingly
 hard in javascript.


 I thought there were plans to put innerHTML on Document.  Did that go
 nowhere?


There were plans to put it on DocumentFragment. But IIRC no other browser
vendors voiced an interest and Hixie was opposed because he thought it
would encourage people to do more string-based DOM building. The WebKit
patch for this floundered as a result. I still think it's a good idea.


Re: [whatwg] Proposal for window.DocumentType.prototype.toString

2012-10-29 Thread Ian Hickson
On Mon, 29 Oct 2012, Johan Sundström wrote:
 
 Serializing a complete HTML document DOM to a string is surprisingly 
 hard in javascript. As a fairly seasoned javascript hacker I figured 
 this might do it:
 
   document.doctype + document.documentElement.outerHTML

 It doesn't. No browser has a useful window.DocumentType.prototype that 
 returns either the original document's <!DOCTYPE ...> before parsing – 
 or a semantically equivalent post-parsing one.

If you know the document is always going to be in the no-quirks mode, then 
you can just stick <!DOCTYPE HTML> at the start. If you need to be able 
to tell what the mode is but are ok with ignoring the limited quirks 
mode, then you can use document.compatMode to pick whether to use that 
string or none, as in:

   (document.compatMode == 'CSS1Compat' ? '<!DOCTYPE HTML>' : '') +
   document.documentElement.outerHTML

That will drop any comment nodes around the root element, in case that 
matters. If you want to get the actual DOCTYPE strings, you can make a 
simple serialisation function for doctype nodes that uses the three 
attributes on that object to string together the full thing (much as you 
do in the polyfill you mentioned).
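
As a rough sketch of that approach (an illustration only, with made-up 
function names serializeDoctype and serializeDocument): walk the 
document's top-level children, so that comments around the root element 
are kept, and build the doctype from name, publicId and systemId. It 
assumes the identifiers contain no double quotes and ignores other node 
types such as processing instructions.

```javascript
// Hypothetical helper: build a doctype string from the three attributes.
function serializeDoctype(doctype) {
  var s = "<!DOCTYPE " + doctype.name;
  if (doctype.publicId) {
    s += ' PUBLIC "' + doctype.publicId + '"';
  } else if (doctype.systemId) {
    s += " SYSTEM";
  }
  if (doctype.systemId) {
    s += ' "' + doctype.systemId + '"';
  }
  return s + ">";
}

// Hypothetical helper: serialize every top-level node of the document.
function serializeDocument(doc) {
  var parts = [];
  for (var i = 0; i < doc.childNodes.length; i++) {
    var node = doc.childNodes[i];
    if (node.nodeType === 10) {         // DOCUMENT_TYPE_NODE
      parts.push(serializeDoctype(node));
    } else if (node.nodeType === 8) {   // COMMENT_NODE
      parts.push("<!--" + node.data + "-->");
    } else if (node.nodeType === 1) {   // ELEMENT_NODE
      parts.push(node.outerHTML);
    }
  }
  return parts.join("\n");
}
```

In a page this would be called as serializeDocument(document).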


 I believe only Firefox implements internalSubset today

Since the internal subset has no meaning in text/html, that's ok if your 
goal is just to be semantically equivalent.


 The most useful implementation would IMO be a native one that 
 reproduces the doctype as it was formatted in the source document.

What's your use case, exactly?


On Mon, 29 Oct 2012, Boris Zbarsky wrote:
 
 I thought there were plans to put innerHTML on Document.  Did that go 
 nowhere?

Lack of implementor interest killed it a while ago.


On Mon, 29 Oct 2012, Ojan Vafai wrote:
 On Mon, Oct 29, 2012 at 6:17 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 
  I thought there were plans to put innerHTML on Document.  Did that go 
  nowhere?
 
 There were plans to put it on DocumentFragment.

That was a different plan, but yes, there have also been proposals to do 
that. This was in the context of templates; a better solution to which has 
since been worked on in public-webapps.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'