Re: [whatwg] Script preloading
Hi Ryosuke, Based on the feedback here, it doesn't sound like you are a huge fan of the original proposal in this thread. At this point, has any implementation come out in support of the proposal in this thread as a preferred solution over noexecute/execute()? The strongest support I've seen in this thread, though I very well could have missed some, is "it's better than the status quo". Is that the case? / Jonas On Wed, Aug 28, 2013 at 7:43 PM, Ryosuke Niwa wrote: > On Jul 13, 2013, at 5:55 AM, Andy Davies wrote: > >> On 12 July 2013 01:25, Bruno Racineux wrote: >> >>> On browser preloading: >>> >>> There seems to be an inherent conflict between 'indiscriminate' Pre-parsers/ >>> PreloadScanners and "responsive design" for mobile. Responsive design >>> mostly implies that everything needed for a full-screen desktop is >>> provided in markup to all devices. >>> >>> >> The pre-loader is a tradeoff: it aims to increase network utilisation >> by speculatively downloading resources it can discover. >> >> Some of the resources downloaded may not be used, but with good design >> and mobile-first approaches hopefully this number can be minimised. >> >> Even if some unused resources get downloaded, how much does it matter? > > It matters a lot when you only have a GSM wireless connection and are barely > loading anything at all. > >> By starting the downloads earlier, connections will be opened sooner, allowing >> the TCP congestion window to grow sooner. Of course this has to be balanced >> against visitors who might be paying to download those unused bytes, and >> whether the unused resources are blocking something on the critical path >> from being downloaded (I believe some preloaders can re-prioritise resources >> if they need them before the preloader has downloaded them). > > Exactly. I'd like to make sure whatever API we come up with gives the UAs enough > flexibility to decide whether a given resource needs to be loaded immediately. 
> > > > On Jul 12, 2013, at 11:56 AM, Kyle Simpson wrote: > >> My scope (as it always has been), put simply: I want (for all the reasons >> here and before) to have a "silver bullet" in script loading, which lets me >> load any number of scripts in parallel, and to the extent that is >> reasonable, be fully in control of what order they run in, if at all, >> responding to conditions AS THE SCRIPTS EXECUTE, not merely as they might >> have existed at the time of initial request. I want such a facility because >> I want to continue to have LABjs be a best-in-class fully-capable script >> loader that sets the standard for best-practice on-demand script loading. > > > Because of the different network conditions and constraints various devices > have, I'm wary of any solution that gives full control over when each > script is loaded. While I'm sure large corporations with lots of resources > will get this right, I don't want to provide a preloading API that's hard to > use for ordinary Web developers. > > > On Jul 15, 2013, at 7:55 AM, Kornel Lesiński wrote: > >> There's a very high overlap between module dependencies and
Re: [whatwg] Script preloading
On Jul 13, 2013, at 5:55 AM, Andy Davies wrote: > On 12 July 2013 01:25, Bruno Racineux wrote: > >> On browser preloading: >> >> There seems to be an inherent conflict between 'indiscriminate' Pre-parsers/ >> PreloadScanners and "responsive design" for mobile. Responsive design >> mostly implies that everything needed for a full-screen desktop is >> provided in markup to all devices. >> >> > The pre-loader is a tradeoff: it aims to increase network utilisation > by speculatively downloading resources it can discover. > > Some of the resources downloaded may not be used, but with good design > and mobile-first approaches hopefully this number can be minimised. > > Even if some unused resources get downloaded, how much does it matter? It matters a lot when you only have a GSM wireless connection and are barely loading anything at all. > By starting the downloads earlier, connections will be opened sooner, allowing > the TCP congestion window to grow sooner. Of course this has to be balanced > against visitors who might be paying to download those unused bytes, and > whether the unused resources are blocking something on the critical path > from being downloaded (I believe some preloaders can re-prioritise resources > if they need them before the preloader has downloaded them). Exactly. I'd like to make sure whatever API we come up with gives the UAs enough flexibility to decide whether a given resource needs to be loaded immediately. On Jul 12, 2013, at 11:56 AM, Kyle Simpson wrote: > My scope (as it always has been), put simply: I want (for all the reasons here > and before) to have a "silver bullet" in script loading, which lets me load > any number of scripts in parallel, and to the extent that is reasonable, be > fully in control of what order they run in, if at all, responding to > conditions AS THE SCRIPTS EXECUTE, not merely as they might have existed at > the time of initial request. 
I want such a facility because I want to > continue to have LABjs be a best-in-class fully-capable script loader that > sets the standard for best-practice on-demand script loading. Because of the different network conditions and constraints various devices have, I'm wary of any solution that gives full control over when each script is loaded. While I'm sure large corporations with lots of resources will get this right, I don't want to provide a preloading API that's hard to use for ordinary Web developers. On Jul 15, 2013, at 7:55 AM, Kornel Lesiński wrote: > There's a very high overlap between module dependencies and
Re: [whatwg] Zip archives as first-class citizens
Hey Anne, On 28/08/2013, at 11:32 PM, Anne van Kesteren wrote: > > * Fragments: fail to work well for URLs relative to a zip archive. > > Fragments are conceptually the cleanest as the only part of a URL > that's supposed to depend on the Content-Type is the fragment. > However, if you want to link to an ID inside an HTML resource you'd > have to do #path=test.html&id=test which would require adding > knowledge to the HTML resource that it is contained in a zip archive > and have special processing based on that. And not just HTML, same > goes for CSS or JavaScript. I'm sure you've thought about this more than I have, but can you humour me and dig in a bit here? If I wanted to link *within* the HTML, it could still be , correct? Likewise, in the CSS if I wanted to define style for that id, it'd still be #test { ... }. AIUI the case that's more of an issue is if I want to link from foo.html to bar.html#test, both inside the zip. It seems to me that you need *some* idea of the structure of the zip inside there -- just as you need some idea of the structure of the Web site when linking between HTTP resources. The question to me is whether you can make it compatible with existing syntax to make it go down easier. E.g. if this would work: Couldn't that be done by saying that for URIs inside a ZIP file, the base URI is effectively an authority-less scheme? E.g., for "foo.html" the base URI would be "zip://foo.html". The zip URI scheme wouldn't be used in practice, just for rooting relative URIs inside of ZIP files. From the outside, the fragment identifier syntax for the zip format would dispatch appropriately, e.g., http://example.com/stuff.zip#path=foo.html&id=test I *think* the end effect here would be that from the inside, HTML, CSS and JS wouldn't have to be changed to be zipped. From the outside, if you want to link *into* a zip file, you have to be aware of its structure, but that's really always going to be the case, isn't it? Just a thought. 
Cheers, -- Mark Nottingham http://www.mnot.net/
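Mark's reading of the fragment approach can be sketched concretely. The Python sketch below parses the #path=...&id=... media-fragment syntax from Anne's example into the archive URL, the inner file, and the inner fragment; the parameter names and the &-separated form are taken from the thread's examples, not from any settled specification, and the function name is mine.

```python
from urllib.parse import urlsplit

def parse_zip_fragment(url):
    """Split a media-fragment-style zip URL into (archive URL, inner path,
    inner fragment).  The #path=<file>&id=<frag> syntax is the form floated
    in this thread, not a standard."""
    parts = urlsplit(url)
    archive = parts._replace(fragment="").geturl()
    inner_path = inner_frag = None
    for piece in parts.fragment.split("&"):
        if piece.startswith("path="):
            inner_path = piece[len("path="):]
        elif piece.startswith("id="):
            inner_frag = piece[len("id="):]
    return archive, inner_path, inner_frag

print(parse_zip_fragment("http://example.com/stuff.zip#path=foo.html&id=test"))
# -> ('http://example.com/stuff.zip', 'foo.html', 'test')
```

Note that, as Mark observes, only links *into* the archive from outside need this dispatch; a plain #test inside foo.html would keep its ordinary meaning.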
Re: [whatwg] Elements should be removed from the past names map once it's no longer associated with the form element
Since Gecko has already implemented this behavior, I've gone ahead and changed WebKit's behavior: http://trac.webkit.org/changeset/154761 - R. Niwa On Aug 26, 2013, at 7:09 PM, Boris Zbarsky wrote: > On 8/26/13 9:51 PM, Ryosuke Niwa wrote: >> That's good to hear. So we're definitely in agreement with respect to this >> new behavior. > > I filed https://www.w3.org/Bugs/Public/show_bug.cgi?id=23073 > > -Boris
Re: [whatwg] [blink-dev] Re: Intent to Update TextTrackCue and Add VTTCue
On Fri, 23 Aug 2013, Glenn Adams wrote: > On Fri, Aug 23, 2013 at 4:16 PM, Ian Hickson wrote: > > On Fri, 23 Aug 2013, Glenn Adams wrote: > > > > > > As has been pointed out a number of times, there are already > > > implementations and JS client code using this technique. > > > > Where? > > I think I've pointed this out to you at least four times before, but > I'll do so again: > > http://www.cablelabs.com/specifications/CL-SP-HTML5-MAP-I02-120510.pdf > > See section 5.2 Closed Captioning. I see nothing in that section that is either an implementation or JS client code. On Sat, 24 Aug 2013, Glenn Adams wrote: > On Sat, Aug 24, 2013 at 9:48 AM, PhistucK wrote: > > > > But where is it used? > > This specification has been implemented by CableLabs in a reference > implementation of a DLNA defined TV/STB platform for remote user > interfaces. The "generic" usage implemented there is being used by > television service provider operators to access both MPEG-2 PSI and > CEA-608 data in JS client code. Changing how the Web works wouldn't, as far as I can tell, have any impact on this. So this doesn't provide a reason to avoid changing the spec. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Canonical Image and Color
On Fri, Jul 12, 2013 at 1:32 PM, Ian Hickson wrote: > You are welcome to register these on the wiki and convince people to use > them, sure. Seems like they already have solutions, though, as you show: Would you kindly link me to the wiki? > Sounds like this is already solved, then. > In a sense, but ultimately with caveats. OpenGraph is very useful right now, but Facebook can unilaterally change it or wipe it out entirely (both have already happened to a degree). Microsoft's color properties possess mechanics that are extremely specific to how IE uses color in Windows. Why isn't sufficient? That should suffice, I agree. Meta Image can serve as a fallback when an icon is not available – specifically, as an alternative to using a programmatic screenshot of the app. The principal concept in this Meta Image proposal is to specify a graphic that represents the page content.
Re: [whatwg] Zip archives as first-class citizens
On Wed, Aug 28, 2013 at 10:21 AM, Glenn Maynard wrote: > On Wed, Aug 28, 2013 at 12:07 PM, Eric Uhrhane wrote: >> >> We've covered this several times. The directory records in a zip can >> be superseded by further directories later in the archive, so you >> can't trust that you've got the right directory until you're done >> downloading. > > Both the local headers and the central record can be wrong. (As mentioned > on IRC the other day, apparently EPUB files often have broken central > records, so eBook readers probably prefer the local records.) If they're > out of sync, then they'll always be broken in some clients. > > We just have to make sure that the record that takes priority in any > particular case is well-defined, so we have interop. If some malformed > archives won't work in some cases as a result, using a different format > isn't an improvement: that just means *zero* existing archives would work. Broken files don't work, and I'm OK with that. I'm saying that legal zips can have multiple directories, where the definitive one is last in the file, so it's not a good format for streaming. If you're saying that you want to change the format to make an earlier directory definitive, that's going to break compat for the existing archives everywhere, and would be confusing enough that we should just go with a different archive format that doesn't require changes. Or at least don't call it zip when you're done messing with the spec. > This applies to various other aspects of the format: the maximum supported > length of comments and handling of duplicate filenames, for example. This > would all need to be specified; the ZIP "AppNote" doesn't specify a parser > or error handling in the way the web needs, it just describes the format. > > -- > Glenn Maynard >
Re: [whatwg] Zip archives as first-class citizens
On Wed, Aug 28, 2013 at 12:07 PM, Eric Uhrhane wrote: > We've covered this several times. The directory records in a zip can > be superseded by further directories later in the archive, so you > can't trust that you've got the right directory until you're done > downloading. > Both the local headers and the central record can be wrong. (As mentioned on IRC the other day, apparently EPUB files often have broken central records, so eBook readers probably prefer the local records.) If they're out of sync, then they'll always be broken in some clients. We just have to make sure that the record that takes priority in any particular case is well-defined, so we have interop. If some malformed archives won't work in some cases as a result, using a different format isn't an improvement: that just means *zero* existing archives would work. This applies to various other aspects of the format: the maximum supported length of comments and handling of duplicate filenames, for example. This would all need to be specified; the ZIP "AppNote" doesn't specify a parser or error handling in the way the web needs, it just describes the format. -- Glenn Maynard
Re: [whatwg] Zip archives as first-class citizens
(resending) On Aug 28, 2013, at 6:32 AM, Anne van Kesteren wrote: A couple of us have been toying around with the idea of making zip archives first-class citizens on the web. This sounds like a great opening for a discussion about the pros and cons of doing such a thing. But until such a discussion has happened, isn't it a little premature to worry about the URL details? I'd start with things like "what is the fallback when using a browser behind an enterprise firewall that blocks all zip files?" and "what potential security vulnerabilities do we create by having the browser download a zip file and parse the contents?" and maybe "how does this influence the design of memory-constrained browsers?" Matthew Kaufman Sent from my iPad
Re: [whatwg] Zip archives as first-class citizens
On Wed, Aug 28, 2013 at 9:43 AM, Glenn Maynard wrote: > On Wed, Aug 28, 2013 at 4:54 PM, Eric Uhrhane wrote: >> >> > Without commenting on the other parts of the proposal, let me just >> > mention that every time .zip support comes up, we notice that it's not >> > a great web archive format because it's not streamable. That is, you >> > can't actually use any of the contents until you've downloaded the >> > whole file. > > > ZIPs support both streaming and random access. You can access files in a > ZIP as the ZIP is downloaded, using the local file headers. In this mode, > they work like tars (except that you don't have to decompress unneeded data, > like you do with a tar.gz). Anne's quote snipped off an important piece of my message [which apparently didn't get out due to the too-many-recipients problem]: > [Before you respond that it's streamable, please look in the archives > for the rebuttal.] We've covered this several times. The directory records in a zip can be superseded by further directories later in the archive, so you can't trust that you've got the right directory until you're done downloading. > This feature wouldn't want that, since you need to read the whole file up to > the file you want. Instead, it wants random access, which ZIPs also > support. You download the central directory record first, to find out where > the file you want lies in the archive, then download just the slice of data > you need. You don't need to download the whole file. > > -- > Glenn Maynard >
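Eric's point about superseding directory records can be shown with a small Python sketch: a legal archive may carry two entries under the same name, and a streaming reader (which takes the first local entry it meets) will disagree with a reader that trusts the central directory, where the last record wins, as in CPython's zipfile module. The file names and contents are of course made up for illustration.

```python
import io
import zipfile

# A zip may legally contain two entries with the same name.  CPython's
# zipfile warns about the duplicate but permits it, and its name lookup
# trusts the central directory, so the *last* record wins -- while a
# streaming reader would have served the first local entry it saw.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("page.html", "first version")
    zf.writestr("page.html", "second version")

with zipfile.ZipFile(buf) as zf:
    names = [info.filename for info in zf.infolist()]
    winner = zf.read("page.html")

print(names)   # ['page.html', 'page.html'] -- both entries survive
print(winner)  # b'second version'
```

This is exactly the interop hazard being discussed: which of the two "page.html" bodies a client sees depends on whether it streams local headers or seeks to the central directory.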
Re: [whatwg] Zip archives as first-class citizens
Again from the right address... On Wed, Aug 28, 2013 at 8:47 AM, Eric U wrote: > Without commenting on the other parts of the proposal, let me just > mention that every time .zip support comes up, we notice that it's not > a great web archive format because it's not streamable. That is, you > can't actually use any of the contents until you've downloaded the > whole file. > > Perhaps some other archive format would be a better fit for the web? > > [Before you respond that it's streamable, please look in the archives > for the rebuttal.] > > Eric > > > On Wed, Aug 28, 2013 at 6:32 AM, Anne van Kesteren wrote: >> A couple of us have been toying around with the idea of making zip >> archives first-class citizens on the web. What we want to support: >> >> * Group a bunch of JavaScript files together in a single resource and >> refer to them individually for upcoming JavaScript modules. >> * Package a bunch of related resources together for a game or >> applications (e.g. icons). >> * Support self-contained packages, like Flash-ads or Flash-based games. >> >> Using zip archives for this makes sense as it has broad tooling >> support. To lower adoption cost no special configuration should be >> needed. Existing zip archives should be able to fit right in. >> >> >> The above means we need URLs for zip archives. That is: >> >> >> >> should work. As well as >> >> >> >> and test.html should be able to contain URLs that reference other >> resources inside the zip archive. >> >> >> We have thought of three approaches for zip URL design thus far: >> >> * Using a sub-scheme (zip) with a zip-path (after !): >> zip:http://www.example.org/zip!image.gif >> * Introducing a zip-path (after %!): http://www.example.org/zip%!image.gif >> * Using media fragments: http://www.example.org/zip#path=image.gif >> >> High-level drawbacks: >> >> * Sub-scheme: requires changing the URL syntax with both sub-scheme >> and zip-path. >> * Zip-path: requires changing the URL syntax. 
>> * Fragments: fail to work well for URLs relative to a zip archive. >> >> Fragments are conceptually the cleanest as the only part of a URL >> that's supposed to depend on the Content-Type is the fragment. >> However, if you want to link to an ID inside an HTML resource you'd >> have to do #path=test.html&id=test which would require adding >> knowledge to the HTML resource that it is contained in a zip archive >> and have special processing based on that. And not just HTML, same >> goes for CSS or JavaScript. >> >> I'm not sure we need to consider sub-scheme if zip-path can work as >> it's more complex and not very well thought out. E.g. imagine >> view-source:zip:http://www.example.org/zip!test.html. (I hope we never >> need to standardize view-source and that it can be restricted to the >> address bar in browsers.) >> >> zip-path makes zip archive packaging by far the easiest. If we use %! >> as separator that would cause a network error in some existing >> browsers (due to an illegal %), which means it's extensible there, >> though not backwards compatible. >> >> We'd adjust the URL parser to build a zip-path once %! is encountered. >> And relative URLs would first look if there's a zip-path and work >> against that, and use path otherwise. >> >> Fetching would always use the path. If there's a zip-path and the >> returned resource is not a zip archive it would cause a network error. >> >> >> As for nested zip archives. Andrea suggested we should support this, >> but that would require zip-path to be a sequence of paths. I think we >> never want to allow relative URLs to escape the top-most zip archive. >> But I suppose we could support it in a way that >> >> %!test.zip!test.html >> >> goes one level deeper. And "../image.gif" in test.html looks in the >> enclosing zip. And "../../image.gif" in test.html looks in the >> enclosing zip as well because it cannot ever be relative to the path, >> only the zip-path. >> >> >> -- >> http://annevankesteren.nl/
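As a rough illustration of the zip-path idea quoted above, here is a Python sketch that splits a URL on the proposed %! separator and resolves relative references against the zip-path first. Only the %! syntax and the resolution order come from Anne's description; the function names and the file:/// trick used to reuse the standard resolver are assumptions made for this sketch.

```python
from urllib.parse import urljoin

SEP = "%!"

def split_zip_url(url):
    """Split a zip-path URL into (archive URL, path inside the archive).
    The %! separator is the syntax proposed in this thread, not a standard."""
    if SEP in url:
        archive, zip_path = url.split(SEP, 1)
        return archive, zip_path
    return url, None

def resolve(base, relative):
    """Relative URLs resolve against the zip-path first (per the proposal)
    and against the ordinary path when there is no zip-path.  The file:///
    root is only a trick to reuse urljoin; note that '..' cannot climb
    above it, matching the 'cannot escape the archive' behaviour."""
    archive, zip_path = split_zip_url(base)
    if zip_path is not None:
        inner = urljoin("file:///" + zip_path, relative)[len("file:///"):]
        return archive + SEP + inner
    return urljoin(base, relative)

print(resolve("http://www.example.org/zip%!test.html", "image.gif"))
# -> http://www.example.org/zip%!image.gif
print(resolve("http://www.example.org/zip%!test.html", "../image.gif"))
# -> http://www.example.org/zip%!image.gif ('..' stays inside the archive)
```

The second example mirrors the end of the proposal: "../image.gif" and "../../image.gif" both stay rooted in the archive because they are only ever relative to the zip-path.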
Re: [whatwg] Zip archives as first-class citizens
The idea of making zip content (and hopefully XZ content) available feels right, but adding complexity doesn't. On Wed, Aug 28, 2013 at 1:32 PM, Anne van Kesteren wrote: > We have thought of three approaches for zip URL design thus far: > > * Using a sub-scheme (zip) with a zip-path (after !): > zip:http://www.example.org/zip!image.gif > * Introducing a zip-path (after %!): http://www.example.org/zip%!image.gif > * Using media fragments: http://www.example.org/zip#path=image.gif W.r.t. the sub-scheme, KDE kioslaves have something highly similar (available for instance on their file explorers). The syntax is the following zip: / For instance, zip:/home/tyl/vault.zip/js/simplex.js Sure, a "real" directory can have a .zip extension, but spread across all KDE users since kioslave's inception, more than 10 years ago, that hasn't been raised as an issue (at least, I couldn't find one through their bug tracker). As a result, may I suggest this? zip:http://www.example.org/js.zip/simplex.js W.r.t. using fragments, which I agree is the cleanest approach, can we change the URL parsing algorithm to authorize reading any number of fragments? It would require adding # to the simple encode set, which can have consequences I didn't think of. http://example.org/assets.zip#html/frame.html#editor (Is there a reason we should have a path=, then?) That would also take care of nested zips.
Re: [whatwg] Zip archives as first-class citizens
On 8/28/13 12:20 PM, Jonas Sicking wrote: * It makes it impossible to create a relative URL from inside the zip file to refer to something on the same server but outside of the zip file. I think this comes back to use cases. If the idea of having the zip is "here is stuff that should live in its own world", then we do not want easy ways to get out of it via relative URIs. If the idea is to have "here is a fancy way of representing a directory" then relative URIs should Just Work across the zip boundary, like they would for any other directory. Which model are we working with here? Or some other one that doesn't match either of those two? -Boris
Re: [whatwg] Zip archives as first-class citizens
On Wed, Aug 28, 2013 at 4:54 PM, Eric Uhrhane wrote: > > Without commenting on the other parts of the proposal, let me just > > mention that every time .zip support comes up, we notice that it's not > > a great web archive format because it's not streamable. That is, you > > can't actually use any of the contents until you've downloaded the > > whole file. > ZIPs support both streaming and random access. You can access files in a ZIP as the ZIP is downloaded, using the local file headers. In this mode, they work like tars (except that you don't have to decompress unneeded data, like you do with a tar.gz). This feature wouldn't want that, since you need to read the whole file up to the file you want. Instead, it wants random access, which ZIPs also support. You download the central directory record first, to find out where the file you want lies in the archive, then download just the slice of data you need. You don't need to download the whole file. -- Glenn Maynard
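Glenn's random-access mode can be sketched in Python: fetch the tail of the archive, locate the end-of-central-directory (EOCD) record, and read off where the central directory lives, after which individual files can be fetched as byte slices. The field layout follows the ZIP AppNote; the function name and the in-memory demo are illustrative only, and a real client would issue an HTTP Range request and also handle archive comments and ZIP64 records.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"  # end-of-central-directory signature

def central_directory_location(tail):
    """Parse the EOCD record out of the last bytes of an archive (e.g. the
    result of a Range request for the tail) and return
    (cd_offset, cd_size, entry_count).  A real client must fetch at least
    22 bytes, and more if the archive carries a trailing comment."""
    i = tail.rfind(EOCD_SIG)
    if i < 0:
        raise ValueError("EOCD not found; fetch a larger tail")
    (_sig, _disk, _cd_disk, _n_on_disk, entries,
     cd_size, cd_offset, _comment_len) = struct.unpack_from("<4sHHHHIIH", tail, i)
    return cd_offset, cd_size, entries

# Demonstrate against an in-memory archive.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "hello")
    zf.writestr("b.txt", "world")
data = buf.getvalue()

off, size, n = central_directory_location(data[-64:])
print(n)                   # 2 entries
print(data[off:off + 4])   # b'PK\x01\x02' -- first central directory header
```

Each central directory entry then gives the local header offset and compressed size of one file, which is exactly the slice a ranged download would fetch.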
Re: [whatwg] Zip archives as first-class citizens
On 8/28/13 11:40 AM, Anne van Kesteren wrote: On Wed, Aug 28, 2013 at 4:04 PM, Boris Zbarsky wrote: What's the issue with that? Gecko supports that (with jar:, not zip:), fwiw. As far as the web platform is concerned today, URL objects are just that. In Gecko you either have a URL object, or a linked list of URL objects. In Gecko you always have a URL object. A small number of operations (extracting the origin is the main one) need to know about the fact that a URL object may delegate the work to some other URL object. I'd likewise be interested to hear from other implementers. Yes, this is the key part.
Re: [whatwg] Zip archives as first-class citizens
On 8/28/13 11:50 AM, Michal Zalewski wrote: 1) Both jar: and mhtml: (which work or worked in a very similar way) have caused problems in absence of strict Content-Type matching. This is an issue for both versions of this proposal. We'd need to do strict matching on the type. -Boris
Re: [whatwg] Zip archives as first-class citizens
On Wed, Aug 28, 2013 at 4:50 PM, Michal Zalewski wrote: > 1) Both jar: and mhtml: (which work or worked in a very similar way) > have caused problems in absence of strict Content-Type matching. In > essence, it is relatively easy for something like a valid > user-supplied text document or an image to be also a valid archive. > Such archives may end up containing "files" that the owner of the > website never intended to host in their origin. This also seems like a problem for being able to navigate to a zip archive's resources. E.g. if you have a hosting service for zip archives someone could upload one with an HTML subresource that executes some malicious script and trick users into navigating to http://hosting.example/pinkpony%!look.html I wonder if that is enough of a concern to not support navigating to zip resources at all. Or is Gecko's jar support enough to not have to care about this? (But we probably should do more than sniffing as you point out.) > 2) Both schemes also have a long history of breaking origin / host > name parsing in various places in the browser and introducing security > bugs. -- http://annevankesteren.nl/
Re: [whatwg] Zip archives as first-class citizens
On Wed, Aug 28, 2013 at 8:04 AM, Boris Zbarsky wrote: > On 8/28/13 9:32 AM, Anne van Kesteren wrote: >> >> I'm not sure we need to consider sub-scheme if zip-path can work as >> it's more complex and not very well thought out. E.g. imagine >> view-source:zip:http://www.example.org/zip!test.html. > > > What's the issue with that? Gecko supports that (with jar:, not zip:), > fwiw. I have two concerns with the scheme-based approach. * It dramatically complicates origin handling. This is something we've seen multiple times in Gecko and something that I expect authors will struggle with too. * It makes it impossible to create a relative URL from inside the zip file to refer to something on the same server but outside of the zip file. Since anything outside of the zip file uses a different scheme, it means that you have to use an absolute URL. Not even URLs starting with "/" or "//" can be used. > 3) We have implementation experience with the "sub-scheme" approach and we > know it can work just fine (existence proof is jar: in Gecko). The main > difficulty it introduces is that computing the origin needs to be done via > object accessors, not string-parsing... Do we have any implementation > experience with "zip-path"-like approaches? I don't know about "can work just fine". Sure, if everyone does the right thing, then it works. But we're having to strictly enforce that no one does string parsing by hand and instead use URL objects and Principal objects. Neither of which really are an option on the web right now as all URL-related APIs use strings. > I don't think relative URIs should ever escape a zip archive (though I do > appreciate the way that would let someone replace directories with zipped-up > versions of those directories). The reason for that is that allowing it > sometimes but not others seems really weird to me, and it seems like we > don't want to allow it for toplevel zip archives. Why not? / Jonas
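Jonas's first concern -- origin handling -- can be made concrete with a small Python sketch. Naive string parsing of a jar:-style URL yields a garbage origin; getting the real origin requires unwrapping the sub-scheme first. The unwrap loop below is a simplification for illustration, not Gecko's actual algorithm.

```python
from urllib.parse import urlsplit

def naive_origin(url):
    """String-based origin extraction: fine for plain http(s) URLs."""
    p = urlsplit(url)
    return (p.scheme, p.hostname, p.port)

def jar_aware_origin(url):
    """For sub-scheme URLs like jar:http://host/x.zip!/y, the origin is
    that of the innermost archive URL -- the unwrap step that naive
    string parsing silently skips."""
    while url.startswith(("jar:", "zip:")):
        # Drop the sub-scheme, then keep only the archive URL before '!'.
        url = url.split(":", 1)[1].split("!", 1)[0]
    return naive_origin(url)

print(naive_origin("jar:http://www.example.org/zip!image.gif"))
# -> ('jar', None, None) -- string parsing goes wrong
print(jar_aware_origin("jar:http://www.example.org/zip!image.gif"))
# -> ('http', 'www.example.org', None)
```

Every piece of code that computes origins by hand would need the second function's extra step, which is the class of bug Jonas is pointing at.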
Re: [whatwg] Zip archives as first-class citizens
Resending. I recommend that people replying trim the address list as apparently "Too many recipients to the message" is a thing for this mailing list. On Wed, Aug 28, 2013 at 4:54 PM, Eric Uhrhane wrote: > Without commenting on the other parts of the proposal, let me just > mention that every time .zip support comes up, we notice that it's not > a great web archive format because it's not streamable. That is, you > can't actually use any of the contents until you've downloaded the > whole file. > > Perhaps some other archive format would be a better fit for the web? My take on this is that zip archives are ubiquitous. That makes this feature easy to deploy from the start. If zip archives turn out to be a successful feature we can add support for an alternative format down the line that handles that better. Adding zip archive support will also make it easier to work with OOXML, EPUB, etc. -- http://annevankesteren.nl/
Re: [whatwg] Zip archives as first-class citizens
Two implementation risks to keep in mind: 1) Both jar: and mhtml: (which work or worked in a very similar way) have caused problems in absence of strict Content-Type matching. In essence, it is relatively easy for something like a valid user-supplied text document or an image to be also a valid archive. Such archives may end up containing "files" that the owner of the website never intended to host in their origin. 2) Both schemes also have a long history of breaking origin / host name parsing in various places in the browser and introducing security bugs. /mz
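Michal's first risk is easy to demonstrate: because ZIP readers find the central directory by scanning back from the end of the file, a file with an image-like prefix can still be a fully readable archive. A Python sketch (the GIF-style prefix and file names are just illustrative bytes):

```python
import io
import zipfile

# Build a small archive with a script-bearing HTML "file".
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("evil.html", "<script>alert(1)</script>")

# Prepend image-like bytes.  Readers that scan backwards from the end
# still accept the result, which is how a user-uploaded "image" can
# double as an archive hosting content the site owner never intended.
polyglot = b"GIF89a" + b"\x00" * 64 + buf.getvalue()

with zipfile.ZipFile(io.BytesIO(polyglot)) as zf:
    smuggled = zf.read("evil.html")

print(smuggled)  # b'<script>alert(1)</script>'
```

This is why strict Content-Type matching (rather than sniffing) keeps coming up: the bytes alone cannot tell you whether the owner meant to serve an image or an archive.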
Re: [whatwg] Zip archives as first-class citizens
On Wed, Aug 28, 2013 at 4:04 PM, Boris Zbarsky wrote: > What's the issue with that? Gecko supports that (with jar:, not zip:), > fwiw. As far as the web platform is concerned today, URL objects are just that. In Gecko you either have a URL object, or a linked list of URL objects. I guess the question is whether supporting a linked list of URL objects in addition to plain URL objects is worth it just for zip archive support. Model-wise it's quite a bit of added complexity. I'd likewise be interested to hear from other implementers. -- http://annevankesteren.nl/
Re: [whatwg] Zip archives as first-class citizens
On 8/28/13 9:32 AM, Anne van Kesteren wrote: I'm not sure we need to consider sub-scheme if zip-path can work as it's more complex and not very well thought out. E.g. imagine view-source:zip:http://www.example.org/zip!test.html. What's the issue with that? Gecko supports that (with jar:, not zip:), fwiw. My concerns with the zip-path approach are as follows: 1) It requires doing the zip processing in a new layer on top of whatever pluggable architecture you have for schemes. The zip: approach nicely encapsulates things so that the protocol handler for zip: delegates to the inner URI for the archive fetch and then knows how to process it. It might be possible to do the zip processing by totally rewriting how browsers do fetch to interpose this zip-processing layer, but that seems like a nontrivial undertaking compared to having an orthogonal zip: handler that's invoked explicitly. I would be interested in knowing what other implementors think about how implementable the two options are in their architectures. 2) It changes semantics of existing URIs that happen to contain %!. I'm specifically worried about data: URIs, though Gordon points out that some http URIs may also be affected. 3) We have implementation experience with the "sub-scheme" approach and we know it can work just fine (existence proof is jar: in Gecko). The main difficulty it introduces is that computing the origin needs to be done via object accessors, not string-parsing... Do we have any implementation experience with "zip-path"-like approaches? As for nested zip archives. Andrea suggested we should support this, but that would require zip-path to be a sequence of paths. I think we never want to allow relative URLs to escape the top-most zip archive. But I suppose we could support it in a way that %!test.zip!test.html goes one level deeper. And "../image.gif" in test.html looks in the enclosing zip. 
I don't think relative URIs should ever escape a zip archive (though I do appreciate how that would let someone replace directories with zipped-up versions of those directories). The reason is that allowing it sometimes but not others seems really weird to me, and it seems like we don't want to allow it for top-level zip archives.

-Boris
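[Editor's note: the layering Boris describes in point 1 — a zip: handler that splits off the entry path and delegates the archive fetch to the inner URL's own handler — can be sketched roughly as below. This is a hypothetical illustration, not Gecko's actual jar: implementation; the function name is invented.]

```python
def parse_subscheme_url(url):
    """Split a zip: sub-scheme URL into (inner_url, entry_path).

    Hypothetical sketch of the jar:-style layering: the zip: handler
    strips its own scheme, splits at the '!' separator, and delegates
    fetching of inner_url to whatever handler owns that scheme. The
    zip processing stays encapsulated in the zip: handler itself.
    """
    scheme = "zip:"
    if not url.startswith(scheme):
        raise ValueError("not a zip: URL")
    inner_url, _, entry_path = url[len(scheme):].partition("!")
    return inner_url, entry_path

# The archive is fetched via the inner http: handler; image.gif is
# then looked up inside the returned archive.
inner, entry = parse_subscheme_url("zip:http://www.example.org/zip!image.gif")
```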
Re: [whatwg] Zip archives as first-class citizens
On 8/28/13 9:32 AM, Anne van Kesteren wrote:
> We have thought of three approaches for zip URL design thus far:
>
> * Using a sub-scheme (zip) with a zip-path (after !):
>   zip:http://www.example.org/zip!image.gif
> * Introducing a zip-path (after %!):
>   http://www.example.org/zip%!image.gif
> * Using media fragments:
>   http://www.example.org/zip#path=image.gif
>
> High-level drawbacks:
>
> * Sub-scheme: requires changing the URL syntax with both sub-scheme
>   and zip-path.
> * Zip-path: requires changing the URL syntax.
> * Fragments: fail to work well for URLs relative to a zip archive.
>
> Fragments are conceptually the cleanest, as the only part of a URL
> that's supposed to depend on the Content-Type is the fragment.
> However, if you want to link to an ID inside an HTML resource you'd
> have to do #path=test.html&id=test, which would require the HTML
> resource to know that it is contained in a zip archive and have
> special processing based on that. And not just HTML; the same goes
> for CSS and JavaScript.
>
> I'm not sure we need to consider sub-scheme if zip-path can work, as
> it's more complex and not very well thought out. E.g. imagine
> view-source:zip:http://www.example.org/zip!test.html. (I hope we
> never need to standardize view-source and that it can be restricted
> to the address bar in browsers.)
>
> zip-path makes zip archive packaging by far the easiest. If we use %!
> as the separator that would cause a network error in some existing
> browsers (due to an illegal %), which means it's extensible there,
> though not backwards compatible. We'd adjust the URL parser to build
> a zip-path once %! is encountered. And relative URLs would first look
> whether there's a zip-path and work against that, and use the path
> otherwise. Fetching would always use the path. If there's a zip-path
> and the returned resource is not a zip archive, it would cause a
> network error.
>
> As for nested zip archives. Andrea suggested we should support this,
> but that would require zip-path to be a sequence of paths.
> I think we never want to allow relative URLs to escape the top-most
> zip archive. But I suppose we could support it in a way that
> %!test.zip!test.html goes one level deeper. And "../image.gif" in
> test.html looks in the enclosing zip. And "../../image.gif" in
> test.html looks in the enclosing zip as well, because it cannot ever
> be relative to the path, only the zip-path.

As the following URLs suggest, %! (or %-anything) will likely not work for ZIP files generated by a script using the query portion of the URL, as the path information will be subsumed into the last query value without causing a network error:

http://whatwg.gphemsley.org/url_test.php?file=test.zip&spacer=1%!example.png
http://whatwg.gphemsley.org/url_test.php?file=test.zip&spacer=1%/example.png
http://whatwg.gphemsley.org/url_test.php?file=test.zip&spacer=1?example.png

(And feel free to use that script to try out any other combos.)

However, since fragments (i.e. anything beginning with '#') are already not sent to the server, what if you modified the URL parser to use a special hash-prefix combo that indicates the path? Then you could avoid the problem of having to make documents aware of the fact that they're in a ZIP, because the hash-prefix combo would come before the plain hash which holds the ID.

So, for example:

http://whatwg.gphemsley.org/url_test.php?file=test.zip&spacer=1#/example.html#middle

Then you could also take the opportunity to spec the #! prefix (and other hash-combo prefixes) that is used by a lot of sites nowadays.

--
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/
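[Editor's note: Gordon's hazard can be demonstrated with a small sketch, in Python purely for illustration. '%!' is not a valid percent-escape, so decoding leaves it untouched rather than erroring, and a parser that starts a zip-path at the first '%!' would split inside an ordinary query string.]

```python
from urllib.parse import unquote

# '%!' is not a valid percent-escape; unquote (like typical servers)
# leaves it untouched instead of raising a decoding error, so the
# spacer parameter simply receives the literal value '1%!example.png'.
value = unquote("1%!example.png")

# A parser that builds a zip-path at the first '%!' would therefore
# split inside the query string, misreading part of a query value as
# a zip-path -- with no network error to signal the mistake:
url = "http://whatwg.gphemsley.org/url_test.php?file=test.zip&spacer=1%!example.png"
path, zip_path = url.split("%!", 1)
```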
[whatwg] Zip archives as first-class citizens
A couple of us have been toying around with the idea of making zip archives first-class citizens on the web. What we want to support:

* Group a bunch of JavaScript files together in a single resource and refer to them individually, for upcoming JavaScript modules.
* Package a bunch of related resources together for a game or application (e.g. icons).
* Support self-contained packages, like Flash ads or Flash-based games.

Using zip archives for this makes sense as the format has broad tooling support. To lower the adoption cost, no special configuration should be needed. Existing zip archives should be able to fit right in.

The above means we need URLs for zip archives. That is: should work. As well as and test.html should be able to contain URLs that reference other resources inside the zip archive.

We have thought of three approaches for zip URL design thus far:

* Using a sub-scheme (zip) with a zip-path (after !):
  zip:http://www.example.org/zip!image.gif
* Introducing a zip-path (after %!):
  http://www.example.org/zip%!image.gif
* Using media fragments:
  http://www.example.org/zip#path=image.gif

High-level drawbacks:

* Sub-scheme: requires changing the URL syntax with both sub-scheme and zip-path.
* Zip-path: requires changing the URL syntax.
* Fragments: fail to work well for URLs relative to a zip archive.

Fragments are conceptually the cleanest, as the only part of a URL that's supposed to depend on the Content-Type is the fragment. However, if you want to link to an ID inside an HTML resource you'd have to do #path=test.html&id=test, which would require the HTML resource to know that it is contained in a zip archive and have special processing based on that. And not just HTML; the same goes for CSS and JavaScript.

I'm not sure we need to consider sub-scheme if zip-path can work, as it's more complex and not very well thought out. E.g. imagine view-source:zip:http://www.example.org/zip!test.html.
(I hope we never need to standardize view-source and that it can be restricted to the address bar in browsers.)

zip-path makes zip archive packaging by far the easiest. If we use %! as the separator that would cause a network error in some existing browsers (due to an illegal %), which means it's extensible there, though not backwards compatible. We'd adjust the URL parser to build a zip-path once %! is encountered. And relative URLs would first look whether there's a zip-path and work against that, and use the path otherwise. Fetching would always use the path. If there's a zip-path and the returned resource is not a zip archive, it would cause a network error.

As for nested zip archives. Andrea suggested we should support this, but that would require zip-path to be a sequence of paths. I think we never want to allow relative URLs to escape the top-most zip archive. But I suppose we could support it in a way that %!test.zip!test.html goes one level deeper. And "../image.gif" in test.html looks in the enclosing zip. And "../../image.gif" in test.html looks in the enclosing zip as well, because it cannot ever be relative to the path, only the zip-path.

--
http://annevankesteren.nl/
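[Editor's note: the parsing and resolution behavior Anne describes — build a zip-path at the first '%!', resolve relative URLs against the zip-path when one exists, and clamp '../' at the archive root — might be sketched as below. This is a simplified illustration with hypothetical helper names, not a spec-accurate URL parser.]

```python
from posixpath import normpath

ZIP_SEP = "%!"  # the proposed zip-path separator

def parse_zip_url(url):
    """Split a URL into (path, zip_path) at the first '%!' marker.
    zip_path is None when the URL has no zip-path."""
    if ZIP_SEP in url:
        path, zip_path = url.split(ZIP_SEP, 1)
        return path, zip_path
    return url, None

def resolve_relative(base_url, relative):
    """Resolve a relative reference against the zip-path if there is
    one, otherwise against the path. Leading '../' segments are
    clamped at the archive root, so '..' can only ever be relative to
    the zip-path, never to the path outside the archive."""
    path, zip_path = parse_zip_url(base_url)
    if zip_path is None:
        # Ordinary resolution against the path (greatly simplified).
        return path.rsplit("/", 1)[0] + "/" + relative
    base_dir = zip_path.rsplit("/", 1)[0] if "/" in zip_path else ""
    joined = normpath((base_dir + "/" + relative).lstrip("/"))
    while joined.startswith("../"):  # clamp at the archive root
        joined = joined[len("../"):]
    return path + ZIP_SEP + joined
```

Under this sketch, both "../image.gif" and "../../image.gif" in a top-level test.html resolve to image.gif at the root of the same archive, matching the behavior described above.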