On Nov 9, 2009, at 2:09 PM, Mike Wilson wrote:
Hi Nikunj,
I find the subjects of programmable caches and local http servers
highly interesting for the browser. The below comments and questions
are from a quick read-through of the supplied links, so please
excuse any misunderstandings:
Sorry to take so long to respond to you. I was swamped with other work
and only now am beginning to respond to DataCache comments. Hope you
will accept my apologies.
1) API orthogonality
The spec invents yet another caching mechanism and "storage area".
Have you evaluated the possibilities to build the suggested
functionality by enhancing and combining existing APIs and features
such as the regular HTTP cache, AppCache, localStorage, etc?
I have considered evolving AppCache to support DataCache. The API
and the specified behavior are written with AppCache as the model. I
also know that many have asked for further integration of DataCache
with AppCache, and I am open to suggestions about this.
2) "Data cache" naming
Your examples mention data in the cache such as images and blog
posts. But, the cache would work as well for application code
resources such as script files, wouldn't it?
If so, would it be better to name the cache along the lines of its
programmability or flexibility, rather than "Data"?
Certainly it is possible to cache resources you identify. However, I
do think that it should be possible to think of the proposed behavior
as an extension of the caching capabilities of the browser.
Programmability is key to this approach and perhaps that makes for a
better name. I don't know that it will address all concerns about
naming, though.
3) Specification naming wrt Embedded Local Servers
I find the local servers to be somewhat on the border-line of the
topic of data caches, which seems to be confirmed by the
specification's section headings:
4. Programmable HTTP Processing
4.2 Data Caches
4.3 Embedded Local Server
Would it be more appropriate to name the spec "Programmable HTTP
Processing", or to move the local server to its own spec?
You are pointing out something I have suspected for a while. There are
two ways in which interception can be performed:
1. Based on transient registrations made through the DOM
2. Based on permanent registrations made through an API with
persistent effects
It should be possible to check very efficiently whether interception
is required for any given request. Given the granularity that is
required to determine this, and my experience that interception is
required only on certain resources that are in the programmable cache,
I thought of combining the two.
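To illustrate the efficiency point: if interception only ever applies to
resources that are already in the programmable cache, the check can be a
single keyed lookup on the cache entry. The sketch below is hypothetical.
`InterceptRegistry`, `capture`, `register`, and `resolve` are names made up
for illustration and are not taken from the DataCache draft, and the sketch
does not distinguish transient from persistent registrations.

```javascript
// Hypothetical sketch: none of these names come from the DataCache draft.
// Idea: interception is only possible for URIs that are already cached, so
// "does this request need interception?" is a single Map lookup.
class InterceptRegistry {
  constructor() {
    this.cached = new Map(); // uri -> { body, handler (or null) }
  }

  // Record a cached representation; no interception by default.
  capture(uri, body) {
    this.cached.set(uri, { body, handler: null });
  }

  // Attach an interceptor to a resource that is already cached.
  register(uri, handler) {
    const entry = this.cached.get(uri);
    if (!entry) throw new Error(`not in cache: ${uri}`);
    entry.handler = handler;
  }

  // Resolve a request: intercepted, served from cache, or left to the network.
  resolve(uri) {
    const entry = this.cached.get(uri);
    if (!entry) return { source: 'network' };
    if (entry.handler) {
      return { source: 'interceptor', body: entry.handler(uri, entry.body) };
    }
    return { source: 'cache', body: entry.body };
  }
}

const registry = new InterceptRegistry();
registry.capture('/posts/1', 'draft text');
registry.capture('/logo.png', 'PNG bytes');
registry.register('/posts/1', (uri, body) => body.toUpperCase());
```

Because the handler lives on the cache entry itself, a request for an
uncached URI falls through to the network after one failed lookup.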
I am open to picking a better name any day.
4) Cache groups
It took me a little time to grok the meaning of data caches and data
cache groups, and my current understanding is that a data cache
group is really a "versioned cache", where the "versions" correspond
to specific data cache instances in the spec. I think it may be
better to instead talk about one cache with many versions, as this
concept is probably more known to readers.
This concept exists in HTML5 with AppCache groups. If you are familiar
with that one, the terminology would be easy to grasp.
5) Explicit and event-loopy API design
I see that you have taken great care to make an API that allows
full control over concurrent cache state changes, through event-
loopy transaction callbacks and explicit cache upgrades ("swap").
Your example (below) for adding a new uri to the cache first has a
transaction callback that runs the capturing at a suitable time, but
which doesn't result in the current cache being updated. Instead
another callback, an "onready" event handler, needs to be added
where the updated cache is explicitly swapped in:
  document.body.addEventListener('onready', function(event) {
    event.cache.swapCache();
    // ... take advantage of the new stuff in the cache
  });
  cache.transaction(function(txn) {
    txn.capture(uri);
    txn.finish();
  });
I think I understand your motivations on wanting to offer fine-
grained control like this, but do you have any thoughts on the above
getting a bit verbose for simple use-cases?
Good question. The API does require multiple steps to accomplish the
goal of "add a representation to the cache and then make that
representation available to applications". The notion of transactions
is necessary because it may be inappropriate to make some
representations available but not others that are necessary for an
application to function correctly. On the other hand, the use case of
adding a single resource is interesting enough that we could special
case it.
Perhaps,
<<
cache.immediate(uri)
>>
might be a nicer way of doing what the above code intends to do anyway.
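As a rough illustration of what such a shorthand would do, the sketch below
builds it out of the same three steps as the longer example (transaction,
capture, swap). `FakeCache` is a hypothetical in-memory stand-in written
only so the sketch runs on its own. Its `onready` callback property replaces
the DOM event of the original example, and `immediate` is written as a free
function rather than a method. None of this is the draft's actual API.

```javascript
// Hypothetical stand-in for a data cache; not the DataCache draft's API.
class FakeCache {
  constructor(fetcher) {
    this.fetcher = fetcher;   // uri -> body; stands in for an HTTP fetch
    this.current = new Map(); // version visible to the application
    this.pending = null;      // captured version waiting for swapCache()
    this.onready = null;      // called when a transaction has finished
  }

  transaction(callback) {
    const staged = new Map(this.current);
    const txn = {
      capture: (uri) => staged.set(uri, this.fetcher(uri)),
      finish: () => {
        this.pending = staged;
        if (this.onready) this.onready({ cache: this });
      },
    };
    callback(txn);
  }

  swapCache() {
    if (this.pending) {
      this.current = this.pending;
      this.pending = null;
    }
  }

  has(uri) {
    return this.current.has(uri);
  }
}

// The shorthand: capture one URI and make it visible right away.
function immediate(cache, uri) {
  const previous = cache.onready;
  cache.onready = (event) => {
    cache.onready = previous; // restore whatever handler was installed before
    event.cache.swapCache();
  };
  cache.transaction((txn) => {
    txn.capture(uri);
    txn.finish();
  });
}

const cache = new FakeCache((uri) => `body of ${uri}`);
immediate(cache, '/posts/1');
```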
6) Transaction API
I haven't followed the discussions on WebStorage/WebDatabase/
WebSimpleDB lately, but is the transaction style in Data Cache
adhering to any consensus achieved in this area?
The transaction API is similar to the AppCache update model, where the
updated cache is only made available after all the items in the update
have been cached. With a programmable cache, the only way to know
whether all the required items are in the cache is through an explicit
"transaction"-like model. Would you agree?
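The all-or-nothing behavior described above can be sketched as follows.
This is a simplified, synchronous model under stated assumptions:
`VersionedCache`, `update`, and the `flaky` fetcher are hypothetical names
for illustration, not the draft's API, and a real implementation would
fetch asynchronously.

```javascript
// Hypothetical model, not the DataCache draft's API: a cache whose new
// version becomes visible only if every capture in the batch succeeded.
class VersionedCache {
  constructor() {
    this.version = 0;
    this.entries = new Map(); // the version applications currently see
  }

  // `fetcher(uri)` stands in for an HTTP fetch and may throw.
  update(uris, fetcher) {
    const staged = new Map(this.entries); // work on a copy
    try {
      for (const uri of uris) staged.set(uri, fetcher(uri));
    } catch (err) {
      return false; // discard the copy; the visible version is untouched
    }
    this.entries = staged; // publish all captured items at once
    this.version += 1;
    return true;
  }
}

const dataCache = new VersionedCache();
const flaky = (uri) => {
  if (uri === '/missing.js') throw new Error('404');
  return `body of ${uri}`;
};

// Both resources are fetched, so the new version is published.
const ok = dataCache.update(['/app.js', '/app.css'], flaky);
// One fetch fails, so '/new.js' is not exposed either.
const failed = dataCache.update(['/new.js', '/missing.js'], flaky);
```

After the second call the application still sees the first version, which
is the property that lets it rely on a consistent set of resources.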
7) Sync API naming
There are several methods that are mirrored as synchronous versions,
e.g. openDataCache and openDataCacheSync. Is the "Sync" suffix an
upcoming standard pattern for this duality?
This style was present in Web SQL Database, which is the source for
these names.
There is another style being used now in Indexed Database API where
the idea is to attach an asynchronous request object to every API so
that it can be used to communicate. I suggest moving to that style so
that names are not made different just because of sync vs. async.
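One possible reading of the request-object style, as a minimal sketch: the
method keeps a single name, and the asynchronous variant hands back an
object on which the caller sets success and error callbacks. The
`openDataCache` function below is a local stand-in, and the `onsuccess`,
`onerror`, `result`, and `error` members are assumptions made for
illustration. They are not taken from either draft.

```javascript
// Hypothetical sketch of the request-object style; `openDataCache` here is
// a local stand-in, not the draft's method.
function openDataCache(name) {
  const request = {
    onsuccess: null,
    onerror: null,
    result: undefined,
    error: undefined,
  };
  // Complete on a later turn of the event loop so that the caller can
  // attach handlers after the call returns.
  setTimeout(() => {
    if (typeof name !== 'string' || name === '') {
      request.error = new Error('invalid cache name');
      if (request.onerror) request.onerror({ target: request });
    } else {
      request.result = { name, entries: new Map() };
      if (request.onsuccess) request.onsuccess({ target: request });
    }
  }, 0);
  return request;
}

const opened = [];
const openRequest = openDataCache('blog');
openRequest.onsuccess = (event) => opened.push(event.target.result.name);
```

A synchronous counterpart could then keep the same method name and return
the cache directly, so no "Sync" suffix is needed.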
8) Embedded Local Servers API
I think there will probably (need to) be long and hard discussions
about what a web server API inside a web browser should look
like ;-), but to start things off I'd like to say that I like that
you are keeping it simple. As we have XHR for client functionality
in the browser, it might be nice to have some API overlap with that,
e.g. naming the HttpResponse.bodyText property "responseText"
instead. But I see the dilemma with wanting the same property name
on HttpRequest, so maybe the current writing is better.
I am open to changing names but the naming choice was influenced just
as you explained it.
BTW, for inspiration on a simple JavaScript web server API running
in a single-threaded event-looped environment you may want to look
at Node.js http://nodejs.org/.
Thanks for providing this pointer. I like the interest in running a
full-fledged server in JS, although I must confess that I am less than
comfortable with opening such a discussion in the context of a
browser, which is where this spec is focused.
Best regards
Mike Wilson
Nikunj R. Mehta wrote:
Posting for those not in HTML WG but interested in this topic.
Thanks,
Nikunj
Begin forwarded message:
Resent-From: [email protected]
From: "Nikunj R. Mehta" <[email protected]>
Date: November 6, 2009 8:12:50 AM PST
To: HTMLWG WG <[email protected]>
Subject: Caching breakout session at TPAC
For the breakout session on "Caching" yesterday, I put together a
small set of slides [1]. These slides on caching techniques for off-
line Web applications include use cases not adequately supported by
HTML ApplicationCache. These use cases are intended to be supported
by the WebApps DataCache [2] spec.
Thanks for the interesting questions and spirited discussion at the
breakout session yesterday.
Nikunj
http://o-micron.blogspot.com
[1]
https://docs.google.com/fileview?id=0B0rjI-BMFJwBYzNhYTYyNWQtNGFhZi00N2MxLWIyNDMtNGJhZjA2Y2ZhNzRh&hl=en
[2] http://www.w3.org/TR/DataCache/