*If you don't care where various bits of the localStorage implementation
live and you aren't scared about letting stuff out of the sandbox, you can
stop reading now.*

*Background:*

For those who don't know the spec by heart:  sessionStorage can be thought
of as 'tab-local' storage space for each origin.  localStorage is shared
across all browser windows of the same origin and is persistent.  All data
is stored in key/value pairs where both the key and value are strings.  It's
possible to subscribe to DOM storage events.  Events and ease of use are why
a developer might use localStorage even though the database interface
exists.  The exact spec is here: http://dev.w3.org/html5/webstorage/
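
To make the semantics above concrete, here is a toy Python model of
localStorage.  It is only a sketch, not browser code: the class name, the
method names and the window ids are made up for illustration, and notifying
only the *other* windows of an origin is an assumption of this model.

```python
from collections import defaultdict


class ToyLocalStorage:
    """In-memory model: one string -> string map per origin."""

    def __init__(self):
        self._data = defaultdict(dict)       # origin -> {key: value}
        self._listeners = defaultdict(list)  # origin -> [(window_id, callback)]

    def add_listener(self, origin, window_id, callback):
        """Subscribe a window to storage events for its origin."""
        self._listeners[origin].append((window_id, callback))

    def set_item(self, origin, window_id, key, value):
        # Simplification: this model rejects non-strings outright.
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError("keys and values must be strings")
        old = self._data[origin].get(key)
        self._data[origin][key] = value
        # Notify every other window of the same origin (assumed behavior).
        for listener_id, callback in self._listeners[origin]:
            if listener_id != window_id:
                callback(key, old, value)

    def get_item(self, origin, key):
        return self._data[origin].get(key)


store = ToyLocalStorage()
events = []
store.add_listener("http://a.example", "win2",
                   lambda key, old, new: events.append((key, old, new)))
store.set_item("http://a.example", "win1", "color", "red")
```

After the last line, `events` holds one entry for the write made by "win1",
and a lookup under a different origin finds nothing.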


*Where should the localStorage implementation live?*

I'm planning on implementing localStorage very soon within Chromium.
Unfortunately, how to do this is not very clear-cut.  Here are all the
possibilities I know of so far:  (Note that I'm intentionally ignoring the
backing file format for now...as that debate will partially depend on how
it's implemented.)

1)  The most obvious solution is to have the browser process keep track
of the key/values for each origin and write them to disk.  The problem with
this approach is that we're allowing user supplied data to exist in memory
(possibly the stack at times, though we could probably avoid this if we
tried) outside of a sandbox.  Ian Fette (and I'm sure others) have pretty
big reservations for this reason.  That said, this is definitely the
simplest and cleanest solution, so if we can figure out something that we're
confident with security-wise, this is how I'd like to do it.

2)  What follows from #1 is simply pulling all the localStorage code into
its own (sandboxed) process.  The problem is that, unless a lot of the
internet starts using localStorage, it seems disproportionately
heavyweight.  Starting it on demand and killing it off if localStorage hasn't
been used for a while would mitigate this.

3)  A completely different solution is to use shared memory + the code
recently written to pass file handles between processes.  The shared memory
would be used to coordinate between processes and to store key/val data.
One renderer process for each origin would take responsibility for syncing
data to disk.  Event notifications can occur either via IPC or via shared
memory, whichever is easier (key/val data itself can NOT go over IPC, for
latency/responsiveness reasons).  Obviously the chief problem with this is
memory usage.  I'm sure it'll also be more complex and have a greater
bug/exploit cross section.

4)  A variation of #3 would be to keep all key/val data in the file and only
use shared memory for locking (if necessary).  I'm not going to discuss the
implementation details because I don't want us to get hung up on them, but
the general idea would be for each process to have an open file handle for
its origin(s) and somehow (shared memory, flock, etc) coordinate with the
other processes.  This will almost certainly be slower than shared memory (if
nothing else, due to system calls) but it'll use less memory and possibly be
easier to make secure.

5)  One last option is to layer the whole thing on top of the HTML 5
database layer.  Unfortunately, there's no efficient way for this layer to
support events.  Even hooking directly into sqlite won't work since its
triggers layer apparently only notifies you (i.e. works) if the
insert/delete/update happens in your own process.  Of course sqlite can be
the backing for any other option, but please, let's hold off on that
discussion for now.
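
As a rough illustration of option #4, the Python sketch below keeps all
key/val data in one per-origin file and coordinates access with flock.
Everything specific here is an assumption made for the example: the class
and method names, the JSON file format (the backing format is deliberately
left open above), and the use of shared locks for reads and exclusive locks
for writes.  It relies on the Unix-only fcntl module.

```python
import fcntl
import json
import os
import tempfile


class FileBackedStorage:
    """Toy per-origin store: data lives in one file, guarded by flock."""

    def __init__(self, path):
        self.path = path
        # Create the file if it is missing, without truncating existing data.
        with open(self.path, "a"):
            pass

    @staticmethod
    def _load(f):
        f.seek(0)
        raw = f.read()
        return json.loads(raw) if raw else {}

    def get_item(self, key):
        with open(self.path, "r") as f:
            fcntl.flock(f, fcntl.LOCK_SH)  # shared lock: readers may overlap
            try:
                return self._load(f).get(key)
            finally:
                fcntl.flock(f, fcntl.LOCK_UN)

    def set_item(self, key, value):
        with open(self.path, "r+") as f:
            fcntl.flock(f, fcntl.LOCK_EX)  # exclusive lock: one writer
            try:
                data = self._load(f)
                data[key] = value
                f.seek(0)
                f.truncate()
                json.dump(data, f)
                f.flush()
            finally:
                fcntl.flock(f, fcntl.LOCK_UN)


# Two handles on the same file stand in for two renderer processes.
path = os.path.join(tempfile.mkdtemp(), "origin.json")
renderer_a = FileBackedStorage(path)
renderer_b = FileBackedStorage(path)
renderer_a.set_item("color", "red")
shared_value = renderer_b.get_item("color")
```

Every read and write here goes through system calls and re-parses the whole
file, which is the kind of overhead option #4 trades for lower memory use.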


*So here are my questions:*

How paranoid should we be about passing a user-created string to the
browser process and having it send the data on to the renderer and some
backend like sqlite?

Do we trust sqlite enough to use it outside of a sandbox?  (Hopefully,
because we're already doing this, right?  If not, are there other mechanisms
for storing the data on disk that we do trust?)

Would we feel more comfortable with #1 if the renderer processes somehow
mangled the keys and values before sending them out?  For example, they
could base64 encode them or even do something non-deterministic so that
attackers have no guarantee about what the memory passing through the
browser process would look like?
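
The two kinds of mangling mentioned above could look roughly like this in
Python.  This is a sketch under assumptions, not a proposal: the function
names are invented, and XOR with a random pad is just one possible
"non-deterministic" scheme.  Because the pad travels with the data, this is
not encryption; it only makes the bytes seen outside the renderer
unpredictable to whoever chose the string.

```python
import base64
import os


def mangle_base64(value: str) -> bytes:
    """Deterministic: the same input always yields the same bytes."""
    return base64.b64encode(value.encode("utf-8"))


def unmangle_base64(blob: bytes) -> str:
    return base64.b64decode(blob).decode("utf-8")


def mangle_random(value: str) -> bytes:
    """Non-deterministic: XOR with a fresh random pad, sent as pad + masked."""
    data = value.encode("utf-8")
    pad = os.urandom(len(data))
    masked = bytes(d ^ p for d, p in zip(data, pad))
    return pad + masked


def unmangle_random(blob: bytes) -> str:
    half = len(blob) // 2
    pad, masked = blob[:half], blob[half:]
    return bytes(m ^ p for m, p in zip(masked, pad)).decode("utf-8")
```

Both variants round-trip, so the renderer on the receiving side can recover
the original string.  The random variant doubles the size of every key and
value, which is part of the cost to weigh.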


And, most importantly, which option seems best to you?  (Or is there an
option 6 that I missed?)  I'd rank them 1, 2, 4, 3 personally.

--~--~---------~--~----~------------~-------~--~----~
Chromium Developers mailing list: chromium-dev@googlegroups.com 
View archives, change email options, or unsubscribe: 
    http://groups.google.com/group/chromium-dev
-~----------~----~----~----~------~----~------~--~---
