Our use of exclusive locking and page-cache preloading may open us up
more to this kind of shenanigans.  Basically SQLite will trust those
pages which we faulted into memory days ago.  We could mitigate
against that somewhat, but this problem reaches into areas we cannot
materially impact, such as filesystem caches.  And don't even begin to
imagine that there are not similar issues with commodity disk drives
and controllers.

That said, I don't think this is an incremental addition of any kind.
As I've pointed out before, there are things in the woods which
corrupt databases.  We could MAYBE reduce occurrences to a suitable
minimum using checksumming or something of the sort, but in the end
we still have to detect corruption and decide what course to take from
there.
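
A minimal sketch of what checksumming could look like, assuming a CRC32
appended to each page.  The page size and the helper names are
hypothetical and are not SQLite's actual page format.  It can detect a
flipped bit but cannot repair it, so the question of what to do after
detection remains.

```python
import zlib

PAGE_SIZE = 4096  # hypothetical page size, for illustration only
CRC_BYTES = 4


def seal_page(payload: bytes) -> bytes:
    """Append a CRC32 of the payload so later reads can be verified."""
    return payload + zlib.crc32(payload).to_bytes(CRC_BYTES, "big")


def check_page(page: bytes) -> bool:
    """Return True if the stored CRC32 still matches the payload."""
    payload, stored = page[:-CRC_BYTES], page[-CRC_BYTES:]
    return zlib.crc32(payload).to_bytes(CRC_BYTES, "big") == stored


# Build a sealed page, then simulate a single bit flip in a copy of it.
page = seal_page(b"\x00" * (PAGE_SIZE - CRC_BYTES))
flipped = bytearray(page)
flipped[100] ^= 0x01
flipped = bytes(flipped)
```

Here `check_page(page)` succeeds and `check_page(flipped)` fails, since
CRC32 catches any single-bit error.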

-scott


On Tue, Oct 6, 2009 at 4:59 PM, John Abd-El-Malek <j...@chromium.org> wrote:
>
>
> On Tue, Oct 6, 2009 at 4:30 PM, Carlos Pizano <c...@google.com> wrote:
>>
>> On Tue, Oct 6, 2009 at 4:14 PM, John Abd-El-Malek <j...@chromium.org>
>> wrote:
>> > I'm not sure how Carlos is doing it?  Will we know if something is
>> > corrupt just on load/save?
>>
>> Many sqlite calls can return SQLITE_CORRUPT, for example a query or an
>> insert.
>> We just check for error codes 1 to 26, with 5 or 6 of them being
>> serious errors such as SQLITE_CORRUPT.
>>
>> I am sure that random bit flips in memory and on disk are the cause of
>> some crashes; this is probably the 'limit' factor of how low the crash
>> rate of a perfect program deployed on millions of computers can go.
>
> The point I was trying to make is that the 'limit' factor as you put it is
> proportional to memory usage.  Given our large memory consumption in the
> browser process, the numbers from the paper imply dozens of corruptions just
> in sqlite memory per user.  Even if only a small fraction of these are
> harmful, spread over millions of users that's a lot of corruption.
>>
>> But I am unsure how to calculate this.  For example, a random bit flip
>> in the backing stores, which add up to at least 10MB on most machines,
>> does not hurt, nor does one in the middle of a cache entry or in the
>> data part of some structure.
>>
>>
>> > I imagine there's no way we can know when corruption
>> > happens in steady state and the next query leads to some other browser
>> > memory (or another database) getting corrupted?
>> >
>> > On Tue, Oct 6, 2009 at 3:58 PM, Huan Ren <hu...@google.com> wrote:
>> >>
>> >> It will be helpful to get our own measurement on database failures.
>> >> Carlos just added something like that.
>> >>
>> >> Huan
>> >>
>> >> On Tue, Oct 6, 2009 at 3:49 PM, John Abd-El-Malek <j...@chromium.org>
>> >> wrote:
>> >> > Saw this on
>> >> > slashdot: http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf
>> >> > The conclusion is "an average of 25,000–75,000 FIT (failures in time
>> >> > per billion hours of operation) per Mbit".
>> >> > On my machine the browser process is usually > 100MB, so that averages
>> >> > out to 176 to 493 errors per year, with those numbers having big
>> >> > variance depending on the machine.  Most users don't have ECC, which
>> >> > means this will lead to corruption.  Sqlite is a heavy user of memory,
>> >> > so even if it's 1/4 of the 100MB, that means we'll see an average of
>> >> > 40-120 errors per year naturally because of faulty DIMMs.
>> >> > Given that sqlite corruption means (repeated) crashing of the browser
>> >> > process, it seems this data heavily suggests we should move the sqlite
>> >> > code into a separate process.  The IPC overhead is negligible compared
>> >> > to disk access.  My hunch is that the complexity is also not that high,
>> >> > since the code that deals with it is already asynchronous, as we don't
>> >> > use sqlite on the UI/IO threads.
>> >> > What do others think?
>
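
The error-code check Carlos describes in the thread above might look
roughly like the sketch below.  The numeric values are SQLite's
documented primary result codes, but which of them count as "serious"
is an assumption made here for illustration (the thread does not list
them), and `classify` is a hypothetical helper, not Chromium code.

```python
# SQLite primary result codes (values from the SQLite documentation).
SQLITE_OK = 0
SQLITE_IOERR = 10
SQLITE_CORRUPT = 11
SQLITE_FULL = 13
SQLITE_CANTOPEN = 14
SQLITE_NOTADB = 26

# Assumption: this particular set of "serious" codes is a guess.
SERIOUS = frozenset(
    {SQLITE_IOERR, SQLITE_CORRUPT, SQLITE_FULL, SQLITE_CANTOPEN, SQLITE_NOTADB}
)


def classify(result_code: int) -> str:
    """Bucket a SQLite result code as 'ok', 'error', 'serious' or 'other'."""
    # Extended result codes keep the primary code in the low 8 bits.
    primary = result_code & 0xFF
    if primary == SQLITE_OK:
        return "ok"
    if 1 <= primary <= 26:
        return "serious" if primary in SERIOUS else "error"
    # Codes outside 1..26, such as SQLITE_ROW (100) and SQLITE_DONE (101).
    return "other"
```

A caller would run this on the result of every query or insert and
count the "serious" bucket separately, which is one way to get the
measurement Huan asks for.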

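The arithmetic behind the estimate in the original message can be
reproduced with a short sketch.  It assumes 1 MB = 8 Mbit and a process
that stays resident all year (8,760 hours); `errors_per_year` is a
hypothetical helper.  Under those assumptions 100MB works out to
roughly 175 to 526 errors per year, the same order of magnitude as the
176 to 493 quoted.

```python
HOURS_PER_YEAR = 24 * 365  # assumes the process runs continuously


def errors_per_year(fit_per_mbit: float, megabytes: float) -> float:
    """Expected memory errors per year for a given FIT rate and footprint.

    FIT is failures per billion (1e9) device-hours, here per Mbit.
    """
    mbits = megabytes * 8  # assumes 1 MB = 8 Mbit
    failures_per_hour = fit_per_mbit * mbits / 1e9
    return failures_per_hour * HOURS_PER_YEAR


low = errors_per_year(25_000, 100)
high = errors_per_year(75_000, 100)

# Same calculation for a quarter of the footprint (the sqlite share).
sqlite_low = errors_per_year(25_000, 25)
sqlite_high = errors_per_year(75_000, 25)
```

The estimate scales linearly with memory footprint, which is the point
John makes about the 'limit' factor being proportional to memory usage.
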
--~--~---------~--~----~------------~-------~--~----~
Chromium Developers mailing list: chromium-dev@googlegroups.com 
View archives, change email options, or unsubscribe: 
    http://groups.google.com/group/chromium-dev
-~----------~----~----~----~------~----~------~--~---
