[chromium-dev] Re: Spellchecker and memory-mapped dicts

Steve Vandebogart Thu, 22 Oct 2009 14:43:27 -0700

It's been a while since I looked at this, but the email I was able to dig up
suggests that madvise() is no faster than faulting in the mmap()ed region by
hand.  However, using posix_fadvise() should give the same speed as
read()ing the file into memory.  IIRC, though, posix_fadvise() will only
read so much in a single request and will leave the rest to readahead.
--
Steve
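
For reference, the posix_fadvise()-then-read() approach described above
could look roughly like the following on Linux. This is a minimal sketch,
not Chromium code: read_dict() is a hypothetical name, and treating an
empty file as an error is an assumption made for brevity.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read a whole file into a malloc()ed buffer.
 * Returns the buffer (caller must free() it) and stores the number of
 * bytes read in *len_out. Returns NULL on error or if the file is empty. */
char *read_dict(const char *path, size_t *len_out) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    size_t len = (size_t)st.st_size;

    /* Hint that the whole file will be needed soon (len 0 means "to end
     * of file"). This is advisory only, so a failure is ignored. */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);

    char *buf = malloc(len);
    if (buf == NULL) {
        close(fd);
        return NULL;
    }

    /* read() may return fewer bytes than requested, so loop. */
    size_t off = 0;
    while (off < len) {
        ssize_t n = read(fd, buf + off, len - off);
        if (n < 0) {
            if (errno == EINTR) continue;
            free(buf);
            close(fd);
            return NULL;
        }
        if (n == 0) break; /* file shrank while we were reading */
        off += (size_t)n;
    }

    close(fd);
    *len_out = off;
    return buf;
}
```

A caller would do something like `size_t n; char *d = read_dict(path, &n);`
on the file thread and free(d) when the dictionary is no longer needed.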


On Thu, Oct 22, 2009 at 2:27 PM, Scott Hess <[email protected]> wrote:

> On Linux, what about mmap() and then madvise() with MADV_WILLNEED (or
> posix_fadvise() with POSIX_FADV_WILLNEED on the file descriptor)?
>
> -scott
>
>
> On Thu, Oct 22, 2009 at 2:06 PM, Steve Vandebogart <[email protected]> wrote:
> > If you plan to read the entire file, mmap()ing it, then faulting it in
> > will be slower than read()ing it, at least in some Linux versions.  I
> > never pinned down exactly why, but I think the kernel read-ahead
> > mechanism works slightly differently.
> > --
> > Steve
> >
> > On Thu, Oct 22, 2009 at 2:02 PM, Chris Evans <[email protected]> wrote:
> >>
> >> There's also option 3)
> >> Pre-fault the mmap()ed region in the file thread upon dictionary
> >> initialization.
> >> On Linux at least, that may give you better behaviour than malloc() +
> >> read() in the event of memory pressure.
> >> Cheers
> >> Chris
> >>
> >> On Thu, Oct 22, 2009 at 1:39 PM, Evan Stade <[email protected]> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> At its last meeting the jank task force discussed improving
> >>> responsiveness of the spellchecker but we didn't come to a solid
> >>> conclusion so I thought I'd bring it up here to see if anyone else has
> >>> opinions. The main concern is that we don't block the IO thread on
> >>> file access. To this end, I recently moved initialization of the
> >>> spellchecker from the IO thread to the file thread. However, instead
> >>> of reading in the spellchecker dictionary in one solid chunk, we
> >>> memory-map it. Then later we check individual words on the IO thread,
> >>> which will be slow since the dictionary starts off effectively
> >>> completely paged out. The proposal is that we read in the dictionary
> >>> at spellchecker initialization instead of memory mapping it.
> >>>
> >>> Memory mapping pros:
> >>> - possibly uses less overall memory, depending on the structure of the
> >>> dictionary and the usage pattern of the user.
> >>> - [struck out] loading the dictionary doesn't block for a long time
> >>> (blocking no longer occurs either way due to my recent refactoring)
> >>>
> >>> Reading it all at once pros:
> >>> - costly disk accesses are kept to the file thread (excepting future
> >>> memory paging)
> >>> - overall disk access time is probably lower (since we can read in the
> >>> dict in one chunk)
> >>>
> >>> For reference, the English dictionary is about 500K and most
> >>> dictionaries are under 2 megs. Some (such as Hungarian) are much
> >>> larger, but no dictionary is over 10 megs.
> >>>
> >>> Opinions?
> >>>
> >>> -- Evan Stade
>
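
The two mmap()-based options raised in the thread, madvise() with
MADV_WILLNEED and pre-faulting the mapping by hand on the file thread,
could be combined roughly as follows. This is a minimal sketch under
stated assumptions, not Chromium code: map_and_prefault() is a
hypothetical name, and the 4096-byte fallback page size is an assumption.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes madvise() and MADV_WILLNEED in glibc */
#endif
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a file read-only, hint the kernel that the
 * whole range will be needed, then touch one byte per page so later
 * lookups do not take page faults for pages that are still resident.
 * Returns the mapping (release it with munmap()) or NULL on error or if
 * the file is empty. */
const unsigned char *map_and_prefault(const char *path, size_t *len_out) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    size_t len = (size_t)st.st_size;

    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED) return NULL;

    /* Advisory only, so a failure is ignored. */
    (void)madvise(p, len, MADV_WILLNEED);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) page = 4096; /* assumed fallback */

    /* Pre-fault: read one byte from every page. The volatile qualifier
     * keeps the compiler from optimizing the reads away. */
    const volatile unsigned char *bytes = p;
    unsigned char sink = 0;
    for (size_t off = 0; off < len; off += (size_t)page) {
        sink ^= bytes[off];
    }
    (void)sink;

    *len_out = len;
    return p;
}
```

Unlike a malloc()ed copy, these pages are file-backed, so under memory
pressure the kernel can drop them and re-read them from the file later,
which is the behaviour Chris alludes to above.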

--~--~---------~--~----~------------~-------~--~----~
Chromium Developers mailing list: [email protected] 
View archives, change email options, or unsubscribe: 
    http://groups.google.com/group/chromium-dev
-~----------~----~----~----~------~----~------~--~---
