[chromium-dev] Re: Spellchecker and memory-mapped dicts

Scott Hess Thu, 22 Oct 2009 14:53:22 -0700

Faulting it in by hand is only helpful if we're right!  If we're
wrong, it could evict other stuff from memory to support a feature
which a user may not use until the memory is faulted back out anyhow.


[From the rest of the thread, though, it sounds like maybe we should
just fix hunspell to be more efficient for our usage.]

-scott
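
For reference, a minimal sketch of what "faulting it in by hand" could look like on Linux, assuming a read-only dictionary file. The helper name map_and_prefault is hypothetical and is not taken from the Chromium or hunspell sources. It maps the file, passes the MADV_WILLNEED hint, and then touches one byte per page so the page faults are taken on the calling thread (the file thread, in this discussion).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map `path` read-only and pre-fault every page.
 * Returns the mapping (release it with munmap) or NULL on failure. */
const unsigned char *map_and_prefault(const char *path, size_t *len_out) {
    assert(path != NULL && len_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    size_t len = (size_t)st.st_size;

    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED) return NULL;

    /* Advisory only: ask the kernel to start reading the range ahead. */
    (void)madvise(p, len, MADV_WILLNEED);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) page = 4096; /* assumed fallback page size */

    /* Touch one byte per page; `volatile` keeps the loads from being
     * optimized away. */
    const unsigned char *bytes = p;
    volatile unsigned char sink = 0;
    for (size_t off = 0; off < len; off += (size_t)page) sink ^= bytes[off];
    (void)sink;

    *len_out = len;
    return bytes;
}
```

Nothing here pins the pages, so under memory pressure the kernel can still drop them again, which is the risk described above.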


On Thu, Oct 22, 2009 at 2:42 PM, Steve Vandebogart <[email protected]> wrote:
> It's been a while since I looked at this, but the email I was able to dig up
> suggests that madvise is no faster than faulting in the mmap()ed region by
> hand.  However, using posix_fadvise should give the same speeds as read()ing
> it into memory.  IIRC though, posix_fadvise will only read so much in a
> single request and will let readahead handle the rest after that.
> --
> Steve
>
> On Thu, Oct 22, 2009 at 2:27 PM, Scott Hess <[email protected]> wrote:
>>
>> On Linux, what about mmap() and then madvise() with MADV_WILLNEED?  (Or
>> posix_fadvise() with POSIX_FADV_WILLNEED on the file descriptor.)
>>
>> -scott
>>
>>
>> On Thu, Oct 22, 2009 at 2:06 PM, Steve Vandebogart <[email protected]>
>> wrote:
>> > If you plan to read the entire file, mmap()ing it, then faulting it
>> > in will be slower than read()ing it, at least in some Linux versions.
>> > I never pinned down exactly why, but I think the kernel read-ahead
>> > mechanism works slightly differently.
>> > --
>> > Steve
>> >
>> > On Thu, Oct 22, 2009 at 2:02 PM, Chris Evans <[email protected]>
>> > wrote:
>> >>
>> >> There's also option 3)
>> >> Pre-fault the mmap()ed region in the file thread upon dictionary
>> >> initialization.
>> >> On Linux at least, that may give you better behaviour than malloc() +
>> >> read() in the event of memory pressure.
>> >> Cheers
>> >> Chris
>> >>
>> >> On Thu, Oct 22, 2009 at 1:39 PM, Evan Stade <[email protected]>
>> >> wrote:
>> >>>
>> >>> Hi all,
>> >>>
>> >>> At its last meeting the jank task force discussed improving
>> >>> responsiveness of the spellchecker but we didn't come to a solid
>> >>> conclusion so I thought I'd bring it up here to see if anyone else has
>> >>> opinions. The main concern is that we don't block the IO thread on
>> >>> file access. To this end, I recently moved initialization of the
>> >>> spellchecker from the IO thread to the file thread. However, instead
>> >>> of reading in the spellchecker dictionary in one solid chunk, we
>> >>> memory-map it. Then later we check individual words on the IO thread,
>> >>> which will be slow since the dictionary starts off effectively
>> >>> completely paged out. The proposal is that we read in the dictionary
>> >>> at spellchecker initialization instead of memory mapping it.
>> >>>
>> >>> Memory mapping pros:
>> >>> - possibly uses less overall memory, depending on the structure of the
>> >>> dictionary and the usage pattern of the user.
>> >>> - <strike>loading the dictionary doesn't block for a long
>> >>> time</strike> this one no longer occurs either way due to my recent
>> >>> refactoring
>> >>>
>> >>> Reading it all at once pros:
>> >>> - costly disk accesses are kept to the file thread (excepting future
>> >>> memory paging)
>> >>> - overall disk access time is probably lower (since we can read in the
>> >>> dict in one chunk)
>> >>>
>> >>> For reference, the English dictionary is about 500K and most
>> >>> dictionaries are under 2 megs. Some (such as Hungarian) are much
>> >>> larger, but no dictionary is over 10 megs.
>> >>>
>> >>> Opinions?
>> >>>
>> >>> -- Evan Stade
>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >
>> >
>> >
>> >
>
>
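
For comparison with the mmap() options discussed in the thread, a minimal sketch of the "read it all at once" approach on Linux. The helper name read_whole_file is hypothetical. It passes the POSIX_FADV_WILLNEED hint on the descriptor and then read()s the whole file into a malloc()ed buffer, so the disk access stays on whichever thread calls it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read all of `path` into a heap buffer.
 * Returns the buffer (release it with free) or NULL on failure. */
unsigned char *read_whole_file(const char *path, size_t *len_out) {
    assert(path != NULL && len_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    size_t len = (size_t)st.st_size;

    /* Advisory only; a length of 0 means "through the end of the file". */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);

    unsigned char *buf = malloc(len);
    if (buf == NULL) {
        close(fd);
        return NULL;
    }

    /* read() may return fewer bytes than requested, so loop until done. */
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, buf + got, len - got);
        if (n < 0 && errno == EINTR) continue; /* interrupted, retry */
        if (n <= 0) break;                     /* error or unexpected EOF */
        got += (size_t)n;
    }
    close(fd);

    if (got != len) {
        free(buf);
        return NULL;
    }
    *len_out = len;
    return buf;
}
```

Unlike a file-backed mapping, this buffer is anonymous memory: under memory pressure the kernel has to write it to swap rather than simply discard the pages, which is the trade-off raised for option 3.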

--~--~---------~--~----~------------~-------~--~----~
Chromium Developers mailing list: [email protected] 
View archives, change email options, or unsubscribe: 
    http://groups.google.com/group/chromium-dev
-~----------~----~----~----~------~----~------~--~---
