[chromium-dev] Re: Spellchecker and memory-mapped dicts

Scott Hess Thu, 22 Oct 2009 14:27:26 -0700

On Linux, what about mmap() and then madvise() with MADV_WILLNEED?  (Or
posix_fadvise() with POSIX_FADV_WILLNEED on the file descriptor.)
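
For illustration, here is a minimal sketch of the mmap() + madvise() idea.
The helper name map_dict_willneed is hypothetical and is not taken from the
Chromium tree. MADV_WILLNEED is only a hint, so the kernel is free to ignore
it, and the sketch treats a failed madvise() as non-fatal.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1  /* exposes madvise() and MADV_WILLNEED on glibc */
#endif
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not Chromium code): map `path` read-only and ask the
 * kernel to start reading the whole file in the background.
 * Returns the mapping and stores its length in *len_out, or returns NULL
 * on failure. The caller releases the mapping with munmap(). */
const void *map_dict_willneed(const char *path, size_t *len_out) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    size_t len = (size_t)st.st_size;

    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED) return NULL;

    /* Advisory only: if the hint is rejected, pages still fault in lazily. */
    (void)madvise(p, len, MADV_WILLNEED);

    *len_out = len;
    return p;
}
```

The intent would be to call this on the file thread at spellchecker
initialization, so that later lookups on the IO thread are less likely to
hit a cold page. Whether that actually beats read() would need measuring.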


-scott


On Thu, Oct 22, 2009 at 2:06 PM, Steve Vandebogart <[email protected]> wrote:
> If you plan to read the entire file, mmap()ing it, then faulting it in will
> be slower than read()ing it, at least in some Linux versions.  I never
> pinned down exactly why, but I think the kernel read-ahead mechanism works
> slightly differently.
> --
> Steve
>
> On Thu, Oct 22, 2009 at 2:02 PM, Chris Evans <[email protected]> wrote:
>>
>> There's also option 3)
>> Pre-fault the mmap()ed region in the file thread upon dictionary
>> initialization.
>> On Linux at least, that may give you better behaviour than malloc() +
>> read() in the event of memory pressure.
>> Cheers
>> Chris
>>
>> On Thu, Oct 22, 2009 at 1:39 PM, Evan Stade <[email protected]> wrote:
>>>
>>> Hi all,
>>>
>>> At its last meeting the jank task force discussed improving
>>> responsiveness of the spellchecker but we didn't come to a solid
>>> conclusion so I thought I'd bring it up here to see if anyone else has
>>> opinions. The main concern is that we don't block the IO thread on
>>> file access. To this end, I recently moved initialization of the
>>> spellchecker from the IO thread to the file thread. However, instead
>>> of reading in the spellchecker dictionary in one solid chunk, we
>>> memory-map it. Then later we check individual words on the IO thread,
>>> which will be slow since the dictionary starts off effectively
>>> completely paged out. The proposal is that we read in the dictionary
>>> at spellchecker initialization instead of memory mapping it.
>>>
>>> Memory mapping pros:
>>> - possibly uses less overall memory, depending on the structure of the
>>> dictionary and the usage pattern of the user.
>>> - <strike>loading the dictionary doesn't block for a long
>>> time</strike> this one no longer occurs either way due to my recent
>>> refactoring
>>>
>>> Reading it all at once pros:
>>> - costly disk accesses are kept to the file thread (excepting future
>>> memory paging)
>>> - overall disk access time is probably lower (since we can read in the
>>> dict in one chunk)
>>>
>>> For reference, the English dictionary is about 500K, and most
>>> dictionaries are under 2 megs. Some (such as Hungarian) are much
>>> larger, but no dictionary is over 10 megs.
>>>
>>> Opinions?
>>>
>>> -- Evan Stade
>>>
>>>
>>
>>
>>
>
>
>

--~--~---------~--~----~------------~-------~--~----~
Chromium Developers mailing list: [email protected] 
View archives, change email options, or unsubscribe: 
    http://groups.google.com/group/chromium-dev
-~----------~----~----~----~------~----~------~--~---
