On Linux what about mmap() and then madvise() with MADV_WILLNEED? [or posix_fadvise() with POSIX_FADV_WILLNEED on the file descriptor).
-scott On Thu, Oct 22, 2009 at 2:06 PM, Steve Vandebogart <[email protected]> wrote: > If you plan to read the entire file, mmap()ing it, then faulting it in will > be slower than read()ing it, at least in some Linux versions. I never > pinned down exactly why, but I think the kernel read-ahead mechanism works > slightly differently. > -- > Steve > > On Thu, Oct 22, 2009 at 2:02 PM, Chris Evans <[email protected]> wrote: >> >> There's also option 3) >> Pre-fault the mmap()ed region in the file thread upon dictionary >> initialization. >> On Linux at least, that may give you better behaviour than malloc() + >> read() in the event of memory pressure. >> Cheers >> Chris >> >> On Thu, Oct 22, 2009 at 1:39 PM, Evan Stade <[email protected]> wrote: >>> >>> Hi all, >>> >>> At its last meeting the jank task force discussed improving >>> responsiveness of the spellchecker but we didn't come to a solid >>> conclusion so I thought I'd bring it up here to see if anyone else has >>> opinions. The main concern is that we don't block the IO thread on >>> file access. To this end, I recently moved initialization of the >>> spellchecker from the IO thread to the file thread. However, instead >>> of reading in the spellchecker dictionary in one solid chunk, we >>> memory-map it. Then later we check individual words on the IO thread, >>> which will be slow since the dictionary starts off effectively >>> completely paged out. The proposal is that we read in the dictionary >>> at spellchecker intialization instead of memory mapping it. >>> >>> Memory mapping pros: >>> - possibly uses less overall memory, depending on the structure of the >>> dictionary and the usage pattern of the user. >>> - <strike>loading the dictionary doesn't block for a long >>> time</strike> this one no longer occurs either way due to my recent >>> refactoring >>> >>> Reading it all at once pros: >>> - costly disk accesses are kept to the file thread (excepting future >>> memory paging) >>> - overall disk access time is probably lower (since we can read in the >>> dict in one chunk) >>> >>> For reference, the English dictionary is about 500K, and most >>> dictionaries are under 2 megs, some (such as Hungarian) are much >>> higher, but no dictionary is over 10 megs. >>> >>> Opinions? >>> >>> -- Evan Stade >>> >>> >> >> >> > > > > > --~--~---------~--~----~------------~-------~--~----~ Chromium Developers mailing list: [email protected] View archives, change email options, or unsubscribe: http://groups.google.com/group/chromium-dev -~----------~----~----~----~------~----~------~--~---
