[chromium-dev] Re: RFC: AutoFill++ Design Document
Ah, OK. That sounds like a good solution. I don't know how to implement that, but maybe there is already a way.
-Darin

On Tue, Oct 20, 2009 at 1:54 PM, Nick Baum nickb...@chromium.org wrote:
The mocks also show a preview of all the fields when hovering over or arrowing to one of the drop-down options. Is there a way to do that without showing the values to the page?
-Nick

On Tue, Oct 20, 2009 at 12:46 PM, James Hawkins jhawk...@chromium.org wrote:
The current design keys off autofill after the user enters a value in the name field. A drop-down menu is shown allowing the user to select which profile to autofill. If the user does not select a profile, the form will not be autofilled (using AutoFill++). I'll try to clarify this in the doc.
Thanks, James

On Tue, Oct 20, 2009 at 12:43 PM, Darin Fisher da...@chromium.org wrote:
One security concern: Autofill should not give the user's information to the page until the user makes some gesture to accept the autofill choices. For the existing autofill, this is done by having the user choose from the drop-down menu. The page can read the value of an INPUT field once it is set, so you have to be careful. I wonder if a solution already exists to support Safari's autofill.
-Darin

On Tue, Oct 20, 2009 at 10:13 AM, James Hawkins jhawk...@chromium.org wrote:
Hi, please read the inlined proposed design for AutoFill++. Any feedback and comments are appreciated.
Thanks, James Hawkins

*AutoFill++*
*Status:* *Draft* (as of 2009-10-16)
*James Hawkins* jhawk...@chromium.org
*Modified: Fri Oct 16 2009*

Objective

The purpose of this document is to describe a significant improvement in the current form autofill feature of Google Chrome. The current implementation is less of a form autofill and more of a form history. This approach has some limitations:
- Values are saved per-site, so a Name value saved on penguin.com will not be suggested for the Name field on turtle.com.
- The form must be filled out field-by-field.
I am proposing a new approach to form filling based on the client/server statistic-based model implemented by Google Toolbar. This feature provides the following benefits:
- Forms are completely auto-filled.
- Values are saved in user profiles, so the Name field can be auto-filled for a form on a site one has never visited.

Background

The main problem with the current autofill feature is that the user must move from field to field to fill out a form, even when each field can be auto-filled. The history is site-specific, so the user must enter form data for each site before autofill can be useful.

Overview

To use the AutoFill++ feature in Google Chrome, the user must enter personal information (name, address, email, etc.) into a form on any site and submit this form. An infobar will ask the user whether they want to enable autofill, and if so, AutoFill++ will parse the form data and query the autofill server for the field types. If the server knows the field types for the form on this particular site, AutoFill++ will match the data with the fields; otherwise, a heuristic based on the names of the fields and how they are laid out on the form is used to make this match. The map of (field, data) will then be saved in the user form data database.

When the user starts entering information into any other form, a list of saved profiles will be presented to the user in a selection drop-down. After the user picks a profile, AutoFill++ will autofill all fields that it recognizes and has data for in the database.

The user will be able to enter additional profile information at any time through the AutoFill++ dialog. This dialog will be shown after the user first enters form information, and will also be available through the options dialog.

Note that no personal information will be sent to the autofill server, just the field types determined by the heuristics.
Detailed Design

The first step in implementing AutoFill++ is to rename the current AutofillForm data structures to better match their behavior: FormFieldHistory.

Just like Toolbar AutoFill++, the client will implement some strategies to reduce the bandwidth used on the autofill server. These include:
- Ignore common search forms by ignoring forms that contain only one textfield with an id of q.
- Ignore forms using the GET method to submit form data.
- Ignore forms that have the word search in the target url for that form.
- Use a flag in the query response from the server to tell the client whether or not it should upload the data if a user submits this form.

*Data Structure* | *Description*
*AutoFillForm* | The container that holds the (field, value) pairs, parsed from HTML.
*RenderViewHostDelegate::AutoFill* | An interface implemented by a class that wants to be notified of AutoFill events from the RenderView.
*AutoFillService* | The main AutoFill++
[chromium-dev] Re: RFC: AutoFill++ Design Document
So, granting my name in one field grants the page access to my credit card in another? :-/
-Darin

On Tue, Oct 20, 2009 at 12:46 PM, Ben Goodger (Google) b...@chromium.org wrote:
The theory is that you would start typing in one of the fields, the traditional autofill UI (dropdown) would appear, and selecting an item would be equivalent to granting permission for the form to be filled out.
-Ben

On Tue, Oct 20, 2009 at 12:43 PM, Darin Fisher da...@chromium.org wrote:
One security concern: Autofill should not give the user's information to the page until the user makes some gesture to accept the autofill choices. For the existing autofill, this is done by having the user choose from the drop-down menu. The page can read the value of an INPUT field once it is set, so you have to be careful. I wonder if a solution already exists to support Safari's autofill.
-Darin

On Tue, Oct 20, 2009 at 10:13 AM, James Hawkins jhawk...@chromium.org wrote:
Hi, please read the inlined proposed design for AutoFill++. Any feedback and comments are appreciated.
Thanks, James Hawkins

*AutoFill++*
*Status:* *Draft* (as of 2009-10-16)
*James Hawkins* jhawk...@chromium.org
*Modified: Fri Oct 16 2009*

Objective

The purpose of this document is to describe a significant improvement in the current form autofill feature of Google Chrome. The current implementation is less of a form autofill and more of a form history. This approach has some limitations:
- Values are saved per-site, so a Name value saved on penguin.com will not be suggested for the Name field on turtle.com.
- The form must be filled out field-by-field.

I am proposing a new approach to form filling based on the client/server statistic-based model implemented by Google Toolbar. This feature provides the following benefits:
- Forms are completely auto-filled.
- Values are saved in user profiles, so the Name field can be auto-filled for a form on a site one has never visited.
Background

The main problem with the current autofill feature is that the user must move from field to field to fill out a form, even when each field can be auto-filled. The history is site-specific, so the user must enter form data for each site before autofill can be useful.

Overview

To use the AutoFill++ feature in Google Chrome, the user must enter personal information (name, address, email, etc.) into a form on any site and submit this form. An infobar will ask the user whether they want to enable autofill, and if so, AutoFill++ will parse the form data and query the autofill server for the field types. If the server knows the field types for the form on this particular site, AutoFill++ will match the data with the fields; otherwise, a heuristic based on the names of the fields and how they are laid out on the form is used to make this match. The map of (field, data) will then be saved in the user form data database.

When the user starts entering information into any other form, a list of saved profiles will be presented to the user in a selection drop-down. After the user picks a profile, AutoFill++ will autofill all fields that it recognizes and has data for in the database.

The user will be able to enter additional profile information at any time through the AutoFill++ dialog. This dialog will be shown after the user first enters form information, and will also be available through the options dialog.

Note that no personal information will be sent to the autofill server, just the field types determined by the heuristics.

Detailed Design

The first step in implementing AutoFill++ is to rename the current AutofillForm data structures to better match their behavior: FormFieldHistory.

Just like Toolbar AutoFill++, the client will implement some strategies to reduce the bandwidth used on the autofill server. These include:
- Ignore common search forms by ignoring forms that contain only one textfield with an id of q.
- Ignore forms using the GET method to submit form data.
- Ignore forms that have the word search in the target url for that form.
- Use a flag in the query response from the server to tell the client whether or not it should upload the data if a user submits this form.

*Data Structure* | *Description*
*AutoFillForm* | The container that holds the (field, value) pairs, parsed from HTML.
*RenderViewHostDelegate::AutoFill* | An interface implemented by a class that wants to be notified of AutoFill events from the RenderView.
*AutoFillService* | The main AutoFill++ class. Coordinates between the RenderView, the autofill server, and the PersonalDatabaseManager. Each TabContents has one, and each instance is lazily created.
*AutoFillQueryXMLParser* | Builds and parses XML queries that are sent to and received from the autofill server.
*AutoFillDownloadThread* | Spins off a new thread to download/upload data to the autofill server.
*AutoFillProfileDialog*
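The four bandwidth-saving checks listed under Detailed Design could be sketched roughly as follows. This is only an illustration, not code from the design document: FormField, FormInfo, ToLower and ShouldUploadForm are hypothetical names, and reading "only one textfield with an id of q" as "exactly one text field, whose id is q" is an assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical, simplified view of a parsed form (not the AutoFillForm class).
struct FormField {
  std::string type;  // e.g. "text", "hidden"
  std::string id;
};

struct FormInfo {
  std::string method;      // "GET" or "POST", any case
  std::string action_url;  // target url of the form
  std::vector<FormField> fields;
};

// Lower-cases ASCII so that comparisons are case-insensitive.
std::string ToLower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Returns true if the client should upload data for this form.
// server_allows_upload stands in for the flag in the server's query response.
bool ShouldUploadForm(const FormInfo& form, bool server_allows_upload) {
  // Rule 4: the server told the client not to upload for this form.
  if (!server_allows_upload) return false;

  // Rule 2: ignore forms submitted with GET.
  if (ToLower(form.method) == "get") return false;

  // Rule 3: ignore forms whose target url contains the word "search".
  if (ToLower(form.action_url).find("search") != std::string::npos)
    return false;

  // Rule 1: ignore forms with exactly one text field whose id is "q".
  int text_fields = 0;
  bool has_q = false;
  for (const FormField& field : form.fields) {
    if (ToLower(field.type) == "text") {
      ++text_fields;
      if (field.id == "q") has_q = true;
    }
  }
  if (text_fields == 1 && has_q) return false;

  return true;
}
```

The cheap, local checks could equally run before the server is queried at all; the ordering here is a choice made for the sketch.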
[chromium-dev] Tab Thumbnails and Aero Peek (of Windows 7)
Greetings Chromium developers,

(Please feel free to ignore this e-mail if you are not interested in Windows 7.)

To celebrate the launch of Windows 7, I have written a document that describes my prototype implementation that integrates Tab Thumbnails and Aero Peek into Chromium. (My current prototype has some problems, though.) Your comments and suggestions are definitely welcome. Sorry in advance for sending a huge e-mail.

Regards,
Hironori Bono
E-mail: hb...@chromium.org

*Tab Thumbnails and Aero Peek*

Objective

This document is a report of my prototype implementation that integrates two new features of Windows 7 (Tab Thumbnails and Aero Peek) into Chromium.

Background

Windows 7, the new operating system from Microsoft, has added some new features to its taskbar[1]: JumpList, Tab Thumbnails, Aero Peek, etc. Among these new features, Tab Thumbnails and Aero Peek are designed to improve the user experience of TDI (Tab-Document-Interface) applications. In fact, users want Chrome to support them and have filed an issue in our buganizer (Issue 6337[2]).

*What are Tab Thumbnails and Aero Peek?*

In Windows 7, an application can add (or remove) thumbnails for each of its tabs, so that hovering over its taskbar button displays them as shown in Figure 1. Each thumbnail can have its own title and icon. (A thumbnail can also have its own buttons, even though this example doesn't use them.) The focused tab is shown as a highlighted thumbnail in this thumbnail list.

Figure 1 Tab Thumbnails

As shown in Figure 2, hovering the mouse cursor over a thumbnail shows the preview image of the corresponding tab, and clicking a thumbnail selects that tab. (This feature is called Aero Peek.)
Figure 2 Aero Peek

Also, clicking the close button of a thumbnail (shown at the top-right corner of the focused thumbnail) closes the corresponding tab.

Figure 3 Close button of a thumbnail

Implementing Tab Thumbnails and Aero Peek

Unfortunately, unless an application uses the new APIs provided by Windows 7, Windows 7 just shows the thumbnail of the application window. This section provides a generic overview of implementing Tab Thumbnails and Aero Peek. (The information in this section is just the result of my investigation, and may be incorrect.)

How to add a custom thumbnail?

First, we need to add a custom thumbnail to the Tab-Thumbnail list of Windows 7. To add a custom thumbnail, we need a (tool) window[3] that receives messages from Windows 7. Windows 7 sends the following messages to notify the window of thumbnail-specific events, so we need to create a place-holder window that handles these events for each tab.

1. WM_CREATE (0x0001)
This message is sent when a window has been created. An application has to call DwmSetWindowAttribute() to notify Windows 7 that this window can handle WM_DWMSENDICONICTHUMBNAIL and WM_DWMSENDICONICLIVEPREVIEWBITMAP messages[4]. Also, an application needs to call ITaskbarList3::RegisterTab() to add this window to the Tab-Thumbnail list of the application[5].

2. WM_ACTIVATE (0x0006)
This message is sent when a user clicks the thumbnail image to select its corresponding tab.

3. WM_CLOSE (0x0010)
This message is sent when a user clicks the close button of a thumbnail. An application has to call ITaskbarList3::UnregisterTab() to remove this window from the Tab-Thumbnail list before calling DestroyWindow().

4. WM_DWMSENDICONICTHUMBNAIL (0x0323)
This message is sent when Windows 7 needs the thumbnail image of the corresponding tab. An application has to create a thumbnail bitmap (whose size is given as the LPARAM parameter of this message[6]) and send it to Windows 7 through a DwmSetIconicThumbnail() call. (Windows 7 shows a "loading" animation until the application sends a bitmap to Windows 7.)

5. WM_DWMSENDICONICLIVEPREVIEWBITMAP (0x0326)
This message is sent when Windows 7 needs the preview image of the corresponding tab. An application has to create a preview bitmap and send it to Windows 7 through a DwmSetIconicLivePreviewBitmap() call.

How to update the thumbnail image of a tab?

Windows 7 can send WM_DWMSENDICONICTHUMBNAIL messages to a place-holder window when it needs its thumbnail image. On the other hand, when an application needs to update the thumbnail image of a tab, it has to call DwmInvalidateIconicBitmaps() and let Windows 7 send a WM_DWMSENDICONICTHUMBNAIL message. It makes Windows 7 unhappy and usually
[chromium-dev] Re: WebKit API wrapper for Document
Hi Darin,

On Thu, Oct 22, 2009 at 2:50 AM, Darin Fisher da...@chromium.org wrote:
Marshall, For now, can you just use NPObject to access the DOM? See WebBindings for implementation of NPRuntime methods. Long term, I am interested in reflecting the full DOM via the WebKit API, but I don't want to rush the design. At present, we are pretty busy just trying to complete the core WebKit API. I want to stay focused on that.

I completely understand your priorities and support the desire not to rush the design of new features for inclusion in WebKit. CEF uses reference-counted C++ classes so I think directly accessing WebKit or WebCore objects will simplify the implementation as compared to using NPObjects. For now I'll proceed with my own WebCore DOM wrapper implementation as part of CEF. I should have lots of good suggestions when the time comes to discuss the wrapper implementation for WebKit :-).

-Darin

On Tue, Oct 20, 2009 at 5:28 PM, Jeremy Orlow jor...@chromium.org wrote:
Darin knows for sure, but I'm not aware of any intentions on Google's part to engineer such an elaborate API. As long as it didn't add a major maintenance burden (i.e. exposed things similar to one of the other WebKit APIs) I'd imagine patches would be welcome though. I believe only Darin can speak to this authoritatively, though.
J

On Tue, Oct 20, 2009 at 4:00 PM, Marshall Greenblatt magreenbl...@gmail.com wrote:

On Tue, Oct 20, 2009 at 5:33 PM, Adam Barth aba...@chromium.org wrote:
It seems like we need to draw the line somewhere. Otherwise, we'll end up exposing the whole DOM via the WebKit API. Where do you think the optimum cut-off is?

I think treating the DOM as an XML-ish object tree would be the most reasonable approach. This means:
1. The ability to walk the DOM hierarchy by following parent/child relationships.
2. The ability to get/set DOM attributes.
3. The ability to create/delete DOM objects at any level in the hierarchy.
4. The ability to set event listeners on DOM objects (perhaps using a v8::Function as the listener).
Inputs would need to be sanity-checked by the API based on the underlying object context, but I don't think we need to provide separate API classes/methods for each possibility.

Adam

On Tue, Oct 20, 2009 at 1:55 PM, Marshall Greenblatt magreenbl...@gmail.com wrote:
Hi All, The Chromium WebKit API does not currently provide a wrapper for the WebCore::Document object associated with a WebCore::Frame. CEF (http://code.google.com/p/chromiumembedded), which also uses the WebKit API, would like access to this object at the C++ level. Is there interest in the broader Chromium community for having a Document wrapper as part of the WebKit API?
Thanks, Marshall

--~--~-~--~~~---~--~~ Chromium Developers mailing list: chromium-dev@googlegroups.com View archives, change email options, or unsubscribe: http://groups.google.com/group/chromium-dev -~--~~~~--~~--~--~---
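To make the "XML-ish object tree" idea from this thread concrete, here is a minimal sketch of what such a wrapper surface might look like. It is purely illustrative: DomNode and all of its methods are hypothetical names, not part of the WebKit API, WebCore, or CEF, and std::function stands in for the v8::Function listener mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical generic node: one class covers every element type, as the
// thread suggests, instead of one wrapper class per DOM interface.
class DomNode {
 public:
  using Listener = std::function<void(const DomNode&)>;

  explicit DomNode(std::string tag) : tag_(std::move(tag)) {}

  // 1. Walk the hierarchy through parent/child relationships.
  const std::string& tag() const { return tag_; }
  DomNode* parent() const { return parent_; }
  const std::vector<std::shared_ptr<DomNode>>& children() const {
    return children_;
  }

  // 2. Get/set attributes by name.
  void SetAttribute(const std::string& name, const std::string& value) {
    attributes_[name] = value;
  }
  bool GetAttribute(const std::string& name, std::string* value) const {
    assert(value != nullptr);
    auto it = attributes_.find(name);
    if (it == attributes_.end()) return false;
    *value = it->second;
    return true;
  }

  // 3. Create/delete nodes at any level.
  std::shared_ptr<DomNode> AppendChild(const std::string& tag) {
    auto child = std::make_shared<DomNode>(tag);
    child->parent_ = this;
    children_.push_back(child);
    return child;
  }
  bool RemoveChild(const DomNode* child) {
    for (std::size_t i = 0; i < children_.size(); ++i) {
      if (children_[i].get() == child) {
        children_[i]->parent_ = nullptr;
        children_.erase(children_.begin() + i);
        return true;
      }
    }
    return false;
  }

  // 4. Event listeners; returns how many listeners were invoked.
  void AddEventListener(const std::string& event, Listener listener) {
    listeners_[event].push_back(std::move(listener));
  }
  std::size_t DispatchEvent(const std::string& event) const {
    auto it = listeners_.find(event);
    if (it == listeners_.end()) return 0;
    for (const Listener& listener : it->second) listener(*this);
    return it->second.size();
  }

 private:
  std::string tag_;
  DomNode* parent_ = nullptr;
  std::map<std::string, std::string> attributes_;
  std::vector<std::shared_ptr<DomNode>> children_;
  std::map<std::string, std::vector<Listener>> listeners_;
};
```

A real wrapper would forward each call to the underlying WebCore object and sanity-check inputs against that object's type, which this standalone sketch does not attempt.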
[chromium-dev] Re: Tab Thumbnails and Aero Peek (of Windows 7)
Hi Hironori,

Thanks for researching this topic. On generating thumbnails, I wrote chrome/browser/tab_contents/thumbnail_generator, which tries to get the most up-to-date thumbnail images for all tabs (except ones that haven't ever been shown) with minimal performance overhead. The code hasn't been used much in practice yet, but seems to basically work. It was written for a tab switcher, and the intent is that it will provide thumbnails for all services in the future (including the new tab page and the Windows 7 features) for all platforms. So I would definitely use that, and we should work on fixing any bugs or limitations that pop up.

It's defaulted to off for most builds because it has a runtime overhead and it's not currently used. If you want to turn it on for your build, it's started in the constructor for TabContents.

Brett
[chromium-dev] Re: WebKit API wrapper for Document
OK, sounds good. Thanks for your patience. -Darin On Thu, Oct 22, 2009 at 7:09 AM, Marshall Greenblatt magreenbl...@gmail.com wrote: Hi Darin, On Thu, Oct 22, 2009 at 2:50 AM, Darin Fisher da...@chromium.org wrote: Marshall, For now, can you just use NPObject to access the DOM? See WebBindings for implementation of NPRuntime methods. Long term, I am interested in reflecting the full DOM via the WebKit API, but I don't want to rush the design. At present, we are pretty busy just trying to complete the core WebKit API. I want to stay focused on that. I completely understand your priorities and support the desire not to rush the design of new features for inclusion in WebKit. CEF uses reference-counted C++ classes so I think directly accessing WebKit or WebCore objects will simplify the implementation as compared to using NPObjects. For now I'll proceed with my own WebCore DOM wrapper implementation as part of CEF. I should have lots of good suggestions when the time comes to discuss the wrapper implementation for WebKit :-). -Darin On Tue, Oct 20, 2009 at 5:28 PM, Jeremy Orlow jor...@chromium.orgwrote: Darin knows for sure, but I'm not aware of any intentions on Google's part to engineer such an elaborate API. As long as it didn't add a major maintenance burden (i.e. exposed things similar to one of the other WebKit APIs) I'd imagine patches would be welcome though. I believe only Darin can speak to this authoritatively, though. J On Tue, Oct 20, 2009 at 4:00 PM, Marshall Greenblatt magreenbl...@gmail.com wrote: On Tue, Oct 20, 2009 at 5:33 PM, Adam Barth aba...@chromium.orgwrote: It seems like we need to draw the line somewhere. Otherwise, we'll end up exposing the whole DOM via the WebKit API. Where do you think the optimum cut-off is? I think treating the DOM as an XML-ish object tree would be the most reasonable approach. This means: 1. The ability to walk the DOM hierarchy by following parent/child relationships. 2. The ability to get/set DOM attributes. 3. 
The ability to create/delete DOM objects at any level in the hierarchy. 4. The ability to set event listeners on DOM objects (perhaps using a v8::Function as the listener). Inputs would need to be sanity-checked by the API based on the underlying object context, but I don't think we need to provide separate API classes/methods for each possibility. Adam On Tue, Oct 20, 2009 at 1:55 PM, Marshall Greenblatt magreenbl...@gmail.com wrote: Hi All, The Chromium WebKit API does not currently provide a wrapper for the WebCore::Document object associated with a WebCore::Frame. CEF (http://code.google.com/p/chromiumembedded), which also uses the WebKit API, would like access to this object at the C++ level. Is there interest in the broader Chromium community for having a Document wrapper as part of the WebKit API? Thanks, Marshall --~--~-~--~~~---~--~~ Chromium Developers mailing list: chromium-dev@googlegroups.com View archives, change email options, or unsubscribe: http://groups.google.com/group/chromium-dev -~--~~~~--~~--~--~---
[chromium-dev] Re: RFC: AutoFill++ Design Document
We need to come up with a way, if we also want to implement omnibox-style inline autocomplete for web page autofill. (a separate bug). -Ben On Wed, Oct 21, 2009 at 11:52 PM, Darin Fisher da...@chromium.org wrote: Ah, OK. That sounds like a good solution. I don't know how to implement that, but maybe there is already a way. -Darin On Tue, Oct 20, 2009 at 1:54 PM, Nick Baum nickb...@chromium.org wrote: The mocks also show a preview of all the fields when hovering over or arrowing to one of the drop-down options. Is there a way to do that without showing the values to the page? -Nick On Tue, Oct 20, 2009 at 12:46 PM, James Hawkins jhawk...@chromium.orgwrote: The current design keys off autofill after the user enters a value in the name field. A drop down menu is shown allowing the user to select which profile to autofill. If the user does not select a profile, the form will not be autofilled (using AutoFill++). I'll try to clarify this in the doc. Thanks, James On Tue, Oct 20, 2009 at 12:43 PM, Darin Fisher da...@chromium.orgwrote: One security concern: Autofill should not give users information to the page until the user makes some gesture to accept the autofill choices. For the existing autofill, this is done by having the user choose from the drop down menu. The page can read the value of an INPUT field once it is set, so you have to be careful. I wonder if a solution already exists to support Safari's autofill. -Darin On Tue, Oct 20, 2009 at 10:13 AM, James Hawkins jhawk...@chromium.orgwrote: Hi, please read the inlined proposed design for AutoFill++. Any feedback and comments are appreciated. Thanks, James Hawkins *AutoFill++* *Status:* *Draft* (as of 2009-10-16) *James Hawkins** jhawk...@chromium.org Modified: Fri Oct 16 2009 * Objective The purpose of this document is to describe a significant improvement in the current form autofill feature of Google Chrome. The current implementation is less of a form autofill and more of a form history. 
[chromium-dev] Re: RFC: AutoFill++ Design Document
On Thu, Oct 22, 2009 at 9:39 AM, Ben Goodger (Google) b...@chromium.org wrote:
We need to come up with a way, if we also want to implement omnibox-style inline autocomplete for web page autofill. (a separate bug).

Passing comment: This would be really cool but potentially a lot of work depending on the underlying data structures. We definitely can't reuse the Omnibox machinery. Just didn't want anyone's hopes to be artificially high.

PK
[chromium-dev] Re: RFC: AutoFill++ Design Document
Right. I just want to match the inline behavior... using the omnibox these days I tend to notice when I have to down arrow into things to complete, so it's definitely a pain point for in-page autocomplete.
-Ben

On Thu, Oct 22, 2009 at 9:56 AM, Peter Kasting pkast...@google.com wrote:

On Thu, Oct 22, 2009 at 9:39 AM, Ben Goodger (Google) b...@chromium.org wrote:
We need to come up with a way, if we also want to implement omnibox-style inline autocomplete for web page autofill. (a separate bug).

Passing comment: This would be really cool but potentially a lot of work depending on the underlying data structures. We definitely can't reuse the Omnibox machinery. Just didn't want anyone's hopes to be artificially high.

PK
[chromium-dev] Re: Linux/Stability - Remember to check the return value from GTK functions.
On Wed, Oct 21, 2009 at 10:09 PM, Lei Zhang thes...@chromium.org wrote:
Hi Linux folks,

This is a kind reminder to check the return values from GTK functions. Every time you put the unchecked result from, say, gtk_file_chooser_get_filename() into a FilePath or std::string, you risk a browser process crash if the result is NULL. I just triaged several crashes of this type. There are probably more...

http://code.google.com/p/chromium/issues/detail?id=25490
http://code.google.com/p/chromium/issues/detail?id=25491
http://code.google.com/p/chromium/issues/detail?id=25493
http://code.google.com/p/chromium/issues/detail?id=25494

Let me couch this by noting that if the return value is const gchar* then it's usually guaranteed to be non-null.
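One way to make the check hard to forget is to route every nullable C string through a small helper before it reaches std::string. The sketch below is an assumption about how that might look, not existing Chromium code: TakeNullableString is a hypothetical name, and the deallocator is passed in so the example compiles without GTK headers.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: copies a heap-allocated C string that may be NULL
// into a std::string, then releases the original with |free_fn|.
// Constructing std::string directly from a NULL pointer is undefined
// behavior, which is the crash described in the message above.
std::string TakeNullableString(char* owned, void (*free_fn)(void*)) {
  assert(free_fn != nullptr);
  if (!owned) return std::string();  // NULL result: report "no value".
  std::string copy(owned);
  free_fn(owned);
  return copy;
}

// Intended use with GTK (shown as a comment so the block has no GTK
// dependency); gtk_file_chooser_get_filename() returns a non-const gchar*
// that can be NULL and must be released with g_free():
//
//   std::string name = TakeNullableString(
//       gtk_file_chooser_get_filename(chooser), g_free);
//   if (name.empty()) return;  // nothing selected
```

Callers still have to decide what an empty result means, but they can no longer crash by forgetting the NULL test.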
[chromium-dev] Spellchecker and memory-mapped dicts
Hi all,

At its last meeting the jank task force discussed improving responsiveness of the spellchecker, but we didn't come to a solid conclusion, so I thought I'd bring it up here to see if anyone else has opinions.

The main concern is that we don't block the IO thread on file access. To this end, I recently moved initialization of the spellchecker from the IO thread to the file thread. However, instead of reading in the spellchecker dictionary in one solid chunk, we memory-map it. Then later we check individual words on the IO thread, which will be slow since the dictionary starts off effectively completely paged out. The proposal is that we read in the dictionary at spellchecker initialization instead of memory mapping it.

Memory mapping pros:
- possibly uses less overall memory, depending on the structure of the dictionary and the usage pattern of the user.
- (struck out: loading the dictionary doesn't block for a long time) this one no longer occurs either way due to my recent refactoring

Reading it all at once pros:
- costly disk accesses are kept to the file thread (excepting future memory paging)
- overall disk access time is probably lower (since we can read in the dict in one chunk)

For reference, the English dictionary is about 500K, and most dictionaries are under 2 megs; some (such as Hungarian) are much larger, but no dictionary is over 10 megs.

Opinions?

-- Evan Stade
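For comparison, the "read it all at once" option amounts to something like the sketch below, which would run on the file thread. It is a minimal illustration using only the C++ standard library; ReadWholeFile is a hypothetical name and is not the spellchecker's actual loading code.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads an entire file into memory with a single bulk
// read. Returns false if the file cannot be opened or read completely.
bool ReadWholeFile(const std::string& path, std::vector<char>* out) {
  assert(out != nullptr);
  // Open at the end so tellg() reports the file size.
  std::ifstream in(path, std::ios::binary | std::ios::ate);
  if (!in) return false;
  const std::streamoff size = in.tellg();
  if (size < 0) return false;
  out->resize(static_cast<std::size_t>(size));
  in.seekg(0, std::ios::beg);
  // One read() call for the whole dictionary; later lookups touch only
  // memory that this thread has already filled in.
  if (size > 0 && !in.read(out->data(), size)) return false;
  return true;
}
```

With dictionaries of at most about 10 megs, as stated above, the buffer stays small, although this memory can still be paged out later under pressure.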
[chromium-dev] Re: Spellchecker and memory-mapped dicts
It seems like loading into memory will result in more predictable access times for the initial set of words that get spellchecked (up to the point where the memory-mapped file would have been entirely paged in). If you combine this with my memory purger code that will (hopefully) result in the dictionary getting dumped out of memory occasionally, which causes the behavior right after open to become more significant, I think loading into memory is a win. I doubt the dictionaries are structured such that memory-mapping the file will reduce the browser process memory footprint in a meaningful way.

PK
[chromium-dev] Re: Spellchecker and memory-mapped dicts
There's also option 3) Pre-fault the mmap()ed region in the file thread upon dictionary initialization. On Linux at least, that may give you better behaviour than malloc() + read() in the event of memory pressure.

Cheers
Chris
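Option 3 could be sketched roughly as follows, again in Python for brevity. The function names are hypothetical, and touching one byte per page is just one simple way to pre-fault a mapping; the thread does not say how Chromium would do it.

```python
import mmap
import os
import tempfile


def prefault(mapping):
    """Touch one byte in every page so later lookups avoid page faults."""
    checksum = 0
    for offset in range(0, len(mapping), mmap.PAGESIZE):
        checksum ^= mapping[offset]  # reading the byte forces the page in
    return checksum


def map_and_prefault(path):
    """Map the file read-only, pre-fault it, and report how many pages were touched."""
    with open(path, "rb") as f:
        mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    touched_pages = -(-len(mapping) // mmap.PAGESIZE)  # ceiling division
    prefault(mapping)
    return mapping, touched_pages


# Hypothetical sample file: three full pages plus one extra byte.
fd, path = tempfile.mkstemp(suffix=".dic")
with os.fdopen(fd, "wb") as f:
    f.write(b"x" * (3 * mmap.PAGESIZE + 1))

mapping, touched_pages = map_and_prefault(path)
first_byte = mapping[0]
mapping.close()
os.remove(path)
```

In the scenario described above, the pre-fault loop would run on the file thread at initialization, so the IO thread would find the pages already resident unless they had been evicted in the meantime.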
[chromium-dev] Re: Spellchecker and memory-mapped dicts
If you plan to read the entire file, mmap()ing it and then faulting it in will be slower than read()ing it, at least in some Linux versions. I never pinned down exactly why, but I think the kernel read-ahead mechanism works slightly differently.

-- Steve
[chromium-dev] Re: Spellchecker and memory-mapped dicts
On Thu, Oct 22, 2009 at 2:02 PM, Chris Evans cev...@chromium.org wrote:
There's also option 3) Pre-fault the mmap()ed region in the file thread upon dictionary initialization. On Linux at least, that may give you better behaviour than malloc() + read() in the event of memory pressure.

On Windows, I believe that will either be equivalent to loading everything into a memory data structure or very slightly worse.

I forgot to mention one facet of loading into memory. If we need to, it is probably easier for us to design the memory data structure so that most hits occur in a smaller region of pages, and are thus more friendly to memory pressure, than it is to structure the files on disk to have this property.

PK
[chromium-dev] Re: Spellchecker and memory-mapped dicts
I've thought about this some (I wrote the memory map thing there now).

History of the spellchecker:
v1: Per-process Hunspell storage (lots of memory duplicated in each renderer, expensive to load).
v2: Browser-process Hunspell storage (lots of memory, expensive to load, only occurs once).
v3: Browser-process memmap (less memory, cheap to load, only occurs once).

I would like to consider moving Hunspell back to the renderer so we can avoid sync IPCs and blocking the I/O thread on spellchecking. Spellchecking isn't fast (especially suggestions) even when everything is in memory, so it always sucks to have it block the I/O thread. Now that it can be memmapped, each renderer can memmap its own image of the data. This doesn't help on Mac, where we want to use the system spellchecker. There would also be some amount of duplication, since there are certain tables that are initialized once at the beginning (I don't think it's that big, though).

I would suggest first making the current histograms in the spellchecker.cc file UMA (currently they're debug-only local ones) so we can see how much blocking we're getting from Hunspell in the field.

Brett
[chromium-dev] Re: Spellchecker and memory-mapped dicts
On Linux, what about mmap() and then madvise() with MADV_WILLNEED? (Or posix_fadvise() with POSIX_FADV_WILLNEED on the file descriptor.)

-scott
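The mmap() plus madvise() suggestion might look roughly like this. It is a Python sketch with a hypothetical helper name; `mmap.madvise()` and `mmap.MADV_WILLNEED` are only present in Python 3.8+ on platforms that provide madvise(), so the sketch checks for them and treats the hint as optional.

```python
import mmap
import os
import tempfile


def map_with_willneed(path):
    """Map a file read-only and hint that the whole range will be needed soon."""
    with open(path, "rb") as f:
        mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    hinted = False
    if hasattr(mapping, "madvise") and hasattr(mmap, "MADV_WILLNEED"):
        try:
            # With no start/length arguments the advice covers the whole mapping.
            mapping.madvise(mmap.MADV_WILLNEED)
            hinted = True
        except OSError:
            # The hint is advisory; the mapping is still usable without it.
            hinted = False
    return mapping, hinted


# Hypothetical sample file standing in for a dictionary.
fd, path = tempfile.mkstemp(suffix=".dic")
with os.fdopen(fd, "wb") as f:
    f.write(b"hello\nworld\n")

mapping, hinted = map_with_willneed(path)
contents = mapping[:]
mapping.close()
os.remove(path)
```

Unlike touching every page by hand, the call returns without waiting for the data, and how much the kernel actually reads ahead is left to the kernel.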
[chromium-dev] Re: Spellchecker and memory-mapped dicts
On Thu, Oct 22, 2009 at 2:22 PM, Brett Wilson bre...@chromium.org wrote:
I would like to consider moving hunspell back to the renderer so we can avoid sync IPCs and blocking the I/O thread on spellchecking.

That would also be a stability win. Currently, any hunspell crashes due to bust dictionaries take down the entire browser.

Cheers
Chris
[chromium-dev] Re: Spellchecker and memory-mapped dicts
On Thu, Oct 22, 2009 at 2:22 PM, Brett Wilson bre...@chromium.org wrote:
This doesn't help on Mac where we want to use the system spellchecker.

FYI, we got a patch to use the system spellchecker on Linux as well. http://code.google.com/p/chromium/issues/detail?id=24517 I should probably ping the original uploader again...

This Ubuntu document describes some use cases as to why unification is good: https://wiki.ubuntu.com/ConsolidateSpellingLibs

On the other hand, ChromeOS will certainly benefit from this.
[chromium-dev] Re: Spellchecker and memory-mapped dicts
It's been a while since I looked at this, but the email I was able to dig up suggests that madvise is no faster than faulting in the mmap()ed region by hand. However, using posix_fadvise should give the same speeds as read()ing it into memory. IIRC though, posix_fadvise will only read so much in a single request and will let readahead handle the rest after that.

-- Steve
[chromium-dev] Re: Spellchecker and memory-mapped dicts
Faulting it in by hand is only helpful if we're right! If we're wrong, it could evict other stuff from memory to support a feature which a user may not use until the memory is faulted back out anyhow.

[From the rest of the thread, though, it sounds like maybe we should just fix hunspell to be more efficient for our usage.]

-scott
[chromium-dev] Re: Spellchecker and memory-mapped dicts
+1 on moving spell to the renderers. We can memory map in the browser and map again in the renderers, hopefully read-only. We eliminate the sync IPC and do not increase the memory usage.
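A rough sketch of the "map in the browser, map again in the renderers" idea, using an ordinary child process as a stand-in for a renderer. Everything here (the snippet, helper name, and word list) is hypothetical and written in Python for brevity; it only illustrates that two processes can each hold their own read-only mapping of the same dictionary file.

```python
import mmap
import os
import subprocess
import sys
import tempfile

# Code run by the stand-in "renderer": map the file read-only and look up a word.
RENDERER_SNIPPET = r"""
import mmap, sys
with open(sys.argv[1], "rb") as f:
    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
needle = b"\n" + sys.argv[2].encode() + b"\n"
print("found" if m.find(needle) != -1 else "missing")
m.close()
"""


def check_in_renderer(path, word):
    """Spell-check one word in a separate process with its own read-only mapping."""
    result = subprocess.run(
        [sys.executable, "-c", RENDERER_SNIPPET, path, word],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip() == "found"


# Hypothetical newline-delimited word list standing in for a dictionary.
fd, path = tempfile.mkstemp(suffix=".dic")
with os.fdopen(fd, "wb") as f:
    f.write(b"\napple\nbanana\ncherry\n")

# The "browser" keeps its own read-only mapping of the same file.
with open(path, "rb") as f:
    browser_map = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

known_word_ok = check_in_renderer(path, "banana")
unknown_word_ok = check_in_renderer(path, "durian")
browser_sees_word = browser_map.find(b"\nbanana\n") != -1

browser_map.close()
os.remove(path)
```

Because both mappings are read-only views of the same file, the operating system can back them with the same cached pages, which is the basis for the claim that memory usage would not increase.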
[chromium-dev] Re: Tab Thumbnails and Aero Peek (of Windows 7)
2009/10/22 Hironori Bono (坊野 博典) hb...@chromium.org
How to treat resize events of a browser window? The biggest problems of my current prototype are caused by my prototype that doesn’t handle resize events of a browser window. (Figure 4 may be acceptable. But, I think Figure 5 is unacceptable for many users.) The easiest solution for this problem is forcing a background tab to redraw when we need an image used for Aero Peek or Tab Thumbnails. Unfortunately, this solution probably hurts the rendering performance of Chromium. Another solution (or a workaround) is displaying a message “the preview image for the selected tab is not available” instead of showing a broken image. Any opinions or suggestions are definitely helpful.

Firefox nightlies now support tab preview. If you resize the window, the background tabs keep their old thumbnails, which I think is acceptable.

-- erik
[chromium-dev] Re: Spellchecker and memory-mapped dicts
Probably a bit off topic at this point, but your response confuses me. MADV_WILLNEED and POSIX_FADV_WILLNEED will bring the pages into RAM, just like faulting in mmap()'ed pages by hand, or read()ing it into memory. In my experience, read() and fadvise() are faster than mmap()+faulting everything in, or madvise(). Of course, read()ing it in means it has to be swapped out and can't just be dropped. If you want to suck the entire file in at some point, probably the best way is to fadvise() it in, then mmap() it and use it from there.

-- Steve
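The "fadvise() it in, then mmap() it" sequence could be sketched like this, in Python for brevity. The helper name is hypothetical, and `os.posix_fadvise` is only available on some Unix platforms, so the sketch treats the hint as optional.

```python
import mmap
import os
import tempfile


def fadvise_then_map(path):
    """Ask the kernel to read the file ahead, then map it read-only."""
    fd = os.open(path, os.O_RDONLY)
    try:
        hinted = False
        if hasattr(os, "posix_fadvise"):
            try:
                # Offset 0 with length 0 means "from the start to end of file".
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
                hinted = True
            except OSError:
                # Advisory only; carry on without the hint.
                hinted = False
        # Python's mmap duplicates the descriptor, so closing fd below is safe.
        mapping = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
    finally:
        os.close(fd)
    return mapping, hinted


# Hypothetical sample file standing in for a dictionary.
fd, path = tempfile.mkstemp(suffix=".dic")
with os.fdopen(fd, "wb") as f:
    f.write(b"alpha\nbeta\ngamma\n")

mapping, hinted = fadvise_then_map(path)
has_beta = mapping.find(b"beta\n") != -1
mapping.close()
os.remove(path)
```

The pages stay file-backed, so under memory pressure they can simply be dropped instead of swapped out, which is the trade-off against read() raised in the message above.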
[chromium-dev] Re: Spellchecker and memory-mapped dicts
WILLNEED says "Hey, OS, I think I'm going to look at these pages soon, get yourself ready," but the OS could implement it as a no-op, and can do it asynchronously. If memory is under pressure the system can do less; if memory is clear it can do more. Actually reading the data into memory, by contrast, blocks and really does read it into memory.

-scott

On Thu, Oct 22, 2009 at 3:01 PM, Steve Vandebogart vand...@chromium.org wrote:

Probably a bit off topic at this point, but your response confuses me: MADV_WILLNEED and POSIX_FADV_WILLNEED will bring the pages into RAM, just like faulting in mmap()'ed pages by hand, or read()ing the file into memory. In my experience, read() and fadvise() are faster than mmap() + faulting everything in, or madvise(). Of course, read()ing it in means it has to be swapped out and can't just be dropped. If you want to suck the entire file in at some point, probably the best way is to fadvise() it in, then mmap() it and use it from there.

-- Steve

On Thu, Oct 22, 2009 at 2:52 PM, Scott Hess sh...@chromium.org wrote:

Faulting it in by hand is only helpful if we're right! If we're wrong, it could evict other stuff from memory to support a feature which a user may not use until the memory is faulted back out anyhow.

[From the rest of the thread, though, it sounds like maybe we should just fix hunspell to be more efficient for our usage.]

-scott

On Thu, Oct 22, 2009 at 2:42 PM, Steve Vandebogart vand...@chromium.org wrote:

It's been a while since I looked at this, but the email I was able to dig up suggests that madvise is no faster than faulting in the mmap()ed region by hand. However, using posix_fadvise should give the same speeds as read()ing it into memory. IIRC though, posix_fadvise will only read so much in a single request and will let readahead handle the rest after that.

-- Steve

On Thu, Oct 22, 2009 at 2:27 PM, Scott Hess sh...@chromium.org wrote:

On Linux what about mmap() and then madvise() with MADV_WILLNEED? (Or posix_fadvise() with POSIX_FADV_WILLNEED on the file descriptor.)

-scott

On Thu, Oct 22, 2009 at 2:06 PM, Steve Vandebogart vand...@chromium.org wrote:

If you plan to read the entire file, mmap()ing it and then faulting it in will be slower than read()ing it, at least in some Linux versions. I never pinned down exactly why, but I think the kernel read-ahead mechanism works slightly differently.

-- Steve

On Thu, Oct 22, 2009 at 2:02 PM, Chris Evans cev...@chromium.org wrote:

There's also option 3) Pre-fault the mmap()ed region in the file thread upon dictionary initialization. On Linux at least, that may give you better behaviour than malloc() + read() in the event of memory pressure.

Cheers
Chris

On Thu, Oct 22, 2009 at 1:39 PM, Evan Stade est...@chromium.org wrote:

Hi all,

At its last meeting the jank task force discussed improving responsiveness of the spellchecker, but we didn't come to a solid conclusion, so I thought I'd bring it up here to see if anyone else has opinions.

The main concern is that we don't block the IO thread on file access. To this end, I recently moved initialization of the spellchecker from the IO thread to the file thread. However, instead of reading in the spellchecker dictionary in one solid chunk, we memory-map it. Then later we check individual words on the IO thread, which will be slow since the dictionary starts off effectively completely paged out. The proposal is that we read in the dictionary at spellchecker initialization instead of memory mapping it.

Memory mapping pros:
- possibly uses less overall memory, depending on the structure of the dictionary and the usage pattern of the user.
- loading the dictionary doesn't block for a long time (struck out: this one no longer occurs either way due to my recent refactoring)

Reading it all at once pros:
- costly disk accesses are kept to the file thread (excepting future memory paging)
- overall disk access time is probably lower (since we can read in the dict in one chunk)

For reference, the English dictionary is about 500K, and most dictionaries are under 2 megs; some (such as Hungarian) are much larger, but no dictionary is over 10 megs.

Opinions?

-- Evan Stade

Chromium Developers mailing list: chromium-dev@googlegroups.com
View archives, change email options, or unsubscribe: http://groups.google.com/group/chromium-dev
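For readers who want to see the options from Evan's and Chris's messages side by side, here is a minimal sketch, written in Python rather than Chromium's C++ for brevity. It is not the spellchecker's code: the function names load_by_read and load_by_mmap are made up for illustration, and pre-faulting is approximated by touching one byte per page, which is an assumption about what "pre-fault the mmap()ed region" would mean in practice.

```python
import mmap


def load_by_read(path):
    """Read the whole file into process memory in one call (the proposal)."""
    with open(path, "rb") as f:
        return f.read()


def load_by_mmap(path, prefault=False):
    """Map the file read-only (current behaviour); optionally pre-fault it (option 3)."""
    with open(path, "rb") as f:
        # mmap keeps its own duplicate of the descriptor, so the mapping
        # stays valid after the file object is closed.
        mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    if prefault:
        # Touching one byte per page makes the kernel fault each page in now,
        # on the calling thread, rather than later during a word lookup.
        for offset in range(0, len(mapping), mmap.PAGESIZE):
            _ = mapping[offset]
    return mapping
```

All three variants return the same bytes; what differs is when the disk access happens and whether the pages are file-backed (and so can simply be dropped under memory pressure) or anonymous (and so must be swapped out).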
[chromium-dev] Re: Spellchecker and memory-mapped dicts
That is the intention of the interface, yes, but all Linux implementations I've seen actually go and read whatever you say you will need, with a few exceptions such as actually being out of memory.

-- Steve
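The "fadvise() it in, then mmap() it" approach suggested earlier in the thread could look roughly like the sketch below. It uses Python's os.posix_fadvise and mmap.madvise, which wrap the same system calls; the function name map_with_willneed is hypothetical, and the code assumes nothing about whether the kernel acts on the hint, since WILLNEED is advisory and the mapping has to be correct either way.

```python
import mmap
import os


def map_with_willneed(path):
    """Hint that the whole file will be needed, then map it read-only."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        # POSIX_FADV_WILLNEED is only a hint: the kernel may start readahead
        # asynchronously, read part of the range, or ignore the request.
        if hasattr(os, "posix_fadvise"):
            try:
                os.posix_fadvise(fd, 0, size, os.POSIX_FADV_WILLNEED)
            except OSError:
                pass  # a rejected hint is not an error for the caller
        mapping = mmap.mmap(fd, size, access=mmap.ACCESS_READ)
    finally:
        os.close(fd)
    # The madvise() form of the same hint, where the platform exposes it.
    if hasattr(mapping, "madvise") and hasattr(mmap, "MADV_WILLNEED"):
        try:
            mapping.madvise(mmap.MADV_WILLNEED)
        except OSError:
            pass
    return mapping
```

Because the pages stay file-backed, the kernel can drop them under memory pressure instead of swapping them out, which is the trade-off the thread weighs against read()ing the dictionary into a heap buffer.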