Scintilla currently uses simple expandable arrays to store per-line data. Per-line data includes line start positions, folding levels, markers and lexer line state. When a line is inserted or deleted, the data for every line after it is moved. Whenever any text is inserted or removed, every line after that point has to have the text length added to or subtracted from its start position. This is reasonably fast when interactively editing source code of normal length but can be slow for intensive editing or when editing large files.
Changing the per-line data to use split vectors, similar to the text+style buffers already used, makes this more efficient since only the lines between the current insertion/deletion and the previous one are copied, and modifications are often close to the previous modification. The markers data is now only allocated for all the lines when the first marker is added, so applications that do not use markers will use less memory.

To minimize the cost of maintaining the line start positions when inserting and removing text, a step is included so that all line starts after the step line have the step value added. Thus if the step is on line 10 and a character is added to line 10 then the step value is incremented. If a character is added to line 20 the step is moved there (by adding the step value to intervening lines) before incrementing the step value.

This data structure has been part of SinkWorld for a couple of years so has received some testing. The lexer line state is left as a simple expandable vector since it is appended to in order during each lex and there are no insertions or deletions.

There was a performance problem caused by folding when inserting a large piece of text onto a blank line. The level of the blank line (including its whitespace flag) was copied onto each newly inserted line. This led to each line being considered subordinate (whitespace lines are always subordinate), which then caused large blocks to be processed by the folding code. This was exacerbated by ContractionState::SetVisible invalidating the whole ContractionState even if the lines being made visible were already visible.

The character bytes and style bytes are now separated into two objects (substance and style) inside CellBuffer. I won't be implementing different sized characters or styling information but this change should make it easier for others who want to make these changes.
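To make the step mechanism for line start positions concrete, here is a minimal sketch. It is not the code in Scintilla or SinkWorld: the class name LineStarts and the members InsertText, PositionOfLine and MoveStepTo are made up for illustration, a plain std::vector stands in for the split vector, and line insertion/deletion is left out. Moving the step backwards by subtracting the step value is my assumption; the description above only covers moving it forwards.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration of the "step" idea; names are not Scintilla's.
class LineStarts {
    std::vector<int> starts;   // stored start position of each line
    std::size_t stepLine = 0;  // lines with index > stepLine are missing stepValue
    int stepValue = 0;         // pending offset for all lines after stepLine

    // Relocate the step so that it sits on `line`.
    void MoveStepTo(std::size_t line) {
        if (line > stepLine) {
            // Forward: apply the pending offset to the intervening lines.
            for (std::size_t i = stepLine + 1; i <= line; i++)
                starts[i] += stepValue;
        } else {
            // Backward (assumed): these lines now fall after the step,
            // so take the pending offset back out of their stored values.
            for (std::size_t i = line + 1; i <= stepLine; i++)
                starts[i] -= stepValue;
        }
        stepLine = line;
    }

public:
    explicit LineStarts(std::vector<int> initialStarts)
        : starts(std::move(initialStarts)) {}

    // Text of `length` bytes was inserted into `line` (negative for removal).
    // Only lines after `line` shift, and that shift is recorded in the step.
    void InsertText(std::size_t line, int length) {
        assert(line < starts.size());
        if (stepValue == 0)
            stepLine = line;   // nothing pending, so the step can jump freely
        else if (line != stepLine)
            MoveStepTo(line);
        stepValue += length;
    }

    // True start position of `line`, including any pending step.
    int PositionOfLine(std::size_t line) const {
        assert(line < starts.size());
        return starts[line] + (line > stepLine ? stepValue : 0);
    }
};
```

With line starts 0, 5, 10, 15, 20, repeated InsertText calls on the same line only change the step value. A call on a different line touches just the lines between the old and new step positions, which is cheap when edits are close together.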
These modifications have changed very fundamental parts of Scintilla and are likely to have caused new bugs and to have changed performance, so it would be good to see them tested and any bugs reported.

Available from CVS and from
http://scintilla.sourceforge.net/scite.zip Source
http://scintilla.sourceforge.net/wscite.zip Windows executable

Neil
