Mattias Gärtner wrote:
Zitat von Burkhard Carstens <[EMAIL PROTECTED]>:

Am Mittwoch, 6. August 2008 21:37 schrieb Sergei Gorelkin:
Since this topic is touched, I would like to vote for removing
avl_tree dependency from the DOM altogether. The reason is that the
avl tree of child nodes is useless for any real-world XML data
because it does not handle duplicate names. The tree is useful only
for the one particular case of XML configs where each node name is
unique. OTOH, it causes considerable performance penalties (each
insert results in two searches, first one for checking the existence,
second one for insert itself; each search is a number of calls to
Node.GetNodeName, which, in case of Windows, is a Widestring copy,
etc.)

The widestrings version is slow under linux too. Last time I tested (a few
months ago) it was about 2 to 10 times slower than the ansistring version.
Mainly because of the parser.

This will change soon. I've almost completed the improved parser which is very competitive with ansistring one (about 3 times faster than current implementation, twice faster than MSXML parser).


Therefore I'd suggest to remove the avltree from DOM.
This would be great to have in 2.2.2.

In order to
keep configs at the reasonable speed, it is possible to implement
indexing within TXMLConfig class itself, preferably hashmap-based.
This would be a performance tweak which (from my POV) does not need to
be in 2.2.2, although it would be nice.

The performance penalty without the trees can be quite big for things like
txmlconfig. For example for lazarus project files it can be a hundred times
slower without the trees.
Note: lazarus itself uses its own ansistring version of xml parser, so this not
a direct problem for lazarus.


Alternatively, it is possible to use Registry-like access
(OpenKey/CloseKey methods) that I had already implemented in newer
xmlconf.pp unit. The advantage is that the searches are done starting
from the 'opened key', not from the root, so linear search isn't too
slow.
If I understand correctly, this would change the usage of xmlconfig
which would require changes to any existing code using it? If so, it's
probably not an option for 2.2.2, maybe even not for 2.2.4 ..

I second that. The main advantage of TXMLConfig is it's easy usage.
I guess you can gain some speed by implementing some kind of caching for paths.
Maybe I will do that for the lazarus ansistring version. But IMO this is too
risky for fpc 2.2.2 which is released soon.

xmlconf is source-level compatible with xmlcfg, you can replace it in uses clause and the code should remain working (although slower). Later on, you can start replacing things like

s1 := cfg.readstring('foo/bar/value1');
s2 := cfg.readstring('foo/bar/value2');

with

cfg.OpenKey('foo/bar');
s1 := cfg.readstring('value1');
s2 := cfg.readstring('value2');
cfg.CloseKey;

In this simplest example, the code becomes more complicated. However, most real projects compose the paths from parts, and they no longer need to do that.

The above also means that the key caching implemented in xmlconf may be back-ported to xmlcfg without breaking the existing user code.

Sergei
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel

Reply via email to