On Tuesday, February 07, 2012 00:56:40 Adam D. Ruppe wrote:
> On Monday, 6 February 2012 at 23:47:08 UTC, Jonathan M Davis wrote:
> > Also, two of the major requirements for an improved std.xml are
> > that it needs to have a range-based API, and it needs to be
> > fast.
>
> What does range based API mean in this context? I do offer
> a couple ranges over the tree, but it really isn't the main
> thing there.
>
> Check out Element.tree() for the main one.
>
> But, if you mean taking a range for input, no, doesn't
> do that. I've been thinking about rewriting the parse
> function (if you look at it, you'll probably hate it
> too!). But, what I have works and is tested on a variety
> of input, including garbage that was a pain to get working
> right, so I'm in no rush to change it.
>
> > Tango's XML parser has pretty much set the bar on speed
>
> Yeah, I'm pretty sure Tango whips me hard on speed. I spent
> some time in the profiler a month or two ago and got a
> significant speedup over the datasets I use (html files),
> but I'm sure there's a whole lot more that could be done.
>
> The biggest thing is I don't think you could use my parse
> function as a stream.

Ideally, std.xml would operate on ranges of dchar (but obviously be optimized for strings, since there are lots of optimizations that can be done with string processing, at least as far as Unicode goes), and it would return a range of some kind. The result would probably be a document type of some kind which provided a range of its top-level nodes (or maybe just the root node), each of which then provided ranges over its sub-nodes, etc. At least, that's the kind of thing that I would expect.

Other calls on the document and nodes may not be range-based at all (e.g. XPath should probably be supported, and that doesn't necessarily involve ranges). The best way to handle it all would probably depend on the implementation. I haven't implemented a full-blown XML parser, so I don't know what the best way to go about it would be, but ideally, you'd be able to process the nodes as a range.

- Jonathan M Davis
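
As a rough illustration of the "nodes provide ranges over their sub-nodes" idea above, here is a minimal sketch in D. It is not a proposal for the actual std.xml API and it does no parsing at all; `Node`, `Document`, `children` and `DepthFirst` are hypothetical names made up for this example. It only shows how a tree could expose its direct children as a range and how a lazy input range could walk the whole tree.

```d
import std.algorithm : equal, map;
import std.range.primitives : isInputRange;

// Hypothetical node type; not part of std.xml or any existing library.
struct Node
{
    string name;
    Node[] kids;

    // A slice is already a random-access range, so returning it
    // gives callers a range over the direct sub-nodes.
    Node[] children() { return kids; }
}

// Hypothetical document type that exposes only its root node.
struct Document
{
    Node root;
}

// Lazy depth-first input range over a node and all of its descendants.
struct DepthFirst
{
    private Node[] stack;

    this(Node start) { stack = [start]; }

    @property bool empty() const { return stack.length == 0; }
    @property Node front() { return stack[$ - 1]; }

    void popFront()
    {
        Node current = stack[$ - 1];
        stack = stack[0 .. $ - 1];
        // Push children in reverse so the first child is visited first.
        foreach_reverse (child; current.kids)
            stack ~= child;
    }
}

static assert(isInputRange!DepthFirst);

void main()
{
    auto doc = Document(Node("html", [
        Node("head", [Node("title")]),
        Node("body", [Node("p"), Node("p")]),
    ]));

    // Direct children of the root, processed as a range.
    assert(doc.root.children().map!(n => n.name).equal(["head", "body"]));

    // Every node in document order, processed as a range.
    assert(DepthFirst(doc.root).map!(n => n.name)
        .equal(["html", "head", "title", "body", "p", "p"]));
}
```

Because both `children()` and `DepthFirst` satisfy the standard range primitives, they compose with std.algorithm functions such as `map` and `filter` without any XML-specific glue. A real implementation would presumably build these nodes lazily from a range of dchar rather than from a literal tree as done here.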
