Michel Fortin wrote:
On 2009-12-30 14:53:33 -0500, Andrei Alexandrescu said:
But they break the tradition because now an algorithm may or may not
alter a range, and client code must be aware of that - one more thing
to worry about.
What do you think? Should we go with by-ref passing or not? Other ideas?
I'd say go by the most efficient method. I've implemented a few text
parsing functions of my own and they all take the range by reference.
I think you can make things clear with proper naming. All my functions
that advance the range passed by reference are prefixed "consume":
"consumeOneChar", "consumeString", "consumeNumber", "consumeUntil",
"consumeWhile", etc. This makes the intent very clear.
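As a rough illustration of the naming convention described above, here is a minimal sketch of two such helpers. The bodies and signatures are assumptions made for the example (only the names "consumeOneChar" and "consumeWhile" come from the post), and the input is assumed to be a plain string passed by ref.

```d
import std.ascii : isDigit;
import std.range.primitives : empty, front, popFront;

// Hypothetical helper: advances `input` past `c` only if `c` is the next
// character; otherwise leaves `input` untouched.
bool consumeOneChar(ref string input, dchar c)
{
    if (input.empty || input.front != c)
        return false;
    input.popFront();
    return true;
}

// Hypothetical helper: advances `input` while `pred` holds for the front
// character and returns the prefix that was consumed.
string consumeWhile(alias pred)(ref string input)
{
    auto start = input;
    while (!input.empty && pred(input.front))
        input.popFront();
    return start[0 .. start.length - input.length];
}

void main()
{
    string txt = "123abc";
    auto digits = consumeWhile!isDigit(txt);
    assert(digits == "123");
    assert(txt == "abc");            // the caller's range has advanced
    assert(consumeOneChar(txt, 'a'));
    assert(!consumeOneChar(txt, 'x'));
    assert(txt == "bc");             // a failed consume leaves it alone
}
```

The "consume" prefix tells the reader at the call site that the argument is modified, which is the concern raised about by-ref passing.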
Perfect. I'll do that.
while (!txt.empty) {
    auto c = txt.front;
    txt.popFront();
    if (c == '<') {
        if (skip(txt, "!--")) {
            // This is a comment
            enforce(findSkip(txt, "-->"));
            ...
        } else if (skip(txt, "script")) {
            ...
        }
    }
    ...
}
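For reference, a self-contained variant of the loop above might look like the sketch below. It assumes that "skip" behaves like std.algorithm.searching.skipOver in current Phobos (advance the range past a prefix if it matches, and report whether it did). That correspondence is an assumption, not something stated in this thread. The countComments function is hypothetical and only counts comments instead of doing the elided work.

```d
import std.algorithm.searching : findSkip, skipOver;
import std.exception : enforce;
import std.range.primitives : empty, front, popFront;

// Hypothetical helper: counts "<!-- ... -->" comments in `txt`.
// `txt` is taken by value, so the caller's string is not advanced.
size_t countComments(string txt)
{
    size_t n;
    while (!txt.empty)
    {
        auto c = txt.front;
        txt.popFront();
        // skipOver advances txt past "!--" only when it is a prefix.
        if (c == '<' && skipOver(txt, "!--"))
        {
            // findSkip advances txt to just after "-->" and returns
            // false if the terminator is missing.
            enforce(findSkip(txt, "-->"), "unterminated comment");
            ++n;
        }
    }
    return n;
}

void main()
{
    assert(countComments("<p><!-- a --><b><!--b-->") == 2);
    assert(countComments("<p>no comments</p>") == 0);
}
```

Both skipOver and findSkip take the range by ref, so each successful call moves the parse position forward without any index bookkeeping.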
Are you writing a new XML parser?
No, just some html scraping. I need to parse 200 GB worth of html :o).
Andrei