On 03/16/19 18:33, Chapman Flack wrote:
> The pre-scan is a simple linear search and will ordinarily say yes or no
> within a couple dozen characters--you could *have* an input with 20k of
> leading whitespace and comments, but it's hardly the norm. Just trying to

If the available regexp functions want to start by munging the entire
input into a pg_wchar array, then it may be better to implement the
pre-scan as open code, the same way parse_xml_decl() is already
implemented.

Given that parse_xml_decl() already covers the first optional thing that
can precede the doctype, the remaining scan routine would only need to
recognize comments, PIs, and whitespace. That would be pretty
straightforward.

Regards,
-Chap
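
For illustration, here is a rough sketch of what such an open-coded scan
might look like. It is not taken from any patch: the function names
skip_xml_misc() and starts_with_doctype() are hypothetical, it works on a
plain NUL-terminated char string rather than whatever type xml.c actually
uses, and it assumes the XML declaration, if any, has already been
consumed (as parse_xml_decl() would do). It also does not validate
comment or PI contents; it only finds their terminators.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: skip any run of whitespace, comments, and
 * processing instructions at the start of p.  Returns a pointer to the
 * first character that belongs to none of those, or NULL if a comment
 * or PI is opened but never terminated.
 */
static const char *
skip_xml_misc(const char *p)
{
    for (;;)
    {
        /* XML whitespace is space, tab, CR, and LF only */
        p += strspn(p, " \t\r\n");

        if (strncmp(p, "<!--", 4) == 0)
        {
            /* comment: runs to the next "-->" */
            const char *end = strstr(p + 4, "-->");

            if (end == NULL)
                return NULL;
            p = end + 3;
        }
        else if (strncmp(p, "<?", 2) == 0)
        {
            /* processing instruction: runs to the next "?>" */
            const char *end = strstr(p + 2, "?>");

            if (end == NULL)
                return NULL;
            p = end + 2;
        }
        else
            return p;           /* something else; stop here */
    }
}

/*
 * Hypothetical pre-scan: does the input reach a doctype declaration
 * after only whitespace, comments, and PIs?
 */
static bool
starts_with_doctype(const char *input)
{
    const char *p = skip_xml_misc(input);

    return p != NULL && strncmp(p, "<!DOCTYPE", 9) == 0;
}
```

The scan stops at the first character that is not part of whitespace, a
comment, or a PI, so in the ordinary case it only looks at the first few
dozen bytes and never needs to convert the whole input.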