This is an engineering notebook post. Feel free to ignore, even if you are a dev ;-)
The html importer has been, and always will be, a trivial subclass of the xml importer. As a result, we need only consider the xml importer.

*The bad old days*

The old xml importer is a perfect example of what was wrong with the old importers. It is horrendously complex, and that complexity has everything to do with its base class and nothing to do with the xml language itself!

Reviewing the old code, I noticed several bug fixes. Happily, it looks like all those bugs are covered by unit tests, so they can't be reintroduced without a test failing.

*Strategy*

xml and html use neither brackets nor indentation to delimit structure. Otoh, open tags are kind of like open brackets, and close tags are kind of like close brackets. This similarity means we *can* use i.v2_gen_lines, but the standard i.v2_scan_lines won't work.

There are various faux-clever ways to proceed, but by far the simplest is to rewrite i.v2_scan_line so it *doesn't* use a table. Instead, it will be somewhat like js_i.v2_scan_line, used by the javascript importer.

There will be no Xml_ScanState.update method. xml_i.v2_scan_line will update xml_state.tag_level/context directly. The Xml_ScanState ctor will not follow the standard protocol.

That's about it.

EKR
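To make the strategy concrete, here is a minimal sketch of a table-free line scanner that updates a tag level and a context directly. It is not Leo's code: XmlScanState, scan_line and TOKEN are hypothetical names chosen for illustration, and the sketch assumes only two pieces of state (tag_level and context). It deliberately ignores CDATA sections, '>' inside attribute values, and html void elements such as <br>.

```python
import re
from dataclasses import dataclass


@dataclass
class XmlScanState:
    """Hypothetical stand-in for Xml_ScanState: just the two fields named above."""
    tag_level: int = 0  # number of tags currently open
    context: str = ''   # '' normally, '<!--' while inside a comment


# Comment start, close tag, or open tag (which may be self-closing).
# Declarations (<!DOCTYPE ...>) and processing instructions (<?xml ...?>)
# match none of these alternatives, so they never change the level.
TOKEN = re.compile(r'<!--|</[A-Za-z][^>]*>|<[A-Za-z][^>]*>')


def scan_line(line: str, state: XmlScanState) -> XmlScanState:
    """Update state.tag_level and state.context in place, without a table."""
    i = 0
    while i < len(line):
        if state.context == '<!--':
            j = line.find('-->', i)
            if j == -1:
                break  # The comment continues on the next line.
            state.context = ''
            i = j + 3
            continue
        m = TOKEN.search(line, i)
        if m is None:
            break  # No more tags on this line.
        tok = m.group(0)
        if tok == '<!--':
            state.context = '<!--'
        elif tok.startswith('</'):
            state.tag_level -= 1  # A close tag acts like a close bracket.
        elif not tok.endswith('/>'):
            state.tag_level += 1  # An open tag acts like an open bracket.
        # Self-closing tags such as <br/> leave the level unchanged.
        i = m.end()
    return state


def tag_levels(lines):
    """Return the tag level after each line, starting from a fresh state."""
    state = XmlScanState()
    return [scan_line(line, state).tag_level for line in lines]


sample = [
    '<html>',
    '<body><p>hi</p>',
    '<!-- <div>',
    'still comment --> <br/>',
    '</body>',
    '</html>',
]
print(tag_levels(sample))  # [1, 2, 2, 2, 1, 0]
```

The tag inside the comment does not change the level, which is the kind of context tracking a bracket-style table would otherwise have to encode.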
