Hi Miklos,

On Thu, 2011-05-19 at 12:26 +0200, Miklos Vajna wrote:
> Preparing for the GSoC project, I wanted to decide what is an easy way
> to test an import filter. Given that the new RTF filter will be in
> writerfilter, testing docx import is a good example, I guess.

Sure, though we have no unit testing here.

> So opening a docx document and checking the visual result is one way,
> though in case it goes wrong, I don't think it's helpful. One method is
> to export to odt, AFAIK that's lossless. So here is what I tried (build
> from master, I pulled and did an incremental build today):

Since you'll be working on the tokenizer, I think it would be nice to
introduce some kind of tokens dumper that replaces the dmapper and dumps
everything that would otherwise go into it. That would give us a way to
isolate whether an import problem comes from the tokenizer (specific to
each format) or from the domain mapper (which affects all handled
formats).

You would then have a much more reliable way to test that your tokenizer
is working... but that wouldn't help with testing the domain mapper. To
test that one, I think conversions like the ones you describe are mostly
what helps.

> (I already heard of the xml dumper for the rendered layout, is there
> something similar for the internal document model?)

Yes, the ODF is a pretty good representation of the internals... though
we could surely implement something closer to the actual data
structures. Let me know if it would be of any use to create such a
dumper... I'm sure we could get to something useful pretty quickly.

Regards,

-- 
Cédric Bosdonnat
LibreOffice hacker
http://documentfoundation.org
OOo Eclipse Integration developer
http://cedric.bosdonnat.free.fr
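
To make the tokens dumper idea a bit more concrete, here is a minimal
sketch, written in Python rather than the C++ used in writerfilter. All
names (`TokenSink`, `TokenDumper`, `toy_tokenizer`) and the set of events
are hypothetical and are not the actual writerfilter interfaces; the
sketch only assumes that a tokenizer pushes events into some sink, so that
the domain mapper can be swapped for an object that records them.

```python
from typing import List, Protocol


class TokenSink(Protocol):
    """Hypothetical interface a tokenizer writes to (stand-in for the dmapper input)."""

    def start_paragraph(self) -> None: ...
    def end_paragraph(self) -> None: ...
    def text(self, content: str) -> None: ...
    def prop(self, name: str, value: str) -> None: ...


class TokenDumper:
    """Records every event as one indented text line instead of building a document."""

    def __init__(self) -> None:
        self.lines: List[str] = []
        self._depth = 0

    def _emit(self, line: str) -> None:
        self.lines.append("  " * self._depth + line)

    def start_paragraph(self) -> None:
        self._emit("<paragraph>")
        self._depth += 1

    def end_paragraph(self) -> None:
        self._depth -= 1
        self._emit("</paragraph>")

    def text(self, content: str) -> None:
        self._emit(f"text {content!r}")

    def prop(self, name: str, value: str) -> None:
        self._emit(f"prop {name}={value}")

    def dump(self) -> str:
        return "\n".join(self.lines)


def toy_tokenizer(source: str, sink: TokenSink) -> None:
    """Toy stand-in for a format-specific tokenizer.

    One paragraph per input line; a leading '*' marks the paragraph as bold.
    """
    for line in source.splitlines():
        sink.start_paragraph()
        if line.startswith("*"):
            sink.prop("bold", "true")
            line = line[1:]
        sink.text(line)
        sink.end_paragraph()


dumper = TokenDumper()
toy_tokenizer("Hello\n*World", dumper)
print(dumper.dump())
```

The dump is plain text, so a test could compare it against a stored
reference file per input document, and a mismatch would point at the
tokenizer rather than at the domain mapper.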
