On ODFGet vs. ODFTextGet - I would suggest we stick with the latter, because if and when we add support for spreadsheets and presentations, we’ll need separate methods for those, namely ODFPresentationGet and ODFSheetGet. The same is true of ODFTextConverter - I expect we’d have ODFPresentationConverter and ODFSheetConverter as well.
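
To make the naming point concrete, here is a rough sketch of one entry point per document type. Only the three ...Get names come from the suggestion above; the ODFDemo* types, the dispatcher, and the int return values are hypothetical stand-ins invented for illustration and do not reflect the actual Corinthia structs or signatures. The sheet and presentation bodies are placeholders that simply report "not handled".

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for the real converter state (illustration only).
typedef struct {
    const char *lastCall;   // records which entry point ran last
} ODFDemoConverter;

// Hypothetical tag for the kind of document being converted.
typedef enum {
    ODFDemoKindText,
    ODFDemoKindSheet,
    ODFDemoKindPresentation
} ODFDemoKind;

// Placeholder body: pretends the text document was converted.
static int ODFTextGet(ODFDemoConverter *converter)
{
    converter->lastCall = "ODFTextGet";
    return 1;
}

// Placeholder body: spreadsheets are not handled, so report failure.
static int ODFSheetGet(ODFDemoConverter *converter)
{
    converter->lastCall = "ODFSheetGet";
    return 0;
}

// Placeholder body: presentations are not handled, so report failure.
static int ODFPresentationGet(ODFDemoConverter *converter)
{
    converter->lastCall = "ODFPresentationGet";
    return 0;
}

// Hypothetical dispatcher: the caller always states the document type,
// so it is never ambiguous which kind of document is being dealt with.
static int ODFDemoGet(ODFDemoConverter *converter, ODFDemoKind kind)
{
    assert(converter != NULL);
    switch (kind) {
        case ODFDemoKindText:
            return ODFTextGet(converter);
        case ODFDemoKindSheet:
            return ODFSheetGet(converter);
        case ODFDemoKindPresentation:
            return ODFPresentationGet(converter);
    }
    return 0;   // unknown kind
}
```

With type-specific names, adding spreadsheet or presentation support later would mean filling in one placeholder rather than renaming or splitting a generic function.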

The Word converter is structured the way it is because, at the time I wrote the code, I was interested only in word processing documents. It was only when I brought the code to Apache that it was suggested that Corinthia become something that supports presentations and spreadsheets as well. In practice, we don’t address either of these at all yet, and whether either gets implemented will depend on interest & available man/woman-power. But I think it would be good to keep this possibility open - in that sense ODFGet by itself doesn’t make sense, since it’s not clear what type of document is being dealt with. This is also why the Word converter is named as it is, rather than OOXML (though since coming into Apache, the directory structure has been put in place to cater for a future implementation of OOXML spreadsheets and presentations).

Also one minor comment on coding style - all the existing code uses an indentation of 4 spaces, no tabs, with { placed on the same line for if/while/switch statements, and on the following line for function definitions. I’m not inherently tied to one layout or another, but I think we should try to remain consistent with the style already in place.

--
Dr Peter M. Kelly
pmke...@apache.org

PGP key: http://www.kellypmk.net/pgp-key
(fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)

> On 7 Jun 2015, at 7:23 pm, Ian C <i...@amham.net> wrote:
>
> Hi Gabriela,
>
> attached is a patch that reorganises the ODF world to be more like the way
> Word documents are processed.
>
> I changed to the top level from operations to use an ODFGet. Which in turn
> uses an ODFConverter.
> The heart of the ODFGet function is
>
> ODFConverter *converter =
>     ODFConverterNew(html,abstractStorage,package,idPrefix);
>
> //Get the styles data
> //CSSSheetRelease(converter->styleSheet);
> converter->styleSheet = ODFParseStyles(converter);
>
> //Convert the content.xml to an html beastie
> ODFTextGet(converter);
>
> char *cssText = CSSSheetCopyCSSText(converter->styleSheet);
> HTMLAddInternalStyleSheet(converter->html, cssText);
> HTML_safeIndent(converter->html->docNode,0);
>
> Which parses for styles as I did before ( so still needs some work).
> Then calls an edited ODFTextGet - which is much as it was.
>
> The code has just been twisted around to match the structure of the word
> world.
>
> Which means I can't help thinking that we could/should abstract out the
> common aspects of converters.
>
> It converts the headers.odt document to an html which shows the headers ok.
> I also attached my version of headers.odt since I changed some of the styles
> to try and emphasize their differences.
>
> I hope it makes sense to you and that your patch tool can digest it.
>
> --
> Cheers,
>
> Ian C
> <odfpatch.txt><headers.odt>