Egor, thanks for the overview. Regarding docx support: I tested docx-to-PDF conversion with the docx4j library (http://www.docx4java.org/trac/docx4j), and a large part of the document rendered well. The library uses POI and FOP under the hood, so I don't expect big problems with docx and xlsx conversion.
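As a rough sketch of how the format overview quoted below could map onto code, the following picks a conversion route from a file extension. The class, enum, and method names are hypothetical and not part of POI, docx4j, or OpenMeetings; only the format-to-tool mapping comes from this thread. Routing docx and xlsx to docx4j is an assumption based on the single test mentioned above.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps an uploaded file name to the conversion route
// discussed in this thread. It performs no conversion itself.
class ConversionRouter {

    // Route names are illustrative labels, not real library classes.
    enum Route {
        POI_SLIDES_TO_PNG,  // 1a. .ppt / .pptx rendered to PNG by POI
        POI_WORD_TO_FO,     // 1b. .doc via POI's WordToFoConverter
        POI_EXCEL_TO_FO,    // 1c. .xls via POI's ExcelToFoConverter
        DOCX4J,             // assumption: .docx / .xlsx handled by docx4j
        POI_TEXT_TO_FO,     // 1d. extract text only, then export to XSL-FO
        ODF_XSLT_TO_FO,     // (2) ODF files transformed to XSL-FO via XSLT
        UNSUPPORTED
    }

    private static final Map<String, Route> ROUTES = Map.ofEntries(
            Map.entry("ppt", Route.POI_SLIDES_TO_PNG),
            Map.entry("pptx", Route.POI_SLIDES_TO_PNG),
            Map.entry("doc", Route.POI_WORD_TO_FO),
            Map.entry("xls", Route.POI_EXCEL_TO_FO),
            Map.entry("docx", Route.DOCX4J),
            Map.entry("xlsx", Route.DOCX4J),
            Map.entry("pub", Route.POI_TEXT_TO_FO),
            Map.entry("msg", Route.POI_TEXT_TO_FO),
            Map.entry("vsd", Route.POI_TEXT_TO_FO),
            Map.entry("odf", Route.ODF_XSLT_TO_FO),
            Map.entry("odt", Route.ODF_XSLT_TO_FO),
            Map.entry("ods", Route.ODF_XSLT_TO_FO));

    // Returns the route for a file name, matching the extension
    // case-insensitively; names without an extension are unsupported.
    static Route routeFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Route.UNSUPPORTED;
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return ROUTES.getOrDefault(ext, Route.UNSUPPORTED);
    }

    public static void main(String[] args) {
        String[] samples = {"slides.pptx", "report.DOCX", "notes.odt", "archive.zip"};
        for (String name : samples) {
            System.out.println(name + " -> " + routeFor(name));
        }
    }
}
```

Keeping the mapping in one table should make it easy to move a format from one route to another as the weekly POI evaluations show what works and what does not.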

2012/5/18 Yegor Kozlov <[email protected]>

> Unfortunately I can't participate in the GSoC status meeting today,
> but I would like to give a technical overview of Dmitry's GSoC project.
> The official GSoC starts shortly and it is important to have a clear
> vision of what he will be doing and how.
>
> I'm addressing this post to Dmitry Zamula, but if anyone else is
> interested then please give your feedback.
>
> The goal of Dmitry's GSoC project is to create an OpenOffice-free
> software stack to manage documents on the whiteboard.
> This will include support for two families of document formats:
>
> (1) MS Office. Below is a short overview of what POI can do:
>
> 1a. PowerPoint
> POI can convert .ppt and .pptx files to PNG. Optionally you can
> convert to other formats (SVG or directly to Flash).
> An example of a PPTX2SVG converter is included in the POI examples:
>
> https://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/xslf/usermodel/PPTX2SVG.txt
> PPTX2SVG would use the same idea.
> POI renders simple files more or less OK, but more complex drawings
> are rendered with defects.
> I've seen Dmitry's bug reports in POI Bugzilla, thanks for that,
> but please be prepared that fixing will take some time.
>
> 1b. Word
> For the .doc format POI provides a WordToFoConverter. Note that
> there can be bugs or unsupported features:
> WordToFoConverter is a brand-new feature that was checked in only a
> few months back.
> Also, the WordToFoConverter.main() method is just a demo. To take full
> advantage of the WordToFo API you will need to
> write Java code and integrate it with OpenMeetings.
>
> The .docx format is not yet supported. I'm not sure how much effort
> it would take; hopefully not much.
>
> 1c. Excel
> POI provides both ExcelToFoConverter and ToHtml utilities. ToHtml
> may be useful when OpenMeetings moves to HTML5 instead of Flash.
>
> 1d. Other formats: MS Publisher, MS Outlook and Visio.
> At minimum, you can extract the text contents and export them to XSL-FO.
> See the text extractors for each format.
>
>
> (2) OpenOffice: odf, odt, ods
>
> For these you need to look for an open-source solution compatible with the
> Apache Licence, write a converter yourself, or continue to use OpenOffice.
>
> Here are some interesting links:
>
> http://incubator.apache.org/odftoolkit/ - a general purpose Java API
> for the ODF format
>
> Attempts to transform ODT files to XSL-FO in Java via XSLT:
>
> http://code.google.com/p/xdocreport/
> http://sourceforge.net/projects/office2fo/
> http://svn.clazzes.org/svn/ooo2xslfo/trunk/ooo2xslfo/
>
> Transformation of ODF to XSL-FO is a common question on the OOo
> mailing lists. Search the archive for more information.
>
> I propose to spend the first three weeks on POI: a week per format.
> By the end of each week you should demonstrate what POI can do and
> what it can't.
> Ideally, I would like to see it integrated in OpenMeetings, i.e. I
> would like to see not a Java main() utility that takes .xls and
> outputs .fo,
> but working code integrated in OpenMeetings.
>
> Then it will be time for ODF to XSL-FO via XSLT. Don't pay too much
> attention to quality.
> Prove the concept first and then we will see how to improve the output.
>
> Thoughts?
>
>
> Regards,
> Yegor
>

-- 
Best Regards,
*Dmitry Zamula*
Saint-Petersburg, Russia
UTC/GMT +4 hours
Mobile phone: +7 (904) 646-9254
Skype Id: brantner_ru
E-mail: [email protected]
