Here is what I see as the problems:
1) Microsoft documents: The place where the documents are created uses Microsoft products; that's what is installed on the machines.
Possible solutions:
- Import the document into OpenOffice.org after the content author has sent it, and save it as an OpenOffice document.
Why?: An OpenOffice document can be saved and used as-is within Forrest.
Needs: the OpenOffice plug-in from Forrest
Changes: This needs to be looked into, but I think any graphics need to be included somehow.
Example: I put the JDBC howto document into the OpenOffice format. It transformed into HTML and PDF, but the graphics in the document did not come out coherently. Someone in the Forrest space mentioned a method for including graphics in OpenOffice documents in the past. Getting the graphics into the HTML and PDF (or whatever output) shouldn't need to concern the author; it should be the concern of the output process. This really should be automated: an author places a document at a URL, asks for it to be transformed, and looks at the result. If the result doesn't come out as the author intended, then the author should make clear comments about what is not correct in the output. (?)
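As a rough illustration of the kind of check an automated output process could run before transforming: an OpenOffice.org document is a zip archive, and embedded images are normally stored under a Pictures/ directory inside it. The sketch below only lists those entries. The function name is made up for illustration, and it assumes the graphics are embedded rather than linked; linked images would not show up in the archive at all.

```python
import zipfile


def list_embedded_pictures(path):
    """Return the sorted names of image entries stored inside an OpenOffice.org document.

    Hypothetical helper: assumes the document is a zip archive that keeps
    embedded graphics under "Pictures/". Linked graphics are not detected.
    """
    with zipfile.ZipFile(path) as doc:
        return sorted(
            name
            for name in doc.namelist()
            # Skip the directory entry itself, keep only the files under it.
            if name.startswith("Pictures/") and not name.endswith("/")
        )
```

If this returns an empty list for a document that visibly contains graphics, the images are probably linked or stored some other way, which might explain output where the graphics do not come through.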
Other formats:
This also needs to be investigated a bit. I looked briefly at XSLT transform libraries. I *think* (?) this could be done outside of Derby without regard to license(?). Can we use a transform process that does not fit within the Apache license scheme on our own, and put the output of the document into the repository? Possibly the document needs to be cleaned; possibly it depends on the license of the transform code.
?: What output formats can all content-processing applications produce?
?: Can OpenOffice import and save into its own format in a lossless way? If not, what formats are preferred?
I don't know any of these answers, but I do know an easy-to-use format should exist for people who want to contribute content. My opinion is that they have spent so much time on the content that someone else can figure out how to convert it into an output that can be used. For example: I spend 10 hours on content production, then I spend up to 50 hours trying to get everything into a format that fits within a standard. That might be a bit out of line, but it is very possible from what I have seen.
Possibly the Forrest site has some hints on some of this.
All of my thoughts here are about the content. I don't think people should have to give a second thought to sending in content. If 'crud' needs to be stripped, then let's look around for a 'crud stripper' or write some content-cleaning procedures.
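To give an idea of what a small content-cleaning procedure might look like, here is a minimal sketch. It assumes the 'crud' is the extra markup Microsoft Word adds when it saves a document as HTML (Office namespace tags such as <o:p>, mso-* inline styles, and Mso* class names). That is only a guess at what would need stripping, and the function name is made up for illustration.

```python
import re


def strip_word_crud(html):
    """Remove some common Word-generated markup from an HTML string.

    Hypothetical helper covering only a few patterns; real documents
    would likely need more rules than these.
    """
    # Drop Office namespace tags such as <o:p> and </o:p>, keeping any text between them.
    html = re.sub(r"</?o:[^>]*>", "", html)
    # Drop inline style attributes that carry Word-specific mso-* properties.
    html = re.sub(r'\s+style="[^"]*mso-[^"]*"', "", html)
    # Drop Word's own class names such as class="MsoNormal" or class=MsoNormal.
    html = re.sub(r'\s+class="?Mso\w*"?', "", html)
    return html
```

A regular-expression pass like this is crude; an existing tool or a proper HTML parser would probably be more robust, which is one reason to look around before writing our own.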
Also, an important license issue: is the OpenOffice license compatible with Apache's? Can we put content created by OOo into svn? This came up in a recent Forrest thread.
Any good solutions, ideas, etc?
scott
