Document / Content Harvesting (Word, Excel.. others)
----------------------------------------------------
Key: MSITE-514
URL: http://jira.codehaus.org/browse/MSITE-514
Project: Maven 2.x Site Plugin
Issue Type: New Feature
Affects Versions: 3.0-beta-2
Reporter: Andrew Hughes

Hi Guys,

I have an idea, but I wouldn't know where to get started on it. Besides, I think this is more than a one-person job.

Just like we have reporting plugins and the project-info reports, I think a "project documents" site plugin would be an excellent idea.

Purpose:

The primary purpose is to provide easy integration of documents that are not in apt, xdoc, etc. format into Maven sites.

Objectives:

The primary objective should be to create a menu on the site that lists all of the documents discovered in the project source.

Example (that extends the normal "Project Documentation" menu):

* Project Documentation
** Project Information
*** Continuous Integration
*** Issue Tracking
*** Project Team
*** Source Repository
** Project Reports
*** Maven Surefire Report
*** Other Report
** Documents <- NEW, name TBD. Clicking on this should open a page with a table of all documents and their harvested metadata.
*** Acme Project SRS (doc) <- NEW, a harvested Word document. The link title is the document title.
*** Contract (pdf) <- NEW, a harvested PDF document.
*** Estimates (xls) <- NEW, an Excel spreadsheet.
*** Risk Register (xls) <- NEW, another Excel spreadsheet.

The index page could hopefully gather enough metadata about the documents to create something that looks like:

||Title||Filename||Format||Author||Last Modified||Last Modified By||
|Acme Project SRS|APD-ACME-SRS.doc|doc|John Smith|14-10-2010|A Hughes|
|Contact|APC-ACME-CONTACT-23489345.pdf|pdf|N/A|22-02-2010|N/A|
|Estimates|APE-ACME-Estimates.xls|xls|A Schwarzenegger|22-02-2010|JP Freely|

Implementation:

I have very little idea how this kind of thing could be integrated into the site, from menu creation to Velocity templates and so on. Sorry, I'm not much help there.
I do know that we have things like http://poi.apache.org/ to help gather metadata about Microsoft documents, and something similar is available for PDF. LaTeX and other formats hopefully have similar APIs.

Configuration:

I'd think that the POM config might help define how this could work, and what options and functionality it could potentially offer:

{noformat}
<plugin>
  ...omitting normal stuff...
  <configuration>
    <resources>
      <resource>
        <!-- override the default of ./src/site/resources -->
        <directory>${basedir}/documents</directory>
        <!-- override the default of what files to include -->
        <includes>
          <include>**/*.doc</include>
          <include>**/*.xls</include>
        </includes>
      </resource>
    </resources>
    <!-- override the default label shown on the menu -->
    <menuTitle>Documentz</menuTitle>
    <!-- select the metadata harvested from documents to show on the index page -->
    <metaData>title,version,author,lastModifiedBy,lastModifiedDate</metaData>
  </configuration>
</plugin>
{noformat}

What do you think? Is this a practical idea? Is it achievable, and how much work would be involved?

CHEERS :)
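
To make the discovery and index-page idea a bit more concrete, here is a rough Java sketch of the harvesting step. It only uses the JDK, so it can read just what the file system knows (filename, format from the extension, last-modified date). Title, author and "last modified by" would have to come from a library such as POI and are left out. The names DocumentHarvester, DocInfo, harvest and toWikiTable are made up for illustration and are not part of any existing Maven plugin.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch only; not an existing Maven Site Plugin class. */
final class DocumentHarvester {

    /** File-system metadata for one discovered document. */
    static final class DocInfo {
        final String filename;
        final String format;
        final String lastModified;

        DocInfo(String filename, String format, String lastModified) {
            this.filename = filename;
            this.format = format;
            this.lastModified = lastModified;
        }
    }

    // Same day-month-year layout as the example table in the issue.
    private static final DateTimeFormatter DATE =
            DateTimeFormatter.ofPattern("dd-MM-yyyy").withZone(ZoneId.systemDefault());

    /** Walks root recursively and keeps regular files with one of the given extensions. */
    static List<DocInfo> harvest(Path root, String... extensions) throws IOException {
        Set<String> wanted = new HashSet<>();
        for (String e : extensions) {
            wanted.add(e.toLowerCase(Locale.ROOT));
        }
        List<Path> files;
        try (Stream<Path> paths = Files.walk(root)) {
            // Sort by filename so the index page has a stable order.
            files = paths.filter(Files::isRegularFile)
                    .sorted(Comparator.comparing((Path p) -> p.getFileName().toString()))
                    .collect(Collectors.toList());
        }
        List<DocInfo> docs = new ArrayList<>();
        for (Path file : files) {
            String name = file.getFileName().toString();
            int dot = name.lastIndexOf('.');
            String ext = dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
            if (wanted.contains(ext)) {
                String modified = DATE.format(Files.getLastModifiedTime(file).toInstant());
                docs.add(new DocInfo(name, ext, modified));
            }
        }
        return docs;
    }

    /** Renders the harvested documents as a wiki-style table, one row per document. */
    static String toWikiTable(List<DocInfo> docs) {
        StringBuilder sb = new StringBuilder("||Filename||Format||Last Modified||\n");
        for (DocInfo d : docs) {
            sb.append('|').append(d.filename)
              .append('|').append(d.format)
              .append('|').append(d.lastModified).append("|\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "documents" mirrors the ${basedir}/documents directory from the config idea.
        Path root = Paths.get(args.length > 0 ? args[0] : "documents");
        if (!Files.isDirectory(root)) {
            System.out.println("No such directory: " + root);
            return;
        }
        System.out.print(toWikiTable(harvest(root, "doc", "xls", "pdf")));
    }
}
```

A real plugin would presumably take the directory and include patterns from the POM configuration and render the rows through the site's templates rather than printing them. This is only meant to show that the discovery half is cheap, and that the per-format metadata extraction is where the real work would be.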