Document / Content Harvesting (Word, Excel.. others)
----------------------------------------------------

                 Key: MSITE-514
                 URL: http://jira.codehaus.org/browse/MSITE-514
             Project: Maven 2.x Site Plugin
          Issue Type: New Feature
    Affects Versions: 3.0-beta-2
            Reporter: Andrew Hughes


Hi Guys,

I have an idea, but I wouldn't know where to get started on it, and I think it 
is more than a one-person job. Just as we have reporting plugins and the 
project-info reports, I think a "project documents" site plugin would be an 
excellent idea.

Purpose:
The primary purpose is to provide easy integration of documents that are not 
in APT, xdoc or a similar site format into Maven sites.

Objectives:
The primary objective should be to create a menu on the site that lists all of 
the discovered documents in the project source.

Example (extending the normal "Project Documentation" menu):

* Project Documentation
** Project Information
*** Continuous Integration
*** Issue Tracking
*** Project Team
*** Source Repository
** Project Reports
*** Maven Surefire Report
*** Other Report
** Documents   <- NEW, name TBD; clicking on this should open a page with a table of all documents with their harvested metadata.
*** Acme Project SRS (doc) <- NEW, showing a harvested Word document; the link title is the document title.
*** Contract (pdf) <- NEW, showing a harvested PDF document.
*** Estimates (xls) <- NEW, an Excel spreadsheet.
*** Risk Register (xls) <- NEW, another Excel spreadsheet.

The index page could hopefully gather enough metadata about the documents to 
create something that looks like this:

||Title||Filename||Format||Author||Last Modified||Last Modified By||
|Acme Project SRS| APD-ACME-SRS.doc|doc|John Smith|14-10-2010|A Hughes|
|Contact| APC-ACME-CONTACT-23489345.pdf|pdf|N/A|22-02-2010|N/A|
|Estimates| APE-ACME-Estimates.xls|xls|A Schwarzenegger|22-02-2010|JP Freely|

Implementation:
I have very little idea how this kind of thing could be integrated into the 
site, from menu creation to Velocity templates and so on; sorry, I am not much 
help there. I do know that we have things like http://poi.apache.org/ to help 
gather metadata about Microsoft documents, and that something similar is 
available for PDF. LaTeX and other formats hopefully have similar APIs.
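
To make the discovery step a bit more concrete, here is a minimal sketch in 
plain Java. Everything in it is an assumption for illustration: the class 
name DocumentHarvester, the DocumentInfo fields and the idea of filtering by 
file extension are hypothetical and not part of any existing Maven plugin. It 
only collects what the file system knows (filename, format, last modified); 
title, author and the like would have to come from a library such as POI and 
are left out here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: finds documents and records file-system metadata only. */
final class DocumentHarvester {

    /** One row of the proposed index page; field names are illustrative. */
    static final class DocumentInfo {
        final String filename;
        final String format;
        final FileTime lastModified;

        DocumentInfo(String filename, String format, FileTime lastModified) {
            this.filename = filename;
            this.format = format;
            this.lastModified = lastModified;
        }
    }

    /** Returns the lower-cased extension of a file name, or "" if it has none. */
    static String formatOf(String filename) {
        int dot = filename.lastIndexOf('.');
        if (dot <= 0 || dot == filename.length() - 1) {
            return "";
        }
        return filename.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Walks the directory tree and keeps regular files whose extension is wanted. */
    static List<DocumentInfo> harvest(Path root, List<String> extensions) throws IOException {
        List<Path> files;
        try (Stream<Path> paths = Files.walk(root)) {
            // Sort so that the resulting menu order is stable between builds.
            files = paths.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        List<DocumentInfo> result = new ArrayList<>();
        for (Path file : files) {
            String name = file.getFileName().toString();
            String format = formatOf(name);
            if (extensions.contains(format)) {
                result.add(new DocumentInfo(name, format, Files.getLastModifiedTime(file)));
            }
        }
        return result;
    }
}
```

Calling DocumentHarvester.harvest(Paths.get("documents"), Arrays.asList("doc", "xls")) 
would return one DocumentInfo per matching file; a real plugin would then 
enrich each entry with the properties stored inside the document itself.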

Configuration:
I think a sketch of the POM configuration might help define how this could 
work, and which options and functionality it could offer:

{noformat}
<plugin>
    ...omitting normal stuff...
    <configuration>
        <resources>
            <resource>
                <!-- override the default of ./src/site/resources -->
                <directory>${basedir}/documents</directory>
                <!-- override the default of what files to include -->
                <includes>
                    <include>**/*.doc</include>
                    <include>**/*.xls</include>
                </includes>
            </resource>
        </resources>
        <!-- override the default label shown on the menu -->
        <menuTitle>Documentz</menuTitle>
        <!-- select the metadata harvested from documents to show on the index page -->
        <metaData>title,version,author,lastModifiedBy,lastModifiedDate</metaData>
    </configuration>
</plugin>
{noformat}
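
As a rough illustration of how the metaData option could drive the index 
page, the sketch below splits the comma-separated value into column keys and 
renders a table in the same notation as the example above. The class 
DocumentIndexTable, its method names and the "N/A" fallback for missing 
values are assumptions made for this example only; a real plugin would more 
likely feed the same data into a Velocity template or a Doxia sink.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical renderer for the proposed document index table. */
final class DocumentIndexTable {

    /** Splits the comma-separated metaData value into trimmed, non-empty column keys. */
    static List<String> parseColumns(String metaData) {
        return Arrays.stream(metaData.split(","))
                .map(String::trim)
                .filter(key -> !key.isEmpty())
                .collect(Collectors.toList());
    }

    /** Renders a header line plus one line per document; absent keys become "N/A". */
    static String render(List<String> columns, List<Map<String, String>> documents) {
        StringBuilder out = new StringBuilder();
        out.append("||").append(String.join("||", columns)).append("||\n");
        for (Map<String, String> document : documents) {
            out.append('|');
            for (String column : columns) {
                out.append(document.getOrDefault(column, "N/A")).append('|');
            }
            out.append('\n');
        }
        return out.toString();
    }
}
```

The column keys are used verbatim as headers here; mapping a key such as 
lastModifiedBy to a friendlier label like "Last Modified By" would be a 
natural next step.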

What do you think: is this a practical idea? Is it achievable, and how much 
work would be involved?

CHEERS :)
