Hi!

While studying the possible data flows and integration options with external sources of information (serials repositories, author sites, subject wikis, institutional directories, etc.), I was dreaming of a harvesting tool, and I decided to check whether one already exists.
I will only know whether "Web-Harvest" is "ideal" after deep testing, but it really looks nice and fairly complete. Please look at:
http://web-harvest.sourceforge.net

* Java based
* BSD licence
* Independent harvesting GUI, but also a Java API for application integration
* Scriptable in JavaScript, Groovy, BeanShell, XSLT, XQuery, RegExp; external Java is accessible...

If an effort is made to integrate it with DSpace, is their licence compatible with ours?
http://web-harvest.sourceforge.net/licence.php

Short needs analysis:

* Record data import: regular queries to external data sources, and information harvesting to create new DSpace records OR (more often) to complete existing ones (author or serial information, for instance).
* Mashing: while displaying a record, some external data may usefully complement the information displayed. For instance, the number of records for an author or a subject in an external database.
* Authority: an external database may provide authority control information while editing data fields within a DSpace record.
* The harvesting rules should be accessible to a "power user" such as a Repository Manager: after some setup, a computer scientist should not be needed to maintain the links with external sources.

I will explore the suitability of Web-Harvest for those tasks and report back. Please let me know if you have used this tool, or others, to fulfil these needs!

Have a nice day!

Christophe Dupriez
