Hi!

While studying the possible data flows and integration options with 
external sources of information (serials repositories, authors' sites, 
subject wikis, institutional directories, etc.), I was dreaming of a 
harvesting tool, so I decided to check whether one already exists.

I will only know whether "Web-Harvest" falls short of "ideal" after 
thorough testing, because it really looks nice and fairly complete...

Please look at:
http://web-harvest.sourceforge.net
* Java based
* BSD licence
* Independent harvesting GUI, plus a Java API for application integration
* Scriptable in JavaScript, Groovy, BeanShell, XSLT, XQuery, RegExp; 
external Java code is accessible...

If we put effort into integrating it with DSpace, is its licence 
compatible with ours?
http://web-harvest.sourceforge.net/licence.php

Short needs analysis:
* Record data import: regular queries to external data sources and 
information harvesting to create new DSpace records OR (more often) 
complete existing ones (author or serial information, for instance).
* Mashing: while displaying a record, some external data may usefully 
complement the information displayed. For instance, the number of 
records for an author or a subject in an external database.
* Authority: an external database may provide authority control 
information while editing data fields within a DSpace record.
* The harvesting rules should be accessible to a "power user" like a 
Repository Manager: after some setup, a computer scientist should not be 
needed to maintain links with external sources.
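
To make the "complete existing ones" case concrete, here is a minimal 
sketch in plain Java. It is only an illustration under my own 
assumptions: it uses neither the Web-Harvest API nor the DSpace API, the 
class and method names (RecordCompletion, complete) are hypothetical, and 
a record is simplified to a map from field name to a single value. The 
rule shown is "harvested data fills gaps but never overwrites what a 
curator already entered".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: not part of DSpace or Web-Harvest.
public class RecordCompletion {

    // Returns a copy of 'existing' in which fields that are missing or blank
    // are filled from 'harvested'. Non-blank existing values are kept as-is.
    static Map<String, String> complete(Map<String, String> existing,
                                        Map<String, String> harvested) {
        Map<String, String> result = new LinkedHashMap<>(existing);
        for (Map.Entry<String, String> entry : harvested.entrySet()) {
            String current = result.get(entry.getKey());
            if (current == null || current.trim().isEmpty()) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample data, for illustration only.
        Map<String, String> existing = new LinkedHashMap<>();
        existing.put("dc.title", "Title entered by the curator");
        existing.put("dc.publisher", "");

        Map<String, String> harvested = new LinkedHashMap<>();
        harvested.put("dc.title", "Title found by the harvester");
        harvested.put("dc.publisher", "Publisher found by the harvester");
        harvested.put("dc.identifier.issn", "1234-5678");

        // Title is kept; publisher and ISSN are filled in.
        System.out.println(complete(existing, harvested));
    }
}
```

A real integration would also have to handle repeatable fields and 
decide which source wins on conflict; those are exactly the rules a 
Repository Manager should be able to adjust.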

I will explore (and report on) the suitability of Web-Harvest for those 
tasks: please let me know if you have experience with this tool, or with 
others, for fulfilling these needs!

Have a nice day!

Christophe Dupriez

------------------------------------------------------------------------------
_______________________________________________
Dspace-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-devel
