Harvard University Library is pleased to announce the public launch of Harvard's new Web Archive Collection Service (WAX) http://wax.lib.harvard.edu.
WAX began as a pilot project in July 2006, funded by the University's Library Digital Initiative (LDI) to address the management of web sites by collection managers for long-term archiving. It was the first LDI project specifically oriented toward preserving "born-digital" material. The pilot was designed to address the capture, management, storage, and display of web sites for long-term archiving. It was a collaboration of the University Library's Office for Information Systems with three University partners, each fielding a single project: the Harvard University Archives (Harvard University Library); the Arthur and Elizabeth Schlesinger Library on the History of Women in America (Radcliffe Institute for Advanced Study); and the Edwin O. Reischauer Institute of Japanese Studies (Faculty of Arts and Sciences, with sponsorship from Harvard College Library).

During the pilot, we explored the legal terrain and implemented several methods of mitigating risks. We investigated various technologies and developed workflow efficiencies for the collection managers and the technologists. We analyzed and implemented the metadata and deposit requirements for long-term preservation in our repository. We continue to look at ways to ease the labor-intensive nature of the QA process, to improve display as the software matures, and to assess additional requirements for long-term preservation.

To date, we are storing 5,159 ARC files for 1,405 WAX harvests representing 141 seeds (starting URLs) in our Digital Repository Service (DRS). These include 335 MIME types and 12,133,528 resources (individual HTML pages, images, graphics, audio or video clips, style sheets, scripts, etc.), for a total of 392 gigabytes.

WAX was built using several open source tools developed by the Internet Archive and other International Internet Preservation Consortium (IIPC) members. These IIPC tools include the Heritrix web crawler; the Wayback index and rendering tool; and the NutchWAX index and search tool.
WAX also uses Quartz open source job scheduling software from OpenSymphony.

In February 2009, the pilot public interface was launched and announced to the University community. WAX has now transitioned to a production system supported by the University Library's central infrastructure.

To view the collections, visit: http://wax.lib.harvard.edu. For more information, visit: http://hul.harvard.edu/ois/systems/wax, consult the May 2009 PowerPoint presentation: http://hul.harvard.edu/ois/support/docs-wax.html, or contact Wendy Gogel: [email protected]

Wendy Marcus Gogel
Digital Projects Program Librarian
HUL - Office for Information Systems
90 Mt. Auburn Street
Cambridge, MA 02138
phone: (617) 495-3724
fax: (617) 496-5600
[email protected]
http://digitalcollections.harvard.edu

