Hi Pascal,
Thank you very much for sharing your work. I maintain nearly 12 DSpace 
instances at different universities, and I am trying to standardize my 
automation scripts for ingesting items into DSpace through the REST API. 
Your project definitely has my attention, and I will give it a try very soon. 
One question: have you considered supporting the DSpace Entity model in 
your project?
Best regards,
Fatih

On Friday, March 27, 2026 at 10:48:26 PM UTC+3 Pascal Calarco wrote:

> Hi folks,
>
> I am releasing a set of Python scripts that I have been working on since 
> late last November, called FedHarv (short for federated harvesting). It is 
> now publicly available under the AGPL v3 license for all to use, modify, 
> and build upon, provided it stays free and open source software.
>
> https://github.com/pvcalarco/FedHarv
>
> FedHarv is a sophisticated, production-ready federated harvester for open 
> access academic content, designed to automatically discover, enrich, and 
> harvest scholarly articles with PDF availability from multiple sources. 
>
> The problem we are trying to solve is, to the extent possible, to identify 
> Creative Commons-licensed scholarly works (journal articles, letters to 
> the editor, retractions, errata, book chapters, conference proceedings, 
> and open access books) authored by researchers, faculty, and students of 
> an institution of higher education or research, and to harvest the 
> metadata and associated PDFs from a variety of API services. Where we 
> can't find a non-paywalled version, we use Unpaywall to identify author 
> manuscripts and preprints that can be deposited.
>
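To make the Unpaywall step concrete, here is a minimal sketch of how such a lookup could work. It is not taken from FedHarv; the function names and the filtering rule (skip closed and bronze records, keep the first location with a PDF link and a CC license) are my own assumptions. The endpoint and the field names (`is_oa`, `oa_status`, `oa_locations`, `url_for_pdf`, `license`, `version`) are those of the public Unpaywall v2 API.

```python
import json
import urllib.parse
import urllib.request

UNPAYWALL_URL = "https://api.unpaywall.org/v2/{doi}?email={email}"


def fetch_unpaywall(doi, email):
    """Fetch the Unpaywall record for one DOI (makes a network call)."""
    url = UNPAYWALL_URL.format(
        doi=urllib.parse.quote(doi),      # keeps the "/" inside the DOI
        email=urllib.parse.quote(email),  # Unpaywall asks for a contact email
    )
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def pick_depositable_location(record):
    """Return the first OA location with a PDF and a CC license, else None.

    Hypothetical rule: closed and bronze records are rejected outright,
    because bronze items are free to read but carry no open license.
    """
    if not record.get("is_oa") or record.get("oa_status") in ("closed", "bronze"):
        return None
    for location in record.get("oa_locations") or []:
        license_id = (location.get("license") or "").lower()
        if location.get("url_for_pdf") and license_id.startswith("cc"):
            return location
    return None
```

In use, `pick_depositable_location(fetch_unpaywall(doi, email))` would return either a location dictionary, whose `version` field distinguishes a `submittedVersion` or `acceptedVersion` from the `publishedVersion`, or `None` when nothing depositable was found.
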
> The script then places these metadata and PDFs in a series of folders 
> for the repository manager to quickly check (for departmental and 
> institutional affiliation and CC license correctness) before they are 
> packaged into Simple Archive Format (SAF), ready for batch ingest into 
> DSpace institutional repositories.
>
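For anyone unfamiliar with SAF, the sketch below writes one item in that layout: an item folder holding `dublin_core.xml`, a `contents` file listing the bitstreams, and the PDF itself. This is only an illustration, not FedHarv's packaging code; `write_saf_item` and its arguments are hypothetical names, and the metadata is passed as (element, qualifier, value) tuples purely for convenience.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def write_saf_item(archive_dir, item_name, metadata, pdf_path):
    """Write one Simple Archive Format item folder and return its path.

    metadata is a list of (element, qualifier, value) tuples; a qualifier
    of None is written as "none", as SAF expects for unqualified fields.
    """
    item_dir = Path(archive_dir) / item_name
    item_dir.mkdir(parents=True, exist_ok=True)

    # dublin_core.xml: one <dcvalue> per metadata value
    root = ET.Element("dublin_core")
    for element, qualifier, value in metadata:
        dcvalue = ET.SubElement(
            root, "dcvalue", element=element, qualifier=qualifier or "none"
        )
        dcvalue.text = value
    ET.ElementTree(root).write(
        str(item_dir / "dublin_core.xml"), encoding="utf-8", xml_declaration=True
    )

    # Copy the bitstream and list it, one file name per line, in "contents"
    pdf_path = Path(pdf_path)
    shutil.copy(pdf_path, item_dir / pdf_path.name)
    (item_dir / "contents").write_text(pdf_path.name + "\n", encoding="utf-8")
    return item_dir
```

Calling this once per checked item, with names such as `item_000`, `item_001`, and so on under a single archive folder, gives the directory structure that the DSpace batch import reads.
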
> The harvester isn't perfect, and you should still check that closed or 
> bronze OA items were not harvested in error, but the author has made 
> every effort to prevent this and has encountered few such errors after 
> much iteration.
>
> With this tool, you'll be able to gather as many as possible of the Open 
> Access scholarly works that your community has authored and legally 
> deposit them in your organization's institutional repository. If you 
> find this software useful, please drop me an email! 
>
> ## 🤖 AI Assistance & Authorship Disclosure
>
> **FedHarv** was designed, architected, and verified by **Pascal Calarco**.
>
> During the development process, AI-augmented coding tools (Google Gemini 
> and GitHub Copilot) were utilized to:
> * Generate boilerplate code and initial function structures.
> * Refactor logic for performance (e.g., implementing multi-threading).
> * Assist with documentation, licensing (AGPL-v3), and testing suites.
>
> All AI-generated suggestions have been manually reviewed, tested, and 
> integrated by the author to ensure technical accuracy, conformance to 
> scholarly metadata standards, and adherence to best practices in library 
> and information science.
>
> All best wishes,
>
> Pascal
>
> *Pascal Calarco*¦ Scholarly Communications Librarian and Systems Librarian
>
> Lead, Discovery Team
>
> Research & Publishing Services Unit
> Librarian IV
>
> University of Windsor ¦ J. Francis Leddy Library
> 401 Sunset Avenue ¦ Windsor, Ontario   N9B 3P4
> (519)-253-3000 ¦ leddy.uwindsor.ca
>
>  
>
>  
>
> *The University of Windsor is situated on the traditional territory of the 
> Three Fires Confederacy of First Nations: the Ojibwa, the Odawa, and the 
> Potawatomi.*
>
>  
>
> *Join the fight for post-secondary education at Education2025.ca.*  
>
