Good day, all!
Pascal's harvesting Python scripts are a very interesting project, and I
appeal to us all to carry it forward by supporting it, contributing to the
repo, and growing it into a more sophisticated tool.
For example, we could open a conversation on GitHub about what functionality
we would like to see in the scripts, and also post our questions there, to
make it a growing open source tool and project.

Thank you. Regards

On Thu, Apr 2, 2026 at 11:32 AM Fatih Güneş <[email protected]> wrote:

> Pascal, thank you very much for sharing the roadmap. I have a DSpace
> repository, Anadolu Univ DSpace <https://acikerisim.anadolu.edu.tr/home>,
> whose content I ingested using the OpenAlex API, so all items have an
> OpenAlex work ID. Can I use FedHarv to harvest and upload bitstreams to
> the items?
> Is there a specific configuration to use it only for PDFs/bitstreams?
> Is there a command to use it for this purpose?
> Or should I fork the project and modify it to work this way?
>
> Best regards,
> Fatih.
>
> On Thursday, April 2, 2026 at 6:16:25 AM UTC+3 Pascal Calarco wrote:
>
>> Hi Fatih,
>>
>> Thank you for reaching out. A bit of context before I address your
>> question.
>>
>> My institution is part of a DSpace consortium in Canada, known as
>> Scholaris. I wanted to release this now because research funders in Canada
>> will soon release a revised immediate (0-day) open access publication
>> policy, and I expect many of our researchers will take advantage of the
>> transformative agreements our library has with publishers (Plan S), so
>> much more Canadian scholarly output will be available under a Creative
>> Commons license. We need a tool to help manage this incoming deluge, as the
>> policy strongly encourages deposit in IRs.
>>
>> Yes, we have a group working on Configurable Entities, which we expect to
>> implement after we upgrade to DSpace 9, this summer.
>>
>> It would make a lot of sense to evolve FedHarv to use Configurable Entities
>> once they are in place, certainly for authors and departments. ORCID
>> integration also opens up with this release, which I'd like to incorporate
>> as well.
>>
>> All best regards,
>>
>> Pascal
>>
>> On Wed, Apr 1, 2026, 18:26 Fatih Güneş <[email protected]> wrote:
>>
>>> Hi Pascal,
>>> Thank you very much for sharing your work. I have been maintaining
>>> nearly 12 DSpace instances at different universities, and I am trying to
>>> standardize my automation scripts for ingesting items into DSpace using
>>> the REST APIs. Your project definitely has my attention. I will give it
>>> a try very soon. One question: have you considered supporting the Entity
>>> Model in your project?
>>> Best regards,
>>> Fatih
>>>
>>> On Friday, March 27, 2026 at 10:48:26 PM UTC+3 Pascal Calarco wrote:
>>>
>>>> Hi folks,
>>>>
>>>> I am releasing a set of Python scripts I have been working on since
>>>> late last November, called FedHarv (short for federated harvesting). It's
>>>> now publicly available under an AGPL v3 license for all to use, modify and
>>>> build upon, provided it stays free and open source software.
>>>>
>>>> https://github.com/pvcalarco/FedHarv
>>>>
>>>> FedHarv is a sophisticated, production-ready federated harvester for
>>>> open access academic content, designed to automatically discover, enrich,
>>>> and harvest scholarly articles with PDF availability from multiple sources.
>>>>
>>>> The problem we are trying to solve is to identify, to the extent
>>>> possible, Creative Commons-licensed scholarly works (journal
>>>> articles, letters to the editor, retractions, errata, book chapters,
>>>> conference proceedings, and open access books) that are authored by
>>>> researchers, faculty and students of an institution of higher education or
>>>> research, and to harvest the metadata and associated PDFs from a variety
>>>> of API services. Where we can't find a non-paywalled version, we use
>>>> Unpaywall to identify author manuscripts and preprints that can be
>>>> deposited.
>>>>
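Editorial sketch (not taken from FedHarv): the Unpaywall fallback described above could look roughly like the following. It assumes a record shaped like a response from the Unpaywall v2 API, with an `oa_locations` list whose entries carry `url_for_pdf`, `license`, `version` and `host_type`. The function name `pick_deposit_location` and the preference order are hypothetical choices made for illustration.

```python
from typing import Optional

# Version labels for author manuscripts and preprints in Unpaywall records.
MANUSCRIPT_VERSIONS = {"acceptedVersion", "submittedVersion"}


def pick_deposit_location(record: dict) -> Optional[dict]:
    """Return one OA location with a PDF link, or None if nothing qualifies.

    Preference order (an assumption of this sketch, not FedHarv's rule):
    1. a published version carrying a Creative Commons license,
    2. an author manuscript or preprint hosted in a repository.
    """
    # Only locations that actually expose a PDF link are candidates.
    locations = [
        loc for loc in (record.get("oa_locations") or [])
        if loc.get("url_for_pdf")
    ]

    # First choice: CC-licensed published version (e.g. "cc-by", "cc0").
    for loc in locations:
        license_name = loc.get("license") or ""
        if loc.get("version") == "publishedVersion" and license_name.startswith("cc"):
            return loc

    # Fallback: accepted or submitted manuscript held in a repository.
    for loc in locations:
        if (loc.get("host_type") == "repository"
                and loc.get("version") in MANUSCRIPT_VERSIONS):
            return loc

    return None


# Hand-written sample record (illustrative data, not a real API response).
sample = {
    "oa_locations": [
        {"url_for_pdf": None, "license": "cc-by",
         "version": "publishedVersion", "host_type": "publisher"},
        {"url_for_pdf": "https://example.org/manuscript.pdf", "license": None,
         "version": "acceptedVersion", "host_type": "repository"},
    ]
}
chosen = pick_deposit_location(sample)
```

Here the publisher location is skipped because it has no PDF link, so the repository copy of the accepted manuscript is selected. A human check of the license and version is still needed before deposit.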
>>>> The script then provides these metadata and PDFs in a series of folders
>>>> for the repository manager to quickly check (for departmental and
>>>> institutional affiliation and CC license correctness) and package up
>>>> into Simple Archive Format (SAF), ready for batch ingest into DSpace
>>>> institutional repositories.
>>>>
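Editorial sketch (not taken from FedHarv): a Simple Archive Format item is a directory holding a `dublin_core.xml` file, a `contents` file listing the bitstreams, and the files themselves. A minimal way to write one such item in Python might look like this. The helper name `write_saf_item`, the directory names and the sample metadata are hypothetical.

```python
import shutil
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def write_saf_item(saf_root: Path, item_name: str,
                   metadata: list, pdf_path: Path) -> Path:
    """Create one SAF item directory: dublin_core.xml, contents, and the PDF.

    `metadata` is a list of (element, qualifier, value) tuples; use "none"
    as the qualifier for unqualified Dublin Core fields.
    """
    item_dir = saf_root / item_name
    item_dir.mkdir(parents=True, exist_ok=True)

    # dublin_core.xml: one <dcvalue> per metadata field.
    root = ET.Element("dublin_core")
    for element, qualifier, value in metadata:
        dcvalue = ET.SubElement(root, "dcvalue",
                                element=element, qualifier=qualifier)
        dcvalue.text = value
    ET.ElementTree(root).write(str(item_dir / "dublin_core.xml"),
                               encoding="utf-8", xml_declaration=True)

    # Copy the bitstream and list it in the contents file, one name per line.
    shutil.copy(pdf_path, item_dir / pdf_path.name)
    (item_dir / "contents").write_text(pdf_path.name + "\n", encoding="utf-8")
    return item_dir


# Usage with a placeholder PDF in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    pdf = work / "article.pdf"
    pdf.write_bytes(b"%PDF-1.4\n")  # stand-in bytes, not a real article
    item = write_saf_item(
        work / "saf", "item_001",
        [("title", "none", "An Example Article"),
         ("rights", "uri", "https://creativecommons.org/licenses/by/4.0/")],
        pdf,
    )
    created = sorted(p.name for p in item.iterdir())
```

A directory of such items can then be handed to DSpace's batch import tooling. The metadata fields a real repository needs will depend on its own registry and submission policies.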
>>>> The harvester isn't perfect, and you should still check to make sure
>>>> closed or bronze OA items were not harvested in error, but the author has
>>>> made every effort to prevent this and has encountered few such errors
>>>> after much iteration.
>>>>
>>>> With this tool, you'll be able to gather as many as possible of the open
>>>> access scholarly works that your community has formally written, and
>>>> legally deposit them into your organization's institutional repository.
>>>> If you find this software useful, please drop me an email!
>>>>
>>>> ## 🤖 AI Assistance & Authorship Disclosure
>>>>
>>>> **FedHarv** was designed, architected, and verified by **Pascal
>>>> Calarco**.
>>>>
>>>> During the development process, AI-augmented coding tools (Google
>>>> Gemini and GitHub Copilot) were utilized to:
>>>> * Generate boilerplate code and initial function structures.
>>>> * Refactor logic for performance (e.g., implementing multi-threading).
>>>> * Assist with documentation, licensing (AGPL-v3), and testing suites.
>>>>
>>>> All AI-generated suggestions have been manually reviewed, tested, and
>>>> integrated by the author to ensure technical accuracy,
>>>> scholarly metadata standards, and adherence to best practices in
>>>> library and information science.
>>>>
>>>> All best wishes,
>>>>
>>>> Pascal
>>>>
>>>> *Pascal Calarco*¦ Scholarly Communications Librarian and Systems
>>>> Librarian
>>>>
>>>> Lead, Discovery Team
>>>>
>>>> Research & Publishing Services Unit
>>>> Librarian IV
>>>>
>>>> University of Windsor ¦ J. Francis Leddy Library
>>>> 401 Sunset Avenue ¦ Windsor, Ontario N9B 3P4
>>>> (519)-253-3000 ¦ leddy.uwindsor.ca
>>>>
>>>>
>>>> *The University of Windsor is situated on the traditional territory of
>>>> the Three Fires Confederacy of First Nations: the Ojibwa, the Odawa, and
>>>> the Potawatomi.*
>>>>
>>>>
>>>>
>>>> *Join the fight for post-secondary education at Education2025.ca.*
>>>>
>>> --
>>> All messages to this mailing list should adhere to the Code of Conduct:
>>> https://lyrasis.org/code-of-conduct/
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "DSpace Community" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion visit
>>> https://groups.google.com/d/msgid/dspace-community/c57cb85e-12c4-4032-a392-fc5b465dd6ban%40googlegroups.com
>>> <https://groups.google.com/d/msgid/dspace-community/c57cb85e-12c4-4032-a392-fc5b465dd6ban%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>

