Hello Kevon,

Yes, I would welcome that and it would be fun and useful to collaborate.

All best wishes,

Pascal

On Thu, Apr 2, 2026 at 8:45 AM Kevon Muhoozi <[email protected]> wrote:

> Good day, all!
> Pascal's harvesting Python scripts are a very interesting project, and I
> encourage all of us to carry it forward by supporting it, contributing to
> the repo, and helping it grow more sophisticated.
> For example, we could open a conversation on GitHub about the functionality
> we would like to see in the scripts, and post our questions there as well,
> so that it becomes a growing open source tool and project.
>
> Thank you. Regards
>
> On Thu, Apr 2, 2026 at 11:32 AM Fatih Güneş <[email protected]>
> wrote:
>
>> Pascal, thank you very much for sharing the roadmap. I have a DSpace
>> repository at Anadolu University <https://acikerisim.anadolu.edu.tr/home>
>> whose content I ingested using the OpenAlex API, so all items have an
>> OpenAlex work ID. Can I use FedHarv to harvest bitstreams and upload them
>> to the existing items?
>> Is there a specific configuration to use it only for PDFs/bitstreams?
>> Is there a command to use it for this purpose?
>> Or should I fork the project and modify it to work this way?
>>
>> Best regards,
>> Fatih.
>>
>> On Thursday, April 2, 2026 at 6:16:25 AM UTC+3 Pascal Calarco wrote:
>>
>>> Hi Fatih,
>>>
>>> Thank you for reaching out. A bit of context before I address your
>>> question.
>>>
>>> My institution is part of a DSpace consortium in Canada, known as
>>> Scholaris. I wanted to release this now because research funders in Canada
>>> will soon release a revised immediate (0 day) open access publication
>>> policy, and I expect many of our researchers will take advantage of the
>>> transformative agreements our library has with publishers (Plan S), so
>>> much more of Canadian scholarly output will be available under a Creative
>>> Commons license. We need a tool to help manage this incoming deluge, as the
>>> policy strongly encourages deposit in IRs.
>>>
>>> Yes, we have a group working on Configurable Entities, which we expect
>>> to implement after we upgrade to DSpace 9, this summer.
>>>
>>> It would make a lot of sense to evolve FedHarv to use Configurable
>>> Entities once they are in place, certainly for authors and departments.
>>> ORCID integration also opens up with this release, which I'd like to
>>> incorporate as well.
>>>
>>> All best regards,
>>>
>>> Pascal
>>>
>>> On Wed, Apr 1, 2026, 18:26 Fatih Güneş <[email protected]> wrote:
>>>
>>>> Hi Pascal,
>>>> Thank you very much for sharing your work. I have been maintaining
>>>> nearly 12 DSpace instances at different universities, and I am trying to
>>>> standardize my automation scripts for ingesting items into DSpace using
>>>> REST APIs. Your project definitely has my attention. I will give it
>>>> a try very soon. One question: Have you considered supporting the Entity
>>>> model in your project?
>>>> Best regards,
>>>> Fatih
>>>>
>>>> On Friday, March 27, 2026 at 10:48:26 PM UTC+3 Pascal Calarco wrote:
>>>>
>>>>> Hi folks,
>>>>>
>>>>> I am releasing a set of Python scripts I have been working on since
>>>>> late last November called FedHarv (short for federated harvesting). It is
>>>>> now publicly available under an AGPL v3 license for all to use, modify,
>>>>> and build upon, provided it stays free and open source software.
>>>>>
>>>>> https://github.com/pvcalarco/FedHarv
>>>>>
>>>>> FedHarv is a sophisticated, production-ready federated harvester for
>>>>> open access academic content, designed to automatically discover, enrich,
>>>>> and harvest scholarly articles with PDF availability from multiple
>>>>> sources.
>>>>>
>>>>> The problem we are trying to solve is, to the extent possible, to
>>>>> identify Creative Commons-licensed scholarly works (journal articles,
>>>>> letters to the editor, retractions, errata, book chapters, conference
>>>>> proceedings, and open access books) authored by researchers, faculty, and
>>>>> students of an institution of higher education or research, and to
>>>>> harvest the metadata and associated PDF from a variety of API services.
>>>>> Where we can't find a non-paywalled version, we use Unpaywall to identify
>>>>> author manuscripts and preprints that can be deposited.
>>>>>
>>>>> The script then places these metadata and PDFs in a series of
>>>>> folders for the repository manager to check quickly (for departmental and
>>>>> institutional affiliation and CC license correctness), which are then
>>>>> packaged into Simple Archive Format (SAF), ready for batch ingest into
>>>>> DSpace institutional repositories.
>>>>>
>>>>> The harvester isn't perfect, and you should still check that closed or
>>>>> bronze OA items were not harvested in error, but the author has made
>>>>> every effort to prevent this and has encountered few such errors after
>>>>> much iteration.
>>>>>
>>>>> With this tool, you'll be able to gather as many as possible of the Open
>>>>> Access scholarly works that your community has formally written and
>>>>> legally deposit them into your organization's institutional repository.
>>>>> If you find this software useful, please drop me an email!
>>>>>
>>>>> ## 🤖 AI Assistance & Authorship Disclosure
>>>>>
>>>>> **FedHarv** was designed, architected, and verified by **Pascal
>>>>> Calarco**.
>>>>>
>>>>> During the development process, AI-augmented coding tools (Google
>>>>> Gemini and GitHub Copilot) were utilized to:
>>>>> * Generate boilerplate code and initial function structures.
>>>>> * Refactor logic for performance (e.g., implementing multi-threading).
>>>>> * Assist with documentation, licensing (AGPL-v3), and testing suites.
>>>>>
>>>>> All AI-generated suggestions have been manually reviewed, tested, and
>>>>> integrated by the author to ensure technical accuracy,
>>>>> scholarly metadata standards, and adherence to best practices in
>>>>> library and information science.
>>>>>
>>>>> All best wishes,
>>>>>
>>>>> Pascal
>>>>>
>>>>> *Pascal Calarco*¦ Scholarly Communications Librarian and Systems
>>>>> Librarian
>>>>>
>>>>> Lead, Discovery Team
>>>>>
>>>>> Research & Publishing Services Unit
>>>>> Librarian IV
>>>>>
>>>>> University of Windsor ¦ J. Francis Leddy Library
>>>>> 401 Sunset Avenue ¦ Windsor, Ontario   N9B 3P4
>>>>> (519)-253-3000 ¦ leddy.uwindsor.ca
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *The University of Windsor is situated on the traditional territory of
>>>>> the Three Fires Confederacy of First Nations: the Ojibwa, the Odawa, and
>>>>> the Potawatomi.*
>>>>>
>>>>>
>>>>>
>>>>> *Join the fight for post-secondary education at Education2025.ca.*
>>>>>
>>>> --
>>>> All messages to this mailing list should adhere to the Code of Conduct:
>>>> https://lyrasis.org/code-of-conduct/
>>>> ---
>>>> You received this message because you are subscribed to the Google
>>>> Groups "DSpace Community" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To view this discussion visit
>>>> https://groups.google.com/d/msgid/dspace-community/c57cb85e-12c4-4032-a392-fc5b465dd6ban%40googlegroups.com
>>>> <https://groups.google.com/d/msgid/dspace-community/c57cb85e-12c4-4032-a392-fc5b465dd6ban%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
