Hi Fatih,

I wasn't aware one could load only bitstreams in the web batch ingest. That's a useful addition, I think, providing the user with more options. What is the match point between the bitstream and the metadata record? I actually have another collection of 3400 graduate major papers where I have MARC21 records and want to programmatically attach the bitstream to them. I suppose one exports the metadata-only records from DSpace and then builds SAF packages which overlay the existing metadata record?

I suppose one could add a setting to config.ini for which 'mode' FedHarv would run in: metadata and bitstream harvesting and packaging, or adding bitstreams only. Alternatively, it could complement the main script as a stand-alone secondary script that could be added to the main branch. I'd welcome either, if you want to collaborate.
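To make the config.ini idea concrete, here is a minimal sketch of how a run mode could be read and validated. Everything named here is an assumption for illustration: the `[harvest]` section, the `mode` option, and the two mode values `full` and `bitstreams_only` do not exist in FedHarv today.

```python
import configparser

# Hypothetical mode names for illustration; FedHarv does not define these.
VALID_MODES = {"full", "bitstreams_only"}


def read_mode(ini_text: str) -> str:
    """Return the run mode from config.ini text, defaulting to 'full'."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # The fallback keeps config files without the option working unchanged.
    mode = parser.get("harvest", "mode", fallback="full").strip().lower()
    if mode not in VALID_MODES:
        raise ValueError(
            f"unknown mode {mode!r}; expected one of {sorted(VALID_MODES)}"
        )
    return mode


# Assumed config.ini fragment selecting the bitstream-only behaviour.
SAMPLE = """
[harvest]
mode = bitstreams_only
"""

if __name__ == "__main__":
    print(read_mode(SAMPLE))
```

With a default of `full`, anyone who never sets the option would see no change in behaviour, which is one argument for a mode switch over a separate script.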
All best wishes,

Pascal

On Thu, Apr 2, 2026 at 4:32 AM Fatih Güneş <[email protected]> wrote:

> Pascal, thank you very much for sharing the roadmap. I have a DSpace repository, Anadolu Univ DSpace <https://acikerisim.anadolu.edu.tr/home>, whose content I ingested using the OpenAlex API, so all items have an OpenAlex work ID. Can I use FedHarv to harvest and upload bitstreams to the items?
> Is there a specific configuration to use it only for PDFs/bitstreams?
> Is there a command to use it for this purpose?
> Or should I fork the project and modify it to work this way?
>
> Best regards,
> Fatih.
>
> On Thursday, April 2, 2026 at 6:16:25 AM UTC+3 Pascal Calarco wrote:
>
>> Hi Fatih,
>>
>> Thank you for reaching out. A bit of context before I address your question.
>>
>> My institution is part of a DSpace consortium in Canada, known as Scholaris. I wanted to release this now because research funders in Canada will soon release a revised immediate (0 day) open access publication policy, and I expect many of our researchers will take advantage of the transformative agreements our library has with publishers (Plan S), and so much more of Canadian scholarly output will be available under a Creative Commons license. We need a tool to help manage this incoming deluge, as the policy strongly encourages deposit in IRs.
>>
>> Yes, we have a group working on Configurable Entities, which we expect to implement after we upgrade to DSpace 9 this summer.
>>
>> It would make much sense to evolve FedHarv to use Configurable Entities once in place, certainly for authors and departments. ORCID integration also opens up with this release, which I'd like to incorporate as well.
>>
>> All best regards,
>>
>> Pascal
>>
>> On Wed, Apr 1, 2026, 18:26 Fatih Güneş <[email protected]> wrote:
>>
>>> Hi Pascal,
>>> Thank you very much for sharing your work. I have been maintaining nearly 12 DSpace instances at different universities and I am trying to standardize my automation scripts for ingesting items into DSpace using REST APIs. Your project definitely has my attention. I will give it a try very soon. One question: have you considered supporting the Entity Model in your project?
>>> Best regards,
>>> Fatih
>>>
>>> On Friday, March 27, 2026 at 10:48:26 PM UTC+3 Pascal Calarco wrote:
>>>
>>>> Hi folks,
>>>>
>>>> I am releasing a set of Python scripts I have been working on since late last November called FedHarv (short for federated harvesting). It's available now publicly under an AGPL v.3 license for all to use, modify and build upon, provided it stays free and open source software.
>>>>
>>>> https://github.com/pvcalarco/FedHarv
>>>>
>>>> FedHarv is a sophisticated, production-ready federated harvester for open access academic content, designed to automatically discover, enrich, and harvest scholarly articles with PDF availability from multiple sources.
>>>>
>>>> The problem we are trying to solve is, to the extent possible, to identify Creative Commons-licensed scholarly works (journal articles, letters to the editor, retractions, errata, book chapters, conference proceedings, and open access books) that are authored by researchers, faculty and students of an institution of higher education or research, and to harvest the metadata and associated PDF from a variety of API services. Where we can't find a non-paywalled version, we use Unpaywall to identify author manuscripts and preprints that can be deposited.
>>>>
>>>> The script then provides these metadata and PDFs in a series of folders for the repository manager to quickly check (for departmental and institutional affiliation and CC license correctness) and package up into Simple Archive Format (SAF), ready for batch ingest into DSpace institutional repositories.
>>>>
>>>> The harvester isn't perfect and you should still check to make sure closed or bronze OA items were not harvested in error, but the author has made every effort to prevent this and has encountered few such errors after much iteration.
>>>>
>>>> With this tool, you'll be able to gather as much as possible of the open access scholarly work that your community has formally written, and legally deposit it in your organization's institutional repository. If you find this software useful, please drop me an email!
>>>>
>>>> ## 🤖 AI Assistance & Authorship Disclosure
>>>>
>>>> **FedHarv** was designed, architected, and verified by **Pascal Calarco**.
>>>>
>>>> During the development process, AI-augmented coding tools (Google Gemini and GitHub Copilot) were utilized to:
>>>> * Generate boilerplate code and initial function structures.
>>>> * Refactor logic for performance (e.g., implementing multi-threading).
>>>> * Assist with documentation, licensing (AGPL-v3), and testing suites.
>>>>
>>>> All AI-generated suggestions have been manually reviewed, tested, and integrated by the author to ensure technical accuracy, scholarly metadata standards, and adherence to best practices in library and information science.
>>>>
>>>> All best wishes,
>>>>
>>>> Pascal
>>>>
>>>> *Pascal Calarco* ¦ Scholarly Communications Librarian and Systems Librarian
>>>> Lead, Discovery Team
>>>> Research & Publishing Services Unit
>>>> Librarian IV
>>>> University of Windsor ¦ J. Francis Leddy Library
>>>> 401 Sunset Avenue ¦ Windsor, Ontario N9B 3P4
>>>> (519)-253-3000 ¦ leddy.uwindsor.ca
>>>>
>>>> *The University of Windsor is situated on the traditional territory of the Three Fires Confederacy of First Nations: the Ojibwa, the Odawa, and the Potawatomi.*
>>>>
>>>> *Join the fight for post-secondary education at Education2025.ca.*
>>>
>>> --
>>> All messages to this mailing list should adhere to the Code of Conduct: https://lyrasis.org/code-of-conduct/
>>> ---
>>> You received this message because you are subscribed to the Google Groups "DSpace Community" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>> To view this discussion visit https://groups.google.com/d/msgid/dspace-community/c57cb85e-12c4-4032-a392-fc5b465dd6ban%40googlegroups.com.
