Robin, that's interesting. It would be great to hear more about the enhancements you have made (for us, the DSpace 5 version would be the most relevant). Maybe your colleagues can add more details? Are those enhancements publicly available on GitHub? The "Download All" button sounds interesting as well; we actually had a request for something very similar recently.
Best regards
Oliver

rice wrote on Friday, 30 April 2021 at 18:34:25 UTC+2:

> Hello,
>
> I saw this and want to give a brief response. I am about to go on leave, but my colleagues John Pinto or Pauline Ward could say more.
>
> In Edinburgh DataShare (https://datashare.ed.ac.uk/), which is an institutional data repository, we have made some enhancements (first in DSpace 5, now in 6.x) to allow drag-and-drop uploads of up to 20 GB per item. We will also allow batch import of up to 100 GB per item. We have tested this, and we have a message for larger data downloads saying that it will take a while, sometimes more than a day, but the download is resumable and so robust.
>
> We also have a 'download all' button which points to a zip file of the item's bitstreams, since most datasets have numerous files.
>
> Cheers,
> Robin Rice
> University of Edinburgh Library
>
> On Friday, 9 April 2021 at 15:28:42 UTC+1 [email protected] wrote:
>
>> Bill,
>>
>> I just saw your request to discuss bitstream sizes in DSpace and would like to join the conversation.
>>
>> We also have some reservations about uploading large files to DSpace via the web UI. For us, the reasonable limit for uploading this way is about 5 GB. If someone wants to publish data larger than this (which has happened a couple of times so far, and we expect it to happen more often in the future), we offer to let them upload the files to our server via WebDAV. Once we have the files, we build an SAF import package from them and ingest it into DSpace on behalf of the user, sending it through the normal approval workflow. There we reject the item, so that the user has the chance to review and change the metadata before it is eventually approved. Another scenario using a similar approach is to let the user create a new item in DSpace, but tell them not to upload the file within that process and to use the WebDAV option instead.
>> If we know the number of the item the user created and have the files, we can add the file with the dspace command (however, this is a new function that The Library Code implemented for us, so it is not available in generic DSpace yet).
>>
>> We do not use the upload limit in DSpace. As far as I remember, it was not working properly in the DSpace 5 JSPUI. Admittedly, I haven't tried it for a long time, so maybe this issue has been solved in the meantime.
>>
>> Let me add another aspect of large files, this time in the other direction (downloads rather than uploads): we store all of our bitstreams on S3 storage, and as DSpace 5 does not natively support that, we use Cloudian Hyperfile, which provides an NFS-mountable volume that is linked to our assetstore. That means all of the bitstreams (including thumbnails, licenses and so on) go to the S3 storage. This basically works fine, but with large files we once had some trouble with web crawlers harvesting those files: if too many users fetched too many of the large files in parallel, this caused cache problems on the Hyperfile volume. To avoid this, we tuned some of the cache settings on Hyperfile and excluded the big files from crawling in our robots.txt (as we think it would be rather useless to crawl them at all). That has solved those problems so far, but I guess it's worth noting.
>>
>> If anyone has experience with other download regulators that prevent a user from downloading too much in parallel, I would be eager to hear about it.
>>
>> Best
>> Oliver
>>
>> Bill Tantzen wrote on Thursday, 1 April 2021 at 20:23:12 UTC+2:
>>
>>> If you have a minute, I am trying to get a feel for some of the larger (reasonable) bitstreams the community is currently supporting.
>>> On my site, we have removed the DSpace upload limits to allow for records containing research data, but of course there are practical limits that dictate what makes for a good user experience.
>>>
>>> What is the largest bitstream you support? Do you enforce upload limits? Assuming download speeds are faster than upload speeds, what are some of the methods in use (besides the DSpace GUI) to get large files onto the server? What alternatives to simple DSpace upload are currently in use, Globus for instance?
>>>
>>> I realize the answer to these questions will always include "it depends...", but are these all questions you have had at your institution, and how have you dealt with them?
>>>
>>> Thanks for any discussion you wish to contribute!
>>> ~~ Bill
>>>
>>> --
>>> Human wheels spin round and round
>>> While the clock keeps the pace... -- John Mellencamp
>>> ________________________________________________________________
>>> Bill Tantzen University of Minnesota Libraries
>>> 612-626-9949 (U of M) 612-325-1777 (cell)

--
All messages to this mailing list should adhere to the Code of Conduct: https://duraspace.org/about/policies/code-of-conduct/
---
You received this message because you are subscribed to the Google Groups "DSpace Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/dspace-community/7431e2de-5692-4891-b449-5510fba5b332n%40googlegroups.com.
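
For readers who have not built one by hand: the SAF (Simple Archive Format) package mentioned in the thread is just a directory per item holding the files, a `contents` manifest, and a `dublin_core.xml` metadata file. The sketch below is a rough illustration under that assumption, not the script anyone in this thread actually uses. The function name `build_saf_item`, its parameters, and the choice of only two metadata fields are hypothetical.

```python
from pathlib import Path
import shutil
from xml.sax.saxutils import escape


def build_saf_item(source_files, item_dir, title, author):
    """Create one SAF item directory (hypothetical helper, for illustration).

    The directory ends up with a copy of each source file, a 'contents'
    manifest listing one file name per line, and a minimal dublin_core.xml.
    """
    item_dir = Path(item_dir)
    item_dir.mkdir(parents=True, exist_ok=True)

    names = []
    for src in map(Path, source_files):
        # Copy rather than move, so the uploaded originals stay untouched.
        shutil.copy2(src, item_dir / src.name)
        names.append(src.name)

    # 'contents' lists the bitstream file names, one per line.
    (item_dir / "contents").write_text(
        "".join(f"{name}\n" for name in names), encoding="utf-8"
    )

    # Minimal metadata; real deposits would carry more fields than this.
    dublin_core = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<dublin_core>\n"
        f'  <dcvalue element="title" qualifier="none">{escape(title)}</dcvalue>\n'
        f'  <dcvalue element="contributor" qualifier="author">{escape(author)}</dcvalue>\n'
        "</dublin_core>\n"
    )
    (item_dir / "dublin_core.xml").write_text(dublin_core, encoding="utf-8")
    return item_dir
```

A parent directory containing one or more such item directories would then be handed to the `dspace import` command. The exact options (submitter, target collection, map file, whether the item goes through the workflow) depend on the DSpace version and are best checked against its documentation. For very large files it may be preferable to move or hard-link the files instead of copying them.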
