Hello, I saw this and want to give a brief response. I am about to go on leave, but my colleagues John Pinto or Pauline Ward could say more.
In Edinburgh DataShare (https://datashare.ed.ac.uk/), which is an institutional data repository, we have made some enhancements (first in DSpace 5, now in 6.x) to allow drag-and-drop uploads of up to 20 GB per item. We will also allow batch import of up to 100 GB per item. We have tested this, and for larger data downloads we show a message warning that the download will take a while, sometimes more than a day; the download is resumable, so it is robust. We also have a 'download all' button which points to a zip file of the item's bitstreams, since most datasets have numerous files.

Cheers,
Robin Rice
University of Edinburgh Library

On Friday, 9 April 2021 at 15:28:42 UTC+1 [email protected] wrote:

> Bill,
>
> I just saw your request to discuss bitstream sizes in DSpace and would like to join the conversation.
>
> We also have some reservations about uploading large files to DSpace via the web UI. For us, the reasonable limit for uploading this way is about 5 GB. If someone wants to publish data larger than this (which has happened a couple of times already, and we expect it to happen more often in the future), we offer them the option of uploading the files to our server via WebDAV. Once we have the files, we build an SAF import package from them and ingest it into DSpace on behalf of the user, submitting it to the normal approval workflow. There we reject the item, so that the user has the chance to review and change the metadata before it is eventually approved. Another scenario using a similar approach is to let the user create a new item in DSpace, but tell them not to upload the file within that process and to use the WebDAV option instead. If we know the number of the item the user created and have the files, we can add the file with the dspace command (however, this is a new function The Library Code implemented for us and thus not available in generic DSpace yet).
>
> We do not use the upload limit in DSpace; as far as I remember, it was not working properly in the DSpace 5 JSPUI. But admittedly I haven't tried it for a long time now, so maybe this issue has been solved in the meantime.
>
> Let me add another aspect of large files, but in the other direction (downloads rather than uploads): we store all of our bitstreams on S3 storage, and as DSpace 5 does not natively support that, we use Cloudian Hyperfile, which provides an NFS-mountable volume that is linked to our assetstore. That means all of the bitstreams (including thumbnails, licenses and so on) go to the S3 storage. This basically works fine, but with large files we once had some trouble in connection with web crawlers harvesting those files: if too many users were fetching too many of the large files in parallel, this caused cache problems on the Hyperfile volume. To avoid this, we tuned some of the cache settings on Hyperfile, and we excluded the big files from crawling in our robots.txt (as we think it would be rather useless to crawl them at all). That has solved those problems so far. But I guess it's something worth noting.
>
> If anyone has experience with other download regulators that prevent a user from downloading too many files in parallel, I would be eager to hear about it.
>
> Best
> Oliver
>
> Bill Tantzen wrote on Thursday, 1 April 2021 at 20:23:12 UTC+2:
>
>> If you have a minute, I am trying to get a feel for some of the larger (reasonable) bitstreams the community is currently supporting. On my site, we have removed the DSpace upload limits to allow for records containing research data, but of course there are practical limits that dictate what makes for a good user experience.
>>
>> What is the largest bitstream you support? Do you enforce upload limits?
>> Assuming download speeds are faster than upload speeds, what are some of the methods in use (besides the DSpace GUI) to get large files onto the server? What are some alternatives to simple DSpace upload currently in use, such as Globus?
>>
>> I realize the answer to these questions will always include "it depends...", but are these all questions you have had at your institution, and how have you dealt with them?
>>
>> Thanks for any discussion you wish to contribute!
>> ~~ Bill
>>
>> --
>> Human wheels spin round and round
>> While the clock keeps the pace... -- John Mellencamp
>> ________________________________________________________________
>> Bill Tantzen University of Minnesota Libraries
>> 612-626-9949 (U of M) 612-325-1777 (cell)
>>
>

--
All messages to this mailing list should adhere to the Code of Conduct: https://duraspace.org/about/policies/code-of-conduct/
---
You received this message because you are subscribed to the Google Groups "DSpace Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/dspace-community/ad1c5103-a8cc-4547-92bc-7006a98b16c0n%40googlegroups.com.
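
For readers unfamiliar with the SAF step Oliver mentions in the quoted message (files arrive via WebDAV, then staff build a Simple Archive Format package and ingest it), here is a minimal sketch of what building one SAF item directory might look like. It is not Oliver's actual tooling. The function name `build_saf_item`, the `element.qualifier` metadata keys, and the directory names are all assumptions made for illustration; only the `dublin_core.xml` and `contents` file names come from the SAF layout itself.

```python
from __future__ import annotations

import shutil
from pathlib import Path
from xml.etree import ElementTree as ET


def build_saf_item(item_dir: Path, files: list[Path], metadata: dict[str, str]) -> None:
    """Write one SAF item directory (hypothetical helper, not part of DSpace)."""
    item_dir.mkdir(parents=True, exist_ok=True)

    # dublin_core.xml holds one <dcvalue> per metadata field.
    # Assumption: keys look like "title" or "date.issued".
    root = ET.Element("dublin_core")
    for key, value in metadata.items():
        element, _, qualifier = key.partition(".")
        dcvalue = ET.SubElement(
            root, "dcvalue", element=element, qualifier=qualifier or "none"
        )
        dcvalue.text = value
    ET.ElementTree(root).write(
        str(item_dir / "dublin_core.xml"), encoding="utf-8", xml_declaration=True
    )

    # Copy each bitstream next to the metadata and list it in "contents".
    # For very large files, moving or linking may be preferable to copying.
    names = []
    for src in files:
        shutil.copy2(src, item_dir / src.name)
        names.append(src.name)
    (item_dir / "contents").write_text("\n".join(names) + "\n", encoding="utf-8")
```

A directory of such items would then typically be handed to the DSpace batch import command (`[dspace]/bin/dspace import --add` with its `--eperson`, `--collection`, `--source` and `--mapfile` options). Check the documentation for your DSpace version for the exact flags, including the one that routes imported items through the workflow.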

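
Oliver's other measure, excluding big files from crawling via robots.txt, could be automated along the following lines. This is only a sketch under stated assumptions: the function names, the 5 GB threshold in the usage example, and the `/bitstream/handle/...` paths are illustrative and not taken from his setup, and the list of (path, size) pairs would have to come from your own repository.

```python
from urllib import robotparser


def robots_txt_for_large_files(bitstreams, size_limit_bytes):
    """Return robots.txt text disallowing every bitstream above the size limit.

    `bitstreams` is an iterable of (url_path, size_in_bytes) pairs
    (hypothetical input; how you list bitstreams is up to your installation).
    """
    lines = ["User-agent: *"]
    for path, size in sorted(bitstreams):
        if size > size_limit_bytes:
            lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


def is_crawlable(robots_txt, path, user_agent="AnyBot"):
    """Check a path against the generated rules with the standard-library parser."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)
```

For example, calling `robots_txt_for_large_files(pairs, 5 * 1024**3)` and then `is_crawlable(text, path)` lets you confirm that a given large file is excluded before deploying the rules. Note that robots.txt only affects crawlers that honour it; it does not limit parallel downloads by ordinary users.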