Interesting discussion. I'm glad I asked. Thank you all.

I'm going to give item import a shot. Let's hope it doesn't choke on the 150 GB 
payload. It shouldn't, if it just copies the file into the asset store. My fear 
is that it tries to load the file into memory for some reason and overflows 
the heap. 
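
To make the distinction concrete, here is a minimal Python sketch of the kind
of chunked copy that would be safe for a file this size. It is not DSpace's
code (DSpace is written in Java), and it says nothing about how the importer
actually behaves; copy_in_chunks and CHUNK_SIZE are names made up for this
illustration. The MD5 is there only so the copy can be verified afterwards.

```python
import hashlib

# 1 MiB per read; an arbitrary choice for this sketch, not a DSpace setting.
CHUNK_SIZE = 1024 * 1024


def copy_in_chunks(src_path, dst_path, chunk_size=CHUNK_SIZE):
    """Copy src_path to dst_path one chunk at a time.

    Returns (bytes_copied, md5_hex). Only one chunk is held in memory at
    any moment, so memory use does not grow with the size of the file.
    """
    digest = hashlib.md5()
    bytes_copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            dst.write(chunk)
            digest.update(chunk)
            bytes_copied += len(chunk)
    return bytes_copied, digest.hexdigest()
```

The failure I am worried about is the opposite pattern: a single read() of
the whole file, which would need the full 150 GB in memory at once.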

Here goes. 

Cheers, 
Bill


> -----Original Message-----
> From: Pottinger, Hardy J. [mailto:pottinge...@umsystem.edu]
> Sent: Thursday, August 30, 2012 12:31 PM
> To: Pottinger, Hardy J.; Ingram, William A; dspace-tech@lists.sourceforge.net
> Subject: Re: [Dspace-tech] Ingesting large data set
> 
> This may be just me hijacking the thread, so, apologies up front, but I
> followed a link [1] on the Code4Lib mail list just now, and came across
> Miso Dataset [2], which looks very cool, indeed.
> 
> [1] http://selection.datavisualization.ch/
> [2] http://misoproject.com/dataset/
> 
> --
> HARDY POTTINGER <pottinge...@umsystem.edu>
> University of Missouri Library Systems
> http://lso.umsystem.edu/~pottingerhj/
> https://MOspace.umsystem.edu/
> "Do you love it? Do you hate it? There it is, the way you made it."
> --Frank Zappa
> 
> 
> 
> 
> 
> On 8/30/12 11:16 AM, "Pottinger, Hardy J." <pottinge...@umsystem.edu>
> wrote:
> 
> >Hi, Bill, the theoretical limit for posting data via HTTP is 1.8 GB [1].
> >Your only recourse for storing this particular data set in DSpace is to
> >transfer to the server via FTP, SFTP, or SCP, and then either batch load,
> >or run the item update script [2]. However, my main question is: once it
> >is *in* DSpace, how do you plan on getting it *out*?
> >
> >I'd love to hear more from folks who are storing large data sets, just to
> >hear how you're handling usability of the stored data.
> >
> >[warning, I'm about to jump off the deep end here...]
> >
> >
> >One possibility worth exploring is streaming uploads and downloads. I've
> >come across streaming upload clients before, and in a brief bit of
> >googling, I discovered that Apache Commons supports streaming upload. [3]
> >
> >What would be cool is if someone had a streaming dataset viewer, where you
> >could plug some kind of visualization into a 'thumbnail' snapshot of a
> >data set, and then have the full data set stream in to fill out that
> >visualization/analysis. I can't be the first person to have thought of
> >such a thing; somebody has to be working on this already, right?
> >
> >[1] http://stackoverflow.com/questions/1922414/file-upload-limit-in-http
> >[2] https://wiki.duraspace.org/display/DSDOC18/Updating+Items+via+Simple+Archive+Format
> >[3] http://commons.apache.org/fileupload/streaming.html
> >
> >--
> >HARDY POTTINGER <pottinge...@umsystem.edu>
> >University of Missouri Library Systems
> >http://lso.umsystem.edu/~pottingerhj/
> >https://MOspace.umsystem.edu/
> >"Debug only code. Comments lie."
> >
> >
> >
> >
> >
> >On 8/30/12 10:53 AM, "Ingram, William A" <wingr...@illinois.edu> wrote:
> >
> >>I apologize if a similar question has been answered in a prior thread.
> >>
> >>
> >>We have a student needing to submit a 150 GB data set into DSpace. Is
> >>this even possible? Are there any tips or workarounds I should try?
> >>
> >>
> >>Cheers,
> >>Bill
> >>
> >>
> >>
> >>
> >
> >
> >--------------------------------------------------------------------------
> >----
> >Live Security Virtual Conference
> >Exclusive live event will cover all the ways today's security and
> >threat landscape has changed and how IT managers can respond. Discussions
> >will include endpoint security, mobile security and the latest in malware
> >threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> >_______________________________________________
> >DSpace-tech mailing list
> >DSpace-tech@lists.sourceforge.net
> >https://lists.sourceforge.net/lists/listinfo/dspace-tech


