I see where you are going :-) I just ran a 400MB test with an ATOMZip export. Seems to have worked just fine.
A 900MB datastream export to ATOMZip test failed. No exception generated in the logs, just an internal server error. I noticed with 3.4.2 this can indicate the JVM ran out of memory (not surprising if the export is still being collected into a ByteArrayOutputStream, I guess).

Scott

On 05/18/2011 11:53 AM, Stephen Bayliss wrote:
> Hi Scott
>
> Thanks for that feedback.
>
> It would be interesting to find out if you get the same problem using the
> AtomZip export format (info:fedora/fedora-system:ATOMZip-1.1)
>
> Steve
>
>> -----Original Message-----
>> From: Scott Hammel [mailto:sc...@clemson.edu]
>> Sent: 18 May 2011 16:16
>> To: Support and info exchange list for Fedora users.
>> Subject: Re: [fcrepo-user] REST export API negative array index exception
>>
>> Scott, Steve,
>>
>> REST export in archive format still blows up with Fedora 3.4.2. Actually,
>> it is crashing on a datastream < 300MB. I gave the JVM 1.5GB of heap, BTW.
>>
>> Regardless, the exception that is in fedora.log is a negative array
>> index exception. It looks like it is actually occurring down in the
>> base64 encoder according to the stack trace.
>>
>> It occurs to me that building support for a full archival export of an
>> object in memory for arbitrarily large objects might be pragmatically
>> (is that a word?) impossible: e.g., on 32-bit systems I think you bump
>> into problems giving the JVM more than ~1.8 GB of RAM. That alone limits
>> the size of exportable objects to well under 2GB in that environment.
>>
>> If I was more adept with Java, I'd volunteer to write an exporter that
>> spooled to disk, but alas, I am not and it would take me twice as long
>> as someone who is. :-(
>>
>> I can take one of several alternative paths with my particular project,
>> so it isn't too big an issue to *me* .... I just have to do a little
>> more coding in a middle-tier. Don't know about other folks, of course.
>>
>> -Scott
>>
>> On 05/18/2011 01:08 AM, Stephen Bayliss wrote:
>>> Looking at those lines of code it looks like in theory there would be
>>> a problem there. Once this is confirmed we should probably add a test
>>> case to the large datastreams test suite. And it is likely to cause a
>>> problem with datastreams smaller than 2GB (2^31-1 as maximum array
>>> index) due to the archive export base64-encoding the content.
>>>
>>>> -----Original Message-----
>>>> From: Scott Prater [mailto:pra...@wisc.edu]
>>>> Sent: 17 May 2011 18:33
>>>> To: Support and info exchange list for Fedora users.
>>>> Subject: Re: [fcrepo-user] REST export API negative array index exception
>>>>
>>>> Yes, trying with the latest stable version (3.4.2) would be useful,
>>>> if you don't mind. There were some low-level garbage collection
>>>> problems that were fixed in the 3.4.2 release; these problems
>>>> manifested themselves in a variety of ways.
>>>>
>>>> I'm not saying this is the issue, but it wouldn't hurt to verify that
>>>> your problem can be reproduced in 3.4.2.
>>>>
>>>> thanks,
>>>>
>>>> -- Scott
>>>>
>>>> On 05/17/2011 12:22 PM, Scott Hammel wrote:
>>>>> I'm pretty sure it is 3.4.0 (from files on the server it looks like an
>>>>> August 2010 build. The server is in a totally isolated network with
>>>>> nothing with GUI support that can hit the admin tools).
>>>>>
>>>>> Tomcat is the version bundled with the Fedora installer.
>>>>>
>>>>> Would you like me to be sure I'm running at the latest version and try
>>>>> the test scripts again before you go forward?
>>>>>
>>>>> Scott
>>>>>
>>>>> On 05/17/2011 12:45 PM, Scott Prater wrote:
>>>>>> Thanks, Scott. I'll try to reproduce the problem in my environment,
>>>>>> Fedora 3.4.2.
>>>>>>
>>>>>> Can you tell me what version of Fedora and Tomcat (or other webapp
>>>>>> server) you're using?
>>>>>>
>>>>>> -- Scott
>>>>>>
>>>>>> On 05/17/2011 11:08 AM, Scott Hammel wrote:
>>>>>>> Hey, Scott,
>>>>>>>
>>>>>>> Thanks for responding. I'm more a C/C++ programmer and not a Java
>>>>>>> programmer (though I sometimes play one on the Internet), so I'm
>>>>>>> just guessing on the array bounds -- feels like something
>>>>>>> incrementing an int into the sign bit, though I'd think Java would
>>>>>>> throw some array bounds exception before that happened. Figured I'd
>>>>>>> do a little math later maybe to test my hypothesis.
>>>>>>>
>>>>>>> Recall, this was all in a 32-bit environment. I really hope it is a
>>>>>>> non-issue and something I'm doing in the end. Note disseminating the
>>>>>>> datastream content directly appears to work OK, which confuses me a
>>>>>>> little, though I haven't looked to see if the code for that does
>>>>>>> things differently.
>>>>>>>
>>>>>>> Anyway, here's a series of commands (extracted from my test scripts)
>>>>>>> that should reproduce the problem:
>>>>>>>
>>>>>>> mkdir /usr/fedora/tomcat/webapps/ROOT/ingestpool
>>>>>>> mkdir /tmp/fedrun
>>>>>>> dir=/tmp/fedrun
>>>>>>> pid=test:pid01
>>>>>>>
>>>>>>> dd if=/dev/urandom of=/usr/fedora/tomcat/webapps/ROOT/ingestpool/sample.bin bs=1M count=400
>>>>>>>
>>>>>>> ./makefoxml $pid http://localhost:8080/ingestpool/sample.bin > $dir/sample.xml
>>>>>>>
>>>>>>> /usr/fedora/client/bin/fedora-ingest.sh f $dir/sample.xml info:fedora/fedora-system:FOXML-1.1 localhost:8080 fedoraAdmin <insert pwd here> http
>>>>>>>
>>>>>>> wget -O $dir/export.xml --auth-no-challenge --http-user=fedoraAdmin --http-password=<insert pwd here> "http://localhost:8080/fedora/objects/$pid/export?context=archive"
>>>>>>>
>>>>>>> Note: I use the REST call via a wget rather than the provided export
>>>>>>> client scripts because it looks to me from the Java heap explosion
>>>>>>> that the export scripts must end up doing the export via the SOAP
>>>>>>> API.
>>>>>>> --
>>>>>>> The content of makefoxml:
>>>>>>>
>>>>>>> #!/bin/bash
>>>>>>>
>>>>>>> #usage: makefoxml <pid> <refurl>
>>>>>>> #escape slashes in the URL
>>>>>>> RF=${2//\//\\/}
>>>>>>> #if you need to escape ampersands as well, uncomment this:
>>>>>>> #RF=${RF//'&'/'\&'}
>>>>>>>
>>>>>>> # make substitutions ....
>>>>>>> sed '
>>>>>>> s/PID=""/PID="'"$1"'"/
>>>>>>> s/rdf:about=""/rdf:about="info:fedora\/'"$1"'"/
>>>>>>> s/dc:identifier>/dc:identifier>'"$1"'/
>>>>>>> s/REF=""/REF="'"${RF}"'"/
>>>>>>> ' < "foxml_tpl.xml"
>>>>>>>
>>>>>>> --
>>>>>>> The content of foxml_tpl.xml (the sed script above does the edits
>>>>>>> noted in the xml comments in this template):
>>>>>>>
>>>>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>>>>> <!-- following element: set the PID attribute -->
>>>>>>> <foxml:digitalObject VERSION="1.1" PID=""
>>>>>>>     xmlns:foxml="info:fedora/fedora-system:def/foxml#"
>>>>>>>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>>>>>     xsi:schemaLocation="info:fedora/fedora-system:def/foxml# http://www.fedora.info/definitions/1/0/foxml1-1.xsd">
>>>>>>>
>>>>>>>   <foxml:objectProperties>
>>>>>>>     <foxml:property NAME="info:fedora/fedora-system:def/model#state" VALUE="A"/>
>>>>>>>     <foxml:property NAME="info:fedora/fedora-system:def/model#label" VALUE=""/>
>>>>>>>     <foxml:property NAME="info:fedora/fedora-system:def/model#ownerId" VALUE="fedoraAdmin"/>
>>>>>>>   </foxml:objectProperties>
>>>>>>>
>>>>>>>   <foxml:datastream CONTROL_GROUP="X" ID="RELS-EXT">
>>>>>>>     <foxml:datastreamVersion FORMAT_URI="info:fedora/fedora-system:FedoraRELSExt-1.0"
>>>>>>>         ID="RELS-EXT.0" LABEL="RDF Statements about this Object"
>>>>>>>         MIMETYPE="application/rdf+xml">
>>>>>>>       <foxml:xmlContent>
>>>>>>>         <rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/"
>>>>>>>             xmlns:fedora="info:fedora/fedora-system:def/relations-external#"
>>>>>>>             xmlns:fedora-model="info:fedora/fedora-system:def/model#"
>>>>>>>             xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
>>>>>>>             xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>>             xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
>>>>>>>           <!-- following element: put the PID as the value for the rdf:about attribute -->
>>>>>>>           <rdf:description rdf:about="">
>>>>>>>           </rdf:description>
>>>>>>>         </rdf:RDF>
>>>>>>>       </foxml:xmlContent>
>>>>>>>     </foxml:datastreamVersion>
>>>>>>>   </foxml:datastream>
>>>>>>>
>>>>>>>   <foxml:datastream CONTROL_GROUP="X" ID="DC" STATE="A" VERSIONABLE="true">
>>>>>>>     <foxml:datastreamVersion ID="DC.0" LABEL="Dublin Core Record" MIMETYPE="text/xml">
>>>>>>>       <foxml:xmlContent>
>>>>>>>         <oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/"
>>>>>>>             xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
>>>>>>>             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>>>>>             xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
>>>>>>>           <dc:title></dc:title>
>>>>>>>           <dc:creator>Test Program</dc:creator>
>>>>>>>           <dc:description>A test object</dc:description>
>>>>>>>           <!-- following element: put the PID between the tags -->
>>>>>>>           <dc:identifier></dc:identifier>
>>>>>>>         </oai_dc:dc>
>>>>>>>       </foxml:xmlContent>
>>>>>>>     </foxml:datastreamVersion>
>>>>>>>   </foxml:datastream>
>>>>>>>
>>>>>>>   <foxml:datastream CONTROL_GROUP="M" ID="Content" STATE="A">
>>>>>>>     <foxml:datastreamVersion ID="Content.0" LABEL="This is the object content"
>>>>>>>         MIMETYPE="application/octet-stream">
>>>>>>>       <!-- following element: put the URL to the content file as the value for the REF attribute -->
>>>>>>>       <!-- must be an http URL, e.g., http://localhost:8080/ingestpool/foxmldoc.xml -->
>>>>>>>       <!-- I just create a directory "ingestpool" under /usr/fedora/tomcat/webapps/ROOT and put the files there -->
>>>>>>>       <foxml:contentLocation REF="" TYPE="URL" />
>>>>>>>     </foxml:datastreamVersion>
>>>>>>>   </foxml:datastream>
>>>>>>>
>>>>>>> </foxml:digitalObject>
>>>>>>>
>>>>>>> On 05/17/2011 10:00 AM, Scott Prater wrote:
>>>>>>>> Scott,
>>>>>>>>
>>>>>>>> Can you come up with a test case that confirms this limitation? If
>>>>>>>> you can provide one, I'll open up a JIRA ticket for the issue.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> -- Scott
>>>>>>>>
>>>>>>>> On 05/16/2011 10:45 AM, Scott Hammel wrote:
>>>>>>>>> Oh, I think I see: last line of the serializer's serialize
>>>>>>>>> function does this:
>>>>>>>>> bytes.toByteArray()
>>>>>>>>> where bytes is a ByteArrayOutputStream
>>>>>>>>>
>>>>>>>>> I *think* the max size of an array index in Java (32-bit) is
>>>>>>>>> 2,147,483,647 (i.e., 2^31 - 1, max value of a java int). So, this
>>>>>>>>> function will throw an exception if a datastream "archive" export
>>>>>>>>> is > ~2 GB.
>>>>>>>>>
>>>>>>>>> scott
>>>>>>>>>
>>>>>>>>> On 05/16/2011 11:00 AM, Scott Hammel wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Running some export tests using Fedora's REST export API, I get a
>>>>>>>>>> negative array index Java exception when doing an "archive" export of an
>>>>>>>>>> object at around 400 MB (> 320 MB, < 450 MB).
>>>>>>>>>>
>>>>>>>>>> Fedora is version 3.4 something; running on 32-bit CentOS 5.5,
>>>>>>>>>> Sun Java 1.6, 21
>>>>>>>>>>
>>>>>>>>>> Is it just me or has anyone else seen something like that?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Scott
>>>>>>>>>>
>>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>>> Achieve unprecedented app performance and reliability
>>>>>>>>>> What every C/C++ and Fortran developer should know.
>>>>>>>>>> Learn how Intel has extended the reach of its next-generation tools
>>>>>>>>>> to help boost performance applications - including clusters.
>>>>>>>>>> http://p.sf.net/sfu/intel-dev2devmay
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Fedora-commons-users mailing list
>>>>>>>>>> Fedora-commons-users@lists.sourceforge.net
>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
>>>>
>>>> --
>>>> Scott Prater
>>>> Library, Instructional, and Research Applications (LIRA)
>>>> Division of Information Technology (DoIT)
>>>> University of Wisconsin - Madison
>>>> pra...@wisc.edu
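
For anyone landing on this thread later, here is a minimal Java sketch of the two points raised above: how a length computed in 32-bit int arithmetic can go negative, and how base64 output could be streamed to its destination instead of being collected in a ByteArrayOutputStream. None of this is Fedora code. The class and method names are made up for illustration, naiveEncodedLength is a hypothetical bug and not a claim about what the Fedora base64 encoder actually does, and java.util.Base64 needs Java 8 or later (the servers in this thread ran Java 1.6).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Base64;

// Illustrative only: hypothetical names, not part of Fedora.
final class ExportSizeSketch {

    /** Padded base64 length (no line breaks), computed in 64-bit arithmetic. */
    static long base64Length(long rawBytes) {
        return 4L * ((rawBytes + 2) / 3);
    }

    /**
     * Hypothetical buggy estimate: the multiplication happens in 32-bit int
     * arithmetic, so for inputs above roughly 512 MiB the product wraps past
     * Integer.MAX_VALUE and the result is negative. Using such a value as an
     * array size or index is one way to get a negative-array exception.
     */
    static int naiveEncodedLength(int rawBytes) {
        return rawBytes * 4 / 3;
    }

    /**
     * Copies in to out as base64 through a fixed 8 KiB buffer, so heap use
     * does not grow with the size of the content. Returns the number of raw
     * bytes read. Closing the base64 wrapper also closes out.
     */
    static long encodeStreaming(InputStream in, OutputStream out) throws IOException {
        long total = 0;
        byte[] buf = new byte[8192];
        try (OutputStream b64 = Base64.getEncoder().wrap(out)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                b64.write(buf, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```

With a FileOutputStream (or the servlet response stream) as the target of encodeStreaming, memory use stays at the buffer size regardless of datastream size. base64Length also shows why an archive export is heavier than the raw content: a 400 MiB datastream becomes about 533 MiB once base64-encoded.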