Thanks, Scott. I'll try to reproduce the problem in my environment, Fedora 3.4.2.
Can you tell me what version of Fedora and Tomcat (or other webapp server) you're using?

-- Scott

On 05/17/2011 11:08 AM, Scott Hammel wrote:
> Hey, Scott,
>
> Thanks for responding. I'm more a C/C++ programmer and not a Java
> programmer (though I sometimes play one on the Internet), so I'm just
> guessing on the array bounds -- feels like something incrementing an int
> into the sign bit, though I'd think Java would throw some array bounds
> exception before that happened. Figured I'd do a little math later maybe
> to test my hypothesis.
>
> Recall, this was all in a 32-bit environment. I really hope it is a
> non-issue and something I'm doing in the end. Note disseminating the
> datastream content directly appears to work OK, which confuses me a
> little, though I haven't looked to see if the code for that does things
> differently.
>
> Anyway, here's a series of commands (extracted from my test scripts)
> that should reproduce the problem:
>
> mkdir /usr/fedora/tomcat/webapps/ROOT/ingestpool
> mkdir /tmp/fedrun
> dir=/tmp/fedrun
> pid=test:pid01
>
> dd if=/dev/urandom of=/usr/fedora/tomcat/webapps/ROOT/ingestpool/sample.bin bs=1M count=400
>
> ./makefoxml $pid http://localhost:8080/ingestpool/sample.bin > $dir/sample.xml
>
> /usr/fedora/client/bin/fedora-ingest.sh f $dir/sample.xml info:fedora/fedora-system:FOXML-1.1 localhost:8080 fedoraAdmin <insert pwd here> http
>
> wget -O $dir/export.xml --auth-no-challenge --http-user=fedoraAdmin --http-password=<insert pwd here> http://localhost:8080/fedora/objects/$pid/export?context=archive
>
> Note: I use the REST call via a wget rather than the provided export
> client scripts because it looks to me from the Java heap explosion that
> the export scripts must end up doing the export via the SOAP API.
> --
> The content of makefoxml:
>
> #!/bin/bash
>
> # usage: makefoxml <pid> <refurl>
> # escape slashes in the URL
> RF=${2//\//\\/}
> # if you need to escape ampersands as well, uncomment this:
> #RF=${RF//'&'/'\&'}
>
> # make substitutions ....
> sed '
> s/PID=""/PID="'"$1"'"/
> s/rdf:about=""/rdf:about="info:fedora\/'"$1"'"/
> s/dc:identifier>/dc:identifier>'"$1"'/
> s/REF=""/REF="'"${RF}"'"/
> ' < "foxml_tpl.xml"
>
> --
> The content of foxml_tpl.xml (the sed script above does the edits noted
> in the xml comments in this template):
>
> <?xml version="1.0" encoding="UTF-8"?>
> <!-- following element: set the PID attribute -->
> <foxml:digitalObject VERSION="1.1" PID=""
>     xmlns:foxml="info:fedora/fedora-system:def/foxml#"
>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>     xsi:schemaLocation="info:fedora/fedora-system:def/foxml# http://www.fedora.info/definitions/1/0/foxml1-1.xsd">
>
>   <foxml:objectProperties>
>     <foxml:property NAME="info:fedora/fedora-system:def/model#state" VALUE="A"/>
>     <foxml:property NAME="info:fedora/fedora-system:def/model#label" VALUE=""/>
>     <foxml:property NAME="info:fedora/fedora-system:def/model#ownerId" VALUE="fedoraAdmin"/>
>   </foxml:objectProperties>
>
>   <foxml:datastream CONTROL_GROUP="X" ID="RELS-EXT">
>     <foxml:datastreamVersion
>         FORMAT_URI="info:fedora/fedora-system:FedoraRELSExt-1.0"
>         ID="RELS-EXT.0" LABEL="RDF Statements about this Object"
>         MIMETYPE="application/rdf+xml">
>       <foxml:xmlContent>
>         <rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/"
>             xmlns:fedora="info:fedora/fedora-system:def/relations-external#"
>             xmlns:fedora-model="info:fedora/fedora-system:def/model#"
>             xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
>             xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>             xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
>           <!-- following element: put the PID as the value for the rdf:about attribute -->
>           <rdf:description rdf:about="">
>           </rdf:description>
>         </rdf:RDF>
>       </foxml:xmlContent>
>     </foxml:datastreamVersion>
>   </foxml:datastream>
>
>   <foxml:datastream CONTROL_GROUP="X" ID="DC" STATE="A" VERSIONABLE="true">
>     <foxml:datastreamVersion ID="DC.0" LABEL="Dublin Core Record"
>         MIMETYPE="text/xml">
>       <foxml:xmlContent>
>         <oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/"
>             xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
>             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>             xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
>           <dc:title></dc:title>
>           <dc:creator>Test Program</dc:creator>
>           <dc:description>A test object</dc:description>
>           <!-- following element: put the PID between the tags -->
>           <dc:identifier></dc:identifier>
>         </oai_dc:dc>
>       </foxml:xmlContent>
>     </foxml:datastreamVersion>
>   </foxml:datastream>
>
>   <foxml:datastream CONTROL_GROUP="M" ID="Content" STATE="A">
>     <foxml:datastreamVersion ID="Content.0" LABEL="This is the object content"
>         MIMETYPE="application/octet-stream">
>       <!-- following element: put the URL to the content file as the value for the REF attribute -->
>       <!-- must be an http URL, e.g., http://localhost:8080/ingestpool/foxmldoc.xml -->
>       <!-- I just create a directory "ingestpool" under /usr/fedora/tomcat/webapps/ROOT and put the files there -->
>       <foxml:contentLocation REF="" TYPE="URL" />
>     </foxml:datastreamVersion>
>   </foxml:datastream>
>
> </foxml:digitalObject>
>
> On 05/17/2011 10:00 AM, Scott Prater wrote:
>> Scott,
>>
>> Can you come up with a test case that confirms this limitation? If you
>> can provide one, I'll open up a JIRA ticket for the issue.
>>
>> thanks,
>>
>> -- Scott
>>
>> On 05/16/2011 10:45 AM, Scott Hammel wrote:
>>> Oh, I think I see: last line of the serializer's serialize function does
>>> this:
>>> bytes.toByteArray()
>>> where bytes is a ByteArrayOutputStream
>>>
>>> I *think* the max size of an array index in Java (32-bit) is
>>> 2,147,483,647 (i.e., 2^31 - 1, max value of a java int).
>>> So, this function will throw an exception if a datastream "archive" export
>>> is > ~2 GB.
>>>
>>> scott
>>>
>>> On 05/16/2011 11:00 AM, Scott Hammel wrote:
>>>> Hi,
>>>>
>>>> Running some export tests using Fedora's REST export API, I get a
>>>> negative array index Java exception when doing an "archive" export of an
>>>> object at around 400 MB (> 320 MB, < 450 MB).
>>>>
>>>> Fedora is version 3.4 something; running on 32-bit CentOS 5.5, Sun Java
>>>> 1.6, 21
>>>>
>>>> Is it just me or has anyone else seen something like that?
>>>>
>>>> Thanks,
>>>> Scott
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Achieve unprecedented app performance and reliability
>>>> What every C/C++ and Fortran developer should know.
>>>> Learn how Intel has extended the reach of its next-generation tools
>>>> to help boost performance applications - including clusters.
>>>> http://p.sf.net/sfu/intel-dev2devmay
>>>> _______________________________________________
>>>> Fedora-commons-users mailing list
>>>> Fedora-commons-users@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users

--
Scott Prater
Library, Instructional, and Research Applications (LIRA)
Division of Information Technology (DoIT)
University of Wisconsin - Madison
pra...@wisc.edu
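
For anyone who wants to do the "little math" on the sign-bit hypothesis raised in this thread, here is a rough Java sketch. It is not taken from the Fedora source. The class and method names are made up for illustration, and the multiplier of 4 is purely an assumption used to show how a size computed in int arithmetic can wrap negative. The 400 MB figure is the one from the test above. The point is only that Java int arithmetic wraps silently, and that asking for an array with a negative length raises NegativeArraySizeException rather than an array bounds exception.

```java
public class ArraySizeOverflow {

    /** Base64 output length for rawBytes of input, ignoring any line breaks. */
    public static long base64Length(long rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    /**
     * Hypothetical sizing step done in 32-bit int arithmetic.
     * Java does not throw on int overflow; the result simply wraps.
     */
    public static int scaledSize(int length, int factor) {
        return length * factor;
    }

    /** True if size is a legal array length by the language rules (0 .. 2^31 - 1). */
    public static boolean fitsInArray(long size) {
        return size >= 0 && size <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long raw = 400L * 1024 * 1024;      // size of sample.bin in the test above
        long encoded = base64Length(raw);   // size once Base64-encoded for an archive export
        System.out.println("raw bytes:    " + raw);
        System.out.println("base64 chars: " + encoded
                + " (fits in an array: " + fitsInArray(encoded) + ")");

        int factor = 4;                     // illustrative assumption, not from Fedora code
        int wrapped = scaledSize((int) encoded, factor);
        System.out.println("int size * " + factor + ": " + wrapped);

        // Only attempt the allocation when the size wrapped negative,
        // so the sketch never tries to grab a huge real buffer.
        if (wrapped < 0) {
            try {
                byte[] buffer = new byte[wrapped];
                System.out.println("allocated " + buffer.length);
            } catch (NegativeArraySizeException e) {
                System.out.println("allocation failed: " + e);
            }
        }
    }
}
```

If something along these lines is what happens, the failure threshold would depend on the multiplier rather than on the 2 GB array limit itself, which would fit a failure showing up well below 2 GB. Whether the serializer actually does anything like this would need to be checked against the code.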