Looking at those lines of code, there does appear to be a problem there, at least in theory. Once this is confirmed we should probably add a test case to the large datastreams test suite. It is also likely to cause problems with datastreams well below 2 GB (2^31-1 being the maximum array index), because the archive export base64-encodes the content, which inflates it by about a third.
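
To put a rough number on that, here is a minimal sketch of the arithmetic. It assumes unwrapped Base64 (4 output characters per 3 input bytes, padded) and that the whole encoded datastream ends up in a single Java byte array. The class and method names are made up for illustration and are not part of Fedora. Line breaks in the encoded output, the surrounding FOXML, and JVM-specific array limits would all lower the real threshold somewhat.

```java
public class Base64ExportLimit {

    /** Length of the unwrapped, padded Base64 encoding of rawBytes input bytes. */
    public static long encodedLength(long rawBytes) {
        // Every group of up to 3 input bytes becomes exactly 4 output characters.
        return 4L * ((rawBytes + 2) / 3);
    }

    /** Largest input size whose encoding still fits within Integer.MAX_VALUE elements. */
    public static long maxRawBytes() {
        // Number of complete 4-character groups that fit, times 3 input bytes per group.
        return (Integer.MAX_VALUE / 4) * 3L;
    }

    public static void main(String[] args) {
        long limit = maxRawBytes();
        System.out.println("max raw bytes:            " + limit);
        System.out.println("encoded length at limit:  " + encodedLength(limit));
        System.out.println("encoded length at limit+1: " + encodedLength(limit + 1));
        System.out.println("Integer.MAX_VALUE:        " + Integer.MAX_VALUE);
    }
}
```

Under those assumptions the ceiling is a little under 1.5 GiB of raw content rather than 2 GB. This only bounds the theoretical limit; it does not by itself explain a failure at around 400 MB, which still needs to be confirmed separately.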
> -----Original Message-----
> From: Scott Prater [mailto:pra...@wisc.edu]
> Sent: 17 May 2011 18:33
> To: Support and info exchange list for Fedora users.
> Subject: Re: [fcrepo-user] REST export API negative array index exception
>
> Yes, trying with the latest stable version (3.4.2) would be useful, if
> you don't mind. There were some lowlevel garbage collection problems
> that were fixed in the 3.4.2 release; these problems manifested
> themselves in a variety of ways.
>
> I'm not saying this is the issue, but it wouldn't hurt to verify that
> your problem can be reproduced in 3.4.2.
>
> thanks,
>
> -- Scott
>
> On 05/17/2011 12:22 PM, Scott Hammel wrote:
> > I'm pretty sure it is 3.4.0 (from files on the server it looks like an
> > August 2010 build. The server is in a totally isolated network with
> > nothing with GUI support that can hit the admin tools).
> >
> > Tomcat is the version bundled with the Fedora installer.
> >
> > Would you like me to be sure I'm running at the latest version and try
> > the test scripts again before you go forward?
> >
> > Scott
> >
> > On 05/17/2011 12:45 PM, Scott Prater wrote:
> >> Thanks, Scott. I'll try to reproduce the problem in my environment,
> >> Fedora 3.4.2.
> >>
> >> Can you tell me what version of Fedora and Tomcat (or other webapp
> >> server) you're using?
> >>
> >> -- Scott
> >>
> >> On 05/17/2011 11:08 AM, Scott Hammel wrote:
> >>> Hey, Scott,
> >>>
> >>> Thanks for responding. I'm more a C/C++ programmer and not a Java
> >>> programmer (though I sometimes play one on the Internet), so I'm
> >>> just guessing on the array bounds -- feels like something
> >>> incrementing an int into the sign bit, though I'd think Java would
> >>> throw some array bounds exception before that happened. Figured I'd
> >>> do a little math later maybe to test my hypothesis.
> >>>
> >>> Recall, this was all in a 32-bit environment. I really hope it is a
> >>> non-issue and something I'm doing in the end.
> >>> Note disseminating the datastream content directly appears to work
> >>> OK, which confuses me a little, though I haven't looked to see if
> >>> the code for that does things differently.
> >>>
> >>> Anyway, here's a series of commands (extracted from my test scripts)
> >>> that should reproduce the problem:
> >>>
> >>> mkdir /usr/fedora/tomcat/webapps/ROOT/ingestpool
> >>> mkdir /tmp/fedrun
> >>> dir=/tmp/fedrun
> >>> pid=test:pid01
> >>>
> >>> dd if=/dev/urandom \
> >>>   of=/usr/fedora/tomcat/webapps/ROOT/ingestpool/sample.bin bs=1M \
> >>>   count=400
> >>>
> >>> ./makefoxml $pid http://localhost:8080/ingestpool/sample.bin > \
> >>>   $dir/sample.xml
> >>>
> >>> /usr/fedora/client/bin/fedora-ingest.sh f $dir/sample.xml \
> >>>   info:fedora/fedora-system:FOXML-1.1 localhost:8080 fedoraAdmin \
> >>>   <insert pwd here> http
> >>>
> >>> wget -O $dir/export.xml --auth-no-challenge --http-user=fedoraAdmin \
> >>>   --http-password=<insert pwd here> \
> >>>   http://localhost:8080/fedora/objects/$pid/export?context=archive
> >>>
> >>> Note: I use the REST call via a wget rather than the provided export
> >>> client scripts because it looks to me from the Java heap explosion
> >>> that the export scripts must end up doing the export via the SOAP
> >>> API.
> >>> --
> >>> The content of makefoxml:
> >>>
> >>> #!/bin/bash
> >>>
> >>> #usage: makefoxml <pid> <refurl>
> >>> #escape slashes off the URL
> >>> RF=${2//\//\\/}
> >>> #if you need to escape ampersands as well, uncomment this:
> >>> #RF=${RF//'&'/'\&'}
> >>>
> >>> # make substitutions ....
> >>> sed '
> >>> s/PID=""/PID="'"$1"'"/
> >>> s/rdf:about=""/rdf:about="info:fedora\/'"$1"'"/
> >>> s/dc:identifier>/dc:identifier>'"$1"'/
> >>> s/REF=""/REF="'"${RF}"'"/
> >>> ' < "foxml_tpl.xml"
> >>>
> >>> --
> >>> The content of foxml_tpl.xml (the sed script above does the edits
> >>> noted in the xml comments in this template):
> >>>
> >>> <?xml version="1.0" encoding="UTF-8"?>
> >>> <!-- following element: set the PID attribute -->
> >>> <foxml:digitalObject VERSION="1.1" PID=""
> >>>     xmlns:foxml="info:fedora/fedora-system:def/foxml#"
> >>>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> >>>     xsi:schemaLocation="info:fedora/fedora-system:def/foxml#
> >>>     http://www.fedora.info/definitions/1/0/foxml1-1.xsd">
> >>>
> >>> <foxml:objectProperties>
> >>>   <foxml:property NAME="info:fedora/fedora-system:def/model#state"
> >>>       VALUE="A"/>
> >>>   <foxml:property NAME="info:fedora/fedora-system:def/model#label"
> >>>       VALUE=""/>
> >>>   <foxml:property NAME="info:fedora/fedora-system:def/model#ownerId"
> >>>       VALUE="fedoraAdmin"/>
> >>> </foxml:objectProperties>
> >>>
> >>> <foxml:datastream CONTROL_GROUP="X" ID="RELS-EXT">
> >>>   <foxml:datastreamVersion
> >>>       FORMAT_URI="info:fedora/fedora-system:FedoraRELSExt-1.0"
> >>>       ID="RELS-EXT.0" LABEL="RDF Statements about this Object"
> >>>       MIMETYPE="application/rdf+xml">
> >>>     <foxml:xmlContent>
> >>>       <rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/"
> >>>           xmlns:fedora="info:fedora/fedora-system:def/relations-external#"
> >>>           xmlns:fedora-model="info:fedora/fedora-system:def/model#"
> >>>           xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
> >>>           xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> >>>           xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
> >>>         <!-- following element: put the PID as the value for the rdf:about
> >>>              attribute -->
> >>>         <rdf:description rdf:about="">
> >>>         </rdf:description>
> >>>       </rdf:RDF>
> >>>     </foxml:xmlContent>
> >>>   </foxml:datastreamVersion>
> >>> </foxml:datastream>
> >>>
> >>> <foxml:datastream CONTROL_GROUP="X" ID="DC" STATE="A"
> >>>     VERSIONABLE="true">
> >>>   <foxml:datastreamVersion ID="DC.0" LABEL="Dublin Core Record"
> >>>       MIMETYPE="text/xml">
> >>>     <foxml:xmlContent>
> >>>       <oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/"
> >>>           xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
> >>>           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> >>>           xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/
> >>>           http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
> >>>         <dc:title></dc:title>
> >>>         <dc:creator>Test Program</dc:creator>
> >>>         <dc:description>A test object</dc:description>
> >>>         <!-- following element: put the PID between the tags -->
> >>>         <dc:identifier></dc:identifier>
> >>>       </oai_dc:dc>
> >>>     </foxml:xmlContent>
> >>>   </foxml:datastreamVersion>
> >>> </foxml:datastream>
> >>>
> >>> <foxml:datastream CONTROL_GROUP="M" ID="Content" STATE="A">
> >>>   <foxml:datastreamVersion ID="Content.0"
> >>>       LABEL="This is the object content"
> >>>       MIMETYPE="application/octet-stream">
> >>>     <!-- following element: put the URL to the content file as the value
> >>>          for the REF attribute -->
> >>>     <!-- must be an http URL, e.g.,
> >>>          http://localhost:8080/ingestpool/foxmldoc.xml -->
> >>>     <!-- I just create a directory "ingestpool" under
> >>>          /usr/fedora/tomcat/webapps/ROOT and put the files there -->
> >>>     <foxml:contentLocation REF="" TYPE="URL" />
> >>>   </foxml:datastreamVersion>
> >>> </foxml:datastream>
> >>>
> >>>
> >>> </foxml:digitalObject>
> >>>
> >>>
> >>>
> >>> On 05/17/2011 10:00 AM, Scott Prater wrote:
> >>>> Scott,
> >>>>
> >>>> Can you come up with a test case that confirms this limitation? If
> >>>> you can provide one, I'll open up a JIRA ticket for the issue.
> >>>>
> >>>> thanks,
> >>>>
> >>>> -- Scott
> >>>>
> >>>> On 05/16/2011 10:45 AM, Scott Hammel wrote:
> >>>>> Oh, I think I see: last line of the serializer's serialize
> >>>>> function does this:
> >>>>> bytes.toByteArray()
> >>>>> where bytes is a ByteArrayOutputStream
> >>>>>
> >>>>> I *think* the max size of an array index in Java (32-bit) is
> >>>>> 2,147,483,647 (i.e., 2^31 - 1, max value of a java int). So, this
> >>>>> function will throw an exception if a datastream "archive" export
> >>>>> is > ~2 GB.
> >>>>>
> >>>>> scott
> >>>>>
> >>>>> On 05/16/2011 11:00 AM, Scott Hammel wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> Running some export tests using Fedora's REST export API, I get a
> >>>>>> negative array index Java exception when doing an "archive"
> >>>>>> export of an object at around 400 MB (> 320 MB, < 450 MB).
> >>>>>>
> >>>>>> Fedora is version 3.4 something; running on 32-bit CentOS 5.5,
> >>>>>> Sun Java 1.6, 21
> >>>>>>
> >>>>>> Is it just me or has anyone else seen something like that?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Scott
> >>>>>>
> >>>>>> ------------------------------------------------------------------------------
> >>>>>> Achieve unprecedented app performance and reliability
> >>>>>> What every C/C++ and Fortran developer should know.
> >>>>>> Learn how Intel has extended the reach of its next-generation tools
> >>>>>> to help boost performance applications - including clusters.
> >>>>>> http://p.sf.net/sfu/intel-dev2devmay
> >>>>>> _______________________________________________
> >>>>>> Fedora-commons-users mailing list
> >>>>>> Fedora-commons-users@lists.sourceforge.net
> >>>>>> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
> >
> --
> Scott Prater
> Library, Instructional, and Research Applications (LIRA)
> Division of Information Technology (DoIT)
> University of Wisconsin - Madison
> pra...@wisc.edu
>
------------------------------------------------------------------------------
What Every C/C++ and Fortran developer Should Know!
Read this article and learn how Intel has extended the reach of its
next-generation tools to help Windows* and Linux* C/C++ and Fortran
developers boost performance applications - including clusters.
http://p.sf.net/sfu/intel-dev2devmay
_______________________________________________
Fedora-commons-users mailing list
Fedora-commons-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/fedora-commons-users