Looking at those lines of code, it looks like there would, in theory, be a
problem there.  Once this is confirmed we should probably add a test case to
the large datastreams test suite.  It is also likely to cause a problem with
datastreams smaller than 2GB (2^31-1 being the maximum array length), because
the archive export base64-encodes the content, which makes it about a third
larger.
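
To put a rough number on that, here is a small sketch of the arithmetic.
It assumes padded base64 without line breaks and ignores the surrounding
FOXML, so the real threshold would be somewhat lower.  The class and
method names are made up for illustration and are not part of Fedora.

```java
// Hypothetical helper, not Fedora code: estimates when a base64-encoded
// datastream alone would no longer fit in a single Java byte array.
class Base64ExportLimit {
    // A Java array length is an int, so it can never exceed 2^31 - 1.
    static final long MAX_ARRAY_LENGTH = Integer.MAX_VALUE;

    // Length of the padded base64 encoding of rawBytes bytes
    // (every 3 input bytes become 4 output characters).
    static long base64Length(long rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    // Largest raw size whose base64 encoding still fits in one array.
    static long maxRawBytes() {
        return 3 * (MAX_ARRAY_LENGTH / 4);
    }

    public static void main(String[] args) {
        long max = maxRawBytes();
        System.out.println("largest raw size that fits: " + max + " bytes");
        System.out.println("its encoded length:         " + base64Length(max));
        System.out.println("encoded length, one more:   " + base64Length(max + 1));
    }
}
```

Under those assumptions the limit works out to roughly 1.5 GiB of raw
content rather than 2 GB.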

> -----Original Message-----
> From: Scott Prater [mailto:pra...@wisc.edu] 
> Sent: 17 May 2011 18:33
> To: Support and info exchange list for Fedora users.
> Subject: Re: [fcrepo-user] REST export API negative array index exception
> 
> 
> Yes, trying with the latest stable version (3.4.2) would be useful, if
> you don't mind.  There were some low-level garbage collection problems
> that were fixed in the 3.4.2 release;  these problems manifested
> themselves in a variety of ways.
> 
> I'm not saying this is the issue, but it wouldn't hurt to verify that 
> your problem can be reproduced in 3.4.2.
> 
> thanks,
> 
> -- Scott
> 
> On 05/17/2011 12:22 PM, Scott Hammel wrote:
> > I'm pretty sure it is 3.4.0 (from files on the server it looks like an
> > August 2010 build). The server is in a totally isolated network with
> > nothing with GUI support that can hit the admin tools.
> >
> > Tomcat is the version bundled with the Fedora installer.
> >
> > Would you like me to be sure I'm running at the latest version and try
> > the test scripts again before you go forward?
> >
> > Scott
> >
> > On 05/17/2011 12:45 PM, Scott Prater wrote:
> >> Thanks, Scott.  I'll try to reproduce the problem in my environment,
> >> Fedora 3.4.2.
> >>
> >> Can you tell me what version of Fedora and Tomcat (or other webapp
> >> server) you're using?
> >>
> >> -- Scott
> >>
> >> On 05/17/2011 11:08 AM, Scott Hammel wrote:
> >>> Hey, Scott,
> >>>
> >>> Thanks for responding. I'm more a C/C++ programmer and not a Java
> >>> programmer (though I sometimes play one on the Internet), so I'm
> >>> just guessing on the array bounds -- feels like something
> >>> incrementing an int into the sign bit, though I'd think Java would
> >>> throw some array bounds exception before that happened. Figured I'd
> >>> do a little math later maybe to test my hypothesis.
> >>>
> >>> Recall, this was all in a 32-bit environment. I really hope it is a
> >>> non-issue and something I'm doing in the end. Note disseminating the
> >>> datastream content directly appears to work OK, which confuses me a
> >>> little, though I haven't looked to see if the code for that does
> >>> things differently.
> >>>
> >>> Anyway, here's a series of commands (extracted from my test scripts)
> >>> that should reproduce the problem:
> >>>
> >>> mkdir /usr/fedora/tomcat/webapps/ROOT/ingestpool
> >>> mkdir /tmp/fedrun
> >>> dir=/tmp/fedrun
> >>> pid=test:pid01
> >>>
> >>> dd if=/dev/urandom \
> >>>    of=/usr/fedora/tomcat/webapps/ROOT/ingestpool/sample.bin \
> >>>    bs=1M count=400
> >>>
> >>> ./makefoxml $pid http://localhost:8080/ingestpool/sample.bin > $dir/sample.xml
> >>>
> >>> /usr/fedora/client/bin/fedora-ingest.sh f $dir/sample.xml \
> >>>    info:fedora/fedora-system:FOXML-1.1 localhost:8080 \
> >>>    fedoraAdmin <insert pwd here> http
> >>>
> >>> wget -O $dir/export.xml --auth-no-challenge \
> >>>    --http-user=fedoraAdmin --http-password=<insert pwd here> \
> >>>    "http://localhost:8080/fedora/objects/$pid/export?context=archive"
> >>>
> >>> Note: I use the REST call via a wget rather than the provided export
> >>> client scripts because it looks to me from the Java heap explosion
> >>> that the export scripts must end up doing the export via the SOAP
> >>> API.
> >>> --
> >>> The content of makefoxml:
> >>>
> >>> #!/bin/bash
> >>>
> >>> # usage: makefoxml <pid> <refurl>
> >>> # escape slashes in the URL
> >>> RF=${2//\//\\/}
> >>> # if you need to escape ampersands as well, uncomment this:
> >>> #RF=${RF//'&'/'\&'}
> >>>
> >>> # make substitutions ....
> >>> sed '
> >>> s/PID=""/PID="'"$1"'"/
> >>> s/rdf:about=""/rdf:about="info:fedora\/'"$1"'"/
> >>> s/dc:identifier>/dc:identifier>'"$1"'/
> >>> s/REF=""/REF="'"${RF}"'"/
> >>> ' < "foxml_tpl.xml"
> >>>
> >>> --
> >>> The content of foxml_tpl.xml (the sed script above does the edits
> >>> noted in the xml comments in this template):
> >>>
> >>> <?xml version="1.0" encoding="UTF-8"?>
> >>> <!-- following element: set the PID attribute -->
> >>> <foxml:digitalObject VERSION="1.1" PID=""
> >>>       xmlns:foxml="info:fedora/fedora-system:def/foxml#"
> >>>       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> >>>       xsi:schemaLocation="info:fedora/fedora-system:def/foxml#
> >>>       http://www.fedora.info/definitions/1/0/foxml1-1.xsd">
> >>>
> >>> <foxml:objectProperties>
> >>> <foxml:property NAME="info:fedora/fedora-system:def/model#state"
> >>>       VALUE="A"/>
> >>> <foxml:property NAME="info:fedora/fedora-system:def/model#label"
> >>>       VALUE=""/>
> >>> <foxml:property NAME="info:fedora/fedora-system:def/model#ownerId"
> >>>       VALUE="fedoraAdmin"/>
> >>> </foxml:objectProperties>
> >>>
> >>> <foxml:datastream CONTROL_GROUP="X" ID="RELS-EXT">
> >>> <foxml:datastreamVersion
> >>>       FORMAT_URI="info:fedora/fedora-system:FedoraRELSExt-1.0"
> >>>       ID="RELS-EXT.0" LABEL="RDF Statements about this Object"
> >>>       MIMETYPE="application/rdf+xml">
> >>> <foxml:xmlContent>
> >>> <rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/"
> >>>       xmlns:fedora="info:fedora/fedora-system:def/relations-external#"
> >>>       xmlns:fedora-model="info:fedora/fedora-system:def/model#"
> >>>       xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
> >>>       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> >>>       xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
> >>> <!-- following element: put the PID as the value for the rdf:about
> >>> attribute -->
> >>> <rdf:description rdf:about="">
> >>> </rdf:description>
> >>> </rdf:RDF>
> >>> </foxml:xmlContent>
> >>> </foxml:datastreamVersion>
> >>> </foxml:datastream>
> >>>
> >>> <foxml:datastream CONTROL_GROUP="X" ID="DC" STATE="A"
> >>>       VERSIONABLE="true">
> >>> <foxml:datastreamVersion ID="DC.0" LABEL="Dublin Core Record"
> >>>       MIMETYPE="text/xml">
> >>> <foxml:xmlContent>
> >>> <oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/"
> >>>       xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
> >>>       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> >>>       xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/
> >>>       http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
> >>> <dc:title></dc:title>
> >>> <dc:creator>Test Program</dc:creator>
> >>> <dc:description>A test object</dc:description>
> >>> <!-- following element: put the PID between the tags -->
> >>> <dc:identifier></dc:identifier>
> >>> </oai_dc:dc>
> >>> </foxml:xmlContent>
> >>> </foxml:datastreamVersion>
> >>> </foxml:datastream>
> >>>
> >>> <foxml:datastream CONTROL_GROUP="M" ID="Content" STATE="A">
> >>> <foxml:datastreamVersion ID="Content.0"
> >>>       LABEL="This is the object content"
> >>>       MIMETYPE="application/octet-stream">
> >>> <!-- following element: put the URL to the content file as the value
> >>>      for the REF attribute -->
> >>> <!-- must be an http URL, e.g.,
> >>>      http://localhost:8080/ingestpool/foxmldoc.xml -->
> >>> <!-- I just create a directory "ingestpool" under
> >>>      /usr/fedora/tomcat/webapps/ROOT and put the files there -->
> >>> <foxml:contentLocation REF="" TYPE="URL" />
> >>> </foxml:datastreamVersion>
> >>> </foxml:datastream>
> >>>
> >>>
> >>> </foxml:digitalObject>
> >>>
> >>>
> >>>
> >>> On 05/17/2011 10:00 AM, Scott Prater wrote:
> >>>> Scott,
> >>>>
> >>>> Can you come up with a test case that confirms this limitation?  If
> >>>> you can provide one, I'll open up a JIRA ticket for the issue.
> >>>>
> >>>> thanks,
> >>>>
> >>>> -- Scott
> >>>>
> >>>> On 05/16/2011 10:45 AM, Scott Hammel wrote:
> >>>>> Oh, I think I see: the last line of the serializer's serialize
> >>>>> function does this:
> >>>>> bytes.toByteArray()
> >>>>> where bytes is a ByteArrayOutputStream
> >>>>>
> >>>>> I *think* the max size of an array index in Java (32-bit) is
> >>>>> 2,147,483,647 (i.e., 2^31 - 1, max value of a Java int). So, this
> >>>>> function will throw an exception if a datastream "archive" export
> >>>>> is > ~2 GB.
> >>>>>
> >>>>> scott
> >>>>>
> >>>>> On 05/16/2011 11:00 AM, Scott Hammel wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> Running some export tests using Fedora's REST export API, I get a
> >>>>>> negative array index Java exception when doing an "archive" export
> >>>>>> of an object at around 400 MB (> 320 MB, < 450 MB).
> >>>>>>
> >>>>>> Fedora is version 3.4 something; running on 32-bit CentOS 5.5, 
> >>>>>> Sun Java 1.6, 21
> >>>>>>
> >>>>>> Is it just me or has anyone else seen something like that?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Scott
> >>>>>>
> >>>>>> 
> -----------------------------------------------------------------
> >>>>>> -------------
> >>>>>> Achieve unprecedented app performance and reliability
> >>>>>> What every C/C++ and Fortran developer should know.
> >>>>>> Learn how Intel has extended the reach of its next-generation tools
> >>>>>> to help boost performance applications - including clusters.
> >>>>>> http://p.sf.net/sfu/intel-dev2devmay
> >>>>>> _______________________________________________
> >>>>>> Fedora-commons-users mailing list
> >>>>>> Fedora-commons-users@lists.sourceforge.net
> >>>>>> 
> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
> >>>>>>
> >>>
> >>
> >
> >
> 
> 
> -- 
> Scott Prater
> Library, Instructional, and Research Applications (LIRA)
> Division of Information Technology (DoIT)
> University of Wisconsin - Madison
> pra...@wisc.edu
> 
> 


