I see where you are going :-)

I just ran a 400MB test with an ATOMZip export. Seems to have worked 
just fine.

A test exporting a 900MB datastream to ATOMZip failed. No exception was
logged, just an internal server error. I noticed that with 3.4.2 this
can indicate the JVM ran out of memory (not surprising if the export is
still being collected into a ByteArrayOutputStream, I guess).
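
For illustration, here is a minimal sketch of the spool-to-disk alternative
to collecting everything in a ByteArrayOutputStream. It is not Fedora's
actual code: the class name SpoolToDisk and the method spool are
hypothetical, and it assumes Java 7 or later (try-with-resources and
java.nio.file).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class SpoolToDisk {

    // Hypothetical helper (not part of Fedora): copies a stream to a
    // temporary file in fixed-size chunks, so heap use stays at the
    // buffer size regardless of how large the export is.
    static Path spool(InputStream export) throws IOException {
        Path tmp = Files.createTempFile("export-", ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = export.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an export stream; a real one would come from the
        // serializer rather than an in-memory array.
        byte[] sample = new byte[100_000];
        Path spooled = spool(new ByteArrayInputStream(sample));
        System.out.println(Files.size(spooled)); // 100000
        Files.delete(spooled);
    }
}
```

With this shape the caller streams the temp file back to the client and
deletes it afterwards, so neither the heap size nor the int-sized array
index limits the export.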

Scott

On 05/18/2011 11:53 AM, Stephen Bayliss wrote:
> Hi Scott
>
> Thanks for that feedback.
>
> It would be interesting to find out if you get the same problem using the
> AtomZip export format (info:fedora/fedora-system:ATOMZip-1.1)
>
> Steve
>
>> -----Original Message-----
>> From: Scott Hammel [mailto:sc...@clemson.edu]
>> Sent: 18 May 2011 16:16
>> To: Support and info exchange list for Fedora users.
>> Subject: Re: [fcrepo-user] REST export API negative array
>> index exception
>>
>>
>> Scott, Steve,
>>
>> REST export in archive format still blows up with Fedora 3.4.2.
>> Actually, it is crashing on a datastream < 300MB. I gave the JVM
>> 1.5GB of heap, BTW.
>>
>> Regardless, the exception that is in fedora.log is a negative array
>> index exception. It looks like it is actually occurring down in the
>> base64 encoder according to the stack trace.
>>
>> It occurs to me that building support for a full archival export of
>> an object in memory for arbitrarily large objects might be
>> pragmatically (is that a word?) impossible: e.g., on 32-bit systems I
>> think you bump into problems giving the JVM more than ~1.8 GB of RAM.
>> That alone limits the size of exportable objects to well under 2GB in
>> that environment.
>>
>> If I was more adept with Java, I'd volunteer to write an exporter
>> that spooled to disk, but alas, I am not and it would take me twice
>> as long as someone who is. :-(
>>
>> I can take one of several alternative paths with my particular
>> project, so it isn't too big an issue to *me* .... I just have to do
>> a little more coding in a middle-tier. Don't know about other folks,
>> of course.
>>
>> -Scott
>>
>> On 05/18/2011 01:08 AM, Stephen Bayliss wrote:
>>> Looking at those lines of code it looks like in theory there would
>>> be a problem there.  Once this is confirmed we should probably add a
>>> test case to the large datastreams test suite.  And it is likely to
>>> cause a problem with datastreams smaller than 2GB (2^31-1 as maximum
>>> array index) due to the archive export base64-encoding the content.
>>>
>>>> -----Original Message-----
>>>> From: Scott Prater [mailto:pra...@wisc.edu]
>>>> Sent: 17 May 2011 18:33
>>>> To: Support and info exchange list for Fedora users.
>>>> Subject: Re: [fcrepo-user] REST export API negative array index
>>>> exception
>>>>
>>>>
>>>> Yes, trying with the latest stable version (3.4.2) would be useful,
>>>> if you don't mind.  There were some low-level garbage collection
>>>> problems that were fixed in the 3.4.2 release;  these problems
>>>> manifested themselves in a variety of ways.
>>>>
>>>> I'm not saying this is the issue, but it wouldn't hurt to verify
>>>> that your problem can be reproduced in 3.4.2.
>>>>
>>>> thanks,
>>>>
>>>> -- Scott
>>>>
>>>> On 05/17/2011 12:22 PM, Scott Hammel wrote:
>>>>> I'm pretty sure it is 3.4.0 (from files on the server it looks like
>>>>> an August 2010 build). The server is in a totally isolated network
>>>>> with nothing with GUI support that can hit the admin tools.
>>>>>
>>>>> Tomcat is the version bundled with the Fedora installer.
>>>>>
>>>>> Would you like me to be sure I'm running at the latest version and
>>>>> try the test scripts again before you go forward?
>>>>>
>>>>> Scott
>>>>>
>>>>> On 05/17/2011 12:45 PM, Scott Prater wrote:
>>>>>> Thanks, Scott.  I'll try to reproduce the problem in my
>>>>>> environment, Fedora 3.4.2.
>>>>>>
>>>>>> Can you tell me what version of Fedora and Tomcat (or other webapp
>>>>>> server) you're using?
>>>>>>
>>>>>> -- Scott
>>>>>>
>>>>>> On 05/17/2011 11:08 AM, Scott Hammel wrote:
>>>>>>> Hey, Scott,
>>>>>>>
>>>>>>> Thanks for responding. I'm more a C/C++ programmer and not a Java
>>>>>>> programmer (though I sometimes play one on the Internet), so I'm
>>>>>>> just guessing on the array bounds -- feels like something
>>>>>>> incrementing an int into the sign bit, though I'd think Java would
>>>>>>> throw some array bounds exception before that happened. Figured
>>>>>>> I'd do a little math later maybe to test my hypothesis.
>>>>>>>
>>>>>>> Recall, this was all in a 32-bit environment. I really hope it is a
>>>>>>> non-issue and something I'm doing in the end. Note that
>>>>>>> disseminating the datastream content directly appears to work OK,
>>>>>>> which confuses me a little, though I haven't looked to see if the
>>>>>>> code for that does things differently.
>>>>>>>
>>>>>>> Anyway, here's a series of commands (extracted from my test
>>>>>>> scripts) that should reproduce the problem:
>>>>>>>
>>>>>>> mkdir /usr/fedora/tomcat/webapps/ROOT/ingestpool
>>>>>>> mkdir /tmp/fedrun
>>>>>>> dir=/tmp/fedrun
>>>>>>> pid=test:pid01
>>>>>>>
>>>>>>> dd if=/dev/urandom \
>>>>>>>   of=/usr/fedora/tomcat/webapps/ROOT/ingestpool/sample.bin \
>>>>>>>   bs=1M count=400
>>>>>>>
>>>>>>> ./makefoxml $pid http://localhost:8080/ingestpool/sample.bin \
>>>>>>>   > $dir/sample.xml
>>>>>>>
>>>>>>> /usr/fedora/client/bin/fedora-ingest.sh f $dir/sample.xml \
>>>>>>>   info:fedora/fedora-system:FOXML-1.1 localhost:8080 \
>>>>>>>   fedoraAdmin <insert pwd here> http
>>>>>>>
>>>>>>> wget -O $dir/export.xml --auth-no-challenge \
>>>>>>>   --http-user=fedoraAdmin --http-password=<insert pwd here> \
>>>>>>>   http://localhost:8080/fedora/objects/$pid/export?context=archive
>>>>>>>
>>>>>>> Note: I use the REST call via wget rather than the provided export
>>>>>>> client scripts because it looks to me from the Java heap explosion
>>>>>>> that the export scripts must end up doing the export via the SOAP
>>>>>>> API.
>>>>>>> --
>>>>>>> The content of makefoxml:
>>>>>>>
>>>>>>> #!/bin/bash
>>>>>>>
>>>>>>> #usage: makefoxml <pid> <refurl>
>>>>>>> #escape slashes off the URL
>>>>>>> RF=${2//\//\\/}
>>>>>>> #if you need to escape ampersands as well, uncomment this:
>>>>>>> #RF=${RF//'&'/'\&'}
>>>>>>>
>>>>>>> # make substitutions ....
>>>>>>> sed '
>>>>>>> s/PID=""/PID="'"$1"'"/
>>>>>>> s/rdf:about=""/rdf:about="info:fedora\/'"$1"'"/
>>>>>>> s/dc:identifier>/dc:identifier>'"$1"'/
>>>>>>> s/REF=""/REF="'"${RF}"'"/
>>>>>>> ' < "foxml_tpl.xml"
>>>>>>>
>>>>>>> --
>>>>>>> The content of foxml_tpl.xml (the sed script above does the edits
>>>>>>> noted in the xml comments in this template):
>>>>>>>
>>>>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>>>>> <!-- following element: set the PID attribute -->
>>>>>>> <foxml:digitalObject VERSION="1.1" PID=""
>>>>>>>     xmlns:foxml="info:fedora/fedora-system:def/foxml#"
>>>>>>>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>>>>>     xsi:schemaLocation="info:fedora/fedora-system:def/foxml#
>>>>>>>         http://www.fedora.info/definitions/1/0/foxml1-1.xsd">
>>>>>>>
>>>>>>> <foxml:objectProperties>
>>>>>>> <foxml:property NAME="info:fedora/fedora-system:def/model#state"
>>>>>>>     VALUE="A"/>
>>>>>>> <foxml:property NAME="info:fedora/fedora-system:def/model#label"
>>>>>>>     VALUE=""/>
>>>>>>> <foxml:property NAME="info:fedora/fedora-system:def/model#ownerId"
>>>>>>>     VALUE="fedoraAdmin"/>
>>>>>>> </foxml:objectProperties>
>>>>>>>
>>>>>>> <foxml:datastream CONTROL_GROUP="X" ID="RELS-EXT">
>>>>>>> <foxml:datastreamVersion
>>>>>>>     FORMAT_URI="info:fedora/fedora-system:FedoraRELSExt-1.0"
>>>>>>>     ID="RELS-EXT.0" LABEL="RDF Statements about this Object"
>>>>>>>     MIMETYPE="application/rdf+xml">
>>>>>>> <foxml:xmlContent>
>>>>>>> <rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/"
>>>>>>>     xmlns:fedora="info:fedora/fedora-system:def/relations-external#"
>>>>>>>     xmlns:fedora-model="info:fedora/fedora-system:def/model#"
>>>>>>>     xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
>>>>>>>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>>     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
>>>>>>> <!-- following element: put the PID as the value for the rdf:about
>>>>>>>      attribute -->
>>>>>>> <rdf:description rdf:about="">
>>>>>>> </rdf:description>
>>>>>>> </rdf:RDF>
>>>>>>> </foxml:xmlContent>
>>>>>>> </foxml:datastreamVersion>
>>>>>>> </foxml:datastream>
>>>>>>>
>>>>>>> <foxml:datastream CONTROL_GROUP="X" ID="DC" STATE="A"
>>>>>>>     VERSIONABLE="true">
>>>>>>> <foxml:datastreamVersion ID="DC.0" LABEL="Dublin Core Record"
>>>>>>>     MIMETYPE="text/xml">
>>>>>>> <foxml:xmlContent>
>>>>>>> <oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/"
>>>>>>>     xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
>>>>>>>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>>>>>     xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/
>>>>>>>         http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
>>>>>>> <dc:title></dc:title>
>>>>>>> <dc:creator>Test Program</dc:creator>
>>>>>>> <dc:description>A test object</dc:description>
>>>>>>> <!-- following element: put the PID between the tags -->
>>>>>>> <dc:identifier></dc:identifier>
>>>>>>> </oai_dc:dc>
>>>>>>> </foxml:xmlContent>
>>>>>>> </foxml:datastreamVersion>
>>>>>>> </foxml:datastream>
>>>>>>>
>>>>>>> <foxml:datastream CONTROL_GROUP="M" ID="Content" STATE="A">
>>>>>>> <foxml:datastreamVersion ID="Content.0"
>>>>>>>     LABEL="This is the object content"
>>>>>>>     MIMETYPE="application/octet-stream">
>>>>>>> <!-- following element: put the URL to the content file as the value
>>>>>>>      for the REF attribute -->
>>>>>>> <!-- must be an http URL, e.g.,
>>>>>>>      http://localhost:8080/ingestpool/foxmldoc.xml -->
>>>>>>> <!-- I just create a directory "ingestpool" under
>>>>>>>      /usr/fedora/tomcat/webapps/ROOT and put the files there -->
>>>>>>> <foxml:contentLocation REF="" TYPE="URL" />
>>>>>>> </foxml:datastreamVersion>
>>>>>>> </foxml:datastream>
>>>>>>>
>>>>>>> </foxml:digitalObject>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 05/17/2011 10:00 AM, Scott Prater wrote:
>>>>>>>> Scott,
>>>>>>>>
>>>>>>>> Can you come up with a test case that confirms this limitation?
>>>>>>>> If you can provide one, I'll open up a JIRA ticket for the issue.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> -- Scott
>>>>>>>>
>>>>>>>> On 05/16/2011 10:45 AM, Scott Hammel wrote:
>>>>>>>>> Oh, I think I see: the last line of the serializer's serialize
>>>>>>>>> function does this:
>>>>>>>>> bytes.toByteArray()
>>>>>>>>> where bytes is a ByteArrayOutputStream
>>>>>>>>>
>>>>>>>>> I *think* the max size of an array index in Java (32-bit) is
>>>>>>>>> 2,147,483,647 (i.e., 2^31 - 1, max value of a java int). So, this
>>>>>>>>> function will throw an exception if a datastream "archive" export
>>>>>>>>> is > ~2 GB.
>>>>>>>>>
>>>>>>>>> scott
>>>>>>>>>
>>>>>>>>> On 05/16/2011 11:00 AM, Scott Hammel wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Running some export tests using Fedora's REST export API, I get a
>>>>>>>>>> negative array index Java exception when doing an "archive" export
>>>>>>>>>> of an object at around 400 MB (>320 MB, <450 MB).
>>>>>>>>>>
>>>>>>>>>> Fedora is version 3.4 something; running on 32-bit CentOS 5.5,
>>>>>>>>>> Sun Java 1.6, 21
>>>>>>>>>>
>>>>>>>>>> Is it just me or has anyone else seen something like that?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Scott
>>>>>>>>>>
>>>>>>>>>>
>>>> -----------------------------------------------------------------
>>>>>>>>>> -------------
>>>>>>>>>> Achieve unprecedented app performance and reliability What
>>>>>>>>>> every C/C++ and Fortran developer should know. Learn
>> how Intel
>>>>>>>>>> has extended the reach of its
>>>> next-generation tools
>>>>>>>>>> to help boost performance applications - inlcuding clusters.
>>>>>>>>>> http://p.sf.net/sfu/intel-dev2devmay
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Fedora-commons-users mailing list
>>>>>>>>>> Fedora-commons-users@lists.sourceforge.net
>>>>>>>>>>
>>>> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
>>>>>
>>>> --
>>>> Scott Prater
>>>> Library, Instructional, and Research Applications (LIRA)
>>>> Division of Information Technology (DoIT) University of
>>>> Wisconsin - Madison pra...@wisc.edu
>>>>
>>>>
>>>
>>>
>>
>>
>
>


