Scott, Steve,

REST export in archive format still blows up with Fedora 3.4.2. Actually,
it is crashing on a datastream < 300MB. I gave the JVM 1.5GB of heap, BTW.

Regardless, the exception that is in fedora.log is a negative array 
index exception. It looks like it is actually occurring down in the 
base64 encoder according to the stack trace.
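
To illustrate how a negative array size could show up well below the 2GB array limit, here is a minimal Java sketch. It is an assumption for illustration only, not taken from the Fedora source or from the stack trace: the class and method names are hypothetical, and the bit-count calculation is just one example of an intermediate value that wraps negative in int arithmetic once the input reaches 256 MiB.

```java
// Hypothetical illustration; this is NOT Fedora's actual encoder code.
class Base64SizeSketch {

    // Bit count computed in 32-bit int arithmetic. It wraps negative once
    // byteLength reaches 2^28 (256 MiB), because 2^28 * 8 = 2^31.
    static int bitCountInt(int byteLength) {
        return byteLength * 8;
    }

    // Same calculation, widened to long before multiplying: no overflow.
    static long bitCountLong(int byteLength) {
        return byteLength * 8L;
    }

    // Padded base64 emits 4 output characters for every 3 input bytes.
    static long encodedLength(long byteLength) {
        return 4L * ((byteLength + 2) / 3);
    }

    public static void main(String[] args) {
        int size = 300 * 1024 * 1024; // a 300 MiB datastream
        System.out.println("int bit count:  " + bitCountInt(size));
        System.out.println("long bit count: " + bitCountLong(size));
        System.out.println("base64 length:  " + encodedLength(size));
        try {
            // Sizing a buffer from the wrapped value asks for a negative length.
            byte[] out = new byte[bitCountInt(size) / 6];
            System.out.println("allocated " + out.length + " bytes");
        } catch (NegativeArraySizeException e) {
            System.out.println("negative array size: " + e.getMessage());
        }
    }
}
```

Doing the size arithmetic in long avoids the wraparound, but the encoded result still has to fit in a single array if the export is built in memory.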

It occurs to me that building support for a full archival export of an 
object in memory for arbitrarily large objects might be pragmatically 
(is that a word?) impossible: e.g., on 32-bit systems I think you bump 
into problems giving the JVM more than ~1.8 GB of RAM. That alone limits 
the size of exportable objects to well under 2GB in that environment.

If I were more adept with Java, I'd volunteer to write an exporter that
spooled to disk, but alas, I am not and it would take me twice as long
as someone who is. :-(
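
For reference, a minimal sketch of what spooling to disk could look like. It assumes Java 8 or later for java.util.Base64 (which is not available in the Java 1.6 used in this thread), and the class name, method name, and file handling are all hypothetical; it is not Fedora's exporter API. The content is copied through a streaming base64 encoder in small chunks, so memory use stays at the buffer size regardless of the datastream size.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical sketch; names are illustrative, not part of Fedora.
class SpoolingExportSketch {

    // Streams content through a base64 encoder into a file on disk.
    // Returns the number of raw (unencoded) bytes read.
    static long spoolBase64(InputStream content, Path target) throws IOException {
        long total = 0; // long, so the count cannot wrap at 2GB
        byte[] buffer = new byte[8192];
        try (OutputStream file = Files.newOutputStream(target);
             OutputStream encoder = Base64.getEncoder().wrap(file)) {
            int n;
            while ((n = content.read(buffer)) != -1) {
                encoder.write(buffer, 0, n);
                total += n;
            }
        } // closing the encoder writes any final padding
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = new byte[100_000];
        for (int i = 0; i < sample.length; i++) {
            sample[i] = (byte) i;
        }
        Path tmp = Files.createTempFile("export", ".b64");
        long copied = spoolBase64(new ByteArrayInputStream(sample), tmp);
        // Small sample only: read it back to check the round trip.
        byte[] decoded = Base64.getDecoder().decode(Files.readAllBytes(tmp));
        System.out.println(copied + " bytes spooled, round trip ok: "
                + Arrays.equals(sample, decoded));
        Files.delete(tmp);
    }
}
```

A real exporter would also have to write the surrounding FOXML around the encoded content; the sketch covers only the encoding step.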

I can take one of several alternative paths with my particular project, 
so it isn't too big an issue to *me* .... I just have to do a little 
more coding in a middle-tier. Don't know about other folks, of course.

-Scott

On 05/18/2011 01:08 AM, Stephen Bayliss wrote:
> Looking at those lines of code it looks like in theory there would be a
> problem there.  Once this is confirmed we should probably add a test case to
> the large datastreams test suite.  And it is likely to cause a problem with
> datastreams smaller than 2GB (2^31-1 as maximum array index) due to the
> archive export base64-encoding the content.
>
>> -----Original Message-----
>> From: Scott Prater [mailto:pra...@wisc.edu]
>> Sent: 17 May 2011 18:33
>> To: Support and info exchange list for Fedora users.
>> Subject: Re: [fcrepo-user] REST export API negative array
>> index exception
>>
>>
>> Yes, trying with the latest stable version (3.4.2) would be useful, if
>> you don't mind.  There were some low-level garbage collection problems
>> that were fixed in the 3.4.2 release; these problems manifested
>> themselves in a variety of ways.
>>
>> I'm not saying this is the issue, but it wouldn't hurt to verify that
>> your problem can be reproduced in 3.4.2.
>>
>> thanks,
>>
>> -- Scott
>>
>> On 05/17/2011 12:22 PM, Scott Hammel wrote:
>>> I'm pretty sure it is 3.4.0 (from files on the server it looks like an
>>> August 2010 build). The server is in a totally isolated network with
>>> nothing with GUI support that can hit the admin tools.
>>>
>>> Tomcat is the version bundled with the Fedora installer.
>>>
>>> Would you like me to be sure I'm running at the latest version and try
>>> the test scripts again before you go forward?
>>>
>>> Scott
>>>
>>> On 05/17/2011 12:45 PM, Scott Prater wrote:
>>>> Thanks, Scott.  I'll try to reproduce the problem in my environment,
>>>> Fedora 3.4.2.
>>>>
>>>> Can you tell me what version of Fedora and Tomcat (or other webapp
>>>> server) you're using?
>>>>
>>>> -- Scott
>>>>
>>>> On 05/17/2011 11:08 AM, Scott Hammel wrote:
>>>>> Hey, Scott,
>>>>>
>>>>> Thanks for responding. I'm more a C/C++ programmer and not a Java
>>>>> programmer (though I sometimes play one on the Internet), so I'm
>>>>> just guessing on the array bounds -- feels like something
>>>>> incrementing an int into the sign bit, though I'd think Java would
>>>>> throw some array bounds exception before that happened. Figured I'd
>>>>> do a little math later maybe to test my hypothesis.
>>>>>
>>>>> Recall, this was all in a 32-bit environment. I really hope it is a
>>>>> non-issue and something I'm doing in the end. Note disseminating the
>>>>> datastream content directly appears to work OK, which confuses me a
>>>>> little, though I haven't looked to see if the code for that does
>>>>> things differently.
>>>>>
>>>>> Anyway, here's a series of commands (extracted from my test scripts)
>>>>> that should reproduce the problem:
>>>>>
>>>>> mkdir /usr/fedora/tomcat/webapps/ROOT/ingestpool
>>>>> mkdir /tmp/fedrun
>>>>> dir=/tmp/fedrun
>>>>> pid=test:pid01
>>>>>
>>>>> dd if=/dev/urandom \
>>>>>   of=/usr/fedora/tomcat/webapps/ROOT/ingestpool/sample.bin bs=1M count=400
>>>>>
>>>>> ./makefoxml $pid http://localhost:8080/ingestpool/sample.bin > $dir/sample.xml
>>>>>
>>>>> /usr/fedora/client/bin/fedora-ingest.sh f $dir/sample.xml \
>>>>>   info:fedora/fedora-system:FOXML-1.1 localhost:8080 \
>>>>>   fedoraAdmin <insert pwd here> http
>>>>>
>>>>> wget -O $dir/export.xml --auth-no-challenge \
>>>>>   --http-user=fedoraAdmin --http-password=<insert pwd here> \
>>>>>   "http://localhost:8080/fedora/objects/$pid/export?context=archive"
>>>>>
>>>>> Note: I use the REST call via a wget rather than the provided export
>>>>> client scripts because it looks to me from the Java heap explosion
>>>>> that the export scripts must end up doing the export via the SOAP
>>>>> API.
>>>>> --
>>>>> The content of makefoxml:
>>>>>
>>>>> #!/bin/bash
>>>>>
>>>>> # usage: makefoxml <pid> <refurl>
>>>>> # escape slashes in the URL so it is safe inside the sed expressions
>>>>> RF=${2//\//\\/}
>>>>> # if you need to escape ampersands as well, uncomment this:
>>>>> #RF=${RF//'&'/'\&'}
>>>>>
>>>>> # make substitutions ....
>>>>> sed '
>>>>> s/PID=""/PID="'"$1"'"/
>>>>> s/rdf:about=""/rdf:about="info:fedora\/'"$1"'"/
>>>>> s/dc:identifier>/dc:identifier>'"$1"'/
>>>>> s/REF=""/REF="'"${RF}"'"/
>>>>> ' < "foxml_tpl.xml"
>>>>>
>>>>> --
>>>>> The content of foxml_tpl.xml (the sed script above does the edits
>>>>> noted in the xml comments in this template):
>>>>>
>>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>>> <!-- following element: set the PID attribute -->
>>>>> <foxml:digitalObject VERSION="1.1" PID=""
>>>>>     xmlns:foxml="info:fedora/fedora-system:def/foxml#"
>>>>>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>>>     xsi:schemaLocation="info:fedora/fedora-system:def/foxml#
>>>>>     http://www.fedora.info/definitions/1/0/foxml1-1.xsd">
>>>>>
>>>>> <foxml:objectProperties>
>>>>> <foxml:property NAME="info:fedora/fedora-system:def/model#state"
>>>>>     VALUE="A"/>
>>>>> <foxml:property NAME="info:fedora/fedora-system:def/model#label"
>>>>>     VALUE=""/>
>>>>> <foxml:property NAME="info:fedora/fedora-system:def/model#ownerId"
>>>>>     VALUE="fedoraAdmin"/>
>>>>> </foxml:objectProperties>
>>>>>
>>>>> <foxml:datastream CONTROL_GROUP="X" ID="RELS-EXT">
>>>>> <foxml:datastreamVersion
>>>>>     FORMAT_URI="info:fedora/fedora-system:FedoraRELSExt-1.0"
>>>>>     ID="RELS-EXT.0" LABEL="RDF Statements about this Object"
>>>>>     MIMETYPE="application/rdf+xml">
>>>>> <foxml:xmlContent>
>>>>> <rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/"
>>>>>     xmlns:fedora="info:fedora/fedora-system:def/relations-external#"
>>>>>     xmlns:fedora-model="info:fedora/fedora-system:def/model#"
>>>>>     xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
>>>>>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
>>>>> <!-- following element: put the PID as the value for the rdf:about
>>>>> attribute -->
>>>>> <rdf:description rdf:about="">
>>>>> </rdf:description>
>>>>> </rdf:RDF>
>>>>> </foxml:xmlContent>
>>>>> </foxml:datastreamVersion>
>>>>> </foxml:datastream>
>>>>>
>>>>> <foxml:datastream CONTROL_GROUP="X" ID="DC" STATE="A"
>>>>>     VERSIONABLE="true">
>>>>> <foxml:datastreamVersion ID="DC.0" LABEL="Dublin Core Record"
>>>>>     MIMETYPE="text/xml">
>>>>> <foxml:xmlContent>
>>>>> <oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/"
>>>>>     xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
>>>>>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>>>     xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/
>>>>>     http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
>>>>> <dc:title></dc:title>
>>>>> <dc:creator>Test Program</dc:creator>
>>>>> <dc:description>A test object</dc:description>
>>>>> <!-- following element: put the PID between the tags -->
>>>>> <dc:identifier></dc:identifier>
>>>>> </oai_dc:dc>
>>>>> </foxml:xmlContent>
>>>>> </foxml:datastreamVersion>
>>>>> </foxml:datastream>
>>>>>
>>>>> <foxml:datastream CONTROL_GROUP="M" ID="Content" STATE="A">
>>>>> <foxml:datastreamVersion ID="Content.0"
>>>>>     LABEL="This is the object content"
>>>>>     MIMETYPE="application/octet-stream">
>>>>> <!-- following element: put the URL to the content file as the value
>>>>> for the REF attribute -->
>>>>> <!-- must be an http URL, e.g.,
>>>>> http://localhost:8080/ingestpool/foxmldoc.xml -->
>>>>> <!-- I just create a directory "ingestpool" under
>>>>> /usr/fedora/tomcat/webapps/ROOT and put the files there -->
>>>>> <foxml:contentLocation REF="" TYPE="URL" />
>>>>> </foxml:datastreamVersion>
>>>>> </foxml:datastream>
>>>>>
>>>>> </foxml:digitalObject>
>>>>>
>>>>>
>>>>>
>>>>> On 05/17/2011 10:00 AM, Scott Prater wrote:
>>>>>> Scott,
>>>>>>
>>>>>> Can you come up with a test case that confirms this limitation?  If
>>>>>> you can provide one, I'll open up a JIRA ticket for the issue.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> -- Scott
>>>>>>
>>>>>> On 05/16/2011 10:45 AM, Scott Hammel wrote:
>>>>>>> Oh, I think I see: the last line of the serializer's serialize
>>>>>>> function does this:
>>>>>>> bytes.toByteArray()
>>>>>>> where bytes is a ByteArrayOutputStream.
>>>>>>>
>>>>>>> I *think* the max size of an array index in Java (32-bit) is
>>>>>>> 2,147,483,647 (i.e., 2^31 - 1, max value of a Java int). So, this
>>>>>>> function will throw an exception if a datastream "archive" export
>>>>>>> is > ~2 GB.
>>>>>>>
>>>>>>> scott
>>>>>>>
>>>>>>> On 05/16/2011 11:00 AM, Scott Hammel wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Running some export tests using Fedora's REST export API, I get a
>>>>>>>> negative array index Java exception when doing an "archive"
>>>>>>>> export of an object at around 400 MB (> 320 MB, < 450 MB).
>>>>>>>>
>>>>>>>> Fedora is version 3.4 something; running on 32-bit CentOS 5.5,
>>>>>>>> Sun Java 1.6, 21
>>>>>>>>
>>>>>>>> Is it just me or has anyone else seen something like that?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Scott
>>>>>>>>
>>>>>>>>
>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>> Achieve unprecedented app performance and reliability
>>>>>>>> What every C/C++ and Fortran developer should know.
>>>>>>>> Learn how Intel has extended the reach of its next-generation tools
>>>>>>>> to help boost performance applications - including clusters.
>>>>>>>> http://p.sf.net/sfu/intel-dev2devmay
>>>>>>>> _______________________________________________
>>>>>>>> Fedora-commons-users mailing list
>>>>>>>> Fedora-commons-users@lists.sourceforge.net
>>>>>>>> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
>>>>>
>>>
>>>
>>
>> -- 
>> Scott Prater
>> Library, Instructional, and Research Applications (LIRA)
>> Division of Information Technology (DoIT) University of
>> Wisconsin - Madison pra...@wisc.edu
>>
>>
>
> ------------------------------------------------------------------------------
> What Every C/C++ and Fortran developer Should Know!
> Read this article and learn how Intel has extended the reach of its
> next-generation tools to help Windows* and Linux* C/C++ and Fortran
> developers boost performance applications - including clusters.
> http://p.sf.net/sfu/intel-dev2devmay
> _______________________________________________
> Fedora-commons-users mailing list
> Fedora-commons-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
>


