Hi Lucas,

Have you tried running the OAI import with the -v flag and redirecting that output to a log file? That will show the last item indexed before the OOM error. If it's the same item every time, you could look at that item to see whether anything about it differs in a way that may be causing the error.
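A rough sketch of that comparison, not part of DSpace: assuming each run was captured to its own file with something like `./dspace oai import -c -v > run1.log 2>&1`, the script below reports the last line logged before the first `OutOfMemoryError` in each file and whether it is the same line in every run. The verbose output format is not shown in this thread, so the script makes no attempt to parse item IDs; it simply treats the last non-blank line before the error as "the last item". The file names and function names are made up for illustration.

```python
import sys
from pathlib import Path


def last_line_before_oom(lines):
    """Return the last non-blank line seen before the first OutOfMemoryError.

    Returns None if the log contains no OutOfMemoryError at all.
    """
    previous = None
    for line in lines:
        if "OutOfMemoryError" in line:
            return previous
        stripped = line.strip()
        # Skip blank lines and the "Java heap space" error banner itself.
        if stripped and "Java heap space" not in stripped:
            previous = stripped
    return None


def same_culprit(logs):
    """True when every log hit the OOM right after the same line."""
    found = {last_line_before_oom(log) for log in logs}
    return len(found) == 1 and None not in found


if __name__ == "__main__":
    paths = sys.argv[1:]
    logs = [Path(p).read_text(errors="replace").splitlines() for p in paths]
    for path, log in zip(paths, logs):
        print(path, "->", last_line_before_oom(log))
    if logs:
        print("same line before OOM in every run:", same_culprit(logs))
```

Saved under a name of your choosing, it could be run as `python3 compare_oom.py run1.log run2.log`. If the runs stop after different lines, the problem is less likely to be one specific item.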
Between 7.5 and 7.6.1 there weren't many changes to the OAI import process, but a new XOAI extension was added to get the item's access status: https://github.com/DSpace/DSpace/blob/dspace-7_x/dspace-oai/src/main/java/org/dspace/xoai/app/plugins/AccessStatusElementItemCompilePlugin.java.

Nick

On Wednesday, June 26, 2024 at 7:25:41 PM UTC-5 [email protected] wrote:

> Hello,
>
> I reduced the batch size to 100, but continued to face the same problem. In
> this case, it was not even possible to index 14 thousand items, only 12
> thousand. Then, I adopted the other strategy of repeatedly running the
> command ./dspace oai import -c until all items were indexed. I executed the
> command 10 times, and each time, 12 thousand items were indexed. However,
> when checking the total on the OAI interface, there was no change; it
> remained the same 12 thousand items.
>
> On Wednesday, June 26, 2024 at 5:44:24 PM UTC-3, DSpace Community
> wrote:
>
>> Hi Lucas,
>>
>> When you run "./dspace oai import -c" you should see occasional messages
>> like this...
>>
>> ___ items imported so far...
>>
>> These are batches of items that are being committed every once in a
>> while. The batch size is defined by "oai.import.batch.size" in your
>> [dspace]/config/oai.cfg (default is 1,000).
>>
>> So, a few options exist:
>>
>>    1. You could decrease the batch size to see if that avoids the Out of
>>    Memory error. Set that config to 100 or 500 in either your local.cfg
>>    or in the oai.cfg. Then rerun the script. It will likely go a bit
>>    slower, but it should use less memory.
>>    2. You could increase the memory available by ensuring that your
>>    command-line tools have more than 4GB of memory. See instructions at
>>    https://wiki.lyrasis.org/display/DSDOC7x/Performance+Tuning+DSpace#PerformanceTuningDSpace-GivetheCommandLineToolsMoreMemory
>>    3. Or, if none of that works, then you could just keep running
>>    "./dspace oai import -c" again and again until everything is indexed.
>>    The script should start off each time from where you left off (it
>>    will determine which Items are already indexed and skip them).
>>
>> Hopefully that will help. If we find this is a common issue, there may
>> be a bug here in how memory is used (as it seems like we shouldn't be
>> hitting this error at all), but hopefully those workarounds will help you
>> get past this issue.
>>
>> Tim
>>
>> On Tuesday, June 25, 2024 at 6:55:04 PM UTC-5 [email protected]
>> wrote:
>>
>>> Hi,
>>>
>>> I run ./dspace oai import -c
>>>
>>> After collecting 14k items I have the following error in the console:
>>> java.lang.OutOfMemoryError: Java heap space
>>>     at java.base/java.util.Arrays.copyOf(Arrays.java:3745)
>>>     at java.base/java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:120)
>>>     at java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:95)
>>>     at java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:156)
>>>     at com.ctc.wstx.io.UTF8Writer.write(UTF8Writer.java:143)
>>>     at com.ctc.wstx.sw.BufferingXmlWriter.flushBuffer(BufferingXmlWriter.java:1417)
>>>     at com.ctc.wstx.sw.BufferingXmlWriter.fastWriteRaw(BufferingXmlWriter.java:1463)
>>>     at com.ctc.wstx.sw.BufferingXmlWriter.writeStartTagStart(BufferingXmlWriter.java:763)
>>>     at com.ctc.wstx.sw.BaseNsStreamWriter.doWriteStartTag(BaseNsStreamWriter.java:612)
>>>     at com.ctc.wstx.sw.BaseNsStreamWriter.writeStartElement(BaseNsStreamWriter.java:310)
>>>     at com.lyncode.xoai.util.XmlIOUtils.writeElement(XmlIOUtils.java:19)
>>>     at com.lyncode.xoai.dataprovider.xml.xoai.Metadata.write(Metadata.java:95)
>>>     at org.dspace.xoai.app.XOAI.index(XOAI.java:485)
>>>     at org.dspace.xoai.app.XOAI.index(XOAI.java:320)
>>>     at org.dspace.xoai.app.XOAI.indexAll(XOAI.java:265)
>>>     at org.dspace.xoai.app.XOAI.index(XOAI.java:158)
>>>     at org.dspace.xoai.app.XOAI.main(XOAI.java:618)
>>>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>>>
>>>
>>> When I go to check dspace.log I have the following error:
>>>
>>> 2024-06-25 23:46:36,955 INFO unknown unknown org.dspace.xoai.util.ItemUtils @ Missing READ rights for license bitstream. Did not include license bitstream for item: 3e52cc21-e8f6-4468-8e59-1e7c371b6b2f.
>>> 2024-06-25 23:46:44,672 ERROR unknown unknown org.dspace.xoai.app.XOAI @ Java heap space
>>> java.lang.OutOfMemoryError: Java heap space
>>>     at java.util.Arrays.copyOf(Arrays.java:3745) ~[?:?]
>>>     at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:120) ~[?:?]
>>>     at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:95) ~[?:?]
>>>     at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:156) ~[?:?]
>>>     at com.ctc.wstx.io.UTF8Writer.write(UTF8Writer.java:143) ~[woodstox-core-6.2.4.jar:6.2.4]
>>>     at com.ctc.wstx.sw.BufferingXmlWriter.flushBuffer(BufferingXmlWriter.java:1417) ~[woodstox-core-6.2.4.jar:6.2.4]
>>>     at com.ctc.wstx.sw.BufferingXmlWriter.fastWriteRaw(BufferingXmlWriter.java:1463) ~[woodstox-core-6.2.4.jar:6.2.4]
>>>     at com.ctc.wstx.sw.BufferingXmlWriter.writeStartTagStart(BufferingXmlWriter.java:763) ~[woodstox-core-6.2.4.jar:6.2.4]
>>>     at com.ctc.wstx.sw.BaseNsStreamWriter.doWriteStartTag(BaseNsStreamWriter.java:612) ~[woodstox-core-6.2.4.jar:6.2.4]
>>>     at com.ctc.wstx.sw.BaseNsStreamWriter.writeStartElement(BaseNsStreamWriter.java:310) ~[woodstox-core-6.2.4.jar:6.2.4]
>>>     at com.lyncode.xoai.util.XmlIOUtils.writeElement(XmlIOUtils.java:19) ~[xoai-3.4.0.jar:3.4.0]
>>>     at com.lyncode.xoai.dataprovider.xml.xoai.Metadata.write(Metadata.java:95) ~[xoai-3.4.0.jar:3.4.0]
>>>     at org.dspace.xoai.app.XOAI.index(XOAI.java:485) ~[dspace-oai-7.6.1.jar:7.6.1]
>>>     at org.dspace.xoai.app.XOAI.index(XOAI.java:320) ~[dspace-oai-7.6.1.jar:7.6.1]
>>>     at org.dspace.xoai.app.XOAI.indexAll(XOAI.java:265) ~[dspace-oai-7.6.1.jar:7.6.1]
>>>     at org.dspace.xoai.app.XOAI.index(XOAI.java:158) ~[dspace-oai-7.6.1.jar:7.6.1]
>>>     at org.dspace.xoai.app.XOAI.main(XOAI.java:618) [dspace-oai-7.6.1.jar:7.6.1]
>>>     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
>>>     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
>>>     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
>>>     at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
>>>     at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:283) [dspace-api-7.6.1.jar:7.6.1]
>>>     at org.dspace.app.launcher.ScriptLauncher.handleScript(ScriptLauncher.java:134) [dspace-api-7.6.1.jar:7.6.1]
>>>     at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:99) [dspace-api-7.6.1.jar:7.6.1]
>>> 2024-06-25 23:46:44,721 INFO unknown unknown org.ehcache.core.EhcacheManager @ Cache 'org.dspace.content.MetadataSchema' removed from Eh107InternalCacheManager.
>>>
>>>
>>> On Tuesday, June 25, 2024 at 6:55:28 PM UTC-3, DSpace Community
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I think we'd need more information on the exact command you are
>>>> running. You also should check your logs to see if errors are occurring
>>>> *before* the Java heap issue. See our troubleshooting guide:
>>>> https://wiki.lyrasis.org/display/DSPACE/Troubleshoot+an+error#Troubleshootanerror-DSpace7.x(orabove)
>>>>
>>>> I'm not aware of a memory issue in the "dspace oai import" command.
>>>> But, it is always possible that you've encountered a new/undiscovered
>>>> issue with the command. So, we need to understand exactly what command
>>>> you are running in order to see if others can reproduce the issue.
>>>>
>>>> Based on what you've shared so far, it does sound like you might be
>>>> encountering some sort of bug (especially if it worked fine in 7.5 but the
>>>> same command isn't working in 7.6.1). So, you are also welcome to share
>>>> the detailed information in a bug ticket (
>>>> https://github.com/DSpace/DSpace/issues), and we can then look for
>>>> volunteers to investigate what might be going on.
>>>>
>>>> Tim
>>>>
>>>> On Tuesday, June 25, 2024 at 4:11:35 PM UTC-5 [email protected]
>>>> wrote:
>>>>
>>>>> Dear colleague,
>>>>>
>>>>> I applied the fix for the Java heap memory issue, setting it to 4 GB,
>>>>> but it is not sufficient. When indexing 20k items, it crashes. I also
>>>>> tried this on another instance of DSpace version 7.6.1 and the same
>>>>> thing happens.
>>>>>
>>>>> On Monday, June 17, 2024 at 6:16:38 PM UTC-3, Holger Lenz
>>>>> wrote:
>>>>>
>>>>>> Hi there,
>>>>>>
>>>>>> Are you experiencing the error "java.lang.OutOfMemoryError: Java
>>>>>> heap space", or is it a different error?
>>>>>>
>>>>>> If it is the former, there is documentation on that (most likely a
>>>>>> memory issue):
>>>>>> https://wiki.lyrasis.org/display/DSDOC7x/Performance+Tuning+DSpace
>>>>>> (subheading "Performance Tuning the Backend (REST API)")
>>>>>>
>>>>>> Please let us know if this doesn't point you in the right direction.
>>>>>>
>>>>>> Holger
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Monday, June 17, 2024 at 10:23:18 AM UTC-4
>>>>>> [email protected] wrote:
>>>>>>
>>>>>>> Dear Colleagues,
>>>>>>>
>>>>>>> I am having a problem when trying to index the metadata using OAI
>>>>>>> import. In version 7.5, I was able to import all 42,000 items from the
>>>>>>> digital library. I installed version 7.6.1, and now when I try to run
>>>>>>> the command, I can only index 1/3 of the total, and it generates a
>>>>>>> Java heap error. Does anyone know if this is a common issue with
>>>>>>> version 7.6.1?
>>>>>>>
>>>>>>>
>>>>>>> Thanks in advance.
>>>>>>>

--
All messages to this mailing list should adhere to the Code of Conduct: https://www.lyrasis.org/about/Pages/Code-of-Conduct.aspx
---
You received this message because you are subscribed to the Google Groups "DSpace Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/dspace-community/a4e56b31-e326-4d68-99ee-820f1f8c09c7n%40googlegroups.com.
