Sue,

I *think* you can use the PDFFilter class from 1.5
(/dspace-api/src/main/java/org/dspace/app/mediafilter/PDFFilter.java)
without any problems. The media filter changed after 1.4.2 to support
'self-named' plugins, but that doesn't look like it affects the PDFFilter.

Otherwise, the PDFFilter in 1.5 should show you enough about how it
catches and handles the OutOfMemoryError to port that handling back to
1.4.2 if necessary.

G

Thornton, Susan M. (LARC-B702)[NCI INFORMATION SYSTEMS] wrote:
> Would I be able to use this program within DSpace 1.4.2?
> Thanks,
> Sue
>
> -----Original Message-----
> From: Graham Triggs [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, June 10, 2008 2:58 PM
> To: Thornton, Susan M. (LARC-B702)[NCI INFORMATION SYSTEMS]
> Cc: [email protected]
> Subject: Re: [Dspace-tech] filter-media error in DSpace 1.4.2
>
> There is no fix - it is essentially a bug within PDFBox.
>
> In 1.5, there is a workaround that catches the out of memory exceptions,
> and skips the record.
>
> G
>
> Thornton, Susan M. (LARC-B702)[NCI INFORMATION SYSTEMS] wrote:
>> No wonder I didn't get any responses on my previous message...no one
>> recognized the job name! :-) Sorry...the job that is getting the
>> following error is "filter-media". It intermittently gets the following
>> error and a JAVA "heap space" error which someone way-back-when told me
>> was supposed to be a bug that was going to be fixed.
>>
>> Does anyone know if there is a fix for it yet? I'm afraid our full-text
>> search is not accurate because this job is blowing up mid-stream.
>>
>> Thanks,
>> Sue
>>
>> p.s. rim-filter is just our name for the media-filter job with a couple
>> of delete files added...
>>
>> Sue Walker-Thornton
>> NASA Langley Research Center
>> 757-224-4074
>>
>>
>> Error:
>> Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
>>     at java.util.HashMap.addEntry(HashMap.java:753)
>>     at java.util.HashMap.put(HashMap.java:385)
>>     at org.fontbox.cmap.CMap.addMapping(CMap.java:132)
>>     at org.fontbox.cmap.CMapParser.parse(CMapParser.java:153)
>>     at org.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:535)
>>     at org.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:387)
>>     at org.pdfbox.util.PDFStreamEngine.showString(PDFStreamEngine.java:325)
>>     at org.pdfbox.util.operator.ShowText.process(ShowText.java:64)
>>     at org.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:452)
>>     at org.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:215)
>>     at org.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:174)
>>     at org.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:336)
>>     at org.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:259)
>>     at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:216)
>>     at org.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:149)
>>     at org.dspace.app.mediafilter.PDFFilter.getDestinationStream(PDFFilter.java:110)
>>     at org.dspace.app.mediafilter.MediaFilter.processBitstream(MediaFilter.java:155)
>>     at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:340)
>>     at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:309)
>>     at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:274)
>>     at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:242)
>>     at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:193)
>>
>> -----Original Message-----
>> From: dspace home directory [mailto:[EMAIL PROTECTED]
>> Sent: Thursday, June 05, 2008 1:05 AM
>> To: [EMAIL PROTECTED]
>> Subject: Output from "cron" command
>>
>> Your "cron" job on odyssey
>> /dspace/bin/rim-filter > /dspace/bin/rim-filter.log
>>
>> produced the following output:
>>
>>
>> -------------------------------------------------------------------------
>> Check out the new SourceForge.net Marketplace.
>> It's the best place to buy or sell services for
>> just about anything Open Source.
>> http://sourceforge.net/services/buy/index.php
>> _______________________________________________
>> DSpace-tech mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>
> This email has been scanned by Postini.
> For more information please visit http://www.postini.com
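
For anyone porting the workaround by hand, the catch-and-skip idea mentioned in this thread might look roughly like the sketch below. This is not the DSpace 1.5 PDFFilter code, and it does not call PDFBox: the class SkipOnOutOfMemory, the TextExtractor interface and the extractAll method are hypothetical names made up for illustration, and the out-of-memory condition is simulated. It only shows the control flow: try one document, catch OutOfMemoryError, record the document as skipped, and carry on with the next one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SkipOnOutOfMemory {

    /** Hypothetical stand-in for the text extraction step (not a DSpace or PDFBox type). */
    interface TextExtractor {
        String extract(String documentName);
    }

    /**
     * Runs the extractor over every document. A document that triggers an
     * OutOfMemoryError is added to 'skipped' and the loop continues, instead
     * of letting the error end the whole batch run.
     */
    static Map<String, String> extractAll(List<String> documentNames,
                                          TextExtractor extractor,
                                          List<String> skipped) {
        Map<String, String> extracted = new LinkedHashMap<>();
        for (String name : documentNames) {
            try {
                extracted.put(name, extractor.extract(name));
            } catch (OutOfMemoryError e) {
                // The failed extraction's objects become unreachable once the
                // stack unwinds, so the next document gets a chance to run.
                skipped.add(name);
                System.err.println("Skipping " + name + ": " + e.getMessage());
            }
        }
        return extracted;
    }

    public static void main(String[] args) {
        // Simulated extractor: "bad.pdf" stands in for a PDF that exhausts the heap.
        TextExtractor extractor = name -> {
            if (name.equals("bad.pdf")) {
                throw new OutOfMemoryError("simulated");
            }
            return "text of " + name;
        };

        List<String> skipped = new ArrayList<>();
        Map<String, String> extracted =
                extractAll(Arrays.asList("a.pdf", "bad.pdf", "c.pdf"), extractor, skipped);

        System.out.println("Extracted: " + extracted.keySet());
        System.out.println("Skipped: " + skipped);
    }
}
```

Catching OutOfMemoryError is normally discouraged because the JVM may be left in a poor state. It is a pragmatic choice here only because the memory is consumed while parsing a single document and can be reclaimed once that document is abandoned. Skipped documents will still be missing from the full-text index, so it is worth logging them.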

