Hello - The errors you posted appear to be related to filtering of PDF and Word documents. I'm not sure of the limits.
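One limit that is easy to check is the heap ceiling the JVM is actually running with, since the log you posted ends in an OutOfMemoryError. Below is a minimal sketch, not part of DSpace: the class name HeapCheck and the helper toMegabytes are made up for illustration. It only prints what Runtime.maxMemory() reports, on the assumption that you start it with the same java command and -Xmx setting your filter run uses.

```java
// Hypothetical helper, not part of DSpace: reports the heap ceiling
// of the JVM it is started in.
public class HeapCheck {

    // Convert a byte count to whole megabytes (1 MB = 1024 * 1024 bytes).
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // maxMemory() is the most memory the JVM will attempt to use for
        // the heap; it roughly tracks the -Xmx value on the command line.
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap: " + toMegabytes(maxBytes) + " MB");
    }
}
```

Compile it with javac HeapCheck.java and run it as, for example, java -Xmx256m HeapCheck. The reported figure can differ somewhat from the -Xmx value depending on the runtime.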
For what it's worth, here are some steps you can try to build JPEG thumbnails of TIFFs in DSpace:

- Enter TIFF in the DSpace bitstream format registry with mime-type image/tiff and file extensions tiff and tif
- Edit the dspace.cfg file and add image/tiff and TIFF to the filter.org.dspace.app.mediafilter.JPEGFilter.inputFormats line
- Download and install the Java Advanced Imaging I/O tools, currently available here: https://jai-imageio.dev.java.net/binary-builds.html . These tools contain a TIFF plugin that will allow the JPEGFilter to read the TIFF format.
- Verify that the bitstreams are marked as TIFF format, and then run the filter-media script to build the JPEG thumbnails for the TIFFs. Be patient.

If you see memory errors with large TIFF files, you can try increasing the "-Xmx256m" (maximum heap size) parameter in the dsrun script to resolve the problem.

If you have certain types of images, you may need to write a custom filter or modify the JPEGFilter to get better results. For example, if you have large TIFF files that are primarily black and white, the JPEGFilter will favor speed over appearance when resampling the image to the thumbnail sized JPEG, and the resulting thumbnail won't look much like the original. You might need a filter that uses a different resampling method.

--
Keith
Systems Developer
OhioLINK

Branko Kovacevic wrote:
> Dear All,
>
> So far we've been uploading jpg images into our DSpace system and had
> no problems with getting thumbnails for them later.
>
> Unfortunately, recently after uploading a dozen of items with tiff
> images (their size is between 4 and 15 Mb) couldn't get thumbnails for
> them. Filter-media script returns error message.
> Here is the portion of the log file, with some critical messages:
>
> ERROR filtering, skipping bitstream #7542
> java.io.FileNotFoundException: no such entry: "0Table"
> java.io.FileNotFoundException: no such entry: "0Table"
>     at org.apache.poi.poifs.filesystem.DirectoryNode.getEntry(DirectoryNode.java:283)
>     at org.textmining.text.extraction.WordExtractor.extractText(WordExtractor.java:60)
>     at org.dspace.app.mediafilter.WordFilter.getDestinationStream(WordFilter.java:97)
>     at org.dspace.app.mediafilter.MediaFilter.processBitstream(MediaFilter.java:155)
>     at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:327)
>     at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:296)
>     at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:266)
>     at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:234)
>     at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:185)
> java.lang.Throwable: Warning: You did not close the PDF Document
>     at org.pdfbox.cos.COSDocument.finalize(COSDocument.java:384)
>     at gnu.gcj.runtime.FinalizerThread.run(libgcj.so.70)
> java.lang.Throwable: Warning: You did not close the PDF Document
>     at org.pdfbox.cos.COSDocument.finalize(COSDocument.java:384)
>     at gnu.gcj.runtime.FinalizerThread.run(libgcj.so.70)
> java.lang.Throwable: Warning: You did not close the PDF Document
>     at org.pdfbox.cos.COSDocument.finalize(COSDocument.java:384)
>     at gnu.gcj.runtime.FinalizerThread.run(libgcj.so.70)
> java.lang.Throwable: Warning: You did not close the PDF Document
>     at org.pdfbox.cos.COSDocument.finalize(COSDocument.java:384)
>     at gnu.gcj.runtime.FinalizerThread.run(libgcj.so.70)
> java.lang.Throwable: Warning: You did not close the PDF Document
>     at org.pdfbox.cos.COSDocument.finalize(COSDocument.java:384)
>     at gnu.gcj.runtime.FinalizerThread.run(libgcj.so.70)
> FILTERED: bitstream 7682 and created 'articles_bridging_20000615.pdf.txt'
> FILTERED: bitstream 7683 and created 'articles_sustainable_developement_20000815.pdf.txt'
> GC Warning: Repeated allocation of very large block (appr. size 20230144):
>     May lead to memory leak and poor performance.
> FILTERED: bitstream 7684 and created 'articles_venture_20001215.pdf.txt'
> FILTERED: bitstream 7685 and created 'articles_rethinking_20010215.pdf.txt'
> FILTERED: bitstream 7686 and created 'articles_relationship_20010515.pdf.txt'
> FILTERED: bitstream 7687 and created 'articles_org_capacity_20021115.pdf.txt'
> GC Warning: Out of Memory! Returning NIL!
> Exception in thread "main" java.lang.OutOfMemoryError
>     <<No stacktrace available>>
>
> Is there any limit of the file size filtering?
> Any help is highly appreciated.
>
> Best regards,
> Branko Kovacevic
>
> Records Coordinator
> Open Society Archives
> Arany Janos u. 32
> 1051 Budapest, Hungary
> phone: (36-1) 327-3266 or 327-2029
> e-mail: [EMAIL PROTECTED]
> website: www.osa.ceu.hu
> ++++++++++++++++++++++++++++

_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech