Hi,
While activating a page that contains PDFs in Magnolia 4.3.1, a
NullPointerException was thrown with the message "Failed to extract PDF text
content". The page was activated properly and the PDF is not corrupt, neither
on the author instance nor on the public instance, so why is this exception
thrown?
The page was imported from Magnolia 4.1. During the import, the same error
was thrown, but the page was imported successfully and the PDF is not corrupt.
The error also occurs when a PDF is uploaded.
It was not thrown in Magnolia 4.1.
Regards
Inge
1) : Failed to extract PDF text content
java.lang.NullPointerException
at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:194)
at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:182)
at org.pdfbox.pdmodel.PDDocumentCatalog.getAllPages(PDDocumentCatalog.java:226)
at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:216)
at org.apache.jackrabbit.extractor.PdfTextExtractor.extractText(PdfTextExtractor.java:75)
at org.apache.jackrabbit.extractor.CompositeTextExtractor.extractText(CompositeTextExtractor.java:90)
at org.apache.jackrabbit.core.query.lucene.JackrabbitTextExtractor.extractText(JackrabbitTextExtractor.java:195)
at org.apache.jackrabbit.core.query.lucene.TextExtractorJob$1.call(TextExtractorJob.java:93)
at EDU.oswego.cs.dl.util.concurrent.FutureResult$1.run(Unknown Source)
at org.apache.jackrabbit.core.query.lucene.TextExtractorJob.run(TextExtractorJob.java:172)
at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Thread.java:619)
2010-06-18 09:48:04,990 WARN org.apache.jackrabbit.extractor.PdfTextExtractor : Failed to extract PDF text content
2010-06-18 09:48:05,006 INFO info.magnolia.module.exchangesimple.ReceiveFilter : User superuser successfuly activated news on magnoliaPublic.
I found a posting about this error that suggests the following solution:
"I replaced the version of pdfbox (0.6.4) that is bundled with the jackrabbit
war file with a more recent version (0.7.3 and fontbox 01.) and it worked
fine. The bundled versions should be upgraded."
However, after checking Magnolia's WEB-INF/lib directory, I found that
Magnolia 4.3.1 already includes pdfbox-0.7.3.jar and fontbox-0.1.0.jar.
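To rule out that an older pdfbox copy elsewhere on the classpath is picked up
before the one in WEB-INF/lib, a check along the following lines might help.
It is only a sketch: the class name JarLocator and the method locate are made
up for illustration, and the code would have to run inside the Magnolia
webapp (for example from a JSP) so that it sees the same class loader as the
text extractor.

```java
import java.security.CodeSource;

public class JarLocator {
    // Returns the URL a class was loaded from, or a short note if it cannot be determined.
    public static String locate(String className) {
        try {
            Class<?> c = Class.forName(className);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                // Classes loaded by the bootstrap loader have no code source.
                return "no code source (bootstrap class?)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "class not found";
        }
    }

    public static void main(String[] args) {
        // PDPageNode is the class at the top of the stack trace above.
        System.out.println(locate("org.pdfbox.pdmodel.PDPageNode"));
    }
}
```

If the printed location is not the pdfbox-0.7.3.jar in WEB-INF/lib, another
copy of the library is being loaded first.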
Regards
Inge