Hi,

While activating a page that includes PDFs in Magnolia 4.3.1, a
NullPointerException was thrown with the message "Failed to extract PDF text
content". The page was activated properly and the PDF is not corrupt, neither
on the author instance nor on the public instance, so why is this exception
thrown?

The page was imported from Magnolia 4.1. During the import, the same error
was thrown, but the page was imported successfully and the PDF is not
corrupt.

This error also occurs when a PDF is uploaded.

This error was not thrown in Magnolia 4.1.


Regards

Inge

1)       : Failed to extract PDF text content
java.lang.NullPointerException
        at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:194)
        at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:182)
        at org.pdfbox.pdmodel.PDDocumentCatalog.getAllPages(PDDocumentCatalog.java:226)
        at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:216)
        at org.apache.jackrabbit.extractor.PdfTextExtractor.extractText(PdfTextExtractor.java:75)
        at org.apache.jackrabbit.extractor.CompositeTextExtractor.extractText(CompositeTextExtractor.java:90)
        at org.apache.jackrabbit.core.query.lucene.JackrabbitTextExtractor.extractText(JackrabbitTextExtractor.java:195)
        at org.apache.jackrabbit.core.query.lucene.TextExtractorJob$1.call(TextExtractorJob.java:93)
        at EDU.oswego.cs.dl.util.concurrent.FutureResult$1.run(Unknown Source)
        at org.apache.jackrabbit.core.query.lucene.TextExtractorJob.run(TextExtractorJob.java:172)
        at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:619)
2010-06-18 09:48:04,990 WARN  org.apache.jackrabbit.extractor.PdfTextExtractor : Failed to extract PDF text content
2010-06-18 09:48:05,006 INFO  info.magnolia.module.exchangesimple.ReceiveFilter : User superuser successfuly activated news on magnoliaPublic.


I found a posting concerning this error with the following solution:

"I replaced the version of pdfbox (0.6.4) that is bundled with the jackrabbit 
war file with
a more recent version (0.7.3 and fontbox 01.) and it worked fine. The bundled 
versions should
be upgraded."

But after checking Magnolia's WEB-INF/lib, I found that Magnolia 4.3.1
already includes pdfbox-0.7.3.jar and fontbox-0.1.0.jar.
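
Having the jar in WEB-INF/lib does not by itself prove that this is the copy
the text extractor uses at runtime. A minimal sketch for checking where a class
was actually loaded from is below. The class name JarLocator and the method
locate are made up for illustration; only the PDFBox class name comes from the
stack trace above. It only gives a meaningful answer when run inside the web
application (for example from a JSP), and whether a second pdfbox copy exists
elsewhere on the classpath is just a guess on my side.

```java
import java.security.CodeSource;

// Hypothetical helper, not part of Magnolia, Jackrabbit or PDFBox.
public class JarLocator {

    /** Returns the location a class was loaded from, or a short note if unknown. */
    public static String locate(String className) {
        try {
            Class<?> cls = Class.forName(className);
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Classes from the bootstrap loader (e.g. java.lang.String) have no code source.
            if (source == null || source.getLocation() == null) {
                return "(bootstrap or unknown location)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(class not found)";
        }
    }

    public static void main(String[] args) {
        // Class name taken from the first frame of the stack trace.
        System.out.println(locate("org.pdfbox.pdmodel.PDPageNode"));
    }
}
```

If the printed location is not the pdfbox-0.7.3.jar in WEB-INF/lib, an older
copy is being picked up first.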


Regards

Inge

