[ http://jira.nuxeo.org/browse/NXP-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_26276 ]
Olivier Grisel commented on NXP-1556:
-------------------------------------
Here is the logged stacktrace:
12:05:54,856 ERROR [PDFBoxPluginImpl] An error occured while trying transform pdf to text...
java.io.IOException: You do not have permission to extract text
    at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:189)
    at org.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:140)
    at org.nuxeo.ecm.platform.transform.plugin.pdfbox.impl.PDFBoxPluginImpl.transformOne(PDFBoxPluginImpl.java:108)
    at org.nuxeo.ecm.platform.transform.plugin.pdfbox.impl.PDFBoxPluginImpl.transform(PDFBoxPluginImpl.java:84)
    at org.nuxeo.ecm.platform.transform.transformer.AbstractTransformer.transform(AbstractTransformer.java:174)
    at org.nuxeo.ecm.platform.transform.service.TransformService.transform(TransformService.java:157)
    at org.nuxeo.ecm.platform.transform.service.TransformService.transform(TransformService.java:174)
    at org.nuxeo.ecm.core.search.blobs.NXTransformBlobExtractor.extract(NXTransformBlobExtractor.java:84)
    at org.nuxeo.ecm.core.search.api.backend.indexing.resources.factory.ResolvedResourcesFactory.blobToText(ResolvedResourcesFactory.java:197)
    at org.nuxeo.ecm.core.search.api.backend.indexing.resources.factory.ResolvedResourcesFactory.convertForFullText(ResolvedResourcesFactory.java:163)
    at org.nuxeo.ecm.core.search.api.backend.indexing.resources.factory.ResolvedResourcesFactory.extractForFullText(ResolvedResourcesFactory.java:153)
    at org.nuxeo.ecm.core.search.api.backend.indexing.resources.factory.ResolvedResourcesFactory.computeFulltext(ResolvedResourcesFactory.java:117)
    at org.nuxeo.ecm.core.search.api.backend.indexing.resources.factory.ResolvedResourcesFactory.computeAggregatedResolvedResourcesFrom(ResolvedResourcesFactory.java:254)
    at org.nuxeo.ecm.core.search.service.SearchServiceImpl.index(SearchServiceImpl.java:244)
    at org.nuxeo.ecm.core.search.threading.IndexingTask.run(IndexingTask.java:61)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
    at java.lang.Thread.run(Thread.java:595)
12:05:55,103 ERROR [STDERR] java.lang.Throwable: Warning: You did not close the PDF Document
12:05:55,104 ERROR [STDERR] at org.pdfbox.cos.COSDocument.finalize(COSDocument.java:420)
12:05:55,104 ERROR [STDERR] at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
12:05:55,104 ERROR [STDERR] at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:83)
12:05:55,104 ERROR [STDERR] at java.lang.ref.Finalizer.access$100(Finalizer.java:14)
12:05:55,104 ERROR [STDERR] at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:160)
> pdfbox error logged while indexing some PDF files
> -------------------------------------------------
>
> Key: NXP-1556
> URL: http://jira.nuxeo.org/browse/NXP-1556
> Project: Nuxeo Enterprise Platform 5
> Issue Type: Bug
> Components: Transforms
> Affects Versions: 5.1.0.GA
> Reporter: Olivier Grisel
> Assignee: Laurent Godard
> Fix For: 5.1.1, 5.2 M1
>
> Original Estimate: 1 day
> Remaining Estimate: 1 day
>
> An error-level stacktrace is logged while indexing certain kinds of PDF files
> (not those generated by OOo, though). See the attached sample file and the
> stacktrace in the comment. This might be a pdfbox bug or a problem with the
> way we handle pdfbox exceptions.
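
If the problem turns out to be on the exception-handling side, one possible shape for the fix is sketched below. This is only an illustration and not the actual PDFBoxPluginImpl code: the Document interface, the ProtectedDocument class and the extractOrEmpty method are hypothetical stand-ins that do not exist in pdfbox or Nuxeo. The sketch assumes that a permission failure should be logged as a single warning line rather than an error-level stacktrace, and that the document must always be closed so the "You did not close the PDF Document" finalizer warning is not triggered.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class SafeTextExtraction {

    private static final Logger LOG =
            Logger.getLogger(SafeTextExtraction.class.getName());

    /** Hypothetical stand-in for a loaded PDF document; NOT the pdfbox API. */
    public interface Document extends Closeable {
        String extractText() throws IOException;
    }

    /**
     * Returns the extracted text, or an empty string when extraction fails.
     * The document is closed in every case.
     */
    public static String extractOrEmpty(Document doc) {
        try {
            return doc.extractText();
        } catch (IOException e) {
            // Assumed policy: a protected file is an expected condition,
            // so log one warning line without the stack trace.
            LOG.log(Level.WARNING,
                    "Skipping full-text extraction: " + e.getMessage());
            return "";
        } finally {
            try {
                doc.close();
            } catch (IOException e) {
                LOG.log(Level.FINE, "Error while closing document", e);
            }
        }
    }

    /** Hypothetical document that fails like the PDF in the stacktrace. */
    public static final class ProtectedDocument implements Document {
        public boolean closed = false;

        public String extractText() throws IOException {
            throw new IOException("You do not have permission to extract text");
        }

        public void close() {
            closed = true;
        }
    }

    public static void main(String[] args) {
        ProtectedDocument doc = new ProtectedDocument();
        String text = extractOrEmpty(doc);
        System.out.println("text='" + text + "' closed=" + doc.closed);
    }
}
```

With this shape, indexing continues with empty full text for the affected file, and the close happens in the finally block whether or not extraction succeeded. Whether an empty string is the right fallback for the indexer is a design choice left open here.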