[ 
https://issues.apache.org/jira/browse/CONNECTORS-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16944414#comment-16944414
 ] 

Donald Van den Driessche commented on CONNECTORS-1625:
------------------------------------------------------

We are running this PDF as the one and only document.

This is ManifoldCF 2.12. We tried parsing the file locally with Tika 1.18 
and 1.22, and both succeeded.

We've set the heap space to 3G and to 5G, and the same issue still occurs.

I've now read somewhere that disk space might be used. But since the file is 
only 21MB, I don't see how much disk space could be needed.

 

> When processing a specific PDF Manifold goes out of memory
> ----------------------------------------------------------
>
>                 Key: CONNECTORS-1625
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1625
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Tika extractor
>    Affects Versions: ManifoldCF 2.12
>            Reporter: Donald Van den Driessche
>            Assignee: Karl Wright
>            Priority: Major
>         Attachments: abd-serotec-antibodies-uk.pdf
>
>
> When processing the attached file with ManifoldCF 2.12, we keep getting an
> out-of-memory error.
> When just parsing it through Tika 1.18, no issues are found.
> Can anyone look into it?
> Thanks in advance!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
