[ 
https://issues.apache.org/jira/browse/SOLR-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135582#comment-13135582
 ] 

Steven Rowe commented on SOLR-1786:
-----------------------------------

Solr Cell upgraded to Tika 0.8, which included PDFbox 1.1.0, in the Solr 3.1 
release.

The Solr 3.5 release will include Tika 0.10, which includes PDFbox 1.6.0.

These PDFBox upgrades have likely addressed this problem.

Jan, can you test Solr 3.1+ to confirm?
                
> Solr (trunk rev. 912116) suffers from PDFBOX-537 [Endless loop in 
> org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary()]  fixed in PDFbox 
> 1.0?
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-1786
>                 URL: https://issues.apache.org/jira/browse/SOLR-1786
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - Solr Cell (Tika extraction)
>    Affects Versions: 1.5
>         Environment: Ubuntu 9.10, 32bit
>            Reporter: Jan Iwaszkiewicz
>            Priority: Critical
>              Labels: PDFbox
>             Fix For: 3.5, 4.0
>
>
> I tried indexing several thousand PDF documents but could not finish as Solr 
> was falling into an endless loop for some of them, for instance: 
> http://cdsweb.cern.ch/record/702585/files/sl-note-2000-019.pdf (the PDF seems 
> OK).
> Can Solr start using PDFbox 1.0?
