Re: My repository is not indexing PDFs, what am I missing?

Bertrand Delacretaz Mon, 26 May 2014 07:01:15 -0700

Hi Chetan,

On Thu, May 22, 2014 at 6:52 AM, Chetan Mehrotra
<[email protected]> wrote:
> ...This might be due to OAK-1462. We had to disable the
> LuceneIndexProvider from getting registered as an OSGi service...


Would that mean that the LuceneIndexEditor is still called, but the
result isn't used?

I'm asking because when adding a PDF, LuceneIndexEditor.addOrUpdate
does call context.getWriter().updateDocument with a Document that does
contain the PDF's full text in a field named :fulltext, so the text
extraction is working (thanks Alex for the tika-parsers hint).

But the query mentioned earlier in this thread still finds only .txt
documents, not .pdf.

Adding a .txt also causes LuceneIndexEditor.addOrUpdate to call
context.getWriter().updateDocument, but maybe the text is also indexed
in another way?

-Bertrand
