Re: Lucene is not able to index certain words of txt file converted form pdf

Otis Gospodnetic Wed, 18 Jun 2008 05:57:37 -0700

Hi,

Please use the java-user list for questions like this; there are more people
on it.


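You need to change the setting in IndexWriter that tells Lucene how many tokens 
from a document to index.  By default it indexes only the first 10,000, so 
words that appear later in a long document (a whole book converted from PDF, 
for example) never make it into the index and can't be found.  I can't 
remember the parameter name, but look at the IndexWriter javadocs, it's right 
there.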
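
To show why only some words go missing, here is a minimal stand-alone sketch
in plain Java.  It is not Lucene code: the class TokenLimitDemo and the method
indexTerms are made up for this example, and the tokenizing is deliberately
crude.  It keeps only the first N tokens of a text, which is roughly the
effect of the limit described above.  In the 2.x releases the setter is, as
far as I recall, IndexWriter.setMaxFieldLength(int); treat that name as
something to confirm in the javadocs for your version.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Stand-alone simulation of a token limit; it does not use Lucene itself.
// Names here (TokenLimitDemo, indexTerms) are hypothetical.
class TokenLimitDemo {

    // Collect the distinct terms among the first maxTokens tokens of text.
    // Tokens past the limit are silently dropped.
    static Set<String> indexTerms(String text, int maxTokens) {
        Set<String> terms = new LinkedHashSet<>();
        int seen = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (token.isEmpty()) continue;   // split can yield a leading empty string
            if (seen++ >= maxTokens) break;  // limit reached, ignore the rest
            terms.add(token);
        }
        return terms;
    }

    public static void main(String[] args) {
        // A "document" of 10,005 tokens: filler first, one rare word at the end.
        StringBuilder doc = new StringBuilder();
        for (int i = 0; i < 10_004; i++) doc.append("filler ");
        doc.append("needle");

        Set<String> capped = indexTerms(doc.toString(), 10_000);
        Set<String> uncapped = indexTerms(doc.toString(), Integer.MAX_VALUE);

        System.out.println("capped finds needle:   " + capped.contains("needle"));
        System.out.println("uncapped finds needle: " + uncapped.contains("needle"));
    }
}
```

With the cap at 10,000 the word at the end of the text is never seen, while
the common words near the start are.  That matches the symptom of some
searches working and others returning nothing.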

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message ----
> From: m657m <[EMAIL PROTECTED]>
> To: general@lucene.apache.org
> Sent: Wednesday, June 18, 2008 8:24:53 AM
> Subject: Lucene is not able to index certain words of txt file converted form 
> pdf
> 
> 
> Hi
> 
> I am using Lucene for indexing and searching the documents.
> I have a PDF file (Lucene_in_action.pdf) which I converted to a txt file
> using PDFBox.
> I indexed that txt file, but searching does not find certain words, even
> though Lucene returns results when I search for other words.
> I am not able to find any reason for that.
> I would appreciate it if any of you could help me find the reason.
> 
> Thanks in advance. 
> -- 
> View this message in context: 
> http://www.nabble.com/Lucene-is-not-able-to-index-certain-words-of-txt-file-converted-form-pdf-tp17981585p17981585.html
> Sent from the Lucene - General mailing list archive at Nabble.com.
