Re: Indexing large documents

Fouad Mardini Mon, 20 Aug 2007 03:54:35 -0700

Well, I am using the Java textmining library to extract text from the
documents, then I POST the extracted text to Solr.
I do not have an error log; I only have *.request.log files in the logs
directory.
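
A minimal sketch of this kind of posting step, for illustration only. It
assumes Solr's XML update handler, a hypothetical update URL such as
http://localhost:8983/solr/update, and hypothetical field names "id" and
"text"; none of these are taken from the actual setup described here. The
point of interest is that the HTTP status code is returned to the caller, so
a rejected document is visible on the client side even when the server has
no error log.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class SolrPostSketch {

    // Escape characters that must not appear raw inside XML text or attributes.
    static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Build an <add> message for one document.
    // The field names "id" and "text" are assumptions; use the ones in your schema.
    static String buildAddXml(String id, String text) {
        return "<add><doc>"
            + "<field name=\"id\">" + escapeXml(id) + "</field>"
            + "<field name=\"text\">" + escapeXml(text) + "</field>"
            + "</doc></add>";
    }

    // POST the XML to the given update URL and return the HTTP status code,
    // so the caller can detect and log documents that were not accepted.
    static int post(String updateUrl, String xml) throws Exception {
        HttpURLConnection conn =
            (HttpURLConnection) new URL(updateUrl).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode();
    }

    public static void main(String[] args) {
        // Only builds and prints the message; no network call is made here.
        System.out.println(buildAddXml("doc-1", "text extracted from a Word file"));
    }
}
```

With something like this, logging every document id for which post() returns
anything other than 200 would give a client-side list of the documents that
failed. Added documents only become searchable after a commit message
(<commit/>) is posted to the same URL.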


Thanks

On 8/20/07, Peter Manis <[EMAIL PROTECTED]> wrote:
>
> Fouad,
>
> I would check the error log or console for any possible errors first.
> They may not show up; it really depends on how you are processing the
> Word document (custom Solr, feeding the text to it, etc.).  We are
> using a custom version of Solr with PDF, DOC, XLS, etc. text extraction
> and I have successfully indexed 40 MB documents.  I did have indexing
> problems with a large document or two, and simply increasing the heap
> size fixed the problem.
>
> - Pete
>
> On 8/20/07, Fouad Mardini <[EMAIL PROTECTED]> wrote:
> > Hello,
> >
> > I am using Solr to index text extracted from Word documents, and it is
> > working really well.
> > Recently I started noticing that some documents are not indexed; that
> > is, I know that the word foobar is in a document, but when I search for
> > foobar, the id of that document is not returned.
> > I suspect that this has to do with the size of the document, and that
> > documents with a lot of text are not being indexed.
> > Please advise.
> >
> > thanks,
> > fmardini
> >
>
