Thanks, I reindexed the documents and now it works. It seems there was an issue with text extraction. I also changed maxFieldLength, which must have helped.
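
For anyone finding this thread later, here is a rough sketch of the maxFieldLength change discussed in the quoted thread. It assumes the layout of the stock example solrconfig.xml, where the setting can appear under both indexDefaults and mainIndex. The value 50000 is purely illustrative and is not the number actually used here.

```xml
<!-- Fragment of solrconfig.xml; layout assumed from the stock example config. -->
<mainIndex>
  <!-- Maximum number of tokens indexed per field; tokens beyond this limit
       are dropped. 50000 is an illustrative value, not the one used in
       this thread. -->
  <maxFieldLength>50000</maxFieldLength>
</mainIndex>
```

Documents indexed under the old limit stay truncated until they are reindexed, so the change only shows up after posting them again.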
thanks

On 8/20/07, Pieter Berkel <[EMAIL PROTECTED]> wrote:
>
> You will probably need to increase the value of maxFieldLength in your
> solrconfig.xml. The default value is 10000 which might explain why your
> documents are not being completely indexed.
>
> Piete
>
>
> On 20/08/07, Peter Manis <[EMAIL PROTECTED]> wrote:
> >
> > The that should show some errors if something goes wrong, if not the
> > console usually will. The errors will look like a java stacktrace
> > output. Did increasing the heap do anything for you? Changing mine
> > to 256mb max worked fine for all of our files.
> >
> > On 8/20/07, Fouad Mardini <[EMAIL PROTECTED]> wrote:
> > > Well, I am using the java textmining library to extract text from
> > > documents, then i do a post to solr
> > > I do not have an error log, i only have *.request.log files in the
> > > logs directory
> > >
> > > Thanks
> > >
> > > On 8/20/07, Peter Manis <[EMAIL PROTECTED]> wrote:
> > > >
> > > > Fouad,
> > > >
> > > > I would check the error log or console for any possible errors first.
> > > > They may not show up, it really depends on how you are processing the
> > > > word document (custom solr, feeding the text to it, etc). We are
> > > > using a custom version of solr with PDF, DOC, XLS, etc text extraction
> > > > and I have successfully indexed 40mb documents. I did have indexing
> > > > problems with a large document or two and simply increasing the heap
> > > > size fixed the problem.
> > > >
> > > > - Pete
> > > >
> > > > On 8/20/07, Fouad Mardini <[EMAIL PROTECTED]> wrote:
> > > > > Hello,
> > > > >
> > > > > I am using solr to index text extracted from word documents, and it
> > > > > is working really well.
> > > > > Recently i started noticing that some documents are not indexed,
> > > > > that is i know that the word foobar is in a document, but when i
> > > > > search for foobar the id of that document is not returned.
> > > > > I suspect that this has to do with the size of the document, and
> > > > > that documents with a lot of text are not being indexed.
> > > > > Please advise.
> > > > >
> > > > > thanks,
> > > > > fmardini