Hello,
I am using solr to index text extracted from word documents, and it is
working really well.
Recently I started noticing that some documents are not indexed; that is, I
know that the word foobar is in a document, but when I search for foobar the
id of that document is not returned.
I suspect
Hi
I want to know how to update my .xml file which has fields other than
the default ones. Which file do I have to modify, and how?
Praveen Jain
+919890599250
-Original Message-
From: Fouad Mardini [mailto:[EMAIL PROTECTED]
Sent: Monday, August 20, 2007 4:00 PM
To:
Fouad,
I would check the error log or console for any possible errors first.
They may not show up; it really depends on how you are processing the
Word document (custom Solr, feeding the text to it, etc.). We are
using a custom version of Solr with PDF, DOC, XLS, etc. text extraction
and I have
Well, I am using the Java TextMining library to extract text from documents,
then I do a post to Solr.
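(For anyone reading along: such a post is an add message to Solr's update handler. The field names below are illustrative assumptions, not Fouad's actual schema.)

```xml
<add>
  <doc>
    <field name="id">doc1</field>
    <field name="content">text extracted by the TextMining library</field>
  </doc>
</add>
```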
I do not have an error log; I only have *.request.log files in the logs
directory.
Thanks
On 8/20/07, Peter Manis [EMAIL PROTECTED] wrote:
Fouad,
I would check the error log or console for
The error log should show some errors if something goes wrong; if not, the
console usually will. The errors will look like a Java stacktrace
output. Did increasing the heap do anything for you? Changing mine
to 256mb max worked fine for all of our files.
On 8/20/07, Fouad Mardini [EMAIL PROTECTED]
You will probably need to increase the value of maxFieldLength in your
solrconfig.xml. The default value is 10,000 tokens, which might explain why your
documents are not being completely indexed.
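For reference, the setting lives in the indexDefaults section of solrconfig.xml; the 100000 below is only an example value, not a recommendation:

```xml
<indexDefaults>
  <!-- maximum number of tokens indexed per field (example value) -->
  <maxFieldLength>100000</maxFieldLength>
</indexDefaults>
```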
Piete
On 20/08/07, Peter Manis [EMAIL PROTECTED] wrote:
The error log should show some errors if something
Yes - they come back in the order indexed.
Erik
On Aug 19, 2007, at 7:20 PM, Yu-Hui Jin wrote:
BTW, Hoss, is there a default order for the documents returned by
running
this query?
thanks,
-Hui
On 8/16/07, Chris Hostetter [EMAIL PROTECTED] wrote:
: Any of you know whether
Hello !
At last, I've had the opportunity to test your solution, Pieter, which was to
use a dynamic field:
<dynamicField name="page*" type="text" indexed="true" stored="true" />
Store each page in a separate field (e.g. page1, page2, page3 .. pageN) then
at query time, use the highlighting
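At query time that might look something like the following sketch (host, port, and the specific page fields queried are assumptions):

```text
http://localhost:8983/solr/select?q=page1:foobar+OR+page2:foobar&hl=true&hl.fl=page1,page2
```

The highlighting response then shows which page field produced the match.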
Hi!
I have utf-8 encoded data inside a csv file (actually it’s a tab separated file
- attached)
I can index it with no apparent errors
I did not forget to set this in my tomcat configuration
<Server ...>
  <Service ...>
    <Connector ... URIEncoding="UTF-8"/>
When I query a document
Hi,
I am trying to do some counting on certain fields of the search results.
Currently I am using PHP to do the counting, but it is impossible to do this
when the result sets reach a few hundred thousand. Does anyone here have
any idea on how to do this?
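One common approach (an assumption on my part, since the thread is cut off) is to let Solr do the counting server-side with faceting instead of paging whole result sets into PHP; the field name below is illustrative:

```text
http://localhost:8983/solr/select?q=foobar&rows=0&facet=true&facet.field=category
```

With rows=0 no documents are returned, only the per-value counts for the faceted field.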
Example of scenario,
1. The solr
While we're on the topic, there appear to be a ton of new features in 1.3,
and they are getting debugged. When do you plan to do an official 1.3
release?
-Original Message-
From: Yu-Hui Jin [mailto:[EMAIL PROTECTED]
Sent: Friday, August 17, 2007 11:53 PM
To: solr-user@lucene.apache.org
: TermEnum terms = searcher.getReader().terms(new Term(field, ""));
: while (terms.term() != null && terms.term().field() == field) {
:   //do things
:   terms.next();
: }
: while( te.next() ) {
: final Term term = te.term();
you're missing the key piece
: Sort sort = new Sort(new SortField[]
: { SortField.FIELD_SCORE, new SortField(customValue, SortField.FLOAT,
: true) });
: indexSearcher.search(q, sort)
that appears to just be a sort on score with a secondary reversed
float sort on whatever field name is in the variable customValue
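In Solr query terms, a rough equivalent of that Sort object would be something like the following, where myFloatField stands in for whatever field name customValue holds (an assumption for illustration):

```text
select?q=foobar&sort=score desc,myFloatField desc
```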
No, this is about the Carrot2 clustering tool, specifically the Swing
application.
To make this app use a Solr service you have to code a custom searcher for
your Solr.
I'm requesting a generic UI for Carrot2 that works against any Solr.
-Original Message-
From: Mike Klaas [mailto:[EMAIL
what is the best approach to clearing an index?
The use case is that I'm doing some performance testing with various
index sizes. In between indexing (embedded and soon HTTP/XML) I need to
clear the index so I have a fresh start.
What's the best approach, close the index and delete the files?
If you are using solr 1.2 the following command (followed by a commit /
optimize) should do the trick:
<delete><query>*:*</query></delete>
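Over HTTP, each message goes to the update handler as a separate post; a sketch assuming the default endpoint:

```text
POST http://localhost:8983/solr/update   (Content-Type: text/xml)
<delete><query>*:*</query></delete>

POST http://localhost:8983/solr/update   (Content-Type: text/xml)
<commit/>
```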
cheers,
Piete
On 21/08/07, Sundling, Paul [EMAIL PROTECTED] wrote:
what is the best approach to clearing an index?
The use case is that I'm doing some
How long should a commit take? I've got about 9.8G of data for 9M of
records. (Yes, I'm indexing too much data.) My commits are taking 20-30
seconds. Since other people set the autocommit to 1 second, I'm guessing we
have a major mistake somewhere in our configurations.
We have a lot of
IIRC you can also simply stop the servlet container, delete the
contents of the data directory by hand, then restart the container.
-Charlie
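That by-hand wipe can be sketched as follows, using a throwaway directory to stand in for the Solr home (the path is an assumption; on a real install it is wherever your data directory lives, and the container must be stopped first):

```shell
# stand-in for a real Solr home (illustrative path)
SOLR_HOME=/tmp/solr-demo-home
mkdir -p "$SOLR_HOME/data/index"
touch "$SOLR_HOME/data/index/segments_1"

# with the servlet container stopped, remove the index contents by hand
rm -rf "$SOLR_HOME/data/index"

# Solr recreates a fresh, empty index on restart; data/ itself remains
ls "$SOLR_HOME/data"
```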
On 8/20/07, Pieter Berkel [EMAIL PROTECTED] wrote:
If you are using solr 1.2 the following command (followed by a commit /
optimize) should do the