Question on solr 1.4 Replication

2009-07-15 Thread Gurjot Singh
Hi,
I am using the data import handler to do full and delta imports. I want to use
the replication feature of Solr 1.4.

To do that, I would like to understand two scenarios:

1. What happens when the slave Solr server polls the master while a delta
import is running on the master? Does the slave copy only the files that have
changed so far in the master index, or does it wait for the delta import to
finish and pick up the changes on the next poll?

2. What happens when the slave is replicating changes from the master index
and a delta import starts on the master at the same time? Will the slave still
get the changes made by the previous delta import?

Also, I am not clear about the replicateAfter replication setting on the
master. Does it mean that the slave can replicate changes on its next poll only
after a commit/optimize has been done on the master index?
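
For context, the setting in question lives in the replication request handler
in solrconfig.xml. The fragments below are only an illustrative sketch of a
typical Solr 1.4 master/slave setup, not the actual configuration: the host
name, port, poll interval and the list of configuration files are all assumed
placeholders.

```xml
<!-- Master solrconfig.xml (sketch; values are placeholders) -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <!-- Events after which a new index version is made available to slaves -->
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">optimize</str>
    <!-- Optional: configuration files to ship along with the index -->
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

<!-- Slave solrconfig.xml (sketch; host and interval are assumptions) -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master_host:8983/solr/replication</str>
    <!-- Poll interval in HH:mm:ss -->
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```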

Thanks
Gurjot


Improve indexing time

2009-07-13 Thread Gurjot Singh
Hi,
We have a Solr index of 626 MB containing 141810 documents. We have
configured an index-based spellchecker with the buildOnCommit option set to
true. The spellcheck index is 8.67 MB.
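
For context, an index-based spellchecker with buildOnCommit enabled is usually
declared in solrconfig.xml along the following lines. This is an illustrative
sketch rather than the actual configuration: the source field name `spell` and
the index directory are assumptions.

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <!-- Hypothetical field that the spellcheck dictionary is built from -->
    <str name="field">spell</str>
    <!-- Assumed location of the separate spellcheck index -->
    <str name="spellcheckIndexDir">./spellchecker</str>
    <!-- Rebuild the spellcheck index on every commit -->
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```

With this option the spellcheck index is rebuilt on each commit, which includes
the commit at the end of every delta import.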

We use the data import handler to create the index from scratch and to update
it periodically. We have scheduled a full import once a week and a delta
import every 20 minutes. The full import takes about 38 minutes and the delta
import about 12 minutes. The index also serves search queries (even while a
delta import is running). On average, 25 to 30 documents change between delta
imports.

Is there a way to reduce the time the delta import takes to update the
index?

The system specs are:
MS Windows Server 2003 R2
Standard x64 Edition
8 GB RAM.
Solr is set up on Tomcat 6.0

CPU utilization of tomcat.exe during a delta import is 60%.

In the data-config.xml file there are 6 root entities for 6 database tables
under the document element. The first root entity gets the rows from
table1, the second root entity gets the rows from table2, and so on. The root
entities have several child entities that fetch fields from associated
tables.
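
The layout described above can be sketched roughly as follows. This is only an
illustration of the structure, not the real file: the JDBC driver, URL,
credentials, and all table and column names (id, title, modified,
table1_detail) are hypothetical.

```xml
<dataConfig>
  <!-- Driver, URL and credentials are placeholders -->
  <dataSource driver="com.example.Driver" url="jdbc:example://dbhost/db"
              user="solr" password="secret"/>
  <document>
    <!-- One root entity per database table -->
    <entity name="table1" pk="id"
            query="SELECT id, title, modified FROM table1"
            deltaQuery="SELECT id FROM table1
                        WHERE modified &gt; '${dataimporter.last_index_time}'"
            deltaImportQuery="SELECT id, title, modified FROM table1
                              WHERE id = '${dataimporter.delta.id}'">
      <!-- Child entity: runs one query per parent row -->
      <entity name="table1_detail"
              query="SELECT detail FROM table1_detail
                     WHERE table1_id = '${table1.id}'"/>
    </entity>
    <!-- Root entities for table2 through table6 follow the same pattern -->
  </document>
</dataConfig>
```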

The mergeFactor is set to 10 and ramBufferSizeMB is set to 32. The cache
settings are as follows:

<filterCache class="solr.LRUCache" size="16384" initialSize="4096"
autowarmCount="4096"/>
<queryResultCache class="solr.LRUCache" size="16384" initialSize="4096"
autowarmCount="4096"/>
<documentCache class="solr.LRUCache" size="16384" initialSize="16384"
autowarmCount="0"/>
<enableLazyFieldLoading>true</enableLazyFieldLoading>

Is it advisable to use a master/slave configuration? Does an index size of
626 MB justify moving from the existing single Solr core (which runs a delta
import every 20 minutes and also serves search queries) to a master/slave
configuration, considering that the index will keep growing over time?

Is there any other way to improve the indexing time?

Thanks,
Gurjot



Monitor search traffic

2009-07-01 Thread Gurjot Singh
Hi,
Is there a way to monitor the number of search queries made against the Solr
index?

Thanks
Gurjot


Solr Document Sort

2009-05-18 Thread Gurjot Singh
Hi,
In the Solr schema file I have an integer field named 'ContentType', defined
as follows:

<field name="ContentType" type="int" indexed="true" stored="true"/>

The values of this field can be one of the following:

1 (for News), 2 (for Reviews), 3 (for Opinion), 4 (for Blogs)

I have a scenario in which, when a user runs a search, the results should be
sorted by best match (i.e. from the highest relevancy score to the lowest). At
the same time, I want documents whose ContentType is Blogs to appear at the
bottom of the results, below the documents whose ContentType is News, Reviews
or Opinion.

I am currently doing this by sorting first on the ContentType field and then
by score, as follows:

sort=ContentType asc,score desc

Is there a better way to do this?

Thanks
Gurjot


Solr 1.4 Release Date

2009-04-27 Thread Gurjot Singh
Hi, I am curious to know the scheduled/tentative release date of Solr 1.4.

Thanks,
Gurjot