Performance issues and rising numbers of "cumulative inserts"

2010-02-15 Thread Bohnsack, Sven
Hey IT-Crowd!

 

I'm dealing with some performance issues during warmup of the
queryResultCache. Normally it takes about 11 minutes (~700,000 ms), but
now it takes about 4 MILLION ms and more. All I can see in the solr.log
is that the number of cumulative_inserts climbs from ~250,000 to
~670,000.

 

I asked Google about the cumulative_inserts, but did not get an answer.
Can anyone tell me what "cumulative inserts" are and what they stand
for? What does it mean if the number of such inserts rises?

 

Greetings,

Sven



Re: Realtime search and facets with very frequent commits

2010-02-15 Thread Janne Majaranta
Hey Dipti,

Basically query optimizations + setting cache sizes to a very high level.
Other than that, the config is about the same as the out-of-the-box config
that comes with the Solr download.
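For reference, the cache sizes live in solrconfig.xml; roughly like this (the
numbers here are only illustrative, not our production values):

<queryResultCache class="solr.LRUCache"
                  size="16384"
                  initialSize="4096"
                  autowarmCount="0"/>
<filterCache class="solr.LRUCache"
             size="16384"
             initialSize="4096"
             autowarmCount="0"/>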

I haven't found a magic switch to get very fast query responses + facet
counts with the commit frequency I'm having, using one single SOLR
instance.
Adding some TOP queries for a certain type of user to the static warming
queries just moved the cost from autowarming the caches to the time it took
to warm the caches with the static queries.
I've been staging a setup where there's a small solr instance receiving all
the updates and a large instance which doesn't receive the live feed of
updates.
The small index will be merged with the large index periodically (once a
week or once a month).
The two instances are seen by the client app as one instance using the
sharding features of SOLR.
The instances are running on the same server inside their own JVM / jetty.

In this setup the caches are very HOT for the large index and queries are
extremely fast, and the small index is small enough to get extremely fast
queries without having to warm up the caches too much.
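For illustration, the same effect requested per query from SolrJ looks roughly
like this (hostnames and ports are made up; in my setup the shards parameter
lives in the coordinating instance's config, so clients never see it):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class ShardedQuery {
    public static void main(String[] args) throws Exception {
        // query the coordinating instance; it fans out to both shards
        CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrQuery q = new SolrQuery("*:*");
        // small live index + large read-mostly index, merged in one result set
        q.set("shards", "localhost:8984/solr,localhost:8985/solr");
        QueryResponse rsp = server.query(q);
        System.out.println("hits: " + rsp.getResults().getNumFound());
    }
}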

Basically I'm able to have a commit frequency of 10 seconds in a 40M docs
index while counting TOP5 facets over 14 fields in 200ms.
In reality the commit frequency of 10 seconds comes from the fact that the
updates are going into a 1M - 2M documents index, and the fast facet counts
from the fact that the 38M documents index has hot caches and doesn't
receive any updates.

Also, not running updates to the large index means that the SOLR instance
reading the large index uses about half the memory it used before when
running the updates to the large index. At least it does so on Win2k3.

-Janne


2010/2/15 dipti khullar 

> Hey Janne
>
> Can you please let me know what other optimizations you are talking about
> here. In our application we are committing about every 5 minutes, but
> still the response time is very poor and at times there are some connection
> timeouts as well.
>
> Just wanted to confirm if you have done some major configuration changes
> which have proved beneficial.
>
> Thanks
> Dipti
>
>


Re: too often delta imports performance effect

2010-02-15 Thread Nick Jenkin
Yes, the old data will show until a commit has been executed. 50
docs isn't many, so you should be fine
-Nick

On Mon, Feb 15, 2010 at 11:41 AM, adeelmahmood  wrote:
>
> thank you, that helps. Actually it's not that many updates: close to 10
> fields probably, and maybe 50 doc updates per 15 minutes. So I am assuming
> that by handling indexing and searching in parallel you mean that if it's
> updating some data, it will continue to show old data until the new data
> has been finalized (committed), or something like that?
>
>
> Jan Høydahl / Cominvent wrote:
>>
>> Hi,
>>
>> This all depends on actual volumes, HW, architecture etc.
>> What exactly is "pretty frequently", how many document updates/adds per 15
>> minutes?
>>
>> Solr is designed to be able to do indexing and search in parallel, so you
>> don't need to fear this, unless you are already pushing the limits of what
>> your setup can handle. The best way to go is to start out and then
>> optimize when you see bottlenecks.
>>
>> Here is a pointer to Wiki about indexing performance:
>> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
>>
>> --
>> Jan Høydahl  - search architect
>> Cominvent AS - www.cominvent.com
>>
>> On 14. feb. 2010, at 23.56, adeelmahmood wrote:
>>
>>>
>>> we are trying to set up solr for a website where data gets updated pretty
>>> frequently, and I want to have those changes reflected in solr indexes
>>> sooner than nightly delta-imports. So I am thinking we will probably want
>>> to set it up to have delta imports running every 15 mins or so, and solr
>>> search will obviously be in use while this is going on. First of all, does
>>> solr work well with adding new data or updating existing data while people
>>> are doing searches in it?
>>> Secondly, are these delta imports going to cause any significant
>>> performance degradation in solr search?
>>> any help is appreciated
>>>
>>
>>
>>
>
>
>


How to retrieve relevance "explain" info in code?

2010-02-15 Thread uwdanny

Hi, 

I was trying to get the detailed "explain" info in (Java) code using the
APIs; see the code below:

-
// rb is the ResponseBuilder passed into an inherited process() method
SolrIndexSearcher searcher = rb.req.getSearcher();
Query query = rb.getQuery();
Explanation expl = searcher.explain(query, docId);
-

here, the docId is a valid doc id and the query is a valid one as well
(verified in the log); however, I always get back a score of 0.0 for any match:

INFO: 0.0 = (NON-MATCH) Failure to meet condition(s) of required/prohibited
clause(s)


But if I issue the same query through the URL with debugQuery=on, the explain
section shows the breakdown of the score correctly.

Is there anything I'm missing here?

thanks,

- danny Z



Discovering Slaves

2010-02-15 Thread wojtekpia

Is there a way to 'discover' slaves using ReplicationHandler? I'm writing a
quick dashboard, and don't have access to a list of slaves, but would like
to show some stats about their health.



getting unexpected statscomponent values

2010-02-15 Thread solr-user

Has anyone encountered the following issue?

I wanted to understand the StatsComponent better, so I set up a simple test
index with a few thousand docs.  In my schema I have:
-   an indexed multivalued sint field (StatsFacetField) that can contain
values 0 through 5 that I want to use as my stats.facet field.
-   an indexed single-valued sint field (ValueOfOneField) that will always
contain the value 1 and that I want stats on for this test

When I execute the following query:

http://localhost:8080/solr/select?q=*:*&stats=true&stats.field=ValueOfOneField&stats.facet=StatsFacetField&rows=0&facet=on&facet.limit=10&facet.field=StatsFacetField

For this situation (*:*) I was expecting the StatsComponent Count/Sum
values for each possible value in StatsFacetField to match the facet counts
for StatsFacetField. They don't. Some are close (e.g. 204 vs 214) while
others are way off (e.g. 230 vs 8000).

Shouldn't the values match up? If not, why?

I am using a recent copy of 1.5.0-dev solr ($Id: CHANGES.txt 906924
2010-02-05 12:43:11Z noble $)



Re: regarding ranking

2010-02-15 Thread Ahmet Arslan
> 1) Does Solr (Lucene) consider an exact match to be something
> more important? I mean, if the query is "description:organisation", then
> which one of the following would be returned first:
>        Document A, consisting of just "description:organisation", or
> Document B, consisting of "description:bla bla ... organisation bla bla.."?
> Does it consider the length of the field text while ranking?

This is called length normalization, and it is applied by default. It favors
short documents and punishes long ones.
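For reference, Lucene's DefaultSimilarity computes this norm at index time
roughly as:

  lengthNorm(field) = 1 / sqrt(numTerms)

so a one-term field gets a norm of 1.0 while a 100-term field gets 0.1, and
the same term match scores higher in the shorter document.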

> 2) Let us assume that our query is "value0 field1:value1". So here,
> if we use OR as the default operator, it's obvious that we may get
> results dominated by "value0" with no "field1:value1" at all. We need
> some kind of mixture of "OR" and "AND" which also gives importance to
> the "number of keywords" found. So I would like to find out whether we
> can add some kind of boosting (or something similar) to achieve this.

Generally, if a document contains more query terms, it will get a higher score.
But this is not always true, since other factors are involved. For example, a
short document containing only one query term might score higher than a long
document containing two query terms.
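The part that rewards matching more query terms is Lucene's coord factor:

  coord(q, d) = (number of query terms found in d) / (total number of terms in q)

which is multiplied into the score, so with the default similarity a document
matching both terms already gets a boost over one matching a single term.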

This link can be useful:

http://lucene.apache.org/java/3_0_0/api/core/org/apache/lucene/search/Similarity.html





Re: persistent cache

2010-02-15 Thread Tom Burton-West

Hi Tim,

Due to our performance needs, we optimize the index early in the morning and
then run the cache-warming queries once we mount the optimized index on our
servers. If you are indexing and serving using the same Solr instance, you
shouldn't have to re-run the cache-warming queries when you add documents;
I believe that the disk writes caused by adding the documents to the index
should put that data in the OS cache. Actually, 1600 queries is not a lot.
If you are using actual user queries from your logs, you may need more.
We used some tools based on Luke to analyze our index and determine which
words would benefit most from being in the OS cache (assuming users enter a
phrase query containing those words). You can experiment to see how many
queries you need to fill memory by emptying the OS cache, then sending
queries and using top to watch memory usage.
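A rough SolrJ sketch of such a warming run (the file name and server URL are
made up for the example):

import java.io.BufferedReader;
import java.io.FileReader;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class CacheWarmer {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://localhost:8983/solr");
        BufferedReader in =
            new BufferedReader(new FileReader("warming-queries.txt"));
        String line;
        while ((line = in.readLine()) != null) {
            // rows=0: we only want the index data pulled into the caches,
            // not the stored fields of the results
            server.query(new SolrQuery(line).setRows(0));
        }
        in.close();
    }
}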

Your options (assuming performance with current hardware does not meet your
needs) are using SSDs, increasing memory on the machine, or splitting the
index using Solr shards. Whether you increase memory on the machine or
split the index, you will still have to run cache-warming queries.

One other thing you might consider is using stop words or CommonGrams to
reduce disk I/O requirements for phrase queries containing common words.
(Our experiments with CommonGrams and cache-warming are described in our
blog: http://www.hathitrust.org/blogs/large-scale-search )

Tom




Hi Tom,

1600 warming queries, that's quite a lot. Do you run them every time a
document is added to the index? Do you have any tips on warming?

If the index size is more than you can have in RAM, do you recommend
to split the index to several servers so it can all be in RAM?

I do expect phrase queries. Total index size is 107 GB. *prx files are
total 65GB and *frq files 38GB. It's probably worth buying more RAM.

/Tim





Re: Custom SearchComponent, only getting numFound back

2010-02-15 Thread cmose

A little more info. 

Doing some more digging, it appears that result.getDocList().size() is
returning 0, which would explain why I'm not getting any documents in my
result.

I'm not quite sure how/why that would return 0 while
result.getDocList().matches() is returning > 0.

It also looks like result.getDocListAndSet().docSet.size() is > 0... any
suggestions?


cmose wrote:
> 
> I'm attempting to write a custom SearchComponent that utilizes some custom
> filters but I'm obviously missing something key. I extend SearchComponent
> and override the prepare and process methods and then set the results on
> the result builder a la:
> 
> SolrIndexSearcher.QueryCommand cmd = rb.getQueryCommand();
> SolrIndexSearcher.QueryResult result = new
> SolrIndexSearcher.QueryResult();
> searcher.search(result, cmd);
> rb.setResult(result);
> response.add("response", rb.getResults().docList);
> 
> 
> However, when I execute a query against the handler using this component,
> I get an empty result element, e.g.:
> 
> 
> 
> I'm not quite sure where I'm falling down here and how I'm getting a > 0
> numFound yet an empty result element...
> Thanks much
> 




Re: SolrJ Beginner (NEED HELP URGENT)

2010-02-15 Thread Erick Erickson
This link will answer many of your questions:
http://wiki.apache.org/solr/SolrInstall


About JSON: See SOLR-1690 (Jira) at:
https://issues.apache.org/jira/browse/SOLR-1690
WARNING: I have no clue what the condition of this patch is,
and have never used it. But at least someone else is
thinking along the same lines

HTH
Erick

On Mon, Feb 15, 2010 at 11:20 AM, muneeb  wrote:

>
> Hey All,
>
> I have gone through the tutorial and run the SolrJ example code. It worked
> fine.
>
> I now want to implement my own full-text search engine for my documents. I
> am not sure how I should go about doing this, since in the example code I
> ran start.jar and post.jar.
>
>



> Do I have to run start.jar even for my own search engine Java file, or do I
> have to code a Java class that would start the local Solr host (i.e.
> http://localhost:8983/solr/)?
>
> Also, for my own search engine I have created a schema.xml file, but I am
> not sure where to place it in my workspace.
>
> Lastly, my documents are stored as JSON objects; should I get each JSON
> object individually, convert it to a SolrInputDocument, and then add it to
> the SolrServer?
>
> I am a beginner in Java and SolrJ, would highly appreciate any help.
>
> Schema for storing my docs (research papers):
> 
>  
>
>
> stored="true"/>
>  
> 
>
> Thanks very much,
> -Ali
>
>
>


Re: Question on Index Replication

2010-02-15 Thread Erick Erickson
Caveats:
<1> I don't know either.
<2> I think you can just fire off auto-warming queries at each SOLR
instance;
the main caching is on the server machine as far as SOLR search speed
is concerned.

But I'd really recommend thinking about just replicating the indexes, disk
space is very cheap. Probably a lot cheaper than that much RAM!
How big are your indexes?

Erick


On Mon, Feb 15, 2010 at 11:11 AM, abhishes  wrote:

>
> What you say makes perfect sense.
>
> However I can offset the risk of disk I/O and latency by having a good
> amount of RAM, say 64 GB, and a 64-bit OS.
>
> 2 caveats being that
>
> 1. I have no clue if J2EE servers can use this much RAM (64-bit OS and
> JVM).
>
> 2. I have no idea how the cache can be auto-warmed, so that users don't
> pay the penalty of loading the cache.
>
>
>
>
> Erick Erickson wrote:
> >
> > Sure, you can do that. But you're making a change that kind of defeats
> > the purpose. The underlying Lucene engine can be very disk intensive,
> > and any network latency will adversely affect the search speed. Which
> > is the point of replicating the indexes, to get them local to the SOLR/
> > Lucene instance that's using them so disk access is as fast as
> > possible.
> >
> > If you're willing to trade the search speed for saving disk space, you
> > can set things up like you want. But I'd sure run some performance
> > tests against a local as opposed to remote instance of my index
> > before making a decision...
> >
> > HTH
> > Erick
> >
> > On Mon, Feb 15, 2010 at 2:50 AM, abhishes  wrote:
> >
> >>
> >> Hello All,
> >>
> >> Upon reading the article
> >>
> >>
> >>
> http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr
> >>
> >> I have a question around index replication.
> >>
> >> If the query load is very high and I want multiple servers to be able to
> >> search the index. Can multiple servers share one read-only copy of the
> >> index?
> >>
> >> so one server (Master) builds the index and it is stored on a SAN. Then
> >> multiple Slave servers point to the same copy of the data and answer
> user
> >> queries.
> >>
> >> In the replication diagram, I see that the index is being copied on each
> >> of
> >> the Slave servers.
> >>
> >> This is not desirable because index is read-only (for the slave servers,
> >> because only master updates the index) and copying of indexes can take
> >> very
> >> long (depending on index size) and can unnecessarily waste disk space.
> >>
> >>
> >
> >
>
>
>


Re: Apache Tika / Solr Cell

2010-02-15 Thread Erick Erickson
<<Would it be best to add these to a different core?>>
Well, It Depends (tm). What do you want to accomplish? Do you want
searches to get results from both the database import AND the imported
documents? Or are these orthogonal data sets? If they are orthogonal,
then putting them in their own core is probably conceptually easiest
(there's no requirement to do this, since you could use mutually
exclusive fields, but many might find that design confusing).

<<How would I handle documents removed from the server?>>
This is trickier. *If* you know when documents are deleted, you
can delete by unique ID or by query. But that's the easy part.
Knowing which documents have been deleted is harder, and you'll have
to do that part yourself; there isn't anything in SOLR that I know
of that helps there...
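Once you do know, the Solr side is easy. A rough SolrJ sketch (the id value
and the "source" field are made up for the example):

import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class RemoveDeletedDocs {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://localhost:8983/solr");
        // delete by the unique key of the removed file...
        server.deleteById("doc-42");
        // ...or delete everything matching a query
        server.deleteByQuery("source:obsolete-folder");
        server.commit();
    }
}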

HTH
Erick

On Mon, Feb 15, 2010 at 10:39 AM, Lee Smith  wrote:

> Hey All,
>
> Hope someone can advise me on a way to go.
>
> I have my Solr setup working well. I am using DIH to handle all my
> data input.
>
> Now I need to add content from Word docs, PDFs, metadata etc. and am
> looking to use Solr Cell.
>
> A few questions regarding this. Would it be best to add these to a
> different core ?
>
> How would I handle Documents removed from the server as I would want these
> removed from index as well.
>
> Hope you can advise
>
> Thank you in advance


Re: <defaultSearchField> and DisMaxRequestHandler

2010-02-15 Thread Joe Calderon

No, but you can set a default for the qf parameter with the same value.
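For example, in solrconfig.xml ("text" here stands in for whatever field your
defaultSearchField names):

<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">text</str>
  </lst>
</requestHandler>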
On 02/15/2010 01:50 AM, Steve Radhouani wrote:

Hi there,
Can the <defaultSearchField> option be used by the DisMaxRequestHandler?
  Thanks,
-Steve

   




Custom SearchComponent, only getting numFound back

2010-02-15 Thread cmose

I'm attempting to write a custom SearchComponent that utilizes some custom
filters but I'm obviously missing something key. I extend SearchComponent
and override the prepare and process methods and then set the results on the
result builder a la:

SolrIndexSearcher.QueryCommand cmd = rb.getQueryCommand();
SolrIndexSearcher.QueryResult result = new SolrIndexSearcher.QueryResult();
searcher.search(result, cmd);
rb.setResult(result);
response.add("response", rb.getResults().docList);


However, when I execute a query against the handler using this component, I
get an empty result element, e.g.:



I'm not quite sure where I'm falling down here and how I'm getting a > 0
numFound yet an empty result element...
Thanks much



SolrJ Beginner (NEED HELP URGENT)

2010-02-15 Thread muneeb

Hey All,

I have gone through the tutorial and run the SolrJ example code. It worked fine.

I now want to implement my own full-text search engine for my documents. I
am not sure how I should go about doing this, since in the example code I ran
start.jar and post.jar.

Do I have to run start.jar even for my own search engine Java file, or do I
have to code a Java class that would start the local Solr host (i.e.
http://localhost:8983/solr/)?

Also, for my own search engine I have created a schema.xml file, but I am not
sure where to place it in my workspace.

Lastly, my documents are stored as JSON objects; should I get each JSON
object individually, convert it to a SolrInputDocument, and then add it to
the SolrServer?
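Something like this minimal sketch is what I have in mind (assuming the
org.json parser; the field names are just placeholders):

import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.json.JSONObject;

public class JsonIndexer {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://localhost:8983/solr");
        // one stored JSON object per research paper (placeholder fields)
        JSONObject paper = new JSONObject(
            "{\"id\":\"p1\",\"title\":\"Some paper\",\"abstract\":\"...\"}");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", paper.getString("id"));
        doc.addField("title", paper.getString("title"));
        doc.addField("abstract", paper.getString("abstract"));
        server.add(doc);
        server.commit();
    }
}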

I am a beginner in Java and SolrJ, would highly appreciate any help.

Schema for storing my docs (research papers):

  



  


Thanks very much,
-Ali




regarding ranking

2010-02-15 Thread Smith G
Hello All,
I know that in most cases there is no need to edit
the ranking formula, and I hope the same holds in our case. To make
sure that there is no need, I have the following queries.
1) Does Solr (Lucene) consider an exact match to be something more
important? I mean, if the query is "description:organisation", then
which one of the following would be returned first:
   Document A, consisting of just "description:organisation", or
Document B, consisting of "description:bla bla ... organisation bla
bla.."? Does it consider the length of the field text while ranking?

2) Let us assume that our query is "value0 field1:value1". So here,
if we use OR as the default operator, it's obvious that we may get
results dominated by "value0" with no "field1:value1" at all. We need
some kind of mixture of "OR" and "AND" which also gives importance to
the "number of keywords" found. So I would like to find out whether we
can add some kind of boosting (or something similar) to achieve this.

Thanks.


Re: Question on Index Replication

2010-02-15 Thread abhishes

What you say makes perfect sense.

However I can offset the risk of disk I/O and latency by having a good amount
of RAM, say 64 GB, and a 64-bit OS.

2 caveats being that

1. I have no clue if J2EE servers can use this much RAM (64-bit OS and JVM).

2. I have no idea how the cache can be auto-warmed, so that users don't
pay the penalty of loading the cache.




Erick Erickson wrote:
> 
> Sure, you can do that. But you're making a change that kind of defeats
> the purpose. The underlying Lucene engine can be very disk intensive,
> and any network latency will adversely affect the search speed. Which
> is the point of replicating the indexes, to get them local to the SOLR/
> Lucene instance that's using them so disk access is as fast as
> possible.
> 
> If you're willing to trade the search speed for saving disk space, you
> can set things up like you want. But I'd sure run some performance
> tests against a local as opposed to remote instance of my index
> before making a decision...
> 
> HTH
> Erick
> 
> On Mon, Feb 15, 2010 at 2:50 AM, abhishes  wrote:
> 
>>
>> Hello All,
>>
>> Upon reading the article
>>
>>
>> http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr
>>
>> I have a question around index replication.
>>
>> If the query load is very high and I want multiple servers to be able to
>> search the index. Can multiple servers share one read-only copy of the
>> index?
>>
>> so one server (Master) builds the index and it is stored on a SAN. Then
>> multiple Slave servers point to the same copy of the data and answer user
>> queries.
>>
>> In the replication diagram, I see that the index is being copied on each
>> of
>> the Slave servers.
>>
>> This is not desirable because index is read-only (for the slave servers,
>> because only master updates the index) and copying of indexes can take
>> very
>> long (depending on index size) and can unnecessarily waste disk space.
>>
>>
> 
> 




Apache Tika / Solr Cell

2010-02-15 Thread Lee Smith
Hey All,

Hope someone can advise me on a way to go.

I have my Solr setup working well. I am using DIH to handle all my data
input.

Now I need to add content from Word docs, PDFs, metadata etc. and am looking
to use Solr Cell.

A few questions regarding this. Would it be best to add these to a different 
core ?

How would I handle Documents removed from the server as I would want these 
removed from index as well.

Hope you can advise

Thank you in advance

Lock error when indexing with curl

2010-02-15 Thread nabil rabhi
when posting documents to solr using curl, I get the following error:

Posting file File.xml to http://localhost:8983/solr/update/



Error 500 

HTTP ERROR: 500 Lock obtain timed out:
NativeFSLock@./solr/data/index/lucene-bd553072dd77e805bcb4e83a6d8ca389-write.lock:
java.io.FileNotFoundException:
./solr/data/index/lucene-bd553072dd77e805bcb4e83a6d8ca389-write.lock
(Permission denied)

org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
NativeFSLock@./solr/data/index/lucene-bd553072dd77e805bcb4e83a6d8ca389-write.lock:
java.io.FileNotFoundException:
./solr/data/index/lucene-bd553072dd77e805bcb4e83a6d8ca389-write.lock
(Permission denied)
at org.apache.lucene.store.Lock.obtain(Lock.java:85)
at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1545)
at
org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1402)
at
org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:190)
at
org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:98)
at
org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173)
at
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:220)
at
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at
org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: java.io.FileNotFoundException:
./solr/data/index/lucene-bd553072dd77e805bcb4e83a6d8ca389-write.lock
(Permission denied)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
at
org.apache.lucene.store.NativeFSLock.obtain(NativeFSLockFactory.java:219)
at org.apache.lucene.store.Lock.obtain(Lock.java:99)
... 31 more

any ideas?


Re: schema design - catch all field question

2010-02-15 Thread adeelmahmood

I am just trying to understand the difference between the two options, to know
which one to choose.
It sounds like I probably should just merge all the data into the content
field to maximize search results.


Erick Erickson wrote:
> 
> The obvious answer is that you won't get any hits for terms
> in titles when you search the content field.
> 
> But that's not very informative. What are you trying to accomplish?
> That is, what's the high-level issue you're trying to address with
> a change like that?
> 
> Best
> Erick
> 
> On Sun, Feb 14, 2010 at 9:02 PM, adeelmahmood
> wrote:
> 
>>
>> if this is my schema
>>
>> > required="true"
>> />
>> 
>> 
>> 
>> 
>>
>> with this one being the catch all field
>> > multiValued="true"/>
>>
>> and I am copying all fields into the content field
>>
>> my question is: what if, instead of that, I change the title field to be
>> text as well and don't copy it into the content field, but still copy
>> everything else (all string fields) to the content field? Exactly what
>> difference will that make?
>>
>>
>>
> 
> 




Re: Force Solr to use special response-rules

2010-02-15 Thread Ahmet Arslan

> Now, I want to create an extra rule:
> If the query contains 9 words, I want to make sure that 6 of them
> have to occur within a document, or else it is not returned to the user.
> 

I think you are asking about DisMaxRequestHandler's mm (Minimum 'Should'
Match) parameter.

http://wiki.apache.org/solr/DisMaxRequestHandler#mm_.28Minimum_.27Should.27_Match.29
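For your 9-words-require-6 rule, something like this in the handler defaults
should come close (8<6 means: if there are more than 8 optional clauses,
require 6 of them; shorter queries require all clauses):

<str name="mm">8&lt;6</str>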


  


Re: Question on Index Replication

2010-02-15 Thread Erick Erickson
Sure, you can do that. But you're making a change that kind of defeats
the purpose. The underlying Lucene engine can be very disk intensive,
and any network latency will adversely affect the search speed. Which
is the point of replicating the indexes, to get them local to the SOLR/
Lucene instance that's using them so disk access is as fast as
possible.

If you're willing to trade the search speed for saving disk space, you
can set things up like you want. But I'd sure run some performance
tests against a local as opposed to remote instance of my index
before making a decision...

HTH
Erick

On Mon, Feb 15, 2010 at 2:50 AM, abhishes  wrote:

>
> Hello All,
>
> Upon reading the article
>
>
> http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr
>
> I have a question around index replication.
>
> If the query load is very high and I want multiple servers to be able to
> search the index. Can multiple servers share one read-only copy of the
> index?
>
> so one server (Master) builds the index and it is stored on a SAN. Then
> multiple Slave servers point to the same copy of the data and answer user
> queries.
>
> In the replication diagram, I see that the index is being copied on each of
> the Slave servers.
>
> This is not desirable because index is read-only (for the slave servers,
> because only master updates the index) and copying of indexes can take very
> long (depending on index size) and can unnecessarily waste disk space.
>
>


VelocityResponseWriter: Image References

2010-02-15 Thread Chantal Ackermann

Hi all,

Google didn't come up with any helpful hits, so I'm wondering whether
this is either too simple for me to grok, or I've got some obvious
mistake in my code.


Problem:

Images that I want to load in the velocity templates (including those
referenced in CSS/JS files) for the VelocityResponseWriter do not show
up. (CSS/JS files are loaded!)

I am using the following URL (the same as for CSS/JS files, which work
fine):

http://server:port/solr/core/admin/file?file=[path to image]&contentType=image/png



When I try that URL in my browser (Firefox or Safari on Windows), the image
does not come back correctly. Firefox states that something is wrong with
the image; Safari simply displays the [?] icon.
When I download the file (removing the contentType parameter to get the
download dialog), something is downloaded (> 0 KB) but it's in a different
format (my image viewer fails to load it).

Has anyone managed to load images that are stored in the SOLR config
directory? Or do I need to move those resources to the webapps solr
folder (I'd rather avoid that)?

Thanks!
Chantal




Re: schema design - catch all field question

2010-02-15 Thread Erick Erickson
The obvious answer is that you won't get any hits for terms
in titles when you search the content field.

But that's not very informative. What are you trying to accomplish?
That is, what's the high-level issue you're trying to address with
a change like that?

Best
Erick

On Sun, Feb 14, 2010 at 9:02 PM, adeelmahmood wrote:

>
> if this is my schema
>
>  />
> 
> 
> 
> 
>
> with this one being the catch all field
>  multiValued="true"/>
>
> and I am copying all fields into the content field
>
> my question is: what if, instead of that, I change the title field to be
> text as well and don't copy it into the content field, but still copy
> everything else (all string fields) to the content field? Exactly what
> difference will that make?
>
>
>


Re: Filtering a string containing a certain value with fq ?

2010-02-15 Thread Fredouille91



Koji Sekiguchi-2 wrote:
> 
> Fredouille91 wrote:
>> Hello,
>>
>> I have a field (named "countries") containing a comma-separated list of
>> values to which each document belongs.
>> This field looks like this: france,germany,italy
>> It means that this document is related to France, Germany and Italy.
>>
>>   
> If you can have the country field as multiValued="true" and index the
> list of countries:
> 
> france
> germany
> italy
> 
> rather than france,germany,italy, you can
> just filter by fq=country:germany.
> 
> Or you can index the value "france germany italy"
> and use WhitespaceTokenizer for the countries field; then you
> can just filter by fq=countries:germany.
> 
> Koji
> 
> -- 
> http://www.rondhuit.com/en/
> 
> 
> 

Thank you for your answer.

Your solutions are really interresting. I'll try the second one.

Fred




Re: Filtering a string containing a certain value with fq ?

2010-02-15 Thread Koji Sekiguchi

Fredouille91 wrote:

Hello,

I have a field (named "countries") containing a comma-separated list of
values to which each document belongs.
This field looks like this: france,germany,italy
It means that this document is related to France, Germany and Italy.

  

If you can have the country field as multiValued="true" and index the
list of countries:

france
germany
italy

rather than france,germany,italy, you can
just filter by fq=country:germany.

Or you can index the value "france germany italy"
and use WhitespaceTokenizer for the countries field; then you
can just filter by fq=countries:germany.
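For the second approach, the schema side would look roughly like this
(factory names as in the stock example schema; the lowercase filter is
optional):

<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="countries" type="text_ws" indexed="true" stored="true"/>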

Koji

--
http://www.rondhuit.com/en/



Re: Facet search concept problem

2010-02-15 Thread Ranveer Kumar
Hi Eric, Raju

Thanks for reply..

It means I need to index the table data (news, article and blog) separately;
currently I am joining the tables and making a single row for all three
tables.

One other thing I want to know: in this case (if I index the table data
separately), some columns are not common across the tables, so they will be
blank when indexing the other tables.
For example, news and article have a column like city which is not common and
not in the other table; if I am indexing the other table then this column
will be blank.
Is it OK to leave such a field blank?

Also, in this case the number of columns (fields) will increase, so is there
any limitation in Solr regarding the number of fields?

thanks

On Mon, Feb 15, 2010 at 2:57 PM, NarasimhaRaju  wrote:

> Hi,
> you should have a new field in your index, say 'type', which will have the
> values 'news', 'article' and 'blog' for news, article and blog documents
> respectively.
> When searching with facets enabled, make use of this 'type' field and you
> will get what you wanted.
>
> Regards,
> P.N.Raju,
>
>
>
>
>
> 
> From: Ranveer Kumar 
> To: solr-user@lucene.apache.org
> Sent: Sun, February 14, 2010 5:45:54 AM
> Subject: Facet search concept problem
>
> Hi All,
>
> My concept of facet search is still not clear.
>
> I am trying to search using facet query. I am indexing data from three
> table, following is the detail of table:
>
> table name: news
> news_id
> news_details
>
> table name : article
> article_id
> article_details
>
> table name: blog
> blog_id
> blog_details
>
> I am indexing above tables as:
> id
> news_id
> news_details
> article_id
> article_details
> blog_id
> blog_details
>
> Now I want that, when a user searches for "soccer game" and the search
> matches news (5), article (4) and blog (2) documents,
> the listing should look like:
> news(5)
> article(4)
> blog(2)
>
> currently the facet listing looks like:
> soccer(5)
> game(6)
>
> please help me..
> thanks
>
>
>
>
>


Phrase similarity - "more like this" feature for small set of terms

2010-02-15 Thread Xavier Schepler

Hi,


there is an indexed field in my Solr schema in which one phrase is
stored per document.
I have to implement a feature that will allow users to get "more like
this" results, based on the contents of this field.
I think Solr's built-in "more like this" feature requires too
many terms to be effective, but maybe that's not the case.
I would like to use a custom algorithm, probably based on the Jaccard
index.
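(That is, for the term sets A and B of two phrases:

  J(A, B) = |A intersect B| / |A union B|

so two phrases sharing 3 of their 5 distinct terms each would score 3/7.)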


I see three options :

1 - create a Solr plug-in, which would introduce a custom "More like 
this" feature. That might be tricky.


2 - the quick and dirty way : sending queries that are crafted from the 
client side. Given the phrase : "term1 term2 term3 term4", it would be 
something like that:
(term1 AND term2 AND term3) OR (term1 AND term2 AND term4) OR (term1 AND 
term3 AND term4)  OR ...
With a good list of stop words, and well thought thresholds for the 
numbers of terms, the queries should not become too long.


3 - working with a stop word list and more like this parameters


I would have time to develop a Solr plugin, but I don't know how hard
it would be.



Thanks in advance for your advice,


Xavier S.


Re: Realtime search and facets with very frequent commits

2010-02-15 Thread dipti khullar
Hey Janne

Can you please let me know what other optimizations you are talking about
here. In our application we are committing about every 5 minutes, but still
the response time is very poor and at times there are some connection
timeouts as well.

Just wanted to confirm if you have done some major configuration changes
which have proved beneficial.

Thanks
Dipti

On Fri, Feb 12, 2010 at 3:03 AM, Janne Majaranta
wrote:

> Ok,
>
> Thanks Yonik and Otis.
> I already had static warming queries with facets turned on and autowarming
> at zero.
> There were a lot of other optimizations after that however, so I'll try
> with
> zero autowarming and static warming queries again.
>
> If that doesn't work, I'll go with 3 instances on the same server.
>
> BTW, does it sound normal that, when running updates every minute to a
> 36M-document index, it takes all the available heap size after about 5
> commits, although not a single query is executed against the index and
> autowarming is set to zero? Just curious.
>
> -Janne
>
>
> 2010/2/11 Otis Gospodnetic 
>
> > Janne,
> >
> > The answers to your last 2 questions are both yes.  I've seen that done a
> > few times and it works.  I don't have the answer to the always-hot cache
> > question.
> >
> >
> > Otis
> > 
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > Hadoop ecosystem search :: http://search-hadoop.com/
> >
> >
> >
> > - Original Message 
> > > From: Janne Majaranta 
> > > To: solr-user@lucene.apache.org
> > > Sent: Thu, February 11, 2010 12:35:20 PM
> > > Subject: Realtime search and facets with very frequent commits
> > >
> > > Hello,
> > >
> > > I have a log search like application which requires indexed log events
> to
> > be
> > > searchable within a minute
> > > and uses facets and the statscomponent.
> > >
> > > Some stats:
> > > - The log events are indexed every 10 seconds with a "commitWithin" of
> 60
> > > seconds.
> > > - 1M events / day (~75% are updates to previous events).
> > > - Faceting over 14 fields ( strings ). Usually TOP5 by numdocs but
> facets
> > > for all 14 fields at the same time.
> > > - Heavy use of StatsComponent ( stats over facets of ~36M documents ).
> > >
> > >
> > > The application is running a single Solr instance. All updates and
> > queries
> > > are sent to the same instance.
> > > Faceting and the StatsComponent are both amazingly fast with that
> amount
> > of
> > > documents *when* the caches are warm.
> > >
> > > The problem I'm now facing is that keeping the caches warm is too heavy
> > > compared to the frequency of updates.
> > > It takes over 60 seconds to warmup the caches to the level where facets
> > and
> > > stats are returned in milliseconds.
> > >
> > > I have tested putting a second solr instance on the same server and
> > sending
> > > the updates to that new instance.
> > > Warming up the new small instance is very fast while the large instance
> > has
> > > very hot caches.
> > >
> > > I also put a third (empty) solr instance on the same server which
> passes
> > the
> > > queries to the two instances with the
> > > "shards" parameters. This is mainly because the client app really
> doesn't
> > > have to know anything about the shards.
> > >
> > > The setup was easy to configure and responses are back in milliseconds
> > and
> > > the updates are visible in seconds.
> > > That is, responses in milliseconds over 40M documents and a update
> > frequency
> > > of 15 seconds on a single physical server.
> > > The (lab) server has 16g RAM and it is running Win2k3.
> > >
> > > Also, what I found out is that using the sharded setup I only need half
> > the
> > > memory for the large instance.
> > > When indexing to the large instance the memory usage goes very fast up
> to
> > > the maximum allocated heap size and never goes down.
> > >
> > > My question is, is there a magic switch in SOLR to have that kind of
> > update
> > > frequency while having the caches on fire ?
> > > Or is it just impossible to achieve facet counts and queries in
> > milliseconds
> > > while updating the index every minute ?
> > >
> > > The second question is, the setup with a empty SOLR as a "coordinating"
> > > instance, a large SOLR instance with hot caches and a small SOLR
> instance
> > > with immediate updates,
> > > all on the same physical server, does it sound like a durable solution
> > > (until the small instance gets big), or is it something braindead?
> > >
> > > And the third question is, would it be a good idea to merge the small
> and
> > > the large index periodically so that a fresh and empty small instance
> > would
> > > be available
> > > after the merge ?
> > >
> > > Any ideas ?
> > >
> > > Best Regards,
> > >
> > > Janne Majaranta
> >
> >
>


Highlighting and field types

2010-02-15 Thread Jan
Hi all, 

After analysing the highlighting inconsistency [Highlighting Inconsistency
email thread], I was wondering if I should open a Jira issue. Can you advise
me whether that's a sensible thing to do?

So the issue is: 


* A query is done on a certain field (i.e. title) which is unstemmed.
* Highlighting is requested on a different field (i.e. description), which is
stemmed:
http://localhost:8983/solr/select?q=title:terminator&hl=true&hl.fl=description
* Highlighting will not produce results in many cases, as the query will use
the title field's type also on the highlighting field (so the terms must be
the same in the stemmed and unstemmed versions to produce highlighting). If
the proper description field type were used, the highlighting would be
correct at all times (this works fine:
http://localhost:8983/solr/select?q=description:terminator&hl=true&hl.fl=description).

Shouldn't the query parser use the analyzers of the field specified in
"hl.fl", on which we are about to do highlighting?

 Jan.



  

Force Solr to use special response-rules

2010-02-15 Thread MitchK

Hello community,

with the help of the sloppy phrase query [1] I can specify that queried words
have to occur within a certain number of words of each other.

Now, I want to create an extra rule:
If the query contains 9 words, I want to make sure that 6 of them occur
within a document, or else it is not returned to the user.

Does anybody know how to implement such a behaviour in Solr?

Thank you.

- Mitch

[1]
http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_search_for_one_term_near_another_term_.28say.2C_.22batman.22_and_.22movie.22.29



RE: persistent cache

2010-02-15 Thread Toke Eskildsen
From: Tim Terlegård [tim.terleg...@gmail.com]
> If the index size is more than you can have in RAM, do you recommend
> to split the index to several servers so it can all be in RAM?
>
> I do expect phrase queries. Total index size is 107 GB. *prx files are
> total 65GB and *frq files 38GB. It's probably worth buying more RAM.

Have you considered throwing one or more SSDs at the problem? The Intel X25-M G2
(or the X25-E, if your organization dictates enterprise-level hardware) is
my personal favorite right now. Compared to RAM or even high-end spinning
hard drives, they are often quite cost-effective. Most SSDs have random access
times for reads of about 0.1 ms. For us that meant that we moved the bottleneck
for a 70GB index (10 million documents) from I/O to CPU on a quad-core machine.
We tried testing an SSD vs. RAMDirectory and found the SSD to perform at about
75% of the speed for a 14GB subset of the index.

- Toke Eskildsen - http://statsbiblioteket.dk

Updating index: Replacing data directory recommended?

2010-02-15 Thread Peter Karich

Hi solr community!

Is it recommended to replace the data directory of a heavily used solr
instance?

(I am aware of the HTTP queries, but that would be too slow.)

I need a fast way to push development data to production servers.
I tried the following with success, even while the index was under load:
mv data dataOld; mv dataNew data; and then reloaded the index.
Was I just lucky, or will this always work under Linux?
And do I need to cut the production load from the solr instance?

Or is there a better way to do a fast index update or index replacement?

Kind regards,
Peter.


Re: NullPointerException in ReplicationHandler.postCommit + question about compression

2010-02-15 Thread Shalin Shekhar Mangar
On Sat, Jan 30, 2010 at 5:08 AM, Chris Hostetter
wrote:

>
> : never keep a 0.
> :
> : It is better to not mention the deletionPolicy at all. The
> : defaults are usually fine.
>
> if setting the "keep" values to 0 results in NPEs we should do one (if not
> both) of the following...
>
> 1) change the init code to warn/fail if the values are 0 (not sure if
> there is ever a legitimate use for 0 as a value)
>
> 2) change the code that's currently throwing an NPE to check its
> assumptions and log a more meaningful error if it can't function because of
> the existing config.
>
>
Setting the keep values to 0 does not result in NPEs. Setting both
maxCommitsToKeep and maxOptimizedCommitsToKeep to 0 is invalid and should be
checked for. However, this problem is different. We use the same
configuration in production and we haven't seen an NPE like that. I'm not
able to reproduce this locally too.

-- 
Regards,
Shalin Shekhar Mangar.


<defaultSearchField> and DisMaxRequestHandler

2010-02-15 Thread Steve Radhouani
Hi there,
Can the <defaultSearchField> option be used by the DisMaxRequestHandler?
 Thanks,
-Steve


Re: Facet search concept problem

2010-02-15 Thread NarasimhaRaju
Hi,
you should have a new field in your index, say 'type', which will have the
values 'news', 'article' and 'blog' for news, article and blog documents
respectively.
When searching with facets enabled, make use of this 'type' field and you will
get what you wanted.
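A rough sketch of the query side with SolrJ (the server URL is assumed):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;

public class TypeFacetSearch {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrQuery q = new SolrQuery("soccer game");
        q.setFacet(true);
        q.addFacetField("type"); // the new news/article/blog field
        QueryResponse rsp = server.query(q);
        for (FacetField.Count c : rsp.getFacetField("type").getValues()) {
            System.out.println(c.getName() + "(" + c.getCount() + ")");
        }
    }
}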

Regards, 
P.N.Raju, 






From: Ranveer Kumar 
To: solr-user@lucene.apache.org
Sent: Sun, February 14, 2010 5:45:54 AM
Subject: Facet search concept problem

Hi All,

My concept of facet search is still not clear.

I am trying to search using facet query. I am indexing data from three
table, following is the detail of table:

table name: news
news_id
news_details

table name : article
article_id
article_details

table name: blog
blog_id
blog_details

I am indexing above tables as:
id
news_id
news_details
article_id
article_details
blog_id
blog_details

Now I want that, when a user searches for "soccer game" and the search matches
news (5), article (4) and blog (2) documents,
the listing should look like:
news(5)
article(4)
blog(2)

currently the facet listing looks like:
soccer(5)
game(6)

please help me..
thanks



  

Re: persistent cache

2010-02-15 Thread Tim Terlegård
Hi Tom,

1600 warming queries, that's quite a lot. Do you run them every time a
document is added to the index? Do you have any tips on warming?

If the index size is more than you can have in RAM, do you recommend
to split the index to several servers so it can all be in RAM?

I do expect phrase queries. Total index size is 107 GB. *prx files are
total 65GB and *frq files 38GB. It's probably worth buying more RAM.

/Tim

2010/2/12 Tom Burton-West :
>
> Hi Tim,
>
> We generally run about 1600 cache-warming queries to warm up the OS disk
> cache and the Solr caches when we mount a new index.
>
> Do you have/expect phrase queries?   If you don't, then you don't need to
> get any position information into your OS disk cache.  Our position
> information takes about 85% of the total index size (*prx files).  So with a
> 100GB index, your *frq files might only be 15-20GB and you could probably
> get more than half of that in 16GB of memory.
>
> If you have limited memory and a large index, then you need to choose cache
> warming queries carefully as once the cache is full, further queries will
> start evicting older data from the cache.  The tradeoff is to populate the
> cache with data that would require the most disk access if the data was not
> in the cache versus populating the cache based on your best guess of what
> queries your users will execute.  A good overview of the issues is the paper
> by Baeza-Yates ( http://doi.acm.org/10.1145/1277741.125 The Impact of
> Caching on Search Engines )
>
>
> Tom Burton-West
> Digital Library Production Service
> University of Michigan Library
>
>


Filtering a string containing a certain value with fq ?

2010-02-15 Thread Fredouille91

Hello,

I have a field (named "countries") containing a comma-separated list of
values to which each document belongs.
This field looks like this: france,germany,italy
It means that this document is related to France, Germany and Italy.

I'm trying to add a filter to list, for example, all documents related
to Germany, but I don't know how to specify it in my fq parameter.

I tried these syntaxes without success (I'm using disMaxQuery): 
fq=countries:germany => 0 results
fq=countries:%germany% => 0 results
fq=countries:*germany* => error  '*' or '?' not allowed as first character
in WildcardQuery
fq=countries:germany* => 0 results but it seems that it lists only the
fields beginning with germany

What is the correct syntax ?

Thanks for your help,

Regards,

Fred




spellcheck all time

2010-02-15 Thread michaelnazaruk

I have a little problem with spellcheck: I get suggestions all the time, even
when the word is correct! I use a dictionary from a file. Here is my
configuration:
 

 
<requestHandler name="/spellcheck" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">file</str>
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">1</str>
    <str name="spellcheck.collate">false</str>
  </lst>
  <arr name="components">
    <str>query</str>
    <str>spellcheck</str>
  </arr>
</requestHandler>

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="classname">solr.FileBasedSpellChecker</str>
    <str name="name">file</str>
    <str name="sourceLocation">spellings.txt</str>
    <str name="characterEncoding">UTF-8</str>
    <str name="spellcheckIndexDir">./spellcheckerFile</str>
  </lst>
</searchComponent>
Can anyone help me?