Exact match

2008-07-28 Thread Sunil
Hi, I am sending a request to solr for exact match. Example: (title:(Web 2.0 OR Social Networking) OR description: (Web 2.0 OR Social Networking)) But in the results I am getting stories matching Social, Web etc. Please let me know what's going wrong. Thanks, Sunil

Re: Exact match

2008-07-28 Thread Erik Hatcher
Look at what Solr returns when adding debugQuery=true for the parsed query, and also consider how your fields are analyzed (their associated type, etc). Erik On Jul 28, 2008, at 4:56 AM, Sunil wrote: Hi, I am sending a request to solr for exact match. Example: (title:(Web 2.0
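A minimal sketch of Erik's suggestion: build a request URL that carries debugQuery=true, so the response includes the parsed query. The base URL and the class and method names here are assumptions for illustration; only the debugQuery=true parameter comes from the thread.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class DebugQueryUrl {
    // Hypothetical base URL; adjust host, port, and path to your own install.
    static final String BASE = "http://localhost:8983/solr/select";

    // URL-encode the query string and append debugQuery=true.
    static String build(String query) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return BASE + "?q=" + q + "&debugQuery=true";
    }

    public static void main(String[] args) {
        System.out.println(build("title:(Web 2.0 OR Social Networking)"));
    }
}
```

Comparing the parsed query in the debug output with the query that was sent shows how the field's analyzer tokenized each term.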

RE: Exact match

2008-07-28 Thread Sunil
Both the fields are text type: <field name="description" type="text" indexed="true" stored="true" required="false"/> <field name="title" type="text" indexed="true" stored="true" required="false"/> How will debugQuery=true help? I am not familiar with the output. Thanks, Sunil -----Original Message----- From: Erik

Re: Exact match

2008-07-28 Thread Erik Hatcher
On Jul 28, 2008, at 5:31 AM, Sunil wrote: Both the fields are text type: <field name="description" type="text" indexed="true" stored="true" required="false"/> <field name="title" type="text" indexed="true" stored="true" required="false"/> The definition of the field type is important - perhaps it is stripping

nested data structure definition

2008-07-28 Thread Ranjeet
Hi, Can we define a nested data structure in schema.xml for searching? Is it possible or not? Thanks Regards, Ranjeet Jha

Re: nested data structure definition

2008-07-28 Thread Shalin Shekhar Mangar
Hi Ranjeet, Solr supports multi-valued fields and you can always denormalize your data. Can you give more details on the problem you are trying to solve? On Mon, Jul 28, 2008 at 3:20 PM, Ranjeet [EMAIL PROTECTED] wrote: Hi, Can we define a nested data structure in schema.xml for searching? Is

Re: nested data structure definition

2008-07-28 Thread Ranjeet
Hi, In our case there is a Category object under the Catalog object, so I do not want to define the data structure for the Category. I want to reference the Category under the Catalog; how can I do this? Regards, Ranjeet ----- Original Message ----- From: Shalin Shekhar Mangar [EMAIL

Re: nested data structure definition

2008-07-28 Thread Shalin Shekhar Mangar
Hi, In Solr there is no hierarchy of objects. De-normalize everything into one schema using multi-valued fields where applicable. Decide on what the document should be. What do you want to return as individual results -- are they catalogs or categories? You can get more help if you give an
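As a rough illustration of the denormalization Shalin describes, the sketch below flattens one catalog and its categories into a single flat document, with the categories stored in one multi-valued field. It uses plain Java collections rather than any Solr client API, and the field names (id, catalog_name, category_name) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DenormalizeCatalog {
    // Flatten one catalog and its categories into a single flat document.
    // The field names used here are hypothetical, not taken from the thread.
    static Map<String, Object> flatten(String id, String catalogName,
                                       List<String> categoryNames) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", id);
        doc.put("catalog_name", catalogName);
        // A multi-valued field holds one value per category.
        doc.put("category_name", new ArrayList<>(categoryNames));
        return doc;
    }

    public static void main(String[] args) {
        System.out.println(flatten("cat-1", "Spring 2008", List.of("Shoes", "Hats")));
    }
}
```

This treats the catalog as the unit returned in results; if categories should be the individual results instead, the flattening would go the other way, with one document per category carrying its catalog's fields.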

RE: SpellCheckComponent problems (was: Multiple search components in one handler - ie spellchecker)

2008-07-28 Thread Andrew Nagy
Shalin - yes the allfields field exists in my schema.xml file. It is a field that has all of the text from all of the fields concatenated together into one field. My spellCheckIndexDir is created and has 2 segment files, but I think the index is empty. When I initiate the 1st

Re: Unsure about omitNorms, termVectors...

2008-07-28 Thread Grant Ingersoll
On Jul 24, 2008, at 9:48 AM, Fuad Efendi wrote: Hi, It's unclear... found in schema.xml: omitNorms: (expert) set to true to omit the norms associated with this field (this disables length normalization and index-time boosting for the field, and saves some memory). Only

Re: SpellCheckComponent problems (was: Multiple search components in one handler - ie spellchecker)

2008-07-28 Thread Shalin Shekhar Mangar
Can you show us the query you are issuing? Make sure you add spellcheck=true to the query as a parameter to turn on spell checking. On Mon, Jul 28, 2008 at 6:16 PM, Andrew Nagy [EMAIL PROTECTED] wrote: Shalin - yes the allfields field exists in my schema.xml file. It is a field that has all of

RE: SpellCheckComponent problems (was: Multiple search components in one handler - ie spellchecker)

2008-07-28 Thread Andrew Nagy
-----Original Message----- From: Shalin Shekhar Mangar [mailto:[EMAIL PROTECTED]] Sent: Monday, July 28, 2008 10:09 AM To: solr-user@lucene.apache.org Subject: Re: SpellCheckComponent problems (was: Multiple search components in one handler - ie spellchecker) Can you show us the query you

Re: SpellCheckComponent problems (was: Multiple search components in one handler - ie spellchecker)

2008-07-28 Thread Shalin Shekhar Mangar
Hi Andrew, Your configuration which you specified in the earlier thread looks fine. Your query is also ok. The complete lack of spell check results in the response you pasted suggests that the SpellCheckComponent is not added to the SearchHandler's list of components. Can you check your

RE: SpellCheckComponent problems (was: Multiple search components in one handler - ie spellchecker)

2008-07-28 Thread Andrew Nagy
I was just reviewing the solr logs and I noticed the following: Jul 28, 2008 11:52:01 AM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.component.SpellCheckComponent' It looks like the SpellCheckComponent is

RE: solr synonyms behaviour

2008-07-28 Thread Laurent Gilles
Hi, I was faced with the same issues regarding multi-word synonyms. Let's say we have a synonyms list like: club, bar, night cabaret. Now if we have a document containing club, with the default synonyms filter behaviour with expand=true, we will end up in the Lucene index with a document containing

RE: SpellCheckComponent problems (was: Multiple search components in one handler - ie spellchecker)

2008-07-28 Thread Andrew Nagy
Hmm ... sorry, that was the output of a java program that uses solr that I ran and noticed the error. That error doesn't happen when I start solr. Sorry for the confusion. I just changed my schema to have a dedicated field for spelling called spelling and I created a new field type for the

RE: SpellCheckComponent problems (was: Multiple search components in one handler - ie spellchecker)

2008-07-28 Thread Andrew Nagy
Well I will include the stack trace for the aforementioned error: Jul 28, 2008 12:20:17 PM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.component.SpellCheckComponent' at

Re: SpellCheckComponent problems (was: Multiple search components in one handler - ie spellchecker)

2008-07-28 Thread Shalin Shekhar Mangar
Well that means the nightly solr jar you are using is older than you think it is. Try running solr normally without the program and see if you can get it working. On Mon, Jul 28, 2008 at 9:54 PM, Andrew Nagy [EMAIL PROTECTED] wrote: Well I will include the stack trace for the aforementioned

RE: SpellCheckComponent problems (was: Multiple search components in one handler - ie spellchecker)

2008-07-28 Thread Andrew Nagy
-----Original Message----- From: Shalin Shekhar Mangar [mailto:[EMAIL PROTECTED]] Sent: Monday, July 28, 2008 12:38 PM To: solr-user@lucene.apache.org Subject: Re: SpellCheckComponent problems (was: Multiple search components in one handler - ie spellchecker) Well that means the nightly

Unsynchronized FIFOCache - 9x times performance boost on 8-CPU system

2008-07-28 Thread Fuad Efendi
Please see discussion at http://issues.apache.org/jira/browse/SOLR-665 Very simple: map = new LinkedHashMap(initialSize, 0.75f, true) - LRU Cache (and we need synchronized get()) map = new LinkedHashMap(initialSize, 0.75f, false) - FIFO (and we do not need synchronized get()) -- Thanks, Fuad
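The difference between the two constructor calls can be sketched with a small bounded map. The class name BoundedCache and the size-based eviction rule are assumptions for illustration, not the code from SOLR-665, and the sketch adds no synchronization of its own. In access-order mode get() reorders entries, which is the reason the post gives for needing a synchronized get() there.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    // accessOrder=true evicts the least recently used entry (LRU);
    // accessOrder=false evicts the oldest inserted entry (FIFO).
    BoundedCache(int maxSize, boolean accessOrder) {
        super(16, 0.75f, accessOrder);
        this.maxSize = maxSize;
    }

    // Called by put(); returning true removes the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
    }

    public static void main(String[] args) {
        BoundedCache<String, Integer> lru = new BoundedCache<>(2, true);
        lru.put("a", 1);
        lru.put("b", 2);
        lru.get("a");      // moves "a" to the end, so "b" is now eldest
        lru.put("c", 3);   // evicts "b"
        System.out.println(lru.keySet());  // [a, c]

        BoundedCache<String, Integer> fifo = new BoundedCache<>(2, false);
        fifo.put("a", 1);
        fifo.put("b", 2);
        fifo.get("a");     // does not reorder anything
        fifo.put("c", 3);  // evicts "a", the first inserted
        System.out.println(fifo.keySet()); // [b, c]
    }
}
```

LinkedHashMap itself is not thread-safe in either mode, so concurrent writers still need external coordination; whether reads can safely go unsynchronized in a given setup is the question discussed in the linked issue.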

Multiple Update servers

2008-07-28 Thread Rakesh Godhani
Hi, we are currently evaluating Solr and have been browsing the archives for one particular issue but can't seem to find the answer, so please forgive me if I'm asking a repetitive question. We like the idea of having multiple slave servers serving up queries and a master performing updates.

big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
Hi all, For some queries I need to return a lot of rows at once (say 100). When performing these queries I notice a big difference between qTime (which is mostly in the 15-30 ms range due to caching) and total time taken to return the response (measured through SolrJ's elapsedTime), which takes

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Yonik Seeley
That high of a difference is due to the part of the index containing these particular stored fields not being in OS cache. What's the size on disk of your index compared to your physical RAM? -Yonik On Mon, Jul 28, 2008 at 4:10 PM, Britske [EMAIL PROTECTED] wrote: Hi all, For some queries I

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Yonik Seeley
That's a bit too tight to have *all* of the index cached...your best bet is to go to 4GB+, or figure out a way not to have to retrieve so many stored fields. -Yonik On Mon, Jul 28, 2008 at 4:27 PM, Britske [EMAIL PROTECTED] wrote: Size on disk is 1.84 GB (of which 1.3 GB sits in FDT files if

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Mike Klaas
Another possibility is to partition the stored fields into a frequently-accessed set and a full set. If the frequently-accessed set is significantly smaller (in terms of # bytes), then the documents will be tightly-packed on disk and the OS caching will be much more effective given the

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
I'm on a development box currently and production servers will be bigger, but at the same time the index will be too. Each query requests at most 20 stored fields. Why doesn't lazyfieldloading help in this situation? I don't need to retrieve all stored fields and I thought I wasn't doing this

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Grant Ingersoll
What version of Solr/Lucene are you using? On Jul 28, 2008, at 4:53 PM, Britske wrote: I'm on a development box currently and production servers will be bigger, but at the same time the index will be too. Each query requests at most 20 stored fields. Why doesn't lazyfieldloading help in

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
Thanks for clearing that up for me. I'm going to investigate some more... Yonik Seeley wrote: On Mon, Jul 28, 2008 at 4:53 PM, Britske [EMAIL PROTECTED] wrote: Each query requests at most 20 stored fields. Why doesn't lazyfieldloading help in this situation? It's the disk seek that

RE: Tokenizing and searching named character entity references

2008-07-28 Thread Steven A Rowe
Hi Frances, HTMLStripWhitespaceTokenizerFactory wraps a WhitespaceTokenizer around an HTMLStripReader. You could extend HTMLStripReader to not decode named character entities, e.g. by overriding HTMLStripReader.read() so that it calls an alternative readEntity(), which instead of converting

Re: Expansion stemming

2008-07-28 Thread Chris Hostetter
: Expansion stemming - Takes a root word and 'expands' it to all of its : various forms - can be used either at insertion time or at query : time. : : How do I specify that I want the expansion stemming instead of the porter : stemming? there isn't an explicit expansion stemming filter included

Re: morphology and queryPrase

2008-07-28 Thread Chris Hostetter
: When I'm looking for words taking care of distance between them, I'm using : lucene syntax A B~distance... unfortunately if A leads to A1 and A2 forms I : should split this into syntax +(A1 B~dist A2 B~dist ) - this grows with : progression depending of normal forms quantity of each term. : :

Re: Best way to return ExternalFileField in the results

2008-07-28 Thread Chris Hostetter
: I've been trying to return a field of type ExternalFileField in the search : result. Upon examining XMLWriter class, it seems like Solr can't do this out : of the box. Therefore, I've tried to hack Solr to enable this behaviour. : The goal is to call to

Re: Unsure about omitNorms, termVectors...

2008-07-28 Thread Chris Hostetter
: omitNorms: do I need it for full-text fields even if I don't need index-time : boosting? I don't want to boost text where keyword repeated several times. Is : my understanding correct? if you omitNorms=true then you not only lose index-time doc/field boosting, but you also lose lengthNorms

Re: Best way to return ExternalFileField in the results

2008-07-28 Thread Ryan McKinley
In general though I'm wondering if stepping back a bit and modifying your request handler to use a SolrDocumentList where you've already flattened the ExternalFileField into each SolrDocument would be an easier approach -- then you wouldn't need to modify the ResponseWriter at all.

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Mike Klaas
On 28-Jul-08, at 1:53 PM, Britske wrote: Each query requests at most 20 stored fields. Why doesn't lazyfieldloading help in this situation? It does help, but not enough. With lots of data per document and not a lot of memory, it becomes probabilistically likely that each doc resides in